ABSTRACT
Linked-read sequencing promises a one-method approach for genome-wide insights including single nucleotide variants (SNVs), structural variants, and haplotyping. We introduce Barcode Linked Reads (BLR), an open-source haplotyping pipeline capable of handling millions of barcodes and data from multiple linked-read technologies including DBS, 10× Genomics, TELL-Seq and stLFR. Running BLR on DBS linked reads yielded megabase-scale phasing with low (<0.2%) switch error rates. Of 13,616 protein-coding genes phased in the GIAB benchmark set (v4.2.1), 98.6% matched the BLR phasing. In addition, large structural variants showed concordance with HPRC-HG002 reference assembly calls. Compared to diploid assembly with PacBio HiFi reads, BLR phasing was more continuous when considering switch errors. We further show that integrating long reads at low coverage (∼10×) can improve phasing contiguity and reduce switch errors in tandem repeats. When compared to Long Ranger on 10× Genomics data, BLR showed an increase in phase block N50 with low switch error rates. For TELL-Seq and stLFR linked reads, BLR generated longer or similar phase block lengths and low switch error rates compared to results presented in the original publications. In conclusion, BLR provides a flexible workflow for comprehensive haplotype analysis of linked reads from multiple platforms.
Subject(s)
Human Genome, Genomics, Haplotypes, High-Throughput Nucleotide Sequencing, Humans, Genomics/methods, High-Throughput Nucleotide Sequencing/methods, DNA Sequence Analysis/methods
ABSTRACT
BACKGROUND: Hundreds of variants associated with atopic dermatitis (AD) and psoriasis, 2 common inflammatory skin disorders, have previously been discovered through genome-wide association studies (GWASs). The majority of these variants are in noncoding regions, and their target genes remain largely unclear. OBJECTIVE: We sought to understand the effects of these noncoding variants on the development of AD and psoriasis by linking them to the genes that they regulate. METHODS: We constructed genomic 3-dimensional maps of human keratinocytes during differentiation by using targeted chromosome conformation capture (Capture Hi-C) targeting more than 20,000 promoters and 214 GWAS variants and combined these data with transcriptome and epigenomic data sets. We validated our results with reporter assays, clustered regularly interspaced short palindromic repeats activation, and examination of patient gene expression from previous studies. RESULTS: We identified 118 target genes of 82 AD and psoriasis GWAS variants. Differential expression of 58 of the 118 target genes (49%) occurred in either AD or psoriatic lesions, many of which were not previously linked to any skin disease. We highlighted the genes AFG1L, CLINT1, ADO, LINC00302, and RP1-140J1.1 and provided further evidence for their potential roles in AD and psoriasis. CONCLUSIONS: Our work focused on skin barrier pathology through investigation of the interaction profile of GWAS variants during keratinocyte differentiation. We have provided a catalogue of candidate genes that could modulate the risk of AD and psoriasis. Given that only 35% of the target genes are the gene nearest to the known GWAS variants, we expect that our work will contribute to the discovery of novel pathways involved in AD and psoriasis.
Subject(s)
Chromatin, Atopic Dermatitis/genetics, Keratinocytes, Psoriasis/genetics, Genetic Predisposition to Disease, Humans
ABSTRACT
Small extracellular vesicles (sEVs) have in recent years emerged as a source of biomarkers for disease diagnosis and therapeutic follow-up. sEV samples derived from multicellular organisms exhibit a highly heterogeneous repertoire of vesicles, which current methods based on ensemble measurements cannot capture. In this work we present droplet barcode sequencing for protein analysis (DBS-Pro) to profile surface proteins on individual sEVs, facilitating identification of sEV subtypes within and between samples. The method allows for analysis of multiple proteins through the use of DNA-barcoded affinity reagents, with sequencing as the readout. High-throughput single-vesicle profiling is enabled through compartmentalization of individual sEVs in emulsion droplets followed by droplet barcoding through PCR. In this proof-of-concept study we demonstrate that DBS-Pro allows for analysis of single sEVs, with a mixing rate below 2%. In total, over 120,000 individual sEVs obtained from a non-small cell lung cancer (NSCLC) cell line and from malignant pleural effusion (MPE) fluid of NSCLC patients were analyzed based on their surface proteins. We also show that the method enables single-vesicle surface protein profiling and, by extension, characterization of sEV subtypes, which is essential for identifying the cellular origin of vesicles in heterogeneous samples.
Subject(s)
Extracellular Vesicles, Humans, Extracellular Vesicles/genetics, Biomarkers/metabolism, Cell Line, Membrane Proteins/metabolism
ABSTRACT
BACKGROUND: The genetic variant landscape of coronary artery disease is dominated by noncoding variants, many of which occur within putative enhancers regulating the expression levels of relevant genes. It is crucial to assign the genetic variants to their correct genes, both to gain insights into perturbed functions and to better assess the risk of disease. METHODS: In this study, we generated high-resolution genomic interaction maps (≈750 bases) in aortic endothelial cells, smooth muscle cells, and THP-1 (human leukemia monocytic cell line) macrophages stimulated with lipopolysaccharide, using Hi-C coupled with sequence capture targeting 25 429 features, including variants associated with coronary artery disease. We also sequenced their transcriptomes and mapped putative enhancers using chromatin immunoprecipitation with an antibody against H3K27Ac. RESULTS: The regions interacting with promoters showed strong enrichment for enhancer elements and validated several previously known interactions and enhancers. We detected interactions for 727 risk variants obtained by genome-wide association studies and identified novel as well as established genes and functions associated with cardiovascular diseases. We were able to assign potential target genes for an additional 398 genome-wide association study variants using haplotype information, thereby identifying additional relevant genes and functions. Importantly, we discovered that a subset of risk variants interacted with multiple promoters and that the expression levels of the corresponding genes were strongly correlated. CONCLUSIONS: In summary, we present a catalog of candidate genes regulated by coronary artery disease-related variants, which we expect to be a valuable resource for further investigation of cardiovascular pathologies and disease.