1.
Nat Methods ; 21(4): 723-734, 2024 Apr.
Article En | MEDLINE | ID: mdl-38504114

The ENCODE Consortium's efforts to annotate noncoding cis-regulatory elements (CREs) have advanced our understanding of gene regulatory landscapes. Pooled, noncoding CRISPR screens offer a systematic approach to investigate cis-regulatory mechanisms. The ENCODE4 Functional Characterization Centers conducted 108 screens in human cell lines, comprising >540,000 perturbations across 24.85 megabases of the genome. Using 332 functionally confirmed CRE-gene links in K562 cells, we established guidelines for screening endogenous noncoding elements with CRISPR interference (CRISPRi), including accurate detection of CREs that exhibit variable, often low, transcriptional effects. Benchmarking five screen analysis tools, we find that CASA produces the most conservative CRE calls and is robust to artifacts of low-specificity single guide RNAs. We uncover a subtle DNA strand bias for CRISPRi in transcribed regions with implications for screen design and analysis. Together, we provide an accessible data resource, predesigned single guide RNAs for targeting 3,275,697 ENCODE SCREEN candidate CREs with CRISPRi and screening guidelines to accelerate functional characterization of the noncoding genome.


CRISPR-Cas Systems , Clustered Regularly Interspaced Short Palindromic Repeats , Humans , CRISPR-Cas Systems/genetics , Clustered Regularly Interspaced Short Palindromic Repeats/genetics , RNA, Guide, CRISPR-Cas Systems , Genome , K562 Cells
2.
Nature ; 626(8000): 799-807, 2024 Feb.
Article En | MEDLINE | ID: mdl-38326615

Linking variants from genome-wide association studies (GWAS) to underlying mechanisms of disease remains a challenge1-3. For some diseases, a successful strategy has been to look for cases in which multiple GWAS loci contain genes that act in the same biological pathway1-6. However, our knowledge of which genes act in which pathways is incomplete, particularly for cell-type-specific pathways or understudied genes. Here we introduce a method to connect GWAS variants to functions. This method links variants to genes using epigenomics data, links genes to pathways de novo using Perturb-seq and integrates these data to identify convergence of GWAS loci onto pathways. We apply this approach to study the role of endothelial cells in genetic risk for coronary artery disease (CAD), and discover 43 CAD GWAS signals that converge on the cerebral cavernous malformation (CCM) signalling pathway. Two regulators of this pathway, CCM2 and TLNRD1, are each linked to a CAD risk variant, regulate other CAD risk genes and affect atheroprotective processes in endothelial cells. These results suggest a model whereby CAD risk is driven in part by the convergence of causal genes onto a particular transcriptional pathway in endothelial cells. They highlight shared genes between common and rare vascular diseases (CAD and CCM), and identify TLNRD1 as a new, previously uncharacterized member of the CCM signalling pathway. This approach will be widely useful for linking variants to functions for other common polygenic diseases.


Coronary Artery Disease , Endothelial Cells , Genome-Wide Association Study , Hemangioma, Cavernous, Central Nervous System , Humans , Coronary Artery Disease/genetics , Coronary Artery Disease/pathology , Endothelial Cells/metabolism , Endothelial Cells/pathology , Genetic Predisposition to Disease/genetics , Hemangioma, Cavernous, Central Nervous System/genetics , Hemangioma, Cavernous, Central Nervous System/pathology , Polymorphism, Single Nucleotide , Epigenomics , Signal Transduction/genetics , Multifactorial Inheritance
3.
bioRxiv ; 2024 Feb 04.
Article En | MEDLINE | ID: mdl-38352544

Pathological high shear stress (HSS, 100 dyn/cm²) is generated in distal pulmonary arteries (PA) (100-500 µm) in congenital heart defects and in progressive PA hypertension (PAH) with inward remodeling and luminal narrowing. Human PA endothelial cells (PAEC) were subjected to HSS versus physiologic laminar shear stress (LSS, 15 dyn/cm²). Endothelial-mesenchymal transition (EndMT), a feature of PAH not previously attributed to HSS, was observed. H3K27ac peaks containing motifs for an ETS-family transcription factor (ERG) were reduced, as were ERG-Krüppel-like factors (KLF)2/4 interaction and ERG expression. Reducing ERG by siRNA in PAEC during LSS caused EndMT; transfection of ERG in PAEC under HSS prevented EndMT. An aorto-caval shunt was performed in mice to induce HSS and progressive PAH. Elevated PA pressure, EndMT and vascular remodeling were reduced by an adeno-associated vector that selectively replenished ERG in PAEC. Agents maintaining ERG in PAEC should overcome the adverse effect of HSS on progressive PAH.

4.
bioRxiv ; 2023 Nov 13.
Article En | MEDLINE | ID: mdl-38014075

Identifying transcriptional enhancers and their target genes is essential for understanding gene regulation and the impact of human genetic variation on disease1-6. Here we create and evaluate a resource of >13 million enhancer-gene regulatory interactions across 352 cell types and tissues, by integrating predictive models, measurements of chromatin state and 3D contacts, and large-scale genetic perturbations generated by the ENCODE Consortium7. We first create a systematic benchmarking pipeline to compare predictive models, assembling a dataset of 10,411 element-gene pairs measured in CRISPR perturbation experiments, >30,000 fine-mapped eQTLs, and 569 fine-mapped GWAS variants linked to a likely causal gene. Using this framework, we develop a new predictive model, ENCODE-rE2G, that achieves state-of-the-art performance across multiple prediction tasks, demonstrating a strategy involving iterative perturbations and supervised machine learning to build increasingly accurate predictive models of enhancer regulation. Using the ENCODE-rE2G model, we build an encyclopedia of enhancer-gene regulatory interactions in the human genome, which reveals global properties of enhancer networks, identifies differences in the functions of genes that have more or less complex regulatory landscapes, and improves analyses to link noncoding variants to target genes and cell types for common, complex diseases. By interpreting the model, we find evidence that, beyond enhancer activity and 3D enhancer-promoter contacts, additional features guide enhancer-promoter communication including promoter class and enhancer-enhancer synergy. Altogether, these genome-wide maps of enhancer-gene regulatory interactions, benchmarking software, predictive models, and insights about enhancer function provide a valuable resource for future studies of gene regulation and human genetics.

5.
Nat Commun ; 14(1): 7578, 2023 Nov 21.
Article En | MEDLINE | ID: mdl-37989727

Pulmonary arterial hypertension (PAH) is a progressive disease in which pulmonary arterial (PA) endothelial cell (EC) dysfunction is associated with unrepaired DNA damage. BMPR2 is the most common genetic cause of PAH. We report that human PAEC with reduced BMPR2 have persistent DNA damage in room air after hypoxia (reoxygenation), as do mice with EC-specific deletion of Bmpr2 (EC-Bmpr2-/-) and persistent pulmonary hypertension. Similar findings are observed in PAEC with loss of the DNA damage sensor ATM, and in mice with Atm deleted in EC (EC-Atm-/-). Gene expression analysis of EC-Atm-/- and EC-Bmpr2-/- lung EC reveals reduced Foxf1, a transcription factor with selectivity for lung EC. Reducing FOXF1 in control PAEC induces DNA damage and impaired angiogenesis whereas transfection of FOXF1 in PAH PAEC repairs DNA damage and restores angiogenesis. Lung EC targeted delivery of Foxf1 to reoxygenated EC-Bmpr2-/- mice repairs DNA damage, induces angiogenesis and reverses pulmonary hypertension.


Hypertension, Pulmonary , Pulmonary Arterial Hypertension , Mice , Humans , Animals , Pulmonary Arterial Hypertension/genetics , Hypertension, Pulmonary/genetics , Hypertension, Pulmonary/metabolism , Familial Primary Pulmonary Hypertension/metabolism , Pulmonary Artery/metabolism , DNA Damage , Bone Morphogenetic Protein Receptors, Type II/genetics , Forkhead Transcription Factors/genetics , Forkhead Transcription Factors/metabolism
6.
Nat Genet ; 55(8): 1267-1276, 2023 08.
Article En | MEDLINE | ID: mdl-37443254

Genome-wide association studies (GWASs) are a valuable tool for understanding the biology of complex human traits and diseases, but associated variants rarely point directly to causal genes. In the present study, we introduce a new method, polygenic priority score (PoPS), that learns trait-relevant gene features, such as cell-type-specific expression, to prioritize genes at GWAS loci. Using a large evaluation set of genes with fine-mapped coding variants, we show that PoPS and the closest gene individually outperform other gene prioritization methods, but observe the best overall performance by combining PoPS with orthogonal methods. Using this combined approach, we prioritize 10,642 unique gene-trait pairs across 113 complex traits and diseases with high precision, finding not only well-established gene-trait relationships but nominating new genes at unresolved loci, such as LGR4 for estimated glomerular filtration rate and CCR7 for deep vein thrombosis. Overall, we demonstrate that PoPS provides a powerful addition to the gene prioritization toolbox.


Multifactorial Inheritance , Quantitative Trait Loci , Humans , Multifactorial Inheritance/genetics , Quantitative Trait Loci/genetics , Genome-Wide Association Study/methods , Genetic Predisposition to Disease/genetics , Phenotype , Polymorphism, Single Nucleotide/genetics
7.
Circ Genom Precis Med ; 16(3): 258-266, 2023 06.
Article En | MEDLINE | ID: mdl-37026454

BACKGROUND: Congenital heart disease (CHD) is highly heritable, but the power to identify inherited risk has been limited to analyses of common variants in small cohorts. METHODS: We performed reimputation of 4 CHD cohorts (n=55 342) to the TOPMed reference panel (freeze 5), permitting meta-analysis of 14 784 017 variants including 6 035 962 rare variants of high imputation quality as validated by whole genome sequencing. RESULTS: Meta-analysis identified 16 novel loci, including 12 rare variants, which displayed moderate or large effect sizes (median odds ratio, 3.02) for 4 separate CHD categories. Analyses of chromatin structure link 13 of the genome-wide significant loci to key genes in cardiac development; rs373447426 (minor allele frequency, 0.003 [odds ratio, 3.37 for conotruncal heart disease]; P=1.49×10⁻⁸) is predicted to disrupt chromatin structure for 2 nearby genes BDH1 and DLG1 involved in conotruncal development. A lead variant rs189203952 (minor allele frequency, 0.01 [odds ratio, 2.4 for left ventricular outflow tract obstruction]; P=1.46×10⁻⁸) is predicted to disrupt the binding sites of 4 transcription factors known to participate in cardiac development in the promoter of SPAG9. A tissue-specific model of chromatin conformation suggests that common variant rs78256848 (minor allele frequency, 0.11 [odds ratio, 1.4 for conotruncal heart disease]; P=2.6×10⁻⁸) physically interacts with NCAM1 (PFDR=1.86×10⁻²⁷), a neural adhesion molecule acting in cardiac development. Importantly, while each individual malformation displayed substantial heritability (observed h² ranging from 0.26 for complex malformations to 0.37 for left ventricular outflow tract obstructive disease) the risk for different CHD malformations appeared to be separate, without genetic correlation measured by linkage disequilibrium score regression or regional colocalization.
CONCLUSIONS: We describe a set of rare noncoding variants conferring significant risk for individual heart malformations which are linked to genes governing cardiac development. These results illustrate that the oligogenic basis of CHD and significant heritability may be linked to rare variants outside protein-coding regions conferring substantial risk for individual categories of cardiac malformation.


Heart Defects, Congenital , Humans , Heart Defects, Congenital/diagnosis , Heart Defects, Congenital/genetics , Phenotype , Gene Frequency , Whole Genome Sequencing , Chromatin , Adaptor Proteins, Signal Transducing/genetics
8.
bioRxiv ; 2023 Dec 21.
Article En | MEDLINE | ID: mdl-38187584

Regulatory DNA sequences within enhancers and promoters bind transcription factors to encode cell type-specific patterns of gene expression. However, the regulatory effects and programmability of such DNA sequences remain difficult to map or predict because we have lacked scalable methods to precisely edit regulatory DNA and quantify the effects in an endogenous genomic context. Here we present an approach to measure the quantitative effects of hundreds of designed DNA sequence variants on gene expression, by combining pooled CRISPR prime editing with RNA fluorescence in situ hybridization and cell sorting (Variant-FlowFISH). We apply this method to mutagenize and rewrite regulatory DNA sequences in an enhancer and the promoter of PPIF in two immune cell lines. Of 672 variant-cell type pairs, we identify 497 that affect PPIF expression. These variants appear to act through a variety of mechanisms including disruption or optimization of existing transcription factor binding sites, as well as creation of de novo sites. Disrupting a single endogenous transcription factor binding site often led to large changes in expression (up to -40% in the enhancer, and -50% in the promoter). The same variant often had different effects across cell types and states, demonstrating a highly tunable regulatory landscape. We use these data to benchmark performance of sequence-based predictive models of gene regulation, and find that certain types of variants are not accurately predicted by existing models. Finally, we computationally design 185 small sequence variants (≤10 bp) and optimize them for specific effects on expression in silico. 84% of these rationally designed edits showed the intended direction of effect, and some had dramatic effects on expression (-100% to +202%). 
Variant-FlowFISH thus provides a powerful tool to map the effects of variants and transcription factor binding sites on gene expression, test and improve computational models of gene regulation, and reprogram regulatory DNA.

9.
Cell ; 185(26): 4937-4953.e23, 2022 12 22.
Article En | MEDLINE | ID: mdl-36563664

To define the multi-cellular epigenomic and transcriptional landscape of cardiac cellular development, we generated single-cell chromatin accessibility maps of human fetal heart tissues. We identified eight major differentiation trajectories involving primary cardiac cell types, each associated with dynamic transcription factor (TF) activity signatures. We contrasted regulatory landscapes of iPSC-derived cardiac cell types and their in vivo counterparts, which enabled optimization of in vitro differentiation of epicardial cells. Further, we interpreted sequence based deep learning models of cell-type-resolved chromatin accessibility profiles to decipher underlying TF motif lexicons. De novo mutations predicted to affect chromatin accessibility in arterial endothelium were enriched in congenital heart disease (CHD) cases vs. controls. In vitro studies in iPSCs validated the functional impact of identified variation on the predicted developmental cell types. This work thus defines the cell-type-resolved cis-regulatory sequence determinants of heart development and identifies disruption of cell type-specific regulatory elements in CHD.


Chromatin , Heart Defects, Congenital , Humans , Chromatin/genetics , Heart Defects, Congenital/genetics , Heart , Mutation , Single-Cell Analysis
10.
Nat Genet ; 54(10): 1479-1492, 2022 10.
Article En | MEDLINE | ID: mdl-36175791

Genome-wide association studies provide a powerful means of identifying loci and genes contributing to disease, but in many cases, the related cell types/states through which genes confer disease risk remain unknown. Deciphering such relationships is important for identifying pathogenic processes and developing therapeutics. In the present study, we introduce sc-linker, a framework for integrating single-cell RNA-sequencing, epigenomic SNP-to-gene maps and genome-wide association study summary statistics to infer the underlying cell types and processes by which genetic variants influence disease. The inferred disease enrichments recapitulated known biology and highlighted notable cell-disease relationships, including γ-aminobutyric acid-ergic neurons in major depressive disorder, a disease-dependent M-cell program in ulcerative colitis and a disease-specific complement cascade process in multiple sclerosis. In autoimmune disease, both healthy and disease-dependent immune cell-type programs were associated, whereas only disease-dependent epithelial cell programs were prominent, suggesting a role in disease response rather than initiation. Our framework provides a powerful approach for identifying the cell types and cellular processes by which genetic variants influence disease.


Depressive Disorder, Major , Genome-Wide Association Study , Depressive Disorder, Major/genetics , Genetic Predisposition to Disease , Human Genetics , Humans , Polymorphism, Single Nucleotide/genetics , RNA , gamma-Aminobutyric Acid
11.
Nat Commun ; 13(1): 4941, 2022 08 23.
Article En | MEDLINE | ID: mdl-35999210

Physiologic laminar shear stress (LSS) induces an endothelial gene expression profile that is vasculo-protective. In this report, we delineate how LSS mediates changes in the epigenetic landscape to promote this beneficial response. We show that under LSS, KLF4 interacts with the SWI/SNF nucleosome remodeling complex to increase accessibility at enhancer sites that promote the expression of homeostatic endothelial genes. By combining molecular and computational approaches we discover enhancers that loop to promoters of KLF4- and LSS-responsive genes that stabilize endothelial cells and suppress inflammation, such as BMPR2, SMAD5, and DUSP5. By linking enhancers to genes that they regulate under physiologic LSS, our work establishes a foundation for interpreting how non-coding DNA variants in these regions might disrupt protective gene expression to influence vascular disease.


Chromatin , Endothelial Cells , Chromatin/genetics , Chromatin Assembly and Disassembly/genetics , Nucleosomes/genetics , Regulatory Sequences, Nucleic Acid
12.
Cell Genom ; 2(7)2022 Jul 13.
Article En | MEDLINE | ID: mdl-35873673

We assess contributions to autoimmune disease of genes whose regulation is driven by enhancer regions (enhancer-related) and genes that regulate other genes in trans (candidate master-regulator). We link these genes to SNPs using several SNP-to-gene (S2G) strategies and apply heritability analyses to draw three conclusions about 11 autoimmune/blood-related diseases/traits. First, several characterizations of enhancer-related genes using functional genomics data are informative for autoimmune disease heritability after conditioning on a broad set of regulatory annotations. Second, candidate master-regulator genes defined using trans-eQTL in blood are also conditionally informative for autoimmune disease heritability. Third, integrating enhancer-related and master-regulator gene sets with protein-protein interaction (PPI) network information magnified their disease signal. The resulting PPI-enhancer gene score produced >2-fold stronger heritability signal and >2-fold stronger enrichment for drug targets, compared with the recently proposed enhancer domain score. In each case, functionally informed S2G strategies produced 4.1- to 13-fold stronger disease signals than conventional window-based strategies.

13.
Nat Genet ; 54(6): 827-836, 2022 06.
Article En | MEDLINE | ID: mdl-35668300

Disease-associated single-nucleotide polymorphisms (SNPs) generally do not implicate target genes, as most disease SNPs are regulatory. Many SNP-to-gene (S2G) linking strategies have been developed to link regulatory SNPs to the genes that they regulate in cis. Here, we developed a heritability-based framework for evaluating and combining different S2G strategies to optimize their informativeness for common disease risk. Our optimal combined S2G strategy (cS2G) included seven constituent S2G strategies and achieved a precision of 0.75 and a recall of 0.33, more than doubling the recall of any individual strategy. We applied cS2G to fine-mapping results for 49 UK Biobank diseases/traits to predict 5,095 causal SNP-gene-disease triplets (with S2G-derived functional interpretation) with high confidence. We further applied cS2G to provide an empirical assessment of disease omnigenicity; we determined that the top 1% of genes explained roughly half of the SNP heritability linked to all genes and that gene-level architectures vary with variant allele frequency.


Genome-Wide Association Study , Polymorphism, Single Nucleotide , Genome-Wide Association Study/methods , Phenotype , Polymorphism, Single Nucleotide/genetics
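The precision and recall figures quoted in the abstract above follow the standard definitions, which can be illustrated with a minimal sketch. The SNP and gene names below are hypothetical toy data chosen so that the numbers come out near 0.75 and 0.33; the study itself estimates these quantities with a heritability-based framework, not from a labeled list like this one, and `precision_recall` is an illustrative helper, not a function from the cS2G software.

```python
def precision_recall(predicted, truth):
    """Return (precision, recall) for predicted vs. true SNP-gene links."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    # Precision: fraction of predicted links that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of true links that were recovered.
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical "true" links (toy data, not from the paper).
truth = {(f"snp{i}", f"GENE_{chr(ord('A') + i - 1)}") for i in range(1, 10)}

# Four predictions, three of which match a true link.
predicted = [
    ("snp1", "GENE_A"),
    ("snp2", "GENE_B"),
    ("snp3", "GENE_C"),
    ("snp4", "GENE_X"),  # wrong gene for snp4
]

precision, recall = precision_recall(predicted, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A strategy can be precise while still missing most true links, which is why the abstract reports both numbers and emphasizes the gain in recall from combining strategies.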
14.
Nature ; 607(7917): 176-184, 2022 07.
Article En | MEDLINE | ID: mdl-35594906

Gene regulation in the human genome is controlled by distal enhancers that activate specific nearby promoters1. A proposed model for this specificity is that promoters have sequence-encoded preferences for certain enhancers, for example, mediated by interacting sets of transcription factors or cofactors2. This 'biochemical compatibility' model has been supported by observations at individual human promoters and by genome-wide measurements in Drosophila3-9. However, the degree to which human enhancers and promoters are intrinsically compatible has not yet been systematically measured, and how their activities combine to control RNA expression remains unclear. Here we design a high-throughput reporter assay called enhancer × promoter self-transcribing active regulatory region sequencing (ExP STARR-seq) and apply it to examine the combinatorial compatibilities of 1,000 enhancer and 1,000 promoter sequences in human K562 cells. We identify simple rules for enhancer-promoter compatibility, whereby most enhancers activate all promoters by similar amounts, and intrinsic enhancer and promoter activities multiplicatively combine to determine RNA output (R² = 0.82). In addition, two classes of enhancers and promoters show subtle preferential effects. Promoters of housekeeping genes contain built-in activating motifs for factors such as GABPA and YY1, which decrease the responsiveness of promoters to distal enhancers. Promoters of variably expressed genes lack these motifs and show stronger responsiveness to enhancers. Together, this systematic assessment of enhancer-promoter compatibility suggests a multiplicative model tuned by enhancer and promoter class to control gene transcription in the human genome.


Enhancer Elements, Genetic , Promoter Regions, Genetic , Enhancer Elements, Genetic/genetics , Humans , Promoter Regions, Genetic/genetics , RNA/biosynthesis , RNA/genetics , Transcription Factors/metabolism
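The multiplicative rule described in the abstract above (RNA output approximated by intrinsic enhancer activity times intrinsic promoter activity) becomes an additive model in log space, which can be fitted by simple averaging. The sketch below is illustrative only: the data are simulated, the function names `fit_multiplicative` and `simulate` are hypothetical, and computing the coefficient of determination in log space is an assumption made here, not a statement of how the authors computed their reported value.

```python
import math
import random


def fit_multiplicative(expr):
    """Fit log(expr[i][j]) ~ enhancer_effect[i] + promoter_effect[j].

    expr is a matrix (enhancers x promoters) of positive expression values.
    Returns (fitted_log_values, r_squared), both computed in log space.
    """
    logs = [[math.log(v) for v in row] for row in expr]
    n_enh, n_prom = len(logs), len(logs[0])
    grand = sum(map(sum, logs)) / (n_enh * n_prom)
    enh_mean = [sum(row) / n_prom for row in logs]
    prom_mean = [sum(logs[i][j] for i in range(n_enh)) / n_enh
                 for j in range(n_prom)]
    # Least-squares fit of a balanced additive model: row mean + column mean - grand mean.
    fitted = [[enh_mean[i] + prom_mean[j] - grand for j in range(n_prom)]
              for i in range(n_enh)]
    ss_res = sum((logs[i][j] - fitted[i][j]) ** 2
                 for i in range(n_enh) for j in range(n_prom))
    ss_tot = sum((logs[i][j] - grand) ** 2
                 for i in range(n_enh) for j in range(n_prom))
    return fitted, 1.0 - ss_res / ss_tot


def simulate(n_enh, n_prom, noise_sd, seed=0):
    """Simulate expression = enhancer activity * promoter activity * log-normal noise."""
    rng = random.Random(seed)
    enh = [rng.lognormvariate(0, 1) for _ in range(n_enh)]
    prom = [rng.lognormvariate(0, 1) for _ in range(n_prom)]
    return [[e * p * math.exp(rng.gauss(0, noise_sd)) for p in prom]
            for e in enh]


_, r2 = fit_multiplicative(simulate(20, 15, noise_sd=0.5))
print(f"R^2 of the multiplicative fit on toy data: {r2:.2f}")
```

With no noise the fit is exact; as noise or enhancer-promoter-specific preferences grow, the coefficient of determination falls below 1, so the residuals are where class-specific effects like those described for housekeeping promoters would show up.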
15.
JCI Insight ; 7(3)2022 02 08.
Article En | MEDLINE | ID: mdl-35132965

The fibrous annulus of the mitral valve plays an important role in valvular function and cardiac physiology, while normal variation in the size of cardiovascular anatomy may share a genetic link with common and rare disease. We derived automated estimates of mitral valve annular diameter in the 4-chamber view from 32,220 MRI images from the UK Biobank at ventricular systole and diastole as the basis for GWAS. Mitral annular dimensions corresponded to previously described anatomical norms, and GWAS inclusive of 4 population strata identified 10 loci, including possibly novel loci (GOSR2, ERBB4, MCTP2, MCPH1) and genes related to cardiac contractility (BAG3, TTN, RBFOX1). ATAC-Seq of primary mitral valve tissue localized multiple variants to regions of open chromatin in biologically relevant cell types and rs17608766 to an algorithmically predicted enhancer element in GOSR2. We observed strong genetic correlation with measures of contractility and mitral valve disease and clinical correlations with heart failure, cerebrovascular disease, and ventricular arrhythmias. Polygenic scoring of mitral valve annular diameter in systole was predictive of risk of mitral valve prolapse across 4 cohorts. In summary, genetic and clinical studies of mitral valve annular diameter revealed genetic determinants of mitral valve biology, while highlighting clinical associations. Polygenic determinants of mitral valve annular diameter may represent an independent risk factor for mitral prolapse. Overall, computationally estimated phenotypes derived at scale from medical imaging represent an important substrate for genetic discovery and clinical risk prediction.


DNA/genetics , Heart Valve Diseases/genetics , Mitral Valve/diagnostic imaging , Mutation , Myocardial Contraction/physiology , Qb-SNARE Proteins/genetics , DNA Mutational Analysis , Echocardiography , Female , Heart Valve Diseases/diagnosis , Heart Valve Diseases/physiopathology , Humans , Male , Middle Aged , Mitral Valve/physiopathology , Qb-SNARE Proteins/metabolism
16.
Hum Mol Genet ; 31(12): 1946-1961, 2022 06 22.
Article En | MEDLINE | ID: mdl-34970970

BACKGROUND: FCGR2A binds antibody-antigen complexes to regulate the abundance of circulating and deposited complexes along with downstream immune and autoimmune responses. Although the abundance of FCGR2A may be critical in immune-mediated diseases, little is known about whether its surface expression is regulated through cis genomic elements and non-coding variants. In the current study, we aimed to characterize the regulation of FCGR2A expression, the impact of genetic variation and its association with autoimmune disease. METHODS: We applied CRISPR-based interference and editing to scrutinize 1.7 Mb of open chromatin surrounding the FCGR2A gene to identify regulatory elements. Relevant transcription factors (TFs) binding to these regions were defined through public databases. Genetic variants affecting regulation were identified using luciferase reporter assays and were verified in a cohort of 1996 genotyped healthy individuals using flow cytometry. RESULTS: We identified a complex proximal region and five distal enhancers regulating FCGR2A. The proximal region split into subregions upstream and downstream of the transcription start site, was enriched in binding of inflammation-regulated TFs, and harbored a variant associated with FCGR2A expression in primary myeloid cells. One distal enhancer region was occupied by CCCTC-binding factor (CTCF) whose binding site was disrupted by a rare genetic variant, altering gene expression. CONCLUSIONS: The FCGR2A gene is regulated by multiple proximal and distal genomic regions, with links to autoimmune disease. These findings may open up novel therapeutic avenues where fine-tuning of FCGR2A levels may constitute a part of treatment strategies for immune-mediated diseases.


Autoimmune Diseases , Enhancer Elements, Genetic , Receptors, IgG , Autoimmune Diseases/genetics , Binding Sites , Genomics , Genotype , Humans , Receptors, IgG/genetics
17.
bioRxiv ; 2021 Nov 23.
Article En | MEDLINE | ID: mdl-34845454

Genome-wide association studies (GWAS) provide a powerful means to identify loci and genes contributing to disease, but in many cases the related cell types/states through which genes confer disease risk remain unknown. Deciphering such relationships is important for identifying pathogenic processes and developing therapeutics. Here, we introduce sc-linker, a framework for integrating single-cell RNA-seq (scRNA-seq), epigenomic maps and GWAS summary statistics to infer the underlying cell types and processes by which genetic variants influence disease. We analyzed 1.6 million scRNA-seq profiles from 209 individuals spanning 11 tissue types and 6 disease conditions, and constructed gene programs capturing cell types, disease progression, and cellular processes both within and across cell types. We evaluated these gene programs for disease enrichment by transforming them to SNP annotations with tissue-specific epigenomic maps and computing enrichment scores across 60 diseases and complex traits (average N= 297K). Cell type, disease progression, and cellular process programs captured distinct heritability signals even within the same cell type, as we show in multiple complex diseases that affect the brain (Alzheimer’s disease, multiple sclerosis), colon (ulcerative colitis) and lung (asthma, idiopathic pulmonary fibrosis, severe COVID-19). The inferred disease enrichments recapitulated known biology and highlighted novel cell-disease relationships, including GABAergic neurons in major depressive disorder (MDD), a disease progression M cell program in ulcerative colitis, and a disease-specific complement cascade process in multiple sclerosis. In autoimmune disease, both healthy and disease progression immune cell type programs were associated, whereas for epithelial cells, disease progression programs were most prominent, perhaps suggesting a role in disease progression over initiation. 
Our framework provides a powerful approach for identifying the cell types and cellular processes by which genetic variants influence disease.

18.
Nature ; 595(7865): 107-113, 2021 07.
Article En | MEDLINE | ID: mdl-33915569

COVID-19, which is caused by SARS-CoV-2, can result in acute respiratory distress syndrome and multiple organ failure1-4, but little is known about its pathophysiology. Here we generated single-cell atlases of 24 lung, 16 kidney, 16 liver and 19 heart autopsy tissue samples and spatial atlases of 14 lung samples from donors who died of COVID-19. Integrated computational analysis uncovered substantial remodelling in the lung epithelial, immune and stromal compartments, with evidence of multiple paths of failed tissue regeneration, including defective alveolar type 2 differentiation and expansion of fibroblasts and putative TP63+ intrapulmonary basal-like progenitor cells. Viral RNAs were enriched in mononuclear phagocytic and endothelial lung cells, which induced specific host programs. Spatial analysis in lung distinguished inflammatory host responses in lung regions with and without viral RNA. Analysis of the other tissue atlases showed transcriptional alterations in multiple cell types in heart tissue from donors with COVID-19, and mapped cell types and genes implicated with disease severity based on COVID-19 genome-wide association studies. Our foundational dataset elucidates the biological effect of severe SARS-CoV-2 infection across the body, a key step towards new treatments.


COVID-19/pathology , COVID-19/virology , Kidney/pathology , Liver/pathology , Lung/pathology , Myocardium/pathology , SARS-CoV-2/pathogenicity , Adult , Aged , Aged, 80 and over , Atlases as Topic , Autopsy , Biological Specimen Banks , COVID-19/genetics , COVID-19/immunology , Endothelial Cells , Epithelial Cells/pathology , Epithelial Cells/virology , Female , Fibroblasts , Genome-Wide Association Study , Heart/virology , Humans , Inflammation/pathology , Inflammation/virology , Kidney/virology , Liver/virology , Lung/virology , Male , Middle Aged , Organ Specificity , Phagocytes , Pulmonary Alveoli/pathology , Pulmonary Alveoli/virology , RNA, Viral/analysis , Regeneration , SARS-CoV-2/immunology , Single-Cell Analysis , Viral Load
19.
Nature ; 593(7858): 238-243, 2021 05.
Article En | MEDLINE | ID: mdl-33828297

Genome-wide association studies (GWAS) have identified thousands of noncoding loci that are associated with human diseases and complex traits, each of which could reveal insights into the mechanisms of disease1. Many of the underlying causal variants may affect enhancers2,3, but we lack accurate maps of enhancers and their target genes to interpret such variants. We recently developed the activity-by-contact (ABC) model to predict which enhancers regulate which genes and validated the model using CRISPR perturbations in several cell types4. Here we apply this ABC model to create enhancer-gene maps in 131 human cell types and tissues, and use these maps to interpret the functions of GWAS variants. Across 72 diseases and complex traits, ABC links 5,036 GWAS signals to 2,249 unique genes, including a class of 577 genes that appear to influence multiple phenotypes through variants in enhancers that act in different cell types. In inflammatory bowel disease (IBD), causal variants are enriched in predicted enhancers by more than 20-fold in particular cell types such as dendritic cells, and ABC achieves higher precision than other regulatory methods at connecting noncoding variants to target genes. These variant-to-function maps reveal an enhancer that contains an IBD risk variant and that regulates the expression of PPIF to alter the membrane potential of mitochondria in macrophages. Our study reveals principles of genome regulation, identifies genes that affect IBD and provides a resource and generalizable strategy to connect risk variants of common diseases to their molecular and cellular functions.


Enhancer Elements, Genetic/genetics , Genetic Predisposition to Disease , Genetic Variation/genetics , Genome, Human/genetics , Genome-Wide Association Study , Inflammatory Bowel Diseases/genetics , Cell Line , Chromosomes, Human, Pair 10/genetics , Cyclophilins/genetics , Dendritic Cells , Female , Humans , Macrophages/metabolism , Male , Mitochondria/metabolism , Organ Specificity/genetics , Phenotype
20.
bioRxiv ; 2021 Feb 25.
Article En | MEDLINE | ID: mdl-33655247

The SARS-CoV-2 pandemic has caused over 1 million deaths globally, mostly due to acute lung injury and acute respiratory distress syndrome, or direct complications resulting in multiple-organ failures. Little is known about the host tissue immune and cellular responses associated with COVID-19 infection, symptoms, and lethality. To address this, we collected tissues from 11 organs during the clinical autopsy of 17 individuals who succumbed to COVID-19, resulting in a tissue bank of approximately 420 specimens. We generated comprehensive cellular maps capturing COVID-19 biology related to patients' demise through single-cell and single-nucleus RNA-Seq of lung, kidney, liver and heart tissues, and further contextualized our findings through spatial RNA profiling of distinct lung regions. We developed a computational framework that incorporates removal of ambient RNA and automated cell type annotation to facilitate comparison with other healthy and diseased tissue atlases. In the lung, we uncovered significantly altered transcriptional programs within the epithelial, immune, and stromal compartments and cell intrinsic changes in multiple cell types relative to lung tissue from healthy controls. We observed evidence of: alveolar type 2 (AT2) differentiation replacing depleted alveolar type 1 (AT1) lung epithelial cells, as previously seen in fibrosis; a concomitant increase in myofibroblasts reflective of defective tissue repair; and, putative TP63+ intrapulmonary basal-like progenitor (IPBLP) cells, similar to cells identified in H1N1 influenza, that may serve as an emergency cellular reserve for severely damaged alveoli. Together, these findings suggest the activation and failure of multiple avenues for regeneration of the epithelium in these terminal lungs. SARS-CoV-2 RNA reads were enriched in lung mononuclear phagocytic cells and endothelial cells, and these cells expressed distinct host response transcriptional programs. We corroborated the compositional and transcriptional changes in lung tissue through spatial analysis of RNA profiles in situ and distinguished unique tissue host responses between regions with and without viral RNA, and in COVID-19 donor tissues relative to healthy lung. Finally, we analyzed genetic regions implicated in COVID-19 GWAS with transcriptomic data to implicate specific cell types and genes associated with disease severity. Overall, our COVID-19 cell atlas is a foundational dataset to better understand the biological impact of SARS-CoV-2 infection across the human body and empowers the identification of new therapeutic interventions and prevention strategies.

...