ABSTRACT
The human hippocampus and prefrontal cortex play critical roles in learning and cognition [1,2], yet the dynamic molecular characteristics of their development remain enigmatic. Here we investigated the epigenomic and three-dimensional chromatin conformational reorganization during the development of the hippocampus and prefrontal cortex, using more than 53,000 joint single-nucleus profiles of chromatin conformation and DNA methylation generated by single-nucleus methyl-3C sequencing (snm3C-seq3) [3]. The remodelling of DNA methylation is temporally separated from chromatin conformation dynamics. Using single-cell profiling and multimodal single-molecule imaging approaches, we have found that short-range chromatin interactions are enriched in neurons, whereas long-range interactions are enriched in glial cells and non-brain tissues. We reconstructed the regulatory programs of cell-type development and differentiation, finding putatively causal common variants for schizophrenia strongly overlapping with chromatin loop-connected, cell-type-specific regulatory regions. Our data provide multimodal resources for studying gene regulatory dynamics in brain development and demonstrate that single-cell three-dimensional multi-omics is a powerful approach for dissecting neuropsychiatric risk loci.
ABSTRACT
Myasthenia gravis (MG) is etiologically associated with thymus abnormalities, but its pathology in the thymus remains unclear. In this study, we attempt to narrow down the features associated with MG using spatial transcriptome analysis of thymoma and thymic hyperplasia samples. We find that the majority of thymomas are constituted by the cortical region. However, the small medullary region is enlarged in seropositive thymomas and contains polygenic enrichment and MG-specific germinal center structures. Neuromuscular medullary thymic epithelial cells, previously identified as MG-specific autoantigen-producing cells, are enriched in the cortico-medullary junction. The medulla is characterized by a specific chemokine pattern and immune cell composition, including migratory dendritic cells and effector regulatory T cells. Similar germinal center structures and immune microenvironments are also observed in the thymic hyperplasia medulla. This study shows that the medulla and junction areas are linked to MG pathology and provides insights into future MG research.
Subjects
Germinal Center , Myasthenia Gravis , Thymoma , Transcriptome , Myasthenia Gravis/pathology , Myasthenia Gravis/genetics , Humans , Thymoma/pathology , Thymoma/genetics , Germinal Center/metabolism , Germinal Center/pathology , Germinal Center/immunology , Transcriptome/genetics , Thymus Gland/pathology , Thymus Neoplasms/genetics , Thymus Neoplasms/pathology , Female , Male , Gene Expression Profiling , Middle Aged
ABSTRACT
Leveraging data from multiple ancestries can greatly improve fine-mapping power due to differences in linkage disequilibrium and allele frequencies. We propose MultiSuSiE, an extension of the sum of single effects model (SuSiE) to multiple ancestries that allows causal effect sizes to vary across ancestries based on a multivariate normal prior informed by empirical data. We evaluated MultiSuSiE via simulations and analyses of 14 quantitative traits leveraging whole-genome sequencing data in 47k African-ancestry and 94k European-ancestry individuals from All of Us. In simulations, MultiSuSiE applied to Afr47k+Eur47k was well-calibrated and attained higher power than SuSiE applied to Eur94k; interestingly, higher causal variant PIPs in Afr47k compared to Eur47k were entirely explained by differences in the extent of LD quantified by LD 4th moments. Compared to very recently proposed multi-ancestry fine-mapping methods, MultiSuSiE attained higher power and/or much lower computational costs, making the analysis of large-scale All of Us data feasible. In real trait analyses, MultiSuSiE applied to Afr47k+Eur94k identified 579 fine-mapped variants with PIP > 0.5, and MultiSuSiE applied to Afr47k+Eur47k identified 44% more fine-mapped variants with PIP > 0.5 than SuSiE applied to Eur94k. We validated MultiSuSiE results for real traits via functional enrichment of fine-mapped variants. We highlight several examples where MultiSuSiE implicates well-studied or biologically plausible fine-mapped variants that were not implicated by other methods.
ABSTRACT
The genetic architecture of human diseases and complex traits has been extensively studied, but little is known about the relationship of causal disease effect sizes between proximal SNPs, which have largely been assumed to be independent. We introduce a new method, LD SNP-pair effect correlation regression (LDSPEC), to estimate the correlation of causal disease effect sizes of derived alleles between proximal SNPs, depending on their allele frequencies, LD, and functional annotations; LDSPEC produced robust estimates in simulations across various genetic architectures. We applied LDSPEC to 70 diseases and complex traits from the UK Biobank (average N=306K), meta-analyzing results across diseases/traits. We detected significantly nonzero effect correlations for proximal SNP pairs (e.g., -0.37±0.09 for low-frequency positive-LD 0-100bp SNP pairs) that decayed with distance (e.g., -0.07±0.01 for low-frequency positive-LD 1-10kb), varied with allele frequency (e.g., -0.15±0.04 for common positive-LD 0-100bp), and varied with LD between SNPs (e.g., +0.12±0.05 for common negative-LD 0-100bp) (because we consider derived alleles, positive-LD and negative-LD SNP pairs may yield very different results). We further determined that SNP pairs with shared functions had stronger effect correlations that spanned longer genomic distances, e.g., -0.37±0.08 for low-frequency positive-LD same-gene promoter SNP pairs (average genomic distance of 47kb (due to alternative splicing)) and -0.32±0.04 for low-frequency positive-LD H3K27ac 0-1kb SNP pairs. Consequently, SNP-heritability estimates were substantially smaller than estimates of the sum of causal effect size variances across all SNPs (ratio of 0.87±0.02 across diseases/traits), particularly for certain functional annotations (e.g., 0.78±0.01 for common Super enhancer SNPs), even though these quantities are widely assumed to be equal.
We recapitulated our findings via forward simulations with an evolutionary model involving stabilizing selection, implicating the action of linkage masking, whereby haplotypes containing linked SNPs with opposite effects on disease have reduced effects on fitness and escape negative selection.
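The gap between SNP-heritability and the sum of causal effect size variances can be made explicit with a short derivation. This is a sketch under simplifying assumptions that are not taken from the abstract: genotypes are standardized with LD (correlation) matrix R = (r_jk), the phenotype has unit variance, and the causal effects β_j are random with mean zero and independent of genotype. The two-SNP numbers at the end use a hypothetical LD value of r = 0.5 together with the reported effect correlation of -0.37.

```latex
% SNP-heritability as the variance explained by all SNPs jointly
h^2_{\mathrm{SNP}}
  = \mathbb{E}\!\left[\beta^{\top} R\,\beta\right]
  = \underbrace{\sum_{j} \mathbb{E}\!\left[\beta_j^{2}\right]}_{\text{sum of causal effect size variances}}
  \;+\; \underbrace{\sum_{j \neq k} r_{jk}\,\mathbb{E}\!\left[\beta_j \beta_k\right]}_{\text{LD-weighted effect covariances}}

% Hypothetical two-SNP illustration: equal effect variance v,
% LD r = 0.5 (assumed), effect correlation \rho = -0.37
h^2_{\mathrm{SNP}} = 2v + 2\,r\,\rho\,v = 2v\,(1 + r\rho) = 2v\,(1 - 0.185),
\qquad
\frac{h^2_{\mathrm{SNP}}}{2v} = 0.815
```

When effects are independent the second term vanishes and the two quantities coincide. Negative effect correlations between SNPs in positive LD make the second term negative, which is the direction of the reported ratio below 1.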
ABSTRACT
Heritable diseases often manifest in a highly tissue-specific manner, with different disease loci mediated by genes in distinct tissues or cell types. We propose Tissue-Gene Fine-Mapping (TGFM), a fine-mapping method that infers the posterior probability (PIP) for each gene-tissue pair to mediate a disease locus by analyzing GWAS summary statistics (and in-sample LD) and leveraging eQTL data from diverse tissues to build cis-predicted expression models; TGFM also assigns PIPs to causal variants that are not mediated by gene expression in assayed genes and tissues. TGFM accounts for both co-regulation across genes and tissues and LD between SNPs (generalizing existing fine-mapping methods), and incorporates genome-wide estimates of each tissue's contribution to disease as tissue-level priors. TGFM was well-calibrated and moderately well-powered in simulations; unlike previous methods, TGFM was able to attain correct calibration by modeling uncertainty in cis-predicted expression models. We applied TGFM to 45 UK Biobank diseases/traits (average N=316K) using eQTL data from 38 GTEx tissues. TGFM identified an average of 147 PIP > 0.5 causal genetic elements per disease/trait, of which 11% were gene-tissue pairs. Implicated gene-tissue pairs were concentrated in known disease-critical tissues, and causal genes were strongly enriched in disease-relevant gene sets. Causal gene-tissue pairs identified by TGFM recapitulated known biology (e.g., TPO-thyroid for Hypothyroidism), but also included biologically plausible novel findings (e.g., SLC20A2-artery aorta for Diastolic blood pressure). Further application of TGFM to single-cell eQTL data from 9 cell types in peripheral blood mononuclear cells (PBMC), analyzed jointly with GTEx tissues, identified 30 additional causal gene-PBMC cell type pairs at PIP > 0.5, primarily for autoimmune disease and blood cell traits, including the well-established role of CTLA4 in CD8+ T cells for All autoimmune disease.
In conclusion, TGFM is a robust and powerful method for fine-mapping causal tissues and genes at disease-associated loci.
ABSTRACT
In autoimmune diseases such as rheumatoid arthritis, the immune system attacks the body's own cells. Developing a precise understanding of the cell states where noncoding autoimmune risk variants impart causal mechanisms is critical to developing curative therapies. Here, to identify noncoding regions with accessible chromatin that associate with cell-state-defining gene expression patterns, we leveraged multimodal single-nucleus RNA and assay for transposase-accessible chromatin (ATAC) sequencing data across 28,674 cells from the inflamed synovial tissue of 12 donors. Specifically, we used a multivariate Poisson model to predict peak accessibility from single-nucleus RNA sequencing principal components. For 14 autoimmune diseases, we discovered that cell-state-dependent ('dynamic') chromatin accessibility peaks in immune cell types were enriched for heritability, compared with cell-state-invariant ('cs-invariant') peaks. These dynamic peaks marked regulatory elements associated with T peripheral helper, regulatory T, dendritic and STAT1+CXCL10+ myeloid cell states. We argue that dynamic regulatory elements can help identify precise cell states enriched for disease-critical genetic variation.
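The idea of predicting peak accessibility from RNA principal components can be illustrated with a small Poisson regression. This is a minimal sketch on simulated data, not the authors' pipeline: it assumes a log-link model for one peak's per-cell counts with several PCs as predictors, and it uses a likelihood-ratio statistic against an intercept-only model as a hypothetical way to flag a peak as 'dynamic'. The function names `fit_poisson` and `poisson_loglik`, the simulated effect sizes, and the choice of test are all assumptions made for illustration.

```python
import numpy as np

def fit_poisson(X, y, n_iter=50, tol=1e-8):
    """Fit a log-link Poisson regression by Newton's method."""
    X1 = np.column_stack([np.ones(len(y)), X])    # prepend an intercept column
    beta = np.zeros(X1.shape[1])
    beta[0] = np.log(y.mean() + 1e-8)             # start from the intercept-only fit
    for _ in range(n_iter):
        mu = np.exp(X1 @ beta)                    # expected count per cell
        grad = X1.T @ (y - mu)                    # score vector
        hess = X1.T @ (X1 * mu[:, None])          # Fisher information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def poisson_loglik(X, y, beta):
    """Poisson log-likelihood up to the constant -sum(log(y!))."""
    eta = np.column_stack([np.ones(len(y)), X]) @ beta
    return float(np.sum(y * eta - np.exp(eta)))

# Simulated stand-in data: 5000 cells, 3 RNA PCs, one peak whose
# accessibility depends on PC1 and PC2 (hypothetical effect sizes).
rng = np.random.default_rng(0)
n_cells, n_pcs = 5000, 3
pcs = rng.standard_normal((n_cells, n_pcs))
true_beta = np.array([0.5, -0.3, 0.0])
counts = rng.poisson(np.exp(-1.0 + pcs @ true_beta))

beta_hat = fit_poisson(pcs, counts)
ll_full = poisson_loglik(pcs, counts, beta_hat)

# Intercept-only ("cell-state-invariant") model has a closed-form fit.
mu0 = counts.mean()
ll_null = float(np.sum(counts * np.log(mu0) - mu0))
lr_stat = 2.0 * (ll_full - ll_null)               # large value suggests a dynamic peak

print("estimated coefficients:", np.round(beta_hat, 3))
print("likelihood-ratio statistic:", round(lr_stat, 1))
```

A peak with no dependence on the PCs would give a likelihood-ratio statistic near the number of PCs on average, so a large statistic is the signal of cell-state dependence in this toy setting.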
Subjects
Autoimmune Diseases , Chromatin , Humans , Chromatin/genetics , Regulatory Sequences, Nucleic Acid/genetics , Chromosomes , Autoimmune Diseases/genetics , Genome, Human
ABSTRACT
The analysis of longitudinal data from electronic health records (EHRs) has the potential to improve clinical diagnoses and enable personalized medicine, motivating efforts to identify disease subtypes from patient comorbidity information. Here we introduce an age-dependent topic modeling (ATM) method that provides a low-rank representation of longitudinal records of hundreds of distinct diseases in large EHR datasets. We applied ATM to 282,957 UK Biobank samples, identifying 52 diseases with heterogeneous comorbidity profiles; analyses of 211,908 All of Us samples produced concordant results. We defined subtypes of the 52 heterogeneous diseases based on their comorbidity profiles and compared genetic risk across disease subtypes using polygenic risk scores (PRSs), identifying 18 disease subtypes whose PRS differed significantly from other subtypes of the same disease. We further identified specific genetic variants with subtype-dependent effects on disease risk. In conclusion, ATM identifies disease subtypes with differential genome-wide and locus-specific genetic risk profiles.
Subjects
Genetic Predisposition to Disease , Population Health , Humans , Biological Specimen Banks , Genome-Wide Association Study/methods , Risk Factors , Comorbidity , Multifactorial Inheritance/genetics , United Kingdom/epidemiology
ABSTRACT
Single-cell RNA sequencing (scRNA-seq) provides unique insights into the pathology and cellular origin of disease. We introduce single-cell disease relevance score (scDRS), an approach that links scRNA-seq with polygenic disease risk at single-cell resolution, independent of annotated cell types. scDRS identifies cells exhibiting excess expression across disease-associated genes implicated by genome-wide association studies (GWASs). We applied scDRS to 74 diseases/traits and 1.3 million single-cell gene-expression profiles across 31 tissues/organs. Cell-type-level results broadly recapitulated known cell-type-disease associations. Individual-cell-level results identified subpopulations of disease-associated cells not captured by existing cell-type labels, including T cell subpopulations associated with inflammatory bowel disease, partially characterized by their effector-like states; neuron subpopulations associated with schizophrenia, partially characterized by their spatial locations; and hepatocyte subpopulations associated with triglyceride levels, partially characterized by their higher ploidy levels. Genes whose expression was correlated with the scDRS score across cells (reflecting coexpression with GWAS disease-associated genes) were strongly enriched for gold-standard drug target and Mendelian disease genes.
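The notion of "excess expression across disease-associated genes" at single-cell resolution can be sketched as a per-cell score compared against control gene sets. The code below is a simplified illustration on simulated data and is not the scDRS implementation: it assumes unweighted mean expression as the raw score and uses random, unmatched control gene sets to form a Monte Carlo p-value per cell. The function name `disease_scores`, the data dimensions, and the planted signal are hypothetical.

```python
import numpy as np

def disease_scores(expr, disease_idx, n_ctrl=200, seed=0):
    """Per-cell raw disease score and Monte Carlo p-value.

    expr        : cells x genes matrix of normalized expression
    disease_idx : column indices of putative disease genes
    """
    rng = np.random.default_rng(seed)
    n_cells, n_genes = expr.shape
    raw = expr[:, disease_idx].mean(axis=1)        # mean expression of disease genes
    ctrl = np.empty((n_cells, n_ctrl))
    for b in range(n_ctrl):
        # Simplification: control sets are drawn uniformly at random.
        idx = rng.choice(n_genes, size=len(disease_idx), replace=False)
        ctrl[:, b] = expr[:, idx].mean(axis=1)
    # Fraction of control scores at least as large as the raw score, per cell.
    pval = (1 + (ctrl >= raw[:, None]).sum(axis=1)) / (1 + n_ctrl)
    return raw, pval

# Simulated stand-in data: 300 cells, 500 genes; the first 50 cells
# over-express the 20 hypothetical disease genes.
rng = np.random.default_rng(1)
n_cells, n_genes = 300, 500
expr = rng.standard_normal((n_cells, n_genes))
disease_idx = np.arange(20)
expr[:50, :20] += 1.0

raw, pval = disease_scores(expr, disease_idx)
print("median p-value, planted cells:", np.median(pval[:50]))
print("median p-value, other cells:  ", np.median(pval[50:]))
```

Because the score is computed per cell, no cell-type annotation enters the calculation, which mirrors the "independent of annotated cell types" property described in the abstract.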
Subjects
Genome-Wide Association Study , Single-Cell Analysis , Gene Expression Profiling/methods , Multifactorial Inheritance/genetics , RNA-Seq , Single-Cell Analysis/methods , Triglycerides
ABSTRACT
Dysbiosis of the oral microbiome mediates chronic periodontal disease. Realignment of microbial dysbiosis towards health may prevent disease. Treatment with antibiotics and probiotics can modulate the microbial, immunological, and clinical landscape of periodontal disease with some success. Antibacterial peptides or bacteriocins, such as nisin, and a nisin-producing probiotic, Lactococcus lactis, have not been examined in this context, yet warrant examination because of their biomedical benefits in eradicating biofilms and pathogenic bacteria, modulating immune mechanisms, and their safety profile in humans. This study's goal was to examine the potential for nisin and a nisin-producing probiotic to abrogate periodontal bone loss, the host inflammatory response, and changes in oral microbiome composition in a polymicrobial mouse model of periodontal disease. Nisin and a nisin-producing Lactococcus lactis probiotic significantly decreased the levels of several periodontal pathogens, alveolar bone loss, and the oral and systemic inflammatory host response. Surprisingly, nisin and/or the nisin-producing L. lactis probiotic enhanced the population of fibroblasts and osteoblasts despite the polymicrobial infection. Nisin mediated human periodontal ligament cell proliferation dose-dependently by increasing the proliferation marker, Ki-67. Nisin and probiotic treatment significantly shifted the oral microbiome towards the healthy control state; health was associated with Proteobacteria, whereas 3 retroviruses were associated with disease. Disease-associated microbial species were correlated with IL-6 levels. 
The ability of nisin or a nisin-producing probiotic to shift the oral microbiome towards health, mitigate periodontal destruction and the host immune response, and promote a novel proliferative phenotype in reparative connective tissue cells addresses key aspects of the pathogenesis of periodontal disease and reveals a new biomedical application for nisin in the treatment of periodontitis and in reparative medicine.
Subjects
Alveolar Bone Loss , Lactococcus lactis , Microbiota , Nisin , Periodontal Diseases , Probiotics , Alveolar Bone Loss/prevention & control , Animals , Anti-Bacterial Agents , Cell Proliferation , Dysbiosis , Lactococcus lactis/genetics , Mice , Periodontal Diseases/microbiology
ABSTRACT
Attempts to identify and prioritize functional DNA elements in coding and non-coding regions, particularly through use of in silico functional annotation data, continue to increase in popularity. However, specific functional roles can vary widely from one variant to another, making it challenging to summarize different aspects of variant function with a one-dimensional rating. Here we propose multi-dimensional annotation-class integrative estimation (MACIE), an unsupervised multivariate mixed-model framework capable of integrating annotations of diverse origin to assess multi-dimensional functional roles for both coding and non-coding variants. Unlike existing one-dimensional scoring methods, MACIE views variant functionality as a composite attribute encompassing multiple characteristics and estimates the joint posterior functional probabilities of each genomic position. This estimate offers more comprehensive and interpretable information in the presence of multiple aspects of functionality. Applied to a variety of independent coding and non-coding datasets, MACIE demonstrates powerful and robust performance in discriminating between functional and non-functional variants. We also show an application of MACIE to fine-mapping and heritability enrichment analysis by using the lipids GWAS summary statistics data from the European Network for Genetic and Genomic Epidemiology Consortium.
Subjects
Genome, Human , Genome-Wide Association Study , Genome, Human/genetics , Genome-Wide Association Study/methods , Genomics , Humans , Molecular Sequence Annotation , Polymorphism, Single Nucleotide/genetics , Probability
ABSTRACT
Aging is associated with complex molecular and cellular processes that are poorly understood. Here we leveraged the Tabula Muris Senis single-cell RNA-seq data set to systematically characterize gene expression changes during aging across diverse cell types in the mouse. We identified aging-dependent genes in 76 tissue-cell types from 23 tissues and characterized both shared and tissue-cell-specific aging behaviors. We found that the aging-related genes shared by multiple tissue-cell types also change their expression congruently in the same direction during aging in most tissue-cell types, suggesting a coordinated global aging behavior at the organismal level. Scoring cells based on these shared aging genes allowed us to contrast the aging status of different tissues and cell types from a transcriptomic perspective. In addition, we identified genes that exhibit age-related expression changes specific to each functional category of tissue-cell types. Altogether, our analyses provide one of the most comprehensive and systematic characterizations of the molecular signatures of aging across diverse tissue-cell types in a mammalian system.
Subjects
Cellular Senescence/genetics , Gene Expression Profiling , Mice/physiology , Single-Cell Analysis , Animals , Female , Male , Mice, Inbred C57BL
ABSTRACT
The influence of seasons on biological processes is poorly understood. To identify biological seasonal patterns based on diverse molecular data, rather than calendar dates, we performed deep longitudinal multiomics profiling of 105 individuals over 4 years. Here, we report more than 1000 seasonal variations in omics analytes and clinical measures. The different molecules group into two major seasonal patterns which correlate with peaks in late spring and late fall/early winter in California. The two patterns are enriched for molecules involved in human biological processes such as inflammation, immunity and cardiovascular health, as well as neurological and psychiatric conditions. Lastly, we identify molecules and microbes that demonstrate different seasonal patterns in insulin-sensitive and insulin-resistant individuals. The results of our study have important implications for healthcare and highlight the value of considering seasonality when assessing population-wide health risk and management.
Subjects
Environmental Exposure , Insulin Resistance/physiology , Metabolic Networks and Pathways/physiology , Microbiota/physiology , Seasons , Adult , Aged , Blood Glucose/analysis , Blood Glucose/metabolism , California , Cluster Analysis , Female , Health Status , Humans , Insulin/metabolism , Longitudinal Studies , Male , Metabolomics , Middle Aged , RNA-Seq
ABSTRACT
Periodontal disease is a microbially-mediated inflammatory disease of tooth-supporting tissues that leads to bone and tissue loss around teeth. Although bacterially-mediated mechanisms of alveolar bone destruction have been widely studied, the effects of a polymicrobial infection on the periodontal ligament and microbiome/virome have not been well explored. Therefore, the current investigation introduced a new mouse model of periodontal disease to examine the effects of a polymicrobial infection on periodontal ligament (PDL) properties, changes in bone loss, the host immune response, and the microbiome/virome using shotgun sequencing. Periodontal pathogens, namely Porphyromonas gingivalis, Treponema denticola, Tannerella forsythia, and Fusobacterium nucleatum, were used as the polymicrobial oral inoculum in BALB/cByJ mice. The polymicrobial infection triggered significant alveolar bone loss, a heightened antibody response, an elevated cytokine immune response, a significant shift in viral diversity and virome composition, and a widening of the PDL space; the latter two findings have not been previously reported in periodontal disease models. Changes in the PDL space were present at sites far away from the site of insult, indicating that the polymicrobial radius of effect extends beyond the bone loss areas and the site of initial infection and is wider than previously appreciated. Associations were found between bone loss, specific viral and bacterial species, immune genes, and PDL space changes. These findings may have significant implications for the pathogenesis of periodontal disease and the biomechanical properties of the periodontium. This new polymicrobial mouse model of periodontal disease in a common mouse strain is useful for evaluating the features of periodontal disease.
Subjects
Alveolar Bone Loss/microbiology , Cytokines/metabolism , Periodontal Diseases/microbiology , Periodontal Ligament/virology , Viruses/classification , Alveolar Bone Loss/virology , Animals , Disease Models, Animal , Female , Fusobacterium nucleatum/pathogenicity , Metagenomics/methods , Mice , Mice, Inbred BALB C , Periodontal Diseases/immunology , Periodontal Diseases/virology , Periodontal Ligament/microbiology , Phylogeny , Porphyromonas gingivalis/pathogenicity , Tannerella forsythia/pathogenicity , Treponema denticola/pathogenicity , Viruses/genetics , Viruses/immunology , Viruses/isolation & purification
ABSTRACT
An underlying question for virtually all single-cell RNA sequencing experiments is how to allocate the limited sequencing budget: deep sequencing of a few cells or shallow sequencing of many cells? Here we present a mathematical framework which reveals that, for estimating many important gene properties, the optimal allocation is to sequence at a depth of around one read per cell per gene. Interestingly, the corresponding optimal estimator is not the widely-used plug-in estimator, but one developed via empirical Bayes.
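The contrast between a plug-in estimator and a noise-corrected one can be seen in a small simulation with a fixed sequencing budget. This is only a sketch under assumptions that do not come from the abstract: true per-cell expression of one gene follows a gamma distribution with mean 1, observed counts are Poisson with mean depth × expression, and the target quantity is the variance of true expression. The moment-corrected estimator used here stands in for an empirical Bayes style estimator; whether it matches the estimator in the paper is an assumption. The function name `variance_estimates` and the budget value are hypothetical.

```python
import numpy as np

def variance_estimates(depth, budget, rng, shape=2.0, scale=0.5):
    """Estimate the variance of true expression at a given depth.

    budget / depth cells are sequenced, so deeper sequencing means fewer cells.
    Returns (plug_in, corrected).
    """
    n_cells = int(round(budget / depth))
    x = rng.gamma(shape, scale, size=n_cells)   # true relative expression per cell
    y = rng.poisson(depth * x)                  # observed read counts
    # Plug-in: treat scaled counts as if they were the true expression.
    plug_in = np.var(y / depth, ddof=1)
    # Correction: Var(Y) = depth * E[X] + depth^2 * Var(X) for Poisson sampling,
    # so subtracting the mean count removes the sampling-noise term.
    corrected = (np.var(y, ddof=1) - y.mean()) / depth**2
    return plug_in, corrected

rng = np.random.default_rng(2)
true_var = 2.0 * 0.5**2                          # gamma variance = shape * scale^2
results = {d: variance_estimates(d, budget=20000, rng=rng) for d in (0.1, 1.0, 10.0)}

print("true variance:", true_var)
for d, (plug_in, corrected) in results.items():
    print(f"depth {d:>4}: plug-in {plug_in:.3f}, corrected {corrected:.3f}")
```

With mean expression 1, a depth of 1.0 corresponds to about one read per cell for this gene. At shallow depth the plug-in estimate is dominated by Poisson sampling noise, whereas the corrected estimate remains usable, which is why shallow sequencing of many cells can be viable when paired with a suitable estimator. The simulation does not by itself establish which depth is optimal.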