ABSTRACT
The human hippocampus and prefrontal cortex play critical roles in learning and cognition1,2, yet the dynamic molecular characteristics of their development remain enigmatic. Here we investigated the epigenomic and three-dimensional chromatin conformational reorganization during the development of the hippocampus and prefrontal cortex, using more than 53,000 joint single-nucleus profiles of chromatin conformation and DNA methylation generated by single-nucleus methyl-3C sequencing (snm3C-seq3)3. The remodelling of DNA methylation is temporally separated from chromatin conformation dynamics. Using single-cell profiling and multimodal single-molecule imaging approaches, we have found that short-range chromatin interactions are enriched in neurons, whereas long-range interactions are enriched in glial cells and non-brain tissues. We reconstructed the regulatory programs of cell-type development and differentiation, finding putatively causal common variants for schizophrenia strongly overlapping with chromatin loop-connected, cell-type-specific regulatory regions. Our data provide multimodal resources for studying gene regulatory dynamics in brain development and demonstrate that single-cell three-dimensional multi-omics is a powerful approach for dissecting neuropsychiatric risk loci.
ABSTRACT
Genome wide studies are yielding a growing catalogue of common and rare variants that confer risk for psychopathology. Yet, despite representing unprecedented progress, emerging data also indicate that the full promise of psychiatric genetics - including understanding pathophysiology and improving personalized care - will not be fully realized by targeting traditional, dichotomous diagnostic categories. The current article provides reflections on themes emerging from a 2021 NIMH sponsored conference convened to address strategies for the evolving field of psychiatric genetics. As anticipated by NIMH's Research Domain Framework, multi-level investigations of dimensional and transdiagnostic phenotypes, particularly when integrated with biobanks and big data, will be critical to advancing knowledge. The path forward will also require more diverse representation in source studies. Additionally, progress will be catalyzed by a range of converging approaches, including capitalizing on computational methods, pursuing biological insights, working within a developmental framework, and engaging healthcare systems and patient communities.
ABSTRACT
While respiratory diseases such as COPD and asthma share many risk factors, most studies investigate them in isolation and in predominantly European ancestry populations. Here, we conducted the most powerful multi-trait and -ancestry genetic analysis of respiratory diseases and auxiliary traits to date. Our approach improves the power of genetic discovery across traits and ancestries, identifying 44 novel loci associated with lung function in individuals of East Asian ancestry. Using these results, we developed PRSxtra (cross TRait and Ancestry), a multi-trait and -ancestry polygenic risk score approach that leverages shared components of heritable risk via pleiotropic effects. PRSxtra significantly improved the prediction of asthma, COPD, and lung cancer compared to trait- and ancestry-matched PRS in a multi-ancestry cohort from the All of Us Research Program, especially in diverse populations. PRSxtra identified individuals in the top decile with over four-fold odds of asthma and COPD compared to the first decile. Our results present a new framework for multi-trait and -ancestry studies of respiratory diseases to improve genetic discovery and polygenic prediction.
ABSTRACT
Obesity is a significant public health concern. GLP-1 receptor agonists (GLP1-RA), predominantly in use as a type 2 diabetes treatment, are a promising pharmacological approach for weight loss, while bariatric surgery (BS) remains a durable, but invasive, intervention. Despite observed heterogeneity in weight loss effects, the genetic effects on weight loss from GLP1-RA and BS have not been extensively explored in large sample sizes, and most studies have focused on differences in race and ethnicity, rather than genetic ancestry. We studied whether genetic factors, previously shown to affect body weight, impact weight loss due to GLP1-RA therapy or BS in 10,960 individuals from 9 multi-ancestry biobank studies in 6 countries. The average weight change between 6 and 12 months from therapy initiation was -3.93% for GLP1-RA users, with marginal differences across genetic ancestries. For BS patients the weight change between 6 and 48 months from the operation was -21.17%. There were no significant associations between weight loss due to GLP1-RA and polygenic scores for BMI or type 2 diabetes or specific missense variants in the GLP1R, PCSK1 and APOE genes, after multiple-testing correction. However, a higher polygenic score for BMI was significantly linked to lower weight loss after BS (+0.7% for 1 standard deviation change in the polygenic score, P = 1.24×10⁻⁴). In contrast, higher weight at baseline was associated with greater weight loss. Our findings suggest that existing polygenic scores related to weight and type 2 diabetes and missense variants in the drug target gene do not have a large impact on GLP1-RA effectiveness. Our results also confirm the effectiveness of these treatments across all major continental ancestry groups considered.
ABSTRACT
Importance: Polygenic risk scores (PRSs) for coronary artery disease (CAD) are a growing clinical and commercial reality. Whether existing scores provide similar individual-level assessments of disease liability is a critical consideration for clinical implementation that remains uncharacterized. Objective: Characterize the reliability of CAD PRSs that perform equivalently at the population level at predicting individual-level risk. Design: Cross-sectional Study. Setting: All of Us Research Program (AOU), Penn Medicine Biobank (PMBB), and UCLA ATLAS Precision Health Biobank. Participants: Volunteers of diverse genetic backgrounds enrolled in AOU, PMBB, and UCLA with available electronic health record and genotyping data. Exposures: Polygenic risk for CAD from previously published PRSs and new PRSs developed separately from the testing cohorts. Main Outcomes and Measures: Sets of CAD PRSs that perform population prediction equivalently were identified by comparing calibration and discrimination (Brier score and AUROC) of generalized linear models of prevalent CAD using Bayesian analysis of variance. Among equivalently performing scores, individual-level agreement between risk estimates was tested with intraclass correlation (ICC) and Light's Kappa, measures of inter-rater reliability. Results: 50 PRSs were calculated for 171,095 AOU participants. When included in a model of prevalent CAD, 48 scores had practically equivalent Brier scores and AUROCs (region of practical equivalence = 0.02). Across these scores, 84% of participants had at least one score in both the top and bottom risk quintile. Continuous agreement of individual risk predictions from the 48 scores was poor, with an ICC of 0.351 (95% CI; 0.349, 0.352). Agreement between two statistically equivalent scores was moderate, with an ICC of 0.649 (95% CI; 0.646, 0.652). 
Light's Kappa, used to evaluate consistency of assignment to high-risk thresholds, did not exceed 0.56 (interpreted as 'fair') across statistically and practically equivalent scores. Repeating the analysis among 41,193 PMBB and 50,748 UCLA participants yielded different sets of statistically and practically equivalent scores which also lacked strong individual agreement. Conclusions and Relevance: Across three diverse biobanks, CAD PRSs that performed equivalently at the population level produced unreliable individual risk estimates. Approaches to clinical implementation of CAD PRSs must consider the potential for discordant individual risk estimates from otherwise indistinguishable scores.
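The discordance statistic reported above (participants placed in both the top and the bottom risk quintile by different scores) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the rank-based quintile rule, and the toy inputs are assumptions made for illustration only.

```python
from typing import List


def quintile_labels(scores: List[float]) -> List[int]:
    """Assign each participant a quintile by rank (0 = lowest, 4 = highest).

    Assumption: quintiles are cut on ranks within the cohort; ties are
    broken by input order, which a real analysis may handle differently.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = min(4, rank * 5 // n)
    return labels


def discordant_fraction(score_matrix: List[List[float]]) -> float:
    """Fraction of participants that at least one score places in the top
    quintile while at least one other score places them in the bottom one.

    score_matrix[k][i] is the value of (hypothetical) score k for participant i.
    """
    per_score = [quintile_labels(s) for s in score_matrix]
    n = len(score_matrix[0])
    discordant = 0
    for i in range(n):
        seen = {labels[i] for labels in per_score}
        if 0 in seen and 4 in seen:
            discordant += 1
    return discordant / n


# Toy data (made up): two scores that rank ten participants in opposite order.
score_a = [float(i) for i in range(10)]
score_b = score_a[::-1]
print(discordant_fraction([score_a, score_b]))
```

With many scores per participant, this fraction can be large even when every score has a similar population-level AUROC, which is the kind of individual-level disagreement the abstract describes.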
ABSTRACT
Genetic risk modeling for dementia offers significant benefits, but studies based on real-world data, particularly for underrepresented populations, are limited. We employ an Elastic Net model for dementia risk prediction using single-nucleotide polymorphisms prioritized by functional genomic data from multiple neurodegenerative disease genome-wide association studies. We compare this model with APOE and polygenic risk score models across genetic ancestry groups (Hispanic Latino American sample: 610 patients with 126 cases; African American sample: 440 patients with 84 cases; East Asian American sample: 673 patients with 75 cases), using electronic health records from UCLA Health for discovery and the All of Us cohort for validation. Our model significantly outperforms other models across multiple ancestries, improving the area-under-precision-recall curve by 31-84% (Wilcoxon signed-rank test p-value <0.05) and the area-under-the-receiver-operating characteristic by 11-17% (DeLong test p-value <0.05) compared to the APOE and the polygenic risk score models. We identify shared and ancestry-specific risk genes and biological pathways, reinforcing and adding to existing knowledge. Our study highlights the benefits of integrating functional mapping, multiple neurodegenerative diseases, and machine learning for genetic risk models in diverse populations. Our findings hold potential for refining precision medicine strategies in dementia diagnosis.
Subjects
Dementia; Genetic Predisposition to Disease; Genome-Wide Association Study; Polymorphism, Single Nucleotide; Humans; Dementia/genetics; Dementia/epidemiology; Female; Male; Aged; Models, Genetic; Multifactorial Inheritance; Risk Factors; Risk Assessment/methods; Middle Aged
ABSTRACT
Recent studies have demonstrated that polygenic risk scores (PRS) trained on multi-ancestry data can improve prediction accuracy in groups historically underrepresented in genomic studies, but the availability of linked health and genetic data from large-scale diverse cohorts representative of a wide spectrum of human diversity remains limited. To address this need, the All of Us research program (AoU) generated whole-genome sequences of 245,388 individuals who collectively reflect the diversity of the USA. Leveraging this resource and another widely-used population-scale biobank, the UK Biobank (UKB) with a half million participants, we developed PRS trained on multi-ancestry and multi-biobank data with up to ~750,000 participants for 32 common, complex traits and diseases across a range of genetic architectures. We then compared effects of ancestry, PRS methodology, and genetic architecture on PRS accuracy across a held out subset of ancestrally diverse AoU participants. Due to the more heterogeneous study design of AoU, we found lower heritability on average compared to UKB (0.075 vs 0.165), which limited the maximal achievable PRS accuracy in AoU. Overall, we found that the increased diversity of AoU significantly improved PRS performance in some participants in AoU, especially underrepresented individuals, across multiple phenotypes. Notably, maximizing sample size by combining discovery data across AoU and UKB is not the optimal approach for predicting some phenotypes in African ancestry populations; rather, using data from only AoU for these traits resulted in the greatest accuracy. This was especially true for less polygenic traits with large ancestry-enriched effects, such as neutrophil count (R²: 0.055 vs. 0.035 using AoU vs. cross-biobank meta-analysis, respectively, because of, e.g., DARC). Lastly, we calculated individual-level PRS accuracies rather than grouping by continental ancestry, a critical step towards interpretability in precision medicine.
Individualized PRS accuracy decays linearly as a function of ancestry divergence, but the slope was smaller using multi-ancestry GWAS compared to using European GWAS. Our results highlight the potential of biobanks with more balanced representations of human diversity to facilitate more accurate PRS for the individuals least represented in genomic studies.
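The comparison of slopes described above can be sketched as an ordinary least-squares fit of accuracy against ancestry divergence, done once per training GWAS. The sketch below is only an illustration: `fit_line` is a hypothetical helper, and the distance and accuracy values are made-up toy numbers, not results from the study.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need two or more paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy values (assumptions): divergence from the training data on an
# arbitrary scale, and individual-level PRS accuracy under two GWAS choices.
divergence = [0.0, 1.0, 2.0, 3.0]
accuracy_european_gwas = [0.10, 0.07, 0.04, 0.01]
accuracy_multi_ancestry_gwas = [0.10, 0.08, 0.06, 0.04]

slope_eur, _ = fit_line(divergence, accuracy_european_gwas)
slope_multi, _ = fit_line(divergence, accuracy_multi_ancestry_gwas)
# A slope closer to zero means accuracy decays more slowly with divergence.
print(slope_eur, slope_multi)
```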
ABSTRACT
Polygenic scores (PGS) have emerged as the tool of choice for genomic prediction in a wide range of fields. We show that PGS performance varies broadly across contexts and biobanks. Contexts such as age, sex and income can impact PGS accuracy with similar magnitudes as genetic ancestry. Here we introduce an approach (CalPred) that models all contexts jointly to produce prediction intervals that vary across contexts to achieve calibration (include the trait with 90% probability), whereas existing methods are miscalibrated. In analyses of 72 traits across large and diverse biobanks (All of Us and UK Biobank), we find that prediction intervals required adjustment by up to 80% for quantitative traits. For disease traits, PGS-based predictions were miscalibrated across socioeconomic contexts such as annual household income levels, further highlighting the need of accounting for context information in PGS-based prediction across diverse populations.
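To make the calibration criterion concrete, the sketch below checks how often a nominal 90% prediction interval contains the observed trait within each context, and computes the width multiplier a context would need to reach the target. It is not CalPred itself, which models contexts jointly; the function names and the simple per-context rescaling are assumptions for illustration.

```python
import math
from collections import defaultdict
from typing import Dict, Hashable, List


def coverage_by_context(y: List[float], lower: List[float], upper: List[float],
                        context: List[Hashable]) -> Dict[Hashable, float]:
    """Empirical fraction of observed values inside [lower, upper], per context."""
    hits: Dict[Hashable, int] = defaultdict(int)
    totals: Dict[Hashable, int] = defaultdict(int)
    for yi, lo, hi, c in zip(y, lower, upper, context):
        totals[c] += 1
        if lo <= yi <= hi:
            hits[c] += 1
    return {c: hits[c] / totals[c] for c in totals}


def width_scale_for_target(y: List[float], pred: List[float],
                           half_width: List[float], target: float = 0.9) -> float:
    """Smallest multiplier of the interval half-width such that at least
    `target` of the observations fall inside pred +/- multiplier * half_width."""
    if not y:
        raise ValueError("no observations")
    ratios = sorted(abs(yi - pi) / hw for yi, pi, hw in zip(y, pred, half_width))
    k = math.ceil(target * len(ratios))  # number of points that must be covered
    return ratios[k - 1]


# Toy example (made-up numbers): intervals are too narrow in context "b".
y = [1.0, 2.0, 3.0, 4.0]
lower = [0.0, 0.0, 0.0, 0.0]
upper = [2.0, 2.0, 5.0, 3.0]
context = ["a", "a", "b", "b"]
print(coverage_by_context(y, lower, upper, context))
```

A context whose coverage falls well below 0.9 needs wider intervals, and one well above 0.9 could use narrower ones, which is the sense in which intervals "vary across contexts".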
Subjects
Genome-Wide Association Study; Models, Genetic; Multifactorial Inheritance; Humans; Multifactorial Inheritance/genetics; Genome-Wide Association Study/methods; Female; Male; Calibration; Biological Specimen Banks; Phenotype; Genomics/methods; Polymorphism, Single Nucleotide
ABSTRACT
Recent studies have highlighted the essential role of RNA splicing, a key mechanism of alternative RNA processing, in establishing connections between genetic variations and disease. Genetic loci influencing RNA splicing variations show considerable influence on complex traits, possibly surpassing those affecting total gene expression. Dysregulated RNA splicing has emerged as a major potential contributor to neurological and psychiatric disorders, likely due to the exceptionally high prevalence of alternatively spliced genes in the human brain. Nevertheless, establishing direct associations between genetically altered splicing and complex traits has remained an enduring challenge. We introduce Spliced-Transcriptome-Wide Associations (SpliTWAS) to integrate alternative splicing information with genome-wide association studies to pinpoint genes linked to traits through exon splicing events. We applied SpliTWAS to two schizophrenia (SCZ) RNA-sequencing datasets, BrainGVEX and CommonMind, revealing 137 and 88 trait-associated exons (in 84 and 67 genes), respectively. Enriched biological functions in the associated gene sets converged on neuronal function and development, immune cell activation, and cellular transport, which are highly relevant to SCZ. SpliTWAS variants impacted RNA-binding protein binding sites, revealing potential disruption of RNA-protein interactions affecting splicing. We extended the probabilistic fine-mapping method FOCUS to the exon level, identifying 36 genes and 48 exons as putatively causal for SCZ. We highlight VPS45 and APOPT1, where splicing of specific exons was associated with disease risk, eluding detection by conventional gene expression analysis. Collectively, this study supports the substantial role of alternative splicing in shaping the genetic basis of SCZ, providing a valuable approach for future investigations in this area.
Subjects
Alternative Splicing; Exons; Genome-Wide Association Study; Schizophrenia; Transcriptome; Humans; Schizophrenia/genetics; Alternative Splicing/genetics; Exons/genetics; Genetic Predisposition to Disease; RNA Splicing/genetics; Polymorphism, Single Nucleotide
ABSTRACT
Venous thromboembolism (VTE) is a significant contributor to morbidity and mortality, with large disparities in incidence rates between Black and White Americans. Polygenic risk scores (PRSs) limited to variants discovered in genome-wide association studies in European-ancestry samples can identify European-ancestry individuals at high risk of VTE. However, there is limited evidence on whether high-dimensional PRS constructed using more sophisticated methods and more diverse training data can enhance the predictive ability and their utility across diverse populations. We developed PRSs for VTE using summary statistics from the International Network against Venous Thrombosis (INVENT) consortium genome-wide association studies meta-analyses of European- (71 771 cases and 1 059 740 controls) and African-ancestry samples (7482 cases and 129 975 controls). We used LDpred2 and PRS-CSx to construct ancestry-specific and multi-ancestry PRSs and evaluated their performance in an independent European- (6781 cases and 103 016 controls) and African-ancestry sample (1385 cases and 12 569 controls). Multi-ancestry PRSs with weights tuned in European-ancestry samples slightly outperformed ancestry-specific PRSs in European-ancestry test samples (e.g. the area under the receiver operating curve [AUC] was 0.609 for PRS-CSx_combinedEUR and 0.608 for PRS-CSxEUR [P = 0.00029]). Multi-ancestry PRSs with weights tuned in African-ancestry samples also outperformed ancestry-specific PRSs in African-ancestry test samples (PRS-CSxAFR: AUC = 0.58, PRS-CSx_combined AFR: AUC = 0.59), although this difference was not statistically significant (P = 0.34). The highest fifth percentile of the best-performing PRS was associated with 1.9-fold and 1.68-fold increased risk for VTE among European- and African-ancestry subjects, respectively, relative to those in the middle stratum. 
These findings suggest that the multi-ancestry PRS might be used to improve performance across diverse populations to identify individuals at highest risk for VTE.
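The tail-versus-middle comparison reported above can be sketched as an odds ratio between the top 5% of the score distribution and a middle stratum. The abstract does not define the middle stratum or the exact risk measure, so the 40th to 60th percentile band and the use of an unadjusted odds ratio here are assumptions, as are the function name and toy data.

```python
from typing import List


def top_vs_middle_odds_ratio(prs: List[float], is_case: List[bool],
                             top_frac: float = 0.05,
                             mid_lo: float = 0.40, mid_hi: float = 0.60) -> float:
    """Unadjusted odds ratio of disease: top `top_frac` of PRS vs. the
    [mid_lo, mid_hi) percentile band (assumed definition of "middle")."""
    n = len(prs)
    order = sorted(range(n), key=lambda i: prs[i])  # ascending by score
    n_top = max(1, round(n * top_frac))
    top = order[n - n_top:]
    mid = order[round(n * mid_lo):round(n * mid_hi)]

    def odds(indices: List[int]) -> float:
        cases = sum(1 for i in indices if is_case[i])
        controls = len(indices) - cases
        if controls == 0:
            raise ValueError("stratum has no controls; odds undefined")
        return cases / controls

    mid_odds = odds(mid)
    if mid_odds == 0:
        raise ValueError("middle stratum has no cases; ratio undefined")
    return odds(top) / mid_odds


# Toy cohort (made up): 100 people, 3 of 5 cases in the top 5%,
# 4 of 20 cases in the middle band.
prs = [float(i) for i in range(100)]
is_case = [i in (95, 96, 97, 40, 41, 42, 43) for i in range(100)]
print(top_vs_middle_odds_ratio(prs, is_case))
```

A published analysis would typically estimate this from a regression adjusted for covariates such as age, sex, and ancestry principal components rather than from raw counts.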
Subjects
Genetic Risk Score; Venous Thromboembolism; Female; Humans; Male; Black or African American/genetics; Case-Control Studies; Genetic Predisposition to Disease; Genome-Wide Association Study; Polymorphism, Single Nucleotide; Venous Thromboembolism/genetics; Venous Thromboembolism/epidemiology; White People/genetics
ABSTRACT
Polygenic scores (PGSs) summarize the combined effect of common risk variants and are associated with breast cancer risk in patients without identifiable monogenic risk factors. One of the most well-validated PGSs in breast cancer to date is PGS313, which was developed from a Northern European biobank but has shown attenuated performance in non-European ancestries. We further investigate the generalizability of the PGS313 for American women of European (EA), African (AFR), Asian (EAA), and Latinx (HL) ancestry within one institution with a singular electronic health record (EHR) system, genotyping platform, and quality control process. We found that the PGS313 achieved overlapping areas under the receiver operator characteristic (ROC) curve (AUCs) in females of HL (AUC = 0.68, 95% confidence interval [CI] = 0.65-0.71) and EA ancestry (AUC = 0.70, 95% CI = 0.69-0.71) but lower AUCs for the AFR and EAA populations (AFR: AUC = 0.61, 95% CI = 0.56-0.65; EAA: AUC = 0.64, 95% CI = 0.60-0.68). While PGS313 is associated with hormone-receptor-positive (HR+) disease in EA Americans (odds ratio [OR] = 1.42, 95% CI = 1.16-1.64), this association is lost in African, Latinx, and Asian Americans. In summary, we found that PGS313 was significantly associated with breast cancer but with attenuated accuracy in women of AFR and EAA descent within a singular health system in Los Angeles. Our work further highlights the need for additional validation in diverse cohorts prior to the clinical implementation of PGSs.
Subjects
Biological Specimen Banks; Breast Neoplasms; Genetic Predisposition to Disease; Humans; Breast Neoplasms/genetics; Breast Neoplasms/epidemiology; Breast Neoplasms/ethnology; Female; Los Angeles/epidemiology; Middle Aged; Risk Factors; Multifactorial Inheritance; ROC Curve; Adult; Aged; Polymorphism, Single Nucleotide
ABSTRACT
Neuropsychiatric genome-wide association studies (GWASs), including those for autism spectrum disorder and schizophrenia, show strong enrichment for regulatory elements in the developing brain. However, prioritizing risk genes and mechanisms is challenging without a unified regulatory atlas. Across 672 diverse developing human brains, we identified 15,752 genes harboring gene, isoform, and/or splicing quantitative trait loci, mapping 3739 to cellular contexts. Gene expression heritability drops during development, likely reflecting both increasing cellular heterogeneity and the intrinsic properties of neuronal maturation. Isoform-level regulation, particularly in the second trimester, mediated the largest proportion of GWAS heritability. Through colocalization, we prioritized mechanisms for about 60% of GWAS loci across five disorders, exceeding adult brain findings. Finally, we contextualized results within gene and isoform coexpression networks, revealing the comprehensive landscape of transcriptome regulation in development and disease.
Subjects
Alternative Splicing; Brain; Gene Expression Regulation, Developmental; Mental Disorders; Humans; Atlases as Topic; Autism Spectrum Disorder/genetics; Brain/metabolism; Brain/growth & development; Brain/embryology; Gene Regulatory Networks; Genome-Wide Association Study; Protein Isoforms/genetics; Protein Isoforms/metabolism; Quantitative Trait Loci; Schizophrenia/genetics; Transcriptome; Mental Disorders/genetics
ABSTRACT
Human inborn errors of immunity include rare disorders entailing functional and quantitative antibody deficiencies due to impaired B cells called the common variable immunodeficiency (CVID) phenotype. Patients with CVID face delayed diagnoses and treatments for 5 to 15 years after symptom onset because the disorders are rare (prevalence of ~1/25,000), and there is extensive heterogeneity in CVID phenotypes, ranging from infections to autoimmunity to inflammatory conditions, overlapping with other more common disorders. The prolonged diagnostic odyssey drives excessive system-wide costs before diagnosis. Because there is no single causal mechanism, there are no genetic tests to definitively diagnose CVID. Here, we present PheNet, a machine learning algorithm that identifies patients with CVID from their electronic health records (EHRs). PheNet learns phenotypic patterns from verified CVID cases and uses this knowledge to rank patients by likelihood of having CVID. PheNet could have diagnosed more than half of our patients with CVID 1 or more years earlier than they had been diagnosed. When applied to a large EHR dataset, followed by blinded chart review of the top 100 patients ranked by PheNet, we found that 74% were highly probable to have CVID. We externally validated PheNet using >6 million records from disparate medical systems in California and Tennessee. As artificial intelligence and machine learning make their way into health care, we show that algorithms such as PheNet can offer clinical benefits by expediting the diagnosis of rare diseases.
Subjects
Common Variable Immunodeficiency; Electronic Health Records; Humans; Common Variable Immunodeficiency/diagnosis; Machine Learning; Algorithms; Male; Female; Phenotype; Adult; Undiagnosed Diseases/diagnosis
ABSTRACT
RNA splicing is highly prevalent in the brain and has strong links to neuropsychiatric disorders; yet, the role of cell type-specific splicing and transcript-isoform diversity during human brain development has not been systematically investigated. In this work, we leveraged single-molecule long-read sequencing to deeply profile the full-length transcriptome of the germinal zone and cortical plate regions of the developing human neocortex at tissue and single-cell resolution. We identified 214,516 distinct isoforms, of which 72.6% were novel (not previously annotated in Gencode version 33), and uncovered a substantial contribution of transcript-isoform diversity, regulated by RNA-binding proteins, in defining cellular identity in the developing neocortex. We leveraged this comprehensive isoform-centric gene annotation to reprioritize thousands of rare de novo risk variants and elucidate genetic risk mechanisms for neuropsychiatric disorders.
Subjects
Mental Disorders; Neocortex; Neurogenesis; Protein Isoforms; RNA Splicing; Single-Cell Analysis; Transcriptome; Humans; Alternative Splicing; Genetic Predisposition to Disease; Mental Disorders/genetics; Molecular Sequence Annotation; Neocortex/metabolism; Neocortex/embryology; Protein Isoforms/genetics; Protein Isoforms/metabolism; RNA-Binding Proteins/genetics; RNA-Binding Proteins/metabolism; Neurogenesis/genetics
ABSTRACT
Few neuropsychiatric disorders have replicable biomarkers, prompting high-resolution and large-scale molecular studies. However, we still lack consensus on a more foundational question: whether quantitative shifts in cell types (the functional unit of life) contribute to neuropsychiatric disorders. Leveraging advances in human brain single-cell methylomics, we deconvolve seven major cell types using bulk DNA methylation profiling across 1270 postmortem brains, including from individuals diagnosed with Alzheimer's disease, schizophrenia, and autism. We observe and replicate cell-type compositional shifts for Alzheimer's disease (endothelial cell loss), autism (increased microglia), and schizophrenia (decreased oligodendrocytes), and find age- and sex-related changes. Multiple layers of evidence indicate that endothelial cell loss contributes to Alzheimer's disease, with comparable effect size to APOE genotype among older people. Genome-wide association identified five genetic loci related to cell-type composition, involving plausible genes for the neurovascular unit (P2RX5 and TRPV3) and excitatory neurons (DPY30 and MEMO1). These results implicate specific cell-type shifts in the pathophysiology of neuropsychiatric disorders.
Subjects
Alzheimer Disease; Autistic Disorder; Brain; DNA Methylation; Schizophrenia; Humans; Alzheimer Disease/genetics; Alzheimer Disease/pathology; Alzheimer Disease/metabolism; Schizophrenia/genetics; Schizophrenia/pathology; Brain/metabolism; Brain/pathology; Autistic Disorder/genetics; Autistic Disorder/pathology; Male; Female; Genome-Wide Association Study; Aged; Endothelial Cells/metabolism; Endothelial Cells/pathology; Epigenomics/methods; Middle Aged; Aged, 80 and over
ABSTRACT
SUMMARY: Admixed populations, with their unique and diverse genetic backgrounds, are often underrepresented in genetic studies. This oversight not only limits our understanding but also exacerbates existing health disparities. One major barrier has been the lack of efficient tools tailored to the special challenges of genetic studies of admixed populations. Here, we present admix-kit, an integrated toolkit and pipeline for genetic analyses of admixed populations. Admix-kit implements a suite of methods to facilitate genotype and phenotype simulation, association testing, genetic architecture inference, and polygenic scoring in admixed populations. AVAILABILITY AND IMPLEMENTATION: The admix-kit package is open source and available at https://github.com/KangchengHou/admix-kit. Additionally, users can use the pipeline designed for admixed genotype simulation, available at https://github.com/UW-GAC/admix-kit_workflow.
Subjects
Software; Genotype; Phenotype
ABSTRACT
Genome-wide association studies (GWASs) have uncovered susceptibility loci associated with psychiatric disorders such as bipolar disorder (BP) and schizophrenia (SCZ). However, most of these loci are in non-coding regions of the genome, and the causal mechanisms linking genetic variation to disease risk are unknown. Expression quantitative trait locus (eQTL) analysis of bulk tissue is a common approach used for deciphering underlying mechanisms, although this can obscure cell-type-specific signals and thus mask trait-relevant mechanisms. Although single-cell sequencing can be prohibitively expensive in large cohorts, computationally inferred cell-type proportions and cell-type gene expression estimates have the potential to overcome these problems and advance mechanistic studies. Using bulk RNA-seq from 1,730 samples derived from whole blood in a cohort ascertained from individuals with BP and SCZ, this study estimated cell-type proportions and their relation with disease status and medication. For each cell type, we found between 2,875 and 4,629 eGenes (genes with an associated eQTL), including 1,211 that are not found on the basis of bulk expression alone. We performed a colocalization test between cell-type eQTLs and various traits and identified hundreds of associations that occur between cell-type eQTLs and GWASs but that are not detected in bulk eQTLs. Finally, we investigated the effects of lithium use on the regulation of cell-type expression loci and found examples of genes that are differentially regulated according to lithium use. Our study suggests that applying computational methods to large bulk RNA-seq datasets of non-brain tissue can identify disease-relevant, cell-type-specific biology of psychiatric disorders and psychiatric medication.
Subjects
Genome-Wide Association Study; Lithium; Humans; Genome-Wide Association Study/methods; RNA-Seq; Quantitative Trait Loci/genetics; Phenotype; Polymorphism, Single Nucleotide; Genetic Predisposition to Disease
ABSTRACT
BACKGROUND: Genetic risk modeling for dementia offers significant benefits, but studies based on real-world data, particularly for underrepresented populations, are limited. METHODS: We employed an Elastic Net model for dementia risk prediction using single-nucleotide polymorphisms prioritized by functional genomic data from multiple neurodegenerative disease genome-wide association studies. We compared this model with APOE and polygenic risk score models across genetic ancestry groups, using electronic health records from UCLA Health for discovery and All of Us cohort for validation. RESULTS: Our model significantly outperforms other models across multiple ancestries, improving the area-under-precision-recall curve by 21-61% and the area-under-the-receiver-operating characteristic by 10-21% compared to the APOE and the polygenic risk score models. We identified shared and ancestry-specific risk genes and biological pathways, reinforcing and adding to existing knowledge. CONCLUSIONS: Our study highlights benefits of integrating functional mapping, multiple neurodegenerative diseases, and machine learning for genetic risk models in diverse populations. Our findings hold potential for refining precision medicine strategies in dementia diagnosis.
ABSTRACT
Background: Previous studies have established a strong link between late-onset epilepsy (LOE) and Alzheimer's disease (AD). However, their shared genetic risk beyond the APOE gene remains unclear. Our study sought to examine the shared genetic factors of AD and LOE, interpret the biological pathways involved, and evaluate how AD onset may be mediated by LOE and shared genetic risks. Methods: We defined phenotypes using phecodes mapped from diagnosis codes, using records from patients aged 60-90. A two-step Least Absolute Shrinkage and Selection Operator (LASSO) workflow was used to identify shared genetic variants based on prior AD GWAS integrated with functional genomic data. We calculated an AD-LOE shared risk score and used it as a proxy in a causal mediation analysis. We used electronic health records from an academic health center (UCLA Health) for discovery analyses and validated our findings in a multi-institutional EHR database (All of Us). Results: The two-step LASSO method identified 34 shared genetic loci between AD and LOE, including the APOE region. These loci were mapped to 65 genes, which showed enrichment in molecular functions and pathways such as tau protein binding and lipoprotein metabolism. Individuals with high predicted shared risk scores have a higher risk of developing AD, LOE, or both in later life compared to those with low risk scores. LOE partially mediates the effect of AD-LOE shared genetic risk on AD (15% proportion mediated on average). Validation results from All of Us were consistent with findings from the UCLA sample. Conclusions: We employed a machine learning approach to identify shared genetic risks of AD and LOE. In addition to providing substantial evidence for the significant contribution of the APOE-TOMM40-APOC1 gene cluster to shared risk, we uncovered novel genes that may contribute.
Our study is one of the first to utilize All of Us genetic data to investigate AD, and provides valuable insights into the potential common and disease-specific mechanisms underlying AD and LOE, which could have profound implications for the future of disease prevention and the development of targeted treatment strategies to combat the co-occurrence of these two diseases.
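The "proportion mediated" quoted in the Results can be read as the share of the total effect that passes through the mediator. The sketch below shows only that arithmetic, under the assumption that the total effect decomposes additively into an indirect and a direct part; the function name and the example effect sizes are hypothetical, and the abstract does not state which estimator the authors used.

```python
def proportion_mediated(indirect_effect: float, direct_effect: float) -> float:
    """Share of the total effect (indirect + direct) carried by the mediator.

    Assumes an additive decomposition with both effects in the same
    direction; the ratio is hard to interpret otherwise.
    """
    total = indirect_effect + direct_effect
    if total == 0:
        raise ValueError("total effect is zero; proportion undefined")
    return indirect_effect / total


# Made-up effect sizes: an indirect effect (shared risk score -> LOE -> AD)
# of 3 units and a direct effect (shared risk score -> AD) of 17 units.
print(proportion_mediated(3.0, 17.0))
```

With these toy inputs the mediator accounts for 3 of 20 units of the total effect, which matches the scale of the 15% figure without reproducing how it was estimated.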