ABSTRACT
Over the past decade, genome-wide association studies have identified thousands of variants significantly associated with complex traits. For each locus, gene expression levels are needed to further explore its biological functions. To address this, the PrediXcan algorithm leverages large-scale reference data to impute gene expression levels from single nucleotide polymorphisms, so that gene-trait associations can be tested to identify candidate causal genes. However, most reference data come from subjects of European ancestry, and the accuracy and robustness of predicted gene expression in subjects of East Asian (EAS) ancestry remain unclear. Here, we first simulated a variety of scenarios to explore the impact of the level of population diversity on gene expression. Population-differentiated variants were estimated using allele frequency information from the Genome Aggregation Database. We found that the weight of a variant was the main factor affecting gene expression predictions, and that ~70% of variants were significantly population differentiated based on proportion tests. To provide insights into this population effect on gene expression levels, we used the allele frequency information to develop a gene expression reference panel, Predict Asian-Population (PredictAP), for EAS ancestry. PredictAP can be viewed as an auxiliary tool for PrediXcan when using genotype data from EAS subjects.
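The proportion test mentioned above can be illustrated with a minimal sketch. It assumes a pooled two-proportion z-test on allele counts from two populations; the abstract does not specify which proportion test was used, and the function name and the counts in the usage note are hypothetical.

```python
import math


def two_proportion_z_test(alt_count_a, total_a, alt_count_b, total_b):
    """Return (z, two-sided p-value) for H0: both populations share one allele frequency.

    Assumption: a pooled two-proportion z-test; the source only says "proportion tests".
    """
    freq_a = alt_count_a / total_a
    freq_b = alt_count_b / total_b
    pooled = (alt_count_a + alt_count_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        # Allele is absent or fixed in both samples: no evidence of differentiation.
        return 0.0, 1.0
    z = (freq_a - freq_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

For example, `two_proportion_z_test(300, 1000, 500, 1000)` compares hypothetical allele frequencies of 0.30 and 0.50; a variant would be flagged as population differentiated when the p-value falls below the chosen (multiple-testing-adjusted) threshold.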
Subjects
Algorithms , East Asian People , Gene Frequency , Genome-Wide Association Study , Single Nucleotide Polymorphism , Humans , East Asia , Genetic Databases , East Asian People/genetics , Population Genetics , Genome-Wide Association Study/methods
ABSTRACT
Increasing gestational weight gain (GWG) is linked to adverse outcomes in pregnant persons and their children. The Early Growth Genetics (EGG) Consortium previously identified genetic variants in fetal and maternal genomes that could contribute to early, late, and total GWG. However, the biological mechanisms and tissue specificity of these variants in GWG are unknown. We evaluated the association of genetically predicted gene expression in five relevant maternal tissues (subcutaneous and visceral adipose, breast, uterus, and whole blood) from GTEx (v7) and in fetal (placenta) tissue with early, late, and total GWG using S-PrediXcan. We tested enrichment of pre-defined biological pathways for nominally (P < 0.05) significant associations using the GENE2FUNC module from Functional Mapping and Annotation of Genome-Wide Association Studies. After multiple testing correction, we did not find significant associations between maternal or fetal gene expression and early, late, or total GWG. There was significant enrichment of several biological pathways, including metabolic processes, secretion, and intracellular transport, among nominally significant genes from the maternal analyses (false discovery rate p-values: 0.016 to 9.37×10). Enriched biological pathways varied across pregnancy. Though additional research is necessary, these results indicate that diverse biological pathways are likely to impact GWG, with their influence varying by tissue and weeks of gestation.
ABSTRACT
Transcriptome-wide association study (TWAS) methodologies aim to identify genetic effects on phenotypes through the mediation of gene transcription. In TWAS, in silico models of gene expression are trained as functions of genetic variants and then applied to genome-wide association study (GWAS) data. This post-GWAS analysis identifies gene-trait associations with high interpretability, enabling follow-up functional genomics studies and the development of genetics-anchored resources. We provide an overview of commonly used TWAS approaches, their advantages and limitations, and some widely used applications. © 2024 Wiley Periodicals LLC.
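The prediction step described above can be sketched as follows. This is a minimal illustration, assuming a linear model in which predicted expression is the weighted sum of allele dosages; the function name, variant identifiers, and weights are hypothetical and do not come from any specific TWAS tool.

```python
def impute_expression(weights, dosages):
    """Predict genetically regulated expression for one gene in one individual.

    weights: dict mapping variant ID -> model weight (assumed pre-trained).
    dosages: dict mapping variant ID -> effect-allele dosage (0 to 2).
    Variants missing from the genotype data are skipped (an assumption of this sketch).
    """
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


# Hypothetical model with three variants; the individual lacks a call for variant_3.
model_weights = {"variant_1": 0.5, "variant_2": -0.25, "variant_3": 0.1}
genotype = {"variant_1": 2, "variant_2": 1}
predicted = impute_expression(model_weights, genotype)
```

In a TWAS, the predicted values across individuals would then be tested for association with the trait, or the same weights would be combined with GWAS summary statistics.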
Subjects
Genome-Wide Association Study , Transcriptome , Transcriptome/genetics , Genome-Wide Association Study/methods , Quantitative Trait Loci , Computer Simulation , Phenotype
ABSTRACT
BACKGROUND: Chronic pain is a common, poorly understood condition. Genetic studies including genome-wide association studies have identified many relevant variants, which have yet to be translated into full understanding of chronic pain. Transcriptome-wide association studies using transcriptomic imputation methods such as S-PrediXcan can help bridge this genotype-phenotype gap. METHODS: We carried out transcriptomic imputation using S-PrediXcan to identify genetically regulated gene expression associated with multisite chronic pain in 13 brain tissues and whole blood. Then, we imputed genetically regulated gene expression for over 31,000 Mount Sinai BioMe participants and performed a phenome-wide association study to investigate clinical relationships in chronic pain-associated gene expression changes. RESULTS: We identified 95 experiment-wide significant gene-tissue associations (p < 7.97 × 10⁻⁷), including 36 unique genes and an additional 134 gene-tissue associations reaching within-tissue significance, including 53 additional unique genes. Of the 89 unique genes in total, 59 were novel for multisite chronic pain and 18 are established drug targets. Chronic pain genetically regulated gene expression for 10 unique genes was significantly associated with cardiac dysrhythmia, metabolic syndrome, disc disorders/dorsopathies, joint/ligament sprain, anemias, and neurologic disorder phecodes. Phenome-wide association study analyses adjusting for mean pain score showed that associations were not driven by mean pain score. CONCLUSIONS: We carried out the largest transcriptomic imputation study of any chronic pain trait to date. Results highlight potential causal genes in chronic pain development and tissue and direction of effect. Several gene results were also drug targets. Phenome-wide association study results showed significant associations for phecodes including cardiac dysrhythmia and metabolic syndrome, thereby indicating potential shared mechanisms.
Subjects
Chronic Pain , Metabolic Syndrome , Humans , Genome-Wide Association Study/methods , Genetic Predisposition to Disease , Chronic Pain/drug therapy , Chronic Pain/genetics , Drug Repositioning , Phenotype , Transcriptome , Brain , Cardiac Arrhythmias , Single Nucleotide Polymorphism/genetics
ABSTRACT
Transcriptome prediction models built with data from European-descent individuals are less accurate when applied to different populations because of differences in linkage disequilibrium patterns and allele frequencies. We hypothesized that methods that leverage shared regulatory effects across different conditions, in this case, across different populations, may improve cross-population transcriptome prediction. To test this hypothesis, we made transcriptome prediction models for use in transcriptome-wide association studies (TWASs) using different methods (elastic net, joint-tissue imputation [JTI], matrix expression quantitative trait loci [Matrix eQTL], multivariate adaptive shrinkage in R [MASHR], and transcriptome-integrated genetic association resource [TIGAR]) and tested their out-of-sample transcriptome prediction accuracy in population-matched and cross-population scenarios. Additionally, to evaluate model applicability in TWASs, we integrated publicly available multiethnic genome-wide association study (GWAS) summary statistics from the Population Architecture using Genomics and Epidemiology (PAGE) study and Pan-ancestry genetic analysis of the UK Biobank (PanUKBB) with our developed transcriptome prediction models. In regard to transcriptome prediction accuracy, MASHR models performed better or the same as other methods in both population-matched and cross-population transcriptome predictions. Furthermore, in multiethnic TWASs, MASHR models yielded more discoveries that replicate in both PAGE and PanUKBB across all methods analyzed, including loci previously mapped in GWASs and loci previously not found in GWASs. Overall, our study demonstrates the importance of using methods that benefit from different populations' effect size estimates in order to improve TWASs for multiethnic or underrepresented populations.
Subjects
Genome-Wide Association Study , Transcriptome , Humans , Transcriptome/genetics , Quantitative Trait Loci/genetics , Gene Frequency , Linkage Disequilibrium
ABSTRACT
BACKGROUND: Anorexia nervosa (AN) is a psychiatric disorder with complex etiology, with a significant portion of disease risk imparted by genetics. Traditional genome-wide association studies (GWAS) produce principal evidence for the association of genetic variants with disease. Transcriptomic imputation (TI) allows for the translation of those variants into regulatory mechanisms, which can then be used to assess the functional outcome of genetically regulated gene expression (GReX) in a broader setting through the use of phenome-wide association studies (pheWASs) in large and diverse clinical biobank populations with electronic health record phenotypes. METHODS: Here, we applied TI using S-PrediXcan to translate the most recent PGC-ED AN GWAS findings into AN-GReX. For significant genes, we imputed AN-GReX in the Mount Sinai BioMe™ Biobank and performed pheWASs on over 2000 outcomes to test the clinical consequences of aberrant expression of these genes. We performed a secondary analysis to assess the impact of body mass index (BMI) and sex on AN-GReX clinical associations. RESULTS: Our S-PrediXcan analysis identified 53 genes associated with AN, including what is, to our knowledge, the first genetic association of AN with the major histocompatibility complex. AN-GReX was associated with autoimmune, metabolic, and gastrointestinal diagnoses in our biobank cohort, as well as measures of cholesterol, medications, substance use, and pain. Additionally, our analyses showed moderation of AN-GReX associations with measures of cholesterol and substance use by BMI, and moderation of AN-GReX associations with celiac disease by sex. CONCLUSIONS: Our BMI-stratified results provide potential avenues of functional mechanism for AN-genes to investigate further.
Subjects
Anorexia Nervosa , Genome-Wide Association Study , Humans , Anorexia Nervosa/genetics , Single Nucleotide Polymorphism , Phenotype , Transcriptome , Genetic Predisposition to Disease/genetics
ABSTRACT
As popularised by PrediXcan (and related methods), transcriptome-wide association studies (TWAS), in which gene expression is imputed from single-nucleotide polymorphism (SNP) genotypes and tested for association with a phenotype, are a popular approach for investigating the role of gene expression in complex traits. Like gene expression, DNA methylation is an important biological process and, being under genetic regulation, may be imputable from SNP genotypes. Here, we investigate prediction of CpG methylation levels from SNP genotype data to help elucidate relationships between methylation, gene expression and complex traits. We start by examining how well CpG methylation can be predicted from SNP genotypes, comparing three penalised regression approaches and examining whether changing the window size improves prediction accuracy. Although methylation at most CpG sites cannot be accurately predicted from SNP genotypes, for a subset it can be predicted well. We next apply our methylation prediction models (trained using the optimal method and window size) to carry out a methylome-wide association study (MWAS) of primary biliary cholangitis. We intersect the regions identified via MWAS with those identified via TWAS, providing insight into the interplay between CpG methylation, gene expression and disease status. We conclude that MWAS has the potential to improve understanding of biological mechanisms in complex traits.
Subjects
Multifactorial Inheritance , Single Nucleotide Polymorphism , Humans , Genome-Wide Association Study/methods , Genetic Models , DNA Methylation/genetics , Genotype , Transcriptome , CpG Islands/genetics
ABSTRACT
One mechanism by which genetic factors influence complex traits and diseases is altering gene expression. Direct measurement of gene expression in relevant tissues is rarely tenable; however, genetically regulated gene expression (GReX) can be estimated using prediction models derived from large multi-omic datasets. These approaches have led to the discovery of many gene-trait associations, but whether models derived from predominantly European ancestry (EA) reference panels can map novel associations in ancestrally diverse populations remains unclear. We applied PrediXcan to impute GReX in 51,520 ancestrally diverse Population Architecture using Genomics and Epidemiology (PAGE) participants (35% African American, 45% Hispanic/Latino, 10% Asian, and 7% Hawaiian) across 25 key cardiometabolic traits and relevant tissues to identify 102 novel associations. We then compared associations in PAGE to those in a random subset of 50,000 White British participants from UK Biobank (UKBB50k) for height and body mass index (BMI). We identified 517 associations across 47 tissues in PAGE but not UKBB50k, demonstrating the importance of diverse samples in identifying trait-associated GReX. We observed that variants used in PrediXcan models were either more or less differentiated across continental-level populations than matched-control variants depending on the specific population reflecting sampling bias. Additionally, variants from identified genes specific to either PAGE or UKBB50k analyses were more ancestrally differentiated than those in genes detected in both analyses, underlining the value of population-specific discoveries. This suggests that while EA-derived transcriptome imputation models can identify new associations in non-EA populations, models derived from closely matched reference panels may yield further insights. Our findings call for more diversity in reference datasets of tissue-specific gene expression.
Subjects
Cardiovascular Diseases , Genome-Wide Association Study , Genetic Predisposition to Disease , Humans , Life Style , Single Nucleotide Polymorphism , Transcriptome
ABSTRACT
We conducted PrediXcan analysis of hydrocephalus risk in ten neurological tissues and whole blood. Decreased expression of MAEL in the brain was significantly associated (Bonferroni-adjusted p < 0.05) with hydrocephalus. PrediXcan analysis of brain imaging and genomics data in the independent UK Biobank (N = 8,428) revealed that MAEL expression in the frontal cortex is associated with white matter and total brain volumes. Among the top differentially expressed genes in brain, we observed a significant enrichment for gene-level associations with these structural phenotypes, suggesting an effect on disease risk through regulation of brain structure and integrity. We found additional support for these genes through analysis of the choroid plexus transcriptome of a murine model of hydrocephalus. Finally, differential protein expression analysis in patient cerebrospinal fluid recapitulated disease-associated expression changes in neurological tissues, but not in whole blood. Our findings provide convergent evidence highlighting the importance of tissue-specific pathways and mechanisms in the pathophysiology of hydrocephalus.
Subjects
Genomics/methods , Hydrocephalus/genetics , Animals , Humans , Mice
ABSTRACT
Orofacial cleft (OFC) is one of the most prevalent birth defects, leading to substantial and long-term burdens in a newborn's quality of life. Although studies revealed several genetic variants associated with the birth defect, novel approaches may provide additional clues about its etiology. Using the Center for Craniofacial and Dental Genetics project data (n = 10,542), we performed linear mixed-model analyses to study the genetic compositions of OFC and investigated the dependence among identified loci using conditional analyses. To identify genes associated with OFC, we conducted a transcriptome-wide association study (TWAS) based on predicted expression levels. In addition to confirming the previous findings at four loci, 1q32.2, 8q24, 2p24.2 and 17p13.1, we untwined two independent loci at 1q32.2, TRAF3IP3 and IRF6. The sentinel SNP in TRAF3IP3 (rs2235370, p-value = 5.15 × 10⁻⁹) was independent of the sentinel SNP at IRF6 (rs2235373, r² < 0.3). We found that the IRF6 effect became nonsignificant once the 8q24 effect was conditioned, while the TRAF3IP3 effect remained significant. Furthermore, we identified nine genes associated with OFC in TWAS, implicating a glutathione synthesis and drug detoxification pathway. We identified some meaningful additions to the OFC etiology using novel statistical methods in the existing data.
Subjects
Human Chromosome Pair 1/genetics , Cleft Lip/genetics , Cleft Palate/genetics , Interferon Regulatory Factors/genetics , Microtubule-Associated Proteins/genetics , Adolescent , Adult , Case-Control Studies , Child , Preschool Child , Chromosome Mapping , Female , Genetic Markers , Genome-Wide Association Study/statistics & numerical data , Humans , Infant , Newborn Infant , Linear Models , Male , Middle Aged , Genetic Models , Single Nucleotide Polymorphism , Young Adult
ABSTRACT
Colorectal cancer (CRC) survival has environmental and inherited components. The expression of specific genes can be inferred from individual genotypes through so-called expression quantitative trait loci. In this study, we used the PrediXcan method to predict gene expression in normal colon tissue using individual genotype data from 91 CRC patients and examined the correlation ρ between predicted and measured gene expression levels. Out of 5434 predicted genes, 58% showed a negative ρ value and only 16% presented a ρ higher than 0.10. We subsequently investigated the association between genotype-based gene expression in colon tissue for genes with ρ > 0.10 and survival of 4436 CRC patients. We identified an inverse association between the predicted expression of ARID3B and CRC-specific survival for patients with a body mass index greater than or equal to 30 kg/m² (HR (hazard ratio) = 0.66 for an expression higher vs. lower than the median, p = 0.005). This association was validated using genotype and clinical data from the UK Biobank (HR = 0.74, p = 0.04). In addition to the identification of ARID3B expression in normal colon tissue as a candidate prognostic biomarker for obese CRC patients, our study illustrates the challenges of genotype-based prediction of gene expression, and the advantage of reassessing the prediction accuracy in a subset of the study population using measured gene expression data.
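The accuracy check described above, keeping only genes whose predicted and measured expression correlate above 0.10, might look like the sketch below. It assumes Pearson correlation, since the abstract does not state which correlation coefficient ρ denotes; the function names, gene labels, and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; NaN if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def well_predicted_genes(predicted, measured, threshold=0.10):
    """Return {gene: rho} for genes whose correlation exceeds the threshold.

    predicted, measured: dicts mapping gene -> per-individual expression values,
    with individuals in the same order in both.
    """
    kept = {}
    for gene, pred_values in predicted.items():
        rho = pearson(pred_values, measured[gene])
        if rho > threshold:  # NaN compares False, so constant predictions are dropped
            kept[gene] = rho
    return kept


# Hypothetical toy data for two genes across three individuals.
predicted = {"gene_x": [1, 2, 3], "gene_y": [1, 2, 3]}
measured = {"gene_x": [2, 4, 6], "gene_y": [3, 2, 1]}
kept = well_predicted_genes(predicted, measured)
```

Only genes passing this filter would be carried forward to the downstream survival analysis.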
Subjects
Tumor Biomarkers/genetics , Colon/pathology , Colorectal Neoplasms/pathology , DNA-Binding Proteins/genetics , Neoplastic Gene Expression Regulation , Single Nucleotide Polymorphism , Aged , Aged 80 and Over , Case-Control Studies , Colon/metabolism , Colorectal Neoplasms/genetics , Colorectal Neoplasms/therapy , Female , Follow-Up Studies , Gene Expression Profiling , Genotype , Humans , Male , Middle Aged , Prognosis , Survival Rate
ABSTRACT
The integration of transcriptomic studies and genome-wide association studies (GWAS) via imputed expression has seen extensive application in recent years, enabling the functional characterization and causal gene prioritization of GWAS loci. However, the techniques for imputing transcriptomic traits from DNA variation remain underdeveloped. Furthermore, associations found when linking eQTL studies to complex traits through methods like PrediXcan can lead to false positives due to linkage disequilibrium between distinct causal variants. Therefore, the best prediction performance models may not necessarily lead to more reliable causal gene discovery. With the goal of improving discoveries without increasing false positives, we develop and compare multiple transcriptomic imputation approaches using the most recent GTEx release of expression and splicing data on 17,382 RNA-sequencing samples from 948 post-mortem donors in 54 tissues. We find that informing prediction models with posterior causal probability from fine-mapping (dap-g) and borrowing information across tissues (mashr) can lead to better performance in terms of number and proportion of significant associations that are colocalized and the proportion of silver standard genes identified as indicated by precision-recall and receiver operating characteristic curves. All prediction models are made publicly available at predictdb.org.
ABSTRACT
It is of great scientific interest to identify interactions between genetic variants and environmental exposures that may modify the risk of complex diseases. However, larger sample sizes are usually required to detect gene-by-environment interaction (G × E) than required to detect genetic main association effects. To boost the statistical power and improve the understanding of the underlying molecular mechanisms, we incorporate functional genomics information, specifically, expression quantitative trait loci (eQTLs), into a data-adaptive G × E test, called aGEw. This test adaptively chooses the best eQTL weights from multiple tissues and provides an extra layer of weighting at the genetic variant level. Extensive simulations show that the aGEw test can control the Type 1 error rate, and the power is resilient to the inclusion of neutral variants and noninformative external weights. We applied the proposed aGEw test to the Pancreatic Cancer Case-Control Consortium (discovery cohort of 3,585 cases and 3,482 controls) and the PanScan II genome-wide association study data (replication cohort of 2,021 cases and 2,105 controls) with smoking as the exposure of interest. Two novel putative smoking-related pancreatic cancer susceptibility genes, TRIP10 and KDM3A, were identified. The aGEw test is implemented in an R package aGE.
Subjects
Gene Expression Regulation , Gene-Environment Interaction , Genetic Predisposition to Disease , Genome-Wide Association Study , Pancreatic Neoplasms/genetics , Quantitative Trait Loci/genetics , Case-Control Studies , Cohort Studies , Computer Simulation , Statistical Data Interpretation , Humans , Genetic Models , Single Nucleotide Polymorphism/genetics , Smoking/genetics
ABSTRACT
The development of explanatory models of protein sequence evolution has broad implications for our understanding of cellular biology, population history, and disease etiology. Here we analyze the GTEx transcriptome resource to quantify the effect of the transcriptome on protein sequence evolution in a multi-tissue framework. We find substantial variation among the central nervous system tissues in the effect of expression variance on evolutionary rate, with highly variable genes in the cortex showing significantly greater purifying selection than highly variable genes in subcortical regions (Mann-Whitney U p = 1.4 × 10⁻⁴). The remaining tissues cluster in observed expression correlation with evolutionary rate, enabling evolutionary analysis of genes in diverse physiological systems, including digestive, reproductive, and immune systems. Importantly, the tissue in which a gene attains its maximum expression variance significantly varies (p = 5.55 × 10⁻²⁸⁴) with evolutionary rate, suggesting a tissue-anchored model of protein sequence evolution. Using a large-scale reference resource, we show that the tissue-anchored model provides a transcriptome-based approach to predicting the primary affected tissue of developmental disorders. Using gradient boosted regression trees to model evolutionary rate under a range of model parameters, selected features explain up to 62% of the variation in evolutionary rate and provide additional support for the tissue model. Finally, we investigate several methodological implications, including the importance of evolutionary-rate-aware gene expression imputation models using genetic data for improved search for disease-associated genes in transcriptome-wide association studies. Collectively, this study presents a comprehensive transcriptome-based analysis of a range of factors that may constrain molecular evolution and proposes a novel framework for the study of gene function and disease mechanism.
ABSTRACT
There is particular interest in transcriptome-wide association studies (TWAS), gene-level tests based on multi-SNP predictive models of gene expression, for identifying causal genes at loci associated with complex traits. However, interpretation of TWAS associations may be complicated by divergent effects of model SNPs on phenotype and gene expression. We developed an iterative modeling scheme for obtaining multi-SNP models of gene expression and applied this framework to generate expression models for 43 human tissues from the Genotype-Tissue Expression (GTEx) Project. We characterized the performance of single- and multi-SNP models for identifying causal genes in GWAS data for 46 circulating metabolites. We show that: (A) multi-SNP models captured more variation in expression than did the top cis-eQTL (median 2-fold improvement); (B) predicted expression based on multi-SNP models was associated (false discovery rate < 0.01) with metabolite levels for 826 unique gene-metabolite pairs, but, after stepwise conditional analyses, 90% were dominated by a single eQTL SNP; (C) among the 35% of associations where a SNP in the expression model was a significant cis-eQTL and metabolomic-QTL (met-QTL), 92% demonstrated colocalization between these signals, but interpretation was often complicated by incomplete overlap of QTLs in multi-SNP models; and (D) using a "truth" set of causal genes at 61 met-QTLs, the sensitivity was high (67%), but the positive predictive value was low, as only 8% of TWAS associations (19% when restricted to colocalized associations at met-QTLs) involved true causal genes. These results guide the interpretation of TWAS and highlight the need for corroborative data to provide confident assignment of causality.
Subjects
Gene Expression Regulation , Genetic Predisposition to Disease , Metabolome , Genetic Models , Single Nucleotide Polymorphism , Quantitative Trait Loci , Transcriptome , Genome-Wide Association Study , Humans , Phenotype
ABSTRACT
BACKGROUND: Little is known about the functional mechanisms through which genetic loci associated with substance use traits exert their effect. This study aims to identify and functionally annotate loci associated with substance use traits based on their role in genetic regulation of gene expression. METHODS: We evaluated expression quantitative trait loci (eQTLs) from 13 brain regions and whole blood of the Genotype-Tissue Expression (GTEx) database, and from whole blood of the Depression Genes and Networks (DGN) database. The role of single eQTLs was examined for six substance use traits: alcohol consumption (N = 537,349), cigarettes per day (CPD; N = 263,954), former vs. current smoker (N = 312,821), age of smoking initiation (N = 262,990), ever smoker (N = 632,802), and cocaine dependence (N = 4,769). Subsequently, we conducted a gene-level analysis of gene expression on these substance use traits using S-PrediXcan. RESULTS: Using an FDR-adjusted p-value < 0.05, we found 2,976 novel candidate genetic loci for substance use traits, and identified genes and tissues through which these loci potentially exert their effects. Using S-PrediXcan, we identified significantly associated genes for all substance traits. DISCUSSION: Annotating genes based on transcriptomic regulation improves the identification and functional characterization of candidate loci and genes for substance use traits.
Subjects
Drug Users/psychology , Gene Expression Regulation/genetics , Genetic Predisposition to Disease/genetics , Quantitative Trait Loci/genetics , Substance-Related Disorders/genetics , Blood/metabolism , Brain/metabolism , Gene Expression Profiling , Humans , Meta-Analysis as Topic , Phenotype , Substance-Related Disorders/psychology , Transcriptome/genetics
ABSTRACT
Genome-wide association studies (GWAS) have successfully identified many genetic variants associated with complex traits. However, GWAS experience power issues, resulting in the failure to detect certain associated variants. Additionally, GWAS are often unable to parse the biological mechanisms of driving associations. An existing gene-based association test framework, Transcriptome-Wide Association Studies (TWAS), leverages expression quantitative trait loci data to increase the power of association tests and illuminate the biological mechanisms by which genetic variants modulate complex traits. We extend the TWAS methodology to incorporate somatic information from tumors. By integrating germline and somatic data we are able to leverage information from the nuanced somatic landscape of tumors. Thus we can augment the power of TWAS-type tests to detect germline genetic variants associated with cancer phenotypes. We use somatic and germline data on lung adenocarcinomas from The Cancer Genome Atlas in conjunction with a meta-analyzed lung cancer GWAS to identify novel genes associated with lung cancer.
Subjects
Neoplasm Genes , Genetic Association Studies , Genetic Predisposition to Disease , Germ Cells/metabolism , Lung Neoplasms/genetics , Neoplastic Gene Expression Regulation , Genome-Wide Association Study , Humans , Meta-Analysis as Topic , Mutation/genetics , Single Nucleotide Polymorphism , Quantitative Trait Loci/genetics , Reproducibility of Results
ABSTRACT
In the past 15 years, genome-wide association studies (GWAS) have provided novel insight into the genetic architecture of various complex traits; however, this insight has been primarily focused on populations of European descent. This emphasis on European populations has led to individuals of recent African descent being grossly underrepresented in the study of genetics. With African Americans making up less than 2% of participants in neuropsychiatric GWAS, this discrepancy is magnified in diseases such as schizophrenia and bipolar disorder. In this study, we performed GWAS and the gene-based association method PrediXcan for schizophrenia (n = 2,256) and bipolar disorder (n = 1,019) in African American cohorts. In our PrediXcan analyses, we identified PRMT7 (P = 5.5 × 10⁻⁶, local false sign rate = 0.12) as significantly associated with schizophrenia following an adaptive shrinkage multiple testing adjustment. This association with schizophrenia was confirmed in the much larger, predominantly European, Psychiatric Genomics Consortium. In addition to the PRMT7 association with schizophrenia, we identified rs10168049 (P = 1.0 × 10⁻⁶) as a potential candidate locus for bipolar disorder with highly divergent allele frequencies across populations, highlighting the need for diversity in genetic studies.
ABSTRACT
Plasma lipid levels are risk factors for cardiovascular disease, a leading cause of death worldwide. While many studies have been conducted on lipid genetics, they mainly focus on Europeans and thus their transferability to diverse populations is unclear. We performed SNP- and gene-level genome-wide association studies (GWAS) of four lipid traits in cohorts from Nigeria and the Philippines and compared them to the results of larger, predominantly European meta-analyses. Two previously implicated loci met genome-wide significance in our SNP-level GWAS in the Nigerian cohort: rs34065661 in CETP, associated with HDL cholesterol (P = 9.0 × 10⁻¹⁰), and rs1065853 upstream of APOE, associated with LDL cholesterol (P = 6.6 × 10⁻⁹). The top SNP in the Filipino cohort was associated with triglyceride levels (rs662799; P = 2.7 × 10⁻¹⁶) and has been previously implicated in other East Asian studies. While this SNP is located directly upstream of the well-known APOA5, we show it may also be involved in the regulation of BACE1 and SIDT2. Our gene-based association analysis, PrediXcan, revealed that decreased expression of BACE1 and decreased expression of SIDT2 in several tissues, all driven by rs662799, are significantly associated with increased triglyceride levels in Filipinos (FDR < 0.1). In addition, our PrediXcan analysis implicated gene regulation as the mechanism underlying the associations of many other previously discovered lipid loci. Our novel BACE1 and SIDT2 findings were confirmed using summary statistics from the Global Lipids Genetic Consortium (GLGC) meta-GWAS.
ABSTRACT
Preclinical Alzheimer's disease (AD) is characterized by amyloid deposition in the absence of overt clinical impairment. There is substantial heterogeneity in the long-term clinical outcomes among amyloid positive individuals, yet limited work has focused on identifying molecular factors driving resilience from amyloid-related cognitive impairment. We apply a recently developed predicted gene expression analysis (PrediXcan) to identify genes that modify the association between baseline amyloid deposition and longitudinal cognitive changes. Participants free of clinical AD (n = 631) were selected from the AD Neuroimaging Initiative (ADNI) who had a baseline positron emission tomography measure of amyloid deposition (quantified as a standard uptake value ratio), longitudinal neuropsychological data, and genetic data. PrediXcan was used to impute gene expression levels across 15 heart and brain tissues. Mixed effect regression models assessed the interaction between predicted gene expression levels and amyloid deposition on longitudinal cognitive outcomes. The predicted gene expression levels for two genes in the coronary artery (CNTLN, PROK1) and two genes in the atrial appendage (PRSS50, PROK1) interacted with amyloid deposition on episodic memory performance. The predicted gene expression levels for two additional genes (TMC4 in the basal ganglia and HMBS in the aorta) interacted with amyloid deposition on executive function performance. Post-hoc analyses provide additional validation of the HMBS and PROK1 effects across two independent subsets of ADNI using two additional metrics of amyloid deposition. These results highlight a subset of unique candidate genes of resilience and provide evidence that cell-cycle regulation, angiogenesis, and heme biosynthesis likely play a role in AD progression.