ABSTRACT
Many common variants have been associated with hematological traits, but identification of causal genes and pathways has proven challenging. We performed a genome-wide association analysis in the UK Biobank and INTERVAL studies, testing 29.5 million genetic variants for association with 36 red cell, white cell, and platelet properties in 173,480 European-ancestry participants. This effort yielded hundreds of low frequency (<5%) and rare (<1%) variants with a strong impact on blood cell phenotypes. Our data highlight general properties of the allelic architecture of complex traits, including the proportion of the heritable component of each blood trait explained by the polygenic signal across different genome regulatory domains. Finally, through Mendelian randomization, we provide evidence of shared genetic pathways linking blood cell indices with complex pathologies, including autoimmune diseases, schizophrenia, and coronary heart disease and evidence suggesting previously reported population associations between blood cell indices and cardiovascular disease may be non-causal.
Subjects
Genetic Variation; Genome-Wide Association Study; Hematopoietic Stem Cells/metabolism; Immune System Diseases/genetics; Alleles; Cell Differentiation; Genetic Predisposition to Disease; Hematopoietic Stem Cells/pathology; Humans; Immune System Diseases/pathology; Polymorphism, Single Nucleotide; Quantitative Trait Loci; White People/genetics
ABSTRACT
Mendelian randomization uses genetic variants as instrumental variables to make causal inferences on the effect of an exposure on an outcome. Due to the recent abundance of high-powered genome-wide association studies, many putative causal exposures of interest have large numbers of independent genetic variants with which they associate, each representing a potential instrument for use in a Mendelian randomization analysis. Such polygenic analyses increase the power of the study design to detect causal effects; however, they also increase the potential for bias due to instrument invalidity. Recent attention has been given to dealing with bias caused by correlated pleiotropy, which results from violation of the "instrument strength independent of direct effect" assumption. Although methods have been proposed that can account for this bias, a number of restrictive conditions remain in many commonly used techniques. In this paper, we propose a Bayesian framework for Mendelian randomization that provides valid causal inference under very general settings. We propose the methods MR-Horse and MVMR-Horse, which can be performed without access to individual-level data, using only summary statistics of the type commonly published by genome-wide association studies, and can account for both correlated and uncorrelated pleiotropy. In simulation studies, we show that the approach retains type I error rates below nominal levels even in high-pleiotropy scenarios. We demonstrate the proposed approaches in applied examples in both univariable and multivariable settings, some with very weak instruments.
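For orientation, the summary-statistic setting described above can be illustrated with the conventional fixed-effect inverse-variance weighted (IVW) estimator. This is a minimal sketch of that baseline only, not of MR-Horse or MVMR-Horse; it assumes independent, valid instruments, and the function and argument names are illustrative.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate from per-variant summary statistics.

    beta_exposure: variant-exposure associations
    beta_outcome:  variant-outcome associations
    se_outcome:    standard errors of the variant-outcome associations
    Assumes independent variants and no pleiotropy (an assumption of this
    sketch, which is exactly what pleiotropy-robust methods relax).
    """
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / se ** 2
              for bx, se in zip(beta_exposure, se_outcome))
    estimate = num / den
    std_error = math.sqrt(1.0 / den)
    return estimate, std_error

# Toy inputs (made up for illustration): every variant implies a ratio of 0.5.
est, se = ivw_estimate([0.1, 0.2, 0.3], [0.05, 0.10, 0.15], [0.01, 0.01, 0.01])
print(f"IVW estimate = {est:.3f}, SE = {se:.4f}")
```

Because every variant contributes to the same weighted average, a set of instruments whose pleiotropic effects are correlated with instrument strength can bias this estimate, which is the situation the abstract addresses.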
Subjects
Genome-Wide Association Study; Mendelian Randomization Analysis; Animals; Horses; Bayes Theorem; Computer Simulation; Multifactorial Inheritance
ABSTRACT
Treatments for neurodegenerative disorders remain rare, but recent FDA approvals, such as lecanemab and aducanumab for Alzheimer disease (MIM: 607822), highlight the importance of the underlying biological mechanisms in driving discovery and creating disease modifying therapies. The global population is aging, driving an urgent need for therapeutics that stop disease progression and eliminate symptoms. In this study, we create an open framework and resource for evidence-based identification of therapeutic targets for neurodegenerative disease. We use summary-data-based Mendelian randomization to identify genetic targets for drug discovery and repurposing. In parallel, we provide mechanistic insights into disease processes and potential network-level consequences of gene-based therapeutics. We identify 116 Alzheimer disease, 3 amyotrophic lateral sclerosis (MIM: 105400), 5 Lewy body dementia (MIM: 127750), 46 Parkinson disease (MIM: 605909), and 9 progressive supranuclear palsy (MIM: 601104) target genes passing multiple test corrections (pSMR_multi < 2.95 × 10⁻⁶ and pHEIDI > 0.01). We created a therapeutic scheme to classify our identified target genes into strata based on druggability and approved therapeutics, classifying 41 novel targets, 3 known targets, and 115 difficult targets (of these, 69.8% are expressed in the disease-relevant cell type from single-nucleus experiments). Our novel class of genes provides a springboard for new opportunities in drug discovery, development, and repurposing in the pre-competitive space. In addition, looking at drug-gene interaction networks, we identify previous trials that may require further follow-up such as riluzole in Alzheimer disease. We also provide a user-friendly web platform to help users explore potential therapeutic targets for neurodegenerative diseases, decreasing activation energy for the community.
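To make the significance filter above concrete, here is a hedged sketch of the single-SNP test statistic commonly used in summary-data-based Mendelian randomization (SMR), applied with the P-value threshold quoted in the abstract. It does not reproduce the multi-SNP SMR or HEIDI tests used in the study; the function name, argument names, and toy numbers are assumptions for illustration.

```python
import math

def smr_single_snp(b_zx, se_zx, b_zy, se_zy):
    """Single-SNP SMR effect estimate and approximate chi-square test.

    b_zx, se_zx: SNP effect on gene expression (eQTL) and its standard error
    b_zy, se_zy: SNP effect on the disease trait (GWAS) and its standard error
    """
    z_zx = b_zx / se_zx
    z_zy = b_zy / se_zy
    b_xy = b_zy / b_zx  # estimated effect of expression on the trait
    # Approximate 1-df chi-square statistic combining both z-scores.
    t_smr = (z_zx ** 2 * z_zy ** 2) / (z_zx ** 2 + z_zy ** 2)
    # Survival function of a 1-df chi-square: P(X > t) = erfc(sqrt(t / 2)).
    p_value = math.erfc(math.sqrt(t_smr / 2.0))
    return b_xy, t_smr, p_value

P_THRESHOLD = 2.95e-6  # threshold quoted in the abstract

b_xy, t_smr, p = smr_single_snp(b_zx=0.5, se_zx=0.05, b_zy=0.1, se_zy=0.02)
print(f"b_xy = {b_xy:.2f}, T_SMR = {t_smr:.1f}, p = {p:.2e}, "
      f"passes = {p < P_THRESHOLD}")
```

In practice a gene passing this test would still need a heterogeneity check (the abstract requires pHEIDI > 0.01) before being treated as a candidate target.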
Subjects
Alzheimer Disease; Neurodegenerative Diseases; Parkinson Disease; Humans; Alzheimer Disease/drug therapy; Alzheimer Disease/genetics; Community Resources; Multiomics; Neurodegenerative Diseases/drug therapy; Neurodegenerative Diseases/genetics; Mendelian Randomization Analysis
ABSTRACT
In Mendelian randomization, two single SNP-trait correlation-based methods have been developed to infer the causal direction between an exposure (e.g., a gene) and an outcome (e.g., a trait), called MR Steiger's method and its recent extension called Causal Direction-Ratio (CD-Ratio). Here we propose an approach based on R², the coefficient of determination, to combine information from multiple (possibly correlated) SNPs to simultaneously infer the presence and direction of a causal relationship between an exposure and an outcome. Our proposed method generalizes Steiger's method from using a single SNP to multiple SNPs as IVs. It is especially useful in transcriptome-wide association studies (TWASs) (and similar applications) with typically small sample sizes for gene expression (or another molecular trait) data, providing a more flexible and powerful approach to inferring causal directions. It can be applied to GWAS summary data with a reference panel. We also discuss the influence of invalid IVs and introduce a new approach called R2S to select and remove invalid IVs (if any) to enhance the robustness. We compared the performance of the proposed method with existing methods in simulations to demonstrate its advantages. We applied the methods to identify causal genes for high/low-density lipoprotein cholesterol (HDL/LDL) using the individual-level GTEx gene expression data and UK Biobank GWAS data. The proposed method was able to confirm some well-known causal genes while identifying some novel ones. Additionally, we illustrated an application of the proposed method to GWAS summary data to infer causal relationships between HDL/LDL and stroke/coronary artery disease (CAD).
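The single-SNP idea that this abstract generalizes can be sketched as follows: recover the SNP-trait correlation from summary statistics for the exposure and for the outcome, then ask which correlation is larger. This is a simplified illustration, not the proposed multi-SNP R² method; it assumes two non-overlapping samples so that a Fisher z comparison of independent correlations applies, and all names and numbers are hypothetical.

```python
import math

def snp_trait_correlation(beta, se, n):
    """Approximate SNP-trait correlation from a regression estimate."""
    t = beta / se
    return t / math.sqrt(t * t + n - 2)

def infer_direction(beta_x, se_x, n_x, beta_y, se_y, n_y):
    """Compare |corr(SNP, exposure)| with |corr(SNP, outcome)|.

    Assumes independent samples (an assumption of this sketch). A larger
    correlation with the exposure suggests exposure -> outcome.
    """
    r_x = abs(snp_trait_correlation(beta_x, se_x, n_x))
    r_y = abs(snp_trait_correlation(beta_y, se_y, n_y))
    # Fisher z-transform of each correlation, then a two-sample z-test.
    z = (math.atanh(r_x) - math.atanh(r_y)) / math.sqrt(
        1.0 / (n_x - 3) + 1.0 / (n_y - 3))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    # A z of exactly zero is reported as the reverse direction here;
    # the p-value should be consulted before interpreting either label.
    direction = "exposure -> outcome" if z > 0 else "outcome -> exposure"
    return direction, z, p_value

direction, z, p = infer_direction(0.2, 0.02, 1000, 0.03, 0.01, 10000)
print(direction, f"z = {z:.2f}", f"p = {p:.1e}")
```

With small expression sample sizes, a single SNP's correlation is estimated noisily, which is the motivation the abstract gives for pooling several SNPs.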
Subjects
Genome-Wide Association Study; Polymorphism, Single Nucleotide; Transcriptome; Humans; Genome-Wide Association Study/methods; Transcriptome/genetics; Mendelian Randomization Analysis/methods; Models, Genetic; Cholesterol, LDL/genetics; Cholesterol, LDL/blood; Phenotype
ABSTRACT
Mendelian randomization (MR) provides valuable assessments of the causal effect of exposure on outcome, yet the application of conventional MR methods for mapping risk genes encounters new challenges. One of the issues is the limited availability of expression quantitative trait loci (eQTLs) as instrumental variables (IVs), hampering the estimation of sparse causal effects. Additionally, the often context- or tissue-specific eQTL effects challenge the MR assumption of consistent IV effects across eQTL and GWAS data. To address these challenges, we propose a multi-context multivariable integrative MR framework, mintMR, for mapping expression and molecular traits as joint exposures. It models the effects of molecular exposures across multiple tissues in each gene region, while simultaneously estimating across multiple gene regions. It uses eQTLs with consistent effects across more than one tissue type as IVs, improving IV consistency. A major innovation of mintMR involves employing multi-view learning methods to collectively model latent indicators of disease relevance across multiple tissues, molecular traits, and gene regions. The multi-view learning captures the major patterns of disease relevance and uses these patterns to update the estimated tissue relevance probabilities. The proposed mintMR iterates between performing a multi-tissue MR for each gene region and joint learning the disease-relevant tissue probabilities across gene regions, improving the estimation of sparse effects across genes. We apply mintMR to evaluate the causal effects of gene expression and DNA methylation for 35 complex traits using multi-tissue QTLs as IVs. The proposed mintMR controls genome-wide inflation and offers insights into disease mechanisms.
Subjects
Genetic Predisposition to Disease; Genome-Wide Association Study; Mendelian Randomization Analysis; Quantitative Trait Loci; Humans; Mendelian Randomization Analysis/methods; Genome-Wide Association Study/methods; Organ Specificity/genetics; Models, Genetic; Polymorphism, Single Nucleotide
ABSTRACT
Mendelian randomization (MR), which utilizes genetic variants as instrumental variables (IVs), has gained popularity as a method for causal inference between phenotypes using genetic data. While efforts have been made to relax IV assumptions and develop new methods for causal inference in the presence of invalid IVs due to confounding, the reliability of MR methods in real-world applications remains uncertain. Instead of using simulated datasets, we conducted a benchmark study evaluating 16 two-sample summary-level MR methods using real-world genetic datasets to provide guidelines for the best practices. Our study focused on the following crucial aspects: type I error control in the presence of various confounding scenarios (e.g., population stratification, pleiotropy, and family-level confounders like assortative mating), the accuracy of causal effect estimates, replicability, and power. By comprehensively evaluating the performance of compared methods over one thousand exposure-outcome trait pairs, our study not only provides valuable insights into the performance and limitations of the compared methods but also offers practical guidance for researchers to choose appropriate MR methods for causal inference.
Subjects
Benchmarking; Genome-Wide Association Study; Mendelian Randomization Analysis; Mendelian Randomization Analysis/methods; Humans; Genome-Wide Association Study/methods; Phenotype; Genetic Variation; Causality; Polymorphism, Single Nucleotide; Models, Genetic
ABSTRACT
Type 2 diabetes (T2D) is a major risk factor for heart failure (HF) and has elevated incidence among individuals with HF. Since genetics and HF can independently influence T2D, collider bias may occur when T2D (i.e., collider) is controlled for by design or analysis. Thus, we conducted a genome-wide association study (GWAS) of diabetes-related HF with correction for collider bias. We first performed a GWAS of HF to identify genetic instrumental variables (GIVs) for HF and to enable bidirectional Mendelian randomization (MR) analysis between T2D and HF. We identified 61 genomic loci, significantly associated with all-cause HF in 114,275 individuals with HF and over 1.5 million controls of European ancestry. Using a two-sample bidirectional MR approach with 59 and 82 GIVs for HF and T2D, respectively, we estimated that T2D increased HF risk (odds ratio [OR] 1.07, 95% confidence interval [CI] 1.04-1.10), while HF also increased T2D risk (OR 1.60, 95% CI 1.36-1.88). Then we performed a GWAS of diabetes-related HF corrected for collider bias due to the study design of index cases. After removing the spurious association of TCF7L2 locus due to collider bias, we identified two genome-wide significant loci close to PITX2 (chromosome 4) and CDKN2B-AS1 (chromosome 9) associated with diabetes-related HF in the Million Veteran Program and replicated the associations in the UK Biobank. Our MR findings provide strong evidence that HF increases T2D risk. As a result, collider bias leads to spurious genetic associations of diabetes-related HF, which can be effectively corrected to identify true positive loci.
Subjects
Diabetes Mellitus, Type 2; Genome-Wide Association Study; Heart Failure; Mendelian Randomization Analysis; Humans; Heart Failure/genetics; Diabetes Mellitus, Type 2/genetics; Diabetes Mellitus, Type 2/complications; Male; Female; Polymorphism, Single Nucleotide; Genetic Predisposition to Disease; Middle Aged; Risk Factors; Aged; Cyclin-Dependent Kinase Inhibitor p15/genetics; White People/genetics; Bias; Homeodomain Proteins/genetics; Transcription Factors/genetics
ABSTRACT
Whereas 16p11.2 BP4-5 copy-number variants (CNVs) represent one of the most pleiotropic etiologies of genomic syndromes in both clinical and population cohorts, the mechanisms leading to such pleiotropy remain understudied. Identifying 73 deletion and 89 duplication carrier individuals among unrelated White British UK Biobank participants, we performed a phenome-wide association study (PheWAS) between the region's copy number and 117 complex traits and diseases, mimicking four dosage models. Forty-six phenotypes (39%) were affected by 16p11.2 BP4-5 CNVs, with the deletion-only, mirror, U-shape, and duplication-only models being the best fit for 30, 10, 4, and 2 phenotypes, respectively, aligning with the stronger deleteriousness of the deletion. Upon individually adjusting CNV effects for either body mass index (BMI), height, or educational attainment (EA), we found that sixteen testable deletion-driven associations (primarily with cardiovascular and metabolic traits) were BMI dependent, with EA playing a more subtle role and no association depending on height. Bidirectional Mendelian randomization supported that 13 out of these 16 associations were secondary consequences of the CNV's impact on BMI. For the 23 traits that remained significantly associated upon individual adjustment for mediators, matched-control analyses found that 10 phenotypes, including musculoskeletal traits, liver enzymes, fluid intelligence, platelet count, and pneumonia and acute kidney injury risk, remained associated under strict Bonferroni correction, with 10 additional nominally significant associations. These results paint a complex picture of 16p11.2 BP4-5's pleiotropic pattern that involves direct effects on multiple physiological systems and indirect co-morbidities consequential to the CNV's impact on BMI and EA, acting through trait-specific dosage mechanisms.
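The four dosage models named above can be read as four ways of coding copy number as a regression predictor. The sketch below shows one plausible coding; the abstract does not spell out the exact scheme, so the numeric values, the exclusion of the opposite carrier class in the single-direction models, and all names are assumptions.

```python
def encode_copy_number(copy_number, model):
    """Code a copy-number state (1 = deletion, 2 = neutral, 3 = duplication).

    Returns a numeric predictor, or None when the individual is excluded
    under the chosen model (an assumption of this sketch).
    """
    if copy_number not in (1, 2, 3):
        raise ValueError("copy_number must be 1, 2, or 3")
    if model == "mirror":
        # Deletion and duplication have opposite effects: -1, 0, +1.
        return copy_number - 2
    if model == "u_shape":
        # Any deviation from two copies has the same direction of effect.
        return abs(copy_number - 2)
    if model == "deletion_only":
        # Deletion carriers vs. copy-neutral; duplication carriers excluded.
        return None if copy_number == 3 else int(copy_number == 1)
    if model == "duplication_only":
        # Duplication carriers vs. copy-neutral; deletion carriers excluded.
        return None if copy_number == 1 else int(copy_number == 3)
    raise ValueError(f"unknown model: {model}")

for model in ("mirror", "u_shape", "deletion_only", "duplication_only"):
    print(model, [encode_copy_number(cn, model) for cn in (1, 2, 3)])
```

Fitting the same trait under each coding and comparing fit is one way to arrive at a "best-fit model" per phenotype, as the abstract reports.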
ABSTRACT
Genetic variants are involved in the orchestration of alternative polyadenylation (APA) events, while the role of DNA methylation in regulating APA remains unclear. We generated a comprehensive atlas of APA quantitative trait methylation sites (apaQTMs) across 21 different types of cancer (1,612 to 60,219 acting in cis and 4,448 to 142,349 in trans). Potential causal apaQTMs in non-cancer samples were also identified. Mechanistically, we observed a strong enrichment of cis-apaQTMs near polyadenylation sites (PASs) and both cis- and trans-apaQTMs in proximity to transcription factor (TF) binding regions. Through the integration of ChIP-signals and RNA-seq data from cell lines, we have identified several regulators of APA events, acting either directly or indirectly, implicating novel functions of some important genes, such as TCF7L2, which is known for its involvement in type 2 diabetes and cancers. Furthermore, we have identified a vast number of QTMs that share the same putative causal CpG sites with five different cancer types, underscoring the roles of QTMs, including apaQTMs, in the process of tumorigenesis. DNA methylation is extensively involved in the regulation of APA events in human cancers. In an attempt to elucidate the potential underlying molecular mechanisms of APA by DNA methylation, our study paves the way for subsequent experimental validations into the intricate biological functions of DNA methylation in APA regulation and the pathogenesis of human cancers. To present a comprehensive catalog of apaQTM patterns, we introduce the Pancan-apaQTM database, available at https://pancan-apaqtm-zju.shinyapps.io/pancanaQTM/.
Subjects
Diabetes Mellitus, Type 2; Neoplasms; Humans; Polyadenylation/genetics; Diabetes Mellitus, Type 2/genetics; Neoplasms/genetics; Neoplasms/pathology; Gene Expression Regulation; DNA Methylation/genetics; 3' Untranslated Regions
ABSTRACT
PURPOSE: Pathogenesis and the associated risk factors of cataracts, glaucoma, and age-related macular degeneration (AMD) remain unclear. We aimed to investigate causal relationships between circulating cytokine levels and the development of these diseases. PATIENTS AND METHODS: Genetic instrumental variables for circulating cytokines were derived from a genome-wide association study of 8293 European participants. Summary-level data for AMD, glaucoma, and senile cataract were obtained from the FinnGen database. The inverse variance weighted (IVW) method was the main Mendelian randomization (MR) analysis method. Cochran's Q, MR-Egger regression, and the MR pleiotropy residual sum and outlier test were used for sensitivity analysis. RESULTS: Based on the IVW method, MR analysis demonstrated five circulating cytokines suggestively associated with AMD (SCGF-β, 1.099 [95% CI, 1.037-1.166], P = 0.002; SCF, 1.155 [95% CI, 1.015-1.315], P = 0.029; MCP-1, 1.103 [95% CI, 1.012-1.202], P = 0.026; IL-10, 1.102 [95% CI, 1.012-1.200], P = 0.025; eotaxin, 1.086 [95% CI, 1.002-1.176], P = 0.044), five suggestively linked with glaucoma (MCP-1, 0.945 [95% CI, 0.894-0.999], P = 0.047; IL1ra, 0.886 [95% CI, 0.809-0.969], P = 0.008; IL-1β, 0.866 [95% CI, 0.762-0.983], P = 0.027; IL-9, 0.908 [95% CI, 0.841-0.980], P = 0.014; IL2ra, 1.065 [95% CI, 1.004-1.130], P = 0.035), and four suggestively associated with senile cataract (TRAIL, 1.043 [95% CI, 1.009-1.077], P = 0.011; IL-16, 1.032 [95% CI, 1.001-1.064], P = 0.046; IL1ra, 0.942 [95% CI, 0.887-0.999], P = 0.047; FGF-basic, 1.144 [95% CI, 1.052-1.244], P = 0.002). Furthermore, sensitivity analysis results supported the above associations. CONCLUSION: This study highlights the involvement of several circulating cytokines in the development of ophthalmic diseases; these cytokines hold potential as viable pharmacological targets for these diseases.
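Two of the sensitivity analyses named above have simple closed forms, sketched here under stated assumptions: Cochran's Q measures heterogeneity among per-variant ratio estimates, and the MR-Egger intercept is the constant term of a weighted regression of outcome associations on exposure associations. The code uses first-order weights and toy numbers; it is not the study's analysis pipeline, and the function names are illustrative.

```python
def cochran_q(beta_exposure, beta_outcome, se_outcome):
    """Cochran's Q for per-variant ratio estimates, with its degrees of freedom."""
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ivw = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q = sum(w * (r - ivw) ** 2 for w, r in zip(weights, ratios))
    return q, len(ratios) - 1

def mr_egger(beta_exposure, beta_outcome, se_outcome):
    """MR-Egger intercept and slope via weighted least squares."""
    # Orient every variant so that its exposure association is positive.
    xs, ys = [], []
    for bx, by in zip(beta_exposure, beta_outcome):
        sign = 1.0 if bx >= 0 else -1.0
        xs.append(sign * bx)
        ys.append(sign * by)
    ws = [1.0 / se ** 2 for se in se_outcome]
    sw = sum(ws)
    swx = sum(w * x for w, x in zip(ws, xs))
    swy = sum(w * y for w, y in zip(ws, ys))
    swxx = sum(w * x * x for w, x in zip(ws, xs))
    swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    intercept = (swy - slope * swx) / sw
    return intercept, slope

# Toy data with a constant pleiotropic offset of 0.02 and a causal slope of 0.5.
bx = [0.1, 0.2, 0.3]
by = [0.07, 0.12, 0.17]
se = [0.01, 0.01, 0.01]
print("Q, df:", cochran_q(bx, by, se))
print("Egger intercept, slope:", mr_egger(bx, by, se))
```

A Q value well above its degrees of freedom, or an intercept clearly different from zero, would flag heterogeneity or directional pleiotropy respectively.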
Subjects
Cataract; Cytokines; Genetic Predisposition to Disease; Genome-Wide Association Study; Glaucoma; Macular Degeneration; Mendelian Randomization Analysis; Humans; Cytokines/blood; Cytokines/genetics; Cataract/blood; Cataract/genetics; Macular Degeneration/genetics; Macular Degeneration/blood; Glaucoma/genetics; Glaucoma/blood; Risk Factors; Polymorphism, Single Nucleotide; Male; Female; Eye Diseases/genetics; Eye Diseases/blood
ABSTRACT
Unlike other cancers with widespread screening (breast, colorectal, cervical, prostate, and skin), lung nodule biopsies for positive screenings have higher morbidity with clinical complications. Development of non-invasive diagnostic biomarkers could thereby significantly enhance lung cancer management for at-risk patients. Here, we leverage Mendelian Randomization (MR) to investigate the plasma proteome and metabolome for potential biomarkers relevant to lung cancer. Utilizing bidirectional MR and co-localization analyses, we identify novel associations, highlighting inverse relationships between plasma proteins SFTPB and KDELC2 in lung adenocarcinoma (LUAD) and positive associations of TCL1A with lung squamous cell carcinoma (LUSC) and CNTN1 with small cell lung cancer (SCLC). Additionally, our work reveals significant negative correlations between metabolites such as theobromine and paraxanthine, along with paraxanthine-related ratios, in both LUAD and LUSC. Conversely, positive correlations are found in caffeine/paraxanthine and arachidonate (20:4n6)/paraxanthine ratios with these cancer types. Through single-cell sequencing data of normal lung tissue, we further explore the role of lung tissue-specific protein SFTPB in carcinogenesis. These findings offer new insights into lung cancer etiology, potentially guiding the development of diagnostic biomarkers and therapeutic approaches.
Subjects
Biomarkers, Tumor; Lung Neoplasms; Mendelian Randomization Analysis; Humans; Lung Neoplasms/genetics; Lung Neoplasms/blood; Lung Neoplasms/pathology; Lung Neoplasms/metabolism; Biomarkers, Tumor/genetics; Biomarkers, Tumor/blood; Biomarkers, Tumor/metabolism; Proteome/genetics; Proteome/metabolism; Metabolome/genetics; Adenocarcinoma of Lung/genetics; Adenocarcinoma of Lung/blood; Adenocarcinoma of Lung/pathology; Adenocarcinoma of Lung/metabolism; Small Cell Lung Carcinoma/genetics; Small Cell Lung Carcinoma/blood; Small Cell Lung Carcinoma/metabolism; Small Cell Lung Carcinoma/diagnosis; Small Cell Lung Carcinoma/pathology; Metabolomics/methods
ABSTRACT
A long-standing recognition that information from human genetics studies has the potential to accelerate drug discovery has led to decades of research on how to leverage genetic and phenotypic information for drug discovery. Established simple and advanced statistical methods that allow the simultaneous analysis of genotype and clinical phenotype data by genome- and phenome-wide analyses, colocalization analyses with quantitative trait loci data from transcriptomics and proteomics data sets from different tissues, and Mendelian randomization are essential tools for drug development in the postgenomic era. Numerous studies have demonstrated how genomic data provide opportunities for the identification of new drug targets, the repurposing of drugs, and drug safety analyses. With an increase in the number of biobanks that enable linking in-depth omics data with rich repositories of phenotypic traits via electronic health records, more powerful ways for the evaluation and validation of drug targets will continue to expand across different disciplines of clinical research.
Subjects
Electronic Health Records; Genome-Wide Association Study; Humans; Genome-Wide Association Study/methods; Genomics/methods; Phenotype; Drug Discovery
ABSTRACT
The existing framework of Mendelian randomization (MR) infers the causal effect of one or multiple exposures on one single outcome. It is not designed to jointly model multiple outcomes, as would be necessary to detect causes of more than one outcome and would be relevant to model multimorbidity or other related disease outcomes. Here, we introduce multi-response Mendelian randomization (MR2), an MR method specifically designed for multiple outcomes to identify exposures that cause more than one outcome or, conversely, exposures that exert their effect on distinct responses. MR2 uses a sparse Bayesian Gaussian copula regression framework to detect causal effects while estimating the residual correlation between summary-level outcomes, i.e., the correlation that cannot be explained by the exposures, and vice versa. We show both theoretically and in a comprehensive simulation study how unmeasured shared pleiotropy induces residual correlation between outcomes irrespective of sample overlap. We also reveal how non-genetic factors that affect more than one outcome contribute to their correlation. We demonstrate that by accounting for residual correlation, MR2 has higher power to detect shared exposures causing more than one outcome. It also provides more accurate causal effect estimates than existing methods that ignore the dependence between related responses. Finally, we illustrate how MR2 detects shared and distinct causal exposures for five cardiovascular diseases in two applications considering cardiometabolic and lipidomic exposures and uncovers residual correlation between summary-level outcomes reflecting known relationships between cardiovascular diseases.
Subjects
Cardiovascular Diseases; Humans; Cardiovascular Diseases/epidemiology; Cardiovascular Diseases/genetics; Bayes Theorem; Multimorbidity; Mendelian Randomization Analysis/methods; Causality; Genome-Wide Association Study
ABSTRACT
Evidence on the validity of drug targets from randomized trials is reliable but typically expensive and slow to obtain. In contrast, evidence from conventional observational epidemiological studies is less reliable because of the potential for bias from confounding and reverse causation. Mendelian randomization is a quasi-experimental approach analogous to a randomized trial that exploits naturally occurring randomization in the transmission of genetic variants. In Mendelian randomization, genetic variants that can be regarded as proxies for an intervention on the proposed drug target are leveraged as instrumental variables to investigate potential effects on biomarkers and disease outcomes in large-scale observational datasets. This approach can be implemented rapidly for a range of drug targets to provide evidence on their effects and thus inform on their priority for further investigation. In this review, we present statistical methods and their applications to showcase the diverse opportunities for applying Mendelian randomization in guiding clinical development efforts, thus enabling interventions to target the right mechanism in the right population group at the right time. These methods can inform investigators on the mechanisms underlying drug effects, their related biomarkers, implications for the timing of interventions, and the population subgroups that stand to gain the most benefit. Most methods can be implemented with publicly available data on summarized genetic associations with traits and diseases, meaning that the only major limitations to their usage are the availability of appropriately powered studies for the exposure and outcome and the existence of a suitable genetic proxy for the proposed intervention.
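When a single variant serves as the genetic proxy for a drug target, the simplest instrumental-variable estimator is the Wald ratio. The sketch below is a minimal illustration under the assumption of one valid instrument and a binary outcome analysed on the log-odds scale; the standard error uses a second-order delta-method approximation, and the names and numbers are hypothetical.

```python
import math

def wald_ratio(beta_gx, se_gx, beta_gy, se_gy):
    """Wald ratio estimate of the exposure effect with a delta-method SE.

    beta_gx, se_gx: variant-exposure (e.g., biomarker) association
    beta_gy, se_gy: variant-outcome (e.g., log-odds of disease) association
    """
    estimate = beta_gy / beta_gx
    variance = (se_gy ** 2 / beta_gx ** 2
                + beta_gy ** 2 * se_gx ** 2 / beta_gx ** 4)
    return estimate, math.sqrt(variance)

def odds_ratio_ci(log_or, se, z=1.96):
    """Convert a log-odds estimate into an odds ratio with a 95% interval."""
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))

est, se = wald_ratio(beta_gx=0.5, se_gx=0.05, beta_gy=0.1, se_gy=0.02)
or_, lower, upper = odds_ratio_ci(est, se)
print(f"OR per unit of exposure = {or_:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

The estimate is only as good as the proxy: it inherits the power of the two underlying association studies, which matches the limitations the review identifies.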
Subjects
Drug Discovery; Mendelian Randomization Analysis; Humans; Mendelian Randomization Analysis/methods; Causality; Biomarkers; Bias
ABSTRACT
ERAP2 is an aminopeptidase involved in immunological antigen presentation. Genotype data in human samples from before and after the Black Death, an epidemic due to Yersinia pestis, have marked changes in allele frequency of the single-nucleotide polymorphism (SNP) rs2549794, with the T allele suggested to be deleterious during this period, while ERAP2 is also implicated in autoimmune diseases. This study explored the association between variation at ERAP2 and (1) infection, (2) autoimmune disease, and (3) parental longevity. Genome-wide association studies (GWASs) of these outcomes were identified in contemporary cohorts (UK Biobank, FinnGen, and GenOMICC). Effect estimates were extracted for rs2549794 and rs2248374, a haplotype tagging SNP. Additionally, cis expression and protein quantitative trait loci (QTLs) for ERAP2 were used in Mendelian randomization (MR) analyses. Consistent with decreased survival in the Black Death, the T allele of rs2549794 showed evidence of association with respiratory infection (odds ratio; OR for pneumonia 1.03; 95% CI 1.01-1.05). Effect estimates were larger for more severe phenotypes (OR for critical care admission with pneumonia 1.08; 95% CI 1.02-1.14). In contrast, opposing effects were identified for Crohn disease (OR 0.86; 95% CI 0.82-0.90). This allele was shown to associate with decreased ERAP2 expression and protein levels, independent of haplotype. MR analyses suggest that ERAP2 expression may be mediating disease associations. Decreased ERAP2 expression is associated with severe respiratory infection with an opposing association with autoimmune diseases. These data support the hypothesis of balancing selection at this locus driven by autoimmune and infectious disease.
Subjects
Autoimmune Diseases; Plague; Humans; Genome-Wide Association Study; Genotype; Haplotypes/genetics; Autoimmune Diseases/genetics; Polymorphism, Single Nucleotide/genetics; Genetic Predisposition to Disease; Aminopeptidases/genetics; Aminopeptidases/metabolism
ABSTRACT
Multimorbidity is a rising public health challenge with important implications for health management and policy. The most common multimorbidity pattern is the combination of cardiometabolic and osteoarticular diseases. Here, we study the genetic underpinning of the comorbidity between type 2 diabetes and osteoarthritis. We find genome-wide genetic correlation between the two diseases and robust evidence for association-signal colocalization at 18 genomic regions. We integrate multi-omics and functional information to resolve the colocalizing signals and identify high-confidence effector genes, including FTO and IRX3, which provide proof-of-concept insights into the epidemiologic link between obesity and both diseases. We find enrichment for lipid metabolism and skeletal formation pathways for signals underpinning the knee and hip osteoarthritis comorbidities with type 2 diabetes, respectively. Causal inference analysis identifies complex effects of tissue-specific gene expression on comorbidity outcomes. Our findings provide insights into the biological basis for the type 2 diabetes-osteoarthritis disease co-occurrence.
Subjects
Diabetes Mellitus, Type 2; Osteoarthritis; Humans; Diabetes Mellitus, Type 2/complications; Diabetes Mellitus, Type 2/genetics; Comorbidity; Osteoarthritis/epidemiology; Osteoarthritis/genetics; Obesity/complications; Obesity/epidemiology; Obesity/genetics; Causality; Genome-Wide Association Study; Mendelian Randomization Analysis; Polymorphism, Single Nucleotide; Alpha-Ketoglutarate-Dependent Dioxygenase FTO/genetics
ABSTRACT
While extensively studied in clinical cohorts, the phenotypic consequences of 22q11.2 copy-number variants (CNVs) in the general population remain understudied. To address this gap, we performed a phenome-wide association scan in 405,324 unrelated UK Biobank (UKBB) participants by using CNV calls from genotyping array. We mapped 236 Human Phenotype Ontology terms linked to any of the 90 genes encompassed by the region to 170 UKBB traits and assessed the association between these traits and the copy-number state of 504 genotyping array probes in the region. We found significant associations for eight continuous and nine binary traits associated under different models (duplication-only, deletion-only, U-shape, and mirror models). The causal effect of the expression level of 22q11.2 genes on associated traits was assessed through transcriptome-wide Mendelian randomization (TWMR), revealing that increased expression of ARVCF increased BMI. Similarly, increased DGCR6 expression causally reduced mean platelet volume, in line with the corresponding CNV effect. Furthermore, cross-trait multivariable Mendelian randomization (MVMR) suggested a predominant role of genuine (horizontal) pleiotropy in the CNV region. Our findings show that within the general population, 22q11.2 CNVs are associated with traits previously linked to genes in the region, and duplications and deletions act upon traits in different fashions. We also showed that gain or loss of distinct segments within 22q11.2 may impact a trait under different association models. Our results have provided new insights to help further the understanding of the complex 22q11.2 region.
Assuntos
Variações do Número de Cópias de DNA , Fenômica , Humanos , Variações do Número de Cópias de DNA/genética , Fenótipo , Cromossomos Humanos Par 22
RESUMO
Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder for which current treatments are limited and drug development costs are prohibitive. Identifying drug targets for ASD is crucial for the development of targeted therapies. Summary-level data of expression quantitative trait loci obtained from GTEx, protein quantitative trait loci data from the ROSMAP project, and two ASD genome-wide association study datasets were utilized for discovery and replication. We conducted a combined analysis using Mendelian randomization (MR), transcriptome-wide association studies, Bayesian colocalization, and summary-data-based MR to identify potential therapeutic targets associated with ASD and examine whether there are shared causal variants among them. Furthermore, pathway and drug enrichment analyses were performed to explore the underlying mechanisms and summarize the current status of pharmacological targets for developing drugs to treat ASD. A protein-protein interaction (PPI) network and mouse knockout models were used to estimate the effect of therapeutic targets. A total of 17 genes showed causal associations with ASD and were identified as potential targets for ASD patients. Cathepsin B (CTSB) [odds ratio (OR) = 2.66, 95% confidence interval (CI): 1.28-5.52, P = 8.84 × 10⁻³], gamma-aminobutyric acid type B receptor subunit 1 (GABBR1) (OR = 1.99, 95% CI: 1.06-3.75, P = 3.24 × 10⁻²), and formin like 1 (FMNL1) (OR = 0.15, 95% CI: 0.04-0.58, P = 5.59 × 10⁻³) were replicated in the proteome-wide MR analyses. In DrugBank, two potential therapeutic drugs, Acamprosate (GABBR1 inhibitor) and Bryostatin 1 (CASP8 inhibitor), were inferred as potential influencers of autism. Knockout mouse models suggested the involvement of the CASP8, GABBR1, and PLEKHM1 genes in neurological processes. Our findings suggest 17 candidate therapeutic targets for ASD and provide novel drug targets for therapy development and critical drug repurposing opportunities.
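The odds ratios and confidence intervals quoted for CTSB, GABBR1, and FMNL1 are related on the log-odds scale. The sketch below shows that relationship using the CTSB figures from the abstract. It assumes a symmetric Wald-type interval with z = 1.96, which the abstract does not state, and the function names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Recover the standard error of a log odds ratio from its confidence limits.

    Assumes a symmetric Wald interval on the log scale (an assumption here).
    """
    return (math.log(upper) - math.log(lower)) / (2 * z)


def or_with_ci(log_or, se, z=1.96):
    """Return the odds ratio with its lower and upper confidence limits."""
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )


# CTSB values quoted in the abstract: OR = 2.66, 95% CI 1.28-5.52.
se = se_from_ci(1.28, 5.52)
odds_ratio, lower, upper = or_with_ci(math.log(2.66), se)
print(f"SE(log OR) = {se:.3f}; OR = {odds_ratio:.2f} ({lower:.2f}-{upper:.2f})")
```

If the reported point estimate sits near the geometric mean of the two limits, as it does here, the interval is consistent with having been built on the log scale.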
Assuntos
Transtorno do Espectro Autista , Estudo de Associação Genômica Ampla , Proteômica , Humanos , Transtorno do Espectro Autista/tratamento farmacológico , Transtorno do Espectro Autista/genética , Transtorno do Espectro Autista/metabolismo , Animais , Camundongos , Transcriptoma , Locos de Características Quantitativas , Mapas de Interação de Proteínas/efeitos dos fármacos , Camundongos Knockout , Terapia de Alvo Molecular
RESUMO
Causal discovery is a powerful tool to disclose underlying structures by analyzing purely observational data. Genetic variants can provide useful complementary information for structure learning. Recently, Mendelian randomization (MR) studies have provided abundant marginal causal relationships between traits. Here, we propose a causal network pruning algorithm, MRSL (MR-based structure learning algorithm), based on these marginal causal relationships. MRSL combines graph theory with multivariable MR to learn the conditional causal structure using only genome-wide association study (GWAS) summary statistics. Specifically, MRSL utilizes topological sorting to improve the precision of structure learning. It proposes MR-separation instead of d-separation and three candidate sufficient separating sets for MR-separation. Simulations showed that MRSL achieved up to a 2-fold higher F1 score and ran 100 times faster than eight other competing methods. Furthermore, we applied MRSL to 26 biomarkers and 44 International Classification of Diseases 10 (ICD10)-defined diseases using GWAS summary data from UK Biobank. The results cover most of the expected causal links that have biological interpretations and several new links supported by clinical case reports or previous observational literature.
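The abstract states that MRSL relies on topological sorting of a graph built from marginal causal relationships. The sketch below shows only that generic step, using Kahn's algorithm on a small directed graph. It is not the MRSL implementation, and the function name and the trait names in the example are hypothetical.

```python
from collections import deque


def topological_order(edges):
    """Order the nodes of a directed acyclic graph so every cause precedes its effects.

    `edges` is a list of (cause, effect) pairs. Raises ValueError on a cycle.
    """
    nodes = sorted({node for edge in edges for node in edge})
    children = {node: [] for node in nodes}
    indegree = {node: 0 for node in nodes}
    for cause, effect in edges:
        children[cause].append(effect)
        indegree[effect] += 1

    # Start from nodes with no incoming edge (no marginal cause in the graph).
    queue = deque(node for node in nodes if indegree[node] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle; no topological order exists")
    return order


# Hypothetical marginal causal edges among three traits.
marginal_edges = [("BMI", "LDL"), ("BMI", "CHD"), ("LDL", "CHD")]
print(topological_order(marginal_edges))
```

An ordering of this kind limits the candidate adjustment variables for each trait pair to traits that come earlier in the order, which is one plausible reason a pruning algorithm would compute it first.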
Assuntos
Algoritmos , Estudo de Associação Genômica Ampla , Causalidade , Fenótipo , Transporte Proteico , Análise da Randomização Mendeliana , Polimorfismo de Nucleotídeo Único
RESUMO
Attention-deficit/hyperactivity disorder (ADHD) is a chronic psychiatric disease that often affects a patient's whole life. Research has found that genetics plays an important role in the development of ADHD. However, there is still a lack of knowledge about the tissue-specific causal effects of biological processes beyond gene expression, such as alternative splicing (AS) and DNA methylation (DNAm), on ADHD. In this paper, a multi-omics study was conducted to investigate the causal effects of transcription and DNAm on ADHD by integrating ADHD genome-wide association data with quantitative trait loci data of gene expression, AS, and DNAm across 14 different brain tissues. The causal effects were estimated using four different two-sample Mendelian randomization methods. We prioritized the expression of 866 genes showing significant causal effects within at least one brain tissue, including COMMD5, ENSG00000271904, and HYAL3. We also prioritized 966 unique genes with statistically significant causal AS events within at least one of the 14 brain tissues, including PPP1R16A, GGT7, and TREM2. Furthermore, through mediation analysis, 106 regulatory pathways were inferred in which DNAm influences ADHD through gene expression or AS processes. Our findings provide guidance for future experimental studies on the molecular mechanisms of ADHD development and offer valuable knowledge for the prevention, diagnosis, and treatment of ADHD.
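The abstract does not name the four two-sample Mendelian randomization methods it used. As an illustration of the general idea, the sketch below computes the fixed-effect inverse-variance weighted (IVW) estimate, a widely used two-sample estimator, from per-variant summary statistics. Whether IVW was among the four methods is an assumption, and the function name and input values are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted causal estimate and its standard error.

    Each list holds one value per genetic variant: the variant-exposure effect,
    the variant-outcome effect, and the standard error of the latter.
    """
    # Each variant's Wald ratio is weighted by the inverse of its approximate variance.
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return estimate, standard_error


# Hypothetical summary statistics for two independent variants.
estimate, standard_error = ivw_estimate(
    beta_exposure=[0.10, 0.20],
    beta_outcome=[0.05, 0.10],
    se_outcome=[0.01, 0.01],
)
print(f"IVW estimate = {estimate:.3f}, SE = {standard_error:.4f}")
```

In a setting like the one described, the exposure effects would come from expression, splicing, or methylation QTL data for one brain tissue and the outcome effects from the ADHD association data.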