ABSTRACT
Long non-coding RNA (lncRNA) genes have well-established and important impacts on molecular and cellular functions. However, among the thousands of lncRNA genes, it is still a major challenge to identify the subset with disease or trait relevance. To systematically characterize these lncRNA genes, we used Genotype-Tissue Expression (GTEx) project v8 genetic and multi-tissue transcriptomic data to profile the expression, genetic regulation, cellular contexts, and trait associations of 14,100 lncRNA genes across 49 tissues for 101 distinct complex genetic traits. Using these approaches, we identified 1,432 lncRNA gene-trait associations, 800 of which were not explained by stronger effects of neighboring protein-coding genes. This included associations between lncRNA quantitative trait loci and inflammatory bowel disease, type 1 and type 2 diabetes, and coronary artery disease, as well as rare variant associations with body mass index.
Subjects
Disease/genetics , Multifactorial Inheritance/genetics , Population/genetics , RNA, Long Noncoding/genetics , Transcriptome , Coronary Artery Disease/genetics , Diabetes Mellitus, Type 1/genetics , Diabetes Mellitus, Type 2/genetics , Gene Expression Profiling , Genetic Variation , Humans , Inflammatory Bowel Diseases/genetics , Organ Specificity/genetics , Quantitative Trait Loci
ABSTRACT
The synthesis of new ribosomes begins during transcription of the rRNA and is widely assumed to follow an orderly 5' to 3' gradient. To visualize co-transcriptional assembly of ribosomal protein-RNA complexes in real time, we developed a single-molecule platform that simultaneously monitors transcription and protein association with the elongating transcript. Unexpectedly, the early assembly protein uS4 binds newly made pre-16S rRNA only transiently, likely due to non-native folding of the rRNA during transcription. Stable uS4 binding became more probable only in the presence of additional ribosomal proteins that bind upstream and downstream of protein uS4 by allowing productive assembly intermediates to form earlier. We propose that dynamic sampling of elongating RNA by multiple proteins overcomes heterogeneous RNA folding, preventing assembly bottlenecks and initiating assembly within the transcription time window. This may be a common feature of transcription-coupled RNP assembly.
Subjects
Ribonucleoproteins/metabolism , Transcription, Genetic , Fluorescence , Models, Biological , Protein Binding , Protein Stability , RNA Precursors/biosynthesis , RNA Precursors/chemistry , RNA Precursors/genetics , RNA Precursors/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Ribosomal Proteins/metabolism , Ribosomes/metabolism , Transcription Elongation, Genetic
ABSTRACT
Understanding the molecular mechanisms of complex traits is essential for developing targeted interventions. We analyzed liver expression quantitative-trait locus (eQTL) meta-analysis data on 1,183 participants to identify conditionally distinct signals. We found 9,013 eQTL signals for 6,564 genes; 23% of eGenes had two signals, and 6% had three or more signals. We then integrated the eQTL results with data from 29 cardiometabolic genome-wide association study (GWAS) traits and identified 1,582 GWAS-eQTL colocalizations for 747 eGenes. Non-primary eQTL signals accounted for 17% of all colocalizations. Isolating signals by conditional analysis prior to coloc resulted in 37% more colocalizations than using marginal eQTL and GWAS data, highlighting the importance of signal isolation. Isolating signals also led to stronger evidence of colocalization: among 343 eQTL-GWAS signal pairs in multi-signal regions, analyses that isolated the signals of interest resulted in higher posterior probability of colocalization for 41% of tests. Leveraging allelic heterogeneity, we predicted causal effects of gene expression on liver traits for four genes. To predict functional variants and regulatory elements, we colocalized eQTL with liver chromatin accessibility QTL (caQTL) and found 391 colocalizations, including 73 with non-primary eQTL signals and 60 eQTL signals that colocalized with both a caQTL and a GWAS signal. Finally, we used publicly available massively parallel reporter assays in HepG2 to highlight 14 eQTL signals that include at least one expression-modulating variant. This multi-faceted approach to unraveling the genetic underpinnings of liver-related traits could lead to therapeutic development.
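The colocalization step described above can be illustrated with a minimal sketch of a single-causal-variant Bayesian colocalization test built on Wakefield approximate Bayes factors. The abstract does not give implementation details, so everything here is an assumption for illustration: the function names are hypothetical, and the prior values (`p1`, `p2`, `p12`, `prior_sd`) are commonly used defaults rather than the settings of the study. Because the model assumes at most one causal variant per trait in the region, it also suggests why isolating signals by conditional analysis before testing can change the result.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_diff_exp(a, b):
    """log(exp(a) - exp(b)); returns -inf when the difference is not positive."""
    if b >= a:
        return float("-inf")
    return a + math.log1p(-math.exp(b - a))


def wakefield_log_abf(z, se, prior_sd=0.15):
    """Log approximate Bayes factor for one SNP from its z-score and standard error.

    prior_sd is the assumed standard deviation of the effect-size prior
    (an illustrative value, not taken from the abstract).
    """
    v = se ** 2
    r = prior_sd ** 2 / (prior_sd ** 2 + v)
    return 0.5 * (math.log(1.0 - r) + r * z ** 2)


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities [H0, H1, H2, H3, H4] for two traits in one region.

    H3: both traits have a causal variant, but a different one.
    H4: both traits share a single causal variant (colocalization).
    """
    s1 = log_sum_exp(labf1)
    s2 = log_sum_exp(labf2)
    s12 = log_sum_exp([a + b for a, b in zip(labf1, labf2)])
    log_h = [
        0.0,
        math.log(p1) + s1,
        math.log(p2) + s2,
        math.log(p1) + math.log(p2) + log_diff_exp(s1 + s2, s12),
        math.log(p12) + s12,
    ]
    total = log_sum_exp(log_h)
    return [math.exp(x - total) for x in log_h]


# Toy region of five SNPs (made-up z-scores, standard error 0.1 for every SNP).
eqtl_z = [0.5, 1.0, 6.0, 0.8, 0.2]
gwas_z = [0.3, 0.9, 5.5, 1.1, 0.1]
eqtl_labf = [wakefield_log_abf(z, 0.1) for z in eqtl_z]
gwas_labf = [wakefield_log_abf(z, 0.1) for z in gwas_z]
posteriors = coloc_posteriors(eqtl_labf, gwas_labf)
print({f"H{i}": round(p, 4) for i, p in enumerate(posteriors)})
```

In a region with several signals, the marginal statistics mix the signals and the single-variant assumption is violated; running the same calculation on conditionally isolated statistics for each signal pair is one way to address that.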
Subjects
Genome-Wide Association Study , Liver , Quantitative Trait Loci , Humans , Alleles , Cardiovascular Diseases/genetics , Genetic Predisposition to Disease , Liver/metabolism , Phenotype , Polymorphism, Single Nucleotide
ABSTRACT
Genetic variants are involved in the orchestration of alternative polyadenylation (APA) events, while the role of DNA methylation in regulating APA remains unclear. We generated a comprehensive atlas of APA quantitative trait methylation sites (apaQTMs) across 21 different types of cancer (1,612 to 60,219 acting in cis and 4,448 to 142,349 in trans). Potential causal apaQTMs in non-cancer samples were also identified. Mechanistically, we observed a strong enrichment of cis-apaQTMs near polyadenylation sites (PASs) and both cis- and trans-apaQTMs in proximity to transcription factor (TF) binding regions. Through the integration of ChIP-signals and RNA-seq data from cell lines, we have identified several regulators of APA events, acting either directly or indirectly, implicating novel functions of some important genes, such as TCF7L2, which is known for its involvement in type 2 diabetes and cancers. Furthermore, we have identified a vast number of QTMs that share the same putative causal CpG sites with five different cancer types, underscoring the roles of QTMs, including apaQTMs, in the process of tumorigenesis. DNA methylation is extensively involved in the regulation of APA events in human cancers. In an attempt to elucidate the potential underlying molecular mechanisms of APA by DNA methylation, our study paves the way for subsequent experimental validations into the intricate biological functions of DNA methylation in APA regulation and the pathogenesis of human cancers. To present a comprehensive catalog of apaQTM patterns, we introduce the Pancan-apaQTM database, available at https://pancan-apaqtm-zju.shinyapps.io/pancanaQTM/.
Subjects
Diabetes Mellitus, Type 2 , Neoplasms , Humans , Polyadenylation/genetics , Diabetes Mellitus, Type 2/genetics , Neoplasms/genetics , Neoplasms/pathology , Gene Expression Regulation , DNA Methylation/genetics , 3' Untranslated Regions
ABSTRACT
While many disease-associated single nucleotide polymorphisms (SNPs) are expression quantitative trait loci (eQTLs), a large proportion of genome-wide association study (GWAS) variants are of unknown function. Alternative polyadenylation (APA) plays an important role in posttranscriptional regulation by allowing genes to shorten or extend 3' untranslated regions (UTRs). We hypothesized that genetic variants that affect APA in lung tissue may lend insight into the function of respiratory associated GWAS loci. We generated alternative polyadenylation (apa) QTLs using RNA sequencing and whole genome sequencing on 1241 subjects from the Lung Tissue Research Consortium (LTRC) as part of the NHLBI TOPMed project. We identified 56 179 APA sites corresponding to 13 582 unique genes after filtering out APA sites with low usage. We found that a total of 8831 APA sites were associated with at least one SNP with q-value < 0.05. The genomic distribution of lead APA SNPs indicated that the majority are intronic variants (33%), followed by downstream gene variants (26%), 3' UTR variants (17%), and upstream gene variants (within 1 kb region upstream of transcriptional start site, 10%). APA sites in 193 genes colocalized with GWAS data for at least one phenotype. Genes containing the top APA sites associated with GWAS variants include membrane associated ring-CH-type finger 2 (MARCHF2), nectin cell adhesion molecule 2 (NECTIN2), and butyrophilin subfamily 3 member A2 (BTN3A2). Overall, these findings suggest that APA may be an important mechanism for genetic variants in lung function and chronic obstructive pulmonary disease (COPD).
Subjects
3' Untranslated Regions , Genome-Wide Association Study , Lung , Polyadenylation , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Quantitative Trait Loci/genetics , Humans , 3' Untranslated Regions/genetics , Polyadenylation/genetics , Lung/metabolism , Male , Genetic Predisposition to Disease , Pulmonary Disease, Chronic Obstructive/genetics , Female , Gene Expression Regulation/genetics
ABSTRACT
The burden of human disease lies predominantly in polygenic diseases. Since the early 2000s, genome-wide association studies (GWAS) have identified genetic variants and loci associated with complex traits. These have ranged from variants in coding sequences to mutations in regulatory regions, such as promoters and enhancers, as well as mutations affecting mediators of mRNA stability and other downstream regulators, such as 5' and 3'-untranslated regions (UTRs), long noncoding RNA (lncRNA), and miRNA. Recent research advances in genetics have utilized a combination of computational techniques, high-throughput in vitro and in vivo screening modalities, and precise genome editing to impute the function of diverse classes of genetic variants identified through GWAS. In this review, we highlight the vastness of genomic variants associated with polygenic disease risk and address recent advances in how genetic tools can be used to functionally characterize them.
Subjects
Genome-Wide Association Study , Multifactorial Inheritance , Humans , Genome-Wide Association Study/methods , Multifactorial Inheritance/genetics , Genetic Predisposition to Disease , Genetic Variation/genetics , Genomics
ABSTRACT
Genome-wide association studies of blood pressure (BP) have identified >1,000 loci, but the effector genes and biological pathways at these loci are mostly unknown. Using published association summary statistics, we conducted annotation-informed fine-mapping incorporating tissue-specific chromatin segmentation and colocalization to identify causal variants and candidate effector genes for systolic BP, diastolic BP, and pulse pressure. We observed 532 distinct signals associated with ≥2 BP traits and 84 with all three. For >20% of signals, a single variant accounted for >75% posterior probability; 65 were missense variants in known (SLC39A8, ADRB2, and DBH) and previously unreported BP candidate genes (NRIP1 and MMP14). In disease-relevant tissues, we colocalized >80 and >400 distinct signals for each BP trait with cis-eQTLs and regulatory regions from promoter capture Hi-C, respectively. Integrating mouse, human disorder, gene expression and tissue abundance data, and literature review, we provide consolidated evidence for 436 BP candidate genes for future functional validation and discover several potential drug targets.
Subjects
Genome-Wide Association Study , Hypertension , Humans , Animals , Mice , Quantitative Trait Loci/genetics , Multiomics , Genetic Predisposition to Disease , Hypertension/genetics , Polymorphism, Single Nucleotide/genetics
ABSTRACT
Insulin secretion is critical for glucose homeostasis, and increased levels of the precursor proinsulin relative to insulin indicate pancreatic islet beta-cell stress and insufficient insulin secretory capacity in the setting of insulin resistance. We conducted meta-analyses of genome-wide association results for fasting proinsulin from 16 European-ancestry studies in 45,861 individuals. We found 36 independent signals at 30 loci (p value < 5 × 10⁻⁸), which validated 12 previously reported loci for proinsulin and ten additional loci previously identified for another glycemic trait. Half of the alleles associated with higher proinsulin showed higher rather than lower effects on glucose levels, corresponding to different mechanisms. Proinsulin loci included genes that affect prohormone convertases, beta-cell dysfunction, vesicle trafficking, beta-cell transcriptional regulation, and lysosomes/autophagy processes. We colocalized 11 proinsulin signals with islet expression quantitative trait locus (eQTL) data, suggesting candidate genes, including ARSG, WIPI1, SLC7A14, and SIX3. The NKX6-3/ANK1 proinsulin signal colocalized with a T2D signal and an adipose ANK1 eQTL signal but not the islet NKX6-3 eQTL. Signals were enriched for islet enhancers, and we showed a plausible islet regulatory mechanism for the lead signal in the MADD locus. These results show how detailed genetic studies of an intermediate phenotype can elucidate mechanisms that may predispose one to disease.
Subjects
Diabetes Mellitus, Type 2 , Proinsulin , Humans , Proinsulin/genetics , Proinsulin/metabolism , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/metabolism , Genome-Wide Association Study/methods , Insulin/genetics , Insulin/metabolism , Glucose , Transcription Factors/genetics , Homeodomain Proteins/genetics
ABSTRACT
Multimorbidity is a rising public health challenge with important implications for health management and policy. The most common multimorbidity pattern is the combination of cardiometabolic and osteoarticular diseases. Here, we study the genetic underpinning of the comorbidity between type 2 diabetes and osteoarthritis. We find genome-wide genetic correlation between the two diseases and robust evidence for association-signal colocalization at 18 genomic regions. We integrate multi-omics and functional information to resolve the colocalizing signals and identify high-confidence effector genes, including FTO and IRX3, which provide proof-of-concept insights into the epidemiologic link between obesity and both diseases. We find enrichment for lipid metabolism and skeletal formation pathways for signals underpinning the knee and hip osteoarthritis comorbidities with type 2 diabetes, respectively. Causal inference analysis identifies complex effects of tissue-specific gene expression on comorbidity outcomes. Our findings provide insights into the biological basis for the type 2 diabetes-osteoarthritis disease co-occurrence.
Subjects
Diabetes Mellitus, Type 2 , Osteoarthritis , Humans , Diabetes Mellitus, Type 2/complications , Diabetes Mellitus, Type 2/genetics , Comorbidity , Osteoarthritis/epidemiology , Osteoarthritis/genetics , Obesity/complications , Obesity/epidemiology , Obesity/genetics , Causality , Genome-Wide Association Study , Mendelian Randomization Analysis , Polymorphism, Single Nucleotide , Alpha-Ketoglutarate-Dependent Dioxygenase FTO/genetics
ABSTRACT
Integrative genetic association methods have shown great promise in post-GWAS (genome-wide association study) analyses, in which one of the most challenging tasks is identifying putative causal genes and uncovering molecular mechanisms of complex traits. Recent studies suggest that prevailing computational approaches, including transcriptome-wide association studies (TWASs) and colocalization analysis, are individually imperfect, but their joint usage can yield robust and powerful inference results. This paper presents INTACT, a computational framework to integrate probabilistic evidence from these distinct types of analyses and implicate putative causal genes. This procedure is flexible and can work with a wide range of existing integrative analysis approaches. It has the unique ability to quantify the uncertainty of implicated genes, enabling rigorous control of false-positive discoveries. Taking advantage of this highly desirable feature, we further propose an efficient algorithm, INTACT-GSE, for gene set enrichment analysis based on the integrated probabilistic evidence. We examine the proposed computational methods and illustrate their improved performance over the existing approaches through simulation studies. We apply the proposed methods to analyze the multi-tissue eQTL data from the GTEx project and eight large-scale complex- and molecular-trait GWAS datasets from multiple consortia and the UK Biobank. Overall, we find that the proposed methods markedly improve the existing putative gene implication methods and are particularly advantageous in evaluating and identifying key gene sets and biological pathways underlying complex traits.
Subjects
Genome-Wide Association Study , Transcriptome , Humans , Transcriptome/genetics , Genome-Wide Association Study/methods , Multifactorial Inheritance/genetics , Quantitative Trait Loci/genetics , Computer Simulation , Polymorphism, Single Nucleotide/genetics , Genetic Predisposition to Disease
ABSTRACT
Sequence-level data offers insights into biological processes through the interaction of two or more genomic features from the same or different molecular data types. Within motifs, this interaction is often explored via the co-occurrence of feature genomic tracks using fixed-segments or analytical tests that respectively require window size determination and risk of false positives from over-simplified models. Moreover, methods for robustly examining the co-localization of genomic features, and thereby understanding their spatial interaction, have been elusive. We present a new analytical method for examining feature interaction by introducing the notion of reciprocal co-occurrence, define statistics to estimate it and hypotheses to test for it. Our approach leverages conditional motif co-occurrence events between features to infer their co-localization. Using reverse conditional probabilities and introducing a novel simulation approach that retains motif properties (e.g. length, guanine-content), our method further accounts for potential confounders in testing. As a proof-of-concept, motif co-localization (MoCoLo) confirmed the co-occurrence of histone markers in a breast cancer cell line. As a novel analysis, MoCoLo identified significant co-localization of oxidative DNA damage within non-B DNA-forming regions that significantly differed between non-B DNA structures. Altogether, these findings demonstrate the potential utility of MoCoLo for testing spatial interactions between genomic features via their co-localization.
Subjects
DNA , Genomics , Computer Simulation
ABSTRACT
Uncovering the functional impact of genetic variation on gene expression is important in understanding tissue biology and the pathogenesis of complex traits. Despite large efforts to map expression quantitative trait loci (eQTLs) across many human tissues, our ability to translate those findings to understanding human disease has been incomplete, and the majority of disease loci are not explained by association with expression of a target gene. Cell-type specificity and the presence of multiple independent causal variants for many eQTLs are potential confounders contributing to the apparent discrepancy with disease loci. In this study, we investigate the tissue specificity of genetic effects on gene expression and the overlap with disease loci while considering the presence of multiple causal variants within and across tissues. We find evidence of pervasive tissue specificity of eQTLs, often masked by linkage disequilibrium that misleads traditional meta-analytic approaches. We propose CAFEH (colocalization and fine-mapping in the presence of allelic heterogeneity), a Bayesian method that integrates genetic association data across multiple traits, incorporating linkage disequilibrium to identify causal variants. CAFEH outperforms previous approaches in colocalization and fine-mapping. Using CAFEH, we show that genes with highly tissue-specific genetic effects are under greater selection, enriched in differentiation and developmental processes, and more likely to be involved in human disease. Last, we demonstrate that CAFEH can efficiently leverage the widespread allelic heterogeneity in genetic regulation of gene expression to prioritize the target tissue in genome-wide association complex trait loci, thereby improving our ability to interpret complex trait genetics.
Subjects
Alleles , Gene Expression Regulation , Genetic Heterogeneity , Genome, Human , Multifactorial Inheritance , Adipose Tissue/metabolism , Bayes Theorem , Chromosome Mapping , Fibroblasts/metabolism , Genome-Wide Association Study , Heart Ventricles/metabolism , Humans , Linkage Disequilibrium , Organ Specificity , Quantitative Trait Loci , Thyroid Gland/metabolism
ABSTRACT
Transcriptome-wide association studies and colocalization analysis are popular computational approaches for integrating genetic-association data from molecular and complex traits. They show the unique ability to go beyond variant-level genetic-association evidence and implicate critical functional units, e.g., genes, in disease etiology. However, in practice, when the two approaches are applied to the same molecular and complex-trait data, the inference results can be markedly different. This paper systematically investigates the inferential reproducibility between the two approaches through theoretical derivation, numerical experiments, and analyses of four complex trait GWAS and GTEx eQTL data. We identify two classes of inconsistent inference results. We find that the first class of inconsistent results (i.e., genes with strong colocalization but weak transcriptome-wide association study [TWAS] signals) might suggest an interesting biological phenomenon, i.e., horizontal pleiotropy; thus, the two approaches are truly complementary. The inconsistency in the second class (i.e., genes with weak colocalization but strong TWAS signals) can be understood and effectively reconciled. To this end, we propose a computational approach for locus-level colocalization analysis. We demonstrate that the joint TWAS and locus-level colocalization analysis improves specificity and sensitivity for implicating biologically relevant genes.
Subjects
Genome-Wide Association Study , Transcriptome , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Humans , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Reproducibility of Results , Transcriptome/genetics
ABSTRACT
Body mass index (BMI) is a complex disease risk factor known to be influenced by genes acting via both metabolic pathways and appetite regulation. In this study, we aimed to gain insight into the phenotypic consequences of BMI-associated genetic variants, which may be mediated by their expression in different tissues. First, we harnessed meta-analyzed gene expression datasets derived from subcutaneous adipose (n = 1257) and brain (n = 1194) tissue to identify 86 and 140 loci, respectively, which provided evidence of genetic colocalization with BMI. These two sets of tissue-partitioned loci had differential effects with respect to waist-to-hip ratio, suggesting that the way they influence fat distribution might vary despite their having very similar average magnitudes of effect on BMI itself (adipose = 0.0148 and brain = 0.0149 standard deviation change in BMI per effect allele). For instance, BMI-associated variants colocalized with TBX15 expression in adipose tissue (posterior probability [PPA] = 0.97), but not when we used TBX15 expression data derived from brain tissue (PPA = 0.04). This gene putatively influences BMI via its role in skeletal development. Conversely, there were loci where BMI-associated variants provided evidence of colocalization with gene expression in brain tissue (e.g., NEGR1, PPA = 0.93), but not when we used data derived from adipose tissue, suggesting that these genes might be more likely to influence BMI via energy balance. Leveraging these tissue-partitioned variant sets through a multivariable Mendelian randomization framework provided strong evidence that the brain-tissue-derived variants are predominantly responsible for driving the genetically predicted effects of BMI on cardiovascular-disease endpoints (e.g., coronary artery disease: odds ratio = 1.05, 95% confidence interval = 1.04-1.07, p = 4.67 × 10⁻¹⁴). In contrast, our analyses suggested that the adipose tissue variants might predominantly be responsible for the underlying relationship between BMI and measures of cardiac function, such as left ventricular stroke volume (beta = 0.21, 95% confidence interval = 0.09-0.32, p = 6.43 × 10⁻⁴).
Subjects
Body Mass Index , Cell Adhesion Molecules, Neuronal/genetics , Coronary Artery Disease/genetics , Diabetes Mellitus, Type 2/genetics , Obesity/genetics , T-Box Domain Proteins/genetics , Adipose Tissue/metabolism , Adipose Tissue/pathology , Brain/metabolism , Brain/pathology , Cell Adhesion Molecules, Neuronal/metabolism , Coronary Artery Disease/metabolism , Coronary Artery Disease/pathology , Diabetes Mellitus, Type 2/metabolism , Diabetes Mellitus, Type 2/pathology , GPI-Linked Proteins/genetics , GPI-Linked Proteins/metabolism , Gene Expression Profiling , Gene Expression Regulation , Genetic Loci , Genetic Variation , Genome, Human , Genome-Wide Association Study , Humans , Mendelian Randomization Analysis , Metabolic Networks and Pathways/genetics , Obesity/metabolism , Obesity/pathology , Stroke Volume/physiology , T-Box Domain Proteins/metabolism , Waist-Hip Ratio
ABSTRACT
Lack of diversity in human genomics limits our understanding of the genetic underpinnings of complex traits, hinders precision medicine, and contributes to health disparities. To map genetic effects on gene regulation in the underrepresented Indonesian population, we have integrated genotype, gene expression, and CpG methylation data from 115 participants across three island populations that capture the major sources of genomic diversity in the region. In a comparison with European datasets, we identify eQTLs shared between Indonesia and Europe as well as population-specific eQTLs that exhibit differences in allele frequencies and/or overall expression levels between populations. By combining local ancestry and archaic introgression inference with eQTLs and methylQTLs, we identify regulatory loci driven by modern Papuan ancestry as well as introgressed Denisovan and Neanderthal variation. GWAS colocalization connects QTLs detected here to hematological traits, and further comparison with European datasets reflects the poor overall transferability of GWAS statistics across diverse populations. Our findings illustrate how population-specific genetic architecture, local ancestry, and archaic introgression drive variation in gene regulation across genetically distinct and in admixed populations and highlight the need for performing association studies on non-European populations.
Subjects
Gene Expression Regulation , Genetics, Population , Genome, Human , Quantitative Trait Loci , Computational Biology/methods , DNA Methylation , Databases, Genetic , Genome-Wide Association Study , Genomics/methods , High-Throughput Nucleotide Sequencing , Humans , Indonesia , Male , Models, Genetic , Molecular Sequence Annotation , Multifactorial Inheritance , Quantitative Trait, Heritable , Selection, Genetic , Whole Genome Sequencing
ABSTRACT
Alternate splicing events can create isoforms that alter gene function, and genetic variants associated with alternate gene isoforms may reveal molecular mechanisms of disease. We used subcutaneous adipose tissue of 426 Finnish men from the METSIM study and identified splice junction quantitative trait loci (sQTLs) for 6,077 splice junctions (FDR < 1%). In the same individuals, we detected expression QTLs (eQTLs) for 59,443 exons and 15,397 genes (FDR < 1%). We identified 595 genes with an sQTL and exon eQTL but no gene eQTL, which could indicate potential isoform differences. Of the significant sQTL signals, 2,114 (39.8%) included at least one proxy variant (linkage disequilibrium r² > 0.8) located within an intron spanned by the splice junction. We identified 203 sQTLs that colocalized with 141 genome-wide association study (GWAS) signals for cardiometabolic traits, including 25 signals for lipid traits, 24 signals for body mass index (BMI), and 12 signals for waist-hip ratio adjusted for BMI. Among all 141 GWAS signals colocalized with an sQTL, we detected 26 that also colocalized with an exon eQTL for an exon skipped by the sQTL splice junction. At a GWAS signal for high-density lipoprotein cholesterol colocalized with an NR1H3 sQTL splice junction, we show that the alternative splice product encodes an NR1H3 transcription factor that lacks a DNA binding domain and fails to activate transcription. Together, these results detect splicing events and candidate mechanisms that may contribute to gene function at GWAS loci.
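The abstract reports QTLs at FDR < 1% without naming the procedure used to control the false discovery rate. As an illustration only, the sketch below applies the Benjamini-Hochberg adjustment, one common choice; the function name and the example p-values are hypothetical, and the study may well have used a different approach (for example, permutation-based p-values followed by q-value estimation).

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for four splice junctions.
junction_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(junction_p)
significant = [p <= 0.05 for p in adjusted_p]
print(adjusted_p, significant)
```

A junction would be reported at "FDR < 1%" under this scheme when its adjusted p-value falls below 0.01.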
Subjects
Alternative Splicing , Cardiometabolic Risk Factors , Gene Expression Regulation , Quantitative Trait Loci , Quantitative Trait, Heritable , Subcutaneous Fat/metabolism , Binding Sites , Cardiovascular Diseases/etiology , Cardiovascular Diseases/metabolism , Computational Biology/methods , Exons , Finland , Genes, Reporter , Genetic Association Studies , Genetic Predisposition to Disease , Genetics, Population , Genome-Wide Association Study/methods , High-Throughput Nucleotide Sequencing , Humans , Liver X Receptors/genetics , Male , Metabolic Syndrome/etiology , Metabolic Syndrome/metabolism , Molecular Sequence Annotation , Phenotype , Protein Isoforms/genetics , RNA Splice Sites , RNA-Binding Proteins
ABSTRACT
Mucus obstruction is a central feature in the cystic fibrosis (CF) airways. A genome-wide association study (GWAS) of lung disease by the CF Gene Modifier Consortium (CFGMC) identified a significant locus containing two mucin genes, MUC20 and MUC4. Expression quantitative trait locus (eQTL) analysis using human nasal epithelia (HNE) from 94 CF-affected Canadians in the CFGMC demonstrated MUC4 eQTLs that mirrored the lung association pattern in the region, suggesting that MUC4 expression may mediate CF lung disease. Complications arose, however, with colocalization testing using existing methods: the locus is complex and the associated SNPs span a 0.2 Mb region with high linkage disequilibrium (LD) and evidence of allelic heterogeneity. We previously developed the Simple Sum (SS), a powerful colocalization test in regions with allelic heterogeneity, but SS assumed eQTLs to be present to achieve type I error control. Here we propose a two-stage SS (SS2) colocalization test that avoids a priori eQTL assumptions, accounts for multiple hypothesis testing and the composite null hypothesis, and enables meta-analysis. We compare SS2 to published approaches through simulation and demonstrate type I error control for all settings with the greatest power in the presence of high LD and allelic heterogeneity. Applying SS2 to the MUC20/MUC4 CF lung disease locus with eQTLs from CF HNE revealed significant colocalization with MUC4 (p = 1.31 × 10⁻⁵) rather than with MUC20. The SS2 is a powerful method to inform the responsible gene(s) at a locus and guide future functional studies. SS2 has been implemented in the application LocusFocus.
Subjects
Amino Acid Transport Systems/genetics , Cystic Fibrosis/genetics , Models, Statistical , Mucin-4/genetics , Mucins/genetics , Quantitative Trait Loci , Alleles , Amino Acid Transport Systems/metabolism , Cystic Fibrosis/metabolism , Cystic Fibrosis/pathology , Gene Expression Profiling , Gene Expression Regulation , Genetic Heterogeneity , Genome, Human , Genome-Wide Association Study , Humans , Linkage Disequilibrium , Lung/metabolism , Lung/pathology , Mucin-4/metabolism , Mucins/metabolism , Nasal Mucosa/metabolism , Nasal Mucosa/pathology , Polymorphism, Single Nucleotide
ABSTRACT
OBJECTIVE: Myasthenia gravis (MG) is a complex autoimmune disease affecting the neuromuscular junction. Drug options are limited, although MG treatment has recently benefited from novel biological agents. We performed a drug-target Mendelian randomization (MR) study to identify novel therapeutic targets for MG. METHODS: Cis-expression quantitative trait loci (cis-eQTLs), which proxy expression levels for 2176 druggable genes, were used for MR analysis. Causal relationships between genes and disease identified by eQTL MR analysis were verified by comprehensive sensitivity, colocalization, and protein quantitative trait locus (pQTL) MR analyses. Protein-protein interaction (PPI) analysis was also performed to extend the set of targets, followed by enzyme-linked immunosorbent assay (ELISA) to measure serum levels of the drug targets in MG patients. A phenome-wide MR analysis was then performed to assess side effects, and a clinical trial review was used to assess druggability. RESULTS: The eQTL MR analysis identified eight potential targets for MG, one for early-onset MG and seven for late-onset MG. Colocalization analyses indicated that CD226, CDC42BPB, PRSS36, and TNFSF12 show evidence of colocalization with MG or late-onset MG. pQTL MR analyses identified causal relationships of TNFSF12 and CD226 with MG and late-onset MG. Furthermore, PPI analysis revealed protein interactions between TNFSF12 and TNFSF13 (APRIL) and between TNFSF12 and TNFSF13B (BLyS). ELISA experiments also identified elevated serum TNFSF13 levels in MG patients. This study ultimately proposes three promising therapeutic targets for MG (TNFSF12, TNFSF13, TNFSF13B). CONCLUSIONS: Three drug targets associated with the BLyS/APRIL pathway have been identified. Multiple biological agents, including telitacicept and belimumab, are promising for MG therapy.
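For readers unfamiliar with drug-target MR using a single cis-eQTL instrument, the basic estimator is commonly the Wald ratio: the SNP-outcome effect divided by the SNP-expression effect. The sketch below is illustrative only and is not taken from the study; the function name, the delta-method standard error, and the input numbers are all assumptions chosen for the example.

```python
import math

def wald_ratio(beta_exposure, se_exposure, beta_outcome, se_outcome):
    """Wald ratio MR estimate for one instrument (hypothetical helper).

    beta_exposure / se_exposure: SNP effect on gene expression (eQTL).
    beta_outcome / se_outcome:   SNP effect on disease (GWAS, log-odds).
    Returns (estimate, standard error, two-sided normal p-value).
    """
    if beta_exposure == 0:
        raise ValueError("instrument has no effect on the exposure")
    estimate = beta_outcome / beta_exposure
    # Second-order delta-method variance of a ratio (assumes the two
    # SNP effects are estimated in independent samples).
    variance = (se_outcome ** 2 / beta_exposure ** 2
                + beta_outcome ** 2 * se_exposure ** 2 / beta_exposure ** 4)
    se = math.sqrt(variance)
    z = estimate / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p under N(0, 1)
    return estimate, se, p_value

# Toy numbers, not data from the study.
est, se, p = wald_ratio(beta_exposure=0.5, se_exposure=0.05,
                        beta_outcome=0.1, se_outcome=0.02)
print(f"estimate={est:.3f} se={se:.4f} p={p:.2e}")
```

With several instruments per gene, per-SNP ratios are usually combined by a weighted method instead of being reported individually.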
Subjects
Mendelian Randomization Analysis , Myasthenia Gravis , Quantitative Trait Loci , Humans , Myasthenia Gravis/genetics , Myasthenia Gravis/drug therapy , Myasthenia Gravis/pathology , Myasthenia Gravis/blood , Quantitative Trait Loci/genetics , Protein Interaction Maps/genetics , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide/genetics
Abstract
The causal relationships between plasma metabolites and the risks of cholelithiasis and cholecystitis remain elusive. Using two-sample Mendelian randomization, we found that the genetically proxied plasma campesterol level was negatively correlated with the risk of both cholelithiasis and cholecystitis. Furthermore, an increased risk of cholelithiasis correlated with an increased level of plasma campesterol. Lastly, a genetic colocalization study showed that the lead SNP, rs4299376, which resides at the ABCG5/ABCG8 gene locus, was shared by plasma campesterol level and cholelithiasis, indicating that aberrant transport of plant sterol/cholesterol from the bloodstream to the bile duct/gut lumen might be key in preventing cholesterol gallstone formation.
Subjects
Cholecystitis , Cholesterol/analogs & derivatives , Gallstones , Phytosterols , Humans , Lipoproteins/genetics , Mendelian Randomization Analysis , ATP Binding Cassette Transporter, Subfamily G, Member 8/genetics , ATP Binding Cassette Transporter, Subfamily G, Member 5/genetics , Cholecystitis/epidemiology , Cholecystitis/genetics , Gallstones/epidemiology , Gallstones/genetics , Gallstones/metabolism
Abstract
Preeclampsia, a multifaceted condition characterized by high blood pressure during pregnancy, is linked to substantial health risks for both the mother and the fetus. Previous studies suggest potential neurological impacts, but the causal relationships between cortical structural changes and preeclampsia remain unclear. We utilized genome-wide association study data for cortical thickness (TH) and surface area (SA) across multiple brain regions and preeclampsia. Bidirectional Mendelian randomization (MR) analyses were conducted to assess causality, followed by co-localization analyses to confirm shared genetic architecture. Increased cortical TH in the inferior parietal and supramarginal regions, and an enlarged SA in the postcentral region, were significantly associated with higher preeclampsia risk. Conversely, preeclampsia was linked to increased SA in the supramarginal and middle temporal gyri, and decreased SA in the lingual gyrus. Co-localization analyses indicated distinct genetic determinants for cortical structures and preeclampsia. Our findings reveal bidirectional influences between cortical structural features and preeclampsia, suggesting neuroinflammatory and vascular mechanisms as potential pathways. These insights underscore the importance of considering brain structure in preeclampsia risk assessment and highlight the need for further research into neuroprotective strategies.
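A bidirectional MR analysis of this kind typically runs the same estimator twice with the roles of exposure and outcome swapped, each time using instruments selected for that direction's exposure. The abstract does not state which estimator was used; the sketch below assumes the common fixed-effect inverse-variance weighted (IVW) method, and the function name and toy summary statistics are hypothetical.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate (hypothetical helper).

    Equivalent to a weighted regression of SNP-outcome effects on
    SNP-exposure effects through the origin, with weights 1 / se_outcome^2.
    Assumes the instruments are mutually independent (no LD).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("input lists must have the same length")
    if not beta_exposure:
        raise ValueError("at least one instrument is required")
    weights = [1.0 / s ** 2 for s in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    estimate = numerator / denominator
    se = math.sqrt(1.0 / denominator)
    return estimate, se

# Toy summary statistics, not data from the study.
# Forward direction: instruments for a cortical trait, outcome = preeclampsia.
forward = ivw_estimate([0.10, 0.20], [0.05, 0.10], [0.01, 0.01])
# Reverse direction: a separate instrument set for preeclampsia,
# outcome = the cortical trait.
reverse = ivw_estimate([0.30, 0.15], [0.03, 0.02], [0.02, 0.02])
print("forward (estimate, se):", forward)
print("reverse (estimate, se):", reverse)
```

Using a distinct instrument set for each direction is what keeps the two estimates from being the same regression read backwards.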