ABSTRACT
BACKGROUND: The genetic background of cancer remains complex and challenging to integrate. Many somatic mutations within genes are known to cause and drive cancer, while genome-wide association studies (GWAS) of cancer have revealed many germline risk factors associated with cancer. However, the overlap between known somatic driver genes and positional candidate genes from GWAS loci is surprisingly small. We hypothesised that genes from multiple independent cancer GWAS loci should show tissue-specific co-regulation patterns that converge on cancer-specific driver genes. RESULTS: We studied recent well-powered GWAS of breast, prostate, colorectal and skin cancer by estimating co-expression between genes and subsequently prioritising genes that show significant co-expression with genes mapping within susceptibility loci from cancer GWAS. We observed that the prioritised genes were strongly enriched for cancer drivers defined by COSMIC, IntOGen and Dietlein et al. The enrichment of known cancer driver genes was most significant when using co-expression networks derived from non-cancer samples of the relevant tissue of origin. CONCLUSION: We show how genes within risk loci identified by cancer GWAS can be linked to known cancer driver genes through tissue-specific co-expression networks. This provides an important explanation for why seemingly unrelated sets of genes that harbour either germline risk factors or somatic mutations can eventually cause the same type of disease.
Subject(s)
Gene Regulatory Networks, Genetic Predisposition to Disease, Genome-Wide Association Study, Neoplasms, Humans, Neoplasms/genetics, Organ Specificity/genetics, Neoplastic Gene Expression Regulation, Genetic Loci
ABSTRACT
We investigated indirect genetic effects (IGEs), also known as genetic nurture, in education with a novel approach that uses phased data to include parent-offspring pairs in the transmitted/nontransmitted study design. This method increases the power to detect IGEs, enhances the generalizability of the findings, and allows for the study of effects by parent-of-origin. We validated and applied this method in a family-based subsample of adolescents and adults from the Lifelines Cohort Study in the Netherlands (N = 6147), using the latest genome-wide association study data on educational attainment to construct polygenic scores (PGS). Our results indicated that IGEs play a role in education outcomes in the Netherlands: we found significant associations of the nontransmitted PGS with secondary school level in youth between 13 and 24 years old as well as with educational attainment and years of education in adults over 25 years old (β = 0.14, 0.17 and 0.26, respectively), with tentative evidence for larger maternal IGEs. In conclusion, we replicated previous findings and showed that including parent-offspring pairs in addition to trios in the transmitted/nontransmitted design can benefit future studies of parental IGEs in a wide range of outcomes.
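The transmitted/nontransmitted split described above can be illustrated with a minimal Python sketch. It assumes that, for each variant, the parent's effect-allele count (0, 1 or 2) and the allele passed to the offspring (0 or 1, known from phasing) are already available. The function name, weights and genotypes are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def transmitted_nontransmitted_pgs(
    parent_genotypes: Sequence[int],
    transmitted_alleles: Sequence[int],
    weights: Sequence[float],
) -> Tuple[float, float]:
    """Split one parent's polygenic score into transmitted and nontransmitted parts."""
    transmitted = 0.0
    nontransmitted = 0.0
    for genotype, allele, weight in zip(parent_genotypes, transmitted_alleles, weights):
        # The allele that reached the offspring counts toward the transmitted score.
        transmitted += weight * allele
        # Whatever remains of the parent's genotype was not passed on.
        nontransmitted += weight * (genotype - allele)
    return transmitted, nontransmitted


# Hypothetical per-variant effect sizes and one parent's data (illustration only).
weights = [0.2, -0.1, 0.3]
mother_genotypes = [2, 1, 1]
mother_transmitted = [1, 0, 1]
pgs_t, pgs_nt = transmitted_nontransmitted_pgs(mother_genotypes, mother_transmitted, weights)
```

An association between the nontransmitted score and the offspring's outcome cannot run through alleles the offspring inherited, which is why it is read as evidence of an indirect genetic effect.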
Subject(s)
Educational Status, Genome-Wide Association Study, Multifactorial Inheritance, Humans, Adolescent, Female, Male, Adult, Genome-Wide Association Study/methods, Netherlands, Parents, Young Adult, Cohort Studies, Genetic Models
ABSTRACT
Enhancers play a vital role in gene regulation and are critical in mediating the impact of noncoding genetic variants associated with complex traits. Enhancer activity is a cell-type-specific process regulated by transcription factors (TFs), epigenetic mechanisms and genetic variants. Despite the strong mechanistic link between TFs and enhancers, we currently lack a framework for jointly analysing them in cell-type-specific gene regulatory networks (GRN). Equally important, we lack an unbiased way of assessing the biological significance of inferred GRNs since no complete ground truth exists. To address these gaps, we present GRaNIE (Gene Regulatory Network Inference including Enhancers) and GRaNPA (Gene Regulatory Network Performance Analysis). GRaNIE (https://git.embl.de/grp-zaugg/GRaNIE) builds enhancer-mediated GRNs based on covariation of chromatin accessibility and RNA-seq across samples (e.g. individuals), while GRaNPA (https://git.embl.de/grp-zaugg/GRaNPA) assesses the performance of GRNs for predicting cell-type-specific differential expression. We demonstrate their power by investigating gene regulatory mechanisms underlying the response of macrophages to infection, cancer and common genetic traits including autoimmune diseases. Finally, our methods identify the TF PURA as a putative regulator of pro-inflammatory macrophage polarisation.
Subject(s)
Gene Regulatory Networks, Neoplasms, Humans, Gene Expression Regulation, Transcription Factors/genetics, Transcription Factors/metabolism, Chromatin, Neoplasms/genetics, Genetic Enhancer Elements/genetics
ABSTRACT
BACKGROUND: The genetic underpinning of sexual dimorphism is very poorly understood. The prevalence of many diseases differs between men and women, which could be in part caused by sex-specific genetic effects. Nevertheless, only a few published genome-wide association studies (GWAS) were performed separately in each sex. The reported enrichment of expression quantitative trait loci (eQTLs) among GWAS-associated SNPs suggests a potential role of sex-specific eQTLs in the sex-specific genetic mechanism underlying complex traits. METHODS: To explore this scenario, we combined sex-specific whole blood RNA-seq eQTL data from 3447 European individuals included in BIOS Consortium and GWAS data from UK Biobank. Next, to test the presence of sex-biased causal effect of gene expression on complex traits, we performed sex-specific transcriptome-wide Mendelian randomization (TWMR) analyses on the two most sexually dimorphic traits, waist-to-hip ratio (WHR) and testosterone levels. Finally, we performed power analysis to calculate the GWAS sample size needed to observe sex-specific trait associations driven by sex-biased eQTLs. RESULTS: Among 9 million SNP-gene pairs showing sex-combined associations, we found 18 genes with significant sex-biased cis-eQTLs (FDR 5%). Our phenome-wide association study of the 18 top sex-biased eQTLs on >700 traits unraveled that these eQTLs do not systematically translate into detectable sex-biased trait-associations. In addition, we observed that sex-specific causal effects of gene expression on complex traits are not driven by sex-specific eQTLs. Power analyses using real eQTL- and causal-effect sizes showed that millions of samples would be necessary to observe sex-biased trait associations that are fully driven by sex-biased cis-eQTLs. Compensatory effects may further hamper their detection. 
CONCLUSIONS: Our results suggest that sex-specific eQTLs in whole blood do not translate to detectable sex-specific trait associations of complex diseases, and vice versa that the observed sex-specific trait associations cannot be explained by sex-specific eQTLs.
Subject(s)
Genome-Wide Association Study, Quantitative Trait Loci, Female, Genome-Wide Association Study/methods, Humans, Male, Single Nucleotide Polymorphism, Sex Characteristics, Transcriptome
ABSTRACT
Copy-number variations (CNV) are believed to play an important role in a wide range of complex traits, but discovering such associations remains challenging. While whole-genome sequencing (WGS) is the gold-standard approach for CNV detection, there are several orders of magnitude more samples with available genotyping microarray data. Such array data can be exploited for CNV detection using dedicated software (e.g., PennCNV); however, these calls suffer from elevated false-positive and -negative rates. In this study, we developed a CNV quality score that weights PennCNV calls (pCNVs) based on their likelihood of being true positive. First, we established a measure of pCNV reliability by leveraging evidence from multiple omics data (WGS, transcriptomics, and methylomics) obtained from the same samples. Next, we built a predictor of omics-confirmed pCNVs, termed omics-informed quality score (OQS), using only PennCNV software output parameters. Promisingly, OQS assigned to pCNVs detected in close family members was up to 35% higher than the OQS of pCNVs not carried by other relatives (p < 3.0 × 10^-90), outperforming other scores. Finally, in an association study of four anthropometric traits in 89,516 Estonian Biobank samples, the use of OQS led to a relative increase in the trait variance explained by CNVs of up to 56% compared with published quality filtering methods or scores. Overall, we put forward a flexible framework to improve any CNV detection method leveraging multi-omics evidence, applied it to improve PennCNV calls, and demonstrated its utility by improving the statistical power for downstream association analyses.
ABSTRACT
Trait-associated genetic variants affect complex phenotypes primarily via regulatory mechanisms on the transcriptome. To investigate the genetics of gene expression, we performed cis- and trans-expression quantitative trait locus (eQTL) analyses using blood-derived expression from 31,684 individuals through the eQTLGen Consortium. We detected cis-eQTL for 88% of genes, and these were replicable in numerous tissues. Distal trans-eQTL (detected for 37% of 10,317 trait-associated variants tested) showed lower replication rates, partially due to low replication power and confounding by cell type composition. However, replication analyses in single-cell RNA-seq data prioritized intracellular trans-eQTL. Trans-eQTL exerted their effects via several mechanisms, primarily through regulation by transcription factors. Expression of 13% of the genes correlated with polygenic scores for 1,263 phenotypes, pinpointing potential drivers for those traits. In summary, this work represents a large eQTL resource, and its results serve as a starting point for in-depth interpretation of complex phenotypes.
Subject(s)
Blood Proteins/genetics, Gene Expression Regulation/genetics, Quantitative Trait Loci/genetics, Genome-Wide Association Study, Humans, Multifactorial Inheritance/genetics, Single Nucleotide Polymorphism/genetics, Transcriptome/genetics
ABSTRACT
Reproductive longevity is essential for fertility and influences healthy ageing in women [1,2], but insights into its underlying biological mechanisms and treatments to preserve it are limited. Here we identify 290 genetic determinants of ovarian ageing, assessed using normal variation in age at natural menopause (ANM) in about 200,000 women of European ancestry. These common alleles were associated with clinical extremes of ANM; women in the top 1% of genetic susceptibility have an equivalent risk of premature ovarian insufficiency to those carrying monogenic FMR1 premutations [3]. The identified loci implicate a broad range of DNA damage response (DDR) processes and include loss-of-function variants in key DDR-associated genes. Integration with experimental models demonstrates that these DDR processes act across the life-course to shape the ovarian reserve and its rate of depletion. Furthermore, we demonstrate that experimental manipulation of DDR pathways highlighted by human genetics increases fertility and extends reproductive life in mice. Causal inference analyses using the identified genetic variants indicate that extending reproductive life in women improves bone health and reduces risk of type 2 diabetes, but increases the risk of hormone-sensitive cancers. These findings provide insight into the mechanisms that govern ovarian ageing, when they act, and how they might be targeted by therapeutic approaches to extend fertility and prevent disease.
Subject(s)
Aging/genetics, Ovary/metabolism, Adult, Alleles, Animals, Bone and Bones/metabolism, Checkpoint Kinase 1/genetics, Checkpoint Kinase 2/genetics, Type 2 Diabetes Mellitus, Diet, Europe/ethnology, East Asia/ethnology, Female, Fertility/genetics, Fragile X Mental Retardation Protein/genetics, Genetic Predisposition to Disease, Genome-Wide Association Study, Healthy Aging/genetics, Humans, Longevity/genetics, Menopause/genetics, Premature Menopause/genetics, Mice, Inbred C57BL Mice, Middle Aged, Primary Ovarian Insufficiency/genetics, Uterus
ABSTRACT
Epidemiological and genetic studies on COVID-19 are currently hindered by inconsistent and limited testing policies to confirm SARS-CoV-2 infection. Recently, it was shown that it is possible to predict COVID-19 cases using cross-sectional self-reported disease-related symptoms. Here, we demonstrate that this COVID-19 prediction model has reasonable and consistent performance across multiple independent cohorts and that our attempt to improve upon this model did not result in improved predictions. Using the existing COVID-19 prediction model, we then conducted a GWAS on the predicted phenotype using a total of 1,865 predicted cases and 29,174 controls. While we did not find any common, large-effect variants that reached genome-wide significance, we do observe suggestive genetic associations at two SNPs (rs11844522, p = 1.9 × 10^-7; rs5798227, p = 2.2 × 10^-7). Explorative analyses furthermore suggest that genetic variants associated with other viral infectious diseases do not overlap with COVID-19 susceptibility and that severity of COVID-19 may have a different genetic architecture compared to COVID-19 susceptibility. This study represents a first effort that uses a symptom-based predicted phenotype as a proxy for COVID-19 in our pursuit of understanding the genetic susceptibility of the disease. We conclude that the inclusion of symptom-based predicted cases could be a useful strategy in a scenario of limited testing, either during the current COVID-19 pandemic or any future viral outbreak.
Subject(s)
COVID-19/pathology, Genetic Predisposition to Disease, Area Under Curve, COVID-19/genetics, COVID-19/virology, Cross-Sectional Studies, Genome-Wide Association Study, Humans, Phenotype, Single Nucleotide Polymorphism, ROC Curve, SARS-CoV-2/isolation & purification
ABSTRACT
Enhancers are genomic sequences that play a key role in regulating tissue-specific gene expression levels. An increasing number of diseases are linked to impaired enhancer function through chromosomal rearrangement, genetic variation within enhancers, or epigenetic modulation. Here, we review how these enhancer disruptions have recently been implicated in congenital disorders, cancers, and common complex diseases and address the implications for diagnosis and treatment. Although further fundamental research into enhancer function, target genes, and context is required, enhancer-targeting drugs and gene editing approaches show great therapeutic promise for a range of diseases.
Subject(s)
Genetic Enhancer Elements, Epigenomics, Gene Editing, Genomics, Humans
ABSTRACT
Glycemic traits are used to diagnose and monitor type 2 diabetes and cardiometabolic health. To date, most genetic studies of glycemic traits have focused on individuals of European ancestry. Here we aggregated genome-wide association studies comprising up to 281,416 individuals without diabetes (30% non-European ancestry) for whom fasting glucose, 2-h glucose after an oral glucose challenge, glycated hemoglobin and fasting insulin data were available. Trans-ancestry and single-ancestry meta-analyses identified 242 loci (99 novel; P < 5 × 10^-8), 80% of which had no significant evidence of between-ancestry heterogeneity. Analyses restricted to individuals of European ancestry with equivalent sample size would have led to 24 fewer new loci. Compared with single-ancestry analyses, equivalent-sized trans-ancestry fine-mapping reduced the number of estimated variants in 99% credible sets by a median of 37.5%. Genomic-feature, gene-expression and gene-set analyses revealed distinct biological signatures for each trait, highlighting different underlying biological pathways. Our results increase our understanding of diabetes pathophysiology by using trans-ancestry studies for improved power and resolution.
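To make the notion of a 99% credible set concrete, the following Python sketch builds one from per-variant posterior probabilities. It assumes the posteriors for a locus are already computed and sum to about 1. The function name, variant labels and probabilities are hypothetical and do not come from the study.

```python
from typing import Dict, List


def credible_set(posteriors: Dict[str, float], coverage: float = 0.99) -> List[str]:
    """Return the smallest set of top-ranked variants whose posteriors reach `coverage`."""
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    chosen: List[str] = []
    cumulative = 0.0
    for variant, probability in ranked:
        chosen.append(variant)
        cumulative += probability
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical posterior probabilities for four variants at one locus.
example_posteriors = {
    "variant_a": 0.90,
    "variant_b": 0.06,
    "variant_c": 0.035,
    "variant_d": 0.005,
}
example_set = credible_set(example_posteriors)
```

Here three variants are needed to reach 99% coverage. Sharper posteriors, such as those a trans-ancestry analysis may yield, would shrink this set.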
Subject(s)
Blood Glucose/genetics, Heritable Quantitative Trait, White People/genetics, Alleles, Genetic Epigenesis, Gene Expression Profiling, Human Genome, Genome-Wide Association Study, Glycated Hemoglobin/metabolism, Humans, Multifactorial Inheritance/genetics, Physical Chromosome Mapping, Quantitative Trait Loci/genetics
ABSTRACT
BACKGROUND: Aging is a multifactorial process that affects multiple tissues and is characterized by changes in homeostasis over time, leading to increased morbidity. Whole blood gene expression signatures have been associated with aging and have been used to gain information on its biological mechanisms, which are still not fully understood. However, blood is composed of many cell types whose proportions vary with age. As a result, previously observed associations between gene expression levels and aging might be driven by cell type composition rather than intracellular aging mechanisms. To overcome this, previous aging studies already accounted for major cell types, but the possibility remains that the reported associations are false positives driven by less prevalent cell subtypes. RESULTS: Here, we compared the regression model from our previous work to an extended model that corrects for 33 additional white blood cell subtypes. Both models were applied to whole blood gene expression data from 3165 individuals belonging to the general population (age range of 18-81 years). We found that the new model fits the data better and identifies fewer genes associated with aging (625, compared to the 2808 of the initial model; P ≤ 2.5 × 10^-6). Moreover, 511 genes (~ 18% of the 2808 genes identified by the initial model) were found using both models, indicating that the other previously reported genes could be proxies for less abundant cell types. In particular, functional enrichment of the genes identified by the new model highlighted pathways and GO terms specifically associated with platelet activity. CONCLUSIONS: We conclude that gene expression analyses in blood strongly benefit from correction for both common and rare blood cell types, and recommend using blood-cell count estimates as standard covariates when studying whole blood gene expression.
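The reasoning behind the extended model can be shown with a toy Python sketch. It is not the authors' regression model: it uses a single hypothetical cell-type proportion and obtains the covariate-adjusted age slope by residualisation (the Frisch-Waugh-Lovell approach). All numbers are invented, and expression is constructed to depend on the cell proportion only.

```python
from statistics import mean
from typing import List, Sequence


def slope(x: Sequence[float], y: Sequence[float]) -> float:
    """Ordinary least-squares slope of y regressed on x (with intercept)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)


def residuals(x: Sequence[float], y: Sequence[float]) -> List[float]:
    """Residuals of y after regressing it on x (with intercept)."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Invented data: the cell proportion rises with age, expression tracks the cell proportion.
age = [20.0, 30.0, 40.0, 50.0, 60.0, 70.0]
cell_proportion = [0.10, 0.14, 0.22, 0.24, 0.33, 0.35]
expression = [5.0 + 2.0 * c for c in cell_proportion]

# Model without the cell-type covariate: age appears to be associated with expression.
unadjusted_age_slope = slope(age, expression)

# Model with the covariate: regress both variables on it, then relate the residuals.
adjusted_age_slope = slope(
    residuals(cell_proportion, age), residuals(cell_proportion, expression)
)
```

In this constructed example the age slope is positive without the covariate and essentially zero with it, which mirrors how a gene can act as a proxy for a cell type whose abundance changes with age.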
Subject(s)
Aging, Transcriptome, Adolescent, Adult, Aged, Aged 80 and over, Aging/genetics, Humans, Middle Aged, Young Adult
ABSTRACT
PURPOSE: The Lifelines COVID-19 cohort was set up to assess the psychological and societal impacts of the COVID-19 pandemic and investigate potential risk factors for COVID-19 within the Lifelines prospective population cohort. PARTICIPANTS: Participants were recruited from the 140 000 eligible participants of Lifelines and the Lifelines NEXT birth cohort, who are all residents of the three northern provinces of the Netherlands. Participants filled out detailed questionnaires about their physical and mental health and experiences on a weekly basis starting in late March 2020, and the cohort consists of everyone who filled in at least one questionnaire in the first 8 weeks of the project. FINDINGS TO DATE: >71 000 unique participants responded to the questionnaires at least once during the first 8 weeks, with >22 000 participants responding to seven questionnaires. Compiled questionnaire results are continuously updated and shared with the public through the Corona Barometer website. Early results included a clear signal that younger people living alone were experiencing greater levels of loneliness due to lockdown, and subsequent results showed the easing of anxiety as lockdown was eased in June 2020. FUTURE PLANS: Questionnaires were sent on a (bi)weekly basis starting in March 2020 and on a monthly basis starting July 2020, with plans for new questionnaire rounds to continue through 2020 and early 2021. Questionnaire frequency can be increased again for subsequent waves of infections. Cohort data will be used to address how the COVID-19 pandemic developed in the northern provinces of the Netherlands, which environmental and genetic risk factors predict disease susceptibility and severity and the psychological and societal impacts of the crisis. 
Cohort data are linked to the extensive health, lifestyle and sociodemographic data held for these participants by Lifelines, a 30-year project that started in 2006, and to data about participants held in national databases.
Subject(s)
COVID-19/psychology, Pandemics, Adult, Anxiety, Communicable Disease Control, Female, Humans, Loneliness, Male, Middle Aged, Netherlands/epidemiology, Prospective Studies, Quality of Life, Surveys and Questionnaires
ABSTRACT
To study the effect of host genetics on gut microbiome composition, the MiBioGen consortium curated and analyzed genome-wide genotypes and 16S fecal microbiome data from 18,340 individuals (24 cohorts). Microbial composition showed high variability across cohorts: only 9 of 410 genera were detected in more than 95% of samples. A genome-wide association study of host genetic variation regarding microbial taxa identified 31 loci affecting the microbiome at a genome-wide significant (P < 5 × 10^-8) threshold. One locus, the lactase (LCT) gene locus, reached study-wide significance (genome-wide association study signal: P = 1.28 × 10^-20), and it showed an age-dependent association with Bifidobacterium abundance. Other associations were suggestive (1.95 × 10^-10 < P < 5 × 10^-8) but enriched for taxa showing high heritability and for genes expressed in the intestine and brain. A phenome-wide association study and Mendelian randomization identified enrichment of microbiome trait loci in the metabolic, nutrition and environment domains and suggested the microbiome might have causal effects in ulcerative colitis and rheumatoid arthritis.
Subject(s)
Gastrointestinal Microbiome/physiology, Genetic Variation, Quantitative Trait Loci, Adolescent, Adult, Bifidobacterium/genetics, Child, Preschool Child, Cohort Studies, Female, Gastrointestinal Microbiome/genetics, Genome-Wide Association Study, Humans, Lactase/genetics, Linkage Disequilibrium, Male, Mendelian Randomization Analysis, Metabolism/genetics, 16S Ribosomal RNA
ABSTRACT
Circulating proteins are vital in human health and disease and are frequently used as biomarkers for clinical decision-making or as targets for pharmacological intervention. Here, we map and replicate protein quantitative trait loci (pQTL) for 90 cardiovascular proteins in over 30,000 individuals, resulting in 451 pQTLs for 85 proteins. For each protein, we further perform pathway mapping to obtain trans-pQTL gene and regulatory designations. We substantiate these regulatory findings with orthogonal evidence for trans-pQTLs using mouse knockdown experiments (ABCA1 and TRIB1) and clinical trial results (chemokine receptors CCR2 and CCR5), with consistent regulation. Finally, we evaluate known drug targets, and suggest new target candidates or repositioning opportunities using Mendelian randomization. This identifies 11 proteins with causal evidence of involvement in human disease that have not previously been targeted, including EGF, IL-16, PAPPA, SPON1, F3, ADM, CASP-8, CHI3L1, CXCL16, GDF15 and MMP-12. Taken together, these findings demonstrate the utility of large-scale mapping of the genetics of the proteome and provide a resource for future precision studies of circulating proteins in human health.
Subject(s)
Cardiovascular System/metabolism, Chromosome Mapping, Drug Delivery Systems, Genomics, ATP Binding Cassette Transporter 1/genetics, Asthma/genetics, Gene Knockdown Techniques, Genome-Wide Association Study, Humans, Inflammatory Bowel Diseases/genetics, Interleukin-1 Receptor-Like 1 Protein/genetics, Intracellular Signaling Peptides and Proteins/genetics, Linkage Disequilibrium, Mendelian Randomization Analysis, Protein Serine-Threonine Kinases/antagonists & inhibitors, Protein Serine-Threonine Kinases/genetics, Proteome, Quantitative Trait Loci, CCR2 Receptors/genetics, CCR5 Receptors/genetics
ABSTRACT
Inference of causality between gene expression and complex traits using Mendelian randomization (MR) is confounded by pleiotropy and linkage disequilibrium (LD) of gene-expression quantitative trait loci (eQTL). Here, we propose an MR method, MR-link, that accounts for unobserved pleiotropy and LD by leveraging information from individual-level data, even when only one eQTL variant is present. In simulations, MR-link shows false-positive rates close to expectation (median 0.05) and high power (up to 0.89), outperforming all other tested MR methods and coloc. Application of MR-link to low-density lipoprotein cholesterol (LDL-C) measurements in 12,449 individuals with expression and protein QTL summary statistics from blood and liver identifies 25 genes causally linked to LDL-C. These include the known SORT1 and ApoE genes as well as PVRL2, located in the APOE locus, for which a causal role in liver was not known. Our results showcase the strength of MR-link for transcriptome-wide causal inferences.
Subject(s)
LDL Cholesterol/blood, Gene Expression Regulation, Genetic Predisposition to Disease, Genetic Models, Quantitative Trait Loci, Vesicular Transport Adaptor Proteins/genetics, Vesicular Transport Adaptor Proteins/metabolism, Apolipoproteins E/genetics, Apolipoproteins E/metabolism, LDL Cholesterol/metabolism, Computer Simulation, Datasets as Topic, Genetic Pleiotropy, Humans, Linkage Disequilibrium, Lipid Metabolism/genetics, Mendelian Randomization Analysis, Metabolic Networks and Pathways/genetics, Multifactorial Inheritance, Nectins/genetics, Nectins/metabolism, Netherlands, Proteomics, RNA-Seq
ABSTRACT
BACKGROUND: Expression quantitative trait loci (eQTL) studies are used to interpret the function of disease-associated genetic risk factors. To date, most eQTL analyses have been conducted in bulk tissues, such as whole blood and tissue biopsies, which are likely to mask the cell type-context of the eQTL regulatory effects. Although this context can be investigated by generating transcriptional profiles from purified cell subpopulations, current methods to do this are labor-intensive and expensive. We introduce a new method, Decon2, as a framework for estimating cell proportions using expression profiles from bulk blood samples (Decon-cell) followed by deconvolution of cell type eQTLs (Decon-eQTL). RESULTS: The estimated cell proportions from Decon-cell agree with experimental measurements across cohorts (R ≥ 0.77). Using Decon-cell, we could predict the proportions of 34 circulating cell types for 3194 samples from a population-based cohort. Next, we identified 16,362 whole-blood eQTLs and deconvoluted cell type interaction (CTi) eQTLs using the predicted cell proportions from Decon-cell. CTi eQTLs show excellent allelic directional concordance with eQTL (≥ 96-100%) and chromatin mark QTL (≥87-92%) studies that used either purified cell subpopulations or single-cell RNA-seq, outperforming the conventional interaction effect. CONCLUSIONS: Decon2 provides a method to detect cell type interaction effects from bulk blood eQTLs that is useful for pinpointing the most relevant cell type for a given complex disease. Decon2 is available as an R package and Java application (https://github.com/molgenis/systemsgenetics/tree/master/Decon2) and as a web tool (www.molgenis.org/deconvolution).
Subject(s)
Genome-Wide Association Study/methods, Quantitative Trait Loci/immunology, Whole-Body Counting/methods, Humans
ABSTRACT
Insights into individual differences in gene expression and its heritability (h²) can help in understanding pathways from DNA to phenotype. We estimated the heritability of gene expression of 52,844 genes measured in whole blood in the largest twin RNA-Seq sample to date (1497 individuals including 459 monozygotic twin pairs and 150 dizygotic twin pairs) from classical twin modeling and identity-by-state-based approaches. We estimated for each gene h²_total, composed of cis-heritability (h²_cis, the variance explained by single nucleotide polymorphisms in the cis-window of the gene), and trans-heritability (h²_res, the residual variance explained by all other genome-wide variants). Mean h²_total was 0.26, which was significantly higher than heritability estimates earlier found in a microarray-based study using largely overlapping (>60%) RNA samples (mean h² = 0.14, p = 6.15 × 10^-258). Mean h²_cis was 0.06 and strongly correlated with beta of the top cis expression quantitative loci (eQTL, ρ = 0.76, p < 10^-308) and with estimates from earlier RNA-Seq-based studies. Mean h²_res was 0.20 and correlated with the beta of the corresponding trans-eQTL (ρ = 0.04, p < 1.89 × 10^-3) and was significantly higher for genes involved in cytokine-cytokine interactions (p = 4.22 × 10^-15), many other immune system pathways, and genes identified in genome-wide association studies for various traits including behavioral disorders and cancer. This study provides a thorough characterization of cis- and trans-h² estimates of gene expression, which is of value for interpretation of GWAS and gene expression studies.
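As a rough illustration of how twin data yield a heritability estimate, the Python sketch below applies Falconer's formula, h² = 2(r_MZ − r_DZ). This is a textbook approximation and not the twin modeling used in the study. The function name and the correlation values are hypothetical.

```python
def falconer_heritability(r_mz: float, r_dz: float) -> float:
    """Falconer's approximation: twice the gap between MZ and DZ twin correlations."""
    return 2.0 * (r_mz - r_dz)


# Hypothetical twin-pair correlations of one gene's expression level.
example_h2 = falconer_heritability(r_mz=0.5, r_dz=0.3)
```

With these invented correlations the estimate is about 0.4. The approximation rests on monozygotic pairs sharing all, and dizygotic pairs on average half, of their segregating genetic variation.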
Subject(s)
Gene-Environment Interaction, Single Nucleotide Polymorphism, Heritable Quantitative Trait, Adolescent, Adult, Aged, Female, Genome-Wide Association Study/methods, Genotype, Humans, Male, Middle Aged, Quantitative Trait Loci, RNA-Seq/methods, Dizygotic Twins/genetics, Monozygotic Twins/genetics
ABSTRACT
Celiac disease (CeD) is a complex T cell-mediated enteropathy induced by gluten. Although genome-wide association studies have identified numerous genomic regions associated with CeD, it is difficult to accurately pinpoint which genes in these loci are most likely to cause CeD. We used four different in silico approaches (Mendelian randomization inverse variance weighting, COLOC, LD overlap, and DEPICT) to integrate information gathered from a large transcriptomics dataset. This identified 118 prioritized genes across 50 CeD-associated regions. Co-expression and pathway analysis of these genes indicated an association with adaptive and innate cytokine signaling and T cell activation pathways. Fifty-one of these genes are targets of known drug compounds or likely druggable genes, suggesting that our methods can be used to pinpoint potential therapeutic targets. In addition, we detected 172 gene combinations that were affected by our CeD-prioritized genes in trans. Notably, 41 of these trans-mediated genes appear to be under control of one master regulator, TRAF-type zinc finger domain containing 1 (TRAFD1), and were found to be involved in interferon (IFN)γ signaling and MHC I antigen processing/presentation. Finally, we performed in vitro experiments in a human monocytic cell line that validated the role of TRAFD1 as an immune regulator acting in trans. Our strategy confirmed the role of adaptive immunity in CeD and revealed a genetic link between CeD and IFNγ signaling as well as with MHC I antigen processing, both major players of immune activation and CeD pathogenesis.
ABSTRACT
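Of the four approaches named above, inverse variance weighting is the easiest to write down. The Python sketch below computes the standard fixed-effect IVW estimate from per-variant summary statistics, assuming independent variants and negligible uncertainty in the variant-exposure effects. The function name and all numbers are hypothetical, and this is not the study's pipeline.

```python
from math import sqrt
from typing import Sequence, Tuple


def ivw_estimate(
    beta_exposure: Sequence[float],
    beta_outcome: Sequence[float],
    se_outcome: Sequence[float],
) -> Tuple[float, float]:
    """Fixed-effect inverse-variance-weighted causal estimate and its standard error."""
    numerator = sum(
        bx * by / se ** 2 for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome))
    return numerator / denominator, sqrt(1.0 / denominator)


# Hypothetical summary statistics for three independent eQTL variants of one gene.
# Each variant's outcome effect is 0.2 times its expression effect.
effect_on_expression = [0.5, 0.4, 0.2]
effect_on_trait = [0.10, 0.08, 0.04]
trait_standard_errors = [0.02, 0.02, 0.01]
causal_estimate, causal_se = ivw_estimate(
    effect_on_expression, effect_on_trait, trait_standard_errors
)
```

Because every variant implies the same ratio in this constructed example, the pooled estimate equals that ratio. With real data the weights favour variants whose outcome effects are measured more precisely.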
INTRODUCTION: The role of TOMM40-APOE 19q13.3 region variants is well documented in Alzheimer's disease (AD) but remains contentious in dementia with Lewy bodies (DLB) and Parkinson's disease dementia (PDD). METHODS: We dissected genetic profiles within the TOMM40-APOE region in 451 individuals from four European brain banks, including DLB and PDD cases with/without neuropathological evidence of AD-related pathology and healthy controls. RESULTS: TOMM40-L/APOE-ε4 alleles were associated with DLB (OR_TOMM40-L = 3.61, P value = 3.23 × 10^-9; OR_APOE-ε4 = 3.75, P value = 4.90 × 10^-10) and earlier age at onset of DLB (HR_TOMM40-L = 1.33, P value = .031; HR_APOE-ε4 = 1.46, P value = .004), but not with PDD. The TOMM40-L/APOE-ε4 effect was most pronounced in DLB individuals with concomitant AD pathology (OR_TOMM40-L = 4.40, P value = 1.15 × 10^-6; OR_APOE-ε4 = 5.65, P value = 2.97 × 10^-8) but was not significant in DLB without AD. Meta-analyses combining all APOE-ε4 data in DLB confirmed our findings (OR_DLB = 2.93, P value = 3.78 × 10^-99; OR_DLB+AD = 5.36, P value = 1.56 × 10^-47). DISCUSSION: APOE-ε4/TOMM40-L alleles increase susceptibility and risk of earlier DLB onset, an effect explained by concomitant AD-related pathology. These findings have important implications in future drug discovery and development efforts in DLB.
ABSTRACT
Early childhood growth patterns are associated with adult health, yet the genetic factors and the developmental stages involved are not fully understood. Here, we combine genome-wide association studies with modeling of longitudinal growth traits to study the genetics of infant and child growth, followed by functional, pathway, genetic correlation, risk score, and colocalization analyses to determine how developmental timings, molecular pathways, and genetic determinants of these traits overlap with those of adult health. We found a robust overlap between the genetics of child and adult body mass index (BMI), with variants associated with adult BMI acting as early as 4 to 6 years old. However, we demonstrated a completely distinct genetic makeup for peak BMI during infancy, influenced by variation at the LEPR/LEPROT locus. These findings suggest that different genetic factors control infant and child BMI. In light of the obesity epidemic, these findings are important to inform the timing and targets of prevention strategies.