ABSTRACT
An important goal of clinical genomics is to be able to estimate the risk of adverse disease outcomes. Between 5% and 10% of individuals with ulcerative colitis (UC) require colectomy within 5 years of diagnosis, but polygenic risk scores (PRSs) utilizing findings from genome-wide association studies (GWASs) are unable to provide meaningful prediction of this adverse status. By contrast, in Crohn disease, gene expression profiling of GWAS-significant genes does provide some stratification of risk of progression to complicated disease in the form of a transcriptional risk score (TRS). Here, we demonstrate that a measured TRS based on bulk rectal gene expression in the PROTECT inception cohort study has a positive predictive value approaching 50% for colectomy. Single-cell profiling demonstrates that the genes are active in multiple diverse cell types from both the epithelial and immune compartments. Expression quantitative trait locus (eQTL) analysis identifies genes with differential effects at baseline and week 52 follow-up, but for the most part, differential expression associated with colectomy risk is independent of local genetic regulation. Nevertheless, a predicted polygenic transcriptional risk score (PPTRS) derived by summation of transcriptome-wide association study (TWAS) effects identifies UC-affected individuals at 5-fold elevated risk of colectomy with data from the UK Biobank population cohort study, independently replicated in an NIDDK-IBDGC dataset. Prediction of gene expression from relatively small transcriptome datasets can thus be used in conjunction with TWASs for stratification of risk of disease complications.
Subject(s)
Colectomy/statistics & numerical data, Ulcerative Colitis/surgery, Crohn Disease/surgery, Quantitative Trait Loci, Transcriptome, Biological Specimen Banks, Cohort Studies, Ulcerative Colitis/complications, Ulcerative Colitis/diagnosis, Ulcerative Colitis/genetics, Colon/metabolism, Colon/pathology, Colon/surgery, Crohn Disease/complications, Crohn Disease/diagnosis, Crohn Disease/genetics, Datasets as Topic, Disease Progression, Gene Expression Profiling, Genome-Wide Association Study, Humans, Multifactorial Inheritance, Prognosis, Risk Assessment, United Kingdom
ABSTRACT
Whether or not populations diverge with respect to the genetic contribution to risk of specific complex diseases is relevant to understanding the evolution of susceptibility and origins of health disparities. Here, we describe a large-scale whole-genome sequencing study of inflammatory bowel disease encompassing 1,774 affected individuals and 1,644 healthy control Americans with African ancestry (African Americans). Although no new loci for inflammatory bowel disease are discovered at genome-wide significance levels, we identify numerous instances of differential effect sizes in combination with divergent allele frequencies. For example, the major effect at PTGER4 fine maps to a single credible interval of 22 SNPs corresponding to one of four independent associations at the locus in European ancestry individuals but with an elevated odds ratio for Crohn disease in African Americans. A rare variant aggregate analysis implicates Ca2+-binding neuro-immunomodulator CALB2 in ulcerative colitis. Highly significant overall overlap of common variant risk for inflammatory bowel disease susceptibility between individuals with African and European ancestries was observed, with 41 of 241 previously known lead variants replicated and overall correlations in effect sizes of 0.68 for combined inflammatory bowel disease. Nevertheless, subtle differences influence the performance of polygenic risk scores, and we show that ancestry-appropriate weights significantly improve polygenic prediction in the highest percentiles of risk. The median amount of variance explained per locus remains the same in African and European cohorts, providing evidence for compensation of effect sizes as allele frequencies diverge, as expected under a highly polygenic model of disease.
Subject(s)
Calbindin 2/genetics, Genetic Predisposition to Disease, Inflammatory Bowel Diseases/genetics, Prostaglandin E Receptors EP4 Subtype/genetics, Black or African American/genetics, Aged, Aged 80 and over, Ulcerative Colitis/genetics, Ulcerative Colitis/pathology, Crohn Disease/genetics, Crohn Disease/pathology, Female, Gene Frequency, Genome-Wide Association Study, Humans, Inflammatory Bowel Diseases/pathology, Male, Multifactorial Inheritance/genetics, Single Nucleotide Polymorphism/genetics, White People/genetics, Whole Genome Sequencing
ABSTRACT
Since organisms develop and thrive in the face of constant perturbations due to environmental and genetic variation, species may evolve resilient genetic architectures. We sought evidence for this process, known as canalization, through a comparison of the prevalence of phenotypes as a function of the polygenic score (PGS) across environments in the UK Biobank cohort study. Contrasting seven diseases and three categorical phenotypes with respect to 151 exposures in 408,925 people, the deviation between the prevalence-risk curves was observed to increase monotonically with the PGS percentile in one-fifth of the comparisons, suggesting extensive PGS-by-Environment (PGS×E) interaction. After adjustment for the dependency of allelic effect sizes on increased prevalence in the perturbing environment, cases where polygenic influences are greater or lesser than expected are seen to be particularly pervasive for educational attainment, obesity, and the metabolic condition type 2 diabetes. Inflammatory bowel disease analysis shows fewer interactions but confirms that smoking and some aspects of diet influence risk. Notably, body mass index has more evidence for decanalization (increased genetic influence at the extremes of polygenic risk), whereas the waist-to-hip ratio shows canalization, reflecting different evolutionary pressures on the architectures of these weight-related traits. An additional 10% of comparisons showed evidence for an additive shift of prevalence independent of PGS between exposures. These results provide the first widespread evidence for canalization protecting against disease in humans and have implications for personalized medicine as well as understanding the evolution of complex traits. The findings can be explored through an R shiny app at https://canalization-gibsonlab.shinyapps.io/rshiny/.
Subject(s)
Biological Specimen Banks, Multifactorial Inheritance, Cohort Studies, Humans, Phenotype, United Kingdom
ABSTRACT
Transcriptome-wide association studies (TWASs), which test for association between the study trait and gene expression levels imputed from cis-acting expression quantitative trait locus (cis-eQTL) genotypes, have successfully enhanced the discovery of genetic risk loci for complex traits. By using gene expression imputation models fitted from reference datasets that have both genetic and transcriptomic data, TWASs facilitate gene-based tests with GWAS data while accounting for the reference transcriptomic data. Existing TWAS tools such as PrediXcan and FUSION use parametric imputation models that have limitations for modeling the complex genetic architecture of transcriptomic data. To improve on this, we employ a nonparametric Bayesian method that was originally proposed for genetic prediction of complex traits, which assumes a data-driven nonparametric prior for cis-eQTL effect sizes. The nonparametric Bayesian method is flexible and general because it includes both of the parametric imputation models used by PrediXcan and FUSION as special cases. Our simulation studies showed that the nonparametric Bayesian model improved both imputation R² for transcriptomic data and TWAS power over PrediXcan when ≥1% of cis-SNPs co-regulate gene expression and gene expression heritability is ≤0.2. In real applications, the nonparametric Bayesian method fitted transcriptomic imputation models for 57.8% more genes than PrediXcan, thus improving the power of follow-up TWASs. We implement both the parametric PrediXcan and the nonparametric Bayesian methods in a convenient software tool, "TIGAR" (Transcriptome-Integrated Genetic Association Resource), which imputes transcriptomic data and performs subsequent TWASs using individual-level or summary-level GWAS data.
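The two TWAS steps described in this abstract (imputing expression from cis-eQTL genotypes, then testing the imputed expression against the trait) can be illustrated with a minimal sketch. This is not TIGAR, PrediXcan, or FUSION code: the function names, the simulated genotypes, and the weights are all hypothetical, and the weights are simply assumed to come from some previously fitted imputation model.

```python
import numpy as np

def impute_expression(genotypes, weights):
    """Imputed expression: dosage matrix (n x p) times cis-eQTL weights (p,)."""
    return np.asarray(genotypes, dtype=float) @ np.asarray(weights, dtype=float)

def association_z(trait, imputed):
    """Z-score of the slope from a simple linear regression of trait on imputed expression."""
    x = imputed - imputed.mean()
    y = trait - trait.mean()
    sxx = x @ x
    beta = (x @ y) / sxx
    resid = y - beta * x
    sigma2 = (resid @ resid) / (len(y) - 2)  # residual variance
    return beta / np.sqrt(sigma2 / sxx)

# Simulated data only; nothing here comes from a real reference panel.
rng = np.random.default_rng(0)
n_samples, n_snps = 1000, 20
genotypes = rng.binomial(2, 0.3, size=(n_samples, n_snps))  # allele dosages 0/1/2

# Hypothetical cis-eQTL weights: three SNPs with nonzero effects on expression.
weights = np.zeros(n_snps)
weights[:3] = [0.5, -0.4, 0.3]

imputed = impute_expression(genotypes, weights)
trait = 1.0 * imputed + rng.normal(size=n_samples)  # trait driven by expression plus noise

z = association_z(trait, imputed)
print(f"TWAS z-score for the simulated gene: {z:.2f}")
```

In this sketch the imputation model only enters through the weight vector, so a parametric or a nonparametric model would plug into the same association step.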
Subject(s)
Aging/genetics, Bayes Theorem, Chromosome Mapping/methods, Dementia/genetics, Multifactorial Inheritance/genetics, Single Nucleotide Polymorphism, Transcriptome, Gene Expression Profiling, Genome-Wide Association Study, Humans, Phenotype, Prospective Studies, Quantitative Trait Loci, Software
ABSTRACT
Inflammatory bowel disease (IBD) is characterized by chronic inflammation of the gastrointestinal system. Omega-3 (ω3) fatty acids are polyunsaturated fatty acids (PUFAs) that are largely obtained from diet and have been speculated to decrease the inflammatory response that is involved in IBD; however, the causality of this association has not been established. A two-sample Mendelian randomization (MR) was used to assess genetic associations between 249 circulating metabolites measured in the UK Biobank as exposures and IBD as the outcome. The genome-wide association study summary-level data for metabolite measurements and IBD were derived from large European ancestry cohorts. We observed a significant protective association of ω3 fatty acids with IBD, with multiple modes of MR evidence replicated in three IBD summary genetic datasets. The instrumental variables that were involved in the causal association of ω3 fatty acids with IBD highlighted an intronic SNP, rs174564, in FADS2, which encodes a protein engaged in the first step of alpha-linolenic acid desaturation leading to anti-inflammatory EPA and thence DHA production. A low ratio of ω3 to ω6 fatty acids was observed to be a causal risk factor, particularly for Crohn's disease. ω3 fatty acid supplementation may provide anti-inflammatory responses that are required to attenuate inflammation that is involved in IBD.
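As an illustration of how a two-sample MR estimate is formed from summary statistics, the sketch below computes the fixed-effect inverse-variance weighted (IVW) estimate, one commonly used MR estimator. The abstract does not state which estimators were applied, so this is an assumption chosen for illustration, and the SNP effect sizes are made-up numbers rather than values from the study.

```python
import numpy as np

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate and its standard error.

    Each SNP contributes a Wald ratio (beta_outcome / beta_exposure);
    the ratios are combined with weights beta_exposure**2 / se_outcome**2.
    """
    bx = np.asarray(beta_exposure, dtype=float)
    by = np.asarray(beta_outcome, dtype=float)
    w = 1.0 / np.asarray(se_outcome, dtype=float) ** 2
    denom = np.sum(w * bx ** 2)
    estimate = np.sum(w * bx * by) / denom
    std_err = np.sqrt(1.0 / denom)
    return estimate, std_err

# Made-up summary statistics for three hypothetical instruments.
beta_exposure = [0.10, 0.20, 0.15]     # SNP effect on the exposure (e.g. a metabolite level)
beta_outcome = [-0.05, -0.10, -0.075]  # SNP effect on the outcome (log odds of disease)
se_outcome = [0.02, 0.02, 0.02]

estimate, std_err = ivw_estimate(beta_exposure, beta_outcome, se_outcome)
print(f"IVW estimate: {estimate:.3f} (SE {std_err:.3f})")
```

A negative estimate in this toy input corresponds to a protective direction of effect: a genetically higher exposure goes with lower log odds of the outcome.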
Subject(s)
Omega-3 Fatty Acids, Inflammatory Bowel Diseases, Humans, Genome-Wide Association Study, Mendelian Randomization Analysis, Omega-3 Fatty Acids/therapeutic use, Inflammatory Bowel Diseases/genetics, Inflammation/drug therapy, Anti-Inflammatory Agents/therapeutic use, Chronic Disease
ABSTRACT
The transferability of polygenic scores across population groups is a major concern with respect to the equitable clinical implementation of genomic medicine. Since genetic associations are identified relative to the population mean, differences in disease or trait prevalence among social strata inevitably influence the relationship between PGS and risk. Here we quantify the magnitude of PGS-by-Exposure (PGS×E) interactions for seven human diseases (coronary artery disease, type 2 diabetes, obesity thresholded to body mass index and to waist-to-hip ratio, inflammatory bowel disease, chronic kidney disease, and asthma) and pairs of 75 exposures in the White-British subset of the UK Biobank study (n=408,801). Across 24,198 PGS×E models, 746 (3.1%) were significant by two criteria, at least three-fold more than expected by chance under each criterion. Predictive accuracy is significantly improved in the high-risk exposures and by including interaction terms with effects as large as those documented for low transferability of PGS across ancestries. The predominant mechanism for PGS×E interactions is shown to be amplification of genetic effects in the presence of adverse exposures such as low polyunsaturated fatty acids, mediators of obesity, and social determinants of ill health. We introduce the notion of the proportion needed to benefit (PNB), which is the cumulative number needed to treat across the range of the PGS, and show that typically this is halved in the 70th to 80th percentile. These findings emphasize how individuals experiencing adverse exposures stand to preferentially benefit from interventions that may reduce risk, and highlight the need for more comprehensive sampling across socioeconomic groups in the performance of genome-wide association studies.
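The idea of a PGS×E interaction as amplification of a genetic effect under an exposure can be sketched with a regression that includes a product term. The example below is a simplification under stated assumptions: it simulates a quantitative outcome and fits ordinary least squares, whereas disease outcomes such as those in the abstract would more naturally call for a logistic model. All variable names and effect sizes are hypothetical, and this is not the authors' model specification.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

pgs = rng.normal(size=n)                               # standardized polygenic score
exposure = rng.binomial(1, 0.5, size=n).astype(float)  # binary adverse exposure (0/1)

# Simulated outcome: the PGS slope is 0.2 when unexposed and 0.2 + 0.3 when exposed.
true_interaction = 0.3
outcome = (0.2 * pgs + 0.1 * exposure
           + true_interaction * pgs * exposure
           + rng.normal(size=n))

# Design matrix: intercept, PGS, exposure, and the PGS x exposure product term.
design = np.column_stack([np.ones(n), pgs, exposure, pgs * exposure])
coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)

print(f"PGS slope when unexposed: {coef[1]:.2f}")
print(f"Additional PGS slope when exposed (interaction): {coef[3]:.2f}")
```

A positive product-term coefficient means the PGS has a steeper relationship with the outcome in the exposed group, which is the amplification pattern the abstract describes.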
ABSTRACT
BACKGROUND AND AIMS: Crohn's disease is characterized by inflammation in the gastrointestinal tract due to a combination of genetic, immune, and environmental factors. Transcriptomic and epigenomic profiling of intestinal tissue from Crohn's disease patients has revealed valuable insights into pathology, but the two have not been conducted jointly on less invasively obtained peripheral blood mononuclear cells (PBMCs). Furthermore, the heterogeneous responses to treatments among individuals with Crohn's disease imply hidden diversity of pathological mechanisms. METHODS: We employed single nucleus multiomic analysis, integrating both snRNA-seq and snATAC-seq of PBMCs with a variety of open source bioinformatics applications. RESULTS: Our findings reveal a diverse range of transcriptional signatures among individuals, highlighting the heterogeneity in PBMC profiles. Nevertheless, striking concordance between three heterogeneous groups was observed across B cells and T cells. Differential gene regulatory mechanisms partially explain these profiles, notably including a signature involving TGFβ signaling in two individuals with Crohn's disease. A mutation mapped to a transcription factor binding site within a differentially accessible peak associated with the expression of this pathway, with implications for a personalized approach to understanding disease pathology. CONCLUSIONS: This study highlights how multiomic analysis can reveal common regulatory mechanisms that underlie heterogeneity of PBMC profiles, one of which may be specific to inflammatory disease.
ABSTRACT
BACKGROUND: Identification of rare variants involved in complex, polygenic diseases like Crohn's disease (CD) has accelerated with the introduction of whole exome/genome sequencing association studies. Rare variants can be used in both diagnostic and therapeutic assessments; however, since they are likely to be restricted to specific ancestry groups, their contributions to risk assessment need to be evaluated outside the discovery population. Prior studies implied that the three known rare variants in NOD2 are absent in West African and Asian populations and only contribute in African Americans via admixture. METHODS: Whole genome sequencing (WGS) data from 3418 African American individuals (1774 inflammatory bowel disease (IBD) cases and 1644 controls) were used to assess odds ratios and allele frequencies (AF), as well as haplotype-specific ancestral origins of European-derived CD variants discovered in a large exome-wide association study. Local and global ancestry analysis was performed to assess the contribution of admixture to IBD, contrasting European and African American cohorts. RESULTS: Twenty-five rare variants associated with CD in European discovery cohorts are typically five-fold lower in frequency in African Americans. Correspondingly, where comparisons could be made, the rare variants were found to have a predicted four-fold reduced burden for IBD in African Americans when compared to European individuals. Almost all of the rare CD European variants were found on European haplotypes in the African American cohort, implying that they contribute to disease risk in African Americans primarily due to recent admixture. In addition, the proportion of European ancestry correlates with the number of rare CD European variants each African American individual carries, as well as with their polygenic risk of disease. Similar findings were observed for 23 mutations affecting 10 other common complex diseases for which the rare variants were discovered in European cohorts. CONCLUSIONS: European-derived Crohn's disease rare variants are even more rare in African Americans and contribute to disease risk mainly due to admixture, which needs to be accounted for when performing cross-ancestry genetic assessments.
Subject(s)
Crohn Disease, Inflammatory Bowel Diseases, Humans, Crohn Disease/genetics, Genetic Predisposition to Disease, Inflammatory Bowel Diseases/genetics, Black or African American/genetics, Genome-Wide Association Study, Single Nucleotide Polymorphism, White
ABSTRACT
The membrane protein angiotensin-converting enzyme-2 (ACE2) has gained notoriety as the receptor for severe acute respiratory syndrome coronavirus 2. Prior evidence has shown ACE2 is expressed within the liver but its function has not been fully discerned. Here, we utilized novel methodology to assess ACE2 expression in pediatric immune-mediated liver disease to better understand its presence in liver diseases and its role during infections such as COVID-19. We stained liver tissue with ACE2-specific immunofluorescent antibodies and analyzed it via confocal microscopy. Computational deep learning-based segmentation models identified nuclei and cells, allowing the quantification of mean cellular and cytosolic immunofluorescence. Spatial transcriptomics provided high-throughput gene expression analysis in tissue to determine cellular composition for ACE2 expression. ACE2 plasma expression was quantified via enzyme-linked immunosorbent assay. High ACE2 expression was seen at the apical surface of cholangiocytes, with lower expression within hepatocyte cytosol and nonparenchymal cells (P < 0.001). Children with liver disease had higher ACE2 hepatic expression than pediatric control tissue (P < 0.001). Adult control tissue had higher expression than pediatric control tissue (P < 0.001). Plasma ACE2 was not found to be statistically different between samples. Spatial transcriptomics identified the cell composition of ACE2-expressing spots containing antibody-secreting cells. Our results show ACE2 expression throughout the liver, with strongest localization to cholangiocyte membranes. Machine learning can be used to rapidly identify hepatic cellular components for histologic analysis. ACE2 expression in the liver may be increased in pediatric liver disease. Future work is needed to better understand the role of ACE2 in chronic disease and acute infections.
Subject(s)
COVID-19, Liver Diseases, Humans, Child, Peptidyl-Dipeptidase A/genetics, Peptidyl-Dipeptidase A/metabolism, Angiotensins
ABSTRACT
Population health research is increasingly focused on the genetic determinants of healthy ageing, but there is no public resource of whole genome sequences and phenotype data from healthy elderly individuals. Here we describe the first release of the Medical Genome Reference Bank (MGRB), comprising whole genome sequence and phenotype of 2570 elderly Australians depleted for cancer, cardiovascular disease, and dementia. We analyse the MGRB for single-nucleotide, indel and structural variation in the nuclear and mitochondrial genomes. MGRB individuals have fewer disease-associated common and rare germline variants, relative to both cancer cases and the gnomAD and UK Biobank cohorts, consistent with risk depletion. Age-related somatic changes are correlated with grip strength in men, suggesting blood-derived whole genomes may also provide a biologic measure of age-related functional deterioration. The MGRB provides a broadly applicable reference cohort for clinical genetics and genomic association studies, and for understanding the genetics of healthy ageing.
Subject(s)
Genetic Databases, Genetic Variation, Human Genome, Aged, Aged 80 and over, Cohort Studies, Female, Gene Frequency, Genetic Predisposition to Disease, Healthy Volunteers, Humans, Male, Middle Aged, Mitochondria/genetics, Neoplasms/genetics, Physical Functional Performance, Single Nucleotide Polymorphism, Whole Genome Sequencing
ABSTRACT
The prevalence of the so-called diseases of affluence, such as type 2 diabetes or hypertension, has increased dramatically in the last two generations. Although genome-wide association studies (GWAS) have discovered hundreds of genes involved in disease etiology, the sudden increase in disease incidence suggests a major role for environmental risk factors. Obesity constitutes a case example of a modern trait shaped by the contemporary environment, albeit with considerable debate about the extent to which gene-by-environment (G×E) interactions accentuate obesity risk in individuals following obesogenic lifestyles. Although interaction effects have been robustly confirmed at the FTO locus, accumulating evidence at the genome-wide level implicates a role for polygenic risk-by-environment interactions. Through a variety of analyses using the UK Biobank, we confirm that the genomic background plays a major role in shaping the expressivity of alleles that increase body mass index (BMI).