ABSTRACT
The majority of genetic variants detected in genome-wide association studies (GWAS) exert their effects on phenotypes through gene regulation. Motivated by this observation, we propose a multi-omic integration method that models the cascading effects of genetic variants from epigenome to transcriptome and eventually to the phenome in identifying target genes influenced by risk alleles. This cascading epigenomic analysis for GWAS, which we refer to as CEWAS, comprises two types of models: one for linking cis genetic effects to epigenomic variation and another for linking cis epigenomic variation to gene expression. Applying these models in cascade to GWAS summary statistics generates gene-level statistics that reflect genetically driven epigenomic effects. We show on sixteen brain-related GWAS that CEWAS provides a higher gene detection rate than related methods, and finds disease-relevant genes and gene sets that point toward less explored biological processes. CEWAS thus presents a novel means for exploring the regulatory landscape of GWAS variants in uncovering disease mechanisms.
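The cascade described in this abstract can be pictured as a composition of linear weights: SNP-to-CpG weights from the first model are combined with CpG-to-gene weights from the second, and the resulting SNP-to-gene weights are applied to GWAS z-scores. The sketch below is only an illustration under that assumption, using the generic summary-statistic form z_gene = wᵀz / sqrt(wᵀRw); it is not the CEWAS implementation, and the function names, weights, and LD matrix R are all hypothetical.

```python
import math

def cascade_weights(snp_to_cpg, cpg_to_gene):
    """Compose SNP->CpG weights (n_snps x n_cpgs) with CpG->gene weights (n_cpgs).

    Assumption: both models are linear, so the cascade is a matrix-vector product.
    """
    return [
        sum(row[c] * cpg_to_gene[c] for c in range(len(cpg_to_gene)))
        for row in snp_to_cpg
    ]

def gene_z(weights, gwas_z, ld):
    """Gene-level z-score from SNP weights, GWAS z-scores, and an LD matrix.

    Assumption: genotypes are standardized, so ld is a correlation matrix.
    """
    n = len(weights)
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    if variance <= 0:
        raise ValueError("weights have zero variance under the given LD matrix")
    return numerator / math.sqrt(variance)

# Hypothetical toy inputs: 2 SNPs, 2 CpG sites, 1 gene, no LD between the SNPs.
weights = cascade_weights([[0.5, 0.0], [0.0, 1.0]], [2.0, 1.0])
z_gene = gene_z(weights, [3.0, 1.0], [[1.0, 0.0], [0.0, 1.0]])
```

In practice the weights would come from models fitted on reference epigenomic and expression panels, and the LD matrix from a matched reference population.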
Subjects
Inborn Genetic Diseases/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Quantitative Trait Loci/genetics , Alleles , Epigenome/genetics , Inborn Genetic Diseases/pathology , Humans , Phenotype , Single Nucleotide Polymorphism/genetics , Transcriptome/genetics
ABSTRACT
Huntington disease (HD) is a neurodegenerative disorder that is caused by a CAG repeat expansion in HTT. The length of this repeat, however, only explains a proportion of the variability in age of onset in patients. Genome-wide association studies have identified modifiers that contribute toward a proportion of the observed variance. By incorporating tissue-specific transcriptomic information with these results, additional modifiers can be identified. We performed a transcriptome-wide association study assessing heritable differences in genetically determined expression in diverse tissues, with genome-wide data from over 4000 patients. Functional validation of prioritized genes was undertaken in isogenic HD stem cells and patient brains. Enrichment analyses were performed with biologically relevant gene sets to identify the core pathways. HD-associated gene coexpression modules were assessed for associations with neurological phenotypes in an independent cohort and to guide drug repurposing analyses. Transcriptomic analyses identified genes that were associated with age of HD onset and displayed colocalization with gene expression signals in brain tissue (FAN1, GPR161, PMS2, SUMF2), with supporting evidence from functional experiments. This included genes involved in DNA repair, as well as novel-candidate modifier genes that have been associated with other neurological conditions. Further, cortical coexpression modules were also associated with cognitive decline and HD-related traits in a longitudinal cohort. In summary, the combination of population-scale gene expression information with HD patient genomic data identified novel modifier genes for the disorder. Further, these analyses expanded the pathways potentially involved in modifying HD onset and prioritized candidate therapeutics for future study.
Subjects
Genome-Wide Association Study , Huntingtin Protein/genetics , Huntington Disease/genetics , Transcriptome/genetics , Adult , Age of Onset , Aged , DNA Repair/genetics , Endodeoxyribonucleases/genetics , Exodeoxyribonucleases/genetics , Female , Gene Expression Regulation/genetics , Genome/genetics , Genomics , Humans , Huntington Disease/epidemiology , Huntington Disease/pathology , Male , Middle Aged , Mismatch Repair Endonuclease PMS2/genetics , Multifunctional Enzymes/genetics , Organ Specificity/genetics , Single Nucleotide Polymorphism/genetics , G-Protein-Coupled Receptors/genetics , Sulfatases/genetics , Trinucleotide Repeat Expansion/genetics
ABSTRACT
Deciphering the environmental contexts at which genetic effects are most prominent is central for making full use of GWAS results in follow-up experiment design and treatment development. However, measuring a large number of environmental factors at high granularity might not always be feasible. Instead, here we propose extracting cellular embedding of environmental factors from gene expression data by using latent variable (LV) analysis and taking these LVs as environmental proxies in detecting gene-by-environment (GxE) interaction effects on gene expression, i.e., GxE expression quantitative trait loci (eQTLs). Applying this approach to the two largest brain eQTL datasets (n = 1,100), we show that LVs and GxE eQTLs in one dataset replicate well in the other dataset. Combining the two samples via meta-analysis, 895 GxE eQTLs are identified. On average, GxE effect explains an additional ~4% variation in expression of each gene that displays a GxE effect. Ten of these 52 genes are associated with cell-type-specific eQTLs, and the remaining genes are multi-functional. Furthermore, after substituting LVs with expression of transcription factors (TF), we found 91 TF-specific eQTLs, which demonstrates an important use of our brain GxE eQTLs.
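A GxE eQTL test of the kind outlined in this abstract is commonly written as a linear model with an interaction term: expression = b0 + b1·genotype + b2·LV + b3·(genotype × LV), where b3 captures the GxE effect. The following is a minimal sketch of fitting that model by ordinary least squares, assuming a single SNP, a single latent variable, and no other covariates. The function names and data layout are hypothetical, and the abstract does not specify the actual model, covariates, or LV extraction method used in the study.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_gxe(expr, geno, lv):
    """Return [intercept, genotype, LV, genotype x LV] OLS coefficients."""
    rows = [[1.0, float(g), float(e), float(g) * float(e)] for g, e in zip(geno, lv)]
    k = 4
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, expr)) for i in range(k)]
    return solve_linear(xtx, xty)
```

The fourth coefficient is the interaction effect; a real analysis would also need its standard error and a multiple-testing correction across SNP-gene-LV triplets.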
Subjects
Brain/metabolism , Genotype , Transcriptome , Humans , Quantitative Trait Loci
ABSTRACT
Complexity of cell-type composition has created much skepticism surrounding the interpretation of bulk tissue transcriptomic studies. Recent studies have shown that deconvolution algorithms can be applied to computationally estimate cell-type proportions from gene expression data of bulk blood samples, but their performance when applied to brain tissue is unclear. Here, we have generated an immunohistochemistry (IHC) dataset for five major cell-types from brain tissue of 70 individuals, who also have bulk cortical gene expression data. With the IHC data as the benchmark, this resource enables quantitative assessment of deconvolution algorithms for brain tissue. We apply existing deconvolution algorithms to brain tissue by using marker sets derived from human brain single cell and cell-sorted RNA-seq data. We show that these algorithms can indeed produce informative estimates of constituent cell-type proportions. In fact, neuronal subpopulations can also be estimated from bulk brain tissue samples. Further, we show that including the cell-type proportion estimates as confounding factors is important for reducing false associations between Alzheimer's disease phenotypes and gene expression. Lastly, we demonstrate that using more accurate marker sets can substantially improve statistical power in detecting cell-type specific expression quantitative trait loci (eQTLs).
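Many deconvolution approaches share one core idea: bulk expression is modeled as a mixture of cell-type signature profiles, and the mixing proportions are estimated under a non-negativity constraint. The sketch below illustrates that idea with a simple projected-gradient non-negative least squares fit. It is a generic formulation, not one of the specific algorithms evaluated in the study, and the function name, signature matrix layout, and iteration count are assumptions for illustration.

```python
def estimate_proportions(signature, bulk, n_iter=5000):
    """Estimate cell-type proportions from one bulk expression profile.

    signature: genes x cell-types matrix of marker expression (list of lists).
    bulk: expression of the same genes in the bulk sample.
    Minimises ||signature @ p - bulk||^2 subject to p >= 0, then rescales
    p so the proportions sum to 1.
    """
    n_genes, n_types = len(signature), len(signature[0])
    # 1 / squared Frobenius norm is a conservative step size that guarantees
    # convergence of projected gradient descent for this quadratic objective.
    step = 1.0 / sum(v * v for row in signature for v in row)
    p = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        resid = [
            sum(signature[g][k] * p[k] for k in range(n_types)) - bulk[g]
            for g in range(n_genes)
        ]
        grad = [
            sum(signature[g][k] * resid[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        p = [max(0.0, p[k] - step * grad[k]) for k in range(n_types)]
    total = sum(p)
    return [v / total for v in p] if total > 0 else p
```

With an independent benchmark such as the IHC measurements described in this abstract, estimates of this kind could be compared against measured proportions, for example by per-cell-type correlation.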
Subjects
Algorithms , Brain , Gene Expression Profiling/methods , RNA Sequence Analysis/methods , Transcriptome/genetics , Brain/cytology , Brain/metabolism , Computational Biology , Humans , Immunohistochemistry , Organ Specificity/genetics , Phenotype , Quantitative Trait Loci/genetics , Single-Cell Analysis
ABSTRACT
Molecular quantitative trait loci (QTLs) allow us to understand the biology captured in genome-wide association studies (GWASs). The placenta regulates fetal development and shows sex differences in DNA methylation. We therefore hypothesized that placental methylation QTL (mQTL) explain variation in genetic risk for childhood onset traits, and that effects differ by sex. We analyzed 411 term placentas from two studies and found 49,252 methylation (CpG) sites with mQTL and 2,489 CpG sites with sex-dependent mQTL. All mQTL were enriched in regions that typically affect gene expression in prenatal tissues. All mQTL were also enriched in GWAS results for growth- and immune-related traits, but male- and female-specific mQTL were more enriched than cross-sex mQTL. mQTL colocalized with trait loci at 777 CpG sites, with 216 (28%) specific to males or females. Overall, mQTL specific to male and female placenta capture otherwise overlooked variation in childhood traits.
ABSTRACT
Introduction: Age-related macular degeneration (AMD) is the leading cause of central vision loss in the elderly. One-third of the genetic contribution to this disease remains unexplained. Methods: We analyzed targeted sequencing data from two independent cohorts (4,245 cases, 1,668 controls) which included genomic regions of known AMD loci in 49 genes. Results: At a false discovery rate of <0.01, we identified 11 low-frequency AMD variants (minor allele frequency <0.05). Two of those variants were present in the complement C4A gene, including the replacement of the residues that contribute to the Rodgers-1/Chido-1 blood group antigens: [VDLL1207-1210ADLR (V1207A)] with discovery odds ratio (OR) = 1.7 (p = 3.2 × 10⁻⁵) which was replicated in the UK Biobank dataset (3,294 cases, 200,086 controls, OR = 1.52, p = 0.037). A novel variant associated with reduced risk for AMD in our discovery cohort was P1120T, one of the four C4A-isotypic residues. Gene-based tests yielded aggregate effects of nonsynonymous variants in 10 genes including C4A, which were associated with increased risk of AMD. In human eye tissues, immunostaining demonstrated C4A protein accumulation in and around endothelial cells of retinal and choroidal vasculature, and total C4 in soft drusen. Conclusion: Our results indicate that C4A protein in the complement activation pathways may play a role in the pathogenesis of AMD.
ABSTRACT
Individual reactions to traumatic stress vary dramatically, yet the biological basis of this variation remains poorly understood. Recent studies demonstrate the surprising plasticity of oligodendrocytes and myelin with stress and experience, providing a potential mechanism by which trauma induces aberrant structural and functional changes in the adult brain. In this study, we utilized a translational approach to test the hypothesis that gray matter oligodendrocytes contribute to traumatic-stress-induced behavioral variation in both rats and humans. We exposed adult, male rats to a single, severe stressor and used a multimodal approach to characterize avoidance, startle, and fear-learning behavior, as well as oligodendrocyte and myelin basic protein (MBP) content in multiple brain areas. We found that oligodendrocyte cell density and MBP were correlated with behavioral outcomes in a region-specific manner. Specifically, stress-induced avoidance positively correlated with hippocampal dentate gyrus oligodendrocytes and MBP. Viral overexpression of the oligodendrogenic factor Olig1 in the dentate gyrus was sufficient to induce an anxiety-like behavioral phenotype. In contrast, contextual fear learning positively correlated with MBP in the amygdala and spatial-processing regions of the hippocampus. In a group of trauma-exposed US veterans, T1-/T2-weighted magnetic resonance imaging estimates of hippocampal and amygdala myelin associated with symptom profiles in a region-specific manner that mirrored the findings in rats. These results demonstrate a species-independent relationship between region-specific, gray matter oligodendrocytes and differential behavioral phenotypes following traumatic stress exposure. This study suggests a novel mechanism for brain plasticity that underlies individual variance in sensitivity to traumatic stress.
Subjects
Gray Matter , Myelin Sheath , Amygdala/metabolism , Animals , Gray Matter/diagnostic imaging , Gray Matter/metabolism , Hippocampus/metabolism , Humans , Male , Myelin Basic Protein/metabolism , Myelin Sheath/metabolism , Oligodendroglia/metabolism , Rats
ABSTRACT
The vast bacteriophage population harbors an immense reservoir of genetic information. Almost 2000 phage genomes have been sequenced from phages infecting hosts in the phylum Actinobacteria, and analysis of these genomes reveals substantial diversity, pervasive mosaicism, and novel mechanisms for phage replication and lysogeny. Here, we describe the isolation and genomic characterization of 46 phages from environmental samples at various geographic locations in the U.S. infecting a single Arthrobacter sp. strain. These phages include representatives of all three virion morphologies, and Jasmine is the first sequenced podovirus of an actinobacterial host. The phages also span considerable sequence diversity, and can be grouped into 10 clusters according to their nucleotide diversity, and two singletons each with no close relatives. However, the clusters/singletons appear to be genomically well separated from each other, and relatively few genes are shared between clusters. Genome size varies from among the smallest of siphoviral phages (15,319 bp) to over 70 kbp, and G+C contents range from 45-68%, compared to 63.4% for the host genome. Although temperate phages are common among other actinobacterial hosts, these Arthrobacter phages are primarily lytic, and only the singleton Galaxy is likely temperate.
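The G+C content figures quoted in this abstract are the percentage of G and C bases among the called bases of a genome. A minimal sketch of that calculation follows; the function name is hypothetical, and excluding ambiguous bases such as N from the denominator is an assumption rather than a convention stated in the abstract.

```python
def gc_percent(seq):
    """Return G+C content as a percentage of unambiguous (A/C/G/T) bases."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / called
```

Applied to a full genome sequence read from a FASTA file, this gives a single genome-wide value that can be compared between phage and host.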