ABSTRACT
The functional engagement between an enhancer and its target promoter ensures precise gene transcription [1]. Understanding the basis of promoter choice by enhancers has important implications for health and disease. Here we report that functional loss of a preferred promoter can release its partner enhancer to loop to and activate an alternative promoter (or alternative promoters) in the neighbourhood. We refer to this target-switching process as 'enhancer release and retargeting'. Genetic deletion, motif perturbation or mutation, and dCas9-mediated CTCF tethering reveal that promoter choice by an enhancer can be determined by the binding of CTCF at promoters, in a cohesin-dependent manner, consistent with a model of 'enhancer scanning' inside the contact domain. Promoter-associated CTCF shows a lower affinity than that at chromatin domain boundaries and often lacks a preferred motif orientation or a partnering CTCF at the cognate enhancer, suggesting properties distinct from boundary CTCF. Analyses of cancer mutations, data from the GTEx project and risk loci from genome-wide association studies, together with a focused CRISPR interference screen, reveal that enhancer release and retargeting represents an overlooked mechanism that underlies the activation of disease-susceptibility genes, as exemplified by a risk locus for Parkinson's disease (NUCKS1-RAB7L1) and three loci associated with cancer (CLPTM1L-TERT, ZCCHC7-PAX5 and PVT1-MYC).
Subjects
CCCTC-Binding Factor/genetics , Enhancer Elements, Genetic , Genetic Predisposition to Disease , Promoter Regions, Genetic , CRISPR-Cas Systems , Cell Cycle Proteins/genetics , Cells, Cultured , Chromatin , Chromosomal Proteins, Non-Histone/genetics , Gene Deletion , Gene Expression Regulation, Neoplastic , Genome-Wide Association Study , Humans , MCF-7 Cells , Neoplasms/genetics , Neural Stem Cells , Oncogenes , Parkinson Disease/genetics , Cohesins
ABSTRACT
22q11.2 deletion is one of the strongest known genetic risk factors for schizophrenia. Recent whole-genome sequencing of schizophrenia cases and controls with this deletion provided an unprecedented opportunity to identify risk-modifying genetic variants and investigate their contribution to the pathogenesis of schizophrenia in 22q11.2 deletion syndrome. Here, we apply a novel analytic framework that integrates gene network and phenotype data to investigate the aggregate effects of rare coding variants and to identify modifier genes in this etiologically homogeneous cohort (223 schizophrenia cases and 233 controls of European descent). Our analyses revealed significant additive genetic components of rare nonsynonymous variants in 110 modifier genes (adjusted P = 9.4E-04) that overall accounted for 4.6% of the variance in schizophrenia status in this cohort, of which 4.0% was independent of the common polygenic risk for schizophrenia. The modifier genes affected by rare coding variants were enriched for genes involved in synaptic function and developmental disorders. Spatiotemporal transcriptomic analyses identified an enrichment of coexpression between modifier and 22q11.2 genes in cortical brain regions from late infancy to young adulthood. The corresponding gene coexpression modules are enriched for brain-specific protein-protein interactions of SLC25A1, COMT, and PI4KA in the 22q11.2 deletion region. Overall, our study highlights the contribution of rare coding variants to schizophrenia risk. These variants not only complement common variants in disease genetics but also pinpoint brain regions and developmental stages critical to the etiology of syndromic schizophrenia.
Subjects
DiGeorge Syndrome , Schizophrenia , Humans , Young Adult , Adult , Schizophrenia/genetics , DiGeorge Syndrome/genetics , Brain , Gene Expression Profiling , Whole Genome Sequencing
ABSTRACT
Enhancers are specialized genomic cis-regulatory elements that activate transcription of their target genes and play an important role in the pathogenesis of many complex human diseases. Despite their recent systematic identification in the human genome, there is an urgent need for comprehensive annotation databases of human enhancers that focus on their disease connections. In response, we built the Human Enhancer Disease Database (HEDD) to facilitate studies of enhancers and their potential roles in complex human diseases. HEDD currently provides comprehensive genomic information for ~2.8 million human enhancers identified by ENCODE, FANTOM5 and RoadMap, with disease association scores based on enhancer-gene and gene-disease connections. It also provides Web-based analytical tools to visualize enhancer networks and to score enhancers given a set of selected genes in a specific gene network. HEDD is freely accessible at http://zdzlab.einstein.yu.edu/1/hedd.php.
Subjects
Databases, Nucleic Acid , Enhancer Elements, Genetic , Chromosomes, Human, Pair 9/genetics , Disease/genetics , Gene Regulatory Networks , Genome, Human , Genome-Wide Association Study , Humans , Internet , Molecular Sequence Annotation , Multifactorial Inheritance , Polymorphism, Single Nucleotide
ABSTRACT
The highly polygenic nature of human longevity renders pleiotropy an indispensable feature of its genetic architecture. Leveraging the genetic correlation between aging-related traits (ARTs), we aimed to model the additive variance in lifespan as a function of the cumulative liability from pleiotropic segregating variants. We tracked allele frequency changes as a function of viability across different age bins and prioritized 34 variants with immediate implications for lipid metabolism, body mass index (BMI), and cognitive performance, among other traits, revealed by PheWAS analysis in the UK Biobank. Given the highly complex and non-linear interactions between the genetic determinants of longevity, we reasoned that a composite polygenic score would approximate a substantial portion of the variance in lifespan and developed the integrated longevity genetic scores (iLGSs) for distinguishing exceptional survival. We showed that coefficients derived from our ensemble model could potentially reveal an interesting pattern of genomic pleiotropy specific to lifespan. We assessed the predictive performance of our model for distinguishing the enrichment of exceptional longevity among long-lived individuals in two replication cohorts (the Scripps Wellderly cohort and the Medical Genome Reference Bank (MGRB)) and showed that the median lifespan in the highest decile of our composite prognostic index is up to 4.8 years longer. Finally, using the proteomic correlates of iLGS, we identified protein markers associated with exceptional longevity irrespective of chronological age and prioritized drugs with repurposing potential for gerotherapeutics. Together, our approach demonstrates a promising framework for polygenic modeling of additive liability conferred by ARTs in defining exceptional longevity and assisting the identification of individuals at a higher risk of mortality for targeted lifestyle modifications earlier in life. Furthermore, the proteomic signature associated with iLGS highlights the functional pathway upstream of the PI3K-Akt that can be effectively targeted to slow down aging and extend lifespan.
Subjects
Genetic Pleiotropy , Longevity , Multifactorial Inheritance , Humans , Longevity/genetics , Multifactorial Inheritance/genetics , Female , Male , Aging/genetics , Aged , Aged, 80 and over , Polymorphism, Single Nucleotide , Middle Aged , Genome-Wide Association Study , Gene Frequency
ABSTRACT
Tiling microarrays have proven to be a valuable tool for gaining insights into the transcriptomes of microbial organisms grown under various nutritional or stress conditions. Here, we describe the use of such an array, constructed at the level of 20 nt resolution for the Escherichia coli MG1655 genome, to observe genome-wide changes in the steady-state RNA levels in mutants defective in either RNase E or RNase III. The array data were validated by comparison to previously published results for a variety of specific transcripts as well as independent northern analysis of additional mRNAs and sRNAs. In the absence of RNase E, 60% of the annotated coding sequences showed either increases or decreases in their steady-state levels. In contrast, only 12% of the coding sequences were affected in the absence of RNase III. Unexpectedly, many coding sequences showed decreased abundance in the RNase E mutant, while more than half of the annotated sRNAs showed changes in abundance. Furthermore, the steady-state levels of many transcripts showed overlapping effects of both ribonucleases. Data are also presented demonstrating how the arrays were used to identify potential new genes, RNase III cleavage sites and the direct or indirect control of specific biological pathways.
Subjects
Endoribonucleases/metabolism , Escherichia coli/enzymology , Escherichia coli/genetics , Ribonuclease III/metabolism , Cysteine/biosynthesis , Endoribonucleases/genetics , Escherichia coli/metabolism , Gene Deletion , Gene Expression Profiling , Genes, Bacterial , Genome, Bacterial , Oligonucleotide Array Sequence Analysis , RNA, Messenger/metabolism , RNA, Small Untranslated/analysis , Ribonuclease III/genetics
ABSTRACT
The highly polygenic nature of human longevity renders cross-trait pleiotropy an indispensable feature of its genetic architecture. Leveraging the genetic correlation between the aging-related traits (ARTs), we sought to model the additive variance in lifespan as a function of cumulative liability from pleiotropic segregating variants. We tracked allele frequency changes as a function of viability across different age bins and prioritized 34 variants with an immediate implication on lipid metabolism, body mass index (BMI), and cognitive performance, among other traits, revealed by PheWAS analysis in the UK Biobank. Given the highly complex and non-linear interactions between the genetic determinants of longevity, we reasoned that a composite polygenic score would approximate a substantial portion of the variance in lifespan and developed the integrated longevity genetic scores (iLGSs) for distinguishing exceptional survival. We showed that coefficients derived from our ensemble model could potentially reveal an interesting pattern of genomic pleiotropy specific to lifespan. We assessed the predictive performance of our model for distinguishing the enrichment of exceptional longevity among long-lived individuals in two replication cohorts and showed that the median lifespan in the highest decile of our composite prognostic index is up to 4.8 years longer. Finally, using the proteomic correlates of iLGS, we identified protein markers associated with exceptional longevity irrespective of chronological age and prioritized drugs with repurposing potentials for gerotherapeutics. Together, our approach demonstrates a promising framework for polygenic modeling of additive liability conferred by ARTs in defining exceptional longevity and assisting the identification of individuals at higher risk of mortality for targeted lifestyle modifications earlier in life. 
Furthermore, the proteomic signature associated with iLGS highlights the functional pathway upstream of the PI3K-Akt that can be effectively targeted to slow down aging and extend lifespan.
ABSTRACT
Alzheimer's disease (AD) is a genetically complex, multifactorial neurodegenerative disease. It affects more than 45 million people worldwide and currently remains untreatable. Although genome-wide association studies (GWAS) have identified many AD-associated common variants, only about 25 genes are currently known to affect the risk of developing AD, despite its highly polygenic nature. Moreover, the risk variants underlying GWAS AD-association signals remain unknown. Here, we describe a deep post-GWAS analysis of AD-associated variants, using an integrated computational framework for predicting both disease genes and their risk variants. We identified 342 putative AD risk genes in 203 risk regions spanning 502 AD-associated common variants. Of these, 246 have not been identified as AD risk genes by previous GWAS collected in GWAS catalogs, and 115 of the 342 AD risk genes are outside the risk regions, likely under the regulation of transcriptional regulatory elements contained therein. Even more significantly, for 109 AD risk genes, we predicted 150 risk variants, of both coding and regulatory (in promoters or enhancers) types, and 85 (57%) of them are supported by functional annotation. In-depth functional analyses showed that AD risk genes were overrepresented in AD-related pathways or GO terms (e.g., the complement and coagulation cascade, and phosphorylation and activation of immune response), and their expression was relatively enriched in microglia, endothelia, and pericytes of the human brain. We found nine AD risk genes (e.g., IL1RAP, PMAIP1, LAMTOR4) that are predictors of AD survival, as well as genes such as ARL6IP5 that show altered network connectivity between AD patients and normal individuals and are involved in AD progression. Our findings open new strategies for developing therapeutics targeting AD risk genes or risk variants to influence AD pathogenesis.
Subjects
Alzheimer Disease/genetics , Brain/metabolism , Alzheimer Disease/metabolism , Gene Regulatory Networks , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Polymorphism, Single Nucleotide
ABSTRACT
The North American beaver is an exceptionally long-lived and cancer-resistant rodent species. Here, we report the evolutionary changes in its gene coding sequences, copy numbers, and expression. We identify changes that likely increase its ability to detoxify aldehydes, enhance tumor suppression and DNA repair, and alter lipid metabolism, potentially contributing to its longevity and cancer resistance. Hpgd, a tumor suppressor gene, is uniquely duplicated in beavers among rodents, and several genes associated with tumor suppression and longevity are under positive selection in beavers. Lipid metabolism genes show positive selection signals, changes in copy number, or altered gene expression in beavers. Aldh1a1, encoding an enzyme for aldehyde detoxification, is particularly notable due to its massive expansion in beavers, which enhances their cellular resistance to ethanol and their capacity to metabolize diverse aldehyde substrates from lipid oxidation and their woody diet. We hypothesize that the amplification of Aldh1a1 may contribute to the longevity of beavers.
Subjects
Aldehyde Dehydrogenase 1 Family/metabolism , Aldehydes/metabolism , Genes, Tumor Suppressor , Genome , Lipids/chemistry , Longevity , Aldehyde Dehydrogenase 1 Family/genetics , Animals , Female , Humans , Male , Mice , Phylogeny , Rodentia
ABSTRACT
Extreme longevity in humans has a strong genetic component, but whether this involves genetic variation in the same longevity pathways as found in model organisms is unclear. Using whole-exome sequences of a large cohort of Ashkenazi Jewish centenarians to examine enrichment for rare coding variants, we found that most longevity-associated rare coding variants converge upon the conserved insulin/insulin-like growth factor 1 signaling and AMP-activated protein kinase signaling pathways. Centenarians carry a number of pathogenic rare coding variants similar to that of control individuals, suggesting that rare variants detected in the conserved longevity pathways are protective against age-related pathology. Indeed, we detected a pro-longevity effect of rare coding variants in the Wnt signaling pathway on individuals harboring the known common risk allele APOE4. The genetic component of extreme human longevity constitutes, at least in part, rare coding variants in pathways that protect against aging, including those that control longevity in model organisms.
Subjects
Aging , Longevity , Aged, 80 and over , Humans , Longevity/genetics , Aging/genetics , Signal Transduction , Centenarians , Alleles
ABSTRACT
BAL1 is a transcription modulator that is overexpressed in chemoresistant, diffuse large B-cell lymphomas (DLBCLs). BAL1 complexes with a recently described DELTEX family member termed BBAP. Herein, we characterized BAL1 and BBAP expression in primary DLBCL subtypes defined by their comprehensive transcriptional profiles. BAL1 and BBAP were most abundant in lymphomas with a brisk host inflammatory response, designated host response (HR) tumors. Although these DLBCLs include significant numbers of tumor-infiltrating lymphocytes and interdigitating dendritic cells, BAL1 and BBAP were expressed primarily by malignant B cells, prompting speculation that the genes might be induced by host-derived inflammatory mediators such as gamma interferon (IFN-gamma). In fact, IFN-gamma induced BAL1 and BBAP expression in DLBCL cell lines; doxycycline-induced BAL1 also increased the expression of multiple IFN-stimulated genes, directly implicating BAL1 in an IFN signaling pathway. We show that BAL1 and BBAP are located on chromosome 3q21 in a head-to-head orientation and are regulated by an IFN-gamma-responsive bidirectional promoter. BBAP regulates the subcellular localization of BAL1 by a dynamic shuttling mechanism, highlighting the functional requirement for coordinated BBAP and BAL1 expression. IFN-gamma-induced BAL1/BBAP expression contributes to the molecular signature of HR DLBCLs and highlights the interplay between the inflammatory infiltrate and malignant B cells in these tumors.
Subjects
Interferon-gamma/metabolism , Lymphoma, B-Cell/genetics , Lymphoma, B-Cell/immunology , Lymphoma, Large B-Cell, Diffuse/genetics , Lymphoma, Large B-Cell, Diffuse/immunology , Neoplasm Proteins/genetics , Promoter Regions, Genetic , Ubiquitin-Protein Ligases/genetics , Base Sequence , Cell Line, Tumor , Chromosomes, Human, Pair 3/genetics , Gene Expression Regulation, Neoplastic/drug effects , Humans , Interferon-gamma/pharmacology , Janus Kinase 2 , Lymphoma, B-Cell/pathology , Lymphoma, Large B-Cell, Diffuse/pathology , Mutation , Poly(ADP-ribose) Polymerases , Protein-Tyrosine Kinases/metabolism , Proto-Oncogene Proteins/metabolism , RNA Interference , RNA, Small Interfering/genetics , Recombinant Proteins , Subcellular Fractions/metabolism , Ubiquitin-Protein Ligases/antagonists & inhibitors
ABSTRACT
In this work, we integrate a non-linear signal analysis method, recurrence quantification analysis (RQA), with a well-known machine-learning algorithm, support vector machines (SVM), for the binary classification of protein sequences. Two classification problems were selected: discriminating between aggregating and non-aggregating proteins, and between mostly disordered and completely ordered proteins. We also show that the classification performance of SVM models improves when the most informative RQA descriptors are selected as SVM input features.
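As an illustration of the kind of descriptor RQA can supply to a classifier, the following is a minimal sketch, not the authors' pipeline. It assumes the protein sequence is first converted to a numeric series with the Kyte-Doolittle hydropathy scale, then embedded with unit delay and compared with a Chebyshev distance; the abstract does not state the encoding, embedding or radius used, and the function name `rqa_features` is hypothetical. It computes two standard RQA descriptors: recurrence rate and determinism.

```python
# Kyte-Doolittle hydropathy values; using this scale as the numeric
# encoding of residues is an assumption made for illustration only.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def embed(series, dim):
    """Delay-embed a series into overlapping vectors of length dim (delay 1)."""
    return [tuple(series[i:i + dim]) for i in range(len(series) - dim + 1)]


def rqa_features(seq, dim=3, radius=1.0, lmin=2):
    """Return (recurrence_rate, determinism) for a protein sequence.

    Two embedded vectors are recurrent when their Chebyshev distance is
    at most radius. Only the upper triangle of the (symmetric) recurrence
    matrix is scanned, and the main diagonal is excluded.
    """
    vectors = embed([KD[aa] for aa in seq], dim)
    n = len(vectors)

    def recurrent(i, j):
        return max(abs(a - b) for a, b in zip(vectors[i], vectors[j])) <= radius

    total = 0      # recurrent points above the main diagonal
    in_lines = 0   # recurrent points lying on diagonal lines of length >= lmin
    for offset in range(1, n):
        run = 0
        for i in range(n - offset):
            if recurrent(i, i + offset):
                total += 1
                run += 1
            else:
                if run >= lmin:
                    in_lines += run
                run = 0
        if run >= lmin:  # close a line that reaches the end of the diagonal
            in_lines += run

    pairs = n * (n - 1) // 2
    recurrence_rate = total / pairs if pairs else 0.0
    determinism = in_lines / total if total else 0.0
    return recurrence_rate, determinism


if __name__ == "__main__":
    # A strictly periodic toy sequence yields long diagonal lines.
    print(rqa_features("AKAKAKAKAK", dim=2, radius=0.5))
```

Descriptors such as these, computed per sequence, would form the feature vector handed to an SVM; the feature-selection step mentioned in the abstract would then choose which descriptors to keep.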