ABSTRACT
Gastric cancer (GC) is the fifth most common cancer worldwide and is a heterogeneous disease. Among GC subtypes, the mesenchymal phenotype (Mes-like) is more invasive than the epithelial phenotype (Epi-like). Although gene expression of the epithelial-to-mesenchymal transition (EMT) has been studied, the regulatory landscape shaping this process is not fully understood. Here we use ATAC-seq and RNA-seq data from a compendium of GC cell lines and primary tumors to detect drivers of regulatory state changes and their transcriptional responses. Using the ATAC-seq data, we developed a machine learning approach to determine the transcription factors (TFs) regulating the subtypes of GC. We identified TFs driving the mesenchymal (RUNX2, ZEB1, SNAI2, AP-1 dimer) and the epithelial (GATA4, GATA6, KLF5, HNF4A, FOXA2, GRHL2) states in GC. We identified DNA copy number alterations associated with dysregulation of these TFs, specifically deletion of GATA4 and amplification of MAPK9. Comparisons with bulk and single-cell RNA-seq data sets identified activation toward fibroblast-like epigenomic and expression signatures in Mes-like GC. The activation of this mesenchymal fibrotic program is associated with differentially accessible DNA cis-regulatory elements flanking upregulated mesenchymal genes. These findings establish a map of TF activity in GC and highlight the role of copy-number-driven alterations in shaping epigenomic regulatory programs as potential drivers of GC heterogeneity and progression.
Subjects
Epithelial-Mesenchymal Transition; Gene Expression Regulation, Neoplastic; Machine Learning; Stomach Neoplasms; Humans; Stomach Neoplasms/genetics; Stomach Neoplasms/pathology; Stomach Neoplasms/metabolism; Epithelial-Mesenchymal Transition/genetics; Transcription Factor AP-1/metabolism; Transcription Factor AP-1/genetics; Cell Line, Tumor; Fibrosis/genetics; Core Binding Factor Alpha 1 Subunit/genetics; Core Binding Factor Alpha 1 Subunit/metabolism; DNA Copy Number Variations; Core Binding Factor Alpha 2 Subunit
ABSTRACT
OBJECTIVE: Gastric cancer (GC) comprises multiple molecular subtypes. Recent studies have highlighted mesenchymal-subtype GC (Mes-GC) as a clinically aggressive subtype with few treatment options. Combining multiple studies, we derived and applied a consensus Mes-GC classifier to define the Mes-GC enhancer landscape revealing disease vulnerabilities. DESIGN: Transcriptomic profiles of ~1000 primary GCs and cell lines were analysed to derive a consensus Mes-GC classifier. Clinical and genomic associations were performed across >1200 patients with GC. Genome-wide epigenomic profiles (H3K27ac, H3K4me1 and assay for transposase-accessible chromatin with sequencing (ATAC-seq)) of 49 primary GCs and GC cell lines were generated to identify Mes-GC-specific enhancer landscapes. Upstream regulators and downstream targets of Mes-GC enhancers were interrogated using chromatin immunoprecipitation followed by sequencing (ChIP-seq), RNA sequencing, CRISPR/Cas9 editing, functional assays and pharmacological inhibition. RESULTS: We identified and validated a 993-gene cancer-cell intrinsic Mes-GC classifier applicable to retrospective cohorts or prospective single samples. Multicohort analysis of Mes-GCs confirmed associations with poor patient survival, therapy resistance and few targetable genomic alterations. Analysis of enhancer profiles revealed a distinctive Mes-GC epigenomic landscape, with TEAD1 as a master regulator of Mes-GC enhancers and Mes-GCs exhibiting preferential sensitivity to TEAD1 pharmacological inhibition. Analysis of Mes-GC super-enhancers also highlighted NUAK1 kinase as a downstream target, with synergistic effects observed between NUAK1 inhibition and cisplatin treatment. CONCLUSION: Our results establish a consensus Mes-GC classifier applicable to multiple transcriptomic scenarios. Mes-GCs exhibit a distinct epigenomic landscape, and TEAD1 inhibition and combinatorial NUAK1 inhibition/cisplatin may represent potential targetable options.
Subjects
Enhancer Elements, Genetic; Epigenesis, Genetic; Gene Expression Regulation, Neoplastic; Stomach Neoplasms; Humans; Cisplatin/metabolism; Cisplatin/therapeutic use; Prospective Studies; Protein Kinases/genetics; Repressor Proteins; Retrospective Studies; Stomach Neoplasms/genetics
ABSTRACT
Spatiotemporal control of gene expression during development requires orchestrated activities of numerous enhancers, which are cis-regulatory DNA sequences that, when bound by transcription factors, support selective activation or repression of associated genes. Proper activation of enhancers is critical during embryonic development, adult tissue homeostasis, and regeneration, and inappropriate enhancer activity is often associated with pathological conditions such as cancer. Multiple consortia [e.g., the Encyclopedia of DNA Elements (ENCODE) Consortium and National Institutes of Health Roadmap Epigenomics Mapping Consortium] and independent investigators have mapped putative regulatory regions in a large number of cell types and tissues, but the sequence determinants of cell-specific enhancers are not yet fully understood. Machine learning approaches trained on large sets of these regulatory regions can identify core transcription factor binding sites and generate quantitative predictions of enhancer activity and the impact of sequence variants on activity. Here, we review these computational methods in the context of enhancer prediction and gene regulatory network models specifying cell fate.
Subjects
Computational Biology/methods; Enhancer Elements, Genetic; Gene Regulatory Networks; Genome, Human; Humans
ABSTRACT
The integrative analysis of high-throughput reporter assays, machine learning, and profiles of epigenomic chromatin state in a broad array of cells and tissues has the potential to significantly improve our understanding of noncoding regulatory element function and its contribution to human disease. Here, we report results from the CAGI 5 regulation saturation challenge where participants were asked to predict the impact of nucleotide substitution at every base pair within five disease-associated human enhancers and nine disease-associated promoters. A library of mutations covering all bases was generated by saturation mutagenesis and altered activity was assessed in a massively parallel reporter assay (MPRA) in relevant cell lines. Reporter expression was measured relative to plasmid DNA to determine the impact of variants. The challenge was to predict the functional effects of variants on reporter expression. Comparative analysis of the full range of submitted prediction results identifies the most successful models of transcription factor binding sites, machine learning algorithms, and ways to choose among or incorporate diverse datatypes and cell-types for training computational models. These results have the potential to improve the design of future studies on more diverse sets of regulatory elements and aid the interpretation of disease-associated genetic variation.
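The abstract states that reporter expression was measured relative to plasmid DNA. One common way to express such a measurement is a log2 ratio of RNA to DNA read counts, with the variant effect taken as the difference between the alternate and reference alleles. The sketch below assumes that simple ratio form; the function names, the pseudocount, and the example counts are illustrative assumptions and are not taken from the challenge, whose actual quantification procedure is not described here.

```python
import math


def log2_activity(rna_count, dna_count, pseudocount=1.0):
    """Log2 ratio of reporter RNA reads to plasmid DNA reads for one construct.

    The pseudocount (an assumption of this sketch) avoids division by zero
    and taking the log of zero for constructs with no reads.
    """
    return math.log2((rna_count + pseudocount) / (dna_count + pseudocount))


def variant_effect(ref_counts, alt_counts, pseudocount=1.0):
    """Difference in log2 activity between the alternate and reference allele.

    Each argument is a (rna_count, dna_count) tuple. A negative value means
    the substitution lowered reporter expression in this toy measure.
    """
    ref_rna, ref_dna = ref_counts
    alt_rna, alt_dna = alt_counts
    return (log2_activity(alt_rna, alt_dna, pseudocount)
            - log2_activity(ref_rna, ref_dna, pseudocount))


if __name__ == "__main__":
    # Hypothetical counts, chosen only to make the arithmetic easy to follow.
    print(variant_effect(ref_counts=(7, 3), alt_counts=(3, 3)))
```

Normalizing by plasmid DNA counts corrects for uneven representation of constructs in the library, so that a variant is not scored as more active merely because more copies of it were transfected.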
Subjects
DNA/chemistry; Epigenomics/methods; Point Mutation; Binding Sites; Cell Line; Chromatin/genetics; DNA/metabolism; Enhancer Elements, Genetic; Genetic Predisposition to Disease; Humans; Machine Learning; Promoter Regions, Genetic; Transcription Factors/metabolism
ABSTRACT
Comprehensive enhancer discovery is challenging because most enhancers, especially those affected in complex diseases, have weak effects on gene expression. Our network modeling revealed that nonlinear enhancer-gene regulation during cell state transitions can be leveraged to improve the sensitivity of enhancer discovery. Utilizing hESC definitive endoderm differentiation as a dynamic transition system, we conducted a mid-transition CRISPRi-based enhancer screen. The screen discovered a comprehensive set of enhancers (4 to 9 per locus) for each of the core endoderm lineage-specifying transcription factors, and many enhancers had strong effects mid-transition but weak effects post-transition. Through integrating enhancer activity measurements and three-dimensional enhancer-promoter interaction information, we were able to develop a CTCF loop-constrained Interaction Activity (CIA) model that can better predict functional enhancers compared to models that rely on Hi-C-based enhancer-promoter contact frequency. Our study provides generalizable strategies for sensitive and more comprehensive enhancer discovery in both normal and pathological cell state transitions.
ABSTRACT
Comprehensive enhancer discovery is challenging because most enhancers, especially those contributing to complex diseases, have weak effects on gene expression. Our gene regulatory network modeling identified that nonlinear enhancer gene regulation during cell state transitions can be leveraged to improve the sensitivity of enhancer discovery. Using human embryonic stem cell definitive endoderm differentiation as a dynamic transition system, we conducted a mid-transition CRISPRi-based enhancer screen. We discovered a comprehensive set of enhancers for each of the core endoderm-specifying transcription factors. Many enhancers had strong effects mid-transition but weak effects post-transition, consistent with the nonlinear temporal responses to enhancer perturbation predicted by the modeling. Integrating three-dimensional genomic information, we were able to develop a CTCF-loop-constrained Interaction Activity model that can better predict functional enhancers compared to models that rely on Hi-C-based enhancer-promoter contact frequency. Our study provides generalizable strategies for sensitive and systematic enhancer discovery in both normal and pathological cell state transitions.
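The claim that an enhancer can have a strong effect mid-transition but a weak effect post-transition can be illustrated with a toy saturating response. The sketch below is not the authors' gene regulatory network model: it assumes, purely for illustration, that gene expression follows a Hill function of the summed enhancer input, and the parameters `k`, `n`, the input levels, and the 20% knockdown are all hypothetical.

```python
def hill(total_input, k=1.0, n=4):
    """Sigmoidal (Hill) response of gene expression to summed enhancer input."""
    return total_input ** n / (k ** n + total_input ** n)


def knockdown_effect(total_input, fraction_removed, k=1.0, n=4):
    """Drop in expression when a fraction of the enhancer input is removed."""
    before = hill(total_input, k, n)
    after = hill(total_input * (1.0 - fraction_removed), k, n)
    return before - after


if __name__ == "__main__":
    # Mid-transition: input sits near the threshold k, on the steep part of the curve.
    mid = knockdown_effect(total_input=1.0, fraction_removed=0.2)
    # Post-transition: input is well above k, where the response has saturated.
    post = knockdown_effect(total_input=3.0, fraction_removed=0.2)
    print(f"mid-transition drop:  {mid:.3f}")
    print(f"post-transition drop: {post:.3f}")
```

In this toy setting the same 20% loss of enhancer input produces a much larger drop in expression near the threshold than after saturation, which is the kind of nonlinearity a mid-transition screen would exploit to detect weak enhancers.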