ABSTRACT
Changes in transcriptional regulatory networks can significantly alter cell fate. To gain insight into transcriptional dynamics, several studies have profiled bulk multi-omic data sets with parallel transcriptomic and epigenomic measurements at different stages of a developmental process. However, integrating these data to infer cell type-specific regulatory networks is a major challenge. We present dynamic regulatory module networks (DRMNs), a novel approach to infer cell type-specific cis-regulatory networks and their dynamics. DRMN integrates expression, chromatin state, and accessibility to predict cis-regulators of context-specific expression, where context can be cell type, developmental stage, or time point, and uses multitask learning to capture network dynamics across linearly and hierarchically related contexts. We applied DRMNs to study regulatory network dynamics in three developmental processes, each showing different temporal relationships and measuring a different combination of regulatory genomic data sets: cellular reprogramming, liver dedifferentiation, and forward differentiation. DRMN identified known and novel regulators driving cell type-specific expression patterns, showing its broad applicability to examine dynamics of gene regulatory networks from linearly and hierarchically related multi-omic data sets.
Subjects
Gene Regulatory Networks, Genome, Chromatin/genetics, Genomics, Transcriptome
ABSTRACT
Recent advances in consortium-scale genome-wide association studies (GWAS) have highlighted the involvement of common genetic variants in autism spectrum disorder (ASD), but our understanding of their etiologic roles, especially the interplay with rare variants, is incomplete. In this work, we introduce an analytical framework to quantify the transmission disequilibrium of genetically regulated gene expression from parents to offspring. We applied this framework to conduct a transcriptome-wide association study (TWAS) on 7,805 ASD proband-parent trios, and replicated our findings using 35,740 independent samples. We identified 31 associations at the transcriptome-wide significance level. In particular, we identified POU3F2 (p = 2.1E-7), a transcription factor mainly expressed in developmental brain. Gene targets regulated by POU3F2 showed a 2.7-fold enrichment for known ASD genes (p = 2.0E-5) and a 2.7-fold enrichment for loss-of-function de novo mutations in ASD probands (p = 7.1E-5). These results provide a novel connection between rare and common variants, whereby ASD genes affected by very rare mutations are regulated by an unlinked transcription factor affected by common genetic variations.
Subjects
Autism Spectrum Disorder/genetics, Genetic Predisposition to Disease, Genome-Wide Association Study/methods, Hippocampus/metabolism, Homeodomain Proteins/genetics, POU Domain Factors/genetics, Transcriptome/genetics, Alleles, Genetic Databases, Gene Expression Profiling, Humans, Mutation, Single Nucleotide Polymorphism, Quantitative Trait Loci, Risk Factors, Spatio-Temporal Analysis
ABSTRACT
Dominantly inherited disorders are not typically considered to be therapeutic candidates for gene augmentation. Here, we utilized induced pluripotent stem cell-derived retinal pigment epithelium (iPSC-RPE) to test the potential of gene augmentation to treat Best disease, a dominant macular dystrophy caused by over 200 missense mutations in BEST1. Gene augmentation in iPSC-RPE fully restored BEST1 calcium-activated chloride channel activity and improved rhodopsin degradation in an iPSC-RPE model of recessive bestrophinopathy as well as in two models of dominant Best disease caused by different mutations in regions encoding ion-binding domains. A third dominant Best disease iPSC-RPE model did not respond to gene augmentation, but showed normalization of BEST1 channel activity following CRISPR-Cas9 editing of the mutant allele. We then subjected all three dominant Best disease iPSC-RPE models to gene editing, which produced premature stop codons specifically within the mutant BEST1 alleles. Single-cell profiling demonstrated no adverse perturbation of retinal pigment epithelium (RPE) transcriptional programs in any model, although off-target analysis detected a silent genomic alteration in one model. These results suggest that gene augmentation is a viable first-line approach for some individuals with dominant Best disease and that non-responders are candidates for alternate approaches such as gene editing. However, testing gene editing strategies for on-target efficiency and off-target events using personalized iPSC-RPE model systems is warranted. In summary, personalized iPSC-RPE models can be used to select among a growing list of gene therapy options to maximize safety and efficacy while minimizing time and cost. Similar scenarios likely exist for other genotypically diverse channelopathies, expanding the therapeutic landscape for affected individuals.
Subjects
Induced Pluripotent Stem Cells/physiology, Macular Degeneration/genetics, Mutation/genetics, Alleles, Bestrophins/genetics, Calcium/metabolism, Cell Line, Channelopathies/genetics, Eye Proteins/genetics, Gene Editing/methods, Genetic Therapy/methods, Genotype, HEK293 Cells, Humans, Retinal Pigment Epithelium/physiology
ABSTRACT
Long range regulatory interactions among distal enhancers and target genes are important for tissue-specific gene expression. Genome-scale identification of these interactions in a cell line-specific manner, especially using the fewest possible datasets, is a significant challenge. We develop a novel computational approach, Regulatory Interaction Prediction for Promoters and Long-range Enhancers (RIPPLE), that integrates published Chromosome Conformation Capture (3C) data sets with a minimal set of regulatory genomic data sets to predict enhancer-promoter interactions in a cell line-specific manner. Our results suggest that CTCF, RAD21, a general transcription factor (TBP) and activating chromatin marks are important determinants of enhancer-promoter interactions. To predict interactions in a new cell line and to generate genome-wide interaction maps, we develop an ensemble version of RIPPLE and apply it to generate interactions in five human cell lines. Computational validation of these predictions using existing ChIA-PET and Hi-C data sets showed that RIPPLE accurately predicts interactions among enhancers and promoters. Enhancer-promoter interactions tend to be organized into subnetworks representing coordinately regulated sets of genes that are enriched for specific biological processes and cis-regulatory elements. Overall, our work provides a systematic approach to predict and interpret enhancer-promoter interactions in a genome-wide cell-type specific manner using a few experimentally tractable measurements.
Subjects
Genetic Enhancer Elements, Genomics/methods, Genetic Models, Genetic Promoter Regions, Algorithms, CCCTC-Binding Factor, Cell Cycle Proteins/analysis, Cell Line, Chromatin/chemistry, Chromatin/metabolism, Non-Histone Chromosomal Proteins/analysis, Histone Code, Humans, Repressor Proteins/analysis, TATA-Box Binding Protein/analysis, Cohesins
ABSTRACT
Cell type-specific gene expression patterns are outputs of transcriptional gene regulatory networks (GRNs) that connect transcription factors and signaling proteins to target genes. Single-cell technologies such as single-cell RNA sequencing (scRNA-seq) and the single-cell Assay for Transposase-Accessible Chromatin using sequencing (scATAC-seq) can examine cell type-specific gene regulation in unprecedented detail. However, current approaches to infer cell type-specific GRNs are limited in their ability to integrate scRNA-seq and scATAC-seq measurements and to model network dynamics on a cell lineage. To address this challenge, we have developed single-cell Multi-Task Network Inference (scMTNI), a multi-task learning framework to infer the GRN for each cell type on a lineage from scRNA-seq and scATAC-seq data. Using simulated and real datasets, we show that scMTNI is a broadly applicable framework for linear and branching lineages that accurately infers GRN dynamics and identifies key regulators of fate transitions for diverse processes such as cellular reprogramming and differentiation.
Subjects
Gene Regulatory Networks, Transcription Factors, Cell Lineage/genetics, Transcription Factors/genetics, Transcription Factors/metabolism, Chromatin/genetics, Single-Cell Analysis
ABSTRACT
Transcriptional regulatory networks specify the regulatory proteins of target genes that control the context-specific expression levels of genes. With our ability to profile the different types of molecular components of cells under different conditions, we are now uniquely positioned to infer regulatory networks in diverse biological contexts such as different cell types, tissues, and time points. In this chapter, we cover two main classes of computational methods to integrate different types of information to infer genome-scale transcriptional regulatory networks. The first class of methods focuses on integrative methods for specifically inferring connections between transcription factors and target genes by combining gene expression data with regulatory edge-specific knowledge. The second class of methods integrates upstream signaling networks with transcriptional regulatory networks by combining gene expression data with protein-protein interaction networks and proteomic datasets. We conclude with a section on practical applications of a network inference algorithm to infer a genome-scale regulatory network.
Subjects
Computational Biology/methods, Gene Expression Regulation, Gene Regulatory Networks, Genetic Models, Algorithms, Computational Biology/instrumentation, Datasets as Topic, Gene Expression Profiling/instrumentation, Gene Expression Profiling/methods, Genome/genetics, Protein Interaction Maps/genetics, Proteomics/instrumentation, Proteomics/methods, Software, Transcription Factors/metabolism
ABSTRACT
Elucidating the mechanism of reprogramming is confounded by heterogeneity due to the low efficiency and differential kinetics of obtaining induced pluripotent stem cells (iPSCs) from somatic cells. Therefore, we increased the efficiency with a combination of epigenomic modifiers and signaling molecules and profiled the transcriptomes of individual reprogramming cells. Contrary to the established temporal order, somatic gene inactivation and upregulation of cell cycle, epithelial, and early pluripotency genes can be triggered independently such that any combination of these events can occur in single cells. Sustained co-expression of Epcam, Nanog, and Sox2 with other genes is required to progress toward iPSCs. Ehf, Phlda2, and translation initiation factor Eif4a1 play functional roles in robust iPSC generation. Using regulatory network analysis, we identify a critical role for signaling inhibition by 2i in repressing somatic expression and synergy between the epigenomic modifiers ascorbic acid and a Dot1L inhibitor for pluripotency gene activation.
Subjects
Cell Cycle Checkpoints, Cellular Reprogramming, Induced Pluripotent Stem Cells/cytology, Single-Cell Analysis, Animals, Cell Cycle Checkpoints/genetics, Cellular Reprogramming/genetics, Down-Regulation/genetics, Epigenomics, Epithelium/metabolism, Female, Fibroblasts/metabolism, Developmental Gene Expression Regulation, Induced Pluripotent Stem Cells/metabolism, Male, Mesoderm/cytology, Inbred C57BL Mice, Biological Models, Signal Transduction, Up-Regulation/genetics
ABSTRACT
Many human diseases, including cancer, result from perturbations to transcriptional regulatory networks that control context-specific expression of genes. Comparing multiple cancer types is a powerful way to illuminate the common and specific network features of this family of diseases. Recent efforts from The Cancer Genome Atlas (TCGA) have generated large collections of functional genomic data sets for multiple types of cancers. An emerging challenge is to devise computational approaches that systematically compare these genomic data sets across different cancer types to identify common and cancer-specific network components. We present a module- and network-based characterization of transcriptional patterns in six different cancers being studied in TCGA: breast, colon, rectal, kidney, ovarian, and endometrial. Our approach uses a recently developed regulatory network reconstruction algorithm, modular regulatory network learning with per gene information (MERLIN), within a stability selection framework to predict regulators for individual genes and gene modules. Our module-based analysis identifies a common theme of immune system processes in each cancer study, with modules statistically enriched for immune response processes as well as targets of key immune response regulators from the interferon regulatory factor (IRF) and signal transducer and activator of transcription (STAT) families. Comparison of the inferred regulatory networks from each cancer type identified a core regulatory network that included genes involved in chromatin remodeling, cell cycle, and immune response. Regulatory network hubs included genes with known roles in specific cancer types as well as genes with potentially novel roles in different cancer types.
Overall, our integrated module and network analysis recapitulated known themes in cancer biology and additionally revealed novel regulatory hubs that suggest a complex interplay of immune response, cell cycle, and chromatin remodeling across multiple cancers.