ABSTRACT
Signaling pathways drive cell fate transitions largely by changing gene expression. However, the mechanisms for rapid and selective transcriptome rewiring in response to signaling cues remain elusive. Here we use deep learning to deconvolve both the sequence determinants and the trans-acting regulators that trigger extracellular signal-regulated kinase (ERK)-mitogen-activated protein kinase kinase (MEK)-induced decay of the naive pluripotency mRNAs. Timing of decay is coupled to embryo implantation through ERK-MEK phosphorylation of LIN28A, which repositions pLIN28A to the highly A+U-rich 3' untranslated region (3'UTR) termini of naive pluripotency mRNAs. Interestingly, these A+U-rich 3'UTR termini serve as poly(A)-binding protein (PABP)-binding hubs, poised for signal-induced convergence with LIN28A. The multivalency of AUU motifs determines the efficacy of pLIN28A-PABP convergence, which enhances PABP 3'UTR binding, decreases the protection of poly(A) tails and activates mRNA decay to enable progression toward primed pluripotency. Thus, the signal-induced convergence of LIN28A with PABP-RNA hubs drives the rapid selection of naive mRNAs for decay, enabling the transcriptome remodeling that ensures swift developmental progression.
Subject(s)
3' Untranslated Regions , RNA Stability , RNA, Messenger , RNA-Binding Proteins , 3' Untranslated Regions/genetics , RNA, Messenger/metabolism , RNA, Messenger/genetics , Animals , RNA-Binding Proteins/metabolism , RNA-Binding Proteins/genetics , Mice , Signal Transduction , Humans , Poly(A)-Binding Proteins/metabolism , Poly(A)-Binding Proteins/genetics , Gene Expression Regulation, Developmental , Phosphorylation
ABSTRACT
Spatiotemporal regulation of gene expression is controlled by transcription factor (TF) binding to regulatory elements, resulting in a plethora of cell types and cell states from the same genetic information. Due to the importance of regulatory elements, various sequencing methods have been developed to localise them in genomes, for example using ChIP-seq profiling of the histone mark H3K27ac that marks active regulatory regions. Moreover, multiple tools have been developed to predict TF binding to these regulatory elements based on DNA sequence. As altered gene expression is a hallmark of disease phenotypes, identifying the TFs that drive such gene expression programs is critical for the identification of novel drug targets. In this study, we curated 84 chromatin profiling experiments (H3K27ac ChIP-seq) in which TFs were perturbed, for example through genetic knockout or overexpression. We ran nine published tools to prioritize TFs using these real-world datasets and evaluated the performance of the methods in identifying the perturbed TFs. This allowed the nomination of three frontrunner tools, namely RcisTarget, MEIRLOP and monaLisa. Our analyses revealed opportunities and commonalities of tools that will help to guide further improvements and developments in the field.
ABSTRACT
Non-alcoholic fatty liver disease (NAFLD) - characterized by excess accumulation of fat in the liver - now affects one third of the world's population. As NAFLD progresses, extracellular matrix components including collagen accumulate in the liver causing tissue fibrosis, a major determinant of disease severity and mortality. To identify transcriptional regulators of fibrosis, we computationally inferred the activity of transcription factors (TFs) relevant to fibrosis by profiling the matched transcriptomes and epigenomes of 108 human liver biopsies from a deeply-characterized cohort of patients spanning the full histopathologic spectrum of NAFLD. CRISPR-based genetic knockout of the top 100 TFs identified ZNF469 as a regulator of collagen expression in primary human hepatic stellate cells (HSCs). Gain- and loss-of-function studies established that ZNF469 regulates collagen genes and genes involved in matrix homeostasis through direct binding to gene bodies and regulatory elements. By integrating multiomic large-scale profiling of human biopsies with extensive experimental validation we demonstrate that ZNF469 is a transcriptional regulator of collagen in HSCs. Overall, these data nominate ZNF469 as a previously unrecognized determinant of NAFLD-associated liver fibrosis.
ABSTRACT
RNA-binding proteins (RBPs) play diverse roles in regulating co-transcriptional RNA-processing and chromatin functions, but our knowledge of the repertoire of chromatin-associated RBPs (caRBPs) and their interactions with chromatin remains limited. Here, we developed SPACE (Silica Particle Assisted Chromatin Enrichment) to isolate global and regional chromatin components with high specificity and sensitivity, and SPACEmap to identify the chromatin-contact regions in proteins. Applied to mouse embryonic stem cells, SPACE identified 1459 chromatin-associated proteins, ~48% of which are annotated as RBPs, indicating their dual roles in chromatin and RNA-binding. Additionally, SPACEmap stringently verified chromatin-binding of 403 RBPs and identified their chromatin-contact regions. Notably, SPACEmap showed that about 40% of the caRBPs bind chromatin by intrinsically disordered regions (IDRs). Studying SPACE and total proteome dynamics from mES cells grown in 2iL and serum medium indicates significant correlation (R = 0.62). One of the most dynamic caRBPs is Dazl, which we find co-localized with PRC2 at transcription start sites of genes that are distinct from Dazl mRNA binding. Dazl and other PRC2-colocalised caRBPs are rich in intrinsically disordered regions (IDRs), which could contribute to the formation and regulation of phase-separated PRC condensates. Together, our approach provides an unprecedented insight into IDR-mediated interactions and caRBPs with moonlighting functions in native chromatin.
Subject(s)
Chromatin/metabolism , Intrinsically Disordered Proteins/metabolism , Mouse Embryonic Stem Cells/metabolism , RNA-Binding Proteins/metabolism , Animals , Binding Sites/genetics , Cells, Cultured , Chromatin/genetics , Intrinsically Disordered Proteins/genetics , Mass Spectrometry/methods , Mice , Protein Binding , Protein Interaction Maps/genetics , Proteome/genetics , Proteome/metabolism , Proteomics/methods , RNA-Binding Proteins/genetics , Reproducibility of Results
ABSTRACT
Non-negative matrix factorization (NMF) has been widely used for the analysis of genomic data to perform feature extraction and signature identification due to the interpretability of the decomposed signatures. However, running a basic NMF analysis requires the installation of multiple tools and dependencies, and comes with a steep learning curve and considerable computing time. To mitigate such obstacles, we developed ShinyButchR, a novel R/Shiny application that provides a complete NMF-based analysis workflow, allowing the user to perform matrix decomposition using NMF, feature extraction, interactive visualization, relevant signature identification, and association to biological and clinical variables. ShinyButchR builds upon the R package ButchR, itself novel, which provides new TensorFlow solvers for algorithms of the NMF family, functions for downstream analysis, a rational method to determine the optimal factorization rank and a novel feature selection strategy.
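As a rough illustration of the matrix decomposition that NMF performs, the sketch below factorizes a small non-negative matrix V (features by samples) into W (features by signatures) and H (signatures by samples) using the classic Lee and Seung multiplicative update rule. This is not ButchR's TensorFlow solver and not its API: the function names `nmf`, `matmul`, `transpose` and `frobenius_error`, the toy matrix, and the choice of update rule are all assumptions made for illustration only.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - WH, i.e. the reconstruction error."""
    wh = matmul(w, h)
    return sum(
        (v[i][j] - wh[i][j]) ** 2
        for i in range(len(v))
        for j in range(len(v[0]))
    ) ** 0.5


def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Hypothetical helper: factorize V ~ W H with multiplicative updates.

    v    : non-negative matrix (list of rows), features x samples
    rank : number of signatures (the factorization rank)
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    # Random strictly positive initialization keeps every factor non-negative.
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + eps for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [
            [h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
            for k in range(rank)
        ]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [
            [w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
            for i in range(n_rows)
        ]
    return w, h


# Toy matrix (an assumption, not data from the paper): features 0-1 are
# active in samples 0-1, features 2-3 are active in samples 2-3.
toy = [
    [5.0, 5.0, 0.0, 0.0],
    [4.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 3.0],
    [0.0, 0.0, 2.0, 2.0],
]
w_toy, h_toy = nmf(toy, rank=2)
print("reconstruction error:", frobenius_error(toy, w_toy, h_toy))
```

In this reading, each column of W is a signature (a weighting over features) and each row of H gives the exposure of every sample to that signature. Choosing the rank is left open here; the abstract describes ButchR as offering a dedicated method for that step.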
ABSTRACT
Neural induction in vertebrates generates a CNS that extends the rostral-caudal length of the body. The prevailing view is that neural cells are initially induced with anterior (forebrain) identity; caudalizing signals then convert a proportion to posterior fates (spinal cord). To test this model, we used chromatin accessibility to define how cells adopt region-specific neural fates. Together with genetic and biochemical perturbations, this identified a developmental time window in which genome-wide chromatin-remodeling events preconfigure epiblast cells for neural induction. Contrary to the established model, this revealed that cells commit to a regional identity before acquiring neural identity. This "primary regionalization" allocates cells to anterior or posterior regions of the nervous system, explaining how cranial and spinal neurons are generated at appropriate axial positions. These findings prompt a revision to models of neural induction and support the proposed dual evolutionary origin of the vertebrate CNS.
Subject(s)
Chromatin Assembly and Disassembly , Embryonic Induction , Neurogenesis , Animals , Cell Line , Cells, Cultured , Chick Embryo , Female , Gene Expression Regulation, Developmental , Male , Mice , Mice, Inbred C57BL , Neural Stem Cells/cytology , Neural Stem Cells/metabolism , Spinal Cord/cytology , Spinal Cord/growth & development , Spinal Cord/metabolism
ABSTRACT
ChIP-seq has become a widely adopted genomic assay in recent years to determine binding sites for transcription factors or enrichments for specific histone modifications. Besides detecting enriched or bound regions, an important task is to determine differences between conditions. While this is a common analysis for gene expression, for which a large number of computational approaches have been validated, the same question for ChIP-seq is particularly challenging owing to the noisiness and variability of ChIP-seq data. Many different tools have been developed and published in recent years. However, a comprehensive comparison and review of these tools is still missing. Here, we have reviewed 14 tools, which have been developed to determine differential enrichment between two conditions. They differ in their algorithmic setups, and also in the range of applicability. Hence, we have benchmarked these tools on real data sets for transcription factors and histone modifications, as well as on simulated data sets to quantitatively evaluate their performance. Overall, there is a great variety in the type of signal detected by these tools with a surprisingly low level of agreement. Depending on the type of analysis performed, the choice of method will crucially impact the outcome.