Search | VHL Search Portal

1.

Temporal and sequential transcriptional dynamics define lineage shifts in corticogenesis.

Mukhtar, Tanzila; Breda, Jeremie; Adam, Manal A; Boareto, Marcelo; Grobecker, Pascal; Karimaddini, Zahra; Grison, Alice; Eschbach, Katja; Chandrasekhar, Ramakrishnan; Vermeul, Swen; Okoniewski, Michal; Pachkov, Mikhail; Harwell, Corey C; Atanasoski, Suzana; Beisel, Christian; Iber, Dagmar; van Nimwegen, Erik; Taylor, Verdon.

EMBO J ; 41(24): e111132, 2022 Dec 15.

Article in English | MEDLINE | ID: mdl-36345783

ABSTRACT

The cerebral cortex contains billions of neurons, and their disorganization or misspecification leads to neurodevelopmental disorders. Understanding how the plethora of projection neuron subtypes are generated by cortical neural stem cells (NSCs) is a major challenge. Here, we focused on elucidating the transcriptional landscape of murine embryonic NSCs, basal progenitors (BPs), and newborn neurons (NBNs) throughout cortical development. We uncover dynamic shifts in transcriptional space over time and heterogeneity within each progenitor population. We identified signature hallmarks of NSC, BP, and NBN clusters and predict active transcriptional nodes and networks that contribute to neural fate specification. We find that the expression of receptors, ligands, and downstream pathway components is highly dynamic over time and throughout the lineage implying differential responsiveness to signals. Thus, we provide an expansive compendium of gene expression during cortical development that will be an invaluable resource for studying neural developmental processes and neurodevelopmental disorders.

Subject(s)

Neural Stem Cells , Neurons , Animals , Mice , Cell Differentiation , Cell Lineage/genetics , Cerebral Cortex , Embryonic Stem Cells , Neurogenesis/genetics , Neurons/metabolism

2.

Effective bet-hedging through growth rate dependent stability.

de Groot, Daan H; Tjalma, Age J; Bruggeman, Frank J; van Nimwegen, Erik.

Proc Natl Acad Sci U S A ; 120(8): e2211091120, 2023 Feb 21.

Article in English | MEDLINE | ID: mdl-36780518

ABSTRACT

Microbes in the wild face highly variable and unpredictable environments and are naturally selected for their average growth rate across environments. Apart from using sensory regulatory systems to adapt in a targeted manner to changing environments, microbes employ bet-hedging strategies where cells in an isogenic population switch stochastically between alternative phenotypes. Yet, bet-hedging suffers from a fundamental trade-off: Increasing the phenotype-switching rate increases the rate at which maladapted cells explore alternative phenotypes but also increases the rate at which cells switch out of a well-adapted state. Consequently, it is currently believed that bet-hedging strategies are effective only when the number of possible phenotypes is limited and when environments last for sufficiently many generations. However, recent experimental results show that gene expression noise generally decreases with growth rate, suggesting that phenotype-switching rates may systematically decrease with growth rate. Such growth rate dependent stability (GRDS) causes cells to be more explorative when maladapted and more phenotypically stable when well-adapted, and we show that GRDS can almost completely overcome the trade-off that limits bet-hedging, allowing for effective adaptation even when environments are diverse and change rapidly. We further show that even a small decrease in switching rates of faster-growing phenotypes can substantially increase long-term fitness of bet-hedging strategies. Together, our results suggest that stochastic strategies may play an even bigger role for microbial adaptation than hitherto appreciated.

Subject(s)

Acclimatization , Biological Evolution , Phenotype , Adaptation, Physiological/genetics

3.

Identifying cell states in single-cell RNA-seq data at statistically maximal resolution.

Grobecker, Pascal; Sakoparnig, Thomas; van Nimwegen, Erik.

PLoS Comput Biol ; 20(7): e1012224, 2024 Jul 12.

Article in English | MEDLINE | ID: mdl-38995959

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) has become a popular experimental method to study variation of gene expression within a population of cells. However, obtaining an accurate picture of the diversity of distinct gene expression states that are present in a given dataset is highly challenging because of the sparsity of the scRNA-seq data and its inhomogeneous measurement noise properties. Although a vast number of different methods is applied in the literature for clustering cells into subsets with 'similar' expression profiles, these methods generally lack rigorously specified objectives, involve multiple complex layers of normalization, filtering, feature selection, dimensionality-reduction, employ ad hoc measures of distance or similarity between cells, often ignore the known measurement noise properties of scRNA-seq measurements, and include a large number of tunable parameters. Consequently, it is virtually impossible to assign concrete biophysical meaning to the clusterings that result from these methods. Here we address the following problem: Given raw unique molecule identifier (UMI) counts of an scRNA-seq dataset, partition the cells into subsets such that the gene expression states of the cells in each subset are statistically indistinguishable, and each subset corresponds to a distinct gene expression state. That is, we aim to partition cells so as to maximally reduce the complexity of the dataset without removing any of its meaningful structure. We show that, given the known measurement noise structure of scRNA-seq data, this problem is mathematically well-defined and derive its unique solution from first principles. We have implemented this solution in a tool called Cellstates which operates directly on the raw data and automatically determines the optimal partition and cluster number, with zero tunable parameters. We show that, on synthetic datasets, Cellstates almost perfectly recovers optimal partitions. On real data, Cellstates robustly identifies subtle substructure within groups of cells that are traditionally annotated as a common cell type. Moreover, we show that the diversity of gene expression states that Cellstates identifies systematically depends on the tissue of origin and not on technical features of the experiments such as the total number of cells and total UMI count per cell. In addition to the Cellstates tool we also provide a small toolbox of software to place the identified cellstates into a hierarchical tree of higher-order clusters, to identify the most important differentially expressed genes at each branch of this hierarchy, and to visualize these results.

4.

An atlas of combinatorial transcriptional regulation in mouse and man.

Ravasi, Timothy; Suzuki, Harukazu; Cannistraci, Carlo Vittorio; Katayama, Shintaro; Bajic, Vladimir B; Tan, Kai; Akalin, Altuna; Schmeier, Sebastian; Kanamori-Katayama, Mutsumi; Bertin, Nicolas; Carninci, Piero; Daub, Carsten O; Forrest, Alistair R R; Gough, Julian; Grimmond, Sean; Han, Jung-Hoon; Hashimoto, Takehiro; Hide, Winston; Hofmann, Oliver; Kamburov, Atanas; Kaur, Mandeep; Kawaji, Hideya; Kubosaki, Atsutaka; Lassmann, Timo; van Nimwegen, Erik; MacPherson, Cameron Ross; Ogawa, Chihiro; Radovanovic, Aleksandar; Schwartz, Ariel; Teasdale, Rohan D; Tegnér, Jesper; Lenhard, Boris; Teichmann, Sarah A; Arakawa, Takahiro; Ninomiya, Noriko; Murakami, Kayoko; Tagami, Michihira; Fukuda, Shiro; Imamura, Kengo; Kai, Chikatoshi; Ishihara, Ryoko; Kitazume, Yayoi; Kawai, Jun; Hume, David A; Ideker, Trey; Hayashizaki, Yoshihide.

Cell ; 140(5): 744-52, 2010 Mar 05.

Article in English | MEDLINE | ID: mdl-20211142

ABSTRACT

Combinatorial interactions among transcription factors are critical to directing tissue-specific gene expression. To build a global atlas of these combinations, we have screened for physical interactions among the majority of human and mouse DNA-binding transcription factors (TFs). The complete networks contain 762 human and 877 mouse interactions. Analysis of the networks reveals that highly connected TFs are broadly expressed across tissues, and that roughly half of the measured interactions are conserved between mouse and human. The data highlight the importance of TF combinations for determining cell fate, and they lead to the identification of a SMAD3/FLI1 complex expressed during development of immunity. The availability of large TF combinatorial networks in both human and mouse will provide many opportunities to study gene regulation, tissue differentiation, and mammalian evolution.

Subject(s)

Gene Expression Regulation , Gene Regulatory Networks , Transcription Factors/metabolism , Animals , Cell Differentiation , Evolution, Molecular , Humans , Mice , Monocytes/cytology , Organ Specificity , Smad3 Protein/metabolism , Trans-Activators/metabolism

5.

Genome-wide gene expression noise in Escherichia coli is condition-dependent and determined by propagation of noise through the regulatory network.

Urchueguía, Arantxa; Galbusera, Luca; Chauvin, Dany; Bellement, Gwendoline; Julou, Thomas; van Nimwegen, Erik.

PLoS Biol ; 19(12): e3001491, 2021 Dec.

Article in English | MEDLINE | ID: mdl-34919538

ABSTRACT

Although it is well appreciated that gene expression is inherently noisy and that transcriptional noise is encoded in a promoter's sequence, little is known about the extent to which noise levels of individual promoters vary across growth conditions. Using flow cytometry, we here quantify transcriptional noise in Escherichia coli genome-wide across 8 growth conditions and find that noise levels systematically decrease with growth rate, with a condition-dependent lower bound on noise. Whereas constitutive promoters consistently exhibit low noise in all conditions, regulated promoters are both more noisy on average and more variable in noise across conditions. Moreover, individual promoters show highly distinct variation in noise across conditions. We show that a simple model of noise propagation from regulators to their targets can explain a significant fraction of the variation in relative noise levels and identifies TFs that most contribute to both condition-specific and condition-independent noise propagation. In addition, analysis of the genome-wide correlation structure of various gene properties shows that gene regulation, expression noise, and noise plasticity are all positively correlated genome-wide and vary independently of variations in absolute expression, codon bias, and evolutionary rate. Together, our results show that while absolute expression noise tends to decrease with growth rate, relative noise levels of genes are highly condition-dependent and determined by the propagation of noise through the gene regulatory network.

Subject(s)

Escherichia coli/genetics , Gene Expression Regulation, Bacterial/genetics , Promoter Regions, Genetic/genetics , Escherichia coli Proteins/genetics , Gene Expression/genetics , Gene Expression Profiling/methods , Gene Regulatory Networks/genetics , Genes, Reporter/genetics , Transcriptome/genetics

6.

Subpopulations of sensorless bacteria drive fitness in fluctuating environments.

Julou, Thomas; Zweifel, Ludovit; Blank, Diana; Fiori, Athos; van Nimwegen, Erik.

PLoS Biol ; 18(12): e3000952, 2020 Dec.

Article in English | MEDLINE | ID: mdl-33270631

ABSTRACT

Populations of bacteria often undergo a lag in growth when switching conditions. Because growth lags can be large compared to typical doubling times, variations in growth lag are an important but often overlooked component of bacterial fitness in fluctuating environments. We here explore how growth lag variation is determined for the archetypical switch from glucose to lactose as a carbon source in Escherichia coli. First, we show that single-cell lags are bimodally distributed and controlled by a single-molecule trigger. That is, gene expression noise causes the population before the switch to divide into subpopulations with zero and nonzero lac operon expression. While "sensorless" cells with zero preexisting lac expression at the switch have long lags because they are unable to sense the lactose signal, any nonzero lac operon expression suffices to ensure a short lag. Second, we show that the growth lag at the population level depends crucially on the fraction of sensorless cells and that this fraction in turn depends sensitively on the growth condition before the switch. Consequently, even small changes in basal expression can significantly affect the fraction of sensorless cells, thereby population lags and fitness under switching conditions, and may thus be subject to significant natural selection. Indeed, we show that condition-dependent population lags vary across wild E. coli isolates. Since many sensory genes are naturally low expressed in conditions where their inducer is not present, bimodal responses due to subpopulations of sensorless cells may be a general mechanism inducing phenotypic heterogeneity and controlling population lags in switching environments. This mechanism also illustrates how gene expression noise can turn even a simple sensory gene circuit into a bet hedging module and underlines the profound role of gene expression noise in regulatory responses.

Subject(s)

Escherichia coli/metabolism , Gene Expression Regulation, Bacterial/genetics , Genetic Fitness/physiology , Bacteria/genetics , Bacteria/metabolism , Environment , Escherichia coli/genetics , Escherichia coli Proteins/genetics , Escherichia coli Proteins/metabolism , Gene Expression Regulation, Bacterial/physiology , Gene Regulatory Networks/genetics , Gene-Environment Interaction , Genetic Fitness/genetics , Glucose/metabolism , Lac Operon , Lactose/metabolism , Phenotype

7.

Crunch: integrated processing and modeling of ChIP-seq data in terms of regulatory motifs.

Berger, Severin; Pachkov, Mikhail; Arnold, Phil; Omidi, Saeed; Kelley, Nicholas; Salatino, Silvia; van Nimwegen, Erik.

Genome Res ; 29(7): 1164-1177, 2019 Jul.

Article in English | MEDLINE | ID: mdl-31138617

ABSTRACT

Although ChIP-seq has become a routine experimental approach for quantitatively characterizing the genome-wide binding of transcription factors (TFs), computational analysis procedures remain far from standardized, making it difficult to compare ChIP-seq results across experiments. In addition, although genome-wide binding patterns must ultimately be determined by local constellations of DNA-binding sites, current analysis is typically limited to identifying enriched motifs in ChIP-seq peaks. Here we present Crunch, a completely automated computational method that performs all ChIP-seq analysis from quality control through read mapping and peak detecting and that integrates comprehensive modeling of the ChIP signal in terms of known and novel binding motifs, quantifying the contribution of each motif and annotating which combinations of motifs explain each binding peak. By applying Crunch to 128 data sets from the ENCODE Project, we show that Crunch outperforms current peak finders and find that TFs naturally separate into "solitary TFs," for which a single motif explains the ChIP-peaks, and "cobinding TFs," for which multiple motifs co-occur within peaks. Moreover, for most data sets, the motifs that Crunch identified de novo outperform known motifs, and both the set of cobinding motifs and the top motif of solitary TFs are consistent across experiments and cell lines. Crunch is implemented as a web server, enabling standardized analysis of any collection of ChIP-seq data sets by simply uploading raw sequencing data. Results are provided both in a graphical web interface and as downloadable files.

Subject(s)

Chromatin Immunoprecipitation Sequencing , Computational Biology/methods , Transcription Factors/metabolism , Amino Acid Motifs , Animals , Binding Sites , Datasets as Topic , Humans , Nucleotide Motifs , Quality Control , Regulatory Sequences, Nucleic Acid

8.

Single-cell mRNA profiling reveals the hierarchical response of miRNA targets to miRNA induction.

Rzepiela, Andrzej J; Ghosh, Souvik; Breda, Jeremie; Vina-Vilaseca, Arnau; Syed, Afzal P; Gruber, Andreas J; Eschbach, Katja; Beisel, Christian; van Nimwegen, Erik; Zavolan, Mihaela.

Mol Syst Biol ; 14(8): e8266, 2018 Aug 27.

Article in English | MEDLINE | ID: mdl-30150282

ABSTRACT

miRNAs are small RNAs that regulate gene expression post-transcriptionally. By repressing the translation and promoting the degradation of target mRNAs, miRNAs may reduce the cell-to-cell variability in protein expression, induce correlations between target expression levels, and provide a layer through which targets can influence each other's expression as "competing RNAs" (ceRNAs). However, experimental evidence for these behaviors is limited. Combining mathematical modeling with RNA sequencing of individual human embryonic kidney cells in which the expression of two distinct miRNAs was induced over a wide range, we have inferred parameters describing the response of hundreds of miRNA targets to miRNA induction. Individual targets have widely different response dynamics, and only a small proportion of predicted targets exhibit high sensitivity to miRNA induction. Our data reveal for the first time the response parameters of the entire network of endogenous miRNA targets to miRNA induction, demonstrating that miRNAs correlate target expression and at the same time increase the variability in expression of individual targets across cells. The approach is generalizable to other miRNAs and post-transcriptional regulators to improve the understanding of gene expression dynamics in individual cell types.

Subject(s)

Gene Regulatory Networks/genetics , MicroRNAs/genetics , RNA, Messenger/genetics , Single-Cell Analysis , Computational Biology , Gene Expression Profiling , Gene Expression Regulation/genetics , HEK293 Cells , Humans , Models, Theoretical , Sequence Analysis, RNA

9.

A large-scale, in vivo transcription factor screen defines bivalent chromatin as a key property of regulatory factors mediating Drosophila wing development.

Schertel, Claus; Albarca, Monica; Rockel-Bauer, Claudia; Kelley, Nicholas W; Bischof, Johannes; Hens, Korneel; van Nimwegen, Erik; Basler, Konrad; Deplancke, Bart.

Genome Res ; 25(4): 514-23, 2015 Apr.

Article in English | MEDLINE | ID: mdl-25568052

ABSTRACT

Transcription factors (TFs) are key regulators of cell fate. The estimated 755 genes that encode DNA binding domain-containing proteins comprise ~5% of all Drosophila genes. However, the majority has remained uncharacterized so far due to the lack of proper genetic tools. We generated 594 site-directed transgenic Drosophila lines that contain integrations of individual UAS-TF constructs to facilitate spatiotemporally controlled misexpression in vivo. All transgenes were expressed in the developing wing, and two-thirds induced specific phenotypic defects. In vivo knockdown of the same genes yielded a phenotype for 50%, with both methods indicating a great potential for misexpression to characterize novel functions in wing growth, patterning, and development. Thus, our UAS-TF library provides an important addition to the genetic toolbox of Drosophila research, enabling the identification of several novel wing development-related TFs. In parallel, we established the chromatin landscape of wing imaginal discs by ChIP-seq analyses of five chromatin marks and RNA Pol II. Subsequent clustering revealed six distinct chromatin states, with two clusters showing enrichment for both active and repressive marks. TFs that carry such "bivalent" chromatin are highly enriched for causing misexpression phenotypes in the wing, and analysis of existing expression data shows that these TFs tend to be differentially expressed across the wing disc. Thus, bivalently marked chromatin can be used as a marker for spatially regulated TFs that are functionally relevant in a developing tissue.

Subject(s)

Body Patterning/genetics , Drosophila melanogaster/embryology , Imaginal Discs/embryology , Transcription Factors/genetics , Wings, Animal/embryology , Animals , Animals, Genetically Modified , Chromatin/genetics , Chromatin/metabolism , DNA Methylation/genetics , DNA-Binding Proteins/genetics , Drosophila Proteins/genetics , Drosophila melanogaster/genetics , Gene Expression Regulation, Developmental , Histones/genetics , Phenotype , Promoter Regions, Genetic/genetics , Protein Structure, Tertiary/genetics , RNA Interference , RNA Polymerase II/genetics , RNA, Small Interfering

10.

Automated incorporation of pairwise dependency in transcription factor binding site prediction using dinucleotide weight tensors.

Omidi, Saeed; Zavolan, Mihaela; Pachkov, Mikhail; Breda, Jeremie; Berger, Severin; van Nimwegen, Erik.

PLoS Comput Biol ; 13(7): e1005176, 2017 Jul.

Article in English | MEDLINE | ID: mdl-28753602

ABSTRACT

Gene regulatory networks are ultimately encoded by the sequence-specific binding of transcription factors (TFs) to short DNA segments. Although it is customary to represent the binding specificity of a TF by a position-specific weight matrix (PSWM), which assumes each position within a site contributes independently to the overall binding affinity, evidence has been accumulating that there can be significant dependencies between positions. Unfortunately, methodological challenges have so far hindered the development of a practical and generally-accepted extension of the PSWM model. On the one hand, simple models that only consider dependencies between nearest-neighbor positions are easy to use in practice, but fail to account for the distal dependencies that are observed in the data. On the other hand, models that allow for arbitrary dependencies are prone to overfitting, requiring regularization schemes that are difficult to use in practice for non-experts. Here we present a new regulatory motif model, called dinucleotide weight tensor (DWT), that incorporates arbitrary pairwise dependencies between positions in binding sites, rigorously from first principles, and free from tunable parameters. We demonstrate the power of the method on a large set of ChIP-seq data-sets, showing that DWTs outperform both PSWMs and motif models that only incorporate nearest-neighbor dependencies. We also demonstrate that DWTs outperform two previously proposed methods. Finally, we show that DWTs inferred from ChIP-seq data also outperform PSWMs on HT-SELEX data for the same TF, suggesting that DWTs capture inherent biophysical properties of the interactions between the DNA binding domains of TFs and their binding sites. We make a suite of DWT tools available at dwt.unibas.ch, that allow users to automatically perform 'motif finding', i.e. the inference of DWT motifs from a set of sequences, binding site prediction with DWTs, and visualization of DWT 'dilogo' motifs.

Subject(s)

Binding Sites/genetics , Computational Biology/methods , DNA , Nucleotide Motifs/genetics , Transcription Factors , DNA/chemistry , DNA/genetics , DNA/metabolism , Models, Statistical , RNA/chemistry , RNA/genetics , RNA/metabolism , Sequence Analysis, DNA , Transcription Factors/chemistry , Transcription Factors/genetics , Transcription Factors/metabolism
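
Illustrative note (not part of the indexed record): the abstract above describes the PSWM as a model in which each position of a site contributes independently to binding. A minimal sketch of that independence assumption follows, scoring a candidate site as a sum of per-position log-odds. The motif values, the uniform background of 0.25, and the names `PSWM`, `pswm_log_odds`, and `best_site` are all hypothetical and chosen only for illustration; this is not the DWT model and not code from the authors' tools.

```python
import math

# Hypothetical 3-position motif: per-position base probabilities (made-up values).
PSWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
BACKGROUND = 0.25  # assumed uniform background frequency for every base


def pswm_log_odds(site, pswm=PSWM, background=BACKGROUND):
    """Log2-odds score of a site; each position contributes an independent term."""
    if len(site) != len(pswm):
        raise ValueError("site length must equal motif width")
    return sum(math.log2(column[base] / background)
               for column, base in zip(pswm, site))


def best_site(sequence, pswm=PSWM):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pswm)
    windows = [(pswm_log_odds(sequence[i:i + width], pswm), i)
               for i in range(len(sequence) - width + 1)]
    return max(windows)


if __name__ == "__main__":
    score, start = best_site("TTAGCTT")
    print(f"best window starts at {start} with score {score:.3f}")
```

Because the score is a plain sum over positions, this sketch cannot represent any interaction between two positions; that is the limitation the abstract says pairwise-dependency models are meant to address.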

11.

ISMARA: automated modeling of genomic signals as a democracy of regulatory motifs.

Balwierz, Piotr J; Pachkov, Mikhail; Arnold, Phil; Gruber, Andreas J; Zavolan, Mihaela; van Nimwegen, Erik.

Genome Res ; 24(5): 869-84, 2014 May.

Article in English | MEDLINE | ID: mdl-24515121

ABSTRACT

Accurate reconstruction of the regulatory networks that control gene expression is one of the key current challenges in molecular biology. Although gene expression and chromatin state dynamics are ultimately encoded by constellations of binding sites recognized by regulators such as transcription factors (TFs) and microRNAs (miRNAs), our understanding of this regulatory code and its context-dependent read-out remains very limited. Given that there are thousands of potential regulators in mammals, it is not practical to use direct experimentation to identify which of these play a key role for a particular system of interest. We developed a methodology that models gene expression or chromatin modifications in terms of genome-wide predictions of regulatory sites and completely automated it into a web-based tool called ISMARA (Integrated System for Motif Activity Response Analysis). Given only gene expression or chromatin state data across a set of samples as input, ISMARA identifies the key TFs and miRNAs driving expression/chromatin changes and makes detailed predictions regarding their regulatory roles. These include predicted activities of the regulators across the samples, their genome-wide targets, enriched gene categories among the targets, and direct interactions between the regulators. Applying ISMARA to data sets from well-studied systems, we show that it consistently identifies known key regulators ab initio. We also present a number of novel predictions including regulatory interactions in innate immunity, a master regulator of mucociliary differentiation, TFs consistently disregulated in cancer, and TFs that mediate specific chromatin modifications.

Subject(s)

Genome, Human , Models, Genetic , Nucleotide Motifs , Regulatory Sequences, Nucleic Acid , Sequence Analysis, DNA/methods , Algorithms , Animals , Chromatin Assembly and Disassembly , Humans , Mice

12.

DNA-binding factors shape the mouse methylome at distal regulatory regions.

Stadler, Michael B; Murr, Rabih; Burger, Lukas; Ivanek, Robert; Lienert, Florian; Schöler, Anne; van Nimwegen, Erik; Wirbelauer, Christiane; Oakeley, Edward J; Gaidatzis, Dimos; Tiwari, Vijay K; Schübeler, Dirk.

Nature ; 480(7378): 490-5, 2011 Dec 14.

Article in English | MEDLINE | ID: mdl-22170606

ABSTRACT

Methylation of cytosines is an essential epigenetic modification in mammalian genomes, yet the rules that govern methylation patterns remain largely elusive. To gain insights into this process, we generated base-pair-resolution mouse methylomes in stem cells and neuronal progenitors. Advanced quantitative analysis identified low-methylated regions (LMRs) with an average methylation of 30%. These represent CpG-poor distal regulatory regions as evidenced by location, DNase I hypersensitivity, presence of enhancer chromatin marks and enhancer activity in reporter assays. LMRs are occupied by DNA-binding factors and their binding is necessary and sufficient to create LMRs. A comparison of neuronal and stem-cell methylomes confirms this dependency, as cell-type-specific LMRs are occupied by cell-type-specific transcription factors. This study provides methylome references for the mouse and shows that DNA-binding factors locally influence DNA methylation, enabling the identification of active regulatory regions.

Subject(s)

Cytosine/metabolism , DNA Methylation , DNA-Binding Proteins/metabolism , Epigenomics , Animals , Cell Differentiation , CpG Islands , Embryonic Stem Cells/cytology , Mice , Neurons/cytology , Promoter Regions, Genetic/genetics , Protein Binding , Stem Cells/cytology , Transcription Factors/metabolism

13.

Tead2 expression levels control the subcellular distribution of Yap and Taz, zyxin expression and epithelial-mesenchymal transition.

Diepenbruck, Maren; Waldmeier, Lorenz; Ivanek, Robert; Berninger, Philipp; Arnold, Phil; van Nimwegen, Erik; Christofori, Gerhard.

J Cell Sci ; 127(Pt 7): 1523-36, 2014 Apr 01.

Article in English | MEDLINE | ID: mdl-24554433

ABSTRACT

The cellular changes during an epithelial-mesenchymal transition (EMT) largely rely on global changes in gene expression orchestrated by transcription factors. Tead transcription factors and their transcriptional co-activators Yap and Taz have been previously implicated in promoting an EMT; however, their direct transcriptional target genes and their functional role during EMT have remained elusive. We have uncovered a previously unanticipated role of the transcription factor Tead2 during EMT. During EMT in mammary gland epithelial cells and breast cancer cells, levels of Tead2 increase in the nucleus of cells, thereby directing a predominant nuclear localization of its co-factors Yap and Taz via the formation of Tead2-Yap-Taz complexes. Genome-wide chromatin immunoprecipitation and next generation sequencing in combination with gene expression profiling revealed the transcriptional targets of Tead2 during EMT. Among these, zyxin contributes to the migratory and invasive phenotype evoked by Tead2. The results demonstrate that Tead transcription factors are crucial regulators of the cellular distribution of Yap and Taz, and together they control the expression of genes critical for EMT and metastasis.

Subject(s)

Adaptor Proteins, Signal Transducing/metabolism , DNA-Binding Proteins/biosynthesis , Epithelial-Mesenchymal Transition/physiology , Phosphoproteins/metabolism , Transcription Factors/biosynthesis , Transcription Factors/metabolism , Zyxin/biosynthesis , Adaptor Proteins, Signal Transducing/genetics , Animals , Cell Cycle Proteins , Cell Growth Processes/physiology , Cell Line, Tumor , DNA-Binding Proteins/genetics , Female , Mammary Glands, Animal/cytology , Mammary Glands, Animal/metabolism , Mammary Neoplasms, Experimental/genetics , Mammary Neoplasms, Experimental/metabolism , Mammary Neoplasms, Experimental/pathology , Mice , Mice, Inbred BALB C , Mice, Nude , Mice, Transgenic , Phosphoproteins/genetics , Signal Transduction , TEA Domain Transcription Factors , Trans-Activators , Transcription Factors/genetics , Transcriptional Activation , YAP-Signaling Proteins , Zyxin/genetics

14.

Modeling of epigenome dynamics identifies transcription factors that mediate Polycomb targeting.

Arnold, Phil; Schöler, Anne; Pachkov, Mikhail; Balwierz, Piotr J; Jørgensen, Helle; Stadler, Michael B; van Nimwegen, Erik; Schübeler, Dirk.

Genome Res ; 23(1): 60-73, 2013 Jan.

Article in English | MEDLINE | ID: mdl-22964890

ABSTRACT

Although changes in chromatin are integral to transcriptional reprogramming during cellular differentiation, it is currently unclear how chromatin modifications are targeted to specific loci. To systematically identify transcription factors (TFs) that can direct chromatin changes during cell fate decisions, we model the relationship between genome-wide dynamics of chromatin marks and the local occurrence of computationally predicted TF binding sites. By applying this computational approach to a time course of Polycomb-mediated H3K27me3 marks during neuronal differentiation of murine stem cells, we identify several motifs that likely regulate the dynamics of this chromatin mark. Among these, the sites bound by REST and by the SNAIL family of TFs are predicted to transiently recruit H3K27me3 in neuronal progenitors. We validate these predictions experimentally and show that absence of REST indeed causes loss of H3K27me3 at target promoters in trans, specifically at the neuronal progenitor state. Moreover, using targeted transgenic insertion, we show that promoter fragments containing REST or SNAIL binding sites are sufficient to recruit H3K27me3 in cis, while deletion of these sites results in loss of H3K27me3. These findings illustrate that the occurrence of TF binding sites can determine chromatin dynamics. Local determination of Polycomb activity by REST and SNAIL motifs exemplifies such TF based regulation of chromatin. Furthermore, our results show that key TFs can be identified ab initio through computational modeling of epigenome data sets using a modeling approach that we make readily accessible.

Subject(s)

Chromatin Assembly and Disassembly , Epigenesis, Genetic , Models, Genetic , Polycomb-Group Proteins/metabolism , Transcription Factors/metabolism , Animals , Binding Sites , Cattle , Cell Differentiation , Chromatin/metabolism , Dogs , Genome , Histones/metabolism , Horses , Humans , Macaca , Mice , Neurons/cytology , Opossums , Promoter Regions, Genetic , Snail Family Transcription Factors , Stem Cells/cytology , Transgenes

15.

A biophysical miRNA-mRNA interaction model infers canonical and noncanonical targets.

Khorshid, Mohsen; Hausser, Jean; Zavolan, Mihaela; van Nimwegen, Erik.

Nat Methods ; 10(3): 253-5, 2013 Mar.

Article in English | MEDLINE | ID: mdl-23334102

ABSTRACT

We introduce a biophysical model of miRNA-target interaction and infer its parameters from Argonaute 2 cross-linking and immunoprecipitation data. We show that a substantial fraction of human miRNA target sites are noncanonical and that predicted target-site affinity correlates well with the extent of target destabilization. Our model provides a rigorous biophysical approach to miRNA target identification beyond ad hoc miRNA seed-based methods.

Subject(s)

Argonaute Proteins/metabolism , Biophysical Phenomena , Gene Targeting , MicroRNAs/genetics , Models, Biological , RNA, Messenger/genetics , Argonaute Proteins/genetics , Base Pairing , Binding Sites , Data Interpretation, Statistical , Databases, Genetic , Gene Targeting/methods , HEK293 Cells , HeLa Cells , Humans , Immunoprecipitation , MicroRNAs/metabolism , Probability , Protein Binding , RNA, Messenger/metabolism , Transcriptome

16.

ARMADA: Using motif activity dynamics to infer gene regulatory networks from gene expression data.

Pemberton-Ross, Peter J; Pachkov, Mikhail; van Nimwegen, Erik.

Methods ; 85: 62-74, 2015 Sep 01.

Article in English | MEDLINE | ID: mdl-26164700

ABSTRACT

Analysis of gene expression data remains one of the most promising avenues toward reconstructing genome-wide gene regulatory networks. However, the large dimensionality of the problem prohibits the fitting of explicit dynamical models of gene regulatory networks, whereas machine learning methods for dimensionality reduction such as clustering or principal component analysis typically fail to provide mechanistic interpretations of the reduced descriptions. To address this, we recently developed a general methodology called motif activity response analysis (MARA) that, by modeling gene expression patterns in terms of the activities of concrete regulators, accomplishes dramatic dimensionality reduction while retaining mechanistic biological interpretations of its predictions (Balwierz, 2014). Here we extend MARA by presenting ARMADA, which models the activity dynamics of regulators across a time course, and infers the causal interactions between the regulators that drive the dynamics of their activities across time. We have implemented ARMADA as part of our ISMARA webserver, ismara.unibas.ch, allowing any researcher to automatically apply it to any gene expression time course. To illustrate the method, we apply ARMADA to a time course of human umbilical vein endothelial cells treated with TNF. Remarkably, ARMADA is able to reproduce the complex observed motif activity dynamics using a relatively small set of interactions between the key regulators in this system. In addition, we show that ARMADA successfully infers many of the key regulatory interactions known to drive this inflammatory response and discuss several novel interactions that ARMADA predicts. In combination with ISMARA, ARMADA provides a powerful approach to generating plausible hypotheses for the key interactions between regulators that control gene expression in any system for which time course measurements are available.

Subject(s)

Gene Expression Profiling/methods , Gene Regulatory Networks/genetics , Systems Analysis , Algorithms , Amino Acid Motifs/genetics , Animals , Computational Biology/methods , Humans , Mice

17.

Quantifying the strength of miRNA-target interactions.

Breda, Jeremie; Rzepiela, Andrzej J; Gumienny, Rafal; van Nimwegen, Erik; Zavolan, Mihaela.

Methods ; 85: 90-99, 2015 Sep 01.

Article in English | MEDLINE | ID: mdl-25892562

ABSTRACT

We quantify the strength of miRNA-target interactions with MIRZA, a recently introduced biophysical model. We show that computationally predicted energies of interaction correlate strongly with the energies of interaction estimated from biochemical measurements of Michaelis-Menten constants. We further show that the accuracy of the MIRZA model can be improved by taking into account recently emerged experimental data types. In particular, we use chimeric miRNA-mRNA sequences to infer a MIRZA-CHIMERA model and we provide a framework for inferring a similar model from measurements of rate constants of miRNA-mRNA interaction in the context of Argonaute proteins. Finally, based on a simple model of miRNA-based regulation, we discuss the importance of interaction energy and its variability between targets for the modulation of miRNA target expression in vivo.

Subject(s)

Gene Targeting/methods , MicroRNAs/chemistry , MicroRNAs/metabolism , Models, Molecular , Binding Sites/physiology , Humans , Protein Structure, Secondary

18.

Computational modeling identifies key gene regulatory interactions underlying phenobarbital-mediated tumor promotion.

Luisier, Raphaëlle; Unterberger, Elif B; Goodman, Jay I; Schwarz, Michael; Moggs, Jonathan; Terranova, Rémi; van Nimwegen, Erik.

Nucleic Acids Res ; 42(7): 4180-95, 2014 Apr.

Article in English | MEDLINE | ID: mdl-24464994

ABSTRACT

Gene regulatory interactions underlying the early stages of non-genotoxic carcinogenesis are poorly understood. Here, we have identified key candidate regulators of phenobarbital (PB)-mediated mouse liver tumorigenesis, a well-characterized model of non-genotoxic carcinogenesis, by applying a new computational modeling approach to a comprehensive collection of in vivo gene expression studies. We have combined our previously developed motif activity response analysis (MARA), which models gene expression patterns in terms of computationally predicted transcription factor binding sites, with singular value decomposition (SVD) of the inferred motif activities, to disentangle the roles that different transcriptional regulators play in specific biological pathways of tumor promotion. Furthermore, transgenic mouse models enabled us to identify which of these regulatory activities was downstream of constitutive androstane receptor and β-catenin signaling, both crucial components of PB-mediated liver tumorigenesis. We propose novel roles for E2F and ZFP161 in PB-mediated hepatocyte proliferation and suggest that PB-mediated suppression of ESR1 activity contributes to the development of a tumor-prone environment. Our study shows that combining MARA with SVD allows for automated identification of independent transcription regulatory programs within a complex in vivo tissue environment and provides novel mechanistic insights into PB-mediated hepatocarcinogenesis.

Subject(s)

Carcinogenesis/genetics , Gene Expression Regulation, Neoplastic , Liver Neoplasms/genetics , Phenobarbital/toxicity , Transcription, Genetic/drug effects , Animals , Binding Sites , Cell Proliferation/drug effects , Computational Biology/methods , Computer Simulation , Constitutive Androstane Receptor , Gene Regulatory Networks , Liver/drug effects , Liver/metabolism , Liver Neoplasms/chemically induced , Liver Neoplasms/metabolism , Male , Mice , Nucleotide Motifs , Receptors, Cytoplasmic and Nuclear/metabolism , Signal Transduction , Transcription Factors/metabolism , beta Catenin/metabolism

19.

Fifteen years SIB Swiss Institute of Bioinformatics: life science databases, tools and support.

Stockinger, Heinz; Altenhoff, Adrian M; Arnold, Konstantin; Bairoch, Amos; Bastian, Frederic; Bergmann, Sven; Bougueleret, Lydie; Bucher, Philipp; Delorenzi, Mauro; Lane, Lydie; Le Mercier, Philippe; Lisacek, Frédérique; Michielin, Olivier; Palagi, Patricia M; Rougemont, Jacques; Schwede, Torsten; von Mering, Christian; van Nimwegen, Erik; Walther, Daniel; Xenarios, Ioannis; Zavolan, Mihaela; Zdobnov, Evgeny M; Zoete, Vincent; Appel, Ron D.

Nucleic Acids Res ; 42(Web Server issue): W436-41, 2014 Jul.

Article in English | MEDLINE | ID: mdl-24792157

ABSTRACT

The SIB Swiss Institute of Bioinformatics (www.isb-sib.ch) was created in 1998 as an institution to foster excellence in bioinformatics. It is renowned worldwide for its databases and software tools, such as UniProtKB/Swiss-Prot, PROSITE, SWISS-MODEL, STRING, etc., that are all accessible on ExPASy.org, SIB's Bioinformatics Resource Portal. This article provides an overview of the scientific and training resources SIB has consistently been offering to the life science community for more than 15 years.

Subject(s)

Computational Biology , Databases, Chemical , Software , Biological Evolution , Biostatistics , Drug Design , Genomics , Humans , Internet , Protein Conformation , Proteomics , Systems Biology

20.

Automated reconstruction of whole-genome phylogenies from short-sequence reads.

Bertels, Frederic; Silander, Olin K; Pachkov, Mikhail; Rainey, Paul B; van Nimwegen, Erik.

Mol Biol Evol ; 31(5): 1077-88, 2014 May.

Article in English | MEDLINE | ID: mdl-24600054

ABSTRACT

Studies of microbial evolutionary dynamics are being transformed by the availability of affordable high-throughput sequencing technologies, which allow whole-genome sequencing of hundreds of related taxa in a single study. Reconstructing a phylogenetic tree of these taxa is generally a crucial step in any evolutionary analysis. Instead of constructing genome assemblies for all taxa, annotating these assemblies, and aligning orthologous genes, many recent studies 1) directly map raw sequencing reads to a single reference sequence, 2) extract single nucleotide polymorphisms (SNPs), and 3) infer the phylogenetic tree using maximum likelihood methods from the aligned SNP positions. However, here we show that, when using such methods to reconstruct phylogenies from sets of simulated sequences, both the exclusion of nonpolymorphic positions and the alignment to a single reference genome introduce systematic biases and errors in phylogeny reconstruction. To address these problems, we developed a new method that combines alignments from mappings to multiple reference sequences and show that this successfully removes biases from the reconstructed phylogenies. We implemented this method as a web server named REALPHY (Reference sequence Alignment-based Phylogeny builder), which fully automates phylogenetic reconstruction from raw sequencing reads.

Subject(s)

Genomics/methods , Phylogeny , Algorithms , Computer Simulation , Escherichia coli/genetics , Evolution, Molecular , Genome, Bacterial , High-Throughput Nucleotide Sequencing , Likelihood Functions , Models, Genetic , Polymorphism, Single Nucleotide , Pseudomonas syringae/genetics , Reproducibility of Results , Sequence Alignment , Sinorhizobium meliloti/genetics

