ABSTRACT
The cerebral cortex contains billions of neurons, and their disorganization or misspecification leads to neurodevelopmental disorders. Understanding how the plethora of projection neuron subtypes are generated by cortical neural stem cells (NSCs) is a major challenge. Here, we focused on elucidating the transcriptional landscape of murine embryonic NSCs, basal progenitors (BPs), and newborn neurons (NBNs) throughout cortical development. We uncover dynamic shifts in transcriptional space over time and heterogeneity within each progenitor population. We identified signature hallmarks of NSC, BP, and NBN clusters and predict active transcriptional nodes and networks that contribute to neural fate specification. We find that the expression of receptors, ligands, and downstream pathway components is highly dynamic over time and throughout the lineage implying differential responsiveness to signals. Thus, we provide an expansive compendium of gene expression during cortical development that will be an invaluable resource for studying neural developmental processes and neurodevelopmental disorders.
Subjects
Neural Stem Cells; Neurons; Animals; Mice; Cell Differentiation; Cell Lineage/genetics; Cerebral Cortex; Embryonic Stem Cells; Neurogenesis/genetics; Neurons/metabolism
ABSTRACT
Microbes in the wild face highly variable and unpredictable environments and are naturally selected for their average growth rate across environments. Apart from using sensory regulatory systems to adapt in a targeted manner to changing environments, microbes employ bet-hedging strategies where cells in an isogenic population switch stochastically between alternative phenotypes. Yet, bet-hedging suffers from a fundamental trade-off: Increasing the phenotype-switching rate increases the rate at which maladapted cells explore alternative phenotypes but also increases the rate at which cells switch out of a well-adapted state. Consequently, it is currently believed that bet-hedging strategies are effective only when the number of possible phenotypes is limited and when environments last for sufficiently many generations. However, recent experimental results show that gene expression noise generally decreases with growth rate, suggesting that phenotype-switching rates may systematically decrease with growth rate. Such growth rate dependent stability (GRDS) causes cells to be more explorative when maladapted and more phenotypically stable when well-adapted, and we show that GRDS can almost completely overcome the trade-off that limits bet-hedging, allowing for effective adaptation even when environments are diverse and change rapidly. We further show that even a small decrease in switching rates of faster-growing phenotypes can substantially increase long-term fitness of bet-hedging strategies. Together, our results suggest that stochastic strategies may play an even bigger role for microbial adaptation than hitherto appreciated.
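The trade-off described in this abstract can be illustrated with a toy simulation. This is a minimal sketch and not the authors' model: the discrete-time dynamics, the cyclic environment schedule, and all parameter values (`n_phenotypes`, `env_duration`, the switching probabilities) are assumptions chosen for illustration. It compares a constant switching probability with one that is lower for the currently fast-growing phenotype.

```python
import numpy as np

def long_term_growth(switch_adapted, switch_maladapted, n_phenotypes=5,
                     env_duration=30, n_periods=40, g_fast=1.0, g_slow=0.0):
    """Average log-growth per step of a stochastically switching population.

    Hypothetical toy model: in each environment exactly one phenotype grows
    at rate g_fast, all others at g_slow; environments cycle in fixed order.
    """
    K = n_phenotypes
    pop = np.full(K, 1.0 / K)          # fraction of cells in each phenotype
    log_size = 0.0
    for period in range(n_periods):
        env = period % K               # phenotype `env` is the well-adapted one
        growth = np.full(K, g_slow)
        growth[env] = g_fast
        leave = np.full(K, switch_maladapted)
        leave[env] = switch_adapted    # switching probability of adapted cells
        for _ in range(env_duration):
            pop = pop * np.exp(growth)
            movers = pop * leave
            # Movers from phenotype i are spread evenly over the other K-1.
            pop = pop - movers + (movers.sum() - movers) / (K - 1)
            total = pop.sum()
            log_size += np.log(total)  # accumulate growth, then renormalise
            pop /= total
    return log_size / (n_periods * env_duration)

# Same exploration rate for maladapted cells; only adapted cells differ.
constant = long_term_growth(switch_adapted=0.1, switch_maladapted=0.1)
grds = long_term_growth(switch_adapted=0.01, switch_maladapted=0.1)
print(f"constant switching: {constant:.3f}   growth-rate dependent: {grds:.3f}")
```

In this sketch the growth-rate dependent variant keeps exploration by maladapted cells unchanged while reducing the loss of well-adapted cells, which is the intuition the abstract gives for GRDS.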
Subjects
Acclimatization; Biological Evolution; Phenotype; Adaptation, Physiological/genetics
ABSTRACT
Single-cell RNA sequencing (scRNA-seq) has become a popular experimental method to study variation of gene expression within a population of cells. However, obtaining an accurate picture of the diversity of distinct gene expression states that are present in a given dataset is highly challenging because of the sparsity of the scRNA-seq data and its inhomogeneous measurement noise properties. Although a vast number of different methods is applied in the literature for clustering cells into subsets with 'similar' expression profiles, these methods generally lack rigorously specified objectives, involve multiple complex layers of normalization, filtering, feature selection, dimensionality-reduction, employ ad hoc measures of distance or similarity between cells, often ignore the known measurement noise properties of scRNA-seq measurements, and include a large number of tunable parameters. Consequently, it is virtually impossible to assign concrete biophysical meaning to the clusterings that result from these methods. Here we address the following problem: Given raw unique molecule identifier (UMI) counts of an scRNA-seq dataset, partition the cells into subsets such that the gene expression states of the cells in each subset are statistically indistinguishable, and each subset corresponds to a distinct gene expression state. That is, we aim to partition cells so as to maximally reduce the complexity of the dataset without removing any of its meaningful structure. We show that, given the known measurement noise structure of scRNA-seq data, this problem is mathematically well-defined and derive its unique solution from first principles. We have implemented this solution in a tool called Cellstates which operates directly on the raw data and automatically determines the optimal partition and cluster number, with zero tunable parameters. We show that, on synthetic datasets, Cellstates almost perfectly recovers optimal partitions. 
On real data, Cellstates robustly identifies subtle substructure within groups of cells that are traditionally annotated as a common cell type. Moreover, we show that the diversity of gene expression states that Cellstates identifies systematically depends on the tissue of origin and not on technical features of the experiments such as the total number of cells and total UMI count per cell. In addition to the Cellstates tool we also provide a small toolbox of software to place the identified cellstates into a hierarchical tree of higher-order clusters, to identify the most important differentially expressed genes at each branch of this hierarchy, and to visualize these results.
Subjects
Computational Biology; RNA-Seq; Sequence Analysis, RNA; Single-Cell Analysis; Single-Cell Analysis/methods; RNA-Seq/methods; Computational Biology/methods; Cluster Analysis; Sequence Analysis, RNA/methods; Gene Expression Profiling/methods; Humans; Algorithms; Animals; Single-Cell Gene Expression Analysis
ABSTRACT
Combinatorial interactions among transcription factors are critical to directing tissue-specific gene expression. To build a global atlas of these combinations, we have screened for physical interactions among the majority of human and mouse DNA-binding transcription factors (TFs). The complete networks contain 762 human and 877 mouse interactions. Analysis of the networks reveals that highly connected TFs are broadly expressed across tissues, and that roughly half of the measured interactions are conserved between mouse and human. The data highlight the importance of TF combinations for determining cell fate, and they lead to the identification of a SMAD3/FLI1 complex expressed during development of immunity. The availability of large TF combinatorial networks in both human and mouse will provide many opportunities to study gene regulation, tissue differentiation, and mammalian evolution.
Subjects
Gene Expression Regulation; Gene Regulatory Networks; Transcription Factors/metabolism; Animals; Cell Differentiation; Evolution, Molecular; Humans; Mice; Monocytes/cytology; Organ Specificity; Smad3 Protein/metabolism; Trans-Activators/metabolism
ABSTRACT
Although it is well appreciated that gene expression is inherently noisy and that transcriptional noise is encoded in a promoter's sequence, little is known about the extent to which noise levels of individual promoters vary across growth conditions. Using flow cytometry, we here quantify transcriptional noise in Escherichia coli genome-wide across 8 growth conditions and find that noise levels systematically decrease with growth rate, with a condition-dependent lower bound on noise. Whereas constitutive promoters consistently exhibit low noise in all conditions, regulated promoters are both more noisy on average and more variable in noise across conditions. Moreover, individual promoters show highly distinct variation in noise across conditions. We show that a simple model of noise propagation from regulators to their targets can explain a significant fraction of the variation in relative noise levels and identifies TFs that most contribute to both condition-specific and condition-independent noise propagation. In addition, analysis of the genome-wide correlation structure of various gene properties shows that gene regulation, expression noise, and noise plasticity are all positively correlated genome-wide and vary independently of variations in absolute expression, codon bias, and evolutionary rate. Together, our results show that while absolute expression noise tends to decrease with growth rate, relative noise levels of genes are highly condition-dependent and determined by the propagation of noise through the gene regulatory network.
Subjects
Escherichia coli/genetics; Gene Expression Regulation, Bacterial/genetics; Promoter Regions, Genetic/genetics; Escherichia coli Proteins/genetics; Gene Expression/genetics; Gene Expression Profiling/methods; Gene Regulatory Networks/genetics; Genes, Reporter/genetics; Transcriptome/genetics
ABSTRACT
Populations of bacteria often undergo a lag in growth when switching conditions. Because growth lags can be large compared to typical doubling times, variations in growth lag are an important but often overlooked component of bacterial fitness in fluctuating environments. We here explore how growth lag variation is determined for the archetypical switch from glucose to lactose as a carbon source in Escherichia coli. First, we show that single-cell lags are bimodally distributed and controlled by a single-molecule trigger. That is, gene expression noise causes the population before the switch to divide into subpopulations with zero and nonzero lac operon expression. While "sensorless" cells with zero preexisting lac expression at the switch have long lags because they are unable to sense the lactose signal, any nonzero lac operon expression suffices to ensure a short lag. Second, we show that the growth lag at the population level depends crucially on the fraction of sensorless cells and that this fraction in turn depends sensitively on the growth condition before the switch. Consequently, even small changes in basal expression can significantly affect the fraction of sensorless cells, and thereby population lags and fitness under switching conditions, and may thus be subject to significant natural selection. Indeed, we show that condition-dependent population lags vary across wild E. coli isolates. Since many sensory genes are naturally expressed at low levels in conditions where their inducer is not present, bimodal responses due to subpopulations of sensorless cells may be a general mechanism inducing phenotypic heterogeneity and controlling population lags in switching environments. This mechanism also illustrates how gene expression noise can turn even a simple sensory gene circuit into a bet-hedging module and underlines the profound role of gene expression noise in regulatory responses.
Subjects
Escherichia coli/metabolism; Gene Expression Regulation, Bacterial/genetics; Genetic Fitness/physiology; Bacteria/genetics; Bacteria/metabolism; Environment; Escherichia coli/genetics; Escherichia coli Proteins/genetics; Escherichia coli Proteins/metabolism; Gene Expression Regulation, Bacterial/physiology; Gene Regulatory Networks/genetics; Gene-Environment Interaction; Genetic Fitness/genetics; Glucose/metabolism; Lac Operon; Lactose/metabolism; Phenotype
ABSTRACT
Although ChIP-seq has become a routine experimental approach for quantitatively characterizing the genome-wide binding of transcription factors (TFs), computational analysis procedures remain far from standardized, making it difficult to compare ChIP-seq results across experiments. In addition, although genome-wide binding patterns must ultimately be determined by local constellations of DNA-binding sites, current analysis is typically limited to identifying enriched motifs in ChIP-seq peaks. Here we present Crunch, a completely automated computational method that performs all ChIP-seq analysis from quality control through read mapping and peak detecting and that integrates comprehensive modeling of the ChIP signal in terms of known and novel binding motifs, quantifying the contribution of each motif and annotating which combinations of motifs explain each binding peak. By applying Crunch to 128 data sets from the ENCODE Project, we show that Crunch outperforms current peak finders and find that TFs naturally separate into "solitary TFs," for which a single motif explains the ChIP-peaks, and "cobinding TFs," for which multiple motifs co-occur within peaks. Moreover, for most data sets, the motifs that Crunch identified de novo outperform known motifs, and both the set of cobinding motifs and the top motif of solitary TFs are consistent across experiments and cell lines. Crunch is implemented as a web server, enabling standardized analysis of any collection of ChIP-seq data sets by simply uploading raw sequencing data. Results are provided both in a graphical web interface and as downloadable files.
Subjects
Chromatin Immunoprecipitation Sequencing; Computational Biology/methods; Transcription Factors/metabolism; Amino Acid Motifs; Animals; Binding Sites; Datasets as Topic; Humans; Nucleotide Motifs; Quality Control; Regulatory Sequences, Nucleic Acid
ABSTRACT
miRNAs are small RNAs that regulate gene expression post-transcriptionally. By repressing the translation and promoting the degradation of target mRNAs, miRNAs may reduce the cell-to-cell variability in protein expression, induce correlations between target expression levels, and provide a layer through which targets can influence each other's expression as "competing RNAs" (ceRNAs). However, experimental evidence for these behaviors is limited. Combining mathematical modeling with RNA sequencing of individual human embryonic kidney cells in which the expression of two distinct miRNAs was induced over a wide range, we have inferred parameters describing the response of hundreds of miRNA targets to miRNA induction. Individual targets have widely different response dynamics, and only a small proportion of predicted targets exhibit high sensitivity to miRNA induction. Our data reveal for the first time the response parameters of the entire network of endogenous miRNA targets to miRNA induction, demonstrating that miRNAs correlate target expression and at the same time increase the variability in expression of individual targets across cells. The approach is generalizable to other miRNAs and post-transcriptional regulators to improve the understanding of gene expression dynamics in individual cell types.
Subjects
Gene Regulatory Networks/genetics; MicroRNAs/genetics; RNA, Messenger/genetics; Single-Cell Analysis; Computational Biology; Gene Expression Profiling; Gene Expression Regulation/genetics; HEK293 Cells; Humans; Models, Theoretical; Sequence Analysis, RNA
ABSTRACT
Transcription factors (TFs) are key regulators of cell fate. The estimated 755 genes that encode DNA binding domain-containing proteins comprise ~5% of all Drosophila genes. However, the majority has remained uncharacterized so far due to the lack of proper genetic tools. We generated 594 site-directed transgenic Drosophila lines that contain integrations of individual UAS-TF constructs to facilitate spatiotemporally controlled misexpression in vivo. All transgenes were expressed in the developing wing, and two-thirds induced specific phenotypic defects. In vivo knockdown of the same genes yielded a phenotype for 50%, with both methods indicating a great potential for misexpression to characterize novel functions in wing growth, patterning, and development. Thus, our UAS-TF library provides an important addition to the genetic toolbox of Drosophila research, enabling the identification of several novel wing development-related TFs. In parallel, we established the chromatin landscape of wing imaginal discs by ChIP-seq analyses of five chromatin marks and RNA Pol II. Subsequent clustering revealed six distinct chromatin states, with two clusters showing enrichment for both active and repressive marks. TFs that carry such "bivalent" chromatin are highly enriched for causing misexpression phenotypes in the wing, and analysis of existing expression data shows that these TFs tend to be differentially expressed across the wing disc. Thus, bivalently marked chromatin can be used as a marker for spatially regulated TFs that are functionally relevant in a developing tissue.
Subjects
Body Patterning/genetics; Drosophila melanogaster/embryology; Imaginal Discs/embryology; Transcription Factors/genetics; Wings, Animal/embryology; Animals; Animals, Genetically Modified; Chromatin/genetics; Chromatin/metabolism; DNA Methylation/genetics; DNA-Binding Proteins/genetics; Drosophila Proteins/genetics; Drosophila melanogaster/genetics; Gene Expression Regulation, Developmental; Histones/genetics; Phenotype; Promoter Regions, Genetic/genetics; Protein Structure, Tertiary/genetics; RNA Interference; RNA Polymerase II/genetics; RNA, Small Interfering
ABSTRACT
Gene regulatory networks are ultimately encoded by the sequence-specific binding of transcription factors (TFs) to short DNA segments. Although it is customary to represent the binding specificity of a TF by a position-specific weight matrix (PSWM), which assumes each position within a site contributes independently to the overall binding affinity, evidence has been accumulating that there can be significant dependencies between positions. Unfortunately, methodological challenges have so far hindered the development of a practical and generally accepted extension of the PSWM model. On the one hand, simple models that only consider dependencies between nearest-neighbor positions are easy to use in practice, but fail to account for the distal dependencies that are observed in the data. On the other hand, models that allow for arbitrary dependencies are prone to overfitting, requiring regularization schemes that are difficult to use in practice for non-experts. Here we present a new regulatory motif model, called dinucleotide weight tensor (DWT), that incorporates arbitrary pairwise dependencies between positions in binding sites, rigorously from first principles, and free from tunable parameters. We demonstrate the power of the method on a large set of ChIP-seq data sets, showing that DWTs outperform both PSWMs and motif models that only incorporate nearest-neighbor dependencies. We also demonstrate that DWTs outperform two previously proposed methods. Finally, we show that DWTs inferred from ChIP-seq data also outperform PSWMs on HT-SELEX data for the same TF, suggesting that DWTs capture inherent biophysical properties of the interactions between the DNA binding domains of TFs and their binding sites. We make a suite of DWT tools available at dwt.unibas.ch that allow users to automatically perform 'motif finding', i.e. the inference of DWT motifs from a set of sequences, binding site prediction with DWTs, and visualization of DWT 'dilogo' motifs.
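The independence assumption of the PSWM mentioned in this abstract can be made concrete with a short example. This is a minimal sketch of a generic PSWM score, not the DWT method and not the code of the tools at dwt.unibas.ch; the example sites, the pseudocount, and the uniform background are assumptions made for illustration.

```python
import math
from collections import Counter

BASES = "ACGT"

def build_pswm(sites, pseudocount=1.0):
    """Estimate per-position base probabilities from equal-length aligned sites."""
    length = len(sites[0])
    total = len(sites) + pseudocount * len(BASES)
    pswm = []
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        pswm.append({b: (counts[b] + pseudocount) / total for b in BASES})
    return pswm

def log_odds_score(seq, pswm, background=0.25):
    """Sum of per-position log-odds terms.

    Every term depends on a single position, so no pair of positions can
    interact; this is the independence assumption of the PSWM.
    """
    return sum(math.log(col[base] / background) for base, col in zip(seq, pswm))

sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]  # made-up example sites
pswm = build_pswm(sites)
consensus_score = log_odds_score("TGACTCA", pswm)
unrelated_score = log_odds_score("ACGTACG", pswm)
print(consensus_score, unrelated_score)
```

A model with pairwise dependencies would add terms that depend on two positions at once, which a sum of this form cannot represent.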
Subjects
Binding Sites/genetics; Computational Biology/methods; DNA; Nucleotide Motifs/genetics; Transcription Factors; DNA/chemistry; DNA/genetics; DNA/metabolism; Models, Statistical; RNA/chemistry; RNA/genetics; RNA/metabolism; Sequence Analysis, DNA; Transcription Factors/chemistry; Transcription Factors/genetics; Transcription Factors/metabolism
ABSTRACT
Accurate reconstruction of the regulatory networks that control gene expression is one of the key current challenges in molecular biology. Although gene expression and chromatin state dynamics are ultimately encoded by constellations of binding sites recognized by regulators such as transcription factors (TFs) and microRNAs (miRNAs), our understanding of this regulatory code and its context-dependent read-out remains very limited. Given that there are thousands of potential regulators in mammals, it is not practical to use direct experimentation to identify which of these play a key role for a particular system of interest. We developed a methodology that models gene expression or chromatin modifications in terms of genome-wide predictions of regulatory sites and completely automated it into a web-based tool called ISMARA (Integrated System for Motif Activity Response Analysis). Given only gene expression or chromatin state data across a set of samples as input, ISMARA identifies the key TFs and miRNAs driving expression/chromatin changes and makes detailed predictions regarding their regulatory roles. These include predicted activities of the regulators across the samples, their genome-wide targets, enriched gene categories among the targets, and direct interactions between the regulators. Applying ISMARA to data sets from well-studied systems, we show that it consistently identifies known key regulators ab initio. We also present a number of novel predictions including regulatory interactions in innate immunity, a master regulator of mucociliary differentiation, TFs consistently dysregulated in cancer, and TFs that mediate specific chromatin modifications.
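One simple way to model expression "in terms of predicted regulatory sites" is a linear model in which each promoter's expression is a sum of motif activities weighted by its site counts. The sketch below assumes that form and fits it by ridge regression on synthetic data; it is not the ISMARA implementation, and the function name `infer_motif_activities`, the matrix sizes, and the `ridge` value are hypothetical.

```python
import numpy as np

def infer_motif_activities(expression, site_counts, ridge=1e-6):
    """Fit expression ≈ site_counts @ activities by ridge-regularised least squares.

    expression:  promoters x samples matrix (e.g. log expression)
    site_counts: promoters x motifs matrix of predicted binding sites
    Returns a motifs x samples matrix of activities, each row relative to
    that motif's average over samples.
    """
    # Centre each promoter across samples so per-promoter baselines drop out.
    E = expression - expression.mean(axis=1, keepdims=True)
    gram = site_counts.T @ site_counts + ridge * np.eye(site_counts.shape[1])
    return np.linalg.solve(gram, site_counts.T @ E)

# Synthetic, noise-free data (assumed for illustration only).
rng = np.random.default_rng(0)
site_counts = rng.poisson(1.0, size=(300, 4)).astype(float)
true_activities = rng.normal(size=(4, 6))
true_activities -= true_activities.mean(axis=1, keepdims=True)
expression = site_counts @ true_activities

estimated = infer_motif_activities(expression, site_counts)
print(np.round(estimated - true_activities, 6))
```

With many more promoters than motifs the system is strongly overdetermined, which is why a few activities can be estimated from a large expression matrix.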
Subjects
Genome, Human; Models, Genetic; Nucleotide Motifs; Regulatory Sequences, Nucleic Acid; Sequence Analysis, DNA/methods; Algorithms; Animals; Chromatin Assembly and Disassembly; Humans; Mice
ABSTRACT
Methylation of cytosines is an essential epigenetic modification in mammalian genomes, yet the rules that govern methylation patterns remain largely elusive. To gain insights into this process, we generated base-pair-resolution mouse methylomes in stem cells and neuronal progenitors. Advanced quantitative analysis identified low-methylated regions (LMRs) with an average methylation of 30%. These represent CpG-poor distal regulatory regions as evidenced by location, DNase I hypersensitivity, presence of enhancer chromatin marks and enhancer activity in reporter assays. LMRs are occupied by DNA-binding factors and their binding is necessary and sufficient to create LMRs. A comparison of neuronal and stem-cell methylomes confirms this dependency, as cell-type-specific LMRs are occupied by cell-type-specific transcription factors. This study provides methylome references for the mouse and shows that DNA-binding factors locally influence DNA methylation, enabling the identification of active regulatory regions.
Subjects
Cytosine/metabolism; DNA Methylation; DNA-Binding Proteins/metabolism; Epigenomics; Animals; Cell Differentiation; CpG Islands; Embryonic Stem Cells/cytology; Mice; Neurons/cytology; Promoter Regions, Genetic/genetics; Protein Binding; Stem Cells/cytology; Transcription Factors/metabolism
ABSTRACT
The cellular changes during an epithelial-mesenchymal transition (EMT) largely rely on global changes in gene expression orchestrated by transcription factors. Tead transcription factors and their transcriptional co-activators Yap and Taz have been previously implicated in promoting an EMT; however, their direct transcriptional target genes and their functional role during EMT have remained elusive. We have uncovered a previously unanticipated role of the transcription factor Tead2 during EMT. During EMT in mammary gland epithelial cells and breast cancer cells, levels of Tead2 increase in the nucleus of cells, thereby directing a predominant nuclear localization of its co-factors Yap and Taz via the formation of Tead2-Yap-Taz complexes. Genome-wide chromatin immunoprecipitation and next generation sequencing in combination with gene expression profiling revealed the transcriptional targets of Tead2 during EMT. Among these, zyxin contributes to the migratory and invasive phenotype evoked by Tead2. The results demonstrate that Tead transcription factors are crucial regulators of the cellular distribution of Yap and Taz, and together they control the expression of genes critical for EMT and metastasis.
Subjects
Adaptor Proteins, Signal Transducing/metabolism; DNA-Binding Proteins/biosynthesis; Epithelial-Mesenchymal Transition/physiology; Phosphoproteins/metabolism; Transcription Factors/biosynthesis; Transcription Factors/metabolism; Zyxin/biosynthesis; Adaptor Proteins, Signal Transducing/genetics; Animals; Cell Cycle Proteins; Cell Growth Processes/physiology; Cell Line, Tumor; DNA-Binding Proteins/genetics; Female; Mammary Glands, Animal/cytology; Mammary Glands, Animal/metabolism; Mammary Neoplasms, Experimental/genetics; Mammary Neoplasms, Experimental/metabolism; Mammary Neoplasms, Experimental/pathology; Mice; Mice, Inbred BALB C; Mice, Nude; Mice, Transgenic; Phosphoproteins/genetics; Signal Transduction; TEA Domain Transcription Factors; Trans-Activators; Transcription Factors/genetics; Transcriptional Activation; YAP-Signaling Proteins; Zyxin/genetics
ABSTRACT
Although changes in chromatin are integral to transcriptional reprogramming during cellular differentiation, it is currently unclear how chromatin modifications are targeted to specific loci. To systematically identify transcription factors (TFs) that can direct chromatin changes during cell fate decisions, we model the relationship between genome-wide dynamics of chromatin marks and the local occurrence of computationally predicted TF binding sites. By applying this computational approach to a time course of Polycomb-mediated H3K27me3 marks during neuronal differentiation of murine stem cells, we identify several motifs that likely regulate the dynamics of this chromatin mark. Among these, the sites bound by REST and by the SNAIL family of TFs are predicted to transiently recruit H3K27me3 in neuronal progenitors. We validate these predictions experimentally and show that absence of REST indeed causes loss of H3K27me3 at target promoters in trans, specifically at the neuronal progenitor state. Moreover, using targeted transgenic insertion, we show that promoter fragments containing REST or SNAIL binding sites are sufficient to recruit H3K27me3 in cis, while deletion of these sites results in loss of H3K27me3. These findings illustrate that the occurrence of TF binding sites can determine chromatin dynamics. Local determination of Polycomb activity by REST and SNAIL motifs exemplifies such TF based regulation of chromatin. Furthermore, our results show that key TFs can be identified ab initio through computational modeling of epigenome data sets using a modeling approach that we make readily accessible.
Subjects
Chromatin Assembly and Disassembly; Epigenesis, Genetic; Models, Genetic; Polycomb-Group Proteins/metabolism; Transcription Factors/metabolism; Animals; Binding Sites; Cattle; Cell Differentiation; Chromatin/metabolism; Dogs; Genome; Histones/metabolism; Horses; Humans; Macaca; Mice; Neurons/cytology; Opossums; Promoter Regions, Genetic; Snail Family Transcription Factors; Stem Cells/cytology; Transgenes
ABSTRACT
We introduce a biophysical model of miRNA-target interaction and infer its parameters from Argonaute 2 cross-linking and immunoprecipitation data. We show that a substantial fraction of human miRNA target sites are noncanonical and that predicted target-site affinity correlates well with the extent of target destabilization. Our model provides a rigorous biophysical approach to miRNA target identification beyond ad hoc miRNA seed-based methods.
Subjects
Argonaute Proteins/metabolism; Biophysical Phenomena; Gene Targeting; MicroRNAs/genetics; Models, Biological; RNA, Messenger/genetics; Argonaute Proteins/genetics; Base Pairing; Binding Sites; Data Interpretation, Statistical; Databases, Genetic; Gene Targeting/methods; HEK293 Cells; HeLa Cells; Humans; Immunoprecipitation; MicroRNAs/metabolism; Probability; Protein Binding; RNA, Messenger/metabolism; Transcriptome
ABSTRACT
Analysis of gene expression data remains one of the most promising avenues toward reconstructing genome-wide gene regulatory networks. However, the large dimensionality of the problem prohibits the fitting of explicit dynamical models of gene regulatory networks, whereas machine learning methods for dimensionality reduction such as clustering or principal component analysis typically fail to provide mechanistic interpretations of the reduced descriptions. To address this, we recently developed a general methodology called motif activity response analysis (MARA) that, by modeling gene expression patterns in terms of the activities of concrete regulators, accomplishes dramatic dimensionality reduction while retaining mechanistic biological interpretations of its predictions (Balwierz, 2014). Here we extend MARA by presenting ARMADA, which models the activity dynamics of regulators across a time course, and infers the causal interactions between the regulators that drive the dynamics of their activities across time. We have implemented ARMADA as part of our ISMARA webserver, ismara.unibas.ch, allowing any researcher to automatically apply it to any gene expression time course. To illustrate the method, we apply ARMADA to a time course of human umbilical vein endothelial cells treated with TNF. Remarkably, ARMADA is able to reproduce the complex observed motif activity dynamics using a relatively small set of interactions between the key regulators in this system. In addition, we show that ARMADA successfully infers many of the key regulatory interactions known to drive this inflammatory response and discuss several novel interactions that ARMADA predicts. In combination with ISMARA, ARMADA provides a powerful approach to generating plausible hypotheses for the key interactions between regulators that control gene expression in any system for which time course measurements are available.
Subjects
Gene Expression Profiling/methods; Gene Regulatory Networks/genetics; Systems Analysis; Algorithms; Amino Acid Motifs/genetics; Animals; Computational Biology/methods; Humans; Mice
ABSTRACT
We quantify the strength of miRNA-target interactions with MIRZA, a recently introduced biophysical model. We show that computationally predicted energies of interaction correlate strongly with the energies of interaction estimated from biochemical measurements of Michaelis-Menten constants. We further show that the accuracy of the MIRZA model can be improved taking into account recently emerged experimental data types. In particular, we use chimeric miRNA-mRNA sequences to infer a MIRZA-CHIMERA model and we provide a framework for inferring a similar model from measurements of rate constants of miRNA-mRNA interaction in the context of Argonaute proteins. Finally, based on a simple model of miRNA-based regulation, we discuss the importance of interaction energy and its variability between targets for the modulation of miRNA target expression in vivo.
Subjects
Gene Targeting/methods; MicroRNAs/chemistry; MicroRNAs/metabolism; Models, Molecular; Binding Sites/physiology; Humans; Protein Structure, Secondary
ABSTRACT
Gene regulatory interactions underlying the early stages of non-genotoxic carcinogenesis are poorly understood. Here, we have identified key candidate regulators of phenobarbital (PB)-mediated mouse liver tumorigenesis, a well-characterized model of non-genotoxic carcinogenesis, by applying a new computational modeling approach to a comprehensive collection of in vivo gene expression studies. We have combined our previously developed motif activity response analysis (MARA), which models gene expression patterns in terms of computationally predicted transcription factor binding sites, with singular value decomposition (SVD) of the inferred motif activities, to disentangle the roles that different transcriptional regulators play in specific biological pathways of tumor promotion. Furthermore, transgenic mouse models enabled us to identify which of these regulatory activities were downstream of constitutive androstane receptor and β-catenin signaling, both crucial components of PB-mediated liver tumorigenesis. We propose novel roles for E2F and ZFP161 in PB-mediated hepatocyte proliferation and suggest that PB-mediated suppression of ESR1 activity contributes to the development of a tumor-prone environment. Our study shows that combining MARA with SVD allows for automated identification of independent transcription regulatory programs within a complex in vivo tissue environment and provides novel mechanistic insights into PB-mediated hepatocarcinogenesis.
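The SVD step named in this abstract can be sketched on a matrix of motif activities. This is a minimal illustration on random numbers, not the authors' pipeline: the matrix shape (8 motifs by 12 samples) and the row-centring step are assumptions.

```python
import numpy as np

# Hypothetical activity matrix: rows are motifs, columns are samples.
rng = np.random.default_rng(1)
activities = rng.normal(size=(8, 12))

# Centre each motif across samples so components describe variation only.
centred = activities - activities.mean(axis=1, keepdims=True)

# Thin SVD: centred = U @ diag(singular_values) @ Vt
U, singular_values, Vt = np.linalg.svd(centred, full_matrices=False)

# Fraction of total variance carried by each component.
explained = singular_values**2 / np.sum(singular_values**2)

# Column k of U weights the motifs in component k; row k of Vt gives
# that component's profile across samples.
reconstructed = (U * singular_values) @ Vt
print(np.round(explained, 3))
```

Each component pairs a set of motif weights with a sample profile, which is what lets separate regulatory programs be read off from the leading components.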
Subjects
Carcinogenesis/genetics; Gene Expression Regulation, Neoplastic; Liver Neoplasms/genetics; Phenobarbital/toxicity; Transcription, Genetic/drug effects; Animals; Binding Sites; Cell Proliferation/drug effects; Computational Biology/methods; Computer Simulation; Constitutive Androstane Receptor; Gene Regulatory Networks; Liver/drug effects; Liver/metabolism; Liver Neoplasms/chemically induced; Liver Neoplasms/metabolism; Male; Mice; Nucleotide Motifs; Receptors, Cytoplasmic and Nuclear/metabolism; Signal Transduction; Transcription Factors/metabolism; beta Catenin/metabolism
ABSTRACT
The SIB Swiss Institute of Bioinformatics (www.isb-sib.ch) was created in 1998 as an institution to foster excellence in bioinformatics. It is renowned worldwide for its databases and software tools, such as UniProtKB/Swiss-Prot, PROSITE, SWISS-MODEL, and STRING, all of which are accessible on ExPASy.org, SIB's Bioinformatics Resource Portal. This article provides an overview of the scientific and training resources SIB has consistently been offering to the life science community for more than 15 years.
Subjects
Computational Biology; Databases, Chemical; Software; Biological Evolution; Biostatistics; Drug Design; Genomics; Humans; Internet; Protein Conformation; Proteomics; Systems Biology
ABSTRACT
Studies of microbial evolutionary dynamics are being transformed by the availability of affordable high-throughput sequencing technologies, which allow whole-genome sequencing of hundreds of related taxa in a single study. Reconstructing a phylogenetic tree of these taxa is generally a crucial step in any evolutionary analysis. Instead of constructing genome assemblies for all taxa, annotating these assemblies, and aligning orthologous genes, many recent studies 1) directly map raw sequencing reads to a single reference sequence, 2) extract single nucleotide polymorphisms (SNPs), and 3) infer the phylogenetic tree using maximum likelihood methods from the aligned SNP positions. However, here we show that, when using such methods to reconstruct phylogenies from sets of simulated sequences, both the exclusion of nonpolymorphic positions and the alignment to a single reference genome introduce systematic biases and errors in phylogeny reconstruction. To address these problems, we developed a new method that combines alignments from mappings to multiple reference sequences and show that this successfully removes biases from the reconstructed phylogenies. We implemented this method as a web server named REALPHY (Reference sequence Alignment-based Phylogeny builder), which fully automates phylogenetic reconstruction from raw sequencing reads.
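One effect of excluding nonpolymorphic positions can be seen with simple arithmetic: per-site distances computed on SNP columns alone are larger than those computed on the full alignment. The sketch below uses a made-up three-sequence alignment and the plain proportion of differing sites (p-distance); it is not the REALPHY method and does not reproduce the simulations described in the abstract.

```python
def p_distance(a, b):
    """Fraction of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def variable_columns(alignment):
    """Indices of columns in which the aligned sequences are not all identical."""
    return [i for i, col in enumerate(zip(*alignment)) if len(set(col)) > 1]

# Toy alignment (hypothetical): 10 positions, 2 of them polymorphic.
alignment = ["ACGTACGTAC",
             "ACGTACGTAT",
             "ACGAACGTAC"]

keep = variable_columns(alignment)
snp_only = ["".join(seq[i] for i in keep) for seq in alignment]

full = p_distance(alignment[0], alignment[1])   # 1 difference in 10 sites
snps = p_distance(snp_only[0], snp_only[1])     # 1 difference in 2 sites
print(keep, full, snps)
```

Here the same single substitution gives a distance of 0.1 on the full alignment and 0.5 on the SNP-only alignment, so a substitution model fitted to SNP columns sees far more change per site than actually occurred unless the missing invariant sites are accounted for.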