ABSTRACT
Ancestral sequence reconstruction is a technique that is gaining widespread use in molecular evolution studies and protein engineering. Accurate reconstruction requires the ability to appropriately handle large numbers of sequences, as well as insertion and deletion (indel) events, but available approaches exhibit limitations. To address these limitations, we developed Graphical Representation of Ancestral Sequence Predictions (GRASP), which efficiently implements maximum likelihood methods to enable the inference of ancestors of families with more than 10,000 members. GRASP implements partial order graphs (POGs) to represent and infer insertion and deletion events across ancestors, enabling the identification of building blocks for protein engineering. To validate the capacity to engineer novel proteins from realistic data, we predicted ancestor sequences across three distinct enzyme families: glucose-methanol-choline (GMC) oxidoreductases, cytochromes P450, and dihydroxy/sugar acid dehydratases (DHAD). All tested ancestors demonstrated enzymatic activity. Our study demonstrates the ability of GRASP (1) to support large data sets of more than 10,000 sequences and (2) to employ insertions and deletions to identify building blocks for engineering biologically active ancestors, by exploring variation over evolutionary time.
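The general idea of a partial order graph can be illustrated with a minimal sketch. It assumes nodes are alignment columns and edges join consecutive non-gap columns within each sequence, so an indel shows up as an edge that skips columns. The function name `build_pog` and the toy alignment are hypothetical, and this is not GRASP's own data structure or inference procedure, which the abstract does not describe.

```python
def build_pog(aligned):
    """Build a toy partial order graph from gapped, equal-length sequences.

    Returns a dict mapping (from_column, to_column) to the number of
    sequences that traverse that edge. Columns are 0-based.
    """
    edges = {}
    for seq in aligned:
        # Columns where this sequence has a residue rather than a gap.
        cols = [i for i, ch in enumerate(seq) if ch != "-"]
        # Consecutive occupied columns form an edge; gaps are skipped over.
        for a, b in zip(cols, cols[1:]):
            edges[(a, b)] = edges.get((a, b), 0) + 1
    return edges


# Hypothetical three-sequence alignment with a two-column indel.
edges = build_pog(["MKT-L", "MK--L", "MKTAL"])
print(edges[(1, 4)])  # edge that skips columns 2 and 3, used by one sequence
```

Edge counts of this kind only record which paths the extant sequences take; deciding which path an ancestor takes needs a separate inference step that this sketch leaves out.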
Subjects
Molecular Evolution, INDEL Mutation, INDEL Mutation/genetics, Proteins/genetics, Biological Evolution, Phylogeny
ABSTRACT
Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is the primary protocol for detecting genome-wide DNA-protein interactions, and therefore a key tool for understanding transcriptional regulation. A number of factors, including low antibody specificity and cellular heterogeneity within the sample, may cause "peak" callers to output noise and experimental artefacts. Statistically combining multiple experimental replicates from the same condition could significantly enhance our ability to distinguish actual transcription factor binding events, even when peak caller accuracy and consistency of detection are compromised. We adapted the rank-product test to statistically evaluate reproducibility across any number of ChIP-seq experimental replicates. We demonstrate over a number of benchmarks that our adaptation "ChIP-R" (pronounced 'chipper') performs as well as or better than comparable approaches on recovering transcription factor binding sites in ChIP-seq peak data. We also show ChIP-R extends to evaluate ATAC-seq peaks, finding reproducible peak sets even at low sequencing depth. ChIP-R decomposes peaks across replicates into "fragments" which either form part of a peak in a replicate, or not. We show that by re-analysing existing data sets, ChIP-R reconstructs reproducible peaks from fragments with enhanced biological enrichment relative to current strategies.
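The rank-product statistic underlying this kind of test can be sketched as follows. This is the generic statistic (the geometric mean of an item's ranks across replicates), not ChIP-R's implementation: the function name `rank_product` and the toy scores are assumptions, and the step that turns rank products into p-values is omitted.

```python
import math


def rank_product(scores_by_replicate):
    """Geometric mean of per-replicate ranks for each item.

    scores_by_replicate[r][i] is the signal of item i in replicate r;
    higher means stronger, and rank 1 is the strongest item.
    Ties are broken by item index, which is a simplification.
    """
    n_items = len(scores_by_replicate[0])
    log_sum = [0.0] * n_items
    for scores in scores_by_replicate:
        # Item indices ordered from strongest to weakest signal.
        order = sorted(range(n_items), key=lambda i: -scores[i])
        for rank, i in enumerate(order, start=1):
            log_sum[i] += math.log(rank)
    k = len(scores_by_replicate)
    # Averaging logs and exponentiating gives the geometric mean.
    return [math.exp(s / k) for s in log_sum]


# Hypothetical signal values for three fragments in three replicates.
rp = rank_product([[9, 5, 1], [8, 6, 2], [7, 9, 3]])
print(rp)  # smaller values indicate consistently high ranks
```

Items that rank near the top in every replicate get a small rank product, which is why the statistic suits reproducibility assessment across any number of replicates.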
Subjects
Algorithms, Chromatin Immunoprecipitation Sequencing, Binding Sites, Chromatin Immunoprecipitation/methods, High-Throughput Nucleotide Sequencing/methods, Reproducibility of Results, DNA Sequence Analysis/methods
ABSTRACT
Transcriptional regulation plays a central role in controlling neural stem and progenitor cell proliferation and differentiation during neurogenesis. For instance, transcription factors from the nuclear factor I (NFI) family have been shown to co-ordinate neural stem and progenitor cell differentiation within multiple regions of the embryonic nervous system, including the neocortex, hippocampus, spinal cord and cerebellum. Knockout of individual Nfi genes culminates in similar phenotypes, suggestive of common target genes for these transcription factors. However, whether or not the NFI family regulates common suites of genes remains poorly defined. Here, we use granule neuron precursors (GNPs) of the postnatal murine cerebellum as a model system to analyse regulatory targets of three members of the NFI family: NFIA, NFIB and NFIX. By integrating transcriptomic profiling (RNA-seq) of Nfia- and Nfix-deficient GNPs with epigenomic profiling (ChIP-seq against NFIA, NFIB and NFIX, and DNase I hypersensitivity assays), we reveal that these transcription factors share a large set of potential transcriptional targets, suggestive of complementary roles for these NFI family members in promoting neural development.
Subjects
Cerebellum/growth & development, Cerebellum/metabolism, NFI Transcription Factors/metabolism, Animals, Newborn Animals, Cerebellum/cytology, Chromatin Immunoprecipitation Sequencing/methods, Female, Male, Mice, Inbred C57BL Mice, Knockout Mice, NFI Transcription Factors/genetics, Neurogenesis/physiology, Pregnancy
ABSTRACT
During forebrain development, radial glia generate neurons through the production of intermediate progenitor cells (IPCs). The production of IPCs is a central tenet underlying the generation of the appropriate number of cortical neurons, but the transcriptional logic underpinning this process remains poorly defined. Here, we examined IPC production using mice lacking the transcription factor nuclear factor I/X (Nfix). We show that Nfix deficiency delays IPC production and prolongs the neurogenic window, resulting in an increased number of neurons in the postnatal forebrain. Loss of additional Nfi alleles (Nfib) resulted in a severe delay in IPC generation while, conversely, overexpression of NFIX led to precocious IPC generation. Mechanistically, analyses of microarray and ChIP-seq datasets, coupled with the investigation of spindle orientation during radial glial cell division, revealed that NFIX promotes the generation of IPCs via the transcriptional upregulation of inscuteable (Insc). These data thereby provide novel insights into the mechanisms controlling the timely transition of radial glia into IPCs during forebrain development.
Subjects
Cell Cycle Proteins/biosynthesis, Hippocampus/embryology, NFI Transcription Factors/genetics, Neural Stem Cells/cytology, Neurogenesis/genetics, Animals, Cell Cycle Proteins/genetics, Gene Expression Regulation, Mice, Knockout Mice, Neurogenesis/physiology, Neurons/cytology, Genetic Promoter Regions/genetics, Genetic Transcription, Transcriptional Activation/genetics
ABSTRACT
Transcription factors regulate gene expression and play an essential role in development by maintaining proliferative states, driving cellular differentiation and determining cell fate. Transcription factors are capable of regulating multiple genes over potentially long distances making target gene identification challenging. Currently available experimental approaches to detect distal interactions have multiple weaknesses that have motivated the development of computational approaches. Although an improvement over experimental approaches, existing computational approaches are still limited in their application, with different weaknesses depending on the approach. Here, we review computational approaches with a focus on data dependency, cell type specificity and usability. With the aim of identifying transcription factor target genes, we apply available approaches to typical transcription factor experimental datasets. We show that approaches are not always capable of annotating all transcription factor binding sites; binding sites should be treated disparately; and a combination of approaches can increase the biological relevance of the set of genes identified as targets.
Subjects
Computational Biology/methods, Gene Expression Regulation/genetics, Genetic Promoter Regions/genetics, Transcription Factors/genetics, Genetic Transcription/genetics, Animals, Binding Sites/genetics, Chromatin Immunoprecipitation/methods, Datasets as Topic, Humans, Protein Binding/genetics, DNA Sequence Analysis/methods, Software
ABSTRACT
OBJECTIVE: Nuclear Factor One X (NFIX) is a transcription factor expressed by neural stem cells within the developing mouse brain and spinal cord. In order to characterise the pathways by which NFIX may regulate neural stem cell biology within the developing mouse spinal cord, we performed a microarray-based transcriptomic analysis of the spinal cord of embryonic day (E)14.5 Nfix-/- mice in comparison to wild-type controls. DATA DESCRIPTION: Using microarray and differential gene expression analyses, we were able to identify differentially expressed genes in the spinal cords of E14.5 Nfix-/- mice compared to wild-type controls. We performed microarray-based expression profiling on spinal cords from n = 3 E14.5 Nfix-/- mice and n = 3 E14.5 Nfix+/+ mice. Differential gene expression analysis, using a false discovery rate (FDR)-adjusted p-value threshold of p < 0.05 and a fold change cut-off for differential expression of > ± 1.5, revealed 1351 differentially regulated genes in the spinal cord of Nfix-/- mice. Of these, 828 were upregulated, and 523 were downregulated. This resource provides a tool to interrogate the role of this transcription factor in spinal cord development.
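The thresholding described above can be sketched as a simple filter. The sketch assumes that "> ± 1.5" means a linear fold change of more than 1.5 in either direction and that fold changes are supplied on a log2 scale; the function name `classify_genes` and the gene labels are hypothetical, and the authors' actual analysis software is not stated here. The reported counts are consistent with each other (828 + 523 = 1351).

```python
import math


def classify_genes(results, fdr_cutoff=0.05, fold_cutoff=1.5):
    """Split genes into upregulated and downregulated lists.

    results: iterable of (gene, log2_fold_change, fdr) tuples.
    A gene passes if fdr < fdr_cutoff and its linear fold change
    exceeds fold_cutoff in either direction.
    """
    log2_cut = math.log2(fold_cutoff)  # about 0.585 for a 1.5-fold change
    up, down = [], []
    for gene, log2fc, fdr in results:
        if fdr >= fdr_cutoff:
            continue  # not significant after multiple-testing correction
        if log2fc > log2_cut:
            up.append(gene)
        elif log2fc < -log2_cut:
            down.append(gene)
    return up, down


# Hypothetical rows: (gene label, log2 fold change, FDR).
rows = [("geneA", 1.0, 0.01), ("geneB", -0.8, 0.02),
        ("geneC", 0.3, 0.001), ("geneD", 2.0, 0.2)]
up, down = classify_genes(rows)
print(up, down)
```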
Subjects
Developmental Gene Expression Regulation, NFI Transcription Factors, Animals, Gene Expression, Mice, Inbred C57BL Mice, NFI Transcription Factors/genetics, Spinal Cord
ABSTRACT
Cerebellar granule neurons are the most numerous neuronal subtype in the central nervous system. Within the developing cerebellum, these neurons are derived from a population of progenitor cells found within the external granule layer of the cerebellar anlage, namely the cerebellar granule neuron precursors (GNPs). The timely proliferation and differentiation of these precursor cells, which, in rodents, occurs predominantly in the postnatal period, is tightly controlled to ensure the normal morphogenesis of the cerebellum. Despite this, our understanding of the factors mediating how GNP differentiation is controlled remains limited. Here, we reveal that the transcription factor nuclear factor I X (NFIX) plays an important role in this process. Mice lacking Nfix exhibit reduced numbers of GNPs during early postnatal development, but elevated numbers of these cells at postnatal day 15. Moreover, Nfix-/- GNPs exhibit increased proliferation when cultured in vitro, suggestive of a role for NFIX in promoting GNP differentiation. At a mechanistic level, profiling analyses using both ChIP-seq and RNA-seq identified the actin-associated factor intersectin 1 as a downstream target of NFIX during cerebellar development. In support of this, mice lacking intersectin 1 also displayed delayed GNP differentiation. Collectively, these findings highlight a key role for NFIX and intersectin 1 in the regulation of cerebellar development.
Subjects
Vesicular Transport Adaptor Proteins/metabolism, Cell Proliferation/physiology, Cerebellum/cytology, NFI Transcription Factors/metabolism, Neural Stem Cells/cytology, Neurons/cytology, Vesicular Transport Adaptor Proteins/genetics, Animals, Cerebellum/growth & development, Cerebellum/metabolism, Developmental Gene Expression Regulation, Knockout Mice, NFI Transcription Factors/genetics, Neural Stem Cells/metabolism, Neurogenesis/physiology, Neurons/metabolism
ABSTRACT
Transcription factors from the nuclear factor one (NFI) family have been shown to play a central role in regulating neural progenitor cell differentiation within the embryonic and post-natal brain. NFIA and NFIB, for instance, promote the differentiation and functional maturation of granule neurons within the cerebellum. Mice lacking Nfix exhibit delays in the development of neuronal and glial lineages within the cerebellum, but the cell-type-specific expression of this transcription factor remains undefined. Here, we examined the expression of NFIX, together with various cell-type-specific markers, within the developing and adult cerebellum using both chromogenic immunohistochemistry and co-immunofluorescence labelling and confocal microscopy. In embryos, NFIX was expressed by progenitor cells within the rhombic lip and ventricular zone. After birth, progenitor cells within the external granule layer, as well as migrating and mature granule neurons, expressed NFIX. Within the adult cerebellum, NFIX displayed a broad expression profile, and was evident within granule cells, Bergmann glia, and interneurons, but not within Purkinje neurons. Furthermore, transcriptomic profiling of cerebellar granule neuron progenitor cells showed that multiple splice variants of Nfix are expressed within this germinal zone of the post-natal brain. Collectively, these data suggest that NFIX plays a role in regulating progenitor cell biology within the embryonic and post-natal cerebellum, as well as an ongoing role within multiple neuronal and glial populations within the adult cerebellum.
Subjects
Cell Differentiation/physiology, Cerebellum/cytology, NFI Transcription Factors/metabolism, Neural Stem Cells/metabolism, Neuroglia/metabolism, Aging, Animals, Astrocytes/metabolism, Cerebellum/growth & development, Developmental Gene Expression Regulation/physiology, Inbred C57BL Mice, Neurogenesis/physiology, Neurons/metabolism
ABSTRACT
More than 30 human genetic diseases are linked to tri-nucleotide repeat expansions. There is no known mechanism that explains repeat expansions in full, but changes in the epigenetic state of the associated locus have been implicated in the disease pathology for a growing number of examples. A comprehensive comparative analysis of the genomic features associated with diverse repeat expansions has been lacking. Here, in an effort to decipher the propensity of repeats to undergo expansion and result in a disease state, we determine the genomic coordinates of tri-nucleotide repeat tracts at base pair resolution and computationally establish epigenetic profiles around them. Using three complementary statistical tests, we reveal that several epigenetic states are enriched around repeats that are associated with disease, even in cells that do not harbor expansion, relative to a carefully stratified background. Analysis of over one hundred cell types reveals that epigenetic states generally tend to vary widely between genic regions and cell types. However, there is qualified consistency in the epigenetic signatures of repeats associated with disease, suggesting that changes to the chromatin and the DNA around an expanding repeat locus are likely to be similar. These epigenetic signatures may be exploited further to develop models that could explain the propensity of repeats to undergo expansions.
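One simple way to locate tri-nucleotide repeat tracts at base pair resolution is a regular-expression scan. The sketch below is not the authors' pipeline: the function name `find_trinucleotide_tracts`, the minimum of five units, and the example sequence are all assumptions chosen for illustration.

```python
import re


def find_trinucleotide_tracts(seq, min_units=5):
    """Return (start, end, unit) for perfect tri-nucleotide repeat tracts.

    Coordinates are 0-based and half-open. Homopolymer runs (e.g. AAA
    repeated) are skipped. The reported unit depends on where the scan
    first matches, so CAG, AGC and GCA can label the same kind of repeat.
    """
    # A 3-base unit followed by at least (min_units - 1) exact copies.
    pattern = re.compile(r"([ACGT]{3})\1{%d,}" % (min_units - 1))
    tracts = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        if len(set(unit)) == 1:
            continue  # a homopolymer run, not a true tri-nucleotide repeat
        tracts.append((m.start(), m.end(), unit))
    return tracts


# Hypothetical sequence: six CAG units flanked by unrelated bases.
example = "TTGA" + "CAG" * 6 + "TTAC"
print(find_trinucleotide_tracts(example))  # [(4, 22, 'CAG')]
```

A scan like this finds only perfect repeats; interrupted tracts would need a more tolerant matcher.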