Results 1 - 15 of 15
1.
Cell ; 158(6): 1431-1443, 2014 Sep 11.
Article in English | MEDLINE | ID: mdl-25215497

ABSTRACT

Transcription factor (TF) DNA sequence preferences direct their regulatory activity, but are currently known for only ∼1% of eukaryotic TFs. Broadly sampling DNA-binding domain (DBD) types from multiple eukaryotic clades, we determined DNA sequence preferences for >1,000 TFs encompassing 54 different DBD classes from 131 diverse eukaryotes. We find that closely related DBDs almost always have very similar DNA sequence preferences, enabling inference of motifs for ∼34% of the ∼170,000 known or predicted eukaryotic TFs. Sequences matching both measured and inferred motifs are enriched in chromatin immunoprecipitation sequencing (ChIP-seq) peaks and upstream of transcription start sites in diverse eukaryotic lineages. SNPs defining expression quantitative trait loci in Arabidopsis promoters are also enriched for predicted TF binding sites. Importantly, our motif "library" can be used to identify specific TFs whose binding may be altered by human disease risk alleles. These data present a powerful resource for mapping transcriptional networks across eukaryotes.


Subject(s)
Arabidopsis/genetics , Nucleotide Motifs , Sequence Analysis, DNA , Transcription Factors/metabolism , Arabidopsis/metabolism , Chromatin Immunoprecipitation , Humans , Polymorphism, Single Nucleotide , Promoter Regions, Genetic , Protein Binding , Quantitative Trait Loci
2.
Nature ; 513(7516): 65-70, 2014 Sep 04.
Article in English | MEDLINE | ID: mdl-25079319

ABSTRACT

The translational control of oncoprotein expression is implicated in many cancers. Here we report an eIF4A RNA helicase-dependent mechanism of translational control that contributes to oncogenesis and underlies the anticancer effects of silvestrol and related compounds. For example, eIF4A promotes T-cell acute lymphoblastic leukaemia development in vivo and is required for leukaemia maintenance. Accordingly, inhibition of eIF4A with silvestrol has powerful therapeutic effects against murine and human leukaemic cells in vitro and in vivo. We use transcriptome-scale ribosome footprinting to identify the hallmarks of eIF4A-dependent transcripts. These include 5' untranslated region (UTR) sequences such as the 12-nucleotide guanine quartet (CGG)4 motif that can form RNA G-quadruplex structures. Notably, among the most eIF4A-dependent and silvestrol-sensitive transcripts are a number of oncogenes, superenhancer-associated transcription factors, and epigenetic regulators. Hence, the 5' UTRs of select cancer genes harbour a targetable requirement for the eIF4A RNA helicase.
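As a rough illustration of the 12-nucleotide (CGG)4 motif mentioned in the abstract, the sketch below scans a 5' UTR sequence for literal occurrences of four consecutive CGG triplets. The function name and the example sequence are hypothetical; this is a plain string search, not a G-quadruplex structure prediction and not the motif analysis used in the paper.

```python
import re

# Four consecutive CGG triplets (12 nt). The lookahead lets overlapping
# occurrences inside longer CGG repeats be reported as well.
CGG4_PATTERN = re.compile(r"(?=CGGCGGCGGCGG)")


def find_cgg4_sites(utr_sequence: str) -> list[int]:
    """Return 0-based start positions of every (CGG)4 match in the sequence.

    Hypothetical helper for illustration only: it matches the literal
    12-mer and says nothing about whether a G-quadruplex actually forms.
    """
    sequence = utr_sequence.upper()
    return [match.start() for match in CGG4_PATTERN.finditer(sequence)]


if __name__ == "__main__":
    # Made-up 5' UTR fragment containing five CGG triplets in a row.
    example_utr = "AA" + "CGG" * 5 + "TT"
    print(find_cgg4_sites(example_utr))
```

A literal match like this can only flag candidate sites; whether a given site is eIF4A-dependent is an experimental question addressed in the paper by ribosome footprinting.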


Subject(s)
5' Untranslated Regions/genetics , Eukaryotic Initiation Factor-4A/metabolism , G-Quadruplexes , Oncogene Proteins/biosynthesis , Oncogene Proteins/genetics , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/metabolism , Protein Biosynthesis , Animals , Antineoplastic Agents, Phytogenic/pharmacology , Antineoplastic Agents, Phytogenic/therapeutic use , Base Sequence , Cell Line, Tumor , Epigenesis, Genetic , Female , Humans , Mice , Mice, Inbred C57BL , Nucleotide Motifs , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/drug therapy , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/genetics , Protein Biosynthesis/drug effects , Ribosomes/metabolism , Transcription Factors/metabolism , Transcription, Genetic/drug effects , Transcription, Genetic/genetics , Triterpenes/pharmacology
3.
Plant Cell ; 28(11): 2715-2734, 2016 Nov.
Article in English | MEDLINE | ID: mdl-27803310

ABSTRACT

Plants use light as a source of energy and information to detect diurnal rhythms and seasonal changes. Sensing changing light conditions is critical to adjust plant metabolism and to initiate developmental transitions. Here, we analyzed transcriptome-wide alterations in gene expression and alternative splicing (AS) of etiolated seedlings undergoing photomorphogenesis upon exposure to blue, red, or white light. Our analysis revealed massive transcriptome reprogramming as reflected by differential expression of ∼20% of all genes and changes in several hundred AS events. For more than 60% of all regulated AS events, light promoted the production of a presumably protein-coding variant at the expense of an mRNA with nonsense-mediated decay-triggering features. Accordingly, AS of the putative splicing factor REDUCED RED-LIGHT RESPONSES IN CRY1CRY2 BACKGROUND1, previously identified as a red light signaling component, was shifted to the functional variant under light. Downstream analyses of candidate AS events pointed at a role of photoreceptor signaling only in monochromatic but not in white light. Furthermore, we demonstrated similar AS changes upon light exposure and exogenous sugar supply, with a critical involvement of kinase signaling. We propose that AS is an integration point of signaling pathways that sense and transmit information regarding the energy availability in plants.


Subject(s)
Alternative Splicing/physiology , Arabidopsis Proteins/metabolism , Arabidopsis/genetics , Transcriptome/genetics , Alternative Splicing/genetics , Arabidopsis/physiology , Arabidopsis Proteins/genetics , Gene Expression Regulation, Plant/genetics , Gene Expression Regulation, Plant/physiology , Signal Transduction/genetics , Signal Transduction/physiology
4.
Bioinformatics ; 33(1): 139-141, 2017 Jan 01.
Article in English | MEDLINE | ID: mdl-27634950

ABSTRACT

MOTIVATION: Deep sequencing based ribosome footprint profiling can provide novel insights into the regulatory mechanisms of protein translation. However, the observed ribosome profile is fundamentally confounded by transcriptional activity. In order to decipher principles of translation regulation, tools that can reliably detect changes in translation efficiency in case-control studies are needed. RESULTS: We present a statistical framework and an analysis tool, RiboDiff, to detect genes with changes in translation efficiency across experimental treatments. RiboDiff uses generalized linear models to estimate the over-dispersion of RNA-Seq and ribosome profiling measurements separately, and performs a statistical test for differential translation efficiency using both mRNA abundance and ribosome occupancy. AVAILABILITY AND IMPLEMENTATION: RiboDiff webpage: http://bioweb.me/ribodiff. Source code, including scripts for preprocessing the FASTQ data, is available at http://github.com/ratschlab/ribodiff. CONTACTS: zhongy@cbio.mskcc.org or raetsch@inf.ethz.ch. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
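To make the idea of translation efficiency concrete, the sketch below computes it as the ratio of ribosome footprint counts to mRNA counts and reports the log2 change between two conditions. This is a simplified point estimate under stated assumptions, not RiboDiff's generalized linear model or its statistical test: the function names and the pseudocount are hypothetical, and the counts are assumed to be already normalized for library size.

```python
import math


def translation_efficiency(ribo_count: float, rna_count: float,
                           pseudocount: float = 0.5) -> float:
    """Ribosome occupancy divided by mRNA abundance for one gene.

    The pseudocount (an assumption of this sketch) avoids division by
    zero for genes with no RNA-Seq reads.
    """
    return (ribo_count + pseudocount) / (rna_count + pseudocount)


def log2_te_change(ribo_ctrl: float, rna_ctrl: float,
                   ribo_treat: float, rna_treat: float,
                   pseudocount: float = 0.5) -> float:
    """log2 ratio of translation efficiency, treatment over control."""
    te_ctrl = translation_efficiency(ribo_ctrl, rna_ctrl, pseudocount)
    te_treat = translation_efficiency(ribo_treat, rna_treat, pseudocount)
    return math.log2(te_treat / te_ctrl)


if __name__ == "__main__":
    # Footprints halve while mRNA is unchanged: efficiency drops.
    print(log2_te_change(100, 100, 50, 100, pseudocount=0))
    # Footprints and mRNA double together: a transcriptional change only.
    print(log2_te_change(100, 100, 200, 200, pseudocount=0))
```

The second call shows why ribosome profiling alone is confounded by transcription: footprint counts double, yet the efficiency ratio does not move. A point estimate like this carries no measure of uncertainty, which is the gap the abstract says RiboDiff's dispersion estimates and test are meant to fill.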


Subject(s)
Protein Biosynthesis , RNA, Messenger/metabolism , Ribosomes/metabolism , Sequence Analysis, RNA/methods , Software , Gene Expression Regulation , High-Throughput Nucleotide Sequencing/methods
5.
Nature ; 477(7365): 419-23, 2011 Aug 28.
Article in English | MEDLINE | ID: mdl-21874022

ABSTRACT

Genetic differences between Arabidopsis thaliana accessions underlie the plant's extensive phenotypic variation, and until now these have been interpreted largely in the context of the annotated reference accession Col-0. Here we report the sequencing, assembly and annotation of the genomes of 18 natural A. thaliana accessions, and their transcriptomes. When assessed on the basis of the reference annotation, one-third of protein-coding genes are predicted to be disrupted in at least one accession. However, re-annotation of each genome revealed that alternative gene models often restore coding potential. Gene expression in seedlings differed for nearly half of expressed genes and was frequently associated with cis variants within 5 kilobases, as were intron retention alternative splicing events. Sequence and expression variation is most pronounced in genes that respond to the biotic environment. Our data further promote evolutionary and functional studies in A. thaliana, especially the MAGIC genetic reference population descended from these accessions.


Subject(s)
Arabidopsis/genetics , Gene Expression Profiling , Gene Expression Regulation, Plant/genetics , Genome, Plant/genetics , Transcription, Genetic/genetics , Arabidopsis/classification , Arabidopsis Proteins/genetics , Base Sequence , Genes, Plant/genetics , Genomics , Haplotypes/genetics , INDEL Mutation/genetics , Molecular Sequence Annotation , Phylogeny , Polymorphism, Single Nucleotide/genetics , Proteome/genetics , Seedlings/genetics , Sequence Analysis, DNA
6.
Plant Cell ; 25(10): 3726-42, 2013 Oct.
Article in English | MEDLINE | ID: mdl-24163313

ABSTRACT

The nonsense-mediated decay (NMD) surveillance pathway can recognize erroneous transcripts and physiological mRNAs, such as precursor mRNA alternative splicing (AS) variants. Currently, information on the global extent of coupled AS and NMD remains scarce and even absent for any plant species. To address this, we conducted transcriptome-wide splicing studies using Arabidopsis thaliana mutants in the NMD factor homologs UP FRAMESHIFT1 (UPF1) and UPF3 as well as wild-type samples treated with the translation inhibitor cycloheximide. Our analyses revealed that at least 17.4% of all multi-exon, protein-coding genes produce splicing variants that are targeted by NMD. Moreover, we provide evidence that UPF1 and UPF3 act in a translation-independent mRNA decay pathway. Importantly, 92.3% of the NMD-responsive mRNAs exhibit classical NMD-eliciting features, supporting their authenticity as direct targets. Genes generating NMD-sensitive AS variants function in diverse biological processes, including signaling and protein modification, for which NaCl stress-modulated AS-NMD was found. Besides mRNAs, numerous noncoding RNAs and transcripts derived from intergenic regions were shown to be NMD responsive. In summary, we provide evidence for a major function of AS-coupled NMD in shaping the Arabidopsis transcriptome, having fundamental implications in gene regulation and quality control of transcript processing.


Subject(s)
Alternative Splicing , Arabidopsis/genetics , Nonsense Mediated mRNA Decay , Transcriptome , Arabidopsis Proteins/genetics , Gene Expression Regulation, Plant , Genotype , Mutation , RNA Helicases/genetics , RNA, Plant/genetics , Sequence Analysis, RNA
7.
Bioinformatics ; 30(9): 1300-1, 2014 May 01.
Article in English | MEDLINE | ID: mdl-24413671

ABSTRACT

We present Oqtans, an open-source workbench for quantitative transcriptome analysis, that is integrated in Galaxy. Its distinguishing features include customizable computational workflows and a modular pipeline architecture that facilitates comparative assessment of tool and data quality. Oqtans integrates an assortment of machine learning-powered tools into Galaxy, which show superior or equal performance to state-of-the-art tools. Implemented tools comprise a complete transcriptome analysis workflow: short-read alignment, transcript identification/quantification and differential expression analysis. Oqtans and Galaxy facilitate persistent storage, data exchange and documentation of intermediate results and analysis workflows. We illustrate how Oqtans aids the interpretation of data from different experiments in easy to understand use cases. Users can easily create their own workflows and extend Oqtans by integrating specific tools. Oqtans is available as (i) a cloud machine image with a demo instance at cloud.oqtans.org, (ii) a public Galaxy instance at galaxy.cbio.mskcc.org, (iii) a git repository containing all installed software (oqtans.org/git); most of which is also available from (iv) the Galaxy Toolshed and (v) a share string to use along with Galaxy CloudMan.


Subject(s)
RNA/genetics , Sequence Analysis, RNA/methods , Transcriptome , Base Sequence , Internet , Software
8.
Nucleic Acids Res ; 41(10): 5189-98, 2013 May 01.
Article in English | MEDLINE | ID: mdl-23585274

ABSTRACT

Deep transcriptome sequencing (RNA-Seq) has become a vital tool for studying the state of cells in the context of varying environments, genotypes and other factors. RNA-Seq profiling data enable identification of novel isoforms, quantification of known isoforms and detection of changes in transcriptional or RNA-processing activity. Existing approaches to detect differential isoform abundance between samples either require a complete isoform annotation or fall short in providing statistically robust and calibrated significance estimates. Here, we propose a suite of statistical tests to address these open needs: a parametric test that uses known isoform annotations to detect changes in relative isoform abundance and a non-parametric test that detects differential read coverages and can be applied when isoform annotations are not available. Both methods account for the discrete nature of read counts and the inherent biological variability. We demonstrate that these tests compare favorably to previous methods, both in terms of accuracy and statistical calibrations. We use these techniques to analyze RNA-Seq libraries from Arabidopsis thaliana and Drosophila melanogaster. The identified differential RNA processing events were consistent with RT-qPCR measurements and previous studies. The proposed toolkit is available from http://bioweb.me/rdiff and enables in-depth analyses of transcriptomes, with or without available isoform annotation.


Subject(s)
RNA Processing, Post-Transcriptional , Algorithms , Animals , Arabidopsis/genetics , Arabidopsis/metabolism , Data Interpretation, Statistical , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Gene Expression Profiling , Molecular Sequence Annotation , RNA Isoforms/metabolism , Reverse Transcriptase Polymerase Chain Reaction
9.
Bioinformatics ; 29(20): 2529-38, 2013 Oct 15.
Article in English | MEDLINE | ID: mdl-23980025

ABSTRACT

MOTIVATION: High-throughput sequencing of mRNA (RNA-Seq) has led to tremendous improvements in the detection of expressed genes and reconstruction of RNA transcripts. However, the extensive dynamic range of gene expression, technical limitations and biases, as well as the observed complexity of the transcriptional landscape, pose profound computational challenges for transcriptome reconstruction. RESULTS: We present the novel framework MITIE (Mixed Integer Transcript IdEntification) for simultaneous transcript reconstruction and quantification. We define a likelihood function based on the negative binomial distribution, use a regularization approach to select a few transcripts collectively explaining the observed read data and show how to find the optimal solution using Mixed Integer Programming. MITIE can (i) take advantage of known transcripts, (ii) reconstruct and quantify transcripts simultaneously in multiple samples, and (iii) resolve the location of multi-mapping reads. It is designed for genome- and assembly-based transcriptome reconstruction. We present an extensive study based on realistic simulated RNA-Seq data. When compared with state-of-the-art approaches, MITIE proves to be significantly more sensitive and overall more accurate. Moreover, MITIE yields substantial performance gains when used with multiple samples. We applied our system to 38 Drosophila melanogaster modENCODE RNA-Seq libraries and estimated the sensitivity of reconstructing omitted transcript annotations and the specificity with respect to annotated transcripts. Our results corroborate that a well-motivated objective paired with appropriate optimization techniques leads to significant improvements over the state-of-the-art in transcriptome reconstruction. AVAILABILITY: MITIE is implemented in C++ and is available from http://bioweb.me/mitie under the GPL license.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , RNA/analysis , Sequence Analysis, RNA/methods , Software , Transcription, Genetic , Animals , Drosophila melanogaster , Humans , Internet , RNA/genetics
10.
RNA Biol ; 9(5): 596-609, 2012 May.
Article in English | MEDLINE | ID: mdl-22614838

ABSTRACT

Deep sequencing of transcriptomes allows quantitative and qualitative analysis of many RNA species in a sample, with parallel comparison of expression levels, splicing variants, natural antisense transcripts, RNA editing and transcriptional start and stop sites the ideal goal. By computational modeling, we show how libraries of multiple insert sizes combined with strand-specific, paired-end (SS-PE) sequencing can increase the information gained on alternative splicing, especially in higher eukaryotes. Despite the benefits of gaining SS-PE data with paired ends of varying distance, the standard Illumina protocol allows only non-strand-specific, paired-end sequencing with a single insert size. Here, we modify the Illumina RNA ligation protocol to allow SS-PE sequencing by using a custom pre-adenylated 3' adaptor. We generate parallel libraries with differing insert sizes to aid deconvolution of alternative splicing events and to characterize the extent and distribution of natural antisense transcription in C. elegans. Despite stringent requirements for detection of alternative splicing, our data increases the number of intron retention and exon skipping events annotated in the Wormbase genome annotations by 127% and 121%, respectively. We show that parallel libraries with a range of insert sizes increase transcriptomic information gained by sequencing and that by current established benchmarks our protocol gives competitive results with respect to library quality.


Subject(s)
Caenorhabditis elegans/genetics , Gene Expression Profiling/methods , Transcriptome , Alternative Splicing , Animals , Caenorhabditis elegans Proteins/genetics , Caenorhabditis elegans Proteins/metabolism , Databases, Genetic , Gene Library , Genes, Helminth , High-Throughput Nucleotide Sequencing , Humans , Oligonucleotide Array Sequence Analysis , Protein Isoforms/genetics , Protein Isoforms/metabolism , Sequence Analysis, RNA , Transcription, Genetic
11.
Nat Commun ; 12(1): 3358, 2021 Jun 07.
Article in English | MEDLINE | ID: mdl-34099733

ABSTRACT

Early stages of embryogenesis depend on subcellular localization and transport of maternal mRNA. However, systematic analysis of these processes is hindered by a lack of spatio-temporal information in single-cell RNA sequencing. Here, we combine spatially-resolved transcriptomics and single-cell RNA labeling to perform a spatio-temporal analysis of the transcriptome during early zebrafish development. We measure spatial localization of mRNA molecules within the one-cell stage embryo, which allows us to identify a class of mRNAs that are specifically localized at an extraembryonic position, the vegetal pole. Furthermore, we establish a method for high-throughput single-cell RNA labeling in early zebrafish embryos, which enables us to follow the fate of individual maternal transcripts until gastrulation. This approach reveals that many localized transcripts are specifically transported to the primordial germ cells. Finally, we acquire spatial transcriptomes of two Xenopus species and compare evolutionary conservation of localized genes as well as enriched sequence motifs.


Subject(s)
Cell Tracking/methods , Embryo, Nonmammalian/metabolism , RNA, Messenger/genetics , Transcriptome/genetics , Zebrafish/genetics , Animals , Embryo, Nonmammalian/cytology , Embryo, Nonmammalian/embryology , Female , Gene Expression Regulation, Developmental , Oocytes/cytology , Oocytes/metabolism , RNA, Messenger/metabolism , Single-Cell Analysis/methods , Spatio-Temporal Analysis , Species Specificity , Xenopus/embryology , Xenopus/genetics , Xenopus laevis/embryology , Xenopus laevis/genetics , Zebrafish/embryology
12.
Pac Symp Biocomput ; 21: 433-44, 2016.
Article in English | MEDLINE | ID: mdl-26776207

ABSTRACT

CLIP-Seq protocols such as PAR-CLIP, HITS-CLIP or iCLIP allow a genome-wide analysis of protein-RNA interactions. For the processing of the resulting short read data, various tools are utilized. Some of these tools were specifically developed for CLIP-Seq data, whereas others were designed for the analysis of RNA-Seq data. To date, however, it has not been assessed which of the available tools are most appropriate for the analysis of CLIP-Seq data. This is because an experimental gold standard dataset on which methods can be assessed and compared is still not available. To address this lack of a gold-standard dataset, we here present Cseq-Simulator, a simulator for PAR-CLIP, HITS-CLIP and iCLIP data. This simulator can be applied to generate realistic datasets that can serve as surrogates for experimental gold standard datasets. In this work, we also show how Cseq-Simulator can be used to perform a comparison of steps of typical CLIP-Seq analysis pipelines, such as the read alignment or the peak calling. These comparisons show which tools are useful in different settings and also allow identifying pitfalls in the data analysis.


Subject(s)
High-Throughput Nucleotide Sequencing/statistics & numerical data , Sequence Analysis, RNA/statistics & numerical data , Software , Algorithms , Computational Biology/methods , Computational Biology/statistics & numerical data , Computer Simulation , Cross-Linking Reagents , Genome, Human , Humans , RNA/genetics , RNA/metabolism , RNA Processing, Post-Transcriptional , RNA-Binding Proteins/metabolism , Sequence Alignment/statistics & numerical data
13.
Elife ; 4: e05255, 2015 May 05.
Article in English | MEDLINE | ID: mdl-25939354

ABSTRACT

Epigenome modulation potentially provides a mechanism for organisms to adapt, within and between generations. However, neither the extent to which this occurs, nor the mechanisms involved are known. Here we investigate DNA methylation variation in Swedish Arabidopsis thaliana accessions grown at two different temperatures. Environmental effects were limited to transposons, where CHH methylation was found to increase with temperature. Genome-wide association studies (GWAS) revealed that the extensive CHH methylation variation was strongly associated with genetic variants in both cis and trans, including a major trans-association close to the DNA methyltransferase CMT2. Unlike CHH methylation, CpG gene body methylation (GBM) was not affected by growth temperature, but was instead correlated with the latitude of origin. Accessions from colder regions had higher levels of GBM for a significant fraction of the genome, and this was associated with increased transcription for the genes affected. GWAS revealed that this effect was largely due to trans-acting loci, many of which showed evidence of local adaptation.


Subject(s)
Adaptation, Physiological/genetics , Arabidopsis Proteins/genetics , Arabidopsis/genetics , DNA (Cytosine-5-)-Methyltransferases/genetics , Gene Expression Regulation, Plant , Genome, Plant , Arabidopsis/metabolism , Arabidopsis Proteins/metabolism , CpG Islands , DNA (Cytosine-5-)-Methyltransferases/metabolism , DNA Methylation , DNA Transposable Elements , Epigenesis, Genetic , Gene Expression Profiling , Genetic Variation , Genome-Wide Association Study , Temperature , Transcription, Genetic
14.
Signal Image Video Process ; 8(1 Suppl): 41-48, 2014 Dec 01.
Article in English | MEDLINE | ID: mdl-25866587

ABSTRACT

Analysis of microscopy images can provide insight into many biological processes. One particularly challenging problem is cellular nuclear segmentation in highly anisotropic and noisy 3D image data. Manually localizing and segmenting each and every cellular nucleus is very time-consuming, which remains a bottleneck in large-scale biological experiments. In this work, we present a tool for automated segmentation of cellular nuclei from 3D fluorescent microscopic data. Our tool is based on state-of-the-art image processing and machine learning techniques and provides a user-friendly graphical user interface. We show that our tool is as accurate as manual annotation and greatly reduces the time for the registration.

15.
Nat Cell Biol ; 15(11): 1328-39, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24161933

ABSTRACT

The spindle assembly checkpoint is a conserved signalling pathway that protects genome integrity. Given its central importance, this checkpoint should withstand stochastic fluctuations and environmental perturbations, but the extent of and mechanisms underlying its robustness remain unknown. We probed spindle assembly checkpoint signalling by modulating checkpoint protein abundance and nutrient conditions in fission yeast. For core checkpoint proteins, a mere 20% reduction can suffice to impair signalling, revealing a surprising fragility. Quantification of protein abundance in single cells showed little variability (noise) of critical proteins, explaining why the checkpoint normally functions reliably. Checkpoint-mediated stoichiometric inhibition of the anaphase activator Cdc20 (Slp1 in Schizosaccharomyces pombe) can account for the tolerance towards small fluctuations in protein abundance and explains our observation that some perturbations lead to non-genetic variation in the checkpoint response. Our work highlights low gene expression noise as an important determinant of reliable checkpoint signalling.


Subject(s)
M Phase Cell Cycle Checkpoints , Signal Transduction , Spindle Apparatus , Schizosaccharomyces/metabolism , Schizosaccharomyces pombe Proteins/metabolism