Results 1 - 19 of 19
1.
Cell Metab ; 35(5): 821-836.e7, 2023 05 02.
Article in English | MEDLINE | ID: mdl-36948185

ABSTRACT

The mechanisms that specify and stabilize cell subtypes remain poorly understood. Here, we identify two major subtypes of pancreatic β cells based on histone mark heterogeneity (βHI and βLO). βHI cells exhibit ∼4-fold higher levels of H3K27me3, distinct chromatin organization and compaction, and a specific transcriptional pattern. βHI and βLO cells also differ in size, morphology, cytosolic and nuclear ultrastructure, epigenomes, cell surface marker expression, and function, and can be FACS separated into CD24+ and CD24- fractions. Functionally, βHI cells have increased mitochondrial mass, activity, and insulin secretion in vivo and ex vivo. Partial loss of function indicates that H3K27me3 dosage regulates βHI/βLO ratio in vivo, suggesting that control of β cell subtype identity and ratio is at least partially uncoupled. Both subtypes are conserved in humans, with βHI cells enriched in humans with type 2 diabetes. Thus, epigenetic dosage is a novel regulator of cell subtype specification and identifies two functionally distinct β cell subtypes.


Subject(s)
Diabetes Mellitus, Type 2 , Insulin-Secreting Cells , Humans , Insulin-Secreting Cells/metabolism , Histones/metabolism , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/metabolism , Epigenesis, Genetic , Insulin Secretion
2.
Nat Metab ; 4(9): 1150-1165, 2022 09.
Article in English | MEDLINE | ID: mdl-36097183

ABSTRACT

Studies in genetically 'identical' individuals indicate that as much as 50% of complex trait variation cannot be traced to genetics or to the environment. The mechanisms that generate this 'unexplained' phenotypic variation (UPV) remain largely unknown. Here, we identify neuronatin (NNAT) as a conserved factor that buffers against UPV. We find that Nnat deficiency in isogenic mice triggers the emergence of a bi-stable polyphenism, where littermates emerge into adulthood either 'normal' or 'overgrown'. Mechanistically, this is mediated by an insulin-dependent overgrowth that arises from histone deacetylase (HDAC)-dependent β-cell hyperproliferation. A multi-dimensional analysis of monozygotic twin discordance reveals the existence of two patterns of human UPV, one of which (Type B) phenocopies the NNAT-buffered polyphenism identified in mice. Specifically, Type-B monozygotic co-twins exhibit coordinated increases in fat and lean mass across the body; decreased NNAT expression; increased HDAC-responsive gene signatures; and clinical outcomes linked to insulinemia. Critically, the Type-B UPV signature stratifies both childhood and adult cohorts into four metabolic states, including two phenotypically and molecularly distinct types of obesity.


Subject(s)
Membrane Proteins , Nerve Tissue Proteins , Adaptation, Physiological , Adult , Animals , Child , Histone Deacetylases , Humans , Insulin , Membrane Proteins/metabolism , Mice , Nerve Tissue Proteins/genetics , Obesity/genetics , Obesity/metabolism
3.
J Vis Exp ; (184)2022 06 16.
Article in English | MEDLINE | ID: mdl-35786676

ABSTRACT

Obesity is a complex disease influenced by genetics, epigenetics, the environment, and their interactions. Mature adipocytes represent the major cell type in white adipose tissue. Understanding how adipocytes function and respond to (epi)genetic and environmental signals is essential for identifying the cause(s) of obesity. RNA and chromatin have previously been isolated from adipocytes using enzymatic digestion. In addition, protocols have been developed for nuclear isolation, where purification is achieved by fluorescence-activated cell sorting (FACS) of adipocyte-specific transgenic reporters. One of the greatest challenges to achieving high yield and quality during such protocols is the substantial amount of lipid contained in adipose tissue. The present protocol describes an optimized procedure for isolating mature adipocytes that leverages heptane to separate lipids from the targets of interest (RNA/chromatin). The resulting RNA has high integrity and generates high-quality RNA-seq results. Likewise, the procedure improves nuclei yield rate and generates reproducible ChIP-seq results across samples. Therefore, the current study provides a reliable and universal murine adipocyte isolation protocol suitable for whole-genome transcriptome and epigenome studies.


Subject(s)
Adipocytes, White , Transcriptome , Animals , Chromatin/metabolism , Epigenome , Mice , Obesity/metabolism , RNA/metabolism
4.
Gigascience ; 8(12)2019 12 01.
Article in English | MEDLINE | ID: mdl-31808801

ABSTRACT

BACKGROUND: RNA plays essential roles in all known forms of life. Clustering RNA sequences with common sequence and structure is an essential step towards studying RNA function. With the advent of high-throughput sequencing techniques, experimental and genomic data are expanding to complement the predictive methods. However, the existing methods do not effectively utilize and cope with the immense amount of data becoming available. RESULTS: Hundreds of thousands of non-coding RNAs have been detected; however, their annotation is lagging behind. Here we present GraphClust2, a comprehensive approach for scalable clustering of RNAs based on sequence and structural similarities. GraphClust2 bridges the gap between high-throughput sequencing and structural RNA analysis and provides an integrative solution by incorporating diverse experimental and genomic data in an accessible manner via the Galaxy framework. GraphClust2 can efficiently cluster and annotate large datasets of RNAs and supports structure-probing data. We demonstrate that the annotation performance of clustering functional RNAs can be considerably improved. Furthermore, an off-the-shelf procedure is introduced for identifying locally conserved structure candidates in long RNAs. We suggest the presence and the sparseness of phylogenetically conserved local structures for a collection of long non-coding RNAs. CONCLUSIONS: By clustering data from 2 cross-linking immunoprecipitation experiments, we demonstrate the benefits of GraphClust2 for motif discovery under the presence of biological and methodological biases. Finally, we uncover prominent targets of double-stranded RNA binding protein Roquin-1, such as BCOR's 3' untranslated region that contains multiple binding stem-loops that are evolutionary conserved.


Subject(s)
RNA, Untranslated/chemistry , RNA, Untranslated/genetics , Sequence Analysis, RNA/methods , Cluster Analysis , Computational Biology , High-Throughput Nucleotide Sequencing , Molecular Sequence Annotation , Nucleic Acid Conformation , Software
5.
Genome Biol ; 20(1): 227, 2019 11 08.
Article in English | MEDLINE | ID: mdl-31699133

ABSTRACT

We present the software Condition-specific Regulatory Units Prediction (CRUP) to infer from epigenetic marks a list of regulatory units consisting of dynamically changing enhancers with their target genes. The workflow consists of a novel pre-trained enhancer predictor that can be reliably applied across cell types and species, solely based on histone modification ChIP-seq data. Enhancers are subsequently assigned to different conditions and correlated with gene expression to derive regulatory units. We thoroughly test and then apply CRUP to a rheumatoid arthritis model, identifying enhancer-gene pairs comprising known disease genes as well as new candidate genes.


Subject(s)
Enhancer Elements, Genetic , Software , Animals , Arthritis, Experimental/genetics , Arthritis, Rheumatoid/genetics , Chromatin Immunoprecipitation Sequencing , Histone Code , Mice
6.
Bioinformatics ; 35(22): 4757-4759, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31134269

ABSTRACT

SUMMARY: Due to the rapidly increasing scale and diversity of epigenomic data, modular and scalable analysis workflows are of wide interest. Here we present snakePipes, a workflow package for processing and downstream analysis of data from common epigenomic assays: ChIP-seq, RNA-seq, Bisulfite-seq, ATAC-seq, Hi-C and single-cell RNA-seq. snakePipes enables users to assemble variants of each workflow and to easily install and upgrade the underlying tools, via its simple command-line wrappers and yaml files. AVAILABILITY AND IMPLEMENTATION: snakePipes can be installed via conda: `conda install -c mpi-ie -c bioconda -c conda-forge snakePipes`. Source code (https://github.com/maxplanck-ie/snakepipes) and documentation (https://snakepipes.readthedocs.io/en/latest/) are available online. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Epigenomics , Software , RNA-Seq , Exome Sequencing , Workflow
7.
Cell Metab ; 27(6): 1294-1308.e7, 2018 Jun 05.
Article in English | MEDLINE | ID: mdl-29754954

ABSTRACT

To date, it remains largely unclear to what extent chromatin machinery contributes to the susceptibility and progression of complex diseases. Here, we combine deep epigenome mapping with single-cell transcriptomics to mine for evidence of chromatin dysregulation in type 2 diabetes. We find two chromatin-state signatures that track β cell dysfunction in mice and humans: ectopic activation of bivalent Polycomb-silenced domains and loss of expression at an epigenomically unique class of lineage-defining genes. β cell-specific Polycomb (Eed/PRC2) loss of function in mice triggers diabetes-mimicking transcriptional signatures and highly penetrant, hyperglycemia-independent dedifferentiation, indicating that PRC2 dysregulation contributes to disease. The work provides novel resources for exploring β cell transcriptional regulation and identifies PRC2 as necessary for long-term maintenance of β cell identity. Importantly, the data suggest a two-hit (chromatin and hyperglycemia) model for loss of β cell identity in diabetes.


Subject(s)
Chromatin/metabolism , Diabetes Mellitus, Type 2/metabolism , Diet, High-Fat , Gene Silencing , Insulin-Secreting Cells/metabolism , Polycomb Repressive Complex 2/physiology , Animals , Cell Differentiation/genetics , Cells, Cultured , Chromosome Mapping , Diabetes Mellitus, Type 2/genetics , Epigenomics , Histone-Lysine N-Methyltransferase/genetics , Histone-Lysine N-Methyltransferase/metabolism , Humans , Hyperglycemia/genetics , Mice , Mice, Inbred C57BL , Mice, Knockout , Myeloid-Lymphoid Leukemia Protein/genetics , Myeloid-Lymphoid Leukemia Protein/metabolism , Polycomb Repressive Complex 2/genetics , Single-Cell Analysis
8.
Nucleic Acids Res ; 46(W1): W25-W29, 2018 07 02.
Article in English | MEDLINE | ID: mdl-29788132

ABSTRACT

The Freiburg RNA tools webserver is a well established online resource for RNA-focused research. It provides a unified user interface and comprehensive result visualization for efficient command line tools. The webserver includes RNA-RNA interaction prediction (IntaRNA, CopraRNA, metaMIR), sRNA homology search (GLASSgo), sequence-structure alignments (LocARNA, MARNA, CARNA, ExpaRNA), CRISPR repeat classification (CRISPRmap), sequence design (antaRNA, INFO-RNA, SECISDesign), structure aberration evaluation of point mutations (RaSE), and RNA/protein-family models visualization (CMV), and other methods. Open education resources offer interactive visualizations of RNA structure and RNA-RNA interaction prediction as well as basic and advanced sequence alignment algorithms. The services are freely available at http://rna.informatik.uni-freiburg.de.


Subject(s)
Base Sequence/genetics , Internet , RNA/genetics , Software , Algorithms , Nucleic Acid Conformation , RNA/chemistry , Sequence Alignment/instrumentation , Sequence Analysis, RNA/instrumentation , Structure-Activity Relationship
9.
Nucleic Acids Res ; 44(W1): W160-5, 2016 07 08.
Article in English | MEDLINE | ID: mdl-27079975

ABSTRACT

We present an update to our Galaxy-based web server for processing and visualizing deeply sequenced data. Its core tool set, deepTools, allows users to perform complete bioinformatic workflows ranging from quality controls and normalizations of aligned reads to integrative analyses, including clustering and visualization approaches. Since we first described our deepTools Galaxy server in 2014, we have implemented new solutions for many requests from the community and our users. Here, we introduce significant enhancements and new tools to further improve data visualization and interpretation. deepTools continue to be open to all users and freely available as a web service at deeptools.ie-freiburg.mpg.de. The new deepTools2 suite can be easily deployed within any Galaxy framework via the toolshed repository, and we also provide source code for command line usage under Linux and Mac OS X. A public and documented API for access to deepTools functionality is also available.


Subject(s)
Computational Biology/statistics & numerical data , Drosophila melanogaster/genetics , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA/statistics & numerical data , Software , Animals , Base Sequence , Computational Biology/methods , Computer Graphics , Humans , Information Storage and Retrieval , Internet , Sequence Alignment
10.
Cell ; 164(3): 353-64, 2016 Jan 28.
Article in English | MEDLINE | ID: mdl-26824653

ABSTRACT

More than one-half billion people are obese, and despite progress in genetic research, much of the heritability of obesity remains enigmatic. Here, we identify a Trim28-dependent network capable of triggering obesity in a non-Mendelian, "on/off" manner. Trim28(+/D9) mutant mice exhibit a bi-modal body-weight distribution, with isogenic animals randomly emerging as either normal or obese and few intermediates. We find that the obese-"on" state is characterized by reduced expression of an imprinted gene network including Nnat, Peg3, Cdkn1c, and Plagl1 and that independent targeting of these alleles recapitulates the stochastic bi-stable disease phenotype. Adipose tissue transcriptome analyses in children indicate that humans too cluster into distinct sub-populations, stratifying according to Trim28 expression, transcriptome organization, and obesity-associated imprinted gene dysregulation. These data provide evidence of discrete polyphenism in mouse and man and thus carry important implications for complex trait genetics, evolution, and medicine.


Subject(s)
Epigenesis, Genetic , Haploinsufficiency , Nuclear Proteins/genetics , Obesity/genetics , Repressor Proteins/genetics , Thinness/genetics , Adolescent , Animals , Body Mass Index , Child , Child, Preschool , Humans , Mice , Nutrition Surveys , Polymorphism, Genetic , Tripartite Motif-Containing Protein 28
11.
Cell ; 159(6): 1352-64, 2014 Dec 04.
Article in English | MEDLINE | ID: mdl-25480298

ABSTRACT

The global rise in obesity has revitalized a search for genetic and epigenetic factors underlying the disease. We present a Drosophila model of paternal-diet-induced intergenerational metabolic reprogramming (IGMR) and identify genes required for its encoding in offspring. Intriguingly, we find that as little as 2 days of dietary intervention in fathers elicits obesity in offspring. Paternal sugar acts as a physiological suppressor of variegation, desilencing chromatin-state-defined domains in both mature sperm and in offspring embryos. We identify requirements for H3K9/K27me3-dependent reprogramming of metabolic genes in two distinct germline and zygotic windows. Critically, we find evidence that a similar system may regulate obesity susceptibility and phenotype variation in mice and humans. The findings provide insight into the mechanisms underlying intergenerational metabolic reprogramming and carry profound implications for our understanding of phenotypic variation and evolution.


Subject(s)
Disease Models, Animal , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Epigenesis, Genetic , Obesity/genetics , Animals , Carbohydrate Metabolism , Diet , Embryo, Nonmammalian/metabolism , Eye Color , Female , Genetic Predisposition to Disease , Heterochromatin/metabolism , Humans , Male , Mice , Obesity/metabolism , Spermatozoa/metabolism
12.
J Mol Endocrinol ; 52(3): 373-82, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24711644

ABSTRACT

The control of mRNA translation has been mainly explored in response to activated tyrosine kinase receptors. In contrast, mechanistic details on the translational machinery are far less available in the case of ligand-bound G protein-coupled receptors (GPCRs). In this study, using the FSH receptor (FSH-R) as a model receptor, we demonstrate that part of the translational regulations occurs by phosphorylation of the translation pre-initiation complex scaffold protein, eukaryotic initiation factor 4G (eIF4G), in HEK293 cells stably expressing the FSH-R. This phosphorylation event occurred when eIF4G was bound to the mRNA 5' cap, and probably involves mammalian target of rapamycin. This regulation might contribute to cap-dependent translation in response to FSH. The cap-binding protein eIF4E also had its phosphorylation level enhanced upon FSH stimulation. We also show that FSH-induced signaling not only led to cap-dependent translation but also to internal ribosome entry site (IRES)-dependent translation of some mRNA. These data add detailed information on the molecular bases underlying the regulation of selective mRNA translation by a GPCR, and a topological model recapitulating these mechanisms is proposed.


Subject(s)
Eukaryotic Initiation Factor-4G/metabolism , Follicle Stimulating Hormone/metabolism , Protein Biosynthesis/genetics , Receptors, FSH/genetics , Adaptor Proteins, Signal Transducing/metabolism , Cell Cycle Proteins , Cell Line, Tumor , Enzyme Activation , Gene Expression Regulation/genetics , HEK293 Cells , Humans , Peptide Chain Initiation, Translational/genetics , Peptide Initiation Factors , Phosphoproteins/metabolism , Phosphorylation , RNA, Messenger/genetics , Receptors, FSH/biosynthesis , Receptors, FSH/metabolism , Ribosomes/genetics , TOR Serine-Threonine Kinases
13.
BMC Bioinformatics ; 15: 404, 2014 Dec 31.
Article in English | MEDLINE | ID: mdl-25551362

ABSTRACT

BACKGROUND: Identifying sequence-structure motifs common to two RNAs can speed up the comparison of structural RNAs substantially. The core algorithm of the existent approach ExpaRNA solves this problem for a priori known input structures. However, such structures are rarely known; moreover, predicting them computationally is no rescue, since single sequence structure prediction is highly unreliable. RESULTS: The novel algorithm ExpaRNA-P computes exactly matching sequence-structure motifs in entire Boltzmann-distributed structure ensembles of two RNAs; thereby we match and fold RNAs simultaneously, analogous to the well-known "simultaneous alignment and folding" of RNAs. While this implies much higher flexibility compared to ExpaRNA, ExpaRNA-P has the same very low complexity (quadratic in time and space), which is enabled by its novel structure ensemble-based sparsification. Furthermore, we devise a generalized chaining algorithm to compute compatible subsets of ExpaRNA-P's sequence-structure motifs. Resulting in the very fast RNA alignment approach ExpLoc-P, we utilize the best chain as anchor constraints for the sequence-structure alignment tool LocARNA. ExpLoc-P is benchmarked in several variants and versus state-of-the-art approaches. In particular, we formally introduce and evaluate strict and relaxed variants of the problem; the latter makes the approach sensitive to compensatory mutations. Across a benchmark set of typical non-coding RNAs, ExpLoc-P has similar accuracy to LocARNA but is four times faster (in both variants), while it achieves a speed-up over 30-fold for the longest benchmark sequences (≈400nt). Finally, different ExpLoc-P variants enable tailoring of the method to specific application scenarios. ExpaRNA-P and ExpLoc-P are distributed as part of the LocARNA package. The source code is freely available at http://www.bioinf.uni-freiburg.de/Software/ExpaRNA-P . 
CONCLUSIONS: ExpaRNA-P's novel ensemble-based sparsification reduces its complexity to quadratic time and space. Thereby, ExpaRNA-P significantly speeds up sequence-structure alignment while maintaining the alignment quality. Different ExpaRNA-P variants support a wide range of applications.


Subject(s)
Algorithms , RNA Folding , Sequence Homology, Nucleic Acid , RNA/chemistry , Sequence Analysis, RNA , Software
14.
Article in English | MEDLINE | ID: mdl-26355520

ABSTRACT

Detecting local common sequence-structure regions of RNAs is a biologically important problem. Detecting such regions allows biologists to identify functionally relevant similarities between the inspected molecules. We developed dynamic programming algorithms for finding common structure-sequence patterns between two RNAs. The RNAs are given by their sequence and a set of potential base pairs with associated probabilities. In contrast to prior work on local pattern matching of RNAs, we support the breaking of arcs. This allows us to add flexibility over matching only fixed structures; potentially matching only a similar subset of specified base pairs. We present an O(n^3) algorithm for local exact pattern matching between two nested RNAs, and an O(n^3 log n) algorithm for one nested RNA and one bounded-unlimited RNA. In addition, an algorithm for approximate pattern matching is introduced that for two given nested RNAs and a number k, finds the maximal local pattern matching score between the two RNAs with at most k mismatches in O(n^3 k^2) time. Finally, we present an O(n^3) algorithm for finding the most similar subforest between two nested RNAs.


Subject(s)
Computational Biology/methods , Pattern Recognition, Automated/methods , RNA/chemistry , Sequence Analysis, RNA/methods , Algorithms , Nucleic Acid Conformation
15.
Algorithms Mol Biol ; 8: 14, 2013.
Article in English | MEDLINE | ID: mdl-23601347

ABSTRACT

BACKGROUND: The search for distant homologs has become an important issue in genome annotation. A particular difficulty is posed by divergent homologs that have lost recognizable sequence similarity. This same problem also arises in the recognition of novel members of large classes of RNAs such as snoRNAs or microRNAs that consist of families unrelated by common descent. Current homology search tools for structured RNAs are either based entirely on sequence similarity (such as blast or hmmer) or combine sequence and secondary structure. The most prominent example of the latter class of tools is Infernal. Alternatives are descriptor-based methods. In most practical applications published to-date, however, the information contained in covariance models or manually prescribed search patterns is dominated by sequence information. Here we ask two related questions: (1) Is secondary structure alone informative for homology search and the detection of novel members of RNA classes? (2) To what extent is the thermodynamic propensity of the target sequence to fold into the correct secondary structure helpful for this task? RESULTS: Sequence-structure alignment can be used as an alternative search strategy. In this scenario, the query consists of a base pairing probability matrix, which can be derived either from a single sequence or from a multiple alignment representing a set of known representatives. Sequence information can be optionally added to the query. The target sequence is pre-processed to obtain local base pairing probabilities. As a search engine we devised a semi-global scanning variant of LocARNA's algorithm for sequence-structure alignment. The LocARNAscan tool is optimized for speed and low memory consumption. In benchmarking experiments on artificial data we observe that the inclusion of thermodynamic stability is helpful, albeit only in a regime of extremely low sequence information in the query.
We observe, furthermore, that the sensitivity is bounded in particular by the limited accuracy of the predicted local structures of the target sequence. CONCLUSIONS: Although we demonstrate that a purely structure-based homology search is feasible in principle, it is unlikely to outperform tools such as Infernal in most application scenarios, where a substantial amount of sequence information is typically available. The LocARNAscan approach will profit, however, from high throughput methods to determine RNA secondary structure. In transcriptome-wide applications, such methods will provide accurate structure annotations on the target side. AVAILABILITY: Source code of the free software LocARNAscan 1.0 and supplementary data are available at http://www.bioinf.uni-leipzig.de/Software/LocARNAscan.

16.
Bioinformatics ; 28(23): 3034-41, 2012 Dec 01.
Article in English | MEDLINE | ID: mdl-23052038

ABSTRACT

MOTIVATION: The computational search for novel microRNA (miRNA) precursors often involves some sort of structural analysis with the aim of identifying which type of structures are prone to being recognized and processed by the cellular miRNA-maturation machinery. A natural way to tackle this problem is to perform clustering over the candidate structures along with known miRNA precursor structures. Mixed clusters allow then the identification of candidates that are similar to known precursors. Given the large number of pre-miRNA candidates that can be identified in single-genome approaches, even after applying several filters for precursor robustness and stability, a conventional structural clustering approach is unfeasible. RESULTS: We propose a method to represent candidate structures in a feature space, which summarizes key sequence/structure characteristics of each candidate. We demonstrate that proximity in this feature space is related to sequence/structure similarity, and we select candidates that have a high similarity to known precursors. Additional filtering steps are then applied to further reduce the number of candidates to those with greater transcriptional potential. Our method is compared with another single-genome method (TripletSVM) in two datasets, showing better performance in one and comparable performance in the other, for larger training sets. Additionally, we show that our approach allows for a better interpretation of the results. AVAILABILITY AND IMPLEMENTATION: The MinDist method is implemented using Perl scripts and is freely available at http://www.cravela.org/?mindist=1. CONTACT: backofen@informatik.uni-freiburg.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , MicroRNAs/chemistry , Software , Animals , Anopheles/genetics , Base Sequence , Cluster Analysis , Computational Biology/methods , Drosophila melanogaster/genetics , Genome , MicroRNAs/genetics , Nucleic Acid Conformation , Principal Component Analysis , ROC Curve
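
The selection step described in the abstract of entry 16 (keep candidates that lie close to known precursors in a feature space) can be illustrated with a minimal Python sketch. This is not the MinDist implementation, which the abstract says is written in Perl; the two-dimensional feature vectors, the Euclidean metric, the function names and the threshold are all assumptions made for illustration only.

```python
import math
from typing import Dict, List, Sequence


def min_distance(candidate: Sequence[float],
                 known: List[Sequence[float]]) -> float:
    """Euclidean distance from a candidate to its nearest known precursor."""
    return min(math.dist(candidate, k) for k in known)


def select_candidates(candidates: Dict[str, Sequence[float]],
                      known: List[Sequence[float]],
                      threshold: float) -> List[str]:
    """Names of candidates whose nearest known precursor lies within
    `threshold` in the (hypothetical) feature space, sorted by name."""
    return sorted(
        name for name, vec in candidates.items()
        if min_distance(vec, known) <= threshold
    )


# Hypothetical features, e.g. (GC content, fraction of paired bases).
known_precursors = [(0.50, 0.70), (0.40, 0.80)]
candidate_features = {"c1": (0.50, 0.72), "c2": (0.90, 0.10)}
selected = select_candidates(candidate_features, known_precursors, 0.1)
```

With these made-up values only `c1` falls within the threshold. A real feature space would summarize many more sequence and structure characteristics than the two used here.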
17.
Bioinformatics ; 28(12): i224-32, 2012 Jun 15.
Article in English | MEDLINE | ID: mdl-22689765

ABSTRACT

MOTIVATION: Clustering according to sequence-structure similarity has now become a generally accepted scheme for ncRNA annotation. Its application to complete genomic sequences as well as whole transcriptomes is therefore desirable but hindered by extremely high computational costs. RESULTS: We present a novel linear-time, alignment-free method for comparing and clustering RNAs according to sequence and structure. The approach scales to datasets of hundreds of thousands of sequences. The quality of the retrieved clusters has been benchmarked against known ncRNA datasets and is comparable to state-of-the-art sequence-structure methods although achieving speedups of several orders of magnitude. A selection of applications aiming at the detection of novel structural ncRNAs are presented. Exemplarily, we predicted local structural elements specific to lincRNAs likely functionally associating involved transcripts to vital processes of the human nervous system. In total, we predicted 349 local structural RNA elements. AVAILABILITY: The GraphClust pipeline is available on request.


Subject(s)
Computational Biology/methods , Nucleic Acid Conformation , RNA, Long Noncoding/chemistry , Sequence Analysis, RNA/methods , Algorithms , Animals , Base Sequence , Cluster Analysis , Humans , Models, Theoretical , Nucleotide Motifs , Sequence Alignment
18.
Nucleic Acids Res ; 38(Web Server issue): W373-7, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20444875

ABSTRACT

The Freiburg RNA tools web server integrates three tools for the advanced analysis of RNA in a common web-based user interface. The tools IntaRNA, ExpaRNA and LocARNA support the prediction of RNA-RNA interaction, exact RNA matching and alignment of RNA, respectively. The Freiburg RNA tools web server and the software packages of the stand-alone tools are freely accessible at http://rna.informatik.uni-freiburg.de.


Subject(s)
RNA/chemistry , Sequence Analysis, RNA , Software , Internet , Nucleic Acid Conformation , RNA/metabolism , RNA, Messenger/chemistry , RNA, Messenger/metabolism , RNA, Untranslated/chemistry , RNA, Untranslated/metabolism , Sequence Alignment , Systems Integration
19.
Bioinformatics ; 25(16): 2095-102, 2009 Aug 15.
Article in English | MEDLINE | ID: mdl-19189979

ABSTRACT

MOTIVATION: Specific functions of ribonucleic acid (RNA) molecules are often associated with different motifs in the RNA structure. The key feature that forms such an RNA motif is the combination of sequence and structure properties. In this article, we introduce a new RNA sequence-structure comparison method which maintains exact matching substructures. Existing common substructures are treated as whole unit while variability is allowed between such structural motifs. Based on a fast detectable set of overlapping and crossing substructure matches for two nested RNA secondary structures, our method ExpaRNA (exact pattern of alignment of RNA) computes the longest collinear sequence of substructures common to two RNAs in O(H·nm) time and O(nm) space, where H << n·m for real RNA structures. Applied to different RNAs, our method correctly identifies sequence-structure similarities between two RNAs. RESULTS: We have compared ExpaRNA with two other alignment methods that work with given RNA structures, namely RNAforester and RNA_align. The results are in good agreement, but can be obtained in a fraction of running time, in particular for larger RNAs. We have also used ExpaRNA to speed up state-of-the-art Sankoff-style alignment tools like LocARNA, and observe a tradeoff between quality and speed. However, we get a speedup of 4.25 even in the highest quality setting, where the quality of the produced alignment is comparable to that of LocARNA alone. AVAILABILITY: The presented algorithm is implemented in the program ExpaRNA, which is available from our website (http://www.bioinf.uni-freiburg.de/Software).


Subject(s)
Computational Biology/methods , RNA/chemistry , Sequence Analysis, RNA/methods , Base Sequence , Nucleic Acid Conformation , Sequence Alignment/methods
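
The idea of a "longest collinear sequence of substructures" in the abstract of entry 19 can be sketched as a chaining problem. The Python below is a deliberately simplified illustration, not the ExpaRNA algorithm: it treats each match as a pair of plain intervals, ignores nested RNA structure, and uses a quadratic scan over the matches rather than the O(H·nm) method the abstract reports. The `Match` type and the function name are hypothetical.

```python
from typing import List, NamedTuple


class Match(NamedTuple):
    """Hypothetical exact match: half-open intervals in RNA a and RNA b.
    Assumed to have positive length."""
    a_start: int
    a_end: int
    b_start: int
    b_end: int

    @property
    def length(self) -> int:
        return self.a_end - self.a_start


def best_collinear_chain(matches: List[Match]) -> List[Match]:
    """Return a chain of maximum total length in which consecutive matches
    are ordered and non-overlapping in both sequences."""
    if not matches:
        return []
    # Any match that can precede another ends earlier in sequence a,
    # so sorting by end position gives a valid DP order.
    ordered = sorted(matches, key=lambda m: (m.a_end, m.b_end))
    best = [m.length for m in ordered]   # best chain length ending at i
    prev = [-1] * len(ordered)           # back-pointers for traceback
    for i, cur in enumerate(ordered):
        for j in range(i):
            cand = ordered[j]
            collinear = (cand.a_end <= cur.a_start
                         and cand.b_end <= cur.b_start)
            if collinear and best[j] + cur.length > best[i]:
                best[i] = best[j] + cur.length
                prev[i] = j
    i = max(range(len(ordered)), key=best.__getitem__)
    chain = []
    while i != -1:
        chain.append(ordered[i])
        i = prev[i]
    chain.reverse()
    return chain


example = [Match(0, 3, 0, 3), Match(2, 6, 2, 6),
           Match(4, 7, 5, 8), Match(8, 10, 1, 3)]
chain = best_collinear_chain(example)
```

In this made-up example the first and third matches form the best chain (total length 6): the second overlaps the first, and the fourth would cross the others in sequence b.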