1.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33624017

ABSTRACT

Whole genome bisulfite sequencing is currently at the forefront of epigenetic analysis, facilitating the nucleotide-level resolution of 5-methylcytosine (5mC) on a genome-wide scale. Specialized software has been developed to accommodate the unique difficulties in aligning such sequencing reads to a given reference, building on the knowledge acquired from model organisms such as human or Arabidopsis thaliana. As the field of epigenetics expands its purview to non-model plant species, new challenges arise which bring into question the suitability of previously established tools. Herein, nine short-read aligners are evaluated: Bismark, BS-Seeker2, BSMAP, BWA-meth, ERNE-BS5, GEM3, GSNAP, Last and segemehl. Precision-recall of simulated alignments, in comparison to real sequencing data obtained from three natural accessions, reveals, on balance, that BWA-meth and BSMAP are able to make the best use of the data during mapping. The influence of difficult-to-map regions, characterized by deviations in sequencing depth over repeat annotations, is evaluated in terms of the mean absolute deviation of the resulting methylation calls in comparison to a realistic methylome. Downstream methylation analysis is responsive to the handling of multi-mapping reads relative to mapping quality (MAPQ), and potentially susceptible to bias arising from the increased sequence complexity of densely methylated reads.
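The precision-recall evaluation of simulated alignments mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the benchmark code used in the study: the `Alignment` record, the `precision_recall` function and the `tolerance` parameter are hypothetical names introduced here, and the sketch assumes that every simulated read has a known true position and is reported at most once by the aligner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """One reported alignment (hypothetical record, for illustration only)."""
    read_id: str
    chrom: str
    pos: int  # leftmost mapped position


def precision_recall(truth, reported, tolerance=0):
    """Compare reported alignments with the simulated ground truth.

    truth:    dict mapping read_id -> (chrom, pos) for every simulated read
    reported: list of Alignment, at most one per read (assumption)
    """
    correct = 0
    for aln in reported:
        true_location = truth.get(aln.read_id)
        if true_location is None:
            continue  # read unknown to the simulation; counts as incorrect
        true_chrom, true_pos = true_location
        if aln.chrom == true_chrom and abs(aln.pos - true_pos) <= tolerance:
            correct += 1
    # Precision: correct alignments among everything the aligner reported.
    precision = correct / len(reported) if reported else 0.0
    # Recall: correct alignments among all simulated reads.
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


truth = {"r1": ("chr1", 100), "r2": ("chr1", 250), "r3": ("chr2", 40), "r4": ("chr2", 900)}
reported = [
    Alignment("r1", "chr1", 100),  # correct
    Alignment("r2", "chr1", 250),  # correct
    Alignment("r3", "chr1", 40),   # wrong chromosome
]
precision, recall = precision_recall(truth, reported)
```

In this toy case two of three reported alignments are correct and two of four simulated reads are recovered, so an aligner that leaves difficult reads unmapped gains precision at the cost of recall.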


Subject(s)
Benchmarking/methods , DNA Methylation/genetics , Epigenomics/methods , Fragaria/genetics , Genome, Plant , Poaceae/genetics , Software , Sulfites/pharmacology , Thlaspi/genetics , Chromosome Mapping/methods , DNA, Plant/drug effects , DNA, Plant/genetics , Epigenesis, Genetic , Sequence Alignment/methods , Whole Genome Sequencing/methods
2.
BMC Genomics ; 23(1): 477, 2022 Jun 28.
Article in English | MEDLINE | ID: mdl-35764934

ABSTRACT

BACKGROUND: Calling germline SNP variants from bisulfite-converted sequencing data poses a challenge for conventional software, which has no inherent capability to dissociate true polymorphisms from artificial mutations induced by the chemical treatment. Nevertheless, SNP data is desirable both for genotyping and to understand the DNA methylome in the context of the genetic background. The confounding effect of bisulfite conversion can, however, be conceptually resolved by observing differences in allele counts on a per-strand basis, whereby artificial mutations are reflected by non-complementary base pairs. RESULTS: Herein, we present a computational pre-processing approach for adapting sequence alignment data, thus indirectly enabling downstream analysis on a per-strand basis using conventional variant calling software such as GATK or Freebayes. In comparison to specialised tools, the method represents a marked improvement in precision-sensitivity based on high-quality, published benchmark datasets for both human and model plant variants. CONCLUSION: The presented "double-masking" procedure represents an open source, easy-to-use method to facilitate accurate variant calling using conventional software, thus negating any dependency on specialised tools and mitigating the need to generate additional, conventional sequencing libraries alongside bisulfite sequencing experiments. The method is available at https://github.com/bio15anu/revelio and an implementation with Freebayes is available at https://github.com/EpiDiverse/SNP.
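The per-strand principle stated in the background of this abstract can be sketched in Python as follows. This is not the double-masking procedure itself and not code from the linked repositories; it only illustrates why strand-specific allele counts separate conversions from polymorphisms. It assumes a directional library in which the strand of origin of each read is known, that all counts are expressed in forward-reference coordinates, and a hypothetical threshold `min_alt_fraction`.

```python
def _fraction(counts, base):
    """Fraction of reads in `counts` (dict base -> count) supporting `base`."""
    total = sum(counts.values())
    return counts.get(base, 0) / total if total else 0.0


def classify_transition(ref_base, top_counts, bottom_counts, min_alt_fraction=0.2):
    """Classify an apparent C>T or G>A change at one reference position.

    top_counts / bottom_counts: base counts from reads derived from the
    original top / bottom strand, in forward-reference coordinates.
    Bisulfite can turn C into T only in top-strand reads, and G into A only
    in bottom-strand reads, so the opposite strand is free of conversions.
    """
    if ref_base == "C":
        alt, informative, ambiguous = "T", bottom_counts, top_counts
    elif ref_base == "G":
        alt, informative, ambiguous = "A", top_counts, bottom_counts
    else:
        raise ValueError("only reference C and G are affected by conversion")

    if _fraction(informative, alt) >= min_alt_fraction:
        return "snp"         # change is visible on the conversion-free strand
    if _fraction(ambiguous, alt) >= min_alt_fraction:
        return "conversion"  # change appears only where bisulfite can cause it
    return "reference"


# Unmethylated cytosine: T in top-strand reads, C retained in bottom-strand reads.
site_a = classify_transition("C", {"T": 9, "C": 1}, {"C": 10})
# True C>T polymorphism: T is seen on both strands.
site_b = classify_transition("C", {"T": 10}, {"T": 9, "C": 1})
```

A conventional variant caller sees only the pooled counts, which is why the abstract describes pre-processing the alignments so that this per-strand evidence becomes usable downstream.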


Subject(s)
High-Throughput Nucleotide Sequencing , Bayes Theorem , High-Throughput Nucleotide Sequencing/methods , Humans , Sequence Alignment , Sequence Analysis, DNA/methods , Sulfites
3.
Plant Biotechnol J ; 20(5): 944-963, 2022 05.
Article in English | MEDLINE | ID: mdl-34990041

ABSTRACT

Thlaspi arvense (field pennycress) is being domesticated as a winter annual oilseed crop capable of improving ecosystems and intensifying agricultural productivity without increasing land use. It is a selfing diploid with a short life cycle and is amenable to genetic manipulations, making it an accessible field-based model species for genetics and epigenetics. The availability of a high-quality reference genome is vital for understanding pennycress physiology and for clarifying its evolutionary history within the Brassicaceae. Here, we present a chromosome-level genome assembly of var. MN106-Ref with improved gene annotation and use it to investigate gene structure differences between two accessions (MN108 and Spring32-10) that are highly amenable to genetic transformation. We describe non-coding RNAs, pseudogenes and transposable elements, and highlight tissue-specific expression and methylation patterns. Resequencing of forty wild accessions provided insights into genome-wide genetic variation, and QTL regions were identified for a seedling colour phenotype. Altogether, these data will serve as a tool for pennycress improvement in general and for translational research across the Brassicaceae.


Subject(s)
Thlaspi , Chromosomes , Ecosystem , Genome, Plant/genetics , Molecular Sequence Annotation , Thlaspi/genetics , Translational Research, Biomedical
4.
RNA ; 23(8): 1259-1269, 2017 08.
Article in English | MEDLINE | ID: mdl-28473453

ABSTRACT

The hard tick Ixodes ricinus is an important disease vector whose salivary secretions mediate blood-feeding success on vertebrate hosts, including humans. Here we describe the expression profiles and downstream analysis of de novo-discovered microRNAs (miRNAs) expressed in I. ricinus salivary glands and saliva. Eleven tick-derived libraries were sequenced to produce 67,375,557 Illumina reads. De novo prediction yielded 67 bona fide miRNAs out of which 35 are currently not present in miRBase. We report for the first time the presence of microRNAs in tick saliva, obtaining furthermore molecular indicators that those might be of exosomal origin. Ten out of these microRNAs are at least 100 times more represented in saliva. For the four most expressed microRNAs from this subset, we analyzed their combinatorial effects upon their host transcriptome using a novel in silico target network approach. We show that only the inclusion of combinatorial effects reveals the functions in important pathways related to inflammation and pain sensing. A control set of highly abundant microRNAs in both saliva and salivary glands indicates no significant pathways and a far lower number of shared target genes. Therefore, the analysis of miRNAs from pure tick saliva strongly supports the hypothesis that tick saliva miRNAs can modulate vertebrate host homeostasis and represents the first direct evidence of tick miRNA-mediated regulation of vertebrate host gene expression at the tick-host interface. As such, the herein described miRNAs may support future drug discovery and development projects that will also experimentally question their predicted molecular targets in the vertebrate host.


Subject(s)
Gene Regulatory Networks , Host-Parasite Interactions/genetics , Ixodes/genetics , MicroRNAs/analysis , Saliva/chemistry , Tick Infestations/parasitology , Transcriptome , Animals , Computer Simulation , High-Throughput Nucleotide Sequencing/methods , MicroRNAs/genetics , Saliva/metabolism , Salivary Glands/metabolism , Tick Infestations/genetics , Vertebrates/parasitology
6.
PLoS Genet ; 9(7): e1003588, 2013.
Article in English | MEDLINE | ID: mdl-23861667

ABSTRACT

The chromosome 9p21 (Chr9p21) locus of coronary artery disease has been identified in the first surge of genome-wide association and is the strongest genetic factor of atherosclerosis known today. Chr9p21 encodes the long non-coding RNA (ncRNA) antisense non-coding RNA in the INK4 locus (ANRIL). ANRIL expression is associated with the Chr9p21 genotype and correlated with atherosclerosis severity. Here, we report on the molecular mechanisms through which ANRIL regulates target-genes in trans, leading to increased cell proliferation, increased cell adhesion and decreased apoptosis, which are all essential mechanisms of atherogenesis. Importantly, trans-regulation was dependent on Alu motifs, which marked the promoters of ANRIL target genes and were mirrored in ANRIL RNA transcripts. ANRIL bound Polycomb group proteins that were highly enriched in the proximity of Alu motifs across the genome and were recruited to promoters of target genes upon ANRIL over-expression. The functional relevance of Alu motifs in ANRIL was confirmed by deletion and mutagenesis, reversing trans-regulation and atherogenic cell functions. ANRIL-regulated networks were confirmed in 2280 individuals with and without coronary artery disease and functionally validated in primary cells from patients carrying the Chr9p21 risk allele. Our study provides a molecular mechanism for pro-atherogenic effects of ANRIL at Chr9p21 and suggests a novel role for Alu elements in epigenetic gene regulation by long ncRNAs.


Subject(s)
Alu Elements/genetics , Atherosclerosis/genetics , Coronary Artery Disease/genetics , RNA, Long Noncoding/genetics , Apoptosis/genetics , Atherosclerosis/pathology , Cell Adhesion/genetics , Cell Proliferation , Chromosomes, Human, Pair 9/genetics , Coronary Artery Disease/pathology , Epigenesis, Genetic , Gene Expression Regulation , Gene Regulatory Networks , Genetic Predisposition to Disease , Genome-Wide Association Study , HEK293 Cells , Humans , Polycomb-Group Proteins , Polymorphism, Single Nucleotide
7.
Bioinformatics ; 28(1): 17-24, 2012 Jan 01.
Article in English | MEDLINE | ID: mdl-22053076

ABSTRACT

MOTIVATION: High-throughput sequencing methods allow whole transcriptomes to be sequenced fast and cost-effectively. Short RNA sequencing provides not only quantitative expression data but also an opportunity to identify novel coding and non-coding RNAs. Many long transcripts undergo post-transcriptional processing that generates short RNA sequence fragments. Mapped back to a reference genome, they form distinctive patterns that convey information on both the structure of the parent transcript and the modalities of its processing. The miR-miR* pattern from microRNA precursors is the best-known, but by no means singular, example. RESULTS: deepBlockAlign introduces a two-step approach to align RNA-seq read patterns with the aim of quickly identifying RNAs that share similar processing footprints. Overlapping mapped reads are first merged to blocks and then closely spaced blocks are combined to block groups, each representing a locus of expression. In order to compare block groups, the constituent blocks are first compared using a modified sequence alignment algorithm to determine similarity scores for pairs of blocks. In the second stage, block patterns are compared by means of a modified Sankoff algorithm that takes both block similarities and similarities of pattern of distances within the block groups into account. Hierarchical clustering of block groups clearly separates most miRNA and tRNA, and also identifies about a dozen tRNAs clustering together with miRNA. Most of these putative Dicer-processed tRNAs, including eight cases reported to generate products with miRNA-like features in literature, exhibit read blocks distinguished by precise start position of reads. AVAILABILITY: The program deepBlockAlign is available as source code from http://rth.dk/resources/dba/. CONTACT: gorodkin@rth.dk; studla@bioinf.uni-leipzig.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
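The first step described in the results above (merging overlapping mapped reads into blocks, then combining closely spaced blocks into block groups) can be sketched in Python. This is an illustrative sketch, not the deepBlockAlign source: the function names, the half-open interval convention and the `max_gap` threshold are assumptions, and the reads are taken to lie on a single chromosome and strand.

```python
def merge_reads_into_blocks(reads):
    """Merge overlapping reads, given as (start, end) half-open intervals."""
    blocks = []
    for start, end in sorted(reads):
        if blocks and start < blocks[-1][1]:
            # Read overlaps the current block: extend the block if needed.
            blocks[-1][1] = max(blocks[-1][1], end)
        else:
            blocks.append([start, end])
    return [tuple(block) for block in blocks]


def group_blocks(blocks, max_gap=30):
    """Combine sorted blocks separated by at most `max_gap` bases into groups."""
    groups = []
    for block in blocks:
        if groups and block[0] - groups[-1][-1][1] <= max_gap:
            groups[-1].append(block)  # close to the previous block: same locus
        else:
            groups.append([block])    # far away: start a new block group
    return groups


reads = [(100, 122), (105, 125), (140, 160), (500, 520)]
blocks = merge_reads_into_blocks(reads)
block_groups = group_blocks(blocks, max_gap=30)
```

Here the first two reads overlap and form one block, the block at 140-160 joins it in the same group because the gap is 15 bases, and the distant block at 500-520 becomes its own locus of expression. The subsequent alignment of block patterns with a modified Sankoff algorithm is not covered by this sketch.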


Subject(s)
Algorithms , High-Throughput Nucleotide Sequencing , Sequence Analysis, RNA/methods , Software , Base Sequence , Humans , MicroRNAs/genetics , RNA, Untranslated/analysis , RNA, Untranslated/genetics , Sequence Alignment , Transcriptome
8.
J Exp Zool B Mol Dev Evol ; 320(1): 35-46, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23165937

ABSTRACT

Canonical microRNAs are excised from their hairpin-shaped precursors by Dicer. In order to find possible exceptions to this rule and to identify additional substrates for Dicer processing, we re-evaluate the small RNA sequencing data of the Dicer knockdown experiment in MCF-7 cells originally published by Friedländer et al. [Friedländer et al., 2012, Nucleic Acids Res 40:37-52]. While the well-known non-Dicer mir-451 is not sufficiently expressed in these experiments, there are several additional Dicer-independent microRNAs, among them the important tumor suppressor mir-663a. We recover previously described examples of non-miRNA Dicer substrates such as tRNA-Gln and several snoRNAs. Interestingly, sdRNAs derived from box C/D snoRNAs are Dicer-independent, while those derived from box H/ACA snoRNAs are often Dicer-dependent. Several pol-III transcripts, in particular the vault RNAs and the great ape-specific snaRs, are processed by Dicer, while the small RNAs originating from Y RNAs seem to be Dicer-independent.


Subject(s)
DEAD-box RNA Helicases/metabolism , Genome, Human/genetics , MicroRNAs/metabolism , Ribonuclease III/metabolism , DEAD-box RNA Helicases/genetics , DNA Polymerase III/metabolism , Gene Knockdown Techniques , Humans , MCF-7 Cells , Ribonuclease III/genetics
9.
PLoS Biol ; 8(9)2010 Sep 07.
Article in English | MEDLINE | ID: mdl-20838655

ABSTRACT

A synergistic combination of two next-generation sequencing platforms with a detailed comparative BAC physical contig map provided a cost-effective assembly of the genome sequence of the domestic turkey (Meleagris gallopavo). Heterozygosity of the sequenced source genome allowed discovery of more than 600,000 high quality single nucleotide variants. Despite this heterozygosity, the current genome assembly (∼1.1 Gb) includes 917 Mb of sequence assigned to specific turkey chromosomes. Annotation identified nearly 16,000 genes, with 15,093 recognized as protein coding and 611 as non-coding RNA genes. Comparative analysis of the turkey, chicken, and zebra finch genomes, and comparing avian to mammalian species, supports the characteristic stability of avian genomes and identifies genes unique to the avian lineage. Clear differences are seen in number and variety of genes of the avian immune system where expansions and novel genes are less frequent than examples of gene loss. The turkey genome sequence provides resources to further understand the evolution of vertebrate genomes and genetic variation underlying economically important quantitative traits in poultry. This integrated approach may be a model for providing both gene and chromosome level assemblies of other species with agricultural, ecological, and evolutionary interest.


Subject(s)
Genome , Turkeys/genetics , Animals , Base Sequence , Chromosome Mapping , DNA/genetics , Polymorphism, Single Nucleotide , Sequence Analysis, DNA , Sequence Homology, Nucleic Acid , Species Specificity
10.
RNA Biol ; 10(7): 1204-10, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23702463

ABSTRACT

Prokaryotic transcripts constitute almost always uninterrupted intervals when mapped back to the genome. Split reads, i.e., RNA-seq reads consisting of parts that only map to discontiguous loci, are thus disregarded in most analysis pipelines. There are, however, some well-known exceptions, in particular, tRNA splicing and circularized small RNAs in Archaea as well as self-splicing introns. Here, we reanalyze a series of published RNA-seq data sets, screening them specifically for non-contiguously mapping reads. We recover most of the known cases together with several novel archaeal ncRNAs associated with circularized products. In Eubacteria, only a handful of interesting candidates were obtained beyond a few previously described group I and group II introns. Most of the atypically mapping reads do not appear to correspond to well-defined, specifically processed products. Whether this diffuse background is, at least in part, an incidental by-product of prokaryotic RNA processing or whether it consists entirely of technical artifacts of reverse transcription or amplification remains unknown.


Subject(s)
Computational Biology/methods , Prokaryotic Cells/metabolism , RNA/chemistry , Sequence Analysis, RNA , Transcriptome , Archaea/genetics , Bacteria/genetics , Genomics/methods , Molecular Sequence Annotation , RNA/genetics
11.
Nucleic Acids Res ; 39(Web Server issue): W112-7, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21622957

ABSTRACT

Small non-coding RNAs (ncRNAs) such as microRNAs, snoRNAs and tRNAs are a diverse collection of molecules with several important biological functions. Current methods for high-throughput sequencing offer, for the first time, the opportunity to investigate the entire ncRNAome in an essentially unbiased way. However, there is a substantial need for methods that allow a convenient analysis of these overwhelmingly large data sets. Here, we present DARIO, a free web service that allows users to study short read data from small RNA-seq experiments. It provides a wide range of analysis features, including quality control, read normalization, ncRNA quantification and prediction of putative ncRNA candidates. The DARIO web site can be accessed at http://dario.bioinf.uni-leipzig.de/.


Subject(s)
RNA, Untranslated/chemistry , RNA, Untranslated/metabolism , Sequence Analysis, RNA , Software , High-Throughput Nucleotide Sequencing , Internet , RNA, Untranslated/analysis , User-Computer Interface
12.
Quant Plant Biol ; 3: e19, 2022.
Article in English | MEDLINE | ID: mdl-37077980

ABSTRACT

Whole-genome bisulfite sequencing (WGBS) is the standard method for profiling DNA methylation at single-nucleotide resolution. Different tools have been developed to extract differentially methylated regions (DMRs), often built upon assumptions from mammalian data. Here, we present MethylScore, a pipeline to analyse WGBS data and to account for the substantially more complex and variable nature of plant DNA methylation. MethylScore uses an unsupervised machine learning approach to segment the genome by classification into states of high and low methylation. It processes data from genomic alignments to DMR output and is designed to be usable by novice and expert users alike. We show how MethylScore can identify DMRs from hundreds of samples and how its data-driven approach can stratify associated samples without prior information. We identify DMRs in the A. thaliana 1,001 Genomes dataset to unveil known and unknown genotype-epigenotype associations.

13.
Biol Chem ; 392(4): 305-13, 2011 Apr.
Article in English | MEDLINE | ID: mdl-21345160

ABSTRACT

Many aspects of the RNA maturation leave traces in RNA sequencing data in the form of deviations from the reference genomic DNA. This includes, in particular, genomically non-encoded nucleotides and chemical modifications. The latter leave their signatures in the form of mismatches and conspicuous patterns of sequencing reads. Modified mapping procedures focusing on particular types of deviations can help to unravel post-transcriptional modification, maturation and degradation processes. Here, we focus on small RNA sequencing data that is produced in large quantities aimed at the analysis of microRNA expression. Starting from the recovery of many well known modified sites in tRNAs, we provide evidence that modified nucleotides are a pervasive phenomenon in these data sets. Regarding non-encoded nucleotides we concentrate on CCA tails, which surprisingly can be found in a diverse collection of transcripts including sub-populations of mature microRNAs. Although small RNA sequencing libraries alone are insufficient to obtain a complete picture, they can inform on many aspects of the complex processes of RNA maturation.


Subject(s)
Computational Biology , RNA Processing, Post-Transcriptional , RNA/genetics , RNA/metabolism , Sequence Analysis, DNA , Animals , Base Sequence , Gene Library , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , RNA/chemistry , RNA Nucleotidyltransferases/metabolism , RNA, Small Untranslated/chemistry , RNA, Small Untranslated/genetics , RNA, Small Untranslated/metabolism , RNA, Transfer/chemistry , RNA, Transfer/genetics , RNA, Transfer/metabolism
14.
Nucleic Acids Res ; 37(Web Server issue): W68-76, 2009 Jul.
Article in English | MEDLINE | ID: mdl-19433510

ABSTRACT

Next-generation sequencing now allows the sequencing of small RNA molecules and the estimation of their expression levels. Consequently, there will be a high demand for bioinformatics tools to cope with the several gigabytes of sequence data generated in each single deep-sequencing experiment. Given this scenario, we developed miRanalyzer, a web server tool for the analysis of deep-sequencing experiments for small RNAs. The web server tool requires a simple input file containing a list of unique reads and their copy numbers (expression levels). Using these data, miRanalyzer (i) detects all known microRNA sequences annotated in miRBase, (ii) finds all perfect matches against other libraries of transcribed sequences and (iii) predicts new microRNAs. The prediction of new microRNAs is an especially important point as there are many species with very few known microRNAs. Therefore, we implemented a highly accurate machine learning algorithm for the prediction of new microRNAs that reaches AUC values of 97.9% and recall values of up to 75% on unseen data. The web tool summarizes all the described steps in a single output page, which provides a comprehensive overview of the analysis, adding links to more detailed output pages for each analysis module. miRanalyzer is available at http://web.bioinformatics.cicbiogune.es/microRNA/.
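The input described above, a list of unique reads with their copy numbers, can be produced from raw reads with a short Python sketch. The tab-separated layout written by `format_read_counts` is an assumption made for illustration; the abstract does not specify the exact file format the web server expects, and both function names are hypothetical.

```python
from collections import Counter


def collapse_reads(reads):
    """Collapse raw reads into (sequence, copy_number) pairs.

    The result is sorted by descending copy number, then by sequence,
    so that the output is deterministic.
    """
    counts = Counter(reads)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def format_read_counts(collapsed):
    """Render one 'sequence<TAB>count' line per unique read (assumed layout)."""
    return "\n".join(f"{sequence}\t{count}" for sequence, count in collapsed)


raw_reads = [
    "TGAGGTAGTAGGTTGTATAGTT",
    "ACGTACGTACGTACGTACG",
    "TGAGGTAGTAGGTTGTATAGTT",
    "TGAGGTAGTAGGTTGTATAGTT",
]
collapsed = collapse_reads(raw_reads)
table = format_read_counts(collapsed)
```

Collapsing identical reads keeps the input small, since a deep-sequencing run of small RNAs typically contains many copies of the same sequence, and the copy number serves as the expression level.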


Subject(s)
MicroRNAs/analysis , Software , Animals , Dogs , Humans , Mice , MicroRNAs/chemistry , MicroRNAs/metabolism , Rats , Sequence Analysis, RNA , Transcription, Genetic , User-Computer Interface
15.
Epigenomes ; 5(2)2021 May 04.
Article in English | MEDLINE | ID: mdl-34968299

ABSTRACT

Bisulfite sequencing is a widely used technique for determining DNA methylation and its relationship with epigenetics, genetics, and environmental parameters. Various techniques were implemented for epigenome-wide association studies (EWAS) to reveal meaningful associations; however, there are only very few plant studies available to date. Here, we developed the EpiDiverse EWAS pipeline and tested it using two plant datasets, from P. abies (Norway spruce) and Q. lobata (valley oak). Hence, we present an EWAS implementation tested for non-model plant species and describe its use.

16.
NAR Genom Bioinform ; 3(4): lqab106, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34805989

ABSTRACT

The expanding scope and scale of next generation sequencing experiments in ecological plant epigenetics brings new challenges for computational analysis. Existing tools built for model data may not address the needs of users looking to apply these techniques to non-model species, particularly on a population or community level. Here we present a toolkit suitable for plant ecologists working with whole genome bisulfite sequencing; it includes pipelines for mapping, the calling of methylation values and differential methylation between groups, epigenome-wide association studies, and a novel implementation for both variant calling and discriminating between genetic and epigenetic variation.

17.
BMC Bioinformatics ; 11: 292, 2010 May 28.
Article in English | MEDLINE | ID: mdl-20509939

ABSTRACT

BACKGROUND: Virtually all currently available microRNA target site prediction algorithms require the presence of a (conserved) seed match to the 5' end of the microRNA. Recently however, it has been shown that this requirement might be too stringent, leading to a substantial number of missed target sites. RESULTS: We developed TargetSpy, a novel computational approach for predicting target sites regardless of the presence of a seed match. It is based on machine learning and automatic feature selection using a wide spectrum of compositional, structural, and base pairing features covering current biological knowledge. Our model does not rely on evolutionary conservation, which allows the detection of species-specific interactions and makes TargetSpy suitable for analyzing unconserved genomic sequences. In order to allow for an unbiased comparison of TargetSpy to other methods, we classified all algorithms into three groups: I) no seed match requirement, II) seed match requirement, and III) conserved seed match requirement. TargetSpy predictions for classes II and III are generated by appropriate postfiltering. On a human dataset revealing fold-change in protein production for five selected microRNAs, our method shows superior performance in all classes. In Drosophila melanogaster, not only are our class II and III predictions on par with other algorithms, but notably the class I (no-seed) predictions are just marginally less accurate. We estimate that TargetSpy predicts between 26 and 112 functional target sites without a seed match per microRNA that are missed by all other currently available algorithms. CONCLUSION: Only a few algorithms can predict target sites without demanding a seed match and TargetSpy demonstrates a substantial improvement in prediction accuracy in that class. Furthermore, when conservation and the presence of a seed match are required, the performance is comparable with state-of-the-art algorithms.
TargetSpy was trained on mouse and performs well in human and Drosophila, suggesting that it may be applicable to a broad range of species. Moreover, we have demonstrated that the application of machine learning techniques in combination with upcoming deep sequencing data results in a powerful microRNA target site prediction tool http://www.targetspy.org.
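The seed match requirement that separates class I from classes II and III can be made concrete with a small Python sketch. It is not part of TargetSpy, which predicts sites without this requirement. The sketch assumes the common convention that the seed spans microRNA positions 2 to 7 and that a match means perfect Watson-Crick complementarity; the function names and example sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (A, C, G, U only)."""
    return rna.translate(_COMPLEMENT)[::-1]


def seed_site(mirna, seed_start=2, seed_end=7):
    """Target-side sequence that pairs perfectly with the microRNA seed.

    Positions are 1-based and inclusive, counted from the 5' end
    (assumed convention: seed = nucleotides 2-7).
    """
    seed = mirna[seed_start - 1:seed_end]
    return reverse_complement(seed)


def has_seed_match(mirna, target):
    """True if the target contains a perfect match to the microRNA seed."""
    return seed_site(mirna) in target


mirna = "UGAGGUAGUAGGUUGUAUAGUU"                 # example microRNA, 5' to 3'
target_with_seed = "AAACUAUACAACCUACUACCUCAAA"   # contains UACCUC
target_without_seed = "AAACUAUACAACCUACUAGGAGAAA"
site = seed_site(mirna)
```

A filter of this kind is what seed-dependent predictors apply up front; sites such as `target_without_seed`, which may still pair with the rest of the microRNA, are the ones such predictors would miss.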


Subject(s)
Artificial Intelligence , MicroRNAs/chemistry , Software , Animals , Drosophila , Proteins/chemistry , RNA, Messenger/genetics
18.
Bioinformatics ; 25(18): 2298-301, 2009 Sep 15.
Article in English | MEDLINE | ID: mdl-19584066

ABSTRACT

MicroRNA-offset-RNAs (moRNAs) were recently detected as a highly abundant class of small RNAs in a basal chordate. Using short read sequencing data, we show here that moRNAs are also produced from human microRNA precursors, albeit at quite low expression levels. The expression levels of moRNAs are unrelated to those of the associated microRNAs. Surprisingly, microRNA precursors that also show moRNAs are typically evolutionarily old, comprising more than half of the microRNA families that were present in early Bilateria, while evidence for moRNAs was found only for a relatively small fraction of microRNA families of recent origin.


Subject(s)
MicroRNAs/chemistry , RNA, Small Interfering/chemistry , RNA/chemistry , Humans , Sequence Analysis, RNA
19.
Methods Mol Biol ; 1097: 437-56, 2014.
Article in English | MEDLINE | ID: mdl-24639171

ABSTRACT

The computational identification of novel microRNA (miRNA) genes is a challenging task in bioinformatics. Massive amounts of data describing unknown functional RNA transcripts have to be analyzed for putative miRNA candidates with automated computational pipelines. Beyond those miRNAs that meet the classical definition, high-throughput sequencing techniques have revealed additional miRNA-like molecules that are derived by alternative biogenesis pathways. Exhaustive bioinformatics analyses on such data involve statistical issues as well as precise sequence and structure inspection not only of the functional mature part but also of the whole precursor sequence of the putative miRNA. Apart from a considerable number of species-specific miRNAs, the majority of all those genes are conserved at least among closely related organisms. Some miRNAs, however, can be traced back to very early points in the evolution of eukaryotic species. Thus, the investigation of the conservation of newly found miRNA candidates comprises an important step in the computational annotation of miRNAs. Topics covered in this chapter include a review on the obvious problem of miRNA annotation and family definition, recommended pipelines of computational miRNA annotation or detection, and an overview of current computer tools for the prediction of miRNAs and their limitations. The chapter closes by discussing how those bioinformatic approaches address the problem of faithful miRNA prediction and correct annotation.


Subject(s)
Computational Biology/methods , Genomics/methods , MicroRNAs/chemistry , MicroRNAs/genetics , Databases, Nucleic Acid , Internet , Software
20.
Front Plant Sci ; 5: 708, 2014.
Article in English | MEDLINE | ID: mdl-25566282

ABSTRACT

High-throughput sequencing techniques have made it possible to assay an organism's entire repertoire of small non-coding RNAs (ncRNAs) in an efficient and cost-effective manner. The moderate size of small RNA-seq datasets makes it feasible to provide free web services to the research community that provide many basic features of a small RNA-seq analysis, including quality control, read normalization, ncRNA quantification, and the prediction of putative novel ncRNAs. DARIO is one such system that so far has been focussed on animals. Here we introduce an extension of this system to plant short non-coding RNAs (sncRNAs). It includes major modifications to cope with plant-specific sncRNA processing. The current version of plantDARIO covers analyses of mapping files, small RNA-seq quality control, expression analyses of annotated sncRNAs, including the prediction of novel miRNAs and snoRNAs from unknown expressed loci and expression analyses of user-defined loci. At present Arabidopsis thaliana, Beta vulgaris, and Solanum lycopersicum are covered. The web tool links to a plant specific visualization browser to display the read distribution of the analyzed sample. The easy-to-use platform of plantDARIO quantifies RNA expression of annotated sncRNAs from different sncRNA databases together with new sncRNAs, annotated by our group. The plantDARIO website can be accessed at http://plantdario.bioinf.uni-leipzig.de/.
