1.
PLoS One ; 12(3): e0174699, 2017.
Article in English | MEDLINE | ID: mdl-28346544

ABSTRACT

Mucosal-associated invariant T cells (MAITs) are innate-like T cells that play a pivotal role in the host defense against infectious diseases, and are also implicated in autoimmune diseases, metabolic diseases, and cancer. Recent studies have shown that induced pluripotent stem cells (iPSCs) derived from MAITs selectively redifferentiate into MAITs without altering their antigen specificity. Such a selective differentiation is a prerequisite for the use of MAITs in cell therapy and/or regenerative medicine. However, the molecular mechanisms underlying this phenomenon remain unclear. Here, we performed methylome and transcriptome analyses of MAITs during the course of differentiation from iPSCs. Our multi-omics analyses revealed that recombination-activating genes (RAG1 and RAG2) and DNA nucleotidylexotransferase (DNTT) were highly methylated with their expression being repressed throughout differentiation. Since these genes are essential for V(D)J recombination of the T cell receptor (TCR) locus, this indicates that nascent MAITs are kept from further rearrangement that may alter their antigen specificity. Importantly, we found that the repression of RAGs was assured in two layers: one by the modulation of transcription factors for RAGs, and the other by DNA methylation at the RAG loci. Together, our study provides a possible explanation for the unaltered antigen specificity in the selective differentiation of MAITs from iPSCs.


Subject(s)
Epigenesis, Genetic , Gene Silencing , Induced Pluripotent Stem Cells/cytology , Mucosal-Associated Invariant T Cells/cytology , V(D)J Recombination/genetics , Cell Differentiation , DNA Methylation , Gene Expression Profiling , Induced Pluripotent Stem Cells/metabolism , Mucosal-Associated Invariant T Cells/metabolism , Receptors, Antigen, T-Cell/metabolism , Transcriptome
2.
BMC Genomics ; 16 Suppl 12: S3, 2015.
Article in English | MEDLINE | ID: mdl-26681544

ABSTRACT

BACKGROUND: Detection of differential methylation between biological samples is an important task in bisulfite-seq data analysis. Several studies have attempted de novo finding of differentially methylated regions (DMRs) using hidden Markov models (HMMs). However, there is room for improvement in the design of HMMs, especially on emission functions that evaluate the likelihood of differential methylation at each cytosine site. RESULTS: We describe a new HMM for DMR detection from bisulfite-seq data. Our method utilizes emission functions that combine binomial models for aligned read counts, and beta mixtures for incorporating genome-wide methylation level distributions. We also develop unsupervised learning algorithms to adjust parameters of the beta-binomial models depending on differential methylation types (up, down, and not changed). In experiments on both simulated and real datasets, the new HMM improves DMR detection accuracy compared with HMMs in our previous study. Furthermore, our method achieves better accuracy than other methods using Fisher's exact test and methylation level smoothing. CONCLUSIONS: Our method enables accurate DMR detection from bisulfite-seq data. The implementation of our method is named ComMet and is distributed as part of the Bisulfighter package, which is available at http://epigenome.cbrc.jp/bisulfighter.
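
The binomial read-count model mentioned in this abstract can be illustrated with a small sketch. This is not the ComMet emission function (the abstract does not give it, and the beta-mixture component is omitted here); it is only a per-site log-likelihood ratio under assumed names `binom_loglik` and `differential_score`, comparing "each sample has its own methylation level" against "both samples share one level".

```python
import math

def binom_loglik(k, n, p):
    """Log-likelihood of k methylated reads out of n at methylation level p."""
    p = min(max(p, 1e-9), 1 - 1e-9)  # clamp to avoid log(0)
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1 - p)

def differential_score(k1, n1, k2, n2):
    """Log-likelihood ratio: separate levels per sample vs. one shared level.

    Hypothetical helper for illustration; assumes n1 > 0 and n2 > 0.
    """
    shared = (k1 + k2) / (n1 + n2)
    ll_same = binom_loglik(k1, n1, shared) + binom_loglik(k2, n2, shared)
    ll_diff = binom_loglik(k1, n1, k1 / n1) + binom_loglik(k2, n2, k2 / n2)
    return ll_diff - ll_same
```

A score near zero means the two samples look alike at that cytosine; larger values indicate stronger evidence of a difference. An HMM would then smooth such per-site evidence along the genome.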


Subject(s)
Computational Biology/methods , DNA Methylation , Sequence Analysis, DNA/methods , Algorithms , Binomial Distribution , Cytosine/metabolism , Genome, Human , Genome-Wide Association Study , High-Throughput Nucleotide Sequencing , Humans , Markov Chains , Sulfites
3.
Epigenetics ; 9(9): 1195-206, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25093444

ABSTRACT

Although DNA modification is adaptive to extrinsic demands, little is known about epigenetic alterations associated with adipose differentiation and reprogramming. We systematically characterized the global trends of our methylome and transcriptome data with reported PPARγ cistrome data. Our analysis revealed that DNA methylation was altered between induced pluripotent stem cells (iPSCs) and adipose derived stem cells (ADSCs). Surprisingly, DNA methylation was not obviously changed in differentiation from ADSCs to mature fat cells (FatCs). This indicates that epigenetic predetermination of the adipogenic fate is almost established prior to substantial expression of the lineage. Furthermore, the majority of the PPARγ cistrome corresponded to the pre-set methylation profile between ADSCs and FatCs. In contrast to the pre-set model, we found that a subset of PPARγ-binding sites for late-expressing genes such as Adiponectin and Adiponectin receptor 2 were differentially methylated independently of the early program. Thus, these analyses identify two types of epigenetic mechanisms that distinguish the pre-set cell fate and later stages of adipose differentiation.


Subject(s)
Adipocytes/metabolism , Cell Differentiation , DNA Methylation , Epigenesis, Genetic , PPAR gamma/metabolism , Stem Cells/metabolism , Transcriptome , Adipocytes/cytology , Adiponectin/metabolism , Cell Line , Cellular Reprogramming , Humans , Induced Pluripotent Stem Cells/cytology , Induced Pluripotent Stem Cells/metabolism , PPAR gamma/genetics , Promoter Regions, Genetic , Receptors, Adiponectin/genetics , Receptors, Adiponectin/metabolism , Stem Cells/cytology
4.
Nucleic Acids Res ; 42(6): e45, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24423865

ABSTRACT

Analysis of bisulfite sequencing data usually requires two tasks: to call methylated cytosines (mCs) in a sample, and to detect differentially methylated regions (DMRs) between paired samples. Although numerous tools have been proposed for mC calling, methods for DMR detection have been largely limited. Here, we present Bisulfighter, a new software package for detecting mCs and DMRs from bisulfite sequencing data. Bisulfighter combines the LAST alignment tool for mC calling, and a novel framework for DMR detection based on hidden Markov models (HMMs). Unlike previous attempts that depend on empirical parameters, Bisulfighter can use the expectation-maximization algorithm for HMMs to adjust parameters for each data set. We conduct extensive experiments in which accuracy of mC calling and DMR detection is evaluated on simulated data with various mC contexts, read qualities, sequencing depths and DMR lengths, as well as on real data from a wide range of biological processes. We demonstrate that Bisulfighter consistently achieves better accuracy than other published tools, providing greater sensitivity for mCs with fewer false positives, more precise estimates of mC levels, more exact locations of DMRs and better agreement of DMRs with gene expression and DNase I hypersensitivity. The source code is available at http://epigenome.cbrc.jp/bisulfighter.


Subject(s)
Cytosine/metabolism , DNA Methylation , Sequence Analysis, DNA/methods , Software , Sequence Alignment , Sulfites
5.
Bioinformatics ; 29(2): 255-61, 2013 Jan 15.
Article in English | MEDLINE | ID: mdl-23172862

ABSTRACT

MOTIVATION: Non-coding RNA (ncRNA) genes are increasingly acknowledged for their importance in the human genome. However, there is no comprehensive non-redundant database for all such human genes. RESULTS: We leveraged the effective platform of GeneCards, the human gene compendium, together with the power of fRNAdb and additional primary sources, to judiciously unify all ncRNA gene entries obtainable from 15 different primary sources. Overlapping entries were clustered to unified locations based on an algorithm employing genomic coordinates. This allowed GeneCards' gamut of relevant entries to rise ∼5-fold, resulting in ∼80,000 human non-redundant ncRNAs, belonging to 14 classes. Such 'grand unification' within a regularly updated data structure will assist future ncRNA research. AVAILABILITY AND IMPLEMENTATION: All of these non-coding RNAs are included among the ∼122,500 entries in GeneCards V3.09, along with pertinent annotation, automatically mined by its built-in pipeline from 100 data sources. This information is available at www.genecards.org. CONTACT: Frida.Belinky@weizmann.ac.il SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
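
The coordinate-based clustering of overlapping entries can be sketched as plain interval merging. The abstract does not describe the actual GeneCards algorithm, so the function name `cluster_entries`, the tuple layout, and the rule "closed intervals that touch or overlap on the same chromosome belong together" are all assumptions made for illustration.

```python
from collections import defaultdict

def cluster_entries(entries):
    """Group (name, chrom, start, end) entries whose intervals overlap.

    Hypothetical sketch: strand and source priority are ignored.
    """
    by_chrom = defaultdict(list)
    for name, chrom, start, end in entries:
        by_chrom[chrom].append((start, end, name))

    clusters = []
    for chrom in sorted(by_chrom):
        current = None
        # Sorting by start lets one left-to-right sweep find all overlaps.
        for start, end, name in sorted(by_chrom[chrom]):
            if current is not None and start <= current["end"]:
                current["end"] = max(current["end"], end)
                current["names"].append(name)
            else:
                current = {"chrom": chrom, "start": start, "end": end, "names": [name]}
                clusters.append(current)
    return clusters
```

Each returned cluster is one unified location carrying the names of all source entries merged into it.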


Subject(s)
Databases, Genetic , RNA, Untranslated/genetics , Algorithms , Cluster Analysis , Genes , Genome, Human , Genomics , Humans , Internet , Molecular Sequence Annotation
6.
PLoS One ; 7(9): e44314, 2012.
Article in English | MEDLINE | ID: mdl-22957063

ABSTRACT

MicroRNA (miRNA) precursor hairpins have a unique secondary structure, nucleotide length, and nucleotide content that are in most cases evolutionarily conserved. The aim of this study was to utilize position-specific features of miRNA hairpins to improve their identification. To this end, we defined the evolutionary and structurally conserved features in each position of miRNA hairpins with heuristically derived values, which were successfully integrated using a probabilistic framework. Our method, miRRim2, can not only accurately detect miRNA hairpins, but infer the location of a mature miRNA sequence. To evaluate the accuracy of miRRim2, we designed a cross validation test in which the whole human genome was used for evaluation. miRRim2 could more accurately detect miRNA hairpins than the other computational predictions that had been performed on the human genome, and detect the position of the 5'-end of mature miRNAs with sensitivity and positive predictive value (PPV) above 0.4. To further evaluate miRRim2 on independent data, we applied it to the Ciona intestinalis genome. Our method detected 47 known miRNA hairpins among top 115 candidates, and pinpointed the 5'-end of mature miRNAs with sensitivity and PPV about 0.4. When our results were compared with deep-sequencing reads of small RNA libraries from Ciona intestinalis cells, we found several candidates in which the predicted mature miRNAs were in good accordance with deep-sequencing results.


Subject(s)
Ciona intestinalis/genetics , Computational Biology/methods , MicroRNAs/genetics , Algorithms , Animals , Conserved Sequence , Evolution, Molecular , Genetic Vectors , Genome , Genome, Human , High-Throughput Nucleotide Sequencing , Humans , MicroRNAs/metabolism , Nucleic Acid Conformation , Predictive Value of Tests , RNA Precursors/metabolism , Reproducibility of Results , Sequence Analysis, DNA
7.
RNA ; 16(12): 2503-15, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20980675

ABSTRACT

PIWI-interacting RNAs (piRNAs) silence transposable elements in animal germ cells. In Drosophila ovaries, piRNAs are produced by two distinct pathways: the "ping-pong" amplification cycle that operates in germ cells and a ping-pong-independent pathway termed the primary pathway that mainly operates in somatic cells. AGO3, one of three PIWI proteins in flies, is involved in the ping-pong cycle in ovaries. We characterized AGO3-associated piRNAs in fly testes and found that like in ovaries, AGO3 functions in the ping-pong cycle with Aubergine (Aub) for piRNA production from transposon transcripts. In contrast, most AGO3-associated piRNAs corresponding to Suppressor of Stellate [Su(Ste)] genes are antisense-oriented and bound to Aub. In addition, the vast majority of AGO3-bound piRNAs derived from the AT-chX locus on chromosome X are antisense-oriented and are also found among Aub-associated piRNAs. The presence of very few sense Su(Ste) and AT-chX piRNAs suggests that biogenesis of both Su(Ste) and AT-chX piRNAs by a ping-pong mechanism only is highly unlikely. Nevertheless, the mutual interdependence of AGO3 and Aub for the accumulation of these piRNAs shows that their production relies on both AGO3 and Aub. Analysis of piRNA pathway mutants revealed that although the requirements for piRNA factors for Su(Ste)- and AT-chX-piRNA levels mostly overlap and resemble those for the ping-pong mechanism in the ovaries, Armitage (armi) is not required for the accumulation of AT-chX-1 piRNA. These findings suggest that the impacts of armi mutants on the operation of the piRNA pathway are variable in germ cells of fly testes.


Subject(s)
Drosophila Proteins/metabolism , Drosophila , Peptide Initiation Factors/metabolism , RNA, Small Interfering/biosynthesis , Testis/metabolism , Animals , Animals, Genetically Modified , Argonaute Proteins , Cluster Analysis , Drosophila/genetics , Drosophila/metabolism , Drosophila Proteins/genetics , Female , Gene Expression , Gene Expression Profiling , Male , Metabolic Networks and Pathways , Microarray Analysis , Ovary/metabolism , Peptide Initiation Factors/genetics , Protein Binding , RNA, Small Interfering/genetics , RNA, Small Interfering/metabolism , Repressor Proteins/genetics , Repressor Proteins/metabolism
8.
Nucleic Acids Res ; 38(4): 1163-71, 2010 Mar.
Article in English | MEDLINE | ID: mdl-19965772

ABSTRACT

More than 40% of the human genome is generated by retrotransposition, a series of in vivo processes involving reverse transcription of RNA molecules and integration of the transcripts into the genomic sequence. The mechanism of retrotransposition, however, is not fully understood, and additional genomic elements generated by retrotransposition may remain to be discovered. Here, we report that the human genome contains many previously unidentified short pseudogenes generated by retrotransposition of mRNAs. Genomic elements generated by non-long terminal repeat retrotransposition have specific sequence signatures: a poly-A tract that is immediately downstream and a pair of duplicated sequences, called target site duplications (TSDs), at either end. Using a new computer program, TSDscan, that can accurately detect pseudogenes based on the presence of the poly-A tract and TSDs, we found 654 short (≤300 bp), previously unknown pseudogenes derived from mRNAs. Comprehensive analyses of the pseudogenes that we identified and their parent mRNAs revealed that the pseudogene length depends on the parent mRNA length: long mRNAs generate more short pseudogenes than do short mRNAs. To explain this phenomenon, we hypothesize that most long mRNAs are truncated before they are reverse transcribed. Truncated mRNAs would be rapidly degraded during reverse transcription, resulting in the generation of short pseudogenes.
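
The two sequence signatures named in this abstract (a poly-A tract followed by a TSD, with a second TSD copy upstream of the insert) can be searched for with a simple scan. This is not TSDscan, whose scoring is not described here; the function names, the fixed TSD length, and the exact-match requirement are simplifying assumptions, since real TSDs vary in length and may carry mismatches.

```python
import re

def find_polya_tracts(seq, min_len=8):
    """Return (start, end) spans of runs of at least min_len A's."""
    return [m.span() for m in re.finditer(r"A{%d,}" % min_len, seq.upper())]

def find_tsd(seq, polya_start, polya_end, tsd_len=10, max_insert=300):
    """Look for an exact upstream copy of the tsd_len bases after the poly-A tract.

    Returns (upstream_tsd_start, insert_length) or None. Hypothetical helper.
    """
    seq = seq.upper()
    tsd = seq[polya_end:polya_end + tsd_len]
    if len(tsd) < tsd_len:
        return None
    # Search upstream, nearest copy first, within the allowed insert length.
    lowest = max(0, polya_start - max_insert - tsd_len)
    pos = seq.rfind(tsd, lowest, polya_start)
    if pos == -1:
        return None
    return pos, polya_start - (pos + tsd_len)
```

For each poly-A span returned by `find_polya_tracts`, calling `find_tsd` reports where the upstream duplicate starts and how long the candidate insert between the two copies is.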


Subject(s)
Pseudogenes , RNA, Messenger/chemistry , Retroelements , Algorithms , Genome, Human , Humans , Models, Genetic , Poly A/analysis
9.
EMBO J ; 28(24): 3820-31, 2009 Dec 16.
Article in English | MEDLINE | ID: mdl-19959991

ABSTRACT

In Drosophila, the PIWI proteins, Aubergine (Aub), AGO3, and Piwi are expressed in germlines and function in silencing transposons by associating with PIWI-interacting RNAs (piRNAs). Recent studies show that PIWI proteins contain symmetric dimethyl-arginines (sDMAs) and that dPRMT5/Capsuleen/DART5 is the modifying enzyme. Here, we show that Tudor (Tud), one of the Tud domain-containing proteins, associates with Aub and AGO3, specifically through their sDMA modifications, and that these three proteins form heteromeric complexes. piRNA precursor-like molecules are detected in these complexes. The expression levels of Aub and AGO3, along with their degree of sDMA modification, were not changed by tud mutations. However, the population of transposon-derived piRNAs associated with Aub and AGO3 was altered by tud mutations, whereas the total amount of small RNAs on Aub and AGO3 was increased. Loss of dprmt5 did not change the stability of Aub, but impaired its association with Tud and lowered piRNA association with Aub. Thus, in germline cells, piRNAs are quality-controlled by dPRMT5 that modifies PIWI proteins, in tight association with Tud.


Subject(s)
Drosophila Proteins/metabolism , Drosophila melanogaster/metabolism , Membrane Transport Proteins/metabolism , Protein Methyltransferases/metabolism , Amino Acid Sequence , Animals , Arginine/analogs & derivatives , Arginine/chemistry , Chromatography, Liquid/methods , Databases, Protein , Gene Expression Regulation , Mass Spectrometry/methods , Molecular Sequence Data , Mutation , Protein-Arginine N-Methyltransferases , RNA Interference , Sequence Homology, Amino Acid
10.
Nature ; 461(7268): 1296-9, 2009 Oct 29.
Article in English | MEDLINE | ID: mdl-19812547

ABSTRACT

PIWI-interacting RNAs (piRNAs) silence retrotransposons in Drosophila germ lines by associating with the PIWI proteins Argonaute 3 (AGO3), Aubergine (Aub) and Piwi. piRNAs in Drosophila are produced from intergenic repetitive genes and piRNA clusters by two systems: the primary processing pathway and the amplification loop. The amplification loop occurs in a Dicer-independent, PIWI-Slicer-dependent manner. However, primary piRNA processing remains elusive. Here we analysed piRNA processing in a Drosophila ovarian somatic cell line where Piwi, but not Aub or AGO3, is expressed; thus, only the primary piRNAs exist. In addition to flamenco, a Piwi-specific piRNA cluster, traffic jam (tj), a large Maf gene, was determined as a new piRNA cluster. piRNAs arising from tj correspond to the untranslated regions of tj messenger RNA and are sense-oriented. piRNA loading on to Piwi may occur in the cytoplasm. zucchini, a gene encoding a putative cytoplasmic nuclease, is required for tj-derived piRNA production. In tj and piwi mutant ovaries, somatic cells fail to intermingle with germ cells and Fasciclin III is overexpressed. Loss of tj abolishes Piwi expression in gonadal somatic cells. Thus, in gonadal somatic cells, tj gives rise simultaneously to two different molecules: the TJ protein, which activates Piwi expression, and piRNAs, which define the Piwi targets for silencing.


Subject(s)
Drosophila Proteins/metabolism , Drosophila melanogaster/metabolism , Maf Transcription Factors, Large/metabolism , Proto-Oncogene Proteins/metabolism , RNA-Induced Silencing Complex/metabolism , RNA/metabolism , Animals , Argonaute Proteins , Cell Adhesion Molecules, Neuronal/metabolism , Cell Line , Drosophila Proteins/genetics , Drosophila melanogaster/genetics , Endoribonucleases/metabolism , Female , Genes, Insect/genetics , Genetic Loci/genetics , Maf Transcription Factors, Large/genetics , Male , Ovary/cytology , Ovary/metabolism , Phenotype , Proto-Oncogene Proteins/genetics , RNA/biosynthesis , RNA/genetics , RNA Interference , RNA Processing, Post-Transcriptional , RNA-Induced Silencing Complex/genetics , Testis/cytology , Testis/metabolism
11.
Bioinformatics ; 25(24): 3236-43, 2009 Dec 15.
Article in English | MEDLINE | ID: mdl-19808876

ABSTRACT

MOTIVATION: The importance of accurate and fast predictions of multiple alignments for RNA sequences has increased due to recent findings about functional non-coding RNAs. Recent studies suggest that maximizing the expected accuracy of predictions will be useful for many problems in bioinformatics. RESULTS: We designed a novel estimator for multiple alignments of structured RNAs, based on maximizing the expected accuracy of predictions. First, we define the maximum expected accuracy (MEA) estimator for pairwise alignment of RNA sequences. This maximizes the expected sum-of-pairs score (SPS) of a predicted alignment under a probability distribution of alignments given by marginalizing the Sankoff model. Then, by approximating the MEA estimator, we obtain an estimator whose time complexity is O(L^3 + c^2 d L^2) where L is the length of input sequences and both c and d are constants independent of L. The proposed estimator can handle uncertainty of secondary structures and alignments that are obstacles in Bioinformatics because it considers all the secondary structures and all the pairwise alignments as input sequences. Moreover, we integrate the probabilistic consistency transformation (PCT) on alignments into the proposed estimator. Computational experiments using six benchmark datasets indicate that the proposed method achieved a favorable SPS and was the fastest of many state-of-the-art tools for multiple alignments of structured RNAs. AVAILABILITY: The software called CentroidAlign, which is an implementation of the algorithm in this article, is freely available on our website: http://www.ncrna.org/software/centroidalign/. CONTACT: hamada-michiaki@aist.go.jp SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
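
The sum-of-pairs score (SPS) that the estimator is designed to maximize in expectation can be computed for a concrete pair of alignments as follows. This sketch only measures the score against a known reference; it does not implement the MEA estimator or CentroidAlign. The function names and the convention that rows are equal-length strings with `-` as the gap character are assumptions.

```python
from itertools import combinations

def aligned_pairs(alignment):
    """Set of (i, j, pos_i, pos_j): residue pos_i of row i is aligned to pos_j of row j."""
    pairs = set()
    for i, j in combinations(range(len(alignment)), 2):
        pi = pj = 0  # ungapped residue indices in rows i and j
        for a, b in zip(alignment[i], alignment[j]):
            if a != "-" and b != "-":
                pairs.add((i, j, pi, pj))
            pi += a != "-"
            pj += b != "-"
    return pairs

def sum_of_pairs_score(predicted, reference):
    """Fraction of residue pairs aligned in the reference that the prediction recovers."""
    ref = aligned_pairs(reference)
    return len(aligned_pairs(predicted) & ref) / len(ref) if ref else 0.0
```

A score of 1.0 means every residue pair aligned in the reference is also aligned in the prediction.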


Subject(s)
Computational Biology/methods , RNA/chemistry , Sequence Alignment/methods , Sequence Analysis, RNA/methods , Software , Base Sequence , Databases, Genetic , RNA/metabolism , RNA, Untranslated/chemistry , RNA, Untranslated/metabolism
12.
Bioinformatics ; 25(12): i330-8, 2009 Jun 15.
Article in English | MEDLINE | ID: mdl-19478007

ABSTRACT

MOTIVATION: Secondary structure prediction of RNA sequences is an important problem. There has been progress in this area, but the accuracy of prediction from an RNA sequence is still limited. In many cases, however, homologous RNA sequences are available with the target RNA sequence whose secondary structure is to be predicted. RESULTS: In this article, we propose a new method for secondary structure predictions of individual RNA sequences by taking the information of their homologous sequences into account without assuming the common secondary structure of the entire sequences. The proposed method is based on posterior decoding techniques, which consider all the suboptimal secondary structures of the target and homologous sequences and all the suboptimal alignments between the target sequence and each of the homologous sequences. In our computational experiments, the proposed method provides better predictions than those performed only on the basis of the formation of individual RNA sequences and those performed by using methods for predicting the common secondary structure of the homologous sequences. Remarkably, we found that the common secondary predictions sometimes give worse predictions for the secondary structure of a target sequence than the predictions from the individual target sequence, while the proposed method always gives good predictions for the secondary structure of target sequences in all tested cases. AVAILABILITY: Supporting information and software are available online at: http://www.ncrna.org/software/centroidfold/ismb2009/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
RNA/chemistry , Sequence Analysis, RNA/methods , Base Sequence , Computational Biology/methods , Molecular Sequence Data , Nucleic Acid Conformation , Sequence Alignment/methods
13.
Nucleic Acids Res ; 37(Web Server issue): W277-80, 2009 Jul.
Article in English | MEDLINE | ID: mdl-19435882

ABSTRACT

The CENTROIDFOLD web server (http://www.ncrna.org/centroidfold/) is a web application for RNA secondary structure prediction powered by one of the most accurate prediction engines. The server accepts two kinds of sequence data: a single RNA sequence and a multiple alignment of RNA sequences. It responds with a prediction result shown as a popular base-pair notation and a graph representation. A PDF version of the graph representation is also available. For a multiple alignment sequence, the server predicts a common secondary structure. Usage of the server is quite simple. You can paste a single RNA sequence (FASTA or plain sequence text) or a multiple alignment (CLUSTAL-W format) into the textarea then click on the 'execute CentroidFold' button. The server quickly responds with a prediction result. The major advantage of this server is that it employs our original CentroidFold software as its prediction engine, which scores the best accuracy in our benchmark results. Our web server is freely available with no login requirement.


Subject(s)
RNA, Untranslated/chemistry , Software , Algorithms , Computer Graphics , Internet , Nucleic Acid Conformation , RNA, Transfer/chemistry , Sequence Alignment , Sequence Analysis, RNA
14.
Proc Natl Acad Sci U S A ; 106(8): 2525-30, 2009 Feb 24.
Article in English | MEDLINE | ID: mdl-19188602

ABSTRACT

Recent transcriptome analyses have shown that thousands of noncoding RNAs (ncRNAs) are transcribed from mammalian genomes. Although the number of functionally annotated ncRNAs is still limited, they are known to be frequently retained in the nucleus, where they coordinate regulatory networks of gene expression. Some subnuclear organelles or nuclear bodies include RNA species whose identity and structural roles are largely unknown. We identified 2 abundant overlapping ncRNAs, MENepsilon and MENbeta (MENepsilon/beta), which are transcribed from the corresponding site in the multiple endocrine neoplasia (MEN) I locus and which localize to nuclear paraspeckles. This finding raises the intriguing possibility that MENepsilon/beta are involved in paraspeckle organization, because paraspeckles are, reportedly, RNase-sensitive structures. Successful removal of MENepsilon/beta by a refined knockdown method resulted in paraspeckle disintegration. Furthermore, the reassembly of paraspeckles disassembled by transcriptional arrest appeared to be unsuccessful in the absence of MENepsilon/beta. RNA interference and immunoprecipitation further revealed that the paraspeckle proteins p54/nrb and PSF selectively associate with and stabilize the longer MENbeta, thereby contributing to the organization of the paraspeckle structure. The paraspeckle protein PSP1 is not directly involved in either MENepsilon/beta stabilization or paraspeckle organization. We postulate a model for nuclear paraspeckle body organization where specific ncRNAs and RNA-binding proteins cooperate to maintain and, presumably, establish the structure.


Subject(s)
Cell Nucleus/metabolism , Proto-Oncogene Proteins/genetics , RNA, Untranslated , Dactinomycin/pharmacology , Gene Knockdown Techniques , HeLa Cells , Humans , Immunoprecipitation , Oligonucleotides, Antisense/genetics , RNA Interference , Reverse Transcriptase Polymerase Chain Reaction
15.
Bioinformatics ; 25(4): 465-73, 2009 Feb 15.
Article in English | MEDLINE | ID: mdl-19095700

ABSTRACT

MOTIVATION: Recent studies have shown that the methods for predicting secondary structures of RNAs on the basis of posterior decoding of the base-pairing probabilities have an advantage with respect to prediction accuracy over the conventionally utilized minimum free energy methods. However, there is room for improvement in the objective functions presented in previous studies, which are maximized in the posterior decoding with respect to the accuracy measures for secondary structures. RESULTS: We propose novel estimators which improve the accuracy of secondary structure prediction of RNAs. The proposed estimators maximize an objective function which is the weighted sum of the expected number of the true positives and that of the true negatives of the base pairs. The proposed estimators are also improved versions of the ones used in previous works, namely CONTRAfold for secondary structure prediction from a single RNA sequence and McCaskill-MEA for common secondary structure prediction from multiple alignments of RNA sequences. We clarify the relations between the proposed estimators and the estimators presented in previous works, and theoretically show that the previous estimators include additional unnecessary terms in the evaluation measures with respect to the accuracy. Furthermore, computational experiments confirm the theoretical analysis by indicating improvement in the empirical accuracy. The proposed estimators represent extensions of the centroid estimators proposed in Ding et al. and Carvalho and Lawrence, and are applicable to a wide variety of problems in bioinformatics. AVAILABILITY: Supporting information and the CentroidFold software are available online at: http://www.ncrna.org/software/centroidfold/.
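
The objective described in this abstract, a weighted sum of expected true positives and expected true negatives over base pairs, can be sketched as a small dynamic program. Writing the weight on true positives as gamma (a symbol assumed here), the expected gain of a structure equals a constant plus the sum of (gamma + 1) * p_ij - 1 over its chosen pairs, so only pairs with probability above 1 / (gamma + 1) can raise the score. The code below is a minimal Nussinov-style sketch under that reading, not the CentroidFold implementation; the function name, the dict-based probability input, and the absence of a minimum hairpin-loop length are simplifications.

```python
def gamma_centroid(probs, n, gamma=1.0):
    """Pick non-crossing base pairs maximizing the sum of (gamma+1)*p_ij - 1.

    probs: dict {(i, j): probability} with 0 <= i < j < n (missing pairs count as 0).
    """
    def gain(i, j):
        return (gamma + 1.0) * probs.get((i, j), 0.0) - 1.0

    best = [[0.0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            inner = best[i + 1][j - 1] if span > 1 else 0.0
            options = [best[i + 1][j], best[i][j - 1], inner + gain(i, j)]
            options += [best[i][k] + best[k + 1][j] for k in range(i + 1, j - 1)]
            best[i][j] = max(options)

    # Traceback: re-evaluate the same expressions to see which option was taken.
    pairs, stack = set(), [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        inner = best[i + 1][j - 1] if j - i > 1 else 0.0
        if best[i][j] == best[i + 1][j]:
            stack.append((i + 1, j))
        elif best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
        elif best[i][j] == inner + gain(i, j):
            pairs.add((i, j))
            stack.append((i + 1, j - 1))
        else:
            for k in range(i + 1, j - 1):
                if best[i][j] == best[i][k] + best[k + 1][j]:
                    stack.extend([(i, k), (k + 1, j)])
                    break
    return pairs
```

Raising gamma lowers the probability threshold and so trades specificity for sensitivity: with gamma = 1 only pairs more likely than not are kept, while larger values admit weaker pairs.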


Subject(s)
RNA/chemistry , Sequence Analysis, RNA/methods , Base Pairing , Base Sequence , Computational Biology/methods , Databases, Genetic , Entropy , Molecular Sequence Data , Nucleic Acid Conformation
16.
Nucleic Acids Res ; 37(Database issue): D89-92, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18948287

ABSTRACT

We developed a pair of databases that support two important tasks: annotation of anonymous RNA transcripts and discovery of novel non-coding RNAs. The database combo is called the Functional RNA Database and consists of two databases: a rewrite of the original version of the Functional RNA Database (fRNAdb) and the latest version of the UCSC GenomeBrowser for Functional RNA. The former is a sequence database equipped with a powerful search function and hosts a large collection of known/predicted non-coding RNA sequences acquired from existing databases as well as novel/predicted sequences reported by researchers of the Functional RNA Project. The latter is a UCSC Genome Browser mirror with large additional custom tracks specifically associated with non-coding elements. It also includes several functional enhancements such as a presentation of a common secondary structure prediction at any given genomic window ≤500 bp. Our GenomeBrowser supports user authentication and user-specific tracks. The current version of the fRNAdb is a complete rewrite of the former version, hosting a larger number of sequences and with a much friendlier interface. The current version of UCSC GenomeBrowser for Functional RNA features a larger number of tracks and richer features than the former version. The databases are available at http://www.ncrna.org/.


Subject(s)
Databases, Nucleic Acid , RNA, Untranslated/chemistry , Animals , Genomics , Humans , Mice , RNA, Untranslated/physiology , Rats , Sequence Analysis, RNA
17.
BMC Bioinformatics ; 9: 318, 2008 Jul 22.
Article in English | MEDLINE | ID: mdl-18647390

ABSTRACT

BACKGROUND: Recent discoveries of a large variety of important roles for non-coding RNAs (ncRNAs) have been reported by numerous researchers. In order to analyze ncRNAs by kernel methods including support vector machines, we propose stem kernels as an extension of string kernels for measuring the similarities between two RNA sequences from the viewpoint of secondary structures. However, applying stem kernels directly to large data sets of ncRNAs is impractical due to their computational complexity. RESULTS: We have developed a new technique based on directed acyclic graphs (DAGs) derived from base-pairing probability matrices of RNA sequences that significantly increases the computation speed of stem kernels. Furthermore, we propose profile-profile stem kernels for multiple alignments of RNA sequences which utilize base-pairing probability matrices for multiple alignments instead of those for individual sequences. Our kernels outperformed the existing methods with respect to the detection of known ncRNAs and kernel hierarchical clustering. CONCLUSION: Stem kernels can be utilized as a reliable similarity measure of structural RNAs, and can be used in various kernel-based applications.


Subject(s)
Models, Molecular , RNA, Untranslated/chemistry , Base Pairing , Base Sequence , Methods , Nucleic Acid Conformation , Probability , Sequence Alignment/methods
18.
Proc Natl Acad Sci U S A ; 105(23): 7964-9, 2008 Jun 10.
Article in English | MEDLINE | ID: mdl-18524951

ABSTRACT

Small RNAs triggering RNA silencing are loaded onto Argonautes and then sequence-specifically guide them to target transcripts. Epitope-tagged human Argonautes (hAgo1, hAgo2, hAgo3, and hAgo4) are associated with siRNAs and miRNAs, but only epitope-tagged hAgo2 has been shown to have Slicer activity. Contrarily, how endogenous hAgos behave with respect to small RNA association and target RNA destruction has remained unclear. Here, we produced monoclonal antibodies for individual hAgos. High-throughput pyrosequencing revealed that immunopurified endogenous hAgo2 and hAgo3 associated mostly with miRNAs. Endogenous hAgo3 did not show Slicer function but localized in P-bodies, suggesting that hAgo3 endogenously expressed is, like hAgo2, involved in the miRNA pathway but antagonizes the RNAi activity of hAgo2. Sequence variations of miRNAs were found at both 5' and 3' ends, suggesting that multiple mature miRNAs containing different "seed" sequences can arise from one miRNA precursor. The hAgo antibodies we raised are valuable tools for ascertaining the functional behavior of endogenous Argonautes and miRNAs in RNA silencing.


Subject(s)
Eukaryotic Initiation Factor-2/metabolism , MicroRNAs/metabolism , RNA Interference , Antibodies, Monoclonal , Base Sequence , Fluorescent Antibody Technique , HeLa Cells , Humans , Immunoprecipitation , Jurkat Cells , Mutation/genetics , RNA, Small Interfering/metabolism
19.
Nucleic Acids Res ; 36(Web Server issue): W75-8, 2008 Jul 01.
Article in English | MEDLINE | ID: mdl-18440970

ABSTRACT

We present web servers for analysis of non-coding RNA sequences on the basis of their secondary structures. Software tools for structural multiple sequence alignments, structural pairwise sequence alignments and structural motif findings are available from the integrated web server and the individual stand-alone web servers. The servers are located at http://software.ncrna.org, along with the information for the evaluation and downloading. This website is freely available to all users and there is no login requirement.


Subject(s)
RNA, Untranslated/chemistry , Sequence Alignment , Sequence Analysis, RNA , Software , Internet , Nucleic Acid Conformation