Results 1 - 20 of 24
1.
Sci Rep ; 13(1): 13665, 2023 Aug 22.
Article in English | MEDLINE | ID: mdl-37607960

ABSTRACT

Solar flares are explosions on the Sun. They happen when energy stored in magnetic fields around solar active regions (ARs) is suddenly released. Solar flares and the accompanying coronal mass ejections are sources of space weather, which negatively affects a variety of technologies at or near Earth, with effects ranging from blocking high-frequency radio waves used for radio communication to degrading power grid operations. Monitoring solar flares and predicting them early and accurately is therefore crucial for preparedness and disaster risk management. In this article, we present a transformer-based framework, named SolarFlareNet, for predicting whether an AR would produce a γ-class flare within the next 24 to 72 h. We consider three γ classes, namely the ≥M5.0 class, the ≥M class and the ≥C class, and build three transformers separately, each corresponding to a γ class. Each transformer is used to make predictions of its corresponding γ-class flares. The crux of our approach is to model data samples in an AR as time series and to use transformers to capture the temporal dynamics of the data samples. Each data sample consists of magnetic parameters taken from Space-weather HMI Active Region Patches (SHARP) and related data products. We survey flare events that occurred from May 2010 to December 2022 using the Geostationary Operational Environmental Satellite X-ray flare catalogs provided by the National Centers for Environmental Information (NCEI), and build a database of flares with identified ARs in the NCEI flare catalogs. This flare database is used to construct labels of the data samples suitable for machine learning. We further extend the deterministic approach to a calibration-based probabilistic forecasting method. The SolarFlareNet system is fully operational and is capable of making near real-time predictions of solar flares on the Web.

2.
Comput Biol Med ; 107: 302-322, 2019 04.
Article in English | MEDLINE | ID: mdl-30771879

ABSTRACT

Predicting the response, or sensitivity, of a specific cancer type to a clinical drug is an important research problem. By predicting the clinical drug response correctly, clinicians are able to understand patient-to-patient differences in drug sensitivity outcomes, which in turn results in less time spent and lower costs associated with identifying effective drug candidates. Although technological advances in high-throughput drug screening in cells have led to the generation of a substantial amount of relevant data, the analysis of such data remains a challenging task. There is a critical need for advanced machine learning (ML) algorithms to generate accurate predictions of clinical drug response. A major goal of this work is to provide advanced ML tools to data analysts, who would in turn build prediction calculators to be incorporated into intelligent clinical decision support systems. Such innovative tools could be used to enhance patient care, among other uses. To achieve this goal, we develop new ML techniques, including a transfer learning approach used with or without a boosting technique. Experimental results on real clinical data pertaining to breast cancer, multiple myeloma, and triple-negative cancer patients demonstrate the effectiveness and superiority of the proposed approaches compared to baseline approaches, including existing transfer learning methods.


Subjects
Computational Biology/methods , Decision Support Systems, Clinical , Drug Discovery/methods , Machine Learning , Algorithms , Antineoplastic Agents/therapeutic use , Clinical Trials as Topic , Databases, Factual , Gene Expression Profiling , High-Throughput Screening Assays , Humans , Neoplasms/drug therapy
3.
J Bioinform Comput Biol ; 16(3): 1840014, 2018 06.
Article in English | MEDLINE | ID: mdl-29945499

ABSTRACT

Transfer learning (TL) algorithms aim to improve the prediction performance in a target task (e.g. the prediction of cisplatin sensitivity in triple-negative breast cancer patients) by transferring knowledge from auxiliary data of a related task (e.g. the prediction of docetaxel sensitivity in breast cancer patients), where the distribution and even the feature space of the data pertaining to the tasks can be different. In real-world applications, we sometimes have a limited training set in a target task while we have auxiliary data from a related task. Supervised learning, however, requires a sufficiently large training set in the target task to perform well in predicting future test examples of that task. In this paper, we propose a TL approach for cancer drug sensitivity prediction, where our approach combines three techniques. First, we shift the representation of a subset of examples from auxiliary data of a related task to a representation closer to a target training set of a target task. Second, we align the shifted representation of the selected examples of the auxiliary data to the target training set to obtain examples with representation aligned to the target training set. Third, we train machine learning algorithms using both the target training set and the aligned examples. We evaluate the performance of our approach against baseline approaches using the Area Under the receiver operating characteristic (ROC) Curve (AUC) on real clinical trial datasets pertaining to multiple myeloma, non-small cell lung cancer, triple-negative breast cancer, and breast cancer. Experimental results show that our approach is better than the baseline approaches in terms of performance and statistical significance.


Subjects
Algorithms , Antineoplastic Agents/pharmacology , Computational Biology/methods , Area Under Curve , Bortezomib/pharmacology , Breast Neoplasms/drug therapy , Breast Neoplasms/genetics , Carcinoma, Non-Small-Cell Lung/drug therapy , Carcinoma, Non-Small-Cell Lung/genetics , Cisplatin/pharmacology , Clinical Trials as Topic , Databases, Factual , Docetaxel/pharmacology , Erlotinib Hydrochloride/pharmacology , Female , Humans , Lung Neoplasms/drug therapy , Lung Neoplasms/genetics , Machine Learning , Multiple Myeloma/drug therapy , Multiple Myeloma/genetics , Triple Negative Breast Neoplasms/drug therapy , Triple Negative Breast Neoplasms/genetics
4.
Biomed Res Int ; 2017: 6261802, 2017.
Article in English | MEDLINE | ID: mdl-28243601

ABSTRACT

Gene regulation is a series of processes that control gene expression and its extent. The connections among genes and their regulatory molecules, usually transcription factors, and a descriptive model of such connections are known as gene regulatory networks (GRNs). Elucidating GRNs is crucial to understand the inner workings of the cell and the complexity of gene interactions. To date, numerous algorithms have been developed to infer gene regulatory networks. However, as the number of identified genes increases and the complexity of their interactions is uncovered, networks and their regulatory mechanisms become cumbersome to test. Furthermore, sifting through experimental results requires an enormous amount of computation, resulting in slow data processing. Therefore, new approaches are needed to expeditiously analyze copious amounts of experimental data resulting from cellular GRNs. To meet this need, cloud computing is promising as reported in the literature. Here, we propose new MapReduce algorithms for inferring gene regulatory networks on a Hadoop cluster in a cloud environment. These algorithms employ an information-theoretic approach to infer GRNs using time-series microarray data. Experimental results show that our MapReduce program is much faster than an existing tool while achieving slightly better prediction accuracy than the existing tool.
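
As a rough illustration of the information-theoretic, map-then-reduce idea in this abstract, the sketch below scores every gene pair by mutual information and keeps the pairs above a threshold. The abstract does not name the measure, so mutual information, the mean-based binarization, the threshold, and all function names here are assumptions made for illustration; it runs in a single process rather than on Hadoop.

```python
from collections import Counter
from itertools import combinations
from math import log2

def discretize(series):
    """Binarize an expression time series around its mean (illustrative choice)."""
    mean = sum(series) / len(series)
    return [1 if x > mean else 0 for x in series]

def mutual_information(xs, ys):
    """Mutual information, in bits, between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

def map_pair(pair, expression):
    """Map step: emit ((gene_a, gene_b), score) for one gene pair."""
    a, b = pair
    score = mutual_information(discretize(expression[a]), discretize(expression[b]))
    return (a, b), score

def reduce_edges(scored_pairs, threshold):
    """Reduce step: keep the pairs whose score reaches the threshold."""
    return sorted(pair for pair, score in scored_pairs if score >= threshold)

# Hypothetical toy data: g1 and g2 rise and fall together, g3 does not.
expression = {
    "g1": [0.1, 0.9, 0.2, 0.8, 0.1, 0.9],
    "g2": [0.2, 1.0, 0.1, 0.7, 0.3, 0.8],
    "g3": [0.5, 0.6, 0.9, 0.1, 0.2, 0.9],
}
scored = [map_pair(p, expression) for p in combinations(sorted(expression), 2)]
edges = reduce_edges(scored, threshold=0.5)
```

Because each pair is scored independently, the map step is the part that would be distributed across workers; only the thresholding needs to see all scores.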


Subjects
Algorithms , Gene Regulatory Networks , Information Theory , Oligonucleotide Array Sequence Analysis/methods , Saccharomyces cerevisiae/genetics , Time Factors
5.
PLoS One ; 11(1): e0147097, 2016.
Article in English | MEDLINE | ID: mdl-26789998

ABSTRACT

RNA junctions are important structural elements of RNA molecules. They are formed when three or more helices come together in three-dimensional space. Recent studies have focused on the annotation and prediction of coaxial helical stacking (CHS) motifs within junctions. Here we exploit such predictions to develop an efficient alignment tool to handle RNA secondary structures with CHS motifs. Specifically, we build upon our Junction-Explorer software for predicting coaxial stacking and RNAJAG for modelling junction topologies as tree graphs to incorporate constrained tree matching and dynamic programming algorithms into a new method, called CHSalign, for aligning the secondary structures of RNA molecules containing CHS motifs. Thus, CHSalign is intended to be an efficient alignment tool for RNAs containing similar junctions. Experimental results based on thousands of alignments demonstrate that CHSalign can align two RNA secondary structures containing CHS motifs more accurately than other RNA secondary structure alignment tools. CHSalign yields a high score when aligning two RNA secondary structures with similar CHS motifs or helical arrangement patterns, and a low score otherwise. This new method has been implemented in a web server, and the program is also made freely available, at http://bioinformatics.njit.edu/CHSalign/.


Subjects
Algorithms , Internet , Nucleotide Motifs , RNA Folding , Sequence Analysis, RNA/methods , Software , Sequence Alignment/methods
6.
J Biosci ; 40(4): 731-40, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26564975

ABSTRACT

Use of computational methods to predict gene regulatory networks (GRNs) from gene expression data is a challenging task. Many studies have been conducted using unsupervised methods to fulfill the task; however, such methods usually yield low prediction accuracies due to the lack of training data. In this article, we propose semi-supervised methods for GRN prediction by utilizing two machine learning algorithms, namely, support vector machines (SVM) and random forests (RF). The semi-supervised methods make use of unlabelled data for training. We investigated inductive and transductive learning approaches, both of which adopt an iterative procedure to obtain reliable negative training data from the unlabelled data. We then applied our semi-supervised methods to gene expression data of Escherichia coli and Saccharomyces cerevisiae, and evaluated the performance of our methods using the expression data. Our analysis indicated that the transductive learning approach outperformed the inductive learning approach for both organisms. However, there was no conclusive difference identified in the performance of SVM and RF. Experimental results also showed that the proposed semi-supervised methods performed better than existing supervised methods for both organisms.


Subjects
Escherichia coli Proteins/genetics , Escherichia coli/genetics , Gene Regulatory Networks , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae/genetics , Support Vector Machine , Computational Biology/methods , Datasets as Topic , Gene Expression Regulation, Bacterial , Gene Expression Regulation, Fungal , Transcription Factors/genetics
7.
BMC Bioinformatics ; 16: 39, 2015 Feb 06.
Article in English | MEDLINE | ID: mdl-25727492

ABSTRACT

BACKGROUND: RNA pseudoknots play important roles in many biological processes. Previous methods for comparative pseudoknot analysis mainly focus on simultaneous folding and alignment of RNA sequences. Little work has been done to align two known RNA secondary structures with pseudoknots taking into account both sequence and structure information of the two RNAs. RESULTS: In this article we present a novel method for aligning two known RNA secondary structures with pseudoknots. We adopt the partition function methodology to calculate the posterior log-odds scores of the alignments between bases or base pairs of the two RNAs with a dynamic programming algorithm. The posterior log-odds scores are then used to calculate the expected accuracy of an alignment between the RNAs. The goal is to find an optimal alignment with the maximum expected accuracy. We present a heuristic to achieve this goal. The performance of our method is investigated and compared with existing tools for RNA structure alignment. An extension of the method to multiple alignment of pseudoknot structures is also discussed. CONCLUSIONS: The method described here has been implemented in a tool named RKalign, which is freely accessible on the Internet. As more and more pseudoknots are revealed, collected and stored in public databases, we anticipate a tool like RKalign will play a significant role in data comparison, annotation, analysis, and retrieval in these databases.
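
The goal stated in this abstract, finding an alignment with maximum expected accuracy, can be illustrated with a much simpler stand-in: a dynamic program that, given a matrix of per-position match scores, finds the alignment with the largest total score. This is only a sketch under assumptions. The matrix values are made up, gaps score zero, and base pairs and pseudoknots (the point of RKalign) are ignored entirely, so it is not the heuristic the authors describe.

```python
def mea_alignment(score):
    """Return (best total, aligned index pairs) maximizing summed match scores.

    score[i][j] is a hypothetical posterior-derived score for aligning
    position i of RNA A with position j of RNA B; gaps contribute zero.
    """
    n, m = len(score), len(score[0])
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j - 1] + score[i - 1][j - 1],  # align i with j
                best[i - 1][j],                            # leave i unaligned
                best[i][j - 1],                            # leave j unaligned
            )
    # Trace back through the table to recover the aligned pairs.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + score[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j]:
            i -= 1
        else:
            j -= 1
    return best[n][m], pairs[::-1]

# Hypothetical scores for a 2-base RNA against a 3-base RNA.
scores = [
    [0.9, 0.1, 0.0],
    [0.2, 0.1, 0.7],
]
total, pairs = mea_alignment(scores)
```

Here the best alignment matches position 0 with 0 and position 1 with 2, skipping the middle base of the second RNA.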


Subjects
Algorithms , RNA/chemistry , Base Pair Mismatch , Base Pairing , Nucleic Acid Conformation , Sequence Alignment , Sequence Analysis, RNA , Software
8.
IEEE Trans Cybern ; 45(6): 1113-25, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25137740

ABSTRACT

We consider a new tree mining problem that aims to discover restrictedly embedded subtree patterns from a set of rooted labeled unordered trees. We study the properties of a canonical form of unordered trees, and develop new Apriori-based techniques to generate all candidate subtrees level by level through two efficient rightmost expansion operations: 1) pairwise joining and 2) leg attachment. Next, we show that restrictedly embedded subtree detection can be achieved by calculating the restricted edit distance between a candidate subtree and a data tree. These techniques are then integrated into an efficient algorithm, named frequent restrictedly embedded subtree miner (FRESTM), to solve the tree mining problem at hand. The correctness of the FRESTM algorithm is proved and the time and space complexities of the algorithm are discussed. Experimental results on synthetic and real-world data demonstrate the effectiveness of the proposed approach.
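
The abstract relies on a canonical form for unordered trees but does not define it here. A common generic choice, shown below purely as an assumption, is to encode each subtree as a string and sort the encodings of sibling subtrees, so that two trees differing only in child order get the same encoding. The tuple representation and the function name are hypothetical.

```python
def canonical_form(tree):
    """Encode a rooted labeled unordered tree as an order-independent string.

    A tree is a (label, children) tuple, where children is a list of trees.
    Sibling encodings are sorted, so child order does not affect the result.
    """
    label, children = tree
    encoded = sorted(canonical_form(child) for child in children)
    return label + "(" + "".join(encoded) + ")"

# The same unordered tree written with two different child orders.
t1 = ("a", [("b", []), ("c", [("d", []), ("e", [])])])
t2 = ("a", [("c", [("e", []), ("d", [])]), ("b", [])])
same = canonical_form(t1) == canonical_form(t2)
```

A canonical encoding like this lets a level-by-level candidate generator detect duplicates with a simple set lookup instead of a tree isomorphism test.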

9.
Comput Biol Chem ; 47: 240-5, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24211672

ABSTRACT

RNA tertiary interactions or tertiary motifs are conserved structural patterns formed by pairwise interactions between nucleotides. They include base-pairing, base-stacking, and base-phosphate interactions. A-minor motifs are the most common tertiary interactions in the large ribosomal subunit. The A-minor motif is a nucleotide triple in which minor groove edges of an adenine base are inserted into the minor groove of neighboring helices, leading to interaction with a stabilizing base pair. We propose here novel features for identifying and predicting A-minor motifs in a given three-dimensional RNA molecule. By utilizing the features together with machine learning algorithms including random forests and support vector machines, we show experimentally that our approach is capable of predicting A-minor motifs in the given RNA molecule effectively, demonstrating the usefulness of the proposed approach. The techniques developed from this work will be useful for molecular biologists and biochemists to analyze RNA tertiary motifs, specifically A-minor interactions.


Subjects
Algorithms , RNA/chemistry , Crystallography, X-Ray , Models, Molecular , Molecular Dynamics Simulation , Nuclear Magnetic Resonance, Biomolecular , Nucleic Acid Conformation
10.
OMICS ; 17(9): 486-93, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23808606

ABSTRACT

MicroRNAs play important roles in most biological processes, including cell proliferation, tissue differentiation, and embryonic development, among others. They originate from precursor transcripts (pre-miRNAs), which contain phylogenetically conserved stem-loop structures. An important bioinformatics problem is to distinguish the pre-miRNAs from pseudo pre-miRNAs that have similar stem-loop structures. We present here a novel method for tackling this bioinformatics problem. Our method, named MirID, accepts an RNA sequence as input, and classifies the RNA sequence either as positive (i.e., a real pre-miRNA) or as negative (i.e., a pseudo pre-miRNA). MirID employs a feature mining algorithm for finding combinations of features suitable for building pre-miRNA classification models. These models are implemented using support vector machines, which are combined to construct a classifier ensemble. The accuracy of the classifier ensemble is further enhanced by the utilization of an AdaBoost algorithm. When compared with two closely related tools on twelve species analyzed with these tools, MirID outperforms the existing tools on the majority of the twelve species. MirID was also tested on nine additional species, and the results showed high accuracies on the nine species. The MirID web server is fully operational and freely accessible at http://bioinformatics.njit.edu/MirID/. Potential applications of this software in genomics and medicine are also discussed.
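
To make the boosting step in this abstract concrete, the sketch below runs one round of textbook discrete AdaBoost: it computes the vote weight of a weak classifier and reweights the training samples so that the misclassified ones count more in the next round. It is a generic illustration, not MirID's configuration. The function name and the toy labels are assumptions.

```python
from math import exp, log

def adaboost_round(weights, labels, predictions):
    """Run one discrete AdaBoost reweighting step.

    labels and predictions hold +1 (real pre-miRNA) or -1 (pseudo pre-miRNA).
    Returns (alpha, new_weights), where alpha is the vote given to this
    weak classifier in the final ensemble.
    """
    error = sum(w for w, y, h in zip(weights, labels, predictions) if y != h)
    if not 0.0 < error < 0.5:
        raise ValueError("weak classifier must have weighted error in (0, 0.5)")
    alpha = 0.5 * log((1.0 - error) / error)
    # Misclassified samples (y * h == -1) gain weight; correct ones lose it.
    scaled = [w * exp(-alpha * y * h) for w, y, h in zip(weights, labels, predictions)]
    total = sum(scaled)
    return alpha, [w / total for w in scaled]

weights = [0.25, 0.25, 0.25, 0.25]
labels = [1, 1, -1, -1]
predictions = [1, 1, -1, 1]  # the last sample is misclassified
alpha, new_weights = adaboost_round(weights, labels, predictions)
```

After the update the single misclassified sample carries half of the total weight, which is what forces the next weak classifier to focus on it.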


Subjects
Computational Biology , Data Mining , MicroRNAs/classification , RNA Precursors/classification , Software , Algorithms , Animals , Computational Biology/methods , Data Mining/methods , Databases, Nucleic Acid , Humans , Internet , MicroRNAs/chemistry , MicroRNAs/genetics , RNA Precursors/chemistry , RNA Precursors/genetics , Reproducibility of Results , Sensitivity and Specificity
11.
Recent Pat DNA Gene Seq ; 7(2): 115-22, 2013 Aug.
Article in English | MEDLINE | ID: mdl-22974261

ABSTRACT

Motif finding in DNA, RNA and proteins plays an important role in life science research. Recent patents concerning motif finding in biomolecular data are recorded in the DNA Patent Database which serves as a resource for policy makers and members of the general public interested in fields like genomics, genetics and biotechnology. In this paper, we present a computational approach to mining for RNA tertiary motifs in genomic sequences. Specifically, we describe a method, named CSminer, and show, as a case study, the application of CSminer to genome-wide search for coaxial helical stackings in RNA 3-way junctions. A coaxial helical stacking occurs in an RNA 3-way junction where two separate helical elements form a pseudocontiguous helix and provide thermodynamic stability to the RNA molecule as a whole. Experimental results demonstrate the effectiveness of our approach.


Subjects
Computational Biology , RNA/chemistry , Base Sequence , Chromosomes, Archaeal/genetics , Haloarcula/genetics , Nucleic Acid Conformation , Nucleotide Motifs , Patents as Topic
12.
J Bioinform Comput Biol ; 10(4): 1250001, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22809414

ABSTRACT

We propose an ab initio method, named DiscoverR, for finding common patterns from two RNA secondary structures. The method works by representing RNA secondary structures as ordered labeled trees and performs tree pattern discovery using an efficient dynamic programming algorithm. DiscoverR is able to identify and extract the largest common substructures from two RNA molecules having different sizes without prior knowledge of the locations and topologies of these substructures. We also extend DiscoverR to find repeated regions in an RNA secondary structure, and apply this extended method to detect structural repeats in the 3'-untranslated region of a protein kinase gene. We describe the biological significance of a repeated hairpin found by our method, demonstrating the usefulness of the method. DiscoverR is implemented in Java; a jar file including the source code of the program is available for download at http://bioinformatics.njit.edu/DiscoverR.
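
The representation step in this abstract, turning an RNA secondary structure into an ordered labeled tree, can be sketched as follows. The abstract does not give DiscoverR's input format or labeling scheme, so the dot-bracket input, the choice to label each base pair node with its two bases, and the function name are all assumptions for illustration.

```python
def structure_to_tree(sequence, dot_bracket):
    """Build an ordered labeled tree from a sequence and a dot-bracket string.

    Each base pair becomes an internal node labeled with its two bases and
    each unpaired base becomes a leaf. Nodes are [label, children] lists,
    and children keep their 5'-to-3' order.
    """
    if len(sequence) != len(dot_bracket):
        raise ValueError("sequence and structure must have equal length")
    root = ["root", []]
    stack = [root]
    for base, symbol in zip(sequence, dot_bracket):
        if symbol == "(":
            node = [base, []]          # label is completed at the closing base
            stack[-1][1].append(node)
            stack.append(node)
        elif symbol == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced structure")
            node = stack.pop()
            node[0] += base
        elif symbol == ".":
            stack[-1][1].append([base, []])
        else:
            raise ValueError("unexpected symbol: " + symbol)
    if len(stack) != 1:
        raise ValueError("unbalanced structure")
    return root

# A hypothetical hairpin: two G-C pairs closing a three-base loop.
tree = structure_to_tree("GGAAACC", "((...))")
```

Once structures are trees, finding common substructures becomes a tree pattern discovery problem, which is the setting the abstract describes.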


Subjects
Computational Biology/methods , Nucleic Acid Conformation , RNA/chemistry , 3' Untranslated Regions , Algorithms , Base Sequence , Molecular Sequence Data , Protein Kinases/chemistry , Sequence Analysis, RNA
13.
Genomics Proteomics Bioinformatics ; 10(2): 114-21, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22768985

ABSTRACT

Recently non-coding RNA (ncRNA) genes have been found to serve many important functions in the cell such as regulation of gene expression at the transcriptional level. Potentially there are more ncRNA molecules yet to be found and their possible functions are to be revealed. The discovery of ncRNAs is a difficult task because they lack sequence indicators such as the start and stop codons displayed by protein-coding RNAs. Current methods utilize either sequence motifs or structural parameters to detect novel ncRNAs within genomes. Here, we present an ab initio ncRNA finder, named ncRNAscout, by utilizing both sequence motifs and structural parameters. Specifically, our method has three components: (i) a measure of the frequency of a sequence, (ii) a measure of the structural stability of a sequence contained in a t-score, and (iii) a measure of the frequency of certain patterns within a sequence that may indicate the presence of ncRNA. Experimental results show that, given a genome and a set of known ncRNAs, our method is able to accurately identify and locate a significant number of ncRNA sequences in the genome. The ncRNAscout tool is available for downloading at http://bioinformatics.njit.edu/ncRNAscout.


Subjects
RNA, Bacterial/genetics , RNA, Untranslated/genetics , Algorithms , Base Sequence , Computational Biology , Genome, Bacterial
14.
Nucleic Acids Res ; 40(2): 487-98, 2012 Jan.
Article in English | MEDLINE | ID: mdl-21917853

ABSTRACT

RNA junctions are important structural elements that form when three or more helices come together in space in the tertiary structures of RNA molecules. Determining their structural configuration is important for predicting RNA 3D structure. We introduce a computational method to predict, at the secondary structure level, the coaxial helical stacking arrangement in junctions, as well as classify the junction topology. Our approach uses a data mining approach known as random forests, which relies on a set of decision trees trained using length, sequence and other variables specified for any given junction. The resulting protocol predicts coaxial stacking within three- and four-way junctions with an accuracy of 81% and 77%, respectively; the accuracy increases to 83% and 87%, respectively, when knowledge from the junction family type is included. Coaxial stacking predictions for the five- to ten-way junctions are less accurate (60%) due to sparse data available for training. Additionally, our application predicts the junction family with an accuracy of 85% for three-way junctions and 74% for four-way junctions. Comparisons with other methods, as well as applications to unsolved RNAs, are also presented. The web server Junction-Explorer to predict junction topologies is freely available at: http://bioinformatics.njit.edu/junction.


Subjects
Decision Trees , RNA, Double-Stranded/chemistry , Algorithms , Computational Biology/methods , Data Mining , Models, Molecular , Nucleic Acid Conformation
15.
Int J Bioinform Res Appl ; 7(4): 355-75, 2011.
Article in English | MEDLINE | ID: mdl-22112528

ABSTRACT

We propose here a new approach for ncRNA prediction. Our approach selects features derived from RNA folding programs and ranks these features using a class separation method that measures the ability of the features to differentiate between positive and negative classes. The target feature set comprising top-ranked features is then used to construct several classifiers with different supervised learning algorithms. These classifiers are compared to the same supervised learning algorithms with the baseline feature set employed in a state-of-the-art method. Experimental results based on ncRNA families taken from the Rfam database demonstrate the good performance of the proposed approach.


Subjects
Artificial Intelligence , RNA Folding , RNA, Untranslated/chemistry , Algorithms , Databases, Genetic
16.
J Bioinform Comput Biol ; 8(6): 967-80, 2010 Dec.
Article in English | MEDLINE | ID: mdl-21121021

ABSTRACT

We present a method, called BlockMatch, for aligning two blocks, where a block is an RNA multiple sequence alignment with the consensus secondary structure of the alignment in Stockholm format. The method employs a quadratic-time dynamic programming algorithm for aligning columns and column pairs of the multiple alignments in the blocks. Unlike many other tools that can perform pairwise alignment of either single sequences or structures only, BlockMatch takes into account the characteristics of all the sequences in the blocks along with their consensus structures during the alignment process, and is thus able to achieve a high-quality alignment result. We apply BlockMatch to phylogeny reconstruction on a set of 5S rRNA sequences taken from fifteen bacterial species. Experimental results showed that the phylogenetic tree generated by our method is more accurate than the tree constructed based on the widely used ClustalW tool. The BlockMatch algorithm is implemented in a web server, accessible at http://bioinformatics.njit.edu/blockmatch. A jar file of the program is also available for download from the web server.


Subjects
RNA/genetics , Sequence Alignment/statistics & numerical data , Algorithms , Bacteria/classification , Bacteria/genetics , Computational Biology , Nucleic Acid Conformation , Phylogeny , RNA/chemistry , RNA, Bacterial/genetics , RNA, Ribosomal, 5S/genetics
17.
Bioinform Biol Insights ; 3: 51-69, 2009 Jun 03.
Article in English | MEDLINE | ID: mdl-20140072

ABSTRACT

Thermodynamic processes with free energy parameters are often used in algorithms that solve the free energy minimization problem to predict secondary structures of single RNA sequences. While results from these algorithms are promising, an observation is that single sequence-based methods have moderate accuracy and more information is needed to improve on RNA secondary structure prediction, such as covariance scores obtained from multiple sequence alignments. We present in this paper a new approach to predicting the consensus secondary structure of a set of aligned RNA sequences via pseudo-energy minimization. Our tool, called RSpredict, takes into account sequence covariation and employs effective heuristics for accuracy improvement. RSpredict accepts, as input data, a multiple sequence alignment in FASTA or ClustalW format and outputs the consensus secondary structure of the input sequences in both the Vienna style Dot Bracket format and the Connectivity Table format. Our method was compared with some widely used tools including KNetFold, Pfold and RNAalifold. A comprehensive test on different datasets including Rfam sequence alignments and a multiple sequence alignment obtained from our study on the Drosophila X chromosome reveals that RSpredict is competitive with the existing tools on the tested datasets. RSpredict is freely available online as a web server and also as a jar file for download at http://datalab.njit.edu/biology/RSpredict.
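
The two output formats named in this abstract carry the same information: a dot-bracket string marks paired bases with matching parentheses, while a connectivity table records, for each position, the index of its pairing partner. The sketch below converts the former into a list of 1-based partner indices (0 for unpaired), which is the pairing information a connectivity table stores. It is a minimal illustration with a hypothetical function name and does not reproduce RSpredict's actual output code or the full connectivity table file layout.

```python
def pair_table(dot_bracket):
    """Return 1-based pairing partners (0 = unpaired) for a dot-bracket string."""
    partners = [0] * len(dot_bracket)
    stack = []  # positions of opening brackets still waiting for a partner
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            j = stack.pop()
            partners[i], partners[j] = j + 1, i + 1
        elif symbol != ".":
            raise ValueError("unexpected symbol: " + symbol)
    if stack:
        raise ValueError("unbalanced structure")
    return partners

# A hypothetical hairpin with two base pairs and a three-base loop.
table = pair_table("((...))")
```

In this example position 1 pairs with position 7 and position 2 with position 6, while the three loop positions are unpaired.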

18.
Comput Biol Chem ; 32(4): 264-72, 2008 Aug.
Article in English | MEDLINE | ID: mdl-18472302

ABSTRACT

Constrained sequence alignment has been studied extensively in the past. Different forms of constraints have been investigated, where a constraint can be a subsequence, a regular expression, or a probability matrix of symbols and positions. However, constrained structural alignment has been investigated to a much lesser extent. In this paper, we present an efficient method for constrained structural alignment and apply the method to detecting conserved secondary structures, or structural motifs, in a set of RNA molecules. The proposed method combines both sequence and structural information of RNAs to find an optimal local alignment between two RNA secondary structures, one of which is a query and the other is a subject structure in the given set. The method allows a biologist to annotate conserved regions, or constraints, in the query RNA structure and incorporates these regions into the alignment process to obtain biologically more meaningful alignment scores. A statistical measure is developed to assess the significance of the scores. Experimental results based on detecting internal ribosome entry sites in the RNA molecules of hepatitis C virus and Trypanosoma brucei demonstrate the effectiveness of the proposed method and its superiority over existing techniques.


Subjects
Computational Biology/methods , Nucleic Acid Conformation , RNA/chemistry , Algorithms , Animals , Base Sequence , Hepacivirus/genetics , Molecular Sequence Data , RNA/genetics , Trypanosoma brucei brucei/genetics
19.
BMC Genomics ; 9: 189, 2008 Apr 25.
Article in English | MEDLINE | ID: mdl-18439287

ABSTRACT

BACKGROUND: UnTranslated Regions (UTRs) of mRNAs contain regulatory elements for various aspects of mRNA metabolism, such as mRNA localization, translation, and mRNA stability. Several RNA stem-loop structures in UTRs have been experimentally identified, including the histone 3' UTR stem-loop structure (HSL3) and iron response element (IRE). These stem-loop structures are conserved among mammalian orthologs, and exist in a group of genes encoding proteins involved in the same biological pathways. It is not known to what extent RNA structures like these exist in all mammalian UTRs. RESULTS: In this paper we took a systematic approach, named GLEAN-UTR, to identify small stem-loop RNA structure elements in UTRs that are conserved between human and mouse orthologs and exist in multiple genes with common Gene Ontology terms. This approach resulted in 90 distinct RNA structure groups containing 748 structures, with HSL3 and IRE among the top hits based on conservation of structure. CONCLUSION: Our result indicates that there may exist many conserved stem-loop structures in mammalian UTRs that are involved in coordinate post-transcriptional regulation of biological pathways.


Subjects
RNA, Messenger/chemistry , RNA, Messenger/genetics , Untranslated Regions , Animals , Base Sequence , Cluster Analysis , Conserved Sequence , Databases, Nucleic Acid , Humans , Mice , Sequence Alignment , Software Design , Species Specificity
20.
Nucleic Acids Res ; 35(Web Server issue): W300-4, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17517784

ABSTRACT

RADAR is a web server that provides a wide range of functions for RNA data analysis and research. It can align structure-annotated RNA sequences so that both sequence and structure information are taken into consideration during the alignment process. This server is capable of performing pairwise structure alignment, multiple structure alignment, database search and clustering. In addition, RADAR provides two salient features: (i) constrained alignment of RNA secondary structures, and (ii) prediction of the consensus structure for a set of RNA sequences. RADAR will be able to assist scientists in performing many important RNA mining operations, including understanding the functionality of RNA sequences, detecting RNA structural motifs and clustering RNA molecules, among others. The web server together with a software package for download is freely accessible at http://datalab.njit.edu/biodata/rna/RSmatch/server.htm and http://www.ccrnp.ncifcrf.gov/~bshapiro/


Subjects
Computational Biology/methods , Nucleic Acid Conformation , RNA/chemistry , Algorithms , Base Sequence , Computer Simulation , Conserved Sequence , Databases, Genetic , Humans , Molecular Sequence Data , RNA, Untranslated , Regulatory Sequences, Ribonucleic Acid , Sequence Alignment , Sequence Analysis, RNA , Sequence Homology, Nucleic Acid