Results 1 - 20 of 32
1.
J Mol Biol ; : 168552, 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38552946

ABSTRACT

With advances in protein structure prediction thanks to deep learning models like AlphaFold, RNA structure prediction has recently received increased attention from deep learning researchers. RNAs introduce substantial challenges due to the sparser availability and lower structural diversity of the experimentally resolved RNA structures in comparison to protein structures. These challenges are often poorly addressed by the existing literature, much of which reports inflated performance due to using training and testing sets with significant structural overlap. Further, the most recent Critical Assessment of Structure Prediction (CASP15) has shown that deep learning models for RNA structure are currently outperformed by traditional methods. In this paper we present RNA3DB, a dataset of structured RNAs, derived from the Protein Data Bank (PDB), that is designed for training and benchmarking deep learning models. The RNA3DB method arranges the RNA 3D chains into distinct groups (Components) that are non-redundant with regard to both sequence and structure, providing a robust way of dividing training, validation, and testing sets. Any split of these structurally dissimilar Components is guaranteed to produce test and validation sets that are distinct by sequence and structure from those in the training set. We provide the RNA3DB dataset, a particular train/test split of the RNA3DB Components (in an approximate 70/30 ratio) that will be updated periodically. We also provide the RNA3DB methodology along with the source code, with the goal of creating a reproducible and customizable tool for producing structurally dissimilar dataset splits for structural RNAs.
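
The component-level split described in this abstract can be illustrated with a minimal sketch. It assumes each Component is simply a list of chain identifiers; the function name `split_components`, the greedy fill strategy, and the toy identifiers are hypothetical and are not taken from the RNA3DB source code.

```python
import random


def split_components(components, train_fraction=0.7, seed=0):
    """Assign whole components to train or test (hypothetical helper).

    components: dict mapping a component name to a list of chain IDs.
    Components are never broken up, so no chain in the test set shares
    a component with a chain in the training set.
    """
    names = sorted(components)
    random.Random(seed).shuffle(names)
    total = sum(len(components[name]) for name in names)

    train, test, n_train = [], [], 0
    for name in names:
        chains = components[name]
        # Add the component to train only while the target fraction is not exceeded.
        if n_train + len(chains) <= train_fraction * total:
            train.extend(chains)
            n_train += len(chains)
        else:
            test.extend(chains)
    return train, test


# Toy input with made-up chain identifiers (illustration only).
toy = {
    "component_0": ["1abc_A", "2abc_B", "3abc_A"],
    "component_1": ["4abc_A", "5abc_C"],
    "component_2": ["6abc_A"],
    "component_3": ["7abc_A", "8abc_B", "9abc_A", "1xyz_A"],
}
train_set, test_set = split_components(toy)
```

Because assignment happens per component rather than per chain, the resulting ratio is only approximately 70/30, which is consistent with the "approximate" ratio mentioned in the abstract.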

2.
bioRxiv ; 2024 Mar 11.
Article in English | MEDLINE | ID: mdl-38352531

ABSTRACT

With advances in protein structure prediction thanks to deep learning models like AlphaFold, RNA structure prediction has recently received increased attention from deep learning researchers. RNAs introduce substantial challenges due to the sparser availability and lower structural diversity of the experimentally resolved RNA structures in comparison to protein structures. These challenges are often poorly addressed by the existing literature, much of which reports inflated performance due to using training and testing sets with significant structural overlap. Further, the most recent Critical Assessment of Structure Prediction (CASP15) has shown that deep learning models for RNA structure are currently outperformed by traditional methods. In this paper we present RNA3DB, a dataset of structured RNAs, derived from the Protein Data Bank (PDB), that is designed for training and benchmarking deep learning models. The RNA3DB method arranges the RNA 3D chains into distinct groups (Components) that are non-redundant with regard to both sequence and structure, providing a robust way of dividing training, validation, and testing sets. Any split of these structurally dissimilar Components is guaranteed to produce test and validation sets that are distinct by sequence and structure from those in the training set. We provide the RNA3DB dataset, a particular train/test split of the RNA3DB Components (in an approximate 70/30 ratio) that will be updated periodically. We also provide the RNA3DB methodology along with the source code, with the goal of creating a reproducible and customizable tool for producing structurally dissimilar dataset splits for structural RNAs.

3.
Nephrol Nurs J ; 50(4): 333-344, 2023.
Article in English | MEDLINE | ID: mdl-37695519

ABSTRACT

Central venous catheter-related infection is the most common complication in patients on hemodialysis. Nursing care is essential for its maintenance, minimizing risk factors, and avoiding complications, such as bacteremia. A systematic review was conducted to identify the influence of nursing care on the prevention of bacteremia due to hemodialysis catheter. The primary endpoint was the bacteremia rate measured as number of events per 1000 catheter days. The rate of bacteremia in the studies ranged from 0.2 to 5.47 events per 1000 catheter days after the application of nursing care. Several studies have shown a significant reduction in central venous catheter bacteremia with the application of management protocols, appropriate vigilance, and monitoring, as well as the inclusion of the Plan Do Check Act cycle and education.
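
The primary endpoint in this abstract is a simple incidence rate. As a minimal sketch, assuming the counts of events and catheter days are already known, it could be computed as follows; the function name and the example numbers are made up for illustration and do not come from the reviewed studies.

```python
def rate_per_1000_catheter_days(events, catheter_days):
    """Bacteremia events per 1000 catheter days (hypothetical helper)."""
    if catheter_days <= 0:
        raise ValueError("catheter_days must be positive")
    return 1000.0 * events / catheter_days


# Made-up example: 3 events observed over 4200 catheter days.
rate = rate_per_1000_catheter_days(events=3, catheter_days=4200)
```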


Subjects
Bacteremia; Catheter-Related Infections; Catheterization, Central Venous; Central Venous Catheters; Humans; Catheterization, Central Venous/adverse effects; Catheters, Indwelling/adverse effects; Central Venous Catheters/adverse effects; Renal Dialysis/adverse effects; Bacteremia/etiology; Bacteremia/prevention & control; Catheter-Related Infections/etiology
4.
PLoS Comput Biol ; 19(7): e1011262, 2023 07.
Article in English | MEDLINE | ID: mdl-37450549

ABSTRACT

Many biologically important RNAs fold into specific 3D structures conserved through evolution. Knowing when an RNA sequence includes a conserved RNA structure that could lead to new biology is not trivial and depends on clues left behind by conservation in the form of covariation and variation. For that purpose, the R-scape statistical test was created to identify, from alignments of RNA sequences, the base pairs that significantly covary above phylogenetic expectation. R-scape treats base pairs as independent units. However, RNA base pairs do not occur in isolation. The Watson-Crick (WC) base pairs stack together forming helices that constitute the scaffold that facilitates the formation of the non-WC base pairs, and ultimately the complete 3D structure. The helix-forming WC base pairs carry most of the covariation signal in an RNA structure. Here, I introduce a new measure of statistically significant covariation at helix-level by aggregation of the covariation significance and covariation power calculated at base-pair-level resolution. Performance benchmarks show that helix-level aggregated covariation increases sensitivity in the detection of evolutionarily conserved RNA structure without sacrificing specificity. This additional helix-level sensitivity reveals an artifact that results from using covariation to build an alignment for a hypothetical structure and then testing the alignment for whether its covariation significantly supports the structure. Helix-level reanalysis of the evolutionary evidence for a selection of long non-coding RNAs (lncRNAs) reinforces the evidence against these lncRNAs having a conserved secondary structure.
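
The abstract does not state which rule is used to aggregate base-pair-level significance into a helix-level value. Purely as an illustration of the general idea, the sketch below combines per-base-pair p-values with Fisher's method, a standard technique swapped in here as an assumption; it presumes independent p-values, and the function name is hypothetical.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis. For an even
    number of degrees of freedom the survival function has the closed form
    exp(-X/2) * sum_{j<k} (X/2)^j / j!, so no external library is needed.
    """
    k = len(pvalues)
    if k == 0:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")

    half_stat = -sum(math.log(p) for p in pvalues)  # this is X / 2
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half_stat / j
        total += term
    return math.exp(-half_stat) * total


# Made-up p-values for the base pairs of one hypothetical helix.
helix_pvalue = fisher_combined_pvalue([0.05, 0.05])
```

In this toy case two individually borderline base pairs yield a smaller combined p-value, which is the kind of sensitivity gain that aggregation is meant to provide.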


Subjects
RNA, Long Noncoding; RNA; RNA/genetics; Nucleic Acid Conformation; RNA, Long Noncoding/genetics; Phylogeny; Base Pairing; Base Sequence
5.
bioRxiv ; 2023 Apr 17.
Article in English | MEDLINE | ID: mdl-37131783

ABSTRACT

Many biologically important RNAs fold into specific 3D structures conserved through evolution. Knowing when an RNA sequence includes a conserved RNA structure that could lead to new biology is not trivial and depends on clues left behind by conservation in the form of covariation and variation. For that purpose, the R-scape statistical test was created to identify, from alignments of RNA sequences, the base pairs that significantly covary above phylogenetic expectation. R-scape treats base pairs as independent units. However, RNA base pairs do not occur in isolation. The Watson-Crick (WC) base pairs stack together forming helices that constitute the scaffold that facilitates the formation of the non-WC base pairs, and ultimately the complete 3D structure. The helix-forming WC base pairs carry most of the covariation signal in an RNA structure. Here, I introduce a new measure of statistically significant covariation at helix-level by aggregation of the covariation significance and covariation power calculated at base-pair-level resolution. Performance benchmarks show that helix-level aggregated covariation increases sensitivity in the detection of evolutionarily conserved RNA structure without sacrificing specificity. This additional helix-level sensitivity reveals an artifact that results from using covariation to build an alignment for a hypothetical structure and then testing the alignment for whether its covariation significantly supports the structure. Helix-level reanalysis of the evolutionary evidence for a selection of long non-coding RNAs (lncRNAs) reinforces the evidence against these lncRNAs having a conserved secondary structure. Availability: Helix aggregated E-values are integrated in the R-scape software package (version 2.0.0.p and higher). The R-scape web server eddylab.org/R-scape includes a link to download the source code. Contact: elenarivas@fas.harvard.edu.
Supplementary information: Supplementary data and code are provided with this manuscript at rivaslab.org .

6.
Nucleic Acids Res ; 51(7): e40, 2023 04 24.
Article in English | MEDLINE | ID: mdl-36869673

ABSTRACT

An RNA design algorithm takes a target RNA structure and finds a sequence that folds into that structure. This is fundamentally important for engineering therapeutics using RNA. Computational RNA design algorithms are guided by fitness functions, but not much research has been done on the merits of these functions. We survey current RNA design approaches with a particular focus on the fitness functions used. We experimentally compare the most widely used fitness functions in RNA design algorithms on both synthetic and natural sequences. It has been almost 20 years since the last comparison was published, and we find similar results with a major new result: maximizing probability outperforms minimizing ensemble defect. The probability is the likelihood of a structure at equilibrium and the ensemble defect is the weighted average number of incorrect positions in the ensemble. We find that maximizing probability leads to better results on synthetic RNA design puzzles and agrees more often than other fitness functions with natural sequences and structures, which were designed by evolution. Also, we observe that many recently published approaches minimize structure distance to the minimum free energy prediction, which we find to be a poor fitness function.
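
The two fitness functions compared in this abstract can be made concrete on a toy example. The sketch below assumes a tiny, explicitly enumerated ensemble of dot-bracket structures with made-up free energies; real design tools compute these quantities over the full structure ensemble with a partition function algorithm, which is not shown here. All names, energies, and the `RT` constant are illustrative assumptions.

```python
import math

RT = 0.616  # assumed value in kcal/mol, roughly 37 degrees Celsius


def pair_table(dot_bracket):
    """Return, for each position, the index of its partner or -1 if unpaired."""
    stack, partner = [], [-1] * len(dot_bracket)
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            j = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def boltzmann_probabilities(ensemble):
    """Equilibrium probability of each structure in the enumerated ensemble."""
    weights = {s: math.exp(-energy / RT) for s, energy in ensemble.items()}
    z = sum(weights.values())
    return {s: w / z for s, w in weights.items()}


def ensemble_defect(ensemble, target):
    """Probability-weighted average number of positions that differ from target."""
    probs = boltzmann_probabilities(ensemble)
    target_pairs = pair_table(target)
    defect = 0.0
    for structure, p in probs.items():
        wrong = sum(a != b for a, b in zip(pair_table(structure), target_pairs))
        defect += p * wrong
    return defect


# Hypothetical ensemble for one candidate sequence: structure -> free energy.
toy_ensemble = {"((...))": -1.0, "(.....)": 0.5, ".......": 0.0}
target = "((...))"
probs = boltzmann_probabilities(toy_ensemble)
p_target = probs[target]  # the quantity a "maximize probability" design raises
defect = ensemble_defect(toy_ensemble, target)  # the quantity the other lowers
```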


Subjects
Algorithms; RNA; RNA/genetics; RNA/chemistry; Nucleic Acid Conformation; Probability
7.
Nurs Open ; 10(3): 1611-1618, 2023 03.
Article in English | MEDLINE | ID: mdl-36266761

ABSTRACT

AIM: To evaluate the impact of an educational intervention focused on teaching students to create infographics to improve pharmacology learning. DESIGN: This is a comparative study. METHODS: The population was 250 nursing students who had to create, in groups, two infographics on content covered in the pharmacology course at two different time points. Students and professors evaluated the infographics using a 5-point Likert scale. Scores from the official exam of the pharmacology subject were obtained. RESULTS: Most of the students scored below 50% for the "excellent" and "good" categories. Intraclass correlation and kappa correlations between students' and professors' evaluations were low. The comparison between the students' evaluations at the two time points yielded significant correlation values only for the criteria "Understanding of information" (r = .039, p = .024) and "Visual presentation of information" (r = .041, p = .019). No correlation was found between the exam scores and the infographic evaluation scores.


Subjects
Data Visualization; Students, Nursing; Humans; Educational Measurement; Learning; Curriculum
8.
IUBMB Life ; 75(6): 471-492, 2023 06.
Article in English | MEDLINE | ID: mdl-36495545

ABSTRACT

Covariation induced by compensatory base substitutions in RNA alignments is a great way to deduce conserved RNA structure, in principle. In practice, success depends on many factors, importantly the quality and depth of the alignment and the choice of covariation statistic. Measuring covariation between pairs of aligned positions is easy. However, using covariation to infer evolutionarily conserved RNA structure is complicated by other extraneous sources of covariation such as that resulting from homologous sequences having evolved from a common ancestor. In order to provide evidence of evolutionarily conserved RNA structure, a method to distinguish covariation due to sources other than RNA structure is necessary. Moreover, there are several sorts of artifactually generated covariation signals that can further confound the analysis. Additionally, some covariation signal is difficult to detect due to incomplete comparative data. Here, we investigate and critically discuss the practice of inferring conserved RNA structure by comparative sequence analysis. We provide new methods on how to approach and decide which of the numerous long non-coding RNAs (lncRNAs) have biologically relevant structures.


Subjects
RNA, Long Noncoding; Nucleic Acid Conformation; Sequence Alignment
9.
Nucleic Acids Res ; 49(11): 6128-6143, 2021 06 21.
Article in English | MEDLINE | ID: mdl-34086938

ABSTRACT

Many non-coding RNAs with known functions are structurally conserved: their intramolecular secondary and tertiary interactions are maintained across evolutionary time. Consequently, the presence of conserved structure in multiple sequence alignments can be used to identify candidate functional non-coding RNAs. Here, we present a bioinformatics method that couples iterative homology search with covariation analysis to assess whether a genomic region has evidence of conserved RNA structure. We used this method to examine all unannotated regions of five well-studied fungal genomes (Saccharomyces cerevisiae, Candida albicans, Neurospora crassa, Aspergillus fumigatus, and Schizosaccharomyces pombe). We identified 17 novel structurally conserved non-coding RNA candidates, which include four H/ACA box small nucleolar RNAs, four intergenic RNAs and nine RNA structures located within the introns and untranslated regions (UTRs) of mRNAs. For the two structures in the 3' UTRs of the metabolic genes GLY1 and MET13, we performed experiments that provide evidence against them being eukaryotic riboswitches.


Subjects
RNA, Fungal/chemistry; RNA, Untranslated/chemistry; 3' Untranslated Regions; Computational Biology/methods; Genome, Fungal; Introns; Lysine-tRNA Ligase/genetics; Markov Chains; Nucleic Acid Conformation; RNA, Small Nucleolar/chemistry; Ribosomal Proteins/genetics; Riboswitch; Sequence Alignment; Thioredoxins/genetics
10.
Wiley Interdiscip Rev RNA ; 12(5): e1649, 2021 09.
Article in English | MEDLINE | ID: mdl-33754485

ABSTRACT

An RNA structure prediction from a single-sequence RNA folding program is not evidence for an RNA whose structure is important for function. Random sequences have plausible and complex predicted structures not easily distinguishable from those of structural RNAs. How to tell when an RNA has a conserved structure is a question that requires looking at the evolutionary signature left by the conserved RNA. This question is important not just for long noncoding RNAs which usually lack an identified function, but also for RNA binding protein motifs which can be single stranded RNAs or structures. Here we review recent advances using sequence and structural analysis to determine when RNA structure is conserved or not. Although covariation measures assess structural RNA conservation, one must distinguish covariation due to RNA structure from covariation due to independent phylogenetic substitutions. We review a statistical test to measure false positives expected under the null hypothesis of phylogenetic covariation alone (specificity). We also review a complementary test that measures power, that is, expected covariation derived from sequence variation alone (sensitivity). Power in the absence of covariation signals the absence of a conserved RNA structure. We analyze artifacts that falsely identify conserved RNA structure such as the misuse of programs that do not assess significance, the use of inappropriate statistics confounded by signals other than covariation, or misalignments that induce spurious covariation. Among artifacts that obscure the signal of a conserved RNA structure, we discuss the inclusion of pseudogenes in alignments which increase power but destroy covariation. This article is categorized under: RNA Structure and Dynamics > RNA Structure, Dynamics and Chemistry; RNA Evolution and Genomics > Computational Analyses of RNA; RNA Evolution and Genomics > RNA and Ribonucleoprotein Evolution.


Subjects
RNA, Long Noncoding; RNA; Base Sequence; Conserved Sequence; Nucleic Acid Conformation; Phylogeny; RNA/genetics; Sequence Alignment
11.
Article in English, Spanish | MEDLINE | ID: mdl-32475609

ABSTRACT

OBJECTIVE: The aim of this study was to create and validate an abbreviated version of the Spanish Transsexual Voice Questionnaire for Male-to-Female Transsexuals (SvTVQMtF). SETTING: The study was conducted by two referral hospitals for voice feminization surgery and by a university department of psychology and speech therapy, all in Spain. SUBJECTS AND METHODS: We prospectively studied 51 male-to-female transsexuals who underwent voice feminization surgery between January 2017 and December 2018. The SvTVQMtF was completed before and after surgery, and the 10 items with the greatest variation were selected by clinical consensus of an expert panel to develop the short version of the SvTVQMtF (SvTVQMtF-10). The correlation between the total score and the score for each item on the SvTVQMtF and the SvTVQMtF-10 was studied. The internal consistency of the SvTVQMtF-10 was analysed. RESULTS: Good correlation (Pearson coefficient above .90) was found between the two questionnaires. A significant correlation was found between the total SvTVQMtF-10 score and the score for each item. A significant negative correlation was found between the SvTVQMtF and fundamental frequency after voice feminization surgery. Cronbach's α was .79. CONCLUSION: The SvTVQMtF-10 is a valid short version of the SvTVQMtF and can be used to quantify voice-related quality of life in MtF transsexuals.
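
The internal consistency statistic reported in this abstract, Cronbach's α, follows a standard formula: α = k/(k-1) × (1 - Σ item variances / variance of total scores). A minimal sketch is shown below; the function name and the toy scores are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    if len(scores) < 2 or len(scores[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(scores[0])
    # Sample variance of each item (column) across respondents.
    item_variance_sum = sum(variance(column) for column in zip(*scores))
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Made-up responses: 4 respondents, 3 items on a 0-3 scale.
toy_scores = [[0, 1, 0], [1, 1, 2], [2, 3, 2], [3, 3, 3]]
alpha = cronbach_alpha(toy_scores)
```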

12.
Nucleic Acids Res ; 49(D1): D192-D200, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33211869

ABSTRACT

Rfam is a database of RNA families where each of the 3444 families is represented by a multiple sequence alignment of known RNA sequences and a covariance model that can be used to search for additional members of the family. Recent developments have involved expert collaborations to improve the quality and coverage of Rfam data, focusing on microRNAs, viral and bacterial RNAs. We have completed the first phase of synchronising microRNA families in Rfam and miRBase, creating 356 new Rfam families and updating 40. We established a procedure for comprehensive annotation of viral RNA families starting with Flavivirus and Coronaviridae RNAs. We have also increased the coverage of bacterial and metagenome-based RNA families from the ZWD database. These developments have enabled a significant growth of the database, with the addition of 759 new families in Rfam 14. To facilitate further community contribution to Rfam, expert users are now able to build and submit new families using the newly developed Rfam Cloud family curation system. New Rfam website features include a new sequence similarity search powered by RNAcentral, as well as search and visualisation of families with pseudoknots. Rfam is freely available at https://rfam.org.


Subjects
Databases, Nucleic Acid; Metagenome; MicroRNAs/genetics; RNA, Bacterial/genetics; RNA, Untranslated/genetics; RNA, Viral/genetics; Bacteria/genetics; Bacteria/metabolism; Base Pairing; Base Sequence; Humans; Internet; MicroRNAs/classification; MicroRNAs/metabolism; Molecular Sequence Annotation; Nucleic Acid Conformation; RNA, Bacterial/classification; RNA, Bacterial/metabolism; RNA, Untranslated/classification; RNA, Untranslated/metabolism; RNA, Viral/classification; RNA, Viral/metabolism; Sequence Alignment; Sequence Analysis, RNA; Software; Viruses/genetics; Viruses/metabolism
13.
PLoS Comput Biol ; 16(10): e1008387, 2020 10.
Article in English | MEDLINE | ID: mdl-33125376

ABSTRACT

Knowing the structure of conserved structural RNAs is important to elucidate their function and mechanism of action. However, predicting a conserved RNA structure remains unreliable, even when using a combination of thermodynamic stability and evolutionary covariation information. Here we present a method to predict a conserved RNA structure that combines the following three features. First, it uses significant covariation due to RNA structure and removes spurious covariation due to phylogeny. Second, it uses negative evolutionary information: basepairs that have variation but no significant covariation are prevented from occurring. Lastly, it uses a battery of probabilistic folding algorithms that incorporate all positive covariation into one structure. The method, named CaCoFold (Cascade variation/covariation Constrained Folding algorithm), predicts a nested structure guided by a maximal subset of positive basepairs, and recursively incorporates all remaining positive basepairs into alternative helices. The alternative helices can be compatible with the nested structure such as pseudoknots, or overlapping such as competing structures, base triplets, or other 3D non-antiparallel interactions. We present evidence that CaCoFold predictions are consistent with structures modeled from crystallography.


Subjects
RNA; Sequence Alignment/methods; Sequence Analysis, RNA/methods; Algorithms; Computational Biology; Evolution, Molecular; Models, Molecular; Nucleic Acid Conformation; RNA/chemistry; RNA/genetics; RNA/metabolism; Thermodynamics
14.
Bioinformatics ; 36(10): 3072-3076, 2020 05 01.
Article in English | MEDLINE | ID: mdl-32031582

ABSTRACT

Pairwise sequence covariations are a signal of conserved RNA secondary structure. We describe a method for distinguishing when lack of covariation signal can be taken as evidence against a conserved RNA structure, as opposed to when a sequence alignment merely has insufficient variation to detect covariations. We find that alignments for several long non-coding RNAs previously shown to lack covariation support do have adequate covariation detection power, providing additional evidence against their proposed conserved structures. AVAILABILITY AND IMPLEMENTATION: The R-scape web server is at eddylab.org/R-scape, with a link to download the source code. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
RNA, Long Noncoding; RNA; Algorithms; Conserved Sequence; Nucleic Acid Conformation; RNA/genetics; Sequence Alignment; Sequence Analysis, RNA; Software
15.
Nucleic Acids Res ; 46(D1): D335-D342, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29112718

ABSTRACT

The Rfam database is a collection of RNA families in which each family is represented by a multiple sequence alignment, a consensus secondary structure, and a covariance model. In this paper we introduce Rfam release 13.0, which switches to a new genome-centric approach that annotates a non-redundant set of reference genomes with RNA families. We describe new web interface features including faceted text search and R-scape secondary structure visualizations. We discuss a new literature curation workflow and a pipeline for building families based on RNAcentral. There are 236 new families in release 13.0, bringing the total number of families to 2687. The Rfam website is http://rfam.org.


Subjects
Databases, Nucleic Acid; Genome; RNA, Untranslated/chemistry; RNA, Untranslated/genetics; Humans; Molecular Sequence Annotation; Nucleic Acid Conformation; RNA, Untranslated/classification; Sequence Alignment; Sequence Analysis, RNA
16.
Clin Infect Dis ; 64(8): 1092-1097, 2017 04 15.
Article in English | MEDLINE | ID: mdl-28329390

ABSTRACT

Background: In Western countries, the emergence of human immunodeficiency virus (HIV) drug resistance has decreased tremendously, and transmission of drug resistance has merely stabilized in recent years. However, in many endemic settings with limited resources, rates of emerging and transmitted drug resistance are not regularly assessed. Methods: We performed a survey including all HIV-infected individuals who received resistance testing in 2010-2015 in Aruba, a highly endemic HIV area in the Caribbean. Transmitted HIV drug resistance was determined using World Health Organization (WHO) criteria. Transmission dynamics were investigated using phylogenetic analyses. In a subset, baseline samples were re-analyzed using next generation sequencing (NGS). Results: Baseline resistance testing was performed in 104 newly diagnosed untreated individuals (54% of all newly diagnosed individuals in 2010-2015): 86% were men, 39% were foreign-born, and 22% had AIDS at diagnosis. Overall, 33% (95% CI: 24-42%) were infected with a drug-resistant HIV variant. The prevalence of resistance to non-nucleoside reverse transcriptase inhibitors (NNRTIs) reached 45% (95% CI: 27-64%) in 2015, all based on the prevalence of mutation K103N. NGS did not demonstrate additional minority K103N-variants compared to routine resistance testing. K103N-harboring strains were introduced into the therapy-unexposed population via at least 6 independent transmissions epidemiologically linked to the surrounding countries. Virological failure of the WHO-recommended first-line NNRTI-based regimen was higher in the presence of K103N. Conclusions: The prevalence of resistant HIV in Aruba has increased to alarming levels, compromising the WHO-recommended first-line regimen. As adequate surveillance as advocated by the WHO is limited, the Caribbean region could face an unidentified rise of NNRTI-resistant HIV.


Subjects
Anti-HIV Agents/pharmacology; Drug Resistance, Viral; HIV Infections/epidemiology; HIV Infections/virology; HIV/drug effects; Adult; Anti-HIV Agents/therapeutic use; Caribbean Region/epidemiology; Female; HIV/isolation & purification; HIV Infections/transmission; Humans; Male; Middle Aged; Surveys and Questionnaires
17.
Int J Infect Dis ; 59: 14-21, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28347851

ABSTRACT

OBJECTIVES: No interventions have yet been implemented to improve antibiotic use on Aruba. In the Netherlands, the introduction of an antibiotic checklist resulted in more appropriate antibiotic use in nine hospitals. The aim of this study was to introduce the antibiotic checklist on Aruba, test its effectiveness, and evaluate the possibility of implementing this checklist outside the Netherlands. METHODS: The antibiotic checklist includes seven quality indicators (QIs) that define appropriate antibiotic use. It applies to adult patients with a suspected bacterial infection, treated with intravenous antibiotics. The primary endpoint was the QI sum score, calculated by the patient's sum of performed checklist-items divided by the total number of QIs that applied to that specific patient. Outcomes before and after the introduction of the checklist were compared. RESULTS: The percentage of patients with a QI sum score ≥50% increased significantly during the intervention (n=173) compared to baseline (n=150) (odds ratio 3.67, p<0.001). However, performance did not improve on each individual QI. The checklist was used in 63.3% of the eligible patients. CONCLUSIONS: The introduction of the antibiotic checklist increased appropriate antibiotic use on Aruba. Additional initiatives are necessary for further improvement per QI. These results suggest that the antibiotic checklist could be used internationally.
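
The QI sum score defined in the METHODS can be sketched as the fraction of applicable checklist items that were performed. The abstract does not name the seven quality indicators, so the keys `qi_1` to `qi_7` below are placeholders, and the function name and example values are hypothetical.

```python
def qi_sum_score(items):
    """Fraction of applicable quality indicators that were performed.

    items: dict mapping a QI name to True (performed), False (not performed),
    or None (the QI did not apply to this patient).
    """
    applicable = [performed for performed in items.values() if performed is not None]
    if not applicable:
        raise ValueError("no quality indicator applies to this patient")
    return sum(applicable) / len(applicable)


# Made-up patient: five of the seven QIs apply and three of those were performed.
patient = {
    "qi_1": True,
    "qi_2": True,
    "qi_3": False,
    "qi_4": None,
    "qi_5": True,
    "qi_6": None,
    "qi_7": False,
}
score = qi_sum_score(patient)
meets_threshold = score >= 0.5  # the abstract reports patients scoring >= 50%
```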


Subjects
Anti-Bacterial Agents/administration & dosage; Bacterial Infections/drug therapy; Administration, Intravenous; Adult; Aged; Aged, 80 and over; Checklist; Female; Hospitals; Humans; Male; Middle Aged; Netherlands; Quality Indicators, Health Care
18.
Nat Methods ; 14(1): 45-48, 2017 01.
Article in English | MEDLINE | ID: mdl-27819659

ABSTRACT

Many functional RNAs have an evolutionarily conserved secondary structure. Conservation of RNA base pairing induces pairwise covariations in sequence alignments. We developed a computational method, R-scape (RNA Structural Covariation Above Phylogenetic Expectation), that quantitatively tests whether covariation analysis supports the presence of a conserved RNA secondary structure. R-scape analysis finds no statistically significant support for proposed secondary structures of the long noncoding RNAs HOTAIR, SRA, and Xist.


Subjects
Evolution, Molecular; Phylogeny; RNA, Long Noncoding/chemistry; RNA, Long Noncoding/genetics; Base Pairing; Base Sequence; Humans; Nucleic Acid Conformation
19.
BMC Bioinformatics ; 16: 406, 2015 Dec 10.
Article in English | MEDLINE | ID: mdl-26652060

ABSTRACT

BACKGROUND: Inference of sequence homology is inherently an evolutionary question, dependent upon evolutionary divergence. However, the insertion and deletion penalties in the most widely used methods for inferring homology by sequence alignment, including BLAST and profile hidden Markov models (profile HMMs), are not based on any explicitly time-dependent evolutionary model. Using one fixed score system (BLOSUM62 with some gap open/extend costs, for example) corresponds to making an unrealistic assumption that all sequence relationships have diverged by the same time. Adoption of explicit time-dependent evolutionary models for scoring insertions and deletions in sequence alignments has been hindered by algorithmic complexity and technical difficulty. RESULTS: We identify and implement several probabilistic evolutionary models compatible with the affine-cost insertion/deletion model used in standard pairwise sequence alignment. Assuming an affine gap cost imposes important restrictions on the realism of the evolutionary models compatible with it, as single insertion events with geometrically distributed lengths do not result in geometrically distributed insert lengths at finite times. Nevertheless, we identify one evolutionary model compatible with symmetric pair HMMs that are the basis for Smith-Waterman pairwise alignment, and two evolutionary models compatible with standard profile-based alignment. We test different aspects of the performance of these "optimized branch length" models, including alignment accuracy and homology coverage (discrimination of residues in a homologous region from nonhomologous flanking residues). We test on benchmarks of both global homologies (full length sequence homologs) and local homologies (homologous subsequences embedded in nonhomologous sequence). CONCLUSIONS: Contrary to our expectations, we find that for global homologies a single long branch parameterization suffices both for distant and close homologous relationships. 
In contrast, we do see an advantage in using explicit evolutionary models for local homologies. Optimal branch parameterization reduces a known artifact called "homologous overextension", in which local alignments erroneously extend through flanking nonhomologous residues.


Subjects
Algorithms; Evolution, Molecular; Models, Theoretical; Humans; Models, Statistical; Probability; Sequence Alignment
20.
RNA Biol ; 10(7): 1185-96, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23695796

ABSTRACT

Any method for RNA secondary structure prediction is determined by four ingredients. The architecture is the choice of features implemented by the model (such as stacked basepairs, loop length distributions, etc.). The architecture determines the number of parameters in the model. The scoring scheme is the nature of those parameters (whether thermodynamic, probabilistic, or weights). The parameterization stands for the specific values assigned to the parameters. These three ingredients are referred to as "the model." The fourth ingredient is the folding algorithms used to predict plausible secondary structures given the model and the sequence of a structural RNA. Here, I make several unifying observations drawn from looking at more than 40 years of methods for RNA secondary structure prediction in the light of this classification. As a final observation, there seems to be a performance ceiling that affects all methods with complex architectures, a ceiling that impacts all scoring schemes with remarkable similarity. This suggests that modeling RNA secondary structure by using intrinsic sequence-based plausible "foldability" will require the incorporation of other forms of information in order to constrain the folding space and to improve prediction accuracy. This could give an advantage to probabilistic scoring systems since a probabilistic framework is a natural platform to incorporate different sources of information into one single inference problem.


Subjects
Computational Biology/methods; Nucleic Acid Conformation; RNA/chemistry; Software; Algorithms; RNA Folding