Results 1 - 20 of 149
1.
Nature ; 621(7978): 396-403, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37130545

ABSTRACT

Messenger RNA (mRNA) vaccines are being used to combat the spread of COVID-19 (refs. 1-3), but they still exhibit critical limitations caused by mRNA instability and degradation, which are major obstacles for the storage, distribution and efficacy of the vaccine products (ref. 4). Increasing secondary structure lengthens mRNA half-life, which, together with optimal codons, improves protein expression (ref. 5). Therefore, a principled mRNA design algorithm must optimize both structural stability and codon usage. However, owing to synonymous codons, the mRNA design space is prohibitively large: for example, there are around 2.4 × 10^632 candidate mRNA sequences for the SARS-CoV-2 spike protein. This poses insurmountable computational challenges. Here we provide a simple and unexpected solution using the classical concept of lattice parsing in computational linguistics, where finding the optimal mRNA sequence is analogous to identifying the most likely sentence among similar-sounding alternatives (ref. 6). Our algorithm LinearDesign finds an optimal mRNA design for the spike protein in just 11 minutes, and can concurrently optimize stability and codon usage. LinearDesign substantially improves mRNA half-life and protein expression, and profoundly increases antibody titre by up to 128 times in mice compared to the codon-optimization benchmark on mRNA vaccines for COVID-19 and varicella-zoster virus. This result reveals the great potential of principled mRNA design and enables the exploration of previously unreachable but highly stable and efficient designs. Our work is a timely tool for vaccines and other mRNA-based medicines encoding therapeutic proteins such as monoclonal antibodies and anti-cancer drugs (refs. 7, 8).
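
The size of the design space quoted above follows from multiplying, for each residue, the number of synonymous codons that encode it. A minimal Python sketch of that count, assuming the standard genetic code; the names `CODON_DEGENERACY`, `design_space_size` and `log10_design_space` are illustrative and are not part of LinearDesign:

```python
import math

# Number of synonymous codons per amino acid in the standard genetic code
# ("*" denotes a stop codon). The 21 entries sum to the 64 codons.
CODON_DEGENERACY = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3, "*": 3,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2,
    "K": 2, "D": 2, "E": 2, "C": 2,
    "M": 1, "W": 1,
}


def design_space_size(protein: str) -> int:
    """Count the mRNA coding sequences that translate to `protein`."""
    return math.prod(CODON_DEGENERACY[aa] for aa in protein)


def log10_design_space(protein: str) -> float:
    """Same count on a log10 scale, convenient for long proteins."""
    return sum(math.log10(CODON_DEGENERACY[aa]) for aa in protein)
```

For a three-residue peptide such as `LSR`, `design_space_size("LSR")` gives 6 × 6 × 6 = 216. Because the count is a product over residues, it grows exponentially with protein length, which is why exhaustive enumeration is infeasible for a protein as long as the spike.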


Subjects
Algorithms , COVID-19 Vaccines , COVID-19 , RNA Stability , Messenger RNA , SARS-CoV-2 , mRNA Vaccines , Animals , Humans , Mice , Codon/genetics , COVID-19/genetics , COVID-19/immunology , COVID-19/prevention & control , COVID-19 Vaccines/chemistry , COVID-19 Vaccines/genetics , COVID-19 Vaccines/immunology , Half-Life , Human Herpesvirus 3/genetics , Human Herpesvirus 3/immunology , mRNA Vaccines/chemistry , mRNA Vaccines/genetics , mRNA Vaccines/immunology , RNA Stability/genetics , RNA Stability/immunology , Messenger RNA/chemistry , Messenger RNA/genetics , Messenger RNA/immunology , Messenger RNA/metabolism , SARS-CoV-2/genetics , SARS-CoV-2/immunology
2.
Nucleic Acids Res ; 51(18): e94, 2023 10 13.
Article in English | MEDLINE | ID: mdl-37650626

ABSTRACT

Many RNAs function through RNA-RNA interactions. Fast and reliable RNA structure prediction with consideration of RNA-RNA interaction is useful; however, existing tools are either too simplistic or too slow. To address this issue, we present LinearCoFold, which approximates the complete minimum free energy structure of two strands in linear time, and LinearCoPartition, which approximates the cofolding partition function and base pairing probabilities in linear time. LinearCoFold and LinearCoPartition are orders of magnitude faster than RNAcofold. For example, on a sequence pair with combined length of 26,190 nt, LinearCoFold is 86.8× faster than RNAcofold MFE mode, and LinearCoPartition is 642.3× faster than RNAcofold partition function mode. Surprisingly, LinearCoFold and LinearCoPartition's predictions have higher PPV and sensitivity of intermolecular base pairs. Furthermore, we apply LinearCoFold to predict the RNA-RNA interaction between SARS-CoV-2 genomic RNA (gRNA) and human U4 small nuclear RNA (snRNA), which has been experimentally studied, and observe that LinearCoFold's prediction correlates better with the wet lab results than RNAcofold's.


Subjects
Algorithms , RNA , Humans , Base Pairing , Genomics , Nucleic Acid Conformation , RNA/chemistry , RNA/metabolism , Viral RNA/chemistry , SARS-CoV-2/chemistry
3.
Nucleic Acids Res ; 51(2): e7, 2023 01 25.
Article in English | MEDLINE | ID: mdl-36401871

ABSTRACT

Many RNAs fold into multiple structures at equilibrium, and there is a need to sample these structures according to their probabilities in the ensemble. The conventional sampling algorithm suffers from two limitations: (i) the sampling phase is slow due to many repeated calculations; and (ii) the end-to-end runtime scales cubically with the sequence length. These issues make it difficult to be applied to long RNAs, such as the full genomes of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). To address these problems, we devise a new sampling algorithm, LazySampling, which eliminates redundant work via on-demand caching. Based on LazySampling, we further derive LinearSampling, an end-to-end linear time sampling algorithm. Benchmarking on nine diverse RNA families, the sampled structures from LinearSampling correlate better with the well-established secondary structures than Vienna RNAsubopt and RNAplfold. More importantly, LinearSampling is orders of magnitude faster than standard tools, being 428× faster (72 s versus 8.6 h) than RNAsubopt on the full genome of SARS-CoV-2 (29 903 nt). The resulting sample landscape correlates well with the experimentally guided secondary structure models, and is closer to the alternative conformations revealed by experimentally driven analysis. Finally, LinearSampling finds 23 regions of 15 nt with high accessibilities in the SARS-CoV-2 genome, which are potential targets for COVID-19 diagnostics and therapeutics.
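
The accessibility of a region can be estimated from a set of sampled structures as the fraction of samples in which every nucleotide of the region is unpaired. A minimal Python sketch under that assumption; the function names `window_accessibility` and `accessible_windows` and the dot-bracket input format are illustrative choices, not the LinearSampling interface:

```python
from typing import List


def window_accessibility(samples: List[str], start: int, width: int) -> float:
    """Fraction of sampled structures whose window is fully unpaired.

    `samples` holds dot-bracket strings of equal length; `start` is 0-based.
    """
    if not samples:
        raise ValueError("need at least one sampled structure")
    end = start + width
    if start < 0 or end > len(samples[0]):
        raise ValueError("window falls outside the sequence")
    # A window is accessible in a sample when it contains only "." characters.
    unpaired = sum(1 for s in samples if set(s[start:end]) <= {"."})
    return unpaired / len(samples)


def accessible_windows(samples: List[str], width: int, threshold: float) -> List[int]:
    """Start positions of all windows with accessibility >= threshold."""
    n = len(samples[0])
    return [i for i in range(n - width + 1)
            if window_accessibility(samples, i, width) >= threshold]
```

Scanning a genome with `width=15` and a high threshold would mirror the kind of search described in the abstract, although the threshold used by the authors is not stated here.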


Subjects
Algorithms , COVID-19 , SARS-CoV-2 , Humans , Base Sequence , COVID-19/diagnosis , COVID-19/genetics , Viral RNA/genetics , Viral RNA/chemistry , SARS-CoV-2/genetics , Nucleic Acid Conformation
4.
Nucleic Acids Res ; 51(5): 2464-2484, 2023 03 21.
Article in English | MEDLINE | ID: mdl-36762498

ABSTRACT

Riboswitches regulate downstream gene expression by binding cellular metabolites. Regulation of translation initiation by riboswitches is posited to occur by metabolite-mediated sequestration of the Shine-Dalgarno sequence (SDS), causing bypass by the ribosome. Recently, we solved a co-crystal structure of a prequeuosine1-sensing riboswitch from Carnobacterium antarcticum that binds two metabolites in a single pocket. The structure revealed that the second nucleotide within the gene-regulatory SDS, G34, engages in a crystal contact, obscuring the molecular basis of gene regulation. Here, we report a co-crystal structure wherein C10 pairs with G34. However, molecular dynamics simulations reveal quick dissolution of the pair, which fails to reform. Functional and chemical probing assays inside live bacterial cells corroborate the dispensability of the C10-G34 pair in gene regulation, leading to the hypothesis that the compact pseudoknot fold is sufficient for translation attenuation. Remarkably, the C. antarcticum aptamer retained significant gene-regulatory activity when uncoupled from the SDS using unstructured spacers up to 10 nucleotides away from the riboswitch, akin to the steric blocking employed by sRNAs. Accordingly, our work reveals that the RNA fold regulates translation without SDS sequestration, expanding known riboswitch-mediated gene-regulatory mechanisms. The results imply that riboswitches exist wherein the SDS is not embedded inside a stable fold.


Subjects
Protein Biosynthesis , Riboswitch , Binding Sites , Gene Expression Regulation , Molecular Dynamics Simulation , Nucleic Acid Conformation , Ribosomes/genetics , Ribosomes/metabolism
5.
Bioinformatics ; 39(39 Suppl 1): i563-i571, 2023 06 30.
Article in English | MEDLINE | ID: mdl-37387188

ABSTRACT

MOTIVATION: RNA design is the search for a sequence or set of sequences that will fold to a desired structure, also known as the inverse problem of RNA folding. However, the sequences designed by existing algorithms often suffer from low ensemble stability, which worsens for long sequence design. Additionally, for many methods only a small number of sequences satisfying the MFE criterion can be found by each run of design. These drawbacks limit their use cases. RESULTS: We propose an innovative optimization paradigm, SAMFEO, which optimizes ensemble objectives (equilibrium probability or ensemble defect) by iterative search and yields a very large number of successfully designed RNA sequences as byproducts. We develop a search method which leverages structure level and ensemble level information at different stages of the optimization: initialization, sampling, mutation, and updating. Our work, while being less complicated than others, is the first algorithm that is able to design thousands of RNA sequences for the puzzles from the Eterna100 benchmark. In addition, our algorithm solves the most Eterna100 puzzles among all the general optimization based methods in our study. The only baseline solving more puzzles than our work is dependent on handcrafted heuristics designed for a specific folding model. Surprisingly, our approach shows superiority on designing long sequences for structures adapted from the database of 16S Ribosomal RNAs. AVAILABILITY AND IMPLEMENTATION: Our source code and data used in this article are available at https://github.com/shanry/SAMFEO.
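
One of the ensemble objectives named above, the ensemble defect, is the expected number of positions whose pairing state differs from the target structure. A minimal Python sketch, assuming base-pairing probabilities have already been computed by some partition-function tool; the `pair_probs` dictionary format and the function names are hypothetical and are not taken from the SAMFEO code:

```python
from typing import Dict, Tuple


def parse_pairs(structure: str) -> Dict[int, int]:
    """Map each paired position to its partner in a dot-bracket string."""
    stack, partner = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def ensemble_defect(target: str, pair_probs: Dict[Tuple[int, int], float]) -> float:
    """Expected number of positions that disagree with `target`.

    `pair_probs[(i, j)]` with i < j is the probability that i pairs with j.
    """
    n = len(target)
    partner = parse_pairs(target)
    # Probability that each position is paired with any partner at all.
    paired = [0.0] * n
    for (i, j), p in pair_probs.items():
        paired[i] += p
        paired[j] += p
    correct = 0.0
    for i in range(n):
        if i in partner:
            # Target pairs i: credit the probability of that exact pair.
            key = (min(i, partner[i]), max(i, partner[i]))
            correct += pair_probs.get(key, 0.0)
        else:
            # Target leaves i unpaired: credit the unpaired probability.
            correct += 1.0 - paired[i]
    return n - correct
```

A defect of 0 means the ensemble always matches the target; dividing by the sequence length gives the normalized form that is often reported.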


Subjects
Algorithms , Benchmarking , Factual Databases , Mutation , 16S Ribosomal RNA
6.
Cell Mol Life Sci ; 80(5): 136, 2023 May 02.
Article in English | MEDLINE | ID: mdl-37131079

ABSTRACT

Influenza A virus (IAV) is a respiratory virus that causes epidemics and pandemics. Knowledge of IAV RNA secondary structure in vivo is crucial for a better understanding of virus biology. Moreover, it is a foundation for the development of new RNA-targeting antivirals. Chemical RNA mapping using selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) coupled with Mutational Profiling (MaP) allows for the thorough examination of secondary structures in low-abundance RNAs in their biological context. So far, the method has been used for analyzing the RNA secondary structures of several viruses including SARS-CoV-2 in virio and in cellulo. Here, we used SHAPE-MaP and dimethyl sulfate mutational profiling with sequencing (DMS-MaPseq) for genome-wide secondary structure analysis of viral RNA (vRNA) of the pandemic influenza A/California/04/2009 (H1N1) strain in both in virio and in cellulo environments. Experimental data allowed the prediction of the secondary structures of all eight vRNA segments in virio and, for the first time, the structures of vRNA5, 7, and 8 in cellulo. We conducted a comprehensive structural analysis of the proposed vRNA structures to reveal the motifs predicted with the highest accuracy. We also performed a base-pairs conservation analysis of the predicted vRNA structures and revealed many highly conserved vRNA motifs among the IAVs. The structural motifs presented herein are potential candidates for new IAV antiviral strategies.


Subjects
COVID-19 , Influenza A Virus H1N1 Subtype , Influenza A Virus , Humans , Influenza A Virus H1N1 Subtype/genetics , SARS-CoV-2/genetics , Influenza A Virus/genetics , Viral RNA/genetics , Genomics
7.
Nucleic Acids Res ; 50(9): 5251-5262, 2022 05 20.
Article in English | MEDLINE | ID: mdl-35524574

ABSTRACT

Nearest neighbor parameters for estimating the folding stability of RNA secondary structures are in widespread use. For helices, current parameters penalize terminal AU base pairs relative to terminal GC base pairs. We curated an expanded database of helix stabilities determined by optical melting experiments. Analysis of the updated database shows that terminal penalties depend on the sequence identity of the adjacent penultimate base pair. New nearest neighbor parameters that include this additional sequence dependence accurately predict the measured values of 271 helices in an updated database with a correlation coefficient of 0.982. This refined understanding of helix ends facilitates fitting terms for base pair stacks with GU pairs. Prior parameter sets treated 5'GGUC3' paired to 3'CUGG5' separately from other 5'GU3'/3'UG5' stacks. The improved understanding of helix end stability, however, makes the separate treatment unnecessary. Introduction of the additional terms was tested with three optical melting experiments. The average absolute difference between measured and predicted free energy changes at 37°C for these three duplexes containing terminal adjacent AU and GU pairs improved from 1.38 to 0.27 kcal/mol. This confirms the need for the additional sequence dependence in the model.


Subjects
RNA Folding , RNA , Base Sequence , Nucleic Acid Conformation , RNA/chemistry , Thermodynamics
8.
Nucleic Acids Res ; 50(9): 5299-5312, 2022 05 20.
Article in English | MEDLINE | ID: mdl-35524551

ABSTRACT

The essential pre-mRNA splicing factor U2AF2 (also called U2AF65) identifies polypyrimidine (Py) tract signals of nascent transcripts, despite length and sequence variations. Previous studies have shown that the U2AF2 RNA recognition motifs (RRM1 and RRM2) preferentially bind uridine-rich RNAs. Nonetheless, the specificity of the RRM1/RRM2 interface for the central Py tract nucleotide has yet to be investigated. We addressed this question by determining crystal structures of U2AF2 bound to a cytidine, guanosine, or adenosine at the central position of the Py tract, and compared U2AF2-bound uridine structures. Local movements of the RNA site accommodated the different nucleotides, whereas the polypeptide backbone remained similar among the structures. Accordingly, molecular dynamics simulations revealed flexible conformations of the central, U2AF2-bound nucleotide. The RNA binding affinities and splicing efficiencies of structure-guided mutants demonstrated that U2AF2 tolerates nucleotide substitutions at the central position of the Py tract. Moreover, enhanced UV-crosslinking and immunoprecipitation of endogenous U2AF2 in human erythroleukemia cells showed uridine-sensitive binding sites, with lower sequence conservation at the central nucleotide positions of otherwise uridine-rich, U2AF2-bound splice sites. Altogether, these results highlight the importance of RNA flexibility for protein recognition and take a step towards relating splice site motifs to pre-mRNA splicing efficiencies.


Subjects
Nucleotides , RNA Precursors , Splicing Factor U2AF , Humans , Nucleotides/metabolism , RNA/metabolism , RNA Precursors/metabolism , RNA Splicing , Splicing Factor U2AF/metabolism , Uridine/metabolism
9.
Proc Natl Acad Sci U S A ; 118(52), 2021 12 28.
Article in English | MEDLINE | ID: mdl-34887342

ABSTRACT

The constant emergence of COVID-19 variants reduces the effectiveness of existing vaccines and test kits. Therefore, it is critical to identify conserved structures in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomes as potential targets for variant-proof diagnostics and therapeutics. However, the algorithms to predict these conserved structures, which simultaneously fold and align multiple RNA homologs, scale at best cubically with sequence length and are thus infeasible for coronaviruses, which possess the longest genomes (∼30,000 nt) among RNA viruses. As a result, existing efforts on modeling SARS-CoV-2 structures resort to single-sequence folding as well as local folding methods with short window sizes, which inevitably neglect long-range interactions that are crucial in RNA functions. Here we present LinearTurboFold, an efficient algorithm for folding RNA homologs that scales linearly with sequence length, enabling unprecedented global structural analysis on SARS-CoV-2. Surprisingly, on a group of SARS-CoV-2 and SARS-related genomes, LinearTurboFold's purely in silico prediction not only is close to experimentally guided models for local structures, but also goes far beyond them by capturing the end-to-end pairs between 5' and 3' untranslated regions (UTRs) (∼29,800 nt apart) that match perfectly with a purely experimental work. Furthermore, LinearTurboFold identifies undiscovered conserved structures and conserved accessible regions as potential targets for designing efficient and mutation-insensitive small-molecule drugs, antisense oligonucleotides, small interfering RNAs (siRNAs), CRISPR-Cas13 guide RNAs, and RT-PCR primers. LinearTurboFold is a general technique that can also be applied to other RNA viruses and full-length genome studies and will be a useful tool in fighting the current and future pandemics.


Subjects
Algorithms , Viral RNA/chemistry , SARS-CoV-2/chemistry , Betacoronavirus/chemistry , Betacoronavirus/genetics , Conserved Sequence , Viral Genome , Mutation , Nucleic Acid Conformation , RNA Folding , Viral RNA/genetics , SARS-CoV-2/genetics , Sequence Alignment
10.
Bioinformatics ; 38(16): 3892-3899, 2022 08 10.
Article in English | MEDLINE | ID: mdl-35748706

ABSTRACT

MOTIVATION: The secondary structure of RNA is of importance to its function. Over the last few years, several papers attempted to use machine learning to improve de novo RNA secondary structure prediction. Many of these papers report impressive results for intra-family predictions but seldom address the much more difficult (and practical) inter-family problem. RESULTS: We demonstrate that it is nearly trivial with convolutional neural networks to generate pseudo-free energy changes, modelled after structure mapping data that improve the accuracy of structure prediction for intra-family cases. We propose a more rigorous method for inter-family cross-validation that can be used to assess the performance of learning-based models. Using this method, we further demonstrate that intra-family performance is insufficient proof of generalization despite the widespread assumption in the literature and provide strong evidence that many existing learning-based models have not generalized inter-family. AVAILABILITY AND IMPLEMENTATION: Source code and data are available at https://github.com/marcellszi/dl-rna. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Deep Learning , RNA , Humans , Computer Neural Networks , Protein Secondary Structure , Machine Learning
11.
Nucleic Acids Res ; 48(14): 8146-8164, 2020 08 20.
Article in English | MEDLINE | ID: mdl-32597951

ABSTRACT

Riboswitches are structured RNA motifs that recognize metabolites to alter the conformations of downstream sequences, leading to gene regulation. To investigate this molecular framework, we determined crystal structures of a preQ1-I riboswitch in effector-free and bound states at 2.00 Å and 2.65 Å resolution. Both pseudoknots exhibited the elusive L2 loop, which displayed distinct conformations. Conversely, the Shine-Dalgarno sequence (SDS) in the S2 helix of each structure remained unbroken. The expectation that the effector-free state should expose the SDS prompted us to conduct solution experiments to delineate environmental changes to specific nucleobases in response to preQ1. We then used nudged elastic band computational methods to derive conformational-change pathways linking the crystallographically-determined effector-free and bound-state structures. Pathways featured: (i) unstacking and unpairing of L2 and S2 nucleobases without preQ1, exposing the SDS for translation, and (ii) stacking and pairing of L2 and S2 nucleobases with preQ1, sequestering the SDS. Our results reveal how preQ1 binding reorganizes L2 into a nucleobase-stacking spine that sequesters the SDS, linking effector recognition to biological function. The generality of stacking spines as conduits for effector-dependent, interdomain communication is discussed in light of their existence in adenine riboswitches, as well as the turnip yellow mosaic virus ribosome sensor.


Subjects
Molecular Dynamics Simulation , Riboswitch , Base Pairing , Bacterial Gene Expression Regulation , Guanine/analogs & derivatives , Sodium Dodecyl Sulfate/chemistry , Thermoanaerobacter/genetics
12.
Genes Dev ; 28(15): 1721-32, 2014 Aug 01.
Article in English | MEDLINE | ID: mdl-25085423

ABSTRACT

Sequence variation in tRNA genes influences the structure, modification, and stability of tRNA; affects translation fidelity; impacts the activity of numerous isodecoders in metazoans; and leads to human diseases. To comprehensively define the effects of sequence variation on tRNA function, we developed a high-throughput in vivo screen to quantify the activity of a model tRNA, the nonsense suppressor SUP4oc of Saccharomyces cerevisiae. Using a highly sensitive fluorescent reporter gene with an ochre mutation, fluorescence-activated cell sorting of a library of SUP4oc mutant yeast strains, and deep sequencing, we scored 25,491 variants. Unexpectedly, SUP4oc tolerates numerous sequence variations, accommodates slippage in tertiary and secondary interactions, and exhibits genetic interactions that suggest an alternative functional tRNA conformation. Furthermore, we used this methodology to define tRNA variants subject to rapid tRNA decay (RTD). Even though RTD normally degrades tRNAs with exposed 5' ends, mutations that sensitize SUP4oc to RTD were found to be located throughout the sequence, including the anti-codon stem. Thus, the integrity of the entire tRNA molecule is under surveillance by cellular quality control machinery. This approach to assess activity at high throughput is widely applicable to many problems in tRNA biology.


Subjects
RNA Stability/genetics , Transfer RNA/genetics , Transfer RNA/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Flow Cytometry , Genetic Variation , High-Throughput Screening Assays , Mutation/genetics , Nucleic Acid Conformation , Saccharomyces cerevisiae Proteins/chemistry , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism
13.
Int J Mol Sci ; 23(5), 2022 Feb 23.
Article in English | MEDLINE | ID: mdl-35269600

ABSTRACT

Influenza A virus (IAV) is a member of the single-stranded RNA (ssRNA) family of viruses. The most recent global pandemic caused by the SARS-CoV-2 virus has shown the major threat that RNA viruses can pose to humanity. In comparison, influenza has an even higher pandemic potential as a result of its high rate of mutations within its relatively short (<13 kbp) genome, as well as its capability to undergo genetic reassortment. In light of this threat, and the fact that RNA structure is connected to a broad range of known biological functions, deeper investigation of viral RNA (vRNA) structures is of high interest. Here, for the first time, we propose a secondary structure for segment 8 vRNA (vRNA8) of A/California/04/2009 (H1N1) formed in the presence of cellular and viral components. This structure shows similarities with prior in vitro experiments. Additionally, we determined the location of several well-defined, conserved structural motifs of vRNA8 within IAV strains with possible functionality. These RNA motifs appear to fold independently of regional nucleoprotein (NP)-binding affinity, but a low or uneven distribution of NP in each motif region is noted. This research also highlights several accessible sites for oligonucleotide tools and small molecules in vRNA8 in a cellular environment that might be a target for influenza A virus inhibition on the RNA level.


Subjects
Viral Gene Expression Regulation , Viral Genome/genetics , Influenza A Virus H1N1 Subtype/genetics , Nucleic Acid Conformation , Viral RNA/chemistry , Animals , Base Sequence , Dogs , Humans , Influenza A Virus H1N1 Subtype/metabolism , Human Influenza/virology , Madin Darby Canine Kidney Cells , Molecular Models , Nucleotide Motifs/genetics , RNA Folding , Viral RNA/genetics , Viral Proteins/genetics , Viral Proteins/metabolism
14.
RNA ; 25(6): 747-754, 2019 06.
Article in English | MEDLINE | ID: mdl-30952689

ABSTRACT

Nearest neighbor parameters for estimating the folding stability of RNA are commonly used in secondary structure prediction, for generating folding ensembles of structures, and for analyzing RNA function. Previously, we demonstrated that we could quantify the uncertainties in each nearest neighbor parameter by perturbing the underlying optical melting data within experimental error and rederiving the parameters, which accounts for the substantial correlations that exist between the parameters. In this contribution, we describe a method to estimate uncertainty in the estimated folding stabilities of RNA structures, accounting for correlations in the nearest neighbor parameters. This method is incorporated in the RNAstructure software package.


Subjects
Algorithms , RNA Folding , RNA/chemistry , Software , Base Pairing , Base Sequence , Humans , Thermodynamics , Uncertainty
15.
Bioinformatics ; 36(Suppl_1): i258-i267, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32657379

ABSTRACT

MOTIVATION: RNA secondary structure prediction is widely used to understand RNA function. Recently, there has been a shift away from the classical minimum free energy methods to partition function-based methods that account for folding ensembles and can therefore estimate structure and base pair probabilities. However, the classical partition function algorithm scales cubically with sequence length, and is therefore prohibitively slow for long sequences. This slowness is even more severe than cubic-time free energy minimization due to a substantially larger constant factor in runtime. RESULTS: Inspired by the success of our recent LinearFold algorithm that predicts the approximate minimum free energy structure in linear time, we design a similar linear-time heuristic algorithm, LinearPartition, to approximate the partition function and base-pairing probabilities, which is shown to be orders of magnitude faster than Vienna RNAfold and CONTRAfold (e.g. 2.5 days versus 1.3 min on a sequence with length 32 753 nt). More interestingly, the resulting base-pairing probabilities are even better correlated with the ground-truth structures. LinearPartition also leads to a small accuracy improvement when used for downstream structure prediction on families with the longest length sequences (16S and 23S rRNAs), as well as a substantial improvement on long-distance base pairs (500+ nt apart). AVAILABILITY AND IMPLEMENTATION: Code: http://github.com/LinearFold/LinearPartition; Server: http://linearfold.org/partition. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
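
For contrast with the linear-time heuristic, the classical cubic-time recursion can be sketched on a toy energy model. The Python sketch below assumes that every allowed base pair contributes the same Boltzmann weight and that loop energies are ignored, so it is a simplification of the nearest neighbor partition function and not the LinearPartition algorithm; `pair_weight` and `min_loop` are illustrative parameters:

```python
# Watson-Crick and G-U wobble pairs permitted in this toy model.
ALLOWED = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def partition_function(seq: str, pair_weight: float = 1.0, min_loop: int = 3) -> float:
    """Sum of Boltzmann weights over all nested secondary structures.

    Each base pair contributes `pair_weight`; hairpin loops must enclose
    at least `min_loop` unpaired nucleotides. Runs in O(n^3) time.
    """
    n = len(seq)
    if n == 0:
        return 1.0
    # q[i][j] is the partition function of seq[i..j], inclusive.
    q = [[1.0] * n for _ in range(n)]

    def span(i: int, j: int) -> float:
        # Empty spans (i > j) have exactly one structure: nothing.
        return q[i][j] if i <= j else 1.0

    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length - 1
            total = span(i, j - 1)  # case 1: position j is unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if (seq[k], seq[j]) in ALLOWED:
                    total += span(i, k - 1) * pair_weight * span(k + 1, j - 1)
            q[i][j] = total
    return q[0][n - 1]
```

With `pair_weight=1.0` the function simply counts nested structures: `partition_function("GAAAC")` returns 2.0, covering the open chain and the single G-C hairpin. The three nested loops over `length`, `i` and `k` are the source of the cubic scaling that the abstract describes as prohibitive for long sequences.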


Subjects
RNA Folding , RNA , Algorithms , Base Pairing , Humans , Nucleic Acid Conformation , Probability , RNA/genetics , RNA Sequence Analysis
16.
Nucleic Acids Res ; 47(1): 29-42, 2019 01 10.
Article in English | MEDLINE | ID: mdl-30462314

ABSTRACT

Design of RNA sequences that adopt functional folds establishes principles of RNA folding and applications in biotechnology. Inverse folding for RNAs, which allows computational design of sequences that adopt specific structures, can be utilized for unveiling RNA functions and developing genetic tools in synthetic biology. Although many algorithms for inverse RNA folding have been developed, the pseudoknot, which plays a key role in folding of ribozymes and riboswitches, is not addressed in most algorithms. For the few algorithms that attempt to predict pseudoknot-containing ribozymes, self-cleavage activity has not been tested. Herein, we design double-pseudoknot HDV ribozymes using an inverse RNA folding algorithm and test their kinetic mechanisms experimentally. More than 90% of the positively designed ribozymes possess self-cleaving activity, whereas more than 70% of negative control ribozymes, which are predicted to fold to the necessary structure but with low fidelity, do not possess it. Kinetic and mutation analyses reveal that these RNAs cleave site-specifically and with the same mechanism as the WT ribozyme. Most ribozymes react just 50- to 80-fold slower than the WT ribozyme, and this rate can be improved to near WT by modification of a junction. Thus, fast-cleaving functional ribozymes with multiple pseudoknots can be designed computationally.


Subjects
Computational Biology/methods , RNA Folding , Catalytic RNA/chemistry , Riboswitch/genetics , Algorithms , Biotechnology/trends , Kinetics , Nucleic Acid Conformation , Catalytic RNA/genetics , Synthetic Biology/trends
17.
Nucleic Acids Res ; 47(3): 1164-1177, 2019 02 20.
Article in English | MEDLINE | ID: mdl-30576464

ABSTRACT

Synonymous codons provide redundancy in the genetic code that influences translation rates in many organisms, in which overall codon use is driven by selection for optimal codons. It is unresolved if or to what extent translational selection drives use of suboptimal codons or codon pairs. In Saccharomyces cerevisiae, 17 specific inhibitory codon pairs, each comprised of adjacent suboptimal codons, inhibit translation efficiency in a manner distinct from their constituent codons, and many are translated slowly in native genes. We show here that selection operates within Saccharomyces sensu stricto yeasts to conserve nine of these codon pairs at defined positions in genes. Conservation of these inhibitory codon pairs is significantly greater than expected, relative to conservation of their constituent codons, with seven pairs more highly conserved than any other synonymous pair. Conservation is strongly correlated with slow translation of the pairs. Conservation of suboptimal codon pairs extends to two related Candida species, fungi that diverged from Saccharomyces ∼270 million years ago, with an enrichment for codons decoded by I•A and U•G wobble in both Candida and Saccharomyces. Thus, conservation of inhibitory codon pairs strongly implies selection for slow translation at particular gene locations, executed by suboptimal codon pairs.


Subjects
Codon , Protein Biosynthesis , Saccharomyces/genetics , Base Sequence , Candida/genetics , Conserved Sequence , Fungal Genes , Saccharomyces cerevisiae/genetics
18.
J Am Chem Soc ; 142(47): 19835-19839, 2020 11 25.
Article in English | MEDLINE | ID: mdl-33170672

ABSTRACT

RNA recognition by proteins is central to biology. Here we demonstrate the existence of a recurrent structural motif, the "arginine fork", that codifies arginine readout of cognate backbone and guanine nucleobase interactions in a variety of protein-RNA complexes derived from viruses, metabolic enzymes, and ribosomes. Nearly 30 years ago, a theoretical arginine fork model was posited to account for the specificity between the HIV-1 Tat protein and TAR RNA. This model predicted that a single arginine should form four complementary contacts with nearby phosphates, yielding a two-pronged backbone readout. Recent high-resolution structures of TAR-protein complexes have unveiled new details, including (i) arginine interactions with the phosphate backbone and the major-groove edge of guanine and (ii) simultaneous cation-π contacts between the guanidinium group and flanking nucleobases. These findings prompted us to search for arginine forks within experimental protein-RNA structures retrieved from the Protein Data Bank. The results revealed four distinct classes of arginine forks that we have defined using a rigorous but flexible nomenclature. Examples are presented in the context of ribosomal and nonribosomal interfaces with analysis of arginine dihedral angles and structural (suite) classification of RNA targets. When arginine fork chemical recognition principles were applied to existing structures with unusual arginine-guanine recognition, we found that the arginine fork geometry was more consistent with the experimental data, suggesting the utility of fork classifications to improve structural models. Software to analyze arginine-RNA interactions has been made available to the community.


Subjects
Arginine/metabolism; Guanine/metabolism; RNA, Viral/metabolism; Arginine/chemistry; Binding Sites; Guanine/chemistry; HIV Long Terminal Repeat/genetics; HIV-1/metabolism; Nucleic Acid Conformation; Phosphates/chemistry; Phosphates/metabolism; RNA, Viral/chemistry; tat Gene Products, Human Immunodeficiency Virus/genetics; tat Gene Products, Human Immunodeficiency Virus/metabolism
19.
RNA ; 24(11): 1555-1567, 2018 Nov.
Article in English | MEDLINE | ID: mdl-30097542

ABSTRACT

Nucleic acids can be designed to be nano-machines, pharmaceuticals, or probes. RNA secondary structures can form the basis of self-assembling nanostructures. With only four natural RNA bases, it can be difficult to design sequences that fold to a single, specified structure because many other structures are often possible for a given sequence. One approach taken by state-of-the-art sequence design methods is to select sequences that fold to the specified structure using stochastic, iterative refinement. The goal of this work is to accelerate design. Many existing iterative methods select and refine sequences one base pair and one unpaired nucleotide at a time. Here, the hypothesis that sequences can be preselected in order to accelerate design was tested. To this end, a database was built of helix sequences that demonstrate thermodynamic features found in natural sequences and that also have little tendency to cross-hybridize. Additionally, a database was assembled of RNA loop sequences with low helix-formation propensity and little tendency to cross-hybridize with either the helices or other loops. These databases of preselected sequences accelerate the selection of sequences that fold with minimal ensemble defect by replacing some of the trial and error of current refinement approaches. When the database of preselected sequences is used instead of randomly chosen sequences, sequences for natural structures are designed 36 times faster, and sequences for random structures are designed six times faster. The sequences selected with the aid of the database have ensemble defects similar to those of sequences selected at random. The sequence database is part of the RNAstructure package at http://rna.urmc.rochester.edu/RNAstructure.html.


Subjects
Nucleic Acid Conformation; RNA/chemistry; Algorithms; Computational Biology/methods; Databases, Nucleic Acid; RNA Folding; Sequence Analysis, RNA; Thermodynamics
20.
RNA ; 24(11): 1568-1582, 2018 Nov.
Article in English | MEDLINE | ID: mdl-30104207

ABSTRACT

RNA secondary structure prediction is often used to develop hypotheses about structure-function relationships for newly discovered RNA sequences, to identify unknown functional RNAs, and to design sequences. Secondary structure prediction methods typically use a thermodynamic model that estimates the free energy change of possible structures based on a set of nearest neighbor parameters. These parameters were derived from optical melting experiments of small model oligonucleotides. This work aims to better understand the precision of structure prediction. Here, the experimental errors in optical melting experiments were propagated to errors in the derived nearest neighbor parameter values and then to errors in RNA secondary structure prediction. To perform this analysis, the optical melting experimental values were systematically perturbed within the estimates of experimental error, and alternative sets of nearest neighbor parameters were then derived from these error-bounded values. Secondary structure predictions using either the perturbed or reference parameter sets were then compared. This work demonstrated that the precision of RNA secondary structure prediction is more robust than suggested by previous work based on perturbation of the nearest neighbor parameters. This robustness is due to correlations between parameters. Additionally, this work identified weaknesses in the parameter derivation that make accurate assessment of parameter uncertainty difficult. Considerations for experimental design are provided to mitigate these weaknesses.


Subjects
Nucleic Acid Conformation; RNA Folding; RNA/chemistry; Base Pairing; Thermodynamics