Results 1 - 20 of 112
1.
RSC Chem Biol ; 5(4): 360-371, 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38576723

ABSTRACT

We developed chemically modified PCR primers that allow the design of flexible sticky ends by introducing a photo-cleavable group at the phosphate moiety. Nucleic acid derivatives containing o-nitrobenzyl photo-cleavable groups with a tert-butyl group at the benzyl position were stable during strong base treatment for oligonucleotide synthesis and thermal cycling in PCR reactions. PCR using primers incorporating these nucleic acid derivatives confirmed that chain extension reactions completely stopped at position 1 before and after the site of the photo-cleavable group was introduced. DNA fragments of 2 and 3 kbp, with sticky ends of 50 bases, were successfully concatenated with a high yield of 77%. A plasmid was constructed using this method. Finally, we applied this approach to construct a 48.5 kbp lambda phage DNA, which is difficult to achieve using restriction enzyme-based methods. After 7 days, we were able to confirm the generation of DNA of the desired length. Although the efficiency is yet to be improved, the chemically modified PCR primer offers potential to complement enzymatic methods and serve as a DNA concatenation technique.

2.
Chembiochem ; 24(14): e202200572, 2023 07 17.
Article in English | MEDLINE | ID: mdl-37253903

ABSTRACT

Controlling PCR fidelity is an important issue for molecular biology and high-fidelity PCR is essential for gene cloning. In general, fidelity control is achieved by protein engineering of polymerases. In contrast, only a few studies have reported controlling fidelity using chemically modified nucleotide substrates. In this report, we synthesized nucleotide substrates possessing a modification on Pγ and evaluated the effect of this modification on PCR fidelity. One of the substrates, nucleotide tetraphosphate, caused a modest decrease in Taq DNA polymerase activity and the effect on PCR fidelity was dependent on the type of mutation. The use of deoxyadenosine tetraphosphate enhanced the A:T→G:C mutation dramatically, which is common when using Taq polymerase. Conversely, deoxyguanosine tetraphosphate (dG4P) suppressed this mutation but increased the G:C→A:T mutation during PCR. Using an excess amount of dG4P suppressed both mutations successfully and total fidelity was improved.


Subject(s)
Nucleic Acid Amplification Techniques , Phosphates , Taq Polymerase/genetics , Taq Polymerase/metabolism , Polymerase Chain Reaction , Mutation , Nucleotides
3.
Methods Mol Biol ; 2586: 1-14, 2023.
Article in English | MEDLINE | ID: mdl-36705895

ABSTRACT

Predicting the secondary structures of RNA molecules is an essential step to characterize their functions, but the thermodynamic probability of any prediction is generally small. On the other hand, there are a few tools for calculating and visualizing various secondary structural information from RNA sequences. We implemented a web server that calculates in parallel various features of secondary structures: different types of secondary structure predictions, the marginal probabilities for local structural contexts, accessibilities of the subsequences, the energy changes by arbitrary base mutations, and the measures for validations of the predicted secondary structures. The web server is available at http://rtools.cbrc.jp and integrates the software tools CentroidFold, CentroidHomfold, IPknot, CapR, Raccess, Rchange, RintD, and RintW.


Subject(s)
Algorithms , RNA , Nucleic Acid Conformation , RNA/genetics , RNA/chemistry , Base Sequence , Sequence Analysis, RNA , Software , Internet
4.
RNA ; 29(2): 170-177, 2023 02.
Article in English | MEDLINE | ID: mdl-36384963

ABSTRACT

The mammalian cell nucleus contains dozens of membrane-less nuclear bodies that play significant roles in various aspects of gene expression. Several nuclear bodies are nucleated by specific architectural noncoding RNAs (arcRNAs) acting as structural scaffolds. We have reported that a minor population of cellular RNAs exhibits an unusual semi-extractable feature upon using the conventional procedure of RNA preparation and that needle shearing or heating of cell lysates remarkably improves extraction of dozens of RNAs. Because semi-extractable RNAs, including known arcRNAs, commonly localize in nuclear bodies, this feature may be a hallmark of arcRNAs. Using the semi-extractability of RNA, we performed genome-wide screening of semi-extractable long noncoding RNAs to identify new candidate arcRNAs under hyperosmotic and heat stress conditions. After screening stress-inducible and semi-extractable RNAs, hundreds of readthrough downstream-of-gene (DoG) transcripts over several hundreds of kilobases, many of which were not detected among RNAs prepared by the conventional extraction procedure, were found to be stress-inducible and semi-extractable. We further characterized some of the abundant DoGs and found that stress-inducible transient extension of the 3'-UTR made DoGs semi-extractable. Furthermore, they were localized in distinct nuclear foci that were sensitive to 1,6-hexanediol. These data suggest that semi-extractable DoGs exhibit arcRNA-like features and our semi-extractable RNA-seq is a powerful tool to extensively monitor DoGs that are induced under specific physiological conditions.


Subject(s)
Cell Nucleus , RNA, Long Noncoding , Animals , Base Sequence , Cell Nucleus/metabolism , RNA, Untranslated/genetics , RNA, Untranslated/metabolism , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , Mammals/genetics
5.
NAR Genom Bioinform ; 4(4): lqac092, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36465498

ABSTRACT

Long-read sequencers, such as Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) sequencers, have improved their read length and accuracy, thereby opening up unprecedented research. Many tools and algorithms have been developed to analyze long reads, and rapid progress in PacBio and ONT has further accelerated their development. Together with the development of high-throughput sequencing technologies and their analysis tools, many read simulators have been developed and effectively utilized. PBSIM is one of the popular long-read simulators. In this study, we developed PBSIM3 with three new functions: error models for long reads, multi-pass sequencing for high-fidelity read simulation and transcriptome sequencing simulation. Therefore, PBSIM3 is now able to meet a wide range of long-read simulation requirements.

6.
Leukemia ; 36(11): 2605-2620, 2022 11.
Article in English | MEDLINE | ID: mdl-36229594

ABSTRACT

Myeloid malignancies with DDX41 mutations are often associated with bone marrow failure and cytopenia before overt disease manifestation. However, the mechanisms underlying these specific conditions remain elusive. Here, we demonstrate that loss of DDX41 function impairs efficient RNA splicing, resulting in DNA replication stress with excess R-loop formation. Mechanistically, DDX41 binds to the 5' splice site (5'SS) of coding RNA and coordinates RNA splicing and transcriptional elongation; loss of DDX41 prevents splicing-coupled transient pausing of RNA polymerase II at 5'SS, causing aberrant R-loop formation and transcription-replication collisions. Although the degree of DNA replication stress acquired in S phase is small, cells undergo mitosis with under-replicated DNA remaining, resulting in micronuclei formation and significant DNA damage, thus leading to impaired cell proliferation and genomic instability. These processes may be responsible for disease phenotypes associated with DDX41 mutations.


Subject(s)
RNA Splice Sites , RNA Splicing , Cell Line , RNA Splicing/genetics , Mutation , DNA Replication
7.
Commun Biol ; 5(1): 561, 2022 06 08.
Article in English | MEDLINE | ID: mdl-35676418

ABSTRACT

Expression of secreted recombinant proteins burdens the protein secretion machinery, limiting production. Here, we describe an approach to improving protein production by the non-conventional yeast Komagataella phaffii comprised of genome-wide screening for effective gene disruptions, combining them in a single strain, and recovering growth reduction by adaptive evolution. For the screen, we designed a multiwell-formatted, streamlined workflow for high-throughput assay of secretion of a single-chain small antibody, which is cumbersome to detect but serves as a good model of proteins that are difficult to secrete. Using the consolidated screening system, we evaluated >19,000 mutant strains from a mutant library prepared by a modified random gene-disruption method, and identified six factors for which disruption led to increased antibody production. We then combined the disruptions, up to quadruple gene knockouts, which appeared to contribute independently, in a single strain and observed an additive effect. Target protein and promoter were basically interchangeable for the effects of knockout genes screened. We finally used adaptive evolution to recover the cell growth reduced by multiple gene knockouts and to examine the possibility of further enhancing protein secretion. Our successful, three-part approach holds promise as a method for improving protein production by non-conventional microorganisms.


Subject(s)
Saccharomycetales , Gene Knockout Techniques , Recombinant Proteins/metabolism , Saccharomycetales/genetics , Saccharomycetales/metabolism , Workflow
8.
Nucleic Acids Res ; 50(13): e73, 2022 07 22.
Article in English | MEDLINE | ID: mdl-35390152

ABSTRACT

Recent technological advances have enabled the generation of large amounts of data consisting of RNA sequences and their functional activity. Here, we propose a method for extracting secondary structure features that affect the functional activity of RNA from sequence-activity data. Given pairs of RNA sequences and their corresponding bioactivity values, our method calculates position-specific structural features of the input RNA sequences, considering every possible secondary structure of each RNA. A Ridge regression model is trained using the structural features as feature vectors and the bioactivity values as response variables. Optimized model parameters indicate how secondary structure features affect bioactivity. We used our method to extract intramolecular structural features of bacterial translation initiation sites and self-cleaving ribozymes, and the intermolecular features between rRNAs and Shine-Dalgarno sequences and between U1 RNAs and splicing sites. We not only identified known structural features but also revealed more detailed insights into structure-activity relationships than previously reported. Importantly, the datasets we analyzed here were obtained from different experimental systems and differed in size, sequence length and similarity, and number of RNA molecules involved, demonstrating that our method is applicable to various types of data consisting of RNA sequences and bioactivity values.


Subject(s)
RNA, Catalytic , Nucleic Acid Conformation , Peptide Chain Initiation, Translational , RNA, Catalytic/chemistry , Regression Analysis , Structure-Activity Relationship
9.
Bioinformatics ; 38(3): 710-719, 2022 01 12.
Article in English | MEDLINE | ID: mdl-34694364

ABSTRACT

MOTIVATION: By detecting homology among RNAs, the probabilistic consideration of RNA structural alignments has improved the prediction accuracy of significant RNA prediction problems. Predicting an RNA consensus secondary structure from an RNA sequence alignment is a fundamental research objective because in the detection of conserved base-pairings among RNA homologs, predicting an RNA consensus secondary structure is more convenient than predicting an RNA structural alignment. RESULTS: We developed and implemented ConsAlifold, a dynamic programming-based method that predicts the consensus secondary structure of an RNA sequence alignment. ConsAlifold considers RNA structural alignments. ConsAlifold achieves moderate running time and the best prediction accuracy of RNA consensus secondary structures among available prediction methods. AVAILABILITY AND IMPLEMENTATION: ConsAlifold, data and Python scripts for generating both figures and tables are freely available at https://github.com/heartsh/consalifold. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , RNA , Nucleic Acid Conformation , RNA/chemistry , Consensus , Sequence Analysis, RNA/methods , Software
10.
EMBO J ; 40(15): e107976, 2021 08 02.
Article in English | MEDLINE | ID: mdl-34184765

ABSTRACT

Nuclear stress bodies (nSBs) are nuclear membraneless organelles formed around stress-inducible HSATIII architectural long noncoding RNAs (lncRNAs). nSBs repress splicing of hundreds of introns during thermal stress recovery, which are partly regulated by CLK1 kinase phosphorylation of temperature-dependent Ser/Arg-rich splicing factors (SRSFs). Here, we report a distinct mechanism for this splicing repression through protein sequestration by nSBs. Comprehensive identification of RNA-binding proteins revealed HSATIII association with proteins related to N6-methyladenosine (m6A) RNA modification. 11% of the first adenosine in the repetitive HSATIII sequence were m6A-modified. nSBs sequester the m6A writer complex to methylate HSATIII, leading to subsequent sequestration of the nuclear m6A reader, YTHDC1. Sequestration of these factors from the nucleoplasm represses m6A modification of pre-mRNAs, leading to repression of m6A-dependent splicing during stress recovery phase. Thus, nSBs serve as a common platform for regulation of temperature-dependent splicing through dual mechanisms employing two distinct ribonucleoprotein modules with partially m6A-modified architectural lncRNAs.


Subject(s)
Nerve Tissue Proteins/genetics , RNA Splicing Factors/genetics , RNA Splicing , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , Adenosine/analogs & derivatives , Adenosine/metabolism , Cell Nucleus/genetics , HeLa Cells , Humans , Nerve Tissue Proteins/metabolism , Phosphorylation , Protein Serine-Threonine Kinases/genetics , Protein-Tyrosine Kinases/genetics , RNA Splicing Factors/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Repetitive Sequences, Nucleic Acid , Serine-Arginine Splicing Factors/genetics , Serine-Arginine Splicing Factors/metabolism , Temperature
11.
Bioinformatics ; 37(5): 589-595, 2021 05 05.
Article in English | MEDLINE | ID: mdl-32976553

ABSTRACT

MOTIVATION: Recent advances in high-throughput long-read sequencers, such as PacBio and Oxford Nanopore sequencers, produce longer reads with more errors than short-read sequencers. In addition to the high error rates of reads, non-uniformity of errors leads to difficulties in various downstream analyses using long reads. Many useful simulators, which characterize long-read error patterns and simulate them, have been developed. However, there is still room for improvement in the simulation of the non-uniformity of errors. RESULTS: To capture characteristics of errors in reads for long-read sequencers, here, we introduce a generative model for quality scores, in which a hidden Markov Model with a latest model selection method, called factorized information criteria, is utilized. We evaluated our developed simulator from various points, indicating that our simulator successfully simulates reads that are consistent with real reads. AVAILABILITY AND IMPLEMENTATION: The source codes of PBSIM2 are freely available from https://github.com/yukiteruono/pbsim2. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
High-Throughput Nucleotide Sequencing , Nanopores , Computer Simulation , Sequence Analysis, DNA , Software
12.
Nucleic Acids Res ; 48(22): 13000-13012, 2020 12 16.
Article in English | MEDLINE | ID: mdl-33257988

ABSTRACT

In the yeast Saccharomyces cerevisiae, terminator sequences not only terminate transcription but also affect expression levels of the protein encoded upstream of the terminator. The non-conventional yeast Pichia pastoris (syn. Komagataella phaffii) has frequently been used as a platform for metabolic engineering but knowledge regarding P. pastoris terminators is limited. To explore terminator sequences available to tune protein expression levels in P. pastoris, we created a 'terminator catalog' by testing 72 sequences, including terminators from S. cerevisiae or P. pastoris and synthetic terminators. Altogether, we found that the terminators have a tunable range of 17-fold. We also found that S. cerevisiae terminator sequences maintain function when transferred to P. pastoris. Successful tuning of protein expression levels was shown not only for the reporter gene used to define the catalog but also using betaxanthin production as an example application in pathway flux regulation. Moreover, we found experimental evidence that protein expression levels result from mRNA abundance and in silico evidence that levels reflect the stability of mRNA 3'-UTR secondary structure. In combination with promoter selection, the novel terminator catalog constitutes a basic toolbox for tuning protein expression levels in metabolic engineering and synthetic biology in P. pastoris.


Subject(s)
RNA Stability/genetics , RNA, Messenger/genetics , Saccharomycetales/genetics , Terminator Regions, Genetic/genetics , Gene Expression Regulation, Fungal/genetics , Metabolic Engineering , Promoter Regions, Genetic , Saccharomyces cerevisiae/genetics , Synthetic Biology
13.
J Chem Theory Comput ; 16(9): 5923-5935, 2020 Sep 08.
Article in English | MEDLINE | ID: mdl-32786906

ABSTRACT

Can current simulations quantitatively predict the stability of ribonucleic acids (RNAs)? In this research, we apply a free-energy perturbation simulation of RNAs containing inosine, a modified ribonucleic base, to the derivation of RNA nearest-neighbor parameters. A parameter set derived solely from 30 simulations was used to predict the free-energy difference of the RNA duplex with a mean unbiased error of 0.70 kcal/mol, which is a level of accuracy comparable to that obtained with parameters derived from 25 experiments. We further show that the error can be lowered to 0.60 kcal/mol by combining the simulation-derived free-energy differences with experimentally measured differences. This protocol can be used as a versatile method for deriving nearest-neighbor parameters of RNAs with various modified bases.


Subject(s)
Inosine/chemistry , RNA/chemistry , Base Pairing , Base Sequence , Nucleic Acid Conformation , RNA/metabolism , Thermodynamics
14.
Bioinformatics ; 36(Suppl_1): i227-i235, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32657400

ABSTRACT

MOTIVATION: RNA folding kinetics plays an important role in the biological functions of RNA molecules. An important goal in the investigation of the kinetic behavior of RNAs is to find the folding pathway with the lowest energy barrier. For this purpose, most of the existing methods use heuristics because the number of possible pathways is huge even if only the shortest (direct) folding pathways are considered. RESULTS: In this study, we propose a new method using a best-first search strategy to efficiently compute the exact solution of the minimum barrier energy of direct pathways. Using our method, we can find the exact direct pathways within a Hamming distance of 20, whereas the previous methods even miss the exact short pathways. Moreover, our method can be used to improve the pathways found by existing methods for exploring indirect pathways. AVAILABILITY AND IMPLEMENTATION: The source code and datasets created and used in this research are available at https://github.com/eukaryo/czno. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , RNA , Nucleic Acid Conformation , RNA Folding , Software
15.
Comput Struct Biotechnol J ; 18: 1811-1818, 2020.
Article in English | MEDLINE | ID: mdl-32695273

ABSTRACT

Codon optimization in protein-coding sequences (CDSs) is a widely used technique to promote the heterologous expression of target genes. In codon optimization, a combinatorial space of nucleotide sequences that code a given amino acid sequence and take into account user-prescribed forbidden sequence motifs is explored to optimize multiple criteria. Although evolutionary algorithms have been used to tackle such complex codon optimization problems, evolutionary codon optimization tools do not provide guarantees to find the optimal solutions for these multicriteria codon optimization problems. We have developed a novel multicriteria dynamic programming algorithm, COSMO. By using this algorithm, we can obtain all Pareto-optimal solutions for the multiple features of CDS, which include codon usage, codon context, and the number of hidden stop codons. User-prescribed forbidden sequence motifs are rigorously excluded from the Pareto-optimal solutions. To accelerate CDS design by COSMO, we introduced constraints that reduce the number of Pareto-optimal solutions to be processed in a branch-and-bound manner. We benchmarked COSMO for run-time and the number of generated solutions by adapting selected human genes to yeast codon usage frequencies, and found that the constraints effectively reduce the run-time. In addition to the benchmarking of COSMO, a multi-objective genetic algorithm (MOGA) for CDS design was also benchmarked for the same two aspects and their performances were compared. In this comparison, (i) MOGA identified significantly fewer Pareto-optimal solutions than COSMO, and (ii) the MOGA solutions did not achieve the same mean hypervolume values as those provided by COSMO. These results suggest that generating the whole set of the Pareto-optimal solutions of the codon optimization problems is a difficult task for MOGA.
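
The notion of Pareto optimality used in the abstract above can be illustrated with a small sketch. This is not the COSMO dynamic programming algorithm; it is a brute-force filter over a handful of candidates, assuming every criterion is to be maximized. The function names and the score tuples are hypothetical and chosen only for illustration.

```python
def dominates(a, b):
    """True if a is at least as good as b on every criterion and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Keep the points that no other point dominates (brute force, O(n^2))."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (criterion_1, criterion_2) scores for five candidate sequences.
candidates = [(0.9, 0.2), (0.7, 0.7), (0.5, 0.5), (0.2, 0.9), (0.7, 0.6)]
front = pareto_front(candidates)
print(front)
```

A candidate stays on the front when improving one criterion would require giving up another, which is why a multicriteria method may return many solutions rather than a single optimum.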

16.
Nucleic Acids Res ; 48(14): e81, 2020 08 20.
Article in English | MEDLINE | ID: mdl-32504488

ABSTRACT

RNA secondary structure around translation initiation sites strongly affects the abundance of expressed proteins in Escherichia coli. However, detailed secondary structural features governing protein abundance remain elusive. Recent advances in high-throughput DNA synthesis and experimental systems enable us to obtain large amounts of data. Here, we evaluated six types of structural features using two large-scale datasets. We found that accessibility, which is the probability that a given region around the start codon has no base-paired nucleotides, showed the highest correlation with protein abundance in both datasets. Accessibility showed a significantly higher correlation (Spearman's ρ = 0.709) than the widely used minimum free energy (0.554) in one of the datasets. Interestingly, accessibility showed the highest correlation only when it was calculated by a log-linear model, indicating that the RNA structural model and how to utilize it are important. Furthermore, by combining the accessibility and activity of the Shine-Dalgarno sequence, we devised a method for predicting protein abundance more accurately than existing methods. We inferred that the log-linear model has a broader probabilistic distribution than the widely used Turner energy model, which contributed to more accurate quantification of ribosome accessibility to translation initiation sites.


Subject(s)
Escherichia coli/genetics , Nucleic Acid Conformation , RNA, Bacterial/metabolism , RNA, Messenger/metabolism , RNA-Binding Proteins/metabolism , 5' Untranslated Regions , Algorithms , Codon, Initiator/metabolism , Datasets as Topic , Escherichia coli/metabolism , Forecasting , Linear Models , Machine Learning , Peptide Chain Initiation, Translational , RNA, Bacterial/chemistry , RNA, Messenger/chemistry , Regulatory Sequences, Ribonucleic Acid , Structure-Activity Relationship
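
The abstract above compares structural features by their Spearman rank correlation with protein abundance. As a minimal sketch of that statistic, assuming plain Python lists as input, Spearman's ρ can be computed as the Pearson correlation of average ranks. The helper names are hypothetical and no data from the study are reproduced here.

```python
def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the calculation, ρ measures any monotonic relationship between a feature and the measured abundance, not just a linear one.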
17.
BMC Bioinformatics ; 21(1): 210, 2020 May 24.
Article in English | MEDLINE | ID: mdl-32448174

ABSTRACT

BACKGROUND: Analysis of secondary structures is essential for understanding the functions of RNAs. Because RNA molecules thermally fluctuate, it is necessary to analyze the probability distributions of their secondary structures. Existing methods, however, are not applicable to long RNAs owing to their high computational complexity. Additionally, previous research has suffered from two numerical difficulties: overflow and significant numerical errors. RESULT: In this research, we reduced the computational complexity of calculating the landscape of the probability distribution of secondary structures by introducing a maximum-span constraint. In addition, we resolved numerical computation problems through two techniques: extended logsumexp and accuracy-guaranteed numerical computation. We analyzed the stability of the secondary structures of 16S ribosomal RNAs at various temperatures without overflow. The results obtained are consistent with previous research on thermophilic bacteria, suggesting that our method is applicable in thermal stability analysis. Furthermore, we quantitatively assessed numerical stability using our method. CONCLUSION: These results demonstrate that the proposed method is applicable to long RNAs.


Subject(s)
Algorithms , Nucleic Acid Conformation , RNA/chemistry , Software , Escherichia coli/genetics , Probability , RNA, Ribosomal, 16S/chemistry , Temperature , Thermus thermophilus/genetics , Time Factors
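
The overflow problem mentioned in the abstract above arises when Boltzmann weights are summed directly. The sketch below shows the standard logsumexp trick, not the extended variant the authors describe, whose details are not given in the abstract. The function name and example values are assumptions made for illustration.

```python
import math


def logsumexp(log_values):
    """Return log(sum(exp(v) for v in log_values)) without overflowing."""
    if not log_values:
        return -math.inf  # log of an empty sum, i.e. log(0)
    m = max(log_values)
    if m == -math.inf:
        return -math.inf  # every term is exp(-inf) = 0
    # Shifting by the maximum keeps every exponent at or below zero.
    return m + math.log(sum(math.exp(v - m) for v in log_values))


# math.exp(1000.0) alone would raise OverflowError; the shifted form is safe.
log_weights = [1000.0, 1000.0]
print(logsumexp(log_weights))
```

Working in log space this way lets partition-function-style sums be accumulated for long sequences, at the cost of one extra pass to find the maximum.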
18.
EMBO J ; 39(3): e102729, 2020 02 03.
Article in English | MEDLINE | ID: mdl-31782550

ABSTRACT

A number of long noncoding RNAs (lncRNAs) are induced in response to specific stresses to construct membrane-less nuclear bodies; however, their function remains poorly understood. Here, we report the role of nuclear stress bodies (nSBs) formed on highly repetitive satellite III (HSATIII) lncRNAs derived from primate-specific satellite III repeats upon thermal stress exposure. A transcriptomic analysis revealed that depletion of HSATIII lncRNAs, resulting in elimination of nSBs, promoted splicing of 533 retained introns during thermal stress recovery. A HSATIII-Comprehensive identification of RNA-binding proteins by mass spectrometry (ChIRP-MS) analysis identified multiple splicing factors in nSBs, including serine and arginine-rich pre-mRNA splicing factors (SRSFs), the phosphorylation states of which affect splicing patterns. SRSFs are rapidly de-phosphorylated upon thermal stress exposure. During stress recovery, CDC like kinase 1 (CLK1) was recruited to nSBs and accelerated the re-phosphorylation of SRSF9, thereby promoting target intron retention. Our findings suggest that HSATIII-dependent nSBs serve as a conditional platform for phosphorylation of SRSFs by CLK1 to promote the rapid adaptation of gene expression through intron retention following thermal stress exposure.


Subject(s)
Cell Nucleus/metabolism , Heat-Shock Response , Microsatellite Repeats , Protein Serine-Threonine Kinases/metabolism , Protein-Tyrosine Kinases/metabolism , RNA, Long Noncoding/metabolism , Serine-Arginine Splicing Factors/metabolism , Animals , CHO Cells , Cricetulus , Gene Expression Profiling , Gene Expression Regulation , HeLa Cells , Humans , Introns , Phosphorylation , RNA Splicing Factors/metabolism , Exome Sequencing
19.
RNA Biol ; 17(2): 264-280, 2020 02.
Article in English | MEDLINE | ID: mdl-31601146

ABSTRACT

MicroRNAs (miRNAs) are small non-coding RNAs that play essential roles in the regulation of gene function by a mechanism known as RNA silencing. In a previous study, we revealed that miRNA-mediated silencing efficacy is correlated with the combinatorial thermodynamic properties of the miRNA seed-target mRNA duplex and the 5´-terminus of the miRNA duplex, which can be predicted using 'miScore'. In this study, a robust refined-miScore was developed by integrating the thermodynamic properties of various miRNA secondary structures and the latest thermodynamic parameters of wobble base-pairing, including newly established parameters for I:C base pairing. Through repeated random sampling and machine learning, refined-miScore models calculated with either melting temperature (Tm) or free energy change (ΔG) values were successfully built and validated in both wild-type and adenosine-to-inosine edited miRNAs. In addition to the previously reported contribution of the seed-target duplex and 5´-terminus region, the refined-miScore suggests that the central and 3´-terminus regions of the miRNA duplex also play a role in the thermodynamic regulation of miRNA-mediated silencing efficacy.


Subject(s)
Adenosine , Amino Acid Substitution , Inosine , MicroRNAs/genetics , Models, Biological , RNA Editing , RNA Interference , Algorithms , Machine Learning , Nucleic Acid Conformation , RNA Stability , RNA, Messenger/genetics , Thermodynamics
20.
IEEE/ACM Trans Comput Biol Bioinform ; 16(5): 1645-1655, 2019.
Article in English | MEDLINE | ID: mdl-29994069

ABSTRACT

Computational RNA secondary structure prediction depends on a large number of nearest-neighbor free-energy parameters, including 10 parameters for Watson-Crick stacked base pairs that were estimated from experimental measurements of the free energies of 90 RNA duplexes. These experimental data are provided by time-consuming and cost-intensive experiments. In contrast, various modified nucleotides in RNAs, which would affect not only their structures but also functions, have been found, and rapid determination of energy parameters for such modified nucleotides is needed. To reduce the high cost of determining energy parameters, we propose a novel method to estimate energy parameters from both experimental and computational data, where the computational data are provided by a recently developed molecular dynamics simulation protocol. We evaluate our method for Watson-Crick stacked base pairs, and show that parameters estimated from 10 experimental data items and 10 computational data items can predict RNA secondary structures with accuracy comparable to that using conventional parameters. The results indicate that the combination of experimental free-energy measurements and molecular dynamics simulations is capable of estimating the thermodynamic properties of RNA secondary structures at lower cost.


Subject(s)
Base Pairing/physiology , Molecular Dynamics Simulation , Nucleic Acid Conformation , RNA , Computational Biology , RNA/chemistry , RNA/metabolism , Thermodynamics