ABSTRACT
In the problem of RNA design, also known as inverse folding, RNA sequences are predicted that achieve the desired secondary structure at the lowest possible free energy and under certain constraints. The designed sequences have applications in synthetic biology and RNA-based nanotechnologies. There are also known cases of the successful use of inverse folding to discover previously unknown noncoding RNAs. Several computational methods have been dedicated to the problem of RNA design. They differ by algorithm and additional parameters, e.g., those determining the goal function in the sequence optimization process. Users can obtain many promising RNA sequences quite easily. The more difficult issue is to critically evaluate them and select the most favorable and reliable sequence that forms the expected RNA structure. The latter problem is addressed in this paper. We propose an RNA design protocol extended to include sequence evaluation, for which a 3D structure is used. Experiments show that the accuracy of RNA design can be improved by adding a 3D structure prediction and analysis step.
Subjects
Algorithms, Computational Biology, Nucleic Acid Conformation, RNA Folding, RNA, RNA/chemistry, RNA/genetics, Computational Biology/methods, Software, Molecular Models, Synthetic Biology/methods
ABSTRACT
In this chapter, we discuss the potential application of Restricted Boltzmann machines (RBM) to model sequence families of structured RNA molecules. RBMs are a simple two-layer machine learning model able to capture intricate sequence dependencies induced by secondary and tertiary structure, as well as mechanisms of structural flexibility, resulting in a model that can be successfully used for the design of allosteric RNA such as riboswitches. They have recently been experimentally validated as generative models for the SAM-I riboswitch aptamer domain sequence family. We introduce RBM mathematically and practically, providing self-contained code examples to download the necessary training sequence data, train the RBM, and sample novel sequences. We present in detail the implementation of algorithms necessary to use RBMs, focusing on applications in biological sequence modeling.
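The two-layer sampling idea described in this abstract can be illustrated with a minimal sketch. It assumes a plain binary RBM rather than the categorical (nucleotide-valued) visible units an RNA sequence model would need, and every name in it (`sample_layer`, `gibbs_step`, the weight layout) is hypothetical and not taken from the chapter's own code.

```python
import math
import random


def sigmoid(x):
    """Logistic function used for the conditional unit probabilities."""
    return 1.0 / (1.0 + math.exp(-x))


def sample_layer(inputs, weights, biases, rng):
    """Sample one layer of binary units given the other layer.

    weights[k][j] couples input unit k to output unit j (assumed layout).
    """
    sample = []
    for j, bias in enumerate(biases):
        activation = bias + sum(x * row[j] for x, row in zip(inputs, weights))
        sample.append(1 if rng.random() < sigmoid(activation) else 0)
    return sample


def gibbs_step(visible, W, b_visible, b_hidden, rng):
    """One block Gibbs update v -> h -> v' for a binary RBM (toy version)."""
    hidden = sample_layer(visible, W, b_hidden, rng)
    W_T = [list(col) for col in zip(*W)]  # transpose to hidden x visible
    new_visible = sample_layer(hidden, W_T, b_visible, rng)
    return new_visible, hidden
```

Repeating `gibbs_step` many times from a random start is the usual way to draw samples from a trained model; training itself (fitting `W` and the biases to data) is not shown here.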
Subjects
Algorithms, Machine Learning, Nucleic Acid Conformation, RNA, Riboswitch, RNA/chemistry, RNA/genetics, Riboswitch/genetics, Computational Biology/methods, Molecular Models, Software
ABSTRACT
As the most abundant modification in eukaryotic messenger RNA (mRNA) and long noncoding RNA (lncRNA), N6-methyladenosine (m6A) has been shown to play essential roles in various significant biological processes and has attracted growing attention in recent years. To investigate its functions and dynamics, there is a critical need to quantitatively determine the m6A modification fraction at a precise location. Here, we report a deoxyribozyme-mediated CRISPR-Cas12a platform (termed "DCAS") that can directly quantify m6A fractions at single-base resolution. DCAS employs a deoxyribozyme (VMC10) to selectively cleave the unmodified adenine (A) in the RNA, allowing only m6A-modified RNA to be amplified by RT-PCR. By leveraging CRISPR-Cas12a to quantify the PCR amplification products, DCAS can directly determine the presence of m6A at target sites and its fractions. The combination of CRISPR-Cas12a with RT-PCR has greatly improved the sensitivity and accuracy, enabling the detection of m6A-modified RNA as low as 100 aM in 2 fM total target RNA. This represents an improvement of 2-3 orders of magnitude in sensitivity and selectivity compared to traditional standard methods, such as SCARLET and primer extension methods. Therefore, this method can be successfully employed to accurately determine m6A fractions in real biological samples, even in low-abundance RNA biomarkers.
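As a quick check of the detection limit quoted in this abstract, the two concentrations can be put on a common unit. The symbol for the modified fraction is notation introduced here for illustration only; it does not come from the paper.

```latex
f_{\mathrm{m^6A}}
  = \frac{100\ \mathrm{aM}}{2\ \mathrm{fM}}
  = \frac{0.1\ \mathrm{fM}}{2\ \mathrm{fM}}
  = 0.05 = 5\%
```

In other words, the stated limit corresponds to detecting a target in which about one molecule in twenty carries the modification.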
Subjects
Adenosine, CRISPR-Cas Systems, Catalytic DNA, RNA, CRISPR-Cas Systems/genetics, Adenosine/analogs & derivatives, Adenosine/analysis, Adenosine/chemistry, Catalytic DNA/chemistry, Catalytic DNA/metabolism, Catalytic DNA/genetics, RNA/genetics, RNA/analysis, RNA/chemistry, Humans
ABSTRACT
Prokaryotes use CRISPR-Cas systems to interfere with viruses and other mobile genetic elements. CRISPR arrays comprise repeated DNA elements and spacer sequences that can be engineered for custom target sites. These arrays are transcribed into precursor CRISPR RNAs (pre-crRNAs) that undergo maturation steps to form individual CRISPR RNAs (crRNAs). Each crRNA contains a single spacer that identifies the target cleavage site for a large variety of Cas protein effectors. Precise manipulation of spacer sequences within CRISPR arrays is crucial for advancing the functionality of CRISPR-based technologies. Here, we describe a protocol for the design and creation of a minimal, plasmid-based CRISPR array to enable the expression of specific, synthetic crRNAs. Plasmids contain entry spacer sequences with two type IIS restriction sites and Golden Gate cloning enables the efficient exchange of these spacer sequences. Factors that influence the compatibility of the CRISPR arrays with native or recombinant Cas proteins are discussed.
Subjects
CRISPR-Cas Systems, Molecular Cloning, Plasmids, Molecular Cloning/methods, Plasmids/genetics, Clustered Regularly Interspaced Short Palindromic Repeats/genetics, Escherichia coli/genetics, RNA/genetics
ABSTRACT
Imaging-based spatial multi-omics technologies facilitate the analysis of higher-order genomic structures, gene transcription, and the localization of proteins and posttranslational modifications (PTMs) at the single-allele level, thereby enabling detailed observations of biological phenomena, including transcription machinery within cells and tissues. This chapter details the principles of such technologies, with a focus on DNA/RNA/immunofluorescence (IF) sequential fluorescence in situ hybridization (seqFISH). A comprehensive step-by-step protocol for image analysis is provided, covering image preprocessing, spot detection, and data visualization. For practical application, complete Jupyter Notebook codes are made available on GitHub ( https://github.com/Ochiai-Lab/seqFISH_analysis ).
Subjects
DNA, Fluorescent Antibody Technique, Computer-Assisted Image Processing, Fluorescence In Situ Hybridization, RNA, Software, Fluorescence In Situ Hybridization/methods, RNA/genetics, RNA/analysis, RNA/metabolism, Computer-Assisted Image Processing/methods, DNA/genetics, Fluorescent Antibody Technique/methods, Humans, Animals
ABSTRACT
The design of RNA sequences with desired structural properties presents a challenging computational problem with promising applications in biotechnology and biomedicine. Most regulatory RNAs function by forming RNA-RNA interactions, e.g., in order to regulate mRNA expression. It is therefore natural to consider problems where a sequence is designed to form a desired RNA-RNA interaction and switch between structures upon binding. This contribution demonstrates the use of the Infrared framework to design interacting sequences. Specifically, we consider the regulation of the rpoS mRNA by the sRNA DsrA and design artificial 5' UTRs that place a downstream protein coding gene under control of DsrA. The design process is explained step by step in a Jupyter notebook, accompanied by Python code. The text discusses setting up design constraints for sampling sequences in Infrared, computing quality measures, constructing a suitable cost function, as well as the optimization procedure. We show that not only thermodynamic but also kinetic folding features can be relevant. Kinetics of interaction formation can be estimated efficiently using the RRIkinDP tool, and the chapter explains how to include kinetic folding features from RRIkinDP directly in the cost function. The protocol implemented in our Jupyter notebook can easily be extended to consider additional requirements or adapted to novel design scenarios.
Subjects
Nucleic Acid Conformation, Thermodynamics, Computational Biology/methods, Software, Kinetics, RNA/genetics, RNA/chemistry, RNA/metabolism, 5' Untranslated Regions, Messenger RNA/genetics, Messenger RNA/chemistry, Messenger RNA/metabolism, Algorithms, RNA Folding
ABSTRACT
With the advent of the RNA therapeutics and diagnostics era, it is of great relevance to introduce new and more efficient RNA technologies that prove to be effective tools in practical contexts. Moreover, it is of utmost importance to develop and provide access to computational tools capable of designing such RNA constructs. Here we introduce one such novel diagnostics technology (Apta-SMART) and show how to design (using MoiRNAiFold) and implement it, step by step. Moreover, we show how to combine this technique with well-known RNA amplification methods and briefly mention some encouraging results.
Subjects
Computer Simulation, RNA, RNA/genetics, RNA/chemistry, Computational Biology/methods, Software, Humans, Nucleic Acid Amplification Techniques/methods
ABSTRACT
RNA is present in all domains of life. It was once thought to be solely involved in protein expression, but recent advances have revealed its crucial role in catalysis and gene regulation through noncoding RNA. With a growing interest in exploring RNAs with specific structures, there is an increasing focus on designing RNA structures for in vivo and in vitro experimentation and for therapeutics. The development of RNA secondary structure prediction methods has also spurred the growth of RNA design software. However, there are challenges to designing RNA sequences that meet secondary structure requirements. One major challenge is that the secondary structure design problem is likely NP-hard, making it computationally intensive. Another issue is that objective functions need to consider the folding ensemble of RNA molecules to avoid off-target structures. In this chapter, we provide protocols for two software tools from the RNAstructure package: "Design" for structured RNA sequence design and "orega" for unstructured RNA sequence design.
Subjects
Computational Biology, Nucleic Acid Conformation, RNA, Software, RNA/chemistry, RNA/genetics, Computational Biology/methods, RNA Folding, RNA Sequence Analysis/methods, Algorithms
ABSTRACT
Machine learning algorithms, and in particular deep learning approaches, have recently garnered attention in the field of molecular biology due to remarkable results. In this chapter, we describe machine learning approaches specifically developed for the design of RNAs, with a focus on the learna_tools Python package, a collection of automated deep reinforcement learning algorithms for secondary structure-based RNA design. We explain the basic concepts of reinforcement learning and its extension, automated reinforcement learning, and outline how these concepts can be successfully applied to the design of RNAs. The chapter is structured to guide through the usage of the different programs with explicit examples, highlighting particular applications of the individual tools.
Subjects
Algorithms, Machine Learning, Nucleic Acid Conformation, RNA, Software, RNA/chemistry, RNA/genetics, Computational Biology/methods, Deep Learning
ABSTRACT
Ribonucleic acid (RNA) design is the inverse of RNA folding. RNA folding aims to identify the most likely secondary structure into which a given strand of nucleotides will fold. RNA design algorithms, on the other hand, attempt to design a strand of nucleotides that will fold into a specified secondary structure. Despite the apparent NP-hard nature of RNA design, promising results can be achieved when formulated as a combinatorial optimization problem and approached with simple heuristics. The main focus of this paper is to describe an RNA design algorithm based on simulated annealing. Additionally, noteworthy features and results will be presented herein.
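The simulated-annealing formulation mentioned in this abstract can be sketched on a toy problem. This is an illustration only, not the paper's algorithm: the cost below simply counts target base pairs whose two nucleotides cannot pair, whereas a real design tool would score candidates with a folding program and a free-energy model. All function names and parameter values are assumptions made for this example.

```python
import math
import random

# Watson-Crick and G-U wobble pairs accepted by the toy cost function.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def parse_pairs(structure):
    """Return the (i, j) base pairs encoded in a dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.append((stack.pop(), i))
    return pairs


def toy_cost(seq, pairs):
    """Count target pairs whose bases cannot pair (not a free energy)."""
    return sum((seq[i], seq[j]) not in PAIRS for i, j in pairs)


def anneal(structure, steps=5000, t_start=2.0, t_end=0.01, seed=0):
    """Simulated annealing over single-nucleotide mutations."""
    rng = random.Random(seed)
    pairs = parse_pairs(structure)
    seq = [rng.choice("ACGU") for _ in structure]
    cost = toy_cost(seq, pairs)
    for step in range(steps):
        if cost == 0:
            break
        # Geometric cooling schedule from t_start down towards t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)
        pos = rng.randrange(len(seq))
        old = seq[pos]
        seq[pos] = rng.choice("ACGU")
        new_cost = toy_cost(seq, pairs)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost
        else:
            seq[pos] = old  # reject the mutation
    return "".join(seq), cost
```

Calling `anneal("((((...))))")` returns a candidate sequence together with its remaining toy cost; swapping `toy_cost` for a score derived from an actual structure predictor is the step that would turn the sketch into a usable heuristic.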
Subjects
Algorithms, Nucleic Acid Conformation, RNA Folding, RNA, RNA/chemistry, RNA/genetics, Software, Computational Biology/methods, Computer Simulation
ABSTRACT
Understanding the connection between complex structural features of RNA and biological function is a fundamental challenge in evolutionary studies and in RNA design. However, building datasets of RNA 3D structures and making appropriate modeling choices remain time-consuming and lack standardization. In this chapter, we describe the use of rnaglib to train supervised and unsupervised machine learning-based function prediction models on datasets of RNA 3D structures.
Subjects
Computational Biology, Nucleic Acid Conformation, RNA, Software, RNA/chemistry, RNA/genetics, Computational Biology/methods, Machine Learning, Molecular Models
ABSTRACT
Riboswitches are naturally occurring regulatory segments of RNA molecules that modulate gene expression in response to specific ligand binding. They serve as a molecular 'switch' that controls the RNA's structure and function, typically influencing the synthesis of proteins. Riboswitches are unique because they directly interact with metabolites without the need for proteins, making them attractive tools in synthetic biology and RNA-based therapeutics. In synthetic biology, riboswitches are harnessed to create biosensors and genetic circuits. Their ability to respond to specific molecular signals allows for the design of precise control mechanisms in genetic engineering. This specificity is particularly useful in therapeutic applications, where riboswitches can be synthetically designed to respond to disease-specific metabolites, thereby enabling targeted drug delivery or gene therapy. Advancements in designing synthetic riboswitches for RNA-based therapeutics hinge on sophisticated computational techniques, which are described in this chapter. The chapter concludes by underscoring the potential of computational strategies in revolutionizing the design and application of synthetic riboswitches, paving the way for advanced RNA-based therapeutic solutions.
Subjects
Computational Biology, Riboswitch, Synthetic Biology, Riboswitch/genetics, Synthetic Biology/methods, Computational Biology/methods, Humans, RNA/genetics, Genetic Engineering/methods, Nucleotide Aptamers/genetics, Ligands, Nucleic Acid Conformation
ABSTRACT
The inverse RNA folding problem deals with designing a sequence of nucleotides that will fold into a desired target structure. Generalized Nested Rollout Policy Adaptation (GNRPA) is a Monte Carlo search algorithm for optimizing a sequence of choices. It learns a playout policy to intensify the search of the state space near the current best sequence. The algorithm uses a prior on the possible actions so as to perform nonuniform playouts when learning the problem instance at hand. We trained a transformer neural network on the inverse RNA folding problem using the Rfam database. This network is used to generate a prior for every Eterna100 puzzle. GNRPA is used with this prior to solve some of the instances of the Eterna100 dataset. The transformer prior gives better results than handcrafted heuristics.
Subjects
Algorithms, Monte Carlo Method, RNA Folding, RNA, RNA/chemistry, RNA/genetics, Nucleic Acid Conformation, Computer Neural Networks, Computational Biology/methods
ABSTRACT
RNA design is a major challenge for the future development of synthetic biology and RNA-based therapy. The development of efficient and accurate RNA design pipelines is based on trial-and-error strategies. The fast progression of such algorithms requires assaying the properties of many RNA sequences in a short time frame. High-throughput RNA structure chemical probing technologies such as SHAPE-MaP allow for assaying RNA structure and interaction rapidly and at a very large scale. However, the promiscuity of the designed sequences, which may differ by only one nucleotide, requires special care. In addition, it necessitates the analysis and evaluation of many experimental results, which may prove very tedious. Here we propose an experimental and analytical workflow that eases the screening of thousands of designed RNA sequences at once. In particular, we have developed shapemap tools, a customized software suite available at https://github.com/sargueil-citcom/shapemap-tools .
Subjects
Algorithms, Computational Biology, Nucleic Acid Conformation, RNA, Software, RNA/chemistry, RNA/genetics, Computational Biology/methods, Synthetic Biology/methods
ABSTRACT
RNA molecules play vital roles in many biological processes, such as gene regulation or protein synthesis. The adoption of a specific secondary and tertiary structure by RNA is essential to perform these diverse functions, making RNA a popular tool in bioengineering therapeutics. The field of RNA design responds to the need to develop novel RNA molecules that possess specific functional attributes. In recent years, computational tools for predicting RNA sequences with desired folding characteristics have improved and expanded. However, there is still a lack of well-defined and standardized datasets to assess these programs. Here, we present a large dataset of internal and multibranched loops extracted from PDB-deposited RNA structures that encompass a wide spectrum of design difficulties. Furthermore, we conducted benchmarking tests of widely utilized open-source RNA design algorithms employing this dataset.
Subjects
Algorithms, Benchmarking, Computational Biology, Nucleic Acid Conformation, RNA, RNA/genetics, RNA/chemistry, Computational Biology/methods, Software
ABSTRACT
Nucleic acid tests (NATs) are considered the gold standard in molecular diagnosis. To meet the demand for onsite, point-of-care, specific and sensitive, trace and genotype detection of pathogens and pathogenic variants, various types of NATs have been developed since the discovery of PCR. As alternatives to traditional NATs (e.g., PCR), isothermal nucleic acid amplification techniques (INAATs) such as LAMP, RPA, SDA, HDR, NASBA, and HCA were gradually invented. PCR and most of these techniques depend heavily on efficient and optimal primer and probe design to deliver accurate and specific results. This chapter starts with a discussion of traditional NATs and INAATs in concert with the description of computational tools available to aid the process of primer/probe design for NATs and INAATs. Besides briefly covering nanoparticle-assisted NATs, a more comprehensive presentation is given on the role CRISPR-based technologies have played in molecular diagnosis. Here we provide examples of a few groundbreaking CRISPR assays that have been developed to counter epidemics and pandemics and outline CRISPR biology, highlighting the role of CRISPR guide RNA and its design in any successful CRISPR-based application. In this respect, we tabulate computational tools that are available to aid the design of guide RNAs in CRISPR-based applications. In the second part of our chapter, we discuss machine learning (ML)- and deep learning (DL)-based computational approaches that facilitate the design of efficient primers and probes for NATs/INAATs and guide RNAs for CRISPR-based applications. Given the role of microRNAs (miRNAs) as potential future biomarkers of disease diagnosis, we also discuss ML/DL-based computational approaches for miRNA-target predictions.
Our chapter presents the evolution of nucleic acid-based diagnosis techniques from PCR and INAATs to more advanced CRISPR/Cas-based methodologies in concert with the evolution of deep learning (DL)- and machine learning (ML)-based computational tools in the most relevant application domains.
Subjects
Deep Learning, Humans, CRISPR-Cas Systems, Molecular Diagnostic Techniques/methods, Nucleic Acid Amplification Techniques/methods, RNA/genetics, Machine Learning, Clustered Regularly Interspaced Short Palindromic Repeats/genetics
ABSTRACT
Fundamental to the diverse biological functions of RNA are its 3D structure and conformational flexibility, which enable single sequences to adopt a variety of distinct 3D states. Currently, computational RNA design tasks are often posed as inverse problems, where sequences are designed based on adopting a single desired secondary structure without considering 3D geometry and conformational diversity. In this tutorial, we present gRNAde, a geometric RNA design pipeline operating on sets of 3D RNA backbone structures to design sequences that explicitly account for RNA 3D structure and dynamics. gRNAde is a graph neural network that uses an SE(3)-equivariant encoder-decoder framework for generating RNA sequences conditioned on backbone structures where the identities of the bases are unknown. We demonstrate the utility of gRNAde for fixed-backbone re-design of existing RNA structures of interest from the PDB, including riboswitches, aptamers, and ribozymes. gRNAde is more accurate in terms of native sequence recovery while being significantly faster compared to existing physics-based tools for 3D RNA inverse design, such as Rosetta.
Subjects
Deep Learning, Nucleic Acid Conformation, RNA, Software, RNA/chemistry, RNA/genetics, Computational Biology/methods, Catalytic RNA/chemistry, Catalytic RNA/genetics, Molecular Models, Computer Neural Networks
ABSTRACT
The QuantiGene™ 2.0 technique could be used to investigate the gene expression signature of immune system senescence and thus to understand the molecular mechanisms involved in the defects of the immune response during aging. The QuantiGene™ 2.0 technique is a multiplex platform allowing the simultaneous analysis of several target RNA molecules (up to 80) present in a single sample. QuantiGene Assays use an accurate method for multiplexed or single-gene expression quantitation. QuantiGene 2.0 uses magnetic beads that are dyed internally with two fluorescent dyes, exhibiting a unique spectral signal and providing the specificity and multiplexing capability of the technique. QuantiGene Assays incorporate branched-DNA technology for gene expression profiling. The branched-DNA system is responsible for the high sensitivity of the assay; it permits the detection of low levels of mRNA molecules. This branched-DNA system allows for the direct measurement of RNA transcripts by using signal amplification rather than target amplification. The assay protocol is spread over 2 days. First, immune cells are lysed to release the target RNA, which is incubated with an oligonucleotide probe set targeted with beads capable of hybridizing with the target RNA. Signal amplification is performed by sequential hybridization of the branched-DNA pre-amplifier, amplifier, and label probe molecules. The last step involves incubation with streptavidin-conjugated R-phycoerythrin. The fluorescent reporter generates a signal directly proportional to the levels of RNA molecules present in the cells. A Luminex instrument measures the median fluorescence intensity, which is proportional to the number of RNA target molecules present in the cells.
Subjects
Gene Expression Profiling, Gene Expression Profiling/methods, Humans, RNA/genetics, Nucleic Acid Hybridization/methods, Messenger RNA/genetics
ABSTRACT
Cell senescence impedes the self-renewal and osteogenic capacity of bone marrow mesenchymal stem cells (BMSCs), thus limiting their application in tissue regeneration. The present study aimed to elucidate the role and mechanism of repetitive element (RE) activation in BMSC senescence and osteogenesis, as well as the intervention effect of quercetin. In an H2O2-induced BMSC senescence model, quercetin treatment alleviated senescence as shown by a decrease in senescence-associated β-galactosidase (SA-β-gal)-positive cell ratio, increased colony formation ability and decreased mRNA expression of p21 and senescence-associated secretory phenotype genes. DNA damage response marker γH2AX increased in senescent BMSCs, while expression of epigenetic markers histone H3 Lys9 methylation, heterochromatin protein 1α and heterochromatin-related nuclear membrane protein lamina-associated polypeptide 2 decreased. Quercetin rescued these alterations, indicating its ability to ameliorate senescence by stabilizing heterochromatin structure where REs are primarily suppressed. Transcriptional activation of REs accompanied by accumulation of cytoplasmic double-stranded (ds)RNA, as well as triggering of the RNA sensor retinoic acid-inducible gene I (RIG-I) receptor pathway in H2O2-induced senescent BMSCs were shown. Similarly, quercetin treatment inhibited these responses. Additionally, RIG-I knockdown led to a decreased number of SA-β-gal-positive cells, confirming its functional impact on senescence. Induction of senescence or administration of dsRNA analogue significantly hindered the osteogenic capacity of BMSCs, while quercetin treatment or RIG-I knockdown reversed the decline in osteogenic function. The findings of the current study demonstrated that quercetin inhibited the activation of REs and the RIG-I RNA sensing pathway via epigenetic regulation, thereby alleviating the senescence of BMSCs and promoting osteogenesis.
Subjects
Cellular Senescence, Mesenchymal Stem Cells, Osteogenesis, Quercetin, Quercetin/pharmacology, Mesenchymal Stem Cells/drug effects, Mesenchymal Stem Cells/metabolism, Mesenchymal Stem Cells/cytology, Cellular Senescence/drug effects, Osteogenesis/drug effects, Animals, Hydrogen Peroxide/pharmacology, Male, Signal Transduction/drug effects, Rats, Sprague-Dawley Rats, RNA/genetics, RNA/metabolism, Cultured Cells
ABSTRACT
HYPOTHESIS: Lipid nanoparticle self-assembly is a complex process that relies on ion pairing between nucleic acids and hydrophobic cationic lipid counterions for encapsulation. The chemical factors influencing this process, such as formulation composition, have been the focus of recent research. However, the physical factors, particularly the mixing protocol, which directly modulates these chemical factors, have yet to be mechanistically examined using a reproducible mixing platform comparable to the industry standard. We here utilize Flash NanoPrecipitation (FNP), a scalable rapid mixing platform, to isolate and systematically investigate how mixing factors influence this complexation step, first by using a model polyelectrolyte-surfactant system and then generalizing to a typical RNA lipid nanoparticle formulation. EXPERIMENTS: Aqueous polystyrene sulfonate (PSS) and cetrimonium bromide (CTAB) solutions are rapidly homogenized using reproducible FNP mixing and controlled flow rates at different stoichiometric ratios and total solids concentrations to form polyelectrolyte-surfactant complexes (PESCs). Then, key mixing factors such as total flow rate, inlet stream relative volumetric flow rate, and magnitude of flow fluctuation are studied using both this PESC system and an RNA lipid nanoparticle formulation. FINDINGS: Fluctuations in flow as low as ± 5 % of the total flow rate are found to severely compromise PESC formation. This result is replicated in the RNA lipid nanoparticle system, which exhibited significant differences in size (132.7 nm vs. 75.6 nm) and RNA encapsulation efficiency (34.0 % vs. 82.8 %) under fluctuating vs. steady flow. We explain these results in light of the chemical variables isolated and studied; slow or nonuniform mixing generates localized concentration gradients that disrupt the balance between the hydrophobic and electrostatic forces that drive complex formation. 
These experiments contribute to our understanding of the complexation stage of lipid nanoparticle formation and provide practical insights into the importance of developing controlled mixing protocols in industry.