1.
PLoS Genet ; 18(6): e1010245, 2022 06.
Article in English | MEDLINE | ID: mdl-35657999

ABSTRACT

LOTUS and Tudor domain containing proteins have critical roles in the germline. Proteins that contain these domains, such as Tejas/Tapas in Drosophila, help localize the Vasa helicase to the germ granules and facilitate piRNA-mediated transposon silencing. The homologous proteins in mammals, TDRD5 and TDRD7, are required during spermiogenesis. Until now, proteins containing both LOTUS and Tudor domains in Caenorhabditis elegans have remained elusive. Here we describe LOTR-1 (D1081.7), which derives its name from its LOTUS and Tudor domains. Interestingly, LOTR-1 docks next to P granules to colocalize with the broadly conserved Z-granule helicase, ZNFX-1. The Tudor domain of LOTR-1 is required for its Z-granule retention. Like znfx-1 mutants, lotr-1 mutants lose small RNAs from the 3' ends of WAGO and mutator targets, reminiscent of the loss of piRNAs from the 3' ends of piRNA precursor transcripts in mouse Tdrd5 mutants. Our work shows that LOTR-1 acts with ZNFX-1 to bring small RNA amplifying mechanisms towards the 3' ends of its RNA templates.


Subject(s)
Caenorhabditis elegans , Epigenesis, Genetic , Germ Cells , Animals , Caenorhabditis elegans/genetics , Caenorhabditis elegans/metabolism , Caenorhabditis elegans Proteins , Germ Cells/metabolism , RNA Helicases , RNA, Small Interfering/genetics , RNA, Small Interfering/metabolism , Tudor Domain
2.
Elife ; 10, 2021 07 05.
Article in English | MEDLINE | ID: mdl-34223818

ABSTRACT

We describe MIP-1 and MIP-2, novel paralogous C. elegans germ granule components that interact with the intrinsically disordered MEG-3 protein. These proteins promote P granule condensation, form granules independently of MEG-3 in the postembryonic germ line, and balance each other in regulating P granule growth and localization. MIP-1 and MIP-2 each contain two LOTUS domains and intrinsically disordered regions and form homo- and heterodimers. They bind and anchor the Vasa homolog GLH-1 within P granules and are jointly required for coalescence of MEG-3, GLH-1, and PGL proteins. Animals lacking MIP-1 and MIP-2 show temperature-sensitive embryonic lethality, sterility, and mortal germ lines. Germline phenotypes include defects in stem cell self-renewal, meiotic progression, and gamete differentiation. We propose that these proteins serve as scaffolds and organizing centers for ribonucleoprotein networks within P granules that help recruit and balance essential RNA processing machinery to regulate key developmental transitions in the germ line.


Subject(s)
Caenorhabditis elegans Proteins/metabolism , Caenorhabditis elegans/metabolism , Germ Cells/physiology , Intracellular Signaling Peptides and Proteins/metabolism , Animals , Caenorhabditis elegans/embryology , Caenorhabditis elegans Proteins/genetics , DEAD-box RNA Helicases/genetics , DEAD-box RNA Helicases/metabolism , Gene Expression Regulation/physiology , Intracellular Signaling Peptides and Proteins/genetics
3.
J Mol Biol ; 433(15): 167051, 2021 07 23.
Article in English | MEDLINE | ID: mdl-33992693

ABSTRACT

The COVID-19 pandemic has triggered concerns about the emergence of more infectious and pathogenic viral strains. As a public health measure, efficient screening methods are needed to determine the functional effects of new sequence variants. Here we show that structural modeling of SARS-CoV-2 Spike protein binding to the human ACE2 receptor, the first step in host-cell entry, predicts many novel variant combinations with enhanced binding affinities. By focusing on natural variants at the Spike-hACE2 interface and assessing over 700 mutant complexes, our analysis reveals that high-affinity Spike mutations (including N440K, S443A, G476S, E484R, G502P) tend to cluster near known human ACE2 recognition sites (K31 and K353). These Spike regions are structurally flexible, allowing certain mutations to optimize interface interaction energies. Although most human ACE2 variants tend to weaken binding affinity, they can interact with Spike mutations to generate high-affinity double mutant complexes, suggesting variation in individual susceptibility to infection. Applying structural analysis to highly transmissible variants, we find that circulating point mutations S477N, E484K and N501Y form high-affinity complexes (~40% more than wild-type). By combining predicted affinities and available antibody escape data, we show that fast-spreading viral variants exploit combinatorial mutations possessing both enhanced affinity and antibody resistance, including S477N/E484K, E484K/N501Y and K417T/E484K/N501Y. Thus, three-dimensional modeling of the Spike/hACE2 complex predicts changes in structure and binding affinity that correlate with transmissibility and therefore can help inform future intervention strategies.


Subject(s)
Angiotensin-Converting Enzyme 2/chemistry , Angiotensin-Converting Enzyme 2/metabolism , COVID-19/transmission , Mutation , SARS-CoV-2/pathogenicity , Spike Glycoprotein, Coronavirus/chemistry , Spike Glycoprotein, Coronavirus/metabolism , Angiotensin-Converting Enzyme 2/genetics , Binding Sites , Computational Biology , Humans , Models, Molecular , Protein Binding , Protein Conformation , SARS-CoV-2/genetics , SARS-CoV-2/metabolism , Spike Glycoprotein, Coronavirus/genetics , Virus Internalization
4.
Nucleic Acids Res ; 47(10): 5307-5324, 2019 06 04.
Article in English | MEDLINE | ID: mdl-30941417

ABSTRACT

Hepatitis C virus (HCV) is a positive-sense RNA virus that interacts with the liver-specific microRNA, miR-122. miR-122 binds to two sites in the 5' untranslated region (UTR) and this interaction promotes HCV RNA accumulation, although the precise role of miR-122 in the HCV life cycle remains unclear. Using biophysical analyses and Selective 2' Hydroxyl Acylation analyzed by Primer Extension (SHAPE), we investigated miR-122 interactions with the 5' UTR. Our data suggest that miR-122 binding results in alteration of nucleotides 1-117 to suppress an alternative secondary structure and promote functional internal ribosomal entry site (IRES) formation. Furthermore, we demonstrate that two hAgo2:miR-122 complexes are able to bind to the HCV 5' terminus simultaneously, and SHAPE analyses revealed further alterations to the structure of the 5' UTR to accommodate these complexes. Finally, we present a computational model of the hAgo2:miR-122:HCV RNA complex at the 5' terminus of the viral genome as well as hAgo2:miR-122 interactions with the IRES-40S complex that suggest hAgo2 is likely to form additional interactions with SLII which may further stabilize the HCV IRES. Taken together, our results support a model whereby hAgo2:miR-122 complexes alter the structure of the viral 5' terminus and promote formation of the HCV IRES.


Subject(s)
Argonaute Proteins/metabolism , Genome, Viral , Hepacivirus/genetics , Hepatitis C/virology , MicroRNAs/metabolism , 5' Untranslated Regions , Calorimetry , Humans , Internal Ribosome Entry Sites , Mutation , Nucleic Acid Conformation , Plasmids/metabolism , Protein Binding , RNA Stability , RNA, Viral/genetics , Software , Thermodynamics , Virus Replication
5.
Methods Mol Biol ; 1970: 43-64, 2019.
Article in English | MEDLINE | ID: mdl-30963487

ABSTRACT

Translational repression and degradation of transcripts by microRNAs (miRNAs) is mediated by a ribonucleoprotein complex called the miRNA-induced silencing complex (miRISC, or RISC). Advances in experimental determination of RISC structures have enabled detailed analysis and modeling of known miRNA targets, yet a full appreciation of the structural factors influencing target recognition remains a challenge, primarily because target recognition involves a combination of RNA-RNA and RNA-protein interactions that can vary greatly among different miRNA-target pairs. In this chapter, we review progress toward understanding the role of tertiary structure in miRNA target recognition using computational approaches to assemble RISC complexes at known targets and physics-based methods for computing target interactions. Using this framework to examine RISC structures and dynamics, we describe how the conformational flexibility of Argonautes plays an important role in accommodating the diversity of miRNA-target duplexes formed at canonical and noncanonical target sites. We then discuss applications of tertiary structure-based approaches to emerging topics, including the structural effects of SNPs in miRNA targets and cooperative interactions involving Argonaute-Argonaute complexes. We conclude by assessing the prospects for genome-scale modeling of RISC structures and modeling of higher-order Argonaute complexes associated with miRNA biogenesis, mRNA regulation, and other functions.


Subject(s)
Argonaute Proteins/chemistry , Computational Biology/methods , MicroRNAs/metabolism , RNA, Messenger/metabolism , RNA-Induced Silencing Complex/metabolism , Software , Binding Sites , Gene Expression Regulation , Humans , MicroRNAs/chemistry , MicroRNAs/genetics , Protein Structure, Tertiary , RNA, Messenger/chemistry , RNA, Messenger/genetics , RNA-Induced Silencing Complex/chemistry
6.
Nucleic Acids Res ; 45(12): 7212-7225, 2017 Jul 07.
Article in English | MEDLINE | ID: mdl-28482037

ABSTRACT

Although strong evidence supports the importance of their cooperative interactions, microRNA (miRNA)-binding sites are still largely investigated as functionally independent regulatory units. Here, a survey of alternative 3' UTR isoforms implicates a non-canonical seedless site in cooperative miRNA-mediated silencing. While required for target mRNA deadenylation and silencing, this site is not sufficient on its own to physically recruit miRISC. Instead, it relies on facilitating interactions with a nearby canonical seed-pairing site to recruit the Argonaute complexes. We further show that cooperation between miRNA target sites is necessary for silencing in vivo in the C. elegans embryo, and for the recruitment of the Ccr4-Not effector complex. Using a structural model of cooperating miRISCs, we identified allosteric determinants of cooperative miRNA-mediated silencing that are required for both embryonic and larval miRNA functions. Our results delineate multiple cooperative mechanisms in miRNA-mediated silencing and further support the consideration of target site cooperation as a fundamental characteristic of miRNA function.


Subject(s)
Caenorhabditis elegans/genetics , Gene Silencing , MicroRNAs/genetics , RNA-Induced Silencing Complex/chemistry , Transcription Factors/chemistry , 3' Untranslated Regions , Alternative Splicing , Animals , Argonaute Proteins/chemistry , Argonaute Proteins/genetics , Argonaute Proteins/metabolism , Base Sequence , Binding Sites , Caenorhabditis elegans/growth & development , Caenorhabditis elegans/metabolism , Embryo, Nonmammalian , MicroRNAs/metabolism , Models, Molecular , Nucleic Acid Conformation , RNA-Induced Silencing Complex/genetics , RNA-Induced Silencing Complex/metabolism , Transcription Factors/genetics , Transcription Factors/metabolism
7.
Nucleic Acids Res ; 43(20): 9613-25, 2015 Nov 16.
Article in English | MEDLINE | ID: mdl-26432829

ABSTRACT

Experimental studies have uncovered a variety of microRNA (miRNA)-target duplex structures that include perfect, imperfect and seedless duplexes. However, non-canonical binding modes from imperfect/seedless duplexes are not well predicted by computational approaches, which rely primarily on sequence and secondary structural features, nor have their tertiary structures been characterized because solved structures to date are limited to near perfect, straight duplexes in Argonautes (Agos). Here, we use structural modeling to examine the role of Ago dynamics in assembling viable eukaryotic miRNA-induced silencing complexes (miRISCs). We show that combinations of low-frequency, global modes of motion of Ago domains are required to accommodate RNA duplexes in model human and C. elegans Ago structures. Models of viable miRISCs imply that Ago adopts variable conformations at distinct target sites that generate distorted, imperfect miRNA-target duplexes. Ago's ability to accommodate a duplex is dependent on the region where structural distortions occur: distortions in solvent-exposed seed and 3'-end regions are less likely to produce steric clashes than those in the central duplex region. Energetic analyses of assembled miRISCs indicate that target recognition is also driven by favorable Ago-duplex interactions. Such structural insights into Ago loading and target recognition mechanisms may provide a more accurate assessment of miRNA function.


Subject(s)
Argonaute Proteins/chemistry , MicroRNAs/chemistry , RNA-Induced Silencing Complex/chemistry , Animals , Argonaute Proteins/metabolism , Bacterial Proteins/chemistry , Caenorhabditis elegans/genetics , Fungal Proteins/chemistry , Humans , MicroRNAs/metabolism , Models, Molecular , Protein Binding , Protein Conformation , RNA-Induced Silencing Complex/metabolism , Thermus thermophilus
8.
RNA ; 19(4): 539-51, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23417009

ABSTRACT

Current computational analysis of microRNA interactions is based largely on primary and secondary structure analysis. Computationally efficient tertiary structure-based methods are needed to enable more realistic modeling of the molecular interactions underlying miRNA-mediated translational repression. We incorporate algorithms for predicting duplex RNA structures, ionic strength effects, duplex entropy and free energy, and docking of duplex-Argonaute protein complexes into a pipeline to model and predict miRNA-target duplex binding energies. To ensure modeling accuracy and computational efficiency, we use an all-atom description of RNA and a continuum description of ionic interactions using the Poisson-Boltzmann equation. Our method predicts the conformations of two constructs of Caenorhabditis elegans let-7 miRNA-target duplexes to an accuracy of ∼3.8 Å root mean square distance of their NMR structures. We also show that the computed duplex formation enthalpies, entropies, and free energies for eight miRNA-target duplexes agree with titration calorimetry data. Analysis of duplex-Argonaute docking shows that structural distortions arising from single-base-pair mismatches in the seed region influence the activity of the complex by destabilizing both duplex hybridization and its association with Argonaute. Collectively, these results demonstrate that tertiary structure-based modeling of miRNA interactions can reveal structural mechanisms not accessible with current secondary structure-based methods.


Subject(s)
MicroRNAs/chemistry , Nucleic Acid Conformation , RNA, Helminth/chemistry , Animals , Argonaute Proteins/metabolism , Caenorhabditis elegans/genetics , Caenorhabditis elegans/metabolism , Energy Metabolism , Models, Molecular , Nuclear Magnetic Resonance, Biomolecular , Thermus thermophilus/metabolism
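
The accuracy figure quoted in the abstract of entry 8 is a root mean square distance between predicted and NMR coordinates. As a minimal sketch of that metric only (not the authors' pipeline), the Python function below computes the distance for two matched coordinate lists. It assumes the structures are already superimposed; the function name `rmsd` and the toy coordinates are hypothetical and chosen purely for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square distance between two matched lists of (x, y, z) points.

    Assumption: both structures are already superimposed and list their atoms
    in the same order. No optimal alignment (e.g. Kabsch) is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates in angstroms, not taken from the paper.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]
print(rmsd(model, reference))
```

In practice an optimal superposition step would normally precede this calculation; it is omitted here to keep the sketch short.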
9.
Biophys J ; 99(8): 2587-96, 2010 Oct 20.
Article in English | MEDLINE | ID: mdl-20959100

ABSTRACT

Characterizing the ionic distribution around chromatin is important for understanding the electrostatic forces governing chromatin structure and function. Here we develop an electrostatic model to handle multivalent ions and compute the ionic distribution around a mesoscale chromatin model as a function of conformation, number of nucleosome cores, and ionic strength and species using Poisson-Boltzmann theory. This approach enables us to visualize and measure the complex patterns of counterion condensation around chromatin by examining ionic densities, free energies, shielding charges, and correlations of shielding charges around the nucleosome core and various oligonucleosome conformations. We show that: counterions, especially divalent cations, predominantly condense around the nucleosomal and linker DNA, unburied regions of histone tails, and exposed chromatin surfaces; ionic screening is sensitively influenced by local and global conformations, with a wide ranging net nucleosome core screening charge (56-100e); and screening charge correlations reveal conformational flexibility and interactions among chromatin subunits, especially between the histone tails and parental nucleosome cores. These results provide complementary and detailed views of ionic effects on chromatin structure for modest computational resources. The electrostatic model developed here is applicable to other coarse-grained macromolecular complexes.


Subject(s)
Chromatin/chemistry , Models, Molecular , Static Electricity , Chromatin/metabolism , DNA/chemistry , DNA/metabolism , Histones/chemistry , Histones/metabolism , Nucleosomes/chemistry , Nucleosomes/metabolism , Protein Conformation , Protein Folding , Salts/chemistry , Surface Properties
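
Entry 9 varies ionic strength and ion species within Poisson-Boltzmann theory. The sketch below does not solve the Poisson-Boltzmann equation or reproduce the mesoscale chromatin model. It only illustrates two textbook quantities that set the screening scale in the linearized (Debye-Hückel) limit: the ionic strength of a salt mixture and the corresponding Debye length in water at 25 °C. The function names and example salt concentrations are assumptions.

```python
import math


def ionic_strength(ions):
    """Ionic strength I = 1/2 * sum(c_i * z_i**2), with c_i in mol/L.

    `ions` is a list of (concentration, charge) pairs.
    """
    return 0.5 * sum(conc * charge ** 2 for conc, charge in ions)


def debye_length_nm(ionic_strength_molar):
    """Approximate Debye length in water at 25 C: about 0.304 / sqrt(I) nm.

    This is the standard dilute-solution estimate, not a full
    Poisson-Boltzmann calculation.
    """
    if ionic_strength_molar <= 0:
        raise ValueError("ionic strength must be positive")
    return 0.304 / math.sqrt(ionic_strength_molar)


# Hypothetical example solutions (concentration in mol/L, ion charge).
nacl = [(0.15, +1), (0.15, -1)]    # 150 mM monovalent salt
mgcl2 = [(0.01, +2), (0.02, -1)]   # 10 mM divalent salt

for name, ions in (("NaCl", nacl), ("MgCl2", mgcl2)):
    strength = ionic_strength(ions)
    print(name, strength, debye_length_nm(strength))
```

The squared charge in the ionic strength is why a small amount of divalent salt contributes disproportionately to screening, which is consistent with the abstract's emphasis on divalent cations.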
10.
Nucleic Acids Res ; 38(13): e139, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20448026

ABSTRACT

Although identification of active motifs in large random sequence pools is central to RNA in vitro selection, no systematic computational equivalent of this process has yet been developed. We develop a computational approach that combines target pool generation, motif scanning and motif screening using secondary structure analysis for applications to 10^12-10^14-sequence pools; large pool sizes are made possible using program redesign and supercomputing resources. We use the new protocol to search for aptamer and ribozyme motifs in pools up to experimental pool size (10^14 sequences). We show that motif scanning, structure matching and flanking sequence analysis, respectively, reduce the initial sequence pool by 6-8, 1-2 and 1 orders of magnitude, consistent with the rare occurrence of active motifs in random pools. The final yields match the theoretical yields from probability theory for simple motifs and overestimate experimental yields, which constitute lower bounds, for aptamers because screening analyses beyond secondary structure information are not considered systematically. We also show that designed pools using our nucleotide transition probability matrices can produce higher yields for RNA ligase motifs than random pools. Our methods for generating, analyzing and designing large pools can help improve RNA design via simulation of aspects of in vitro selection.


Subject(s)
RNA/chemistry , Sequence Analysis, RNA , Algorithms , Aptamers, Nucleotide/chemistry , Carbon-Oxygen Ligases/chemistry , Computational Biology , Nucleic Acid Conformation , RNA, Catalytic/chemistry
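
Entry 10 compares computed yields with "theoretical yields from probability theory" for simple motifs. The abstract does not give the formula, so the following is only a first-order illustration under stated assumptions: nucleotides are independent and uniformly distributed, and the motif is a contiguous run of fully specified bases. Under those assumptions the expected number of motif occurrences in a pool follows from linearity of expectation. The function name and example parameters are hypothetical.

```python
def expected_motif_occurrences(pool_size, seq_len, motif_len):
    """Expected number of occurrences of one fixed contiguous motif.

    Assumptions (not from the paper): each position is an independent,
    uniformly random nucleotide, so a given window matches with
    probability 4**-motif_len, and each sequence has
    (seq_len - motif_len + 1) windows.
    """
    if motif_len > seq_len:
        return 0.0
    windows = seq_len - motif_len + 1
    return pool_size * windows * 0.25 ** motif_len


# Hypothetical example: a 20-nt fully specified motif in pools of 100-nt sequences.
for pool_size in (1e12, 1e14):
    print(pool_size, expected_motif_occurrences(pool_size, 100, 20))
```

Real aptamer and ribozyme motifs also carry structural constraints and degenerate positions, so this estimate should be read as a rough scale, not as the paper's yield calculation.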
11.
J Biomed Sci ; 15(6): 697-705, 2008 Nov.
Article in English | MEDLINE | ID: mdl-18661287

ABSTRACT

Small nucleolar RNAs (snoRNAs) play a significant role in Prader-Willi Syndrome (PWS) and Angelman Syndrome (AS), which are genomic disorders resulting from deletions in the human chromosomal region 15q11-q13. To identify snoRNAs in the region, our computational study employs key motif features of C/D box snoRNAs and introduces a complementary RNA-RNA hybridization test. We identify three previously unknown methylation guide snoRNAs targeting ribosomal 18S and 28S RNAs, and two snoRNAs targeting serotonin receptor 2C mRNA. We show that the three snoRNA candidates likely possess methylation strands complementary to, and form stable complexes with, human ribosomal RNAs. Our screen also identifies 8 other snoRNA candidates that do not pass the rRNA-complementarity and/or hybridization tests. Two of these candidates have extensive sequence similarity to HBII-52, a snoRNA that regulates the alternative splicing of serotonin receptor 2C mRNA. Six out of our eleven candidate snoRNAs are also predicted by other existing methods.


Subject(s)
Angelman Syndrome/genetics , Computational Biology , Genome, Human/genetics , Prader-Willi Syndrome/genetics , RNA, Small Nucleolar/genetics , Algorithms , Base Sequence , Gene Order , Humans , Molecular Sequence Data , Nucleic Acid Conformation , Nucleic Acid Hybridization , Sequence Alignment
12.
Bioinform Biol Insights ; 2: 75-94, 2008 Mar 01.
Article in English | MEDLINE | ID: mdl-19812767

ABSTRACT

Recent studies of mammalian transcriptomes have identified numerous RNA transcripts that do not code for proteins; their identity, however, is largely unknown. Here we explore an approach based on sequence randomness patterns to discern different RNA classes. The relative z-score we use helps identify the known ncRNA class from the genome, intergene and intron classes. This leads us to a fractional ncRNA measure of putative ncRNA datasets which we model as a mixture of genuine ncRNAs and other transcripts derived from genomic, intergenic and intronic sequences. We use this model to analyze six representative datasets identified by the FANTOM3 project and two computational approaches based on comparative analysis (RNAz and EvoFold). Our analysis suggests fewer ncRNAs than estimated by DNA sequencing and comparative analysis, but the verity of our approach and its prediction requires more extensive experimental RNA data.
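
The abstract of entry 12 relies on a z-score built from sequence randomness patterns but does not define the underlying statistic. The sketch below is therefore a generic stand-in and not the authors' measure: it scores how far an observed dinucleotide count lies from the distribution obtained by shuffling the same sequence. The choice of the "CG" dinucleotide, the number of shuffles, and all function names are assumptions.

```python
import random
import statistics


def count_dinucleotide(seq, dinuc):
    """Count (possibly overlapping) occurrences of a two-letter word."""
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc)


def shuffle_z_score(seq, dinuc="CG", n_shuffles=200, seed=0):
    """z-score of the observed dinucleotide count against shuffled copies.

    Shuffling preserves base composition, so the score reflects ordering
    rather than composition. Returns 0.0 when the shuffled counts show no
    variation.
    """
    rng = random.Random(seed)
    observed = count_dinucleotide(seq, dinuc)
    letters = list(seq)
    null_counts = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        null_counts.append(count_dinucleotide("".join(letters), dinuc))
    mean = statistics.mean(null_counts)
    spread = statistics.pstdev(null_counts)
    if spread == 0:
        return 0.0
    return (observed - mean) / spread


# Hypothetical input: a strongly ordered repeat scores well above its shuffles.
print(shuffle_z_score("CG" * 20))
```

A sequence whose score is close to zero is, by this stand-in measure, indistinguishable from a random ordering of its own bases.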

13.
Bioinformatics ; 23(21): 2959-60, 2007 Nov 01.
Article in English | MEDLINE | ID: mdl-17855416

ABSTRACT

SUMMARY: Our RNA-As-Graph-Pools (RagPools) web server offers a theoretical companion tool for RNA in vitro selection and related problems. Specifically, it suggests how to construct RNA sequence/structure pools with user-specified properties and assists in analyzing resulting distributions. This utility follows our recently developed approach for engineering sequence pools that links RNA sequence space regions with corresponding structural distributions via a 'mixing matrix' approach combined with a graph theory analysis of RNA secondary-structure space; the mixing matrix specifies nucleotide transition rates, and graph theory links sequences to simple graphical objects representing RNA motifs. The companion RagPools web server ('Designer' component) provides optimized starting sequences, mixing matrices and associated weights in response to a user-specified target pool structure distribution. In addition, RagPools ('Analyzer' component) analyzes the motif distribution of pools generated from user-specified starting sequences and mixing matrices. Thus, RagPools serves as a guide to researchers who aim to synthesize RNA pools with desired properties and/or experiment in silico with various designs by our approach. AVAILABILITY: The web server is accessible on the web at http://rubin2.biomath.nyu.edu


Subject(s)
Algorithms , Internet , RNA Probes/genetics , Sequence Alignment/methods , Sequence Analysis, RNA/methods , Software , Base Sequence , Molecular Sequence Data
14.
RNA ; 13(4): 478-92, 2007 Apr.
Article in English | MEDLINE | ID: mdl-17322501

ABSTRACT

Although in vitro selection technology is a versatile experimental tool for discovering novel synthetic RNA molecules, finding complex RNA molecules is difficult because most RNAs identified from random sequence pools are simple motifs, consistent with recent computational analysis of such sequence pools. Thus, enriching in vitro selection pools with complex structures could increase the probability of discovering novel RNAs. Here we develop an approach for engineering sequence pools that links RNA sequence space regions with corresponding structural distributions via a "mixing matrix" approach combined with a graph theory analysis. We define five classes of mixing matrices motivated by covariance mutations in RNA; these constructs define nucleotide transition rates and are applied to chosen starting sequences to yield specific nonrandom pools. We examine the coverage of sequence space as a function of the mixing matrix and starting sequence via clustering analysis. We show that, in contrast to random sequences, which are associated only with a local region of sequence space, our designed pools, including a structured pool for GTP aptamers, can target specific motifs. It follows that experimental synthesis of designed pools can benefit from using optimized starting sequences, mixing matrices, and pool fractions associated with each of our constructed pools as a guide. Automation of our approach could provide practical tools for pool design applications for in vitro selection of RNAs and related problems.


Subject(s)
Computational Biology , RNA/chemistry , Selection, Genetic , Algorithms , Base Sequence , Cluster Analysis , Computer Simulation , Conserved Sequence , In Vitro Techniques , Models, Chemical , Molecular Sequence Data , Nucleic Acid Conformation , Sequence Analysis, RNA
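
Entry 14 describes mixing matrices that define nucleotide transition rates and are applied to chosen starting sequences to yield nonrandom pools. One plausible reading, sketched below, treats the matrix as a row-stochastic table of per-position substitution probabilities. This interpretation, the matrix values, the starting sequence, and the function names are all assumptions for illustration; the paper's five matrix classes are not reproduced.

```python
import random

NUCLEOTIDES = "ACGU"


def mutate_with_mixing_matrix(start_seq, matrix, rng):
    """Sample one pool member from a starting sequence.

    Each base X is replaced by base Y with probability matrix[X][Y];
    every row of the matrix is assumed to sum to 1.
    """
    out = []
    for base in start_seq:
        weights = [matrix[base][target] for target in NUCLEOTIDES]
        out.append(rng.choices(NUCLEOTIDES, weights=weights, k=1)[0])
    return "".join(out)


def generate_pool(start_seq, matrix, size, seed=0):
    """Generate `size` sequences by applying the matrix independently."""
    rng = random.Random(seed)
    return [mutate_with_mixing_matrix(start_seq, matrix, rng) for _ in range(size)]


# Two hypothetical extremes: no mixing (identity) and a fully random pool.
identity = {b: {t: 1.0 if b == t else 0.0 for t in NUCLEOTIDES} for b in NUCLEOTIDES}
uniform = {b: {t: 0.25 for t in NUCLEOTIDES} for b in NUCLEOTIDES}

print(generate_pool("GGACUUCGGUCC", identity, 3))
print(generate_pool("GGACUUCGGUCC", uniform, 3))
```

Matrices between these two extremes bias the pool toward the neighborhood of the starting sequence, which is the general idea behind linking sequence space regions to structural distributions.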
15.
Nucleic Acids Res ; 33(18): 6057-69, 2005.
Article in English | MEDLINE | ID: mdl-16254081

ABSTRACT

Riboswitches and RNA interference are important emerging mechanisms found in many organisms to control gene expression. To enhance our understanding of such RNA roles, finding small regulatory motifs in genomes presents a challenge on a wide scale. Many simple functional RNA motifs have been found by in vitro selection experiments, which produce synthetic target-binding aptamers as well as catalytic RNAs, including the hammerhead ribozyme. Motivated by the prediction of Piganeau and Schroeder [(2003) Chem. Biol., 10, 103-104] that synthetic RNAs may have natural counterparts, we develop and apply an efficient computational protocol for identifying aptamer-like motifs in genomes. We define motifs from the sequence and structural information of synthetic aptamers, search for sequences in genomes that will produce motif matches, and then evaluate the structural stability and statistical significance of the potential hits. Our application to aptamers for streptomycin, chloramphenicol, neomycin B and ATP identifies 37 candidate sequences (in coding and non-coding regions) that fold to the target aptamer structures in bacterial and archaeal genomes. Further energetic screening reveals that several candidates exhibit energetic properties and sequence conservation patterns that are characteristic of functional motifs. Besides providing candidates for experimental testing, our computational protocol offers an avenue for expanding natural RNA's functional repertoire.


Subject(s)
Genomics/methods , RNA/chemistry , Sequence Analysis, RNA/methods , Algorithms , Base Sequence , Computational Biology/methods , Conserved Sequence , Data Interpretation, Statistical , Genome, Archaeal , Genome, Bacterial , Nucleic Acid Conformation , RNA/genetics , Thermodynamics
16.
RNA ; 11(6): 853-63, 2005 Jun.
Article in English | MEDLINE | ID: mdl-15923372

ABSTRACT

In vitro selection of functional RNAs from large random sequence pools has led to the identification of many ligand-binding and catalytic RNAs. However, the structural diversity in random pools is not well understood. Such an understanding is a prerequisite for designing sequence pools to increase the probability of finding complex functional RNA by in vitro selection techniques. Toward this goal, we have generated by computer five random pools of RNA sequences of length up to 100 nt to mimic experiments and characterized the distribution of associated secondary structural motifs using sets of possible RNA tree structures derived from graph theory techniques. Our results show that such random pools heavily favor simple topological structures: For example, linear stem-loop and low-branching motifs are favored rather than complex structures with high-order junctions, as confirmed by known aptamers. Moreover, we quantify the rise of structural complexity with sequence length and report the dominant class of tree motifs (characterized by vertex number) for each pool. These analyses not only show that random pools do not lead to a uniform distribution of possible RNA secondary topologies but also point to avenues for designing pools with specific simple and complex structures in equal abundance, with the goal of broadening the range of functional RNAs discovered by in vitro selection. Specifically, the optimal RNA sequence pool length to identify a structure with x stems is 20x.


Subject(s)
Computational Biology , RNA/chemistry , Computer Simulation , Nucleic Acid Conformation
17.
Nucleic Acids Res ; 33(4): 1384-98, 2005.
Article in English | MEDLINE | ID: mdl-15745998

ABSTRACT

Modular architecture is a hallmark of RNA structures, implying structural, and possibly functional, similarity among existing RNAs. To systematically delineate the existence of smaller topologies within larger structures, we develop and apply an efficient RNA secondary structure comparison algorithm using a newly developed two-dimensional RNA graphical representation. Our survey of similarity among 14 pseudoknots and subtopologies within ribosomal RNAs (rRNAs) uncovers eight pairs of structurally related pseudoknots with non-random sequence matches and reveals modular units in rRNAs. Significantly, three structurally related pseudoknot pairs have functional similarities not previously known: one pair involves the 3' end of brome mosaic virus genomic RNA (PKB134) and the alternative hammerhead ribozyme pseudoknot (PKB173), both of which are replicase templates for viral RNA replication; the second pair involves structural elements for translation initiation and ribosome recruitment found in the viral internal ribosome entry site (PKB223) and the V4 domain of 18S rRNA (PKB205); the third pair involves 18S rRNA (PKB205) and viral tRNA-like pseudoknot (PKB134), which probably recruits ribosomes via structural mimicry and base complementarity. Additionally, we quantify the modularity of 16S and 23S rRNAs by showing that RNA motifs can be constructed from at least 210 building blocks. Interestingly, we find that the 5S rRNA and two tree modules within 16S and 23S rRNAs have similar topologies and tertiary shapes. These modules can be applied to design novel RNA motifs via build-up-like procedures for constructing sequences and folds.


Subject(s)
RNA, Ribosomal/chemistry , RNA/chemistry , Algorithms , Base Sequence , Computational Biology , Computer Graphics , Models, Molecular , Molecular Sequence Data , Nucleic Acid Conformation , RNA, Catalytic/chemistry , RNA, Ribosomal, 16S/chemistry , RNA, Ribosomal, 23S/chemistry , RNA, Viral/chemistry
18.
J Mol Biol ; 341(5): 1129-44, 2004 Aug 27.
Article in English | MEDLINE | ID: mdl-15321711

ABSTRACT

Because the functional repertoire of RNA molecules, like proteins, is closely linked to the diversity of their shapes, uncovering RNA's structural repertoire is vital for identifying novel RNAs, especially in genomic sequences. To help expand the limited number of known RNA families, we use graphical representation and clustering analysis of RNA secondary structures to predict novel RNA topologies and their abundance as a function of size. Representing the essential topological properties of RNA secondary structures as graphs enables enumeration, generation, and prediction of novel RNA motifs. We apply a probabilistic graph-growing method to construct the RNA structure space encompassing the topologies of existing and hypothetical RNAs and cluster all RNA topologies into two groups using topological descriptors and a standard clustering algorithm. Significantly, we find that nearly all existing RNAs fall into one group, which we refer to as "RNA-like"; we consider the other group "non-RNA-like". Our method predicts many candidates for novel RNA secondary topologies, some of which are remarkably similar to existing structures; interestingly, the centroid of the RNA-like group is the tmRNA fold, a pseudoknot having both tRNA-like and mRNA-like functions. Additionally, our approach allows estimation of the relative abundance of pseudoknot and other (e.g. tree) motifs using the "edge-cut" property of RNA graphs. This analysis suggests that pseudoknots dominate the RNA structure universe, representing more than 90% when the sequence length exceeds 120 nt; the predicted trend for <100 nt agrees with data for existing RNAs. Together with our predictions for novel "RNA-like" topologies, our analysis can help direct the design of functional RNAs and identification of novel RNA folds in genomes through an efficient topology-directed search, which grows much more slowly in complexity with RNA size compared to the traditional sequence-based search.


Subject(s)
Models, Theoretical , Nucleic Acid Conformation , RNA/chemistry , Algorithms , Base Sequence , Cluster Analysis , Models, Molecular
19.
BMC Bioinformatics ; 5: 88, 2004 Jul 06.
Article in English | MEDLINE | ID: mdl-15238163

ABSTRACT

BACKGROUND: The proliferation of structural and functional studies of RNA has revealed an increasing range of RNA's structural repertoire. Toward the objective of systematic cataloguing of RNA's structural repertoire, we have recently described the basis of a graphical approach for organizing RNA secondary structures, including existing and hypothetical motifs. DESCRIPTION: We now present an RNA motif database based on graph theory, termed RAG for RNA-As-Graphs, to catalogue and rank all theoretically possible, including existing, candidate and hypothetical, RNA secondary motifs. The candidate motifs are predicted using a clustering algorithm that classifies RNA graphs into RNA-like and non-RNA groups. All RNA motifs are filed according to their graph vertex number (RNA length) and ranked by topological complexity. CONCLUSIONS: RAG's quantitative cataloguing allows facile retrieval of all classes of RNA secondary motifs, assists identification of structural and functional properties of user-supplied RNA sequences, and helps stimulate the search for novel RNAs based on predicted candidate motifs.


Subject(s)
Computer Graphics/trends , Internet/trends , RNA/chemistry , Software , Computational Biology/methods , Databases, Nucleic Acid/trends , Software Design
20.
Bioinformatics ; 20(8): 1285-91, 2004 May 22.
Article in English | MEDLINE | ID: mdl-14962931

ABSTRACT

MOTIVATION: Understanding RNA's structural diversity is vital for identifying novel RNA structures and pursuing RNA genomics initiatives. By classifying RNA secondary motifs based on correlations between conserved RNA secondary structures and functional properties, we offer an avenue for predicting novel motifs. Although several RNA databases exist, no comprehensive schemes are available for cataloguing the range and diversity of RNA's structural repertoire. RESULTS: Our RNA-As-Graphs (RAG) database describes and ranks all mathematically possible (including existing and candidate) RNA secondary motifs on the basis of graphical enumeration techniques. We represent RNA secondary structures as two-dimensional graphs (networks), specifying the connectivity between RNA secondary structural elements, such as loops, bulges, stems and junctions. We archive RNA tree motifs as 'tree graphs' and other RNAs, including pseudoknots, as general 'dual graphs'. All RNA motifs are catalogued by graph vertex number (a measure of sequence length) and ranked by topological complexity. The RAG inventory immediately suggests candidates for novel RNA motifs, either naturally occurring or synthetic, and thereby might stimulate the prediction and design of novel RNA motifs. AVAILABILITY: The database is accessible on the web at http://monod.biomath.nyu.edu/rna


Subject(s)
Algorithms , Database Management Systems , Databases, Nucleic Acid , Information Storage and Retrieval/methods , Sequence Alignment/methods , Sequence Analysis, RNA/methods , Conserved Sequence/genetics