Results 1 - 20 of 29
1.
Annu Rev Biophys ; 37: 445-64, 2008.
Article in English | MEDLINE | ID: mdl-18573090

ABSTRACT

Since the year 2000 a number of large RNA three-dimensional structures have been determined by X-ray crystallography. Structures composed of more than 100 nucleotide residues include the signal recognition particle RNA, group I intron, the GlmS ribozyme, RNase P RNA, and ribosomal RNAs from Haloarcula marismortui, Escherichia coli, Thermus thermophilus, and Deinococcus radiodurans. These large RNAs are constructed from the same secondary and tertiary structural motifs identified in smaller RNAs but appear to have a larger organizational architecture. They are dominated by long continuous interhelical base stacking, tend to segregate into domains, and are planar in overall shape as opposed to their globular protein counterparts. These findings have consequences in RNA folding, intermolecular interaction, and packing, in addition to studies of design and engineering and structure prediction.


Subjects
Chemical Models, Molecular Models, RNA/chemistry, RNA/ultrastructure, Computer Simulation, Nucleic Acid Conformation
2.
Article in English | MEDLINE | ID: mdl-19642270

ABSTRACT

Almost every cellular process requires the interactions of pairs or larger complexes of proteins. High-throughput protein-protein interaction (PPI) data have been generated using techniques such as yeast two-hybrid systems, mass spectrometry, and many more. Such data provide a new perspective for predicting protein functions and generating protein-protein interaction networks, and many recent algorithms have been developed for this purpose. However, PPI data generated using high-throughput techniques contain a large number of false positives. In this paper, we propose a novel method to evaluate the support for PPI data based on gene ontology information. When the semantic similarity between genes is computed from gene ontology information using Resnik's formula, our results show that the PPI data can be modeled as a mixture model predicated on the assumption that true protein-protein interactions will have higher support than the false positives in the data. Thus semantic similarity between genes serves as a metric of support for PPI data. Taking this one step further, we also propose new function prediction approaches that use this metric of support for the PPI data. These new function prediction approaches outperform their conventional counterparts. New evaluation methods are also proposed.
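
The abstract names Resnik's formula but does not reproduce it. As a minimal sketch, the standard form scores two ontology terms by the information content, -log p(c), of their most informative common ancestor. The ontology, the annotation counts, and the rule of taking the best-scoring term pair for two genes are all hypothetical choices made for illustration, not details taken from the paper.

```python
import math

# Hypothetical toy ontology (child -> parents); these are not real GO terms.
PARENTS = {
    "root": set(),
    "binding": {"root"},
    "catalysis": {"root"},
    "rna_binding": {"binding"},
    "dna_binding": {"binding"},
}

# Hypothetical number of genes annotated directly to each term.
DIRECT_COUNTS = {
    "root": 0,
    "binding": 2,
    "catalysis": 4,
    "rna_binding": 1,
    "dna_binding": 1,
}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def term_probability(term):
    """Fraction of annotations attached to the term or any of its descendants."""
    total = sum(DIRECT_COUNTS.values())
    covered = sum(n for t, n in DIRECT_COUNTS.items() if term in ancestors(t))
    return covered / total


def resnik(term_a, term_b):
    """Information content of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(-math.log(term_probability(t)) for t in common)


def gene_similarity(terms_a, terms_b):
    """Assumed gene-level score: the best Resnik score over all term pairs."""
    return max(resnik(a, b) for a in terms_a for b in terms_b)
```

In this toy setting two terms that share only the root score zero, while terms that share a specific ancestor score higher, which is the property the abstract relies on when it treats similarity as support for an interaction.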


Subjects
Algorithms, Gene Expression Profiling/methods, Biological Models, Protein Interaction Mapping/methods, Proteome/metabolism, Signal Transduction/physiology, Computer Simulation
3.
Genome Biol ; 8(12): R271, 2007.
Article in English | MEDLINE | ID: mdl-18154653

ABSTRACT

We propose a new network decomposition method to systematically identify protein interaction modules in the protein interaction network. Our method incorporates both a global metric and a local metric for balance and consistency. We have compared the performance of our method with several earlier approaches on both simulated and real datasets using different criteria, and show that our method is more robust to network alterations and more effective at discovering functional protein modules.


Subjects
Algorithms, Proteomics/methods, Fungal Proteins/metabolism, Gene Regulatory Networks, Yeasts/chemistry, Yeasts/genetics
4.
Bioinformatics ; 22(21): 2590-6, 2006 Nov 01.
Article in English | MEDLINE | ID: mdl-16945945

ABSTRACT

MOTIVATION: Small non-coding RNA (ncRNA) genes play important regulatory roles in a variety of cellular processes. However, detection of ncRNA genes is a great challenge to both experimental and computational approaches. In this study, we describe a new approach called positive sample only learning (PSoL) to predict ncRNA genes in the Escherichia coli genome. Although PSoL is a machine learning method for classification, it requires no negative training data, which, in general, is hard to define properly and affects the performance of machine learning dramatically. In addition, using the support vector machine (SVM) as the core learning algorithm, PSoL can integrate many different kinds of information to improve the accuracy of prediction. Besides the application of PSoL for predicting ncRNAs, PSoL is applicable to many other bioinformatics problems as well. RESULTS: The PSoL method is assessed by 5-fold cross-validation experiments which show that PSoL can achieve about 80% accuracy in recovery of known ncRNAs. We compared PSoL predictions with five previously published results. The PSoL method has the highest percentage of predictions overlapping with those from other methods.


Subjects
Algorithms, Artificial Intelligence, Chromosome Mapping/methods, Escherichia coli/genetics, Bacterial RNA/genetics, Untranslated RNA/genetics, RNA Sequence Analysis/methods, Base Sequence, Molecular Sequence Data, Automated Pattern Recognition/methods, Sequence Alignment/methods
5.
Acta Crystallogr D Biol Crystallogr ; 62(Pt 6): 619-27, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16699189

ABSTRACT

The RNA I modulator protein (Rom) acts as a co-regulator of ColE1 plasmid copy number by binding to RNA kissing hairpins and stabilizing their interaction. The structure of Rom has been determined in a new crystal form from X-ray diffraction data to 2.5 Å resolution. In this structure, a dimer of the 57-amino-acid protein is found in the asymmetric unit. Each subunit consists almost entirely of two antiparallel alpha-helices joined by a short hairpin bend. The dimer contains a non-crystallographic twofold axis and forms a highly regular four-alpha-helical bundle. The structural packing in this novel crystal form is different from previously known Rom structures. The asymmetric unit contains one dimer, giving a crystal volume per protein weight (V_M) of 1.83 Å³ Da⁻¹ and a low solvent content of 30%. Strong packing interactions and low solvation are characteristic of the structure. The Rom protein was cocrystallized with the Tar-Tar* kissing hairpin RNA. Although the electron-density maps do not show bound RNA, altered conformations in the side chains of Rom that are known to be involved in RNA binding have been identified. These results provide additional information about Rom protein conformational flexibility and suggest that the presence of a highly charged polymer such as RNA can promote tight packing of an RNA-binding protein, even when the RNA itself is not observed in the crystal.


Subjects
Bacterial Proteins/chemistry, Molecular Models, RNA-Binding Proteins/chemistry, X-Ray Crystallography, Dimerization, Escherichia coli Proteins/chemistry, RNA/chemistry
6.
Biophys J ; 90(12): 4530-7, 2006 Jun 15.
Article in English | MEDLINE | ID: mdl-16581850

ABSTRACT

The crystal structure of the RNA octamer, 5'-GGCGUGCC-3' has been determined from x-ray diffraction data to 1.5 angstroms resolution. In the crystal, this oligonucleotide forms five self-complementary double-helices in the asymmetric unit. Tandem 5'GU/3'UG basepairs comprise an internal loop in the middle of each duplex. The NMR structure of this octameric RNA sequence is also known, allowing comparison of the variation among the five crystallographic duplexes and the solution structure. The G.U pairs in the five duplexes of the crystal form two direct hydrogen bonds and are stabilized by water molecules that bridge between the base of guanine (N2) and the sugar (O2') of uracil. This contrasts with the NMR structure in which only one direct hydrogen bond is observed for the G.U pairs. The reduced stability of the r(CGUG)2 motif relative to the r(GGUC)2 motif may be explained by the lack of stacking of the uracil bases between the Watson-Crick and G.U pairs as observed in the crystal structure.


Subjects
Base Pairing, Dinucleoside Phosphates/chemistry, Chemical Models, Molecular Models, RNA/chemistry, RNA/ultrastructure, Tandem Repeat Sequences, Computer Simulation, Crystallography, Magnetic Resonance Spectroscopy, Nucleic Acid Conformation
7.
RNA ; 12(4): 533-41, 2006 Apr.
Article in English | MEDLINE | ID: mdl-16484377

ABSTRACT

The aim of the RNA Ontology Consortium (ROC) is to create an integrated conceptual framework, an RNA Ontology (RO), with a common, dynamic, controlled, and structured vocabulary to describe and characterize RNA sequences, secondary structures, three-dimensional structures, and dynamics pertaining to RNA function. The RO should produce tools for clear communication about RNA structure and function for multiple uses, including the integration of RNA electronic resources into the Semantic Web. These tools should allow the accurate description in computer-interpretable form of the coupling between RNA architecture, function, and evolution. The purposes for creating the RO are, therefore, (1) to integrate sequence and structural databases; (2) to allow different computational tools to interoperate; (3) to create powerful software tools that bring advanced computational methods to the bench scientist; and (4) to facilitate precise searches for all relevant information pertaining to RNA. For example, one initial objective of the ROC is to define, identify, and classify RNA structural motifs described in the literature or appearing in databases and to agree on a computer-interpretable definition for each of these motifs. To achieve these aims, the ROC will foster communication and promote collaboration among RNA scientists by coordinating frequent face-to-face workshops to discuss, debate, and resolve difficult conceptual issues. These meeting opportunities will create new directions at various levels of RNA research. The ROC will work closely with the PDB/NDB structural databases and the Gene, Sequence, and Open Biomedical Ontology Consortia to integrate the RO with existing biological ontologies to extend existing content while maintaining interoperability.


Subjects
RNA, Societies, Genetic Databases, Information Dissemination, Internet, Sequence Alignment
8.
Nucleic Acids Res ; 34(Database issue): D131-4, 2006 Jan 01.
Article in English | MEDLINE | ID: mdl-16381830

ABSTRACT

Metal ions are essential for the folding of RNA into stable tertiary structures and for the catalytic activity of some RNA enzymes. To aid in the study of the roles of metal ions in RNA structural biology, we have created MeRNA (Metals in RNA), a comprehensive compilation of all metal binding sites identified in RNA 3D structures available from the PDB and Nucleic Acid Database. Currently, our database contains information relating to binding of 9764 metal ions corresponding to 23 distinct elements, in 256 RNA structures. The metal ion locations were confirmed and ligands characterized using original literature references. MeRNA includes eight manually identified metal-ion binding motifs, which are described in the literature. MeRNA is searchable by PDB identifier, metal ion, method of structure determination, resolution and R-values for X-ray structure and distance from metal to any RNA atom or to water. New structures with their respective binding motifs will be added to the database as they become available. The MeRNA database will further our understanding of the roles of metal ions in RNA folding and catalysis and have applications in structural and functional analysis, RNA design and engineering. The MeRNA database is accessible at http://merna.lbl.gov.


Subjects
Nucleic Acid Databases, Metals/chemistry, Molecular Models, RNA/chemistry, Binding Sites, Internet, Ions/chemistry, Metals/metabolism, Nucleic Acid Conformation, RNA/metabolism, User-Computer Interface
9.
Int J Data Min Bioinform ; 1(2): 162-77, 2006.
Article in English | MEDLINE | ID: mdl-18399069

ABSTRACT

We study transitivity properties of edge weights in complex networks. We show that enforcing transitivity leads to a transitivity inequality which is equivalent to ultra-metric inequality. This can be used to define transitive closure on weighted undirected graphs, which can be computed using a modified Floyd-Warshall algorithm. These new concepts are extended to dissimilarity graphs and triangle inequalities. From this, we extend the clique concept from unweighted graph to weighted graph. We outline several applications and present results of detecting protein functional modules in a protein interaction network.
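
The abstract does not spell out the modified Floyd-Warshall recurrence. One plausible reading, sketched here, treats edge weights as similarities and enforces the ultrametric-style inequality w(i, j) >= min(w(i, k), w(k, j)) by replacing the usual "sum, then minimize" update with "min, then maximize". The function name, the convention that a missing edge has weight 0, and the example matrix are assumptions made for illustration.

```python
def transitive_closure(weights):
    """Maximin closure of a symmetric similarity matrix (list of lists).

    After the triple loop, every off-diagonal entry satisfies
    closure[i][j] >= min(closure[i][k], closure[k][j]) for all k.
    Missing edges are assumed to carry weight 0.
    """
    n = len(weights)
    closure = [row[:] for row in weights]  # do not mutate the input
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue  # leave the diagonal untouched
                via_k = min(closure[i][k], closure[k][j])
                if via_k > closure[i][j]:
                    closure[i][j] = via_k
    return closure


# Hypothetical three-node graph: strong edges 0-1 and 1-2, weak edge 0-2.
EXAMPLE = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.8],
    [0.1, 0.8, 0.0],
]
CLOSED = transitive_closure(EXAMPLE)
```

In the example the weak 0-2 edge is raised to 0.8, the weaker of the two strong edges on the path through node 1, while the strong edges keep their original weights.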


Subjects
Algorithms, Theoretical Models, Protein Interaction Mapping/methods, Computational Biology, Fungal Proteins/metabolism, Protein Binding, Systems Biology
10.
J Struct Funct Genomics ; 6(2-3): 63-70, 2005.
Article in English | MEDLINE | ID: mdl-16211501

ABSTRACT

The initial aim of the Berkeley Structural Genomics Center is to obtain a near-complete structural complement of two minimal organisms, the closely related pathogens Mycoplasma genitalium and M. pneumoniae. The former has fewer than 500 genes and the latter fewer than 700 genes. To achieve this goal, the current protein targets have been selected starting with those predicted to be most tractable and likely to yield new structural and functional information. During the past 3 years, a semi-automated structural genomics pipeline has been set up, running from cloning, expression, and purification through to structure determination. The results from the pipeline substantially increased the coverage of the protein fold space of M. pneumoniae and M. genitalium. Furthermore, about 1/2 of the structures of 'unique' protein sequences revealed novel folds, and over 2/3 of the structures of previously annotated 'hypothetical proteins' allowed their molecular functions to be inferred.


Subjects
Bacterial Proteins/genetics, Bacterial Genome/genetics, Molecular Models, Mycoplasma genitalium/genetics, Mycoplasma pneumoniae/genetics, Protein Folding, Proteomics/methods, Molecular Cloning, Crystallization
11.
Proc Natl Acad Sci U S A ; 102(38): 13392-7, 2005 Sep 20.
Article in English | MEDLINE | ID: mdl-16157868

ABSTRACT

The x-ray crystal structure of a 417-nt ribonuclease P RNA from Bacillus stearothermophilus was solved to 3.3-Å resolution. This RNA enzyme is constructed from a number of coaxially stacked helical domains joined together by local and long-range interactions. These helical domains are arranged to form a remarkably flat surface, which is implicated by a wealth of biochemical data in the binding and cleavage of the precursors of transfer RNA substrate. Previous photoaffinity crosslinking data are used to position the substrate on the crystal structure and to identify the chemically active site of the ribozyme. This site is located in a highly conserved core structure formed by intricately interlaced long-range interactions between interhelical sequences.


Subjects
Geobacillus stearothermophilus/chemistry, Bacterial RNA/chemistry, Catalytic RNA/chemistry, Ribonuclease P/chemistry, Bacterial Proteins/chemistry, Base Sequence, Binding Sites, X-Ray Crystallography, Geobacillus stearothermophilus/enzymology, Molecular Models, Molecular Sequence Data, Nucleic Acid Conformation
12.
Nucleic Acids Res ; 33(13): 4164-71, 2005.
Article in English | MEDLINE | ID: mdl-16043635

ABSTRACT

The application of high-throughput techniques such as genomics, proteomics or transcriptomics means that vast amounts of heterogeneous data are now available in the public databases. Bioinformatics is responding to the challenge with new integrated management systems for data collection, validation and analysis. Multiple alignments of genomic and protein sequences provide an ideal environment for the integration of this mass of information. In the context of the sequence family, structural and functional data can be evaluated and propagated from known to unknown sequences. However, effective integration is being hindered by syntactic and semantic differences between the different data resources and the alignment techniques employed. One solution to this problem is the development of an ontology that systematically defines the terms used in a specific domain. Ontologies are used to share data from different resources, to automatically analyse information and to represent domain knowledge for non-experts. Here, we present MAO, a new ontology for multiple alignments of nucleic and protein sequences. MAO is designed to improve interoperation and data sharing between different alignment protocols for the construction of a high quality, reliable multiple alignment in order to facilitate knowledge extraction and the presentation of the most pertinent information to the biologist.


Subjects
Sequence Alignment/methods, DNA Sequence Analysis/methods, Protein Sequence Analysis/methods, RNA Sequence Analysis/methods, Software, Genetic Databases, Humans, Interleukin-1/genetics, Internet, Systems Integration, Controlled Vocabulary
13.
Curr Opin Struct Biol ; 15(3): 302-8, 2005 Jun.
Article in English | MEDLINE | ID: mdl-15963891

ABSTRACT

The database of RNA structure has grown tremendously since the crystal structure analyses of ribosomal subunits in 2000-2001. During the past year, the trend toward determining the structure of large, complex biological RNAs has accelerated, with the analysis of three intact group I introns, A- and B-type ribonuclease P RNAs, a riboswitch-substrate complex and other structures. The growing database of RNA structures, coupled with efforts directed at the standardization of nomenclature and classification of motifs, has resulted in the identification and characterization of numerous RNA secondary and tertiary structure motifs. Because a large proportion of RNA structure can now be shown to be composed of these recurring structural motifs, a view of RNA as a modular structure built from a combination of these building blocks and tertiary linkers is beginning to emerge. At the same time, however, more detailed analysis of water, metal, ligand and protein binding to RNA is revealing the effect of these moieties on folding and structure formation. The balance between the views of RNA structure either as strictly a construct of preformed building blocks linked in a limited number of ways or as a flexible polymer assuming a global fold influenced by its environment will be the focus of current and future RNA structural biology.


Subjects
Chemical Models, Molecular Models, RNA/chemistry, RNA/genetics, Sequence Alignment/methods, RNA Sequence Analysis/methods, Binding Sites, Nucleic Acid Conformation, RNA/analysis, Nucleic Acid Sequence Homology, Structure-Activity Relationship
14.
BMC Bioinformatics ; 6: 77, 2005 Mar 25.
Article in English | MEDLINE | ID: mdl-15790427

ABSTRACT

BACKGROUND: Protein domains have long been an ill-defined concept in biology. They are generally described as autonomous folding units with evolutionary and functional independence. Both structure-based and sequence-based domain definitions have been widely used. But whether these types of models alone can capture all essential features of domains is still an open question. METHODS: Here we provide insight on domain definitions through comparative mapping of two domain classification databases, one sequence-based (Pfam) and the other structure-based (SCOP). A mapping score is defined to indicate the significance of the mapping, and the properties of the mapping matrices are studied. RESULTS: The mapping results show a general agreement between the two databases, as well as many interesting areas of disagreement. In the cases of disagreement, the functional and evolutionary characteristics of the domains are examined to determine which domain definition is biologically more informative.


Subjects
Chromosome Mapping/methods, Computational Biology/methods, Algorithms, Amino Acid Sequence, Animals, Base Sequence, Cluster Analysis, Factual Databases, Protein Databases, Molecular Evolution, Genome, Humans, Information Storage and Retrieval, Molecular Models, Statistical Models, Molecular Sequence Data, Peptide Mapping, Peptides, Phylogeny, Protein Conformation, Protein Folding, Tertiary Protein Structure, Proteins/chemistry, Proteome, Sequence Alignment, DNA Sequence Analysis
15.
Pac Symp Biocomput ; : 4-15, 2005.
Article in English | MEDLINE | ID: mdl-15759609

ABSTRACT

Some genes produce transcripts that function directly in regulatory, catalytic, or structural roles in the cell. These non-coding RNAs are prevalent in all living organisms, and methods that aid the understanding of their functional roles are essential. RNA secondary structure, the pattern of base-pairing, contains the critical information for determining the three dimensional structure and function of the molecule. In this work we examine whether the basic geometric and topological properties of secondary structure are sufficient to distinguish between RNA families in a learning framework. First, we develop a labeled dual graph representation of RNA secondary structure by adding biologically meaningful labels to the dual graphs proposed by Gan et al [1]. Next, we define a similarity measure directly on the labeled dual graphs using the recently developed marginalized kernels [2]. Using this similarity measure, we were able to train Support Vector Machine classifiers to distinguish RNAs of known families from random RNAs with similar statistics. For 22 of the 25 families tested, the classifier achieved better than 70% accuracy, with much higher accuracy rates for some families. Training a set of classifiers to automatically assign family labels to RNAs using a one vs. all multi-class scheme also yielded encouraging results. From these initial learning experiments, we suggest that the labeled dual graph representation, together with kernel machine methods, has potential for use in automated analysis and classification of uncharacterized RNA molecules or efficient genome-wide screens for RNA molecules from existing families.


Subjects
Nucleic Acid Conformation, Untranslated RNA/chemistry, Base Sequence, Molecular Models
16.
Pac Symp Biocomput ; : 221-32, 2005.
Article in English | MEDLINE | ID: mdl-15759628

ABSTRACT

Proteins usually do not act in isolation in a cell but function within complicated cellular pathways, interacting with other proteins either in pairs or as components of larger complexes. While many protein complexes have been identified by large-scale experimental studies, the large number of false-positive interactions in current protein complexes still makes it difficult to obtain an accurate understanding of functional modules, which encompass groups of proteins involved in a common elementary biological function. In this paper, we present a hyperclique pattern discovery approach for extracting functional modules (hyperclique patterns) from protein complexes. A hyperclique pattern is a type of association pattern containing proteins that are highly affiliated with each other. The analysis of hyperclique patterns shows that proteins within the same pattern tend to be present in the same protein complex together. Also, statistically significant annotations of proteins in a pattern using the Gene Ontology suggest that proteins within the same hyperclique pattern are more likely to perform the same function and participate in the same biological process. More interestingly, the 3-D structural view of proteins within a hyperclique pattern reveals that these proteins physically interact with each other. In addition, we show that several hyperclique patterns corresponding to different functions can participate in the same protein complex as independent modules. Finally, we demonstrate that a hyperclique pattern can be involved in different complexes performing different higher-order biological functions, although the pattern corresponds to a specific elementary biological function.
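
The abstract describes hyperclique patterns only as sets of "highly affiliated" proteins. In the association-pattern literature this affinity is commonly measured by h-confidence: the support of the whole pattern divided by the largest support of any single member. The sketch below assumes that definition, treats each protein complex as one transaction, and uses made-up protein names and a brute-force enumeration that would not scale to real data.

```python
from itertools import combinations


def support(pattern, complexes):
    """Fraction of complexes that contain every protein in the pattern."""
    pattern = set(pattern)
    return sum(1 for members in complexes if pattern <= members) / len(complexes)


def h_confidence(pattern, complexes):
    """Support of the pattern divided by the largest single-protein support."""
    max_single = max(support({protein}, complexes) for protein in pattern)
    if max_single == 0:
        return 0.0
    return support(pattern, complexes) / max_single


def hyperclique_patterns(complexes, size, min_hconf):
    """Brute-force search for patterns of a given size (illustration only)."""
    proteins = sorted(set().union(*complexes))
    return [
        combo
        for combo in combinations(proteins, size)
        if h_confidence(combo, complexes) >= min_hconf
    ]


# Hypothetical complexes; the single letters stand in for protein identifiers.
COMPLEXES = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "B", "D"},
    {"C", "D"},
]
```

Here A and B always occur together, so their h-confidence is 1.0, whereas C and D share only one of the complexes that contain either of them. A threshold of 0.6 therefore keeps only the pair (A, B).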


Subjects
Proteins/chemistry, Algorithms, Computational Biology, Fungal Proteins/chemistry, Fungal Proteins/genetics, Automated Pattern Recognition, Saccharomyces cerevisiae/genetics
17.
Q Rev Biophys ; 38(3): 221-43, 2005 Aug.
Article in English | MEDLINE | ID: mdl-16817983

ABSTRACT

RNAs are modular biomolecules, composed largely of conserved structural subunits, or motifs. These structural motifs comprise the secondary structure of RNA and are knit together via tertiary interactions into a compact, functional, three-dimensional structure and are to be distinguished from motifs defined by sequence or function. A relatively small number of structural motifs are found repeatedly in RNA hairpin and internal loops, and are observed to be composed of a limited number of common 'structural elements'. In addition to secondary and tertiary structure motifs, there are functional motifs specific for certain biological roles and binding motifs that serve to complex metals or other ligands. Research is continuing into the identification and classification of RNA structural motifs and is being initiated to predict motifs from sequence, to trace their phylogenetic relationships and to use them as building blocks in RNA engineering.


Subjects
Chemical Models, Molecular Models, RNA/chemistry, RNA/genetics, RNA Sequence Analysis/methods, Base Sequence, Conserved Sequence, Humans, Genetic Models, Molecular Sequence Data, Nucleic Acid Conformation, Nucleic Acid Sequence Homology, Structure-Activity Relationship
18.
Proteins ; 58(2): 329-38, 2005 Feb 01.
Article in English | MEDLINE | ID: mdl-15562515

ABSTRACT

Sequence alignment underpins common tasks in molecular biology, including genome annotation, molecular phylogenetics, and homology modeling. Fundamental to sequence alignment is the placement of gaps, which represent character insertions or deletions. We assessed the ability of a generalized affine gap cost model to reliably detect remote protein homology and to produce high-quality alignments. Generalized affine gap alignment with optimal gap parameters performed as well as the traditional affine gap model in remote homology detection. Evaluation of alignment quality showed that the generalized affine model aligns fewer residue pairs than the traditional affine model but achieves significantly higher per-residue accuracy. We conclude that generalized affine gap costs should be used when alignment accuracy carries more importance than aligned sequence length.


Subjects
Computational Biology/methods, Proteins/chemistry, Proteomics/methods, Sequence Alignment/methods, Protein Sequence Analysis/methods, Algorithms, Amino Acid Sequence, Protein Databases, GTPase-Activating Proteins/chemistry, Biological Models, Molecular Models, Statistical Models, Molecular Sequence Data, Phylogeny, Protein Conformation, Reproducibility of Results, Amino Acid Sequence Homology, Software
19.
Proteins ; 57(1): 99-108, 2004 Oct 01.
Article in English | MEDLINE | ID: mdl-15326596

ABSTRACT

The protein interaction network presents one perspective for understanding cellular processes. Recent experiments employing high-throughput mass spectrometric characterizations have resulted in large data sets of physiologically relevant multiprotein complexes. We present a unified representation of such data sets based on an underlying bipartite graph model that is an advance over existing models of the network. Our unified representation allows for weighting of connections between proteins shared in more than one complex, as well as addressing the higher level organization that occurs when the network is viewed as consisting of protein complexes that share components. This representation also allows for the application of the rigorous MinMaxCut graph clustering algorithm for the determination of relevant protein modules in the networks. Statistically significant annotations of clusters in the protein-protein and complex-complex networks using terms from the Gene Ontology indicate that this method will be useful for posing hypotheses about uncharacterized components of protein complexes or uncharacterized relationships between protein complexes.
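
The abstract does not give the weighting scheme. One natural reading of a bipartite protein-complex model is that two proteins are connected with a weight equal to the number of complexes they share, and two complexes with a weight equal to the number of proteins they share. The sketch below assumes exactly that; the function name and the identifiers are hypothetical, and the MinMaxCut clustering step is not shown.

```python
from collections import defaultdict
from itertools import combinations


def project(membership):
    """Project a bipartite complex -> proteins mapping onto both node sets.

    Returns (protein_weights, complex_weights), where each dictionary maps a
    sorted pair of identifiers to the number of shared partners.
    """
    protein_weights = defaultdict(int)
    for proteins in membership.values():
        for a, b in combinations(sorted(proteins), 2):
            protein_weights[(a, b)] += 1  # one more complex shared by a and b

    complex_weights = {}
    for c1, c2 in combinations(sorted(membership), 2):
        shared = len(membership[c1] & membership[c2])
        if shared:
            complex_weights[(c1, c2)] = shared  # proteins common to c1 and c2

    return dict(protein_weights), complex_weights


# Hypothetical data: three complexes over four proteins.
MEMBERSHIP = {
    "C1": {"P1", "P2", "P3"},
    "C2": {"P2", "P3"},
    "C3": {"P4"},
}
PROTEIN_NET, COMPLEX_NET = project(MEMBERSHIP)
```

In the example P2 and P3 co-occur in two complexes and so receive weight 2, and C1 and C2 are linked with weight 2 because they share two proteins. Either weighted network could then be handed to a graph clustering routine.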


Subjects
Multiprotein Complexes/chemistry, Algorithms, Chemical Models, Protein Interaction Mapping, Quaternary Protein Structure, Saccharomyces cerevisiae Proteins/chemistry, Software
20.
Nucleic Acids Res ; 32(8): 2342-52, 2004.
Article in English | MEDLINE | ID: mdl-15121895

ABSTRACT

Release 2.0.1 of the Structural Classification of RNA (SCOR) database, http://scor.lbl.gov, contains a classification of the internal and hairpin loops in a comprehensive collection of 497 NMR and X-ray RNA structures. This report discusses findings of the classification that have not been reported previously. The SCOR database contains multiple examples of a newly described RNA motif, the extruded helical single strand. Internal loop base triples are classified in SCOR according to their three-dimensional context. These internal loop triples contain several examples of a frequently found motif, the minor groove AGC triple. SCOR also presents the predominant and alternate conformations of hairpin loops, as shown in the most well represented tetraloops, with consensus sequences GNRA, UNCG and ANYA. The ubiquity of the GNRA hairpin turn motif is illustrated by its presence in complex internal loops.


Subjects
Nucleic Acid Databases, RNA/chemistry, Base Pairing, Base Sequence, Consensus Sequence, Molecular Models, Nucleic Acid Conformation, RNA/classification