ABSTRACT
Inverted repeats are common DNA elements, but they rarely overlap with protein-coding sequences due to the ensuing conflict with the structure and function of the encoded protein. We discovered numerous perfect inverted repeats of considerable length (up to 284 bp) embedded within the protein-coding genes in mitochondrial genomes of four Nematomorpha species. Strikingly, both arms of the inverted repeats encode conserved regions of the amino acid sequence. We confirmed enzymatic activity of the respiratory complex I encoded by inverted repeat-containing genes. The nucleotide composition of inverted repeats suggests strong selection at the amino acid level in these regions. We conclude that the inverted repeat-containing genes are transcribed and translated into functional proteins. The survey of available mitochondrial genomes reveals that several other organisms possess similar albeit shorter embedded repeats. Mitochondrial genomes of Nematomorpha demonstrate an extraordinary evolutionary compromise where protein function and stringent secondary structure elements within the coding regions are preserved simultaneously.
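As an illustration of the term only, a perfect inverted repeat is a stretch of sequence (an arm) followed, after an optional spacer, by its exact reverse complement. The sketch below is a naive scan for such pairs in a plain DNA string. It is not the method used in the study; the function names, the `min_arm` and `max_spacer` parameters, and the toy sequence are all assumptions made for this example.

```python
# Hypothetical helper names; a naive O(n * max_spacer * arm) scan for illustration.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def pairs(a, b):
    """True if base a is the Watson-Crick complement of base b."""
    return COMPLEMENT.get(a) == b


def find_inverted_repeats(seq, min_arm=4, max_spacer=10):
    """Return (left_start, arm_length, spacer_length) for perfect inverted repeats.

    For each candidate centre, both arms are extended outwards while the
    bases stay complementary, so each reported arm is maximal.
    """
    hits = []
    n = len(seq)
    for left_end in range(n):
        for spacer in range(max_spacer + 1):
            right_start = left_end + 1 + spacer
            if right_start >= n:
                break
            # Skip centres whose innermost spacer bases also pair: that repeat
            # is already reported with a spacer two bases shorter.
            if spacer >= 2 and pairs(seq[left_end + 1], seq[right_start - 1]):
                continue
            k = 0
            while (left_end - k >= 0 and right_start + k < n
                   and pairs(seq[left_end - k], seq[right_start + k])):
                k += 1
            if k >= min_arm:
                hits.append((left_end - k + 1, k, spacer))
    return hits


# Toy sequence: arm GATTACA, spacer TT, then the arm's reverse complement.
toy = "AA" + "GATTACA" + "TT" + reverse_complement("GATTACA") + "AA"
hits = find_inverted_repeats(toy)
```

For the toy sequence, `hits` includes `(2, 7, 2)`: a 7-base arm starting at index 2, separated from its reverse complement by a 2-base spacer. A scan of real mitochondrial genomes would more likely use a dedicated repeat-finding tool than this quadratic sketch.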
Subject(s)
Helminth Genes/genetics , Mitochondrial Genes/genetics , Genetic Code , Mitochondrial Genome , Helminths/genetics , Inverted Repeat Sequences/genetics , Amino Acid Sequence , Animals , Base Composition , Base Sequence , Helminth DNA/genetics , Ribosomal DNA/genetics , Electron Transport Complex I/genetics , Molecular Evolution , Female , Helminth Proteins/genetics , Male , Oxygen Consumption , Helminth RNA/genetics , 18S Ribosomal RNA/genetics , Genetic Selection , Sequence Alignment , Amino Acid Sequence Homology , Species Specificity
ABSTRACT
The Nucleic acid-Protein Interaction DataBase (http://npidb.belozersky.msu.ru/) contains information derived from structures of DNA-protein and RNA-protein complexes extracted from the Protein Data Bank (3846 complexes in October 2012). It provides a web interface and a set of tools for extracting biologically meaningful characteristics of nucleoprotein complexes. The content of the database is updated weekly. The current version of the Nucleic acid-Protein Interaction DataBase is an upgrade of the version published in 2007. The improvements include a new web interface, new tools for calculating intermolecular interactions, a classification of SCOP families that contain DNA-binding protein domains, and data on conserved water molecules at the DNA-protein interface.