Results 1 - 20 of 59
1.
Chembiochem ; : e202400202, 2024 May 31.
Article in English | MEDLINE | ID: mdl-38818670

ABSTRACT

RNA labeling is an invaluable tool for investigating the function and localization of nucleic acids. Labels are commonly incorporated into the 3' end of RNA, and the primary enzyme used for this purpose is RNA poly(A) polymerase (PAP), which belongs to the class of terminal nucleotidyltransferases (NTases). However, PAP preferentially adds ATP analogs, thus limiting the number of available substrates. Here, we report the use of another NTase, CutA from the fungus Thielavia terrestris. Using this enzyme, we were able to incorporate into the 3' end of RNA not only purine analogs, but also pyrimidine analogs. We used strain-promoted azide-alkyne cycloaddition (SPAAC) to obtain fluorescently labeled or biotinylated transcripts from RNAs extended with azide analogs by CutA. Importantly, modified transcripts retained their biological properties. Furthermore, fluorescently labeled mRNAs were suitable for visualization in cultured mammalian cells. Finally, we demonstrate that either affinity studies or molecular dynamics (MD) simulations allow for rapid screening of NTase substrates, which opens up new avenues in the search for the optimal substrates for this class of enzymes.

2.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34571541

ABSTRACT

The Rossmann fold enzymes are involved in essential biochemical pathways such as nucleotide and amino acid metabolism. Their functioning relies on interaction with cofactors, small nucleoside-based compounds specifically recognized by a conserved βαβ motif shared by all Rossmann fold proteins. While Rossmann methyltransferases recognize only a single cofactor type, S-adenosylmethionine, the oxidoreductases, depending on the family, bind nicotinamide (nicotinamide adenine dinucleotide, nicotinamide adenine dinucleotide phosphate) or flavin-based (flavin adenine dinucleotide) cofactors. In this study, we showed that despite its short length, the βαβ motif unambiguously defines the specificity towards the cofactor. Following this observation, we trained two complementary deep learning models for the prediction of the cofactor specificity based on the sequence and structural features of the βαβ motif. A benchmark on two independent test sets, one containing βαβ motifs bearing no resemblance to those of the training set, and the other comprising 38 experimentally confirmed cases of rational design of the cofactor specificity, revealed the nearly perfect performance of the two methods. The Rossmann-toolbox protocols can be accessed via the webserver at https://lbs.cent.uw.edu.pl/rossmann-toolbox and are available as a Python package at https://github.com/labstructbioinf/rossmann-toolbox.


Subject(s)
Deep Learning , Flavin-Adenine Dinucleotide/chemistry , Flavin-Adenine Dinucleotide/metabolism , NAD/chemistry , NAD/metabolism , NADP/chemistry , NADP/metabolism , Proteins
3.
Bioinformatics ; 39(10)2023 10 03.
Article in English | MEDLINE | ID: mdl-37725369

ABSTRACT

MOTIVATION: The detection of homology through sequence comparison is a typical first step in the study of protein function and evolution. In this work, we explore the applicability of protein language models to this task. RESULTS: We introduce pLM-BLAST, a tool inspired by BLAST, that detects distant homology by comparing single-sequence representations (embeddings) derived from a protein language model, ProtT5. Our benchmarks reveal that pLM-BLAST maintains a level of accuracy on par with HHsearch for both highly similar sequences (with >50% identity) and markedly divergent sequences (with <30% identity), while being significantly faster. Additionally, pLM-BLAST stands out among other embedding-based tools due to its ability to compute local alignments. We show that these local alignments, produced by pLM-BLAST, often connect highly divergent proteins, thereby highlighting its potential to uncover previously undiscovered homologous relationships and improve protein annotation. AVAILABILITY AND IMPLEMENTATION: pLM-BLAST is accessible via the MPI Bioinformatics Toolkit as a web server for searching precomputed databases (https://toolkit.tuebingen.mpg.de/tools/plmblast). It is also available as a standalone tool for building custom databases and performing batch searches (https://github.com/labstructbioinf/pLM-BLAST).


Subject(s)
Proteins , Software , Amino Acid Sequence , Sequence Alignment , Proteins/genetics , Molecular Sequence Annotation
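The idea in entry 3 of comparing per-residue embeddings and extracting a local alignment can be illustrated with a minimal sketch. This is not the pLM-BLAST implementation: it assumes, for illustration only, that residue pairs are scored by cosine similarity minus a constant (`shift`) and that a plain Smith-Waterman recursion with a linear gap penalty is applied. The function names, `gap` and `shift` are hypothetical, and the inputs are toy vectors rather than ProtT5 embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def local_alignment_score(emb_a, emb_b, gap=0.5, shift=0.5):
    """Best Smith-Waterman score between two lists of per-residue embeddings.

    Assumption: the substitution score of a residue pair is its cosine
    similarity minus `shift`, so that dissimilar pairs score negatively.
    """
    rows, cols = len(emb_a), len(emb_b)
    h = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    best = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            match = h[i - 1][j - 1] + cosine(emb_a[i - 1], emb_b[j - 1]) - shift
            # Local alignment: a cell never drops below zero.
            h[i][j] = max(0.0, match, h[i - 1][j] - gap, h[i][j - 1] - gap)
            best = max(best, h[i][j])
    return best
```

With toy embeddings such as `[(1, 0, 0), (0, 1, 0), (0, 0, 1)]` and `[(-1, 0, 0), (0, 1, 0), (0, 0, 1)]`, the best local alignment covers only the last two positions, which is the behaviour that distinguishes a local from a global comparison.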
4.
Bioinformatics ; 38(9): 2633-2635, 2022 04 28.
Article in English | MEDLINE | ID: mdl-35199148

ABSTRACT

MOTIVATION: The wealth of protein structures collected in the Protein Data Bank enabled large-scale studies of their function and evolution. Such studies, however, require the generation of customized datasets combining the structural data with miscellaneous accessory resources providing functional, taxonomic and other annotations. Unfortunately, the functionality of currently available tools for the creation of such datasets is limited and their usage frequently requires laborious surveying of various data sources and resolving inconsistencies between their versions. RESULTS: To address this problem, we developed localpdb, a versatile Python library for the management of protein structures and their annotations. The library features a flexible plugin system enabling seamless unification of the structural data with diverse auxiliary resources, full version control and powerful functionality of creating highly customized datasets. The localpdb can be used in a wide range of bioinformatic tasks, in particular those involving large-scale protein structural analyses and machine learning. AVAILABILITY AND IMPLEMENTATION: localpdb is freely available at https://github.com/labstructbioinf/localpdb. Documentation along with the usage examples can be accessed at https://labstructbioinf.github.io/localpdb/.


Subject(s)
Computational Biology , Software , Proteins , Databases, Protein , Documentation
5.
Bioinformatics ; 36(22-23): 5368-5376, 2021 Apr 01.
Article in English | MEDLINE | ID: mdl-33325494

ABSTRACT

MOTIVATION: Coiled coils are widespread protein domains involved in diverse processes ranging from providing structural rigidity to the transduction of conformational changes. They comprise two or more α-helices that are wound around each other to form a regular supercoiled bundle. Owing to this regularity, coiled-coil structures can be described with parametric equations, thus enabling the numerical representation of their properties, such as the degree and handedness of supercoiling, rotational state of the helices, and the offset between them. These descriptors are invaluable in understanding the function of coiled coils and designing new structures of this type. The existing tools for such calculations require manual preparation of input and are therefore not suitable for the high-throughput analyses. RESULTS: To address this problem, we developed SamCC-Turbo, a software for fully automated, per-residue measurement of coiled coils. By surveying Protein Data Bank with SamCC-Turbo, we generated a comprehensive atlas of ∼50 000 coiled-coil regions. This machine learning-ready dataset features precise measurements as well as decomposes coiled-coil structures into fragments characterized by various degrees of supercoiling. The potential applications of SamCC-Turbo are exemplified by analyses in which we reveal general structural features of coiled coils involved in functions requiring conformational plasticity. Finally, we discuss further directions in the prediction and modeling of coiled coils. AVAILABILITY AND IMPLEMENTATION: SamCC-Turbo is available as a web server (https://lbs.cent.uw.edu.pl/samcc_turbo) and as a Python library (https://github.com/labstructbioinf/samcc_turbo), whereas the results of the Protein Data Bank scan can be browsed and downloaded at https://lbs.cent.uw.edu.pl/ccdb. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

6.
PLoS Comput Biol ; 17(10): e1009502, 2021 10.
Article in English | MEDLINE | ID: mdl-34648493

ABSTRACT

While the slipknot topology in proteins has been known for over a decade, its evolutionary origin is still a mystery. We have identified a previously overlooked slipknot motif in a family of two-domain membrane transporters. Moreover, we found that these proteins are homologous to several families of unknotted membrane proteins. This allows us to directly investigate the evolution of the slipknot motif. Based on our comprehensive analysis of 17 distantly related protein families, we have found that slipknotted and unknotted proteins share a common structural motif. Furthermore, this motif is conserved on the sequential level as well. Our results suggest that, regardless of topology, the proteins we studied evolved from a common unknotted ancestor single domain protein. Our phylogenetic analysis suggests the presence of at least seven parallel evolutionary scenarios that led to the current diversity of proteins in question. The tools we have developed in the process can now be used to investigate the evolution of other repeated-domain proteins.


Subject(s)
Antiporters , Evolution, Molecular , Amino Acid Motifs , Antiporters/chemistry , Antiporters/genetics , Antiporters/metabolism , Computational Biology , Databases, Protein , Phylogeny , Protein Conformation
7.
Int J Mol Sci ; 22(24)2021 Dec 15.
Article in English | MEDLINE | ID: mdl-34948248

ABSTRACT

The bacterial proteins of the Dsb family catalyze the formation of disulfide bridges between cysteine residues that stabilize protein structures and ensure their proper functioning. Here, we report the detailed analysis of the Dsb pathway of Campylobacter jejuni. The oxidizing Dsb system of this pathogen is unique because it consists of two monomeric DsbAs (DsbA1 and DsbA2) and one dimeric bifunctional protein (C8J_1298). Previously, we showed that DsbA1 and C8J_1298 are redundant. Here, we unraveled the interaction between the two monomeric DsbAs by in vitro and in vivo experiments and by solving their structures and found that both monomeric DsbAs are dispensable proteins. Their structures confirmed that they are homologs of EcDsbL. The slight differences seen in the surface charge of the proteins do not affect the interaction with their redox partner. Comparative proteomics showed that several respiratory proteins, as well as periplasmic transport proteins, are targets of the Dsb system. Some of these, both donors and electron acceptors, are essential elements of the C. jejuni respiratory process under oxygen-limiting conditions in the host intestine. The data presented provide detailed information on the function of the C. jejuni Dsb system, identifying it as a potential target for novel antibacterial molecules.


Subject(s)
Oxidoreductases/metabolism , Periplasmic Proteins/metabolism , Protein Disulfide-Isomerases/genetics , Protein Disulfide-Isomerases/metabolism , Amino Acid Sequence , Bacterial Physiological Phenomena , Bacterial Proteins/metabolism , Campylobacter jejuni/pathogenicity , Campylobacter jejuni/physiology , Disulfides/metabolism , Oxidation-Reduction , Oxidoreductases/genetics , Periplasm/metabolism , Periplasmic Proteins/genetics , Sequence Homology, Amino Acid
8.
BMC Bioinformatics ; 21(1): 179, 2020 May 07.
Article in English | MEDLINE | ID: mdl-32381046

ABSTRACT

BACKGROUND: Protein repeats can confound sequence analyses because the repetitiveness of their amino acid sequences leads to difficulties in identifying whether similar repeats are due to convergent or divergent evolution. We noted that the patterns derived from traditional "dot plot" protein sequence self-similarity analysis tended to be conserved in sets of related repeat proteins and this conservation could be quantitated using a Jaccard metric. RESULTS: Comparison of these dot plots obviated the issues due to sequence similarity for analysis of repeat proteins. A high Jaccard similarity score was suggestive of a conserved relationship between closely related repeat proteins. The dot plot patterns decayed quickly in the absence of selective pressure with an expected loss of 50% of Jaccard similarity due to a loss of 8.2% sequence identity. To perform method testing, we assembled a standard set of 79 repeat proteins representing all the subgroups in RepeatsDB. Comparison of known repeat and non-repeat proteins from the PDB suggested that the information content in dot plots could be used to identify repeat proteins from pure sequence with no requirement for structural information. Analysis of the UniRef90 database suggested that 16.9% of all known proteins could be classified as repeat proteins. These 13.3 million putative repeat protein chains were clustered and a significant proportion (82.9%) of clusters containing between 5 and 200 members were of a single functional type. CONCLUSIONS: Dot plot analysis of repeat proteins attempts to obviate issues that arise due to the sequence degeneracy of repeat proteins. These results show that this kind of analysis can efficiently be applied to analyze repeat proteins on a large scale.


Subject(s)
Conserved Sequence , Evolution, Molecular , Proteins/chemistry , Repetitive Sequences, Amino Acid , Amino Acid Sequence , Databases, Protein , Mutation/genetics
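The dot-plot comparison described in entry 8 can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' method: a dot is placed wherever two exact-matching windows of length `window` occur in the same sequence, and two plots are compared with the Jaccard index (intersection over union of dot coordinates). The function names and the exact-match window rule are hypothetical simplifications.

```python
def dot_plot(seq, window=3):
    """Self-similarity dot plot as a set of (i, j) coordinates.

    Assumption: a dot marks two positions whose length-`window`
    substrings are identical (real tools may use scoring matrices).
    """
    n = len(seq) - window + 1
    kmers = [seq[i:i + window] for i in range(n)]
    return {(i, j) for i in range(n) for j in range(n) if kmers[i] == kmers[j]}


def jaccard(plot_a, plot_b):
    """Jaccard index |A & B| / |A | B| of two dot plots (1.0 if both are empty)."""
    union = plot_a | plot_b
    return len(plot_a & plot_b) / len(union) if union else 1.0
```

For example, `jaccard(dot_plot("LRDLRDLRD"), dot_plot("GKTGKTGKT"))` is 1.0 although the two sequences share no residues, because both have the same three-residue repeat pattern. Comparing `"LRDLRDLRD"` with the non-repetitive `"ACDEFGHIK"` gives 7/17 (about 0.41), since only the main diagonal is shared.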
9.
Bioinformatics ; 35(16): 2790-2795, 2019 08 15.
Article in English | MEDLINE | ID: mdl-30601942

ABSTRACT

MOTIVATION: Coiled coils are protein structural domains that mediate a plethora of biological interactions, and thus their reliable annotation is crucial for studies of protein structure and function. RESULTS: Here, we report DeepCoil, a new neural network-based tool for the detection of coiled-coil domains in protein sequences. In our benchmarks, DeepCoil significantly outperformed current state-of-the-art tools, such as PCOILS and Marcoil, both in the prediction of canonical and non-canonical coiled coils. Furthermore, in a scan of the human genome with DeepCoil, we detected many coiled-coil domains that remained undetected by other methods. This higher sensitivity of DeepCoil should make it a method of choice for accurate genome-wide detection of coiled-coil domains. AVAILABILITY AND IMPLEMENTATION: DeepCoil is written in Python and utilizes the Keras machine learning library. A web server is freely available at https://toolkit.tuebingen.mpg.de/#/tools/deepcoil and a standalone version can be downloaded at https://github.com/labstructbioinf/DeepCoil. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Software , Amino Acid Sequence , Humans , Machine Learning , Protein Domains , Proteins
10.
Nucleic Acids Res ; 46(D1): D202-D205, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29069520

ABSTRACT

RNArchitecture is a database that provides a comprehensive description of relationships between known families of structured non-coding RNAs, with a focus on structural similarities. The classification is hierarchical and similar to the system used in the SCOP and CATH databases of protein structures. Its central level is Family, which builds on the Rfam catalog and gathers closely related RNAs. Consensus structures of Families are described with a reduced secondary structure representation. Evolutionarily related Families are grouped into Superfamilies. Similar structures are further grouped into Architectures. The highest level, Class, organizes families into very broad structural categories, such as simple or complex structured RNAs. Some groups at different levels of the hierarchy are currently labeled as 'unclassified'. The classification is expected to evolve as new data become available. For each Family with an experimentally determined three-dimensional (3D) structure(s), a representative one is provided. RNArchitecture also presents theoretical models of RNA 3D structure and is open for submission of structural models by users. Compared to other databases, RNArchitecture is unique in its focus on structure-based RNA classification, and in providing a platform for storing RNA 3D structure predictions. RNArchitecture can be accessed at http://iimcb.genesilico.pl/RNArchitecture/.


Subject(s)
Databases, Nucleic Acid , RNA/chemistry , Internet , Molecular Structure , Nucleic Acid Conformation , RNA/classification , RNA/genetics
11.
RNA ; 23(5): 655-672, 2017 05.
Article in English | MEDLINE | ID: mdl-28138060

ABSTRACT

RNA-Puzzles is a collective experiment in blind 3D RNA structure prediction. We report here a third round of RNA-Puzzles. Five puzzles (4, 8, 12, 13 and 14), all structures of riboswitch aptamers, and puzzle 7, a ribozyme structure, are included in this round of the experiment. The riboswitch structures include biological binding sites for small molecules (S-adenosyl methionine, cyclic diadenosine monophosphate, 5-amino 4-imidazole carboxamide riboside 5'-triphosphate, glutamine) and proteins (YbxF), and one set describes large conformational changes between ligand-free and ligand-bound states. The Varkud satellite ribozyme is the most recently solved structure of a known large ribozyme. All puzzles have established biological functions and require structural understanding to appreciate their molecular mechanisms. Through the use of fast-track experimental data, including multidimensional chemical mapping, and accurate prediction of RNA secondary structure, a large portion of the contacts in 3D have been predicted correctly, leading to similar topologies for the top-ranking predictions. Template-based and homology-derived predictions could predict structures to particularly high accuracies. However, achieving biological insights from de novo prediction of RNA 3D structures still depends on the size and complexity of the RNA. Blind computational predictions of RNA structures already appear to provide useful structural information in many cases. Similar to the previous RNA-Puzzles Round II experiment, the prediction of non-Watson-Crick interactions and the observed high atomic clash scores reveal a notable need for algorithm improvement. All prediction models and assessment results are available at http://ahsoka.u-strasbg.fr/rnapuzzles/.


Subject(s)
RNA, Catalytic/chemistry , Riboswitch , Aminoimidazole Carboxamide/chemistry , Aminoimidazole Carboxamide/metabolism , Aptamers, Nucleotide/chemistry , Aptamers, Nucleotide/metabolism , Dinucleoside Phosphates/metabolism , Endoribonucleases/chemistry , Endoribonucleases/metabolism , Glutamine/chemistry , Glutamine/metabolism , Ligands , Models, Molecular , Nucleic Acid Conformation , RNA, Catalytic/metabolism , Ribonucleotides/chemistry , Ribonucleotides/metabolism , S-Adenosylmethionine/chemistry , S-Adenosylmethionine/metabolism
12.
J Struct Biol ; 203(1): 54-61, 2018 07.
Article in English | MEDLINE | ID: mdl-29454111

ABSTRACT

Computational protein design is a set of procedures for computing amino acid sequences that will fold into a specified structure. Rosetta Design, a commonly used software for protein design, allows for the effective identification of sequences compatible with a given backbone structure, while molecular dynamics (MD) simulations can thoroughly sample near-native conformations. We benchmarked a procedure in which Rosetta Design is started on MD-derived structural ensembles and showed that such a combined approach generates 20-30% more diverse sequences than currently available methods with only a slight increase in computation time. Importantly, the increase in diversity is achieved without a loss in the quality of the designed sequences assessed by their resemblance to natural sequences. We demonstrate that the MD-based procedure is also applicable to de novo design tasks started from backbone structures without any sequence information. In addition, we implemented a protocol that can be used to assess the stability of designed models and to select the best candidates for experimental validation. In sum, our results demonstrate that the MD ensemble-based flexible backbone design can be a viable method for protein design, especially for tasks that require a large pool of diverse sequences.


Subject(s)
Molecular Dynamics Simulation , Protein Engineering/methods , Software , Amino Acid Sequence , Sequence Analysis, Protein
13.
J Struct Biol ; 204(1): 117-124, 2018 10.
Article in English | MEDLINE | ID: mdl-30042011

ABSTRACT

In protein modelling and design, an understanding of the relationship between sequence and structure is essential. Using parallel, homotetrameric coiled-coil structures as a model system, we demonstrated that machine learning techniques can be used to predict structural parameters directly from the sequence. Coiled coils are regular protein structures, which are of great interest as building blocks for assembling larger nanostructures. They are composed of two or more alpha-helices wrapped around each other to form a supercoiled bundle. The coiled-coil bundles are defined by four basic structural parameters: topology (parallel or antiparallel), radius, degree of supercoiling, and the rotation of helices around their axes. In parallel coiled coils the latter parameter, describing the hydrophobic core packing geometry, was assumed to show little variation. However, we found that subtle differences between structures of this type were not artifacts of structure determination and could be predicted directly from the sequence. Using this information in modelling narrows the structural parameter space that must be searched and thus significantly reduces the required computational time. Moreover, the sequence-structure rules can be used to explain the effects of point mutations and to shed light on the relationship between hydrophobic core architecture and coiled-coil topology.


Subject(s)
Proteins/chemistry , Hydrophobic and Hydrophilic Interactions , Machine Learning , Protein Structure, Secondary
14.
Subcell Biochem ; 82: 95-129, 2017.
Article in English | MEDLINE | ID: mdl-28101860

ABSTRACT

α-Helical coiled coils constitute one of the most diverse folds yet described. They range in length over two orders of magnitude; they form rods, segmented ropes, barrels, funnels, sheets, spirals, and rings, which encompass anywhere from two to more than 20 helices in parallel or antiparallel orientation; they assume different helix crossing angles, degrees of supercoiling, and packing geometries. This structural diversity supports a wide range of biological functions, allowing them to form mechanically rigid structures, provide levers for molecular motors, project domains across large distances, mediate oligomerization, transduce conformational changes and facilitate the transport of other molecules. Unlike almost any other protein fold known to us, their structure can be computed from parametric equations, making them an ideal model system for rational protein design. Here we outline the principles by which coiled coils are structured, review the determinants of their folding and stability, and present an overview of their diverse architectures.


Subject(s)
Protein Conformation, alpha-Helical , Proteins/chemistry , Amino Acid Sequence , Animals , Humans , Models, Molecular , Protein Folding , Protein Stability
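Entry 14 notes that coiled-coil structures can be computed from parametric equations. The sketch below generates Cα-like coordinates for one helix using one commonly quoted form of the Crick parameterization. Sign and phase conventions differ between publications, and the default values (superhelix radius 5.0 Å, helix radius 2.26 Å, a left-handed supercoil of -4° per residue, 3.5 residues per turn, rise 1.51 Å per residue) are illustrative assumptions rather than values taken from the abstract. The function name and parameter names are hypothetical.

```python
import math


def crick_helix(n_residues, r0=5.0, r1=2.26, w0_deg=-4.0, w1_deg=360.0 / 3.5,
                phi0_deg=0.0, phi1_deg=0.0, rise=1.51):
    """Return (x, y, z) coordinates of one helix wound around the z axis.

    r0 / w0: superhelix radius and angular frequency per residue.
    r1 / w1: alpha-helix radius and angular frequency per residue.
    The pitch angle alpha follows from sin(alpha) = r0 * w0 / rise.
    """
    w0, w1 = math.radians(w0_deg), math.radians(w1_deg)
    phi0, phi1 = math.radians(phi0_deg), math.radians(phi1_deg)
    alpha = math.asin(r0 * w0 / rise)
    coords = []
    for t in range(n_residues):
        a0 = w0 * t + phi0  # position along the superhelix
        a1 = w1 * t + phi1  # position around the helix's own axis
        x = (r0 * math.cos(a0) + r1 * math.cos(a0) * math.cos(a1)
             - r1 * math.cos(alpha) * math.sin(a0) * math.sin(a1))
        y = (r0 * math.sin(a0) + r1 * math.sin(a0) * math.cos(a1)
             + r1 * math.cos(alpha) * math.cos(a0) * math.sin(a1))
        # rise * cos(alpha) equals w0 * r0 / tan(alpha) and is safe when w0 is 0.
        z = rise * math.cos(alpha) * t - r1 * math.sin(alpha) * math.sin(a1)
        coords.append((x, y, z))
    return coords
```

Under these assumptions, a parallel two-helix bundle could be obtained by calling the function twice, once with `phi0_deg=0.0` and once with `phi0_deg=180.0`. Every generated point lies between `r0 - r1` and `r0 + r1` from the bundle axis, which is a quick sanity check on the output.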
15.
RNA ; 21(6): 1066-84, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25883046

ABSTRACT

This paper is a report of a second round of RNA-Puzzles, a collective and blind experiment in three-dimensional (3D) RNA structure prediction. Three puzzles, Puzzles 5, 6, and 10, represented sequences of three large RNA structures with limited or no homology with previously solved RNA molecules. A lariat-capping ribozyme, as well as riboswitches complexed to adenosylcobalamin and tRNA, were predicted by seven groups using RNAComposer, ModeRNA/SimRNA, Vfold, Rosetta, DMD, MC-Fold, 3dRNA, and AMBER refinement. Some groups derived models using data from state-of-the-art chemical-mapping methods (SHAPE, DMS, CMCT, and mutate-and-map). The comparisons between the predictions and the three subsequently released crystallographic structures, solved at diffraction resolutions of 2.5-3.2 Å, were carried out automatically using various sets of quality indicators. The comparisons clearly demonstrate the state of present-day de novo prediction abilities as well as the limitations of these state-of-the-art methods. All of the best prediction models have similar topologies to the native structures, which suggests that computational methods for RNA structure prediction can already provide useful structural information for biological problems. However, the prediction accuracy for non-Watson-Crick interactions, key to proper folding of RNAs, is low and some predicted models had high Clash Scores. These two difficulties point to some of the continuing bottlenecks in RNA structure prediction. All submitted models are available for download at http://ahsoka.u-strasbg.fr/rnapuzzles/.


Subject(s)
Computational Biology/methods , RNA/chemistry , Crystallography, X-Ray , Models, Molecular , Nucleic Acid Conformation , RNA, Messenger/chemistry , RNA, Transfer/chemistry , Software
16.
Methods ; 107: 34-41, 2016 09 01.
Article in English | MEDLINE | ID: mdl-27016142

ABSTRACT

tRNA molecules contain numerous chemically altered nucleosides, which are formed by enzymatic modification of the primary transcripts during the complex tRNA maturation process. Some of the modifications are introduced by single reactions, while others require a complex series of reactions carried out by several different enzymes. The location and distribution of various types of modifications vary greatly between different tRNA molecules, organisms and organelles. We have developed a computational method, tRNAmodpred, for predicting modifications in tRNA sequences. Briefly, our method takes as input one or more unmodified tRNA sequences and a set of protein sequences corresponding to the proteome of a cell. Subsequently, it identifies homologs of known tRNA modification enzymes in the proteome, predicts tRNA modification activities and maps them onto known pathways of RNA modification from the MODOMICS database. Thereby, theoretically possible modification pathways are identified, and products of these modification reactions are proposed for query tRNAs. This method allows for predicting modification patterns for newly sequenced genomes as well as for checking the tentative modification status of tRNAs from one species treated with enzymes from another source, e.g. to predict the possible modifications of eukaryotic tRNAs expressed in bacteria. tRNAmodpred is freely available as a web server at http://genesilico.pl/trnamodpred/.


Subject(s)
Computational Biology/methods , RNA Processing, Post-Transcriptional/genetics , RNA, Transfer/genetics , Amino Acid Sequence/genetics , Nucleic Acid Conformation , RNA, Transfer/chemistry
17.
Nucleic Acids Res ; 42(7): 4160-79, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24464998

ABSTRACT

The ribonuclease H-like (RNHL) superfamily, also called the retroviral integrase superfamily, groups together numerous enzymes involved in nucleic acid metabolism and implicated in many biological processes, including replication, homologous recombination, DNA repair, transposition and RNA interference. The RNHL superfamily proteins show extensive divergence of sequences and structures. We conducted database searches to identify members of the RNHL superfamily (including those previously unknown), yielding >60 000 unique domain sequences. Our analysis led to the identification of new RNHL superfamily members, such as RRXRR (PF14239), DUF460 (PF04312, COG2433), DUF3010 (PF11215), DUF429 (PF04250 and COG2410, COG4328, COG4923), DUF1092 (PF06485), COG5558, OrfB_IS605 (PF01385, COG0675) and Peptidase_A17 (PF05380). Based on the clustering analysis we grouped all identified RNHL domain sequences into 152 families. Phylogenetic studies revealed relationships between these families, and suggested a possible history of the evolution of the RNHL fold and its active site. Our results revealed a clear division of the RNHL superfamily into exonucleases and endonucleases. Structural analyses of features characteristic of particular groups revealed a correlation of the orientation of the C-terminal helix with the exonuclease/endonuclease function and the architecture of the active site. Our analysis provides a comprehensive picture of sequence-structure-function relationships in the RNHL superfamily that may guide functional studies of the previously uncharacterized protein families.


Subject(s)
Ribonuclease H/chemistry , Ribonuclease H/classification , Cluster Analysis , Evolution, Molecular , Exonucleases/classification , Phylogeny , Protein Structure, Tertiary , Ribonuclease H/genetics , Sequence Alignment
18.
BMC Bioinformatics ; 16: 336, 2015 Oct 23.
Article in English | MEDLINE | ID: mdl-26493560

ABSTRACT

BACKGROUND: GmrSD is a modification-dependent restriction endonuclease that specifically targets and cleaves glucosylated hydroxymethylcytosine (glc-HMC) modified DNA. It is encoded either as two separate single-domain GmrS and GmrD proteins or as a single protein carrying both domains. Previous studies suggested that GmrS acts as an endonuclease and NTPase whereas GmrD binds DNA. METHODS: In this work we applied homology detection, sequence conservation analysis, fold recognition and homology modeling methods to study sequence-structure-function relationships in the GmrSD restriction endonuclease family. We also analyzed the phylogeny and genomic context of the family members. RESULTS: Results of our comparative genomics study show that GmrS exhibits similarity to proteins from the ParB/Srx fold, which can have both NTPase and nuclease activity. In contrast to the previous studies, though, we attribute the nuclease activity also to GmrD, as we found it to contain the HNH endonuclease motif. We revealed residues potentially important for structure and function in both domains. Moreover, we found that GmrSD systems exist predominantly as a fused, double-domain form rather than as a heterodimer and that their homologs are often encoded in regions enriched in defense and gene mobility-related elements. Finally, phylogenetic reconstructions of GmrS and GmrD domains revealed that they coevolved and only a few GmrSD systems appear to be assembled from distantly related GmrS and GmrD components. CONCLUSIONS: Our study provides insight into sequence-structure-function relationships in the yet poorly characterized family of Type IV restriction enzymes. Comparative genomics allowed us to propose a possible role of the GmrD domain in the function of the GmrSD enzyme and possible active sites of both GmrS and GmrD domains. The presented results can guide further experimental characterization of these enzymes.


Subject(s)
DNA Restriction Enzymes/genetics , DNA/genetics , Genomics/methods , Catalytic Domain , Phylogeny , Protein Conformation , Protein Structure, Tertiary , Structure-Activity Relationship
19.
RNA ; 19(10): 1341-8, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23980204

ABSTRACT

Prokaryotic ribosomal protein genes are typically grouped within highly conserved operons. In many cases, one or more of the encoded proteins not only bind to a specific site in the ribosomal RNA, but also to a motif localized within their own mRNA, and thereby regulate expression of the operon. In this study, we computationally predicted an RNA motif present in many bacterial phyla within the 5' untranslated region of operons encoding ribosomal proteins S6 and S18. We demonstrated that the S6:S18 complex binds to this motif, which we hereafter refer to as the S6:S18 complex-binding motif (S6S18CBM). This motif is a conserved CCG sequence presented in a bulge flanked by a stem and a hairpin structure. A similar structure containing a CCG trinucleotide forms the S6:S18 complex binding site in 16S ribosomal RNA. We have constructed a 3D structural model of a S6:S18 complex with S6S18CBM, which suggests that the CCG trinucleotide in a specific structural context may be specifically recognized by the S18 protein. This prediction was supported by site-directed mutagenesis of both RNA and protein components. These results provide a molecular basis for understanding protein-RNA recognition and suggest that the S6S18CBM is involved in an auto-regulatory mechanism.


Subject(s)
Bacterial Proteins/metabolism , Nucleic Acid Conformation , RNA, Bacterial/metabolism , RNA, Messenger/metabolism , RNA, Ribosomal/metabolism , Ribosomal Protein S6/metabolism , Ribosomal Proteins/metabolism , 5' Untranslated Regions/genetics , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Base Pairing , Base Sequence , Binding Sites , Electrophoretic Mobility Shift Assay , Escherichia coli/genetics , Escherichia coli/metabolism , Models, Molecular , Molecular Sequence Data , Operon/genetics , Protein Binding , Protein Structure, Tertiary , RNA, Bacterial/chemistry , RNA, Bacterial/genetics , RNA, Messenger/chemistry , RNA, Messenger/genetics , RNA, Ribosomal/chemistry , RNA, Ribosomal/genetics , Ribosomal Protein S6/chemistry , Ribosomal Protein S6/genetics , Ribosomal Proteins/chemistry , Ribosomal Proteins/genetics , Ribosomes/chemistry , Ribosomes/genetics , Ribosomes/metabolism , Sequence Homology, Nucleic Acid , Thermus thermophilus/genetics , Thermus thermophilus/metabolism
20.
Methods ; 65(3): 310-9, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24083976

ABSTRACT

Protein-RNA interactions play fundamental roles in many biological processes, such as regulation of gene expression, RNA splicing, and protein synthesis. The understanding of these processes improves as new structures of protein-RNA complexes are solved and the molecular details of interactions analyzed. However, experimental determination of protein-RNA complex structures by high-resolution methods is tedious and difficult. Therefore, studies on protein-RNA recognition and complex formation present major technical challenges for macromolecular structural biology. Alternatively, protein-RNA interactions can be predicted by computational methods. Although less accurate than experimental measurements, theoretical models of macromolecular structures can be sufficiently accurate to prompt functional hypotheses and guide e.g. identification of important amino acid or nucleotide residues. In this article we present an overview of strategies and methods for computational modeling of protein-RNA complexes, including software developed in our laboratory, and illustrate it with practical examples of structural predictions.


Subject(s)
Computational Biology/methods , Escherichia coli Proteins/chemistry , RNA, Ribosomal, 16S/chemistry , RNA-Binding Proteins/chemistry , Riboswitch/genetics , Software , Bacillus subtilis/chemistry , Binding Sites , Databases, Protein , Escherichia coli/chemistry , Molecular Conformation , Molecular Docking Simulation , Protein Binding , Thermoanaerobacter/chemistry