ABSTRACT
We present a structural and functional analysis of the DNA polymerase of thermophilic Thermus thermophilus MAT72 phage vB_Tt72. The enzyme shows low sequence identity (<30%) to the members of the type-A family of DNA polymerases, except for two as yet uncharacterized DNA polymerases of T. thermophilus phages: φYS40 (91%) and φTMA (90%). The Tt72 polA gene does not complement the Escherichia coli polA− mutant in replicating polA-dependent plasmid replicons. It encodes a 703-aa protein with a predicted molecular weight of 80,490 and an isoelectric point of 5.49. The enzyme contains a nucleotidyltransferase domain and a 3'-5' exonuclease domain that is engaged in proofreading. Recombinant enzyme with a His-tag at the N-terminus was overproduced in E. coli, subsequently purified by immobilized metal affinity chromatography, and biochemically characterized. The enzyme exists in solution in monomeric form and shows optimum activity at pH 8.5, 25 mM KCl, and 0.5 mM Mg2+. Site-directed analysis proved that the highly conserved residues D15, E17, D78, D180, and D184 in the 3'-5' exonuclease domain and D384 and D615 in the nucleotidyltransferase domain are critical for the enzyme's activity. Despite its source of origin, the Tt72 DNA polymerase has not proven to be highly thermoresistant, with a temperature optimum at 55 °C. Above 60 °C, function is rapidly lost, with no activity above 75 °C. However, during heat treatment (10 min at 75 °C), trehalose, trimethylamine N-oxide, and betaine protected the enzyme against thermal inactivation. A midpoint of thermal denaturation at Tm = 74.6 °C (ΔHcal = 2.05 × 10^4 cal mol^-1) and circular dichroism spectra above 60 °C indicate the enzyme's moderate thermal stability.
Subject(s)
Bacteriophages, Thermus thermophilus, Amino Acid Sequence, Bacteriophages/metabolism, DNA-Directed DNA Polymerase/metabolism, Enzyme Stability, Escherichia coli/genetics, Escherichia coli/metabolism, Phosphodiesterase I/metabolism, Thermus thermophilus/metabolism
ABSTRACT
Clostridium botulinum is a Gram-positive, anaerobic, spore-forming bacterium capable of producing botulinum toxin and responsible for botulism in humans and animals. Phage-encoded enzymes called endolysins, which can lyse bacteria when applied externally, have potential as agents to combat bacteria of the genus Clostridium. Bioinformatics analysis revealed genes encoding putative N-acetylmuramoyl-L-alanine amidases with anti-clostridial potential in the genomes of several Clostridium species. One such enzyme, designated as LysB (224-aa), from the prophage of C. botulinum E3 strain Alaska E43 was chosen for further analysis. The recombinant 27,726 Da protein was expressed and purified from E. coli Tuner(DE3) with a yield of 37.5 mg per 1 L of cell culture. Size-exclusion chromatography and analytical ultracentrifugation experiments showed that the protein is dimeric in solution. Bioinformatics analysis and results of site-directed mutagenesis studies imply that five residues, namely H25, Y54, H126, S132, and C134, form the catalytic center of the enzyme. Twelve other residues, namely M13, H43, N47, G48, W49, A50, L73, A75, H76, Q78, N81, and Y182, were predicted to be involved in anchoring the protein to lipoteichoic acid, a significant component of the Gram-positive bacterial cell wall. The LysB enzyme demonstrated lytic activity against bacteria belonging to the genera Clostridium, Bacillus, Staphylococcus, and Deinococcus, but did not lyse Gram-negative bacteria. Optimal lytic activity of LysB occurred between pH 4.0 and 7.5 in the absence of NaCl. This work presents the first characterization of an endolysin derived from a C. botulinum Group II prophage, which can potentially be used to control this important pathogen.
Subject(s)
Clostridium botulinum type E/enzymology, Endopeptidases/metabolism, N-Acetylmuramoyl-L-alanine Amidase/metabolism, Amino Acid Sequence, Catalytic Domain, Clostridium/drug effects, Clostridium/ultrastructure, Endopeptidases/chemistry, Endopeptidases/isolation & purification, Endopeptidases/pharmacology, Lipopolysaccharides/metabolism, Microbial Sensitivity Tests, N-Acetylmuramoyl-L-alanine Amidase/chemistry, N-Acetylmuramoyl-L-alanine Amidase/isolation & purification, N-Acetylmuramoyl-L-alanine Amidase/pharmacology, Prophages/enzymology, Teichoic Acids/metabolism
ABSTRACT
The Virus-X (Viral Metagenomics for Innovation Value) project was a scientific expedition to explore and exploit uncharted territory of genetic diversity in extreme natural environments such as geothermal hot springs and deep-sea ocean ecosystems. Specifically, the project was set to analyse and exploit viral metagenomes with the ultimate goal of developing new gene products with high innovation value for applications in the biotechnology, pharmaceutical, medical, and life science sectors. Viral gene pool analysis is also essential to obtain fundamental insight into ecosystem dynamics and to investigate how viruses influence the evolution of microbes and multicellular organisms. The Virus-X Consortium, established in 2016, included experts from eight European countries. The unique approach based on high-throughput bioinformatics technologies combined with structural and functional studies resulted in the development of a biodiscovery pipeline of significant capacity and scale. The activities within the Virus-X consortium covered the entire range from bioprospecting and methods development in bioinformatics to protein production and characterisation, with the final goal of translating our results into new products for the bioeconomy. The significant impact the consortium made in all of these areas was possible due to the successful cooperation between expert teams that worked together to solve a complex scientific problem using state-of-the-art technologies as well as developing novel tools to explore the virosphere, widely considered the last great frontier of life.
Subject(s)
Viral Genome/genetics, Metagenomics, Bioprospecting/organization & administration, Computational Biology, Genetic Databases, Europe, Hydrothermal Vents/virology, Viral Proteins/chemistry, Viral Proteins/genetics, Viral Proteins/metabolism, Virome/genetics, Viruses/classification, Viruses/genetics
ABSTRACT
To escape from hosts after completing their life cycle, bacteriophages often use endolysins, which degrade bacterial peptidoglycan. While mesophilic phages have been extensively studied, their thermophilic counterparts are not well characterized. Here, we present a detailed analysis of the structure and function of Ts2631 endolysin from thermophilic phage vB_Tsc2631, which is a zinc-dependent amidase. The active site of Ts2631 consists of His30, Tyr58, His131 and Cys139, which are involved in Zn2+ coordination and catalysis. We found that the active site residues are necessary for lysis yet not crucial for peptidoglycan binding. To elucidate residues involved in the enzyme interaction with peptidoglycan, we tested single-residue substitution variants and identified Tyr60 and Lys70 as essential residues. Moreover, substitution of Cys80, abrogating disulfide bridge formation, inactivates Ts2631, as do substitutions of His31, Thr32 and Asn85 residues. The endolysin contains a positively charged N-terminal extension of 20 residues that can protrude from the remainder of the enzyme and is crucial for peptidoglycan binding. We show that the deletion of 20 residues from the N-terminus abolished the bacteriolytic activity of the enzyme. Because Ts2631 exhibits intrinsic antibacterial activity and unusual thermal stability, it is perfectly suited as a scaffold for the development of antimicrobial agents.
Subject(s)
Bacteriophages/physiology, Endopeptidases/metabolism, Peptidoglycan/metabolism, Thermus/virology, Viral Proteins/metabolism, Bacteriolysis, Bacteriophages/chemistry, Bacteriophages/enzymology, Catalytic Domain, Endopeptidases/chemistry, Molecular Models, Protein Conformation, Thermus/physiology, Viral Proteins/chemistry
ABSTRACT
Proteome-pI is an online database containing information about predicted isoelectric points for 5029 proteomes calculated using 18 methods. The isoelectric point, the pH at which a particular molecule carries no net electrical charge, is an important parameter for many analytical biochemistry and proteomics techniques, especially for 2D gel electrophoresis (2D-PAGE), capillary isoelectric focusing, liquid chromatography-mass spectrometry and X-ray protein crystallography. The database, available at http://isoelectricpointdb.org, allows the retrieval of virtual 2D-PAGE plots and the development of customised fractions of a proteome based on isoelectric point and molecular weight. Moreover, Proteome-pI facilitates statistical comparisons of the various prediction methods as well as biological investigation of protein isoelectric point space in all kingdoms of life. For instance, using Proteome-pI data, it is clear that Eukaryotes, which evolved tight control of homeostasis, encode proteins with pI values near the cell pH. In contrast, Archaea, which frequently live in extreme environments, can possess proteins with a wide range of isoelectric points. The database includes various statistics and tools for interactive browsing, searching and sorting. Apart from data for individual proteomes, datasets corresponding to major protein databases such as UniProtKB/TrEMBL and the NCBI non-redundant (nr) database have also been precalculated and made available in CSV format.
Subject(s)
Computational Biology, Protein Databases, Isoelectric Point, Proteome, Proteomics, Search Engine, Computational Biology/methods, Proteomics/methods, Web Browser
ABSTRACT
BACKGROUND: Accurate estimation of the isoelectric point (pI) based on the amino acid sequence is useful for many analytical biochemistry and proteomics techniques such as 2-D polyacrylamide gel electrophoresis, or capillary isoelectric focusing used in combination with high-throughput mass spectrometry. Additionally, pI estimation can be helpful during protein crystallization trials. RESULTS: Here, I present the Isoelectric Point Calculator (IPC), a web service and a standalone program for the accurate estimation of protein and peptide pI using different sets of dissociation constant (pKa) values, including two new computationally optimized pKa sets. According to the presented benchmarks, the newly developed IPC pKa sets outperform previous algorithms by at least 14.9% for proteins and 0.9% for peptides (on average, 22.1% and 59.6%, respectively), which corresponds to an average error of the pI estimation equal to 0.87 and 0.25 pH units for proteins and peptides, respectively. Moreover, the prediction of pI using the IPC pKa's leads to fewer outliers, i.e., predictions affected by errors greater than a given threshold. CONCLUSIONS: The IPC service is freely available at http://isoelectric.ovh.org. Peptide and protein datasets used in the study and the precalculated pI for the PDB and some of the most frequently used proteomes are available for large-scale analysis and future development. REVIEWERS: This article was reviewed by Frank Eisenhaber and Zoltán Gáspári.
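As a rough illustration of how a sequence-based pI estimate of the kind described above can be obtained, the following Python sketch sums Henderson-Hasselbalch charges over ionizable groups and bisects on pH until the net charge is zero. The pKa values and the function names `net_charge` and `isoelectric_point` are assumptions made for this example; they are generic textbook-style values, not the optimized IPC pKa sets, and the sketch is not the IPC implementation.

```python
# Illustrative pKa values (assumed for this sketch; NOT the IPC-optimized sets).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0, "Nterm": 9.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "Cterm": 2.3}


def net_charge(seq, ph):
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    charge = 0.0
    for group, pka in PKA_POSITIVE.items():
        count = 1 if group == "Nterm" else seq.count(group)
        charge += count / (1.0 + 10 ** (ph - pka))
    for group, pka in PKA_NEGATIVE.items():
        count = 1 if group == "Cterm" else seq.count(group)
        charge -= count / (1.0 + 10 ** (pka - ph))
    return charge


def isoelectric_point(seq, lo=0.0, hi=14.0, tol=1e-4):
    """Bisect on pH: the net charge decreases monotonically as pH rises."""
    seq = seq.upper()
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid  # still positively charged, so the pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2.0
```

Swapping in a different pKa table changes the estimate, which is why the choice of pKa set matters for accuracy.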
Subject(s)
Analytical Chemistry Techniques/methods, Isoelectric Point, Proteomics/methods, Peptides/chemistry, Proteins/chemistry
ABSTRACT
Phage vB_Tsc2631 infects the extremophilic bacterium Thermus scotoductus MAT2631 and uses the Ts2631 endolysin for the release of its progeny. The Ts2631 endolysin is the first endolysin from a thermophilic bacteriophage with an experimentally validated catalytic site. In silico analysis and computational modelling of the Ts2631 endolysin structure revealed a conserved Zn2+ binding site (His30, Tyr58, His131 and Cys139) similar to the Zn2+ binding site of eukaryotic peptidoglycan recognition proteins (PGRPs). We have shown that the Ts2631 endolysin lytic activity is dependent on divalent metal ions (Zn2+ and Ca2+). The Ts2631 endolysin substitution variants H30N, Y58F, H131N and C139S dramatically lost their antimicrobial activity, providing evidence for the role of the aforementioned residues in the lytic activity of the enzyme. The enzyme has proven to be not only thermoresistant, retaining 64.8% of its initial activity after 2 h at 95°C, but also highly thermodynamically stable (Tm = 99.82°C, ΔHcal = 4.58 × 10^4 cal mol^-1). Substitutions of histidine residues (H30N and H131N) and a cysteine residue (C139S) resulted in variants aggregating at temperatures ≥75°C, indicating a significant role of these residues in enzyme thermostability. The substrate spectrum of the Ts2631 endolysin included extremophiles of the genus Thermus but also Gram-negative mesophiles, such as Escherichia coli, Salmonella panama, Pseudomonas fluorescens and Serratia marcescens. The broad substrate spectrum and high thermostability of this endolysin make it a good candidate for use as an antimicrobial agent to combat Gram-negative pathogens.
Subject(s)
Bacteriophages/enzymology, Catalytic Domain, Endopeptidases/chemistry, Endopeptidases/metabolism, Thermus/virology, Amino Acid Sequence, Bacteriophages/physiology, Divalent Cations/pharmacology, Enzyme Stability, Molecular Models, Molecular Sequence Data, Sodium Chloride/pharmacology, Substrate Specificity, Temperature
ABSTRACT
MOTIVATION: To date, only a few distinct successful approaches have been introduced to reconstruct a protein 3D structure from a map of contacts between its amino acid residues (a 2D contact map). Current algorithms can infer structures from information-rich contact maps that contain a limited fraction of erroneous predictions. However, it is difficult to reconstruct 3D structures from predicted contact maps that usually contain a high fraction of false contacts. RESULTS: We describe a new, multi-step protocol that predicts protein 3D structures from the predicted contact maps. The method is based on a novel distance function acting on a fuzzy residue proximity graph, which predicts a 2D distance map from a 2D predicted contact map. The application of a Multi-Dimensional Scaling algorithm transforms that predicted 2D distance map into a coarse 3D model, which is further refined by typical modeling programs into an all-atom representation. We tested our approach on contact maps predicted de novo by MULTICOM, the top contact map predictor according to CASP10. We show that our method outperforms FT-COMAR, the state-of-the-art method for 3D structure reconstruction from 2D maps. For all predicted 2D contact maps of relatively low sensitivity (60-84%), GDFuzz3D generates more accurate 3D models, with the average improvement of 4.87 Å in terms of RMSD. AVAILABILITY AND IMPLEMENTATION: GDFuzz3D server and standalone version are freely available at http://iimcb.genesilico.pl/gdserver/GDFuzz3D/. CONTACT: iamb@genesilico.pl SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
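The two central steps described above (deriving a distance map from a contact map, then embedding it in 3D by Multi-Dimensional Scaling) can be sketched in Python with NumPy. This is only a sketch under stated assumptions: `graph_distances` uses plain shortest paths on a connected contact graph as a simple stand-in for the paper's fuzzy-graph distance function, which is not reproduced here, and `classical_mds` is textbook classical MDS. Both function names are hypothetical and do not come from GDFuzz3D.

```python
import numpy as np


def graph_distances(contacts, step=1.0):
    """Shortest-path distances on a contact graph (Floyd-Warshall).

    Assumed stand-in for the paper's distance function: every contact
    counts as one step of length `step`; the graph must be connected.
    """
    n = contacts.shape[0]
    d = np.where(contacts > 0, step, np.inf).astype(float)
    np.fill_diagonal(d, 0.0)
    for k in range(n):
        # Relax every pair (i, j) through intermediate residue k.
        d = np.minimum(d, d[:, [k]] + d[[k], :])
    return d


def classical_mds(dist, dim=3):
    """Embed a distance matrix in `dim` dimensions by classical MDS."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering  # double centering
    vals, vecs = np.linalg.eigh(gram)
    top = np.argsort(vals)[::-1][:dim]                 # largest eigenvalues
    scale = np.sqrt(np.clip(vals[top], 0.0, None))     # drop negative noise
    return vecs[:, top] * scale
```

The resulting coordinates are defined only up to rotation, translation and mirror image, so a coarse model built this way still needs the kind of refinement step the abstract mentions.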
Subject(s)
Molecular Models, Protein Conformation, Software, Algorithms, Amino Acids/chemistry, Proteins/chemistry, Protein Sequence Analysis/methods
ABSTRACT
Ribonucleases (RNases) play a critical role in RNA processing and degradation by hydrolyzing phosphodiester bonds (exo- or endonucleolytically). Many RNases that cut RNA internally exhibit substrate specificity, but their target sites are usually limited to one or a few specific nucleotides in single-stranded RNA and often in a context of a particular three-dimensional structure of the substrate. Thus far, no RNase counterparts of restriction enzymes have been identified which could cleave double-stranded RNA (dsRNA) in a sequence-specific manner. Here, we present evidence for a sequence-dependent cleavage of long dsRNA by RNase Mini-III from Bacillus subtilis (BsMiniIII). Analysis of the sites cleaved by this enzyme in limited digest of bacteriophage Φ6 dsRNA led to the identification of a consensus target sequence. We defined nucleotide residues within the preferred cleavage site that affected the efficiency of the cleavage and were essential for the discrimination of cleavable versus non-cleavable dsRNA sequences. We have also determined that the loop α5b-α6, a distinctive structural element in Mini-III RNases, is crucial for the specific cleavage, but not for dsRNA binding. Our results suggest that BsMiniIII may serve as a prototype of a sequence-specific dsRNase that could possibly be used for targeted cleavage of dsRNA.
Subject(s)
Bacillus subtilis/enzymology, Bacterial Proteins/metabolism, Double-Stranded RNA/metabolism, Ribonuclease III/metabolism, Amino Acid Sequence, Bacillus subtilis/genetics, Bacterial Proteins/chemistry, Bacterial Proteins/genetics, Base Sequence, Binding Sites/genetics, Kinetics, Molecular Models, Molecular Sequence Data, Mutation, Nucleic Acid Conformation, Protein Binding, Tertiary Protein Structure, Double-Stranded RNA/chemistry, Double-Stranded RNA/genetics, Ribonuclease III/chemistry, Ribonuclease III/genetics, Amino Acid Sequence Homology, Substrate Specificity
ABSTRACT
BACKGROUND: In flowering plants, a number of genes have been identified which control the transition from the vegetative to the generative phase of the life cycle. In bryophytes, representing the basal lineage of land plants, there is little data regarding the mechanisms that control this transition. Two bryophyte species, the moss Physcomitrella patens and the liverwort Marchantia polymorpha, are the subject of advanced molecular and genetic research. The goal of our study was to identify genes connected to female gametophyte development and archegonia production in the dioecious liverwort Pellia endiviifolia species B, which is representative of the most basal lineage of the simple thalloid liverworts. RESULTS: The RDA-cDNA technique allowed us to identify three genes specifically expressed in the female individuals of P. endiviifolia: PenB_CYSP coding for cysteine protease, PenB_MT2 and PenB_MT3 coding for Mysterious Transcripts 1 and 2 containing ORFs of 143 and 177 amino acid residues in length, respectively. The exon-intron structure of all three genes has been characterized and pre-mRNA processing was investigated. Interestingly, five mRNA isoforms are produced from the PenB_MT2 gene, which result from alternative splicing within the second and third exon. All observed splicing events take place within the 5'UTR and do not interfere with the coding sequence. All three genes are exclusively expressed in the female individuals, regardless of whether they were cultured in vitro or were collected from a natural habitat. Moreover, we observed a ten-fold increase in transcript levels for all three genes in the archegonial tissue in comparison to the vegetative parts of the same female thalli grown in a natural habitat, suggesting their connection to archegonia development. CONCLUSIONS: We have identified three genes which are specifically expressed in P. endiviifolia sp. B female gametophytes. Moreover, their expression is connected to the female sex-organ differentiation and is developmentally regulated. The contribution of the identified genes may be crucial for successful liverwort sexual reproduction.
Subject(s)
Developmental Gene Expression Regulation, Plant Gene Expression Regulation, Hepatophyta/growth & development, Hepatophyta/genetics, Ovule/genetics, Spores/growth & development, Spores/genetics, Amino Acid Sequence, Computational Biology, Complementary DNA/genetics, Complementary DNA/isolation & purification, Ecosystem, Plant Genes, Molecular Models, Molecular Sequence Data, Plant Proteins/chemistry, Plant Proteins/genetics, Plant Proteins/metabolism, Secondary Protein Structure, Messenger RNA/genetics, Messenger RNA/metabolism, Real-Time Polymerase Chain Reaction, Sequence Alignment, Structural Protein Homology
ABSTRACT
Protein-RNA interactions play fundamental roles in many biological processes, such as regulation of gene expression, RNA splicing, and protein synthesis. The understanding of these processes improves as new structures of protein-RNA complexes are solved and the molecular details of interactions analyzed. However, experimental determination of protein-RNA complex structures by high-resolution methods is tedious and difficult. Therefore, studies on protein-RNA recognition and complex formation present major technical challenges for macromolecular structural biology. Alternatively, protein-RNA interactions can be predicted by computational methods. Although less accurate than experimental measurements, theoretical models of macromolecular structures can be sufficiently accurate to prompt functional hypotheses and guide e.g. identification of important amino acid or nucleotide residues. In this article we present an overview of strategies and methods for computational modeling of protein-RNA complexes, including software developed in our laboratory, and illustrate it with practical examples of structural predictions.
Subject(s)
Computational Biology/methods, Escherichia coli Proteins/chemistry, 16S Ribosomal RNA/chemistry, RNA-Binding Proteins/chemistry, Riboswitch/genetics, Software, Bacillus subtilis/chemistry, Binding Sites, Protein Databases, Escherichia coli/chemistry, Molecular Conformation, Molecular Docking Simulation, Protein Binding, Thermoanaerobacter/chemistry
ABSTRACT
We present a continuous benchmarking approach for the assessment of RNA secondary structure prediction methods implemented in the CompaRNA web server. As of 3 October 2012, the performance of 28 single-sequence and 13 comparative methods has been evaluated on RNA sequences/structures released weekly by the Protein Data Bank. We also provide a static benchmark generated on RNA 2D structures derived from the RNAstrand database. Benchmarks on both data sets offer insight into the relative performance of RNA secondary structure prediction methods on RNAs of different size and with respect to different types of structure. According to our tests, on the average, the most accurate predictions obtained by a comparative approach are generated by CentroidAlifold, MXScarna, RNAalifold and TurboFold. On the average, the most accurate predictions obtained by single-sequence analyses are generated by CentroidFold, ContextFold and IPknot. The best comparative methods typically outperform the best single-sequence methods if an alignment of homologous RNA sequences is available. This article presents the results of our benchmarks as of 3 October 2012, whereas the rankings presented online are continuously updated. We will gladly include new prediction methods and new measures of accuracy in the new editions of CompaRNA benchmarks.
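To make the idea of scoring a predicted secondary structure against a reference concrete, here is a minimal Python sketch that compares base pairs in dot-bracket notation and reports sensitivity and positive predictive value. The abstract does not say which accuracy measures CompaRNA uses, so the choice of these two measures, the dot-bracket input format (without pseudoknots) and the function names `base_pairs` and `pair_metrics` are all assumptions made for illustration.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            pairs.add((stack.pop(), i))  # pair with the most recent open bracket
    return pairs


def pair_metrics(reference, predicted):
    """Sensitivity and positive predictive value of predicted base pairs."""
    ref, pred = base_pairs(reference), base_pairs(predicted)
    true_positives = len(ref & pred)
    sensitivity = true_positives / len(ref) if ref else 0.0
    ppv = true_positives / len(pred) if pred else 0.0
    return sensitivity, ppv
```

For example, a prediction that recovers only the outer pair of a two-pair hairpin has perfect positive predictive value but half the sensitivity.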
Subject(s)
RNA/chemistry, RNA Sequence Analysis, Software, Benchmarking, Nucleic Acid Databases, Protein Databases, Internet, Nucleic Acid Conformation
ABSTRACT
BACKGROUND: Intrinsically unstructured proteins (IUPs) lack a well-defined three-dimensional structure. Some of them may assume a locally stable structure under specific conditions, e.g. upon interaction with another molecule, while others function in a permanently unstructured state. The discovery of IUPs challenged the traditional protein structure paradigm, which stated that a specific well-defined structure defines the function of the protein. As of December 2011, approximately 60 methods for computational prediction of protein disorder from sequence have been made publicly available. They are based on different approaches, such as utilizing evolutionary information, energy functions, and various statistical and machine learning methods. RESULTS: Given the diversity of existing intrinsic disorder prediction methods, we decided to test whether it is possible to combine them into a more accurate meta-prediction method. We developed a method based on 13 arbitrarily chosen disorder predictors, in which the final consensus was weighted by the accuracy of the methods. We have also developed a disorder predictor, GSmetaDisorder3D, that uses no third-party disorder predictors but instead relies on alignments to known protein structures, reported by protein fold-recognition methods, to infer the potentially structured and unstructured regions. Following the success of our disorder predictors in the CASP8 benchmark, we combined them into a meta-meta predictor called GSmetaDisorderMD, which was the top scoring method in the subsequent CASP9 benchmark. CONCLUSIONS: A series of disorder predictors described in this article is available as a MetaDisorder web server at http://iimcb.genesilico.pl/metadisorder/. Results are presented both in an easily interpretable, interactive mode and in a simple text format suitable for machine processing.
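The accuracy-weighted consensus mentioned in the RESULTS can be illustrated with a short Python sketch. It assumes each predictor returns a per-residue disorder probability and that each predictor's weight is some accuracy score; the function name `weighted_consensus`, the 0.5 threshold and the "D"/"-" output alphabet are hypothetical choices for this example and are not taken from the MetaDisorder implementation.

```python
def weighted_consensus(predictions, weights, threshold=0.5):
    """Combine per-residue disorder probabilities from several predictors.

    predictions: one list of per-residue probabilities per predictor.
    weights: one non-negative accuracy weight per predictor (assumed input).
    Returns the consensus scores and a string with 'D' for disordered residues.
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per predictor")
    total = sum(weights)
    length = len(predictions[0])
    scores = [
        sum(w * p[i] for p, w in zip(predictions, weights)) / total
        for i in range(length)
    ]
    labels = "".join("D" if s >= threshold else "-" for s in scores)
    return scores, labels
```

Dividing by the sum of the weights keeps the consensus on the same 0 to 1 scale as the individual predictions, so a single threshold can be applied regardless of how many predictors are combined.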