ABSTRACT
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), founding member of the Worldwide Protein Data Bank (wwPDB), is the US data center for the open-access PDB archive. As wwPDB-designated Archive Keeper, RCSB PDB is also responsible for PDB data security. Annually, RCSB PDB serves >10 000 depositors of three-dimensional (3D) biostructures working on all permanently inhabited continents. RCSB PDB delivers data from its research-focused RCSB.org web portal to many millions of PDB data consumers based in virtually every United Nations-recognized country, territory, etc. This Database Issue contribution describes upgrades to the research-focused RCSB.org web portal that created a one-stop-shop for open access to ~200 000 experimentally-determined PDB structures of biological macromolecules alongside >1 000 000 incorporated Computed Structure Models (CSMs) predicted using artificial intelligence/machine learning methods. RCSB.org is a 'living data resource.' Every PDB structure and CSM is integrated weekly with related functional annotations from external biodata resources, providing up-to-date information for the entire corpus of 3D biostructure data freely available from RCSB.org with no usage limitations. Within RCSB.org, PDB structures and the CSMs are clearly identified as to their provenance and reliability. Both are fully searchable, and can be analyzed and visualized using the full complement of RCSB.org web portal capabilities.
Subjects
Artificial Intelligence; Databases, Protein; Proteins; Machine Learning; Protein Conformation; Proteins/chemistry; Reproducibility of Results
ABSTRACT
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), the US data center for the global PDB archive and a founding member of the Worldwide Protein Data Bank partnership, serves tens of thousands of data depositors in the Americas and Oceania and makes 3D macromolecular structure data available at no charge and without restrictions to millions of RCSB.org users around the world, including >660 000 educators, students and members of the curious public using PDB101.RCSB.org. PDB data depositors include structural biologists using macromolecular crystallography, nuclear magnetic resonance spectroscopy, 3D electron microscopy and micro-electron diffraction. PDB data consumers accessing our web portals include researchers, educators and students studying fundamental biology, biomedicine, biotechnology, bioengineering and energy sciences. During the past 2 years, the research-focused RCSB PDB web portal (RCSB.org) has undergone a complete redesign, enabling improved searching with full Boolean operator logic and more facile access to PDB data integrated with >40 external biodata resources. New features and resources are described in detail using examples that showcase recently released structures of SARS-CoV-2 proteins and host cell proteins relevant to understanding and addressing the COVID-19 global pandemic.
Subjects
Computational Biology/methods; Databases, Protein; Macromolecular Substances/chemistry; Protein Conformation; Proteins/chemistry; Bioengineering/methods; Biomedical Research/methods; Biotechnology/methods; COVID-19/epidemiology; COVID-19/prevention & control; COVID-19/virology; Humans; Macromolecular Substances/metabolism; Pandemics; Proteins/genetics; Proteins/metabolism; SARS-CoV-2/genetics; SARS-CoV-2/metabolism; SARS-CoV-2/physiology; Software; Viral Proteins/chemistry; Viral Proteins/genetics; Viral Proteins/metabolism
ABSTRACT
Since 1971, the Protein Data Bank (PDB) has served as the single global archive for experimentally determined 3D structures of biological macromolecules made freely available to the global community according to the FAIR principles of Findability-Accessibility-Interoperability-Reusability. During the first 50 years of continuous PDB operations, standards for data representation have evolved to better represent rich and complex biological phenomena. Carbohydrate molecules present in more than 14,000 PDB structures have recently been reviewed and remediated to conform to a new standardized format. This machine-readable data representation for carbohydrates occurring in the PDB structures and the corresponding reference data improves the findability, accessibility, interoperability and reusability of structural information pertaining to these molecules. The PDB Exchange MacroMolecular Crystallographic Information File data dictionary now supports (i) standardized atom nomenclature that conforms to International Union of Pure and Applied Chemistry-International Union of Biochemistry and Molecular Biology (IUPAC-IUBMB) recommendations for carbohydrates, (ii) uniform representation of branched entities for oligosaccharides, (iii) commonly used linear descriptors of carbohydrates developed by the glycoscience community and (iv) annotation of glycosylation sites in proteins. For the first time, carbohydrates in PDB structures are consistently represented as collections of standardized monosaccharides, which precisely describe oligosaccharide structures and enable improved carbohydrate visualization, structure validation, robust quantitative and qualitative analyses, search for dendritic structures and classification. The uniform representation of carbohydrate molecules in the PDB described herein will facilitate broader usage of the resource by the glycoscience community and researchers studying glycoproteins.
Subjects
Carbohydrates; Proteins; Carbohydrates/chemistry; Databases, Protein; Proteins/chemistry
ABSTRACT
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB, rcsb.org), the US data center for the global PDB archive, serves thousands of Data Depositors in the Americas and Oceania and makes 3D macromolecular structure data available at no charge and without usage restrictions to more than 1 million rcsb.org Users worldwide and 600 000 pdb101.rcsb.org education-focused Users around the globe. PDB Data Depositors include structural biologists using macromolecular crystallography, nuclear magnetic resonance spectroscopy and 3D electron microscopy. PDB Data Consumers include researchers, educators and students studying Fundamental Biology, Biomedicine, Biotechnology and Energy. Recent reorganization of RCSB PDB activities into four integrated, interdependent services is described in detail, together with tools and resources added over the past 2 years to RCSB PDB web portals in support of a 'Structural View of Biology.'
Subjects
Databases, Protein; Protein Conformation; Biomedical Research/education; Biotechnology/education; Data Curation; Software
ABSTRACT
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB, http://rcsb.org), the US data center for the global PDB archive, makes PDB data freely available to all users, from structural biologists to computational biologists and beyond. New tools and resources have been added to the RCSB PDB web portal in support of a 'Structural View of Biology.' Recent developments have improved the User experience, including the high-speed NGL Viewer that provides 3D molecular visualization in any web browser, improved support for data file download and enhanced organization of website pages for query, reporting and individual structure exploration. Structure validation information is now visible for all archival entries. PDB data have been integrated with external biological resources, including chromosomal position within the human genome; protein modifications; and metabolic pathways. PDB-101 educational materials have been reorganized into a searchable website and expanded to include new features such as the Geis Digital Archive.
Subjects
Computational Biology/methods; Databases, Genetic; Proteins/chemistry; Proteins/genetics; Datasets as Topic; Metabolic Networks and Pathways; Models, Molecular; Protein Conformation; Proteins/metabolism; Software; Structure-Activity Relationship; User-Computer Interface; Web Browser
ABSTRACT
The Protein Data Bank (PDB) is the global repository for public-domain experimentally determined 3D biomolecular structural information. The archival nature of the PDB presents certain challenges pertaining to updating or adding associated annotations from trusted external biodata resources. While each Worldwide PDB (wwPDB) partner has made best efforts to provide up-to-date external annotations, accessing and integrating information from disparate wwPDB data centers can be an involved process. To address this issue, the wwPDB has established the PDB Next Generation (or NextGen) Archive, developed to centralize and streamline access to enriched structural annotations from wwPDB partners and trusted external sources. At present, the NextGen Archive provides mappings between experimentally determined 3D structures of proteins and UniProt amino acid sequences, domain annotations from Pfam, SCOP2 and CATH databases and intra-molecular connectivity information. Since launch, the PDB NextGen Archive has seen substantial user engagement with over 3.5 million data file downloads, ensuring researchers have access to accurate, up-to-date and easily accessible structural annotations. Database URL: http://www.wwpdb.org/ftp/pdb-nextgen-archive-site.
Subjects
Databases, Protein; Molecular Sequence Annotation; Proteins/chemistry
ABSTRACT
Biomolecular structure analysis from experimental NMR studies generally relies on restraints derived from a combination of experimental and knowledge-based data. A challenge for the structural biology community has been a lack of standards for representing these restraints, preventing the establishment of uniform methods of model-vs-data structure validation against restraints and limiting interoperability between restraint-based structure modeling programs. The NMR exchange (NEF) and NMR-STAR formats provide a standardized approach for representing commonly used NMR restraints. Using these restraint formats, a standardized validation system for assessing structural models of biopolymers against restraints has been developed and implemented in the wwPDB OneDep data deposition-validation-biocuration system. The resulting wwPDB Restraint Violation Report provides a model vs. data assessment of biomolecule structures determined using distance and dihedral restraints, with extensions to other restraint types currently being implemented. These tools are useful for assessing NMR models, as well as for assessing biomolecular structure predictions based on distance restraints.
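The model-vs-data check described above can be sketched for the simplest case of distance restraints. This is a minimal illustration, not the wwPDB OneDep implementation: the function name `distance_violations`, the tuple layout `(atom_a, atom_b, lower, upper)`, and the toy coordinates are all assumptions made for the example. Real NEF and NMR-STAR restraint lists also carry ambiguous restraints and are evaluated over ensembles of models, which this sketch ignores.

```python
import math

def distance_violations(coords, restraints, tolerance=0.0):
    """Report distance restraints that a single model violates.

    coords: dict mapping a (hypothetical) atom label to (x, y, z) in angstroms.
    restraints: list of (atom_a, atom_b, lower, upper) bounds in angstroms.
    Returns a list of (atom_a, atom_b, distance, violation) tuples.
    """
    violations = []
    for atom_a, atom_b, lower, upper in restraints:
        d = math.dist(coords[atom_a], coords[atom_b])
        if d > upper + tolerance:
            excess = d - upper      # model distance exceeds the upper bound
        elif d < lower - tolerance:
            excess = lower - d      # model distance falls below the lower bound
        else:
            continue                # restraint satisfied
        violations.append((atom_a, atom_b, round(d, 3), round(excess, 3)))
    return violations

# Toy model with three atoms and two restraints (illustrative values only).
model = {"A": (0.0, 0.0, 0.0), "B": (3.0, 4.0, 0.0), "C": (0.0, 0.0, 1.0)}
bounds = [("A", "B", 1.8, 4.5), ("A", "C", 1.8, 5.0)]
report = distance_violations(model, bounds)
```

Here the A-B pair sits 5.0 Å apart and exceeds its 4.5 Å upper bound by 0.5 Å, while the A-C pair at 1.0 Å falls 0.8 Å short of its lower bound, so both appear in `report`.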
Subjects
Databases, Protein; Models, Molecular; Nuclear Magnetic Resonance, Biomolecular; Protein Conformation; Proteins; Nuclear Magnetic Resonance, Biomolecular/methods; Proteins/chemistry; Software
ABSTRACT
IHMCIF (github.com/ihmwg/IHMCIF) is a data information framework that supports archiving and disseminating macromolecular structures determined by integrative or hybrid modeling (IHM), and making them Findable, Accessible, Interoperable, and Reusable (FAIR). IHMCIF is an extension of the Protein Data Bank Exchange/macromolecular Crystallographic Information Framework (PDBx/mmCIF) that serves as the framework for the Protein Data Bank (PDB) to archive experimentally determined atomic structures of biological macromolecules and their complexes with one another and small molecule ligands (e.g., enzyme cofactors and drugs). IHMCIF serves as the foundational data standard for the PDB-Dev prototype system, developed for archiving and disseminating integrative structures. It utilizes a flexible data representation to describe integrative structures that span multiple spatiotemporal scales and structural states with definitions for restraints from a variety of experimental methods contributing to integrative structural biology. The IHMCIF extension was created with the benefit of considerable community input and recommendations gathered by the Worldwide Protein Data Bank (wwPDB) Task Force for Integrative or Hybrid Methods (wwpdb.org/task/hybrid). Herein, we describe the development of IHMCIF to support evolving methodologies and ongoing advancements in integrative structural biology. Ultimately, IHMCIF will facilitate the unification of PDB-Dev data and tools with the PDB archive so that integrative structures can be archived and disseminated through PDB.
Subjects
Databases, Protein; Proteins; Proteins/chemistry; Protein Conformation; Models, Molecular; Software; Crystallography, X-Ray/methods; Macromolecular Substances/chemistry; Computational Biology/methods; Ligands
ABSTRACT
ModelCIF (github.com/ihmwg/ModelCIF) is a data information framework developed for and by computational structural biologists to enable delivery of Findable, Accessible, Interoperable, and Reusable (FAIR) data to users worldwide. ModelCIF describes the specific set of attributes and metadata associated with macromolecular structures modeled by solely computational methods and provides an extensible data representation for deposition, archiving, and public dissemination of predicted three-dimensional (3D) models of macromolecules. It is an extension of the Protein Data Bank Exchange / macromolecular Crystallographic Information Framework (PDBx/mmCIF), which is the global data standard for representing experimentally-determined 3D structures of macromolecules and associated metadata. The PDBx/mmCIF framework and its extensions (e.g., ModelCIF) are managed by the Worldwide Protein Data Bank partnership (wwPDB, wwpdb.org) in collaboration with relevant community stakeholders such as the wwPDB ModelCIF Working Group (wwpdb.org/task/modelcif). This semantically rich and extensible data framework for representing computed structure models (CSMs) accelerates the pace of scientific discovery. Herein, we describe the architecture, contents, and governance of ModelCIF, and tools and processes for maintaining and extending the data standard. Community tools and software libraries that support ModelCIF are also described.
Subjects
Databases, Protein; Macromolecular Substances/chemistry; Protein Conformation; Software
ABSTRACT
More than 70% of the experimentally determined macromolecular structures in the Protein Data Bank (PDB) contain small-molecule ligands. Quality indicators of ~643,000 ligands present in ~106,000 PDB X-ray crystal structures have been analyzed. Ligand quality varies greatly with regard to goodness of fit between ligand structure and experimental data, deviations in bond lengths and angles from known chemical structures, and inappropriate interatomic clashes between the ligand and its surroundings. Based on principal component analysis, correlated quality indicators of ligand structure have been aggregated into two largely orthogonal composite indicators measuring goodness of fit to experimental data and deviation from ideal chemical structure. Ranking of the composite quality indicators across the PDB archive enabled construction of a uniformly distributed composite ranking score. This score is implemented at RCSB.org to compare chemically identical ligands in distinct PDB structures with easy-to-interpret two-dimensional ligand quality plots, allowing PDB users to quickly assess ligand structure quality and select the best exemplars.
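The ranking step in this abstract can be illustrated with a short sketch. It assumes each ligand instance already has one composite indicator value (for example, a first principal component) and converts those values to percentile ranks, which are uniformly distributed by construction. The function name `percentile_ranks`, the sample values, and the convention that a lower raw value means better quality are assumptions for illustration only; this is not the RCSB.org implementation.

```python
from bisect import bisect_left, bisect_right

def percentile_ranks(values, lower_is_better=True):
    """Map raw composite indicator values to ranking scores in (0, 1).

    Each score is the tie-adjusted fraction of the collection that the
    value outranks, so the scores are spread evenly regardless of how
    skewed the raw indicator distribution is.
    """
    ordered = sorted(values)
    n = len(ordered)
    scores = []
    for v in values:
        lo = bisect_left(ordered, v)    # count of values strictly below v
        hi = bisect_right(ordered, v)   # count of values at or below v
        frac = (lo + hi) / (2 * n)      # mid-rank, so ties share one score
        scores.append(1.0 - frac if lower_is_better else frac)
    return scores

# Hypothetical composite indicator values for four instances of one ligand.
raw = [0.2, 1.5, 0.7, 3.0]
scores = percentile_ranks(raw)
```

With these four values the scores are 0.875, 0.375, 0.625 and 0.125, so the instance with the lowest raw indicator (0.2) ranks best. Applying the same transform to both composite indicators would give the two axes of a two-dimensional quality plot.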
Subjects
Proteins/chemistry; Proteins/metabolism; Small Molecule Libraries/pharmacology; Databases, Protein; Ligands; Models, Molecular; Protein Conformation
ABSTRACT
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), funded by the United States National Science Foundation, National Institutes of Health, and Department of Energy, supports structural biologists and Protein Data Bank (PDB) data users around the world. The RCSB PDB, a founding member of the Worldwide Protein Data Bank (wwPDB) partnership, serves as the US data center for the global PDB archive housing experimentally-determined three-dimensional (3D) structure data for biological macromolecules. As the wwPDB-designated Archive Keeper, RCSB PDB is also responsible for the security of PDB data and weekly update of the archive. RCSB PDB serves tens of thousands of data depositors (using macromolecular crystallography, nuclear magnetic resonance spectroscopy, electron microscopy, and micro-electron diffraction) annually working on all permanently inhabited continents. RCSB PDB makes PDB data available from its research-focused web portal at no charge and without usage restrictions to many millions of PDB data consumers around the globe. It also provides educators, students, and the general public with an introduction to the PDB and related training materials through its outreach and education-focused web portal. This review article describes growth of the PDB, examines evolution of experimental methods for structure determination viewed through the lens of the PDB archive, and provides a detailed accounting of PDB archival holdings and their utilization by researchers, educators, and students worldwide.
Subjects
Computational Biology; Proteins; Humans; Protein Conformation; Databases, Protein; Computational Biology/methods; Proteins/chemistry; Students
ABSTRACT
PDBx/mmCIF, Protein Data Bank Exchange (PDBx) macromolecular Crystallographic Information Framework (mmCIF), has become the data standard for structural biology. With its early roots in the domain of small-molecule crystallography, PDBx/mmCIF provides an extensible data representation that is used for deposition, archiving, remediation, and public dissemination of experimentally determined three-dimensional (3D) structures of biological macromolecules by the Worldwide Protein Data Bank (wwPDB, wwpdb.org). Extensions of PDBx/mmCIF are similarly used for computed structure models by ModelArchive (modelarchive.org), integrative/hybrid structures by PDB-Dev (pdb-dev.wwpdb.org), small angle scattering data by Small Angle Scattering Biological Data Bank SASBDB (sasbdb.org), and for models generated with the AlphaFold 2.0 deep learning software suite (alphafold.ebi.ac.uk). Community-driven development of PDBx/mmCIF spans three decades, involving contributions from researchers, software and methods developers in structural sciences, data repository providers, scientific publishers, and professional societies. Having a semantically rich and extensible data framework for representing a wide range of structural biology experimental and computational results, combined with expertly curated 3D biostructure data sets in public repositories, accelerates the pace of scientific discovery. Herein, we describe the architecture of the PDBx/mmCIF data standard, tools used to maintain representations of the data standard, governance, and processes by which data content standards are extended, plus community tools/software libraries available for processing and checking the integrity of PDBx/mmCIF data. Use cases exemplify how the members of the Worldwide Protein Data Bank have used PDBx/mmCIF as the foundation for their pipeline for delivering Findable, Accessible, Interoperable, and Reusable (FAIR) data to many millions of users worldwide.
Subjects
Computational Biology; Crystallography; Databases, Protein; Software; Macromolecular Substances/chemistry; Molecular Biology; Protein Conformation; Semantics
ABSTRACT
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), funded by the US National Science Foundation, National Institutes of Health, and Department of Energy, has served structural biologists and Protein Data Bank (PDB) data consumers worldwide since 1999. RCSB PDB, a founding member of the Worldwide Protein Data Bank (wwPDB) partnership, is the US data center for the global PDB archive housing biomolecular structure data. RCSB PDB is also responsible for the security of PDB data, as the wwPDB-designated Archive Keeper. Annually, RCSB PDB serves tens of thousands of three-dimensional (3D) macromolecular structure data depositors (using macromolecular crystallography, nuclear magnetic resonance spectroscopy, electron microscopy, and micro-electron diffraction) from all inhabited continents. RCSB PDB makes PDB data available from its research-focused RCSB.org web portal at no charge and without usage restrictions to millions of PDB data consumers working in every nation and territory worldwide. In addition, RCSB PDB operates an outreach and education PDB101.RCSB.org web portal that was used by more than 800,000 educators, students, and members of the public during calendar year 2020. This invited Tools Issue contribution describes (i) how the archive is growing and evolving as new experimental methods generate ever larger and more complex biomolecular structures; (ii) the importance of data standards and data remediation in effective management of the archive and facile integration with more than 50 external data resources; and (iii) new tools and features for 3D structure analysis and visualization made available during the past year via the RCSB.org web portal.
Subjects
Computational Biology/history; Databases, Protein/history; User-Computer Interface; Anniversaries and Special Events; History, 20th Century; History, 21st Century
ABSTRACT
Now in its 52nd year of continuous operations, the Protein Data Bank (PDB) is the premier open-access global archive housing three-dimensional (3D) biomolecular structure data. It is jointly managed by the Worldwide Protein Data Bank (wwPDB) partnership. The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) is funded by the National Science Foundation, National Institutes of Health, and US Department of Energy and serves as the US data center for the wwPDB. RCSB PDB is also responsible for the security of PDB data in its role as wwPDB-designated Archive Keeper. Every year, RCSB PDB serves tens of thousands of depositors of 3D macromolecular structure data (coming from macromolecular crystallography, nuclear magnetic resonance spectroscopy, electron microscopy, and micro-electron diffraction). The RCSB PDB research-focused web portal (RCSB.org) makes PDB data available at no charge and without usage restrictions to many millions of PDB data consumers around the world. The RCSB PDB training, outreach, and education web portal (PDB101.RCSB.org) serves nearly 700 000 educators, students, and members of the public worldwide. This invited Tools Issue contribution describes how RCSB PDB (i) is organized; (ii) works with wwPDB partners to process new depositions; (iii) serves as the wwPDB-designated Archive Keeper; (iv) enables exploration and 3D visualization of PDB data via RCSB.org; and (v) supports training, outreach, and education via PDB101.RCSB.org. New tools and features at RCSB.org are presented using examples drawn from high-resolution structural studies of proteins relevant to treatment of human cancers by targeting immune checkpoints.
Subjects
Computational Biology; Proteins; Humans; Protein Conformation; Databases, Protein; Proteins/chemistry; Computational Biology/methods; Macromolecular Substances/chemistry
ABSTRACT
Lanthanide-binding tags (LBTs) are valuable tools for investigation of protein structure, function, and dynamics by NMR spectroscopy, X-ray crystallography, and luminescence studies. We have inserted LBTs into three different loop positions (denoted L, R, and S) of the model protein interleukin-1β (IL1β) and varied the length of the spacer between the LBT and the protein (denoted 1−3). Luminescence studies demonstrate that all nine constructs bind Tb3+ tightly in the low nanomolar range. No significant change in the fusion protein occurs from insertion of the LBT, as shown by two X-ray crystallographic structures of the IL1β-S1 and IL1β-L3 constructs and for the remaining constructs by comparing the 1H−15N heteronuclear single-quantum coherence NMR spectra with that of the wild-type IL1β. Additionally, binding of LBT-loop IL1β proteins to their native binding partner in vitro remains unaltered. X-ray crystallographic phasing was successful using only the signal from the bound lanthanide. Large residual dipolar couplings (RDCs) could be determined by NMR spectroscopy for all LBT-loop constructs and revealed that the LBT-2 series were rigidly incorporated into the interleukin-1β structure. The paramagnetic NMR spectra of loop-LBT mutant IL1β-R2 were assigned and the Δχ tensor components were calculated on the basis of RDCs and pseudocontact shifts. A structural model of the IL1β-R2 construct was calculated using the paramagnetic restraints. The current data provide support that encodable LBTs serve as versatile biophysical tags when inserted into loop regions of proteins of known structure or predicted via homology modeling.
Subjects
Interleukin-1beta/chemistry; Interleukin-1beta/genetics; Lanthanoid Series Elements/chemistry; Molecular Probes/chemistry; Protein Engineering/methods; Amino Acid Sequence; Crystallography, X-Ray; Feasibility Studies; Humans; Interleukin-1beta/metabolism; Models, Molecular; Molecular Probes/metabolism; Molecular Sequence Data; Nuclear Magnetic Resonance, Biomolecular; Peptides/chemistry; Peptides/metabolism; Protein Structure, Secondary; Receptors, Interleukin-1/metabolism
ABSTRACT
The haloalkanoic acid dehalogenase (HAD) enzyme superfamily is the largest family of phosphohydrolases. In HAD members, the structural elements that provide the binding interactions that support substrate specificity are separated from those that orchestrate catalysis. For most HAD phosphatases, a cap domain functions in substrate recognition. However, for the HAD phosphatases that lack a cap domain, an alternate strategy for substrate selection must be operative. One such HAD phosphatase, GmhB of the HisB subfamily, was selected for structure-function analysis. Herein, the X-ray crystallographic structures of Escherichia coli GmhB in the apo form (1.6 Å resolution), in a complex with Mg2+ and orthophosphate (1.8 Å resolution), and in a complex with Mg2+ and d-glycero-d-manno-heptose 1beta,7-bisphosphate (2.2 Å resolution) were determined, in addition to the structure of Bordetella bronchiseptica GmhB bound to Mg2+ and orthophosphate (1.7 Å resolution). The structures show that in place of a cap domain, the GmhB catalytic site is elaborated by three peptide inserts or loops that pack to form a concave, semicircular surface around the substrate leaving group. Structure-guided kinetic analysis of site-directed mutants was conducted in parallel with a bioinformatics study of sequence diversification within the HisB subfamily to identify loop residues that serve as substrate recognition elements and that distinguish GmhB from its subfamily counterpart, the histidinol-phosphate phosphatase domain of HisB. We show that GmhB and the histidinol-phosphate phosphatase domain use the same design of three substrate recognition loops inserted into the cap domain yet, through selective residue usage on the loops, have achieved unique substrate specificity and thus novel biochemical function.
Subjects
Escherichia coli Proteins/chemistry; Escherichia coli/enzymology; Hydrolases/chemistry; Multigene Family; Phosphoric Monoester Hydrolases/chemistry; Apoenzymes/chemistry; Apoenzymes/genetics; Bordetella bronchiseptica/enzymology; Bordetella bronchiseptica/genetics; Crystallography, X-Ray; Escherichia coli/genetics; Escherichia coli Proteins/genetics; Histidinol-Phosphatase/chemistry; Histidinol-Phosphatase/genetics; Hydrolases/genetics; Mutagenesis, Site-Directed; Phosphoric Monoester Hydrolases/genetics; Protein Binding/genetics; Protein Structure, Tertiary/genetics; Substrate Specificity/genetics
ABSTRACT
The beta-phosphoglucomutase (beta-PGM) of the haloacid dehalogenase enzyme superfamily (HADSF) catalyzes the conversion of beta-glucose 1-phosphate (betaG1P) to glucose 6-phosphate (G6P) using Asp8 of the core domain active site to mediate phosphoryl transfer from beta-glucose 1,6-(bis)phosphate (betaG1,6bisP) to betaG1P. Herein, we explore the mechanism by which hydrolysis of the beta-PGM phospho-Asp8 is avoided during the time that the active site must remain open to solvent to allow the exchange of the bound product G6P with the substrate betaG1P. On the basis of structural information, a model of catalysis is proposed in which the general acid/base (Asp10) side chain moves from a position where it forms a hydrogen bond to the Thr16-Ala17 portion of the domain-domain linker to a functional position where it forms a hydrogen bond to the substrate leaving group O and a His20-Lys76 pair of the cap domain. This repositioning of the general acid/base within the core domain active site is coordinated with substrate-induced closure of the cap domain over the core domain. The model predicts that Asp10 is required for general acid/base catalysis and for stabilization of the enzyme in the cap-closed conformation. It also predicts that hinge residue Thr16 plays a key role in productive domain-domain association, that hydrogen bond interaction with the Thr16 backbone amide NH group is required to prevent phospho-Asp8 hydrolysis in the cap-open conformation, and that the His20-Lys76 pair plays an important role in substrate-induced cap closure. The model is examined via kinetic analyses of Asp10, Thr16, His20, and Lys76 site-directed mutants. Replacement of Asp10 with Ala, Ser, Cys, Asn, or Glu resulted in no observable activity. The kinetic consequences of the replacement of linker residue Thr16 with Pro include a reduced rate of Asp8 phosphorylation by betaG1,6bisP, a reduced rate of cycling of the phosphorylated enzyme to convert betaG1P to G6P, and an enhanced rate of phosphoryl transfer from phospho-Asp8 to water. The X-ray crystal structure of the T16P mutant at 2.7 Å resolution provides a snapshot of the enzyme in an unnatural cap-open conformation where the Asp10 side chain is located in the core domain active site. The His20 and Lys76 site-directed mutants exhibit reduced activity in catalysis of the Asp8-mediated phosphoryl transfer between betaG1,6bisP and betaG1P but no reduction in the rate of phospho-Asp8 hydrolysis. Taken together, the results support a substrate induced-fit model of catalysis in which betaG1P binding to the core domain facilitates recruitment of the general acid/base Asp10 to the catalytic site and induces cap closure.
Subjects
Bacterial Proteins/chemistry; Phosphotransferases (Phosphomutases)/chemistry; Protein Structure, Tertiary; Solvents/chemistry; Amino Acid Substitution; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Binding Sites/genetics; Catalysis; Catalytic Domain/genetics; Crystallography, X-Ray; Glucose-6-Phosphate/analogs & derivatives; Glucose-6-Phosphate/chemistry; Glucose-6-Phosphate/metabolism; Glucosephosphates/chemistry; Glucosephosphates/metabolism; Kinetics; Lactobacillus/enzymology; Lactobacillus/genetics; Models, Molecular; Molecular Structure; Neisseria meningitidis/enzymology; Neisseria meningitidis/genetics; Phosphotransferases (Phosphomutases)/genetics; Phosphotransferases (Phosphomutases)/metabolism; Protein Binding; Structure-Activity Relationship; Substrate Specificity
ABSTRACT
The haloacid dehalogenase (HAD) superfamily is a large family of proteins dominated by phosphotransferases. Thirty-three sequence families within the HAD superfamily (HADSF) have been identified to assist in function assignment. One such family includes the enzyme phosphoacetaldehyde hydrolase (phosphonatase). Phosphonatase possesses the conserved Rossmannoid core domain and a C1-type cap domain. Other members of this family do not possess a cap domain and, because the cap domain of phosphonatase plays an important role in active site desolvation and catalysis, the function of the capless family members must be unique. A representative of the capless subfamily, PSPTO_2114, from the plant pathogen Pseudomonas syringae, was targeted for catalytic activity and structure analyses. The X-ray structure of PSPTO_2114 reveals a capless homodimer that conserves some but not all of the intersubunit contacts contributed by the core domains of the phosphonatase homodimer. The region of PSPTO_2114 that corresponds to the catalytic scaffold of phosphonatase (and other HAD phosphotransferases) positions amino acid residues that are ill suited for Mg2+ cofactor binding and mediation of phosphoryl group transfer between donor and acceptor substrates. The absence of phosphotransferase activity in PSPTO_2114 was confirmed by kinetic assays. To explore PSPTO_2114 function, the conservation of sequence motifs extending outside of the HADSF catalytic scaffold was examined. The stringently conserved residues among PSPTO_2114 homologs were mapped onto the PSPTO_2114 three-dimensional structure to identify a surface region unique to the family members that do not possess a cap domain. The hypothesis that this region is used in protein-protein recognition is explored to define, for the first time, HADSF proteins which have acquired a function other than that of a catalyst.