Results 1 - 10 of 10
1.
Proteins ; 88(10): 1376-1383, 2020 10.
Article in English | MEDLINE | ID: mdl-32506721

ABSTRACT

Taking advantage of the known planarity of the N-acetyl group of N-acetylglucosamine, an analysis of the quality of carbohydrate structures found in the protein databank was performed. Few obvious defects of the local geometry of the carbonyl group were observed. However, the N-acetyl group was often found in the less favorable cis conformation (12% of the cases). It was also found severely twisted in numerous instances, especially in structures with a resolution poorer than 1.9 Å determined between 2000 and 2015. Though the automated PDB-REDO procedure has proved able to improve nearly 85% of the structural models deposited to the PDB, and does prove able to cure most severely twisted conformations of the N-acetyl group, it fails to correct its high rate of cis conformations. More generally, for structures with a resolution poorer than 1.6 Å, it produces N-acetylglucosamine models in slightly poorer agreement with experimental data, as measured using real-space correlation coefficients. Significant improvements are thus still needed, at least as far as this carbohydrate structure is concerned.


Subject(s)
Acetylglucosamine/chemistry , Artifacts , Proteins/chemistry , Acetylglucosamine/metabolism , Binding Sites , X-Ray Crystallography , Databases as Topic , Protein Databases , Humans , Molecular Models , Molecular Conformation , Protein Binding , Proteins/metabolism
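
The abstract of record 1 sorts N-acetyl group conformations into cis, trans and severely twisted. The following is a minimal sketch of how such a classification could be computed from four atomic positions. It makes no claim to reproduce the authors' procedure: the function names `dihedral` and `classify_torsion` are hypothetical, as is the 30° tolerance, and the abstract does not state which atoms define the torsion.

```python
import math


def dihedral(p0, p1, p2, p3):
    """Return the signed dihedral angle in degrees defined by four 3D points."""
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b1 = sub(p1, p0)
    b2 = sub(p2, p1)
    b3 = sub(p3, p2)
    n1 = cross(b1, b2)   # normal of the plane through p0, p1, p2
    n2 = cross(b2, b3)   # normal of the plane through p1, p2, p3
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def classify_torsion(angle_deg, tolerance=30.0):
    """Label a torsion as 'cis', 'trans' or 'twisted'.

    The tolerance is an illustrative assumption, not a value from the paper.
    """
    magnitude = abs(angle_deg)
    if magnitude <= tolerance:
        return "cis"
    if magnitude >= 180.0 - tolerance:
        return "trans"
    return "twisted"


# Three idealized geometries sharing the same central bond.
cis_points = [(1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)]
trans_points = [(1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)]
twisted_points = [(1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)]

for name, pts in [("cis", cis_points), ("trans", trans_points), ("twisted", twisted_points)]:
    print(name, classify_torsion(dihedral(*pts)))
```

Using the absolute value of the angle makes the label independent of the sign convention chosen for the dihedral.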
2.
Acta Crystallogr D Struct Biol ; 77(Pt 9): 1127-1141, 2021 Sep 01.
Article in English | MEDLINE | ID: mdl-34473084

ABSTRACT

The quality of macromolecular structure models crucially depends on refinement and validation targets, which optimally describe the expected chemistry. Commonly used software for these two procedures has been designed and developed in a protein-centric manner, resulting in relatively few established features for the refinement and validation of nucleic acid-containing structure models. Here, new nucleic acid-specific approaches implemented in PDB-REDO are described, including a new restraint model using noncovalent geometries (base-pair hydrogen bonding and base-pair stacking) as refinement targets. New validation routines are also presented, including a metric for Watson-Crick base-pair geometry normality (ZbpG). Applying the PDB-REDO pipeline with the new restraint model to the whole Protein Data Bank (PDB) demonstrates an overall positive effect on the quality of nucleic acid-containing structure models. Finally, we discuss examples of improvements in the geometry of specific nucleic acid structures in the PDB. The new PDB-REDO models and pipeline are available at https://pdb-redo.eu/.


Subject(s)
Computational Biology/methods , Nucleic Acid Conformation , Nucleic Acids/chemistry , Software , Molecular Models
3.
Structure ; 28(11): 1249-1258.e2, 2020 11 03.
Article in English | MEDLINE | ID: mdl-32857966

ABSTRACT

Ramachandran plots report the distribution of the (ϕ, ψ) torsion angles of the protein backbone and are one of the best quality metrics of experimental structure models. Typically, validation software reports the number of residues belonging to "outlier," "allowed," and "favored" regions. While "zero unexplained outliers" can be considered the current "gold standard," this can be misleading if deviations from expected distributions are not considered. We revisited the Ramachandran Z score (Rama-Z), a quality metric introduced more than two decades ago but underutilized. We describe a reimplementation of the Rama-Z score in the Computational Crystallography Toolbox along with an algorithm to estimate its uncertainty for individual models; final implementations are available in Phenix and PDB-REDO. We discuss the interpretation of the Rama-Z score and advocate including it in the validation reports provided by the Protein Data Bank. We also advocate reporting it alongside the outlier/allowed/favored counts in structural publications.


Subject(s)
Algorithms , Molecular Models , Proteins/ultrastructure , Bias , Cryoelectron Microscopy , X-Ray Crystallography , Protein Databases , Humans , alpha-Helical Protein Conformation , beta-Strand Protein Conformation , Software
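
The Rama-Z metric in record 3 is, as its name indicates, a Z score. As a purely generic illustration of what a standard score expresses, the sketch below compares one model's raw score with a reference set. It is not the reimplementation described in the abstract: the function name `z_score`, the argument `reference_scores` and all numbers are hypothetical, and the abstract's uncertainty estimate is not covered.

```python
import statistics


def z_score(value, reference_scores):
    """Return how many sample standard deviations `value` lies from the reference mean."""
    mean = statistics.mean(reference_scores)
    spread = statistics.stdev(reference_scores)  # sample standard deviation
    if spread == 0:
        raise ValueError("reference scores have zero spread")
    return (value - mean) / spread


# Hypothetical raw scores for a reference set and for one model.
reference = [1.0, 2.0, 3.0]
model_value = 3.0
print(z_score(model_value, reference))
```

A value near zero means the model resembles the reference set, while a large magnitude flags a distribution that deviates from it even when no individual outliers are reported.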
4.
Acta Crystallogr D Struct Biol ; 75(Pt 4): 416-425, 2019 Apr 01.
Article in English | MEDLINE | ID: mdl-30988258

ABSTRACT

N-Glycosylation is one of the most common post-translational modifications and is implicated in, for example, protein folding and interaction with ligands and receptors. N-Glycosylation trees are complex structures of linked carbohydrate residues attached to asparagine residues. While carbohydrates are typically modeled in protein structures, they are often incomplete or have the wrong chemistry. Here, new tools are presented to automatically rebuild existing glycosylation trees, to extend them where possible, and to add new glycosylation trees if they are missing from the model. The method has been incorporated in the PDB-REDO pipeline and has been applied to build or rebuild 16 452 carbohydrate residues in 11 651 glycosylation trees in 4498 structure models, and is also available from the PDB-REDO web server. With better modeling of N-glycosylation, the biological function of this important modification can be better and more easily understood.


Subject(s)
Carbohydrate Conformation , Protein Databases , Glycoproteins/chemistry , Polysaccharides/chemistry , Protein Conformation , Carbohydrate Sequence , X-Ray Crystallography/methods , Humans , Molecular Models
5.
Acta Crystallogr F Struct Biol Commun ; 74(Pt 8): 463-472, 2018 08 01.
Article in English | MEDLINE | ID: mdl-30084395

ABSTRACT

Glycosylation is one of the most common forms of protein post-translational modification, but is also the most complex. Dealing with glycoproteins in structure model building, refinement, validation and PDB deposition is more error-prone than dealing with nonglycosylated proteins owing to limitations of the experimental data and available software tools. Also, experimentalists are typically less experienced in dealing with carbohydrate residues than with amino-acid residues. The results of the reannotation and re-refinement by PDB-REDO of 8114 glycoprotein structure models from the Protein Data Bank are analyzed. The positive aspects of 3620 reannotations and subsequent refinement, as well as the remaining challenges to obtaining consistently high-quality carbohydrate models, are discussed.


Subject(s)
Protein Databases/classification , Protein Databases/standards , Glycoproteins/chemistry , Glycoproteins/classification
6.
IUCrJ ; 5(Pt 5): 585-594, 2018 Sep 01.
Article in English | MEDLINE | ID: mdl-30224962

ABSTRACT

Inherent protein flexibility, poor or low-resolution diffraction data or poorly defined electron-density maps often inhibit the building of complete structural models during X-ray structure determination. However, recent advances in crystallographic refinement and model building often allow completion of previously missing parts. This paper presents algorithms that identify regions missing in a certain model but present in homologous structures in the Protein Data Bank (PDB), and 'graft' these regions of interest. These new regions are refined and validated in a fully automated procedure. Including these developments in the PDB-REDO pipeline has enabled the building of 24 962 missing loops in the PDB. The models and the automated procedures are publicly available through the PDB-REDO databank and webserver. More complete protein structure models enable a higher quality public archive but also a better understanding of protein function, better comparison between homologous structures and more complete data mining in structural bioinformatics projects.

7.
Protein Sci ; 27(3): 798-808, 2018 03.
Article in English | MEDLINE | ID: mdl-29168245

ABSTRACT

The Protein Data Bank (PDB) is the global archive for structural information on macromolecules, and a popular resource for researchers, teachers, and students, amassing more than one million unique users each year. Crystallographic structure models in the PDB (more than 100,000 entries) are optimized against the crystal diffraction data and geometrical restraints. This process of crystallographic refinement typically ignored hydrogen bond (H-bond) distances as a source of information. However, H-bond restraints can improve structures at low resolution where diffraction data are limited. To improve low-resolution structure refinement, we present methods for deriving H-bond information either globally from well-refined high-resolution structures from the PDB-REDO databank, or specifically from on-the-fly constructed sets of homologous high-resolution structures. Refinement incorporating HOmology DErived Restraints (HODER) improves geometrical quality and the fit to the diffraction data for many low-resolution structures. To make these improvements readily available to the general public, we applied our new algorithms to all crystallographic structures in the PDB: using massively parallel computing, we constructed a new instance of the PDB-REDO databank (https://pdb-redo.eu). This resource is useful for researchers to gain insight on individual structures, on specific protein families (as we demonstrate with examples), and on general features of protein structure using data mining approaches on a uniformly treated dataset.


Subject(s)
Computational Biology/methods , Proteins/chemistry , Algorithms , X-Ray Crystallography , Data Mining , Protein Databases , Hydrogen Bonding , Molecular Models , Protein Conformation
8.
Methods Mol Biol ; 1549: 209-220, 2017.
Article in English | MEDLINE | ID: mdl-27975294

ABSTRACT

The dramatic increase in the number of protein sequences and structures deposited in biological databases has led to the development of many bioinformatics tools and programs to manage, validate, compare, and interpret this large volume of data. In addition, powerful tools are being developed to use this sequence and structural data to facilitate protein classification and infer biological function of newly identified proteins. This chapter covers freely available bioinformatics resources on the World Wide Web that are commonly used for protein structure analysis.


Subject(s)
Computational Biology/methods , Molecular Models , Protein Conformation , Proteins/chemistry , Software , Protein Databases , Ligands , Protein Binding , Reproducibility of Results , Structure-Activity Relationship , User-Computer Interface , Web Browser
9.
Methods Mol Biol ; 1415: 107-38, 2016.
Article in English | MEDLINE | ID: mdl-27115630

ABSTRACT

The use of macromolecular structures is widespread for a variety of applications, from teaching protein structure principles all the way to ligand optimization in drug development. Applying data mining techniques on these experimentally determined structures requires a highly uniform, standardized structural data source. The Protein Data Bank (PDB) has evolved over the years toward becoming the standard resource for macromolecular structures. However, the process of selecting the data most suitable for specific applications is still very much based on personal preferences and understanding of the experimental techniques used to obtain these models. In this chapter, we will first explain the challenges with data standardization, annotation, and uniformity in the PDB entries determined by X-ray crystallography. We then discuss the specific effect that crystallographic data quality and model optimization methods have on structural models and how validation tools can be used to make informed choices. We also discuss specific advantages of using the PDB_REDO databank as a resource for structural data. Finally, we will provide guidelines on how to select the most suitable protein structure models for detailed analysis and how to select a set of structure models suitable for data mining.


Subject(s)
Data Mining/methods , Proteins/chemistry , Proteins/metabolism , Binding Sites , X-Ray Crystallography , Protein Databases/standards , Guidelines as Topic , Internet , Ligands , Molecular Models , Molecular Sequence Annotation , Molecular Structure , Protein Binding , User-Computer Interface
10.
IUCrJ ; 1(Pt 4): 213-20, 2014 Jul 01.
Article in English | MEDLINE | ID: mdl-25075342

ABSTRACT

The refinement and validation of a crystallographic structure model is the last step before the coordinates and the associated data are submitted to the Protein Data Bank (PDB). The success of the refinement procedure is typically assessed by validating the models against geometrical criteria and the diffraction data, and is an important step in ensuring the quality of the PDB public archive [Read et al. (2011), Structure, 19, 1395-1412]. The PDB_REDO procedure aims for 'constructive validation', aspiring to consistent and optimal refinement parameterization and pro-active model rebuilding, not only correcting errors but striving for optimal interpretation of the electron density. A web server for PDB_REDO has been implemented, allowing thorough, consistent and fully automated optimization of the refinement procedure in REFMAC and partial model rebuilding. The goal of the web server is to help practicing crystallographers to improve their model prior to submission to the PDB. For this, additional steps were implemented in the PDB_REDO pipeline, both in the refinement procedure, e.g. testing of resolution limits and k-fold cross-validation for small test sets, and as new validation criteria, e.g. the density-fit metrics implemented in EDSTATS and ligand validation as implemented in YASARA. Innovative ways to present the refinement and validation results to the user are also described, which together with auto-generated Coot scripts can guide users to subsequent model inspection and improvement. It is demonstrated that using the server can lead to substantial improvement of structure models before they are submitted to the PDB.
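
The abstract of record 10 mentions k-fold cross-validation for small test sets. The sketch below shows the general idea only, namely partitioning item indices into k folds so that each fold serves once as the test set. How PDB_REDO actually assigns reflections to folds is not stated in the abstract; the function name `k_fold_splits`, the `seed` parameter and the example sizes are hypothetical.

```python
import random


def k_fold_splits(n_items, k, seed=0):
    """Yield (work, test) index lists so that every item is in a test set exactly once."""
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and n_items")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)         # reproducible random assignment
    folds = [indices[i::k] for i in range(k)]    # k folds of near-equal size
    for i, test in enumerate(folds):
        work = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield work, test


# Hypothetical example: 10 items split into 5 folds.
splits = list(k_fold_splits(10, 5))
for work, test in splits:
    print(sorted(test), len(work))
```

Averaging a quality measure over all k test folds uses every item for validation once, which is what makes the approach attractive when a single held-out test set would be very small.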
