ABSTRACT
The Human Metabolome Database or HMDB (https://hmdb.ca) has been providing comprehensive reference information about human metabolites and their associated biological, physiological and chemical properties since 2007. Over the past 15 years, the HMDB has grown and evolved significantly to meet the needs of the metabolomics community and respond to continuing changes in internet and computing technology. This year's update, HMDB 5.0, brings a number of important improvements and upgrades to the database. These should make the HMDB more useful and more appealing to a larger cross-section of users. In particular, these improvements include: (i) a significant increase in the number of metabolite entries (from 114 100 to 217 920 compounds); (ii) enhancements to the quality and depth of metabolite descriptions; (iii) the addition of new structure, spectral and pathway visualization tools; (iv) the inclusion of many new and much more accurately predicted spectral data sets, including predicted NMR spectra, more accurately predicted MS spectra, predicted retention indices and predicted collision cross section data and (v) enhancements to the HMDB's search functions to facilitate better compound identification. Many other minor improvements and updates to the content, the interface, and general performance of the HMDB website have also been made. Overall, we believe these upgrades and updates should greatly enhance the HMDB's ease of use and its potential applications not only in human metabolomics but also in exposomics, lipidomics, nutritional science, biochemistry and clinical chemistry.
Subjects
Databases, Genetic; Metabolome/genetics; Metabolomics/classification; Humans; Lipidomics/classification; Mass Spectrometry; User-Computer Interface
ABSTRACT
The Natural Products Magnetic Resonance Database (NP-MRD) is a comprehensive, freely available electronic resource for the deposition, distribution, searching and retrieval of nuclear magnetic resonance (NMR) data on natural products, metabolites and other biologically derived chemicals. NMR spectroscopy has long been viewed as the 'gold standard' for the structure determination of novel natural products and novel metabolites. NMR is also widely used in natural product dereplication and the characterization of biofluid mixtures (metabolomics). All of these NMR applications require large collections of high quality, well-annotated, referential NMR spectra of pure compounds. Unfortunately, referential NMR spectral collections for natural products are quite limited. It is because of the critical need for dedicated, open access natural product NMR resources that the NP-MRD was funded by the National Institutes of Health (NIH). Since its launch in 2020, the NP-MRD has grown quickly to become the world's largest repository for NMR data on natural products and other biological substances. It currently contains both structural and NMR data for nearly 41,000 natural product compounds from >7400 different living species. All structural, spectroscopic and descriptive data in the NP-MRD is interactively viewable, searchable and fully downloadable in multiple formats. Extensive hyperlinks to other databases of relevance are also provided. The NP-MRD also supports community deposition of NMR assignments and NMR spectra (1D and 2D) of natural products and related meta-data. The deposition system performs extensive data enrichment, automated data format conversion and spectral/assignment evaluation. Details of these database features, how they are implemented and plans for future upgrades are also provided. The NP-MRD is available at https://np-mrd.org.
Subjects
Biological Products/chemistry; Databases, Factual; Magnetic Resonance Spectroscopy; Software; Biological Products/classification; Internet
ABSTRACT
While NMR-based metabolomics is only about 20 years old, NMR has been a key part of metabolic and metabolism studies for >40 years. Historically, metabolic researchers used NMR because of its high level of reproducibility, superb instrument stability, facile sample preparation protocols, inherently quantitative character, non-destructive nature, and amenability to automation. In this chapter, we provide a short history of NMR-based metabolomics. We then provide a detailed description of some of the practical aspects of performing NMR-based metabolomics studies including sample preparation, pulse sequence selection, and spectral acquisition and processing. The two different approaches to metabolomics data analysis, targeted vs. untargeted, are briefly outlined. We also describe several software packages to help users process NMR spectra obtained via these two different approaches. We then give several examples of useful or interesting applications of NMR-based metabolomics, ranging from applications to drug toxicology, to identifying inborn errors of metabolism to analyzing the contents of biofluids from dairy cattle. Throughout this chapter, we will highlight the strengths and limitations of NMR-based metabolomics. Additionally, we will conclude with descriptions of recent advances in NMR hardware, methodology, and software and speculate about where NMR-based metabolomics is going in the next 5-10 years.
Subjects
Magnetic Resonance Imaging; Metabolomics; Animals; Cattle; Reproducibility of Results; Metabolomics/methods; Magnetic Resonance Spectroscopy/methods
ABSTRACT
MarkerDB is a freely available electronic database that attempts to consolidate information on all known clinical and a selected set of pre-clinical molecular biomarkers into a single resource. The database includes four major types of molecular biomarkers (chemical, protein, DNA [genetic] and karyotypic) and four biomarker categories (diagnostic, predictive, prognostic and exposure). MarkerDB provides information such as: biomarker names and synonyms, associated conditions or pathologies, detailed disease descriptions, detailed biomarker descriptions, biomarker specificity, sensitivity and ROC curves, standard reference values (for protein and chemical markers), variants (for SNP or genetic markers), sequence information (for genetic and protein markers), molecular structures (for protein and chemical markers), tissue or biofluid sources (for protein and chemical markers), chromosomal location and structure (for genetic and karyotype markers), clinical approval status and relevant literature references. Users can browse the data by conditions, condition categories, biomarker types, biomarker categories or search by sequence similarity through the advanced search function. Currently, the database contains 142 protein biomarkers, 1089 chemical biomarkers, 154 karyotype biomarkers and 26 374 genetic markers. These are categorized into 25 560 diagnostic biomarkers, 102 prognostic biomarkers, 265 exposure biomarkers and 6746 predictive biomarkers or biomarker panels. Collectively, these markers can be used to detect, monitor or predict 670 specific human conditions which are grouped into 27 broad condition categories. MarkerDB is available at https://markerdb.ca.
Subjects
Biomarkers/metabolism; Databases, Factual; Disease/genetics; Genetic Markers; Proteins/genetics; Chromosome Aberrations; Disease/classification; Humans; Internet; Karyotyping; Predictive Value of Tests; Prognosis; Proteins/metabolism; ROC Curve; Software
ABSTRACT
Nuclear magnetic resonance (NMR) spectral analysis of biofluids can be a time-consuming process, requiring the expertise of a trained operator. With NMR becoming increasingly popular in the field of metabolomics, there is a growing need to change this paradigm and to automate the process. Here we introduce MagMet, an online web server that automates the processing and quantification of 1D 1H NMR spectra from biofluids, specifically human serum/plasma metabolites, including those associated with inborn errors of metabolism (IEM). MagMet uses a highly efficient data processing procedure that performs automatic Fourier Transformation, phase correction, baseline optimization, chemical shift referencing, water signal removal, and peak picking/peak alignment. MagMet then uses the peak positions, linewidth information, and J-couplings from its own specially prepared standard metabolite reference spectral NMR library of 85 serum/plasma compounds to identify and quantify compounds from experimentally acquired NMR spectra of serum/plasma. MagMet employs linewidth adjustment for more consistent quantification of metabolites from higher field instruments and incorporates a highly efficient data processing procedure for more rapid and accurate detection and quantification of metabolites. This optimized algorithm allows the MagMet webserver to quickly detect and quantify 58 serum/plasma metabolites in 2.6 min per spectrum (when processing a dataset of 50-100 spectra). MagMet's performance was also assessed using spectra collected from defined mixtures (simulating other biofluids), with >100 previously measured plasma spectra, and from spiked serum/plasma samples simulating known IEMs. In all cases, MagMet performed with precision and accuracy matching the performance of human spectral profiling experts. MagMet is available at http://magmet.ca.
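Two of the processing steps named in the abstract above, Fourier transformation and peak picking, can be illustrated with a toy example. This is only a minimal sketch on synthetic data, not MagMet's actual algorithm: the function names, the single-resonance signal and all parameter values are assumptions made for illustration, and phase correction, baseline optimization, referencing and water removal are left out.

```python
import cmath
import math

def synth_fid(freq_hz, t2_s, n_points, dwell_s):
    """Synthetic free induction decay: one decaying complex exponential."""
    return [
        cmath.exp(2j * math.pi * freq_hz * k * dwell_s) * math.exp(-k * dwell_s / t2_s)
        for k in range(n_points)
    ]

def dft(signal):
    """Naive O(n^2) discrete Fourier transform (adequate for a small demo)."""
    n = len(signal)
    return [
        sum(signal[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
        for m in range(n)
    ]

def pick_peak(spectrum, dwell_s):
    """Return the frequency (Hz) of the tallest magnitude peak."""
    n = len(spectrum)
    m = max(range(n), key=lambda i: abs(spectrum[i]))
    # Bins at or above n/2 correspond to negative frequencies.
    if m >= n // 2:
        m -= n
    return m / (n * dwell_s)

# Hypothetical acquisition: one resonance at 50 Hz, 256 points, 5 ms dwell time.
fid = synth_fid(freq_hz=50.0, t2_s=0.5, n_points=256, dwell_s=0.005)
peak_hz = pick_peak(dft(fid), dwell_s=0.005)
print(f"tallest peak at {peak_hz:.2f} Hz")
```

A production pipeline would use a fast Fourier transform and detect many peaks above a noise threshold; the naive transform is used here only to keep the sketch dependency-free.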
Subjects
Magnetic Resonance Imaging; Metabolomics; Humans; Magnetic Resonance Spectroscopy/methods; Metabolomics/methods; Serum; Algorithms
ABSTRACT
The Human Metabolome Database or HMDB (www.hmdb.ca) is a web-enabled metabolomic database containing comprehensive information about human metabolites along with their biological roles, physiological concentrations, disease associations, chemical reactions, metabolic pathways, and reference spectra. First described in 2007, the HMDB is now considered the standard metabolomic resource for human metabolic studies. Over the past decade the HMDB has continued to grow and evolve in response to emerging needs for metabolomics researchers and continuing changes in web standards. This year's update, HMDB 4.0, represents the most significant upgrade to the database in its history. For instance, the number of fully annotated metabolites has increased by nearly threefold, the number of experimental spectra has grown by almost fourfold and the number of illustrated metabolic pathways has grown by a factor of almost 60. Significant improvements have also been made to the HMDB's chemical taxonomy, chemical ontology, spectral viewing, and spectral/text searching tools. A great deal of brand new data has also been added to HMDB 4.0. This includes large quantities of predicted MS/MS and GC-MS reference spectral data as well as predicted (physiologically feasible) metabolite structures to facilitate novel metabolite identification. Additional information on metabolite-SNP interactions and the influence of drugs on metabolite levels (pharmacometabolomics) has also been added. Many other important improvements in the content, the interface, and the performance of the HMDB website have been made and these should greatly enhance its ease of use and its potential applications in nutrition, biochemistry, clinical chemistry, clinical genetics, medicine, and metabolomics science.
Subjects
Databases, Factual; Metabolome; Databases, Chemical; Gas Chromatography-Mass Spectrometry; Humans; Metabolic Networks and Pathways; Metabolomics; Nuclear Magnetic Resonance, Biomolecular; Tandem Mass Spectrometry; User-Computer Interface
ABSTRACT
Protein structure determination using nuclear magnetic resonance (NMR) spectroscopy can be both time-consuming and labor intensive. Here we demonstrate how chemical shift threading can permit rapid, robust, and accurate protein structure determination using only chemical shift data. Threading is a relatively old bioinformatics technique that uses a combination of sequence information and predicted (or experimentally acquired) low-resolution structural data to generate high-resolution 3D protein structures. The key motivations behind using NMR chemical shifts for protein threading lie in the fact that they are easy to measure, they are available prior to 3D structure determination, and they contain vital structural information. The method we have developed uses not only sequence and chemical shift similarity but also chemical shift-derived secondary structure, shift-derived super-secondary structure, and shift-derived accessible surface area to generate a high quality protein structure regardless of the sequence similarity (or lack thereof) to a known structure already in the PDB. The method (called E-Thrifty) was found to be very fast (often < 10 min/structure) and to significantly outperform other shift-based or threading-based structure determination methods (in terms of top template model accuracy), with an average TM-score performance of 0.68 (vs. 0.50-0.62 for other methods). Coupled with recent developments in chemical shift refinement, these results suggest that protein structure determination, using only NMR chemical shifts, is becoming increasingly practical and reliable. E-Thrifty is available as a web server at http://ethrifty.ca.
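The TM-score quoted in the abstract above is a length-normalized similarity measure between a model and a reference structure. As a rough sketch of the standard definition, the snippet below scores one fixed superposition from per-residue distances. The function names and the 0.5 Å lower bound on d0 are assumptions of this sketch, and the search over superpositions that the full TM-score performs is not included.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (in angstroms) used by the TM-score."""
    if target_length <= 15:
        return 0.5  # guard for very short chains (assumption of this sketch)
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)

def tm_score(distances, target_length):
    """TM-score for one superposition.

    distances: distance (angstroms) between each aligned residue pair.
    target_length: number of residues in the reference structure; residues
    without an aligned partner contribute nothing to the sum.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length

# A perfect match over all 100 residues of a hypothetical target scores 1.0.
perfect = tm_score([0.0] * 100, 100)
# Aligning only half of the residues perfectly scores 0.5.
half = tm_score([0.0] * 50, 100)
```

Because every term lies between 0 and 1 and the sum is divided by the reference length, the score stays in the range 0 to 1 regardless of protein size.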
Subjects
Amino Acid Sequence; Protein Structure, Secondary; Proteins/chemistry; Nuclear Magnetic Resonance, Biomolecular/methods; Protein Conformation; Time Factors
ABSTRACT
Chemical shifts are among the most informative parameters in protein NMR. They provide a wealth of information about protein secondary and tertiary structure, protein flexibility, and protein-ligand binding. In this report, we review the progress in interpreting and utilizing protein chemical shifts that has occurred over the past 25 years, with a particular focus on the large body of work arising from our group and other Canadian NMR laboratories. More specifically, this review focuses on describing, assessing, and providing some historical context for various chemical shift-based methods to: (1) determine protein secondary and super-secondary structure; (2) derive protein torsion angles; (3) assess protein flexibility; (4) predict residue accessible surface area; (5) refine 3D protein structures; (6) determine 3D protein structures and (7) characterize intrinsically disordered proteins. This review also briefly covers some of the methods that we previously developed to predict chemical shifts from 3D protein structures and/or protein sequence data. It is hoped that this review will help to increase awareness of the considerable utility of NMR chemical shifts in structural biology and facilitate more widespread adoption of chemical shift-based methods by NMR spectroscopists, structural biologists, protein biophysicists, and biochemists worldwide. This article is part of a Special Issue entitled: Biophysics in Canada, edited by Lewis Kay, John Baenziger, Albert Berghuis and Peter Tieleman.
Subjects
Nuclear Magnetic Resonance, Biomolecular/methods; Protein Structure, Secondary; Protein Structure, Tertiary
ABSTRACT
Over the past decade, a number of methods have been developed to determine the approximate structure of proteins using minimal NMR experimental information such as chemical shifts alone, sparse NOEs alone or a combination of comparative modeling data and chemical shifts. However, there have been relatively few methods that allow these approximate models to be substantively refined or improved using the available NMR chemical shift data. Here, we present a novel method, called Chemical Shift driven Genetic Algorithm for biased Molecular Dynamics (CS-GAMDy), for the robust optimization of protein structures using experimental NMR chemical shifts. The method incorporates knowledge-based scoring functions and structural information derived from NMR chemical shifts via a unique combination of multi-objective MD biasing, a genetic algorithm, and the widely used XPLOR molecular modelling language. Using this approach, we demonstrate that CS-GAMDy is able to refine and/or fold models that are as much as 10 Å (RMSD) away from the correct structure using only NMR chemical shift data. CS-GAMDy is also able to refine a wide range of approximate or mildly erroneous protein structures to more closely match the known/correct structure and the known/correct chemical shifts. We believe CS-GAMDy will allow protein models generated by sparse restraint or chemical-shift-only methods to achieve sufficiently high quality to be considered fully refined and "PDB worthy". The CS-GAMDy algorithm is explained in detail and its performance is compared over a range of refinement scenarios with several commonly used protein structure refinement protocols. The program has been designed to be easily installed and easily used and is available at http://www.gamdy.ca.
Subjects
Algorithms; Models, Molecular; Nuclear Magnetic Resonance, Biomolecular; Protein Conformation; Proteins/chemistry; Nuclear Magnetic Resonance, Biomolecular/methods
ABSTRACT
Chemical cross-linking combined with mass spectrometry is a rapidly developing technique for structural proteomics. Cross-linked proteins are usually digested with trypsin to generate cross-linked peptides, which are then analyzed by mass spectrometry. The most informative cross-links, the interpeptide cross-links, are often large in size, because they consist of two peptides that are connected by a cross-linker. In addition, trypsin targets the same residues as amino-reactive cross-linkers, and cleavage will not occur at these cross-linker-modified residues. This produces high molecular weight cross-linked peptides, which complicates their mass spectrometric analysis and identification. In this paper, we examine a nonspecific protease, proteinase K, as an alternative to trypsin for cross-linking studies. Initial tests on a model peptide that was digested by proteinase K resulted in a "family" of related cross-linked peptides, all of which contained the same cross-linking sites, thus providing additional verification of the cross-linking results, as was previously noted for other post-translational modification studies. The procedure was next applied to the native (PrP(C)) and oligomeric (PrPβ) forms of prion protein. Using proteinase K, the affinity-purifiable, CID-cleavable and isotopically coded cross-linker cyanurbiotindipropionylsuccinimide, and MALDI-MS, cross-links were found for all of the possible cross-linking sites. After digestion with proteinase K, we obtained a mass distribution of the cross-linked peptides that is very suitable for MALDI-MS analysis. Using this new method, we were able to detect over 60 interpeptide cross-links in the native PrP(C) and PrPβ prion protein. The set of cross-links for the native form was used as distance constraints in developing a model of the native prion protein structure, which includes the 90-124-amino acid N-terminal portion of the protein. Several cross-links were unique to each form of the prion protein, including a Lys(185)-Lys(220) cross-link, which is unique to PrPβ and thus may be indicative of the conformational change involved in the formation of prion protein oligomers.
Subjects
Endopeptidase K/metabolism; Peptides/analysis; Prions/analysis; Amino Acid Sequence; Animals; Biotin; Chromatography, Affinity; Cricetinae; Cross-Linking Reagents; Escherichia coli; Mesocricetus; Models, Molecular; Molecular Sequence Data; Peptides/chemistry; Peptides/genetics; Prions/chemistry; Prions/genetics; Proteolysis; Recombinant Proteins/analysis; Recombinant Proteins/chemistry; Recombinant Proteins/genetics; Spectrometry, Mass, Electrospray Ionization; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization
ABSTRACT
NMR is widely considered the gold standard for organic compound structure determination. As such, NMR is routinely used in organic compound identification, drug metabolite characterization, natural product discovery, and the deconvolution of metabolite mixtures in biofluids (metabolomics and exposomics). In many cases, compound identification by NMR is achieved by matching measured NMR spectra to experimentally collected NMR spectral reference libraries. Unfortunately, the number of available experimental NMR reference spectra, especially for metabolomics, medical diagnostics, or drug-related studies, is quite small. This experimental gap could be filled by predicting NMR chemical shifts for known compounds using computational methods such as machine learning (ML). Here, we describe how a deep learning algorithm that is trained on a high-quality, "solvent-aware" experimental dataset can be used to predict 1H chemical shifts more accurately than any other known method. The new program, called PROSPRE (PROton Shift PREdictor) can accurately (mean absolute error of <0.10 ppm) predict 1H chemical shifts in water (at neutral pH), chloroform, dimethyl sulfoxide, and methanol from a user-submitted chemical structure. PROSPRE (pronounced "prosper") has also been used to predict 1H chemical shifts for >600,000 molecules in many popular metabolomic, drug, and natural product databases.
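The accuracy figure quoted in the abstract above is a mean absolute error (MAE) between predicted and experimentally observed chemical shifts. As a small illustration of how such a figure is computed, the sketch below uses made-up shift values; the function name and the numbers are assumptions for illustration and do not come from PROSPRE or its data.

```python
def mean_absolute_error(predicted, observed):
    """Average absolute difference between paired predicted and observed values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)

# Hypothetical 1H chemical shifts in ppm (illustrative values only).
observed_ppm = [1.20, 3.65, 7.26]
predicted_ppm = [1.25, 3.60, 7.31]

mae = mean_absolute_error(predicted_ppm, observed_ppm)
print(f"MAE = {mae:.3f} ppm")
```

Each prediction here is off by 0.05 ppm, so the MAE is about 0.05 ppm, which would fall under the <0.10 ppm threshold mentioned in the abstract.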
ABSTRACT
We report the development of MagMet-W (magnetic resonance for metabolomics of wine), a software program that can automatically determine the chemical composition of wine via 1H nuclear magnetic resonance (NMR) spectroscopy. MagMet-W is an extension of MagMet, which was developed for the automated metabolomic analysis of human serum by 1H NMR. We identified 70 compounds suitable for inclusion into MagMet-W. We then obtained 1D 1H NMR reference spectra of the pure compounds at 700 MHz and incorporated these spectra into the MagMet-W compound library. The processing of the wine NMR spectra and profiling of the 70 wine compounds were then optimized based on manual 1H NMR analysis. MagMet-W can automatically identify 70 wine compounds in most wine samples and can quantify them to within 10-15% of the manually determined concentrations, and it can analyze multiple spectra simultaneously, at 10 min per spectrum. The MagMet-W Web server is available at https://www.magmet.ca.
ABSTRACT
Protein side-chain motions are involved in many important biological processes including enzymatic catalysis, allosteric regulation, and the mediation of protein-protein, protein-DNA, protein-RNA, and protein-cofactor interactions. NMR spectroscopy has long been used to provide insights into the motions of side-chain groups. Currently, the method of choice for studying side-chain dynamics by NMR is the measurement of methyl (2)H autorelaxation. Methyl (2)H autorelaxation exhibits simple relaxation mechanisms and can be straightforwardly converted to meaningful dynamic parameters. However, methyl groups can only be found in 6 of 19 side-chain bearing amino acids. Consequently, only a sparse assessment of protein side-chain dynamics is possible. Therefore, there is a significant interest in developing novel methods of studying side-chain motions that can be applied to all types of side-chains. Here, we show how side-chain chemical shifts can be used to determine the magnitude of fast side-chain motions in proteins. The chemical shift method is applicable to all side-chain bearing residues and does not require any additional measurements beyond standard NMR experiments for backbone and side-chain assignments.
Subjects
Nuclear Magnetic Resonance, Biomolecular/methods; Proteins/chemistry; Animals; Humans; Molecular Dynamics Simulation; Protein Conformation
ABSTRACT
One of the major challenges currently faced by global health systems is the prolonged COVID-19 syndrome (also known as "long COVID") which has emerged as a consequence of the SARS-CoV-2 epidemic. It is estimated that at least 30% of patients who have had COVID-19 will develop long COVID. In this study, our goal was to assess the plasma metabolome in a total of 100 samples collected from healthy controls, COVID-19 patients, and long COVID patients recruited in Mexico between 2020 and 2022. A targeted metabolomics approach using a combination of LC-MS/MS and FIA MS/MS was performed to quantify 108 metabolites. IL-17 and leptin were measured in long COVID patients by immunoenzymatic assay. The comparison of paired COVID-19/long COVID-19 samples revealed 53 metabolites that were statistically different. Compared to controls, 27 metabolites remained dysregulated even after two years. Post-COVID-19 patients displayed a heterogeneous metabolic profile. Lactic acid, lactate/pyruvate ratio, ornithine/citrulline ratio, and arginine were identified as the most relevant metabolites for distinguishing patients with more complicated long COVID evolution. Additionally, IL-17 levels were significantly increased in these patients. Mitochondrial dysfunction, redox state imbalance, impaired energy metabolism, and chronic immune dysregulation are likely to be the main hallmarks of long COVID even two years after acute COVID-19 infection.
Subjects
COVID-19; Interleukin-17; Humans; Tandem Mass Spectrometry; Chromatography, Liquid; SARS-CoV-2; Metabolome; Metabolomics; Post-Acute COVID-19 Syndrome
ABSTRACT
Phosphomannomutase/phosphoglucomutase contributes to the infectivity of Pseudomonas aeruginosa, retains and reorients its intermediate by 180°, and rotates domain 4 to close the deep catalytic cleft. Nuclear magnetic resonance (NMR) spectra of the backbone of wild-type and S108C-inactivated enzymes were assigned to at least 90%. (13)C secondary chemical shifts report excellent agreement of solution and crystallographic structure over the 14 α-helices, C-capping motifs, and 20 of the 22 β-strands. Major and minor NMR peaks implicate substates affecting 28% of assigned residues. These can be attributed to the phosphorylation state and possibly to conformational interconversions. The S108C substitution of the phosphoryl donor and acceptor slowed transformation of the glucose 1-phosphate substrate by impairing k(cat). Addition of the glucose 1,6-bisphosphate intermediate accelerated this reaction by 2-3 orders of magnitude, somewhat bypassing the defect and apparently relieving substrate inhibition. The S108C mutation perturbs the NMR spectra and electron density map around the catalytic cleft while preserving the secondary structure in solution. Diminished peak heights and faster (15)N relaxation suggest line broadening and millisecond fluctuations within four loops that can contact phosphosugars. (15)N NMR relaxation and peak heights suggest that domain 4 reorients slightly faster in solution than domains 1-3, and with a different principal axis of diffusion. This adds to the crystallographic evidence of domain 4 rotations in the enzyme, which were previously suggested to couple to reorientation of the intermediate, substrate binding, and product release.
Subjects
Phosphotransferases (Phosphomutases)/chemistry; Phosphotransferases (Phosphomutases)/genetics; Catalytic Domain/genetics; Crystallography, X-Ray; Nuclear Magnetic Resonance, Biomolecular; Phosphoglucomutase/chemistry; Phosphoglucomutase/genetics; Phosphorylation/genetics; Phosphotransferases (Phosphomutases)/metabolism; Protein Binding/genetics; Protein Transport/genetics; Pseudomonas aeruginosa/enzymology; Substrate Specificity/genetics
ABSTRACT
Essential collective dynamics is a promising and robust approach for analysing the slow motions of macromolecules from short molecular dynamics trajectories. In this study, an extension of the method to treat a collection of interacting protein molecules is presented. The extension is applied to investigate the effects of dimerization on the collective dynamics of ovine prion protein molecules in two different arrangements. Examination of the structural plasticity shows that aggregation has a restricting effect on the local mobility of the prion protein molecules in the interfacial regions. Domain motions of the two dimeric ovine prion protein conformations are distinctly different and can be related to interatomic correlations at the interface. Correlated motions are among the slow collective modes extensively analysed by considering both main-chain and side-chain atoms. Correlation maps reveal the existence of a vast network of dynamically correlated side groups, which extends beyond individual subunits via interfacial interconnections. The network is formed by a core of hydrophobic side chains surrounded by hydrophilic groups at the periphery. The relevance of these findings is discussed in the context of mutations associated with prion diseases. The binding free energy of the dimeric conformations is evaluated to probe their thermodynamic stability. The descriptions afforded by the analysis of the essential collective dynamics of the prion dimers are consistent with their binding free energies. The agreement validates the extension of the methodology and provides a means of interpreting the collective dynamics in terms of the thermodynamic stability of ovine prion proteins.
Subjects
Molecular Dynamics Simulation; Prions/chemistry; Prions/metabolism; Protein Interaction Mapping/methods; Amino Acid Sequence; Animals; Hydrophobic and Hydrophilic Interactions; Molecular Sequence Data; Mutation; Protein Multimerization; Sheep; Thermodynamics
ABSTRACT
In protein X-ray crystallography, resolution is often used as a good indicator of structural quality. Diffraction resolution of protein crystals correlates well with the number of X-ray observables that are used in structure generation and, therefore, with protein coordinate errors. In protein NMR, there is no parameter identical to X-ray resolution. Instead, resolution is often used as a synonym of NMR model quality. Resolution of NMR structures is often deduced from ensemble precision, torsion angle normality and number of distance restraints per residue. The lack of common techniques to assess the resolution of X-ray and NMR structures complicates the comparison of structures solved by these two methods. This problem is sometimes approached by calculating "equivalent resolution" from structure quality metrics. However, existing protocols do not offer a comprehensive assessment of protein structure as they calculate equivalent resolution from a relatively small number (<5) of protein parameters. Here, we report the development of a protocol that calculates equivalent resolution from 25 measurable protein features. This new method offers better performance (correlation coefficient of 0.92, mean absolute error of 0.28 Å) than existing predictors of equivalent resolution. Because the method uses coordinate data as a proxy for X-ray diffraction data, we call this measure "Resolution-by-Proxy" or ResProx. We demonstrate that ResProx can be used to identify under-restrained, poorly refined or inaccurate NMR structures, and can discover structural defects that the other equivalent resolution methods cannot detect. The ResProx web server is available at http://www.resprox.ca.
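The abstract above reports a correlation coefficient between predicted and measured resolution but does not say which coefficient was used. Assuming it is the Pearson coefficient, a minimal sketch of the calculation looks like this; the function name and the sample resolution values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical measured vs. predicted resolutions in angstroms (illustrative only).
measured = [1.2, 1.8, 2.1, 2.6, 3.0]
predicted = [1.4, 1.7, 2.3, 2.4, 3.1]
r = pearson_r(measured, predicted)
print(f"r = {r:.2f}")
```

A value near 1 means the predicted resolutions rise and fall with the measured ones; it says nothing about systematic offset, which is why the abstract also quotes a mean absolute error.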
Subjects
Algorithms; Models, Molecular; Nuclear Magnetic Resonance, Biomolecular; Proteins/chemistry; Crystallography, X-Ray; Protein Conformation
ABSTRACT
PROSESS (PROtein Structure Evaluation Suite and Server) is a web server designed to evaluate and validate protein structures generated by X-ray crystallography, NMR spectroscopy or computational modeling. While many structure evaluation packages have been developed over the past 20 years, PROSESS is unique in its comprehensiveness, its capacity to evaluate X-ray, NMR and predicted structures as well as its ability to evaluate a variety of experimental NMR data. PROSESS integrates a variety of previously developed, well-known and thoroughly tested methods to evaluate both global and residue specific: (i) covalent and geometric quality; (ii) non-bonded/packing quality; (iii) torsion angle quality; (iv) chemical shift quality and (v) NOE quality. In particular, PROSESS uses VADAR for coordinate, packing, H-bond, secondary structure and geometric analysis, GeNMR for calculating folding, threading and solvent energetics, ShiftX for calculating chemical shift correlations, RCI for correlating structure mobility to chemical shift and PREDITOR for calculating torsion angle-chemical shift agreement. PROSESS also incorporates several other programs including MolProbity to assess atomic clashes, Xplor-NIH to identify and quantify NOE restraint violations and NAMD to assess structure energetics. PROSESS produces detailed tables, explanations, structural images and graphs that summarize the results and compare them to values observed in high-quality or high-resolution protein structures. Using a simplified red-amber-green coloring scheme, PROSESS also alerts users about both general and residue-specific structural problems. PROSESS is intended to serve as a tool that can be used by structural biologists as well as database curators to assess and validate newly determined protein structures. PROSESS is freely available at http://www.prosess.ca.
Subjects
Protein Conformation , Software , X-Ray Crystallography , Internet , Molecular Models , Biomolecular Nuclear Magnetic Resonance , User-Computer Interface
ABSTRACT
GeNMR (GEnerate NMR structures) is a web server for rapidly generating accurate 3D protein structures using sequence data, NOE-based distance restraints and/or NMR chemical shifts as input. GeNMR accepts distance restraints in XPLOR or CYANA format as well as chemical shift files in either SHIFTY or BMRB formats. The web server produces an ensemble of PDB coordinates for the protein within 15-25 min, depending on model complexity and completeness of experimental restraints. GeNMR uses a pipeline of several pre-existing programs and servers to calculate the actual protein structure. In particular, GeNMR combines genetic algorithms for structure optimization with homology modeling, chemical shift threading, torsion angle and distance predictions from chemical shifts/NOEs as well as ROSETTA-based structure generation and simulated annealing with XPLOR-NIH to generate and/or refine protein coordinates. GeNMR greatly simplifies the task of protein structure determination as users do not have to install or become familiar with complex stand-alone programs or obscure format conversion utilities. Tests conducted on a sample of 90 proteins from the BioMagResBank indicate that GeNMR produces high-quality models for all protein queries, regardless of the type of NMR input data. GeNMR was developed to facilitate rapid, user-friendly determination of protein structures via NMR spectroscopy. GeNMR is accessible at http://www.genmr.ca.
Subjects
Biomolecular Nuclear Magnetic Resonance , Protein Conformation , Software , Algorithms , Protein Databases , Internet , Molecular Models , Reproducibility of Results
ABSTRACT
CS23D (chemical shift to 3D structure) is a web server for rapidly generating accurate 3D protein structures using only assigned nuclear magnetic resonance (NMR) chemical shifts and sequence data as input. Unlike conventional NMR methods, CS23D requires no NOE and/or J-coupling data to perform its calculations. CS23D accepts chemical shift files in either SHIFTY or BMRB formats, and produces a set of PDB coordinates for the protein in about 10-15 min. CS23D uses a pipeline of several pre-existing programs or servers to calculate the actual protein structure. Depending on the sequence similarity (or lack thereof) CS23D uses either (i) maximal subfragment assembly (a form of homology modeling), (ii) chemical shift threading or (iii) shift-aided de novo structure prediction (via Rosetta) followed by chemical shift refinement to generate and/or refine protein coordinates. Tests conducted on more than 100 proteins from the BioMagResBank indicate that CS23D converges (i.e. finds a solution) for >95% of protein queries. These chemical shift-generated structures were found to be within 0.2-2.8 Å RMSD of the structures generated using conventional NOE-based NMR methods or conventional X-ray methods. The performance of CS23D is dependent on the completeness of the chemical shift assignments and the similarity of the query protein to known 3D folds. CS23D is accessible at http://www.cs23d.ca.
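The RMSD values quoted above measure how far a model's atoms lie, on average, from the matching atoms of a reference structure. The abstract does not say which atoms were compared or how the structures were superimposed, so the following is only a minimal sketch of the general calculation, assuming two equal-length coordinate lists that have already been superimposed. The function name `rmsd` and the coordinates are hypothetical and are not taken from CS23D.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) points.

    Assumes the two lists pair up atom-by-atom and are already superimposed;
    the result is in the same units as the input (e.g. angstroms).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between the paired atoms
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates: the reference is the model shifted by 1.0 along y
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
reference = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]

print(rmsd(model, reference))  # prints 1.0
```

In practice an optimal rigid-body superposition is applied before this step, so the plain calculation above would overestimate the deviation for structures that have not been aligned.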