1.
Protein Sci ; 30(1): 270-285, 2021 01.
Article En | MEDLINE | ID: mdl-33210433

New X-ray crystallography and cryo-electron microscopy (cryo-EM) approaches yield vast amounts of structural data from dynamic proteins and their complexes. Modeling the full conformational ensemble can provide important biological insights, but identifying and modeling an internally consistent set of alternate conformations remains a formidable challenge. qFit efficiently automates this process by generating a parsimonious multiconformer model. We refactored qFit from a distributed application into software that runs efficiently on a small server, desktop, or laptop. We describe the new qFit 3 software and provide some examples. qFit 3 is open-source under the MIT license, and is available at https://github.com/ExcitedStates/qfit-3.0.


Algorithms , Models, Molecular , Proteins/chemistry , Software , Cryoelectron Microscopy , Crystallography, X-Ray , Ligands
2.
Bioinformatics ; 36(6): 1750-1756, 2020 03 01.
Article En | MEDLINE | ID: mdl-31693112

MOTIVATION: Over the last few years, the field of protein structure prediction has been transformed by increasingly accurate contact prediction software. These methods are based on the detection of coevolutionary relationships between residues from multiple sequence alignments (MSAs). However, despite speculation, there is little evidence of a link between contact prediction and the physico-chemical interactions which drive amino-acid coevolution. Furthermore, existing protocols predict only a fraction of all protein contacts and it is not clear why some contacts are favoured over others. Using a dataset of 863 protein domains, we assessed the physico-chemical interactions of contacts predicted by CCMpred, MetaPSICOV and DNCON2, as examples of direct coupling analysis, meta-prediction and deep learning. RESULTS: We considered correctly predicted contacts and compared their properties against the protein contacts that were not predicted. Predicted contacts tend to form more bonds than non-predicted contacts, which suggests these contacts may be more important than contacts that were not predicted. Comparing the contacts predicted by each method, we found that MetaPSICOV and DNCON2 favour accuracy, whereas CCMpred detects contacts with more bonds. This suggests that the push for higher accuracy may lead to a loss of physico-chemically important contacts. These results underscore the connection between protein physico-chemistry and the coevolutionary couplings that can be derived from MSAs. This relationship is likely to be relevant to protein structure prediction and functional analysis of protein structure and may be key to understanding their utility for different problems in structural biology. AVAILABILITY AND IMPLEMENTATION: We use publicly available databases. Our code is available for download at https://opig.stats.ox.ac.uk/. SUPPLEMENTARY INFORMATION: Supplementary information is available at Bioinformatics online.


Computational Biology , Sequence Analysis, Protein , Algorithms , Protein Conformation , Proteins/genetics , Sequence Alignment , Software
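The abstract above contrasts methods by how accurately they predict contacts. As a rough illustration only, accuracy of a contact predictor is often summarised as the precision of its top-ranked contacts. The sketch below assumes that reading; the function name, the `(i, j, score)` tuple layout, and the toy data are hypothetical and are not taken from the paper or from CCMpred, MetaPSICOV, or DNCON2.

```python
def contact_precision(predicted, native, top_n):
    """Fraction of the top_n highest-scoring predicted contacts that are native.

    predicted: list of (i, j, score) tuples (hypothetical layout).
    native: set of (i, j) residue pairs with i < j.
    """
    # Rank predictions by score, highest first, and keep the top_n.
    ranked = sorted(predicted, key=lambda c: c[2], reverse=True)[:top_n]
    if not ranked:
        return 0.0
    # Normalise each pair to (low, high) so residue order does not matter.
    hits = sum(1 for i, j, _ in ranked if (min(i, j), max(i, j)) in native)
    return hits / len(ranked)


# Toy data, invented for illustration.
native = {(1, 8), (2, 9), (3, 12)}
predicted = [(1, 8, 0.9), (9, 2, 0.8), (4, 15, 0.7), (3, 12, 0.4)]
print(contact_precision(predicted, native, top_n=3))
```

A measure like this says nothing about which physico-chemical interactions the recovered contacts make, which is the gap the abstract points to.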
3.
Proc Natl Acad Sci U S A ; 116(51): 25634-25640, 2019 12 17.
Article En | MEDLINE | ID: mdl-31801874

How changes in enzyme structure and dynamics facilitate passage along the reaction coordinate is a fundamental unanswered question. Here, we use time-resolved mix-and-inject serial crystallography (MISC) at an X-ray free electron laser (XFEL), ambient-temperature X-ray crystallography, computer simulations, and enzyme kinetics to characterize how covalent catalysis modulates isocyanide hydratase (ICH) conformational dynamics throughout its catalytic cycle. We visualize this previously hypothetical reaction mechanism, directly observing formation of a thioimidate covalent intermediate in ICH microcrystals during catalysis. ICH exhibits a concerted helical displacement upon active-site cysteine modification that is gated by changes in hydrogen bond strength between the cysteine thiolate and the backbone amide of the highly strained Ile152 residue. These catalysis-activated motions permit water entry into the ICH active site for intermediate hydrolysis. Mutations at a Gly residue (Gly150) that modulate helical mobility reduce ICH catalytic turnover and alter its pre-steady-state kinetic behavior, establishing that helical mobility is important for ICH catalytic efficiency. These results demonstrate that MISC can capture otherwise elusive aspects of enzyme mechanism and dynamics in microcrystalline samples, resolving long-standing questions about the connection between nonequilibrium protein motions and enzyme catalysis.


Crystallography, X-Ray/methods , Enzymes , Catalysis , Cysteine/analogs & derivatives , Cysteine/chemistry , Cysteine/metabolism , Enzymes/chemistry , Enzymes/metabolism , Enzymes/ultrastructure , Hydro-Lyases/chemistry , Hydro-Lyases/metabolism , Hydro-Lyases/ultrastructure , Models, Molecular , Protein Conformation
4.
PLoS One ; 14(10): e0218149, 2019.
Article En | MEDLINE | ID: mdl-31634369

While template-free protein structure prediction protocols now produce good quality models for many targets, modelling failure remains common. For these methods to be useful it is important that users can both choose the best model from the hundreds to thousands of models that are commonly generated for a target, and determine whether this model is likely to be correct. We have developed Random Forest Quality Assessment (RFQAmodel), which assesses whether models produced by a protein structure prediction pipeline have the correct fold. RFQAmodel uses a combination of existing quality assessment scores with two predicted contact map alignment scores. These alignment scores are able to identify correct models for targets that are not otherwise captured. Our classifier was trained on a large set of protein domains that are structurally diverse and evenly balanced in terms of protein features known to have an effect on modelling success, and then tested on a second set of 244 protein domains with a similar spread of properties. When models for each target in this second set were ranked according to the RFQAmodel score, the highest-ranking model had a high-confidence RFQAmodel score for 67 modelling targets, of which 52 had the correct fold. At the other end of the scale RFQAmodel correctly predicted that for 59 targets the highest-ranked model was incorrect. In comparisons to other methods we found that RFQAmodel is better able to identify correct models for targets where only a few of the models are correct. We found that RFQAmodel achieved a similar performance on the model sets for CASP12 and CASP13 free-modelling targets. Finally, by iteratively generating models and running RFQAmodel until a model is produced that is predicted to be correct with high confidence, we demonstrate how such a protocol can be used to focus computational efforts on difficult modelling targets. RFQAmodel and the accompanying data can be downloaded from http://opig.stats.ox.ac.uk/resources.


Algorithms , Models, Molecular , Protein Folding , Proteins , Sequence Analysis, Protein , Software , Predictive Value of Tests , Protein Conformation , Proteins/chemistry , Proteins/genetics
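The iterative protocol mentioned at the end of the abstract above (generate models, score them, stop once one is predicted correct with high confidence) can be sketched as a simple loop. This is a minimal sketch under stated assumptions: `generate_batch`, `score_model`, the threshold, and the round limit are hypothetical placeholders supplied by the caller, not RFQAmodel's actual interface.

```python
def generate_until_confident(generate_batch, score_model, threshold, max_rounds):
    """Generate models in rounds until the best score reaches the threshold.

    generate_batch(round_index) -> iterable of models (hypothetical callable).
    score_model(model) -> float quality score (hypothetical callable).
    Returns (best_model, best_score, rounds_used).
    """
    best_model, best_score = None, float("-inf")
    for round_index in range(1, max_rounds + 1):
        for model in generate_batch(round_index):
            score = score_model(model)
            if score > best_score:
                best_model, best_score = model, score
        # Stop early once a model is scored above the confidence threshold.
        if best_score >= threshold:
            return best_model, best_score, round_index
    # Budget exhausted: return the best model seen, even if below threshold.
    return best_model, best_score, max_rounds


# Toy stand-ins: each "model" is just a number that is also its score.
result = generate_until_confident(
    generate_batch=lambda r: [r * 0.1, r * 0.2],
    score_model=lambda m: m,
    threshold=0.5,
    max_rounds=10,
)
print(result)
```

The design point is that easy targets exit after few rounds, so the remaining compute budget goes to the difficult ones.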
5.
J Med Chem ; 61(24): 11183-11198, 2018 12 27.
Article En | MEDLINE | ID: mdl-30457858

Proteins and ligands sample a conformational ensemble that governs molecular recognition, activity, and dissociation. In structure-based drug design, access to this conformational ensemble is critical to understand the balance between entropy and enthalpy in lead optimization. However, ligand conformational heterogeneity is currently severely underreported in crystal structures in the Protein Data Bank, owing in part to a lack of automated and unbiased procedures to model an ensemble of protein-ligand states into X-ray data. Here, we designed a computational method, qFit-ligand, to automatically resolve conformationally averaged ligand heterogeneity in crystal structures, and applied it to a large set of protein receptor-ligand complexes. In an analysis of the cancer-related BRD4 domain, we found that up to 29% of protein crystal structures bound with drug-like molecules present evidence of unmodeled, averaged, relatively isoenergetic conformations in ligand-receptor interactions. In many retrospective cases, these alternate conformations were adventitiously exploited to guide compound design, resulting in improved potency or selectivity. Combining qFit-ligand with high-throughput screening or multitemperature crystallography could therefore augment the structure-based drug design toolbox.


Computational Biology/methods , Crystallography, X-Ray , Models, Molecular , Proteins/chemistry , Algorithms , Amyloid Precursor Protein Secretases/antagonists & inhibitors , Amyloid Precursor Protein Secretases/chemistry , Amyloid Precursor Protein Secretases/metabolism , Aspartic Acid Endopeptidases/antagonists & inhibitors , Aspartic Acid Endopeptidases/chemistry , Aspartic Acid Endopeptidases/metabolism , Calibration , Cell Cycle Proteins , Databases, Protein , Drug Design , Electrons , High-Throughput Screening Assays/methods , Ligands , Nuclear Proteins/chemistry , Protein Domains , Proteins/metabolism , Transcription Factors/chemistry
6.
Bioinformatics ; 34(13): 2219-2227, 2018 07 01.
Article En | MEDLINE | ID: mdl-29462243

Motivation: Recent advances in co-evolution techniques have made possible the accurate prediction of protein structures in the absence of a template. Here, we provide a general approach that further utilizes co-evolution constraints to generate better fragment libraries for fragment-based protein structure prediction. Results: We have compared five different fragment library generation programmes on three different datasets encompassing over 400 unique protein folds. We show that considering the secondary structure of the fragments when assembling these libraries provides a critical way to assess their usefulness to structure prediction. We then use co-evolution constraints to improve the fragment libraries by enriching them with fragments that satisfy constraints and discarding those that do not. These improved libraries have better precision and lead to consistently better modelling results. Availability and implementation: Data is available for download from: http://opig.stats.ox.ac.uk/resources. Flib-Coevo is available for download from: https://github.com/sauloho/Flib-Coevo. Supplementary information: Supplementary data are available at Bioinformatics online.


Computational Biology/methods , Protein Structure, Secondary , Software , Algorithms , Peptide Library
7.
Nucleic Acids Res ; 46(D1): D406-D412, 2018 01 04.
Article En | MEDLINE | ID: mdl-29087479

The Structural T-cell Receptor Database (STCRDab; http://opig.stats.ox.ac.uk/webapps/stcrdab) is an online resource that automatically collects and curates TCR structural data from the Protein Data Bank. For each entry, the database provides annotations, such as the α/β or γ/δ chain pairings, major histocompatibility complex details, and where available, antigen binding affinities. In addition, the orientation between the variable domains and the canonical forms of the complementarity-determining region loops are also provided. Users can select, view, and download individual or bulk sets of structures based on these criteria. Where available, STCRDab also finds antibody structures that are similar to TCRs, helping users explore the relationship between TCRs and antibodies.


Antigens/chemistry , Complementarity Determining Regions/chemistry , Databases, Protein , Receptors, Antigen, T-Cell/chemistry , Software , Amino Acid Sequence , Antigens/immunology , Antigens/metabolism , Binding Sites , Complementarity Determining Regions/metabolism , Humans , Internet , Major Histocompatibility Complex/genetics , Major Histocompatibility Complex/immunology , Models, Molecular , Molecular Sequence Annotation , Protein Binding , Protein Conformation, alpha-Helical , Protein Conformation, beta-Strand , Protein Interaction Domains and Motifs , Receptors, Antigen, T-Cell/immunology , Receptors, Antigen, T-Cell/metabolism , Sequence Alignment , Sequence Homology, Amino Acid , T-Lymphocytes/cytology , T-Lymphocytes/immunology
8.
Bioinformatics ; 34(7): 1132-1140, 2018 04 01.
Article En | MEDLINE | ID: mdl-29136098

Motivation: Most current de novo structure prediction methods randomly sample protein conformations and thus require large amounts of computational resource. Here, we consider a sequential sampling strategy, building on ideas from recent experimental work which shows that many proteins fold cotranslationally. Results: We have investigated whether a pseudo-greedy search approach, which begins sequentially from one of the termini, can improve the performance and accuracy of de novo protein structure prediction. We observed that our sequential approach converges when fewer than 20 000 decoys have been produced, fewer than commonly expected. Using our software, SAINT2, we also compared the run time and quality of models produced in a sequential fashion against a standard, non-sequential approach. Sequential prediction produces an individual decoy 1.5-2.5 times faster than non-sequential prediction. When considering the quality of the best model, sequential prediction led to a better model being produced for 31 out of 41 soluble protein validation cases and for 18 out of 24 transmembrane protein cases. Correct models (TM-Score > 0.5) were produced for 29 of these cases by the sequential mode and for only 22 by the non-sequential mode. Our comparison reveals that a sequential search strategy can be used to drastically reduce computational time of de novo protein structure prediction and improve accuracy. Availability and implementation: Data are available for download from: http://opig.stats.ox.ac.uk/resources. SAINT2 is available for download from: https://github.com/sauloho/SAINT2. Contact: saulo.deoliveira@dtc.ox.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.


Computational Biology/methods , Protein Conformation , Sequence Analysis, Protein/methods , Software , Algorithms , Animals , Caspase 12/chemistry , Caspase 12/metabolism , Humans
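The abstract above counts a model as correct when its TM-score exceeds 0.5. For orientation, the sketch below evaluates the standard TM-score sum for a fixed superposition, given per-residue distances between aligned model and native positions. It is an illustration under assumptions: the real TM-score maximises this sum over superpositions, which is omitted here, and the lower bound of 0.5 Å on d0 for very short chains is a guard added for this sketch. The function names are hypothetical and unrelated to SAINT2.

```python
def d0(target_length):
    """Length-dependent distance scale of the TM-score, in angstroms."""
    if target_length <= 15:
        return 0.5  # guard for very short chains (assumption for this sketch)
    return max(1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8, 0.5)


def tm_score(distances, target_length):
    """TM-score sum for one fixed superposition.

    distances: per-residue distances (angstroms) for the aligned residues.
    target_length: number of residues in the native (target) structure.
    """
    scale = d0(target_length)
    total = sum(1.0 / (1.0 + (d / scale) ** 2) for d in distances)
    return total / target_length


def is_correct_fold(score, cutoff=0.5):
    """Apply the TM-score > 0.5 criterion quoted in the abstract."""
    return score > cutoff


# A perfect superposition of all 100 residues scores 1.0.
print(tm_score([0.0] * 100, 100))
```

Because the sum is normalised by the target length, residues that are not aligned simply contribute nothing, so a partial model is penalised rather than ignored.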
9.
PLoS One ; 10(4): e0123998, 2015.
Article En | MEDLINE | ID: mdl-25901595

Fragment-based approaches are the current standard for de novo protein structure prediction. These approaches rely on accurate and reliable fragment libraries to generate good structural models. In this work, we describe a novel method for structure fragment library generation and its application in fragment-based de novo protein structure prediction. The importance of correct testing procedures in assessing the quality of fragment libraries is demonstrated. In particular, homologs to the target must be excluded from the libraries to correctly simulate a de novo protein structure prediction scenario, something which surprisingly is not always done. We demonstrate that fragments presenting different predominant predicted secondary structures should be treated differently during the fragment library generation step and that exhaustive and random search strategies should both be used. This information was used to develop a novel method, Flib. On a validation set of 41 structurally diverse proteins, Flib libraries present both higher precision and higher coverage than two state-of-the-art methods, NNMake and HHFrag. Flib also achieves better precision and coverage on the set of 275 protein domains used in the two previous experiments of the Critical Assessment of Structure Prediction (CASP9 and CASP10). We compared Flib libraries against NNMake libraries in a structure prediction context. Of the 13 cases in which a correct answer was generated, Flib models were more accurate than NNMake models for 10. Flib is available for download at: http://www.stats.ox.ac.uk/research/proteins/resources.


Computational Biology/methods , Databases, Protein , Peptide Fragments/chemistry , Proteins/chemistry , Amino Acid Sequence , Models, Molecular , Molecular Sequence Data , Protein Structure, Secondary , Sequence Homology, Amino Acid , Software
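The abstract above compares fragment libraries by precision and coverage. A minimal sketch of how those two quantities are commonly computed follows, assuming that a fragment counts as "good" when its RMSD to the native structure falls below a cutoff, that precision is the fraction of good fragments in the library, and that coverage is the fraction of target residues spanned by at least one good fragment. The 1.5 Å default, the `Fragment` type, and the toy library are assumptions made for illustration, not values confirmed by the abstract.

```python
from typing import NamedTuple


class Fragment(NamedTuple):
    start: int    # 0-based index of the first target residue covered
    length: int   # number of residues in the fragment
    rmsd: float   # RMSD to the native structure, in angstroms


def library_quality(fragments, target_length, rmsd_cutoff=1.5):
    """Return (precision, coverage) for a fragment library.

    rmsd_cutoff is an assumed threshold for calling a fragment "good".
    """
    good = [f for f in fragments if f.rmsd < rmsd_cutoff]
    precision = len(good) / len(fragments) if fragments else 0.0

    # Collect every target position spanned by at least one good fragment.
    covered = set()
    for f in good:
        covered.update(range(f.start, f.start + f.length))
    covered &= set(range(target_length))  # ignore positions past the chain end
    coverage = len(covered) / target_length if target_length else 0.0
    return precision, coverage


# Toy library for a 10-residue target, invented for illustration.
library = [
    Fragment(0, 3, 1.0),
    Fragment(2, 3, 2.0),
    Fragment(5, 3, 0.5),
    Fragment(6, 3, 3.0),
]
print(library_quality(library, target_length=10))
```

The two numbers pull in different directions: adding many fragments tends to raise coverage while diluting precision, which is why the abstract reports both.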
...