1.
BMC Bioinformatics ; 16: 162, 2015 May 16.
Article in English | MEDLINE | ID: mdl-25982853

ABSTRACT

BACKGROUND: The exponential growth of protein structural and sequence databases is enabling multifaceted approaches to understanding the long-sought sequence-structure-function relationship. Advances in computation now make it possible to apply well-established data mining and pattern recognition techniques to these data to learn models that effectively relate structure and function. However, extracting meaningful numerical descriptors of protein sequence and structure is a key issue that requires an efficient and widely available solution. RESULTS: Here we introduce ProtDCal, a new computational software suite capable of generating tens of thousands of features considering both sequence-based and 3D-structural descriptors. We demonstrate, by means of principal component analysis and Shannon entropy tests, how ProtDCal's sequence-based descriptors provide new and more relevant information not encoded by currently available servers for sequence-based protein feature generation. The wide diversity of the 3D-structure-based features generated by ProtDCal is shown to provide additional complementary information and effectively completes its general protein encoding capability. As a demonstration of the utility of ProtDCal's features, prediction models of N-linked glycosylation sites are trained and evaluated. Classification performance compares favourably with that of contemporary predictors of N-linked glycosylation sites, in spite of not using domain-specific features as input information. CONCLUSIONS: ProtDCal provides a friendly and cross-platform graphical user interface, developed in the Java programming language, and is freely available at http://bioinf.sce.carleton.ca/ProtDCal/. ProtDCal introduces local and group-based encoding, which enhances the diversity of the information captured by the computed features. Furthermore, we have shown that adding structure-based descriptors contributes non-redundant additional information to the feature-based characterization of polypeptide systems. This software is intended to provide a useful tool for general-purpose encoding of protein sequences and structures for applications in protein classification, similarity analyses and function prediction.


Subject(s)
Protein Processing, Post-Translational; Protein Structure, Secondary; Proteins/chemistry; Proteins/classification; Software; Glycosylation; Humans; Principal Component Analysis
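
The abstract above mentions Shannon entropy tests for judging how much information a descriptor carries. As a rough illustration only, the sketch below computes the Shannon entropy of one numerical descriptor after equal-width binning. The function name `shannon_entropy`, the binning scheme, and the default of 10 bins are assumptions made for this example; the abstract does not say how ProtDCal's entropy tests were set up.

```python
import math
from collections import Counter


def shannon_entropy(values, n_bins=10):
    """Shannon entropy (in bits) of a descriptor after equal-width binning.

    Hypothetical helper for illustration; not part of ProtDCal.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant descriptor falls into a single bin and carries no information.
        return 0.0
    width = (hi - lo) / n_bins
    # Map each value to a bin index; the maximum value is clamped into the last bin.
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
    counts = Counter(bins)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Under this scheme a descriptor whose values spread evenly across the bins approaches the maximum of log2(n_bins) bits, while a nearly constant descriptor scores close to zero.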
2.
J Theor Biol ; 321: 44-53, 2013 Mar 21.
Article in English | MEDLINE | ID: mdl-23313334

ABSTRACT

The principles governing protein folding stand as one of the biggest challenges of biophysics. Modeling the global stability of proteins and predicting their tertiary structure are hard tasks, due in part to the variety and large number of forces involved and the difficulty of describing them with sufficient accuracy. We have developed a fast, physics-based empirical potential, intended to be used in global structure prediction methods. This model considers four main contributions: two entropic factors, the hydrophobic effect and configurational entropy, and two terms resulting from a decomposition of close-packing interactions, namely the balance of the dispersive interactions of folded and unfolded states and electrostatic interactions between residues. The parameters of the model were fixed from a protein data set whose unfolding free energy has been measured at the "standard" experimental conditions proposed by Maxwell et al. (2005) and a large data set of 1151 monomeric proteins obtained from the PDB. A blind test with proteins taken from the ProTherm database, at similar experimental conditions, was carried out. We found a good correlation with the test data set, proving the effectiveness of our model for predicting protein folding free energies under the standard conditions considered. Such a prediction compares favorably against estimations made with FoldX's function and the force field GROMOS96. This model constitutes a valuable tool for the fast evaluation of protein structure stability in 3D structure prediction methods.


Subject(s)
Protein Folding; Proteins/chemistry; Algorithms; Databases, Protein; Linear Models; Models, Statistical; Protein Structure, Tertiary; Reproducibility of Results; Software; Thermodynamics
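
The abstract above describes a potential built from four contributions and a blind test judged by correlation with measured data. The sketch below is a minimal illustration that assumes, for the sake of the example, that the four contributions combine as a weighted sum; the abstract does not state the functional form. The names `predicted_free_energy` and `pearson_r`, the term ordering, and the weights are hypothetical and do not come from the paper.

```python
import math


def predicted_free_energy(terms, weights):
    """Weighted sum of the four contributions for one protein.

    `terms` and `weights` are 4-element sequences in the assumed order:
    hydrophobic effect, configurational entropy, dispersive balance,
    electrostatics. The linear form is an assumption for illustration.
    """
    if len(terms) != 4 or len(weights) != 4:
        raise ValueError("expected exactly four terms and four weights")
    return sum(w * t for w, t in zip(weights, terms))


def pearson_r(xs, ys):
    """Pearson correlation between predicted and measured values."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

In a blind test of this kind, one would compute `predicted_free_energy` for each held-out protein and pass the predicted and measured values to `pearson_r`.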