Search | Secretaria de Estado da Saúde

Real value prediction of protein solvent accessibility using enhanced PSSM features.

Chang, Darby Tien-Hao; Huang, Hsuan-Yu; Syu, Yu-Tang; Wu, Chih-Peng.

BMC Bioinformatics ; 9 Suppl 12: S12, 2008 Dec 12.

Article in English | MEDLINE | ID: mdl-19091011

ABSTRACT

BACKGROUND: Prediction of protein solvent accessibility, also called accessible surface area (ASA) prediction, is an important step for tertiary structure prediction directly from one-dimensional sequences. Traditionally, predicting solvent accessibility is regarded as either a two-state (exposed or buried) or three-state (exposed, intermediate or buried) classification problem. However, the states of solvent accessibility are not well-defined in real protein structures. Thus, a number of methods have been developed to directly predict the real-value ASA based on evolutionary information such as the position-specific scoring matrix (PSSM). RESULTS: This study enhances the PSSM-based features for real-value ASA prediction by considering the physicochemical properties and solvent propensities of amino acid types. We propose a systematic method for identifying residue groups with respect to protein solvent accessibility. The amino acid columns in the PSSM profile that belong to a certain residue group are merged to generate novel features. Finally, support vector regression (SVR) is adopted to construct a real-value ASA predictor. Experimental results demonstrate that the features produced by the proposed selection process are informative for ASA prediction. CONCLUSION: Experimental results based on a widely used benchmark reveal that the proposed method performs best among several existing packages for ASA prediction. Furthermore, the feature selection mechanism incorporated in this study can be applied to other regression problems using the PSSM. The program and data are available from the authors upon request.
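The column-merging step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not say how columns are merged (summation is used below), the two residue groups are placeholders rather than the groups the authors identified, the 20-letter column order is the one commonly seen in PSI-BLAST output, and the function names `merge_group_columns` and `window_features` are hypothetical.

```python
# Assumed column order (commonly used in PSI-BLAST PSSM output).
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# Placeholder groups for illustration only; the paper derives its
# residue groups systematically, and they are not given in the abstract.
EXAMPLE_GROUPS = {
    "hydrophobic": "AVLIMFW",
    "charged": "DEKR",
}


def merge_group_columns(pssm, groups=EXAMPLE_GROUPS, order=AMINO_ACIDS):
    """Append one merged (summed) column per residue group to each PSSM row."""
    index = {aa: i for i, aa in enumerate(order)}
    enhanced = []
    for row in pssm:
        if len(row) != len(order):
            raise ValueError("each PSSM row needs one score per amino acid")
        # Summation is an assumption; the abstract only says "merged".
        merged = [sum(row[index[aa]] for aa in members)
                  for members in groups.values()]
        enhanced.append(list(row) + merged)
    return enhanced


def window_features(rows, center, half_width=1):
    """Concatenate the rows in a window around `center`, zero-padding the ends."""
    width = len(rows[0])
    features = []
    for pos in range(center - half_width, center + half_width + 1):
        if 0 <= pos < len(rows):
            features.extend(rows[pos])
        else:
            features.extend([0] * width)
    return features
```

Feature vectors built this way, one per residue, could then be passed to any support vector regression implementation along with the known ASA values as targets; the choice of window width and regression settings is left open here because the abstract does not state them.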

Subjects

Computational Biology/methods; Databases, Protein; Proteins/chemistry; Solvents/chemistry; Algorithms; Amino Acids/chemistry; Chemistry, Physical/methods; Models, Statistical; Protein Conformation; Regression Analysis; Reproducibility of Results; Sequence Analysis, Protein; Software; Surface Properties

Decoy Database Improvement for Protein Folding.

Yeh, Hsin-Yi Cindy; Lindsey, Aaron; Wu, Chih-Peng; Thomas, Shawna; Amato, Nancy M.

J Comput Biol ; 22(9): 823-36, 2015 Sep.

Article in English | MEDLINE | ID: mdl-26258648

ABSTRACT

Predicting protein structures and simulating protein folding are two of the most important problems in computational biology today. Simulation methods rely on a scoring function to distinguish the native structure (the most energetically stable) from non-native structures. Decoy databases are collections of non-native structures used to test and verify these functions. We present a method to evaluate and improve the quality of decoy databases by adding novel structures and removing redundant structures. We test our approach on 20 different decoy databases of varying size and type and show significant improvement across a variety of metrics. We also test our improved databases on two popular modern scoring functions and show that for most cases they contain a greater or equal number of native-like structures than the original databases, thereby producing a more rigorous database for testing scoring functions.
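As a rough illustration of what removing redundant structures from a decoy set might look like, here is a minimal sketch. It is not the authors' method: the abstract does not name the similarity measure or the selection strategy, so the use of RMSD over pre-aligned coordinates, the greedy filtering, and the names `rmsd` and `remove_redundant` are all assumptions.

```python
import math


def rmsd(a, b):
    """Root-mean-square deviation between two equal-length coordinate lists.

    Assumes the structures are already superimposed; no alignment is done.
    """
    if len(a) != len(b) or not a:
        raise ValueError("structures must be non-empty and the same length")
    total = sum((p - q) ** 2
                for u, v in zip(a, b)
                for p, q in zip(u, v))
    return math.sqrt(total / len(a))


def remove_redundant(decoys, threshold):
    """Greedily keep a decoy only if it is at least `threshold` away from
    every decoy kept so far (hypothetical redundancy criterion)."""
    kept = []
    for decoy in decoys:
        if all(rmsd(decoy, other) >= threshold for other in kept):
            kept.append(decoy)
    return kept
```

With a greedy filter like this the result depends on the order in which decoys are visited, so a real pipeline would need to decide on an ordering (for example, by score) before filtering.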

Subjects

Computational Biology/methods; Databases, Protein; Protein Folding; Proteins/chemistry; Algorithms; Computer Simulation; Protein Conformation
