1.
Anal Chim Acta ; 644(1-2): 10-6, 2009 Jun 30.
Article in English | MEDLINE | ID: mdl-19463555

ABSTRACT

Three machine learning algorithms, least-squares support vector machine (LSSVM), random forest (RF) and Gaussian process (GP), were used to model the quantitative structure-retention relationship (QSRR) for predicting and explaining the retention behavior of proteome-wide peptides in reversed-phase liquid chromatography. Peptides were parameterized using the CODESSA approach, yielding 145 descriptors per peptide that cover constitutional, topological, geometrical and physicochemical properties. On this basis, the nonlinear LSSVM, RF and GP methods, together with a sophisticated linear method, partial least-squares (PLS) regression, were employed in QSRR model development. The stability and predictive power of the constructed models were confirmed by a series of systematic validations: internal cross-validation, an external test set and Monte Carlo cross-validation. The results show that regression models developed using the nonlinear approaches (LSSVM, RF and GP) predict better than linear PLS models. Considering that the retention times used in this work were measured on different columns and thus carry a relatively large uncertainty (reproducibility within 7%), the best statistics, obtained from GP modeling, are satisfactory, with coefficients of determination (R²) of 0.894 for the training set and 0.866 for the test set.
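
As an illustration of the validation idea mentioned in the abstract, the following is a minimal sketch of Monte Carlo cross-validation scored with the coefficient of determination. It is not the authors' procedure: the single descriptor, the synthetic data, the plain least-squares line standing in for the LSSVM/RF/GP/PLS models, and the names `fit_line`, `r_squared` and `monte_carlo_cv` are all assumptions made for this example.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x for a single descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def monte_carlo_cv(xs, ys, n_splits=100, test_fraction=0.3, seed=0):
    """Repeat random train/test splits and return the test R^2 of each split."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    n_test = max(2, int(round(test_fraction * len(xs))))
    scores = []
    for _ in range(n_splits):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        a, b = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        preds = [a + b * xs[i] for i in test_idx]
        scores.append(r_squared([ys[i] for i in test_idx], preds))
    return scores

# Synthetic stand-in data (hypothetical): one descriptor and a noisy linear response.
data_rng = random.Random(42)
hydrophobicity = [data_rng.uniform(0.0, 10.0) for _ in range(60)]
retention = [5.0 + 2.0 * h + data_rng.gauss(0.0, 1.0) for h in hydrophobicity]

scores = monte_carlo_cv(hydrophobicity, retention)
mean_r2 = sum(scores) / len(scores)
print(f"mean test R^2 over {len(scores)} random splits: {mean_r2:.3f}")
```

The spread of `scores` across splits, not just their mean, is what gives a sense of model stability; a real QSRR study would substitute the actual descriptor matrix and regression models.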


Subjects
Artificial Intelligence , Liquid Chromatography , Drosophila Proteins/chemistry , Peptides/chemistry , Proteome , Amino Acid Sequence , Animals , Drosophila Proteins/analysis , Drosophila melanogaster/chemistry , Least-Squares Analysis , Monte Carlo Method , Peptides/analysis , Predictive Value of Tests , Quantitative Structure-Activity Relationship , Time Factors
2.
J Chromatogr A ; 1216(15): 3107-16, 2009 Apr 10.
Article in English | MEDLINE | ID: mdl-19232620

ABSTRACT

In this study, we propose a new peptide characterization method that accounts for both the amino acid composition and the local environment of each residue. Using this approach, the structural characteristics of peptides derived from the Escherichia coli proteome were parameterized. On this basis, the performance of eight statistical modelling methods was rigorously validated and comprehensively compared by applying them to model the relationship between sequence structure and retention for 816 experimentally measured peptides and to predict normalized retention times for 121,273 unmeasured peptides in liquid chromatography. The results show that regression models constructed with nonlinear approaches are more robust and more predictive, but more time-consuming, than those constructed with linear ones. Among these methods, Gaussian process and back-propagation neural network show the best stability, unbiasedness and predictive power, and can therefore be used to model peptide structure-retention relationships accurately. Multiple linear regression and partial least squares regression perform worse than the nonlinear techniques but are computationally efficient, which makes them promising candidates for qualitative problems involving massive data. In addition, by investigating descriptor importance in the different models, we found that amino acid composition correlates significantly and linearly with peptide retention time, whereas the residue environment correlates mainly nonlinearly with peptide retention. The polar Arg and strongly hydrophobic amino acids such as Leu, Ile, Phe, Trp and Val are the critical factors influencing peptide retention behavior.
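
To make the amino acid composition part of the characterization concrete, here is a minimal sketch that turns a peptide sequence into a 20-element composition descriptor. The abstract does not define the descriptors in detail, so this is an assumed, generic formulation (fraction of each standard residue); the residue local-environment descriptors are not attempted here, and the function name `composition` and the example peptide are hypothetical.

```python
# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(peptide):
    """Return the fraction of each standard amino acid in a peptide sequence."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide sequence is empty")
    unknown = set(peptide) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return {aa: peptide.count(aa) / len(peptide) for aa in AMINO_ACIDS}

# Hypothetical peptide rich in hydrophobic residues (Leu, Ile, Phe) plus one Arg.
comp = composition("LLIFR")
print({aa: frac for aa, frac in comp.items() if frac > 0})
```

A descriptor of this kind is length-normalized, so peptides of different lengths map to vectors of the same size and can feed directly into a linear or nonlinear regression model.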


Subjects
Liquid Chromatography/methods , Escherichia coli Proteins/metabolism , Statistical Models , Peptide Fragments/analysis , Quantitative Structure-Activity Relationship , Amino Acids/analysis , Escherichia coli , Least-Squares Analysis , Linear Models , Monte Carlo Method , Nonlinear Dynamics , Normal Distribution , Proteome