Predicting DNA-binding sites of proteins based on sequential and 3D structural information.
Mol Genet Genomics
; 289(3): 489-99, 2014 Jun.
Article
in English
| MEDLINE
| ID: mdl-24448651
ABSTRACT
Protein-DNA interactions play important roles in many biological processes. To understand the molecular mechanisms of protein-DNA interaction, it is necessary to identify the DNA-binding sites in DNA-binding proteins. In the last decade, computational approaches have been developed to predict protein-DNA-binding sites based solely on protein sequences. In this study, we developed a novel predictor based on a support vector machine algorithm coupled with the maximum relevance minimum redundancy method followed by incremental feature selection. To predict protein-DNA interaction sites, we incorporated not only features of physicochemical/biochemical properties, sequence conservation, residual disorder, secondary structure, and solvent accessibility, but also five three-dimensional (3D) structural features calculated from PDB data. Feature analysis showed that the 3D structural features indeed contributed to the prediction of DNA-binding sites, and prediction performance was better with them than without them. Analysis of the features from each site also showed that the features of the DNA-binding site itself contribute the most to the prediction. Our prediction method may become a useful tool for identifying DNA-binding sites, and the feature analysis described in this paper may provide useful insights for in-depth investigations into the mechanisms of protein-DNA interaction.
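The pipeline named in the abstract (maximum relevance minimum redundancy ranking followed by incremental feature selection) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes discrete feature values, uses the difference form of the mRMR criterion, and scores each candidate subset with a leave-one-out 1-nearest-neighbour classifier as a stand-in for the support vector machine, so that the example needs only the Python standard library. All function names and the toy data are hypothetical and chosen for illustration only.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_rank(columns, labels):
    """Greedy mRMR ordering: relevance to the labels minus the mean
    redundancy with the features already selected."""
    relevance = [mutual_information(col, labels) for col in columns]
    remaining = list(range(len(columns)))
    ranked = []

    def score(j):
        if not ranked:
            return relevance[j]
        redundancy = sum(
            mutual_information(columns[j], columns[k]) for k in ranked
        ) / len(ranked)
        return relevance[j] - redundancy

    while remaining:
        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked


def incremental_feature_selection(ranked, evaluate):
    """Evaluate the top-1, top-2, ... prefixes of the ranking and
    return the best-scoring prefix together with its score."""
    best_subset, best_score = [], float("-inf")
    for k in range(1, len(ranked) + 1):
        subset = ranked[:k]
        current = evaluate(subset)
        if current > best_score:
            best_subset, best_score = subset, current
    return best_subset, best_score


def loo_accuracy(rows, labels, subset):
    """Stand-in evaluator (assumption: replaces the SVM): leave-one-out
    accuracy of a 1-nearest-neighbour classifier using Hamming distance
    on the chosen feature subset."""
    correct = 0
    for i, row in enumerate(rows):
        others = [j for j in range(len(rows)) if j != i]
        nearest = min(
            others, key=lambda j: sum(row[f] != rows[j][f] for f in subset)
        )
        correct += labels[nearest] == labels[i]
    return correct / len(rows)


# Toy data, invented for illustration (1 = binding residue, 0 = non-binding).
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
columns = [
    [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],  # feature 0: informative, one error
    [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],  # feature 1: exact copy of feature 0
    [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1],  # feature 2: unrelated to the labels
    [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0],  # feature 3: flags the residue feature 0 misses
]
rows = list(zip(*columns))

ranked = mrmr_rank(columns, labels)
best_subset, best_score = incremental_feature_selection(
    ranked, lambda subset: loo_accuracy(rows, labels, subset)
)
print("mRMR order:", ranked)
print("selected subset:", best_subset, "score:", best_score)
```

In this sketch the redundancy penalty pushes the duplicated feature behind the complementary one, which is the behaviour that motivates mRMR over ranking by relevance alone. Replacing `loo_accuracy` with a cross-validated SVM score would bring the loop closer to the setup the abstract describes.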
Full text:
1
Databases:
MEDLINE
Main subject:
Binding Sites
/
DNA
/
Computational Biology
/
DNA-Binding Proteins
/
Support Vector Machine
Type of study:
Prognostic_studies
/
Risk_factors_studies
Language:
En
Journal:
Mol Genet Genomics
Journal subject:
MOLECULAR BIOLOGY
/
GENETICS
Publication year:
2014
Document type:
Article