Exploring an alignment free approach for protein classification and structural class prediction.
Deschavanne, P; Tufféry, P.
Affiliation
  • Deschavanne P; Equipe de Bioinformatique Génomique et Moléculaire, INSERM UMR-S 726, Université Paris 7, 75251 Paris Cedex 05, France.
Biochimie ; 90(4): 615-25, 2008 Apr.
Article in En | MEDLINE | ID: mdl-18067866
ABSTRACT
Alignment-free methods based on Chaos Game Representation (CGR), also known as sequence signature approaches, have proven of great interest for DNA sequence analysis. They have been successfully applied to sequence comparison, phylogeny, detection of horizontal transfers, and extraction of representative motifs in regulatory sequences. Transposing such methods to proteins poses several fundamental questions related to the dimensionality of the representation space. Several studies have tackled these points, but none has so far brought the application of CGRs to proteins to its full expected potential. Yet several studies have shown that techniques based on n-peptide frequencies can be relevant for proteins. Here, we investigate the effectiveness of a strategy based on the CGR approach using a fixed reverse encoding of amino acids into nucleic acid sequences. We first explore its relevance to protein classification into functional families. We then attempt to apply it to the prediction of protein structural classes. Our results suggest that the reverse encoding approach could be relevant in both cases. We show that it is able to classify functional families of proteins by extracting signatures close to the ProSite patterns. Applied to structural classification, the approach reaches correct classification scores close to 84%, i.e. close to the scores of related methods in the field. Various optimizations of the approach are still possible, which opens the door to future applications.
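The reverse-encoding idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the codon chosen for each amino acid in REVERSE_CODON, the word length k, the Euclidean distance, and the function names reverse_encode, signature and distance are all assumptions made for illustration, since the abstract does not specify them. The sketch maps each amino acid to one fixed codon of the standard genetic code, then computes the k-mer frequency vector of the resulting nucleotide sequence, which is the quantity a CGR image of resolution k summarizes.

```python
from collections import Counter
from itertools import product
from math import sqrt

# ASSUMPTION: one fixed codon per amino acid, picked from the standard
# genetic code. The encoding actually used in the paper is not given
# in the abstract.
REVERSE_CODON = {
    "A": "GCT", "R": "CGT", "N": "AAT", "D": "GAT", "C": "TGT",
    "Q": "CAA", "E": "GAA", "G": "GGT", "H": "CAT", "I": "ATT",
    "L": "CTT", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCT",
    "S": "TCT", "T": "ACT", "W": "TGG", "Y": "TAT", "V": "GTT",
}


def reverse_encode(protein):
    """Translate a protein string into DNA, one fixed codon per residue."""
    return "".join(REVERSE_CODON[aa] for aa in protein.upper())


def signature(dna, k=3):
    """Return the relative frequency of every k-letter word over ACGT."""
    windows = [dna[i:i + k] for i in range(len(dna) - k + 1)]
    counts = Counter(windows)
    total = len(windows)
    words = ("".join(w) for w in product("ACGT", repeat=k))
    return {w: (counts[w] / total if total else 0.0) for w in words}


def distance(sig_a, sig_b):
    """Euclidean distance between two signatures of the same word length."""
    return sqrt(sum((sig_a[w] - sig_b[w]) ** 2 for w in sig_a))


if __name__ == "__main__":
    # Two made-up peptide fragments, used only to exercise the functions.
    sig_1 = signature(reverse_encode("MKTAYIAKQR"), k=3)
    sig_2 = signature(reverse_encode("MKTAYLAKQR"), k=3)
    print(distance(sig_1, sig_2))
```

Signatures computed this way could be fed to any standard classifier or clustering routine; which one the authors used is not stated in the abstract.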
Subject(s)
Collection: 01-internacional Database: MEDLINE Main subject: Protein Conformation / Software / Pattern Recognition, Automated / Proteins / Sequence Analysis, Protein Type of study: Prognostic_studies / Risk_factors_studies Language: En Journal: Biochimie Year: 2008 Document type: Article Affiliation country: France
...