Influence of Feature Encoding and Choice of Classifier on Disease Risk Prediction in Genome-Wide Association Studies.
Mittag, Florian; Römer, Michael; Zell, Andreas.
Affiliation
  • Mittag F; Cognitive Systems Group, University of Tübingen, Tübingen, Germany.
  • Römer M; Cognitive Systems Group, University of Tübingen, Tübingen, Germany.
  • Zell A; Cognitive Systems Group, University of Tübingen, Tübingen, Germany.
PLoS One ; 10(8): e0135832, 2015.
Article in English | MEDLINE | ID: mdl-26285210
Various attempts have been made to predict individual disease risk based on genotype data from genome-wide association studies (GWAS). However, most studies investigated only one or two classification algorithms and feature encoding schemes. In this study, we applied seven different classification algorithms to GWAS case-control data sets for seven different diseases to create models for disease risk prediction. Further, we used three different encoding schemes for the genotypes of single nucleotide polymorphisms (SNPs) and investigated their influence on the predictive performance of these models. Our study suggests that an additive encoding of the SNP data should be the preferred encoding scheme, as it yielded the best predictive performance for all algorithms and data sets. Furthermore, our results showed that the differences between most state-of-the-art classification algorithms are not statistically significant. Consequently, we recommend preferring algorithms with simple models, such as the linear support vector machine (SVM), as they allow for better subsequent interpretation without significant loss of accuracy.
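As a rough illustration of what an additive encoding of SNP genotypes can look like, the sketch below maps each two-allele genotype to the number of minor-allele copies it carries (0, 1, or 2). The abstract does not name the other two encoding schemes, so the one-hot variant shown for contrast is an assumption, as are the function names, the genotype string format, and the sample data; none of this is taken from the paper.

```python
from typing import List


def additive_encode(genotype: str, minor_allele: str) -> int:
    """Count copies of the minor allele in a two-letter genotype (0, 1, or 2)."""
    if len(genotype) != 2:
        raise ValueError(f"expected a two-allele genotype, got {genotype!r}")
    return sum(1 for allele in genotype if allele == minor_allele)


def one_hot_encode(genotype: str, minor_allele: str) -> List[int]:
    """Hypothetical categorical contrast: one indicator per genotype class.

    This is an assumed alternative, not a scheme named in the abstract.
    """
    count = additive_encode(genotype, minor_allele)
    return [1 if count == k else 0 for k in range(3)]


# Made-up genotypes for a single SNP whose minor allele is assumed to be "G".
genotypes = ["AA", "AG", "GG", "GA"]
additive = [additive_encode(g, "G") for g in genotypes]
one_hot = [one_hot_encode(g, "G") for g in genotypes]

print(additive)
print(one_hot)
```

With the additive form, each SNP contributes a single numeric feature, so a linear model assigns it one weight per SNP; the one-hot form triples the feature count.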

Full text: 1 Databases: MEDLINE Main subject: Disease / Computational Biology / Genome-Wide Association Study Study type: Etiology_studies / Prognostic_studies / Risk_factors_studies Limit: Humans Language: En Journal: PLoS One Journal subject: SCIENCE / MEDICINE Year: 2015 Document type: Article Affiliation country: Germany
