Biomarker selection and classification of "-omics" data using a two-step Bayes classification framework.

Assawamakin, Anunchai; Prueksaaroon, Supakit; Kulawonganunchai, Supasak; Shaw, Philip James; Varavithya, Vara; Ruangrajitpakorn, Taneth; Tongsima, Sissades

Affiliation

Assawamakin A; Department of Pharmacology, Faculty of Pharmacy, Mahidol University, 447 Sri-Ayuthaya Road, Rajathevi, Bangkok 10400, Thailand.

Biomed Res Int ; 2013: 148014, 2013.

Article in English | MEDLINE | ID: mdl-24106694

ABSTRACT

Identification of suitable biomarkers for accurate prediction of phenotypic outcomes is a goal for personalized medicine. However, current machine learning approaches are either too complex or perform poorly. Here, a novel two-step machine-learning framework is presented to address this need. First, a Naïve Bayes estimator is used to rank features, so that the top-ranked features are the ones most likely to be informative for predicting the underlying biological classes. The top-ranked features are then used in a Hidden Naïve Bayes classifier to construct a classification prediction model from these filtered attributes. To obtain the minimum set of the most informative biomarkers, the bottom-ranked features are successively removed from the Naïve Bayes-filtered feature list one at a time, and the classification accuracy of the Hidden Naïve Bayes classifier is checked for each pruned feature set. The performance of the proposed two-step Bayes classification framework was tested on different types of -omics datasets, including gene expression microarray, single nucleotide polymorphism microarray (SNParray), and surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) proteomic data. The proposed framework matched and, in some cases, outperformed other classification methods in terms of prediction accuracy, minimum number of classification markers, and computational time.
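The rank-then-prune procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not specify: features are ranked by the leave-one-out accuracy of a single-feature Gaussian Naïve Bayes model, a plain Gaussian Naïve Bayes classifier stands in for the Hidden Naïve Bayes classifier in the second step, and the smallest pruned set that does not lose accuracy is kept. The function names (`fit_nb`, `predict_nb`, `loo_accuracy`, `two_step_select`) and the toy data are hypothetical.

```python
import math
from statistics import mean, pvariance

def fit_nb(X, y, features):
    """Fit class priors and per-feature Gaussian parameters (assumed model)."""
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        # A small constant keeps the variance strictly positive.
        stats = {f: (mean(r[f] for r in rows),
                     pvariance([r[f] for r in rows]) + 1e-9)
                 for f in features}
        model[c] = (len(rows) / len(y), stats)
    return model

def predict_nb(model, x):
    """Return the class with the highest log posterior."""
    def log_posterior(c):
        prior, stats = model[c]
        lp = math.log(prior)
        for f, (mu, var) in stats.items():
            lp += -0.5 * math.log(2 * math.pi * var) - (x[f] - mu) ** 2 / (2 * var)
        return lp
    return max(model, key=log_posterior)

def loo_accuracy(X, y, features):
    """Leave-one-out accuracy using only the given feature indices."""
    hits = 0
    for i in range(len(X)):
        train_X, train_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        hits += predict_nb(fit_nb(train_X, train_y, features), X[i]) == y[i]
    return hits / len(X)

def two_step_select(X, y):
    n_features = len(X[0])
    # Step 1: rank features with a single-feature Naive Bayes score (assumption).
    scores = {f: loo_accuracy(X, y, [f]) for f in range(n_features)}
    ranked = sorted(range(n_features), key=lambda f: -scores[f])
    # Step 2: drop bottom-ranked features one at a time and re-check accuracy.
    # Gaussian NB is a stand-in here for the Hidden Naive Bayes classifier.
    current = list(ranked)
    best, best_acc = list(current), loo_accuracy(X, y, current)
    while len(current) > 1:
        current = current[:-1]
        acc = loo_accuracy(X, y, current)
        if acc >= best_acc:  # prefer the smaller set when accuracy holds
            best, best_acc = list(current), acc
    return ranked, best, best_acc

# Toy data (hypothetical): feature 0 separates the classes, feature 2 is noise.
X = [[1.0, 2.0, 7.0], [1.2, 2.5, 3.0], [0.8, 3.0, 7.0], [1.1, 2.2, 3.0],
     [5.0, 2.4, 7.0], [5.2, 3.5, 3.0], [4.8, 3.2, 7.0], [5.1, 2.8, 3.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

ranked, selected, best_acc = two_step_select(X, y)
print("ranking:", ranked, "selected:", selected, "accuracy:", best_acc)
```

On real -omics data the feature count is far larger than the sample count, so the ranking step would normally be restricted to a top-ranked subset before pruning, and accuracy would be estimated on held-out data rather than on the samples used for ranking.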

Subjects

Artificial Intelligence; Bayes Theorem; Microarray Analysis/statistics & numerical data; Proteomics/statistics & numerical data; Algorithms; Biomarkers; Gene Expression Profiling; Humans; Models, Statistical; Oligonucleotide Array Sequence Analysis; Polymorphism, Single Nucleotide; Proteomics/methods

Full text: 1; Collections: 01-international; Database: MEDLINE; Main subject: Artificial Intelligence / Bayes Theorem / Proteomics / Microarray Analysis; Study type: Prognostic_studies / Risk_factors_studies; Limit: Humans; Language: En; Journal: Biomed Res Int; Publication year: 2013; Document type: Article
