Novel applications of multitask learning and multiple output regression to multiple genetic trait prediction.

He, Dan; Kuhn, David; Parida, Laxmi

Affiliation

He D; IBM T.J. Watson Research, Yorktown Heights, NY, USA.
Kuhn D; USDA-ARS Subtropical Horticultural Research Station, Miami, FL, USA.
Parida L; IBM T.J. Watson Research, Yorktown Heights, NY, USA.

Bioinformatics; 32(12): i37-i43, 2016 Jun 15.

Article in English | MEDLINE | ID: mdl-27307640

ABSTRACT

Given a set of biallelic molecular markers, such as SNPs, with genotype values encoded numerically on a collection of plant, animal or human samples, the goal of genetic trait prediction is to predict quantitative trait values by modeling all marker effects simultaneously. Genetic trait prediction is usually formulated as a linear regression model. In many cases, multiple traits are observed for the same set of samples and markers, and some of these traits may be correlated with each other. Modeling all traits together may therefore improve prediction accuracy. In this work, we view the multitrait prediction problem from a machine learning angle: as either a multitask learning problem or a multiple output regression problem, depending on whether the different traits share the same genotype matrix. We then adapted multitask learning algorithms and multiple output regression algorithms to solve the multitrait prediction problem, and we proposed a few strategies to improve the least-squares error of the predictions from these algorithms. Our experiments show that modeling multiple traits together can improve prediction accuracy for correlated traits.

AVAILABILITY AND IMPLEMENTATION: The programs we used are either publicly available or were obtained directly from the cited authors, such as the MALSAR package (http://www.public.asu.edu/~jye02/Software/MALSAR/). The Avocado data set has not been published yet and is available upon request.

CONTACT: dhe@us.ibm.com.
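The linear formulation described in the abstract can be illustrated with a minimal sketch for the case where all traits share one genotype matrix. Everything in it is an assumption made for illustration: the toy genotype matrix, the marker effects, the function names and the hyperparameters are hypothetical, and the plain squared-error fit with an optional ridge penalty is a generic baseline, not the authors' method or the MALSAR API. With this penalty the fit decouples across traits, so it does not by itself exploit trait correlation; the multitask algorithms referred to in the abstract are the part that would couple the traits.

```python
# Toy sketch: all data, names and hyperparameters here are hypothetical.

# Genotype matrix X: rows are samples, columns are biallelic markers,
# each encoded numerically as a minor-allele count (0, 1 or 2).
GENOTYPES = [
    [0, 1, 2],
    [1, 0, 2],
    [2, 1, 0],
    [0, 2, 1],
    [1, 1, 1],
    [2, 0, 1],
]

# Hypothetical marker effects B (markers x traits), used only to simulate
# noise-free values for two correlated traits.
TRUE_EFFECTS = [
    [1.0, 0.8],
    [-0.5, -0.4],
    [0.25, 0.3],
]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def fit_multi_output(x, y, penalty=0.0, step=0.01, iters=3000):
    """Gradient descent on ||XB - Y||^2 + penalty * ||B||^2 (Frobenius norms)."""
    n_markers, n_traits = len(x[0]), len(y[0])
    b = [[0.0] * n_traits for _ in range(n_markers)]
    xt = transpose(x)
    for _ in range(iters):
        pred = matmul(x, b)
        resid = [[p - t for p, t in zip(p_row, t_row)]
                 for p_row, t_row in zip(pred, y)]
        grad = matmul(xt, resid)  # X^T (XB - Y); the factor 2 is applied below
        b = [[bv - step * 2.0 * (gv + penalty * bv)
              for bv, gv in zip(b_row, g_row)]
             for b_row, g_row in zip(b, grad)]
    return b


# Simulated phenotypes Y = XB: one row per sample, one column per trait.
TRAITS = matmul(GENOTYPES, TRUE_EFFECTS)

# Estimate all marker effects for both traits from the shared genotype matrix.
ESTIMATED = fit_multi_output(GENOTYPES, TRAITS)

# Predict both trait values for a new, hypothetical sample.
NEW_SAMPLE = [[2, 1, 1]]
PREDICTED = matmul(NEW_SAMPLE, ESTIMATED)
```

Because the simulated traits are noise-free and the penalty defaults to zero, the estimated effects should closely match the hypothetical ones; with real data, a nonzero `penalty` and held-out evaluation would be needed.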

Subjects

Genotype; Machine Learning; Models, Genetic; Phenotype; Algorithms; Animals; Humans; Linear Models; Plants; Polymorphism, Single Nucleotide; Quantitative Trait Loci

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Phenotype / Machine Learning / Genotype / Models, Genetic Study type: Prognostic_studies / Risk_factors_studies Limits: Animals / Humans Language: En Publication year: 2016 Document type: Article
