Gene selection heuristic algorithm for nutrigenomics studies.
Physiol Genomics; 45(14): 615-28, 2013 Jul 15.
Article in En | MEDLINE | ID: mdl-23632420
Large datasets from -omics studies call for in-depth investigation. The aim of this paper is to provide a new method (the LEM method) for finding connections between the transcriptome and the metabolome. The heuristic algorithm described here extends classical canonical correlation analysis (CCA) to a large number of variables without regularization, and combines good numerical conditioning with fast computation in R. Reduced CCA models are condensed into PageRank matrices, whose product is a stochastic matrix that summarizes the self-avoiding walk traversed by the algorithm. A homogeneous Markov process applied to this stochastic matrix then brings the probabilities of interconnection between genes to convergence, yielding a selection of disjoint subsets of genes. This is an alternative to regularized generalized CCA for determining blocks within the structure matrix. Each gene subset is thus linked to the whole metabolic or clinical dataset that represents the biological phenotype of interest. Moreover, this selection process meets the needs of biologists, who often require small sets of genes for further validation or extended phenotyping. The algorithm is shown to work efficiently on three published datasets, resulting in meaningfully broadened gene networks.
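The Markov step mentioned in the abstract can be illustrated with a minimal sketch. This is not the LEM implementation (which the abstract says is written in R); it only shows, in Python with NumPy, that the product of two row-stochastic matrices is again row-stochastic and that repeatedly applying a homogeneous chain drives a probability vector to a stationary distribution. The random matrices `a` and `b` are hypothetical stand-ins for the PageRank matrices, and the helper names `row_normalize` and `stationary_distribution` are assumptions made for this example.

```python
import numpy as np


def row_normalize(m):
    """Scale each row of a non-negative matrix so that it sums to 1."""
    m = np.asarray(m, dtype=float)
    return m / m.sum(axis=1, keepdims=True)


def stationary_distribution(p, tol=1e-12, max_iter=10_000):
    """Iterate pi <- pi @ p from a uniform start until pi stops changing."""
    n = p.shape[0]
    pi = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        nxt = pi @ p
        # Stop when the L1 change between successive vectors is negligible.
        if np.abs(nxt - pi).sum() < tol:
            return nxt
        pi = nxt
    raise RuntimeError("no convergence within max_iter")


# Hypothetical stand-ins for two PageRank matrices over 5 genes.
# Strictly positive entries make the chain irreducible and aperiodic,
# so the iteration is guaranteed to converge.
rng = np.random.default_rng(0)
a = row_normalize(rng.random((5, 5)) + 0.01)
b = row_normalize(rng.random((5, 5)) + 0.01)

p = a @ b                        # product of row-stochastic matrices
pi = stationary_distribution(p)  # limiting probabilities of the chain

print(p.sum(axis=1))  # each row of the product sums to 1
print(pi)             # stationary probabilities, one per gene
```

How the converged probabilities are turned into disjoint gene subsets is specific to the paper and is not reproduced here.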
Keywords
Full text: 1
Database: MEDLINE
Main subject: Algorithms / Nutrigenomics
Type of study: Prognostic_studies / Risk_factors_studies
Limits: Humans
Language: En
Year of publication: 2013
Document type: Article