Sparse redundancy analysis of high-dimensional genetic and genomic data.
Bioinformatics; 33(20): 3228-3234, 2017 Oct 15.
Article in En | MEDLINE | ID: mdl-28605402
MOTIVATION: Recent technological developments have enabled integrated genetic and genomic data analysis approaches, in which multiple omics datasets from various biological levels are combined and used to describe (disease) phenotypic variations. The main goal is to explain and ultimately predict phenotypic variations by understanding their genetic basis and the interaction of the associated genetic factors. Understanding the underlying genetic mechanisms of phenotypic variations is therefore an ever-increasing research interest in biomedical sciences. In many situations, we have a set of variables that can be considered the outcome variables and a set that can be considered the explanatory variables. Redundancy analysis (RDA) is an analytic method that deals with this type of directionality. Unfortunately, current implementations of RDA cannot deal optimally with the high dimensionality of omics data (p ≫ n). The existing theoretical framework, based on Ridge penalization, is suboptimal because it includes all variables in the analysis. As a solution, we propose to use Elastic Net penalization in an iterative RDA framework to obtain a sparse solution. RESULTS: We propose sparse redundancy analysis (sRDA) for high-dimensional omics data analysis. We conducted simulation studies with our software implementation of sRDA to assess its reliability. Both the analysis of simulated data and the analysis of 485,512 methylation markers and 18,424 gene-expression values measured in a set of 55 patients with Marfan syndrome show that sRDA is able to deal with the usual high dimensionality of omics data. AVAILABILITY AND IMPLEMENTATION: http://uva.csala.me/rda. CONTACT: a.csala@amc.uva.nl. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
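The abstract names the ingredients (an iterative RDA scheme with an Elastic Net penalty on the explanatory weights) but not the algorithmic details. The following is a minimal pure-Python sketch of how such a scheme could look for a single component; it is not taken from the authors' software. The function names (`srda_component`, `elastic_net`, and the helpers), the penalty scaling, the standardization of both latent variables, and the toy data are all assumptions made for illustration.

```python
import random

def standardize(col):
    """Center a column and scale it to unit variance (population SD)."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
    if sd == 0.0:
        raise ValueError("cannot standardize a constant column")
    return [(v - mean) / sd for v in col]

def soft_threshold(z, t):
    """Lasso shrinkage: move z toward zero by t; exactly 0 inside [-t, t]."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def elastic_net(X_cols, target, lam1, lam2, sweeps=50, tol=1e-8):
    """Coordinate descent for
    (1/2n)||target - X a||^2 + lam1*||a||_1 + (lam2/2)*||a||_2^2,
    assuming every column of X is standardized."""
    n = len(target)
    alpha = [0.0] * len(X_cols)
    resid = list(target)  # residual when all weights are zero
    for _ in range(sweeps):
        biggest_step = 0.0
        for j, xj in enumerate(X_cols):
            # Covariance of x_j with the residual that excludes x_j's own fit.
            rho = sum(x * r for x, r in zip(xj, resid)) / n + alpha[j]
            new = soft_threshold(rho, lam1) / (1.0 + lam2)
            step = new - alpha[j]
            if step != 0.0:
                resid = [r - step * x for r, x in zip(resid, xj)]
                alpha[j] = new
                biggest_step = max(biggest_step, abs(step))
        if biggest_step < tol:
            break
    return alpha

def weighted_sum(cols, weights):
    """Latent variable: per-sample weighted sum of the columns."""
    return [sum(w * c[i] for w, c in zip(weights, cols))
            for i in range(len(cols[0]))]

def srda_component(X_cols, Y_cols, lam1, lam2, iters=10):
    """Alternate between outcome weights (beta) and sparse explanatory
    weights (alpha) for a fixed number of iterations."""
    n = len(X_cols[0])
    alpha = [1.0] * len(X_cols)  # arbitrary non-zero starting weights
    beta = [0.0] * len(Y_cols)
    for _ in range(iters):
        xi = standardize(weighted_sum(X_cols, alpha))  # explanatory latent variable
        # Univariate regression of each outcome variable on xi.
        beta = [sum(y * s for y, s in zip(yc, xi)) / n for yc in Y_cols]
        eta = standardize(weighted_sum(Y_cols, beta))  # outcome latent variable
        alpha = elastic_net(X_cols, eta, lam1, lam2)
        if not any(alpha):
            raise ValueError("lam1 removed every explanatory variable")
    return alpha, beta

# Toy data (hypothetical): 2 informative and 4 noise explanatory variables.
random.seed(0)
n = 300
z = [random.gauss(0, 1) for _ in range(n)]
X_cols = [standardize([v + random.gauss(0, 0.3) for v in z]) for _ in range(2)]
X_cols += [standardize([random.gauss(0, 1) for _ in range(n)]) for _ in range(4)]
Y_cols = [standardize([v + random.gauss(0, 0.5) for v in z]) for _ in range(3)]

alpha, beta = srda_component(X_cols, Y_cols, lam1=0.3, lam2=0.1)
print([round(a, 3) for a in alpha])
```

In this sketch the L1 part of the penalty (`lam1`) is what sets weights exactly to zero, while the L2 part (`lam2`) only shrinks them; a Ridge-only penalty (`lam1 = 0`) would keep every explanatory variable, which is the limitation the abstract attributes to the existing framework.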
Full text:
1
Collections:
01-internacional
Database:
MEDLINE
Main subject:
Software
/
Human Genome
/
Genomics
Type of study:
Prognostic_studies
Limits:
Humans
Language:
En
Publication year:
2017
Document type:
Article