Integrative and regularized principal component analysis of multiple sources of data.
Liu, Binghui; Shen, Xiaotong; Pan, Wei.
Affiliation
  • Liu B; School of Mathematics and Statistics, Northeast Normal University, Changchun, 130024, Jilin Province, China.
  • Shen X; School of Statistics, University of Minnesota, 224 Church St. S.E., Minneapolis, 55455, MN, U.S.A.
  • Pan W; Division of Biostatistics, University of Minnesota, 420 Delaware St. S.E., Minneapolis, 55455, MN, U.S.A.
Stat Med; 35(13): 2235-50, 2016 Jun 15.
Article in En | MEDLINE | ID: mdl-26756854
Integration of data of disparate types has become increasingly important to enhancing the power for new discoveries by combining complementary strengths of multiple types of data. One application is to uncover tumor subtypes in human cancer research in which multiple types of genomic data are integrated, including gene expression, DNA copy number, and DNA methylation data. In spite of their successes, existing approaches based on joint latent variable models require stringent distributional assumptions and may suffer from unbalanced scales (or units) of different types of data and non-scalability of the corresponding algorithms. In this paper, we propose an alternative based on integrative and regularized principal component analysis, which is distribution-free, computationally efficient, and robust against unbalanced scales. The new method performs dimension reduction simultaneously on multiple types of data, seeking data-adaptive sparsity and scaling. As a result, in addition to feature selection for each type of data, integrative clustering is achieved. Numerically, the proposed method compares favorably against its competitors in terms of accuracy (in identifying hidden clusters), computational efficiency, and robustness against unbalanced scales. In particular, compared with a popular method, the new method was competitive in identifying tumor subtypes associated with distinct patient survival patterns when applied to a combined analysis of DNA copy number, mRNA expression, and DNA methylation data in a glioblastoma multiforme study. Copyright © 2016 John Wiley & Sons, Ltd.
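The abstract does not spell out the estimator, so the following is only a minimal sketch of the general idea and not the authors' algorithm. It assumes two simplifications that are not taken from the paper: each data type is column-centered and divided by its Frobenius norm to offset unbalanced scales, and sparsity comes from a generic soft-thresholded power iteration for the leading component. All function names, the `frac` tuning parameter, and the toy data are hypothetical.

```python
import numpy as np

def balance_blocks(blocks):
    """Column-center each block and divide it by its Frobenius norm,
    so that no data type dominates purely because of its units.
    (Assumed scaling rule, not the data-adaptive scaling of the paper.)"""
    balanced = []
    for X in blocks:
        Xc = X - X.mean(axis=0)
        balanced.append(Xc / np.linalg.norm(Xc))
    return balanced

def soft_threshold(z, lam):
    """Shrink entries toward zero; entries with |z| <= lam become exactly 0."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def sparse_leading_pc(X, frac=0.5, n_iter=100):
    """Rank-one sparse PCA: alternate a soft-thresholded loading update
    with a normalized score update. `frac` (hypothetical knob) sets the
    threshold as a fraction of the largest absolute loading."""
    u = np.linalg.svd(X, full_matrices=False)[0][:, 0]  # dense starting scores
    for _ in range(n_iter):
        z = X.T @ u
        v = soft_threshold(z, frac * np.abs(z).max())
        u = X @ v
        u /= np.linalg.norm(u)
    return u, v

def integrative_sketch(blocks, frac=0.5):
    """Return a two-group split of the samples and sparse loadings per block."""
    X = np.hstack(balance_blocks(blocks))
    u, v = sparse_leading_pc(X, frac)
    splits = np.cumsum([B.shape[1] for B in blocks])[:-1]
    labels = (u > 0).astype(int)  # sign of the joint score as a crude clustering
    return labels, np.split(v, splits)

# Toy data (simulated, not from the study): 40 samples in two hidden groups,
# two data types whose first 5 features carry the group signal.
rng = np.random.default_rng(0)
truth = np.repeat([0, 1], 20)
shift = np.where(truth == 1, 3.0, -3.0)[:, None]
block_a = rng.normal(size=(40, 30))
block_a[:, :5] += shift
block_b = rng.normal(size=(40, 20))
block_b[:, :5] += shift
block_b *= 1000.0  # second data type recorded in much larger units

labels, loadings = integrative_sketch([block_a, block_b])
```

In this sketch, the nonzero entries of each element of `loadings` play the role of selected features for that data type, and `labels` is the integrative clustering obtained from the shared score vector. Without the per-block rescaling, the second block would dominate the decomposition simply because of its units.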

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Data Interpretation, Statistical / Principal Component Analysis Type of study: Prognostic_studies / Risk_factors_studies Limits: Humans Language: En Year of publication: 2016 Document type: Article