Deviance residuals-based sparse PLS and sparse kernel PLS regression for censored data.
Bastien, Philippe; Bertrand, Frédéric; Meyer, Nicolas; Maumy-Bertrand, Myriam.
Affiliation
  • Bastien P; L'Oréal Recherche & Innovation, 93601 Aulnay-sous-Bois, IRMA, CNRS UMR 7501, Labex IRMIA, Université de Strasbourg, 67084 Strasbourg Cedex, INSERM EA3430, Laboratoire de Biostatistique, Faculté de Médecine de Strasbourg, Labex IRMIA, Université de Strasbourg, 67085 Strasbourg Cedex, France.
  • Bertrand F; L'Oréal Recherche & Innovation, 93601 Aulnay-sous-Bois, IRMA, CNRS UMR 7501, Labex IRMIA, Université de Strasbourg, 67084 Strasbourg Cedex, INSERM EA3430, Laboratoire de Biostatistique, Faculté de Médecine de Strasbourg, Labex IRMIA, Université de Strasbourg, 67085 Strasbourg Cedex, France.
  • Meyer N; L'Oréal Recherche & Innovation, 93601 Aulnay-sous-Bois, IRMA, CNRS UMR 7501, Labex IRMIA, Université de Strasbourg, 67084 Strasbourg Cedex, INSERM EA3430, Laboratoire de Biostatistique, Faculté de Médecine de Strasbourg, Labex IRMIA, Université de Strasbourg, 67085 Strasbourg Cedex, France.
  • Maumy-Bertrand M; L'Oréal Recherche & Innovation, 93601 Aulnay-sous-Bois, IRMA, CNRS UMR 7501, Labex IRMIA, Université de Strasbourg, 67084 Strasbourg Cedex, INSERM EA3430, Laboratoire de Biostatistique, Faculté de Médecine de Strasbourg, Labex IRMIA, Université de Strasbourg, 67085 Strasbourg Cedex, France.
Bioinformatics; 31(3): 397-404, 2015 Feb 01.
Article in English | MEDLINE | ID: mdl-25286920
ABSTRACT
MOTIVATION:

A vast literature from the past decade is devoted to relating gene profiles to subject survival or time to cancer recurrence. Biomarker discovery from high-dimensional data, such as transcriptomic or single nucleotide polymorphism profiles, is a major challenge in the search for more precise diagnoses. The proportional hazards regression model proposed by Cox (1972) to study the relationship between the time to event and a set of covariates in the presence of censoring is the most commonly used model for the analysis of survival data. However, like multivariate regression, it assumes that there are more observations than variables, that the data are complete, and that the variables are not strongly correlated. In practice, when dealing with high-dimensional data, these constraints are crippling. Collinearity gives rise to issues of over-fitting and model misidentification. Variable selection can improve estimation accuracy by effectively identifying the subset of relevant predictors, and can enhance model interpretability through a parsimonious representation. To deal with both collinearity and variable selection, many methods based on the least absolute shrinkage and selection operator (LASSO) penalized Cox proportional hazards model have been proposed since the reference paper of Tibshirani. Regularization can also be performed using dimension reduction, as is the case with partial least squares (PLS) regression. We propose two original algorithms, sPLSDR and its non-linear kernel counterpart DKsPLSDR, which use sparse PLS regression (sPLS) based on deviance residuals. We compared their predictive performance with state-of-the-art algorithms on both simulated and real reference benchmark datasets.
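The idea of running sparse PLS on deviance residuals can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not reproduce the plsRcox code. It assumes that the deviance residuals are taken from a Cox model fitted without covariates (so that the cumulative baseline hazard reduces to the Nelson-Aalen estimate) and that sparsity is obtained by soft-thresholding the first PLS weight vector with a tuning parameter `eta` between 0 and 1. The function names, the parameter `eta` and the toy data are all hypothetical and chosen for illustration only.

```python
import math


def null_cox_deviance_residuals(times, events):
    """Deviance residuals of a Cox model fitted without covariates (assumption).

    events[i] is 1 for an observed event and 0 for a censored time.
    """
    # Nelson-Aalen increments: number of events / number at risk at each event time.
    increments = []
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for s in times if s >= t)
        n_events = sum(1 for s, e in zip(times, events) if s == t and e == 1)
        increments.append((t, n_events / at_risk))

    residuals = []
    for t, e in zip(times, events):
        cum_hazard = sum(h for s, h in increments if s <= t)
        martingale = e - cum_hazard
        # d = sign(M) * sqrt(-2 * (M + delta * log(delta - M))), with delta - M = cum_hazard
        inner = martingale + (math.log(cum_hazard) if e == 1 else 0.0)
        residuals.append(math.copysign(math.sqrt(max(-2.0 * inner, 0.0)), martingale))
    return residuals


def sparse_pls_weights(X, y, eta):
    """First sparse PLS weight vector, soft-thresholded at eta * max|X'y| (assumption)."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Covariance-like direction between each centered predictor and the response.
    z = [sum((X[i][j] - col_means[j]) * (y[i] - y_mean) for i in range(n))
         for j in range(p)]
    threshold = eta * max(abs(v) for v in z)
    w = [math.copysign(max(abs(v) - threshold, 0.0), v) for v in z]
    norm = math.sqrt(sum(v * v for v in w))
    return [v / norm for v in w] if norm > 0 else w


# Toy data (hypothetical): six subjects, three predictors.
times = [5, 8, 12, 3, 9, 15]
events = [1, 0, 1, 1, 0, 1]
X = [
    [2.0, 1.0, 0.3],
    [0.5, 1.2, 0.9],
    [1.0, 0.9, 0.5],
    [3.0, 1.1, 0.1],
    [0.4, 1.0, 0.8],
    [0.2, 0.8, 1.0],
]

residuals = null_cox_deviance_residuals(times, events)
weights = sparse_pls_weights(X, residuals, eta=0.2)
print("deviance residuals:", [round(r, 3) for r in residuals])
print("sparse weights:", [round(w, 3) for w in weights])
```

In this sketch the deviance residuals stand in for the censored response, so an ordinary (sparse) PLS step can be applied to them. Predictors whose association with the residuals falls below the threshold receive a weight of exactly zero, which is how variable selection enters. The resulting component (the centered predictors multiplied by the weights) would then typically be used as a covariate in a Cox model.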

RESULTS:

sPLSDR and DKsPLSDR compare favorably with other methods in computational time, prediction and selectivity, as indicated by results on benchmark datasets. Moreover, within the framework of PLS regression, they offer other useful tools, including biplot representations and the ability to deal with missing data. We therefore view them as a useful addition to the toolbox of estimation and prediction methods for the widely used Cox model in high-dimensional, low-sample-size settings.

AVAILABILITY AND IMPLEMENTATION:

The R package plsRcox is available on CRAN and is maintained by Frédéric Bertrand. http://cran.r-project.org/web/packages/plsRcox/index.html.

CONTACT:

pbastien@rd.loreal.com or fbertran@math.unistra.fr.

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.
Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Algorithms / Software / Least-Squares Analysis / Regression Analysis / Gene Expression Profiling Type of study: Diagnostic_studies / Prognostic_studies Limits: Humans Language: En Journal: Bioinformatics Journal subject: MEDICAL INFORMATICS Year of publication: 2015 Document type: Article Affiliation country: France