Conditional canonical correlation estimation based on covariates with random forests.
Alakus, Cansu; Larocque, Denis; Jacquemont, Sébastien; Barlaam, Fanny; Martin, Charles-Olivier; Agbogba, Kristian; Lippé, Sarah; Labbe, Aurélie.
Affiliations
  • Alakus C; Department of Decision Sciences, HEC Montréal, Montréal, QC H3T 2A7, Canada.
  • Larocque D; Department of Decision Sciences, HEC Montréal, Montréal, QC H3T 2A7, Canada.
  • Jacquemont S; Department of Pediatrics, Université de Montréal, Montréal, QC H3T 1C5, Canada.
  • Barlaam F; CHU Sainte-Justine Research Center, Montréal, QC, H3T 1C5, Canada.
  • Martin CO; CHU Sainte-Justine Research Center, Montréal, QC, H3T 1C5, Canada.
  • Agbogba K; CHU Sainte-Justine Research Center, Montréal, QC, H3T 1C5, Canada.
  • Lippé S; CHU Sainte-Justine Research Center, Montréal, QC, H3T 1C5, Canada.
  • Labbe A; Department of Psychology, Université de Montréal, Montréal, QC H3T 1J4, Canada.
Bioinformatics; 37(17): 2714-2721, 2021 Sep 09.
Article in English | MEDLINE | ID: mdl-33693547
ABSTRACT
MOTIVATION:

Investigating the relationships between two sets of variables helps to understand their interactions and can be done with canonical correlation analysis (CCA). However, the correlation between the two sets can sometimes depend on a third set of covariates, often subject-related ones such as age, gender or other clinical measures. In this case, applying CCA to the whole population is not optimal and methods to estimate conditional CCA, given the covariates, can be useful.

RESULTS:

We propose a new method called Random Forest with Canonical Correlation Analysis (RFCCA) to estimate the conditional canonical correlations between two sets of variables given subject-related covariates. The individual trees in the forest are built with a splitting rule specifically designed to partition the data so as to maximize the canonical correlation heterogeneity between child nodes. We also propose a significance test to detect the global effect of the covariates on the relationship between the two sets of variables. The performance of the proposed method and of the global significance test is evaluated through simulation studies, which show that the method provides accurate canonical correlation estimates and that the test has well-controlled Type I error. We also show an application of the proposed method to EEG data.

AVAILABILITY AND IMPLEMENTATION:

RFCCA is implemented in a freely available R package on CRAN (https://CRAN.R-project.org/package=RFCCA).

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.
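The splitting idea described under RESULTS can be illustrated with a small sketch. It is not the RFCCA implementation and does not use the R package's API. The function names (`first_canonical_correlation`, `best_split`), the `min_node_size` parameter, the synthetic data, and the particular heterogeneity score (a size-weighted absolute difference between the child-node canonical correlations) are all assumptions made for illustration; the abstract only states that splits maximize canonical correlation heterogeneity between child nodes.

```python
import numpy as np


def first_canonical_correlation(X, Y):
    """Largest sample canonical correlation between the columns of X and Y.

    Assumes X and Y have full column rank after centering.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the centered column spaces.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # The singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return float(np.clip(s[0], 0.0, 1.0))


def best_split(X, Y, z, min_node_size=20):
    """Scan thresholds on a single covariate z and return (threshold, score).

    The score is an assumed heterogeneity measure:
    sqrt(n_left * n_right) * |rho_left - rho_right|.
    """
    best = (None, -np.inf)
    for t in np.unique(z)[:-1]:
        left = z <= t
        n_l, n_r = int(left.sum()), int((~left).sum())
        if min(n_l, n_r) < min_node_size:
            continue
        rho_l = first_canonical_correlation(X[left], Y[left])
        rho_r = first_canonical_correlation(X[~left], Y[~left])
        score = np.sqrt(n_l * n_r) * abs(rho_l - rho_r)
        if score > best[1]:
            best = (float(t), float(score))
    return best


# Synthetic example (hypothetical data): X and Y are strongly related
# only for subjects whose covariate z exceeds 0.5.
rng = np.random.default_rng(0)
n = 400
z = rng.uniform(size=n)
X = rng.normal(size=(n, 2))
noise = rng.normal(size=(n, 2))
strength = np.where(z > 0.5, 0.9, 0.0)[:, None]
Y = strength * X + np.sqrt(1.0 - strength**2) * noise

threshold, score = best_split(X, Y, z)
print("selected threshold on z:", threshold)
```

In this toy setting the selected threshold should fall near the point where the X-Y relationship changes. A forest would repeat such splits recursively over several covariates and bootstrap samples, which this sketch leaves out.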

Full text: 1 Collections: 01-international Database: MEDLINE Study type: Clinical_trials Language: En Publication year: 2021 Document type: Article