Unsupervised multiple kernel learning for heterogeneous data integration.
Mariette, Jérôme; Villa-Vialaneix, Nathalie.
Affiliation
  • Mariette J; MIAT, Université de Toulouse, INRA, 31326 Castanet-Tolosan, France.
  • Villa-Vialaneix N; MIAT, Université de Toulouse, INRA, 31326 Castanet-Tolosan, France.
Bioinformatics; 34(6): 1009-1015, 2018 03 15.
Article in En | MEDLINE | ID: mdl-29077792
ABSTRACT
Motivation:

Recent advances in high-throughput sequencing have broadened the range of available omics datasets, and the integrated analysis of multiple datasets obtained on the same samples has yielded important insights in a wide range of applications. However, integrating various sources of information remains a challenge for systems biology because the datasets produced are often of heterogeneous types, which calls for generic methods that take their different specificities into account.

Results:

We propose a multiple kernel framework that integrates multiple datasets of various types into a single exploratory analysis. Several solutions are provided to learn either a consensus meta-kernel or a meta-kernel that preserves the original topology of the datasets. We applied our framework to analyse two public multi-omics datasets. First, the multiple metagenomic datasets collected during the TARA Oceans expedition were explored to demonstrate that our method can retrieve previous findings in a single kernel PCA and provide a new picture of the sample structure when a larger number of datasets is included in the analysis. To perform this analysis, a generic procedure is also proposed to improve the interpretability of the kernel PCA with regard to the original data. Second, the multi-omics breast cancer datasets provided by The Cancer Genome Atlas are analysed using a kernel Self-Organizing Map with both single-omics and multi-omics strategies. The comparison of these two approaches demonstrates the benefit of our integration method for improving the representation of the studied biological system.

Availability and implementation:

The proposed methods are available in the R package mixKernel, released on CRAN. It is fully compatible with the mixOmics package, and a tutorial describing the approach can be found on the mixOmics web site: http://mixomics.org/mixkernel/.

Contact:

jerome.mariette@inra.fr or nathalie.villa-vialaneix@inra.fr.

Supplementary information:

Supplementary data are available at Bioinformatics online.
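The general idea of combining one kernel per dataset into a meta-kernel and exploring it with kernel PCA can be illustrated with a minimal Python sketch. This is not the mixKernel implementation and does not reproduce the weight-learning solutions proposed in the article: it assumes linear kernels, cosine normalization, and uniform weights as a simple stand-in, and all function names and the random data are hypothetical.

```python
import numpy as np

def linear_kernel(X):
    """Gram matrix of a samples-by-features table (assumed kernel choice)."""
    return X @ X.T

def cosine_normalize(K):
    """Scale a kernel so every sample has unit self-similarity."""
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)

def combine_kernels(kernels, weights=None):
    """Weighted sum of kernels; uniform weights are an assumption here,
    standing in for weights that a real method would learn."""
    if weights is None:
        weights = np.full(len(kernels), 1.0 / len(kernels))
    return sum(w * K for w, K in zip(weights, kernels))

def kernel_pca(K, n_components=2):
    """Project samples on the leading axes of the centered kernel."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    Kc = H @ K @ H                           # kernel centered in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)    # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(eigvals)

# Two hypothetical datasets measured on the same 20 samples.
rng = np.random.default_rng(0)
X1 = rng.normal(size=(20, 5))
X2 = rng.normal(size=(20, 8))

kernels = [cosine_normalize(linear_kernel(X)) for X in (X1, X2)]
meta_kernel = combine_kernels(kernels)
scores = kernel_pca(meta_kernel, n_components=2)
print(scores.shape)
```

Because each dataset only enters through its kernel, tables of different types can be combined as long as a suitable kernel is available for each and the samples are in the same order.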
Subjects

Full text: 1 Database: MEDLINE Main subject: Software / Computational Biology / Unsupervised Machine Learning Limits: Female / Humans Language: En Journal: Bioinformatics Journal subject: MEDICAL INFORMATICS Year of publication: 2018 Document type: Article Country of affiliation: France
