Data-driven characterization of molecular phenotypes across heterogeneous sample collections.

Mehtonen, Juha; Pölönen, Petri; Häyrynen, Sergei; Dufva, Olli; Lin, Jake; Liuksiala, Thomas; Granberg, Kirsi; Lohi, Olli; Hautamäki, Ville; Nykter, Matti; Heinäniemi, Merja

Affiliation

Mehtonen J; Institute of Biomedicine, School of Medicine, University of Eastern Finland, Kuopio, Finland.
Pölönen P; Institute of Biomedicine, School of Medicine, University of Eastern Finland, Kuopio, Finland.
Häyrynen S; Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland.
Dufva O; Hematology Research Unit Helsinki, University of Helsinki and Department of Hematology, Helsinki University Hospital Comprehensive Cancer Center, Helsinki, Finland.
Lin J; Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland.
Liuksiala T; Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland.
Granberg K; Tampere Center for Child Health Research, Tampere University and Tampere University Hospital, Tampere, Finland.
Lohi O; Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland.
Hautamäki V; Tampere Center for Child Health Research, Tampere University and Tampere University Hospital, Tampere, Finland.
Nykter M; School of Computing, University of Eastern Finland, Joensuu, Finland.
Heinäniemi M; Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland.

Nucleic Acids Res; 47(13): e76, 2019-07-26.

Article in English | MEDLINE | ID: mdl-31329928

ABSTRACT

Existing large gene expression data repositories hold enormous potential to elucidate disease mechanisms, characterize changes in cellular pathways, and stratify patients based on molecular profiles. To achieve this goal, integrative resources and tools are needed that allow comparison of results across datasets and data types. We propose an intuitive approach for data-driven stratifications of molecular profiles and benchmark our methodology using the dimensionality reduction algorithm t-distributed stochastic neighbor embedding (t-SNE) with multi-study and multi-platform data on hematological malignancies. Our approach enables assessing the contribution of biological versus technical variation to sample clustering, direct incorporation of additional datasets to the same low dimensional representation, comparison of molecular disease subtypes identified from separate t-SNE representations, and characterization of the obtained clusters based on pathway databases and additional data. In this manner, we performed an integrative analysis across multi-omics acute myeloid leukemia studies. Our approach indicated new molecular subtypes with differential survival and drug responsiveness among samples lacking fusion genes, including a novel myelodysplastic syndrome-like cluster and a cluster characterized by CEBPA mutations and differential activity of the S-adenosylmethionine-dependent DNA methylation pathway. In summary, integration across multiple studies can help to identify novel molecular disease subtypes and generate insight into disease biology.

Subject(s)

Cluster Analysis; Computational Biology/methods; Data Mining/methods; Datasets as Topic; Gene Expression Profiling/methods; Gene Expression Regulation, Leukemic; Leukemia, Myeloid, Acute/genetics; Phenotype; Algorithms; Databases, Genetic; Genes, Neoplasm; Humans; Leukemia, Myeloid, Acute/classification; Mutation; Sample Size

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Phenotype / Leukemia, Myeloid, Acute / Cluster Analysis / Gene Expression Regulation, Leukemic / Computational Biology / Gene Expression Profiling / Data Mining / Datasets as Topic Limits: Humans Language: En Journal: Nucleic Acids Res Year: 2019 Document type: Article Affiliation country: Finland