Robust multi-group gene set analysis with few replicates.
Mishra, Pashupati P; Medlar, Alan; Holm, Liisa; Törönen, Petri.
Affiliation
  • Mishra PP; Institute of Biotechnology, University of Helsinki, P.O. Box 56, Viikinkaari 5, Helsinki, 00014, Finland. pashupati.mishra@helsinki.fi.
  • Medlar A; Institute of Biotechnology, University of Helsinki, P.O. Box 56, Viikinkaari 5, Helsinki, 00014, Finland.
  • Holm L; Institute of Biotechnology, University of Helsinki, P.O. Box 56, Viikinkaari 5, Helsinki, 00014, Finland.
  • Törönen P; Department of Biosciences, University of Helsinki, Viikinkaari 1, Helsinki, 00014, Finland.
BMC Bioinformatics ; 17(1): 526, 2016 Dec 09.
Article in English | MEDLINE | ID: mdl-27938331
BACKGROUND: Competitive gene set analysis is a standard exploratory tool for gene expression data. Permutation-based competitive gene set analysis methods are preferable to parametric ones because the latter make strong statistical assumptions which are not always met. For permutation-based methods, we permute samples, as opposed to genes, because doing so preserves the inter-gene correlation structure. Unfortunately, until now, sample permutation-based methods have required a minimum of six replicates per sample group. RESULTS: We propose a new permutation-based competitive gene set analysis method for multi-group gene expression data with as few as three replicates per group. The method is based on an advanced sample permutation technique that utilizes all groups within a data set for pairwise comparisons. We present a comprehensive evaluation of different permutation techniques using multiple data sets, and contrast the performance of our method, mGSZm, with other state-of-the-art methods. We show that mGSZm is robust and that, despite using fewer than six replicates, we are able to consistently identify a high proportion of the top-ranked gene sets from the analysis of a substantially larger data set. Further, we highlight other methods whose performance is highly variable and appears to depend on the underlying data set being analyzed. CONCLUSIONS: Our results demonstrate that robust gene set analysis of multi-group gene expression data is feasible with as few as three replicates. In doing so, we have extended the applicability of such approaches to resource-constrained experiments where additional data generation is prohibitively difficult or expensive. An R package implementing the proposed method and supplementary materials are available from the website http://ekhidna.biocenter.helsinki.fi/downloads/pashupati/mGSZm.html .
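The general idea of sample permutation mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the mGSZm method or its R package: the gene-level statistic (difference in group means), the competitive set score (mean inside the set minus mean outside it), the function names, and the toy data are all assumptions chosen for illustration. The sketch relabels samples (columns) and never shuffles genes (rows), so every relabelling keeps the gene-gene correlations of the original data.

```python
from itertools import combinations
from statistics import mean


def gene_stats(expr, group_a):
    """Per-gene difference in mean expression: group A minus the rest."""
    n_samples = len(expr[0])
    group_b = [j for j in range(n_samples) if j not in group_a]
    return [
        mean(row[j] for j in group_a) - mean(row[j] for j in group_b)
        for row in expr
    ]


def competitive_score(stats, gene_set):
    """Illustrative set score: mean statistic inside the set minus outside."""
    inside = [s for i, s in enumerate(stats) if i in gene_set]
    outside = [s for i, s in enumerate(stats) if i not in gene_set]
    return mean(inside) - mean(outside)


def sample_permutation_pvalue(expr, group_a, gene_set):
    """Two-sided p-value from exhaustively relabelling the samples.

    Only the assignment of columns (samples) to groups changes; rows
    (genes) stay together, preserving inter-gene correlation.
    """
    n_samples = len(expr[0])
    observed = competitive_score(gene_stats(expr, set(group_a)), gene_set)
    null = [
        competitive_score(gene_stats(expr, set(relabel)), gene_set)
        for relabel in combinations(range(n_samples), len(group_a))
    ]
    # Small tolerance guards against floating-point ties.
    extreme = sum(1 for s in null if abs(s) >= abs(observed) - 1e-12)
    return observed, extreme / len(null), len(null)


# Hypothetical toy data: 4 genes (rows) x 6 samples (columns),
# samples 0-2 in group A and samples 3-5 in group B.
expr = [
    [5.0, 5.2, 4.9, 1.0, 1.1, 0.9],  # gene 0, in the set
    [4.8, 5.1, 5.0, 1.2, 0.8, 1.0],  # gene 1, in the set
    [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],  # gene 2, background
    [3.0, 2.9, 3.1, 3.1, 3.0, 2.9],  # gene 3, background
]
observed, p_value, n_relabellings = sample_permutation_pvalue(
    expr, group_a=[0, 1, 2], gene_set={0, 1}
)
print(observed, p_value, n_relabellings)
```

With three replicates in each of two groups there are only 20 distinct relabellings, so a p-value computed this way cannot fall below 2/20 = 0.1 in the two-sided case. This coarse resolution is one way to see why plain two-group sample permutation struggles with few replicates.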

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Oligonucleotide Array Sequence Analysis / Gene Expression Profiling | Type of study: Prognostic_studies | Limits: Animals / Humans | Language: En | Journal: BMC Bioinformatics | Journal subject: MEDICAL INFORMATICS | Year: 2016 | Document type: Article | Country of affiliation: Finland
