Exploring high-dimensional biological data with sparse contrastive principal component analysis.
Boileau, Philippe; Hejazi, Nima S; Dudoit, Sandrine.
Affiliation
  • Boileau P; Graduate Group in Biostatistics.
  • Hejazi NS; Graduate Group in Biostatistics.
  • Dudoit S; Center for Computational Biology.
Bioinformatics; 36(11): 3422-3430, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32176249
ABSTRACT
MOTIVATION:

Statistical analyses of high-throughput sequencing data have re-shaped the biological sciences. In spite of myriad advances, recovering interpretable biological signal from data corrupted by technical noise remains a prevalent open problem. Several classes of procedures, among them classical dimensionality reduction techniques and others incorporating subject-matter knowledge, have provided effective advances. However, no procedure currently satisfies the dual objectives of recovering stable and relevant features simultaneously.

RESULTS:

Inspired by recent proposals for making use of control data in the removal of unwanted variation, we propose a variant of principal component analysis (PCA), sparse contrastive PCA, that extracts sparse, stable, interpretable and relevant biological signal. The new methodology is compared to competing dimensionality reduction approaches through a simulation study and via analyses of several publicly available protein expression, microarray gene expression and single-cell transcriptome sequencing datasets.

AVAILABILITY AND IMPLEMENTATION:

A free and open-source software implementation of the methodology, the scPCA R package, is made available via the Bioconductor Project. Code for all analyses presented in this article is also available via GitHub.

CONTACT:

philippe_boileau@berkeley.edu.

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.
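ILLUSTRATIVE SKETCH:

The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of using control data in PCA, written in Python with NumPy rather than with the scPCA R package. It assumes the common contrastive formulation in which a scaled background (control) covariance matrix is subtracted from the target covariance matrix before the eigendecomposition. The function name `contrastive_pca`, the contrast parameter `gamma`, and the `threshold` argument are hypothetical, and the hard thresholding of loadings is a crude stand-in for a proper sparsity penalty, not the procedure implemented in scPCA.

```python
import numpy as np


def contrastive_pca(target, background, gamma=1.0, n_components=2, threshold=0.0):
    """Simplified contrastive PCA sketch (hypothetical helper, not the scPCA API).

    Returns (scores, loadings). Loadings whose absolute value falls below
    `threshold` are set to zero as a rough substitute for a sparsity penalty.
    """
    # Center each dataset column-wise.
    t = target - target.mean(axis=0)
    b = background - background.mean(axis=0)

    # Sample covariance matrices of the target and background data.
    cov_target = t.T @ t / (t.shape[0] - 1)
    cov_background = b.T @ b / (b.shape[0] - 1)

    # Contrastive covariance: variation shared with the background is down-weighted.
    contrast = cov_target - gamma * cov_background

    # The matrix is symmetric, so eigh applies; keep the largest eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(contrast)
    order = np.argsort(eigvals)[::-1][:n_components]
    loadings = eigvecs[:, order]

    # Crude sparsification: zero out small loadings, then renormalize columns.
    loadings = np.where(np.abs(loadings) >= threshold, loadings, 0.0)
    norms = np.linalg.norm(loadings, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for an all-zero column
    loadings = loadings / norms

    return t @ loadings, loadings


# Simulated example (all values are made up for illustration).
rng = np.random.default_rng(0)
n, p = 2000, 5

# Feature 0 carries large "technical" variation in both datasets.
background = rng.normal(size=(n, p))
background[:, 0] *= 5.0
target = rng.normal(size=(n, p))
target[:, 0] *= 5.0

# Feature 1 carries a two-group signal that is present only in the target data.
groups = rng.integers(0, 2, size=n)
target[:, 1] += np.where(groups == 1, 3.0, -3.0)

scores, loadings = contrastive_pca(
    target, background, gamma=1.0, n_components=2, threshold=0.3
)
```

In this simulation, ordinary PCA on the target data alone would be dominated by the high-variance feature 0, whereas the contrastive version is constructed so that the leading loading vector concentrates on feature 1. Choosing `gamma` and the degree of sparsity in a principled way is the hard part in practice and is not addressed by this sketch.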
Subject(s)

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Software / High-Throughput Nucleotide Sequencing Language: En Journal: Bioinformatics Journal subject: MEDICAL INFORMATICS Year: 2020 Document type: Article

...