PARC: ultrafast and accurate clustering of phenotypic data of millions of single cells.

Stassen, Shobana V; Siu, Dickson M D; Lee, Kelvin C M; Ho, Joshua W K; So, Hayden K H; Tsia, Kevin K

Affiliation

Stassen SV; Department of Electrical and Electronic Engineering.
Siu DMD; Department of Electrical and Electronic Engineering.
Lee KCM; Department of Electrical and Electronic Engineering.
Ho JWK; School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China.
So HKH; Department of Electrical and Electronic Engineering.
Tsia KK; Department of Electrical and Electronic Engineering.

Bioinformatics; 36(9): 2778-2786, 2020 05 01.

Article in En | MEDLINE | ID: mdl-31971583

ABSTRACT

MOTIVATION:

New single-cell technologies continue to fuel the explosive growth in the scale of heterogeneous single-cell data. However, existing computational methods do not scale adequately to large datasets and therefore cannot uncover the complex cellular heterogeneity.

RESULTS:

We introduce PARC (Phenotyping by Accelerated Refined Community-partitioning), a highly scalable graph-based clustering algorithm for large-scale, high-dimensional single-cell data (>1 million cells). Using large single-cell flow and mass cytometry, RNA-seq and imaging-based biophysical data, we demonstrate that PARC consistently outperforms state-of-the-art clustering algorithms, including Phenograph, FlowSOM and Flock, without subsampling of cells, in terms of both speed and the ability to robustly detect rare cell populations. For example, PARC can cluster a single-cell dataset of 1.1 million cells within 13 min, compared with >2 h for the next fastest graph-clustering algorithm. Our work presents a scalable algorithm to cope with increasingly large-scale single-cell analysis.

AVAILABILITY AND IMPLEMENTATION:

https://github.com/ShobiStassen/PARC.

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.
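The abstract names graph-based clustering with a refinement step but does not spell out the procedure. The following is a minimal, generic sketch of that idea and not the PARC implementation: it assumes a brute-force k-nearest-neighbour search, a fixed distance threshold for pruning weak edges, and connected components as a simple stand-in for a community-detection step. All function names, parameters and the toy data are hypothetical.

```python
import math


def knn_graph(points, k):
    """Link every point to its k nearest neighbours (brute force).

    Returns a dict mapping an undirected edge (i, j) with i < j to its length.
    """
    edges = {}
    for i, p in enumerate(points):
        nearest = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        for d, j in nearest[:k]:
            edges[(min(i, j), max(i, j))] = d
    return edges


def prune(edges, max_dist):
    """Drop edges longer than max_dist (assumed, fixed pruning rule)."""
    return {e: d for e, d in edges.items() if d <= max_dist}


def connected_components(n, edges):
    """Label nodes by connected component using union-find.

    Stand-in for community detection; labels follow first appearance.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    label_of_root = {}
    labels = []
    for i in range(n):
        root = find(i)
        labels.append(label_of_root.setdefault(root, len(label_of_root)))
    return labels


def cluster_points(points, k, max_dist):
    """Build the k-NN graph, prune it, and label the remaining components."""
    edges = prune(knn_graph(points, k), max_dist)
    return connected_components(len(points), edges)


# Toy data: two larger groups and one small, distant ("rare") group.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (5, 20), (5, 21),
]
labels = cluster_points(points, k=2, max_dist=2.0)
print(labels)
```

In this toy setting, pruning is what keeps the two-point group separate: with `max_dist=float("inf")` its long k-NN edges attach it to the second group. Any real workflow should rely on the released package at the repository listed above.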

Subjects

Algorithms; Single-Cell Analysis; Cluster Analysis; RNA-Seq; Software; Exome Sequencing

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Algorithms / Single-Cell Analysis Language: En Publication year: 2020 Document type: Article