ABSTRACT
Imaging flow cytometry (IFC) provides single-cell imaging data at a high acquisition rate. It is increasingly used in image-based profiling experiments consisting of hundreds of thousands of multi-channel images of cells. Currently available software solutions for processing microscopy data can provide good results in downstream analysis, but they are limited in efficiency and scalability, and are often ill-adapted to IFC data. In this work, we propose Scalable Cytometry Image Processing (SCIP), a Python software package that efficiently processes images from IFC and standard microscopy datasets. We also propose a file format for efficiently storing IFC data. We showcase our contributions on two large-scale microscopy datasets and one IFC dataset, all of which are publicly available. Our results show that SCIP can extract the same kind of information as other tools, in a much shorter time and in a more scalable manner.
ABSTRACT
Imaging flow cytometry (IFC) produces up to 12 spectrally distinct, information-rich images of single cells at a throughput of 5,000 cells per second. Yet cell populations are often still studied using manual gating, a technique that has several drawbacks. It would therefore be advantageous to replace manual gating with an automated process. Ideally, this automated process would be based on stain-free measurements, as the currently used staining techniques are expensive and potentially confounding. These stain-free measurements originate from the brightfield and darkfield image channels, which capture transmitted and scattered light, respectively. To realize this automated, stain-free approach, advanced machine learning (ML) methods are required. Previous works have successfully tested this approach on cell cycle phase classification with both a classical ML approach based on manually engineered features and a deep learning (DL) approach. In this work, we compare both approaches extensively on the problem of white blood cell classification. Four human whole blood samples were assayed on an ImageStream-X MK II imaging flow cytometer. Two samples were stained for the identification of eight white blood cell types, while the other two samples were stained for the identification of resting and active eosinophils. For both data sets, four ML classifiers were evaluated on stain-free imagery with stratified 5-fold cross-validation. On the white blood cell data set, the best results obtained were 0.778 and 0.703 balanced accuracy for classical ML and DL, respectively. On the eosinophil data set, the corresponding results were 0.871 and 0.856 balanced accuracy. We conclude that classifying cell types based on only stain-free images is possible with all four classifiers. Notably, we also find that the DL approaches tested in this work do not outperform the approaches based on manually engineered features. © 2019 International Society for Advancement of Cytometry.
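The evaluation protocol named above, stratified 5-fold cross-validation scored by balanced accuracy, can be illustrated with a minimal sketch. This is not the authors' code: the function names `stratified_folds` and `balanced_accuracy` are hypothetical, the labels are made-up toy data, and the fold assignment skips the shuffling a real experiment would normally apply. Balanced accuracy is taken here in its usual sense, the mean of per-class recall.

```python
from collections import defaultdict


def stratified_folds(labels, n_folds=5):
    """Assign each sample index to a fold so class proportions stay similar."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        # Deal the samples of each class round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so rare classes weigh as much as common ones."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, prediction in zip(y_true, y_pred):
        total[truth] += 1
        if truth == prediction:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Toy, made-up labels (assumption): 10 "neutrophil" and 5 "eosinophil" samples.
labels = ["neutrophil"] * 10 + ["eosinophil"] * 5
folds = stratified_folds(labels, n_folds=5)

# A trivial classifier that always predicts the majority class.
predictions = ["neutrophil"] * len(labels)
score = balanced_accuracy(labels, predictions)
```

In this toy case the majority-class predictor is right on 10 of 15 samples, yet its balanced accuracy is only 0.5 because it recovers none of the minority class. That property is why balanced accuracy suits data sets with unevenly represented cell types.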