A cell-level discriminative neural network model for diagnosis of blood cancers.

Robles, Edgar E; Jin, Ye; Smyth, Padhraic; Scheuermann, Richard H; Bui, Jack D; Wang, Huan-You; Oak, Jean; Qian, Yu

Affiliation

Robles EE; Department of Computer Science, University of California, Irvine, CA 92697, United States.
Jin Y; Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, United States.
Smyth P; Department of Computer Science, University of California, Irvine, CA 92697, United States.
Scheuermann RH; Department of Informatics, J. Craig Venter Institute, La Jolla, CA 92037, United States.
Bui JD; Department of Pathology, University of California, San Diego, CA 92093, United States.
Wang HY; Center for Infectious Disease and Vaccine Research, La Jolla Institute for Immunology, La Jolla, CA 92037, United States.
Oak J; Department of Pathology, University of California, San Diego, CA 92093, United States.
Qian Y; Department of Pathology, University of California, San Diego, CA 92093, United States.

Bioinformatics; 39(10), 2023-10-03.

Article in En | MEDLINE | ID: mdl-37756695

ABSTRACT

MOTIVATION:

Precise identification of cancer cells in patient samples is essential for accurate diagnosis and clinical monitoring but has been a significant challenge in machine learning approaches for cancer precision medicine. In most scenarios, training data are only available with disease annotation at the subject or sample level. Traditional approaches separate the classification process into multiple steps that are optimized independently. Recent methods either focus on predicting sample-level diagnosis without identifying individual pathologic cells or are less effective for identifying heterogeneous cancer cell phenotypes.

RESULTS:

We developed a generalized end-to-end differentiable model, the Cell Scoring Neural Network (CSNN), which takes sample-level training data and simultaneously predicts the diagnosis of each test sample and the identity of the diagnostic cells within it. Cell-level density differences between samples are linked to the sample diagnosis, which allows the probability that an individual cell is diagnostic to be calculated using backpropagation. We applied CSNN to two independent clinical flow cytometry datasets for leukemia diagnosis. In both qualitative and quantitative assessments, CSNN outperformed preexisting neural network modeling approaches for both cancer diagnosis and cell-level classification. Post hoc decision trees and 2D dot plots were generated to interpret the identified cancer cells, showing that the identified cell phenotypes match the cancer endotypes observed clinically in patient cohorts. Independent data clustering analysis confirmed the identified cancer cell populations.

AVAILABILITY AND IMPLEMENTATION:

The source code of CSNN and the datasets used in the experiments are publicly available on GitHub (http://github.com/erobl/csnn). Raw FCS files can be downloaded from FlowRepository (ID: FR-FCM-Z6YK).
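The idea summarized under RESULTS, that cell-level density differences between samples can be tied to the sample-level diagnosis, can be illustrated with a deliberately simplified, non-neural sketch. This is not the CSNN architecture: a fixed histogram grid stands in for the learned network, there is no backpropagation, and every name below (`bin_of`, `density`, `cell_scores`, `sample_score`, the toy marker values) is a hypothetical chosen for illustration only.

```python
from collections import Counter

def bin_of(cell, width=1.0):
    """Map a cell (tuple of marker intensities) to a grid bin."""
    return tuple(int(v // width) for v in cell)

def density(samples, width=1.0):
    """Fraction of all pooled cells that fall in each bin."""
    counts = Counter(bin_of(c, width) for s in samples for c in s)
    total = sum(counts.values())
    return {b: n / total for b, n in counts.items()}

def cell_scores(sample, d_pos, d_neg, width=1.0):
    """Score each cell by the excess density of its bin in diseased samples.

    score = max(0, (d_pos - d_neg) / d_pos), which lies in [0, 1].
    """
    scores = []
    for c in sample:
        b = bin_of(c, width)
        p, n = d_pos.get(b, 0.0), d_neg.get(b, 0.0)
        scores.append(max(0.0, (p - n) / p) if p > 0 else 0.0)
    return scores

def sample_score(sample, d_pos, d_neg, width=1.0):
    """Sample-level score: mean cell score (estimated diagnostic fraction)."""
    s = cell_scores(sample, d_pos, d_neg, width)
    return sum(s) / len(s)

# Toy data (assumed): two markers per cell, labels only at the sample level.
healthy = [[(0.5, 0.5)] * 10]
diseased = [[(0.5, 0.5)] * 7 + [(3.5, 3.5)] * 3]

d_pos = density(diseased)
d_neg = density(healthy)
scores = cell_scores(diseased[0], d_pos, d_neg)
```

In this toy setting only the cells in the region that is denser in the diseased sample receive a nonzero score, so sample-level labels alone are enough to flag candidate diagnostic cells. A learned, differentiable scorer would replace the fixed grid in an approach like the one the abstract describes.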

Subject(s)

Hematologic Neoplasms; Neoplasms; Humans; Neural Networks, Computer; Neoplasms/diagnosis; Flow Cytometry/methods; Software

Full text: 1 | Database: MEDLINE | Main subject: Hematologic Neoplasms / Neoplasms | Type of study: Diagnostic_studies / Prognostic_studies / Qualitative_research | Limits: Humans | Language: En | Year: 2023 | Type: Article
