ABSTRACT
MOTIVATION: Nucleus detection, segmentation and classification are fundamental to high-resolution mapping of the tumor microenvironment using whole-slide histopathology images. The growing interest in leveraging the power of deep learning to achieve state-of-the-art performance often comes at the cost of explainability, yet there is general consensus that explainability is critical for trustworthiness and widespread clinical adoption. Unfortunately, current explainability paradigms that rely on pixel saliency heatmaps or superpixel importance scores are not well-suited for nucleus classification. Techniques like Grad-CAM or LIME provide explanations that are indirect, qualitative and/or nonintuitive to pathologists. RESULTS: In this article, we present techniques to enable scalable nuclear detection, segmentation and explainable classification. First, we show how modifications to the widely used Mask R-CNN architecture, including decoupling the detection and classification tasks, improve accuracy and enable learning from hybrid annotation datasets like NuCLS, which contain mixtures of bounding boxes and segmentation boundaries. Second, we introduce an explainability method called Decision Tree Approximation of Learned Embeddings (DTALE), which provides explanations for classification model behavior globally, as well as for individual nuclear predictions. DTALE explanations are simple, quantitative, and can flexibly use any measurable morphological features that make sense to practicing pathologists, without sacrificing model accuracy. Together, these techniques present a step toward realizing the promise of computational pathology in computer-aided diagnosis and discovery of morphologic biomarkers. AVAILABILITY AND IMPLEMENTATION: Relevant code can be found at github.com/CancerDataScience/NuCLS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
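The general idea of approximating a classifier with a decision tree can be illustrated with a depth-one tree (a stump) fitted to a model's predicted labels. This is only a minimal sketch of a generic surrogate model, not the DTALE procedure itself, whose details are not given in the abstract. The function name `fit_stump`, the two morphological features (area, circularity) and all values are hypothetical.

```python
def fit_stump(features, model_labels):
    """Find the single (feature, threshold) split that best reproduces
    the binary labels (0/1) predicted by some other model.

    Returns (fidelity, feature_index, threshold, label_above), where
    fidelity is the fraction of samples on which the stump agrees with
    the model, or None if no feature has two distinct values.
    """
    best = None
    for j in range(len(features[0])):
        values = sorted({row[j] for row in features})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2.0
            for label_above in (0, 1):
                agree = sum(
                    (label_above if row[j] > threshold else 1 - label_above) == y
                    for row, y in zip(features, model_labels)
                )
                fidelity = agree / len(model_labels)
                if best is None or fidelity > best[0]:
                    best = (fidelity, j, threshold, label_above)
    return best


# Hypothetical per-nucleus features: [area, circularity].
nuclei = [[20.0, 0.9], [25.0, 0.8], [60.0, 0.5], [70.0, 0.6], [30.0, 0.4]]
# Hypothetical labels predicted by a black-box classifier (1 = "tumor").
predicted = [0, 0, 1, 1, 0]

fidelity, feature_index, threshold, label_above = fit_stump(nuclei, predicted)
print(f"feature {feature_index} > {threshold} -> class {label_above} "
      f"(fidelity {fidelity:.2f})")
```

The fidelity score is what makes such an explanation quantitative: it states how often the simple rule agrees with the model it approximates.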
Subjects
Cell Nucleus, Decision Trees
ABSTRACT
The placenta is the first organ to form and performs the functions of the lung, gut, kidney, and endocrine systems. Abnormalities in the placenta cause or reflect most abnormalities in gestation and can have life-long consequences for the mother and infant. Placental villi undergo a complex but reproducible sequence of maturation across the third trimester. Abnormalities of villous maturation are a feature of gestational diabetes and preeclampsia, among others, but there is significant interobserver variability in their diagnosis. Machine learning has emerged as a powerful tool for research in pathology. To capture the volume of data and manage heterogeneity within the placenta, we developed GestaltNet, which emulates human attention to high-yield areas and aggregation across regions. We used this network to estimate the gestational age (GA) of scanned placental slides and compared it to a baseline model lacking the attention and aggregation functions. In the test set, GestaltNet showed a higher r² (0.9444 vs. 0.9220) than the baseline model. The mean absolute error (MAE) between the estimated and actual GA was also lower for GestaltNet (1.0847 weeks vs. 1.4505 weeks). On whole-slide images, we found that the attention sub-network discriminates areas of terminal villi from other placental structures. Using this behavior, we estimated GA for 36 whole slides not previously seen by the model. In this task, similar to that faced by human pathologists, the model showed an r² of 0.8859 with an MAE of 1.3671 weeks. We show that villous maturation is machine-recognizable. Machine-estimated GA could be useful when GA is unknown or to study abnormalities of villous maturation, including those in gestational diabetes or preeclampsia. GestaltNet points toward a future of genuinely whole-slide digital pathology by incorporating human-like behaviors of attention and aggregation.
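The two metrics reported above can be computed as in the following minimal sketch. It assumes that the reported r2 is the coefficient of determination (1 - SS_res / SS_tot); the abstract does not say whether it is this quantity or the squared Pearson correlation. The gestational ages below are hypothetical and are not data from the study.

```python
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    average = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - average) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical gestational ages in weeks (illustrative only).
actual_ga = [30.0, 34.0, 38.0, 40.0]
predicted_ga = [31.0, 33.0, 38.5, 39.5]

mae = mean_absolute_error(actual_ga, predicted_ga)
r2 = r_squared(actual_ga, predicted_ga)
print(f"MAE = {mae:.2f} weeks, r2 = {r2:.4f}")
```

MAE is expressed in the same unit as the target (weeks), which makes it easier to relate to clinical dating error than r2 alone.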
Subjects
Deep Learning, Gestational Age, Computer-Assisted Image Interpretation/methods, Placenta/diagnostic imaging, Placenta/pathology, Gestational Diabetes/pathology, Female, Histocytochemistry, Humans, Pre-Eclampsia/pathology, Pregnancy
ABSTRACT
Cancer histology reflects underlying molecular processes and disease progression and contains rich phenotypic information that is predictive of patient outcomes. In this study, we show a computational approach for learning patient outcomes from digital pathology images using deep learning to combine the power of adaptive machine learning algorithms with traditional survival models. We illustrate how these survival convolutional neural networks (SCNNs) can integrate information from both histology images and genomic biomarkers into a single unified framework to predict time-to-event outcomes and show prediction accuracy that surpasses the current clinical paradigm for predicting the overall survival of patients diagnosed with glioma. We use statistical sampling techniques to address challenges in learning survival from histology images, including tumor heterogeneity and the need for large training cohorts. We also provide insights into the prediction mechanisms of SCNNs, using heat map visualization to show that SCNNs recognize important structures, like microvascular proliferation, that are related to prognosis and that are used by pathologists in grading. These results highlight the emerging role of deep learning in precision medicine and suggest an expanding utility for computational analysis of histology in the future practice of pathology.
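One common way to couple a neural network with a traditional survival model is to train the network's scalar risk output with the negative log partial likelihood of the Cox proportional hazards model. The abstract does not name the survival model used, so the Cox loss below is an assumption, shown only as a minimal sketch; the function name and the follow-up data are hypothetical.

```python
import math


def neg_log_partial_likelihood(times, events, risks):
    """Cox negative log partial likelihood (no special handling of ties).

    times  -- follow-up time for each patient
    events -- 1 if the event (death) was observed, 0 if censored
    risks  -- predicted log-risk score for each patient
    """
    total = 0.0
    for t_i, e_i, r_i in zip(times, events, risks):
        if not e_i:
            continue  # censored patients contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_set = [r_j for t_j, r_j in zip(times, risks) if t_j >= t_i]
        total -= r_i - math.log(sum(math.exp(r) for r in risk_set))
    return total


# Hypothetical follow-up data: times in months, 1 = death observed.
times = [5.0, 8.0, 12.0]
events = [1, 0, 1]

uninformative = neg_log_partial_likelihood(times, events, [0.0, 0.0, 0.0])
well_ordered = neg_log_partial_likelihood(times, events, [2.0, 0.0, -2.0])
print(f"{uninformative:.4f} vs {well_ordered:.4f}")
```

The loss decreases when patients who die earlier receive higher risk scores, which is the property that lets censored time-to-event data supervise an image model.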
Subjects
Brain Neoplasms/genetics, Brain Neoplasms/pathology, Genomics/methods, Glioma/genetics, Glioma/pathology, Histological Techniques/methods, Neural Networks (Computer), Algorithms, Brain Neoplasms/therapy, Glioma/therapy, Humans, Computer-Assisted Image Processing, Precision Medicine, Prognosis
ABSTRACT
BACKGROUND: Deep learning enables accurate high-resolution mapping of cells and tissue structures that can serve as the foundation of interpretable machine-learning models for computational pathology. However, generating adequate labels for these structures is a critical barrier, given the time and effort required from pathologists. RESULTS: This article describes a novel collaborative framework for engaging crowds of medical students and pathologists to produce quality labels for cell nuclei. We used this approach to produce the NuCLS dataset, containing >220,000 annotations of cell nuclei in breast cancers. This builds on prior work labeling tissue regions to produce an integrated tissue region- and cell-level annotation dataset for training that is the largest such resource for multi-scale analysis of breast cancer histology. This article presents data and analysis results for single and multi-rater annotations from both non-experts and pathologists. We present a novel workflow that uses algorithmic suggestions to collect accurate segmentation data without the need for laborious manual tracing of nuclei. Our results indicate that even noisy algorithmic suggestions do not adversely affect pathologist accuracy and can help non-experts improve annotation quality. We also present a new approach for inferring truth from multiple raters and show that non-experts can produce accurate annotations for visually distinctive classes. CONCLUSIONS: This study is the most extensive systematic exploration of the large-scale use of wisdom-of-the-crowd approaches to generate data for computational pathology applications.
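The simplest baseline for inferring truth from multiple raters is a per-nucleus majority vote, sketched below. This is not the new inference approach the article presents, which the abstract does not describe; it is shown only to make the multi-rater setting concrete. The nucleus identifiers and class names are hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return (consensus_label, agreement) for a non-empty list of labels.

    The consensus label is None when the highest vote count is tied.
    Agreement is the fraction of raters who chose the most frequent label.
    """
    counts = Counter(labels)
    top_count = max(counts.values())
    winners = [label for label, count in counts.items() if count == top_count]
    agreement = top_count / len(labels)
    if len(winners) > 1:
        return None, agreement
    return winners[0], agreement


# Hypothetical annotations: nucleus id -> labels from different raters.
annotations = {
    "nucleus_1": ["tumor", "tumor", "lymphocyte"],
    "nucleus_2": ["stromal", "lymphocyte"],
}

consensus = {nid: majority_vote(labels) for nid, labels in annotations.items()}
for nid, (label, agreement) in consensus.items():
    print(nid, label, f"{agreement:.2f}")
```

Majority voting treats all raters as equally reliable, which is one reason more elaborate truth-inference methods are used when rater skill varies.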
Subjects
Breast Neoplasms, Crowdsourcing, Breast Neoplasms/pathology, Cell Nucleus, Crowdsourcing/methods, Female, Humans, Machine Learning
ABSTRACT
Whole-slide histology images contain information that is valuable for clinical and basic science investigations of cancer, but extracting quantitative measurements from these images is challenging for researchers who are not image analysis specialists. In this article, we describe HistomicsML2, a software tool for learn-by-example training of machine learning classifiers for histologic patterns in whole-slide images. This tool improves training efficiency and classifier performance by guiding users to the most informative training examples for labeling and can be used to develop classifiers for prospective application or as a rapid annotation tool that is adaptable to different cancer types. HistomicsML2 runs as a containerized server application that provides web-based user interfaces for classifier training, validation, exporting inference results, and collaborative review, and that can be deployed on GPU servers or cloud platforms. We demonstrate the utility of this tool by using it to classify tumor-infiltrating lymphocytes in breast carcinoma and cutaneous melanoma. SIGNIFICANCE: An interactive machine learning tool for analyzing digital pathology images enables cancer researchers to measure histologic patterns for clinical and basic science studies.
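Guiding users to the most informative examples is often done with uncertainty sampling, in which the samples whose predicted probability lies closest to the decision boundary are queued for labeling first. The abstract does not state which selection strategy HistomicsML2 uses, so the following is a minimal sketch under that assumption; the function names and probabilities are hypothetical.

```python
def uncertainty(prob_positive):
    """1.0 when the classifier is undecided (p = 0.5), 0.0 when p is 0 or 1."""
    return 1.0 - 2.0 * abs(prob_positive - 0.5)


def select_for_labeling(probabilities, k):
    """Return the indices of the k most uncertain samples, most uncertain first."""
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: uncertainty(probabilities[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical positive-class probabilities for five unlabeled image patches.
probabilities = [0.97, 0.52, 0.10, 0.45, 0.80]
queue = select_for_labeling(probabilities, 2)
print(queue)
```

Labeling the queued samples and retraining, then repeating, concentrates annotation effort where the current classifier is least reliable.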