ABSTRACT
BACKGROUND: Convolutional neural network-based image processing research is actively being conducted for pathology image analysis. As a convolutional neural network model requires a large amount of image data for training, active learning (AL) has been developed to produce efficient learning with a small amount of training data. However, existing studies have not specifically considered the characteristics of pathological data collected from the workplace. For various reasons, noisy patches can be selected instead of clean patches during AL, thereby reducing its efficiency. This study proposes an effective AL method for cancer pathology that works robustly on noisy datasets. METHODS: Our proposed method to develop a robust AL approach for noisy histopathology datasets consists of the following three steps: 1) training a loss prediction module, 2) collecting predicted loss values, and 3) sampling data for labeling. This proposed method calculates the amount of information in unlabeled data as predicted loss values and removes noisy data based on predicted loss values to reduce the rate at which noisy data are selected from the unlabeled dataset. We identified a suitable threshold for optimizing the efficiency of AL through sensitivity analysis. RESULTS: We compared the results obtained with the identified threshold with those of existing representative AL methods. In the final iteration, the proposed method achieved a performance of 91.7% on the noisy dataset and 92.4% on the clean dataset, resulting in a performance reduction of less than 1%. Concomitantly, the noise selection ratio averaged only 2.93% on each iteration. CONCLUSIONS: The proposed AL method showed robust performance on datasets containing noisy data by avoiding data selection in predictive loss intervals where noisy data are likely to be distributed. 
The proposed method contributes to medical image analysis by screening data and producing a robust and effective classification model tailored for cancer pathology image processing in the workplace.
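The sampling step described above can be illustrated with a minimal sketch. It assumes that predicted loss values have already been collected from a trained loss prediction module, and that noisy patches concentrate in a known predicted-loss interval. The function name `select_for_labeling`, the patch identifiers, and the interval bounds are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def select_for_labeling(
    predicted_loss: Dict[str, float],
    budget: int,
    noise_interval: Tuple[float, float],
) -> List[str]:
    """Pick up to `budget` unlabeled patches with the highest predicted loss,
    skipping any patch whose predicted loss falls inside `noise_interval`.

    The interval is an assumed stand-in for the threshold that the study
    identifies through sensitivity analysis.
    """
    low, high = noise_interval
    # Drop patches whose predicted loss lies where noisy data are assumed to sit.
    candidates = [
        (loss, patch_id)
        for patch_id, loss in predicted_loss.items()
        if not (low <= loss <= high)
    ]
    # Higher predicted loss is treated as more informative; ties break by id.
    candidates.sort(key=lambda item: (-item[0], item[1]))
    return [patch_id for _, patch_id in candidates[:budget]]


# Hypothetical predicted losses for five unlabeled patches.
predicted = {"p1": 0.10, "p2": 0.80, "p3": 2.50, "p4": 0.65, "p5": 3.10}
selected = select_for_labeling(
    predicted, budget=2, noise_interval=(2.0, float("inf"))
)
print(selected)
```

In this toy example the two patches with the largest predicted losses are excluded as likely noise, and the next most informative patches are sent for labeling instead.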
Subjects
Image Processing, Computer-Assisted; Neoplasms; Humans; Image Processing, Computer-Assisted/methods; Neural Networks, Computer; Neoplasms/diagnostic imaging; Workplace
ABSTRACT
Deep neural network (DNN) models have been applied to a wide variety of medical image analysis tasks, often with performance that matches that of medical doctors. However, given that even minor errors in a model can affect patients' lives, it is critical that these models are continuously improved. Hence, active learning (AL) has garnered attention as an effective and sustainable strategy for enhancing DNN models in the medical domain. Extant AL research in histopathology has primarily focused on patch datasets derived from whole-slide images (WSIs), a standard form of cancer diagnostic image obtained from a high-resolution scanner. However, this approach does not address the selection of WSIs, which can impede the performance improvement of deep learning models and increase the number of WSIs needed to achieve the target performance. This study introduces a WSI-level AL method, termed WSI-informative selection (WISE). WISE is designed to select informative WSIs using a newly formulated WSI-level class distance metric. This method aims to identify diverse and uncertain WSIs, thereby contributing to model performance enhancement. WISE demonstrates state-of-the-art performance across the Colon and Stomach datasets, collected in the real world, as well as the public DigestPath dataset, reducing the required number of WSIs more than threefold compared to the one-pool dataset setting, which has been the dominant setting in the field.
ABSTRACT
This paper proposes a deep learning-based patch label denoising method (LossDiff) for improving the classification of whole-slide images of cancer using a convolutional neural network (CNN). Automated whole-slide image classification is often challenging because it requires a large amount of labeled data. Pathologists annotate the region of interest by marking malignant areas, which poses a high risk of introducing patch-level label noise: benign regions, which are typically small, are included within the malignant annotations, resulting in low classification accuracy with many Type-II errors. To overcome this critical problem, this paper presents a simple yet effective method for noisy patch classification. The proposed method, validated using stomach cancer images, provides a significant improvement over existing methods in patch-based cancer classification, with accuracies of 98.81%, 97.30%, and 89.47% for binary, ternary, and quaternary classes, respectively. Moreover, we conduct several experiments at different noise levels using a publicly available dataset to further demonstrate the robustness of the proposed method. Given the high cost of producing explicit annotations for whole-slide images and the unavoidably error-prone nature of human annotation of medical images, the proposed method has practical implications for whole-slide image annotation and automated cancer diagnosis.
ABSTRACT
CNN-based image processing has been actively applied to histopathological analysis to detect and classify cancerous tumors automatically. However, CNN-based classifiers generally predict labels with overconfidence, which becomes a serious problem in the medical domain. The objective of this study is to propose a new training method, called MixPatch, designed to improve a CNN-based classifier by specifically addressing the prediction uncertainty problem, and to examine its effectiveness in improving diagnostic performance in the context of histopathological image analysis. MixPatch generates and uses a new sub-training dataset, which consists of mixed-patches and their predefined ground-truth labels, for every mini-batch. Mixed-patches are generated from a small set of clean patches confirmed by pathologists, while their ground-truth labels are defined using a proportion-based soft labeling method. Our results, obtained using a large histopathological image dataset, show that the proposed method performs better and alleviates overconfidence more effectively than any other method examined in the study. More specifically, our model showed 97.06% accuracy, an increase of 1.6% to 12.18%, while achieving an expected calibration error of 0.76%, a decrease of 0.6% to 6.3%, relative to the other models. By specifically considering the mixed-region variation characteristics of histopathology images, MixPatch augments the extant mixed-image methods for medical image analysis, in which prediction uncertainty is a crucial issue. The proposed method provides a new way to systematically alleviate the overconfidence problem of CNN-based classifiers and improve their prediction accuracy, contributing toward more calibrated and reliable histopathology image analysis.
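The idea of building a mixed-patch with a proportion-based soft label can be sketched as follows. This is only one plausible reading of the abstract, which does not give the exact labeling rule: here the soft label is simply the fraction of abnormal tiles in a small grid. The function name `make_mixed_patch`, the grid size, and the tile identifiers are hypothetical, and tiles are represented by string identifiers rather than image arrays.

```python
import random
from typing import List, Optional, Sequence, Tuple


def make_mixed_patch(
    normal_pool: Sequence[str],
    abnormal_pool: Sequence[str],
    grid: int = 2,
    rng: Optional[random.Random] = None,
) -> Tuple[List[str], Tuple[float, float]]:
    """Assemble a grid x grid mixed-patch from clean tiles and return it with
    a soft label (p_normal, p_abnormal) equal to the tile proportions.

    Assumption: the soft label is the plain proportion of abnormal tiles;
    the paper's actual labeling rule may differ.
    """
    rng = rng or random.Random()
    n_cells = grid * grid
    # Choose how many of the cells are filled with abnormal tiles.
    n_abnormal = rng.randint(0, n_cells)
    tiles = [rng.choice(abnormal_pool) for _ in range(n_abnormal)]
    tiles += [rng.choice(normal_pool) for _ in range(n_cells - n_abnormal)]
    # Shuffle so abnormal tiles are not always in the first grid positions.
    rng.shuffle(tiles)
    p_abnormal = n_abnormal / n_cells
    return tiles, (1.0 - p_abnormal, p_abnormal)


# Hypothetical clean tiles confirmed by pathologists.
tiles, soft_label = make_mixed_patch(
    normal_pool=["n1", "n2"],
    abnormal_pool=["a1"],
    grid=2,
    rng=random.Random(0),
)
print(tiles, soft_label)
```

Training against such soft targets, rather than one-hot labels, is one way a classifier can be discouraged from assigning near-certain probabilities to patches that contain mixed regions.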
ABSTRACT
BACKGROUND: Colorectal and gastric cancer are major causes of cancer-related deaths. In Korea, gastrointestinal (GI) endoscopic biopsy specimens account for a high percentage of histopathologic examinations. Lack of a sufficient pathologist workforce can cause an increase in human errors, threatening patient safety. Therefore, we developed a digital pathology total solution combining artificial intelligence (AI) classifier models and pathology laboratory information system for GI endoscopic biopsy specimens to establish a post-analytic daily fast quality control (QC) system, which was applied in clinical practice for a 3-month trial run by four pathologists. METHODS AND FINDINGS: Our whole slide image (WSI) classification framework comprised patch-generator, patch-level classifier, and WSI-level classifier. The classifiers were both based on DenseNet (Dense Convolutional Network). In laboratory tests, the WSI classifier achieved accuracy rates of 95.8% and 96.0% in classifying histopathological WSIs of colorectal and gastric endoscopic biopsy specimens, respectively, into three classes (Negative for dysplasia, Dysplasia, and Malignant). Classification by pathologic diagnosis and AI prediction were compared and daily reviews were conducted, focusing on discordant cases for early detection of potential human errors by the pathologists, allowing immediate correction, before the pathology report error is conveyed to the patients. During the 3-month AI-assisted daily QC trial run period, approximately 7-10 times the number of slides compared to that in the conventional monthly QC (33 months) were reviewed by pathologists; nearly 100% of GI endoscopy biopsy slides were double-checked by the AI models. Further, approximately 17-30 times the number of potential human errors were detected within an average of 1.2 days. 
CONCLUSIONS: The AI-assisted daily QC system that we developed and established demonstrated notable improvements in QC, in quantitative, qualitative, and time utility aspects. Ultimately, we developed an independent AI-assisted post-analytic daily fast QC system that was clinically applicable and influential, which could enhance patient safety.
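The daily review step, in which pathologic diagnoses are compared with AI predictions and discordant cases are flagged, can be sketched as below. This is a minimal illustration under stated assumptions, not the deployed system: the function name `find_discordant`, the slide identifiers, and the dictionary-based data layout are hypothetical. Only the three class names come from the abstract.

```python
from typing import Dict, List

# The three WSI-level classes named in the abstract.
CLASSES = ("Negative for dysplasia", "Dysplasia", "Malignant")


def find_discordant(
    pathologist: Dict[str, str],
    ai_prediction: Dict[str, str],
) -> List[str]:
    """Return slide ids whose pathologist diagnosis differs from the AI class.

    Slides without an AI prediction are skipped rather than flagged.
    """
    discordant = []
    for slide_id in sorted(pathologist):
        predicted = ai_prediction.get(slide_id)
        if predicted is not None and predicted != pathologist[slide_id]:
            discordant.append(slide_id)
    return discordant


# Hypothetical diagnoses for one day's slides.
diagnoses = {"S001": "Dysplasia", "S002": "Malignant", "S003": "Negative for dysplasia"}
predictions = {"S001": "Dysplasia", "S002": "Dysplasia", "S003": "Negative for dysplasia"}
to_review = find_discordant(diagnoses, predictions)
print(to_review)
```

Only the flagged slides would then go to a pathologist for a second look, which is what keeps a daily review cycle feasible.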
Subjects
Artificial Intelligence; Colorectal Neoplasms; Humans; Biopsy; Endoscopy, Gastrointestinal; Quality Control; Colorectal Neoplasms/diagnosis
ABSTRACT
ETHNOPHARMACOLOGICAL RELEVANCE: Plants have been the most important natural resources for traditional medicine and for the modern pharmaceutical industry. There is demand for finding alternative medicinal herbs with similar efficacy. Because the probability of discovering useful compounds by random screening is very low, researchers have advocated targeted selection approaches. Furthermore, because drug repositioning can speed up the process of drug development, an integrated technique that exploits chemical, genetic, and disease information has recently been developed. Building upon these findings, in this paper we propose a novel framework for the targeted selection of herbs with similar efficacy by exploiting drug repositioning techniques and curated modern scientific biomedical knowledge, with the goal of improving the possibility of inferring traditional empirical ethnopharmacological knowledge. MATERIALS AND METHODS: To rank candidate herbs on the basis of their similarity to a target herb, we proposed and evaluated a framework that comprises the following four layers: links, extract, similarity, and model. In the framework, multiple databases are linked to build an herb-compound-protein-disease network, composed of one tripartite network and two bipartite networks, allowing comprehensive and detailed information to be extracted. Further, various similarity scores between herbs are calculated, and prediction models are then trained and tested on the basis of these similarity features. RESULTS: The proposed framework was found to be feasible in terms of link loss. Out of the 50 similarities, the best one enhanced the performance of ranking herbs with similar efficacy by about 120-320% compared with our previous study. Also, the prediction model showed improved performance by about 180-480%.
While building the prediction model, we identified compound information as the most important knowledge source and structural similarity as the most useful measure. CONCLUSIONS: In the proposed framework, we took knowledge of herbal medicine, chemistry, biology, and medicine into consideration to rank candidate herbs with similar efficacy. The experimental results demonstrated that the framework outperformed the baselines, and they identified the important knowledge source and the useful similarity measure.
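Ranking candidate herbs by similarity to a target herb can be illustrated with a small sketch over herb-compound links. Jaccard overlap of compound sets is used here purely as an assumed example measure. The abstract does not list its 50 similarities, and it reports structural similarity as the most useful one, which this sketch does not compute. The names `jaccard` and `rank_candidates` and all herb and compound identifiers are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_candidates(
    target: str,
    herb_compounds: Dict[str, Set[str]],
) -> List[Tuple[str, float]]:
    """Rank every herb other than `target` by compound-set similarity to it."""
    target_set = herb_compounds[target]
    scores = [
        (herb, jaccard(target_set, compounds))
        for herb, compounds in herb_compounds.items()
        if herb != target
    ]
    # Highest similarity first; ties break alphabetically by herb name.
    scores.sort(key=lambda item: (-item[1], item[0]))
    return scores


# Hypothetical herb-compound links extracted from a linked network.
links = {
    "herb_A": {"c1", "c2", "c3"},
    "herb_B": {"c1", "c2", "c3", "c4"},
    "herb_C": {"c3", "c5"},
    "herb_D": {"c6"},
}
ranking = rank_candidates("herb_A", links)
print(ranking)
```

In a framework like the one described, several such scores per herb pair would serve as features for a prediction model rather than being used as a single ranking on their own.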