ABSTRACT
The deployment of automated deep-learning classifiers in clinical practice has the potential to streamline the diagnostic process and improve diagnostic accuracy, but the acceptance of such classifiers depends on both their accuracy and their interpretability. In general, accurate deep-learning classifiers provide little model interpretability, while interpretable models do not have competitive classification accuracy. In this paper, we introduce a new deep-learning diagnosis framework, called InterNRL, that is designed to be highly accurate and interpretable. InterNRL consists of a student-teacher framework, where the student model is an interpretable prototype-based classifier (ProtoPNet) and the teacher is an accurate global image classifier (GlobalNet). The two classifiers are mutually optimised with a novel reciprocal learning paradigm in which the student ProtoPNet learns from optimal pseudo labels produced by the teacher GlobalNet, while GlobalNet learns from ProtoPNet's classification performance and pseudo labels. This reciprocal learning paradigm enables InterNRL to be flexibly optimised under both fully- and semi-supervised learning scenarios, reaching state-of-the-art classification performance in both scenarios for the tasks of breast cancer and retinal disease diagnosis. Moreover, relying on weakly-labelled training images, InterNRL also achieves better breast cancer localisation and brain tumour segmentation results than other competing methods.
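The reciprocal learning loop described above can be made concrete with a minimal sketch: the teacher (GlobalNet) supplies pseudo labels for unlabelled or weakly-labelled images, the prototype-based student learns from both ground-truth and pseudo labels, and the teacher is in turn updated using the student's predictions. This is an illustrative sketch only, not the authors' implementation; the module definitions, the loss weighting, and the simplified prototype layer are assumptions (the real ProtoPNet compares spatial feature patches to prototypes rather than pooled feature vectors).

```python
# Illustrative sketch of the reciprocal student-teacher update described in the
# abstract. All names, network sizes and loss terms are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalNet(nn.Module):
    """Stand-in global image classifier (the 'teacher')."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(16, num_classes)

    def forward(self, x):
        return self.head(self.backbone(x))

class ProtoStudent(nn.Module):
    """Toy prototype-based classifier (the 'student'): class scores come from
    similarities between the pooled image feature and learned prototype vectors."""
    def __init__(self, num_classes=2, num_prototypes=10, dim=16):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, dim))
        self.head = nn.Linear(num_prototypes, num_classes)

    def forward(self, x):
        feat = self.backbone(x)                    # (B, dim)
        sim = -torch.cdist(feat, self.prototypes)  # similarity = negative distance
        return self.head(sim)

def reciprocal_step(student, teacher, opt_s, opt_t, x_lab, y_lab, x_unlab):
    # 1) Teacher produces pseudo labels for the unlabelled images.
    with torch.no_grad():
        teacher_pseudo = teacher(x_unlab).argmax(dim=1)
    # 2) Student learns from ground-truth labels and the teacher's pseudo labels.
    loss_s = F.cross_entropy(student(x_lab), y_lab) + \
             F.cross_entropy(student(x_unlab), teacher_pseudo)
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()
    # 3) Teacher learns from the labelled data and from the student's predictions,
    #    closing the reciprocal loop sketched here.
    with torch.no_grad():
        student_pseudo = student(x_unlab).argmax(dim=1)
    loss_t = F.cross_entropy(teacher(x_lab), y_lab) + \
             F.cross_entropy(teacher(x_unlab), student_pseudo)
    opt_t.zero_grad(); loss_t.backward(); opt_t.step()
    return loss_s.item(), loss_t.item()

if __name__ == "__main__":
    student, teacher = ProtoStudent(), GlobalNet()
    opt_s = torch.optim.Adam(student.parameters(), lr=1e-3)
    opt_t = torch.optim.Adam(teacher.parameters(), lr=1e-3)
    x_lab, y_lab = torch.randn(4, 1, 64, 64), torch.randint(0, 2, (4,))
    x_unlab = torch.randn(8, 1, 64, 64)
    print(reciprocal_step(student, teacher, opt_s, opt_t, x_lab, y_lab, x_unlab))
```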
Subjects
Breast Neoplasms, Deep Learning, Retinal Diseases, Humans, Female, Retina, Supervised Machine Learning
ABSTRACT
Methods to detect malignant lesions in screening mammograms are usually trained with fully annotated datasets, in which images are labelled with the localisation and classification of cancerous lesions. However, real-world screening mammogram datasets commonly have a subset that is fully annotated and another subset that is weakly annotated with just the global classification (i.e., without lesion localisation). Given the large size of such datasets, researchers usually face a dilemma with the weakly annotated subset: to not use it or to fully annotate it. The first option reduces detection accuracy because it does not exploit the whole dataset, and the second option is too expensive given that the annotation must be done by expert radiologists. In this paper, we propose a middle-ground solution to this dilemma, which is to formulate the training as a weakly- and semi-supervised learning problem that we refer to as malignant breast lesion detection with incomplete annotations. To address this problem, our new method comprises two stages, namely: (1) pre-training a multi-view mammogram classifier with weak supervision from the whole dataset, and (2) extending the trained classifier into a multi-view detector that is trained with semi-supervised student-teacher learning, where the training set contains both fully- and weakly-annotated mammograms. We provide extensive detection results on two real-world screening mammogram datasets containing incomplete annotations and show that our proposed approach achieves state-of-the-art results in the detection of malignant breast lesions with incomplete annotations.
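A hedged sketch of the two stages follows. It is not the paper's implementation: the shared-backbone multi-view fusion, the detector interface, and the confidence threshold used to filter the teacher's pseudo boxes are illustrative assumptions that only mirror the structure described in the abstract (stage 1: weakly supervised multi-view classification; stage 2: student-teacher pseudo-labelling of the weakly-annotated subset).

```python
# Illustrative sketch of the two-stage training described in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiViewClassifier(nn.Module):
    """Stage 1: weakly supervised multi-view classifier. The CC and MLO views
    share a backbone and their pooled features are concatenated before the
    global (image-level) malignancy head."""
    def __init__(self, num_classes=2, dim=32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(2 * dim, num_classes)

    def forward(self, cc_view, mlo_view):
        feats = torch.cat([self.backbone(cc_view), self.backbone(mlo_view)], dim=1)
        return self.head(feats)

def pseudo_label_weak_images(teacher_detector, weak_images, global_labels,
                             score_threshold=0.5):
    """Stage 2 (schematic): a teacher detector proposes boxes on the
    weakly-annotated images; boxes are kept only when they are confident and
    the known image-level label is malignant, and the surviving boxes serve
    as pseudo annotations for training the student detector."""
    pseudo_annotations = []
    for image, is_malignant in zip(weak_images, global_labels):
        boxes, scores = teacher_detector(image)
        if is_malignant:
            pseudo_annotations.append(boxes[scores > score_threshold])
        else:
            pseudo_annotations.append(boxes.new_zeros((0, 4)))  # no lesions expected
    return pseudo_annotations

if __name__ == "__main__":
    # Stage 1: train the multi-view classifier with image-level labels only.
    clf = MultiViewClassifier()
    cc, mlo = torch.randn(4, 1, 64, 64), torch.randn(4, 1, 64, 64)
    labels = torch.randint(0, 2, (4,))
    F.cross_entropy(clf(cc, mlo), labels).backward()
    # Stage 2: a dummy 'teacher detector' stands in for the detector that the
    # trained classifier is extended into; it returns random boxes and scores.
    def dummy_teacher(image):
        return torch.rand(3, 4), torch.rand(3)
    pseudo = pseudo_label_weak_images(dummy_teacher,
                                      [torch.randn(1, 64, 64)] * 2, [1, 0])
    print([p.shape for p in pseudo])
```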
Subjects
Breast Neoplasms, Mammography, Computer-Assisted Radiographic Image Interpretation, Humans, Breast Neoplasms/diagnostic imaging, Mammography/methods, Female, Computer-Assisted Radiographic Image Interpretation/methods, Algorithms, Supervised Machine Learning
ABSTRACT
Artificial intelligence (AI) readers of mammograms compare favourably to individual radiologists in detecting breast cancer. However, AI readers cannot perform at the level of the multi-reader systems used by screening programs in countries such as Australia, Sweden, and the UK; implementation therefore demands human-AI collaboration. Here, we use a large, high-quality retrospective mammography dataset from Victoria, Australia to conduct detailed simulations of five potential AI-integrated screening pathways, and we examine human-AI interaction effects to explore automation bias. Operating an AI reader as a second reader or as a high-confidence filter improves current screening outcomes by 1.9-2.5% in sensitivity and up to 0.6% in specificity, while achieving a 4.6-10.9% reduction in assessments and a 48-80.7% reduction in human reads. Automation bias degrades performance in multi-reader settings but improves it for single readers. This study provides insight into feasible approaches for AI-integrated screening pathways and the prospective studies necessary prior to clinical adoption.
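The two best-performing pathways reported above, AI as a second reader and AI as a high-confidence filter, can be sketched as simple per-episode decision rules. The thresholds, the arbitration rule on disagreement, and the simplified double-reading recall rule below are illustrative assumptions, not the study's simulation parameters.

```python
# Illustrative decision rules for two AI-integrated screening pathways.
def second_reader_pathway(human_read, ai_score, arbitration_read,
                          ai_threshold=0.5):
    """AI replaces the second human reader; disagreement goes to arbitration.
    Returns True if the episode is recalled for assessment."""
    ai_recall = ai_score >= ai_threshold
    if human_read == ai_recall:
        return human_read          # readers agree: accept the shared decision
    return arbitration_read        # disagreement: a third (human) read decides

def high_confidence_filter_pathway(ai_score, first_read, second_read,
                                   low=0.02, high=0.98):
    """AI screens out very low-risk cases and fast-tracks very high-risk ones;
    only intermediate cases receive the usual two human reads."""
    if ai_score <= low:
        return False               # AI-cleared: no human reads needed
    if ai_score >= high:
        return True                # AI-recalled: sent straight to assessment
    return first_read or second_read  # simplified double reading for the rest

# Example: one screening episode under each pathway.
print(second_reader_pathway(human_read=True, ai_score=0.3, arbitration_read=False))
print(high_confidence_filter_pathway(ai_score=0.6, first_read=False, second_read=True))
```

Under these rules, the second-reader pathway saves human reads because the AI replaces one of the two routine reads, while the filter pathway saves reads on the cases the AI triages with high confidence; the exact operating points in the study would determine the trade-off between read reduction and sensitivity.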
Subjects
Artificial Intelligence, Breast Neoplasms, Early Detection of Cancer, Mammography, Humans, Breast Neoplasms/diagnostic imaging, Breast Neoplasms/diagnosis, Female, Mammography/methods, Early Detection of Cancer/methods, Retrospective Studies, Middle Aged, Victoria/epidemiology, Aged, Mass Screening/methods, Sensitivity and Specificity
ABSTRACT
Supplemental material is available for this article. Keywords: Mammography, Screening, Convolutional Neural Network (CNN). Published under a CC BY 4.0 license. See also the commentary by Cadrin-Chênevert in this issue.