Does your dermatology classifier know what it doesn't know? Detecting the long-tail of unseen conditions.

Guha Roy, Abhijit; Ren, Jie; Azizi, Shekoofeh; Loh, Aaron; Natarajan, Vivek; Mustafa, Basil; Pawlowski, Nick; Freyberg, Jan; Liu, Yuan; Beaver, Zach; Vo, Nam; Bui, Peggy; Winter, Samantha; MacWilliams, Patricia; Corrado, Greg S; Telang, Umesh; Liu, Yun; Cemgil, Taylan; Karthikesalingam, Alan; Lakshminarayanan, Balaji; Winkens, Jim

Affiliation

Guha Roy A; Google Health. Electronic address: agroy@google.com.
Ren J; Google Research, Brain Team. Electronic address: jjren@google.com.
Azizi S; Google Health.
Loh A; Google Health.
Natarajan V; Google Health.
Mustafa B; Google Research, Brain Team.
Pawlowski N; Google Health.
Freyberg J; Google Health.
Liu Y; Google Health.
Beaver Z; Google Health.
Vo N; Google Health.
Bui P; Google Health.
Winter S; Google Health.
MacWilliams P; Google Health.
Corrado GS; Google Health.
Telang U; Google Health.
Liu Y; Google Health.
Cemgil T; DeepMind.
Karthikesalingam A; Google Health.
Lakshminarayanan B; Google Research, Brain Team. Electronic address: balajiln@google.com.
Winkens J; Google Health. Electronic address: jimwinkens@google.com.

Med Image Anal; 75: 102274, 2022 Jan.

Article in English | MEDLINE | ID: mdl-34731777

ABSTRACT

Supervised deep learning models have proven to be highly effective in the classification of dermatological conditions. These models rely on the availability of abundant labeled training examples. However, in the real world, many dermatological conditions are individually too infrequent for per-condition classification with supervised learning. Although individually infrequent, these conditions may collectively be common and are therefore clinically significant in aggregate. To prevent models from generating erroneous outputs on such examples, there remains a considerable unmet need for deep learning systems that can better detect such infrequent conditions. These infrequent 'outlier' conditions are seen very rarely (or not at all) during training. In this paper, we frame this task as an out-of-distribution (OOD) detection problem. We set up a benchmark ensuring that outlier conditions are disjoint between the model training, validation, and test sets. Unlike traditional OOD detection benchmarks, where the task is to detect dataset distribution shift, we aim at the more challenging task of detecting subtle differences resulting from a different pathology or condition. We propose a novel hierarchical outlier detection (HOD) loss, which assigns multiple abstention classes corresponding to each training outlier class and jointly performs a coarse classification of inliers vs. outliers, along with fine-grained classification of the individual classes. We demonstrate that the proposed HOD loss-based approach outperforms leading methods that leverage outlier data during training. Further, performance is significantly boosted by using recent representation learning methods (BiT, SimCLR, MICLe). We also explore ensembling strategies for OOD detection and propose a diverse ensemble selection process for the best result.
In addition, we perform a subgroup analysis over conditions of varying risk levels and different skin types to investigate how OOD performance changes over each subgroup and demonstrate the gains of our framework in comparison to the baseline. Finally, we go beyond traditional performance metrics and introduce a cost matrix for model trust analysis to approximate downstream clinical impact. We use this cost matrix to compare the proposed method against the baseline, thereby making a stronger case for its effectiveness in real-world scenarios.
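
The abstract describes the HOD loss only in words, so the following is a minimal illustrative sketch rather than the authors' implementation. It assumes that the coarse inlier-vs-outlier probability is obtained by summing the fine-grained softmax probabilities within each group, that the two terms are combined with a weight (the hypothetical `coarse_weight`), and that the summed outlier probability serves as the OOD score. The function names `hod_loss` and `ood_score` and all parameters are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def hod_loss(logits, labels, num_inlier, coarse_weight=1.0, eps=1e-12):
    """Fine-grained cross-entropy plus a coarse inlier-vs-outlier term.

    Assumed layout (not taken from the source): columns [0, num_inlier)
    are inlier classes; the remaining columns are abstention classes,
    one per training outlier class.
    """
    probs = softmax(logits)
    rows = np.arange(len(labels))

    # Fine-grained term: cross-entropy over all inlier and outlier classes.
    fine = -np.log(probs[rows, labels] + eps)

    # Coarse term: binary cross-entropy on the summed group probabilities.
    p_in = probs[:, :num_inlier].sum(axis=1)
    p_out = probs[:, num_inlier:].sum(axis=1)
    is_outlier = labels >= num_inlier
    coarse = -np.log(np.where(is_outlier, p_out, p_in) + eps)

    return float(np.mean(fine + coarse_weight * coarse))


def ood_score(logits, num_inlier):
    """Assumed OOD score: total probability mass on the abstention classes."""
    return softmax(logits)[:, num_inlier:].sum(axis=1)


if __name__ == "__main__":
    # Toy batch: 2 inlier classes, 2 abstention classes (made-up logits).
    logits = np.array([[4.0, 0.5, 0.1, 0.2],
                       [0.1, 0.2, 3.5, 0.3]])
    labels = np.array([0, 2])
    print("loss:", hod_loss(logits, labels, num_inlier=2))
    print("OOD scores:", ood_score(logits, num_inlier=2))
```

In this sketch an input would be flagged as an outlier when its score exceeds a threshold chosen on validation data; the threshold choice is outside what the abstract specifies.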

Subjects

Dermatology; Benchmarking; Humans

Keywords

Deep learning; Dermatology; Ensembles; Long-tailed recognition; Out-of-distribution detection; Outlier exposure; Representation learning

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Dermatology Limit: Humans Language: En Journal: Med Image Anal Journal subject: DIAGNOSTIC IMAGING Year of publication: 2022 Document type: Article
