Attention De-sparsification Matters: Inducing diversity in digital pathology representation learning.

Kapse, Saarthak; Das, Srijan; Zhang, Jingwei; Gupta, Rajarsi R; Saltz, Joel; Samaras, Dimitris; Prasanna, Prateek

Affiliation

Kapse S; Stony Brook University, 100 Nicolls Rd, Stony Brook, NY, 11794, USA. Electronic address: saarthak.kapse@stonybrook.edu.
Das S; UNC Charlotte, 9201 University City Blvd, Charlotte, NC, 28223, USA.
Zhang J; Stony Brook University, 100 Nicolls Rd, Stony Brook, NY, 11794, USA.
Gupta RR; Stony Brook University, 100 Nicolls Rd, Stony Brook, NY, 11794, USA.
Saltz J; Stony Brook University, 100 Nicolls Rd, Stony Brook, NY, 11794, USA.
Samaras D; Stony Brook University, 100 Nicolls Rd, Stony Brook, NY, 11794, USA.
Prasanna P; Stony Brook University, 100 Nicolls Rd, Stony Brook, NY, 11794, USA. Electronic address: prateek.prasanna@stonybrook.edu.

Med Image Anal ; 93: 103070, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38176354

ABSTRACT

We propose DiRL, a Diversity-inducing Representation Learning technique for histopathology imaging. Self-supervised learning (SSL) techniques, such as contrastive and non-contrastive approaches, have been shown to learn rich and effective representations of digitized tissue samples with limited pathologist supervision. Our analysis of the attention distributions of vanilla SSL-pretrained models reveals an insightful observation: sparsity in attention, i.e., models tend to localize most of their attention on a few prominent patterns in the image. Although attention sparsity can be beneficial in natural images, where these prominent patterns are the object of interest itself, it can be sub-optimal in digital pathology. This is because, unlike natural images, digital pathology scans are not object-centric but rather a complex phenotype of various spatially intermixed biological components. Inadequate diversification of attention in these complex images could result in crucial information loss. To address this, we leverage cell segmentation to densely extract multiple histopathology-specific representations, and then propose a prior-guided dense pretext task designed to match the multiple corresponding representations between the views. Through this, the model learns to attend to various components more closely and evenly, thus inducing adequate diversification in attention for capturing context-rich representations. Through quantitative and qualitative analysis on multiple tasks across cancer types, we demonstrate the efficacy of our method and observe that the attention is more globally distributed.
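
The notion of attention sparsity described in the abstract can be illustrated with a small sketch. The abstract does not state which metric the authors use, so this assumes normalized Shannon entropy as one possible way to quantify how evenly attention is spread over image patches. The function name `attention_entropy` and the example weights are hypothetical and are not taken from the paper.

```python
import math


def attention_entropy(weights):
    """Normalized Shannon entropy of a non-negative attention vector.

    Illustrative assumption only, not the metric used in the paper.
    Returns a value in [0, 1]: 0 when all attention sits on a single
    patch, 1 when attention is spread uniformly over all patches.
    """
    total = sum(weights)
    if len(weights) < 2 or total <= 0:
        raise ValueError("need at least two patches and positive total attention")
    # Normalize so the weights form a probability distribution.
    probs = [w / total for w in weights]
    # Zero-probability patches contribute nothing to the entropy.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    # Divide by the maximum possible entropy, log(number of patches).
    return entropy / math.log(len(probs))


# Hypothetical attention over four patches.
sparse = [0.97, 0.01, 0.01, 0.01]   # attention concentrated on one patch
diverse = [0.25, 0.25, 0.25, 0.25]  # attention spread evenly

print(attention_entropy(sparse) < attention_entropy(diverse))
```

Under this assumed measure, a model whose attention is more globally distributed would score closer to 1 than a model that localizes attention on a few prominent patterns.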

Subjects

Image Processing, Computer-Assisted; Machine Learning; Pathology; Humans; Phenotype; Pathology/methods

Keywords

Cell segmentation; Computational pathology; Self supervised learning; Vision Transformer

Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Pathology / Image Processing, Computer-Assisted / Machine Learning | Type of study: Qualitative_research | Limit: Humans | Language: En | Journal: Med Image Anal | Journal subject: DIAGNOSTIC IMAGING | Year of publication: 2024 | Document type: Article
