Multi-Scale Efficient Graph-Transformer for Whole Slide Image Classification.
IEEE J Biomed Health Inform; 27(12): 5926-5936, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37725722
ABSTRACT
Multi-scale information in whole slide images (WSIs) is essential for cancer diagnosis. Although existing multi-scale vision Transformers have proven effective at learning multi-scale image representations, they do not work well on gigapixel WSIs because of the extremely large image sizes. To this end, we propose a novel Multi-scale Efficient Graph-Transformer (MEGT) framework for WSI classification. The key idea of MEGT is to use two independent Efficient Graph-based Transformer (EGT) branches to process the low-resolution and high-resolution patch embeddings (i.e., tokens in a Transformer) of WSIs, respectively, and then to fuse these tokens via a multi-scale feature fusion module (MFFM). Specifically, we design the EGT to learn the local-global information of patch tokens efficiently; it integrates a graph representation into the Transformer to capture the spatial information of WSIs. Meanwhile, we propose a novel MFFM to alleviate the semantic gap between patches of different resolutions during feature fusion; it creates a non-patch token for each branch that acts as an agent, exchanging information with the other branch through a cross-attention mechanism. In addition, to expedite network training, a new token pruning module is developed in the EGT to reduce redundant tokens. Extensive experiments on the TCGA-RCC and CAMELYON16 datasets demonstrate the effectiveness of the proposed MEGT.
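The agent-token exchange described for the MFFM can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the layer design, so the single-head attention, the residual update, the random projection weights, and every name below (`agent_cross_attention`, `low_agent`, `high_tokens`, and so on) are assumptions made for illustration only. The sketch shows one non-patch (agent) token from each branch querying the patch tokens of the other branch.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def agent_cross_attention(agent, other_tokens, w_q, w_k, w_v):
    """Update one branch's agent token from the other branch's tokens.

    agent:        (d,)   non-patch token of this branch (hypothetical name)
    other_tokens: (n, d) patch tokens of the other branch
    Returns the updated agent (d,) and the attention weights (n + 1,).
    """
    # Assumption: the agent attends over itself plus the other branch's patches.
    context = np.vstack([agent[None, :], other_tokens])   # (n + 1, d)
    q = agent[None, :] @ w_q                               # (1, d)
    k = context @ w_k                                      # (n + 1, d)
    v = context @ w_v                                      # (n + 1, d)
    scores = q @ k.T / np.sqrt(k.shape[-1])                # (1, n + 1)
    weights = softmax(scores, axis=-1)
    # Assumption: residual connection around the attention output.
    updated = agent + (weights @ v)[0]
    return updated, weights[0]

# Toy data: 4 low-resolution and 16 high-resolution patch tokens, width 8.
rng = np.random.default_rng(0)
d, n_low, n_high = 8, 4, 16
low_tokens = rng.normal(size=(n_low, d))
high_tokens = rng.normal(size=(n_high, d))
low_agent = rng.normal(size=d)
high_agent = rng.normal(size=d)

# Random projections stand in for learned parameters.
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

# Each agent gathers information from the opposite branch.
new_low_agent, w_low = agent_cross_attention(low_agent, high_tokens, w_q, w_k, w_v)
new_high_agent, w_high = agent_cross_attention(high_agent, low_tokens, w_q, w_k, w_v)
```

Because only the single agent token issues a query, the cost of each exchange grows linearly with the number of tokens in the other branch, rather than quadratically as it would if all patch tokens attended to each other across branches.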
Full text: 1
Database: MEDLINE
Main subject: Electric Power Supplies / Semantics
Language: En
Publication year: 2023
Document type: Article