Results 1 - 2 of 2
1.
Comput Biol Med; 175: 108472, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38663349

ABSTRACT

With the rapid development of artificial intelligence, automated endoscopy-assisted diagnostic systems have become an effective tool for reducing diagnostic costs and shortening patients' treatment cycles. The performance of these systems typically depends on deep learning models that are pre-trained on large-scale labeled data, for example, endoscopic images of early gastric cancer. However, expensive annotation and annotator subjectivity lead to insufficient and class-imbalanced endoscopic image datasets, which are detrimental to the training of deep learning models. We therefore propose a Swin Transformer encoder-based StyleGAN (STE-StyleGAN) for unbalanced endoscopic image enhancement, composed of an adversarially trained encoder and generator. First, a pre-trained Swin Transformer is introduced into the encoder to extract multi-scale features layer by layer from endoscopic images. The features are then fed into a mapping block for aggregation and recombination. Second, a self-attention mechanism is applied in the generator, which adds image detail layer by layer through the re-encoded features, enabling the generator to learn the coupling between different image regions autonomously. Finally, we conducted extensive experiments on a private intestinal metaplasia grading dataset from a Grade-A tertiary hospital. The experimental results show that the images generated by STE-StyleGAN are closer to the initial image distribution, achieving a Fréchet Inception Distance (FID) of 100.4. These generated images were then used to augment the initial dataset and improve the robustness of a classification model, which achieved a top accuracy of 86%.
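To make the described pipeline concrete, the following is a minimal PyTorch sketch of the encoder → mapping block → self-attention generator flow, assuming torchvision's swin_t backbone and illustrative layer sizes. It is not the authors' STE-StyleGAN implementation: only the final pooled Swin features are used here for brevity (the paper extracts multi-scale features layer by layer), and the discriminator and adversarial training loop are omitted.

```python
# Hedged sketch of an STE-StyleGAN-like flow: Swin encoder -> mapping block ->
# self-attention generator. Layer sizes and module names are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision.models import swin_t, Swin_T_Weights

class MappingBlock(nn.Module):
    """Aggregates/recombines encoder features into a latent code (hypothetical sizes)."""
    def __init__(self, in_dim=768, latent_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, latent_dim), nn.LeakyReLU(0.2),
            nn.Linear(latent_dim, latent_dim),
        )

    def forward(self, feats):          # feats: (B, 768) pooled Swin features
        return self.net(feats)

class SelfAttention(nn.Module):
    """Self-attention over generator feature maps, so distant regions can interact."""
    def __init__(self, channels):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads=4, batch_first=True)

    def forward(self, x):                       # x: (B, C, H, W)
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)   # (B, H*W, C)
        out, _ = self.attn(tokens, tokens, tokens)
        return (tokens + out).transpose(1, 2).reshape(b, c, h, w)

class STEStyleGANSketch(nn.Module):
    def __init__(self, latent_dim=512):
        super().__init__()
        backbone = swin_t(weights=Swin_T_Weights.DEFAULT)  # pre-trained Swin encoder
        backbone.head = nn.Identity()                      # keep 768-d pooled features
        self.encoder = backbone
        self.mapping = MappingBlock(768, latent_dim)
        # Toy generator: latent -> 3x64x64 image with one self-attention stage.
        self.fc = nn.Linear(latent_dim, 128 * 8 * 8)
        self.attn = SelfAttention(128)
        self.up = nn.Sequential(
            nn.Upsample(scale_factor=4, mode="nearest"),
            nn.Conv2d(128, 64, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Upsample(scale_factor=2, mode="nearest"),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, img):                     # img: (B, 3, 224, 224)
        w = self.mapping(self.encoder(img))
        x = self.fc(w).view(-1, 128, 8, 8)
        return self.up(self.attn(x))

model = STEStyleGANSketch()
fake = model(torch.randn(2, 3, 224, 224))
print(fake.shape)  # torch.Size([2, 3, 64, 64])
```

In a full setup, the generated images would be scored against a discriminator during training and the resulting samples used to rebalance the under-represented classes before training the downstream classifier.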


Subjects
Deep Learning; Humans; Stomach Neoplasms/diagnostic imaging; Stomach Neoplasms/pathology; Image Enhancement/methods; Endoscopy/methods; Image Interpretation, Computer-Assisted/methods; Image Processing, Computer-Assisted/methods
2.
Neural Netw; 179: 106597, 2024 Aug 05.
Article in English | MEDLINE | ID: mdl-39128275

ABSTRACT

Convolutional Neural Networks (CNNs) have demonstrated outstanding performance in domains such as face recognition, object detection, and image segmentation. However, the lack of transparency and limited interpretability inherent in CNNs pose challenges in fields such as medical diagnosis, autonomous driving, finance, and military applications. Several studies have explored the interpretability of CNNs and proposed post-hoc interpretation methods. Most of these methods are feature-based, focusing on the influence of input variables on outputs; few analyze the parameters of CNNs and their overall structure. To explore the structure of CNNs and intuitively understand the role of their internal parameters, we propose an Attribution Graph-based Interpretable method for CNNs (AGIC), which models the overall structure of a CNN as a graph and provides interpretability from global and local perspectives. The runtime parameters of the CNN and the feature maps of each image sample are used to construct attribution graphs (At-GCs), in which the convolutional kernels are represented as nodes and the SHAP values between kernel outputs are assigned as edge weights. These At-GCs are then employed to pretrain a newly designed heterogeneous graph encoder based on Deep Graph Infomax (DGI). To delve comprehensively into the overall structure of CNNs, the pretrained encoder is used for two interpretability tasks: (1) a classifier is attached to the pretrained encoder to classify At-GCs, revealing how the topological characteristics of an At-GC depend on the category of the image sample, and (2) a scoring aggregation (SA) network is constructed to assess the importance of each node in an At-GC, reflecting the relative importance of the corresponding kernels in the CNN. The experimental results indicate that the topological characteristics of an At-GC depend on the sample category used in its construction, revealing that kernels in CNNs show distinct combined activation patterns when processing different image categories. Meanwhile, kernels that receive high scores from the SA network are crucial for feature extraction, whereas low-scoring kernels can be pruned without affecting model performance, thereby enhancing the interpretability of CNNs.
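As a rough illustration of the attribution-graph construction described above, the sketch below treats convolutional kernels as graph nodes and assigns an attribution score between kernel outputs as the edge weight. A gradient-times-activation proxy and a toy two-layer CNN stand in for the paper's SHAP values and real models, and the DGI pretraining, classifier, and scoring-aggregation network are omitted.

```python
# Hedged sketch of building a per-sample attribution graph: kernels -> nodes,
# attribution between kernel outputs -> edge weights. The TinyCNN and the
# gradient*activation proxy are placeholders, not the paper's method.
import torch
import torch.nn as nn
import networkx as nx

class TinyCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 4, 3, padding=1)
        self.conv2 = nn.Conv2d(4, 6, 3, padding=1)
        self.head = nn.Linear(6, 10)

    def forward(self, x):
        a1 = torch.relu(self.conv1(x))
        a2 = torch.relu(self.conv2(a1))
        logits = self.head(a2.mean(dim=(2, 3)))
        return a1, a2, logits

def attribution_graph(model, image, target_class):
    """Build a networkx graph for one sample: nodes = kernels, edges = attribution."""
    a1, a2, logits = model(image.unsqueeze(0))
    a1.retain_grad()                       # keep gradients on the intermediate maps
    logits[0, target_class].backward()
    g = nx.DiGraph()
    # Node features: mean activation of each kernel's feature map.
    for k in range(a1.shape[1]):
        g.add_node(("conv1", k), activation=a1[0, k].mean().item())
    for k in range(a2.shape[1]):
        g.add_node(("conv2", k), activation=a2[0, k].mean().item())
    # Edge weights: gradient*activation proxy for how much conv1 kernel i
    # contributes to conv2 kernel j (a stand-in for SHAP values between kernel outputs).
    contrib = (a1.grad * a1).detach().mean(dim=(2, 3))[0]     # (4,)
    w2 = model.conv2.weight.detach().abs().mean(dim=(2, 3))   # (6, 4)
    for j in range(w2.shape[0]):
        for i in range(w2.shape[1]):
            g.add_edge(("conv1", i), ("conv2", j),
                       weight=(w2[j, i] * contrib[i].abs()).item())
    return g

model = TinyCNN()
g = attribution_graph(model, torch.randn(3, 32, 32), target_class=0)
print(g.number_of_nodes(), g.number_of_edges())  # 10 nodes, 24 edges
```

In the paper's pipeline, graphs like this (one per image sample) would then be fed to a heterogeneous graph encoder pretrained with Deep Graph Infomax, after which a classifier and a scoring-aggregation network operate on the learned graph representations.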
