MetaV: A Pioneer in feature Augmented Meta-Learning Based Vision Transformer for Medical Image Classification.

Ansari, Shaharyar Alam; Agrawal, Arun Prakash; Wajid, Mohd Anas; Wajid, Mohammad Saif; Zafar, Aasim

Affiliation

Ansari SA; School of Computer Science Engineering and Technology, Bennett University, Greater Noida, 201310, India.
Agrawal AP; Sharda School of Engineering and Technology, Sharda University, Greater Noida, 201306, India. arunpragrawal@gmail.com.
Wajid MA; Department of Computer Application & Technology, School of Computing Science & Engineering, Galgotias University, Greater Noida, 201308, India.
Wajid MS; Department of Computer Science, School of Engineering and Sciences, Tecnológico de Monterrey, Monterrey, 64849, Mexico.
Zafar A; Department of Computer Science, Aligarh Muslim University, Aligarh, 202002, India.

Interdiscip Sci; 16(2): 469-488, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38951382

ABSTRACT

Image classification, a fundamental task in computer vision, faces challenges in handling limited data, interpretability, feature representation, efficiency across diverse image types, and robustness to noisy data. Conventional architectures have made insufficient progress on these challenges, which calls for architectures capable of fine-grained classification, higher accuracy, and better generalization. The vision transformer is a noteworthy architecture in this regard, but its complexity means that it requires a large amount of training data. To overcome these challenges, this paper proposes MetaV, an approach that integrates meta-learning into a vision transformer for medical image classification. The model is trained with N-way K-shot learning, inspired by the way humans learn by drawing on past knowledge. In addition, deformational convolution and patch merging are incorporated into the vision transformer to reduce complexity and overfitting while enhancing feature representation. Augmentation methods such as perturbation and Grid Mask are introduced to address the scarcity and noise in medical images, particularly for rare diseases. The proposed model is evaluated on four datasets: BreakHis, ISIC 2019, SIPaKMeD, and STARE. It achieves accuracies of 89.89%, 87.33%, 94.55%, and 80.22% on BreakHis, ISIC 2019, SIPaKMeD, and STARE, respectively, which the authors present as evidence of superior performance over conventional models and as a new benchmark for meta-vision image classification models.
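
The N-way K-shot training mentioned in the abstract is usually organized into episodes, each containing N classes with K labelled support examples per class plus a few query examples. The following is a minimal sketch of episode sampling under that general definition, not the authors' implementation. The function name `sample_episode`, the `n_query` parameter, and the toy label list are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def sample_episode(labels, n_way, k_shot, n_query, rng=None):
    """Sample one N-way K-shot episode (hypothetical helper).

    labels: list where labels[i] is the class of example i.
    Returns (support, query): lists of (example_index, episode_label)
    pairs, with episode_label in range(n_way).
    """
    rng = rng or random.Random()

    # Group example indices by class.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Only classes with enough examples for support + query are usable.
    needed = k_shot + n_query
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= needed]
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query examples")

    support, query = [], []
    # Pick N classes, then K support and n_query query examples from each.
    for episode_label, cls in enumerate(rng.sample(eligible, n_way)):
        chosen = rng.sample(by_class[cls], needed)
        support += [(i, episode_label) for i in chosen[:k_shot]]
        query += [(i, episode_label) for i in chosen[k_shot:]]
    return support, query


# Toy data (assumption): 40 examples spread evenly over 4 classes.
labels = [i % 4 for i in range(40)]
support, query = sample_episode(
    labels, n_way=3, k_shot=2, n_query=3, rng=random.Random(0)
)
print(len(support), len(query))
```

In a meta-learning loop, the model would be adapted or conditioned on the support set and scored on the query set, with a fresh episode drawn at each step. Relabelling classes to `0..n_way-1` inside each episode keeps the classifier from memorizing global class identities.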

Subjects

Image Processing, Computer-Assisted; Humans; Image Processing, Computer-Assisted/methods; Algorithms; Machine Learning; Diagnostic Imaging; Deep Learning

Keywords

Data augmentation; Few-shot learning; Grid mask; Medical image classification; Meta-learning; MetaV; Perturbation; Vision transformer

Full text: 1 | Databases: MEDLINE | Main subject: Image Processing, Computer-Assisted | Limit: Humans | Language: En | Journal: Interdiscip Sci | Journal subject: BIOLOGY | Year of publication: 2024 | Document type: Article | Country of affiliation: India
