Vision transformer with masked autoencoders for referable diabetic retinopathy classification based on large-size retina image.
PLoS One; 19(3): e0299265, 2024.
Article in English | MEDLINE | ID: mdl-38446810
ABSTRACT
Computer-aided diagnosis systems based on deep learning algorithms have shown potential for the rapid diagnosis of diabetic retinopathy (DR). Given the superior performance of Transformers over convolutional neural networks (CNNs) on natural images, we developed a new Transformer-based model to classify referable DR from a limited number of large-size retinal images. In this study, a Vision Transformer (ViT) with Masked Autoencoders (MAE) was applied to improve the classification performance on referable DR. We collected over 100,000 publicly available fundus retinal images larger than 224×224 pixels and pre-trained the ViT on these images using MAE. The pre-trained ViT was then applied to classify referable DR, and its performance was compared with that of a ViT pre-trained on ImageNet. Pre-training with over 100,000 retinal images using MAE improved classification performance more than pre-training with ImageNet. The accuracy, area under the curve (AUC), highest sensitivity, and highest specificity of the present model were 93.42%, 0.9853, 0.973, and 0.9539, respectively. This study shows that MAE provides more flexibility in the input image and substantially reduces the number of images required. Moreover, the pre-training dataset in this study is much smaller than ImageNet, and pre-trained ImageNet weights are not required.
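To illustrate the pre-training scheme the abstract describes, the following is a minimal PyTorch sketch of MAE-style random patch masking in front of a ViT encoder. This is an illustrative assumption, not the authors' code: the patch size (16), embedding width (768), and 75% mask ratio follow the original MAE paper, and all class and variable names here are hypothetical.

# Minimal sketch of MAE-style random patch masking plus a ViT encoder pass.
# Hypothetical illustration only -- not the authors' published code.
import torch
import torch.nn as nn

class MAEEncoder(nn.Module):
    def __init__(self, img_size=224, patch_size=16, embed_dim=768,
                 depth=12, num_heads=12, mask_ratio=0.75):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        self.mask_ratio = mask_ratio
        # Patchify with a strided conv, as in standard ViT implementations.
        self.patch_embed = nn.Conv2d(3, embed_dim, patch_size, patch_size)
        self.pos_embed = nn.Parameter(
            torch.zeros(1, self.num_patches, embed_dim))
        layer = nn.TransformerEncoderLayer(
            embed_dim, num_heads, 4 * embed_dim, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, depth)

    def forward(self, imgs):
        x = self.patch_embed(imgs).flatten(2).transpose(1, 2)  # (B, N, D)
        x = x + self.pos_embed
        # Randomly keep 25% of the patches; during pre-training a lightweight
        # decoder (omitted here) reconstructs the masked 75% from these tokens.
        B, N, D = x.shape
        n_keep = int(N * (1 - self.mask_ratio))
        idx = torch.rand(B, N, device=x.device).argsort(dim=1)[:, :n_keep]
        x = torch.gather(x, 1, idx.unsqueeze(-1).expand(-1, -1, D))
        return self.blocks(x)

if __name__ == "__main__":
    enc = MAEEncoder()
    out = enc(torch.randn(2, 3, 224, 224))
    print(out.shape)  # torch.Size([2, 49, 768]) -- 25% of 196 patches kept

Because only the visible quarter of the patches passes through the encoder, pre-training is cheap even on large images; for the downstream referable-DR task the decoder is discarded and a classification head is fine-tuned on the full, unmasked token sequence.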
Full text: 1
Database: MEDLINE
Main subject: Diabetes Mellitus / Diabetic Retinopathy
Limits: Animals
Language: En
Journal: PLoS One
Journal subject: SCIENCE / MEDICINE
Year: 2024
Document type: Article
Country of affiliation: China