ABSTRACT
This work investigates the value of coarsely labeled datasets for learning transferable representations of medical images. Compared with fine labels, which require meticulous annotation effort, coarse labels can be acquired at significantly lower cost and still provide useful training signals for data-hungry deep neural networks. We consider coarse labels in the form of binary labels that differentiate a normal (healthy) image from an abnormal (diseased) image, and we propose CAMContrast, a two-stage representation learning framework for medical images. Using class activation maps, CAMContrast makes use of the binary labels to generate heatmaps that serve as positive views for contrastive representation learning. Specifically, the learning objective is optimized to maximize the agreement within fixed crops of each image-heatmap pair, so that the model learns fine-grained representations that generalize to different downstream tasks. We empirically validate the transfer learning performance of CAMContrast on several public datasets, covering classification and segmentation tasks on fundus photographs and chest X-ray images. The experimental results show that our method outperforms other self-supervised and supervised pretraining methods in terms of data efficiency and downstream performance.
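As an illustration of the heatmap step described above, the following is a minimal sketch of a standard class activation map, assuming a classifier whose last convolutional feature maps are globally average-pooled and fed to a linear layer. The abstract does not specify how CAMContrast generates its heatmaps, so the function name, argument names, and normalization here are assumptions and not the authors' implementation.

```python
import numpy as np


def class_activation_map(features, class_weights):
    """Compute a standard class activation map (hypothetical helper).

    features:      array of shape (C, H, W), the last conv feature maps.
    class_weights: array of shape (C,), the linear-layer weights of the
                   target class (e.g. the "abnormal" class).
    Returns an (H, W) heatmap scaled to [0, 1].
    """
    features = np.asarray(features, dtype=float)
    class_weights = np.asarray(class_weights, dtype=float)

    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(class_weights, features, axes=([0], [0]))

    # Keep only positive evidence for the class, then scale by the peak.
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

In a contrastive setup of the kind the abstract describes, such a heatmap would be cropped at the same location as the image so that the two crops form a positive pair.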
Subjects
Learning, Neural Networks (Computer), Thorax
ABSTRACT
Purpose: To develop and validate a deep learning system (DLS) for estimation of vertical cup-to-disc ratio (vCDR) in ultra-widefield (UWF) and smartphone-based fundus images. Methods: A DLS consisting of two sequential convolutional neural networks (CNNs) to delineate optic disc (OD) and optic cup (OC) boundaries was developed using 800 standard fundus images from the public REFUGE data set. The CNNs were tested on 400 test images from the REFUGE data set and on 296 UWF and 300 smartphone-based images from a teleophthalmology clinic. vCDRs derived from the delineated OD/OC boundaries were compared with optometrists' annotations using mean absolute error (MAE). A subgroup analysis was conducted to study the impact of peripapillary atrophy (PPA), and a correlation study was performed to investigate potential correlations between sectoral CDR (sCDR) and retinal nerve fiber layer (RNFL) thickness. Results: The system achieved MAEs of 0.040 (95% CI, 0.037-0.043) in the REFUGE test images, 0.068 (95% CI, 0.061-0.075) in the UWF images, and 0.084 (95% CI, 0.075-0.092) in the smartphone-based images. There was no statistically significant difference between PPA and non-PPA images. A weak correlation (r = -0.4046, P < 0.05) between sCDR and RNFL thickness was found only in the superior sector. Conclusions: We developed a deep learning system that estimates vCDR from standard, UWF, and smartphone-based images. We also described the anatomic peripapillary adversarial lesion and its potential impact on OD/OC delineation. Translational Relevance: Artificial intelligence can estimate vCDR from different types of fundus images and may be used as a general and interpretable screening tool to improve community reach for diagnosis and management of glaucoma.
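The evaluation described in the Methods can be sketched as follows, assuming the delineated OD and OC regions are available as binary masks with image rows running along the vertical axis. The helper names and the use of the row extent as the vertical diameter are assumptions made for illustration. The abstract does not state how the authors measured the diameters.

```python
import numpy as np


def vertical_diameter(mask):
    """Vertical extent in pixels of a binary mask (hypothetical helper)."""
    rows = np.flatnonzero(np.asarray(mask, dtype=bool).any(axis=1))
    return 0 if rows.size == 0 else int(rows[-1] - rows[0] + 1)


def vertical_cdr(disc_mask, cup_mask):
    """vCDR = vertical cup diameter / vertical disc diameter."""
    disc_height = vertical_diameter(disc_mask)
    if disc_height == 0:
        raise ValueError("empty optic disc mask")
    return vertical_diameter(cup_mask) / disc_height


def mean_absolute_error(predicted, reference):
    """MAE between predicted vCDRs and reference annotations."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.mean(np.abs(predicted - reference)))
```

For example, a disc mask spanning 8 rows and a cup mask spanning 4 rows give a vCDR of 0.5, and the MAE is then averaged over all images in a test set.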