Synthetic Medical Images for Robust, Privacy-Preserving Training of Artificial Intelligence: Application to Retinopathy of Prematurity Diagnosis.

Coyner, Aaron S; Chen, Jimmy S; Chang, Ken; Singh, Praveer; Ostmo, Susan; Chan, R V Paul; Chiang, Michael F; Kalpathy-Cramer, Jayashree; Campbell, J Peter

Affiliations

Coyner AS; Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, Portland, Oregon.
Chen JS; Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, Portland, Oregon.
Chang K; Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, San Diego, California.
Singh P; Department of Radiology, Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, Massachusetts.
Ostmo S; Center for Clinical Data Science, Massachusetts General Hospital and Boston Women's Hospital, Boston, Massachusetts.
Chan RVP; Department of Radiology, Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, Massachusetts.
Chiang MF; Center for Clinical Data Science, Massachusetts General Hospital and Boston Women's Hospital, Boston, Massachusetts.
Kalpathy-Cramer J; Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, Portland, Oregon.
Campbell JP; Department of Ophthalmology and Visual Sciences, Eye and Ear Infirmary, University of Illinois, Chicago, Illinois.

Ophthalmol Sci; 2(2): 100126, 2022 Jun.

Article in English | MEDLINE | ID: mdl-36249693

ABSTRACT

Purpose:

Developing robust artificial intelligence (AI) models for medical image analysis requires large quantities of diverse, well-chosen data that can prove challenging to collect because of privacy concerns, disease rarity, or diagnostic label quality. Collecting image-based datasets for retinopathy of prematurity (ROP), a potentially blinding disease, suffers from these challenges. Progressively growing generative adversarial networks (PGANs) may help, because they can synthesize highly realistic images that may increase both the size and diversity of medical datasets.

Design:

Diagnostic validation study of convolutional neural networks (CNNs) for plus disease detection, a component of severe ROP, using synthetic data.

Participants:

Five thousand eight hundred forty-two retinal fundus images (RFIs) collected from 963 preterm infants.

Methods:

Retinal vessel maps (RVMs) were segmented from RFIs. PGANs were trained to synthesize RVMs with normal, pre-plus, or plus disease vasculature. Convolutional neural networks were trained, using real or synthetic RVMs, to detect plus disease from 2 real RVM test datasets.

Main Outcome Measures:

Features of real and synthetic RVMs were evaluated using uniform manifold approximation and projection (UMAP). Similarities were evaluated at the dataset and feature level using Fréchet inception distance and Euclidean distance, respectively. CNN performance was assessed via area under the receiver operating characteristic curve (AUC); AUCs were compared via bootstrapping and DeLong's test for correlated receiver operating characteristic curves. Confusion matrices were compared using McNemar's chi-square test and Cohen's κ value.
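
As an illustration of two of the measures named above, the following is a minimal sketch, not the authors' code, of a paired bootstrap comparison of two AUCs and of Cohen's κ. It uses only the Python standard library. The function names (`auc`, `bootstrap_auc_difference`, `cohen_kappa`), the 95% percentile interval, and the number of resamples are assumptions made for this example; DeLong's test and McNemar's test are not implemented here.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_difference(labels, scores_a, scores_b, n_boot=2000, seed=0):
    """Percentile 95% interval for AUC(a) - AUC(b) on the same test cases.

    Cases are resampled with replacement; both models are scored on the
    same resample, which preserves the pairing between the two curves.
    """
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:  # skip resamples that contain a single class
            continue
        a = auc(y, [scores_a[i] for i in idx])
        b = auc(y, [scores_b[i] for i in idx])
        diffs.append(a - b)
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two label lists, corrected for chance.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)


# Toy usage with made-up labels and scores (not data from the study).
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An interval for the AUC difference that excludes zero would suggest that the two classifiers differ on that test set. The quadratic-time `auc` keeps the sketch short; a rank-based implementation would be preferable for large test sets.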

Results:

The CNN trained on synthetic RVMs showed a significantly higher AUC (0.971; P = 0.006 and P = 0.004) and classified plus disease more similarly to a set of 8 international experts (κ = 0.922) than the CNN trained on real RVMs (AUC = 0.934; κ = 0.701). Real and synthetic RVMs overlapped, by plus disease diagnosis, on the UMAP manifold, showing that synthetic images spanned the disease severity spectrum. Fréchet inception distance and Euclidean distances suggested that real and synthetic RVMs were more dissimilar to one another than real RVMs were to one another, further suggesting that synthetic RVMs were distinct from the training data with respect to privacy considerations.
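
To make the dataset-level comparison concrete, the sketch below computes the Fréchet distance between two sets of feature vectors, each summarized by its mean and covariance. It is a hedged illustration rather than the study's implementation: Fréchet inception distance applies this formula to features taken from an Inception network, whereas the feature arrays here are hypothetical NumPy arrays of shape (n_samples, n_features), and the function name `frechet_distance` is an assumption.

```python
import numpy as np


def frechet_distance(feats_x, feats_y):
    """Frechet distance between Gaussians fitted to two feature sets.

    d^2 = ||mu_x - mu_y||^2 + Tr(C_x) + Tr(C_y) - 2 Tr((C_x C_y)^(1/2))
    """
    mu_x = feats_x.mean(axis=0)
    mu_y = feats_y.mean(axis=0)
    cov_x = np.atleast_2d(np.cov(feats_x, rowvar=False))
    cov_y = np.atleast_2d(np.cov(feats_y, rowvar=False))

    # Tr((C_x C_y)^(1/2)) equals the sum of square roots of the eigenvalues
    # of C_x C_y, which are real and non-negative for covariance matrices.
    # Small negative or imaginary parts from round-off are discarded.
    eigvals = np.linalg.eigvals(cov_x @ cov_y)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    mean_term = np.sum((mu_x - mu_y) ** 2)
    return float(mean_term + np.trace(cov_x) + np.trace(cov_y) - 2.0 * trace_sqrt)


# Toy usage with random placeholder features (not data from the study).
rng = np.random.default_rng(0)
real_a = rng.normal(size=(200, 4))
real_b = rng.normal(size=(200, 4))
shifted = real_a + 3.0  # stands in for a set that differs from real_a
print(frechet_distance(real_a, real_b), frechet_distance(real_a, shifted))
```

Under this formulation, a larger distance between real and synthetic features than between two sets of real features is the pattern the results describe.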

Conclusions:

Synthetic datasets may be useful for training robust medical AI models. Furthermore, PGANs may be able to synthesize realistic data for use without protected health information concerns.

Keywords

AI, artificial intelligence; Artificial intelligence; CNN, convolutional neural network; DL, deep learning; Deep learning; FID, Fréchet inception distance; GAN, generative adversarial network; Generative adversarial network; PGAN, progressively growing generative adversarial network; RFI, retinal fundus image; ROP, retinopathy of prematurity; RVM, retinal vessel map; Retinopathy of prematurity; UMAP, uniform manifold approximation and projection

Full text: 1 | Database: MEDLINE | Study type: Diagnostic_studies / Prognostic_studies | Language: En | Year: 2022 | Document type: Article