RESUMO
Purpose: Real-world evaluation of a deep learning model that prioritizes patients based on risk of progression to moderate or worse (MOD+) diabetic retinopathy (DR). Methods: This nonrandomized, single-arm, prospective, interventional study included patients attending DR screening at four centers across Thailand from September 2019 to January 2020, with mild or no DR. Fundus photographs were input into the model, and patients were scheduled for their subsequent screening from September 2020 to January 2021 in order of predicted risk. Evaluation focused on model sensitivity, defined as correctly ranking patients that developed MOD+ within the first 50% of subsequent screens. Results: We analyzed 1,757 patients, of which 52 (3.0%) developed MOD+. Using the model-proposed order, the model's sensitivity was 90.4%. Both the model-proposed order and mild/no DR plus HbA1c had significantly higher sensitivity than the random order (P < 0.001). Excluding one major (rural) site that had practical implementation challenges, the remaining sites included 567 patients and 15 (2.6%) developed MOD+. Here, the model-proposed order achieved 86.7% versus 73.3% for the ranking that used DR grade and hemoglobin A1c. Conclusions: The model can help prioritize follow-up visits for the largest subgroups of DR patients (those with no or mild DR). Further research is needed to evaluate the impact on clinical management and outcomes. Translational Relevance: Deep learning demonstrated potential for risk stratification in DR screening. However, real-world practicalities must be resolved to fully realize the benefit.
Assuntos
Aprendizado Profundo , Diabetes Mellitus , Retinopatia Diabética , Humanos , Retinopatia Diabética/diagnóstico , Retinopatia Diabética/epidemiologia , Estudos Prospectivos , Hemoglobinas Glicadas , Medição de RiscoRESUMO
BACKGROUND: Photographs of the external eye were recently shown to reveal signs of diabetic retinal disease and elevated glycated haemoglobin. This study aimed to test the hypothesis that external eye photographs contain information about additional systemic medical conditions. METHODS: We developed a deep learning system (DLS) that takes external eye photographs as input and predicts systemic parameters, such as those related to the liver (albumin, aspartate aminotransferase [AST]); kidney (estimated glomerular filtration rate [eGFR], urine albumin-to-creatinine ratio [ACR]); bone or mineral (calcium); thyroid (thyroid stimulating hormone); and blood (haemoglobin, white blood cells [WBC], platelets). This DLS was trained using 123 130 images from 38 398 patients with diabetes undergoing diabetic eye screening in 11 sites across Los Angeles county, CA, USA. Evaluation focused on nine prespecified systemic parameters and leveraged three validation sets (A, B, C) spanning 25 510 patients with and without diabetes undergoing eye screening in three independent sites in Los Angeles county, CA, and the greater Atlanta area, GA, USA. We compared performance against baseline models incorporating available clinicodemographic variables (eg, age, sex, race and ethnicity, years with diabetes). FINDINGS: Relative to the baseline, the DLS achieved statistically significant superior performance at detecting AST >36·0 U/L, calcium <8·6 mg/dL, eGFR <60·0 mL/min/1·73 m2, haemoglobin <11·0 g/dL, platelets <150·0 × 103/µL, ACR ≥300 mg/g, and WBC <4·0 × 103/µL on validation set A (a population resembling the development datasets), with the area under the receiver operating characteristic curve (AUC) of the DLS exceeding that of the baseline by 5·3-19·9% (absolute differences in AUC). On validation sets B and C, with substantial patient population differences compared with the development datasets, the DLS outperformed the baseline for ACR ≥300·0 mg/g and haemoglobin <11·0 g/dL by 7·3-13·2%. INTERPRETATION: We found further evidence that external eye photographs contain biomarkers spanning multiple organ systems. Such biomarkers could enable accessible and non-invasive screening of disease. Further work is needed to understand the translational implications. FUNDING: Google.
Assuntos
Aprendizado Profundo , Retinopatia Diabética , Humanos , Estudos Retrospectivos , Cálcio , Retinopatia Diabética/diagnóstico , Biomarcadores , AlbuminasRESUMO
PURPOSE: To validate the generalizability of a deep learning system (DLS) that detects diabetic macular edema (DME) from 2-dimensional color fundus photographs (CFP), for which the reference standard for retinal thickness and fluid presence is derived from 3-dimensional OCT. DESIGN: Retrospective validation of a DLS across international datasets. PARTICIPANTS: Paired CFP and OCT of patients from diabetic retinopathy (DR) screening programs or retina clinics. The DLS was developed using data sets from Thailand, the United Kingdom, and the United States and validated using 3060 unique eyes from 1582 patients across screening populations in Australia, India, and Thailand. The DLS was separately validated in 698 eyes from 537 screened patients in the United Kingdom with mild DR and suspicion of DME based on CFP. METHODS: The DLS was trained using DME labels from OCT. The presence of DME was based on retinal thickening or intraretinal fluid. The DLS's performance was compared with expert grades of maculopathy and to a previous proof-of-concept version of the DLS. We further simulated the integration of the current DLS into an algorithm trained to detect DR from CFP. MAIN OUTCOME MEASURES: The superiority of specificity and noninferiority of sensitivity of the DLS for the detection of center-involving DME, using device-specific thresholds, compared with experts. RESULTS: The primary analysis in a combined data set spanning Australia, India, and Thailand showed the DLS had 80% specificity and 81% sensitivity, compared with expert graders, who had 59% specificity and 70% sensitivity. Relative to human experts, the DLS had significantly higher specificity (P = 0.008) and noninferior sensitivity (P < 0.001). In the data set from the United Kingdom, the DLS had a specificity of 80% (P < 0.001 for specificity of >50%) and a sensitivity of 100% (P = 0.02 for sensitivity of > 90%). CONCLUSIONS: The DLS can generalize to multiple international populations with an accuracy exceeding that of experts. The clinical value of this DLS to reduce false-positive referrals, thus decreasing the burden on specialist eye care, warrants a prospective evaluation.
Assuntos
Aprendizado Profundo , Diabetes Mellitus , Retinopatia Diabética , Edema Macular , Retinopatia Diabética/complicações , Retinopatia Diabética/diagnóstico , Humanos , Edema Macular/diagnóstico , Edema Macular/etiologia , Estudos Retrospectivos , Tomografia de Coerência Óptica/métodos , Estados UnidosRESUMO
BACKGROUND: Diabetic retinopathy screening is instrumental to preventing blindness, but scaling up screening is challenging because of the increasing number of patients with all forms of diabetes. We aimed to create a deep-learning system to predict the risk of patients with diabetes developing diabetic retinopathy within 2 years. METHODS: We created and validated two versions of a deep-learning system to predict the development of diabetic retinopathy in patients with diabetes who had had teleretinal diabetic retinopathy screening in a primary care setting. The input for the two versions was either a set of three-field or one-field colour fundus photographs. Of the 575â431 eyes in the development set 28â899 had known outcomes, with the remaining 546â532 eyes used to augment the training process via multitask learning. Validation was done on one eye (selected at random) per patient from two datasets: an internal validation (from EyePACS, a teleretinal screening service in the USA) set of 3678 eyes with known outcomes and an external validation (from Thailand) set of 2345 eyes with known outcomes. FINDINGS: The three-field deep-learning system had an area under the receiver operating characteristic curve (AUC) of 0·79 (95% CI 0·77-0·81) in the internal validation set. Assessment of the external validation set-which contained only one-field colour fundus photographs-with the one-field deep-learning system gave an AUC of 0·70 (0·67-0·74). In the internal validation set, the AUC of available risk factors was 0·72 (0·68-0·76), which improved to 0·81 (0·77-0·84) after combining the deep-learning system with these risk factors (p<0·0001). In the external validation set, the corresponding AUC improved from 0·62 (0·58-0·66) to 0·71 (0·68-0·75; p<0·0001) following the addition of the deep-learning system to available risk factors. INTERPRETATION: The deep-learning systems predicted diabetic retinopathy development using colour fundus photographs, and the systems were independent of and more informative than available risk factors. Such a risk stratification tool might help to optimise screening intervals to reduce costs while improving vision-related outcomes. FUNDING: Google.
Assuntos
Aprendizado Profundo , Retinopatia Diabética/diagnóstico , Idoso , Área Sob a Curva , Técnicas de Diagnóstico Oftalmológico , Feminino , Humanos , Estimativa de Kaplan-Meier , Masculino , Pessoa de Meia-Idade , Fotografação , Prognóstico , Curva ROC , Reprodutibilidade dos Testes , Medição de Risco/métodosRESUMO
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
RESUMO
Owing to the invasiveness of diagnostic tests for anaemia and the costs associated with screening for it, the condition is often undetected. Here, we show that anaemia can be detected via machine-learning algorithms trained using retinal fundus images, study participant metadata (including race or ethnicity, age, sex and blood pressure) or the combination of both data types (images and study participant metadata). In a validation dataset of 11,388 study participants from the UK Biobank, the fundus-image-only, metadata-only and combined models predicted haemoglobin concentration (in g dl-1) with mean absolute error values of 0.73 (95% confidence interval: 0.72-0.74), 0.67 (0.66-0.68) and 0.63 (0.62-0.64), respectively, and with areas under the receiver operating characteristic curve (AUC) values of 0.74 (0.71-0.76), 0.87 (0.85-0.89) and 0.88 (0.86-0.89), respectively. For 539 study participants with self-reported diabetes, the combined model predicted haemoglobin concentration with a mean absolute error of 0.73 (0.68-0.78) and anaemia an AUC of 0.89 (0.85-0.93). Automated anaemia screening on the basis of fundus images could particularly aid patients with diabetes undergoing regular retinal imaging and for whom anaemia can increase morbidity and mortality risks.
Assuntos
Anemia/diagnóstico por imagem , Retina/diagnóstico por imagem , Aprendizado Profundo , Feminino , Fundo de Olho , Humanos , Masculino , Pessoa de Meia-Idade , Estudos Prospectivos , Curva ROCRESUMO
Center-involved diabetic macular edema (ci-DME) is a major cause of vision loss. Although the gold standard for diagnosis involves 3D imaging, 2D imaging by fundus photography is usually used in screening settings, resulting in high false-positive and false-negative calls. To address this, we train a deep learning model to predict ci-DME from fundus photographs, with an ROC-AUC of 0.89 (95% CI: 0.87-0.91), corresponding to 85% sensitivity at 80% specificity. In comparison, retinal specialists have similar sensitivities (82-85%), but only half the specificity (45-50%, p < 0.001). Our model can also detect the presence of intraretinal fluid (AUC: 0.81; 95% CI: 0.81-0.86) and subretinal fluid (AUC 0.88; 95% CI: 0.85-0.91). Using deep learning to make predictions via simple 2D images without sophisticated 3D-imaging equipment and with better than specialist performance, has broad relevance to many other applications in medical imaging.
Assuntos
Retinopatia Diabética/diagnóstico por imagem , Edema Macular/diagnóstico por imagem , Idoso , Aprendizado Profundo , Retinopatia Diabética/genética , Feminino , Humanos , Imageamento Tridimensional , Edema Macular/genética , Masculino , Pessoa de Meia-Idade , Mutação , Fotografação , Retina/diagnóstico por imagem , Tomografia de Coerência ÓpticaRESUMO
The advent of computer graphic processing units, improvement in mathematical models and availability of big data has allowed artificial intelligence (AI) using machine learning (ML) and deep learning (DL) techniques to achieve robust performance for broad applications in social-media, the internet of things, the automotive industry and healthcare. DL systems in particular provide improved capability in image, speech and motion recognition as well as in natural language processing. In medicine, significant progress of AI and DL systems has been demonstrated in image-centric specialties such as radiology, dermatology, pathology and ophthalmology. New studies, including pre-registered prospective clinical trials, have shown DL systems are accurate and effective in detecting diabetic retinopathy (DR), glaucoma, age-related macular degeneration (AMD), retinopathy of prematurity, refractive error and in identifying cardiovascular risk factors and diseases, from digital fundus photographs. There is also increasing attention on the use of AI and DL systems in identifying disease features, progression and treatment response for retinal diseases such as neovascular AMD and diabetic macular edema using optical coherence tomography (OCT). Additionally, the application of ML to visual fields may be useful in detecting glaucoma progression. There are limited studies that incorporate clinical data including electronic health records, in AL and DL algorithms, and no prospective studies to demonstrate that AI and DL algorithms can predict the development of clinical eye disease. This article describes global eye disease burden, unmet needs and common conditions of public health importance for which AI and DL systems may be applicable. Technical and clinical aspects to build a DL system to address those needs, and the potential challenges for clinical adoption are discussed. AI, ML and DL will likely play a crucial role in clinical ophthalmology practice, with implications for screening, diagnosis and follow up of the major causes of vision impairment in the setting of ageing populations globally.
Assuntos
Aprendizado Profundo , Técnicas de Diagnóstico Oftalmológico , Oftalmopatias/diagnóstico , Oftalmologia/métodos , HumanosRESUMO
Traditionally, medical discoveries are made by observing associations, making hypotheses from them and then designing and running experiments to test the hypotheses. However, with medical images, observing and quantifying associations can often be difficult because of the wide variety of features, patterns, colours, values and shapes that are present in real data. Here, we show that deep learning can extract new knowledge from retinal fundus images. Using deep-learning models trained on data from 284,335 patients and validated on two independent datasets of 12,026 and 999 patients, we predicted cardiovascular risk factors not previously thought to be present or quantifiable in retinal images, such as age (mean absolute error within 3.26 years), gender (area under the receiver operating characteristic curve (AUC) = 0.97), smoking status (AUC = 0.71), systolic blood pressure (mean absolute error within 11.23 mmHg) and major adverse cardiac events (AUC = 0.70). We also show that the trained deep-learning models used anatomical features, such as the optic disc or blood vessels, to generate each prediction.
Assuntos
Doenças Cardiovasculares , Aprendizado Profundo , Interpretação de Imagem Assistida por Computador/métodos , Retina/diagnóstico por imagem , Idoso , Idoso de 80 Anos ou mais , Algoritmos , Doenças Cardiovasculares/diagnóstico por imagem , Doenças Cardiovasculares/epidemiologia , Feminino , Fundo de Olho , Humanos , Masculino , Pessoa de Meia-Idade , Fatores de RiscoRESUMO
Purpose: We evaluate how deep learning can be applied to extract novel information such as refractive error from retinal fundus imaging. Methods: Retinal fundus images used in this study were 45- and 30-degree field of view images from the UK Biobank and Age-Related Eye Disease Study (AREDS) clinical trials, respectively. Refractive error was measured by autorefraction in UK Biobank and subjective refraction in AREDS. We trained a deep learning algorithm to predict refractive error from a total of 226,870 images and validated it on 24,007 UK Biobank and 15,750 AREDS images. Our model used the "attention" method to identify features that are correlated with refractive error. Results: The resulting algorithm had a mean absolute error (MAE) of 0.56 diopters (95% confidence interval [CI]: 0.55-0.56) for estimating spherical equivalent on the UK Biobank data set and 0.91 diopters (95% CI: 0.89-0.93) for the AREDS data set. The baseline expected MAE (obtained by simply predicting the mean of this population) was 1.81 diopters (95% CI: 1.79-1.84) for UK Biobank and 1.63 (95% CI: 1.60-1.67) for AREDS. Attention maps suggested that the foveal region was one of the most important areas used by the algorithm to make this prediction, though other regions also contribute to the prediction. Conclusions: To our knowledge, the ability to estimate refractive error with high accuracy from retinal fundus photos has not been previously known and demonstrates that deep learning can be applied to make novel predictions from medical images.