ABSTRACT
PURPOSE: Inter-clinician variation can cause uncertainty in disease management, and it is reported to be high in retinopathy of prematurity (ROP), a potentially blinding retinal disease affecting premature infants. Machine learning has the potential to quantify differences in decision-making between ROP specialists and trainees and may improve diagnostic accuracy. METHODS: An anonymized survey of ROP images was administered to the expert(s) and the trainee(s) through a user interface designed for the study. The responses were analyzed for repeatability and for the level of agreement in classification. A "ground truth" was prepared for each individual and used to build a separate classifier for that individual. Each classifier identified the features most important to that individual's decisions. RESULTS: Correlation and disagreement between the expert and the trainees were visualized using the Dipstick™ diagram. Intra-clinician repeatability and reclassification statistics were assessed for all participants. Repeatability was 88.4% and 86.2% for the two trainees and 92.1% for the expert. The most commonly used features differed between the expert and the trainees, and these differences accounted for the variability. CONCLUSION: This novel, automated algorithm uses machine learning techniques to quantify the differences in decision-making between experts and trainees. It can help audit the training process by measuring those differences objectively. TRANSLATIONAL RELEVANCE: Training for image-based ROP diagnosis can be performed more objectively using this novel, machine learning-based automated image analyzer and classifier.
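As an illustration of the repeatability and agreement measures mentioned above, the following minimal sketch computes intra-clinician repeatability as simple percent agreement between two passes over the same images, and inter-clinician agreement as Cohen's kappa. The abstract does not state which statistics were actually used, so both choices are assumptions, as are the function names and the integer grade codes in the example data.

```python
from collections import Counter


def percent_agreement(first_pass, second_pass):
    """Percentage of images given the same label in both passes (assumed repeatability measure)."""
    if not first_pass or len(first_pass) != len(second_pass):
        raise ValueError("both passes must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(first_pass, second_pass))
    return 100.0 * matches / len(first_pass)


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters (assumed agreement measure)."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must label the same non-empty set of images")
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance, from each rater's marginal label frequencies.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical data: integer grade codes for eight images, not taken from the study.
expert_pass_1 = [2, 0, 1, 0, 2, 0, 0, 1]
expert_pass_2 = [2, 0, 1, 0, 2, 0, 1, 1]
trainee_pass_1 = [2, 0, 0, 0, 2, 1, 0, 1]

print("expert repeatability (%):", percent_agreement(expert_pass_1, expert_pass_2))
print("expert vs trainee kappa:", round(cohens_kappa(expert_pass_1, trainee_pass_1), 3))
```

Percent agreement is easy to read but does not correct for chance, which is why a chance-corrected statistic such as kappa is often reported alongside it when comparing raters.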