OCTess: AN OPTICAL CHARACTER RECOGNITION ALGORITHM FOR AUTOMATED DATA EXTRACTION OF SPECTRAL DOMAIN OPTICAL COHERENCE TOMOGRAPHY REPORTS.

Balas, Michael; Herman, Josh; Bhambra, Nishaant Shaan; Longwell, Jack; Popovic, Marko M; Melo, Isabela M; Muni, Rajeev H

Affiliation

Balas M; Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada.
Herman J; Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada.
Bhambra NS; Faculty of Medicine, McGill University, Montreal, Quebec, Canada.
Longwell J; Department of Mathematics and Statistics, McMaster University, Hamilton, Ontario, Canada.
Popovic MM; Department of Ophthalmology & Vision Sciences, University of Toronto, Toronto, Ontario, Canada.
Melo IM; Department of Ophthalmology & Vision Sciences, University of Toronto, Toronto, Ontario, Canada.
Muni RH; Department of Ophthalmology, St. Michael's Hospital, Toronto, Ontario, Canada.

Retina; 44(4): 558-564, 2024 Apr 01.

Article in En | MEDLINE | ID: mdl-37948741

ABSTRACT

PURPOSE: Manual extraction of data from spectral domain optical coherence tomography (SD-OCT) reports is time- and resource-intensive. This study aimed to develop an optical character recognition (OCR) algorithm for automated data extraction from Cirrus SD-OCT macular cube reports. METHODS: SD-OCT monocular macular cube reports (n = 675) were randomly selected from a single-center database of patients from 2020 to 2023. Image processing and bounding box operations were performed, and Tesseract (an OCR library) was used to develop the algorithm, OCTess. The algorithm was validated using a separate test data set. RESULTS: The long short-term memory deep learning version of Tesseract achieved the best performance. After reverifying all discrepancies between human and algorithmic data extractions, OCTess achieved accuracies of 100.00% and 99.98% in the training (n = 125) and testing (n = 550) data sets, while the human error rate was 1.11% (98.89% accuracy) and 0.49% (99.51% accuracy) in each, respectively. OCTess extracted data in 3.1 seconds per report, compared with 94.3 seconds per report for human evaluators. CONCLUSION: We developed an OCR and machine learning algorithm that extracted SD-OCT data with near-perfect accuracy, outperforming humans in both accuracy and efficiency. This algorithm can be used for efficient construction of large-scale SD-OCT data sets for researchers and clinicians.
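
The abstract does not give implementation details, so the following is only a minimal sketch of the general approach it names (cropping fixed bounding boxes from a report image and reading each crop with Tesseract's LSTM engine), not the OCTess code itself. It assumes the `pytesseract` and Pillow packages are installed; the field names, pixel coordinates, and helper functions are hypothetical and would have to be adapted to the actual report layout.

```python
import re
from typing import Dict, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, upper, right, lower) in pixels

# Hypothetical field names and coordinates, for illustration only;
# real values depend on the layout and resolution of the exported report.
FIELD_BOXES: Dict[str, Box] = {
    "central_subfield_thickness": (100, 200, 220, 240),
    "cube_volume": (240, 200, 360, 240),
}


def parse_number(text: str) -> Optional[float]:
    """Return the first decimal number found in OCR output, or None."""
    match = re.search(r"-?\d+(?:[.,]\d+)?", text)
    if match is None:
        return None
    # Accept a decimal comma as well as a decimal point.
    return float(match.group(0).replace(",", "."))


def extract_fields(
    image_path: str, boxes: Dict[str, Box] = FIELD_BOXES
) -> Dict[str, Optional[float]]:
    """Crop each bounding box from the report image and OCR it."""
    # Imported lazily so parse_number works without the OCR dependencies.
    from PIL import Image
    import pytesseract

    image = Image.open(image_path).convert("L")  # convert to grayscale
    results: Dict[str, Optional[float]] = {}
    for name, box in boxes.items():
        region = image.crop(box)
        # --oem 1 selects Tesseract's LSTM engine;
        # --psm 7 treats the crop as a single line of text.
        text = pytesseract.image_to_string(region, config="--oem 1 --psm 7")
        results[name] = parse_number(text)
    return results
```

Calling `extract_fields("report.png")` (a placeholder file name) would return a dictionary mapping each field name to the parsed value, or `None` where no number could be read, which is the kind of case one would flag for manual review.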

Subjects

Algorithms; Tomography, Optical Coherence; Humans; Tomography, Optical Coherence/methods; Machine Learning

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Algorithms / Tomography, Optical Coherence Limit: Humans Language: En Journal: Retina Year of publication: 2024 Document type: Article Country of affiliation: Canada