OCTess: AN OPTICAL CHARACTER RECOGNITION ALGORITHM FOR AUTOMATED DATA EXTRACTION OF SPECTRAL DOMAIN OPTICAL COHERENCE TOMOGRAPHY REPORTS.
Retina; 44(4): 558-564, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-37948741
PURPOSE: Manual data extraction from spectral domain optical coherence tomography (SD-OCT) reports is time- and resource-intensive. This study aimed to develop an optical character recognition (OCR) algorithm for automated data extraction from Cirrus SD-OCT macular cube reports.
METHODS: SD-OCT monocular macular cube reports (n = 675) were randomly selected from a single-center database of patients from 2020 to 2023. Image processing and bounding box operations were performed, and Tesseract (an OCR library) was used to develop the algorithm, OCTess. The algorithm was validated using a separate test data set.
RESULTS: The long short-term memory deep learning version of Tesseract achieved the best performance. After all discrepancies between human and algorithmic data extractions were reverified, OCTess achieved accuracies of 100.00% and 99.98% in the training (n = 125) and testing (n = 550) data sets, respectively, whereas the human error rates were 1.11% (98.89% accuracy) and 0.49% (99.51% accuracy). OCTess extracted data in 3.1 seconds, compared with 94.3 seconds per report for human evaluators.
CONCLUSION: We developed an OCR and machine learning algorithm that extracted SD-OCT data with near-perfect accuracy, outperforming humans in both accuracy and efficiency. This algorithm can be used for efficient construction of large-scale SD-OCT data sets for researchers and clinicians.
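The abstract reports accuracy after reverifying every discrepancy between the human and algorithmic extractions. A minimal Python sketch of that kind of comparison follows. It is an illustration under stated assumptions, not the authors' code: the field names, the values, and the helper functions `find_discrepancies` and `accuracy` are all hypothetical, and the OCR step itself is left out.

```python
from typing import Dict, List, Tuple

# One record per report: field name -> extracted value (kept as text).
Record = Dict[str, str]


def find_discrepancies(human: Record, algorithm: Record) -> List[Tuple[str, str, str]]:
    """Return (field, human_value, algorithm_value) for every field that differs."""
    fields = sorted(set(human) | set(algorithm))
    return [
        (field, human.get(field, ""), algorithm.get(field, ""))
        for field in fields
        if human.get(field, "").strip() != algorithm.get(field, "").strip()
    ]


def accuracy(extracted: Record, reference: Record) -> float:
    """Percentage of reference fields whose extracted value matches exactly."""
    if not reference:
        raise ValueError("reference record is empty")
    correct = sum(
        1
        for field, value in reference.items()
        if extracted.get(field, "").strip() == value.strip()
    )
    return 100.0 * correct / len(reference)


# Hypothetical field names and values, for illustration only.
human = {"central_subfield_thickness": "251", "cube_volume": "10.1", "signal_strength": "8"}
algorithm = {"central_subfield_thickness": "257", "cube_volume": "10.1", "signal_strength": "8"}

# Fields flagged here would be rechecked against the original report.
flagged = find_discrepancies(human, algorithm)

# Assume the recheck showed the algorithm's value to be the correct one.
verified = dict(algorithm)

human_accuracy = accuracy(human, verified)
algorithm_accuracy = accuracy(algorithm, verified)
```

Only the flagged fields need a second look, which keeps the reverification effort proportional to the number of disagreements rather than to the number of reports. On the timing figures in the abstract, 94.3 seconds against 3.1 seconds is roughly a 30-fold difference.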
Full text:
1
Collections:
01-internacional
Database:
MEDLINE
Main subject:
Algorithms / Tomography, Optical Coherence
Limits:
Humans
Language:
En
Journal:
Retina
Publication year:
2024
Document type:
Article
Affiliation country:
Canada