Application of optical character recognition with natural language processing for large-scale quality metric data extraction in colonoscopy reports.
Laique, Sobia Nasir; Hayat, Umar; Sarvepalli, Shashank; Vaughn, Byron; Ibrahim, Mounir; McMichael, John; Qaiser, Kanza Noor; Burke, Carol; Bhatt, Amit; Rhodes, Colin; Rizk, Maged K.
Affiliation
  • Laique SN; Division of Gastroenterology and Hepatology, Mayo Clinic, Phoenix, Arizona, USA.
  • Hayat U; Division of Gastroenterology, University of Minnesota, Minneapolis, Minnesota, USA.
  • Sarvepalli S; Department of Hospital Medicine, Cleveland Clinic, Cleveland, Ohio, USA; Department of Bioinformatics, Vanderbilt University, Nashville, Tennessee, USA.
  • Vaughn B; Division of Gastroenterology, University of Minnesota, Minneapolis, Minnesota, USA.
  • Ibrahim M; Digestive Disease Institute, Cleveland Clinic, Cleveland, Ohio, USA.
  • McMichael J; Digestive Disease Institute, Cleveland Clinic, Cleveland, Ohio, USA.
  • Qaiser KN; Department of Hospital Medicine, Cleveland Clinic, Cleveland, Ohio, USA.
  • Burke C; Digestive Disease Institute, Cleveland Clinic, Cleveland, Ohio, USA.
  • Bhatt A; Digestive Disease Institute, Cleveland Clinic, Cleveland, Ohio, USA.
  • Rhodes C; eHealth Technologies, West Henrietta, New York, USA.
  • Rizk MK; Digestive Disease Institute, Cleveland Clinic, Cleveland, Ohio, USA.
Gastrointest Endosc; 93(3): 750-757, 2021 Mar.
Article in English | MEDLINE | ID: mdl-32891620
ABSTRACT
BACKGROUND AND AIMS:

Colonoscopy is commonly performed for colorectal cancer screening in the United States. Reports are often generated in a non-standardized format and are not always integrated into electronic health records. Thus, this information is not readily available for streamlining quality management, participating in endoscopy registries, or reporting patient- and center-specific risk factors predictive of outcomes. We aim to demonstrate the use of a new hybrid approach that applies natural language processing to charts first converted to text with optical character recognition (OCR/NLP hybrid) to obtain relevant clinical information from scanned colonoscopy and pathology reports, a technology co-developed by Cleveland Clinic and eHealth Technologies (West Henrietta, NY, USA).

METHODS:

This was a retrospective study conducted at Cleveland Clinic, Cleveland, Ohio, and the University of Minnesota, Minneapolis, Minnesota. A randomly sampled list of outpatient screening colonoscopy procedures and pathology reports was selected. Two researchers first manually reviewed the reports for the desired variables. Then, the OCR/NLP algorithm was used to obtain the same variables from 3 electronic health record systems in use at our institution: Epic (Verona, Wisc, USA); ProVation (Minneapolis, Minn, USA), used for endoscopy reporting; and Sunquest PowerPath (Tucson, Ariz, USA), used for pathology reporting.
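
To make the extraction step concrete, the following is a toy sketch of pulling yes/no variables out of report text that has already been through OCR. It is not the authors' algorithm, which the abstract does not describe; the variable names, keyword patterns, and simple negation rule are all assumptions chosen for illustration.

```python
import re
from typing import Dict

# Hypothetical keyword patterns; a real system would need far richer rules.
PATTERNS = {
    "polyp": re.compile(r"\bpolyps?\b", re.IGNORECASE),
    "inadequate_prep": re.compile(
        r"\b(?:inadequate|poor)\s+(?:bowel\s+)?prep\w*", re.IGNORECASE
    ),
}

# Assumed negation cues; a mention is ignored if one precedes it in the sentence.
NEGATION = re.compile(r"\b(?:no|not|without)\b", re.IGNORECASE)


def extract_variables(report_text: str) -> Dict[str, bool]:
    """Flag each variable as present if any sentence mentions it without negation."""
    found = {name: False for name in PATTERNS}
    for sentence in re.split(r"[.\n]+", report_text):
        for name, pattern in PATTERNS.items():
            match = pattern.search(sentence)
            if match and not NEGATION.search(sentence[: match.start()]):
                found[name] = True
    return found


# Invented example text standing in for OCR output of a scanned report.
report = (
    "The cecum was reached. Bowel preparation was adequate. "
    "No polyps were seen in the rectum. "
    "A 5 mm polyp was removed from the sigmoid colon."
)
result = extract_variables(report)
```

Here the negated mention ("No polyps were seen") is skipped, while the later unnegated mention sets the polyp flag. Sentence-level keyword matching of this kind is fragile, which is one reason validation against manual review matters.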

RESULTS:

Compared with manual data extraction, the accuracy of the hybrid OCR/NLP approach to detect polyps was 95.8%, adenomas 98.5%, sessile serrated polyps 99.3%, advanced adenomas 98%, inadequate bowel preparation 98.4%, and failed cecal intubation 99%. Comparison of the dataset collected via NLP alone with that collected using the hybrid OCR/NLP approach showed that the accuracy for almost all variables was >99%.
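
As an illustration of how such per-variable accuracy figures can be computed, here is a minimal sketch that compares automated output with manual review, treated as the reference. The abstract does not give the authors' exact definition of accuracy; this sketch assumes it is the share of reports on which the two methods agree, and the function name and data are hypothetical.

```python
from typing import Dict, List


def per_variable_accuracy(
    manual: List[Dict[str, bool]], automated: List[Dict[str, bool]]
) -> Dict[str, float]:
    """Return, per variable, the fraction of reports where both methods agree."""
    if not manual or len(manual) != len(automated):
        raise ValueError("Both lists must be non-empty and of equal length")
    variables = manual[0].keys()
    return {
        var: sum(m[var] == a[var] for m, a in zip(manual, automated)) / len(manual)
        for var in variables
    }


# Invented toy data: four reports, two variables each.
manual = [
    {"polyp": True, "adenoma": True},
    {"polyp": False, "adenoma": False},
    {"polyp": True, "adenoma": False},
    {"polyp": True, "adenoma": True},
]
automated = [
    {"polyp": True, "adenoma": True},
    {"polyp": False, "adenoma": False},
    {"polyp": False, "adenoma": False},  # automated method missed a polyp
    {"polyp": True, "adenoma": True},
]
accuracy = per_variable_accuracy(manual, automated)
```

In this toy data the methods agree on the polyp variable in 3 of 4 reports and on the adenoma variable in all 4.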

CONCLUSIONS:

Our study is the first to validate the use of a unique hybrid OCR/NLP technology to extract desired variables from scanned procedure and pathology reports contained in image format with an accuracy >95%.
Subjects

Full text: 1 Database: MEDLINE Main subject: Natural Language Processing / Cecum Study type: Guideline / Observational_studies / Prognostic_studies / Risk_factors_studies Limits: Humans Country as subject: North America Language: En Publication year: 2021 Document type: Article