Clinical concept recognition: Evaluation of existing systems on EHRs.

Lossio-Ventura, Juan Antonio; Sun, Ran; Boussard, Sebastien; Hernandez-Boussard, Tina

Affiliation

Lossio-Ventura JA; Biomedical Informatics Research, Stanford University, Stanford, CA, United States.
Sun R; National Institute of Mental Health, National Institutes of Health, Bethesda, MD, United States.
Boussard S; Biomedical Informatics Research, Stanford University, Stanford, CA, United States.
Hernandez-Boussard T; College of Engineering, Boston University, Boston, MA, United States.

Front Artif Intell ; 5: 1051724, 2022.

Article in English | MEDLINE | ID: mdl-36714202

ABSTRACT

Objective: The adoption of electronic health records (EHRs) has produced enormous amounts of data, creating research opportunities in clinical data sciences. Several concept recognition systems have been developed to facilitate clinical information extraction from these data. Although studies comparing the performance of concept recognition systems exist, such comparisons are typically developed internally and may be biased by differing internal implementations, parameter choices, and the limited number of systems included. The goal of this research is to evaluate the performance of existing systems in retrieving relevant clinical concepts from EHRs.

Methods: We investigated six concept recognition systems: CLAMP, cTAKES, MetaMap, NCBO Annotator, QuickUMLS, and ScispaCy. The clinical concepts extracted included procedures, disorders, medications, and anatomical locations. System performance was evaluated on two datasets: 2010 i2b2 and MIMIC-III. Additionally, we assessed the performance of these systems in five challenging situations: negation, severity, abbreviation, ambiguity, and misspelling.

Results: For clinical concept extraction, CLAMP achieved the best performance on exact and inexact matching, with F-scores of 0.70 and 0.94, respectively, on i2b2, and 0.39 and 0.50, respectively, on MIMIC-III. Across the five challenging situations, ScispaCy excelled in extracting abbreviation information (F-score: 0.86), followed by NCBO Annotator (F-score: 0.79). CLAMP performed best in extracting severity terms (F-score: 0.73), followed by NCBO Annotator (F-score: 0.68). CLAMP also outperformed the other systems in extracting negated concepts (F-score: 0.63).

Conclusions: Several concept recognition systems exist to extract clinical information from unstructured data. This study provides an external evaluation by end-users of six commonly used systems across different extraction tasks. Our findings suggest that CLAMP provides the most comprehensive set of annotations for clinical concept extraction tasks and associated challenges. Comparing standard extraction tasks across systems provides guidance to other clinical researchers when selecting a concept recognition system relevant to their clinical information extraction task.
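
The abstract reports F-scores under exact and inexact matching but does not define either criterion. As a rough illustration only, the sketch below assumes that an exact match requires identical character offsets and concept type, and that an inexact match requires only overlapping offsets with the same type. The Span type, the f_score function, and the sample offsets are hypothetical and are not taken from the study.

```python
from typing import NamedTuple, Sequence


class Span(NamedTuple):
    start: int   # character offset, inclusive
    end: int     # character offset, exclusive
    label: str   # concept type, e.g. "disorder" or "medication"


def overlaps(a: Span, b: Span) -> bool:
    """Assumed inexact criterion: same label and at least one shared character."""
    return a.label == b.label and a.start < b.end and b.start < a.end


def f_score(gold: Sequence[Span], predicted: Sequence[Span], exact: bool = True) -> float:
    """Harmonic mean of precision and recall under the chosen matching criterion."""
    match = (lambda g, p: g == p) if exact else overlaps
    # Predicted spans that match some gold span count toward precision.
    matched_pred = sum(any(match(g, p) for g in gold) for p in predicted)
    # Gold spans that are matched by some predicted span count toward recall.
    matched_gold = sum(any(match(g, p) for p in predicted) for g in gold)
    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations: one exact hit, one partial overlap, one spurious span.
gold = [Span(0, 10, "disorder"), Span(20, 29, "medication")]
predicted = [
    Span(0, 10, "disorder"),
    Span(22, 29, "medication"),
    Span(40, 45, "procedure"),
]

exact_f = f_score(gold, predicted, exact=True)     # precision 1/3, recall 1/2
inexact_f = f_score(gold, predicted, exact=False)  # precision 2/3, recall 1
print(f"exact F = {exact_f:.2f}, inexact F = {inexact_f:.2f}")
```

Under these assumptions the partially overlapping medication span counts only in the inexact setting, which is one way a system's inexact F-score can be well above its exact F-score, as with the 0.70 and 0.94 reported for CLAMP on i2b2.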

Keywords

UMLS; clinical concept recognition; clinical information extraction; electronic health records; named-entity recognition; natural language processing

Full text: 1 | Collections: 01-internacional | Health context: 1_ASSA2030 | Database: MEDLINE | Study type: Guideline | Language: En | Journal: Front Artif Intell | Publication year: 2022 | Document type: Article
