Extraction of temporal relations from clinical free text: A systematic review of current approaches.

Alfattni, Ghada; Peek, Niels; Nenadic, Goran

Affiliation

Alfattni G; Department of Computer Science, University of Manchester, Manchester, UK; Department of Computer Science, Jamoum University College, Umm Al-Qura University, Makkah, Saudi Arabia. Electronic address: gafattni@uqu.edu.sa.
Peek N; Centre for Health Informatics, Division of Informatics, Imaging and Data Sciences, University of Manchester, Manchester, UK; National Institute of Health Research Manchester Biomedical Research Centre, Manchester Academic Health Science Centre, University of Manchester, Manchester, UK; The Alan Turing Institute, UK.
Nenadic G; Department of Computer Science, University of Manchester, Manchester, UK; The Alan Turing Institute, UK.

J Biomed Inform; 108: 103488, 2020 Aug.

Article in English | MEDLINE | ID: mdl-32673788

ABSTRACT

BACKGROUND:

Temporal relations between clinical events play an important role in clinical assessment and decision making. Extracting such relations from free-text data is challenging because the task lies at the intersection of medical natural language processing, temporal representation, and temporal reasoning.

OBJECTIVES:

To survey existing methods for extracting temporal relations (TLINKs) between events from clinical free text in English; to establish the state-of-the-art in this field; and to identify outstanding methodological challenges.

METHODS:

A systematic search of PubMed and the DBLP computer science bibliography was conducted for studies published between January 2006 and December 2018. Relevant studies were identified by examining titles and abstracts. The full text of the selected studies was then analyzed in depth, and information was collected on TLINK tasks, TLINK types, data sources, feature selection, methods used, and reported performance.

RESULTS:

A total of 2834 publications were identified for title and abstract screening, of which 51 studies were selected. Thirty-two studies used machine learning approaches, 15 used hybrid approaches, and only four used a rule-based approach. Most studies used publicly available corpora: THYME (28 studies) and i2b2 (17 studies).

CONCLUSION:

The performance of TLINK extraction methods varies widely depending on the relation types and events involved (e.g., from 32% to 87% F-score for identifying relations between clinical events and document creation time). A small set of TLINKs (before, after, overlap, and contains) has been widely studied with relatively good performance, whereas other TLINK types (e.g., started by, finished by, precedes) are rarely studied and remain challenging. Machine learning classifiers (such as Support Vector Machines and Conditional Random Fields) and deep neural networks were among the best-performing methods for extracting TLINKs, but nearly all of the work has been developed and tested on only two publicly available corpora. The field would benefit from more publicly available, high-quality, annotated clinical text corpora.

Subjects

Electronic Health Records; Natural Language Processing; Data Mining; Information Storage and Retrieval; Machine Learning; Time

Keywords

Clinical notes; Electronic health records; Natural language processing; Temporal information extraction; Temporal relation extraction; Text mining

Full text: 1 Database: MEDLINE Main subject: Natural Language Processing / Electronic Health Records Study type: Prognostic_studies / Qualitative_research / Systematic_reviews Language: En Journal: J Biomed Inform Journal subject: MEDICAL INFORMATICS Publication year: 2020 Document type: Article
