Natural language processing with machine learning methods to analyze unstructured patient-reported outcomes derived from electronic health records: A systematic review.
Sim, Jin-Ah; Huang, Xiaolei; Horan, Madeline R; Stewart, Christopher M; Robison, Leslie L; Hudson, Melissa M; Baker, Justin N; Huang, I-Chan.
Affiliation
  • Sim JA; Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States; School of AI Convergence, Hallym University, Chuncheon, Republic of Korea.
  • Huang X; Department of Computer Science, University of Memphis, Memphis, TN, United States.
  • Horan MR; Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States.
  • Stewart CM; Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States.
  • Robison LL; Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States.
  • Hudson MM; Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States; Department of Oncology, St. Jude Children's Research Hospital, Memphis, TN, United States.
  • Baker JN; Department of Pediatrics, Stanford University, Stanford, CA, United States.
  • Huang IC; Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States. Electronic address: i-chan.huang@stjude.org.
Artif Intell Med; 146: 102701, 2023 Dec.
Article in En | MEDLINE | ID: mdl-38042599
ABSTRACT

OBJECTIVE:

Natural language processing (NLP) combined with machine learning (ML) techniques is increasingly used to process unstructured, free-text patient-reported outcome (PRO) data available in electronic health records (EHRs). This systematic review summarizes the literature reporting NLP/ML systems and toolkits for analyzing PROs in clinical narratives of EHRs and discusses future directions for applying this modality in clinical care.

METHODS:

We searched PubMed, Scopus, and Web of Science for studies written in English between 1/1/2000 and 12/31/2020. Seventy-nine studies meeting the eligibility criteria were included. We abstracted and summarized information related to the study purpose, patient population, type/source/amount of unstructured PRO data, linguistic features, and NLP systems/toolkits for processing unstructured PROs in EHRs.

RESULTS:

Most of the studies used NLP/ML techniques to extract PROs from clinical narratives (n = 74) and mapped the extracted PROs into specific PRO domains for phenotyping or clustering purposes (n = 26). Some studies used NLP/ML to process PROs for predicting disease progression or onset of adverse events (n = 22) or developing/validating NLP/ML pipelines for analyzing unstructured PROs (n = 19). Studies used different linguistic features, including lexical, syntactic, semantic, and contextual features, to process unstructured PROs. Among the 25 NLP systems/toolkits we identified, 15 used rule-based NLP, 6 used hybrid NLP, and 4 used non-neural ML algorithms embedded in NLP.

CONCLUSIONS:

This study supports the potential utility of different NLP/ML techniques in processing unstructured PROs available in EHRs for clinical care. Though using annotation rules for NLP/ML to analyze unstructured PROs is dominant, deploying novel neural ML-based methods is warranted.

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Natural Language Processing / Electronic Health Records Type of study: Systematic_reviews Limits: Humans Language: En Journal: Artif Intell Med Journal subject: Medical Informatics Year: 2023 Document type: Article
...