Query bot for retrieving patients' clinical history: A COVID-19 use-case.

Wang, Yibo; Tariq, Amara; Khan, Fiza; Gichoya, Judy Wawira; Trivedi, Hari; Banerjee, Imon

Affiliation

Wang Y; Department of Computer Science, Emory University, Atlanta, GA 30322, USA. Electronic address: imon.banerjee@asu.edu.
Tariq A; Department of Radiology, Mayo Clinic, Arizona, AZ 85054, USA.
Khan F; Department of Radiology, Emory School of Medicine, Atlanta, GA 30322, USA.
Gichoya JW; Department of Biomedical Informatics, Emory School of Medicine, Atlanta, GA 30322, USA; Department of Radiology, Emory School of Medicine, Atlanta, GA 30322, USA.
Trivedi H; Department of Biomedical Informatics, Emory School of Medicine, Atlanta, GA 30322, USA; Department of Radiology, Emory School of Medicine, Atlanta, GA 30322, USA.
Banerjee I; Department of Radiology, Mayo Clinic, Arizona, AZ 85054, USA; School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, AZ 85281, USA.

J Biomed Inform; 123: 103918, 2021 Nov.

Article in English | MEDLINE | ID: mdl-34560275

ABSTRACT

OBJECTIVE:

As patients become more complex and their data are spread across fragmented health information systems, automated and time-efficient ways of gathering important information from patients' medical histories are needed for effective clinical decision making. Using COVID-19 as a case study, we developed a query-bot information retrieval system with user feedback that allows clinicians to ask natural-language questions to retrieve data from patient notes.

MATERIALS AND METHODS:

We applied clinicalBERT, a pre-trained contextual language model, to our dataset of patient notes to obtain sentence embeddings, and used K-Means clustering to reduce computation time for real-time interaction. The Rocchio algorithm was then employed to incorporate user feedback and improve retrieval performance.
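The retrieval steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy 2-D vectors stand in for clinicalBERT sentence embeddings, the helper names (`normalize`, `kmeans`, `retrieve`, `rocchio`) are hypothetical, and the Rocchio weights `alpha`, `beta`, and `gamma` are common textbook defaults rather than values reported in the abstract.

```python
import numpy as np

def normalize(x):
    """L2-normalize vectors so a dot product equals cosine similarity."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def assign(X, centroids):
    """Index of the nearest centroid (squared Euclidean) for each row of X."""
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(X, centroids)
        for j in range(k):
            # Keep the previous centroid if a cluster ends up empty.
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, assign(X, centroids)

def retrieve(query, X, centroids, labels, top_k=3):
    """Score only the sentences in the cluster closest to the query."""
    q = normalize(query)
    best = int(np.argmax(normalize(centroids) @ q))
    members = np.flatnonzero(labels == best)
    scores = X[members] @ q  # rows of X are assumed to be unit length
    return members[np.argsort(-scores)[:top_k]]

def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query toward relevant and away from non-relevant vectors."""
    q = alpha * np.asarray(query, dtype=float)
    if len(relevant):
        q = q + beta * np.mean(relevant, axis=0)
    if len(non_relevant):
        q = q - gamma * np.mean(non_relevant, axis=0)
    return q

# Toy "sentence embeddings": rows 0-2 form one topic, rows 3-5 another.
X = normalize([[1.0, 0.0], [0.99, 0.1], [0.98, 0.2],
               [0.0, 1.0], [0.1, 0.99], [0.2, 0.98]])
centroids, labels = kmeans(X, k=2)
query = np.array([1.0, 0.0])
hits = retrieve(query, X, centroids, labels, top_k=2)

# Simulated feedback: the user marks rows 4 and 5 relevant, row 0 not.
updated = rocchio(query, relevant=X[[4, 5]], non_relevant=X[[0]])
```

Restricting the search to one cluster trades some recall for speed, which is consistent with the observation in the Discussion that clustering can reduce performance for some queries.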

RESULTS:

In an iterative feedback loop experiment, mean average precision (MAP) in the final iteration was 0.93/0.94 compared with an initial MAP of 0.66/0.52 for generic queries, and 1.0/1.0 compared with 0.79/0.83 for COVID-19-specific queries, confirming that the contextual model handles the ambiguity in natural language queries and that feedback helps to improve retrieval performance. The user-in-the-loop experiment also outperformed the automated pseudo-relevance feedback method. Moreover, the null hypothesis, which assumes identical precision between initial retrieval and relevance feedback, was rejected with high statistical significance (p ≪ 0.05). Compared with Word2Vec, TF-IDF, and bioBERT models, clinicalBERT performed best when considering the balance between response precision and user feedback.
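For readers unfamiliar with the MAP metric reported above, the following is a minimal sketch of how it is commonly computed. The function names are hypothetical, the relevance judgments are made-up toy data, and the sketch assumes average precision is normalized by the number of relevant items actually retrieved, which may differ from the convention used in the study.

```python
def average_precision(relevance):
    """Average precision for one query.

    `relevance` is a list of 0/1 flags for the ranked results, best first.
    Assumption: normalized by the number of relevant results retrieved.
    """
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0

def mean_average_precision(runs):
    """Mean of the per-query average precision values."""
    return sum(average_precision(r) for r in runs) / len(runs)

# Toy judgments for two queries (not data from the study).
runs = [[1, 0, 1], [0, 1]]
map_score = mean_average_precision(runs)
```

A ranking that places every relevant sentence ahead of every non-relevant one scores 1.0 under this definition.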

DISCUSSION:

Our model works well for both generic and COVID-19-specific queries. However, some generic queries are not answered as well as others, because clustering reduces query performance and vague relations between queries and sentences are considered non-relevant. We also tested our model on queries with the same meaning but different wording and demonstrated that these query variations yielded similar performance after user feedback was incorporated.

CONCLUSION:

In conclusion, we developed an NLP-based query-bot that handles synonyms and natural language ambiguity in order to retrieve relevant information from the patient chart. User feedback is critical to improving model performance.

Subjects

COVID-19; Algorithms; Feedback; Humans; Information Storage and Retrieval; SARS-CoV-2

Keywords

BERT; Clinical notes; Information retrieval; Relevance feedback; k-means

Full text: 1 Database: MEDLINE Main subject: COVID-19 Study type: Prognostic_studies Limit: Humans Language: En Publication year: 2021 Document type: Article
