Predicting relations between SOAP note sections: The value of incorporating a clinical information model.

Socrates, Vimig; Gilson, Aidan; Lopez, Kevin; Chi, Ling; Taylor, Richard Andrew; Chartash, David

Affiliation

Socrates V; Section for Biomedical Informatics and Data Science, Yale University School of Medicine, 300 George St, 06511, New Haven, USA; Department of Emergency Medicine, Yale University School of Medicine, 464 Congress Ave #260, New Haven, 06519, USA; Program of Computational Biology and Bioinformatics, Yale
Gilson A; Department of Emergency Medicine, Yale University School of Medicine, 464 Congress Ave #260, New Haven, 06519, USA. Electronic address: aidan.gilson@yale.edu.
Lopez K; Section for Biomedical Informatics and Data Science, Yale University School of Medicine, 300 George St, 06511, New Haven, USA; Department of Emergency Medicine, Yale University School of Medicine, 464 Congress Ave #260, New Haven, 06519, USA. Electronic address: kevin.lopez@yale.edu.
Chi L; Department of Emergency Medicine, Yale University School of Medicine, 464 Congress Ave #260, New Haven, 06519, USA. Electronic address: ling.chi@yale.edu.
Taylor RA; Section for Biomedical Informatics and Data Science, Yale University School of Medicine, 300 George St, 06511, New Haven, USA; Department of Emergency Medicine, Yale University School of Medicine, 464 Congress Ave #260, New Haven, 06519, USA. Electronic address: richard.taylor@yale.edu.
Chartash D; Section for Biomedical Informatics and Data Science, Yale University School of Medicine, 300 George St, 06511, New Haven, USA; School of Medicine, University College Dublin - National University of Ireland, Dublin, Health Sciences Centre, Belfield, Dublin 4, Ireland. Electronic address: david.charta

J Biomed Inform; 141: 104360, May 2023.

Article in En | MEDLINE | ID: mdl-37061014

ABSTRACT

Physician progress notes are frequently organized into Subjective, Objective, Assessment, and Plan (SOAP) sections. The Assessment section synthesizes information recorded in the Subjective and Objective sections, and the Plan section documents tests and treatments to narrow the differential diagnosis and manage symptoms. Classifying the relationship between the Assessment and Plan sections has been suggested to provide valuable insight into clinical reasoning. In this work, we use a novel human-in-the-loop pipeline to classify the relationships between the Assessment and Plan sections of SOAP notes as part of the n2c2 2022 Track 3 Challenge. In particular, we use a clinical information model constructed from both the entailment logic expected from the Challenge and the problem-oriented medical record. This information model is used to label named entities as primary and secondary problems/symptoms, events, and complications in all four SOAP sections. We iteratively train separate Named Entity Recognition models and use them to annotate entities in all notes/sections. We fine-tune a downstream RoBERTa-large model to classify the Assessment-Plan relationship. We evaluate multiple language model architectures, preprocessing parameters, and methods of knowledge integration, achieving a maximum macro-F1 score of 82.31%. Our initial model achieved top-2 performance during the Challenge (macro-F1 81.52%, competitors' macro-F1 range 74.54%-82.12%). We improved our model by incorporating post-Challenge annotations (S&O sections), outperforming the top model from the Challenge. We also used Shapley additive explanations to investigate the extent of language model clinical logic, through the lens of our clinical information model. We find that the model often uses shallow heuristics and nonspecific attention when making predictions, suggesting that knowledge integration in language models requires further research.
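For readers unfamiliar with the metric, macro-F1 averages the per-class F1 scores with equal weight for each class, so rare relation types count as much as common ones. The sketch below is a minimal, dependency-free illustration of that calculation and is not the authors' evaluation code. The function name `macro_f1` and the label strings are placeholders chosen for illustration; they are assumptions and are not taken from the abstract.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("at least one example is required")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN)
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical relation labels, used only to exercise the function.
y_true = ["direct", "direct", "indirect", "neither", "not_relevant", "not_relevant"]
y_pred = ["direct", "indirect", "indirect", "neither", "not_relevant", "direct"]
print(f"{macro_f1(y_true, y_pred):.4f}")  # prints 0.7083
```

In this toy example the per-class F1 scores are 1/2, 2/3, 1, and 2/3, and their unweighted mean is 17/24.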

Key words

Electronic health record; Entailment; Intensive care unit; Language modeling; Natural language processing; SOAP notes

Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Physicians
Type of study: Prognostic_studies / Risk_factors_studies
Limits: Humans
Language: En
Journal: J Biomed Inform
Journal subject: Medical Informatics
Year: 2023
Document type: Article