Use GPT-J Prompt Generation with RoBERTa for NER Models on Diagnosis Extraction of Periodontal Diagnosis from Electronic Dental Records.
Chuang, Yao-Shun; Jiang, Xiaoqian; Lee, Chun-Teh; Brandon, Ryan; Tran, Duong; Tokede, Oluwabunmi; Walji, Muhammad F.
Affiliation
  • Chuang YS; School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, USA.
  • Jiang X; School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, USA.
  • Lee CT; Department of Periodontics and Dental Hygiene, The University of Texas Health Science Center at Houston School of Dentistry, Houston, Texas, USA.
  • Brandon R; Department of Oral Health Sciences, Temple University Kornberg School of Dentistry, Philadelphia, Pennsylvania, USA.
  • Tran D; Diagnostic and Biomedical Sciences, The University of Texas Health Science Center at Houston School of Dentistry, Houston, Texas, USA.
  • Tokede O; Oral Healthcare Quality and Safety, The University of Texas Health Science Center at Houston School of Dentistry, Houston, Texas, USA.
  • Walji MF; Diagnostic and Biomedical Sciences, The University of Texas Health Science Center at Houston School of Dentistry, Houston, Texas, USA.
AMIA Annu Symp Proc; 2023: 904-912, 2023.
Article in En | MEDLINE | ID: mdl-38222409
ABSTRACT
This study explored the usability of prompt generation for named entity recognition (NER) tasks and how performance varies across prompt settings. Prompts generated with GPT-J models were used both to test directly against the gold standard and to generate seed data, which was then fed to a RoBERTa model trained with the spaCy package. In the direct test, a lower ratio of negative examples combined with a higher number of examples in the prompt achieved the best result, an F1 score of 0.72. After training with the RoBERTa model, performance was consistent across all settings, with F1 scores of 0.92-0.97. The study highlighted the importance of seed quality rather than quantity when feeding NER models. This research reports an efficient and accurate way to mine clinical notes for periodontal diagnoses, allowing researchers to build an NER model easily and quickly with the prompt generation approach.
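The two prompt settings named in the abstract (number of examples in the prompt and ratio of negative examples) and the F1 metric can be illustrated with a minimal sketch. This is not the authors' code: the prompt template, the "none" label for negative examples, and the function names `build_prompt` and `f1_score` are all assumptions made for illustration, and the call to GPT-J itself is left out.

```python
import random
from typing import List, Tuple

# Hypothetical representation: (sentence, diagnosis label).
# Negative examples carry the assumed label "none".
Example = Tuple[str, str]


def build_prompt(
    positives: List[Example],
    negatives: List[Example],
    n_examples: int,
    negative_ratio: float,
    query: str,
    seed: int = 0,
) -> str:
    """Assemble a few-shot prompt with a chosen share of negative examples.

    Raises ValueError (from random.sample) if either pool is too small.
    """
    rng = random.Random(seed)
    n_neg = round(n_examples * negative_ratio)
    n_pos = n_examples - n_neg
    chosen = rng.sample(positives, n_pos) + rng.sample(negatives, n_neg)
    rng.shuffle(chosen)
    blocks = [f"Sentence: {s}\nDiagnosis: {d}" for s, d in chosen]
    # The final block is left open for the language model to complete.
    blocks.append(f"Sentence: {query}\nDiagnosis:")
    return "\n\n".join(blocks)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 as the harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, lowering `negative_ratio` while raising `n_examples` corresponds to the direct-test configuration the abstract reports as best; the completions returned by the model would then serve as seed annotations for training the downstream NER model.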

Full text: 1 | Collection: 01-internacional | Database: MEDLINE | Main subject: Natural Language Processing / Dental Records | Type of study: Diagnostic_studies / Prognostic_studies | Limits: Humans | Language: En | Journal: AMIA Annu Symp Proc | Journal subject: Medical Informatics | Year: 2023 | Document type: Article | Affiliation country: United States
