Knowledge-guided generative artificial intelligence for automated taxonomy learning from drug labels.

Fang, Yilu; Ryan, Patrick; Weng, Chunhua

Affiliation

Fang Y; Department of Biomedical Informatics, Columbia University, New York, NY 10032, United States.
Ryan P; Observational Health Data Analytics, Janssen Research and Development, Titusville, NJ 08560, United States.
Weng C; Department of Biomedical Informatics, Columbia University, New York, NY 10032, United States.

J Am Med Inform Assoc; 31(9): 2065-2075, 2024 Sep 01.

Article in En | MEDLINE | ID: mdl-38787964

ABSTRACT

OBJECTIVES:

To automatically construct a drug indication taxonomy from drug labels using generative artificial intelligence (AI), represented by the large language model (LLM) GPT-4, and real-world evidence (RWE).

MATERIALS AND METHODS:

We extracted indication terms from 46 421 free-text drug labels using GPT-4. By integrating GPT-4 with RWE, we then iteratively and recursively generated indication concepts and inferred concept-to-concept and concept-to-term subsumption relations, from which we created a drug indication taxonomy. Quantitative and qualitative evaluations involving domain experts were performed for cardiovascular (CVD), endocrine, and genitourinary system diseases.
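
The recursive expansion described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the prompts, the RWE integration, or the stopping rule, so the two GPT-4 steps are replaced by hypothetical lookup-table stand-ins (`toy_propose`, `toy_subsumes`), the RWE step is omitted, and all concept and term names are invented toy data.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Concept:
    """One taxonomy node: child concepts plus terms attached directly to it."""
    name: str
    children: list[Concept] = field(default_factory=list)
    terms: list[str] = field(default_factory=list)


def build_taxonomy(
    name: str,
    terms: list[str],
    propose: Callable[[str, list[str]], list[str]],
    subsumes: Callable[[str, str], bool],
    max_depth: int = 11,  # assumed cap; the abstract states no stopping rule
) -> Concept:
    """Recursively split `terms` under sub-concepts suggested by `propose`."""
    node = Concept(name)
    remaining = list(terms)
    candidates = propose(name, terms) if max_depth > 1 else []
    for sub in candidates:
        covered = [t for t in remaining if subsumes(sub, t)]
        if not covered:
            continue  # skip sub-concepts that subsume none of the terms
        remaining = [t for t in remaining if t not in covered]
        node.children.append(
            build_taxonomy(sub, covered, propose, subsumes, max_depth - 1)
        )
    node.terms = remaining  # terms no sub-concept claimed stay on this node
    return node


# Hypothetical stand-ins for the two LLM calls (toy data, not from the paper).
TOY_SUBCONCEPTS = {
    "Cardiovascular diseases": ["Hypertension", "Heart failure"],
    "Hypertension": ["Pulmonary hypertension"],
}
TOY_MEMBERSHIP = {
    "Hypertension": {"essential hypertension", "pulmonary arterial hypertension"},
    "Pulmonary hypertension": {"pulmonary arterial hypertension"},
    "Heart failure": {"chronic heart failure"},
}


def toy_propose(concept: str, terms: list[str]) -> list[str]:
    return TOY_SUBCONCEPTS.get(concept, [])


def toy_subsumes(concept: str, term: str) -> bool:
    return term in TOY_MEMBERSHIP.get(concept, set())


def depth(node: Concept) -> int:
    """Number of levels from this node down to its deepest descendant."""
    return 1 + max((depth(c) for c in node.children), default=0)


def leaves(node: Concept) -> list[str]:
    """Names of concepts that have no child concepts."""
    if not node.children:
        return [node.name]
    return [name for c in node.children for name in leaves(c)]


taxonomy = build_taxonomy(
    "Cardiovascular diseases",
    [
        "essential hypertension",
        "pulmonary arterial hypertension",
        "chronic heart failure",
        "angina pectoris",
    ],
    toy_propose,
    toy_subsumes,
)
print(depth(taxonomy), leaves(taxonomy), taxonomy.terms)
```

In this sketch each term is attached to the deepest concept that claims it, and the helper functions `depth` and `leaves` compute the kind of summary statistics (depth, leaf nodes) that the results report for each sub-taxonomy.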

RESULTS:

A total of 2909 drug indication terms were extracted and assigned to 24 high-level indication categories (ie, initially generated concepts), each of which was expanded into a sub-taxonomy. For example, the CVD sub-taxonomy contains 242 concepts, 170 of which are leaf nodes, and has a depth of 11. It covers 234 indication terms associated with 189 distinct drugs. The accuracies of GPT-4 in determining the drug indication hierarchy exceeded 0.7, with "good to very good" inter-rater reliability. However, the accuracies of concept-to-term subsumption relation checking varied greatly, with "fair to moderate" reliability.

DISCUSSION AND CONCLUSION:

We successfully used generative AI and RWE to create a taxonomy whose drug indications are adequately consistent with domain expert expectations. We show that LLMs are good at deriving their own concept hierarchies but still fall short in determining subsumption relations between concepts and terms expressed in the unregulated language of free-text drug labels, a task that is equally hard for human experts.

Subject(s)

Artificial Intelligence; Drug Labeling; Natural Language Processing; Humans; Classification/methods

Key words

large language model; real-world evidence; taxonomy learning

Full text: 1 | Database: MEDLINE | Main subject: Natural Language Processing / Artificial Intelligence / Drug Labeling | Limits: Humans | Language: En | Journal: J Am Med Inform Assoc | Journal subject: Medical Informatics | Year: 2024 | Type: Article | Affiliation country: United States