1.
Patterns (N Y); 4(4): 100726, 2023 Apr 14.
Article in English | MEDLINE | ID: mdl-37123439

Most detailed patient information in real-world data (RWD) is consistently available only in free-text clinical documents. Manual curation is expensive and time-consuming. Developing natural language processing (NLP) methods for structuring RWD is thus essential for scaling real-world evidence generation. We propose leveraging patient-level supervision from medical registries, which are often readily available and capture key patient information, for general RWD applications. We conduct an extensive study on 135,107 patients from the cancer registry of a large integrated delivery network (IDN) comprising healthcare systems in five western US states. Our deep-learning methods attain test area under the receiver operating characteristic curve (AUROC) values of 94%-99% for key tumor attributes, with comparable performance on held-out data from separate health systems and states. Ablation results demonstrate the superiority of these deep-learning methods. Error analysis shows that our NLP system sometimes even corrects errors in registrar labels.
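
To make the approach concrete, below is a minimal sketch of patient-level supervision: every clinical note for a patient is encoded with a transformer, the note vectors are pooled into one patient vector, and the model is trained against a single registry-derived attribute label, with no span-level annotation. The encoder checkpoint, the label set (tumor laterality), and max-pooling are illustrative assumptions, not the paper's exact architecture.

```python
# Hedged sketch of patient-level supervision from a registry label.
# Model name, attribute values, and pooling are assumptions for
# illustration, not the published system.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

MODEL = "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract"  # assumed encoder
LABELS = ["left", "right", "bilateral", "unknown"]              # assumed attribute values

class PatientAttributeClassifier(nn.Module):
    """Encode each clinical note, pool across one patient's notes,
    and predict a single registry-derived attribute label."""
    def __init__(self):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(MODEL)
        self.head = nn.Linear(self.encoder.config.hidden_size, len(LABELS))

    def forward(self, input_ids, attention_mask):
        # input_ids: (num_notes, seq_len) -- all notes for ONE patient
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]      # (num_notes, hidden)
        patient_vec, _ = cls.max(dim=0)        # max-pool over the patient's notes
        return self.head(patient_vec)          # (num_labels,)

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = PatientAttributeClassifier()
notes = ["Path report: invasive ductal carcinoma, left breast.",
         "Follow-up note: s/p left mastectomy."]
batch = tokenizer(notes, padding=True, truncation=True, return_tensors="pt")
logits = model(batch["input_ids"], batch["attention_mask"])
# The registry supplies one label for the whole patient; that single
# label drives the gradient -- no per-document or per-span annotation.
loss = nn.functional.cross_entropy(logits.unsqueeze(0),
                                   torch.tensor([LABELS.index("left")]))
loss.backward()
```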

2.
Patterns (N Y); 4(4): 100729, 2023 Apr 14.
Article in English | MEDLINE | ID: mdl-37123444

Large neural language models have transformed modern natural language processing (NLP) applications. However, fine-tuning such models for specific tasks remains challenging as model size increases, especially with the small labeled datasets that are common in biomedical NLP. We conduct a systematic study of fine-tuning stability in biomedical NLP. We show that fine-tuning performance may be sensitive to pretraining settings and explore techniques for addressing fine-tuning instability. We show that these techniques can substantially improve fine-tuning performance for low-resource biomedical NLP applications. Specifically, freezing lower layers is helpful for standard BERT-Base models, while layerwise learning-rate decay is more effective for BERT-Large and ELECTRA models. For low-resource text similarity tasks, such as BIOSSES, reinitializing the top layers is the optimal strategy. Overall, domain-specific vocabulary and pretraining facilitate robust models for fine-tuning. Based on these findings, we establish a new state of the art on a wide range of biomedical NLP applications.
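
As a rough illustration of two of the stabilization techniques named above, freezing lower layers and layerwise learning-rate decay, here is a hedged sketch using a generic BERT checkpoint. The frozen-layer count, base learning rate, and decay factor are assumed values, not the study's tuned settings.

```python
# Hedged sketch of fine-tuning stabilization: freeze lower layers
# (reported to help BERT-Base) and apply layerwise learning-rate
# decay (reported to help BERT-Large / ELECTRA). Hyperparameters
# here are illustrative assumptions.
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # stand-in for a biomedical checkpoint

# Technique 1: freeze the embeddings and the lowest N encoder layers.
N_FROZEN = 4  # assumed
for p in model.bert.embeddings.parameters():
    p.requires_grad = False
for layer in model.bert.encoder.layer[:N_FROZEN]:
    for p in layer.parameters():
        p.requires_grad = False

# Technique 2: layerwise learning-rate decay -- the top layer gets the
# base LR, and each layer below is scaled down by `decay`.
base_lr, decay = 2e-5, 0.9  # assumed hyperparameters
layers = list(model.bert.encoder.layer)
groups = []
for i, layer in enumerate(layers):
    lr = base_lr * decay ** (len(layers) - 1 - i)
    groups.append({"params": [p for p in layer.parameters() if p.requires_grad],
                   "lr": lr})
groups.append({"params": list(model.bert.pooler.parameters())
                         + list(model.classifier.parameters()),
               "lr": base_lr})
optimizer = torch.optim.AdamW(groups, lr=base_lr)
# (For low-resource similarity tasks the abstract instead recommends
# reinitializing the top encoder layers before training.)
```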

...