Results 1 - 6 of 6
1.
J Biomed Inform; 149: 104576, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38101690

ABSTRACT

INTRODUCTION: Machine learning algorithms are expected to work side-by-side with humans in decision-making pipelines, so the ability of classifiers to make reliable decisions is of paramount importance. Deep neural networks (DNNs) represent the state-of-the-art models for real-world classification. Although the strength of activation in DNNs is often correlated with the network's confidence, in-depth analyses are needed to establish whether they are well calibrated.

METHOD: In this paper, we demonstrate the use of DNN-based classification tools to benefit cancer registries by automating the extraction of information on disease at diagnosis and at surgery from electronic text pathology reports from the US National Cancer Institute (NCI) Surveillance, Epidemiology, and End Results (SEER) population-based cancer registries. In particular, we introduce multiple methods for selective classification that achieve a target level of accuracy on multiple classification tasks while minimizing the rejection amount, that is, the number of electronic pathology reports for which the model's predictions are unreliable. We evaluate the proposed methods by comparing our approach with the current in-house deep learning-based abstaining classifier.

RESULTS: Overall, all the proposed selective classification methods achieve the targeted level of accuracy or higher in a trade-off analysis aimed at minimizing the rejection rate. On in-distribution validation and holdout test data, all the proposed methods reach the required target accuracy on all tasks with a lower rejection rate than the deep abstaining classifier (DAC). Interpreting the results for the out-of-distribution test data is more complex; nevertheless, in this case as well, the best of the proposed methods that achieves 97% accuracy or higher has a lower rejection rate than the DAC.

CONCLUSIONS: We show that although both approaches can flag samples that should be manually reviewed and labeled by human annotators, the newly proposed methods retain a larger fraction of reports and do so without retraining, thus offering a reduced computational cost compared with the in-house deep learning-based abstaining classifier.
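The selective-classification idea above can be illustrated with a simple confidence-threshold rule: keep a prediction only when the classifier's softmax confidence clears a threshold tuned on validation data to hit the target accuracy, and reject (abstain on) everything else for human review. The sketch below is a minimal illustration under that assumption, not the authors' exact method; the array shapes, target-accuracy value, and function names are hypothetical.

```python
import numpy as np

def choose_threshold(val_probs, val_labels, target_acc=0.97):
    """Smallest confidence threshold whose accepted subset reaches the
    target accuracy on validation data (illustrative selection rule)."""
    conf = val_probs.max(axis=1)
    preds = val_probs.argmax(axis=1)
    for t in np.sort(np.unique(conf)):
        keep = conf >= t
        if not keep.any():
            break
        if (preds[keep] == val_labels[keep]).mean() >= target_acc:
            return t
    return 1.0  # abstain on everything if the target is unreachable

def selective_predict(probs, threshold):
    """Predicted class per report; -1 marks rejected (abstained) reports
    that would be routed to manual review."""
    preds = probs.argmax(axis=1)
    preds[probs.max(axis=1) < threshold] = -1
    return preds
```

The rejection rate is then simply the fraction of reports assigned -1, which is the quantity the paper seeks to minimize at a fixed accuracy target.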


Subject(s)
Deep Learning , Humans , Uncertainty , Neural Networks, Computer , Algorithms , Machine Learning
2.
BMC Bioinformatics; 22(1): 113, 2021 Mar 09.
Article in English | MEDLINE | ID: mdl-33750288

ABSTRACT

BACKGROUND: Automated text classification has many important applications in the clinical setting; however, obtaining labelled data for training machine learning and deep learning models is often difficult and expensive. Active learning techniques may mitigate this challenge by reducing the amount of labelled data required to effectively train a model. In this study, we analyze the effectiveness of 11 active learning algorithms for classifying subsite and histology from cancer pathology reports, using a convolutional neural network as the text classification model.

RESULTS: We compare the performance of each active learning strategy using two differently sized datasets and two different classification tasks. Our results show that on all tasks and dataset sizes, all active learning strategies except diversity-sampling strategies outperformed random sampling, i.e., no active learning. On our large dataset (15K initial labelled samples, adding 15K additional labelled samples at each iteration of active learning), there was no clear winner among the different active learning strategies. On our small dataset (1K initial labelled samples, adding 1K additional labelled samples at each iteration), marginal and ratio uncertainty sampling performed better than all other active learning techniques. We also found that, compared to random sampling, active learning strongly improves performance on rare classes by preferentially sampling from underrepresented classes.

CONCLUSIONS: Active learning can save annotation cost by helping human annotators efficiently and intelligently select which samples to label. Our results show that a dataset constructed using effective active learning techniques requires less than half the labelled data to achieve the same performance as a dataset constructed using random sampling.
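Marginal (margin) and ratio uncertainty sampling, the strategies that did best on the small dataset, both score an unlabelled report by how close its top two predicted class probabilities are and send the most uncertain reports to annotators. A minimal sketch follows; the function names, batch size, and probability array are illustrative, not the study's implementation.

```python
import numpy as np

def margin_scores(probs):
    """Margin sampling: best minus second-best class probability."""
    top2 = np.sort(probs, axis=1)[:, -2:]
    return top2[:, 1] - top2[:, 0]      # small margin = high uncertainty

def ratio_scores(probs):
    """Ratio sampling: second-best divided by best class probability."""
    top2 = np.sort(probs, axis=1)[:, -2:]
    return top2[:, 0] / top2[:, 1]      # near 1 = high uncertainty

def select_batch(probs, batch_size=1000, strategy="margin"):
    """Indices of the most uncertain unlabelled samples to label next."""
    if strategy == "margin":
        return np.argsort(margin_scores(probs))[:batch_size]   # smallest margins
    return np.argsort(ratio_scores(probs))[-batch_size:]       # largest ratios
```

Each active learning iteration would label the selected reports, add them to the training set, retrain the classifier, and rescore the remaining pool.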


Subject(s)
Machine Learning , Neoplasms , Algorithms , Humans , Neoplasms/genetics , Neoplasms/pathology , Neural Networks, Computer
3.
IEEE J Biomed Health Inform; 25(9): 3596-3607, 2021 Sep.
Article in English | MEDLINE | ID: mdl-33635801

ABSTRACT

Bidirectional Encoder Representations from Transformers (BERT) and BERT-based approaches are the current state-of-the-art in many natural language processing (NLP) tasks; however, their application to document classification on long clinical texts is limited. In this work, we introduce four methods to scale BERT, which by default can only handle input sequences up to approximately 400 words long, to perform document classification on clinical texts several thousand words long. We compare these methods against two much simpler architectures - a word-level convolutional neural network and a hierarchical self-attention network - and show that BERT often cannot beat these simpler baselines when classifying MIMIC-III discharge summaries and SEER cancer pathology reports. In our analysis, we show that two key components of BERT - pretraining and WordPiece tokenization - may actually be inhibiting BERT's performance on clinical text classification tasks where the input document is several thousand words long and where correctly identifying labels may depend more on identifying a few key words or phrases rather than understanding the contextual meaning of sequences of text.
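A common workaround for BERT's input-length limit, in the spirit of (though not necessarily identical to) the scaling methods compared here, is to split a long report into overlapping chunks, classify each chunk, and aggregate the chunk-level outputs. The sketch below uses the Hugging Face transformers library; the checkpoint name, number of labels, and mean-pooling choice are assumptions for illustration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=25)   # num_labels is a placeholder

def classify_long_report(text, max_len=512, stride=128):
    """Split a long report into overlapping token windows, classify each
    window, and mean-pool the logits into one document-level prediction."""
    enc = tokenizer(text, max_length=max_len, stride=stride,
                    truncation=True, padding="max_length",
                    return_overflowing_tokens=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(input_ids=enc["input_ids"],
                       attention_mask=enc["attention_mask"]).logits
    return logits.mean(dim=0).argmax().item()
```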


Subject(s)
Natural Language Processing , Neural Networks, Computer , Humans
4.
PLoS One; 15(5): e0232840, 2020.
Article in English | MEDLINE | ID: mdl-32396579

ABSTRACT

Individual electronic health records (EHRs) and clinical reports are often part of a larger sequence; for example, a single patient may generate multiple reports over the trajectory of a disease. In applications such as cancer pathology reports, it is necessary not only to extract information from individual reports but also to capture aggregate information about the entire cancer case, based on case-level context from all reports in the sequence. In this paper, we introduce a simple modular add-on for capturing case-level context that is designed to be compatible with most existing deep learning architectures for text classification on individual reports. We test our approach on a corpus of 431,433 cancer pathology reports and show that incorporating case-level context significantly boosts classification accuracy across six classification tasks: site, subsite, laterality, histology, behavior, and grade. We expect that, with minimal modifications, our add-on can be applied to a wide range of other clinical text-based tasks.
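One way to picture the modular add-on (a rough sketch under assumptions, not the authors' exact design) is a classification head that concatenates each report's embedding with a pooled summary of all report embeddings from the same cancer case before the final linear layer; the base model that produces the report embeddings is left unchanged. Dimensions and names below are hypothetical.

```python
import torch
import torch.nn as nn

class CaseContextHead(nn.Module):
    """Classification head that augments each report embedding with
    case-level context pooled over all reports in the same case."""
    def __init__(self, emb_dim=400, num_classes=70):
        super().__init__()
        self.classifier = nn.Linear(2 * emb_dim, num_classes)

    def forward(self, report_embs):
        # report_embs: (num_reports_in_case, emb_dim) from any base text model
        context = report_embs.mean(dim=0, keepdim=True)       # case-level summary
        context = context.expand(report_embs.size(0), -1)     # one copy per report
        combined = torch.cat([report_embs, context], dim=1)   # (N, 2*emb_dim)
        return self.classifier(combined)                      # per-report logits
```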


Subject(s)
Electronic Health Records/classification , Neoplasms/pathology , Histological Techniques , Humans , Natural Language Processing , SEER Program
5.
Artif Intell Med; 101: 101726, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31813492

ABSTRACT

We introduce a deep learning architecture, the hierarchical self-attention network (HiSAN), designed for classifying pathology reports, and show how its unique architecture leads to a new state of the art in accuracy, faster training, and clear interpretability. We evaluate performance on a corpus of 374,899 pathology reports obtained from the National Cancer Institute's (NCI) Surveillance, Epidemiology, and End Results (SEER) program. Each pathology report is associated with five clinical classification tasks: site, laterality, behavior, histology, and grade. We compare the performance of the HiSAN against other machine learning and deep learning approaches commonly used on medical text data: naive Bayes, logistic regression, convolutional neural networks, and hierarchical attention networks (the previous state of the art). We show that the HiSAN is superior to these other machine learning and deep learning text classifiers in both accuracy and macro F-score across all five classification tasks. Compared with hierarchical attention networks, the HiSAN is not only an order of magnitude faster to train but also achieves about 1% better relative accuracy and 5% better relative macro F-score.
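A heavily simplified sketch of the hierarchical self-attention idea (not the published HiSAN architecture; all hyperparameters are placeholders): word embeddings within each chunk of a report pass through self-attention and are pooled into chunk vectors, which pass through a second self-attention layer and are pooled into a single document vector for classification.

```python
import torch
import torch.nn as nn

class TinyHiSAN(nn.Module):
    """Two-level self-attention text classifier (illustrative only)."""
    def __init__(self, vocab_size=30000, emb_dim=128, heads=4, num_classes=70):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.word_attn = nn.MultiheadAttention(emb_dim, heads, batch_first=True)
        self.chunk_attn = nn.MultiheadAttention(emb_dim, heads, batch_first=True)
        self.out = nn.Linear(emb_dim, num_classes)

    def forward(self, tokens):
        # tokens: (num_chunks, chunk_len) integer word ids for one report
        x = self.embed(tokens)                          # (C, L, E)
        x, _ = self.word_attn(x, x, x)                  # word-level self-attention
        chunks = x.max(dim=1).values.unsqueeze(0)       # (1, C, E) pooled chunk vectors
        d, _ = self.chunk_attn(chunks, chunks, chunks)  # chunk-level self-attention
        doc = d.max(dim=1).values                       # (1, E) document vector
        return self.out(doc)                            # class logits
```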


Subject(s)
Neoplasms/pathology , Deep Learning , Humans , Natural Language Processing , Neoplasms/classification , Neural Networks, Computer
6.
Article in English | MEDLINE | ID: mdl-36081613

ABSTRACT

Automated text information extraction from cancer pathology reports is an active area of research in support of national cancer surveillance. A well-known challenge is how to develop information extraction tools whose performance is robust across cancer registries. In this study, we investigated whether transfer learning (TL) with a convolutional neural network (CNN) can facilitate cross-registry knowledge sharing. Specifically, we performed a series of experiments to determine whether a CNN trained on single-registry data is capable of transferring knowledge to another registry, or whether developing a cross-registry knowledge database produces a more effective and generalizable model. Using data from two cancer registries and primary tumor site and topography as the information extraction tasks of interest, our study showed that TL yields improvements of 6.90% and 17.22% in classification macro F-score over the baseline single-registry models. Detailed analysis illustrated that the observed improvement is evident in the low-prevalence classes.
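One plausible reading of the transfer-learning setup (a sketch under assumptions, not necessarily the paper's exact protocol) is to take a text CNN already trained on the source registry's reports and fine-tune all of its weights, at a reduced learning rate, on the target registry's labelled reports. The model object, data loader, and hyperparameters below are hypothetical.

```python
import torch
import torch.nn as nn

def fine_tune(model, target_loader, epochs=5, lr=1e-4, device="cpu"):
    """Fine-tune a CNN pretrained on the source registry using the
    target registry's labelled pathology reports (illustrative)."""
    model.to(device).train()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # small lr for transfer
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for tokens, labels in target_loader:                  # hypothetical DataLoader
            tokens, labels = tokens.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = loss_fn(model(tokens), labels)
            loss.backward()
            optimizer.step()
    return model

# Usage sketch:
#   source_cnn = ...                # CNN assumed trained on source-registry data
#   fine_tune(source_cnn, target_registry_loader)
```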
