ABSTRACT
Multi-label classification according to the International Classification of Diseases (ICD) is an Extreme Multi-label Classification task that aims to categorise health records according to a set of relevant ICD codes. We implemented PlaBERT, a new multi-label text classification head with per-label attention, on top of a BERT model. The model is assessed on Electronic Health Records containing discharge summaries in three languages: English, Spanish, and Swedish. The study focuses on 157 diagnostic codes from the ICD. We additionally measure the labelling noise to estimate the consistency of the gold standard. Our specialised attention mechanism computes attention weights for each input token and label pair, obtaining the specific relevance of every word with respect to each ICD code. The PlaBERT model outputs the computed attention importance for each token and label, allowing for visualisation. Our best results are 40.65, 38.36, and 41.13 F1-score points on the English, Spanish, and Swedish datasets, respectively, for the 157 gastrointestinal codes. Moreover, Precision is the metric that improves most owing to the attention mechanism of PlaBERT, with increases of 44.63, 40.93, and 12.92 points on the Spanish, Swedish, and English datasets, respectively.
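To make the per-label attention idea concrete, here is a minimal PyTorch sketch of a classification head of this kind, operating on the token states of a BERT encoder. The class name, dimensions, and toy usage are illustrative assumptions, not the released PlaBERT code.

```python
import torch
import torch.nn as nn


class PerLabelAttentionHead(nn.Module):
    """Multi-label head that computes one attention distribution per label."""

    def __init__(self, hidden_size: int, num_labels: int):
        super().__init__()
        # One query vector per label; attention scores are dot products
        # between each label query and every token representation.
        self.label_queries = nn.Parameter(torch.randn(num_labels, hidden_size))
        self.output = nn.Linear(hidden_size, 1)

    def forward(self, token_states, attention_mask):
        # token_states: (batch, seq_len, hidden), attention_mask: (batch, seq_len)
        scores = torch.einsum("bsh,lh->bls", token_states, self.label_queries)
        scores = scores.masked_fill(attention_mask.unsqueeze(1) == 0, -1e9)
        attn = torch.softmax(scores, dim=-1)           # (batch, labels, seq_len)
        label_ctx = torch.einsum("bls,bsh->blh", attn, token_states)
        logits = self.output(label_ctx).squeeze(-1)    # (batch, labels)
        return logits, attn                            # attn can be visualised


# Toy usage: token_states would come from a BERT encoder's last hidden state.
head = PerLabelAttentionHead(hidden_size=768, num_labels=157)
states = torch.randn(2, 128, 768)
mask = torch.ones(2, 128, dtype=torch.long)
logits, attn = head(states, mask)
probs = torch.sigmoid(logits)  # independent per-label probabilities
```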
Subjects
International Classification of Diseases; Language; Electronic Health Records; Humans; Natural Language Processing; Patient Discharge; Sweden
ABSTRACT
This work deals with negation detection in the context of clinical texts. Negation detection is key for decision support systems, since negated events (the detected absence of certain events) help ascertain current medical conditions. For artificial intelligence, negation detection is valuable because it can reverse the meaning of a portion of text and, accordingly, influence other tasks such as medical dosage adjustment or the detection of adverse drug reactions and hospital-acquired diseases. We focus on negated medical events such as disorders, findings and allergies. From a Natural Language Processing (NLP) perspective, we refer to them as negated medical entities. A novelty of this work is that we approach the task as Named Entity Recognition (NER) with the restriction that only negated medical entities must be recognized (in an attempt to help distinguish them from non-negated ones). Our study is conducted on Electronic Health Records (EHRs) written in Spanish. A challenge to cope with is lexical variability (alternative medical forms, abbreviations, etc.). To this end, we employed an approach based on deep learning. Specifically, the system combines character embeddings to cope with out-of-vocabulary (OOV) words, Long Short-Term Memory (LSTM) networks to model contextual representations, and Conditional Random Fields (CRFs) to classify each medical entity as negated or not given the contextual dense representation. Moreover, we explored both embeddings created from words and embeddings created from lemmas. The best results were obtained with the lemmatized embeddings; this approach apparently reinforced the capability of the LSTMs to cope with the high lexical variability. The F-measure was 65.1 for exact match and 82.4 for partial match.
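A minimal PyTorch sketch of the kind of character-aware tagger described follows. For brevity the CRF decoding layer is replaced here by a per-token linear classifier, and all names and sizes are illustrative rather than the paper's configuration.

```python
import torch
import torch.nn as nn


class CharWordBiLSTMTagger(nn.Module):
    """BiLSTM tagger with character-level embeddings to handle OOV words.

    Sketch only: the output layer is a per-token classifier; the paper
    places a CRF on top of the BiLSTM instead.
    """

    def __init__(self, vocab_size, char_vocab_size, num_tags,
                 word_dim=100, char_dim=25, hidden=128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim, padding_idx=0)
        self.char_emb = nn.Embedding(char_vocab_size, char_dim, padding_idx=0)
        self.char_lstm = nn.LSTM(char_dim, char_dim, bidirectional=True,
                                 batch_first=True)
        self.word_lstm = nn.LSTM(word_dim + 2 * char_dim, hidden,
                                 bidirectional=True, batch_first=True)
        self.classifier = nn.Linear(2 * hidden, num_tags)

    def forward(self, word_ids, char_ids):
        # word_ids: (batch, seq), char_ids: (batch, seq, max_word_len)
        b, s, c = char_ids.shape
        chars = self.char_emb(char_ids).view(b * s, c, -1)
        _, (h, _) = self.char_lstm(chars)               # h: (2, b*s, char_dim)
        char_repr = torch.cat([h[0], h[1]], dim=-1).view(b, s, -1)
        words = torch.cat([self.word_emb(word_ids), char_repr], dim=-1)
        out, _ = self.word_lstm(words)
        return self.classifier(out)                     # (batch, seq, num_tags)


# Toy usage with three tags (B/I/O restricted to negated entities).
tagger = CharWordBiLSTMTagger(vocab_size=20000, char_vocab_size=80, num_tags=3)
logits = tagger(torch.randint(1, 20000, (2, 15)), torch.randint(1, 80, (2, 15, 12)))
print(logits.shape)
```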
Subjects
Deep Learning; Electronic Health Records; Artificial Intelligence; Natural Language Processing; Neural Networks, Computer
ABSTRACT
BACKGROUND: Text mining and natural language processing of clinical text, such as notes from electronic health records, requires specific consideration of the specialized characteristics of these texts. Deep learning methods could potentially mitigate domain-specific challenges such as limited access to in-domain tools and data sets. METHODS: A bi-directional Long Short-Term Memory network is applied to clinical notes in Spanish and Swedish for the task of medical named entity recognition. Several types of embeddings, generated from both in-domain and out-of-domain text corpora, together with a number of generation and combination strategies, have been evaluated in order to investigate different input representations and the influence of domain on the final results. RESULTS: For Spanish, a micro-averaged F1-score of 75.25 was obtained, and for Swedish the corresponding score was 76.04. The best results for both languages were achieved using embeddings generated from in-domain corpora extracted from electronic health records, but embeddings generated from related domains were also found to be beneficial. CONCLUSIONS: A recurrent neural network with in-domain embeddings improved the medical named entity recognition compared to shallow learning methods, showing this combination to be suitable for entity recognition in clinical text for both languages.
Subjects
Deep Learning; Language; Natural Language Processing; Data Mining; Electronic Health Records; Humans; Neural Networks, Computer; Sweden
ABSTRACT
OBJECTIVE: The goal of this study is to investigate entity recognition within Electronic Health Records (EHRs), focusing on Spanish and Swedish. Of particular importance is a robust representation of the entities; in our case, we utilized unsupervised methods to generate such representations. METHODS: The significance of this work lies in its experimental layout: the experiments were carried out under the same conditions for both languages. Several classification approaches were explored: maximum probability, CRF, Perceptron and SVM. The classifiers were enhanced by means of ensembles of semantic spaces and ensembles of Brown trees. To mitigate data sparsity without a significant increase in the dimension of the decision space, we propose clustered approaches: hierarchical Brown clustering represented by trees, and vector quantization of each semantic space. RESULTS: The results showed that the semi-supervised approaches significantly improved standard supervised techniques for both languages. Moreover, clustering the semantic spaces contributed to the quality of the entity recognition while keeping the dimension of the feature space two orders of magnitude lower than when using the semantic spaces directly. CONCLUSIONS: The contributions of this study are: (a) a set of thorough experiments that enable comparisons regarding the influence of different types of features on different classifiers, exploring two languages other than English; and (b) the use of ensembles of clusters of Brown trees and semantic spaces on EHRs to tackle the problem of scarcity of available annotated data.
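As an illustration of the vector-quantization idea, the sketch below clusters a toy semantic space with k-means and uses the cluster identifiers as discrete token features of the kind a CRF or SVM could consume. The vocabulary, vectors and feature names are invented for the example, and Brown clustering itself would be produced by an external tool.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy semantic space: rows are word vectors (in practice, embeddings trained
# on in-domain text); random vectors keep the sketch runnable.
rng = np.random.default_rng(0)
vocab = ["fiebre", "paracetamol", "dolor", "ibuprofeno", "nausea"]
vectors = rng.normal(size=(len(vocab), 50))

# Vector quantization: map every word to a cluster id, so the classifier
# sees a small discrete feature instead of a 50-dimensional dense vector.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(vectors)
cluster_of = dict(zip(vocab, kmeans.labels_))


def token_features(sentence, i):
    """Features for token i: surface form plus its semantic-cluster id."""
    word = sentence[i]
    return {
        "word.lower": word.lower(),
        "cluster": str(cluster_of.get(word, -1)),  # -1 for OOV words
        "prev_cluster": str(cluster_of.get(sentence[i - 1], -1)) if i else "BOS",
    }


print(token_features(["fiebre", "y", "dolor"], 2))
```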
Subjects
Electronic Health Records; Machine Learning; Semantics; Cluster Analysis; Data Curation; Humans; Sweden
ABSTRACT
The advances achieved in Natural Language Processing make it possible to automatically mine information from electronically created documents. Many Natural Language Processing methods that extract information from texts make use of annotated corpora, but these are scarce in the clinical domain due to legal and ethical issues. In this paper we present the creation of the IxaMed-GS gold standard composed of real electronic health records written in Spanish and manually annotated by experts in pharmacology and pharmacovigilance. The experts mainly annotated entities related to diseases and drugs, but also relationships between entities indicating adverse drug reaction events. To help the experts in the annotation task, we adapted a general corpus linguistic analyzer to the medical domain. The quality of the annotation process in the IxaMed-GS corpus has been assessed by measuring the inter-annotator agreement, which was 90.53% for entities and 82.86% for events. In addition, the corpus has been used for the automatic extraction of adverse drug reaction events using machine learning.
Subjects
Adverse Drug Reaction Reporting Systems; Data Mining/methods; Drug-Related Side Effects and Adverse Reactions; Electronic Health Records/standards; Natural Language Processing; Algorithms; Automation; Language; Linguistics; Machine Learning; Pharmaceutical Preparations; Pharmacovigilance; Predictive Value of Tests; Reproducibility of Results; Translating
ABSTRACT
BACKGROUND AND OBJECTIVE: In the realm of automatic Electronic Health Record (EHR) classification according to the International Classification of Diseases (ICD), there is a notable gap in non-black-box approaches, and even more so for Spanish, a language frequently overlooked in clinical text classification. An additional gap in explainability pertains to the lack of standardized metrics for evaluating the degree of explainability offered by distinct techniques. METHODS: We address the classification of Spanish electronic health records, using methods to explain the predictions and improve the decision support level. We also propose Leberage, a novel metric to quantify the decision support level of the explainable predictions. We aim to assess the explanatory ability of three model-independent methods based on different theoretical frameworks: SHapley Additive exPlanations (SHAP), Local Interpretable Model-agnostic Explanations (LIME), and Integrated Gradients (IG). We develop a system based on Longformers that can process long documents, and then use the explainability methods to extract the segments of text in the EHR that motivated each ICD code. We then measure the outcome of the different explainability methods with the proposed metric. RESULTS: Our results beat those of systems carrying out the same task by 7%. In terms of degree of explainability, LIME appears to be a stronger technique than IG and SHAP. DISCUSSION: Our research reveals that the explored techniques are useful for explaining the output of black-box models such as the Longformer. In addition, the proposed metric emerges as a good choice for quantifying the contribution of explainability techniques.
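The sketch below illustrates the general recipe of model-agnostic explanation followed by segment extraction, using a simple occlusion-based attribution as a stand-in for SHAP, LIME or IG. The toy classifier, token list and function names are illustrative and unrelated to the Leberage metric or the actual Longformer system.

```python
import numpy as np


def occlusion_attribution(predict_proba, tokens, label_idx, mask_token="[MASK]"):
    """Model-agnostic attribution: drop in the label probability when each
    token is masked. A simple stand-in for LIME/SHAP/IG-style explanations."""
    base = predict_proba(" ".join(tokens))[label_idx]
    scores = []
    for i in range(len(tokens)):
        occluded = tokens[:i] + [mask_token] + tokens[i + 1:]
        scores.append(base - predict_proba(" ".join(occluded))[label_idx])
    return np.array(scores)


def supporting_segments(tokens, scores, k=3):
    """Return the k tokens that most pushed the model towards the ICD code."""
    top = np.argsort(scores)[::-1][:k]
    return [(tokens[i], float(scores[i])) for i in sorted(top)]


# Toy classifier standing in for the Longformer: the probability of a
# "gastritis" code rises when the word appears in the text.
def toy_predict_proba(text):
    return np.array([0.9 if "gastritis" in text else 0.1, 0.5])


tokens = "paciente con gastritis cronica y dolor abdominal".split()
scores = occlusion_attribution(toy_predict_proba, tokens, label_idx=0)
print(supporting_segments(tokens, scores))
```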
ABSTRACT
Civil registration and vital statistics systems capture birth and death events to compile vital statistics and to provide legal rights to citizens. Vital statistics are a key factor in promoting public health policies and the health of the population. Medical certification of cause of death is the preferred source of cause-of-death information. However, two thirds of all deaths worldwide are not captured in routine mortality information systems, and their cause of death remains unknown. Verbal autopsy is an interim solution for estimating the cause-of-death distribution at the population level in the absence of medical certification. A Verbal Autopsy (VA) consists of an interview with a relative or caregiver of the deceased. The VA includes both Closed Questions (CQs) with structured answer options and an Open Response (OR) consisting of a free narrative of the events, expressed in natural language and without any pre-determined structure. A number of automated systems analyze the CQs to obtain cause-specific mortality fractions, but with limited performance. We hypothesize that incorporating the text provided by the OR conveys relevant information to discern the Cause of Death (CoD). The experimental layout compares existing Computer Coding Verbal Autopsy methods such as Tariff 2.0 with other approaches well suited to processing structured inputs such as the CQs. Next, alternative approaches based on language models are employed to analyze the OR. Finally, we propose a new method with a bi-modal input that combines the CQs and the OR. Empirical results corroborate that our method outperforms the CoD prediction capability of the Tariff 2.0 algorithm by taking into account the valuable information conveyed by the OR. As an added value, we make the software available to enable reproducibility of the results, including an R implementation that makes the comparison with Tariff 2.0 straightforward.
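A minimal sketch of a bi-modal input of this kind: the structured CQ answers are concatenated with a TF-IDF encoding of the OR and fed to a plain logistic-regression classifier. The toy records, labels and model choice are illustrative assumptions, not the method proposed in the paper.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy data: each record = binary closed-question answers + free-text narrative.
cq = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]])
narratives = [
    "sudden chest pain and shortness of breath",
    "prolonged fever and cough before death",
    "chest pain radiating to the left arm",
    "high fever, cough and difficulty breathing",
]
cod = ["cardiac", "respiratory", "cardiac", "respiratory"]  # cause of death

# Bi-modal input: concatenate the structured CQ vector with a TF-IDF
# encoding of the open response.
tfidf = TfidfVectorizer()
X_text = tfidf.fit_transform(narratives)
X = hstack([csr_matrix(cq), X_text])

clf = LogisticRegression(max_iter=1000).fit(X, cod)
print(clf.predict(X))
```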
Subjects
Algorithms; Humans; Autopsy; Cause of Death; Reproducibility of Results
ABSTRACT
Electronic Health Records (EHRs) convey valuable information. Experts in clinical documentation read the report, understand the prior work, procedures and tests carried out, and encode the EHRs according to the International Classification of Diseases (ICD). Assigning these codes to the EHRs helps to share information and extract statistics. In this paper, we explore computer-aided multi-label classification approaches. While Natural Language Understanding has evolved for clinical text mining, there is still a gap for languages other than English. Language-modeling-aware Transformers have demonstrated state-of-the-art results by exploiting contextual dependencies. Here we focus on EHRs written in Spanish and try to benefit from the language model itself, adapting it with an unannotated corpus that is smaller but in-house, in-domain and closely related to the EHRs of the downstream task. The International Classification of Diseases coding scheme is hierarchical, but its synergies among hierarchical levels are rarely exploited. In this work, we implement and release a hierarchical head for multi-label classification, which benefits from the hierarchy of the ICD via multi-task classification.
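The following PyTorch sketch shows one way a hierarchical multi-task head can be wired: a shared document representation feeds one sigmoid layer per ICD hierarchy level and the losses are summed. The layer names, label counts and loss weighting are illustrative assumptions, not the released implementation.

```python
import torch
import torch.nn as nn


class HierarchicalICDHead(nn.Module):
    """Multi-task multi-label head: one sigmoid layer per ICD hierarchy level,
    trained jointly on a shared document representation."""

    def __init__(self, hidden_size, n_chapters, n_full_codes):
        super().__init__()
        self.chapter_head = nn.Linear(hidden_size, n_chapters)
        self.code_head = nn.Linear(hidden_size, n_full_codes)

    def forward(self, doc_repr):
        return self.chapter_head(doc_repr), self.code_head(doc_repr)


# Illustrative sizes: 22 coarse chapters, 1000 fine-grained codes.
head = HierarchicalICDHead(hidden_size=768, n_chapters=22, n_full_codes=1000)
doc_repr = torch.randn(4, 768)          # e.g. the [CLS] vector of a Transformer
chapter_logits, code_logits = head(doc_repr)

# Joint loss: the coarse task regularises the fine-grained one.
bce = nn.BCEWithLogitsLoss()
chapter_gold = torch.randint(0, 2, (4, 22)).float()
code_gold = torch.randint(0, 2, (4, 1000)).float()
loss = bce(chapter_logits, chapter_gold) + bce(code_logits, code_gold)
loss.backward()
```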
Subjects
Electronic Health Records; Language; Data Mining; Humans; International Classification of Diseases; Natural Language Processing
ABSTRACT
BACKGROUND: This work deals with Natural Language Processing applied to Electronic Health Records (EHRs). EHRs are coded following the International Classification of Diseases (ICD), leading to a multi-label classification problem. Previously proposed approaches act as black boxes without giving further insights. Explainable Artificial Intelligence (XAI) helps to clarify what brought the model to make its predictions. GOAL: This work aims to obtain explainable predictions of the diseases and procedures contained in EHRs. As an application, we show visualizations of the stored attention and propose a prototype of a Decision Support System (DSS) that highlights the text that motivated the choice of each proposed ICD code. METHODS: Convolutional Neural Networks (CNNs) with attention mechanisms were used. Attention mechanisms allow detecting which parts of the input (EHRs) motivate the output (medical codes), producing explainable predictions. RESULTS: We successfully applied these methods to a Spanish corpus, obtaining promising results. Finally, we presented the idea of extracting the chronological order of the ICD codes in a given EHR by anchoring the codes to different stages of the clinical admission. CONCLUSIONS: We found that explainable deep learning models applied to predict medical codes store helpful information that could be used to assist medical experts while reaching a solid performance. In particular, we show that the information stored in the attention mechanisms enables a DSS and a shallow chronology of diagnoses.
Subjects
Electronic Health Records; International Classification of Diseases; Artificial Intelligence; Natural Language Processing; Neural Networks, Computer
ABSTRACT
The international standard to ascertain the cause of death is medical certification. However, in many low- and middle-income countries, the majority of deaths occur outside health facilities. In these cases, the Verbal Autopsy (VA), the narrative provided by a family member or friend together with a questionnaire designed by the World Health Organization, is the main information source. Until now, technology has allowed us to automatically analyze only the responses to the VA questionnaire, with the narrative captured by the interviewer excluded. Our work addresses this gap by developing a set of models for automatic Cause of Death (CoD) ascertainment in VAs with a focus on the textual information. Empirical results show that the open response conveys valuable information towards the ascertainment of the Cause of Death, and that the combination of the closed-ended questions and the open response leads to the best results. Model interpretation capabilities position the Deep Learning models as the most encouraging choice.
Subjects
Deep Learning; Autopsy; Cause of Death; Humans; Surveys and Questionnaires
ABSTRACT
BACKGROUND AND OBJECTIVE: This work deals with clinical text mining, a field of Natural Language Processing applied to biomedical informatics. The aim is to classify Electronic Health Records with respect to the International Classification of Diseases, which is the foundation for the identification of international health statistics, and the standard for reporting diseases and health conditions. Within the framework of data mining, the goal is multi-label classification, as each health record has multiple International Classification of Diseases codes assigned. We investigate five Deep Learning architectures with a dataset obtained from the Basque Country Health System, and six different perspectives derived from shifts in the input and the output. METHODS: We evaluate a Feed-Forward Neural Network as the baseline and several recurrent models based on the Bidirectional GRU architecture, putting our research focus on the text representation layer and testing three variants, from standard word embeddings to meta word embedding techniques and contextual embeddings. RESULTS: The results showed that the recurrent models outperform the non-recurrent model. The meta word embedding techniques are capable of beating the standard word embeddings, but the contextual embeddings prove the most robust for the downstream task overall. Additionally, the label granularity alone has an impact on the classification performance. CONCLUSIONS: The contributions of this work are a) a comparison among five classification approaches based on Deep Learning on a Spanish dataset to cope with the multi-label health text classification problem; b) the study of the impact of document length and label-set size and granularity in the multi-label context; and c) the study of measures to mitigate multi-label text classification problems related to label-set size and sparseness.
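As a reference point for the recurrent models discussed, here is a minimal Bidirectional GRU multi-label classifier in PyTorch. In the actual study the embedding layer would be initialized with standard, meta or contextual embeddings; all sizes here are illustrative.

```python
import torch
import torch.nn as nn


class BiGRUClassifier(nn.Module):
    """Bidirectional GRU document encoder with a sigmoid multi-label output."""

    def __init__(self, vocab_size, num_labels, emb_dim=200, hidden=256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.gru = nn.GRU(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.classifier = nn.Linear(2 * hidden, num_labels)

    def forward(self, token_ids):
        emb = self.embedding(token_ids)            # (batch, seq, emb_dim)
        states, _ = self.gru(emb)                  # (batch, seq, 2*hidden)
        doc_repr = states.max(dim=1).values        # max-pool over time
        return self.classifier(doc_repr)           # multi-label logits


model = BiGRUClassifier(vocab_size=30000, num_labels=500)
logits = model(torch.randint(1, 30000, (8, 400)))
probs = torch.sigmoid(logits)                      # one probability per ICD code
```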
Subjects
Deep Learning; Electronic Health Records/classification; Medical Informatics; Pattern Recognition, Automated; Algorithms; Computer Graphics; Data Mining; Humans; International Classification of Diseases; Natural Language Processing; Neural Networks, Computer; Software; Spain
ABSTRACT
This work focuses on adverse drug reaction extraction, tackling the class imbalance problem. Adverse drug reactions are infrequent events in electronic health records; nevertheless, documenting them is compulsory. Text mining techniques can help to retrieve this kind of valuable information from text. The class imbalance was tackled using different sampling methods, cost-sensitive learning, ensemble learning and one-class classification, with Random Forest as the classifier. The adverse drug reaction extraction model was inferred from a dataset comprising real electronic health records with an imbalance ratio of 1:222; that is, for each drug-disease pair that is an adverse drug reaction, there are approximately 222 that are not. Applying a sampling technique before using cost-sensitive learning offered the best result. On the test set, the F-measure was 0.121 for the minority class and 0.996 for the majority class.
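A small sketch of the sampling-plus-cost-sensitive combination on synthetic data with a 1:222-like imbalance, using scikit-learn's Random Forest. The undersampling ratio and features are illustrative, not the paper's setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Toy imbalanced data: roughly 1 positive (ADR) for every 222 negatives.
X = rng.normal(size=(4460, 20))
y = np.zeros(4460, dtype=int)
y[:20] = 1

# 1) Random undersampling of the majority class before training.
pos = np.where(y == 1)[0]
neg = rng.choice(np.where(y == 0)[0], size=10 * len(pos), replace=False)
keep = np.concatenate([pos, neg])

# 2) Cost-sensitive learning: class_weight penalises minority-class errors more.
clf = RandomForestClassifier(n_estimators=200, class_weight="balanced",
                             random_state=0)
clf.fit(X[keep], y[keep])
print(clf.predict_proba(X[:5])[:, 1])
```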
Subjects
Adverse Drug Reaction Reporting Systems/standards; Data Mining/methods; Drug-Related Side Effects and Adverse Reactions/diagnosis; Electronic Health Records/statistics & numerical data; Adverse Drug Reaction Reporting Systems/statistics & numerical data; Bayes Theorem; Data Mining/standards; Data Mining/statistics & numerical data; Drug-Related Side Effects and Adverse Reactions/epidemiology; Humans; Logistic Models; Machine Learning/standards; Machine Learning/statistics & numerical data; Spain
ABSTRACT
This work focuses on the detection of adverse drug reactions (ADRs) in electronic health records (EHRs) written in Spanish. The World Health Organization underlines the importance of reporting ADRs for patients' safety; the fact is that ADRs tend to be under-reported in daily hospital practice. In this context, automatic solutions based on text mining can help to alleviate the workload of experts. Nevertheless, these solutions pose two challenges: 1) EHRs show high lexical variability, so the characterization of the events must be able to deal with unseen words or contexts; and 2) ADRs are rare events, hence the system should be robust against a skewed class distribution. To tackle these challenges, deep neural networks seem appropriate because they allow a high-level representation. Specifically, we opted for a joint AB-LSTM network, a sub-class of the bidirectional long short-term memory network. Besides, in an attempt to reinforce robustness against lexical variability, we proposed the use of embeddings created from lemmas. We compared this approach with supervised event extraction approaches based on either symbolic or dense representations. Experimental results showed that the joint AB-LSTM approach outperformed previous approaches, achieving an F-measure of 73.3.
Subjects
Data Mining/methods; Drug-Related Side Effects and Adverse Reactions/classification; Electronic Health Records; Algorithms; Deep Learning; Humans; Medical Informatics/methods
ABSTRACT
BACKGROUND AND OBJECTIVE: This work aims at extracting Adverse Drug Reactions (ADRs), i.e. harm directly caused by a drug at normal doses, from Electronic Health Records (EHRs). The lack of readily available EHRs due to confidentiality issues and their lexical variability make ADR extraction challenging. Furthermore, ADRs are rare events; therefore, representations that are efficient against data sparsity are needed. METHODS: Embedding-based characterizations are able to group semantically related words; however, dense spaces still suffer from data sparsity. We employed context-aware continuous representations to enhance the modelling of infrequent events through their context, and we turned to simple smoothing techniques (e.g. direction cosines, truncation, Principal Component Analysis (PCA) and clustering) to increase the proximity between similar words in an attempt to cope with data sparsity. RESULTS: An F-measure of 0.639 was achieved for the ADR classification, an improvement of approximately 0.300 over the results obtained with a word-based characterization. CONCLUSION: The embedding-based representation together with the smoothing techniques increased the robustness of the ADR characterization. It proved particularly appropriate for coping with lexical variability and data sparsity.
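The smoothing operations mentioned can be sketched in a few lines with NumPy and scikit-learn: length normalization (direction cosines), PCA truncation and k-means clustering of an embedding matrix. The matrix and all dimensions are synthetic placeholders.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import normalize

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 300))      # stand-in for word2vec vectors

# Direction cosines: length-normalise so similarity depends on direction only.
unit = normalize(embeddings)

# Truncation / PCA: keep the leading components to reduce noise.
reduced = PCA(n_components=50, random_state=0).fit_transform(unit)

# Clustering: replace each word by its cluster centroid, pulling rare words
# towards frequent, semantically close neighbours.
kmeans = KMeans(n_clusters=100, n_init=10, random_state=0).fit(reduced)
smoothed = kmeans.cluster_centers_[kmeans.labels_]
print(smoothed.shape)                          # (1000, 50)
```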
Assuntos
Sistemas de Notificação de Reações Adversas a Medicamentos/estatística & dados numéricos , Efeitos Colaterais e Reações Adversas Relacionados a Medicamentos/diagnóstico , Registros Eletrônicos de Saúde/estatística & dados numéricos , Processamento de Linguagem Natural , Semântica , Efeitos Colaterais e Reações Adversas Relacionados a Medicamentos/prevenção & controle , HumanosRESUMO
BACKGROUND: This work deals with Natural Language Processing applied to the clinical domain; specifically, with Medical Entity Recognition (MER) on Electronic Health Records (EHRs). Developing a MER system entailed heavy data preprocessing and feature engineering until Deep Neural Networks (DNNs) emerged. However, the quality of the word representations in the embedding layers is still an important issue for the inference of the DNNs. GOAL: The main goal of this work is to develop a robust MER system, adapting general-purpose DNNs to cope with the high lexical variability shown in EHRs. In addition, given that EHRs tend to be scarce while out-of-domain corpora are available, the aim is to assess the impact of the word representations on the performance of the MER as we move to other domains. In this line, exhaustive experimentation varying the representation generation methods and network parameters is crucial. METHODS: We adapted a general-purpose sequential tagger based on Bidirectional Long Short-Term Memory cells and Conditional Random Fields (CRFs) in order to make it tolerant to high lexical variability and a limited amount of corpora. To this end, we incorporated part-of-speech (POS) and semantic-tag embedding layers into the word representations. RESULTS: One of the strengths of this work is the exhaustive evaluation of dense word representations obtained by varying not only the domain and genre but also the learning algorithms and their parameter settings. With the proposed method, we attained an error reduction of 1.71 (5.7%) compared to the state of the art, even though no preprocessing or feature engineering was used. CONCLUSIONS: Our results indicate that dense representations built taking word order into account improve the entity extraction system. Besides, we found that using a medical corpus (not necessarily EHRs) to infer the representations improves the performance, even if it does not correspond to the same genre.
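A minimal PyTorch sketch of the enriched input representation: word, POS and semantic-tag embeddings are concatenated per token before the BiLSTM. Vocabulary sizes and dimensions are illustrative, and the CRF output layer is omitted.

```python
import torch
import torch.nn as nn


class MultiChannelEmbedder(nn.Module):
    """Concatenates word, POS and semantic-tag embeddings for each token,
    feeding a richer representation to the downstream BiLSTM-CRF tagger."""

    def __init__(self, n_words, n_pos, n_sem, w_dim=200, pos_dim=25, sem_dim=25):
        super().__init__()
        self.word = nn.Embedding(n_words, w_dim, padding_idx=0)
        self.pos = nn.Embedding(n_pos, pos_dim, padding_idx=0)
        self.sem = nn.Embedding(n_sem, sem_dim, padding_idx=0)

    def forward(self, word_ids, pos_ids, sem_ids):
        return torch.cat([self.word(word_ids),
                          self.pos(pos_ids),
                          self.sem(sem_ids)], dim=-1)


embedder = MultiChannelEmbedder(n_words=50000, n_pos=18, n_sem=40)
x = embedder(torch.randint(1, 50000, (2, 30)),
             torch.randint(1, 18, (2, 30)),
             torch.randint(1, 40, (2, 30)))
lstm = nn.LSTM(250, 128, bidirectional=True, batch_first=True)
states, _ = lstm(x)                 # (2, 30, 256), input to the CRF layer
print(states.shape)
```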
Subjects
Natural Language Processing; Algorithms; Electronic Health Records; Neural Networks, Computer; Semantics; Subject Headings
ABSTRACT
This work focuses on data mining applied to the clinical documentation domain. Diagnostic terms (DTs) are used as keywords to retrieve valuable information from electronic health records; indeed, they are encoded manually by experts following the International Classification of Diseases (ICD). The goal of this work is to explore how text mining can aid DT encoding. From the machine learning (ML) perspective, this is a high-dimensional classification task, as it comprises thousands of codes. This work delves into a robust representation of the instances to improve ML results. The proposed system is able to find the right ICD code among more than 1500 possible ICD codes with 92% precision for the main disease (primary class) and 88% for the main disease together with the non-essential modifiers (fully specified class). The methodology employed is simple and portable. According to experts from public hospitals, the system is very useful, in particular for documentation and pharmacosurveillance services. In fact, they reported an accuracy of 91.2% on a small, randomly extracted test set. Hence, together with this paper, we made the software publicly available in order to help the clinical and research community.
Subjects
Documentation/methods; Electronic Health Records; International Classification of Diseases; Machine Learning; Data Mining/methods; Humans; Natural Language Processing
ABSTRACT
BACKGROUND AND OBJECTIVES: Electronic health records (EHRs) convey vast and valuable knowledge about dynamically changing clinical practices. Indeed, clinical documentation entails the inspection of a massive number of records across hospitals and hospital sections. The goal of this study is to provide an efficient framework that will help clinicians explore EHRs and attain alternative views related to both patient segments and diseases, such as clustering and statistical information about the development of heart diseases (replacement of pacemakers, valve implantation, etc.) in co-occurrence with other diseases. The task is challenging, dealing with lengthy health records and a high number of classes in a multi-label setting. METHODS: LDA is a statistical procedure that explains each document by multinomial distributions over its latent topics, and each topic by a distribution over related words. These distributions allow collections of texts to be represented in a continuous space, enabling distance-based associations between documents and also revealing the underlying topics. The topic models were assessed by means of four divergence metrics. In addition, we applied LDA to the task of multi-label document classification of EHRs according to the International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10). The EHRs had 7 codes assigned on average, out of 970 different codes corresponding to cardiology. RESULTS: First, the discriminative ability of the topic models was assessed using dissimilarity metrics. Nevertheless, there was an open question regarding the interpretability of the automatically discovered topics. To address this issue, we explored the connection between the latent topics and ICD-10. EHRs were represented by means of LDA and, next, supervised classifiers were inferred from those representations. Given the low-dimensional representation provided by LDA, the search was computationally efficient compared to symbolic approaches such as TF-IDF. The classifiers achieved an average AUC of 77.79. As a side contribution, with this work we released the software, implemented in Python and R, to both train and evaluate the models. CONCLUSIONS: Topic modeling offers a means of representing EHRs in a low-dimensional continuous space. This representation conveys relevant information as hidden topics in a comprehensive manner. Moreover, in practice, this compact representation allowed the ICD-10 codes associated with EHRs to be extracted.
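The LDA-as-representation pipeline can be sketched with scikit-learn: topic proportions become the document features of a supervised multi-label classifier. The toy cardiology snippets, label matrix and classifier choice are illustrative, not the study's Python/R implementation.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

docs = [
    "pacemaker replacement atrial fibrillation",
    "aortic valve implantation heart failure",
    "atrial fibrillation anticoagulation follow up",
    "heart failure valve regurgitation pacemaker",
]
# Toy multi-label targets: columns stand for ICD-10 codes.
Y = np.array([[1, 0, 1], [0, 1, 1], [1, 0, 0], [0, 1, 1]])

# Low-dimensional continuous representation: topic proportions per document.
counts = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=3, random_state=0)
theta = lda.fit_transform(counts)                  # (n_docs, n_topics)

# Supervised multi-label classifier on top of the topic representation.
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(theta, Y)
print(clf.predict(theta))
```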
Subjects
Cardiology/statistics & numerical data; Electronic Health Records/classification; Cardiology/trends; Data Mining; Electronic Health Records/statistics & numerical data; Humans; International Classification of Diseases; Models, Statistical
ABSTRACT
BACKGROUND: Electronic Health Records (EHRs) are written in spontaneous natural language. Often, terms do not match the standard terminology available through the International Classification of Diseases (ICD). OBJECTIVE: Information retrieval and exchange can be improved by using standard terminology. Our aim is to render diagnostic terms written in spontaneous language in EHRs into the standard framework provided by the ICD. METHODS: We tackle diagnostic term normalization employing Weighted Finite-State Transducers (WFSTs). These machines learn how to translate sequences, in our case spontaneous representations into standard representations, given a set of samples. They are highly flexible and easily adaptable to the terminological singularities of each hospital and practitioner. Besides, we implemented a similarity metric to enhance spontaneous-standard term matching. RESULTS: Of the 2850 randomly selected spontaneous DTs, we found that only 7.71% were written in their standard form matching the ICD. The WFST-based system matched spontaneous terms to ICD codes with a Mean Reciprocal Rank of 0.68, which means that, on average, the right ICD code is found between the first and second positions of the ranked set of normalized candidates. This guarantees efficient document exchange and, furthermore, information retrieval. CONCLUSION: Medical term normalization was achieved with high performance. We found that direct matching of spontaneous terms against standard lexicons leads to unsatisfactory results, while normalized hypothesis generation by means of WFSTs helped to overcome the gap between spontaneous and standard language.
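A simplified, runnable stand-in for the normalization pipeline: candidate standard terms are ranked by a character-level similarity (here difflib, rather than a WFST) and evaluated with Mean Reciprocal Rank. The tiny lexicon and queries are invented for the example.

```python
from difflib import SequenceMatcher

# Toy ICD lexicon: code -> standard Spanish description.
icd_lexicon = {
    "K29.5": "gastritis cronica, no especificada",
    "K21.0": "enfermedad por reflujo gastroesofagico con esofagitis",
    "K25.9": "ulcera gastrica, no especificada",
}


def rank_candidates(spontaneous_term):
    """Rank standard ICD descriptions by character similarity to the
    spontaneous diagnostic term (stand-in for the WFST + similarity metric)."""
    sims = {code: SequenceMatcher(None, spontaneous_term.lower(), desc).ratio()
            for code, desc in icd_lexicon.items()}
    return sorted(sims, key=sims.get, reverse=True)


def mean_reciprocal_rank(queries):
    """queries: list of (spontaneous term, gold ICD code) pairs."""
    rr = [1.0 / (rank_candidates(term).index(gold) + 1) for term, gold in queries]
    return sum(rr) / len(rr)


queries = [("gastritis cron.", "K29.5"), ("RGE con esofagitis", "K21.0")]
print(mean_reciprocal_rank(queries))
```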