Results 1 - 15 of 15
1.
J Biomed Inform ; 62: 148-58, 2016 08.
Article in English | MEDLINE | ID: mdl-27363901

ABSTRACT

OBJECTIVE: The abundance of text available in social media and health related forums along with the rich expression of public opinion have recently attracted the interest of the public health community to use these sources for pharmacovigilance. Based on the intuition that patients post about Adverse Drug Reactions (ADRs) expressing negative sentiments, we investigate the effect of sentiment analysis features in locating ADR mentions. METHODS: We enrich the feature space of a state-of-the-art ADR identification method with sentiment analysis features. Using a corpus of posts from the DailyStrength forum and tweets annotated for ADR and indication mentions, we evaluate the extent to which sentiment analysis features help in locating ADR mentions and distinguishing them from indication mentions. RESULTS: Evaluation results show that sentiment analysis features marginally improve ADR identification in tweets and health related forum posts. Adding sentiment analysis features achieved a statistically significant F-measure increase from 72.14% to 73.22% in the Twitter part of an existing corpus using its original train/test split. Using stratified 10×10-fold cross-validation, statistically significant F-measure increases were shown in the DailyStrength part of the corpus, from 79.57% to 80.14%, and in the Twitter part of the corpus, from 66.91% to 69.16%. Moreover, sentiment analysis features are shown to reduce the number of ADRs being recognized as indications. CONCLUSION: This study shows that adding sentiment analysis features can marginally improve the performance of even a state-of-the-art ADR identification method. This improvement can be of use to pharmacovigilance practice, due to the rapidly increasing popularity of social media and health forums.


Subjects
Drug-Related Side Effects and Adverse Reactions, Pharmacovigilance, Social Media, Humans, Internet, Public Health
2.
J Biomed Inform ; 54: 202-12, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25720841

ABSTRACT

OBJECTIVE: Automatic monitoring of Adverse Drug Reactions (ADRs), defined as adverse patient outcomes caused by medications, is a challenging research problem that is currently receiving significant attention from the medical informatics community. In recent years, user-posted data on social media, primarily due to its sheer volume, has become a useful resource for ADR monitoring. Research using social media data has progressed using various data sources and techniques, making it difficult to compare distinct systems and their performances. In this paper, we perform a methodical review to characterize the different approaches to ADR detection/extraction from social media, and their applicability to pharmacovigilance. In addition, we present a potential systematic pathway to ADR monitoring from social media. METHODS: We identified studies describing approaches for ADR detection from social media from the Medline, Embase, Scopus and Web of Science databases, and the Google Scholar search engine. Studies that met our inclusion criteria were those that attempted to extract ADR information posted by users on any publicly available social media platform. We categorized the studies according to different characteristics such as primary ADR detection approach, size of corpus, data source(s), availability, and evaluation criteria. RESULTS: Twenty-two studies met our inclusion criteria, with fifteen (68%) published within the last two years. However, publicly available annotated data is still scarce, and we found only six studies that made the annotations used publicly available, making system performance comparisons difficult. In terms of algorithms, supervised classification techniques to detect posts containing ADR mentions, and lexicon-based approaches for extraction of ADR mentions from texts have been the most popular. CONCLUSION: Our review suggests that interest in the utilization of the vast amounts of available social media data for ADR monitoring is increasing. 
In terms of sources, both health-related and general social media data have been used for ADR detection: while health-related sources tend to contain higher proportions of relevant data, the volume of data from general social media websites is significantly higher. There is still a very limited amount of annotated data publicly available, and, as indicated by the promising results obtained by recent supervised learning approaches, there is a strong need to make such data available to the research community.


Subjects
Adverse Drug Reaction Reporting Systems, Pharmacovigilance, Social Media, Humans, Internet, MEDLINE, Public Health
3.
J Biomed Inform ; 46 Suppl: S40-S47, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24212118

ABSTRACT

Clinical records include both coded and free-text fields that interact to reflect complicated patient stories. The information often covers not only the present medical condition and events experienced by the patient, but also refers to relevant events in the past (such as signs, symptoms, tests or treatments). In order to automatically construct a timeline of these events, we first need to extract the temporal relations between pairs of events or time expressions presented in the clinical notes. We designed separate extraction components for different types of temporal relations, utilizing a novel hybrid system that combines machine learning with a graph-based inference mechanism to extract the temporal links. The temporal graph is a directed graph based on parse tree dependencies of the simplified sentences and frequent pattern clues. We generalized the sentences in order to discover patterns that, given the complexities of natural language, might not be directly discoverable in the original sentences. The proposed hybrid system performance reached an F-measure of 0.63, with precision at 0.76 and recall at 0.54 on the 2012 i2b2 Natural Language Processing corpus for the temporal relation (TLink) extraction task, achieving the highest precision and third highest f-measure among participating teams in the TLink track.


Subjects
Data Mining/methods, Electronic Health Records, Medical Informatics/methods, Natural Language Processing, Humans, Support Vector Machine, Time Factors
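
Several abstracts in this listing report precision, recall, and F-measure together. As a minimal sketch of how those figures relate, the snippet below computes the balanced F-measure (harmonic mean of precision and recall) and checks it against the values reported in the abstract above (precision 0.76, recall 0.54). The function name `f_measure` and the `beta` parameter are illustrative assumptions, not taken from any of the cited papers, and the abstracts do not state which averaging scheme each study used.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta=1 gives the balanced F-measure (harmonic mean)."""
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported for the TLink extraction system above.
tlink_f = f_measure(0.76, 0.54)
print(round(tlink_f, 2))  # 0.63
```

The same function can be applied to the other precision/recall pairs in this listing, keeping in mind that rounding in the reported inputs may shift the last digit.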
4.
JAMIA Open ; 2(3): 301-305, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31709388

ABSTRACT

OBJECTIVES: To investigate using patient posts in social media as a resource to profile off-label prescriptions of cancer drugs. METHODS: We analyzed patient posts from the Inspire health forums (www.inspire.com) and extracted mentions of cancer drugs from the 14 most active cancer-type specific support groups. To quantify drug-disease associations, we calculated information component scores from the frequency of posts in each cancer-specific group with mentions of a given drug. We evaluated the results against three sources: manual review, Wolters-Kluwer Medi-span, and Truven MarketScan insurance claims. RESULTS: We identified 279 frequently discussed and therefore highly associated drug-disease pairs from Inspire posts. Of these, 96 are FDA approved, 9 are known off-label uses, and 174 do not have records of known usage (potentially novel off-label uses). We achieved a mean average precision of 74.9% in identifying drug-disease pairs with a true indication association from patient posts and found consistent evidence in medical claims records. We achieved a recall of 69.2% in identifying known off-label drug uses (based on Wolters-Kluwer Medi-span) from patient posts.
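
The abstract above mentions information component scores computed from post frequencies but does not give the formula. The sketch below assumes the shrinkage form commonly used in disproportionality analysis, IC = log2((observed + 0.5) / (expected + 0.5)), where the expected count assumes the drug and the cancer group are independent. The function name, argument names, and the counts in the example are hypothetical; the study may have used a different variant.

```python
import math


def information_component(n_drug_group: int, n_drug: int,
                          n_group: int, n_total: int) -> float:
    """Shrunk log2 ratio of observed to expected co-occurrence counts.

    n_drug_group: posts in the group that mention the drug (observed)
    n_drug:       posts mentioning the drug in any group
    n_group:      posts in the group
    n_total:      all posts
    """
    # Expected count if drug mentions were independent of the group.
    expected = n_drug * n_group / n_total
    # The +0.5 terms shrink the score toward zero for small counts
    # (an assumption about the variant used, not stated in the abstract).
    return math.log2((n_drug_group + 0.5) / (expected + 0.5))


# Hypothetical counts, for illustration only.
score = information_component(n_drug_group=50, n_drug=100,
                              n_group=1000, n_total=10000)
print(score > 0)  # True: the drug is mentioned more often than expected
```

A positive score indicates the drug is discussed in that group more often than independence would predict; a score near zero indicates no association.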

5.
JMIR Public Health Surveill ; 5(2): e11264, 2019 Jun 03.
Article in English | MEDLINE | ID: mdl-31162134

ABSTRACT

BACKGROUND: Adverse drug reactions (ADRs) occur in nearly all patients on chemotherapy, causing morbidity and therapy disruptions. Detection of such ADRs is limited in clinical trials, which are underpowered to detect rare events. Early recognition of ADRs in the postmarketing phase could substantially reduce morbidity and decrease societal costs. Internet community health forums provide a mechanism for individuals to discuss real-time health concerns and can enable computational detection of ADRs. OBJECTIVE: The goal of this study is to identify cutaneous ADR signals in social health networks and compare the frequency and timing of these ADRs to clinical reports in the literature. METHODS: We present a natural language processing-based, ADR signal-generation pipeline based on patient posts on Internet social health networks. We identified user posts from the Inspire health forums related to two chemotherapy classes: erlotinib, an epidermal growth factor receptor inhibitor, and nivolumab and pembrolizumab, immune checkpoint inhibitors. We extracted mentions of ADRs from unstructured content of patient posts. We then performed population-level association analyses and time-to-detection analyses. RESULTS: Our system detected cutaneous ADRs from patient reports with high precision (0.90) and at frequencies comparable to those documented in the literature but an average of 7 months ahead of their literature reporting. Known ADRs were associated with higher proportional reporting ratios compared to negative controls, demonstrating the robustness of our analyses. Our named entity recognition system achieved a 0.738 microaveraged F-measure in detecting ADR entities, not limited to cutaneous ADRs, in health forum posts. Additionally, we discovered the novel ADR of hypohidrosis reported by 23 patients in erlotinib-related posts; this ADR was absent from 15 years of literature on this medication and we recently reported the finding in a clinical oncology journal. 
CONCLUSIONS: Several hundred million patients report health concerns in social health networks, yet this information is markedly underutilized for pharmacosurveillance. We demonstrated the ability of a natural language processing-based signal-generation pipeline to accurately detect patient reports of ADRs months in advance of literature reporting and the robustness of statistical analyses to validate system detections. Our findings suggest that social health network data can play an important role in more comprehensive and timely pharmacovigilance.
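
The abstract above compares proportional reporting ratios (PRRs) for known ADRs and negative controls without spelling out the computation. The sketch below uses the standard 2x2 definition of the PRR; the function and argument names and the example counts are hypothetical, and how the study actually built its contingency tables from forum posts is not stated in the abstract.

```python
def proportional_reporting_ratio(a: int, b: int, c: int, d: int) -> float:
    """PRR from a 2x2 table of report counts.

    a: reports for the drug of interest that mention the reaction
    b: reports for the drug of interest that do not
    c: reports for all other drugs that mention the reaction
    d: reports for all other drugs that do not
    """
    if a + b == 0 or c + d == 0 or c == 0:
        raise ValueError("PRR is undefined for empty rows or c == 0")
    rate_drug = a / (a + b)    # reaction rate among the drug's reports
    rate_other = c / (c + d)   # reaction rate among all other reports
    return rate_drug / rate_other


# Hypothetical counts, for illustration only.
print(proportional_reporting_ratio(20, 80, 10, 190))
```

A PRR well above 1 suggests the reaction is reported disproportionately often for the drug, which is the sense in which known ADRs would be expected to score higher than negative controls.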

6.
AMIA Annu Symp Proc ; 2017: 679-688, 2017.
Article in English | MEDLINE | ID: mdl-29854133

ABSTRACT

Social networks, such as Twitter, have become important sources for active monitoring of user-reported adverse drug reactions (ADRs). Automatic extraction of ADR information can be crucial for healthcare providers, drug manufacturers, and consumers. However, because of the non-standard nature of social media language, automatically extracted ADR mentions need to be mapped to standard forms before they can be used by operational pharmacovigilance systems. We propose a modular natural language processing pipeline for mapping (normalizing) colloquial mentions of ADRs to their corresponding standardized identifiers. We seek to accomplish this task and enable customization of the pipeline so that distinct unlabeled free text resources can be incorporated to use the system for other normalization tasks. Our approach, which we call Hybrid Semantic Analysis (HSA), sequentially employs rule-based and semantic matching algorithms for mapping user-generated mentions to concept IDs in the Unified Medical Language System vocabulary. The semantic matching component of HSA is adaptive in nature and uses a regression model to combine various measures of semantic relatedness and resources to optimize normalization performance on the selected data source. On a publicly available corpus, our normalization method achieves 0.502 recall and 0.823 precision (F-measure: 0.624). Our proposed method outperforms a baseline based on latent semantic analysis and another that uses MetaMap.


Subjects
Drug-Related Side Effects and Adverse Reactions, Natural Language Processing, Pharmacovigilance, Social Media, Terminology as Topic, Algorithms, Crowdsourcing, Humans, Information Storage and Retrieval, Semantics, Software, Unified Medical Language System
7.
Pac Symp Biocomput ; 21: 581-92, 2016.
Article in English | MEDLINE | ID: mdl-26776221

ABSTRACT

Social media has evolved into a crucial resource for obtaining large volumes of real-time information. The promise of social media has been realized by the public health domain, and recent research has addressed some important challenges in that domain by utilizing social media data. Tasks such as monitoring flu trends, viral disease outbreaks, medication abuse, and adverse drug reactions are some examples of studies where data from social media have been exploited. The focus of this workshop is to explore solutions to three important natural language processing challenges for domain-specific social media text: (i) text classification, (ii) information extraction, and (iii) concept normalization. To explore different approaches to solving these problems on social media data, we designed a shared task which was open to participants globally. We designed three tasks using our in-house annotated Twitter data on adverse drug reactions. Task 1 involved automatic classification of adverse drug reaction assertive user posts; Task 2 focused on extracting specific adverse drug reaction mentions from user posts; and Task 3, which was slightly ill-defined due to the complex nature of the problem, involved normalizing user mentions of adverse drug reactions to standardized concept IDs. A total of 11 teams participated, and a total of 24 (18 for Task 1, and 6 for Task 2) system runs were submitted. Following the evaluation of the systems, and an assessment of their innovation/novelty, we accepted 7 descriptive manuscripts for publication--5 for Task 1 and 2 for Task 2. We provide descriptions of the tasks, data, and participating systems in this paper.


Subjects
Data Mining/methods, Social Media/statistics & numerical data, Adverse Drug Reaction Reporting Systems/statistics & numerical data, Computational Biology/methods, Computational Biology/statistics & numerical data, Data Mining/statistics & numerical data, Drug-Related Side Effects and Adverse Reactions/classification, Humans, Natural Language Processing, Pharmacovigilance, Supervised Machine Learning, Support Vector Machine
8.
J Am Med Inform Assoc ; 22(3): 671-81, 2015 May.
Article in English | MEDLINE | ID: mdl-25755127

ABSTRACT

OBJECTIVE: Social media is becoming increasingly popular as a platform for sharing personal health-related information. This information can be utilized for public health monitoring tasks, particularly for pharmacovigilance, via the use of natural language processing (NLP) techniques. However, the language in social media is highly informal, and user-expressed medical concepts are often nontechnical, descriptive, and challenging to extract. There has been limited progress in addressing these challenges, and thus far, advanced machine learning-based NLP techniques have been underutilized. Our objective is to design a machine learning-based approach to extract mentions of adverse drug reactions (ADRs) from highly informal text in social media. METHODS: We introduce ADRMine, a machine learning-based concept extraction system that uses conditional random fields (CRFs). ADRMine utilizes a variety of features, including a novel feature for modeling words' semantic similarities. The similarities are modeled by clustering words based on unsupervised, pretrained word representation vectors (embeddings) generated from unlabeled user posts in social media using a deep learning technique. RESULTS: ADRMine outperforms several strong baseline systems in the ADR extraction task by achieving an F-measure of 0.82. Feature analysis demonstrates that the proposed word cluster features significantly improve extraction performance. CONCLUSION: It is possible to extract complex medical concepts, with relatively high performance, from informal, user-generated content. Our approach is particularly scalable, suitable for social media mining, as it relies on large volumes of unlabeled data, thus diminishing the need for large, annotated training data sets.


Subjects
Artificial Intelligence, Data Mining/methods, Pharmacovigilance, Social Media, Humans, Natural Language Processing, Semantics
9.
Article in English | MEDLINE | ID: mdl-25717407

ABSTRACT

Social media postings are rich in information that often remains hidden and inaccessible for automatic extraction due to inherent limitations of the sites' APIs, which mostly limit access via specific keyword-based searches (and limit both the number of keywords and the number of postings that are returned). When mining social media for drug mentions, one of the first problems to solve is how to derive a list of variants of the drug name (common misspellings) that can capture a sufficient number of postings. We present here an approach that filters the potential variants based on the intuition that, faced with the task of writing an unfamiliar, complex word (the drug name), users will tend to revert to phonetic spelling, and we thus give preference to variants that reflect the phonemes of the correct spelling. The algorithm allowed us to capture 50.4-56.0% of the user comments using only about 18% of the variants.

11.
Article in English | MEDLINE | ID: mdl-25209025

ABSTRACT

Finding gene functions discussed in the literature is an important task of information extraction (IE) from biomedical documents. Automated computational methodologies can significantly reduce the need for manual curation and improve quality of other related IE systems. We propose an open-IE method for the BioCreative IV GO shared task (subtask b), focused on finding gene function terms [Gene Ontology (GO) terms] for different genes in an article. The proposed open-IE approach is based on distributional semantic similarity over the GO terms. The method does not require annotated data for training, which makes it highly generalizable. We achieve an F-measure of 0.26 on the test-set in the official submission for BioCreative-GO shared task, the third highest F-measure among the seven participants in the shared task. DATABASE URL: https://code.google.com/p/rainbow-nlp/


Subjects
Computational Biology/methods, Data Mining/methods, Genes, Semantics, Vocabulary, Controlled, Genes/genetics, Genes/physiology, Internet
12.
AMIA Annu Symp Proc ; 2014: 924-33, 2014.
Article in English | MEDLINE | ID: mdl-25954400

ABSTRACT

Recent research has shown that Twitter data analytics can have broad implications for public health research. However, its value for pharmacovigilance has been scarcely studied, with health-related forums and community support groups preferred for the task. We present a systematic study of tweets collected for 74 drugs to assess their value as sources of potential signals for adverse drug reactions (ADRs). We created an annotated corpus of 10,822 tweets. Each tweet was annotated for the presence or absence of ADR mentions, with the span and Unified Medical Language System (UMLS) concept ID noted for each ADR present. Using Cohen's kappa, we calculated the inter-annotator agreement (IAA) for the binary annotations to be 0.69. To demonstrate the utility of the corpus, we attempted a lexicon-based approach for concept extraction, with promising success (54.1% precision, 62.1% recall, and 57.8% F-measure). A subset of the corpus is freely available at: http://diego.asu.edu/downloads.


Subjects
Data Mining/methods, Drug-Related Side Effects and Adverse Reactions, Internet, Pharmacovigilance, Humans, Prescription Drugs/adverse effects
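
The abstract above reports inter-annotator agreement on the binary ADR annotations as Cohen's kappa. As a minimal sketch of that statistic, the snippet below computes kappa for two annotators' label sequences using the standard definition (observed agreement corrected for chance agreement). The function name and the toy labels are invented for illustration and are not drawn from the corpus.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b) -> float:
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal length")
    n = len(labels_a)
    # Fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)


# Toy example: 1 = tweet contains an ADR mention, 0 = it does not.
annotator_1 = [1, 1, 0, 0]
annotator_2 = [1, 0, 0, 0]
print(cohens_kappa(annotator_1, annotator_2))  # 0.5
```

Kappa of 1 means perfect agreement and 0 means agreement no better than chance, so the reported 0.69 sits between the two.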
14.
Biomed Inform Insights ; 5(Suppl. 1): 165-74, 2012.
Article in English | MEDLINE | ID: mdl-22879773

ABSTRACT

The reasons that drive someone to commit suicide are complex, and their study has attracted the attention of scientists in different domains. Analyzing this phenomenon could significantly improve preventive efforts. In this paper we present a method for sentiment analysis of suicide notes submitted to the i2b2/VA/Cincinnati Shared Task 2011. In this task the sentences of 900 suicide notes were labeled with the possible emotions that they reflect. In order to label the sentences with emotions, we propose a hybrid approach which utilizes both rule-based and machine learning techniques. To solve the multi-class problem, a rule-based engine and an SVM model are used for each category. A set of syntactic and semantic features is selected for each sentence to build the rules and train the classifier. The rules are generated manually based on a set of lexical and emotional clues. We propose a new approach to extract the sentence's clauses and constitutive grammatical elements and to use them in syntactic and semantic feature generation. The approach uses a novel method to measure the polarity of the sentence based on the extracted grammatical elements, reaching a precision of 41.79% with a recall of 55.03% for an f-measure of 47.50%. The overall mean f-measure of all submissions was 48.75% with a standard deviation of 7%.

15.
AMIA Annu Symp Proc ; 2011: 1019-26, 2011.
Article in English | MEDLINE | ID: mdl-22195162

ABSTRACT

Rapid growth of online health social networks has enabled patients to communicate more easily with each other. This exchange of opinions and experiences has provided a rich source of information about drugs and their effectiveness and, more importantly, their possible adverse reactions. We developed a system to automatically extract mentions of Adverse Drug Reactions (ADRs) from user reviews about drugs in social network websites by mining a set of language patterns. The system applied association rule mining on a set of annotated comments to extract the underlying patterns of colloquial expressions about adverse effects. The patterns were tested on a set of unseen comments to evaluate their performance. We reached a precision of 70.01%, a recall of 66.32%, and an F-measure of 67.96%.


Subjects
Data Mining/methods, Drug-Related Side Effects and Adverse Reactions, Natural Language Processing, Pattern Recognition, Automated/methods, Algorithms, Humans
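
The abstract above applies association rule mining to annotated comments but gives no detail on the rule format. The sketch below only illustrates the two basic quantities behind any association rule, support and confidence, on invented data. It assumes each comment is reduced to a set of items (words plus an `ADR` label marker); that representation, the function name, and the toy comments are hypothetical and are not the authors' pipeline.

```python
def rule_stats(transactions, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    # Transactions containing every item of the antecedent.
    n_ante = sum(1 for t in transactions if antecedent <= t)
    # Transactions containing both the antecedent and the consequent.
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_both / n if n else 0.0
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


# Invented toy comments, each reduced to a set of items.
comments = [
    {"gave", "me", "ADR"},
    {"gave", "me", "ADR"},
    {"gave", "me"},
    {"helps", "with"},
]
support, confidence = rule_stats(comments, {"gave", "me"}, {"ADR"})
print(support, round(confidence, 2))  # 0.5 0.67
```

A rule miner would enumerate candidate antecedents and keep those whose support and confidence exceed chosen thresholds; this sketch scores a single hand-picked rule.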