Search | Secretaria de Estado da Saúde

Web Conversations About Complementary and Alternative Medicines and Cancer: Content and Sentiment Analysis.

Mazzocut, Mauro; Truccolo, Ivana; Antonini, Marialuisa; Rinaldi, Fabio; Omero, Paolo; Ferrarin, Emanuela; De Paoli, Paolo; Tasso, Carlo.

J Med Internet Res ; 18(6): e120, 2016 Jun 16.

Article in English | MEDLINE | ID: mdl-27311444

ABSTRACT

BACKGROUND: The use of complementary and alternative medicine (CAM) among cancer patients is widespread and mostly self-administered. One of the most pressing issues today is patients' nondisclosure of CAM use to their doctors. This general lack of communication exposes patients to dangerous behaviors and to less reliable information channels, such as the Web. The Italian context scarcely differs from this trend. It is now possible to systematically mine and analyze the unstructured information available on the Web to gain insight into people's opinions, beliefs, and rumors concerning health topics.

OBJECTIVE: Our aim was to analyze Italian Web conversations about CAM, identifying the most relevant Web sources, therapies, and diseases, and to measure the related sentiment.

METHODS: Data were collected using the Web Intelligence tool ifMONITOR. The workflow consisted of 6 phases: (1) definition of eligibility criteria for the ifMONITOR search profile; (2) creation of a CAM terminology database; (3) generic Web search and automatic filtering, with the results manually revised to refine the search profile and then stored in the ifMONITOR database; (4) automatic classification using the CAM database terms; (5) selection of the final sample and manual sentiment analysis using a 1-5 score range; and (6) manual indexing of the Web sources and CAM therapy types retrieved. Descriptive univariate statistics were computed for each item: absolute frequency, percentage, central tendency (mean sentiment score [MSS]), and variability (standard deviation σ).

RESULTS: Overall, 212 Web sources, 423 Web documents, and 868 opinions were retrieved. The overall sentiment measured tended toward a good score (3.6 of 5). The standard deviation analysis revealed fairly high polarization in the opinions of the conversation participants (σ≥1). In total, 126 of 212 (59.4%) Web sources retrieved were non-health-related. Facebook (89; 21%) and Yahoo Answers (41; 9.7%) were the most relevant. In total, 94 CAM therapies were retrieved. Most belong to the "biologically based therapies or nutrition" category: 339 of 868 opinions (39.1%), with an MSS of 3.9 (σ=0.83). Within nutrition, "diets" collected 154 opinions (18.4%) with an MSS of 3.8 (σ=0.87); "food as CAM" overall collected 112 opinions (12.8%) with an MSS of 4 (σ=0.68). Excluding diets and food, the most discussed CAM therapy was the controversial Italian "Di Bella multitherapy," with 102 opinions (11.8%) and an MSS of 3.4 (σ=1.21). Breast cancer was the most mentioned disease: 81 opinions of 868.

CONCLUSIONS: Conversations about CAM and cancer are ubiquitous. Considerable attention is paid to biologically based therapies, which are perceived as harmless and useful, while the risks related to dangerous interactions or malnutrition are underrated. Our results can help doctors become aware of the implications of these beliefs for clinical practice. Mining Web conversations could be a strategy for gaining insight into people's perspectives on other controversial topics.
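The descriptive statistics named in the METHODS section (absolute frequency, percentage, mean sentiment score, and standard deviation) can be illustrated with a minimal Python sketch. This is not the authors' code or the ifMONITOR tool: the function names `describe` and `share` are hypothetical, and the abstract does not state whether σ is a population or sample standard deviation, so the population form is assumed here. The σ≥1 polarization threshold follows the wording of the RESULTS section.

```python
from statistics import mean, pstdev


def describe(scores):
    """Summarize a list of manual sentiment scores on the 1-5 scale.

    Returns the absolute frequency (n), the mean sentiment score (mss),
    the standard deviation (sigma), and a polarization flag.
    Assumption: population standard deviation; the abstract does not
    say which form was used.
    """
    if not scores:
        raise ValueError("at least one score is required")
    sigma = pstdev(scores)
    return {
        "n": len(scores),
        "mss": mean(scores),
        "sigma": sigma,
        # The abstract reads sigma >= 1 as a sign of polarized opinions.
        "polarized": sigma >= 1.0,
    }


def share(count, total):
    """Percentage of `count` within `total`, rounded to one decimal place."""
    return round(100 * count / total, 1)


if __name__ == "__main__":
    # Toy scores, not data from the study.
    print(describe([5, 4, 4, 3, 1]))
    # Counts taken from the abstract: 126 of 212 Web sources.
    print(share(126, 212))
```

Applied per therapy category, `describe` would yield the kind of MSS and σ pairs reported in the RESULTS section, and `share` the accompanying percentages.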

Subjects

Complementary Therapies/methods; Complementary Therapies/psychology; Internet/statistics & numerical data; Neoplasms/therapy; Adult; Complementary Therapies/statistics & numerical data; Data Mining/methods; Female; Humans; Male; Middle Aged

Entity recognition in the biomedical domain using a hybrid approach.

Basaldella, Marco; Furrer, Lenz; Tasso, Carlo; Rinaldi, Fabio.

J Biomed Semantics ; 8(1): 51, 2017 Nov 09.

Article in English | MEDLINE | ID: mdl-29122011

ABSTRACT

BACKGROUND: This article describes a high-recall, high-precision approach for the extraction of biomedical entities from scientific articles.

METHOD: The approach uses a two-stage pipeline, combining a dictionary-based entity recognizer with a machine-learning classifier. First, the OGER entity recognizer, which has a bias towards high recall, annotates the terms that appear in selected domain ontologies. Subsequently, the Distiller framework uses this information as a feature for a machine-learning algorithm to select only the relevant entities. For this step, we compare two different supervised machine-learning algorithms: Conditional Random Fields and Neural Networks.

RESULTS: In an in-domain evaluation using the CRAFT corpus, we test the performance of the combined systems when recognizing chemicals, cell types, cellular components, biological processes, molecular functions, organisms, proteins, and biological sequences. Our best system combines dictionary-based candidate generation with Neural-Network-based filtering. It achieves an overall precision of 86% at a recall of 60% on the named entity recognition task, and a precision of 51% at a recall of 49% on the concept recognition task.

CONCLUSION: These results are, to our knowledge, the best reported so far in this particular task.
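The two-stage idea in the METHOD section (high-recall dictionary lookup followed by a filtering step, evaluated by precision and recall) can be sketched in Python as follows. This is an illustration under stated assumptions, not the OGER or Distiller API: all function names, the toy lexicon, and the sample sentence are hypothetical, and the second stage uses a simple "drop nested matches" rule as a stand-in for the trained Conditional Random Field or Neural Network classifier described in the abstract.

```python
import re


def dictionary_candidates(text, lexicon):
    """Stage 1 (high recall): return every lexicon hit in the text.

    Each hit is a tuple (start, end, matched_text, entity_type).
    """
    hits = []
    for term, entity_type in lexicon.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((match.start(), match.end(), match.group(0), entity_type))
    return sorted(hits)


def drop_nested(candidates):
    """Stage 2 stand-in: discard candidates contained in a longer candidate.

    Assumption: the abstract uses a trained classifier here; this rule
    only imitates the role of a precision-oriented filter.
    """
    def is_nested(c):
        return any(
            o[0] <= c[0] and c[1] <= o[1] and (o[1] - o[0]) > (c[1] - c[0])
            for o in candidates
        )
    return [c for c in candidates if not is_nested(c)]


def precision_recall(predicted, gold):
    """Precision and recall of a predicted set against a gold-standard set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy input, not taken from the CRAFT corpus.
    text = "Insulin acts on the cell membrane of each cell."
    lexicon = {
        "insulin": "protein",
        "cell": "cell_type",
        "cell membrane": "cellular_component",
    }
    candidates = dictionary_candidates(text, lexicon)
    for candidate in drop_nested(candidates):
        print(candidate)
```

The split mirrors the trade-off the abstract describes: the first stage over-generates so that few true entities are missed, and the second stage removes spurious candidates to recover precision.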

Subjects

Algorithms; Data Mining/methods; Machine Learning; Neural Networks, Computer; Vocabulary, Controlled; Pattern Recognition, Automated/methods; Reproducibility of Results; Semantics; Terminology as Topic
