1.
J Biomed Inform ; 142: 104384, 2023 06.
Article in English | MEDLINE | ID: mdl-37164244

ABSTRACT

BACKGROUND: Identifying practice-ready evidence-based journal articles in medicine is a challenge due to the sheer volume of biomedical research publications. Newer approaches to support evidence discovery apply deep learning techniques to improve the efficiency and accuracy of classifying sound evidence. OBJECTIVE: To determine how well deep learning models using variants of Bidirectional Encoder Representations from Transformers (BERT) identify high-quality evidence with high clinical relevance from the biomedical literature for consideration in clinical practice. METHODS: We fine-tuned variations of BERT models (BERTBASE, BioBERT, BlueBERT, and PubMedBERT) and compared their performance in classifying articles based on methodological quality criteria. The dataset used for fine-tuning the models included titles and abstracts of >160,000 PubMed records from 2012 to 2020 that were of interest to human health and had been manually labeled according to whether they met established critical appraisal criteria for methodological rigor. The data was randomly divided into 80:10:10 sets for training, validating, and testing. In addition to using the full unbalanced set, the training data was randomly undersampled into four balanced datasets to assess performance and select the best performing model. For each of the four sets, one model that maintained sensitivity (recall) at ≥99% was selected, and the four selected models were ensembled. The best performing model was evaluated in a prospective, blinded test and applied to an established reference standard, the Clinical Hedges dataset. RESULTS: In training, three of the four selected best performing models were trained using BioBERTBASE. The ensembled model did not boost performance compared with the best individual model. Hence a solo BioBERT-based model (named DL-PLUS) was selected for further testing as it was computationally more efficient. The model had high recall (>99%) and 60% to 77% specificity in a prospective evaluation conducted with blinded research associates and saved >60% of the work required to identify high-quality articles. CONCLUSIONS: Deep learning using pretrained language models and a large dataset of classified articles produced models with improved specificity while maintaining >99% recall. The resulting DL-PLUS model identifies high-quality, clinically relevant articles from PubMed at the time of publication. The model improves the efficiency of a literature surveillance program, which allows for faster dissemination of appraised research.
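The data preparation described in the METHODS of this abstract (a random 80:10:10 split, followed by random undersampling of the training set into a balanced set) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy `(text, label)` records, the 1-in-10 positive rate, and the fixed seeds are all assumptions made for illustration.

```python
import random

def split_80_10_10(records, seed=0):
    """Shuffle records and split them into train/validation/test (80:10:10)."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_train = int(0.8 * len(shuffled))
    n_val = int(0.1 * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

def undersample_balanced(train, seed=0):
    """Randomly drop majority-class records until both classes have equal size."""
    rng = random.Random(seed)
    pos = [r for r in train if r[1] == 1]
    neg = [r for r in train if r[1] == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    balanced = minority + rng.sample(majority, len(minority))
    rng.shuffle(balanced)
    return balanced

# Hypothetical toy data: (text, label) pairs, about 1 positive in 10.
records = [(f"abstract {i}", 1 if i % 10 == 0 else 0) for i in range(1000)]
train, val, test = split_80_10_10(records)
balanced = undersample_balanced(train, seed=1)
```

Calling `undersample_balanced` with four different seeds would give four balanced training sets, which is one plausible reading of the "four balanced datasets" mentioned in the abstract.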


Subjects
Biomedical Research, Deep Learning, Humans, Clinical Relevance, Language, PubMed, Natural Language Processing
2.
JMIR Form Res ; 7: e46874, 2023 Nov 02.
Article in English | MEDLINE | ID: mdl-37917123

ABSTRACT

BACKGROUND: The COVID-19 pandemic and its associated public health mitigation strategies have dramatically changed patterns of daily life activities worldwide, resulting in unintentional consequences on behavioral risk factors, including smoking, alcohol consumption, poor nutrition, and physical inactivity. The infodemic of social media data may provide novel opportunities for evaluating changes related to behavioral risk factors during the pandemic. OBJECTIVE: We explored the feasibility of conducting a sentiment and emotion analysis using Twitter data to evaluate behavioral cancer risk factors (physical inactivity, poor nutrition, alcohol consumption, and smoking) over time during the first year of the COVID-19 pandemic. METHODS: Tweets during 2020 relating to the COVID-19 pandemic and the 4 cancer risk factors were extracted from the George Washington University Libraries Dataverse. Tweets were defined and filtered using keywords to create 4 data sets. We trained and tested a machine learning classifier using a prelabeled Twitter data set. This was applied to determine the sentiment (positive, negative, or neutral) of each tweet. A natural language processing package was used to identify the emotions (anger, anticipation, disgust, fear, joy, sadness, surprise, and trust) based on the words contained in the tweets. Sentiments and emotions for each of the risk factors were evaluated over time and analyzed to identify keywords that emerged. RESULTS: The sentiment analysis revealed that 56.69% (51,479/90,813) of the tweets about physical activity were positive, 16.4% (14,893/90,813) were negative, and 26.91% (24,441/90,813) were neutral. Similar patterns were observed for nutrition, where 55.44% (27,939/50,396), 15.78% (7950/50,396), and 28.79% (14,507/50,396) of the tweets were positive, negative, and neutral, respectively. 
For alcohol, the proportions of positive, negative, and neutral tweets were 46.85% (34,897/74,484), 22.9% (17,056/74,484), and 30.25% (22,531/74,484), respectively, and for smoking, they were 41.2% (11,628/28,220), 24.23% (6839/28,220), and 34.56% (9753/28,220), respectively. The sentiments were relatively stable over time. The emotion analysis suggests that the most common emotion expressed across physical activity and nutrition tweets was trust (69,495/320,741, 21.67% and 42,324/176,564, 23.97%, respectively); for alcohol, it was joy (49,147/273,128, 17.99%); and for smoking, it was fear (23,066/110,256, 20.92%). The emotions expressed remained relatively constant over the observed period. An analysis of the most frequent words tweeted revealed further insights into common themes expressed in relation to some of the risk factors and possible sources of bias. CONCLUSIONS: This analysis provided insight into behavioral cancer risk factors as expressed on Twitter during the first year of the COVID-19 pandemic. It was feasible to extract tweets relating to all 4 risk factors, and most tweets had a positive sentiment with varied emotions across the different data sets. Although these results can play a role in promoting public health, a deeper dive via qualitative analysis can be conducted to provide a contextual examination of each tweet.
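The percentages reported in the RESULTS of this abstract follow directly from the tweet counts. As a small worked check, the sketch below recomputes the physical activity proportions from the counts given in the abstract; the helper name `percentages` is an assumption introduced here.

```python
def percentages(counts):
    """Convert a mapping of label -> count into label -> percentage (2 decimals)."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}

# Counts reported in the abstract for tweets about physical activity.
physical_activity = {"positive": 51479, "negative": 14893, "neutral": 24441}
shares = percentages(physical_activity)
print(sum(physical_activity.values()), shares)
```

The counts sum to 90,813, the denominator quoted in the abstract, and the rounded shares match the reported 56.69%, 16.4%, and 26.91%.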

3.
Vaccine ; 41(43): 6411-6418, 2023 10 13.
Article in English | MEDLINE | ID: mdl-37718186

ABSTRACT

BACKGROUND: It is evident that COVID-19 will remain a public health concern in the coming years, largely driven by variants of concern (VOC). It is critical to continuously monitor vaccine effectiveness as new variants emerge and new vaccines and/or boosters are developed. Systematic surveillance of the scientific evidence base is necessary to inform public health action and identify key uncertainties. Evidence syntheses may also be used to populate models to fill in research gaps and help to prepare for future public health crises. This protocol outlines the rationale and methods for a living evidence synthesis of the effectiveness of COVID-19 vaccines in reducing the morbidity and mortality associated with, and transmission of, VOC of SARS-CoV-2. METHODS: Living evidence syntheses of vaccine effectiveness will be carried out over one year for (1) a range of potential outcomes in the index individual associated with VOC (pathogenesis); and (2) transmission of VOC. The literature search will be conducted up to May 2023. Observational and database-linkage primary studies will be included, as well as RCTs. Information sources include electronic databases (MEDLINE; Embase; Cochrane, L*OVE; the CNKI and Wanfang platforms), preprint servers (medRxiv, bioRxiv), and online repositories of grey literature. Title and abstract screening and full-text screening will be performed by two reviewers using a liberal accelerated method. Data extraction and risk of bias assessment will be completed by one reviewer, with the assessment verified by a second reviewer. Results from included studies will be pooled via random effects meta-analysis when appropriate, or otherwise summarized narratively. DISCUSSION: Evidence generated from our living evidence synthesis will be used to inform policy making, modelling, and prioritization of future research on the effectiveness of COVID-19 vaccines against VOC.
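The protocol states that results will be pooled via random effects meta-analysis but does not name an estimator. As an illustration only, the sketch below uses the DerSimonian-Laird estimator, a common choice; the estimator, the function name, and the example effect sizes (hypothetical log odds ratios with their variances) are assumptions and are not taken from the protocol.

```python
import math

def random_effects_pool(effects, variances):
    """Pool study effects with the DerSimonian-Laird random-effects estimator.

    Returns (pooled effect, standard error, between-study variance tau^2).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Method-of-moments estimate of tau^2, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2

# Hypothetical log odds ratios and variances from three studies.
log_or = [-1.2, -0.8, -1.6]
var = [0.04, 0.09, 0.06]
pooled, se, tau2 = random_effects_pool(log_or, var)
print(pooled, se, tau2)
```

When the studies agree closely, `tau2` is truncated to zero and the result coincides with the fixed-effect inverse-variance estimate.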


Subjects
COVID-19, Humans, COVID-19/prevention & control, COVID-19 Vaccines, SARS-CoV-2, Vaccine Efficacy, Bias, Meta-Analysis as Topic
4.
BMC Complement Med Ther ; 22(1): 105, 2022 Apr 13.
Article in English | MEDLINE | ID: mdl-35418205

ABSTRACT

BACKGROUND: Coronavirus disease 2019 (COVID-19) is a novel infectious disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Despite the paucity of evidence, various complementary, alternative and integrative medicines (CAIMs) have been touted as both preventative and curative. We conducted sentiment and emotion analysis with the intent of understanding CAIM content related to COVID-19 being generated on Twitter across 9 months. METHODS: Tweets relating to CAIM and COVID-19 were extracted from the George Washington University Libraries Dataverse Coronavirus tweets dataset from March 03 to November 30, 2020. We trained and tested a machine learning classifier using a large, pre-labelled Twitter dataset, which was applied to predict the sentiment of each CAIM-related tweet, and we used a natural language processing package to identify the emotions based on the words contained in the tweets. RESULTS: Our dataset included 28 713 English-language tweets. The number of CAIM-related tweets during the study period peaked in May 2020, then dropped off sharply over the subsequent three months; the fewest CAIM-related tweets were collected during August 2020, and the number remained low for the remainder of the collection period. Most tweets (n = 15 612, 54%) were classified as positive, 31% were neutral (n = 8803) and 15% were classified as negative (n = 4298). The most frequent emotions expressed across tweets were trust, followed by fear, while surprise and disgust were the least frequent. Though the volume of tweets decreased over the 9 months of the study, the expressed sentiments and emotions remained constant. CONCLUSION: The results of this sentiment analysis enabled us to establish key CAIMs being discussed at the intersection of COVID-19 across a 9-month period on Twitter. Overall, the majority of our subset of tweets were positive, as were the emotions associated with the words found within them. This may be interpreted as public support for CAIM; however, further qualitative investigation is warranted. Such future directions may be used to combat misinformation and improve public health strategies surrounding the use of social media information.
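The abstract says emotions were identified from the words contained in the tweets, without naming the package or lexicon. A word-level emotion tally of that general kind can be sketched as follows; `TOY_LEXICON`, the example tweets, and the function name are hypothetical and stand in for whatever resources the authors actually used.

```python
import re
from collections import Counter

# Hypothetical toy lexicon mapping words to emotions; not the authors' lexicon.
TOY_LEXICON = {
    "safe": {"trust"},
    "doctor": {"trust"},
    "afraid": {"fear"},
    "virus": {"fear"},
    "happy": {"joy"},
}

def emotion_counts(tweets, lexicon=TOY_LEXICON):
    """Count emotion labels for every lexicon word found in the tweets."""
    counts = Counter()
    for tweet in tweets:
        for word in re.findall(r"[a-z']+", tweet.lower()):
            counts.update(lexicon.get(word, ()))
    return counts

# Hypothetical example tweets.
tweets = [
    "My doctor says this is safe",
    "Afraid of the virus",
    "Happy and safe at home",
]
counts = emotion_counts(tweets)
print(counts.most_common())
```

Because the tally is per word rather than per tweet, the emotion totals can exceed the number of tweets.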


Subjects
COVID-19, Integrative Medicine, Social Media, Humans, Pandemics, SARS-CoV-2, Sentiment Analysis
5.
JMIR Med Inform ; 9(9): e30401, 2021 Sep 09.
Article in English | MEDLINE | ID: mdl-34499041

ABSTRACT

BACKGROUND: The rapid growth of the biomedical literature makes identifying strong evidence a time-consuming task. Applying machine learning to the process could be a viable solution that limits effort while maintaining accuracy. OBJECTIVE: The goal of the research was to summarize the nature and comparative performance of machine learning approaches that have been applied to retrieve high-quality evidence for clinical consideration from the biomedical literature. METHODS: We conducted a systematic review of studies that applied machine learning techniques to identify high-quality clinical articles in the biomedical literature. Multiple databases were searched to July 2020. Extracted data focused on the applied machine learning model, steps in the development of the models, and model performance. RESULTS: From 3918 retrieved studies, 10 met our inclusion criteria. All followed a supervised machine learning approach and applied, from a limited range of options, a high-quality standard for the training of their model. The results show that machine learning can achieve a sensitivity of 95% while maintaining a high precision of 86%. CONCLUSIONS: Machine learning approaches perform well in retrieving high-quality clinical studies. Performance may improve by applying more sophisticated approaches such as active learning and unsupervised machine learning approaches.

6.
JMIR Res Protoc ; 10(11): e29398, 2021 Nov 29.
Article in English | MEDLINE | ID: mdl-34847061

ABSTRACT

BACKGROUND: A barrier to practicing evidence-based medicine is the rapidly increasing body of biomedical literature. Use of method terms to limit the search can help reduce the burden of screening articles for clinical relevance; however, such terms are limited by their partial dependence on indexing terms and usually produce low precision, especially when high sensitivity is required. Machine learning has been applied to the identification of high-quality literature with the potential to achieve high precision without sacrificing sensitivity. The use of artificial intelligence has shown promise to improve the efficiency of identifying sound evidence. OBJECTIVE: The primary objective of this research is to derive and validate deep learning machine models using iterations of Bidirectional Encoder Representations from Transformers (BERT) to retrieve high-quality, high-relevance evidence for clinical consideration from the biomedical literature. METHODS: Using the HuggingFace Transformers library, we will experiment with variations of BERT models, including BERT, BioBERT, BlueBERT, and PubMedBERT, to determine which have the best performance in article identification based on quality criteria. Our experiments will utilize a large data set of over 150,000 PubMed citations from 2012 to 2020 that have been manually labeled based on their methodological rigor for clinical use. We will evaluate and report on the performance of the classifiers in categorizing articles based on their likelihood of meeting quality criteria. We will report fine-tuning hyperparameters for each model, as well as their performance metrics, including recall (sensitivity), specificity, precision, accuracy, F-score, the number of articles that need to be read before finding one that is positive (meets criteria), and classification probability scores. RESULTS: Initial model development is underway, with further development planned for early 2022. 
Performance testing is expected to start in February 2022. Results will be published in 2022. CONCLUSIONS: The experiments will aim to improve the precision of retrieving high-quality articles by applying a machine learning classifier to PubMed searching. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): DERR1-10.2196/29398.
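Several of the performance metrics listed in this protocol (recall, specificity, precision, accuracy, F-score, and the number of articles that need to be read to find one positive) can be derived from a confusion matrix. The sketch below shows one way to compute them, taking the number needed to read as the reciprocal of precision and the F-score as F1; the function name and the example counts are hypothetical and are not results from the study.

```python
def screening_metrics(tp, fp, tn, fn):
    """Compute common classifier metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)          # sensitivity
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)
    number_needed_to_read = 1 / precision  # articles read per true positive
    return {
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
        "number_needed_to_read": number_needed_to_read,
    }

# Hypothetical counts: 100 articles meet criteria, 1,000 do not.
metrics = screening_metrics(tp=99, fp=300, tn=700, fn=1)
print(metrics)
```

With these made-up counts, recall is 99% and specificity is 70%, yet about four articles must be read per true positive, which shows why precision is reported alongside recall when positives are rare.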
