Search | VHL Regional Portal

Ensemble of deep learning language models to support the creation of living systematic reviews for the COVID-19 literature.

Knafou, Julien; Haas, Quentin; Borissov, Nikolay; Counotte, Michel; Low, Nicola; Imeri, Hira; Ipekci, Aziz Mert; Buitrago-Garcia, Diana; Heron, Leonie; Amini, Poorya; Teodoro, Douglas.

Syst Rev ; 12(1): 94, 2023 06 05.

Article in English | MEDLINE | ID: mdl-37277872

ABSTRACT

BACKGROUND: The COVID-19 pandemic has led to an unprecedented amount of scientific publications, growing at a pace never seen before. Multiple living systematic reviews have been developed to assist professionals with up-to-date and trustworthy health information, but it is increasingly challenging for systematic reviewers to keep up with the evidence in electronic databases. We aimed to investigate deep learning-based machine learning algorithms to classify COVID-19-related publications to help scale up the epidemiological curation process. METHODS: In this retrospective study, five different pre-trained deep learning-based language models were fine-tuned on a dataset of 6365 publications manually classified into two classes, three subclasses, and 22 sub-subclasses relevant for epidemiological triage purposes. In a k-fold cross-validation setting, each standalone model was assessed on a classification task and compared against an ensemble, which takes the standalone model predictions as input and uses different strategies to infer the optimal article class. A ranking task was also considered, in which the model outputs a ranked list of sub-subclasses associated with the article. RESULTS: The ensemble model significantly outperformed the standalone classifiers, achieving an F1-score of 89.2 at the class level of the classification task. The difference between the standalone and ensemble models increases at the sub-subclass level, where the ensemble reaches a micro F1-score of 70% against 67% for the best-performing standalone model. For the ranking task, the ensemble obtained the highest recall@3, with a performance of 89%. Using a unanimity voting rule, the ensemble can provide predictions with higher confidence on a subset of the data, achieving detection of original papers with an F1-score up to 97% on a subset of 80% of the collection instead of 93% on the whole dataset.
CONCLUSION: This study shows the potential of using deep learning language models to perform triage of COVID-19 references efficiently and support epidemiological curation and review. The ensemble consistently and significantly outperforms any standalone model. Fine-tuning the voting strategy thresholds is an interesting alternative to annotate a subset with higher predictive confidence.
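The voting strategies and the recall@3 metric mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the label strings, and the choice to abstain (return `None`) when the models disagree are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of standalone models."""
    return Counter(predictions).most_common(1)[0][0]


def unanimity_vote(predictions):
    """Return the shared label only if every model agrees; otherwise abstain.

    Abstaining (None) is an assumed convention: such articles would be left
    for manual review, which is how a higher-confidence subset could arise.
    """
    first = predictions[0]
    return first if all(p == first for p in predictions) else None


def recall_at_k(ranked_labels, relevant_labels, k=3):
    """Fraction of the relevant labels that appear in the top-k ranked labels."""
    top_k = set(ranked_labels[:k])
    return len(top_k & set(relevant_labels)) / len(relevant_labels)


# Hypothetical predictions from five standalone models for two articles.
agreeing = ["original", "original", "original", "original", "original"]
split = ["original", "original", "review", "original", "review"]

print(majority_vote(split))       # label with the most votes
print(unanimity_vote(agreeing))   # all models agree
print(unanimity_vote(split))      # models disagree, so the ensemble abstains
```

Under a unanimity rule the ensemble only labels the articles on which all models agree, which is consistent with the abstract reporting a higher F1-score on a subset of the collection.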

Subjects

COVID-19 , Deep Learning , Humans , Pandemics , Retrospective Studies , Language

Deep learning-based risk prediction for interventional clinical trials based on protocol design: A retrospective study.

Ferdowsi, Sohrab; Knafou, Julien; Borissov, Nikolay; Vicente Alvarez, David; Mishra, Rahul; Amini, Poorya; Teodoro, Douglas.

Patterns (N Y) ; 4(3): 100689, 2023 Mar 10.

Article in English | MEDLINE | ID: mdl-36960445

ABSTRACT

The success rate of clinical trials (CTs) is low, with the protocol design itself being considered a major risk factor. We aimed to investigate the use of deep learning methods to predict the risk of CTs based on their protocols. Considering protocol changes and their final status, a retrospective risk assignment method was proposed to label CTs according to low, medium, and high risk levels. Then, transformer and graph neural networks were designed and combined in an ensemble model to learn to infer the ternary risk categories. The ensemble model achieved robust performance (area under the receiver operating characteristic curve [AUROC] of 0.8453 [95% confidence interval: 0.8409-0.8495]), similar to the individual architectures but significantly outperforming a baseline based on bag-of-words features (0.7548 [0.7493-0.7603] AUROC). We demonstrate the potential of deep learning in predicting the risk of CTs from their protocols, paving the way for customized risk mitigation strategies during protocol design.

Reducing systematic review burden using Deduklick: a novel, automated, reliable, and explainable deduplication algorithm to foster medical research.

Borissov, Nikolay; Haas, Quentin; Minder, Beatrice; Kopp-Heim, Doris; von Gernler, Marc; Janka, Heidrun; Teodoro, Douglas; Amini, Poorya.

Syst Rev ; 11(1): 172, 2022 08 17.

Article in English | MEDLINE | ID: mdl-35978441

ABSTRACT

BACKGROUND: Identifying and removing reference duplicates when conducting systematic reviews (SRs) remain a major, time-consuming issue for authors who manually check for duplicates using built-in features in citation managers. To address issues related to manual deduplication, we developed an automated, efficient, and rapid artificial intelligence-based algorithm named Deduklick. Deduklick combines natural language processing algorithms with a set of rules created by expert information specialists. METHODS: Deduklick's deduplication uses a multistep algorithm that normalizes the data, calculates a similarity score, and identifies unique and duplicate references based on metadata fields, such as title, authors, journal, DOI, year, issue, volume, and page number range. We measured and compared Deduklick's capacity to accurately detect duplicates with the information specialists' standard, manual duplicate removal process using EndNote on eight existing heterogeneous datasets. Using a sensitivity analysis, we manually cross-compared the efficiency and noise of both methods. DISCUSSION: Deduklick achieved average recall of 99.51%, average precision of 100.00%, and average F1 score of 99.75%. In contrast, the manual deduplication process achieved average recall of 88.65%, average precision of 99.95%, and average F1 score of 91.98%. Deduklick achieved performance equal to or higher than expert level on duplicate removal. It also preserved high metadata quality and drastically reduced time spent on analysis. Deduklick represents an efficient, transparent, ergonomic, and time-saving solution for identifying and removing duplicates in SR searches. Deduklick could therefore simplify SR production and offer important advantages for scientists, including saving time, increasing accuracy, reducing costs, and contributing to quality SRs.
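The general shape of a normalize, score, and decide pipeline like the one described in this abstract can be sketched as follows. This is not Deduklick's algorithm, whose rules and scoring are not given here: the field list, the use of `difflib.SequenceMatcher` as the similarity measure, the exact-DOI shortcut, and the 0.9 threshold are all assumptions chosen for illustration.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def similarity(ref_a, ref_b, fields=("title", "authors", "journal", "year")):
    """Average character-level similarity over the chosen metadata fields."""
    scores = [
        SequenceMatcher(
            None, normalize(ref_a.get(f, "")), normalize(ref_b.get(f, ""))
        ).ratio()
        for f in fields
    ]
    return sum(scores) / len(scores)


def is_duplicate(ref_a, ref_b, threshold=0.9):
    """Flag a pair as duplicate on an exact DOI match or a high similarity score."""
    doi_a = ref_a.get("doi", "").lower()
    doi_b = ref_b.get("doi", "").lower()
    if doi_a and doi_a == doi_b:
        return True
    return similarity(ref_a, ref_b) >= threshold


# Hypothetical records: the same article exported by two different databases.
a = {"title": "Reducing Systematic Review Burden Using Deduklick.",
     "authors": "Borissov N", "journal": "Syst Rev", "year": "2022"}
b = {"title": "reducing systematic review burden using deduklick",
     "authors": "Borissov N", "journal": "Syst Rev", "year": "2022"}
print(is_duplicate(a, b))
```

Normalizing before scoring is what lets records that differ only in case, accents, or punctuation compare as identical; the threshold then controls the trade-off between missed duplicates and false matches.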

Subjects

Algorithms , Artificial Intelligence , Systematic Reviews as Topic , Biomedical Research , Humans , Natural Language Processing

Information Retrieval in an Infodemic: The Case of COVID-19 Publications.

Teodoro, Douglas; Ferdowsi, Sohrab; Borissov, Nikolay; Kashani, Elham; Vicente Alvarez, David; Copara, Jenny; Gouareb, Racha; Naderi, Nona; Amini, Poorya.

J Med Internet Res ; 23(9): e30161, 2021 09 17.

Article in English | MEDLINE | ID: mdl-34375298

ABSTRACT

BACKGROUND: The COVID-19 global health crisis has led to an exponential surge in published scientific literature. In an attempt to tackle the pandemic, extremely large COVID-19-related corpora are being created, sometimes containing inaccurate information, at a scale that exceeds the capacity of human analysis. OBJECTIVE: In the context of searching for scientific evidence in the deluge of COVID-19-related literature, we present an information retrieval methodology for effective identification of relevant sources to answer biomedical queries posed using natural language. METHODS: Our multistage retrieval methodology combines probabilistic weighting models and reranking algorithms based on deep neural architectures to boost the ranking of relevant documents. Similarity of COVID-19 queries is compared to documents, and a series of postprocessing methods is applied to the initial ranking list to improve the match between the query and the biomedical information source and boost the position of relevant documents. RESULTS: The methodology was evaluated in the context of the TREC-COVID challenge, achieving competitive results with the top-ranking teams participating in the competition. Particularly, the combination of bag-of-words and deep neural language models significantly outperformed an Okapi Best Match 25-based baseline, retrieving, on average, 83% of relevant documents in the top 20. CONCLUSIONS: These results indicate that multistage retrieval supported by deep learning could enhance identification of literature for COVID-19-related questions posed using natural language.
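For readers unfamiliar with the Okapi Best Match 25 (BM25) baseline named in this abstract, the following minimal sketch scores documents against a query with a textbook BM25 formula. It only illustrates the first, bag-of-words stage; the neural reranking stage is not shown. The whitespace tokenizer, the parameter values `k1=1.2` and `b=0.75`, and the non-negative IDF variant are assumptions, not details taken from the paper.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with a textbook BM25 formula."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Non-negative IDF variant (assumed; other variants exist).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical mini-corpus.
docs = [
    "covid vaccine trial results",
    "influenza season report",
    "covid covid transmission study",
]
scores = bm25_scores("covid transmission", docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In a multistage setup of the kind the abstract describes, a ranking like this would typically serve as the candidate list that a neural model then reorders.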

Subjects

COVID-19 , Algorithms , Humans , Information Storage and Retrieval , Language , SARS-CoV-2

Utilizing Artificial Intelligence to Manage COVID-19 Scientific Evidence Torrent with Risklick AI: A Critical Tool for Pharmacology and Therapy Development.

Haas, Quentin; Alvarez, David Vicente; Borissov, Nikolay; Ferdowsi, Sohrab; von Meyenn, Leonhard; Trelle, Sven; Teodoro, Douglas; Amini, Poorya.

Pharmacology ; 106(5-6): 244-253, 2021.

Article in English | MEDLINE | ID: mdl-33910199

ABSTRACT

INTRODUCTION: The SARS-CoV-2 pandemic has led to one of the most critical and boundless waves of publications in the history of modern science. The necessity to find and pursue relevant information and quantify its quality is broadly acknowledged. Modern information retrieval techniques combined with artificial intelligence (AI) appear as one of the key strategies for COVID-19 living evidence management. Nevertheless, most AI projects that retrieve COVID-19 literature still require manual tasks. METHODS: In this context, we present a novel, automated search platform, called Risklick AI, which aims to automatically gather COVID-19 scientific evidence and enables scientists, policy makers, and healthcare professionals to find the most relevant information tailored to their question of interest in real time. RESULTS: Here, we compare the capacity of Risklick AI to find COVID-19-related clinical trials and scientific publications with that of clinicaltrials.gov and PubMed in the field of pharmacology and clinical intervention. DISCUSSION: The results demonstrate that Risklick AI is able to find COVID-19 references more effectively, both in terms of precision and recall, compared to the baseline platforms. Hence, Risklick AI could become a useful alternative assistant to scientists fighting the COVID-19 pandemic.

Subjects

Artificial Intelligence/trends , COVID-19/therapy , Data Interpretation, Statistical , Drug Development/trends , Evidence-Based Medicine/trends , Pharmacology/trends , Artificial Intelligence/statistics & numerical data , COVID-19/diagnosis , COVID-19/epidemiology , Clinical Trials as Topic/statistics & numerical data , Drug Development/statistics & numerical data , Evidence-Based Medicine/statistics & numerical data , Humans , Pharmacology/statistics & numerical data , Registries
