Ensemble of deep learning language models to support the creation of living systematic reviews for the COVID-19 literature.

Knafou, Julien; Haas, Quentin; Borissov, Nikolay; Counotte, Michel; Low, Nicola; Imeri, Hira; Ipekci, Aziz Mert; Buitrago-Garcia, Diana; Heron, Leonie; Amini, Poorya; Teodoro, Douglas

Affiliation

Knafou J; University of Applied Sciences and Arts of Western Switzerland (HES-SO), Rue de la Tambourine 17, 1227, Geneva, Switzerland. julien.knafou@hesge.ch.
Haas Q; Risklick AG, Bern, Switzerland.
Borissov N; University of Applied Sciences and Arts of Western Switzerland (HES-SO), Rue de la Tambourine 17, 1227, Geneva, Switzerland.
Counotte M; CTU Bern, University of Bern, Bern, Switzerland.
Low N; Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland.
Imeri H; Wageningen Bioveterinary Research, Wageningen University & Research, Wageningen, The Netherlands.
Ipekci AM; Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland.
Buitrago-Garcia D; Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland.
Heron L; Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland.
Amini P; Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland.
Teodoro D; Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland.

Syst Rev; 12(1): 94, 2023-06-05.

Article in English | MEDLINE | ID: mdl-37277872

ABSTRACT

BACKGROUND: The COVID-19 pandemic has led to an unprecedented number of scientific publications, growing at a pace never seen before. Multiple living systematic reviews have been developed to provide professionals with up-to-date and trustworthy health information, but it is increasingly challenging for systematic reviewers to keep up with the evidence in electronic databases. We aimed to investigate deep learning-based algorithms for classifying COVID-19-related publications to help scale up the epidemiological curation process.
METHODS: In this retrospective study, five different pre-trained deep learning-based language models were fine-tuned on a dataset of 6365 publications manually classified into two classes, three subclasses, and 22 sub-subclasses relevant for epidemiological triage purposes. In a k-fold cross-validation setting, each standalone model was assessed on a classification task and compared against an ensemble, which takes the standalone model predictions as input and uses different strategies to infer the optimal article class. A ranking task was also considered, in which the model outputs a ranked list of sub-subclasses associated with the article.
RESULTS: The ensemble model significantly outperformed the standalone classifiers, achieving an F1-score of 89.2% at the class level of the classification task. The difference between the standalone and ensemble models increases at the sub-subclass level, where the ensemble reaches a micro F1-score of 70% against 67% for the best-performing standalone model. For the ranking task, the ensemble obtained the highest recall@3, with a performance of 89%. Using a unanimity voting rule, the ensemble can provide predictions with higher confidence on a subset of the data, detecting original papers with an F1-score of up to 97% on a subset of 80% of the collection, compared with 93% on the whole dataset.
CONCLUSION: This study shows the potential of using deep learning language models to perform triage of COVID-19 references efficiently and support epidemiological curation and review. The ensemble consistently and significantly outperforms any standalone model. Fine-tuning the voting strategy thresholds offers a way to annotate a subset of the data with higher predictive confidence.
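
The voting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract says only that the ensemble "uses different strategies", so plain majority voting is an assumption here, and the label names, toy predictions, and function names are all hypothetical. The unanimity rule returns a label only when every model agrees and otherwise abstains, which is how a smaller, higher-confidence subset can be obtained.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def majority_vote(votes: Sequence[str]) -> str:
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(votes).most_common(1)[0][0]


def unanimous_vote(votes: Sequence[str]) -> Optional[str]:
    """Return the shared label if all models agree, otherwise None (abstain)."""
    return votes[0] if len(set(votes)) == 1 else None


def ensemble_predict(
    per_model: List[List[str]],
) -> Tuple[List[str], List[Optional[str]]]:
    """Combine predictions, where per_model[m][i] is model m's label for article i."""
    per_article = list(zip(*per_model))  # regroup the votes article by article
    majority = [majority_vote(v) for v in per_article]
    unanimous = [unanimous_vote(v) for v in per_article]
    return majority, unanimous


# Hypothetical predictions from three standalone models on five articles.
toy_predictions = [
    ["original", "original", "other", "other", "original"],
    ["original", "other", "other", "other", "original"],
    ["original", "original", "other", "original", "original"],
]

majority, unanimous = ensemble_predict(toy_predictions)
# Share of articles on which the unanimity rule does not abstain.
coverage = sum(label is not None for label in unanimous) / len(unanimous)
print(majority)
print(unanimous)
print(f"unanimity coverage: {coverage:.0%}")
```

In this toy run, the articles on which the models disagree are left unlabelled by the unanimity rule and would be routed to manual review, while the majority rule still labels every article.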

Subjects

COVID-19; Deep Learning; Humans; Pandemics; Retrospective Studies; Language

Keywords

COVID-19; Deep learning; Language model; Literature screening; Living systematic review; Text classification; Transfer learning


Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Deep Learning / COVID-19 Type of study: Observational studies / Prognostic studies / Risk factors studies Limits: Humans Language: English Journal: Syst Rev Year of publication: 2023 Document type: Article Affiliation country: Switzerland Country of publication: United Kingdom
