Crowdsourcing with Enhanced Data Quality Assurance: An Efficient Approach to Mitigate Resource Scarcity Challenges in Training Large Language Models for Healthcare.
Barai, Prosanta; Leroy, Gondy; Bisht, Prakash; Rothman, Joshua M; Lee, Sumi; Andrews, Jennifer; Rice, Sydney A; Ahmed, Arif.
Affiliation
  • Barai P; The University of Arizona, Tucson 85721, U.S.A.
  • Leroy G; The University of Arizona, Tucson 85721, U.S.A.
  • Bisht P; The University of Arizona, Tucson 85721, U.S.A.
  • Rothman JM; UC San Diego Division of Academic General Pediatrics, USA.
  • Lee S; The University of Arizona, Tucson 85721, U.S.A.
  • Andrews J; The University of Arizona, Tucson 85721, U.S.A.
  • Rice SA; The University of Arizona, Tucson 85721, U.S.A.
  • Ahmed A; The University of Arizona, Tucson 85721, U.S.A.
Article in English | MEDLINE | ID: mdl-38827063
ABSTRACT
Large Language Models (LLMs) have demonstrated immense potential in artificial intelligence across various domains, including healthcare. However, their efficacy is hindered by the need for high-quality labeled data, which is often expensive and time-consuming to create, particularly in low-resource domains like healthcare. To address these challenges, we propose a crowdsourcing (CS) framework enriched with quality control measures at the pre-, real-time-, and post-data gathering stages. Our study evaluated the effectiveness of enhancing data quality through its impact on LLMs (Bio-BERT) for predicting autism-related symptoms. The results show that real-time quality control improves data quality by 19% compared to pre-quality control. Fine-tuning Bio-BERT using crowdsourced data generally increased recall compared to the Bio-BERT baseline but lowered precision. Our findings highlighted the potential of crowdsourcing and quality control in resource-constrained environments and offered insights into optimizing healthcare LLMs for informed decision-making and improved patient care.

Full text: 1 Database: MEDLINE Language: En Year of publication: 2024 Document type: Article
