ABSTRACT
ChatGPT is a language model that was trained on a large dataset including medical literature. Several studies have described the performance of ChatGPT on medical exams. In this study, we examine its performance in answering factual knowledge questions regarding clinical pharmacy. Questions were obtained from a Dutch application that features multiple-choice questions to maintain a basic knowledge level for clinical pharmacists. In total, 264 clinical pharmacy-related questions were presented to ChatGPT, and responses were evaluated for accuracy, concordance, quality of the substantiation, and reproducibility. Accuracy was defined as the correctness of the answer, and results were compared with the overall score achieved by pharmacists in 2022. Responses were marked concordant if no contradictions were present. The quality of the substantiation was graded by two independent pharmacists on a 4-point scale. Reproducibility was established by presenting questions multiple times and on various days. ChatGPT yielded accurate responses for 79% of the questions, surpassing the pharmacists' accuracy of 66%. Concordance was 95%, and the quality of the substantiation was deemed good or excellent for 73% of the questions. Reproducibility was consistently high (>92%), both within and between days, as well as across different users. ChatGPT answered factual knowledge questions related to clinical pharmacy practice with higher accuracy and reproducibility than pharmacists. Consequently, we posit that ChatGPT could serve as a valuable resource for pharmacists. We hope the technology will further improve, which may lead to enhanced future performance.
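The study presented questions through the ChatGPT interface rather than programmatically. As an illustration only, a hypothetical analogue of the repeated-presentation reproducibility check, written against the OpenAI Python client, might look like the sketch below; the model name, prompt format, and function names are assumptions, not the authors' method.

```python
# Hypothetical sketch of a programmatic reproducibility check using the
# OpenAI Python client; the study itself used the ChatGPT interface.
# Model name, prompt format, and repetition count are assumptions.
from collections import Counter

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask(question: str, options: list[str]) -> str:
    """Present one multiple-choice question and return the answer text."""
    prompt = (
        question
        + "\n"
        + "\n".join(options)
        + "\nAnswer with the letter of the correct option only."
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed; substitute the model under test
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()


def reproducibility(question: str, options: list[str], n: int = 5) -> float:
    """Fraction of n repeated runs that agree with the modal answer."""
    answers = [ask(question, options) for _ in range(n)]
    return Counter(answers).most_common(1)[0][1] / n
```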
Subjects
Pharmacists; Humans; Reproducibility of Results; Pharmacy Service, Hospital; Surveys and Questionnaires; Educational Measurement/methods
ABSTRACT
Medical schools are required to assess and evaluate their curricula and to develop exam questions with strong reliability and validity evidence, often based on data derived from statistically small samples of medical students. Achieving a large enough sample to reliably and validly evaluate courses, assessments, and exam questions would require extensive data collection over many years, which is inefficient, especially in the fast-changing educational environment of medical schools. This article demonstrates how advanced quantitative methods, such as bootstrapping, can provide reliable data by resampling a single dataset to create many simulated samples. This economic approach, among others, allows for the creation of confidence intervals and, consequently, the accurate evaluation of exam questions as well as broader course and curriculum assessments. Bootstrapping offers a robust alternative to traditional methods, improving the psychometric quality of exam questions, and contributing to fair and valid assessments in medical education.
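As a minimal illustration of the resampling idea described above (not the authors' implementation; the item-response data below are invented), a percentile bootstrap confidence interval for a single exam question's difficulty, i.e., the proportion of students answering it correctly, could be computed as follows:

```python
# Minimal nonparametric bootstrap sketch: a percentile confidence interval
# for an exam item's difficulty (proportion correct) from a small class.
# The responses below are hypothetical illustration data.
import numpy as np

rng = np.random.default_rng(seed=42)

# 1 = answered correctly, 0 = answered incorrectly (n = 25 students)
responses = np.array([1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1,
                      1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1])

n_resamples = 10_000
boot_difficulties = np.empty(n_resamples)
for i in range(n_resamples):
    # Resample the observed answers with replacement to simulate a new class
    resample = rng.choice(responses, size=responses.size, replace=True)
    boot_difficulties[i] = resample.mean()

lower, upper = np.percentile(boot_difficulties, [2.5, 97.5])
print(f"Observed difficulty: {responses.mean():.2f}")
print(f"95% bootstrap CI: [{lower:.2f}, {upper:.2f}]")
```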
ABSTRACT
OBJECTIVES: Students often express uncertainty about changing their answers on multiple-choice tests, despite multiple studies quantitatively showing the benefits of changing answers. METHODS: Data were collected from 86 first-year podiatric medical students over one semester of a Biochemistry course, using electronic testing data from ExamSoft's Snapshot Viewer. Quantitative analysis compared the frequency of answer changes and whether students changed their answers from incorrect to correct, correct to incorrect, or incorrect to incorrect. A correlation analysis assessed the relationship between the frequency of each type of answer change and class rank. Independent-samples t-tests were used to assess differences in answer-changing patterns between the top- and bottom-performing students in the class. RESULTS: The correlation between total correct-to-incorrect changes per total answer changes and class rank was positive, r = 0.218 (P = .048). A positive correlation of r = 0.502 (P < .001) was also observed between the number of incorrect-to-incorrect changes per total changes and class rank. A negative correlation of r = -0.382 (P < .001) was observed between class rank and the number of answers changed from incorrect to correct. While most of the class benefited from changing answers, a significant positive correlation of r = 0.467 (P < .001) was observed between the percentage of answers that were ultimately incorrect (regardless of the number of changes) and class rank. CONCLUSION: Class rank correlated with the likelihood of a net gain from changing answers. Higher-ranking students were more likely to gain points from changing an answer than lower-ranking students. Top students changed answers less frequently and more often changed to an ultimately correct answer, while bottom students changed answers from one incorrect answer to another more frequently than top students.
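For illustration only, the two reported analyses (Pearson correlations of answer-change counts with class rank, and an independent-samples t-test between top and bottom performers) could be run along the lines sketched below; all values and variable names are simulated assumptions, not the study's dataset.

```python
# Hypothetical sketch of the two reported analyses: Pearson correlations
# between answer-change counts and class rank, and an independent-samples
# t-test comparing top- and bottom-ranked students. All data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=0)
n_students = 86

class_rank = np.arange(1, n_students + 1)  # 1 = top of class
# Simulate: lower-ranked students make more incorrect-to-incorrect changes,
# higher-ranked students make more incorrect-to-correct changes
incorrect_to_incorrect = rng.poisson(lam=1 + class_rank / 30)
incorrect_to_correct = rng.poisson(lam=4 - class_rank / 40)

r, p = stats.pearsonr(class_rank, incorrect_to_incorrect)
print(f"rank vs incorrect-to-incorrect: r = {r:.3f}, P = {p:.3g}")

# Compare the top and bottom quartiles, as in the reported t-tests
top = incorrect_to_correct[class_rank <= n_students // 4]
bottom = incorrect_to_correct[class_rank > 3 * n_students // 4]
t, p = stats.ttest_ind(top, bottom)
print(f"top vs bottom quartile (incorrect-to-correct): "
      f"t = {t:.2f}, P = {p:.3g}")
```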
ABSTRACT
BACKGROUND: Writing multiple-choice questions (MCQs) takes a lot of practice, and pharmacy practitioners often lack training in writing effective MCQs. Sources of instruction in effective MCQ writing can be overwhelming, with numerous suggestions of what should and should not be done. PURPOSE: This guide serves as a succinct reference for the creation and revision of MCQs by both novice and seasoned pharmacy faculty practitioners. METHODS: The literature is summarized into 12 best practices for writing effective MCQs. Pharmacy-specific examples that demonstrate violations of best practices, and how they can be corrected, are provided. IMPLICATIONS: The guide can serve as a primer for writing new MCQs, a reference for revising previously created questions, or a guide for peer review of MCQs.