1.
Med Teach; 46(2): 188-195, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37542358

ABSTRACT

Post-assessment psychometric reports are a vital component of the assessment cycle, ensuring that assessments are reliable, valid and fair and that pass-fail decisions are appropriate. Students' scores can be summarised by examining frequency distributions, measures of central tendency and measures of dispersion. Item discrimination indices, which assess the quality of items and of distractors by differentiating between students who have and have not achieved the learning outcomes, are key. Estimating individual item reliability and item validity indices can maximise test-score reliability and validity. Test accuracy can be evaluated by assessing test reliability, consistency and validity, and the standard error of measurement can be used to quantify the variation around observed scores. Standard setting, even by experts, may be unreliable, and reality checks such as the Hofstee method, P values and correlation analysis can improve validity. The Rasch model of student ability and item difficulty assists in modifying assessment questions and pinpointing areas for additional instruction. We propose 12 tips to support test developers in interpreting structured psychometric reports, including analysing and refining flawed items and ensuring fair assessments with accurate and defensible marks.
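As a concrete illustration of two statistics such reports typically include, here is a minimal sketch of a point-biserial item discrimination index and the standard error of measurement; the response matrix, function names and reliability value are invented for illustration and are not drawn from the article.

```python
# Point-biserial item discrimination and standard error of measurement (SEM).
# Toy data; illustrative only.
import numpy as np

responses = np.array([        # rows = students, columns = items (1 = correct)
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
])

def item_discrimination(responses: np.ndarray, item: int) -> float:
    """Point-biserial correlation between one item and the rest-of-test score."""
    item_scores = responses[:, item]
    rest_scores = responses.sum(axis=1) - item_scores   # avoid self-correlation
    return float(np.corrcoef(item_scores, rest_scores)[0, 1])

def sem(total_scores: np.ndarray, reliability: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return float(np.std(total_scores, ddof=1) * np.sqrt(1.0 - reliability))

totals = responses.sum(axis=1)
print([round(item_discrimination(responses, i), 2) for i in range(4)])
print(round(sem(totals, reliability=0.80), 2))
```

A low or negative discrimination index flags an item (or distractor) for review, and the SEM gives the uncertainty band around marks near the pass mark.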


Subject(s)
Educational Measurement; Students, Medical; Humans; Psychometrics; Reproducibility of Results; Educational Measurement/methods; Learning
2.
Int J Med Educ; 14: 123-130, 2023 Sep 7.
Article in English | MEDLINE | ID: mdl-37678838

ABSTRACT

Objectives: To measure intra-standard-setter variability and to assess variation between the pass marks obtained from Angoff ratings, guided by latent trait theory as the theoretical model. Methods: A non-experimental cross-sectional study was conducted. Two knowledge-based tests were administered to 358 final-year medical students (223 females and 135 males) as part of their normal summative programme of assessments. The results of judgmental standard setting using the Angoff method, which is widely used in medical schools, were analysed to determine intra-standard-setter inconsistency using the three-parameter item response theory (IRT) model. Permission for this study was granted by the local Research Ethics Committee of the University of Nottingham. To ensure anonymity and confidentiality, all student-level identifiers were removed before the data were analysed. Results: The results confirm that the three-parameter IRT model can be used to analyse the ratings of individual judgmental standard setters. Overall, standard setters behaved fairly consistently in both tests. The mean Angoff ratings and the conditional probabilities were strongly positively correlated, which provides evidence of inter-standard-setter validity. Conclusions: We recommend that assessment providers adopt the methodology used in this study to help identify inter- and intra-standard-setter inconsistencies and so minimise the number of false-positive and false-negative pass-fail decisions.
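The three-parameter IRT model used in the analysis has a standard closed form, sketched below; the ability level and item parameters are illustrative values, not estimates from the study.

```python
# Three-parameter logistic (3PL) IRT model: the probability that a student of
# ability theta answers an item correctly, given discrimination a, difficulty b
# and pseudo-guessing c. Parameter values here are illustrative only.
import numpy as np

def p_correct_3pl(theta: float, a: float, b: float, c: float) -> float:
    """P(correct | theta) = c + (1 - c) / (1 + exp(-a * (theta - b)))."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

# An Angoff rating of, say, 0.6 for an item can be compared with the
# conditional probability the model assigns at the ability level implied by
# the pass mark; large gaps flag inconsistent standard setting.
print(round(p_correct_3pl(theta=0.25, a=1.2, b=0.0, c=0.2), 3))
```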


Subject(s)
Academic Performance; Education, Medical; Program Evaluation; Humans; Male; Female; Students, Medical; Education, Medical/standards; Cross-Sectional Studies; Models, Theoretical
6.
Med Teach; 45(2): 232-233, 2023 Feb.
Article in English | MEDLINE | ID: mdl-35645335
7.
Int J Med Educ; 13: 100-106, 2022 Apr 22.
Article in English | MEDLINE | ID: mdl-35462355

ABSTRACT

Assessments in medical education, with their consequent decisions about performance and competence, have a profound and far-reaching impact on students and their future careers. Physicians who make decisions about students must be confident that these decisions are based on objective, valid and reliable evidence and are thus fair. The increasing use of psychometrics aims to minimise measurement bias, a major threat to fairness in testing. There is now a substantial literature on psychometric methods and their applications, ranging from basic to advanced, outlining how assessment providers can make their exams fairer and minimise the errors attached to assessments. The mathematical models behind some of these methods may be difficult for some assessment providers, in particular clinicians. This guide requires no prior knowledge of mathematics and describes some of the key methods used to improve and develop assessments, knowledge that is essential for anyone involved in interpreting assessment results. The article aligns each method with the Standards for Educational and Psychological Testing framework, recognised as the gold standard for testing guidance since the 1960s. This helps the reader develop a deeper understanding of how assessors provide evidence for reliability and validity with respect to test construction, evaluation, fairness, application and consequences, and provides a platform for understanding the literature on other, more complex psychometric concepts not specifically covered in this article.
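As one concrete example of the kind of reliability evidence such a guide discusses, here is a sketch of Cronbach's alpha; whether the guide covers this particular statistic is not stated in the abstract, and the data below are invented.

```python
# Cronbach's alpha, a widely used internal-consistency reliability estimate:
# alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score)).
# Toy data; illustrative only.
import numpy as np

scores = np.array([           # rows = students, columns = items
    [4, 3, 4, 5],
    [2, 2, 3, 2],
    [5, 4, 4, 4],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
])

def cronbach_alpha(scores: np.ndarray) -> float:
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return float(k / (k - 1) * (1.0 - item_vars / total_var))

print(round(cronbach_alpha(scores), 3))
```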


Subject(s)
Education, Medical; Physicians; Clinical Competence; Educational Measurement; Humans; Psychometrics; Reproducibility of Results
8.
Med Teach; 44(4): 453-454, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35037563
9.
Med Teach; 44(6): 582-595, 2022 Jun.
Article in English | MEDLINE | ID: mdl-34726546

ABSTRACT

The ratings that judges or examiners use to determine pass marks and students' performance on OSCEs serve a number of essential functions in medical education assessment, and their validity is a pivotal issue. However, certain types of error often occur in ratings and require special effort to minimise. Rater characteristics (e.g. generosity error, severity error, central tendency error or halo error) can introduce performance-irrelevant variance. Prior literature shows the fundamental problems that judges' or examiners' errors create for the measurement of student performance. It also indicates that controlling such errors supports a robust and credible pass mark and, thus, accurate student marks. Therefore, for a standard setter who identifies the pass mark and an examiner who rates student performance in OSCEs, proper, user-friendly feedback on their standard setting and ratings is essential for reducing bias. Such feedback provides useful avenues for understanding why performance ratings may be irregular and how to improve the quality of ratings. This AMEE Guide discusses various methods of feedback to support examiners' understanding of student performance and the standard-setting process, with the aim of making inferences from assessments fair, valid and reliable.
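A minimal sketch of one simple severity/leniency screen of the sort such feedback might include; this is an illustration on assumed data, not a method taken from the Guide itself.

```python
# Compare each examiner's mean awarded rating against the cohort mean, in
# standard-deviation units of the pooled ratings. Illustrative only.
import numpy as np

ratings_by_examiner = {       # examiner -> ratings awarded across encounters
    "examiner_A": [4, 5, 4, 5, 4],
    "examiner_B": [2, 3, 2, 2, 3],
    "examiner_C": [3, 4, 3, 3, 4],
}

all_ratings = np.concatenate(
    [np.asarray(r, dtype=float) for r in ratings_by_examiner.values()]
)
mu, sigma = all_ratings.mean(), all_ratings.std(ddof=1)

for name, r in ratings_by_examiner.items():
    z = (np.mean(r) - mu) / sigma    # > 0 suggests leniency, < 0 severity
    print(f"{name}: mean = {np.mean(r):.2f}, z = {z:+.2f}")
```

A screen like this is only a starting point; a Rasch-based analysis (as in the BMJ Open study below) separates examiner severity from student ability and station difficulty before flagging anyone.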


Subject(s)
Education, Medical; Students, Medical; Clinical Competence; Educational Measurement/methods; Feedback; Humans
12.
BMJ Open; 9(9): e029208, 2019 Sep 6.
Article in English | MEDLINE | ID: mdl-31494607

ABSTRACT

OBJECTIVES: Sources of bias, such as examiners, domains and stations, can influence student marks in an objective structured clinical examination (OSCE). This study describes the extent to which the facets modelled in an OSCE contribute to scoring variance and how well they fit a Many-Facet Rasch Model (MFRM) of OSCE performance. A further objective is to examine the functioning of the rating scale used. DESIGN: A non-experimental cross-sectional design. PARTICIPANTS AND SETTINGS: An MFRM was used to identify sources of error (eg, examiner, domain and station) that may influence student outcomes. A 16-station OSCE was conducted for 329 final-year medical students. Domain-based marking was applied, with each station using a sample from eight domains defined across the whole OSCE: communication skills, professionalism, information gathering, information giving, clinical interpretation, procedure, diagnosis and management. The domains in each station were weighted to ensure proper attention to the construct of the individual station. Four facets were assessed: students, examiners, domains and stations. RESULTS: The results suggest that the OSCE data fit the model, confirming that an MFRM approach was appropriate. The variable map allows comparison between the facets of students, examiners, domains and stations, together with the 5-point score for each domain within each station, as all are calibrated to the same scale. Fit statistics showed that the domains map well to the performance of the examiners. No statistically significant difference in examiner sensitivity (3.85 logits) was found. However, the results did suggest that examiners were lenient and that some behaved inconsistently. The results also suggest that the functioning of the response categories on the 5-point rating scale needs further examination and optimisation. CONCLUSIONS: The results of the study have important implications for examiner monitoring and training activities to aid assessment improvement.
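In rating-scale form, the MFRM gives the probability of each score category from the student, examiner and station/domain parameters plus category thresholds; the sketch below uses illustrative parameter values, not estimates from this study.

```python
# Many-Facet Rasch Model (rating-scale form): the log-odds of category k
# versus k-1 is ability minus examiner severity minus station/domain
# difficulty minus the category threshold. Parameter values are illustrative.
import numpy as np

def mfrm_category_probs(theta, severity, difficulty, thresholds):
    """Return P(score = k) for k = 0..K on one domain rating."""
    # Cumulative sums of (theta - severity - difficulty - tau_k) give the
    # log-numerators of the rating-scale model.
    steps = theta - severity - difficulty - np.asarray(thresholds)
    log_num = np.concatenate(([0.0], np.cumsum(steps)))
    num = np.exp(log_num - log_num.max())     # numerically stable softmax
    return num / num.sum()

# A 5-point scale implies four thresholds between adjacent categories.
probs = mfrm_category_probs(theta=1.0, severity=0.3, difficulty=-0.2,
                            thresholds=[-1.5, -0.5, 0.5, 1.5])
print(np.round(probs, 3))
```

Because all facets sit on the same logit scale, a severe examiner shifts the whole category distribution downward in exactly the way a harder station does, which is what lets the model disentangle the two.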


Subject(s)
Clinical Competence/standards; Education, Medical, Undergraduate; Educational Measurement/methods; Bias; Cross-Sectional Studies; Female; Humans; Male; Medical History Taking/standards; Models, Statistical; Physical Examination/standards; Psychometrics
16.
Acad Med; 93(5): 811, 2018 May.
Article in English | MEDLINE | ID: mdl-29280753
17.
Med Teach; 39(10): 1010-1015, 2017 Oct.
Article in English | MEDLINE | ID: mdl-28768456

ABSTRACT

As a medical educator, you may be directly or indirectly involved in the quality of assessments. Measurement plays a substantial role in developing the quality of assessment questions and of student learning, and the information provided by psychometric data can inform pedagogical decisions in medical education. Through measurement we are able to assess the learning experiences of students. Standard setting plays an important role in assessing the quality of students' performance as future doctors, and presenting performance data to standard setters can contribute to a credible and defensible pass mark. The validity and reliability of test scores are the most important factors in developing quality assessment questions. Analysing the answers to individual questions provides useful feedback that helps assessment leads improve the quality of each question and hence keep students' marks fair with respect to diversity and ethnicity. Item characteristic curves (ICCs) can signal to assessment leads which individual questions need improvement.
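An ICC plots the modelled probability of a correct answer against ability; the sketch below draws one under a two-parameter logistic model with invented parameter values.

```python
# Item characteristic curve (ICC) for one question under a two-parameter
# logistic model. Parameter values are illustrative.
import numpy as np
import matplotlib.pyplot as plt

theta = np.linspace(-4, 4, 200)           # ability scale in logits
a, b = 1.3, 0.4                           # discrimination, difficulty
p = 1.0 / (1.0 + np.exp(-a * (theta - b)))

plt.plot(theta, p)
plt.xlabel("Ability (logits)")
plt.ylabel("P(correct)")
plt.title("Item characteristic curve (illustrative)")
plt.show()
# A flat curve (low a) flags a weakly discriminating question; a curve shifted
# far to the right (high b) flags a very hard one.
```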


Subject(s)
Education, Medical, Undergraduate/methods; Educational Measurement; Learning; Students, Medical; Education, Medical; Humans; Psychometrics; Reproducibility of Results
18.
J Prim Care Community Health; 8(4): 294-299, 2017 Oct.
Article in English | MEDLINE | ID: mdl-28645236

ABSTRACT

OBJECTIVE: To examine the empathy levels of undergraduate medical students in Pakistan. Three hypotheses were developed from the literature review: (1) female medical students have a higher level of empathy than male students; (2) empathy scores vary across the medical school years in Pakistani students; and (3) medical students interested in people-oriented specialties score higher than those interested in technology-oriented specialties. METHODS: This quantitative study used a cross-sectional design with 1453 students from 8 Pakistani medical schools, both private and state. The sample consisted of 41.1% (n = 597) male students and 58.9% (n = 856) female students. Empirical data were collected using the Jefferson Scale of Physician Empathy (JSPE), a well-validated self-administered questionnaire. RESULTS: The mean empathy score among students was 4.77 with a standard deviation of 0.72. There was no statistically significant association between empathy scores and gender, t(1342.36) = -0.053, P = .95. There was a statistically significant difference in empathy scores across the years of medical school, F(4, 1448) = 4.95, P = .01. Concerning specialty interests, there was no statistically significant difference in empathy scores between specialty-interest groups. CONCLUSION: The findings suggest that medical students in Western countries score higher on the empathy scale than Pakistani medical students. This has important implications for Pakistani medical educators seeking to improve the interpersonal skills of medical students in the context of patient care. Contrary to our expectations and experience, the findings do not support the hypothesis that female medical students score higher than their male counterparts on the empathy scale. Because of the cross-sectional design, no conclusion can be drawn about a decline in empathy during medical school training.
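The two tests reported above are a Welch t-test (hence the fractional degrees of freedom) and a one-way ANOVA; the sketch below reproduces the shape of that analysis on simulated scores, not the study data.

```python
# Welch t-test for the gender comparison (unequal variances give fractional
# df, as in t(1342.36)) and one-way ANOVA across the five year groups (which,
# with N = 1453, gives df = 4 and 1448). Scores are simulated, not real.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
female = rng.normal(loc=4.77, scale=0.72, size=856)
male = rng.normal(loc=4.77, scale=0.72, size=597)
t, p = stats.ttest_ind(female, male, equal_var=False)   # Welch's t-test
print(f"gender: t = {t:.3f}, p = {p:.3f}")

year_groups = [rng.normal(4.77, 0.72, 290) for _ in range(5)]
f, p = stats.f_oneway(*year_groups)
print(f"years:  F = {f:.2f}, p = {p:.3f}")
```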


Subject(s)
Empathy; Physician-Patient Relations; Students, Medical; Cross-Sectional Studies; Education, Medical, Undergraduate; Female; Humans; Male; Pakistan; Sex Factors; Surveys and Questionnaires
20.
Acad Med; 91(9): 1324, 2016 Sep.
Article in English | MEDLINE | ID: mdl-27144991