ABSTRACT
BACKGROUND: Assessing undergraduate students with assessment instruments in the clinical setting is known to be complex. The aim of this study was therefore to examine whether two different assessment instruments, containing learning objectives (LOs) with similar content, result in similar assessments by clinical supervisors, and to explore clinical supervisors' experiences of assessment with the two instruments.
METHOD: A mixed-methods approach was used. Four simulated care encounter scenarios were evaluated by 50 supervisors using two different assessment instruments, and 28 follow-up interviews were conducted. Descriptive statistics and binary logistic regression were used for quantitative data analysis, along with qualitative thematic analysis of interview data.
RESULT: While significant differences were observed within the assessment instruments, the differences were consistent between the two instruments, indicating that the quality of the assessment instruments was considered equivalent. Supervisors noted that the relationship between students and supervisors could introduce subjectivity into the assessments and that working in groups of supervisors could be advantageous. For formative assessments, the Likert scale was considered a useful tool for evaluating learning objectives. However, supervisors had different views on grading scales and the need for clear definitions. The supervisors concluded that a complicated assessment instrument led to limited everyday use and did not facilitate formative feedback. Furthermore, supervisors discussed how their experiences influenced the use of the assessment instruments, which resulted in different descriptions of the experience. These differences led to a discussion of the need for supervisor teams to enhance the validity of assessments.
CONCLUSION: The findings showed no significant differences in pass/fail gradings between the two assessment instruments. The quantitative data suggest that supervisors struggled with subjectivity, phrasing, and definitions of the LOs and the scales used in both instruments. This resulted in arbitrary, time-consuming assessments and limited use of the instruments in day-to-day assessment. To mitigate subjectivity, supervisors suggested working in teams and conducting multiple assessments over time to increase assessment validity.