Results 1 - 10 of 10
1.
Adv Health Sci Educ Theory Pract ; 26(1): 313-328, 2021 03.
Article in English | MEDLINE | ID: mdl-32816242

ABSTRACT

In Canada, high-stakes objective structured clinical examinations (OSCEs) administered by the Medical Council of Canada have relied exclusively on physician examiners (PEs) for scoring. Prior research has examined using standardized patients (SPs) to replace PEs. This paper reports on two studies that implement and evaluate an SP scoring tool to augment PE scoring. The unique aspect of this study is that it explores the benefits of combining SP and PE scores. SP focus groups developed rating scales for four dimensions they labelled Listening, Communication, Empathy/Rapport, and Global Impression. In Study I, 43 SPs from one site of a national PE-scored OSCE rated 60 examinees with the initial SP rating scales. In Study II, 137 SPs used slightly revised rating scales with optional narrative comments to score 275 examinees at two sites. Examinees were blinded to SP scoring, and SP ratings did not count toward examination results. Separate PE and SP scoring was examined using descriptive statistics and correlations. Combinations of SP and PE scoring were assessed using pass rates, reliability, and decision consistency and accuracy indices. In Study II, SP and PE comments were also examined. SPs showed greater variability in their scoring and rated examinees lower than PEs on common elements, resulting in slightly lower pass rates when scores were combined. There was a moderate tendency for both SPs and PEs to make negative comments about the same examinee, but for different reasons. We argue that SPs and PEs assess performance from different perspectives, and that combining scores from both augments the overall reliability of scores and pass/fail decisions. There is potential to provide examinees with feedback comments from each group.
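
As a rough illustration of the kind of score-combination analysis described above, the sketch below computes descriptive statistics, the PE-SP correlation, and pass rates before and after combining scores. The data, weights, and cut score are simulated assumptions for illustration, not values from the studies.

```python
import numpy as np

rng = np.random.default_rng(0)
pe = rng.normal(70, 8, size=275)              # simulated PE percentage scores
sp = pe - 3 + rng.normal(0, 12, size=275)     # SPs: lower on average, more variable

print("PE mean/SD:", round(pe.mean(), 1), round(pe.std(ddof=1), 1))
print("SP mean/SD:", round(sp.mean(), 1), round(sp.std(ddof=1), 1))
print("PE-SP correlation:", round(np.corrcoef(pe, sp)[0, 1], 2))

w_pe, w_sp, cut = 0.8, 0.2, 60.0              # assumed weights and pass mark
combined = w_pe * pe + w_sp * sp
print("pass rate, PE only: ", (pe >= cut).mean())
print("pass rate, combined:", (combined >= cut).mean())
```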


Subject(s)
Clinical Competence/standards, Educational Measurement/methods, Faculty, Medical/standards, Patient Simulation, Canada, Communication, Educational Measurement/standards, Empathy, Humans, Reproducibility of Results
2.
Adv Health Sci Educ Theory Pract ; 20(3): 581-94, 2015 Aug.
Article in English | MEDLINE | ID: mdl-25164266

ABSTRACT

Examiner effects and content specificity are two well-known sources of construct-irrelevant variance that present great challenges in performance-based assessments. National medical organizations responsible for large-scale performance-based assessments face an additional challenge: they must administer qualification examinations to physician candidates at several locations and institutions. This study explores the impact of site location as a source of score variation in a large-scale national assessment used to measure the readiness of internationally educated physician candidates for residency programs. Data from the Medical Council of Canada's National Assessment Collaboration were analyzed using hierarchical linear modeling and Rasch analyses. Consistent with previous research, problematic variance due to examiner effects and content specificity was found. In addition, site location was identified as a potential source of construct-irrelevant variance in examination scores.
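
A minimal sketch of how site-level score variation might be examined with a random-intercept hierarchical model, in the spirit of the hierarchical linear modeling described above. The file and column names ("score", "site", "station") are hypothetical, and this is not the study's actual model specification.

```python
import pandas as pd
import statsmodels.formula.api as smf

# hypothetical export: one row per candidate-station score, with a "site" column
df = pd.read_csv("nac_osce_scores.csv")

# random intercept for site; the site-level variance component indicates how much
# score variation is associated with where the examination was taken
model = smf.mixedlm("score ~ C(station)", data=df, groups=df["site"])
result = model.fit()
print(result.summary())
```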


Subject(s)
Bias, Clinical Competence, Educational Measurement/standards, Physicians, Clinical Competence/statistics & numerical data, Female, Humans, Male, Models, Statistical
3.
Med Teach ; 36(7): 585-90, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24787530

ABSTRACT

BACKGROUND: Past research suggests that the use of externally applied scoring weights may not appreciably affect measurement qualities such as reliability or validity. Nonetheless, some credentialing boards and academic institutions apply differential scoring weights based on expert opinion about the relative importance of individual items or test components of objective structured clinical examinations (OSCEs). AIMS: To investigate the impact of simplified scoring models that make little or no use of differential weighting on the reliability of scores and decisions on a high-stakes OSCE required for medical licensure in Canada. METHOD: We applied four weighting models of varying complexity to data from three administrations of the OSCE. We compared score reliability, pass/fail rates, correlations between scores, and classification decision accuracy and consistency across the models and administrations. RESULTS: Less complex weighting models yielded reliability and pass rates similar to those of the most complex weighting model. Minimal changes in candidates' pass/fail status were observed, and there were strong, statistically significant correlations between the scores from all scoring models and administrations. Classification decision accuracy and consistency were very high and similar across the four scoring models. CONCLUSIONS: Adopting a simplified weighting scheme for this OSCE did not diminish its measurement qualities. Instead of developing complex weighting schemes, experts' time and effort could be better spent on other critical test development and assembly tasks, with little to no compromise in the quality of scores and decisions on this high-stakes OSCE.
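
The comparison of weighting schemes could be prototyped along the following lines. This sketch contrasts an equally weighted composite with a differentially weighted one on simulated station scores, reporting Cronbach's alpha, pass rates, and the correlation between composites; the weights, the 60% pass mark, and the data are illustrative assumptions rather than the study's scoring models.

```python
import numpy as np

def cronbach_alpha(parts):
    """Cronbach's alpha for an (examinees x components) score matrix."""
    k = parts.shape[1]
    return k / (k - 1) * (1 - parts.var(axis=0, ddof=1).sum()
                          / parts.sum(axis=1).var(ddof=1))

rng = np.random.default_rng(1)
ability = rng.normal(0, 1, 500)
stations = np.clip(60 + 10 * ability[:, None] + rng.normal(0, 8, (500, 12)), 0, 100)

equal_w = np.full(12, 1 / 12)               # simple average of the 12 stations
expert_w = rng.dirichlet(5 * np.ones(12))   # assumed differential "expert" weights

for name, w in [("equal ", equal_w), ("expert", expert_w)]:
    weighted = stations * w                 # weighted station components
    total = weighted.sum(axis=1)            # composite on the 0-100 scale
    print(name, "alpha:", round(cronbach_alpha(weighted), 3),
          "pass rate:", round((total >= 60).mean(), 3))

print("correlation between composites:",
      round(np.corrcoef(stations @ equal_w, stations @ expert_w)[0, 1], 3))
```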


Subject(s)
Clinical Competence/standards, Educational Measurement/standards, Licensure, Medical/standards, Canada, Checklist, Educational Measurement/methods, Educational Measurement/statistics & numerical data, Humans, Models, Educational, Reproducibility of Results
4.
J Contin Educ Health Prof ; 43(3): 155-163, 2023.
Article in English | MEDLINE | ID: mdl-37638679

ABSTRACT

INTRODUCTION: Evaluations of quality improvement programs show variable impact on physician performance and often neglect to examine how implementation varies across contexts and the mechanisms that affect uptake. Realist evaluation enables the generation, refinement, and testing of theories of change by unpacking what works, for whom, under what circumstances, and why. This study used realist methods to explore relationships between the outcomes, mechanisms (resources and reasoning), and context factors of a national multisource feedback (MSF) program. METHODS: Linked data for 50 physicians were examined to determine relationships between action plan completion status (outcomes), MSF ratings, MSF comments and prescribing data (resource mechanisms), a report summarizing the conversation between a facilitator and the physician (reasoning mechanism), and practice risk factors (context). Working backward from outcomes enabled exploration of similarities and differences in mechanisms and context. RESULTS: The derived model showed that the completion status of plans was influenced by the interaction of resource and reasoning mechanisms, with context mediating the relationships. Two patterns emerged. Physicians who implemented all their plans within six months received feedback with consistent messaging, reviewed their data ahead of facilitation, co-constructed their plan(s) with the facilitator, and had fewer risks to their competence (dyscompetence). Physicians who were unable to implement any plans received data with fewer repeated messages and did not incorporate these data into their plans, created plans that were difficult to achieve or required the involvement of others, developed their plans without the facilitator (physician-led), and were at higher risk for dyscompetence. DISCUSSION: Evaluations of quality improvement initiatives should examine program outcomes while taking into consideration the interplay of resources, reasoning, and risk factors for dyscompetence.

5.
J Contin Educ Health Prof ; 42(4): 243-248, 2022 10 01.
Article in English | MEDLINE | ID: mdl-34609355

ABSTRACT

INTRODUCTION: A new multisource feedback (MSF) program was specifically designed to support physician quality improvement (QI) around the CanMEDS roles of Collaborator, Communicator, and Professional. Quantitative ratings and qualitative comments are collected from a sample of physician colleagues, co-workers, and patients. These data are supplemented with self-ratings and given back to physicians in individualized reports. Each physician reviews the report with a trained feedback facilitator and creates one to three action plans for QI. This study explores how the content of the four aforementioned MSF program components supports the elicitation and translation of feedback into a QI plan for change. METHODS: Data included survey items, rater comments, a portion of facilitator reports, and action plan components for 159 physicians. Word frequency queries were used to identify common words and explore relationships among data sources. RESULTS: Overlap between high-frequency words in surveys and rater comments was substantial. The language used to describe goals in physician action plans was highly related to respondent comments, but less so to survey items. High-frequency words in facilitator reports were strongly related to action plan content. DISCUSSION: All components of the program relate to one another, indicating that each plays a part in the process. Patterns of overlap also suggest that components serve unique functions. This demonstration of coherence across the program's components is one piece of evidence supporting the program's validity.
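
A hedged sketch of the kind of word-frequency comparison described in the methods: count high-frequency words in two text sources and report their overlap. The file names, stop-word list, and top-50 threshold are placeholders, not the study's actual queries.

```python
from collections import Counter
import re

STOP = {"the", "a", "an", "and", "of", "to", "in", "with", "is", "are", "for", "on"}

def top_words(text, n=50):
    """Return the n most frequent non-stop words in a block of text."""
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP]
    return {w for w, _ in Counter(words).most_common(n)}

# hypothetical exported text files, one per program component
rater_comments = open("rater_comments.txt", encoding="utf-8").read()
action_plans = open("action_plans.txt", encoding="utf-8").read()

shared = top_words(rater_comments) & top_words(action_plans)
print(f"{len(shared)} of the top 50 words are shared:", sorted(shared))
```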


Subject(s)
Clinical Competence, Physicians, Humans, Feedback, Surveys and Questionnaires, Quality Improvement
6.
Med Sci Educ ; 32(6): 1439-1445, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36532388

ABSTRACT

High-stakes assessments must discriminate between examinees who are sufficiently competent to practice in the health professions and examinees who are not. In these settings, criterion-referenced standard-setting methods are strongly preferred over norm-referenced methods. While there are many criterion-referenced options, few are feasible or cost-effective for objective structured clinical examinations (OSCEs). The human and financial resources required to organize OSCEs alone are often significant, leaving little in an institution's budget for additional resource-intensive standard-setting methods. The modified borderline group method (BGM) introduced by Dauphinee et al. for a large-scale, multi-site OSCE is a very feasible option but is less defensible for smaller-scale OSCEs. This study compared the modified BGM to two adaptations that address its limitations for smaller-scale OSCEs while retaining its main benefit, feasibility. We evaluated the decision accuracy and consistency of cut scores derived from (1) the modified, (2) a regression-based, and (3) a four-facet Rasch model borderline group method. Data were from a 12-station OSCE that assessed 112 nurses for entry to practice in a Canadian context. The three cut scores (64-65%) all met acceptable standards of accuracy and consistency; however, the modified BGM was the most influenced by lower scores within the borderline group, leading to the lowest cut score. The two adaptations may be more defensible than the modified BGM in the context of a smaller (n < 100-150) OSCE.
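
For readers unfamiliar with the methods being compared, the sketch below illustrates two of them on simulated data: the modified borderline group cut score (the mean checklist score of examinees given a borderline global rating) and a regression-based cut score (the predicted checklist score at the borderline rating). The rating scale, the choice of category 3 as "borderline", and the data are assumptions; the four-facet Rasch approach is omitted because it requires dedicated many-facet Rasch software.

```python
import numpy as np

rng = np.random.default_rng(2)
rating = rng.integers(1, 6, 400)      # global ratings 1-5; 3 = "borderline" (assumed)
score = np.clip(40 + 8 * rating + rng.normal(0, 6, 400), 0, 100)  # checklist %

# modified borderline group method: mean score of the borderline-rated examinees
bgm_cut = score[rating == 3].mean()

# regression-based method: regress score on rating, predict at the borderline category
slope, intercept = np.polyfit(rating, score, 1)
reg_cut = intercept + slope * 3

print(f"modified BGM cut: {bgm_cut:.1f}%   regression-based cut: {reg_cut:.1f}%")
```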

7.
Acad Med ; 95(11S Association of American Medical Colleges Learn Serve Lead: Proceedings of the 59th Annual Research in Medical Education Presentations): S103-S108, 2020 11.
Article in English | MEDLINE | ID: mdl-32769463

ABSTRACT

PURPOSE: Accreditation aims to ensure that all training programs meet agreed-upon standards of quality. The process is complex, resource intensive, and costly. Its benefits are difficult to assess because contextual confounds obscure comparisons between systems that do and do not include accreditation. This study explores accreditation's influence "within system" by investigating the relationship between the accreditation cycle and performance on a national licensing examination. METHOD: Scores on the computer-based portion of the Medical Council of Canada Qualifying Examination Part I, from 1993 to 2017, were examined for all 17 Canadian medical schools. The examination is typically completed upon graduation from medical school; results within each year were transformed for comparability across administrations and linked to timing within each school's accreditation cycle. ANOVAs were used to assess the relationship between accreditation timing and examination scores. Secondary analyses isolated 4-year from 3-year training programs and separated data generated before versus after implementation of a national midcycle informal review program. RESULTS: Performance on the licensing examination was highest during and shortly after an accreditation site visit, falling significantly until the midpoint of the accreditation cycle (d = 0.47) before rising again. This pattern disappeared after the introduction of informal interim review, but too few data have accumulated since implementation to determine whether interim review is sufficient to break the influence of the accreditation cycle. CONCLUSIONS: Formal, externally driven accreditation cycles appear to be associated with educational processes in ways that translate into student outcomes on a national licensing examination. Whether informal, internal interim reviews can mediate this effect remains to be seen.
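
The core analysis could be sketched as follows: standardize scores within each administration year and test for differences across years of the accreditation cycle with a one-way ANOVA. File and column names are hypothetical, and this is a simplification of the analyses actually reported.

```python
import pandas as pd
from scipy import stats

# hypothetical export: one row per examinee with score, exam_year, cycle_year
df = pd.read_csv("mccqe1_scores.csv")

# z-transform within each administration year so scores are comparable across years
df["z"] = df.groupby("exam_year")["score"].transform(
    lambda s: (s - s.mean()) / s.std(ddof=1))

# one-way ANOVA: does mean standardized performance differ by accreditation-cycle year?
groups = [g["z"].to_numpy() for _, g in df.groupby("cycle_year")]
f_stat, p_value = stats.f_oneway(*groups)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```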


Subject(s)
Accreditation/standards, Clinical Competence, Licensure/standards, Canada
8.
Cogn Sci ; 32(2): 301-41, 2008 Mar.
Article in English | MEDLINE | ID: mdl-21635338

ABSTRACT

The goals of this study are to evaluate a relatively novel learning environment and to seek greater understanding of why human tutoring is so effective. This alternative learning environment consists of pairs of students collaboratively observing a videotape of another student being tutored. When this collaborative observation environment was compared with four other instructional methods (one-on-one human tutoring, observing tutoring individually, collaborating without observing, and studying alone), the results showed that students learned to solve physics problems just as effectively from observing tutoring collaboratively as the tutees who were being tutored individually. We explain the effectiveness of this learning environment by postulating that such a situation encourages learners to become active and constructive observers through interactions with a peer. In essence, collaborative observation combines the benefit of tutoring with the benefit of collaborating. The learning outcomes of the tutees and the collaborative observers, along with the tutoring dialogues, were used to further evaluate three hypotheses explaining why human tutoring is an effective learning method. Detailed analyses of the protocols at several grain sizes suggest that tutoring is effective when tutees independently or jointly construct knowledge with the tutor, but not when the tutor independently conveys knowledge.

9.
Clin Teach ; 10(1): 27-31, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23294740

ABSTRACT

BACKGROUND: Performance assessments rely on human judgment and are therefore vulnerable to rater effects (e.g. leniency or harshness). Making valid inferences from performance ratings for high-stakes decisions requires the management of rater effects. A simple method for detecting extreme raters that does not require sophisticated statistical knowledge or software has been developed as part of the quality assurance process for objective structured clinical examinations (OSCEs). We believe it is applicable to a range of examinations that rely on human raters. METHODS: The method has three steps. First, extreme raters are identified by comparing individual rater means with the mean of all raters. A rater is deemed extreme if their mean is three standard deviations below (hawks) or above (doves) the overall mean; this criterion is adjustable. Second, the distribution of an extreme rater's scores is compared with the overall distribution for the station. This step mitigates any station effect. Third, the cohort of candidates seen by the rater is examined to rule out any cohort effect. RESULTS AND IMPLICATIONS: Of more than 3000 raters, fewer than 0.3% have been identified as extreme using the proposed criteria. Rater performance is being monitored on a regular basis, and the impact of these raters on candidate results will be considered before results are finalised. Extreme raters are contacted by the organisation to review their rating style. If this intervention fails to modify the rater's scoring pattern, the rater is no longer invited back. As more data are collected, the organisation will assess them to inform the development of approaches to improving extreme rater performance.
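
Step 1 of the method can be expressed in a few lines. The sketch below flags raters whose mean score falls more than three standard deviations from the overall mean; the data layout is hypothetical, and using the spread of rater means as the yardstick is an assumption. The station-level (step 2) and cohort (step 3) checks would follow for any flagged rater.

```python
import pandas as pd

# hypothetical export: one row per rating with rater_id, candidate_id, score
df = pd.read_csv("osce_ratings.csv")

rater_means = df.groupby("rater_id")["score"].mean()
overall_mean = rater_means.mean()
spread = rater_means.std(ddof=1)        # spread of rater means (assumed yardstick)

hawks = rater_means[rater_means < overall_mean - 3 * spread]   # markedly harsh raters
doves = rater_means[rater_means > overall_mean + 3 * spread]   # markedly lenient raters
print("hawks:", list(hawks.index))
print("doves:", list(doves.index))
```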


Subject(s)
Education, Medical/standards, Educational Measurement/standards, Observer Variation, Clinical Competence, Humans
10.
Cogn Sci ; 36(1): 1-61, 2012.
Article in English | MEDLINE | ID: mdl-22050726

ABSTRACT

Studies exploring how students learn and understand science processes such as diffusion and natural selection typically find that students provide misconceived explanations of how the patterns of such processes arise (such as why giraffes' necks get longer over generations, or how ink dropped into water appears to "flow"). Instead of explaining the patterns of these processes as emerging from the collective interactions of all the agents (e.g., both the water and the ink molecules), students often explain the pattern as being caused by controlling agents with intentional goals, and they express a variety of other misconceived notions. In this article, we offer a hypothesis about what constitutes a misconceived explanation and why misconceived explanations are so prevalent, robust, and resistant to instruction, and we describe one approach by which they may be overcome. In particular, we hypothesize that students misunderstand many science processes because they rely on a generalized version of narrative schemas and scripts (referred to here as a Direct-causal Schema) to interpret them. For science processes that are sequential and stage-like, such as the cycles of the moon, the circulation of blood, the stages of mitosis, and photosynthesis, a Direct-causal Schema is adequate for correct understanding. However, for science processes that are non-sequential (or emergent), such as diffusion, natural selection, osmosis, and heat flow, using a Direct Schema to understand them will lead to robust misconceptions. Instead, a different type of general schema may be required to interpret non-sequential processes, which we refer to as an Emergent-causal Schema. We propose that students lack this Emergent Schema and that teaching it to them may help them learn and understand emergent kinds of science processes such as diffusion. Our study found that directly teaching students this Emergent Schema led to increased learning of the process of diffusion. This article presents a fine-grained characterization of each type of schema, our instructional intervention, the successes we have achieved, and the lessons we have learned.


Subject(s)
Comprehension, Concept Formation, Science/education, Students/psychology, Humans, Learning