Pesquisa | BVS Economia da Saúde

1.

Disrupted, but not derailed: A year of scholarship in health professions education.

Barrett, Aileen; Eva, Kevin.

Clin Teach ; 18 Suppl 1: 6, 2021 12.

Artigo em Inglês | MEDLINE | ID: mdl-34813155

Assuntos

Bolsas de Estudo , Ocupações em Saúde , Humanos

2.

Idiosyncrasy in Assessment Comments: Do Faculty Have Distinct Writing Styles When Completing In-Training Evaluation Reports?

Ginsburg, Shiphra; Gingerich, Andrea; Kogan, Jennifer R; Watling, Christopher J; Eva, Kevin W.

Acad Med ; 95(11S Association of American Medical Colleges Learn Serve Lead: Proceedings of the 59th Annual Research in Medical Education Presentations): S81-S88, 2020 11.

Artigo em Inglês | MEDLINE | ID: mdl-32769454

RESUMO

PURPOSE: Written comments are gaining traction as robust sources of assessment data. Compared with the structure of numeric scales, what faculty choose to write is ad hoc, leading to idiosyncratic differences in what is recorded. This study offers exploration of what aspects of writing styles are determined by the faculty offering comment and what aspects are determined by the trainee being commented upon. METHOD: The authors compiled in-training evaluation report comment data, generated from 2012 to 2015 by 4 large North American Internal Medicine training programs. The Linguistic Index and Word Count (LIWC) was used to categorize and quantify the language contained. Generalizability theory was used to determine whether faculty could be reliably discriminated from one another based on writing style. Correlations and ANOVAs were used to determine what styles were related to faculty or trainee demographics. RESULTS: Datasets contained 23-142 faculty who provided 549-2,666 assessments on 161-989 trainees. Faculty could easily be discriminated from one another using a variety of LIWC metrics including word count, words per sentence, and the use of "clout" words. These patterns appeared person specific and did not reflect demographic factors such as gender or rank. These metrics were similarly not consistently associated with trainee factors such as postgraduate year or gender. CONCLUSIONS: Faculty seem to have detectable writing styles that are relatively stable across the trainees they assess, which may represent an under-recognized source of construct irrelevance. If written comments are to meaningfully contribute to decision making, we need to understand and account for idiosyncratic writing styles.

Assuntos

Avaliação Educacional/métodos , Avaliação Educacional/normas , Docentes de Medicina , Medicina Interna/educação , Redação/normas

3.

Examinee Cohort Size and Item Analysis Guidelines for Health Professions Education Programs: A Monte Carlo Simulation Study.

Aubin, André-Sébastien; Young, Meredith; Eva, Kevin; St-Onge, Christina.

Acad Med ; 95(1): 151-156, 2020 01.

Artigo em Inglês | MEDLINE | ID: mdl-31335813

RESUMO

PURPOSE: Using item analyses is an important quality-monitoring strategy for written exams. Authors urge caution as statistics may be unstable with small cohorts, making application of guidelines potentially detrimental. Given the small cohorts common in health professions education, this study's aim was to determine the impact of cohort size on outcomes arising from the application of item analysis guidelines. METHOD: The authors performed a Monte Carlo simulation study in fall 2015 to examine the impact of applying 2 commonly used item analysis guidelines on the proportion of items removed and overall exam reliability as a function of cohort size. Three variables were manipulated: Cohort size (6 levels), exam length (6 levels), and exam difficulty (3 levels). Study parameters were decided based on data provided by several Canadian medical schools. RESULTS: The analyses showed an increase in proportion of items removed with decreases in exam difficulty and decreases in cohort size. There was no effect of exam length on this outcome. Exam length had a greater impact on exam reliability than did cohort size after applying item analysis guidelines. That is, exam reliability decreased more with shorter exams than with smaller cohorts. CONCLUSIONS: Although program directors and assessment creators have little control over their cohort sizes, they can control the length of their exams. Creating longer exams makes it possible to remove items without as much negative impact on the exam's reliability relative to shorter exams, thereby reducing the negative impact of small cohorts when applying item removal guidelines.

Assuntos

Currículo/normas , Avaliação Educacional/normas , Ocupações em Saúde/educação , Faculdades de Medicina/estatística & dados numéricos , Canadá/epidemiologia , Estudos de Coortes , Avaliação Educacional/estatística & dados numéricos , Estudos de Avaliação como Assunto , Guias como Assunto , Ocupações em Saúde/normas , Humanos , Método de Monte Carlo , Psicometria/métodos , Reprodutibilidade dos Testes , Fatores de Tempo

4.

Remote assessment via video evaluation (RAVVE): a pilot study to trial video-enabled peer feedback on clinical performance.

Ho, Kendall; Yao, Christopher; Novak Lauscher, Helen; Koehler, Barry E; Shojania, Kamran; Jamal, Shahin; Collins, David; Kherani, Raheem; Meneilly, Graydon; Eva, Kevin.

BMC Med Educ ; 19(1): 466, 2019 Dec 18.

Artigo em Inglês | MEDLINE | ID: mdl-31852496

RESUMO

BACKGROUND: Video review processes for evaluation and coaching are often incorporated into medical education as a means to accurately capture physician-patient interactions. Compared to direct observation they offer the advantage of overcoming many logistical challenges. However, the suitability and viability of using video-based peer consultations for professional development requires further investigation. This study aims to explore the acceptability and feasibility of video-based peer feedback to support professional development and quality improvement in patient care. METHODS: Five rheumatologists each provided four videos of patient consultations. Peers evaluated the videos using five-point scales, providing annotations in the video recordings, and offering recommendations. The rheumatologists reviewed the videos of their own four patient interactions along with the feedback. They were asked to document if they would make practice changes based on the feedback. Focus groups were conducted and analysed to explore the effectiveness of video-based peer feedback in assisting physicians to improve clinical practice. RESULTS: Participants felt the video-based feedback provided accurate and detailed information in a more convenient, less intrusive manner than direct observation. Observations made through video review enabled participants to evaluate more detailed information than a chart review alone. Participants believed that reviewing recorded consultations allowed them to reflect on their practice and gain insight into alternative communication methods. CONCLUSIONS: Video-based peer feedback and self-review of clinical performance is an acceptable and pragmatic approach to support professional development and improve clinical care among peer clinicians. Further investigation into the effectiveness of this approach is needed.

Assuntos

Feedback Formativo , Grupo Associado , Gravação em Vídeo , Competência Clínica , Feminino , Grupos Focais , Humanos , Masculino , Projetos Piloto , Encaminhamento e Consulta , Reumatologia , Inquéritos e Questionários

5.

A Reflection Upon the Impact of Early 21st-Century Technological Innovations on Medical School Admissions.

Hanson, Mark D; Eva, Kevin W.

Acad Med ; 94(5): 640-644, 2019 05.

Artigo em Inglês | MEDLINE | ID: mdl-30640267

RESUMO

The authors describe influences associated with the incorporation of modern technologies into medical school admissions processes. Their purpose is not to critique or support specific technologies but, rather, to prompt reflection on the evolution that is afoot. Technology is now integral to the administration of multiple admissions tools, including the Medical College Admission Test, situational judgment tests, and standardized video interviews. Consequently, today's admissions landscape is transforming into an online, globally interconnected marketplace for health professions admissions tools. Academic capitalism and distance-based technologies combine to enable global marketing and dissemination of admissions tests beyond the national jurisdictions in which they are designed. As predicted by disruptive business theory, they are becoming key drivers of transformative change. The seeds of technological disruption are present now rather than something to be wary of in the future. The authors reflect on this transformation and the need for tailoring test modifications to address issues of medical student diversity and social responsibility. They comment on the online assessment of applicants' personal competencies and the potential detriments if this method were to replace admissions methods involving human contact, thanks to the ease with which institutions can implement them without cost to themselves and without adequate consideration of measurement utility or contextual appropriateness. The authors advocate for socially responsible academic capitalism within this interconnected admissions marketplace: Attending to today's transformative challenges may inform how health professions education responds to tomorrow's admissions technologies and, in turn, how tomorrow's health professionals respond to their patients' needs.

Assuntos

Educação Médica/história , Invenções/história , Critérios de Admissão Escolar/estatística & dados numéricos , Faculdades de Medicina/história , Educação Médica/estatística & dados numéricos , História do Século XXI , Humanos , Invenções/estatística & dados numéricos , Faculdades de Medicina/estatística & dados numéricos

6.

Asking for Less and Getting More: The Impact of Broadening a Rater's Focus in Formative Assessment.

Tavares, Walter; Sadowski, Alexander; Eva, Kevin W.

Acad Med ; 93(10): 1584-1590, 2018 10.

Artigo em Inglês | MEDLINE | ID: mdl-29794523

RESUMO

PURPOSE: There may be unintended consequences of broadening the competencies across which health professions trainees are assessed. This study was conducted to determine whether such broadening influences the formative guidance assessors provide to trainees and to test whether sequential collection of competency-specific assessment can overcome setbacks of simultaneous collection. METHOD: A randomized between-subjects experimental design, conducted in Toronto and Halifax, Canada, in 2016-2017 with paramedic educators experienced in observing/rating, in which observers' focus was manipulated. In the simultaneous condition, participants rated four unscripted (i.e., spontaneously generated) clinical performances using a six-dimension global rating scale and provided feedback. In three sequential conditions, participants were asked to rate the same performances and provide feedback but for only two of the six dimensions. Participants from these conditions were randomly merged to create a "full score" and set of feedback statements for each candidate. RESULTS: Eighty-seven raters completed the study; 23 in the simultaneous condition and 21 or 22 for each pair of dimensions in the sequential conditions. After randomly merging participants, there were 21 "full scores" in the sequential condition. Compared with the sequential condition, participants in the simultaneous condition demonstrated reductions in the amount of unique feedback provided, increased likelihood of ignoring some dimensions of performance, lessened variety of feedback, and reduced reliability. CONCLUSIONS: Sequential or distributed assessment strategies in which raters are asked to focus on less may provide more effective assessment by overcoming the unintended consequences of asking raters to spread their attention thinly over many dimensions of competence.

Assuntos

Pessoal Técnico de Saúde/educação , Competência Clínica , Avaliação Educacional/métodos , Feedback Formativo , Canadá , Humanos

7.

Constructing critical thinking in health professional education.

Kahlke, Renate; Eva, Kevin.

Perspect Med Educ ; 7(3): 156-165, 2018 06.

Artigo em Inglês | MEDLINE | ID: mdl-29619664

RESUMO

INTRODUCTION: Calls for enabling 'critical thinking' are ubiquitous in health professional education. However, there is little agreement in the literature or in practice as to what this term means and efforts to generate a universal definition have found limited traction. Moreover, the variability observed might suggest that multiplicity has value that the quest for universal definitions has failed to capture. In this study, we sought to map the multiple conceptions of critical thinking in circulation in health professional education to understand the relationships and tensions between them. METHODS: We used an inductive, qualitative approach to explore conceptions of critical thinking with educators from four health professions: medicine, nursing, pharmacy, and social work. Four participants from each profession participated in two individual in-depth semi-structured interviews, the latter of which induced reflection on a visual depiction of results generated from the first set of interviews. RESULTS: Three main conceptions of critical thinking were identified: biomedical, humanist, and social justice-oriented critical thinking. 'Biomedical critical thinking' was the dominant conception. While each conception had distinct features, the particular conceptions of critical thinking espoused by individual participants were not stable within or between interviews. DISCUSSION: Multiple conceptions of critical thinking likely offer educators the ability to express diverse beliefs about what 'good thinking' means in variable contexts. The findings suggest that any single definition of critical thinking in the health professions will be inherently contentious and, we argue, should be. Such debates, when made visible to educators and trainees, can be highly productive.

Assuntos

Currículo/tendências , Ocupações em Saúde/educação , Pensamento , Humanos , Entrevistas como Assunto/métodos , Pesquisa Qualitativa

8.

The Hidden Value of Narrative Comments for Assessment: A Quantitative Reliability Analysis of Qualitative Data.

Ginsburg, Shiphra; van der Vleuten, Cees P M; Eva, Kevin W.

Acad Med ; 92(11): 1617-1621, 2017 11.

Artigo em Inglês | MEDLINE | ID: mdl-28403004

RESUMO

PURPOSE: In-training evaluation reports (ITERs) are ubiquitous in internal medicine (IM) residency. Written comments can provide a rich data source, yet are often overlooked. This study determined the reliability of using variable amounts of commentary to discriminate between residents. METHOD: ITER comments from two cohorts of PGY-1s in IM at the University of Toronto (graduating 2010 and 2011; n = 46-48) were put into sets containing 15 to 16 residents. Parallel sets were created: one with comments from the full year and one with comments from only the first three assessments. Each set was rank-ordered by four internists external to the program between April 2014 and May 2015 (n = 24). Generalizability analyses and a decision study were performed. RESULTS: For the full year of comments, reliability coefficients averaged across four rankers were G = 0.85 and G = 0.91 for the two cohorts. For a single ranker, G = 0.60 and G = 0.73. Using only the first three assessments, reliabilities remained high at G = 0.66 and G = 0.60 for a single ranker. In a decision study, if two internists ranked the first three assessments, reliability would be G = 0.80 and G = 0.75 for the two cohorts. CONCLUSIONS: Using written comments to discriminate between residents can be extremely reliable even after only several reports are collected. This suggests a way to identify residents early on who may require attention. These findings contribute evidence to support the validity argument for using qualitative data for assessment.

Assuntos

Competência Clínica , Medicina Interna/educação , Internato e Residência , Narração , Avaliação Educacional , Humanos , Reprodutibilidade dos Testes

9.

Cracking the code: residents' interpretations of written assessment comments.

Ginsburg, Shiphra; van der Vleuten, Cees Pm; Eva, Kevin W; Lingard, Lorelei.

Med Educ ; 51(4): 401-410, 2017 Apr.

Artigo em Inglês | MEDLINE | ID: mdl-28093833

RESUMO

CONTEXT: Interest is growing in the use of qualitative data for assessment. Written comments on residents' in-training evaluation reports (ITERs) can be reliably rank-ordered by faculty attendings, who are adept at interpreting these narratives. However, if residents do not interpret assessment comments in the same way, a valuable educational opportunity may be lost. OBJECTIVES: Our purpose was to explore residents' interpretations of written assessment comments using mixed methods. METHODS: Twelve internal medicine (IM) postgraduate year 2 (PGY2) residents were asked to rank-order a set of anonymised PGY1 residents (n = 48) from a previous year in IM based solely on their ITER comments. Each PGY1 was ranked by four PGY2s; generalisability theory was used to assess inter-rater reliability. The PGY2s were then interviewed separately about their rank-ordering process, how they made sense of the comments and how they viewed ITERs in general. Interviews were analysed using constructivist grounded theory. RESULTS: Across four PGY2 residents, the G coefficient was 0.84; for a single resident it was 0.56. Resident rankings correlated extremely well with faculty member rankings (r = 0.90). Residents were equally adept at reading between the lines to construct meaning from the comments and used language cues in ways similarly reported in faculty attendings. Participants discussed the difficulties of interpreting vague language and provided perspectives on why they thought it occurs (time, discomfort, memorability and the permanency of written records). They emphasised the importance of face-to-face discussions, the relative value of comments over scores, staff-dependent variability of assessment and the perceived purpose and value of ITERs. They saw particular value in opportunities to review an aggregated set of comments. CONCLUSIONS: Residents understood the 'hidden code' in assessment language and their ability to rank-order residents based on comments matched that of faculty. Residents seemed to accept staff-dependent variability as a reality. These findings add to the growing evidence that supports the use of narrative comments and subjectivity in assessment.

Assuntos

Competência Clínica , Avaliação Educacional , Medicina Interna/educação , Internato e Residência , Avaliação Educacional/métodos , Docentes de Medicina , Humanos , Narração , Reprodutibilidade dos Testes , Redação

10.

Towards a program of assessment for health professionals: from training into practice.

Eva, Kevin W; Bordage, Georges; Campbell, Craig; Galbraith, Robert; Ginsburg, Shiphra; Holmboe, Eric; Regehr, Glenn.

Adv Health Sci Educ Theory Pract ; 21(4): 897-913, 2016 Oct.

Artigo em Inglês | MEDLINE | ID: mdl-26590984

RESUMO

Despite multifaceted attempts to "protect the public," including the implementation of various assessment practices designed to identify individuals at all stages of training and practice who underperform, profound deficiencies in quality and safety continue to plague the healthcare system. The purpose of this reflections paper is to cast a critical lens on current assessment practices and to offer insights into ways in which they might be adapted to ensure alignment with modern conceptions of health professional education for the ultimate goal of improved healthcare. Three dominant themes will be addressed: (1) The need to redress unintended consequences of competency-based assessment; (2) The potential to design assessment systems that facilitate performance improvement; and (3) The importance of ensuring authentic linkage between assessment and practice. Several principles cut across each of these themes and represent the foundational goals we would put forward as signposts for decision making about the continued evolution of assessment practices in the health professions: (1) Increasing opportunities to promote learning rather than simply measuring performance; (2) Enabling integration across stages of training and practice; and (3) Reinforcing point-in-time assessments with continuous professional development in a way that enhances shared responsibility and accountability between practitioners, educational programs, and testing organizations. Many of the ideas generated represent suggestions for strategies to pilot test, for infrastructure to build, and for harmonization across groups to be enabled. These include novel strategies for OSCE station development, formative (diagnostic) assessment protocols tailored to shed light on the practices of individual clinicians, the use of continuous workplace-based assessment, and broadening the focus of high-stakes decision making beyond determining who passes and who fails. We conclude with reflections on systemic (i.e., cultural) barriers that may need to be overcome to move towards a more integrated, efficient, and effective system of assessment.

Assuntos

Avaliação Educacional , Ocupações em Saúde , Educação Baseada em Competências , Humanos , Segurança do Paciente , Melhoria de Qualidade

11.

Product analysis and initial reliability testing of the total mesorectal excision-quality assessment instrument.

Simunovic, Marko R; DeNardi, Franco G; Coates, Angela J; Szalay, David A; Eva, Kevin W.

Ann Surg Oncol ; 21(7): 2274-9, 2014 Jul.

Artigo em Inglês | MEDLINE | ID: mdl-24590437

RESUMO

BACKGROUND: Product analysis of rectal cancer resection specimens before specimen fixation may provide an immediate and relevant evaluation of surgical performance. We tested the interrater reliability (IRR) of a product analysis tool called the Total Mesorectal Excision-Quality Assessment Instrument (TME-QA). METHODS: Participants included two gold standard raters, five pathology assistants, and eight pathologists. Domains of the TME-QA reflect total mesorectal excision principles including: (1) completeness of mesorectal margin; (2) completeness of mesorectum; (3) coning of distal mesorectum; (4) physical defects; and (5) overall specimen quality. Specimens were scored independently. We used the generalizability theory to assess the tool's internal consistency and IRR. RESULTS: There were 39 specimens and 120 ratings. Mean overall specimen quality scores for the gold standard raters, pathologists, and assistants were 4.43, 4.43, and 4.50, respectively (p > 0.85). IRR for the first nine items was 0.68 for the full sample, 0.62 for assistants alone, 0.63 for pathologists alone, and 0.74 for gold standard raters alone. IRR for the item overall specimen quality was 0.67 for the full sample, 0.45 for assistants, 0.80 for pathologists, and 0.86 for gold standard raters. IRR increased for all groups when scores were averaged across two raters. CONCLUSIONS: Assessment of surgical specimens using the TME-QA may provide rapid and relevant feedback to surgeons about their technical performance. Our results show good internal consistency and IRR when the TME-QA is used by pathologists. However, for pathology assistants, multiple ratings with the averaging of scores may be needed.

Assuntos

Procedimentos Cirúrgicos do Sistema Digestório/normas , Patologia Clínica/normas , Guias de Prática Clínica como Assunto/normas , Indicadores de Qualidade em Assistência à Saúde/normas , Neoplasias Retais/cirurgia , Humanos , Prognóstico , Neoplasias Retais/patologia , Reprodutibilidade dos Testes

12.

Simulation-based assessment of paramedics and performance in real clinical contexts.

Tavares, Walter; LeBlanc, Vicki R; Mausz, Justin; Sun, Victor; Eva, Kevin W.

Prehosp Emerg Care ; 18(1): 116-22, 2014.

Artigo em Inglês | MEDLINE | ID: mdl-23961742

RESUMO

OBJECTIVE: The objective of this study was to seek validity evidence for simulation-based assessments (SBA) of paramedics by asking to what extent the measurements obtained in SBA of clinical competence are associated with measurements obtained in actual paramedic contexts, with real patients. METHODS: This prospective observational study involved analyzing the assessment of paramedic trainees at the entry-to-practice level in both simulation- and workplace-based settings. The SBA followed an OSCE structure involving full clinical cases from initial patient contact to transport or transfer of care. The workplace-based assessment (WBA) involved rating samples of clinical performance during real clinical encounters while assigned to an emergency medical service. For each candidate, both assessments were completed during a 3-week period at the end of their training. Raters in the SBA and WBA settings used the same paramedic-specific seven-dimension global rating scale. Reliability was calculated and decision studies were completed using generalizability theory. Associations between settings (overall and by dimension) were calculated using Pearson's correlation. RESULTS: A total of 49 paramedic trainees were assessed using both a SBA and WBA. The mean score in the SBA and WBA settings were 4.88 (SD = 0.68) and 5.39 (SD = 0.48), respectively, out of a possible 7. Reliability for the SBA and WBA settings reached 0.55 and 0.49, respectively. A decision study revealed 10 and 13 cases would be needed to reach a reliability of 0.7 for the SBA and WBA settings. Pearson correlation reached 0.37 (p = 0.01) between settings, which rose to 0.73 when controlling for imperfect reliability; five of seven dimensions (situation awareness, history gathering, patient assessment, decision making, and communication) reaching significance. Two dimensions (resource utilization and procedural skills) did not reach significance. CONCLUSION: For five of the seven dimensions believed to represent the construct of paramedic clinical performance, scores obtained in the SBA were associated with scores obtained in real clinical contexts with real patients. As SBAs are often used to infer clinical competence and predict future clinical performance, this study contributes validity evidence to support these claims as long as the importance of sampling performance broadly and extensively is appreciated and implemented.

Assuntos

Pessoal Técnico de Saúde/educação , Avaliação Educacional/métodos , Competência Profissional , Análise e Desempenho de Tarefas , Adulto , Feminino , Humanos , Masculino , Estudos Prospectivos

13.

Comparing diagnostic performance and the utility of clinical vignette-based assessment under testing conditions designed to encourage either automatic or analytic thought.

Ilgen, Jonathan S; Bowen, Judith L; McIntyre, Lucas A; Banh, Kenny V; Barnes, David; Coates, Wendy C; Druck, Jeffrey; Fix, Megan L; Rimple, Diane; Yarris, Lalena M; Eva, Kevin W.

Acad Med ; 88(10): 1545-51, 2013 Oct.

Artigo em Inglês | MEDLINE | ID: mdl-23969355

RESUMO

PURPOSE: Although decades of research have yielded considerable insight into physicians' clinical reasoning processes, assessing these processes remains challenging; thus, the authors sought to compare diagnostic performance and the utility of clinical vignette-based assessment under testing conditions designed to encourage either automatic or analytic thought. METHOD: This 2011-2012 multicenter randomized study of 393 clinicians (medical students, postgraduate trainees, and faculty) measured diagnostic accuracy on clinical vignettes under two conditions: one encouraged participants to give their first impression (FI), and the other led participants through a directed search (DS) for the correct diagnosis. The authors compared accuracy, feasibility, reliability, and relation to United States Medical Licensing Exam (USMLE) scores under each condition. RESULTS: A 2 (instructional condition) × 2 (vignette complexity) × 3 (experience level) analysis of variance revealed no difference in accuracy as a function of instructional condition (F[1,379] = 2.44, P = .12), but demonstrated the expected main effects of vignette complexity (F[1,379] = 965.2, P < .001) and experience (F[2,379] = 39.6, P < .001). Pearson correlations revealed greater associations between assessment scores and USMLE performance in the FI condition than in the DS condition (P < .001). Spearman-Brown calculations consistently indicated that alpha ≥ 0.75 could be achieved more efficiently under the FI condition relative to the DS condition. CONCLUSIONS: Instructions to trust one's first impres-sions result in similar performance when compared with instructions to consider clinical information in a systematic fashion, but have greater utility when used for the purposes of assessment.

Assuntos

Diagnóstico , Educação Médica/métodos , Avaliação Educacional/métodos , Medicina de Emergência/educação , Medicina Interna/educação , Adulto , Competência Clínica , Estudos Transversais , Feminino , Humanos , Masculino , Reprodutibilidade dos Testes , Estados Unidos

14.

Global rating scale for the assessment of paramedic clinical competence.

Tavares, Walter; Boet, Sylvain; Theriault, Rob; Mallette, Tony; Eva, Kevin W.

Prehosp Emerg Care ; 17(1): 57-67, 2013.

Artigo em Inglês | MEDLINE | ID: mdl-22834959

RESUMO

OBJECTIVE: The aim of this study was to develop and critically appraise a global rating scale (GRS) for the assessment of individual paramedic clinical competence at the entry-to-practice level. METHODS: The development phase of this study involved task analysis by experts, contributions from a focus group, and a modified Delphi process using a national expert panel to establish evidence of content validity. The critical appraisal phase had two raters apply the GRS, developed in the first phase, to a series of sample performances from three groups: novice paramedic students (group 1), paramedic students at the entry-to-practice level (group 2), and experienced paramedics (group 3). Using data from this process, we examined the tool's reliability within each group and tested the discriminative validity hypothesis that higher scores would be associated with higher levels of training and experience. RESULTS: The development phase resulted in a seven-dimension, seven-point adjectival GRS. The two independent blinded raters scored 81 recorded sample performances (n = 25 in group 1, n = 33 in group 2, n = 23 in group 3) using the GRS. For groups 1, 2, and 3, respectively, interrater reliability reached 0.75, 0.88, and 0.94. Intrarater reliability reached 0.94 and the internal consistency ranged from 0.53 to 0.89. Rater differences contributed 0-5.7% of the total variance. The GRS scores assigned to each group increased with level of experience, both using the overall rating (means = 2.3, 4.1, 5.0; p < 0.001) and considering each dimension separately. Applying a modified borderline group method, 54.9% of group 1, 13.4% of group 2, and 2.9% of group 3 were below the cut score. CONCLUSION: The results of this study provide evidence that the scores generated using this scale can be valid for the purpose of making decisions regarding paramedic clinical competence.

Assuntos

Competência Clínica/normas , Avaliação Educacional/métodos , Auxiliares de Emergência/normas , Análise de Variância , Técnica Delphi , Avaliação Educacional/normas , Auxiliares de Emergência/educação , Feminino , Grupos Focais , Humanos , Masculino , Variações Dependentes do Observador , Ontário , Reprodutibilidade dos Testes , Análise e Desempenho de Tarefas , Gravação em Vídeo

15.

Scylla or Charybdis? Can we navigate between objectification and judgement in assessment?

Eva, Kevin W; Hodges, Brian D.

Med Educ ; 46(9): 914-9, 2012 Sep.

Artigo em Inglês | MEDLINE | ID: mdl-22891912

Assuntos

Competência Clínica/normas , Medicina Baseada em Evidências/métodos , Julgamento , Exame Físico/métodos , Medicina Baseada em Evidências/normas , Humanos , Internet , Exame Físico/psicologia , Exame Físico/normas

16.

The readiness for clerkship survey: can self-assessment data be used to evaluate program effectiveness?

Peterson, Linda N; Eva, Kevin W; Rusticus, Shayna A; Lovato, Chris Y.

Acad Med ; 87(10): 1355-60, 2012 Oct.

Artigo em Inglês | MEDLINE | ID: mdl-22914522

RESUMO

PURPOSE: To examine whether or not aggregated self-assessment data of clerkship readiness can provide meaningful sources of information to evaluate the effectiveness of an educational program. METHOD: The 39-item Readiness for Clerkship survey was developed during academic year 2009-2010 using several key competence documents and expert review. The survey was completed by two cohorts of students (179 from the class of 2011 in February 2010, 171 from the class of 2012 in November 2010) and of clinical preceptors (384 for class of 2011 preceptors, 419 for class of 2012 preceptors). Descriptive statistics, Pearson correlations coefficients, ANOVA, and generalizability and decision studies were used to determine whether ratings could differentiate between different aspects of a training program. RESULTS: When self-assessments were aggregated across students, their judgments aligned very well with those of faculty raters. The correlation of average scores, calculated for each item between faculty and students, was r=0.88 for 2011 and r=0.91 for 2012. This was only slightly lower than the near-perfect correlations of item averages within groups across successive years (r=0.99 for faculty; r=0.98 for students). Generalizability and decision analyses revealed that one can achieve interrater reliability in this domain with fewer students (9-21) than faculty (26-45). CONCLUSIONS: These results provide evidence that, when aggregated, student self-assessment data from the Readiness for Clerkship Survey provide valid data for use in program evaluation that align well with an external standard.

Assuntos

Estágio Clínico , Educação de Graduação em Medicina/normas , Avaliação Educacional/métodos , Avaliação de Programas e Projetos de Saúde/métodos , Autoavaliação (Psicologia) , Inquéritos e Questionários , Análise de Variância , Colúmbia Britânica , Competência Clínica , Análise Fatorial , Humanos , Modelos Estatísticos , Variações Dependentes do Observador

17.

Adjusting our lens: can developmental differences in diagnostic reasoning be harnessed to improve health professional and trainee assessment?

Ilgen, Jonathan S; Bowen, Judith L; Yarris, Lalena M; Fu, Rongwei; Lowe, Robert A; Eva, Kevin.

Acad Emerg Med ; 18 Suppl 2: S79-86, 2011 Oct.

Artigo em Inglês | MEDLINE | ID: mdl-21999563

RESUMO

OBJECTIVES: Research in cognition has yielded considerable understanding of the diagnostic reasoning process and its evolution during clinical training. This study sought to determine whether or not this literature could be used to improve the assessment of trainees' diagnostic skill by manipulating testing conditions that encourage different modes of reasoning. METHODS: The authors developed an online, vignette-based instrument with two sets of testing instructions. The "first impression" condition encouraged nonanalytic responses while the "directed search" condition prompted structured analytic responses. Subjects encountered six cases under the first impression condition and then six cases under the directed search condition. Each condition had three straightforward (simple) and three ambiguous (complex) cases. Subjects were stratified by clinical experience: novice (third- and fourth-year medical students), intermediate (postgraduate year [PGY] 1 and 2 residents), and experienced (PGY 3 residents and faculty). Two investigators scored the exams independently. Mean diagnostic accuracies were calculated for each group. Differences in diagnostic accuracy and reliability of the examination as a function of the predictor variables were assessed. RESULTS: The examination was completed by 115 subjects. Diagnostic accuracy was significantly associated with the independent variables of case complexity, clinical experience, and testing condition. Overall, mean diagnostic accuracy and the extent to which the test consistently discriminated between subjects (i.e., yielded reliable scores) was higher when participants were given directed search instructions than when they were given first impression instructions. In addition, the pattern of reliability was found to depend on experience: simple cases offered the best reliability for discriminating between novices, complex cases offered the best reliability for discriminating between intermediate residents, and neither type of case discriminated well between experienced practitioners. CONCLUSIONS: These results yield concrete guidance regarding test construction for the purpose of diagnostic skill assessment. The instruction strategy and complexity of cases selected should depend on the experience level and breadth of experience of the subjects one is attempting to assess.

Assuntos

Competência Clínica , Cognição , Erros de Diagnóstico/prevenção & controle , Educação Médica/métodos , Avaliação Educacional , Medicina de Emergência/educação , Análise de Variância , Estudos Transversais , Feminino , Humanos , Masculino , Modelos Educacionais , Reprodutibilidade dos Testes

18.

Tensions in informed self-assessment: how the desire for feedback and reticence to collect and use it can conflict.

Mann, Karen; van der Vleuten, Cees; Eva, Kevin; Armson, Heather; Chesluk, Ben; Dornan, Timothy; Holmboe, Eric; Lockyer, Jocelyn; Loney, Elaine; Sargeant, Joan.

Acad Med ; 86(9): 1120-7, 2011 Sep.

Artigo em Inglês | MEDLINE | ID: mdl-21785309

RESUMO

PURPOSE: Informed self-assessment describes the set of processes through which individuals use external and internal data to generate an appraisal of their own abilities. The purpose of this project was to explore the tensions described by learners and professionals when informing their self-assessments of clinical performance. METHOD: This 2008 qualitative study was guided by principles of grounded theory. Eight programs in five countries across undergraduate, postgraduate, and continuing medical education were purposively sampled. Seventeen focus groups were held (134 participants). Detailed analyses were conducted iteratively to understand themes and relationships. RESULTS: Participants experienced multiple tensions in informed self-assessment. Three categories of tensions emerged: within people (e.g., wanting feedback, yet fearing disconfirming feedback), between people (e.g., providing genuine feedback yet wanting to preserve relationships), and in the learning/practice environment (e.g., engaging in authentic self-assessment activities versus "playing the evaluation game"). Tensions were ongoing, contextual, and dynamic; they prevailed across participant groups, infusing all components of informed self-assessment. They also were present in varied contexts and at all levels of learners and practicing physicians. CONCLUSIONS: Multiple tensions, requiring ongoing negotiation and renegotiation, are inherent in informed self-assessment. Tensions are both intraindividual and interindividual and they are culturally situated, reflecting both professional and institutional influences. Social learning theories (social cognitive theory) and sociocultural theories of learning (situated learning and communities of practice) may inform our understanding and interpretation of the study findings. The findings suggest that educational interventions should be directed at individual, collective, and institutional cultural levels. Implications for practice are presented.

Assuntos

Retroalimentação , Relações Interprofissionais , Médicos/psicologia , Competência Profissional , Programas de Autoavaliação , Estudantes de Medicina/psicologia , Canadá , Educação Médica , Europa (Continente) , Grupos Focais , Humanos , Internato e Residência , Aprendizagem , Teoria Psicológica , Autoavaliação (Psicologia) , Programas de Autoavaliação/métodos , Estados Unidos

19.

Features of assessment learners use to make informed self-assessments of clinical performance.

Sargeant, Joan; Eva, Kevin W; Armson, Heather; Chesluk, Ben; Dornan, Tim; Holmboe, Eric; Lockyer, Jocelyn M; Loney, Elaine; Mann, Karen V; van der Vleuten, Cees P M.

Med Educ ; 45(6): 636-47, 2011 Jun.

Artigo em Inglês | MEDLINE | ID: mdl-21564201

RESUMO

CONTEXT: Conceptualisations of self-assessment are changing as its role in professional development comes to be viewed more broadly as needing to be both externally and internally informed through activities that enable access to and the interpretation and integration of data from external sources. Education programmes use various activities to promote learners' reflection and self-direction, yet we know little about how effective these activities are in 'informing' learners' self-assessments. OBJECTIVES: This study aimed to increase understanding of the specific ways in which undergraduate and postgraduate learners used learning and assessment activities to inform self-assessments of their clinical performance. METHODS: We conducted an international qualitative study using focus groups and drawing on principles of grounded theory. We recruited volunteer participants from three undergraduate and two postgraduate programmes using structured self-assessment activities (e.g. portfolios). We asked learners to describe their perceptions of and experiences with formal and informal activities intended to inform self-assessment. We conducted analysis as a team using a constant comparative process. RESULTS: Eighty-five learners (53 undergraduate, 32 postgraduate) participated in 10 focus groups. Two main findings emerged. Firstly, the perceived effectiveness of formal and informal assessment activities in informing self-assessment appeared to be both person- and context-specific. No curricular activities were considered to be generally effective or ineffective. However, the availability of high-quality performance data and standards was thought to increase the effectiveness of an activity in informing self-assessment. Secondly, the fostering and informing of self-assessment was believed to require credible and engaged supervisors. CONCLUSIONS: Several contextual and personal conditions consistently influenced learners' perceptions of the extent to which assessment activities were useful in informing self-assessments of performance. Although learners are not guaranteed to be accurate in their perceptions of which factors influence their efforts to improve performance, their perceptions must be taken into account; assessment strategies that are perceived as providing untrustworthy information can be anticipated to have negligible impact.

Assuntos

Competência Clínica/normas , Educação de Pós-Graduação em Medicina/métodos , Educação de Graduação em Medicina/métodos , Avaliação Educacional/métodos , Autoavaliação (Psicologia) , Estudantes de Medicina/psicologia , Bélgica , Currículo , Educação de Pós-Graduação em Medicina/normas , Educação de Graduação em Medicina/normas , Avaliação Educacional/normas , Humanos , Países Baixos , Programas de Autoavaliação , Reino Unido

20.

Assessment for selection for the health care professions and specialty training: consensus statement and recommendations from the Ottawa 2010 Conference.

Prideaux, David; Roberts, Chris; Eva, Kevin; Centeno, Angel; McCrorie, Peter; McManus, Chris; Patterson, Fiona; Powis, David; Tekian, Ara; Wilkinson, David.

Med Teach ; 33(3): 215-23, 2011.

Artigo em Inglês | MEDLINE | ID: mdl-21345061

RESUMO

Assessment for selection in medicine and the health professions should follow the same quality assurance processes as in-course assessment. The literature on selection is limited and is not strongly theoretical or conceptual. For written testing, there is evidence of the predictive validity of Medical College Admission Test (MCAT) for medical school and licensing examination performance. There is also evidence for the predictive validity of grade point average, particularly in combination with MCAT for graduate entry but little evidence about the predictive validity of school leaver scores. Interviews have not been shown to be robust selection measures. Studies of multiple mini-interviews have indicated good predictive validity and reliability. Of other measures used in selection, only the growing interest in personality testing appears to warrant future work. Widening access to medical and health professional programmes is an increasing priority and relates to the social accountability mandate of medical and health professional schools. While traditional selection measures do discriminate against various population groups, there is little evidence on the effect of non-traditional measures in widening access. Preparation and outreach programmes show most promise. In summary, the areas of consensus for assessment for selection are small in number. Recommendations for future action focus on the adoption of principles of good assessment and curriculum alignment, use of multi-method programmatic approaches, development of interdisciplinary frameworks and utilisation of sophisticated measurement models. The social accountability mandate of medical and health professional schools demands that social inclusion, workforce issues and widening of access are embedded in the principles of good assessment for selection.

Assuntos

Critérios de Admissão Escolar , Faculdades de Medicina/organização & administração , Conferências de Consenso como Assunto , Avaliação Educacional , Humanos , Entrevistas como Assunto , Escolas para Profissionais de Saúde/organização & administração

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA