ABSTRACT
Education in Doctor of Medicine programs has moved toward an emphasis on clinical competency, with entrustable professional activities providing a framework of learning objectives and outcomes to be assessed within the clinical environment. While the identification and structured definition of objectives and outcomes have evolved, many methods employed to assess clerkship students' clinical skills remain relatively unchanged. There is a paucity of medical education research applying advanced statistical design and analytic techniques to investigate the validity of clinical skills assessment. One robust statistical method, multitrait-multimethod matrix analysis, can be applied to investigate construct validity across multiple assessment instruments and settings. Four traits were operationalized to represent the construct of critical clinical skills (professionalism, data gathering, data synthesis, and data delivery). The traits were assessed using three methods (direct observations by faculty coaches, clinical workplace-based evaluations, and objective structured clinical examination-type clinical practice examinations). The four traits and three methods were intercorrelated for the multitrait-multimethod matrix analysis. The results indicated reliability values in the adequate-to-good range across the three methods, with the majority of the validity coefficients reaching statistical significance. The clearest evidence for convergent and divergent validity was found for the professionalism trait. The correlations in the same-method/different-traits analyses indicated a substantial method effect, particularly on clinical workplace-based assessments. The multitrait-multimethod matrix approach, currently underutilized in medical education, could be employed to explore validity evidence for complex constructs such as clinical skills. These results can inform faculty development programs to improve the reliability and validity of assessments within the clinical environment.
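The MTMM logic described above can be sketched numerically: with simulated data (not the study's) for four traits rated by three methods, the full 12 x 12 correlation matrix is computed and the monotrait-heteromethod "validity diagonal" is extracted. The sample size, effect magnitudes, and all scores below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 120  # hypothetical number of students

# Simulate 4 traits x 3 methods: each score = trait signal + method effect + noise.
traits = rng.normal(size=(n, 4))   # professionalism, gathering, synthesis, delivery
methods = rng.normal(size=(n, 3))  # coach observation, workplace evaluation, OSCE-type exam
scores = np.stack(
    [traits[:, t] + 0.8 * methods[:, m] + rng.normal(scale=0.5, size=n)
     for m in range(3) for t in range(4)],
    axis=1,
)  # 12 columns in method-major order

R = np.corrcoef(scores, rowvar=False)  # the 12 x 12 MTMM matrix

def validity_diagonal(R, n_traits=4, n_methods=3):
    """Monotrait-heteromethod correlations: same trait measured by different methods."""
    vals = []
    for m1 in range(n_methods):
        for m2 in range(m1 + 1, n_methods):
            for t in range(n_traits):
                vals.append(R[m1 * n_traits + t, m2 * n_traits + t])
    return np.array(vals)

conv = validity_diagonal(R)
print(R.shape, round(conv.mean(), 2))
```

In a real analysis the convergent (validity-diagonal) values would be compared against the heterotrait-monomethod block to gauge method effects, mirroring the comparison the abstract describes.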
ABSTRACT
Phenomenon: Existing literature, as well as anecdotal evidence, suggests that tiered clinical grading systems may display systematic demographic biases. This study aimed to investigate these potential inequities in depth. Specifically, it attempted to address the following gaps in the literature: (1) studying grades actually assigned to students (as opposed to self-reported ones), (2) using longitudinal data over an 8-year period, providing data stability, (3) analyzing three important, potentially confounding covariates, (4) using a comprehensive multivariate statistical design, and (5) investigating not just the main effects of gender and race but also their potential interaction. Approach: Participants included 1,905 graduates (985 women, 51.7%) who received the Doctor of Medicine degree between 2014 and 2021. Most of the participants were white (n = 1,310, 68.8%), and about one-fifth were nonwhite (n = 397, 20.8%). No race data were reported for 10.4% (n = 198). To explore potential differential grading, a two-way multivariate analysis of covariance was employed to examine the impact of race and gender on grades in eight required clerkships, adjusting for prior academic performance. Findings: There were two significant main effects, race and gender, but no interaction effect between gender and race. Women received higher grades on average in all eight clerkships, and white students received higher grades on average in four of the eight clerkships (Medicine, Pediatrics, Surgery, Obstetrics/Gynecology). These relationships held even when accounting for prior performance covariates. Insights: These findings provide additional evidence that tiered grading systems may be subject to systematic demographic biases. It is difficult to tease apart the contributions of various factors to the observed gender and race differences in clerkship grades, and the interactions that produce these biases may be quite complex.
The simplest solution to cut through the tangled web of grading biases may be to move away from a tiered grading system altogether.
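As a rough illustration of the covariate adjustment underlying the MANCOVA above (here reduced to a univariate analogue for a single clerkship grade and a single binary demographic factor), the grade can be residualized on prior performance before comparing group means. All numbers below are simulated, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
prior = rng.normal(size=n)          # prior academic performance covariate
group = rng.integers(0, 2, size=n)  # hypothetical 0/1 demographic indicator
# Simulated grade: covariate effect + a built-in 2-point group effect + noise.
grade = 70 + 5 * prior + 2.0 * group + rng.normal(scale=4, size=n)

# ANCOVA-style adjustment: remove the covariate's linear effect, then compare groups.
slope, intercept = np.polyfit(prior, grade, 1)
resid = grade - (slope * prior + intercept)
adj_diff = resid[group == 1].mean() - resid[group == 0].mean()
print(round(adj_diff, 2))  # approximately recovers the built-in group effect
```

The actual study fits all eight clerkship grades jointly with two factors and their interaction; this sketch shows only why a group difference that survives covariate adjustment is the quantity of interest.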
ABSTRACT
The distinction between basic science and clinical knowledge, which has led to a theoretical debate on how medical expertise develops, has implications for medical school and lifelong medical education. This longitudinal, population-based observational study was conducted to test the fit of three theories of the development of medical expertise (knowledge encapsulation, independent influence, and distinct domains) employing structural equation modelling. Data were collected from 548 physicians (292 men, 53.3%; 256 women, 46.7%; mean age on admission = 24.2 years) who had graduated from medical school in 2009-2014. The data included (1) admissions data comprising undergraduate grade point average and Medical College Admission Test sub-test scores, (2) course performance data from years 1, 2, and 3 of medical school, and (3) performance on the NBME examinations (i.e., Step 1, Step 2 CK, and Step 3). Statistical fit indices (goodness of fit index, GFI; standardized root mean squared residual, SRMR; root mean squared error of approximation, RMSEA) and comparative fit (χ² difference tests) were used to assess the fit of the three theories of the cognitive development of medical expertise. There is support for the knowledge encapsulation three-factor model of clinical competency (GFI = 0.973, SRMR = 0.043, RMSEA = 0.063), which had superior fit indices to both the independent influence and distinct domains theories by χ² difference tests. The findings support a theory in which basic sciences and medical aptitude are direct, correlated influences on a clinical competency that encapsulates basic science knowledge.
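The RMSEA reported above can be computed directly from a model's chi-square statistic, degrees of freedom, and sample size. A minimal sketch follows; the χ² and df values are hypothetical, since the abstract reports only the resulting index.

```python
import math

def rmsea(chi2, df, n):
    """Root mean square error of approximation from a chi-square fit statistic.

    Uses the common formula sqrt(max(chi2 - df, 0) / (df * (n - 1))); values
    near or below ~0.06 are conventionally read as good fit.
    """
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

# Hypothetical model: chi2 = 120 on 50 df with the abstract's sample of 548.
print(round(rmsea(120, 50, 548), 3))
```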
Subject(s)
Academic Success, Biological Science Disciplines/statistics & numerical data, Clinical Competence/statistics & numerical data, College Admission Test/statistics & numerical data, Physicians/standards, Adult, Clinical Decision-Making, Female, Humans, Knowledge, Longitudinal Studies, Male, Models, Theoretical, Young Adult
ABSTRACT
OBJECTIVES: This study was conducted to adduce validity evidence for admissions tests and processes and to identify a parsimonious model that predicts students' academic achievement in medical college. METHODS: A psychometric study was conducted on admission data and assessment scores spanning five years of medical studies at Aga Khan University Medical College, Pakistan, using confirmatory factor analysis (CFA) and structural equation modeling (SEM). The sample included 276 medical students admitted in 2003, 2004, and 2005. RESULTS: The SEM supported the existence of covariance between verbal reasoning, science, and clinical knowledge for predicting achievement in medical school, employing maximum likelihood (ML) estimation (n = 112). Fit indices: χ²(21) = 59.70, p < .0001; CFI = .873; RMSEA = 0.129; SRMR = 0.093. CONCLUSIONS: This study shows that, in addition to biology and chemistry, which have traditionally been used as the major criteria for admission to medical colleges in Pakistan, mathematics is a better predictor of higher achievement in medical college.
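The CFI reported above compares the fitted model's noncentrality with that of a null (baseline) model. In the sketch below the model values match the abstract (χ²(21) = 59.70), but the baseline-model values are hypothetical assumptions, so the result only approximates the reported CFI of .873.

```python
def cfi(chi2_m, df_m, chi2_0, df_0):
    """Comparative fit index from model and null (baseline) chi-square values.

    CFI = 1 - max(chi2_m - df_m, 0) / max(chi2_0 - df_0, chi2_m - df_m, 0).
    """
    d_m = max(chi2_m - df_m, 0.0)
    d_0 = max(chi2_0 - df_0, d_m, 1e-12)
    return 1.0 - d_m / d_0

# Model values from the abstract; baseline chi2/df are hypothetical.
print(round(cfi(59.70, 21, 350.0, 28), 3))
```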
ABSTRACT
There are various educational methods used in anatomy teaching. While three-dimensional (3D) visualization technologies are gaining ground due to their ever-increasing realism, reports investigating physical models as a low-cost, traditional 3D method remain the subject of considerable interest. The aim of this meta-analysis is to quantitatively assess the effectiveness of such models based on comparative studies. Eight studies (7 randomized trials; 1 quasi-experimental) including 16 comparison arms and 820 learners met the inclusion criteria. Primary outcomes were defined as factual, spatial, and overall percentage scores. The meta-analytic results showed that educational methods using physical models yielded significantly better results than all other educational methods for the overall knowledge outcome (p < 0.001) and for spatial knowledge acquisition (p < 0.001). Significantly better results were also found for the long-retention knowledge outcome (p < 0.01). No significant difference was found for the factual knowledge acquisition outcome. The evidence in the present systematic review was found to have high internal validity and at least acceptable strength. In conclusion, physical anatomical models offer a promising tool for teaching gross anatomy in 3D representation due to their easy accessibility and educational effectiveness. Such models could be a practical tool to raise learners' level of gross anatomy knowledge at low cost.
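Fixed-effect, inverse-variance pooling of standardized mean differences is the core computation in a meta-analysis of this kind. The sketch below uses hypothetical per-study effects and variances, not the eight included studies.

```python
import math

# Hypothetical (standardized mean difference, variance) pairs for four studies.
studies = [(0.45, 0.04), (0.62, 0.09), (0.30, 0.02), (0.55, 0.06)]

# Each study is weighted by the inverse of its variance.
w = [1.0 / v for _, v in studies]
pooled = sum(wi * d for (d, _), wi in zip(studies, w)) / sum(w)
se = math.sqrt(1.0 / sum(w))                      # standard error of the pooled effect
ci = (pooled - 1.96 * se, pooled + 1.96 * se)     # 95% confidence interval
print(round(pooled, 3), [round(x, 3) for x in ci])
```

A random-effects model would additionally estimate between-study heterogeneity (e.g., via DerSimonian-Laird) before weighting; the fixed-effect version is shown only to make the pooling step concrete.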
Subject(s)
Anatomy/education, Education, Medical, Education, Nursing, Education, Veterinary, Models, Anatomic, Animals, Humans, Teaching
ABSTRACT
INTRODUCTION: Physicians identify teaching as a factor that enhances performance, although existing data to support this relationship are limited. PURPOSE: To determine whether there were differences in clinical performance scores, as assessed through multisource feedback (MSF) data, based on clinical teaching. METHODS: MSF data for 1,831 family physicians, 1,510 medical specialists, and 542 surgeons were collected from physicians' medical colleagues, co-workers (e.g., nurses and pharmacists), and patients and examined in relation to information about physician teaching activities, including percentage of time spent teaching during patient care and academic appointment. Multivariate analysis of variance, partial eta squared effect sizes, and Tukey's HSD post hoc comparisons were used to determine between-group differences in total MSF mean and subscale mean performance scores by teaching and academic appointment data. RESULTS: Higher clinical performance scores were associated with holding any academic appointment and, generally, with any time spent teaching versus no teaching during patient care. This was most evident for data from medical colleagues, where these differences existed across all specialty groups. CONCLUSION: More involvement in teaching was associated with higher clinical performance ratings from medical colleagues and co-workers. These results may support promoting teaching as a method to enhance and maintain high-quality clinical performance.
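The partial eta squared effect size used above has a simple closed form, SS_effect / (SS_effect + SS_error). A minimal sketch with hypothetical sums of squares:

```python
def partial_eta_squared(ss_effect, ss_error):
    """Partial eta squared: proportion of variance attributable to an effect,
    ignoring variance explained by other effects in the model."""
    return ss_effect / (ss_effect + ss_error)

# Hypothetical ANOVA decomposition: effect SS = 20, error SS = 380.
print(partial_eta_squared(20.0, 380.0))  # a small effect by common benchmarks
```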
Subject(s)
Clinical Competence, Physicians, Teaching, Formative Feedback, Humans, Surveys and Questionnaires
ABSTRACT
CONSTRUCT: Authentic standard-setting methods should demonstrate high convergent validity evidence for their outcomes, that is, cutoff scores and pass/fail decisions, when compared with most other methods. BACKGROUND: The objective structured clinical examination (OSCE) was established for valid, reliable, and objective assessment of clinical skills in health professions education. Various standard-setting methods have been proposed to identify objective, reliable, and valid cutoff scores on OSCEs. These methods may identify different cutoff scores for the same examinations. Identification of valid and reliable cutoff scores for OSCEs remains an important issue and a challenge. APPROACH: Thirty OSCE stations administered at least twice in the years 2010-2012 to 393 medical students in Years 2 and 3 at Aga Khan University are included. Psychometric properties of the scores are determined. Cutoff scores and pass/fail decisions of the Wijnen, Cohen, Mean-1.5SD, Mean-1SD, Angoff, borderline group, and borderline regression (BL-R) methods are compared with each other and with three variants of cluster analysis using repeated measures analysis of variance and Cohen's kappa. RESULTS: The mean psychometric indices on the 30 OSCE stations are: reliability coefficient = 0.76 (SD = 0.12); standard error of measurement = 5.66 (SD = 1.38); coefficient of determination = 0.47 (SD = 0.19); and intergrade discrimination = 7.19 (SD = 1.89). The BL-R and Wijnen methods show the highest convergent validity evidence among the methods on the defined criteria. The Angoff and Mean-1.5SD methods demonstrated the least convergent validity evidence. The three cluster variants showed substantial convergent validity with the borderline methods. CONCLUSIONS: Although the Wijnen method showed a high level of convergent validity, it lacks the theoretical strength to be used for competency-based assessments.
The BL-R method showed the highest convergent validity evidence for OSCEs with the other standard-setting methods used in the present study. We also found that cluster analysis using the mean method can be used for quality assurance of the borderline methods. These findings should be further confirmed by studies in other settings.
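The borderline regression (BL-R) method named above sets the cutoff by regressing station checklist scores on examiners' global ratings and reading off the predicted score at the borderline category. A sketch with simulated ratings (the 5-point global scale, the borderline category, and the percentage score scale are assumptions, not taken from the study):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
# Global rating: 1 = clear fail ... 3 = borderline ... 5 = excellent (assumed scale).
global_rating = rng.integers(1, 6, size=n)
# Simulated checklist percentage score, roughly linear in the global rating.
checklist = 40 + 10 * global_rating + rng.normal(scale=5, size=n)

# Linear regression of checklist on global rating; cutoff = prediction at borderline.
slope, intercept = np.polyfit(global_rating, checklist, 1)
borderline = 3
cutoff = slope * borderline + intercept
print(round(cutoff, 1))
```

Unlike the borderline group method, which averages the scores of only the borderline examinees, BL-R uses every examinee's data, which is why it tends to give more stable cutoffs on small stations.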
Subject(s)
Education, Medical, Undergraduate, Educational Measurement/standards, Cluster Analysis, Educational Measurement/methods, Humans, Psychometrics
ABSTRACT
CGEA 2015 CONFERENCE ABSTRACT (EDITED). A Novel Approach to Assessing Professionalism in Preclinical Medical Students Using Paired Self- and Peer Evaluations. Amanda R. Emke, Steven Cheng, and Carolyn Dufault. CONSTRUCT: This study sought to assess the professionalism of 2nd-year medical students in the context of team-based learning. BACKGROUND: Professionalism is an important attribute for physicians and a core competency throughout medical education. Preclinical training often focuses on individual knowledge acquisition, with students working only indirectly with faculty assessors. As such, the assessment of professionalism in preclinical training continues to present challenges. We propose a novel approach to the preclinical assessment of medical student professionalism to address these challenges. APPROACH: Second-year medical students completed self- and peer assessments of professionalism in two courses (Pediatrics and Renal/Genitourinary Diseases) following a series of team-based learning exercises. Assessments were composed of nearly identical 9-point rating scales. Correlational analysis and linear regression were used to examine the associations between self- and peer assessments and the effects of predictor variables. Four subgroups were formed based on deviation from the median ratings, and logistic regression was used to assess the stability of subgroup membership over time. A missing data analysis was conducted to examine differences between average peer-assessment scores as a function of selective nonparticipation. RESULTS: There was a significant positive correlation (r = .62, p < .0001) between self-assessments completed alone and those completed at the time of peer assessment. There was also a significant positive correlation between average peer assessment and self-assessment alone (r = .19, p < .0002) and self-assessment at the time of peer assessment (r = .27, p < .0001).
Logistic regression revealed that subgroup membership was stable across measurement at two time points (T1 and T2) for all groups except members of the high self-assessment/low peer-assessment group at T1, who were significantly more likely to move to a new group at T2, χ²(3, N = 129) = 7.80, p < .05. Linear regression revealed that self-assessment alone and course were significant predictors of self-assessment at the time of peer assessment (F(self alone) = 144.74, p < .01 and F(course) = 4.70, p < .05), whereas average peer rating, stage (T1, T2), and academic year (13-14, 14-15) were not. Linear regression also revealed that students who completed both self-assessments had significantly higher average peer-assessment ratings (average peer rating in students with both self-assessments = 8.42, no self-assessments = 8.10, self-at-peer = 8.37, self-alone = 8.28) compared to students who completed one or no self-assessments (F = 5.34, p < .01). CONCLUSIONS: When used as a professionalism assessment within team-based learning, stand-alone and simultaneous peer and self-assessments are highly correlated within individuals across different courses. However, although self-assessment alone is a significant predictor of self-assessment made at the time of assessing one's peers, average peer assessment does not predict self-assessment. To explore this lack of predictive power, we classified students into four subgroups based on relative deviation from median peer- and self-assessment scores. Group membership was found to be stable for all groups except those initially sorted into the high self-assessment/low peer-assessment subgroup. Members of this subgroup tended to move into the low self-assessment/low peer-assessment group at T2, suggesting they became more accurate at self-assessing over time. A small group of individuals remained in the group that consistently rated themselves highly while their peers rated them poorly.
Future studies will track these students to see if similar deviations from accurate professional self-assessment persist into the clinical years. In addition, given that students who fail to perform self-assessments had significantly lower peer assessment scores than their counterparts who completed self-assessments in this study, these students may also be at risk for similar professionalism concerns in the clinical years; follow-up studies will examine this possibility.
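The median-split subgrouping described above (high/low on self-assessment crossed with high/low on peer assessment) and the self/peer correlation can be sketched as follows. All scores are simulated; the 129-student count is reused only as an illustrative sample size.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 129
# Simulated 9-point ratings: self-ratings cluster high; peer ratings track them weakly.
self_score = np.clip(rng.normal(8.0, 0.6, n), 1, 9)
peer_score = np.clip(0.3 * self_score + rng.normal(5.6, 0.4, n), 1, 9)

r = np.corrcoef(self_score, peer_score)[0, 1]  # self/peer correlation

# Four subgroups by position relative to the median on each measure:
# 0 = low/low, 1 = low self/high peer, 2 = high self/low peer, 3 = high/high.
hi_self = self_score >= np.median(self_score)
hi_peer = peer_score >= np.median(peer_score)
groups = hi_self.astype(int) * 2 + hi_peer.astype(int)
print(round(r, 2), np.bincount(groups, minlength=4))
```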
Subject(s)
Education, Medical, Undergraduate, Feedback, Interdisciplinary Communication, Learning, Professionalism, Humans, Linear Models, Peer Group, Teaching/methods
ABSTRACT
BACKGROUND: Although many studies have made efforts to define and assess medical professionalism, few have addressed issues of construct validity. PURPOSES: The purpose of this article is to further explore the construct validity of medical professionalism employing exploratory and confirmatory factor analysis. METHODS: The 32-item instrument by the American Board of Internal Medicine (ABIM) was adapted to assess Vietnamese medical students' perceptions of medical professionalism. A sample of 1,196 (487 first-year, 341 third-year, 368 sixth-year) medical students participated voluntarily in the completion of the instrument. The data were randomly divided into three samples to assess the construct validity of medical professionalism by empirically deriving and confirming a model of professionalism. RESULTS: Exploratory and confirmatory factor analytic techniques resulted in a well-fitting six-factor model, with a comparative fit index of .963 and a root mean square error of approximation of .029, 90% confidence interval [.016, .039]: integrity, social responsibility, professional practice habits, ensuring quality care, altruism, and self-awareness. Social responsibility was perceived as least important, and self-awareness was perceived as most important by Vietnamese medical students. These constructs of medical professionalism were relatively similar to those found in Taiwanese medical students and the ABIM definitions, but with some Vietnamese cultural differences. CONCLUSIONS: Although the results confirm that medical professionalism is a somewhat culturally sensitive construct, many of its elements are universal. Future research should be conducted to test the generalizability of our six-factor model of professionalism with various samples (e.g., residents, physicians), cultures, and language groups.
Subject(s)
Professional Role, Students, Medical/psychology, Altruism, Cross-Cultural Comparison, Education, Medical, Undergraduate, Factor Analysis, Statistical, Female, Humans, Male, Quality of Health Care, Social Responsibility, Surveys and Questionnaires/standards, Vietnam, Young Adult
ABSTRACT
BACKGROUND: Emotional intelligence (EI) is the ability to deal with one's own and others' emotions. Medical students are admitted to medical schools on the basis of their academic achievement. Professionally, however, their success rate is variable and may depend on their interpersonal relationships. EI is thought to be significant in achieving good interpersonal relationships and success in life and career. Therefore, it is important to measure EI and understand its correlates in an undergraduate medical student population. AIM: The objective of the study was to investigate the relationship between the EI of medical students and their academic achievement (based on cumulative grade point average [CGPA]), age, gender, and year of study. METHODS: A cross-sectional survey design was used. The SSREIS and a demographic survey were administered in three medical schools in Saudi Arabia from April to May 2012. RESULTS: The response rate was 30%. For the Optimism subscale, the mean score was M = 3.79, SD = 0.54 (α = 0.82); for the Awareness-of-emotion subscale, M = 3.94, SD = 0.57 (α = 0.72); and for the Use-of-emotion subscale, M = 3.92, SD = 0.54 (α = 0.63). Multiple regression showed a significant positive correlation between CGPA and the EI of medical students (r = 0.246, p < 0.001) on the Optimism subscale. No correlation was seen between CGPA and the Awareness-of-emotion and Use-of-emotion subscales. No relationship was seen for the other independent variables. CONCLUSION: The current study demonstrates that CGPA is the only significant predictor, indicating that Optimism tends to be higher for students with a higher CGPA. None of the other independent variables (age, year of study, gender) showed a significant relationship.
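The subscale reliabilities (α) quoted above are Cronbach's alpha values. A standard computation is sketched below on simulated item responses; the 6-item, 300-respondent setup is hypothetical.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

# Simulated scale: 6 items sharing one latent factor plus item-level noise.
rng = np.random.default_rng(4)
latent = rng.normal(size=300)
items = latent[:, None] + rng.normal(scale=0.8, size=(300, 6))
print(round(cronbach_alpha(items), 2))
```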
Subject(s)
Education, Medical, Undergraduate, Emotional Intelligence, Students, Medical/psychology, Adult, Animals, Cross-Sectional Studies, Educational Status, Female, Humans, Male, Middle Aged, Rats
ABSTRACT
We determined the Web-based configurations that are applied to teach medical and veterinary communication skills, evaluated their effectiveness, and suggested future educational directions for Web-based communication teaching in veterinary education. We performed a systematic search of CAB Abstracts, MEDLINE, Scopus, and ERIC limited to articles published in English between 2000 and 2012. The review focused on medical or veterinary students at the undergraduate through clinical or residency level. We selected studies in which the study population was randomized to the Web-based learning (WBL) intervention with a post-test comparison with another WBL or non-WBL method and that reported at least one empirical outcome. Two independent reviewers completed relevancy screening, data extraction, and synthesis of results using Kirkpatrick and Kirkpatrick's framework. The search retrieved 1,583 articles, and 10 met the final inclusion criteria. We identified no published articles on Web-based communication platforms in veterinary medicine; however, publications from human medicine demonstrated that WBL provides a potentially reliable and valid approach for teaching and assessing communication skills. Student feedback on the use of virtual patients for teaching clinical communication skills has been positive, though evidence has suggested that practice with virtual patients prompted lower relation-building responses. Empirical outcomes indicate that WBL is a viable method for expanding the approach to teaching history taking and possibly additional tasks of the veterinary medical interview.
Subject(s)
Clinical Competence, Communication, Education, Medical, Education, Veterinary, Teaching, Education, Medical/methods, Education, Medical/standards, Education, Veterinary/methods, Education, Veterinary/standards, Educational Measurement, Humans, Internet, Learning, Students, Teaching/methods
ABSTRACT
Creating original, integrated multiple-choice questions (MCQs) is time-consuming and onerous for basic science and clinical faculty. We demonstrate that medical students can serve as co-experts to help faculty overcome these assessment challenges. We recruited, trained, and motivated medical students to write 10,000 high-quality MCQs for use in the foundational courses of medical education. These students were ideal because they possessed integrated knowledge (basic sciences and clinical experience). We taught them how to write high-quality MCQs using a writing template, with continuous monitoring and support by an item bank curator. The students themselves also benefited personally and pedagogically from the experience.
ABSTRACT
PURPOSE: This study examines the feasibility and psychometric results of an assessment of entrustable professional activities (EPAs) as a core component of the clinical program of assessment in undergraduate medical education, assesses the learning curves for each EPA, explores the time to entrustment, and investigates the dependability of the EPA data based on generalizability theory (G theory) analysis. METHOD: Third-year medical students from the University of Minnesota Medical School in 7 required clerkships from May 2022 through April 2023 were assessed. Students were required to obtain at least 4 EPA assessments per week on average from clinical faculty, residents supervising the students, or assessment and coaching experts. Student ratings were depicted as curves describing their performance over time; regression models were used to fit the curves. RESULTS: The complete class of 240 (138 women [58.0%] and 102 men [42.0%]) third-year medical students at the University of Minnesota Medical School (mean [SD] age at matriculation, 24.2 [2.7] years) participated. There were 32,614 EPA-based assessments (mean [SD], 136 [29.6] assessments per student). Reliability analysis using G theory found that an overall score dependability of 0.75 (range, 0-1) was achieved with 4 assessors on 4 occasions. The desired level of entrustment by academic year end was met by all 240 students (100%) for EPAs 1, 6, and 7; by 237 (98.8%), 236 (98.3%), and 218 (90.8%) students for EPAs 2, 5, and 9, respectively; by 197 students (82.1%) for EPA 3; by 178 students (74.2%) for EPA 4; and by 145 students (60.4%) for EPA 12. The most rapid growth was for EPA 2 (β0 = .286), followed by EPA 1 (β0 = .240), EPA 4 (β0 = .236), and EPA 10 (β0 = .230). CONCLUSIONS: The study findings suggest that EPA ratings provide reliable and dependable data to make entrustment decisions about students' performance.
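The reported dependability of 0.75 with 4 assessors on 4 occasions follows the usual G-theory decision-study form, in which the single-observation error variance is divided by the number of observations averaged over. The variance components below are hypothetical values chosen to reproduce that coefficient, not the study's estimates.

```python
def dependability(var_person, var_error_single, n_raters, n_occasions):
    """Index of dependability for a person x rater x occasion design,
    with all error sources pooled into one single-observation error variance.
    Averaging over more raters and occasions shrinks the error term."""
    return var_person / (var_person + var_error_single / (n_raters * n_occasions))

# Hypothetical components: person variance 0.3, single-observation error 1.6.
print(round(dependability(0.3, 1.6, n_raters=4, n_occasions=4), 2))
```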
ABSTRACT
Current teaching approaches in human and veterinary medicine across North America, Europe, and Australia include lectures, group discussions, feedback, role-play, and web-based training. Increasing class sizes, changing learning preferences, and economic and logistical challenges are influencing the design and delivery of communication skills in veterinary undergraduate education. The study's objectives were to (1) assess the effectiveness of small-group and web-based methods for teaching communication skills and (2) identify which training method is more effective in helping students to develop communication skills. At the Ross University School of Veterinary Medicine (RUSVM), 96 students were randomly assigned to one of three groups (control, web, or small-group training) in a pre-intervention and post-intervention group design. An Objective Structured Clinical Examination (OSCE) was used to measure communication competence within and across the intervention and control groups. Reliability of the OSCEs was determined by generalizability theory to be 0.65 (pre-intervention OSCE) and 0.70 (post-intervention OSCE). Study results showed that (1) small-group training was the most effective teaching approach in enhancing communication skills and resulted in students scoring significantly higher on the post-intervention OSCE compared to the web-based and control groups, (2) web-based training resulted in significant though considerably smaller improvement in skills than small-group training, and (3) the control group demonstrated the lowest mean difference between the pre-intervention/post-intervention OSCE scores, reinforcing the need to teach communication skills. Furthermore, small-group training had a significant effect in improving skills derived from the initial phase of the consultation and skills related to giving information and planning.
Subject(s)
Clinical Competence, Communication, Education, Veterinary, Teaching, Humans, Education, Veterinary/methods, Education, Veterinary/standards, Educational Measurement, Internet, Learning, Saint Kitts and Nevis, Students
ABSTRACT
BACKGROUND: The purpose of this study was to investigate the predictive and construct validity of a high-stakes objective structured clinical examination (OSCE) used to select candidates for a 3-month clinical rotation assessing practice-readiness status. SUMMARY: Analyses were undertaken to establish the reliability and validity of the OSCE. The generalizability coefficients (Ep²) for the assessment scores (checklist, global, and total) were all high, ranging from 0.73 to 0.84. Two discriminant analyses (promotion to the 3-month rotation and pass/fail status on the rotation) provided evidence of predictive validity, with a 100% correct classification rate for the pass/fail rotation results. Factor analysis results provided evidence of construct validity, with four factors identified: Clinical Skills, Internal Medicine, General Medical Knowledge, and Counseling. Known-group differences by licensing status and residency experience also provided evidence of construct validity. CONCLUSIONS: The results are encouraging for the predictive and construct validity of the OSCE as an assessment of clinical competence.
Subject(s)
Clinical Competence/standards, Educational Measurement/methods, Foreign Medical Graduates, Adult, Canada, Factor Analysis, Statistical, Female, Humans, Male, Middle Aged
ABSTRACT
BACKGROUND: The objective structured clinical examination (OSCE) has been used since the early 1970s for assessing clinical competence. Very few studies have examined the psychometric stability of stations that are used repeatedly with different samples. The purpose of the present study was to assess the stability of OSCEs employing the same stations over time but with different samples of candidates, standardized patients (SPs), and examiners. METHODS: At Time 1, 191 candidates, and at Time 2 (one year apart), 236 candidates participated in a 10-station OSCE; 6 of the same stations were used in both years. Generalizability analyses (Ep²) were conducted. Employing item response analyses, test characteristic curves (TCCs) were derived for each of the 6 stations under a 2-parameter model. The TCCs were compared across the two years, Time 1 and Time 2. RESULTS: The Ep² of the OSCEs exceeded .70. Standardized thetas (θ) and discriminations were equivalent for the same station across the two-year period, indicating equivalent TCCs for a 2-parameter model. CONCLUSION: The 6 OSCE stations used by the AIMG program over two years have adequate internal consistency reliability, stable generalizability (Ep²), and equivalent test characteristics. The stations employed for assessing IMGs are stable OSCE stations that may be used several times without compromising psychometric properties. With careful security, high-stakes OSCEs may reuse stations that have high internal consistency and generalizability, as the psychometric properties are stable over several years with different samples of candidates.
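A test characteristic curve under the 2-parameter logistic model, as used above, is simply the sum of the item response functions across a station's items. The discrimination/difficulty pairs below are hypothetical, not the study's calibrated parameters.

```python
import math

def p_2pl(theta, a, b):
    """2-parameter logistic item response function:
    P(correct | theta) with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def tcc(theta, items):
    """Test characteristic curve: expected raw score at ability theta
    over a set of (a, b) item parameter pairs."""
    return sum(p_2pl(theta, a, b) for a, b in items)

# Hypothetical 3-item station.
station_items = [(1.2, -0.5), (0.8, 0.0), (1.5, 0.7)]
print(round(tcc(0.0, station_items), 2))
```

Equivalence of TCCs across administrations, as the study reports, means these curves nearly coincide for the same station in both years.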
Subject(s)
Clinical Competence/standards, Educational Measurement/standards, Reproducibility of Results, Adult, Alberta, Education, Medical, Undergraduate, Female, Humans, Longitudinal Studies, Male, Middle Aged, Models, Theoretical, Psychometrics
ABSTRACT
BACKGROUND: New approaches are needed to ensure that surgical trainees attain competence in a timely way. Traditional solutions have focused on the years spent in surgical training. We sought to examine the outcomes of graduates from 3-year versus 4-year medical schools for differences in surgeon performance based on multisource feedback data. METHODS: We used data from the College of Physicians and Surgeons of Alberta's Physician Achievement Review program to determine curricular outcomes. Data for each surgeon included assessments from 25 patients, 8 medical colleagues, and 8 nonphysician coworkers (e.g., nurses), and a self-assessment. We used these data to compare 72 physicians from a 3-year school matched with graduates from 4-year schools. The instruments were assessed for evidence of validity and reliability. We compared the groups using 1-way analysis of covariance and multivariate analysis of covariance, with years since graduation as a covariate, and a Cohen d effect size calculation to assess the magnitude of any change. RESULTS: Data for 216 surgeons indicated that there was evidence for instrument validity and reliability. No significant differences were found based on the length of the undergraduate program for any of the questionnaires or factors within the questionnaires. CONCLUSION: Reconsideration might be given to the time spent in medical school before surgical training if training in the specialty and career years are to be maximized. This assumes that students are able to make informed career decisions based on clerkship and other experiences in a 3-year setting.
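The Cohen d effect size used above has a standard pooled-SD form. A minimal sketch (the example means, SDs, and group sizes are hypothetical, not the study's data):

```python
import math

def cohens_d(m1, s1, n1, m2, s2, n2):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (m1 - m2) / sp

# Hypothetical groups: 72 vs 144 surgeons, matching the study's 1:2 matching ratio.
print(cohens_d(75, 10, 72, 70, 10, 144))  # a medium effect by Cohen's benchmarks
```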
Subject(s)
Career Choice , Clinical Competence , Curriculum , Education, Medical, Undergraduate/standards , Adult , Alberta , Competency-Based Education , Cross-Sectional Studies , Education, Medical, Graduate/standards , Education, Medical, Graduate/trends , Education, Medical, Undergraduate/trends , Female , Humans , Internship and Residency , Male , Reproducibility of Results , Schools, Medical/standards , Schools, Medical/trends , Self-Assessment (Psychology) , Students, Medical/statistics & numerical data , Surveys and Questionnaires , Time , Young Adult
ABSTRACT
BACKGROUND: The number of Multiple Mini Interview (MMI) stations and the type and number of interviewers required for an acceptable level of reliability for veterinary admissions requires investigation. PURPOSE: The goal was to investigate the reliability of the 2009 MMI admission process at the University of Calgary. METHODS: Each applicant (n = 103; female = 80.6%; M age = 23.05 years, SD = 3.96) participated in a 7-station MMI. Applicants were rated independently by 2 interviewers (a faculty member and a community veterinarian) within each station (total interviewers per applicant N = 14). Interviewers scored applicants on 3 items, each on a 5-point anchored scale. RESULTS: Generalizability analysis resulted in a reliability coefficient of G = 0.79. A Decision study (D-study) indicated that 10 stations with 1 interviewer would produce a G = 0.79 and 8 stations with 2 interviewers would produce a G = 0.81; however, these have different resource requirements. A two-way analysis of variance showed a nonsignificant main effect of interviewer type (faculty member vs. community veterinarian) on interview scores, F(1, 1428) = 3.18, p = .075; a significant main effect of station on interview scores, F(6, 1428) = 4.34, p < .001; and a nonsignificant interaction effect between interviewer type and station on interview scores, F(6, 1428) = 0.74, p = .62. CONCLUSIONS: Overall reliability was adequate for the MMI. Results from the D-study suggest that the current 7-station format provides adequate reliability with 2 interviewers per station; to achieve the same G coefficient, 1 interviewer per station across 10 stations would suffice and would reduce resource requirements. Community veterinarians and faculty members demonstrated an adequate level of agreement in their assessments of applicants.
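The D-study trade-off described above (more stations versus more interviewers per station) can be sketched as a projection of the relative G coefficient from variance components; the components below are hypothetical, chosen only to illustrate the pattern, not estimates from this study:

```python
def d_study_g(var_p, var_ps, var_pr_s, n_stations, n_raters):
    """Projected relative G coefficient for a person x (rater:station) design.

    var_p: person (applicant) variance
    var_ps: person-by-station interaction variance
    var_pr_s: person-by-rater-within-station variance (plus residual)
    """
    error = var_ps / n_stations + var_pr_s / (n_stations * n_raters)
    return var_p / (var_p + error)

# Hypothetical variance components (not from the study)
g_7x2 = d_study_g(1.0, 1.5, 1.0, 7, 2)    # 7 stations, 2 interviewers each
g_10x1 = d_study_g(1.0, 1.5, 1.0, 10, 1)  # 10 stations, 1 interviewer each
g_8x2 = d_study_g(1.0, 1.5, 1.0, 8, 2)    # 8 stations, 2 interviewers each
```

As in the study's D-study, adding stations reduces both error terms, so a 10-station, 1-interviewer design can match or exceed the G of a 7-station, 2-interviewer design while using fewer interviewers overall.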
Subject(s)
Interview, Psychological/methods , School Admission Criteria , Schools, Veterinary/statistics & numerical data , Students, Health Occupations/statistics & numerical data , Alberta , Analysis of Variance , Female , Humans , Male , Models, Educational , Reproducibility of Results , Statistics as Topic , Young Adult
ABSTRACT
BACKGROUND: Assessment of clinical teaching by learners is of value to teachers, department heads, and program directors, and must be comprehensive and feasible. AIMS: To review published evaluation instruments with psychometric evaluations and to develop and psychometrically evaluate an instrument for assessing clinical teaching with linkages to the CanMEDS roles. METHOD: We developed a 19-item questionnaire to reflect 10 domains relevant to teaching and the CanMEDS roles. A total of 317 medical learners assessed 170 instructors: 14 (4.4%) clinical clerks, 229 (72.3%) residents, and 53 (16.7%) fellows; 21 (6.6%) did not specify their position. RESULTS: A mean of eight raters assessed each instructor. The internal consistency reliability of the 19-item instrument was Cronbach's α = 0.95, and generalizability analysis indicated that the raters achieved an Ep(2) of 0.95. Factor analysis yielded three factors that accounted for 67.97% of the total variance: teaching skills (variance = 53.25%; Cronbach's α = 0.92), patient interaction (variance = 8.56%; Cronbach's α = 0.91), and professionalism (variance = 6.16%; Cronbach's α = 0.86). The three factors are intercorrelated (correlations = 0.48, 0.58, 0.46; p < 0.01). CONCLUSION: It is feasible to assess clinical teaching with the 19-item instrument, which has demonstrated evidence of both validity and reliability.
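The internal consistency index reported above, Cronbach's α, can be computed directly from item-level scores; the toy data below are illustrative, not the questionnaire's ratings:

```python
def cronbach_alpha(items):
    """Cronbach's alpha from item-score columns (one list per item,
    one entry per respondent): k/(k-1) * (1 - sum(item vars)/var(totals))."""
    k = len(items)
    n = len(items[0])

    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = sum(var(col) for col in items)
    totals = [sum(col[i] for col in items) for i in range(n)]
    return k / (k - 1) * (1 - item_vars / var(totals))

# Illustrative toy data: two perfectly parallel items give alpha = 1.0
alpha_perfect = cronbach_alpha([[1, 2, 3], [1, 2, 3]])
```

Values like the 0.86-0.95 range reported for the instrument and its factors indicate that the items within each scale covary strongly relative to their individual variances.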
Subject(s)
Education, Medical/standards , Faculty, Medical/standards , Professional Competence , Surveys and Questionnaires/standards , Teaching/standards , Alberta , Education, Medical/methods , Factor Analysis, Statistical , Humans , Physician-Patient Relations , Pilot Projects , Problem-Based Learning , Psychometrics , Reproducibility of Results , Schools, Medical
ABSTRACT
The experience of the International Society of Addiction Medicine in setting up the first international certification of clinical knowledge is reported. The steps followed and the results of a psychometric analysis of the tests from the first 65 candidates are described. Lessons learned in the first 5 years and challenges for the future are identified.