Results 1 - 20 of 26
1.
Teach Learn Med ; : 1-11, 2022 Sep 14.
Article in English | MEDLINE | ID: mdl-36106359

ABSTRACT

Issue: Automatic item generation is a method for creating medical test items using an automated, technological solution. It is a contemporary method that can scale the item development process to produce large numbers of new items, support the building of multiple forms, and allow rapid responses to changing medical content guidelines and threats to test security. The purpose of this analysis is to describe three sources of validation evidence required to support valid test score inferences when automatic item generation is used to produce high-quality medical licensure test items. Evidence: Generated items are used to make inferences about examinees' medical knowledge, skills, and competencies. We present three sources of evidence required to evaluate whether the generated items measure the intended knowledge, skills, and competencies. The sources of evidence presented here relate to the item definition, the item development process, and the item quality review. An item is defined as an explicit set of properties that include the parameters, constraints, and instructions used to elicit a response from the examinee. This definition allows for a critique of the input used for automatic item generation. The item development process is evaluated using a validation table, whose purpose is to support verification of the assumptions about model specification made by the subject-matter expert. This table provides a succinct summary of the content and constraints used to create new items. The item quality review evaluates the statistical quality of the generated items, which often focuses on the difficulty and discrimination of the correct and incorrect options. Implications: Automatic item generation is an increasingly popular item development method, and the items it generates must be bolstered by evidence that they measure the intended knowledge, skills, and competencies. The important role of medical expertise in the development and evaluation of the generated items is highlighted as a crucial requirement for producing validation evidence.
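
The item definition above (parameters, constraints, instructions) lends itself to a small illustration: a template plus parameter lists, expanded into concrete items while filtering constraint violations. The sketch below is a minimal, hypothetical Python rendering of that idea; the template, parameter values, and constraint are invented and not drawn from the paper.

```python
# Minimal sketch of template-based automatic item generation (hypothetical model).
from itertools import product

item_model = {
    "stem": "A {age}-year-old patient presents with {finding}. "
            "What is the most likely diagnosis?",
    "parameters": {
        "age": [25, 45, 70],
        "finding": ["acute chest pain", "progressive dyspnea"],
    },
}

def violates_constraints(values):
    # Example constraint: exclude an implausible parameter combination.
    return values["age"] == 25 and values["finding"] == "progressive dyspnea"

def generate_items(model):
    names = list(model["parameters"])
    for combo in product(*(model["parameters"][n] for n in names)):
        values = dict(zip(names, combo))
        if not violates_constraints(values):
            yield model["stem"].format(**values)

for stem in generate_items(item_model):
    print(stem)
```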

2.
Teach Learn Med ; 33(1): 28-35, 2021.
Article in English | MEDLINE | ID: mdl-32281406

ABSTRACT

Construct: The definition of clinical reasoning may vary among health professions educators. For the purpose of this paper, clinical reasoning is defined as the cognitive processes involved in gathering information, representing the problem, generating a differential diagnosis, providing a diagnostic justification to arrive at a leading diagnosis, and formulating diagnostic and management plans. Background: Expert performance in clinical reasoning is essential for success as a physician, yet has been difficult for clerkship directors to observe and quantify in a way that fosters the instruction and assessment of clinical reasoning. The purpose of this study was to gather validity evidence for the Multistep Exam (MSX) format used by our medicine clerkship to assess analytical clinical reasoning abilities; we did this by examining the relationship between scores on the MSX and other external measures of clinical reasoning abilities. This analysis used dual process theory as the main theoretical framework of clinical reasoning, as well as aspects of Kane's validity framework, to guide the selection of validity evidence for the investigation. We hypothesized that there would be an association between the MSX (a three-step clinical reasoning tool developed locally) and the USMLE Step 2 CS, as they share similar concepts in assessing the clinical reasoning of students. We examined the relationship between overall scores on the MSX and the Step 2 CS Integrated Clinical Encounter (ICE) score, in which the student articulates their reasoning for simulated patient cases, while controlling for examinees' internal medicine clerkship performance measures such as the NBME subject exam score and the Medicine clerkship OSCE score. Approach: A total of 477 of 487 (97.9%) medical students, representing the graduating classes of 2015, 2016, and 2017, who took the MSX at the end of each medicine clerkship (2012-2016) and Step 2 CS (2013-2017) were included in this study. Correlation analysis and multiple linear regression analysis were used to examine the impact of the primary explanatory variable of interest (MSX) on the outcome variable (ICE score) when controlling for baseline variables (Medicine OSCE and NBME Medicine subject exam). Findings: The overall MSX score had a significant, positive correlation with the Step 2 CS ICE score (r = .26, P < .01). The overall MSX score was a significant predictor of the Step 2 CS ICE score (β = .19, P < .001), explaining an additional 4% of the variance of ICE beyond the NBME Medicine subject score and the Medicine OSCE score (adjusted R² = 13%). Conclusion: The stepwise format of the MSX provides a tool to observe clinical reasoning performance, which can be used in an assessment system to provide feedback to students on their analytical clinical reasoning. Future studies should focus on gaining additional validity evidence across different learners and multiple medical schools.
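
The incremental-variance analysis described in the Findings can be sketched as two nested OLS models: fit the control variables first, add the predictor of interest, and report the change in R². Everything below is synthetic placeholder data with illustrative coefficients, not the study's variables.

```python
# Sketch of a hierarchical (nested-model) regression reporting Delta R^2.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 477
df = pd.DataFrame({
    "nbme": rng.normal(70, 8, n),   # control: subject exam score (synthetic)
    "osce": rng.normal(75, 6, n),   # control: clerkship OSCE score (synthetic)
})
df["msx"] = 0.3 * df["nbme"] + rng.normal(0, 5, n)
df["ice"] = (0.2 * df["nbme"] + 0.1 * df["osce"]
             + 0.19 * df["msx"] + rng.normal(0, 4, n))

base = smf.ols("ice ~ nbme + osce", data=df).fit()        # controls only
full = smf.ols("ice ~ nbme + osce + msx", data=df).fit()  # add predictor
print(f"Delta R^2 for MSX: {full.rsquared - base.rsquared:.3f}")
print(f"beta = {full.params['msx']:.3f}, p = {full.pvalues['msx']:.4f}")
```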


Subjects
Clinical Clerkship/standards; Curriculum/standards; Education, Medical, Undergraduate/methods; Educational Measurement/statistics & numerical data; Internal Medicine/education; Students, Medical/statistics & numerical data; Clinical Competence; Female; Humans; Male; United States
3.
Teach Learn Med ; 29(3): 280-285, 2017.
Article in English | MEDLINE | ID: mdl-28632015

ABSTRACT

Construct: We investigated the extent of the associations between medical students' clinical competency, measured by performance on Objective Structured Clinical Examinations (OSCEs) during Obstetrics/Gynecology and Family Medicine clerkships, and later performance in both undergraduate and graduate medical education. BACKGROUND: There is a relative dearth of studies on the correlations between undergraduate OSCE scores and future exam performance within either undergraduate or graduate medical education, and almost none linking these simulated encounters to eventual patient care. The studies that do correlate clerkship OSCE scores with future performance often have small sample sizes and/or include only one clerkship. APPROACH: Students in USU graduating classes of 2007 through 2011 participated in the study. We investigated correlations between clerkship OSCE grades and United States Medical Licensing Examination Step 2 Clinical Knowledge, Step 2 Clinical Skills, and Step 3 exam scores, as well as Postgraduate Year 1 program directors' evaluation scores on Medical Expertise and Professionalism. We also conducted contingency table analyses to examine the associations between poor performance on clerkship OSCEs and failing Step 3 or receiving poor program director ratings. RESULTS: The correlations between the clerkship OSCE grades and the outcomes were weak. The strongest correlations were between the clerkship OSCE grades and the Step 2 CS Integrated Clinical Encounter component score, Step 2 Clinical Skills score, and Step 3 score. Contingency table associations between poor performance on both clerkships' OSCEs and poor Postgraduate Year 1 program director ratings were significant. CONCLUSIONS: The results of this study provide additional but limited validity evidence for the use of OSCEs during clinical clerkships, given their associations with subsequent performance measures.
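
The contingency-table step can be illustrated with a chi-square test of independence; the counts below are fabricated purely for demonstration and bear no relation to the study's data.

```python
# Sketch of a contingency-table analysis (made-up counts).
import numpy as np
from scipy.stats import chi2_contingency

# Rows: poor clerkship OSCE performance (no / yes)
# Cols: PGY-1 program director rating (adequate / poor)
table = np.array([[420, 15],
                  [ 45,  9]])
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}, dof = {dof}")
```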


Subjects
Clinical Clerkship; Clinical Competence; Education, Medical, Undergraduate; Educational Measurement/methods; Educational Measurement/statistics & numerical data; Humans; United States
4.
Teach Learn Med ; 26(4): 379-86, 2014.
Article in English | MEDLINE | ID: mdl-25318034

ABSTRACT

BACKGROUND: Recently, there has been a surge in the use of objective structured clinical examinations (OSCEs) at medical schools around the world, and with this growth has come the concomitant need to validate such assessments. PURPOSES: The current study examined the associations between student performance on several school-level clinical skills and knowledge assessments, including two OSCEs, the National Board of Medical Examiners® (NBME) Subject Examinations, and the United States Medical Licensing Examination® (USMLE) Step 2 Clinical Skills (CS) and Step 3 assessments. METHODS: The sample consisted of 806 medical students from the Uniformed Services University of the Health Sciences. We conducted Pearson correlation analysis as well as stepwise multiple linear regression modeling to examine the strength of associations between students' performance on 2nd- and 3rd-year OSCEs and their two Step 2 CS component scores and Step 3 scores. RESULTS: Positive associations were found between the OSCE variables and the USMLE scores; in particular, student performance on both the 2nd- and 3rd-year OSCEs was more strongly associated with the two Step 2 CS component scores than with Step 3 scores. CONCLUSIONS: These findings, although preliminary, provide some predictive validity evidence for the use of OSCEs in determining readiness of medical students for clinical practice and licensure.
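
Stepwise multiple linear regression can be sketched as forward selection on adjusted R². The abstract does not specify the study's stepwise criterion, so the snippet below is a generic illustration with synthetic variables and invented names.

```python
# Generic forward-stepwise OLS sketch (synthetic data, illustrative criterion).
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 806
X = pd.DataFrame({
    "osce_y2": rng.normal(0, 1, n),
    "osce_y3": rng.normal(0, 1, n),
    "nbme": rng.normal(0, 1, n),
})
y = 0.4 * X["osce_y3"] + 0.3 * X["osce_y2"] + rng.normal(0, 1, n)

def forward_select(X, y):
    chosen, remaining, best = [], list(X.columns), -np.inf
    while remaining:
        # Adjusted R^2 for each candidate added to the current model.
        scores = {c: sm.OLS(y, sm.add_constant(X[chosen + [c]])).fit().rsquared_adj
                  for c in remaining}
        cand = max(scores, key=scores.get)
        if scores[cand] <= best:      # stop when no candidate improves fit
            break
        best = scores[cand]
        chosen.append(cand)
        remaining.remove(cand)
    return chosen, best

print(forward_select(X, y))
```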


Subjects
Clinical Competence; Education, Medical, Undergraduate/standards; Educational Measurement/methods; Female; Humans; Male; United States; Young Adult
5.
Acad Med ; 99(2): 192-197, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-37934828

ABSTRACT

PURPOSE: In late 2022 and early 2023, reports that ChatGPT could pass the United States Medical Licensing Examination (USMLE) generated considerable excitement, and media coverage suggested ChatGPT has credible medical knowledge. This report analyzes the extent to which an artificial intelligence (AI) agent's performance on publicly available sample items can generalize to performance on an actual USMLE examination, using ChatGPT as an illustration. METHOD: As with earlier investigations, analyses were based on publicly available USMLE sample items. Each item was submitted to ChatGPT (version 3.5) three times to evaluate stability. Responses were scored following rules that match operational practice, and a preliminary analysis explored the characteristics of items that ChatGPT answered correctly. The study was conducted between February and March 2023. RESULTS: For the full sample of items, ChatGPT scored above 60% correct except for one replication for Step 3. Response success varied across replications for 76 items (20%). There was a modest correspondence with item difficulty: ChatGPT was more likely to respond correctly to items found easier by examinees. ChatGPT performed significantly worse (P < .001) on items relating to practice-based learning. CONCLUSIONS: Achieving 60% accuracy is an approximate indicator of meeting the passing standard, and the comparison requires statistical adjustment. Hence, this assessment can only suggest consistency with the passing standards for Steps 1 and 2 Clinical Knowledge, with further limitations in extrapolating this inference to Step 3. These limitations are due to variation in item difficulty and the exclusion of the simulation component of Step 3 from the evaluation; they would apply to any AI system evaluated on the Step 3 sample items. It is crucial to note that responses from large language models exhibit notable variation when faced with repeated inquiries, underscoring the need for expert validation to ensure their utility as a learning tool.
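
The replication logic, submitting each item several times and flagging items whose correctness varies, can be sketched as follows. `query_model` is a hypothetical stand-in for whatever API call a study would use; here it returns a random option purely so the sketch runs end to end.

```python
# Sketch of replication-stability checking (hypothetical model call).
import random

def query_model(item_stem: str) -> str:
    # Placeholder for the actual model call; random choice so the sketch runs.
    return random.choice("ABCDE")

def replication_stability(items, n_reps=3):
    unstable = []
    for item in items:
        graded = [query_model(item["stem"]) == item["key"] for _ in range(n_reps)]
        if len(set(graded)) > 1:      # correctness varied across replications
            unstable.append(item["id"])
    return unstable

items = [{"id": 1, "stem": "Sample stem A", "key": "C"},
         {"id": 2, "stem": "Sample stem B", "key": "A"}]
print("Items with unstable responses:", replication_stability(items))
```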


Subjects
Artificial Intelligence; Knowledge; Humans; Computer Simulation; Language; Learning
6.
J Gen Intern Med ; 27(1): 65-70, 2012 Jan.
Article in English | MEDLINE | ID: mdl-21879372

ABSTRACT

BACKGROUND: The United States Medical Licensing Examination® (USMLE®) Step 3® examination is a computer-based examination composed of multiple-choice questions (MCQ) and computer-based case simulations (CCS). The CCS portion of Step 3 is unique in that examinees are exposed to interactive patient-care simulations. OBJECTIVE: The purpose of this study is to investigate whether the type and length of examinees' postgraduate training impacts performance on the CCS component of Step 3, consistent with previous research on overall Step 3 performance. DESIGN: Retrospective cohort study. PARTICIPANTS: Medical school graduates from U.S. and Canadian institutions completing Step 3 for the first time between March 2007 and December 2009 (n = 40,588). METHODS: Postgraduate training was classified as either broadly focused for general areas of medicine (e.g., pediatrics) or narrowly focused for specific areas of medicine (e.g., radiology). A three-way between-subjects MANOVA was used to test for main and interaction effects on Step 3 and CCS scores between the demographic characteristics of the sample and type of residency. Additionally, to examine the impact of postgraduate training, CCS scores were regressed on Step 1 and Step 2 Clinical Knowledge (CK) scores, and residuals from the resulting regressions were plotted. RESULTS: There was a significant difference in CCS scores between broadly focused (μ = 216, σ = 17) and narrowly focused (μ = 211, σ = 16) residencies (p < 0.001). Examinees in broadly focused residencies performed better overall and as length of training increased, compared with examinees in narrowly focused residencies. Step 1 and Step 2 CK scores explained 55% of overall Step 3 variability but only 9% of CCS score variability. CONCLUSIONS: Factors influencing performance on the CCS component may be similar to those affecting Step 3 overall. Findings support the validity of the Step 3 program and may be useful to program directors and residents in considering readiness to take this examination.
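
The residual analysis in METHODS can be illustrated simply: regress CCS scores on Step 1 and Step 2 CK, then compare residuals across residency types. Synthetic data; the group labels and effect sizes below are illustrative only.

```python
# Sketch of regressing CCS on prior Step scores and inspecting group residuals.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 1000
df = pd.DataFrame({
    "step1": rng.normal(220, 20, n),
    "step2ck": rng.normal(230, 20, n),
    "broad": rng.integers(0, 2, n),   # 1 = broadly focused residency (synthetic)
})
df["ccs"] = (0.1 * df["step1"] + 0.1 * df["step2ck"]
             + 5 * df["broad"] + rng.normal(0, 15, n))

fit = smf.ols("ccs ~ step1 + step2ck", data=df).fit()
df["resid"] = fit.resid
# Positive mean residual => scoring above expectation given prior Step scores.
print(df.groupby("broad")["resid"].mean())
```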


Subjects
Clinical Competence/standards; Decision Making, Computer-Assisted; Education, Medical, Graduate/standards; Educational Measurement/standards; Internship and Residency/standards; Licensure, Medical/standards; Canada; Education, Medical, Graduate/methods; Educational Measurement/methods; Female; Humans; Internship and Residency/methods; Male; Retrospective Studies; United States
7.
Adv Health Sci Educ Theory Pract ; 17(3): 325-37, 2012 Aug.
Article in English | MEDLINE | ID: mdl-21964951

ABSTRACT

Examinees who initially fail and later repeat an SP-based clinical skills exam typically exhibit large score gains on their second attempt, suggesting the possibility that examinees were not well measured on one of those attempts. This study evaluates score precision for examinees who repeated an SP-based clinical skills test administered as part of the US Medical Licensing Examination sequence. Generalizability theory was used as the basis for computing conditional standard errors of measurement (SEM) for individual examinees. Conditional SEMs were computed for approximately 60,000 single-take examinees and 5,000 repeat examinees who completed the Step 2 Clinical Skills Examination® between 2007 and 2009. The study focused exclusively on ratings of communication and interpersonal skills. Conditional SEMs for single-take and repeat examinees were nearly indistinguishable across most of the score scale. US graduates and IMGs were measured with equal levels of precision at all score levels, as were examinees with differing levels of skill speaking English. There was no evidence that examinees with the largest score changes were measured poorly on either their first or second attempt. The large score increases for repeat examinees on this SP-based exam probably cannot be attributed to unexpectedly large errors of measurement.
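
Under a simple persons-by-encounters design, a conditional SEM for examinee p can be computed as sqrt(s_p² / n), where s_p² is that examinee's score variance across the n encounters. A minimal numpy sketch with simulated ratings follows; the operational G-theory model behind the study is richer than this.

```python
# Minimal conditional-SEM sketch for a persons x encounters design.
import numpy as np

rng = np.random.default_rng(3)
scores = rng.normal(7.0, 1.2, size=(5000, 12))   # examinees x encounters (simulated)

within_var = scores.var(axis=1, ddof=1)          # s_p^2 per examinee
conditional_sem = np.sqrt(within_var / scores.shape[1])
print(f"mean SEM {conditional_sem.mean():.3f}, sd {conditional_sem.std():.3f}")
```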


Subjects
Clinical Competence/standards; Educational Measurement/standards; Physical Examination; Communication; Humans; Licensure; Patient Simulation; Students, Medical; United States
8.
Adv Health Sci Educ Theory Pract ; 17(4): 557-71, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22041870

ABSTRACT

Multiple studies examining the relationship between physician gender and performance on examinations have found consistent significant gender differences, but relatively little information is available related to any gender effect on interviewing and written communication skills. The United States Medical Licensing Examination (USMLE) Step 2 Clinical Skills (CS) examination is a multi-station examination where examinees (physicians in training) interact with, and are rated by, standardized patients (SPs) portraying cases in an ambulatory setting. Data from a recent complete year (2009) were analyzed via a series of hierarchical linear models to examine the impact of examinee gender on performance on the data gathering (DG) and patient note (PN) components of this examination. Results from both components show that not only do women have higher scores on average, but women continue to perform significantly better than men when other examinee and case variables are taken into account. Generally, the effect sizes are moderate, reflecting an approximately 2% score advantage by encounter. The advantage for female examinees increased for encounters that did not require a physical examination (for the DG component only) and for encounters that involved a Women's Health issue (for both components). The gender of the SP did not have an impact on the examinee gender effect for DG, indicating a desirable lack of interaction between examinee and SP gender. The implications of the findings, especially with respect to the validity of the use of the examination outcomes, are discussed.
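
The hierarchical structure here, encounters nested within examinees, maps naturally onto a mixed model with a random intercept per examinee. A sketch with statsmodels' MixedLM on synthetic data; the variable names and effect sizes are illustrative, not the study's.

```python
# Sketch of a two-level model: encounters nested within examinees.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n_examinees, n_enc = 500, 10
female = rng.integers(0, 2, n_examinees)
intercepts = rng.normal(0, 3, n_examinees)       # examinee random intercepts

df = pd.DataFrame({
    "examinee": np.repeat(np.arange(n_examinees), n_enc),
    "female": np.repeat(female, n_enc),
})
df["dg"] = (70 + 1.5 * df["female"]              # small gender fixed effect
            + np.repeat(intercepts, n_enc)
            + rng.normal(0, 5, len(df)))

fit = smf.mixedlm("dg ~ female", df, groups=df["examinee"]).fit()
print(fit.params["female"], fit.pvalues["female"])
```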


Subjects
Clinical Competence/standards; Educational Measurement/methods; Licensure, Medical/standards; Students, Medical/psychology; Analysis of Variance; Clinical Competence/statistics & numerical data; Communication; Educational Measurement/statistics & numerical data; Female; Humans; Interpersonal Relations; Male; Patient Simulation; Reproducibility of Results; Sex Factors; Students, Medical/statistics & numerical data; United States
9.
Eval Health Prof ; 45(4): 327-340, 2022 12.
Article in English | MEDLINE | ID: mdl-34753326

ABSTRACT

One of the most challenging aspects of writing multiple-choice test questions is identifying plausible incorrect response options, i.e., distractors. To help with this task, a procedure is introduced that can mine existing item banks for potential distractors by considering the similarities between a new item's stem and answer and the stems and response options for items in the bank. This approach uses natural language processing to measure similarity and requires a substantial pool of items for constructing the generating model. The procedure is demonstrated with data from the United States Medical Licensing Examination (USMLE®). For about half the items in the study, at least one of the top three system-produced candidates matched a human-produced distractor exactly; and for about one quarter of the items, two of the top three candidates matched human-produced distractors. A study was conducted in which a sample of system-produced candidates were shown to 10 experienced item writers. Overall, participants thought about 81% of the candidates were on topic and 56% would help human item writers with the task of writing distractors.
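
One way to realize the similarity idea is TF-IDF plus cosine similarity: embed the bank's stems, find the bank item closest to the new stem, and propose its options as candidate distractors. The toy bank below is invented, and the operational system's NLP is not specified in the abstract beyond "similarity"; this is only a sketch of the retrieval step.

```python
# Toy sketch of similarity-based distractor mining.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

bank = [
    {"stem": "Adult with crushing chest pain radiating to the left arm",
     "options": ["Myocardial infarction", "Pericarditis", "Costochondritis"]},
    {"stem": "Child with barking cough and inspiratory stridor",
     "options": ["Croup", "Epiglottitis", "Foreign body aspiration"]},
]
new_stem = "Older adult with sudden substernal chest pain and diaphoresis"

vec = TfidfVectorizer()
X = vec.fit_transform([b["stem"] for b in bank])   # embed bank stems
q = vec.transform([new_stem])                      # embed the new stem
best = cosine_similarity(q, X).ravel().argmax()
print("Candidate distractors from most similar bank item:", bank[best]["options"])
```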


Subjects
Educational Measurement; Natural Language Processing; Humans; United States; Educational Measurement/methods
11.
Acad Med ; 81(10 Suppl): S17-20, 2006 Oct.
Article in English | MEDLINE | ID: mdl-17001127

ABSTRACT

BACKGROUND: The purpose of the present study was to assess, based on substantive considerations, the fit of three factor analytic (FA) models with a representative set of United States Medical Licensing Examination (USMLE) Step 2 Clinical Skills (CS) cases and examinees. METHOD: Checklist, patient note, communication and interpersonal skills, and spoken English proficiency data were collected from 387 examinees on a set of four USMLE Step 2 CS cases. The fit of skills-based, case-based, and hybrid models was assessed. RESULTS: Findings show that a skills-based model best accounted for performance on the set of four CS cases. CONCLUSION: Results of this study provide evidence to support the structural aspect of validity. The proficiency set used by examinees when performing on the Step 2 CS cases is consistent with the scoring rubric employed and the blueprint used in form assembly. These findings are discussed in light of past research in this area.
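
Comparing factor structures by model fit can be sketched with held-out log-likelihood from sklearn's FactorAnalysis; this stands in for, and is much cruder than, the substantive skills- vs. case-based comparison in the study. The data below simulate a four-factor (skills-like) structure purely for illustration.

```python
# Sketch of factor-model comparison via held-out log-likelihood.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n, n_scores = 387, 16                       # e.g., 4 skills x 4 cases (simulated)
latent = rng.normal(size=(n, 4))            # 4 underlying "skill" factors
loadings = rng.normal(size=(4, n_scores))
X = latent @ loadings + rng.normal(0, 1, (n, n_scores))

for k in (1, 4, 8):
    # FactorAnalysis.score returns average log-likelihood, so cross_val_score
    # gives a held-out fit comparison across candidate factor counts.
    ll = cross_val_score(FactorAnalysis(n_components=k), X, cv=5).mean()
    print(f"{k} factors: mean held-out log-likelihood {ll:.2f}")
```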


Subjects
Clinical Competence; Communication; Interpersonal Relations; Licensure, Medical; Factor Analysis, Statistical; Humans; United States
12.
Acad Med ; 81(10 Suppl): S61-4, 2006 Oct.
Article in English | MEDLINE | ID: mdl-17001138

ABSTRACT

BACKGROUND: Data from national surveys indicate that patient characteristics can influence the time physicians spend interviewing and assessing patients. The purposes of this investigation were to gather information regarding the relationship between encounter time and case characteristics for simulated clinical encounters and to provide evidence that the time provided to gather data was adequate. Timing data were extracted from the United States Medical Licensing Examination Step 2 Clinical Skills examination. METHOD: To test the relative effects of case characteristics on encounter time, an analysis of variance was conducted with encounter time as the dependent variable and case characteristics as the independent variables. RESULTS: Mean encounter times were computed based on the case characteristics. Station format (history only, history and physical examination, telephone cases) predicted the most variance in encounter time (16%). CONCLUSIONS: Balancing examination content from administration to administration ensures a mix of cases that provides adequate time limits for examinees.
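
The ANOVA described in METHOD can be illustrated with a one-factor version: encounter time as the dependent variable and station format as the independent variable. The data, formats, and means below are invented for the sketch.

```python
# Sketch of an encounter-time ANOVA with station format as the factor.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(6)
fmt = rng.choice(["history_only", "history_physical", "telephone"], size=900)
minutes = {"history_only": 12, "history_physical": 14, "telephone": 10}

df = pd.DataFrame({"fmt": fmt})
df["time"] = df["fmt"].map(minutes) + rng.normal(0, 2, len(df))

fit = smf.ols("time ~ C(fmt)", data=df).fit()
print(anova_lm(fit))
print(f"Variance explained by format: {fit.rsquared:.0%}")
```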


Subjects
Clinical Competence; Licensure, Medical; Physical Examination/methods; Adolescent; Adult; Aged; Analysis of Variance; Female; Humans; Male; Middle Aged; Time Factors; United States
13.
Mil Med ; 180(4 Suppl): 24-30, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25850123

ABSTRACT

BACKGROUND: The Essential Elements of Communication (EEC) were developed from the Kalamazoo consensus statement on physician-patient communication. The Uniformed Services University of the Health Sciences (USU) has adopted a longitudinal curriculum that uses the EEC both as a learning tool during standardized patient encounters and as an evaluation tool, culminating in the end-of-preclerkship objective structured clinical examination (OSCE). Medical educators have recently emphasized the importance of teaching communication skills, as evidenced by the United States Medical Licensing Examination testing both the integrated clinical encounter (ICE) and communication and interpersonal skills (CIS) within the Step 2 Clinical Skills (CS) exam. PURPOSE: To determine the associations between students' EEC OSCE performance at the end of the preclerkship period and later communication skills assessment and evaluation outcomes, in the context of a longitudinal curriculum spanning both undergraduate and graduate medical education. METHODS: Retrospective data from the preclerkship period (overall OSCE scores and EEC OSCE scores) and clerkships (internal medicine [IM] clinical points and average clerkship National Board of Medical Examiners [NBME] scores) were collected from 167 USU medical students from the class of 2011 and compared to individual scores on the CIS and ICE components of Step 2 CS, as well as to the communication skills component of the program directors' evaluation of trainees during their postgraduate year 1 (PGY-1) residency. In addition to bivariate Pearson correlation analysis, we conducted multiple linear regression analysis to examine the predictive power of the EEC score, beyond the IM clerkship clinical points and the average NBME Subject Exams score, on the outcome measures. RESULTS: The EEC score was a significant predictor of the CIS score and the PGY-1 communication skills score. Beyond the average NBME Subject Exams score and the IM clerkship clinical points, the EEC score explained an additional 13% of the variance in the Step 2 CIS score and an additional 6% of the variance in the PGY-1 communication skills score. In addition, the EEC score was more closely associated with the CIS score than with the ICE score. CONCLUSION: The use of a standardized approach with a communication tool like the EEC can help explain future performance in communication skills independent of other education outcomes. In the context of a longitudinal curriculum, this information may better inform medical educators about learners' communication capabilities and more accurately direct future remediation efforts.
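
The incremental contribution of a predictor beyond the controls can also be tested formally with a nested-model F-test. A sketch with statsmodels' compare_f_test follows; all variable names and data are synthetic placeholders, not the study's.

```python
# Sketch of a nested-model F-test for an added predictor's contribution.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 167
df = pd.DataFrame({
    "nbme": rng.normal(0, 1, n),
    "im_points": rng.normal(0, 1, n),
    "eec": rng.normal(0, 1, n),
})
df["cis"] = (0.2 * df["nbme"] + 0.2 * df["im_points"]
             + 0.4 * df["eec"] + rng.normal(0, 1, n))

reduced = smf.ols("cis ~ nbme + im_points", data=df).fit()
full = smf.ols("cis ~ nbme + im_points + eec", data=df).fit()
f_stat, p_value, df_diff = full.compare_f_test(reduced)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}, "
      f"added R^2 = {full.rsquared - reduced.rsquared:.3f}")
```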


Subjects
Communication; Curriculum; Educational Measurement/statistics & numerical data; Physician-Patient Relations; Students, Medical/statistics & numerical data; Adult; Clinical Clerkship/statistics & numerical data; Clinical Competence; Education, Medical, Graduate/methods; Education, Medical, Graduate/statistics & numerical data; Education, Medical, Undergraduate/methods; Education, Medical, Undergraduate/statistics & numerical data; Educational Measurement/methods; Female; Humans; Longitudinal Studies; Male; Retrospective Studies; United States
14.
Mil Med ; 180(4 Suppl): 97-103, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25850135

ABSTRACT

BACKGROUND: In the early 1990s, our group of interdepartmental academicians at the Uniformed Services University (USU) developed a PGY-1 (postgraduate year 1) program director evaluation form. Recently, we revised it to better align with the core competencies established by the Accreditation Council for Graduate Medical Education and included items that reflect USU's military-unique context. PURPOSE: To collect feasibility, reliability, and validity evidence for our revised survey. METHOD: We collected PGY-1 data from program directors (PDs) who oversee the training of military medical trainees. The cohort of the present study consisted of USU students graduating in 2010 and 2011. We performed exploratory factor analysis (EFA) to examine the factorial validity of the survey scores and subjected each of the factors identified in the EFA to an internal consistency reliability analysis. We then performed correlation analysis to examine the relationship between PD ratings and students' medical school grade point averages (GPAs) and performance on U.S. Medical Licensing Examination Step assessments. RESULTS: Five factors emerged from the EFA: Medical Expertise, Military-unique Practice, Professionalism, System-based Practice, and Communication and Interpersonal Skills. The evaluation form also showed good reliability and feasibility. All five factors were more strongly associated with students' GPA in the initial clerkship year than in the first 2 years. Further, these factors showed stronger correlations with students' performance on Step 3 than on the other Step examinations. CONCLUSIONS: The revised PD evaluation form appears to be a valid and reliable tool for gauging medical graduates' first-year internship performance.
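
The internal-consistency step can be illustrated with Cronbach's alpha computed directly from item variances for the items loading on one factor. The ratings below are simulated; this is a generic sketch, not the study's analysis.

```python
# Sketch of Cronbach's alpha for one factor's items.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: examinees x items matrix for a single factor's items."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

rng = np.random.default_rng(8)
true_score = rng.normal(size=(300, 1))
ratings = true_score + rng.normal(0, 0.7, size=(300, 5))  # 5 items, one factor
print(f"alpha = {cronbach_alpha(ratings):.2f}")
```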


Subjects
Clinical Competence; Education, Medical, Graduate/standards; Educational Measurement/methods; Students, Medical/statistics & numerical data; Surveys and Questionnaires/standards; Adult; Education, Medical, Graduate/methods; Factor Analysis, Statistical; Faculty, Medical; Feasibility Studies; Female; Humans; Male; Reproducibility of Results; Schools, Medical; United States
15.
Mil Med ; 180(4 Suppl): 4-11, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25850120

ABSTRACT

BACKGROUND: The Medical College Admission Test (MCAT) is a high-stakes test required for entry to most U.S. medical schools; admissions committees use it to predict future accomplishment. Although there is evidence that the MCAT predicts success on multiple-choice-based assessments, there is little information on whether it predicts clinical-based assessments of undergraduate and graduate medical education performance. This study examined associations between the MCAT and medical school grade point average (GPA), United States Medical Licensing Examination (USMLE) scores, observed patient care encounters, and residency performance assessments. METHODS: This study used data collected as part of the Long-Term Career Outcome Study to determine associations between MCAT scores; USMLE Step 1, Step 2 Clinical Knowledge and Clinical Skills, and Step 3 scores; Objective Structured Clinical Examination (OSCE) performance; medical school GPA; and PGY-1 program director (PD) assessment of physician performance for students graduating in 2010 and 2011. RESULTS: MCAT data were available for all students, and the PGY-1 PD evaluation response rate was 86.2% (N = 340). All permutations of MCAT scores (first, last, highest, average) were weakly associated with GPA, Step 2 Clinical Knowledge scores, and Step 3 scores. MCAT scores were weakly to moderately associated with Step 1 scores. MCAT scores were not significantly associated with the Step 2 Clinical Skills Integrated Clinical Encounter and Communication and Interpersonal Skills subscores, OSCE performance, or PGY-1 PD evaluations. DISCUSSION: MCAT scores were weakly to moderately associated with assessments that rely on multiple-choice testing. The association was somewhat stronger for assessments occurring earlier in medical school, such as USMLE Step 1. The MCAT did not predict assessments relying on direct clinical observation, nor did it predict PD assessment of PGY-1 performance.
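
The core of this kind of analysis is pairwise Pearson correlations with significance tests. A minimal sketch with simulated scores follows; the associations below are built in for illustration and are not the study's results.

```python
# Sketch of pairwise Pearson correlations between an admissions score and outcomes.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(9)
n = 340
mcat = rng.normal(30, 3, n)
outcomes = {
    "step1": 0.4 * mcat + rng.normal(0, 6, n),   # built-in moderate association
    "pd_rating": rng.normal(4, 0.5, n),          # no built-in association
}
for name, y in outcomes.items():
    r, p = pearsonr(mcat, y)
    print(f"MCAT vs {name}: r = {r:.2f}, p = {p:.3g}")
```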


Subjects
Clinical Competence/statistics & numerical data; College Admission Test/statistics & numerical data; Forecasting; Schools, Medical; Students, Medical/statistics & numerical data; Achievement; Adult; Female; Humans; Male; United States
16.
Acad Med ; 79(10 Suppl): S52-4, 2004 Oct.
Article in English | MEDLINE | ID: mdl-15383389

ABSTRACT

PURPOSE: To examine the impact of a timing change on pacing behavior and perceptions in a high-stakes multiple-choice examination. METHOD: Two samples of standard-time examinees were analyzed: (1) 29,796 examinees who completed the examination prior to the timing change, and (2) 28,373 examinees who completed the examination after the change. Subgroups of examinees were identified and compared within and across samples with respect to perceptions, accuracy, and pacing. RESULTS: After the timing change, more examinees reported having sufficient time to complete examination sections; a small improvement in overall accuracy was observed; and there was a shift in the time-per-item strategy, though examinees continued to use more than the average amount of time available per item at the beginning of sections. CONCLUSIONS: Examinees are more satisfied with the new timing constraints, although an effect due to the time limit continues to impact performance at the end of test sections.
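
Pacing analyses of this kind typically compare time per item by within-section position across cohorts. A small pandas sketch with fabricated timing data, purely to illustrate the comparison:

```python
# Sketch of a pacing comparison: time per item by position, before vs after.
import numpy as np
import pandas as pd

rng = np.random.default_rng(10)
rows = []
for cohort, base in (("before", 80.0), ("after", 74.0)):
    for pos in range(1, 51):                    # 50 items per section (invented)
        rows.append({"cohort": cohort, "position": pos,
                     "seconds": base - 0.4 * pos + rng.normal(0, 2)})
df = pd.DataFrame(rows)

print(df.groupby("cohort")["seconds"].mean())   # overall pacing shift
# Early items: where examinees spend more than the average time available.
print(df[df["position"] <= 5].groupby("cohort")["seconds"].mean())
```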


Subjects
Clinical Competence; Education, Medical; Educational Measurement; Licensure, Medical; Attitude of Health Personnel; Canada; Cohort Studies; Foreign Medical Graduates; Humans; International Educational Exchange; Personal Satisfaction; Time Factors; Time Management; United States
17.
Acad Med ; 78(10 Suppl): S75-7, 2003 Oct.
Article in English | MEDLINE | ID: mdl-14557102

ABSTRACT

PROBLEM STATEMENT AND BACKGROUND: The purpose of the present study was to examine the extent to which an automated scoring procedure that emulates expert ratings with latent semantic analysis could be used to score the written patient note component of the proposed clinical skills examination (CSE). METHOD: Human ratings for four CSE cases collected in 2002 were compared to automated holistic scores and to regression-based scores based on automated holistic and component scores. RESULTS AND CONCLUSIONS: Regression-based scores account for approximately half of the variance in the human ratings and are more highly correlated with the ratings than the scores produced from the automated algorithm. Implications of this study and suggestions for follow-up research are discussed.
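
LSA-style scoring can be sketched as TF-IDF plus truncated SVD to obtain a latent-semantic representation, with a regression mapping those features onto human ratings. The notes and ratings below are toy examples; the operational scoring model is more elaborate than this.

```python
# Sketch of LSA-based note scoring: TF-IDF -> truncated SVD -> regression.
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

notes = [
    "chest pain radiating to arm, diaphoresis, risk factors present",
    "cough and fever for three days, crackles on auscultation",
    "pain in chest with exertion, relieved by rest",
    "headache with photophobia and neck stiffness",
]
human_ratings = np.array([8.0, 6.5, 7.0, 5.5])   # invented holistic ratings

embed = make_pipeline(TfidfVectorizer(), TruncatedSVD(n_components=2, random_state=0))
X = embed.fit_transform(notes)                   # latent-semantic features
reg = LinearRegression().fit(X, human_ratings)   # regression-based scores
print("Predicted ratings:", reg.predict(X).round(1))
```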


Subjects
Educational Measurement/statistics & numerical data; Medical History Taking; Physical Examination; Software; Algorithms; Clinical Competence; Humans; Licensure, Medical; Linear Models; United States
18.
Acad Med ; 89(5): 762-6, 2014 May.
Article in English | MEDLINE | ID: mdl-24667514

ABSTRACT

PURPOSE: To investigate the association between poor performance on National Board of Medical Examiners clinical subject examinations across six core clerkships and performance on the United States Medical Licensing Examination Step 3 examination. METHOD: In 2012, the authors studied matriculants from the Uniformed Services University of the Health Sciences with available Step 3 scores and subject exam scores on all six clerkships (Classes of 2007-2011, N = 654). Poor performance on subject exams was defined as scoring one standard deviation (SD) or more below the mean using the national norms of the corresponding test year. The association between poor performance on the subject exams and the probability of passing or failing Step 3 was tested using contingency table analyses and logistic regression modeling. RESULTS: Students performing poorly on one subject exam were significantly more likely to fail Step 3 (OR 14.23 [95% CI 1.7-119.3]) compared with students with no subject exam scores that were 1 SD below the mean. Poor performance on more than one subject exam further increased the chances of failing (OR 33.41 [95% CI 4.4-254.2]). This latter group represented 27% of the entire cohort, yet contained 70% of the students who failed Step 3. CONCLUSIONS: These findings suggest that individual schools could benefit from a review of subject exam performance to develop and validate their own criteria for identifying students at risk for failing Step 3.
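
The logistic-regression step, with odds ratios recovered as exp(coefficients), can be sketched as follows. The cohort size matches the abstract, but all counts and failure rates below are simulated for illustration.

```python
# Sketch of logistic regression with odds ratios and confidence intervals.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(11)
n = 654
df = pd.DataFrame({
    # Number of subject exams >= 1 SD below the mean (simulated distribution).
    "n_poor": rng.choice([0, 1, 2, 3], size=n, p=[0.55, 0.18, 0.15, 0.12]),
})
logit_p = -5 + 1.3 * df["n_poor"]
df["fail"] = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)

fit = smf.logit("fail ~ n_poor", data=df).fit(disp=False)
print("Odds ratios:\n", np.exp(fit.params))
print("95% CI:\n", np.exp(fit.conf_int()))
```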


Subjects
Clinical Clerkship; Education, Medical, Undergraduate/standards; Educational Measurement; Licensure, Medical; Confidence Intervals; Female; Humans; Logistic Models; Male; Needs Assessment; Odds Ratio; United States; Young Adult
19.
Acad Med ; 88(5): 688-92, 2013 May.
Article in English | MEDLINE | ID: mdl-23524920

ABSTRACT

PURPOSE: Previous studies on standardized patient (SP) exams reported score gains both across attempts when examinees failed and retook the exam and over multiple SP encounters within a single exam session. The authors analyzed the within-session score gains of examinees who repeated the United States Medical Licensing Examination Step 2 Clinical Skills to answer two questions: How much do scores increase within a session? Can the pattern of increasing first-attempt scores account for across-session score gains? METHOD: Data included encounter-level scores for 2,165 U.S. and Canadian medical students and graduates who took Step 2 Clinical Skills twice between April 1, 2005 and December 31, 2010. The authors modeled examinees' score patterns using smoothing and regression techniques and applied statistical tests to determine whether the patterns were the same or different across attempts. In addition, they tested whether any across-session score gains could be explained by the first-attempt within-session score trajectory. RESULTS: For the first and second attempts, the authors attributed examinees' within-session score gains to a pattern of score increases over the first three to six SP encounters followed by a leveling off. Model predictions revealed that the authors could not attribute the across-session score gains to the first-attempt within-session score gains. CONCLUSIONS: The within-session score gains over the first three to six SP encounters of both attempts indicate that there is a temporary "warm-up" effect on performance that "resets" between attempts. Across-session gains are not due to this warm-up effect and likely reflect true improvement in performance.
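
The smoothing step can be illustrated with a lowess curve of encounter score against encounter position, which makes an early "warm-up" rise followed by a plateau visible. The scores below are simulated with that pattern built in; the `xvals` argument assumes statsmodels 0.12 or later.

```python
# Sketch of lowess smoothing of score vs. encounter position.
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(12)
position = np.tile(np.arange(1, 13), 200)      # 12 encounters x 200 examinees
warmup = np.minimum(position, 5) * 0.8         # gains level off after ~5 encounters
score = 60 + warmup + rng.normal(0, 4, position.size)

grid = np.arange(1, 13)
smoothed = lowess(score, position, frac=0.4, xvals=grid)
for x, y in zip(grid, smoothed):
    print(f"encounter {x:2d}: smoothed score {y:.1f}")
```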


Subjects
Educational Measurement/methods; Licensure, Medical; Physical Examination/standards; Canada; Clinical Competence/standards; Clinical Competence/statistics & numerical data; Educational Measurement/standards; Educational Measurement/statistics & numerical data; Humans; Models, Statistical; Regression Analysis; United States
20.
Acad Med ; 86(10 Suppl): S17-20, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21955761

ABSTRACT

BACKGROUND: Women typically demonstrate stronger communication skills on performance-based assessments using human raters in medical education settings. This study examines the effects of examinee and rater gender on communication and interpersonal skills (CIS) scores from the performance-based component of the United States Medical Licensing Examination, the Step 2 Clinical Skills (CS) examination. METHOD: Data included demographic and performance information for examinees who took Step 2 CS for the first time in 2009. The sample contained 27,910 examinees, 625 standardized patient/case combinations, and 278,776 scored patient encounters. Hierarchical linear modeling techniques were employed with CIS scores as the outcome measure. RESULTS: Females tend to slightly outperform males on CIS when other variables related to performance are taken into account. No evidence of an interaction between examinee and rater gender was found. CONCLUSIONS: Results provide validity evidence supporting the interpretation and use of Step 2 CS CIS scores.
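
Testing for an examinee-by-rater gender interaction in a hierarchical model can be sketched with a mixed model and a product term. The synthetic data below build in a small examinee-gender main effect and no interaction, mirroring the null interaction finding only in structure, not in values.

```python
# Sketch of an interaction test in a mixed model: examinee x rater gender.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(13)
n_examinees, n_enc = 400, 10
ex_female = rng.integers(0, 2, n_examinees)
intercepts = rng.normal(0, 3, n_examinees)

df = pd.DataFrame({
    "examinee": np.repeat(np.arange(n_examinees), n_enc),
    "ex_female": np.repeat(ex_female, n_enc),
    "sp_female": rng.integers(0, 2, n_examinees * n_enc),
})
df["cis"] = (70 + 1.0 * df["ex_female"]          # main effect only, no interaction
             + np.repeat(intercepts, n_enc)
             + rng.normal(0, 5, len(df)))

fit = smf.mixedlm("cis ~ ex_female * sp_female", df, groups=df["examinee"]).fit()
print(fit.params["ex_female:sp_female"])         # near zero => no interaction
```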


Subjects
Communication; Licensure, Medical; Physician-Patient Relations; Analysis of Variance; Educational Measurement; Female; Humans; Male; Patients; Sex Factors