ABSTRACT
Research suggests that the three-option format is optimal for multiple-choice questions (MCQs). This conclusion is supported by numerous studies showing that most distractors (i.e., incorrect answers) are selected by so few examinees that they are essentially nonfunctional. However, nearly all studies have defined a distractor as nonfunctional if it is selected by fewer than 5% of examinees. A limitation of this definition is that the proportion of examinees available to choose a distractor depends on overall item difficulty. This is especially problematic for mastery tests, which consist of items that most examinees are expected to answer correctly. Under the traditional definition, a five-option MCQ answered correctly by more than 90% of examinees is constrained to have at most one functional distractor. The primary purpose of the present study was to evaluate an index of distractor nonfunctionality that is sensitive to item difficulty. A secondary purpose was to extend previous research by studying distractor functionality within the context of professionally developed credentialing tests. Data were analyzed for 840 MCQs consisting of five options per item. Results based on the traditional definition of nonfunctional were consistent with previous research, indicating that most MCQs had one or two functional distractors. In contrast, the newly proposed index indicated that nearly half (47.3%) of all items had three or four functional distractors. Implications for item and test development are discussed.
Subjects
Education, Medical/methods; Education, Medical/standards; Educational Measurement/methods; Educational Measurement/standards; Choice Behavior; Humans; Models, Statistical; Psychometrics
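The difficulty dependence described in the abstract above is easy to make concrete: on a five-option item answered correctly by 92% of examinees, only 8% of responses remain to be split across four distractors, so at most one distractor can reach the traditional threshold of 5% of all examinees. The sketch below contrasts that criterion with one plausible difficulty-conditional alternative. The abstract does not specify the proposed index, so the adjusted rule here (distractor share among incorrect responses) is an assumption for illustration, not the study's actual formula.

```python
# Contrast the traditional "functional distractor" rule (chosen by at least
# 5% of ALL examinees) with a difficulty-conditional variant (chosen by at
# least 5% of the examinees who answered incorrectly). The adjusted rule is
# a hypothetical stand-in for the index proposed in the study.

def functional_distractors(option_counts: dict[str, int], key: str,
                           threshold: float = 0.05) -> dict[str, dict]:
    total = sum(option_counts.values())
    n_wrong = total - option_counts[key]  # examinees who missed the item
    report = {}
    for option, n in option_counts.items():
        if option == key:
            continue
        share_all = n / total                          # traditional denominator
        share_wrong = n / n_wrong if n_wrong else 0.0  # difficulty-adjusted
        report[option] = {
            "share_all": round(share_all, 3),
            "share_wrong": round(share_wrong, 3),
            "functional_traditional": share_all >= threshold,
            "functional_adjusted": share_wrong >= threshold,
        }
    return report

# A five-option item answered correctly by 92% of 1,000 examinees (key = A).
# No distractor reaches 5% of all examinees, yet all four exceed 5% of the
# examinees who answered incorrectly. Counts are hypothetical.
counts = {"A": 920, "B": 35, "C": 25, "D": 15, "E": 5}
for option, stats in functional_distractors(counts, key="A").items():
    print(option, stats)
```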
ABSTRACT
PURPOSE: In 2007, the United States Medical Licensing Examination embedded multimedia simulations of heart sounds into multiple-choice questions. This study investigated changes in item difficulty, as determined by examinee performance, over time. The data reflect outcomes obtained following initial use of the multimedia items from 2007 through 2012, after which an interface change occurred. METHOD: A total of 233,157 examinees responded to 1,306 cardiology test items over the six-year period; 138 items included multimedia simulations of heart sounds, while 1,168 text-based items without multimedia served as controls. The authors compared changes in difficulty over time for multimedia items with changes in difficulty over time for text-based cardiology items. Further, they compared changes in item difficulty for both groups of items between graduates of Liaison Committee on Medical Education (LCME)-accredited and non-LCME-accredited (i.e., international) medical schools. RESULTS: Examinee performance on cardiology test items with multimedia heart sounds improved by 12.4% over the six-year period, while performance on text-based cardiology items improved by approximately 1.4%. These results were similar for graduates of LCME-accredited and non-LCME-accredited medical schools. CONCLUSIONS: Examinees' ability to interpret auscultation findings in test items that include multimedia presentations increased from 2007 to 2012.
Subjects
Cardiology/education; Education, Medical/methods; Educational Measurement/methods; Heart Auscultation/methods; Simulation Training/statistics & numerical data; Adult; Clinical Competence; Female; Humans; Licensure, Medical; Male; Multimedia; Reading; Simulation Training/methods; United States
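The comparison described above reduces to tracking mean percent correct by year, separately for multimedia and text-based items, and comparing the changes. A minimal sketch, with fabricated item statistics standing in for the 2007-2012 USMLE data:

```python
# Mean percent correct per year for multimedia vs. text-based items, and the
# change from the first to the last year. All records below are hypothetical.

from collections import defaultdict
from statistics import mean

# (year, item_group, percent_correct) -- hypothetical item statistics
records = [
    (2007, "multimedia", 61.0), (2007, "multimedia", 63.5),
    (2012, "multimedia", 74.0), (2012, "multimedia", 75.5),
    (2007, "text", 72.0), (2007, "text", 73.0),
    (2012, "text", 73.5), (2012, "text", 74.5),
]

by_group_year = defaultdict(list)
for year, group, pct in records:
    by_group_year[(group, year)].append(pct)

for group in ("multimedia", "text"):
    first = mean(by_group_year[(group, 2007)])
    last = mean(by_group_year[(group, 2012)])
    print(f"{group}: {first:.1f}% -> {last:.1f}% (change {last - first:+.1f})")
```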
ABSTRACT
BACKGROUND: Residency programs commonly use performance on the Orthopaedic In-Training Examination (OITE), developed by the American Academy of Orthopaedic Surgeons (AAOS), to identify residents who are lagging behind their peers and at risk of failing Part I of the American Board of Orthopaedic Surgery (ABOS) Certifying Examination. This study was designed to investigate the utility of the OITE score as a predictor of ABOS Part I performance. METHOD: Results for 3132 examinees who took Part I of the ABOS examination for the first time from 2002 to 2006 were matched with records from the 1997 to 2006 OITE tests; at least one OITE score was located for 2852 (91%) of the ABOS Part I examinees. After OITE performance was rescaled to place scores from different test years on comparable scales, descriptive statistics and correlations between ABOS and OITE scores were computed, and regression analyses were conducted to predict ABOS results from OITE performance. RESULTS: Substantial increases in the mean OITE score were observed as residents progressed through training. Stronger correlations were observed between OITE and ABOS performance during later years in training, reaching a maximum of 0.53 in years 3 and 4. Logistic regression results indicated that residents with an OITE score below the 10th percentile were much more likely to fail Part I compared with those with an OITE score above the 50th percentile. CONCLUSIONS: OITE performance was a good predictor of the ABOS score and pass-fail outcome; the OITE can be used effectively for early identification of residents at risk of failing the ABOS Part I examination.
Subjects
Clinical Competence/standards; Internship and Residency; Orthopedics/education; Canada; Certification; Educational Measurement; Orthopedics/standards; United States
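The headline logistic-regression finding above, that examinees below the 10th OITE percentile were much likelier to fail than those above the 50th, can be illustrated with a simple failure-rate and odds-ratio comparison. The examinee records below are hypothetical, not the study's data:

```python
# Failure rates and odds ratio for examinees below the 10th OITE percentile
# vs. above the 50th. All (percentile, passed) pairs are hypothetical.

def odds(p: float) -> float:
    return p / (1 - p)

# (oite_percentile, passed_abos_part1) -- hypothetical examinees
examinees = [(5, False), (8, False), (9, True), (4, False),
             (60, True), (75, True), (55, True), (80, True),
             (90, True), (52, False)]

low = [passed for pct, passed in examinees if pct < 10]
high = [passed for pct, passed in examinees if pct > 50]

fail_low = low.count(False) / len(low)
fail_high = high.count(False) / len(high)
print(f"failure rate below 10th percentile: {fail_low:.0%}")
print(f"failure rate above 50th percentile: {fail_high:.0%}")
print(f"odds ratio (low vs. high): {odds(fail_low) / odds(fail_high):.1f}")
```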
ABSTRACT
BACKGROUND: This study investigated the strength of the relationship between performance on Part I of the American Board of Orthopaedic Surgery (ABOS) Certifying Examination and scores on United States Medical Licensing Examination (USMLE) Steps 1 and 2. METHOD: USMLE Step 1 and Step 2 scores on first attempt were matched with ABOS Part I results for U.S./Canadian graduates taking Part I for the first time between 2002 and 2006. Linear and logistic regression analyses investigated the relationship between ABOS Part I performance and scores on USMLE Steps 1 and 2. RESULTS: Step 1 and Step 2 each individually explained 29% of the variation in Part I scores; using both scores together increased this percentage to 34%. Results of logistic regression analyses showed a similar, moderately strong relationship with Part I pass/fail outcomes: examinees with low scores on Steps 1 and 2 were at substantially greater risk of failing Part I. CONCLUSIONS: There is continuing empirical support for the use of Step 1 and Step 2 scores in selecting residents to interview for orthopaedic residency positions.
Subjects
Certification; Clinical Competence/standards; Educational Measurement; Licensure, Medical; Orthopedics; United States
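The incremental-variance result above (29% of variance for either Step alone, 34% for both together) is ordinary least-squares R-squared computed with one versus two correlated predictors. A sketch with fabricated scores that reproduces the qualitative pattern, not the study's exact values:

```python
# R^2 for ABOS Part I scores regressed on Step 1 alone, Step 2 alone, and
# both together. All scores are fabricated; because the two predictors are
# correlated, the combined R^2 falls well below the sum of the individual
# values, qualitatively mirroring the .29/.29/.34 pattern in the abstract.

import numpy as np

rng = np.random.default_rng(0)
n = 500
step1 = rng.normal(220, 20, n)
step2 = 0.7 * step1 + rng.normal(66, 14, n)        # correlated with step1
abos = 0.5 * step1 + 0.5 * step2 + rng.normal(0, 30, n)

def r_squared(X: np.ndarray, y: np.ndarray) -> float:
    X = np.column_stack([np.ones(len(y)), X])      # add intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # ordinary least squares
    resid = y - X @ beta
    return 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

print(f"Step 1 alone:   R^2 = {r_squared(step1[:, None], abos):.2f}")
print(f"Step 2 alone:   R^2 = {r_squared(step2[:, None], abos):.2f}")
print(f"Steps 1 and 2:  R^2 = {r_squared(np.column_stack([step1, step2]), abos):.2f}")
```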
ABSTRACT
BACKGROUND: Studies of retention of basic science information have commonly demonstrated a decline in knowledge as students progress through medical education. This study examined item characteristics that influence patterns of retention. METHOD: A large, content- and statistically representative sample of basic science items from 2004-2005 forms of United States Medical Licensing Examination (USMLE) Step 1 was included in unscored sections of 2004-2005 USMLE Step 2 Clinical Knowledge (CK) test forms, and the performance of more than 15,000 first-time examinees from U.S. and Canadian schools was analyzed to identify item characteristics affecting retention. RESULTS: Across the 502 study items, the mean item difficulty (percent correct) on Step 1 was 76.1%; on Step 2 CK, this value declined to 69.7%. Performance declines were largest in Biochemistry (17.5%) and Microbiology (12.6%). Improvement was observed only for Behavioral Sciences items (8.7%). CONCLUSIONS: Shifts in examinee performance in this study were similar to those observed in previous research, although the magnitude of the overall decline was somewhat larger.
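At its core, the analysis above is a paired comparison of percent correct on Step 1 versus Step 2 CK for the same items, aggregated by discipline. A minimal sketch with hypothetical items:

```python
# Paired change in percent correct (Step 2 CK minus Step 1) per discipline.
# The items below are hypothetical stand-ins for the 502 study items.

from collections import defaultdict
from statistics import mean

# (discipline, pct_correct_step1, pct_correct_step2ck) -- hypothetical items
items = [
    ("Biochemistry", 78.0, 60.0), ("Biochemistry", 74.0, 57.5),
    ("Microbiology", 80.0, 67.0), ("Microbiology", 76.0, 64.0),
    ("Behavioral Sciences", 70.0, 78.5), ("Behavioral Sciences", 72.0, 81.0),
]

changes = defaultdict(list)
for discipline, p1, p2 in items:
    changes[discipline].append(p2 - p1)           # positive = improvement

for discipline, deltas in changes.items():
    print(f"{discipline}: mean change {mean(deltas):+.1f} points")
```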