Results 1 - 3 of 3
1.
Educ Psychol Meas; 83(6): 1202-1228, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37974655

ABSTRACT

For large-scale assessments, data are often collected with missing responses. Despite the wide use of item response theory (IRT) in many testing programs, the existing literature offers little insight into the effectiveness of various approaches to handling missing responses in the context of scale linking. Scale linking is commonly used in large-scale assessments to maintain scale comparability across multiple forms of a test. Under a common-item nonequivalent groups design (CINEG), missing data on the common items can influence the linking coefficients and, consequently, may affect scale comparability, test validity, and reliability. The objective of this study was to evaluate the effect of six missing data handling approaches on IRT scale linking accuracy when missing data occur on common items: listwise deletion (LWD), treating missing data as incorrect responses (IN), corrected item mean imputation (CM), imputation with the response function (RF), multiple imputation (MI), and full information maximum likelihood (FIML). Under a set of simulation conditions, the relative performance of the six methods was explored under two missingness mechanisms. Results showed that RF, MI, and FIML produced fewer errors in scale linking, whereas LWD was associated with the most errors across all testing conditions.
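To make the IN and RF approaches concrete, below is a minimal sketch (not from the article) of how missing responses on common items might be handled before linking, assuming a 2PL model and provisional ability estimates; the function name, arguments, and the use of NumPy are illustrative assumptions.

```python
import numpy as np

def handle_missing_common_items(responses, a, b, theta, method="rf", rng=None):
    """Handle missing (NaN) responses on common items before IRT scale linking.

    method="in": score each missing response as incorrect (the IN approach).
    method="rf": impute each missing response from the 2PL item response
                 function evaluated at a provisional ability estimate
                 (the RF approach).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    r = responses.astype(float).copy()       # persons x items, NaN = missing
    missing = np.isnan(r)
    if method == "in":
        r[missing] = 0.0
    else:
        # 2PL probability of a correct response for every person-item pair
        p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))
        r[missing] = rng.binomial(1, p[missing])
    return r
```

After the missing entries are handled, item parameters would be re-estimated on each form and linking coefficients computed from the common items (e.g., with a characteristic-curve method such as Stocking-Lord).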

2.
Educ Psychol Meas; 82(4): 617-642, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35754617

ABSTRACT

This article extends multivariate generalizability theory (MGT) to tests with different random-effects designs for each level of a fixed facet. There are numerous situations in which the design of a test and the resulting data structure cannot be described by a single design. One example is mixed-format tests composed of multiple-choice and free-response items, with the latter involving variability attributable to both items and raters. In this case, two distinct designs are needed to fully characterize the test and capture the potential sources of error associated with each item format. Another example involves tests containing both testlets and one or more stand-alone sets of items: testlet effects need to be taken into account for the testlet-based items but not for the stand-alone items. This article presents an extension of MGT that faithfully models such complex test designs, along with two real-data examples. Among other things, these examples illustrate that estimates of error variance, error-tolerance ratios, and reliability-like coefficients can be biased if there is a mismatch between the user-specified universe of generalization and the complex nature of the test.
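As a hypothetical numeric illustration of why two designs may be needed, the sketch below combines variance components from a persons x items (p x i) design for a multiple-choice part with those from a persons x items x raters (p x i x r) design for a free-response part into a composite relative error variance; all variance components, sample sizes, and weights are invented, and error covariance between the two parts is assumed to be zero.

```python
# Multiple-choice part (p x i design): relative error comes from the
# person-by-item interaction, averaged over the number of items.
var_pi_mc, n_i_mc = 0.20, 40
rel_err_mc = var_pi_mc / n_i_mc

# Free-response part (p x i x r design): rater-related interactions also
# contribute to relative error.
var_pi_fr, var_pr_fr, var_pir_fr = 0.30, 0.05, 0.25
n_i_fr, n_r_fr = 4, 2
rel_err_fr = (var_pi_fr / n_i_fr
              + var_pr_fr / n_r_fr
              + var_pir_fr / (n_i_fr * n_r_fr))

# Composite score with fixed format weights; error covariance assumed zero.
w_mc, w_fr = 0.6, 0.4
rel_err_composite = w_mc**2 * rel_err_mc + w_fr**2 * rel_err_fr

print(rel_err_mc, rel_err_fr, rel_err_composite)
```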

3.
Educ Psychol Meas; 80(1): 91-125, 2020 Feb.
Article in English | MEDLINE | ID: mdl-31933494

ABSTRACT

A theoretical and conceptual framework for true-score equating under a simple-structure multidimensional item response theory (SS-MIRT) model is developed, along with a true-score equating method referred to as the SS-MIRT true-score equating (SMT) procedure. SS-MIRT has several advantages over more complex multidimensional IRT models, including improved estimation efficiency and straightforward interpretability. The performance of the SMT procedure was examined and evaluated through four studies using different data types. In these studies, results from the SMT procedure were compared with results from four other equating methods to assess the relative benefits of SMT. In general, SMT produced more accurate equating results than traditional unidimensional IRT (UIRT) true-score equating when the data were multidimensional. This advantage of SMT over UIRT true-score equating was observed consistently across the studies, supporting the benefits of a multidimensional approach to equating multidimensional data. SMT also performed similarly to an SS-MIRT observed-score method across all studies.
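For context, the UIRT true-score equating that SMT is compared against inverts the Form X test characteristic curve and evaluates the Form Y curve at the resulting ability. Below is a minimal sketch of that baseline under an assumed 3PL model; the function names and the SciPy root-finder choice are assumptions, and the SS-MIRT extension described in the article would replace the single theta with a vector of dimension-specific abilities, one per item cluster.

```python
import numpy as np
from scipy.optimize import brentq

def tcc(theta, a, b, c):
    """Test characteristic curve: expected number-correct score under the 3PL."""
    p = c + (1.0 - c) / (1.0 + np.exp(-1.7 * a * (theta - b)))
    return p.sum()

def uirt_true_score_equate(score_x, ax, bx, cx, ay, by, cy):
    """Equate a Form X number-correct score to the Form Y scale.

    Step 1: find theta such that the Form X TCC equals score_x
            (score_x must lie between sum(cx) and the number of Form X items).
    Step 2: return the Form Y TCC evaluated at that theta.
    """
    theta = brentq(lambda t: tcc(t, ax, bx, cx) - score_x, -8.0, 8.0)
    return tcc(theta, ay, by, cy)
```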
