1.
Appl Psychol Meas ; 48(1-2): 57-76, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38327610

ABSTRACT

The article presents adaptive testing strategies for polytomously scored technology-enhanced innovative items. We investigate item selection methods that match examinees' ability levels in location and explore ways to leverage test-taking speeds during item selection. Existing approaches to selecting polytomous items are mostly based on information measures and tend to experience an item pool usage problem. In this study, we introduce location indices for polytomous items and show that location-matched item selection significantly alleviates the usage problem and achieves more diverse item sampling. We also consider matching items' time intensities so that testing times can be regulated across examinees. A numerical experiment based on Monte Carlo simulation suggests that location-matched item selection achieves significantly better and more balanced item pool usage. Leveraging working speed in item selection distinctly reduced the average testing times as well as their variation across examinees. Both procedures incurred marginal measurement cost (e.g., in precision and efficiency) and yet showed significant improvement in the administrative outcomes. The experiment in two test settings also suggested that the procedures can lead to different administrative gains depending on the test design.
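
As a rough illustration of the idea of location matching, the sketch below picks the unadministered item whose location lies closest to the current ability estimate. The abstract does not define the location indices, so the single `location` number per item, the function name `select_item`, and the pool values are all hypothetical; this is not the authors' procedure.

```python
def select_item(theta_hat, pool, administered):
    """Pick the unadministered item whose location is closest to theta_hat.

    pool maps item ids to a single location value (a hypothetical stand-in
    for the location index mentioned in the abstract).
    """
    candidates = [item for item in pool if item not in administered]
    if not candidates:
        raise ValueError("no items left in the pool")
    return min(candidates, key=lambda item: abs(pool[item] - theta_hat))


# Illustrative pool only; the locations are made up.
pool = {"item_a": -1.2, "item_b": 0.1, "item_c": 0.4, "item_d": 1.5}
first = select_item(0.3, pool, administered=set())
second = select_item(0.3, pool, administered={first})
```

Because selection depends on distance rather than on item information, items with modest information but well-matched locations can still be chosen, which is one plausible reading of why pool usage becomes more balanced.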

2.
Behav Res Methods ; 56(2): 615-638, 2024 Feb.
Article in English | MEDLINE | ID: mdl-36749543

ABSTRACT

The increasing use of intelligent tutoring systems in education calls for analytic methods that can unravel students' learning behaviors. In this study, we explore a latent variable modeling approach for tracking learning flow during computer-interactive artificial tutoring. The study considers three models that give discrete profiles of a latent process: the (i) latent class model, (ii) latent transition model, and (iii) hidden Markov model. We illustrate the application of each model using example log data from Cognitive Tutor Algebra I and suggest analytic procedures for drawing learning flow. Through experimental application, we show that the models can reveal substantive information about students' learning behaviors and have potential utility for describing the learning flow. The models differed in their assumptions and data constraints but yielded consistent findings on the flow states and interaction modalities. Based on our experiential analyses, we discuss strengths and limitations of the models and illuminate areas of future development.


Subjects
Learning , Problem-Based Learning , Humans , Latent Class Analysis , Problem-Based Learning/methods , Students , Intelligence
3.
Psychometrika ; 88(2): 672-696, 2023 Jun.
Article in English | MEDLINE | ID: mdl-35661320

ABSTRACT

The study presents statistical procedures that monitor the functioning of items over time. We propose generalized likelihood ratio tests that surveil multiple item parameters and implement them with various sampling techniques to perform continuous or intermittent monitoring. The procedures examine the stability of item parameters across time and signal compromise as soon as they identify a significant parameter shift. The performance of the monitoring procedures was validated using simulated and real-assessment data. The empirical evaluation suggests that the proposed procedures perform adequately in identifying parameter drift. They showed satisfactory detection power and gave timely signals while keeping error rates reasonably low. The procedures also showed superior performance when compared with existing methods. The empirical findings suggest that multivariate parametric monitoring can provide an efficient and powerful control tool for maintaining the quality of items. The procedures allow joint monitoring of multiple item parameters and achieve sufficient power using powerful likelihood-ratio tests. Based on the findings from the empirical experimentation, we suggest some practical strategies for performing online item monitoring.


Subjects
Statistical Models , Psychometrics/methods , Likelihood Functions , Empirical Research
4.
Educ Psychol Meas ; 82(4): 811-838, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35754615

ABSTRACT

The development of technology-enhanced innovative items calls for practical models that can describe polytomous testlet items. In this study, we evaluate four measurement models that can characterize polytomous items administered in testlets: (a) generalized partial credit model (GPCM), (b) testlet-as-a-polytomous-item model (TPIM), (c) random-effect testlet model (RTM), and (d) fixed-effect testlet model (FTM). Using data from GPCM, FTM, and RTM, we examine the performance of the scoring models in multiple aspects: relative model fit, absolute item fit, significance of testlet effects, parameter recovery, and classification accuracy. The empirical analysis suggests that the relative performance of the models varies substantially depending on the testlet-effect type, effect size, and trait estimator. When testlets had no or fixed effects, GPCM and FTM led to the most desirable measurement outcomes. When testlets had random interaction effects, RTM demonstrated the best model fit and yet showed substantially different performance in trait recovery depending on the estimator. In particular, the advantage of RTM as a scoring model was discernible only when there existed strong random effects and the trait levels were estimated with Bayes priors. In other settings, the simpler models (i.e., GPCM, FTM) performed better or comparably. The study also revealed that polytomous scoring of testlet items has limited prospects as a functional scoring method. Based on the outcomes of the empirical evaluation, we provide practical guidelines for choosing a measurement model for polytomous innovative items that are administered in testlets.

5.
Br J Math Stat Psychol ; 75(1): 136-157, 2022 Feb.
Article in English | MEDLINE | ID: mdl-34462913

ABSTRACT

The increasing use of innovative items in operational assessments has shed new light on polytomous testlet models. In this study, we examine the performance of several scoring models when polytomous items exhibit random testlet effects. Four models are considered for investigation: the partial credit model (PCM), testlet-as-a-polytomous-item model (TPIM), random-effect testlet model (RTM), and fixed-effect testlet model (FTM). The performance of the models was evaluated in two adaptive testing settings where testlets have nonzero random effects. The outcomes of the study suggest that, despite the manifest random testlet effects, PCM, FTM, and RTM perform comparably in trait recovery and examinee classification. The overall accuracy of PCM and FTM in trait inference was comparable to that of RTM. TPIM consistently underestimated population variance and led to significant overestimation of measurement precision, showing limited utility for operational use. The results of the study provide practical implications for using the polytomous testlet scoring models.


Subjects
Computerized Adaptive Testing
6.
Physiol Meas ; 41(5): 055012, 2020 Jun 15.
Article in English | MEDLINE | ID: mdl-32252039

ABSTRACT

The rapid emergence of new measurement instruments and methods requires personnel and researchers from different disciplines to know the correct statistical methods for comparing their performance with reference ones and for properly interpreting the findings. We discuss the often-made mistake of applying inappropriate correlation and regression approaches to compare methods and then explain the concepts of agreement and reliability. Then, we introduce the intraclass correlation as a measure of inter-rater reliability and the Bland-Altman plot as a measure of agreement, and we provide formulae to calculate them along with illustrative examples for different types of study designs, specifically single measurement per subject, repeated measurement while the true value is constant, and repeated measurement when the true value is not constant. We emphasize the requirement to validate the assumptions of these statistical approaches, explain how to deal with violations, and provide formulae for calculating the confidence interval for estimated values of agreement and intraclass correlation. Finally, we explain how to interpret and report the findings of these statistical analyses.
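
For the simplest design named above (a single measurement per subject), the usual Bland-Altman quantities are the mean difference (bias) and the limits of agreement, bias ± 1.96 × SD of the differences. The following is a minimal sketch of that textbook calculation, not the article's own formulae; the function name and the data are illustrative assumptions, and the normality assumption behind the 1.96 multiplier still has to be checked separately.

```python
from statistics import mean, stdev


def bland_altman_limits(method_a, method_b, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Assumes one measurement per subject per method and approximately
    normally distributed differences.
    """
    if len(method_a) != len(method_b):
        raise ValueError("paired measurements must have equal length")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD (n - 1 denominator)
    return bias, bias - z * sd, bias + z * sd


# Made-up readings from a new device and a reference method.
new_device = [10.2, 11.8, 9.9, 12.4, 10.7]
reference = [10.0, 12.0, 10.1, 12.0, 10.5]
bias, lower, upper = bland_altman_limits(new_device, reference)
```

Note that the calculation works on differences between the two methods, which is why a high correlation between them says little about agreement.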


Subjects
Statistics as Topic/methods , Regression Analysis , Reproducibility of Results
7.
J Appl Meas ; 20(1): 66-78, 2019.
Article in English | MEDLINE | ID: mdl-30789833

ABSTRACT

Computerized adaptive testing (CAT) is an attractive alternative to traditional paper-and-pencil testing because it can provide accurate trait estimates while administering fewer items than a linear test form. A stopping rule is an important factor in determining an assessment's efficiency. This simulation compares three variable-length stopping rules, namely standard error (SE) of .3, minimum information (MI) of .7, and change in trait (CT) of .02, with and without a maximum number of items (20) imposed. We use fixed-length criteria of 10 and 20 items as a comparison for two versions of a linear assessment. The MI rules resulted in longer assessments with more biased trait estimates in comparison to other rules. The CT rule resulted in more biased estimates at the higher end of the trait scale and larger standard errors. The SE rules performed well across the trait scale in terms of both measurement precision and efficiency.
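
The three variable-length rules can be sketched as a single check run after each administered item. The thresholds (.3, .7, .02) and the 20-item cap come from the abstract; the exact operational definitions are assumptions here: SE stops once the standard error is at or below the threshold, MI stops once no remaining item offers at least the threshold information, and CT stops once the trait estimate changes by less than the threshold. The function and argument names are hypothetical.

```python
def should_stop(rule, n_items, se=None, max_remaining_info=None,
                theta_change=None, max_items=None):
    """Decide whether a CAT session should end after the latest item.

    rule is one of "SE", "MI", or "CT"; max_items optionally caps test length.
    """
    if max_items is not None and n_items >= max_items:
        return True  # length cap reached regardless of the rule
    if rule == "SE":
        return se is not None and se <= 0.3
    if rule == "MI":
        return max_remaining_info is not None and max_remaining_info < 0.7
    if rule == "CT":
        return theta_change is not None and abs(theta_change) < 0.02
    raise ValueError(f"unknown stopping rule: {rule}")


# Example: an SE rule with the 20-item maximum imposed.
stop_early = should_stop("SE", n_items=12, se=0.28, max_items=20)
stop_at_cap = should_stop("SE", n_items=20, se=0.41, max_items=20)
```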


Subjects
Biometry , Computer Simulation , Statistical Models , Shoulder , Humans
8.
Br J Math Stat Psychol ; 71(3): 523-535, 2018 Nov.
Article in English | MEDLINE | ID: mdl-29516492

ABSTRACT


A monotone relationship between a true score (τ) and a latent trait level (θ) has been a key assumption for many psychometric applications. The monotonicity property in dichotomous response models is evident as a result of a transformation via a test characteristic curve. Monotonicity in polytomous models, in contrast, is not immediately obvious because item response functions are determined by a set of response category curves, which are conceivably non-monotonic in θ. The purpose of the present note is to demonstrate strict monotonicity in ordered polytomous item response models. Five models that are widely used in operational assessments are considered for proof: the generalized partial credit model (Muraki, 1992, Applied Psychological Measurement, 16, 159), the nominal model (Bock, 1972, Psychometrika, 37, 29), the partial credit model (Masters, 1982, Psychometrika, 47, 147), the rating scale model (Andrich, 1978, Psychometrika, 43, 561), and the graded response model (Samejima, 1972, A general model for free-response data (Psychometric Monograph no. 18). Psychometric Society, Richmond). The study asserts that the item response functions in these models strictly increase in θ and thus there exists strict monotonicity between τ and θ under certain specified conditions. This conclusion validates the practice of customarily using τ in place of θ in applied settings and provides theoretical grounds for one-to-one transformations between the two scales.
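
The claim can be illustrated numerically for one of the listed models. The sketch below computes the true score τ(θ) as the sum of expected item scores under the generalized partial credit model and checks that it increases along a grid of θ values. This is a numerical spot check under assumed conditions (positive discriminations, made-up item parameters), not the proof given in the article.

```python
import math


def gpcm_probs(theta, a, steps):
    """Category probabilities for one GPCM item with step parameters `steps`."""
    # Cumulative logits: z_0 = 0, z_k = sum over v <= k of a * (theta - b_v).
    z = [0.0]
    for b in steps:
        z.append(z[-1] + a * (theta - b))
    top = max(z)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in z]
    total = sum(weights)
    return [w / total for w in weights]


def true_score(theta, items):
    """Expected total score: sum over items of sum_k k * P(X = k | theta)."""
    return sum(
        sum(k * p for k, p in enumerate(gpcm_probs(theta, a, steps)))
        for a, steps in items
    )


# Hypothetical items given as (discrimination a > 0, step parameters).
items = [(1.0, [-1.0, 0.5]), (0.7, [0.0, 1.0, 2.0]), (1.4, [-0.5])]
grid = [x / 10 for x in range(-40, 41)]
taus = [true_score(t, items) for t in grid]
is_increasing = all(lo < hi for lo, hi in zip(taus, taus[1:]))
```

With these items the maximum possible score is 6, so τ(θ) climbs from near 0 toward 6 as θ grows.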


Subjects
Theoretical Models , Psychometrics/methods , Humans , Likelihood Functions , Psychological Theory
9.
Health Aff (Millwood) ; 36(6): 1024-1031, 2017 Jun 01.
Article in English | MEDLINE | ID: mdl-28583960

ABSTRACT

Social determinants of health, such as poverty and minority background, severely disadvantage many people with mental disorders. A variety of innovative federal, state, and local programs have combined social services with mental health interventions. To explore the potential effects of such supports for addressing poverty and disadvantage on mental health outcomes, we simulated improvements in three social determinants: education, employment, and income. We used two large data sets: one from the National Institute of Mental Health that contained information about people with common mental disorders such as anxiety and depression, and another from the Social Security Administration that contained information about people who were disabled due to severe mental disorders such as schizophrenia and bipolar disorder. Our simulations showed that increasing employment was significantly correlated with improvements in mental health outcomes, while increasing education and income produced weak or nonsignificant correlations. In general, minority groups as well as the majority group of non-Latino whites improved in the desired outcomes. We recommend that health policy leaders, state and federal agencies, and insurers provide evidence-based employment services as a standard treatment for people with mental disorders.


Subjects
Computer Simulation , Education , Employment , Income , Mental Disorders , Minority Groups , Humans , Poverty , Surveys and Questionnaires
10.
Br J Math Stat Psychol ; 70(2): 187-208, 2017 May.
Article in English | MEDLINE | ID: mdl-27958648

ABSTRACT


The Cox proportional hazards model with a latent trait variable (Ranger & Ortner, 2012, Br. J. Math. Stat. Psychol., 65, 334) has shown promise in accounting for the dependency of response times from the same examinee. The model allows flexibility in shapes of response time distributions using the non-parametric baseline hazard rate while allowing parametric inference about the latent variable via exponential regression. The flexibility of the model, however, comes at the price of a significant increase in the complexity of estimating the model. The purpose of this study is to propose a new estimation approach to overcome this difficulty in model estimation. The new procedure is based on the penalized partial likelihood estimator in which the partial likelihood is maximized in the presence of a penalty function. The potential of the proposed method is corroborated by a series of simulation studies for fitting the proportional hazards latent trait model to psychological and educational testing data. The application of the estimation method to the hierarchical framework (van der Linden, 2007, Psychometrika, 72, 287) is also illustrated for jointly analysing response times and accuracy scores.


Subjects
Likelihood Functions , Proportional Hazards Models , Humans , Probability , Reaction Time
11.
Appl Psychol Meas ; 40(7): 534-550, 2016 Oct.
Article in English | MEDLINE | ID: mdl-29881068

ABSTRACT


An informational distance/divergence-based approach is proposed to detect the presence of parameter drift in multidimensional computerized adaptive testing (MCAT). The study presents significance testing procedures for identifying changes in multidimensional item response functions (MIRFs) over time based on informational distance/divergence measures that capture the discrepancy between two probability functions. To approximate the MIRFs from the observed response data, the k-nearest neighbors algorithm is used with the random search method. A simulation study suggests that the distance/divergence-based drift measures perform effectively in identifying the instances of parameter drift in MCAT. They showed moderate power with small samples of 500 examinees and excellent power when the sample size was as large as 1,000. The proposed drift measures also adequately controlled for Type I error at the nominal level under the null hypothesis.
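
To make "discrepancy between two probability functions" concrete, the sketch below compares an item's response probabilities before and after a shift in one parameter. The abstract does not name the specific measures, so Kullback-Leibler divergence and Hellinger distance are used here only as common examples. The compensatory multidimensional 2PL form, the parameter values, and the function names are assumptions, and the k-nearest neighbors estimation step and the significance test are not reproduced.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q); assumes q has no zero entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions (range 0 to 1)."""
    return math.sqrt(
        0.5 * sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    )


def m2pl_prob(theta, a, d):
    """Correct-response probability under a compensatory multidimensional 2PL."""
    z = sum(ai * ti for ai, ti in zip(a, theta)) + d
    return 1.0 / (1.0 + math.exp(-z))


theta = (0.5, -0.3)  # hypothetical two-dimensional ability point
p_old = m2pl_prob(theta, a=(1.2, 0.8), d=0.0)
p_new = m2pl_prob(theta, a=(1.2, 0.8), d=-0.6)  # hypothetical intercept drift
old_dist = [1.0 - p_old, p_old]
new_dist = [1.0 - p_new, p_new]
kl = kl_divergence(old_dist, new_dist)
hd = hellinger_distance(old_dist, new_dist)
```

Both values are zero when the two response functions coincide at the chosen ability point and grow as the item parameters move apart.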
