ABSTRACT
Recently, Variational Autoencoders (VAEs) have been proposed as a method to estimate high-dimensional Item Response Theory (IRT) models on large datasets. Although they improve the efficiency of estimation drastically compared to traditional methods, they have no natural way to deal with missing values. In this paper, we adapt three existing methods from the VAE literature to the IRT setting and propose one new method. We compare the performance of the different VAE-based methods to each other and to marginal maximum likelihood estimation for increasing levels of missing data in a simulation study for both three- and ten-dimensional IRT models. Additionally, we demonstrate the use of the VAE-based models on an existing algebra test dataset. Results confirm that VAE-based methods are a time-efficient alternative to marginal maximum likelihood, but that a larger number of importance-weighted samples is needed when the proportion of missing values is large.
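As an illustration of how missing responses can be handled in a VAE-based IRT estimator, the sketch below masks unobserved items out of the reconstruction likelihood and uses an importance-weighted bound. It is a minimal PyTorch sketch under our own assumptions (the class name IRTVAE, the encoder architecture, and the convention that missing entries are zero-filled and flagged in a mask are all illustrative), not one of the estimators compared in the paper.

```python
# Illustrative sketch only (not the paper's implementation): a masked,
# importance-weighted VAE for a multidimensional 2PL IRT model in PyTorch.
# Class and argument names are assumptions made for the example.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class IRTVAE(nn.Module):
    def __init__(self, n_items, n_dims, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_items * 2, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, n_dims)
        self.logvar = nn.Linear(hidden, n_dims)
        self.loadings = nn.Parameter(0.1 * torch.randn(n_items, n_dims))  # discriminations
        self.intercepts = nn.Parameter(torch.zeros(n_items))              # easiness parameters

    def iw_bound(self, x, mask, k=5):
        """Importance-weighted bound. x: responses with missing entries set to 0;
        mask: 1 = observed, 0 = missing; k: number of importance-weighted samples."""
        h = self.encoder(torch.cat([x * mask, mask], dim=1))   # encode data plus missingness
        mu, logvar = self.mu(h), self.logvar(h)
        std = torch.exp(0.5 * logvar)
        log2pi = math.log(2 * math.pi)
        log_w = []
        for _ in range(k):
            z = mu + std * torch.randn_like(std)               # sample latent abilities
            logits = z @ self.loadings.T + self.intercepts     # 2PL item logits
            ll = -F.binary_cross_entropy_with_logits(logits, x, reduction="none")
            ll = (ll * mask).sum(dim=1)                        # observed items only
            log_p = (-0.5 * (z ** 2 + log2pi)).sum(dim=1)      # standard normal prior
            log_q = (-0.5 * (((z - mu) / std) ** 2 + log2pi) - torch.log(std)).sum(dim=1)
            log_w.append(ll + log_p - log_q)
        log_w = torch.stack(log_w)                             # k x batch
        return (torch.logsumexp(log_w, dim=0) - math.log(k)).mean()
```

Increasing k tightens the bound, which is consistent with the observation above that more importance-weighted samples are needed when the proportion of missing values is large.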
ABSTRACT
Background and aims: Problematic smartphone use (PSU) has gained attention, but its definition remains debated. This study aimed to develop and validate a new scale measuring PSU: the Smartphone Use Problems Identification Questionnaire (SUPIQ). Methods: Using two separate samples, a university community sample (N = 292) and a general population sample (N = 397), we investigated: (1) the construct validity of the SUPIQ through exploratory and confirmatory factor analyses; (2) the convergent validity of the SUPIQ with correlation analyses and visualized partial correlation network analyses; (3) the psychometric equivalence of the SUPIQ across the two samples through multigroup confirmatory factor analyses; (4) the explanatory power of the SUPIQ over the Short Version of the Smartphone Addiction Scale (SAS-SV) with hierarchical multiple regressions. Results: The results showed that the SUPIQ comprised 26 items and 7 factors (i.e., Craving, Coping, Habitual Use, Social Conflicts, Risky Use, Withdrawal, and Tolerance), with good construct and convergent validity. Configural measurement invariance across the samples was established. The SUPIQ also explained more variance in mental health problems than the SAS-SV. Discussion and conclusions: The findings suggest that the SUPIQ shows promise as a tool for assessing PSU. Further research is needed to enhance and refine the SUPIQ as well as to investigate its clinical utility.
Subjects
Internet Addiction Disorder, Psychometrics, Smartphone, Humans, Female, Male, Adult, Psychometrics/instrumentation, Psychometrics/standards, Young Adult, Internet Addiction Disorder/diagnosis, Reproducibility of Results, Middle Aged, Adolescent, Factor Analysis, Surveys and Questionnaires/standards, Aged, Addictive Behavior/diagnosis, Addictive Behavior/psychology
ABSTRACT
Measurement invariance is an assumption underlying the regression of a latent variable on a background variable. It requires the measurement model parameters of the latent variable to be equal across the levels of the background variable. Item-specific violations of this assumption are referred to as differential item functioning and are ideally substantively explainable to warrant theoretically valid and meaningful results. Past research has focused on developing statistical approaches to explain differential item functioning effects in terms of item- or person-specific covariates. In this study, we propose a modeling approach that can be used to test if differences in item response times can be used to statistically explain differential item functioning. To this end, we operationalize a latent response process factor and test if item-specific group differences on this factor can account for the observed differences in item scores. We investigate the properties of the model in a simulation study, and we apply the model to a real data set.
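To fix ideas, one hedged way to formalize this (the notation is ours and not necessarily the authors' parameterization) is a joint model in which a response-process factor, identified from the item response times, is allowed to carry group differences alongside the target trait:

```latex
% Illustrative notation only. g_p: grouping variable; theta_p: target trait;
% eta_p: latent response-process factor identified from the response times.
\begin{align*}
\operatorname{logit} P(X_{pi}=1) &= a_i\,\theta_p + b_i + \beta_i\, g_p + \lambda_i\,\eta_p,\\
\log T_{pi} &= \nu_i - \eta_p + \varepsilon_{pi}, \qquad \varepsilon_{pi}\sim N(0,\sigma_i^2).
\end{align*}
```

Differential item functioning is then statistically explained to the extent that the item-specific group effects beta_i vanish once item-specific group differences on the response-process factor eta_p are admitted into the response model.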
ABSTRACT
This article presents a joint modeling framework of ordinal responses and response times (RTs) for the measurement of latent traits. We integrate cognitive theories of decision-making and confidence judgments with psychometric theories to model individual-level measurement processes. The model development starts with the sequential sampling framework which assumes that when an item is presented, a respondent accumulates noisy evidence over time to respond to the item. Several cognitive and psychometric theories are reviewed and integrated, leading us to three psychometric process models with different representations of the cognitive processes underlying the measurement. We provide simulation studies that examine parameter recovery and show the relationships between latent variables and data distributions. We further test the proposed models with empirical data measuring three traits related to motivation. The results show that all three models provide reasonably good descriptions of observed response proportions and RT distributions. Also, different traits favor different process models, which implies that psychological measurement processes may have heterogeneous structures across traits. Our process of model building and examination illustrates how cognitive theories can be incorporated into psychometric model development to shed light on the measurement process, which has had little attention in traditional psychometric models.
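As a toy illustration of the sequential sampling framework referred to here (and only that; it is not one of the three process models developed in the article), the following sketch lets several ordinal response categories race to an evidence bound, yielding an ordinal response and a response time. The trait-weighting of the drift rates and all parameter values are assumptions made for the example.

```python
# Toy evidence-accumulation simulator: K ordinal categories race to a bound.
# Illustrative of the sequential sampling framework only, not the article's models.
import numpy as np

def simulate_item(theta, drifts, bound=2.0, dt=0.01, noise=1.0, t0=0.3, rng=None):
    """Return (ordinal response, response time) for one person-item encounter.

    drifts: per-category drift rates; here shifted by the latent trait theta
            so that higher theta favours higher categories (an assumption)."""
    rng = np.random.default_rng() if rng is None else rng
    k = len(drifts)
    mu = np.asarray(drifts) + theta * np.linspace(-0.5, 0.5, k)  # trait-weighted drifts
    evidence = np.zeros(k)
    t = 0.0
    while evidence.max() < bound:
        evidence += mu * dt + noise * np.sqrt(dt) * rng.standard_normal(k)
        t += dt
    return int(evidence.argmax()), t0 + t   # category that hit the bound first, plus non-decision time

rng = np.random.default_rng(1)
resp, rt = simulate_item(theta=0.8, drifts=[0.2, 0.4, 0.6, 0.8, 1.0], rng=rng)
print(resp, round(rt, 2))
```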
Subjects
Judgment, Motivation, Reaction Time/physiology, Psychometrics, Computer Simulation
ABSTRACT
We report the results of an academic survey into the theoretical and methodological foundations, common assumptions, and the current state of the field of consciousness research. The survey consisted of 22 questions and was distributed at two editions of the annual meeting of the Association for the Scientific Study of Consciousness (2018 and 2019). We examined responses from 166 consciousness researchers with different backgrounds (e.g. philosophy, neuroscience, psychology, and computer science) and at various stages of their careers (e.g. junior/senior faculty and graduate/undergraduate students). The results reveal that there remains considerable discussion and debate among the surveyed researchers about the definition of consciousness and the way it should be studied. To highlight a few observations, a majority of respondents believe that machines could have consciousness, that consciousness is a gradual phenomenon in the animal kingdom, and that unconscious processing is extensive, encompassing both low-level and high-level cognitive functions. Further, we show which theories of consciousness are currently considered most promising by respondents and how supposedly different theories cluster together, which dependent measures are considered best to index the presence or absence of consciousness, and which neural measures are thought to be the most likely signatures of consciousness. These findings provide us with a snapshot of the current views of researchers in the field and may therefore help prioritize research and theoretical approaches to foster progress.
ABSTRACT
Assessing measurement invariance is an important step in establishing a meaningful comparison of measurements of a latent construct across individuals or groups. Most recently, moderated nonlinear factor analysis (MNLFA) has been proposed as a method to assess measurement invariance. In MNLFA models, measurement invariance is examined in a single-group confirmatory factor analysis model by means of parameter moderation. The advantages of MNLFA over other methods are that it (a) accommodates the assessment of measurement invariance across multiple continuous and categorical background variables and (b) accounts for heteroskedasticity by allowing the factor and residual variances to differ as a function of the background variables. In this article, we aim to make MNLFA more accessible to researchers without access to commercial structural equation modeling software by demonstrating how this method can be applied with the open-source R package OpenMx.
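The core of parameter moderation in MNLFA can be sketched as follows, in generic notation (consult the article and the OpenMx documentation for the exact specification used there):

```latex
% Generic MNLFA moderation functions of a background variable z (which may be
% continuous or categorical); notation is illustrative.
\begin{align*}
y_{pj} &= \nu_j(z_p) + \lambda_j(z_p)\,\eta_p + \varepsilon_{pj},\\
\nu_j(z_p) &= \nu_{0j} + \nu_{1j}\, z_p, \qquad \lambda_j(z_p) = \lambda_{0j} + \lambda_{1j}\, z_p,\\
\operatorname{Var}(\eta_p) &= \exp(\alpha_0 + \alpha_1\, z_p), \qquad
\operatorname{Var}(\varepsilon_{pj}) = \exp(\delta_{0j} + \delta_{1j}\, z_p).
\end{align*}
```

Measurement invariance with respect to z holds when the moderation coefficients on the intercepts and loadings are zero; the exponential link keeps the moderated factor and residual variances positive.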
ABSTRACT
Estimating the reliability of cognitive task datasets is commonly done via split-half methods. We review four methods that differ in how the trials are split into parts: a first-second half split, an odd-even trial split, a permutated split, and a Monte Carlo-based split. Additionally, each splitting method can be combined with stratification by task design. These methods are reviewed in terms of the degree to which they are confounded with four effects that may occur in cognitive tasks: effects of time, task design, trial sampling, and non-linear scoring. Based on the theoretical review, we recommend Monte Carlo splitting (possibly in combination with stratification by task design) as the most robust method with respect to the four confounds considered. Next, we estimated the reliabilities of the main outcome variables from four cognitive task datasets, each (typically) scored with a different non-linear algorithm, by systematically applying each splitting method. Differences between methods were interpreted in terms of confounding effects inflating or attenuating reliability estimates. For three task datasets, our findings were consistent with our model of confounding effects. Evidence for confounding effects was strong for time and task design and weak for non-linear scoring. When confounding effects occurred, they attenuated reliability estimates. For one task dataset, findings were inconsistent with our model, but they may offer indicators for assessing whether a split-half reliability estimate is appropriate. Additionally, we make suggestions for further research on reliability estimation, supported by a compendium R package that implements each of the splitting methods reviewed here.
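As an illustration of the Monte Carlo splitting logic (not the compendium package mentioned above), here is a minimal sketch, assuming trial-level data per participant and a user-supplied scoring function: repeatedly split each participant's trials at random into halves, score the halves, correlate the half scores across participants, and apply the Spearman-Brown correction.

```python
# Minimal Monte Carlo split-half sketch (illustrative; not the authors' package).
import numpy as np

def monte_carlo_split_half(trials_per_person, score_fn, n_splits=1000, rng=None):
    """trials_per_person: list of 1-D arrays of trial-level outcomes, one per person.
    score_fn: maps an array of trials to a scalar score (possibly non-linear)."""
    rng = np.random.default_rng() if rng is None else rng
    estimates = []
    for _ in range(n_splits):
        half_a, half_b = [], []
        for trials in trials_per_person:
            idx = rng.permutation(len(trials))       # random split of this person's trials
            split = len(trials) // 2
            half_a.append(score_fn(trials[idx[:split]]))
            half_b.append(score_fn(trials[idx[split:]]))
        r = np.corrcoef(half_a, half_b)[0, 1]        # half-score correlation across persons
        estimates.append(2 * r / (1 + r))            # Spearman-Brown correction
    return float(np.mean(estimates))

rng = np.random.default_rng(0)
data = [rng.normal(loc=mu, size=80) for mu in rng.normal(size=50)]  # 50 people, 80 trials each
print(round(monte_carlo_split_half(data, np.mean, n_splits=200, rng=rng), 2))
```

Stratification by task design would amount to permuting trials within design cells rather than across all trials; averaging the corrected estimates over splits is one of several reasonable aggregation choices.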
Subjects
Algorithms, Cognition, Humans, Monte Carlo Method, Reproducibility of Results
ABSTRACT
Psychopathy in females has been understudied. Extant data on gender comparisons using the predominant measure of assessment in clinical practice, the Psychopathy Checklist-Revised (PCL-R), point to a potential lack of measurement invariance (MI). If the instrument indeed does not perform equally (well) in both genders, straightforward comparison of psychopathy scores in males and females is unwarranted. Using a sample of female and male forensic patients (N = 110 and N = 147, respectively), we formally tested for MI in a structural equation modeling framework. We found that the PCL-R in its current form does not attain full MI. Four items showed threshold biases, and Factor 2 (the Social Deviance Factor) in particular is gender biased. Based on our findings, it seems reasonable to expect that specific scoring adjustments might go a long way in bringing about more equivalent assessment of psychopathic features in men and women. Only then can we begin to meaningfully compare the genders on the prevalence, structure, and external correlates of psychopathy.
Subjects
Checklist, Prisoners, Antisocial Personality Disorder/diagnosis, Female, Humans, Male
ABSTRACT
In analyzing responses and response times to personality questionnaire items, models have been proposed which include the so-called "inverted-U effect." These models predict that response times to personality test items decrease as the latent trait value of a given person moves further away from the attractiveness of an item. Initial studies into these models have focused on dichotomous personality items, and more recently, models for Likert-type scale items have been proposed. In all these models, it is assumed that the inverted-U effect is symmetrical around 0, while, as will be explained in this article, there are substantive and statistical reasons to study this assumption. Therefore, in this article, a general inverted-U model is proposed which accommodates two sources of asymmetry between the response times and the attractiveness of the items. The viability of this model is demonstrated in a simulation study, and the model is applied to the responses and response times of the Temperament and Character Inventory-Revised, covering a broad range of personality dimensions.
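In generic notation (a sketch, not necessarily the article's exact parameterization), the symmetric inverted-U assumption and an asymmetric relaxation of it can be written as:

```latex
% Symmetric inverted-U for (log) response time as a function of the distance
% between trait theta_p and item attractiveness alpha_i, and an asymmetric
% relaxation with separate slopes on either side; notation is illustrative.
\begin{align*}
\text{symmetric:}\quad  E[\ln T_{pi}] &= \beta_i - \delta_i\,\lvert \theta_p - \alpha_i \rvert,\\
\text{asymmetric:}\quad E[\ln T_{pi}] &= \beta_i
  - \delta_i^{-}\,\max(\alpha_i - \theta_p,\,0)
  - \delta_i^{+}\,\max(\theta_p - \alpha_i,\,0).
\end{align*}
```

Expected (log) response times peak when the trait theta_p is close to the item attractiveness alpha_i; the asymmetric version allows the decline to differ depending on whether the trait lies below or above the attractiveness.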
ABSTRACT
Effort has been devoted to the development of moderated factor models in which the traditional factor model parameters are allowed to differ across a moderator variable. These models are valuable as they enable tests of measurement invariance across a continuous background variable. However, moderated factor models require the specification of a parametric functional form between the factor model parameters and the moderator variable, while in some situations it is unclear what functional form to assume. Therefore, in the present article, a semiparametric moderated factor modeling approach is presented in which no assumption concerning the functional form between the moderator and the model parameters is imposed. In a simulation study, the semiparametric moderated factor model is shown to be viable in terms of parameter recovery and the power to distinguish the different models for measurement invariance. In addition, the model is applied to a real dataset pertaining to intelligence.
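The contrast between the parametric and the semiparametric specification can be sketched, for the factor loadings, as follows (notation illustrative; the same logic applies to the intercepts and the factor and residual variances):

```latex
% Moderated factor loadings as a function of a continuous moderator m_p:
% a parametric (here linear) specification versus the semiparametric case in
% which the moderation function is left unspecified; notation is illustrative.
\begin{align*}
\text{parametric:}\quad     & \lambda_j(m_p) = \lambda_{0j} + \lambda_{1j}\, m_p,\\
\text{semiparametric:}\quad & \lambda_j(m_p) = f_j(m_p), \quad f_j \text{ left unspecified and estimated from the data.}
\end{align*}
```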
Subjects
Factor Analysis, Computer Simulation, Humans
ABSTRACT
Various mixture modeling approaches have been proposed to identify within-subjects differences in the psychological processes underlying responses to psychometric tests. Although valuable, the existing mixture models are associated with at least one of the following three challenges: (1) a parametric distribution is assumed for the response times, and violations of this assumption may bias the results; (2) the response processes are assumed to result in equal variances (homoscedasticity) in the response times, whereas some processes may produce more variability than others (heteroscedasticity); and (3) the different response processes are modeled as independent latent variables, whereas they may be related. Although each of these challenges has been addressed separately, in practice they may occur simultaneously. Therefore, we propose a heteroscedastic hidden Markov mixture model for responses and categorized response times that addresses all the challenges above in a single model. In a simulation study, we demonstrated that the model is associated with acceptable parameter recovery and acceptable resolution to distinguish between various special cases. In addition, the model was applied to the responses and response times of the WAIS-IV block design subtest, to demonstrate its use in practice.
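A schematic of the general structure, in our own notation (the article's exact specification differs in its details), combines a Markov chain over the latent states with state-specific item parameters:

```latex
% Schematic only; notation is illustrative. S_pi: latent state (response
% process) of person p at item i; X_pi: response accuracy.
\begin{align*}
P(S_{p1}=s) &= \pi_s, \qquad P(S_{pi}=s \mid S_{p,i-1}=r) = \omega_{rs} \quad \text{(Markov dependence over items)},\\
P(X_{pi}=1 \mid S_{pi}=s) &= \operatorname{logit}^{-1}\!\big(a_{is}\,\theta_p + b_{is}\big),
\end{align*}
```

with the categorized response times following an ordinal model whose location and dispersion are state-specific, so that some response processes may generate more response-time variability than others.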
Subjects
Markov Chains, Statistical Models, Psychometrics/methods, Bias, Humans, Psychological Models, Reaction Time
ABSTRACT
Linear, nonlinear, and nonparametric moderated latent variable models have been developed to investigate possible interaction effects between a latent variable and an external continuous moderator on the observed indicators in the latent variable model. Most moderation models have focused on moderators that vary across persons but not across the indicators (e.g., moderators like age and socioeconomic status). However, in many applications, the values of the moderator may vary both across persons and across indicators (e.g., moderators like response times and confidence ratings). Indicator-level moderation models are available for categorical moderators and linear interaction effects. However, these approaches require, respectively, categorization of the continuous moderator and the assumption of a linear interaction effect. In this article, parametric nonlinear and nonparametric indicator-level moderation methods are developed. In a simulation study, we demonstrate the viability of these methods. In addition, the methods are applied to a real data set pertaining to arithmetic ability.
Subjects
Statistical Models, Nonlinear Dynamics, Computer Simulation, Statistical Data Interpretation, Educational Measurement, Factor Analysis, Humans, Mathematical Concepts
ABSTRACT
With tests increasingly administered in computerized form, response time is the most commonly available process variable for analysis. Psychometric models have been developed for joint modeling of response accuracy and response time in which response time is an additional source of information about ability and about the underlying response processes. While traditional models assume conditional independence between response time and accuracy given ability and speed latent variables (van der Linden, 2007), recently multiple studies (De Boeck and Partchev, 2012; Meng et al., 2015; Bolsinova et al., 2017a,b) have shown that violations of conditional independence are not rare and that there is more to learn from the conditional dependence between response time and accuracy. When it comes to conditional dependence between time and accuracy, authors typically focus on positive conditional dependence (i.e., relatively slow responses are more often correct) and negative conditional dependence (i.e., relatively fast responses are more often correct), which implies monotone conditional dependence. Moreover, most existing models specify the relationship to be linear. However, this assumption of monotone and linear conditional dependence does not necessarily hold in practice, and assuming linearity might distort the conclusions about the relationship between time and accuracy. In this paper, we develop methods for exploring nonlinear conditional dependence between response time and accuracy. Three different approaches are proposed: (1) a joint model for quadratic conditional dependence is developed as an extension of the response moderation models for time and accuracy (Bolsinova et al., 2017b); (2) a joint model for multiple-category conditional dependence is developed as an extension of the fast-slow model of Partchev and De Boeck (2012); (3) an indicator-level nonparametric moderation method (Bolsinova and Molenaar, in press) is used with residual log-response time as a predictor for the item intercept and item slope. Furthermore, we propose using nonparametric moderation to evaluate the viability of the assumption of linearity of conditional dependence by performing posterior predictive checks for the linear conditional dependence model. The developed methods are illustrated using data from an educational test in which, for the majority of the items, conditional dependence is shown to be nonlinear.
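The first of these approaches (quadratic conditional dependence) can be sketched in generic notation by letting the standardized residual log-response time enter the item response function with a linear and a quadratic term:

```latex
% Quadratic conditional dependence sketch; z_pi is the standardized residual
% log-response time of person p on item i. Notation is illustrative.
\[
\operatorname{logit} P(X_{pi}=1 \mid \theta_p, z_{pi})
  = a_i\,\theta_p + b_i + \gamma_{1i}\, z_{pi} + \gamma_{2i}\, z_{pi}^{2}.
\]
```

Setting gamma_2i = 0 recovers linear conditional dependence, and comparing posterior predictive draws from the linear model with the nonparametric estimate of the dependence (approach 3) probes whether linearity is tenable.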
ABSTRACT
In item response theory modeling of responses and response times, it is commonly assumed that the item responses have the same characteristics across the response times. However, heterogeneity might arise in the data if subjects resort to different response processes when solving the test items. These differences may be within-subject effects, that is, a subject might use a certain process on some of the items and a different process with different item characteristics on the other items. If the probability of using one process over the other process depends on the subject's response time, within-subject heterogeneity of the item characteristics across the response times arises. In this paper, the method of response mixture modeling is presented to account for such heterogeneity. Contrary to traditional mixture modeling where the full response vectors are classified, response mixture modeling involves classification of the individual elements in the response vector. In a simulation study, the response mixture model is shown to be viable in terms of parameter recovery. In addition, the response mixture model is applied to a real dataset to illustrate its use in investigating within-subject heterogeneity in the item characteristics across response times.
Subjects
Psychological Models, Reaction Time, Computer Simulation, Educational Measurement, Humans, Mathematical Concepts, Statistical Models, Psychometrics
ABSTRACT
In item response theory, modelling the item response times in addition to the item responses may improve the detection of possible between- and within-subject differences in the process that resulted in the responses. For instance, if respondents rely on rapid guessing on some items but not on all, the joint distribution of the responses and response times will be a multivariate within-subject mixture distribution. Suitable parametric methods to detect these within-subject differences have been proposed. In these approaches, a distribution needs to be assumed for the within-class response times. In this paper, it is demonstrated that these parametric within-subject approaches may produce false positives and biased parameter estimates if the assumption concerning the response time distribution is violated. A semi-parametric approach is proposed which resorts to categorized response times. This approach is shown to rarely produce false positives or parameter bias. In addition, the semi-parametric approach results in approximately the same power as the parametric approach.
Subjects
Statistical Data Interpretation, Psychometrics/methods, Reaction Time, Algorithms, Bias, Computer Simulation, False Positive Reactions, Humans, Psychological Models, Statistical Models
ABSTRACT
In generalized linear modelling of responses and response times, the observed response time variables are commonly transformed to make their distribution approximately normal. A normal distribution for the transformed response times is desirable as it justifies the linearity and homoscedasticity assumptions in the underlying linear model. Past research has, however, shown that the transformed response times are not always normal. Models have been developed to accommodate this violation. In the present study, we propose a modelling approach for responses and response times to test and model non-normality in the transformed response times. Most importantly, we distinguish between non-normality due to heteroscedastic residual variances, and non-normality due to a skewed speed factor. In a simulation study, we establish parameter recovery and the power to separate both effects. In addition, we apply the model to a real data set.
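In generic notation (a sketch; the article's parameterization may differ, and the skew-normal example is our assumption), the baseline model for log-transformed response times and the two sources of non-normality distinguished above can be written as:

```latex
% Baseline measurement model for log-transformed response times, plus the two
% departures from normality distinguished above; notation is illustrative.
\begin{align*}
\text{baseline:}\quad & \ln T_{pi} = \nu_i - \tau_p + \varepsilon_{pi}, \qquad \varepsilon_{pi}\sim N(0,\sigma_i^2),\\
\text{(a)}\quad & \operatorname{Var}(\varepsilon_{pi}) \text{ is allowed to vary (heteroscedastic residual variances)},\\
\text{(b)}\quad & \tau_p \text{ is allowed to follow a skewed (e.g., skew-normal) distribution}.
\end{align*}
```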
Subjects
Linear Models, Statistical Models, Reaction Time, Humans, Biological Models, Normal Distribution
ABSTRACT
With the widespread use of computerized tests in educational measurement and cognitive psychology, registration of response times has become feasible in many applications. Considering these response times helps provide a more complete picture of the performance and characteristics of persons beyond what is available based on response accuracy alone. Statistical models such as the hierarchical model (van der Linden, 2007) have been proposed that jointly model response time and accuracy. However, these models make restrictive assumptions about the response processes (RPs) that may not be realistic in practice, such as the assumption that the association between response time and accuracy is fully explained by taking speed and ability into account (conditional independence). Assuming conditional independence forces one to ignore that many relevant individual differences may play a role in the RPs beyond overall speed and ability. In this paper, we critically consider the assumption of conditional independence and the important ways in which it may be violated in practice from a substantive perspective. We consider both conditional dependences that may arise when all persons attempt to solve the items in similar ways (homogeneous RPs) and those that may be due to persons differing in fundamental ways in how they deal with the items (heterogeneous processes). The paper provides an overview of what we can learn from observed conditional dependences. We argue that explaining and modeling these differences in the RPs is crucial to increase both the validity of measurement and our understanding of the relevant RPs.
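The conditional independence assumption under discussion can be stated compactly in the standard notation of the hierarchical model:

```latex
% Conditional independence of accuracy X and response time T given ability
% theta and speed tau, as assumed in the hierarchical model.
\[
f(x_{pi}, t_{pi} \mid \theta_p, \tau_p) \;=\; f(x_{pi} \mid \theta_p)\, f(t_{pi} \mid \tau_p).
\]
```

Any residual association between accuracy and (log-)response time that remains after conditioning on ability and speed therefore constitutes a violation, and it is the sign and shape of such residual dependences that the paper argues are substantively informative.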
ABSTRACT
It is becoming more feasible and common to register response times in the application of psychometric tests. Researchers thus have the opportunity to jointly model response accuracy and response time, which provides users with more relevant information. The most common choice is to use the hierarchical model (van der Linden, 2007, Psychometrika, 72, 287), which assumes conditional independence between response time and accuracy, given a person's speed and ability. However, this assumption may be violated in practice if, for example, persons vary their speed or differ in their response strategies, leading to conditional dependence between response time and accuracy and confounding measurement. We propose six nested hierarchical models for response time and accuracy that allow for conditional dependence, and discuss their relationship to existing models. Unlike existing approaches, the proposed hierarchical models allow for various forms of conditional dependence in the model and allow the effect of continuous residual response time on response accuracy to be item-specific, person-specific, or both. Estimation procedures for the models are proposed, as well as two information criteria that can be used for model selection. Parameter recovery and usefulness of the information criteria are investigated using simulation, indicating that the procedure works well and is likely to select the appropriate model. Two empirical applications are discussed to illustrate the different types of conditional dependence that may occur in practice and how these can be captured using the proposed hierarchical models.
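A generic way to sketch the kind of extension described here (notation illustrative; the article defines six specific nested variants with their own parameterizations) is to let the standardized residual log-response time shift the item response function with an effect that may be item-specific, person-specific, or both:

```latex
% z_pi: standardized residual log-response time of person p on item i.
% Setting delta_p = 0 gives an item-specific effect, holding gamma_i constant
% across items gives a person-specific effect, and gamma_i = delta_p = 0
% recovers conditional independence. Notation is illustrative.
\[
P(X_{pi}=1 \mid \theta_p, z_{pi}) \;=\;
  \Phi\!\big(a_i\,\theta_p + b_i + (\gamma_i + \delta_p)\, z_{pi}\big).
\]
```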