Results 1 - 20 of 30
1.
Article in English | MEDLINE | ID: mdl-38379504

ABSTRACT

Several new models based on item response theory have recently been suggested to analyse intensive longitudinal data. One of these new models is the time-varying dynamic partial credit model (TV-DPCM; Castro-Alvarez et al., Multivariate Behavioral Research, 2023, 1), which combines the partial credit model with the time-varying autoregressive model. The model allows the study of the psychometric properties of the items and the modelling of nonlinear trends at the latent state level. However, there is a severe lack of tools to assess the fit of the TV-DPCM. In this paper, we propose and develop several test statistics and discrepancy measures based on the posterior predictive model checking (PPMC) method (Rubin, The Annals of Statistics, 1984, 12, 1151) to assess the fit of the TV-DPCM. Simulated and empirical data are used to study the performance of the proposed statistics and to illustrate the effectiveness of the PPMC method.
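
As a generic illustration of the PPMC logic used in this paper (not the authors' implementation), the sketch below computes a posterior predictive p-value for an arbitrary discrepancy measure; `simulate_replicate`, `discrepancy`, and the toy normal model in the usage example are placeholder names of our own.

```python
import numpy as np

def posterior_predictive_pvalue(observed, posterior_draws,
                                simulate_replicate, discrepancy):
    """Generic posterior predictive check (in the spirit of Rubin, 1984).

    observed           : observed data set
    posterior_draws    : iterable of posterior parameter draws
    simulate_replicate : function(theta) -> replicated data set
    discrepancy        : function(data, theta) -> scalar discrepancy measure
    """
    exceed, n = 0, 0
    for theta in posterior_draws:
        y_rep = simulate_replicate(theta)        # data the fitted model would produce
        d_obs = discrepancy(observed, theta)     # realized discrepancy
        d_rep = discrepancy(y_rep, theta)        # predictive discrepancy
        exceed += int(d_rep >= d_obs)
        n += 1
    # Values near 0 or 1 flag misfit with respect to this discrepancy measure.
    return exceed / n

# Toy usage: a N(mu, 1) model fitted to overdispersed data (true sd = 2).
rng = np.random.default_rng(0)
y = rng.normal(0.0, 2.0, size=50)
mu_draws = rng.normal(y.mean(), 1 / np.sqrt(len(y)), size=1000)   # posterior of mu with sigma fixed at 1
ppp = posterior_predictive_pvalue(
    y, mu_draws,
    simulate_replicate=lambda mu: rng.normal(mu, 1.0, size=len(y)),
    discrepancy=lambda data, mu: np.var(data),
)
print(f"Posterior predictive p-value for the variance: {ppp:.3f}")   # near 0 -> misfit
```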

2.
Multivariate Behav Res; 59(1): 78-97, 2024.
Article in English | MEDLINE | ID: mdl-37318274

ABSTRACT

The accessibility of electronic devices and the novel statistical methodologies now available have allowed researchers to comprehend psychological processes at the individual level. However, there are still great challenges to overcome as, in many cases, collected data are more complex than the available models are able to handle. For example, most methods assume that the variables in the time series are measured on an interval scale, which is not the case when Likert-scale items are used. Ignoring the scale of the variables can be problematic and bias the results. Additionally, most methods also assume that the time series are stationary, which is rarely the case. To tackle these disadvantages, we propose a model that combines the partial credit model (PCM) of the item response theory framework and the time-varying autoregressive model (TV-AR), a popular model used to study psychological dynamics. The proposed model is referred to as the time-varying dynamic partial credit model (TV-DPCM), which makes it possible to appropriately analyze multivariate polytomous data and nonstationary time series. We test the performance and accuracy of the TV-DPCM in a simulation study. Lastly, by means of an example, we show how to fit the model to empirical data and interpret the results.
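
A rough sketch, in our own notation, of how the two building blocks named in this abstract fit together: the partial credit model links each ordinal item response to a latent state, and a time-varying autoregressive process describes how that state evolves. Details of the actual TV-DPCM specification may differ.

```latex
% Measurement part: partial credit model for item i with categories x = 0, ..., m_i
P(X_{it} = x \mid \theta_t)
  = \frac{\exp\!\Big(\sum_{k=1}^{x} (\theta_t - \beta_{ik})\Big)}
         {\sum_{r=0}^{m_i} \exp\!\Big(\sum_{k=1}^{r} (\theta_t - \beta_{ik})\Big)},
  \qquad \text{empty sums } (x = 0) \text{ defined as } 0.

% Structural part: time-varying autoregressive process for the latent state
\theta_t = \alpha(t) + \phi(t)\,\theta_{t-1} + \varepsilon_t,
  \qquad \varepsilon_t \sim N(0, \sigma^2),
```

where the intercept alpha(t) and the autoregressive coefficient phi(t) are smooth functions of time, which accommodates the nonstationary trends mentioned above.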


Subjects
Statistical Models, Time Factors, Computer Simulation, Data Collection
3.
Appl Psychol Meas; 47(5-6): 420-437, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37810540

ABSTRACT

Aberrant responding on tests and surveys has been shown to affect the psychometric properties of scales and the statistical analyses based on those scales in cumulative model contexts. This study extends prior research by comparing the effects of four types of aberrant responding on model fit in both cumulative and ideal point model contexts using generalized partial credit (GPCM) and generalized graded unfolding (GGUM) models. When fitting models to data, model misfit can be a function of both misspecification and aberrant responding. Results demonstrate how varying levels of aberrant data can severely impact model fit for both cumulative and ideal point data. Specifically, longstring responses have a stronger impact on dimensionality for both ideal point and cumulative data, while random responding tends to have the most negative impact on model fit according to information criteria (AIC, BIC). The results also indicate that ideal point models such as the GGUM may be able to fit cumulative data as well as the cumulative model itself (GPCM), whereas cumulative models may not provide sufficient fit for data simulated using an ideal point model.

4.
Psychol Methods; 28(3): 558-579, 2023 Jun.
Article in English | MEDLINE | ID: mdl-35298215

ABSTRACT

The last 25 years have shown a steady increase in attention to the Bayes factor as a tool for hypothesis evaluation and model selection. The present review highlights the potential of the Bayes factor in psychological research. We discuss six types of applications: Bayesian evaluation of point null, interval, and informative hypotheses; Bayesian evidence synthesis; Bayesian variable selection and model averaging; and Bayesian evaluation of cognitive models. We elaborate on what each application entails, give illustrative examples, and provide an overview of key references and software, with links to other applications. The article concludes with a discussion of the opportunities and pitfalls of Bayes factor applications and a sketch of corresponding future research lines. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
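
For orientation, the quantity underlying all of the applications reviewed above is the ratio of marginal likelihoods of the competing hypotheses (standard definition, our notation):

```latex
\mathrm{BF}_{10}
  = \frac{p(\text{data} \mid \mathcal{H}_1)}{p(\text{data} \mid \mathcal{H}_0)}
  = \frac{\int p(\text{data} \mid \theta_1, \mathcal{H}_1)\, p(\theta_1 \mid \mathcal{H}_1)\, d\theta_1}
         {\int p(\text{data} \mid \theta_0, \mathcal{H}_0)\, p(\theta_0 \mid \mathcal{H}_0)\, d\theta_0},
\qquad
\text{posterior odds} = \mathrm{BF}_{10} \times \text{prior odds}.
```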


Subjects
Bayes Theorem, Behavioral Research, Psychology, Humans, Behavioral Research/methods, Psychology/methods, Software, Research Design
5.
Psychol Methods; 28(3): 740-755, 2023 Jun.
Article in English | MEDLINE | ID: mdl-34735173

ABSTRACT

Some important research questions require the ability to find evidence for two conditions being practically equivalent. This is impossible to accomplish within the traditional frequentist null hypothesis significance testing framework; hence, other methodologies must be utilized. We explain and illustrate three approaches for finding evidence for equivalence: The frequentist two one-sided tests procedure, the Bayesian highest density interval region of practical equivalence procedure, and the Bayes factor interval null procedure. We compare the classification performances of these three approaches for various plausible scenarios. The results indicate that the Bayes factor interval null approach compares favorably to the other two approaches in terms of statistical power. Critically, compared with the Bayes factor interval null procedure, the two one-sided tests and the highest density interval region of practical equivalence procedures have limited discrimination capabilities when the sample size is relatively small: Specifically, in order to be practically useful, these two methods generally require over 250 cases within each condition when rather large equivalence margins of approximately .2 or .3 are used; for smaller equivalence margins even more cases are required. Because of these results, we recommend that researchers rely more on the Bayes factor interval null approach for quantifying evidence for equivalence, especially for studies that are constrained on sample size. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
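
As a concrete illustration of the first of the three procedures compared above, the sketch below implements a two one-sided tests (TOST) check for two independent means with SciPy; the equivalence margin `delta`, the Welch correction, and the simulated data are our own illustrative choices, not the authors' simulation setup.

```python
import numpy as np
from scipy import stats

def tost_independent(x, y, delta, alpha=0.05):
    """Two one-sided tests (TOST) for equivalence of two independent means.

    Equivalence at level `alpha` requires the mean difference to be both
    significantly above -delta and significantly below +delta.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    # H0: mean(x) - mean(y) <= -delta   vs   H1: difference > -delta
    _, p_lower = stats.ttest_ind(x + delta, y, equal_var=False, alternative="greater")
    # H0: mean(x) - mean(y) >= +delta   vs   H1: difference < +delta
    _, p_upper = stats.ttest_ind(x - delta, y, equal_var=False, alternative="less")
    p_tost = max(p_lower, p_upper)        # overall TOST p-value
    return p_tost, p_tost < alpha

# Illustrative data: two conditions that differ only trivially (standardized units).
rng = np.random.default_rng(1)
a = rng.normal(0.00, 1.0, size=300)
b = rng.normal(0.05, 1.0, size=300)
p, equivalent = tost_independent(a, b, delta=0.3)
print(f"TOST p = {p:.4f}, equivalence concluded: {equivalent}")
```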


Subjects
Research Design, Humans, Bayes Theorem, Sample Size
6.
Psychon Bull Rev; 30(2): 534-552, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36085233

ABSTRACT

In classical statistics, there is a close link between null hypothesis significance testing (NHST) and parameter estimation via confidence intervals. However, for the Bayesian counterpart, a link between null hypothesis Bayesian testing (NHBT) and Bayesian estimation via a posterior distribution is less straightforward, but it does exist and has recently been reiterated by Rouder, Haaf, and Vandekerckhove (2018). It hinges on a combination of a point mass probability and a probability density function as prior (denoted as the spike-and-slab prior). In the present paper, it is first carefully explained how the spike-and-slab prior is defined, and how results can be derived for which proofs were not given in Rouder, Haaf, and Vandekerckhove (2018). Next, it is shown that this spike-and-slab prior can be approximated by a pure probability density function with a rectangular peak around the center, towering high above the remainder of the density function. Finally, we indicate how this 'hill-and-chimney' prior may in turn be approximated by fully continuous priors. In this way, it is shown that NHBT results can be approximated well by results from estimation using a strongly peaked prior, and it is noted that the estimation itself offers more than merely the posterior odds on which NHBT is based. Thus, it complies with the strong APA requirement of not just reporting testing results but also offering effect size information. It also offers a transparent perspective on the NHBT approach, which employs a prior with a strong peak around the chosen point null hypothesis value.
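
In our own notation (not necessarily the authors'), the spike-and-slab prior discussed above places a point mass on the null value and spreads the remaining prior mass over a continuous density:

```latex
% Spike-and-slab prior on an effect-size parameter \theta, point null at \theta = 0
p(\theta) = \pi_0\,\Delta_0(\theta) + (1 - \pi_0)\,g(\theta),
\qquad \Delta_0 = \text{point mass at } 0 \text{ (the ``spike'')},\quad
g = \text{continuous ``slab'' density, e.g. } N(0, \sigma_g^2).

% Posterior odds for the point null, given prior odds \pi_0 / (1 - \pi_0):
\frac{P(\theta = 0 \mid \text{data})}{P(\theta \neq 0 \mid \text{data})}
  = \mathrm{BF}_{01} \times \frac{\pi_0}{1 - \pi_0}.
```

The 'hill-and-chimney' approximation described in the abstract replaces the point mass by a narrow rectangular density around 0, so that the whole prior becomes a single continuous density.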


Subjects
Research Design, Humans, Bayes Theorem, Likelihood Functions
7.
Psychol Methods; 27(3): 466-475, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35901398

ABSTRACT

In 2019 we wrote an article (Tendeiro & Kiers, 2019) in Psychological Methods on null hypothesis Bayesian testing and its workhorse, the Bayes factor. Recently, van Ravenzwaaij and Wagenmakers (2021) offered a response to our piece, also in this journal. Although we do welcome their contribution with thought-provoking remarks on our article, we concluded that there were too many "issues" in van Ravenzwaaij and Wagenmakers (2021) that warrant a rebuttal. In this article we both defend the main premises of our original article and put the contribution of van Ravenzwaaij and Wagenmakers (2021) under critical appraisal. Our hope is that this exchange between scholars contributes decisively toward a better understanding among psychologists of null hypothesis Bayesian testing in general and of the Bayes factor in particular. (PsycInfo Database Record (c) 2022 APA, all rights reserved).


Subjects
Research Design, Bayes Theorem, Statistical Data Interpretation
8.
Assessment; 29(7): 1392-1405, 2022 Oct.
Article in English | MEDLINE | ID: mdl-34041940

ABSTRACT

Functional Somatic Symptoms (FSS) are physical symptoms that cannot be attributed to underlying pathology. Their severity is often measured with sum scores on questionnaires; however, this may not adequately reflect FSS severity in subgroups of patients. We aimed to identify the items of the somatization section of the Composite International Diagnostic Interview that best discriminate FSS severity levels, and to assess their functioning in sex and age subgroups. We applied the two-parameter logistic model to 19 items in a population-representative cohort of 962 participants. Subsequently, we examined differential item functioning (DIF). "Localized (muscle) weakness" was the most discriminative item of FSS severity. "Abdominal pain" consistently showed DIF by sex, with males reporting it at higher FSS severity. There was no consistent DIF by age, however, "Joint pain" showed poor discrimination of FSS severity in older adults. These findings could be helpful for the development of better assessment instruments for FSS, which can improve both future research and clinical care.


Subjects
Medically Unexplained Symptoms, Aged, Cohort Studies, Humans, Male, Statistical Models, Pain, Psychometrics, Surveys and Questionnaires
9.
J Exp Psychol Appl; 28(1): 166-178, 2022 Mar.
Article in English | MEDLINE | ID: mdl-34138620

ABSTRACT

Robust scientific evidence shows that human performance predictions are more valid when information is combined mechanically (with a decision rule) rather than holistically (in the decision-maker's mind). Yet, information is often combined holistically in practice. One reason is that decision makers lack knowledge of evidence-based decision making. In a performance prediction task, we tested whether watching an educational video on evidence-based decision making increased decision-makers' use of a decision rule and their prediction accuracy immediately after the manipulation and a month later. Furthermore, we manipulated whether participants earned incentives for accurate predictions. Existing research showed that incentives decrease decision-rule use and prediction accuracy. We hypothesized that this is the case for decision makers who did not receive educational information about evidence-based decision making, but that incentives increase decision-rule use and prediction accuracy for participants who received educational information. Our results showed that educational information increased decision-rule use. This resulted in increased prediction accuracy, but only immediately after receiving the educational information. In contrast to the existing literature, incentives slightly increased decision-rule use. We did not find evidence that this effect was larger for educated participants. Providing decision makers with educational information may be effective to increase decision-rule use in practice. (PsycInfo Database Record (c) 2022 APA, all rights reserved).


Subjects
Decision Making, Motivation, Humans
10.
Psychol Methods; 27(1): 17-43, 2022 Feb.
Article in English | MEDLINE | ID: mdl-34014719

ABSTRACT

Traditionally, researchers have used time series and multilevel models to analyze intensive longitudinal data. However, these models do not directly address traits and states, which conceptualize the stability and variability implicit in longitudinal research, and they do not explicitly take into account measurement error. An alternative that overcomes these drawbacks is to consider structural equation models (state-trait SEMs) for longitudinal data that represent traits and states as latent variables. Most of these models are encompassed in latent state-trait (LST) theory. These state-trait SEMs can become problematic when the number of measurement occasions increases: because they require the data to be in wide format, the models quickly become overparameterized and lead to nonconvergence issues. For these reasons, multilevel versions of state-trait SEMs have been proposed, which require the data in long format. To study how suitable state-trait SEMs are for intensive longitudinal data, we carried out a simulation study in which we compared the traditional single-level and multilevel versions of three state-trait SEMs: the multistate-singletrait (MSST) model, the common and unique trait-state (CUTS) model, and the trait-state-occasion (TSO) model. Furthermore, we also included an empirical application. Our results indicated that the TSO model performed best in both the simulated and the empirical data. To conclude, we highlight the usefulness of state-trait SEMs for studying the psychometric properties of the questionnaires used in intensive longitudinal research. Yet these models still have multiple limitations, some of which might be overcome by extending them to more general frameworks. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
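
For orientation, the trait-state-occasion decomposition that performed best in this study is commonly written along the following lines (notation simplified by us; see the original TSO literature for the full specification):

```latex
% Measurement part: indicator i, occasion t, person j
Y_{itj} = \mu_{it} + \lambda_{it}\,\big(T_j + O_{tj}\big) + \varepsilon_{itj},

% Structural part: the occasion (state-residual) factor is first-order autoregressive
O_{tj} = \beta_t\, O_{t-1,j} + \zeta_{tj},
```

where T_j is a stable trait factor, O_tj captures occasion-specific deviations from the trait, and measurement error is absorbed by the residuals.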


Subjects
Theoretical Models, Humans, Latent Class Analysis, Multilevel Analysis, Psychometrics, Surveys and Questionnaires
11.
Qual Life Res; 31(1): 49-59, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34476671

ABSTRACT

PURPOSE: In Mokken scaling, the Crit index was proposed, and is sometimes used, as evidence (or lack thereof) of violations of some common model assumptions. The main goal of our study was twofold: to make the formulation of the Crit index explicit and accessible, and to investigate its distribution under various measurement conditions. METHODS: We conducted two simulation studies in the context of dichotomously scored item responses. We manipulated the type of assumption violation, the proportion of violating items, sample size, and quality. False positive rates and power to detect assumption violations were our main outcome variables. Furthermore, we applied the Crit coefficient in a Mokken scale analysis of a set of responses to the General Health Questionnaire (GHQ-12), a self-administered questionnaire for assessing current mental health. RESULTS: We found that the false positive rates of Crit were close to the nominal rate in most conditions, and that power to detect misfit depended on sample size, the type of violation, and the number of assumption-violating items. Overall, in small samples Crit lacked the power to detect misfit, and in larger samples power differed considerably depending on the type of violation and the proportion of misfitting items. Furthermore, our empirical example showed that even in large samples the Crit index may fail to detect assumption violations. DISCUSSION: Even in large samples, the Crit coefficient showed limited usefulness for detecting moderate and severe violations of monotonicity. Our findings are relevant to researchers and practitioners who use Mokken scaling for scale and questionnaire construction and revision.


Subjects
Quality of Life, Research Design, Computer Simulation, Humans, Mental Health, Quality of Life/psychology, Surveys and Questionnaires
12.
Psychon Bull Rev; 29(1): 70-87, 2022 Feb.
Article in English | MEDLINE | ID: mdl-34254263

ABSTRACT

The practice of sequentially testing a null hypothesis as data are collected, until the null hypothesis is rejected, is known as optional stopping. It is well known that optional stopping is problematic in the context of p value-based null hypothesis significance testing: the false-positive rate quickly exceeds the single test's significance level. However, the state of affairs under null hypothesis Bayesian testing, where p values are replaced by Bayes factors, has perhaps surprisingly been much less consensual. Rouder (2014) used simulations to defend the use of optional stopping under null hypothesis Bayesian testing. The idea behind these simulations is closely related to the idea of sampling from prior predictive distributions. Deng et al. (2016) and Hendriksen et al. (2020) have provided mathematical evidence that the defense of optional stopping under null hypothesis Bayesian testing does hold under some conditions. These papers are, however, exceedingly technical for most researchers in the applied social sciences. In this paper, we provide some mathematical derivations concerning Rouder's approximate simulation results for the two Bayesian hypothesis tests that he considered. The key idea is to consider the probability distribution of the Bayes factor, which is regarded as a random variable across repeated sampling. This paper therefore offers an intuitive perspective on the literature, and we believe it is a valid contribution toward understanding the practice of optional stopping in the context of Bayesian hypothesis testing.
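
The flavor of the simulations discussed above can be conveyed with a toy version built on assumptions of our own: normally distributed data with known unit variance, a point null H0: mu = 0, and a normal prior on mu under H1, so the Bayes factor has a closed form in the sample mean. Data are generated under the null and sampling stops as soon as the Bayes factor crosses a symmetric evidence threshold; this is a sketch, not Rouder's (2014) actual simulation code.

```python
import numpy as np
from scipy import stats

def bf10_normal(xbar, n, sigma=1.0, tau=1.0):
    """BF_10 for H0: mu = 0 vs H1: mu ~ N(0, tau^2), with data ~ N(mu, sigma^2), sigma known.

    Both marginal likelihoods are evaluated at the sufficient statistic xbar.
    """
    m0 = stats.norm.pdf(xbar, loc=0.0, scale=sigma / np.sqrt(n))              # under H0
    m1 = stats.norm.pdf(xbar, loc=0.0, scale=np.sqrt(sigma**2 / n + tau**2))  # under H1
    return m1 / m0

def optional_stopping_run(rng, n_max=500, n_min=10, bound=10.0):
    """Sample under H0, test after every observation, stop when BF_10 leaves [1/bound, bound]."""
    x = rng.normal(0.0, 1.0, size=n_max)     # data generated under the null
    for n in range(n_min, n_max + 1):
        bf = bf10_normal(x[:n].mean(), n)
        if bf > bound or bf < 1.0 / bound:
            break
    return bf, n

rng = np.random.default_rng(2024)
runs = [optional_stopping_run(rng) for _ in range(2000)]
prop_against_null = np.mean([bf > 10.0 for bf, _ in runs])
print(f"Runs ending with strong evidence against a true H0: {prop_against_null:.3f}")
```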


Subjects
Research Design, Bayes Theorem, Computer Simulation, Humans, Probability
13.
Appl Psychol Meas; 44(6): 482-496, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32782419

ABSTRACT

Mokken scale analysis is a popular method for evaluating the psychometric quality of clinical and personality questionnaires and their individual items. Although many empirical papers report on the extent to which sets of items form Mokken scales, there is less attention to the effects of violations of commonly used rules of thumb. In this study, the authors investigated the practical consequences of retaining or removing items with psychometric properties that do not comply with these rules of thumb. Using simulated data, they concluded that items with low scalability had some influence on the reliability of test scores, person ordering and selection, and criterion-related validity estimates. Removing the misfitting items from the scale had, in general, a small effect on the outcomes. Although important outcome variables were fairly robust against scale violations in some conditions, the authors conclude that researchers should not rely exclusively on algorithms that allow automatic selection of items. In particular, content validity must be taken into account to build sensible psychometric instruments.

14.
Int J Methods Psychiatr Res; 28(4): e1795, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31264326

ABSTRACT

OBJECTIVES: In this study, we examined the consequences of ignoring violations of the assumptions underlying the use of sum scores in assessing attention problems (AP), and whether psychometrically more refined models improve predictions of relevant outcomes in adulthood. METHODS: Data from the Tracking Adolescents' Individual Lives Survey were used. AP symptom properties were examined using the AP scale of the Child Behavior Checklist at age 11. Consequences of model violations were evaluated in relation to psychopathology, educational attainment, financial status, and the ability to form relationships in adulthood. RESULTS: Results showed that symptoms differed with respect to information and difficulty. Moreover, evidence of multidimensionality was found, with two groups of items measuring sluggish cognitive tempo and attention deficit hyperactivity disorder symptoms. Item response theory analyses indicated that a bifactor model fitted these data better than other competing models. In terms of accuracy of predicting functional outcomes, sum scores were robust against violations of assumptions in some situations. Nevertheless, AP scores derived from the bifactor model showed some superiority over sum scores. CONCLUSION: These findings show that more accurate predictions of later-life difficulties can be made if one uses a more suitable psychometric model to assess AP severity in children. This has important implications for research and clinical practice.


Subjects
Attention Deficit Disorder with Hyperactivity/diagnosis, Behavior Rating Scale/standards, Child Behavior Disorders/diagnosis, Psychiatric Status Rating Scales/standards, Psychometrics/standards, Adolescent, Adult, Child, Female, Humans, Longitudinal Studies, Male, Statistical Models, Severity of Illness Index, Young Adult
15.
Psychol Methods; 24(6): 774-795, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31094544

ABSTRACT

Null hypothesis significance testing (NHST) has been under scrutiny for decades. The literature shows overwhelming evidence of a large range of problems affecting NHST. One of the proposed alternatives to NHST is to use Bayes factors instead of p values. Here we denote the method of using Bayes factors to test point null models as "null hypothesis Bayesian testing" (NHBT). In this article we offer a wide overview, currently missing in the literature, of potential issues (limitations or sources of misinterpretation) with NHBT. We illustrate many of the shortcomings of NHBT by means of reproducible examples. The article concludes with a discussion of NHBT in particular and testing in general. Specifically, we argue that posterior model probabilities should be given more emphasis than Bayes factors, because only the former provide direct answers to the most common research questions under consideration. (PsycINFO Database Record (c) 2019 APA, all rights reserved).
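
The final claim can be made concrete with the standard identity linking the two quantities (our notation), for prior model probabilities P(H0) and P(H1) = 1 - P(H0):

```latex
P(\mathcal{H}_0 \mid \text{data})
  = \frac{\mathrm{BF}_{01}\, P(\mathcal{H}_0)}
         {\mathrm{BF}_{01}\, P(\mathcal{H}_0) + P(\mathcal{H}_1)},
\qquad
\mathrm{BF}_{01} = \frac{p(\text{data} \mid \mathcal{H}_0)}{p(\text{data} \mid \mathcal{H}_1)},
```

so a Bayes factor only becomes a statement about the probability of a model once the prior model probabilities are made explicit.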


Subjects
Statistical Data Interpretation, Statistical Models, Probability, Research Design, Humans
16.
BMC Med Res Methodol; 19(1): 71, 2019 Mar 29.
Article in English | MEDLINE | ID: mdl-30925900

ABSTRACT

BACKGROUND: In clinical trials, study designs may focus on assessing the superiority, equivalence, or non-inferiority of a new medicine or treatment as compared to a control. Typically, evidence in each of these paradigms is quantified with a variant of the null hypothesis significance test. A null hypothesis is assumed (a null effect for superiority, inferiority by a specific amount for non-inferiority, and inferiority or superiority by a specific amount for equivalence), after which the probabilities of obtaining data more extreme than those observed under these null hypotheses are quantified by p-values. Although ubiquitous in clinical testing, the null hypothesis significance test can lead to a number of difficulties in the interpretation of the statistical evidence. METHODS: We advocate quantifying evidence instead by means of Bayes factors and highlight how these can be calculated for different types of research design. RESULTS: We illustrate Bayes factors in practice with reanalyses of data from existing published studies. CONCLUSIONS: Bayes factors for superiority, non-inferiority, and equivalence designs allow for explicit quantification of evidence in favor of the null hypothesis. They also allow for interim testing without the need to employ explicit corrections for multiple testing.


Subjects
Algorithms, Bayes Theorem, Evidence-Based Medicine/statistics & numerical data, Outcome Assessment (Health Care)/statistics & numerical data, Research Design, Biometry/methods, Evidence-Based Medicine/methods, Humans, Outcome Assessment (Health Care)/methods, Therapeutic Equivalency
17.
Appl Psychol Meas; 43(2): 172-173, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30792563

ABSTRACT

In this article, the newly created GGUM R package is presented. This package finally brings the generalized graded unfolding model (GGUM) to the front stage for practitioners and researchers. It extends the fitting of this type of item response theory (IRT) model to settings that, up to now, were not possible, going beyond the limitations imposed by the widespread GGUM2004 software. The outcome is therefore a unique piece of software, not limited by the dimensions of the data matrix or by the operating system used. It includes various routines for fitting the model, checking model fit, plotting the results, and also interacting with GGUM2004 for those interested. The software should be of interest to all those interested in IRT in general or in ideal point models in particular.
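
For reference, the item response function behind the package is the generalized graded unfolding model, usually parameterized roughly as follows (after Roberts, Donoghue, & Laughlin, 2000); treat this as a sketch of the model, not as the package's documentation:

```latex
P(Z_i = z \mid \theta_j) =
\frac{\exp\!\Big\{\alpha_i\big[z(\theta_j - \delta_i) - \sum_{k=0}^{z}\tau_{ik}\big]\Big\}
      + \exp\!\Big\{\alpha_i\big[(M - z)(\theta_j - \delta_i) - \sum_{k=0}^{z}\tau_{ik}\big]\Big\}}
     {\sum_{w=0}^{C}\Big(
        \exp\!\Big\{\alpha_i\big[w(\theta_j - \delta_i) - \sum_{k=0}^{w}\tau_{ik}\big]\Big\}
      + \exp\!\Big\{\alpha_i\big[(M - w)(\theta_j - \delta_i) - \sum_{k=0}^{w}\tau_{ik}\big]\Big\}\Big)},
```

with z = 0, ..., C the observed response categories, M = 2C + 1, tau_{i0} = 0, alpha_i the item discrimination, delta_i the item location, and tau_{ik} the threshold parameters; the ideal point ("unfolding") character comes from the expected item score peaking at theta_j = delta_i.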

18.
Int J Psychol; 54(4): 454-461, 2019 Aug.
Article in English | MEDLINE | ID: mdl-29508381

ABSTRACT

This study investigated the relationship between guilt and the well-being of bereaved persons, and explored potential differences between the associations of guilt with complicated grief (CG) and with depression. In total, 1358 Chinese bereaved adults were recruited to fill out questionnaires. Participants (N = 194) who had been bereaved within 2 years of the first survey filled out the same questionnaires 1 year later. Higher guilt was associated with higher degrees of both CG and depression. The level of guilt predicted CG and depression symptoms 1 year later. Bereavement-related guilt has a closer association with CG than with depression. Responsibility guilt, indebtedness guilt, and the degree of guilt feelings are more prominent aspects of guilt in CG than in depression. These findings demonstrate the significant role of guilt (perhaps a core symptom) in the mental health of the bereaved, with implications for identifying persons with grief complications and depression.


Subjects
Bereavement, Depression/psychology, Grief, Guilt, Adult, Female, Humans, Male
19.
PLoS One; 13(6): e0198746, 2018.
Article in English | MEDLINE | ID: mdl-29889898

ABSTRACT

We investigated the validity of curriculum-sampling tests for admission to higher education in two studies. Curriculum-sampling tests mimic representative parts of an academic program to predict future academic achievement. In the first study, we investigated the predictive validity of a curriculum-sampling test for first-year academic achievement across three cohorts of undergraduate psychology applicants, and for academic achievement after three years in one cohort. We also studied the relationship between the test scores and enrollment decisions. In the second study, we examined the cognitive and noncognitive construct saturation of curriculum-sampling tests in a sample of psychology students. The curriculum-sampling tests showed high predictive validity for first-year and third-year academic achievement, mostly comparable to the predictive validity of high school GPA. In addition, curriculum-sampling test scores showed incremental validity over high school GPA. Applicants who scored low on the curriculum-sampling tests decided not to enroll in the program more often, indicating that curriculum-sampling admission tests may also promote self-selection. Contrary to expectations, the curriculum-sampling test scores did not show any relationship with cognitive ability, but there were some indications of noncognitive saturation, mostly for perceived test competence. Thus, curriculum-sampling tests can serve as efficient admission tests that yield high predictive validity. Furthermore, when self-selection or student-program fit are major objectives of admission procedures, curriculum-sampling tests may be preferred over, or used in addition to, high school GPA.


Subjects
Academic Performance/statistics & numerical data, Curriculum, Cohort Studies, Educational Measurement, Female, Humans, Male, School Admission Criteria, Young Adult
20.
Front Psychol; 9: 873, 2018.
Article in English | MEDLINE | ID: mdl-29872417

ABSTRACT

[This corrects the article on p. 305 in vol. 8, PMID: 28326049.].
