Results 1 - 18 of 18
1.
Assessment ; : 10731911241234118, 2024 Mar 14.
Article in English | MEDLINE | ID: mdl-38486349

ABSTRACT

Replication provides a confrontation of psychological theory, not only in experimental research, but also in model-based research. Goodness of fit (GOF) of the original model to the replication data is routinely provided as meaningful evidence of replication. We demonstrate, however, that GOF obscures important differences between the original and replication studies. As an alternative, we present Bayesian prior predictive similarity checking: a tool for rigorously evaluating the degree to which the data patterns and parameter estimates of a model replication study resemble those of the original study. We apply this method to original and replication data from the National Comorbidity Survey. Both data sets yielded excellent GOF, but the similarity checks often failed to support close or approximate empirical replication, especially when examining covariance patterns and indicator thresholds. We conclude with recommendations for applied research, including registered reports of model-based research, and provide extensive annotated R code to facilitate future applications of prior predictive similarity checking.

2.
Perspect Psychol Sci ; 19(1): 223-243, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37466102

ABSTRACT

Conducting research with human subjects can be difficult because of limited sample sizes and small empirical effects. We demonstrate that this problem can yield patterns of results that are practically indistinguishable from flipping a coin to determine the direction of treatment effects. We use this idea of random conclusions to establish a baseline for interpreting effect-size estimates, in turn producing more stringent thresholds for hypothesis testing and for statistical-power calculations. An examination of recent meta-analyses in psychology, neuroscience, and medicine confirms that, even if all considered effects are real, results involving small effects are indeed indistinguishable from random conclusions.


Subjects
Neurosciences, Research Design, Humans, Sample Size
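
The coin-flip comparison in the abstract above can be made concrete with a small calculation. The sketch below is an illustration only, not the authors' procedure: it assumes a two-group design with equal group sizes and uses the normal approximation in which the observed standardized mean difference has a standard error of about sqrt(2/n). The function names and the example values of d and n are hypothetical.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def wrong_sign_probability(d, n_per_group):
    """Approximate chance that the observed effect points the wrong way.

    Assumes (for illustration) a true standardized effect d > 0 and an
    observed effect distributed as Normal(d, 2 / n_per_group).
    """
    return normal_cdf(-d * sqrt(n_per_group / 2.0))


# Hypothetical effect sizes with 20 participants per group
for d in (0.05, 0.2, 0.5):
    print(d, round(wrong_sign_probability(d, n_per_group=20), 3))
```

Under these assumptions, the closer the returned value is to 0.5, the closer the study's conclusion about the direction of the effect is to a coin flip.
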
3.
J Affect Disord ; 342: 76-84, 2023 12 01.
Article in English | MEDLINE | ID: mdl-37708980

ABSTRACT

BACKGROUND: Technically sound measures are necessary for accurately identifying youth at risk for depression, but many studies rely on classical test theory metrics or adult samples to evaluate measures. This study examined the use of the PHQ-8, a common and freely available pediatric depression screener, in an adolescent sample using item response theory (IRT). METHODS: Secondary analyses were conducted on data from a study conducted in Midwestern middle schools in which 1224 youth completed the PHQ-8 as part of a battery of surveys. Polytomous IRT analyses (a Graded Response Model) were used to evaluate the PHQ-8. Items were examined for their ability to distinguish between respondents of different latent depression severity and for differential item functioning (DIF) across demographic categories. RESULTS: All PHQ-8 items had adequate discriminative abilities. Items measuring anhedonia and psychomotor disturbances performed relatively poorly, and items measuring somatic symptoms (appetite and sleep) were most informative when respondents endorsed extreme response options ("not at all" or "nearly every day"). No DIF was found across grade level or race, but several items were flagged for DIF by gender and student income level. LIMITATIONS: These results might not be generalizable to a broader youth population due to the administration setting and the unique demographic characteristics of this sample (76.0% African American). CONCLUSIONS: Tools such as the PHQ-8 are appropriate to quickly screen for depression in adolescents, but further scrutiny of adolescent response patterns is warranted. Future research should examine items measuring anhedonia and psychomotor and somatic disturbances in adolescents.


Subjects
Depression, Patient Health Questionnaire, Adult, Humans, Adolescent, Child, Depression/diagnosis, Anhedonia, Surveys and Questionnaires, Psychometrics
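
For readers unfamiliar with the Graded Response Model named in the abstract above, the following minimal sketch shows how category probabilities are obtained for one item with four ordered response options. It assumes the common logistic form of the model; the discrimination and threshold values are invented for illustration and are not estimates from the study.

```python
import numpy as np


def grm_category_probs(theta, a, thresholds):
    """Category probabilities under a logistic graded response model.

    theta: latent severity; a: item discrimination;
    thresholds: ordered boundaries between adjacent response categories.
    """
    b = np.asarray(thresholds, dtype=float)
    # Cumulative probabilities of responding in category k or higher
    p_star = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    # Pad with P(X >= lowest) = 1 and P(X > highest) = 0, then difference
    cum = np.concatenate(([1.0], p_star, [0.0]))
    return cum[:-1] - cum[1:]


# Hypothetical item with four response options (scored 0 to 3)
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.5, 2.0])
print(probs.round(3))
```

Evaluating the function over a range of theta values traces the category response curves, which is one way to see at which severity levels each response option is most likely.
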
4.
Behav Res Methods ; 2023 Aug 03.
Article in English | MEDLINE | ID: mdl-37537489

ABSTRACT

In item response theory (IRT) modeling, the magnitude of the lower and upper asymptote parameters determines the degree to which the inflection point shifts above or below P = 0.50. The current study examines the one-parameter negative log-log model (NLLM), which is characterized by a downward shift in the inflection point, among other distinctive psychometric properties. After detailing the statistical foundations of the NLLM, we present a series of simulation studies to establish item and person parameter estimation accuracy and to demonstrate that this parsimonious model addresses the "slipping" effect (i.e., unexpectedly incorrect answers) via an inflection point < 0.50 rather than through computationally difficult estimation of the upper asymptote. We then provide further support for these simulation results through empirical data analysis. Finally, we discuss how the NLLM contributes to recent methodological literature on the utility of asymmetric IRT models.
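
The downward shift of the inflection point can be checked numerically. The sketch below assumes the standard negative log-log link applied to theta minus a single location parameter b; the paper's exact parameterization may differ, and the function name and grid are illustrative choices.

```python
import numpy as np


def nllm_prob(theta, b):
    """Negative log-log item response function (assumed parameterization)."""
    return np.exp(-np.exp(-(theta - b)))


b = 0.0
theta = np.linspace(-6.0, 6.0, 12001)

# The inflection point is where the curve is steepest
slope = np.gradient(nllm_prob(theta, b), theta)
steepest = slope.argmax()
inflection_theta = theta[steepest]
inflection_p = nllm_prob(inflection_theta, b)

print(round(float(inflection_theta), 3), round(float(inflection_p), 3))
```

Under this parameterization the steepest point sits at theta = b, where the response probability is exp(-1), which is below 0.50, whereas a logistic curve is steepest at exactly 0.50.
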

5.
Sch Psychol ; 38(3): 129-136, 2023 May.
Article in English | MEDLINE | ID: mdl-37184956

ABSTRACT

Despite general agreement about the importance of test score relevance, utility, and consequences, empirical articles rarely examine or report these facets of validity. The purpose of this special issue of School Psychology was to provide empirical examples that examine the social consequences of educational test score use. The goal was to illustrate best practices in social consequence considerations and encourage future measurement articles to go beyond a basic consideration of psychometric properties. In this introductory article, we provide an overview of validity frameworks and their application in education research. We also review the current state of social consequence research and offer recommendations to increase and improve this research in our field. After summarizing the articles and commentaries in the special issue, we conclude by describing a framework for guiding test development and ongoing evaluation from a social consequence perspective. (PsycInfo Database Record (c) 2023 APA, all rights reserved).


Subjects
Academic Performance, Educational Psychology, Humans, Psychometrics
6.
Prev Sci ; 24(3): 393-397, 2023 04.
Article in English | MEDLINE | ID: mdl-36633766

ABSTRACT

A variety of health and social problems are routinely measured in the form of categorical outcome data (such as presence/absence of a problem behavior or stages of disease progression). Therefore, proper quantitative analysis of categorical data lies at the heart of the empirical work conducted in prevention science. Categorical data analysis constitutes a broad, dynamic field of methods research, and data analysts in prevention science can benefit from incorporating recent advances and developments in the statistical evaluation of categorical outcomes in their methodological repertoire. The present Special Issue, Advanced Categorical Data Analysis in Prevention Science, highlights recent methods developments and illustrates their application in the context of prevention science. Contributions of the Special Issue cover a wide variety of areas ranging from statistical models for binary as well as multi-categorical data, advances in the statistical evaluation of moderation and mediation effects for categorical data, developments in model evaluation and measurement, as well as methods that integrate variable- and person-oriented categorical data analysis. The articles of this Special Issue make methodological advances in these areas accessible to the audience of prevention scientists to maintain rigorous statistical practice and decision making. The current paper provides background and rationale for this Special Issue, an overview of the articles, and a brief discussion of some potential future directions for prevention research involving categorical data analysis.


Subjects
Statistical Models, Problem Behavior, Humans, Social Problems, Health Services Research, Data Analysis
7.
Prev Sci ; 24(3): 467-479, 2023 04.
Article in English | MEDLINE | ID: mdl-34519939

ABSTRACT

Statistical analysis of categorical data often relies on multiway contingency tables; yet, as the number of categories and/or variables increases, the number of table cells with few (or zero) observations also increases. Unfortunately, sparse contingency tables invalidate the use of standard goodness-of-fit statistics. Limited-information fit statistics and bootstrapping procedures offer valuable solutions to this problem, but they present an additional concern in their strict reliance on the (potentially misleading) observed data. To address both of these issues, we demonstrate the Bayesian model checking technique, which yields insightful, useful, and comprehensive evaluations of specific properties of a given model. We illustrate this technique using item response data from a patient-reported psychopathology screening questionnaire, and we provide annotated R code to promote dissemination of this informative method in other prevention science modeling scenarios.


Subjects
Statistical Models, Theoretical Models, Humans, Bayes Theorem, Research Design
8.
Behav Res Methods ; 55(1): 200-219, 2023 01.
Article in English | MEDLINE | ID: mdl-35355241

ABSTRACT

Traditional item response theory (IRT) models assume a symmetric error distribution and rely on symmetric (logit or probit) link functions to model the response probabilities. As an alternative, we investigated the one-parameter complementary log-log model (CLLM), which is founded on an asymmetric error distribution and results in an asymmetric item response function with important psychometric properties. In a series of simulation studies, we demonstrate that the CLLM (a) is estimable in small sample sizes, (b) facilitates item-weighted scoring, and (c) accounts for the effect of guessing, despite the presence of a single parameter. We then provide further evidence for these claims by applying the CLLM to empirical data. Finally, we discuss how this work contributes to the growing psychometric literature on model complexity.


Subjects
Psychometrics, Humans, Psychometrics/methods, Computer Simulation, Probability, Sample Size
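
The contrast between symmetric and asymmetric link functions in the abstract above can be illustrated with a short check. This sketch assumes the standard complementary log-log link applied to theta minus a single location parameter b, which may not match the paper's parameterization; the one-parameter logistic curve is included only as a symmetric reference.

```python
import numpy as np


def cllm_prob(theta, b):
    """Complementary log-log item response function (assumed parameterization)."""
    return 1.0 - np.exp(-np.exp(theta - b))


def logistic_prob(theta, b):
    """One-parameter logistic (Rasch-type) item response function."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))


b = 0.0
x = np.array([0.5, 1.0, 2.0])

# A symmetric curve satisfies P(b + x) + P(b - x) = 1 for every distance x
logistic_sum = logistic_prob(b + x, b) + logistic_prob(b - x, b)
cllm_sum = cllm_prob(b + x, b) + cllm_prob(b - x, b)

# Probability at theta = b, which equals 1 - exp(-1) under this link
p_at_b = cllm_prob(b, b)

print(logistic_sum.round(3), cllm_sum.round(3), round(float(p_at_b), 3))
```

The logistic sums equal 1 at every distance, while the complementary log-log sums do not, which is what asymmetry of the item response function means in practice.
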
9.
Behav Brain Sci ; 45: e5, 2022 02 10.
Article in English | MEDLINE | ID: mdl-35139959

ABSTRACT

Traditional statistical model evaluation typically relies on goodness-of-fit testing and quantifying model complexity by counting parameters. Both of these practices may result in overfitting and have thereby contributed to the generalizability crisis. The information-theoretic principle of minimum description length addresses both of these concerns by filtering noise from the observed data and consequently increasing generalizability to unseen data.


Subjects
Statistical Models, Humans
10.
J Sch Psychol ; 83: 66-88, 2020 12.
Article in English | MEDLINE | ID: mdl-33276856

ABSTRACT

The purpose of this study was to support the development and initial validation of the Intervention Selection Profile (ISP)-Skills, a brief 14-item teacher rating scale intended to inform the selection and delivery of instructional interventions at Tier 2. Teacher participants (n = 196) rated five students from their classroom across four measures (total student n = 877). These measures included the ISP-Skills and three criterion tools: Social Skills Improvement System (SSIS), Devereux Student Strengths Assessment (DESSA), and Academic Competence Evaluation Scales (ACES). Diagnostic classification modeling (DCM) suggested an expert-created Q-matrix, which specified relations between ISP-Skills items and hypothesized latent attributes, provided good fit to item data. DCM also indicated ISP-Skills items functioned as intended, with the magnitude of item ratings corresponding to the model-implied probability of attribute mastery. DCM was then used to generate skill profiles for each student, which included scores representing the probability of students mastering each of eight skills. Correlational analyses revealed large convergent relations between ISP-Skills probability scores and theoretically-aligned subscales from the criterion measures. Discriminant validity was not supported, as ISP-Skills scores were also highly related to all other criterion subscales. Receiver operating characteristic (ROC) curve analyses informed the selection of cut scores from each ISP-Skills scale. Review of classification accuracy statistics associated with these cut scores (e.g., sensitivity and specificity) suggested they reliably differentiated students with below average, average, and above average skills. Implications for practice and directions for future research are discussed, including those related to the examination of ISP-Skills treatment utility.


Subjects
Behavior Rating Scale/standards, Students/psychology, Academic Performance, Adult, Child, Child Behavior/psychology, Emotions, Female, Humans, Male, Reproducibility of Results, Schools, Sensitivity and Specificity, Social Skills
12.
Assessment ; 27(2): 321-333, 2020 03.
Article in English | MEDLINE | ID: mdl-29716398

ABSTRACT

Observational measurement of treatment adherence has long been considered the gold standard. However, little is known about either the generalizability of the scores from extant observational instruments or the sampling needed. We conducted generalizability (G) and decision (D) studies on two samples of recordings from two randomized controlled trials testing cognitive-behavioral therapy for youth anxiety in two different contexts: research versus community. Two doctoral students independently coded 543 session recordings from 52 patients treated by 13 therapists. The initial G-study demonstrated that context accounted for a disproportionately large share of variance, so we conducted G- and D-studies for the two contexts separately. Results suggested that reliable cognitive-behavioral therapy adherence studies require at least 10 sessions per patient, assuming 12 patients per therapist and two coders, a challenging threshold even in well-funded research. Implications, including the importance of evaluating alternatives to observational measurement, are discussed.


Subjects
Anxiety/therapy, Cognitive Behavioral Therapy, Decision Making, Outcome Assessment in Health Care/methods, Adolescent Psychology/methods, Treatment Adherence and Compliance, Adolescent, Female, Humans, Male, Psychometrics, Randomized Controlled Trials as Topic, Treatment Adherence and Compliance/statistics & numerical data
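
The logic of a D-study, asking how many sessions are needed to reach a target reliability, can be sketched as follows. This is a simplified illustration, not the study's analysis: it assumes a fully crossed patients x sessions x coders design with random facets, whereas the study's design (sessions within patients, patients within therapists) is more complex. All variance components below are invented, and the function names are hypothetical.

```python
def g_coefficient(var_p, var_ps, var_pr, var_psr_e, n_s, n_r):
    """Relative generalizability coefficient for patient-level scores
    averaged over n_s sessions and n_r coders (crossed design assumed)."""
    rel_error = var_ps / n_s + var_pr / n_r + var_psr_e / (n_s * n_r)
    return var_p / (var_p + rel_error)


def min_sessions(target, n_r, max_sessions=50, **components):
    """Smallest number of sessions reaching the target coefficient, or None."""
    for n_s in range(1, max_sessions + 1):
        if g_coefficient(n_s=n_s, n_r=n_r, **components) >= target:
            return n_s
    return None


# Invented variance components: patient, patient x session,
# patient x coder, and residual
components = dict(var_p=0.20, var_ps=0.35, var_pr=0.02, var_psr_e=0.25)
needed = min_sessions(0.80, n_r=2, **components)
print(needed)
```

Because the session-related error terms shrink only as 1/n_s, large patient-by-session variance quickly drives up the number of sessions that must be coded.
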
13.
Sch Psychol ; 34(3): 296-306, 2019 May.
Article in English | MEDLINE | ID: mdl-30556727

ABSTRACT

The examination of belonging in schools, connecting school belonging to a plethora of academic and psychosocial outcomes, has been well established in the literature. Researchers have measured school belonging most frequently with the Psychological Sense of School Membership, but its psychometric properties have been called into question by several researchers. Further, the scale measures 1 subset of belonging (i.e., school), leaving out powerful belonging connections in other areas of a student's life, namely peers and family. The current study examines the development and validation of the Milwaukee Youth Belongingness Scale. This process was examined by utilizing item response theory and a secondary analysis confirming the factor structure and the validation of the scale by comparing it to other constructs. The results confirm a 9-item scale that involves a total scale score and 3 factors (School, Peers, Family). Implications for mental health professionals and future research are discussed. (PsycINFO Database Record (c) 2019 APA, all rights reserved).


Subjects
Group Processes, Interpersonal Relations, Psychometrics, Students/psychology, Adolescent, Child, Family, Female, Humans, Male, Peer Group, Psychological Theory, Psychometrics/instrumentation, Psychometrics/methods, Psychometrics/standards, Reproducibility of Results, Schools, Social Perception
14.
J Sch Psychol ; 68: 129-141, 2018 06.
Article in English | MEDLINE | ID: mdl-29861023

ABSTRACT

In accordance with an argument-based approach to validation, the purpose of the current study was to yield evidence relating to Social, Academic, and Emotional Behavior Risk Screener (SAEBRS) score interpretation. Bifactor item response theory analyses were performed to examine SAEBRS item functioning. Structural equation modeling (SEM) was used to simultaneously evaluate intra- and inter-scale relationships, expressed through (a) a measurement model specifying a bifactor structure to SAEBRS items, and (b) a structural model specifying convergent and discriminant relations with an outcome measure (i.e., Behavioral and Emotional Screening System [BESS]). Finally, hierarchical omega coefficients were calculated in evaluating the model-based internal reliability of each SAEBRS scale. IRT analyses supported the adequate fit of the bifactor model, indicating items adequately discriminated moderate and high-risk students. SEM results further supported the fit of the latent bifactor measurement model, yielding superior fit relative to alternative models (i.e., unidimensional and correlated factors). SEM analyses also indicated the latent SAEBRS-Total Behavior factor was a statistically significant predictor of all BESS subscales, the SAEBRS-Academic Behavior predicted BESS Adaptive Skills subscales, and the SAEBRS-Emotional Behavior predicted the BESS Internalizing Problems subscale. Hierarchical omega coefficients indicated the SAEBRS-Total Behavior factor was associated with adequate reliability. In contrast, after accounting for the total scale, each of the SAEBRS subscales was associated with somewhat limited reliability, suggesting variability in these scores is largely driven by the Total Behavior scale. Implications for practice and future research are discussed.


Subjects
Child Behavior Disorders/diagnosis, Emotions/physiology, Problem Behavior/psychology, Students/psychology, Child, Child Behavior Disorders/psychology, Female, Humans, Male, Mass Screening, Psychometrics, Reproducibility of Results, Risk Assessment, Schools
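
The hierarchical omega coefficients mentioned in the abstract above can be computed directly from bifactor loadings. The sketch below uses the usual formulas for standardized loadings with orthogonal general and specific factors; the loadings, the two subscale labels, and the function name are invented for illustration and are not the SAEBRS estimates.

```python
import numpy as np


def omega_hierarchical(general, specific, groups):
    """Omega hierarchical (total score) and omega hierarchical subscale.

    Assumes standardized loadings and orthogonal factors, so each item's
    unique variance is 1 - general**2 - specific**2.
    """
    general = np.asarray(general, dtype=float)
    specific = np.asarray(specific, dtype=float)
    groups = np.asarray(groups)
    unique = 1.0 - general**2 - specific**2
    labels = np.unique(groups)

    # Total-score variance: general part + specific parts + unique variance
    specific_part = sum(specific[groups == g].sum() ** 2 for g in labels)
    total_var = general.sum() ** 2 + specific_part + unique.sum()
    omega_h = general.sum() ** 2 / total_var

    # Share of each subscale's variance due to its own specific factor
    omega_hs = {}
    for g in labels:
        m = groups == g
        sub_var = general[m].sum() ** 2 + specific[m].sum() ** 2 + unique[m].sum()
        omega_hs[str(g)] = specific[m].sum() ** 2 / sub_var
    return omega_h, omega_hs


# Invented loadings: six items, two subscales of three items each
omega_h, omega_hs = omega_hierarchical(
    general=[0.7] * 6,
    specific=[0.3] * 6,
    groups=["social"] * 3 + ["academic"] * 3,
)
print(round(omega_h, 3), {k: round(v, 3) for k, v in omega_hs.items()})
```

When general loadings dominate, omega hierarchical for the total score is high while the subscale values are low, which is the pattern the abstract describes for subscale scores after accounting for the total scale.
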
15.
Multivariate Behav Res ; 52(4): 465-484, 2017.
Article in English | MEDLINE | ID: mdl-28426237

ABSTRACT

Complexity in item response theory (IRT) has traditionally been quantified by simply counting the number of freely estimated parameters in the model. However, complexity is also contingent upon the functional form of the model. We examined four popular IRT models (exploratory factor analytic, bifactor, DINA, and DINO) with different functional forms but the same number of free parameters. In comparison, a simpler (unidimensional 3PL) model was specified such that it had one more parameter than the previous models. All models were then evaluated according to the minimum description length principle. Specifically, each model was fit to 1,000 data sets that were randomly and uniformly sampled from the complete data space and then assessed using global and item-level fit and diagnostic measures. The findings revealed that the factor analytic and bifactor models possess a strong tendency to fit any possible data. The unidimensional 3PL model displayed minimal fitting propensity, despite the fact that it included an additional free parameter. The DINA and DINO models did not demonstrate a proclivity to fit any possible data, but they did fit well to distinct data patterns. Applied researchers and psychometricians should therefore consider functional form, and not goodness-of-fit alone, when selecting an IRT model.


Subjects
Factor Analysis, Statistical Models, Information Theory, Multivariate Analysis, Psychometrics
16.
J Affect Disord ; 208: 369-374, 2017 Jan 15.
Article in English | MEDLINE | ID: mdl-27810720

ABSTRACT

BACKGROUND: Clinicians view "recovery" as the reduction in severity of symptoms over time, whereas patients view it as the restoration of premorbid functioning level and quality of life (QOL). The main purpose of this study is to incorporate patient-reported measures of functioning and QOL into the assessment of patient outcomes in MDD and to use these data to define recovery. METHOD: Using the STAR*D study of patients diagnosed with MDD, the present analysis grades patients' MDD severity, functioning level, and QOL at exit from each level of the study, as well as at follow-up. Using Item Response Theory, we combined patient data from functioning and QOL measures (WSAS, Q-LES-Q) in order to form a single latent dimension named the Recovery Index. RESULTS: The Recovery Index, a latent measure assessing the impact of illness on functioning and QOL, is able to predict remission of MDD in patients who participated in the STAR*D study. CONCLUSIONS: By incorporating functioning and quality of life, the Recovery Index creates a new dimension for measuring restoration of health, in order to move beyond basic symptom measurement.


Subjects
Major Depressive Disorder, Quality of Life, Activities of Daily Living, Major Depressive Disorder/classification, Major Depressive Disorder/rehabilitation, Humans, Prognosis, Psychiatric Status Rating Scales, Remission Induction
17.
Multivariate Behav Res ; 50(1): 128, 2015.
Article in English | MEDLINE | ID: mdl-26609749
18.
J Pers Assess ; 95(2): 129-40, 2013.
Article in English | MEDLINE | ID: mdl-23030794

ABSTRACT

Confirmatory factor analytic studies of psychological measures showing item responses to be multidimensional do not provide sufficient guidance for applied work. Demonstrating that item response data are multifactorial in this way does not necessarily (a) mean that a total scale score is an inadequate indicator of the intended construct, (b) demand creating and scoring subscales, or (c) require specifying a multidimensional measurement model in research using structural equation modeling (SEM). To better inform these important decisions, more fine-grained psychometric analyses are necessary. We describe 3 established, but seldom used, psychometric approaches that address 4 distinct questions: (a) To what degree do total scale scores reflect reliable variation on a single construct? (b) Is the scoring and reporting of subscale scores justified? (c) If justified, how much reliable variance do subscale scores provide after controlling for a general factor? and (d) Can multidimensional item response data be represented by a unidimensional measurement model in SEM, or are multidimensional measurement models (e.g., second-order, bifactor) necessary to achieve unbiased structural coefficients? In the discussion, we provide guidance for applied researchers on how best to interpret the results from applying these methods and review their limitations.


Subjects
Psychological Models, Personality Tests, Factor Analysis, Humans, Psychometrics, Research Design