ABSTRACT
This study provides empirical benchmarks that quantify typical changes in students' reports of social and emotional skills in a large, diverse sample. Data come from six cohorts of students (N = 361,815; 6% Asian, 8% Black, 68% White, 75% Latinx, 50% Female) who responded to the CORE survey from 2015 to 2018 and help quantify typical gains/declines in growth mindset, self-efficacy, self-management, and social awareness. Results show fluctuations in skills between 4th and 12th grade (changes ranging from -.33 to .23 standard deviations). Growth mindset increases in fourth grade, declines in fifth to seventh grade, then mostly increases. Self-efficacy, self-management, and social awareness decline in sixth to eighth grade. Self-management and social awareness, but not self-efficacy, show increases in 10th to 12th grade.
Subjects
Benchmarking , Students , Emotions , Female , Humans , Male , Self Efficacy , Social Skills , Students/psychology , Surveys and Questionnaires
ABSTRACT
To avoid the subjectivity of having a single person evaluate a construct of interest (e.g., a student's self-efficacy in school), multiple raters are often used. Increasingly, data that use multiple raters to evaluate psychological and social-emotional constructs over time are available. While a range of models to address measurement issues that arise when using multiple raters have been presented, including a small number for longitudinal data, few if any models are available to estimate growth in the presence of multiple raters. In this study, we provide a model that removes all but the shared perceptions of raters at a given timepoint (i.e., removes unique rater variance), then adds on a latent growth curve model across timepoints. Through simulation and empirical studies, we examine the performance of the model in terms of recovering true growth parameters, and relative to more crude approaches like estimating growth based on a single rater. Our results indicate that the model we propose performs quite well along these dimensions, and shows promise for use by researchers who want to estimate growth based on longitudinal multi-rater data.
Subjects
Computer Simulation , Humans
ABSTRACT
Using data from the Applied Problems subtest of the Woodcock-Johnson Tests of Achievement (Woodcock & Johnson, 1989/1990, Woodcock-Johnson psycho-educational battery-revised. Allen, TX: DLM Teaching Resources) administered to 1,364 children from the National Institute of Child Health and Human Development (NICHD) Study of Early Childcare and Youth Development (SECCYD), this study measures children's mastery of three numeric competencies (counting, concrete representational arithmetic and abstract arithmetic operations) at 54 months of age. We find that, even after controlling for key demographic characteristics, the numeric competency that children master prior to school entry relates to important educational transitions in secondary and post-secondary education. Those children who showed low numeric competency prior to school entry enrolled in lower math track classes in high school and were less likely to enrol in college. Important numeracy competency differences at age 54 months related to socioeconomic inequalities were also found. These findings suggest that important indicators of long-term schooling success (i.e., advanced math courses, college enrollment) are evident prior to schooling based on the levels of numeracy mastery.
ABSTRACT
Survey respondents employ different response styles when they use the categories of the Likert scale differently despite having the same true score on the construct of interest. For example, respondents may be more likely to use the extremes of the response scale independent of their true score. Research already shows that differing response styles can create a construct-irrelevant source of bias that distorts fundamental inferences made based on survey data. While some initial studies examine the effect of response styles on survey scores in longitudinal analyses, the issue of how response styles affect estimates of growth is underexamined. In this study, we conducted empirical and simulation analyses in which we scored surveys using item response theory (IRT) models that do and do not account for response styles, and then used those different scores in growth models and compared results. Generally, we found that response styles can affect estimates of growth parameters including the slope, but that the effects vary by psychological construct, response style, and IRT model used.
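The definition of a response style in the abstract above can be made concrete with a small sketch. It is only an illustration: the cut points, the function name `likert_response`, and the five-category scale are assumptions chosen for the example and are not taken from the study or its IRT models. Two respondents share the same latent score, but the one with narrower middle bands (an extreme response style) lands in the top category.

```python
from bisect import bisect_right


def likert_response(latent_score, thresholds):
    """Map a latent score to a 1-5 category given four ordered cut points."""
    # bisect_right counts how many cut points the score meets or exceeds.
    return bisect_right(thresholds, latent_score) + 1


# Hypothetical cut points on the latent scale (assumed for illustration).
typical_thresholds = [-1.5, -0.5, 0.5, 1.5]
# Narrow middle bands: the endpoints of the scale are reached more easily.
extreme_thresholds = [-0.4, -0.2, 0.2, 0.4]

true_score = 0.6  # identical true score for both respondents
typical = likert_response(true_score, typical_thresholds)
extreme = likert_response(true_score, extreme_thresholds)
print(typical, extreme)
```

Because the observed categories differ while the true score does not, any scoring approach that ignores the thresholds would treat the two respondents as differing on the construct itself.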
Subjects
Surveys and Questionnaires , Bias , Computer Simulation , Longitudinal Studies
ABSTRACT
In this prospective longitudinal study (N = 1094, M age = 5.6 years to M age = 11.1 years), we examined family factors associated with school mobility and then asked if either a move during the previous year or cumulative moves across elementary school were related to child functioning. Family factors were not linked to a recent move or a single move, but changes in family income and household structure did predict higher odds of two or more moves in elementary school. There was no evidence that a recent move or a single move was related to children's academic or social functioning. Effects of two or more moves on child functioning were not significant after controlling for the number of analyses that were conducted. Taken together, school mobility during elementary school did not appear to be a pervasive risk although we were unable to study very high rates of school mobility because of very small sample sizes.
ABSTRACT
Research on achievement gaps by race/ethnicity and poverty status typically focuses on each gap separately, and recent syntheses suggest the poverty gap is growing while racial/ethnic gaps are narrowing. In this study, we used time-varying effect modeling to examine the interaction of race/ethnicity and poverty gaps in math and reading achievement from 1986-2005 for poor and non-poor White, Black, and Hispanic students in three age groups (5-6, 9-10, and 13-14). We found that across this twenty-year period, the gaps between poor White students and their poor Black and Hispanic peers grew, while the gap between non-poor Whites and Hispanics narrowed. We conclude that understanding the nature of achievement gaps requires simultaneous examination of race/ethnicity and income.
Subjects
Academic Success , Poverty/ethnology , Adolescent , Child , Child, Preschool , Ethnicity/statistics & numerical data , Female , Humans , Income , Longitudinal Studies , Male , Students/statistics & numerical data , United States
ABSTRACT
BACKGROUND: Research and practice in autism spectrum disorder (ASD) rely on quantitative measures, such as the Social Responsiveness Scale (SRS), for characterization and diagnosis. Like many ASD diagnostic measures, SRS scores are influenced by factors unrelated to ASD core features. This study further interrogates the psychometric properties of the SRS using item response theory (IRT), and demonstrates a strategy to create a psychometrically sound short form by applying IRT results. METHODS: Social Responsiveness Scale analyses were conducted on a large sample (N = 21,426) of youth from four ASD databases. Items were subjected to item factor analyses and evaluation of item bias by gender, age, expressive language level, behavior problems, and nonverbal IQ. RESULTS: Item selection based on item psychometric properties, DIF analyses, and substantive validity produced a reduced item SRS short form that was unidimensional in structure, highly reliable (α = .96), and free of gender, age, expressive language, behavior problems, and nonverbal IQ influence. The short form also showed strong relationships with established measures of autism symptom severity (ADOS, ADI-R, Vineland). Degree of association between all measures varied as a function of expressive language. CONCLUSIONS: Results identified specific SRS items that are more vulnerable to non-ASD-related traits. The resultant 16-item SRS short form may possess superior psychometric properties compared to the original scale and emerge as a more precise measure of ASD core symptom severity, facilitating research and practice. Future research using IRT is needed to further refine existing measures of autism symptomatology.
Subjects
Autism Spectrum Disorder/diagnosis , Psychiatric Status Rating Scales/standards , Psychometrics/methods , Social Behavior , Adolescent , Child , Child, Preschool , Female , Humans , Male , Psychometrics/instrumentation , Reproducibility of Results
ABSTRACT
INTRODUCTION: Negative psychosocial expectancies of smoking include aspects of social disapproval and disappointment in oneself. This paper describes analyses conducted to develop and evaluate item banks for assessing psychosocial expectancies among daily and nondaily smokers. METHODS: Using data from a sample of daily (N = 4,201) and nondaily (N = 1,183) smokers, we conducted a series of item factor analyses, item response theory analyses, and differential item functioning analyses (according to gender, age, and race/ethnicity) to arrive at a unidimensional set of psychosocial expectancies items for daily and nondaily smokers. We also evaluated performance of short forms (SFs) and computer adaptive tests (CATs) to efficiently assess psychosocial expectancies. RESULTS: A total of 21 items were included in the Psychosocial Expectancies item banks: 14 items are common across daily and nondaily smokers, 6 are unique to daily, and 1 is unique to nondaily. For both daily and nondaily smokers, the Psychosocial Expectancies item banks are strongly unidimensional, highly reliable (reliability = 0.95 and 0.93, respectively), and perform similarly across gender, age, and race/ethnicity groups. A SF common to daily and nondaily smokers consists of 6 items (reliability = 0.85). Results from simulated CATs showed that, on average, fewer than 8 items are needed to assess psychosocial expectancies with adequate precision when using the item banks. CONCLUSIONS: Psychosocial expectancies of smoking can be assessed on the basis of these item banks via the SF, by using CAT, or through a tailored set of items selected for a specific research purpose.
Subjects
Psychometrics/methods , Smoking/psychology , Adolescent , Adult , Calibration , Databases, Factual , Ethnicity , Factor Analysis, Statistical , Female , Humans , Male , Middle Aged , Reproducibility of Results , Surveys and Questionnaires , Young Adult
ABSTRACT
INTRODUCTION: Smoking behavior is influenced by social motivations such as the expected social benefits of smoking and the social cues that induce craving. This paper describes development of the PROMIS Social Motivations for Smoking item banks, which will serve to standardize assessment of these social motivations among daily and nondaily smokers. METHODS: Daily (N = 4,201) and nondaily (N = 1,183) smokers completed an online survey. Item factor analyses, item response theory analyses, and differential item functioning analyses were conducted to identify a unidimensional set of items for each group. Short forms (SFs) and computer adaptive tests (CATs) were evaluated as tools for more efficiently assessing this construct. RESULTS: A total of 15 items were included in the item banks (9 items common to daily and nondaily smokers, 3 unique to daily, 3 unique to nondaily). Scores based on full item banks are highly reliable (reliability = 0.90-0.91). Additionally, the item banks are strongly unidimensional and perform similarly across gender, age, and race/ethnicity groups. A fixed SF for use with both daily and nondaily smokers consists of 4 items (reliability = 0.80). Results from simulated CATs showed that, on average, fewer than 5 items are needed to assess this construct with adequate precision using the item banks. CONCLUSIONS: A new set of items has been identified for assessing the social motivations for smoking in a reliable, standardized manner for daily and nondaily smokers. In addition to using the full item banks, efficient assessment can be achieved by using SFs, employing CATs, or selecting items tailored to specific research or clinical purposes.
Subjects
Motivation , Psychometrics/methods , Smoking/psychology , Social Environment , Adolescent , Adult , Calibration , Databases, Factual , Ethnicity , Factor Analysis, Statistical , Female , Humans , Male , Middle Aged , Reproducibility of Results , Surveys and Questionnaires , Young Adult
ABSTRACT
Educators have become increasingly committed to social and emotional learning in schools. However, we know too little about the typical growth trajectories of the competencies that schools are striving to improve. We leverage data from the California Office to Reform Education, a consortium of districts in California serving over 1.5 million students, that administers annual surveys to students to measure social and emotional competencies (SECs). This article uses data from six cohorts of approximately 16,000 students each (51% male, 73% Latinx, 11% White, 10% Black, 24% with parents who did not complete high school) in Grades 4-12. Two questions are addressed. First, how much growth occurs in growth mindset, self-efficacy, self-management, and social awareness from Grades 4 to 12? Second, do initial status and growth look different by parental educational attainment and gender? Using accelerated longitudinal design growth models, findings show distinct growth trends among the four SECs with growth mindset increasing, self-management mostly decreasing, and self-efficacy and social awareness decreasing and then increasing. The subgroup analyses show gaps between groups but patterns of growth that are more similar than different. Further, subgroup membership accounts for very little variation in growth or declines. Instead, initial levels of competencies predict growth. Also, variation within groups is greater than variation between groups. The findings have practical implications for educators and psychologists striving to improve SECs. If schools use student-report approaches, predicting steady and consistent positive growth in SECs is unrealistic. Instead, U-shaped patterns for some SECs appear to be normative with notable declines in the sixth grade, requiring new supports. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
ABSTRACT
While a great deal of thought, planning, and money goes into the design of multisite randomized controlled trials (RCTs) that are used to evaluate the effectiveness of interventions in fields like education and psychology, relatively little attention is often paid to the measurement choices made in such evaluations. In this study, we conduct a series of simulation studies that consider a wide range of options for producing scores from multiple administrations of assessments in the context of multisite RCTs. The scoring models considered range from the simple (sum scores) to highly complex (multilevel two-tier item response theory [IRT] models with latent regression). We find that the true treatment effect is attenuated when sum scores or scores from IRT models that do not account for treatment assignment are used. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
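One well-known way in which a scoring choice can attenuate a standardized treatment effect is plain measurement error. The sketch below illustrates that classical-test-theory mechanism only; it is an assumption for illustration, not the simulation design or the specific mechanism examined in the study above, and the function name and numbers are hypothetical. If an observed score is the true score plus independent error, the observed standard deviation is inflated by a factor of 1/sqrt(reliability), so the standardized effect shrinks by sqrt(reliability).

```python
import math


def observed_effect_size(true_effect_size, reliability):
    """Standardized effect on an error-prone score under classical test theory.

    Assumes observed = true + independent error, so that
    sd(observed) = sd(true) / sqrt(reliability) and the mean difference
    between arms is unchanged.
    """
    if not 0 < reliability <= 1:
        raise ValueError("reliability must be in (0, 1]")
    return true_effect_size * math.sqrt(reliability)


# Hypothetical values: a true effect of 0.30 SD measured with reliability 0.64.
attenuated = observed_effect_size(0.30, 0.64)
print(round(attenuated, 2))
```

With perfect reliability the function returns the true effect unchanged, which is a quick check that the attenuation comes only from the error variance in this simplified setup.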
ABSTRACT
Supporting students' social-emotional learning (SEL) is gaining emphasis in education. In particular, self-control is a construct that has been shown to predict academic outcomes, though much debate on this point exists. Although largely unexamined, inconsistent findings could stem from the fact that related surveys are often scored by multiple raters (e.g., teachers and parents), especially when administered at a young age when students cannot respond to items themselves. Yet little is known about (a) how much parent and teacher self-control ratings overlap and (b) what student characteristics like race and socioeconomic status are associated with inconsistencies. In this study, we use data from a widely used measure of early self-control with parent and teacher forms. We use these data to examine the impact of rater discrepancies on our understanding of students' self-control. Results show relatively low agreement between parents and teachers, with some evidence that discrepancies are associated with student race. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
Subjects
Self-Control , Students , Humans , Students/psychology , Surveys and Questionnaires , Parents/psychology , School Teachers/psychology , Schools
ABSTRACT
The COVID-19 pandemic has been an unprecedented disruption in students' academic development. Using reading test scores from 5 million U.S. students in grades 3-8, we tracked changes in achievement across the first two years of the pandemic. Average fall 2021 reading test scores in grades 3-8 were .09 to .17 standard deviations lower relative to same-grade peers in fall 2019, with the largest impacts in grades 3-5. Students of color attending high-poverty elementary schools saw the largest test score declines in reading. Our results suggest that many upper elementary students are at risk for reading difficulties and will need targeted supports to build and strengthen foundational reading skills. Supplementary Information: The online version contains supplementary material available at 10.1007/s11145-022-10345-8.
ABSTRACT
This study is a conceptual replication of a widely cited study by Moffitt et al. (2011) which found that attention and behavior problems in childhood (a composite of impulsive hyperactive, inattentive, and impulsive-aggressive behaviors labeled "self-control") predicted adult financial status, health, and criminal activity. Using data from longitudinal cohort studies in the United States (n = 1,168) and the United Kingdom (n = 16,506), we largely reproduced their pattern of findings that attention and behavior problems measured across the course of childhood predicted a range of adult outcomes including educational attainment (β = -0.22 for the U.S., β = -0.13 for the U.K.) and spending time in jail (OR = 1.74 for the U.S., OR = 1.48 for the U.K.). We found that associations with outcomes in education, work, and finances diminished in the presence of additional covariates for children's home environment and achievement but associations for other outcomes were more robust. We also found that attention and behavior problems across distinct periods of childhood were associated with adult outcomes. Specific attention and behavior problems showed some differences in predicting outcomes in the U.S. cohort, with attention problems predicting lower educational attainment and hyperactivity/impulsivity predicting ever spending time in jail. Together with the findings from Moffitt et al., our study makes clear that childhood attention and behavior problems are associated with a range of outcomes in adulthood for cohorts born in the 1950s, 1970s, and 1990s across three countries. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
Subjects
Attention Deficit Disorder with Hyperactivity , Criminals , Child , Humans , Adult , United States , Longitudinal Studies , United Kingdom , Attention , Health Status
ABSTRACT
Researchers in the social sciences often obtain ratings of a construct of interest provided by multiple raters. While using multiple raters provides a way to help avoid the subjectivity of any given person's responses, rater disagreement can be a problem. A variety of models exist to address rater disagreement in both structural equation modeling and item response theory frameworks. Recently, a model was developed by Bauer et al. (2013) and referred to as the "trifactor model" to provide applied researchers with a straightforward way of estimating scores that are purged of variance that is idiosyncratic by rater. Although the intent of the model is to be usable and interpretable, little is known about the circumstances under which it performs well and those under which it does not. We conduct simulation studies to examine the performance of the trifactor model under a range of sample sizes and model specifications and then compare model fit, bias, and convergence rates.
ABSTRACT
BACKGROUND: Research shows that successfully transitioning from intermediate school to secondary school is pivotal for students to remain on track to graduate. Studies also indicate that a successful transition is a function not only of how prepared the students are academically but also whether they have the social-emotional learning (SEL) skills to succeed in a more independent secondary school environment. AIM: Yet, little is known about whether students' SEL skills are stable over time, and if they are not, whether a student's initial level of SEL skills at the start of intermediate school or change in SEL skills over time is a better indicator of whether the student will be off track academically in 9th grade. This study begins to investigate this issue. SAMPLE: We use four years of longitudinal SEL data from students in a large urban district with a sample size of ~3,000 students per timepoint. METHODS: We use several years of longitudinal SEL data to fit growth models for three constructs shown to be related to successfully transitioning to secondary school. In so doing, we examine whether a student's mean SEL score in 6th grade (status) or growth between 6th and 8th grade is more predictive of being off track academically in 9th grade. RESULT: Results indicate that, while status is more frequently significant, growth for self-management is also predictive above and beyond status on that construct. CONCLUSION: Findings suggest that understanding how a student develops social-emotionally can improve identification of students not on track to succeed in high school.
Subjects
Schools , Social Learning , Emotions , Humans , Social Skills , Students/psychology
ABSTRACT
A huge portion of what we know about how humans develop, learn, behave, and interact is based on survey data. Researchers use longitudinal growth modeling to understand the development of students on psychological and social-emotional learning constructs across elementary and middle school. In these designs, students are typically administered a consistent set of self-report survey items across multiple school years, and growth is measured either based on sum scores or scale scores produced based on item response theory (IRT) methods. Although there is a great deal of guidance on scaling and linking IRT-based large-scale educational assessment to facilitate the estimation of examinee growth, little of this expertise is brought to bear in the scaling of psychological and social-emotional constructs. Through a series of simulation and empirical studies, we produce scores in a single-cohort repeated measure design using sum scores as well as multiple IRT approaches and compare the recovery of growth estimates from longitudinal growth models using each set of scores. Results indicate that using scores from multidimensional IRT approaches that account for latent variable covariances over time in growth models leads to better recovery of growth parameters relative to models using sum scores and other IRT approaches. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
Subjects
Research Design , Bias , Computer Simulation , Humans , Longitudinal Studies , Surveys and Questionnaires
ABSTRACT
Though much effort is often put into designing psychological studies, the measurement model and scoring approach employed are often an afterthought, especially when short survey scales are used (Flake & Fried, 2020). One possible reason that measurement gets downplayed is that there is generally little understanding of how calibration/scoring approaches could impact common estimands of interest, including treatment effect estimates, beyond random noise due to measurement error. Another possible reason is that the process of scoring is complicated, involving selecting a suitable measurement model, calibrating its parameters, then deciding how to generate a score, all steps that occur before the score is even used to examine the desired psychological phenomenon. In this study, we provide three motivating examples where surveys are used to understand individuals' underlying social emotional and/or personality constructs to demonstrate the potential consequences of measurement/scoring decisions. These examples also mean we can walk through the different measurement decision stages and, hopefully, begin to demystify them. As we show in our analyses, the decisions researchers make about how to calibrate and score the survey used have consequences that are often overlooked, with likely implications both for conclusions drawn from individual psychological studies and replications of studies. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
ABSTRACT
This study evaluates the psychometric properties (dimensionality, item bias, reliability) of the Repetitive Behavior Scale-Revised (RBS-R), provides scoring guidelines for the dimensional measure, and makes recommendations for future RRB measure development. Participants included individuals from three large autism data repositories: Simons Foundation Powering Autism Research for Knowledge (SPARK), Simons Simplex Collection (SSC), and National Database for Autism Research (NDAR). The total sample included N = 15,318 autistic individuals ages 3-18. Confirmatory factor analysis was used to evaluate competing theoretical factor structures. Item response theory (IRT) was used to evaluate differential item functioning, estimate the reliability of each RBS-R subdomain, and score the subdomains. A unidimensional factor structure demonstrated clearly inadequate model fit, calling into question the practice of reporting a total score on the RBS-R. A five-dimensional factor structure was supported by the theoretical and empirical evidence, though the fifth factor (restricted interests) was not sufficiently reliable for use. IRT-based scoring tools were generated for use in research. The present study illustrates the promise in the future development of measures for RRBs, particularly in the development of measures to separately and specifically assess RRB constructs using rigorous methodological guidelines. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
Subjects
Autism Spectrum Disorder , Autistic Disorder , Adolescent , Autism Spectrum Disorder/diagnosis , Autistic Disorder/diagnosis , Child , Child, Preschool , Humans , Outcome Assessment, Health Care , Psychometrics , Reproducibility of Results
ABSTRACT
Few measures of autism-related symptoms have been established as both psychometrically robust and sensitive to the effects of treatment. In the present study, a personalized measure of autism-related symptoms using the Youth Top Problems (YTP) method (Weisz et al., 2011) was evaluated. Participants included 68 children with diagnoses of autism (ages 6-13 years), and their parents, who were randomized to cognitive behavioral therapy (CBT) or enhanced standard community treatment (ESCT) addressing autism-related symptoms. At pretreatment, parents described their child's top autism-related problems (YTPs) in their own words and rated the severity of these problems on a Likert-type scale. Parents also made daily severity ratings on the child's top three YTPs for 5 days prior to treatment and 5 days following treatment while videorecording their child's behavior at home on each of these days. Trained observers coded these videorecordings, focusing on the same YTPs that the parents rated. Parents also completed standardized checklists of autism-related symptoms and general mental health symptoms. There was evidence of convergent and discriminant validity as well as good test-retest reliability for the YTP measures. YTP severity scores converged with the standardized measure of autism-related symptoms. Parent-reported YTP scores predicted observers' YTP scores at the daily level, and both parent-reported and observers' YTP scores decreased from pre- to post treatment. Observers' ratings of the videorecordings exhibited sensitivity to treatment condition. These applications of the YTP method are promising and may complement standardized symptom checklists for clinical trials focusing on autism-related symptoms. (PsycInfo Database Record (c) 2022 APA, all rights reserved).