RESUMEN
The Functionality Appreciation Scale (FAS) is increasingly used in diverse national and linguistic contexts. However, limited work has assessed the extent to which the instrument demonstrates measurement invariance and differential item functioning (DIF) across nations and respondent characteristics. Here, we examined measurement invariance and DIF of the FAS using archival data from adults in Colombia (Mebarak et al., 2023) and Spain (Zamora et al., 2024). Participants included 1420 (women n = 804, men n = 616) respondents from Colombia and 838 (women n = 415, men n = 423) respondents from Spain who completed translations of the FAS. Confirmatory factor analysis supported a unidimensional structure of the FAS in both national groups. Additionally, the FAS achieved full measurement invariance (up to latent mean invariance) across both groups. We also found that the FAS lacked DIF as a function of age, body mass index (BMI), and gender identity across both national groups. Older participants (relative to younger participants), men (relative to women), and participants with lower BMIs (relative to those with higher BMIs) had higher FAS scores. These results support the notion that the FAS is measuring a common underlying construct across these national groups and respondent characteristics.
RESUMEN
BACKGROUND: A psychometric study of the Family Adaptability and Cohesion Scale (FACES III) has been conducted in Spanish-speaking countries from the perspective of classical test theory. However, this approach has limitations that affect the psychometric understanding of the scale. OBJECTIVE: Accordingly, this study used item response theory to investigate the psychometric performance of the items. It also evaluated the differential performance of the items between Colombia and Chile. METHOD: A total of 518 health science students from both countries participated. Confirmatory Factor Analysis was used. RESULTS: The cohesion and adaptability items presented adequate discrimination and difficulty indices. In addition, items 5, 8, 13, 17, and 19 of cohesion showed differential functioning between students from the two countries, with greater discriminatory power among Chilean students. Item 18 of adaptability showed greater discriminatory power in the Colombian group. CONCLUSIONS: The items of FACES III showed adequate psychometric performance in terms of their discriminative capacity and difficulty in Chile and Colombia.
Asunto(s)
Estudiantes , Humanos , Psicometría , Chile , Colombia , Análisis Factorial , Reproducibilidad de los Resultados , Encuestas y Cuestionarios
RESUMEN
Patient reported outcomes are gaining more attention in patient-centered health outcomes research and quality of life studies as important indicators of clinical outcomes, especially for patients with chronic diseases. Factor analysis is ideal for measuring patient reported outcomes. If there is heterogeneity in the patient population and when sample size is small, differential item functioning and convergence issues are challenges for applying factor models. Bayesian hierarchical factor analysis can assess health disparity by assessing for differential item functioning, while avoiding convergence problems. We conducted a simulation study and used an empirical example with American Indian minorities to show that fitting a Bayesian hierarchical factor model is an optimal solution regardless of heterogeneity of population and sample size.
RESUMEN
AIM: The threats of novel coronavirus disease 2019 (COVID-19) have caused fears worldwide. The Fear of COVID-19 Scale (FCV-19S) was recently developed to assess the fear of COVID-19. Although many studies found that the FCV-19S is psychometrically sound, it is unclear whether the FCV-19S is invariant across countries. The present study aimed to examine the measurement invariance of the FCV-19S across eleven countries. DESIGN: Cross-sectional study. METHODS: Using data collected from prior research on Bangladesh (N = 8,550), United Kingdom (N = 344), Brazil (N = 1,843), Taiwan (N = 539), Italy (N = 249), New Zealand (N = 317), Iran (N = 717), Cuba (N = 772), Pakistan (N = 937), Japan (N = 1,079) and France (N = 316), comprising a total of 15,663 participants, the present study used multigroup confirmatory factor analysis (CFA) and Rasch differential item functioning (DIF) to examine the measurement invariance of the FCV-19S across country, gender and age (children aged below 18 years, young to middle-aged adults aged between 18 and 60 years, and older people aged above 60 years). RESULTS: The unidimensional structure of the FCV-19S was confirmed. Multigroup CFA showed that the FCV-19S was partially invariant across country and fully invariant across gender and age. DIF findings were consistent with the findings from multigroup CFA: many items displayed DIF for country, few items displayed DIF for age, and no items displayed DIF for gender. CONCLUSION: Based on the results of the present study, the FCV-19S is a good psychometric instrument to assess fear of COVID-19 during the pandemic period. Moreover, the use of the FCV-19S is supported in at least ten countries with satisfactory psychometric properties.
Asunto(s)
COVID-19 , Adolescente , Adulto , Anciano , Ansiedad , Bangladesh , Brasil , Niño , Estudios Transversales , Cuba , Miedo , Francia , Humanos , Irán , Italia , Japón/epidemiología , Persona de Mediana Edad , Nueva Zelanda , Pakistán , Reproducibilidad de los Resultados , SARS-CoV-2 , Taiwán , Reino Unido , Adulto Joven
RESUMEN
Is the assessment of motor milestones valid and scaled equivalently for all infants? It is important not only to understand whether the way we use gross and fine motor scores is appropriate for monitoring motor milestones but also to determine whether these scores are confounded by specific infant characteristics. Therefore, the aim of the study is to investigate the latent structure underlying motor milestone assessment in infancy and measurement invariance across sex, birth weight, and gestational age. For this study, birth cohort data from the United Kingdom Millennium Cohort Study (MCS) were used, which include the assessment of eight motor milestone tasks from the Denver Developmental Screening Test in 9-month-old infants (N = 18,531), depicting early motor development of the first children of generation Z. Confirmatory factor analyses showed a better model fit for a two-factor structure (i.e., gross and fine motor development) compared to a one-factor structure (i.e., general motor development), and multiple indicators multiple causes modeling revealed no differential item functioning related to sex, birth weight, and gestational age. The study provides support for the use of gross and fine motor scores when assessing motor milestones in infants, both boys and girls, with different birth weights and of varying gestational ages. Further investigation into widely adopted assessment tools is recommended to support the use of valid composite scores in early childhood research and practice.
RESUMEN
There has been a heated debate on emotional intelligence (EI) and, more particularly, on the Bar-On Emotional Quotient Inventory (EQ-i), which measures all dimensions of emotional intelligence. To assess the measurement equivalence of the EQ-i, the present article evaluated whether the statements in the EQ-i questionnaire have equivalent meaning across respondents, regardless of their sex and age group membership. Among 2,078 participants, items in three EI subscales (item 50 in reality testing, items 4 and 19 in stress tolerance, and items 7, 52, and 82 in interpersonal) showed clinically significant differential item functioning (DIF) across age groups. Previously observed associations between EI and age might therefore be misleading and deserve further study after removing or replacing the DIF items.
Resumen En medio del acalorado debate sobre la Inteligencia Emocional, este estudio retoma el Inventario de Cociente Emocional Bar-On (EQ-i), que mide todas las dimensiones de este constructo psicológico. Con el fin de comprobar la equivalencia de medición de EQ-i, se comprueba si las declaraciones formuladas en el cuestionario EQ-i tienen un significado equivalente entre los encuestados, independientemente de su sexo y grupo de edad. Se aplicó a los 2078 participantes las tres subescalas de IE. Se halló un funcionamiento diferencial de los ítems (DIF) clínicamente significativo. Por lo tanto, las asociaciones observadas anteriormente entre la IE y la edad pueden ser espurias y merecen un estudio adicional después de eliminar o reemplazar los elementos DIF.
Asunto(s)
Modelos Logísticos , Encuestas y Cuestionarios , Emociones , Inteligencia Emocional , Prueba de Realidad , Asociación
RESUMEN
In a cross-society comparison, we assessed the state of mothers' knowledge of child rearing and child development. The study included 1,077 mothers from five countries on four continents: Argentina, Belgium, Italy, South Korea, and the United States. A criteria-referenced instrument, the Knowledge of Infant Development Inventory, was used to assess parenting knowledge after being adapted for cross-society comparison using item response theory and the alignment optimization approach for testing between-sample measurement invariance. Levels of mothers' parenting knowledge varied across the five societies and were associated with different sociodemographic factors and personal and non-personal supports.
RESUMEN
Background: Self-reported depressive complaints among college students might indicate different degrees of severity of depressive states. Through the framework of item response theory, we aim to describe the pattern of responses to items of the Beck Depression Inventory-II (BDI-II), in terms of endorsement probability and discrimination along the continuum of depression. Potential differential item functioning of the BDI-II items is investigated, by gender and age, to compare across sub-groups of students. Methods: The 21-item BDI-II was cross-sectionally administered to a representative sample of 12,677 Brazilian college students. Reliability was evaluated based on Cronbach's alpha coefficient. Severity (b_i) and discrimination (a) parameters of each BDI-II item were calculated through the graded response model. The influence of gender and age was tested for differential item functioning (DIF) within the item response theory-based approach. Results: The BDI-II presented good reliability (α = 0.91). Women and younger students had a significantly higher likelihood of depression (cut-off > 13) than men and older counterparts. In general, participants endorsed cognitive-somatic items more easily than affective items of the scale. "Guilty feelings," "suicidal thoughts," and "loss of interest in sex" were the items that most likely indicated depression severity (b ≥ 3.60). However, all BDI-II items showed moderate-to-high discrimination (a ≥ 1.32) for depressive state. While two items were flagged for DIF, "crying" and "loss of interest in sex," respectively for gender and age, the global weight of these items on the total score was negligible. Conclusions: Although respondents' gender and age might influence the response pattern of depressive symptoms, the measures of self-reported symptoms have not inflated severity scores.
These findings provide further support to the validity of using BDI-II for assessing depression in academic contexts and highlight the value of considering gender- and age-related common symptoms of depression.
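As a rough illustration of how the severity and discrimination parameters mentioned above translate into response probabilities, the following is a minimal sketch of Samejima's graded response model. The function name `grm_category_probs` and the three threshold values are hypothetical, chosen only for illustration; only the discrimination value a = 1.32 is taken from the abstract, and the code is not the authors' estimation procedure.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under the graded response model.

    theta: latent trait level (here, depression severity).
    a: item discrimination.
    thresholds: increasing severity parameters b_1 < ... < b_{K-1}.
    Returns K probabilities, one per response category 0..K-1.
    """
    # Cumulative curves P(X >= k | theta), padded with 1 (k = 0) and 0 (k = K).
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]

# Hypothetical four-category item (scored 0-3, like BDI-II items).
# a = 1.32 is the lowest discrimination reported above; thresholds are invented.
probs = grm_category_probs(theta=0.0, a=1.32, thresholds=[0.5, 1.8, 3.6])
```

With a high top threshold such as 3.6, the highest category only becomes likely at very high trait levels, which is the sense in which a large b marks a "severe" item.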
RESUMEN
The present study investigated the psychometric properties of the Raven's Colored Progressive Matrices (CPM) test in a sample of preschoolers from Brazil (n = 582; age: mean = 57 months, SD = 7 months; 46% female). We investigated the plausibility of unidimensionality of the items (confirmatory factor analysis) and differential item functioning (DIF) for sex and age (multiple indicators multiple causes method). We tested four unidimensional models and the one with the best-fit index was a reduced form of the Raven's CPM. The DIF analysis was carried out with the reduced form of the test. A few items presented DIF (two for sex and one for age), confirming that the Raven's CPM items are mostly measurement invariant. There was no effect of sex on the general factor, but increasing age was associated with higher values of the g factor. Future research should indicate if the reduced form is suitable for evaluating the general ability of preschoolers.
Asunto(s)
Pruebas de Aptitud , Cognición , Psicometría/métodos , Distribución por Edad , Brasil , Preescolar , Análisis Factorial , Femenino , Humanos , Masculino , Distribución por Sexo
RESUMEN
Although studies have consistently demonstrated that children with attention-deficit/hyperactivity disorder (ADHD) perform significantly lower than controls on word recognition and spelling tests, such studies rely on the assumption that those groups are comparable in these measures. This study investigates comparability of word recognition and spelling tests based on diagnostic status for ADHD through measurement invariance methods. The participants (n = 1,935; 47% female; 11% ADHD) were children aged 6-15 with normal IQ (≥70). Measurement invariance was investigated through Confirmatory Factor Analysis and Multiple Indicators Multiple Causes models. Measurement invariance was attested in both methods, demonstrating the direct comparability of the groups. Children with ADHD were 0.51 SD lower in word recognition and 0.33 SD lower in spelling tests than controls. Results suggest that differences in performance on word recognition and spelling tests are related to true mean differences based on ADHD diagnostic status. Implications for clinical practice and research are discussed.
RESUMEN
This paper discusses the issue of differential item functioning (DIF) in international surveys. DIF is likely to occur in international surveys. What is needed is a statistical approach that takes DIF into account, while at the same time allowing for meaningful comparisons between countries. Some existing approaches are discussed and an alternative is provided. The core of this alternative approach is to define the construct as a large set of items, and to report in terms of summary statistics. Since the data are incomplete, measurement models are used to complete the incomplete data. For that purpose, different models can be used across countries. The method is illustrated with PISA's reading literacy data. The results indicate that this approach fits the data better than the current PISA methodology; however, the league tables are nearly identical. The implications for monitoring changes over time are discussed.
Asunto(s)
Evaluación Educacional , Internacionalidad , Alfabetización , Modelos Estadísticos , Encuestas y Cuestionarios , Canadá , Humanos , México , Psicometría , Lectura
RESUMEN
En el presente trabajo se muestra la forma en la que se evalúa la invarianza utilizando los modelos de análisis factorial confirmatorio para medias y covarianzas (AFC-MACS) para datos categóricos y los modelos de Teoría de Respuesta al Ítem (TRI). Se ejemplifica el análisis de la invarianza en el estudio de la Escala de Detección del Trastorno de Ansiedad Generalizada (EDTAG) comparando hombres y mujeres. La EDTAG es una escala ampliamente utilizada en las instituciones de salud y por sus características (escala breve de 12 ítems dicotómicos) cualquier error de medida puede tener un impacto importante. En los resultados se muestra que la escala tiene la misma configuración, sin embargo, el ítem 9 muestra funcionamiento diferencial siendo los hombres quienes mayor probabilidad tienen de responder de manera afirmativa comparado con mujeres del mismo nivel de rasgo. Se discute la necesidad de éste tipo de análisis en las escalas.
The present paper shows how to evaluate invariance using confirmatory factor analysis models for means and covariances (CFA-MACS) for categorical data and Item Response Theory (IRT) models. Invariance analysis is exemplified with a study of the Generalized Anxiety Disorder Detection Scale (EDTAG) comparing men and women. The scale is widely used in health institutions and, given its nature (a brief scale of 12 dichotomous items), any measurement error can have a major impact. The results show that the scale has the same configuration in both groups, but item 9 shows differential item functioning, with men more likely to respond affirmatively than women at the same trait level. The need for this type of analysis of scales is discussed.
RESUMEN
BACKGROUND: Given the recent launch of a new diagnostic classification (DSM-5) for alcohol use disorders (AUD), we aimed to investigate its dimensionality and possible measurement bias in a non-U.S. METHODS: The current analyses were restricted to 948 subjects who endorsed drinking at least one drink per week in the past year from a sample of 5037 individuals. Data came from the São Paulo Megacity Project (which is part of the World Mental Health Surveys), collected between 2005 and 2007. First, exploratory factor analysis (EFA) was carried out to test for the best dimensional structure for the DSM-5-AUD criteria. Then, item response theory (IRT) was used to investigate the severity and discrimination properties of each DSM-5-AUD criterion. Finally, differential criterion functioning (DCF) was investigated by socio-demographics (income, gender, age, employment status, marital status and education). All analyses were performed in Mplus software, taking into account complex survey design features. RESULTS: The best EFA model was a one-dimensional model. IRT results showed that the criteria "Time Spent" and "Given Up" had the highest discrimination and severity properties, while the criterion "Larger/Longer" had the lowest severity but an average discrimination. Only female gender showed DCF at both the criterion and factor levels, indicating measurement bias. CONCLUSION: This study reinforces the existence of a DSM-5-AUD continuum in the largest metropolitan area of South America, including subgroups that previously had higher rates of alcohol use (lower educational/income levels). Lower DSM-5-AUD scores were found in women.
Asunto(s)
Alcoholismo/clasificación , Manual Diagnóstico y Estadístico de los Trastornos Mentales , Adolescente , Adulto , Factores de Edad , Alcoholismo/epidemiología , Brasil/epidemiología , Escolaridad , Empleo , Análisis Factorial , Femenino , Humanos , Renta , Clasificación Internacional de Enfermedades , Masculino , Estado Civil , Persona de Mediana Edad , Reproducibilidad de los Resultados , Factores Sexuales , Factores Socioeconómicos , Población Urbana , Adulto Joven
RESUMEN
INTRODUÇÃO: Diversos estudos mostram o Funcionamento Diferencial do Item (DIF) em itens do Inventário de Depressão Beck (BDI), ao compararem homens e mulheres. A presença de um grande número de itens com DIF no BDI é uma severa ameaça à validade da medida da intensidade de sintomas depressivos obtida pela Teoria da Resposta ao Item (TRI) e às conclusões baseadas nos escores derivados dos itens com e sem DIF. OBJETIVO: Os objetivos deste estudo foram identificar esses itens do BDI, ajustar o modelo de TRI para itens constrangedores (modelo 2), o qual acomoda itens com a presença de DIF, e comparar esses resultados com os do ajuste do modelo logístico de dois parâmetros tradicional da TRI (modelo 1). MÉTODOS: Os resultados obtidos com ambos os modelos foram comparados. RESULTADOS: Os itens que apresentaram DIF foram: tristeza, sentimento de fracasso, insatisfações, culpa, punição, choro, fatigabilidade e perda da libido. Os resultados do ajuste dos dois modelos são similares quanto à discriminação, gravidade (à exceção dos itens com DIF) e no cálculo de escores para os indivíduos. Apesar disso, o modelo 2 é vantajoso, pois mostra as diferenças em gravidade do sintoma depressivo para os grupos avaliados, trazendo, dessa forma, mais informação ao pesquisador sobre a população estudada. CONCLUSÃO: Esse modelo, que tem um alcance mais amplo em termos de população-alvo, pode ser uma ótima alternativa na identificação e acompanhamento de indivíduos com potencial depressivo. .
INTRODUCTION: Several studies show the presence of Differential Item Functioning (DIF) in some items of the Beck Depression Inventory (BDI) when comparing men and women. The presence of a large number of items with DIF in the BDI is a severe threat to the validity of the measurement of the intensity of depressive symptoms obtained by Item Response Theory (IRT) and to the conclusions based on the scores derived from items with and without DIF. OBJECTIVE: The objectives of this study were to identify these BDI items, fit the IRT model for embarrassing items (model 2), which accommodates items with DIF, and compare these results with the fit of the traditional two-parameter logistic IRT model (model 1). METHODS: The results obtained with both models were compared. RESULTS: Items with DIF were: sadness, feeling of failure, dissatisfaction, guilt, punishment, crying, fatigability and loss of libido. The results of fitting the two models are similar in discrimination, severity (except for items with DIF), and in the calculation of scores for individuals. Nevertheless, model 2 is advantageous because it shows the differences in severity of depressive symptoms between the groups evaluated, thus providing more information to the researcher about the study population. CONCLUSION: This model, which has a broader scope in terms of target population, may be a good alternative for the identification and follow-up of individuals with potential depression.
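To make the contrast between the two models concrete, here is a minimal sketch of the two-parameter logistic (2PL) response function, plus one simple way uniform DIF can be represented: a group-specific severity parameter for a flagged item. This is an assumption made for illustration, not necessarily the parametrization of the authors' model 2; the function name `two_pl`, the group labels, and all numeric values are hypothetical.

```python
import math

def two_pl(theta, a, b):
    """2PL probability of endorsing an item at trait level theta.

    a: discrimination, b: severity (difficulty) of the item.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical DIF item: same discrimination, different severity by group.
a = 1.5
b_by_group = {"men": 1.2, "women": 0.6}

# Two respondents with the same depressive intensity but different groups.
theta = 1.0
probs = {group: two_pl(theta, a, b) for group, b in b_by_group.items()}
```

In this sketch the group with the lower severity parameter endorses the symptom more readily at the same trait level, which is what an item with uniform DIF looks like under the 2PL.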
Asunto(s)
Humanos , Masculino , Femenino , Adulto , Depresión , Depresión/epidemiología , Modelos Estadísticos
RESUMEN
La Universidad Jorge Tadeo Lozano aplica el Examen de Clasificación en Matemáticas Básicas, como evaluación diagnóstica, a los aspirantes y estudiantes provenientes de transferencias internas o externas, cuyo plan de estudios precise conocimientos básicos de Aritmética y Algebra Elemental. Dicho examen favorece el análisis de las condiciones académicas de los admitidos y permite a la Universidad, ofrecer opciones apropiadas para cada caso particular, al mismo tiempo que al evaluado le proporciona la posibilidad de reconocer su nivel de apropiación del conocimiento de los dominios conceptuales requeridos. Consecuentemente con el carácter decisorio del Examen de Clasificación de Matemáticas Básicas, se examinó si los ítemes utilizados presentan funcionamiento diferencial, esto es, se analizó si la diferencia de habilidades entre los evaluados podría deberse a las variables de contexto seleccionadas: sexo, edad, naturaleza jurídica del colegio de procedencia y facultad en la que el aspirante tramita su ingreso. Para ello, se procesaron 1.623 cadenas de respuestas para 61 ítemes, obtenidas en las pruebas comprendidas entre el tercer período lectivo de 2011 y el primero de 2012. La metodología incluyó la implementación de tres técnicas: Contraste del DIF (diferencia entre los centros de dificultades), Contraste del DIF (diferencia entre los extremos más próximos para los intervalos de dificultad) y prueba estadística Mantel-Haenszel. La conjunción de estas técnicas permitió determinar un ítem con funcionamiento diferencial en categoría moderada a grande, para la variable edad. Finalmente, para este ítem se exhiben sus parámetros estadísticos y su curva característica, estimados en la calibración.
The following article presents an application of differential item functioning (DIF), using results obtained from the qualifying test developed by the Jorge Tadeo Lozano University and taken by students to classify them at a level of mathematical knowledge and to define an academic route for them based on their cognitive status shown on the test. The analysis is part of a perspective to estimate the difficulty and other characteristics of items, and the skills and level of students, through the use of the one-parameter Rasch model of item response theory (IRT) and the parameters of a sample of 1623 students taking a test composed of 61 items. The article analyzes both the statistical performance of the items in terms of item-test correlation, misfit (infit and outfit), and discrimination, and the behavior of the set of items in terms of construct validity or dimensionality, reliability, internal consistency, and separation parameters. A method is then shown to examine which items display DIF associated with the conditions of the students' origin and not with their academic ability, which could lead to bias in the results of the test. The methods employed estimate the relative difficulty of each item for students of similar ability who belong to different groups, according to four variables studied: sex, age, intended major, and whether the high school of origin is public or private. The difference in relative difficulty between the groups is associated with a level of DIF and indicates whether the item in question is biased and which groups this bias favors.
The difference in relative difficulty is graded in terms of severity according to three categories proposed by the Educational Testing Service: (1) moderate to large, if the difference in relative difficulty between groups (for students of similar ability) is greater than or equal to .64 logits, (2) small to moderate, if the difference is greater than or equal to .43 and less than .64 logits, and (3) not significant, if this difference is less than .43 logits. In order to validate the detection of DIF, the calculations are performed using three techniques. Two are chosen from those available in the literature and the third is a proposal by the authors of this article to consider the size of the error in the estimates of the difficulty difference. The three techniques used are: (1) the difference between the central values of the difficulty intervals, ignoring the estimation error, (2) the difference between the nearest extremes of the difficulty intervals, taking the estimation error into account, and (3) the Mantel-Haenszel statistical test. Regarding the data sets formed for the analysis, two aspects were considered: (1) response strings corresponding to missing data or especially small groups, which would not have allowed an effective and reliable comparison, were omitted, and (2) random samples with uniform distribution were selected to create groups of the same size for each study variable. The analysis based on the difference between central values showed that two items (34 and 59) displayed DIF with moderate to large severity, regarding the age variable for item 34 and the intended major and high school of origin variables for item 59. The technique based on the difference between the nearest extremes confirmed DIF with moderate to large severity for item 34, with respect to the age variable.
The Mantel-Haenszel test detected DIF with moderate to large severity for items 13, 20, 34, and 61 for the age variable, and for items 4, 30, 36, 43, and 59 for the major variable.
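The severity grading and the three techniques described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: all function names are hypothetical, the difficulty interval is assumed to be the estimate plus or minus one standard error (the abstract does not state the width), and the Mantel-Haenszel function returns the common log odds ratio (an effect size in logits) rather than the chi-square test statistic.

```python
import math

def classify_dif(contrast_logits):
    """Grade a DIF contrast (in logits) with the thresholds quoted above."""
    size = abs(contrast_logits)
    if size >= 0.64:
        return "moderate to large"
    if size >= 0.43:
        return "small to moderate"
    return "not significant"

def center_contrast(b_ref, b_focal):
    """Technique 1: difference between central difficulty estimates."""
    return b_focal - b_ref

def nearest_extremes_contrast(b_ref, se_ref, b_focal, se_focal):
    """Technique 2: gap between the nearest ends of the two intervals.

    Assumes each interval is estimate +/- one standard error.
    Returns 0.0 when the intervals overlap.
    """
    low_ref, high_ref = b_ref - se_ref, b_ref + se_ref
    low_focal, high_focal = b_focal - se_focal, b_focal + se_focal
    if low_focal > high_ref:
        return low_focal - high_ref
    if low_ref > high_focal:
        return -(low_ref - high_focal)
    return 0.0

def mantel_haenszel_log_odds(strata):
    """Technique 3 (effect size only): Mantel-Haenszel common log odds ratio.

    strata: one tuple per ability level, with counts
    (ref_right, ref_wrong, focal_right, focal_wrong).
    Positive values mean the item is easier for the reference group.
    Raises an error if the denominator sum is zero.
    """
    numerator = denominator = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in strata:
        total = ref_right + ref_wrong + focal_right + focal_wrong
        if total == 0:
            continue  # skip empty ability strata
        numerator += ref_right * focal_wrong / total
        denominator += ref_wrong * focal_right / total
    return math.log(numerator / denominator)

# Invented example values, for illustration only.
gap = nearest_extremes_contrast(0.0, 0.1, 1.0, 0.1)
mh = mantel_haenszel_log_odds([(40, 10, 30, 20), (30, 20, 20, 30)])
labels = (classify_dif(center_contrast(0.0, 0.5)), classify_dif(gap), classify_dif(mh))
```

Because technique 2 subtracts the estimation error before grading, it can only flag an item at the same or a lower severity level than technique 1, which is consistent with it confirming only item 34 above.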
RESUMEN
La Universidad Jorge Tadeo Lozano aplica el Examen de Clasificación en Matemáticas Básicas, como evaluación diagnóstica, a los aspirantes y estudiantes provenientes de transferencias internas o externas, cuyo plan de estudios precise conocimientos básicos de Aritmética y Algebra Elemental. Dicho examen favorece el análisis de las condiciones académicas de los admitidos y permite a la Universidad, ofrecer opciones apropiadas para cada caso particular, al mismo tiempo que al evaluado le proporciona la posibilidad de reconocer su nivel de apropiación del conocimiento de los dominios conceptuales requeridos. Consecuentemente con el carácter decisorio del Examen de Clasificación de Matemáticas Básicas, se examinó si los ítemes utilizados presentan funcionamiento diferencial, esto es, se analizó si la diferencia de habilidades entre los evaluados podría deberse a las variables de contexto seleccionadas: sexo, edad, naturaleza jurídica del colegio de procedencia y facultad en la que el aspirante tramita su ingreso. Para ello, se procesaron 1.623 cadenas de respuestas para 61 ítemes, obtenidas en las pruebas comprendidas entre el tercer período lectivo de 2011 y el primero de 2012. La metodología incluyó la implementación de tres técnicas: Contraste del DIF (diferencia entre los centros de dificultades), Contraste del DIF (diferencia entre los extremos más próximos para los intervalos de dificultad) y prueba estadística Mantel-Haenszel. La conjunción de estas técnicas permitió determinar un ítem con funcionamiento diferencial en categoría moderada a grande, para la variable edad. Finalmente, para este ítem se exhiben sus parámetros estadísticos y su curva característica, estimados en la calibración.(AU)
The following article presents an application of differential item functioning (DIF), using results obtained from the qualifying test developed by the Jorge Tadeo Lozano University and taken by students to classify them at a level of mathematical knowledge and to define an academic route for them based on their cognitive status shown on the test. The analysis is part of a perspective to estimate the difficulty and others characteristics of items, and the skills and level of students through the use of the Rasch model of the one parameter item response theory (IRT) and the parameters of a sample of 1623 students taking a test composed of 61 items. The article analyzes both the statistical performance of the items in terms of the parameters of item-test correlation, misfits (infit and outfit), and discrimination, as well as the behavior of the set of items depending on the construct validity or dimensionality, reliability, internal consistency, and separation parameters. A method is then shown to examine which items display DIF, associated with the conditions of the students origin and not of their academic ability, which could lead to bias in the results of the test. The employed methods estimate the relative difficulty of each item, for students of similar ability but who belong to different groups, according to four variables studied: sex, age, intended major, and whether the high school of origin is public or private. The value of the difference in relative difficulty between the groups mentioned is associated with a level of DIF and recognizes whether the item in question has bias and which groups this bias is favoring. 
The difference in relative difficulty is graded in terms of severity according to three categories proposed by the Educational Testing Service: (1) moderate to large, if the difference in relative difficulty between groups (for students of similar ability) is greater than or equal to .64 logits, (2) small to moderate, if the difference is greater than or equal to .43 and less than .64 logits, and (3) not significant, if this difference is less than .43 logits. In order to validate the detection of DIF, the calculations are performed using three techniques. Two are chosen from those available in the literature and the third one is a proposal by the authors of this article to consider the size of the error in the estimations of difficulty difference. The three techniques used are: (1) the measurement of the difference between the core values of the difficulty intervals, ignoring the value of the estimation error, (2) the difference between the nearest extremes of the difficulty intervals, taking into account the estimation error, and (3) the Mantel-Haenszel statistical test. Regarding databases formed for the analysis, two aspects were considered: (1) chains of responses corresponding to missing data or especially small groups, which would not have allowed an effective and reliable comparison, were omitted, and (2) random samples with uniform distribution were selected to create groups of the same size for each study variable. The analysis with the technique of difference between core values showed that two items (34 and 59) displayed DIF with moderate to large severity, regarding the age variable for item 34 and the intended major and high school of origin variables for item 59. The technique about difference between the nearest extremes confirmed DIF with moderate to large severity for item 34, with respect to the age variable. 
The Mantel-Haenszel test detected DIF with moderate to large severity for items 13, 20, 34, and 61 for the age variable, and for items 4, 30, 36, 43, and 59 for the major variable.
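The two interval-based techniques and the severity grading described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `ItemEstimate` container and the function names are hypothetical, and the sketch assumes that each "difficulty interval" is the group's difficulty estimate plus or minus one standard error, which the abstract does not specify.

```python
from dataclasses import dataclass

# Severity thresholds in logits, as quoted in the abstract.
SMALL_THRESHOLD = 0.43
LARGE_THRESHOLD = 0.64


@dataclass
class ItemEstimate:
    """Difficulty of one item within one group (hypothetical container)."""
    difficulty: float      # estimated difficulty, in logits
    standard_error: float  # estimation error, in logits (assumed >= 0)


def center_difference(a: ItemEstimate, b: ItemEstimate) -> float:
    """Technique 1: distance between interval centers, ignoring error."""
    return abs(a.difficulty - b.difficulty)


def nearest_extremes_difference(a: ItemEstimate, b: ItemEstimate) -> float:
    """Technique 2: gap between the closest ends of the two intervals.

    Assumes each interval is difficulty +/- one standard error.
    Overlapping intervals yield a gap of zero.
    """
    gap = abs(a.difficulty - b.difficulty) - (a.standard_error + b.standard_error)
    return max(0.0, gap)


def classify(difference: float) -> str:
    """Map a difficulty difference (logits) to the three severity grades."""
    if difference >= LARGE_THRESHOLD:
        return "moderate to large"
    if difference >= SMALL_THRESHOLD:
        return "small to moderate"
    return "not significant"


if __name__ == "__main__":
    # Made-up estimates for a single item in two groups of similar ability.
    group_a = ItemEstimate(difficulty=1.0, standard_error=0.1)
    group_b = ItemEstimate(difficulty=0.2, standard_error=0.1)
    print(classify(center_difference(group_a, group_b)))
    print(classify(nearest_extremes_difference(group_a, group_b)))
```

Because the nearest-extremes difference subtracts the estimation errors, it can never exceed the center difference, which is consistent with the abstract's report that it flagged fewer items at the moderate to large level.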
RESUMEN
Parte dos estudos sobre família envolve o suporte familiar, construto base do Inventário de Percepção do Suporte Familiar (IPSF). Várias pesquisas utilizando este instrumento podem ser encontradas, contudo sem resultados conclusivos sobre a diferença entre sexos no construto. Uma das formas de se verificar este viés se dá pelo funcionamento diferencial dos itens (DIF) e nesse sentido, este artigo teve como objetivo verificar se existe DIF em função do sexo para o IPSF, a partir de um banco de dados com 1322 sujeitos. Três itens apresentaram DIF favorecendo o grupo feminino e quatro, o masculino. Pelo princípio de equidade pode-se inicialmente indicar equilíbrio dos vieses, entretanto pela análise complementar de comparação das médias dos Thetas antes e após ancoragem dos itens sem DIF, foi possível identificar que, na prática, os vieses podem ter relação a aspectos externos à amostra, e não aos itens do teste, garantindo suas características psicométricas...
Part of the research on families concerns family support, the core construct of the Perception of Family Support Inventory (IPSF). Several studies using this instrument can be found, but without conclusive results on gender differences in the construct. One way to check for this bias is through differential item functioning (DIF); accordingly, this article aimed to verify whether there is DIF by gender in the IPSF, based on a database of 1322 subjects. Three items showed DIF favoring the female group and four favoring the male group. By the principle of equity, this could initially suggest that the biases balance out; however, a complementary analysis comparing the mean thetas before and after anchoring the DIF-free items indicated that, in practice, the biases may be related to aspects external to the sample rather than to the test items, which supports the test's psychometric characteristics...
Parte de los estudios sobre la familia tratan del apoyo familiar, el concepto base del Inventario de Percepción de Apoyo a la Familia (IPSF). Se pueden encontrar varios estudios que utilizan este instrumento, pero sin resultados concluyentes sobre la diferencia de género en el constructo. Una forma de comprobar este sesgo es mediante el funcionamiento diferencial de los ítems (DIF) y, en ese sentido, este artículo tuvo como objetivo verificar si existe DIF por género para el IPSF, a partir de una base de datos con 1322 sujetos. Tres ítems presentaron DIF a favor de las mujeres y cuatro a favor de los hombres. Por el principio de equidad se puede indicar inicialmente un equilibrio de los sesgos; sin embargo, mediante el análisis complementario de comparación de las medias de los Thetas antes y después del anclaje de los ítems sin DIF, se encontró que, en la práctica, los sesgos pueden estar relacionados con aspectos externos a la muestra y no con los ítems de la prueba, asegurando sus características psicométricas...
Asunto(s)
Humanos, Masculino, Femenino, Niño, Adolescente, Adulto, Familia, Psicología, Psicometría
RESUMEN
El objetivo de este estudio se centró en poner a prueba la invarianza de la sintomatología del Trastorno por Déficit de Atención con Hiperactividad (TDAH) en función del género, en una muestra de 634 niños. Se comprobó, en primer lugar, el ajuste de cinco modelos factoriales mediante análisis factorial confirmatorio, y se utilizó la regresión logística ordinal como método de estimación del funcionamiento diferencial del ítem (DIF), tanto uniforme como no uniforme. Los resultados pusieron de manifiesto que: (a) el modelo que presentó mejor ajuste fue el de tres factores correlacionados y (b) no existe DIF en función del género de los niños evaluados, ni en la modalidad de calificación ordinal (escala de 1 a 4) ni en la modalidad de calificación binaria (0-1) de los ítems. Estos resultados refrendan el hecho de que en el Diagnostic and Statistical Manual of Mental Disorders, DSM-IV-TR, no se establezcan criterios diferenciales para el diagnóstico del TDAH en niños y niñas.
The aim of this study was to test the invariance of Attention Deficit Hyperactivity Disorder (ADHD) symptoms as a function of gender in a sample of 634 children. We first examined the fit of five factor models using confirmatory factor analysis, and then used ordinal logistic regression to estimate both uniform and non-uniform differential item functioning (DIF). The results showed that (a) the model with three correlated factors had the best fit, and (b) there was no DIF by gender of the children assessed, either in the ordinal rating format (scale of 1 to 4) or in the binary rating format (0-1) of the items. These results support the fact that the Diagnostic and Statistical Manual of Mental Disorders, DSM-IV-TR, does not establish differential criteria for diagnosing ADHD in boys and girls.
Asunto(s)
Psicología Clínica, Psicometría
RESUMEN
O objetivo do presente estudo foi investigar a presença de Funcionamento Diferencial do Item (DIF) para o sexo nos itens de uma popular medida de rastreio para a dependência do álcool, o Alcohol Use Disorders Identification Test (AUDIT). Os participantes foram 254 estudantes universitários. As análises de DIF foram realizadas a partir do modelo Partial Credit. Três itens não apresentaram ajuste adequado ao modelo, enquanto outro item apresentou um viés de grande magnitude para o sexo feminino. Especificamente, o item AUDIT-6 mostrou ser, de fato, um aspecto do beber com maior severidade para indivíduos do sexo feminino do que para aqueles do sexo masculino. Recomenda-se cautela na utilização de escores brutos a partir dos 10 itens originais do instrumento e a exclusão do item AUDIT-6 ou sua reformulação para a aplicação em indivíduos do sexo feminino.
The aim of the present study was to investigate differential item functioning (DIF) by sex in a popular screening measure for alcohol dependence, the Alcohol Use Disorders Identification Test (AUDIT). Participants were 254 undergraduate students. Rasch analysis-based DIF procedures were used to evaluate possible sex bias in the AUDIT items. The results showed three items with inadequate fit to the unidimensional Partial Credit model and one item with a large bias against women. Specifically, item AUDIT-6 had a larger difficulty estimate for women, suggesting that it reflects a more severe aspect of drinking behavior for them. Caution is recommended when using raw scores based on the sum of the 10 original AUDIT items. Dropping or reformulating item AUDIT-6 to accommodate sex differences and to reduce errors in individual ability (severity) estimates is strongly advised.
El objetivo del presente estudio fue investigar la presencia de Funcionamiento Diferencial del Ítem (DIF) para el sexo en los ítems de una popular medida de rastreo para la dependencia del alcohol, el Alcohol Use Disorders Identification Test (AUDIT). Participaron 254 estudiantes universitarios. Los análisis de DIF fueron realizados a partir del modelo Partial Credit. Tres ítems no presentaron ajuste adecuado al modelo, mientras otro ítem mostró una tendencia de gran magnitud para el sexo femenino. Específicamente, el ítem AUDIT-6 mostró ser, de hecho, un aspecto del beber con mayor severidad para individuos del sexo femenino que para aquellos del sexo masculino. Se recomienda precaución en el uso de escores brutos a partir de los 10 ítems originales del instrumento y la exclusión del ítem AUDIT-6 o su reformulación para la aplicación en individuos del sexo femenino.
Asunto(s)
Humanos, Femenino, Adolescente, Adulto Joven, Consumo de Bebidas Alcohólicas/psicología, Universidades, Psicometría, Sesgo de Selección, Distribución por Sexo, Estudiantes
RESUMEN
La Double Standard Scale (DSS) es uno de los instrumentos más utilizados para evaluar la doble moral sexual y, en ocasiones, para compararla entre sexos. El interés por el estudio de la doble moral sexual radica en que constituye una variable asociada a la salud sexual. A pesar de que hay varios estudios que se han interesado por las propiedades psicométricas de la DSS, ninguno de ellos se ha planteado examinar su equivalencia entre hombres y mujeres, o entre adolescentes y adultos. Por ello, el objetivo de este trabajo es examinar la invarianza factorial y analizar el funcionamiento diferencial del ítem por sexo y edad. Para ello, se evaluó a una muestra de 2.248 sujetos peruanos (1.063 adolescentes -46% jóvenes y 54% jovencitas- y 1.185 adultos -51% varones y 49% mujeres-). Los resultados muestran una equivalencia factorial de ajuste perfecto mediante el parcelamiento de ítems respecto al sexo, siendo el funcionamiento diferencial despreciable. En cambio, con respecto a la edad, se descarta la invarianza, aunque el funcionamiento diferencial es despreciable. En definitiva, se podrán contrastar las puntuaciones de la DSS entre sexos, pero no entre adolescentes y adultos.
The Double Standard Scale (DSS) is one of the most widely used instruments for assessing the sexual double standard and, on occasion, for comparing it between sexes. Interest in the sexual double standard stems from its being a variable associated with sexual health. Although several studies have examined the psychometric properties of the DSS, none has examined its equivalence between men and women or between adolescents and adults. The aim of this study was therefore to examine factorial invariance and to analyze differential item functioning by sex and age. A sample of 2,248 Peruvian subjects (1,063 adolescents, 46% boys and 54% girls, and 1,185 adults, 51% men and 49% women) was evaluated. The results show perfectly fitting factorial equivalence across sex using item parceling, with negligible differential item functioning. In contrast, with respect to age, invariance is rejected, although differential item functioning is negligible. In conclusion, DSS scores can be compared between sexes, but not between adolescents and adults.