1.
Front Psychol ; 10: 2309, 2019.
Article in English | MEDLINE | ID: mdl-31681103

ABSTRACT

One important problem in the measurement of non-cognitive characteristics such as personality traits and attitudes is that they have traditionally been measured with Likert scales, which are susceptible to response biases such as social desirability (SDR) and acquiescent (ACQ) responding. Given the variability of these response styles in the population, ignoring their possible effects on the scores may compromise the fairness and the validity of the assessments. Response-style-induced errors of measurement can also affect reliability estimates and inflate convergent validity, because the scores correlate more highly with other Likert-based measures, while attenuating predictive power over non-Likert-based indicators, given that the scores contain more error. This study compares the validity of Big Five personality scores obtained by (1) ignoring SDR and ACQ in graded-scale questionnaire (GSQ) items, (2) accounting for SDR and ACQ with a compensatory IRT model, and (3) using forced-choice blocks with a multi-unidimensional pairwise preference model (MUPP) variant for dominance items. The overall results suggest that ignoring SDR and ACQ offered the worst validity evidence, with a higher correlation between personality and SDR scores. The two remaining strategies have their own advantages and disadvantages. The empirical reliability and convergent validity analyses indicate that, when social desirability is modeled with graded-scale items, the SDR factor apparently captures part of the variance of the Agreeableness factor. On the other hand, the correlation between the corrected GSQ-based Openness to Experience scores and the University Access Examination grades was higher than that of the uncorrected GSQ-based scores, and considerably higher than that of the estimates from the forced-choice data. Conversely, the criterion-related validity of the Forced-Choice Questionnaire (FCQ) scores was similar to the results found in meta-analytic studies, correlating highest with Conscientiousness. Nonetheless, the FCQ scores had considerably lower reliabilities and would demand administering more blocks. Finally, the results are discussed, and some notes are provided for the treatment of SDR and ACQ in future studies.

2.
Span J Psychol ; 21: E62, 2018 Dec 03.
Article in English | MEDLINE | ID: mdl-30501646

ABSTRACT

This study analyses the extent to which cheating occurs in a real selection setting. A two-stage, unproctored and proctored, test administration was considered. Test score inconsistencies were detected by applying a verification test (the Guo and Drasgow Z-test). An initial simulation study showed that the Z-test has adequate Type I error and power rates in the specific selection settings explored. A second study applied the Z-test verification procedure to a sample of 954 employment candidates. Additional external evidence based on the response times to the verification items was gathered. The results revealed a good performance of the Z-test statistic and a relatively low, but non-negligible, number of suspected cheaters, who showed upwardly distorted ability estimates. The study with real data provided additional information on the presence of suspected cheating in unproctored applications and on the viability of using item response times as additional evidence of cheating. In the verification test, suspected cheaters spent 5.78 seconds per item more than expected given the item difficulty and the ability assumed from the unproctored stage. The percentage of suspected cheaters in the empirical study was estimated at 13.84%. In summary, the study provides evidence of the usefulness of the Z-test for detecting cheating in a specific setting, in which a computerized adaptive test assessing English grammar knowledge was used for personnel selection.
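
Because the verification logic described above lends itself to a compact illustration, the following is a minimal Python sketch of a Z-type verification statistic in the spirit of the Guo and Drasgow test (not necessarily their exact formulation): the observed number-correct on the proctored verification items is compared with the score expected under the ability estimated in the unproctored stage, assuming 3PL items. All item parameters and the flagging example are hypothetical.

```python
import numpy as np

def p_3pl(theta, a, b, c):
    """3PL probability of a correct response."""
    return c + (1 - c) / (1 + np.exp(-a * (theta - b)))

def verification_z(theta_unproctored, responses, a, b, c):
    """Z statistic comparing the observed verification-test score with the
    score expected under the unproctored ability estimate (a sketch in the
    spirit of the Guo & Drasgow Z-test, not their exact formulation)."""
    p = p_3pl(theta_unproctored, a, b, c)
    expected = p.sum()
    variance = (p * (1 - p)).sum()
    observed = np.asarray(responses).sum()
    return (observed - expected) / np.sqrt(variance)

# Hypothetical example: 10 verification items; the candidate's true ability is
# around 0, but the unproctored estimate is 1.5, so Z should be clearly negative.
rng = np.random.default_rng(0)
a, b, c = rng.uniform(0.8, 2.0, 10), rng.normal(0, 1, 10), np.full(10, 0.2)
responses = rng.binomial(1, p_3pl(0.0, a, b, c))
z = verification_z(1.5, responses, a, b, c)
print(round(z, 2))
```

A strongly negative Z suggests that the unproctored estimate overstates the candidate's ability, which is the kind of inconsistency the verification stage is meant to catch.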


Subject(s)
Deception, Educational Measurement/standards, Internet, Personnel Selection/standards, Adult, Female, Humans, Male
3.
Front Psychol ; 9: 2540, 2018.
Article in English | MEDLINE | ID: mdl-30618961

ABSTRACT

This paper presents a new two-dimensional Multiple-Choice Model accounting for Omissions (MCMO). Based on Thissen and Steinberg's multiple-choice models, the MCMO defines omitted responses as the result of the respondent not knowing the correct answer and deciding to omit rather than guess, given a latent propensity to omit. First, using a Monte Carlo simulation, the accuracy of the parameters estimated from data with different sample sizes (500, 1,000, and 2,000 subjects), test lengths (20, 40, and 80 items), and percentages of omissions (5, 10, and 15%) was investigated. Then, the appropriateness of the MCMO for the Trends in International Mathematics and Science Study (TIMSS) Advanced 2015 mathematics and physics multiple-choice items was analyzed and compared with Holman and Glas' between-item multidimensional IRT model (B-MIRT) and with the three-parameter logistic (3PL) model with omissions treated as incorrect responses. The results of the simulation study showed a good recovery of scale and position parameters. Pseudo-guessing parameters (d) were less accurate, but this inaccuracy did not seem to have an important effect on the estimation of abilities. The precision of the propensity to omit depended strongly on the ability values (the higher the ability, the worse the estimate of the propensity to omit). In the empirical study, the empirical reliability of the ability estimates was high in both physics and mathematics. As in the simulation study, the estimates of the propensity to omit were less reliable and their precision varied with ability. Regarding absolute item fit, the MCMO fitted the data better than the other models. The MCMO also offered significant increments in convergent validity between scores from multiple-choice and constructed-response items, with an increase of around 0.02 to 0.04 in R² in comparison with the two other methods. Finally, the high correlation between the country means of the propensity to omit in mathematics and physics suggests that (1) the propensity to omit is somehow affected by the country of residence of the examinees, and (2) the propensity to omit is independent of the test contents.
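
To make the omission mechanism described above concrete, here is a deliberately simplified Python sketch of a two-dimensional know/omit/guess process. This is an illustrative parameterization only, not the MCMO itself; the item parameters, the logistic form of the omission decision, and the uniform-guessing assumption are all assumptions made for the example.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def response_probs(theta, omit_propensity, a, b, d, n_options=4):
    """Illustrative two-dimensional omission mechanism, NOT the MCMO itself:
    an examinee who knows the item answers correctly; one who does not know
    either omits (with a probability driven by a latent omission propensity)
    or guesses at random among the options. a, b, d are the discrimination,
    difficulty, and omission threshold of a hypothetical item."""
    p_know = logistic(a * (theta - b))
    p_omit_given_unknown = logistic(omit_propensity - d)
    p_omit = (1 - p_know) * p_omit_given_unknown
    p_guess = (1 - p_know) * (1 - p_omit_given_unknown)
    return {"correct": p_know + p_guess / n_options,
            "incorrect": p_guess * (n_options - 1) / n_options,
            "omitted": p_omit}

print(response_probs(theta=-0.5, omit_propensity=1.0, a=1.2, b=0.3, d=0.0))
```

The three probabilities sum to one, which is the minimal requirement any such two-dimensional omission model has to satisfy.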

4.
Hum Brain Mapp ; 38(2): 803-816, 2017 02.
Article in English | MEDLINE | ID: mdl-27726264

ABSTRACT

Neuroimaging research involves analyses of huge amounts of biological data that may or may not be related to cognition. This relationship is usually approached with univariate methods, so correction methods are mandatory for reducing false positives, which in turn increases the probability of false negatives. Multivariate frameworks have been proposed to help alleviate this trade-off. Here we apply multivariate distance matrix regression to the simultaneous analysis of biological and cognitive data, namely, structural connections among 82 brain regions and several latent factors estimating cognitive performance. We tested whether cognitive differences predict distances among individuals with respect to their connectivity pattern. Beginning with 3,321 connections among regions, the 36 edges best predicted by the individuals' cognitive scores were selected. Cognitive scores were related to connectivity distances in both the full (3,321) and reduced (36) connectivity patterns. The selected edges connect regions distributed across the entire brain, and the network defined by these edges supports high-order cognitive processes such as (a) (fluid) executive control, (b) (crystallized) recognition, learning, and language processing, and (c) visuospatial processing. This multivariate study suggests that a widespread but limited set of regions in the human brain supports high-level cognitive ability differences.
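
For readers unfamiliar with multivariate distance matrix regression, the following Python sketch shows the core computation under standard formulations of the method: Gower-centre the squared distance matrix, project it onto the predictor (hat) matrix, and form a pseudo-F from the model and residual traces. The data dimensions and the use of Euclidean distances here are assumptions for illustration, and the permutation test normally used for significance is omitted.

```python
import numpy as np

def mdmr_pseudo_f(D, X):
    """Pseudo-F for multivariate distance matrix regression.
    D: n x n distance matrix among subjects; X: n x m matrix of predictors
    (e.g., cognitive scores). Sketch of the standard MDMR decomposition."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    G = -0.5 * J @ (D ** 2) @ J                    # Gower-centred matrix
    Xc = np.column_stack([np.ones(n), X])          # add intercept
    H = Xc @ np.linalg.pinv(Xc.T @ Xc) @ Xc.T      # hat matrix
    m = Xc.shape[1] - 1
    ss_model = np.trace(H @ G @ H)
    ss_resid = np.trace((np.eye(n) - H) @ G @ (np.eye(n) - H))
    return (ss_model / m) / (ss_resid / (n - m - 1))

# Hypothetical data: 50 subjects, 100 connectivity edges, 3 cognitive factors.
rng = np.random.default_rng(1)
conn = rng.normal(size=(50, 100))
cog = rng.normal(size=(50, 3))
D = np.linalg.norm(conn[:, None, :] - conn[None, :, :], axis=2)  # Euclidean
print(round(mdmr_pseudo_f(D, cog), 3))
```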


Subject(s)
Brain/diagnostic imaging, Brain/physiology, Cognition/physiology, Multivariate Analysis, Regression Analysis, Adolescent, Brain Mapping, Female, Humans, Imaging, Three-Dimensional, Magnetic Resonance Imaging, Male, Neural Pathways/diagnostic imaging, Neural Pathways/physiology, Neuropsychological Tests, Reproducibility of Results, Young Adult
5.
Psicothema ; 28(3): 346-52, 2016 Aug.
Article in English | MEDLINE | ID: mdl-27448271

ABSTRACT

BACKGROUND: Multistage adaptive testing has recently emerged as an alternative to the computerized adaptive test. The current study details a new multistage test to assess fluid intelligence. METHOD: An item pool of progressive matrices with a constructed-response format was developed and divided into six subtests. The subtests were administered to a sample of 724 college students, and their psychometric properties (i.e., reliability, dimensionality, and validity evidence) were studied. The item pool was calibrated under the graded response model, and two multistage structures were developed based on automated test assembly principles. Finally, the test information provided by each structure was compared in order to select the most appropriate one. RESULTS: The item pool showed adequate psychometric properties. Of the two multistage structures compared, the simpler one (i.e., a routing test followed by two modules at each subsequent stage) was more informative across the latent trait continuum and was therefore retained. DISCUSSION: Taken together, the results of the two studies support the use of the FIMT (Fluid Intelligence Multistage Test), an accurate and innovative multistage measure of fluid intelligence.
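
As an aside on how such a structure operates, here is a highly simplified Python sketch of routing through a multistage design with a routing test followed by two alternative modules at each later stage. The module names, the number-correct routing rule, and the cutoffs are all hypothetical; an operational test of this kind would route on provisional IRT scores under the graded response model rather than on raw cutoffs.

```python
# Simplified routing logic for a multistage structure (sketch only; operational
# routing would use provisional IRT scores, not raw number-correct cutoffs).

def route(stage_modules, routing_score, cutoff):
    """Pick the harder module when the cumulative score reaches the cutoff."""
    easy, hard = stage_modules
    return hard if routing_score >= cutoff else easy

def administer(bank, respond, cutoffs):
    """Administer the routing module, then route through the remaining stages.
    `respond` is a callable returning the number-correct score on a module."""
    administered = ["routing"]
    score = respond(bank["routing"])
    for stage, modules in enumerate(bank["stages"]):
        chosen = route(modules, score, cutoffs[stage])
        administered.append(chosen)
        score += respond(bank[chosen])
    return administered

# Hypothetical bank layout: a routing subtest plus easy/hard modules per stage.
bank = {"routing": list(range(10)),
        "stages": [("stage1_easy", "stage1_hard"), ("stage2_easy", "stage2_hard")],
        "stage1_easy": list(range(10)), "stage1_hard": list(range(10)),
        "stage2_easy": list(range(10)), "stage2_hard": list(range(10))}
print(administer(bank, respond=lambda items: len(items) // 2, cutoffs=[5, 10]))
```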


Subject(s)
Intelligence Tests, Adolescent, Adult, Female, Humans, Male, Psychometrics, Young Adult
6.
Psicothema ; 28(1): 76-82, 2016.
Article in English | MEDLINE | ID: mdl-26820428

ABSTRACT

BACKGROUND: Forced-choice tests (FCTs) were proposed to minimize the response biases associated with Likert-format items. It remains unclear whether scores based on traditional methods for scoring FCTs are appropriate for between-subjects comparisons. Recently, Hontangas et al. (2015) explored the extent to which traditional scoring of FCTs relates to true scores and IRT estimates. The authors found certain conditions under which traditional scores (TS) can be used with FCTs when the underlying IRT model is an unfolding model. In this study, we examine to what extent those results are preserved when the underlying process is a dominance model. METHOD: The independent variables analyzed in a simulation study are: forced-choice format, number of blocks, discrimination of items, polarity of items, variability of intra-block difficulty, range of difficulty, and correlation between dimensions. RESULTS: A similar pattern of results was observed for both models; however, the correlations between TS and true thetas are higher, and the differences between TS and IRT estimates are smaller, when a dominance model is involved. CONCLUSIONS: A dominance model produces a linear relationship between TS and true scores, and subjects with extreme thetas are better measured.


Subject(s)
Choice Behavior, Models, Psychological, Humans, Psychometrics, Surveys and Questionnaires
7.
Appl Psychol Meas ; 40(7): 500-516, 2016 Oct.
Article in English | MEDLINE | ID: mdl-29881066

ABSTRACT

Forced-choice questionnaires have been proposed as a way to control some response biases associated with traditional questionnaire formats (e.g., Likert-type scales). Whereas classical scoring methods have issues of ipsativity, item response theory (IRT) methods have been claimed to accurately account for the latent trait structure of these instruments. In this article, the authors propose the multi-unidimensional pairwise preference two-parameter logistic (MUPP-2PL) model, a variant within Stark, Chernyshenko, and Drasgow's MUPP framework for items that are assumed to fit a dominance model. They also introduce a Markov chain Monte Carlo (MCMC) procedure for estimating the model's parameters. The authors present the results of a simulation study, which shows appropriate goodness of recovery in all studied conditions. A comparison of the newly proposed model with Brown and Maydeu-Olivares' Thurstonian IRT model led to the conclusion that both models are theoretically very similar and that the Bayesian estimation procedure of the MUPP-2PL may provide a slightly better recovery of the latent space correlations and a more reliable assessment of the latent trait estimation errors. An application of the model to a real data set shows convergence between the two estimation procedures. However, there is also evidence that the MCMC procedure may be advantageous regarding the recovery of the item parameters and the latent trait correlations.
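
To ground the kind of response function such dominance-based pairwise-preference models work with, here is a minimal Python sketch in which the probability of preferring the first item of a block over the second is a logistic function of the difference of the two items' 2PL kernels. The parameterization and the numbers are illustrative assumptions, not necessarily the exact MUPP-2PL specification.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def pref_prob(theta, dims, a, b):
    """Illustrative dominance-based pairwise-preference probability (a sketch,
    not necessarily the exact MUPP-2PL parameterization): the probability of
    preferring the first item of a block over the second is a logistic function
    of the difference of the two items' 2PL kernels.
    theta: vector of latent traits; dims: (dim_first, dim_second);
    a, b: discriminations and locations of the two items."""
    k1 = a[0] * (theta[dims[0]] - b[0])
    k2 = a[1] * (theta[dims[1]] - b[1])
    return logistic(k1 - k2)

# Hypothetical block pairing an Extraversion item with a Conscientiousness item.
theta = np.array([0.8, -0.3])          # [Extraversion, Conscientiousness]
print(round(pref_prob(theta, dims=(0, 1), a=(1.4, 1.1), b=(0.2, -0.5)), 3))
```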

8.
Psychol Methods ; 21(1): 93-111, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26651983

ABSTRACT

An early step in the process of construct validation consists of establishing the fit of an unrestricted "exploratory" factorial model for a prespecified number of common factors. For this initial unrestricted model, researchers have often recommended and used fit indices to estimate the number of factors to retain. Despite the logical appeal of this approach, little is known about the actual accuracy of fit indices in the estimation of data dimensionality. The present study aimed to reduce this gap by systematically evaluating the performance of four commonly used fit indices (the comparative fit index, CFI; the Tucker-Lewis index, TLI; the root mean square error of approximation, RMSEA; and the standardized root mean square residual, SRMR) in the estimation of the number of factors with categorical variables, and comparing it with what is arguably the current golden rule, Horn's (1965) parallel analysis. The results indicate that the CFI and TLI provide nearly identical estimations and are the most accurate fit indices, followed at a step below by the RMSEA, and then by the SRMR, which gives notably poor dimensionality estimates. Difficulties in establishing optimal cutoff values for the fit indices and the general superiority of parallel analysis, however, suggest that applied researchers are better served by complementing their theoretical considerations regarding dimensionality with the estimates provided by the latter method.
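
As a reminder of what three of these indices compute, here is a short Python sketch of the usual textbook formulas for CFI, TLI, and RMSEA from the model and baseline (independence-model) chi-square statistics; SRMR is omitted because it requires the residual correlation matrix. The chi-square values in the example are hypothetical.

```python
def fit_indices(chi2, df, chi2_base, df_base, n):
    """CFI, TLI, and RMSEA from model and baseline chi-square statistics,
    using the standard textbook formulas."""
    cfi = 1 - max(chi2 - df, 0) / max(chi2_base - df_base, chi2 - df, 1e-12)
    tli = ((chi2_base / df_base) - (chi2 / df)) / ((chi2_base / df_base) - 1)
    rmsea = (max(chi2 - df, 0) / (df * (n - 1))) ** 0.5
    return {"CFI": round(cfi, 3), "TLI": round(tli, 3), "RMSEA": round(rmsea, 3)}

# Hypothetical chi-square values for a 2-factor solution to 15 indicators, n = 500.
print(fit_indices(chi2=150.0, df=76, chi2_base=2400.0, df_base=105, n=500))
```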


Subject(s)
Data Interpretation, Statistical, Models, Statistical, Monte Carlo Method, Humans
9.
Ergonomics ; 59(2): 207-21, 2016.
Article in English | MEDLINE | ID: mdl-26230967

ABSTRACT

Artificial neural networks are sophisticated modelling and prediction tools capable of extracting complex, non-linear relationships between predictor (input) and predicted (output) variables. This study explores this capacity by modelling non-linearities in the hardiness-modulated burnout process with a neural network. Specifically, two multi-layer feed-forward artificial neural networks are concatenated in an attempt to model the composite non-linear burnout process. Sensitivity analysis, a Monte Carlo-based global simulation technique, is then used to examine the first-order effects of the predictor variables on the burnout sub-dimensions and consequences. Results show that (1) this concatenated artificial neural network approach is a feasible way to model the burnout process, (2) sensitivity analysis is a productive method for studying the relative importance of predictor variables, and (3) the relationships among the variables involved in the development of burnout and its consequences are non-linear to different degrees. PRACTITIONER SUMMARY: Many relationships among variables (e.g., stressors and strains) are not linear, yet researchers use linear methods such as Pearson correlation or linear regression to analyse them. Artificial neural network analysis is an innovative method for analysing non-linear relationships and, in combination with sensitivity analysis, is superior to linear methods.
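
The following Python sketch illustrates the general idea only: two small feed-forward networks are concatenated (the output of the first feeding the second), and a crude Monte Carlo first-order sensitivity share is computed for each input. The architecture, the random untrained weights, and the sensitivity measure are assumptions for illustration and do not reproduce the networks or the sensitivity analysis used in the study.

```python
import numpy as np

rng = np.random.default_rng(2)

def mlp(x, w1, w2):
    """One-hidden-layer feed-forward network with tanh units."""
    return np.tanh(x @ w1) @ w2

# Two small networks concatenated: the first maps predictors (e.g., hardiness,
# demands) to burnout sub-dimensions, the second maps those to consequences.
# Weights are random here; a real application would train them.
n_in, n_hidden, n_mid, n_out = 4, 6, 3, 1
w1a, w2a = rng.normal(size=(n_in, n_hidden)), rng.normal(size=(n_hidden, n_mid))
w1b, w2b = rng.normal(size=(n_mid, n_hidden)), rng.normal(size=(n_hidden, n_out))

def composite(x):
    return mlp(mlp(x, w1a, w2a), w1b, w2b)

# Crude Monte Carlo first-order sensitivity: share of output variance recovered
# when only one input varies and the others are held at their means (an
# illustration of the idea, not a full variance-based sensitivity analysis).
base = rng.normal(size=(5000, n_in))
total_var = composite(base).var()
for j in range(n_in):
    x = np.tile(base.mean(axis=0), (5000, 1))
    x[:, j] = base[:, j]
    print(f"input {j}: first-order share = {composite(x).var() / total_var:.2f}")
```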


Subject(s)
Burnout, Professional/psychology, Models, Theoretical, Neural Networks, Computer, Nurses/psychology, Occupational Medicine/methods, Adult, China, Female, Humans, Male, Middle Aged, Monte Carlo Method
10.
Appl Psychol Meas ; 39(8): 598-612, 2015 Nov.
Article in English | MEDLINE | ID: mdl-29881030

ABSTRACT

This article explores how traditional scores obtained from different forced-choice (FC) formats relate to their true scores and item response theory (IRT) estimates. Three FC formats are considered; given a block of items, respondents are asked to (a) pick the item that describes them most (PICK), (b) choose the two items that describe them the most and the least (MOLE), or (c) rank all the items in order of how well they describe them (RANK). The multi-unidimensional pairwise-preference (MUPP) model, extended to more than two items per block and to the different FC formats, is applied to generate the responses to each item block. Traditional and IRT (i.e., expected a posteriori) scores are computed from each data set and compared. The aim is to clarify the conditions under which simpler traditional scoring procedures for FC formats may be used in place of the more appropriate IRT estimates for the purpose of inter-individual comparisons. Six independent variables are considered: response format, number of items per block, correlation between the dimensions, item discrimination level, and the sign-heterogeneity and variability of the item difficulty parameters. Results show that the RANK response format outperforms the other formats for both the IRT estimates and the traditional scores, although it is only slightly better than the MOLE format. The highest correlations between true and traditional scores are found when the test has a large number of blocks, the dimensions assessed are independent, the items have high discrimination and highly dispersed location parameters, and the test contains blocks formed by positive and negative items.
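
As a concrete, hedged illustration of what "traditional" scoring can look like for these formats, the Python sketch below tallies classical scores for the PICK and RANK formats (MOLE would combine a credit for the "most" choice with a debit for the "least" choice). The block structure, keying, and point scheme are assumptions for the example, not the article's exact scoring rules.

```python
from collections import defaultdict

def score_pick(blocks, picks):
    """PICK: one point to the dimension of the item chosen as most descriptive.
    blocks[b] is a list of (dimension, keyed_positive) tuples; picks[b] is the
    index of the chosen item. A sketch of classical FC scoring."""
    scores = defaultdict(int)
    for block, choice in zip(blocks, picks):
        dim, positive = block[choice]
        scores[dim] += 1 if positive else -1
    return dict(scores)

def score_rank(blocks, rankings):
    """RANK: items receive points inversely related to their rank
    (the most descriptive item gets the most points)."""
    scores = defaultdict(int)
    for block, order in zip(blocks, rankings):
        n = len(block)
        for rank, item_idx in enumerate(order):     # order[0] = most descriptive
            dim, positive = block[item_idx]
            points = n - rank
            scores[dim] += points if positive else -points
    return dict(scores)

# Hypothetical three-item blocks over dimensions "E", "C", "O".
blocks = [[("E", True), ("C", True), ("O", False)],
          [("C", True), ("O", True), ("E", True)]]
print(score_pick(blocks, picks=[0, 1]))
print(score_rank(blocks, rankings=[[0, 1, 2], [1, 2, 0]]))
```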

11.
Psychol Assess ; 26(3): 1021-30, 2014 Sep.
Article in English | MEDLINE | ID: mdl-24708083

ABSTRACT

Previous research has suggested multiple factor structures for the 12-item General Health Questionnaire (GHQ-12), with contradictory evidence arising across studies on the validity of these models. In the present research, it was hypothesized that these inconsistent findings were due to the interaction of three main methodological factors: ambiguous response categories in the negative items, multiple scoring schemes, and inappropriate estimation methods. Using confirmatory factor analysis with appropriate estimation methods and scores obtained from a large (n = 27,674) representative Spanish sample, we tested this hypothesis by evaluating the fit and predictive validity of four GHQ-12 factor models (unidimensional, Hankins' (2008a) response bias model, Andrich and Van Schoubroeck's (1989) 2-factor model, and Graetz's (1991) 3-factor model) across three scoring methods: standard, corrected, and Likert. In addition, the impact of method effects on the reliability of the global GHQ-12 scores was evaluated. The combined results of this study support the view that the GHQ-12 is a unidimensional measure that shows spurious multidimensionality under certain scoring schemes (corrected and Likert) as a result of the ambiguous response categories of the negative items. It is therefore suggested that the items be scored using the standard method and that only a global score be derived from the instrument.
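
For reference, the Python sketch below codes the three scoring schemes as they are commonly described in the GHQ literature (standard 0-0-1-1 for all items; corrected, often called C-GHQ, 0-0-1-1 for positive items and 0-1-1-1 for negative items; Likert 0-1-2-3). The response pattern and the set of negatively worded items in the example are hypothetical, and the article's exact recoding may differ.

```python
def score_ghq12(responses, negative_items, method="standard"):
    """Score GHQ-12 responses (categories coded 0-3 per item) under the three
    schemes commonly compared in the literature (a sketch):
      standard  : 0-0-1-1 for every item
      corrected : 0-0-1-1 for positive items, 0-1-1-1 for negative items
      likert    : 0-1-2-3 for every item
    """
    total = 0
    for i, r in enumerate(responses):
        if method == "likert":
            total += r
        elif method == "corrected" and i in negative_items:
            total += 1 if r >= 1 else 0
        else:  # standard scoring, and corrected scoring of positive items
            total += 1 if r >= 2 else 0
    return total

# Hypothetical response pattern and a hypothetical set of negative items.
resp = [0, 2, 1, 3, 2, 0, 1, 2, 3, 1, 0, 2]
neg = {1, 4, 7, 9, 10, 11}
for m in ("standard", "corrected", "likert"):
    print(m, score_ghq12(resp, neg, m))
```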


Subject(s)
Mental Disorders/diagnosis, Research Design, Adolescent, Adult, Aged, Aged, 80 and over, Factor Analysis, Statistical, Female, Humans, Male, Mental Disorders/psychology, Middle Aged, Reproducibility of Results, Surveys and Questionnaires, Young Adult
12.
Psychol Methods ; 18(4): 454-74, 2013 Dec.
Article in English | MEDLINE | ID: mdl-23046000

ABSTRACT

Previous research evaluating the performance of Horn's parallel analysis (PA) factor retention method with ordinal variables has produced unexpected findings. Specifically, PA with Pearson correlations has performed as well as or better than PA with the more theoretically appropriate polychoric correlations. Seeking to clarify these findings, the current study employed a more comprehensive simulation that included the systematic manipulation of seven factors related to the data (sample size, factor loading, number of variables per factor, number of factors, factor correlation, number of response categories, and skewness) as well as three factors related to the PA method (type of correlation matrix, extraction method, and eigenvalue percentile). The results of the simulation show that PA with either Pearson or polychoric correlations is particularly sensitive to the sample size, factor loadings, number of variables per factor, and factor correlations. However, whereas PA with polychorics is relatively robust to the skewness of the ordinal variables, PA with Pearson correlations frequently retains difficulty factors and is generally inaccurate at large levels of skewness. In light of these findings, we recommend the use of PA with polychoric correlations for the dimensionality assessment of ordinal-level data.
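
Since parallel analysis is central to this and the previous entry, here is a minimal Python sketch of Horn's procedure with Pearson correlations: retain factors as long as the observed eigenvalue exceeds a chosen percentile of the eigenvalues obtained from random data of the same dimensions. A polychoric-based variant would substitute a polychoric correlation estimate for np.corrcoef, which is not shown; the simulated data set is hypothetical.

```python
import numpy as np

def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Horn's parallel analysis with Pearson correlations: retain factors as
    long as the observed eigenvalue exceeds the chosen percentile of the
    eigenvalues from random normal data of the same dimensions."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    obs = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    rand = np.empty((n_sims, p))
    for s in range(n_sims):
        sim = rng.normal(size=(n, p))
        rand[s] = np.sort(np.linalg.eigvalsh(np.corrcoef(sim, rowvar=False)))[::-1]
    thresholds = np.percentile(rand, percentile, axis=0)
    retained = 0
    for o, t in zip(obs, thresholds):
        if o > t:
            retained += 1
        else:
            break
    return retained

# Hypothetical data: 300 observations on 12 variables generated from 2 factors.
rng = np.random.default_rng(3)
factors = rng.normal(size=(300, 2))
loadings = rng.uniform(0.4, 0.8, size=(12, 2))
X = factors @ loadings.T + rng.normal(scale=0.7, size=(300, 12))
print(parallel_analysis(X))
```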


Subject(s)
Data Interpretation, Statistical, Models, Statistical, Animals
13.
Span J Psychol ; 15(1): 424-41, 2012 Mar.
Article in English | MEDLINE | ID: mdl-22379731

ABSTRACT

This paper describes several simulation studies that examine the effects of capitalization on chance on item selection and ability estimation in computerized adaptive testing (CAT), employing the 3-parameter logistic model. In order to generate different estimation errors for the item parameters, the calibration sample size was manipulated (N = 500, 1,000, and 2,000 subjects), as was the ratio of item bank size to test length (banks of 197 and 788 items, test lengths of 20 and 40 items), both in a CAT and in a randomly assembled test. Results show that capitalization on chance is particularly serious in CAT, as revealed by the large positive bias found in the small-sample calibration conditions. For broad ranges of θ, the overestimation of precision (asymptotic SE) reaches levels of 40%, something that does not occur with the RMSE of θ. The problem grows as the ratio of item bank size to test length increases. Potential solutions were tested in a second study, in which two exposure control methods were incorporated into the item selection algorithm. Some alternative solutions are discussed.
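
For context, the quantities at play are the 3PL item characteristic curve, its Fisher information, and the asymptotic standard error of the ability estimate; the Python sketch below uses the standard closed forms with hypothetical item parameters. When discriminations are overestimated by chance, the summed information is inflated and this standard error becomes optimistically small, which is the overestimation of precision the study documents.

```python
import numpy as np

def p_3pl(theta, a, b, c):
    """3PL item characteristic curve."""
    return c + (1 - c) / (1 + np.exp(-a * (theta - b)))

def info_3pl(theta, a, b, c):
    """Fisher information of a 3PL item (standard closed form)."""
    p = p_3pl(theta, a, b, c)
    return a ** 2 * ((p - c) ** 2 / (1 - c) ** 2) * ((1 - p) / p)

def asymptotic_se(theta, a, b, c):
    """Asymptotic SE of the ability estimate from the administered items."""
    return 1.0 / np.sqrt(np.sum(info_3pl(theta, a, b, c)))

# Hypothetical 20-item adaptive test evaluated at theta = 0.
rng = np.random.default_rng(4)
a, b, c = rng.uniform(0.8, 2.2, 20), rng.normal(0, 0.5, 20), np.full(20, 0.2)
print(round(asymptotic_se(0.0, a, b, c), 3))
```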


Subject(s)
Algorithms, Artificial Intelligence, Educational Measurement/statistics & numerical data, Models, Statistical, Psychometrics/statistics & numerical data, Humans, Reproducibility of Results, Software Design
14.
Psicothema ; 23(4): 802-7, 2011 Nov.
Article in English | MEDLINE | ID: mdl-22047876

ABSTRACT

In this study, eCAT-Listening, a new computerized adaptive test for the evaluation of English listening skills, is described. Item bank development, the anchor design used for data collection, and the psychometric properties of the item bank and the adaptive test are described. The calibration sample comprised 1,576 participants. The item bank showed good psychometric properties: it is unidimensional, the items fit the 3-parameter logistic model satisfactorily, and the trait level is estimated accurately. As validity evidence, a high correlation was obtained between the estimated trait level and a latent factor made up of the various criteria selected. The analysis of trait level estimation by means of simulation led us to fix the test length at 20 items, with a maximum item exposure rate of .40.


Subject(s)
Hearing, Language Tests, Computers, Humans, Psychometrics
15.
Psicothema ; 22(2): 340-7, 2010 May.
Article in Spanish | MEDLINE | ID: mdl-20423641

ABSTRACT

This study describes the parameter drift analysis conducted on eCAT (a computerized adaptive test to assess the written English level of Spanish speakers). The original calibration of the item bank (N = 3,224) was compared with a new calibration obtained from the data provided by most operational eCAT administrations (N = 7,254). A Differential Item Functioning (DIF) study was conducted between the original and the new calibrations. The impact of the new parameters on the trait level estimates was obtained by simulation. Results show that parameter drift is found especially for the a and c parameters, that a substantial number of bank items show DIF, and that the parameter change has a moderate impact on the θ estimates of examinees with a high English level. It is therefore recommended that the original estimates be replaced with the new set.
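
As a hedged illustration of drift screening (not the DIF procedure used in the study), the Python sketch below flags items whose difficulty estimate shifts between two calibrations by more than a chosen number of pooled standard errors. All parameter values and standard errors in the example are hypothetical.

```python
import numpy as np

def flag_drift(b_old, b_new, se_old, se_new, z_crit=2.58):
    """Flag items whose difficulty estimate shifts between calibrations by more
    than z_crit pooled standard errors. A crude screen, not a DIF analysis."""
    z = (np.asarray(b_new) - np.asarray(b_old)) / np.sqrt(
        np.asarray(se_old) ** 2 + np.asarray(se_new) ** 2)
    return np.where(np.abs(z) > z_crit)[0]

# Hypothetical difficulties and standard errors for five items in two calibrations.
b_old = [-1.2, -0.3, 0.1, 0.8, 1.5]
b_new = [-1.1, -0.2, 0.6, 0.9, 1.4]
se = [0.08, 0.07, 0.09, 0.10, 0.12]
print(flag_drift(b_old, b_new, se, se))   # only the item shifting by ~0.5 is flagged
```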


Subject(s)
Educational Measurement/methods, Educational Measurement/statistics & numerical data, Software, Humans, Language
16.
Br J Math Stat Psychol ; 61(Pt 2): 493-513, 2008 Nov.
Article in English | MEDLINE | ID: mdl-17681109

ABSTRACT

The most commonly employed item selection rule in a computerized adaptive test (CAT) is to select the item with the maximum Fisher information at the estimated trait level. This results in a highly unbalanced distribution of item-exposure rates, a high overlap rate among examinees, and, for item bank management, strong pressure to replace the items in the bank with high discrimination parameters. An alternative for mitigating these problems is to base item selection mainly on randomness at the beginning of the test and to increase the weight of information in the selection as the test progresses. In the present work we study, for two selection rules, the progressive method (Revuelta & Ponsoda, 1998) and the proportional method (Segall, 2004a), different functions that define the weight of the random component according to the position in the test of the item to be administered. The functions were tested in simulated item banks and in an operational bank. We found that both the progressive and the proportional methods tolerate a high weight of the random component with minimal or zero loss of accuracy, while bank security and maintenance are improved.
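
To make the progressive idea tangible, here is a Python sketch in which an item's selection value is a weighted sum of a random component and its Fisher information, with the information weight growing linearly with the item's position in the test. The linear weight, the bank, and the parameters are assumptions for illustration; the specific weight functions compared in the article are not reproduced.

```python
import numpy as np

def p_3pl(theta, a, b, c):
    return c + (1 - c) / (1 + np.exp(-a * (theta - b)))

def info_3pl(theta, a, b, c):
    p = p_3pl(theta, a, b, c)
    return a ** 2 * ((p - c) ** 2 / (1 - c) ** 2) * ((1 - p) / p)

def progressive_select(theta_hat, bank, administered, position, test_length, rng):
    """Progressive-style selection (sketch): value = (1 - w) * random + w * info,
    where the information weight w grows linearly with the item position."""
    w = position / test_length                      # 0 at the start, 1 at the end
    a, b, c = bank["a"], bank["b"], bank["c"]
    info = info_3pl(theta_hat, a, b, c)
    value = (1 - w) * rng.uniform(0, info.max(), size=len(a)) + w * info
    value[list(administered)] = -np.inf             # never repeat an item
    return int(np.argmax(value))

# Hypothetical 100-item bank; select the 5th item of a 20-item test.
rng = np.random.default_rng(5)
bank = {"a": rng.uniform(0.8, 2.0, 100), "b": rng.normal(0, 1, 100),
        "c": np.full(100, 0.2)}
print(progressive_select(0.3, bank, administered={4, 17, 52, 63}, position=5,
                         test_length=20, rng=rng))
```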


Subject(s)
Computer-Aided Design, Models, Psychological, Random Allocation, Humans, Psychology/methods, Psychology/statistics & numerical data
17.
Psicothema ; 18(4): 828-34, 2006 Nov.
Article in Spanish | MEDLINE | ID: mdl-17296125

ABSTRACT

Item selection rules in a Computerized Adaptive Test for the assessment of written English. e-CAT is a computerized adaptive test for the evaluation of written English knowledge that uses the most commonly employed item selection rule: the maximum Fisher information criterion. Some of the problems of this criterion have a negative impact on estimation accuracy and item bank security. In this study, the performance of this item selection rule is compared, by means of simulation, with two other rules: selecting the item with maximum Fisher information in an interval (Veerkamp & Berger, 1997) and a new criterion, called "maximum Fisher information in an interval with geometric mean". In general, the new rule shows smaller measurement error and lower item overlap rates. It therefore seems advisable, as it simultaneously improves estimation accuracy and maintains the item bank security of e-CAT.


Subject(s)
Comprehension, Computing Methodologies, Language Tests, Reading, Algorithms, Computer Simulation, Educational Measurement, Humans
18.
Psicothema ; 18(4): 835-40, 2006 Nov.
Article in Spanish | MEDLINE | ID: mdl-17296126

ABSTRACT

Validation of the cognitive structure of the test of signs by structural equation modeling. The present work aims to validate the cognitive operations required to correctly solve the items of a math test involving basic arithmetic operations with integers. The validation of the hypothesized cognitive structure is carried out by means of structural equation modeling and triangulation methods. Results show strong and positive cognitive subordination relationships between some items, but the structural equation model fit provides only partial support for the proposed structure. However, the triangulation procedure provides further evidence of validity.


Subject(s)
Mathematics, Models, Psychological, Psychological Tests, Adolescent, Algorithms, Cognition, Female, Humans, Male, Thinking