Results 1 - 20 of 22
1.
Front Psychol ; 12: 685326, 2021.
Article in English | MEDLINE | ID: mdl-34149573

ABSTRACT

The item wording (or keying) effect consists of logically inconsistent answers to positively and negatively worded items that tap into similar (but polar opposite) content. Previous research has shown that this effect can be successfully modeled through the random intercept item factor analysis (RIIFA) model, as evidenced by the improvements in model fit in comparison to models that only contain substantive factors. However, little is known regarding the capability of this model to recover the uncontaminated person scores. To address this issue, the study analyzes the performance of the RIIFA approach across three types of wording effects proposed in the literature: carelessness, item verification difficulty, and acquiescence. In the context of unidimensional substantive models, four independent variables were manipulated using Monte Carlo methods: type of wording effect, amount of wording effect, sample size, and test length. The results corroborated previous findings by showing that the RIIFA models were consistently able to account for the variance in the data, attaining an excellent fit regardless of the amount of bias. Conversely, the models without the RIIFA factor produced an increasingly poor fit as the amount of wording effects grew. Surprisingly, however, the RIIFA models were not able to better estimate the uncontaminated person scores for any type of wording effect in comparison to the substantive unidimensional models. The simulation results were then corroborated with an empirical dataset, examining the relationships of learning strategies and personality with grade point average in undergraduate studies. The apparently paradoxical findings regarding model fit and the recovery of the person scores are explained in terms of the properties of the factor models examined.

2.
Front Psychol ; 12: 614470, 2021.
Article in English | MEDLINE | ID: mdl-33658962

ABSTRACT

Cognitive diagnosis models (CDMs) allow classifying respondents into a set of discrete attribute profiles. The internal structure of the test is determined in a Q-matrix, whose correct specification is necessary to achieve an accurate attribute profile classification. Several empirical Q-matrix estimation and validation methods have been proposed with the aim of providing well-specified Q-matrices. However, these methods require the number of attributes to be set in advance. No systematic studies on dimensionality assessment for CDMs have been conducted, which contrasts with the vast literature available in the factor analysis framework. To address this gap, the present study evaluates the performance of several dimensionality assessment methods from the factor analysis literature in determining the number of attributes in the context of CDMs. The explored methods were parallel analysis, minimum average partial, very simple structure, DETECT, the empirical Kaiser criterion, exploratory graph analysis, and a machine-learning factor forest model. Additionally, a model comparison approach was considered, which consists of comparing the model fit of empirically estimated Q-matrices. The performance of these methods was assessed by means of a comprehensive simulation study that included different generating numbers of attributes, item qualities, sample sizes, ratios of the number of items to attributes, correlations among the attributes, attribute thresholds, and generating CDMs. Results showed that parallel analysis (with Pearson correlations and the mean eigenvalue criterion), the factor forest model, and model comparison (with AIC) are suitable alternatives for determining the number of attributes in CDM applications, with an overall percentage of correct estimates above 76% across conditions. The accuracy increased to 97% when these three methods agreed on the number of attributes. In short, the present study supports the use of these three methods to assess the dimensionality of CDMs. This will allow researchers to test the assumption of correct dimensionality made by the Q-matrix estimation and validation methods, as well as to gather validity evidence to support the use of the scores obtained with these models. The findings are illustrated using real data from an intelligence test, providing guidelines for assessing the dimensionality of CDM data in applied settings.
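As a rough illustration of the best-performing method above, the following sketch implements Horn's parallel analysis with Pearson correlations and the mean-eigenvalue criterion. The two-factor data set is hypothetical, not data from the study.

```python
import numpy as np

def parallel_analysis(data, n_reps=100, seed=0):
    """Horn's parallel analysis: retain dimensions whose observed
    eigenvalue exceeds the mean eigenvalue of random data of the
    same size (Pearson correlations, mean-eigenvalue criterion)."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    obs_eig = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    rand_eig = np.zeros(p)
    for _ in range(n_reps):
        r = rng.standard_normal((n, p))
        rand_eig += np.sort(np.linalg.eigvalsh(np.corrcoef(r, rowvar=False)))[::-1]
    rand_eig /= n_reps
    return int(np.sum(obs_eig > rand_eig))

# Hypothetical two-factor data: two latent variables, three indicators each
rng = np.random.default_rng(1)
f = rng.standard_normal((500, 2))
loadings = np.array([[.8, 0], [.7, 0], [.6, 0], [0, .8], [0, .7], [0, .6]])
x = f @ loadings.T + .5 * rng.standard_normal((500, 6))
print(parallel_analysis(x))  # retains 2 dimensions for this data
```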

3.
Appl Psychol Meas ; 45(2): 112-129, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33627917

ABSTRACT

Decisions on how to calibrate an item bank might have major implications for the subsequent performance of the adaptive algorithms. One of these decisions is model selection, which can become problematic in the context of cognitive diagnosis computerized adaptive testing, given the wide range of models available. This article aims to determine whether model selection indices can be used to improve the performance of adaptive tests. Three factors were considered in a simulation study: calibration sample size, Q-matrix complexity, and item bank length. Results based on the true item parameters, on a general model, and on a single reduced model were compared to those obtained from the combination of appropriate models. The results indicate that fitting a single reduced model or a general model will not generally provide optimal results. Results based on the combination of models selected by the fit index were always closer to those obtained with the true item parameters. The implications for practical settings include an improvement in classification accuracy and, consequently, testing time, and a more balanced use of the item bank. An R package, named cdcatR, was developed to facilitate adaptive applications in this context.
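The model comparison logic behind choosing between a reduced and a general model can be sketched with AIC; the log-likelihoods and parameter counts below are hypothetical, not estimates from the study.

```python
def aic(log_lik, n_params):
    # Akaike information criterion: lower is better
    return -2.0 * log_lik + 2.0 * n_params

# Hypothetical per-item comparison: a general model (more parameters)
# versus a reduced model (fewer parameters) fitted to the same data
candidates = {
    "general (G-DINA)": aic(log_lik=-1050.2, n_params=8),
    "reduced (DINA)":   aic(log_lik=-1053.9, n_params=2),
}
best = min(candidates, key=candidates.get)
print(best)  # the reduced model wins despite its slightly worse fit
```

The penalty term makes the point of the abstract concrete: the general model always fits at least as well, but the index can still prefer the more parsimonious model.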

4.
Br J Math Stat Psychol ; 74 Suppl 1: 110-130, 2021 07.
Article in English | MEDLINE | ID: mdl-33231301

ABSTRACT

The Q-matrix identifies the subset of attributes measured by each item in the cognitive diagnosis modelling framework. Usually constructed by domain experts, the Q-matrix might contain some misspecifications, disrupting classification accuracy. Empirical Q-matrix validation methods such as the general discrimination index (GDI) and Wald have shown promising results in addressing this problem. However, a cut-off point is used in both methods, which might be suboptimal. To address this limitation, the Hull method is proposed and evaluated in the present study. This method aims to find the optimal balance between fit and parsimony, and it is flexible enough to be used either with a measure of item discrimination (the proportion of variance accounted for, PVAF) or a coefficient of determination (pseudo-R²). Results from a simulation study indicated that the Hull method consistently showed the best performance and shortest computation time, especially when used with the PVAF. The Wald method also performed very well overall, while the GDI method obtained poor results when the number of attributes was high. The absence of a cut-off point gives the Hull method greater flexibility and positions it as a comprehensive solution to the Q-matrix specification problem in applied settings. This proposal is illustrated using real data.
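A minimal sketch of the elbow logic behind the Hull method, omitting the convex-hull screening step of the full procedure; the fit values and parameter counts are hypothetical.

```python
def hull_select(fit, npar):
    """Simplified Hull heuristic: among candidate models ordered by
    number of parameters, pick the one maximizing the elbow ratio
    st_i = ((f_i - f_{i-1}) / (p_i - p_{i-1}))
         / ((f_{i+1} - f_i) / (p_{i+1} - p_i)).
    (The full method first discards models not on the convex hull.)"""
    best_i, best_st = None, -float("inf")
    for i in range(1, len(fit) - 1):
        gain_prev = (fit[i] - fit[i - 1]) / (npar[i] - npar[i - 1])
        gain_next = (fit[i + 1] - fit[i]) / (npar[i + 1] - npar[i])
        st = gain_prev / gain_next
        if st > best_st:
            best_i, best_st = i, st
    return best_i

# Hypothetical PVAF-like fit values for q-vectors of increasing complexity
fit = [0.40, 0.92, 0.95, 0.96]   # big jump, then diminishing returns
npar = [1, 2, 3, 4]
print(hull_select(fit, npar))  # index 1: the elbow
```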


Subject(s)
Models, Statistical; Research Design; Computer Simulation; Psychometrics
5.
Appl Psychol Meas ; 44(6): 431-446, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32788815

ABSTRACT

In the context of cognitive diagnosis models (CDMs), a Q-matrix reflects the correspondence between attributes and items. The Q-matrix construction process is typically subjective in nature, which may lead to misspecifications. Such misspecifications can negatively affect attribute classification accuracy. In response, several methods of empirical Q-matrix validation have been developed. The general discrimination index (GDI) method has some relevant advantages, such as the possibility of being applied to several CDMs. However, the estimation of the GDI relies on the estimation of the latent group sizes and success probabilities, which is carried out with the original (possibly misspecified) Q-matrix. This can be a problem, especially in those situations in which there is great uncertainty about the Q-matrix specification. To address this, the present study investigates the iterative application of the GDI method, where only one item is modified at each step of the iterative procedure, and the required cutoff is updated considering the new parameter estimates. A simulation study was conducted to test the performance of the new procedure. Results showed that the performance of the GDI method improved when the application was iterative at the item level and an appropriate cutoff point was used. This was most notable when the original Q-matrix misspecification rate was high, where the proposed procedure performed better 96.5% of the time. The results are illustrated using Tatsuoka's fraction-subtraction data set.
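The GDI itself is a weighted variance of the latent groups' success probabilities on an item, and the PVAF compares a candidate q-vector's GDI against that of the fully specified q-vector. A toy sketch with hypothetical probabilities and group sizes:

```python
import numpy as np

def gdi(group_probs, group_sizes):
    """General discrimination index: weighted variance of the latent
    groups' success probabilities on an item."""
    w = np.asarray(group_sizes, dtype=float)
    w /= w.sum()
    p = np.asarray(group_probs, dtype=float)
    p_bar = np.dot(w, p)
    return float(np.dot(w, (p - p_bar) ** 2))

# Hypothetical item: two latent groups (masters vs. non-masters)
# induced by a single-attribute q-vector ...
zeta2_q = gdi([0.2, 0.9], [0.5, 0.5])
# ... versus four groups induced by the full two-attribute q-vector,
# where the second attribute does not change the probabilities
zeta2_full = gdi([0.2, 0.2, 0.9, 0.9], [0.25, 0.25, 0.25, 0.25])
pvaf = zeta2_q / zeta2_full
print(round(pvaf, 3))  # 1.0: one attribute already explains all the variance
```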

6.
PLoS One ; 15(1): e0227196, 2020.
Article in English | MEDLINE | ID: mdl-31923227

ABSTRACT

Currently, there are two predominant approaches in adaptive testing. One, referred to as cognitive diagnosis computerized adaptive testing (CD-CAT), is based on cognitive diagnosis models, and the other, the traditional CAT, is based on item response theory. The present study evaluates the performance of two item selection rules (ISRs) originally developed in the CD-CAT framework, the double Kullback-Leibler information (DKL) and the generalized deterministic inputs, noisy "and" gate model discrimination index (GDI), in the context of traditional CAT. The accuracy and test security associated with these two ISRs are compared to those of the point Fisher information and weighted KL using a simulation study. The impact of the trait level estimation method is also investigated. The results show that the new ISRs, particularly DKL, could be used to improve the accuracy of CAT. Better accuracy for DKL is achieved at the expense of a higher item overlap rate. Differences among the item selection rules become smaller as the test gets longer. The two CD-CAT ISRs select different types of items: items with the highest possible a parameter with DKL, and items with the lowest possible c parameter with GDI. Regarding the trait level estimator, the expected a posteriori method is generally better in the first stages of the CAT, and converges with the maximum likelihood method when a medium to large number of items is involved. The use of DKL can be recommended in low-stakes settings where test security is less of a concern.
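The point Fisher information rule used as a baseline here can be sketched for the 3PL model (logistic metric with D = 1 assumed; the three-item bank is hypothetical):

```python
import math

def p3pl(theta, a, b, c):
    # 3PL response probability (logistic metric, D = 1)
    return c + (1 - c) / (1 + math.exp(-a * (theta - b)))

def fisher_info(theta, a, b, c):
    # Item information for the 3PL model
    p = p3pl(theta, a, b, c)
    q = 1 - p
    return a ** 2 * (q / p) * ((p - c) / (1 - c)) ** 2

# Administer the item with the most information at the current estimate
bank = [(0.8, -1.0, 0.2), (1.6, 0.1, 0.2), (1.2, 2.0, 0.2)]  # (a, b, c)
theta_hat = 0.0
best = max(range(len(bank)), key=lambda i: fisher_info(theta_hat, *bank[i]))
print(best)  # 1: the highly discriminating item located near theta_hat
```

This greediness is exactly what creates the security problem discussed in later entries: a few high-a items get selected over and over.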


Subject(s)
Cognition; Educational Measurement/methods; Psychometrics/methods; Algorithms; Bayes Theorem; Bias; Computer Simulation; Computers; Data Accuracy; Humans
7.
Educ Psychol Meas ; 79(4): 727-753, 2019 Aug.
Article in English | MEDLINE | ID: mdl-32655181

ABSTRACT

Cognitive diagnosis models (CDMs) are latent class multidimensional statistical models that help classify people accurately by using a set of discrete latent variables, commonly referred to as attributes. These models require a Q-matrix that indicates the attributes involved in each item. A potential problem is that the Q-matrix construction process, typically performed by domain experts, is subjective in nature. This might result in Q-matrix misspecifications that produce inaccurate classifications. For this reason, several empirical Q-matrix validation methods have been developed in recent years. de la Torre and Chiu proposed one of the most popular methods, based on a discrimination index. However, some questions related to the usefulness of the method with empirical data remained open due to the restricted number of conditions examined, and the use of a unique cutoff point (EPS) regardless of the data conditions. This article includes two simulation studies to test this validation method under a wider range of conditions, with the purpose of providing it with greater generalizability, and to empirically determine the most suitable EPS considering the data conditions. Results show a good overall performance of the method, the relevance of the different studied factors, and that using a single indiscriminate EPS is not acceptable. Specific guidelines for selecting an appropriate EPS are provided in the discussion.

8.
Span J Psychol ; 21: E62, 2018 Dec 03.
Article in English | MEDLINE | ID: mdl-30501646

ABSTRACT

This study analyses the extent to which cheating occurs in a real selection setting. A two-stage, unproctored and proctored, test administration was considered. Test score inconsistencies were detected by applying a verification test (the Guo and Drasgow Z-test). An initial simulation study showed that the Z-test has adequate Type I error and power rates in the specific selection settings explored. A second study applied the Z-test verification procedure to a sample of 954 employment candidates. Additional external evidence based on item response times to the verification items was gathered. The results revealed a good performance of the Z-test statistic and a relatively low, but non-negligible, number of suspected cheaters who showed highly distorted ability estimates. The study with real data provided additional information on the presence of suspected cheating in unproctored applications and the viability of using item response times as additional evidence of cheating. In the verification test, suspected cheaters spent 5.78 seconds per item more than expected considering the item difficulty and their assumed ability in the unproctored stage. We found that the percentage of suspected cheaters in the empirical study could be estimated at 13.84%. In summary, the study provides evidence of the usefulness of the Z-test in the detection of cheating in a specific setting, in which a computerized adaptive test for assessing English grammar knowledge was used for personnel selection.
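The verification logic can be sketched as a standardized difference between the observed and expected number-correct on the verification items, given the ability estimated in the unproctored stage. The probabilities below are hypothetical, and details of the published Z-test may differ from this simplification.

```python
import math

def z_verification(responses, probs):
    """Standardized number-correct on the verification test against its
    expectation under the unproctored ability estimate. Large negative
    values flag examinees whose verified performance falls well short
    of that estimate."""
    x = sum(responses)
    mu = sum(probs)
    sigma = math.sqrt(sum(p * (1 - p) for p in probs))
    return (x - mu) / sigma

# Hypothetical verification stage: an inflated unproctored theta-hat
# predicts high success probabilities, but the examinee answers poorly
probs = [0.9, 0.85, 0.8, 0.9, 0.75, 0.8, 0.85, 0.9]
responses = [1, 0, 0, 1, 0, 1, 0, 0]
z = z_verification(responses, probs)
print(round(z, 2))  # -3.69: flagged at any conventional level
```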


Subject(s)
Deception; Educational Measurement/standards; Internet; Personnel Selection/standards; Adult; Female; Humans; Male
9.
Front Psychol ; 9: 2540, 2018.
Article in English | MEDLINE | ID: mdl-30618961

ABSTRACT

This paper presents a new two-dimensional Multiple-Choice Model accounting for Omissions (MCMO). Based on Thissen and Steinberg's multiple-choice models, the MCMO defines omitted responses as the result of the respondent not knowing the correct answer and deciding to omit rather than guess, given a latent propensity to omit. First, using a Monte Carlo simulation, the accuracy of the parameters estimated from data with different sample sizes (500, 1,000, and 2,000 subjects), test lengths (20, 40, and 80 items), and percentages of omissions (5, 10, and 15%) was investigated. Next, the appropriateness of the MCMO for the Trends in International Mathematics and Science Study (TIMSS) Advanced 2015 mathematics and physics multiple-choice items was analyzed and compared with Holman and Glas' Between-item Multi-dimensional IRT model (B-MIRT) and with the three-parameter logistic (3PL) model with omissions treated as incorrect responses. The results of the simulation study showed a good recovery of scale and position parameters. Pseudo-guessing parameters (d) were less accurate, but this inaccuracy did not seem to have an important effect on the estimation of abilities. The precision of the propensity to omit strongly depended on the ability values (the higher the ability, the worse the estimate of the propensity to omit). In the empirical study, the empirical reliability of the ability estimates was high in both physics and mathematics. As in the simulation study, the estimates of the propensity to omit were less reliable and their precision varied with ability. Regarding absolute item fit, the MCMO fitted the data better than the other models. Also, the MCMO offered significant increments in convergent validity between scores from multiple-choice and constructed-response items, with an increase of around 0.02 to 0.04 in R² in comparison with the two other methods. Finally, the high correlation between the country means of the propensity to omit in mathematics and physics suggests that (1) the propensity to omit is somehow affected by the country of residence of the examinees, and (2) the propensity to omit is independent of the test contents.

10.
Psychol Methods ; 21(1): 93-111, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26651983

ABSTRACT

An early step in the process of construct validation consists of establishing the fit of an unrestricted "exploratory" factorial model for a prespecified number of common factors. For this initial unrestricted model, researchers have often recommended and used fit indices to estimate the number of factors to retain. Despite the logical appeal of this approach, little is known about the actual accuracy of fit indices in the estimation of data dimensionality. The present study aimed to reduce this gap by systematically evaluating the performance of four commonly used fit indices (the comparative fit index, CFI; the Tucker-Lewis index, TLI; the root mean square error of approximation, RMSEA; and the standardized root mean square residual, SRMR) in the estimation of the number of factors with categorical variables, and comparing it with what is arguably the current gold standard, Horn's (1965) parallel analysis. The results indicate that the CFI and TLI provide nearly identical estimations and are the most accurate fit indices, followed at a step below by the RMSEA, and then by the SRMR, which gives notably poor dimensionality estimates. Difficulties in establishing optimal cutoff values for the fit indices and the general superiority of parallel analysis, however, suggest that applied researchers are better served by complementing their theoretical considerations regarding dimensionality with the estimates provided by the latter method.
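The two strongest indices can be computed directly from a model's chi-square statistic; the chi-square values, degrees of freedom, and sample size below are hypothetical.

```python
def rmsea(chi2, df, n):
    # Root mean square error of approximation from a model's chi-square
    return (max(chi2 - df, 0) / (df * (n - 1))) ** 0.5

def cfi(chi2, df, chi2_null, df_null):
    # Comparative fit index relative to the independence (null) model
    d = max(chi2 - df, 0)
    d_null = max(chi2_null - df_null, 0)
    return 1 - d / max(d, d_null)

# Hypothetical one-factor solution evaluated against its null model
print(round(rmsea(chi2=54.0, df=27, n=301), 3))              # 0.058
print(round(cfi(54.0, 27, chi2_null=900.0, df_null=36), 3))  # 0.969
```

Estimating dimensionality with these indices amounts to refitting the model for k = 1, 2, ... factors and stopping at the first k that clears a cutoff, which is exactly where the cutoff-selection difficulties noted above arise.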


Subject(s)
Data Interpretation, Statistical; Models, Statistical; Monte Carlo Method; Humans
11.
Psicothema ; 26(3): 395-400, 2014.
Article in English | MEDLINE | ID: mdl-25069561

ABSTRACT

BACKGROUND: The Exploratory Factor Analysis (EFA) procedure is one of the most commonly used in the social and behavioral sciences. However, it is also one of the most criticized, due to the poor methodological choices researchers often make when applying it. The main goal is to examine the relationship between the practices usually considered most appropriate and the actual decisions made by researchers. METHOD: The use of exploratory factor analysis is examined in 117 papers published between 2011 and 2012 in the 3 Spanish psychological journals with the highest impact within the previous five years. RESULTS: Results show substantial rates of questionable decisions in conducting EFA, based on unjustified or mistaken choices regarding the methods of factor extraction, retention, and rotation. CONCLUSIONS: Overall, the current review provides support for some improvement guidelines regarding how to apply and report an EFA.


Subject(s)
Factor Analysis, Statistical; Validation Studies as Topic; Guidelines as Topic
12.
Span J Psychol ; 17: E48, 2014.
Article in English | MEDLINE | ID: mdl-25012203

ABSTRACT

Test security can be a major problem in computerized adaptive testing, as examinees can share information about the items they receive. Of the different item selection rules proposed to alleviate this risk, stratified methods are among those that have received the most attention. In these methods, only low-discrimination items can be presented at the beginning of the test, and the mean information of the items increases as the test goes on. To do so, the item bank must be divided into several strata according to the information of the items. To date, there is no clear guidance about the optimal number of strata into which the item bank should be split. In this study, we simulate conditions with different numbers of strata, from 1 (no stratification) to a number of strata equal to the test length (maximum stratification), while manipulating the maximum exposure rate that no item should surpass (r_max) across its whole domain. In this way, we can plot the relation between test security and accuracy, making it possible to determine the number of strata that leads to better security while holding measurement accuracy constant. Our data indicate that the best option is to stratify into as many strata as possible.
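The stratification step itself is simple to sketch: sort the bank by item information (here proxied by the a parameter, as in a-stratified designs) and split it into strata, with early test stages drawing from the low-information strata. The eight-item bank is hypothetical.

```python
import numpy as np

def stratify(a_params, n_strata):
    """Sort items by discrimination and split the bank into strata of
    (roughly) equal size; early stages draw from low-a strata, later
    stages from high-a strata."""
    order = np.argsort(a_params)
    return np.array_split(order, n_strata)

a = np.array([0.4, 1.9, 0.7, 1.2, 0.5, 1.6, 0.9, 1.4])
strata = stratify(a, n_strata=4)
# The first stratum holds the least discriminating items
print([float(a[i]) for i in strata[0]])  # [0.4, 0.5]
```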


Subject(s)
Computing Methodologies; Educational Measurement/standards; Psychometrics/standards; Educational Measurement/methods; Humans; Psychometrics/methods
13.
Psicothema ; 25(2): 238-44, 2013.
Article in English | MEDLINE | ID: mdl-23628540

ABSTRACT

BACKGROUND: Criterion-referenced interpretations of tests are highly necessary, which usually involves the difficult task of establishing cut scores. In contrast with other Item Response Theory (IRT)-based standard setting methods, a non-judgmental approach is proposed in this study, in which Item Characteristic Curve (ICC) transformations lead to the final cut scores. METHOD: eCat-Listening, a computerized adaptive test for the evaluation of English listening, was administered to 1,576 participants, and the proposed standard setting method was applied to classify them into the performance standards of the Common European Framework of Reference for Languages (CEFR). RESULTS: The results showed a classification closely related to relevant external measures of the English language domain, according to the CEFR. CONCLUSIONS: It is concluded that the proposed method is a practical and valid standard setting alternative for IRT-based test interpretation.


Subject(s)
Comprehension; Psychological Tests; Computers; Humans; Models, Statistical; Psychometrics; Reproducibility of Results
14.
Psychol Methods ; 18(4): 454-74, 2013 Dec.
Article in English | MEDLINE | ID: mdl-23046000

ABSTRACT

Previous research evaluating the performance of Horn's parallel analysis (PA) factor retention method with ordinal variables has produced unexpected findings. Specifically, PA with Pearson correlations has performed as well as or better than PA with the more theoretically appropriate polychoric correlations. Seeking to clarify these findings, the current study employed a more comprehensive simulation study that included the systematic manipulation of 7 factors related to the data (sample size, factor loading, number of variables per factor, number of factors, factor correlation, number of response categories, and skewness) as well as 3 factors related to the PA method (type of correlation matrix, extraction method, and eigenvalue percentile). The results from the simulation study show that PA with either Pearson or polychoric correlations is particularly sensitive to the sample size, factor loadings, number of variables per factor, and factor correlations. However, whereas PA with polychorics is relatively robust to the skewness of the ordinal variables, PA with Pearson correlations frequently retains difficulty factors and is generally inaccurate with large levels of skewness. In light of these findings, we recommend the use of PA with polychoric correlations for the dimensionality assessment of ordinal-level data.


Subject(s)
Data Interpretation, Statistical; Models, Statistical; Animals
15.
Psicothema ; 23(4): 802-7, 2011 Nov.
Article in English | MEDLINE | ID: mdl-22047876

ABSTRACT

In this study, eCAT-Listening, a new computerized adaptive test for the evaluation of English listening, is described. Item bank development, the anchor design for data collection, and the study of the psychometric properties of the item bank and the adaptive test are described. The calibration sample comprised 1,576 participants. The item bank showed good psychometric guarantees: it is unidimensional, the items fit the 3-parameter logistic model satisfactorily, and an accurate estimation of the trait level is obtained. As validity evidence, a high correlation was obtained between the estimated trait level and a latent factor made up of the diverse criteria selected. The analysis of the trait level estimation by means of a simulation led us to fix the test length at 20 items, with a maximum exposure rate of .40.


Subject(s)
Hearing; Language Tests; Computers; Humans; Psychometrics
16.
Span J Psychol ; 14(1): 500-8, 2011 May.
Article in English | MEDLINE | ID: mdl-21568205

ABSTRACT

In computerized adaptive testing, the most commonly used valuating function is the Fisher information function. When the goal is to keep item bank security at a maximum, the valuating function that seems most convenient is the matching criterion, which evaluates the distance between the estimated trait level and the point where the maximum of the information function is located. Recently, it has been proposed not to keep the same valuating function constant for all the items in the test. In this study we expand the idea of combining the matching criterion with the Fisher information function. We also manipulate the number of strata into which the bank is divided. We find that manipulating the number of items administered with each function makes it possible to move from the pole of high accuracy and low security to the opposite pole. It is possible to greatly improve item bank security with much smaller losses in accuracy by selecting several items with the matching criterion. In general, it seems more appropriate not to stratify the bank.
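Under the 3PL model the location of the information maximum has a closed form (Birnbaum's result, with scaling constant D = 1 assumed here), which makes the matching criterion easy to sketch; the item bank below is hypothetical.

```python
import math

def theta_max_info(a, b, c, D=1.0):
    """Trait level at which a 3PL item's Fisher information peaks
    (closed form); for c = 0 this reduces to b."""
    return b + math.log((1 + math.sqrt(1 + 8 * c)) / 2) / (D * a)

def matching_pick(theta_hat, bank):
    # Matching criterion: administer the item whose information peak
    # lies closest to the current trait estimate
    return min(range(len(bank)),
               key=lambda i: abs(theta_hat - theta_max_info(*bank[i])))

bank = [(1.0, -0.5, 0.0), (1.0, 0.4, 0.0), (1.0, 1.5, 0.0)]  # (a, b, c)
print(matching_pick(0.3, bank))  # 1: the item with b = 0.4 is closest
```

Because the rule ignores the magnitude of the information (only its location matters), high-a items are no longer systematically preferred, which is what spreads exposure across the bank.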


Subject(s)
Diagnosis, Computer-Assisted/statistics & numerical data; Educational Measurement/statistics & numerical data; Information Systems/statistics & numerical data; Psychological Tests/statistics & numerical data; Psychometrics/statistics & numerical data; Software; Surveys and Questionnaires; Algorithms; Artificial Intelligence; Computer Simulation; Humans; Linguistics; Mathematical Computing; Mathematics; Reproducibility of Results
17.
Estud. psicol. (Campinas) ; 27(3): 315-327, Jul.-Sep. 2010. tab
Article in Portuguese | LILACS | ID: lil-571501

ABSTRACT


The present research evaluates the Neuroticism dimension in school children using a multi-source design that includes self-report, parents' and teachers' reports, a semi-structured interview, and behavioral observation. Specifically, the study intended to verify the level of agreement in the evaluation of Neuroticism across the different sources of information. The sample [N=368] constitutes part of the "Estudo Longitudinal de Avaliação das Competências Psicológicas das Crianças do Centro Pedagógico/Universidade Federal de Minas Gerais". Two self-report scales were used to select children with high and low Neuroticism: the Big Five Questionnaire for Children and the Eysenck Personality Questionnaire - Junior. A subsample of 68 children then underwent the multi-source assessment. The results showed moderate correlations between the two Neuroticism self-report scales. There was no significant association between parents' and teachers' reports when they evaluated Neuroticism, but positive associations were found between self-report and the multi-source assessments. Behavioral observations were not useful in the measurement of the traits, and the reasons for these results are discussed. To sum up, this research makes important contributions to the personality literature regarding the multi-source assessment of Neuroticism in Brazilian children.


Subject(s)
Humans; Education, Primary and Secondary; Neurotic Disorders
18.
Estud. psicol. (Campinas) ; 27(2): 161-168, Apr.-Jun. 2010. ilus, tab
Article in Portuguese | LILACS | ID: lil-567354

ABSTRACT


The aim of this study was to investigate the relationship between intelligence, personality, and the extent of general and current information among students in the state of Minas Gerais. Two samples participated. The first was composed of students from schools at three levels of social vulnerability in the city of Belo Horizonte (n=600), and the second came from public schools in the city of Perdões (n=215). The Raven's Progressive Matrices Test and a General Information Questionnaire were applied to both samples. In addition, the Eysenck Personality Questionnaire and the WISC-III Information subtest were applied to the second sample. The results indicated a consistent relationship between intelligence and the General Information Questionnaire, even after controlling for the effect of the schools' social vulnerability (r=0.431). A path analysis showed independent effects of intelligence (0.430) and of the Psychoticism dimension (-0.18) on the General Information Questionnaire, after controlling for age and the covariance between predictors. It may be concluded that intelligence explains twice as much of the differences in the General Information Questionnaire as does personality.


Subject(s)
Humans; Intelligence; Personality; Psychotic Disorders
19.
Psicothema ; 21(4): 639-45, 2009 Nov.
Article in Spanish | MEDLINE | ID: mdl-19861112

ABSTRACT

Applications of Item Response Theory require assessing the agreement between observations and model predictions at the item level. This paper compares such approaches for polytomously scored items in a simulation study. Three fit indexes are calculated: the traditional chi-square index, obtained by grouping examinees according to their estimated trait level; an alternative that uses the posterior distribution of the trait; and a third method in which examinees are grouped according to their observed total scores. Various conditions are simulated by manipulating test length (10, 20, and 40 items), number of categories (3, 4, and 5), and sample size (500, 1,000, and 2,000 examinees). Power and Type I error rates are reported. The chi-square statistic based on posterior probabilities showed the best performance, especially with larger sample sizes and shorter test lengths.
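The third grouping strategy (by observed total scores) can be sketched as a Pearson-type statistic. Here `expected_fn` is a stand-in for whatever calibrated model supplies the model-implied success probabilities, and the nine-examinee data set is a toy example, not data from the study.

```python
import numpy as np

def itemfit_chi2(item_resp, total_scores, expected_fn, n_groups=3):
    """Pearson-type item-fit statistic: group examinees by observed
    total score, then compare observed and model-expected correct
    proportions within each group."""
    cuts = np.quantile(total_scores, np.linspace(0, 1, n_groups + 1))
    chi2 = 0.0
    for g in range(n_groups):
        lo, hi = cuts[g], cuts[g + 1]
        if g == n_groups - 1:  # make the last interval right-inclusive
            in_g = (total_scores >= lo) & (total_scores <= hi)
        else:
            in_g = (total_scores >= lo) & (total_scores < hi)
        n_g = int(in_g.sum())
        if n_g == 0:
            continue
        obs = item_resp[in_g].mean()
        exp = expected_fn(total_scores[in_g].mean())
        chi2 += n_g * (obs - exp) ** 2 / (exp * (1 - exp))
    return chi2

scores = np.arange(1, 10)                       # observed total scores 1..9
resp = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1])    # one item's responses
fit = itemfit_chi2(resp, scores, expected_fn=lambda m: m / 10.0)
print(round(fit, 3))  # 1.833 for these toy values
```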


Subject(s)
Chi-Square Distribution; Computer Simulation; Logistic Models; Likelihood Functions
20.
Psicothema ; 21(2): 313-20, 2009 May.
Article in English | MEDLINE | ID: mdl-19403088

ABSTRACT

This paper has two objectives: (a) to provide a clear description of three methods for controlling the maximum exposure rate in computerized adaptive testing (the Sympson-Hetter method, the restricted method, and the item-eligibility method), showing how all three can be interpreted as methods for constructing the variable sub-bank of items from which each examinee receives the items in his or her test; and (b) to indicate the theoretical and empirical limitations of each method and to compare their performance. The three methods yielded basically indistinguishable results in overlap rate and RMSE (differences in the third decimal place). The restricted method is the best at controlling the exposure rate, followed by the item-eligibility method; the worst is the Sympson-Hetter method. However, the restricted method presents problems with the sequential overlap rate. Our advice is to use the item-eligibility method, as it saves time and satisfies the goals of restricting maximum exposure.
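One common formulation of the item-eligibility update (after van der Linden and Veldkamp) shrinks an item's eligibility probability whenever its empirical exposure rate exceeds r_max; the exact rule and bookkeeping in the paper may differ from this hedged sketch, and the numeric inputs are hypothetical.

```python
def update_eligibility(p_elig, exposure_rate, r_max):
    """Item-eligibility exposure control (one common formulation):
    before each test, item i is eligible with probability p_elig;
    after each examinee, p_elig is rescaled so that administration
    rates are pushed back below r_max."""
    if exposure_rate <= 0:
        return 1.0  # never-used items become fully eligible
    return min(1.0, r_max * p_elig / exposure_rate)

# An over-exposed item becomes eligible less often ...
print(update_eligibility(p_elig=1.0, exposure_rate=0.60, r_max=0.40))
# ... while an under-exposed one stays fully eligible
print(update_eligibility(p_elig=0.5, exposure_rate=0.10, r_max=0.40))
```

The appeal noted in the abstract follows from this shape: the update needs no pre-operational simulation phase (unlike Sympson-Hetter), so it "saves time" while still capping maximum exposure.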


Subject(s)
Numerical Analysis, Computer-Assisted; Psychological Tests/statistics & numerical data; Psychological Tests/standards