ABSTRACT
The bifactor model is a promising alternative to traditional modeling techniques for studying the predictive validity of hierarchical constructs. However, no study to date has systematically examined the influence of cross-loadings on the estimation of regression coefficients in bifactor predictive models. Therefore, we present a systematic examination of the statistical performance of six modeling strategies to handle cross-loadings in bifactor predictive models: structural equation modeling (SEM), exploratory structural equation modeling (ESEM) with target rotation, Bayesian structural equation modeling (BSEM), and each of the three with augmentation. Results revealed four clear patterns: 1) forcing even small cross-loadings to zero was detrimental to empirical identification, estimation bias, power, and Type I error rates; 2) the performance of ESEM with target rotation was unexpectedly weak; 3) augmented BSEM had satisfactory performance in an absolute sense and outperformed the other five strategies across most conditions; 4) augmentation improved the performance of ESEM and SEM, although the degree of improvement was not as substantial as that of BSEM. We also present an empirical example to show the feasibility of the proposed approach. Overall, these findings can help users of bifactor predictive models design better studies, choose more appropriate analytical strategies, and obtain more reliable results. Implications, limitations, and future directions are discussed.
Subjects
Bayes Theorem, Factor Analysis, Latent Class Analysis
ABSTRACT
The present study compared the performance of logistic regression models with that of machine learning classification models (classification trees and random forests) in the context of predicting training attrition from the Delayed Enlistment Program in the United States Marine Corps (USMC) with scores from the Tailored Adaptive Personality Assessment System (TAPAS). Performance was assessed according to the type of misclassification error and across a variety of reasons for attrition. The base rate of attrition was low, which impeded model training, but the machine learning models outperformed logistic regression in predicting voluntary attrition in a stratified 50% attrition sample.
ABSTRACT
Project A clearly demonstrated that performance is multidimensional and that some aspects are better predicted by noncognitive measures. Substantial research and development in the ensuing years has focused on personality and vocational interests. The articles in this special issue convincingly demonstrate that at least one personality measure developed by military researchers, the Tailored Adaptive Personality Assessment System (TAPAS), is resistant to faking, which was an important concern about earlier single-statement instruments. Moreover, several articles report that TAPAS predicts retention and important aspects of "will do" performance. On the other hand, these papers show that TAPAS adds little or no incremental validity to "can do" aspects of performance over and above the Armed Services Vocational Aptitude Battery (ASVAB). Three measures of vocational interest are described in articles in this special issue, and research has been positive about their ability to predict attrition, rates of promotion and reenlistment, and job satisfaction. A number of topics for further research are noted.
ABSTRACT
Given the interpersonal nature of recruiting and the validity of personality assessments for predicting performance in a broad range of civilian and military jobs, personality traits are likely to predict the performance of recruiters in the Army as well. However, much of the research on the characteristics of successful recruiters has been conducted in civilian samples and has not examined the effects of recruiters' personality on their job-related attitudes and behaviors. Although some research has examined the prediction of recruiter performance in a military context, more research is needed to identify profiles of personality traits that will help recruiters to be successful on the job. We explored this relationship in a sample of experienced recruiters with at least six months of service in a recruiting duty assignment. Results indicated that composites of personality traits were substantial predictors of recruiter performance and attitudes. The implications of these results for the selection and assessment of recruiters in the U.S. Army are discussed.
ABSTRACT
The history of vocational interests shows that these measures have great promise for use in job assignment, suggesting that individuals will be more satisfied and successful in their job when they are doing work that interests them. Recent research has provided empirical support for these predictions and demonstrated that the match between an individual's interests and his or her work activities is positively related to job performance and negatively related to attrition. Building on these positive empirical findings, the U.S. Army Research Institute is investigating vocational interest measures for personnel job assignment. Person-job fit is very important in a context such as the U.S. Army, where applicants have over 140 military occupational specialties from which to choose. This paper begins by reviewing evidence for the validity of interests and discussing how vocational interest measures may be used for assigning Soldiers in a military context. It then describes our recent research to develop a new measure of vocational interests to improve the process of matching Soldiers to military occupational specialties. We conclude with the next steps for this research and potential paths of implementation.
ABSTRACT
A number of past studies have demonstrated that personality traits are modest predictors of workplace attitudes and behaviors and can provide incremental validity over cognitive ability. However, less is known about the utility of personality for job classification. In addition, concerns about the effects of faking on personality measures remain. In this study, we examined the validity of a forced-choice personality measure administered under operational conditions to explore the use of personality traits in high-stakes settings. We also examined the potential use of personality for classification into military occupational specialties (MOS). We explored these issues in a large sample of Soldiers from five different MOS to examine the prediction of performance during initial military training (IMT). Results indicated that composites of personality traits were valid predictors of performance and attrition and that these composites may be useful for classifying individuals into different military occupations. The implications of these results for Soldier selection and classification are discussed.
ABSTRACT
A nonparametric technique based on the Hamming distance is proposed in this research by recognizing that once the attribute vector is known, or correctly estimated with high probability, one can determine the item-by-attribute vectors for new items undergoing calibration. We consider the setting where Q is known for a large item bank, and the q-vectors of additional items are estimated. The method is studied in simulation under a wide variety of conditions, and is illustrated with the Tatsuoka fraction subtraction data. A consistency theorem is developed giving conditions under which nonparametric Q calibration can be expected to work.
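The idea of choosing a q-vector by Hamming distance can be sketched as follows. This is only an illustrative sketch, not the authors' procedure: it assumes a conjunctive ideal-response rule (an examinee answers correctly only if every required attribute is mastered), treats the examinees' attribute vectors as known, and uses hypothetical names (`ideal_response`, `estimate_q_vector`) and toy data.

```python
from itertools import product

def ideal_response(alpha, q):
    """Assumed conjunctive rule: 1 only if every attribute required by q is mastered."""
    return int(all(a >= r for a, r in zip(alpha, q)))

def estimate_q_vector(responses, alphas, n_attributes):
    """Return the nonzero q-vector whose ideal responses are closest
    (in Hamming distance) to the observed responses, plus that distance."""
    best_q, best_dist = None, None
    for q in product((0, 1), repeat=n_attributes):
        if not any(q):  # skip the all-zero q-vector
            continue
        dist = sum(y != ideal_response(a, q) for y, a in zip(responses, alphas))
        if best_dist is None or dist < best_dist:
            best_q, best_dist = q, dist
    return best_q, best_dist

# Toy data (hypothetical): six examinees, two attributes, true q = (1, 1).
alphas = [(0, 0), (1, 0), (0, 1), (1, 1), (1, 1), (1, 0)]
responses = [ideal_response(a, (1, 1)) for a in alphas]
responses[0] = 1  # one lucky guess by a non-master

q_hat, distance = estimate_q_vector(responses, alphas, n_attributes=2)
print(q_hat, distance)
```

Ties are broken here by enumeration order, which is an arbitrary choice made for this sketch.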
Subjects
Cognition/classification, Psychometrics/methods, Calibration, Statistical Data Interpretation, Nonparametric Statistics
ABSTRACT
Many self-report inventories in social/personality psychology are developed and scored using dominance-based assumptions. Specifically, they assume that the relationship between item endorsement and the latent trait is monotonically increasing; thus, individuals with high standings on the trait would be likely to endorse all items. It is possible, however, that the item response process for these inventories follows an ideal point process in which respondents only endorse items that best describe them, leading to nonmonotonic relations between item responses and latent traits. This research examined whether the item response process underlying the Experiences in Close Relationships-Revised (a commonly used self-report measure of adult attachment styles) is best understood as a dominance or ideal point process. Study 1 showed that the ideal point model provided a good account of the response process and provided better interpretability for the full trait continuum than a dominance model. Importantly, people who were the most insecure were the most likely to be scored differently under these two item response models. In Study 2, the association between attachment anxiety and subjective well-being scores was higher using ideal point than dominance-based scoring, and this was especially the case among subsets of people who were highly insecure. Study 3 demonstrated a similar pattern using simulation data. In summary, when dominance-based methods are used to measure adult attachment, people who are extremely insecure may be assessed in suboptimal ways.
Subjects
Personality Assessment, Personality, Adult, Anxiety, Humans, Object Attachment, Psychometrics, Self Report
ABSTRACT
In this study, the authors examined the item response process underlying 3 vocational interest inventories: the Occupational Preference Inventory (C.-P. Deng, P. I. Armstrong, & J. Rounds, 2007), the Interest Profiler (J. Rounds, T. Smith, L. Hubert, P. Lewis, & D. Rivkin, 1999; J. Rounds, C. M. Walker, et al., 1999), and the Interest Finder (J. E. Wall & H. E. Baker, 1997; J. E. Wall, L. L. Wise, & H. E. Baker, 1996). Item response theory (IRT) dominance models, such as the 2-parameter and 3-parameter logistic models, assume that item response functions (IRFs) are monotonically increasing as the latent trait increases. In contrast, IRT ideal point models, such as the generalized graded unfolding model, have IRFs that peak where the latent trait matches the item. Ideal point models are expected to fit better because vocational interest inventories ask about typical behavior, as opposed to requiring maximal performance. Results show that across all 3 interest inventories, the ideal point model provided better descriptions of the response process. The importance of specifying the correct item response model for precise measurement is discussed. In particular, scores computed by a dominance model were shown to be sometimes illogical: individuals endorsing mostly realistic or mostly social items were given similar scores, whereas scores based on an ideal point model were sensitive to which type of items respondents endorsed.
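The contrast between the two kinds of item response function can be illustrated numerically. The sketch below is an illustration only: the dominance curve is the standard 2-parameter logistic function, while the ideal point curve is a simple squared-distance (Gaussian-shaped) unfolding function chosen for brevity, not the generalized graded unfolding model named in the abstract. All parameter values and function names are hypothetical.

```python
import math

def dominance_irf(theta, a=1.5, b=0.0):
    """2PL item response function: endorsement probability rises with theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def ideal_point_irf(theta, delta=0.0, tau=1.0):
    """Simplified unfolding curve (an assumption, not the GGUM):
    endorsement peaks where the person location theta matches the item location delta."""
    return math.exp(-((theta - delta) ** 2) / (2.0 * tau ** 2))

thetas = [-3, -2, -1, 0, 1, 2, 3]
dominance = [dominance_irf(t) for t in thetas]
ideal = [ideal_point_irf(t) for t in thetas]

for t, d, i in zip(thetas, dominance, ideal):
    print(f"theta={t:+d}  dominance={d:.3f}  ideal_point={i:.3f}")
```

Under the dominance curve the highest-trait respondent is the most likely to endorse the item; under the ideal point curve, respondents far above the item location are as unlikely to endorse it as those far below.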
Subjects
Psychological Models, Psychological Tests, Psychometrics/methods, Vocational Guidance/methods, Female, Humans, Male, Midwestern United States, Psychometrics/statistics & numerical data
ABSTRACT
Forced-choice (FC) is a popular format for developing personality measures, where individuals must choose 1 or multiple statements from several options. Although FC measures have been proposed to reduce score inflation in high-stakes assessments, inconsistent results have been found in empirical studies regarding their effectiveness. In this study, we conducted a meta-analysis of studies comparing FC personality measure scores between low-stakes and (both simulated and actual) high-stakes situations. Results suggest that the overall score inflation effect size for FC personality measures is 0.06. In selection scenarios, score inflation for FC scales is much lower than the meta-analytic effect size for single-statement personality measures across most personality facets. The score inflation effect size was also found to vary across FC scale characteristics and study design factors. Specifically, FC scales were consistently found to be more faking-resistant when constructed with statements balanced in social desirability and with responses scored via a normative approach. FC scales constructed with the PICK format were also found to be faking-resistant, while more applicant-incumbent studies are needed to examine the fakability of MOLE FC scales. Evidence at the overall level supports the use of multidimensional scales and extremity balance of statements, but results are not consistent across personality facets, or when large samples are excluded. Personality facets of high relevance to the target job were found to exhibit larger inflation than facets of low relevance to the target job. Practical guidance on constructing and using FC personality measures for personnel selection purposes is provided. (PsycINFO Database Record (c) 2019 APA, all rights reserved).
Subjects
Deception, Personality Assessment/standards, Personality Inventory/standards, Psychometrics/standards, Social Desirability, Humans
ABSTRACT
The main aim of this article is to explicate why a transition to ideal point methods of scale construction is needed to advance the field of personality assessment. The study empirically demonstrated the substantive benefits of ideal point methodology as compared with the dominance framework underlying traditional methods of scale construction. Specifically, using a large, heterogeneous pool of order items, the authors constructed scales using traditional classical test theory, dominance item response theory (IRT), and ideal point IRT methods. The merits of each method were examined in terms of item pool utilization, model-data fit, measurement precision, and construct and criterion-related validity. Results show that adoption of the ideal point approach provided a more flexible platform for creating future personality measures, and this transition did not adversely affect the validity of personality test scores.
Subjects
Personality Assessment/statistics & numerical data, Psychometrics/statistics & numerical data, Adolescent, Adult, Female, Humans, Male, Reference Values, Reproducibility of Results, Statistics as Topic, Students/psychology
ABSTRACT
The nature, rate, and higher-order relationships among facets of racial/ethnic harassment (REH) and discrimination (RED) were examined across five racial/ethnic groups in a sample of 5,000 US military personnel. Using a hierarchical, multigroup confirmatory factor analysis approach, results suggest that the nature of REH and RED does not differ by race, with behavioral items equally representing REH and RED across the different groups. Further, higher-order relationships among the facets of REH and RED do not vary across race, with a single second-order factor accounting for the relationships. This single factor is theorized to represent a root intergroup prejudice that leads to harassment and discrimination. However, as anticipated, individuals from minority groups generally reported higher levels of REH and RED once measurement equivalence had been established. Together, the results suggest that both intergroup prejudice (which is multidirectional) and racism (which originates in powerful groups against other groups) are operating in REH and RED experiences.
Subjects
Attitude/ethnology, Ethnicity/psychology, Military Personnel/psychology, Prejudice, Social Behavior, Adult, Data Collection, Female, Humans, Male, Military Personnel/classification, Race Relations, Social Perception, United States
ABSTRACT
In this article, the authors developed a common strategy for identifying differential item functioning (DIF) items that can be implemented in both the mean and covariance structures method (MACS) and item response theory (IRT). They proposed examining the loadings (discrimination) and the intercept (location) parameters simultaneously using the likelihood ratio test with a free-baseline model and Bonferroni-corrected critical p values. They compared the relative efficacy of this approach with alternative implementations for various types and amounts of DIF, sample sizes, numbers of response categories, and amounts of impact (latent mean differences). Results indicated that the proposed strategy was considerably more effective than an alternative approach involving a constrained-baseline model. Both MACS and IRT performed similarly well in the majority of experimental conditions. As expected, MACS performed slightly worse in dichotomous conditions but better than IRT in polytomous cases where sample sizes were small. Also, contrary to popular belief, MACS performed well in conditions where DIF was simulated on item thresholds (item means), and its accuracy was not affected by impact.
Subjects
Psychological Discrimination, Psychological Theory, Psychology/methods, Psychology/statistics & numerical data, Factor Analysis, Humans, Psychological Models
ABSTRACT
The present study investigated whether the assumptions of an ideal point response process, similar in spirit to Thurstone's work in the context of attitude measurement, can provide viable alternatives to the traditionally used dominance assumptions for personality item calibration and scoring. Item response theory methods were used to compare the fit of 2 ideal point and 2 dominance models with data from the 5th edition of the Sixteen Personality Factor Questionnaire (S. Conn & M. L. Rieke, 1994). The authors' results indicate that ideal point models can provide as good or better fit to personality items than do dominance models because they can fit monotonically increasing item response functions but do not require this property. Several implications of these findings for personality measurement and personnel selection are described.
Subjects
Psychological Models, Personality Assessment, Personality, Psychology/methods, Psychology/statistics & numerical data, Statistical Data Interpretation, Humans, Psychological Theory, Nonparametric Statistics
ABSTRACT
Mixed format tests (e.g., a test consisting of multiple-choice [MC] items and constructed response [CR] items) have become increasingly popular. However, the latent structure of item pools consisting of the two formats is still equivocal. Moreover, the implications of this latent structure are unclear: For example, do constructed response items tap reasoning skills that cannot be assessed with multiple choice items? This study explored the dimensionality of mixed format tests by applying bi-factor models to 10 tests of various subjects from the College Board's Advanced Placement (AP) Program and compared the accuracy of scores based on the bi-factor analysis with scores derived from a unidimensional analysis. More importantly, this study focused on a practical and important question: the classification accuracy of the overall grade on a mixed format test. Our findings revealed that the degree of multidimensionality resulting from the mixed item format varied from subject to subject, depending on the disattenuated correlation between scores from MC and CR subtests. Moreover, remarkably small decrements in classification accuracy were found for the unidimensional analysis when the disattenuated correlations exceeded 0.90.
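For readers unfamiliar with the term, a disattenuated correlation is conventionally obtained by dividing the observed correlation by the square root of the product of the two reliabilities. The sketch below assumes that classical correction; the abstract does not state how the correction was computed, and the function name and input values are hypothetical.

```python
import math

def disattenuated_correlation(r_xy, rel_x, rel_y):
    """Classical correction for attenuation: r_xy / sqrt(rel_x * rel_y)."""
    if not (0.0 < rel_x <= 1.0 and 0.0 < rel_y <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)

# Hypothetical values: observed MC-CR correlation and the two subtest reliabilities.
r_corrected = disattenuated_correlation(r_xy=0.72, rel_x=0.90, rel_y=0.80)
print(round(r_corrected, 3))
```

With these made-up inputs the corrected value falls below the 0.90 mark mentioned in the abstract.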
ABSTRACT
Sexual harassment has consistently negative consequences for working women, including changes in job attitudes (e.g., lower satisfaction) and behaviors (e.g., increased work withdrawal). Cross-sectional evidence suggests that harassment influences turnover intentions. However, few studies have used actual turnover; rather, they rely on proxies. With a sample of 11,521 military servicewomen with turnover data spanning approximately 4 years, the authors used the appropriate method for longitudinal turnover data, Cox's regression, to investigate the impact of harassment on actual turnover. Experiences of harassment led to increased turnover, even after controlling for job satisfaction, organizational commitment, and marital status. Among officers, harassment also affected turnover over and above rank. Given turnover's relevance to organizational bottom lines, these findings have important implications not only for individual women but also for organizations.
Subjects
Military Personnel/statistics & numerical data, Psychological Models, Personnel Turnover/statistics & numerical data, Sexual Harassment/statistics & numerical data, Absenteeism, Adult, Data Collection, Employee Performance Appraisal, Female, Humans, Job Satisfaction, Longitudinal Studies, Military Personnel/psychology, Motivation, Proportional Hazards Models, Sexual Harassment/psychology, Time Factors
ABSTRACT
This study investigated the psychometric properties of 3 frequently administered emotional intelligence (EI) scales (Wong and Law Emotional Intelligence Scale [WLEIS], Schutte Self-Report Emotional Intelligence Test [SEIT], and Trait Emotional Intelligence Questionnaire [TEIQue]), which were developed on the basis of different theoretical frameworks (i.e., ability EI and mixed EI). By conducting item response theory (IRT) analyses, the authors examined the item parameters and compared the fits of 2 response process models (i.e., dominance model and ideal point model) for these scales with data from a sample of 355 undergraduates recruited from the subject pool. Several important findings were obtained. First, the EI scales seem better able to differentiate individuals at low trait levels than at high trait levels. Second, a dominance model showed better model fit to the self-report ability EI scale (WLEIS) and also fit better with most subfactors of the SEIT, except for the mood regulation/optimism factor. Both dominance and ideal point models fit a self-report mixed EI scale (TEIQue). Our findings suggest (a) the EI scales should be revised to include more items at moderate and higher trait levels; and (b) the nature of the EI construct should be considered during the process of scale development.
Subjects
Emotional Intelligence, Intelligence Tests, Adult, Female, Humans, Male, Psychological Models, Statistical Models, Psychological Theory, Psychometrics, Self Report, Young Adult
ABSTRACT
The Internet has significantly changed the way people conduct business, communicate, and live. In this article, the authors' focus is on how the Internet influences the practice of psychology as it relates to testing and assessment. The report includes 5 broad sections: background and context, new problems yet old issues, issues for special populations, ethical and professional issues, and recommendations for the future. Special attention is paid to implications for people with disabling conditions and culturally and linguistically diverse persons. The authors conclude that ethical responsibilities of psychologists and current psychometric standards, particularly those regarding test reliability and validity, apply even though the way in which the tests are developed and used may be quite different.
Subjects
Internet/instrumentation, Psychological Tests, Psychology/methods, Cultural Diversity, Culture, Humans, Language
ABSTRACT
Researchers have studied whether there are classes of people who differ systematically in the way they respond to polytomous ordered scales with a middle category such as "?". The mixed-partial credit model was fitted to a number of scales of a personality questionnaire. Most of the scales fit better with the use of 2 latent subpopulations. The most consistent difference among the latent classes was related to the functioning of the middle response category. For most of the examinees, the probability of choosing the middle category was very close to zero, but a nonnegligible percentage of people selected this category with much higher probability. The total scores from the 2 subpopulations were incommensurate. Some personality factors contributed to explaining class membership.