ABSTRACT
BACKGROUND: Nonsuicidal self-injury (NSSI) combined with suicide ideation increases the risk of suicidal behaviors. Depression and posttraumatic stress disorder (PTSD) are comorbidities of NSSI compounding this relationship. The present study compared diagnostic subgroups of NSSI based on current depression and PTSD on psychological correlates (i.e., vulnerabilities and impairment) and suicidal presentations (i.e., suicidal cognitions and behaviors) in a psychiatric sample of adolescents. METHODS: Eighty-seven adolescents meeting DSM-5 criteria for NSSI and 104 age-range-matched nonclinical controls (NC) participated. Participants completed self-report measures on psychological vulnerabilities and impairment (e.g., emotion regulation difficulties, negative cognitions). Adolescents with NSSI also completed clinical interviews on psychiatric diagnoses and a recent self-injurious behavior (SIB). Scores on the psychological correlates of NSSI were compared between adolescents with NSSI and NC, and across three diagnostic subgroups of NSSI (A: NSSI+/depression-/PTSD-, n = 14; B: NSSI+/depression+/PTSD-, n = 57; C: NSSI+/depression+/PTSD+, n = 14). Differences between NSSI diagnostic subgroups were tested on the motives for SIB and accompanying suicidal presentations (e.g., desire, intent, motive, lethality). RESULTS: Common comorbidities of NSSI included depression, panic disorder, generalized anxiety disorder, and PTSD. The NSSI subgroup classification was significantly associated with panic disorder, which was controlled for in the subsequent group comparisons. Overall, adolescents who engage in NSSI with vs. without depression reported more psychological vulnerabilities and impairment and a greater degree of suicidal thoughts/desire in SIB (i.e., groups B, C >A), which remained significant after controlling for panic disorder. 
Increased odds of a suicidal motive for SIB were found in adolescents with all three conditions (i.e., group C: NSSI+/depression+/PTSD+) compared to those with NSSI but neither depression nor PTSD (i.e., group A: NSSI+/depression-/PTSD-); however, this was not significant after controlling for panic disorder. CONCLUSIONS: Psychological underpinnings of adolescent NSSI in clinical contexts may be largely associated with concurrent depression. Suicidal motives in adolescents who engage in NSSI in the presence of depression and PTSD may be confounded by the co-occurrence of panic disorder. This study underscores the importance of attending to comorbid depression with NSSI in adolescents, as it is related to an increase in suicidal desire accompanying SIB.
Subject(s)
Self-Injurious Behavior , Stress Disorders, Post-Traumatic , Humans , Adolescent , Suicidal Ideation , Stress Disorders, Post-Traumatic/diagnosis , Stress Disorders, Post-Traumatic/epidemiology , Depression/diagnosis , Self-Injurious Behavior/diagnosis , Self-Injurious Behavior/psychology , Anxiety Disorders , Risk Factors
ABSTRACT
BACKGROUND: Evaluating the activities of daily living (ADL) is an important factor for diagnosing dementia. The Everyday Cognition (ECog) scale was developed to measure ADL changes that were correlated with specific neuropsychological impairments. A short form of the ECog (ECog-12) was also developed, consisting of 12 items, two from each of the six cognitive domains of the ECog. The Korean full version of ECog (K-ECog) has recently been standardized, but the need for a shortened version has been raised in clinical practice. The purpose of this study was to develop a Korean version of ECog-12 (K-ECog-12) and to verify its reliability and validity by comparing those to the full version of K-ECog. METHODS: The participants were 267 cognitively normal older adults (CN), 183 patients with mild cognitive impairment (MCI), and 89 patients with dementia. The Korean-Mini Mental State Examination (K-MMSE), Korean-Montreal Cognitive Assessment (K-MoCA), and Short form of Geriatric Depression Scale (SGDS) were administered to all participants. The K-ECog and Korean-Instrumental Activities of Daily Living (K-IADL) were rated by the informants of patients. RESULTS: K-ECog-12 was newly constructed by replacing one item for the visuospatial function in the original ECog-12 with another one through an item response theory analysis on Korean data. The internal consistencies (Cronbach's α) of K-ECog-12 and K-ECog were 0.95 and 0.99, respectively. The test-retest reliabilities (Pearson's r) were 0.67 for K-ECog-12 and 0.73 for K-ECog. The K-ECog-12 was significantly correlated with K-ECog as well as K-IADL, K-MMSE, and K-MoCA. The K-ECog-12 scores differed significantly between the CN, MCI, and dementia groups, as did the K-ECog scores. Receiver operating characteristic curve analyses showed that K-ECog-12, like K-ECog, could differentiate MCI and dementia patients from CN as well. CONCLUSION: The K-ECog-12 is as reliable and valid as the K-ECog in assessing ADL. 
Therefore, K-ECog-12 can be used as an alternative to the K-ECog in clinical and community settings in Korea.
Subject(s)
Cognitive Dysfunction , Dementia , Humans , Aged , Dementia/diagnosis , Activities of Daily Living/psychology , Reproducibility of Results , Neuropsychological Tests , Cognitive Dysfunction/diagnosis , Cognition , Republic of Korea
ABSTRACT
PURPOSE: This study aimed to compare and evaluate the efficiency and accuracy of computerized adaptive testing (CAT) under 2 stopping rules (standard error of measurement [SEM]=0.30 and 0.25) using both real and simulated data in medical examinations in Korea. METHODS: This study employed post-hoc simulation and real data analysis to explore the optimal stopping rule for CAT in medical examinations. The real data were obtained from the responses of 3rd-year medical students during examinations in 2020 at Hallym University College of Medicine. Simulated data were generated in R using parameters estimated from a real item bank. Outcome variables included the number of examinees passing or failing with SEM values of 0.25 and 0.30, the number of items administered, and the correlation between ability estimates. The consistency of the real CAT results was evaluated by examining the consistency of pass/fail decisions based on a cut score of 0.0. The efficiency of all CAT designs was assessed by comparing the average number of items administered under both stopping rules. RESULTS: Both SEM 0.25 and SEM 0.30 provided a good balance between accuracy and efficiency in CAT. The real data showed minimal differences in pass/fail outcomes between the 2 SEM conditions, with a high correlation (r=0.99) between ability estimates. The simulation results confirmed these findings, indicating similar average item numbers between real and simulated data. CONCLUSION: The findings suggest that both SEM 0.25 and 0.30 are effective termination criteria in the context of the Rasch model, balancing accuracy and efficiency in CAT.
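To make the SEM-based stopping rule concrete, the following is a minimal Python sketch of a Rasch-model CAT that stops once the SEM of the ability estimate falls to a target value. It is an illustration under stated assumptions, not the study's procedure: the 300-item bank, the MAP estimator with a standard normal prior, and the names `rasch_prob`, `estimate_theta`, and `run_cat` are all hypothetical, and the study itself worked in R.

```python
import math
import random


def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def estimate_theta(responses, difficulties, prior_sd=1.0, iters=25):
    """MAP ability estimate (normal prior is an assumption) and its SEM."""
    theta = 0.0
    prior_prec = 1.0 / prior_sd ** 2
    for _ in range(iters):
        probs = [rasch_prob(theta, b) for b in difficulties]
        grad = sum(x - p for x, p in zip(responses, probs)) - theta * prior_prec
        info = sum(p * (1.0 - p) for p in probs) + prior_prec
        theta += grad / info  # Newton-Raphson step
    probs = [rasch_prob(theta, b) for b in difficulties]
    info = sum(p * (1.0 - p) for p in probs) + prior_prec
    return theta, 1.0 / math.sqrt(info)


def run_cat(true_theta, bank, sem_target, rng):
    """Administer items until SEM <= sem_target or the bank is exhausted."""
    remaining = list(bank)
    used, responses = [], []
    theta, sem = 0.0, float("inf")
    while remaining and sem > sem_target:
        # Under the Rasch model, the most informative item is the one whose
        # difficulty is closest to the current ability estimate.
        b = min(remaining, key=lambda d: abs(d - theta))
        remaining.remove(b)
        # Simulate the examinee's response from the (hypothetical) true ability.
        x = 1 if rng.random() < rasch_prob(true_theta, b) else 0
        used.append(b)
        responses.append(x)
        theta, sem = estimate_theta(responses, used)
    return theta, sem, len(used)


# Hypothetical 300-item bank with difficulties spread evenly over [-3, 3].
BANK = [-3.0 + 6.0 * i / 299 for i in range(300)]

for target in (0.30, 0.25):
    est, sem, n_items = run_cat(0.5, BANK, target, random.Random(2020))
    print(f"SEM target {target:.2f}: theta={est:.2f}, SEM={sem:.3f}, items={n_items}")
```

Because each Rasch item contributes at most 0.25 units of information, a tighter SEM target necessarily requires more items, which is the efficiency and accuracy trade-off the abstract evaluates.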
Subject(s)
Educational Measurement , Psychometrics , Students, Medical , Humans , Educational Measurement/methods , Educational Measurement/standards , Republic of Korea , Psychometrics/methods , Computer Simulation , Data Analysis , Education, Medical, Undergraduate/methods , Male , Female
ABSTRACT
BACKGROUND: Examining the daily experiences of older adults with depression facilitates the development and application of personalized effective treatments for them. In previous clinical research on depression, traditional mean-based approaches have mainly been employed. However, the within-person residual variance as a random effect provides greater insight into the heterogeneity of daily experiences among geriatric samples. OBJECTIVE: This study aimed to examine the relationship between depression and daily vitality in older adults. Specifically, it focused on the mean and residual variance of daily vitality measured by the Ecological Momentary Assessment (EMA). METHODS: Data from 64 older adults aged 65 years or more, who participated in community welfare centers or retirees' associations, were used. Daily vitality was examined using EMA surveys for seven consecutive days (four random surveys per day). The data were analyzed using a location-scale model. RESULTS: The intraclass correlation computed from the empty model for the EMA data was 0.488, indicating significant variances in daily vitality across time between individuals. Older adults with higher levels of depressive symptoms showed low mean levels of daily vitality and a large log-residual variance of daily vitality. CONCLUSIONS: The findings from the current study suggest that individuals experiencing depression not only exhibit low vitality in their daily lives but also struggle to maintain stable levels of vitality in their lives. These insights could contribute to the facilitation and advancement of personalized interventions tailored for older adults.
Subject(s)
Depression , Ecological Momentary Assessment , Humans , Aged , Depression/epidemiology , Depression/diagnosis , Multilevel Analysis , Surveys and Questionnaires
ABSTRACT
Non-suicidal self-injury (NSSI) among adolescents continues to be a significant public health concern worldwide. A recent systematic review and meta-analysis found that the global prevalence of NSSI in adolescents aged 12-18 years was 17.2%, with higher rates reported among females (19.7%) than males (14.8%). This behavior has been linked to several negative outcomes, such as depression, anxiety, substance abuse, and suicidal ideation. The present study aimed to classify adolescents based on intrapersonal and interpersonal factors associated with NSSI proposed in Nock's (2009) integrated model of NSSI, to identify distinct clusters targeting specific risk factors. These factors encompassed negative cognition, emotional vulnerability, poor coping skills, peer victimization, family adaptability, and perceived stress. A total of 881 adolescents aged 11-16 years in South Korea completed self-report questionnaires on automatic thoughts, depression, emotional regulation, peer victimization, family adaptability, and perceived stress. Latent profile analysis (LPA) revealed three distinct classes: "the severe group", "the moderate group", and "the mild group". Class 3 ("severe group": N = 127) exhibited greater severity related to NSSI, including negative cognition, emotional vulnerability, poor coping skills, peer victimization, and perceived stress, with weaker levels of factors that can prevent NSSI, compared to class 1 ("mild group": N = 416) and class 2 ("moderate group": N = 338). The present study emphasizes the importance of considering both intrapersonal factors (e.g., negative automatic thoughts and emotional dysregulation) and interpersonal factors (i.e., peer victimization) when understanding NSSI among adolescents. These findings can be utilized to develop interventions aimed at reducing the prevalence and severity of NSSI among adolescents.
ABSTRACT
PURPOSE: This study aimed to develop a test scale to measure the character qualities of medical students as a follow-up study on the 8 core character qualities revealed in a previous report. METHODS: In total, 160 preliminary items were developed to measure 8 core character qualities. Twenty questions were assigned to each quality, and a questionnaire survey was conducted among 856 students in 5 medical schools in Korea. Using the partial credit model, polytomous item response theory analysis was carried out to analyze the goodness-of-fit, followed by exploratory factor analysis. Finally, confirmatory factor and reliability analyses were conducted with the final selected items. RESULTS: The preliminary items for the 8 core character qualities were administered to the participants. Data from 767 students were included in the final analysis. Of the 160 preliminary items, 25 were removed by classical test theory analysis and 17 more by polytomous item response theory assessment. A total of 118 items and sub-factors were selected for exploratory factor analysis. Finally, 79 items were selected, and the validity and reliability were confirmed through confirmatory factor analysis and intra-item relevance analysis. CONCLUSION: The character qualities test scale developed through this study can be used to measure the character qualities corresponding to the educational goals and visions of individual medical schools in Korea. Furthermore, this measurement tool can serve as primary data for developing character qualities tools tailored to each medical school's vision and educational goals.
Subject(s)
Students, Medical , Humans , Follow-Up Studies , Reproducibility of Results , Factor Analysis, Statistical , Republic of Korea
ABSTRACT
Despite the rapidly increasing rate of non-suicidal self-injury (NSSI) among adolescents, there is a dearth of culturally appropriate psychological measures for NSSI screening among adolescents in Asian countries. This study aimed to develop and validate the Self-Harm Screening Inventory (SHSI), a culturally sensitive and suitable scale for screening adolescents for NSSI. In total, 514 Korean adolescents (aged 12-16 years) were recruited nationwide. All participants gave informed consent and completed the online self-report measures on NSSI, depression, anxiety, and self-esteem. Thereafter, preliminary items were developed through a series of steps: literature review, ratings of experts on self-harm and suicide, and statistical analyses. Ten of the 20 preliminary items were eliminated after exploratory factor analysis due to low endorsement and low factor loadings (less than .70). The final version of the SHSI comprised 10 binary items relating to self-harm behaviors within the past year (e.g., cut my body with sharp objects, hit my body). A confirmatory factor analysis supported a one-factor structure, as hypothesized. The one-factor model had a good model fit (χ²(35) = 84.958, p < .001, RMSEA = .053, CFI = .981, TLI = .975, SRMR = .124). The SHSI also had good internal consistency (Cronbach's alpha = .795) and 4-week test-retest reliability (r = .786, p < .01). The SHSI had high correlations with another self-harm related scale, the Self-Harm Inventory (r = .773, p < .01), and moderate correlations with the Child Depression Inventory (r = .484, p < .01) and Revised Children's Manifest Anxiety Scale (r = .433, p < .01). Additionally, the SHSI was negatively correlated with the Rosenberg Self-Esteem Scale (r = -.399, p < .01). The findings indicate that the SHSI is a reliable and valid measure for the screening of self-harm behaviors among adolescents.
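As a reminder of how the internal consistency coefficient reported above is defined, here is a minimal Python sketch of Cronbach's alpha for a respondents-by-items score matrix. The function name `cronbach_alpha` and the toy data are hypothetical; the study's own data and software are not shown in the abstract.

```python
def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondent rows (one score per item)."""
    k = len(scores[0])  # number of items

    def sample_var(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

    # Variance of each item across respondents, and of the total scores.
    item_vars = [sample_var([row[j] for row in scores]) for j in range(k)]
    total_var = sample_var([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)


# Hypothetical binary (0/1) responses from five respondents to four items.
toy = [
    [1, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
]
print(round(cronbach_alpha(toy), 3))
```

For binary items such as those in the SHSI, this coefficient is closely related to the Kuder-Richardson 20 formula.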
Subject(s)
Psychology, Adolescent , Psychometrics/methods , Self-Injurious Behavior/diagnosis , Adolescent , Anxiety/pathology , Child , Depression/pathology , Female , Humans , Male , Republic of Korea , Self Concept , Self Report , Self-Injurious Behavior/epidemiology , Surveys and Questionnaires
ABSTRACT
PURPOSE: Diagnostic classification models (DCMs) were developed to identify the mastery or non-mastery of the attributes required for solving test items, but their application has been limited to very low-level attributes, and the accuracy and consistency of high-level attributes using DCMs have rarely been reported compared with classical test theory (CTT) and item response theory models. This paper compared the accuracy of high-level attribute mastery between deterministic inputs, noisy "and" gate (DINA) and Rasch models, along with sub-scores based on CTT. METHODS: First, a simulation study explored the effects of attribute length (number of items per attribute) and the correlations among attributes with respect to the accuracy of mastery. Second, a real-data study examined model and item fit and investigated the consistency of mastery for each attribute among the 3 models using the 2017 Korean Medical Licensing Examination with 360 items. RESULTS: Accuracy of mastery increased with a higher number of items measuring each attribute across all conditions. The DINA model was more accurate than the CTT and Rasch models for attributes with high correlations (>0.5) and few items. In the real-data analysis, the DINA and Rasch models generally showed better item fits and appropriate model fit. The consistency of mastery between the Rasch and DINA models ranged from 0.541 to 0.633 and the correlations of person attribute scores between the Rasch and DINA models ranged from 0.579 to 0.786. CONCLUSION: Although all 3 models provide a mastery decision for each examinee, the individual mastery profile using the DINA model provides more accurate decisions for attributes with high correlations than the CTT and Rasch models. The DINA model can also be directly applied to tests with complex structures, unlike the CTT and Rasch models, and it provides different diagnostic information from the CTT and Rasch models.
Subject(s)
Psychometrics , Computer Simulation , Humans , Republic of Korea
ABSTRACT
A close link has been established between self-harm and suicide risk in adolescents, and increasing attention is given to social media as possibly involved in this relationship. It is important to identify indicators of suicidality (i.e., suicide ideation or attempt), including aspects related to contagion in online and offline social networks, and to explore the role of social media in the relationship between social circumstances and suicidality in young adolescents with self-harm. This study explored characteristics of Korean adolescents with a recent history of self-harm and identified how behavioral and social features explain lifetime suicidality, with emphasis on the impact of social media. Data came from a nationwide online survey among sixth- to ninth-graders with self-harm during the past 12 months (n = 906). We used χ² tests of independence to explore potential concomitants of lifetime suicidality and employed a multivariate logistic regression model to examine the relationship between the explanatory variables and suicidality. Sensitivity analyses were performed with lifetime suicide attempt in place of lifetime suicidality. In total, 33.9% (n = 306) and 71.2% (n = 642) reported having started self-harm by the time they were fourth- and sixth-graders, respectively; 44.3% (n = 400) reported that they have friends who self-harm. Having endorsed moderate/severe forms and multiple forms of self-harm (OR 5.36, p < 0.001; OR 3.13, p < 0.001), having engaged in self-harm for two years or more (OR 2.42, p = 0.001), having friends who self-harm (OR 1.92, p = 0.013), and having been bullied at school (OR 2.08, p = 0.004) were associated with increased odds of lifetime suicidality. Notably, having posted content about one's self-harm on social media during the past 12 months was associated with increased odds of lifetime suicidality (OR 3.15, p < 0.001), whereas having seen related content in the same period was not.
Sensitivity analyses yielded similar results with lifetime suicide attempt, supporting our findings from the logistic regression. The current study suggests that self-harm may be prevalent from early adolescence in South Korea, with assortative gathering. The relationship of vulnerable adolescents' social circumstances to suicide risk may be compounded by the role of social media. As the role of social media can be linked to both risk (i.e., contagion) and benefit (i.e., social connection and support), pre-existing vulnerabilities alongside self-harm and what online communication centers on should be a focus of clinical attention.
ABSTRACT
PURPOSE: The deterministic inputs, noisy "and" gate (DINA) model is a promising statistical method for providing useful diagnostic information about a student's level of achievement. Diagnostic information is a core element for improving learning, as opposed to selection. Educators often want diagnostic information on how a given examinee performed on each content strand, called a diagnostic profile. The purpose of this paper is to classify examinees in different content domains using the DINA model. METHODS: This paper analyzed data from the Korean Medical Licensing Examination (KMLE) with 360 items and 3,259 examinees. The application study estimated examinee parameters as well as item characteristics. The guessing and slipping parameters of each item were estimated. The DINA model was used for the statistical analysis. RESULTS: The output table shows examples of some items, which can be used to check item quality. In addition, the probability of mastery in each content domain was estimated, which indicates the mastery profile of each examinee. Classification accuracy for the 8 content domains ranged from .849 to .972, and classification consistency ranged from .839 to .994. As a result, classification reliability in a cognitive diagnostic model (CDM) was very high for the 8 content domains of the KMLE. CONCLUSION: This mastery profile can be useful diagnostic information for each examinee in terms of the content domains of the KMLE. The individual mastery profile allows educators and examinees to understand which domain(s) should be improved to master all domains of the KMLE. In addition, the results showed that all items were at a reasonable level with respect to their item parameters.
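The role of the guessing and slipping parameters can be illustrated with a short Python sketch of the standard DINA response function, followed by a brute-force posterior over mastery profiles. This is only a sketch under assumptions: the uniform prior over profiles, the toy Q-matrix, the parameter values, and the names `dina_prob` and `mastery_probabilities` are hypothetical and are not taken from the KMLE analysis.

```python
from itertools import product


def dina_prob(alpha, q_row, guess, slip):
    """P(correct) for one examinee profile `alpha` on one item under DINA."""
    # eta is True only if every attribute the item requires is mastered.
    eta = all(a == 1 for a, q in zip(alpha, q_row) if q == 1)
    return 1.0 - slip if eta else guess


def mastery_probabilities(responses, q_matrix, guess, slip):
    """Posterior probability of mastery per attribute (uniform prior assumed)."""
    n_attr = len(q_matrix[0])
    weights = {}
    for alpha in product((0, 1), repeat=n_attr):
        likelihood = 1.0
        for x, q_row, g, s in zip(responses, q_matrix, guess, slip):
            p = dina_prob(alpha, q_row, g, s)
            likelihood *= p if x == 1 else 1.0 - p
        weights[alpha] = likelihood
    total = sum(weights.values())
    # Marginalize the profile posterior to one probability per attribute.
    return [
        sum(w for alpha, w in weights.items() if alpha[k] == 1) / total
        for k in range(n_attr)
    ]


# Hypothetical example: 4 items, 2 attributes (content domains).
Q = [[1, 0], [1, 0], [0, 1], [0, 1]]
GUESS = [0.1, 0.1, 0.1, 0.1]
SLIP = [0.1, 0.1, 0.1, 0.1]
print(mastery_probabilities([1, 1, 0, 0], Q, GUESS, SLIP))
```

Thresholding each attribute's posterior probability (for example at 0.5) would yield the kind of per-domain mastery profile the abstract describes.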
Subject(s)
Models, Statistical , Humans , Probability , Psychometrics , Reproducibility of Results , Republic of Korea
ABSTRACT
This study introduces LIVECAT, a web-based computerized adaptive testing platform. This platform provides many functions, including writing item content, managing an item bank, creating and administering a test, reporting test results, and providing information about a test and examinees. LIVECAT provides examination administrators with an easy and flexible environment for composing and managing examinations. It is available at http://www.thecatkorea.com/. Several tools were used to program LIVECAT, as follows: operating system, Amazon Linux; web server, nginx 1.18; WAS, Apache Tomcat 8.5; database, Amazon RDMS-Maria DB; and languages, JAVA8, HTML5/CSS, Javascript, and jQuery. The LIVECAT platform can be used to implement several item response theory (IRT) models, such as the Rasch and 1-, 2-, and 3-parameter logistic models. The administrator can choose a specific model for test construction in LIVECAT. Multimedia data such as images, audio files, and movies can be uploaded to items in LIVECAT. Two scoring methods (maximum likelihood estimation and expected a posteriori) are available in LIVECAT, and the maximum Fisher information item selection method is applied to every IRT model in LIVECAT. The LIVECAT platform showed equal or better performance compared with a conventional test platform. The LIVECAT platform enables users without psychometric expertise to easily implement and perform computerized adaptive testing at their institutions. The most recent LIVECAT version provides only dichotomous item response models and the basic components of CAT. In the near future, LIVECAT will include advanced functions, such as polytomous item response models, the weighted likelihood estimation method, and content balancing.
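The maximum Fisher information item selection rule mentioned above can be sketched in a few lines of Python using the textbook 3-parameter logistic information function (the 1- and 2-parameter models are special cases). This is not LIVECAT's source code: the function names, the `(a, b, c)` tuple layout, and the toy item bank are assumptions made for illustration, and no scaling constant is applied to the logistic function.

```python
import math


def p_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b, c=0.0):
    """Fisher information of a 3PL item at ability theta."""
    p = p_3pl(theta, a, b, c)
    return a ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def select_item(theta, items, administered):
    """Index of the not-yet-administered item with maximum information."""
    candidates = [i for i in range(len(items)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *items[i]))


# Hypothetical bank: (discrimination a, difficulty b, guessing c) per item.
bank = [(1.0, -2.0, 0.0), (1.0, 0.1, 0.0), (1.0, 1.0, 0.0), (1.5, 0.5, 0.2)]
print(select_item(0.0, bank, administered=set()))
```

With `c = 0` and `a = 1`, the information reduces to `p * (1 - p)`, so the rule picks the item whose difficulty is nearest the current ability estimate.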
Subject(s)
Algorithms , Software , Computers , Humans , Internet , Psychometrics
ABSTRACT
Several methods of factor extraction have recently gained popularity as a procedure for dealing with estimation problems associated with small sample sizes, which can be found in the various behavioral science disciplines, such as comparative psychology and behavior genetics. Two popular approaches for particularly small samples (below 50) include unweighted least squares factor analysis (ULS-FA) and regularized exploratory factor analysis (REFA). However, it is unclear how well each of the approaches performs with small samples in the context of exploratory bifactor modeling. In the current study, a comprehensive simulation study was conducted to evaluate the small sample behavior of the two approaches in terms of bifactor structure recovery under different sample size, factor loading, number of variables per factor, number of factors, and factor correlation experimental conditions. The results show that REFA is recommended for use over ULS-FA, particularly in the conditions involving low factor loadings, few group factors, or a small number of variables per factor.
ABSTRACT
The arithmetic mean, harmonic mean, and Jensen equality were applied to marginalize observed standard errors (OSEs) to estimate CAT reliability. Based on the different marginalization methods, three empirical CAT reliabilities were compared with true reliabilities. Results showed that the three empirical CAT reliabilities underestimated true reliability for short test lengths (<40 items), whereas the magnitude of the CAT reliabilities followed the order Jensen equality, harmonic mean, and arithmetic mean when the mean of the ability population distribution was zero. Specifically, Jensen equality overestimated true reliability when the number of items was over 40 and the mean of the ability population distribution was zero. However, Jensen equality was recommended for computing reliability estimates because it was closer to true reliability even when a small number of items was administered, regardless of the mean of the ability population distribution, and it can be computed easily using a single test information value at θ = 0. Although CAT is efficient and accurate compared to a fixed-form test, a small fixed number of items is not recommended as a CAT termination criterion for the 2PLM, and especially for the 3PLM, if high reliability estimates are to be maintained.
ABSTRACT
PURPOSE: Computerized adaptive testing (CAT) has been adopted in licensing examinations because it improves the efficiency and accuracy of the tests, as shown in many studies. This simulation study investigated CAT scoring and item selection methods for the Korean Medical Licensing Examination (KMLE). METHODS: This study used a post-hoc (real data) simulation design. The item bank used in this study included all items from the January 2017 KMLE. All CAT algorithms for this study were implemented using the 'catR' package in the R program. RESULTS: In terms of accuracy, the Rasch and 2-parameter logistic (PL) models performed better than the 3PL model. The 'modal a posteriori' and 'expected a posteriori' methods provided more accurate estimates than maximum likelihood estimation or weighted likelihood estimation. Furthermore, maximum posterior weighted information and minimum expected posterior variance performed better than other item selection methods. In terms of efficiency, the Rasch model is recommended to reduce test length. CONCLUSION: Before implementing live CAT, a simulation study should be performed under varied test conditions. Based on the results of such a simulation study, specific scoring and item selection methods should be predetermined.
Subject(s)
Computer Simulation , Computers , Licensure/standards , Algorithms , Educational Measurement/methods , Humans , Models, Statistical , Republic of Korea
ABSTRACT
PURPOSE: This study aimed to find the best way of developing equivalent item sets and to propose a stable and effective management plan for periodic licensing examinations. METHODS: Five pre-equated item sets were developed based on the predicted correct answer rate of each item by using linear programming. These pre-equated item sets were compared to those developed with a random item selection method, based on the actual answer rate and the difficulty from item response theory (IRT). The results with and without common items were also compared in the same way. ACAR and the IRT difficulty were used to determine whether there was a significant difference between pre-equating conditions. RESULTS: There was a statistically significant difference in IRT difficulty among the results from different pre-equated conditions. When the predicted correct answer rate was divided into 2 or 3 difficulty boundaries, the 5 item sets were constructed with equal actual answer rates and IRT difficulty parameters. In comparing item set conditions with and without common items, including common items did not contribute much to the equating of the 5 item sets. CONCLUSION: This study suggested that the linear programming method is applicable for constructing equated item sets that reflect each content area. The suggested best method for constructing equated item sets is to divide the predicted correct answer rate into 2 or 3 difficulty boundaries, regardless of common items. If pre-equated item sets are required to construct a test based on actual data, several optimal methods should be considered through simulation studies before administering a real test.
Subject(s)
Computers , Education, Medical/standards , Educational Measurement/methods , Licensure, Medical/standards , Educational Measurement/standards , Humans , Republic of Korea , Students, Medical
ABSTRACT
Computerized adaptive testing (CAT) has been implemented in high-stakes examinations such as the National Council Licensure Examination-Registered Nurses in the United States since 1994. Subsequently, the National Registry of Emergency Medical Technicians in the United States adopted CAT for certifying emergency medical technicians in 2007. This was done with the goal of introducing the implementation of CAT for medical health licensing examinations. Most implementations of CAT are based on item response theory, which hypothesizes that both the examinee and items have their own characteristics that do not change. There are 5 steps for implementing CAT: first, determining whether the CAT approach is feasible for a given testing program; second, establishing an item bank; third, pretesting, calibrating, and linking item parameters via statistical analysis; fourth, determining the specification for the final CAT related to the 5 components of the CAT algorithm; and finally, deploying the final CAT after specifying all the necessary components. The 5 components of the CAT algorithm are as follows: item bank, starting item, item selection rule, scoring procedure, and termination criterion. CAT management includes content balancing, item analysis, item scoring, standard setting, practice analysis, and item bank updates. Remaining issues include the cost of constructing CAT platforms and deploying the computer technology required to build an item bank. In conclusion, in order to ensure more accurate estimations of examinees' ability, CAT may be a good option for national licensing examinations. Measurement theory can support its implementation for high-stakes examinations.
Subject(s)
Certification , Computer-Assisted Instruction , Educational Measurement/methods , Licensure , Algorithms , Computer-Assisted Instruction/standards , Humans , Models, Theoretical , Psychometrics/methods
ABSTRACT
PURPOSE: The dimensionality of examinations provides empirical evidence of the internal test structure underlying the responses to a set of items. In turn, the internal structure is an important piece of evidence of the validity of an examination. Thus, the aim of this study was to investigate the performance of the DETECT program and to use it to examine the internal structure of the Korean nursing licensing examination. METHODS: Non-parametric methods of dimensional testing, such as the DETECT program, have been proposed as ways of overcoming the limitations of traditional parametric methods. A non-parametric method (the DETECT program) was investigated using simulation data under several conditions and applied to the Korean nursing licensing examination. RESULTS: The DETECT program performed well in terms of determining the number of underlying dimensions under several different conditions in the simulated data. Further, the DETECT program correctly revealed the internal structure of the Korean nursing licensing examination, meaning that it detected the proper number of dimensions and appropriately clustered the items within each dimension. CONCLUSION: The DETECT program performed well in detecting the number of dimensions and in assigning items for each dimension. This result implies that the DETECT method can be useful for examining the internal structure of assessments, such as licensing examinations, that possess relatively many domains and content areas.
Subject(s)
Clinical Competence/standards , Educational Measurement/methods , Licensure, Nursing , Models, Statistical , Education, Nursing, Baccalaureate , Educational Measurement/standards , Female , Humans , Psychometrics , Republic of Korea
ABSTRACT
This study aims to explore the influences of personal, vocational, and job environment related factors that are associated with job satisfaction of individuals with disabilities in South Korea. Data for wage-based working employees were obtained from a nationwide survey, which resulted in a total of 417 participants. The six hypotheses and the mediation effects of personal and work-related environmental factors were tested using structural equation modeling based on existing research evidence. Results revealed that (a) life satisfaction and job-related environments directly influenced job satisfaction; (b) the relationship between personal experience and job satisfaction was mediated by life satisfaction for both the mild/moderate and severe/profound disability groups; and (c) the mediating role of job environment between vocational preparedness and job satisfaction was only observed for individuals with mild/moderate disabilities. A summary of findings and implications for future research and practice are discussed.
Subject(s)
Persons with Disabilities , Employment , Job Satisfaction , Social Environment , Adult , Female , Humans , Male , Middle Aged , Personal Satisfaction , Republic of Korea , Severity of Illness Index , Surveys and Questionnaires , Young Adult
ABSTRACT
Most computerized adaptive tests (CATs) have been studied using the framework of unidimensional item response theory. However, many psychological variables are multidimensional and might benefit from using a multidimensional approach to CATs. This study investigated the accuracy, fidelity, and efficiency of a fully multidimensional CAT algorithm (MCAT) with a bifactor model using simulated data. Four item selection methods in MCAT were examined for three bifactor pattern designs using two multidimensional item response theory models. To compare MCAT item selection and estimation methods, a fixed test length was used. The Ds-optimality item selection improved θ estimates with respect to a general factor, and either D- or A-optimality improved estimates of the group factors in three bifactor pattern designs under two multidimensional item response theory models. The MCAT model without a guessing parameter functioned better than the MCAT model with a guessing parameter. The MAP (maximum a posteriori) estimation method provided more accurate θ estimates than the EAP (expected a posteriori) method under most conditions, and MAP showed lower observed standard errors than EAP under most conditions, except for a general factor condition using Ds-optimality item selection.