Results 1 - 20 of 60
1.
Behav Res Methods ; 56(2): 600-614, 2024 Feb.
Article in English | MEDLINE | ID: mdl-36750522

ABSTRACT

Multidimensional computerized adaptive testing for forced-choice items (MFC-CAT) combines the benefits of multidimensional forced-choice (MFC) items and computerized adaptive testing (CAT) in that it eliminates response biases and reduces administration time. Previous studies that explored designs of MFC-CAT only discussed item selection methods based on the Fisher information (FI), which is known to perform unstably at early stages of CAT. This study proposes a set of new item selection methods based on the KL information for MFC-CAT (namely MFC-KI, MFC-KB, and MFC-KLP) based on the Thurstonian IRT (TIRT) model. Three simulation studies, including one based on real data, were conducted to compare the performance of the proposed KL-based item selection methods against the existing FI-based methods in three- and five-dimensional MFC-CAT scenarios with various test lengths and inter-trait correlations. Results demonstrate that the proposed KL-based item selection methods are feasible for MFC-CAT and generate acceptable trait estimation accuracy and uniformity of item pool usage. Among the three proposed methods, MFC-KB and MFC-KLP outperformed the existing FI-based item selection methods and resulted in the most accurate trait estimation and relatively even utilization of the item pool.


Subjects
Computerized Adaptive Testing , Humans , Computer Simulation
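
The contrast between Fisher information (FI) and Kullback-Leibler (KL) information in the abstract above can be illustrated with a much simpler case. The sketch below assumes a unidimensional two-parameter logistic (2PL) item rather than the Thurstonian IRT model used in the study, and the function names (`fisher_info`, `kl_index`, `select_item`) and the interval half-width `delta` are hypothetical choices made for illustration only.

```python
import math

def p_2pl(theta, a, b):
    """Probability of a keyed response under a 2PL model (illustrative assumption)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def fisher_info(theta, a, b):
    """Fisher information of a 2PL item at a single trait value."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def kl_info(theta_hat, theta, a, b):
    """KL divergence between the response distributions at theta_hat and theta."""
    p0, p1 = p_2pl(theta_hat, a, b), p_2pl(theta, a, b)
    return p0 * math.log(p0 / p1) + (1.0 - p0) * math.log((1.0 - p0) / (1.0 - p1))

def kl_index(theta_hat, a, b, delta=1.0, steps=41):
    """Average KL information over a grid on [theta_hat - delta, theta_hat + delta]."""
    grid = [theta_hat - delta + 2.0 * delta * i / (steps - 1) for i in range(steps)]
    return sum(kl_info(theta_hat, t, a, b) for t in grid) / steps

def select_item(theta_hat, pool, criterion):
    """Return the index of the (a, b) item in the pool that maximises the criterion."""
    return max(range(len(pool)), key=lambda i: criterion(theta_hat, *pool[i]))

# Hypothetical pool of (discrimination, difficulty) pairs.
pool = [(0.5, 0.0), (2.0, 0.0), (2.0, 3.0)]
chosen_by_fi = select_item(0.0, pool, fisher_info)
chosen_by_kl = select_item(0.0, pool, kl_index)
```

Because the KL criterion averages over a neighbourhood of the current estimate instead of evaluating a single point, it depends less on the estimate being accurate, which is one plausible reading of why KL-type rules are considered for the early stages of a CAT.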
2.
Value Health ; 25(4): 512-524, 2022 04.
Article in English | MEDLINE | ID: mdl-35227597

ABSTRACT

OBJECTIVES: This article aims to describe the generation and selection of items (stage 2) and face validation (stage 3) of a large international (multilingual) project to develop a new generic measure, the EQ-HWB (EQ Health and Wellbeing), for use in economic evaluation across health, social care, and public health to estimate quality-adjusted life-years. METHODS: Items from commonly used generic, carer, social care, and mental health quality of life measures were mapped onto domains or subdomains identified from a literature review. Potential terms and items were reviewed and refined to ensure coverage of the construct of the domains/subdomain (stage 2). Input on the potential item pool, response options, and recall period was sought from 3 key stakeholder groups. The pool of candidate items was tested in qualitative interviews with potential future users in an international face validation study (stage 3). RESULTS: Stage 2 resulted in the generation of 687 items. Predetermined selection criteria were applied by the research team resulting in 598 items being dropped, leaving 89 items that were reviewed by key stakeholder groups. Face validation (stage 3) tested 97 draft items and 4 response scales. A total of 47 items were retained and 14 were modified, whereas 3 were added to the candidate pool of items. This resulted in a 64-item set. CONCLUSIONS: This international multiculture, multilingual study with a common methodology identified many items that performed well across all countries. These were taken to the psychometric testing along with modified and new items for the EQ-HWB.


Subjects
Caregivers , Quality of Life , Humans , Psychometrics/methods , Quality-Adjusted Life Years , Reproducibility of Results , Surveys and Questionnaires
3.
Value Health ; 25(4): 525-533, 2022 04.
Article in English | MEDLINE | ID: mdl-35365299

ABSTRACT

OBJECTIVES: The development of measures such as the EQ-HWB (EQ Health and Wellbeing) requires selection of items. This study explored the psychometric performance of candidate items, testing their validity in patients, social care users, and carers. METHODS: Paper and online surveys that included candidate items (N = 64) were conducted in Argentina, Australia, China, Germany, United Kingdom, and the United States. Psychometric assessment on missing data, response distributions, and known group differences was undertaken. Dimensionality was explored using exploratory and confirmatory factor analysis. Poorly fitting items were identified using information functions, and the function of each response category was assessed using category characteristic curves from item response theory (IRT) models. Differential item functioning was tested across key subgroups. RESULTS: There were 4879 respondents (Argentina = 508, Australia = 514, China = 497, Germany = 502, United Kingdom = 1955, United States = 903). Where missing data were allowed, the rate was low (UK paper survey 2.3%; US survey 0.6%). Most items had responses distributed across all levels. Most items could discriminate between groups with known health conditions with moderate to large effect sizes. Items were less able to discriminate across carers. Factor analysis found positive and negative measurement factors alongside the constructs of interest. For most of the countries apart from China, the confirmatory factor analysis model had good fit with some minor modifications. IRT indicated that most items had well-functioning response categories but there was some evidence of differential item functioning in many items. CONCLUSIONS: Items performed well in classical psychometric testing and IRT. This large 6-country collaboration provided evidence to inform item selection for the EQ-HWB measure.


Subjects
Caregivers , Factor Analysis, Statistical , Humans , Psychometrics/methods , Surveys and Questionnaires , United Kingdom , United States
4.
Qual Life Res ; 31(1): 25-36, 2022 Jan.
Article in English | MEDLINE | ID: mdl-33983619

ABSTRACT

PURPOSE: Mokken scale analysis (MSA) is an attractive scaling procedure for ordinal data. MSA is frequently used in health-related quality of life research. Two of MSA's prime features are the scalability coefficients and the automated item selection procedure (AISP). The AISP partitions a (large) set of items into scales based on the observed item scores; the resulting scales can be used as measurement instruments. There are two issues in MSA: First, point estimates, standard errors, and test statistics for scalability coefficients are inappropriate for clustered item scores, which are omnipresent in quality of life research data. Second, the AISP insufficiently takes sampling fluctuation of Mokken's scalability coefficients into account. METHODS: We solved both issues by providing point estimates and standard errors for the scalability coefficients for clustered data and by implementing a Wald-based significance test in the AISP algorithm, resulting in a test-guided AISP (T-AISP) that is available for both nonclustered and clustered test scores. RESULTS: We integrated the T-AISP into a two-step, test-guided MSA for scale construction, to guide the analysis for nonclustered and clustered data. The first step is to perform a T-AISP and select the final scale(s). For clustered data, within-group dependency is investigated on the final scale(s). In the second step, the strength of the scale(s) is determined and further analyses are performed. The procedure was demonstrated on clustered item scores obtained from administering a questionnaire on quality of life in schools to 639 students nested in 30 classrooms. CONCLUSIONS: We developed a two-step, test-guided MSA for scale construction that takes into account sampling fluctuation of all scalability coefficients and that can be applied to item scores obtained by a nonclustered or clustered sampling design.


Subjects
Quality of Life , Research Design , Algorithms , Humans , Psychometrics , Quality of Life/psychology , Reproducibility of Results , Surveys and Questionnaires
5.
Qual Life Res ; 30(5): 1425-1432, 2021 May.
Article in English | MEDLINE | ID: mdl-33289063

ABSTRACT

Preference-based measures allow patients to report their level of health, and the responses are then scored using preference weights from a representative general population sample for use in cost utility analysis. The development process of new preference-based measures should ensure that valid items are selected to reflect the constructs of interest included in the measure and that are suitable for use in preference-elicitation exercises. Existing criteria on patient-reported outcome measures (PROMs) development were reviewed, and additional considerations were taken into account in order to generate criteria to support development of new preference-based measures. Criteria covering 22 different aspects related to item selection for preference-based measures are presented. These include criteria related to how items are phrased to ensure accurate completion, the coverage of items in terms of range of domains as well as focus on current outcomes and whether items are suitable for valuation. The criteria are aimed at supporting the development of new preference-based measures with discussion to ensure that even where there is conflict between criteria, issues have been considered at the item selection stage. This would minimize problems at valuation stage by harmonizing established criteria and expanding lists to reflect the unique characteristics of preference-based measures.


Subjects
Cost-Benefit Analysis/methods , Patient Reported Outcome Measures , Quality of Life/psychology , Female , Humans , Male , Surveys and Questionnaires
6.
Allergy ; 75(5): 1165-1177, 2020 05.
Article in English | MEDLINE | ID: mdl-31815297

ABSTRACT

BACKGROUND: Recurrent angioedema (AE) is an important clinical problem in the context of chronic urticaria (mast cell mediator-induced), ACE-inhibitor intake and hereditary angioedema (both bradykinin-mediated). Helping patients gain control of their recurrent AE is a major treatment goal. However, a tool to assess control of recurrent AE is not yet available. This prompted us to develop such a tool, the Angioedema Control Test (AECT). METHODS: After a conceptual framework was developed for the AECT, a list of potential AECT items was generated by a combined approach of patient interviews, literature review and expert input. Subsequent item reduction was based on impact analysis, inter-item correlation, additional predefined criteria for item performance, and a review of the item selection process for content validity. Finally, an instruction section was generated, and a US American English version was developed by a structured translation process. RESULTS: A 4-item AECT with recall periods of 4 weeks and 3 months was developed based on 106 potential items tested in 97 patients with mast cell mediator-induced (n = 49) or bradykinin-mediated recurrent AE (n = 48). Eighty-four items were excluded based on impact analysis. The remaining 22 items were further reduced by a combination of inter-item correlation, additional predefined criteria for item performance and review for content validity. CONCLUSIONS: The AECT is the first tool to assess disease control in recurrent AE patients. Its retrospective approach, its brevity and its simple scoring make the AECT ideally suited for clinical practice and trials. Its validity and reliability need to be determined in future independent studies.


Subjects
Angioedema , Angioedema/diagnosis , Angioedema/epidemiology , Angioedema/etiology , Bradykinin , Humans , Patient Reported Outcome Measures , Reproducibility of Results , Retrospective Studies
7.
Qual Life Res ; 29(9): 2585-2592, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32418061

ABSTRACT

PURPOSE: Previous research has suggested the essential unidimensionality of the 12-item traditional Chinese version of the Nonrestorative Sleep Scale (NRSS). This study aimed to develop a short form of the traditional Chinese version of the NRSS without compromising its reliability and validity. METHODS: Data were collected from 2 cross-sectional studies with identical target groups of adults residing in Hong Kong. An iterative Wald test was used to assess differential item functioning by gender. Based on the generalized partial credit model, we first obtained a shortened version such that further shortening would result in a substantial sacrifice of test information and standard error of measurement. Another shortened version was obtained by optimal test assembly (OTA). The two shortened versions were compared for test information, Cronbach's alpha, and convergent validity. RESULTS: Data from a total of 404 Chinese adults (60.0% female) who had completed the Chinese NRSS were gathered. All items were invariant by gender. A 6-item version was obtained beyond which the test performance substantially deteriorated, and a 9-item version was obtained by OTA. The 9-item version performed better than the 6-item version in test information and convergent validity. It had discrimination and difficulty indices ranging from 0.44 to 2.23 and -7.58 to 2.13, respectively, and retained 92% of the test information of the original 12-item version. CONCLUSION: The 9-item Chinese NRSS is a reliable and valid tool to measure nonrestorative sleep for epidemiological studies.


Subjects
Psychometrics/methods , Quality of Life/psychology , Sleep/physiology , Adolescent , Adult , Aged , Aged, 80 and over , Asian People , Cross-Sectional Studies , Female , Humans , Language , Male , Middle Aged , Young Adult
8.
Health Qual Life Outcomes ; 16(1): 51, 2018 Mar 20.
Article in English | MEDLINE | ID: mdl-29554963

ABSTRACT

BACKGROUND: Due to a lack of an appropriate disease-specific patient-reported outcome (PRO) instrument for chronic heart failure including its social support and treatment aspects in China, this study was performed to develop a patient-reported outcome measure (PROM) for patients with chronic heart failure and evaluate its reliability, validity, and feasibility. METHODS: According to the standard PROM guidelines established by the Food and Drug Administration, an item pool was formed by reviewing a large amount of relevant literature and interviewing patients with chronic heart failure about their main symptoms. Thus, the primary scale was created after adjusting the items and language with the help of patients and experts in the field. Next, 155 patients from 8 hospitals in different districts were recruited for a pilot survey using questionnaires containing these items. The patients' responses were analyzed using the classical test theory and item response theory to select high-quality items and determine the subdomains of the scale. This was followed by a formal investigation in the same eight hospitals. In total, 360 patients and 100 healthy subjects were included to evaluate the reliability, validity, and feasibility of the items. Through this process, the final scale was established. RESULTS: The final scale comprised 12 subdomains with 57 items related to physical, psychological, social, and therapeutic areas. The data analysis results of the formal investigation showed that the PROM for chronic heart failure had good reliability, validity, and feasibility. Reliability was verified by Cronbach's alpha coefficient, which was 0.913 for the total scale, 0.903 for the physical domain, 0.941 for the psychological domain, 0.827 for the social domain, and 0.839 for the therapeutic domain. The construct validity results met the relative criteria of confirmatory factor analysis. Discriminant validity was represented by score comparisons of nine subdomains. The response rate and the effective rate of return of the CHF-PROM were 98.94% and 98.92%, respectively. CONCLUSIONS: The final scale coincides with the theoretical framework and better reflects the overall quality of life of patients with chronic heart failure. This scale can be used as a valid instrument to evaluate clinical treatment and clinical trials of chronic heart failure.


Subjects
Heart Failure/psychology , Patient Reported Outcome Measures , Quality of Life , Adult , Case-Control Studies , China , Chronic Disease , Factor Analysis, Statistical , Female , Humans , Male , Middle Aged , Reproducibility of Results
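
Several abstracts in this listing report Cronbach's alpha as the reliability statistic. As a reminder of what that number summarises, here is a minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the toy score matrices are hypothetical and are not data from any of the listed studies.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)

# Toy data: two perfectly consistent items, then two unrelated items.
consistent = [[1, 1], [2, 2], [3, 3]]
unrelated = [[1, 0], [0, 1], [1, 1], [0, 0]]
alpha_consistent = cronbach_alpha(consistent)
alpha_unrelated = cronbach_alpha(unrelated)
```

The function is undefined when every respondent has the same total score, since the total-score variance is then zero.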
9.
J Pers ; 86(6): 1037-1049, 2018 Dec.
Article in English | MEDLINE | ID: mdl-29425409

ABSTRACT

OBJECTIVE: The goal of this study was to examine age-associated personality differences using a measurement-invariant representation of the higher-order structure of the Five-Factor Model. METHOD: We reanalyzed the German NEO-PI-R norm sample (N = 11,724) and applied ant colony optimization in a multigroup confirmatory factor analysis setting in order to select three items per first-order factor that would optimize model fit and measurement invariance across 18 age groups ranging from 16 to 65 years of age. RESULTS: Ant colony optimization substantially improved absolute and relative model fit under measurement invariance constraints. However, the results showed that even when selecting items, measurement invariance across a large age span could not be guaranteed. Strong measurement invariance for Extraversion and Agreeableness could not be established. The age-associated mean-level differences of the first-order factors of Neuroticism and Conscientiousness supported the maturity hypothesis. The mean levels of the first-order factors of Openness varied substantially from each other across age. CONCLUSIONS: Findings on age differences in personality can be particularly distorted in older age groups. Testing for and ensuring measurement invariance with item selection procedures can help solve this problem. The higher-order structure of personality should be accounted for when personality development is examined.


Subjects
Human Development/physiology , Personality Assessment/standards , Personality/physiology , Psychometrics/standards , Adolescent , Adult , Age Factors , Aged , Factor Analysis, Statistical , Female , Humans , Male , Middle Aged , Personality Development , Psychometrics/methods , Young Adult
10.
Int J Psychol ; 53(2): 83-91, 2018 Apr.
Article in English | MEDLINE | ID: mdl-26987560

ABSTRACT

The influence of individual differences on learners' study time allocation has been emphasised in recent studies; however, little is known about the role of individual thinking styles (analytical versus intuitive). In the present study, we explored the influence of individual thinking styles on learners' application of agenda-based and habitual processes when selecting the first item during a study-time allocation task. A 3-item cognitive reflection test (CRT) was used to determine individuals' degree of cognitive reliance on intuitive versus analytical cognitive processing. Significant correlations between CRT scores and the choices of first item selection were observed in both Experiment 1a (study time was 5 seconds per triplet) and Experiment 1b (study time was 20 seconds per triplet). Furthermore, analytical decision makers constructed a value-based agenda (prioritised high-reward items), whereas intuitive decision makers relied more upon habitual responding (selected items from the leftmost of the array). The findings of Experiment 1a were replicated in Experiment 2 even after ruling out possible effects of individual intelligence and working memory capacity. Overall, individual thinking style plays an important role in learners' study time allocation, and the CRT reliably predicts learners' item selection strategy.


Subjects
Learning/physiology , Memory, Short-Term/physiology , Test Taking Skills/methods , Thinking/physiology , Adult , Female , Humans , Individuality , Male , Time Factors
11.
Health Qual Life Outcomes ; 14: 75, 2016 May 10.
Article in English | MEDLINE | ID: mdl-27165036

ABSTRACT

BACKGROUND: The aim of the study is to develop a specific patient-reported scale of liver cirrhosis according to the Patient Reported Outcome guidelines of the Food and Drug Administration (FDA), and to examine its capacity to fill gaps in this field. METHODS: A conceptual framework was developed and a preliminary item pool was generated through literature review and interviews of 10 patients with liver cirrhosis. With the preliminary items, we performed a pilot survey that included a cognitive test with patients and interviews with experts; the focus was on content and language of the scale. In the item selection stage, seven statistical methods (discrete trends method, discrimination analysis, exploratory factor analysis, Cronbach's α coefficient, correlation coefficient, test-retest reliability, and item response theory) were applied to survey data from 200 subjects (150 liver cirrhosis patients and 50 controls). This produced the preliminary Liver Cirrhosis Patient-reported Outcome Measure (LC-PROM). In the next stage, we conducted the survey with 620 subjects (500 patients and 120 controls) to validate reliability, validity and acceptability of this scale. RESULTS: The 55 items and 13 dimensions addressed four domains: physical, psychological, social, and therapeutic. Cronbach's α coefficient was 0.921 for the total scale; the confirmatory factor analysis, t-tests and ANOVA supported scale validity; the model fit indices, namely Root Mean Square Error of Approximation (RMSEA), Root Mean Square Residual (RMR), Normed Fit Index (NFI), Non-Normed Fit Index (NNFI), Comparative Fit Index (CFI) and Incremental Fit Index (IFI), generally met the criteria. The acceptance ratio and response rate indicated good feasibility. CONCLUSIONS: This study developed an accurate and stable patient-reported outcome scale of liver cirrhosis, which is able to evaluate clinical effects effectively, is helpful to patients in recognizing their health condition, and contributes to clinical decision making both for patients and physicians. Additionally, the LC-PROM can perform as an ultimate assessment of medical and health care effects and can inform clinical trials of new drugs for liver cirrhosis.


Subjects
Liver Cirrhosis/psychology , Liver Cirrhosis/therapy , Patient Satisfaction/statistics & numerical data , Patients/psychology , Quality of Life/psychology , Adult , Factor Analysis, Statistical , Female , Humans , Male , Middle Aged , Patient Outcome Assessment , Pilot Projects , Reproducibility of Results , Self Report , Surveys and Questionnaires , United States
12.
Neuropsychol Rehabil ; 26(1): 126-56, 2016.
Article in English | MEDLINE | ID: mdl-25609229

ABSTRACT

The progressive degradation of semantic memory is a common feature of many forms of dementia, including Alzheimer's disease and the semantic variant of primary progressive aphasia (svPPA). One of the most functionally debilitating effects of this semantic impairment is the inability to name common people and objects (i.e., anomia). Clinical management of a progressive, semantically based anomia presents extraordinary challenges for neurorehabilitation. Techniques such as errorless learning and spaced-retrieval training show promise for retraining forgotten words. However, we lack complementary detail about what to train (i.e., item selection) and how to flexibly adapt the training to a declining cognitive system. This position paper weighs the relative merits of several treatment rationales (e.g., restore vs. compensate) and advocates for maintenance of known words over reacquisition of forgotten knowledge in the context of semantic treatment paradigms. I propose a system for generating an item pool and outline a set of core principles for training and sustaining a micro-lexicon consisting of approximately 100 words. These principles are informed by lessons learned over the course of a Phase I treatment study targeting language maintenance over a 5-year span in Alzheimer's disease and SvPPA. Finally, I propose a semantic training approach that capitalises on lexical frequency and repeated training on conceptual structure to offset the loss of key vocabulary as disease severity worsens.


Subjects
Alzheimer Disease/rehabilitation , Frontotemporal Dementia/rehabilitation , Language Therapy/methods , Semantics , Aged , Cognition/physiology , Female , Humans , Learning , Male , Middle Aged , Neuropsychological Tests , Psychiatric Status Rating Scales
13.
Behav Res Methods ; 48(4): 1443-1453, 2016 12.
Article in English | MEDLINE | ID: mdl-26487053

ABSTRACT

Item bank stratification has been shown to be an effective method for combating item overexposure in both uni- and multidimensional computer adaptive testing. However, item bank stratification cannot guarantee that items will not be overexposed, that is, exposed at a rate exceeding some prespecified threshold. In this article, we propose enhancing stratification for multidimensional computer adaptive tests by combining it with the item eligibility method, a technique for controlling the maximum exposure rate in computerized tests. The performance of the method was examined via a simulation study and compared to existing methods of item selection and exposure control. Also, for the first time, maximum likelihood (MLE) and expected a posteriori (EAP) estimation of examinee ability were compared side by side in a multidimensional computer adaptive test. The simulation suggested that the proposed method is effective in suppressing the maximum item exposure rate with very little loss of measurement accuracy and precision. As compared to MLE, EAP generates smaller mean squared errors of the ability estimates in all simulation conditions.


Subjects
Aptitude Tests/statistics & numerical data , Computers , Computer Simulation , Humans , Likelihood Functions , Probability
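
The abstract above compares maximum likelihood (MLE) with expected a posteriori (EAP) ability estimation. The sketch below shows EAP in the simplest setting, assuming a unidimensional 2PL model, a standard normal prior, and a fixed quadrature grid; the study itself is multidimensional, and the function name `eap_estimate` and its grid defaults are hypothetical.

```python
import math

def p_2pl(theta, a, b):
    """Probability of a correct response under a 2PL model (illustrative assumption)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def eap_estimate(responses, items, lo=-4.0, hi=4.0, steps=81):
    """Posterior mean of theta on an evenly spaced grid with a N(0, 1) prior."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # unnormalised standard normal prior
        for u, (a, b) in zip(responses, items):
            p = p_2pl(theta, a, b)
            w *= p if u == 1 else 1.0 - p   # likelihood of the observed response
        weights.append(w)
    total = sum(weights)
    return sum(t * w for t, w in zip(grid, weights)) / total

# Hypothetical items given as (discrimination, difficulty) pairs.
items = [(1.0, 0.0), (1.0, 0.0)]
theta_all_correct = eap_estimate([1, 1], items)
theta_all_wrong = eap_estimate([0, 0], items)
theta_no_data = eap_estimate([], [])
```

For an all-correct or all-incorrect response pattern the likelihood has no finite maximum, so an MLE does not exist, whereas the prior keeps the EAP estimate finite. That difference is one commonly cited reason EAP behaves more stably on short tests.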
14.
Stat Med ; 34(3): 487-503, 2015 Feb 10.
Article in English | MEDLINE | ID: mdl-25327293

ABSTRACT

With reference to a questionnaire that aimed to assess the quality of life for dysarthric speakers, we investigate the usefulness of a model-based procedure for reducing the number of items. We propose a mixed cumulative logit model, which is known in the psychometrics literature as the graded response model: responses to different items are modelled as a function of individual latent traits and as a function of item characteristics, such as their difficulty and their discrimination power. We jointly model the discrimination and the difficulty parameters by using a k-component mixture of normal distributions. Mixture components correspond to disjoint groups of items. Items that belong to the same groups can be considered equivalent in terms of both difficulty and discrimination power. According to decision criteria, we select a subset of items such that the reduced questionnaire is able to provide the same information that the complete questionnaire provides. The model is estimated by using a Bayesian approach, and the choice of the number of mixture components is justified according to information criteria. We illustrate the proposed approach on the basis of data that are collected for 104 dysarthric patients by local health authorities in Lecce and in Milan.


Subjects
Bayes Theorem , Psychometrics/methods , Quality of Life , Surveys and Questionnaires , Computer Simulation , Dysarthria/psychology , Humans , Italy , Logistic Models , Principal Component Analysis , Sickness Impact Profile
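
The graded response model named in the abstract above expresses each response category as a difference of cumulative logits. The following minimal sketch shows only that category-probability step, assuming a single latent trait `theta`, one discrimination parameter `a`, and ordered thresholds; the mixture prior on item parameters and the Bayesian estimation described in the abstract are not reproduced, and the function name is hypothetical.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities P(Y = 0), ..., P(Y = K-1) under a graded response model.

    thresholds must be increasing: b_1 < b_2 < ... < b_{K-1}.
    """
    # Cumulative probabilities P(Y >= k), padded with P(Y >= 0) = 1 and P(Y >= K) = 0.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum.append(0.0)
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]

# A hypothetical 3-category item with thresholds placed symmetrically around zero.
probs = grm_category_probs(theta=0.0, a=1.0, thresholds=[-1.0, 1.0])
```

Items with similar `a` and threshold values yield similar category probabilities at every trait level, which matches the sense in which the abstract treats items in the same mixture component as equivalent.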
15.
J Exp Child Psychol ; 125: 13-34, 2014 Sep.
Article in English | MEDLINE | ID: mdl-24814204

ABSTRACT

We explored the development of cognitive flexibility in typically developing 6-, 8-, and 10-year-olds and adults by modifying a common cognitive flexibility task, the Flexible Item Selection Task (FIST). Although performance on the standard FIST reached ceiling by 8 years, FIST performance on other variations continued to improve until 10 years of age. Within a detailed task analysis, we also explored working memory storage and processing components of executive function and how these contribute to the development of cognitive flexibility. The findings reinforce the notion that cognitive flexibility is a multifaceted construct but that the development of working memory contributes in part to age-related change in this ability.


Subjects
Child Development/physiology , Cognition/physiology , Memory, Short-Term/physiology , Neuropsychological Tests/statistics & numerical data , Adult , Age Factors , Child , Choice Behavior/physiology , Executive Function/physiology , Female , Humans , Male , Task Performance and Analysis
16.
Educ Psychol Meas ; 84(2): 364-386, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38898881

ABSTRACT

The questionnaire method has always been an important research method in psychology. The increasing prevalence of multidimensional trait measures in psychological research has led researchers to use longer questionnaires. However, questionnaires that are too long will inevitably reduce the quality of the completed questionnaires and the efficiency of collection. Computer adaptive testing (CAT) can be used to reduce the test length while preserving the measurement accuracy. However, it is more often used in aptitude testing and involves a large number of parametric assumptions. Applying CAT to psychological questionnaires often requires question-specific model design and preexperimentation. The present article proposes a nonparametric and item response theory (IRT)-independent CAT algorithm. The new algorithm is simple and highly generalizable. It can be quickly used in a variety of questionnaires and tests without being limited by theoretical assumptions in different research areas. Simulation and empirical studies were conducted to demonstrate the validity of the new algorithm in aptitude tests and personality measures.

17.
Article in English | MEDLINE | ID: mdl-38794963

ABSTRACT

Computerized adaptive testing for cognitive diagnosis (CD-CAT) achieves remarkable estimation efficiency and accuracy by adaptively selecting and then administering items tailored to each examinee. The process of item selection stands as a pivotal component of a CD-CAT algorithm, with various methods having been developed for binary responses. However, multiple-choice (MC) items, an important item type that allows for the extraction of richer diagnostic information from incorrect answers, have been underemphasized. Currently, the Jensen-Shannon divergence (JSD) index introduced by Yigit et al. (Applied Psychological Measurement, 2019, 43, 388) is the only item selection method exclusively designed for MC items. However, the JSD index requires a large sample to calibrate item parameters, which may be infeasible when there is only a small or no calibration sample. To bridge this gap, the study first proposes a nonparametric item selection method for MC items (MC-NPS) by implementing novel discrimination power that measures an item's ability to effectively distinguish among different attribute profiles. A Q-optimal procedure for MC items is also developed to improve the classification during the initial phase of a CD-CAT algorithm. The effectiveness and efficiency of the two proposed algorithms were confirmed by simulation studies.
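
The Jensen-Shannon divergence mentioned in the abstract above measures how far apart a set of probability distributions are. The sketch below computes the generic weighted JSD as the entropy of the mixture minus the weighted mean of the individual entropies. Treating each distribution as the option-choice probabilities expected under one candidate attribute profile is an assumption made here for illustration, and the code is not claimed to match the published JSD index.

```python
import math

def entropy(p):
    """Shannon entropy in nats, skipping zero-probability outcomes."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def jsd(distributions, weights):
    """Weighted Jensen-Shannon divergence among several discrete distributions."""
    n_outcomes = len(distributions[0])
    mixture = [
        sum(w * d[k] for w, d in zip(weights, distributions))
        for k in range(n_outcomes)
    ]
    return entropy(mixture) - sum(w * entropy(d) for w, d in zip(weights, distributions))

# Hypothetical option-choice distributions for two attribute profiles.
separating_item = jsd([[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5])
uninformative_item = jsd([[0.5, 0.5], [0.5, 0.5]], [0.5, 0.5])
```

An item whose options are chosen very differently by different profiles scores high, while an item answered the same way regardless of profile scores zero.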

18.
Educ Psychol Meas ; 83(2): 294-321, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36866066

ABSTRACT

Multidimensional forced-choice (FC) questionnaires have been consistently found to reduce the effects of socially desirable responding and faking in noncognitive assessments. Although FC has been considered problematic for providing ipsative scores under the classical test theory, item response theory (IRT) models enable the estimation of nonipsative scores from FC responses. However, while some authors indicate that blocks composed of opposite-keyed items are necessary to retrieve normative scores, others suggest that these blocks may be less robust to faking, thus impairing the assessment validity. Accordingly, this article presents a simulation study to investigate whether it is possible to retrieve normative scores using only positively keyed items in pairwise FC computerized adaptive testing (CAT). Specifically, the simulation study addressed the effect of (a) different bank assembly methods (a randomly assembled bank, an optimally assembled bank, and blocks assembled on-the-fly considering every possible pair of items), and (b) block selection rules (i.e., T, and Bayesian D and A-rules) on estimation accuracy and on ipsativity and overlap rates. Moreover, different questionnaire lengths (30 and 60) and trait structures (independent or positively correlated) were studied, and a nonadaptive questionnaire was included as baseline in each condition. In general, very good trait estimates were retrieved, despite using only positively keyed items. Although the best trait accuracy and lowest ipsativity were found using the Bayesian A-rule with questionnaires assembled on-the-fly, the T-rule under this method led to the worst results. This points to the importance of considering both aspects when designing FC CAT.

19.
Br J Math Stat Psychol ; 76(1): 52-68, 2023 02.
Article in English | MEDLINE | ID: mdl-35840353

ABSTRACT

Computerized classification testing (CCT) commonly chooses items that maximize information at the cut score, which yields the most information for decision-making. However, a corollary problem is that all examinees will be given the same set of items, resulting in a high test overlap rate and unbalanced item bank usage, which threatens test security. Another pivotal issue for CCT is time control. Because both extremely long response times (RTs) and large RT variability across examinees intensify time-induced anxiety, it is crucial to reduce the number of examinees exceeding the time limit and the differences between examinees' test-taking times. To satisfy these practical needs, this paper proposes the novel idea of stage adaptiveness to tailor the item selection process to the decision-making requirement in each step and to generate fresh insight into the existing response time selection method. Results indicate that balanced item usage as well as short and stable test times across examinees can be achieved via the new methods.
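
The overlap problem described above can be illustrated with a minimal sketch of maximum-information selection at the cut score. The abstract does not name an IRT model, so the two-parameter logistic (2PL) model, the function names, and the item parameters below are assumptions chosen for illustration only.

```python
import math

def item_information(a, b, theta):
    """Fisher information of a 2PL item (assumed model) at ability theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def select_item(bank, cut_score, administered):
    """Return the index of the unused item with maximum information at the cut score.

    bank:         list of (a, b) tuples, i.e. discrimination and difficulty.
    administered: set of item indices already given to this examinee.
    """
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: item_information(*bank[i], cut_score))

# Hypothetical four-item bank and a cut score of 0.0.
bank = [(1.0, -1.0), (1.5, 0.0), (0.8, 0.5), (1.2, 0.1)]
first = select_item(bank, 0.0, set())
second = select_item(bank, 0.0, {first})
```

Because the selection depends only on the cut score and not on the examinee's responses, every examinee receives the same sequence of items under this rule, which is the source of the high overlap rate and unbalanced bank usage that the abstract describes.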


Subjects
Test Taking Skills , Reaction Time , Time Factors
20.
J Patient Rep Outcomes ; 7(1): 72, 2023 07 18.
Article in English | MEDLINE | ID: mdl-37462855

ABSTRACT

BACKGROUND: Women with endometrial or ovarian cancer experience a variety of symptoms during chemotherapy. Patient-reported outcomes (PROs) can provide insight into the symptoms they experience. A PRO tool tailored to this patient population can help accurately monitor adverse events and manage symptoms. The objective of this study was to identify items in the National Cancer Institute's measurement system, the Patient-Reported Outcomes Version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE®), that are appropriate for use in a PRO tool for women with endometrial or ovarian cancer undergoing treatment with taxanes (paclitaxel or docetaxel) in combination with carboplatin. METHODS: A two-phase, sequential multi-methods approach was applied. In phase one, a comprehensive literature search was done to map the toxicity of the applied chemotherapeutics and phase III clinical studies. Phase two, which comprised selecting the PRO-CTCAE items, included discussions with and feedback from a patient advisory board, an additional literature search, and focus group interviews with senior oncologists and specialized oncology nurses. A national expert panel facilitated both phases by carefully selecting items from the PRO-CTCAE library. RESULTS: Phase one identified 18 symptoms and phase two identified three additional ones, leading to the inclusion of 21 PRO-CTCAE symptoms in the final PRO tool. Since PRO-CTCAE also contains one to three sub-questions on the frequency, severity, and interference with daily activities of symptoms, there were 44 potential items. CONCLUSIONS: This study describes a multi-method approach to selecting items from the PRO-CTCAE library for use in a population of women with endometrial or ovarian cancer undergoing chemotherapy. By systematically combining diverse approaches, we carefully selected 21 clinically relevant symptoms covered by 44 items in the PRO-CTCAE library. Future studies should investigate the psychometric properties of this PRO tool for women with endometrial or ovarian cancer.


Subjects
Ovarian Neoplasms , Patient Reported Outcome Measures , Female , Humans , Endometrium , National Cancer Institute (U.S.) , Ovarian Neoplasms/drug therapy , Self Report , United States