Search | BVS Colombia Search Portal

1.

Estimating Optimal Weights for Compound Scores: A Multidimensional IRT Approach.

van Lier, Hendrika G; Siemons, Liseth; van der Laar, Mart A F J; Glas, Cees A W.

Multivariate Behav Res ; 53(6): 914-924, 2018.

Article in English | MEDLINE | ID: mdl-30463444

ABSTRACT

A method is proposed for constructing indices as linear functions of variables such that the reliability of the compound score is maximized. Reliability is defined in the framework of latent variable modeling [i.e., item response theory (IRT)] and optimal weights of the components of the index are found by maximizing the posterior variance relative to the total latent variable variance. Three methods for estimating the weights are proposed. The first is a likelihood-based approach, that is, marginal maximum likelihood (MML). The other two are Bayesian approaches based on Markov chain Monte Carlo (MCMC) computational methods. One is based on an augmented Gibbs sampler specifically targeted at IRT, and the other is based on a general-purpose Gibbs sampler such as those implemented in OpenBUGS and JAGS. Simulation studies are presented to demonstrate the procedure and to compare the three methods. Results are very similar, so the easily accessible latter method can be suggested to practitioners. A real data set pertaining to the 28-joint Disease Activity Score is used to show how the methods can be applied in a complex measurement situation with multiple time points and mixed data formats.

Subject(s)

Bayes Theorem; Likelihood Functions; Monte Carlo Method; Humans; Markov Chains

2.

Validity and measurement precision of the PROMIS physical function item bank and a content validity-driven 20-item short form in rheumatoid arthritis compared with traditional measures.

Oude Voshaar, Martijn A H; Ten Klooster, Peter M; Glas, Cees A W; Vonkeman, Harald E; Taal, Erik; Krishnan, Eswar; Bernelot Moens, Hein J; Boers, Maarten; Terwee, Caroline B; van Riel, Piet L C M; van de Laar, Mart A F J.

Rheumatology (Oxford) ; 54(12): 2221-9, 2015 Dec.

Article in English | MEDLINE | ID: mdl-26224306

ABSTRACT

OBJECTIVE: To evaluate the content validity and measurement properties of the Patient-Reported Outcome Measurement Information System (PROMIS) physical function item bank and a 20-item short form in patients with RA in comparison with the HAQ disability index (HAQ-DI) and 36-item Short Form Health Survey (SF-36) physical functioning scale (PF-10). METHODS: The content validity of the instruments was evaluated by linking their items to the International Classification of Functioning, Disability and Health (ICF) core set for RA. The measures were administered to 690 RA patients enrolled in the Dutch Rheumatoid Arthritis Monitoring registry. Measurement precision was evaluated using item response theory methods and construct validity was evaluated by correlating physical function scores with other clinical and patient-reported outcome measures. RESULTS: All 207 health concepts identified in the physical function measures referred to activities that are featured in the ICF. Twenty-three of 26 ICF RA core set domains are featured in the full PROMIS physical function item bank compared with 13 and 8 for the HAQ-DI and PF-10, respectively. As hypothesized, all three physical function instruments were highly intercorrelated (r 0.74-0.84), moderately correlated with disease activity measures (r 0.44-0.63) and weakly correlated with age (rs 0.07-0.14). Item response theory-based analysis revealed that a 20-item PROMIS physical function short form covered a wider range of physical function levels than the HAQ-DI or PF-10. CONCLUSION: The PROMIS physical function item bank demonstrated excellent measurement properties in RA. A content-driven 20-item short form may be a useful tool for assessing physical function in RA.

Subject(s)

Arthritis, Rheumatoid/physiopathology; Motor Activity/physiology; Patient Outcome Assessment; Activities of Daily Living; Adult; Aged; Arthritis, Rheumatoid/rehabilitation; Disability Evaluation; Female; Humans; Male; Middle Aged; Reproducibility of Results; Severity of Illness Index

3.

Working mechanism of a multidimensional computerized adaptive test for fatigue in rheumatoid arthritis.

Nikolaus, Stephanie; Bode, Christina; Taal, Erik; Vonkeman, Harald E; Glas, Cees A W; van de Laar, Mart A F J.

Health Qual Life Outcomes ; 13: 23, 2015 Feb 21.

Article in English | MEDLINE | ID: mdl-25890307

ABSTRACT

BACKGROUND: This paper demonstrates the mechanism of a multidimensional computerized adaptive test (CAT) to measure fatigue in patients with rheumatoid arthritis (RA). A CAT can be used to precisely measure patient-reported outcomes at an individual level as items are sequentially selected based on the patient's previous answers. The item bank of the CAT Fatigue RA has been developed from the patients' perspective and consists of 196 items pertaining to three fatigue dimensions: severity, impact and variability of fatigue. METHODS: The CAT Fatigue RA was completed by fifteen patients. To test the CAT's working mechanism, we applied the flowchart-check-method. The adaptive item selection procedure for each patient was checked by the researchers. The estimated fatigue levels and the measurement precision per dimension were illustrated with the selected items, answers and flowcharts. RESULTS: The CAT Fatigue RA selected all items in a logical sequence and selected those items that provided the most information about the patient's individual fatigue. Flowcharts further illustrated that the CAT reached a satisfactory measurement precision, with fewer than 20 items, on the dimensions severity and impact and to a somewhat lesser extent also on the dimension variability. Patients' fatigue scores varied across the three dimensions; sometimes severity scored highest, other times impact or variability. The CAT's ability to display different fatigue experiences can improve communication in daily clinical practice, guide interventions, and facilitate research into possible predictors of fatigue. CONCLUSIONS: The results indicate that the CAT Fatigue RA measures fatigue precisely and comprehensively. Once it is examined in more detail in a subsequent, elaborate validation study, the CAT will be available for implementation in daily clinical practice and for research purposes.

Subject(s)

Arthritis, Rheumatoid/complications; Diagnosis, Computer-Assisted/methods; Fatigue/diagnosis; Quality of Life; Adult; Aged; Arthritis, Rheumatoid/psychology; Fatigue/etiology; Female; Health Status Indicators; Humans; Male; Middle Aged
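
The selection step described in the abstract above (choosing the item that provides the most information about the patient's current fatigue estimate) can be sketched as follows. This is a minimal illustration, assuming a unidimensional two-parameter logistic model with dichotomous items. The CAT Fatigue RA itself is multidimensional, and the item bank, parameter values, and function names below are hypothetical rather than taken from the paper.

```python
import math

def prob_endorse(theta, a, b):
    """2PL probability of endorsing an item (assumed model, not the paper's)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = prob_endorse(theta, a, b)
    return a * a * p * (1.0 - p)

def select_next_item(theta, bank, administered):
    """Return the id of the unused item with maximum information at theta."""
    candidates = [i for i in bank if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *bank[i]))

# Hypothetical item bank: item id -> (discrimination a, location b).
bank = {"F1": (1.2, -1.0), "F2": (1.8, 0.0), "F3": (0.9, 0.5), "F4": (1.5, 1.5)}

# Start at theta = 0, then pretend the estimate moved to 1.2 after one answer.
first = select_next_item(0.0, bank, administered=set())
second = select_next_item(1.2, bank, administered={first})
print(first, second)
```

In a full CAT the trait estimate would be updated after every answer and the loop would stop once a precision target is met; both steps are omitted here.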

4.

The St George's Respiratory Questionnaire revisited: a psychometric evaluation.

Paap, Muirne C S; Brouwer, Danny; Glas, Cees A W; Monninkhof, Evelyn M; Forstreuter, Benjamin; Pieterse, Marcel E; van der Palen, Job.

Qual Life Res ; 24(1): 67-79, 2015 Jan.

Article in English | MEDLINE | ID: mdl-24241770

ABSTRACT

PURPOSE: The St George's Respiratory Questionnaire (SGRQ) has clearly acquired the status of legacy questionnaire for measuring health-related quality of life in patients with chronic obstructive pulmonary disease (COPD). The main aim of this study was to assess the underlying dimensionality of the SGRQ and to investigate the added value of the empirical weights used to calculate total scores. METHODS: The official Dutch translation of the SGRQ was completed by 444 COPD patients participating in two clinical studies. These data were used for secondary data analysis in this study. Three complementary statistical methods were used to assess dimensionality: Mokken scale analysis (MSA), parametric multidimensional item response theory (IRT) and bifactor analysis. Additionally, the original SGRQ weighting procedure was compared to IRT-based weighting. RESULTS: The results of the MSA and multidimensional item response theory (MIRT) pointed toward a unidimensional structure. The bifactor analyses indicated that there was a strong general factor, but the group factors did have additional value. Nineteen items performed poorly in the MSA, MIRT analysis or both. Shortening the scale from 50 to 31 items did not negatively impact measurement precision. SGRQ total score and IRT-derived scores correlated strongly, 0.90 for the one-parameter model and 0.99 for the two-parameter model. CONCLUSION: The SGRQ contains some multidimensionality, but an abbreviated version can be used as a unidimensional tool in patients with COPD. Subscale scores should be used with care. SGRQ total scores correlated highly with IRT-based scores, and thus, the weighting methods may be used interchangeably to calculate total scores.

Subject(s)

Health Status; Pulmonary Disease, Chronic Obstructive/physiopathology; Quality of Life; Surveys and Questionnaires; Aged; Female; Humans; Male; Middle Aged; Models, Statistical; Psychometrics; Translations

5.

How age and sex affect the erythrocyte sedimentation rate and C-reactive protein in early rheumatoid arthritis.

Siemons, Liseth; Ten Klooster, Peter M; Vonkeman, Harald E; van Riel, Piet L C M; Glas, Cees A W; van de Laar, Mart A F J.

BMC Musculoskelet Disord ; 15: 368, 2014 Nov 06.

Article in English | MEDLINE | ID: mdl-25373740

ABSTRACT

BACKGROUND: The erythrocyte sedimentation rate (ESR) and C-reactive protein (CRP) are two commonly used measures of inflammation in rheumatoid arthritis (RA). As current RA treatment guidelines strongly emphasize early and aggressive treatment aiming at fast remission, optimal measurement of inflammation becomes increasingly important. Dependencies with age, sex, and body mass index have been shown for both inflammatory markers, yet it remains unclear which inflammatory marker is affected least by these effects in patients with early RA. METHODS: Baseline data from 589 patients from the DREAM registry were used for analyses. Associations between the inflammatory markers and age, sex, and BMI were evaluated first using univariate linear regression analyses. Next, it was tested whether these associations were independent of a patient's current disease activity as well as of each other using multiple linear regression analyses with backward elimination. The strengths of the associations were compared using standardized beta (β) coefficients. The multivariate analyses were repeated after 1 year. RESULTS: At baseline, both the ESR and CRP were univariately associated with age, sex, and BMI, although the association with BMI disappeared in multivariate analyses. ESR and CRP levels significantly increased with age (β-ESR=0.017, p<0.001 and β-CRP=0.009, p=0.006), independent of the number of tender and swollen joints, general health, and sex. For each decade of aging, ESR and CRP levels became 1.19 and 1.09 times higher, respectively. Furthermore, women demonstrated average ESR levels that were 1.22 times higher than those of men (β=0.198, p=0.007), whereas men had 1.20 times higher CRP levels (β=-0.182, p=0.048). Effects were strongest on the ESR. BMI became significantly associated with both inflammatory markers after 1 year, showing higher levels with increasing weight. Age continued to be significantly associated, whereas sex remained only associated with the ESR level. CONCLUSIONS: Age and sex are independently associated with the levels of both acute phase reactants in early RA, emphasizing the need to take these external factors into account when interpreting disease activity measures. BMI appears to become more relevant at later stages of the disease.

Subject(s)

Aging/blood; Arthritis, Rheumatoid/blood; Arthritis, Rheumatoid/diagnosis; C-Reactive Protein/metabolism; Erythrocytes/metabolism; Sex Characteristics; Adult; Aged; Aging/pathology; Blood Sedimentation; Cohort Studies; Female; Humans; Male; Middle Aged
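
The multiplicative effects quoted in the abstract above are consistent with coefficients estimated on a natural-log outcome scale, where a coefficient per year of age corresponds to a factor of exp(10 × coefficient) per decade. The abstract does not state the transformation, so the log scale is an assumption here. The sketch only checks that the reported coefficients reproduce the reported ratios under that assumption, and the function name is illustrative.

```python
import math

def ratio_from_beta(beta, units=1.0):
    """Multiplicative change implied by a coefficient on a natural-log outcome."""
    return math.exp(beta * units)

# Coefficients as reported in the abstract (age per year; sex as a 0/1 indicator).
# Assumption: the outcomes were natural-log transformed before regression.
esr_per_decade = ratio_from_beta(0.017, units=10)
crp_per_decade = ratio_from_beta(0.009, units=10)
esr_women_vs_men = ratio_from_beta(0.198)
crp_men_vs_women = ratio_from_beta(0.182)  # sign flipped: the reported value is -0.182

print(round(esr_per_decade, 2), round(crp_per_decade, 2))
print(round(esr_women_vs_men, 2), round(crp_men_vs_women, 2))
```

Rounded to two decimals, these values match the 1.19, 1.09, 1.22, and 1.20 reported in the abstract.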

6.

Reducing Attenuation Bias in Regression Analyses Involving Rating Scale Data via Psychometric Modeling.

Glas, Cees A W; Jorgensen, Terrence D; Hove, Debby Ten.

Psychometrika ; 89(1): 42-63, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38573434

ABSTRACT

Many studies in fields such as psychology and educational sciences obtain information about attributes of subjects through observational studies, in which raters score subjects using multiple-item rating scales. Error variance due to measurement effects, such as items and raters, attenuates the regression coefficients and lowers the power of (hierarchical) linear models. A modeling procedure is discussed to reduce the attenuation. The procedure consists of (1) an item response theory (IRT) model to map the discrete item responses to a continuous latent scale and (2) a generalizability theory (GT) model to separate the variance in the latent measurement into variance components of interest and nuisance variance components. It will be shown how measurements obtained from this mixture of IRT and GT models can be embedded in (hierarchical) linear models, either as predictor or as criterion variables, such that error variance due to nuisance effects is partialled out. Using examples from the field of educational measurement, it is shown how general-purpose software can be used to implement the modeling procedure.

Subject(s)

Psychometrics; Psychometrics/methods; Humans; Regression Analysis; Models, Statistical; Bias; Educational Measurement/methods; Linear Models

7.

Development and evaluation of a crosswalk between the SF-36 physical functioning scale and Health Assessment Questionnaire disability index in rheumatoid arthritis.

ten Klooster, Peter M; Oude Voshaar, Martijn A H; Gandek, Barbara; Rose, Matthias; Bjorner, Jakob B; Taal, Erik; Glas, Cees A W; van Riel, Piet L C M; van de Laar, Mart A F J.

Health Qual Life Outcomes ; 11: 199, 2013 Nov 15.

Article in English | MEDLINE | ID: mdl-24229416

ABSTRACT

BACKGROUND: The SF-36 physical functioning scale (PF-10) and the Health Assessment Questionnaire disability index (HAQ-DI) are the most frequently used instruments for measuring self-reported physical function in rheumatoid arthritis (RA). The objective of this study was to develop a crosswalk between scores on the PF-10 and HAQ-DI in RA. METHODS: Item response theory (IRT) methods were used to co-calibrate both scales using data from 1791 RA patients. The appropriateness of a Rasch-based crosswalk was evaluated by comparing it with crosswalks based on a two-parameter and a multi-dimensional IRT model. The accuracy of the final crosswalk was cross-validated using baseline (n = 532) and 6-month follow-up (n = 276) data from an independent cohort of early RA patients. RESULTS: The PF-10 and HAQ-DI adequately fit a unidimensional Rasch model. Both scales measured a wide range of functioning, although the HAQ-DI tended to better target lower levels of functioning. The Rasch-based crosswalk performed similarly to crosswalks based on the two-parameter and multidimensional IRT models. Agreement between predicted and observed scale scores in the cross-validation sample was acceptable for group-level comparisons. The longitudinal validity in discriminating between disease response states was similar between observed and predicted scores. CONCLUSION: The crosswalk developed in this study allows for converting scores from one scale to the other and can be used for group-level analyses in patients with RA.

Subject(s)

Arthritis, Rheumatoid/physiopathology; Disability Evaluation; Quality of Life; Surveys and Questionnaires/standards; Adult; Aged; Confidence Intervals; Female; Humans; Male; Middle Aged; Models, Theoretical; Netherlands
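
The idea of a Rasch-based crosswalk described in the abstract above can be sketched as a two-step mapping: find the latent level at which the expected score on one scale equals the observed score, then compute the expected score on the other scale at that level. This is a simplified illustration, assuming dichotomous Rasch items calibrated on a common scale. The PF-10 and HAQ-DI use polytomous items, and the item difficulties and function names below are hypothetical.

```python
import math

def expected_score(theta, difficulties):
    """Test characteristic curve: expected sum score under the Rasch model."""
    return sum(1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties)

def theta_for_score(score, difficulties, lo=-8.0, hi=8.0):
    """Invert the test characteristic curve by bisection.

    The score must lie strictly between 0 and the number of items.
    """
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if expected_score(mid, difficulties) < score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def crosswalk(score_a, scale_a, scale_b):
    """Map a sum score on scale A to an expected sum score on scale B."""
    theta = theta_for_score(score_a, scale_a)
    return expected_score(theta, scale_b)

# Hypothetical co-calibrated item difficulties for two short scales.
scale_a = [-1.5, -0.5, 0.0, 0.5, 1.5]
scale_b = [-2.0, -1.0, 0.0, 1.0]

for s in (1.0, 2.5, 4.0):
    print(s, round(crosswalk(s, scale_a, scale_b), 2))
```

Because the mapping passes through a single latent value, it carries no person-level measurement error, which fits the abstract's recommendation to use the crosswalk for group-level analyses.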

8.

Application of the health assessment questionnaire disability index to various rheumatic diseases.

van Groen, Maaike M; ten Klooster, Peter M; Taal, Erik; van de Laar, Mart A F J; Glas, Cees A W.

Qual Life Res ; 19(9): 1255-63, 2010 Nov.

Article in English | MEDLINE | ID: mdl-20559736

ABSTRACT

PURPOSE: To investigate whether the Stanford Health Assessment Questionnaire Disability Index (HAQ-DI) can serve as a generic instrument for measuring disability across different rheumatic diseases and to propose a scoring method based on item response theory (IRT) modeling to support this goal. METHODS: The HAQ-DI was administered to a cross-sectional sample of patients with confirmed rheumatoid arthritis (n = 619), osteoarthritis (n = 125), or gout (n = 102). The results were analyzed using the generalized partial credit model as an IRT model. RESULTS: It was found that 4 out of 8 item categories of the HAQ-DI displayed substantial differential item functioning (DIF) over the three diseases. Further, it was shown that this DIF could be modeled using an IRT model with disease-specific item parameters, which produces measures that are comparable for the three diseases. CONCLUSION: Although the HAQ-DI partially functioned differently in the three disease groups, the measurement regarding the disability level of the patients can be made comparable using IRT methods.

Subject(s)

Disability Evaluation; Rheumatic Diseases/physiopathology; Aged; Female; Humans; Male; Middle Aged; Models, Theoretical; Quality of Life; Rheumatic Diseases/psychology; Surveys and Questionnaires
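
The generalized partial credit model named in the abstract above assigns each response category a probability proportional to the exponential of a cumulative sum of discrimination-weighted differences between the latent level and the step parameters. The sketch below shows that formula together with one way disease-specific item parameters could be represented. All parameter values and names are hypothetical and are not estimates from the study.

```python
import math

def gpcm_probs(theta, a, thresholds):
    """Category probabilities under the generalized partial credit model.

    thresholds[k - 1] is the step parameter for moving from category k - 1 to k.
    """
    # Cumulative sums of a * (theta - b_v); the empty sum for category 0 is 0.
    logits = [0.0]
    for b in thresholds:
        logits.append(logits[-1] + a * (theta - b))
    m = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(z - m) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical disease-specific parameters for one item that shows DIF:
# disease -> (discrimination, step parameters). Values are illustrative only.
item_params = {
    "rheumatoid arthritis": (1.4, [-0.5, 0.6, 1.8]),
    "osteoarthritis": (1.4, [-0.2, 0.9, 2.1]),
    "gout": (1.4, [0.3, 1.2, 2.4]),
}

def expected_item_score(theta, disease):
    """Expected category score (0 to 3) at theta for the given disease group."""
    a, thresholds = item_params[disease]
    return sum(k * p for k, p in enumerate(gpcm_probs(theta, a, thresholds)))

for disease in item_params:
    print(disease, round(expected_item_score(0.5, disease), 3))
```

With group-specific parameters, the same latent disability level yields different expected item scores per disease, while the latent scale itself remains comparable across groups.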

9.

Marginal likelihood inference for a model for item responses and response times.

Glas, Cees A W; van der Linden, Wim J.

Br J Math Stat Psychol ; 63(Pt 3): 603-26, 2010 Nov.

Article in English | MEDLINE | ID: mdl-20109271

ABSTRACT

Marginal maximum-likelihood procedures for parameter estimation and testing the fit of a hierarchical model for speed and accuracy on test items are presented. The model is a composition of two first-level models for dichotomous responses and response times along with multivariate normal models for their item and person parameters. It is shown how the item parameters can easily be estimated using Fisher's identity. To test the fit of the model, Lagrange multiplier tests of the assumptions of subpopulation invariance of the item parameters (i.e., no differential item functioning), the shape of the response functions, and three different types of conditional independence were derived. Simulation studies were used to show the feasibility of the estimation and testing procedures and to estimate the power and Type I error rate of the latter. In addition, the procedures were applied to an empirical data set from a computerized adaptive test of language comprehension.

Subject(s)

Data Collection/statistics & numerical data; Likelihood Functions; Models, Statistical; Psychological Tests/statistics & numerical data; Reaction Time; Algorithms; Comprehension; Computer Simulation; Educational Measurement/statistics & numerical data; Feasibility Studies; Humans; Mathematical Computing; Multilingualism; Multivariate Analysis; Probability; Psychometrics/statistics & numerical data; Reproducibility of Results; Software

10.

Combining Text Mining of Long Constructed Responses and Item-Based Measures: A Hybrid Test Design to Screen for Posttraumatic Stress Disorder (PTSD).

He, Qiwei; Veldkamp, Bernard P; Glas, Cees A W; van den Berg, Stéphanie M.

Front Psychol ; 10: 2358, 2019.

Article in English | MEDLINE | ID: mdl-31695647

ABSTRACT

This article introduces a new hybrid intake procedure developed for posttraumatic stress disorder (PTSD) screening, which combines an automated textual assessment of respondents' self-narratives and item-based measures that are administered subsequently. Text mining techniques and item response modeling were used to analyze long constructed responses (i.e., self-narratives) and responses to standardized questionnaires (i.e., multiple-choice items), respectively. The whole procedure is combined in a Bayesian framework where the textual assessment functions as prior information for the estimation of the PTSD latent trait. The purpose of this study is twofold: first, to investigate whether the combination model of textual analysis and item-based scaling could enhance the classification accuracy of PTSD, and second, to examine whether the standard error of estimates could be reduced through the use of the narrative as a sort of routing test. With the sample at hand, the combination model resulted in a reduction in the misclassification rate, as well as a decrease in the standard error of latent trait estimation. These findings highlight the benefits of combining textual assessment and item-based measures in a psychiatric screening process. We conclude that the hybrid test design is a promising approach to increase test efficiency and is expected to be applicable in a broader scope of educational and psychological measurement in the future.
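
The Bayesian combination described in the abstract above, where the textual assessment acts as prior information for the latent trait, can be illustrated with a grid approximation of the posterior. This is a minimal sketch, assuming a normal prior and dichotomous two-parameter logistic items. The prior values standing in for the text-mining result, the item parameters, and the function names are all hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def posterior_summary(responses, items, prior_mean, prior_sd):
    """Posterior mean (EAP) and SD of theta on a grid, given 0/1 item responses."""
    grid = [-4.0 + 0.05 * i for i in range(161)]
    weights = []
    for theta in grid:
        w = normal_pdf(theta, prior_mean, prior_sd)
        for x, (a, b) in zip(responses, items):
            p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
            w *= p if x == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)

# Hypothetical 2PL items (discrimination, location) and one response pattern.
items = [(1.3, -0.5), (1.1, 0.0), (1.6, 0.4), (0.9, 1.0)]
responses = [1, 1, 1, 0]

# Standard normal prior versus a hypothetical narrative-informed prior.
eap_std, sd_std = posterior_summary(responses, items, prior_mean=0.0, prior_sd=1.0)
eap_txt, sd_txt = posterior_summary(responses, items, prior_mean=0.8, prior_sd=0.6)
print(round(sd_std, 3), round(sd_txt, 3))
```

The posterior standard deviation is smaller under the narrower prior, which mirrors the reduction in standard error that the abstract attributes to using the narrative as a routing test.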

11.

Measuring Patient-Reported Outcomes Adaptively: Multidimensionality Matters!

Paap, Muirne C S; Kroeze, Karel A; Glas, Cees A W; Terwee, Caroline B; van der Palen, Job; Veldkamp, Bernard P.

Appl Psychol Meas ; 42(5): 327-342, 2018 Jul.

Article in English | MEDLINE | ID: mdl-29962559

ABSTRACT

As there is currently a marked increase in the use of both unidimensional (UCAT) and multidimensional computerized adaptive testing (MCAT) in psychological and health measurement, the main aim of the present study is to assess the incremental value of using MCAT rather than separate UCATs for each dimension. Simulations are based on empirical data that could be considered typical for health measurement: a large number of dimensions (4), strong correlations among dimensions (.77-.87), and polytomously scored response data. Both variable- (SE < .316, SE < .387) and fixed-length conditions (total test length of 12, 20, or 32 items) are studied. The item parameters and variance-covariance matrix Φ are estimated with the multidimensional graded response model (GRM). Outcome variables include computerized adaptive test (CAT) length, root mean square error (RMSE), and bias. Both simulated and empirical latent trait distributions are used to sample vectors of true scores. MCATs were generally more efficient (in terms of test length) and more accurate (in terms of RMSE) than their UCAT counterparts. Absolute average bias was highest for variable-length UCATs with termination rule SE < .387. Test length of variable-length MCATs was on average 20% to 25% shorter than test length across separate UCATs. This study showed that there are clear advantages of using MCAT rather than UCAT in a setting typical for health measurement.
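
The two variable-length stopping rules in the abstract above can be read as reliability targets. Assuming the latent trait is scaled to unit variance (an assumption, since the abstract does not state the scaling), reliability is approximately 1 minus the squared standard error, so SE < .316 corresponds to about 0.90 and SE < .387 to about 0.85. The helper names below are illustrative.

```python
import math

def reliability_from_se(se, trait_variance=1.0):
    """Approximate reliability implied by a standard error of measurement."""
    return 1.0 - se ** 2 / trait_variance

def se_from_reliability(reliability, trait_variance=1.0):
    """Standard error needed to reach a target reliability."""
    return math.sqrt((1.0 - reliability) * trait_variance)

# Stopping thresholds quoted in the abstract.
for se in (0.316, 0.387):
    print(se, round(reliability_from_se(se), 2))
```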

12.

Automated Assessment of Patients' Self-Narratives for Posttraumatic Stress Disorder Screening Using Natural Language Processing and Text Mining.

He, Qiwei; Veldkamp, Bernard P; Glas, Cees A W; de Vries, Theo.

Assessment ; 24(2): 157-172, 2017 Mar.

Article in English | MEDLINE | ID: mdl-26358713

ABSTRACT

Patients' narratives about traumatic experiences and symptoms are useful in clinical screening and diagnostic procedures. In this study, we presented an automated assessment system to screen patients for posttraumatic stress disorder via a natural language processing and text-mining approach. Four machine-learning algorithms (decision tree, naive Bayes, support vector machine, and an alternative classification approach called the product score model) were used in combination with n-gram representation models to identify patterns between verbal features in self-narratives and psychiatric diagnoses. With our sample, the product score model with unigrams attained the highest prediction accuracy when compared with practitioners' diagnoses. The addition of multigrams contributed most to balancing the metrics of sensitivity and specificity. This article also demonstrates that text mining is a promising approach for analyzing patients' self-expression behavior, thus helping clinicians identify potential patients from an early stage.

Subject(s)

Data Mining; Diagnosis, Computer-Assisted; Mass Screening; Narration; Natural Language Processing; Self Report; Stress Disorders, Post-Traumatic; Adolescent; Adult; Algorithms; Decision Trees; Early Diagnosis; Female; Humans; Personality Assessment/statistics & numerical data; Reproducibility of Results; Stress Disorders, Post-Traumatic/classification; Stress Disorders, Post-Traumatic/diagnosis; Stress Disorders, Post-Traumatic/psychology
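
The n-gram representation mentioned in the abstract above turns a narrative into counts of single words (unigrams) and short word sequences (multigrams such as bigrams). The sketch below shows only that feature-extraction idea, with whitespace tokenization as a simplifying assumption. It does not reproduce the paper's preprocessing or its product score model, and the example sentence and function names are invented for illustration.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-token sequences, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_counts(text, max_n=2):
    """Count unigrams up to max_n-grams in a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts

# Invented example sentence, not taken from any patient narrative.
counts = ngram_counts("I could not sleep and I could not work")
print(counts.most_common(4))
```

Counts of this kind can serve as the input features for classifiers such as those listed in the abstract.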

13.

The Academic Medical Center Linear Disability Score (ALDS) item bank: item response theory analysis in a mixed patient population.

Holman, Rebecca; Weisscher, Nadine; Glas, Cees A W; Dijkgraaf, Marcel G W; Vermeulen, Marinus; de Haan, Rob J; Lindeboom, Robert.

Health Qual Life Outcomes ; 3: 83, 2005 Dec 29.

Article in English | MEDLINE | ID: mdl-16381611

ABSTRACT

BACKGROUND: Currently, there is a lot of interest in the flexible framework offered by item banks for measuring patient-relevant outcomes. However, few item banks have been developed to quantify functional status, as expressed by the ability to perform activities of daily life. This paper examines the measurement properties of the Academic Medical Center linear disability score item bank in a mixed population. METHODS: This paper uses item response theory to analyse data on 115 of 170 items from a total of 1002 respondents. These were: 551 (55%) residents of supported housing, residential care or nursing homes; 235 (23%) patients with chronic pain; 127 (13%) inpatients on a neurology ward following a stroke; and 89 (9%) patients suffering from Parkinson's disease. RESULTS: Of the 170 items, 115 were judged to be clinically relevant. Of these 115 items, 77 were retained in the item bank following the item response theory analysis. Of the 38 items that were excluded from the item bank, 24 had either been presented to fewer than 200 respondents or had fewer than 10% or more than 90% of responses in the category 'can carry out'. A further 11 items had different measurement properties for younger and older or for male and female respondents. Finally, 3 items were excluded because the item response theory model did not fit the data. CONCLUSION: The Academic Medical Center linear disability score item bank has promising measurement characteristics for the mixed patient population described in this paper. Further studies will be needed to examine the measurement properties of the item bank in other populations.

Subject(s)

Academic Medical Centers; Activities of Daily Living; Databases as Topic; Disability Evaluation; Outcome Assessment, Health Care/methods; Sickness Impact Profile; Adult; Aged; Aged, 80 and over; Chronic Disease; Female; Hospital Units; Humans; Logistic Models; Male; Middle Aged; Netherlands; Nursing Homes; Outcome Assessment, Health Care/statistics & numerical data; Pain/drug therapy; Pain/physiopathology; Parkinson Disease/drug therapy; Parkinson Disease/physiopathology; Stroke/physiopathology; Stroke/therapy; Surveys and Questionnaires

14.

Modelling non-ignorable missing-data mechanisms with item response theory models.

Holman, Rebecca; Glas, Cees A W.

Br J Math Stat Psychol ; 58(Pt 1): 1-17, 2005 May.

Article in English | MEDLINE | ID: mdl-15969835

ABSTRACT

A model-based procedure for assessing the extent to which missing data can be ignored and handling non-ignorable missing data is presented. The procedure is based on item response theory modelling. As an example, the approach is worked out in detail in conjunction with item response data modelled using the partial credit and generalized partial credit models. Simulation studies are carried out to assess the extent to which the bias caused by ignoring the missing-data mechanism can be reduced. Finally, the feasibility of the procedure is demonstrated using data from a study to calibrate a medical disability scale.

Subject(s)

Bias; Models, Statistical; Psychometrics/statistics & numerical data; Activities of Daily Living/classification; Data Collection/statistics & numerical data; Disability Evaluation; Humans; Mathematical Computing

15.

Construct Validation of a Multidimensional Computerized Adaptive Test for Fatigue in Rheumatoid Arthritis.

Nikolaus, Stephanie; Bode, Christina; Taal, Erik; Vonkeman, Harald E; Glas, Cees A W; van de Laar, Mart A F J.

PLoS One ; 10(12): e0145008, 2015.

Article in English | MEDLINE | ID: mdl-26710104

ABSTRACT

OBJECTIVE: Multidimensional computerized adaptive testing enables precise measurements of patient-reported outcomes at an individual level across different dimensions. This study examined the construct validity of a multidimensional computerized adaptive test (CAT) for fatigue in rheumatoid arthritis (RA). METHODS: The 'CAT Fatigue RA' was constructed based on a previously calibrated item bank. It contains 196 items and three dimensions: 'severity', 'impact' and 'variability' of fatigue. The CAT was administered to 166 patients with RA. They also completed a traditional, multidimensional fatigue questionnaire (BRAF-MDQ) and the SF-36 in order to examine the CAT's construct validity. The a priori criterion for construct validity was that 75% of the correlations between the CAT dimensions and the subscales of the other questionnaires were as expected. Furthermore, comprehensive use of the item bank, measurement precision and score distribution were investigated. RESULTS: The a priori criterion for construct validity was supported for two of the three CAT dimensions (severity and impact but not variability). For severity and impact, 87% of the correlations with the subscales of the well-established questionnaires were as expected, but for variability only 53% of the hypothesised relations were found. Eighty-nine percent of the items were selected between one and 137 times for CAT administrations. Measurement precision was excellent for the severity and impact dimensions, with more than 90% of the CAT administrations reaching a standard error below 0.32. The variability dimension showed good measurement precision with 90% of the CAT administrations reaching a standard error below 0.44. No floor or ceiling effects were found for the three dimensions. CONCLUSION: The CAT Fatigue RA showed good construct validity and excellent measurement precision on the dimensions severity and impact. The dimension variability had less ideal measurement characteristics, pointing to the need to recalibrate the CAT item bank with a two-dimensional model, solely consisting of severity and impact.

Subject(s)

Arthritis, Rheumatoid/pathology; Fatigue/diagnosis; Psychometrics/methods; Self Report; Adult; Aged; Aged, 80 and over; Computers; Female; Humans; Male; Middle Aged; Reproducibility of Results; Surveys and Questionnaires; Young Adult

16.

Assessment of fatigue in rheumatoid arthritis: a psychometric comparison of single-item, multiitem, and multidimensional measures.

Oude Voshaar, Martijn A H; Ten Klooster, Peter M; Bode, Christina; Vonkeman, Harald E; Glas, Cees A W; Jansen, Tim; van Albada-Kuipers, Iet; van Riel, Piet L C M; van de Laar, Mart A F J.

J Rheumatol ; 42(3): 413-20, 2015 Mar.

Article in English | MEDLINE | ID: mdl-25593225

ABSTRACT

OBJECTIVE: To compare the psychometric functioning of multidimensional disease-specific, multiitem generic, and single-item measures of fatigue in patients with rheumatoid arthritis (RA). METHODS: Confirmatory factor analysis (CFA) and longitudinal item response theory (IRT) modeling were used to evaluate the measurement structure and local reliability of the Bristol RA Fatigue Multi-Dimensional Questionnaire (BRAF-MDQ), the Medical Outcomes Study Short Form-36 (SF-36) vitality scale, and the BRAF Numerical Rating Scales (BRAF-NRS) in a sample of 588 patients with RA. RESULTS: A 1-factor CFA model yielded a similar fit to a 5-factor model with subscale-specific dimensions, and the items from the different instruments adequately fit the IRT model, suggesting essential unidimensionality in measurement. The SF-36 vitality scale outperformed the BRAF-MDQ at lower levels of fatigue, but was less precise at moderate to higher levels of fatigue. At these levels of fatigue, the living, cognition, and emotion subscales of the BRAF-MDQ provide additional precision. The BRAF-NRS showed a limited measurement range with its highest precision centered on average levels of fatigue. CONCLUSION: The different instruments appear to access a common underlying domain of fatigue severity, but differ considerably in their measurement precision along the continuum. The SF-36 vitality scale can be used to measure fatigue severity in samples with relatively mild fatigue. For samples expected to have higher levels of fatigue, the multidimensional BRAF-MDQ appears to be a better choice. The BRAF-NRS are not recommended if precise assessment is required, for instance in longitudinal settings.

Subject(s)

Arthritis, Rheumatoid/complications; Fatigue/diagnosis; Aged; Arthritis, Rheumatoid/psychology; Fatigue/complications; Fatigue/psychology; Female; Humans; Male; Middle Aged; Psychometrics; Reproducibility of Results; Severity of Illness Index; Surveys and Questionnaires

17.

Practical methods for dealing with 'not applicable' item responses in the AMC Linear Disability Score project.

Holman, Rebecca; Glas, Cees A W; Lindeboom, Robert; Zwinderman, Aeilko H; de Haan, Rob J.

Health Qual Life Outcomes ; 2: 29, 2004 Jun 16.

Article in English | MEDLINE | ID: mdl-15200681

ABSTRACT

BACKGROUND: Whenever questionnaires are used to collect data on constructs, such as functional status or health related quality of life, it is unlikely that all respondents will respond to all items. This paper examines ways of dealing with responses in a 'not applicable' category to items included in the AMC Linear Disability Score (ALDS) project item bank. METHODS: The data examined in this paper come from the responses of 392 respondents to 32 items and form part of the calibration sample for the ALDS item bank. The data are analysed using the one-parameter logistic item response theory model. The four practical strategies for dealing with this type of response are: cold deck imputation; hot deck imputation; treating the missing responses as if these items had never been offered to those individual patients; and using a model which takes account of the 'tendency to respond to items'. RESULTS: The item and respondent population parameter estimates were very similar for the strategies involving hot deck imputation; treating the missing responses as if these items had never been offered to those individual patients; and using a model which takes account of the 'tendency to respond to items'. The estimates obtained using the cold deck imputation method were substantially different. CONCLUSIONS: The cold deck imputation method was not considered suitable for use in the ALDS item bank. The other three methods described can be usefully implemented in the ALDS item bank, depending on the purpose of the data analysis to be carried out. These three methods may be useful for other data sets examining similar constructs, when item response theory based methods are used.
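As a rough illustration of the third strategy above (treating 'not applicable' responses as if the items had never been offered), the following minimal sketch evaluates a one-parameter logistic log-likelihood that simply skips missing responses. The function names and the use of `None` to encode 'not applicable' are assumptions made for this example; they are not taken from the article.

```python
import math


def rasch_prob(theta, beta):
    """Probability of a positive response under the one-parameter logistic model.

    theta: respondent ability; beta: item difficulty (hypothetical names).
    """
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def log_likelihood(theta, betas, responses):
    """Log-likelihood of one respondent's dichotomous responses.

    Responses coded None ('not applicable', an encoding assumed here) are
    skipped, so they contribute nothing, exactly as if the item had never
    been offered to that respondent.
    """
    total = 0.0
    for beta, x in zip(betas, responses):
        if x is None:
            continue  # item treated as not administered
        p = rasch_prob(theta, beta)
        total += math.log(p) if x == 1 else math.log(1.0 - p)
    return total
```

Under this treatment, a respondent with responses `[1, None, 0]` on three items has the same log-likelihood as one who was offered only the first and third items and answered `[1, 0]`.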

Subject(s)

Activities of Daily Living/classification; Disability Evaluation; Health Surveys; Logistic Models; Surveys and Questionnaires; Data Interpretation, Statistical; Health Status; Humans; Pilot Projects; Probability; Quality of Life; Severity of Illness Index

18.

Evaluation of global testing procedures for item fit to the Rasch model.

Suárez-Falcón, Juan C; Glas, Cees A W.

Br J Math Stat Psychol ; 56(Pt 1): 127-43, 2003 May.

Article in English | MEDLINE | ID: mdl-12803827

ABSTRACT

Two types of global testing procedures for item fit to the Rasch model were evaluated using simulation studies. The first type incorporates three tests based on first-order statistics: van den Wollenberg's Q(1) test, Glas's R(1) test, and Andersen's LR test. The second type incorporates three tests based on second-order statistics: van den Wollenberg's Q(2) test, Glas's R(2) test, and a non-parametric test proposed by Ponocny. The Type I error rates and the power against the violation of parallel item response curves, unidimensionality and local independence were analysed in relation to sample size and test length. In general, the outcomes indicate a satisfactory performance of all tests, except the Q(2) test which exhibits an inflated Type I error rate. Further, it was found that both types of tests have power against all three types of model violation. A possible explanation is the interdependencies among the assumptions underlying the model.

Subject(s)

Aptitude Tests/statistics & numerical data; Linear Models; Models, Psychological; Psychology/methods; Psychology/statistics & numerical data; Humans; Reproducibility of Results

19.

Assessing impact of differential symptom functioning on post-traumatic stress disorder (PTSD) diagnosis.

He, Qiwei; Glas, Cees A W; Veldkamp, Bernard P.

Int J Methods Psychiatr Res ; 23(2): 131-41, 2014 Jun.

Article in English | MEDLINE | ID: mdl-24436035

ABSTRACT

This article explores the generalizability of the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) diagnostic criteria for post-traumatic stress disorder (PTSD) to various subpopulations. Besides identifying the differential symptom functioning (also referred to as differential item functioning [DIF]) related to various background variables such as gender, marital status and educational level, this study emphasizes the importance of evaluating the impact of DIF on population inferences as made in health surveys and clinical trials, and on the diagnosis of individual patients. Using a sample from the National Comorbidity Study-Replication (NCS-R), four symptoms for gender, one symptom for marital status, and three symptoms for educational level were significantly flagged as DIF, but their impact on diagnosis was fairly small. We conclude that the DSM-IV diagnostic criteria for PTSD do not produce substantially biased results in the investigated subpopulations, and there should be few reservations regarding their use. Further, although the impact of DIF (i.e. the influence of differential symptom functioning on diagnostic results) was found to be quite small in the current study, we recommend that diagnosticians always perform a DIF analysis of various subpopulations using the methodology presented here to ensure the diagnostic criteria are valid in their own studies.

Subject(s)

Stress Disorders, Post-Traumatic/diagnosis; Stress Disorders, Post-Traumatic/physiopathology; Adult; Comorbidity; Diagnostic and Statistical Manual of Mental Disorders; Educational Status; Female; Humans; Male; Marital Status; Middle Aged; Sex Factors; Stress Disorders, Post-Traumatic/psychology

20.

Testing the difficulty theory of the SON-R 5(1/2)-17, a non-verbal test of intelligence.

Geerlings, Hanneke; Laros, Jacob A; Tellegen, Peter J; Glas, Cees A W.

Br J Math Stat Psychol ; 67(2): 248-65, 2014 May.

Article in English | MEDLINE | ID: mdl-23773035

ABSTRACT

Fischer's (1973) linear logistic test model can be used to test hypotheses regarding the effect of covariates on item difficulty and to predict the difficulty of newly constructed test items. However, its assumptions of equal discriminatory power across items and a perfect prediction of item difficulty are never absolutely met. The amount of misfit in an application of a Bayesian version of the model to two subtests of the SON-R 5(1/2)-17 is investigated by means of item fit statistics in the framework of posterior predictive checks and by means of a comparison with a model that allows for residual (co)variance in the item parameters. The effect of the degree of residual (co)variance on the robustness of inferences is investigated in a simulation study.
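For orientation, the linear logistic test model is usually written as a Rasch model whose item difficulties are constrained to be a linear function of item covariates. The notation below ($\theta_p$, $\beta_i$, $q_{ik}$, $\eta_k$, $\varepsilon_i$, $\sigma^2$) is the conventional one and is an assumption here, not copied from the article; the second line shows one common way to relax the perfect-prediction assumption with a random residual, which may differ from the exact specification the authors used.

```latex
% LLTM: Rasch model with difficulty beta_i predicted from K item covariates q_{ik}
P(X_{pi} = 1 \mid \theta_p) = \frac{\exp(\theta_p - \beta_i)}{1 + \exp(\theta_p - \beta_i)},
\qquad \beta_i = \sum_{k=1}^{K} q_{ik}\,\eta_k

% Relaxed version (assumed form): difficulty is predicted only up to a residual
\beta_i = \sum_{k=1}^{K} q_{ik}\,\eta_k + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, \sigma^2)
```

Setting $\sigma^2 = 0$ in the second form recovers the strict model, which is why the size of the residual variance is a natural measure of misfit.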

Subject(s)

Intelligence Tests/statistics & numerical data; Pattern Recognition, Visual; Problem Solving; Psychometrics/statistics & numerical data; Adolescent; Analysis of Variance; Bayes Theorem; Child; Discrimination, Psychological; Female; Humans; Linear Models; Logistic Models; Male; Models, Statistical; Orientation
