Results 1 - 9 of 9
1.
J Hand Surg Eur Vol ; 48(10): 1042-1047, 2023 11.
Article in English | MEDLINE | ID: mdl-37066610

ABSTRACT

In outcome measures, item response theory (IRT) validation can deliver interval-scaled high-quality measurement that can be harnessed using computerized adaptive tests (CATs) to pose fewer questions to patients. We aimed to develop a CAT by developing an IRT model for the Patient Evaluation Measure (PEM) for patients undergoing cubital tunnel syndrome (CuTS) surgery. Nine hundred and seventy-nine completed PEM responses of patients with CuTS in the United Kingdom Hand Registry were used to develop and calibrate the CAT. Its performance was then evaluated in a simulated cohort of 1000 patients. The CAT reduced the original PEM length from ten to a median of two questions (range two to four), while preserving a high level of precision (median standard error of measurement of 0.27). The mean error between the CAT score and full-length score was 0.08%. A Bland-Altman analysis showed good agreement with no signs of bias. The CAT version of the PEM can substantially reduce patient burden while enhancing construct validity by harnessing IRT for patients undergoing CuTS surgery.
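The adaptive procedure this abstract describes can be illustrated with a minimal sketch. It assumes a dichotomous two-parameter logistic IRT model, a made-up item bank, maximum-information item selection, grid-based expected a posteriori scoring with a standard normal prior, and a stopping rule on the posterior standard deviation. None of these choices is taken from the study, whose PEM items are polytomous; every name and parameter value below is hypothetical.

```python
import math

# Hypothetical item bank: (discrimination a, difficulty b) for each item.
ITEM_BANK = [(2.5, -1.5), (2.2, -0.5), (2.8, 0.0), (2.4, 0.6), (2.0, 1.4),
             (1.8, -1.0), (2.6, 0.3), (2.1, 1.0), (1.9, -0.2), (2.3, 0.8)]
GRID = [i / 10 for i in range(-40, 41)]  # trait (theta) grid from -4 to 4


def prob_endorse(theta, a, b):
    """2PL probability of endorsing an item at trait level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at theta."""
    p = prob_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def eap_estimate(responses):
    """Posterior mean and SD of theta given {item_index: 0 or 1}."""
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)  # unnormalised standard normal prior
        for idx, answer in responses.items():
            p = prob_endorse(theta, *ITEM_BANK[idx])
            w *= p if answer == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)


def run_cat(answer_fn, se_target=0.4, max_items=None):
    """Administer items until the posterior SD is at or below se_target."""
    max_items = max_items or len(ITEM_BANK)
    responses = {}
    theta, se = eap_estimate(responses)
    while se > se_target and len(responses) < max_items:
        remaining = [i for i in range(len(ITEM_BANK)) if i not in responses]
        # Pick the unused item that is most informative at the current estimate.
        nxt = max(remaining, key=lambda i: item_information(theta, *ITEM_BANK[i]))
        responses[nxt] = answer_fn(nxt)
        theta, se = eap_estimate(responses)
    return theta, se, list(responses)
```

Calling `run_cat(lambda i: 1 if ITEM_BANK[i][1] < 0.5 else 0)` simulates a respondent who endorses every item whose difficulty is below 0.5; the returned list shows which items were asked and in what order.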


Subjects
Cubital Tunnel Syndrome , Humans , Cubital Tunnel Syndrome/diagnosis , Cubital Tunnel Syndrome/surgery , Computerized Adaptive Testing , Surveys and Questionnaires , Health Care Outcome Assessment , Upper Extremity
2.
J Med Internet Res ; 25: e41870, 2023 04 27.
Article in English | MEDLINE | ID: mdl-37104031

ABSTRACT

BACKGROUND: Routine use of patient-reported outcome measures (PROMs) and computerized adaptive tests (CATs) may improve care in a range of surgical conditions. However, most available CATs are neither condition-specific nor coproduced with patients and lack clinically relevant score interpretation. Recently, a PROM called the CLEFT-Q has been developed for use in the treatment of cleft lip or palate (CL/P), but the assessment burden may be limiting its uptake into clinical practice. OBJECTIVE: We aimed to develop a CAT for the CLEFT-Q, which could facilitate the uptake of the CLEFT-Q PROM internationally. We aimed to conduct this work with a novel patient-centered approach and make source code available as an open-source framework for CAT development in other surgical conditions. METHODS: CATs were developed with the Rasch measurement theory, using full-length CLEFT-Q responses collected during the CLEFT-Q field test (this included 2434 patients across 12 countries). These algorithms were validated in Monte Carlo simulations involving full-length CLEFT-Q responses collected from 536 patients. In these simulations, the CAT algorithms approximated full-length CLEFT-Q scores iteratively, using progressively fewer items from the full-length PROM. Agreement between full-length CLEFT-Q score and CAT score at different assessment lengths was measured using the Pearson correlation coefficient, root-mean-square error (RMSE), and 95% limits of agreement. CAT settings, including the number of items to be included in the final assessments, were determined in a multistakeholder workshop that included patients and health care professionals. A user interface was developed for the platform, and it was prospectively piloted in the United Kingdom and the Netherlands. Interviews were conducted with 6 patients and 4 clinicians to explore end-user experience. 
RESULTS: The length of all 8 CLEFT-Q scales in the International Consortium for Health Outcomes Measurement (ICHOM) Standard Set combined was reduced from 76 to 59 items, and at this length, CAT assessments reproduced full-length CLEFT-Q scores accurately (with correlations between full-length CLEFT-Q score and CAT score exceeding 0.97, and the RMSE ranging from 2 to 5 out of 100). Workshop stakeholders considered this the optimal balance between accuracy and assessment burden. The platform was perceived to improve clinical communication and facilitate shared decision-making. CONCLUSIONS: Our platform is likely to facilitate routine CLEFT-Q uptake, and this may have a positive impact on clinical care. Our free source code enables other researchers to rapidly and economically reproduce this work for other PROMs.
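The agreement statistics named in the methods (Pearson correlation, RMSE and 95% limits of agreement) can be computed as in the following sketch. The function name, the example scores and the use of 1.96 standard deviations of the differences for the limits are illustrative assumptions, not the study's released source code.

```python
import math


def agreement_metrics(full_scores, cat_scores):
    """Pearson r, RMSE and Bland-Altman limits between two score lists."""
    n = len(full_scores)
    if n != len(cat_scores) or n < 2:
        raise ValueError("need two equally long lists with at least 2 scores")

    # Differences (CAT minus full-length) drive RMSE and limits of agreement.
    diffs = [c - f for f, c in zip(full_scores, cat_scores)]
    bias = sum(diffs) / n
    sd_diff = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    # Pearson correlation from centred sums of squares and cross-products.
    mean_f = sum(full_scores) / n
    mean_c = sum(cat_scores) / n
    cov = sum((f - mean_f) * (c - mean_c) for f, c in zip(full_scores, cat_scores))
    ss_f = sum((f - mean_f) ** 2 for f in full_scores)
    ss_c = sum((c - mean_c) ** 2 for c in cat_scores)
    pearson_r = cov / math.sqrt(ss_f * ss_c)

    return {
        "pearson_r": pearson_r,
        "rmse": rmse,
        "bias": bias,
        "loa_lower": bias - 1.96 * sd_diff,
        "loa_upper": bias + 1.96 * sd_diff,
    }


# Hypothetical scores on a 0-100 scale, for illustration only.
example = agreement_metrics([10, 20, 30, 40, 50], [12, 18, 31, 41, 48])
```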


Subjects
Cleft Lip , Cleft Palate , Plastic Surgery Procedures , Plastic Surgery , Humans , Cleft Lip/surgery , Cleft Palate/surgery , Patient Reported Outcome Measures , Computerized Adaptive Testing
3.
Nephrol Dial Transplant ; 38(5): 1158-1169, 2023 05 04.
Article in English | MEDLINE | ID: mdl-35913734

ABSTRACT

BACKGROUND: The Patient-Reported Outcomes Measurement Information System (PROMIS®) has been recommended for computerized adaptive testing (CAT) of health-related quality of life. This study compared the content, validity, and reliability of seven PROMIS CATs to the 12-item Short-Form Health Survey (SF-12) in patients with advanced chronic kidney disease. METHODS: Adult patients with chronic kidney disease and an estimated glomerular filtration rate under 30 mL/min/1.73 m² who were not receiving dialysis treatment completed seven PROMIS CATs (assessing physical function, pain interference, fatigue, sleep disturbance, anxiety, depression, and the ability to participate in social roles and activities), the SF-12, and the PROMIS Pain Intensity single item and Dialysis Symptom Index at inclusion and 2 weeks. A content comparison was performed between PROMIS CATs and the SF-12. Construct validity of PROMIS CATs was assessed using Pearson's correlations. We assessed the test-retest reliability of all patient-reported outcome measures by calculating the intraclass correlation coefficient and minimal detectable change. RESULTS: In total, 207 patients participated in the study. A median of 45 items (10 minutes) were completed for PROMIS CATs. All PROMIS CATs showed evidence of sufficient construct validity. PROMIS CATs, most SF-12 domains and summary scores, and Dialysis Symptom Index showed sufficient test-retest reliability (intraclass correlation coefficient ≥ 0.70). PROMIS CATs had a lower minimal detectable change compared with the SF-12 (range, 5.7-7.4 compared with 11.3-21.7 across domains, respectively). CONCLUSION: PROMIS CATs showed sufficient construct validity and test-retest reliability in patients with advanced chronic kidney disease. PROMIS CATs required more items but showed better reliability than the SF-12. Future research is needed to investigate the feasibility of PROMIS CATs for routine nephrology care.
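The link between the intraclass correlation coefficient and the minimal detectable change can be made concrete with a short sketch. The abstract does not state which formula was used; the code assumes the common convention SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM, and the input values are hypothetical rather than study results.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and the test-retest ICC (assumed formula)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sd, icc, z=1.96):
    """MDC at the confidence level implied by z (1.96 gives MDC95)."""
    return z * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


# Hypothetical inputs: a T-score SD of 10 and an ICC of 0.90.
mdc_example = minimal_detectable_change(sd=10.0, icc=0.90)
```

Under this convention a higher ICC shrinks the SEM and therefore the MDC, which is why better test-retest reliability and a lower minimal detectable change tend to go together.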


Subjects
Quality of Life , Chronic Renal Insufficiency , Humans , Reproducibility of Results , Computerized Adaptive Testing , Surveys and Questionnaires , Renal Dialysis , Patient Reported Outcome Measures , Chronic Renal Insufficiency/diagnosis , Chronic Renal Insufficiency/therapy , Information Systems
4.
J Pediatr Orthop ; 42(7): e720-e726, 2022 Aug 01.
Article in English | MEDLINE | ID: mdl-35703245

ABSTRACT

BACKGROUND: The use of patient-reported outcome measures, especially Patient-Reported Outcomes Measurement Information System (PROMIS) measures, has increased in recent years. Given this growth, it is imperative to ensure that the measures being used are validated for the intended population(s)/disease(s). Our objective was to assess the construct validity of 8 PROMIS computer adaptive testing (CAT) measures among children with adolescent idiopathic scoliosis (AIS). METHODS: We prospectively enrolled 200 children (aged 10 to 17 y) with AIS, who completed 8 PROMIS CATs (Anxiety, Depressive Symptoms, Mobility, Pain Behavior, Pain Interference, Peer Relationships, Physical Activity, Physical Stress Experiences) and the Scoliosis Research Society-22r questionnaire (SRS-22r) electronically. Treatment categories were observation, bracing, indicated for surgery, or postoperative from posterior spinal fusion. Construct validity was evaluated using known group analysis and convergent and discriminant validity analyses. Analysis of variance was used to identify differences in PROMIS T-scores by treatment category (known groups). The Spearman rank correlation coefficient (rs) was calculated between corresponding PROMIS and SRS-22r domains (convergent) and between unrelated PROMIS domains (discriminant). Floor/ceiling effects were calculated. RESULTS: Among treatment categories, significant differences were found in PROMIS Mobility, Pain Behavior, Pain Interference, and Physical Stress Experiences and in all SRS-22r domains (P < 0.05) except Mental Health (P = 0.15). SRS-22r Pain was strongly correlated with PROMIS Pain Interference (rs = -0.72) and Pain Behavior (rs = -0.71) and moderately correlated with Physical Stress Experiences (rs = -0.57). SRS-22r Mental Health was strongly correlated with PROMIS Depressive Symptoms (rs = -0.72) and moderately correlated with Anxiety (rs = -0.62).
SRS-22r Function was moderately correlated with PROMIS Mobility (rs = 0.64) and weakly correlated with Physical Activity (rs = 0.34). SRS-22r Self-Image was weakly correlated with PROMIS Peer Relationships (rs = 0.33). All unrelated PROMIS CATs were weakly correlated (|rs| < 0.40). PROMIS Anxiety, Mobility, Pain Behavior, and Pain Interference and SRS-22r Function, Pain, and Satisfaction displayed ceiling effects. CONCLUSIONS: Evidence supports the construct validity of 6 PROMIS CATs in evaluating AIS patients. Ceiling effects should be considered when using specific PROMIS CATs. LEVEL OF EVIDENCE: Level II, prognostic.
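The Spearman rank correlation used for the convergent and discriminant analyses can be sketched in plain Python. The helper names and the sample data are hypothetical; tied values receive their average rank, which is the usual convention but is not something the abstract specifies.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def spearman_rs(x, y):
    """Spearman coefficient: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    ssx = sum((a - mx) ** 2 for a in rx)
    ssy = sum((b - my) ** 2 for b in ry)
    if ssx == 0 or ssy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(ssx * ssy)


# Hypothetical paired domain scores, for illustration only.
rs_example = spearman_rs([1, 2, 3, 4, 5], [5, 6, 7, 8, 7])
```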


Subjects
Kyphosis , Scoliosis , Computerized Adaptive Testing , Humans , Pain , Patient Reported Outcome Measures , Quality of Life , Scoliosis/surgery , Surveys and Questionnaires
5.
Arthroscopy ; 38(11): 3023-3029, 2022 11.
Article in English | MEDLINE | ID: mdl-35469995

ABSTRACT

PURPOSE: To evaluate the reliability, construct validity, and responsiveness of the lower extremity-specific Patient-Reported Outcomes Measurement Information System (PROMIS) Mobility (MO) bank in patients who underwent hip arthroscopic surgery for femoroacetabular impingement. METHODS: Patients who underwent primary hip arthroscopic surgery at a large academic musculoskeletal specialty center between November 2019 and November 2020 completed the following baseline and 6-month measures: PROMIS MO, PROMIS Pain Interference (PI), PROMIS Physical Function (PF), modified Harris Hip Score, International Hip Outcome Tool 33, visual analog scale, and Single Assessment Numeric Evaluation. Construct validity was evaluated using Spearman correlation coefficients. The number of questions until completion was recorded as a marker of test burden. The percentage of patients scoring at the extreme high (ceiling) or low (floor) for each measure was recorded to measure inclusivity. Responsiveness was tested by comparing differences between baseline and 6-month measures, controlling for age and sex, using generalized estimating equations. Magnitudes of responsiveness were assessed through the effect size (Cohen d). RESULTS: In this study, 660 patients (50% female patients) aged 32 ± 14 years were evaluated. PROMIS MO showed a strong correlation with PROMIS PF (r = 0.84, P < .001), the International Hip Outcome Tool 33 (r = 0.73, P < .001), PROMIS PI (r = -0.76, P < .001), and the modified Harris Hip Score (r = 0.73, P < .001). Neither PROMIS MO, PROMIS PI, nor PROMIS PF met the conventional criteria for floor or ceiling effects (≥15%). The mean number of questions answered (± standard deviation) was 4.7 ± 2.1 for PROMIS MO, 4.1 ± 0.6 for PROMIS PI, and 4.1 ± 0.6 for PROMIS PF. From baseline to 6 months, the PROMIS and legacy measures exhibited significant responsiveness (P < .05), with similar effect sizes between the patient-reported outcome measures. 
CONCLUSIONS: This longitudinal study reveals that in patients undergoing hip arthroscopy, PROMIS MO computerized adaptive testing maintains high correlation with legacy hip-specific instruments, significant responsiveness to change, and low test burden compared with legacy measures, with no ceiling or floor effects at 6-month postoperative follow-up. LEVEL OF EVIDENCE: Level IV, retrospective case series.
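The floor and ceiling check described in the methods, with the conventional 15% criterion quoted in the results, can be sketched as follows. The function name, the score range and the sample scores are hypothetical and are not study data.

```python
def floor_ceiling_effects(scores, min_score, max_score, threshold=0.15):
    """Share of respondents at the scale extremes and whether each meets the threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    floor_share = sum(1 for s in scores if s <= min_score) / n
    ceiling_share = sum(1 for s in scores if s >= max_score) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor_effect": floor_share >= threshold,
        "ceiling_effect": ceiling_share >= threshold,
    }


# Hypothetical scores on a 0-100 instrument, for illustration only.
effects = floor_ceiling_effects([0, 0, 10, 20, 30, 40, 50, 60, 70, 100], 0, 100)
```

In this made-up sample two of ten respondents sit at the minimum, so a floor effect is flagged, while the single respondent at the maximum falls below the threshold.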


Subjects
Arthroscopy , Femoroacetabular Impingement , Humans , Female , Male , Femoroacetabular Impingement/surgery , Retrospective Studies , Reproducibility of Results , Longitudinal Studies , Computerized Adaptive Testing , Patient Reported Outcome Measures , Information Systems
6.
J Hand Surg Eur Vol ; 47(9): 893-898, 2022 10.
Article in English | MEDLINE | ID: mdl-35313764

ABSTRACT

We aimed to develop a computerized adaptive testing (CAT) version of the 11-item Patient Evaluation Measure (PEM), using an item response theory model. This model transformed the ordinal scores into ratio-interval scores. We obtained PEM responses from 924 patients with trapeziometacarpal osteoarthritis to build a CAT model and tested its performance on a simulated cohort of 1000 PEM response sets. The CAT achieved high precision (median standard error of measurement 0.26) and reduced the number of questions needed for accurate scoring from 11 to a median of two. The CAT scores and item-response-theory-based 11-item PEM scores were similar, and a Bland-Altman analysis demonstrated a mean score difference of 0.2 between the CAT and the full-length PEM scores on a scale from 0 to 100. We conclude that the CAT substantially reduced the burden of the PEM while also harnessing the validity of item response theory scoring.


Subjects
Computerized Adaptive Testing , Osteoarthritis , Humans , Osteoarthritis/diagnosis , Registries , Reproducibility of Results , Surveys and Questionnaires
7.
J Hand Surg Eur Vol ; 47(7): 750-754, 2022 07.
Article in English | MEDLINE | ID: mdl-35225047

ABSTRACT

The QuickDASH is a short-form version of the DASH questionnaire, the most widely used patient-reported outcome measure in hand surgery. Multidimensional computerized adaptive testing (MCAT) can produce shorter and more precise testing than static short forms, like QuickDASH. We used DASH responses from 507 patients with Dupuytren's disease to develop an MCAT. The algorithm was evaluated in a Monte Carlo simulation, where the standard error of measurement (SEm) of scores obtained from the 11-item QuickDASH was compared with scores obtained from an MCAT that could administer up to 11 items from the full 30-item DASH. The MCAT asked a mean of 8.51 items (SD 2.93) and 265/1000 simulated respondents needed to complete five or fewer items. Median SEms were better for DASH MCAT: 0.299 (hand function) and 0.256 (sensory symptoms) versus 0.320 and 0.290, respectively, for QuickDASH. Our study showed that the DASH MCAT can produce more precise DASH measurement than the QuickDASH, from fewer items.


Subjects
Dupuytren Contracture , Computerized Adaptive Testing , Disability Evaluation , Dupuytren Contracture/diagnosis , Dupuytren Contracture/surgery , Humans , Reproducibility of Results , Surveys and Questionnaires
8.
Plast Reconstr Surg ; 148(4): 863-869, 2021 Oct 01.
Article in English | MEDLINE | ID: mdl-34415858

ABSTRACT

BACKGROUND: Skin cancer is among the most frequently occurring malignancies worldwide, which creates a great need for an effective patient-reported outcome measure. Providing shorter questionnaires reduces patient burden and increases patients' willingness to complete forms. The authors set out to use computerized adaptive testing to reduce the number of items needed to predict results for scales of the FACE-Q Skin Cancer Module, a validated patient-reported outcome measure that measures health-related quality of life and patient satisfaction in facial surgery. METHODS: Computerized adaptive testing generates tailored questionnaires for patients in real time based on their responses to previous questions. The authors used open-source computerized adaptive testing simulation software to run item responses for the five scales from the FACE-Q Skin Cancer Module (i.e., scar appraisal, satisfaction with facial appearance, appearance-related psychosocial distress, cancer worry, and satisfaction with information about appearance). Each simulation continued to administer items until prespecified levels of precision were met, estimated by standard error. Mean and maximum item reductions between the original fixed-length short forms and the simulated versions were evaluated. RESULTS: The number of questions that patients needed to answer to complete the FACE-Q Skin Cancer Module was reduced from 41 items in the original form to a mean of 23 ± 0.55 items (range, 15 to 29) using the computerized adaptive testing version. Simulated computerized adaptive testing scores maintained a high correlation (0.98 to 0.99) with the score from the fixed-length short forms. CONCLUSIONS: Applying computerized adaptive testing to the FACE-Q Skin Cancer Module can reduce the length of assessment by more than 50 percent, with virtually no loss in precision. It is likely to play a critical role in the implementation in clinical practice.


Subjects
Facial Neoplasms/surgery , Patient Reported Outcome Measures , Plastic Surgery Procedures/statistics & numerical data , Skin Neoplasms/surgery , Surgical Wound/surgery , Computerized Adaptive Testing , Esthetics , Face/surgery , Facial Neoplasms/pathology , Humans , Patient Satisfaction/statistics & numerical data , Psychometrics/methods , Psychometrics/statistics & numerical data , Quality of Life , Plastic Surgery Procedures/psychology , Reproducibility of Results , Skin Neoplasms/psychology , Surgical Wound/etiology , Surveys and Questionnaires/statistics & numerical data
9.
JAMA Netw Open ; 4(7): e2115707, 2021 07 01.
Article in English | MEDLINE | ID: mdl-34236411

ABSTRACT

Importance: Veterans from recent and past conflicts have high rates of posttraumatic stress disorder (PTSD). Adaptive testing strategies can increase accuracy of diagnostic screening and symptom severity measurement while decreasing patient and clinician burden. Objective: To develop and validate a computerized adaptive diagnostic (CAD) screener and computerized adaptive test (CAT) for PTSD symptom severity. Design, Setting, and Participants: A diagnostic study of measure development and validation was conducted at a Veterans Health Administration facility. A total of 713 US military veterans were included. The study was conducted from April 25, 2017, to November 10, 2019. Main Outcomes and Measures: The participants completed a PTSD-symptom questionnaire from the item bank and provided responses on the PTSD Checklist for Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) (PCL-5). A subsample of 304 participants were interviewed using the Clinician-Administered Scale for PTSD for DSM-5. Results: Of the 713 participants, 585 were men; mean (SD) age was 52.8 (15.0) years. The CAD-PTSD reproduced the Clinician-Administered Scale for PTSD for DSM-5 PTSD diagnosis with high sensitivity and specificity as evidenced by an area under the curve of 0.91 (95% CI, 0.87-0.95). The CAT-PTSD demonstrated convergent validity with the PCL-5 (r = 0.88) and also tracked PTSD diagnosis (area under the curve = 0.85; 95% CI, 0.79-0.89). The CAT-PTSD reproduced the final 203-item bank score with a correlation of r = 0.95 with a mean of only 10 adaptively administered items, a 95% reduction in patient burden. Conclusions and Relevance: Using a maximum of only 6 items, the CAD-PTSD developed in this study was shown to have excellent diagnostic screening accuracy. Similarly, using a mean of 10 items, the CAT-PTSD provided valid severity ratings with excellent convergent validity with an extant scale containing twice the number of items. 
The 10-item CAT-PTSD also outperformed the 20-item PCL-5 in terms of diagnostic accuracy. The results suggest that scalable, valid, and rapid PTSD diagnostic screening and severity measurement are possible.
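The area under the curve reported for diagnostic accuracy can be read as the probability that a randomly chosen case scores higher than a randomly chosen non-case. The sketch below computes it by direct pairwise comparison, counting ties as one half. The function name and the sample scores are hypothetical, and this is not the estimation procedure used in the study.

```python
def area_under_curve(case_scores, noncase_scores):
    """AUC as the share of case/non-case pairs ranked correctly (ties count 0.5)."""
    if not case_scores or not noncase_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for c in case_scores:
        for n in noncase_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(noncase_scores))


# Hypothetical screener scores for people with and without a diagnosis.
auc_example = area_under_curve([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

The pairwise loop is quadratic in the sample size, which is acceptable for a few hundred participants; a rank-based formulation gives the same value more efficiently on larger samples.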


Subjects
Computerized Adaptive Testing/methods , Post-Traumatic Stress Disorders/classification , Veterans/psychology , Adult , Aged , Female , Humans , Male , Mass Screening/methods , Mass Screening/statistics & numerical data , Middle Aged , Post-Traumatic Stress Disorders/diagnosis , Post-Traumatic Stress Disorders/psychology , Surveys and Questionnaires , United States/epidemiology , Veterans/statistics & numerical data