ABSTRACT
OBJECTIVE: To evaluate reliability, validity and responsiveness of KOOS-12, a 12-item short form of the 42-item Knee injury and Osteoarthritis Outcome Score (KOOS) that provides Pain, Function and Quality of Life (QOL) scale scores and a summary knee impact score. DESIGN: Data from 1,392 knee osteoarthritis (OA) patients from the FORCE-TJR research cohort who completed KOOS before and 6 and 12 months after total knee replacement (TKR) were analyzed. KOOS-12 includes a pain frequency item and three items measuring pain during increasingly difficult (sitting/lying, walking, stairs) activities; function items about standing, rising from sitting, getting in/out of a car, and twisting/pivoting; and the 4-item KOOS QOL scale. Percent computable scale scores, floor and ceiling effects, internal consistency reliability, validity (scale correlations, tests of known groups validity using one-way analysis of variance (ANOVA)) and responsiveness (effect sizes, standardized response means) were compared for the KOOS-12, full-length KOOS, KOOS-PS and KOOS, JR. RESULTS: Internal consistency reliability was above 0.70 for all KOOS-12 scales and ≥0.90 for the KOOS-12 Summary score. Validity and responsiveness of KOOS-12 Pain, Function and QOL scales were satisfactory and reached similar conclusions as comparable full-length KOOS scales. The KOOS-12 Summary score was most responsive in discriminating between groups who differed in global ratings of post-TKR change in physical capabilities and had the highest effect sizes and standardized response means. CONCLUSIONS: KOOS-12 was a reliable and valid alternative to KOOS in TKR patients with moderate to severe OA and provided three domain-specific and summary knee impact scores with substantially reduced respondent burden.
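The two responsiveness statistics named in this abstract can be illustrated with a short sketch. It assumes the conventional definitions, which the abstract does not spell out: effect size (ES) as mean change divided by the standard deviation of baseline scores, and standardized response mean (SRM) as mean change divided by the standard deviation of the change scores. The function names and the five-patient data are hypothetical and are not FORCE-TJR data.

```python
from statistics import mean, stdev


def effect_size(pre, post):
    """Assumed definition: mean change / SD of baseline scores."""
    changes = [after - before for before, after in zip(pre, post)]
    return mean(changes) / stdev(pre)


def standardized_response_mean(pre, post):
    """Assumed definition: mean change / SD of the change scores."""
    changes = [after - before for before, after in zip(pre, post)]
    return mean(changes) / stdev(changes)


# Hypothetical 0-100 scale scores (higher = better) for five patients.
pre = [40.0, 35.0, 50.0, 45.0, 30.0]
post = [80.0, 70.0, 85.0, 90.0, 60.0]

print("ES :", round(effect_size(pre, post), 2))
print("SRM:", round(standardized_response_mean(pre, post), 2))
```

Both statistics share the same numerator, so they differ only when baseline variability and change-score variability differ.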
Subjects
Arthroplasty, Replacement, Knee/rehabilitation; Osteoarthritis, Knee/surgery; Activities of Daily Living; Adult; Aged; Aged, 80 and over; Exercise/physiology; Female; Humans; Knee Joint/physiopathology; Male; Middle Aged; Osteoarthritis, Knee/complications; Osteoarthritis, Knee/physiopathology; Osteoarthritis, Knee/rehabilitation; Pain/etiology; Pain Measurement/methods; Pain Measurement/standards; Patient Reported Outcome Measures; Psychometrics; Quality of Life; Recovery of Function; Reproducibility of Results; Severity of Illness Index
ABSTRACT
OBJECTIVE: To develop 12-item short forms (KOOS-12, HOOS-12) of the 42-item Knee injury and Osteoarthritis Outcome Score (KOOS) and 40-item Hip disability and Osteoarthritis Outcome Score (HOOS) that represent the full-length instruments sufficiently to provide joint-specific pain, function and quality of life (QOL) domain and summary joint impact scores. This paper describes KOOS-12 and HOOS-12 item selection. Subsequent papers will examine KOOS-12 and HOOS-12 reliability, validity and responsiveness. DESIGN: Items were selected based on qualitative information from patients, clinicians and KOOS/HOOS translators and analysis of data from 1,395 knee osteoarthritis (OA) and 1,281 hip OA patients from the FORCE-TJR cohort who completed KOOS or HOOS before and after total joint replacement (TJR). Item response theory models and computerized adaptive test (CAT) simulations were used to identify items that best measured patients' levels of pain and function pre- and post-TJR. KOOS-12/HOOS-12 items were selected based on content, coverage of a wide measurement range, high item information, item usage in CAT simulations, scale-level properties (reliability, validity, responsiveness), and qualitative information. RESULTS: KOOS-12 and HOOS-12 each included a pain frequency item and three items measuring pain during increasingly difficult activities (sitting/lying, walking, up/down stairs); function items about standing, rising from sitting, getting in/out of a car, and twisting/pivoting (KOOS-12) or walking on an uneven surface (HOOS-12); and the original 4-item QOL scale. CONCLUSIONS: This study demonstrated the benefits of examining patient-reported outcome measures using modern psychometric methods, to create short forms with diverse content that provide domain-specific and summary joint impact scores.
Subjects
Arthroplasty, Replacement, Hip/rehabilitation; Arthroplasty, Replacement, Knee/rehabilitation; Osteoarthritis, Hip/surgery; Osteoarthritis, Knee/surgery; Patient Reported Outcome Measures; Activities of Daily Living; Adult; Aged; Aged, 80 and over; Disability Evaluation; Female; Health Status Indicators; Humans; Male; Middle Aged; Osteoarthritis, Hip/rehabilitation; Osteoarthritis, Knee/rehabilitation; Pain Measurement/methods; Psychometrics; Quality of Life; Reproducibility of Results
ABSTRACT
OBJECTIVE: To evaluate reliability, validity and responsiveness of HOOS-12, a 12-item short form of the 40-item Hip disability and Osteoarthritis Outcome Score (HOOS). HOOS-12 provides Pain, Function and Quality of Life (QOL) scale scores and a summary hip impact score. DESIGN: Data from 1,273 FORCE-TJR hip osteoarthritis (OA) patients who completed HOOS before and six and 12 months after total hip replacement (THR) were analyzed. HOOS-12 includes a pain frequency item and three items measuring pain during increasingly difficult (sitting/lying, walking, stairs) activities; function items about standing, rising from sitting, getting in/out of a car, and walking on an uneven surface; and the 4-item HOOS QOL scale. Percent computable scale scores, floor and ceiling effects, internal consistency reliability, validity (scale correlations, tests of known groups validity using one-way analysis of variance (ANOVA)), and responsiveness (effect sizes (ES), standardized response means (SRM)) were compared for HOOS-12, full-length HOOS, HOOS-PS and HOOS, JR. RESULTS: Internal consistency reliability was above 0.70 for all HOOS-12 scales and above 0.90 for the HOOS-12 Summary score. Validity and responsiveness of HOOS-12 Pain, Function and QOL scales were satisfactory and reached similar conclusions as comparable full-length HOOS scales. The HOOS-12 Summary score was highly responsive in discriminating between groups who differed in global ratings of post-THR change in physical capabilities and had high ES and SRM. CONCLUSIONS: HOOS-12 was a reliable and valid alternative to HOOS in THR patients with moderate to severe OA and provided three domain-specific and summary hip impact scores with substantially reduced respondent burden.
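Internal consistency reliability, reported above against the 0.70 and 0.90 thresholds, is commonly estimated with Cronbach's alpha. The abstract does not name the coefficient used, so the sketch below is an assumption about the method; the function name and the response data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: one list of scores per item, all ordered by the same respondents.
    """
    k = len(items)
    # Total scale score per respondent (sum across items).
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_variances = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))


# Hypothetical responses from five people to a four-item scale (0-4 coding).
responses = [
    [0, 1, 2, 3, 4],
    [1, 1, 2, 3, 4],
    [0, 2, 2, 4, 4],
    [1, 1, 3, 3, 3],
]
print(round(cronbach_alpha(responses), 2))
```

Alpha rises as items covary more strongly relative to their individual variances, which is why a longer summary score can exceed the reliability of its component scales.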
Subjects
Arthroplasty, Replacement, Hip/rehabilitation; Osteoarthritis, Hip/surgery; Patient Reported Outcome Measures; Activities of Daily Living; Adult; Aged; Aged, 80 and over; Disability Evaluation; Female; Humans; Male; Middle Aged; Osteoarthritis, Hip/rehabilitation; Pain Measurement/methods; Psychometrics; Quality of Life; Reproducibility of Results
ABSTRACT
OBJECTIVE: The Patient-Reported Outcomes Measurement Information System (PROMIS) was initiated to improve precision, reduce respondent burden, and enhance the comparability of health outcomes measures. We used item response theory (IRT) to construct and evaluate a preliminary item bank for physical function assuming four subdomains. STUDY DESIGN AND SETTING: Data from seven samples (N=17,726) using 136 items from nine questionnaires were evaluated. A generalized partial credit model was used to estimate item parameters, which were normed to a mean of 50 (SD=10) in the US population. Item bank properties were evaluated through Computerized Adaptive Test (CAT) simulations. RESULTS: IRT requirements were fulfilled by 70 items covering activities of daily living, lower extremity, and central body functions. The original item context partly affected parameter stability. Items on upper body function, and need for aid or devices did not fit the IRT model. In simulations, a 10-item CAT eliminated floor and decreased ceiling effects, achieving a small standard error (< 2.2) across scores from 20 to 50 (reliability >0.95 for a representative US sample). This precision was not achieved over a similar range by any comparable fixed length item sets. CONCLUSION: The methods of the PROMIS project are likely to substantially improve measures of physical function and to increase the efficiency of their administration using CAT.
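The score metric and precision figures in this abstract can be related with a small sketch. It rests on two assumptions the abstract does not state: that IRT scores (theta) are rescaled linearly to a metric with mean 50 and SD 10, and that reliability follows from the standard error as 1 - SE²/SD². Under that relation, a standard error of 2.2 on an SD = 10 metric gives a reliability just above 0.95, in line with the reported figures. The function names are hypothetical.

```python
def to_t_score(theta, ref_mean=0.0, ref_sd=1.0):
    """Rescale an IRT score to a metric with mean 50 and SD 10
    in the reference population (assumed linear transformation)."""
    return 50.0 + 10.0 * (theta - ref_mean) / ref_sd


def reliability_from_se(se, sd=10.0):
    """Assumed relation: reliability = 1 - (error variance / score variance)."""
    return 1.0 - (se / sd) ** 2


print(to_t_score(-1.5))                     # a score 1.5 SD below the mean
print(round(reliability_from_se(2.2), 4))   # precision at SE = 2.2
```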
Subjects
Health Status Indicators; Outcome Assessment, Health Care/methods; Activities of Daily Living; Cross-Sectional Studies; Data Interpretation, Statistical; Disability Evaluation; Female; Health Status; Humans; Male; Osteoarthritis/physiopathology; Osteoarthritis/rehabilitation; Surveys and Questionnaires
ABSTRACT
BACKGROUND: The Breast Cancer Prevention Trial (BCPT) is a large, multicenter chemoprevention trial testing the efficacy of the antiestrogen drug tamoxifen for prevention of breast cancer and coronary heart disease in healthy women at high risk of breast cancer. The BCPT evolved from a series of prior studies in early stage breast cancer demonstrating the efficacy of tamoxifen in the prevention of systemic breast cancer recurrence and in the reduction of contralateral breast cancers. PURPOSE: The purpose of this article is to describe the methodologic considerations in the collection of health-related quality-of-life (HRQL) data in the BCPT and to present base-line HRQL data on the first 9749 participants. METHODS: An HRQL questionnaire that included the Center for Epidemiologic Studies-Depression Scale, a symptom checklist, the Medical Outcomes Study 36-item short form (MOS-SF-36), and the MOS sexual problems questions was completed by participants in the BCPT at base line (prior to random assignment). Medical and demographic information, as well as projected risk of breast cancer, were collected as part of study eligibility. Descriptive and correlational data were examined for these study participants. RESULTS: BCPT participants report high levels of functioning compared with U.S. general population norms but still report an average of 8.9 distinct symptoms during the past 4 weeks. Depression is less prevalent among the participants than in community samples, which reflects the exclusion of clinically depressed individuals. Sixty-five percent reported being sexually active in the past 6 months, with an age-related decline in sexual activity. Younger women reported fewer sexual problems than older women. There is a strong correlation between the two mental health measures, moderate to weak correlations between HRQL scales and levels of self-reported symptoms, and only weak correlations between measures of breast cancer risk and HRQL scales. 
The MOS-SF-36 scores were examined for three consecutive recruitment samples (0-6 months, 7-12 months, and 13-20 months), and the base-line scores were slightly better for the earliest group of participants. CONCLUSIONS: This article demonstrates the feasibility of collecting HRQL data in a large, multicenter, chemoprevention trial for women at high risk of breast cancer. The successful integration of HRQL data collection into this clinical trial attests to its value as a safety-monitoring end point and as an explicit and measurable outcome for the entire trial. IMPLICATIONS: HRQL data are important for studies in which healthy populations are involved and in which the potential for decrements in quality of life are real or perceived.
Subjects
Breast Neoplasms/psychology; Quality of Life; Adult; Age Distribution; Aged; Breast Neoplasms/prevention & control; Female; Humans; Middle Aged; Risk; Sexual Behavior; Surveys and Questionnaires; Tamoxifen/therapeutic use
ABSTRACT
PURPOSE: To describe men who agreed to be randomized to the Prostate Cancer Prevention Trial (PCPT), a 7-year, double-blind placebo-controlled study of the efficacy of finasteride in preventing prostate cancer. METHODS: Comprehensive health-related quality-of-life data are presented for 18,882 randomized PCPT participants. RESULTS: PCPT participants are highly educated, middle to upper income, and primarily white (92%). Participants reported healthy lifestyles. The mean American Urological Association Symptom Index score was well below the maximum entry score of less than 19; existing urinary symptoms were generally not bothersome. The scores for two sexual functioning scales could range from 0 to 100, with higher scores reflecting worse sexual functioning. The mean score for the Sexual Problem Scale was 19.2 out of 100, and the mean Sexual Activities Scale was 44.1 out of 100. Scores for seven of the eight Medical Outcomes Study 36-item Short-Form Health Survey scales (higher scores are better) were 10 to 20 points higher than those reported by a general population sample and differed minimally by race but not by age. Previously reported associations between sexual dysfunction and hypertension, diabetes, and depression were also observed. Men who never smoked reported less sexual dysfunction than did those who either had quit or still smoked. CONCLUSION: Individuals who are likely to enroll in primary prevention trials have a high socioeconomic status, healthy lifestyle behaviors, and better health than the general population. These data help oncologists design chemoprevention trials with respect to the selection of health-related quality-of-life assessments and recruitment strategies.
Subjects
Health Status; Patient Selection; Prostatic Neoplasms/prevention & control; Quality of Life; Randomized Controlled Trials as Topic; Aged; Aged, 80 and over; Data Collection; Depressive Disorder/epidemiology; Double-Blind Method; Education; Humans; Incidence; Life Style; Male; Middle Aged; Reference Values; Sexual Dysfunction, Physiological/epidemiology; Social Class
ABSTRACT
BACKGROUND: Primary care performance has been shown to differ under different models of health care delivery, even among various models of managed care. Pervasive changes in our nation's health care delivery systems, including the emergence of new forms of managed care, compel more current data. OBJECTIVE: To compare the primary care received by patients in each of 5 models of managed care (managed indemnity, point of service, network-model health maintenance organization [HMO], group-model HMO, and staff-model HMO) and identify specific characteristics of health plans associated with performance differences. METHODS: Cross-sectional observational study of Massachusetts adults who reported having a regular personal physician and for whom plan-type was known (n = 6018). Participants completed a validated questionnaire measuring 7 defining characteristics of primary care. Senior health plan executives provided information about financial and nonfinancial features of the plan's contractual arrangements with physicians. RESULTS: The managed indemnity system performed most favorably, with the highest adjusted mean scores for 8 of 10 measures (P<.05). Point of service and network-model HMO performance equaled the indemnity system on many measures. Staff-model HMOs performed least favorably, with adjusted mean scores that were lowest or statistically equivalent to the lowest score on all 10 scales. Among network-model HMOs, several features of the plan's contractual arrangement with physicians (ie, capitated physician payment, extensive use of clinical practice guidelines, financial incentives concerning patient satisfaction) were significantly associated with performance (P<.05). 
CONCLUSIONS: With US employers and purchasers having largely rejected traditional indemnity insurance as unaffordable, the results suggest that the current momentum toward open-model managed care plans is consistent with goals for high-quality primary care, but that the effects of specific financial and nonfinancial incentives used by plans must continue to be examined.
Subjects
Managed Care Programs/economics; Managed Care Programs/organization & administration; Primary Health Care/standards; Adult; Confounding Factors, Epidemiologic; Continuity of Patient Care; Cross-Sectional Studies; Female; Government Agencies; Group Practice, Prepaid/economics; Group Practice, Prepaid/organization & administration; Health Benefit Plans, Employee/economics; Health Benefit Plans, Employee/organization & administration; Health Maintenance Organizations/economics; Health Maintenance Organizations/organization & administration; Humans; Insurance, Health, Reimbursement; Male; Massachusetts; Middle Aged; Models, Organizational; Primary Health Care/economics; Regression Analysis; State Government
ABSTRACT
OBJECTIVE: Although depression is one of the most common problems of medical and psychiatric outpatients, it has not been clear whether the extent of medical comorbidity among depressed patients varies across major types of clinical settings in which depressed patients receive care--especially by type of treating clinician (general medical versus mental health specialty) or type of payment for services (prepaid versus fee-for-service). METHODS: The authors examined these issues using data on 1,152 adult outpatients with current depressive symptoms and a lifetime history of unipolar depressive disorder who received care in one of three health care delivery systems in three U.S. sites. RESULTS: Depressed patients had a similarly high prevalence (64.9%-71.0%) of any of eight common chronic medical conditions whether they were seen in the general medical or specialty mental health sector; however, those visiting medical clinicians had a significantly higher prevalence of the two most common chronic medical conditions, hypertension and arthritis. Among depressed patients with hypertension, those visiting the general medical sector were more likely to be taking antihypertensive medication than were those visiting the mental health specialty sector. Type of payment (prepaid versus fee-for-service) was unrelated to either prevalence or severity of comorbid medical conditions, suggesting that the typical depressed patient in all types of practices studied had medical comorbidity. CONCLUSIONS: These data suggest that clinicians in all health care settings must be prepared to encounter chronic medical conditions and complaints in the depressed patients who visit them.
Subjects
Chronic Disease/epidemiology; Depressive Disorder/epidemiology; Family Practice; Psychiatry; Adult; Arthritis/epidemiology; Comorbidity; Depressive Disorder/diagnosis; Fees, Medical; Health Status Indicators; Humans; Hypertension/epidemiology; Male; Outcome Assessment, Health Care; Practice Patterns, Physicians'; Prepaid Health Plans; Prevalence; Psychiatric Status Rating Scales; United States/epidemiology
ABSTRACT
PURPOSE: To measure the functional status and well-being of patients with chronic fatigue syndrome (CFS), and compare them with those of a general population group and six disease comparison groups. PATIENTS AND METHODS: The subjects of the study were patients with CFS (n = 223) from a CFS clinic, a population-based control sample (n = 2,474), and disease comparison groups with hypertension (n = 2,089), congestive heart failure (n = 216), type II diabetes mellitus (n = 163), acute myocardial infarction (n = 107), multiple sclerosis (n = 25), and depression (n = 502). We measured functional status and well-being using the Medical Outcomes Study 36-Item Short-Form Health Survey (SF-36), which is a self-administered questionnaire in which lower scores are indicative of greater impairment. RESULTS: Patients with CFS had far lower mean scores than the general population control subjects on all eight SF-36 scales. They also scored significantly lower than patients in all the disease comparison groups other than depression on virtually all the scales. When compared with patients with depression, they scored significantly lower on all the scales except for scales measuring mental health and role disability due to emotional problems, on which they scored significantly higher. The two SF-36 scales reflecting mental health were not correlated with any of the symptoms of CFS except for irritability and depression. CONCLUSION: Patients with CFS had marked impairment, in comparison with the general population and disease comparison groups. Moreover, the degree and pattern of impairment was different from that seen in patients with depression.
Subjects
Fatigue Syndrome, Chronic/physiopathology; Health Status; Activities of Daily Living; Adult; Depressive Disorder/physiopathology; Diabetes Mellitus, Type 2/physiopathology; Fatigue Syndrome, Chronic/psychology; Female; Heart Failure/physiopathology; Humans; Hypertension/physiopathology; Male; Mental Health; Multiple Sclerosis/physiopathology; Myocardial Infarction/physiopathology; Psychometrics; Surveys and Questionnaires
ABSTRACT
A total of 693 children between the ages of 0 and 13 years were randomly assigned to either a staff model HMO or to one of several fee-for-service insurance plans in Seattle to evaluate differences in medical expenditures and health outcomes. Although the fee-for-service plans varied the amount of cost sharing (0% to 95%), all children were covered for the same medical services, for either 3 or 5 years. No differences in imputed total expenditures were observed for children assigned to the HMO or any of the fee-for-service plans. Children with cost-sharing fee-for-service plans, however, had fewer medical contacts and received fewer preventive services than those assigned to the HMO. Nonetheless, children with the cost-sharing fee-for-service plans were perceived (by their mothers) to be in better health overall than those assigned to the HMO. No significant differences regarding physiological outcomes (eg, visual acuity, hemoglobin level) were observed between the two groups. The results of this experiment neither strongly support nor indict fee-for-service or prepaid care for children.
Subjects
Child Health Services/statistics & numerical data; Group Practice, Prepaid; Group Practice; Health Expenditures; Health Maintenance Organizations; Health Status; Health; Adolescent; Attitude to Health; Child; Child Health Services/economics; Child, Preschool; Clinical Trials as Topic; Female; Group Practice/economics; Group Practice, Prepaid/economics; Health Maintenance Organizations/economics; Humans; Insurance, Health; Male; Outcome and Process Assessment, Health Care; Washington
ABSTRACT
Do children whose families bear a percentage of their health care costs reduce their use of ambulatory care compared with those families who receive free care? If so, does the reduction affect their health? To answer these questions, 1,844 children aged 0 to 13 years were randomly assigned (for a period of 3 or 5 years) to one of 14 insurance plans. The plans differed in the percentage of their medical bills that families paid. One plan provided free care. The others required up to 95% coinsurance subject to a $1,000 maximum. Children whose families paid a percentage of costs reduced use by up to one third. For the typical child in the study, this reduction caused no significant difference in either parental perceptions of their child's health or in physiologic measures of health. Confidence intervals are sufficiently narrow for most measures to rule out the possibility that large true differences went undetected. Nor were statistically significant differences observed for children at risk of disease. Wider confidence intervals for these comparisons, however, mean that clinically meaningful differences, if present, could have been undetected in certain subgroups.
Subjects
Deductibles and Coinsurance; Health Status; Health; Child; Child, Preschool; Female; Health Status Indicators; Humans; Infant; Male; Random Allocation
ABSTRACT
This article presents information about the development and evaluation of the SF-36 Health Survey, a 36-item generic measure of health status. It summarizes studies of reliability and validity and provides administrative and interpretation guidelines for the SF-36. A brief history of the International Quality of Life Assessment (IQOLA) Project is also included.
Subjects
Health Status Indicators; Quality of Life; Activities of Daily Living; Factor Analysis, Statistical; Humans; International Cooperation; Psychometrics; Reproducibility of Results
ABSTRACT
Following the translation development stage, the second research stage of the IQOLA Project tests the assumptions underlying item scoring and scale construction. This article provides detailed information on the research methods used by the IQOLA Project to evaluate data quality, scaling and scoring assumptions, and the reliability of the SF-36 scales. Tests include evaluation of item and scale-level descriptive statistics; examination of the equality of item-scale correlations, item internal consistency and item discriminant validity; and estimation of scale score reliability using internal consistency and test-retest methods. Results from these tests are used to determine if standard algorithms for the construction and scoring of the eight SF-36 scales can be used in each country and to provide information that can be used in translation improvement.
Subjects
Health Status Indicators; Psychometrics; Quality of Life; Activities of Daily Living; Discriminant Analysis; Humans; Translations
ABSTRACT
This article briefly summarizes methods used in the empirical validation of translations of the SF-36 Health Survey. In addition, information about the IQOLA Project norming protocol and 13 general population norming samples analyzed in this supplement is provided.
Subjects
Health Status Indicators; Psychometrics; Quality of Life; Data Collection; Europe/epidemiology; Factor Analysis, Statistical; Humans; Reference Values; Surveys and Questionnaires; Translations; United States/epidemiology
ABSTRACT
Cross-sectional data from a representative sample of the general population in Japan were analyzed to test the validity of Japanese SF-36 Health Survey scales as measures of physical and mental health. Results from psychometric and clinical tests of validity were compared. Principal components analyses were used to test for the hypothesized physical and mental dimensions of health and the pattern of scale correlations with those components. To test the clinical validity of SF-36 scale scores, self-reports of chronic medical conditions and the Zung Self-Rating Depression Scale were used to create mutually exclusive groups differing in the severity of physical and mental conditions. The pattern of correlations between the SF-36 scales and the two empirically derived components generally confirmed hypotheses for most scales. Results of psychometric and clinical tests of validity were in agreement for the Physical Functioning, Role-Physical, Vitality, Social Functioning, and Mental Health scales. Relatively less agreement between psychometric and clinical tests of validity was observed for the Bodily Pain, General Health, and Role-Emotional scales, and the physical and mental health factor content of those scales was not consistent with hypotheses. In clinical tests of validity, the General Health, Bodily Pain, and Physical Functioning scales were the most valid scales in discriminating between groups with and without a severe physical condition. Scales that correlated highest with mental health in the components analysis (Mental Health and Vitality) also were most valid in discriminating between groups with and without depression. The results of this study provide preliminary interpretation guidelines for all SF-36 scales, although caution is recommended in the interpretation of the Role-Emotional, Bodily Pain, and General Health scales pending further studies in Japan.
Subjects
Health Status Indicators; Psychometrics; Quality of Life; Adult; Cross-Cultural Comparison; Depression; Educational Status; Factor Analysis, Statistical; Female; Humans; Japan/epidemiology; Male; Reproducibility of Results
ABSTRACT
This study examined the relative precision (RP) of two methods of scoring the 10-item Physical Functioning Scale (PF-10) from a large sample of patients (n = 3445) of the Medical Outcomes Study. Based on a Likert scaling model, the PF-10 summated scoring method was compared with a Rasch Item Response Theory (IRT) scaling model in which raw scores were transformed into a latent trait variable of physical functioning. Potential differences between scoring methods were hypothesized to be attributed to: (1) the logarithmic nature of the Rasch transformation; (2) the unevenness of the PF-10 item distributions; and (3) reduction of within-group variance. RP ratios favored the Rasch model in discriminating between patients who differed in disease severity. The Rasch and Likert scoring models performed similarly for tests involving sensitivity to change over a two-year follow-up period. In all comparisons, differences between methods were most apparent in clinical groups whose scores most approximated the extremes of the score distribution. Further research is necessary to test for differences between scoring models in discrimination and sensitivity to change among clinical groups whose scores are sufficiently spread across the continuum of physical functioning, in particular patients with either very high or low physical functioning. The Rasch model of scoring may have important implications for the clinical interpretation of individual scores at all ranges of the scale.
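The abstract does not define the RP ratio. A common convention, assumed in the sketch below, is to take the ratio of the F-statistics that two scoring methods yield when comparing the same clinical groups, so a ratio above 1 favors the method in the numerator. The function names are hypothetical and the scores are made up for illustration.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F-statistic for a list of score lists (one per group)."""
    all_scores = [x for group in groups for x in group]
    grand_mean = mean(all_scores)
    k, n = len(groups), len(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def relative_precision(groups_method_a, groups_method_b):
    """Assumed definition: F for method A divided by F for method B."""
    return anova_f(groups_method_a) / anova_f(groups_method_b)


# Hypothetical scores for the same two severity groups under two scorings.
rasch_scored = [[-1.2, -0.8, -1.0], [0.9, 1.1, 1.3]]
likert_scored = [[30.0, 45.0, 40.0], [70.0, 60.0, 80.0]]
print(round(relative_precision(rasch_scored, likert_scored), 2))
```

Because the F-statistic is unit-free, the two methods can be compared even though their scores are on different metrics.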
Subjects
Data Interpretation, Statistical; Health Status Indicators; Heart Failure; Humans; Linear Models; Psychometrics; Severity of Illness Index
ABSTRACT
Indexes developed to measure physical functioning as an essential component of general health status are often based on sets of hierarchically-structured items intended to represent a broad underlying concept. Rasch Item Response Theory (IRT) provides a methodology to examine the hierarchical structure, unidimensionality, and reproducibility of item positions (calibrations) along a scale. Data gathered on the 10-item Physical Functioning Scale (PF-10) from a large sample of Medical Outcomes Study patients (N = 3445) were used to examine the hierarchical order, unidimensionality, and reproducibility of item calibrations. Rasch-IRT analyses generated an empirical item hierarchy, confirmed the unidimensionality of the PF-10 for most patients, and established the reproducibility of item calibrations across patient populations and repeated tests. These findings support the content validity of the PF-10 as a measure of physical functioning and suggest that valid Rasch-IRT summary scores could be generated as an alternative to the current Likert summative scores. Unidimensionality and reproducibility of the item scale are essential prerequisites for the development of Rasch-based person measures of physical functioning that can be used across populations and over repeated tests.
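The idea of an item hierarchy along a single scale can be sketched with the dichotomous Rasch model, in which the probability of endorsing an item depends only on the difference between person ability and item calibration. This is a simplification: PF-10 items have more than two response categories, and the calibration values below are hypothetical, not the published ones.

```python
import math


def rasch_probability(theta, difficulty):
    """Dichotomous Rasch model: P(endorse) for ability theta, in logits."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


# Hypothetical item calibrations in logits (higher = harder activity).
calibrations = {
    "bathing or dressing": -2.0,
    "walking one block": -1.0,
    "vigorous activities": 2.5,
}

# The empirical item hierarchy is the items ordered from easiest to hardest.
hierarchy = sorted(calibrations, key=calibrations.get)

for item in hierarchy:
    p = rasch_probability(0.0, calibrations[item])
    print(f"{item}: P(able) at theta = 0 is {p:.2f}")
```

If calibrations are reproducible across populations and repeated tests, the same ordering should emerge each time the model is fitted.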
Subjects
Activities of Daily Living; Health Status Indicators; Aged; Chronic Disease; Humans; Middle Aged; Reproducibility of Results
ABSTRACT
Statistical analyses of Differential Item Functioning (DIF) can be used for rigorous translation evaluations. DIF techniques test whether each item functions in the same way, irrespective of the country, language, or culture of the respondents. For a given level of health, the score on any item should be independent of nationality. This requirement can be tested through contingency-table methods, which are efficient for analyzing all types of items. We investigated DIF in the Danish translation of the SF-36 Health Survey, using two general population samples (USA, n = 1,506; Denmark, n = 3,950). DIF was identified for 12 out of 35 items. These results agreed with independent ratings of translation quality, but the statistical techniques were more sensitive. When included in scales, the items exhibiting DIF had only a little impact on conclusions about cross-national differences in health in the general population. However, if used as single items, the DIF items could seriously bias results from cross-national comparisons. Also, the DIF items might have larger impact on cross-national comparison of groups with poorer health status. We conclude that analysis of DIF is useful for evaluating questionnaire translations.
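One widely used contingency-table technique for DIF is the Mantel-Haenszel common odds ratio, sketched below for a dichotomized item. The abstract does not say which contingency-table statistic the authors used, so this is an illustrative stand-in rather than their method; the function name and the counts are hypothetical.

```python
def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across 2x2 tables, one table per matched score level.

    Each stratum is (a, b, c, d):
      a, b = reference group endorsing / not endorsing the item
      c, d = focal group endorsing / not endorsing the item
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts at three levels of the overall scale score.
tables = [
    (10, 30, 8, 32),
    (25, 15, 20, 20),
    (35, 5, 30, 10),
]
print(round(mantel_haenszel_odds_ratio(tables), 2))
```

A value near 1 means that, at the same level of health, both groups endorse the item at similar rates; values far from 1 flag the item for review.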
Subjects
Health Status Indicators; Quality of Life; Translations; Adolescent; Adult; Aged; Cross-Cultural Comparison; Denmark/epidemiology; Humans; Middle Aged; Psychometrics; Surveys and Questionnaires
ABSTRACT
This article describes the methods adopted by the International Quality of Life Assessment (IQOLA) project to translate the SF-36 Health Survey. Translation methods included the production of forward and backward translations, use of difficulty and quality ratings, pilot testing, and cross-cultural comparison of the translation work. Experience to date suggests that the SF-36 can be adapted for use in other countries with relatively minor changes to the content of the form, providing support for the use of these translations in multinational clinical trials and other studies. The most difficult items to translate were physical functioning items, which used examples of activities and distances that are not common outside of the United States; items that used colloquial expressions such as pep or blue; and the social functioning items. Quality ratings were uniformly high across countries. While the IQOLA approach to translation and validation was developed for use with the SF-36, it is applicable to other translation efforts.
Subjects
Health Status Indicators; Quality of Life; Translating; Cross-Cultural Comparison; Developed Countries; Humans; Surveys and Questionnaires; Translations
ABSTRACT
Increasingly, translated and culturally adapted health-related quality of life measures are being used in cross-cultural research. To assess comparability of results, researchers need to know the comparability of the content of the questionnaires used in different countries. Based on an item-by-item discussion among International Quality of Life Assessment (IQOLA) investigators of the content of the translated versions of the SF-36 in 10 countries, we discuss the difficulties that arose in translating the SF-36. We also review the solutions identified by IQOLA investigators to translate items and response choices so that they are appropriate within each country as well as comparable across countries. We relate problems and solutions to ratings of difficulty and conceptual equivalence for each item. The most difficult items to translate were physical functioning items that refer to activities not common outside the United States and items that use colloquial expressions in the source version. Identifying the origin of the source items, their meaning to American English-speaking respondents and American English synonyms, in response to country-specific translation issues, greatly helped the translation process. This comparison of the content of translated SF-36 items suggests that the translations are culturally appropriate and comparable in their content.