ABSTRACT
In 2020 the U.S. Federal Committee on Statistical Methodology (FCSM) released "A Framework for Data Quality", organized by 11 dimensions of data quality grouped among three domains of quality (utility, objectivity, integrity). This paper addresses the use of the FCSM Framework for data quality assessments of blended data. The FCSM Framework applies to all types of data; however, best practices for implementation have not been documented. We applied the FCSM Framework to three health research-related case studies. For each case study, assessments of data quality dimensions were performed to identify threats to quality, possible mitigations of those threats, and trade-offs among them. From these assessments the authors concluded: 1) data quality assessments are more complex in practice than anticipated, and expert guidance and documentation are important; 2) each dimension may not be equally important for different data uses; 3) data quality assessments can be subjective, and a quantitative tool could help explain the results, although quantitative assessments may be closely tied to the intended use of the dataset; 4) there are common trade-offs and mitigations for some threats to quality among dimensions. This paper is one of the first to apply the FCSM Framework to specific use-cases and illustrates a process for similar data uses.
ABSTRACT
Objectives: This report compares national and subgroup estimates of any (mild, moderate, or severe) level of major depressive disorder (depression) and generalized anxiety disorder (GAD) symptoms among the U.S. adult population from two data sources, the 2019 National Health Interview Survey (NHIS) and the third round of the Research and Development Survey (RANDS 3). Methods: Data from the 2019 NHIS (n = 31,997) and RANDS 3 (n = 2,646) were used. The eight-item Patient Health Questionnaire (PHQ-8), scores ranging from 0 to 24, and the seven-item GAD scale (GAD-7), scores ranging from 0 to 21, were used to measure the severity of depression and GAD symptoms, respectively. Binary indicators of exhibiting symptoms were based on scores of 5 to 24 for depression and 5 to 21 for GAD. The estimates were compared by the following sociodemographic characteristics: age, sex, race and Hispanic origin, education, and region. Results: Nearly all of the national and subgroup estimates of adults with depression and GAD symptoms were significantly higher based on RANDS 3 compared with the 2019 NHIS. The only exception was the depression symptoms estimate among adults aged 65 and over, where the estimates were comparable across the two data sources. Both data sources found that depression symptoms were associated with sex, age, race and Hispanic origin, and education, and GAD symptoms were associated with age, race and Hispanic origin, and education. However, NHIS identified a few associations that RANDS did not, including associations between depression symptoms and region and GAD symptoms and sex. Conclusions: Mental health estimates from RANDS, a web-based survey, may be overestimated when compared with a traditional in-person household survey. These results may inform potential strategies to improve the comparability of mental health estimates from RANDS and other surveys like NHIS, such as calibration weights or other model-based methods.
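The scoring rule described in the Methods can be sketched as follows. This is a minimal illustration, not the surveys' processing code: the function names are hypothetical, and the per-item range of 0 to 3 is an assumption that is consistent with the stated totals (8 items for 0 to 24, 7 items for 0 to 21).

```python
from typing import Sequence

# Cutoff taken from the abstract: totals of 5 or more are coded as "any symptoms".
SYMPTOM_CUTOFF = 5


def total_score(item_scores: Sequence[int], n_items: int) -> int:
    """Sum item responses after checking their count and range.

    Assumption: each item is scored 0-3 (not stated in the abstract).
    """
    if len(item_scores) != n_items:
        raise ValueError(f"expected {n_items} items, got {len(item_scores)}")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item score must be 0, 1, 2, or 3")
    return sum(item_scores)


def has_symptoms(total: int) -> bool:
    """Binary indicator: True when the total reaches the cutoff."""
    return total >= SYMPTOM_CUTOFF


# Hypothetical respondent: PHQ-8 (8 items) and GAD-7 (7 items) answers.
phq8_total = total_score([1, 1, 1, 1, 1, 0, 0, 0], n_items=8)
gad7_total = total_score([1, 0, 1, 0, 1, 0, 1], n_items=7)
depression_flag = has_symptoms(phq8_total)
gad_flag = has_symptoms(gad7_total)
```

The same cutoff applies to both scales, so one indicator function serves both; only the number of items differs.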
Subjects
Depressive Disorder, Major , Mental Health , Adult , Humans , United States/epidemiology , Depressive Disorder, Major/epidemiology , Depressive Disorder, Major/complications , Surveys and Questionnaires , Anxiety Disorders/diagnosis , Anxiety Disorders/epidemiology , Anxiety Disorders/psychology , Research
ABSTRACT
For the CIs used in the Standards for rates from vital statistics and complex health surveys, this report evaluates coverage probability, relative width, and the resulting percentage of rates flagged as statistically unreliable when compared with previously used standards. Additionally, the report assesses the impact of design effects and the denominator's sampling variability, when applicable.
Subjects
Data Collection , Health Surveys , Vital Statistics , Biometry , Data Collection/standards , National Center for Health Statistics, U.S. , Research Design , Surveys and Questionnaires , United States/epidemiology
ABSTRACT
BACKGROUND: Equal-tailed confidence intervals that maintain nominal coverage (0.95 or greater probability that a 95% confidence interval covers the true value) are useful in interval-based statistical reliability standards, because they remain conservative. For age-adjusted death rates, while the Fay-Feuer gamma method remains the gold standard, modifications have been proposed to streamline implementation and/or obtain more efficient intervals (shorter intervals that retain nominal coverage). METHODS: This paper evaluates three such modifications for use in interval-based statistical reliability standards, the Anderson-Rosenberg, Tiwari, and Fay-Kim intervals, when data are sparse and sample size-based standards alone are overly coarse. Initial simulations were anchored around small populations (P = 2400 or 1200), the median crude all-cause US mortality rate in 2010-2019 (833.8 per 100,000), and the corresponding age-specific probabilities of death. To allow for greater variation in the age-adjustment weights and age-specific probabilities, a second set of simulations draws those at random, while holding the mean number of deaths at 20 or 10. Finally, county-level mortality data by race/ethnicity from four causes are selected to capture even greater variation: all causes, external causes, congenital malformations, and Alzheimer disease. RESULTS: The three modifications had comparable performance when the number of deaths was large relative to the denominator and the age distribution was as in the standard population. However, for sparse county-level data by race/ethnicity for rarer causes of death, and for which the age distribution differed sharply from the standard population, coverage probability in all but the Fay-Feuer method sometimes fell below 0.95. More efficient intervals than the Fay-Feuer interval were identified under specific circumstances. When the coefficient of variation of the age-adjustment weights was below 0.5, the Anderson-Rosenberg and Tiwari intervals appeared to be more efficient, whereas when it was above 0.5, the Fay-Kim interval appeared to be more efficient. CONCLUSIONS: As national and international agencies reassess prevailing data presentation standards to release age-adjusted estimates for smaller areas or population subgroups than previously presented, the Fay-Feuer interval can be used to develop interval-based statistical reliability standards with appropriate thresholds that are generally applicable. For data that meet certain statistical conditions, more efficient intervals could be considered.
Subjects
Models, Statistical , Research Design , Age Distribution , Confidence Intervals , Humans , Probability , Reproducibility of Results
ABSTRACT
High-quality data are accurate, relevant, and timely. Large national health surveys have always balanced the implementation of these quality dimensions to meet the needs of diverse users. The COVID-19 pandemic shifted these balances, with both disrupted survey operations and a critical need for relevant and timely health data for decision-making. The National Health Interview Survey (NHIS) responded to these challenges with several operational changes to continue production in 2020. However, data files from the 2020 NHIS were not expected to be publicly available until fall 2021. To fill the gap, the National Center for Health Statistics (NCHS) turned to 2 online data collection platforms, the Census Bureau's Household Pulse Survey (HPS) and the NCHS Research and Development Survey (RANDS), to collect COVID-19-related data more quickly. This article describes the adaptations of NHIS and the use of HPS and RANDS during the pandemic in the context of the recently released Framework for Data Quality from the Federal Committee on Statistical Methodology. (Am J Public Health. 2021;111(12):2167-2175. https://doi.org/10.2105/AJPH.2021.306516).
Subjects
COVID-19/epidemiology , Health Surveys/methods , Internet , National Center for Health Statistics, U.S./organization & administration , Bias , Cross-Sectional Studies , Data Collection/methods , Data Collection/standards , Health Surveys/standards , Humans , Interviews as Topic , Pandemics , SARS-CoV-2 , Sociodemographic Factors , Telephone , United States/epidemiology
ABSTRACT
While web surveys have become increasingly popular as a method of data collection, there is concern that estimates obtained from web surveys may not reflect the target population of interest. Web survey estimates can be calibrated to existing national surveys using a propensity score adjustment, although requirements for the size and collection timeline of the reference data set have not been investigated. We evaluate health outcomes estimates from the National Center for Health Statistics' Research and Development web survey. In our study, the 2016 National Health Interview Survey as well as its quarterly subsets are considered as reference datasets for the web data. It is demonstrated that the calibrated health estimates overall vary little when using the quarterly or yearly data, suggesting that there is flexibility in selecting the reference dataset. This finding has many practical implications for constructing reference data, including the reduced cost and burden of a smaller sample size and a more flexible timeline.
ABSTRACT
Data from the National Health and Nutrition Examination Survey (NHANES) have been linked to the Centers for Medicare and Medicaid Services' Medicaid Enrollment and Claims Files. As not all survey participants provide sufficient information to be eligible for record linkage, linked data often include fewer records than the original survey data. This project presents an application of multiple imputation (MI) for handling missing Medicaid/CHIP status due to linkage refusals in linked NHANES-Medicaid data using the linked 1999-2004 NHANES data. By examining multiple outcomes and subgroups among children, the analyses compare the results from a multi-purpose dataset produced from a single MI model to those from individualized MI models. Outcomes examined here include obesity, untreated dental caries, attention deficit hyperactivity disorder (ADHD), and exposure to secondhand smoke.
ABSTRACT
This report describes the methods used to create NHANES 2011-2014 sample weights and variance units for the public-use data files, including sample weights for selected subsamples, such as the fasting subsample. The impacts of sample design changes on estimation for NHANES 2011-2014 and the addition of the NHANES National Youth Fitness Survey (NNYFS) 2012 are described. Approaches that data users can employ to modify sample weights when combining survey cycles or when combining subsamples are also included.
Subjects
Data Interpretation, Statistical , Nutrition Surveys/methods , Research Design , Adolescent , Adult , Aged , Aged, 80 and over , Bias , Child , Child, Preschool , Female , Humans , Infant , Infant, Newborn , Male , Middle Aged , Models, Theoretical , Nutrition Surveys/standards , Sample Size , Socioeconomic Factors , United States , Young Adult
ABSTRACT
Objective: This report illustrates the use of National Health Interview Survey (NHIS) data linked to Medicaid Analytic eXtract (MAX) data to identify children whose births were covered by Medicaid, as indicated in MAX data, among those participating in NHIS in early childhood, and briefly describes their selected health characteristics.
Subjects
Health Surveys , Information Storage and Retrieval , Insurance Coverage , Medicaid , Parturition , Adolescent , Adult , Child, Preschool , Female , Health Services Accessibility , Health Status , Humans , Infant , Longitudinal Studies , Male , Middle Aged , United States , Young Adult
ABSTRACT
Data from the National Health and Nutrition Examination Survey have been linked to the Centers for Medicare and Medicaid Services' Medicaid Enrollment and Claims Files for the survey years 1999-2004. The linked data are produced by the National Center for Health Statistics' (NCHS) Data Linkage Program and are available in the NCHS Research Data Center. This project compares the usefulness of multiple imputation (MI) to account for data linkage ineligibility and other survey nonresponse with currently recommended weight adjustment procedures. Estimated differences in environmental smoke exposure across Medicaid/Children's Health Insurance Program (CHIP) enrollment status among children ages 3-15 years are examined as a motivating example. Comparisons are drawn across three different estimates: one that uses MI to impute the administrative Medicaid/CHIP status of those who are ineligible for linkage, a second that uses the linked data restricted to linkage-eligible participants with a basic weight adjustment, and a third that uses self-reported Medicaid/CHIP status from the survey data. The results indicate that estimates from the multiple imputation analysis were comparable to those found when using weight adjustment procedures and had the added benefit of incorporating all survey participants (linkage eligible and linkage ineligible) into the analysis. We conclude that both multiple imputation and weight adjustment procedures can effectively account for survey participants who are ineligible for linkage.
ABSTRACT
BACKGROUND: Most US studies of mortality and air pollution have been conducted on largely non-Hispanic white study populations. However, many health and mortality outcomes differ by race and ethnicity, and non-Hispanic white persons experience lower air pollution exposure than those who are non-Hispanic black or Hispanic. This study examines whether associations between air pollution and heart disease mortality differ by race/ethnicity. METHODS: We used data from the 1997 to 2009 National Health Interview Survey linked to mortality records through December 2011 and annual estimates of fine particulate matter (PM2.5) by census tract. Proportional hazards models were used to estimate hazard ratios and 95% confidence intervals between PM2.5 (per 10 µg/m3) and heart disease mortality using the full sample and the sample adults, which have information on additional health variables. Interaction terms were used to examine differences in the PM2.5-mortality association by race/ethnicity. RESULTS: Overall, 65 936 of the full sample died during follow-up, and 22 152 died from heart disease. After adjustment for several factors, we found a significant positive association between PM2.5 and heart disease mortality (hazard ratio, 1.16; 95% confidence interval, 1.08-1.25). This association was similar in sample adults with adjustment for smoking and body mass index (hazard ratio, 1.18; 95% confidence interval, 1.06-1.31). Interaction terms for non-Hispanic black and Hispanic groups compared with the non-Hispanic white group were not statistically significant. CONCLUSIONS: Using a nationally representative sample, the association between PM2.5 and heart disease mortality was elevated and similar to previous estimates. Associations for non-Hispanic black and Hispanic adults were not statistically significantly different from those for non-Hispanic white adults.
Subjects
Air Pollutants/adverse effects , Air Pollution/adverse effects , Black or African American , Heart Diseases/ethnology , Heart Diseases/mortality , Hispanic or Latino , Particulate Matter/adverse effects , White People , Adult , Aged , Aged, 80 and over , Female , Health Surveys , Heart Diseases/diagnosis , Humans , Inhalation Exposure/adverse effects , Interviews as Topic , Male , Middle Aged , Particle Size , Prognosis , Risk Assessment , Risk Factors , Time Factors , United States/epidemiology
ABSTRACT
Background: California is the most populated state and Los Angeles County is the most populated county in the United States. National Health and Nutrition Examination Survey (NHANES) sample weights and variance units were developed for these places to obtain subnational estimates. Objective: This report describes the California and Los Angeles County NHANES 1999-2006 and 2007-2014 samples, including the creation of the sample weights and variance units and descriptions of the resulting data files. Some analytic guidelines are provided. Results: Eight years of NHANES data were combined for each data file to provide an adequate sample size and reduce disclosure risks. Because Los Angeles County has been a self-representing primary sampling unit, sample weights for Los Angeles County were relatively straightforward. However, a model-based approach was used to create sample weights for California. The relatively large proportion of Mexican-American and other Hispanic persons in California, coupled with the different NHANES 1999-2014 sample design requirements for oversampling these groups within the small number of NHANES locations selected each cycle, led to a relatively large size of these groups in the California and Los Angeles County NHANES files. For example, 1,137 and 374 of the 3,353 Mexican-American persons in NHANES 2007-2014 were in the California and Los Angeles County samples, respectively. Conclusion: The California and Los Angeles County NHANES 1999-2006 and 2007-2014 samples are available in the National Center for Health Statistics Research Data Center.
Subjects
Health Surveys/methods , Health Surveys/statistics & numerical data , Hispanic or Latino/statistics & numerical data , Nutrition Surveys/methods , Nutrition Surveys/statistics & numerical data , Research Design , Adolescent , Adult , Aged , Child , Child, Preschool , Female , Humans , Infant , Infant, Newborn , Los Angeles , Male , Mexican Americans , Middle Aged , National Center for Health Statistics, U.S. , Socioeconomic Factors , United States , Young Adult
ABSTRACT
OBJECTIVES: Differences in the availability of a Social Security Number (SSN) by race/ethnicity could affect the ability to link with death certificate data in passive follow-up studies and possibly bias mortality disparities reported with linked data. Using 1989-2009 National Health Interview Survey (NHIS) data linked with the National Death Index (NDI) through 2011, we compared the availability of a SSN by race/ethnicity, estimated the percent of links likely missed due to lack of SSNs, and assessed whether these estimated missed links affect race/ethnicity disparities reported in the NHIS-linked mortality data. METHODS: We used preventive fraction methods based on race/ethnicity-specific Cox proportional hazards models of the relationship between availability of SSN and mortality based on observed links, adjusted for survey year, sex, age, respondent-rated health, education, and US nativity. RESULTS: Availability of a SSN and observed percent linked were significantly lower for Hispanic and Asian/Pacific Islander (PI) participants compared with White non-Hispanic participants. We estimated that more than 18% of expected links were missed due to lack of SSNs among Hispanic and Asian/PI participants compared with about 10% among White non-Hispanic participants. However, correcting the observed links for expected missed links appeared to have only a modest impact on mortality disparities by race/ethnicity. CONCLUSIONS: Researchers conducting analyses of mortality disparities using the NDI or other linked death records need to be cognizant of the potential for differential linkage to contribute to their results.
Subjects
Asian People/statistics & numerical data , Death Certificates , Health Status Disparities , Hispanic or Latino/statistics & numerical data , Life Expectancy/trends , White People/statistics & numerical data , Aged , Female , Humans , Male , Middle Aged , Retrospective Studies , Survival Rate/trends , United States/epidemiology
ABSTRACT
BACKGROUND: Warmer temperature can alter seasonality of pollen as well as pollen concentration, and may impact allergic diseases such as hay fever. Recent studies suggest that extreme heat events will likely increase in frequency, intensity, and duration in coming decades in response to changing climate. OBJECTIVE: The overall objective of this study was to investigate whether extreme heat events are associated with hay fever. METHODS: We linked National Health Interview Survey (NHIS) data from 1997 to 2013 (n = 505,386 respondents) with extreme heat event data, defined as days when daily maximum temperature (TMAX) exceeded the 95th percentile values of TMAX for a 30-year reference period (1960-1989). We used logistic regression to investigate the associations between exposure to annual and seasonal extreme heat events and adult hay fever prevalence among the NHIS respondents. RESULTS: During 1997-2013, hay fever prevalence among adults 18 years and older was 8.43%. Age, race/ethnicity, poverty status, education, and sex were significantly associated with hay fever status. We observed that adults in the highest quartile of exposure to extreme heat events had 7% higher odds of hay fever compared with those in the lowest quartile of exposure (odds ratio: 1.07, 95% confidence interval: 1.02-1.11). This relationship was more pronounced for extreme heat events that occurred during the spring season, with evidence of an exposure-response relationship (Ptrend < .01). CONCLUSIONS: Our data suggest that exposure to extreme heat events is associated with increased prevalence of hay fever among US adults.
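The exposure definition in the Methods (days on which TMAX exceeds the 95th percentile of a reference period) can be illustrated with a short sketch. Everything beyond that definition is an assumption made for illustration: the function names are hypothetical, a single threshold is computed from one reference series (the study may have used location- or calendar-specific thresholds), and the percentile is computed with the "inclusive" interpolation method of Python's `statistics.quantiles`.

```python
from statistics import quantiles
from typing import Sequence


def p95_threshold(reference_tmax: Sequence[float]) -> float:
    """95th percentile of daily maximum temperature in the reference period.

    quantiles(..., n=100) returns the 99 percentile cut points; index 94
    is the 95th percentile.
    """
    return quantiles(reference_tmax, n=100, method="inclusive")[94]


def count_extreme_heat_days(daily_tmax: Sequence[float], threshold: float) -> int:
    """Number of days whose TMAX strictly exceeds the reference threshold."""
    return sum(1 for tmax in daily_tmax if tmax > threshold)


# Toy data: a made-up reference series and four observed days.
reference = list(range(1, 101))
threshold = p95_threshold(reference)
extreme_days = count_extreme_heat_days([90, 95, 96, 100], threshold)
```

Counts such as `extreme_days` could then be grouped into quartiles of exposure, which is the form in which the abstract reports the association.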
Subjects
Climate Change , Environmental Exposure , Extreme Heat , Rhinitis, Allergic, Seasonal/epidemiology , Adolescent , Adult , Aged , Aged, 80 and over , Allergens/immunology , Environmental Exposure/adverse effects , Female , Humans , Male , Middle Aged , Pollen/immunology , Prevalence , Risk Factors , Surveys and Questionnaires , United States/epidemiology , Young Adult
ABSTRACT
The National Center for Health Statistics (NCHS) disseminates information on a broad range of health topics through diverse publications. These publications must rely on clear and transparent presentation standards that can be broadly and efficiently applied. Standards are particularly important for large, cross-cutting reports where estimates cannot be individually evaluated and indicators of precision cannot be included alongside the estimates. This report describes the NCHS Data Presentation Standards for Proportions. The multistep NCHS Data Presentation Standards for Proportions are based on a minimum denominator sample size and on the absolute and relative widths of a confidence interval calculated using the Clopper-Pearson method. Proportions (usually multiplied by 100 and expressed as percentages) are the most commonly reported estimates in NCHS reports.
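As an illustration of the quantities these standards rely on, the sketch below computes a Clopper-Pearson interval and its absolute and relative widths for a simple count of x events among n sampled units. It is a minimal sketch under stated assumptions, not the NCHS procedure: the function names are hypothetical, relative width is taken here as absolute width divided by the estimated proportion, no complex-survey adjustment is applied, and no reliability thresholds are encoded because the abstract does not state them. The interval limits are found by bisection on exact binomial tail probabilities.

```python
from math import comb
from typing import Callable, Tuple


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(f: Callable[[float], float], target: float,
                      tol: float = 1e-12) -> float:
    """Bisection on [0, 1] for an increasing function f with f(p) = target."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Exact (Clopper-Pearson) confidence limits for a binomial proportion."""
    # Lower limit: p at which P(X >= x) equals alpha / 2 (0 when x == 0).
    lower = 0.0 if x == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper limit: p at which P(X <= x) equals alpha / 2 (1 when x == n).
    upper = 1.0 if x == n else _solve_increasing(
        lambda p: -binom_cdf(x, n, p), -alpha / 2)
    return lower, upper


def interval_widths(x: int, n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Absolute width, and width relative to the estimate x / n (an assumption)."""
    lower, upper = clopper_pearson(x, n, alpha)
    absolute = upper - lower
    p_hat = x / n
    relative = absolute / p_hat if p_hat > 0 else float("inf")
    return absolute, relative
```

For example, `interval_widths(5, 10)` returns the two widths that a multistep rule of this kind would compare against its thresholds after first checking the denominator sample size.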
Subjects
Health Surveys/standards , Research Design/standards , Statistics as Topic/standards , Confidence Intervals , Data Interpretation, Statistical , Female , Humans , Male , National Center for Health Statistics, U.S. , Reference Standards , Sample Size , United States
ABSTRACT
Objectives: This report presents the development, plan, and operation of the 2011-2012 National Survey of Children's Health, a module of the State and Local Area Integrated Telephone Survey, conducted by the National Center for Health Statistics. Funding was provided by the Maternal and Child Health Bureau, Health Resources and Services Administration. The survey was designed to produce national and state prevalence estimates of the physical and emotional health of children aged 0-17 years, as well as factors that may relate to child well-being including medical homes, family interactions, parental health, school and after-school experiences, and neighborhood characteristics.
Subjects
Health Surveys/methods , Nutrition Surveys/methods , Adolescent , Adult , Aged , Child , Child, Preschool , Cross-Sectional Studies , Female , Health Status , Housing , Humans , Infant , Infant, Newborn , Male , Mental Health , Middle Aged , National Center for Health Statistics, U.S. , Research Design , Residence Characteristics/statistics & numerical data , United States , United States Dept. of Health and Human Services , Young Adult
ABSTRACT
Previous research has found differences in characteristics of beneficiaries enrolled in Medicare fee-for-service versus Medicare Advantage (MA), but there has been limited research using more recent MA enrollment data. We used 1997-2005 National Health Interview Survey data linked to 2000-2009 Medicare enrollment data to compare characteristics of Medicare beneficiaries before their initial enrollment into Medicare fee-for-service or MA at age 65 and whether the characteristics of beneficiaries changed from 2006 to 2009 compared with 2000 to 2005. During this period of MA growth, the greatest increase in enrollment appears to have come from those with no chronic conditions and men.
Subjects
Fee-for-Service Plans , Insurance Coverage , Medicare Part C , Aged , Female , Humans , Male , Managed Care Programs , Self Report , United States
ABSTRACT
Analyses of the Third National Health and Nutrition Examination Survey (NHANES III) in 1988 to 1994 found an association of increasing blood lead levels < 10 µg/dL with a higher risk of cardiovascular disease (CVD) mortality. The potential need to correct blood lead for hematocrit/hemoglobin and adjust for biomarkers for other metals, for example, cadmium and iron, had not been addressed in the previous NHANES III-based studies on blood lead-CVD mortality association. We analyzed 1999 to 2010 NHANES data for 18,602 participants who had a blood lead measurement, were ≥ 40 years of age at the baseline examination and were followed for mortality through 2011. We calculated the relative risk for CVD mortality as a function of hemoglobin- or hematocrit-corrected log-transformed blood lead through Cox proportional hazard regression analysis with adjustment for serum iron, blood cadmium, serum C-reactive protein, serum calcium, smoking, alcohol intake, race/Hispanic origin, and sex. The adjusted relative risk for CVD mortality was 1.44 (95% confidence interval = 1.05, 1.98) per 10-fold increase in hematocrit-corrected blood lead with little evidence of nonlinearity. Similar results were obtained with hemoglobin-corrected blood lead. Not correcting blood lead for hematocrit/hemoglobin resulted in underestimation of the lead-CVD mortality association while not adjusting for iron status and blood cadmium resulted in overestimation of the lead-CVD mortality association. In a nationally representative sample of U.S. adults, log-transformed blood lead was linearly associated with increased CVD mortality. Correcting blood lead for hematocrit/hemoglobin and adjustments for some biomarkers affected the association.
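To read the reported estimate on a more familiar scale, the relative risk per 10-fold increase can be converted to a relative risk per doubling. This is a worked illustration, not a result reported in the abstract; it assumes the log-linear form the authors describe, so that the risk multiplier for any fold change k is the 10-fold multiplier raised to the power log10(k).

\[
\mathrm{RR}_{2\times} = \mathrm{RR}_{10\times}^{\,\log_{10} 2} = 1.44^{0.301} = \exp(0.301 \times \ln 1.44) \approx \exp(0.110) \approx 1.12
\]

Under that assumption, a doubling of hematocrit-corrected blood lead corresponds to roughly a 12% higher risk of CVD mortality.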
Subjects
Cardiovascular Diseases/mortality , Lead/blood , Adult , Aged , Alcohol Drinking/epidemiology , Biomarkers , C-Reactive Protein/analysis , Cadmium/blood , Cardiovascular Diseases/ethnology , Cause of Death , Female , Hematocrit , Hemoglobins , Humans , Iron/blood , Male , Middle Aged , Nutrition Surveys , Proportional Hazards Models , Residence Characteristics , Risk Factors , Sex Factors , Smoking/epidemiology , United States
ABSTRACT
To maximize limited resources and reduce respondent burden, there is an increased interest in linking population health surveys with other sources of data, such as administrative records. Health differences between adults who consent to and refuse linkage could bias study results with linked data. National Health Interview Survey (NHIS) data are routinely linked to administrative records from the Social Security Administration and the Centers for Medicare and Medicaid Services. Using the NHIS 2010-2013, we examined the association between selected health conditions and respondents' linkage refusal. Linkage refusal was significantly lower for adults with serious psychological distress, chronic obstructive pulmonary disease, diabetes, heart disease, stroke, hypertension, and cancer compared to those without these conditions. Linkage refusal decreased as the number of conditions increased and health status decreased. Our finding that linkage consent was associated with respondents' health characteristics suggests that researchers should try to address potential linkage bias in their analyses.
ABSTRACT
Record linkage is a valuable and efficient tool for connecting information from different data sources. The National Center for Health Statistics (NCHS) has linked its population-based health surveys with administrative data, including Medicare enrollment and claims records. However, the linked NCHS-Medicare files are subject to missing data: first, not all survey participants agree to record linkage, and second, Medicare claims data are only consistently available for beneficiaries enrolled in the Fee-for-Service (FFS) program, not in Medicare Advantage (MA) plans. In this research, we examine the usefulness of multiple imputation for handling missing data in linked National Health Interview Survey (NHIS)-Medicare files. The motivating example is a study of mammography status from 1999 to 2004 among women aged 65 years and older enrolled in the FFS program. In our example, mammography screening status and FFS/MA plan type are missing for NHIS survey participants who were not linkage eligible. Mammography status is also missing for linked participants in an MA plan. We explore three imputation approaches: (i) imputing screening status first, (ii) imputing FFS/MA plan type first, and (iii) imputing the two longitudinal processes simultaneously. We conduct simulation studies to evaluate these methods and compare them using the linked NHIS-Medicare files. The imputation procedures described in our paper would also be applicable to other public health-related research using linked data files with missing data issues arising from program characteristics (e.g., intermittent enrollment or data collection) reflected in administrative data and linkage eligibility by survey participants.