Search | Portal Regional de la BVS

1.

Evaluating Binary Outcome Classifiers Estimated from Survey Data.

Wadekar, Adway S; Reiter, Jerome P.

Epidemiology ; 2024 Aug 14.

Article in English | MEDLINE | ID: mdl-39140488

ABSTRACT

Surveys are commonly used to facilitate research in epidemiology, health, and the social and behavioral sciences. Often, these surveys are not simple random samples, and respondents are given weights reflecting their probability of selection into the survey. We show that using survey weights can be beneficial for evaluating the quality of predictive models when splitting data into training and test sets. In particular, we characterize model assessment statistics, such as sensitivity and specificity, as finite population quantities and compute survey-weighted estimates of these quantities with test data comprising a random subset of the original data. Using simulations with data from the National Survey on Drug Use and Health and the National Comorbidity Survey, we show that unweighted metrics estimated with sample test data can misrepresent population performance, but weighted metrics appropriately adjust for the complex sampling design. We also show that this conclusion holds for models trained using upsampling for mitigating class imbalance. The results suggest that weighted metrics should be used when evaluating performance on test data derived from complex surveys.
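
The weighted metrics described in this abstract can be sketched as ratio estimators in which each test-set respondent contributes their survey weight instead of a count of one. This is only a minimal illustration: the function name, the toy labels, and the weights below are hypothetical, and the abstract does not state the exact estimators the authors use.

```python
def weighted_sensitivity_specificity(y_true, y_pred, weights):
    """Survey-weighted sensitivity and specificity for 0/1 labels.

    Each respondent adds their weight (not 1) to the confusion-matrix cell
    they fall in, so the ratios estimate finite-population quantities.
    """
    tp = fn = tn = fp = 0.0
    for y, y_hat, w in zip(y_true, y_pred, weights):
        if y == 1 and y_hat == 1:
            tp += w
        elif y == 1:
            fn += w
        elif y_hat == 0:
            tn += w
        else:
            fp += w
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical test set: labels, classifier predictions, survey weights.
y_true = [1, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0]
weights = [10.0, 30.0, 20.0, 20.0, 60.0]

sens_w, spec_w = weighted_sensitivity_specificity(y_true, y_pred, weights)
sens_u, spec_u = weighted_sensitivity_specificity(y_true, y_pred, [1.0] * 5)
print(sens_w, spec_w)  # weighted estimates
print(sens_u, spec_u)  # unweighted estimates
```

In this toy example the missed case carries a large weight, so the weighted sensitivity is lower than the unweighted one, which is the kind of discrepancy the abstract warns about.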

2.

The Association between Long-term PM2.5 Exposure and Risk for Pancreatic Cancer: An Application of Social Informatics.

Bhavsar, Nrupen A; Jowers, Kay; Yang, Lexie Z; Guha, Sharmistha; Lin, Xuan; Peskoe, Sarah; McManus, Hannah; McElroy, Lisa; Bravo, Mercedes; Reiter, Jerome P; Whitsel, Eric; Timmins, Christopher.

Am J Epidemiol ; 2024 Aug 09.

Article in English | MEDLINE | ID: mdl-39123098

ABSTRACT

There is a profound need to identify modifiable risk factors to screen for and prevent pancreatic cancer. Air pollution, including fine particulate matter (PM2.5), is increasingly recognized as a risk factor for cancer. We conducted a case-control study using data from the electronic health record (EHR) of Duke University Health System, 15-year residential history, NASA satellite fine particulate matter (PM2.5) and neighborhood socioeconomic data. Using deterministic and probabilistic linkage algorithms, we linked residential history and EHR data to quantify long-term PM2.5 exposure. Logistic regression models quantified the association between a one interquartile range (IQR) increase in PM2.5 concentration and pancreatic cancer risk. The study included 203 cases and 5027 controls (median age of 59 years, 62% female, 26% Black). Individuals with pancreatic cancer had higher average annual exposure (9.4 µg/m³) than controls. An IQR increase in average annual PM2.5 was associated with greater odds of pancreatic cancer (OR=1.20; 95% CI: 1.00-1.44). These findings highlight the link between elevated PM2.5 exposure and increased pancreatic cancer risk. They may inform screening strategies for high-risk populations and guide air pollution policies to mitigate exposure.
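
An odds ratio "per IQR increase" is usually obtained by rescaling a logistic regression coefficient: if the coefficient is expressed per unit of exposure, the odds ratio for an IQR-wide increase is exp(coefficient × IQR). The sketch below shows that arithmetic only. The coefficient, standard error, and IQR width are hypothetical values chosen for illustration, since the abstract reports none of them.

```python
import math


def odds_ratio_per_iqr(beta_per_unit, se_per_unit, iqr, z=1.96):
    """Rescale a per-unit log-odds coefficient to an IQR-wide increase.

    Returns (odds ratio, lower bound, upper bound) of a Wald-type interval.
    """
    point = math.exp(beta_per_unit * iqr)
    lower = math.exp((beta_per_unit - z * se_per_unit) * iqr)
    upper = math.exp((beta_per_unit + z * se_per_unit) * iqr)
    return point, lower, upper


# Hypothetical inputs: log-odds per 1 µg/m³, its standard error, IQR in µg/m³.
or_iqr, ci_low, ci_high = odds_ratio_per_iqr(0.10, 0.05, 2.0)
print(round(or_iqr, 2), round(ci_low, 2), round(ci_high, 2))
```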

3.

Regression-Assisted Bayesian Record Linkage for Causal Inference in Observational Studies with Covariates Spread Over Two Files.

Guha, Sharmistha; Reiter, Jerome P.

J Stat Plan Inference ; 229, 2024 Mar.

Article in English | MEDLINE | ID: mdl-39076728

ABSTRACT

We consider causal inference for observational studies with data spread over two files. One file includes the treatment, outcome, and some covariates measured on a set of individuals, and the other file includes additional causally-relevant covariates measured on a partially overlapping set of individuals. By linking records in the two databases, the analyst can control for more covariates, thereby reducing the risk of bias compared to using only one file alone. When analysts do not have access to a unique identifier that enables perfect, error-free linkages, they typically rely on probabilistic record linkage to construct a single linked data set, and estimate causal effects using these linked data. This typical practice does not propagate uncertainty from imperfect linkages to the causal inferences. Further, it does not take advantage of relationships among the variables to improve the linkage quality. We address these shortcomings by fusing regression-assisted, Bayesian probabilistic record linkage with causal inference. The Markov chain Monte Carlo sampler generates multiple plausible linked data files as byproducts that analysts can use for multiple imputation inferences. Here, we show results for two causal estimators based on propensity score overlap weights. Using simulations and data from the Italy Survey on Household Income and Wealth, we show that our approach can improve the accuracy of estimated treatment effects.
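
For readers unfamiliar with propensity score overlap weights, the standard construction gives a treated unit the weight 1 − e and a control unit the weight e, where e is the unit's estimated propensity score, and then compares weighted outcome means. The sketch below applies that construction to a single, already-linked data set. It is an assumption-laden illustration with hypothetical names and numbers, and it does not reproduce the paper's two estimators or its propagation of linkage uncertainty.

```python
def overlap_weighted_effect(outcomes, treatments, propensity_scores):
    """Weighted difference in means using overlap weights.

    Treated units get weight (1 - e), controls get weight e, which
    emphasises units whose propensity scores are near 0.5.
    """
    num_t = den_t = num_c = den_c = 0.0
    for y, z, e in zip(outcomes, treatments, propensity_scores):
        if z == 1:
            w = 1.0 - e
            num_t += w * y
            den_t += w
        else:
            w = e
            num_c += w * y
            den_c += w
    return num_t / den_t - num_c / den_c


# Hypothetical linked records: outcome, treatment flag, estimated propensity.
outcomes = [3.0, 5.0, 1.0, 2.0]
treatments = [1, 1, 0, 0]
propensity_scores = [0.8, 0.5, 0.5, 0.2]

effect = overlap_weighted_effect(outcomes, treatments, propensity_scores)
print(effect)
```

With several plausible linked files, as the abstract describes, one natural use would be to compute such an estimate on each file and combine the results with multiple imputation rules. That workflow is stated here as an assumption, not as the authors' exact procedure.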

4.

Reply to Muralidhar et al., Kenny et al., and Hotz et al.: The benefits of engagement with external research teams.

Jarmin, Ron S; Abowd, John M; Ashmead, Robert; Cumings-Menon, Ryan; Goldschlag, Nathan; Hawes, Michael; Keller, Sallie Ann; Kifer, Daniel; Leclerc, Philip; Reiter, Jerome P; Rodríguez, Rolando A; Schmutte, Ian; Velkoff, Victoria A; Zhuravlev, Pavel I.

Proc Natl Acad Sci U S A ; 121(11): e2401501121, 2024 Mar 12.

Article in English | MEDLINE | ID: mdl-38442177

5.

An in-depth examination of requirements for disclosure risk assessment.

Jarmin, Ron S; Abowd, John M; Ashmead, Robert; Cumings-Menon, Ryan; Goldschlag, Nathan; Hawes, Michael B; Keller, Sallie Ann; Kifer, Daniel; Leclerc, Philip; Reiter, Jerome P; Rodríguez, Rolando A; Schmutte, Ian; Velkoff, Victoria A; Zhuravlev, Pavel.

Proc Natl Acad Sci U S A ; 120(43): e2220558120, 2023 Oct 24.

Article in English | MEDLINE | ID: mdl-37831744

ABSTRACT

The use of formal privacy to protect the confidentiality of responses in the 2020 Decennial Census of Population and Housing has triggered renewed interest and debate over how to measure the disclosure risks and societal benefits of the published data products. We argue that any proposal for quantifying disclosure risk should be based on prespecified, objective criteria. We illustrate this approach to evaluate the absolute disclosure risk framework, the counterfactual framework underlying differential privacy, and prior-to-posterior comparisons. We conclude that satisfying all the desiderata is impossible, but counterfactual comparisons satisfy the most while absolute disclosure risk satisfies the fewest. Furthermore, we explain that many of the criticisms levied against differential privacy would be levied against any technology that is not equivalent to direct, unrestricted access to confidential data. More research is needed, but in the near term, the counterfactual approach appears best-suited for privacy versus utility analysis.

Subject(s)

Confidentiality , Disclosure , Privacy , Risk Assessment , Censuses

6.

Leveraging Auxiliary Information on Marginal Distributions in Nonignorable Models for Item and Unit Nonresponse.

Akande, Olanrewaju; Madson, Gabriel; Hillygus, D Sunshine; Reiter, Jerome P.

J R Stat Soc Ser A Stat Soc ; 184(2): 643-662, 2021 Apr.

Article in English | MEDLINE | ID: mdl-36254262

ABSTRACT

Often, government agencies and survey organizations know the population counts or percentages for some of the variables in a survey. These may be available from auxiliary sources, for example, administrative databases or other high quality surveys. We present and illustrate a model-based framework for leveraging such auxiliary marginal information when handling unit and item nonresponse. We show how one can use the margins to specify different missingness mechanisms for each type of nonresponse. We use the framework to impute missing values in voter turnout in a subset of data from the U.S. Current Population Survey (CPS). In doing so, we examine the sensitivity of results to different assumptions about the unit and item nonresponse.

7.

Use of Hospital Referral Regions in Evaluating End-of-Life Care.

Kaufman, Brystana G; Klemish, David; Olson, Andrew; Kassner, Cordt T; Reiter, Jerome P; Harker, Matthew; Sheble, Laura; Goldstein, Benjamin A; Taylor, Donald H; Bhavsar, Nrupen A.

J Palliat Med ; 23(1): 90-96, 2020 Jan.

Article in English | MEDLINE | ID: mdl-31424316

ABSTRACT

Background: Hospital referral regions (HRRs) are often used to characterize inpatient referral patterns, but it is unknown how well these geographic regions are aligned with variation in Medicare-financed hospice care, which is largely provided at home. Objective: Our objective was to characterize the variability in hospice use rates among elderly Medicare decedents by HRR and county. Methods: Using 2014 Master Beneficiary File for decedents 65 and older from North and South Carolina, we applied Bayesian mixed models to quantify variation in hospice use rates explained by HRR fixed effects, county random effects, and residual error among Medicare decedents. Results: We found HRRs and county indicators are significant predictors of hospice use in NC and SC; however, the relative variation within HRRs and associated residual variation is substantial. On average, HRR fixed effects explained more variation in hospice use rates than county indicators with a standard deviation (SD) of 10.0 versus 5.1 percentage points. The SD of the residual error is 5.7 percentage points. On average, variation within HRRs is about half the variation between regions (52%). Conclusions: The magnitude of unexplained residual variation in hospice use for NC and SC suggests that novel, end-of-life-specific service areas should be developed and tested to better capture geographic differences and inform research, health systems, and policy.

Subject(s)

Hospice Care , Terminal Care , Aged , Bayes Theorem , Humans , Medicare , Referral and Consultation , South Carolina , United States

8.

Simultaneous record linkage and causal inference with propensity score subclassification.

Wortman, Joan Heck; Reiter, Jerome P.

Stat Med ; 37(24): 3533-3546, 2018 Oct 30.

Article in English | MEDLINE | ID: mdl-30069901

ABSTRACT

We develop methodology for causal inference in observational studies when using propensity score subclassification on data constructed with probabilistic record linkage techniques. We focus on scenarios where covariates and binary treatment assignments are in one file and outcomes are in another file, and the goal is to estimate an additive treatment effect by merging the files. We assume that the files can be linked using variables common to both files, eg, names or birth dates, but that links are subject to errors, eg, due to reporting errors in the linking variables. We develop methodology for cases where such reporting errors are independent of the other variables on the files. We describe conceptually how linkage errors can affect causal estimates in subclassification contexts. We also present and evaluate several algorithms for deciding which record pairs to use in estimation of causal effects. Using simulation studies, we demonstrate that case selection procedures can result in improved accuracy in estimates of treatment effects from linked data compared to using only cases known to be true links.

Subject(s)

Medical Record Linkage , Propensity Score , Algorithms , Biostatistics , Causality , Computer Simulation , Data Interpretation, Statistical , Humans , Observational Studies as Topic/statistics & numerical data
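
The propensity score subclassification estimator that this abstract builds on can be sketched as follows: sort records by estimated propensity score, cut them into equal-sized strata, take the treated-minus-control difference in mean outcomes within each stratum, and average the differences weighted by stratum size. The sketch assumes every record pair is a correct link and does not implement the paper's algorithms for choosing record pairs. All names and data are hypothetical.

```python
import statistics


def subclassification_effect(outcomes, treatments, propensity_scores, n_strata=5):
    """Additive treatment effect via propensity score subclassification."""
    order = sorted(range(len(outcomes)), key=lambda i: propensity_scores[i])
    n = len(order)
    weighted_sum = 0.0
    used = 0
    for s in range(n_strata):
        stratum = order[s * n // n_strata:(s + 1) * n // n_strata]
        treated = [outcomes[i] for i in stratum if treatments[i] == 1]
        control = [outcomes[i] for i in stratum if treatments[i] == 0]
        if not treated or not control:
            continue  # a stratum without both groups cannot contribute
        diff = statistics.mean(treated) - statistics.mean(control)
        weighted_sum += len(stratum) * diff
        used += len(stratum)
    return weighted_sum / used


# Hypothetical linked data set with eight records.
propensity_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
treatments = [0, 1, 0, 1, 1, 0, 1, 1]
outcomes = [1.0, 3.0, 2.0, 4.0, 6.0, 5.0, 8.0, 7.0]

effect = subclassification_effect(outcomes, treatments, propensity_scores, n_strata=2)
print(effect)
```

A false link would place the wrong outcome next to a record's treatment and stratum, which is one way linkage errors can distort the within-stratum differences.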

9.

Predicting Length of Hospice Stay: An Application of Quantile Regression.

Kaufman, Brystana G; Klemish, David; Kassner, Cordt T; Reiter, Jerome P; Li, Fan; Harker, Matthew; O'Brien, Emily C; Taylor, Donald H; Bhavsar, Nrupen A.

J Palliat Med ; 21(8): 1131-1136, 2018 Aug.

Article in English | MEDLINE | ID: mdl-29762075

ABSTRACT

BACKGROUND: Use of the Medicare hospice benefit has been associated with high-quality care at the end of life, and hospice length of use in particular has been used as a proxy for appropriate timing of hospice enrollment. Quantile regression has been underutilized as an alternative tool to model distributional changes in hospice length of use and hospice payments outside of the mean. OBJECTIVE: To test for heterogeneity in the relationship between patient characteristics and hospice outcomes across the distribution of hospice days. SETTING: Medicare Beneficiary Summary File and survey data (2014) for hospice beneficiaries in North and South Carolina with common terminal diagnoses. MEASUREMENTS: Distributional shifts associated with patient characteristics were evaluated at the 25th and 75th percentiles of hospice days and hospice payments using quantile regressions and compared to the mean shift estimated by ordinary least squares (OLS) regression. PRINCIPAL FINDINGS: Significant (p < 0.001) heterogeneity in the marginal effects on hospice days and costs was observed, with patient characteristics associated with generally larger shifts in the 75th percentile than the 25th percentile. Mean effects estimated by OLS regression overestimate the magnitude of the median marginal effects for all patient characteristics except for race. Results for hospice payments in 2014 were similar. CONCLUSIONS: Methodological decisions can have a meaningful impact in the evaluation of factors influencing hospice length of use or cost.

Subject(s)

Hospice Care/economics , Hospice Care/statistics & numerical data , Length of Stay/economics , Length of Stay/statistics & numerical data , Medicare/economics , Medicare/statistics & numerical data , Aged , Aged, 80 and over , Female , Forecasting , Humans , Male , North Carolina , Regression Analysis , Retrospective Studies , South Carolina , United States
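
Quantile regression, as used in this abstract, replaces the squared-error criterion of OLS with the "check" (pinball) loss, whose minimiser is a quantile and not the mean. The sketch below shows this for the simplest case of a model with no covariates, where the fitted constant is a sample quantile. The length-of-stay values are hypothetical and are not taken from the study.

```python
def pinball_loss(values, candidate, tau):
    """Average check loss of predicting `candidate` for every value."""
    total = 0.0
    for v in values:
        residual = v - candidate
        total += tau * residual if residual >= 0 else (tau - 1.0) * residual
    return total / len(values)


def constant_quantile_fit(values, tau):
    """Intercept-only quantile regression: search the observed values."""
    return min(sorted(values), key=lambda c: pinball_loss(values, c, tau))


# Hypothetical, right-skewed lengths of hospice stay in days.
hospice_days = [2, 5, 9, 14, 30, 61, 120]

q25 = constant_quantile_fit(hospice_days, 0.25)
q50 = constant_quantile_fit(hospice_days, 0.50)
q75 = constant_quantile_fit(hospice_days, 0.75)
mean_days = sum(hospice_days) / len(hospice_days)
print(q25, q50, q75, mean_days)
```

For skewed data such as these, the mean sits well above the median, which gives some intuition for why mean effects from OLS can differ from effects at the median or other percentiles.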

10.

Sensorimotor abilities predict on-field performance in professional baseball.

Burris, Kyle; Vittetoe, Kelly; Ramger, Benjamin; Suresh, Sunith; Tokdar, Surya T; Reiter, Jerome P; Appelbaum, L Gregory.

Sci Rep ; 8(1): 116, 2018 Jan 08.

Article in English | MEDLINE | ID: mdl-29311675

ABSTRACT

Baseball players must be able to see and react in an instant, yet it is hotly debated whether superior performance is associated with superior sensorimotor abilities. In this study, we compare sensorimotor abilities, measured through 8 psychomotor tasks comprising the Nike Sensory Station assessment battery, and game statistics in a sample of 252 professional baseball players to evaluate the links between sensorimotor skills and on-field performance. For this purpose, we develop a series of Bayesian hierarchical latent variable models enabling us to compare statistics across professional baseball leagues. Within this framework, we find that sensorimotor abilities are significant predictors of on-base percentage, walk rate and strikeout rate, accounting for age, position, and league. We find no such relationship for either slugging percentage or fielder-independent pitching. The pattern of results suggests performance contributions from both visual-sensory and visual-motor abilities and indicates that sensorimotor screenings may be useful for player scouting.

Subject(s)

Athletic Performance , Baseball , Psychomotor Performance , Adolescent , Adult , Algorithms , Humans , Models, Theoretical , Young Adult

11.

Visual abilities distinguish pitchers from hitters in professional baseball.

Klemish, David; Ramger, Benjamin; Vittetoe, Kelly; Reiter, Jerome P; Tokdar, Surya T; Appelbaum, Lawrence Gregory.

J Sports Sci ; 36(2): 171-179, 2018 Jan.

Article in English | MEDLINE | ID: mdl-28282749

ABSTRACT

This study aimed to evaluate the possibility that differences in sensorimotor abilities exist between hitters and pitchers in a large cohort of baseball players of varying levels of experience. Secondary data analysis was performed on 9 sensorimotor tasks comprising the Nike Sensory Station assessment battery. Bayesian hierarchical regression modelling was applied to test for differences between pitchers and hitters in data from 566 baseball players (112 high school, 85 college, 369 professional) collected at 20 testing centres. Explanatory variables including height, handedness, eye dominance, concussion history, and player position were modelled along with age curves using basis regression splines. Regression analyses revealed better performance for hitters relative to pitchers at the professional level in the visual clarity and depth perception tasks, but these differences did not exist at the high school or college levels. No significant differences were observed in the other 7 measures of sensorimotor capabilities included in the test battery, and no systematic biases were found between the testing centres. These findings, indicating that professional-level hitters have better visual acuity and depth perception than professional-level pitchers, affirm the notion that highly experienced athletes have differing perceptual skills. Findings are discussed in relation to deliberate practice theory.

Subject(s)

Athletic Performance/physiology , Baseball/physiology , Depth Perception/physiology , Visual Acuity/physiology , Adolescent , Adult , Age Factors , Bayes Theorem , Humans , Male , Motor Skills/physiology , Sensorimotor Cortex/physiology , Task Performance and Analysis , Young Adult

12.

A comparison of two methods of estimating propensity scores after multiple imputation.

Mitra, Robin; Reiter, Jerome P.

Stat Methods Med Res ; 25(1): 188-204, 2016 Feb.

Article in English | MEDLINE | ID: mdl-22687877

ABSTRACT

In many observational studies, analysts estimate treatment effects using propensity scores, e.g. by matching or sub-classifying on the scores. When some values of the covariates are missing, analysts can use multiple imputation to fill in the missing data, estimate propensity scores based on the m completed datasets, and use the propensity scores to estimate treatment effects. We compare two approaches to implement this process. In the first, the analyst estimates the treatment effect using propensity score matching within each completed data set, and averages the m treatment effect estimates. In the second approach, the analyst averages the m propensity scores for each record across the completed datasets, and performs propensity score matching with these averaged scores to estimate the treatment effect. We compare properties of both methods via simulation studies using artificial and real data. The simulations suggest that the second method has greater potential to produce substantial bias reductions than the first, particularly when the missing values are predictive of treatment assignment.

Subject(s)

Models, Statistical , Propensity Score , Bias , Biostatistics , Breast Feeding/statistics & numerical data , Child , Child Development , Child, Preschool , Computer Simulation , Humans , Infant , Infant, Newborn , Observational Studies as Topic/statistics & numerical data , Treatment Outcome
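
The two approaches compared in this abstract differ only in where the averaging happens: the first averages m treatment effect estimates, and the second averages the m propensity scores for each record and then matches once. The sketch below makes that difference concrete. It uses 1-nearest-neighbour matching with replacement to estimate the effect on the treated, which is an assumption, since the abstract does not name a matching algorithm. All function names and numbers are hypothetical.

```python
def match_att(outcomes, treatments, scores):
    """Effect on the treated via 1-nearest-neighbour matching on a score."""
    controls = [i for i, z in enumerate(treatments) if z == 0]
    diffs = []
    for i, z in enumerate(treatments):
        if z == 1:
            j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
            diffs.append(outcomes[i] - outcomes[j])
    return sum(diffs) / len(diffs)


def first_approach(outcomes, treatments, scores_by_imputation):
    """Match within each completed data set, then average the m estimates."""
    estimates = [match_att(outcomes, treatments, s) for s in scores_by_imputation]
    return sum(estimates) / len(estimates)


def second_approach(outcomes, treatments, scores_by_imputation):
    """Average each record's m propensity scores, then match once."""
    m = len(scores_by_imputation)
    averaged = [sum(record_scores) / m for record_scores in zip(*scores_by_imputation)]
    return match_att(outcomes, treatments, averaged)


# Hypothetical data: outcomes and treatments are fully observed; propensity
# scores differ across the m = 2 completed data sets because covariates were imputed.
outcomes = [10.0, 8.0, 5.0, 4.0]
treatments = [1, 1, 0, 0]
scores_by_imputation = [
    [0.7, 0.4, 0.6, 0.3],
    [0.5, 0.6, 0.1, 0.45],
]

est_first = first_approach(outcomes, treatments, scores_by_imputation)
est_second = second_approach(outcomes, treatments, scores_by_imputation)
print(est_first, est_second)
```

The two estimates differ here because averaging the scores changes which control is the nearest neighbour of the first treated record. The toy numbers say nothing about which approach has less bias.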

13.

A Nonparametric, Multiple Imputation-Based Method for the Retrospective Integration of Data Sets.

Carrig, Madeline M; Manrique-Vallier, Daniel; Ranby, Krista W; Reiter, Jerome P; Hoyle, Rick H.

Multivariate Behav Res ; 50(4): 383-97, 2015.

Article in English | MEDLINE | ID: mdl-26257437

ABSTRACT

Complex research questions often cannot be addressed adequately with a single data set. One sensible alternative to the high cost and effort associated with the creation of large new data sets is to combine existing data sets containing variables related to the constructs of interest. The goal of the present research was to develop a flexible, broadly applicable approach to the integration of disparate data sets that is based on nonparametric multiple imputation and the collection of data from a convenient, de novo calibration sample. We demonstrate proof of concept for the approach by integrating three existing data sets containing items related to the extent of problematic alcohol use and associations with deviant peers. We discuss both necessary conditions for the approach to work well and potential strengths and weaknesses of the method compared to other data set integration approaches.

Subject(s)

Behavioral Research/methods , Retrospective Studies , Statistics, Nonparametric , Adolescent , Adult , Child , Humans , Psychometrics/methods , Reproducibility of Results , Young Adult

14.

Multiple imputation for harmonizing longitudinal non-commensurate measures in individual participant data meta-analysis.

Siddique, Juned; Reiter, Jerome P; Brincks, Ahnalee; Gibbons, Robert D; Crespi, Catherine M; Brown, C Hendricks.

Stat Med ; 34(26): 3399-414, 2015 Nov 20.

Article in English | MEDLINE | ID: mdl-26095855

ABSTRACT

There are many advantages to individual participant data meta-analysis for combining data from multiple studies. These advantages include greater power to detect effects, increased sample heterogeneity, and the ability to perform more sophisticated analyses than meta-analyses that rely on published results. However, a fundamental challenge is that it is unlikely that variables of interest are measured the same way in all of the studies to be combined. We propose that this situation can be viewed as a missing data problem in which some outcomes are entirely missing within some trials and use multiple imputation to fill in missing measurements. We apply our method to five longitudinal adolescent depression trials where four studies used one depression measure and the fifth study used a different depression measure. None of the five studies contained both depression measures. We describe a multiple imputation approach for filling in missing depression measures that makes use of external calibration studies in which both depression measures were used. We discuss some practical issues in developing the imputation model including taking into account treatment group and study. We present diagnostics for checking the fit of the imputation model and investigate whether external information is appropriately incorporated into the imputed values.

Subject(s)

Antidepressive Agents, Second-Generation/therapeutic use , Depression/drug therapy , Fluoxetine/therapeutic use , Meta-Analysis as Topic , Models, Statistical , Adolescent , Calibration , Child , Female , Humans , Longitudinal Studies , Male , Psychology, Adolescent , Randomized Controlled Trials as Topic , Research Design , Treatment Outcome

15.

Alcohol Pharmacology Education Partnership: Using Chemistry and Biology Concepts To Educate High School Students about Alcohol.

Godin, Elizabeth A; Kwiek, Nicole; Sikes, Suzanne S; Halpin, Myra J; Weinbaum, Carolyn A; Burgette, Lane F; Reiter, Jerome P; Schwartz-Bloom, Rochelle D.

J Chem Educ ; 91(2): 165-172, 2014 Feb 11.

Article in English | MEDLINE | ID: mdl-24803686

ABSTRACT

We developed the Alcohol Pharmacology Education Partnership (APEP), a set of modules designed to integrate a topic of interest (alcohol) with concepts in chemistry and biology for high school students. Chemistry and biology teachers (n = 156) were recruited nationally to field-test APEP in a controlled study. Teachers obtained professional development either at a conference-based workshop (NSTA or NCSTA) or via distance learning to learn how to incorporate the APEP modules into their teaching. They field-tested the modules in their classes during the following year. Teacher knowledge of chemistry and biology concepts increased significantly following professional development, and was maintained for at least a year. Their students (n = 14 014) demonstrated significantly higher scores when assessed for knowledge of both basic and advanced chemistry and biology concepts compared to students not using APEP modules in their classes the previous year. Higher scores were achieved as the number of modules used increased. These findings are consistent with our previous studies, demonstrating higher scores in chemistry and biology after students use modules that integrate topics interesting to them, such as drugs (the Pharmacology Education Partnership).

16.

Multiple-Shrinkage Multinomial Probit Models with Applications to Simulating Geographies in Public Use Data.

Burgette, Lane F; Reiter, Jerome P.

Bayesian Anal ; 8(2), 2013 Jun 01.

Article in English | MEDLINE | ID: mdl-24358073

ABSTRACT

Multinomial outcomes with many levels can be challenging to model. Information typically accrues slowly with increasing sample size, yet the parameter space expands rapidly with additional covariates. Shrinking all regression parameters towards zero, as often done in models of continuous or binary response variables, is unsatisfactory, since setting parameters equal to zero in multinomial models does not necessarily imply "no effect." We propose an approach to modeling multinomial outcomes with many levels based on a Bayesian multinomial probit (MNP) model and a multiple shrinkage prior distribution for the regression parameters. The prior distribution encourages the MNP regression parameters to shrink toward a number of learned locations, thereby substantially reducing the dimension of the parameter space. Using simulated data, we compare the predictive performance of this model against two other recently-proposed methods for big multinomial models. The results suggest that the fully Bayesian, multiple shrinkage approach can outperform these other methods. We apply the multiple shrinkage MNP to simulating replacement values for areal identifiers, e.g., census tract indicators, in order to protect data confidentiality in public use datasets.

17.

Maternal health prior to pregnancy and preterm birth among urban, low income black women in Baltimore: the Baltimore Preterm Birth Study.

Orr, Suezanne T; Reiter, Jerome P; James, Sherman A; Orr, Caroline A.

Ethn Dis ; 22(1): 85-9, 2012.

Article in English | MEDLINE | ID: mdl-22774314

ABSTRACT

OBJECTIVES: Black women have increased risk of preterm birth compared to white women, and overall black women are in poorer health than white women. Recent recommendations to reduce preterm birth have focused on preconception health care. We explore the associations between indicators of maternal prepregnancy health with preterm birth among a sample of black women. DESIGN: The current study was prospective. SETTING: Enrollment occurred in prenatal clinics in Baltimore. PARTICIPANTS: Women (N=922) aged ≥18 were enrolled in the study. Data on maternal health, behaviors, and pregnancy outcome were abstracted from clinical records. MAIN OUTCOME MEASURE: Logistic regression was used to evaluate associations between behavioral and health status variables with preterm birth. RESULTS: In bivariate analysis, alcohol use, drug use and chronic diseases were associated with preterm birth. In the logistic regression analysis, drug use and chronic diseases were associated with preterm birth. CONCLUSIONS: These results demonstrate an association between maternal health and behaviors prior to pregnancy with preterm birth among black women. Providing access to health care prior to pregnancy to address behavioral and health risks may improve pregnancy outcomes among low-income black women.

Subject(s)

Black or African American , Health Status Indicators , Maternal Behavior , Premature Birth , Adolescent , Adult , Baltimore/epidemiology , Chronic Disease/epidemiology , Chronic Disease/ethnology , Female , Health Behavior , Humans , Logistic Models , Poverty , Pregnancy , Prenatal Care , Risk Factors , Risk-Taking , Substance-Related Disorders/complications , Substance-Related Disorders/epidemiology , Substance-Related Disorders/ethnology

18.

Sensitivity analysis for unmeasured confounding in principal stratification settings with binary variables.

Schwartz, Scott; Li, Fan; Reiter, Jerome P.

Stat Med ; 31(10): 949-62, 2012 May 10.

Article in English | MEDLINE | ID: mdl-22362635

ABSTRACT

Within causal inference, principal stratification (PS) is a popular approach for dealing with intermediate variables, that is, variables affected by treatment that also potentially affect the response. However, when there exists unmeasured confounding in the treatment arms--as can happen in observational studies--causal estimands resulting from PS analyses can be biased. We identify the various pathways of confounding present in PS contexts and their effects for PS inference. We present model-based approaches for assessing the sensitivity of complier average causal effect estimates to unmeasured confounding in the setting of binary treatments, binary intermediate variables, and binary outcomes. These same approaches can be used to assess sensitivity to unknown direct effects of treatments on outcomes because, as we show, direct effects are operationally equivalent to one of the pathways of unmeasured confounding. We illustrate the methodology using a randomized study with artificially introduced confounding and a sensitivity analysis for an observational study of the effects of physical activity and body mass index on cardiovascular disease.

Subject(s)

Models, Statistical , Population Dynamics , Randomized Controlled Trials as Topic/methods , Cohort Studies , Humans , Surveys and Questionnaires , Treatment Outcome

19.

Modeling adverse birth outcomes via confirmatory factor quantile regression.

Burgette, Lane F; Reiter, Jerome P.

Biometrics ; 68(1): 92-100, 2012 Mar.

Article in English | MEDLINE | ID: mdl-21689080

ABSTRACT

We describe a Bayesian quantile regression model that uses a confirmatory factor structure for part of the design matrix. This model is appropriate when the covariates are indicators of scientifically determined latent factors, and it is these latent factors that analysts seek to include as predictors in the quantile regression. We apply the model to a study of birth weights in which the effects of latent variables representing psychosocial health and actual tobacco usage on the lower quantiles of the response distribution are of interest. The models can be fit using an R package called factorQR.

Subject(s)

Bayes Theorem , Fetal Growth Retardation/epidemiology , Infant, Very Low Birth Weight , Maternal Exposure/statistics & numerical data , Proportional Hazards Models , Regression Analysis , Tobacco Smoke Pollution/statistics & numerical data , Birth Weight , Causality , Female , Humans , Infant, Low Birth Weight , Infant, Newborn , Prevalence

20.

Estimating Identification Disclosure Risk Using Mixed Membership Models.

Manrique-Vallier, Daniel; Reiter, Jerome P.

J Am Stat Assoc ; 107(500): 1385-1394, 2012 Dec 01.

Article in English | MEDLINE | ID: mdl-25214699

ABSTRACT

Statistical agencies and other organizations that disseminate data are obligated to protect data subjects' confidentiality. For example, ill-intentioned individuals might link data subjects to records in other databases by matching on common characteristics (keys). Successful links are particularly problematic for data subjects with combinations of keys that are unique in the population. Hence, as part of their assessments of disclosure risks, many data stewards estimate the probabilities that sample uniques on sets of discrete keys are also population uniques on those keys. This is typically done using log-linear modeling on the keys. However, log-linear models can yield biased estimates of cell probabilities for sparse contingency tables with many zero counts, which often occurs in databases with many keys. This bias can result in unreliable estimates of probabilities of uniqueness and, hence, misrepresentations of disclosure risks. We propose an alternative to log-linear models for datasets with sparse keys based on a Bayesian version of grade of membership (GoM) models. We present a Bayesian GoM model for multinomial variables and offer an MCMC algorithm for fitting the model. We evaluate the approach by treating data from a recent US Census Bureau public use microdata sample as a population, taking simple random samples from that population, and benchmarking estimated probabilities of uniqueness against population values. Compared to log-linear models, GoM models provide more accurate estimates of the total number of uniques in the samples. Additionally, they offer record-level predictions of uniqueness that dominate those based on log-linear models.
