2.
Proc Natl Acad Sci U S A ; 120(43): e2220558120, 2023 Oct 24.
Article in English | MEDLINE | ID: mdl-37831744

ABSTRACT

The use of formal privacy to protect the confidentiality of responses in the 2020 Decennial Census of Population and Housing has triggered renewed interest and debate over how to measure the disclosure risks and societal benefits of the published data products. We argue that any proposal for quantifying disclosure risk should be based on prespecified, objective criteria. We illustrate this approach to evaluate the absolute disclosure risk framework, the counterfactual framework underlying differential privacy, and prior-to-posterior comparisons. We conclude that satisfying all the desiderata is impossible, but counterfactual comparisons satisfy the most while absolute disclosure risk satisfies the fewest. Furthermore, we explain that many of the criticisms levied against differential privacy would be levied against any technology that is not equivalent to direct, unrestricted access to confidential data. More research is needed, but in the near term, the counterfactual approach appears best-suited for privacy versus utility analysis.


Subject(s)
Confidentiality , Disclosure , Privacy , Risk Assessment , Censuses
3.
J R Stat Soc Ser A Stat Soc ; 184(2): 643-662, 2021 Apr.
Article in English | MEDLINE | ID: mdl-36254262

ABSTRACT

Often, government agencies and survey organizations know the population counts or percentages for some of the variables in a survey. These may be available from auxiliary sources, for example, administrative databases or other high quality surveys. We present and illustrate a model-based framework for leveraging such auxiliary marginal information when handling unit and item nonresponse. We show how one can use the margins to specify different missingness mechanisms for each type of nonresponse. We use the framework to impute missing values in voter turnout in a subset of data from the U.S. Current Population Survey (CPS). In doing so, we examine the sensitivity of results to different assumptions about the unit and item nonresponse.

4.
J Palliat Med ; 23(1): 90-96, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31424316

ABSTRACT

Background: Hospital referral regions (HRRs) are often used to characterize inpatient referral patterns, but it is unknown how well these geographic regions are aligned with variation in Medicare-financed hospice care, which is largely provided at home. Objective: Our objective was to characterize the variability in hospice use rates among elderly Medicare decedents by HRR and county. Methods: Using 2014 Master Beneficiary File for decedents 65 and older from North and South Carolina, we applied Bayesian mixed models to quantify variation in hospice use rates explained by HRR fixed effects, county random effects, and residual error among Medicare decedents. Results: We found HRRs and county indicators are significant predictors of hospice use in NC and SC; however, the relative variation within HRRs and associated residual variation is substantial. On average, HRR fixed effects explained more variation in hospice use rates than county indicators with a standard deviation (SD) of 10.0 versus 5.1 percentage points. The SD of the residual error is 5.7 percentage points. On average, variation within HRRs is about half the variation between regions (52%). Conclusions: The magnitude of unexplained residual variation in hospice use for NC and SC suggests that novel, end-of-life-specific service areas should be developed and tested to better capture geographic differences and inform research, health systems, and policy.


Subject(s)
Hospice Care , Terminal Care , Aged , Bayes Theorem , Humans , Medicare , Referral and Consultation , South Carolina , United States
5.
Stat Med ; 37(24): 3533-3546, 2018 Oct 30.
Article in English | MEDLINE | ID: mdl-30069901

ABSTRACT

We develop methodology for causal inference in observational studies when using propensity score subclassification on data constructed with probabilistic record linkage techniques. We focus on scenarios where covariates and binary treatment assignments are in one file and outcomes are in another file, and the goal is to estimate an additive treatment effect by merging the files. We assume that the files can be linked using variables common to both files, e.g., names or birth dates, but that links are subject to errors, e.g., due to reporting errors in the linking variables. We develop methodology for cases where such reporting errors are independent of the other variables on the files. We describe conceptually how linkage errors can affect causal estimates in subclassification contexts. We also present and evaluate several algorithms for deciding which record pairs to use in estimation of causal effects. Using simulation studies, we demonstrate that case selection procedures can result in improved accuracy in estimates of treatment effects from linked data compared to using only cases known to be true links.


Subject(s)
Medical Record Linkage , Propensity Score , Algorithms , Biostatistics , Causality , Computer Simulation , Data Interpretation, Statistical , Humans , Observational Studies as Topic/statistics & numerical data
6.
J Palliat Med ; 21(8): 1131-1136, 2018 Aug.
Article in English | MEDLINE | ID: mdl-29762075

ABSTRACT

BACKGROUND: Use of the Medicare hospice benefit has been associated with high-quality care at the end of life, and hospice length of use in particular has been used as a proxy for appropriate timing of hospice enrollment. Quantile regression has been underutilized as an alternative tool to model distributional changes in hospice length of use and hospice payments outside of the mean. OBJECTIVE: To test for heterogeneity in the relationship between patient characteristics and hospice outcomes across the distribution of hospice days. SETTING: Medicare Beneficiary Summary File and survey data (2014) for hospice beneficiaries in North and South Carolina with common terminal diagnoses. MEASUREMENTS: Distributional shifts associated with patient characteristics were evaluated at the 25th and 75th percentiles of hospice days and hospice payments using quantile regressions and compared to the mean shift estimated by ordinary least squares (OLS) regression. PRINCIPAL FINDINGS: Significant (p < 0.001) heterogeneity in the marginal effects on hospice days and costs was observed, with patient characteristics associated with generally larger shifts in the 75th percentile than the 25th percentile. Mean effects estimated by OLS regression overestimate the magnitude of the median marginal effects for all patient characteristics except for race. Results for hospice payments in 2014 were similar. CONCLUSIONS: Methodological decisions can have a meaningful impact in the evaluation of factors influencing hospice length of use or cost.


Subject(s)
Hospice Care/economics , Hospice Care/statistics & numerical data , Length of Stay/economics , Length of Stay/statistics & numerical data , Medicare/economics , Medicare/statistics & numerical data , Aged , Aged, 80 and over , Female , Forecasting , Humans , Male , North Carolina , Regression Analysis , Retrospective Studies , South Carolina , United States
7.
Sci Rep ; 8(1): 116, 2018 Jan 08.
Article in English | MEDLINE | ID: mdl-29311675

ABSTRACT

Baseball players must be able to see and react in an instant, yet it is hotly debated whether superior performance is associated with superior sensorimotor abilities. In this study, we compare sensorimotor abilities, measured through 8 psychomotor tasks comprising the Nike Sensory Station assessment battery, and game statistics in a sample of 252 professional baseball players to evaluate the links between sensorimotor skills and on-field performance. For this purpose, we develop a series of Bayesian hierarchical latent variable models enabling us to compare statistics across professional baseball leagues. Within this framework, we find that sensorimotor abilities are significant predictors of on-base percentage, walk rate and strikeout rate, accounting for age, position, and league. We find no such relationship for either slugging percentage or fielder-independent pitching. The pattern of results suggests performance contributions from both visual-sensory and visual-motor abilities and indicates that sensorimotor screenings may be useful for player scouting.


Subject(s)
Athletic Performance , Baseball , Psychomotor Performance , Adolescent , Adult , Algorithms , Humans , Models, Theoretical , Young Adult
8.
J Sports Sci ; 36(2): 171-179, 2018 Jan.
Article in English | MEDLINE | ID: mdl-28282749

ABSTRACT

This study aimed to evaluate the possibility that differences in sensorimotor abilities exist between hitters and pitchers in a large cohort of baseball players of varying levels of experience. Secondary data analysis was performed on 9 sensorimotor tasks comprising the Nike Sensory Station assessment battery. Bayesian hierarchical regression modelling was applied to test for differences between pitchers and hitters in data from 566 baseball players (112 high school, 85 college, 369 professional) collected at 20 testing centres. Explanatory variables including height, handedness, eye dominance, concussion history, and player position were modelled along with age curves using basis regression splines. Regression analyses revealed better performance for hitters relative to pitchers at the professional level in the visual clarity and depth perception tasks, but these differences did not exist at the high school or college levels. No significant differences were observed in the other 7 measures of sensorimotor capabilities included in the test battery, and no systematic biases were found between the testing centres. These findings, indicating that professional-level hitters have better visual acuity and depth perception than professional-level pitchers, affirm the notion that highly experienced athletes have differing perceptual skills. Findings are discussed in relation to deliberate practice theory.


Subject(s)
Athletic Performance/physiology , Baseball/physiology , Depth Perception/physiology , Visual Acuity/physiology , Adolescent , Adult , Age Factors , Bayes Theorem , Humans , Male , Motor Skills/physiology , Sensorimotor Cortex/physiology , Task Performance and Analysis , Young Adult
9.
Stat Methods Med Res ; 25(1): 188-204, 2016 Feb.
Article in English | MEDLINE | ID: mdl-22687877

ABSTRACT

In many observational studies, analysts estimate treatment effects using propensity scores, e.g. by matching or sub-classifying on the scores. When some values of the covariates are missing, analysts can use multiple imputation to fill in the missing data, estimate propensity scores based on the m completed datasets, and use the propensity scores to estimate treatment effects. We compare two approaches to implement this process. In the first, the analyst estimates the treatment effect using propensity score matching within each completed data set, and averages the m treatment effect estimates. In the second approach, the analyst averages the m propensity scores for each record across the completed datasets, and performs propensity score matching with these averaged scores to estimate the treatment effect. We compare properties of both methods via simulation studies using artificial and real data. The simulations suggest that the second method has greater potential to produce substantial bias reductions than the first, particularly when the missing values are predictive of treatment assignment.


Subject(s)
Models, Statistical , Propensity Score , Bias , Biostatistics , Breast Feeding/statistics & numerical data , Child , Child Development , Child, Preschool , Computer Simulation , Humans , Infant , Infant, Newborn , Observational Studies as Topic/statistics & numerical data , Treatment Outcome
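
The two strategies compared in the abstract of entry 9 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the one-to-one nearest-neighbour matching with replacement, and the input layout (one list of propensity scores per completed dataset) are hypothetical choices made here, since the abstract does not say which matching algorithm the authors used.

```python
from statistics import mean


def match_effect(scores, treated, outcomes):
    """Estimate the effect on the treated by matching each treated record
    to the control record with the closest propensity score (with replacement)."""
    controls = [i for i, t in enumerate(treated) if not t]
    diffs = []
    for i, t in enumerate(treated):
        if t:
            # Nearest control on the propensity score (assumed matching rule).
            j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
            diffs.append(outcomes[i] - outcomes[j])
    return mean(diffs)


def within_estimate(score_sets, treated, outcomes):
    """First approach: match within each completed dataset,
    then average the m treatment effect estimates."""
    return mean(match_effect(s, treated, outcomes) for s in score_sets)


def across_estimate(score_sets, treated, outcomes):
    """Second approach: average the m propensity scores per record,
    then match once on the averaged scores."""
    averaged = [mean(per_record) for per_record in zip(*score_sets)]
    return match_effect(averaged, treated, outcomes)
```

Both functions take the same inputs, so the two estimates can be compared directly on the same set of imputations; in practice the scores would come from a propensity model fitted to each completed dataset.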
10.
Multivariate Behav Res ; 50(4): 383-97, 2015.
Article in English | MEDLINE | ID: mdl-26257437

ABSTRACT

Complex research questions often cannot be addressed adequately with a single data set. One sensible alternative to the high cost and effort associated with the creation of large new data sets is to combine existing data sets containing variables related to the constructs of interest. The goal of the present research was to develop a flexible, broadly applicable approach to the integration of disparate data sets that is based on nonparametric multiple imputation and the collection of data from a convenient, de novo calibration sample. We demonstrate proof of concept for the approach by integrating three existing data sets containing items related to the extent of problematic alcohol use and associations with deviant peers. We discuss both necessary conditions for the approach to work well and potential strengths and weaknesses of the method compared to other data set integration approaches.


Subject(s)
Behavioral Research/methods , Retrospective Studies , Statistics, Nonparametric , Adolescent , Adult , Child , Humans , Psychometrics/methods , Reproducibility of Results , Young Adult
11.
Stat Med ; 34(26): 3399-414, 2015 Nov 20.
Article in English | MEDLINE | ID: mdl-26095855

ABSTRACT

There are many advantages to individual participant data meta-analysis for combining data from multiple studies. These advantages include greater power to detect effects, increased sample heterogeneity, and the ability to perform more sophisticated analyses than meta-analyses that rely on published results. However, a fundamental challenge is that it is unlikely that variables of interest are measured the same way in all of the studies to be combined. We propose that this situation can be viewed as a missing data problem in which some outcomes are entirely missing within some trials and use multiple imputation to fill in missing measurements. We apply our method to five longitudinal adolescent depression trials where four studies used one depression measure and the fifth study used a different depression measure. None of the five studies contained both depression measures. We describe a multiple imputation approach for filling in missing depression measures that makes use of external calibration studies in which both depression measures were used. We discuss some practical issues in developing the imputation model including taking into account treatment group and study. We present diagnostics for checking the fit of the imputation model and investigate whether external information is appropriately incorporated into the imputed values.


Subject(s)
Antidepressive Agents, Second-Generation/therapeutic use , Depression/drug therapy , Fluoxetine/therapeutic use , Meta-Analysis as Topic , Models, Statistical , Adolescent , Calibration , Child , Female , Humans , Longitudinal Studies , Male , Psychology, Adolescent , Randomized Controlled Trials as Topic , Research Design , Treatment Outcome
12.
J Chem Educ ; 91(2): 165-172, 2014 Feb 11.
Article in English | MEDLINE | ID: mdl-24803686

ABSTRACT

We developed the Alcohol Pharmacology Education Partnership (APEP), a set of modules designed to integrate a topic of interest (alcohol) with concepts in chemistry and biology for high school students. Chemistry and biology teachers (n = 156) were recruited nationally to field-test APEP in a controlled study. Teachers obtained professional development either at a conference-based workshop (NSTA or NCSTA) or via distance learning to learn how to incorporate the APEP modules into their teaching. They field-tested the modules in their classes during the following year. Teacher knowledge of chemistry and biology concepts increased significantly following professional development, and was maintained for at least a year. Their students (n = 14,014) demonstrated significantly higher scores when assessed for knowledge of both basic and advanced chemistry and biology concepts compared to students not using APEP modules in their classes the previous year. Higher scores were achieved as the number of modules used increased. These findings are consistent with our previous studies, demonstrating higher scores in chemistry and biology after students use modules that integrate topics interesting to them, such as drugs (the Pharmacology Education Partnership).

13.
Bayesian Anal ; 8(2), 2013 Jun 01.
Article in English | MEDLINE | ID: mdl-24358073

ABSTRACT

Multinomial outcomes with many levels can be challenging to model. Information typically accrues slowly with increasing sample size, yet the parameter space expands rapidly with additional covariates. Shrinking all regression parameters towards zero, as often done in models of continuous or binary response variables, is unsatisfactory, since setting parameters equal to zero in multinomial models does not necessarily imply "no effect." We propose an approach to modeling multinomial outcomes with many levels based on a Bayesian multinomial probit (MNP) model and a multiple shrinkage prior distribution for the regression parameters. The prior distribution encourages the MNP regression parameters to shrink toward a number of learned locations, thereby substantially reducing the dimension of the parameter space. Using simulated data, we compare the predictive performance of this model against two other recently-proposed methods for big multinomial models. The results suggest that the fully Bayesian, multiple shrinkage approach can outperform these other methods. We apply the multiple shrinkage MNP to simulating replacement values for areal identifiers, e.g., census tract indicators, in order to protect data confidentiality in public use datasets.

14.
Ethn Dis ; 22(1): 85-9, 2012.
Article in English | MEDLINE | ID: mdl-22774314

ABSTRACT

OBJECTIVES: Black women have increased risk of preterm birth compared to white women, and overall black women are in poorer health than white women. Recent recommendations to reduce preterm birth have focused on preconception health care. We explore the associations between indicators of maternal prepregnancy health with preterm birth among a sample of black women. DESIGN: The current study was prospective. SETTING: Enrollment occurred in prenatal clinics in Baltimore. PARTICIPANTS: Women (N=922) aged ≥18 were enrolled in the study. Data on maternal health, behaviors, and pregnancy outcome were abstracted from clinical records. MAIN OUTCOME MEASURE: Logistic regression was used to evaluate associations between behavioral and health status variables with preterm birth. RESULTS: In bivariate analysis, alcohol use, drug use and chronic diseases were associated with preterm birth. In the logistic regression analysis, drug use and chronic diseases were associated with preterm birth. CONCLUSIONS: These results demonstrate an association between maternal health and behaviors prior to pregnancy with preterm birth among black women. Providing access to health care prior to pregnancy to address behavioral and health risks may improve pregnancy outcomes among low-income black women.


Subject(s)
Black or African American , Health Status Indicators , Maternal Behavior , Premature Birth , Adolescent , Adult , Baltimore/epidemiology , Chronic Disease/epidemiology , Chronic Disease/ethnology , Female , Health Behavior , Humans , Logistic Models , Poverty , Pregnancy , Prenatal Care , Risk Factors , Risk-Taking , Substance-Related Disorders/complications , Substance-Related Disorders/epidemiology , Substance-Related Disorders/ethnology
15.
Stat Med ; 31(10): 949-62, 2012 May 10.
Article in English | MEDLINE | ID: mdl-22362635

ABSTRACT

Within causal inference, principal stratification (PS) is a popular approach for dealing with intermediate variables, that is, variables affected by treatment that also potentially affect the response. However, when there exists unmeasured confounding in the treatment arms, as can happen in observational studies, causal estimands resulting from PS analyses can be biased. We identify the various pathways of confounding present in PS contexts and their effects for PS inference. We present model-based approaches for assessing the sensitivity of complier average causal effect estimates to unmeasured confounding in the setting of binary treatments, binary intermediate variables, and binary outcomes. These same approaches can be used to assess sensitivity to unknown direct effects of treatments on outcomes because, as we show, direct effects are operationally equivalent to one of the pathways of unmeasured confounding. We illustrate the methodology using a randomized study with artificially introduced confounding and a sensitivity analysis for an observational study of the effects of physical activity and body mass index on cardiovascular disease.


Subject(s)
Models, Statistical , Population Dynamics , Randomized Controlled Trials as Topic/methods , Cohort Studies , Humans , Surveys and Questionnaires , Treatment Outcome
16.
Biometrics ; 68(1): 92-100, 2012 Mar.
Article in English | MEDLINE | ID: mdl-21689080

ABSTRACT

We describe a Bayesian quantile regression model that uses a confirmatory factor structure for part of the design matrix. This model is appropriate when the covariates are indicators of scientifically determined latent factors, and it is these latent factors that analysts seek to include as predictors in the quantile regression. We apply the model to a study of birth weights in which the effects of latent variables representing psychosocial health and actual tobacco usage on the lower quantiles of the response distribution are of interest. The models can be fit using an R package called factorQR.


Subject(s)
Bayes Theorem , Fetal Growth Retardation/epidemiology , Infant, Very Low Birth Weight , Maternal Exposure/statistics & numerical data , Proportional Hazards Models , Regression Analysis , Tobacco Smoke Pollution/statistics & numerical data , Birth Weight , Causality , Female , Humans , Infant, Low Birth Weight , Infant, Newborn , Prevalence
17.
J Am Stat Assoc ; 107(500): 1385-1394, 2012 Dec 01.
Article in English | MEDLINE | ID: mdl-25214699

ABSTRACT

Statistical agencies and other organizations that disseminate data are obligated to protect data subjects' confidentiality. For example, ill-intentioned individuals might link data subjects to records in other databases by matching on common characteristics (keys). Successful links are particularly problematic for data subjects with combinations of keys that are unique in the population. Hence, as part of their assessments of disclosure risks, many data stewards estimate the probabilities that sample uniques on sets of discrete keys are also population uniques on those keys. This is typically done using log-linear modeling on the keys. However, log-linear models can yield biased estimates of cell probabilities for sparse contingency tables with many zero counts, which often occurs in databases with many keys. This bias can result in unreliable estimates of probabilities of uniqueness and, hence, misrepresentations of disclosure risks. We propose an alternative to log-linear models for datasets with sparse keys based on a Bayesian version of grade of membership (GoM) models. We present a Bayesian GoM model for multinomial variables and offer an MCMC algorithm for fitting the model. We evaluate the approach by treating data from a recent US Census Bureau public use microdata sample as a population, taking simple random samples from that population, and benchmarking estimated probabilities of uniqueness against population values. Compared to log-linear models, GoM models provide more accurate estimates of the total number of uniques in the samples. Additionally, they offer record-level predictions of uniqueness that dominate those based on log-linear models.
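
The benchmark quantity described in the abstract of entry 17 (how many sample uniques on a set of keys are also population uniques) can be computed directly when the population is known, as in the authors' evaluation design. The Python sketch below shows only that counting step; it is not the Bayesian GoM model or the log-linear model, and the function names and tuple-based record layout are assumptions made for illustration.

```python
from collections import Counter


def uniques(records):
    """Return the set of key combinations that occur exactly once."""
    counts = Counter(records)
    return {key for key, n in counts.items() if n == 1}


def share_population_unique(sample, population):
    """Fraction of sample-unique key combinations that are also unique
    in the population (0.0 if the sample has no uniques)."""
    sample_uniques = uniques(sample)
    if not sample_uniques:
        return 0.0
    population_uniques = uniques(population)
    return len(sample_uniques & population_uniques) / len(sample_uniques)
```

Each record is assumed to be a hashable tuple of discrete key values, for example `("F", 30, "27701")`. A model-based risk estimate would replace the known population counts with estimated cell probabilities.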

18.
Ann Appl Stat ; 6(1): 229-252, 2012 Mar 01.
Article in English | MEDLINE | ID: mdl-23990852

ABSTRACT

When releasing data to the public, data stewards are ethically and often legally obligated to protect the confidentiality of data subjects' identities and sensitive attributes. They also strive to release data that are informative for a wide range of secondary analyses. Achieving both objectives is particularly challenging when data stewards seek to release highly resolved geographical information. We present an approach for protecting the confidentiality of data with geographic identifiers based on multiple imputation. The basic idea is to convert geography to latitude and longitude, estimate a bivariate response model conditional on attributes, and simulate new latitude and longitude values from these models. We illustrate the proposed methods using data describing causes of death in Durham, North Carolina. In the context of the application, we present a straightforward tool for generating simulated geographies and attributes based on regression trees, and we present methods for assessing disclosure risks with such simulated data.

19.
Epidemiology ; 22(6): 859-66, 2011 Nov.
Article in English | MEDLINE | ID: mdl-21968775

ABSTRACT

Covariates may affect continuous responses differently at various points of the response distribution. For example, some exposure might have minimal impact on conditional means, whereas it might lower conditional 10th percentiles sharply. Such differential effects can be important to detect. In studies of the determinants of birth weight, for instance, it is critical to identify exposures like the one above, since low birth weight is a risk factor for later health problems. Effects of covariates on the tails of distributions can be obscured by models (such as linear regression) that estimate conditional means; however, effects on tails can be detected by quantile regression. We present 2 approaches for exploring high-dimensional predictor spaces to identify important predictors for quantile regression. These are based on the lasso and elastic net penalties. We apply the approaches to a prospective cohort study of adverse birth outcomes that includes a wide array of demographic, medical, psychosocial, and environmental variables. Although tobacco exposure is known to be associated with lower birth weights, the analysis suggests an interesting interaction effect not previously reported: tobacco exposure depresses the 20th and 30th percentiles of birth weight more strongly when mothers have high levels of lead in their blood compared with those who have low blood lead levels.


Subject(s)
Pregnancy Outcome/epidemiology , Regression Analysis , Causality , Data Interpretation, Statistical , Female , Humans , Infant, Low Birth Weight , Infant, Newborn , Linear Models , Pregnancy , Premature Birth/epidemiology , Prenatal Exposure Delayed Effects/epidemiology
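
The abstract of entry 19 contrasts models for conditional means with quantile regression. The Python sketch below illustrates the standard check (pinball) loss that quantile regression minimises, using the simplest possible case of fitting a single constant. It does not implement the lasso or elastic net penalties that the paper studies, and the function names are hypothetical.

```python
def pinball_loss(residual, tau):
    """Check loss for quantile level tau: positive residuals are weighted
    by tau, negative residuals by (1 - tau)."""
    return residual * (tau - (residual < 0))


def best_constant(values, tau):
    """Return the observed value that minimises the total check loss,
    i.e. a sample tau-quantile of the data."""
    return min(values, key=lambda c: sum(pinball_loss(y - c, tau) for y in values))
```

With `tau = 0.5` the minimiser is a sample median, so a single extreme observation moves it far less than it moves the mean; lowering `tau` shifts the fit toward the lower tail, which is the part of the birth weight distribution the abstract is concerned with.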