Results 1 - 20 of 37
1.
Proc Natl Acad Sci U S A ; 120(43): e2220558120, 2023 Oct 24.
Article in English | MEDLINE | ID: mdl-37831744

ABSTRACT

The use of formal privacy to protect the confidentiality of responses in the 2020 Decennial Census of Population and Housing has triggered renewed interest and debate over how to measure the disclosure risks and societal benefits of the published data products. We argue that any proposal for quantifying disclosure risk should be based on prespecified, objective criteria. We illustrate this approach to evaluate the absolute disclosure risk framework, the counterfactual framework underlying differential privacy, and prior-to-posterior comparisons. We conclude that satisfying all the desiderata is impossible, but counterfactual comparisons satisfy the most while absolute disclosure risk satisfies the fewest. Furthermore, we explain that many of the criticisms levied against differential privacy would be levied against any technology that is not equivalent to direct, unrestricted access to confidential data. More research is needed, but in the near term, the counterfactual approach appears best-suited for privacy versus utility analysis.


Subjects
Confidentiality, Disclosure, Privacy, Risk Assessment, Censuses
2.
Am J Epidemiol ; 2024 Aug 09.
Article in English | MEDLINE | ID: mdl-39123098

ABSTRACT

There is a profound need to identify modifiable risk factors to screen and prevent pancreatic cancer. Air pollution, including fine particulate matter (PM2.5), is increasingly recognized as a risk factor for cancer. We conducted a case-control study using data from the electronic health record (EHR) of Duke University Health System, 15-year residential history, NASA satellite fine particulate matter (PM2.5) and neighborhood socioeconomic data. Using deterministic and probabilistic linkage algorithms, we linked residential history and EHR data to quantify long-term PM2.5 exposure. Logistic regression models quantified the association between a one interquartile range (IQR) increase in PM2.5 concentration and pancreatic cancer risk. The study included 203 cases and 5027 controls (median age of 59 years, 62% female, 26% Black). Individuals with pancreatic cancer had higher average annual exposure (9.4 µg/m³) as compared to controls. An IQR increase in average annual PM2.5 was associated with greater odds of pancreatic cancer (OR=1.20; 95% CI: 1.00-1.44). These findings highlight the link between elevated PM2.5 exposure and increased pancreatic cancer risk. They may inform screening strategies for high-risk populations and guide air pollution policies to mitigate exposure.
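
The reported odds ratio and interval are easiest to read on the log-odds scale, where a Wald interval is symmetric around the estimate. As a rough sketch, assuming the 95% CI is a Wald interval (the abstract does not say how it was computed), the published figures imply the following log-odds estimate and standard error. The function name is illustrative only.

```python
import math


def log_odds_from_ci(odds_ratio, ci_low, ci_high, z=1.96):
    """Recover the log-odds estimate and its implied standard error.

    Assumes a Wald-type interval exp(beta +/- z * se). This is an
    assumption for illustration, not something stated in the abstract.
    """
    beta = math.log(odds_ratio)
    # The interval spans 2 * z standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


# Figures reported in the abstract: OR = 1.20, 95% CI 1.00-1.44.
beta, se = log_odds_from_ci(1.20, 1.00, 1.44)
print(round(beta, 3), round(se, 3))
```

Under that assumption the estimate sits about 1.96 standard errors above zero, which matches a lower confidence limit of 1.00.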

3.
Epidemiology ; 2024 Aug 14.
Article in English | MEDLINE | ID: mdl-39140488

ABSTRACT

Surveys are commonly used to facilitate research in epidemiology, health, and the social and behavioral sciences. Often, these surveys are not simple random samples, and respondents are given weights reflecting their probability of selection into the survey. We show that using survey weights can be beneficial for evaluating the quality of predictive models when splitting data into training and test sets. In particular, we characterize model assessment statistics, such as sensitivity and specificity, as finite population quantities and compute survey-weighted estimates of these quantities with test data comprising a random subset of the original data. Using simulations with data from the National Survey on Drug Use and Health and the National Comorbidity Survey, we show that unweighted metrics estimated with sample test data can misrepresent population performance, but weighted metrics appropriately adjust for the complex sampling design. We also show that this conclusion holds for models trained using upsampling for mitigating class imbalance. The results suggest that weighted metrics should be used when evaluating performance on test data derived from complex surveys.
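
The contrast between weighted and unweighted test metrics can be sketched in a few lines. This minimal example assumes each test record carries a survey weight (taken here to be the number of population units the respondent represents) and that labels and predictions are coded 0/1. The data are made up for illustration and the function name is hypothetical.

```python
def weighted_sens_spec(y_true, y_pred, weights):
    """Survey-weighted sensitivity and specificity for binary labels.

    Each record contributes its weight, not a count of 1, to the
    confusion-matrix cell it falls in.
    """
    tp = fn = tn = fp = 0.0
    for y, p, w in zip(y_true, y_pred, weights):
        if y == 1 and p == 1:
            tp += w
        elif y == 1 and p == 0:
            fn += w
        elif y == 0 and p == 0:
            tn += w
        else:
            fp += w
    return tp / (tp + fn), tn / (tn + fp)


# Made-up test set: five respondents with unequal weights.
y_true = [1, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0]
weights = [10, 30, 20, 20, 60]

weighted = weighted_sens_spec(y_true, y_pred, weights)
unweighted = weighted_sens_spec(y_true, y_pred, [1] * len(y_true))
print(weighted, unweighted)
```

In this toy data the missed case carries three times the weight of the detected one, so the weighted sensitivity is lower than the unweighted value.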

4.
Article in English | MEDLINE | ID: mdl-39076728

ABSTRACT

We consider causal inference for observational studies with data spread over two files. One file includes the treatment, outcome, and some covariates measured on a set of individuals, and the other file includes additional causally-relevant covariates measured on a partially overlapping set of individuals. By linking records in the two databases, the analyst can control for more covariates, thereby reducing the risk of bias compared to using only one file alone. When analysts do not have access to a unique identifier that enables perfect, error-free linkages, they typically rely on probabilistic record linkage to construct a single linked data set, and estimate causal effects using these linked data. This typical practice does not propagate uncertainty from imperfect linkages to the causal inferences. Further, it does not take advantage of relationships among the variables to improve the linkage quality. We address these shortcomings by fusing regression-assisted, Bayesian probabilistic record linkage with causal inference. The Markov chain Monte Carlo sampler generates multiple plausible linked data files as byproducts that analysts can use for multiple imputation inferences. Here, we show results for two causal estimators based on propensity score overlap weights. Using simulations and data from the Italy Survey on Household Income and Wealth, we show that our approach can improve the accuracy of estimated treatment effects.
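
For readers unfamiliar with overlap weights, a minimal sketch follows. It uses the standard definition (treated units weighted by one minus the propensity score, controls by the propensity score) and a weighted difference in mean outcomes. It takes the propensity scores as given and skips the record linkage and multiple imputation steps entirely, so it is not the estimators evaluated in the paper. The data and function name are hypothetical.

```python
def overlap_weighted_effect(treated, outcome, propensity):
    """Weighted difference in mean outcomes using overlap weights.

    treated:    1 for treated units, 0 for controls
    outcome:    observed outcome for each unit
    propensity: estimated probability of treatment for each unit
    """
    num_t = den_t = num_c = den_c = 0.0
    for z, y, e in zip(treated, outcome, propensity):
        if z == 1:
            w = 1.0 - e  # treated units: weight 1 - e(x)
            num_t += w * y
            den_t += w
        else:
            w = e  # control units: weight e(x)
            num_c += w * y
            den_c += w
    return num_t / den_t - num_c / den_c


# Made-up data: two treated units, two controls.
effect = overlap_weighted_effect(
    treated=[1, 1, 0, 0],
    outcome=[5.0, 3.0, 2.0, 1.0],
    propensity=[0.8, 0.5, 0.5, 0.2],
)
print(round(effect, 4))
```

Units with propensity scores near 0 or 1 receive little weight, so the estimate emphasizes the region where treated and control units overlap.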

6.
Stat Med ; 37(24): 3533-3546, 2018 Oct 30.
Article in English | MEDLINE | ID: mdl-30069901

ABSTRACT

We develop methodology for causal inference in observational studies when using propensity score subclassification on data constructed with probabilistic record linkage techniques. We focus on scenarios where covariates and binary treatment assignments are in one file and outcomes are in another file, and the goal is to estimate an additive treatment effect by merging the files. We assume that the files can be linked using variables common to both files, eg, names or birth dates, but that links are subject to errors, eg, due to reporting errors in the linking variables. We develop methodology for cases where such reporting errors are independent of the other variables on the files. We describe conceptually how linkage errors can affect causal estimates in subclassification contexts. We also present and evaluate several algorithms for deciding which record pairs to use in estimation of causal effects. Using simulation studies, we demonstrate that case selection procedures can result in improved accuracy in estimates of treatment effects from linked data compared to using only cases known to be true links.


Subjects
Medical Record Linkage, Propensity Score, Algorithms, Biostatistics, Causality, Computer Simulation, Statistical Data Interpretation, Humans, Observational Studies as Topic/statistics & numerical data
7.
J Sports Sci ; 36(2): 171-179, 2018 Jan.
Article in English | MEDLINE | ID: mdl-28282749

ABSTRACT

This study aimed to evaluate the possibility that differences in sensorimotor abilities exist between hitters and pitchers in a large cohort of baseball players of varying levels of experience. Secondary data analysis was performed on 9 sensorimotor tasks comprising the Nike Sensory Station assessment battery. Bayesian hierarchical regression modelling was applied to test for differences between pitchers and hitters in data from 566 baseball players (112 high school, 85 college, 369 professional) collected at 20 testing centres. Explanatory variables including height, handedness, eye dominance, concussion history, and player position were modelled along with age curves using basis regression splines. Regression analyses revealed better performance for hitters relative to pitchers at the professional level in the visual clarity and depth perception tasks, but these differences did not exist at the high school or college levels. No significant differences were observed in the other 7 measures of sensorimotor capabilities included in the test battery, and no systematic biases were found between the testing centres. These findings, indicating that professional-level hitters have better visual acuity and depth perception than professional-level pitchers, affirm the notion that highly experienced athletes have differing perceptual skills. Findings are discussed in relation to deliberate practice theory.


Subjects
Athletic Performance/physiology, Baseball/physiology, Depth Perception/physiology, Visual Acuity/physiology, Adolescent, Adult, Age Factors, Bayes Theorem, Humans, Male, Motor Skills/physiology, Sensorimotor Cortex/physiology, Task Performance and Analysis, Young Adult
8.
Stat Med ; 34(26): 3399-414, 2015 Nov 20.
Article in English | MEDLINE | ID: mdl-26095855

ABSTRACT

There are many advantages to individual participant data meta-analysis for combining data from multiple studies. These advantages include greater power to detect effects, increased sample heterogeneity, and the ability to perform more sophisticated analyses than meta-analyses that rely on published results. However, a fundamental challenge is that it is unlikely that variables of interest are measured the same way in all of the studies to be combined. We propose that this situation can be viewed as a missing data problem in which some outcomes are entirely missing within some trials and use multiple imputation to fill in missing measurements. We apply our method to five longitudinal adolescent depression trials where four studies used one depression measure and the fifth study used a different depression measure. None of the five studies contained both depression measures. We describe a multiple imputation approach for filling in missing depression measures that makes use of external calibration studies in which both depression measures were used. We discuss some practical issues in developing the imputation model including taking into account treatment group and study. We present diagnostics for checking the fit of the imputation model and investigate whether external information is appropriately incorporated into the imputed values.


Subjects
Second-Generation Antidepressive Agents/therapeutic use, Depression/drug therapy, Fluoxetine/therapeutic use, Meta-Analysis as Topic, Statistical Models, Adolescent, Calibration, Child, Female, Humans, Longitudinal Studies, Male, Adolescent Psychology, Randomized Controlled Trials as Topic, Research Design, Treatment Outcome
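
Once missing measurements have been imputed several times, each completed data set is usually analyzed separately and the results are pooled. The abstract above does not say which pooling rules were used. The sketch below shows the commonly used Rubin's rules for a scalar estimate, with made-up inputs and an illustrative function name.

```python
from statistics import mean, variance


def rubin_combine(estimates, variances):
    """Pool m completed-data estimates with Rubin's rules.

    estimates: point estimate from each completed data set
    variances: squared standard error from each completed data set
    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    q_bar = mean(estimates)  # pooled point estimate
    w_bar = mean(variances)  # average within-imputation variance
    b = variance(estimates)  # between-imputation variance (divisor m - 1)
    total = w_bar + (1 + 1 / m) * b
    return q_bar, total


# Made-up results from three imputed data sets.
pooled, total_var = rubin_combine([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, round(total_var, 4))
```

The between-imputation term is what carries the uncertainty due to the missing measurements into the final standard error.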
9.
Multivariate Behav Res ; 50(4): 383-97, 2015.
Article in English | MEDLINE | ID: mdl-26257437

ABSTRACT

Complex research questions often cannot be addressed adequately with a single data set. One sensible alternative to the high cost and effort associated with the creation of large new data sets is to combine existing data sets containing variables related to the constructs of interest. The goal of the present research was to develop a flexible, broadly applicable approach to the integration of disparate data sets that is based on nonparametric multiple imputation and the collection of data from a convenient, de novo calibration sample. We demonstrate proof of concept for the approach by integrating three existing data sets containing items related to the extent of problematic alcohol use and associations with deviant peers. We discuss both necessary conditions for the approach to work well and potential strengths and weaknesses of the method compared to other data set integration approaches.


Subjects
Behavioral Research/methods, Retrospective Studies, Nonparametric Statistics, Adolescent, Adult, Child, Humans, Psychometrics/methods, Reproducibility of Results, Young Adult
10.
J Chem Educ ; 91(2): 165-172, 2014 Feb 11.
Article in English | MEDLINE | ID: mdl-24803686

ABSTRACT

We developed the Alcohol Pharmacology Education Partnership (APEP), a set of modules designed to integrate a topic of interest (alcohol) with concepts in chemistry and biology for high school students. Chemistry and biology teachers (n = 156) were recruited nationally to field-test APEP in a controlled study. Teachers obtained professional development either at a conference-based workshop (NSTA or NCSTA) or via distance learning to learn how to incorporate the APEP modules into their teaching. They field-tested the modules in their classes during the following year. Teacher knowledge of chemistry and biology concepts increased significantly following professional development, and was maintained for at least a year. Their students (n = 14 014) demonstrated significantly higher scores when assessed for knowledge of both basic and advanced chemistry and biology concepts compared to students not using APEP modules in their classes the previous year. Higher scores were achieved as the number of modules used increased. These findings are consistent with our previous studies, demonstrating higher scores in chemistry and biology after students use modules that integrate topics interesting to them, such as drugs (the Pharmacology Education Partnership).

11.
Biometrics ; 68(1): 92-100, 2012 Mar.
Article in English | MEDLINE | ID: mdl-21689080

ABSTRACT

We describe a Bayesian quantile regression model that uses a confirmatory factor structure for part of the design matrix. This model is appropriate when the covariates are indicators of scientifically determined latent factors, and it is these latent factors that analysts seek to include as predictors in the quantile regression. We apply the model to a study of birth weights in which the effects of latent variables representing psychosocial health and actual tobacco usage on the lower quantiles of the response distribution are of interest. The models can be fit using an R package called factorQR.


Subjects
Bayes Theorem, Fetal Growth Retardation/epidemiology, Very Low Birth Weight Infant, Maternal Exposure/statistics & numerical data, Proportional Hazards Models, Regression Analysis, Tobacco Smoke Pollution/statistics & numerical data, Birth Weight, Causality, Female, Humans, Low Birth Weight Infant, Newborn Infant, Prevalence
12.
Stat Med ; 31(10): 949-62, 2012 May 10.
Article in English | MEDLINE | ID: mdl-22362635

ABSTRACT

Within causal inference, principal stratification (PS) is a popular approach for dealing with intermediate variables, that is, variables affected by treatment that also potentially affect the response. However, when there exists unmeasured confounding in the treatment arms--as can happen in observational studies--causal estimands resulting from PS analyses can be biased. We identify the various pathways of confounding present in PS contexts and their effects for PS inference. We present model-based approaches for assessing the sensitivity of complier average causal effect estimates to unmeasured confounding in the setting of binary treatments, binary intermediate variables, and binary outcomes. These same approaches can be used to assess sensitivity to unknown direct effects of treatments on outcomes because, as we show, direct effects are operationally equivalent to one of the pathways of unmeasured confounding. We illustrate the methodology using a randomized study with artificially introduced confounding and a sensitivity analysis for an observational study of the effects of physical activity and body mass index on cardiovascular disease.


Subjects
Statistical Models, Population Dynamics, Randomized Controlled Trials as Topic/methods, Cohort Studies, Humans, Surveys and Questionnaires, Treatment Outcome
13.
Ethn Dis ; 22(1): 85-9, 2012.
Article in English | MEDLINE | ID: mdl-22774314

ABSTRACT

OBJECTIVES: Black women have increased risk of preterm birth compared to white women, and overall black women are in poorer health than white women. Recent recommendations to reduce preterm birth have focused on preconception health care. We explore the associations between indicators of maternal prepregnancy health and preterm birth among a sample of black women. DESIGN: The current study was prospective. SETTING: Enrollment occurred in prenatal clinics in Baltimore. PARTICIPANTS: Women (N=922) aged ≥18 were enrolled in the study. Data on maternal health, behaviors, and pregnancy outcome were abstracted from clinical records. MAIN OUTCOME MEASURE: Logistic regression was used to evaluate associations between behavioral and health status variables and preterm birth. RESULTS: In bivariate analysis, alcohol use, drug use and chronic diseases were associated with preterm birth. In the logistic regression analysis, drug use and chronic diseases were associated with preterm birth. CONCLUSIONS: These results demonstrate an association between maternal health and behaviors prior to pregnancy and preterm birth among black women. Providing access to health care prior to pregnancy to address behavioral and health risks may improve pregnancy outcomes among low-income black women.


Subjects
Black or African American, Health Status Indicators, Maternal Behavior, Premature Birth, Adolescent, Adult, Baltimore/epidemiology, Chronic Disease/epidemiology, Chronic Disease/ethnology, Female, Health Behavior, Humans, Logistic Models, Poverty, Pregnancy, Prenatal Care, Risk Factors, Risk-Taking, Substance-Related Disorders/complications, Substance-Related Disorders/epidemiology, Substance-Related Disorders/ethnology
14.
Epidemiology ; 22(6): 859-66, 2011 Nov.
Article in English | MEDLINE | ID: mdl-21968775

ABSTRACT

Covariates may affect continuous responses differently at various points of the response distribution. For example, some exposure might have minimal impact on conditional means, whereas it might lower conditional 10th percentiles sharply. Such differential effects can be important to detect. In studies of the determinants of birth weight, for instance, it is critical to identify exposures like the one above, since low birth weight is a risk factor for later health problems. Effects of covariates on the tails of distributions can be obscured by models (such as linear regression) that estimate conditional means; however, effects on tails can be detected by quantile regression. We present 2 approaches for exploring high-dimensional predictor spaces to identify important predictors for quantile regression. These are based on the lasso and elastic net penalties. We apply the approaches to a prospective cohort study of adverse birth outcomes that includes a wide array of demographic, medical, psychosocial, and environmental variables. Although tobacco exposure is known to be associated with lower birth weights, the analysis suggests an interesting interaction effect not previously reported: tobacco exposure depresses the 20th and 30th percentiles of birth weight more strongly when mothers have high levels of lead in their blood compared with those who have low blood lead levels.


Subjects
Pregnancy Outcome/epidemiology, Regression Analysis, Causality, Statistical Data Interpretation, Female, Humans, Low Birth Weight Infant, Newborn Infant, Linear Models, Pregnancy, Premature Birth/epidemiology, Prenatal Exposure Delayed Effects/epidemiology
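
Quantile regression targets a conditional quantile by minimizing the check (pinball) loss in place of squared error. The sketch below shows only that loss and the fact that, with no covariates, its minimizer is a sample quantile. The lasso and elastic net penalties from the abstract above are not implemented. The birth weights are made-up values in grams and the function names are illustrative.

```python
def check_loss(residual, tau):
    """Check loss: tau * r if r >= 0, else (tau - 1) * r."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_check_loss(values, candidate, tau):
    """Sum of check losses when every value is predicted by one constant."""
    return sum(check_loss(v - candidate, tau) for v in values)


def best_constant(values, tau):
    """Search the observed values for the constant with the lowest loss."""
    return min(sorted(values), key=lambda c: total_check_loss(values, c, tau))


# Made-up birth weights in grams.
weights_g = [2500, 3100, 3400, 3600, 4200]
median_fit = best_constant(weights_g, tau=0.5)
lower_tail_fit = best_constant(weights_g, tau=0.1)
print(median_fit, lower_tail_fit)
```

Lowering tau penalizes overprediction more heavily than underprediction, which pulls the fitted value toward the lower tail of the distribution.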
15.
Stat Med ; 30(6): 627-41, 2011 Mar 15.
Article in English | MEDLINE | ID: mdl-21337358

ABSTRACT

In many observational studies, analysts estimate causal effects using propensity scores, e.g. by matching, sub-classifying, or inverse probability weighting based on the scores. Estimation of propensity scores is complicated when some values of the covariates are missing. Analysts can use multiple imputation to create completed data sets from which propensity scores can be estimated. We propose a general location mixture model for imputations that assumes that the control units are a latent mixture of (i) units whose covariates are drawn from the same distributions as the treated units' covariates and (ii) units whose covariates are drawn from different distributions. This formulation reduces the influence of control units outside the treated units' region of the covariate space on the estimation of parameters in the imputation model, which can result in more plausible imputations. In turn, this can result in more reliable estimates of propensity scores and better balance in the true covariate distributions when matching or sub-classifying. We illustrate the benefits of the latent class modeling approach with simulations and with an observational study of the effect of breast feeding on children's cognitive abilities.


Subjects
Statistical Data Interpretation, Statistical Models, Propensity Score, Adolescent, Breast Feeding/psychology, Cognition, Cohort Studies, Computer Simulation, Female, Humans, Longitudinal Studies, Male, Young Adult
16.
J Chem Educ ; 88(6): 744-750, 2011 Jun 01.
Article in English | MEDLINE | ID: mdl-24882881

ABSTRACT

Few studies demonstrate the impact of teaching chemistry embedded in a context that has relevance to high school students. We build upon our prior work showing that pharmacology topics (i.e., drugs), which are inherently interesting to high school students, provide a useful context for teaching chemistry and biology. In those studies, teachers were provided professional development for the Pharmacology Education Partnership (PEP) in an onsite venue (either five-day or one-day workshop). Given financial difficulties to travel, teachers have asked for alternatives for professional development. Thus, we developed the same PEP training workshop using a distance learning (DL) (two-way live video) approach. In this way, 121 chemistry and biology teachers participated in the DL workshops to learn how to incorporate the PEP modules into their teaching. They field-tested the modules over the year in high school chemistry and biology classes. Teacher knowledge of chemistry and biology increased significantly after the workshop and was maintained for at least a year. Their students (N = 2309) demonstrated a significant increase in knowledge of chemistry and biology concepts, with higher scores as the number of modules used increased. The increase in both teacher and student knowledge in these subjects was similar to that found previously when teachers were provided with onsite professional development.

17.
J R Stat Soc Ser A Stat Soc ; 184(2): 643-662, 2021 Apr.
Article in English | MEDLINE | ID: mdl-36254262

ABSTRACT

Often, government agencies and survey organizations know the population counts or percentages for some of the variables in a survey. These may be available from auxiliary sources, for example, administrative databases or other high quality surveys. We present and illustrate a model-based framework for leveraging such auxiliary marginal information when handling unit and item nonresponse. We show how one can use the margins to specify different missingness mechanisms for each type of nonresponse. We use the framework to impute missing values in voter turnout in a subset of data from the U.S. Current Population Survey (CPS). In doing so, we examine the sensitivity of results to different assumptions about the unit and item nonresponse.

18.
Am J Epidemiol ; 172(9): 1070-6, 2010 Nov 01.
Article in English | MEDLINE | ID: mdl-20841346

ABSTRACT

Multiple imputation is particularly well suited to deal with missing data in large epidemiologic studies, because typically these studies support a wide range of analyses by many data users. Some of these analyses may involve complex modeling, including interactions and nonlinear relations. Identifying such relations and encoding them in imputation models, for example, in the conditional regressions for multiple imputation via chained equations, can be daunting tasks with large numbers of categorical and continuous variables. The authors present a nonparametric approach for implementing multiple imputation via chained equations by using sequential regression trees as the conditional models. This has the potential to capture complex relations with minimal tuning by the data imputer. Using simulations, the authors demonstrate that the method can result in more plausible imputations, and hence more reliable inferences, in complex settings than the naive application of standard sequential regression imputation techniques. They apply the approach to impute missing values in data on adverse birth outcomes with more than 100 clinical and survey variables. They evaluate the imputations using posterior predictive checks with several epidemiologic analyses of interest.


Subjects
Computer Simulation, Epidemiologic Methods, Nonparametric Statistics, Algorithms, Data Collection, Statistical Data Interpretation, Epidemiologic Studies, Evidence-Based Medicine, Humans, Multivariate Analysis
19.
J Palliat Med ; 23(1): 90-96, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31424316

ABSTRACT

Background: Hospital referral regions (HRRs) are often used to characterize inpatient referral patterns, but it is unknown how well these geographic regions are aligned with variation in Medicare-financed hospice care, which is largely provided at home. Objective: Our objective was to characterize the variability in hospice use rates among elderly Medicare decedents by HRR and county. Methods: Using 2014 Master Beneficiary File for decedents 65 and older from North and South Carolina, we applied Bayesian mixed models to quantify variation in hospice use rates explained by HRR fixed effects, county random effects, and residual error among Medicare decedents. Results: We found HRRs and county indicators are significant predictors of hospice use in NC and SC; however, the relative variation within HRRs and associated residual variation is substantial. On average, HRR fixed effects explained more variation in hospice use rates than county indicators with a standard deviation (SD) of 10.0 versus 5.1 percentage points. The SD of the residual error is 5.7 percentage points. On average, variation within HRRs is about half the variation between regions (52%). Conclusions: The magnitude of unexplained residual variation in hospice use for NC and SC suggests that novel, end-of-life-specific service areas should be developed and tested to better capture geographic differences and inform research, health systems, and policy.


Subjects
Hospice Care, Terminal Care, Aged, Bayes Theorem, Humans, Medicare, Referral and Consultation, South Carolina, United States
20.
Sci Rep ; 8(1): 116, 2018 Jan 08.
Article in English | MEDLINE | ID: mdl-29311675

ABSTRACT

Baseball players must be able to see and react in an instant, yet it is hotly debated whether superior performance is associated with superior sensorimotor abilities. In this study, we compare sensorimotor abilities, measured through 8 psychomotor tasks comprising the Nike Sensory Station assessment battery, and game statistics in a sample of 252 professional baseball players to evaluate the links between sensorimotor skills and on-field performance. For this purpose, we develop a series of Bayesian hierarchical latent variable models enabling us to compare statistics across professional baseball leagues. Within this framework, we find that sensorimotor abilities are significant predictors of on-base percentage, walk rate and strikeout rate, accounting for age, position, and league. We find no such relationship for either slugging percentage or fielder-independent pitching. The pattern of results suggests performance contributions from both visual-sensory and visual-motor abilities and indicates that sensorimotor screenings may be useful for player scouting.


Subjects
Athletic Performance, Baseball, Psychomotor Performance, Adolescent, Adult, Algorithms, Humans, Theoretical Models, Young Adult