Results 1 - 18 of 18
1.
Biometrics ; 79(4): 3140-3152, 2023 12.
Article in English | MEDLINE | ID: mdl-36745745

ABSTRACT

We propose a doubly robust approach to characterizing treatment effect heterogeneity in observational studies. We develop a frequentist inferential procedure that utilizes posterior distributions for both the propensity score and outcome regression models to provide valid inference on the conditional average treatment effect even when high-dimensional or nonparametric models are used. We show that our approach leads to conservative inference in finite samples or under model misspecification and provides a consistent variance estimator when both models are correctly specified. In simulations, we illustrate the utility of these results in difficult settings such as high-dimensional covariate spaces or highly flexible models for the propensity score and outcome regression. Lastly, we analyze environmental exposure data from NHANES to identify how the effects of these exposures vary by subject-level characteristics.


Subjects
Models, Statistical , Treatment Effect Heterogeneity , Computer Simulation , Nutrition Surveys , Propensity Score
2.
Biostatistics ; 2022 Sep 08.
Article in English | MEDLINE | ID: mdl-36073640

ABSTRACT

Distributed lag models are useful in environmental epidemiology as they allow the user to investigate critical windows of exposure, defined as the time periods during which exposure to a pollutant adversely affects health outcomes. Recent studies have focused on estimating the health effects of a large number of environmental exposures, or an environmental mixture, on health outcomes. In such settings, it is important to understand which environmental exposures affect a particular outcome, while acknowledging the possibility that different exposures have different critical windows. Further, in studies of environmental mixtures, it is important to identify interactions among exposures and to account for the fact that this interaction may occur between two exposures having different critical windows. Exposure to one exposure early in time could cause an individual to be more or less susceptible to another exposure later in time. We propose a Bayesian model to estimate the temporal effects of a large number of exposures on an outcome. We use spike-and-slab priors and semiparametric distributed lag curves to identify important exposures and exposure interactions and discuss extensions with improved power to detect harmful exposures. We then apply these methods to estimate the effects of exposure to multiple air pollutants during pregnancy on birthweight from vital records in Colorado.

3.
Metabolites ; 12(6)2022 Jun 04.
Article in English | MEDLINE | ID: mdl-35736452

ABSTRACT

Emerging technologies now allow for mass spectrometry-based profiling of thousands of small molecule metabolites ('metabolomics') in an increasing number of biosamples. While offering great promise for insight into the pathogenesis of human disease, standard approaches have not yet been established for statistically analyzing increasingly complex, high-dimensional human metabolomics data in relation to clinical phenotypes, including disease outcomes. To determine optimal approaches for analysis, we formally compare traditional and newer statistical learning methods across a range of metabolomics dataset types. In simulated and experimental metabolomics data derived from large population-based human cohorts, we observe that with an increasing number of study subjects, univariate compared to multivariate methods result in an apparently higher false discovery rate as represented by substantial correlation between metabolites directly associated with the outcome and metabolites not associated with the outcome. Although the higher frequency of such associations would not be considered false in the strict statistical sense, it may be considered biologically less informative. In scenarios wherein the number of assayed metabolites increases, as in measures of nontargeted versus targeted metabolomics, multivariate methods performed especially favorably across a range of statistical operating characteristics. In nontargeted metabolomics datasets that included thousands of metabolite measures, sparse multivariate models demonstrated greater selectivity and lower potential for spurious relationships. When the number of metabolites was similar to or exceeded the number of study subjects, as is common with nontargeted metabolomics analysis of relatively small cohorts, sparse multivariate models exhibited the most-robust statistical power with more consistent results. These findings have important implications for metabolomics analysis in human disease.

4.
J Exp Psychol Gen ; 151(12): 3045-3059, 2022 Dec.
Article in English | MEDLINE | ID: mdl-35696175

ABSTRACT

Human language is unique among animal communication systems, in part because of its dual patterning in which meaningless phonological units combine to form meaningful words (phonological structure) and words combine to form sentences (lexicosyntactic structure). Although dual patterning is well recognized, its emergence in language development has been scarcely investigated. Chief among questions still unanswered is the extent to which development of these separate structures is independent or interdependent, and what supports acquisition of each level of structure. We explored these questions by examining growth of lexicosyntactic and phonological structure in children with normal hearing (n = 49) and children with hearing loss who use cochlear implants (n = 56). Multiple measures of each kind of structure were collected at 2-year intervals (kindergarten through eighth grade), and used to construct latent scores for each type of structure. Growth curve analysis assessed (a) the relative independence of development for each level of structure; (b) interactions between these two levels of structure in real-time language processing; and (c) contributions to growth of each level of structure made by auditory input, socioeconomic status (as proxy for linguistic experience), and speech motor control. Findings suggested that phonological and lexicosyntactic structure develop largely independently. Auditory input, socioeconomic status, and speech motor control help shape these language structures, with the last two factors exerting stronger effects for children with cochlear implants. Only for children with cochlear implants were interdependencies in real-time processing observed, reflecting compensatory mechanisms likely present to help them handle the disproportionately large phonological deficit they exhibit. (PsycInfo Database Record (c) 2022 APA, all rights reserved).


Subjects
Cochlear Implantation , Cochlear Implants , Deafness , Speech Perception , Child , Humans , Language Development
5.
Biostatistics ; 23(4): 1039-1055, 2022 10 14.
Article in English | MEDLINE | ID: mdl-35088075

ABSTRACT

The analysis of environmental mixtures is of growing importance in environmental epidemiology, and one of the key goals in such analyses is to identify exposures and their interactions that are associated with adverse health outcomes. Typical approaches utilize flexible regression models combined with variable selection to identify important exposures and estimate a potentially nonlinear relationship with the outcome of interest. Despite this surge in interest, no approaches to date can identify exposures and interactions while controlling any form of error rates with respect to exposure selection. We propose two novel approaches to estimating the health effects of environmental mixtures that simultaneously (i) estimate and provide valid inference for the overall mixture effect and (ii) identify important exposures and interactions while controlling the false discovery rate (FDR). We show that this can lead to substantial power gains to detect weak effects of environmental exposures. We apply our approaches to a study of persistent organic pollutants and find that controlling the FDR leads to substantially different conclusions.


Subjects
Environmental Pollutants , Persistent Organic Pollutants , Environmental Exposure/adverse effects , Environmental Pollutants/toxicity , Humans
6.
Biometrics ; 78(1): 100-114, 2022 03.
Article in English | MEDLINE | ID: mdl-33349923

ABSTRACT

We introduce a framework for estimating causal effects of binary and continuous treatments in high dimensions. We show how posterior distributions of treatment and outcome models can be used together with doubly robust estimators. We propose an approach to uncertainty quantification for the doubly robust estimator, which utilizes posterior distributions of model parameters and (1) results in good frequentist properties in small samples, (2) is based on a single run of a Markov chain Monte Carlo (MCMC) algorithm, and (3) improves over frequentist measures of uncertainty which rely on asymptotic properties. We consider a flexible framework for modeling the treatment and outcome processes within the Bayesian paradigm that reduces model dependence, accommodates nonlinearity, and achieves dimension reduction of the covariate space. We illustrate the ability of the proposed approach to flexibly estimate causal effects in high dimensions and appropriately quantify uncertainty. We show that our proposed variance estimation strategy is consistent when both models are correctly specified, and we see empirically that it performs well in finite samples and under model misspecification. Finally, we estimate the effect of continuous environmental exposures on cholesterol and triglyceride levels.
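For orientation, one widely used doubly robust estimator for a binary treatment is the augmented inverse probability weighted (AIPW) form below. This is an illustrative sketch only: the abstract does not state which doubly robust estimator the authors use, and the symbols are assumptions introduced here ($T_i$ is the binary treatment, $Y_i$ the outcome, $X_i$ the covariates, $\hat e(X_i)$ the estimated propensity score, and $\hat m_t(X_i)$ the estimated outcome regression under treatment level $t$).

```latex
\hat{\Delta}_{\mathrm{DR}}
  = \frac{1}{n} \sum_{i=1}^{n} \left[
      \hat m_1(X_i) - \hat m_0(X_i)
      + \frac{T_i \,\bigl(Y_i - \hat m_1(X_i)\bigr)}{\hat e(X_i)}
      - \frac{(1 - T_i)\,\bigl(Y_i - \hat m_0(X_i)\bigr)}{1 - \hat e(X_i)}
    \right]
```

The estimator is consistent if either the propensity score model or the outcome model is correctly specified, which is the sense of "doubly robust" here.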


Subjects
Models, Statistical , Bayes Theorem , Causality , Computer Simulation , Monte Carlo Method
7.
J Speech Lang Hear Res ; 63(1): 234-258, 2020 01 22.
Article in English | MEDLINE | ID: mdl-31834998

ABSTRACT

Purpose Parental language input (PLI) has reliably been found to influence child language development for children at risk of language delay, but previous work has generally restricted observations to the preschool years. The current study examined whether PLI during the early years explains variability in the spoken language abilities of children with hearing loss at those young ages, as well as later in childhood. Participants One hundred children participated: 34 with normal hearing, 24 with moderate losses who used hearing aids (HAs), and 42 with severe-to-profound losses who used cochlear implants (CIs). Mean socioeconomic status was middle class for all groups. Children with CIs generally received them early. Method Samples of parent-child interactions were analyzed to characterize PLI during the preschool years. Child language abilities (CLAs) were assessed at 48 months and 10 years of age. Results No differences were observed across groups in how parents interacted with their children. Nonetheless, strong differences across groups were observed in the effects of PLI on CLAs at 48 months of age: Children with normal hearing were largely resilient to their parents' language styles. Children with HAs were most influenced by the amount of PLI. Children with CIs were most influenced by PLI that evoked child language and modeled more complex versions. When potential influences of preschool PLI on CLAs at 10 years of age were examined, those effects at preschool were replicated. When mediation analyses were performed, however, it was found that the influences of preschool PLI on CLAs at 10 years of age were partially mediated by CLAs at preschool. Conclusion PLI is critical to the long-term spoken language abilities of children with hearing loss, but the style of input that is most effective varies depending on the severity of risk for delay.


Subjects
Child Language , Hearing Loss/psychology , Language Development Disorders/psychology , Parent-Child Relations , Parents/psychology , Adult , Child , Child, Preschool , Cochlear Implants , Female , Humans , Male , Parenting/psychology , Speech Intelligibility , Verbal Learning , Vocabulary
8.
Metabolites ; 9(7)2019 Jul 02.
Article in English | MEDLINE | ID: mdl-31269707

ABSTRACT

To assist with management and interpretation of human metabolomics data, which are rapidly increasing in quantity and complexity, we need better visualization tools. Using a dataset of several hundred metabolite measures profiled in a cohort of ~1500 individuals sampled from a population-based community study, we performed association analyses with eight demographic and clinical traits and outcomes. We compared frequently used existing graphical approaches with a novel 'rain plot' approach to display the results of these analyses. The 'rain plot' combines features of a raindrop plot and a conventional heatmap to convey results of multiple association analyses. A rain plot can simultaneously indicate effect size, directionality, and statistical significance of associations between metabolites and several traits. This approach enables visual comparison features of all metabolites examined with a given trait. The rain plot extends prior approaches and offers complementary information for data interpretation. Additional work is needed in data visualizations for metabolomics to assist investigators in the process of understanding and convey large-scale analysis results effectively, feasibly, and practically.

9.
Metabolites ; 9(7)2019 Jul 12.
Article in English | MEDLINE | ID: mdl-31336989

ABSTRACT

High-throughput metabolomics investigations, when conducted in large human cohorts, represent a potentially powerful tool for elucidating the biochemical diversity underlying human health and disease. Large-scale metabolomics data sources, generated using either targeted or nontargeted platforms, are becoming more common. Appropriate statistical analysis of these complex high-dimensional data will be critical for extracting meaningful results from such large-scale human metabolomics studies. Therefore, we consider the statistical analytical approaches that have been employed in prior human metabolomics studies. Based on the lessons learned and collective experience to date in the field, we offer a step-by-step framework for pursuing statistical analyses of cohort-based human metabolomics data, with a focus on feature selection. We discuss the range of options and approaches that may be employed at each stage of data management, analysis, and interpretation and offer guidance on the analytical decisions that need to be considered over the course of implementing a data analysis workflow. Certain pervasive analytical challenges facing the field warrant ongoing focused research. Addressing these challenges, particularly those related to analyzing human metabolomics data, will allow for more standardization of as well as advances in how research in the field is practiced. In turn, such major analytical advances will lead to substantial improvements in the overall contributions of human metabolomics investigations.

10.
J Am Stat Assoc ; 114(525): 24-27, 2019.
Article in English | MEDLINE | ID: mdl-34305210
11.
Bayesian Anal ; 14(3): 805-828, 2019 Sep.
Article in English | MEDLINE | ID: mdl-32431779

ABSTRACT

In observational studies, estimation of a causal effect of a treatment on an outcome relies on proper adjustment for confounding. If the number of the potential confounders (p) is larger than the number of observations (n), then direct control for all potential confounders is infeasible. Existing approaches for dimension reduction and penalization are generally aimed at predicting the outcome, and are less suited for estimation of causal effects. Under standard penalization approaches (e.g. Lasso), if a variable Xj is strongly associated with the treatment T but weakly with the outcome Y, the coefficient βj will be shrunk towards zero thus leading to confounding bias. Under the assumption of a linear model for the outcome and sparsity, we propose continuous spike and slab priors on the regression coefficients βj corresponding to the potential confounders Xj. Specifically, we introduce a prior distribution that does not heavily shrink to zero the coefficients (βj's) of the Xj's that are strongly associated with T but weakly associated with Y. We compare our proposed approach to several state-of-the-art methods proposed in the literature. Our proposed approach has the following features: 1) it reduces confounding bias in high-dimensional settings; 2) it shrinks towards zero coefficients of instrumental variables; and 3) it achieves good coverages even in small sample sizes. We apply our approach to the National Health and Nutrition Examination Survey (NHANES) data to estimate the causal effects of persistent pesticide exposure on triglyceride levels.

12.
Biometrics ; 74(4): 1171-1179, 2018 12.
Article in English | MEDLINE | ID: mdl-29750844

ABSTRACT

Valid estimation of treatment effects from observational data requires proper control of confounding. If the number of covariates is large relative to the number of observations, then controlling for all available covariates is infeasible. In cases where a sparsity condition holds, variable selection or penalization can reduce the dimension of the covariate space in a manner that allows for valid estimation of treatment effects. In this article, we propose matching on both the estimated propensity score and the estimated prognostic scores when the number of covariates is large relative to the number of observations. We derive asymptotic results for the matching estimator and show that it is doubly robust in the sense that only one of the two score models need be correct to obtain a consistent estimator. We show via simulation its effectiveness in controlling for confounding and highlight its potential to address nonlinear confounding. Finally, we apply the proposed procedure to analyze the effect of gender on prescription opioid use using insurance claims data.


Subjects
Confounding Factors, Epidemiologic , Outcome Assessment, Health Care/methods , Statistics as Topic/methods , Bias , Computer Simulation , Female , Humans , Insurance Claim Review , Male , Observational Studies as Topic/standards , Opioid-Related Disorders/epidemiology , Outcome Assessment, Health Care/standards , Prognosis , Propensity Score , Sex Factors , Substance-Related Disorders/epidemiology
13.
Ann Appl Stat ; 11(2): 792-807, 2017.
Article in English | MEDLINE | ID: mdl-29218072

ABSTRACT

Fine particulate matter (PM2.5) measured at a given location is a mix of pollution generated locally and pollution traveling long distances in the atmosphere. Therefore, the identification of spatial scales associated with health effects can inform on pollution sources responsible for these effects, resulting in more targeted regulatory policy. Recently, prediction methods that yield high-resolution spatial estimates of PM2.5 exposures allow one to evaluate such scale-specific associations. We propose a two-dimensional wavelet decomposition that alleviates restrictive assumptions required for standard wavelet decompositions. Using this method we decompose daily surfaces of PM2.5 to identify which scales of pollution are most associated with adverse health outcomes. A key feature of the approach is that it can remove the purely temporal component of variability in PM2.5 levels and calculate effect estimates derived solely from spatial contrasts. This eliminates the potential for unmeasured confounding of the exposure-outcome associations by temporal factors, such as season. We apply our method to a study of birth weights in Massachusetts, U.S.A., from 2003 to 2008 and find that both local and urban sources of pollution are strongly negatively associated with birth weight. Results also suggest that failure to eliminate temporal confounding in previous analyses attenuated the overall effect estimate towards zero, with the effect estimate growing in magnitude once this source of variability is removed.

14.
Epidemiology ; 28(5): 627-634, 2017 09.
Article in English | MEDLINE | ID: mdl-28768298

ABSTRACT

BACKGROUND: In 2012, the EPA enacted more stringent National Ambient Air Quality Standards (NAAQS) for fine particulate matter (PM2.5). Few studies have characterized the health effects of air pollution levels lower than the most recent NAAQS for long-term exposure to PM2.5 (now 12 µg/m³). METHODS: We constructed a cohort of 32,119 Medicare beneficiaries residing in 5138 US ZIP codes who were interviewed as part of the Medicare Current Beneficiary Survey (MCBS) between 2002 and 2010 and had 1 year of follow-up. We considered four outcomes: all-cause hospitalizations, hospitalizations for circulatory diseases and respiratory diseases, and death. RESULTS: We found that increasing exposure to PM2.5 from levels lower than 12 µg/m³ to levels higher than 12 µg/m³ is associated with increases in all-cause admission rates of 7% (95% CI = 3%, 10%) and in circulatory admission hazard rates of 6% (95% CI = 2%, 9%). When we restricted analysis to enrollees with exposure always lower than 12 µg/m³, we found that increasing exposure from levels lower than 8 µg/m³ to levels higher than 8 µg/m³ increased all-cause admission hazard rates by 15% (95% CI = 8%, 23%), circulatory by 18% (95% CI = 10%, 27%), and respiratory by 21% (95% CI = 9%, 34%). CONCLUSIONS: In a nationally representative sample of Medicare enrollees, changes in exposure to PM2.5, even at levels consistently below standards, are associated with increases in hospital admissions for all causes and cardiovascular and respiratory diseases. The robustness of our results to inclusion of many additional individual level potential confounders adds validity to studies of air pollution that rely entirely on administrative data.


Subjects
Hospitalization/statistics & numerical data , Inhalation Exposure/adverse effects , Particulate Matter/adverse effects , Aged , Causality , Cohort Studies , Female , Humans , Inhalation Exposure/analysis , Male , Medicare/statistics & numerical data , Middle Aged , Models, Statistical , Particulate Matter/analysis , Proportional Hazards Models , Risk Factors , United States/epidemiology
15.
Stat Med ; 36(29): 4604-4615, 2017 Dec 20.
Article in English | MEDLINE | ID: mdl-28833307

ABSTRACT

A critical issue in the analysis of clinical trials is patients' noncompliance to assigned treatments. In the context of a binary treatment with all or nothing compliance, the intent-to-treat analysis is a straightforward approach to estimating the effectiveness of the trial. In contrast, there exist 3 commonly used estimators with varying statistical properties for the efficacy of the trial, formally known as the complier-average causal effect. The instrumental variable estimator may be unbiased but can be extremely variable in many settings. The as treated and per protocol estimators are usually more efficient than the instrumental variable estimator, but they may suffer from selection bias. We propose a synthetic approach that incorporates all 3 estimators in a data-driven manner. The synthetic estimator is a linear convex combination of the instrumental variable, per protocol, and as treated estimators, resembling the popular model-averaging approach in the statistical literature. However, our synthetic approach is nonparametric; thus, it is applicable to a variety of outcome types without specific distributional assumptions. We also discuss the construction of the synthetic estimator using an analytic form derived from a simple normal mixture distribution. We apply the synthetic approach to a clinical trial for post-traumatic stress disorder.
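The estimators named in this abstract can be illustrated with a minimal sketch for a two-arm trial with all-or-nothing compliance. It uses the textbook difference-in-means definitions, not the authors' code: the function name `compliance_estimators` and the arguments `z` (assigned arm), `d` (treatment actually received), and `y` (outcome) are hypothetical names chosen for the example, and the data-driven weights of the synthetic estimator are not reproduced because the abstract does not give them.

```python
from statistics import mean


def compliance_estimators(z, d, y):
    """Textbook estimators for a two-arm trial with noncompliance.

    z: assigned arm (0/1), d: treatment received (0/1), y: outcome.
    All three arguments are equal-length sequences (hypothetical names).
    """

    def mean_where(values, keep):
        # Mean of `values` over the positions where `keep` is True.
        return mean(v for v, k in zip(values, keep) if k)

    assigned_1 = [zi == 1 for zi in z]
    assigned_0 = [zi == 0 for zi in z]

    # Intent-to-treat: compare outcomes by assignment, ignoring compliance.
    itt = mean_where(y, assigned_1) - mean_where(y, assigned_0)

    # Instrumental variable (Wald) estimator of the complier-average
    # causal effect: ITT effect divided by the difference in uptake.
    uptake = mean_where(d, assigned_1) - mean_where(d, assigned_0)
    iv = itt / uptake

    # As treated: compare outcomes by treatment actually received.
    as_treated = (mean_where(y, [di == 1 for di in d])
                  - mean_where(y, [di == 0 for di in d]))

    # Per protocol: keep only subjects who followed their assignment.
    complied_1 = [zi == 1 and di == 1 for zi, di in zip(z, d)]
    complied_0 = [zi == 0 and di == 0 for zi, di in zip(z, d)]
    per_protocol = mean_where(y, complied_1) - mean_where(y, complied_0)

    return {"itt": itt, "iv": iv,
            "as_treated": as_treated, "per_protocol": per_protocol}


# Toy data, invented purely for illustration.
toy = compliance_estimators(
    z=[1, 1, 1, 1, 0, 0, 0, 0],
    d=[1, 1, 0, 0, 0, 0, 0, 0],
    y=[5, 7, 2, 4, 1, 3, 2, 2],
)
print(toy)
```

A synthetic estimate in the sense of the abstract would then be a convex combination of the `iv`, `per_protocol`, and `as_treated` values, with nonnegative weights summing to one.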


Subjects
Linear Models , Patient Compliance , Randomized Controlled Trials as Topic/methods , Research Subjects , Bias , Clinical Trials as Topic , Computer Simulation , Humans , Stress Disorders, Post-Traumatic/therapy , Treatment Outcome
16.
Biostatistics ; 18(3): 553-568, 2017 Jul 01.
Article in English | MEDLINE | ID: mdl-28334230

ABSTRACT

In comparative effectiveness research, we are often interested in the estimation of an average causal effect from large observational data (the main study). Often this data does not measure all the necessary confounders. In many occasions, an extensive set of additional covariates is measured for a smaller and non-representative population (the validation study). In this setting, standard approaches for missing data imputation might not be adequate due to the large number of missing covariates in the main data relative to the smaller sample size of the validation data. We propose a Bayesian approach to estimate the average causal effect in the main study that borrows information from the validation study to improve confounding adjustment. Our approach combines ideas of Bayesian model averaging, confounder selection, and missing data imputation into a single framework. It allows for different treatment effects in the main study and in the validation study, and propagates the uncertainty due to the missing data imputation and confounder selection when estimating the average causal effect (ACE) in the main study. We compare our method to several existing approaches via simulation. We apply our method to a study examining the effect of surgical resection on survival among 10 396 Medicare beneficiaries with a brain tumor when additional covariate information is available on 2220 patients in SEER-Medicare. We find that the estimated ACE decreases by 30% when incorporating additional information from SEER-Medicare.


Subjects
Bayes Theorem , Comparative Effectiveness Research , Uncertainty , Brain Neoplasms/surgery , Humans , Information Storage and Retrieval , Medicare , United States
17.
Biostatistics ; 17(4): 764-78, 2016 10.
Article in English | MEDLINE | ID: mdl-27324413

ABSTRACT

In environmental epidemiology, exposures are not always available at subject locations and must be predicted using monitoring data. The monitor locations are often outside the control of researchers, and previous studies have shown that "preferential sampling" of monitoring locations can adversely affect exposure prediction and subsequent health effect estimation. We adopt a slightly different definition of preferential sampling than is typically seen in the literature, which we call population-based preferential sampling. Population-based preferential sampling occurs when the location of the monitors is dependent on the subject locations. We show the impact that population-based preferential sampling has on exposure prediction and health effect estimation using analytic results and a simulation study. A simple, one-parameter model is proposed to measure the degree to which monitors are preferentially sampled with respect to population density. We then discuss these concepts in the context of PM2.5 and the EPA Air Quality System monitoring sites, which are generally placed in areas of higher population density to capture the population's exposure.


Subjects
Environmental Exposure , Epidemiologic Methods , Models, Theoretical , Research Design , Environmental Monitoring/statistics & numerical data , Humans
18.
Stat Sci ; 31(1): 80-95, 2016 Feb.
Article in English | MEDLINE | ID: mdl-28979066

ABSTRACT

Generalized linear mixed models are a common statistical tool for the analysis of clustered or longitudinal data where correlation is accounted for through cluster-specific random effects. In practice, the distribution of the random effects is typically taken to be a Normal distribution, although if this does not hold then the model is misspecified and standard estimation/inference may be invalid. An alternative is to perform a so-called nonparametric Bayesian analysis in which one assigns a Dirichlet process (DP) prior to the unknown distribution of the random effects. In this paper we examine operating characteristics for estimation of fixed effects and random effects based on such an analysis under a range of "true" random effects distributions. As part of this we investigate various approaches for selection of the precision parameter of the DP prior. In addition, we illustrate the use of the methods with an analysis of post-operative complications among n = 18,643 female Medicare beneficiaries who underwent a hysterectomy procedure at N = 503 hospitals in the US. Overall, we conclude that using the DP prior in modeling the random effect distribution results in large reductions of bias with little loss of efficiency. While no single choice for the precision parameter will be optimal in all settings, certain strategies such as importance sampling or empirical Bayes can be used to obtain reasonable results in a broad range of data scenarios.
