Search | Biblioteca Virtual em Saúde Fiocruz

Covariate-guided Bayesian mixture of spline experts for the analysis of multivariate high-density longitudinal data.

Fu, Haoyi; Tang, Lu; Rosen, Ori; Hipwell, Alison E; Huppert, Theodore J; Krafty, Robert T.

Biostatistics ; 2023 Dec 23.

Article in English | MEDLINE | ID: mdl-38141227

ABSTRACT

With rapid development of techniques to measure brain activity and structure, statistical methods for analyzing modern brain-imaging data play an important role in the advancement of science. Imaging data that measure brain function are usually multivariate high-density longitudinal data and are heterogeneous across both imaging sources and subjects, which lead to various statistical and computational challenges. In this article, we propose a group-based method to cluster a collection of multivariate high-density longitudinal data via a Bayesian mixture of smoothing splines. Our method assumes each multivariate high-density longitudinal trajectory is a mixture of multiple components with different mixing weights. Time-independent covariates are assumed to be associated with the mixture components and are incorporated via logistic weights of a mixture-of-experts model. We formulate this approach under a fully Bayesian framework using Gibbs sampling where the number of components is selected based on a deviance information criterion. The proposed method is compared to existing methods via simulation studies and is applied to a study on functional near-infrared spectroscopy, which aims to understand infant emotional reactivity and recovery from stress. The results reveal distinct patterns of brain activity, as well as associations between these patterns and selected covariates.
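As an illustration of the "logistic weights of a mixture-of-experts model" mentioned in this abstract, the sketch below computes covariate-dependent mixing weights with a softmax (multinomial-logistic) function. It is a minimal sketch under assumptions, not the authors' implementation: the function name `gating_weights`, the coefficient layout (one row per component), and the example values are all hypothetical.

```python
import math

def gating_weights(covariates, coefs):
    """Softmax mixing weights for one subject.

    covariates: list of time-independent covariate values (hypothetical input).
    coefs: one list of coefficients per mixture component (hypothetical layout).
    """
    # Linear score of each component for this subject's covariates.
    scores = [sum(c * x for c, x in zip(row, covariates)) for row in coefs]
    # Subtract the maximum score for numerical stability before exponentiating.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative values only: two covariates, three components.
weights = gating_weights([1.0, 0.5], [[0.0, 0.0], [1.0, 0.0], [0.0, -2.0]])
```

The weights are positive and sum to one, so each subject's trajectory can be read as a covariate-dependent blend of the component curves.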

A case study in model failure? COVID-19 daily deaths and ICU bed utilisation predictions in New York state.

Chin, Vincent; Samia, Noelle I; Marchant, Roman; Rosen, Ori; Ioannidis, John P A; Tanner, Martin A; Cripps, Sally.

Eur J Epidemiol ; 35(8): 733-742, 2020 Aug.

Article in English | MEDLINE | ID: mdl-32780189

ABSTRACT

Forecasting models have been influential in shaping decision-making in the COVID-19 pandemic. However, there is concern that their predictions may have been misleading. Here, we dissect the predictions made by four models for the daily COVID-19 death counts between March 25 and June 5 in New York state, as well as the predictions of ICU bed utilisation made by the influential IHME model. We evaluated the accuracy of the point estimates and the accuracy of the uncertainty estimates of the model predictions. First, we compared the "ground truth" data sources on daily deaths against which these models were trained. Three different data sources were used by these models, and these had substantial differences in recorded daily death counts. Two additional data sources that we examined also provided different death counts per day. For accuracy of prediction, all models fared very poorly. Only 10.2% of the predictions fell within 10% of their training ground truth, irrespective of distance into the future. For accurate assessment of uncertainty, only one model matched relatively well the nominal 95% coverage, but that model did not start predictions until April 16, thus had no impact on early, major decisions. For ICU bed utilisation, the IHME model was highly inaccurate; the point estimates only started to match ground truth after the pandemic wave had started to wane. We conclude that trustworthy models require trustworthy input data to be trained upon. Moreover, models need to be subjected to prespecified real time performance tests, before their results are provided to policy makers and public health officials.
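The two accuracy measures described in this abstract (the share of point predictions within 10% of ground truth, and the empirical coverage of nominal 95% intervals) can be sketched as follows. This is a minimal sketch assuming simple parallel lists; the function names and the numbers are hypothetical illustrations, not data or code from the study.

```python
def share_within_tolerance(predicted, observed, tol=0.10):
    """Fraction of predictions within tol (relative) of the observed value."""
    hits = sum(1 for p, o in zip(predicted, observed) if abs(p - o) <= tol * abs(o))
    return hits / len(predicted)

def interval_coverage(lower, upper, observed):
    """Fraction of observed values that fall inside their prediction interval."""
    inside = sum(1 for lo, hi, o in zip(lower, upper, observed) if lo <= o <= hi)
    return inside / len(observed)

# Made-up daily death counts, for illustration only.
observed = [100, 100, 52, 150]
predicted = [105, 80, 50, 200]
lower = [90, 70, 40, 120]
upper = [110, 95, 60, 160]

accuracy = share_within_tolerance(predicted, observed)  # 0.5 for these values
coverage = interval_coverage(lower, upper, observed)    # 0.75 for these values
```

A well-calibrated 95% interval would give coverage close to 0.95 over many predictions; values far below that indicate understated uncertainty.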

Subjects

Coronavirus Infections/mortality, Forecasting/methods, Intensive Care Units/statistics & numerical data, Pandemics/prevention & control, Viral Pneumonia/mortality, Bed Occupancy, Betacoronavirus, COVID-19, Humans, Intensive Care Units/supply & distribution, Statistical Models, Mortality/trends, New York/epidemiology, Public Health, SARS-CoV-2

Bayesian semiparametric copula estimation with application to psychiatric genetics.

Rosen, Ori; Thompson, Wesley K.

Biom J ; 57(3): 468-84, 2015 May.

Article in English | MEDLINE | ID: mdl-25664559

ABSTRACT

This paper proposes a semiparametric methodology for modeling multivariate and conditional distributions. We first build a multivariate distribution whose dependence structure is induced by a Gaussian copula and whose marginal distributions are estimated nonparametrically via mixtures of B-spline densities. The conditional distribution of a given variable is obtained in closed form from this multivariate distribution. We take a Bayesian approach, using Markov chain Monte Carlo methods for inference. We study the frequentist properties of the proposed methodology via simulation and apply the method to estimation of conditional densities of summary statistics, used for computing conditional local false discovery rates, from genetic association studies of schizophrenia and cardiovascular disease risk factors.
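To make the idea of a dependence structure "induced by a Gaussian copula" concrete, the sketch below draws pairs of uniform variables whose dependence comes from correlated standard normals. It is a minimal sketch under assumptions: the bivariate case, the function name `gaussian_copula_pairs`, and the correlation value are hypothetical, and the nonparametric B-spline marginals described in the abstract are not implemented here.

```python
import random
from statistics import NormalDist

def gaussian_copula_pairs(n, rho, seed=0):
    """Draw n pairs (u, v) of uniforms linked by a Gaussian copula with correlation rho."""
    rng = random.Random(seed)
    phi = NormalDist().cdf  # standard normal CDF
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Build a second normal with correlation rho to the first.
        z2 = rho * z1 + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        # The normal CDF maps each coordinate to a uniform on [0, 1].
        pairs.append((phi(z1), phi(z2)))
    return pairs

pairs = gaussian_copula_pairs(500, rho=0.8)
```

Applying a marginal quantile function to each coordinate would then yield variables with the desired marginals while keeping the copula's dependence.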

Subjects

Cardiovascular Diseases/epidemiology, Cardiovascular Diseases/genetics, Genetic Predisposition to Disease/genetics, Statistical Models, Schizophrenia/epidemiology, Schizophrenia/genetics, Algorithms, Bayes Theorem, Cardiovascular Diseases/diagnosis, Computer Simulation, Statistical Data Interpretation, Genetic Predisposition to Disease/epidemiology, Humans, Incidence, Normal Distribution, Risk Factors, Schizophrenia/diagnosis

AdaptSPEC-X: Covariate-Dependent Spectral Modeling of Multiple Nonstationary Time Series.

Bertolacci, Michael; Rosen, Ori; Cripps, Edward; Cripps, Sally.

J Comput Graph Stat ; 31(2): 436-454, 2022.

Article in English | MEDLINE | ID: mdl-36329784

ABSTRACT

We present the AdaptSPEC-X method for the joint analysis of a panel of possibly nonstationary time series. The approach is Bayesian and uses a covariate-dependent infinite mixture model to incorporate multiple time series, with mixture components parameterized by a time-varying mean and log spectrum. The mixture components are based on AdaptSPEC, a nonparametric model which adaptively divides the time series into an unknown number of segments and estimates the local log spectra by smoothing splines. AdaptSPEC-X extends AdaptSPEC in three ways. First, through the infinite mixture, it applies to multiple time series linked by covariates. Second, it can handle missing values, a common feature of time series which can cause difficulties for nonparametric spectral methods. Third, it allows for a time-varying mean. Through these extensions, AdaptSPEC-X can estimate time-varying means and spectra at observed and unobserved covariate values, allowing for predictive inference. Estimation is performed by Markov chain Monte Carlo (MCMC) methods, combining data augmentation, reversible jump, and Riemann manifold Hamiltonian Monte Carlo techniques. We evaluate the methodology using simulated data, and describe applications to Australian rainfall data and measles incidence in the US. Software implementing the method proposed in this paper is available in the R package BayesSpec.

Adaptive Bayesian Spectral Analysis of High-dimensional Nonstationary Time Series.

Li, Zeda; Rosen, Ori; Ferrarelli, Fabio; Krafty, Robert T.

J Comput Graph Stat ; 30(3): 794-807, 2021.

Article in English | MEDLINE | ID: mdl-35936018

ABSTRACT

This article introduces a nonparametric approach to spectral analysis of a high-dimensional multivariate nonstationary time series. The procedure is based on a novel frequency-domain factor model that provides a flexible yet parsimonious representation of spectral matrices from a large number of simultaneously observed time series. Real and imaginary parts of the factor loading matrices are modeled independently using a prior that is formulated from the tensor product of penalized splines and multiplicative gamma process shrinkage priors, allowing for infinitely many factors with loadings increasingly shrunk towards zero as the column index increases. Formulated in a fully Bayesian framework, the time series is adaptively partitioned into approximately stationary segments, where both the number and locations of partition points are assumed unknown. Stochastic approximation Monte Carlo (SAMC) techniques are used to accommodate the unknown number of segments, and a conditional Whittle likelihood-based Gibbs sampler is developed for efficient sampling within segments. By averaging over the distribution of partitions, the proposed method can approximate both abrupt and slowly varying changes in spectral matrices. Performance of the proposed model is evaluated by extensive simulations and demonstrated through the analysis of high-density electroencephalography.

A Bayesian regression model for multivariate functional data.

Rosen, Ori; Thompson, Wesley K.

Comput Stat Data Anal ; 53(11): 3773-3786, 2009 Sep 01.

Article in English | MEDLINE | ID: mdl-28936016

ABSTRACT

In this paper we present a model for the analysis of multivariate functional data with unequally spaced observation times that may differ among subjects. Our method is formulated as a Bayesian mixed-effects model in which the fixed part corresponds to the mean functions, and the random part corresponds to individual deviations from these mean functions. Covariates can be incorporated into both the fixed and the random effects. The random error term of the model is assumed to follow a multivariate Ornstein-Uhlenbeck process. For each of the response variables, both the mean and the subject-specific deviations are estimated via low-rank cubic splines using radial basis functions. Inference is performed via Markov chain Monte Carlo methods.

Conditional Spectral Analysis of Replicated Multiple Time Series with Application to Nocturnal Physiology.

Krafty, Robert T; Rosen, Ori; Stoffer, David S; Buysse, Daniel J; Hall, Martica H.

J Am Stat Assoc ; 112(520): 1405-1416, 2017.

Article in English | MEDLINE | ID: mdl-29430069

ABSTRACT

This article considers the problem of analyzing associations between power spectra of multiple time series and cross-sectional outcomes when data are observed from multiple subjects. The motivating application comes from sleep medicine, where researchers are able to non-invasively record physiological time series signals during sleep. The frequency patterns of these signals, which can be quantified through the power spectrum, contain interpretable information about biological processes. An important problem in sleep research is drawing connections between power spectra of time series signals and clinical characteristics; these connections are key to understanding biological pathways through which sleep affects, and can be treated to improve, health. Such analyses are challenging as they must overcome the complicated structure of a power spectrum from multiple time series as a complex positive-definite matrix-valued function. This article proposes a new approach to such analyses based on a tensor-product spline model of Cholesky components of outcome-dependent power spectra. The approach flexibly models power spectra as nonparametric functions of frequency and outcome while preserving geometric constraints. Formulated in a fully Bayesian framework, a Whittle likelihood-based Markov chain Monte Carlo (MCMC) algorithm is developed for automated model fitting and for conducting inference on associations between outcomes and spectral measures. The method is used to analyze data from a study of sleep in older adults and uncovers new insights into how stress and arousal are connected to the amount of time one spends in bed.

A Bayesian model for sparse functional data.

Thompson, Wesley K; Rosen, Ori.

Biometrics ; 64(1): 54-63, 2008 Mar.

Article in English | MEDLINE | ID: mdl-17573864

ABSTRACT

We propose a method for analyzing data which consist of curves on multiple individuals, i.e., longitudinal or functional data. We use a Bayesian model where curves are expressed as linear combinations of B-splines with random coefficients. The curves are estimated as posterior means obtained via Markov chain Monte Carlo (MCMC) methods, which automatically select the local level of smoothing. The method is applicable to situations where curves are sampled sparsely and/or at irregular time points. We construct posterior credible intervals for the mean curve and for the individual curves. This methodology provides unified, efficient, and flexible means for smoothing functional data.
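The statement that curves are "linear combinations of B-splines" can be illustrated with the Cox-de Boor recursion. The sketch below is a minimal illustration under assumptions, not the paper's model: the knot vector, the cubic degree, and the coefficient values are hypothetical, and the coefficients are fixed here rather than random with a posterior distribution.

```python
def bspline_basis(i, k, t, knots):
    """Cox-de Boor recursion: value at t of the i-th B-spline of degree k."""
    if k == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left_den = knots[i + k] - knots[i]
    right_den = knots[i + k + 1] - knots[i + 1]
    # Terms with a zero denominator (repeated knots) are taken to be zero.
    left = 0.0 if left_den == 0 else (
        (t - knots[i]) / left_den * bspline_basis(i, k - 1, t, knots))
    right = 0.0 if right_den == 0 else (
        (knots[i + k + 1] - t) / right_den * bspline_basis(i + 1, k - 1, t, knots))
    return left + right

def curve(t, coefs, knots, degree=3):
    """Evaluate sum_i coefs[i] * B_i(t), a curve in the B-spline basis."""
    return sum(c * bspline_basis(i, degree, t, knots) for i, c in enumerate(coefs))

# Hypothetical clamped cubic knot vector on [0, 1) with one interior knot;
# it supports len(knots) - degree - 1 = 5 basis functions.
knots = [0.0, 0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 1.0, 1.0]
coefs = [0.0, 1.0, 3.0, 2.0, 0.5]  # illustrative coefficients only
value = curve(0.3, coefs, knots)
```

In a Bayesian treatment such as the one described above, the coefficients would be given a prior and the fitted curve would be summarized by posterior means and credible intervals.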

Subjects

Algorithms, Bayes Theorem, Biometry/methods, Cohort Studies, Statistical Data Interpretation, Longitudinal Studies, Statistical Models, Computer Simulation

Multivariate Bernoulli mixture models with application to postmortem tissue studies in schizophrenia.

Sun, Zhuoxin; Rosen, Ori; Sampson, Allan R.

Biometrics ; 63(3): 901-9, 2007 Sep.

Article in English | MEDLINE | ID: mdl-17825019

ABSTRACT

A novel mixture model is presented for repeated measurements in which correlation among repeated observations on the same subject is induced via correlated unobservable component indicators. The mixture components in our model are linear regressions, and the mixing proportions are logits with random effects. Inference is facilitated by sampling from the posterior distribution of the parameters via Markov chain Monte Carlo methods. The model is applied to a neuronal postmortem brain tissue study to examine the differences in neuron volumes between schizophrenic and control subjects.

Subjects

Algorithms, Brain/pathology, Statistical Data Interpretation, Computer-Assisted Diagnosis/methods, Neurological Models, Statistical Models, Schizophrenia/pathology, Autopsy, Biometry/methods, Computer Simulation, Diagnosis, Humans, Likelihood Functions, Multivariate Analysis, Statistical Distributions

Analysis of growth curves via mixtures.

Rosen, Ori; Cohen, Ayala.

Stat Med ; 22(23): 3641-54, 2003 Dec 15.

Article in English | MEDLINE | ID: mdl-14652866

ABSTRACT

We present a method for analysing growth curves using the mixtures-of-experts (ME) approach, which enables flexible estimation of the dependence of height on age. Individual growth curves are first fitted and the resulting estimates are then pooled to facilitate the estimation of mean growth curves, velocity and acceleration of growth and centile curves. The method allows for comparisons between males and females. We illustrate our methodology with data on growth of male and female children (0-4 years old).

Subjects

Child Development/physiology, Growth/physiology, Biological Models, Age Factors, Body Height, Preschool Child, Female, Humans, Infant, Newborn Infant, Male, Sex Factors
