ABSTRACT
High-throughput spatial transcriptomics (HST) is a rapidly emerging class of experimental technologies that profile gene expression in tissue samples at or near single-cell resolution while retaining the spatial location of each sequencing unit within the tissue sample. By analyzing HST data, we seek to identify sub-populations of cells within a tissue sample that may inform biological phenomena. Existing computational methods either ignore the spatial heterogeneity in gene expression profiles, fail to account for important statistical features such as skewness, or are heuristic-based network clustering methods that lack the inferential benefits of statistical modeling. To address this gap, we develop SPRUCE: a Bayesian spatial multivariate finite mixture model based on multivariate skew-normal distributions, which is capable of identifying distinct cellular sub-populations in HST data. We further implement a novel combination of Pólya-Gamma data augmentation and spatial random effects to infer spatially correlated mixture component membership probabilities without relying on approximate inference techniques. Via a simulation study, we demonstrate the detrimental inferential effects of ignoring skewness or spatial correlation in HST data. Using publicly available human brain HST data, we show that SPRUCE outperforms existing methods in recovering expertly annotated brain layers. Finally, our application of SPRUCE to human breast cancer HST data indicates that SPRUCE can distinguish distinct cell populations within the tumor microenvironment. An R package, spruce, for fitting the proposed models is available through the Comprehensive R Archive Network (CRAN).
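To make the idea of a finite mixture with skewed components concrete, the following is a minimal sketch of a univariate skew-normal mixture and the component membership probabilities it implies for a single observation. It is a deliberate simplification, not the SPRUCE model: the abstract describes multivariate skew-normal components with spatially correlated membership probabilities, whereas this sketch is one-dimensional, non-spatial, and uses fixed, made-up parameters. All function names and parameter values are hypothetical.

```python
import math


def skew_normal_pdf(x, loc=0.0, scale=1.0, shape=0.0):
    """Density of a univariate skew-normal distribution.

    Uses the standard form 2/scale * phi(z) * Phi(shape * z) with
    z = (x - loc) / scale; shape = 0 recovers the normal density.
    """
    z = (x - loc) / scale
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    big_phi = 0.5 * (1.0 + math.erf(shape * z / math.sqrt(2.0)))
    return 2.0 / scale * phi * big_phi


def mixture_pdf(x, weights, components):
    """Density of a finite mixture; components are (loc, scale, shape) tuples."""
    return sum(w * skew_normal_pdf(x, *c) for w, c in zip(weights, components))


def membership_probs(x, weights, components):
    """Probability that x belongs to each component, given fixed parameters."""
    parts = [w * skew_normal_pdf(x, *c) for w, c in zip(weights, components)]
    total = sum(parts)
    return [p / total for p in parts]


# Hypothetical two-component example (values chosen only for illustration).
weights = [0.4, 0.6]
components = [(-2.0, 1.0, 3.0), (3.0, 1.5, -2.0)]
probs = membership_probs(0.0, weights, components)
```

In a full Bayesian treatment the weights and component parameters would be estimated rather than fixed, and, as the abstract notes, the weights would vary over space.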
Subjects
Models, Statistical , Transcriptome , Humans , Bayes Theorem , Computer Simulation , Gene Expression Profiling
ABSTRACT
Dengue fever (DF) is a disease caused by a mosquito-borne flavivirus and a reemerging global public health threat. Although several studies have addressed the relationship between climatic and environmental factors and the epidemiology of DF, or have relied on purely spatial or purely time-series analyses, this article presents a joint spatio-temporal epidemiological analysis. Our hierarchical Bayesian approach accounts for both temporal and spatial autocorrelation in DF incidence and for the effects of temperature and precipitation. We fitted several space-time areal models to predict relative risk at the municipality level for each month from 1990 to 2014. Model selection was performed according to several criteria; the preferred models detected significant effects of temperature at time lags of up to four months and of precipitation at lags of up to three months. A boundary detection analysis, incorporated in the modeling approach, successfully detected municipalities with historically anomalous risk.
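As a rough illustration of two ingredients mentioned above, area-level relative risk and lagged climate covariates, the sketch below computes a crude standardized incidence ratio per area and builds a lagged copy of a monthly series. This is not the hierarchical Bayesian space-time model described in the abstract: it applies no spatial or temporal smoothing and serves only as the kind of raw quantity such models refine. The function names and the toy numbers are hypothetical.

```python
def expected_counts(populations, cases):
    """Expected cases per area if every area shared the overall incidence rate."""
    overall_rate = sum(cases) / sum(populations)
    return [p * overall_rate for p in populations]


def standardized_incidence_ratios(populations, cases):
    """Crude relative risk per area: observed cases divided by expected cases."""
    expected = expected_counts(populations, cases)
    return [c / e for c, e in zip(cases, expected)]


def lagged(series, lag):
    """Shift a monthly series forward by `lag` months, padding the start with None.

    Position t of the result holds the value observed at month t - lag,
    so it can be paired with the incidence observed at month t.
    """
    n = len(series)
    pad = min(lag, n)
    return [None] * pad + list(series[: n - pad])


# Toy data for three areas (made-up values, for illustration only).
populations = [1000, 2000, 1000]
cases = [10, 10, 20]
sir = standardized_incidence_ratios(populations, cases)

# Monthly temperature lagged by two months, as a candidate covariate.
temperature = [24.1, 25.3, 26.0, 27.2]
temperature_lag2 = lagged(temperature, 2)
```

A ratio above 1 marks an area with more cases than the overall rate would predict. Raw ratios are unstable for small populations, which is one motivation for borrowing strength across neighboring areas and months in a hierarchical model.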