Pesquisa | BVS Violência e Saúde

1.

Stochastic machine learning via sigma profiles to build a digital chemical space.

Abranches, Dinis O; Maginn, Edward J; Colón, Yamil J.

Proc Natl Acad Sci U S A ; 121(31): e2404676121, 2024 Jul 30.

Artigo em Inglês | MEDLINE | ID: mdl-39042681

RESUMO

This work establishes a different paradigm on digital molecular spaces and their efficient navigation by exploiting sigma profiles. To do so, the remarkable capability of Gaussian processes (GPs), a type of stochastic machine learning model, to correlate and predict physicochemical properties from sigma profiles is demonstrated, outperforming state-of-the-art neural networks previously published. The amount of chemical information encoded in sigma profiles eases the learning burden of machine learning models, permitting the training of GPs on small datasets which, due to their negligible computational cost and ease of implementation, are ideal models to be combined with optimization tools such as gradient search or Bayesian optimization (BO). Gradient search is used to efficiently navigate the sigma profile digital space, quickly converging to local extrema of target physicochemical properties. While this requires the availability of pretrained GP models on existing datasets, such limitations are eliminated with the implementation of BO, which can find global extrema with a limited number of iterations. A remarkable example of this is that of BO toward boiling temperature optimization. Holding no knowledge of chemistry except for the sigma profile and boiling temperature of carbon monoxide (the worst possible initial guess), BO finds the global maximum of the available boiling temperature dataset (over 1,000 molecules encompassing more than 40 families of organic and inorganic compounds) in just 15 iterations (i.e., 15 property measurements), cementing sigma profiles as a powerful digital chemical space for molecular optimization and discovery, particularly when little to no experimental data is initially available.

2.

Fast matrix completion in epigenetic methylation studies with informative covariates.

Ribaud, Mélina; Labbe, Aurélie; Fouda, Khaled; Oualkacha, Karim.

Biostatistics ; 2024 Jun 07.

Artigo em Inglês | MEDLINE | ID: mdl-38850151

RESUMO

DNA methylation is an important epigenetic mark that modulates gene expression through the inhibition of transcriptional proteins binding to DNA. As in many other omics experiments, the issue of missing values is an important one, and appropriate imputation techniques are important in avoiding an unnecessary sample size reduction as well as to optimally leverage the information collected. We consider the case where relatively few samples are processed via an expensive high-density whole genome bisulfite sequencing (WGBS) strategy and a larger number of samples is processed using more affordable low-density, array-based technologies. In such cases, one can impute the low-coverage (array-based) methylation data using the high-density information provided by the WGBS samples. In this paper, we propose an efficient Linear Model of Coregionalisation with informative Covariates (LMCC) to predict missing values based on observed values and covariates. Our model assumes that at each site, the methylation vector of all samples is linked to the set of fixed factors (covariates) and a set of latent factors. Furthermore, we exploit the functional nature of the data and the spatial correlation across sites by assuming some Gaussian processes on the fixed and latent coefficient vectors, respectively. Our simulations show that the use of covariates can significantly improve the accuracy of imputed values, especially in cases where missing data contain some relevant information about the explanatory variable. We also showed that our proposed model is particularly efficient when the number of columns is much greater than the number of rows-which is usually the case in methylation data analysis. Finally, we apply and compare our proposed method with alternative approaches on two real methylation datasets, showing how covariates such as cell type, tissue type or age can enhance the accuracy of imputed values.

3.

Higher-order epistasis and phenotypic prediction.

Zhou, Juannan; Wong, Mandy S; Chen, Wei-Chia; Krainer, Adrian R; Kinney, Justin B; McCandlish, David M.

Proc Natl Acad Sci U S A ; 119(39): e2204233119, 2022 09 27.

Artigo em Inglês | MEDLINE | ID: mdl-36129941

RESUMO

Contemporary high-throughput mutagenesis experiments are providing an increasingly detailed view of the complex patterns of genetic interaction that occur between multiple mutations within a single protein or regulatory element. By simultaneously measuring the effects of thousands of combinations of mutations, these experiments have revealed that the genotype-phenotype relationship typically reflects not only genetic interactions between pairs of sites but also higher-order interactions among larger numbers of sites. However, modeling and understanding these higher-order interactions remains challenging. Here we present a method for reconstructing sequence-to-function mappings from partially observed data that can accommodate all orders of genetic interaction. The main idea is to make predictions for unobserved genotypes that match the type and extent of epistasis found in the observed data. This information on the type and extent of epistasis can be extracted by considering how phenotypic correlations change as a function of mutational distance, which is equivalent to estimating the fraction of phenotypic variance due to each order of genetic interaction (additive, pairwise, three-way, etc.). Using these estimated variance components, we then define an empirical Bayes prior that in expectation matches the observed pattern of epistasis and reconstruct the genotype-phenotype mapping by conducting Gaussian process regression under this prior. To demonstrate the power of this approach, we present an application to the antibody-binding domain GB1 and also provide a detailed exploration of a dataset consisting of high-throughput measurements for the splicing efficiency of human pre-mRNA [Formula: see text] splice sites, for which we also validate our model predictions via additional low-throughput experiments.

Assuntos

Epistasia Genética , Precursores de RNA , Teorema de Bayes , Mapeamento Cromossômico , Biologia Computacional , Genótipo , Humanos , Modelos Genéticos , Mutação , Fenótipo , Splicing de RNA

4.

Bayesian nonparametric inference for heterogeneously mixing infectious disease models.

Seymour, Rowland G; Kypraios, Theodore; O'Neill, Philip D.

Proc Natl Acad Sci U S A ; 119(10): e2118425119, 2022 03 08.

Artigo em Inglês | MEDLINE | ID: mdl-35238628

RESUMO

SignificanceMathematical models of infectious disease transmission continue to play a vital role in understanding, mitigating, and preventing outbreaks. The vast majority of epidemic models in the literature are parametric, meaning that they contain inherent assumptions about how transmission occurs in a population. However, such assumptions can be lacking in appropriate biological or epidemiological justification and in consequence lead to erroneous scientific conclusions and misleading predictions. We propose a flexible Bayesian nonparametric framework that avoids the need to make strict model assumptions about the infection process and enables a far more data-driven modeling approach for inferring the mechanisms governing transmission. We use our methods to enhance our understanding of the transmission mechanisms of the 2001 UK foot and mouth disease outbreak.

Assuntos

Teorema de Bayes , Doenças Transmissíveis/epidemiologia , Modelos Teóricos , Animais , Doenças Transmissíveis/transmissão , Surtos de Doenças , Febre Aftosa/epidemiologia , Humanos , Estatísticas não Paramétricas , Reino Unido/epidemiologia

5.

Grey matter structure within the visual networks in migraine with aura: multivariate and univariate analyses.

Niddam, David M; Lai, Kuan-Lin; Hsiao, Yi-Ting; Wang, Yen-Feng; Wang, Shuu-Jiun.

Cephalalgia ; 44(1): 3331024231222637, 2024 Jan.

Artigo em Inglês | MEDLINE | ID: mdl-38170950

RESUMO

BACKGROUND: The visual cortex is involved in the generation of migraine aura. Voxel-based multivariate analyses applied to this region may provide complementary information about aura mechanisms relative to the commonly used mass-univariate analyses. METHODS: Structural images constrained within the functional resting-state visual networks were obtained in migraine patients with (n = 50) and without (n = 50) visual aura and healthy controls (n = 50). The masked images entered a multivariate analysis in which Gaussian process classification was used to generate pairwise models. Generalizability was assessed by five-fold cross-validation and non-parametric permutation tests were used to estimate significance levels. A univariate voxel-based morphometry analysis was also performed. RESULTS: A multivariate pattern of grey matter voxels within the ventral medial visual network contained significant information related to the diagnosis of migraine with visual aura (aura vs. healthy controls: classification accuracy = 78%, p < 0.001; area under the curve = 0.84, p < 0.001; migraine with aura vs. without aura: classification accuracy = 71%, p < 0.001; area under the curve = 0.73, p < 0.003). Furthermore, patients with visual aura exhibited increased grey matter volume in the medial occipital cortex compared to the two other groups. CONCLUSIONS: Migraine with visual aura is characterized by multivariate and univariate patterns of grey matter changes within the medial occipital cortex that have discriminative power and may reflect pathological mechanisms.

Assuntos

Epilepsia , Enxaqueca com Aura , Humanos , Substância Cinzenta/patologia , Enxaqueca com Aura/diagnóstico , Imageamento por Ressonância Magnética/métodos , Córtex Cerebral

6.

Redefining 'state-of-the-art' for integrated population models with immigration.

Nater, Chloé R.

J Anim Ecol ; 93(5): 520-524, 2024 May.

Artigo em Inglês | MEDLINE | ID: mdl-38634153

RESUMO

Research Highlight: Christian, M., Oosthuizen, W. C., Bester, M. N., & de Bruyn, P. N. (2024). Robustly estimating the demographic contribution of immigration: Simulation, sensitivity analysis and seals. Journal of Animal Ecology. https://doi.org/10.1111/1365-2656.14053. Immigration can have profound consequences for local population dynamics and demography, but collecting data to accurately quantifying it is challenging. The recent rise of integrated population models (IPMs) offers an alternative by making it possible to estimate immigration without the need for explicit data, and to quantify its contribution to population dynamics through transient Life Table Response Experiments (tLTREs). Simulation studies have, however, highlighted that this approach can be prone to bias and overestimation. In their new study, Christian et al. address one of the root causes of this issue by improving the estimation of time variation in vital rates and immigration using Gaussian processes in lieu of traditionally used temporal random effects. They demonstrate that IPM-tLTRE frameworks with Gaussian processes produce more accurate and less biased estimates of immigration and its contribution to population dynamics and illustrate the applicability of this approach using a long-term data set on elephant seals (Mirounga leonida). Results are validated with a simulation study and suggest that immigration of breeding females has been central for population recovery of elephant seals despite the species' high female site fidelity. Christian et al. thus present new insights into population regulation of long-lived marine mammals and highlight the potential for using Gaussian process priors in IPMs. They also illustrate a suite of 'best practices' for state-of-the-art IPM-tLTRE analyses and provide an inspirational example for the kind of ecological modelling workflow that can be invaluable not just as a starting point for fellow ecologists picking up or improving their own IPM-tLTRE analyses, but also for teaching and in contexts where model estimates are used for informing management and conservation decision-making.

Assuntos

Migração Animal , Modelos Biológicos , Dinâmica Populacional , Animais , Focas Verdadeiras/fisiologia

7.

Function estimation: Quantifying individual differences of hand-drawn functions.

Little, Daniel R; Shiffrin, Richard M; Laham, Simon M.

Mem Cognit ; 2024 Jun 28.

Artigo em Inglês | MEDLINE | ID: mdl-38944648

RESUMO

Graphical perception is an important part of the scientific endeavour, and the interpretation of graphical information is increasingly important among educated consumers of popular media, who are often presented with graphs of data in support of different policy positions. However, graphs are multidimensional and data in graphs are comprised not only of overall global trends but also local perturbations. We presented a novel function estimation task in which scatterplots of noisy data that varied in the number of data points, the scale of the data, and the true generating function were shown to observers. 170 psychology undergraduates with mixed experience of mathematical functions were asked to draw the function that they believe generated the data. Our results indicated not only a general influence of various aspects of the presented graph (e.g., increasing the number of data points results in smoother generated functions) but also clear individual differences, with some observers tending to generate functions that track the local changes in the data and others following global trends in the data.

8.

Bayesian spatiotemporal modeling on complex-valued fMRI signals via kernel convolutions.

Yu, Cheng-Han; Prado, Raquel; Ombao, Hernando; Rowe, Daniel.

Biometrics ; 79(2): 616-628, 2023 06.

Artigo em Inglês | MEDLINE | ID: mdl-35143043

RESUMO

We propose a model-based approach that combines Bayesian variable selection tools, a novel spatial kernel convolution structure, and autoregressive processes for detecting a subject's brain activation at the voxel level in complex-valued functional magnetic resonance imaging (CV-fMRI) data. A computationally efficient Markov chain Monte Carlo algorithm for posterior inference is developed by taking advantage of the dimension reduction of the kernel-based structure. The proposed spatiotemporal model leads to more accurate posterior probability activation maps and less false positives than alternative spatial approaches based on Gaussian process models, and other complex-valued models that do not incorporate spatial and/or temporal structure. This is illustrated in the analysis of simulated data and human task-related CV-fMRI data. In addition, we show that complex-valued approaches dominate magnitude-only approaches and that the kernel structure in our proposed model considerably improves sensitivity rates when detecting activation at the voxel level.

Assuntos

Mapeamento Encefálico , Imageamento por Ressonância Magnética , Humanos , Mapeamento Encefálico/métodos , Imageamento por Ressonância Magnética/métodos , Teorema de Bayes , Encéfalo/diagnóstico por imagem , Encéfalo/fisiologia , Algoritmos

9.

On free energy barriers in Gaussian priors and failure of cold start MCMC for high-dimensional unimodal distributions.

Bandeira, Afonso S; Maillard, Antoine; Nickl, Richard; Wang, Sven.

Philos Trans A Math Phys Eng Sci ; 381(2247): 20220150, 2023 May 15.

Artigo em Inglês | MEDLINE | ID: mdl-36970818

RESUMO

We exhibit examples of high-dimensional unimodal posterior distributions arising in nonlinear regression models with Gaussian process priors for which Markov chain Monte Carlo (MCMC) methods can take an exponential run-time to enter the regions where the bulk of the posterior measure concentrates. Our results apply to worst-case initialized ('cold start') algorithms that are local in the sense that their step sizes cannot be too large on average. The counter-examples hold for general MCMC schemes based on gradient or random walk steps, and the theory is illustrated for Metropolis-Hastings adjusted methods such as preconditioned Crank-Nicolson and Metropolis-adjusted Langevin algorithm. This article is part of the theme issue 'Bayesian inference: challenges, perspectives, and prospects'.

10.

Autonomous Electron Tomography Reconstruction with Machine Learning.

Millsaps, William; Schwartz, Jonathan; Di, Zichao Wendy; Jiang, Yi; Hovden, Robert.

Microsc Microanal ; 29(5): 1650-1657, 2023 Sep 29.

Artigo em Inglês | MEDLINE | ID: mdl-37639314

RESUMO

Modern electron tomography has progressed to higher resolution at lower doses by leveraging compressed sensing (CS) methods that minimize total variation (TV). However, these sparsity-emphasized reconstruction algorithms introduce tunable parameters that greatly influence the reconstruction quality. Here, Pareto front analysis shows that high-quality tomograms are reproducibly achieved when TV minimization is heavily weighted. However, in excess, CS tomography creates overly smoothed three-dimensional (3D) reconstructions. Adding momentum to the gradient descent during reconstruction reduces the risk of over-smoothing and better ensures that CS is well behaved. For simulated data, the tedious process of tomography parameter selection is efficiently solved using Bayesian optimization with Gaussian processes. In combination, Bayesian optimization with momentum-based CS greatly reduces the required compute time-an 80% reduction was observed for the 3D reconstruction of SrTiO3 nanocubes. Automated parameter selection is necessary for large-scale tomographic simulations that enable the 3D characterization of a wider range of inorganic and biological materials.

11.

Cautious Bayesian Optimization: A Line Tracker Case Study.

Girbés-Juan, Vicent; Moll, Joaquín; Sala, Antonio; Armesto, Leopoldo.

Sensors (Basel) ; 23(16)2023 Aug 18.

Artigo em Inglês | MEDLINE | ID: mdl-37631802

RESUMO

In this paper, a procedure for experimental optimization under safety constraints, to be denoted as constraint-aware Bayesian Optimization, is presented. The basic ingredients are a performance objective function and a constraint function; both of them will be modeled as Gaussian processes. We incorporate a prior model (transfer learning) used for the mean of the Gaussian processes, a semi-parametric Kernel, and acquisition function optimization under chance-constrained requirements. In this way, experimental fine-tuning of a performance objective under experiment-model mismatch can be safely carried out. The methodology is illustrated in a case study on a line-follower application in a CoppeliaSim environment.

12.

Sensitivity Analysis of RV Reducer Rotation Error Based on Deep Gaussian Processes.

Jin, Shousong; Shang, Shulong; Jiang, Suqi; Cao, Mengyi; Wang, Yaliang.

Sensors (Basel) ; 23(7)2023 Mar 29.

Artigo em Inglês | MEDLINE | ID: mdl-37050638

RESUMO

The rotation error is the most important quality characteristic index of a rotate vector (RV) reducer, and it is difficult to accurately optimize the design of a RV reducer, such as the Taguchi type, due to the many factors affecting the rotation error and the serious coupling effect among the factors. This paper analyzes the RV reducer rotation error and each factor based on the deep Gaussian processes (DeepGP) model and Sobol sensitivity analysis(SA) method. Firstly, using the optimal Latin hypercube sampling (OLHS) approach and the DeepGP model, a high-precision regression prediction model of the rotation error and each affecting factor was created. On the basis of the prediction model, the Sobol method was used to conduct a global SA of the factors influencing the rotation error and to compare the coupling relationship between the factors. The results show that the OLHS method and the DeepGP model are suitable for predicting the rotation error in this paper, and the accuracy of the prediction model constructed based on both of them is as high as 95%. The rotation error mainly depends on the influencing factors in the second stage cycloidal pinwheel drive part. The primary involute planetary part and planetary output carrier's rotation error factors have little effect. The coupling effects between the matching clearance between the pin gear and needle gear hole (Î´J) and the circular position error of the needle gear hole (Î´t) is noticeably stronger.

13.

Strain energy density as a Gaussian process and its utilization in stochastic finite element analysis: application to planar soft tissues.

Aggarwal, Ankush; Jensen, Bjørn Sand; Pant, Sanjay; Lee, Chung-Hao.

Comput Methods Appl Mech Eng ; 4042023 Feb 01.

Artigo em Inglês | MEDLINE | ID: mdl-37235184

RESUMO

Data-based approaches are promising alternatives to the traditional analytical constitutive models for solid mechanics. Herein, we propose a Gaussian process (GP) based constitutive modeling framework, specifically focusing on planar, hyperelastic and incompressible soft tissues. The strain energy density of soft tissues is modeled as a GP, which can be regressed to experimental stress-strain data obtained from biaxial experiments. Moreover, the GP model can be weakly constrained to be convex. A key advantage of a GP-based model is that, in addition to the mean value, it provides a probability density (i.e. associated uncertainty) for the strain energy density. To simulate the effect of this uncertainty, a non-intrusive stochastic finite element analysis (SFEA) framework is proposed. The proposed framework is verified against an artificial dataset based on the Gasser-Ogden-Holzapfel model and applied to a real experimental dataset of a porcine aortic valve leaflet tissue. Results show that the proposed framework can be trained with limited experimental data and fits the data better than several existing models. The SFEA framework provides a straightforward way of using the experimental data and quantifying the resulting uncertainty in simulation-based predictions.

14.

Learning inter-annual flood loss risk models from historical flood insurance claims.

Salas, Joaquin; Saha, Anamitra; Ravela, Sai.

J Environ Manage ; 347: 118862, 2023 Dec 01.

Artigo em Inglês | MEDLINE | ID: mdl-37806269

RESUMO

Flooding is a natural hazard that causes substantial loss of lives and livelihoods worldwide. Developing predictive models for flood-induced financial losses is crucial for applications such as insurance underwriting. This research uses the National Flood Insurance Program (NFIP) dataset between 2000 and 2020 to evaluate the predictive skill of past data in predicting near-future flood loss risk. Our approach applies neural networks (Conditional Generative Adversarial Networks), decision trees (Extreme Gradient Boosting), and kernel-based regressors (Gaussian Processes) to estimate pointwise losses. It aggregates them over intervals using a bias-corrected Burr-Pareto distribution to predict risk. The regression models help identify the most informative predictors and highlight crucial factors influencing flood-related financial losses. Applying our approach to quantify the county-level coastal flood loss risk in eight US Southern states results in an R2=0.807, substantially outperforming related work using stage-damage curves. More detailed testing on 11 counties with significant claims in the NFIP dataset reveals that Extreme Gradient Boosting yields the most favorable results, and bias correction significantly improves the similarity between the predicted and reference claim amount distributions. Our experiments also show that, despite the already experienced climate change, the difference in future short-term risk predictions of flood-loss amounts between historical shifting or expanding training data windows is insignificant.

Assuntos

Inundações , Seguro , Mudança Climática , Previsões

15.

Treed Gaussian Process Regression for Solving Offline Data-Driven Continuous Multiobjective Optimization Problems.

Mazumdar, Atanu; López-Ibáñez, Manuel; Chugh, Tinkle; Hakanen, Jussi; Miettinen, Kaisa.

Evol Comput ; 31(4): 375-399, 2023 Dec 01.

Artigo em Inglês | MEDLINE | ID: mdl-37126577

RESUMO

For offline data-driven multiobjective optimization problems (MOPs), no new data is available during the optimization process. Approximation models (or surrogates) are first built using the provided offline data, and an optimizer, for example, a multiobjective evolutionary algorithm, can then be utilized to find Pareto optimal solutions to the problem with surrogates as objective functions. In contrast to online data-driven MOPs, these surrogates cannot be updated with new data and, hence, the approximation accuracy cannot be improved by considering new data during the optimization process. Gaussian process regression (GPR) models are widely used as surrogates because of their ability to provide uncertainty information. However, building GPRs becomes computationally expensive when the size of the dataset is large. Using sparse GPRs reduces the computational cost of building the surrogates. However, sparse GPRs are not tailored to solve offline data-driven MOPs, where good accuracy of the surrogates is needed near Pareto optimal solutions. Treed GPR (TGPR-MO) surrogates for offline data-driven MOPs with continuous decision variables are proposed in this paper. The proposed surrogates first split the decision space into subregions using regression trees and build GPRs sequentially in regions close to Pareto optimal solutions in the decision space to accurately approximate tradeoffs between the objective functions. TGPR-MO surrogates are computationally inexpensive because GPRs are built only in a smaller region of the decision space utilizing a subset of the data. The TGPR-MO surrogates were tested on distance-based visualizable problems with various data sizes, sampling strategies, numbers of objective functions, and decision variables. Experimental results showed that the TGPR-MO surrogates are computationally cheaper and can handle datasets of large size. Furthermore, TGPR-MO surrogates produced solutions closer to Pareto optimal solutions compared to full GPRs and sparse GPRs.

Assuntos

Algoritmos , Evolução Biológica , Distribuição Normal

16.

Non-parametric synergy modeling of chemical compounds with Gaussian processes.

Shapovalova, Yuliya; Heskes, Tom; Dijkstra, Tjeerd.

BMC Bioinformatics ; 23(1): 14, 2022 Jan 06.

Artigo em Inglês | MEDLINE | ID: mdl-34991440

RESUMO

BACKGROUND: Understanding the synergetic and antagonistic effects of combinations of drugs and toxins is vital for many applications, including treatment of multifactorial diseases and ecotoxicological monitoring. Synergy is usually assessed by comparing the response of drug combinations to a predicted non-interactive response from reference (null) models. Possible choices of null models are Loewe additivity, Bliss independence and the recently rediscovered Hand model. A different approach is taken by the MuSyC model, which directly fits a generalization of the Hill model to the data. All of these models, however, fit the dose-response relationship with a parametric model. RESULTS: We propose the Hand-GP model, a non-parametric model based on the combination of the Hand model with Gaussian processes. We introduce a new logarithmic squared exponential kernel for the Gaussian process which captures the logarithmic dependence of response on dose. From the monotherapeutic response and the Hand principle, we construct a null reference response and synergy is assessed from the difference between this null reference and the Gaussian process fitted response. Statistical significance of the difference is assessed from the confidence intervals of the Gaussian process fits. We evaluate performance of our model on a simulated data set from Greco, two simulated data sets of our own design and two benchmark data sets from Chou and Talalay. We compare the Hand-GP model to standard synergy models and show that our model performs better on these data sets. We also compare our model to the MuSyC model as an example of a recent method on these five data sets and on two-drug combination screens: Mott et al. anti-malarial screen and O'Neil et al. anti-cancer screen. We identify cases in which the HandGP model is preferred and cases in which the MuSyC model is preferred. CONCLUSION: The Hand-GP model is a flexible model to capture synergy. Its non-parametric and probabilistic nature allows it to model a wide variety of response patterns.

17.

A spatial Bayesian latent factor model for image-on-image regression.

Guo, Cui; Kang, Jian; Johnson, Timothy D.

Biometrics ; 78(1): 72-84, 2022 03.

Artigo em Inglês | MEDLINE | ID: mdl-33368210

RESUMO

Image-on-image regression analysis, using images to predict images, is a challenging task, due to (1) the high dimensionality and (2) the complex spatial dependence structures in image predictors and image outcomes. In this work, we propose a novel image-on-image regression model, by extending a spatial Bayesian latent factor model to image data, where low-dimensional latent factors are adopted to make connections between high-dimensional image outcomes and image predictors. We assign Gaussian process priors to the spatially varying regression coefficients in the model, which can well capture the complex spatial dependence among image outcomes as well as that among the image predictors. We perform simulation studies to evaluate the out-of-sample prediction performance of our method compared with linear regression and voxel-wise regression methods for different scenarios. The proposed method achieves better prediction accuracy by effectively accounting for the spatial dependence and efficiently reduces image dimensions with latent factors. We apply the proposed method to analysis of multimodal image data in the Human Connectome Project where we predict task-related contrast maps using subcortical volumetric seed maps.

Assuntos

Teorema de Bayes , Simulação por Computador , Humanos , Modelos Lineares , Distribuição Normal , Análise Espacial

18.

Learning Biological Dynamics From Spatio-Temporal Data by Gaussian Processes.

Han, Lifeng; He, Changhan; Dinh, Huy; Fricks, John; Kuang, Yang.

Bull Math Biol ; 84(7): 69, 2022 05 22.

Artigo em Inglês | MEDLINE | ID: mdl-35598223

RESUMO

Model discovery methods offer a promising way to understand biology from data. We propose a method to learn biological dynamics from spatio-temporal data by Gaussian processes. This approach is essentially "equation free" and hence avoids model derivation, which is often difficult due to high complexity of biological processes. By exploiting the local nature of biological processes, dynamics can be learned with data sparse in time. When the length scales (hyperparameters) of the squared exponential covariance function are tuned, they reveal key insights of the underlying process. The squared exponential covariance function also simplifies propagation of uncertainty in multi-step forecasting. After evaluating the performance of the method on synthetic data, we demonstrate a case study on real image data of E. coli colony.

Assuntos

Escherichia coli , Conceitos Matemáticos , Aprendizagem , Modelos Biológicos , Distribuição Normal

19.

Gaussian processes retrieval of crop traits in Google Earth Engine based on Sentinel-2 top-of-atmosphere data.

Estévez, José; Salinero-Delgado, Matías; Berger, Katja; Pipia, Luca; Rivera-Caicedo, Juan Pablo; Wocher, Matthias; Reyes-Muñoz, Pablo; Tagliabue, Giulia; Boschetti, Mirco; Verrelst, Jochem.

Remote Sens Environ ; 273: 112958, 2022 May.

Artigo em Inglês | MEDLINE | ID: mdl-36081832

RESUMO

The unprecedented availability of optical satellite data in cloud-based computing platforms, such as Google Earth Engine (GEE), opens new possibilities to develop crop trait retrieval models from the local to the planetary scale. Hybrid retrieval models are of interest to run in these platforms as they combine the advantages of physically- based radiative transfer models (RTM) with the flexibility of machine learning regression algorithms. Previous research with GEE primarily relied on processing bottom-of-atmosphere (BOA) reflectance data, which requires atmospheric correction. In the present study, we implemented hybrid models directly into GEE for processing Sentinel-2 (S2) Level-1C (L1C) top-of-atmosphere (TOA) reflectance data into crop traits. To achieve this, a training dataset was generated using the leaf-canopy RTM PROSAIL in combination with the atmospheric model 6SV. Gaussian process regression (GPR) retrieval models were then established for eight essential crop traits namely leaf chlorophyll content, leaf water content, leaf dry matter content, fractional vegetation cover, leaf area index (LAI), and upscaled leaf variables (i.e., canopy chlorophyll content, canopy water content and canopy dry matter content). An important pre-requisite for implementation into GEE is that the models are sufficiently light in order to facilitate efficient and fast processing. Successful reduction of the training dataset by 78% was achieved using the active learning technique Euclidean distance-based diversity (EBD). With the EBD-GPR models, highly accurate validation results of LAI and upscaled leaf variables were obtained against in situ field data from the validation study site Munich-North-Isar (MNI), with normalized root mean square errors (NRMSE) from 6% to 13%. Using an independent validation dataset of similar crop types (Italian Grosseto test site), the retrieval models showed moderate to good performances for canopy-level variables, with NRMSE ranging from 14% to 50%, but failed for the leaf-level estimates. Obtained maps over the MNI site were further compared against Sentinel-2 Level 2 Prototype Processor (SL2P) vegetation estimates generated from the ESA Sentinels' Application Platform (SNAP) Biophysical Processor, proving high consistency of both retrievals (R 2 from 0.80 to 0.94). Finally, thanks to the seamless GEE processing capability, the TOA-based mapping was applied over the entirety of Germany at 20 m spatial resolution including information about prediction uncertainty. The obtained maps provided confidence of the developed EBD-GPR retrieval models for integration in the GEE framework and national scale mapping from S2-L1C imagery. In summary, the proposed retrieval workflow demonstrates the possibility of routine processing of S2 TOA data into crop traits maps at any place on Earth as required for operational agricultural applications.

20.

Temporal Models for Demographic and Global Health Outcomes in Multiple Populations: Introducing a New Framework to Review and Standardise Documentation of Model Assumptions and Facilitate Model Comparison.

Susmann, Herbert; Alexander, Monica; Alkema, Leontine.

Int Stat Rev ; 90(3): 437-467, 2022 Dec.

Artigo em Inglês | MEDLINE | ID: mdl-36590075

RESUMO

There is growing interest in producing estimates of demographic and global health indicators in populations with limited data. Statistical models are needed to combine data from multiple data sources into estimates and projections with uncertainty. Diverse modelling approaches have been applied to this problem, making comparisons between models difficult. We propose a model class, Temporal Models for Multiple Populations (TMMPs), to facilitate both documentation of model assumptions in a standardised way and comparison across models. The class makes a distinction between the process model, which describes latent trends in the indicator interest, and the data model, which describes the data generating process of the observed data. We provide a general notation for the process model that encompasses many popular temporal modelling techniques, and we show how existing models for a variety of indicators can be written using this notation. We end with a discussion of outstanding questions and future directions.

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA