1.
Proc Natl Acad Sci U S A ; 120(47): e2307935120, 2023 Nov 21.
Article in English | MEDLINE | ID: mdl-37963253

ABSTRACT

Stochastic processes on graphs can describe a great variety of phenomena ranging from neural activity to epidemic spreading. While many existing methods can accurately describe typical realizations of such processes, computing properties of extremely rare events is a hard task, particularly so in the case of recurrent models, in which variables may return to a previously visited state. Here, we build on the matrix product cavity method, extending it fundamentally in two directions: First, we show how it can be applied to Markov processes biased by arbitrary reweighting factors that concentrate most of the probability mass on rare events. Second, we introduce an efficient scheme to reduce the computational cost of a single node update from exponential to polynomial in the node degree. Two applications are considered: inference of infection probabilities from sparse observations within the SIRS epidemic model and the computation of both typical observables and large deviations of several kinetic Ising models.

2.
Sci Rep ; 13(1): 7350, 2023 May 05.
Article in English | MEDLINE | ID: mdl-37147382

ABSTRACT

Estimating observables from conditioned dynamics is typically computationally hard. While obtaining independent samples efficiently from unconditioned dynamics is usually feasible, most of them do not satisfy the imposed conditions and must be discarded. On the other hand, conditioning breaks the causal properties of the dynamics, which ultimately renders the sampling of the conditioned dynamics non-trivial and inefficient. In this work, a Causal Variational Approach is proposed, as an approximate method to generate independent samples from a conditioned distribution. The procedure relies on learning the parameters of a generalized dynamical model that optimally describes the conditioned distribution in a variational sense. The outcome is an effective and unconditioned dynamical model from which one can trivially obtain independent samples, effectively restoring the causality of the conditioned dynamics. The consequences are twofold: the method allows one to efficiently compute observables from the conditioned dynamics by averaging over independent samples; moreover, it provides an effective unconditioned distribution that is easy to interpret. This approximation can be applied virtually to any dynamics. The application of the method to epidemic inference is discussed in detail. The results of direct comparison with state-of-the-art inference methods, including the soft-margin approach and mean-field methods, are promising.
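The inefficiency of naive sampling described in this abstract can be illustrated with a minimal rejection-sampling sketch. It is not the Causal Variational Approach itself: the discrete-time SI model, the uniform prior on the source, and all function names (simulate_si, rejection_sample) are assumptions chosen for illustration.

```python
import random

def simulate_si(neighbors, source, lam, steps, rng):
    """Run a discrete-time SI epidemic and return the set of infected nodes."""
    infected = {source}
    for _ in range(steps):
        newly = set()
        for i in infected:
            for j in neighbors[i]:
                # each infected node infects each susceptible neighbor w.p. lam
                if j not in infected and rng.random() < lam:
                    newly.add(j)
        infected |= newly
    return infected

def rejection_sample(neighbors, lam, steps, obs_infected, obs_susceptible,
                     n_trials, seed=0):
    """Sample unconditioned epidemics; keep sources of runs matching the observations."""
    rng = random.Random(seed)
    nodes = list(neighbors)
    accepted = []
    for _ in range(n_trials):
        source = rng.choice(nodes)  # assumed uniform prior on the source
        final = simulate_si(neighbors, source, lam, steps, rng)
        if obs_infected <= final and not (obs_susceptible & final):
            accepted.append(source)
    return accepted

# Path graph 0-1-2-3-4; node 0 observed infected, node 4 susceptible after 2 steps.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
kept = rejection_sample(path, 0.5, 2, {0}, {4}, n_trials=2000)
print(f"accepted {len(kept)} of 2000 unconditioned samples")
```

Every rejected run is wasted work, and the waste grows as observations become more restrictive; this is the cost that an effective unconditioned model is meant to avoid.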

3.
Phys Rev E ; 108(6-1): 064302, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38243547

ABSTRACT

We investigate the information-theoretical limits of inference tasks in epidemic spreading on graphs in the thermodynamic limit. The typical inference tasks consist in computing observables of the posterior distribution of the epidemic model given observations taken from a ground-truth (sometimes called planted) random trajectory. We can identify two main sources of quenched disorder: the graph ensemble and the planted trajectory. The epidemic dynamics, however, induces nontrivial long-range correlations among individuals' states on the latter. This results in nonlocal correlated quenched disorder, which is typically hard to handle. To overcome this difficulty, we divide the dynamical process into two sets of variables: a set of stochastic independent variables (representing transmission delays), plus a set of correlated variables (the infection times) that depend deterministically on the first. Treating the former as quenched variables and the latter as dynamic ones, computing the disorder average becomes feasible by means of the replica-symmetric cavity method. We give theoretical predictions on the posterior probability distribution of the trajectory of each individual, conditioned on observations of the state of individuals at given times, focusing on the susceptible-infectious (SI) model. In the Bayes-optimal condition, i.e., when the true dynamic parameters are known, the inference task is expected to fall in the replica-symmetric regime. We indeed provide predictions for the information-theoretic limits of various inference tasks, in the form of phase diagrams. We also identify a region, in the Bayes-optimal setting, with strong hints of replica-symmetry breaking. When the true parameters are unknown, we show how a maximum-likelihood procedure is able to recover them with mostly unaffected performance.


Subject(s)
Epidemics , Humans , Bayes Theorem , Probability , Disease Susceptibility , Models, Statistical
4.
Phys Rev E ; 106(5-1): 054101, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36559409

ABSTRACT

We consider a high-dimensional random constrained optimization problem in which a set of binary variables is subjected to a linear system of equations. The cost function is a simple linear cost, measuring the Hamming distance with respect to a reference configuration. Despite its apparent simplicity, this problem exhibits a rich phenomenology. We show that different situations arise depending on the random ensemble of linear systems. When each variable is involved in at most two linear constraints, we show that the problem can be partially solved analytically; in particular, we show that upon convergence, the zero-temperature limit of the cavity equations returns the optimal solution. We then study the geometrical properties of more general random ensembles. In particular, we observe a range in the density of constraints at which the system enters a glassy phase where the cost function has many minima. Interestingly, the algorithmic performance is only sensitive to another phase transition affecting the structure of configurations allowed by the linear constraints. We also extend our results to variables belonging to GF(q), the Galois field of order q. We show that increasing the value of q makes it possible to achieve a better optimum, which is confirmed by the replica-symmetric cavity method predictions.
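For concreteness, the optimization problem stated in this abstract (binary case) can be written down as a tiny exhaustive search. This is only a sketch of the problem definition, assuming arithmetic modulo 2; it is exponential in the number of variables and is not the cavity-method algorithm studied in the paper. The name closest_solution is hypothetical.

```python
from itertools import product

def closest_solution(A, b, reference):
    """Exhaustively find x in {0,1}^n with A x = b (mod 2) minimizing the
    Hamming distance to `reference`. Returns (x, distance), or (None, None)
    when the linear system has no solution."""
    n = len(reference)
    best, best_cost = None, None
    for x in product((0, 1), repeat=n):
        satisfied = all(
            sum(a * xi for a, xi in zip(row, x)) % 2 == rhs
            for row, rhs in zip(A, b)
        )
        if not satisfied:
            continue
        cost = sum(xi != ri for xi, ri in zip(x, reference))
        if best_cost is None or cost < best_cost:
            best, best_cost = x, cost
    return best, best_cost

# Constraints: x0 + x1 = 1 and x1 + x2 = 0 (mod 2); reference is all zeros.
A = [[1, 1, 0], [0, 1, 1]]
b = [1, 0]
print(closest_solution(A, b, (0, 0, 0)))  # → ((1, 0, 0), 1)
```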

5.
Sci Rep ; 12(1): 19673, 2022 Nov 16.
Article in English | MEDLINE | ID: mdl-36385141

ABSTRACT

The reconstruction of missing information in epidemic spreading on contact networks can be essential for prevention and containment strategies. The identification and warning of infectious but asymptomatic individuals (i.e., contact tracing), the well-known patient-zero problem, and the inference of infectivity values in structured populations are examples of significant epidemic inference problems. As the number of possible epidemic cascades grows exponentially with the number of individuals involved and only an almost negligible subset of them is compatible with the observations (e.g., medical tests), epidemic inference in contact networks poses formidable computational challenges. We present a new generative neural network framework that learns to generate the most probable infection cascades compatible with observations. The proposed method achieves better (in some cases, significantly better) or comparable results with respect to existing methods in all problems considered, in both synthetic and real contact networks. Given its generality and its clear Bayesian and variational nature, the presented framework paves the way to solving fundamental epidemic inference problems with high precision in small and medium-sized real-case scenarios such as the spread of infections in workplaces and hospitals.


Subject(s)
Epidemics , Humans , Bayes Theorem , Epidemics/prevention & control , Contact Tracing , Neural Networks, Computer
6.
Biophys J ; 121(10): 1919-1930, 2022 May 17.
Article in English | MEDLINE | ID: mdl-35422414

ABSTRACT

Despite major environmental and genetic differences, microbial metabolic networks are known to generate consistent physiological outcomes across vastly different organisms. This remarkable robustness suggests that, at least in bacteria, metabolic activity may be guided by universal principles. The constrained optimization of evolutionarily motivated objective functions, such as the growth rate, has emerged as the key theoretical assumption for the study of bacterial metabolism. While conceptually and practically useful in many situations, the idea that certain functions are optimized is hard to validate in data. Moreover, it is not always clear how optimality can be reconciled with the high degree of single-cell variability observed in experiments within microbial populations. To shed light on these issues, we develop an inverse modeling framework that connects the fitness of a population of cells (represented by the mean single-cell growth rate) to the underlying metabolic variability through the maximum entropy inference of the distribution of metabolic phenotypes from data. While no clear objective function emerges, we find that, as the medium gets richer, the fitness and inferred variability for Escherichia coli populations follow and slowly approach the theoretically optimal bound defined by minimal reduction of variability at given fitness. These results suggest that bacterial metabolism may be crucially shaped by a population-level trade-off between growth and heterogeneity.


Subject(s)
Escherichia coli , Metabolic Networks and Pathways , Bacteria/metabolism , Entropy , Escherichia coli/metabolism , Phenotype
7.
Proc Natl Acad Sci U S A ; 118(32), 2021 Aug 10.
Article in English | MEDLINE | ID: mdl-34312253

ABSTRACT

Contact tracing is an essential tool to mitigate the impact of a pandemic, such as the COVID-19 pandemic. In order to achieve efficient and scalable contact tracing in real time, digital devices can play an important role. While a lot of attention has been paid to analyzing the privacy and ethical risks of the associated mobile applications, so far much less research has been devoted to optimizing their performance and assessing their impact on the mitigation of the epidemic. We develop Bayesian inference methods to estimate the risk that an individual is infected. This inference is based on the list of his recent contacts and their own risk levels, as well as personal information such as results of tests or presence of syndromes. We propose to use probabilistic risk estimation to optimize testing and quarantining strategies for the control of an epidemic. Our results show that in some range of epidemic spreading (typically when the manual tracing of all contacts of infected people becomes practically impossible but before the fraction of infected people reaches the scale where a lockdown becomes unavoidable), this inference of individuals at risk could be an efficient way to mitigate the epidemic. Our approaches translate into fully distributed algorithms that only require communication between individuals who have recently been in contact. Such communication may be encrypted and anonymized, and thus, it is compatible with privacy-preserving standards. We conclude that probabilistic risk estimation is capable of enhancing the performance of digital contact tracing and should be considered in the mobile applications.


Subject(s)
Contact Tracing/methods , Epidemics/prevention & control , Algorithms , Bayes Theorem , COVID-19/epidemiology , COVID-19/prevention & control , Contact Tracing/statistics & numerical data , Humans , Mobile Applications , Privacy , Risk Assessment , SARS-CoV-2
8.
Phys Rev E ; 103(4-1): 043301, 2021 Apr.
Article in English | MEDLINE | ID: mdl-34005851

ABSTRACT

Efficient feature selection from high-dimensional datasets is a very important challenge in many data-driven fields of science and engineering. We introduce a statistical mechanics inspired strategy that addresses the problem of sparse feature selection in the context of binary classification by leveraging a computational scheme known as expectation propagation (EP). The algorithm is used in order to train a continuous-weights perceptron learning a classification rule from a set of (possibly partly mislabeled) examples provided by a teacher perceptron with diluted continuous weights. We test the method in the Bayes optimal setting under a variety of conditions and compare it to other state-of-the-art algorithms based on message passing and on expectation maximization approximate inference schemes. Overall, our simulations show that EP is a robust and competitive algorithm in terms of variable selection properties, estimation accuracy, and computational complexity, especially when the student perceptron is trained from correlated patterns that prevent other iterative methods from converging. Furthermore, our numerical tests demonstrate that the algorithm is capable of learning online the unknown values of prior parameters, such as the dilution level of the weights of the teacher perceptron and the fraction of mislabeled examples, quite accurately. This is achieved by means of a simple maximum likelihood strategy that consists in minimizing the free energy associated with the EP algorithm.

9.
Phys Rev E ; 100(3-1): 032134, 2019 Sep.
Article in English | MEDLINE | ID: mdl-31639925

ABSTRACT

The problem of efficiently reconstructing tomographic images can be mapped into a Bayesian inference problem over the space of pixel densities. Solutions to this problem are given by pixel assignments that are compatible with tomographic measurements and maximize a posterior probability density. This maximization can be performed with standard local optimization tools when the log-posterior is a convex function, but it is generally intractable when introducing realistic nonconcave priors that reflect typical image features such as smoothness or sharpness. We introduce a new method to reconstruct images obtained from Radon projections by using expectation propagation, which allows us to approximate the intractable posterior. We show, by means of extensive simulations, that, compared to state-of-the-art algorithms for this task, expectation propagation paired with very simple but non-log-concave priors is often able to reconstruct images up to a smaller error while using a lower amount of information per pixel. We provide estimates for the critical rate of information per pixel above which recovery is error-free by means of simulations on ensembles of phantom and real images.

10.
Phys Rev Lett ; 123(2): 020604, 2019 Jul 12.
Article in English | MEDLINE | ID: mdl-31386499

ABSTRACT

Computing marginal distributions of discrete or semidiscrete Markov random fields (MRFs) is a fundamental, generally intractable problem with a vast number of applications in virtually all fields of science. We present a new family of computational schemes to approximately calculate the marginals of discrete MRFs. This method shares some desirable properties with belief propagation, in particular, providing exact marginals on acyclic graphs, but it differs from the latter in that it includes some loop corrections; i.e., it takes into account correlations coming from all cycles in the factor graph. It is also similar to the adaptive Thouless-Anderson-Palmer method, but it differs from the latter in that the consistency is not on the first two moments of the distribution but rather on the value of its density on a subset of values. The results on finite-dimensional Ising-like models show a significant improvement with respect to the Bethe-Peierls (tree) approximation in all cases and with respect to the plaquette cluster variational method approximation in many cases. In particular, for the critical inverse temperature β_{c} of the homogeneous hypercubic lattice, the expansion of (dβ_{c})^{-1} around d=∞ of the proposed scheme is exact up to d^{-4} order, whereas the latter two are exact only up to d^{-2} order.
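The property this abstract attributes to belief propagation, exact marginals on acyclic graphs, can be checked numerically on the simplest tree, an open Ising chain. The sketch below covers only that baseline property, not the loop-corrected scheme proposed in the paper; the weight convention exp(beta * (sum_i h_i s_i + sum_i J_i s_i s_{i+1})) and both function names are assumptions.

```python
import math
from itertools import product

SPINS = (-1, 1)

def bp_chain_marginals(h, J, beta=1.0):
    """P(s_i = +1) on an open Ising chain via forward/backward messages."""
    n = len(h)
    # fwd[i][s]: weight of everything left of site i, given s_i = s
    fwd = [{s: 1.0 for s in SPINS} for _ in range(n)]
    # bwd[i][s]: weight of everything right of site i, given s_i = s
    bwd = [{s: 1.0 for s in SPINS} for _ in range(n)]
    for i in range(1, n):
        for s in SPINS:
            fwd[i][s] = sum(
                fwd[i - 1][t] * math.exp(beta * (h[i - 1] * t + J[i - 1] * t * s))
                for t in SPINS
            )
    for i in range(n - 2, -1, -1):
        for s in SPINS:
            bwd[i][s] = sum(
                bwd[i + 1][t] * math.exp(beta * (h[i + 1] * t + J[i] * t * s))
                for t in SPINS
            )
    marginals = []
    for i in range(n):
        w = {s: fwd[i][s] * bwd[i][s] * math.exp(beta * h[i] * s) for s in SPINS}
        marginals.append(w[1] / (w[1] + w[-1]))
    return marginals

def brute_force_marginals(h, J, beta=1.0):
    """Same marginals by summing over all 2^n configurations."""
    n = len(h)
    z, up = 0.0, [0.0] * n
    for conf in product(SPINS, repeat=n):
        exponent = sum(h[i] * conf[i] for i in range(n))
        exponent += sum(J[i] * conf[i] * conf[i + 1] for i in range(n - 1))
        w = math.exp(beta * exponent)
        z += w
        for i in range(n):
            if conf[i] == 1:
                up[i] += w
    return [u / z for u in up]

h, J = [0.3, -0.2, 0.1, 0.0], [1.0, -0.5, 0.7]
print(bp_chain_marginals(h, J))
print(brute_force_marginals(h, J))
```

The two printed lists agree up to floating-point rounding. On graphs with cycles, message passing of this kind is only approximate, which is the gap that loop corrections aim to narrow.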

11.
J R Soc Interface ; 16(151): 20180844, 2019 Feb 28.
Article in English | MEDLINE | ID: mdl-30958195

ABSTRACT

Accessing the network through which a propagation dynamics diffuses is essential for understanding and controlling it. In a few cases, such information is available through direct experiments or thanks to the very nature of propagation data. In the majority of cases, however, available information about the network is indirect and comes from partial observations of the dynamics, rendering the network reconstruction a fundamental inverse problem. Here we show that it is possible to reconstruct the whole structure of an interaction network and to simultaneously infer the complete time course of activation spreading, relying just on single-epoch (i.e. snapshot) or time-scattered observations of a small number of activity cascades. The method that we present is built on a belief propagation approximation, which has shown impressive accuracy in a wide variety of relevant cases, and is able to infer interactions in the presence of incomplete time-series data by providing a detailed modelling of the posterior distribution of trajectories conditioned on the observations. Furthermore, we show by experiments that the information content of full cascades is relatively smaller than that of sparse observations or single snapshots.


Subject(s)
Algorithms , Computational Biology , Infections/epidemiology , Models, Biological
12.
Nat Commun ; 8: 14915, 2017 Apr 06.
Article in English | MEDLINE | ID: mdl-28382977

ABSTRACT

Assuming a steady-state condition within a cell, metabolic fluxes satisfy an underdetermined linear system of stoichiometric equations. Characterizing the space of fluxes that satisfy such equations along with given bounds (and possibly additional relevant constraints) is considered of utmost importance for the understanding of cellular metabolism. Extreme values for each individual flux can be computed with linear programming (as flux balance analysis), and their marginal distributions can be approximately computed with Monte Carlo sampling. Here we present an approximate analytic method for the latter task based on expectation propagation equations that does not involve sampling and can achieve much better predictions than other existing analytic methods. The method is iterative, and its computation time is dominated by one matrix inversion per iteration. With respect to sampling, we show through extensive simulation that it has some advantages including computation time, and the ability to efficiently fix empirically estimated distributions of fluxes.


Subject(s)
Escherichia coli/metabolism , Metabolic Flux Analysis , Metabolic Networks and Pathways , Programming, Linear , Monte Carlo Method
13.
PLoS One ; 12(4): e0176376, 2017.
Article in English | MEDLINE | ID: mdl-28445537

ABSTRACT

The massive employment of computational models in network epidemiology calls for the development of improved inference methods for epidemic forecasting. For simple compartment models, such as the Susceptible-Infected-Recovered model, Belief Propagation has proved to be a reliable and efficient method to identify the origin of an observed epidemic. Here we show that the same method can be applied to predict the future evolution of an epidemic outbreak from partial observations at the early stage of the dynamics. The results obtained using Belief Propagation are compared with Monte Carlo direct sampling in the case of the SIR model on random (regular and power-law) graphs for different observation methods and on an example of a real-world contact network. Belief Propagation gives in general a better prediction than direct sampling, although the quality of the prediction depends on the quantity under study (e.g. marginals of individual states, epidemic size, extinction-time distribution) and on the actual number of observed nodes that are infected before the observation time.


Subject(s)
Models, Theoretical , Area Under Curve , Bayes Theorem , Communicable Diseases/epidemiology , Epidemics , Humans , Monte Carlo Method , ROC Curve
14.
Proc Natl Acad Sci U S A ; 113(44): 12368-12373, 2016 Nov 01.
Article in English | MEDLINE | ID: mdl-27791075

ABSTRACT

We study the network dismantling problem, which consists of determining a minimal set of vertices whose removal leaves the network broken into connected components of subextensive size. For a large class of random graphs, this problem is tightly connected to the decycling problem (the removal of vertices, leaving the graph acyclic). Exploiting this connection and recent works on epidemic spreading, we present precise predictions for the minimal size of a dismantling set in a large random graph with a prescribed (light-tailed) degree distribution. Building on the statistical mechanics perspective, we propose a three-stage Min-Sum algorithm for efficiently dismantling networks, including heavy-tailed ones for which the dismantling and decycling problems are not equivalent. We also provide additional insights into the dismantling problem, concluding that it is an intrinsically collective problem and that optimal dismantling sets cannot be viewed as a collection of individually well-performing nodes.
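The quantity that defines a dismantling set, the size of the largest connected component left after removing a set of vertices, is easy to evaluate for any candidate set. The sketch below only scores a given removal set; it does not search for a minimal one and is unrelated to the Min-Sum algorithm of the paper. The function name and the edge-list representation with vertices labelled 0..n-1 are assumptions.

```python
def largest_component_after_removal(n, edges, removed):
    """Size of the largest connected component once `removed` vertices are deleted."""
    removed = set(removed)
    adj = {v: [] for v in range(n) if v not in removed}
    for u, v in edges:
        if u in adj and v in adj:
            adj[u].append(v)
            adj[v].append(u)
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        # depth-first traversal of the component containing `start`
        stack, size = [start], 0
        seen.add(start)
        while stack:
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best

# Star graph: hub 0 connected to leaves 1..5.
star = [(0, leaf) for leaf in range(1, 6)]
print(largest_component_after_removal(6, star, {0}))  # → 1
print(largest_component_after_removal(6, star, {3}))  # → 5
```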

15.
Sci Rep ; 6: 27538, 2016 Jun 10.
Article in English | MEDLINE | ID: mdl-27283451

ABSTRACT

Investigating the past history of an epidemic outbreak is a paramount problem in epidemiology. Based on observations about the state of individuals, on knowledge of the network of contacts, and on a mathematical model for the epidemic process, the problem consists in describing some features of the posterior distribution of unobserved past events, such as the source, potential transmissions, and undetected positive cases. Several methods have been proposed for the study of these inference problems on discrete-time, synchronous epidemic models on networks, including naive Bayes, centrality measures, accelerated Monte Carlo approaches, and Belief Propagation. However, most traced real networks consist of short-time contacts in continuous time. One possibility that has been adopted is to discretize the timeline into identical intervals, a method that becomes more and more precise as the length of the intervals vanishes. Unfortunately, the computational time of the inference methods increases with the number of intervals, often rendering a sufficiently precise inference procedure impractical. We show here an extension of the Belief Propagation method that is able to deal with a model of continuous-time events, without resorting to time discretization. We also investigate the effect of time discretization on the quality of the inference.
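The trade-off described here, where discretization error shrinks with the interval length while the number of intervals grows, can be seen in a toy calculation. The setup is an assumption made for illustration and is not taken from the paper: a single contact of given duration, a constant transmission rate, and a discretization that rounds the contact up to a whole number of intervals.

```python
import math

def exact_transmission_prob(rate, duration):
    """Continuous-time probability of transmission during one contact."""
    return 1.0 - math.exp(-rate * duration)

def discretized_transmission_prob(rate, duration, dt):
    """Same probability when the contact is rounded up to whole intervals of length dt."""
    n_intervals = math.ceil(duration / dt)
    p_step = 1.0 - math.exp(-rate * dt)  # per-interval transmission probability
    return 1.0 - (1.0 - p_step) ** n_intervals

def discretization_error(rate, duration, dt):
    return abs(discretized_transmission_prob(rate, duration, dt)
               - exact_transmission_prob(rate, duration))

for dt in (0.5, 0.1, 0.01):
    steps = math.ceil(0.37 / dt)
    print(dt, steps, discretization_error(1.0, 0.37, dt))
```

In this toy setting the error falls as dt decreases while the number of intervals per contact rises, which mirrors the cost that a continuous-time formulation avoids.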


Subject(s)
Computational Biology , Disease Outbreaks , Epidemics , Algorithms , Bayes Theorem , Gene Regulatory Networks , Humans , Models, Theoretical , Monte Carlo Method
16.
PLoS One ; 10(12): e0145222, 2015.
Article in English | MEDLINE | ID: mdl-26710102

ABSTRACT

We present a message-passing algorithm to solve a series of edge-disjoint path problems on graphs based on the zero-temperature cavity equations. Edge-disjoint path problems are important in the general context of routing, which can be defined by incorporating under a unique framework both traffic optimization and total path length minimization. The computation of the cavity equations can be performed efficiently by exploiting a mapping of a generalized edge-disjoint path problem on a star graph onto a weighted maximum matching problem. We perform extensive numerical simulations on random graphs of various types to test the performance both in terms of path length minimization and maximization of the number of accommodated paths. In addition, we test the performance on benchmark instances on various graphs by comparison with state-of-the-art algorithms and results found in the literature. Our message-passing algorithm always outperforms the others in terms of the number of accommodated paths when considering nontrivial instances (otherwise it gives the same trivial results). Remarkably, the largest improvement in performance with respect to the other methods employed is found in the case of benchmarks with meshes, where the validity hypothesis behind message-passing is expected to worsen. In these cases, even though the exact message-passing equations do not converge, by introducing a reinforcement parameter to force convergence towards a suboptimal solution, we were able to always outperform the other algorithms with a peak of 27% performance improvement in terms of accommodated paths. On random graphs, we numerically observe two separated regimes: one in which all paths can be accommodated and one in which this is not possible. We also investigate the behavior of both the number of paths to be accommodated and their minimum total length.


Subject(s)
Algorithms , Artificial Intelligence , Computer Simulation , Automobiles , Computer Communication Networks , Computer Graphics , Travel
17.
PLoS One ; 10(7): e0119286, 2015.
Article in English | MEDLINE | ID: mdl-26177449

ABSTRACT

We study a class of games which models the competition among agents to access some service provided by distributed service units and which exhibits congestion and frustration phenomena when service units have limited capacity. We propose a technique, based on the cavity method of statistical physics, to characterize the full spectrum of Nash equilibria of the game. The analysis reveals a large variety of equilibria, with very different statistical properties. Natural selfish dynamics, such as best-response, usually tend to large-utility equilibria, even though those of smaller utility are exponentially more numerous. Interestingly, the latter actually can be reached by selecting the initial conditions of the best-response dynamics close to the saturation limit of the service unit capacities. We also study a more realistic stochastic variant of the game by means of a simple and effective approximation of the average over the random parameters, showing that the properties of the average-case Nash equilibria are qualitatively similar to the deterministic ones.


Subject(s)
Game Theory , Entropy , Models, Theoretical , Stochastic Processes
18.
Phys Rev Lett ; 112(11): 118701, 2014 Mar 21.
Article in English | MEDLINE | ID: mdl-24702425

ABSTRACT

We study several Bayesian inference problems for irreversible stochastic epidemic models on networks from a statistical physics viewpoint. We derive equations which allow us to accurately compute the posterior distribution of the time evolution of the state of each node given some observations. Unlike most existing methods, we allow very general observation models, including unobserved nodes, state observations made at different or unknown times, and observations of infection times, possibly mixed together. Our method, which is based on the belief propagation algorithm, is efficient, naturally distributed, and exact on trees. As a particular case, we consider the problem of finding the "zero patient" of a susceptible-infected-recovered or susceptible-infected epidemic given a snapshot of the state of the network at a later unknown time. Numerical simulations show that our method outperforms previous ones on both synthetic and real networks, often by a very large margin.


Subject(s)
Bayes Theorem , Contact Tracing/methods , Epidemiologic Methods , Models, Statistical , Stochastic Processes
19.
Pac Symp Biocomput ; : 39-50, 2014.
Article in English | MEDLINE | ID: mdl-24297532

ABSTRACT

Advances in experimental techniques resulted in abundant genomic, transcriptomic, epigenomic, and proteomic data that have the potential to reveal critical drivers of human diseases. Complementary algorithmic developments enable researchers to map these data onto protein-protein interaction networks and infer which signaling pathways are perturbed by a disease. Despite this progress, integrating data across different biological samples or patients remains a substantial challenge because samples from the same disease can be extremely heterogeneous. Somatic mutations in cancer are an infamous example of this heterogeneity. Although the same signaling pathways may be disrupted in a cancer patient cohort, the distribution of mutations is long-tailed, and many driver mutations may only be detected in a small fraction of patients. We developed a computational approach to account for heterogeneous data when inferring signaling pathways by sharing information across the samples. Our technique builds upon the prize-collecting Steiner forest problem, a network optimization algorithm that extracts pathways from a protein-protein interaction network. We recover signaling pathways that are similar across all samples yet still reflect the unique characteristics of each biological sample. Leveraging data from related tumors improves our ability to recover the disrupted pathways and reveals patient-specific pathway perturbations in breast cancer.


Subject(s)
Algorithms , Neoplasms/genetics , Neoplasms/metabolism , Protein Interaction Maps , Signal Transduction , Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Computational Biology , Databases, Genetic , ErbB Receptors/genetics , ErbB Receptors/metabolism , Female , Humans , Models, Biological , Mutation
20.
PLoS Comput Biol ; 9(12): e1003290, 2013.
Article in English | MEDLINE | ID: mdl-24367245

ABSTRACT

We present a powerful experimental-computational technology for inferring network models that predict the response of cells to perturbations, and that may be useful in the design of combinatorial therapy against cancer. The experiments are systematic series of perturbations of cancer cell lines by targeted drugs, singly or in combination. The response to perturbation is quantified in terms of relative changes in the measured levels of proteins, phospho-proteins and cellular phenotypes such as viability. Computational network models are derived de novo, i.e., without prior knowledge of signaling pathways, and are based on simple non-linear differential equations. The prohibitively large solution space of all possible network models is explored efficiently using a probabilistic algorithm, Belief Propagation (BP), which is three orders of magnitude faster than standard Monte Carlo methods. Explicit executable models are derived for a set of perturbation experiments in SKMEL-133 melanoma cell lines, which are resistant to the therapeutically important inhibitor of RAF kinase. The resulting network models reproduce and extend known pathway biology. They empower potential discoveries of new molecular interactions and predict efficacious novel drug perturbations, such as the inhibition of PLK1, which is verified experimentally. This technology is suitable for application to larger systems in diverse areas of molecular biology.


Subject(s)
Models, Biological , Signal Transduction , Systems Biology , Cell Line, Tumor , Humans , Monte Carlo Method , Probability