ABSTRACT
Recent years have witnessed the development of powerful generative models based on flows, diffusion, or autoregressive neural networks, which achieve remarkable success in generating data from examples, with applications in a broad range of areas. A theoretical analysis of the performance of these methods and an understanding of their limitations remain, however, challenging. In this paper, we take a step in this direction by analyzing the efficiency of sampling by these methods on a class of problems with a known probability distribution and comparing it with the sampling performance of more traditional methods such as Markov chain Monte Carlo and Langevin dynamics. We focus on a class of probability distributions widely studied in the statistical physics of disordered systems that relate to spin glasses, statistical inference, and constraint satisfaction problems. We leverage the fact that sampling via flow-based, diffusion-based, or autoregressive network methods can be equivalently mapped to the analysis of a Bayes-optimal denoising of a modified probability measure. Our findings demonstrate that these methods encounter difficulties in sampling stemming from the presence of a first-order phase transition along the algorithm's denoising path. Our conclusions go both ways: we identify regions of parameters where these methods are unable to sample efficiently while standard Monte Carlo or Langevin approaches can, and we also identify regions where the opposite happens: the standard approaches are inefficient while the discussed generative methods work well.
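To make the denoising mapping concrete, here is a minimal, self-contained sketch (our illustration, not the paper's algorithm) of score-based sampling for a toy one-dimensional Gaussian mixture, where the Bayes-optimal denoiser, and hence the exact score, is available in closed form. In a trained diffusion model the `score` function below would be a neural network; all parameters are illustrative:

```python
import numpy as np

# Annealed Langevin sampling of a two-mode Gaussian mixture using the
# exact score of the noise-smoothed density.  The score is obtained from
# the Bayes-optimal denoiser via Tweedie's relation, which is precisely
# the denoising viewpoint the abstract exploits.
rng = np.random.default_rng(0)
mu, sigma2 = 3.0, 0.5                 # mixture means +-mu, base variance

def score(x, s2):
    """d/dx log p_s(x) for the mixture smoothed with noise variance s2."""
    v = sigma2 + s2
    w_plus = np.exp(-(x - mu) ** 2 / (2 * v))
    w_minus = np.exp(-(x + mu) ** 2 / (2 * v))
    post = w_plus / (w_plus + w_minus)        # posterior weight of +mu mode
    mean = post * mu + (1 - post) * (-mu)     # Bayes-optimal denoiser E[x0|x]
    return (mean - x) / v                     # Tweedie: score from denoiser

x = rng.normal(0.0, np.sqrt(sigma2 + 10.0), size=10000)  # start from wide noise
for s2 in np.geomspace(10.0, 1e-3, 30):       # anneal the noise level down
    eps = 0.1 * s2
    for _ in range(20):                       # Langevin steps at this level
        x += eps * score(x, s2) + np.sqrt(2 * eps) * rng.normal(size=x.shape)

print("fraction in +mu mode:", np.mean(x > 0))  # ~0.5 by symmetry
```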
ABSTRACT
The advent of comprehensive synaptic wiring diagrams of large neural circuits has created the field of connectomics and given rise to a number of open research questions. One such question is whether it is possible to reconstruct the information stored in a recurrent network of neurons, given its synaptic connectivity matrix. Here, we address this question by determining when solving such an inference problem is theoretically possible in specific attractor network models and by providing a practical algorithm to do so. The algorithm builds on ideas from statistical physics to perform approximate Bayesian inference and is amenable to exact analysis. We study its performance on three different models, compare the algorithm to standard algorithms such as PCA, and explore the limitations of reconstructing stored patterns from synaptic connectivity.
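As a hedged illustration of the kind of inference the abstract describes (a baseline comparison along the lines of the PCA benchmark, not the paper's message-passing algorithm), the sketch below stores random binary patterns in a Hopfield-style Hebbian connectivity matrix and recovers them with an eigendecomposition; the network size and pattern count are assumptions:

```python
import numpy as np

# For a Hopfield-style attractor network with Hebbian couplings
# J = (1/N) * sum_mu xi_mu xi_mu^T, the stored binary patterns dominate
# the top eigenspace of J, so a PCA-like baseline recovers them up to
# sign and mild mixing.
rng = np.random.default_rng(1)
N, P = 2000, 3                                  # neurons, stored patterns
xi = rng.choice([-1, 1], size=(P, N))           # hypothetical stored patterns
J = (xi.T @ xi) / N                             # Hebbian connectivity matrix
np.fill_diagonal(J, 0.0)                        # no self-couplings

eigvals, eigvecs = np.linalg.eigh(J)
top = eigvecs[:, -P:]                           # top-P principal components
for mu in range(P):
    overlaps = np.abs(np.sign(top).T @ xi[mu]) / N
    print(f"pattern {mu}: best overlap {overlaps.max():.2f}")
```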
Subject(s)
Neural Networks, Computer; Neurons; Bayes Theorem; Neurons/physiology; Algorithms; Models, Neurological
ABSTRACT
Contact tracing is an essential tool to mitigate the impact of a pandemic, such as the COVID-19 pandemic. To achieve efficient and scalable contact tracing in real time, digital devices can play an important role. While a lot of attention has been paid to analyzing the privacy and ethical risks of the associated mobile applications, so far much less research has been devoted to optimizing their performance and assessing their impact on the mitigation of the epidemic. We develop Bayesian inference methods to estimate the risk that an individual is infected. This inference is based on the list of their recent contacts and those contacts' own risk levels, as well as personal information such as results of tests or the presence of symptoms. We propose to use probabilistic risk estimation to optimize testing and quarantining strategies for the control of an epidemic. Our results show that in some range of epidemic spreading (typically when manual tracing of all contacts of infected people becomes practically impossible but before the fraction of infected people reaches the scale where a lockdown becomes unavoidable), this inference of individuals at risk could be an efficient way to mitigate the epidemic. Our approaches translate into fully distributed algorithms that only require communication between individuals who have recently been in contact. Such communication may be encrypted and anonymized, and thus it is compatible with privacy-preserving standards. We conclude that probabilistic risk estimation is capable of enhancing the performance of digital contact tracing and should be considered in mobile applications.
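A deliberately simplified sketch of the distributed risk-estimation idea (a toy, not the paper's full Bayesian inference over epidemic trajectories): each individual repeatedly combines the risks of their recent contacts through an assumed per-contact transmission probability, and test results pin risks to their observed values. All names and numbers below are illustrative:

```python
import math

# Each node only needs messages from its own recent contacts, mirroring
# the fully distributed, privacy-compatible structure of the algorithms.
lam = 0.05                                   # assumed transmission probability
prior = 0.01                                 # baseline prior risk
contacts = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
risk = {i: prior for i in contacts}
risk[3] = 1.0                                # individual 3 tested positive

for _ in range(10):                          # iterate to an approximate fixed point
    new_risk = {}
    for i, neigh in contacts.items():
        if risk[i] == 1.0:                   # observed positive: keep pinned
            new_risk[i] = 1.0
            continue
        # probability of escaping infection from every recent contact
        escape = math.prod(1.0 - lam * risk[j] for j in neigh)
        new_risk[i] = 1.0 - (1.0 - prior) * escape
    risk = new_risk

print(risk)   # individual 1, a contact of 3, ends up at elevated risk
```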
Subject(s)
Contact Tracing/methods; Epidemics/prevention & control; Algorithms; Bayes Theorem; COVID-19/epidemiology; COVID-19/prevention & control; Contact Tracing/statistics & numerical data; Humans; Mobile Applications; Privacy; Risk Assessment; SARS-CoV-2
ABSTRACT
Generalized linear models (GLMs) are used in high-dimensional machine learning, statistics, communications, and signal processing. In this paper we analyze GLMs when the data matrix is random, as is relevant in problems such as compressed sensing, error-correcting codes, or benchmark models in neural networks. We evaluate the mutual information (or "free entropy") from which we deduce the Bayes-optimal estimation and generalization errors. Our analysis applies in the high-dimensional limit where both the number of samples and the dimension are large and their ratio is fixed. In the field of statistical physics, nonrigorous predictions for the optimal errors, based on the so-called replica method, existed for special cases of GLMs, e.g., for the perceptron. Our present paper rigorously establishes those decades-old conjectures and brings forward their algorithmic interpretation in terms of the performance of the generalized approximate message-passing algorithm. Furthermore, we tightly characterize, for many learning problems, regions of parameters for which this algorithm achieves optimal performance and locate the associated sharp phase transitions separating learnable and nonlearnable regions. We believe that this random version of GLMs can serve as a challenging benchmark for multipurpose algorithms.
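As a concrete, hedged illustration of the message-passing side, the following sketch runs approximate message passing for the noiseless linear channel, a special case of the GLMs analyzed in the paper; the full GAMP algorithm generalizes the input and output steps to arbitrary channels. Dimensions, sparsity level, and the soft-thresholding schedule are illustrative choices:

```python
import numpy as np

# AMP for y = A x0 with a sparse signal: soft thresholding acts as the
# prior denoiser, and the Onsager term keeps the effective noise Gaussian.
rng = np.random.default_rng(2)
n, m, k = 1000, 500, 50                     # dimension, samples, sparsity
A = rng.normal(size=(m, n)) / np.sqrt(m)    # random data matrix
x0 = np.zeros(n); x0[:k] = rng.normal(size=k)
y = A @ x0                                  # noiseless measurements

def soft(u, t):                             # soft-thresholding denoiser
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

x, z = np.zeros(n), y.copy()
for _ in range(30):
    tau = np.sqrt(np.mean(z ** 2))          # effective noise level
    r = x + A.T @ z                         # pseudo-data
    x_new = soft(r, tau)
    onsager = z * (np.count_nonzero(x_new) / m)   # Onsager correction
    z = y - A @ x_new + onsager
    x = x_new

print("relative error:", np.linalg.norm(x - x0) / np.linalg.norm(x0))
```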
ABSTRACT
Deep neural networks achieve stellar generalisation even when they have enough parameters to easily fit all their training data. We study this phenomenon by analysing the dynamics and the performance of over-parameterised two-layer neural networks in the teacher-student setup, where one network, the student, is trained on data generated by another network, called the teacher. We show how the dynamics of stochastic gradient descent (SGD) is captured by a set of differential equations and prove that this description is asymptotically exact in the limit of large input dimension. Using this framework, we calculate the final generalisation error of student networks that have more parameters than their teachers. We find that the final generalisation error of the student increases with network size when training only the first layer, but stays constant or even decreases with size when training both layers. We show that these different behaviours have their root in the different solutions SGD finds for different activation functions. Our results indicate that achieving good generalisation in neural networks goes beyond the properties of SGD alone and depends on the interplay of at least the algorithm, the model architecture, and the data set.
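The following is a small simulation sketch of the teacher-student setup (a direct SGD experiment, not the paper's asymptotically exact ODE description): an over-parameterised student with 8 hidden units learns from a teacher with 2, via online SGD with both layers trained. Widths, learning rate, and the tanh activation are assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
d, K, M, lr, steps = 100, 2, 8, 0.05, 100000   # input dim, widths, SGD params

W_t = rng.normal(size=(K, d))                  # teacher first layer
v_t = np.ones(K)                               # teacher second layer
W_s = rng.normal(size=(M, d)) * 0.1            # over-parameterised student
v_s = np.ones(M) / M

def forward(x, W, v):                          # two-layer net, tanh activation
    return v @ np.tanh(W @ x / np.sqrt(d))

for _ in range(steps):
    x = rng.normal(size=d)                     # fresh sample: online SGD
    h = np.tanh(W_s @ x / np.sqrt(d))
    err = v_s @ h - forward(x, W_t, v_t)
    # gradients of (1/2) err^2; both layers are trained here
    W_s -= lr * err * np.outer(v_s * (1 - h ** 2), x) / np.sqrt(d)
    v_s -= lr * err * h

# Monte Carlo estimate of the generalisation error
X = rng.normal(size=(5000, d))
gen = np.mean([(forward(x, W_s, v_s) - forward(x, W_t, v_t)) ** 2 for x in X]) / 2
print("generalisation mse:", gen)
```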
ABSTRACT
We study the network dismantling problem, which consists of determining a minimal set of vertices whose removal leaves the network broken into connected components of subextensive size. For a large class of random graphs, this problem is tightly connected to the decycling problem (removing vertices until the graph is acyclic). Exploiting this connection and recent works on epidemic spreading, we present precise predictions for the minimal size of a dismantling set in a large random graph with a prescribed (light-tailed) degree distribution. Building on the statistical mechanics perspective, we propose a three-stage Min-Sum algorithm for efficiently dismantling networks, including heavy-tailed ones for which the dismantling and decycling problems are not equivalent. We also provide additional insights into the dismantling problem, concluding that it is an intrinsically collective problem and that optimal dismantling sets cannot be viewed as a collection of individually well-performing nodes.
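For intuition, here is a simplified dismantling heuristic that exploits the decycling connection (a CoreHD-style baseline, not the paper's three-stage Min-Sum algorithm): repeatedly remove the highest-degree vertex of the 2-core until the graph is acyclic, then break the remaining large components. The graph and `target_size` are illustrative:

```python
import networkx as nx

def dismantle(G, target_size):
    G = G.copy()
    removed = []
    # decycling stage: the 2-core contains all cycles
    while True:
        core = nx.k_core(G, 2)
        if core.number_of_nodes() == 0:
            break
        v = max(core.degree, key=lambda nd: nd[1])[0]
        G.remove_node(v)
        removed.append(v)
    # tree-breaking stage: split remaining components down to target size
    while True:
        comp = max(nx.connected_components(G), key=len)
        if len(comp) <= target_size:
            break
        v = max(G.subgraph(comp).degree, key=lambda nd: nd[1])[0]
        G.remove_node(v)
        removed.append(v)
    return removed

n = 5000
G = nx.erdos_renyi_graph(n, 3.0 / n, seed=0)   # mean degree ~ 3
S = dismantle(G, target_size=100)
print(f"removed {len(S)} of {n} nodes")
```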
ABSTRACT
Spectral algorithms are classic approaches to clustering and community detection in networks. However, for sparse networks the standard versions of these algorithms are suboptimal, in some cases completely failing to detect communities even when other algorithms such as belief propagation can do so. Here, we present a class of spectral algorithms based on a nonbacktracking walk on the directed edges of the graph. The spectrum of this operator is much better behaved than that of the adjacency matrix or other commonly used matrices, maintaining a strong separation between the bulk eigenvalues and the eigenvalues relevant to community structure even in the sparse case. We show that our algorithm is optimal for graphs generated by the stochastic block model, detecting communities all the way down to the theoretical limit. We also show the spectrum of the nonbacktracking operator for some real-world networks, illustrating its advantages over traditional spectral clustering.
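A compact sketch of the approach, under the assumption that the 2n x 2n Ihara-Bass companion matrix (whose nontrivial spectrum coincides with that of the full 2m x 2m nonbacktracking edge operator) is an acceptable stand-in; the block-model parameters are illustrative and place the graph well inside the detectable phase:

```python
import numpy as np
import networkx as nx

n = 1000
# planted partition: two groups, within/between edge probabilities
G = nx.planted_partition_graph(2, n // 2, 0.016, 0.004, seed=0)
A = nx.to_numpy_array(G)
D = np.diag(A.sum(axis=1))
I = np.eye(n)
# Ihara-Bass reduction: nontrivial nonbacktracking eigenvalues solve
# det(lam^2 I - lam A + D - I) = 0, i.e. they are eigenvalues of Bp.
Bp = np.block([[A, I - D], [I, np.zeros((n, n))]])
eigvals, eigvecs = np.linalg.eig(Bp)
idx = np.argsort(-eigvals.real)
v2 = eigvecs[:n, idx[1]].real            # second eigenvector, node part
labels = (v2 > 0).astype(int)

part = G.graph["partition"]              # ground-truth groups (list of sets)
truth = np.array([0 if i in part[0] else 1 for i in range(n)])
acc = max(np.mean(labels == truth), np.mean(labels != truth))
print("clustering accuracy:", acc)
```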
Subject(s)
Algorithms; Cluster Analysis; Data Interpretation, Statistical; Models, Statistical
ABSTRACT
We study percolation on networks, which is used as a model of the resilience of networked systems such as the Internet to attack or failure and as a simple model of the spread of disease over human contact networks. We reformulate percolation as a message passing process and demonstrate how the resulting equations can be used to calculate, among other things, the size of the percolating cluster and the average cluster size. The calculations are exact for sparse networks when the number of short loops in the network is small, but even on networks with many short loops we find them to be highly accurate when compared with direct numerical simulations. By considering the fixed points of the message passing process, we also show that the percolation threshold on a network with few loops is given by the inverse of the leading eigenvalue of the so-called nonbacktracking matrix.
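The message-passing equations described here are short enough to state directly in code. This sketch (illustrative parameters, plain Python dictionaries rather than an optimized implementation) computes the giant-cluster size for bond percolation with occupation probability p; for Erdos-Renyi graphs the leading nonbacktracking eigenvalue is approximately the mean degree, so the predicted threshold below is roughly p_c = 1/4:

```python
import math
import networkx as nx

# u[i, j]: probability that node i is NOT connected to the giant cluster
# through edges other than the one to j.
def giant_cluster_size(G, p, iters=100):
    u = {(i, j): 0.5 for i in G for j in G[i]}
    for _ in range(iters):
        u = {(i, j): math.prod(1 - p + p * u[k, i] for k in G[i] if k != j)
             for (i, j) in u}
    # probability that each node lies outside the giant cluster
    outside = [math.prod(1 - p + p * u[k, i] for k in G[i]) for i in G]
    return 1 - sum(outside) / len(outside)

n = 2000
G = nx.erdos_renyi_graph(n, 4.0 / n, seed=1)   # mean degree ~ 4
for p in [0.2, 0.25, 0.3, 0.5]:
    print(p, round(giant_cluster_size(G, p), 3))
```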
ABSTRACT
The proliferation of models for networks raises challenging problems of model selection: the data are sparse and globally dependent, and models are typically high-dimensional and have large numbers of latent variables. Together, these issues mean that the usual model-selection criteria do not work properly for networks. We illustrate these challenges, and show one way to resolve them, by considering the key network-analysis problem of dividing a graph into communities or blocks of nodes with homogeneous patterns of links to the rest of the network. The standard tool for undertaking this is the stochastic block model, under which the probability of a link between two nodes is a function solely of the blocks to which they belong. This imposes a homogeneous degree distribution within each block; this can be unrealistic, so degree-corrected block models add a parameter for each node, modulating its overall degree. The choice between ordinary and degree-corrected block models matters because they make very different inferences about communities. We present the first principled and tractable approach to model selection between standard and degree-corrected block models, based on new large-graph asymptotics for the distribution of log-likelihood ratios under the stochastic block model, finding substantial departures from classical results for sparse graphs. We also develop linear-time approximations for log-likelihoods under both the stochastic block model and the degree-corrected model, using belief propagation. Applications to simulated and real networks show excellent agreement with our approximations. Our results thus both solve the practical problem of deciding on degree correction and point to a general approach to model selection in network analysis.
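For reference, the two profile log-likelihoods being compared can be written compactly (this follows the standard Karrer-Newman form, up to constants; the paper's contribution is the corrected sparse-graph null distribution of their ratio, which is not reproduced here):

```python
import numpy as np
import networkx as nx
from collections import defaultdict

def profile_loglikelihoods(G, labels):
    """labels maps each node to its block; returns (SBM, degree-corrected)."""
    m = defaultdict(float)      # m[r, s]: edge counts per ordered block pair
    n = defaultdict(int)        # block sizes
    kappa = defaultdict(float)  # total degree per block
    for v in G:
        n[labels[v]] += 1
        kappa[labels[v]] += G.degree(v)
    for u, v in G.edges():
        m[labels[u], labels[v]] += 1
        m[labels[v], labels[u]] += 1
    L_sbm = sum(c * np.log(c / (n[r] * n[s])) for (r, s), c in m.items())
    L_dc = sum(c * np.log(c / (kappa[r] * kappa[s])) for (r, s), c in m.items())
    return L_sbm, L_dc

G = nx.karate_club_graph()                       # toy graph with two factions
labels = {v: G.nodes[v]["club"] for v in G}
print(profile_loglikelihoods(G, labels))
```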
ABSTRACT
Discrete dynamical systems can exhibit complex behavior from the iterative application of straightforward local rules. A famous class of examples comes from cellular automata whose global dynamics are notoriously challenging to analyze. To address this, we relax the regular connectivity grid of cellular automata to a random graph, which gives the class of graph cellular automata. Using the dynamical cavity method and its backtracking version, we show that this relaxation allows us to derive asymptotically exact analytical results on the global dynamics of these systems on sparse random graphs. Concretely, we showcase the results on a specific subclass of graph cellular automata with "conforming nonconformist" update rules, which exhibit dynamics akin to opinion formation. Such rules update a node's state according to the majority within their own neighborhood. In cases where the majority leads only by a small margin over the minority, nodes may exhibit nonconformist behavior. Instead of following the majority, they either maintain their own state, switch it, or follow the minority. For configurations with different initial biases towards one state we identify sharp dynamical phase transitions in terms of the convergence speed and attractor types. From the perspective of opinion dynamics this answers when consensus will emerge and when two opinions coexist almost indefinitely.
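A short simulation sketch of one such rule (illustrative parameters, synchronous updates): nodes adopt the local majority unless its margin is at most `theta`, in which case they keep their current state, one of the "conforming nonconformist" variants described above:

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(4)
n, d, theta, bias = 10000, 5, 1, 0.6     # nodes, degree, margin, initial bias
G = nx.random_regular_graph(d, n, seed=4)
nbrs = [list(G[i]) for i in range(n)]
s = rng.choice([-1, 1], size=n, p=[1 - bias, bias]).astype(float)

for t in range(50):                      # parallel (synchronous) dynamics
    field = np.array([s[nb].sum() for nb in nbrs])
    # follow the majority only when it leads by more than theta
    new_s = np.where(np.abs(field) > theta, np.sign(field), s)
    if np.array_equal(new_s, s):
        print("fixed point reached at step", t)
        break
    s = new_s

print("final magnetisation:", s.mean())  # +1 means consensus on the biased state
```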
ABSTRACT
While classical in many theoretical settings, in particular in works inspired by statistical physics, the assumption of Gaussian i.i.d. input data is often perceived as a strong limitation in the context of statistics and machine learning. In this study, we redeem this line of work in the case of generalized linear classification, also known as the perceptron model, with random labels. We argue that there is a large universality class of high-dimensional input data for which we obtain the same minimum training loss as for Gaussian data with corresponding data covariance. In the limit of vanishing regularization, we further demonstrate that the training loss is independent of the data covariance. On the theoretical side, we prove this universality for an arbitrary mixture of homogeneous Gaussian clouds. Empirically, we show that the universality holds also for a broad range of real data sets.
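An empirical sanity check of the universality claim can be set up in a few lines (a small-scale illustration with assumed sizes and models, not the paper's experiments): train a weakly regularized logistic classifier on random labels for Gaussian inputs and for non-Gaussian (uniform) inputs with the same mean and covariance, and compare the training losses:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n, d = 1000, 500                                # samples, dimension
y = rng.choice([-1, 1], size=n)                 # random labels

def train_loss(X):
    # large C approximates the vanishing-regularization limit
    clf = LogisticRegression(C=100.0, max_iter=5000).fit(X, y)
    p = clf.predict_proba(X)[:, 1]              # probability of label +1
    return -np.mean(np.where(y == 1, np.log(p), np.log(1 - p)))

X_gauss = rng.normal(size=(n, d))
# uniform on [-sqrt(3), sqrt(3)]: zero mean, unit variance, same covariance
X_unif = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(n, d))
print("gaussian:", train_loss(X_gauss), "uniform:", train_loss(X_unif))
```

The two printed losses should be close, as the universality result predicts for matched first and second moments.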
ABSTRACT
We consider a class of spreading processes on networks, which generalize commonly used epidemic models such as the SIR model or the SIS model with a bounded number of reinfections. We analyze the related problem of inference of the dynamics based on its partial observations. We analyze these inference problems on random networks via a message-passing inference algorithm derived from the belief propagation (BP) equations. We investigate whether this algorithm solves the problems in a Bayes-optimal way, i.e., whether no other algorithm can reach a better performance. For this, we leverage the so-called Nishimori conditions that must be satisfied by a Bayes-optimal algorithm. We also probe for phase transitions by considering the convergence time and by initializing the algorithm in both a random and an informed way and comparing the resulting fixed points. We present the corresponding phase diagrams. We find large regions of parameters where even for moderate system sizes the BP algorithm converges and closely satisfies the Nishimori conditions, and the problem is thus conjectured to be solved optimally in those regions. In other limited areas of the space of parameters, the Nishimori conditions are no longer satisfied and the BP algorithm struggles to converge. No sign of a phase transition is detected, however, and we attribute this failure of optimality to finite-size effects. The article is accompanied by a Python implementation of the algorithm that is easy to use or adapt.
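For context, the forward generative model underlying the inference task can be sketched as follows (only the spreading process is shown; the BP inference itself is what the paper develops and ships as Python code, and all parameters here are illustrative):

```python
import numpy as np
import networkx as nx

# Discrete-time SIR on a random graph: transmission probability lam per
# infected-susceptible edge and recovery probability mu per step.
rng = np.random.default_rng(6)
n, lam, mu, T = 1000, 0.3, 0.2, 30
G = nx.erdos_renyi_graph(n, 4.0 / n, seed=6)
state = np.zeros(n, dtype=int)                 # 0 = S, 1 = I, 2 = R
state[rng.choice(n, size=5, replace=False)] = 1

history = [state.copy()]                       # full trajectory; inference
for t in range(T):                             # sees only parts of this
    new = state.copy()
    for i in np.flatnonzero(state == 1):
        for j in G[i]:
            if state[j] == 0 and rng.random() < lam:
                new[j] = 1                     # S -> I through edge (i, j)
        if rng.random() < mu:
            new[i] = 2                         # I -> R
    state = new
    history.append(state.copy())

print("final fractions S/I/R:", np.bincount(state, minlength=3) / n)
```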
ABSTRACT
In this paper we study a fully connected planted spin glass named the planted XY model. Motivation for studying this system comes both from spin glasses and from statistical inference, where it models the angular synchronization problem. We derive the replica symmetric (RS) phase diagram in the plane of temperature and ferromagnetic bias using the approximate message passing (AMP) algorithm and its state evolution (SE). While the RS predictions are exact on the Nishimori line (i.e., when the temperature is matched to the ferromagnetic bias), they become inaccurate when the parameters are mismatched, giving rise to a spin glass phase in which AMP fails to converge. To overcome the deficiencies of the RS approximation, we carry out a one-step replica symmetry breaking (1RSB) analysis based on the approximate survey propagation (ASP) algorithm. Exploiting the state evolution of ASP, we count the number of metastable states in the measure, derive the 1RSB free entropy, and find the behavior of the Parisi parameter throughout the spin glass phase.
ABSTRACT
We consider the problem of inferring a matching hidden in a weighted random k-hypergraph. We assume that the hyperedges' weights are random and drawn from two different densities depending on whether they belong to the hidden matching or not. We show that for k>2 and in the large-graph-size limit, an algorithmic first-order transition in the signal strength separates a regime in which complete recovery of the hidden matching is feasible from a regime in which only partial recovery is possible. This is in contrast to the k=2 case, where the transition is known to be continuous. Finally, we consider the case of graphs presenting a mixture of edges and 3-hyperedges, interpolating between the k=2 and k=3 cases, and we study how the transition changes from continuous to first order as the relative amount of edges and hyperedges is tuned.
ABSTRACT
In semisupervised community detection, the membership of a set of revealed nodes is known in addition to the graph structure and can be leveraged to achieve better inference accuracies. While previous works investigated the case where the revealed nodes are selected at random, this paper focuses on correlated subsets leading to atypically high accuracies. In the framework of the dense stochastic block model, we employ statistical physics methods to derive a large deviation analysis of the number of these rare subsets, as characterized by their free energy. We find theoretical evidence of a nonmonotonic relationship between reconstruction accuracy and the free energy associated with the posterior measure of the inference problem. We further discuss possible implications for active learning applications in community detection.
ABSTRACT
We present an asymptotically exact analysis of the problem of detecting communities in sparse random networks generated by stochastic block models. Using the cavity method of statistical physics and its relationship to belief propagation, we unveil a phase transition from a regime where we can infer the correct group assignments of the nodes to one where these groups are undetectable. Our approach yields an optimal inference algorithm for detecting modules, including both assortative and disassortative functional modules, assessing their significance, and learning the parameters of the underlying block model. Our algorithm is scalable and applicable to real-world networks, as long as they are well described by the block model.
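The location of the detectability transition identified by this cavity analysis has a simple closed form for the symmetric block model with q equal-sized groups, which can be checked directly (here c_in and c_out denote the scaled within- and between-group connectivities):

```python
import numpy as np

# Communities are detectable iff |c_in - c_out| > q * sqrt(c), where
# c = (c_in + (q - 1) * c_out) / q is the average degree (the
# Kesten-Stigum threshold for the symmetric stochastic block model).
def detectable(c_in, c_out, q):
    c = (c_in + (q - 1) * c_out) / q
    return abs(c_in - c_out) > q * np.sqrt(c)

print(detectable(16, 4, 2))   # True: well inside the detectable phase
print(detectable(7, 5, 2))    # False: undetectable despite real structure
```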
ABSTRACT
In the current literature, the following properties are associated with the behavior of supercooled glass-forming liquids: faster-than-exponential growth of the relaxation time, dynamical heterogeneities, a growing point-to-set correlation length, and a crossover from mean-field behavior to activated dynamics. In this paper we argue that these properties are also present in a much simpler situation, namely the melting of the bulk of an ordered phase beyond a first-order phase transition point. This is a promising path toward a better theoretical, numerical, and experimental understanding of the above phenomena and of the physics of supercooled liquids. We discuss in detail the analogies and the differences between the glass and bulk melting transitions.
Subject(s)
Hot Temperature; Glass/chemistry; Quantum Theory
ABSTRACT
There are deep analogies between the melting dynamics in systems with a first-order phase transition and the dynamics from equilibrium in supercooled liquids. For a class of Ising spin models undergoing a first-order transition, namely p-spin models on the so-called Nishimori line, it can be shown that the melting dynamics can be exactly mapped to the equilibrium dynamics. In this mapping the dynamical (or mode-coupling) glass transition corresponds to the spinodal point, while the Kauzmann transition corresponds to the first-order phase transition itself. Both in mean-field and in finite-dimensional models this mapping provides an exact realization of the random first-order theory scenario for the glass transition. The corresponding glassy phenomenology can then be understood in the framework of a standard first-order phase transition.
Subject(s)
Hot Temperature; Glass/chemistry; Quantum Theory
ABSTRACT
We show rigorously that the spin-glass susceptibility in the random-field Ising model is always bounded by the ferromagnetic susceptibility, and therefore that no spin-glass phase can be present at equilibrium away from the ferromagnetic critical line. When the magnetization is, however, fixed to values smaller than the equilibrium value, a spin-glass phase can exist, as we show explicitly on the Bethe lattice.
ABSTRACT
We consider the statistical inference problem of recovering an unknown perfect matching, hidden in a weighted random graph, by exploiting the information arising from the use of two different distributions for the weights on the edges inside and outside the planted matching. A recent work demonstrated the existence of a phase transition, in the large-size limit, between a full-recovery and a partial-recovery phase for a specific form of the weights distribution on fully connected graphs. We generalize and extend this result in two directions: we obtain a criterion for the location of the phase transition for generic weight distributions and possibly sparse graphs, exploiting a technical connection with branching random walk processes, as well as a quantitatively more precise description of the critical regime around the phase transition.