1.
Phys Rev Lett ; 132(7): 077301, 2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38427855

ABSTRACT

Recent generalizations of the Hopfield model of associative memory can store a number P of random patterns that grows exponentially with the number N of neurons, P=exp(αN). Besides this huge storage capacity, another interesting feature of these networks is their connection to the attention mechanism at the core of the Transformer architecture widely used in deep learning. In this work, we study a generic family of pattern ensembles using a statistical mechanics analysis that gives the exact asymptotic threshold α_{1} for the retrieval of a typical pattern, lower bounds on the maximal load α_{c} at which all patterns can be retrieved, and the sizes of the basins of attraction. We discuss in detail the cases of Gaussian and spherical patterns, and show that they display rich and qualitatively different phase diagrams.
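
As a concrete illustration of the attention connection mentioned in this abstract, here is a minimal NumPy sketch (not the paper's analysis) of the retrieval dynamics of a dense associative memory, whose update step coincides with softmax attention over the stored patterns; the Gaussian pattern ensemble, the sizes N and P, and the inverse temperature beta are illustrative assumptions.

```python
import numpy as np

# Hedged sketch: iterate the dense-associative-memory update
#   sigma <- Xi^T softmax(beta * Xi sigma),
# which is a single softmax-attention step over the stored patterns.
rng = np.random.default_rng(0)
N, P, beta = 64, 32, 8.0
Xi = rng.standard_normal((P, N)) / np.sqrt(N)   # stored patterns (rows)

def retrieve(sigma, n_steps=10):
    for _ in range(n_steps):
        h = beta * (Xi @ sigma)     # overlaps with all patterns
        a = np.exp(h - h.max())     # softmax attention weights
        a /= a.sum()
        sigma = Xi.T @ a            # convex combination of patterns
    return sigma

cue = Xi[0] + 0.1 * rng.standard_normal(N)  # noisy version of pattern 0
print(np.argmax(Xi @ retrieve(cue)))        # expected output: 0
```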

2.
PLoS One ; 19(1): e0295054, 2024.
Article in English | MEDLINE | ID: mdl-38277355

ABSTRACT

Processing faces accurately and efficiently is a key capability of humans and other animals that engage in sophisticated social tasks. Recent studies have reported a decoupled coding for faces in the primate inferotemporal cortex, with two separate neural populations coding for the geometric positions of (texture-free) facial landmarks and for the image texture at fixed landmark positions, respectively. Here, we formally assess the efficiency of this decoupled coding by appealing to the information-theoretic notion of description length, which quantifies the amount of information that is saved when encoding novel facial images at a given precision. We show that, although decoupled coding describes the facial images in terms of two sets of principal components (of landmark shape and image texture), it is more efficient (i.e., it yields more information compression) than the encoding in terms of the image principal components alone, which corresponds to the widely used eigenface method. The advantage of decoupled coding over eigenface coding increases with image resolution and is especially prominent when coding variants of training-set images that differ only in facial expression. Moreover, we demonstrate that decoupled coding yields better performance in three different tasks: the representation of facial images, the (daydream) sampling of novel facial images, and the recognition of facial identities and gender. In summary, our study provides a first-principles perspective on the efficiency and accuracy of the decoupled coding of facial stimuli reported in the primate inferotemporal cortex.


Subjects
Facial Recognition, Recognition (Psychology), Animals, Humans, Cerebral Cortex, Facial Expression, Primates
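
To make the description-length comparison concrete, here is a hedged sketch of the eigenface side of the argument (not the paper's code): an image is encoded by its top-k principal components, and the code length is approximated as roughly 0.5*log2(variance/eps^2) bits per coordinate under a Gaussian code at precision eps; the random stand-in "images", all sizes, and the precision are assumptions.

```python
import numpy as np

# Hedged sketch of an eigenface-style description length (assumed formula:
# ~0.5*log2(var/eps^2) bits per coordinate under a Gaussian code at
# precision eps). The random matrix X is a stand-in for face images.
rng = np.random.default_rng(1)
n_imgs, dim, eps = 200, 400, 1e-2
X = rng.standard_normal((n_imgs, dim))

mu = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mu, full_matrices=False)
var_coeff = S**2 / n_imgs              # variance of each PC coefficient

def description_length(x, k):
    c = Vt[:k] @ (x - mu)              # top-k eigenface coefficients
    resid = (x - mu) - Vt[:k].T @ c    # what the code does not capture
    bits_coeff = 0.5 * np.sum(np.log2(np.maximum(var_coeff[:k], eps**2) / eps**2))
    bits_resid = 0.5 * dim * np.log2(max(resid.var(), eps**2) / eps**2)
    return bits_coeff + bits_resid

# Fewer components make the coefficients cheap but the residual costly.
for k in (5, 20, 80):
    print(k, round(description_length(X[0], k), 1))
```
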
3.
Phys Rev Lett ; 131(22): 227301, 2023 Dec 01.
Article in English | MEDLINE | ID: mdl-38101365

ABSTRACT

Empirical studies of the landscape of neural networks have shown that low-energy configurations are often found in complex connected structures, where zero-energy paths between pairs of distant solutions can be constructed. Here, we consider the spherical negative perceptron, a prototypical nonconvex neural network model framed as a continuous constraint satisfaction problem. We introduce a general analytical method for computing energy barriers within the simplex whose vertices are configurations sampled at equilibrium. We find that in the overparametrized regime the solution manifold displays simple connectivity properties. There exists a large geodesically convex component that is attractive for a wide range of optimization dynamics. Inside this region we identify a subset of atypical high-margin solutions that are geodesically connected with most other solutions, giving rise to a star-shaped geometry. We analytically characterize the organization of the connected space of solutions and show numerical evidence of a transition, at larger constraint densities, where this simple geodesic connectivity breaks down.
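
As a toy numerical companion to this analysis, the sketch below finds two solutions of a spherical negative perceptron and monitors the energy along the geodesic path connecting them on the sphere; the sizes, the margin kappa, and the use of the number of violated constraints as the energy are illustrative assumptions, not the paper's exact protocol.

```python
import numpy as np

# Hedged sketch: energy profile along the geodesic between two solutions of
# a spherical perceptron with negative margin kappa (overparametrized regime).
rng = np.random.default_rng(2)
N, M, kappa = 200, 100, -0.5
xi = rng.standard_normal((M, N))

def project(w):                       # back onto the sphere |w|^2 = N
    return w * np.sqrt(N) / np.linalg.norm(w)

def energy(w):                        # number of violated constraints
    return int(np.sum(xi @ w / np.sqrt(N) < kappa))

def find_solution(max_steps=5000):    # crude perceptron-style search
    w = project(rng.standard_normal(N))
    for _ in range(max_steps):
        if energy(w) == 0:
            return w
        viol = xi @ w / np.sqrt(N) < kappa
        w = project(w + 0.1 * xi[viol].sum(axis=0))
    raise RuntimeError("no solution found")

w1, w2 = find_solution(), find_solution()
for t in np.linspace(0, 1, 11):       # projected chord traces the geodesic
    print(round(float(t), 1), energy(project((1 - t) * w1 + t * w2)))
```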

4.
Phys Rev E ; 108(2-1): 024313, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37723818

ABSTRACT

We present a comparison between various algorithms for the inference of covariance and precision matrices from small data sets of real vectors, of the typical length and dimension of human brain activity time series retrieved by functional magnetic resonance imaging (fMRI). Assuming a Gaussian model underlying the neural activity, the problem consists of denoising the empirically observed matrices to obtain a better estimator of the (unknown) true precision and covariance matrices. We consider several standard noise-cleaning algorithms and compare them on two types of data sets. The first type consists of synthetic time series sampled from a generative Gaussian model for which we can vary the ratio q of dimensions to samples and the strength of off-diagonal correlations. The second type consists of time series of fMRI brain activity of human subjects at rest. The reliability of each algorithm is assessed in terms of test-set likelihood and, in the case of synthetic data, of the distance from the true precision matrix. We observe that the so-called optimal rotationally invariant estimator, based on random matrix theory, leads to a significantly lower distance from the true precision matrix in synthetic data and to a higher test likelihood in natural fMRI data. We propose a variant of the optimal rotationally invariant estimator in which one of its parameters is optimized by cross-validation. In the severe undersampling regime (large q) typical of fMRI series, it outperforms all the other estimators. We furthermore propose a simple algorithm based on iterative likelihood gradient ascent, which leads to very accurate estimates on weakly correlated synthetic data sets.
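
The cross-validation idea lends itself to a compact sketch. The code below uses a simple linear-shrinkage cleaner as a stand-in for the full rotationally invariant estimator (which is more involved) and selects its parameter by held-out Gaussian likelihood, mirroring the protocol described in the abstract; all sizes and the shrinkage target are assumptions.

```python
import numpy as np

# Hedged sketch: pick a covariance-cleaning parameter by test-set likelihood.
# Linear shrinkage toward the diagonal is an assumed stand-in for the
# (more involved) optimal rotationally invariant estimator.
rng = np.random.default_rng(3)
T, n = 150, 100                        # few samples per dimension
X = rng.standard_normal((T, n))        # stand-in for fMRI time series
Xtr, Xte = X[:100], X[100:]

def avg_loglik(P, Y):                  # Gaussian log-likelihood per sample
    _, logdet = np.linalg.slogdet(P)   # P is the precision matrix
    quad = np.einsum('ti,ij,tj->t', Y, P, Y).mean()
    return 0.5 * (logdet - quad - n * np.log(2 * np.pi))

S = Xtr.T @ Xtr / len(Xtr)             # empirical covariance
target = np.diag(np.diag(S))

score, best_g = max(
    (avg_loglik(np.linalg.inv((1 - g) * S + g * target), Xte), g)
    for g in np.linspace(0.05, 0.95, 19)
)
print("cross-validated shrinkage:", round(best_g, 2))
```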

5.
PLoS Comput Biol ; 18(6): e1010219, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35737722

ABSTRACT

Many different types of generative models for protein sequences have been proposed in the literature. Their uses include the prediction of mutational effects, protein design, and the prediction of structural properties. Neural network (NN) architectures have shown strong performance, commonly attributed to their capacity to extract non-trivial higher-order interactions from the data. In this work, we analyze two different NN models and assess how close they are to simple pairwise distributions, which have been used in the past for similar problems. We present an approach for extracting pairwise models from more complex ones using an energy-based modeling framework. We show that, for the tested models, the extracted pairwise models can replicate the energies of the original models and come close to them in performance on tasks such as mutational effect prediction. In addition, we show that even simpler, factorized models often come close in performance to the original models.


Subjects
Distillation, Neural Networks (Computer), Amino Acid Sequence, Proteins/chemistry
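
The extraction procedure can be caricatured in a few lines. The sketch below is an assumption-laden toy, not the paper's method (binary "sequences", a made-up teacher, and a least-squares fit): it fits the couplings and fields of a pairwise energy function so that it reproduces the energies assigned by a more complex model.

```python
import numpy as np

# Hedged toy sketch: fit a pairwise energy E(s) = -h.s - sum_{i<j} J_ij s_i s_j
# to reproduce the energies of a more complex "teacher" model.
rng = np.random.default_rng(4)
L, n_seq = 20, 500
S = rng.choice([-1.0, 1.0], size=(n_seq, L))    # toy binary "sequences"

W_teacher = rng.standard_normal((L, 8))         # stand-in complex model
E_teacher = np.tanh(S @ W_teacher).sum(axis=1)

# Pairwise features: all products s_i s_j (i < j) plus the fields s_i.
i, j = np.triu_indices(L, k=1)
Phi = np.concatenate([S[:, i] * S[:, j], S], axis=1)

# Least-squares fit of couplings and fields to the teacher energies.
theta, *_ = np.linalg.lstsq(Phi, E_teacher, rcond=None)
print("energy correlation:",
      round(float(np.corrcoef(Phi @ theta, E_teacher)[0, 1]), 3))
```
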
6.
Phys Rev Lett ; 128(7): 075702, 2022 Feb 18.
Article in English | MEDLINE | ID: mdl-35244416

ABSTRACT

The spin-glass transition in a field in finite dimension is analyzed directly at zero temperature using a perturbative loop expansion around the Bethe lattice solution. The loop expansion is generated by the M-layer construction, whose first diagrams are evaluated numerically and analytically. The generalized Ginzburg criterion reveals that the upper critical dimension below which mean-field theory fails is D_{U}≥8, at variance with the classical result D_{U}=6 yielded by finite-temperature replica field theory. Our expansion around the Bethe lattice differs from the classical one in two crucial respects. First, the finite connectivity z of the lattice is included from the outset in the Bethe lattice, whereas in the classical computation finite connectivity is recovered through an expansion in 1/z. Second, if one is interested in the zero-temperature (T=0) transition, one can expand directly around the T=0 Bethe transition. Such a direct T=0 expansion is not possible in the classical framework, because the fully connected spin glass has no transition at T=0, being in the broken phase for any value of the external field.

7.
Proc Natl Acad Sci U S A ; 117(5): 2268-2274, 2020 Feb 04.
Article in English | MEDLINE | ID: mdl-31953263

ABSTRACT

We apply to the random-field Ising model at zero temperature (T=0) the perturbative loop expansion around the Bethe solution. A comparison with the standard ϵ expansion is made, highlighting the key differences that make the expansion around the Bethe solution much more appropriate to correctly describe strongly disordered systems, especially those controlled by a T=0 renormalization group (RG) fixed point. The latter loop expansion produces an effective theory with cubic vertices. We compute the one-loop corrections due to cubic vertices, finding additional terms that are absent in the ϵ expansion. However, these additional terms are subdominant with respect to the standard, supersymmetric ones; therefore, dimensional reduction is still valid at this order of the loop expansion.

8.
Phys Rev Lett ; 120(26): 268103, 2018 Jun 29.
Article in English | MEDLINE | ID: mdl-30004730

ABSTRACT

Stochasticity and limited precision of synaptic weights in neural network models are key aspects of both biological and hardware modeling of learning processes. Here we show that a neural network model with stochastic binary weights naturally gives prominence to exponentially rare dense regions of solutions with a number of desirable properties, such as robustness and good generalization performance, while typical solutions are isolated and hard to find. Binary solutions of the standard perceptron problem are obtained through a simple gradient descent procedure on a set of real values parametrizing a probability distribution over the binary synapses. Both analytical and numerical results are presented. An algorithmic extension that makes it possible to train discrete deep neural networks is also investigated.
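
The gradient procedure described here admits a compact sketch. Below, each binary synapse is given an independent distribution with mean tanh(m_i); plain gradient descent is run on the m's through a Gaussian (mean-field) approximation of the training error, and the weights are binarized at the end. The loss, sizes, and learning rate are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

# Hedged sketch: gradient descent on real parameters m of independent
# distributions over binary synapses (mean tanh(m)), using a Gaussian
# approximation of the per-pattern error probability as the loss.
rng = np.random.default_rng(5)
N, M = 101, 60
xi = rng.choice([-1.0, 1.0], size=(M, N))
y = rng.choice([-1.0, 1.0], size=M)

m = 0.01 * rng.standard_normal(N)
lr = 0.05
for _ in range(500):
    a = np.tanh(m)                        # mean of each binary synapse
    mu = (xi @ a) * y / np.sqrt(N)        # mean stability per pattern
    var = (1 - a**2).sum() / N + 1e-9     # variance from stochastic synapses
    # loss = sum_mu Phi(-mu/sqrt(var)); gradient w.r.t. mu (var held fixed):
    g = -np.exp(-mu**2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    grad_m = (g[:, None] * y[:, None] * xi / np.sqrt(N)).sum(0) * (1 - a**2)
    m -= lr * grad_m

w = np.sign(m)                            # binarize the synapses
print("training errors:", int(np.sum((xi @ w) * y <= 0)))
```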

9.
Phys Rev E ; 95(1-1): 012302, 2017 Jan.
Article in English | MEDLINE | ID: mdl-28208325

ABSTRACT

The matching problem is a classic combinatorial optimization problem that has long attracted the attention of the statistical physics community. Here we analyze the Euclidean version of the problem, i.e., the optimal matching of points randomly distributed in d-dimensional Euclidean space, where the cost to be minimized depends on the points' pairwise distances. Using Mayer's cluster expansion, we write a formal expression for the replicated action that is suitable for a saddle-point computation. We give the diagrammatic rules for each term of the expansion, and we analyze the one-loop diagrams in detail. A characteristic feature of the theory, when diagrams are computed perturbatively around the mean-field part of the action, is the vanishing of the mass at zero momentum. In the non-Euclidean case of uncorrelated costs, by contrast, we predict and numerically verify an anomalous scaling for the sub-sub-leading correction to the asymptotic average cost.
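
For readers who want to play with the model numerically, the snippet below solves random instances of the Euclidean matching problem exactly with the Hungarian algorithm; this only illustrates the problem being analyzed, not the cluster expansion itself, and the sizes and dimension are arbitrary choices.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Illustration of the random Euclidean matching problem: match N "red" to
# N "blue" points uniform in [0,1]^d, minimizing the sum of distances.
rng = np.random.default_rng(6)
N, d = 200, 2
red, blue = rng.random((N, d)), rng.random((N, d))

cost = np.linalg.norm(red[:, None, :] - blue[None, :, :], axis=-1)
rows, cols = linear_sum_assignment(cost)   # exact optimal matching
print("average optimal edge cost:", round(float(cost[rows, cols].mean()), 4))
```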

10.
Proc Natl Acad Sci U S A ; 113(48): E7655-E7662, 2016 Nov 29.
Article in English | MEDLINE | ID: mdl-27856745

ABSTRACT

In artificial neural networks, learning from data is a computationally demanding task in which a large number of connection weights are iteratively tuned through stochastic-gradient-based heuristic processes over a cost function. It is not well understood how learning occurs in these systems, in particular how they avoid getting trapped in configurations with poor computational performance. Here, we study the difficult case of networks with discrete weights, where the optimization landscape is very rough even for simple architectures, and provide theoretical and numerical evidence for the existence of rare, but extremely dense and accessible, regions of configurations in the network weight space. We define a measure, the robust ensemble (RE), which suppresses trapping by isolated configurations and amplifies the role of these dense regions. We analytically compute the RE in some exactly solvable models and also provide a general algorithmic scheme that is straightforward to implement: define a cost function given by a sum of a finite number of replicas of the original cost function, with a constraint centering the replicas around a driving assignment. To illustrate this, we derive several powerful algorithms, ranging from Markov chains to message passing to gradient descent processes, in which the algorithms target the robust dense states, resulting in substantial improvements in performance. The weak dependence on the number of precision bits of the weights leads us to conjecture that very similar reasoning applies to more conventional neural networks. Analogous algorithmic schemes can also be applied to other optimization problems.
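
The algorithmic scheme quoted in this abstract (a sum of replicated costs with a centering constraint) reduces to a few lines of gradient descent. The sketch below is a hedged toy version: the hinge loss, the elastic coupling gamma and its schedule, and all sizes are assumptions; the paper derives principled message-passing and Monte Carlo variants.

```python
import numpy as np

# Hedged sketch of the replicated scheme: several replicas descend the
# original loss plus an elastic coupling to their mean ("driving assignment").
rng = np.random.default_rng(7)
N, M, n_rep = 100, 70, 5
xi = rng.standard_normal((M, N))
t = np.sign(xi @ rng.standard_normal(N))   # teacher-generated labels

def loss_grad(w):                          # subgradient of a hinge loss
    viol = (xi @ w) * t < 1
    return -(t[viol, None] * xi[viol]).sum(axis=0)

W = 0.1 * rng.standard_normal((n_rep, N))  # the replicas
lr, gamma = 0.05, 0.01
for _ in range(300):
    center = W.mean(axis=0)
    for a in range(n_rep):
        W[a] -= lr * (loss_grad(W[a]) + gamma * (W[a] - center))
    gamma *= 1.02                          # slowly tighten the coupling

w = np.sign(W.mean(axis=0))                # collapse to discrete weights
print("errors of binarized center:", int(np.sum((xi @ w) * t <= 0)))
```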

11.
Phys Rev E ; 93(5): 052313, 2016 May.
Article in English | MEDLINE | ID: mdl-27300916

ABSTRACT

Learning in neural networks poses peculiar challenges when using discretized rather than continuous synaptic states. The choice of discrete synapses is motivated by biological reasoning and experiments, and possibly by hardware implementation considerations as well. In this paper we extend a previous large-deviations analysis which unveiled the existence of peculiar dense regions in the space of synaptic states that account for the possibility of learning efficiently in networks with binary synapses. We extend the analysis to synapses with multiple states and generally more plausible biological features. The results clearly indicate that the overall qualitative picture is unchanged with respect to the binary case, and very robust to variations in the details of the model. We also provide quantitative results suggesting that the advantages of increasing the synaptic precision (i.e., the number of internal synaptic states) vanish rapidly after the first few bits, and therefore that, for practical applications, only a few bits may be needed for near-optimal performance, consistent with recent biological findings. Finally, we demonstrate how the theoretical analysis can be exploited to design efficient algorithmic search strategies.


Subjects
Learning/physiology, Models (Neurological), Synapses/physiology, Neural Networks (Computer)
12.
Phys Rev Lett ; 115(12): 128101, 2015 Sep 18.
Article in English | MEDLINE | ID: mdl-26431018

ABSTRACT

We show that discrete synaptic weights can be used efficiently for learning in large-scale neural systems, leading to unexpectedly strong computational performance. We focus on the representative case of learning random patterns with binary synapses in single-layer networks. The standard statistical analysis shows that this problem is exponentially dominated by isolated solutions that are extremely hard to find algorithmically. Here, we introduce a novel method that allows us to find analytical evidence for the existence of subdominant and extremely dense regions of solutions. Numerical experiments confirm these findings. We also show that the dense regions are surprisingly accessible by simple learning protocols, and that these synaptic configurations are robust to perturbations and generalize better than typical solutions. These outcomes extend to synapses with multiple states and to deeper neural architectures. The large-deviations measure also suggests how to design novel algorithmic schemes for optimization based on local entropy maximization.
