Results 1 - 7 of 7
1.
Philos Trans A Math Phys Eng Sci ; 379(2194): 20200246, 2021 Apr 05.
Article in English | MEDLINE | ID: mdl-33583272

ABSTRACT

Recent advances in computing algorithms and hardware have rekindled interest in developing high-accuracy, low-cost surrogate models for simulating physical systems. The idea is to replace the expensive numerical integration of complex coupled partial differential equations at fine time scales, performed on supercomputers, with machine-learned surrogates that efficiently and accurately forecast future system states using data sampled from the underlying system. One particularly popular technique being explored within the weather and climate modelling community is the echo state network (ESN), an attractive alternative to other well-known deep learning architectures. Using the classical Lorenz 63 system and the three-tier multi-scale Lorenz 96 system (Thornes T, Düben P, Palmer T. 2017 Q. J. R. Meteorol. Soc. 143, 897-908. (doi:10.1002/qj.2974)) as benchmarks, we find that previously studied state-of-the-art ESNs operate in two distinct regimes, corresponding to low and high spectral radius (LSR/HSR) of the sparse, randomly generated reservoir recurrence matrix. Using knowledge of the mathematical structure of the Lorenz systems along with systematic ablation and hyperparameter sensitivity analyses, we show that state-of-the-art LSR-ESNs reduce to a polynomial regression model, which we call Domain-Driven Regularized Regression (D2R2). Interestingly, D2R2 is a generalization of the well-known SINDy algorithm (Brunton SL, Proctor JL, Kutz JN. 2016 Proc. Natl Acad. Sci. USA 113, 3932-3937. (doi:10.1073/pnas.1517384113)). We also show experimentally that LSR-ESNs (Chattopadhyay A, Hassanzadeh P, Subramanian D. 2019 (http://arxiv.org/abs/1906.08829)) outperform HSR-ESNs (Pathak J, Hunt B, Girvan M, Lu Z, Ott E. 2018 Phys. Rev. Lett. 120, 024102. (doi:10.1103/PhysRevLett.120.024102)), while D2R2 dominates both approaches. A significant goal in constructing surrogates is to overcome the barriers to scaling in weather prediction and the simulation of dynamical systems that are imposed by the time and energy consumption of supercomputers. Inexact computing has emerged as a novel approach to this scaling problem. In this paper, we evaluate the performance of the three models (LSR-ESN, HSR-ESN and D2R2) while varying the precision, or word size, of the computation as our inexactness-controlling parameter. For precisions of 64, 32 and 16 bits, we show that, surprisingly, the least expensive D2R2 method yields the most robust results and the greatest savings compared with the ESNs. Specifically, D2R2 achieves a 68× computational saving, with a further 2× if precision reductions are also employed, outperforming the ESN variants by a large margin. This article is part of the theme issue 'Machine learning for weather and climate modelling'.
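To make the D2R2 idea concrete, here is a minimal sketch of a one-step polynomial-regression surrogate for Lorenz 63. It assumes generic quadratic features with ridge regularization as a stand-in for the paper's domain-driven feature set; the step size, trajectory length, and alpha value are illustrative choices, not the paper's settings.

```python
# Sketch (not the paper's D2R2): one-step surrogate for Lorenz 63 via
# polynomial ridge regression.
import numpy as np
from scipy.integrate import solve_ivp
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Ridge

def lorenz63(t, u, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = u
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

# Sample a long trajectory at a fine time step.
dt, n = 0.01, 20000
sol = solve_ivp(lorenz63, (0, n * dt), [1.0, 1.0, 1.0],
                t_eval=np.arange(0, n * dt, dt), rtol=1e-9, atol=1e-9)
U = sol.y.T                      # shape (n, 3): states u(t)

# Quadratic monomials of the current state are a natural feature basis,
# since the Lorenz right-hand side is itself quadratic.
phi = PolynomialFeatures(degree=2, include_bias=True)
X = phi.fit_transform(U[:-1])    # features of u(t)
Y = U[1:]                        # targets u(t + dt)
model = Ridge(alpha=1e-6).fit(X, Y)

# Autonomous rollout: feed predictions back in as inputs.
u, preds = U[10000].copy(), []
for _ in range(500):
    u = model.predict(phi.transform(u[None, :]))[0]
    preds.append(u)
print(np.array(preds)[:3])       # short-term forecast from the surrogate
```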

2.
Neural Comput ; 31(7): 1430-1461, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31113300

ABSTRACT

Reservoir computing is a biologically inspired class of learning algorithms in which the intrinsic dynamics of a recurrent neural network are mined to produce target time series. Most existing reservoir computing algorithms rely on fully supervised learning rules, which require access to an exact copy of the target response, greatly reducing the utility of the system. Reinforcement learning rules have been developed for reservoir computing, but we find that they fail to converge on complex motor tasks. Current theories of biological motor learning posit that early learning is controlled by dopamine-modulated plasticity in the basal ganglia, which trains parallel cortical pathways through unsupervised plasticity as a motor task becomes well learned. We developed a novel learning algorithm for reservoir computing that models this experimentally observed interaction between reinforcement and unsupervised learning. The algorithm converges on simulated motor tasks on which previous reservoir computing algorithms fail, and it reproduces experimental findings relating Parkinson's disease and its treatments to motor learning. Hence, incorporating biological theories of motor learning improves the effectiveness and biological relevance of reservoir computing models.
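As a rough illustration of the reinforcement ingredient alone, here is a minimal node-perturbation sketch of a reward-modulated readout for an echo-state reservoir. The paper's actual rule additionally couples such a reinforcement signal to an unsupervised cortical pathway; all constants below are illustrative assumptions.

```python
# Sketch: reward-modulated (node-perturbation) learning of a reservoir
# readout on a toy motor-like target. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
N, T, lr, noise = 300, 1000, 5e-4, 0.05

# Fixed random reservoir (echo-state style), scaled to spectral radius ~0.9.
W = rng.normal(0, 1 / np.sqrt(N), (N, N))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))
w_out = np.zeros(N)

target = np.sin(np.linspace(0, 8 * np.pi, T))   # toy motor trajectory
x = rng.normal(0, 0.1, N)
r_bar = 0.0                                      # running reward baseline

for t in range(T):
    x = np.tanh(W @ x)
    eps = noise * rng.normal()                   # exploratory perturbation
    y = w_out @ x + eps
    reward = -(y - target[t]) ** 2               # reward: negative error
    # Node perturbation: correlate the perturbation with reward advantage.
    w_out += lr * (reward - r_bar) * eps * x
    r_bar += 0.1 * (reward - r_bar)
```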


Subject(s)
Computer Simulation , Nerve Net/physiology , Neural Networks, Computer , Reward , Humans , Models, Neurological , Neuronal Plasticity/physiology , Neurons/physiology , Reinforcement, Psychology
3.
Phys Rev Lett ; 118(1): 018103, 2017 Jan 06.
Article in English | MEDLINE | ID: mdl-28106418

ABSTRACT

Randomly connected networks of excitatory and inhibitory spiking neurons provide a parsimonious model of neural variability, but are notoriously unreliable for performing computations. We show that this difficulty is overcome by incorporating the well-documented dependence of connection probability on distance. Spatially extended spiking networks exhibit symmetry-breaking bifurcations and generate spatiotemporal patterns that can be trained to perform dynamical computations under a reservoir computing framework.
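A minimal sketch of the key ingredient, distance-dependent connection probability, assuming a Gaussian decay on a one-dimensional ring; the paper studies spatially extended spiking networks, so this construction is illustrative only.

```python
# Sketch: sampling an adjacency matrix whose connection probability decays
# with distance on a ring (1D periodic space). Illustrative parameters.
import numpy as np

rng = np.random.default_rng(1)
N, p0, sigma = 500, 0.2, 0.1

pos = np.linspace(0, 1, N, endpoint=False)       # neuron positions on a ring
d = np.abs(pos[:, None] - pos[None, :])
d = np.minimum(d, 1 - d)                         # wrap-around distance

# Gaussian decay of connection probability with distance.
p = p0 * np.exp(-d**2 / (2 * sigma**2))
A = rng.random((N, N)) < p                       # adjacency matrix
np.fill_diagonal(A, False)                       # no self-connections
print(A.mean())                                  # realized connection density
```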


Subject(s)
Action Potentials , Computer Simulation , Neurons/physiology , Algorithms , Models, Neurological
4.
Front Comput Neurosci ; 18: 1387077, 2024.
Article in English | MEDLINE | ID: mdl-38966128

ABSTRACT

Adversarial attacks remain a significant challenge for neural networks. Recent work has shown that adversarial perturbations typically contain high-frequency features, but the root cause of this phenomenon remains unknown. Inspired by theoretical work on linear convolutional models, we hypothesize that the translational symmetry of convolutional operations, together with localized kernels, implicitly biases the learning of high-frequency features, and that this is one of the main causes of high-frequency adversarial examples. To test this hypothesis, we analyzed the impact of different choices of linear and nonlinear architectures on the implicit bias of the learned features and of the adversarial perturbations, in both the spatial and frequency domains. We find that, independently of the training dataset, convolutional operations suffer higher-frequency adversarial attacks than other architectural parameterizations, and that this phenomenon is exacerbated by stronger locality of the kernel (smaller kernel sizes) and by the depth of the model. The explanation for the kernel-size dependence involves the Fourier uncertainty principle: a spatially limited filter (a local kernel in the space domain) cannot also be frequency limited (local in the frequency domain). Using larger convolution kernel sizes, or avoiding convolutions altogether (e.g., by using Vision Transformers or MLP-style architectures), significantly reduces this high-frequency bias. Looking forward, our work strongly suggests that understanding and controlling the implicit bias of architectures will be essential for achieving adversarial robustness.
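One way to test the high-frequency claim is to inspect the power spectrum of an adversarial perturbation. Below is a minimal sketch using a toy linear scorer, for which the FGSM gradient is available in closed form (for a CNN one would backpropagate instead); the image size and epsilon value are illustrative assumptions.

```python
# Sketch: radially averaged power spectrum of an FGSM perturbation for a
# toy linear scorer, not the paper's CNN experiments.
import numpy as np

rng = np.random.default_rng(0)
H = Wd = 32
x = rng.random((H, Wd))                  # toy "image"
w = rng.normal(0, 1, (H, Wd))            # weights of a linear scorer
score = float((w * x).sum())             # s = <w, x>; grad_x s = w exactly

eps = 0.03
delta = eps * np.sign(w)                 # FGSM step: eps * sign(gradient)

# Radially averaged power spectrum of the perturbation.
F = np.abs(np.fft.fftshift(np.fft.fft2(delta))) ** 2
cy, cx = H // 2, Wd // 2
yy, xx = np.indices(F.shape)
r = np.hypot(yy - cy, xx - cx).astype(int)
counts = np.bincount(r.ravel())
spectrum = np.bincount(r.ravel(), weights=F.ravel()) / np.maximum(counts, 1)
print(spectrum)  # power vs. spatial frequency; CNN attacks would skew high
```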

5.
Front Artif Intell ; 5: 889981, 2022.
Article in English | MEDLINE | ID: mdl-35647529

ABSTRACT

Understanding the learning dynamics and inductive bias of neural networks (NNs) is hindered by the opacity of the relationship between NN parameters and the function represented. This is partly due to symmetries inherent in the NN parameterization, which allow many different parameter settings to yield an identical output function, obscuring the parameter-function relationship and introducing redundant degrees of freedom. The NN parameterization is invariant under two symmetries: permutation of the neurons and a continuous family of transformations of the scale of the weight and bias parameters. We propose taking a quotient with respect to the second symmetry group and reparametrizing ReLU NNs as continuous piecewise-linear splines. Using this spline lens, we study learning dynamics in shallow univariate ReLU NNs, finding unexpected insights into, and explanations for, several perplexing phenomena. We develop a surprisingly simple and transparent view of the structure of the loss surface, including its critical and fixed points, its Hessian, and the Hessian spectrum. We also show that standard weight initializations yield very flat initial functions, and that this flatness, together with overparametrization and the initial weight scale, is responsible for the strength and type of implicit regularization, consistent with previous work. Our implicit regularization results are complementary to recent work showing, via a kernel-based argument, that the initialization scale critically controls implicit regularization. Overall, removing the weight-scale symmetry enables us to prove these results more simply, to prove new results, and to gain new insights, while offering a far more transparent and intuitive picture. Looking forward, our quotiented spline-based approach will extend naturally to the multivariate and deep settings, and alongside the kernel-based view, we believe it will play a foundational role in efforts to understand neural networks. Videos of learning dynamics using a spline-based visualization are available at http://shorturl.at/tFWZ2.
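The spline reparameterization is easy to see in the shallow univariate case: each hidden ReLU unit contributes one breakpoint and one slope change. Here is a minimal sketch, assuming random weights rather than trained ones; the paper's quotient construction is more careful than this.

```python
# Sketch: reading a shallow univariate ReLU network as a continuous
# piecewise-linear spline (breakpoints and slope changes).
import numpy as np

rng = np.random.default_rng(0)
n_hidden = 8
w1, b1 = rng.normal(size=n_hidden), rng.normal(size=n_hidden)  # input layer
w2, b2 = rng.normal(size=n_hidden), rng.normal()               # readout

def f(x):
    """f(x) = w2 . relu(w1 * x + b1) + b2, evaluated on an array of inputs."""
    return w2 @ np.maximum(0.0, np.outer(w1, x) + b1[:, None]) + b2

# Unit i contributes a kink (spline breakpoint) where its pre-activation
# crosses zero: x_i = -b_i / w_i, with slope change w2_i * w1_i.
breakpoints = -b1 / w1
slope_changes = w2 * w1
order = np.argsort(breakpoints)
for x_i, ds in zip(breakpoints[order], slope_changes[order]):
    print(f"breakpoint x = {x_i:+.3f}, slope change = {ds:+.3f}")

grid = np.linspace(breakpoints.min() - 1, breakpoints.max() + 1, 5)
print("f on grid:", np.round(f(grid), 3))
```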

6.
Neuron ; 101(2): 337-348.e4, 2019 Jan 16.
Article in English | MEDLINE | ID: mdl-30581012

ABSTRACT

Trial-to-trial variability is a reflection of the circuitry and cellular physiology that make up a neuronal network. A pervasive yet puzzling feature of cortical circuits is that, despite their complex wiring, population-wide shared spiking variability is low dimensional. Previous models of cortical networks cannot explain this global variability and instead attribute it to external sources. We show that if the spatial and temporal scales of inhibitory coupling match known physiology, networks of model spiking neurons internally generate low-dimensional shared variability that captures population activity recorded in vivo. Shifting spatial attention into the receptive field of visual neurons has been shown to differentially modulate shared variability within and between brain areas. A top-down modulation of inhibitory neurons in our network provides a parsimonious mechanism for this attentional modulation. Our work provides a critical link between observed cortical circuit structure and realistic shared neuronal variability and its modulation.
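Shared-variability dimensionality of this kind is commonly quantified with factor analysis on trial-by-trial spike counts. Here is a minimal sketch on synthetic counts with a planted three-dimensional shared component; the paper's actual fitting procedure may differ.

```python
# Sketch: estimating shared-variability dimensionality with factor analysis
# on synthetic spike counts. Illustrative only.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n_neurons, n_trials, latent_dim = 50, 400, 3

# Spike counts = baseline + low-dimensional shared component + private noise.
L = rng.normal(0, 1.0, (n_neurons, latent_dim))      # loading matrix
z = rng.normal(0, 1.0, (latent_dim, n_trials))       # shared latents
counts = 10 + L @ z + rng.normal(0, 1.0, (n_neurons, n_trials))

fa = FactorAnalysis(n_components=10).fit(counts.T)
shared_cov = fa.components_.T @ fa.components_       # shared covariance
eigs = np.linalg.eigvalsh(shared_cov)[::-1]
print(np.round(eigs[:6], 2))  # should drop sharply after the third eigenvalue
```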


Subject(s)
Attention/physiology , Models, Neurological , Nerve Net/physiology , Neurons/physiology , Visual Cortex/cytology , Action Potentials/physiology , Animals , Computer Simulation , Factor Analysis, Statistical , Humans , Neural Inhibition/physiology , Photic Stimulation
7.
Phys Rev E ; 93: 040302, 2016 Apr.
Article in English | MEDLINE | ID: mdl-27176240

ABSTRACT

Biological neuronal networks exhibit highly variable spiking activity. Balanced networks offer a parsimonious model of this variability in which strong excitatory synaptic inputs are canceled by strong inhibitory inputs on average, and irregular spiking activity is driven by fluctuating synaptic currents. Most previous studies of balanced networks assume a homogeneous or distance-dependent connectivity structure, but connectivity in biological cortical networks is more intricate. We use a heterogeneous mean-field theory of balanced networks to show that heterogeneous in-degrees can break balance. Moreover, heterogeneous architectures that achieve balance promote lower firing rates in neurons with larger in-degrees, consistent with some recent experimental observations.
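For orientation, here is a hedged sketch of the classical balanced-network scaling argument (van Vreeswijk-Sompolinsky style), showing why heterogeneous in-degrees over-determine the balance condition; it is not the paper's exact heterogeneous mean-field formulation.

```latex
% Weights scale as 1/\sqrt{K}, where K is a reference in-degree and neuron i
% has excitatory/inhibitory in-degrees K_i^e and K_i^i. The input to neuron
% i is then
\[
  I_i = \sqrt{K}\left( j_e \frac{K_i^e}{K}\, r_e
        - j_i \frac{K_i^i}{K}\, r_i + x \right).
\]
% Keeping I_i of order one as K grows requires the bracket to vanish at
% leading order for every neuron:
\[
  j_e\, \kappa_i^e\, r_e - j_i\, \kappa_i^i\, r_i + x = 0,
  \qquad \kappa_i^{e,i} := K_i^{e,i}/K .
\]
% With homogeneous in-degrees (all \kappa_i equal) this is a single equation
% with a positive solution (r_e, r_i); with heterogeneous in-degrees it is
% one equation per neuron that two population rates cannot generally satisfy
% simultaneously, so balance breaks.
```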
