Enhanced regularization for on-chip training using analog and temporary memory weights.
Singhal, Raghav; Saraswat, Vivek; Deshmukh, Shreyas; Subramoney, Sreenivas; Somappa, Laxmeesha; Baghini, Maryam Shojaei; Ganguly, Udayan.
Affiliation
  • Singhal R; Department of Electrical Engineering, Indian Institute of Technology Bombay, Mumbai, India. Electronic address: 19d070049@iitb.ac.in.
  • Saraswat V; Department of Electrical Engineering, Indian Institute of Technology Bombay, Mumbai, India.
  • Deshmukh S; Department of Electrical Engineering, Indian Institute of Technology Bombay, Mumbai, India.
  • Subramoney S; Processor and Architecture Research Lab, Intel, Bangalore, India.
  • Somappa L; Department of Electrical Engineering, Indian Institute of Technology Bombay, Mumbai, India.
  • Baghini MS; Department of Electrical Engineering, Indian Institute of Technology Bombay, Mumbai, India.
  • Ganguly U; Department of Electrical Engineering, Indian Institute of Technology Bombay, Mumbai, India.
Neural Netw; 165: 1050-1057, 2023 Aug.
Article in En | MEDLINE | ID: mdl-37478527
ABSTRACT
In-memory computing techniques are used to accelerate artificial neural network (ANN) training and inference. Memory technology and architectural innovations enable efficient matrix-vector multiplication, gradient computation, and weight updates. However, on-chip learning on edge devices remains challenging because of the frequent weight updates it requires. Here, we propose an analog and temporary on-chip memory (ATOM) cell with controllable retention timescales for implementing the weights of an on-chip training task. Measured read-write timescales are presented for an ATOM cell fabricated in GlobalFoundries' 45 nm RFSOI technology. The effect of limited retention, and of its variability, is evaluated by training fully connected neural networks with varying numbers of layers on the MNIST handwritten-digit recognition task. Our studies show that the weight decay caused by temporary memory provides a benefit equivalent to regularization, yielding a ∼33% reduction in validation error (from 3.6% to 2.4%). We further show that controlling the decay timescale yields an additional ∼26% reduction in validation error. These results strongly suggest that temporary memory is useful during learning, before on-chip non-volatile memories take over storage and inference with the trained weights. We thus propose an algorithm-circuit codesign, in the form of temporary analog memory, for high-performance on-chip learning of ANNs.
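The abstract's core claim, that exponential decay of weights held in a temporary analog memory acts like regularization, can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes each weight relaxes toward zero as W ← W·exp(-Δt/τ) between updates, where the retention timescale τ and update interval Δt are hypothetical parameters; to first order this matches SGD with an L2 penalty λ = Δt/(τ·η) for learning rate η. The paper's experiments used MNIST; a synthetic classification task keeps the sketch self-contained.

```python
# Minimal sketch (not the authors' code): SGD training where weights decay
# exponentially between updates, modeling the limited retention of an analog
# temporary memory (ATOM) cell. `tau` and `dt` are hypothetical values.
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer fully connected network for a 10-class problem.
D_IN, D_HID, D_OUT = 64, 32, 10
W1 = rng.normal(0, 0.1, (D_IN, D_HID))
W2 = rng.normal(0, 0.1, (D_HID, D_OUT))

def forward(X):
    H = np.maximum(0.0, X @ W1)                 # ReLU hidden layer
    logits = H @ W2
    logits -= logits.max(axis=1, keepdims=True)  # stable softmax
    P = np.exp(logits)
    P /= P.sum(axis=1, keepdims=True)
    return H, P

# Synthetic stand-in for MNIST: random inputs, labels from a fixed teacher.
X = rng.normal(size=(2048, D_IN))
teacher = rng.normal(size=(D_IN, D_OUT))
y = (X @ teacher).argmax(axis=1)

lr, dt, tau = 0.1, 1e-3, 0.05   # learning rate, update interval, retention timescale
decay = np.exp(-dt / tau)        # per-update retention factor of the ATOM cell

for step in range(500):
    idx = rng.integers(0, len(X), 64)
    Xb, yb = X[idx], y[idx]
    H, P = forward(Xb)
    # Softmax cross-entropy gradient.
    G = P.copy()
    G[np.arange(len(yb)), yb] -= 1.0
    G /= len(yb)
    dW2 = H.T @ G
    dH = (G @ W2.T) * (H > 0)
    dW1 = Xb.T @ dH
    # Retention-limited decay, then the gradient step. To first order,
    # W *= exp(-dt/tau) equals SGD with L2 penalty lambda = dt/(tau*lr).
    W1 *= decay; W2 *= decay
    W1 -= lr * dW1; W2 -= lr * dW2

_, P = forward(X)
print("train accuracy:", (P.argmax(axis=1) == y).mean())
```

In the paper the decay timescale is set by the cell's physics rather than chosen in software; sweeping `tau` in this sketch mimics the controllability that the abstract credits with the further ∼26% error reduction.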
Subjects: Algorithms / Neural Networks, Computer
Full text: 1 Database: MEDLINE Main subject: Algorithms / Neural Networks, Computer Language: En Publication year: 2023 Document type: Article