Results 1 - 4 of 4
1.
Neural Comput ; 31(3): 538-554, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30645180

ABSTRACT

This letter treats neural networks as dynamical systems governed by finite-difference equations. It shows that introducing k-many skip connections into network architectures, such as residual networks and additive dense networks, defines kth-order dynamical equations on the layer-wise transformations. Closed-form solutions are found for the state-space representations of general kth-order additive dense networks, in which the concatenation operation is replaced by addition, as well as of kth-order smooth networks. This construction endows deep neural networks with an algebraic structure. Furthermore, it is shown that imposing kth-order smoothness on network architectures with d-many nodes per layer increases the state-space dimension by a multiple of k, so the effective dimension in which the neural network embeds the data manifold is k·d. It follows that network architectures of these types reduce the number of parameters needed to maintain the same embedding dimension by a factor of k² compared with an equivalent first-order residual network. Numerical simulations and experiments on CIFAR10, SVHN, and MNIST are presented to illustrate the developed theory and the efficacy of the proposed concepts.


Subjects
Neural Networks (Computer), Computer Simulation
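
A minimal NumPy sketch of the idea in this abstract, under illustrative assumptions (the tanh layer f, random weights W, and the 0.5 coefficient are stand-ins, not the paper's architecture): a residual update x_{t+1} = x_t + f(x_t) is a first-order difference equation, and adding a skip connection to the state two layers back yields a second-order (k = 2) recursion whose state is the pair (x_t, x_{t-1}), i.e. k·d numbers.

import numpy as np

rng = np.random.default_rng(0)
d = 8                                   # nodes per layer
depth = 10
W = [0.1 * rng.standard_normal((d, d)) for _ in range(depth)]

def f(x, W_t):
    # layer-wise transformation (toy tanh layer)
    return np.tanh(W_t @ x)

x_prev = rng.standard_normal(d)         # x_{t-1}
x_curr = x_prev + f(x_prev, W[0])       # x_t from one first-order step

for t in range(1, depth):
    # two skip connections -> second-order dynamical equation on the layers
    x_next = x_curr + 0.5 * (x_curr - x_prev) + f(x_curr, W[t])
    x_prev, x_curr = x_curr, x_next

# advancing the recursion requires (x_t, x_{t-1}): k * d = 2 * d numbers,
# the enlarged effective embedding dimension discussed in the abstract
print(2 * d)
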
2.
IEEE Trans Neural Netw Learn Syst ; 33(12): 7559-7573, 2022 Dec.
Article in English | MEDLINE | ID: mdl-34129506

ABSTRACT

In this article, we consider quantized learning control for linear networked systems with additive channel noise. The objective is to achieve high tracking performance while reducing the communication burden on the network. To address this problem, we propose an integrated framework consisting of two modules: a probabilistic quantizer and a learning scheme. The probabilistic quantizer is built on a Bernoulli distribution driven by the quantization errors. Three learning control schemes are studied: a constant gain, a decreasing gain sequence satisfying certain conditions, and an optimal gain sequence generated recursively from a performance index. We show that the constant gain can only ensure that the input error sequence converges to a bounded sphere in the mean-square sense, where the radius of the sphere is proportional to the constant gain. In contrast, either of the two proposed gain sequences drives the input error to zero in the mean-square sense. In addition, we show that the convergence rate associated with the constant gain is exponential, whereas the rate associated with the proposed gain sequences is not faster than a specific exponential trend. Illustrative simulations demonstrate the convergence rate properties and steady-state tracking performance associated with each gain, as well as their robustness against modeling uncertainties.
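
A minimal sketch of a probabilistic quantizer of the kind described in this abstract, where each element is rounded up or down via a Bernoulli trial whose probability is the normalized quantization error, making the quantizer unbiased in expectation; the step size delta and test signal are illustrative assumptions, not values from the paper.

import numpy as np

rng = np.random.default_rng(1)

def probabilistic_quantize(v, delta, rng):
    low = delta * np.floor(v / delta)        # lower quantization level
    p_up = (v - low) / delta                 # normalized quantization error in [0, 1)
    up = rng.random(np.shape(v)) < p_up      # element-wise Bernoulli trial
    return low + delta * up

u = rng.standard_normal(5)                   # input signal to be transmitted
samples = np.array([probabilistic_quantize(u, 0.1, rng) for _ in range(20000)])
print(u)
print(samples.mean(axis=0))                  # approximately equal to u: E[Q(u)] = u
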

3.
Neural Netw ; 152: 499-509, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35640371

ABSTRACT

Large neural networks usually perform well on machine learning tasks. However, models that achieve state-of-the-art performance involve arbitrarily large numbers of parameters, so their training is very expensive. It is therefore desirable to have methods with small per-iteration costs, fast convergence rates, and little tuning. This paper proposes a multivariate adaptive gradient descent method with these attributes. The proposed method updates every element of the model parameters separately, in a computationally efficient manner, using an adaptive vector-form learning rate, resulting in low per-iteration cost. The adaptive learning rate computes the absolute difference between the current and previous model parameters divided by the difference between the subgradients of the current and previous state estimates. In the deterministic setting, we show that the cost function value converges at a linear rate for smooth and strongly convex cost functions. In both the deterministic and stochastic settings, we show that the gradient converges in expectation at the order of O(1/k) for non-convex cost functions with Lipschitz continuous gradients. In addition, we show that after T iterations, the cost function value at the last iterate scales as O(log(T)/T) for non-smooth strongly convex cost functions. The effectiveness of the proposed method is validated on convex functions, a smooth non-convex function, a non-smooth convex function, and four image classification data sets, and its execution requires hardly any tuning, unlike existing popular optimizers that entail relatively large tuning efforts. Our empirical results show that the proposed algorithm provides the best overall performance when compared with tuned state-of-the-art optimizers.


Subjects
Algorithms, Neural Networks (Computer), Machine Learning
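
A minimal sketch of the element-wise adaptive step described in this abstract: each coordinate's learning rate is the absolute parameter change divided by the absolute change in (sub)gradient, a diagonal, Barzilai-Borwein-like rule. The toy quadratic cost, epsilon safeguard, and initial step size are illustrative assumptions rather than the paper's exact settings.

import numpy as np

A = np.diag([1.0, 10.0, 100.0])              # toy strongly convex cost 0.5 * x^T A x

def grad(theta):
    return A @ theta

theta_prev = np.ones(3)
g_prev = grad(theta_prev)
theta = theta_prev - 0.01 * g_prev           # one plain gradient step to start

eps = 1e-12
for k in range(200):
    g = grad(theta)
    lr = np.abs(theta - theta_prev) / (np.abs(g - g_prev) + eps)  # vector-form rate
    theta_prev, g_prev = theta, g
    theta = theta - lr * g                   # element-wise update, low per-iteration cost

print(theta)                                 # close to the minimizer at the origin
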
4.
IEEE Trans Neural Netw Learn Syst ; 31(5): 1602-1615, 2020 May.
Article in English | MEDLINE | ID: mdl-31265420

ABSTRACT

This paper deals with an iterative Jacobian-based recursion technique for the root-finding problem of a vector-valued function whose evaluations are contaminated by noise. Instead of a scalar step size, we use an iteration-dependent matrix gain to effectively weigh the different elements associated with the noisy observations. The analytical development of the matrix gain is built on an iteration-dependent linear function corrupted by additive zero-mean white noise, where the dimension of the function is M ≥ 1 and the dimension of the unknown variable is N ≥ 1. Necessary and sufficient conditions for M ≥ N algorithms are presented, pertaining to algorithm stability and to convergence of the estimate error covariance matrix. Two algorithms are proposed: one for the case M ≥ N and one for the opposite case M < N. Both algorithms assume full knowledge of the Jacobian, and recursive procedures are given for generating the optimal iteration-dependent matrix gain, aiming at per-iteration minimization of the mean-square estimate error. We show that the proposed algorithm satisfies the presented conditions for stability and convergence of the covariance. In addition, the convergence rate of the estimation error covariance is shown to be inversely proportional to the number of iterations. For the case M < N, contraction of the error covariance is guaranteed; this underdetermined system of equations can be helpful in training neural networks. Numerical examples illustrate the performance of the proposed multidimensional gain on nonlinear functions.
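
A minimal sketch of the setting in this abstract: root finding for a vector-valued map from noisy evaluations, with an iteration-dependent matrix gain replacing a scalar step size. The decaying pseudo-inverse gain used here is a simple stand-in for the paper's optimal gain recursion, and the linear map and noise level are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(2)
M, N = 4, 3                                   # M >= N case
J = rng.standard_normal((M, N))               # known Jacobian of the linear map
x_star = rng.standard_normal(N)               # unknown root: h(x) = J @ (x - x_star)

def noisy_eval(x):
    # function evaluation contaminated by additive zero-mean white noise
    return J @ (x - x_star) + 0.05 * rng.standard_normal(M)

x = np.zeros(N)
J_pinv = np.linalg.pinv(J)
for k in range(1, 2001):
    gain = J_pinv / k                         # iteration-dependent matrix gain
    x = x - gain @ noisy_eval(x)              # Jacobian-based recursion

print(np.linalg.norm(x - x_star))             # estimate error shrinks with the iteration count
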
