Results 1 - 20 of 43
1.
Neural Netw ; 173: 106202, 2024 May.
Article in English | MEDLINE | ID: mdl-38422835

ABSTRACT

Randomized neural networks (RNNs), such as the random vector functional link (RVFL) network and the extreme learning machine (ELM), are widely accepted and efficient methods for constructing single-hidden-layer feedforward networks (SLFNs). Owing to their strong approximation capabilities, RNNs are used extensively in many fields. While the RNN concept has shown great promise, its performance can degrade unpredictably under imperfect conditions such as weight noise and training data outliers. There is therefore a need for more reliable and robust RNN algorithms. To address this issue, this paper proposes a new objective function for RVFL networks that accounts for the combined effect of weight noise and training data outliers. Based on the half-quadratic optimization method, we then propose a novel algorithm, named the noise-aware RNN (NARNN), to optimize the proposed objective function. The convergence of the NARNN is theoretically validated. We also discuss how to apply the NARNN to ensemble deep RVFL (edRVFL) networks. Finally, we present an extension of the NARNN that concurrently addresses weight noise, stuck-at faults, and outliers. Experimental results demonstrate that the proposed algorithm outperforms a number of state-of-the-art robust RNN algorithms.
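To make the robustness idea concrete, below is a minimal sketch of an RVFL network whose output weights are estimated by iteratively reweighted ridge regression with Huber weights, an M-estimator surrogate for the half-quadratic optimization used by the NARNN. All function names and parameter values are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def rvfl_features(X, W, b):
    """Direct-link RVFL design matrix: [X | sigmoid(X W + b)]."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return np.hstack([X, H])

def robust_rvfl_fit(X, y, n_hidden=50, lam=1e-2, n_irls=10, c=1.345, seed=0):
    """Robust output-weight estimation via iteratively reweighted ridge
    regression with Huber weights (a simple M-estimator stand-in for
    half-quadratic optimization)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    D = rvfl_features(X, W, b)
    w = np.ones(len(y))                           # per-sample robustness weights
    beta = None
    for _ in range(n_irls):
        A = D * w[:, None]
        beta = np.linalg.solve(D.T @ A + lam * np.eye(D.shape[1]), A.T @ y)
        r = y - D @ beta
        s = np.median(np.abs(r)) / 0.6745 + 1e-12  # robust scale estimate
        u = np.abs(r) / s
        w = np.where(u <= c, 1.0, c / u)           # Huber weight function
    return W, b, beta

# Usage on synthetic data with a few gross outliers
X = np.random.randn(200, 5)
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * np.random.randn(200)
y[:5] += 20.0                                      # inject outliers
model = robust_rvfl_fit(X, y)
```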


Subject(s)
Algorithms; Neural Networks, Computer; Learning
2.
Article in English | MEDLINE | ID: mdl-37796669

ABSTRACT

Among the many k-winners-take-all (kWTA) models, the dual neural network (DNN-kWTA) model requires significantly fewer connections. However, in analog realizations noise is inevitable and affects the operational correctness of the kWTA process. Most existing results focus on the effect of additive noise. This brief studies the effect of time-varying multiplicative input noise. Two scenarios are considered. The first is the bounded noise case, in which only the noise range is known. The second is the general noise distribution case, in which we either know the noise distribution or have noise samples. For each scenario, we first prove the convergence property of the DNN-kWTA model under multiplicative input noise and then provide an efficient method to determine whether a noise-affected DNN-kWTA network performs the correct kWTA process for a given set of inputs. With these two methods, we can efficiently measure the probability that the network performs the correct kWTA process. In addition, for uniformly distributed inputs, we derive two closed-form expressions, one for each scenario, for estimating the probability of correct operation. Finally, we conduct simulations to verify our theoretical results.
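A brute-force way to measure the quantity discussed above, the probability that the kWTA selection survives bounded multiplicative input noise, is a Monte Carlo check over sampled noise realizations. The sketch below assumes a noise model x_i(1 + n_i) with n_i uniform in [-delta, delta]; the paper's analytical test is far more efficient than this.

```python
import numpy as np

def kwta_correct_prob(u, k, delta, trials=10000, seed=0):
    """Monte Carlo estimate of the probability that the top-k selection of
    inputs u survives bounded multiplicative noise u_i * (1 + n_i), with n_i
    drawn uniformly from [-delta, delta]."""
    rng = np.random.default_rng(seed)
    true_winners = set(np.argsort(u)[-k:])
    hits = 0
    for _ in range(trials):
        noisy = u * (1.0 + rng.uniform(-delta, delta, size=len(u)))
        if set(np.argsort(noisy)[-k:]) == true_winners:
            hits += 1
    return hits / trials

u = np.random.rand(10)            # inputs in (0, 1)
print(kwta_correct_prob(u, k=3, delta=0.05))
```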

3.
Article in English | MEDLINE | ID: mdl-37310825

ABSTRACT

The dual neural network (DNN)-based k-winner-take-all (kWTA) model is able to identify the k largest numbers among its m input numbers. When there are imperfections in the realization, such as a non-ideal step function and Gaussian input noise, the model may not output the correct result. This brief analyzes the influence of these imperfections on the operational correctness of the model. Because of the imperfections, it is not efficient to analyze this influence using the original DNN-kWTA dynamics. This brief therefore first derives an equivalent model that describes the dynamics of the model under the imperfections. From the equivalent model, we derive a sufficient condition under which the model outputs the correct result. We then apply the sufficient condition to design an efficient method for estimating the probability that the model outputs the correct result. Furthermore, for inputs with a uniform distribution, a closed-form expression for this probability is derived. Finally, we extend our analysis to handle non-Gaussian input noise. Simulation results are provided to validate our theoretical results.

4.
IEEE Trans Neural Netw Learn Syst ; 34(8): 5218-5226, 2023 Aug.
Article in English | MEDLINE | ID: mdl-34847045

ABSTRACT

The objective of compressive sampling is to determine a sparse vector from an observation vector. This brief describes an analog neural method to achieve this objective. Unlike previous analog neural models, which either resort to an l1-norm approximation or offer only local convergence, the proposed method avoids any approximation of the l1-norm term and is provably capable of reaching the optimal solution. Moreover, its computational complexity is lower than that of the three comparison analog models. Simulation results show that the error performance of the proposed model is comparable to that of several state-of-the-art digital algorithms and analog models, and that its convergence is faster than that of the comparison analog neural models.

5.
IEEE Trans Neural Netw Learn Syst ; 34(2): 1080-1088, 2023 Feb.
Article in English | MEDLINE | ID: mdl-34428154

ABSTRACT

From the feature representation point of view, the feature learning module of a convolutional neural network (CNN) transforms an input pattern into a feature vector. This feature vector is then multiplied by a number of output weight vectors to produce softmax scores. The common training objective in CNNs is based on the softmax loss, which ignores intra-class compactness. This brief proposes a constrained center loss (CCL)-based algorithm to extract robust features. The training objective of the CNN consists of two terms: the softmax loss and the CCL. The softmax loss pushes feature vectors from different classes apart, while the CCL clusters the feature vectors so that those from the same class are close together. Instead of using stochastic gradient descent (SGD) to learn all the connection weights and the cluster centers at the same time, our CCL-based algorithm uses an alternating learning strategy. We first fix the connection weights of the CNN and update the cluster centers using an analytical formula, which can be implemented on a minibatch basis. We then fix the cluster centers and update the connection weights for a number of SGD minibatch iterations. We also propose a simplified CCL (SCCL) algorithm. Experiments are performed on six commonly used benchmark datasets. The results demonstrate that the two proposed algorithms outperform several state-of-the-art approaches.
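The alternating step described above, freezing the network and updating the cluster centers from minibatch statistics, can be sketched as follows. The running-average update rule and the hyperparameter alpha are assumptions made for illustration; the abstract only states that an analytical, minibatch-compatible formula is used.

```python
import numpy as np

def update_centers(centers, feats, labels, alpha=0.5):
    """With the network weights frozen, move each class center toward the mean
    feature of that class in the current minibatch (a running-average surrogate
    for the analytical center update). centers: (num_classes, feat_dim)."""
    for c in np.unique(labels):
        batch_mean = feats[labels == c].mean(axis=0)
        centers[c] = (1 - alpha) * centers[c] + alpha * batch_mean
    return centers

def ccl_term(feats, labels, centers):
    """Center-loss style penalty: mean squared distance of each feature
    to its class center over the minibatch."""
    diffs = feats - centers[labels]
    return 0.5 * np.mean(np.sum(diffs ** 2, axis=1))

# Conceptual total loss per minibatch: softmax_loss + lambda_c * ccl_term(feats, labels, centers)
```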

6.
IEEE Trans Neural Netw Learn Syst ; 34(5): 2619-2632, 2023 May.
Article in English | MEDLINE | ID: mdl-34487503

ABSTRACT

For decades, adding fault/noise during gradient descent training has been a technique for making a neural network (NN) tolerant to persistent fault/noise or for improving its generalization. In recent years, this technique has been readvocated in deep learning to avoid overfitting. Yet the objective function of such fault/noise injection learning has been misinterpreted as the desired measure (i.e., the expected mean squared error (MSE) of the training samples) of the NN with the same fault/noise. The aims of this article are 1) to clarify this misconception and 2) to investigate the actual regularization effect of adding node fault/noise during gradient descent training. Based on previous works on adding fault/noise during training, we discuss why the misconception arises. It is then shown that the learning objective of adding random node fault during gradient descent learning (GDL) for a multilayer perceptron (MLP) is identical to the desired measure of the MLP with the same fault. If additive (resp. multiplicative) node noise is added during GDL for an MLP, the learning objective is not identical to the desired measure of the MLP with such noise. For radial basis function (RBF) networks, it is shown that the learning objective is identical to the corresponding desired measure for all three fault/noise conditions. Empirical evidence is presented to support the theoretical results and, hence, to clarify that the objective function of fault/noise injection learning cannot in general be interpreted as the desired measure of the NN with the same fault/noise. Afterward, the regularization effect of adding node fault/noise during training is revealed for the case of RBF networks. Notably, it is shown that the regularization effect of adding additive or multiplicative node noise (MNN) during RBF training is to reduce network complexity. When dropout regularization is applied to RBF networks, its effect is the same as that of adding MNN during training.
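As a concrete illustration of fault/noise injection training, the sketch below applies multiplicative node noise to an RBF network's hidden activations and evaluates the empirical objective that such training actually minimizes, namely the loss averaged over sampled noise realizations. The network form and noise level are assumptions chosen only for illustration.

```python
import numpy as np

def rbf_forward(X, centers, widths, w, node_noise=None):
    """RBF network output; optional multiplicative node noise scales each
    hidden node's activation by (1 + noise)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    Phi = np.exp(-d2 / (2 * widths ** 2))
    if node_noise is not None:
        Phi = Phi * (1.0 + node_noise)            # multiplicative node noise
    return Phi @ w

def noisy_training_loss(X, y, centers, widths, w, sigma_b, trials=50, seed=0):
    """Empirical objective minimized by noise-injection training: the MSE
    averaged over sampled node-noise realizations."""
    rng = np.random.default_rng(seed)
    losses = []
    for _ in range(trials):
        noise = sigma_b * rng.standard_normal(centers.shape[0])
        pred = rbf_forward(X, centers, widths, w, node_noise=noise)
        losses.append(np.mean((y - pred) ** 2))
    return np.mean(losses)
```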

7.
IEEE Trans Neural Netw Learn Syst ; 34(12): 10930-10943, 2023 Dec.
Article in English | MEDLINE | ID: mdl-35576417

ABSTRACT

Sparse index tracking, one of the passive investment strategies, tracks a benchmark financial index by constructing a portfolio from a few assets in the market index. It can be viewed as parameter learning in an adaptive system, in which the selected assets and their investment percentages are periodically updated using a sliding window approach. However, many existing algorithms for sparse index tracking cannot explicitly and directly control the number of assets or the tracking error. This article formulates sparse index tracking as two constrained optimization problems and then proposes two algorithms: nonnegative orthogonal matching pursuit with projected gradient descent (NNOMP-PGD) and the alternating direction method of multipliers for the l0-norm (ADMM-l0). The NNOMP-PGD minimizes the tracking error subject to the number of selected assets being no greater than a predefined number, so investors can directly and explicitly control the number of selected assets. The ADMM-l0 minimizes the number of selected assets subject to the tracking error being upper bounded by a preset threshold, so it directly and explicitly controls the tracking error. The convergence of the two proposed algorithms is also presented. In addition, numerical experiments demonstrate that the proposed algorithms outperform existing approaches.
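A rough sketch of the NNOMP-PGD idea, greedy nonnegative asset selection followed by projected gradient refinement of the weights, is given below. The stopping rule, step size, and the omission of a budget constraint are simplifying assumptions, not the article's exact algorithm.

```python
import numpy as np

def nnomp_pgd(R, r_idx, K, pgd_steps=500, lr=1e-3):
    """R: (T x m) asset returns, r_idx: (T,) index returns. Greedily select at
    most K assets by residual correlation, then refine the nonnegative weights
    on the selected support by projected gradient descent on the tracking error."""
    T, m = R.shape
    support, w = [], np.zeros(m)
    for _ in range(K):
        resid = r_idx - R @ w
        scores = R.T @ resid
        scores[support] = -np.inf                 # do not reselect chosen assets
        j = int(np.argmax(scores))
        if scores[j] <= 0:
            break
        support.append(j)
        for _ in range(pgd_steps):                # refine weights on the support
            grad = R[:, support].T @ (R[:, support] @ w[support] - r_idx) / T
            w[support] = np.maximum(w[support] - lr * grad, 0.0)  # project to >= 0
    return w

# Toy usage with synthetic returns
T, m = 250, 40
R = 0.01 * np.random.randn(T, m)
r_idx = R[:, :5].mean(axis=1)                     # pretend the index averages 5 assets
w = nnomp_pgd(R, r_idx, K=5)
```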

8.
Article in English | MEDLINE | ID: mdl-35895648

ABSTRACT

Inspired by sparse learning, the Markowitz mean-variance model with a sparse regularization term is popular in sparse portfolio optimization. However, in penalty-based portfolio optimization algorithms, the cardinality level of the resultant portfolio depends on the choice of the regularization parameter. This brief formulates the mean-variance model as a cardinality (l0-norm) constrained nonconvex optimization problem, in which the number of assets in the portfolio can be specified explicitly. We then use the alternating direction method of multipliers (ADMM) concept to develop an algorithm for the constrained nonconvex problem. Unlike some existing algorithms, the proposed algorithm can explicitly control the portfolio cardinality. In addition, the dynamic behavior of the proposed algorithm is derived. Numerical results on four real-world datasets demonstrate the superiority of our approach over several state-of-the-art algorithms.
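The key step that gives explicit cardinality control is a projection that keeps only the K largest-magnitude weights inside an ADMM loop. The sketch below applies this to a bare mean-variance objective; the budget and long-only constraints of a real portfolio are omitted, so it only illustrates the mechanism, not the brief's algorithm.

```python
import numpy as np

def project_cardinality(v, K):
    """Projection onto the l0 ball: keep the K largest-magnitude entries."""
    z = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-K:]
    z[idx] = v[idx]
    return z

def admm_card_mv(Sigma, mu, gamma, K, rho=1.0, iters=200):
    """Simplified ADMM for min_w w'Sigma w - gamma*mu'w  s.t. ||w||_0 <= K."""
    n = len(mu)
    w = np.zeros(n); z = np.zeros(n); u = np.zeros(n)
    A = 2 * Sigma + rho * np.eye(n)
    for _ in range(iters):
        w = np.linalg.solve(A, gamma * mu + rho * (z - u))  # quadratic step
        z = project_cardinality(w + u, K)                   # l0 projection step
        u = u + w - z                                       # dual update
    return z

# Toy usage
n = 20
B = np.random.randn(n, n)
Sigma = B @ B.T / n + 0.1 * np.eye(n)
mu = 0.05 + 0.02 * np.random.randn(n)
w = admm_card_mv(Sigma, mu, gamma=1.0, K=5)
```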

9.
IEEE Trans Vis Comput Graph ; 28(3): 1557-1572, 2022 Mar.
Article in English | MEDLINE | ID: mdl-32881687

ABSTRACT

Recent methods based on deep learning have shown promise in converting grayscale images to colored ones. However, most of them allow only limited user inputs (no inputs, only global inputs, or only local inputs) to control the colorized output. The main difficulty lies in differentiating the influences of different inputs. To solve this problem, we propose a two-stage deep colorization method that lets users control the results by flexibly setting global and local inputs. The key steps are enabling color themes as global inputs by extracting K mean colors and generating K-color maps to define a global theme loss, and designing a loss function that differentiates the influences of different inputs without causing artifacts. We also propose a color theme recommendation method to help users choose color themes. Based on the colorization model, we further propose an image compression scheme that supports variable compression ratios in a single network. Experiments on colorization show that our method can flexibly control the colorized results with only a few inputs and generates state-of-the-art results. Experiments on compression show that our method achieves much higher image quality at the same compression ratio than state-of-the-art methods.

10.
IEEE Trans Neural Netw Learn Syst ; 33(7): 3184-3192, 2022 Jul.
Article in English | MEDLINE | ID: mdl-33513113

ABSTRACT

The dual neural network-based k-winner-take-all (DNN-kWTA) model is an analog neural model used to identify the k largest numbers from n inputs. Since threshold logic units (TLUs) are key elements of the model, offset voltage drifts in the TLUs may affect the operational correctness of a DNN-kWTA network. Previous studies assume that the drifts in the TLUs follow particular distributions. This brief considers the case in which only the drift range, given by [-∆, ∆], is available. We consider two drift cases: time-invariant and time-varying. For the time-invariant case, we show that the state of a DNN-kWTA network converges, and we give a sufficient condition for the network to operate correctly. Furthermore, for uniformly distributed inputs, we prove that the probability that a DNN-kWTA network operates properly is greater than (1-2∆)^n. These results are then generalized to the time-varying case. In addition, for the time-invariant case, we derive a method to compute the exact convergence time for a given data set. For uniformly distributed inputs, we further derive the mean and variance of the convergence time. The convergence time results give an idea of the operational speed of the DNN-kWTA model. Finally, simulation experiments are conducted to validate the theoretical results.
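The (1-2∆)^n bound can be checked numerically. The sketch below assumes, as a simple sufficient condition, that the network operates correctly whenever the gap between the kth and (k+1)th largest inputs exceeds 2∆ (so no threshold drift in [-∆, ∆] can flip the boundary), and compares the Monte Carlo estimate with the bound for uniform inputs; this is an illustration, not the brief's derivation.

```python
import numpy as np

def prob_correct_under_drift(n, k, delta, trials=100000, seed=0):
    """Monte Carlo sketch: inputs are i.i.d. uniform(0,1); the kWTA outcome is
    treated as safe when the gap between the k-th and (k+1)-th largest inputs
    exceeds 2*delta."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        x = np.sort(rng.random(n))[::-1]          # descending order
        if x[k - 1] - x[k] > 2 * delta:
            hits += 1
    return hits / trials

n, k, delta = 10, 3, 0.01
print(prob_correct_under_drift(n, k, delta), (1 - 2 * delta) ** n)
```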

11.
IEEE Trans Neural Netw Learn Syst ; 31(6): 2227-2232, 2020 Jun.
Article in English | MEDLINE | ID: mdl-31398136

ABSTRACT

For decades, gradient descent has been applied to develop learning algorithms for training a neural network (NN). This brief reveals a limitation of applying such algorithms to train an NN with persistent weight noise. Let V(w) be the performance measure of an ideal NN; V(w) is used to develop the gradient descent learning (GDL) algorithm. With weight noise, the desired performance measure (denoted J(w)) is E[V(w̃)|w], where w̃ is the noisy weight vector. When GDL is applied to train an NN with weight noise, the actual learning objective is clearly not V(w) but another scalar function L(w). For decades, there has been a misconception that L(w) = J(w), and hence that the model attained by GDL is the desired model. However, we show that this might not hold: 1) with persistent additive weight noise, the model attained is the desired model, as L(w) = J(w); and 2) with persistent multiplicative weight noise, the model attained is unlikely to be the desired model, as L(w) ≠ J(w). Accordingly, the properties of the attained models are analyzed in comparison with the desired models, and the learning curves are sketched. Simulation results on 1) a simple regression problem and 2) MNIST handwritten digit recognition are presented to support our claims.

12.
IEEE Trans Neural Netw Learn Syst ; 30(10): 3200-3204, 2019 Oct.
Article in English | MEDLINE | ID: mdl-30668482

ABSTRACT

This brief presents analytical results on the effect of additive weight/bias noise on a Boltzmann machine (BM) in which the unit outputs are in {-1, 1} instead of {0, 1}. With such noise, it is found that the state distribution is still a Boltzmann distribution, but with an elevated temperature factor. The desired gradient ascent learning algorithm is then derived, and the corresponding learning procedure is developed. This learning procedure is compared with the one used to train a BM with noise, and the two procedures are found to be identical. Therefore, the learning algorithm for noise-free BMs is suitable for implementation as an online learning algorithm for an analog circuit-implemented BM, even if the variances of the additive weight noise and bias noise are unknown.

13.
IEEE Trans Neural Netw Learn Syst ; 29(9): 4212-4222, 2018 09.
Article in English | MEDLINE | ID: mdl-29989975

ABSTRACT

In this paper, the effects of input noise, output node stochasticity, and recurrent state noise on the Wang kWTA model are analyzed. Here, we assume that noise exists at the recurrent state y(t) and that it can be either additive or multiplicative. In addition, the dynamical change dy/dt is corrupted by noise as well. We therefore model the dynamics of y(t) as a stochastic differential equation and show that the stochastic behavior of y(t) is equivalent to an Ito diffusion. Its stationary distribution is a Gibbs distribution whose modality depends on the noise condition. With moderate input noise and very small recurrent state noise, the distribution is unimodal, and hence y(∞) has a high probability of lying between the input values of the kth and (k+1)th winners (i.e., the correct output). With small input noise and large recurrent state noise, the distribution can be multimodal, and hence y(∞) may lie outside the input values of the kth and (k+1)th winners (i.e., an incorrect output). In this regard, we further derive conditions under which the kWTA has a high probability of giving the correct output. Our results reveal that recurrent state noise can severely affect the Wang kWTA, whereas input noise and output node stochasticity can alleviate this effect.
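The stochastic recurrent state can be visualized with an Euler-Maruyama simulation of a Wang-style kWTA dynamic, dy = (sum_i sigmoid(gain*(x_i - y)) - k) dt + sigma dW. The logistic activation, gain, and noise level below are assumptions chosen for illustration; the paper works with the exact model and its Ito diffusion.

```python
import numpy as np

def simulate_wang_kwta(x, k, gain=200.0, sigma=0.01, dt=1e-4, steps=20000, seed=0):
    """Euler-Maruyama sketch of a noisy Wang-style kWTA recurrent state.
    The logistic unit stands in for the ideal step activation, and sigma
    models recurrent-state noise; both are simplifying assumptions."""
    rng = np.random.default_rng(seed)
    y = float(np.mean(x))
    for _ in range(steps):
        a = np.clip(-gain * (x - y), -50.0, 50.0)       # avoid exp overflow
        drift = np.sum(1.0 / (1.0 + np.exp(a))) - k
        y += drift * dt + sigma * np.sqrt(dt) * rng.standard_normal()
    outputs = (x > y).astype(int)                        # winners lie above y
    return y, outputs

x = np.random.rand(8)
y_inf, winners = simulate_wang_kwta(x, k=3)
print(np.sort(x)[::-1], y_inf, winners)
```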

14.
IEEE Trans Neural Netw Learn Syst ; 29(8): 3870-3878, 2018 08.
Article in English | MEDLINE | ID: mdl-28816680

ABSTRACT

In the training stage of radial basis function (RBF) networks, suitable RBF centers must first be selected. However, many existing center selection algorithms were designed for the fault-free situation. This brief develops a fault-tolerant algorithm that trains an RBF network and selects the RBF centers simultaneously. We first take all the input vectors from the training set as candidate RBF centers and define the corresponding fault-tolerant objective function. We then add an l1-norm term to the objective function. Since the l1-norm term is able to force some unimportant weights to zero, center selection is achieved during the training stage. Because the l1-norm term is nondifferentiable, we formulate the original problem as a constrained optimization problem. Based on the alternating direction method of multipliers (ADMM) framework, we then develop an algorithm to solve this constrained optimization problem. A convergence proof of the proposed algorithm is provided. Simulation results show that the proposed algorithm is superior to many existing center selection algorithms.
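For intuition, the sketch below shows a standard ADMM soft-thresholding loop for an l1-regularized least-squares fit of the RBF output weights, where every training input is a candidate center and zeroed weights correspond to discarded centers. The fault-tolerant data term of the brief is replaced by plain least squares, so this is only a simplified stand-in.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_l1_rbf(Phi, y, lam=0.1, rho=1.0, iters=300):
    """ADMM sketch for min_w 0.5*||y - Phi w||^2 + lam*||w||_1, where Phi is
    the RBF design matrix with every training input as a candidate center;
    zeroed weights correspond to discarded centers."""
    m = Phi.shape[1]
    A = Phi.T @ Phi + rho * np.eye(m)
    Pty = Phi.T @ y
    w = np.zeros(m); z = np.zeros(m); u = np.zeros(m)
    for _ in range(iters):
        w = np.linalg.solve(A, Pty + rho * (z - u))   # ridge-like step
        z = soft_threshold(w + u, lam / rho)          # sparsity step
        u = u + w - z                                 # dual update
    return z
```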

15.
IEEE Trans Neural Netw Learn Syst ; 29(8): 3879-3884, 2018 08.
Article in English | MEDLINE | ID: mdl-28816681

ABSTRACT

A commonly used measurement model for locating a mobile source is time-difference-of-arrival (TDOA). Since each TDOA measurement defines a hyperbola, it is not straightforward to compute the mobile source position because of the nonlinear relationship in the measurements. This brief exploits the Lagrange programming neural network (LPNN), which provides a general framework for solving nonlinear constrained optimization problems, for TDOA-based localization. The local stability of the proposed LPNN solution is also analyzed. Simulation results are included to evaluate the localization accuracy of the LPNN scheme, comparing it with state-of-the-art methods and with the Cramér-Rao lower bound optimality benchmark.
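For reference, the TDOA measurement model and a plain Gauss-Newton baseline on its nonlinear residuals are sketched below; the brief's LPNN instead embeds a constrained reformulation into neural dynamics, so this is only a conventional comparison point. The sensor layout and noiseless measurements are made up for the example.

```python
import numpy as np

def tdoa_residuals(p, sensors, d):
    """TDOA model: d_i = ||p - s_i|| - ||p - s_0|| (range differences w.r.t. a
    reference sensor s_0), so each measurement constrains p to a hyperbola."""
    r = np.linalg.norm(sensors - p, axis=1)
    return (r[1:] - r[0]) - d

def gauss_newton_tdoa(p0, sensors, d, iters=50):
    """Plain Gauss-Newton on the nonlinear TDOA residuals."""
    p = p0.astype(float)
    for _ in range(iters):
        r = np.linalg.norm(sensors - p, axis=1)
        J = (p - sensors) / r[:, None]           # gradient of ||p - s_i|| w.r.t. p
        Jd = J[1:] - J[0]                        # Jacobian of the range differences
        res = tdoa_residuals(p, sensors, d)
        p -= np.linalg.lstsq(Jd, res, rcond=None)[0]
    return p

sensors = np.array([[0., 0.], [10., 0.], [0., 10.], [10., 10.]])
true_p = np.array([3.0, 4.0])
ranges = np.linalg.norm(sensors - true_p, axis=1)
d = ranges[1:] - ranges[0]
print(gauss_newton_tdoa(np.array([5.0, 5.0]), sensors, d))
```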

16.
IEEE Trans Neural Netw Learn Syst ; 29(4): 1082-1094, 2018 04.
Article in English | MEDLINE | ID: mdl-28186910

ABSTRACT

This paper studies the effects of uniform input noise and Gaussian input noise on the dual neural network-based WTA (DNN-WTA) model. We show that, under either type of input noise, the state of the network converges to one of the equilibrium points. We then derive a formula to check whether the network produces the correct outputs. Furthermore, for uniformly distributed inputs, two lower bounds (one for each type of input noise) on the probability that the network produces the correct outputs are presented. In addition, when the minimum separation among the inputs is given, we derive a condition under which the network produces the correct outputs. Finally, experimental results are presented to verify the theoretical results. Since random drift in the comparators can be treated as input noise, our results also apply to the random drift situation.

17.
IEEE Trans Vis Comput Graph ; 24(10): 2773-2786, 2018 10.
Article in English | MEDLINE | ID: mdl-29028201

ABSTRACT

The original Summed Area Table (SAT) structure is designed for handling 2D rectangular data. Due to the nature of spherical functions, the SAT structure cannot handle cube maps directly. This paper proposes a new SAT structure for cube maps and develops the corresponding lookup algorithm. Our formulation starts by considering a cube map as part of an auxiliary 3D function defined in a 3D rectangular space. We interpret the 2D integration over the cube map surface as a 3D integration over the auxiliary 3D function. One might suggest creating a 3D SAT for this auxiliary function and then using it to perform the 3D integration; however, it is not practical to generate or store such a 3D SAT directly. Fortunately, this 3D SAT has properties that allow it to be stored in a storage-friendly data structure, namely the Summed Area Cube Map (SACM). A SACM can be stored in a standard cube map texture, and the lookup algorithm of our SACM structure can be implemented efficiently on current graphics hardware. In addition, the SACM structure inherits the favorable properties of the original SAT structure.
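The underlying SAT idea that the SACM extends is easy to state in code: build inclusive prefix sums once, then evaluate any axis-aligned box sum with at most four lookups. The 2D sketch below illustrates only this principle; the cube-map construction and lookup are what the paper actually contributes.

```python
import numpy as np

def build_sat(img):
    """2D summed area table: sat[r, c] = sum of img[0:r+1, 0:c+1]."""
    return img.cumsum(axis=0).cumsum(axis=1)

def box_sum(sat, r0, c0, r1, c1):
    """Sum over the inclusive rectangle [r0..r1] x [c0..c1] with four lookups."""
    total = sat[r1, c1]
    if r0 > 0:
        total -= sat[r0 - 1, c1]
    if c0 > 0:
        total -= sat[r1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += sat[r0 - 1, c0 - 1]
    return total

img = np.arange(16, dtype=float).reshape(4, 4)
sat = build_sat(img)
assert box_sum(sat, 1, 1, 2, 2) == img[1:3, 1:3].sum()
```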

18.
IEEE Trans Neural Netw Learn Syst ; 28(6): 1360-1372, 2017 06.
Article in English | MEDLINE | ID: mdl-28113823

ABSTRACT

Many existing results on fault-tolerant algorithms focus on the single-fault-source situation, where a trained network is affected by one kind of weight failure. In practice, a trained network may be affected by multiple kinds of weight failure. This paper first studies how open weight faults and multiplicative weight noise degrade the performance of radial basis function (RBF) networks. We then define an objective function for training fault-tolerant RBF networks and, based on it, develop two learning algorithms, one batch mode and one online mode. In addition, the convergence conditions of the online algorithm are investigated. Finally, we derive a formula to estimate the test set error of faulty networks trained with our approach. This formula helps us optimize tuning parameters such as the RBF width.

19.
IEEE Trans Neural Netw Learn Syst ; 28(10): 2395-2407, 2017 10.
Article in English | MEDLINE | ID: mdl-27479978

ABSTRACT

The major limitation of the Lagrange programming neural network (LPNN) approach is that the objective function and the constraints must be twice differentiable. Since sparse approximation involves nondifferentiable functions, the original LPNN approach is not suitable for recovering sparse signals. This paper proposes a new formulation of the LPNN approach based on the concept of the locally competitive algorithm (LCA). Unlike the classical LCA, which can solve only unconstrained optimization problems, the proposed LPNN approach can solve constrained optimization problems. Two sparse approximation problems are considered: basis pursuit (BP) and constrained BP denoising (CBPDN). We propose two LPNN models, BP-LPNN and CBPDN-LPNN, to solve these problems. For both models, we show that the equilibrium points of the models are the optimal solutions of the corresponding problems and vice versa, and that the equilibrium points are stable. Simulations are carried out to verify the effectiveness of the two LPNN models.

20.
IEEE Trans Neural Netw Learn Syst ; 27(4): 863-74, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26990391

ABSTRACT

Fault tolerance is an interesting property of artificial neural networks. However, existing fault models describe only limited node fault situations, such as stuck-at-zero and stuck-at-one; there is no general model that describes a large class of node fault situations. This paper studies the performance of faulty radial basis function (RBF) networks under a general node fault situation. We first propose a general node fault model that covers a large class of node fault situations, such as stuck-at-zero, stuck-at-one, and stuck-at levels with an arbitrary distribution. Afterward, we derive an expression describing the performance of faulty RBF networks. An objective function is then identified from this formula, and a training algorithm for the general node fault situation is developed. Finally, a mean prediction error (MPE) formula that estimates the test set error of faulty networks is derived, and its application to the selection of the basis width is elucidated. Simulation experiments demonstrate the effectiveness of the proposed method.
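Under a general node fault model, the degradation of a trained RBF network can also be estimated empirically by sampling fault patterns and stuck-at levels. The sketch below is such a Monte Carlo estimate; the fault probability and stuck-at distribution are placeholders, and the paper instead derives this quantity analytically.

```python
import numpy as np

def faulty_rbf_mse(Phi, w, y, p_fault, stuck_sampler, trials=200, seed=0):
    """Monte Carlo estimate of the test MSE of an RBF network under a general
    node fault model: each hidden node independently fails with probability
    p_fault, and a failed node's activation is replaced by a stuck-at level
    drawn from an arbitrary user-supplied distribution."""
    rng = np.random.default_rng(seed)
    n, m = Phi.shape
    errs = []
    for _ in range(trials):
        faulty = rng.random(m) < p_fault
        Phi_f = Phi.copy()
        Phi_f[:, faulty] = stuck_sampler(rng, faulty.sum())  # stuck-at levels
        errs.append(np.mean((y - Phi_f @ w) ** 2))
    return np.mean(errs)

# Example: stuck-at levels uniform in [0, 1]; stuck-at-zero/one are special cases
# mse = faulty_rbf_mse(Phi, w, y, 0.1, lambda rng, k: rng.random(k))
```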
