Results 1 - 20 of 44
1.
Neural Netw ; 173: 106202, 2024 May.
Article in English | MEDLINE | ID: mdl-38422835

ABSTRACT

Randomized neural networks (RNNs), such as the random vector functional link (RVFL) network and the extreme learning machine (ELM), are a widely accepted and efficient approach to constructing single-hidden-layer feedforward networks (SLFNs). Owing to their excellent approximation capabilities, RNNs are used extensively in many fields. Although the RNN concept has shown great promise, its performance can be unpredictable under imperfect conditions such as weight noise and outliers, so more reliable and robust RNN algorithms are needed. To address this issue, this paper proposes a new objective function for RVFL networks that accounts for the combined effect of weight noise and training-data outliers. Based on the half-quadratic optimization method, we then propose a novel algorithm, named the noise-aware RNN (NARNN), to optimize the proposed objective function. The convergence of the NARNN is also validated theoretically. We further discuss how to apply the NARNN to ensemble deep RVFL (edRVFL) networks. Finally, we present an extension of the NARNN that concurrently handles weight noise, stuck-at faults, and outliers. Experimental results demonstrate that the proposed algorithm outperforms a number of state-of-the-art robust RNN algorithms.
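
To make the RVFL idea concrete, the following is a minimal sketch of a plain (noise-unaware) RVFL regressor in NumPy: the hidden weights and biases are drawn at random and kept fixed, and only the output weights are learned, here by ridge regression. The noise-aware NARNN objective described above is not reproduced; the function names, the tanh activation, and the ridge regularizer are illustrative assumptions.

    import numpy as np

    def train_rvfl(X, y, n_hidden=100, ridge=1e-3, seed=0):
        """Basic RVFL regressor: random fixed hidden layer plus direct links,
        output weights obtained in closed form by ridge regression."""
        rng = np.random.default_rng(seed)
        W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # fixed random input weights
        b = rng.uniform(-1.0, 1.0, size=n_hidden)                # fixed random biases
        H = np.tanh(X @ W + b)                                   # hidden-layer outputs
        D = np.hstack([H, X])                                     # direct input-to-output links
        # Closed-form ridge solution for the output weights.
        beta = np.linalg.solve(D.T @ D + ridge * np.eye(D.shape[1]), D.T @ y)
        return W, b, beta

    def predict_rvfl(X, W, b, beta):
        return np.hstack([np.tanh(X @ W + b), X]) @ beta

    # Tiny usage example on synthetic data.
    X = np.random.randn(200, 5)
    y = np.sin(X[:, 0]) + 0.1 * np.random.randn(200)
    W, b, beta = train_rvfl(X, y)
    print(np.mean((predict_rvfl(X, W, b, beta) - y) ** 2))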


Subject(s)
Algorithms; Neural Networks, Computer; Learning
2.
Neural Netw ; 180: 106633, 2024 Aug 14.
Article in English | MEDLINE | ID: mdl-39208461

ABSTRACT

In constructing radial basis function (RBF) networks, two crucial issues commonly arise: selecting the RBF centers and making effective use of the available data without overfitting. Another important issue is fault tolerance: when noise or faults are present in a trained network, its performance should not deteriorate significantly. However, without a fault tolerant training procedure, a trained RBF network may perform very poorly. Unfortunately, most existing algorithms cannot address all of these issues simultaneously. This paper proposes fault tolerant training algorithms that select RBF nodes and train the RBF output weights at the same time. In addition, our algorithms can directly and explicitly control the number of RBF nodes, eliminating the time-consuming tuning of a regularization parameter needed to reach a target network size. Simulation results show that our algorithms achieve better test set performance when more RBF nodes are used, making effective use of the available data without overfitting. The paper first defines a fault tolerant objective function that includes a term to suppress the effects of weight faults and weight noise; this term also prevents overfitting, yielding better test set performance as more RBF nodes are used. With this objective function, training is formulated as a generalized M-sparse problem with an ℓ0-norm constraint, which allows us to control the number of RBF nodes directly and explicitly. To solve the generalized M-sparse problem, we introduce the noise-resistant iterative hard thresholding (NR-IHT) algorithm and discuss its convergence properties theoretically. To further enhance performance, we incorporate momentum into the NR-IHT algorithm, referring to the modified version as "NR-IHT-Mom". Simulation results show that both NR-IHT and NR-IHT-Mom outperform several state-of-the-art comparison algorithms.
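
As a point of reference, the plain iterative hard thresholding (IHT) step for an M-sparse least-squares problem is sketched below; the noise-resistant NR-IHT variant and its momentum version modify the objective in ways not shown here. The step size rule and the problem setup are illustrative assumptions.

    import numpy as np

    def iht(Phi, y, M, step=None, n_iter=200):
        """Iterative hard thresholding for  min ||y - Phi w||^2  s.t.  ||w||_0 <= M.
        Phi could be the hidden-layer output matrix of an RBF network, in which case
        keeping only M nonzero output weights amounts to selecting M RBF nodes."""
        n = Phi.shape[1]
        if step is None:
            # Conservative step size: 1 / largest eigenvalue of Phi^T Phi.
            step = 1.0 / np.linalg.norm(Phi, 2) ** 2
        w = np.zeros(n)
        for _ in range(n_iter):
            g = Phi.T @ (Phi @ w - y)          # gradient of the squared error
            w = w - step * g                   # gradient step
            keep = np.argsort(np.abs(w))[-M:]  # indices of the M largest magnitudes
            mask = np.zeros(n, dtype=bool)
            mask[keep] = True
            w[~mask] = 0.0                     # hard threshold: enforce ||w||_0 <= M
        return w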

3.
IEEE Trans Neural Netw Learn Syst ; 34(5): 2619-2632, 2023 May.
Article in English | MEDLINE | ID: mdl-34487503

ABSTRACT

For decades, adding fault/noise during gradient descent training has been a technique for making a neural network (NN) tolerant to persistent fault/noise or for improving its generalization. In recent years, this technique has been re-advocated in deep learning to avoid overfitting. Yet the objective function of such fault/noise-injection learning has been misinterpreted as the desired measure (i.e., the expected mean squared error (MSE) of the training samples) of the NN under the same fault/noise. The aims of this article are 1) to clarify this misconception and 2) to investigate the actual regularization effect of adding node fault/noise during gradient descent training. Based on previous works on adding fault/noise during training, we conjecture why the misconception arose. We then show that the learning objective of adding random node fault during gradient descent learning (GDL) for a multilayer perceptron (MLP) is identical to the desired measure of the MLP with the same fault. If additive (resp. multiplicative) node noise is added during GDL for an MLP, however, the learning objective is not identical to the desired measure of the MLP with such noise. For radial basis function (RBF) networks, the learning objective is shown to be identical to the corresponding desired measure for all three fault/noise conditions. Empirical evidence is presented to support these theoretical results and hence clarify the misconception that the objective function of fault/noise-injection learning can always be interpreted as the desired measure of the NN with the same fault/noise. Afterward, the regularization effect of adding node fault/noise during training is revealed for RBF networks. Notably, the regularization effect of adding additive or multiplicative node noise (MNN) during RBF training is shown to be a reduction of network complexity. Dropout regularization applied to RBF networks has the same effect as adding MNN during training.
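
For readers who want to see what "adding node noise during gradient descent" means operationally, here is a small NumPy sketch that injects multiplicative node noise into the hidden layer of a one-hidden-layer network at every gradient step. It only illustrates the injection procedure, not the article's analysis; the noise level, architecture, and learning rate are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((256, 4))
    y = np.sin(X[:, 0:1]) + 0.1 * rng.standard_normal((256, 1))

    n_hidden, lr, sigma = 32, 0.05, 0.2          # sigma: multiplicative node-noise level
    W1 = 0.5 * rng.standard_normal((4, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = 0.5 * rng.standard_normal((n_hidden, 1))

    for epoch in range(500):
        h_clean = np.tanh(X @ W1 + b1)
        # Multiplicative node noise: each hidden output is scaled by (1 + noise).
        noise = 1.0 + sigma * rng.standard_normal(h_clean.shape)
        h = h_clean * noise
        out = h @ W2
        err = out - y                             # gradient of the squared error w.r.t. out
        # Backpropagate through the *noisy* forward pass (noise treated as a constant).
        gW2 = h.T @ err / len(X)
        dh = (err @ W2.T) * noise * (1 - h_clean ** 2)
        gW1 = X.T @ dh / len(X)
        gb1 = dh.mean(axis=0)
        W2 -= lr * gW2
        W1 -= lr * gW1
        b1 -= lr * gb1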

4.
Article in English | MEDLINE | ID: mdl-37310825

ABSTRACT

The dual neural network (DNN)-based k-winner-take-all (kWTA) model is able to identify the k largest numbers among its m input numbers. When imperfections such as a non-ideal step function and Gaussian input noise exist in the realization, the model may not output the correct result. This brief analyzes how these imperfections affect the operational correctness of the model. Because of the imperfections, it is not efficient to use the original DNN-kWTA dynamics to analyze their influence. This brief therefore first derives an equivalent model that describes the dynamics of the model under the imperfections. From the equivalent model, we derive a sufficient condition under which the model outputs the correct result. We then use this sufficient condition to design an efficient method for estimating the probability that the model outputs the correct result. Furthermore, for inputs with a uniform distribution, a closed-form expression for this probability is derived. Finally, we extend our analysis to handle non-Gaussian input noise. Simulation results are provided to validate our theoretical results.
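
To make the model being analyzed easier to picture, below is a simple Euler simulation of the commonly cited DNN-kWTA dynamics: a single state variable y is driven by the difference between the number of currently winning outputs and k, and at equilibrium only the k largest inputs produce an output of 1. The steep logistic activation stands in for the non-ideal step function mentioned above; the gain, step size, and iteration count are illustrative assumptions, not values from the brief.

    import numpy as np

    def dnn_kwta(u, k, gain=200.0, dt=0.01, n_steps=5000):
        """Euler simulation of a DNN-kWTA network.
        u: input numbers; k: number of winners.
        Outputs x_i = g(u_i - y) with a steep logistic g as a non-ideal step function.
        The single state y evolves until sum(x) is (approximately) k."""
        y = float(np.mean(u))                          # initial value of the state variable
        for _ in range(n_steps):
            x = 1.0 / (1.0 + np.exp(-gain * (u - y)))  # non-ideal step (logistic) outputs
            y += dt * (np.sum(x) - k)                  # dy/dt proportional to (sum of outputs - k)
        return (x > 0.5).astype(int), y

    u = np.array([0.31, 0.92, 0.15, 0.77, 0.48])
    winners, y_eq = dnn_kwta(u, k=2)
    print(winners)   # expected: 1s at the positions of the two largest inputs (0.92 and 0.77)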

5.
Article in English | MEDLINE | ID: mdl-37796669

ABSTRACT

Among the many k-winners-take-all (kWTA) models, the dual neural network (DNN-kWTA) model requires significantly fewer connections. However, in analog realizations, noise is inevitable and affects the operational correctness of the kWTA process. Most existing results focus on the effect of additive noise. This brief studies the effect of time-varying multiplicative input noise. Two scenarios are considered. The first is the bounded-noise case, in which only the noise range is known. The second is the general-noise-distribution case, in which we either know the noise distribution or have noise samples. For each scenario, we first prove the convergence of the DNN-kWTA model under multiplicative input noise and then provide an efficient method for determining whether a noise-affected DNN-kWTA network performs the correct kWTA process for a given set of inputs. With these two methods, we can efficiently measure the probability that the network performs the correct kWTA process. In addition, for uniformly distributed inputs, we derive two closed-form expressions, one for each scenario, for estimating the probability that the model operates correctly. Finally, we conduct simulations to verify our theoretical results.
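
For intuition, the probability discussed above can also be estimated by brute force: sample the multiplicative input noise many times and count how often the noisy inputs still rank the true k largest inputs on top, which is the condition for the kWTA output to remain correct in this simplified static-noise check. The brief's methods avoid such sampling and handle time-varying noise; the Gaussian noise model and its level here are illustrative assumptions.

    import numpy as np

    def prob_correct_kwta(u, k, sigma, n_trials=20000, seed=1):
        """Monte Carlo estimate of the chance that multiplicative input noise
        leaves the set of k winners unchanged (simplified, static-noise check)."""
        rng = np.random.default_rng(seed)
        true_winners = set(np.argsort(u)[-k:])
        hits = 0
        for _ in range(n_trials):
            noisy = u * (1.0 + sigma * rng.standard_normal(len(u)))  # multiplicative noise
            if set(np.argsort(noisy)[-k:]) == true_winners:
                hits += 1
        return hits / n_trials

    u = np.array([0.31, 0.92, 0.15, 0.77, 0.48])
    print(prob_correct_kwta(u, k=2, sigma=0.05))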

6.
IEEE Trans Neural Netw Learn Syst ; 34(2): 1080-1088, 2023 Feb.
Article in English | MEDLINE | ID: mdl-34428154

ABSTRACT

From the feature-representation point of view, the feature learning module of a convolutional neural network (CNN) transforms an input pattern into a feature vector, which is then multiplied with a number of output weight vectors to produce softmax scores. The common training objective in CNNs is based on the softmax loss, which ignores intra-class compactness. This brief proposes a constrained center loss (CCL)-based algorithm to extract robust features. The training objective of the CNN consists of two terms: the softmax loss and the CCL. The softmax loss pushes feature vectors from different classes apart, while the CCL clusters the feature vectors so that those from the same class are close together. Instead of using stochastic gradient descent (SGD) to learn all the connection weights and the cluster centers at the same time, our CCL-based algorithm adopts an alternating learning strategy. We first fix the connection weights of the CNN and update the cluster centers using an analytical formula, which can be implemented in a minibatch fashion. We then fix the cluster centers and update the connection weights for a number of SGD minibatch iterations. We also propose a simplified CCL (SCCL) algorithm. Experiments are performed on six commonly used benchmark datasets. The results demonstrate that the two proposed algorithms outperform several state-of-the-art approaches.
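
The alternating scheme above can be illustrated on precomputed feature vectors: with the network weights fixed, each class center is simply the (minibatch) mean of that class's features, and with the centers fixed, the training loss is softmax cross-entropy plus a weighted center term. The NumPy sketch below shows only those two pieces; the constraint that distinguishes CCL from a plain center loss, and the weight lam, are assumptions rather than details from the brief.

    import numpy as np

    def update_centers(features, labels, centers):
        """With the CNN weights fixed, set each class center to the mean of that
        class's feature vectors in the current minibatch (analytical update)."""
        new_centers = centers.copy()
        for c in range(centers.shape[0]):
            mask = labels == c
            if mask.any():
                new_centers[c] = features[mask].mean(axis=0)
        return new_centers

    def softmax_plus_center_loss(features, labels, out_weights, centers, lam=0.1):
        """Softmax cross-entropy (pushes classes apart) plus a center term
        (pulls same-class features together), combined with weight lam."""
        logits = features @ out_weights                      # class scores
        logits -= logits.max(axis=1, keepdims=True)          # numerical stability
        probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
        ce = -np.log(probs[np.arange(len(labels)), labels]).mean()
        center_term = ((features - centers[labels]) ** 2).sum(axis=1).mean()
        return ce + lam * center_term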

7.
IEEE Trans Neural Netw Learn Syst ; 34(12): 10930-10943, 2023 Dec.
Article in English | MEDLINE | ID: mdl-35576417

ABSTRACT

Sparse index tracking, a passive investment strategy, tracks a benchmark financial index by constructing a portfolio from a small subset of the assets in the index. It can be regarded as parameter learning in an adaptive system, in which we periodically update the selected assets and their investment percentages using a sliding-window approach. However, many existing sparse index tracking algorithms cannot explicitly and directly control the number of assets or the tracking error. This article formulates sparse index tracking as two constrained optimization problems and then proposes two algorithms: nonnegative orthogonal matching pursuit with projected gradient descent (NNOMP-PGD) and the alternating direction method of multipliers for the ℓ0-norm (ADMM-ℓ0). NNOMP-PGD minimizes the tracking error subject to the number of selected assets being at most a predefined value, so investors can directly and explicitly control the number of selected assets. ADMM-ℓ0 minimizes the number of selected assets subject to the tracking error being upper bounded by a preset threshold, so it directly and explicitly controls the tracking error. The convergence of the two proposed algorithms is also presented. In addition, numerical experiments demonstrate that the proposed algorithms outperform existing approaches.
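
A rough sketch of the greedy-selection-plus-refit structure that the NNOMP-PGD description suggests is given below: pick at most K assets one at a time by their correlation with the current tracking residual, then refit nonnegative weights on the selected assets by projected gradient descent. The stopping rule, step size, and the omission of a full-investment (sum-to-one) constraint are assumptions made here; the actual algorithm in the article may differ.

    import numpy as np

    def nnomp_pgd(R, r_index, K, n_pgd=500, step=None):
        """Greedy sparse index tracking sketch.
        R: T x m matrix of asset returns; r_index: length-T index returns.
        Selects at most K assets, then refits nonnegative weights on them."""
        T, m = R.shape
        support = []
        w = np.zeros(m)
        if step is None:
            step = 1.0 / np.linalg.norm(R, 2) ** 2
        for _ in range(K):
            residual = r_index - R @ w
            scores = R.T @ residual
            scores[support] = -np.inf                 # do not pick the same asset twice
            j = int(np.argmax(scores))
            if scores[j] <= 0:
                break                                 # no remaining asset improves the fit
            support.append(j)
            # Projected gradient descent on the current support, keeping w >= 0.
            for _ in range(n_pgd):
                grad = R.T @ (R @ w - r_index)
                w[support] -= step * grad[support]
                w[support] = np.maximum(w[support], 0.0)
        return w, support

    # Usage sketch with synthetic returns.
    rng = np.random.default_rng(0)
    R = 0.01 * rng.standard_normal((250, 50))
    r_index = R[:, :10].mean(axis=1)                  # a toy "index" built from 10 assets
    w, support = nnomp_pgd(R, r_index, K=5)
    print(sorted(support))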

8.
IEEE Trans Neural Netw Learn Syst ; 34(8): 5218-5226, 2023 Aug.
Article in English | MEDLINE | ID: mdl-34847045

ABSTRACT

The objective of compressive sampling is to recover a sparse vector from an observation vector. This brief describes an analog neural method for achieving this objective. Unlike previous analog neural models, which either resort to an l1-norm approximation or offer only local convergence, the proposed method avoids any approximation of the l1-norm term and is provably capable of reaching the optimal solution. Moreover, its computational complexity is lower than that of the three comparison analog models. Simulation results show that the error performance of the proposed model is comparable to that of several state-of-the-art digital algorithms and analog models, and that it converges faster than the comparison analog neural models.

9.
IEEE Trans Neural Netw Learn Syst ; 33(7): 3184-3192, 2022 Jul.
Article in English | MEDLINE | ID: mdl-33513113

ABSTRACT

The dual neural network-based k-winner-take-all (DNN-kWTA) is an analog neural model used to identify the k largest numbers among n inputs. Since threshold logic units (TLUs) are key elements of the model, offset voltage drifts in the TLUs may affect the operational correctness of a DNN-kWTA network. Previous studies assume that the drifts in the TLUs follow particular distributions. This brief considers the case where only the drift range, given by [-∆, ∆], is available. We consider two drift cases: time-invariant and time-varying. For the time-invariant case, we show that the state of a DNN-kWTA network converges, and we give a sufficient condition for the network to operate correctly. Furthermore, for uniformly distributed inputs, we prove that the probability that a DNN-kWTA network operates properly is greater than (1-2∆)^n. These results are then generalized to the time-varying case. In addition, for the time-invariant case, we derive a method to compute the exact convergence time for a given data set. For uniformly distributed inputs, we further derive the mean and variance of the convergence time. The convergence time results give an idea of the operational speed of the DNN-kWTA model. Finally, simulation experiments have been conducted to validate these theoretical results.

10.
IEEE Trans Vis Comput Graph ; 28(3): 1557-1572, 2022 Mar.
Article in English | MEDLINE | ID: mdl-32881687

ABSTRACT

Recent deep learning methods have shown promise in converting grayscale images to colored ones. However, most of them allow only limited user inputs (no inputs, only global inputs, or only local inputs) to control the colorized output. The difficulty lies in how to differentiate the influences of different inputs. To solve this problem, we propose a two-stage deep colorization method that lets users control the results by flexibly setting global and local inputs. The key steps include enabling color themes as global inputs by extracting K mean colors and generating K-color maps to define a global theme loss, and designing a loss function that differentiates the influences of different inputs without causing artifacts. We also propose a color theme recommendation method to help users choose color themes. Building on the colorization model, we further propose an image compression scheme that supports variable compression ratios within a single network. Experiments on colorization show that our method can flexibly control the colorized results with only a few inputs and generates state-of-the-art results. Experiments on compression show that our method achieves much higher image quality at the same compression ratio than state-of-the-art methods.

11.
Article in English | MEDLINE | ID: mdl-35895648

ABSTRACT

Inspired by sparse learning, the Markowitz mean-variance model with a sparse regularization term is widely used in sparse portfolio optimization. However, in penalty-based portfolio optimization algorithms, the cardinality level of the resultant portfolio depends on the choice of the regularization parameter. This brief formulates the mean-variance model as a cardinality (ℓ0-norm) constrained nonconvex optimization problem, in which we can explicitly specify the number of assets in the portfolio. We then use the alternating direction method of multipliers (ADMM) concept to develop an algorithm for solving this constrained nonconvex problem. Unlike some existing algorithms, the proposed algorithm can explicitly control the portfolio cardinality. In addition, the dynamic behavior of the proposed algorithm is derived. Numerical results on four real-world datasets demonstrate the superiority of our approach over several state-of-the-art algorithms.
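
For intuition about the constrained formulation, the problem can be read as minimizing w'Σw - γμ'w subject to ||w||_0 ≤ K (other portfolio constraints aside). The sketch below alternates a gradient step with a hard-thresholding projection onto the cardinality constraint; it is a deliberately simplified projected-gradient illustration, not the ADMM algorithm proposed in the brief, and γ, the step size, the long-only restriction, and the renormalization heuristic are all assumptions.

    import numpy as np

    def sparse_mean_variance(mu, Sigma, K, gamma=1.0, step=0.01, n_iter=2000):
        """Projected-gradient sketch for  min w'Sigma w - gamma*mu'w  s.t. ||w||_0 <= K.
        The projection keeps the K largest-magnitude weights and (heuristically)
        renormalizes them to sum to one."""
        m = len(mu)
        w = np.full(m, 1.0 / m)
        for _ in range(n_iter):
            grad = 2.0 * Sigma @ w - gamma * mu       # gradient of the mean-variance objective
            w = w - step * grad
            keep = np.argsort(np.abs(w))[-K:]         # hard threshold: keep K assets
            w_new = np.zeros(m)
            w_new[keep] = np.maximum(w[keep], 0.0)    # long-only, for simplicity
            s = w_new.sum()
            w = w_new / s if s > 0 else np.full(m, 1.0 / m)
        return w

    rng = np.random.default_rng(0)
    A = rng.standard_normal((60, 20))
    Sigma = A.T @ A / 60                               # toy covariance matrix
    mu = 0.001 + 0.01 * rng.random(20)                 # toy expected returns
    w = sparse_mean_variance(mu, Sigma, K=5)
    print(np.count_nonzero(w), w.sum())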

12.
IEEE Trans Vis Comput Graph ; 16(2): 287-97, 2010.
Article in English | MEDLINE | ID: mdl-20075488

ABSTRACT

We propose a novel reaction-diffusion (RD) simulator to evolve image-resembling mazes. The evolved mazes faithfully preserve the salient interior structures of the source images. Since it is difficult to control the generation of desired patterns with traditional reaction-diffusion, we develop our RD simulator on a different computational platform, cellular neural networks. Based on the proposed simulator, we can generate mazes that exhibit both regular and organic appearance, with uniform and/or spatially varying passage spacing. Our simulator also provides a high degree of control over maze appearance: users can directly and intuitively "paint" to modify the appearance of mazes in a spatially varying manner via a set of brushes. In addition, the evolutionary nature of our method naturally generates mazes without any obvious seam, even when the input image is a composite of multiple sources. The final maze is obtained by determining a solution path that follows a user-specified guiding curve. We validate our method by evolving several interesting mazes from different source images.


Subject(s)
Algorithms; Computer Graphics; Image Interpretation, Computer-Assisted/methods; Models, Theoretical; Computer Simulation
13.
IEEE Trans Vis Comput Graph ; 16(1): 43-56, 2010.
Article in English | MEDLINE | ID: mdl-19910660

ABSTRACT

This paper proposes a novel multiscale spherical radial basis function (MSRBF) representation for all-frequency lighting. It supports distant environment illumination as well as the local illumination commonly used in practical applications such as games. The key is to define a multiscale, hierarchical structure of spherical radial basis functions (SRBFs) whose basis functions are uniformly distributed over the sphere. The basis functions are divided into multiple levels according to their coverage (widths); within the same level, SRBFs have the same width. Larger-width SRBFs account for the lower-frequency lighting, while the smaller-width ones account for the higher-frequency lighting. Hence, our approach can achieve true all-frequency lighting, which is not achievable with the single-scale SRBF approach. In addition, the MSRBF approach is scalable, as coarser rendering quality can be achieved without re-estimating the coefficients from the raw data. With the homogeneous form of the basis functions, rendering is highly efficient. The practicality of the proposed method is demonstrated with real-time rendering and effective compression for tractable storage.


Subject(s)
Algorithms; Computer Graphics; Imaging, Three-Dimensional/methods; Lighting/methods; Models, Theoretical; User-Computer Interface; Computer Simulation
14.
IEEE Trans Neural Netw Learn Syst ; 31(6): 2227-2232, 2020 Jun.
Article in English | MEDLINE | ID: mdl-31398136

ABSTRACT

Over the decades, gradient descent has been applied to develop learning algorithms for training a neural network (NN). In this brief, a limitation of applying such algorithms to train an NN with persistent weight noise is revealed. Let V(w) be the performance measure of an ideal NN; V(w) is used to develop the gradient descent learning (GDL) algorithm. With weight noise, the desired performance measure (denoted J(w)) is E[V(w̃)|w], where w̃ is the noisy weight vector. When GDL is applied to train an NN with weight noise, the actual learning objective is clearly not V(w) but another scalar function L(w). For decades there has been a misconception that L(w) = J(w), and hence that the model attained by GDL is the desired model. We show that this is not always the case: 1) with persistent additive weight noise, the attained model is the desired model, as L(w) = J(w); but 2) with persistent multiplicative weight noise, the attained model is unlikely to be the desired model, as L(w) ≠ J(w). Accordingly, the properties of the attained models are analyzed in comparison with the desired models, and the learning curves are sketched. Simulation results on 1) a simple regression problem and 2) MNIST handwritten digit recognition are presented to support our claims.
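
In standard notation, the quantities above can be summarized as follows; the explicit zero-mean noise models are an assumption introduced here for concreteness, and the final line simply restates the brief's claims.

    \begin{align*}
    & V(\mathbf{w}) && \text{performance measure (e.g., training MSE) of the ideal, noise-free NN} \\
    & J(\mathbf{w}) = \mathbb{E}\big[\, V(\tilde{\mathbf{w}}) \,\big|\, \mathbf{w} \big] && \text{desired measure under persistent weight noise} \\
    & \tilde{w}_i = w_i + b_i \ \text{(additive)}, \qquad \tilde{w}_i = w_i (1 + b_i) \ \text{(multiplicative)}, \qquad \mathbb{E}[b_i] = 0 \\
    & L(\mathbf{w}) && \text{objective actually minimized by noise-injection GDL} \\
    & L(\mathbf{w}) = J(\mathbf{w}) \ \text{for additive weight noise}, \qquad L(\mathbf{w}) \neq J(\mathbf{w}) \ \text{for multiplicative weight noise}
    \end{align*}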

15.
IEEE Trans Neural Netw Learn Syst ; 30(10): 3200-3204, 2019 Oct.
Article in English | MEDLINE | ID: mdl-30668482

ABSTRACT

This brief presents analytical results on the effect of additive weight/bias noise on a Boltzmann machine (BM) in which the unit outputs are in {-1, 1} instead of {0, 1}. With such noise, the state distribution is found to be yet another Boltzmann distribution, but with an elevated temperature factor. The desired gradient ascent learning algorithm is then derived, and the corresponding learning procedure is developed. This learning procedure is compared with the procedure used to train a BM with noise, and the two procedures are found to be identical. Therefore, the learning algorithm for noise-free BMs is suitable for implementation as an online learning algorithm for an analog-circuit-implemented BM, even if the variances of the additive weight noise and bias noise are unknown.

16.
IEEE Trans Neural Netw ; 19(3): 493-507, 2008 Mar.
Article in English | MEDLINE | ID: mdl-18334367

ABSTRACT

In classical training methods for handling node open faults, many potential faulty networks must be considered. When the multinode fault situation is considered, the space of potential faulty networks is very large, so the objective function and the corresponding learning algorithm become computationally complicated. This paper uses the Kullback-Leibler divergence to define an objective function for improving the fault tolerance of radial basis function (RBF) networks. Under the assumption of a Gaussian-distributed noise term in the output data, a regularizer in the objective function is identified, and the corresponding learning algorithm is developed. In our approach, both the objective function and the learning algorithm are computationally simple. Compared with several conventional approaches, including weight-decay-based regularizers, our approach has better fault tolerance. Moreover, our empirical study shows that our approach can improve the generalization ability of a fault-free RBF network.


Subject(s)
Algorithms; Computer Simulation; Learning; Neural Networks, Computer; Decision Support Techniques; Humans; Models, Statistical
17.
IEEE Trans Neural Netw Learn Syst ; 29(4): 1082-1094, 2018 04.
Article in English | MEDLINE | ID: mdl-28186910

ABSTRACT

This paper studies the effects of uniform input noise and Gaussian input noise on the dual neural network-based WTA (DNN-WTA) model. We show that the state of the network (under either uniform or Gaussian input noise) converges to one of the equilibrium points. We then derive a formula to check whether the network produces the correct outputs. Furthermore, for uniformly distributed inputs, two lower bounds (one for each type of input noise) on the probability that the network produces the correct outputs are presented. In addition, when the minimum separation among the inputs is given, we derive the condition under which the network produces the correct outputs. Finally, experimental results are presented to verify our theoretical results. Since random drift in the comparators can be regarded as input noise, our results also apply to the random-drift situation.

18.
IEEE Trans Neural Netw Learn Syst ; 29(9): 4212-4222, 2018 09.
Article in English | MEDLINE | ID: mdl-29989975

ABSTRACT

In this paper, the effects of input noise, output node stochasticity, and recurrent state noise on the Wang kWTA model are analyzed. Here, we assume that noise exists at the recurrent state y(t) and can be either additive or multiplicative, and that its dynamical change (i.e., dy/dt) is corrupted by noise as well. We then model the dynamics of y(t) as a stochastic differential equation and show that the stochastic behavior of y(t) is equivalent to an Ito diffusion. Its stationary distribution is a Gibbs distribution whose modality depends on the noise condition. With moderate input noise and very small recurrent state noise, the distribution is unimodal, and hence y(∞) lies, with high probability, between the input values of the k-th and (k+1)-th winners (i.e., correct output). With small input noise and large recurrent state noise, the distribution can be multimodal, and hence y(∞) may, with non-negligible probability, lie outside the input values of the k-th and (k+1)-th winners (i.e., incorrect output). In this regard, we further derive the conditions under which the kWTA gives the correct output with high probability. Our results reveal that recurrent state noise can severely affect the Wang kWTA, but that input noise and output node stochasticity can alleviate this effect.

19.
IEEE Trans Vis Comput Graph ; 24(10): 2773-2786, 2018 10.
Article in English | MEDLINE | ID: mdl-29028201

ABSTRACT

The original Summed Area Table (SAT) structure is designed for handling 2D rectangular data. Due to the nature of spherical functions, the SAT structure cannot handle cube maps directly. This paper proposes a new SAT structure for cube maps and develops the corresponding lookup algorithm. Our formulation starts by treating a cube map as part of an auxiliary 3D function defined in a 3D rectangular space, and interprets the 2D integration over the cube-map surface as a 3D integration over this auxiliary function. One might suggest creating a 3D SAT for the auxiliary function and using it to perform the 3D integration; however, it is not practical to generate or store such a 3D SAT directly. Fortunately, this 3D SAT has properties that allow it to be stored in a storage-friendly data structure, the Summed Area Cube Map (SACM), which fits in a standard cube map texture. The lookup algorithm of our SACM structure can be implemented efficiently on current graphics hardware. In addition, the SACM structure inherits the favorable properties of the original SAT structure.
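
As background for the SACM, the classic 2D SAT that it generalizes works as follows: each entry stores the sum of all values above and to the left, so any axis-aligned rectangular sum needs only four lookups. A minimal NumPy sketch for ordinary 2D data (not cube maps, and not the SACM itself) is given below; the function names are illustrative.

    import numpy as np

    def build_sat(img):
        """Summed Area Table: sat[i, j] = sum of img[0..i, 0..j] (inclusive)."""
        return img.cumsum(axis=0).cumsum(axis=1)

    def box_sum(sat, r0, c0, r1, c1):
        """Sum of img[r0..r1, c0..c1] (inclusive) using four SAT lookups."""
        total = sat[r1, c1]
        if r0 > 0:
            total -= sat[r0 - 1, c1]
        if c0 > 0:
            total -= sat[r1, c0 - 1]
        if r0 > 0 and c0 > 0:
            total += sat[r0 - 1, c0 - 1]
        return total

    img = np.arange(16, dtype=float).reshape(4, 4)
    sat = build_sat(img)
    print(box_sum(sat, 1, 1, 2, 3), img[1:3, 1:4].sum())  # both print the same value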

20.
IEEE Trans Neural Netw Learn Syst ; 29(8): 3870-3878, 2018 08.
Article in English | MEDLINE | ID: mdl-28816680

ABSTRACT

In the training stage of radial basis function (RBF) networks, suitable RBF centers must be selected first. However, many existing center selection algorithms were designed for the fault-free situation. This brief develops a fault tolerant algorithm that trains an RBF network and selects the RBF centers simultaneously. We first take all the input vectors from the training set as candidate RBF centers and define the corresponding fault tolerant objective function. We then add an ℓ1-norm term to the objective function. Since the ℓ1-norm term forces some unimportant weights to zero, center selection is achieved during training. Because the ℓ1-norm term is nondifferentiable, we formulate the original problem as a constrained optimization problem and, based on the alternating direction method of multipliers (ADMM) framework, develop an algorithm to solve it. A convergence proof of the proposed algorithm is provided. Simulation results show that the proposed algorithm is superior to many existing center selection algorithms.
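
To see how an ℓ1 term performs center selection, consider the subproblem of fitting the RBF output weights with an ℓ1 penalty: weights of unimportant candidate centers are driven exactly to zero. The sketch below solves that subproblem with plain proximal gradient (ISTA) rather than the ADMM formulation used in the brief, and it ignores the fault-tolerant term in the objective; the RBF width and regularization strength are assumptions.

    import numpy as np

    def rbf_design_matrix(X, centers, width):
        """Gaussian RBF activations: one column per candidate center."""
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2.0 * width ** 2))

    def ista_l1(H, y, lam=0.05, n_iter=1000):
        """Proximal gradient (ISTA) for  min 0.5||y - H w||^2 + lam*||w||_1.
        Centers whose weights end up exactly zero are effectively deselected."""
        step = 1.0 / np.linalg.norm(H, 2) ** 2
        w = np.zeros(H.shape[1])
        for _ in range(n_iter):
            w = w - step * (H.T @ (H @ w - y))                        # gradient step
            w = np.sign(w) * np.maximum(np.abs(w) - step * lam, 0.0)  # soft threshold
        return w

    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(100, 1))
    y = np.sin(np.pi * X[:, 0]) + 0.05 * rng.standard_normal(100)
    H = rbf_design_matrix(X, X, width=0.3)          # every training input is a candidate center
    w = ista_l1(H, y)
    print("selected centers:", np.count_nonzero(w), "of", len(w))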
