1.
Neural Comput; 13(6): 1379-414, 2001 Jun.
Article in English | MEDLINE | ID: mdl-11387050

ABSTRACT

We perform a detailed fixed-point analysis of two-unit recurrent neural networks with sigmoid-shaped transfer functions. Using geometrical arguments in the space of transfer function derivatives, we partition the network state space into distinct regions corresponding to stability types of the fixed points. Unlike previous studies, we do not assume any special form of connectivity pattern between the neurons, and all free parameters are allowed to vary. We also prove that when both neurons have excitatory self-connections and the mutual interaction pattern is the same (i.e., the neurons mutually inhibit or excite each other), new attractive fixed points are created through the saddle-node bifurcation. Finally, for an N-neuron recurrent network, we give lower bounds on the rate of convergence of attractive periodic points toward the saturation values of neuron activations as the absolute values of connection weights grow.


Subjects
Models, Neurological; Nerve Net/physiology; Neural Networks, Computer; Neurons/physiology; Mathematics
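
The abstract above lends itself to a small numerical illustration. The sketch below (not the authors' code; the network, weights, and tolerances are assumptions made here) finds the fixed points of a two-unit discrete-time sigmoid network x(t+1) = sigma(W x(t) + b) and classifies their stability from the eigenvalues of the Jacobian of the map, which is the basic computation behind the kind of analysis described.

```python
import numpy as np
from scipy.optimize import fsolve

def sigma(z):
    """Logistic sigmoid transfer function."""
    return 1.0 / (1.0 + np.exp(-z))

def find_fixed_points(W, b, n_starts=15, tol=1e-6):
    """Locate solutions of x = sigma(W x + b) from a grid of initial guesses."""
    residual = lambda v: sigma(W @ v + b) - v
    found = []
    for x1 in np.linspace(0.0, 1.0, n_starts):
        for x2 in np.linspace(0.0, 1.0, n_starts):
            x = fsolve(residual, np.array([x1, x2]))
            if np.max(np.abs(residual(x))) < tol and \
               not any(np.allclose(x, f, atol=1e-4) for f in found):
                found.append(x)
    return found

def classify(W, b, x):
    """Stability type from the Jacobian of the map at a fixed point."""
    s = sigma(W @ x + b)
    J = np.diag(s * (1.0 - s)) @ W          # chain rule: diag(sigma') @ W
    n_unstable = int(np.sum(np.abs(np.linalg.eigvals(J)) > 1.0))
    return ("attractive", "saddle", "repulsive")[n_unstable]

# Illustrative weights: excitatory self-connections, symmetric mutual excitation.
W = np.array([[8.0, 3.0], [3.0, 8.0]])
b = np.array([-5.5, -5.5])
for x in find_fixed_points(W, b):
    print(np.round(x, 3), classify(W, b, x))
```

With these illustrative weights the script should report attractive fixed points near the saturation corners of the state space and an unstable fixed point near its centre; it is only meant to show the mechanics, not to reproduce the paper's results.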
2.
IEEE Trans Neural Netw; 8(5): 1065-70, 1997.
Article in English | MEDLINE | ID: mdl-18255709

ABSTRACT

In this work, we characterize and contrast the capabilities of the general class of time-delay neural networks (TDNNs) with those of input delay neural networks (IDNNs), the subclass of TDNNs with delays limited to the inputs. Each class of networks is capable of representing the same set of languages, namely those embodied by the definite memory machines (DMMs), a subclass of finite-state machines. We demonstrate the close affinity between TDNNs and DMM languages by learning a very large DMM (2048 states) using only a few training examples. Even though both architectures are capable of representing the same class of languages, they have distinguishable learning biases. Intuition suggests that general TDNNs, which include delays in hidden layers, should perform well compared to IDNNs on problems in which the output can be expressed as a function of narrow input windows that repeat in time. On the other hand, these general TDNNs should perform poorly when the input windows are wide, or there is little repetition. We confirm these hypotheses via a set of simulations and statistical analysis.
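
As a purely illustrative companion to the abstract above, the sketch below implements an input delay neural network in the sense used there: the output at each step is a function of a fixed window of the most recent inputs only, which is exactly the memory structure of a definite memory machine of that order. The class name, layer sizes, and random weights are assumptions of this sketch, not taken from the paper.

```python
import numpy as np

class IDNN:
    """Input delay neural network: a small MLP applied to a tapped delay line of inputs."""

    def __init__(self, window, hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.window = window
        self.W1 = rng.normal(scale=0.5, size=(hidden, window))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(scale=0.5, size=hidden)

    def step(self, recent_inputs):
        """Output for one window u(t), u(t-1), ..., u(t-window+1)."""
        h = np.tanh(self.W1 @ recent_inputs + self.b1)
        return float(np.tanh(self.W2 @ h))

    def run(self, u):
        """Slide the window over a whole input sequence (zero-padded at the start)."""
        padded = np.concatenate([np.zeros(self.window - 1), u])
        return np.array([self.step(padded[t:t + self.window][::-1])
                         for t in range(len(u))])

# The output at time t depends only on the last `window` inputs,
# as in a definite memory machine of order `window`.
net = IDNN(window=4, hidden=8)
print(net.run(np.array([1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0])))
```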

3.
Article in English | MEDLINE | ID: mdl-18255858

ABSTRACT

Recently, fully connected recurrent neural networks have been proven to be computationally rich: at least as powerful as Turing machines. This work focuses on another network which is popular in control applications and has been found to be very effective at learning a variety of problems. These networks are based upon Nonlinear AutoRegressive models with eXogenous Inputs (NARX models), and are therefore called NARX networks. As opposed to other recurrent networks, NARX networks have limited feedback which comes only from the output neuron rather than from hidden states. They are formalized by y(t) = Psi(u(t - n_u), ..., u(t - 1), u(t), y(t - n_y), ..., y(t - 1)), where u(t) and y(t) represent the input and output of the network at time t, n_u and n_y are the input and output orders, and the function Psi is the mapping performed by a multilayer perceptron. We constructively prove that NARX networks with a finite number of parameters are computationally as strong as fully connected recurrent networks, and thus Turing machines. We conclude that in theory one can use NARX models rather than conventional recurrent networks without any computational loss, even though their feedback is limited. Furthermore, these results raise the issue of how much feedback or recurrence is necessary for a network to be Turing equivalent, and of which restrictions on feedback limit computational power.
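
The recurrence quoted in the abstract translates directly into code. The sketch below (an illustration under assumptions made here, not the paper's construction) implements y(t) = Psi(u(t-n_u), ..., u(t), y(t-n_y), ..., y(t-1)) with Psi realized as a one-hidden-layer multilayer perceptron with arbitrary weights; the feedback taps come only from past outputs, never from hidden units.

```python
import numpy as np

def narx_run(u, n_u, n_y, W1, b1, W2, b2):
    """Run a NARX network over input sequence u; past values are zero-initialized."""
    u_hist = np.zeros(n_u + 1)    # holds u(t-n_u), ..., u(t-1), u(t)
    y_hist = np.zeros(n_y)        # holds y(t-n_y), ..., y(t-1)
    outputs = []
    for u_t in u:
        u_hist = np.concatenate([u_hist[1:], [u_t]])           # shift in the new input
        x = np.concatenate([u_hist, y_hist])                   # argument vector of Psi
        y_t = float(np.tanh(W2 @ np.tanh(W1 @ x + b1) + b2))   # Psi: one-hidden-layer MLP
        y_hist = np.concatenate([y_hist[1:], [y_t]])           # feedback from the output only
        outputs.append(y_t)
    return np.array(outputs)

# Illustrative orders and random weights.
rng = np.random.default_rng(1)
n_u, n_y, hidden = 2, 3, 5
W1 = rng.normal(scale=0.5, size=(hidden, n_u + 1 + n_y))
b1 = np.zeros(hidden)
W2 = rng.normal(scale=0.5, size=hidden)
b2 = 0.0
print(narx_run(rng.normal(size=10), n_u, n_y, W1, b1, W2, b2))
```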

4.
IEEE Trans Neural Netw; 7(6): 1329-38, 1996.
Article in English | MEDLINE | ID: mdl-18263528

ABSTRACT

It has previously been shown that gradient-descent learning algorithms for recurrent neural networks can perform poorly on tasks that involve long-term dependencies, i.e., those problems for which the desired output depends on inputs presented at times far in the past. We show that the long-term dependencies problem is lessened for a class of architectures called nonlinear autoregressive models with exogenous inputs (NARX) recurrent neural networks, which have powerful representational capabilities. We have previously reported that gradient-descent learning can be more effective in NARX networks than in recurrent neural network architectures that have "hidden states" on problems including grammatical inference and nonlinear system identification. Typically, the network converges much faster and generalizes better than other networks. The results in this paper are consistent with this phenomenon. We present some experimental results which show that NARX networks can often retain information for two to three times as long as conventional recurrent neural networks. We show that although NARX networks do not circumvent the problem of long-term dependencies, they can greatly improve performance on long-term dependency problems. We also describe in detail some of the assumptions regarding what it means to latch information robustly, and suggest possible ways to loosen these assumptions.
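
For readers who want to reproduce the flavour of such experiments, the sketch below generates a toy information-latching dataset of the kind commonly used to probe long-term dependencies: the target is fixed by a key input at the first time step and the remaining inputs are distractor noise, so increasing the sequence length T stretches the dependency. The task shape, noise level, and naming are assumptions of this sketch, not the paper's actual benchmark.

```python
import numpy as np

def latching_batch(n_sequences, T, noise_std=0.2, seed=0):
    """Inputs of shape (n_sequences, T) and binary targets decided by the first input."""
    rng = np.random.default_rng(seed)
    keys = rng.integers(0, 2, size=n_sequences)              # class label, shown only at t = 0
    x = rng.normal(scale=noise_std, size=(n_sequences, T))   # irrelevant distractor inputs
    x[:, 0] = 2.0 * keys - 1.0                               # +1 / -1 key at the first step
    return x, keys

# A network must carry the t = 0 key across T - 1 distractor steps to predict the target.
x, y = latching_batch(n_sequences=4, T=30)
print(x.shape, y)
```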

5.
IEEE Trans Neural Netw; 7(6): 1424-38, 1996.
Article in English | MEDLINE | ID: mdl-18263536

ABSTRACT

This work concerns the effect of noise on the performance of feedforward neural nets. We introduce and analyze various methods of injecting synaptic noise into dynamically driven recurrent nets during training. Theoretical results show that applying a controlled amount of noise during training may improve convergence and generalization performance. We analyze the effects of various noise parameters and predict that the best overall performance can be achieved by injecting additive noise at each time step. Noise contributes a second-order gradient term to the error function which can be viewed as an anticipatory agent that aids convergence. This term appears to find promising regions of weight space in the early stages of training, when the training error is large, and should improve convergence on error surfaces with local minima. The first-order term is a regularization term that can improve generalization. Specifically, it can encourage internal representations in which the state nodes operate in the saturated regions of the sigmoid discriminant function. While this effect can improve performance on automata inference problems with binary inputs and target outputs, it is unclear what effect it will have on other types of problems. To substantiate these predictions, we present simulations on learning the dual parity grammar from temporal strings for all noise models, and on learning a randomly generated six-state grammar using the predicted best noise model.
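
A minimal sketch of the noise model the abstract singles out, additive synaptic noise injected at every time step, is given below. It perturbs each weight matrix with a fresh Gaussian sample at each step of the forward pass during training and uses clean weights otherwise; the network size, noise scale, and scalar-input assumption are choices made here for illustration, not the paper's experimental setup.

```python
import numpy as np

def noisy_rnn_forward(u, W_in, W_rec, W_out, noise_std=0.05, training=True, seed=0):
    """Run a tanh RNN over scalar sequence u, perturbing the weights at every time step."""
    rng = np.random.default_rng(seed)
    h = np.zeros(W_rec.shape[0])
    outputs = []
    for u_t in u:
        if training and noise_std > 0.0:
            # fresh additive synaptic noise at each time step (training only)
            W_in_t = W_in + rng.normal(scale=noise_std, size=W_in.shape)
            W_rec_t = W_rec + rng.normal(scale=noise_std, size=W_rec.shape)
        else:
            W_in_t, W_rec_t = W_in, W_rec
        h = np.tanh(W_rec_t @ h + W_in_t * u_t)
        outputs.append(float(np.tanh(W_out @ h)))
    return np.array(outputs)

# Illustrative sizes and weights.
rng = np.random.default_rng(2)
hidden = 6
W_in = rng.normal(scale=0.5, size=hidden)
W_rec = rng.normal(scale=0.5, size=(hidden, hidden))
W_out = rng.normal(scale=0.5, size=hidden)
print(noisy_rnn_forward(rng.normal(size=8), W_in, W_rec, W_out))
```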
