Dynamics of learning in multilayer perceptrons near singularities.
Cousseau, Florent; Ozeki, Tomoko; Amari, Shun-Ichi.
Affiliation
  • Cousseau F; Unit for Mathematical Neuroscience, RIKEN Brain Science Institute, Saitama 3510198, Japan. florent@mns.k.u-tokyo.ac.jp
IEEE Trans Neural Netw ; 19(8): 1313-28, 2008 Aug.
Article in English | MEDLINE | ID: mdl-18701364
ABSTRACT
The dynamical behavior of learning is known to be very slow for the multilayer perceptron, being often trapped in the "plateau." It has been recently understood that this is due to the singularity in the parameter space of perceptrons, in which trajectories of learning are drawn. The space is Riemannian from the point of view of information geometry and contains singular regions where the Riemannian metric or the Fisher information matrix degenerates. This paper analyzes the dynamics of learning in a neighborhood of the singular regions when the true teacher machine lies at the singularity. We give explicit asymptotic analytical solutions (trajectories) both for the standard gradient (SGD) and natural gradient (NGD) methods. It is clearly shown, in the case of the SGD method, that the plateau phenomenon appears in a neighborhood of the critical regions, where the dynamical behavior is extremely slow. The analysis of the NGD method is much more difficult, because the inverse of the Fisher information matrix diverges. We conquer the difficulty by introducing the "blow-down" technique used in algebraic geometry. The NGD method works efficiently, and the state converges directly to the true parameters very quickly while it staggers in the case of the SGD method. The analytical results are compared with computer simulations, showing good agreement. The effects of singularities on learning are thus qualitatively clarified for both standard and NGD methods.
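To make the notion of a degenerate Fisher information matrix concrete, the following is a minimal sketch under stated assumptions. It uses a toy one-hidden-unit model y = w*tanh(J*x) + noise with unit-variance Gaussian noise. The model, the input grid, and the names fisher_matrix and det2 are illustrative choices and are not taken from the paper. At w = 0 the parameter J no longer affects the output, so the Fisher matrix loses rank and its determinant vanishes.

```python
import math

def fisher_matrix(w, J, xs):
    """Empirical Fisher matrix of the toy model y = w*tanh(J*x) + N(0, 1) noise.

    For regression with unit-variance Gaussian noise, the Fisher matrix is the
    average outer product of the output gradient with respect to (w, J).
    The model itself is an illustrative assumption, not the paper's network.
    """
    g11 = g12 = g22 = 0.0
    for x in xs:
        t = math.tanh(J * x)
        dw = t                      # derivative of the output w.r.t. w
        dJ = w * x * (1.0 - t * t)  # derivative of the output w.r.t. J
        g11 += dw * dw
        g12 += dw * dJ
        g22 += dJ * dJ
    n = len(xs)
    return [[g11 / n, g12 / n], [g12 / n, g22 / n]]

def det2(G):
    """Determinant of a 2x2 matrix given as nested lists."""
    return G[0][0] * G[1][1] - G[0][1] * G[1][0]

# Hypothetical input grid standing in for the input distribution.
xs = [i / 10.0 for i in range(-30, 31)]

# The determinant shrinks as w approaches the singular value w = 0.
for w in (1.0, 0.1, 0.01, 0.0):
    print(w, det2(fisher_matrix(w, 0.5, xs)))
```

In this toy model the determinant scales with w squared, so inverting the matrix, as a natural gradient step requires, becomes ill-conditioned near w = 0. This is the kind of divergence the abstract refers to.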
Subject(s)

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Algorithms / Pattern Recognition, Automated / Neural Networks, Computer / Models, Theoretical | Language: En | Journal: IEEE Trans Neural Netw | Journal subject: MEDICAL INFORMATICS | Year: 2008 | Document type: Article | Affiliation country: Japan
