Results 1 - 14 of 14
1.
Neural Comput ; 22(6): 1473-92, 2010 Jun.
Article in English | MEDLINE | ID: mdl-20141471

ABSTRACT

To allow the hidden units of a restricted Boltzmann machine to model the transformation between two successive images, Memisevic and Hinton (2007) introduced three-way multiplicative interactions that use the intensity of a pixel in the first image as a multiplicative gain on a learned, symmetric weight between a pixel in the second image and a hidden unit. This creates cubically many parameters, which form a three-dimensional interaction tensor. We describe a low-rank approximation to this interaction tensor that uses a sum of factors, each of which is a three-way outer product. This approximation allows efficient learning of transformations between larger image patches. Since each factor can be viewed as an image filter, the model as a whole learns optimal filter pairs for efficiently representing transformations. We demonstrate the learning of optimal filter pairs from various synthetic and real image sequences. We also show how learning about image transformations allows the model to perform a simple visual analogy task, and we show how a completely unsupervised network trained on transformations perceives multiple motions of transparent dot patterns in the same way as humans.
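The factored approximation is easy to sketch in NumPy. The sizes and random factor matrices below are hypothetical placeholders; the point is that the hidden input can be computed through the three factor matrices without ever forming the cubic tensor:

```python
import numpy as np

rng = np.random.default_rng(0)

n_x, n_y, n_h, n_f = 16, 16, 8, 4   # pixels in image 1/2, hidden units, factors

# Hypothetical factor matrices: the full tensor W[i, j, k] is approximated by
# a sum of n_f three-way outer products, one per factor.
Bx = rng.standard_normal((n_x, n_f))
By = rng.standard_normal((n_y, n_f))
Bh = rng.standard_normal((n_h, n_f))

x = rng.standard_normal(n_x)        # first image (acts as multiplicative gain)
y = rng.standard_normal(n_y)        # second image

# Hidden input computed via the factors: cost grows linearly in the number
# of pixels rather than cubically.
hidden_input = Bh @ ((Bx.T @ x) * (By.T @ y))

# The same quantity via the explicit cubic tensor, for comparison only.
W = np.einsum('if,jf,kf->ijk', Bx, By, Bh)
hidden_input_full = np.einsum('ijk,i,j->k', W, x, y)

assert np.allclose(hidden_input, hidden_input_full)
```

For realistically sized patches the explicit tensor would not fit in memory; only the factored path is usable in practice.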


Subject(s)
Artificial Intelligence; Image Processing, Computer-Assisted/methods; Neural Networks, Computer; Pattern Recognition, Automated/methods; Pattern Recognition, Visual/physiology; Space Perception/physiology; Algorithms; Mathematical Concepts
2.
Neural Comput ; 22(11): 2729-62, 2010 Nov.
Article in English | MEDLINE | ID: mdl-20804386

ABSTRACT

We compare 10 methods of classifying fMRI volumes by applying them to data from a longitudinal study of stroke recovery: adaptive Fisher's linear and quadratic discriminant; gaussian naive Bayes; support vector machines with linear, quadratic, and radial basis function (RBF) kernels; logistic regression; two novel methods based on pairs of restricted Boltzmann machines (RBM); and K-nearest neighbors. All methods were tested on three binary classification tasks, and their out-of-sample classification accuracies are compared. The relative performance of the methods varies considerably across subjects and classification tasks. The best overall performers were adaptive quadratic discriminant, support vector machines with RBF kernels, and generatively trained pairs of RBMs.
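A minimal sketch of the evaluation protocol using one of the ten methods (gaussian naive Bayes) on synthetic stand-in data; the real study used fMRI volumes, three binary tasks, and per-subject comparisons:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for fMRI feature vectors: two classes, shifted Gaussians.
X = np.vstack([rng.normal(0.0, 1.0, (100, 20)), rng.normal(0.8, 1.0, (100, 20))])
y = np.repeat([0, 1], 100)
perm = rng.permutation(200)
X, y = X[perm], y[perm]
X_tr, y_tr, X_te, y_te = X[:150], y[:150], X[150:], y[150:]

def gnb_fit_predict(X_tr, y_tr, X_te):
    """Gaussian naive Bayes: fit per-class means/variances on the training
    split, predict test cases by per-class log-likelihood."""
    stats = [(X_tr[y_tr == c].mean(0), X_tr[y_tr == c].var(0) + 1e-6) for c in (0, 1)]
    preds = []
    for x in X_te:
        ll = [-0.5 * np.sum(np.log(2 * np.pi * v) + (x - m) ** 2 / v) for m, v in stats]
        preds.append(int(np.argmax(ll)))
    return np.array(preds)

# Out-of-sample accuracy: the quantity the paper compares across methods.
acc = (gnb_fit_predict(X_tr, y_tr, X_te) == y_te).mean()
print(f"out-of-sample accuracy: {acc:.2f}")
```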


Subject(s)
Image Interpretation, Computer-Assisted/methods; Magnetic Resonance Imaging; Pattern Recognition, Automated/methods; Stroke/pathology; Algorithms; Humans
3.
Trends Cogn Sci ; 11(10): 428-34, 2007 Oct.
Article in English | MEDLINE | ID: mdl-17921042

ABSTRACT

To achieve its impressive performance in tasks such as speech perception or object recognition, the brain extracts multiple levels of representation from the sensory input. Backpropagation was the first computationally efficient model of how neural networks could learn multiple layers of representation, but it required labeled training data and it did not work well in deep networks. The limitations of backpropagation learning can now be overcome by using multilayer neural networks that contain top-down connections and training them to generate sensory data rather than to classify it. Learning multilayer generative models might seem difficult, but a recent discovery makes it easy to learn nonlinear distributed representations one layer at a time.


Subject(s)
Brain/physiology; Learning/physiology; Models, Psychological; Nerve Net/physiology; Humans
4.
Prog Brain Res ; 165: 535-47, 2007.
Article in English | MEDLINE | ID: mdl-17925269

ABSTRACT

The uniformity of the cortical architecture and the ability of functions to move to different areas of cortex following early damage strongly suggest that there is a single basic learning algorithm for extracting underlying structure from richly structured, high-dimensional sensory data. There have been many attempts to design such an algorithm, but until recently they all suffered from serious computational weaknesses. This chapter describes several of the proposed algorithms and shows how they can be combined to produce hybrid methods that work efficiently in networks with many layers and millions of adaptive connections.


Subject(s)
Learning/physiology; Models, Neurological; Neural Networks, Computer; Algorithms; Cerebral Cortex/cytology; Cerebral Cortex/physiology; Humans; Pattern Recognition, Automated
5.
Neural Comput ; 3(1): 79-87, 1991.
Article in English | MEDLINE | ID: mdl-31141872

ABSTRACT

We present a new supervised learning procedure for systems composed of many separate networks, each of which learns to handle a subset of the complete set of training cases. The new procedure can be viewed either as a modular version of a multilayer supervised network, or as an associative version of competitive learning. It therefore provides a new link between these two apparently different approaches. We demonstrate that the learning procedure divides up a vowel discrimination task into appropriate subtasks, each of which can be solved by a very simple expert network.
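The architecture can be sketched as a softmax gating network blending the outputs of simple expert networks; all weights and sizes below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_out, n_experts = 4, 3, 2

# Hypothetical linear experts and a linear-softmax gating network.
W_experts = rng.standard_normal((n_experts, n_out, n_in))
W_gate = rng.standard_normal((n_experts, n_in))

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def mixture_forward(x):
    # The gating network assigns a responsibility to each expert for this input...
    g = softmax(W_gate @ x)
    # ...and the output is the responsibility-weighted blend of expert outputs.
    expert_outs = np.array([W @ x for W in W_experts])
    return g, (g[:, None] * expert_outs).sum(axis=0)

g, out = mixture_forward(rng.standard_normal(n_in))
assert np.isclose(g.sum(), 1.0) and out.shape == (n_out,)
```

During learning, each expert's error is weighted by its gating responsibility, which is what drives the division of a task (such as vowel discrimination) into subtasks.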

6.
IEEE Trans Neural Netw ; 15(4): 838-49, 2004 Jul.
Article in English | MEDLINE | ID: mdl-15461077

ABSTRACT

Under-complete models, which derive lower dimensional representations of input data, are valuable in domains in which the number of input dimensions is very large, such as data consisting of a temporal sequence of images. This paper presents the under-complete product of experts (UPoE), where each expert models a one-dimensional projection of the data. Maximum-likelihood learning rules for this model constitute a tractable and exact algorithm for learning under-complete independent components. The learning rules for this model coincide with approximate learning rules proposed earlier for under-complete independent component analysis (UICA) models. This paper also derives an efficient sequential learning algorithm from this model and discusses its relationship to sequential independent component analysis (ICA), projection pursuit density estimation, and feature induction algorithms for additive random field models. This paper demonstrates the efficacy of these novel algorithms on high-dimensional continuous datasets.
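A sketch of the under-complete structure, with far fewer experts than input dimensions; the unit-Gaussian expert density below is only a stand-in for whatever one-dimensional model each expert actually learns:

```python
import numpy as np

rng = np.random.default_rng(0)

n_dim, n_experts = 50, 5          # far fewer experts than input dimensions

W = rng.standard_normal((n_experts, n_dim))   # one projection vector per expert
x = rng.standard_normal(n_dim)

# Each expert scores only its own 1D projection of the data, so the model as
# a whole defines an unnormalized density on a low-dimensional representation.
projections = W @ x
log_p_unnorm = np.sum(-0.5 * projections ** 2)   # placeholder 1D Gaussian experts
```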


Subject(s)
Algorithms; Artificial Intelligence; Decision Support Techniques; Information Theory; Models, Statistical; Neural Networks, Computer; Probability Learning; Computer Simulation; Expert Systems; Image Interpretation, Computer-Assisted/methods; Information Storage and Retrieval/methods; Pattern Recognition, Automated; Principal Component Analysis
7.
IEEE Trans Pattern Anal Mach Intell ; 35(9): 2206-22, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23868780

ABSTRACT

This paper describes a Markov random field for real-valued image modeling that has two sets of latent variables. One set gates the interactions between all pairs of pixels, while the second set determines the mean intensity of each pixel. The result is a powerful model whose conditional distribution over the input is Gaussian, with both the mean and the covariance determined by the configuration of latent variables, unlike previous models, which were restricted to Gaussians with either a fixed mean or a diagonal covariance matrix. Thanks to this increased flexibility, the gated MRF can generate more realistic samples after training on an unconstrained distribution of high-resolution natural images. Furthermore, the latent variables of the model can be inferred efficiently and serve as very effective descriptors in recognition tasks. Both generation and discrimination improve drastically as layers of binary latent variables are added to the model, yielding a hierarchical model called a deep belief network.
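A rough sketch of the conditional Gaussian, assuming a simplified parameterization in which the covariance latents build the precision matrix and the mean latents drive the mean; the paper's exact parameterization differs in detail, and all sizes and weights below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

n_pix, n_cov, n_mean = 6, 4, 3

# Hypothetical parameters: C couples pixel pairs through the covariance
# latents, M drives per-pixel means through the mean latents.
C = rng.standard_normal((n_pix, n_cov)) * 0.3
M = rng.standard_normal((n_pix, n_mean)) * 0.3

h_cov = rng.integers(0, 2, n_cov)    # binary covariance latents
h_mean = rng.integers(0, 2, n_mean)  # binary mean latents

# Conditioned on the latents, the image is Gaussian: the covariance latents
# set the precision matrix, the mean latents set the mean.
precision = C @ np.diag(h_cov) @ C.T + np.eye(n_pix)
cov = np.linalg.inv(precision)
mean = cov @ (M @ h_mean)

sample = rng.multivariate_normal(mean, cov)
```

Flipping the binary latents changes both which pixel pairs are correlated and what the expected intensities are, which is the flexibility the abstract refers to.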

8.
Philos Trans R Soc Lond B Biol Sci ; 365(1537): 177-84, 2010 Jan 12.
Article in English | MEDLINE | ID: mdl-20008395

ABSTRACT

One of the central problems in computational neuroscience is to understand how the object-recognition pathway of the cortex learns a deep hierarchy of nonlinear feature detectors. Recent progress in machine learning shows that it is possible to learn deep hierarchies without requiring any labelled data. The feature detectors are learned one layer at a time and the goal of the learning procedure is to form a good generative model of images, not to predict the class of each image. The learning procedure only requires the pairwise correlations between the activations of neuron-like processing units in adjacent layers. The original version of the learning procedure is derived from a quadratic 'energy' function but it can be extended to allow third-order, multiplicative interactions in which neurons gate the pairwise interactions between other neurons. A technique for factoring the third-order interactions leads to a learning module that again has a simple learning rule based on pairwise correlations. This module looks remarkably like modules that have been proposed by both biologists trying to explain the responses of neurons and engineers trying to create systems that can recognize objects.


Subject(s)
Learning/physiology; Models, Neurological; Neural Networks, Computer; Visual Pathways/physiology; Computer Simulation; Humans
9.
Neural Comput ; 20(11): 2629-36, 2008 Nov.
Article in English | MEDLINE | ID: mdl-18533819

ABSTRACT

In this note, we show that exponentially deep belief networks can approximate any distribution over binary vectors to arbitrary accuracy, even when the width of each layer is limited to the dimensionality of the data. We further show that such networks can be greedily learned in an easy yet impractical way.


Subject(s)
Learning; Neural Networks, Computer; Algorithms; Humans; Nonlinear Dynamics
10.
Neural Syst Circuits ; 1(1): 12, 2011 Aug 15.
Article in English | MEDLINE | ID: mdl-22330889
11.
Neural Comput ; 18(2): 381-414, 2006 Feb.
Article in English | MEDLINE | ID: mdl-16378519

ABSTRACT

We present an energy-based model that uses a product of generalized Student-t distributions to capture the statistical structure in data sets. This model is inspired by, and particularly applicable to, "natural" data sets such as images. We begin by providing the mathematical framework, where we discuss complete and overcomplete models and provide algorithms for training these models from data. Using patches of natural scenes, we demonstrate that our approach represents a viable alternative to independent component analysis as an interpretive model of biological visual systems. Although the two approaches are similar in flavor, there are also important differences, particularly when the representations are overcomplete. By constraining the interactions within our model, we are also able to study the topographic organization of Gabor-like receptive fields that our model learns. Finally, we discuss the relation of our new approach to previous work, in particular gaussian scale mixture models and variants of independent component analysis.
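The energy function can be sketched directly; the filters and shape parameters below are random placeholders, and the exact scaling inside the logarithm is simplified:

```python
import numpy as np

rng = np.random.default_rng(0)

n_dim, n_experts = 64, 64

W = rng.standard_normal((n_experts, n_dim)) * 0.1   # one filter per expert
alpha = np.ones(n_experts)                          # per-expert shape parameters

def pot_energy(x):
    """Energy of a product of Student-t experts: each filter response is
    penalized by a heavy-tailed log(1 + s^2) term rather than a quadratic."""
    s = W @ x
    return np.sum(alpha * np.log1p(s ** 2))

x = rng.standard_normal(n_dim)
# The unnormalized log-probability is minus the energy; the heavy tails are
# what make the learned filters respond sparsely, like Gabor functions.
log_p_unnorm = -pot_energy(x)
```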


Subject(s)
Models, Neurological; Neural Networks, Computer; Pattern Recognition, Automated/methods; Visual Pathways/physiology; Algorithms
12.
Neural Comput ; 18(7): 1527-54, 2006 Jul.
Article in English | MEDLINE | ID: mdl-16764513

ABSTRACT

We show how to use "complementary priors" to eliminate the explaining-away effects that make inference difficult in densely connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modeled by long ravines in the free-energy landscape of the top-level associative memory, and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind.
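A compact sketch of the greedy procedure: train a restricted Boltzmann machine with one step of contrastive divergence, then treat its hidden activities as the data for the next layer. Biases, the associative-memory top layer, and the wake-sleep fine-tuning are omitted, and all sizes are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_rbm(data, n_hid, epochs=5, lr=0.05):
    """One CD-1 trained RBM; returns its weights (biases omitted for brevity)."""
    n_vis = data.shape[1]
    W = rng.standard_normal((n_vis, n_hid)) * 0.01
    for _ in range(epochs):
        h_prob = sigmoid(data @ W)
        h_samp = (rng.random(h_prob.shape) < h_prob).astype(float)
        v_recon = sigmoid(h_samp @ W.T)
        h_recon = sigmoid(v_recon @ W)
        W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / len(data)
    return W

# Greedy stacking: each RBM is trained on the hidden activities of the one below.
data = (rng.random((200, 30)) < 0.3).astype(float)
layers, x = [], data
for n_hid in (20, 10):
    W = train_rbm(x, n_hid)
    layers.append(W)
    x = sigmoid(x @ W)          # activities become the next layer's "data"

assert [W.shape for W in layers] == [(30, 20), (20, 10)]
```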


Subject(s)
Algorithms; Learning/physiology; Neural Networks, Computer; Neurons/physiology; Animals; Humans
13.
Neural Comput ; 14(8): 1771-800, 2002 Aug.
Article in English | MEDLINE | ID: mdl-12180402

ABSTRACT

It is possible to combine multiple latent-variable models of the same data by multiplying their probability distributions together and then renormalizing. This way of combining individual "expert" models makes it hard to generate samples from the combined model but easy to infer the values of the latent variables of each expert, because the combination rule ensures that the latent variables of different experts are conditionally independent when given the data. A product of experts (PoE) is therefore an interesting candidate for a perceptual system in which rapid inference is vital and generation is unnecessary. Training a PoE by maximizing the likelihood of the data is difficult because it is hard even to approximate the derivatives of the renormalization term in the combination rule. Fortunately, a PoE can be trained using a different objective function called "contrastive divergence" whose derivatives with regard to the parameters can be approximated accurately and efficiently. Examples are presented of contrastive divergence learning using several types of expert on several types of data.
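One contrastive divergence (CD-1) update for a simple binary product-of-experts model, a restricted Boltzmann machine, can be sketched as follows; biases are omitted and all sizes are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_vis, n_hid = 12, 6
W = rng.standard_normal((n_vis, n_hid)) * 0.01
v0 = (rng.random((100, n_vis)) < 0.5).astype(float)   # a batch of binary data

# One CD-1 step: instead of the intractable likelihood gradient (whose
# renormalization term cannot be approximated), contrast data-driven
# statistics with statistics after a single Gibbs step.
h0 = sigmoid(v0 @ W)
h0_samp = (rng.random(h0.shape) < h0).astype(float)
v1 = sigmoid(h0_samp @ W.T)        # one-step "reconstruction"
h1 = sigmoid(v1 @ W)

positive = v0.T @ h0               # data-driven correlations
negative = v1.T @ h1               # reconstruction-driven correlations
W += 0.1 * (positive - negative) / len(v0)
```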

14.
Neural Netw ; 9(8): 1385-1403, 1996 Nov.
Article in English | MEDLINE | ID: mdl-12662541

ABSTRACT

The Helmholtz machine is a new unsupervised learning architecture that uses top-down connections to build probability density models of input and bottom-up connections to build inverses to those models. The wake-sleep learning algorithm for the machine involves just the purely local delta rule. This paper suggests a number of different varieties of Helmholtz machines, each with its own strengths and weaknesses, and relates them to cortical information processing. Copyright 1996 Elsevier Science Ltd.
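A toy sketch of wake-sleep for a one-hidden-layer machine: the wake phase trains the generative weights on recognized causes, the sleep phase trains the recognition weights on dreamed data, each with the purely local delta rule. The fixed 0.5 top-level prior is a simplification, and all sizes are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sample(p):
    return (rng.random(p.shape) < p).astype(float)

n_vis, n_hid, lr = 8, 4, 0.1
R = rng.standard_normal((n_vis, n_hid)) * 0.1   # recognition (bottom-up)
G = rng.standard_normal((n_hid, n_vis)) * 0.1   # generative (top-down)

data = sample(np.full((50, n_vis), 0.3))        # synthetic binary inputs

for v in data:
    # Wake phase: recognize causes with R, then nudge G toward
    # reconstructing v from those causes (delta rule).
    h = sample(sigmoid(R.T @ v))
    G += lr * np.outer(h, v - sigmoid(G.T @ h))
    # Sleep phase: dream data with G, then nudge R toward recovering
    # the dreamed causes from the dreamed data (delta rule).
    h_dream = sample(np.full(n_hid, 0.5))
    v_dream = sample(sigmoid(G.T @ h_dream))
    R += lr * np.outer(v_dream, h_dream - sigmoid(R.T @ v_dream))
```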
