Results 1 - 20 of 26
1.
Nat Rev Neurosci; 21(6): 335-346, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32303713

ABSTRACT

During learning, the brain modifies synapses to improve behaviour. In the cortex, synapses are embedded within multilayered networks, making it difficult to determine the effect of an individual synaptic modification on the behaviour of the system. The backpropagation algorithm solves this problem in deep artificial neural networks, but historically it has been viewed as biologically problematic. Nonetheless, recent developments in neuroscience and the successes of artificial neural networks have reinvigorated interest in whether backpropagation offers insights for understanding learning in the cortex. The backpropagation algorithm learns quickly by computing synaptic updates using feedback connections to deliver error signals. Although feedback connections are ubiquitous in the cortex, it is difficult to see how they could deliver the error signals required by strict formulations of backpropagation. Here we build on past and recent developments to argue that feedback connections may instead induce neural activities whose differences can be used to locally approximate these signals and hence drive effective learning in deep networks in the brain.
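
A minimal Python sketch of the core proposal, under simplifying assumptions: a two-layer network, a fixed random feedback path B, and a nudging strength beta are all illustrative choices here, not the paper's exact model. A "free" forward pass is compared with a feedback-"nudged" pass, and the local activity difference stands in for the backpropagated error.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Hypothetical two-layer network x -> h -> y with a fixed random feedback path B.
W1 = rng.normal(0.0, 0.1, (20, 10))   # input -> hidden weights
W2 = rng.normal(0.0, 0.1, (10, 5))    # hidden -> output weights
B  = rng.normal(0.0, 0.1, (5, 10))    # feedback carrying output error to hidden units

x      = rng.normal(size=20)
target = rng.uniform(size=5)
beta, lr = 0.2, 0.5

for step in range(200):
    # Free phase: an ordinary forward pass.
    h_free = sigmoid(x @ W1)
    y_free = sigmoid(h_free @ W2)
    # Nudged phase: feedback pushes hidden activity slightly in a direction
    # that would reduce the output error.
    h_nudged = sigmoid(x @ W1 + beta * ((target - y_free) @ B))
    # Local, Hebbian-style updates: the *difference* between nudged and free
    # activities plays the role of the backpropagated error signal.
    W2 += lr * np.outer(h_free, target - y_free)
    W1 += lr * np.outer(x, (h_nudged - h_free) / beta)

print(np.abs(target - y_free).mean())  # mean error, typically small after training
```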


Subject(s)
Cerebral Cortex/physiology; Feedback; Learning/physiology; Algorithms; Animals; Humans; Models, Neurological; Neural Networks, Computer
2.
Neural Comput; 35(3): 413-452, 2023 Feb 17.
Article in English | MEDLINE | ID: mdl-36543334

ABSTRACT

This article does not describe a working system. Instead, it presents a single idea about representation that allows advances made by several different groups to be combined into an imaginary system called GLOM. The advances include transformers, neural fields, contrastive representation learning, distillation, and capsules. GLOM answers the question: How can a neural network with a fixed architecture parse an image into a part-whole hierarchy that has a different structure for each image? The idea is simply to use islands of identical vectors to represent the nodes in the parse tree. If GLOM can be made to work, it should significantly improve the interpretability of the representations produced by transformer-like systems when applied to vision or language.
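
A toy sketch of the island-forming idea, heavily simplified: one level of column embeddings is repeatedly updated by similarity-weighted (attention-style) averaging, so similar vectors coalesce into islands of near-identical vectors. The layout, dimensions, and exact update rule are illustrative assumptions, not GLOM's specification.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 1-D row of 12 image locations ("columns"), each holding a
# 16-dimensional embedding at one level of the part-whole hierarchy.
emb = rng.normal(size=(12, 16))
emb[:6] += 2.0   # locations 0-5 belong to one part, 6-11 to another

for _ in range(50):
    # Attention among columns: similar vectors reinforce one another, so
    # embeddings settle into "islands" of nearly identical vectors.
    logits = emb @ emb.T / np.sqrt(emb.shape[1])
    attn = np.exp(logits - logits.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)
    emb = attn @ emb

# Columns within each part now agree; the island boundary marks the parse.
norms = np.linalg.norm(emb, axis=1)
print(np.round(emb @ emb.T / np.outer(norms, norms), 2))
```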

3.
Nature; 521(7553): 436-444, 2015 May 28.
Article in English | MEDLINE | ID: mdl-26017442

ABSTRACT

Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state of the art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
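
As a concrete illustration of the backpropagation procedure the abstract describes, here is a minimal network trained on XOR; the architecture, loss, and learning rate are arbitrary choices for the example, not anything specific to this paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny network: each layer computes a representation from the layer below;
# backpropagation supplies the gradient of the loss w.r.t. every weight.
W1, W2 = rng.normal(0, 0.5, (2, 8)), rng.normal(0, 0.5, (8, 1))
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])           # XOR: needs a hidden layer

for _ in range(5000):
    h = np.tanh(X @ W1)                          # hidden representation
    out = h @ W2                                 # linear output layer
    err = out - y                                # dLoss/dout for squared error
    # Chain rule: propagate the error derivative back through each layer.
    dW2 = h.T @ err
    dW1 = X.T @ ((err @ W2.T) * (1 - h**2))
    W2 -= 0.05 * dW2
    W1 -= 0.05 * dW1

print(np.round(out.ravel(), 2))                  # typically approaches [0, 1, 1, 0]
```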


Subject(s)
Artificial Intelligence; Algorithms; Artificial Intelligence/trends; Computers; Language; Neural Networks, Computer
6.
Nat Biomed Eng; 7(6): 756-779, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37291435

ABSTRACT

Machine-learning models for medical tasks can match or surpass the performance of clinical experts. However, in settings differing from those of the training dataset, the performance of a model can deteriorate substantially. Here we report a representation-learning strategy for machine-learning models applied to medical-imaging tasks that mitigates this 'out-of-distribution' performance problem and improves model robustness and training efficiency. The strategy, which we named REMEDIS (for 'Robust and Efficient Medical Imaging with Self-supervision'), combines large-scale supervised transfer learning on natural images with intermediate contrastive self-supervised learning on medical images, and requires minimal task-specific customization. We show the utility of REMEDIS in a range of diagnostic-imaging tasks covering six imaging domains and 15 test datasets, and by simulating three realistic out-of-distribution scenarios. REMEDIS improved in-distribution diagnostic accuracies by up to 11.5% with respect to strong supervised baseline models, and in out-of-distribution settings it required only 1-33% of the data for retraining to match the performance of supervised models retrained using all available data. REMEDIS may accelerate the development lifecycle of machine-learning models for medical imaging.
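
The intermediate contrastive step can be illustrated with an InfoNCE/NT-Xent-style loss on embeddings of two augmented views of the same image. Treating the REMEDIS objective as SimCLR-like is an assumption here, and the embeddings below are random stand-ins for a real encoder's outputs.

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.1):
    """Contrastive loss sketch: embeddings of two augmented views of the same
    image are pulled together; all other pairs in the batch are pushed apart."""
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity
    n = len(z1)
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)                     # a view is not its own pair
    # The positive for view i is the other view of the same image.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()

rng = np.random.default_rng(0)
z1 = rng.normal(size=(8, 32))               # embeddings of 8 images, view A
z2 = z1 + 0.1 * rng.normal(size=(8, 32))    # view B: slightly perturbed
print(info_nce_loss(z1, z2))
```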


Subject(s)
Machine Learning; Supervised Machine Learning; Diagnostic Imaging
7.
Neural Comput; 24(8): 1967-2006, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22509963

ABSTRACT

We present a new learning algorithm for Boltzmann machines that contain many layers of hidden variables. Data-dependent statistics are estimated using a variational approximation that tends to focus on a single mode, and data-independent statistics are estimated using persistent Markov chains. The use of two quite different techniques for estimating the two types of statistic that enter into the gradient of the log likelihood makes it practical to learn Boltzmann machines with multiple hidden layers and millions of parameters. The learning can be made more efficient by using a layer-by-layer pretraining phase that initializes the weights sensibly. The pretraining also allows the variational inference to be initialized sensibly with a single bottom-up pass. We present results on the MNIST and NORB data sets showing that deep Boltzmann machines learn very good generative models of handwritten digits and 3D objects. We also show that the features discovered by deep Boltzmann machines are a very effective way to initialize the hidden layers of feedforward neural nets, which are then discriminatively fine-tuned.
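
A sketch of the persistent-chain half of the recipe, shown on a single-layer RBM where the data-dependent statistics are exact (a full deep Boltzmann machine would instead estimate them by mean-field); all sizes, rates, and data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

nv, nh, n_chains = 16, 8, 10
W = rng.normal(0, 0.01, (nv, nh))
data = (rng.uniform(size=(100, nv)) < 0.5).astype(float)          # toy binary data
chains = (rng.uniform(size=(n_chains, nv)) < 0.5).astype(float)   # persistent fantasy particles

for epoch in range(50):
    # Data-dependent statistics (exact for an RBM; a DBM needs mean field).
    ph_data = sigmoid(data @ W)
    pos = data.T @ ph_data / len(data)
    # Data-independent statistics from persistent Markov chains: the chains
    # are never reset, they keep running between parameter updates.
    ph = sigmoid(chains @ W)
    h = (rng.uniform(size=ph.shape) < ph).astype(float)
    pv = sigmoid(h @ W.T)
    chains = (rng.uniform(size=pv.shape) < pv).astype(float)
    neg = chains.T @ sigmoid(chains @ W) / n_chains
    W += 0.05 * (pos - neg)   # approximate gradient of the log likelihood
```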

8.
Neural Comput; 22(6): 1473-1492, 2010 Jun.
Article in English | MEDLINE | ID: mdl-20141471

ABSTRACT

To allow the hidden units of a restricted Boltzmann machine to model the transformation between two successive images, Memisevic and Hinton (2007) introduced three-way multiplicative interactions that use the intensity of a pixel in the first image as a multiplicative gain on a learned, symmetric weight between a pixel in the second image and a hidden unit. This creates cubically many parameters, which form a three-dimensional interaction tensor. We describe a low-rank approximation to this interaction tensor that uses a sum of factors, each of which is a three-way outer product. This approximation allows efficient learning of transformations between larger image patches. Since each factor can be viewed as an image filter, the model as a whole learns optimal filter pairs for efficiently representing transformations. We demonstrate the learning of optimal filter pairs from various synthetic and real image sequences. We also show how learning about image transformations allows the model to perform a simple visual analogy task, and we show how a completely unsupervised network trained on transformations perceives multiple motions of transparent dot patterns in the same way as humans.
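
A sketch of the factored three-way interaction under illustrative sizes: each factor is a three-way outer product of filters B_f and C_f on the two images and a pattern P_f over hidden units, and the hidden probabilities follow from products of filter responses. Names and dimensions are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

npix, nhid, nfact = 64, 32, 20
# Low-rank factorization of the (npix x npix x nhid) interaction tensor:
# each factor contributes one three-way outer product.
B = rng.normal(0, 0.1, (npix, nfact))   # filters on the first image
C = rng.normal(0, 0.1, (npix, nfact))   # filters on the second image
P = rng.normal(0, 0.1, (nhid, nfact))   # how factors connect to hidden units

x = rng.normal(size=npix)               # image 1 (pixel intensities)
y = rng.normal(size=npix)               # image 2

# Each factor's output is the product of its two filter responses, so a
# hidden unit sees evidence for the transformations its factors encode.
factor_products = (x @ B) * (y @ C)     # one number per factor
p_hidden = sigmoid(P @ factor_products) # inferred transformation code
print(p_hidden.shape)                   # (32,)
```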


Subject(s)
Artificial Intelligence; Image Processing, Computer-Assisted/methods; Neural Networks, Computer; Pattern Recognition, Automated/methods; Pattern Recognition, Visual/physiology; Space Perception/physiology; Algorithms; Mathematical Concepts
9.
Neural Comput; 22(11): 2729-2762, 2010 Nov.
Article in English | MEDLINE | ID: mdl-20804386

ABSTRACT

We compare 10 methods of classifying fMRI volumes by applying them to data from a longitudinal study of stroke recovery: adaptive Fisher's linear and quadratic discriminant; gaussian naive Bayes; support vector machines with linear, quadratic, and radial basis function (RBF) kernels; logistic regression; two novel methods based on pairs of restricted Boltzmann machines (RBM); and K-nearest neighbors. All methods were tested on three binary classification tasks, and their out-of-sample classification accuracies are compared. The relative performance of the methods varies considerably across subjects and classification tasks. The best overall performers were adaptive quadratic discriminant, support vector machines with RBF kernels, and generatively trained pairs of RBMs.
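
A sketch of this kind of comparison using scikit-learn on stand-in data: the two RBM-pair methods are omitted (they are not off-the-shelf), the adaptive discriminants are replaced by their standard library counterparts, and the "fMRI" data here are synthetic.

```python
import numpy as np
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

# Stand-in data: 60 "volumes" x 50 voxels, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 50))
y = rng.integers(0, 2, size=60)
X[y == 1] += 0.3                      # weak class signal, as in real fMRI

models = {
    "LDA": LinearDiscriminantAnalysis(),
    "QDA": QuadraticDiscriminantAnalysis(reg_param=0.5),
    "Gaussian NB": GaussianNB(),
    "SVM (linear)": SVC(kernel="linear"),
    "SVM (RBF)": SVC(kernel="rbf"),
    "Logistic": LogisticRegression(max_iter=1000),
    "kNN": KNeighborsClassifier(),
}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {acc:.2f} out-of-sample accuracy")
```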


Subject(s)
Image Interpretation, Computer-Assisted/methods; Magnetic Resonance Imaging; Pattern Recognition, Automated/methods; Stroke/pathology; Algorithms; Humans
10.
Trends Cogn Sci; 11(10): 428-434, 2007 Oct.
Article in English | MEDLINE | ID: mdl-17921042

ABSTRACT

To achieve its impressive performance in tasks such as speech perception or object recognition, the brain extracts multiple levels of representation from the sensory input. Backpropagation was the first computationally efficient model of how neural networks could learn multiple layers of representation, but it required labeled training data and it did not work well in deep networks. The limitations of backpropagation learning can now be overcome by using multilayer neural networks that contain top-down connections and training them to generate sensory data rather than to classify it. Learning multilayer generative models might seem difficult, but a recent discovery makes it easy to learn nonlinear distributed representations one layer at a time.
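
The "recent discovery" is contrastive divergence. A minimal CD-1 weight update for one layer of binary features might look like the following sketch (sizes and learning rate are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, lr=0.1):
    """One contrastive-divergence (CD-1) step for a restricted Boltzmann
    machine: learn a layer of features without any labels."""
    ph0 = sigmoid(v0 @ W)                               # recognize features
    h0 = (rng.uniform(size=ph0.shape) < ph0).astype(float)
    v1 = sigmoid(h0 @ W.T)                              # top-down reconstruction
    ph1 = sigmoid(v1 @ W)
    # Raise the probability of the data, lower that of the reconstruction.
    return W + lr * (np.outer(v0, ph0) - np.outer(v1, ph1))

W = rng.normal(0, 0.01, (12, 6))
v = (rng.uniform(size=12) < 0.5).astype(float)
W = cd1_update(v, W)
```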


Subject(s)
Brain/physiology; Learning/physiology; Models, Psychological; Nerve Net/physiology; Humans
11.
Prog Brain Res; 165: 535-547, 2007.
Article in English | MEDLINE | ID: mdl-17925269

ABSTRACT

The uniformity of the cortical architecture and the ability of functions to move to different areas of cortex following early damage strongly suggest that there is a single basic learning algorithm for extracting underlying structure from richly structured, high-dimensional sensory data. There have been many attempts to design such an algorithm, but until recently they all suffered from serious computational weaknesses. This chapter describes several of the proposed algorithms and shows how they can be combined to produce hybrid methods that work efficiently in networks with many layers and millions of adaptive connections.


Subject(s)
Learning/physiology; Models, Neurological; Neural Networks, Computer; Algorithms; Cerebral Cortex/cytology; Cerebral Cortex/physiology; Humans; Pattern Recognition, Automated
12.
Cogn Sci; 30(4): 725-731, 2006 Jul 8.
Article in English | MEDLINE | ID: mdl-21702832

ABSTRACT

We describe a way of modeling high-dimensional data vectors by using an unsupervised, nonlinear, multilayer neural network in which the activity of each neuron-like unit makes an additive contribution to a global energy score that indicates how surprised the network is by the data vector. The connection weights that determine how the activity of each unit depends on the activities in earlier layers are learned by minimizing the energy assigned to data vectors that are actually observed and maximizing the energy assigned to "confabulations" that are generated by perturbing an observed data vector in a direction that decreases its energy under the current model.
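
A minimal sketch of the scheme with a deliberately simple quadratic energy (the paper's multilayer network is replaced by a single layer of units to keep the example short): the data's energy is lowered, a "confabulation" generated by a downhill perturbation of the data has its energy raised.

```python
import numpy as np

rng = np.random.default_rng(0)

# One layer of units for brevity: unit k contributes (w_k . x)^2 / 2 to the
# global energy score, so E(x) = 0.5 * sum_k (w_k . x)^2.
W = rng.normal(0, 0.5, (8, 4))          # 4 units, 8-dimensional data

def energy_grad_x(x):
    return (x @ W) @ W.T                # dE/dx

x = rng.normal(size=8)                  # an observed data vector
# Confabulation: perturb the data vector downhill in energy under the model.
confab = x - 0.1 * energy_grad_x(x)
# Contrastive update: minimize the energy assigned to the data vector,
# maximize the energy assigned to the confabulation.
lr = 0.01
dW_data = np.outer(x, x @ W)            # dE(x)/dW
dW_conf = np.outer(confab, confab @ W)  # dE(confab)/dW
W -= lr * (dW_data - dW_conf)
```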

13.
Neural Netw; 18(5-6): 702-710, 2005.
Article in English | MEDLINE | ID: mdl-16112551

ABSTRACT

We introduce spectral gradient descent, a way of improving iterative dimensionality reduction techniques. The method uses information contained in the leading eigenvalues of a data affinity matrix to modify the steps taken during a gradient-based optimization procedure. We show that the approach is able to speed up the optimization and to help dimensionality reduction methods find better local minima of their objective functions. We also provide an interpretation of our approach in terms of the power method for finding the leading eigenvalues of a symmetric matrix and verify the usefulness of the approach in some simple experiments.
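
The power-method connection the abstract mentions, in a few lines: repeated multiplication by a symmetric affinity matrix rotates a vector toward the eigenvector with the largest-magnitude eigenvalue. The matrix below is a random stand-in for a real data affinity matrix.

```python
import numpy as np

def leading_eigenpair(A, iters=200):
    """Power method for a symmetric matrix A: iterate v <- Av / ||Av||."""
    v = np.random.default_rng(0).normal(size=A.shape[0])
    for _ in range(iters):
        v = A @ v
        v /= np.linalg.norm(v)
    return v @ A @ v, v          # Rayleigh quotient gives the eigenvalue

# A small symmetric "affinity" matrix, as used by the dimensionality reducers.
rng = np.random.default_rng(1)
X = rng.normal(size=(6, 6))
A = X @ X.T
val, vec = leading_eigenpair(A)
print(np.isclose(val, np.linalg.eigvalsh(A).max()))   # True
```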


Subject(s)
Artificial Intelligence; Data Interpretation, Statistical; Algorithms; Cluster Analysis; Models, Statistical; Nonlinear Dynamics; Principal Component Analysis; Stochastic Processes
14.
Neural Comput; 3(1): 79-87, 1991.
Article in English | MEDLINE | ID: mdl-31141872

ABSTRACT

We present a new supervised learning procedure for systems composed of many separate networks, each of which learns to handle a subset of the complete set of training cases. The new procedure can be viewed either as a modular version of a multilayer supervised network, or as an associative version of competitive learning. It therefore provides a new link between these two apparently different approaches. We demonstrate that the learning procedure divides up a vowel discrimination task into appropriate subtasks, each of which can be solved by a very simple expert network.
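
A sketch of the forward pass of such a system, with hypothetical linear experts and a softmax gating network; all sizes are arbitrary and training is only summarized in a comment.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Three hypothetical linear experts plus a gating network that decides,
# per input case, how much to trust each expert's output.
n_in, n_out, n_experts = 4, 2, 3
experts = [rng.normal(0, 0.5, (n_in, n_out)) for _ in range(n_experts)]
gate_W = rng.normal(0, 0.5, (n_in, n_experts))

def forward(x):
    g = softmax(x @ gate_W)                        # mixing proportions
    outs = np.stack([x @ We for We in experts])    # each expert's answer
    return g, outs, np.tensordot(g, outs, axes=1)  # blended prediction

x = rng.normal(size=n_in)
g, outs, y = forward(x)
# Training (not shown) pushes each expert toward the cases its gate selects,
# so the task decomposes into subtasks handled by simple specialists.
print(g.round(2), y.round(2))
```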

15.
IEEE Trans Neural Netw; 15(4): 838-849, 2004 Jul.
Article in English | MEDLINE | ID: mdl-15461077

ABSTRACT

Under-complete models, which derive lower dimensional representations of input data, are valuable in domains in which the number of input dimensions is very large, such as data consisting of a temporal sequence of images. This paper presents the under-complete product of experts (UPoE), where each expert models a one-dimensional projection of the data. Maximum-likelihood learning rules for this model constitute a tractable and exact algorithm for learning under-complete independent components. The learning rules for this model coincide with approximate learning rules proposed earlier for under-complete independent component analysis (UICA) models. This paper also derives an efficient sequential learning algorithm from this model and discusses its relationship to sequential independent component analysis (ICA), projection pursuit density estimation, and feature induction algorithms for additive random field models. This paper demonstrates the efficacy of these novel algorithms on high-dimensional continuous datasets.
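
A minimal sketch of the UPoE density, assuming Gaussian experts purely to keep the example closed-form (the paper's experts are more general): each expert scores one one-dimensional projection, and the product of experts becomes a sum of log-experts.

```python
import numpy as np

rng = np.random.default_rng(0)

# Under-complete product of experts: m experts, each modelling a single
# one-dimensional projection of n-dimensional data, with m < n.
n, m = 10, 3
W = rng.normal(size=(m, n))             # each row is one expert's projection

def unnormalised_log_density(x):
    proj = W @ x                        # m one-dimensional projections
    return -0.5 * np.sum(proj ** 2)     # product of experts = sum of log-experts

x = rng.normal(size=n)
print(unnormalised_log_density(x))
```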


Subject(s)
Algorithms; Artificial Intelligence; Decision Support Techniques; Information Theory; Models, Statistical; Neural Networks, Computer; Probability Learning; Computer Simulation; Expert Systems; Image Interpretation, Computer-Assisted/methods; Information Storage and Retrieval/methods; Pattern Recognition, Automated; Principal Component Analysis
16.
Cogn Sci; 38(6): 1078-1101, 2014 Aug.
Article in English | MEDLINE | ID: mdl-23800216

ABSTRACT

It is possible to learn multiple layers of non-linear features by backpropagating error derivatives through a feedforward neural network. This is a very effective learning procedure when there is a huge amount of labeled training data, but for many learning tasks very few labeled examples are available. In an effort to overcome the need for labeled data, several different generative models were developed that learned interesting features by modeling the higher order statistical structure of a set of input vectors. One of these generative models, the restricted Boltzmann machine (RBM), has no connections between its hidden units and this makes perceptual inference and learning much simpler. More significantly, after a layer of hidden features has been learned, the activities of these features can be used as training data for another RBM. By applying this idea recursively, it is possible to learn a deep hierarchy of progressively more complicated features without requiring any labeled data. This deep hierarchy can then be treated as a feedforward neural network which can be discriminatively fine-tuned using backpropagation. Using a stack of RBMs to initialize the weights of a feedforward neural network allows backpropagation to work effectively in much deeper networks and it leads to much better generalization. A stack of RBMs can also be used to initialize a deep Boltzmann machine that has many hidden layers. Combining this initialization method with a new method for fine-tuning the weights finally leads to the first efficient way of training Boltzmann machines with many hidden layers and millions of weights.
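
A compressed sketch of the stacking recipe: train an RBM with CD-1, feed its feature activities to the next RBM as training data, then read the stack as a feedforward net ready for discriminative fine-tuning. Sizes, rates, and the toy data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=20, lr=0.1):
    """Minimal CD-1 training loop, just enough to sketch the stacking idea."""
    W = rng.normal(0, 0.01, (data.shape[1], n_hidden))
    for _ in range(epochs):
        ph = sigmoid(data @ W)
        h = (rng.uniform(size=ph.shape) < ph).astype(float)
        recon = sigmoid(h @ W.T)
        W += lr * (data.T @ ph - recon.T @ sigmoid(recon @ W)) / len(data)
    return W

X = (rng.uniform(size=(200, 30)) < 0.3).astype(float)   # toy binary "images"

# Greedy stacking: the feature activities of one trained RBM become the
# training data for the next, with no labels needed at any stage.
weights, layer_input = [], X
for n_hidden in (20, 10):
    W = train_rbm(layer_input, n_hidden)
    weights.append(W)
    layer_input = sigmoid(layer_input @ W)

# The stack now initializes a feedforward net that backpropagation fine-tunes.
def forward(x):
    for W in weights:
        x = sigmoid(x @ W)
    return x

print(forward(X[:1]).shape)   # (1, 10)
```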


Subject(s)
Learning; Models, Neurological; Neural Networks, Computer; Artificial Intelligence; Computer Simulation; Humans
17.
IEEE Trans Pattern Anal Mach Intell; 35(9): 2206-2222, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23868780

ABSTRACT

This paper describes a Markov Random Field for real-valued image modeling that has two sets of latent variables. One set is used to gate the interactions between all pairs of pixels, while the second set determines the mean intensities of each pixel. This is a powerful model with a conditional distribution over the input that is Gaussian, with both mean and covariance determined by the configuration of latent variables, which is unlike previous models that were restricted to using Gaussians with either a fixed mean or a diagonal covariance matrix. Thanks to the increased flexibility, this gated MRF can generate more realistic samples after training on an unconstrained distribution of high-resolution natural images. Furthermore, the latent variables of the model can be inferred efficiently and can be used as very effective descriptors in recognition tasks. Both generation and discrimination drastically improve as layers of binary latent variables are added to the model, yielding a hierarchical model called a Deep Belief Network.
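
A hedged sketch of the conditional Gaussian the abstract describes: binary "mean" latents set the mean, and binary "covariance" latents gate filter outer products in the precision matrix, so switching a covariance latent on activates correlations along its filter direction. The specific parametrisation below is an illustrative assumption, not the paper's exact model.

```python
import numpy as np

rng = np.random.default_rng(0)

npix, n_mean, n_cov = 16, 8, 6
M = rng.normal(0, 0.3, (n_mean, npix))      # mean latents' contribution
C = rng.normal(0, 0.3, (n_cov, npix))       # filters gated by covariance latents

h_mean = rng.integers(0, 2, n_mean).astype(float)
h_cov = rng.integers(0, 2, n_cov).astype(float)

# Precision = identity plus gated filter outer products: both the mean and
# the covariance of the conditional Gaussian depend on the latent state.
precision = np.eye(npix) + C.T @ np.diag(h_cov) @ C
cov = np.linalg.inv(precision)
mean = cov @ (M.T @ h_mean)

sample = rng.multivariate_normal(mean, cov)  # one imagined image patch
print(sample.shape)
```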

18.
Top Cogn Sci; 3(1): 74-91, 2011 Jan.
Article in English | MEDLINE | ID: mdl-25164175

ABSTRACT

We describe a deep generative model in which the lowest layer represents the word-count vector of a document and the top layer represents a learned binary code for that document. The top two layers of the generative model form an undirected associative memory and the remaining layers form a belief net with directed, top-down connections. We present efficient learning and inference procedures for this type of generative model and show that it allows more accurate and much faster retrieval than latent semantic analysis. By using our method as a filter for a much slower method called TF-IDF we achieve higher accuracy than TF-IDF alone and save several orders of magnitude in retrieval time. By using short binary codes as addresses, we can perform retrieval on very large document sets in a time that is independent of the size of the document set using only one word of memory to describe each document.
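
A sketch of code-as-address retrieval, with a random projection standing in for the learned deep encoder: each document's binary code indexes a bucket, and a query probes its own bucket plus the buckets within a small Hamming ball, at a cost that depends on the code length rather than the corpus size.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)

# Pretend a trained deep autoencoder maps each document's word-count vector
# to a short binary code; here a random projection stands in for the encoder.
n_docs, n_words, n_bits = 10_000, 500, 12
docs = rng.poisson(0.1, size=(n_docs, n_words))
proj = rng.normal(size=(n_words, n_bits))
codes = (docs @ proj > 0).astype(int)

# Use the code as a memory address: one bucket per code word.
buckets = {}
for i, c in enumerate(map(tuple, codes)):
    buckets.setdefault(c, []).append(i)

def retrieve(query_code, radius=1):
    """Probe the query's bucket plus all buckets within Hamming distance
    `radius`; the cost grows with the code length, not the corpus size."""
    hits = list(buckets.get(tuple(query_code), []))
    for r in range(1, radius + 1):
        for flip in combinations(range(n_bits), r):
            neighbour = query_code.copy()
            neighbour[list(flip)] ^= 1
            hits += buckets.get(tuple(neighbour), [])
    return hits

print(len(retrieve(codes[0].copy())))
```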


Subject(s)
Artificial Intelligence; Documentation/methods; Information Storage and Retrieval/methods; Models, Theoretical; Semantics
19.
Philos Trans R Soc Lond B Biol Sci; 365(1537): 177-184, 2010 Jan 12.
Article in English | MEDLINE | ID: mdl-20008395

ABSTRACT

One of the central problems in computational neuroscience is to understand how the object-recognition pathway of the cortex learns a deep hierarchy of nonlinear feature detectors. Recent progress in machine learning shows that it is possible to learn deep hierarchies without requiring any labelled data. The feature detectors are learned one layer at a time and the goal of the learning procedure is to form a good generative model of images, not to predict the class of each image. The learning procedure only requires the pairwise correlations between the activations of neuron-like processing units in adjacent layers. The original version of the learning procedure is derived from a quadratic 'energy' function but it can be extended to allow third-order, multiplicative interactions in which neurons gate the pairwise interactions between other neurons. A technique for factoring the third-order interactions leads to a learning module that again has a simple learning rule based on pairwise correlations. This module looks remarkably like modules that have been proposed by both biologists trying to explain the responses of neurons and engineers trying to create systems that can recognize objects.


Subject(s)
Learning/physiology; Models, Neurological; Neural Networks, Computer; Visual Pathways/physiology; Computer Simulation; Humans
20.
Neural Netw; 23(2): 239-243, 2010 Mar.
Article in English | MEDLINE | ID: mdl-19932002

ABSTRACT

A Recurrent Neural Network (RNN) is a powerful connectionist model that can be applied to many challenging sequential problems, including problems that naturally arise in language and speech. However, RNNs are extremely hard to train on problems that have long-term dependencies, where it is necessary to remember events for many timesteps before using them to make a prediction. In this paper we consider the problem of training RNNs to predict sequences that exhibit significant long-term dependencies, focusing on a serial recall task where the RNN needs to remember a sequence of characters for a large number of steps before reconstructing it. We introduce the Temporal-Kernel Recurrent Neural Network (TKRNN), which is a variant of the RNN that can cope with long-term dependencies much more easily than a standard RNN, and show that the TKRNN develops short-term memory that successfully solves the serial recall task by representing the input string with a stable state of its hidden units.
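
A hedged sketch of the temporal-kernel idea: each hidden unit keeps an exponentially decaying trace of past activity, so information can persist across many timesteps without being relearned at every step. The decay parametrisation below is an illustrative assumption, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hid = 5, 20
W_in = rng.normal(0, 0.3, (n_in, n_hid))
W_rec = rng.normal(0, 0.3, (n_hid, n_hid))
decay = rng.uniform(0.8, 0.99, n_hid)    # per-unit kernel decay rates

def run(inputs):
    h = np.zeros(n_hid)
    trace = np.zeros(n_hid)              # decaying summary of past activity
    for x in inputs:
        # The trace lets influence from events many steps back reach the
        # present without vanishing as quickly as in a standard RNN.
        trace = decay * trace + (1 - decay) * h
        h = np.tanh(x @ W_in + trace @ W_rec)
    return h

seq = rng.normal(size=(100, n_in))       # a 100-step input sequence
print(run(seq).shape)
```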


Subject(s)
Memory, Short-Term; Neural Networks, Computer; Algorithms; Humans; Neuropsychological Tests; Time Factors