Search | BVS Bolivia

1.

Reagent prediction with a molecular transformer improves reaction data quality.

Andronov, Mikhail; Voinarovska, Varvara; Andronova, Natalia; Wand, Michael; Clevert, Djork-Arné; Schmidhuber, Jürgen.

Chem Sci ; 14(12): 3235-3246, 2023 Mar 22.

Article in English | MEDLINE | ID: mdl-36970100

ABSTRACT

Automated synthesis planning is key for efficient generative chemistry. Since reactions of given reactants may yield different products depending on conditions such as the chemical context imposed by specific reagents, computer-aided synthesis planning should benefit from recommendations of reaction conditions. Traditional synthesis planning software, however, typically proposes reactions without specifying such conditions, relying on human organic chemists who know the conditions to carry out suggested reactions. In particular, reagent prediction for arbitrary reactions, a crucial aspect of condition recommendation, has been largely overlooked in cheminformatics until recently. Here we employ the Molecular Transformer, a state-of-the-art model for reaction prediction and single-step retrosynthesis, to tackle this problem. We train the model on the US patents dataset (USPTO) and test it on Reaxys to demonstrate its out-of-distribution generalization capabilities. Our reagent prediction model also improves the quality of product prediction: the Molecular Transformer is able to substitute the reagents in the noisy USPTO data with reagents that enable product prediction models to outperform those trained on plain USPTO. This makes it possible to improve upon the state-of-the-art in reaction product prediction on the USPTO MIT benchmark.

2.

Unsupervised Learning of Temporal Abstractions With Slot-Based Transformers.

Gopalakrishnan, Anand; Irie, Kazuki; Schmidhuber, Jürgen; van Steenkiste, Sjoerd.

Neural Comput ; 35(4): 593-626, 2023 Mar 18.

Article in English | MEDLINE | ID: mdl-36746145

ABSTRACT

The discovery of reusable subroutines simplifies decision making and planning in complex reinforcement learning problems. Previous approaches propose to learn such temporal abstractions in an unsupervised fashion through observing state-action trajectories gathered from executing a policy. However, a current limitation is that they process each trajectory in an entirely sequential manner, which prevents them from revising earlier decisions about subroutine boundary points in light of new incoming information. In this work, we propose slot-based transformer for temporal abstraction (SloTTAr), a fully parallel approach that integrates sequence processing transformers with a slot attention module to discover subroutines in an unsupervised fashion while leveraging adaptive computation for learning about the number of such subroutines solely based on their empirical distribution. We demonstrate how SloTTAr is capable of outperforming strong baselines in terms of boundary point discovery, even for sequences containing variable amounts of subroutines, while being up to seven times faster to train on existing benchmarks.

3.

Recurrent Neural-Linear Posterior Sampling for Nonstationary Contextual Bandits.

Ramesh, Aditya; Rauber, Paulo; Conserva, Michelangelo; Schmidhuber, Jürgen.

Neural Comput ; 34(11): 2232-2272, 2022 Oct 07.

Article in English | MEDLINE | ID: mdl-36112923

ABSTRACT

An agent in a nonstationary contextual bandit problem should balance between exploration and the exploitation of (periodic or structured) patterns present in its previous experiences. Handcrafting an appropriate historical context is an attractive alternative to transform a nonstationary problem into a stationary problem that can be solved efficiently. However, even a carefully designed historical context may introduce spurious relationships or lack a convenient representation of crucial information. In order to address these issues, we propose an approach that learns to represent the relevant context for a decision based solely on the raw history of interactions between the agent and the environment. This approach relies on a combination of features extracted by recurrent neural networks with a contextual linear bandit algorithm based on posterior sampling. Our experiments on a diverse selection of contextual and noncontextual nonstationary problems show that our recurrent approach consistently outperforms its feedforward counterpart, which requires handcrafted historical contexts, while being more widely applicable than conventional nonstationary bandit algorithms. Although it is very difficult to provide theoretical performance guarantees for our new approach, we also prove a novel regret bound for linear posterior sampling with measurement error that may serve as a foundation for future theoretical work.
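The linear posterior sampling component mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a Gaussian Bayesian linear model per arm, and the class name, method names, and parameters (`prior_precision`, `noise_var`) are hypothetical. In the paper the feature vector would come from a recurrent network over the interaction history; here a plain NumPy vector stands in for it.

```python
import numpy as np


class LinearPosteriorSampling:
    """Per-arm Bayesian linear regression with posterior (Thompson) sampling.

    Hypothetical illustration: `features` stands in for whatever
    representation (e.g. recurrent network outputs) describes the context.
    """

    def __init__(self, n_arms, dim, prior_precision=1.0, noise_var=1.0, seed=0):
        self.rng = np.random.default_rng(seed)
        self.noise_var = noise_var
        # One precision matrix and one reward-weighted feature sum per arm.
        self.precision = np.stack(
            [prior_precision * np.eye(dim) for _ in range(n_arms)]
        )
        self.b = np.zeros((n_arms, dim))

    def posterior_mean(self, arm):
        # Mean of the Gaussian posterior over the arm's weight vector.
        return np.linalg.solve(self.precision[arm], self.b[arm])

    def select(self, features):
        # Sample one weight vector per arm and act greedily on the samples.
        scores = []
        for arm in range(len(self.b)):
            cov = self.noise_var * np.linalg.inv(self.precision[arm])
            theta = self.rng.multivariate_normal(self.posterior_mean(arm), cov)
            scores.append(theta @ features)
        return int(np.argmax(scores))

    def update(self, arm, features, reward):
        # Standard conjugate update for the chosen arm only.
        self.precision[arm] += np.outer(features, features)
        self.b[arm] += reward * features
```

A typical loop would call `select` on the current feature vector, observe a reward, and pass both to `update`; exploration arises from the randomness of the sampled weights rather than from an explicit bonus.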

Subject(s)

Algorithms; Neural Networks, Computer; Learning

4.

Bayesian Brains and the Rényi Divergence.

Sajid, Noor; Faccio, Francesco; Da Costa, Lancelot; Parr, Thomas; Schmidhuber, Jürgen; Friston, Karl.

Neural Comput ; 34(4): 829-855, 2022 Mar 23.

Article in English | MEDLINE | ID: mdl-35231935

ABSTRACT

Under the Bayesian brain hypothesis, behavioral variations can be attributed to different priors over generative model parameters. This provides a formal explanation for why individuals exhibit inconsistent behavioral preferences when confronted with similar choices. For example, greedy preferences are a consequence of confident (or precise) beliefs over certain outcomes. Here, we offer an alternative account of behavioral variability using Rényi divergences and their associated variational bounds. Rényi bounds are analogous to the variational free energy (or evidence lower bound) and can be derived under the same assumptions. Importantly, these bounds provide a formal way to establish behavioral differences through an α parameter, given fixed priors. This rests on changes in α that alter the bound (on a continuous scale), inducing different posterior estimates and consequent variations in behavior. Thus, it looks as if individuals have different priors and have reached different conclusions. More specifically, α→0+ optimization constrains the variational posterior to be positive whenever the true posterior is positive. This leads to mass-covering variational estimates and increased variability in choice behavior. Furthermore, α→+∞ optimization constrains the variational posterior to be zero whenever the true posterior is zero. This leads to mass-seeking variational posteriors and greedy preferences. We exemplify this formulation through simulations of the multiarmed bandit task. We note that these α parameterizations may be especially relevant (i.e., shape preferences) when the true posterior is not in the same family of distributions as the assumed (simpler) approximate density, which may be the case in many real-world scenarios. The ensuing departure from vanilla variational inference provides a potentially useful explanation for differences in behavioral preferences of biological (or artificial) agents under the assumption that the brain performs variational Bayesian inference.
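For orientation, the Rényi divergence and the associated variational bound referred to in the abstract can be written as follows. This is the standard textbook form, not necessarily the notation used in the paper; the symbols q(z) (approximate posterior), p(x, z) (generative model), and x (observations) are assumptions introduced here.

```latex
% Rényi divergence of order \alpha between densities q and p (\alpha > 0, \alpha \neq 1)
D_{\alpha}(q \,\|\, p) = \frac{1}{\alpha - 1} \log \int q(z)^{\alpha}\, p(z)^{1-\alpha}\, dz

% Variational Rényi bound on the log evidence
\mathcal{L}_{\alpha}(q; x)
  = \frac{1}{1-\alpha} \log \mathbb{E}_{q(z)}\!\left[ \left( \frac{p(x, z)}{q(z)} \right)^{1-\alpha} \right]
  = \log p(x) - D_{\alpha}\big( q(z) \,\|\, p(z \mid x) \big)

% The limit \alpha \to 1 recovers the usual evidence lower bound
\lim_{\alpha \to 1} \mathcal{L}_{\alpha}(q; x)
  = \mathbb{E}_{q(z)}\!\left[ \log \frac{p(x, z)}{q(z)} \right]
```

Because the divergence term is non-negative for α > 0, the bound never exceeds log p(x); varying α changes which approximate posterior maximizes it, which is the mechanism the abstract uses to explain behavioral differences under fixed priors.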

Subject(s)

Brain; Bayes Theorem; Humans

5.

Analysis of Neural Network Based Proportional Myoelectric Hand Prosthesis Control.

Wand, Michael; Kristoffersen, Morten B; Franzke, Andreas W; Schmidhuber, Jürgen.

IEEE Trans Biomed Eng ; 69(7): 2283-2293, 2022 Jul.

Article in English | MEDLINE | ID: mdl-35007192

ABSTRACT

OBJECTIVE: We show that state-of-the-art deep neural networks achieve superior results in regression-based multi-class proportional myoelectric hand prosthesis control than two common baseline approaches, and we analyze the neural network mapping to explain why this is the case. METHODS: Feedforward neural networks and baseline systems are trained on an offline corpus of 11 able-bodied subjects and 4 prosthesis wearers, using the R2 score as metric. Analysis is performed using diverse qualitative and quantitative approaches, followed by a rigorous evaluation. RESULTS: Our best neural networks have at least three hidden layers with at least 128 neurons per layer; smaller architectures, as used by many prior studies, perform substantially worse. The key to good performance is to both optimally regress the target movement, and to suppress spurious movements. Due to the properties of the underlying data, this is impossible to achieve with linear methods, but can be attained with high exactness using sufficiently large neural networks. CONCLUSION: Neural networks perform significantly better than common linear approaches in the given task, in particular when sufficiently large architectures are used. This can be explained by salient properties of the underlying data, and by theoretical and experimental analysis of the neural network mapping. SIGNIFICANCE: To the best of our knowledge, this work is the first one in the field which not only reports that large and deep neural networks are superior to existing architectures, but also explains this result.

Subject(s)

Artificial Limbs; Neural Networks, Computer; Hand/physiology; Humans; Movement

6.

Reinforcement Learning in Sparse-Reward Environments With Hindsight Policy Gradients.

Rauber, Paulo; Ummadisingu, Avinash; Mutz, Filipe; Schmidhuber, Jürgen.

Neural Comput ; 33(6): 1498-1553, 2021 May 13.

Article in English | MEDLINE | ID: mdl-34496391

ABSTRACT

A reinforcement learning agent that needs to pursue different goals across episodes requires a goal-conditional policy. In addition to their potential to generalize desirable behavior to unseen goals, such policies may also enable higher-level planning based on subgoals. In sparse-reward environments, the capacity to exploit information about the degree to which an arbitrary goal has been achieved while another goal was intended appears crucial to enabling sample efficient learning. However, reinforcement learning agents have only recently been endowed with such capacity for hindsight. In this letter, we demonstrate how hindsight can be introduced to policy gradient methods, generalizing this idea to a broad class of successful algorithms. Our experiments on a diverse selection of sparse-reward environments show that hindsight leads to a remarkable increase in sample efficiency.

7.

Deep neural network representation and Generative Adversarial Learning.

Ruiz-Garcia, Ariel; Schmidhuber, Jürgen; Palade, Vasile; Took, Clive Cheong; Mandic, Danilo.

Neural Netw ; 139: 199-200, 2021 Jul.

Article in English | MEDLINE | ID: mdl-33774356

Subject(s)

Machine Learning; Neural Networks, Computer; Deep Learning

8.

Investigating object compositionality in Generative Adversarial Networks.

van Steenkiste, Sjoerd; Kurach, Karol; Schmidhuber, Jürgen; Gelly, Sylvain.

Neural Netw ; 130: 309-325, 2020 Oct.

Article in English | MEDLINE | ID: mdl-32736226

ABSTRACT

Deep generative models seek to recover the process with which the observed data was generated. They may be used to synthesize new samples or to subsequently extract representations. Successful approaches in the domain of images are driven by several core inductive biases. However, a bias to account for the compositional way in which humans structure a visual scene in terms of objects has frequently been overlooked. In this work, we investigate object compositionality as an inductive bias for Generative Adversarial Networks (GANs). We present a minimal modification of a standard generator to incorporate this inductive bias and find that it reliably learns to generate images as compositions of objects. Using this general design as a backbone, we then propose two useful extensions to incorporate dependencies among objects and background. We extensively evaluate our approach on several multi-object image datasets and highlight the merits of incorporating structure for representation learning purposes. In particular, we find that our structured GANs are better at generating multi-object images that are more faithful to the reference distribution. More so, we demonstrate how, by leveraging the structure of the learned generative process, one can 'invert' the learned generative model to perform unsupervised instance segmentation. On the challenging CLEVR dataset, it is shown how our approach is able to improve over other recent purely unsupervised object-centric approaches to image generation.

Subject(s)

Neural Networks, Computer; Pattern Recognition, Automated/methods; Humans; Image Processing, Computer-Assisted/methods

9.

Generative Adversarial Networks are special cases of Artificial Curiosity (1990) and also closely related to Predictability Minimization (1991).

Schmidhuber, Jürgen.

Neural Netw ; 127: 58-66, 2020 Jul.

Article in English | MEDLINE | ID: mdl-32334341

ABSTRACT

I review unsupervised or self-supervised neural networks playing minimax games in game-theoretic settings: (i) Artificial Curiosity (AC, 1990) is based on two such networks. One network learns to generate a probability distribution over outputs, the other learns to predict effects of the outputs. Each network minimizes the objective function maximized by the other. (ii) Generative Adversarial Networks (GANs, 2010-2014) are an application of AC where the effect of an output is 1 if the output is in a given set, and 0 otherwise. (iii) Predictability Minimization (PM, 1990s) models data distributions through a neural encoder that maximizes the objective function minimized by a neural predictor of the code components. I correct a previously published claim that PM is not based on a minimax game.

Subject(s)

Neural Networks, Computer; Unsupervised Machine Learning/trends; Artificial Intelligence; Forecasting; Goals; Humans; Probability

10.

Aviation pioneers missed out on publicity.

Schmidhuber, Jürgen.

Nature ; 566(7742): 39, 2019 Feb.

Article in English | MEDLINE | ID: mdl-30723352

11.

LSTM: A Search Space Odyssey.

Greff, Klaus; Srivastava, Rupesh K; Koutnik, Jan; Steunebrink, Bas R; Schmidhuber, Jürgen.

IEEE Trans Neural Netw Learn Syst ; 28(10): 2222-2232, 2017 Oct.

Article in English | MEDLINE | ID: mdl-27411231

ABSTRACT

Several variants of the long short-term memory (LSTM) architecture for recurrent neural networks have been proposed since its inception in 1995. In recent years, these networks have become the state-of-the-art models for a variety of machine learning problems. This has led to a renewed interest in understanding the role and utility of various computational components of typical LSTM variants. In this paper, we present the first large-scale analysis of eight LSTM variants on three representative tasks: speech recognition, handwriting recognition, and polyphonic music modeling. The hyperparameters of all LSTM variants for each task were optimized separately using random search, and their importance was assessed using the powerful functional ANalysis Of VAriance framework. In total, we summarize the results of 5400 experimental runs (≈15 years of CPU time), which makes our study the largest of its kind on LSTM networks. Our results show that none of the variants can improve upon the standard LSTM architecture significantly, and demonstrate the forget gate and the output activation function to be its most critical components. We further observe that the studied hyperparameters are virtually independent and derive guidelines for their efficient adjustment.
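To make the two components singled out in the abstract concrete, the sketch below shows one forward step of a simplified LSTM cell and marks where the forget gate and the output activation function act. It is an illustration only and makes assumptions: it omits peephole connections, the function name `lstm_step` and the weight layout (rows ordered input gate, forget gate, candidate, output gate) are hypothetical, and it is not the code used in the study.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x, h_prev, c_prev, W, U, b):
    """One step of a simplified LSTM cell (no peepholes).

    Shapes: x (D,), h_prev and c_prev (H,), W (4H, D), U (4H, H), b (4H,).
    """
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])          # input gate
    f = sigmoid(z[H:2 * H])      # forget gate: scales the previous cell state
    g = np.tanh(z[2 * H:3 * H])  # candidate cell update
    o = sigmoid(z[3 * H:4 * H])  # output gate
    c = f * c_prev + i * g       # new cell state
    h = o * np.tanh(c)           # tanh here is the output activation function
    return h, c
```

Removing the forget gate corresponds to fixing `f` to one, and removing the output activation corresponds to replacing `np.tanh(c)` with `c` in the last line.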

12.

Crowdsourcing the creation of image segmentation algorithms for connectomics.

Arganda-Carreras, Ignacio; Turaga, Srinivas C; Berger, Daniel R; Ciresan, Dan; Giusti, Alessandro; Gambardella, Luca M; Schmidhuber, Jürgen; Laptev, Dmitry; Dwivedi, Sarvesh; Buhmann, Joachim M; Liu, Ting; Seyedhosseini, Mojtaba; Tasdizen, Tolga; Kamentsky, Lee; Burget, Radim; Uher, Vaclav; Tan, Xiao; Sun, Changming; Pham, Tuan D; Bas, Erhan; Uzunbas, Mustafa G; Cardona, Albert; Schindelin, Johannes; Seung, H Sebastian.

Front Neuroanat ; 9: 142, 2015.

Article in English | MEDLINE | ID: mdl-26594156

ABSTRACT

To stimulate progress in automating the reconstruction of neural circuits, we organized the first international challenge on 2D segmentation of electron microscopic (EM) images of the brain. Participants submitted boundary maps predicted for a test set of images, and were scored based on their agreement with a consensus of human expert annotations. The winning team had no prior experience with EM images, and employed a convolutional network. This "deep learning" approach has since become accepted as a standard for segmentation of EM images. The challenge has continued to accept submissions, and the best so far has resulted from cooperation between two teams. The challenge has probably saturated, as algorithms cannot progress beyond limits set by ambiguities inherent in 2D scoring and the size of the test dataset. Retrospective evaluation of the challenge scoring system reveals that it was not sufficiently robust to variations in the widths of neurite borders. We propose a solution to this problem, which should be useful for a future 3D segmentation challenge.

13.

Deep learning in neural networks: an overview.

Schmidhuber, Jürgen.

Neural Netw ; 61: 85-117, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25462637

ABSTRACT

In recent years, deep artificial neural networks (including recurrent ones) have won numerous contests in pattern recognition and machine learning. This historical survey compactly summarizes relevant work, much of it from the previous millennium. Shallow and Deep Learners are distinguished by the depth of their credit assignment paths, which are chains of possibly learnable, causal links between actions and effects. I review deep supervised learning (also recapitulating the history of backpropagation), unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.

Subject(s)

Artificial Intelligence/classification; Artificial Intelligence/standards; Artificial Intelligence/trends

14.

Assessment of algorithms for mitosis detection in breast cancer histopathology images.

Veta, Mitko; van Diest, Paul J; Willems, Stefan M; Wang, Haibo; Madabhushi, Anant; Cruz-Roa, Angel; Gonzalez, Fabio; Larsen, Anders B L; Vestergaard, Jacob S; Dahl, Anders B; Ciresan, Dan C; Schmidhuber, Jürgen; Giusti, Alessandro; Gambardella, Luca M; Tek, F Boray; Walter, Thomas; Wang, Ching-Wei; Kondo, Satoshi; Matuszewski, Bogdan J; Precioso, Frederic; Snell, Violet; Kittler, Josef; de Campos, Teofilo E; Khan, Adnan M; Rajpoot, Nasir M; Arkoumani, Evdokia; Lacle, Miangela M; Viergever, Max A; Pluim, Josien P W.

Med Image Anal ; 20(1): 237-48, 2015 Feb.

Article in English | MEDLINE | ID: mdl-25547073

ABSTRACT

The proliferative activity of breast tumors, which is routinely estimated by counting of mitotic figures in hematoxylin and eosin stained histology sections, is considered to be one of the most important prognostic markers. However, mitosis counting is laborious, subjective and may suffer from low inter-observer agreement. With the wider acceptance of whole slide images in pathology labs, automatic image analysis has been proposed as a potential solution for these issues. In this paper, the results from the Assessment of Mitosis Detection Algorithms 2013 (AMIDA13) challenge are described. The challenge was based on a data set consisting of 12 training and 11 testing subjects, with more than one thousand annotated mitotic figures by multiple observers. Short descriptions and results from the evaluation of eleven methods are presented. The top performing method has an error rate that is comparable to the inter-observer agreement among pathologists.

Subject(s)

Algorithms; Breast Neoplasms/pathology; Mitosis; Female; Humans; Observer Variation

15.

Candidate sampling for neuron reconstruction from anisotropic electron microscopy volumes.

Funke, Jan; Martel, Julien N P; Gerhard, Stephan; Andres, Bjoern; Ciresan, Dan C; Giusti, Alessandro; Gambardella, Luca M; Schmidhuber, Jürgen; Pfister, Hanspeter; Cardona, Albert; Cook, Matthew.

Med Image Comput Comput Assist Interv ; 17(Pt 1): 17-24, 2014.

Article in English | MEDLINE | ID: mdl-25333096

ABSTRACT

The automatic reconstruction of neurons from stacks of electron microscopy sections is an important computer vision problem in neuroscience. Recent advances are based on a two step approach: First, a set of possible 2D neuron candidates is generated for each section independently based on membrane predictions of a local classifier. Second, the candidates of all sections of the stack are fed to a neuron tracker that selects and connects them in 3D to yield a reconstruction. The accuracy of the result is currently limited by the quality of the generated candidates. In this paper, we propose to replace the heuristic set of candidates used in previous methods with samples drawn from a conditional random field (CRF) that is trained to label sections of neural tissue. We show on a stack of Drosophila melanogaster neural tissue that neuron candidates generated with our method produce 30% less reconstruction errors than current candidate generation methods. Two properties of our CRF are crucial for the accuracy and applicability of our method: (1) The CRF models the orientation of membranes to produce more plausible neuron candidates. (2) The interactions in the CRF are restricted to form a bipartite graph, which allows a great sampling speed-up without loss of accuracy.

Subject(s)

Algorithms; Image Interpretation, Computer-Assisted/methods; Imaging, Three-Dimensional/methods; Microscopy, Electron/methods; Pattern Recognition, Automated/methods; Subtraction Technique; Animals; Anisotropy; Cells, Cultured; Data Interpretation, Statistical; Drosophila melanogaster; Image Enhancement/methods; Reproducibility of Results; Sample Size; Sensitivity and Specificity; Signal Processing, Computer-Assisted

16.

Curiosity driven reinforcement learning for motion planning on humanoids.

Frank, Mikhail; Leitner, Jürgen; Stollenga, Marijn; Förster, Alexander; Schmidhuber, Jürgen.

Front Neurorobot ; 7: 25, 2014 Jan 06.

Article in English | MEDLINE | ID: mdl-24432001

ABSTRACT

Most previous work on artificial curiosity (AC) and intrinsic motivation focuses on basic concepts and theory. Experimental results are generally limited to toy scenarios, such as navigation in a simulated maze, or control of a simple mechanical system with one or two degrees of freedom. To study AC in a more realistic setting, we embody a curious agent in the complex iCub humanoid robot. Our novel reinforcement learning (RL) framework consists of a state-of-the-art, low-level, reactive control layer, which controls the iCub while respecting constraints, and a high-level curious agent, which explores the iCub's state-action space through information gain maximization, learning a world model from experience, controlling the actual iCub hardware in real-time. To the best of our knowledge, this is the first ever embodied, curious agent for real-time motion planning on a humanoid. We demonstrate that it can learn compact Markov models to represent large regions of the iCub's configuration space, and that the iCub explores intelligently, showing interest in its physical constraints as well as in objects it finds in its environment.

17.

Multimodal Similarity-Preserving Hashing.

Masci, Jonathan; Bronstein, Michael M; Bronstein, Alexander M; Schmidhuber, Jürgen.

IEEE Trans Pattern Anal Mach Intell ; 36(4): 824-30, 2014 Apr.

Article in English | MEDLINE | ID: mdl-26353203

ABSTRACT

We introduce an efficient computational framework for hashing data belonging to multiple modalities into a single representation space where they become mutually comparable. The proposed approach is based on a novel coupled siamese neural network architecture and allows unified treatment of intra- and inter-modality similarity learning. Unlike existing cross-modality similarity learning approaches, our hashing functions are not limited to binarized linear projections and can assume arbitrarily complex forms. We show experimentally that our method significantly outperforms state-of-the-art hashing approaches on multimedia retrieval tasks.

18.

Confidence-based progress-driven self-generated goals for skill acquisition in developmental robots.

Ngo, Hung; Luciw, Matthew; Förster, Alexander; Schmidhuber, Jürgen.

Front Psychol ; 4: 833, 2013.

Article in English | MEDLINE | ID: mdl-24324448

ABSTRACT

A reinforcement learning agent that autonomously explores its environment can utilize a curiosity drive to enable continual learning of skills, in the absence of any external rewards. We formulate curiosity-driven exploration, and eventual skill acquisition, as a selective sampling problem. Each environment setting provides the agent with a stream of instances. An instance is a sensory observation that, when queried, causes an outcome that the agent is trying to predict. After an instance is observed, a query condition, derived herein, tells whether its outcome is statistically known or unknown to the agent, based on the confidence interval of an online linear classifier. Upon encountering the first unknown instance, the agent "queries" the environment to observe the outcome, which is expected to improve its confidence in the corresponding predictor. If the environment is in a setting where all instances are known, the agent generates a plan of actions to reach a new setting, where an unknown instance is likely to be encountered. The desired setting is a self-generated goal, and the plan of action, essentially a program to solve a problem, is a skill. The success of the plan depends on the quality of the agent's predictors, which are improved as mentioned above. For validation, this method is applied to both a simulated and real Katana robot arm in its "blocks-world" environment. Results show that the proposed method generates sample-efficient curious exploration behavior, which exhibits developmental stages, continual learning, and skill acquisition, in an intrinsically-motivated playful agent.

19.

PowerPlay: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem.

Schmidhuber, Jürgen.

Front Psychol ; 4: 313, 2013.

Article in English | MEDLINE | ID: mdl-23761771

ABSTRACT

Most of computer science focuses on automatically solving given computational problems. I focus on automatically inventing or discovering problems in a way inspired by the playful behavior of animals and humans, to train a more and more general problem solver from scratch in an unsupervised fashion. Consider the infinite set of all computable descriptions of tasks with possibly computable solutions. Given a general problem-solving architecture, at any given time, the novel algorithmic framework PowerPlay (Schmidhuber, 2011) searches the space of possible pairs of new tasks and modifications of the current problem solver, until it finds a more powerful problem solver that provably solves all previously learned tasks plus the new one, while the unmodified predecessor does not. Newly invented tasks may require to achieve a wow-effect by making previously learned skills more efficient such that they require less time and space. New skills may (partially) re-use previously learned skills. The greedy search of typical PowerPlay variants uses time-optimal program search to order candidate pairs of tasks and solver modifications by their conditional computational (time and space) complexity, given the stored experience so far. The new task and its corresponding task-solving skill are those first found and validated. This biases the search toward pairs that can be described compactly and validated quickly. The computational costs of validating new tasks need not grow with task repertoire size. Standard problem solver architectures of personal computers or neural networks tend to generalize by solving numerous tasks outside the self-invented training set; PowerPlay's ongoing search for novelty keeps breaking the generalization abilities of its present solver. This is related to Gödel's sequence of increasingly powerful formal theories based on adding formerly unprovable statements to the axioms without affecting previously provable theorems. The continually increasing repertoire of problem-solving procedures can be exploited by a parallel search for solutions to additional externally posed tasks. PowerPlay may be viewed as a greedy but practical implementation of basic principles of creativity (Schmidhuber, 2006a, 2010). A first experimental analysis can be found in separate papers (Srivastava et al., 2012a,b, 2013).

20.

First experiments with POWERPLAY.

Srivastava, Rupesh Kumar; Steunebrink, Bas R; Schmidhuber, Jürgen.

Neural Netw ; 41: 130-6, 2013 May.

Article in English | MEDLINE | ID: mdl-23465562

ABSTRACT

Like a scientist or a playing child, POWERPLAY (Schmidhuber, 2011) not only learns new skills to solve given problems, but also invents new interesting problems by itself. By design, it continually comes up with the fastest to find, initially novel, but eventually solvable tasks. It also continually simplifies or compresses or speeds up solutions to previous tasks. Here we describe first experiments with POWERPLAY. A self-delimiting recurrent neural network SLIM RNN (Schmidhuber, 2012) is used as a general computational problem solving architecture. Its connection weights can encode arbitrary, self-delimiting, halting or non-halting programs affecting both environment (through effectors) and internal states encoding abstractions of event sequences. Our POWERPLAY-driven SLIM RNN learns to become an increasingly general solver of self-invented problems, continually adding new problem solving procedures to its growing skill repertoire. Extending a recent conference paper (Srivastava, Steunebrink, Stollenga, & Schmidhuber, 2012), we identify interesting, emerging, developmental stages of our open-ended system. We also show how it automatically self-modularizes, frequently re-using code for previously invented skills, always trying to invent novel tasks that can be quickly validated because they do not require too many weight changes affecting too many previous tasks.

Subject(s)

Algorithms; Artificial Intelligence; Neural Networks, Computer; Play and Playthings; Science/methods; Cognition; Humans; Models, Theoretical; Problem Solving
