Pesquisa | BVS - MINISTÉRIO DA SAÚDE

Discovering and Exploiting Sparse Rewards in a Learned Behavior Space.

Paolo, Giuseppe; Coninx, Miranda; Laflaquière, Alban; Doncieux, Stephane.

Evol Comput ; : 1-28, 2023 Oct 06.

Artigo em Inglês | MEDLINE | ID: mdl-37793063

RESUMO

Learning optimal policies in sparse rewards settings is difficult as the learning agent has little to no feedback on the quality of its actions. In these situations, a good strategy is to focus on exploration, hopefully leading to the discovery of a reward signal to improve on. A learning algorithm capable of dealing with this kind of settings has to be able to (1) explore possible agent behaviors and (2) exploit any possible discovered reward. Exploration algorithms have been proposed that require the definition of a low-dimension behavior space, in which the behavior generated by the agent's policy can be represented. The need to design a priori this space such that it is worth exploring is a major limitation of these algorithms. In this work, we introduce STAX, an algorithm designed to learn a behavior space on-the-fly and to explore it while optimizing any reward discovered. It does so by separating the exploration and learning of the behavior space from the exploitation of the reward through an alternating two-step process. In the first step, STAX builds a repertoire of diverse policies while learning a low-dimensional representation of the high-dimensional observations generated during the policies evaluation. In the exploitation step, emitters optimize the performance of the discovered rewarding solutions. Experiments conducted on three different sparse reward environments show that STAX performs comparably to existing baselines while requiring much less prior information about the task as it autonomously builds the behavior space it explores.

Building an Affordances Map With Interactive Perception.

Le Goff, Léni K; Yaakoubi, Oussama; Coninx, Alexandre; Doncieux, Stéphane.

Front Neurorobot ; 16: 504459, 2022.

Artigo em Inglês | MEDLINE | ID: mdl-35619968

RESUMO

Robots need to understand their environment to perform their task. If it is possible to pre-program a visual scene analysis process in closed environments, robots operating in an open environment would benefit from the ability to learn it through their interaction with their environment. This ability furthermore opens the way to the acquisition of affordances maps in which the action capabilities of the robot structure its visual scene understanding. We propose an approach to build such affordances maps by relying on an interactive perception approach and an online classification for a real robot equipped with two arms with 7 degrees of freedom. Our system is modular and permits to learn maps from different skills. In the proposed formalization of affordances, actions and effects are related to visual features, not objects, thus our approach does not need a prior definition of the concept of object. We have tested the approach on three action primitives and on a real PR2 robot.

Exploratory State Representation Learning.

Merckling, Astrid; Perrin-Gilbert, Nicolas; Coninx, Alex; Doncieux, Stéphane.

Front Robot AI ; 9: 762051, 2022.

Artigo em Inglês | MEDLINE | ID: mdl-35237669

RESUMO

Not having access to compact and meaningful representations is known to significantly increase the complexity of reinforcement learning (RL). For this reason, it can be useful to perform state representation learning (SRL) before tackling RL tasks. However, obtaining a good state representation can only be done if a large diversity of transitions is observed, which can require a difficult exploration, especially if the environment is initially reward-free. To solve the problems of exploration and SRL in parallel, we propose a new approach called XSRL (eXploratory State Representation Learning). On one hand, it jointly learns compact state representations and a state transition estimator which is used to remove unexploitable information from the representations. On the other hand, it continuously trains an inverse model, and adds to the prediction error of this model a k-step learning progress bonus to form the maximization objective of a discovery policy. This results in a policy that seeks complex transitions from which the trained models can effectively learn. Our experimental results show that the approach leads to efficient exploration in challenging environments with image observations, and to state representations that significantly accelerate learning in RL tasks.

The Surprising Creativity of Digital Evolution: A Collection of Anecdotes from the Evolutionary Computation and Artificial Life Research Communities.

Lehman, Joel; Clune, Jeff; Misevic, Dusan; Adami, Christoph; Altenberg, Lee; Beaulieu, Julie; Bentley, Peter J; Bernard, Samuel; Beslon, Guillaume; Bryson, David M; Cheney, Nick; Chrabaszcz, Patryk; Cully, Antoine; Doncieux, Stephane; Dyer, Fred C; Ellefsen, Kai Olav; Feldt, Robert; Fischer, Stephan; Forrest, Stephanie; Frenoy, Antoine; Gagne, Christian; Le Goff, Leni; Grabowski, Laura M; Hodjat, Babak; Hutter, Frank; Keller, Laurent; Knibbe, Carole; Krcah, Peter; Lenski, Richard E; Lipson, Hod; MacCurdy, Robert; Maestre, Carlos; Miikkulainen, Risto; Mitri, Sara; Moriarty, David E; Mouret, Jean-Baptiste; Nguyen, Anh; Ofria, Charles; Parizeau, Marc; Parsons, David; Pennock, Robert T; Punch, William F; Ray, Thomas S; Schoenauer, Marc; Schulte, Eric; Sims, Karl; Stanley, Kenneth O; Taddei, François; Tarapore, Danesh; Thibault, Simon.

Artif Life ; 26(2): 274-306, 2020.

Artigo em Inglês | MEDLINE | ID: mdl-32271631

RESUMO

Evolution provides a creative fount of complex and subtle adaptations that often surprise the scientists who discover them. However, the creativity of evolution is not limited to the natural world: Artificial organisms evolving in computational environments have also elicited surprise and wonder from the researchers studying them. The process of evolution is an algorithmic process that transcends the substrate in which it occurs. Indeed, many researchers in the field of digital evolution can provide examples of how their evolving algorithms and organisms have creatively subverted their expectations or intentions, exposed unrecognized bugs in their code, produced unexpectedly adaptations, or engaged in behaviors and outcomes, uncannily convergent with ones found in nature. Such stories routinely reveal surprise and creativity by evolution in these digital worlds, but they rarely fit into the standard scientific narrative. Instead they are often treated as mere obstacles to be overcome, rather than results that warrant study in their own right. Bugs are fixed, experiments are refocused, and one-off surprises are collapsed into a single data point. The stories themselves are traded among researchers through oral tradition, but that mode of information transmission is inefficient and prone to error and outright loss. Moreover, the fact that these stories tend to be shared only among practitioners means that many natural scientists do not realize how interesting and lifelike digital organisms are and how natural their evolution can be. To our knowledge, no collection of such anecdotes has been published before. This article is the crowd-sourced product of researchers in the fields of artificial life and evolutionary computation who have provided first-hand accounts of such cases. It thus serves as a written, fact-checked collection of scientifically important and even entertaining stories. In doing so we also present here substantial evidence that the existence and importance of evolutionary surprises extends beyond the natural world, and may indeed be a universal property of all complex evolving systems.

Assuntos

Algoritmos , Biologia Computacional , Criatividade , Vida , Evolução Biológica

Action Generation Adapted to Low-Level and High-Level Robot-Object Interaction States.

Maestre, Carlos; Mukhtar, Ghanim; Gonzales, Christophe; Doncieux, Stephane.

Front Neurorobot ; 13: 56, 2019.

Artigo em Inglês | MEDLINE | ID: mdl-31396071

RESUMO

Our daily environments are complex, composed of objects with different features. These features can be categorized into low-level features, e.g., an object position or temperature, and high-level features resulting from a pre-processing of low-level features for decision purposes, e.g., a binary value saying if it is too hot to be grasped. Besides, our environments are dynamic, i.e., object states can change at any moment. Therefore, robots performing tasks in these environments must have the capacity to (i) identify the next action to execute based on the available low-level and high-level object states, and (ii) dynamically adapt their actions to state changes. We introduce a method named Interaction State-based Skill Learning (IS2L), which builds skills to solve tasks in realistic environments. A skill is a Bayesian Network that infers actions composed of a sequence of movements of the robot's end-effector, which locally adapt to spatio-temporal perturbations using a dynamical system. In the current paper, an external agent performs one or more kinesthetic demonstrations of an action generating a dataset of high-level and low-level states of the robot and the environment objects. First, the method transforms each interaction to represent (i) the relationship between the robot and the object and (ii) the next robot end-effector movement to perform at consecutive instants of time. Then, the skill is built, i.e., the Bayesian network is learned. While generating an action this skill relies on the robot and object states to infer the next movement to execute. This movement selection gets inspired by a type of predictive models for action selection usually called affordances. The main contribution of this paper is combining the main features of dynamical systems and affordances in a unique method to build skills that solve tasks in realistic scenarios. More precisely, combining the low-level movement generation of the dynamical systems, to adapt to local perturbations, with the next movement selection simultaneously based on high-level and low-level states. This contribution was assessed in three experiments in realistic environments using both high-level and low-level states. The built skills solved the respective tasks relying on both types of states, and adapting to external perturbations.

Open-Ended Learning: A Conceptual Framework Based on Representational Redescription.

Doncieux, Stephane; Filliat, David; Díaz-Rodríguez, Natalia; Hospedales, Timothy; Duro, Richard; Coninx, Alexandre; Roijers, Diederik M; Girard, Benoît; Perrin, Nicolas; Sigaud, Olivier.

Front Neurorobot ; 12: 59, 2018.

Artigo em Inglês | MEDLINE | ID: mdl-30319388

RESUMO

Reinforcement learning (RL) aims at building a policy that maximizes a task-related reward within a given domain. When the domain is known, i.e., when its states, actions and reward are defined, Markov Decision Processes (MDPs) provide a convenient theoretical framework to formalize RL. But in an open-ended learning process, an agent or robot must solve an unbounded sequence of tasks that are not known in advance and the corresponding MDPs cannot be built at design time. This defines the main challenges of open-ended learning: how can the agent learn how to behave appropriately when the adequate states, actions and rewards representations are not given? In this paper, we propose a conceptual framework to address this question. We assume an agent endowed with low-level perception and action capabilities. This agent receives an external reward when it faces a task. It must discover the state and action representations that will let it cast the tasks as MDPs in order to solve them by RL. The relevance of the action or state representation is critical for the agent to learn efficiently. Considering that the agent starts with a low level, task-agnostic state and action spaces based on its low-level perception and action capabilities, we describe open-ended learning as the challenge of building the adequate representation of states and actions, i.e., of redescribing available representations. We suggest an iterative approach to this problem based on several successive Representational Redescription processes, and highlight the corresponding challenges in which intrinsic motivations play a key role.

Editorial: Evolvability, Environments, Embodiment & Emergence in Robotics.

Long, John H; Aaron, Eric; Doncieux, Stéphane.

Front Robot AI ; 5: 103, 2018.

Artigo em Inglês | MEDLINE | ID: mdl-33500982

Design of a bio-inspired controller for dynamic soaring in a simulated unmanned aerial vehicle.

Barate, Renaud; Doncieux, Stéphane; Meyer, Jean-Arcady.

Bioinspir Biomim ; 1(3): 76-88, 2006 Sep.

Artigo em Inglês | MEDLINE | ID: mdl-17671309

RESUMO

This paper is inspired by the way birds such as albatrosses are able to exploit wind gradients at the surface of the ocean for staying aloft for very long periods while minimizing their energy expenditure. The corresponding behaviour has been partially reproduced here via a set of Takagi-Sugeno-Kang fuzzy rules controlling a simulated glider. First, the rules were hand-designed. Then, they were optimized with an evolutionary algorithm that improved their efficiency at coping with challenging conditions. Finally, the robustness properties of the controller generated were assessed with a view to its applicability to a real platform.

Assuntos

Aeronaves/instrumentação , Algoritmos , Inteligência Artificial , Materiais Biomiméticos , Desenho Assistido por Computador , Voo Animal/fisiologia , Modelos Teóricos , Simulação por Computador , Desenho de Equipamento , Análise de Falha de Equipamento , Retroalimentação , Lógica Fuzzy

RESUMO

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA