Results 1 - 15 of 15
1.
Neural Comput ; 36(4): 677-704, 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38457764

ABSTRACT

Representing a scene and its constituent objects from raw sensory data is a core ability for enabling robots to interact with their environment. In this letter, we propose a novel approach for scene understanding, leveraging an object-centric generative model that enables an agent to infer object category and pose in an allocentric reference frame using active inference, a neuro-inspired framework for action and perception. For evaluating the behavior of an active vision agent, we also propose a new benchmark where, given a target viewpoint of a particular object, the agent needs to find the best matching viewpoint given a workspace with randomly positioned objects in 3D. We demonstrate that our active inference agent is able to balance epistemic foraging and goal-driven behavior, and quantitatively outperforms both supervised and reinforcement learning baselines by more than a factor of two in terms of success rate.

2.
Entropy (Basel) ; 26(1), 2024 Jan 18.
Article in English | MEDLINE | ID: mdl-38248208

ABSTRACT

Robust evidence suggests that humans explore their environment using a combination of topological landmarks and coarse-grained path integration. This approach relies on identifiable environmental features (topological landmarks) in tandem with estimations of distance and direction (coarse-grained path integration) to construct cognitive maps of the surroundings. This cognitive map is believed to exhibit a hierarchical structure, allowing efficient planning when solving complex navigation tasks. Inspired by human behaviour, this paper presents a scalable hierarchical active inference model for autonomous navigation, exploration, and goal-oriented behaviour. The model uses visual observation and motion perception to combine curiosity-driven exploration with goal-oriented behaviour. Motion is planned using different levels of reasoning, i.e., from context to place to motion. This allows for efficient navigation in new spaces and rapid progress toward a target. By incorporating these human navigational strategies and their hierarchical representation of the environment, this model proposes a new solution for autonomous navigation and exploration. The approach is validated through simulations in a mini-grid environment.

3.
Neurosci Biobehav Rev ; 156: 105500, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38056542

ABSTRACT

This paper concerns the distributed intelligence or federated inference that emerges under belief-sharing among agents who share a common world and world model. Imagine, for example, several animals keeping a lookout for predators. Their collective surveillance rests upon being able to communicate their beliefs (about what they see) among themselves. But how is this possible? Here, we show how all the necessary components arise from minimising free energy. We use numerical studies to simulate the generation, acquisition and emergence of language in synthetic agents. Specifically, we consider inference, learning and selection as minimising the variational free energy of posterior (i.e., Bayesian) beliefs about the states, parameters and structure of generative models, respectively. The common theme that attends these optimisation processes is the selection of actions that minimise expected free energy, leading to active inference, learning and model selection (a.k.a. structure learning). We first illustrate the role of communication in resolving uncertainty about the latent states of a partially observed world, on which agents have complementary perspectives. We then consider the acquisition of the requisite language, entailed by a likelihood mapping from an agent's beliefs to their overt expression (e.g., speech), showing that language can be transmitted across generations by active learning. Finally, we show that language is an emergent property of free energy minimisation, when agents operate within the same econiche. We conclude with a discussion of various perspectives on these phenomena, ranging from cultural niche construction, through federated learning, to the emergence of complexity in ensembles of self-organising systems.


Subjects
Communication , Language , Animals , Bayes Theorem , Uncertainty , Speech
4.
Exp Dermatol ; 32(10): 1744-1751, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37534916

ABSTRACT

In dermatology, deep learning may be applied for skin lesion classification. However, for a given input image, a neural network only outputs a label, obtained using the class probabilities, which do not model uncertainty. Our group developed a novel method to quantify uncertainty in stochastic neural networks. In this study, we aimed to train such a network for skin lesion classification, evaluate its diagnostic performance and uncertainty, and compare the results to the assessments by a group of dermatologists. By passing duplicates of an image through such a stochastic neural network, we obtained distributions per class, rather than a single probability value. We interpreted the overlap between these distributions as the output uncertainty, where a high overlap indicated a high uncertainty, and vice versa. We had 29 dermatologists diagnose a series of skin lesions and rate their confidence. We compared these results to those of the network. The network achieved a sensitivity of 50% and a specificity of 88%, comparable to the average dermatologist (68% and 73%, respectively). Higher confidence/less uncertainty was associated with better diagnostic performance both in the neural network and in dermatologists. We found no correlation between the uncertainty of the neural network and the confidence of dermatologists (R = -0.06, p = 0.77). Dermatologists should not blindly trust the output of a neural network, especially when its uncertainty is high. The addition of an uncertainty score may stimulate human-computer interaction.


Subjects
Artificial Intelligence , Dermatologists , Dermoscopy , Skin Diseases , Humans , Dermoscopy/methods , Melanoma/diagnostic imaging , Melanoma/pathology , Skin Neoplasms/diagnostic imaging , Skin Neoplasms/pathology , Skin Diseases/diagnostic imaging , Skin Diseases/pathology
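
The overlap-based uncertainty described in the abstract above can be illustrated with a small sketch. This is not the authors' method: the histogram-intersection measure, the bin count, and the `toy_stochastic_pass` stand-in for a stochastic network are all assumptions made for illustration, using a two-class toy problem.

```python
import random

def histogram(samples, bins=20):
    """Normalized histogram of probabilities in [0, 1]."""
    counts = [0] * bins
    for p in samples:
        idx = min(int(p * bins), bins - 1)  # p == 1.0 falls in the last bin
        counts[idx] += 1
    n = len(samples)
    return [c / n for c in counts]

def overlap(samples_a, samples_b, bins=20):
    """Histogram intersection: 0 means disjoint, 1 means identical."""
    ha, hb = histogram(samples_a, bins), histogram(samples_b, bins)
    return sum(min(a, b) for a, b in zip(ha, hb))

def toy_stochastic_pass(rng, mean, spread):
    """Hypothetical stand-in for one stochastic forward pass (2 classes)."""
    p = min(max(rng.gauss(mean, spread), 0.0), 1.0)
    return p, 1.0 - p

rng = random.Random(0)
# 200 "duplicates" of the same image, once for a clear case, once for an unclear one.
confident = [toy_stochastic_pass(rng, 0.9, 0.03) for _ in range(200)]
uncertain = [toy_stochastic_pass(rng, 0.5, 0.10) for _ in range(200)]

u_confident = overlap([a for a, _ in confident], [b for _, b in confident])
u_uncertain = overlap([a for a, _ in uncertain], [b for _, b in uncertain])
print(f"overlap (clear case):   {u_confident:.2f}")
print(f"overlap (unclear case): {u_uncertain:.2f}")
```

Under these assumptions, well-separated per-class distributions give an overlap near zero (low uncertainty), while distributions that sit on top of each other give an overlap near one (high uncertainty).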
5.
Interface Focus ; 13(3): 20220077, 2023 Jun 06.
Article in English | MEDLINE | ID: mdl-37065264

ABSTRACT

Humans perceive and interact with hundreds of objects every day. In doing so, they need to employ mental models of these objects and often exploit symmetries in the object's shape and appearance in order to learn generalizable and transferable skills. Active inference is a first principles approach to understanding and modelling sentient agents. It states that agents entertain a generative model of their environment, and learn and act by minimizing an upper bound on their surprisal, i.e. their free energy. The free energy decomposes into an accuracy and complexity term, meaning that agents favour the least complex model that can accurately explain their sensory observations. In this paper, we investigate how inherent symmetries of particular objects also emerge as symmetries in the latent state space of the generative model learnt under deep active inference. In particular, we focus on object-centric representations, which are trained from pixels to predict novel object views as the agent moves its viewpoint. First, we investigate the relation between model complexity and symmetry exploitation in the state space. Second, we do a principal component analysis to demonstrate how the model encodes the principal axis of symmetry of the object in the latent space. Finally, we also demonstrate how more symmetrical representations can be exploited for better generalization in the context of manipulation.
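
The decomposition of free energy into complexity and accuracy mentioned in the abstract above can be checked numerically on a toy discrete model. The sketch below is only an illustration: the two-state prior, the likelihood values, and all function names are assumptions, and the abstract's deep, pixel-based model is not reproduced.

```python
import math

def free_energy(q, prior, likelihood_o):
    """Variational free energy F = complexity - accuracy for one observation.

    complexity = KL[q(s) || p(s)], accuracy = E_q[log p(o | s)].
    """
    complexity = sum(qi * math.log(qi / pi) for qi, pi in zip(q, prior) if qi > 0)
    accuracy = sum(qi * math.log(li) for qi, li in zip(q, likelihood_o) if qi > 0)
    return complexity - accuracy, complexity, accuracy

prior = [0.5, 0.5]           # p(s), assumed for illustration
likelihood_o = [0.9, 0.2]    # p(o | s) for the observation actually seen

evidence = sum(p * l for p, l in zip(prior, likelihood_o))
surprisal = -math.log(evidence)
exact_posterior = [p * l / evidence for p, l in zip(prior, likelihood_o)]

f_exact, _, _ = free_energy(exact_posterior, prior, likelihood_o)
f_rough, complexity, accuracy = free_energy([0.5, 0.5], prior, likelihood_o)

print(f"surprisal        = {surprisal:.4f}")
print(f"F (exact q)      = {f_exact:.4f}")
print(f"F (q = prior)    = {f_rough:.4f}")
```

Free energy equals the surprisal only when `q` is the exact posterior; any other `q`, such as simply reusing the prior, gives a larger value, which is what makes it an upper bound.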

6.
Sensors (Basel) ; 22(19), 2022 Sep 28.
Article in English | MEDLINE | ID: mdl-36236477

ABSTRACT

The robotics field has been deeply influenced by the advent of deep learning. In recent years, this trend has been characterized by the adoption of large, pretrained models for robotic use cases, which are not compatible with the computational hardware available in robotic systems. Moreover, such large, computationally intensive models impede the low-latency execution which is required for many closed-loop control systems. In this work, we propose different strategies for improving the computational efficiency of the deep-learning models adopted in reinforcement-learning (RL) scenarios. As a use-case project, we consider an image-based RL method on the synergy between push-and-grasp actions. As a first optimization step, we reduce the model architecture in complexity, by decreasing the number of layers and by altering the architecture structure. Second, we consider downscaling the input resolution to reduce the computational load. Finally, we perform weight quantization, where we compare post-training quantization and quantization-aware training. We benchmark the improvements introduced in each optimization by running a standard testing routine. We show that the optimization strategies introduced can improve the computational efficiency by around 300 times, while also slightly improving the functional performance of the system. In addition, we demonstrate closed-loop control behaviour on a real-world robot, while processing everything on a Jetson Xavier NX edge device.


Subjects
Robotics , Algorithms , Hand Strength , Robotics/methods
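
To make the weight-quantization step in the abstract above more concrete, here is a minimal sketch of post-training quantization to 8-bit integers. It assumes a symmetric, per-tensor scheme on a plain list of weights; the abstract does not say which scheme or toolchain was used, and quantization-aware training is not shown.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the integer codes."""
    return [v * scale for v in quantized]

weights = [0.42, -1.27, 0.003, 0.9, -0.65]   # illustrative values only
q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("int8 codes:", q_weights)
print(f"scale = {scale:.5f}, max round-trip error = {max_error:.5f}")
```

The round-trip error is bounded by half the scale, so the precision lost depends on the largest weight magnitude in the tensor.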
7.
Front Syst Neurosci ; 16: 787659, 2022.
Article in English | MEDLINE | ID: mdl-36246500

ABSTRACT

Simultaneous localization and mapping (SLAM) represents a fundamental problem for autonomous embodied systems, for which the hippocampal/entorhinal system (H/E-S) has been optimized over the course of evolution. We have developed a biologically-inspired SLAM architecture based on latent variable generative modeling within the Free Energy Principle and Active Inference (FEP-AI) framework, which affords flexible navigation and planning in mobile robots. We have primarily focused on attempting to reverse engineer H/E-S "design" properties, but here we consider ways in which SLAM principles from robotics may help us better understand nervous systems and emergent minds. After reviewing LatentSLAM and notable features of this control architecture, we consider how the H/E-S may realize these functional properties not only for physical navigation, but also with respect to high-level cognition understood as generalized simultaneous localization and mapping (G-SLAM). We focus on loop-closure, graph-relaxation, and node duplication as particularly impactful architectural features, suggesting these computational phenomena may contribute to understanding cognitive insight (as proto-causal-inference), accommodation (as integration into existing schemas), and assimilation (as category formation). All these operations can similarly be described in terms of structure/category learning on multiple levels of abstraction. However, here we adopt an ecological rationality perspective, framing H/E-S functions as orchestrating SLAM processes within both concrete and abstract hypothesis spaces. In this navigation/search process, adaptive cognitive equilibration between assimilation and accommodation involves balancing tradeoffs between exploration and exploitation; this dynamic equilibrium may be near optimally realized in FEP-AI, wherein control systems governed by expected free energy objective functions naturally balance model simplicity and accuracy. With respect to structure learning, such a balance would involve constructing models and categories that are neither too inclusive nor exclusive. We propose these (generalized) SLAM phenomena may represent some of the most impactful sources of variation in cognition both within and between individuals, suggesting that modulators of H/E-S functioning may potentially illuminate their adaptive significances as fundamental cybernetic control parameters. Finally, we discuss how understanding H/E-S contributions to G-SLAM may provide a unifying framework for high-level cognition and its potential realization in artificial intelligences.

8.
Front Neurorobot ; 16: 840658, 2022.
Article in English | MEDLINE | ID: mdl-35496899

ABSTRACT

Scene understanding and decomposition is a crucial challenge for intelligent systems, whether it is for object manipulation, navigation, or any other task. Although current machine and deep learning approaches for object detection and classification obtain high accuracy, they typically do not leverage interaction with the world and are limited to a set of objects seen during training. Humans, on the other hand, learn to recognize and classify different objects by actively engaging with them on first encounter. Moreover, recent theories in neuroscience suggest that cortical columns in the neocortex play an important role in this process, by building predictive models about objects in their reference frame. In this article, we present an enactive embodied agent that implements such a generative model for object interaction. For each object category, our system instantiates a deep neural network, called Cortical Column Network (CCN), that represents the object in its own reference frame by learning a generative model that predicts the expected transform in pixel space, given an action. The model parameters are optimized through the active inference paradigm, i.e., the minimization of variational free energy. When provided with a visual observation, an ensemble of CCNs each vote on their belief of observing that specific object category, yielding a potential object classification. In case the likelihood on the selected category is too low, the object is detected as an unknown category, and the agent has the ability to instantiate a novel CCN for this category. We validate our system in a simulated environment, where it needs to learn to discern multiple objects from the YCB dataset. We show that classification accuracy improves as an embodied agent can gather more evidence, and that it is able to learn about novel, previously unseen objects. Finally, we show that an agent driven through active inference can choose its actions to reach a preferred observation.
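
The vote-then-threshold logic described in the abstract above can be sketched in a few lines. Everything here is hypothetical: the per-category scores stand in for the beliefs of trained CCNs, the threshold value is arbitrary, and "instantiating a novel CCN" is reduced to appending a new label.

```python
def classify_with_ensemble(scores, threshold):
    """Return the best-scoring category, or None if no model is confident enough."""
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None

def observe(models, scores, threshold=0.5):
    """Classify one observation; register a new category when it looks unknown."""
    label = classify_with_ensemble(scores, threshold)
    if label is None:
        label = f"unknown_{len(models)}"   # placeholder for creating a new model
        models.append(label)
    return label

models = ["mug", "banana"]                                # illustrative categories
first = observe(models, {"mug": 0.91, "banana": 0.07})    # a confident vote
second = observe(models, {"mug": 0.22, "banana": 0.18})   # no confident vote

print(first, second, models)
```

The second observation falls below the threshold for every known category, so it is treated as a new category and the list of models grows by one.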

9.
Front Neurorobot ; 16: 795846, 2022.
Article in English | MEDLINE | ID: mdl-35360827

ABSTRACT

Although still not fully understood, sleep is known to play an important role in learning and in pruning synaptic connections. From the active inference perspective, this can be cast as learning parameters of a generative model and Bayesian model reduction, respectively. In this article, we show how to reduce dimensionality of the latent space of such a generative model, and hence model complexity, in deep active inference during training through a similar process. While deep active inference uses deep neural networks for state space construction, an issue remains in that the dimensionality of the latent space must be specified beforehand. We investigate two methods that are able to prune the latent space of deep active inference models. The first approach functions similarly to sleep and performs model reduction post hoc. The second approach is a novel method which is more similar to reflection, operates during training and displays "aha" moments when the model is able to reduce latent space dimensionality. We show for two well-known simulated environments that model performance is retained in the first approach and only diminishes slightly in the second approach. We also show that reconstructions from a real-world example are indistinguishable before and after reduction. We conclude that the most important difference constitutes a trade-off between training time and model performance in terms of accuracy and the ability to generalize, via minimization of model complexity.

11.
Entropy (Basel) ; 24(2), 2022 Feb 21.
Article in English | MEDLINE | ID: mdl-35205595

ABSTRACT

The free energy principle, and its corollary active inference, constitute a bio-inspired theory that assumes biological agents act to remain in a restricted set of preferred states of the world, i.e., they minimize their free energy. Under this principle, biological agents learn a generative model of the world and plan actions in the future that will maintain the agent in a homeostatic state that satisfies its preferences. This framework lends itself to being realized in silico, as it comprises important aspects that make it computationally affordable, such as variational inference and amortized planning. In this work, we investigate the tool of deep learning to design and realize artificial agents based on active inference, giving a deep-learning-oriented presentation of the free energy principle, surveying works that are relevant in both machine learning and active inference areas, and discussing the design choices that are involved in the implementation process. This manuscript probes newer perspectives for the active inference framework, grounding its theoretical aspects into more pragmatic affairs, offering a practical guide to active inference newcomers and a starting point for deep learning practitioners who would like to investigate implementations of the free energy principle.

12.
Sensors (Basel) ; 21(19), 2021 Sep 29.
Article in English | MEDLINE | ID: mdl-34640843

ABSTRACT

Deep neural networks have achieved state-of-the-art performance in image classification. Due to this success, deep learning is now also being applied to other data modalities such as multispectral images, lidar and radar data. However, successfully training a deep neural network requires a large dataset. Therefore, transitioning to a new sensor modality (e.g., from regular camera images to multispectral camera images) might result in a drop in performance, due to the limited availability of data in the new modality. This might hinder the adoption rate and time to market for new sensor technologies. In this paper, we present an approach to leverage the knowledge of a teacher network that was trained using the original data modality, to improve the performance of a student network on a new data modality: a technique known in the literature as knowledge distillation. By applying knowledge distillation to the problem of sensor transition, we can greatly speed up this process. We validate this approach using a multimodal version of the MNIST dataset. Especially when little data is available in the new modality (i.e., 10 images), training with additional teacher supervision results in increased performance, with the student network scoring a test set accuracy of 0.77, compared to an accuracy of 0.37 for the baseline. We also explore two extensions to the default method of knowledge distillation, which we evaluate on a multimodal version of the CIFAR-10 dataset: an annealing scheme for the hyperparameter α and selective knowledge distillation. Of these two, the first yields the best results. Choosing the optimal annealing scheme results in an increase in test set accuracy of 6%. Finally, we apply our method to the real-world use case of skin lesion classification.


Subjects
Skin Diseases , Humans , Neural Networks, Computer
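
A knowledge-distillation loss with an annealed α, as mentioned in the abstract above, might look roughly like the sketch below. It follows the commonly used temperature-softened formulation and is not taken from the paper: which term α weights, the temperature, and the linear schedule with its start and end values are all assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label, alpha, temperature=2.0):
    """alpha * cross-entropy on the true label + (1 - alpha) * T^2 * KL(teacher || student)."""
    hard = -math.log(softmax(student_logits)[label])
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft

def linear_alpha(epoch, total_epochs, start=0.1, end=0.9):
    """Assumed annealing scheme: move alpha linearly from start to end."""
    frac = epoch / max(total_epochs - 1, 1)
    return start + frac * (end - start)

teacher = [2.0, 0.5, -1.0]
student = [1.0, 1.0, 0.0]
for epoch in (0, 5, 9):
    a = linear_alpha(epoch, 10)
    print(f"epoch {epoch}: alpha = {a:.2f}, loss = {distillation_loss(student, teacher, 0, a):.4f}")
```

With this schedule the student leans on the teacher early in training and on the ground-truth labels later; the opposite direction is equally easy to express by swapping `start` and `end`.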
13.
Neural Netw ; 142: 192-204, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34022669

ABSTRACT

Localization and mapping has been a long-standing area of research, both in neuroscience, to understand how mammals navigate their environment, and in robotics, to enable autonomous mobile robots. In this paper, we treat navigation as inferring actions that minimize (expected) variational free energy under a hierarchical generative model. We find that familiar concepts like perception, path integration, localization and mapping naturally emerge from this active inference formulation. Moreover, we show that this model is consistent with models of hippocampal functions, and can be implemented in silico on a real-world robot. Our experiments illustrate that a robot equipped with our hierarchical model is able to generate topologically consistent maps, and correct navigation behaviour is inferred when a goal location is provided to the system.


Subjects
Robotics , Algorithms , Animals , Computer Simulation , Hippocampus
14.
Front Neurorobot ; 15: 642780, 2021.
Article in English | MEDLINE | ID: mdl-33746730

ABSTRACT

Occlusions, restricted field of view and limited resolution all constrain a robot's ability to sense its environment from a single observation. In these cases, the robot first needs to actively query multiple observations and accumulate information before it can complete a task. In this paper, we cast this problem of active vision as active inference, which states that an intelligent agent maintains a generative model of its environment and acts in order to minimize its surprise, or expected free energy according to this model. We apply this to an object-reaching task for a 7-DOF robotic manipulator with an in-hand camera to scan the workspace. A novel generative model using deep neural networks is proposed that is able to fuse multiple views into an abstract representation and is trained from data by minimizing variational free energy. We validate our approach experimentally for a reaching task in simulation in which a robotic agent starts without any knowledge about its workspace. At each step, the next view pose is chosen by evaluating the expected free energy. We find that by minimizing the expected free energy, exploratory behavior emerges when the target object to reach is not in view, and the end effector is moved to the correct reach position once the target is located. Similar to an owl scavenging for prey, the robot naturally prefers higher ground for exploring, approaching its target once located.

15.
Front Comput Neurosci ; 14: 574372, 2020.
Article in English | MEDLINE | ID: mdl-33304260

ABSTRACT

In this paper we investigate the active inference framework as a means to enable autonomous behavior in artificial agents. Active inference is a theoretical framework underpinning the way organisms act and observe in the real world. In active inference, agents act in order to minimize their so-called free energy, or prediction error. Besides being biologically plausible, active inference has been shown to solve hard exploration problems in various simulated environments. However, these simulations typically require handcrafting a generative model for the agent. Therefore, we propose to use recent advances in deep artificial neural networks to learn generative state space models from scratch, using only observation-action sequences. This way we are able to scale active inference to new and challenging problem domains, whilst still building on the theoretical backing of the free energy principle. We validate our approach on the mountain car problem to illustrate that our learnt models can indeed trade off instrumental value and ambiguity. Furthermore, we show that generative models can also be learnt using high-dimensional pixel observations, both in the OpenAI Gym car racing environment and a real-world robotic navigation task. Finally, we show that active-inference-based policies are an order of magnitude more sample efficient than Deep Q Networks on RL tasks.
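
The trade-off between instrumental value and ambiguity mentioned in the abstract above can be illustrated with a toy discrete calculation of expected free energy. The sketch uses the common "risk plus ambiguity" form for a single time step; the two-state world, the likelihood table, the preference distribution, and the policy names are all made up for illustration and are unrelated to the paper's learnt models.

```python
import math

def expected_free_energy(q_states, likelihood, preferences):
    """G = KL[q(o) || preferred(o)] (risk) + E_q(s)[H[p(o | s)]] (ambiguity).

    likelihood[s][o] is p(o | s); preferences[o] is the preferred outcome distribution.
    """
    n_obs = len(preferences)
    q_obs = [sum(q_states[s] * likelihood[s][o] for s in range(len(q_states)))
             for o in range(n_obs)]
    risk = sum(qo * math.log(qo / preferences[o])
               for o, qo in enumerate(q_obs) if qo > 0)
    ambiguity = sum(qs * -sum(p * math.log(p) for p in likelihood[s] if p > 0)
                    for s, qs in enumerate(q_states))
    return risk + ambiguity

likelihood = [[0.9, 0.1],    # state 0 gives a fairly unambiguous observation
              [0.5, 0.5]]    # state 1 gives a maximally ambiguous one
preferences = [0.8, 0.2]     # the agent prefers observation 0

# Each hypothetical policy is summarized by the state distribution it predicts.
policies = {"to_state_0": [1.0, 0.0], "to_state_1": [0.0, 1.0]}
scores = {name: expected_free_energy(q, likelihood, preferences)
          for name, q in policies.items()}
best = min(scores, key=scores.get)

for name, g in scores.items():
    print(f"{name}: G = {g:.4f}")
print("selected policy:", best)
```

In this toy setting the policy leading to state 0 wins on both counts: its predicted observations are closer to the preferred ones and the observations it yields are less ambiguous.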
