Results 1 - 20 of 39
1.
Entropy (Basel) ; 24(12)2022 Nov 27.
Article in English | MEDLINE | ID: mdl-36554136

ABSTRACT

We define common thermodynamic concepts purely within the framework of general Markov chains and derive Jarzynski's equality and Crooks' fluctuation theorem in this setup. In particular, we regard the discrete-time case, which leads to an asymmetry in the definition of work that appears in the usual formulation of Crooks' fluctuation theorem. We show how this asymmetry can be avoided with an additional condition regarding the energy protocol. The general formulation in terms of Markov chains allows transferring the results to other application areas outside of physics. Here, we discuss how this framework can be applied in the context of decision-making. This involves the definition of the relevant quantities, the assumptions that need to be made for the different fluctuation theorems to hold, as well as the consideration of discrete trajectories instead of the continuous trajectories, which are relevant in physics.
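As an illustration of Jarzynski's equality in a discrete-time Markov chain, the following is a minimal sketch, not the authors' construction. It assumes one particular ordering (a work step in which the energy changes at fixed state, followed by a relaxation step under the new energy), a Metropolis kernel, and a made-up three-state energy protocol; the function names are hypothetical. The average of exp(-beta W) is computed exactly by propagating weights rather than by sampling.

```python
import numpy as np

def boltzmann(energy, beta):
    """Equilibrium distribution p(x) proportional to exp(-beta * E(x))."""
    w = np.exp(-beta * energy)
    return w / w.sum()

def metropolis_kernel(energy, beta):
    """Row-stochastic Metropolis kernel whose stationary law is the Boltzmann distribution."""
    n = len(energy)
    K = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                # Uniform proposal over the other states, Metropolis acceptance.
                K[i, j] = min(1.0, np.exp(-beta * (energy[j] - energy[i]))) / (n - 1)
        K[i, i] = 1.0 - K[i].sum()  # remaining mass stays in place
    return K

def jarzynski_check(protocol, beta):
    """Return (<exp(-beta W)>, exp(-beta dF)) for a discrete energy protocol."""
    # weights[x] = E[exp(-beta * work so far) * 1{X = x}], starting in equilibrium.
    weights = boltzmann(protocol[0], beta)
    for e_old, e_new in zip(protocol[:-1], protocol[1:]):
        weights = weights * np.exp(-beta * (e_new - e_old))  # work step (assumed to come first)
        weights = weights @ metropolis_kernel(e_new, beta)    # relaxation under the new energy
    z_start = np.exp(-beta * protocol[0]).sum()
    z_end = np.exp(-beta * protocol[-1]).sum()
    return weights.sum(), z_end / z_start

# Hypothetical protocol: each row is the energy function at one time step.
protocol = np.array([[0.0, 1.0, 2.0], [0.5, 0.2, 1.5], [2.0, 0.0, 1.0]])
lhs, rhs = jarzynski_check(protocol, beta=1.3)
```

The two returned numbers agree because each relaxation kernel leaves the Boltzmann weights of the current energy unchanged; swapping the order of work and relaxation steps changes the definition of work, which is the asymmetry the abstract refers to.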

2.
Proc Biol Sci ; 288(1962): 20212094, 2021 11 10.
Article in English | MEDLINE | ID: mdl-34727714

ABSTRACT

The Nash equilibrium is one of the most central solution concepts to study strategic interactions between multiple players and has recently also been shown to capture sensorimotor interactions between players that are haptically coupled. While previous studies in behavioural economics have shown that systematic deviations from Nash equilibria in economic decision-making can be explained by the more general quantal response equilibria, such deviations have not been reported for the sensorimotor domain. Here we investigate haptically coupled dyads across three different sensorimotor games corresponding to the classic symmetric and asymmetric Prisoner's Dilemma, where the quantal response equilibrium predicts characteristic shifts across the three games, although the Nash equilibrium stays the same. We find that subjects exhibit the predicted deviations from the Nash solution. Furthermore, we show that taking into account subjects' priors for the games, we arrive at a more accurate description of bounded rational response equilibria that can be regarded as a quantal response equilibrium with non-uniform prior. Our results suggest that bounded rational response equilibria provide a general tool to explain sensorimotor interactions that include the Nash equilibrium as a special case in the absence of information processing limitations.


Subject(s)
Cognition; Prisoner Dilemma; Game Theory; Humans
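To make the idea of a quantal response equilibrium with a prior concrete, here is a minimal sketch for a symmetric two-player matrix game. The payoff values, the damped fixed-point iteration, and all names (`softmax_response`, `logit_qre`, `rationality`) are assumptions for illustration and are not taken from the study.

```python
import numpy as np

def softmax_response(utilities, rationality, prior):
    """Bounded rational response: prior * exp(rationality * utility), normalized."""
    logits = np.log(prior) + rationality * utilities
    logits = logits - logits.max()  # numerical stability
    p = np.exp(logits)
    return p / p.sum()

def logit_qre(payoff, rationality, prior=None, iters=2000, damping=0.5):
    """Damped fixed-point iteration for a symmetric two-player game.

    payoff[i, j] is a player's payoff for own action i against opponent action j.
    """
    n = payoff.shape[0]
    prior = np.full(n, 1.0 / n) if prior is None else np.asarray(prior, dtype=float)
    p, q = prior.copy(), prior.copy()
    for _ in range(iters):
        p_new = softmax_response(payoff @ q, rationality, prior)
        q_new = softmax_response(payoff @ p, rationality, prior)
        p = (1 - damping) * p + damping * p_new
        q = (1 - damping) * q + damping * q_new
    return p, q

# Hypothetical Prisoner's Dilemma payoffs; action 0 = cooperate, action 1 = defect.
PD = np.array([[3.0, 0.0], [5.0, 1.0]])
p_random, _ = logit_qre(PD, rationality=0.0)   # no information processing: follows the prior
p_medium, _ = logit_qre(PD, rationality=1.0)   # intermediate: noisy preference for defection
p_sharp, _ = logit_qre(PD, rationality=5.0)    # close to the Nash solution (defect)
```

With zero rationality the response equals the prior, and as the rationality parameter grows the solution approaches the Nash equilibrium, which matches the abstract's statement that the Nash equilibrium is a special case. A non-uniform `prior` shifts the intermediate solutions.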
3.
PLoS Comput Biol ; 16(12): e1008420, 2020 12.
Article in English | MEDLINE | ID: mdl-33270644

ABSTRACT

The concept of free energy has its origins in 19th century thermodynamics, but has recently found its way into the behavioral and neural sciences, where it has been promoted for its wide applicability and has even been suggested as a fundamental principle of understanding intelligent behavior and brain function. We argue that there are essentially two different notions of free energy in current models of intelligent agency, that can both be considered as applications of Bayesian inference to the problem of action selection: one that appears when trading off accuracy and uncertainty based on a general maximum entropy principle, and one that formulates action selection in terms of minimizing an error measure that quantifies deviations of beliefs and policies from given reference models. The first approach provides a normative rule for action selection in the face of model uncertainty or when information processing capabilities are limited. The second approach directly aims to formulate the action selection problem as an inference problem in the context of Bayesian brain theories, also known as Active Inference in the literature. We elucidate the main ideas and discuss critical technical and conceptual issues revolving around these two notions of free energy that both claim to apply at all levels of decision-making, from the high-level deliberation of reasoning down to the low-level information processing of perception.


Subject(s)
Bayes Theorem; Entropy; Models, Neurological; Humans; Probability; Uncertainty
4.
Neural Comput ; 31(2): 440-476, 2019 02.
Article in English | MEDLINE | ID: mdl-30576612

ABSTRACT

Specialization and hierarchical organization are important features of efficient collaboration in economical, artificial, and biological systems. Here, we investigate the hypothesis that both features can be explained by the fact that each entity of such a system is limited in a certain way. We propose an information-theoretic approach based on a free energy principle in order to computationally analyze systems of bounded rational agents that deal with such limitations optimally. We find that specialization allows a focus on fewer tasks, thus leading to a more efficient execution, but in turn, it requires coordination in hierarchical structures of specialized experts and coordinating units. Our results suggest that hierarchical architectures of specialized units at lower levels that are coordinated by units at higher levels are optimal, given that each unit's information-processing capability is limited and conforms to constraints on complexity costs.

5.
Entropy (Basel) ; 21(4)2019 Apr 06.
Article in English | MEDLINE | ID: mdl-33267089

ABSTRACT

In its most basic form, decision-making can be viewed as a computational process that progressively eliminates alternatives, thereby reducing uncertainty. Such processes are generally costly, meaning that the amount of uncertainty that can be reduced is limited by the amount of available computational resources. Here, we introduce the notion of elementary computation based on a fundamental principle for probability transfers that reduce uncertainty. Elementary computations can be considered as the inverse of Pigou-Dalton transfers applied to probability distributions, closely related to the concepts of majorization, T-transforms, and generalized entropies that induce a preorder on the space of probability distributions. Consequently, we can define resource cost functions that are order-preserving and therefore monotonic with respect to the uncertainty reduction. This leads to a comprehensive notion of decision-making processes with limited resources. Along the way, we prove several new results on majorization theory, as well as on entropy and divergence measures.
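The link between inverse Pigou-Dalton transfers and majorization can be sketched in a few lines. This is an illustrative example under simple assumptions (finite distributions, one transfer between two entries); the function names are hypothetical and do not come from the paper.

```python
import numpy as np

def majorizes(p, q, tol=1e-12):
    """True if p majorizes q: partial sums of p (sorted descending) dominate those of q."""
    cp = np.cumsum(np.sort(p)[::-1])
    cq = np.cumsum(np.sort(q)[::-1])
    return bool(np.all(cp >= cq - tol))

def inverse_pigou_dalton(p, i, j, eps):
    """Move probability mass eps from the smaller entry j to the larger entry i."""
    p = np.asarray(p, dtype=float)
    if not (p[i] >= p[j] and 0.0 <= eps <= p[j]):
        raise ValueError("need p[i] >= p[j] and 0 <= eps <= p[j]")
    out = p.copy()
    out[i] += eps
    out[j] -= eps
    return out

def shannon_entropy(p):
    """Shannon entropy in nats, ignoring zero entries."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return float(-(nz * np.log(nz)).sum())

p = np.array([0.4, 0.3, 0.2, 0.1])
q = inverse_pigou_dalton(p, i=0, j=2, eps=0.1)  # mass moves from entry 2 to entry 0
```

Here `q` majorizes `p` and has lower Shannon entropy, so the transfer reduces uncertainty in the sense of the preorder described in the abstract.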

6.
Entropy (Basel) ; 20(1)2017 Dec 21.
Article in English | MEDLINE | ID: mdl-33265092

ABSTRACT

Living organisms from single cells to humans need to adapt continuously to respond to changes in their environment. The process of behavioural adaptation can be thought of as improving decision-making performance according to some utility function. Here, we consider an abstract model of organisms as decision-makers with limited information-processing resources that trade off between maximization of utility and computational costs measured by a relative entropy, in a similar fashion to thermodynamic systems undergoing isothermal transformations. Such systems minimize the free energy to reach equilibrium states that balance internal energy and entropic cost. When there is a fast change in the environment, these systems evolve in a non-equilibrium fashion because they are unable to follow the path of equilibrium distributions. Here, we apply concepts from non-equilibrium thermodynamics to characterize decision-makers that adapt to changing environments under the assumption that the temporal evolution of the utility function is externally driven and does not depend on the decision-maker's action. This allows one to quantify performance loss due to imperfect adaptation in a general manner and, additionally, to find relations for decision-making similar to Crooks' fluctuation theorem and Jarzynski's equality. We provide simulations of several exemplary decision and inference problems in the discrete and continuous domains to illustrate the new relations.

7.
PLoS Comput Biol ; 11(8): e1004369, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26305797

ABSTRACT

Previous studies have shown that sensorimotor processing can often be described by Bayesian learning, in particular the integration of prior and feedback information depending on its degree of reliability. Here we test the hypothesis that the integration process itself can be tuned to the statistical structure of the environment. We exposed human participants to a reaching task in a three-dimensional virtual reality environment where we could displace the visual feedback of their hand position in a two dimensional plane. When introducing statistical structure between the two dimensions of the displacement, we found that over the course of several days participants adapted their feedback integration process in order to exploit this structure for performance improvement. In control experiments we found that this adaptation process critically depended on performance feedback and could not be induced by verbal instructions. Our results suggest that structural learning is an important meta-learning component of Bayesian sensorimotor integration.


Subject(s)
Feedback, Sensory/physiology; Learning/physiology; Psychomotor Performance/physiology; Bayes Theorem; Computational Biology; Female; Hand/physiology; Humans; Male
8.
Biol Cybern ; 110(2-3): 135-50, 2016 06.
Article in English | MEDLINE | ID: mdl-27023096

ABSTRACT

Bayesian inference and bounded rational decision-making require the accumulation of evidence or utility, respectively, to transform a prior belief or strategy into a posterior probability distribution over hypotheses or actions. Crucially, this process cannot be simply realized by independent integrators, since the different hypotheses and actions also compete with each other. In continuous time, this competitive integration process can be described by a special case of the replicator equation. Here we investigate simple analog electric circuits that implement the underlying differential equation under the constraint that we only permit a limited set of building blocks that we regard as biologically interpretable, such as capacitors, resistors, voltage-dependent conductances and voltage- or current-controlled current and voltage sources. The appeal of these circuits is that they intrinsically perform normalization without requiring an explicit divisive normalization. However, even in idealized simulations, we find that these circuits are very sensitive to internal noise as they accumulate error over time. We discuss to what extent neural circuits could implement these operations, which might provide a generic competitive principle underlying both perception and action.


Subject(s)
Bayes Theorem; Computer Simulation; Decision Making; Feedback; Cybernetics; Electricity; Perception; Probability
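The competitive integration described in this abstract can be illustrated numerically. The sketch below assumes the replicator form dp_i/dt = p_i (u_i - sum_j p_j u_j) with a constant utility vector, integrates it with a plain Euler scheme, and compares the result with the normalized exponential update p_i(t) proportional to p_i(0) exp(u_i t). The utilities, step size, and function names are hypothetical; no circuit model is implied.

```python
import numpy as np

def replicator_euler(p0, utility, t_end, dt=1e-3):
    """Euler integration of dp_i/dt = p_i * (u_i - <u>) for constant utilities."""
    p = np.asarray(p0, dtype=float).copy()
    u = np.asarray(utility, dtype=float)
    for _ in range(int(round(t_end / dt))):
        p += dt * p * (u - p @ u)  # growth relative to the population average
    return p

def closed_form(p0, utility, t_end):
    """Normalized exponential weighting of the prior, the exact solution for constant u."""
    w = np.asarray(p0, dtype=float) * np.exp(np.asarray(utility, dtype=float) * t_end)
    return w / w.sum()

prior = np.array([1 / 3, 1 / 3, 1 / 3])
utility = np.array([1.0, 0.5, 0.0])  # hypothetical evidence or utility rates
numeric = replicator_euler(prior, utility, t_end=2.0)
exact = closed_form(prior, utility, t_end=2.0)
```

The update keeps the probabilities normalized without an explicit division, which is the property the abstract highlights; the closed form makes the connection to a Bayesian posterior or a softmax strategy explicit.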
9.
Neural Comput ; 27(8): 1686-720, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26079747

ABSTRACT

Rate distortion theory describes how to communicate relevant information most efficiently over a channel with limited capacity. One of the many applications of rate distortion theory is bounded rational decision making, where decision makers are modeled as information channels that transform sensory input into motor output under the constraint that their channel capacity is limited. Such a bounded rational decision maker can be thought to optimize an objective function that trades off the decision maker's utility or cumulative reward against the information processing cost measured by the mutual information between sensory input and motor output. In this study, we interpret a spiking neuron as a bounded rational decision maker that aims to maximize its expected reward under the computational constraint that the mutual information between the neuron's input and output is upper bounded. This abstract computational constraint translates into a penalization of the deviation between the neuron's instantaneous and average firing behavior. We derive a synaptic weight update rule for such a rate distortion optimizing neuron and show in simulations that the neuron efficiently extracts reward-relevant information from the input by trading off its synaptic strengths against the collected reward.


Subject(s)
Action Potentials/physiology; Decision Making/physiology; Models, Neurological; Neurons/physiology; Reward; Animals; Computer Simulation; Electronic Data Processing; Humans
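The trade-off between expected utility and mutual information can be sketched with a Blahut-Arimoto-style iteration. This is a generic illustration of the rate distortion objective for discrete inputs and outputs, not the spiking neuron model or the synaptic update rule of the paper; the utility matrix, the trade-off parameter `beta`, and the function names are assumptions.

```python
import numpy as np

def bounded_rational_channel(utility, p_s, beta, iters=500):
    """Alternate p(a|s) ~ p(a) * exp(beta * U(s, a)) and p(a) = sum_s p(s) p(a|s)."""
    n_s, n_a = utility.shape
    p_a = np.full(n_a, 1.0 / n_a)
    for _ in range(iters):
        # Clip to avoid log(0) if a marginal entry underflows at large beta.
        logits = np.log(np.maximum(p_a, 1e-300))[None, :] + beta * utility
        logits -= logits.max(axis=1, keepdims=True)
        p_a_given_s = np.exp(logits)
        p_a_given_s /= p_a_given_s.sum(axis=1, keepdims=True)
        p_a = p_s @ p_a_given_s
    return p_a_given_s, p_a

def mutual_information(p_a_given_s, p_s):
    """Mutual information I(S; A) in nats."""
    p_a = p_s @ p_a_given_s
    joint = p_s[:, None] * p_a_given_s
    mask = joint > 0
    ratio = p_a_given_s[mask] / np.broadcast_to(p_a, p_a_given_s.shape)[mask]
    return float((joint[mask] * np.log(ratio)).sum())

# Hypothetical task: two inputs, two outputs, reward 1 for the matching output.
U = np.eye(2)
p_s = np.array([0.5, 0.5])
cond_free, _ = bounded_rational_channel(U, p_s, beta=0.0)   # no information is used
cond_tight, _ = bounded_rational_channel(U, p_s, beta=5.0)  # input strongly drives output
mi_free = mutual_information(cond_free, p_s)
mi_tight = mutual_information(cond_tight, p_s)
```

Larger `beta` corresponds to a looser information constraint: the channel transmits more bits and collects more expected utility.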
10.
Proc Biol Sci ; 281(1783): 20132952, 2014 May 22.
Article in English | MEDLINE | ID: mdl-24671968

ABSTRACT

A large number of recent studies suggest that the sensorimotor system uses probabilistic models to predict its environment and makes inferences about unobserved variables in line with Bayesian statistics. One of the important features of Bayesian statistics is Occam's Razor, an inbuilt preference for simpler models when comparing competing models that explain some observed data equally well. Here, we test directly for Occam's Razor in sensorimotor control. We designed a sensorimotor task in which participants had to draw lines through clouds of noisy samples of an unobserved curve generated by one of two possible probabilistic models: a simple model with a large length scale, leading to smooth curves, and a complex model with a short length scale, leading to more wiggly curves. In training trials, participants were informed about the model that generated the stimulus so that they could learn the statistics of each model. In probe trials, participants were then exposed to ambiguous stimuli. In probe trials where the ambiguous stimulus could be fitted equally well by both models, we found that participants showed a clear preference for the simpler model. Moreover, we found that participants' choice behaviour was quantitatively consistent with Bayesian Occam's Razor. We also show that participants' drawn trajectories were similar to samples from the Bayesian predictive distribution over trajectories and significantly different from two non-probabilistic heuristics. In two control experiments, we show that the preference for the simpler model cannot be simply explained by a difference in physical effort or by a preference for curve smoothness. Our results suggest that Occam's Razor is a general behavioural principle already present during sensorimotor processing.


Subject(s)
Feedback, Sensory; Learning; Bayes Theorem; Female; Humans; Male; Models, Statistical; Young Adult
11.
PLoS Comput Biol ; 8(9): e1002698, 2012.
Article in English | MEDLINE | ID: mdl-23028294

ABSTRACT

Information processing in the nervous system during sensorimotor tasks with inherent uncertainty has been shown to be consistent with Bayesian integration. Bayes optimal decision-makers are, however, risk-neutral in the sense that they weigh all possibilities based on prior expectation and sensory evidence when they choose the action with highest expected value. In contrast, risk-sensitive decision-makers are sensitive to model uncertainty and bias their decision-making processes when they do inference over unobserved variables. In particular, they allow deviations from their probabilistic model in cases where this model makes imprecise predictions. Here we test for risk-sensitivity in a sensorimotor integration task where subjects exhibit Bayesian information integration when they infer the position of a target from noisy sensory feedback. When introducing a cost associated with subjects' response, we found that subjects exhibited a characteristic bias towards low cost responses when their uncertainty was high. This result is in accordance with risk-sensitive decision-making processes that allow for deviations from Bayes optimal decision-making in the face of uncertainty. Our results suggest that both Bayesian integration and risk-sensitivity are important factors to understand sensorimotor integration in a quantitative fashion.


Subject(s)
Brain/physiology; Cues; Decision Making/physiology; Feedback, Sensory/physiology; Models, Neurological; Pattern Recognition, Visual/physiology; Bayes Theorem; Computer Simulation; Humans; Models, Statistical; Risk Assessment
12.
J Neurophysiol ; 107(4): 1111-22, 2012 Feb.
Article in English | MEDLINE | ID: mdl-22131385

ABSTRACT

Motor task variation has been shown to be a key ingredient in skill transfer, retention, and structural learning. However, many studies only compare training of randomly varying tasks to either blocked or null training, and it is not clear how experiencing different nonrandom temporal orderings of tasks might affect the learning process. Here we study learning in human subjects who experience the same set of visuomotor rotations, evenly spaced between -60° and +60°, either in a random order or in an order in which the rotation angle changed gradually. We compared subsequent learning of three test blocks of +30° → -30° → +30° rotations. The groups that underwent either random or gradual training showed significant (P < 0.01) facilitation of learning in the test blocks compared with a control group who had not experienced any visuomotor rotations before. We also found that movement initiation times in the random group during the test blocks were significantly (P < 0.05) lower than for the gradual or the control group. When we fit a state-space model with fast and slow learning processes to our data, we found that the differences in performance in the test block were consistent with the gradual or random task variation changing the learning and retention rates of only the fast learning process. Such adaptation of learning rates may be a key feature of ongoing meta-learning processes. Our results therefore suggest that both gradual and random task variation can induce meta-learning and that random learning has an advantage in terms of shorter initiation times, suggesting less reliance on cognitive processes.


Subject(s)
Adaptation, Physiological/physiology; Learning/physiology; Movement/physiology; Psychomotor Performance/physiology; Random Allocation; Female; Hand/physiology; Humans; Male; Models, Psychological; Rotation
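A state-space model with fast and slow learning processes, as mentioned in this abstract, is commonly written with one retention rate and one learning rate per process. The sketch below uses that generic two-rate form; the parameter values and names (`a_fast`, `b_fast`, `a_slow`, `b_slow`) are made up for illustration and are not the fitted values from the study.

```python
import numpy as np

def two_rate_adaptation(perturbation, a_fast=0.6, b_fast=0.2, a_slow=0.99, b_slow=0.02):
    """Simulate fast and slow adaptation states driven by a shared error signal.

    a_* are retention rates, b_* are learning rates (hypothetical values).
    Returns the fast and slow states at the start of each trial.
    """
    perturbation = np.asarray(perturbation, dtype=float)
    x_fast, x_slow = 0.0, 0.0
    fast, slow = [], []
    for rotation in perturbation:
        fast.append(x_fast)
        slow.append(x_slow)
        error = rotation - (x_fast + x_slow)      # residual error on this trial
        x_fast = a_fast * x_fast + b_fast * error  # learns quickly, forgets quickly
        x_slow = a_slow * x_slow + b_slow * error  # learns slowly, retains well
    return np.array(fast), np.array(slow)

# Hypothetical test block: 200 trials of a constant +30 degree rotation.
fast, slow = two_rate_adaptation(np.full(200, 30.0))
net = fast + slow
```

In this form, a change in the learning and retention rates of only the fast process corresponds to changing `a_fast` and `b_fast` while leaving the slow parameters fixed.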
13.
PLoS Comput Biol ; 7(3): e1001112, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21483475

ABSTRACT

Sensorimotor learning has been shown to depend on both prior expectations and sensory evidence in a way that is consistent with Bayesian integration. Thus, prior beliefs play a key role during the learning process, especially when only ambiguous sensory information is available. Here we develop a novel technique to estimate the covariance structure of the prior over visuomotor transformations (the mapping between actual and visual location of the hand) during a learning task. Subjects performed reaching movements under multiple visuomotor transformations in which they received visual feedback of their hand position only at the end of the movement. After experiencing a particular transformation for one reach, subjects have insufficient information to determine the exact transformation, and so their second reach reflects a combination of their prior over visuomotor transformations and the sensory evidence from the first reach. We developed a Bayesian observer model in order to infer the covariance structure of the subjects' prior, which was found to give high probability to parameter settings consistent with visuomotor rotations. Therefore, although the set of visuomotor transformations experienced had little structure, the subjects had a strong tendency to interpret ambiguous sensory evidence as arising from rotation-like transformations. We then exposed the same subjects to a highly structured set of visuomotor transformations, designed to be very different from the set of visuomotor rotations. During this exposure the prior was found to have changed significantly to have a covariance structure that no longer favored rotation-like transformations. In summary, we have developed a technique which can estimate the full covariance structure of a prior in a sensorimotor task and have shown that the prior over visuomotor transformations favors a rotation-like structure. Moreover, through experience of a novel task structure, participants can appropriately alter the covariance structure of their prior.


Subject(s)
Feedback, Sensory/physiology; Psychomotor Performance; Algorithms; Bayes Theorem; Computers; Hand; Humans; Informatics/methods; Learning; Models, Theoretical; Motor Skills; Movement; Normal Distribution; Robotics; Software
14.
Theory Decis ; 93(4): 663-690, 2022.
Article in English | MEDLINE | ID: mdl-36245967

ABSTRACT

We introduce a new class of real-valued monotones in preordered spaces, injective monotones. We show that the class of preorders for which they exist lies in between the class of preorders with strict monotones and preorders with countable multi-utilities, improving upon the known classification of preordered spaces through real-valued monotones. We extend several well-known results for strict monotones (Richter-Peleg functions) to injective monotones, we provide a construction of injective monotones from countable multi-utilities, and relate injective monotones to classic results concerning Debreu denseness and order separability. Along the way, we connect our results to Shannon entropy and the uncertainty preorder, obtaining new insights into how they are related. In particular, we show how injective monotones can be used to generalize some appealing properties of Jaynes' maximum entropy principle, which is considered a basis for statistical inference and serves as a justification for many regularization techniques that appear throughout machine learning and decision theory.

15.
Front Neurosci ; 16: 906198, 2022.
Article in English | MEDLINE | ID: mdl-36248642

ABSTRACT

Bayes optimal and heuristic decision-making schemes are often considered fundamentally opposed to each other as a framework for studying human choice behavior, although recently it has been proposed that bounded rationality may provide a natural bridge between the two when varying information-processing resources. Here, we investigate a two-alternative forced choice task with varying time constraints, where subjects have to assign multi-component symbolic patterns to one of two stimulus classes. As expected, we find that subjects' response behavior becomes more imprecise with more time pressure. However, we also see that their response behavior changes qualitatively. By regressing subjects' decision weights, we find that decisions allowing for plenty of decision time rely on weighing multiple stimulus features, whereas decisions under high time pressure are made mostly based on a single feature. While the first response pattern is in line with a Bayes-optimal decision strategy, the latter could be considered as an instantiation of heuristic decision-making with cue discounting. When fitting a bounded rational decision model with multiple feature channels and varying information-processing capacity to subjects' responses, we find that the model is able to capture subjects' behavioral change. The model successfully reflects the simplicity of heuristics as well as the efficiency of optimal decision making, thus acting as a bridge between the two approaches.

16.
J Neurophysiol ; 105(6): 2668-74, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21430284

ABSTRACT

When a racing driver steers a car around a sharp bend, there is a trade-off between speed and accuracy, in that high speed can lead to a skid whereas a low speed increases lap time, both of which can adversely affect the driver's payoff function. While speed-accuracy trade-offs have been studied extensively, their susceptibility to risk sensitivity is much less understood, since most theories of motor control are risk neutral with respect to payoff, i.e., they only consider mean payoffs and ignore payoff variability. Here we investigate how individual risk attitudes impact a motor task that involves such a speed-accuracy trade-off. We designed an experiment where a target had to be hit and the reward (given in points) increased as a function of both subjects' endpoint accuracy and endpoint velocity. As faster movements lead to poorer endpoint accuracy, the variance of the reward increased for higher velocities. We tested subjects on two reward conditions that had the same mean reward but differed in the variance of the reward. A risk-neutral account predicts that subjects should only maximize the mean reward and hence perform identically in the two conditions. In contrast, we found that some (risk-averse) subjects chose to move with lower velocities and other (risk-seeking) subjects with higher velocities in the condition with higher reward variance (risk). This behavior is suboptimal with regard to maximizing the mean number of points but is in accordance with a risk-sensitive account of movement selection. Our study suggests that individual risk sensitivity is an important factor in motor tasks with speed-accuracy trade-offs.


Subject(s)
Movement/physiology; Psychomotor Performance/physiology; Reaction Time/physiology; Risk-Taking; Adult; Female; Humans; Male; Models, Biological; Photic Stimulation; Reward; Young Adult
17.
Proc Biol Sci ; 278(1716): 2325-32, 2011 Aug 07.
Article in English | MEDLINE | ID: mdl-21208966

ABSTRACT

Numerous psychophysical studies suggest that the sensorimotor system chooses actions that optimize the average cost associated with a movement. Recently, however, violations of this hypothesis have been reported in line with economic theories of decision-making that not only consider the mean payoff, but are also sensitive to risk, that is the variability of the payoff. Here, we examine the hypothesis that risk-sensitivity in sensorimotor control arises as a mean-variance trade-off in movement costs. We designed a motor task in which participants could choose between a sure motor action that resulted in a fixed amount of effort and a risky motor action that resulted in a variable amount of effort that could be either lower or higher than the fixed effort. By changing the mean effort of the risky action while experimentally fixing its variance, we determined indifference points at which participants chose equiprobably between the sure, fixed amount of effort option and the risky, variable effort option. Depending on whether participants accepted a variable effort with a mean that was higher, lower or equal to the fixed effort, they could be classified as risk-seeking, risk-averse or risk-neutral. Most subjects were risk-sensitive in our task consistent with a mean-variance trade-off in effort, thereby, underlining the importance of risk-sensitivity in computational models of sensorimotor control.


Subject(s)
Choice Behavior/physiology; Decision Making/physiology; Feedback, Sensory; Psychomotor Performance/physiology; Risk-Taking; Adult; Female; Humans; Male; Models, Biological; Photic Stimulation
18.
PLoS Comput Biol ; 6(7): e1000857, 2010 Jul 15.
Article in English | MEDLINE | ID: mdl-20657657

ABSTRACT

Many aspects of human motor behavior can be understood using optimality principles such as optimal feedback control. However, these proposed optimal control models are risk-neutral; that is, they are indifferent to the variability of the movement cost. Here, we propose the use of a risk-sensitive optimal controller that incorporates movement cost variance either as an added cost (risk-averse controller) or as an added value (risk-seeking controller) to model human motor behavior in the face of uncertainty. We use a sensorimotor task to test the hypothesis that subjects are risk-sensitive. Subjects controlled a virtual ball undergoing Brownian motion towards a target. Subjects were required to minimize an explicit cost, in points, that was a combination of the final positional error of the ball and the integrated control cost. By testing subjects on different levels of Brownian motion noise and relative weighting of the position and control cost, we could distinguish between risk-sensitive and risk-neutral control. We show that subjects change their movement strategy pessimistically in the face of increased uncertainty in accord with the predictions of a risk-averse optimal controller. Our results suggest that risk-sensitivity is a fundamental attribute that needs to be incorporated into optimal feedback control models.


Subject(s)
Feedback, Sensory/physiology; Models, Biological; Risk-Taking; Uncertainty; Adult; Algorithms; Analysis of Variance; Computer Simulation; Female; Humans; Linear Models; Male
19.
Exp Brain Res ; 211(3-4): 631-41, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21455618

ABSTRACT

Trying to pass someone walking toward you in a narrow corridor is a familiar example of a two-person motor game that requires coordination. In this study, we investigate coordination in sensorimotor tasks that correspond to classic coordination games with multiple Nash equilibria, such as "choosing sides," "stag hunt," "chicken," and "battle of sexes". In these tasks, subjects made reaching movements reflecting their continuously evolving "decisions" while they received a continuous payoff in the form of a resistive force counteracting their movements. Successful coordination required two subjects to "choose" the same Nash equilibrium in this force-payoff landscape within a single reach. We found that on the majority of trials coordination was achieved. Compared to the proportion of trials in which miscoordination occurred, successful coordination was characterized by several distinct features: an increased mutual information between the players' movement endpoints, an increased joint entropy during the movements, and by differences in the timing of the players' responses. Moreover, we found that the probability of successful coordination depends on the players' initial distance from the Nash equilibria. Our results suggest that two-person coordination arises naturally in motor interactions and is facilitated by favorable initial positions, stereotypical motor pattern, and differences in response times.


Subject(s)
Cooperative Behavior; Movement/physiology; Psychomotor Performance/physiology; Humans; Models, Theoretical; Muscle, Skeletal/physiology
20.
Sci Rep ; 11(1): 20779, 2021 10 21.
Article in English | MEDLINE | ID: mdl-34675336

ABSTRACT

The Nash equilibrium concept has previously been shown to be an important tool to understand human sensorimotor interactions, where different actors vie for minimizing their respective effort while engaging in a multi-agent motor task. However, it is not clear how such equilibria are reached. Here, we compare different reinforcement learning models to human behavior engaged in sensorimotor interactions with haptic feedback based on three classic games, including the prisoner's dilemma, and the symmetric and asymmetric matching pennies games. We find that a discrete analysis that reduces the continuous sensorimotor interaction to binary choices as in classical matrix games does not allow one to distinguish between the different learning algorithms, but that a more detailed continuous analysis with continuous formulations of the learning algorithms and the game-theoretic solutions affords different predictions. In particular, we find that Q-learning with intrinsic costs that disfavor deviations from average behavior explains the observed data best, even though all learning algorithms equally converge to admissible Nash equilibrium solutions. We therefore conclude that it is important to study different learning algorithms for understanding sensorimotor interactions, as such behavior cannot be inferred from a game-theoretic analysis alone that simply focuses on the Nash equilibrium concept, since different learning algorithms impose preferences on the set of possible equilibrium solutions due to the inherent learning dynamics.


Subject(s)
Learning; Algorithms; Game Theory; Humans; Prisoner Dilemma
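For readers unfamiliar with the discrete setting mentioned in this abstract, the following is a minimal sketch of two independent, stateless Q-learners with softmax action selection in a repeated binary matrix game. It is a plain textbook variant: it does not include the intrinsic costs or the continuous formulation studied in the paper, and it uses rewards rather than effort costs. The payoff values, hyperparameters, and function names are assumptions.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = np.exp(x - np.max(x))
    return z / z.sum()

def q_learning_matrix_game(payoff, episodes=5000, alpha=0.1, temperature=0.5, seed=0):
    """Two independent stateless Q-learners in a symmetric repeated matrix game.

    payoff[i, j] is a player's reward for own action i against opponent action j.
    Returns an array q of shape (2, n_actions), one row per player.
    """
    rng = np.random.default_rng(seed)
    n = payoff.shape[0]
    q = np.zeros((2, n))
    for _ in range(episodes):
        a0 = rng.choice(n, p=softmax(q[0] / temperature))
        a1 = rng.choice(n, p=softmax(q[1] / temperature))
        rewards = (payoff[a0, a1], payoff[a1, a0])
        for player, (action, reward) in enumerate(zip((a0, a1), rewards)):
            # Move the chosen action's value toward the observed reward.
            q[player, action] += alpha * (reward - q[player, action])
    return q

# Hypothetical Prisoner's Dilemma rewards; action 0 = cooperate, action 1 = defect.
PD = np.array([[3.0, 0.0], [5.0, 1.0]])
q_values = q_learning_matrix_game(PD)
policies = np.array([softmax(row / 0.5) for row in q_values])
```

Comparing `policies` across learning rules is the kind of discrete analysis that, according to the abstract, does not separate the candidate algorithms, which motivates the continuous analysis used in the study.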