Results 1 - 20 of 29
1.
BMC Psychol ; 12(1): 245, 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38689352

ABSTRACT

Decision-making under uncertainty, a cornerstone of human cognition, is encapsulated by the "secretary problem" in optimal stopping theory. Our study examines this decision-making challenge, where participants are required to sequentially evaluate and make irreversible choices under conditions that simulate cognitive overload. We probed neurophysiological responses by engaging 27 students in a secretary problem simulation while undergoing EEG monitoring, focusing on Event-Related Potentials (ERPs) P200 and P400, and Theta to Beta Ratio (TBR) dynamics. Results revealed a nuanced pattern: the P200 component's amplitude declined from the initial to the middle offers, suggesting a diminishing attention span as participants grew accustomed to the task. This attenuation reversed at the final offer, indicating heightened cognitive processing as the task concluded. In contrast, the P400 component's amplitude peaked at the middle offer, hinting at increased cognitive evaluation, and tapered off at the final decision. Additionally, TBR dynamics illustrated a fluctuation in attentional control and emotional regulation throughout the decision-making sequence, enhancing our understanding of the cognitive strategies employed. The research elucidates the dynamic interplay of cognitive processes in high-stakes environments, with neurophysiological markers fluctuating significantly as participants navigated sequential choices. By correlating these fluctuations with decision-making behavior, we provide insights into the evolving strategies from heightened alertness to strategic evaluation. Our findings offer insights that could inform the use of neurophysiological data in the development of decision-making frameworks, potentially contributing to the practical application of cognitive research in real-life contexts.


Subjects
Attention; Decision Making; Electroencephalography; Evoked Potentials; Humans; Decision Making/physiology; Evoked Potentials/physiology; Male; Female; Young Adult; Attention/physiology; Adult; Cognition/physiology; Brain/physiology; Uncertainty; Theta Rhythm/physiology; Beta Rhythm/physiology
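
The secretary problem named in the abstract above has a well-known cutoff strategy in optimal stopping theory: observe roughly the first n/e offers without accepting, then take the first later offer that beats all of them. The sketch below simulates that rule; it is an illustration only, not the task or analysis used in the study, and the function names, the uniform offer values, and the default sizes are assumptions.

```python
import math
import random


def secretary_choice(values, skip_fraction=1 / math.e):
    """Return the index picked by the classical cutoff rule.

    The first floor(n * skip_fraction) offers are only observed. After that,
    the first offer that beats every observed one is accepted. If no such
    offer appears, the last offer is taken because choices are irreversible.
    """
    n = len(values)
    k = int(n * skip_fraction)
    best_seen = max(values[:k], default=float("-inf"))
    for i in range(k, n):
        if values[i] > best_seen:
            return i
    return n - 1


def success_rate(n=50, trials=20000, seed=0):
    """Estimate how often the rule ends up with the single best offer."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        values = [rng.random() for _ in range(n)]  # hypothetical offer values
        chosen = secretary_choice(values)
        hits += values[chosen] == max(values)
    return hits / trials


if __name__ == "__main__":
    print(f"estimated success rate: {success_rate():.3f}")
```

For long sequences the probability of ending with the best offer under this rule approaches 1/e, about 0.37, which gives a simple baseline against which human stopping behavior can be compared.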
2.
Neurosci Biobehav Rev ; 160: 105623, 2024 May.
Article in English | MEDLINE | ID: mdl-38490499

ABSTRACT

Foraging is a natural behavior that involves making sequential decisions to maximize rewards while minimizing the costs incurred when doing so. The prevalence of foraging across species suggests that a common brain computation underlies its implementation. Although anterior cingulate cortex is believed to contribute to foraging behavior, its specific role has been contentious, with predominant theories arguing either that it encodes environmental value or choice difficulty. Additionally, recent attempts to characterize foraging have taken place within the reinforcement learning framework, with increasingly complex models scaling with task complexity. Here we review reinforcement learning foraging models, highlighting the hierarchical structure of many foraging problems. We extend this literature by proposing that ACC guides foraging according to principles of model-based hierarchical reinforcement learning. This idea holds that ACC function is organized hierarchically along a rostral-caudal gradient, with rostral structures monitoring the status and completion of high-level task goals (like finding food), and midcingulate structures overseeing the execution of task options (subgoals, like harvesting fruit) and lower-level actions (such as grabbing an apple).


Subjects
Decision Making; Gyrus Cinguli; Humans; Animals; Reinforcement, Psychology; Reward; Behavior, Animal; Choice Behavior
3.
Aging Brain ; 5: 100109, 2024.
Article in English | MEDLINE | ID: mdl-38380149

ABSTRACT

Older adults demonstrate difficulties in sequential decision-making, which is partly attributed to under-recruitment of prefrontal networks. It is, therefore, important to understand the mechanisms that may improve this ability. This study investigated the effectiveness of an 18-session, home-based cognitive intervention and the neural mechanisms that underpin individual differences in intervention effects. Participants were required to learn the sequential choices in a 3-stage Markov decision-making task that would yield the most rewards. Participants were assigned to a better-responder or worse-responder group based on their performance at the last intervention session (T18). Better responders improved significantly starting from the fifth intervention session, while worse responders did not improve across all training sessions. At post-intervention, only better responders showed condition-dependent modulation of the dorsolateral prefrontal cortex (DLPFC) as measured by fNIRS, with higher DLPFC activity in the delayed condition. Despite large individual differences, our data showed that value-based sequential decision-making and its corresponding neural mechanisms can be remediated via home-based cognitive intervention in some older adults; moreover, individual differences in recruiting prefrontal activities after the intervention are associated with variations in intervention outcomes. Intervention-related gains were also maintained at three months after post-intervention. However, future studies should investigate the potential of combining other intervention methods, such as non-invasive brain stimulation, with cognitive intervention for older adults who do not respond to the intervention, thus emphasizing the importance of developing individualized intervention programs for older adults.

4.
Int J Psychophysiol ; 189: 11-19, 2023 07.
Article in English | MEDLINE | ID: mdl-37075909

ABSTRACT

The process of outcome evaluation effectively navigates subsequent choices in humans. However, it is largely unclear how people evaluate decision outcomes in a sequential scenario, as well as the neural mechanisms underlying this process. To address this research gap, the study employed a sequential decision task in which participants were required to make a series of choices in each trial, with the option to terminate their choices. Based on participants' decisions, two outcome patterns were classified: the "reached" condition and the "unreached" condition, and the event-related potentials (ERPs) were recorded. Further, in the unreached condition, we investigated how the distance (i.e., the position interval between the actual outcome and potential outcome) modulated outcome evaluation. Behavioral data showed a higher emotion rating when people got a reward rather than a loss (i.e., the reached condition), while the opposite was true in the unreached condition. ERP results showed a larger feedback-related negativity (FRN), a smaller P3, and a larger late positive potential (LPP) when people got a loss compared to a reward. Importantly, a hierarchical processing pattern was found in the unreached condition: people processed the potential outcome and the distance separately at the early stage, manifested in the FRN amplitude; subsequently, the brain focused on the distance, with a lower distance eliciting an enhanced P3 amplitude. Finally, the potential outcome and distance were processed interactively in the LPP amplitude. Overall, these findings shed light on the neural underpinnings of outcome evaluation in sequential decision-making.


Subjects
Electroencephalography; Evoked Potentials; Humans; Evoked Potentials/physiology; Brain/physiology; Reward; Decision Making/physiology
5.
J Biomed Inform ; 137: 104267, 2023 01.
Article in English | MEDLINE | ID: mdl-36494060

ABSTRACT

Warfarin is a widely used anticoagulant with a narrow therapeutic range. Dosing of warfarin should be individualized, since slight overdosing or underdosing can have catastrophic or even fatal consequences. Despite much research on warfarin dosing, current dosing protocols do not live up to expectations, especially for patients sensitive to warfarin. We propose a deep reinforcement learning-based dosing model for warfarin. To overcome the issue of relatively small sample sizes in dosing trials, we use a pharmacokinetic/pharmacodynamic (PK/PD) model of warfarin to simulate dose-responses of virtual patients. Applying the proposed algorithm to virtual test patients shows that this model outperforms a set of clinically accepted dosing protocols by a wide margin. We tested the robustness of our dosing protocol on a second PK/PD model and showed that its performance is comparable to the set of baseline protocols.


Subjects
Anticoagulants; Warfarin; Humans; Warfarin/pharmacology; Warfarin/therapeutic use; Anticoagulants/pharmacology; Anticoagulants/therapeutic use; Algorithms
6.
Entropy (Basel) ; 24(6), 2022 Jun 11.
Article in English | MEDLINE | ID: mdl-35741535

ABSTRACT

Vehicular edge computing is a new computing paradigm. By introducing edge computing into the Internet of Vehicles (IoV), service providers are able to serve users with low-latency services, as edge computing deploys resources (e.g., computation, storage, and bandwidth) at the side close to the IoV users. When mobile nodes are moving and generating structured tasks, they can connect with the roadside units (RSUs) and then choose a proper time and several suitable Mobile Edge Computing (MEC) servers to offload the tasks. However, how to offload tasks in sequence efficiently is challenging. In response to this problem, in this paper, we propose a time-optimized, multi-task-offloading model adopting the principles of Optimal Stopping Theory (OST) with the objective of maximizing the probability of offloading to the optimal servers. When the server utilization is close to uniformly distributed, we propose another OST-based model with the objective of minimizing the total offloading delay. The proposed models are experimentally compared and evaluated with related OST models using simulated data sets and real data sets, and sensitivity analysis is performed. The results show that the proposed offloading models can be efficiently implemented in the mobile nodes and significantly reduce the total expected processing time of the tasks.

7.
Neuroimage ; 256: 119222, 2022 08 01.
Article in English | MEDLINE | ID: mdl-35447352

ABSTRACT

Cognitive control, and forward planning in particular, is costly and therefore must be regulated such that the amount of cognitive resources invested is adequate to the current situation. However, knowing in advance how beneficial forward planning will be in a given situation is hard. A way to know the exact value of planning would be to actually do it, which would ab initio defeat the purpose of regulating planning, i.e. the reduction of computational and time costs. One possible solution to this dilemma is that planning is regulated by learned associations between stimuli and the expected demand for planning. Such learning might be based on generalisation processes that cluster together stimulus states with similar control-relevant properties into more general control contexts. In this way, the brain could infer the demand for planning based on previous experience with situations that share some structural properties with the current situation. Here, we used a novel sequential task to test the hypothesis that people use control contexts to efficiently regulate their forward planning, using behavioural and functional magnetic resonance imaging data. Consistent with our hypothesis, reaction times increased with trial-by-trial conflict, where this increase was more pronounced in a context with a learned high demand for planning. Similarly, we found that fMRI activity in the dorsal anterior cingulate cortex (dACC) increased with conflict, and this increase was more pronounced in a context with generally high demand for planning. Taken together, the results indicate that the dACC integrates representations of planning demand at different levels of abstraction to regulate planning in an efficient and situation-appropriate way.


Subjects
Gyrus Cinguli; Magnetic Resonance Imaging; Gyrus Cinguli/diagnostic imaging; Gyrus Cinguli/physiology; Humans; Magnetic Resonance Imaging/methods; Reaction Time/physiology
8.
J Mach Learn Res ; 23(250), 2022.
Article in English | MEDLINE | ID: mdl-37576335

ABSTRACT

Learning optimal individualized treatment rules (ITRs) has become increasingly important in the modern era of precision medicine. Many statistical and machine learning methods for learning optimal ITRs have been developed in the literature. However, most existing methods are based on data collected from traditional randomized controlled trials and thus cannot take advantage of the accumulating evidence when patients enter the trials sequentially. It is also ethically important that future patients should have a high probability of being treated optimally based on the updated knowledge so far. In this work, we propose a new design called sequentially rule-adaptive trials to learn optimal ITRs based on the contextual bandit framework, in contrast to the response-adaptive design in traditional adaptive trials. In our design, each entering patient will be allocated with a high probability to the current best treatment for this patient, which is estimated using the past data based on some machine learning algorithm (for example, outcome weighted learning in our implementation). We explore the tradeoff between training and test values of the estimated ITR in single-stage problems by proving theoretically that for a higher probability of following the estimated ITR, the training value converges to the optimal value at a faster rate, while the test value converges at a slower rate. This problem is different from traditional decision problems in the sense that the training data are generated sequentially and are dependent. We also develop a tool that combines martingale with empirical process to tackle the problem that cannot be solved by previous techniques for i.i.d. data. We show by numerical examples that without much loss of the test value, our proposed algorithm can improve the training value significantly as compared to existing methods. Finally, we use a real data study to illustrate the performance of the proposed method.
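
The allocation idea in this abstract, assigning each entering patient to the currently estimated best treatment with high probability, can be sketched as a simple contextual bandit loop. The code below is a toy illustration under stated assumptions: the rule is a per-context running mean rather than the outcome weighted learning the authors mention, and the binary context, the reward model, and every name (`allocate`, `run_trial`, `follow_prob`) are hypothetical.

```python
import random


def allocate(estimated_best, treatments, follow_prob, rng):
    """Follow the estimated rule with probability follow_prob.

    Otherwise pick uniformly among the remaining treatments.
    """
    if rng.random() < follow_prob:
        return estimated_best
    others = [t for t in treatments if t != estimated_best]
    return rng.choice(others)


def run_trial(n_patients=2000, follow_prob=0.9, seed=0):
    """Toy sequential trial; returns the mean in-trial outcome.

    Assumed setup: a binary context x, two treatments, and an outcome that
    is higher on average when the treatment equals the context.
    """
    rng = random.Random(seed)
    treatments = (0, 1)
    sums = {(x, a): 0.0 for x in (0, 1) for a in treatments}
    counts = {(x, a): 0 for x in (0, 1) for a in treatments}
    total = 0.0
    for _ in range(n_patients):
        x = rng.randint(0, 1)
        # Estimate the mean outcome of each treatment for this context
        # from the data collected so far (0.0 when nothing is observed yet).
        means = {}
        for a in treatments:
            n = counts[(x, a)]
            means[a] = sums[(x, a)] / n if n else 0.0
        best = max(treatments, key=lambda a: means[a])
        a = allocate(best, treatments, follow_prob, rng)
        reward = (1.0 if a == x else 0.0) + rng.gauss(0, 0.5)
        sums[(x, a)] += reward
        counts[(x, a)] += 1
        total += reward
    return total / n_patients


if __name__ == "__main__":
    for p in (0.5, 0.9):
        print(p, round(run_trial(follow_prob=p), 3))
```

In this toy setting a larger `follow_prob` raises the average outcome of patients inside the trial, loosely mirroring the training-value side of the tradeoff described in the abstract; the sketch does not measure the test value.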

9.
Neuroimage ; 246: 118780, 2022 02 01.
Article in English | MEDLINE | ID: mdl-34875383

ABSTRACT

Learning how to reach a reward over long series of actions is a remarkable capability of humans, and is potentially guided by multiple parallel learning modules. Current brain imaging of learning modules is limited by (i) simple experimental paradigms, (ii) entanglement of brain signals of different learning modules, and (iii) a limited number of computational models considered as candidates for explaining behavior. Here, we address these three limitations and (i) introduce a complex sequential decision making task with surprising events that allows us to (ii) dissociate correlates of reward prediction errors from those of surprise in functional magnetic resonance imaging (fMRI); and (iii) we test behavior against a large repertoire of model-free, model-based, and hybrid reinforcement learning algorithms, including a novel surprise-modulated actor-critic algorithm. Surprise, derived from an approximate Bayesian approach for learning the world-model, is extracted in our algorithm from a state prediction error. Surprise is then used to modulate the learning rate of a model-free actor, which itself learns via the reward prediction error from model-free value estimation by the critic. We find that action choices are well explained by pure model-free policy gradient, but reaction times and neural data are not. We identify signatures of both model-free and surprise-based learning signals in blood oxygen level dependent (BOLD) responses, supporting the existence of multiple parallel learning modules in the brain. Our results extend previous fMRI findings to a multi-step setting and emphasize the role of policy gradient and surprise signalling in human learning.


Subjects
Brain/physiology; Decision Making/physiology; Functional Neuroimaging/methods; Learning/physiology; Magnetic Resonance Imaging/methods; Adult; Brain/diagnostic imaging; Female; Humans; Male; Models, Biological; Reinforcement, Psychology; Young Adult
10.
Top Cogn Sci ; 13(4): 610-665, 2021 10.
Article in English | MEDLINE | ID: mdl-34710275

ABSTRACT

Acquiring expertise in a task is often thought of as an automatic process that follows inevitably with practice according to the log-log law (aka: power law) of learning. However, as Ericsson, Chase, and Faloon (1980) showed, this is not true for digit-span experts and, as we show, it is certainly not true for Tetris players at any level of expertise. Although some people may simply "twitch" faster than others, the limit to Tetris expertise is not raw keypress time but the techniques acquired by players that allow them to use the tools provided by the hardware and software to compensate for the game's relentlessly increasing drop speed. Unfortunately, these increases in drop speed between Tetris levels make performance plateaus very short and quickly followed by game death. Hence, a player's success at discovering, exploring, and practicing new techniques for the tasks of board preparation, board maintenance, optimal placement discovery, zoid rotation, lateral movement of zoids, and other tasks important to expertise in Tetris is limited. In this paper, we analyze data collected from 492 Tetris players to reveal the challenges they confronted while constructing expertise via the discovery of new techniques for gameplay at increasingly difficult levels of Tetris.


Subjects
Video Games; Achievement; Humans; Learning
11.
J Am Stat Assoc ; 116(533): 382-391, 2021.
Article in English | MEDLINE | ID: mdl-33814653

ABSTRACT

Due to the recent advancements in wearables and sensing technology, health scientists are increasingly developing mobile health (mHealth) interventions. In mHealth interventions, mobile devices are used to deliver treatment to individuals as they go about their daily lives. These treatments are generally designed to impact a near-time, proximal outcome such as stress or physical activity. The mHealth intervention policies, often called just-in-time adaptive interventions, are decision rules that map an individual's current state (e.g., the individual's past behaviors as well as current observations of time, location, social activity, stress and urges to smoke) to a particular treatment at each of many time points. The vast majority of current mHealth interventions deploy expert-derived policies. In this paper, we provide an approach for conducting inference about the performance of one or more such policies using historical data collected under a possibly different policy. Our measure of performance is the average of proximal outcomes over a long time period should the particular mHealth policy be followed. We provide an estimator as well as confidence intervals. This work is motivated by HeartSteps, an mHealth physical activity intervention.

12.
Eur J Health Econ ; 22(1): 51-73, 2021 Feb.
Article in English | MEDLINE | ID: mdl-32901420

ABSTRACT

BACKGROUND: In a typical single-payer setting that uses an explicit cost-effectiveness (CE) threshold in its decision-making, the payer aims to maximize the net-monetary-benefit (NMB) given the CE threshold, whilst the manufacturer aims to maximize the expected discounted-cash-flow (DCF) resulting from the sales of that technology. Managed entry agreements (MEAs) are tools that are used to improve access to expensive technologies that would otherwise not be deemed to be cost-effective to payers. While a simple discount on the list price is the most commonly applied MEA type, there are different forms, each having a different impact on the cost-effectiveness of the technology, on the lifetime DCF-per-patient and on the decision uncertainty. We aim to analyze the sequential decision-making (SDM) of different MEAs (i.e. simple discount, free treatment initiation, lifetime treatment acquisition cost-capping [LTTACC], performance-based money-back guarantee [MBG]) at the manufacturer and at the payer level, respectively.
METHODS: We first model the SDM of the manufacturer and the payer as a sequential game and explain the challenges of finding an equilibrium analytically. Then we propose a heuristic computational method to follow for each of the MEA types, based on practice. To demonstrate this SDM on a case study, a UK-based cost-utility analysis using a three-state, partitioned-survival-model was constructed to determine the cost-effectiveness of regorafenib versus best-supportive-care for the second-line treatment of hepatocellular carcinoma. The optimal agreement terms that would maximize the lifetime DCF-per-patient for each MEA, whilst remaining below the CE-threshold (£50,000/QALY gained), were obtained in the deterministic base-case. Robustness for each optimized MEA was then assessed using probabilistic sensitivity and scenario analyses, the value of information (VoI), and HTA-risk analyses.
RESULTS: As expected, the introduction of all MEAs improved the probabilistic ICER and NMB values to (almost) acceptable levels, compared to the "no-MEA" case (ICER ~ £78,000/QALY-gained). The expected DCFs across the explored MEAs were all similar, whilst the payer strategy & uncertainty burden (PSUB) for regorafenib decreased in all MEAs explored. VoI analyses revealed that regorafenib mean-dose-intensity and time-on-treatment (ToT) parameters contributed most to the decision uncertainty. LTTACC provided the smallest PSUB and the most robust NMB estimates under parametric uncertainty. For scenarios assuming increased regorafenib ToT or mean-dose-intensity, LTTACC again provided acceptable cost-effectiveness outcomes, whereas for scenarios assuming decreased regorafenib progression-free/overall survival effectiveness, only MBG resulted in plausible ICER values. In scenarios where the source of uncertainty was not targeted by MEA parameters (e.g. the scenario assuming higher progressed disease resource utilization), all investigated MEA types resulted in unacceptable cost-effectiveness outcomes.
CONCLUSION: Each MEA type has a different implication. The impact of different MEAs on the NMB is more noteworthy than on the DCF, in relative terms, hence payers will benefit from early participation in the MEA design rather than leaving this up to the prerogative of the manufacturer. While a simple discount might be practical for implementation purposes, other MEAs can provide additional benefits to the payer in terms of increased NMB, reduced decision risk and reduced uncertainty. MEA performance should be investigated not only under parametric uncertainty, but also under identified structural uncertainty, and the barriers to implementation should be considered thoroughly before choosing the most appropriate MEA type.


Subjects
Antineoplastic Agents/economics; Health Care Costs; Cost-Benefit Analysis; England; Humans; Pharmaceutical Preparations; Quality-Adjusted Life Years; Uncertainty
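
The two quantities this abstract turns on follow standard health-economic definitions: the ICER is the incremental cost divided by the incremental QALYs, and the incremental NMB is the threshold times the QALY gain minus the incremental cost. The sketch below works through them for a simple-discount agreement. All inputs are hypothetical and were picked only so that the undiscounted ICER lands at 78,000 per QALY; they are not figures from the study, and the helper names are assumptions.

```python
def icer(delta_cost, delta_qaly):
    """Incremental cost-effectiveness ratio: extra cost per QALY gained."""
    if delta_qaly <= 0:
        raise ValueError("this sketch only handles a positive QALY gain")
    return delta_cost / delta_qaly


def net_monetary_benefit(delta_cost, delta_qaly, threshold):
    """Incremental NMB = threshold * QALY gain - extra cost."""
    return threshold * delta_qaly - delta_cost


def discount_to_meet_threshold(drug_cost, other_delta_cost, delta_qaly, threshold):
    """Smallest fractional price discount that brings the ICER to the threshold.

    Returns 0.0 if the threshold is already met, and is capped at 1.0 when
    even a free drug would not be enough.
    """
    max_total_cost = threshold * delta_qaly
    needed = (drug_cost + other_delta_cost - max_total_cost) / drug_cost
    return min(1.0, max(0.0, needed))


if __name__ == "__main__":
    # Hypothetical inputs, not taken from the study.
    drug_cost, other_cost, qaly_gain, threshold = 35_000, 4_000, 0.5, 50_000
    print(icer(drug_cost + other_cost, qaly_gain))
    print(net_monetary_benefit(drug_cost + other_cost, qaly_gain, threshold))
    print(discount_to_meet_threshold(drug_cost, other_cost, qaly_gain, threshold))
```

An ICER at or below the threshold is equivalent to a non-negative incremental NMB, which is why the payer's objective can be stated in either form.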
13.
Cogn Emot ; 34(7): 1509-1516, 2020 11.
Article in English | MEDLINE | ID: mdl-32393109

ABSTRACT

A growing body of evidence suggests that the emotional states under which individuals perform decision-making tasks modulate performance. Studies have mainly reported that negative emotions can differentially increase or decrease performance by modulating feedback processing. In contrast, differential influences of specific emotions within positive valence have been poorly investigated. The objective of the present work was to assess the specific effect of different types of positive emotions on decision-making and to investigate whether this effect also depends on feedback processing. In our study, after being induced to feel either hope or happiness, participants undertook a risky sequential decision-making task in which feedback was required to obtain a good performance. We found that the more positive the feedback received, the more happiness led participants to make risky decisions. This tendency was not observed among participants in the hopeful or in the control condition. Our results contribute to the literature showing that the effects of emotions on sequential decision-making performance can be explained by feedback processing and are not solely due to the valence of the emotional state. They also suggest that further research is required to determine which potential specific dimension is involved in the effects of positive emotions on sequential decision-making.


Subjects
Decision Making/physiology; Emotions; Adolescent; Adult; Female; Happiness; Hope; Humans; Male; Young Adult
14.
Proc Natl Acad Sci U S A ; 117(23): 12750-12755, 2020 06 09.
Article in English | MEDLINE | ID: mdl-32461363

ABSTRACT

In many real-life decisions, options are distributed in space and time, making it necessary to search sequentially through them, often without a chance to return to a rejected option. The optimal strategy in these tasks is to choose the first option that is above a threshold that depends on the current position in the sequence. The implicit decision-making strategies by humans vary but largely diverge from this optimal strategy. The reasons for this divergence remain unknown. We present a model of human stopping decisions in sequential decision-making tasks based on a linear threshold heuristic. The first two studies demonstrate that the linear threshold model accounts better for sequential decision making than existing models. Moreover, we show that the model accurately predicts participants' search behavior in different environments. In the third study, we confirm that the model generalizes to a real-world problem, thus providing an important step toward understanding human sequential decision making.


Subjects
Decision Making; Models, Psychological; Adolescent; Adult; Female; Humans; Male; Middle Aged
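
A linear threshold heuristic of the kind this abstract describes can be sketched as a stopping rule whose acceptance threshold changes linearly with the position in the sequence. The version below is a minimal sketch under assumptions the abstract does not confirm: higher values are treated as better, the threshold is `intercept + slope * position`, and the final option must be taken because rejected options cannot be recalled. The parameter names are hypothetical and this is not the authors' fitted model.

```python
def linear_threshold_stop(values, intercept, slope):
    """Return the index of the option at which the searcher stops.

    The option at position i (0-based) is accepted if its value is at least
    intercept + slope * i. If no option clears its threshold, the last
    option is accepted by necessity.
    """
    last = len(values) - 1
    for i, value in enumerate(values):
        if i == last or value >= intercept + slope * i:
            return i
    raise ValueError("values must be non-empty")


if __name__ == "__main__":
    offers = [0.5, 0.65, 0.75, 0.9]  # hypothetical option values
    # A negative slope relaxes the threshold as the sequence runs out.
    print(linear_threshold_stop(offers, intercept=0.8, slope=-0.1))
```

With a negative slope the searcher becomes less demanding as fewer options remain, which is one simple way to approximate a position-dependent optimal threshold with only two free parameters.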
15.
Front Hum Neurosci ; 14: 605190, 2020.
Article in English | MEDLINE | ID: mdl-33613203

ABSTRACT

The ability to learn sequential contingencies of actions for predicting future outcomes is indispensable for flexible behavior in many daily decision-making contexts. It remains open whether such ability may be enhanced by transcranial direct current stimulation (tDCS). The present study combined tDCS with functional near-infrared spectroscopy (fNIRS) to investigate potential tDCS-induced effects on sequential decision-making and the neural mechanisms underlying such modulations. Offline tDCS and sham stimulation were applied over the left and right dorsolateral prefrontal cortex (dlPFC) in young male adults (N = 29, mean age = 23.4 years, SD = 3.2) in a double-blind between-subject design using a three-state Markov decision task. The results showed (i) an enhanced dlPFC hemodynamic response during the acquisition of sequential state transitions that is consistent with the findings from a previous functional magnetic resonance imaging (fMRI) study; (ii) a tDCS-induced increase of the hemodynamic response in the dlPFC, but without accompanying performance-enhancing effects at the behavioral level; and (iii) a greater tDCS-induced upregulation of hemodynamic responses in the delayed reward condition that seems to be associated with faster decision speed. Taken together, these findings provide empirical evidence for fNIRS as a suitable method for investigating hemodynamic correlates of sequential decision-making as well as functional brain correlates underlying tDCS-induced modulation. Future research with larger sample sizes that allow subgroup analysis is necessary in order to decipher interindividual differences in tDCS-induced effects on sequential decision-making processes at the behavioral and brain levels.

16.
Cogn Neurosci ; 11(3): 122-131, 2020 07.
Article in English | MEDLINE | ID: mdl-31617790

ABSTRACT

Movement-related theta oscillations in rodent hippocampus coordinate 'forward sweeps' of location-specific neural activity that could be used to evaluate spatial trajectories online. This raises the possibility that increases in human hippocampal theta power accompany the evaluation of upcoming spatial choices. To test this hypothesis, we measured neural oscillations during a spatial planning task that closely resembles a perceptual decision-making paradigm. In this task, participants searched visually for the shortest path between a start and goal location in novel mazes that contained multiple choice points, and were subsequently asked to make a spatial decision at one of those choice points. We observed ~4-8 Hz hippocampal/medial temporal lobe theta power increases specific to sequential planning that were negatively correlated with subsequent decision speed, where decision speed was inversely correlated with choice accuracy. These results implicate the hippocampal theta rhythm in decision tree search during planning in novel environments.


Subjects
Decision Making/physiology; Hippocampus/physiology; Serial Learning/physiology; Space Perception/physiology; Spatial Learning/physiology; Temporal Lobe/physiology; Theta Rhythm/physiology; Adult; Female; Humans; Male; Visual Perception/physiology; Young Adult
17.
Elife ; 8, 2019 11 11.
Article in English | MEDLINE | ID: mdl-31709980

ABSTRACT

In many daily tasks, we make multiple decisions before reaching a goal. In order to learn such sequences of decisions, a mechanism to link earlier actions to later reward is necessary. Reinforcement learning (RL) theory suggests two classes of algorithms solving this credit assignment problem: In classic temporal-difference learning, earlier actions receive reward information only after multiple repetitions of the task, whereas models with eligibility traces reinforce entire sequences of actions from a single experience (one-shot). Here, we show one-shot learning of sequences. We developed a novel paradigm to directly observe which actions and states along a multi-step sequence are reinforced after a single reward. By focusing our analysis on those states for which RL with and without eligibility trace make qualitatively distinct predictions, we find direct behavioral (choice probability) and physiological (pupil dilation) signatures of reinforcement learning with eligibility trace across multiple sensory modalities.


Subjects
Cognition/physiology; Decision Making/physiology; Learning/physiology; Memory/physiology; Pupil/physiology; Reinforcement, Psychology; Reward; Algorithms; Humans; Markov Chains; Models, Neurological; Psychomotor Performance/physiology
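
The contrast this abstract draws between classic temporal-difference learning and learning with eligibility traces can be made concrete with tabular TD(λ) on a short chain of states. The sketch below runs a single rewarded episode; with `lam=0` it reduces to TD(0). The chain environment, the accumulating-trace variant, and all parameter values are assumptions for illustration, not the paradigm or the models fitted in the study.

```python
def run_episode(n_states, alpha, gamma, lam, reward):
    """Run one pass through a chain s0 -> s1 -> ... -> terminal.

    Only the final transition is rewarded. Returns the state values after
    this single episode, starting from all-zero values.
    """
    values = [0.0] * n_states
    traces = [0.0] * n_states
    for s in range(n_states):
        is_last = s == n_states - 1
        r = reward if is_last else 0.0
        next_value = 0.0 if is_last else values[s + 1]
        delta = r + gamma * next_value - values[s]  # TD error
        traces[s] += 1.0  # accumulating trace for the visited state
        for k in range(n_states):
            values[k] += alpha * delta * traces[k]
            traces[k] *= gamma * lam  # decay every trace after the update
    return values


if __name__ == "__main__":
    print("TD(0):     ", run_episode(3, alpha=0.5, gamma=1.0, lam=0.0, reward=1.0))
    print("TD(lambda):", run_episode(3, alpha=0.5, gamma=1.0, lam=0.9, reward=1.0))
```

After one episode, TD(0) has changed only the value of the state immediately before the reward, whereas the trace-based learner has also credited the earlier states, each scaled by how far its trace has decayed. That is the qualitative difference in predictions the abstract refers to as one-shot learning of sequences.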
18.
Technol Health Care ; 27(S1): 367-381, 2019.
Article in English | MEDLINE | ID: mdl-31045554

ABSTRACT

Traditional Chinese Medicine (TCM) multiple-acupoints stimulation is widely used to improve dysphagia among post-stroke patients. However, prior research in evidence-based acupuncture has mostly focused on the treatment effects of single acupoints on dysphagia, while the evidence on the optimal sequence of multiple-acupoints stimulation remains limited. In this paper, we developed an evaluation method for hybrid knowledge (deterministic knowledge and experiential group decision knowledge) sequences based on a segmentation mechanism of sub-sequence fragments, and then proposed a Monte Carlo Tree Search (MCTS) sequential decision-making method under the hybrid knowledge. Thereafter, we applied the proposed approach to optimize the sequential decision-making schema of multiple-acupoints stimulation for treating dysphagia among post-stroke patients. Finally, we verified the validity and the feasibility of this method by comparing it to other sequential decision-making search methods.


Subjects
Acupuncture Points; Acupuncture Therapy/methods; Decision Trees; Deglutition Disorders/therapy; Monte Carlo Method; Stroke Rehabilitation; Algorithms; Humans; Medicine, Chinese Traditional
19.
Cogn Affect Behav Neurosci ; 19(2): 225-230, 2019 04.
Article in English | MEDLINE | ID: mdl-30607832

ABSTRACT

Many complex real-world decisions, such as deciding which house to buy or whether to switch jobs, involve trying to maximize reward across a sequence of choices. Optimal Foraging Theory is well suited to study these kinds of choices because it provides formal models for reward-maximization in sequential situations. In this article, we review recent insights from foraging neuroscience, behavioral ecology, and computational modelling. We find that a commonly used approach in foraging neuroscience, in which choice items are encountered at random, does not reflect the way animals direct their foraging efforts in certain real-world settings, nor does it reflect efficient reward-maximizing behavior. Based on this, we propose that task designs allowing subjects to encounter choice items strategically will further improve the ecological validity of foraging approaches used in neuroscience, as well as give rise to new behavioral and neural predictions that deepen our understanding of sequential, value-based choice.


Subjects
Brain/physiology; Choice Behavior; Reward; Animals; Humans; Neurosciences
20.
Cogn Psychol ; 109: 1-25, 2019 03.
Article in English | MEDLINE | ID: mdl-30543908

ABSTRACT

Tetris is a complex task notable for the increasingly substantial demands it makes on perception, decision-making, and action as the game is played. To investigate these issues, we collected data on 39 features of Tetris play for each Tetris zoid (piece), for up to 16 levels of difficulty, as each of 240 players played an hour of Tetris under laboratory conditions. Using only early (level 1) data, we conducted a Principal Component Analysis which found intriguing differences among its three statistically significant principal components. Each of these components captures different combinations of perception, decision-making, and action, which suggests differing higher-level skills, tactics, and strategies. Each component is presented and discussed, and then used in a series of principal component regression analyses on subsets of these data (a) from different Tetris levels, as well as (b) from players of different levels of expertise. We validate these models with data collected at a locally held Tetris tournament. These components represent elements of expertise; namely, correlations among perceptual, decision-making, and motor features that represent processing stages and hierarchical control and which distinguish expert from novice Tetris players. These components provide evidence for an integrated complex of processes - the Mind's Hand and the Mind's Eye - that are the essence of expertise in the real-time, sequential-decision-making task of Tetris.


Subjects
Decision Making; Psychomotor Performance; Reaction Time; Video Games; Adult; Female; Humans; Male; Perception