Higher Meta-cognitive Ability Predicts Less Reliance on Over Confident Habitual Learning System

2019 ◽  
Author(s):  
Sara Ershadmanesh ◽  
Mostafa Miandari ◽  
Abdol-hossein Vahabie ◽  
Majid Nili Ahmadabadi

Abstract: Many studies on humans and animals have provided evidence for the contribution of goal-directed and habitual valuation systems to learning and decision-making. These two systems can be modeled using model-based (MB) and model-free (MF) algorithms in the Reinforcement Learning (RL) framework. Here, we study the link between the contribution of these two learning systems to behavior and meta-cognitive capabilities. Using computational modeling, we showed that in a highly variable environment, where both learning strategies perform at chance level, model-free learning predicts higher confidence in decisions than the model-based strategy. Our experimental results showed that subjects’ meta-cognitive ability is negatively correlated with the contribution of the model-free system to their behavior, while showing no correlation with the contribution of the model-based system. Over-confidence of the model-free system accounts for this counter-intuitive result. This is a new explanation for individual differences in learning style.
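As a rough illustration of how MB and MF valuation are commonly combined in the RL literature, the sketch below mixes a cached (model-free) value with a value computed from a transition model. It is a minimal sketch under stated assumptions, not the authors' model: the function names, the mixing weight `w`, the learning rate `alpha`, and the inverse temperature `beta` are all hypothetical choices made for this example.

```python
import math


def mf_update(q, reward, alpha):
    """Model-free update: nudge the cached value toward the observed reward."""
    return q + alpha * (reward - q)


def mb_value(transition_probs, second_stage_values):
    """Model-based value: expectation of second-stage values under a transition model."""
    return sum(p * v for p, v in zip(transition_probs, second_stage_values))


def hybrid_value(q_mb, q_mf, w):
    """Weighted mixture; w = 1 is purely model-based, w = 0 purely model-free (assumed form)."""
    return w * q_mb + (1.0 - w) * q_mf


def choice_probability(q_a, q_b, beta):
    """Softmax probability of choosing option A over option B."""
    return 1.0 / (1.0 + math.exp(-beta * (q_a - q_b)))


if __name__ == "__main__":
    q_mf = mf_update(0.0, reward=1.0, alpha=0.5)
    q_mb = mb_value([0.7, 0.3], [1.0, 0.0])
    q_a = hybrid_value(q_mb, q_mf, w=0.5)
    print(choice_probability(q_a, 0.5, beta=3.0))
```

In this kind of formulation, the fitted weight `w` is one way to quantify each system's contribution to an individual's behavior.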

2020 ◽  
Author(s):  
Claire Rosalie Smid ◽  
Wouter Kool ◽  
Tobias U. Hauser ◽  
Nikolaus Steinbeis

Human decision-making is underpinned by distinct systems that differ in their flexibility and associated computational cost. A widely accepted dichotomy distinguishes a flexible but costly model-based system and a cheap but rigid model-free system. Optimal decision-making requires adaptive arbitration between these two systems depending on environmental demands. Previous developmental studies suggest that model-based decision-making only emerges in adolescence. Here, we show that when using a paradigm more conducive to model-based decision-making, children as young as 5 years show contributions from a model-based system to their behaviour. Furthermore, we find that between the ages of 5 and 11, children demonstrate increasing metacontrol, which is the engagement of cost-benefit arbitration over decision-making systems on a trial-by-trial basis. Our results suggest that model-based decision-making emerges much earlier than previously believed, while adaptive arbitration between computationally cheap and costly systems continues to undergo developmental changes during childhood.


Author(s):  
Maaike M.H. van Swieten ◽  
Rafal Bogacz ◽  
Sanjay G. Manohar

Abstract: Human decisions can be reflexive or planned, being governed respectively by model-free and model-based learning systems. These two systems might differ in their responsiveness to our needs. Hunger drives us to specifically seek food rewards, but here we ask whether it might have more general effects on these two decision systems. On the one hand, the model-based system is often considered flexible and context-sensitive, and might therefore be modulated by metabolic needs. On the other hand, the model-free system’s primitive reinforcement mechanisms may have closer ties to biological drives. Here, we tested participants on a well-established two-stage sequential decision-making task that dissociates the contribution of model-based and model-free control. Hunger enhanced overall performance by increasing model-free control, without affecting model-based control. These results demonstrate a generalized effect of hunger on decision-making that enhances reliance on primitive reinforcement learning, which in some situations translates into adaptive benefits.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Florent Wyckmans ◽  
A. Ross Otto ◽  
Miriam Sebold ◽  
Nathaniel Daw ◽  
Antoine Bechara ◽  
...  

Abstract: Compulsive behaviors (e.g., addiction) can be viewed as an aberrant decision process where inflexible reactions automatically evoked by stimuli (habit) take control over decision making to the detriment of a more flexible (goal-oriented) behavioral learning system. These behaviors are thought to arise from learning algorithms known as “model-based” and “model-free” reinforcement learning. Gambling disorder, a form of addiction without the confound of neurotoxic effects of drugs, has been associated with impaired goal-directed control, but the way in which problem gamblers (PG) orchestrate model-based and model-free strategies has not been evaluated. Forty-nine PG and 33 healthy participants (CP) completed a two-step sequential choice task for which model-based and model-free learning have distinct and identifiable trial-by-trial learning signatures. The influence of common psychopathological comorbidities on those two forms of learning was investigated. PG showed impaired model-based learning, particularly after unrewarded outcomes. In addition, PG exhibited faster reaction times than CP following unrewarded decisions. Troubled mood, higher impulsivity (i.e., positive and negative urgency) and current and chronic stress reported via questionnaires did not account for those results. These findings demonstrate specific reinforcement learning and decision-making deficits in behavioral addiction that advance our understanding and may be important dimensions for designing effective interventions.
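The trial-by-trial signatures mentioned above are often summarized in the two-step literature as "stay probabilities": how often a first-stage choice is repeated, split by whether the previous trial was rewarded and whether its transition was common or rare. The sketch below is an illustrative computation only, not the analysis used in this study; the trial format (dicts with hypothetical keys `choice`, `rewarded`, `common`) and the function name are assumptions made for this example.

```python
from collections import defaultdict


def stay_probabilities(trials):
    """Return P(repeat previous first-stage choice) per (rewarded, common) condition.

    `trials` is an ordered list of dicts with hypothetical keys:
    'choice' (first-stage action), 'rewarded' (bool), 'common' (bool).
    """
    counts = defaultdict(lambda: [0, 0])  # condition -> [stays, total]
    for prev, curr in zip(trials, trials[1:]):
        key = (prev["rewarded"], prev["common"])
        counts[key][1] += 1
        if curr["choice"] == prev["choice"]:
            counts[key][0] += 1
    return {key: stays / total for key, (stays, total) in counts.items()}


if __name__ == "__main__":
    example = [
        {"choice": 0, "rewarded": True, "common": True},
        {"choice": 0, "rewarded": False, "common": True},
        {"choice": 1, "rewarded": True, "common": False},
        {"choice": 0, "rewarded": False, "common": True},
    ]
    for condition, p in sorted(stay_probabilities(example).items()):
        print(condition, p)
```

Under the usual reading of this task, a purely model-free learner tends to repeat rewarded choices regardless of transition type, whereas a model-based learner's tendency to stay depends on the interaction between reward and transition type.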


2018 ◽  
Vol 2 ◽  
pp. 239821281877296 ◽  
Author(s):  
Oliver Wang ◽  
Sang Wan Lee ◽  
John O’Doherty ◽  
Ben Seymour ◽  
Wako Yoshida

Background: While there is good evidence that reward learning is underpinned by two distinct decision control systems – a cognitive ‘model-based’ and a habit-based ‘model-free’ system, a comparable distinction for punishment avoidance has been much less clear. Methods: We implemented a pain avoidance task that placed differential emphasis on putative model-based and model-free processing, mirroring a paradigm and modelling approach recently developed for reward-based decision-making. Subjects performed a two-step decision-making task with probabilistic pain outcomes of different quantities. The delivery of outcomes was sometimes contingent on a rule signalled at the beginning of each trial, emulating a form of outcome devaluation. Results: The behavioural data showed that subjects tended to use a mixed strategy – favouring the simpler model-free learning strategy when outcomes did not depend on the rule, and favouring a model-based strategy when they did. Furthermore, the data were well described by a dynamic transition model between the two controllers. When compared with data from a reward-based task (albeit tested in the context of the scanner), we observed that avoidance involved a significantly greater tendency for subjects to switch between model-free and model-based systems in the face of changes in uncertainty. Conclusion: Our study suggests a dual-system model of pain avoidance, similar to but possibly more dynamically flexible than reward-based decision-making.


2019 ◽  
Author(s):  
Leor M Hackel ◽  
Jeffrey Jordan Berg ◽  
Björn Lindström ◽  
David Amodio

Do habits play a role in our social impressions? To investigate the contribution of habits to the formation of social attitudes, we examined the roles of model-free and model-based reinforcement learning in social interactions—computations linked in past work to habit and planning, respectively. Participants in this study learned about novel individuals in a sequential reinforcement learning paradigm, choosing financial advisors who led them to high- or low-paying stocks. Results indicated that participants relied on both model-based and model-free learning, such that each independently predicted choice during the learning task and self-reported liking in a post-task assessment. Specifically, participants liked advisors who could provide large future rewards as well as advisors who had provided them with large rewards in the past. Moreover, participants varied in their use of model-based and model-free learning strategies, and this individual difference influenced the way in which learning related to self-reported attitudes: among participants who relied more on model-free learning, model-free social learning related more to post-task attitudes. We discuss implications for attitudes, trait impressions, and social behavior, as well as the role of habits in a memory systems model of social cognition.


Mechatronics ◽  
2014 ◽  
Vol 24 (8) ◽  
pp. 1008-1020 ◽  
Author(s):  
Abhishek Dutta ◽  
Yu Zhong ◽  
Bruno Depraetere ◽  
Kevin Van Vaerenbergh ◽  
Clara Ionescu ◽  
...  

Author(s):  
Mengmeng Li ◽  
Hiroaki Ogata ◽  
Bin Hou ◽  
Satoshi Hashimoto ◽  
Yuqin Liu ◽  
...  

This paper describes an adaptive learning system based on mobile phone email to support the study of Japanese Kanji. In this study, the main emphasis is on using adaptive learning to resolve a common problem of mobile-based email or SMS language learning systems. To achieve this goal, the authors' main efforts focus on three aspects: sending the contents to a learner following his or her interests, adjusting the difficulty level of the tests to suit the learner’s proficiency level, and adapting the system to his or her learning style. Additionally, this system has already been evaluated by learners, and the results show that most of them benefited from the system and would like to continue using it.


2015 ◽  
Vol 22 (2) ◽  
pp. 188-198 ◽  
Author(s):  
Patricia Gruner ◽  
Alan Anticevic ◽  
Daeyeol Lee ◽  
Christopher Pittenger

Decision making in a complex world, characterized both by predictable regularities and by frequent departures from the norm, requires dynamic switching between rapid habit-like, automatic processes and slower, more flexible evaluative processes. These strategies, formalized as “model-free” and “model-based” reinforcement learning algorithms, respectively, can lead to divergent behavioral outcomes, requiring a mechanism to arbitrate between them in a context-appropriate manner. Recent data suggest that individuals with obsessive-compulsive disorder (OCD) rely excessively on inflexible habit-like decision making during reinforcement-driven learning. We propose that inflexible reliance on habit in OCD may reflect a functional weakness in the mechanism for context-appropriate dynamic arbitration between model-free and model-based decision making. Support for this hypothesis derives from emerging functional imaging findings. A deficit in arbitration in OCD may help reconcile evidence for excessive reliance on habit in rewarded learning tasks with an older literature suggesting inappropriate recruitment of circuitry associated with model-based decision making in unreinforced procedural learning. The hypothesized deficit and corresponding circuitry may be a particularly fruitful target for interventions, including cognitive remediation.


2016 ◽  
Vol 115 (6) ◽  
pp. 3195-3203 ◽  
Author(s):  
Simon Dunne ◽  
Arun D'Souza ◽  
John P. O'Doherty

A major open question is whether computational strategies thought to be used during experiential learning, specifically model-based and model-free reinforcement learning, also support observational learning. Furthermore, the question of how observational learning occurs when observers must learn about the value of options from observing outcomes in the absence of choice has not been addressed. In the present study we used a multi-armed bandit task that encouraged human participants to employ both experiential and observational learning while they underwent functional magnetic resonance imaging (fMRI). We found evidence for the presence of model-based learning signals during both observational and experiential learning in the intraparietal sulcus. However, unlike during experiential learning, model-free learning signals in the ventral striatum were not detectable during this form of observational learning. These results provide insight into the flexibility of the model-based learning system, implicating this system in learning during observation as well as from direct experience, and further suggest that the model-free reinforcement learning system may be less flexible with regard to its involvement in observational learning.

