This key difference makes it possible to discern the

This key difference makes it possible to discern the PF 01367338 influence of each controller on behavior and also to determine whether neural signals are correlated with predictions and prediction errors specific to each controller. Motivated by Tolman and Honzik (Tolman and Honzik, 1930), Gläscher and colleagues employed a variant of this task to examine latent learning

(Gläscher et al., 2010). Subjects were extensively taught the first-state transitions and were then told the utilities at the second state. Appropriate initial behavior in the task once the utilities were revealed could only arise from model-based control. However, the authors observed that the initial supremacy of model-based controller declined rather precipitately over time, even

in the absence of information that would contradict this controller (Gläscher et al., 2010). This decline was suggested as an analog of fast acquisition of habitual behavior. During the interregnum, behavior was best fit by a hybrid model in which both systems exerted some control. fMRI data highlighted a conventional model-free temporal difference reward prediction error in ventral striatum, whereas a different sort of state prediction error, associated with the acquisition of the model, was seen in posterior inferior parietal and lateral prefrontal cortices. Daw and colleagues devised a different variant of the task to encourage a stable Regorafenib price balance between model-based and model-free control (Daw et al., 2011). The logic of the task was that model-based and model-free strategies for RL predict different patterns by which reward obtained in the second stage should impact first-stage choices on subsequent trials. Consider a trial in which a first-stage choice, uncharacteristically, led to a second stage state with which it is not usually associated, and the choice then made at the second stage turned out to be rewarded. Model-free reinforcement predicts

that this experience will increase the probability of repeating the much successful first-stage choice. By contrast, if a subject chooses using an internal model of the transition structure, then this predicts that they would exhibit a decreased tendency to choose that same option. The best account of the behavioral data in this task was provided by a hybrid model in which model-based and model-free predictions were integrated during learning (unless subjects had to accomplish a cognitively demanding dual-task, in which case model-free control becomes rampant (Otto et al., 2013). However, across subjects, there was a wide spread in the degree of dependence on each system.

Leave a Reply

Your email address will not be published. Required fields are marked *

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>