Meta-control of the exploration-exploitation dilemma emerges from probabilistic inference over a hierarchy of time scales
Abstract
Cognitive control is typically understood as a set of mechanisms that enable humans to reach goals that require integrating the consequences of actions over longer time scales. Importantly, relying on routine behaviour or making choices that are beneficial only at short time scales would prevent one from attaining these goals. During the past two decades, researchers have proposed various computational cognitive models that successfully account for behaviour related to cognitive control in a wide range of laboratory tasks. However, humans operate in a dynamic and uncertain environment, in which making elaborate plans and integrating experience over multiple time scales is computationally expensive, and the specific question of how uncertain consequences at different time scales are integrated into adaptive decisions remains poorly understood. Here, we propose that precisely the problem of integrating experience and forming elaborate plans over multiple time scales is a key component for better understanding how human agents solve cognitive control dilemmas such as the exploration-exploitation dilemma. In support of this conjecture, we present a computational model of probabilistic inference over hidden states and actions, which are represented as a hierarchy of time scales. Simulations of goal-reaching agents instantiating the model in an uncertain and dynamic task environment show how the exploration-exploitation dilemma may be solved by inferring meta-control states which adapt behaviour to changing contexts.
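To make the idea of inference over two time scales concrete, the following is a minimal illustrative sketch and not the model presented in this paper. It assumes a hypothetical two-armed bandit whose rewarding arm depends on a slowly switching hidden context. The reward probabilities, the hazard rate, and the use of belief entropy as a stand-in for an exploration-favouring meta-control signal are all assumptions introduced here for illustration.

```python
import math

# Hypothetical toy numbers, not taken from the paper.
# REWARD_PROB[context][arm] = assumed probability of reward.
REWARD_PROB = [[0.8, 0.2],   # context 0: arm 0 is the better arm
               [0.2, 0.8]]   # context 1: arm 1 is the better arm
HAZARD = 0.05  # assumed per-trial probability that the context switches


def predict(belief, hazard=HAZARD):
    """Slow time scale: propagate the context belief through a possible switch."""
    b0, b1 = belief
    return [b0 * (1 - hazard) + b1 * hazard,
            b1 * (1 - hazard) + b0 * hazard]


def update(belief, arm, reward):
    """Fast time scale: Bayes update of the context belief from one outcome."""
    likelihood = [REWARD_PROB[c][arm] if reward else 1 - REWARD_PROB[c][arm]
                  for c in (0, 1)]
    unnorm = [lik * b for lik, b in zip(likelihood, belief)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def step(belief, arm, reward):
    """One trial: predict a possible context change, then condition on the outcome."""
    return update(predict(belief), arm, reward)


def exploration_weight(belief):
    """Normalised entropy of the context belief, in [0, 1].

    Used here as an assumed proxy for how strongly behaviour should
    favour exploration: high when the context is uncertain, low otherwise.
    """
    entropy = -sum(p * math.log(p) for p in belief if p > 0)
    return entropy / math.log(len(belief))
```

In this sketch, a run of rewards on one arm sharpens the context belief and lowers the exploration weight, while a surprising outcome raises it again. A full treatment would infer the meta-control state itself as a hidden variable at the slower time scale rather than computing it from entropy.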