A Bayesian account of generalist and specialist formation under the Active Inference framework [article]

Anthony Guanxun Chen, David Benrimoh, Thomas Parr, Karl J Friston
2019 bioRxiv   pre-print
This paper offers a formal account of policy learning, or habitual behavioural optimisation, under the framework of Active Inference. In this setting, habit formation becomes an autodidactic, experience-dependent process, based upon what the agent sees itself doing. We focus on the effect of environmental volatility on habit formation by simulating artificial agents operating in a partially observable Markov decision process. Specifically, we used a "two-step" maze paradigm, in which the agent
more » ... as to decide whether to go left or right to secure a reward. We observe that in volatile environments with numerous reward locations, the agents learn to adopt a generalist strategy, never forming a strong habitual behaviour for any preferred maze direction. Conversely, in conservative or static environments, agents adopt a specialist strategy; forming strong preferences for policies that result in approach to a small number of previously-observed reward locations. The pros and cons of the two strategies are tested and discussed. In general, specialization offers greater benefits, but only when contingencies are conserved over time. We consider the implications of this formal (Active Inference) account of policy learning for understanding the relationship between specialisation and habit formation.
doi:10.1101/644807 fatcat:dlke7nvxmrbnjbo2n3yo5g343a