Learning at variable attentional load requires cooperation between working memory, meta-learning and attention-augmented reinforcement learning

Thilo Womelsdorf, Marcus R Watson, Paul Tiesinga
2020, bioRxiv preprint
Flexible learning of changing reward contingencies can be realized with different strategies. A fast learning strategy involves forming a working memory of rewarded experiences with objects to improve future choices. A slower learning strategy uses prediction errors to gradually update value expectations to improve choices. How the fast and slow strategies work together in scenarios with real-world stimulus complexity is not well understood. Here, we disentangle their relative contributions in monkeys while they learned the relevance of object features at variable attentional load. We found that learning behavior across six subjects is consistently best predicted by a model combining (i) fast working memory, (ii) slower reinforcement learning from positive and negative prediction errors, (iii) selective suppression of non-chosen feature values, and (iv) meta-learning-based adjustment of exploration rates given a memory trace of recent errors. These mechanisms cooperate differently at low and high attentional loads: working memory was essential for efficient learning at lower attentional loads, whereas learning from negative outcomes and meta-learning were essential for efficient learning at higher attentional loads. Together, these findings highlight that a canonical set of distinct learning mechanisms cooperates to optimize flexible learning when adjusting to environments with real-world attentional demands.
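The four mechanisms named in the abstract can be illustrated with a minimal sketch of a hybrid learner. This is not the authors' fitted model: the class, parameter names, and values below are illustrative assumptions showing how fast working memory, asymmetric reinforcement learning, suppression of non-chosen feature values, and an error-trace-driven exploration rate might be combined.

```python
import math
import random


class HybridLearner:
    """Illustrative sketch (assumed parameters, not the paper's fitted model):
    (i) one-shot working memory, (ii) RL with separate learning rates for
    positive/negative prediction errors, (iii) decay of non-chosen feature
    values, (iv) meta-learned exploration via a trace of recent errors."""

    def __init__(self, n_features, alpha_pos=0.3, alpha_neg=0.15,
                 decay=0.1, beta0=5.0, wm_weight=0.8):
        self.V = [0.0] * n_features   # slow RL value per feature
        self.wm = {}                  # fast WM: object -> last observed reward
        self.err_trace = 0.0          # leaky memory of recent errors
        self.alpha_pos, self.alpha_neg = alpha_pos, alpha_neg
        self.decay, self.beta0, self.wm_weight = decay, beta0, wm_weight

    def value(self, obj):
        """obj is a tuple of feature indices; blend fast WM with slow RL."""
        rl = sum(self.V[f] for f in obj) / len(obj)
        if obj in self.wm:
            return self.wm_weight * self.wm[obj] + (1 - self.wm_weight) * rl
        return rl

    def choose(self, objects):
        # (iv) meta-learning: more recent errors -> lower beta -> explore more
        beta = self.beta0 / (1.0 + self.err_trace)
        weights = [math.exp(beta * self.value(o)) for o in objects]
        return random.choices(objects, weights=weights)[0]

    def update(self, chosen, others, reward):
        self.wm[chosen] = reward                     # (i) one-shot WM store
        pe = reward - self.value(chosen)             # prediction error
        alpha = self.alpha_pos if pe > 0 else self.alpha_neg  # (ii) asymmetric RL
        for f in chosen:
            self.V[f] += alpha * pe
        for obj in others:                           # (iii) suppress non-chosen
            for f in obj:
                self.V[f] *= (1 - self.decay)
        # error trace decays, bumps up on unrewarded trials
        self.err_trace = 0.8 * self.err_trace + (1.0 if reward == 0 else 0.0)
```

In a toy two-object task where one feature is consistently rewarded, the learner shifts preference toward objects carrying that feature, with working memory driving fast early improvement and the feature-value decay keeping distractor values suppressed.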
doi:10.1101/2020.09.27.315432 fatcat:cigb5t42kzandkjxefnqlykesq