Uncertainty and exploration

by Samuel Gershman

Released as a post by Cold Spring Harbor Laboratory.

2018  

Abstract

In order to discover the most rewarding actions, agents must collect information about their environment, potentially foregoing reward. The optimal solution to this "explore-exploit" dilemma is often computationally challenging, but principled algorithmic approximations exist. These approximations utilize uncertainty about action values in different ways. Some random exploration algorithms scale the level of choice stochasticity with the level of uncertainty. Other directed exploration algorithms add a "bonus" to action values with high uncertainty. Random exploration algorithms are sensitive to total uncertainty across actions, whereas directed exploration algorithms are sensitive to relative uncertainty. This paper reports a multi-armed bandit experiment in which total and relative uncertainty were orthogonally manipulated. We found that humans employ both exploration strategies, and that these strategies are independently controlled by different uncertainty computations.
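
The two algorithm families named in the abstract can be made concrete. The sketch below is a minimal illustration rather than the paper's actual model: it pairs a UCB-style uncertainty bonus (directed exploration, driven by each arm's relative uncertainty) with Thompson sampling (random exploration, whose stochasticity scales with total uncertainty) on a two-armed Gaussian bandit. The prior variance, bonus weight, and arm means are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative two-armed Gaussian bandit. All parameters here are
# assumptions for exposition, not the paper's experimental design.
true_means = np.array([0.0, 0.5])   # latent arm values
reward_sd = 1.0                     # observation noise

# Posterior over each arm's mean (conjugate Gaussian, known variance).
post_mean = np.zeros(2)
post_var = np.full(2, 100.0)        # broad prior

def ucb_choice(mean, var, bonus=1.0):
    # Directed exploration: an uncertainty bonus is added to each arm's
    # value, favoring arms with high *relative* uncertainty.
    return int(np.argmax(mean + bonus * np.sqrt(var)))

def thompson_choice(mean, var):
    # Random exploration: sample values from the posterior, so choice
    # stochasticity grows with the *total* uncertainty across arms.
    return int(np.argmax(rng.normal(mean, np.sqrt(var))))

for t in range(200):
    arm = thompson_choice(post_mean, post_var)
    reward = rng.normal(true_means[arm], reward_sd)
    # Conjugate Gaussian update of the chosen arm's posterior.
    precision = 1.0 / post_var[arm] + 1.0 / reward_sd**2
    post_mean[arm] = (post_mean[arm] / post_var[arm]
                      + reward / reward_sd**2) / precision
    post_var[arm] = 1.0 / precision

print("posterior means:", post_mean.round(2))
print("directed (UCB) pick:    arm", ucb_choice(post_mean, post_var))
print("random (Thompson) pick: arm", thompson_choice(post_mean, post_var))
```

Note the contrast: the UCB rule is deterministic given the posterior and responds only to differences in uncertainty between arms, while the Thompson rule's randomness shrinks as the posterior variances shrink, even when the arms are equally uncertain.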

Archived Files and Locations

- application/pdf, 278.3 kB (file_ombv5evznje25iqt3r6cg2sewa): www.biorxiv.org (repository); web.archive.org (webarchive)
- application/pdf, 238.4 kB (file_lhcuvsrulnhmdgsiv3mtf5udjm): www.biorxiv.org (web); web.archive.org (webarchive)
- application/pdf, 281.5 kB (file_ihkrncqvynflpchtxe5fpo5edi): web.archive.org (webarchive); www.biorxiv.org (web)
Type: post
Stage: unknown
Date: 2018-02-14
Work Entity
access all versions, variants, and formats of this work (e.g., pre-prints)
Catalog Record
Revision: 53fab188-d50f-4add-a366-5009eb61d8f4