A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021; you can also visit <a rel="external noopener" href="https://arxiv.org/pdf/2104.01474v1.pdf">the original URL</a>. The file type is <code>application/pdf</code>.
<span class="release-stage" >pre-print</span>
Animal brains evolved to optimize behavior in dynamically changing environments, selecting actions that maximize future rewards. A large body of experimental work indicates that such optimization changes the wiring of neural circuits, appropriately mapping environmental input onto behavioral outputs. A major unsolved scientific question is how optimal wiring adjustments, which must target the connections responsible for rewards, can be accomplished when the relation between sensory inputs, action taken, and environmental context with rewards is ambiguous. The computational problem of properly targeting cues, contexts and actions that lead to reward is known as structural, contextual and temporal credit assignment, respectively. In this review, we survey prior approaches to these three types of problems and advance the notion that the brain's specialized neural architectures provide efficient solutions. Within this framework, the thalamus, with its cortical and basal ganglia interactions, serves as a systems-level solution to credit assignment. Specifically, we propose that thalamocortical interaction is the locus of meta-learning, where the thalamus provides cortical control functions that parametrize the cortical activity association space. By selecting among these control functions, the basal ganglia hierarchically guide thalamocortical plasticity across two timescales to enable meta-learning. The faster timescale establishes contextual associations to enable rapid behavioral flexibility, while the slower one enables generalization to new contexts. Incorporating different thalamic control functions under this framework clarifies how thalamocortical-basal ganglia interactions may simultaneously solve the three credit assignment problems.
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2104.01474v1">arXiv:2104.01474v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/ezxyezl2r5at3kuptlfsebacxi">fatcat:ezxyezl2r5at3kuptlfsebacxi</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20210407002916/https://arxiv.org/pdf/2104.01474v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/c7/d5/c7d592df8f4517029154aee2bb62d8f20a7d0014.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2104.01474v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>