A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit the original URL.
The file type is application/pdf.
Utility function security in artificially intelligent agents
2014
Journal of experimental and theoretical artificial intelligence (Print)
The notion of 'wireheading', or direct reward centre stimulation of the brain, is a well-known concept in neuroscience. In this paper, we examine the corresponding issue of reward (utility) function integrity in artificially intelligent machines. We survey the relevant literature and propose a number of potential solutions to ensure the integrity of our artificial assistants. Overall, we conclude that wireheading in rational self-improving optimisers above a certain capacity remains an unsolved
doi:10.1080/0952813x.2014.895114
fatcat:xkguojdutfhxxnyecwllfvryi4