We consider a bandit problem over a graph in which the rewards are not directly observed. Instead, the decision maker can compare two nodes and receive (stochastic) information about the difference in their values. The graph structure describes the set of possible comparisons. Consequently, comparing two nodes that are far apart requires estimating the difference between each pair of adjacent nodes on the path between them. We analyze this problem from the perspective of sample
arXiv:1109.2296v1 fatcat:rqaxdtmarzgvfpux2w3ze7yzdi
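
The point that a comparison between distant nodes has to be assembled from comparisons along a path can be illustrated with a minimal sketch. It is not the paper's algorithm. The path graph, the node values, the Gaussian noise model, and the names `shortest_path`, `estimate_difference`, `compare`, and `samples_per_edge` are all assumptions made for illustration. Each edge on the path is sampled a fixed number of times, and the averaged edge differences are summed.

```python
import random
from collections import deque


def shortest_path(adj, src, dst):
    """Return a shortest path from src to dst in an unweighted graph (BFS)."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    if dst not in parent:
        raise ValueError("no path between the two nodes")
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def estimate_difference(adj, compare, src, dst, samples_per_edge):
    """Estimate value(dst) - value(src) by summing averaged edge comparisons."""
    path = shortest_path(adj, src, dst)
    total = 0.0
    for u, v in zip(path, path[1:]):
        # Average repeated noisy comparisons of one adjacent pair.
        draws = [compare(u, v) for _ in range(samples_per_edge)]
        total += sum(draws) / samples_per_edge
    return total


# Toy instance (assumed): a path graph with hidden node values.
rng = random.Random(0)
values = [0.0, 0.3, 0.1, 0.8, 0.5]
adj = {
    i: [j for j in (i - 1, i + 1) if 0 <= j < len(values)]
    for i in range(len(values))
}


def compare(u, v):
    """Noisy observation of value(v) - value(u); Gaussian noise is an assumption."""
    return values[v] - values[u] + rng.gauss(0.0, 1.0)


estimate = estimate_difference(adj, compare, 0, 4, samples_per_edge=2000)
print(f"estimated difference: {estimate:.3f} (true difference: 0.5)")
```

Because the edge estimates are summed, their errors accumulate along the path, so in this sketch a longer path needs more samples per edge to reach the same accuracy.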