Adapting Markov Decision Process for Search Result Diversification

Long Xia, Jun Xu, Yanyan Lan, Jiafeng Guo, Wei Zeng, Xueqi Cheng
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '17), 2017
In this paper we address the issue of learning diverse ranking models for search result diversification. Typical methods treat the problem of constructing a diverse ranking as a process of sequential document selection. At each ranking position, the document that can provide the largest amount of additional information to the user is selected, because search users usually browse the documents in a top-down manner. Thus, to select an optimal document for a position, it is critical for a ranking model to capture the utility of information the user has perceived from the preceding documents. Existing methods usually calculate the ranking scores (e.g., the marginal relevance) directly based on the query and the selected documents, with heuristic rules or handcrafted features. The utility the user perceives at each rank, however, is not explicitly modeled. In this paper, we present a novel diverse ranking model on the basis of a continuous-state Markov decision process (MDP) in which the user-perceived utility is modeled as part of the MDP state. Our model, referred to as MDP-DIV, sequentially takes the action of selecting one document according to the current state, and then updates the state for choosing the next action. The transitions between states are modeled in a recurrent manner and the model parameters are learned with policy gradient. Experimental results based on the TREC benchmarks show that MDP-DIV can significantly outperform the state-of-the-art baselines.

KEYWORDS: learning to rank; search result diversification; Markov decision process
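The sequential selection loop and recurrent state update described in the abstract can be made concrete with a short sketch. The Python below is a minimal illustration, assuming dense document feature vectors and a bilinear softmax policy; the parameter shapes, the tanh transition cell, and the helper names (`policy`, `transition`, `rollout`) are illustrative assumptions, not the authors' exact parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, DOC_DIM = 16, 8

# Hypothetical model parameters: a bilinear policy scorer and a
# recurrent state-transition cell (assumed forms, for illustration only).
U = rng.normal(scale=0.1, size=(STATE_DIM, DOC_DIM))
W_s = rng.normal(scale=0.1, size=(STATE_DIM, STATE_DIM))
W_x = rng.normal(scale=0.1, size=(STATE_DIM, DOC_DIM))

def policy(state, docs):
    """Softmax distribution over candidate documents given the current state."""
    scores = docs @ U.T @ state      # one score per candidate document
    scores -= scores.max()           # subtract max for numerical stability
    p = np.exp(scores)
    return p / p.sum()

def transition(state, doc):
    """Recurrent update: fold the chosen document's information into the state,
    so the state tracks the utility the user has perceived so far."""
    return np.tanh(W_s @ state + W_x @ doc)

def rollout(docs, k):
    """Sequentially select k documents; return the ranking and log-probabilities."""
    state = np.zeros(STATE_DIM)      # initial state: no utility perceived yet
    remaining = list(range(len(docs)))
    ranking, log_probs = [], []
    for _ in range(k):
        p = policy(state, docs[remaining])
        i = rng.choice(len(remaining), p=p)   # sample an action (a document)
        log_probs.append(np.log(p[i]))
        chosen = remaining.pop(i)
        ranking.append(chosen)
        state = transition(state, docs[chosen])
    return ranking, log_probs

# Example usage with random candidate features.
docs = rng.normal(size=(20, DOC_DIM))
ranking, log_probs = rollout(docs, k=5)
```

Training would then follow a standard policy-gradient (REINFORCE) recipe consistent with the abstract: sample rankings with `rollout`, score each with a diversity measure such as α-NDCG as the reward, and update the parameters along the reward-weighted gradient of the summed log-probabilities.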
doi:10.1145/3077136.3080775 dblp:conf/sigir/XiaXLGZC17