Fast Ranking with Additive Ensembles of Oblivious and Non-Oblivious Regression Trees

Domenico Dato, Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Nicola Tonellotto, Rossano Venturini
2016 ACM Transactions on Information Systems  
Learning-to-Rank models based on additive ensembles of regression trees have proven to be very effective for scoring query results returned by large-scale Web search engines. Unfortunately, the computational cost of scoring thousands of candidate documents by traversing large ensembles of trees is high. Thus, several works have investigated solutions aimed at improving the efficiency of document scoring by exploiting advanced features of modern CPUs and memory hierarchies. In this paper, we present QUICKSCORER, a new algorithm that adopts a novel cache-efficient representation of a given tree ensemble, performs an interleaved traversal of the ensemble by means of fast bitwise operations, and also supports ensembles of oblivious trees. An extensive and detailed experimental assessment is conducted on two standard Learning-to-Rank datasets and on a novel very large dataset that we made publicly available for conducting significant efficiency tests. The experiments show unprecedented speedups over the best state-of-the-art baselines, ranging from 1.9x to 6.6x. The analysis of low-level profiling traces shows that QUICKSCORER's efficiency is due to its cache-aware approach, both in terms of data layout and access patterns, and to a control flow that entails very low branch mis-prediction rates.

This article is an extension of a previous paper: it adds a scoring algorithm for ensembles of oblivious trees, a blockwise version of the scoring algorithm, and a new large-scale Learning-to-Rank dataset, together with results from experiments on this new dataset.
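To make the interleaved, bitwise traversal mentioned in the abstract concrete, here is a minimal C++ sketch of the bitvector-based scoring idea. It assumes trees with at most 64 leaves (so one uint64_t bitvector per tree suffices), a layout in which all branching nodes are grouped by tested feature and sorted by ascending threshold, and the GCC/Clang __builtin_ctzll intrinsic; the Cond/Ensemble structures and all names are illustrative assumptions, not the paper's actual data layout.

```cpp
#include <cstdint>
#include <vector>

struct Cond {                 // one branching node, grouped by feature
    float threshold;          // test: x[feature] <= threshold
    uint32_t tree_id;         // tree this node belongs to
    uint64_t mask;            // 0-bits mark leaves unreachable when the test is false
};

struct Ensemble {
    uint32_t num_trees;
    uint32_t num_features;
    // conds[f] holds every node of every tree that tests feature f,
    // sorted by ascending threshold.
    std::vector<std::vector<Cond>> conds;
    // leaf_values[t][l] = output of leaf l of tree t.
    std::vector<std::vector<double>> leaf_values;
};

double quickscorer_sketch(const Ensemble& e, const std::vector<float>& x) {
    // One bitvector per tree, initially all ones (every leaf reachable).
    std::vector<uint64_t> v(e.num_trees, ~0ULL);

    // Interleaved traversal: visit nodes feature by feature instead of
    // tree by tree, masking out leaves cut off by each false test.
    for (uint32_t f = 0; f < e.num_features; ++f) {
        for (const Cond& c : e.conds[f]) {
            if (x[f] <= c.threshold) break;   // this and all later tests hold
            v[c.tree_id] &= c.mask;           // false node: discard its leaves
        }
    }

    // The exit leaf of each tree is its leftmost surviving leaf; with
    // leaf i stored at bit i, that is the lowest set bit.
    double score = 0.0;
    for (uint32_t t = 0; t < e.num_trees; ++t)
        score += e.leaf_values[t][__builtin_ctzll(v[t])];
    return score;
}
```

The early `break` in the inner loop is what the sorted-threshold layout buys: once one test on a feature holds for the document, all remaining tests on that feature hold as well, so the scan stops with a single, easily predicted branch, which is consistent with the low mis-prediction rates the abstract reports.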
BACKGROUND AND RELATED WORK

GRADIENT-BOOSTED REGRESSION TREES (GBRT) [Friedman 2001] and LAMBDA-MART (λ-MART) [Wu et al. 2010] are two of the most effective LtR algorithms. The GBRT algorithm builds a model by minimizing the root mean squared error on a given training set. This loss function makes GBRT a point-wise LtR algorithm: query-document pairs are exploited independently at learning time, and GBRT is trained to predict the relevance label associated with each of these pairs. The λ-MART algorithm improves over GBRT by optimizing list-wise IR measures such as NDCG [Järvelin and Kekäläinen 2002], which involve the whole list of documents associated with each query.
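For contrast, the following sketch shows the conventional way an additive ensemble produced by GBRT or λ-MART is evaluated at scoring time: each tree is traversed independently from root to exit leaf, and the leaf outputs are summed. The Node/Tree layout and all names are illustrative assumptions, not taken from the paper.

```cpp
#include <vector>

struct Node {
    int feature;      // feature tested at this node (-1 marks a leaf)
    float threshold;  // go left iff x[feature] <= threshold
    int left, right;  // child indices within the tree's node array
    double output;    // prediction stored at a leaf
};

using Tree = std::vector<Node>;  // nodes of one regression tree, root at index 0

// Score one document against the whole ensemble with per-tree
// root-to-leaf traversals.
double score(const std::vector<Tree>& ensemble, const std::vector<float>& x) {
    double s = 0.0;
    for (const Tree& t : ensemble) {
        int n = 0;
        while (t[n].feature >= 0)  // descend until a leaf is reached
            n = (x[t[n].feature] <= t[n].threshold) ? t[n].left : t[n].right;
        s += t[n].output;
    }
    return s;
}
```

Each step of this traversal takes a data-dependent branch and jumps to a child node at an unpredictable memory location, so large ensembles suffer frequent branch mis-predictions and cache misses; this is precisely the cost that QUICKSCORER's interleaved, bitwise approach is designed to avoid.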
doi:10.1145/2987380