Speeding up Document Ranking with Rank-based Features

Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Nicola Tonellotto
2015 Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval - SIGIR '15  
We report a comprehensive evaluation showing that rank-based features allow us to achieve the desired effectiveness with ranking models being up to 3.5 times smaller than models not using them, with a  ...  In this paper we propose a new family of rank-based features, which extend the original feature vector associated with each query-document pair.  ...  Experiments proved that our proposed features can reduce the number of regression trees generated by gradient boosting approaches up to 3.5 times and they provide a speed-up in the scoring time up to 70%  ... 
doi:10.1145/2766462.2767776 dblp:conf/sigir/LuccheseNOPT15 fatcat:4i322gmdxfen5hwimnyes76cj4

Research on B Cell Algorithm for Learning to Rank Method Based on Parallel Strategy

Yuling Tian, Hongxian Zhang, Quan Zou
2016 PLoS ONE  
Ranking the documents according to their relevance within the system to meet user needs is a challenging endeavor, and a hot research topic; there already exist several rank-learning methods based on machine  ...  For the purposes of information retrieval, users must find highly relevant documents from within a system (and often a quite large one comprised of many individual documents) based on input query.  ...  Fig 9 shows that RankBCA tended to be convergent as the iterations progressed, and that its convergence rate was considerably faster than that of AdaRank. (3) speed-up ratio The result of the speed-up  ... 
doi:10.1371/journal.pone.0157994 pmid:27487242 pmcid:PMC4972358 fatcat:nbxpmvhbcrbpbawm6435w2czfq

GPU-based Parallelization of QuickScorer to Speed-up Document Ranking with Tree Ensembles

Francesco Lettich, Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Nicola Tonellotto, Rossano Venturini
2016 Italian Information Retrieval Workshop  
Scoring documents with learning-to-rank (LtR) models based on large ensembles of regression trees currently represents one of the most effective solutions to rank query results returned by large scale  ...  To this end we propose GPUSCORER, a GPU-based parallelization of the state-of-the-art algorithm QUICKSCORER to score documents with tree ensembles.  ...  In this work we propose GPUSCORER, a GPU-based parallelization of QUICKSCORER [1], the state-of-the-art algorithm to score documents with tree ensembles.  ... 
dblp:conf/iir/LettichLNOPTV16 fatcat:l4z4bkc6cnbnhm46qq6sjfvcri

CCRank: Parallel Learning to Rank with Cooperative Coevolution

Shuaiqiang Wang, Byron Gao, Ke Wang, Hady Lauw
With CCRank, we investigate parallel CC in the context of learning to rank.  ...  CCRank is based on cooperative coevolution (CC), a divide-and-conquer framework that has demonstrated high promise in function optimization for problems with large search space and complex structures.  ...  Due to diversity of queries and documents, learning to rank involves larger and larger training data with many features.  ... 
doi:10.1609/aaai.v25i1.8078 fatcat:dvjdl3qhkre63br5ti2bfzfddy

A Cooperative Coevolution Framework for Parallel Learning to Rank

Shuaiqiang Wang, Yun Wu, Byron J. Gao, Ke Wang, Hady W. Lauw, Jun Ma
2015 IEEE Transactions on Knowledge and Data Engineering  
With CCRank, we investigate parallel CC in the context of learning to rank. We implement CCRank with three EA-based learning to rank algorithms for demonstration.  ...  We propose CCRank, the first parallel framework for learning to rank based on evolutionary algorithms (EA), aiming to significantly improve learning efficiency while maintaining accuracy.  ...  Due to the diversity of queries and documents, learning to rank involves larger and larger training data with many features.  ... 
doi:10.1109/tkde.2015.2453952 fatcat:gltmjpyaencx5dnulatxppwyyy

Panel of Attribute Selection Methods to Rank Features Drastically Improves Accuracy in Filtering Web-pages Suitable for Education

Vladimir Estivill-Castro, Matteo Lombardi, Alessandro Marani
2019 Proceedings of the 11th International Conference on Computer Supported Education  
A document not suitable for learning, although well related to the query, should never be recommended to a student.  ...  Then, we present a new feature selection method for lowering the number of attributes of the items. We build a committee of feature selection methods, but do not use it as an ensemble.  ...  The speed recorded using the dummies indicates feature selection with SVM is able to catch up with Rank Score till becoming 3% faster (in the x16 dataset).  ... 
doi:10.5220/0007676300480057 dblp:conf/csedu/Estivill-Castro19a fatcat:7thi2xu7xzhjthvwwhs44bnu2a

Distributed Architecture for Large Scale Image-Based Search

Yu Zheng, Xing Xie, Wei-Ying Ma
2007 Multimedia and Expo, 2007 IEEE International Conference on  
In recent years, some computer vision algorithms such as SIFT (Scale Invariant Feature Transform) have been employed in image similarity match to perform image-based search applications.  ...  However, with the increasing scale of image databases, centralized image retrieval system no longer provide adequate prompt search.  ...  .  Document based: Retrieved Images from same document will be merged and represented by the document they belong to with sum of matched features as well as average distance computed as equation (3)  ... 
doi:10.1109/icme.2007.4284716 dblp:conf/icmcs/ZhengXM07 fatcat:2jxbgj4zzrcrtcjjkgll5bojhq

Use of Solr and Xapian in the Invenio document repository software [article]

Patrick O. Glauner, Jan Iwaszkiewicz, Jean-Yves Le Meur, Tibor Simko
2013 arXiv   pre-print
Invenio is a free comprehensive web-based document repository and digital library software suite originally developed at CERN.  ...  Consequently, Invenio takes advantage of Solr's efficient search and word similarity ranking capabilities. In this paper, we first give an overview of Invenio, its capabilities and features.  ...  They are designed to speed up queries assuming a high amount of selects and lower amount of updates.  ... 
arXiv:1310.0250v1 fatcat:e3hnfuthnvd2hnoolol5uluoe4

Beyond dwell time

Qi Guo, Eugene Agichtein
2012 Proceedings of the 21st international conference on World Wide Web - WWW '12  
The experimental results show that PCB is significantly more effective than using page dwell time information alone, both for estimating the explicit judgments of each user, and for re-ranking the results  ...  This paper shows that post-click searcher behavior, such as cursor movement and scrolling, provides additional clues for better estimating document relevance.  ...  On the basis of the text passages they built up term-based task profiles which were then used for re-ranking search result lists.  ... 
doi:10.1145/2187836.2187914 dblp:conf/www/GuoA12 fatcat:ozftgvdw3bfadfnlz3qqezdtva


1998 Pattern Recognition  
In this paper, we present two techniques for speeding up character recognition.  ...  To further speed-up the recognition speed, we use a modified branch-and-bound algorithm in the detail-matching module.  ...  The total speed-up ratio can be 3.92;2.02 on comparing with the nominator.  ... 
doi:10.1016/s0031-3203(98)00043-0 fatcat:bdrwqmwkjrd25otrbqbcnu7zka

Post-Learning Optimization of Tree Ensembles for Efficient Ranking

Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Fabrizio Silvestri, Salvatore Trani
2016 Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval - SIGIR '16  
Experiments conducted on two publicly available LtR datasets show that CLEaVER is able to prune up to 80% of the trees and provides an efficiency speed-up up to 2.6x without affecting the effectiveness  ...  Learning to Rank (LtR) is the machine learning method of choice for producing high quality document ranking functions from a ground-truth of training examples.  ...  speed-up of 2.9x and 2.1x respectively.  ... 
doi:10.1145/2911451.2914763 dblp:conf/sigir/LuccheseNOPST16 fatcat:wzt4avnpbbfjbdbu5gnssemryq

Learning Term Discrimination

Jibril Frej, Philippe Mulhem, Didier Schwab, Jean-Pierre Chevallet
2020 Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval  
In this work, we propose to learn TDVs for document indexing with shallow neural networks that approximate traditional IR ranking functions such as TF-IDF and BM25.  ...  Our learned TDVs, when used to filter out terms of the vocabulary that have zero discrimination value, allow to both significantly lower the memory footprint of the inverted index and speed up the retrieval  ...  Second, by filtering out terms with zero discrimination value from the inverted index, we are able to significantly speed up the retrieval process of all ranking functions on all collections.  ... 
doi:10.1145/3397271.3401211 dblp:conf/sigir/FrejMSC20 fatcat:7dqdoez63rb3xcewhcpouz7lwq

Learning Early Exit Strategies for Additive Ranking Ensembles

Francesco Busolin, Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Salvatore Trani
2021 Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval  
Modern search engine ranking pipelines are commonly based on large machine-learned ensembles of regression trees.  ...  LEAR exploits a classifier that predicts whether a document can early exit the ensemble because it is unlikely to be ranked among the final top-𝑘 results.  ...  LEAR is based on a binary classifier that exploits query-document features along with their score/rank cumulated up to a given ensemble's tree, called "sentinel".  ... 
doi:10.1145/3404835.3463088 fatcat:t6ah34vfxvdnvfr2xlpzo2ok6e

Supervised Semantic Indexing [chapter]

Bing Bai, Jason Weston, Ronan Collobert, David Grangier
2009 Lecture Notes in Computer Science  
We present a class of models that are discriminatively trained to directly map from the word content in a query-document or document-document pair to a ranking score.  ...  However, unlike LSI our models are trained with a supervised signal directly on the task of interest, which we argue is the reason for our superior results.  ...  Discussion We have described a versatile, powerful set of discriminatively trained models for document ranking.  ... 
doi:10.1007/978-3-642-00958-7_81 fatcat:ixjpdfiehnejjonvnzclshj76y

Intelligent Topic Selection for Low-Cost Information Retrieval Evaluation: A New Perspective on Deep vs. Shallow Judging [article]

Mucahid Kutlu, Tamer Elsayed, Matthew Lease
2017 arXiv   pre-print
test collections at the scale of today's massive document collections.  ...  While test collections provide the cornerstone for Cranfield-based evaluation of information retrieval (IR) systems, it has become practically infeasible to rely on traditional pooling techniques to construct  ...  based on remaining documents.  ... 
arXiv:1701.07810v4 fatcat:2jtkw26ngfdxnlfzutcw5kpopa