BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models [article]

Nandan Thakur, Nils Reimers, Andreas Rücklé, Abhishek Srivastava, Iryna Gurevych
2021 arXiv   pre-print
To address this, and to enable researchers to broadly evaluate the effectiveness of their models, we introduce Benchmarking-IR (BEIR), a robust and heterogeneous evaluation benchmark for information  ...  Our results show BM25 is a robust baseline and re-ranking and late-interaction-based models on average achieve the best zero-shot performance, albeit at high computational cost.  ...  The BEIR Benchmark: BEIR aims to provide a one-stop zero-shot evaluation benchmark for all diverse retrieval tasks.  ... 
arXiv:2104.08663v4 fatcat:fow5uqghbzggjclobtv7bpjtaa
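For readers who want to reproduce a zero-shot number on this benchmark, the sketch below follows the beir Python library's documented quick-start; the SciFact dataset and the MS MARCO-trained bi-encoder checkpoint are illustrative choices, not the paper's specific setup.

```python
# Sketch: zero-shot evaluation of a dense retriever on one BEIR dataset,
# following the beir library's quick-start (dataset and model are illustrative).
from beir import util
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval import models
from beir.retrieval.evaluation import EvaluateRetrieval
from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES

# Download and load one of the 18 BEIR datasets (here: SciFact).
url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/scifact.zip"
data_path = util.download_and_unzip(url, "datasets")
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split="test")

# Wrap an MS MARCO-trained bi-encoder and retrieve with exact (brute-force) search.
model = DRES(models.SentenceBERT("msmarco-distilbert-base-tas-b"), batch_size=64)
retriever = EvaluateRetrieval(model, score_function="dot")
results = retriever.retrieve(corpus, queries)

# Report the benchmark's standard metrics, including nDCG@10.
ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)
print(ndcg)
```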

Promptagator: Few-shot Dense Retrieval From 8 Examples [article]

Zhuyun Dai, Vincent Y. Zhao, Ji Ma, Yi Luan, Jianmo Ni, Jing Lu, Anton Bakalov, Kelvin Guu, Keith B. Hall, Ming-Wei Chang
2022 arXiv   pre-print
To amplify the power of a few examples, we propose Prompt-based Query Generation for Retriever (Promptagator), which leverages large language models (LLM) as a few-shot query generator, and creates task-specific  ...  In this paper, we suggest working on Few-shot Dense Retrieval, a setting where each task comes with a short description and a few examples.  ...  We thank Alex Salcianu for developing a bulk inference pipeline for large language models.  ... 
arXiv:2209.11755v1 fatcat:dwb2fvjmhfdtzkni36js43oj6q
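The core idea of few-shot query generation can be sketched without the paper's internal LLM: build a prompt from a task description plus a handful of (document, query) pairs, then generate one synthetic query per target-corpus document and train a task-specific retriever on the resulting pairs. In the sketch below, llm_generate is a hypothetical stand-in for any available LLM API, and the prompt layout is illustrative rather than the paper's exact template.

```python
# Sketch of few-shot, prompt-based query generation in the spirit of Promptagator.
# `llm_generate` is a hypothetical callable (prompt -> completion); the task
# description and examples are placeholders supplied by the user.

def build_prompt(task_description, examples, document):
    """Compose a few-shot prompt: task description, (document, query) pairs, new document."""
    lines = [task_description, ""]
    for doc, query in examples:
        lines += [f"Document: {doc}", f"Query: {query}", ""]
    lines += [f"Document: {document}", "Query:"]
    return "\n".join(lines)

def synthesize_training_data(task_description, examples, corpus, llm_generate):
    """Generate one synthetic query per corpus document; the (query, document)
    pairs can then be used to train a task-specific dense retriever."""
    pairs = []
    for doc in corpus:
        prompt = build_prompt(task_description, examples, doc)
        query = llm_generate(prompt).strip()  # hypothetical LLM call
        pairs.append((query, doc))
    return pairs
```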

Large Dual Encoders Are Generalizable Retrievers [article]

Jianmo Ni, Chen Qu, Jing Lu, Zhuyun Dai, Gustavo Hernández Ábrego, Ji Ma, Vincent Y. Zhao, Yi Luan, Keith B. Hall, Ming-Wei Chang, Yinfei Yang
2021 arXiv   pre-print
effective retrieval model for out-of-domain generalization.  ...  With multi-stage training, surprisingly, scaling up the model size brings significant improvement on a variety of retrieval tasks, especially for out-of-domain generalization.  ...  Acknowledgments We thank Chris Tar and Don Metzler for feedback and suggestions.  ... 
arXiv:2112.07899v1 fatcat:y6dydk7vnndmpp4cxy3r7fwhhm
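The dual-encoder setup the paper scales up can be summarized in a few lines: queries and documents are embedded independently into fixed-size vectors and scored by dot product. The checkpoint name below (a GTR model published on the Hugging Face Hub) is an illustrative assumption; any sentence-transformers bi-encoder exposes the same interface.

```python
# Sketch of dual-encoder retrieval: embed queries and documents separately,
# then rank documents by dot-product similarity.
from sentence_transformers import SentenceTransformer
import numpy as np

encoder = SentenceTransformer("sentence-transformers/gtr-t5-base")  # illustrative checkpoint

docs = ["Dense retrieval encodes text into vectors.",
        "BM25 is a lexical ranking function."]
queries = ["what is dense retrieval"]

doc_emb = encoder.encode(docs, normalize_embeddings=True)       # (num_docs, dim)
query_emb = encoder.encode(queries, normalize_embeddings=True)  # (num_queries, dim)

scores = query_emb @ doc_emb.T          # similarity matrix, one row per query
ranking = np.argsort(-scores, axis=1)   # document indices sorted by relevance
print(ranking)
```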

LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrieval [article]

Canwen Xu and Daya Guo and Nan Duan and Julian McAuley
2022 arXiv   pre-print
We evaluate LaPraDoR on the recently proposed BEIR benchmark, including 18 datasets of 9 zero-shot text retrieval tasks.  ...  In this paper, we propose LaPraDoR, a pretrained dual-tower dense retriever that does not require any supervised data for training.  ...  Acknowledgments We would like to thank the anonymous reviewers for their insightful comments. We would like to thank the authors of BEIR , Nandan Thakur and Nils Reimers, for their support.  ... 
arXiv:2203.06169v2 fatcat:z6bfdcuynrcfhe4dvopehno7wa
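LaPraDoR additionally combines lexical and dense relevance signals (lexicon-enhanced dense retrieval). The sketch below multiplies a BM25 score with a bi-encoder similarity, which is one simple way to realize that combination; the exact fusion rule and the MiniLM checkpoint used here are assumptions for illustration, not details taken from the paper.

```python
# Generic sketch of lexicon-enhanced dense scoring: combine a BM25 score with a
# dense similarity for each candidate document (here via a simple product).
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer
import numpy as np

docs = ["Dense retrieval encodes text into vectors.",
        "BM25 is a lexical ranking function."]
query = "lexical ranking with BM25"

# Lexical scores from BM25 over whitespace-tokenized text.
bm25 = BM25Okapi([d.lower().split() for d in docs])
lex_scores = bm25.get_scores(query.lower().split())

# Dense scores from an off-the-shelf bi-encoder (illustrative checkpoint).
encoder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
doc_emb = encoder.encode(docs, normalize_embeddings=True)
q_emb = encoder.encode([query], normalize_embeddings=True)[0]
dense_scores = doc_emb @ q_emb

combined = lex_scores * dense_scores  # lexicon-enhanced score
print(np.argsort(-combined))          # documents ranked by the combined score
```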

Domain Adaptation for Memory-Efficient Dense Retrieval [article]

Nandan Thakur, Nils Reimers, Jimmy Lin
2022 arXiv   pre-print
In practice, retrieval models are often used in an out-of-domain setting, where they have been trained on a publicly available dataset, like MS MARCO, but are then used for some custom dataset for which  ...  Our domain-adapted strategy, known as GPL, is model agnostic and achieves an improvement of up to 19.3 and 11.6 points in nDCG@10 across the BEIR benchmark in comparison to BPR and JPQ, while maintaining its  ...  We would additionally like to thank Kexin Wang for his helpful feedback and participation in the weekly research meetings.  ... 
arXiv:2205.11498v1 fatcat:pj5uuq2zczctfhmd3w47rmgaty
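The GPL recipe referenced in this record boils down to pseudo-labeling with a cross-encoder: score (query, positive) and (query, hard-negative) pairs with the teacher, then train the dense retriever so that its dot-product margin matches the teacher's margin (MarginMSE). The sketch below shows only that labeling and loss step; the cross-encoder checkpoint is illustrative, and synthetic query generation and hard-negative mining are omitted.

```python
# Sketch of GPL-style pseudo-labeling and the MarginMSE objective.
import torch
from sentence_transformers import CrossEncoder

cross_encoder = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # illustrative teacher

def pseudo_label(query, positive, negative):
    """Teacher margin = CE(query, positive) - CE(query, negative)."""
    ce_pos, ce_neg = cross_encoder.predict([(query, positive), (query, negative)])
    return float(ce_pos - ce_neg)

def margin_mse_loss(q_emb, pos_emb, neg_emb, target_margin):
    """MarginMSE: the student bi-encoder's dot-product margin should match the teacher's."""
    student_margin = (q_emb * pos_emb).sum(-1) - (q_emb * neg_emb).sum(-1)
    target = torch.as_tensor(target_margin, dtype=student_margin.dtype)
    return torch.nn.functional.mse_loss(student_margin, target)
```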

Parameter-Efficient Prompt Tuning Makes Generalized and Calibrated Neural Text Retrievers [article]

Weng Lam Tam, Xiao Liu, Kaixuan Ji, Lilong Xue, Xingjian Zhang, Yuxiao Dong, Jiahua Liu, Maodi Hu, Jie Tang
2022 arXiv   pre-print
Notably, it can significantly improve the out-of-domain zero-shot generalization of the retrieval models.  ...  In this work, we study the problem of prompt tuning for neural text retrievers.  ...  We adopt Benchmarking-IR (BEIR), proposed in (Thakur et al., 2021), a zero-shot generalization benchmark for evaluating retrievers on tasks across domains.  ... 
arXiv:2207.07087v1 fatcat:vzkzz577yjentn7ivi7q7uvzvq
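Prompt tuning in this setting means freezing the retriever backbone and training only a small matrix of soft-prompt embeddings prepended to the input. A minimal sketch, assuming a BERT-style backbone and a prompt length of 32 (both illustrative):

```python
# Sketch of parameter-efficient prompt tuning: the encoder is frozen and only the
# soft-prompt embeddings (prepended to the token embeddings) are trainable.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"  # illustrative backbone
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoder = AutoModel.from_pretrained(model_name)
for p in encoder.parameters():
    p.requires_grad = False  # freeze all backbone parameters

prompt_len, hidden = 32, encoder.config.hidden_size
soft_prompt = torch.nn.Parameter(torch.randn(prompt_len, hidden) * 0.02)  # only trainable weights

def encode(texts):
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=512 - prompt_len, return_tensors="pt")
    tok_emb = encoder.embeddings.word_embeddings(batch["input_ids"])    # (B, L, H)
    prompts = soft_prompt.unsqueeze(0).expand(tok_emb.size(0), -1, -1)  # (B, P, H)
    inputs_embeds = torch.cat([prompts, tok_emb], dim=1)                # prepend soft prompt
    attn = torch.cat(
        [torch.ones(tok_emb.size(0), prompt_len, dtype=batch["attention_mask"].dtype),
         batch["attention_mask"]], dim=1)
    out = encoder(inputs_embeds=inputs_embeds, attention_mask=attn)
    return out.last_hidden_state.mean(dim=1)  # mean-pooled text embedding
```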

ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction [article]

Keshav Santhanam, Omar Khattab, Jon Saad-Falcon, Christopher Potts, Matei Zaharia
2022 arXiv   pre-print
We evaluate ColBERTv2 across a wide range of benchmarks, establishing state-of-the-art quality within and outside the training domain while reducing the space footprint of late interaction models by 6–  ...  Neural information retrieval (IR) has greatly advanced search and other knowledge-intensive language tasks.  ...  Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.  ... 
arXiv:2112.01488v3 fatcat:fuvq6jdnbvc7rcvb5bquntbiw4
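Late interaction scores a query against a document token by token: each query token embedding is matched to its most similar document token embedding, and the maxima are summed (MaxSim). A self-contained sketch in plain PyTorch, independent of the ColBERT codebase:

```python
# Sketch of MaxSim late-interaction scoring as used in ColBERT-style models.
import torch

def maxsim_score(query_emb: torch.Tensor, doc_emb: torch.Tensor) -> torch.Tensor:
    """
    query_emb: (num_query_tokens, dim), doc_emb: (num_doc_tokens, dim),
    both assumed L2-normalized so dot products are cosine similarities.
    """
    sim = query_emb @ doc_emb.T         # (num_query_tokens, num_doc_tokens)
    return sim.max(dim=1).values.sum()  # best doc token per query token, then sum

# Tiny usage example with random, normalized token embeddings.
q = torch.nn.functional.normalize(torch.randn(8, 128), dim=-1)
d = torch.nn.functional.normalize(torch.randn(120, 128), dim=-1)
print(maxsim_score(q, d))
```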

UnifieR: A Unified Retriever for Large-Scale Retrieval [article]

Tao Shen, Xiubo Geng, Chongyang Tao, Can Xu, Kai Zhang, Daxin Jiang
2022 arXiv   pre-print
We lastly evaluate the model on the BEIR benchmark to verify its transferability.  ...  Experiments on passage retrieval benchmarks verify its effectiveness in both paradigms. A uni-retrieval scheme is further presented with even better retrieval quality.  ...  Table 3 shows in-domain evaluation and zero-shot transfer on BEIR.  ... 
arXiv:2205.11194v1 fatcat:ttwz7flj75dg7ezchxsos2tfum
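UnifieR unifies dense-vector and lexicon-based retrieval in a single model. As a rough illustration of serving both paradigms at query time, the sketch below fuses two candidate lists by summing min-max-normalized scores; this is a generic hybrid scheme and may well differ from the paper's uni-retrieval design.

```python
# Generic sketch of hybrid (dense + lexical) candidate fusion at query time.
def normalize(scores):
    """Min-max normalize a dict of doc_id -> score into [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {doc_id: (s - lo) / span for doc_id, s in scores.items()}

def hybrid_rank(dense_scores, lexical_scores, k=10):
    """dense_scores / lexical_scores: dicts mapping doc_id -> retrieval score."""
    dense_n, lex_n = normalize(dense_scores), normalize(lexical_scores)
    candidates = set(dense_n) | set(lex_n)                  # union of both candidate sets
    fused = {d: dense_n.get(d, 0.0) + lex_n.get(d, 0.0) for d in candidates}
    return sorted(fused, key=fused.get, reverse=True)[:k]   # top-k doc_ids by fused score
```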

Pre-training Methods in Information Retrieval [article]

Yixing Fan, Xiaohui Xie, Yinqiong Cai, Jia Chen, Xinyu Ma, Xiangsheng Li, Ruqing Zhang, Jiafeng Guo
2022 arXiv   pre-print
The core of information retrieval (IR) is to identify relevant information from large-scale resources and return it as a ranked list to respond to the user's information need.  ...  In recent years, the resurgence of deep learning has greatly advanced this field and led to a hot topic named NeuIR (i.e., neural information retrieval), especially the paradigm of pre-training methods  ... 
arXiv:2111.13853v3 fatcat:pilemnpphrgv5ksaktvctqdi4y

LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrieval

Canwen Xu, Daya Guo, Nan Duan, Julian McAuley
2022 Findings of the Association for Computational Linguistics: ACL 2022   unpublished
We evaluate LaPraDoR on the recently proposed BEIR benchmark, including 18 datasets of 9 zero-shot text retrieval tasks.  ...  In this paper, we propose LaPraDoR, a pretrained dual-tower dense retriever that does not require any supervised data for training.  ...  Acknowledgments We would like to thank the anonymous reviewers for their insightful comments. We would like to thank the authors of BEIR , Nandan Thakur and Nils Reimers, for their support.  ... 
doi:10.18653/v1/2022.findings-acl.281 fatcat:42hfbkrnnbajbkn5mf6koip5ga

ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction

Keshav Santhanam, Omar Khattab, Jon Saad-Falcon, Christopher Potts, Matei Zaharia
2022 Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies   unpublished
We evaluate ColBERTv2 across a wide range of benchmarks, establishing state-of-the-art quality within and outside the training domain while reducing the space footprint of late interaction models by 6-  ...  Neural information retrieval (IR) has greatly advanced search and other knowledgeintensive language tasks.  ...  Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.  ... 
doi:10.18653/v1/2022.naacl-main.272 fatcat:cgdy5zevybh4xjglxgsa4vpq4i