A Multilingual Approach for Unsupervised Search Task Identification

Luis Lugo, Jose G. Moreno, Gilles Hubert
2020 Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval  
Users convert their information needs to search queries, which are then run on available search engines. Query logs registered by search engines enable the automatic identification of the search tasks that users perform to fulfill their information needs. Search engine logs contain queries in multiple languages, but most existing methods for search task identification are not multilingual. Some methods rely on search context training of custom embeddings or external indexed collections that support a single language, making it challenging to support the multiple languages of queries run in search engines. Other methods depend on supervised components and user identifiers to model search tasks. The supervised components require labeled collections, which are difficult and costly to get in multiple languages. Also, the need for user identifiers renders these methods infeasible in user-agnostic scenarios. Hence, we propose an unsupervised multilingual approach for search task identification. The proposed approach is user-agnostic, enabling its use in both user-independent and personalized scenarios. Furthermore, the multilingual query representation enables us to address the existing trade-off when mapping new queries to the identified search tasks.
doi:10.1145/3397271.3401258 dblp:conf/sigir/LugoMH20a fatcat:uuw6eugy7bc2jp3r6re6mzfoee