Cross-Lingual Information Retrieve in Sogou Search
2017
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval - SIGIR '17
To break the language barrier and connect Chinese people to foreign language information around the world, Sogou has built a cross-lingual information retrieval (CLIR) system named Sogou English ...
(http://english.sogou.com), which enables Chinese people to search and browse foreign language information with Chinese. ...
Sogou English is built on Sogou Search, the second largest search engine in China. In addition, neural machine translation (NMT) is used for the translation component. ...
doi:10.1145/3077136.3096463
dblp:conf/sigir/XuZX17
fatcat:7oytbpkl4ffghiamkjxeoezvrq
SOGOU-2012-CRAWL
2016
Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval - SIGIR '16
In 2012, Sogou, a major Chinese web search engine, released a large-scale query log containing 43.5M user interactions, including submitted queries and clicked web page search results. ...
A real large-scale query log with accompanying crawl such as this offers several opportunities for reproducible information retrieval (IR) research, including query classification, intent modelling and ...
INTRODUCTION A great deal of research in information retrieval (IR) is based on analysing the past behaviour of real IR system users. ...
doi:10.1145/2911451.2914668
dblp:conf/sigir/WhitingJA16
fatcat:ribprbmywfdbfb65hsrbpd7ewy
Teaching a New Dog Old Tricks: Resurrecting Multilingual Retrieval Using Zero-Shot Learning
[chapter]
2020
Lecture Notes in Computer Science
While billions of non-English speaking users rely on search engines every day, the problem of ad-hoc information retrieval is rarely studied for non-English languages. ...
In this paper, we tackle the lack of data by leveraging pre-trained multilingual language models to transfer a retrieval system trained on English collections to non-English queries and documents. ...
This work was supported in part by ARCS Foundation. ...
doi:10.1007/978-3-030-45442-5_31
fatcat:vjxxtqp345an5jsdgrhpenm6u4
Teaching a New Dog Old Tricks: Resurrecting Multilingual Retrieval Using Zero-shot Learning
[article]
2019
arXiv
pre-print
While billions of non-English speaking users rely on search engines every day, the problem of ad-hoc information retrieval is rarely studied for non-English languages. ...
In this paper, we tackle the lack of data by leveraging pre-trained multilingual language models to transfer a retrieval system trained on English collections to non-English queries and documents. ...
While most recent approaches have focused on ad hoc retrieval for English, some researchers have studied the problem of cross-lingual information retrieval. ...
arXiv:1912.13080v1
fatcat:3fsqiservbbcvfrrwg4krubrzu
CN-DBpedia: A Never-Ending Chinese Knowledge Extraction System
[chapter]
2017
Lecture Notes in Computer Science
These knowledge bases play important roles in enabling machines to understand texts. ...
However, most current knowledge bases are in English and non-English knowledge bases, especially Chinese ones, are still very rare. ...
Moreover, some search engines such as Sogou even show the top-10 searched movies, songs, games, etc. ...
doi:10.1007/978-3-319-60045-1_44
fatcat:a2psegja5ngqreul7zhjappara
Cross-lingual Lexical Sememe Prediction
2018
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Thus we present a task of cross-lingual lexical sememe prediction, aiming to automatically predict sememes for words in other languages. ...
Experimental results on real-world datasets show that our proposed model achieves consistent and significant improvements as compared to baseline methods in cross-lingual sememe prediction. ...
We will explore the effectiveness of our model in tasks such as cross-lingual information retrieval. Figure 1: An example of HowNet. ...
doi:10.18653/v1/d18-1033
dblp:conf/emnlp/QiLSZX018
fatcat:tacr73frbvhp3muffmlhrjwkoa
CUGE: A Chinese Language Understanding and Generation Evaluation Benchmark
[article]
2021
arXiv
pre-print
When constructing options, crowd-sourced annotators were asked to extract a sentence from the story ...
Sogou-Log: Sogou-Log consists of search logs of Sogou.com, a major Chinese commercial search ...
NCLS: Neural cross-lingual summarization. In Proceedings of EMNLP-IJCNLP, pages 3054–3064. ...
arXiv:2112.13610v1
fatcat:eks56wvqtbhmfkq7wvs5n46lte
Analyzing Chinese-English mixed language queries in a web search engine
2014
Proceedings of the American Society for Information Science and Technology
and cross-lingual query expansion. ...
Keywords Information retrieval, multilingual search query, search behavior, search topics, user intent. ...
DATA COLLECTION AND RESEARCH METHOD This study uses queries submitted to the Sogou web search engine (http://www.sogou.com/), which is one of the most popular search engines in China. ...
doi:10.1002/meet.2014.14505101114
fatcat:wftyh2vzyveo5c4qpx6revdv44
Pre-training Methods in Information Retrieval
[article]
2022
arXiv
pre-print
The core of information retrieval (IR) is to identify relevant information from large-scale resources and return it as a ranked list to respond to the user's information need. ...
In recent years, the resurgence of deep learning has greatly advanced this field and led to a hot topic named NeuIR (i.e., neural information retrieval), especially the paradigm of pre-training methods ...
arXiv:2111.13853v3
fatcat:pilemnpphrgv5ksaktvctqdi4y
Exploiting bilingual information to improve web search
2009
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 - ACL-IJCNLP '09
unpublished
Web search quality can vary widely across languages, even for the same information need. ...
We propose to exploit this variation in quality by learning a ranking function on bilingual queries: queries that appear in query logs for two languages but represent equivalent search interests. ...
The thrust of our technique is using search ranking of one language and cross-lingual information to help ranking of another language. ...
doi:10.3115/1690219.1690296
fatcat:se5yimxnrfbyphxijq6ibmwkvy
A Survey on Text Classification: From Shallow to Deep Learning
[article]
2021
arXiv
pre-print
Text classification is the most fundamental and essential task in natural language processing. ...
The last decade has seen a surge of research in this area due to the unprecedented success of deep learning. ...
information retrieval and mining technology, plays a vital role in managing text data. ...
arXiv:2008.00364v6
fatcat:a6zp52rtf5awlh253yp62wqt3a
Cross Language Information Retrieval System
2018
Proceedings of the 2nd International Forum on Management, Education and Information Technology Application (IFMEITA 2017)
unpublished
In the paper, we describe a Cross Language Information Retrieval (CLIR) system, which allows users to input English queries and search Chinese documents. ...
We also explore a solution for an online search engine that can meet commercial requirements. ...
One study shows that the retrieval effectiveness of EC and CE cross-lingual search in Google and Yahoo is much lower than that of EE and CC monolingual search [14]. ...
doi:10.2991/ifmeita-17.2018.63
fatcat:baohr5hgxzejroysfi3pyqzj24
A Survey on Text Classification: From Traditional to Deep Learning
2022
ACM Transactions on Intelligent Systems and Technology
Text classification is the most fundamental and essential task in natural language processing. ...
The last decade has seen a surge of research in this area due to the unprecedented success of deep learning. ...
Text classification, as an efficient information retrieval and mining technology, plays a vital role in managing text data. ...
doi:10.1145/3495162
fatcat:ehrzpu4eezf7lah6jm3gyksyaq
D4.1 Report on Multimodal Machine Translation
2018
Zenodo
In MeMAD, multimodal translation is of particular interest in facilitating cross-lingual multimodal content retrieval, and is one of the main focuses of WP4. ...
Multimodal machine translation involves drawing information from more than one modality (text, audio, and visuals), and is an emerging subject within the machine translation community. ...
In addition the Finnish IT Center for Science (CSC) provided computational resources. We would also like to acknowledge the support by NVIDIA and their GPU grant. ...
doi:10.5281/zenodo.3690761
fatcat:n3b34ooubfayxphgyf6bli6bya
Deep Learning Based Text Classification: A Comprehensive Review
[article]
2021
arXiv
pre-print
In this paper, we provide a comprehensive review of more than 150 deep learning based models for text classification developed in recent years, and discuss their technical contributions, similarities, ...
Deep learning based models have surpassed classical machine learning based approaches in various text classification tasks, including sentiment analysis, news categorization, question answering, and natural ...
[61] extended the hierarchical attention model to cross-lingual sentiment classification. In each language, an LSTM network is used to model the documents. ...
arXiv:2004.03705v3
fatcat:al5hstylsbhfpldvokuvlpomam