On the Relevance of Query Expansion Using Parallel Corpora and Word Embeddings to Boost Text Document Retrieval Precision

Alaidine Ben Ayed, Ismaïl Biskri
2020 International Journal on Natural Language Computing  
In this paper, we implement a document retrieval system using Lucene and conduct experiments to compare the efficiency of two weighting schemes: the well-known TF-IDF and BM25. We then expand queries using a comparable corpus (Wikipedia) and word embeddings. The results show that the latter method (word embeddings) is a good way to achieve higher precision and retrieve more accurate documents.
doi:10.5121/ijnlc.2020.9101 fatcat:pek67e2gwbd4te2hjgwc3e6q3e
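
The abstract names two weighting schemes and an embedding-based query expansion step without showing how they fit together. The following is a minimal sketch in plain Python, not the authors' Lucene implementation. The toy corpus, the three-dimensional vectors, the function names, and the parameter values k1=1.2 and b=0.75 are all assumptions chosen for illustration. A real system would use trained embeddings and an index.

```python
import math
from collections import Counter

# Hypothetical toy corpus: each document is a list of tokens.
DOCS = [
    ["the", "car", "is", "fast"],
    ["an", "automobile", "dealer"],
    ["banana", "bread", "recipe"],
]

# Hypothetical 3-dimensional vectors, made up for this sketch.
TOY_EMBEDDINGS = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.85, 0.15, 0.05],
    "banana": [0.0, 0.2, 0.9],
}


def document_frequencies(docs):
    """Count, for each term, the number of documents containing it."""
    return Counter(term for doc in docs for term in set(doc))


def tfidf_scores(query, docs):
    """Score documents with a basic tf * log(N / df) weighting."""
    n = len(docs)
    df = document_frequencies(docs)
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append(sum(tf[t] * math.log(n / df[t]) for t in query if df[t]))
    return scores


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score documents with Okapi BM25 (k1 and b are assumed values)."""
    n = len(docs)
    avgdl = sum(len(doc) for doc in docs) / n
    df = document_frequencies(docs)
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for t in query:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def expand_query(query, embeddings, top_n=1):
    """Append the top_n nearest embedding neighbours of each query term."""
    expanded = list(query)
    for term in query:
        if term not in embeddings:
            continue
        candidates = [w for w in embeddings if w not in expanded]
        candidates.sort(key=lambda w: cosine(embeddings[term], embeddings[w]),
                        reverse=True)
        expanded.extend(candidates[:top_n])
    return expanded


query = ["car"]
expanded = expand_query(query, TOY_EMBEDDINGS)
print("expanded query:", expanded)
print("BM25 before:", bm25_scores(query, DOCS))
print("BM25 after: ", bm25_scores(expanded, DOCS))
print("TF-IDF after:", tfidf_scores(expanded, DOCS))
```

In this toy setting, the second document shares no term with the original query and scores zero under both schemes. It becomes retrievable once the query is expanded with the nearest neighbour of "car".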