Dimension Projection among Languages based on Pseudo-relevant Documents for Query Translation
[article]
2016
arXiv
pre-print
In this paper, we propose a new method for dictionary-based query translation based on dimension projection of embedded vectors from the pseudo-relevant documents in the source language to their equivalents ...
To this end, first we learn low-dimensional vectors of the words in the pseudo-relevant collections separately and then aim to find a query-dependent transformation matrix between the vectors of translation ...
Linear Projection between Languages based on Pseudo-relevant Documents. In this section we introduce the proposed method in more detail. ...
arXiv:1605.07844v2
fatcat:vzrmsfkcgjbbxcmulz633pesoy
Continuous space models for CLIR
2017
Information Processing & Management
This property is very helpful for resource-poor languages, therefore, we carry out experiments on the English-Hindi language pair. ...
Different from most existing models, which rely only on available parallel data for training, our learning framework provides a natural way to exploit monolingual data and its associated relevance metadata ...
Acknowledgements We thank Germán Sanchis Trilles for helping in conducting experiments with machine translation. ...
doi:10.1016/j.ipm.2016.11.002
fatcat:vgiclzfllnb67fkd6omnttrdfm
Translingual information retrieval: learning from bilingual corpora
1998
Artificial Intelligence
Translingual information retrieval (TLIR) consists of providing a query in one language and searching document collections in one or more different languages. ...
Query translation based on a general machine-readable bilingual dictionary, heretofore the most popular method, did not match the performance of other, more sophisticated methods. ...
Acknowledgments We thank Christie Watson and Dorcas Wallace for their efforts in corpus annotation. ...
doi:10.1016/s0004-3702(98)00063-0
fatcat:madtpj3ndze3tapxlqvjafnjne
Advanced learning algorithms for cross-language patent retrieval and classification
2007
Information Processing & Management
We also investigate learning algorithms for cross-language document classification. The learning algorithms are based on KCCA and Support Vector Machines (SVM). ...
In comparison with most of other studies involving machine learning for cross-language information retrieval, which basically used learning techniques for monolingual sub-tasks, our learning algorithms ...
We thank Sandor Szedmak for providing us the Matlab code solving SVM_2k. We thank Mitsuharu Makita for help in preprocessing Japanese documents. ...
doi:10.1016/j.ipm.2006.11.005
fatcat:l2i4icimofclxhw646hghmf56i
Using KCCA for Japanese–English cross-language information retrieval and document classification
2006
Journal of Intelligent Information Systems
A machine learning algorithm based on KCCA is studied for cross-language information retrieval. We apply the algorithm in Japanese-English cross-language information retrieval. ...
Our results show that it is feasible to use a classifier learned in one language to classify the documents in other languages. ...
We would also like to thank Mitsuharu Makita for help in preprocessing Japanese documents. We thank anonymous reviewers for detailed comments and valuable suggestions. ...
doi:10.1007/s10844-006-1627-y
fatcat:fnvcma2w7fes5eun2ihtt4myni
Learning Neural Representation for CLIR with Adversarial Framework
2018
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
In this paper, we follow the success of neural representation in natural language processing (NLP) and develop a novel text representation model based on adversarial learning, which seeks a task-specific ...
embedding space for CLIR. ...
Acknowledgments We thank the anonymous reviewers for their valuable comments. This work was supported by the Fundamental Research Funds for Central Universities of CCNU (No. CCNU15A05062). ...
doi:10.18653/v1/d18-1212
dblp:conf/emnlp/LiC18
fatcat:rtlbaco6m5gobedpbi6rci5xim
TEKMA at CLEF-2021: BM-25 based rankings for scientific publication retrieval and data set recommendation
2021
Conference and Labs of the Evaluation Forum
We made one submission for each of the two tasks. For both submissions we focused on data enrichment and Solr's implementation of the probabilistic BM25 ranking function. ...
In this paper we report the results of our participation in the Living Labs for Academic Search (LiLAS) CLEF Challenge, which is aimed at strengthening the concept of user-centered living labs for the ...
Based on the assumption that the best-ranked documents are somehow relevant, information on them is used to rewrite and extend the query [7]. Using the base query, a ranking is generated. ...
dblp:conf/clef/KellerM21
fatcat:es2ge7e2znhhpfy7lx63xk4bty
Transfer Learning Approaches for Building Cross-Language Dense Retrieval Models
[article]
2022
arXiv
pre-print
In translate-train, the system is trained on the MS MARCO English queries coupled with machine translations of the associated MS MARCO passages. ...
In zero-shot training, the system is trained on the English MS MARCO collection, relying on the XLM-R encoder for cross-language mappings. ...
Acknowledgments This research is based upon work supported in part by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via contract ...
arXiv:2201.08471v1
fatcat:qotjmi4dmner3cqxym6ad3ol3q
Information Flow Analysis with Chinese Text
[chapter]
2005
Lecture Notes in Computer Science
To evaluate the Chinese-based information flow model, it is applied to query expansion, in which a set of test queries are expanded automatically via information flow computations and documents are retrieved ...
The information inference derives implicit associations via computation of information flow on a high dimensional conceptual space, which is approximated by a cognitively motivated lexical semantic space ...
The authors would like to thank Zi Huang for her great work and assistance on the Chinese word segmentation system. ...
doi:10.1007/978-3-540-30211-7_11
fatcat:54e4tng3vrgx3cakekxl5twrla
Using Word Embeddings for Query Translation for Hindi to English Cross Language Information Retrieval
[article]
2016
arXiv
pre-print
One of the standard methods is to use query translation from source to target language. ...
In this paper, we propose an approach based on word embeddings, a method that captures contextual clues for a particular word in the source language and gives those words as translations that occur in ...
After reducing the rank, the queries and the documents are projected to a lower dimensional space. ...
arXiv:1608.01561v1
fatcat:cjgynmaawzdzzhx2h3ozqlz2by
Leveraging Entities in Document Retrieval
[chapter]
2018
Advanced Topics in Information Retrieval
The relevance between a query and a document is then estimated based on their projections to this latent entity space. ...
Document-Based Query Expansion To give an idea of how traditional (term-based) pseudo relevance feedback works, we present one of the most popular approaches, the relevance model by Lavrenko and Croft ...
doi:10.1007/978-3-319-93935-3_8
fatcat:aeg7t42jhzeebelu4nht6q4iqu
Using Word Embeddings for Query Translation for Hindi to English Cross Language Information Retrieval
2016
Journal of Computacion y Sistemas
One of the standard methods is to use query translation from source to target language. ...
In this paper, we propose an approach based on word embeddings, a method that captures contextual clues for a particular word in the source language and gives those words as translations that occur in ...
Acknowledgments We would like to thank the anonymous reviewers for their valuable comments. ...
doi:10.13053/cys-20-3-2462
fatcat:zs44l332ivd77gglzixrnde3ay
Query representation for cross-temporal information retrieval
2013
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval - SIGIR '13
With this challenge in mind, we ask: given a query written in contemporary English, how can we retrieve relevant documents that were written in early English? ...
We focus on ways to combine evidence to improve CTIR effectiveness, proposing and testing several ways to handle language change during book search. ...
In the remainder of this section, we assume that based on our initial, dictionary-built query, we have retrieved k = 20 pseudo-relevant documents. ...
doi:10.1145/2484028.2484054
dblp:conf/sigir/Efron13
fatcat:thehxpjijvc3tc2n4x54rfserq
Using Language Models For Information Retrieval
2001
Zenodo
The approach uses simple document-based unigram models to compute for each document the probability that it generates the query. This probability is used to rank the documents. ...
This book describes a mathematical model of information retrieval based on the use of statistical language models. ...
I am most grateful to Wessel Kraaij of TNO-TPD for our cooperation in these projects, for our cooperation in four years of joined TREC-participations, and for implementing the language model algorithms ...
doi:10.5281/zenodo.570441
fatcat:mfju6ok4t5bzjp2pvp6bktfdn4
Literature Retrieval for Precision Medicine with Neural Matching and Faceted Summarization
2020
Findings of the Association for Computational Linguistics: EMNLP 2020
The full architecture benefits from the complementary potential of document-query matching and the novel document transformation approach based on summarization along PM facets. ...
Component (a) directly generates a matching score of a candidate document for a query. ...
Model Source Target REL doc+query sentences doc relevance EXT doc token relevances ABS doc+facet signal a pseudo-query
Implementation Details For all three models, we begin with the pretrained bert-base-uncased ...
doi:10.18653/v1/2020.findings-emnlp.304
pmid:34541588
pmcid:PMC8444997
fatcat:elsv2diavfe7vawrnwoohyagqm