43,899 Hits in 2.4 sec

Nearest Neighbor Machine Translation [article]

Urvashi Khandelwal, Angela Fan, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis
2021 arXiv   pre-print
We introduce k-nearest-neighbor machine translation (kNN-MT), which predicts tokens with a nearest neighbor classifier over a large datastore of cached examples, using representations from a neural translation  ...  Simply adding nearest neighbor search improves a state-of-the-art German-English translation model by 1.5 BLEU. kNN-MT allows a single model to be adapted to diverse domains by using a domain-specific  ...  NEAREST NEIGHBOR MACHINE TRANSLATION kNN-MT involves augmenting the decoder of a pre-trained machine translation model with a nearest neighbor retrieval mechanism, allowing the model direct access to a  ... 
arXiv:2010.00710v2 fatcat:wwsbr2okdbgppobnyw33zebarq
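
A minimal sketch of the idea summarized in the entry above: look up the decoder representation in a datastore of cached (representation, target token) pairs, turn the nearest entries into a token distribution, and mix it with the base model's distribution. The function names, the toy two-dimensional vectors, the temperature, and the interpolation weight `lam` are illustrative assumptions, not values from the paper, and the brute-force search stands in for whatever index a real system would use.

```python
import math
from collections import defaultdict

def knn_token_distribution(query, datastore, k=2, temperature=1.0):
    """Convert the k nearest datastore entries into a distribution over tokens."""
    # Brute-force squared Euclidean distance from the query to every cached key.
    nearest = sorted(
        (sum((q - x) ** 2 for q, x in zip(query, key)), token)
        for key, token in datastore
    )[:k]
    # Softmax over negative distances; neighbors with the same token pool their mass.
    weights = defaultdict(float)
    for dist, token in nearest:
        weights[token] += math.exp(-dist / temperature)
    total = sum(weights.values())
    return {token: w / total for token, w in weights.items()}

def interpolate(p_model, p_knn, lam=0.5):
    """Mix the base model distribution with the retrieval distribution."""
    tokens = set(p_model) | set(p_knn)
    return {t: lam * p_knn.get(t, 0.0) + (1 - lam) * p_model.get(t, 0.0) for t in tokens}

# Toy datastore (hypothetical): cached decoder states paired with the token that followed.
datastore = [((0.0, 0.0), "house"), ((0.1, 0.0), "house"), ((5.0, 5.0), "home")]
p_model = {"house": 0.4, "home": 0.6}   # assumed base-model prediction
p_knn = knn_token_distribution((0.0, 0.1), datastore, k=2)
p_final = interpolate(p_model, p_knn, lam=0.5)
```

In this toy case both retrieved neighbors vote for "house", so the mixed distribution prefers "house" even though the assumed base model preferred "home".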

Faster Nearest Neighbor Machine Translation [article]

Shuhe Wang, Jiwei Li, Yuxian Meng, Rongbin Ouyang, Guoyin Wang, Xiaoya Li, Tianwei Zhang, Shi Zong
2021 arXiv   pre-print
kNN based neural machine translation (kNN-MT) has achieved state-of-the-art results in a variety of MT tasks.  ...  One significant shortcoming of kNN-MT lies in its inefficiency in identifying the k nearest neighbors of the query representation from the entire datastore, which is prohibitively time-intensive when the  ...  The recently proposed kNN based neural machine translation (kNN-MT) (Khandelwal et al., 2020) has achieved state-of-the-art results across a wide variety of machine translation setups and datasets.  ... 
arXiv:2112.08152v1 fatcat:cnpljzgdbjdvjiteavzsobe3au

Fast Nearest Neighbor Machine Translation [article]

Yuxian Meng, Xiaoya Li, Xiayu Zheng, Fei Wu, Xiaofei Sun, Tianwei Zhang, Jiwei Li
2022 arXiv   pre-print
Though nearest neighbor Machine Translation (kNN-MT) has proved to introduce significant performance boosts over standard neural MT systems, it is prohibitively slow since it uses the entire reference  ...  corpus as the datastore for the nearest neighbor search.  ...  Experiments Bilingual Machine Translation We conduct experiments on two bilingual machine translation datasets: WMT'14 English-French and WMT'19 German-English.  ... 
arXiv:2105.14528v2 fatcat:rsz5ucniszanlimdtuvze2b57m

Efficient Cluster-Based k-Nearest-Neighbor Machine Translation [article]

Dexin Wang, Kai Fan, Boxing Chen, Deyi Xiong
2022 arXiv   pre-print
k-Nearest-Neighbor Machine Translation (kNN-MT) has been recently proposed as a non-parametric solution for domain adaptation in neural machine translation (NMT).  ...  Our proposed methods achieve better or comparable performance while reducing up to 57% inference latency against the advanced non-parametric MT model on several machine translation benchmarks.  ...  In brief, our efficient cluster-based k-nearest neighbor machine translation can be concluded into the following steps. • We adopt the original datastore to train Compact Network while the parameters of  ... 
arXiv:2204.06175v2 fatcat:5tscsfcd4bbgjmkysxykkzplme

Nearest Neighbor Knowledge Distillation for Neural Machine Translation [article]

Zhixian Yang, Renliang Sun, Xiaojun Wan
2022 arXiv   pre-print
k-nearest-neighbor machine translation (kNN-MT), proposed by Khandelwal et al. (2021), has achieved many state-of-the-art results in machine translation tasks.  ...  In this paper, we propose to move the time-consuming kNN search forward to the preprocessing phase, and then introduce k-Nearest-Neighbor Knowledge Distillation (kNN-KD) that trains the base NMT model to directly  ...  Nearest Neighbor Machine Translation kNN-MT applies the nearest neighbor retrieval mechanism to the decoding phase of an NMT model, which allows the model direct access to a large-scale datastore for better  ... 
arXiv:2205.00479v1 fatcat:m6u7k53cszfgnieitmq5w42j2a

Introduction to machine learning: k-nearest neighbors

Zhongheng Zhang
2016 Annals of Translational Medicine  
Figure 1: Illustration of how the k-nearest neighbors algorithm works.  ...  Introduction to machine learning: k-nearest neighbors. Ann Transl Med 2016; Then we divide the original dataset into the training and test datasets.  ... 
doi:10.21037/atm.2016.03.37 pmid:27386492 pmcid:PMC4916348 fatcat:kpct6fltmbcghasxbl4og42bxa

Adaptive Nearest Neighbor Machine Translation

Xin Zheng, Zhirui Zhang, Junliang Guo, Shujian Huang, Boxing Chen, Weihua Luo, Jiajun Chen
2021 Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)   unpublished
kNN-MT, recently proposed by Khandelwal et al. (2020a), successfully combines a pretrained neural machine translation (NMT) model with token-level k-nearest-neighbor (kNN) retrieval to improve the translation  ...  However, the traditional kNN algorithm used in kNN-MT simply retrieves the same number of nearest neighbors for each target token, which may cause prediction errors when the retrieved neighbors include noise  ...  ., 2020a) are increasingly receiving attention from the machine translation (MT) community recently.  ... 
doi:10.18653/v1/2021.acl-short.47 fatcat:sphqtprypfe7zgfg5k5upztgtu

Bilingual Lexicon Induction through Unsupervised Machine Translation

Mikel Artetxe, Gorka Labaka, Eneko Agirre
2019 Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics  
pairs through nearest neighbor or related retrieval methods.  ...  In this paper, we propose an alternative approach to this problem that builds on the recent work on unsupervised machine translation.  ...  For that purpose, one would typically induce the translation of each source word by taking its corresponding nearest neighbor in the target language.  ... 
doi:10.18653/v1/p19-1494 dblp:conf/acl/ArtetxeLA19a fatcat:vd2hxnrnjnfarm33tubyrqcpfu

Comparative Analysis of Text Mining Classification Algorithms for English and Indonesian Qur'an Translation

Rahmat Hidayat, Sekar Minati
2019 IJID (International Journal on Informatics for Development)  
The classification experiment uses Support Vector Machine (SVM), Naive Bayes, k-Nearest Neighbor (kNN), and J48 Decision Tree classifier algorithms with Al-Baqarah verses translated to English and Indonesian  ...  Naïve Bayes classifier has the best accuracy for the English translation, which achieved 78.35%.  ...  The classification algorithms used are Naive Bayes, K-Nearest Neighbor, Decision Tree J48, and Support Vector Machine (SVM).  ... 
doi:10.14421/ijid.2019.08108 fatcat:5n6xjmbavzfbvcxiwqdwqt57y4

Efficient Machine Translation Domain Adaptation [article]

Pedro Henrique Martins, Zita Marinho, André F. T. Martins
2022 arXiv   pre-print
In this paper, we explore several approaches to speed up nearest neighbor machine translation.  ...  Machine translation models struggle when translating out-of-domain text, which makes domain adaptation a topic of critical importance.  ...  In sum, this paper presents the following contributions: • We adapt the methods proposed by He et al. (2021) for efficient nearest neighbor language modeling to machine translation. • We propose a caching  ... 
arXiv:2204.12608v1 fatcat:rg2xclokdbc6rdexbv34qzrjki

MULTILINGUAL INFORMATION RETRIEVAL BASED ON KNOWLEDGE CREATION TECHNIQUES

Archana.M, Dr. Sumithra Devi K.A
2011 Zenodo  
of machine translation (MT).  ...  K-nearest neighbor In K-nearest neighbor approach given a test document d, the system finds the K-nearest neighbors among training documents, and weight is assigned to the candidates using their classes  ... 
doi:10.5281/zenodo.3532231 fatcat:ynl7prmprffg5cvfl632rgvie4

Ambiguous Myanmar Word Disambiguation System for Myanmar-English Statistical Machine Translation

Nyein Thwet Thwet Aung, Khin Mar Soe, Ni Lar Thein
2011 International Journal of Computer Applications  
It is based on a supervised learning approach, the Nearest Neighbor Cosine Classifier. The system uses a Myanmar-English parallel corpus as a training resource.  ...  In Statistical Machine Translation (SMT), there are many source words that can present different translations or senses.  ...  NEAREST NEIGHBOR COSINE CLASSIFICATION The nearest neighbor cosine classifier is a supervised corpus-based approach.  ... 
doi:10.5120/3323-4568 fatcat:lotyhytlajft7jufxbwvnhielq

On the Limitations of Unsupervised Bilingual Dictionary Induction

Anders Søgaard, Sebastian Ruder, Ivan Vulić
2018 Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)  
Unsupervised machine translation, i.e., not assuming any cross-lingual supervision signal, whether a dictionary, translations, or comparable corpora, seems impossible, but nevertheless, Lample et al. (2018a) recently proposed a fully unsupervised machine translation (MT) model.  ...  Figure 1a-b shows the nearest neighbor graphs of the top 10 most frequent English words on Wikipedia, and their German translations.  ... 
doi:10.18653/v1/p18-1072 dblp:conf/acl/SogaardVR18 fatcat:v56sicevmrf6xdyiuqk5uplpqq

Margin-based Parallel Corpus Mining with Multilingual Sentence Embeddings

Mikel Artetxe, Holger Schwenk
2019 Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics  
Machine translation is highly sensitive to the size and quality of the training data, which has led to an increasing interest in collecting and filtering large parallel corpora.  ...  In contrast to previous approaches, which rely on nearest neighbor retrieval with a hard threshold over cosine similarity, our proposed method accounts for the scale inconsistencies of this measure, considering  ...  Only the nearest neighbor of B is a correct translation, yet that of A has a higher cosine similarity.  ... 
doi:10.18653/v1/p19-1309 dblp:conf/acl/ArtetxeS19 fatcat:fnc4t3fnevar7emdcmrafvkbty
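
A rough sketch of why a hard cosine threshold can mislead and how a margin-style score can compensate, as the entry above describes. It assumes one particular form of the idea: divide the cosine of a candidate pair by the average cosine of each sentence to its k nearest candidates on the other side. The function names, the toy two-dimensional embeddings, and the choice of k are hypothetical and only illustrate the mechanism.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def avg_top_k(vec, candidates, k):
    """Average cosine similarity between vec and its k most similar candidates."""
    sims = sorted((cosine(vec, c) for c in candidates), reverse=True)[:k]
    return sum(sims) / len(sims)

def margin_score(x, y, sources, targets, k=2):
    """Cosine of (x, y) relative to the similarity of their neighborhoods."""
    # x is compared against target-side candidates, y against source-side ones.
    neighborhood = (avg_top_k(x, targets, k) + avg_top_k(y, sources, k)) / 2
    return cosine(x, y) / neighborhood

# Toy embeddings (hypothetical): two source and two target sentences.
sources = [(1.0, 0.0), (0.0, 1.0)]
targets = [(1.0, 0.0), (0.0, 1.0)]
matched = margin_score(sources[0], targets[0], sources, targets, k=2)
mismatched = margin_score(sources[0], targets[1], sources, targets, k=2)
```

A score well above 1 means the pair is much more similar than either sentence is to its typical neighbors, which is less sensitive to the overall scale of the cosine values than a fixed threshold.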

Scalable Cross-Lingual Transfer of Neural Sentence Embeddings [article]

Hanan Aldarmaki, Mona Diab
2019 arXiv   pre-print
Figure 7: Nearest neighbor translation accuracy as a function of (log) parallel corpus size.  ...  However, the exact translations were not the nearest neighbors in most cases, and the nearest neighbors often included several extraneous pieces of content not present in the query sentence.  ... 
arXiv:1904.05542v1 fatcat:cvmxdq7kvzbhphsanx5rlq6rbi