19,742 Hits in 5.0 sec

Efficient Cluster-Based k-Nearest-Neighbor Machine Translation [article]

Dexin Wang, Kai Fan, Boxing Chen, Deyi Xiong
2022 arXiv   pre-print
k-Nearest-Neighbor Machine Translation (kNN-MT) has been recently proposed as a non-parametric solution for domain adaptation in neural machine translation (NMT).  ...  We then suggest a cluster-based pruning solution to filter out 10%-40% redundant nodes in large datastores while retaining translation quality.  ...  In brief, our efficient cluster-based k-nearest neighbor machine translation can be concluded into the following steps. • We adopt the original datastore to train Compact Network while the parameters of  ... 
arXiv:2204.06175v2 fatcat:5tscsfcd4bbgjmkysxykkzplme
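The pruning idea in the snippet above — dropping redundant datastore entries while keeping translation quality — can be illustrated with a minimal sketch. This is not the paper's Compact Network pipeline; the function name, the tiny k-means loop, and the per-token keep ratio are illustrative assumptions:

```python
import numpy as np

def prune_datastore(keys, values, n_clusters=2, keep_ratio=0.7, seed=0):
    """Cluster-based pruning sketch: group datastore keys with k-means,
    then within each cluster keep only the entries closest to the
    centroid for each target token, dropping near-duplicates."""
    rng = np.random.default_rng(seed)
    # --- tiny k-means (Lloyd's algorithm) ---
    centroids = keys[rng.choice(len(keys), n_clusters, replace=False)]
    for _ in range(10):
        assign = np.argmin(((keys[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
        for c in range(n_clusters):
            if np.any(assign == c):
                centroids[c] = keys[assign == c].mean(axis=0)
    # final assignment against the converged centroids
    assign = np.argmin(((keys[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
    # --- prune within each cluster ---
    keep = []
    for c in range(n_clusters):
        idx = np.where(assign == c)[0]
        dists = ((keys[idx] - centroids[c]) ** 2).sum(-1)
        # per target token, keep only the fraction of entries nearest the centroid
        for tok in set(values[i] for i in idx):
            tok_idx = [i for i in idx if values[i] == tok]
            tok_idx.sort(key=lambda i: dists[np.where(idx == i)[0][0]])
            n_keep = max(1, int(round(keep_ratio * len(tok_idx))))
            keep.extend(tok_idx[:n_keep])
    return sorted(keep)
```

The sketch keeps at least one representative per (cluster, token) pair, so every target token remains reachable after pruning.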

Efficient Cluster-Based k-Nearest-Neighbor Machine Translation

Dexin Wang, Kai Fan, Boxing Chen, Deyi Xiong
2022 Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)   unpublished
k-Nearest-Neighbor Machine Translation (kNN-MT) has been recently proposed as a non-parametric solution for domain adaptation in neural machine translation (NMT).  ...  We then suggest a cluster-based pruning solution to filter out 10%~40% redundant nodes in large datastores while retaining translation quality.  ...  In brief, our efficient cluster-based k-nearest neighbor machine translation can be concluded into the following steps. • We adopt the original datastore to train Compact Network while the parameters of  ... 
doi:10.18653/v1/2022.acl-long.154 fatcat:s4ohhsgglzap7nu43o4hn6eyli

Faster Nearest Neighbor Machine Translation [article]

Shuhe Wang, Jiwei Li, Yuxian Meng, Rongbin Ouyang, Guoyin Wang, Xiaoya Li, Tianwei Zhang, Shi Zong
2021 arXiv   pre-print
kNN based neural machine translation (kNN-MT) has achieved state-of-the-art results in a variety of MT tasks.  ...  One significant shortcoming of kNN-MT lies in its inefficiency in identifying the k nearest neighbors of the query representation from the entire datastore, which is prohibitively time-intensive when the  ...  The recently proposed kNN based neural machine translation (kNN-MT) (Khandelwal et al., 2020) has achieved state-of-the-art results across a wide variety of machine translation setups and datasets.  ... 
arXiv:2112.08152v1 fatcat:cnpljzgdbjdvjiteavzsobe3au

Fast Nearest Neighbor Machine Translation [article]

Yuxian Meng, Xiaoya Li, Xiayu Zheng, Fei Wu, Xiaofei Sun, Tianwei Zhang, Jiwei Li
2022 arXiv   pre-print
Though nearest neighbor Machine Translation (kNN-MT) has proved to introduce significant performance boosts over standard neural MT systems, it is prohibitively slow since it uses the entire reference  ...  This strategy avoids search through the whole datastore for nearest neighbors and drastically improves decoding efficiency.  ...  nearest-neighbor search efficiency.  ... 
arXiv:2105.14528v2 fatcat:rsz5ucniszanlimdtuvze2b57m

MULTILINGUAL INFORMATION RETRIEVAL BASED ON KNOWLEDGE CREATION TECHNIQUES

Archana M., Dr. Sumithra Devi K. A.
2011 Zenodo  
Figure 5: Clustering analysis for CLIR. Decision tree: i. Quick reduct; ii. Rough set based decision tree ensemble (RSDTE). 3. K-nearest neighbor  ...  In the K-nearest neighbor approach, given a test document d, the system finds the K nearest neighbors among the training documents, and weight is assigned to the candidates using their classes  ... 
doi:10.5281/zenodo.3532231 fatcat:ynl7prmprffg5cvfl632rgvie4
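The weighted k-NN scheme the snippet describes — find the K nearest training documents, then score candidate classes via their neighbors — can be sketched as follows. The inverse-distance weighting and the function name are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def knn_classify(test_vec, train_vecs, train_labels, k=3):
    """Weighted k-NN sketch: find the k nearest training documents,
    then score each candidate class by the summed inverse-distance
    weight of its neighbors."""
    dists = np.linalg.norm(train_vecs - test_vec, axis=1)
    nearest = np.argsort(dists)[:k]
    scores = {}
    for i in nearest:
        w = 1.0 / (dists[i] + 1e-9)  # closer neighbors weigh more
        scores[train_labels[i]] = scores.get(train_labels[i], 0.0) + w
    return max(scores, key=scores.get)
```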

Explaining the Success of Nearest Neighbor Methods in Prediction

George H. Chen, Devavrat Shah
2018 Foundations and Trends® in Machine Learning  
We present theoretical guarantees for k-nearest neighbor, fixed-radius near neighbor, and kernel regression where the data reside in a metric space.  ...  We provide an overview of efficient data structures for exact and approximate nearest neighbor search that are used in practice.  ... 
doi:10.1561/2200000064 fatcat:dqsejlqojvftxc5yuiyofjkele
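Kernel regression, one of the three methods the survey analyzes, admits a very short sketch. This is the standard Nadaraya-Watson estimator in one dimension with a Gaussian kernel; the survey's metric-space setting is more general:

```python
import numpy as np

def kernel_regression(x, X, Y, bandwidth=1.0):
    """Nadaraya-Watson kernel regression sketch: predict y at x as a
    weighted average of the training labels, with Gaussian weights
    that decay with distance from x."""
    w = np.exp(-((X - x) ** 2) / (2 * bandwidth ** 2))
    return float((w * Y).sum() / w.sum())
```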

Efficient Data Analytics on Augmented Similarity Triplets [article]

Muhammad Ahmad, Muhammad Haroon Shakeel, Sarwan Ali, Imdadullah Khan, Arif Zaman, Asim Karim
2019 arXiv   pre-print
Many machine learning methods (classification, clustering, etc.) start with a known kernel that provides similarity or distance measure between two objects.  ...  Secondly, we also propose a novel set of algorithms for common supervised and unsupervised machine learning tasks based on triplets.  ...  Machine learning tasks performed directly on triplets include nearest neighbors search [17] , classification [30, 16] , comparison based hierarchical clustering [14] , and correlation clustering [36  ... 
arXiv:1912.12064v1 fatcat:fueupq33efd4bikbee3etort6a

Distributed Synthetic Minority Oversampling Technique

Sakshi Hooda, Suman Mann
2019 International Journal of Computational Intelligence Systems  
In this paper we present our solution to address the "big data" challenge. We provide a distributed version of SMOTE by using scalable k-means++ and M-Trees.  ...  However, the existing implementations of SMOTE fail when data grows and can't be stored on a single machine.  ...  M-Tree provides a very fast and efficient nearest neighbor search.  ... 
doi:10.2991/ijcis.d.190719.001 fatcat:ihop6sgsxfdzbfqxcejuon4jbe
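The core SMOTE step that the distributed version scales up — interpolating between a minority sample and one of its k nearest minority neighbors — can be sketched in a few lines. This is the single-machine version, without the scalable k-means++ or M-Tree components:

```python
import numpy as np

def smote(minority, n_new, k=2, seed=0):
    """SMOTE sketch: synthesize minority samples by interpolating
    between a random minority sample and one of its k nearest
    minority neighbors."""
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(minority))
        d = np.linalg.norm(minority - minority[i], axis=1)
        # nearest neighbors of sample i, excluding itself
        nn = np.argsort(d)[1:k + 1]
        j = rng.choice(nn)
        gap = rng.random()  # random point along the segment i -> j
        out.append(minority[i] + gap * (minority[j] - minority[i]))
    return np.array(out)
```

Every synthetic point lies on a segment between two real minority samples, which is why SMOTE stays inside the minority class's convex hull.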

Nearest Neighbor Knowledge Distillation for Neural Machine Translation [article]

Zhixian Yang, Renliang Sun, Xiaojun Wan
2022 arXiv   pre-print
k-nearest-neighbor machine translation (kNN-MT), proposed by Khandelwal et al. (2021), has achieved many state-of-the-art results in machine translation tasks.  ...  In this paper, we propose to move the time-consuming kNN search forward to the preprocessing phase, and then introduce Nearest Neighbor Knowledge Distillation (kNN-KD) that trains the base NMT model to directly  ...  Nearest Neighbor Machine Translation kNN-MT applies the nearest neighbor retrieval mechanism to the decoding phase of an NMT model, which allows the model direct access to a large-scale datastore for better  ... 
arXiv:2205.00479v1 fatcat:m6u7k53cszfgnieitmq5w42j2a
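The kNN-MT decoding mechanism the snippet describes — retrieving neighbors for the decoder's query vector and interpolating their token distribution with the base model's — can be sketched as follows. The temperature, interpolation weight, and distance measure here are illustrative choices, not the exact values from any of the papers listed:

```python
import numpy as np

def knn_mt_probs(model_probs, query, keys, values, vocab_size,
                 k=2, temp=10.0, lam=0.5):
    """kNN-MT decoding sketch: retrieve the k nearest datastore entries
    for the decoder's query vector, turn their negative distances into
    a softmax distribution over their target tokens, and interpolate
    with the base model's distribution."""
    d = np.linalg.norm(keys - query, axis=1)
    nn = np.argsort(d)[:k]
    logits = -d[nn] / temp
    w = np.exp(logits - logits.max())
    w /= w.sum()
    knn_probs = np.zeros(vocab_size)
    for weight, i in zip(w, nn):
        knn_probs[values[i]] += weight
    return lam * knn_probs + (1 - lam) * model_probs
```

Because the retrieval scans the whole datastore, this naive version is exactly the bottleneck that the clustering and pruning papers above aim to remove.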

Semisupervised metric learning by kernel matrix adaptation

Hong Chang, Dit-Yan Yeung
2005 2005 International Conference on Machine Learning and Cybernetics  
In this paper, we propose a kernel-based approach for nonlinear metric learning, which performs locally linear translation in the kernel-induced feature space.  ...  We formulate the metric learning problem as a kernel learning problem and solve it efficiently by kernel matrix adaptation.  ...  Introduction Many machine learning and pattern recognition methods, such as nearest neighbor classifiers, radial basis function networks, support vector machines for classification and the k-means algorithm  ... 
doi:10.1109/icmlc.2005.1527496 fatcat:oikioevsyrdddgnypi63g4f6nq

Hierarchical Nearest Neighbor Graph Embedding for Efficient Dimensionality Reduction [article]

M. Saquib Sarfraz, Marios Koulakis, Constantin Seibold, Rainer Stiefelhagen
2022 arXiv   pre-print
We introduce a novel method based on a hierarchy built on 1-nearest neighbor graphs in the original space which is used to preserve the grouping properties of the data distribution on multiple levels.  ...  Dimensionality reduction is crucial both for visualization and preprocessing high dimensional data for machine learning.  ...  In order to perform this process in an efficient and easy-to-vectorize way, we choose to use the already known distance d_i^(l) of each c_i^(l) to its nearest neighbor in C^(l) and then scale the translated  ... 
arXiv:2203.12997v3 fatcat:xmn3532uebcf7jt5j6mmwj5lo4

Kernel-Based Metric Adaptation with Pairwise Constraints [chapter]

Hong Chang, Dit-Yan Yeung
2006 Lecture Notes in Computer Science  
In this paper, we propose a kernel-based approach for nonlinear metric learning, which performs locally linear translation in the kernel-induced feature space.  ...  We formulate the metric learning problem as a kernel learning problem and solve it efficiently by kernel matrix adaptation.  ...  Introduction Many machine learning and pattern recognition methods, such as nearest neighbor classifiers, radial basis function networks, support vector machines for classification and the k-means algorithm  ... 
doi:10.1007/11739685_75 fatcat:imuukkxnxncmbfdmsuni6jq6ty

Unsupervised Distance-Based Outlier Detection Using Nearest Neighbours Algorithm on Distributed Approach: Survey

Jayshree S. Gosavi, Vinod S. Wadne
2014 International Journal of Innovative Research in Computer and Communication Engineering  
By examining again the notion of reverse nearest neighbors in the unsupervised outlier-detection context, high dimensionality can have a different impact.  ...  This proposed work goes into detail about the development and analysis of outlier detection algorithms such as Local Outlier Factor (LOF), Local Distance-Based Outlier Factor (LDOF), Influenced Outliers  ...  The local density for instances is computed by dividing k, i.e., the number of nearest neighbors, by the volume of the hypersphere containing them.  ... 
doi:10.15680/ijircce.2014.0212042 fatcat:7okbstpahnbctltcf65f6cjmfi
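The local density formula mentioned in the snippet — k divided by the volume of the hypersphere reaching the k-th nearest neighbor — can be written directly. The function name and the tiny example are illustrative:

```python
import numpy as np
from math import pi, gamma

def local_density(X, i, k=2):
    """Local density sketch: divide k by the volume of the hypersphere
    whose radius is the distance from point i to its k-th nearest
    neighbor; a low density flags a potential outlier."""
    d = np.linalg.norm(X - X[i], axis=1)
    r = np.sort(d)[k]  # k-th nearest neighbor (index 0 is the point itself)
    dim = X.shape[1]
    volume = pi ** (dim / 2) / gamma(dim / 2 + 1) * r ** dim
    return k / volume
```

A point in a tight cluster reaches its k-th neighbor within a small sphere (high density), while an isolated point needs a huge sphere (low density).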

DIMO

Ahmed Abdelsadek, Mohamed Hefeeda
2014 Proceedings of the 5th ACM Multimedia Systems Conference on - MMSys '14  
DIMO provides multimedia applications with the basic function of computing the K nearest neighbors on large-scale datasets.  ...  We have implemented DIMO and extensively evaluated it on Amazon clusters with number of machines ranging from 8 to 128.  ...  matching using the found K nearest neighbors.  ... 
doi:10.1145/2557642.2557650 dblp:conf/mmsys/AbdelsadekH14 fatcat:vvhoxhvkhvaajcfdwmsn3zynqi

Editorial

2018 Intelligent Data Analysis  
Their proposed system employs a feature extraction technique based on principal component analysis, called Eigentraces, applied to operating system call trace data, and a k-nearest neighbor algorithm for  ...  tree for statistical machine translation.  ... 
doi:10.3233/ida-180893 fatcat:qteck6fwjbci7iiael36xuij6i
Showing results 1 — 15 out of 19,742 results