Scalable Execution of KNN Queries using Data Parallelism Approach
2018
International Journal of Engineering & Technology
The K-nearest neighbor algorithm (KNN) is a well-known learning method used in a wide range of problem-solving domains, e.g., network monitoring, data mining, and image processing. ...
A lot of work has been done to deal with the computational challenges of continuously processing continuous queries over unbounded data streams. ...
This finite index over the queries can be accommodated in memory, which results in efficient query execution by avoiding frequent memory accesses. ...
doi:10.14419/ijet.v7i4.19.28286
fatcat:27xnmqoyzfg7vovqjyht25e4ku
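The entry above treats KNN as a classification primitive. As a point of reference, the following is a minimal brute-force sketch of that primitive in Python (standard algorithm, not the paper's data-parallel, stream-oriented implementation); the data and labels are illustrative.

```python
# Minimal brute-force kNN classifier sketch (generic algorithm, not the
# paper's data-parallel method). Points, labels, and k are illustrative.
import heapq
import math

def knn_classify(query, points, labels, k=3):
    """Return the majority label among the k points closest to `query`."""
    # Euclidean distance from the query to every training point.
    dists = [(math.dist(query, p), lbl) for p, lbl in zip(points, labels)]
    # Keep only the k smallest distances.
    nearest = heapq.nsmallest(k, dists, key=lambda d: d[0])
    # Majority vote over the labels of the k nearest neighbours.
    votes = {}
    for _, lbl in nearest:
        votes[lbl] = votes.get(lbl, 0) + 1
    return max(votes, key=votes.get)

if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9)]
    lbls = ["a", "a", "b", "b"]
    print(knn_classify((0.2, 0.1), pts, lbls, k=3))  # -> "a"
```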
Hybrid KNN-Join: Parallel Nearest Neighbor Searches Exploiting CPU and GPU Architectural Features
[article]
2020
arXiv
pre-print
K Nearest Neighbor (KNN) joins are used in scientific domains for data analysis, and are building blocks of several well-known algorithms. KNN-joins find the KNN of all points in a dataset. ...
This paper focuses on a hybrid CPU/GPU approach for low-dimensional KNN-joins, where the GPU may not yield substantial performance gains over parallel CPU algorithms. ...
In this work, we focus on exact KNN searches in low dimensionality. The performance of low dimensional KNN searches is limited by the memory bottleneck. ...
arXiv:1810.04758v2
fatcat:t4t44mwcfzbfdm7uds5uvz45ey
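The snippet above defines a KNN-join as finding the KNN of all points in one dataset. A common way to express the exact, low-dimensional version is to index the inner set once and batch-query the outer set; the sketch below does this with a k-d tree (SciPy assumed available) and does not reflect the paper's hybrid CPU/GPU work partitioning.

```python
# Exact KNN-join sketch in low dimensions: index S once, probe all of R.
# Not the paper's hybrid CPU/GPU algorithm; SciPy is an assumed dependency.
import numpy as np
from scipy.spatial import cKDTree

def knn_join(R, S, k):
    """For every point in R, return distances/indices of its k nearest points in S."""
    tree = cKDTree(S)                # index the inner set once
    dists, idx = tree.query(R, k=k)  # batched queries for all outer points
    return dists, idx

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    R = rng.random((1000, 3))        # low-dimensional (3-D) data
    S = rng.random((5000, 3))
    d, i = knn_join(R, S, k=4)
    print(i.shape)                   # (1000, 4)
```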
Scaling k-Nearest Neighbours Queries (The Right Way)
2017
2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS)
Recently parallel / distributed processing approaches have been proposed for processing k-Nearest Neighbours (kNN) queries over very large (multi-dimensional) datasets aiming to ensure scalability. ...
Overall, kNN queries can be processed in just tens of milliseconds (as opposed to the (tens of) seconds required by the state of the art). ...
Briefly, executing kNN queries in this way is very costly in terms of query response times, memory usage, CPU usage, and network and disk bandwidth. ...
doi:10.1109/icdcs.2017.267
dblp:conf/icdcs/CahsaiNAT17
fatcat:uvkn65obrnhjrdiovg2kfejhou
GGNN: Graph-based GPU Nearest Neighbor Search
[article]
2021
arXiv
pre-print
Approximate nearest neighbor (ANN) search in high dimensions is an integral part of several computer vision systems and gains importance in deep learning with explicit memory representations. ...
In this paper, we propose a novel search structure based on nearest neighbor graphs and information propagation on graphs. ...
This bottom-up construction creates a robust searchable kNN-graph for each merged tree. It can be parallelized on multiple GPUs to support even datasets with large memory requirements. ...
arXiv:1912.01059v3
fatcat:zbewjskznrhexkvt2zc6vacnqy
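The entry above searches a kNN-graph by propagating through neighbour lists. The sketch below shows the generic greedy, beam-limited graph traversal that graph-based ANN methods build on; it is not GGNN's multi-GPU construction or search, and the graph, start node, and beam width are illustrative.

```python
# Generic greedy best-first search over a precomputed kNN graph (illustrative,
# not GGNN's algorithm). graph: node id -> list of neighbour node ids.
import heapq
import math

def greedy_search(graph, points, query, start, beam=8, k=4):
    """Return roughly the k nearest node ids to `query`, as (distance, id) pairs."""
    dist = lambda i: math.dist(points[i], query)
    visited = {start}
    candidates = [(dist(start), start)]   # min-heap: frontier ordered by distance
    best = [(-dist(start), start)]        # max-heap (negated): best results so far
    while candidates:
        d, node = heapq.heappop(candidates)
        # Stop once the closest frontier node is worse than the worst kept result.
        if len(best) >= beam and d > -best[0][0]:
            break
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            heapq.heappush(candidates, (dist(nb), nb))
            heapq.heappush(best, (-dist(nb), nb))
            if len(best) > beam:
                heapq.heappop(best)       # drop the current worst candidate
    return sorted((-d, n) for d, n in best)[:k]

if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (2, 0), (2, 1), (0, 2)]
    g = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2], 4: [0]}
    print(greedy_search(g, pts, query=(2.0, 0.5), start=0, beam=3, k=2))
```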
LocationSpark: In-memory Distributed Spatial Query Processing and Optimization
2020
Frontiers in Big Data
Each local computation node is responsible for optimizing and selecting its best local query execution plan based on the indexes and the nature of the spatial queries in that node. ...
The scheduler generates query execution plans that minimize the effect of query skew. ...
WA also contributed to the design and analysis, as well as to the writing of the manuscript. ...
doi:10.3389/fdata.2020.00030
pmid:33693403
pmcid:PMC7931877
fatcat:onodyye4uzb4letmbpyembq7je
SparkNN: A Distributed In-Memory Data Partitioning for KNN Queries on Big Spatial Data
2020
Data Science Journal
To fill this gap, this paper proposes SparkNN, an in-memory partitioning and indexing system for answering spatial queries, such as K-nearest neighbor, on big spatial data. ...
SparkNN is implemented on top of Apache Spark and consists of three layers to facilitate efficient spatial queries. ...
Note that the query may run in parallel in more than one partition, depending on the location of the query point q and the value of k. ...
doi:10.5334/dsj-2020-035
fatcat:3z7ftetwarhe5jn5lqqezi6t6e
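The last snippet above notes that a kNN query may need more than one spatial partition. The sketch below illustrates that routing idea with plain Python lists standing in for Spark partitions: answer in the partitions closest to q first, then prune any partition whose bounding box cannot beat the current k-th nearest distance. This is not SparkNN's code; partition layout and pruning rule are assumptions.

```python
# Multi-partition kNN routing sketch (illustrative; plain lists, not Spark RDDs).
import heapq
import math

def box_distance(box, q):
    """Minimum distance from point q to an axis-aligned box ((xmin, ymin), (xmax, ymax))."""
    (xmin, ymin), (xmax, ymax) = box
    dx = max(xmin - q[0], 0.0, q[0] - xmax)
    dy = max(ymin - q[1], 0.0, q[1] - ymax)
    return math.hypot(dx, dy)

def knn_over_partitions(partitions, boxes, q, k):
    """partitions[i]: points inside boxes[i]; returns the k nearest points to q."""
    # Visit partitions in order of how close their boxes are to q.
    order = sorted(range(len(partitions)), key=lambda i: box_distance(boxes[i], q))
    best = []                                    # max-heap of (-dist, point)
    for i in order:
        # Prune: if even the closest possible point of this box is farther than
        # the current k-th nearest distance, no remaining partition can help.
        if len(best) == k and box_distance(boxes[i], q) > -best[0][0]:
            break
        for p in partitions[i]:
            heapq.heappush(best, (-math.dist(p, q), p))
            if len(best) > k:
                heapq.heappop(best)
    return [p for _, p in sorted((-d, p) for d, p in best)]
```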
Application-Driven Near-Data Processing for Similarity Search
[article]
2017
arXiv
pre-print
At its core, similarity search manifests as k-nearest neighbors (kNN), a computationally simple primitive consisting of highly parallel distance calculations and a global top-k sort. ...
However, kNN is poorly supported by today's architectures because of its high memory bandwidth requirements. ...
Queries that traverse the index and end up in the same bucket should be similar; multiple trees with different cut orders are often used in parallel. ...
arXiv:1606.03742v2
fatcat:tgyyr4avubbzjmr7pz7obiqmle
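The first snippet above characterizes kNN as highly parallel distance calculations followed by a global top-k sort. The sketch below makes that decomposition explicit: independent per-chunk distance scans with local top-k, then one global merge. A thread pool stands in for the near-data hardware; chunking and sizes are illustrative.

```python
# Per-chunk distance computation + local top-k, merged into a global top-k.
# Each chunk is independent; a thread pool stands in for near-data hardware.
import heapq
import math
from concurrent.futures import ThreadPoolExecutor

def local_topk(chunk, query, k):
    """Top-k (distance, point) pairs within one data chunk."""
    return heapq.nsmallest(k, ((math.dist(p, query), p) for p in chunk))

def global_knn(chunks, query, k):
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda c: local_topk(c, query, k), chunks))
    # Merge the per-chunk candidates into the final global top-k.
    return heapq.nsmallest(k, (pair for part in partials for pair in part))
```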
Similarity Search on Automata Processors
[article]
2017
arXiv
pre-print
At its core, similarity search is implemented using the k-nearest neighbors (kNN) algorithm, where computation consists of highly parallel distance calculations and a global top-k sort. ...
In this paper, we present and evaluate a novel automata-based algorithm for kNN on the Micron Automata Processor (AP), which is a non-von Neumann near-data processing architecture. ...
This work was also supported in part by NSF under grant CCF-1518703, gifts by Oracle, and by C-FAR, one of the six SRC STARnet Centers, sponsored by MARCO and DARPA. ...
arXiv:1608.03175v2
fatcat:gqtchulalnea3mdasisfqz7sgq
Hybrid Indexing for Parallel Analysis of Spatiotemporal Point Patterns
2016
International Conference on GIScience Short Paper Proceedings
We perform adaptive octree decomposition of the spatiotemporal domain and build local k-d trees to accelerate nearest neighbour search for space-time kernel density estimation (STKDE). ...
Our parallel implementation reaches substantial speedup compared to sequential processing. The hybrid index outperforms octree decomposition alone, especially at lower levels of parallelization. ...
Hering (2013) showed that the performance of in-memory k-d trees is best for an intermediate number of dimensions (6-13). ...
doi:10.21433/b3114824r3wg
fatcat:nqs4jzffhjhpjgcqdbiitsqr5u
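The entry above combines a global spatial decomposition with per-cell local k-d trees. The sketch below shows that two-level pattern under simplifying assumptions: a uniform 2-D grid stands in for the adaptive octree, and SciPy's cKDTree plays the role of the local index; cell size and the "query the cell plus its 8 neighbours" heuristic are illustrative, not the paper's method.

```python
# "Global decomposition + local k-d trees" sketch (uniform grid instead of an
# adaptive octree; SciPy assumed available).
import numpy as np
from scipy.spatial import cKDTree

def build_local_trees(points, cell_size):
    """Group points by grid cell and build one k-d tree per non-empty cell."""
    cells = {}
    for p in points:
        key = (int(p[0] // cell_size), int(p[1] // cell_size))
        cells.setdefault(key, []).append(p)
    return {key: cKDTree(np.asarray(pts)) for key, pts in cells.items()}

def local_knn_distances(trees, cell_size, q, k):
    """Query only the cell containing q and its 8 neighbours (a common heuristic)."""
    cx, cy = int(q[0] // cell_size), int(q[1] // cell_size)
    candidates = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            tree = trees.get((cx + dx, cy + dy))
            if tree is not None:
                d, _ = tree.query(q, k=min(k, tree.n))
                candidates.extend(np.atleast_1d(d))
    return sorted(candidates)[:k]
```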
GPU-FS-kNN: A Software Tool for Fast and Scalable kNN Computation Using GPUs
2012
PLoS ONE
Results: We propose an efficient parallel formulation of the k-Nearest Neighbour (kNN) search problem, which is a popular method for classifying objects in several fields of research, such as pattern recognition ...
Based on our approach, we implemented a software tool GPU-FS-kNN (GPU-based Fast and Scalable k-Nearest Neighbour) for CUDA enabled GPUs. ...
Manuel Ujaldón, for his constructive feedback on an earlier version of this manuscript.
Author Contributions: Conceived and designed the experiments: ASA CR PM. Performed the experiments: ASA. ...
doi:10.1371/journal.pone.0044000
pmid:22937144
pmcid:PMC3429408
fatcat:vpm23ylkhjcnjlbwhbeuw32aka
A Survey on Efficient Processing of Similarity Queries over Neural Embeddings
[article]
2022
arXiv
pre-print
In this survey, we first provide an overview of the "similarity query" and "similarity query processing" problems. ...
Similarity query is the family of queries based on some similarity metrics. ...
There are also studies of KNN join on top of other types of indexes, e.g., distributed KNN join based on a tree index [21], distributed KNN join on parallel product quantization [29], localized KNN join ...
arXiv:2204.07922v1
fatcat:u5osyghs6vgppnj5gpnrzhae5y
Simba: Efficient In-Memory Spatial Analytics
2016
Proceedings of the 2016 International Conference on Management of Data (SIGMOD '16)
Simba is based on Spark and runs over a cluster of commodity machines. ...
We present the Simba (Spatial In-Memory Big data Analytics) system that offers scalable and efficient in-memory spatial query processing and analytics for big spatial data. ...
In parallel, on each combined RDD partition, Simba builds an R-tree over Si and executes a local kNN join by querying each record from Ri over this tree. ...
doi:10.1145/2882903.2915237
dblp:conf/sigmod/XieL0LZG16
fatcat:kkus3fprcjevle5qxw7zaw2wtq
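The last snippet above describes Simba's local kNN join: on each co-partition, index Si once and probe every record of Ri against it. The sketch below mirrors that per-partition step under stated assumptions: plain Python tuples stand in for Spark RDD partitions, and a k-d tree (SciPy assumed) stands in for Simba's R-tree.

```python
# Partition-local kNN-join sketch: for each co-partition (Ri, Si), index Si and
# probe all records of Ri. Not Simba's code; lists stand in for RDD partitions.
import numpy as np
from scipy.spatial import cKDTree

def local_knn_join(partition, k):
    """partition = (Ri, Si); yields (r_index, indices of r's k nearest points in Si)."""
    Ri, Si = partition
    tree = cKDTree(Si)              # local index over this partition's Si
    _, idx = tree.query(Ri, k=k)    # probe all Ri records in one batch
    return list(enumerate(np.atleast_2d(idx)))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    partitions = [(rng.random((100, 2)), rng.random((400, 2))) for _ in range(4)]
    results = [local_knn_join(p, k=3) for p in partitions]  # "map" over partitions
    print(len(results), len(results[0]))                    # 4 partitions, 100 records each
```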
A Hardware Processing Unit for Point Sets
[article]
2008
Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware - HWWS '04
A key component of our design is the spatial search unit based on a kd-tree performing both kNN and εN searches. ...
Our design is focused on fundamental and computationally expensive operations on point sets including k-nearest neighbors search, moving least squares approximation, and others. ...
While this algorithm is a highly sequential operation, we can identify three main blocks to be executed in parallel, due to their independence in terms of memory access. ...
doi:10.2312/eggh/eggh08/021-031
fatcat:n3epuivc45csrop227lqubkqkm
Accelerating Exact Similarity Search on CPU-GPU Systems
2015
2015 IEEE International Conference on Data Mining
Similarity search, also known as k-nearest neighbor search, is a key part of data mining applications and is also used extensively in applications such as multimedia search, where only a small subset of ...
In recent years, the use of Graphics Processing Units (GPUs) for data mining tasks has become popular. ...
There are four key points to such a system:
Fig. 3: Examples of execution on 6 data elements. ...
doi:10.1109/icdm.2015.125
dblp:conf/icdm/MatsumotoY15
fatcat:i5rzgrm2cfek5jdudloi4ndt6q
How good are modern spatial analytics systems?
2018
Proceedings of the VLDB Endowment
In this work, we first explore the available modern spatial processing systems and then thoroughly compare them based on features and queries they support, using real-world datasets. ...
In recent years a lot of spatial analytics systems have emerged. Existing work compares either limited features of these systems or the studies are outdated since new systems have emerged. ...
memory is the maximum amount of memory used at any point in time for execution of a query. ...
doi:10.14778/3236187.3236213
fatcat:f7ujehz35ra7xljqdiwwd2hs5q