A New Ensemble Method with Feature Space Partitioning for High-Dimensional Data Classification
2015
Mathematical Problems in Engineering
In this paper, we propose an ensemble method for classification of high-dimensional data, with each classifier constructed from a different set of features determined by partitioning of redundant features ...
However, data dimensionality increases rapidly day by day. Such a trend poses various challenges as these methods are not suitable to directly apply to high-dimensional datasets. ...
feature partitioning-based ensemble method to better classify high-dimensional data. ...
doi:10.1155/2015/590678
fatcat:ivwnojz3vzg35bxw2wkvhiq3oa
A Recursive Partitioning Method for Nearest Neighbor Search in High Dimensional Data
Number
unpublished
In this work we propose a recursive partitioning and distance-based indexing scheme for large and high-dimensional data to retrieve the nearest neighbours for a given query. ...
In the next level for each sub-partition a reference point is selected and again it is partitioned into further sub-sub-partitions. Main advantage of this method is that it reduces the search space. ...
Durga Bhavani for their valuable comments and suggestions. ...
fatcat:dlf5kzjqnvfqjbvxcmzwa5wc7m
An Efficient Unsavory Data Detection Method for Internet Big Data
[chapter]
2015
Lecture Notes in Computer Science
For a high-dimensional data object v in pyramid j of subspace i, we compute the height h_v (to its top) and map v into a one-dimensional value p_v = i + j + (0.5 - h_v). ...
To realize intelligent and efficient unsavory data detection for internet big data, we proposed the i-Tree method, a semantics-based data detection method. ...
doi:10.1007/978-3-319-24315-3_21
fatcat:74drc44vgzfz7ec5jaqyiosx2m
Concentric hyperspaces and disk allocation for fast parallel range searching
1999
Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337)
However, most of these techniques were primarily designed for two-dimensional data and for balanced partitioning of the data space. ...
In this paper, we first establish that traditional declustering techniques do not scale for high-dimensional data. We then propose several new partitioning schemes based on concentric hyperspaces. ...
High Dimensional Data and Balanced Partitioning Balanced partitioning is a common assumption for most of the declustering techniques. The data space is divided into AE parts in Ø dimension. ...
doi:10.1109/icde.1999.754977
dblp:conf/icde/FerhatosmanogluAA99
fatcat:psux4hfrsveujoerahbiudiuqu
A New Indexing Method for High Dimensional Dataset
[chapter]
2005
Lecture Notes in Computer Science
However, for high dimensional data, the number of pyramids is often insufficient to discriminate data points when the number of dimensions is high. ...
We propose a new indexing method based on the surface of dimensionality. We prove that the Pyramid tree technology is a special case of our method. ...
The fan out of a node becomes very small due to the large size of coordinates for high dimensional data. ...
doi:10.1007/11408079_35
fatcat:d2klnvozefbklbp5rqbqvvpbbi
High-Dimensional Similarity Search Using Data-Sensitive Space Partitioning
[chapter]
2006
Lecture Notes in Computer Science
A new space partitioning method is proposed along with a new algorithm for exact similarity search in high-dimensional spaces. ...
It relies on a new method for data-sensitive space partitioning based on explicit data clustering, which is introduced in the paper for the first time. ...
An appropriate similarity search method must be aware of the locality of data in high dimensions. However, most methods for finding the locality of data rely on dimensionality reduction. ...
doi:10.1007/11827405_72
fatcat:2apjccuijfc4rnnf4tcpgbh55e
Indexing Issues in Supporting Similarity Searching
[chapter]
2004
Lecture Notes in Computer Science
This includes a discussion of the curse of dimensionality, as well as multidimensional indexing, distance-based indexing, dimension reduction, and embedding methods. ...
Concluding Remarks Providing indexing support for similarity searching is an important area where much work remains to be done. ...
Dimension Reduction and Embedding Methods There are many problems with indexing high-dimensional data. ...
doi:10.1007/978-3-540-30542-2_57
fatcat:7nysehgscvet5asn7oho22towy
A Comprehensive Study of iDistance Partitioning Strategies for kNN Queries and High-Dimensional Data Indexing
[chapter]
2013
Lecture Notes in Computer Science
and high-dimensional data and highlight the inherent difficulties associated with such tasks. ...
In this work, we perform the first comprehensive analysis of different partitioning strategies for the state-of-the-art high-dimensional indexing technique iDistance. ...
A special thanks to all research and manuscript reviewers. ...
doi:10.1007/978-3-642-39467-6_22
fatcat:glvzalrln5hj3o2cxlyzm2c7py
A Class of Region-preserving Space Transformations for Indexing High-dimensional Data
2005
Journal of Computer Science
This study introduces a class of region preserving space transformation (RPST) schemes for accessing high-dimensional data. ...
The techniques are experimentally compared to the Pyramid Technique, which is another example of static partitioning designed for high-dimensional data. ...
As a result, access methods for high-dimensional data [2] [3] [4] [5] [6] [7] [8] [9] continue to attract considerable scientific interest. ...
doi:10.3844/jcssp.2005.89.97
fatcat:msr4o6ayizdoheoezhsiioaeg4
A Functional Measure-Based Framework for Evaluation of Multi-Dimensional Point Access Methods
2011
Procedia Environmental Sciences
Multi-dimensional access methods have been developed for supporting fast retrieval of multi-dimensional data from multi-dimensional databases. ...
In this framework, in order to present a comprehensive evaluation of multi-dimensional point access methods, firstly, we extended related classification of multi-dimensional point access methods in the ...
For example, BSP-tree is a binary tree that represents a recursive complete partitioning of the data space into subspaces. ...
doi:10.1016/j.proenv.2011.09.127
fatcat:3bdiiy2llze4josfw4wrcbrjry
BrePartition: Optimized High-Dimensional kNN Search with Bregman Distances
[article]
2020
arXiv
pre-print
Such high-dimensional space has posed significant challenges for existing kNN search algorithms with Bregman distances, which could only handle data of medium dimensionality (typically less than 100). ...
This paper addresses the urgent problem of high-dimensional kNN search with Bregman distances. We propose a novel partition-filter-refinement framework. ...
high-dimensional data points from the disks. ...
arXiv:2006.00227v1
fatcat:cvuizn6xbjebze2q77sp2x2vce
Clustering based feature selection using Partitioning Around Medoids (PAM)
2020
Jurnal Informatika
ABSTRACT High-dimensional data contains a large number of features. With many features, high dimensional data requires immense computational resources, including space and time. ...
There are two methods employed for dimensionality reduction purposes: feature selection and feature extraction [6]. ...
High dimensional data give much chance to overfitting problem. Small data usually leads to a simpler model, and a simpler model tends to generalize better. ...
doi:10.26555/jifo.v14i2.a17620
fatcat:dxhhhwvlbrh7ji7ea4an4eqrle
An empirical study on the visual cluster validation method with Fastmap
2001
Proceedings Seventh International Conference on Database Systems for Advanced Applications DASFAA 2001 DASFAA-01
from data partitions. ...
The visual cluster validation method attempts to tackle two clustering problems in data mining: (1) to verify partitions of data created by a clustering algorithm and (2) to identify genuine clusters ...
Projection of high dimensional data onto low dimensional spaces for clustering is a common approach in cluster analysis. Fastmap was primarily designed for this purpose [7]. ...
doi:10.1109/dasfaa.2001.916368
dblp:conf/dasfaa/HuangNC01
fatcat:oti4l7yidvbdfa6zjoocq4yr3q
An Efficient Semantic-Based Organization and Similarity Search Method for Internet Data Resources
[chapter]
2014
Lecture Notes in Computer Science
First, the iHash normalizes the internet data objects into a high-dimensional feature space, solving the "feature explosion" problem of the feature space; second, we partition the high-dimensional data ...
In this paper, we present the iHash method, a semantic-based organization and similarity search method for internet data resources. ...
Jagadish et al. presented the iDistance method for k-nearest neighbor (kNN) query in a high-dimensional metric space. ...
doi:10.1007/978-3-642-55032-4_68
fatcat:oocsmwuqazfavbjhag3eyvd7cu
A Comprehensive Study of Challenges and Approaches for Clustering High Dimensional Data
2014
International Journal of Computer Applications
In this paper we provide a short introduction to various approaches and challenges for high-dimensional data clustering. ...
Most clustering methods work efficiently for low dimensional data since distance measures are used to find dissimilarities between objects. ...
But often the data collected for research contains multiple dimension, is sparse and highly skewed, known as high dimensional data. ...
doi:10.5120/15995-4844
fatcat:y3g3hhvttfduxikr35uf7ecuky