Optimal Clustering in Stable Instances Using Combinations of Exact and Noisy Ordinal Queries
2021
Algorithms
Specifically, we study a class of polynomial-time graph-based clustering algorithms (termed Single-Linkage) that are widely used in practice and guarantee exact solutions for stable instances in ...
Our algorithms still guarantee exact solutions for stable instances of k-medoids clustering, and they use a rather small number of high-cost operations, without increasing the low-cost operations too much ...
Author Contributions: Conceptualization, E.B. and P.P.; Formal analysis, E.B. and P.P.; Investigation, E.B. and P.P. Both the authors have read and agreed to the published version of the manuscript. ...
doi:10.3390/a14020055
fatcat:6bt5olxfmrbglf24sb4eaxrgcm
Active Learning of Ordinal Embeddings: A User Study on Football Data
[article]
2022
arXiv
pre-print
Humans innately measure distance between instances in an unlabeled dataset using an unknown similarity function. ...
Distance metrics can only serve as proxy for similarity in information retrieval of similar instances. Learning a good similarity function from human annotations improves the quality of retrievals. ...
Query Ordinal Embedding (a) Left: We collect participants' responses to relative similarity queries. In our study, we use a tuple size of nine. ...
arXiv:2207.12710v1
fatcat:zt66txpic5fcdd3xofbmd57hwe
Unifying learning to rank and domain adaptation
2014
Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '14
For determining these two aspects of keyword importance, we further propose the concept of feature decoupling, suggesting using two types of easy-to-design features: meta-features and intra-features. ...
Towards learning a scorer based on the decoupled features, we require that our framework fulfill inferred sparsity to eliminate the interference of noisy keywords, and employ distant supervision to tackle ...
Specifically, in verbose-query-based retrieval, using traditional document scorers as features does not perform well because of the existence of noisy keywords. ...
doi:10.1145/2623330.2623739
dblp:conf/kdd/ZhouC14
fatcat:rknkqqwt4venjg3xlgiggokdby
Using temporal bursts for query modeling
2013
Information retrieval (Boston)
In our approach we detect bursts in result lists returned for a query. We then model the term distributions of the bursts using a reduced result list and select its most descriptive terms. ...
For query sets that consist of both temporal and non-temporal queries, our query modeling approach incorporates an effective selection method of terms. ...
Acknowledgments We are grateful to our reviewers for providing valuable feedback and suggestions. This ...
doi:10.1007/s10791-013-9227-2
fatcat:kddxbjlpdnaypesvphnwnexspa
Secure multidimensional range queries over outsourced data
2011
The VLDB journal
The problem is motivated by secure data outsourcing applications where a client may store his/her data on a remote server in encrypted form and want to execute queries using server's computational capabilities ...
In this paper, we study the problem of supporting multidimensional range queries on encrypted data. ...
Acknowledgments This work was made possible by a grant from NSF (award number 1045296) and a gift from NEC Laboratories USA. ...
doi:10.1007/s00778-011-0245-7
fatcat:4ma6bbcxafg3jpitwtfu47ho6e
Generalized Dictionaries for Multiple Instance Learning
2015
International Journal of Computer Vision
We propose a noisy-OR model and a generalized mean-based optimization framework for learning the dictionaries in the feature space. ...
We present a multi-class Multiple Instance Learning (MIL) algorithm using the dictionary learning framework where the data is given in the form of bags. ...
The exact number of instances in each bag varies depending on the action cuboids of its class present in the video sequence. ...
doi:10.1007/s11263-015-0831-z
fatcat:4e6wr55w3vb5xlyao3vbv4ixqe
A survey on wavelet applications in data mining
2002
SIGKDD Explorations
Recently there has been significant development in the use of wavelet methods in various data mining processes. However, no comprehensive survey on the topic has been written. ...
The paper concludes by discussing the impact of wavelets on data mining research and outlining potential future research directions and applications. ...
About the Authors Tao Li received his BS degree in Computer Science from Fuzhou University, China and MS degree in Computer Science from Chinese Academy of Science. ...
doi:10.1145/772862.772870
fatcat:3ruxy2cknze2lh2paxlzpe55ta
Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces
[article]
2018
arXiv
pre-print
Focusing on Automatic Speech Recognition and Natural Language Understanding, we detail our approach to training high-performance Machine Learning models that are small enough to run in real-time on small ...
This paper presents the machine learning architecture of the Snips Voice Platform, a software solution to perform Spoken Language Understanding on microprocessors typical of IoT devices. ...
We are indebted to the community of users of the Snips Voice Platform for valuable feedback and contributions. ...
arXiv:1805.10190v3
fatcat:ej65i7jecvatlp7wppshiptxwm
Crawler
[chapter]
2009
Encyclopedia of Database Systems
Synonyms Cache-aware query processing; Cache-sensitive query processing Definition Query processing algorithms are designed to efficiently exploit the available cache units in the memory hierarchy. ...
This knowledge can be used to ensure that the algorithms have good temporal and/or spatial locality on the target platform. ...
to query optimization. ...
doi:10.1007/978-0-387-39940-9_2315
fatcat:x4qspjdytvhvroc7h753dihp7u
Improving data utility in differential privacy and k-anonymity
[article]
2013
arXiv
pre-print
The main objective of this thesis is to improve the data utility in k-anonymous and differentially private data releases. k-Anonymity has several drawbacks. ...
Microaggregation-based k-anonymity and differential privacy can be combined to produce microdata releases with the strong privacy guarantees of differential privacy and improved data accuracy. ...
, insensitive MDAV generates sets of clusters that are more stable when one record of the data set changes. ...
arXiv:1307.0966v1
fatcat:adgnr7mbirbatkeirelc7pj67a
Content-based video copy detection
2009
Proceedings of the seventeen ACM international conference on Multimedia - MM '09
In particular, we propose techniques for the automatic creation of spatio-temporal descriptors using frame-based global descriptors, an acoustic descriptor that can be combined with global descriptors, ...
In particular, we propose a novel approximate search that uses pivot objects in order to estimate and discard distance evaluations, a multimodal search in large datasets, and a novel index structure that ...
In the case of exact searches, the distance aggregation performs best when combining two or three descriptors, and then it is highly affected by the fourth descriptor, which behaves as a noisy (spammer ...
doi:10.1145/1631272.1631539
dblp:conf/mm/Barrios09
fatcat:kpjhi2p3orcsrfi444ewana33i
Super-Fine Attributes with Crowd Prototyping
2018
IEEE Transactions on Pattern Analysis and Machine Intelligence
We re-annotate gender, age and ethnicity traits from PETA, a highly diverse (19K instances, 8.7K identities) amalgamation of 10 re-id datasets including VIPER, CUHK and TownCentre. ...
Such brittle representations are limited in discriminative power and hamper the efficacy of learnt estimators. ...
Empirically, we find this weighted combination of uncertainty and dissimilarity produces the most stable and coherent embeddings, in comparison to interpreting uncertainty as a fixed distance, or ignoring ...
doi:10.1109/tpami.2018.2836900
pmid:29994759
fatcat:z7cf52y4jrdmnl7fke5yvekshe
Detecting and indexing moving objects for Behavior Analysis by Video and Audio Interpretation
2014
ELCVIA Electronic Letters on Computer Vision and Image Analysis
spatial query optimization and processing. ...
The C-index is thus not specifically designed to select an optimal number of clusters but rather to compare different clustering methods using a same number of clusters. ...
doi:10.5565/rev/elcvia.603
fatcat:jo2v6p3czncljhsdszaqr725qe
Computer science and decision theory
2008
Annals of Operations Research
In game theory, values, optimal strategies, equilibria, and other solution concepts can be easy, hard, or even impossible to compute. ...
science methods, and explores applications in the social and decision sciences of newer decision-theoretic methods developed with computer science applications in mind. ...
In [159, 180], the authors described fast exact and approximation algorithms for equitable resource sharing and assignment of prices, using combinatorial and continuous optimization methods. ...
doi:10.1007/s10479-008-0328-z
fatcat:i5dlgvevmbaatitjhntpbjtg5u
Descriptive document clustering via discriminant learning in a co-embedded space of multilevel similarities
2014
Journal of the Association for Information Science and Technology
Then, it discovers an approximate cluster structure of documents in the common space. ...
The third stage extracts promising topic phrases by constructing a discriminant model where documents along with their cluster memberships are used as training instances. ...
DESCRIPTIVE DOCUMENT CLUSTERING ...
doi:10.1002/asi.23374
fatcat:xnx7abfkyrh7djkezmsfzncbre