Improving the Robustness of Deep Neural Networks via Stability Training
[article]
2016
arXiv
pre-print
In addition, we demonstrate that our stabilized model gives robust state-of-the-art performance on large-scale near-duplicate detection, similar-image ranking, and classification on noisy datasets. ...
In this paper we address the issue of output instability of deep neural networks: small perturbations in the visual input can significantly distort the feature embeddings and output of a neural network ...
more similar for near-duplicate images. ...
arXiv:1604.04326v1
fatcat:kfqoovo4x5ae7nq6v7au3efkd4
Adaptive near-duplicate detection via similarity learning
2010
Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval - SIGIR '10
In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. ...
Near-duplicate documents can be reliably detected through this improved similarity measure. ...
ACKNOWLEDGMENTS We thank Chris Meek and Susan Dumais for many useful discussions. We are also grateful to Martin Theobald for sharing the data and the SpotSigs package. ...
doi:10.1145/1835449.1835520
dblp:conf/sigir/HajishirziYK10
fatcat:saucmssdkjdqdj4t2wpxt76pfe
Improving the Robustness of Deep Neural Networks via Stability Training
2016
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
We also apply stability training in the classification setting to learn stable prediction labels for visual recognition. ... near duplicate image detection, similar to [13]. ...
Similar image ranking
at 98% precision for JPEG near-duplicates. ...
doi:10.1109/cvpr.2016.485
dblp:conf/cvpr/ZhengSLG16
fatcat:rhlsyrmek5durapxai52e4ortm
Adaptive duplicate detection using learnable string similarity measures
2003
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '03
In this paper, we present a framework for improving duplicate detection using trainable measures of textual similarity. ...
The problem of identifying approximately duplicate records in databases is an essential step for data cleaning and data integration processes. ...
SVMs classify input vectors p (x,y) by implicitly mapping them via the "kernel trick" to a high-dimensional space where the two classes (S, equivalent-string pairs, and D, different-string pairs) are separated ...
doi:10.1145/956755.956759
fatcat:3dafob6h7veall7enhbf5lxo6a
Adaptive duplicate detection using learnable string similarity measures
2003
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '03
In this paper, we present a framework for improving duplicate detection using trainable measures of textual similarity. ...
The problem of identifying approximately duplicate records in databases is an essential step for data cleaning and data integration processes. ...
SVMs classify input vectors p (x,y) by implicitly mapping them via the "kernel trick" to a high-dimensional space where the two classes (S, equivalent-string pairs, and D, different-string pairs) are separated ...
doi:10.1145/956750.956759
dblp:conf/kdd/BilenkoM03
fatcat:ijhfcujqqramriziclnqceveoi
Multi-Level Spherical Locality Sensitive Hashing For Approximate Near Neighbors
[article]
2017
arXiv
pre-print
This paper introduces "Multi-Level Spherical LSH": a parameter-free, multi-level, data-dependent Locality Sensitive Hashing data structure for solving the Approximate Near Neighbors Problem (ANN). ...
This data structure uses a modified version of a multi-probe adaptive querying algorithm, with the potential of achieving a O(n^p + t) query run time, for all inputs n where t <= n. ...
LSH is used to solve a wide variety of problems such as near duplicate and duplicate document detection. ...
arXiv:1709.03517v2
fatcat:5yuof3wnqzaizj64q3srk7f35y
Structured Max-Margin Learning for Inter-Related Classifier Training and Multilabel Image Annotation
2011
IEEE Transactions on Image Processing
Second, a visual concept network is constructed for characterizing the inter-concept visual similarity contexts more precisely in the high-dimensional multimodal feature space. ...
The visual concept network is used to determine the inter-related learning tasks directly in the feature space rather than in the label space because feature space is the common space for classifier training ...
Younes for handling the review process of our paper. ...
doi:10.1109/tip.2010.2073476
pmid:20833601
fatcat:36febhw22zcedi3ubhjerkoimi
Efficient overlap and content reuse detection in blogs and online news articles
2009
Proceedings of the 18th international conference on World wide web - WWW '09
Furthermore, the dynamic nature of blog and news entries necessitates incremental processing for reuse detection. ...
On the other hand, this knowledge is not cheap to acquire: considering the size of the related space web entries, it is essential that the techniques developed for identifying re-use are fast and scalable ...
From an indexing perspective, one possible approach is to apply near neighbor searches in high dimensional spaces to the problem of duplicate identification [22, 35]. ...
doi:10.1145/1526709.1526721
dblp:conf/www/KimCT09
fatcat:amg4g62sgjctph7jsmot2qitjy
Detecting semantic duplicates in short news items
2017
Business Informatics
In the paper, we examine a task of detecting text messages that borrow similar meaning or relate to the same event. ...
To solve this task, we design an algorithm that is based on the vector space model, meaning that every text is mapped to a point in high-dimensional space. ...
Segalovich is dedicated to the problem of detecting duplicates in text documents. It provides a comparative study of the most popular modern methods of detecting near-duplicates [6]. ...
doi:10.17323/1998-0663.2017.2.47.56
fatcat:a2yn3g5scnaexpuocnbcm6renu
A novel approach to capture the similarity in summarized text using embedded model
2022
International Journal on Smart Sensing and Intelligent Systems
Text summarization, an important tool of text mining, is not explored yet for the detection of near duplicates. ...
Existing research mostly uses text clustering, classification and retrieval algorithms for detection of near duplicates. ...
Figure 1 shows a generic approach for detecting near duplicates in two input pairs of text. ...
doi:10.2478/ijssis-2022-0002
fatcat:cf6esbng5vf6nb6lvgg57plqxm
Tracking Large-Scale Video Remix in Real-World Events
2013
IEEE Transactions on Multimedia
The proposed joint statistical model of visual memes and words outperforms an alternative concurrence model, with an average error of 2% for predicting meme volume and 17% for predicting meme lifespan. ...
In these two events, a high percentage of videos contain remixed content, and it is apparent that traditional news media and citizen journalists have different roles in disseminating remixed content. ...
The resulting data set contains near-duplicate keyframe pairs and non-duplicate keyframe pairs. ...
doi:10.1109/tmm.2013.2264929
fatcat:2sgemwmdrjhpxfsaqh2zjl3ehm
Near-duplicate video retrieval
2013
ACM Computing Surveys
As discovered in recent works, latest improvements and progress in near-duplicate video retrieval, as well as related topics including low-level feature extraction, signature generation, and high-dimensional ...
As we survey the works in near-duplicate video retrieval, we comparatively investigate existing variants of the definition of near-duplicate video, describe a generic framework, summarize state-of-the-art ...
Such a signature can be seen as a high-dimensional point in the [0, 255]^20 space. ...
doi:10.1145/2501654.2501658
fatcat:7x5th32fijhhbne5jb4oyovjgy
Effective and Efficient Content Redundancy Detection of Web Videos
2019
IEEE Transactions on Big Data
In this paper, we propose a novel near-duplicate video detection system, CompoundEyes, whose design philosophy deviates from the conventional feature-centered paradigm. ...
Within these videos, a considerable portion is duplicate or near-duplicate. ...
In detection, the goal is to determine whether a pair of videos are similar; in retrieval, the aim is to locate the videos that are near-duplicate to the query video and position them correctly. ...
doi:10.1109/tbdata.2019.2913674
fatcat:eydjky7o5revfohhjqze2rtg4m
MULTI-MODAL RETRIEVAL IN NEWS FEED APP USING GCDL TECHNIQUE
2017
International Journal of Recent Trends in Engineering and Research
Various applications such as information retrieval, near duplicate detection, and data mining are performed by hashing based methods. ...
Nearest neighbor search methods based on hashing have attracted considerable attention for effective and efficient large-scale similarity search in computer vision and information retrieval community. ...
To reduce the training time complexity,
Near-Duplicate Detection The task of detecting near duplicate textual information has received considerable attentions in recent years. ...
doi:10.23883/ijrter.2017.3365.aeikk
fatcat:6dmfmfsmtbaejale6t63ts7may
Totally Looks Like - How Humans Compare, Compared to Machines
[article]
2018
arXiv
pre-print
Perceptual judgment of image similarity by humans relies on rich internal representations ranging from low-level features to high-level concepts, scene properties and even cultural associations. ...
We introduce a new dataset dubbed Totally-Looks-Like (TLL) after a popular entertainment website, which contains images paired by humans as being visually similar. ...
Others modify learned representations to better match this similarity, reporting a high level of success in some cases [16], and near-perfect in others [2]. ...
arXiv:1803.01485v3
fatcat:d4p65rhrlfhwfk6hsi5epekz4e