Improving the Robustness of Deep Neural Networks via Stability Training [article]

Stephan Zheng, Yang Song, Thomas Leung, Ian Goodfellow
2016 arXiv   pre-print
In addition, we demonstrate that our stabilized model gives robust state-of-the-art performance on large-scale near-duplicate detection, similar-image ranking, and classification on noisy datasets.  ...  In this paper we address the issue of output instability of deep neural networks: small perturbations in the visual input can significantly distort the feature embeddings and output of a neural network  ...  more similar for near-duplicate images.  ... 
arXiv:1604.04326v1 fatcat:kfqoovo4x5ae7nq6v7au3efkd4

Adaptive near-duplicate detection via similarity learning

Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz
2010 Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval - SIGIR '10  
In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain.  ...  Near-duplicate documents can be reliably detected through this improved similarity measure.  ...  ACKNOWLEDGMENTS We thank Chris Meek and Susan Dumais for many useful discussions. We are also grateful to Martin Theobald for sharing the data and the SpotSigs package.  ... 
doi:10.1145/1835449.1835520 dblp:conf/sigir/HajishirziYK10 fatcat:saucmssdkjdqdj4t2wpxt76pfe

Improving the Robustness of Deep Neural Networks via Stability Training

Stephan Zheng, Yang Song, Thomas Leung, Ian Goodfellow
2016 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
We also apply stability training in the classification setting to learn stable prediction labels for visual recognition. near duplicate image detection, similar to [13].  ...  Similar image ranking at 98% precision for JPEG near-duplicates.  ... 
doi:10.1109/cvpr.2016.485 dblp:conf/cvpr/ZhengSLG16 fatcat:rhlsyrmek5durapxai52e4ortm

Adaptive duplicate detection using learnable string similarity measures

Mikhail Bilenko, Raymond J. Mooney
2003 Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '03  
In this paper, we present a framework for improving duplicate detection using trainable measures of textual similarity.  ...  The problem of identifying approximately duplicate records in databases is an essential step for data cleaning and data integration processes.  ...  SVMs classify input vectors p (x,y) by implicitly mapping them via the "kernel trick" to a high-dimensional space where the two classes (S, equivalent-string pairs, and D, different-string pairs) are separated  ... 
doi:10.1145/956755.956759 fatcat:3dafob6h7veall7enhbf5lxo6a

Adaptive duplicate detection using learnable string similarity measures

Mikhail Bilenko, Raymond J. Mooney
2003 Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '03  
In this paper, we present a framework for improving duplicate detection using trainable measures of textual similarity.  ...  The problem of identifying approximately duplicate records in databases is an essential step for data cleaning and data integration processes.  ...  SVMs classify input vectors p (x,y) by implicitly mapping them via the "kernel trick" to a high-dimensional space where the two classes (S, equivalent-string pairs, and D, different-string pairs) are separated  ... 
doi:10.1145/956750.956759 dblp:conf/kdd/BilenkoM03 fatcat:ijhfcujqqramriziclnqceveoi

Multi-Level Spherical Locality Sensitive Hashing For Approximate Near Neighbors [article]

Teresa Nicole Brooks, Rania Almajalid
2017 arXiv   pre-print
This paper introduces "Multi-Level Spherical LSH": a parameter-free, multi-level, data-dependent Locality Sensitive Hashing data structure for solving the Approximate Near Neighbors Problem (ANN).  ...  This data structure uses a modified version of a multi-probe adaptive querying algorithm, with the potential of achieving an O(n^p + t) query run time, for all inputs n where t <= n.  ...  LSH is used to solve a wide variety of problems such as near duplicate and duplicate document detection.  ... 
arXiv:1709.03517v2 fatcat:5yuof3wnqzaizj64q3srk7f35y

Structured Max-Margin Learning for Inter-Related Classifier Training and Multilabel Image Annotation

Jianping Fan, Yi Shen, Chunlei Yang, Ning Zhou
2011 IEEE Transactions on Image Processing  
Second, a visual concept network is constructed for characterizing the inter-concept visual similarity contexts more precisely in the high-dimensional multimodal feature space.  ...  The visual concept network is used to determine the inter-related learning tasks directly in the feature space rather than in the label space because feature space is the common space for classifier training  ...  Younes for handling the review process of our paper.  ... 
doi:10.1109/tip.2010.2073476 pmid:20833601 fatcat:36febhw22zcedi3ubhjerkoimi

Efficient overlap and content reuse detection in blogs and online news articles

Jong Wook Kim, K. Selçuk Candan, Junichi Tatemura
2009 Proceedings of the 18th international conference on World wide web - WWW '09  
Furthermore, the dynamic nature of blog and news entries necessitates incremental processing for reuse detection.  ...  On the other hand, this knowledge is not cheap to acquire: considering the size of the related space web entries, it is essential that the techniques developed for identifying re-use are fast and scalable  ...  From an indexing perspective, one possible approach is to apply near neighbor searches in high dimensional spaces to the problem of duplicate identification [22, 35].  ... 
doi:10.1145/1526709.1526721 dblp:conf/www/KimCT09 fatcat:amg4g62sgjctph7jsmot2qitjy

Detecting semantic duplicates in short news items

Sergei Fomin, Roman Belousov
2017 Business Informatics  
In the paper, we examine a task of detecting text messages that borrow similar meaning or relate to the same event.  ...  To solve this task, we design an algorithm that is based on the vector space model, meaning that every text is mapped to a point in high-dimensional space.  ...  Segalovich is dedicated to the problem of detecting duplicates in text documents. It provides a comparative study of the most popular modern methods of detecting near-duplicates [6].  ... 
doi:10.17323/1998-0663.2017.2.47.56 fatcat:a2yn3g5scnaexpuocnbcm6renu

A novel approach to capture the similarity in summarized text using embedded model

Asha Rani Mishra, V.K. Panchal
2022 International Journal on Smart Sensing and Intelligent Systems  
Text summarization, an important tool of text mining, is not explored yet for the detection of near duplicates.  ...  Existing research mostly uses text clustering, classification and retrieval algorithms for detection of near duplicates.  ...  Figure 1 shows a generic approach for detecting near duplicates in two input pairs of text.  ... 
doi:10.2478/ijssis-2022-0002 fatcat:cf6esbng5vf6nb6lvgg57plqxm

Tracking Large-Scale Video Remix in Real-World Events

Lexing Xie, Apostol Natsev, Xuming He, John R. Kender, Matthew Hill, John R. Smith
2013 IEEE transactions on multimedia  
The proposed joint statistical model of visual memes and words outperforms an alternative concurrence model, with an average error of 2% for predicting meme volume and 17% for predicting meme lifespan.  ...  In these two events, a high percentage of videos contain remixed content, and it is apparent that traditional news media and citizen journalists have different roles in disseminating remixed content.  ...  The resulting data set contains near-duplicate keyframe pairs and non-duplicate keyframe pairs.  ... 
doi:10.1109/tmm.2013.2264929 fatcat:2sgemwmdrjhpxfsaqh2zjl3ehm

Near-duplicate video retrieval

Jiajun Liu, Zi Huang, Hongyun Cai, Heng Tao Shen, Chong Wah Ngo, Wei Wang
2013 ACM Computing Surveys  
As discovered in recent works, latest improvements and progress in near-duplicate video retrieval, as well as related topics including low-level feature extraction, signature generation, and high-dimensional  ...  As we survey the works in near-duplicate video retrieval, we comparatively investigate existing variants of the definition of near-duplicate video, describe a generic framework, summarize state-of-the-art  ...  Such a signature can be seen as a high-dimensional point in the [0, 255]^20 space.  ... 
doi:10.1145/2501654.2501658 fatcat:7x5th32fijhhbne5jb4oyovjgy

Effective and Efficient Content Redundancy Detection of Web Videos

Yixin Chen, Dongsheng Li, Yu Hua, Wenbo He
2019 IEEE Transactions on Big Data  
In this paper, we propose a novel near-duplicate video detection system, CompoundEyes, whose design philosophy deviates from the conventional feature-centered paradigm.  ...  Within these videos, a considerable portion is duplicate or near-duplicate.  ...  In detection, the goal is to determine whether a pair of videos are similar; in retrieval, the aim is to locate the videos that are near-duplicate to the query video and position them correctly.  ... 
doi:10.1109/tbdata.2019.2913674 fatcat:eydjky7o5revfohhjqze2rtg4m

MULTI-MODAL RETRIEVAL IN NEWS FEED APP USING GCDL TECHNIQUE

2017 International Journal of Recent Trends in Engineering and Research  
Various applications such as information retrieval, near duplicate detection, and data mining are performed by hashing based methods.  ...  Nearest neighbor search methods based on hashing have attracted considerable attention for effective and efficient large-scale similarity search in computer vision and information retrieval community.  ...  To reduce the training time complexity, Near-Duplicate Detection The task of detecting near duplicate textual information has received considerable attention in recent years.  ... 
doi:10.23883/ijrter.2017.3365.aeikk fatcat:6dmfmfsmtbaejale6t63ts7may

Totally Looks Like - How Humans Compare, Compared to Machines [article]

Amir Rosenfeld, Markus D. Solbach, John K. Tsotsos
2018 arXiv   pre-print
Perceptual judgment of image similarity by humans relies on rich internal representations ranging from low-level features to high-level concepts, scene properties and even cultural associations.  ...  We introduce a new dataset dubbed Totally-Looks-Like (TLL) after a popular entertainment website, which contains images paired by humans as being visually similar.  ...  Others modify learned representations to better match this similarity, reporting a high-level of success in some cases [16], and near-perfect in others [2].  ... 
arXiv:1803.01485v3 fatcat:d4p65rhrlfhwfk6hsi5epekz4e