Seeing through bag-of-visual-word glasses: towards understanding quantization effects in feature extraction methods
[article]
2014
arXiv
pre-print
Vector-quantized local features frequently used in bag-of-visual-words approaches are the backbone of popular visual recognition systems due to both their simplicity and their performance. ...
The question remains: how much visual information is "lost in quantization" when mapping visual features to code words? ...
Unbagging bag-of-visual words: visualizing quantization effects. Our technique is simple and in line with current trends for image reconstruction from local features [9, 16, 8]. ...
arXiv:1408.4692v1
fatcat:kujhplne6zhodgtf4q7pg7naxu
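As a hedged illustration of the quantization step this entry studies (mapping local descriptors to code words and counting them), here is a minimal bag-of-visual-words sketch; the codebook size, the 128-D stand-in "descriptors", and the helper names are illustrative, not taken from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_codebook(descriptors, k=64, seed=0):
    """Cluster local descriptors (n x d) into k visual words."""
    return KMeans(n_clusters=k, random_state=seed, n_init=10).fit(descriptors)

def bovw_histogram(codebook, image_descriptors):
    """Quantize each descriptor to its nearest code word and count occurrences."""
    words = codebook.predict(image_descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)   # L1-normalised bag of visual words

# Toy usage with random vectors standing in for SIFT-like descriptors.
rng = np.random.default_rng(0)
codebook = build_codebook(rng.normal(size=(5000, 128)), k=64)
print(bovw_histogram(codebook, rng.normal(size=(300, 128))).shape)  # (64,)
```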
Clustering of multiple-event online sound collections with the codebook approach
2019
Zenodo
These "acoustic words" can be understood and processed analogous to natural language words, thus giving access to the use of varied Natural Language Processing techniques, such as Bag-of-Words, TF-IDF ...
This multiple-event audio clips might be misrepresented by statistical aggregation methods such as computing the mean over the features; in this regard, techniques that retain the elemental blocks of a ...
Image from https://gurus.pyimagesearch.com/the-bag-of-visual-words-model
Bag-of-acoustic-words Following the Bag-of-Feature model, vector quantization of the acoustic space can be performed, and a codebook ...
doi:10.5281/zenodo.3475480
fatcat:zbpmafthb5ef7i2r43pehpc4lm
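To make the "acoustic words" analogy above concrete, the sketch below applies a TF-IDF weighting to per-clip counts of codebook indices, exactly as one would for text tokens; the toy count matrix is invented for illustration and is not the paper's data.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfTransformer

# Toy term-count matrix: rows = audio clips, columns = acoustic words
# (codebook indices), values = how often each word occurs in the clip.
counts = np.array([
    [4, 0, 1, 0],
    [0, 3, 0, 2],
    [1, 1, 1, 1],
])

tfidf = TfidfTransformer().fit_transform(counts)  # sparse matrix, rows L2-normalised
print(tfidf.toarray().round(3))
```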
Exploring features in a Bayesian framework for material recognition
2010
2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Unlike other visual recognition tasks in computer vision, it is difficult to find good, reliable features that can tell material categories apart. ...
We are interested in identifying the material category, e.g. glass, metal, fabric, plastic or wood, from a single image of a surface. ...
After quantizing these features into dictionaries, we convert an image into a bag of words and use latent Dirichlet allocation (LDA) [3] to model the distribution of the words. ...
doi:10.1109/cvpr.2010.5540207
dblp:conf/cvpr/LiuSAR10
fatcat:dmhi55krjzh6ddt3zzo3x4bfnq
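The snippet above describes quantizing features into dictionaries and modelling the resulting word distribution with LDA; below is a minimal, generic sketch of that modelling step using scikit-learn's LatentDirichletAllocation on synthetic visual-word counts. The topic count and the data are assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)
# Toy bag-of-visual-words counts: 20 images x 50 visual words.
word_counts = rng.poisson(lam=2.0, size=(20, 50))

lda = LatentDirichletAllocation(n_components=5, random_state=0)
theta = lda.fit_transform(word_counts)   # per-image topic proportions, shape (20, 5)
print(theta.sum(axis=1))                 # each row sums to ~1
```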
Recognising complex activities with histograms of relative tracklets
2017
Computer Vision and Image Understanding
One approach to the recognition of complex human activities is to use feature descriptors that encode visual interactions by describing properties of local visual features with respect to trajectories ...
Our comparative evaluation of features from accelerometers and video highlighted a performance gap between visual and accelerometer-based motion features and showed a substantial performance gain when ...
Acknowledgements The authors would like to thank Jianguo Zhang and Ruixuan Wang for valuable feedback on drafts of this paper. This research was funded by RCUK grants EP/G066019/1 and EP/K037293/1. ...
doi:10.1016/j.cviu.2016.08.012
fatcat:bd43ecgf2jfxhlalsixz74seum
Mobile Visual Location Recognition
2011
IEEE Signal Processing Magazine
As the spatial layout of features within the query and database images is ignored in the matching process, this approach is called Bag-of-Visual-Words or Bag-of-Features (BoF). ...
Retrieval based Location Recognition: 2.2.1 Feature Extraction; 2.2.2 Bag-of-Features based Image Retrieval; 2.2.3 Visual Word Quantization ...
doi:10.1109/msp.2011.940882
fatcat:7vzjij4qvjb5hcc2znqp7ginuy
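As a rough sketch of the BoF matching described above, the following ranks database images against a query purely by the similarity of their visual-word histograms, with spatial layout ignored; the histogram dimension and the synthetic data are assumptions.

```python
import numpy as np

def rank_by_bof(query_hist, db_hists):
    """Rank database images by cosine similarity of their BoF histograms."""
    q = query_hist / (np.linalg.norm(query_hist) + 1e-12)
    db = db_hists / (np.linalg.norm(db_hists, axis=1, keepdims=True) + 1e-12)
    scores = db @ q                      # cosine similarity per database image
    return np.argsort(-scores), scores

rng = np.random.default_rng(1)
db_hists = rng.random((100, 64))         # 100 database images, 64 visual words
query = rng.random(64)
order, scores = rank_by_bof(query, db_hists)
print(order[:5], scores[order[:5]].round(3))
```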
Image search—from thousands to billions in 20 years
2013
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)
Starting with a retrospective review of three stages of image search in the history, the article highlights major breakthroughs around the year 2000 in image search features, indexing methods, and commercial ...
Based on the review, the concluding section discusses open research challenges and suggests future research directions in effective visual representation, image knowledge base construction, implicit user ...
ACKNOWLEDGMENTS The authors gratefully acknowledge Wei-Ying Ma for his visionary long-term support and encouragement, and Xin-Jing Wang, Changhu Wang, Xirong Li, and Zhiwei Li for their years of collaboration ...
doi:10.1145/2490823
fatcat:cor23f3c7nb7fimy4ixp32bdk4
Visually Grounded Models of Spoken Language: A Survey of Datasets, Architectures and Evaluation Techniques
2022
The Journal of Artificial Intelligence Research
This survey provides an overview of the evolution of visually grounded models of spoken language over the last 20 years. ...
The current paper brings together these contributions in order to provide a useful introduction and overview for practitioners in all these areas. ...
via vector quantization in Harwath, Hsu, and Glass (2020). ...
doi:10.1613/jair.1.12967
fatcat:zib2mr5wkjdmteyrgac6gxekli
Visually grounded models of spoken language: A survey of datasets, architectures and evaluation techniques
[article]
2021
arXiv
pre-print
This survey provides an overview of the evolution of visually grounded models of spoken language over the last 20 years. ...
The current paper brings together these contributions in order to provide a useful introduction and overview for practitioners in all these areas. ...
In the following section we will see the effects of these developments on the research into visually grounded models of spoken language. ...
arXiv:2104.13225v3
fatcat:edodewkhljbqtpcrm2knd2zw7i
Visual Object Recognition
2011
Synthesis Lectures on Artificial Intelligence and Machine Learning
... visual vocabularies and bags-of-words; methods to verify geometric consistency according to parameterized geometric transformations; dealing with outliers in correspondences, RANSAC and the Generalized ...
The target audience consists of researchers or students working in AI, robotics, or vision who would like to understand what methods and representations are available for these problems. ...
Bastian Leibe's work on the project was supported in part by the UMIC cluster of excellence (DFG EXC 89). ...
doi:10.2200/s00332ed1v01y201103aim011
fatcat:fhz7aokkfjav7fuauuorfstq4y
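One of the topics listed above, RANSAC-based geometric verification of correspondences, can be sketched generically as follows: fit a parameterized transform (here a 2D affine model) to random minimal samples and keep the hypothesis with the most inliers. The threshold, iteration count, and the affine choice are illustrative, not taken from the lecture notes.

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares 2D affine transform mapping src -> dst (both n x 2)."""
    A = np.hstack([src, np.ones((len(src), 1))])       # n x 3 design matrix
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)        # 3 x 2 parameter matrix
    return M

def ransac_affine(src, dst, iters=200, tol=3.0, seed=0):
    """Fit an affine transform to correspondences that contain outliers."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(src), size=3, replace=False)      # minimal sample
        M = estimate_affine(src[idx], dst[idx])
        pred = np.hstack([src, np.ones((len(src), 1))]) @ M
        inliers = np.linalg.norm(pred - dst, axis=1) < tol     # reprojection test
        if inliers.sum() > best.sum():
            best = inliers
    return estimate_affine(src[best], dst[best]), best

# Synthetic correspondences: a known affine map plus some gross outliers.
rng = np.random.default_rng(1)
src = rng.random((50, 2)) * 100
dst = src @ np.array([[1.0, 0.1], [-0.1, 1.0]]) + np.array([5.0, -3.0])
dst[:10] += rng.random((10, 2)) * 200                          # corrupt some matches
M, inliers = ransac_affine(src, dst)
print(inliers.sum(), "inliers of", len(src))
```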
Sound representation and classification benchmark for domestic robots
2014
2014 IEEE International Conference on Robotics and Automation (ICRA)
We address the problem of sound representation and classification and present results of a comparative study in the context of a domestic robotic scenario. ...
A dataset of sounds was recorded in realistic conditions (background noise, presence of several sound sources, reverberations, etc.) using the humanoid robot NAO. ...
of the mean and standard deviation can also be used. • The bag-of-words (BoW) approach. ...
doi:10.1109/icra.2014.6907786
dblp:conf/icra/JanvierAGH14
fatcat:4kvhhwh65relnnnxiu57vyr2y4
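The truncated bullet above contrasts simple statistics (mean and standard deviation over frame-level features) with a bag-of-words representation; the sketch below computes both for a synthetic block of MFCC-like frames. The codebook size and feature dimension are chosen arbitrarily, not taken from the benchmark.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
frames = rng.normal(size=(400, 13))        # stand-in for per-frame MFCC vectors

# (a) Statistical pooling: concatenate per-dimension mean and std -> 26-D vector.
pooled = np.concatenate([frames.mean(axis=0), frames.std(axis=0)])

# (b) Bag-of-words: quantize frames against a codebook and count occurrences.
codebook = KMeans(n_clusters=32, random_state=0, n_init=10).fit(rng.normal(size=(2000, 13)))
words = codebook.predict(frames)
bow = np.bincount(words, minlength=32) / len(frames)

print(pooled.shape, bow.shape)             # (26,) (32,)
```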
Sound Representation and Classification Benchmark for Domestic Robots
[article]
2014
arXiv
pre-print
We address the problem of sound representation and classification and present results of a comparative study in the context of a domestic robotic scenario. ...
A dataset of sounds was recorded in realistic conditions (background noise, presence of several sound sources, reverberations, etc.) using the humanoid robot NAO. ...
of the mean and standard deviation can also be used. • The bag-of-words (BoW) approach. ...
arXiv:1402.3689v1
fatcat:ljzuajgbpncblev4eoqqijc25a
Instance search retrospective with focus on TRECVID
2017
International Journal of Multimedia Information Retrieval
The Instance Search (INS) benchmark worked with a variety of large collections of data including Sound & Vision, Flickr, BBC (British Broadcasting Corporation) Rushes for the first 3 pilot years and with ...
The main contributions of the paper include i) an examination of the evolving design of the evaluation framework and its components (system tasks, data, measures); ii) an analysis of the influence of topic ...
As described, the representation based on bag of visual words with very fine quantization is known to be effective for instance search. ...
doi:10.1007/s13735-017-0121-3
pmid:28758054
pmcid:PMC5531298
fatcat:3khp2cscmbhohipfx246gspqlq
A Comprehensive Survey on Computational Aesthetic Evaluation of Visual Art Images: Metrics and Challenges
2021
IEEE Access
Quantizing local descriptors As is often the case with bags-of-features for generic object recognition and image retrieval, we also quantize local descriptors by using visual words in codebooks. ...
bag-of-visual-words classification framework, which extracted LAB-based color visual words and SIFT-based texture visual words [40], the influence of color combination on emotion [72], visual and ...
doi:10.1109/access.2021.3083075
fatcat:zukn4uhlinejjdubdezsdghh5i
3D HMM-based Facial Expression Recognition using Histogram of Oriented Optical Flow
2015
Transactions on Machine Learning and Artificial Intelligence
Histogram of Optical Flow is used as the descriptor for extracting and describing the key features, while training and testing are performed on 3D Hidden Markov Models. ...
This research is focused on HCI in the recognition of human facial expression and emotion analysis. ...
ACKNOWLEDGEMENT We thank Jasser Jasser of ECE Department of Oakland University for assistance with development and running of the experiment and sharing of his pearls of wisdom with us during the course ...
doi:10.14738/tmlai.36.1661
fatcat:2xe2jcwa2ffy7buu5ywh6jvq6i
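The descriptor named above, a histogram of (oriented) optical flow, can be sketched generically as follows: compute dense flow between consecutive grayscale frames and accumulate a magnitude-weighted orientation histogram. The Farneback parameters and bin count are assumptions; the paper's exact formulation may differ, and the HMM stage is omitted.

```python
import numpy as np
import cv2  # OpenCV

def hoof_descriptor(prev_gray, next_gray, bins=8):
    """Histogram of Oriented Optical Flow between two grayscale frames."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag = np.linalg.norm(flow, axis=2)                    # flow magnitude per pixel
    ang = np.arctan2(flow[..., 1], flow[..., 0])          # flow direction in [-pi, pi]
    hist, _ = np.histogram(ang, bins=bins, range=(-np.pi, np.pi), weights=mag)
    return hist / (hist.sum() + 1e-12)                    # magnitude-weighted, normalised

# Toy frames: a bright square shifted a few pixels between frames.
f1 = np.zeros((64, 64), np.uint8); f1[20:40, 20:40] = 255
f2 = np.zeros((64, 64), np.uint8); f2[20:40, 24:44] = 255
print(hoof_descriptor(f1, f2).round(3))
```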
Visually Fingerprinting Humans without Face Recognition
2015
Proceedings of the 13th Annual International Conference on Mobile Systems, Applications, and Services - MobiSys '15
One application of visual fingerprints relates to augmented reality, in which an individual looks at other people through her camera-enabled glass (e.g., Google Glass) and views information about them. ...
If Alice recognizes Bob through motion fingerprints, she can extract Bob's clothing features and update a database inside the InSight server. ...
If successful, such a fingerprint could be effectively used towards human recognition or content announcement in the visual vicinity, and more broadly towards enabling human-centric augmented reality. ...
doi:10.1145/2742647.2742671
dblp:conf/mobisys/WangBCN15
fatcat:7kr2xjtvbfgcnk4zkw7c7k67vy
Showing results 1 — 15 out of 374 results