7,405 Hits in 5.1 sec

Multi-scale image semantic recognition with hierarchical visual vocabulary

Xinghao Jiang, Tanfeng Sun, Fu Guanglei
2011 Computer Science and Information Systems  
The vocabulary is used to recognize the image semantic. In this paper, a new scheme to construct semantic-binding hierarchical visual vocabulary is proposed.  ...  Some attributes and relationship of the semantic nodes in the model are discussed. The hierarchical semantic model is used to organize the multi-scale semantic into a level-by-level structure.  ...  Our model can help to understand and analyze image semantic in a flexible multi-resolution way, and get the semantic recognition on each semantic node.  ... 
doi:10.2298/csis100423035j fatcat:3xqwn5qrobbp7hptbc5mrehm2m

Semantic hierarchies for image annotation: A survey

Anne-Marie Tousch, Stéphane Herbin, Jean-Yves Audibert
2012 Pattern Recognition  
In this survey, we argue that using structured vocabularies is capital to the success of image annotation.  ...  We analyze literature on image annotation uses and user needs, and we stress the need for automatic annotation.  ...  Visual recognition in the human brain consists in linking the image printed on the retina with a representation stored in memory [92].  ... 
doi:10.1016/j.patcog.2011.05.017 fatcat:hoyo5p2oyfff3lvu6yi24vbcwi

Visual Vocabulary Learning and Its Application to 3D and Mobile Visual Search [article]

Liujuan Cao
2012 arXiv   pre-print
In this technical report, we review related works and recent trends in visual vocabulary based web image search, object recognition, mobile visual search, and 3D object retrieval.  ...  Especial focuses would be also given for the recent trends in supervised/unsupervised vocabulary optimization, compact descriptor for visual search, as well as in multi-view based 3D object representation  ...  City-Scale Landmark Search Towards city-scale landmark search and recognition, Schindler et al.  ... 
arXiv:1207.7244v1 fatcat:b6y7yvvcu5davkwh3zuv762qq4

Semantics Extraction from Images [chapter]

Ioannis Pratikakis, Anastasia Bolovinou, Bassilios Gatos, Stavros Perantonis
2011 Lecture Notes in Computer Science  
An overview of the state-of-the-art on semantics extraction from images is presented.  ...  Knowledge can be represented in either implicit or explicit fashion while the image is represented in different levels, namely, low-level, intermediate and semantic level.  ...  A multi-scale representation is achieved by performing image segmentation at three image scales (as in [66]), assigning three different regions to each pixel.  ... 
doi:10.1007/978-3-642-20795-2_3 fatcat:h3xh2fmryfgj5hwn5k5fjsximy

GroupViT: Semantic Segmentation Emerges from Text Supervision [article]

Jiarui Xu, Shalini De Mello, Sifei Liu, Wonmin Byeon, Thomas Breuel, Jan Kautz, Xiaolong Wang
2022 arXiv   pre-print
We train GroupViT jointly with a text encoder on a large-scale image-text dataset via contrastive losses.  ...  Grouping and recognition are important components of visual scene understanding, e.g., for object detection and semantic segmentation.  ...  By training on large-scale paired image-text data with contrastive losses, we enable the model to be zero-shot transferred to several semantic segmentation vocabularies, without requiring any further annotation  ... 
arXiv:2202.11094v3 fatcat:5agovy2yifecphcmoolwmhect4

Building an Enhanced Vocabulary of the Robot Environment with a Ceiling Pointing Camera

Alejandro Rituerto, Henrik Andreasson, Ana Murillo, Achim Lilienthal, José Guerrero
2016 Sensors  
To solve this challenging task, this paper studies how to leverage the standard vocabulary construction process to obtain a more meaningful visual vocabulary of the robot work environment using image sequences  ...  We show different robotic tasks that could benefit of the use of our visual vocabulary approach, such as place recognition or object discovery.  ...  This framework can be used to link the elements in a visual vocabulary with the elements of a semantic vocabulary.  ... 
doi:10.3390/s16040493 pmid:27070607 pmcid:PMC4851007 fatcat:svbsehelw5fdthws5uwgdedxse

Semantic-Aware Co-Indexing for Image Retrieval

Shiliang Zhang, Ming Yang, Xiaoyu Wang, Yuanqing Lin, Qi Tian
2015 IEEE Transactions on Pattern Analysis and Machine Intelligence  
are robust to delineate low-level image contents, and 2) semantic attributes from large-scale object recognition that may reveal image semantic meanings.  ...  In this paper, for vocabulary tree based image retrieval, we propose a semantic-aware co-indexing algorithm to jointly embed two strong cues into the inverted indexes: 1) local invariant features that  ...  K SIFT descriptors [13] of dimension D = 128, so does {x_j}, j ∈ S_d, for d. A visual vocabulary tree T is obtained by hierarchical  ... 
doi:10.1109/tpami.2015.2417573 pmid:26539859 fatcat:2adewvpkijeq3f6dhbjvbic54u

Semantic-Aware Co-indexing for Image Retrieval

Shiliang Zhang, Ming Yang, Xiaoyu Wang, Yuanqing Lin, Qi Tian
2013 2013 IEEE International Conference on Computer Vision  
are robust to delineate low-level image contents, and 2) semantic attributes from large-scale object recognition that may reveal image semantic meanings.  ...  In this paper, for vocabulary tree based image retrieval, we propose a semantic-aware co-indexing algorithm to jointly embed two strong cues into the inverted indexes: 1) local invariant features that  ...  K SIFT descriptors [13] of dimension D = 128, so does {x_j}, j ∈ S_d, for d. A visual vocabulary tree T is obtained by hierarchical  ... 
doi:10.1109/iccv.2013.210 dblp:conf/iccv/ZhangYWLT13 fatcat:7b7mzbj4fnbtffi3ierfnkystu

Spatial pyramid local keypoints quantization for bag of visual patches image representation

Yousef Alqasrawi, Daniel Neagu, Peter Cowling
2010 2010 10th International Conference on Intelligent Systems Design and Applications  
We show, with experiments on multi-class classification task using 700 natural scene images, that the spatial pyramid vocabulary model is suitable and discriminative for bag-of-visual patches semantic  ...  Bag of visual patches (BOP) image representation has been the main research topic in computer vision literature for scene and object recognition tasks.  ...  We only experiment with visual vocabulary of size 200 in order to compare with Lazebnik et al. approach [27].  ... 
doi:10.1109/isda.2010.5687083 dblp:conf/isda/AlqasrawiNC10 fatcat:o376oiiqwvelppxmbtnayos2p4

Weakly supervised codebook learning by iterative label propagation with graph quantization

Liujuan Cao, Rongrong Ji, Wei Liu, Hongxun Yao, Qi Tian
2013 Signal Processing  
Visual codebook serves as a fundamental component in many state-of-the-art visual search and object recognition systems.  ...  groups similar patches with related labels (modeled by WordNet [18]), which minimizes the visual distortions in quantization.  ...  errors [24], because it quantizes separately at each scale, with multi-to-multi linking graph to connect hierarchical codewords. (2) Different from all previous works [9] [10] [11] [12] [13] 16, 17  ... 
doi:10.1016/j.sigpro.2012.05.001 fatcat:d47fymh4ircdbeiyl2gpqu4bx4

Learning semantic features for action recognition via diffusion maps

Jingen Liu, Yang Yang, Imran Saleemi, Mubarak Shah
2012 Computer Vision and Image Understanding  
As opposed to flat vocabularies used in traditional methods, we propose to exploit the hierarchical nature of feature vocabularies representative of human actions.  ...  We present a principled approach to learning a semantic vocabulary from a large amount of video words using diffusion maps embedding.  ...  [37] proposed unifying the vocabulary construction with classifier training, and then encoding an image by a sequence of visual bits that constitute the semantic vocabulary.  ... 
doi:10.1016/j.cviu.2011.08.010 fatcat:u3glpakfsrazdlzuygpgst2i2a

Semantic Hierarchies for Visual Object Recognition

Marcin Marszalek, Cordelia Schmid
2007 2007 IEEE Conference on Computer Vision and Pattern Recognition  
We use the semantics of image labels to integrate prior knowledge about inter-class relationships into the visual appearance learning.  ...  We also demonstrate additional features that become available to object recognition due to the extension with semantic inference tools-we can classify high-level categories, such as animals, and we can  ...  Given a vocabulary, we can represent each image in the dataset as a histogram of visual words [16].  ... 
doi:10.1109/cvpr.2007.383272 dblp:conf/cvpr/MarszalekS07a fatcat:d4rvtb5zlzac3k4g5j4aei5v5i

Open Vocabulary Scene Parsing

Hang Zhao, Xavier Puig, Bolei Zhou, Sanja Fidler, Antonio Torralba
2017 2017 IEEE International Conference on Computer Vision (ICCV)  
Our approach is a joint image pixel and word concept embeddings framework, where word concepts are connected by semantic relations.  ...  In this paper, we propose a new task that aims at parsing scenes with a large and open vocabulary, and several evaluation metrics are explored for this problem.  ...  Recent efforts in image classification/detection/segmentation have shown this trend: emerging image datasets enable recognition on a large scale [6, 30, 32], while image captioning can be seen as a special  ... 
doi:10.1109/iccv.2017.221 dblp:conf/iccv/ZhaoPZF017 fatcat:flambms24bhlrjpzms2hao4g2e

Hierarchical Visual Place Recognition Based on Semantic-Aggregation

Baifan Chen, Xiaoting Song, Hongyu Shen, Tao Lu
2021 Applied Sciences  
Inspired by this, we propose a hierarchical visual place recognition pipeline based on semantic-aggregation and scene understanding for the images.  ...  Semantic-aggregation happens in residual aggregation of visual information and semantic information in coarse matching, and semantic association of semantic edges in fine matching.  ...  [11] proposed a visual vocabulary-based loop-closure method, where the visual vocabularies could be built online, enabling the bag-of-words model to adapt to the dynamically changing environments.  ... 
doi:10.3390/app11209540 fatcat:5srkg3yusrfazkmorijgszgf2q

Massive-scale multimedia semantic modeling

John R. Smith, Liangliang Cao
2013 Proceedings of the 21st ACM international conference on Multimedia - MM '13  
To address these challenges, this tutorial will provide a unified overview of the two emerging techniques: Semantic modeling and Massive scale visual recognition, with a goal of both introducing people  ...  Visual data is exploding! 500 billion consumer photos are taken each year world-wide, 633 million photos taken per year in NYC alone. 120 new video-hours are uploaded on YouTube per minute.  ...  Scale Visual Recognition Challenges, ImageCLEF recognition and TRECVID challenges.  ... 
doi:10.1145/2502081.2502235 dblp:conf/mm/SmithC13a fatcat:gvhxnztvnvakpl4pmizqwox7di
Showing results 1 to 15 of 7,405.