Global Context Extraction for Object Recognition Using a Combination of Range and Visual Features [chapter]

Michael Kemmler, Erik Rodner, Joachim Denzler
2009 Lecture Notes in Computer Science  
We present an approach to context extraction in the form of global features for place recognition.  ...  It has been highlighted by many researchers, that the use of context information as an additional cue for high-level object recognition is important to close the gap between human and computer vision.  ...  Acknowledgements We would like to thank all four anonymous reviewers for their valuable comments, which really helped to improve the quality of the paper.  ... 
doi:10.1007/978-3-642-03778-8_8 fatcat:u3udmp4ewva2dbza6akh2q272y

An Application-Dependent Framework for the Recognition of High-Level Surgical Tasks in the OR [chapter]

Florent Lalys, Laurent Riffaud, David Bouget, Pierre Jannin
2011 Lecture Notes in Computer Science  
, for texture-oriented visual cues we used a bag-of-word approach with SIFT descriptors, and for all other visual cues we used a classical image classification approach including a feature extraction,  ...  Each of these classifiers was related to one kind of visual cue: visual cues recognizable through color were detected with a color histogram approach, for shape-oriented visual cues we trained a Haar classifier  ...  The authors would like to acknowledge the financial support of Carl Zeiss Meditec.  ... 
doi:10.1007/978-3-642-23623-5_42 fatcat:pbuhbci6kja43cg6idw6e2v7e4

An effective 3D target recognition model imitating robust methods of the human visual system

Sungho Kim, Gijeong Jang, In So Kweon
2005 Pattern Analysis and Applications  
This paper presents a model of 3D object recognition motivated from the robust properties of human vision system (HVS).  ...  The robust properties of the HVS are visual attention, contrast mechanism, feature binding, multi-resolution, size tuning, and part-based representation.  ...  Acknowledgements This research has been supported by the Korean Ministry of Science and Technology for National Research Laboratory Program (Grant number M1-0302-00-0064), Korea.  ... 
doi:10.1007/s10044-005-0001-y fatcat:zt6q54j665d47pc4lpl5kwmyay

An Effective 3D Target Recognition Imitating Robust Methods of the Human Visual System [chapter]

Sungho Kim, In So Kweon
2007 Vision Systems: Applications  
Spatial attention is used to combine low-level feature maps for both bottom-up (in a local structure feature extraction block) and top-down (in shape matching block) processes.  ...  Robust visual feature extraction (1) Hierarchical visual attention (Treisman, 1998): The HVS utilizes three kinds of hierarchical attention: spatial, feature and object.  ...  This chapter is distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike-3.0 License, which permits use, distribution and reproduction for non-commercial purposes, provided  ... 
doi:10.5772/4987 fatcat:z4zhdwapfnbgpcu6ga7moup36q

Fusion of Global Shape and Local Features Using Boosting for Object Class Recognition

Noridayu Manshor, Amir Rizaan Abdul Rahiman, Mandava Rajeswari, Dhanesh Ramachandram
2012 International Journal of Computer and Communication Engineering  
In object class recognition, the state-of-the-art works shows using combination varies local features may produce a good performance in recognition.  ...  For learning technique, boosting is used in improving the recognition objects. This approach identifies the correct and misclassified dataset iteratively.  ...  CONCLUSION The work in this paper has presented the effects of combination of global shape and local features for object class recognition.  ... 
doi:10.7763/ijcce.2012.v1.109 fatcat:oqmmxrmxqfaidfileamiemt3ne

Fast Human Activity Recognition in Lifelogging [chapter]

Stefan Terziyski, Rami Albatal, Cathal Gurrin
2015 Lecture Notes in Computer Science  
We identify the importance of visual features related to HAR and we specifically evaluate the HAR discrimination potential of Colour Histograms and Histogram of Oriented Gradients.  ...  This paper addresses the problem of fast Human Activity Recognition (HAR) in visual lifelogging.  ...  Third-party libraries were used for feature extraction, specifically [10] for HOG extraction, [11] for SVM classification and [9], [12] for SIFT comparison.  ... 
doi:10.1007/978-3-319-14442-9_43 fatcat:faze2rg7eraytdjmklmckkds2i

A Framework for the Recognition of High-Level Surgical Tasks From Video Images for Cataract Surgeries

F. Lalys, L. Riffaud, D. Bouget, P. Jannin
2012 IEEE Transactions on Biomedical Engineering  
In this paper, we propose a framework to assist in the development of systems for the automatic recognition of high level surgical tasks using microscope videos analysis.  ...  Dynamic Time Warping (DTW) and Hidden Markov Models (HMM) were tested. This association combined the advantages of all methods for better understanding of the problem.  ...  This approach, combining global spatial features and SVM, may therefore be adapted to the recognition of any type of cue.  ... 
doi:10.1109/tbme.2011.2181168 pmid:22203700 pmcid:PMC3432023 fatcat:g2ttzqmioncptcntaj2vmfqi4m

A review and an approach for object detection in images

Kartik Umesh Sharma, Nileshsingh V. Thakur
2017 International Journal of Computational Vision and Robotics  
This paper presents a review of the various techniques that are used to detect an object, localise an object, categorise an object, extract features, appearance information, and many more, in images and  ...  In order to detect an object in an image or a video the system needs to have a few components in order to complete the task of detecting an object, they are a model database, a feature detector, a hypothesiser  ...  Self-created dataset Combining local and global image features Accuracy and speed OD and scene recognition Authors present a conditional random field for jointly solving the tasks of  ... 
doi:10.1504/ijcvr.2017.10001813 fatcat:milusfamuzgv5js34ufqu2ujte

A review and an approach for object detection in images

Kartik Umesh Sharma, Nileshsingh V. Thakur
2017 International Journal of Computational Vision and Robotics  
This paper presents a review of the various techniques that are used to detect an object, localise an object, categorise an object, extract features, appearance information, and many more, in images and  ...  In order to detect an object in an image or a video the system needs to have a few components in order to complete the task of detecting an object, they are a model database, a feature detector, a hypothesiser  ...  Self-created dataset Combining local and global image features Accuracy and speed OD and scene recognition Authors present a conditional random field for jointly solving the tasks of  ... 
doi:10.1504/ijcvr.2017.081234 fatcat:n7wcnpafe5bw7klocovns33mye

Domain-Specific Priors and Meta Learning for Few-Shot First-Person Action Recognition [article]

Huseyin Coskun, Zeeshan Zia, Bugra Tekin, Federica Bogo, Nassir Navab, Federico Tombari, Harpreet Sawhney
2021 arXiv   pre-print
Visual cues we employ include object-object interactions, hand grasps and motion within regions that are a function of hand locations.  ...  We employ a framework based on meta-learning to extract the distinctive and domain invariant components of the deployed visual cues.  ...  ACKNOWLEDGMENTS The authors would like to thank David Joseph Tan for the valuable discussions and constructive feedback. This work was supported by Microsoft.  ... 
arXiv:1907.09382v2 fatcat:aj7rdwx5ongd7dd2tsk6ybyeuu

Author Index

2010 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition  
Motion Volumes by Tracking Building and Using a Semantivisual Image Hierarchy Grouplet: A Structured Image Representation for Recognizing Human and Object Interactions Modeling Mutual Context of  ...  extraction of DAISY keypoints Girod, Bernd Unified Real-Time Tracking and Recognition with Rotation-Invariant Fast Features Demo: Unified Tracking and Recognition with Rotation-Invariant Fast Features  ... 
doi:10.1109/cvpr.2010.5539913 fatcat:y6m5knstrzfyfin6jzusc42p54

Visual Concept Reasoning Networks [article]

Taesup Kim, Sungwoong Kim, Yoshua Bengio
2020 arXiv   pre-print
A split-transform-merge strategy has been broadly used as an architectural constraint in convolutional neural networks for visual recognition tasks.  ...  Extensive experiments on visual recognition tasks such as image classification, semantic segmentation, object detection, scene recognition, and action recognition show that our proposed model, VCRNet,  ...  Pooling-based sampler Global average pooling is one of the simplest ways to extract the global context from a feature map without explicitly capturing long-range dependencies.  ... 
arXiv:2008.11783v1 fatcat:ibqnbfkelbabngogis2jc543ie
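
The global average pooling mentioned in the snippet above can be illustrated with a minimal sketch. It is not taken from the VCRNet paper: the function name `global_average_pool`, the nested-list layout (channels, then rows, then columns), and the toy values are all assumptions made for illustration.

```python
def global_average_pool(feature_map):
    """Collapse a C x H x W feature map into a C-dimensional vector.

    Each channel is reduced to the mean of all its spatial positions,
    giving one "global context" value per channel. No long-range
    dependencies between positions are modelled.
    """
    pooled = []
    for channel in feature_map:
        # Flatten the H x W grid of this channel into one list of values.
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


# Hypothetical 2-channel, 2 x 2 feature map (illustrative values only).
fmap = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
]
context = global_average_pool(fmap)
print(context)
```

Every spatial position contributes equally to the pooled value, which is why the operation is cheap but discards where in the map a response occurred.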

Fine-Grained Grounding for Multimodal Speech Recognition [article]

Tejas Srinivasan, Ramon Sanabria, Florian Metze, Desmond Elliott
2020 arXiv   pre-print
While visual signals have been shown to be useful for recovering entities that have been masked in the audio, these models should be capable of recovering a broader range of word types.  ...  In this paper, we propose a model that uses finer-grained visual information from different parts of the image, using automatic object proposals.  ...  We thank Stella Frank for discussions about whether such a model could be expected to count objects in images.  ... 
arXiv:2010.02384v1 fatcat:kmnh64s4prfevbdgbxhsxqpg2y

Context-Aware Emotion Recognition Based on Visual Relationship Detection

Manh-Hung Hoang, Soo-Hyung Kim, Hyung-Jeong Yang, Guee-Sang Lee
2021 IEEE Access  
The global image is masked and extracted the general context features by the scene-level stream.  ...  One stream extracted the context features from the global image, and another stream focused on the body features of the primary agent.  ... 
doi:10.1109/access.2021.3091169 fatcat:loymc3cl6zclroqbyyppstemfi

Visual Concept Reasoning Networks

Taesup Kim, Sungwoong Kim, Yoshua Bengio
2021 Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence and the Twenty-Eighth Innovative Applications of Artificial Intelligence Conference  
A split-transform-merge strategy has been broadly used as an architectural constraint in convolutional neural networks for visual recognition tasks.  ...  Extensive experiments on visual recognition tasks such as image classification, semantic segmentation, object detection, scene recognition, and action recognition show that our proposed model, VCRNet,  ...  Global average pooling is one of the simplest ways to extract the global context from a feature map without explicitly capturing long-range dependencies.  ... 
doi:10.1609/aaai.v35i9.16995 fatcat:eg37niaw7rguzern5kwpdx2rni