Learning semantic relationships for better action retrieval in images

Vignesh Ramanathan, Congcong Li, Jia Deng, Wei Han, Zhen Li, Kunlong Gu, Yang Song, Samy Bengio, Chuck Rosenberg, Li Fei-Fei
2015 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
However, we could compensate for this sparsity in supervision by leveraging the rich semantic relationship between different actions.  ...  Hence, we propose a novel neural network framework which jointly extracts the relationship between actions and uses them for training better action retrieval models.  ...  Acknowledgements We thank Andrej Karpathy, Yuke Zhu and Ranjay Krishna for helpful comments and feedback. We also thank Alex Toshev for providing the parsed queries.  ... 
doi:10.1109/cvpr.2015.7298713 dblp:conf/cvpr/RamanathanLDHLG15 fatcat:ul3mqxgoybg37k5lb6c2ajfree

Video Content Analysis of Human Sports under Engineering Management Incorporating High-Level Semantic Recognition Models

Ruan Hui, Huihua Chen
2022 Computational Intelligence and Neuroscience  
integrating visual attention mechanisms into the fine-grained action feature extraction process to extract features for cues.  ...  that can reflect the image nearest neighbor relationship and association features.  ...  How to ensure the efficiency of image retrieval systems has been a key research direction in the fields of information retrieval, machine learning, and computer vision [12].  ... 
doi:10.1155/2022/6761857 pmid:35069724 pmcid:PMC8769854 fatcat:dosm2tf7ijccjkr4s6um24slh4

Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning [article]

Shizhe Chen, Yida Zhao, Qin Jin, Qi Wu
2020 arXiv   pre-print
To be specific, the model disentangles texts into hierarchical semantic graph including three levels of events, actions, entities and relationships across levels.  ...  The current dominant approach for this problem is to learn a joint embedding space to measure cross-modal similarities.  ...  Most of them [16, 23, 24, 35, 42] apply graph reasoning on image regions to learn relationships among them.  ... 
arXiv:2003.00392v1 fatcat:4yrqi2cluvhthbd5ipb4bw5zna

Beyond Visual Semantics: Exploring the Role of Scene Text in Image Understanding [article]

Arka Ujjal Dey, Suman Kumar Ghosh, Ernest Valveny, Gaurav Harit
2019 arXiv   pre-print
In this paper, we propose to jointly use scene text and visual channels for robust semantic interpretation of images.  ...  In the retrieval framework, we augment our learned text-visual semantic representation with scene text cues, to mitigate vocabulary misses that may have occurred during the semantic embedding.  ...  More specifically, in training we learn separate semantics based on statement action-reason partitioning.  ... 
arXiv:1905.10622v3 fatcat:72bpsqpxrra2hk2cddizleqi6y

Analysis and Modeling of 3D Indoor Scenes [article]

Rui Ma
2017 arXiv   pre-print
This report mainly focuses on the recent research progress in graphics on geometry, structure and semantic analysis of 3D indoor data and different modeling techniques for creating plausible and realistic  ...  We first review works on understanding and semantic modeling of scenes from captured 3D data of the real world.  ...  [11] exploit contextual relationships learned from a 3D scene database to assist the object segmentation and classification for automatic semantic modeling.  ... 
arXiv:1706.09577v1 fatcat:pdbztyjkezabnj3eaok5wrs7xq

Analytical Review on Textual Queries Semantic Search based Video Retrieval

Prof. Suvarna L. Kattimani, Miss. Saba Parveen Bougdadi
2014 IJARCCE  
And construct the semantic meaningful graph gives the semantic structure and matches the nouns, verb and adverb detected in the video frame and also detect the action and position of the object by using  ...  Semantic search based video retrieval is hard problem due to limited set of vocabulary.  ...  Involve machine learning algorithm for complex actions. In [20] D.  ... 
doi:10.17148/ijarcce.2017.64163 fatcat:d6hyjjzudbgg3g5dwnbjbildwq

Semantics-Based Art Image Retrieval Using Linguistic Variable

Qingyong Li, Siwei Luo, Zhongzhi Shi
2007 Fourth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2007)  
Though content-based image retrieval (CBIR) got great progress, current low-level visual information based retrieval technology does not allow users to retrieval images by high-level semantics.  ...  More and more digitized art images are accumulated and expanded in our daily life and techniques need to be established on how to organize and retrieval them.  ...  Details about machine learning in semantics-based image retrieval can be seen in [6].  ... 
doi:10.1109/fskd.2007.511 dblp:conf/fskd/LiLS07 fatcat:j2n5ay3lrzebjikwt3xyfvbugi

Learning Visual Actions Using Multiple Verb-Only Labels [article]

Michael Wray, Dima Damen
2019 arXiv   pre-print
This work introduces verb-only representations for both recognition and retrieval of visual actions, in video.  ...  We collect multi-verb annotations for three action video datasets and evaluate the verb-only labelling representations for action recognition and cross-modal retrieval (video-to-text and text-to-video)  ...  This suggests that φ SAMV is better able to learn the relationships between the different verbs inherent in describing actions.  ... 
arXiv:1907.11117v2 fatcat:hlubauoojfgonhhdgyhlmolhbu

ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through Scene Graph [article]

Fei Yu, Jiji Tang, Weichong Yin, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang
2021 arXiv   pre-print
Thus, ERNIE-ViL can learn the joint representations characterizing the alignments of the detailed semantics across vision and language.  ...  Utilizing scene graphs of visual scenes, ERNIE-ViL constructs Scene Graph Prediction tasks, i.e., Object Prediction, Attribute Prediction and Relationship Prediction tasks in the pre-training phase.  ...  An absolute improvement of 1.20% for objects, 3.08% for relationships and 1.84% for attributes on ACC@1 demonstrates that ERNIE-ViL pre-trained with SGP tasks learns better cross-modal detailed semantics  ... 
arXiv:2006.16934v3 fatcat:q7iucmyxfrf4bkiusdujbym6ye

Bidirectional-isomorphic manifold learning at image semantic understanding & representation

Xianming Liu, Hongxun Yao, Rongrong Ji, Pengfei Xu, Xiaoshuai Sun
2012 Multimedia tools and applications  
, in order to achieve more accurate comprehension for image semantics and relationships.  ...  Based Image Retrieval and SVM, while the second group carried on a web-downloaded Flickr dataset with over 6,000 images to testify the proposed method's effectiveness in real-world application.  ...  Thus, in case of large-scale web image retrieval task, this approach can potentially perform much better for image relationship analysis and semantic understanding.  ... 
doi:10.1007/s11042-011-0947-2 fatcat:sxvlbss3rbauxk3bcvden3np2m

Cross-Modal Object Detection Based on a Knowledge Update

Yueqing Gao, Huachun Zhou, Lulu Chen, Yuting Shen, Ce Guo, Xinyu Zhang
2022 Sensors  
In summary, the proposed algorithm not only learns the accurate relationship between objects in different regions of the image, but also benefits from the knowledge update through an external relational  ...  features (semantic relationships between words) corresponding to pictures.  ...  Therefore, our model better understands the semantic meaning of images and text, and better aligns image features with text features, making it easier to learn the information across different modalities  ... 
doi:10.3390/s22041338 pmid:35214240 pmcid:PMC8963053 fatcat:kdjk3oy2wvandaijab7z6xewjy

Efficient Visual Recognition

Li Liu, Matti Pietikäinen, Jie Qin, Wanli Ouyang, Luc Van Gool
2020 International Journal of Computer Vision  
search Image and video retrieval Weakly-supervised semantic guided hashing for social image retrieval Social image retrieval Anchor-based self-ensembling for semisupervised deep pairwise hashing  ...  consistent large margin proxy embeddings Image retrieval Unified binary generative adversarial network for image retrieval and compression Image retrieval, image compression Learning multifunctional  ... 
doi:10.1007/s11263-020-01351-w fatcat:mbcq6shmerbo5njayscgb3t4rq

Fuzzy aesthetic semantics description and extraction for art image retrieval

Qingyong Li, Siwei Luo, Zhongzhi Shi
2009 Computers and Mathematics with Applications  
Though content-based image retrieval (CBIR) made great progress, current low-level visual information based retrieval technology in CBIR does not allow users to search images by high-level semantics for  ...  art image retrieval.  ...  Details about machine learning in semantics-based image retrieval can be seen in [6] .  ... 
doi:10.1016/j.camwa.2008.10.058 fatcat:a7twgpnvenez5djo6v6ocj7s3u

Crisscrossed Captions: Extended Intramodal and Intermodal Semantic Similarity Judgments for MS-COCO [article]

Zarana Parekh, Jason Baldridge, Daniel Cer, Austin Waters, Yinfei Yang
2021 arXiv   pre-print
By supporting multi-modal retrieval training and evaluation, image captioning datasets have spurred remarkable progress on representation learning.  ...  We also evaluate a multitask dual encoder trained on both image-caption and caption-caption pairs that crucially demonstrates CxC's value for measuring the influence of intra- and inter-modality learning  ...  , Eugene Ie for their comments on the initial versions of the paper and Daphne Luong for executive support for the data collection.  ... 
arXiv:2004.15020v3 fatcat:c3k5dipdn5e6xfbl6x7haqrgkm

Semantic Adversarial Network for Zero-Shot Sketch-Based Image Retrieval [article]

Xinxun Xu, Hao Wang, Leida Li, Cheng Deng
2019 arXiv   pre-print
Additionally, the proposed model is trained in an end-to-end strategy to exploit better semantic features suitable for ZS-SBIR.  ...  Zero-shot sketch-based image retrieval (ZS-SBIR) is a specific cross-modal retrieval task for retrieving natural images with free-hand sketches under zero-shot scenario.  ...  To evaluate the performance of the proposed, we follow sketch-based image retrieval evaluation criterion in [Kiran Yelamarthi et al., 2018; , where sketch and image retrieval Algorithm 1 Learning semantic  ... 
arXiv:1905.02327v2 fatcat:gdooxhrzurcz7bk2yzgsofihby