Guest Editorial: Introduction to the Special Section on Fine-Grained Visual Categorization

Jingdong Wang, Zhuowen Tu, Jianlong Fu, Nicu Sebe, Serge Belongie
2022 IEEE Transactions on Pattern Analysis and Machine Intelligence  
The title of their paper is "P-CNN: Part-Based Convolutional Neural Networks for Fine-Grained Visual Categorization."  ...  Fine-Grained Vision-Language Reasoning The paper "Fine-Grained Video Captioning via Graph-based Multi-Granularity Interaction Learning" by Yichao Yan, Ning Zhuang, Bingbing Ni, Jian Zhang, Minghao Xu,  ... 
doi:10.1109/tpami.2021.3065094 fatcat:7uqctb6qgnc7jf2dv75wnustfi

Video Question Answering: Datasets, Algorithms and Challenges [article]

Yaoyao Zhong, Wei Ji, Junbin Xiao, Yicong Li, Weihong Deng, Tat-Seng Chua
2022 arXiv   pre-print
We then point out the research trend of studying beyond factoid QA to inference QA towards the cognition of video contents. Finally, we conclude some promising directions for future exploration.  ...  Although different algorithms have continually been proposed and shown success on different VideoQA datasets, we find that there lacks a meaningful survey to categorize them, which seriously impedes its  ...  use the coarse-grained question feature and fine-grained word feature together. propose hierarchical dual-level attention networks (DLAN) to learn the question-aware video representations with word-level  ... 
arXiv:2203.01225v1 fatcat:dn4sz5pomnfb7igvmxofangzsa

Multiple Granularity Descriptors for Fine-Grained Categorization

Dequan Wang, Zhiqiang Shen, Jie Shao, Wei Zhang, Xiangyang Xue, Zheng Zhang
2015 2015 IEEE International Conference on Computer Vision (ICCV)  
This is due to two main issues: how to localize discriminative regions for recognition and how to learn sophisticated features for representation.  ...  The internal representations of these networks have different region of interests, allowing the construction of multi-grained descriptors that encode informative and discriminative features covering all  ...  Acknowledgements We would like to thank anonymous reviewers for helpful feedback. We would also like to thank Tianjun Xiao and Hao Ye for useful discussions.  ... 
doi:10.1109/iccv.2015.276 dblp:conf/iccv/WangSSZXZ15 fatcat:uccgg6vquzhbxoghp77y6txjvm

Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition [article]

Tailin Chen, Desen Zhou, Jian Wang, Shidong Wang, Yu Guan, Xuming He, Errui Ding
2021 arXiv   pre-print
To address the aforementioned problems, we propose a novel multi-granular spatio-temporal graph network for skeleton-based action classification that jointly models the coarse- and fine-grained skeleton  ...  Existing approaches typically employ a single neural representation for different motion patterns, which has difficulty in capturing fine-grained action classes given limited training data.  ...  ACKNOWLEDGMENTS This research is funded through the EPSRC Centre for Doctoral Training in Digital Civics (EP/L016176/1).  ... 
arXiv:2108.04536v1 fatcat:yjzuazukyrbdrlimtlkopjz6bm

Grapy-ML: Graph Pyramid Mutual Learning for Cross-Dataset Human Parsing

Haoyu He, Jing Zhang, Qiming Zhang, Dacheng Tao
2020 Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence and the Twenty-Eighth Innovative Applications of Artificial Intelligence Conference  
By making use of the multi-granularity labels, Grapy-ML learns a more discriminative feature representation and achieves state-of-the-art performance, which is demonstrated by extensive experiments on  ...  Starting from the prior knowledge of the human body hierarchical structure, we devise a graph pyramid module (GPM) by stacking three levels of graph structures from coarse granularity to fine granularity  ...  Based on the definition, we devise a graph pyramid module (GPM) by stacking three levels of graph structures from coarse granularity to fine granularity subsequently.  ... 
doi:10.1609/aaai.v34i07.6728 fatcat:juvtu6haingcxgoaabsydtbfaa

Grapy-ML: Graph Pyramid Mutual Learning for Cross-dataset Human Parsing [article]

Haoyu He, Jing Zhang, Qiming Zhang, Dacheng Tao
2019 arXiv   pre-print
By making use of the multi-granularity labels, Grapy-ML learns a more discriminative feature representation and achieves state-of-the-art performance, which is demonstrated by extensive experiments on  ...  Starting from the prior knowledge of the human body hierarchical structure, we devise a graph pyramid module (GPM) by stacking three levels of graph structures from coarse granularity to fine granularity  ...  Based on the definition, we devise a graph pyramid module (GPM) by stacking three levels of graph structures from coarse granularity to fine granularity subsequently.  ... 
arXiv:1911.12053v1 fatcat:ugfuo5dp35hr7i5lpkrnrhfc6m

Progressive Multi-stage Interactive Training in Mobile Network for Fine-grained Recognition [article]

Zhenxin Wu, Qingliang Chen, Yifeng Liu, Yinqi Zhang, Chengkai Zhu, Yang Yu
2021 arXiv   pre-print
Existing research applies large-scale convolutional neural networks or visual transformers as the feature extractor, which is extremely computationally expensive.  ...  In fact, real-world scenarios of fine-grained recognition often require a more lightweight mobile network that can be utilized offline.  ... 
arXiv:2112.04223v1 fatcat:sn2wcwqsgbgv7l7puhgqpapcum

An Approach for Process Model Extraction By Multi-Grained Text Classification [article]

Chen Qian, Lijie Wen, Akhil Kumar, Leilei Lin, Li Lin, Zan Zong, Shuang Li, Jianmin Wang
2020 arXiv   pre-print
Under this structure, we accordingly propose the coarse-to-fine (grained) learning mechanism, training multi-grained tasks in coarse-to-fine grained order to share the high-level knowledge for the low-level  ...  In this paper, we formalize the PME task into the multi-grained text classification problem, and propose a hierarchical neural network to effectively model and extract multi-grained information without  ...  For example, [4, 3, 29] proposed neural machine-reading models that constructed dynamic knowledge graphs from procedural text.  ... 
arXiv:1906.02127v3 fatcat:x4dd7nrczbgjfm75245q3qo5yy

2021 Index IEEE Transactions on Multimedia Vol. 23

2021 IEEE transactions on multimedia  
The Author Index contains the primary entry for each item, listed under the first author's name.  ...  ., +, TMM 2021 2794-2805 Fine-Grained Visual Categorization by Localizing Object Parts With Single Image.  ...  ., +, TMM 2021 2413-2427 Fine-Grained Visual Categorization by Localizing Object Parts With Single Image.  ... 
doi:10.1109/tmm.2022.3141947 fatcat:lil2nf3vd5ehbfgtslulu7y3lq

Label Relation Graphs Enhanced Hierarchical Residual Network for Hierarchical Multi-Granularity Classification [article]

Jingzhou Chen, Peng Wang, Jian Liu, Yuntao Qian
2022 arXiv   pre-print
Considering the hierarchical feature interaction, we propose a hierarchical residual network (HRN), in which granularity-specific features from parent levels acting as residual connections are added to  ...  Hierarchical multi-granularity classification (HMC) assigns hierarchical multi-granularity labels to each object and focuses on encoding the label hierarchy, e.g., ["Albatross", "Laysan Albatross"] from  ...  Such marginalization enjoys two benefits: learning with the coarse-level label could impact decisions of fine-grained subclasses while learning with the fine-level label aids the prediction of coarse-grained  ... 
arXiv:2201.03194v2 fatcat:vwnkqsqnhjbqbfqpszeyrwwilu

Fine-Grained Image Analysis with Deep Learning: A Survey [article]

Xiu-Shen Wei, Yi-Zhe Song, Oisin Mac Aodha, Jianxin Wu, Yuxin Peng, Jinhui Tang, Jian Yang, Serge Belongie
2021 arXiv   pre-print
image recognition and fine-grained image retrieval.  ...  In this paper we present a systematic survey of these advances, where we attempt to re-define and broaden the field of FGIA by consolidating two fundamental fine-grained research areas -- fine-grained  ...  ACKNOWLEDGMENTS The authors would like to thank the editor and the anonymous reviewers for their constructive comments.  ... 
arXiv:2111.06119v2 fatcat:ninawxsjtnf4lndtqquuwl3weq

New Ideas and Trends in Deep Multimodal Content Understanding: A Review [article]

Wei Chen, Weiping Wang, Li Liu, Michael S. Lew
2020 arXiv   pre-print
We then introduce current ideas and trends in deep multimodal feature learning, such as feature embedding approaches and objective function design, which are crucial in overcoming the aforementioned challenges  ...  Finally, we include several promising directions for future research.  ...  Afterwards, GCNs extract visual representations based on the built semantic graphs and spatial graphs.  ... 
arXiv:2010.08189v1 fatcat:2l7molbcn5hf3oyhe3l52tdwra

New Ideas and Trends in Deep Multimodal Content Understanding: A Review

Wei Chen, Weiping Wang, Li Liu, Michael S. Lew
2020 Neurocomputing  
We then introduce current ideas and trends in deep multimodal feature learning, such as feature embedding approaches and objective function design, which are crucial in overcoming the aforementioned challenges  ...  Finally, we include several promising directions for future research.  ...  Afterwards, GCNs extract visual representations based on the built semantic graphs and spatial graphs.  ... 
doi:10.1016/j.neucom.2020.10.042 fatcat:hyjkj5enozfrvgzxy6avtbmoxu

Fine-Grained Object Classification via Self-Supervised Pose Alignment [article]

Xuhui Yang, Yaowei Wang, Ke Chen, Yong Xu, Yonghong Tian
2022 arXiv   pre-print
For discounting pose variations, this paper proposes to learn a novel graph based object representation to reveal a global configuration of local parts for self-supervised pose alignment across classes  ...  sub-networks encourages discriminative features in a curriculum learning manner.  ...  Acknowledgements This work is supported by the China Postdoctoral Science Foundation (2021M691682), the National Natural Science Foundation of China (61902131, 62072188, U20B2052), the Program for Guangdong  ... 
arXiv:2203.15987v1 fatcat:buaqjag3ljfdplkvaqe7yyudry

MOMA: Multi-Object Multi-Actor Activity Parsing

Zelun Luo, Wanze Xie, Siddharth Kapoor, Yiyun Liang, Michael Cooper, Juan Carlos Niebles, Ehsan Adeli, Fei-Fei Li
2021 Neural Information Processing Systems  
This study also benefited from Stanford Institute for Human-Centered AI (HAI) AWS Cloud Credits.  ...  In recent years, graph neural networks [85, 56] have been found to be a promising tool for both generating and learning from visual compositions.  ...  To jointly model these two representation, we identify graph neural network and action parsing as two promising tools for our study, briefly surveyed below.  ... 
dblp:conf/nips/LuoXKLCNAL21 fatcat:7sana34l75b4zdlgdunqjufk74