Guest Editorial: Introduction to the Special Section on Fine-Grained Visual Categorization
2022
IEEE Transactions on Pattern Analysis and Machine Intelligence
The title of their paper is "P-CNN: Part-Based Convolutional Neural Networks for Fine-Grained Visual Categorization." ...
Fine-Grained Vision-Language Reasoning The paper "Fine-Grained Video Captioning via Graph-based Multi-Granularity Interaction Learning" by Yichao Yan, Ning Zhuang, Bingbing Ni, Jian Zhang, Minghao Xu, ...
doi:10.1109/tpami.2021.3065094
fatcat:7uqctb6qgnc7jf2dv75wnustfi
Video Question Answering: Datasets, Algorithms and Challenges
[article]
2022
arXiv
pre-print
We then point out the research trend of studying beyond factoid QA to inference QA towards the cognition of video contents. Finally, we conclude with some promising directions for future exploration. ...
Although different algorithms have continually been proposed and shown success on different VideoQA datasets, we find that there lacks a meaningful survey to categorize them, which seriously impedes its ...
use the coarse-grained question feature and fine-grained word feature together. ... propose hierarchical dual-level attention networks (DLAN) to learn the question-aware video representations with word-level ...
arXiv:2203.01225v1
fatcat:dn4sz5pomnfb7igvmxofangzsa
Multiple Granularity Descriptors for Fine-Grained Categorization
2015
2015 IEEE International Conference on Computer Vision (ICCV)
This is due to two main issues: how to localize discriminative regions for recognition and how to learn sophisticated features for representation. ...
The internal representations of these networks have different region of interests, allowing the construction of multi-grained descriptors that encode informative and discriminative features covering all ...
Acknowledgements We would like to thank anonymous reviewers for helpful feedback. We would also like to thank Tianjun Xiao and Hao Ye for useful discussions. ...
doi:10.1109/iccv.2015.276
dblp:conf/iccv/WangSSZXZ15
fatcat:uccgg6vquzhbxoghp77y6txjvm
Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition
[article]
2021
arXiv
pre-print
To address the aforementioned problems, we propose a novel multi-granular spatio-temporal graph network for skeleton-based action classification that jointly models the coarse- and fine-grained skeleton ...
Existing approaches typically employ a single neural representation for different motion patterns, which has difficulty in capturing fine-grained action classes given limited training data. ...
ACKNOWLEDGMENTS This research is funded through the EPSRC Centre for Doctoral Training in Digital Civics (EP/L016176/1). ...
arXiv:2108.04536v1
fatcat:yjzuazukyrbdrlimtlkopjz6bm
Grapy-ML: Graph Pyramid Mutual Learning for Cross-Dataset Human Parsing
2020
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence and the Twenty-Eighth Innovative Applications of Artificial Intelligence Conference
By making use of the multi-granularity labels, Grapy-ML learns a more discriminative feature representation and achieves state-of-the-art performance, which is demonstrated by extensive experiments on ...
Starting from the prior knowledge of the human body hierarchical structure, we devise a graph pyramid module (GPM) by stacking three levels of graph structures from coarse granularity to fine granularity ...
Based on the definition, we devise a graph pyramid module (GPM) by stacking three levels of graph structures from coarse granularity to fine granularity subsequently. ...
doi:10.1609/aaai.v34i07.6728
fatcat:juvtu6haingcxgoaabsydtbfaa
Grapy-ML: Graph Pyramid Mutual Learning for Cross-dataset Human Parsing
[article]
2019
arXiv
pre-print
By making use of the multi-granularity labels, Grapy-ML learns a more discriminative feature representation and achieves state-of-the-art performance, which is demonstrated by extensive experiments on ...
Starting from the prior knowledge of the human body hierarchical structure, we devise a graph pyramid module (GPM) by stacking three levels of graph structures from coarse granularity to fine granularity ...
Based on the definition, we devise a graph pyramid module (GPM) by stacking three levels of graph structures from coarse granularity to fine granularity subsequently. ...
arXiv:1911.12053v1
fatcat:ugfuo5dp35hr7i5lpkrnrhfc6m
Progressive Multi-stage Interactive Training in Mobile Network for Fine-grained Recognition
[article]
2021
arXiv
pre-print
Existing research applies large-scale convolutional neural networks or visual transformers as the feature extractor, which is extremely computationally expensive. ...
In fact, real-world scenarios of fine-grained recognition often require a more lightweight mobile network that can be utilized offline. ...
... for fine-grained visual categorization. In CVPR, pages 1173–1182. IEEE ... Part-based r-cnns for fine-grained category detection. ...
arXiv:2112.04223v1
fatcat:sn2wcwqsgbgv7l7puhgqpapcum
An Approach for Process Model Extraction By Multi-Grained Text Classification
[article]
2020
arXiv
pre-print
Under this structure, we accordingly propose the coarse-to-fine (grained) learning mechanism, training multi-grained tasks in coarse-to-fine grained order to share the high-level knowledge for the low-level ...
In this paper, we formalize the PME task into the multi-grained text classification problem, and propose a hierarchical neural network to effectively model and extract multi-grained information without ...
For example, [4, 3, 29] proposed neural machine-reading models that constructed dynamic knowledge graphs from procedural text. ...
arXiv:1906.02127v3
fatcat:x4dd7nrczbgjfm75245q3qo5yy
2021 Index IEEE Transactions on Multimedia Vol. 23
2021
IEEE transactions on multimedia
The Author Index contains the primary entry for each item, listed under the first author's name. ...
., +, TMM 2021 2794-2805 Fine-Grained Visual Categorization by Localizing Object Parts With Single Image. ...
., +, TMM 2021 2413-2427 Fine-Grained Visual Categorization by Localizing Object Parts With Single Image. ...
doi:10.1109/tmm.2022.3141947
fatcat:lil2nf3vd5ehbfgtslulu7y3lq
Label Relation Graphs Enhanced Hierarchical Residual Network for Hierarchical Multi-Granularity Classification
[article]
2022
arXiv
pre-print
Considering the hierarchical feature interaction, we propose a hierarchical residual network (HRN), in which granularity-specific features from parent levels acting as residual connections are added to ...
Hierarchical multi-granularity classification (HMC) assigns hierarchical multi-granularity labels to each object and focuses on encoding the label hierarchy, e.g., ["Albatross", "Laysan Albatross"] from ...
Such marginalization enjoys two benefits: learning with the coarse-level label could impact decisions of fine-grained subclasses while learning with the fine-level label aids the prediction of coarse-grained ...
arXiv:2201.03194v2
fatcat:vwnkqsqnhjbqbfqpszeyrwwilu
Fine-Grained Image Analysis with Deep Learning: A Survey
[article]
2021
arXiv
pre-print
image recognition and fine-grained image retrieval. ...
In this paper we present a systematic survey of these advances, where we attempt to re-define and broaden the field of FGIA by consolidating two fundamental fine-grained research areas -- fine-grained ...
ACKNOWLEDGMENTS The authors would like to thank the editor and the anonymous reviewers for their constructive comments. ...
arXiv:2111.06119v2
fatcat:ninawxsjtnf4lndtqquuwl3weq
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
[article]
2020
arXiv
pre-print
We then introduce current ideas and trends in deep multimodal feature learning, such as feature embedding approaches and objective function design, which are crucial in overcoming the aforementioned challenges ...
Finally, we include several promising directions for future research. ...
Afterwards, GCNs extract visual representations based on the built semantic graphs and spatial graphs. ...
arXiv:2010.08189v1
fatcat:2l7molbcn5hf3oyhe3l52tdwra
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
2020
Neurocomputing
We then introduce current ideas and trends in deep multimodal feature learning, such as feature embedding approaches and objective function design, which are crucial in overcoming the aforementioned challenges ...
Finally, we include several promising directions for future research. ...
Afterwards, GCNs extract visual representations based on the built semantic graphs and spatial graphs. ...
doi:10.1016/j.neucom.2020.10.042
fatcat:hyjkj5enozfrvgzxy6avtbmoxu
Fine-Grained Object Classification via Self-Supervised Pose Alignment
[article]
2022
arXiv
pre-print
For discounting pose variations, this paper proposes to learn a novel graph based object representation to reveal a global configuration of local parts for self-supervised pose alignment across classes ...
sub-networks encourages discriminative features in a curriculum learning manner. ...
Acknowledgements This work is supported by the China Postdoctoral Science Foundation (2021M691682), the National Natural Science Foundation of China (61902131, 62072188, U20B2052), the Program for Guangdong ...
arXiv:2203.15987v1
fatcat:buaqjag3ljfdplkvaqe7yyudry
MOMA: Multi-Object Multi-Actor Activity Parsing
2021
Neural Information Processing Systems
This study also benefited from Stanford Institute for Human-Centered AI (HAI) AWS Cloud Credits. ...
In recent years, graph neural networks [85, 56] have been found to be a promising tool for both generating and learning from visual compositions. ...
To jointly model these two representation, we identify graph neural network and action parsing as two promising tools for our study, briefly surveyed below. ...
dblp:conf/nips/LuoXKLCNAL21
fatcat:7sana34l75b4zdlgdunqjufk74