
Multi-Spectral Vehicle Re-Identification: A Challenge

Hongchao Li, Chenglong Li, Xianpeng Zhu, Aihua Zheng, Bin Luo
2020 Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)
The dataset and baseline codes are available at: https://github.com/ttaalle/multi-modal-vehicle-Re-ID.  ...  Our work provides a benchmark dataset for RGB-NIR and RGB-NIR-TIR multi-spectral vehicle Re-ID and a baseline network for both research and industrial communities.  ...  (NLPR) (201900046), and the Natural Science Foundation of Anhui Higher Education Institutions of China (KJ2019A0033).  ...
doi:10.1609/aaai.v34i07.6796 fatcat:oqtfculpdrbp3fgslibpz6m7x4
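
The entry above describes a baseline network for RGB-NIR-TIR multi-spectral vehicle Re-ID, but the snippet gives no architectural detail. Below is a minimal, hypothetical sketch of one common design, one backbone branch per spectrum with concatenation fusion and an identity classifier; the tiny CNN branches, dimensions, and class count are illustrative assumptions, not the authors' baseline.

```python
# Hypothetical multi-branch network for RGB-NIR-TIR Re-ID feature fusion.
# Not the paper's baseline; a generic sketch with toy dimensions.
import torch
import torch.nn as nn

class SpectralBranch(nn.Module):
    """Tiny CNN standing in for a per-spectrum backbone (e.g. a ResNet)."""
    def __init__(self, out_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, out_dim),
        )

    def forward(self, x):
        return self.net(x)

class MultiSpectralReID(nn.Module):
    """One branch per spectrum; identity logits from the concatenated embedding."""
    def __init__(self, num_ids=100, dim=128):
        super().__init__()
        self.rgb, self.nir, self.tir = (SpectralBranch(dim) for _ in range(3))
        self.classifier = nn.Linear(3 * dim, num_ids)

    def forward(self, rgb, nir, tir):
        fused = torch.cat([self.rgb(rgb), self.nir(nir), self.tir(tir)], dim=1)
        return fused, self.classifier(fused)

# Toy forward pass: a batch of 4 aligned RGB/NIR/TIR vehicle crops.
x = torch.randn(4, 3, 128, 256)
model = MultiSpectralReID()
emb, logits = model(x, x, x)
print(emb.shape, logits.shape)  # torch.Size([4, 384]) torch.Size([4, 100])
```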

Image Retrieval and Re-Ranking Techniques - A Survey

Mayuri D. Joshi, Revati M. Deshmukh, Kalashree N.Hemke, Ashwini Bhake, Rakhi Wajgi
2014 Signal & Image Processing: An International Journal
The technique leaves no ambiguities as we consider only the variant characteristics or modalities in order to improve image and video retrieval.  ...  In contrast to the existing techniques, the re-ranking method fosters interaction between modalities to find a consensus that is helpful for re-ranking.  ...  Tan et al. [15] proposed an agreement-fusion optimization model for fusing multiple heterogeneous data.  ...
doi:10.5121/sipij.2014.5201 fatcat:gusz6wizpbgsfgjt6asf7s44fi

iQIYI-VID: A Large Dataset for Multi-modal Person Identification [article]

Yuanliu Liu, Bo Peng, Peipei Shi, He Yan, Yong Zhou, Bing Han, Yi Zheng, Chao Lin, Jianbin Jiang, Yin Fan, Tingwei Gao, Ganwen Wang (+2 others)
2019 arXiv   pre-print
In this paper, we introduce iQIYI-VID, the largest video dataset for multi-modal person identification. It is composed of 600K video clips of 5,000 celebrities.  ...  We propose a Multi-modal Attention module to fuse multi-modal features that can improve person identification considerably.  ...  For person re-identification (Re-ID), Wang et al. [63] raised the Rank-1 accuracy of Re-ID to 97.1% on the Market-1501 benchmark [71].  ...
arXiv:1811.07548v2 fatcat:333ocltp7vbt5dmdwdn7pgqhcm
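
The iQIYI-VID entry mentions a Multi-modal Attention module for fusing per-modality features, but the module's exact form is not given in the snippet. The sketch below shows one plausible realization: per-sample softmax attention weights over projected per-modality features. The modality names and all dimensions are assumptions for illustration.

```python
# Hypothetical attention-weighted fusion over per-modality feature vectors.
# Not the paper's actual module; a generic sketch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityAttentionFusion(nn.Module):
    def __init__(self, dims, fused_dim=256):
        super().__init__()
        # Project each modality (possibly of different dimensionality)
        # into a shared space, then score each projection per sample.
        self.projections = nn.ModuleList([nn.Linear(d, fused_dim) for d in dims])
        self.scorer = nn.Linear(fused_dim, 1)

    def forward(self, features):
        # features: list of tensors, one per modality, each (batch, dim_m)
        projected = torch.stack(
            [proj(f) for proj, f in zip(self.projections, features)], dim=1
        )                                                   # (batch, M, fused_dim)
        weights = F.softmax(self.scorer(projected), dim=1)  # (batch, M, 1)
        return (weights * projected).sum(dim=1)             # (batch, fused_dim)

# Toy usage: face (512-d), body (2048-d) and audio (128-d) features.
fusion = ModalityAttentionFusion(dims=[512, 2048, 128])
feats = [torch.randn(8, 512), torch.randn(8, 2048), torch.randn(8, 128)]
print(fusion(feats).shape)  # torch.Size([8, 256])
```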

Multi-Modal Long-Term Person Re-Identification Using Physical Soft Bio-Metrics and Body Figure

Nadeen Shoukry, Mohamed A. Abd El Ghany, Mohammed A.-M. Salem
2022 Applied Sciences  
This paper proposes a multi-modal person re-identification model. The first modality includes soft bio-metrics: hair, face, neck, shoulders, and part of the chest.  ...  For the first modality, a two-stream Siamese network with pre-trained FaceNet as a feature extractor is utilized.  ...  Torchreid is a framework that includes unified data loaders for 15 of the most popular and widely used datasets in the field of person re-identification (video and image domains are available).  ...
doi:10.3390/app12062835 fatcat:qr6wtrktrjdzlbfgk2llipmtky
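
The entry above builds a two-stream Siamese network on a pre-trained FaceNet extractor. FaceNet itself is not bundled with the sketch below; a frozen placeholder backbone stands in for it, and the distance-based comparison is a generic Siamese pattern rather than the paper's exact network.

```python
# Generic two-stream Siamese sketch: a frozen, pre-trained feature extractor
# (standing in for FaceNet) feeds a small trainable embedding head; pairs are
# compared by Euclidean distance. Backbone and dimensions are assumptions.
import torch
import torch.nn as nn

class SiameseReID(nn.Module):
    def __init__(self, backbone, feat_dim=512, embed_dim=128):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():   # keep the extractor frozen
            p.requires_grad = False
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(), nn.Linear(256, embed_dim)
        )

    def embed(self, x):
        with torch.no_grad():
            feats = self.backbone(x)
        return nn.functional.normalize(self.head(feats), dim=1)

    def forward(self, x1, x2):
        # Pairwise distance used to decide same / different identity.
        return torch.norm(self.embed(x1) - self.embed(x2), dim=1)

# Placeholder backbone: maps flattened crops to 512-d features.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 512))
model = SiameseReID(backbone)
a, b = torch.randn(4, 3, 64, 64), torch.randn(4, 3, 64, 64)
print(model(a, b).shape)  # torch.Size([4])
```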

Cooperative Cross-Stream Network for Discriminative Action Representation [article]

Jingran Zhang, Fumin Shen, Xing Xu, Heng Tao Shen
2019 arXiv   pre-print
reduces the undesired modality discrepancy by jointly optimizing a modality ranking constraint and a cross-entropy loss for both homogeneous and heterogeneous modalities.  ...  The modality ranking constraint constitutes intra-modality discriminative embedding and inter-modality triplet constraint, and it reduces both the intra-modality and cross-modality feature variations.  ...  and the Sichuan Science and Technology Program 2018GZDZX0032, 2019ZDZX0008 and 2019YFG0003.  ... 
arXiv:1908.10136v1 fatcat:hwu2pnudxffmfg3iajq7s3ymsm
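
The entry above jointly optimizes a modality ranking constraint (intra-modality discriminative embedding plus an inter-modality triplet constraint) together with a cross-entropy loss. The exact formulation is not in the snippet; below is a generic joint triplet-plus-cross-entropy objective, with the margin and loss weight chosen arbitrarily.

```python
# Generic joint objective: cross-entropy on identity logits plus a triplet
# margin loss pulling cross-modal positives together. Margin and weight are
# illustrative, not taken from the paper.
import torch
import torch.nn as nn

ce_loss = nn.CrossEntropyLoss()
triplet_loss = nn.TripletMarginLoss(margin=0.3)

def joint_loss(logits, labels, anchor, positive, negative, weight=1.0):
    """logits: (batch, num_ids); anchor/positive/negative: (batch, dim).
    The positive would come from the other modality of the same identity,
    the negative from a different identity."""
    return ce_loss(logits, labels) + weight * triplet_loss(anchor, positive, negative)

# Toy batch: 8 samples, 100 identities, 128-d embeddings.
logits = torch.randn(8, 100)
labels = torch.randint(0, 100, (8,))
anc, pos, neg = (torch.randn(8, 128) for _ in range(3))
print(joint_loss(logits, labels, anc, pos, neg).item())
```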

Lightweight Attentional Feature Fusion for Video Retrieval by Text [article]

Fan Hu and Aozhu Chen and Ziyue Wang and Fangming Zhou and Xirong Li
2021 arXiv   pre-print
LAFF performs feature fusion at both early and late stages and at both the video and text ends, making it a powerful method for exploiting diverse (off-the-shelf) features.  ...  Different from previous research that considers feature fusion only at one end, be it video or text, we aim for feature fusion at both ends within a unified framework.  ...  We re-train our model with the top-3 ranked video / text features.  ...
arXiv:2112.01832v1 fatcat:bq3n553otzd65oxjur4hbu2xve
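
The LAFF entry describes fusing diverse off-the-shelf features at both the video and text ends for text-to-video retrieval. The sketch below shows only the late, similarity-level part of such a pipeline: each video feature is paired with a text feature, per-pair cosine similarity matrices are computed, and the matrices are averaged. Equal weights are an assumption; LAFF itself learns attentional weights.

```python
# Late (similarity-level) fusion sketch for text-to-video retrieval.
# Equal weighting of feature pairs is an illustrative simplification.
import torch
import torch.nn.functional as F

def fused_similarity(video_feats, text_feats):
    """video_feats / text_feats: lists of aligned tensors,
    video_feats[k]: (num_videos, d_k), text_feats[k]: (num_queries, d_k)."""
    sims = []
    for v, t in zip(video_feats, text_feats):
        v = F.normalize(v, dim=1)
        t = F.normalize(t, dim=1)
        sims.append(t @ v.T)                  # (num_queries, num_videos)
    return torch.stack(sims).mean(dim=0)      # average over feature pairs

# Toy setup: 2 feature types, 100 videos, 5 text queries.
videos = [torch.randn(100, 512), torch.randn(100, 768)]
texts = [torch.randn(5, 512), torch.randn(5, 768)]
ranking = fused_similarity(videos, texts).argsort(dim=1, descending=True)
print(ranking[:, :3])  # top-3 video indices per query
```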

Multi-modal Visual Tracking: Review and Experimental Comparison [article]

Pengyu Zhang and Dong Wang and Huchuan Lu
2020 arXiv   pre-print
Finally, we discuss various future directions from different perspectives, including model design and dataset construction for further research.  ...  To provide a thorough review of multi-modal tracking, we summarize the multi-modal tracking algorithms, especially visible-depth (RGB-D) tracking and visible-thermal (RGB-T) tracking, in a unified taxonomy  ...  ATOM [123] and sports video analysis [124], which is also a better choice for multi-modal tracking. Specific Network for Auxiliary Modality.  ...
arXiv:2012.04176v1 fatcat:pc3pt3hdavcp3pzij5sryvqe5y

Homogeneous and Heterogeneous Relational Graph for Visible-infrared Person Re-identification [article]

Yujian Feng, Feng Chen, Jian Yu, Yimu Ji, Fei Wu, Shangdong Liu, Xiao-Yuan Jing
2021 arXiv   pre-print
Visible-infrared person re-identification (VI Re-ID) aims to match person images between the visible and infrared modalities.  ...  Existing VI Re-ID methods mainly focus on extracting homogeneous structural relationships in an image, i.e. the relations between local features, while ignoring the heterogeneous correlation of local features  ...  Yang [31] employed different frames of video to supplement information for each other.  ... 
arXiv:2109.08811v2 fatcat:uczpqlaxufgnpk6thqtz5ntygm
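
The entry above models homogeneous structural relations between local features of an image. The snippet gives no formulation, so the sketch below uses a common relational-graph pattern: local part features become graph nodes, adjacency is cosine similarity, and one propagation step mixes each node with its neighbors. The single linear transform and residual connection are assumptions, not the paper's specific module.

```python
# Generic relational-graph sketch over local part features of each image.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalRelationGraph(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.transform = nn.Linear(dim, dim)

    def forward(self, parts):
        # parts: (batch, num_parts, dim) local features per image
        normed = F.normalize(parts, dim=2)
        adjacency = F.softmax(normed @ normed.transpose(1, 2), dim=2)  # (B, P, P)
        propagated = adjacency @ self.transform(parts)                 # neighbor mix
        return F.relu(parts + propagated)          # residual keeps original parts

module = LocalRelationGraph(dim=256)
parts = torch.randn(4, 6, 256)   # 4 images, 6 local parts each
print(module(parts).shape)       # torch.Size([4, 6, 256])
```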

Deep Learning for Person Re-identification: A Survey and Outlook [article]

Mang Ye, Jianbing Shen, Gaojie Lin, Tao Xiang, Ling Shao, Steven C. H. Hoi
2021 arXiv   pre-print
ranking optimization.  ...  With the advancement of deep neural networks and the increasing demand for intelligent video surveillance, person re-identification has gained significantly increased interest in the computer vision community.  ...  Heterogeneous Re-ID This subsection summarizes four main kinds of heterogeneous Re-ID, including Re-ID between depth and RGB images (§3.1.1), text-to-image Re-ID (§3.1.2), visible-to-infrared Re-ID  ...
arXiv:2001.04193v2 fatcat:4d3thmsr3va2tnu72nawlu2wxy

Visible-Thermal UAV Tracking: A Large-Scale Benchmark and New Baseline [article]

Pengyu Zhang, Jie Zhao, Dong Wang, Huchuan Lu, Xiang Ruan
2022 arXiv   pre-print
In addition, we design a new RGB-T baseline, named Hierarchical Multi-modal Fusion Tracker (HMFT), which fuses RGB-T data at various levels.  ...  In addition, comprehensive applications (short-term tracking, long-term tracking and segmentation mask prediction) with diverse categories and scenes are considered for exhaustive evaluation.  ...  Decision fusion avoids the heterogeneity of different modalities and is not sensitive to modality registration.  ...
arXiv:2204.04120v1 fatcat:36iohdke2ba7rlx3nmnkc355n4
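
The RGB-T entry contrasts fusion at various levels and notes that decision fusion sidesteps modality heterogeneity. HMFT's actual fusion is not described in the snippet; the sketch below shows plain decision-level fusion, weighting each modality's response map by its own peak confidence before picking the target location. The peak-value confidence heuristic is an assumption.

```python
# Decision-level fusion sketch for RGB-T tracking: each modality's tracker
# produces a response map; maps are combined with confidence-based weights
# and the fused peak gives the target position.
import numpy as np

def fuse_responses(rgb_map, tir_map):
    """rgb_map, tir_map: (H, W) response maps from per-modality trackers."""
    w_rgb, w_tir = rgb_map.max(), tir_map.max()   # crude per-frame confidences
    fused = (w_rgb * rgb_map + w_tir * tir_map) / (w_rgb + w_tir + 1e-8)
    return np.unravel_index(np.argmax(fused), fused.shape)  # (row, col) peak

rng = np.random.default_rng(0)
rgb = rng.random((64, 64))
tir = rng.random((64, 64))
print(fuse_responses(rgb, tir))
```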

RGB-Depth Cross-Modal Person Re-identification

Frank M. Hafner, Amran Bhuiyan, Julian F. P. Kooij, Eric Granger
2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)
Although some methods have been proposed for cross-modal re-identification between RGB and infrared images [10, 11, 12, 13], almost no research addressing RGB and depth images exists [16, 17].  ...  unexplored problem of cross-modal re-identification of persons between RGB (color) and depth images.  ...  During inference, query and gallery images from different modalities are evaluated to produce feature embeddings and matching scores for cross-modal re-identification.  ...
doi:10.1109/avss.2019.8909838 dblp:conf/avss/HafnerBKG19 fatcat:levi4qaznjgfvlxgqaf2tkpskq
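
The entry above evaluates query and gallery images from different modalities by comparing their feature embeddings. The sketch below covers only that matching step: cosine similarity between RGB query embeddings and depth gallery embeddings followed by ranking. The embedding networks themselves are out of scope here, and all sizes are arbitrary.

```python
# Cross-modal matching sketch: rank a depth gallery for each RGB query by
# cosine similarity between precomputed embeddings.
import numpy as np

def rank_gallery(query_emb, gallery_emb):
    """query_emb: (Q, D) RGB embeddings; gallery_emb: (G, D) depth embeddings."""
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    g = gallery_emb / np.linalg.norm(gallery_emb, axis=1, keepdims=True)
    similarity = q @ g.T                        # (Q, G) cosine similarities
    return np.argsort(-similarity, axis=1)      # gallery indices, best first

rng = np.random.default_rng(1)
queries = rng.standard_normal((3, 128))   # 3 RGB queries
gallery = rng.standard_normal((50, 128))  # 50 depth gallery images
print(rank_gallery(queries, gallery)[:, :5])   # top-5 matches per query
```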

Multimodal-based Multimedia Analysis, Retrieval, and Services in Support of Social Media Applications

Rajiv Ratn Shah
2016 Proceedings of the 2016 ACM on Multimedia Conference - MM '16  
and user-generated videos (UGVs) to provide diverse multimedia-related services.  ...  extracted from multiple modalities.  ...  It performs heterogeneous late fusion to recognize moods and retrieve a ranked list of songs using a heuristic approach for sensor-annotated videos (UGVs).  ... 
doi:10.1145/2964284.2971471 dblp:conf/mm/Shah16 fatcat:fqdt3wksh5cp5fvai2bchsf77a

Visible-Infrared Person Re-Identification: A Comprehensive Survey and a New Setting

Huantao Zheng, Xian Zhong, Wenxin Huang, Kui Jiang, Wenxuan Liu, Zheng Wang
2022 Electronics  
To this end, combining visible images with infrared images is a natural trend, although the two are considerably heterogeneous modalities.  ...  Person re-identification (ReID) plays a crucial role in video surveillance, with the aim of searching for a specific person across disjoint cameras, and it has progressed notably in recent years.  ...  [53] bridged the gap between the two modalities by fusing the features of original infrared images and generated fake visible images. After that, Zhong et al.  ...
doi:10.3390/electronics11030454 fatcat:rbjugiqeaffl7mmllbz6xjjuvq

A Comprehensive Overview of Biometric Fusion [article]

Maneet Singh, Richa Singh, Arun Ross
2019 arXiv   pre-print
This paper presents an overview of biometric fusion with specific focus on three questions: what to fuse, when to fuse, and how to fuse.  ...  In addition, the use of information fusion principles for presentation attack detection and multibiometric cryptosystems is also discussed.  ...  Singh and R. Singh are partially supported through the Infosys CAI at IIIT-Delhi.  ... 
arXiv:1902.02919v1 fatcat:4ujumax47vc5hgbddi666fkhda
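
The overview above frames biometric fusion around what, when, and how to fuse. As one concrete instance of "how", the sketch below performs simple score-level fusion: min-max normalization of each matcher's scores followed by a weighted sum. The two matchers (face, fingerprint) and the weights are illustrative only.

```python
# Score-level biometric fusion sketch: normalize each matcher's scores to
# [0, 1] with min-max scaling, then combine them with fixed weights.
import numpy as np

def min_max(scores):
    lo, hi = scores.min(), scores.max()
    return (scores - lo) / (hi - lo + 1e-12)

def fuse_scores(score_sets, weights):
    """score_sets: list of (N,) raw score arrays, one per matcher."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    normalized = np.stack([min_max(s) for s in score_sets])
    return weights @ normalized            # (N,) fused scores

face_scores = np.array([0.62, 0.91, 0.40, 0.77])
finger_scores = np.array([120.0, 340.0, 95.0, 410.0])  # different scale
print(fuse_scores([face_scores, finger_scores], weights=[0.6, 0.4]))
```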

A hybrid graph-based and non-linear late fusion approach for multimedia retrieval

Ilias Gialampoukidis, Anastasia Moumtzidou, Dimitris Liparas, Stefanos Vrochidis, Ioannis Kompatsiaris
2016 14th International Workshop on Content-Based Multimedia Indexing (CBMI)
In contrast, we present a strategy for fusing textual and visual modalities, through the combination of a non-linear fusion model and a graph-based late fusion approach.  ...  Nowadays, multimedia retrieval has become a task of high importance, due to the need for efficient and fast access to very large and heterogeneous multimedia collections.  ...  ACKNOWLEDGEMENTS This work was supported by the projects MULTISENSOR (FP7-610411) and KRISTINA (H2020-645012), funded by the European Commission.  ... 
doi:10.1109/cbmi.2016.7500252 dblp:conf/cbmi/GialampoukidisM16 fatcat:gggu5spoufduxbycflomf5b7vu
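
The entry above combines a non-linear fusion model with graph-based late fusion of textual and visual results. The exact models are not given in the snippet; the sketch below fuses per-modality relevance scores non-linearly (a geometric mean) and then diffuses the fused scores one step over an item-similarity graph. The geometric mean, the graph construction, and alpha are all assumptions for illustration.

```python
# Hybrid late-fusion sketch: (1) non-linear fusion of text and visual relevance
# scores via a geometric mean, (2) one step of score diffusion over an item
# similarity graph so items similar to highly ranked items are promoted.
import numpy as np

def hybrid_late_fusion(text_scores, visual_scores, item_sims, alpha=0.7):
    """text_scores, visual_scores: (N,) relevance in [0, 1];
    item_sims: (N, N) pairwise item similarities."""
    fused = np.sqrt(text_scores * visual_scores)            # non-linear fusion
    graph = item_sims / (item_sims.sum(axis=1, keepdims=True) + 1e-12)
    return alpha * fused + (1 - alpha) * graph @ fused       # graph smoothing

rng = np.random.default_rng(2)
n = 6
text = rng.random(n)
visual = rng.random(n)
sims = rng.random((n, n))
sims = (sims + sims.T) / 2                                   # symmetric graph
print(np.argsort(-hybrid_late_fusion(text, visual, sims)))   # re-ranked items
```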
Showing results 1 — 15 out of 2,314 results