Multi-Spectral Vehicle Re-Identification: A Challenge
2020
PROCEEDINGS OF THE THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE TWENTY-EIGHTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE
The dataset and baseline codes are available at: https://github.com/ttaalle/multi-modal-vehicle-Re-ID. ...
Our work provides a benchmark dataset for RGB-NIR and RGB-NIR-TIR multi-spectral vehicle Re-ID and a baseline network for both research and industrial communities. ...
(NLPR) (201900046), and the Natural Science Foundation of Anhui Higher Education Institutions of China (KJ2019A0033). ...
doi:10.1609/aaai.v34i07.6796
fatcat:oqtfculpdrbp3fgslibpz6m7x4
Image Retrieval and Re-Ranking Techniques - A Survey
2014
Signal & Image Processing An International Journal
The technique leaves no ambiguities as we consider only the variant characteristics or modalities, in order to gain the image and video retrieval. ...
In contrast to the existing techniques, the re-ranking method fosters the interaction between modalities to find a tranquillity which is helpful for re-ranking. ...
Tan et al. [15] proposed an agreement-fusion optimization model for fusing multiple heterogeneous data. ...
doi:10.5121/sipij.2014.5201
fatcat:gusz6wizpbgsfgjt6asf7s44fi
iQIYI-VID: A Large Dataset for Multi-modal Person Identification
[article]
2019
arXiv
pre-print
In this paper, we introduce iQIYI-VID, the largest video dataset for multi-modal person identification. It is composed of 600K video clips of 5,000 celebrities. ...
We proposed a Multi-modal Attention module to fuse multi-modal features that can improve person identification considerably. ...
For person re-identification (Re-ID), Wang et al. [63] raised the Rank-1 accuracy of Re-ID to 97.1% on the Market-1501 benchmark [71]. ...
arXiv:1811.07548v2
fatcat:333ocltp7vbt5dmdwdn7pgqhcm
Multi-Modal Long-Term Person Re-Identification Using Physical Soft Bio-Metrics and Body Figure
2022
Applied Sciences
This paper proposes a multi-modal person re-identification model. The first modality includes soft bio-metrics: hair, face, neck, shoulders, and part of the chest. ...
For the first modality, a two-stream Siamese network with pre-trained FaceNet as a feature extractor is utilized. ...
Torchreid is a framework that includes unified data loaders for 15 of the most popular and widely used datasets in the field of person re-identification (video and image domains are available). ...
doi:10.3390/app12062835
fatcat:qr6wtrktrjdzlbfgk2llipmtky
Cooperative Cross-Stream Network for Discriminative Action Representation
[article]
2019
arXiv
pre-print
reduces the undesired modality discrepancy by jointly optimizing a modality ranking constraint and a cross-entropy loss for both homogeneous and heterogeneous modalities. ...
The modality ranking constraint constitutes intra-modality discriminative embedding and inter-modality triplet constraint, and it reduces both the intra-modality and cross-modality feature variations. ...
and the Sichuan Science and Technology Program 2018GZDZX0032, 2019ZDZX0008 and 2019YFG0003. ...
arXiv:1908.10136v1
fatcat:hwu2pnudxffmfg3iajq7s3ymsm
Lightweight Attentional Feature Fusion for Video Retrieval by Text
[article]
2021
arXiv
pre-print
LAFF performs feature fusion at both early and late stages and at both video and text ends, making it a powerful method for exploiting diverse (off-the-shelf) features. ...
Different from previous research that considers feature fusion only at one end, let it be video or text, we aim for feature fusion for both ends within a unified framework. ...
We re-train our model with the top-3 ranked video / text features. ...
arXiv:2112.01832v1
fatcat:bq3n553otzd65oxjur4hbu2xve
Multi-modal Visual Tracking: Review and Experimental Comparison
[article]
2020
arXiv
pre-print
Finally, we discuss various future directions from different perspectives, including model design and dataset construction for further research. ...
To provide a thorough review of multi-modal tracking, we summarize the multi-modal tracking algorithms, especially visible-depth (RGB-D) tracking and visible-thermal (RGB-T) tracking in a unified taxonomy ...
ATOM [123] and sports video analysis [124], which is also a better choice for multi-modal tracking. Specific Network for Auxiliary Modality. ...
arXiv:2012.04176v1
fatcat:pc3pt3hdavcp3pzij5sryvqe5y
Homogeneous and Heterogeneous Relational Graph for Visible-infrared Person Re-identification
[article]
2021
arXiv
pre-print
Visible-infrared person re-identification (VI Re-ID) aims to match person images between the visible and infrared modalities. ...
Existing VI Re-ID methods mainly focus on extracting homogeneous structural relationships in an image, i.e. the relations between local features, while ignoring the heterogeneous correlation of local features ...
Yang [31] employed different frames of video to supplement information for each other. ...
arXiv:2109.08811v2
fatcat:uczpqlaxufgnpk6thqtz5ntygm
Deep Learning for Person Re-identification: A Survey and Outlook
[article]
2021
arXiv
pre-print
ranking optimization. ...
With the advancement of deep neural networks and increasing demand of intelligent video surveillance, it has gained significantly increased interest in the computer vision community. ...
Heterogeneous Re-ID: This subsection summarizes four main kinds of heterogeneous Re-ID, including Re-ID between depth and RGB images (§ 3.1.1), text-to-image Re-ID (§ 3.1.2), visible-to-infrared Re-ID ...
arXiv:2001.04193v2
fatcat:4d3thmsr3va2tnu72nawlu2wxy
Visible-Thermal UAV Tracking: A Large-Scale Benchmark and New Baseline
[article]
2022
arXiv
pre-print
In addition, we design a new RGB-T baseline, named Hierarchical Multi-modal Fusion Tracker (HMFT), which fuses RGB-T data in various levels. ...
In addition, comprehensive applications (short-term tracking, long-term tracking and segmentation mask prediction) with diverse categories and scenes are considered for exhaustive evaluation. ...
Decision fusion avoids the heterogeneity of different modalities and is not sensitive to modality registration. ...
arXiv:2204.04120v1
fatcat:36iohdke2ba7rlx3nmnkc355n4
RGB-Depth Cross-Modal Person Re-identification
2019
2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)
Although some methods have been proposed for cross-modal re-identification between RGB and infrared images [10, 11, 12, 13], almost no research addressing RGB and depth images exists [16, 17]. ...
unexplored problem of cross-modal re-identification of persons between RGB (color) and depth images. ...
During inference, query and gallery images from different modalities are evaluated to produce feature embeddings and matching scores for cross-modal re-identification. ...
doi:10.1109/avss.2019.8909838
dblp:conf/avss/HafnerBKG19
fatcat:levi4qaznjgfvlxgqaf2tkpskq
Multimodal-based Multimedia Analysis, Retrieval, and Services in Support of Social Media Applications
2016
Proceedings of the 2016 ACM on Multimedia Conference - MM '16
and user-generated videos (UGVs) to provide diverse multimedia-related services. ...
extracted from multiple modalities. ...
It performs heterogeneous late fusion to recognize moods and retrieve a ranked list of songs using a heuristic approach for sensor-annotated videos (UGVs). ...
doi:10.1145/2964284.2971471
dblp:conf/mm/Shah16
fatcat:fqdt3wksh5cp5fvai2bchsf77a
Visible-Infrared Person Re-Identification: A Comprehensive Survey and a New Setting
2022
Electronics
To this end, combining visible images with infrared images is a natural trend, and they are considerably heterogeneous modalities. ...
Person re-identification (ReID) plays a crucial role in video surveillance with the aim to search a specific person across disjoint cameras, and it has progressed notably in recent years. ...
[53] bridged the gap between the two modalities by fusing the features of original infrared images and generated fake visible images. After that, Zhong et al. ...
doi:10.3390/electronics11030454
fatcat:rbjugiqeaffl7mmllbz6xjjuvq
A Comprehensive Overview of Biometric Fusion
[article]
2019
arXiv
pre-print
This paper presents an overview of biometric fusion with specific focus on three questions: what to fuse, when to fuse, and how to fuse. ...
In addition, the use of information fusion principles for presentation attack detection and multibiometric cryptosystems is also discussed. ...
Singh and R. Singh are partially supported through the Infosys CAI at IIIT-Delhi. ...
arXiv:1902.02919v1
fatcat:4ujumax47vc5hgbddi666fkhda
A hybrid graph-based and non-linear late fusion approach for multimedia retrieval
2016
2016 14th International Workshop on Content-Based Multimedia Indexing (CBMI)
In contrast, we present a strategy for fusing textual and visual modalities, through the combination of a non-linear fusion model and a graph-based late fusion approach. ...
Nowadays, multimedia retrieval has become a task of high importance, due to the need for efficient and fast access to very large and heterogeneous multimedia collections. ...
ACKNOWLEDGEMENTS This work was supported by the projects MULTISENSOR (FP7-610411) and KRISTINA (H2020-645012), funded by the European Commission. ...
doi:10.1109/cbmi.2016.7500252
dblp:conf/cbmi/GialampoukidisM16
fatcat:gggu5spoufduxbycflomf5b7vu