Dual Low-Rank Multimodal Fusion

Tao Jin, Siyu Huang, Yingming Li, Zhongfei Zhang
2020 Findings of the Association for Computational Linguistics: EMNLP 2020
Tensor-based fusion methods have proven effective in multimodal fusion tasks. However, existing tensor-based methods make poor use of the fine-grained temporal dynamics of multimodal sequential features. Motivated by this observation, this paper proposes a novel multimodal fusion method called Fine-Grained Temporal Low-Rank Multimodal Fusion (FT-LMF). FT-LMF correlates the features of individual time steps across modalities, but its calculation involves multiplications of high-order tensors. This paper further proposes Dual Low-Rank Multimodal Fusion (Dual-LMF), which reduces the computational complexity of FT-LMF through low-rank tensor approximation along dual dimensions of the input features. Dual-LMF is conceptually simple, effective in practice, and efficient. Empirical studies on benchmark multimodal analysis tasks show that the proposed methods outperform state-of-the-art tensor-based fusion methods at a similar computational complexity.
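The abstract does not spell out the FT-LMF or Dual-LMF equations, so the following is only a minimal sketch of the generic low-rank idea that tensor-based fusion methods build on, not the authors' method. It assumes two modalities at a single time step, and every name in it (`full_tensor_fusion`, `low_rank_fusion`, `w1`, `w2`, the dimensions) is hypothetical. It shows that when the fusion weight tensor is a sum of rank-one factors, the output can be computed from per-modality projections without ever forming the outer-product tensor.

```python
import random

def full_tensor_fusion(z1, z2, W):
    """Fuse via the explicit outer product z1 (x) z2 contracted with W[i][j][k]."""
    dh = len(W[0][0])
    return [
        sum(z1[i] * z2[j] * W[i][j][k]
            for i in range(len(z1)) for j in range(len(z2)))
        for k in range(dh)
    ]

def low_rank_fusion(z1, z2, w1, w2):
    """Same result, assuming W[i][j][k] = sum_r w1[r][i][k] * w2[r][j][k].

    Each modality is projected separately; the projections are multiplied
    elementwise and summed over the rank index.
    """
    rank, dh = len(w1), len(w1[0][0])
    out = [0.0] * dh
    for r in range(rank):
        for k in range(dh):
            p1 = sum(z1[i] * w1[r][i][k] for i in range(len(z1)))
            p2 = sum(z2[j] * w2[r][j][k] for j in range(len(z2)))
            out[k] += p1 * p2
    return out

# Hypothetical sizes: two feature vectors, output size 3, rank 4.
random.seed(0)
d1, d2, dh, rank = 5, 7, 3, 4
z1 = [random.gauss(0, 1) for _ in range(d1)]
z2 = [random.gauss(0, 1) for _ in range(d2)]
w1 = [[[random.gauss(0, 1) for _ in range(dh)] for _ in range(d1)] for _ in range(rank)]
w2 = [[[random.gauss(0, 1) for _ in range(dh)] for _ in range(d2)] for _ in range(rank)]

# Build the full weight tensor only to check that both paths agree.
W = [[[sum(w1[r][i][k] * w2[r][j][k] for r in range(rank))
       for k in range(dh)] for j in range(d2)] for i in range(d1)]

h_full = full_tensor_fusion(z1, z2, W)
h_low = low_rank_fusion(z1, z2, w1, w2)
```

In this sketch the explicit path costs on the order of d1 * d2 * dh multiplications, while the factorized path costs on the order of rank * (d1 + d2) * dh, which is the kind of saving a low-rank approximation is meant to provide.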
doi:10.18653/v1/2020.findings-emnlp.35 fatcat:m4uwzw3abrf2pcu5ibkgzmdkpu