Cross-Modal Retrieval using Random Multimodal Deep Learning

H Somasekar
2019 JOURNAL OF MECHANICS OF CONTINUA AND MATHEMATICAL SCIENCES  
In the multimedia community, hashing-based cross-modal similarity search has received extensive attention because of its query effectiveness and efficiency. This research work contributes a large-scale dataset for weakly supervised cross-media retrieval, named Twitter100k. Existing datasets, namely Wikipedia, NUS-WIDE and Flickr30k, have two main restrictions. First, these datasets are deficient in content diversity, i.e., only a few pre-defined classes are covered. Second, the texts in these datasets are written in a formal dialect, which leads to inconsistency with practical applications. To overcome these disadvantages, the proposed method uses the Twitter100k dataset for two major reasons: first, it contains 100,000 text-image pairs randomly crawled from Twitter, with no constraint on the image categories; second, the text in Twitter100k is written in informal language by the users. Since strongly supervised strategies rely on class labels that might be missing in practice, this paper mainly concentrates on weakly supervised learning for cross-media retrieval, in which only text-image pairs are exploited during training. This paper proposes Random Multimodal Deep Learning (RMDL) based on a Recurrent Neural Network (RNN) for cross-media retrieval. A variety of input data, such as video, text and images, can be accepted by the proposed RMDL on the weakly labeled dataset for cross-media retrieval. In RMDL, the various input data are classified using the RNN architecture. To improve the accuracy and robustness of the proposed method, RMDL uses a specific RNN structure, the Long Short-Term Memory (LSTM). In the experimental analysis, the results demonstrated that the proposed RMDL-based strategy achieved 78% Cumulative Match Characteristic (CMC) compared to other datasets.
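The abstract reports retrieval quality as a Cumulative Match Characteristic (CMC). As a minimal sketch of how CMC is computed for paired text-image retrieval (the embedding setup here is hypothetical, not the paper's RMDL model: row i of the text matrix and row i of the image matrix are assumed to be a matching pair, as in Twitter100k's text-image pairs):

```python
import numpy as np

def cmc_curve(text_emb, image_emb):
    """CMC for text-to-image retrieval over paired embeddings.

    CMC@k is the fraction of text queries whose true image match
    appears among the top-k gallery items by cosine similarity.
    """
    # L2-normalise rows so the dot product equals cosine similarity
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    v = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    sim = t @ v.T  # (n_queries, n_gallery) similarity matrix
    # sort gallery items for each query, best match first
    order = np.argsort(-sim, axis=1)
    # position (rank) at which each query's true match appears
    ranks = np.argmax(order == np.arange(len(t))[:, None], axis=1)
    # cumulative histogram of ranks, normalised to a fraction
    return np.cumsum(np.bincount(ranks, minlength=len(t))) / len(t)

# Toy data: images plus noisy "text" views of the same vectors
rng = np.random.default_rng(0)
images = rng.normal(size=(100, 64))
texts = images + 0.1 * rng.normal(size=(100, 64))
cmc = cmc_curve(texts, images)
print(cmc[0])  # rank-1 accuracy
```

A single scalar such as the reported 78% would typically be CMC at a fixed rank (e.g. rank-1, `cmc[0]` above); the curve itself is non-decreasing and reaches 1.0 at the gallery size.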
doi:10.26782/jmcms.2019.04.00016 fatcat:jkpne7zeenbolhyllbuhgcqwjy