Social Media Based Topic Modeling for Smart Campus: A Deep Topical Correlation Analysis Method
A smart campus builds on characteristic learning and feedback evaluation of diverse students and aims to enable intelligent, accurate, and customized education. Mining social media data from students, especially through topic modeling, provides a non-intrusive way to learn their instantaneous thoughts and wishes. However, it is challenging to deal with multi-modal data (i.e., the text, images, and videos contained in social media data), as well as with modality dependence and missing modalities.
In this paper, we present a novel deep topical correlation analysis (DTCA) approach, which achieves robust and accurate topic detection for microblogs while simultaneously handling the two aforementioned challenges. In particular, bidirectional recurrent neural networks and convolutional neural networks are used to learn deep textual and visual features, respectively. Then, a canonical correlation analysis-based fusion scheme is proposed, with two innovations that address modality dependence and missing modalities: a filter gate to capture the modality dependency, and a matrix-projection-based component to handle the missing modality. DTCA is trained in an end-to-end manner, in which the parameters of the visual, textual, and cross-modal prediction parts are trained jointly. We further release a large-scale cross-modal Twitter dataset for topic detection, denoted as TM-Twitter. On this dataset, extensive quantitative evaluations are conducted, with comparisons to several state-of-the-art and alternative approaches. Significant performance gains are reported, demonstrating the merits of the proposed DTCA.
INDEX TERMS Topic modeling, deep neural networks, correlation analysis.
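For readers unfamiliar with the correlation analysis underlying the fusion scheme, the following is a minimal sketch of classical linear canonical correlation analysis (CCA) between a textual and a visual feature matrix. It is not the DTCA model: the filter gate, the matrix-projection component for missing modalities, and the deep feature extractors are all omitted. The function name `linear_cca`, the regularization constant `reg`, and the synthetic features are assumptions introduced purely for illustration.

```python
import numpy as np


def linear_cca(X, Y, k, reg=1e-4):
    """Classical linear CCA (illustrative only, not the DTCA fusion scheme).

    X: (n, dx) features of one modality, Y: (n, dy) features of the other.
    Returns projection matrices A (dx, k), B (dy, k) and the top-k
    canonical correlations.
    """
    # Center each modality.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Regularized covariance and cross-covariance estimates.
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(w ** -0.5) @ V.T

    Sxx_is, Syy_is = inv_sqrt(Sxx), inv_sqrt(Syy)

    # Singular values of the whitened cross-covariance are the
    # canonical correlations.
    U, s, Vt = np.linalg.svd(Sxx_is @ Sxy @ Syy_is)
    A = Sxx_is @ U[:, :k]
    B = Syy_is @ Vt.T[:, :k]
    return A, B, s[:k]


# Synthetic stand-ins for textual and visual features that share a
# 2-dimensional latent "topic" signal (hypothetical data).
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 2))
text_feats = z @ rng.normal(size=(2, 8)) + 0.1 * rng.normal(size=(500, 8))
visual_feats = z @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(500, 6))

A, B, corrs = linear_cca(text_feats, visual_feats, k=2)
print(corrs)
```

In this toy setting, the leading canonical correlations should be high because both modalities are generated from the same latent signal. A deep variant would replace the fixed feature matrices with network outputs and optimize the correlation objective jointly with the feature extractors.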