Classifying Community QA Questions That Contain an Image

Kenta Tamaki, Riku Togashi, Sosuke Kato, Sumio Fujita, Hideyuki Maeda, Tetsuya Sakai
2018 Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval - ICTIR '18  
We consider the problem of automatically assigning a category to a given question posted to a Community Question Answering (CQA) site, where the question contains not only text but also an image. For example, CQA users may post a photograph of a dress and ask the community "Is this appropriate for a wedding?" where the appropriate category for this question might be "Manners, Ceremonial occasions." We tackle this problem using Convolutional Neural Networks with a DualNet architecture for [...]ing the image and text representations. Our experiments with real data from Yahoo Chiebukuro and crowdsourced gold-standard categories show that the DualNet approach outperforms a text-only baseline (p = .0000), a sum-and-product baseline (p = .0000), Multimodal Compact Bilinear pooling (MCB) (p = .0000), and a combination of sum-and-product and MCB (p = .0000), where the p-values are based on a randomised Tukey Honestly Significant Difference test with B = 5000 trials.
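The randomised Tukey HSD test named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `randomised_tukey_hsd`, the score-matrix layout (one row per question, one column per system), and the toy data are all assumptions made for illustration. The idea is to shuffle each row's scores across systems, record the largest gap between any two system means in the shuffled matrix, and count how often that gap is at least as large as each observed pairwise gap.

```python
import random


def randomised_tukey_hsd(scores, n_trials=5000, seed=0):
    """Return a matrix of p-values, one per pair of systems.

    scores: list of rows, one row per item (e.g. question), with one
    score per system in each row. Hypothetical layout for illustration.
    """
    rng = random.Random(seed)
    n_items = len(scores)
    n_systems = len(scores[0])

    def column_means(matrix):
        return [sum(row[j] for row in matrix) / n_items
                for j in range(n_systems)]

    # Observed mean score of each system.
    means = column_means(scores)
    counts = [[0] * n_systems for _ in range(n_systems)]

    for _ in range(n_trials):
        # Under the null hypothesis the system labels are exchangeable,
        # so shuffle the scores within each row independently.
        permuted = [rng.sample(row, n_systems) for row in scores]
        perm_means = column_means(permuted)
        # Largest difference between any two permuted system means.
        max_diff = max(perm_means) - min(perm_means)
        for i in range(n_systems):
            for j in range(n_systems):
                if max_diff >= abs(means[i] - means[j]):
                    counts[i][j] += 1

    # p-value for a pair = fraction of trials whose largest permuted gap
    # reaches the observed gap for that pair.
    return [[c / n_trials for c in row] for row in counts]


if __name__ == "__main__":
    # Toy data (made up): three systems scored on 20 items.
    toy_scores = [[0.9, 0.1, 0.88] for _ in range(20)]
    p_values = randomised_tukey_hsd(toy_scores, n_trials=1000, seed=1)
    for row in p_values:
        print(row)
```

Comparing every observed gap against the largest permuted gap is what lets the test handle all system pairs at once without a separate multiple-comparison correction. With B trials, the smallest nonzero p-value such a procedure can produce is 1/B.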
doi:10.1145/3234944.3234948 dblp:conf/ictir/TamakiTKFMS18 fatcat:mfem6omgljcafjwb2czwy3jiqy