Hierarchical multi-label classification of social text streams

Zhaochun Ren, Maria-Hendrike Peetz, Shangsong Liang, Willemijn van Dolen, Maarten de Rijke
2014 Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval - SIGIR '14  
Hierarchical multi-label classification assigns a document to multiple hierarchical classes. In this paper we focus on hierarchical multi-label classification of social text streams. Concept drift, complicated relations among classes, and the limited length of documents in social text streams make this a challenging problem. Our approach includes three core ingredients: short document expansion, time-aware topic tracking, and chunk-based structural learning. We extend each short document in
more » ... ort document in social text streams to a more comprehensive representation via state-of-the-art entity linking and sentence ranking strategies. From documents extended in this manner, we infer dynamic probabilistic distributions over topics by dividing topics into dynamic "global" topics and "local" topics. For the third and final phase we propose a chunk-based structural optimization strategy to classify each document into multiple classes. Extensive experiments conducted on a large real-world dataset show the effectiveness of our proposed method for hierarchical multi-label classification of social text streams.
doi:10.1145/2600428.2609595 dblp:conf/sigir/RenPLDR14 fatcat:4h5pyvebgrdu7h4h7rnbnjxfpe