Entity Tracking in Real-Time Using Sub-topic Detection on Twitter [chapter]

Sandeep Panem, Romil Bansal, Manish Gupta, Vasudeva Varma
2014 Lecture Notes in Computer Science  
The velocity, volume and variety with which Twitter generates text is increasing exponentially. It is critical to determine latent sub-topics from such tweet data at any given point of time for providing better topic-wise search results relevant to users' informational needs. The two main challenges in mining subtopics from tweets in real-time are (1) understanding the semantic and the conceptual representation of the tweets, and (2) the ability to determine when a new sub-topic (or cluster) appears in the tweet stream. We address these challenges by proposing two unsupervised clustering approaches. In the first approach, we generate a semantic space representation for each tweet by keyword expansion and keyphrase identification. In the second approach, we transform each tweet into a conceptual space that represents the latent concepts of the tweet. We empirically show that the proposed methods outperform the state-of-the-art methods.
doi:10.1007/978-3-319-06028-6_52 fatcat:kunpacwinbg4jaerwzkyif72h4