IIIT Hyderabad Publications
Entity Tracking in Real-Time using Sub-Topic Detection on Twitter

Authors: Sandeep Panem, Romil Bansal, Manish Gupta, Vasudeva Varma
Conference: 36th European Conference on Information Retrieval
Location: University of Amsterdam
Date: 2014-04-13
Report no: IIIT/TR/2014/40

Abstract

The velocity, volume and variety with which Twitter generates text is increasing exponentially. It is critical to determine latent sub-topics from such tweet data at any given point of time for providing better topic-wise search results relevant to users’ informational needs. The two main challenges in mining subtopics from tweets in real-time are (1) understanding the semantic and the conceptual representation of the tweets, and (2) the ability to determine when a new sub-topic (or cluster) appears in the tweet stream.

We address these challenges by proposing two unsupervised clustering approaches. In the first approach, we generate a semantic space representation for each tweet by keyword expansion and keyphrase identification. In the second approach, we transform each tweet into a conceptual space that represents the latent concepts of the tweet. We empirically show that the proposed methods outperform the state-of-the-art methods.

Centre for Search and Information Extraction Lab