IIIT Hyderabad Publications
What is this Song About?: Identification of Keywords in Bollywood Lyrics

Authors: Drushti Apoorva G, Kritik Mathur, Priyansh Agrawal, Radhika Mamidi
Conference: 19th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing-2018)
Location: Hanoi, Vietnam
Date: 2018-03-18
Report no: IIIT/TR/2018/56

Abstract

The keywords of a document represent its content, and meaningful keywords facilitate the search and organization of documents. Methods that automatically identify keywords in a document are therefore important, as manual processes for this are cumbersome and error-prone. For song lyrics, this task has varied applications such as recommendation systems and digital music library management. This work proposes and compares methods to identify keywords from the lyrics of Bollywood songs. We use a collection of lyrics of 1055 Bollywood songs, all written in the Devanagari script. Experiments include looking at the spatial distribution of the terms, their occurrence in a certain context or position, and using WordNet to generate keywords not present in the document. Human annotators validated the results by assigning a score to each method based on its output on a subset of the data. We also used Latent Dirichlet Allocation and Latent Semantic Indexing to validate the results, as further explained in the paper.

Full paper: pdf
Centre for Language Technologies Research Centre