IIIT Hyderabad Publications

Sad or Glad? Corpus Creation for Odia Poetry with Sentiment Polarity Information

Authors: Gaurav Mohanty, Pruthwik Mishra, Radhika Mamidi
Conference: 19th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2018)
Location: Hanoi, Vietnam
Date: 2018-03-18
Report no: IIIT/TR/2018/112

Abstract

Resource-poor languages, like Odia, lack the resources and tools needed for sentiment analysis to give promising results. With more user-generated raw data readily available today, it is of prime importance to have annotated corpora from various domains. This paper is a first attempt towards building an annotated corpus of Odia poetry with sentiment labels. The annotated corpus is further used for sentiment classification using machine learning techniques in order to establish a baseline. Stylistic variations and structural differences between poetic and non-poetic texts make sentiment classification more challenging for poetry. Using the annotated corpus of poems, we obtained comparable accuracies across various classification models. Linear-SVM outperformed the other classifiers with a macro F1-score of 0.68. The annotated corpus contains a total of 730 Odia poems of various genres with a vocabulary of more than 23k words. A Fleiss' kappa score of 0.83 was obtained, which corresponds to near-perfect agreement among the annotators.

Full paper: pdf
Centre for Language Technologies Research Centre
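
For readers unfamiliar with the agreement measure cited in the abstract, the following is a minimal sketch of how Fleiss' kappa is computed in general. It is not the authors' code: the function name `fleiss_kappa`, the label names `pos` and `neg`, the toy ratings, and the choice of three annotators are all assumptions made for illustration, since the abstract does not state the label set or the number of annotators.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    ratings: list of lists; ratings[i] holds the labels given to item i.
    categories: the full set of possible labels.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])

    # counts[i][c] = how many raters assigned category c to item i
    counts = [Counter(item) for item in ratings]

    # Observed agreement: mean over items of the per-item agreement P_i
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(p_items) / n_items

    # Expected agreement by chance, from the overall category proportions
    p_cats = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_expected = sum(p ** 2 for p in p_cats)

    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy data: 4 poems, 3 annotators, 2 polarity labels.
toy_ratings = [
    ["pos", "pos", "pos"],
    ["neg", "neg", "neg"],
    ["pos", "pos", "neg"],
    ["neg", "neg", "neg"],
]
kappa = fleiss_kappa(toy_ratings, ["pos", "neg"])
print(round(kappa, 3))
```

On the commonly used Landis and Koch scale, values between 0.81 and 1.00 are read as almost perfect agreement, which is the band the reported 0.83 falls into.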