IIIT Hyderabad Publications
ACTSA: Annotated Corpus for Telugu Sentiment Analysis

Authors: Muku Sandeep, Radhika Mamidi
Conference: Proceedings of the First Workshop on Building Linguistically Generalizable NLP Systems (EMNLP-2017)
Pages: 54-58
Location: Copenhagen, Denmark
Date: 2017-09-08
Report no: IIIT/TR/2017/103

Abstract

Sentiment analysis is the task of determining the polarity of a document or sentence, and it has received a lot of attention in recent years for the English language. With the rapid growth of social media, a lot of data is now available in regional languages besides English. Telugu is one such regional language with abundant data available on social media, but it is hard to find labelled sentence data for Telugu Sentiment Analysis. In this paper, we describe an effort to build a gold-standard annotated corpus of Telugu sentences to support Telugu Sentiment Analysis. The corpus, named ACTSA (Annotated Corpus for Telugu Sentiment Analysis), is a collection of Telugu sentences taken from different sources, which were then preprocessed and manually annotated by native Telugu speakers using our annotation guidelines. In total, we have annotated 5410 sentences, which makes our corpus the largest resource currently available. The corpus and annotation guidelines are made publicly available.

Full paper: pdf

Centre for Language Technologies Research Centre
Copyright © 2009 - IIIT Hyderabad. All Rights Reserved.