IIIT Hyderabad Publications
Impact of Translation on Sentiment Analysis: A Case-Study on Telugu Reviews

Authors: Rama Rohit Reddy Gangula, Radhika Mamidi
Conference: 19th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2018)
Location: Hanoi, Vietnam
Date: 2018-03-18
Report no: IIIT/TR/2018/44

Abstract

Sentiment analysis research has predominantly focused on English texts. Many sentiment resources exist for English, but very few exist for other languages. To improve sentiment analysis in a low-resource language, sentiment-labeled corpora are translated from English into the focus language and used as additional resources for sentiment analysis research in that language [3]. However, when text is translated from one language into another, sentiment is preserved to varying degrees. In this paper, we use product and book reviews in English as a stand-in for source-language text and determine the loss in sentiment and in sentiment predictability when they are translated into Telugu (a low-resource South Asian language), both manually and automatically. For this purpose, we use manually and automatically determined sentiment labels of the English text as a benchmark. We show that sentiment analysis of manual Telugu translations of English text produces results competitive with English sentiment analysis. We find that even though machine translation significantly reduces the human ability to recover sentiment, automatic sentiment systems are still able to capture sentiment information from the translations in certain cases. In the process, we created a Telugu-English parallel corpus that is independently annotated for sentiment on a 5-value scale by Telugu and English speakers. We also created a Telugu lexicon annotated at both the sentiment and emphasis levels.

Full paper: pdf
Centre: Language Technologies Research Centre