IIIT Hyderabad Publications

Looking Beyond the Obvious: Code-Mixed Sentiment Analysis (CMSA)

Authors: Vaibhav Kumar, Mrinal Dhar
Conference: Conference on Empirical Methods in Natural Language Processing (EMNLP 2018)
Location: Brussels, Belgium
Date: 2018-10-31
Report no: IIIT/TR/2018/66

Abstract

Sentiment analysis of online social media has many applications in e-commerce, recommendation systems, analysis of current trends, political campaigns, etc. In a multilingual society, such content is often a composition of different languages. This phenomenon of mixing the vocabulary and syntax of two or more languages (code-mixing) makes the processing of such content significantly harder. In this paper, we present a hybrid architecture for the task of sentiment analysis of English-Hindi code-mixed data. We use two different Bidirectional Long Short-Term Memory (BiLSTM) networks: one looks at the overall sentiment of the sentence, while the other uses an attention mechanism to focus on the individual sentiment-bearing sub-words. This, combined with traditionally used orthographic features and monolingually trained word embeddings, achieves state-of-the-art results on a benchmark dataset. Our system scores an accuracy of 83.54% and an F1-score of 0.827.

Full paper: pdf
Centre for Language Technologies Research Centre
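
The attention mechanism mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how attention scores are computed, so the dot-product scoring against a context vector below is an assumption, and the names `attention_pool`, `states` and `context` are hypothetical. The numbers are made up and stand in for BiLSTM outputs over three sub-word units.

```python
import math


def softmax(scores):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, context):
    """Return attention weights and the weighted sum of the hidden states."""
    # Assumed scoring function: dot product of each state with the context vector.
    scores = [sum(h_d * c_d for h_d, c_d in zip(h, context)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Weighted sum over positions, computed one dimension at a time.
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, pooled


# Toy 2-dimensional states for three sub-word units (illustrative values only).
states = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
context = [1.0, 1.0]
weights, pooled = attention_pool(states, context)
```

In this toy input the second sub-word has the largest score, so it receives the largest weight and dominates the pooled vector. In a trained model the context vector would be learned, so that sentiment-bearing sub-words receive more weight.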
Copyright © 2009 - IIIT Hyderabad. All Rights Reserved.