IIIT Hyderabad Publications
A Dataset for Detecting Irony in Hindi-English Code-Mixed Social Media Text

Authors: Deepanshu Vijay, Aditya Bohra, Vinay Singh, Syed S. Akhtar, Manish Shrivastava
Conference: 15th Extended Semantic Web Conference (ESWC-2018)
Location: Greece
Date: 2018-06-03
Report no: IIIT/TR/2018/78

Abstract

Irony is one of many forms of figurative language. Irony detection is crucial for Natural Language Processing (NLP) tasks such as sentiment analysis and opinion mining. From a cognitive point of view, it is a challenge to study how humans use irony as a communication tool. While relevant research has been done independently on code-mixed social media text and on irony detection, our work is the first attempt at detecting irony in Hindi-English code-mixed social media text. In this paper, we treat automatic irony detection as a classification problem and present a Hindi-English code-mixed dataset consisting of tweets posted on Twitter. The tweets are annotated with the language at the word level and with the class they belong to (Ironic or Non-Ironic). We also propose a supervised classification system for detecting irony in the text using various character-level, word-level, and structural features.

Full paper: pdf
Centre for Language Technologies Research Centre
Copyright © 2009 - IIIT Hyderabad. All Rights Reserved.