IIIT Hyderabad Publications
Computational Humour: Analysis and Generation of Humorous Texts in Hindi

Author: Srishti Aggarwal
Date: 2018-10-22
Report no: IIIT/TH/2018/73
Advisor: Radhika Mamidi

Abstract

Humour is an intricate part of human behaviour and a valued quality when developing interpersonal relationships. It is an interesting and challenging domain to work in owing to its subjectivity. Humour is one of the most sophisticated forms of human intelligence and creativity, intimately related to factors such as world knowledge, reasoning and perception. With our AI systems becoming increasingly intelligent, there is now a demand for systems which are more human-like than ever before. No AI system could be complete without the ability to process and use an integral human phenomenon like humour. Humour is often expressed through language in various forms such as funny stories and anecdotes, jokes, witty sayings and puns. This verbally expressed humour is of interest to NLP researchers because it deals with spoken and textual language. The subfield of Computational Linguistics and Artificial Intelligence which aims to develop systems that can use and understand humour as humans do is called Computational Humour. A complete computational humour system has to contain two main modules - a humour interpreter and a humour generator. Each of these modules could have independent uses in a variety of applications, but when it comes to interactive and comprehensive AI systems, both are equally important.

In this thesis, we present the first-ever work on computational humour for Hindi. We consider both sub-fields of computational humour - Humour Understanding and Humour Generation - and work on specific applications in each. We study Hindi-English code-mixed puns by focusing on automatic recovery of the target words/phrases in them. We define code-mixed puns, analyse the types of structures present in them, and classify them into two main categories - intra-sentential and intra-word code-mixed puns.
We then propose a four-step methodology to recover intra-sentential code-mixed puns. On evaluating our model on a test dataset of such puns, we observe that we are able to successfully recover 67% of them. We recognise several limitations of our system, such as its inability to deal with highly unusual and creative language use.

For the problem of computational generation of humour, we consider two tasks. First, we build a prototype system which automatically generates a simple form of Dur-se-Dekha jokes in Hindi using a small lexicon and handcrafted rules. Second, we design an algorithm which can induce humour in a short non-humorous text by performing single-word substitutions on the basis of phonetic similarity and sentiment opposition. We perform human evaluation on the results of both and observe that we are able to generate funny instances in both cases.

Full thesis: pdf
Centre for Language Technologies Research Centre