A Karaka Dependency based Dialog Act Tagging for Telugu using Combination of LMs and HMM

Authors: suman.d, Radhika Mamidi
Conference: 16th International Conference on Intelligent Text Processing and Computational Linguistics
Location: Cairo, Egypt
Date: 2015-04-14
Report no: IIIT/TR/2015/66

Abstract

The main goal of this paper is to perform dialog act (DA) tagging for a Telugu corpus. Annotating utterances with dialog acts is necessary to recognize the intent of the speaker in dialog systems. While English follows a strict subject-verb-object (SVO) syntax, Telugu is a free word order language. The n-gram DA tagging methods proposed for English will not work for free word order languages like Telugu. In this paper, we propose a method to perform DA tagging for the Telugu corpus using machine learning techniques combined with karaka dependency relation modifiers. In other words, we use syntactic features obtained from karaka dependencies and apply a combination of language models (LMs) at the utterance level with a Hidden Markov Model (HMM) at the context level for DA tagging. For free word order languages like Telugu, karaka dependencies help extract the modifier-modified relationships between words or word clusters in an utterance. These relationships can still be extracted even when the word order in an utterance changes, and the extracted modifier-modified pairs play a role similar to n-grams. Statistical machine learning methods, namely the combination of LMs and an HMM, are then applied to predict the DA for each utterance in a dialog. The proposed method is compared with several baseline tagging algorithms.
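The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each utterance has already been parsed into (modifier, relation, head) triples, uses an add-one smoothed unigram model per DA tag as a stand-in for the paper's combination of LMs, and decodes the tag sequence with a standard Viterbi pass over an HMM whose states are DA tags. The function names, the toy glossed words, the relation labels, and the tags QUESTION and ANSWER are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def pair_features(dependencies):
    """Turn (modifier, relation, head) triples into word-order-independent features."""
    return sorted(f"{mod}|{rel}|{head}" for mod, rel, head in dependencies)


class TagLM:
    """Add-one smoothed unigram model over dependency features for one DA tag."""

    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def update(self, feats):
        self.counts.update(feats)
        self.total += len(feats)

    def logprob(self, feats, vocab_size):
        return sum(
            math.log((self.counts[f] + 1) / (self.total + vocab_size)) for f in feats
        )


def train(dialogs):
    """dialogs: list of dialogs, each a list of (features, tag) pairs."""
    lms = defaultdict(TagLM)
    trans = defaultdict(Counter)  # trans[prev_tag][tag] = count
    vocab = set()
    for dialog in dialogs:
        prev = "<s>"
        for feats, tag in dialog:
            lms[tag].update(feats)
            vocab.update(feats)
            trans[prev][tag] += 1
            prev = tag
    # One extra vocabulary slot is reserved for unseen features.
    return dict(lms), dict(trans), len(vocab) + 1


def viterbi(utterances, lms, trans, vocab_size):
    """Most likely DA tag sequence for a dialog given as a list of feature lists."""
    tags = sorted(lms)

    def log_trans(prev, tag):
        row = trans.get(prev, Counter())
        return math.log((row[tag] + 1) / (sum(row.values()) + len(tags)))

    # best[tag] = (log score of best path ending in tag, that path)
    best = {
        t: (log_trans("<s>", t) + lms[t].logprob(utterances[0], vocab_size), [t])
        for t in tags
    }
    for feats in utterances[1:]:
        new_best = {}
        for t in tags:
            score, path = max(
                ((best[p][0] + log_trans(p, t), best[p][1]) for p in tags),
                key=lambda item: item[0],
            )
            new_best[t] = (score + lms[t].logprob(feats, vocab_size), path + [t])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


# Toy data: English glosses stand in for Telugu words; labels are illustrative.
q = pair_features([("you", "k1", "go"), ("where", "k7p", "go")])
q_reordered = pair_features([("where", "k7p", "go"), ("you", "k1", "go")])
a = pair_features([("i", "k1", "go"), ("market", "k2p", "go")])

lms, trans, vocab_size = train([[(q, "QUESTION"), (a, "ANSWER")]])
predicted = viterbi([q_reordered, a], lms, trans, vocab_size)
```

Because the features are built from dependency pairs rather than adjacent words, `q` and `q_reordered` are identical, which is the property the abstract relies on for free word order input. A fuller system would replace the unigram model with the interpolated LMs the paper refers to and estimate transitions from a real annotated corpus.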

Full paper: pdf

Centre: Language Technologies Research Centre

IIIT Hyderabad Publications
