IIIT Hyderabad Publications
No more beating about the bush: A Step towards Idiom Handling for Indian Language NLP

Authors: Ruchit Agrawal, Vighnesh Chenthil Kumar, Vigneshwaran Muralidharan, Dipti Misra Sharma
Conference: 11th edition of the Language Resources and Evaluation Conference (LREC 2018)
Location: Miyazaki (Japan)
Date: 2018-05-07
Report no: IIIT/TR/2018/40

Abstract

One of the major challenges in Natural Language Processing (NLP) is the handling of idioms: seemingly ordinary phrases that can be conjugated or even spread across a sentence to fit the context. Since idioms are a part of natural language, the ability to handle them brings us closer to creating efficient NLP tools. This paper presents a multilingual parallel idiom dataset for seven Indian languages in addition to English and demonstrates its usefulness for two NLP applications: Machine Translation and Sentiment Analysis. We observe significant improvement on both tasks over baseline models trained without the idiom dataset.

Centre: Language Technologies Research Centre