IIIT Hyderabad Publications
Dependency Annotation Scheme for Indian languages

Authors: Rafiya Begum, Samar Husain, Arun Dhwaj, Dipti Misra Sharma, Lakshmi Bai, Rajeev Sangal
Conference: International Joint Conference on Natural Language Processing (IJCNLP-08 2008)
Date: 2008-01-07
Report no: IIIT/TR/2008/188

Abstract

The paper introduces a dependency annotation effort which aims to fully annotate a million-word Hindi corpus. It is the first attempt of its kind to develop a large-scale treebank for an Indian language. In this paper we provide the motivation for following the Paninian framework as the annotation scheme and argue that the Paninian framework is better suited to model the various linguistic phenomena manifest in Indian languages. We present the basic annotation scheme. We also show how the scheme handles phenomena such as complex verbs and ellipses. Empirical results of some experiments done on the currently annotated sentences are also reported.

Centre for Language Technologies Research Centre