IIIT Hyderabad Publications
Data-Driven Grammar

Author: ANIL KRISHNA ERAGANI 200702013
Date: 2024-04-29
Report no: IIIT/TH/2024/48
Advisor: Dipti Misra Sharma

Abstract

Many parsing systems for natural language processing exist in the literature, and they use different methods to parse sentences. Most previous work in this domain relies either on hand-written grammars or on data-driven methods. Both approaches have pros and cons. Because data-driven parsers rely on annotated treebanks, they achieve high accuracy on data similar to what they were trained on. Their main drawback is that performance drops drastically when a sentence contains data they have not seen. Parsers that use hand-written grammars, on the other hand, have broad coverage of the language, so there is effectively no distinction between seen and unseen data. The difficulty with this approach is that writing a broad-coverage grammar for a language is very hard. In this thesis we develop a method in which the weakness of each approach is offset by the strength of the other. In short, we build a parser that uses grammar rules extracted from a treebank (which addresses broad coverage) and also incorporates the data-driven approach. With this thesis we try to achieve two goals:

• Developing a novel approach to parsing text.
• Being able to use the resulting parser with very minimal resources.

Centre for Language Technologies Research Centre