IIIT Hyderabad Publications
Exploring Generation of Grammatical Sentences Using Corpus Derived Templates

Author: Nikhilesh Bhatnagar
Date: 2021-06-30
Report no: IIIT/TH/2021/83
Advisor: Radhika Mamidi

Abstract

Natural Language Generation (NLG) is a research task that addresses the automatic generation of natural language text representing an input collection of non-linguistic knowledge. Inputs can come in different modalities: databases, structured reports, etc. Traditional NLG systems fall into one of two general approaches: template based and pipeline based. In this work, we explore how raw text can be leveraged to automatically extract sub-sentence level templates for generating grammatical sentences. We propose a template-based system that generates grammatical sentences from a partially ordered, partial bag-of-words which constrains the generated sentence. We formulate a lexical-level constraint schema as the input, which drives the generation process. We view the task as a search problem (a problem of choice) in which smaller chunk-based templates are combined and stitched together to construct a complete sentence. To that end, we propose a scoring function that determines the syntactic correctness of a template sequence. We use it in conjunction with an evolutionary algorithm as the search procedure to arrive at a potentially grammatical sentence that satisfies the input constraints.

Full thesis: pdf
Centre: Language Technologies Research Centre
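
As an illustration of the search formulation in the abstract, the following is a minimal Python sketch and not the thesis's implementation. Everything in it is an assumption made for the example: the chunk templates in `TEMPLATES`, the adjacency table `ALLOWED` standing in for the syntactic scoring function, the simplification of the partially ordered partial bag-of-words to a plain ordered list of required words, and the specific mutation, crossover, and selection choices.

```python
import random

MAX_LEN = 6  # hypothetical cap on the number of chunks per sentence

# Hypothetical chunk templates (chunk type, words); not taken from the thesis.
TEMPLATES = [
    ("NP", ("the", "dog")), ("NP", ("a", "ball")), ("NP", ("the", "park")),
    ("VP", ("chased",)), ("VP", ("found",)), ("PP", ("in",)),
]

# Toy stand-in for a syntactic scoring function: chunk types allowed to be adjacent.
ALLOWED = {("<s>", "NP"), ("NP", "VP"), ("VP", "NP"),
           ("NP", "PP"), ("PP", "NP"), ("NP", "</s>")}


def words_of(seq):
    """Flatten a template sequence into its list of words."""
    return [w for _, ws in seq for w in ws]


def syntax_score(seq):
    """Fraction of adjacent chunk-type pairs (with boundaries) found in ALLOWED."""
    types = ["<s>"] + [t for t, _ in seq] + ["</s>"]
    pairs = list(zip(types, types[1:]))
    return sum(p in ALLOWED for p in pairs) / len(pairs)


def constraint_score(seq, required):
    """Fraction of required words matched greedily, in order (required must be non-empty)."""
    it = iter(words_of(seq))
    return sum(w in it for w in required) / len(required)


def fitness(seq, required):
    """Sum of the two scores, so the value lies between 0 and 2."""
    return syntax_score(seq) + constraint_score(seq, required)


def mutate(seq, rng):
    """Randomly replace, insert, or delete one chunk template."""
    seq = list(seq)
    op = rng.choice(["replace", "insert", "delete"])
    i = rng.randrange(len(seq))
    if op == "replace":
        seq[i] = rng.choice(TEMPLATES)
    elif op == "insert" and len(seq) < MAX_LEN:
        seq.insert(i, rng.choice(TEMPLATES))
    elif op == "delete" and len(seq) > 1:
        del seq[i]
    return seq


def crossover(a, b, rng):
    """Join a prefix of one parent to a suffix of the other."""
    child = a[:rng.randint(0, len(a))] + b[rng.randint(0, len(b)):]
    return child[:MAX_LEN] or [rng.choice(TEMPLATES)]


def evolve(required, generations=60, pop_size=40, seed=0):
    """Evolutionary search over template sequences; keeps the top quarter each round."""
    rng = random.Random(seed)
    pop = [[rng.choice(TEMPLATES) for _ in range(rng.randint(1, 4))]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda s: fitness(s, required), reverse=True)
        elite = pop[: pop_size // 4]
        children = [mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    return max(pop, key=lambda s: fitness(s, required))


best = evolve(["dog", "ball"])
print(" ".join(words_of(best)), fitness(best, ["dog", "ball"]))
```

The design point this sketch tries to capture is the separation the abstract describes: candidate sentences are sequences of small templates, one function scores how syntactically plausible a sequence is, and the evolutionary loop searches for a sequence that scores well while covering the constrained words.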
Copyright © 2009 - IIIT Hyderabad. All Rights Reserved.