IIIT Hyderabad Publications
Neural Approaches Towards Text Summarization

Author: Abhishek Singh
Date: 2018-07-14
Report no: IIIT/TH/2018/47
Advisors: Vasudeva Varma, Manish Gupta

Abstract

The rapid growth of online and offline data over recent decades has brought an epochal change in the way we retrieve, analyze, and consume information. Access to vast amounts of information often makes it hard to identify and assimilate the core idea, and understanding large text documents to extract the crucial information from them is a laborious and time-consuming task. Automated text summarization systems address these challenges by creating a concise version of a long text while retaining its core meaning, striving to provide users with filtered, high-quality, concise content at unprecedented scale and speed. Summarization methods are generally classified into two main categories: extractive and abstractive. Extractive methods select salient phrases, sentences, or paragraphs from the text, while abstractive techniques generate summaries from scratch without the constraint of reusing phrases from the original text. While most successful summarization systems use extractive methods, abstractive methods can produce more meaningful summaries; industrial applications often use a hybrid approach that combines the two. Automated text summarization has been a widely studied research problem in natural language processing, yet the vast majority of the summarization literature is devoted to conventional approaches relying on manually engineered features. Recent advances in computational power and the availability of huge amounts of data have driven the success of neural networks in several fields, such as computer vision, speech recognition, and natural language processing.
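The extractive/abstractive distinction above can be made concrete with a toy extractive sketch. The word-frequency scoring here is a deliberately simplified illustration of "selecting salient sentences", not the thesis's actual method:

```python
from collections import Counter

def extractive_summary(sentences, k=2):
    """Toy extractive summarizer: score each sentence by the average
    document-level frequency of its words, then keep the top-k
    sentences in their original order."""
    freq = Counter(w.lower() for s in sentences for w in s.split())
    scores = [sum(freq[w.lower()] for w in s.split()) / len(s.split())
              for s in sentences]
    top = sorted(range(len(sentences)), key=lambda i: -scores[i])[:k]
    return [sentences[i] for i in sorted(top)]
```

An abstractive system would instead generate new sentences; the point of the sketch is only that extractive output is always a subset of the input sentences.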
In recent years, there have been a few efforts to apply deep neural architectures to the text summarization task. These models have been shown to considerably improve the performance of summarization systems; however, they still face the challenge of creating more informative, coherent, fluent, and concise summaries. Moreover, different types of summarization methods need different architectural approaches. In this thesis, we build a series of data-driven, deep-neural-network-based extractive and abstractive summarization systems that produce better summaries by improving the representations of documents and sentences. We begin by improving extractive single-document summarization. We create the Hybrid MemNet architecture, which learns a continuous, unified representation of a document. This architecture jointly captures local and global sentential information along with the notion of summary-worthy sentences, which helps generate better summaries. Unlike previous related neural models, which capture only semantic features, the Hybrid MemNet model captures semantic features of the document and its sentences along with an explicit notion of the summary worthiness of each sentence. As a next step, we improve extractive multi-document summarization by introducing a novel unified architecture, CSTI, which uses the same architecture with different training objectives to improve both the sentence ranking and the sentence extraction (redundancy identification) tasks. Unlike previous work, which relies on manually crafted features to obtain document-independent features, the CSTI architecture exploits the various semantic and compositional aspects latent in a sentence to capture document-independent features automatically.
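The idea of combining local sentential information with a unified document representation to score summary worthiness can be sketched as follows. The vectors and weights below are hypothetical placeholders, not the Hybrid MemNet's actual learned parameters:

```python
import numpy as np

def score_sentences(sent_vecs, doc_vec, w_local, w_global):
    """Hedged sketch: each sentence's summary-worthiness score combines
    a local signal (a linear read-out of its own embedding) with a
    global signal (its agreement with a unified document vector)."""
    local = sent_vecs @ w_local                 # per-sentence feature
    global_ = (sent_vecs @ doc_vec) * w_global  # agreement with document
    return local + global_
```

In a trained model these representations would come from a neural encoder; the sketch only shows why a sentence aligned with the document-level representation scores higher than an off-topic one.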
It learns a heterogeneous sentence representation by combining document-dependent and document-independent features to capture the summary worthiness of a sentence, thus improving the sentence ranking task. Further, the CSTI architecture, when trained with a siamese objective, effectively identifies redundant sentences during sentence selection: it learns sentence representations such that similar sentences lie close together in the vector space while dissimilar ones are pushed far apart. We also explore transfer learning to overcome the lack of data for the multi-document summarization task. Inspired by the performance of CSTI, we adapted its architecture to create NASH, the first data-driven, end-to-end architecture for abstractive summarization in Hindi. The NASH model uses an encoder-decoder paradigm: an encoder reads the source text and converts it into a vector representation, which the decoder uses to generate a summary word by word. The NASH architecture uses an attention-based composition-over-tree mechanism to capture a richer set of features, incorporating the semantic and compositional aspects latent in the text. This helps capture the complex Subject-Object-Verb interactions of Hindi, which in turn helps decode better abstractive summaries. We also created a new dataset containing ∼250K text-summary pairs for Hindi abstractive summarization. In this thesis, we attempt to improve summarization systems using deep neural network techniques. In particular, we focus on tackling the sub-problems of sentence ranking, sentence extraction, and text representation for text summarization. We show that improving these sub-problems helps build more precise and accurate summarization systems. To establish the performance of our models, we performed extensive sets of experiments in diverse settings.
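The role of a siamese-trained encoder in redundancy identification can be illustrated with a simple cosine-similarity filter. The threshold and vectors below are illustrative assumptions, not the CSTI model itself:

```python
import numpy as np

def select_non_redundant(sent_vecs, ranking, threshold=0.9):
    """Sketch of redundancy-aware selection: walk sentences in ranked
    order and skip any whose cosine similarity to an already selected
    sentence exceeds a threshold. A siamese-trained encoder places
    paraphrases close together, so this filter drops near-duplicates."""
    selected = []
    for i in ranking:
        v = sent_vecs[i] / np.linalg.norm(sent_vecs[i])
        if all(float(v @ (sent_vecs[j] / np.linalg.norm(sent_vecs[j]))) < threshold
               for j in selected):
            selected.append(i)
    return selected
```

With embeddings trained under a siamese objective, the distance test above becomes a reliable proxy for "this sentence repeats something already in the summary".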
In doing so, we created several variations of our models to analyze the efficiency and efficacy of our systems.

Full thesis: pdf
Centre for Search and Information Extraction Lab
Copyright © 2009 - IIIT Hyderabad. All Rights Reserved.