IIIT Hyderabad Publications
Enriching Text Summarization: A Journey through Contextual Guidance and Multimodal Data

Author: Anshul Padhi 2018114013
Date: 2024-02-15
Report no: IIIT/TH/2024/19
Advisor: Vasudeva Varma

Abstract

The digital age presents an overwhelming deluge of multimodal data, underscoring the need for effective text summarization techniques. Such techniques transform vast amounts of textual and visual data into concise, comprehensible, and insightful summaries, facilitating information retrieval, comprehension, and decision-making. This thesis develops strategies that enhance text summarization through various forms of contextual guidance and multimodal data, and it offers a cohesive narrative that links these diverse yet interconnected areas of study.

The journey begins with an exploration of "Popularity Forecasting" of sentences within news articles. This approach goes beyond traditional salience-based extractive summarization by predicting the "popularity", or "eye-catching" potential, of sentences. We create a popularity dataset that contains news articles from CNN/DM [47] together with a mapping from each sentence to its popularity score. We build this mapping by comparing each sentence with the search queries associated with its article. We then adapt trained extractive summarizers to perform regression and predict the popularity of a particular sentence within a news article. The result is a ranking of sentences based on their popularity scores.

Next, the research advances into the realm of "Multimodal Summarization," which combines textual and visual elements to create a more holistic summary. By pairing concise textual summaries with the most salient images from news articles, this technique delivers a richer and more comprehensive understanding of the content. In this work we also show that using images to aid the summarization process can improve the accuracy of summarization models.
To do this, we use visuolinguistic transformers such as CLIP [54] and OSCAR [36] to support the interaction between the two modalities, and we adapt general summarization models so that they can incorporate both textual and visual information.

Building on the foundation of extractive summarization, and using the core logic from the multimodal summarization work, the study then introduces "Guided Summarization." This method uses the salience scores of sentences, obtained from an extractive summarizer, to guide an abstractive summarizer. This symbiotic relationship between the two forms of summarization results in more contextually relevant and focused abstractive summaries.

The research further pushes the boundaries of personalization with "Persona-based Summarization," applied to SEBI legal case files. This technique generates tailored summaries based on the specific information needs of different personas, such as investors, defense lawyers, and judges. It underscores the potential of personalization in text summarization, making the information more accessible and relevant to each user profile.

Finally, building on the insights gleaned from the exploration of multimodal summarization, the study culminates with the creation of an "Indic Multimodal Text-Image Pair Dataset." This resource is a collection of text and image pairs in different Indian languages, serving as a foundation for the development and evaluation of visuolinguistic transformers, especially those focusing on data from the Indian subcontinent.

In summary, this thesis provides a comprehensive exploration of how contextual guidance and multimodal data can significantly enhance text summarization.
The techniques and resources proposed and developed in this research, connected through a cohesive narrative, promise to advance the field of text summarization, paving the way for more engaging, comprehensive, and personalized summary generation.

Full thesis: pdf

Centre for Language Technologies Research Centre
Copyright © 2009 - IIIT Hyderabad. All Rights Reserved.