Towards Narrative Understanding

Author: Ujwal Narayan
Date: 2023-03-30
Report no: IIIT/TH/2023/18
Advisor: Manish Shrivastava

Abstract

Research in NLP has paid increasing attention to narrative understanding over the past decade. Narratives have a broad range of applications in domains such as economics, political science, and literature. Understanding narratives is critical to performing well in discourse-level tasks such as summarization, question answering, and multi-hop reasoning. In this thesis, we explore a framework for understanding narratives. Narratives are broken down into two fundamental parts: events and characters. The task of understanding narratives is then posed as the task of understanding the interplay and relations between these two constituents. We focus on two major kinds of relations: how characters are related to other characters (character-character relations), and how events are related to other events (event-event relations). To model character-character relations, we utilize the concept of character arcs, a popular literary device that shows how a character changes over time. We build MARCUS, an automated pipeline that generates and visualizes these character arcs and character relations for a given novel. We take two famous literary works, “Harry Potter” and “The Lord of the Rings”, and analyze the character relations generated by MARCUS. We evaluate the quality of these arcs and relations through both quantitative and qualitative methods and show the effectiveness of the arcs created through MARCUS. For event-event relations, we focus on the task of identifying temporal relations between an event pair in the narrative. Narratives, by their very nature, are discourse-level phenomena. Yet most current work on identifying event temporal relations focuses on local event pairs, i.e. event pairs found close together, typically within adjacent sentences. We thus build DELTA, a discourse-level event temporal relation dataset, to facilitate document-level event timeline generation.
In DELTA, we introduce the concept of multiple timelines, where we distinguish between the real timeline, in which events have actually occurred, and hypothetical timelines, whose events may not have actually happened. We also develop a new user-friendly annotation tool that makes timeline annotation more efficient and helps visualize and understand the timeline. We train strong baseline models based on RoBERTa to predict discourse-level event temporal relations. In addition, we qualitatively analyze the timelines generated from our dataset and evaluate them against the timelines generated from existing datasets.

Full thesis: pdf

Centre for Language Technologies Research Centre

IIIT Hyderabad Publications
