Culprit Analytics from Detective Novels

Author: Aditya Motwani
Date: 2020-06-30
Report no: IIIT/TH/2020/63
Advisor: Kamalakar Karlapalem

Abstract

Novels are a mine of information stored in narrative form. Detective novels usually have a premise of a culprit and a detective who consolidates the evidence. The key ingredient in a detective novel is the obfuscation of evidence so that the reader cannot easily determine the culprit. We formulate a computational problem in which we identify sentences containing potential evidence, if any, against each character in the novel. While novels and stories from other genres have a more linear narrative, detective novels reference information from different points in the story. In this genre, the author is motivated to reveal only partial information to readers to keep them engaged. First, we propose a method to gather evidence against every character by extracting relevant sentences that seem to implicate them in the crime and are scattered throughout the story. By identifying evidence sentences against each character, we computationally generate an ‘evidence summary’. We split the novel into the parts before and after the culprit is exposed by identifying the reveal paragraph, and utilise the information after the reveal to identify semantically similar sentences using their vector representations. We process each novel's text as a graph representation and employ different graph traversal techniques. Second, from the exercise of extracting evidence, we identify and categorise three major ways in which popular authors in the crime and detection genre embed evidence in their stories. We compare our algorithm's results with those of human evaluators and qualitatively determine the different writing styles. Our results show that we outperform the baseline method in extracting useful evidence sentences and in categorising novels on the basis of clue obfuscation. For one of the books in the dataset, The Murder of Roger Ackroyd, our method obtains a score of 8/10 compared to 3/10 for the baseline.
Thus our contributions are twofold: we propose a method to extract evidence-containing sentences from a detective novel by utilising the semantics of this genre, and we establish the different ways in which authors obfuscate the evidence.
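
As an illustration only, the idea of splitting a novel at the reveal and matching earlier sentences against the post-reveal text can be sketched as below. This is not the thesis implementation: the abstract does not name the vector representation or the graph traversal techniques, so this sketch assumes simple bag-of-words vectors with cosine similarity, omits the graph and per-character steps entirely, and uses hypothetical names (`bag_of_words`, `cosine`, `rank_evidence`, `reveal_index`).

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    """Lowercased token counts (an assumed stand-in for a learned sentence vector)."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_evidence(sentences, reveal_index, top_k=3):
    """Split at the (hypothetical) reveal index and rank pre-reveal sentences.

    Each pre-reveal sentence is scored by its best cosine match against any
    post-reveal sentence. Returns up to top_k tuples of
    (score, sentence_index, sentence), highest score first.
    """
    pre = sentences[:reveal_index]
    post_vectors = [bag_of_words(s) for s in sentences[reveal_index:]]
    scored = []
    for index, sentence in enumerate(pre):
        vector = bag_of_words(sentence)
        score = max((cosine(vector, p) for p in post_vectors), default=0.0)
        scored.append((score, index, sentence))
    # Sort by descending score, breaking ties by position in the story.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top_k]


# Toy input, invented for this sketch; not taken from any novel in the dataset.
toy_sentences = [
    "The butler polished the silver knife at noon.",
    "Rain fell over the garden all evening.",
    "The detective revealed that the butler used the silver knife.",
]
ranked = rank_evidence(toy_sentences, reveal_index=2)
```

Here `ranked` lists the pre-reveal sentences ordered by how closely they echo the reveal, so the sentence about the butler and the knife ranks above the one about the rain. A real system would likely replace the word counts with dense sentence embeddings and restrict scoring to sentences that mention a given character.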

Full thesis: pdf

Centre for Data Engineering

IIIT Hyderabad Publications
