IIIT Hyderabad Publications
CharBoxes: A System for Automatic Discovery of Character Infoboxes from Books

Authors: Manish Gupta, Piyush Bansal, Vasudeva Varma
Conference: The 37th Annual ACM SIGIR Conference
Location: Gold Coast, Australia
Date: 2014-07-06
Report no: IIIT/TR/2014/98

Abstract

Entities are central to a large number of real-world applications. Wikipedia shows infoboxes for a large number of entities. However, little structured information is available about character entities in books. Automatic discovery of characters from books can help in effective summarization. A structured summary that not only introduces the characters in a book but also describes the high-level relationships between them can be of critical importance to buyers. This task involves the following challenging, novel problems:

1. automatic discovery of important characters given a book;
2. automatic construction of a social graph relating the discovered characters;
3. automatic summarization of the text most related to each character; and
4. automatic infobox extraction from the summarized text for each character.

As part of this demo, we design mechanisms to address these challenges and experiment with publicly available books.

Centre for Search and Information Extraction Lab
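The abstract does not describe how the first two problems are solved, so the following is only a minimal illustrative sketch and not the CharBoxes method. It assumes a toy heuristic: capitalized words are treated as candidate character names, the most frequent ones are taken as important characters, and two characters are linked whenever they appear in the same sentence. All function names, the stopword list, and the sample text are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Assumption: a capitalized word is a candidate character name.
NAME_RE = re.compile(r"\b[A-Z][a-z]+\b")
# Assumption: a tiny hand-picked list of capitalized non-names to ignore.
STOPWORDS = {"The", "A", "An", "He", "She", "It", "They", "But", "And", "Then"}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def candidate_names(sentence):
    """Return the set of candidate character names in one sentence."""
    return {w for w in NAME_RE.findall(sentence) if w not in STOPWORDS}


def discover_characters(text, top_k=3):
    """Rank candidates by the number of sentences that mention them."""
    counts = Counter()
    for sentence in split_sentences(text):
        counts.update(candidate_names(sentence))
    return [name for name, _ in counts.most_common(top_k)]


def build_social_graph(text, characters):
    """Count sentence-level co-occurrences between the given characters.

    Returns a Counter mapping an alphabetically ordered (a, b) pair to
    the number of sentences in which both names appear.
    """
    wanted = set(characters)
    edges = Counter()
    for sentence in split_sentences(text):
        present = sorted(candidate_names(sentence) & wanted)
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


if __name__ == "__main__":
    sample = (
        "Alice met Bob in the garden. Bob greeted Alice warmly. "
        "Alice later wrote to Carol. Carol never replied."
    )
    characters = discover_characters(sample)
    print(characters)
    print(dict(build_social_graph(sample, characters)))
```

A real system would likely need named-entity recognition and coreference resolution in place of the capitalization heuristic, since aliases, titles, and pronouns all refer to characters; the sketch only shows how discovered characters and a co-occurrence graph fit together.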