IIIT Hyderabad Publications
Indiscapes: Instance Segmentation Networks for Layout Parsing of Historical Indic Manuscripts

Authors: Abhishek Prusty, Sowmya Aitha, Abhishek Trivedi, Ravi Kiran Sarvadevabhatla
Conference: 15th International Conference on Document Analysis and Recognition (ICDAR 2019)
Location: Sydney, Australia
Date: 2019-09-20
Report no: IIIT/TR/2019/67

Abstract

Historical palm-leaf manuscripts and early paper documents from the Indian subcontinent form an important part of the world’s literary and cultural heritage. Despite their importance, large-scale annotated Indic manuscript image datasets do not exist. To address this deficiency, we introduce Indiscapes, the first ever dataset with multi-regional layout annotations for historical Indic manuscripts. To address the challenge of large diversity in scripts and the presence of dense, irregular layout elements (e.g. text lines, pictures, multiple documents per image), we adapt a Fully Convolutional Deep Neural Network architecture for fully automatic, instance-level spatial layout parsing of manuscript images. We demonstrate the effectiveness of the proposed architecture on images from the Indiscapes dataset. For annotation flexibility, and keeping the non-technical background of domain experts in mind, we also contribute a custom, web-based GUI annotation tool and a dashboard-style analytics portal. Overall, our contributions set the stage for enabling downstream applications such as OCR and word-spotting in historical Indic manuscripts at scale.

Centre for Visual Information Technology