IIIT Hyderabad Publications

Beyond OCRs for Document Blur Estimation

Authors: Pranjal Kumar Rai, Sajal Maheshwari, Ishit Mehta, Parikshit Sakurikar, Vineet Gandhi
Conference: 14th IAPR International Conference on Document Analysis and Recognition (ICDAR 2017)
Location: Kyoto, Japan
Date: 2017-11-13
Report no: IIIT/TR/2017/51

Abstract

The current document blur/quality estimation algorithms rely on the OCR accuracy to measure their success. A sharp document image, however, at times may yield lower OCR accuracy owing to factors independent of blur or quality of capture. The necessity to rely on OCR is mainly due to the difficulty in quantifying the quality otherwise. In this work, we overcome this limitation by proposing a novel dataset for document blur estimation, for which we physically quantify the blur using a capture set-up which computationally varies the focal distance of the camera. We also present a selective search mechanism to improve upon the recently successful patch-based learning approaches (using codebooks or convolutional neural networks). We present a thorough analysis of the improved blur estimation pipeline using correlation with OCR accuracy as well as the actual amount of blur. Our experiments demonstrate that our method outperforms the current state-of-the-art by a significant margin.

Full paper: pdf

Centre for Visual Information Technology
Copyright © 2009 - IIIT Hyderabad. All Rights Reserved.