Unconstrained OCR for Urdu using Deep CNN-RNN Hybrid Networks

Authors: Mohit Jain, Minesh Mathew, C V Jawahar
Conference: 4th Asian Conference on Pattern Recognition (ACPR 2017)
Location: Nanjing, China
Date: 2017-11-26
Report no: IIIT/TR/2017/94

Abstract

Building robust text recognition systems for languages with cursive scripts like Urdu has always been challenging. Intricacies of the script and the absence of ample annotated data further act as adversaries to this task. We demonstrate the effectiveness of an end-to-end trainable hybrid CNN-RNN architecture in recognizing Urdu text from printed documents, typically known as Urdu OCR. The proposed solution is not bound by any language-specific lexicon, with the model following a segmentation-free, sequence-to-sequence transcription approach. The network transcribes a sequence of convolutional features from an input image to a sequence of target labels. This removes the need to segment the input image into its constituent characters/glyphs, which is often arduous for scripts like Urdu. Furthermore, past and future contexts modelled by bidirectional recurrent layers aid the transcription. We outperform previous state-of-the-art techniques on the synthetic UPTI dataset. Additionally, we publish a new dataset curated by scanning printed Urdu publications in various writing styles and fonts, annotated at the line level. We also provide benchmark results of our model on this dataset.
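
To make the segmentation-free idea concrete, the sketch below shows one common way a sequence of per-frame predictions can be turned into a label sequence without locating character boundaries. It assumes a CTC-style output layer with a reserved "blank" class, which the abstract does not state; the function name `greedy_ctc_decode`, the toy alphabet, and the frame scores are all hypothetical and only illustrate the decoding step, not the authors' implementation.

```python
from itertools import groupby

BLANK = 0  # assumption: index 0 is a CTC-style "blank" class


def greedy_ctc_decode(frame_scores, alphabet):
    """Turn per-frame class scores into a string without character segmentation.

    frame_scores: list of score lists, one per horizontal frame of the line image.
    alphabet: list mapping class index to character (index BLANK is unused).
    """
    # Pick the highest-scoring class independently for every frame.
    best = [max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores]
    # Merge runs of the same class, since one glyph usually spans several frames.
    collapsed = [label for label, _ in groupby(best)]
    # Drop blanks, which only separate genuine repeats of the same character.
    return "".join(alphabet[label] for label in collapsed if label != BLANK)


# Hypothetical toy alphabet and scores, for illustration only.
alphabet = ["", "\u0627", "\u0628", "\u0631"]  # blank, alef, beh, reh
frames = [
    [0.1, 0.8, 0.05, 0.05],  # alef
    [0.1, 0.7, 0.1, 0.1],    # alef (same glyph, merged with the previous frame)
    [0.9, 0.05, 0.03, 0.02], # blank
    [0.1, 0.6, 0.2, 0.1],    # alef (a second alef, kept because a blank intervened)
    [0.1, 0.1, 0.7, 0.1],    # beh
    [0.2, 0.1, 0.6, 0.1],    # beh (merged)
]
decoded = greedy_ctc_decode(frames, alphabet)
```

In a full system the frame scores would come from the bidirectional recurrent layers running over the convolutional feature sequence. Because decoding works on the whole sequence, no per-character cropping of the line image is needed.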

Full paper: pdf

Centre for Visual Information Technology

IIIT Hyderabad Publications
