IIIT Hyderabad Publications
Unconstrained Scene Text and Video Text Recognition for Arabic Script

Authors: Mohit Jain, Minesh Mathew, C V Jawahar
Conference: 1st International Workshop on Arabic Script Analysis and Recognition (ASAR 2017)
Location: Nancy, France
Date: 2017-04-03
Report no: IIIT/TR/2017/31

Abstract

Building robust recognizers for Arabic has always been challenging. We demonstrate the effectiveness of an end-to-end trainable CNN-RNN hybrid architecture in recognizing Arabic text in videos and natural scenes. We outperform the previous state of the art on two publicly available video text datasets, ALIF and ACTIV. For the scene text recognition task, we introduce a new Arabic scene text dataset and establish baseline results. For scripts like Arabic, a major challenge in developing robust recognizers is the lack of large quantities of annotated data. We overcome this by synthesizing millions of Arabic text images from a large vocabulary of Arabic words and phrases. Our implementation is built on top of the model introduced in [37], which has proven quite effective for English scene text recognition. The model follows a segmentation-free, sequence-to-sequence transcription approach: the network transcribes a sequence of convolutional features from the input image into a sequence of target labels. This does away with the need to segment the input image into constituent characters or glyphs, which is often difficult for Arabic script. Further, the ability of RNNs to model contextual dependencies yields superior recognition results.

Full paper: pdf

Centre for Visual Information Technology
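
The segmentation-free transcription described in the abstract can be illustrated with a small sketch. The abstract does not name the transcription layer, so the sketch assumes a CTC-style output (one label distribution per feature frame, plus a "blank" label) decoded greedily, which is a common choice for this kind of model but not necessarily the authors' exact setup. The function name, the blank index and the toy alphabet are all hypothetical.

```python
from typing import List, Sequence

BLANK = 0  # assumption: index 0 is reserved for the CTC "blank" label


def greedy_ctc_decode(frame_scores: Sequence[Sequence[float]],
                      alphabet: Sequence[str]) -> str:
    """Collapse per-frame predictions into a label sequence.

    frame_scores: one row of class scores per frame of the feature sequence;
                  column 0 is the blank, column i maps to alphabet[i - 1].
    """
    # Pick the best-scoring class for each frame independently.
    best = [max(range(len(row)), key=row.__getitem__) for row in frame_scores]

    decoded: List[str] = []
    previous = BLANK
    for label in best:
        # Emit a label only if it is not blank and differs from the previous frame.
        if label != BLANK and label != previous:
            decoded.append(alphabet[label - 1])
        previous = label
    return "".join(decoded)


def one_hot(index: int, size: int) -> List[float]:
    """Toy stand-in for a network's per-frame scores."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Toy alphabet in logical (reading) order; a real model would cover the full
# set of Arabic character forms it was trained on.
alphabet = ["ك", "ت", "ا", "ب"]
frames = [one_hot(i, len(alphabet) + 1) for i in [1, 1, 0, 2, 3, 3, 0, 4]]
word = greedy_ctc_decode(frames, alphabet)
```

No character boundaries are needed at any point: each frame votes for a label, repeated votes are merged, and blanks separate genuine repetitions. This is why the approach suits a cursive script such as Arabic, where segmenting glyphs is hard.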