Efficient Neural Machine Translation for Indian Languages

Author: Vikrant Goyal
Date: 2020-10-24
Report no: IIIT/TH/2020/95
Advisor: Dipti Misra Sharma

Abstract

Neural Machine Translation (NMT) is a rapidly advancing MT paradigm that has shown promising results for many language pairs, especially when large amounts of training data are available. Machine translation from Indian languages to English, however, remains a challenging problem: the languages differ structurally and morphologically, and sufficient parallel training data is lacking. These factors demand efficient strategies for improving the translation quality of NMT systems. Although NMT is a promising approach, it still struggles to model the deeper semantic and syntactic aspects of language. In low-resource machine translation, addressing data sparseness and semantic ambiguity can help improve performance. In this thesis, we investigate the use of additional syntactic and semantic linguistic factors within the NMT framework for a low-resource language pair, Hindi-English. We propose a new architecture that incorporates explicit linguistic input features into the state-of-the-art Transformer network and demonstrate considerable performance improvements.

Despite the massive success of neural machine translation, vanilla NMT has often been observed to lag behind conventional machine translation systems, such as statistical phrase-based systems, for low-resource language pairs. Various approaches have been proposed in the past few years to address this issue, but little work has been done in this context for Indian languages. In this thesis, we also present our efforts towards building efficient NMT systems between Indian languages (specifically Indo-Aryan languages) and English by exploring the effectiveness of Multilingual Learning and Transfer Learning. We describe techniques that leverage the relatedness among Indo-Aryan languages to improve translation quality for individual language pairs. We also present a new approach, Multilingual Transfer Learning, which outperforms the aforementioned techniques.

Neural MT models are generally trained with a Maximum Likelihood Estimation (MLE) objective but are evaluated with sequence-level metrics such as BLEU. To address this inconsistency between the training objective and the evaluation metric, Reinforcement Learning (RL) methods have been adopted to directly optimize sequence-level objectives. In this thesis, we also present an approach for training NMT systems using the Advantage Actor-Critic method from Reinforcement Learning. Unlike conventional maximum likelihood estimation, our approach directly optimizes the model parameters with respect to task-specific scores and is suited to problems with low-resource settings, large action spaces, and delayed rewards. We also describe experiments that use our approach to further boost the performance of NMT systems with source and target monolingual data for a low-resource language pair such as Hindi-English.

Full thesis: pdf

Centre for Language Technologies Research Centre

IIIT Hyderabad Publications
