IIIT Hyderabad Publications
Machine Translation Post-editing: Assessing Effort, Text and Bias

Author: Arafat Ahsan (2019701030)
Date: 2023-03-23
Report no: IIIT/TH/2023/19
Advisor: Dipti Misra Sharma

Abstract

Machine-translation-aided workflows have become increasingly prevalent across industry and within large institutional translation efforts. We evaluate the machine translation post-editing process end-to-end on three aspects: the usefulness of the process itself, i.e. whether it reduces temporal, technical and cognitive effort compared to unaided translation; the linguistic nature of its products, in terms of the quality of the translations generated measured against unaided human translations; and the suitability of these outputs for a standard machine translation evaluation task. The language direction we study is English-Hindi, a mid-to-high-resource language pair. We first conduct a behavioral experiment with professional translators. In our analysis, employing mixed-effects modeling techniques, we find that for this language pair, post-editing reduces translation time by as much as 63%, consequently increasing productivity; it reduces technical effort, measured as logged keystrokes, by 59%; and it reduces cognitive effort, measured as the number of pauses, by 63%. We then investigate the nature of the translations generated during post-editing by comparing them against raw MT outputs and unaided human translations on linguistic indicators we define and implement. We find significant differences between the three corpora in terms of lexical richness, normalization, and interference from the source language. We also detect and confirm the presence of translation universals in our data. We then test the suitability of the translated data thus created for a machine translation evaluation task. We detect engine-reference bias towards the engine used during post-editing, observe inflated engine scores across the board on post-edited references, and find the use of multiple references to be the most prudent choice when conducting meta-evaluation tasks. We run these evaluations across multiple string-based and pre-trained metrics and issue general recommendations on constructing evaluation test sets and choosing metrics to score against. We also contribute a set of post-editing guidelines for translators and post-editors working with the English-Hindi pair, with examples culled from a real-world translation task.

Full thesis: pdf

Centre for Language Technologies Research Centre