Are Human and Post-edited Translations Different?
Antonio Toral


In principle, one would expect human and post-edited translations (HT and PE, respectively) to differ clearly, since in the PE workflow the translator is primed by the output of the machine translation (MT) system (Green et al., 2013), so that the resulting translation carries the footprint of that system. On this reasoning, HT should be preferred over PE, as it should be more natural and adhere more closely to the norms of the target language. However, research studies have shown that the quality of PE is comparable to that of HT (e.g., Plitt and Masselot, 2010) or even better (e.g., Koponen, 2016), and that native speakers do not have a clear preference for HT over PE (Bowker and Buitrago Ciro, 2015; Daems et al., 2017).

We conduct a computational analysis on datasets that contain HT and PE for several languages and domains. Our aim is to find out whether HT and PE differ significantly with respect to a range of phenomena, e.g., fluency, amount of reordering, and lexical variety. Subsequently, we build a classifier that uses the most promising phenomena from this analysis as features, in order to find out whether, and if so to what extent, one can discriminate automatically between HT and PE.
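
To make two of these phenomena concrete, the following is a minimal sketch of how lexical variety and amount of reordering might be turned into numeric features for such a classifier. It is an illustration under stated assumptions, not the measures used in this study: lexical variety is approximated here by the type-token ratio, reordering by the share of inverted pairs in a word alignment, and all function names, the toy sentence, and the alignment are hypothetical.

```python
from itertools import combinations


def type_token_ratio(tokens):
    """Lexical variety (assumed measure): distinct word forms / total tokens."""
    if not tokens:
        return 0.0
    lowered = [t.lower() for t in tokens]
    return len(set(lowered)) / len(lowered)


def reordering_score(aligned_source_positions):
    """Amount of reordering (assumed measure): share of target-token pairs
    whose aligned source positions appear in inverted order.

    0.0 means the translation follows the source order (monotone);
    1.0 means the order is fully reversed.
    """
    pairs = list(combinations(aligned_source_positions, 2))
    if not pairs:
        return 0.0
    inversions = sum(1 for a, b in pairs if a > b)
    return inversions / len(pairs)


def extract_features(tokens, aligned_source_positions):
    """Bundle the two measures into one feature dictionary per translation."""
    return {
        "ttr": type_token_ratio(tokens),
        "reordering": reordering_score(aligned_source_positions),
    }


# Toy input, invented for illustration only.
tokens = "the cat saw the dog".split()
alignment = [0, 1, 3, 2, 4]  # hypothetical source index for each target token
features = extract_features(tokens, alignment)
print(features)
```

Feature dictionaries of this kind, computed once per HT or PE segment, could then be passed to any standard classifier; the choice of classifier is left open here.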