Neural Machine Translation of Kazakh
Galiya Yeshmagambetova


Machine translation (MT) is now a crucial component of communication, as it can break down barriers between people who speak different languages. The quality of MT has increased substantially with the advent of Neural Machine Translation (NMT), a paradigm based on artificial neural networks. Although NMT can translate between some language pairs with high accuracy, it struggles with morphologically rich languages. Examples include Finnish, Turkish and Kazakh, the last of which is the focus of this paper.

In this project, NMT between Kazakh and English will be explored. Because parallel data for Kazakh is scarce, monolingual corpora will be exploited, along with linguistic techniques such as morphological segmentation to handle Kazakh’s complex morphology. A supervised or semi-supervised approach to NMT will therefore be followed. The proposed method should improve the accuracy of translation into English even when little Kazakh data is available, and should be applicable to other under-resourced, morphologically rich languages.
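
To make the idea of morphological segmentation concrete, the following is a minimal sketch in Python, not the method used in this project. It assumes a tiny, hand-picked suffix inventory (the list `SUFFIXES`, the function name `segment` and the parameter `min_stem` are all hypothetical) and strips suffixes greedily from the right, so that one long Kazakh word form becomes a stem plus a sequence of shorter units that an NMT system could treat as separate tokens.

```python
# Toy suffix inventory: an illustrative assumption, far from complete.
SUFFIXES = [
    "лар", "лер", "дар", "дер", "тар", "тер",  # plural
    "ымыз", "іміз",                            # 1st person plural possessive
    "ым", "ім",                                # 1st person singular possessive
    "да", "де", "та", "те",                    # locative
]


def segment(word, suffixes=SUFFIXES, min_stem=2):
    """Greedily strip known suffixes from the right end of a word.

    Returns the remaining stem followed by the stripped suffixes
    in their original left-to-right order.
    """
    # Try longer suffixes first so "іміз" wins over "ім"-like matches.
    ordered = sorted(suffixes, key=len, reverse=True)
    morphemes = []
    stripped = True
    while stripped:
        stripped = False
        for suffix in ordered:
            # Keep at least `min_stem` characters as the stem.
            if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
                morphemes.insert(0, suffix)
                word = word[: -len(suffix)]
                stripped = True
                break
    return [word] + morphemes


if __name__ == "__main__":
    # "үйлерімізде" means "in our houses".
    print(segment("үйлерімізде"))   # ['үй', 'лер', 'іміз', 'де']
    # "кітаптарымда" means "in my books".
    print(segment("кітаптарымда"))  # ['кітап', 'тар', 'ым', 'да']
```

A purely rule-based stripper like this will over-segment stems that happen to end in a listed suffix, and it ignores vowel harmony and consonant alternations. A realistic pipeline would more likely rely on a full morphological analyser or on a segmentation learned from the monolingual corpora.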