Can we improve low-resource Neural Machine Translation with typological linguistic features?
Iacer Calixto, Fabio Curi Paixão and Miguel Rios


In this work, we explore how to incorporate typological linguistic features from The World Atlas of Language Structures (WALS) into low-resource machine translation (MT).
According to Dryer and Haspelmath (2013), WALS features describe ``structural (phonological, grammatical, lexical) properties of languages gathered from descriptive materials (such as reference grammars).''

In particular, we investigate low-resource neural MT (NMT), which is a challenging setting since training neural models requires large amounts of training data (Zoph et al., 2016; Johnson et al., 2017).
To reduce the amount of data needed to translate between low-resource languages, we incorporate typological linguistic features into attention-based sequence-to-sequence NMT.
We hypothesise that an NMT model can learn structural regularities shared between different languages, and that it requires far fewer training examples if these regularities can be inferred from an external source, e.g. the WALS features.

We will discuss three neural sequence-to-sequence architectures and provide a quantitative evaluation using standard MT metrics, with experiments on different language pairs.
We train look-up embedding matrices for each feature in WALS, and our architectures differ in how we incorporate them into the model.
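
As a concrete illustration, the sketch below builds one look-up table per WALS feature in PyTorch. The feature names, value counts, and embedding size are illustrative assumptions, not the configuration used in our experiments.

```python
import torch
import torch.nn as nn

# One embedding table per WALS feature: each feature is categorical
# with its own inventory of possible values, so each gets its own
# look-up matrix. Names and value counts below are hypothetical.
wals_feature_sizes = {
    "81A_order_of_subject_object_verb": 8,
    "30A_number_of_genders": 6,
    "26A_prefixing_vs_suffixing": 6,
}
embed_dim = 32

feature_embeddings = nn.ModuleDict({
    name: nn.Embedding(num_values, embed_dim)
    for name, num_values in wals_feature_sizes.items()
})

# Look up one language's value for each feature (indices are made up).
language_values = {
    "81A_order_of_subject_object_verb": 2,
    "30A_number_of_genders": 1,
    "26A_prefixing_vs_suffixing": 4,
}
vectors = [
    feature_embeddings[name](torch.tensor(idx))
    for name, idx in language_values.items()
]
typology = torch.stack(vectors)  # shape: (num_features, embed_dim)
```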
In our first two architectures, we combine all features using deep layers and use the resulting representation to initialise the encoder or the decoder of the NMT model.
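
A minimal sketch of this idea, assuming a GRU encoder and hypothetical dimensions: the per-feature embeddings are flattened, passed through feed-forward ("deep") layers, and the output becomes the encoder's initial hidden state. The decoder variant is analogous.

```python
import torch
import torch.nn as nn

# Illustrative dimensions only.
num_features, embed_dim, hidden_size = 3, 32, 256

# Deep layers that map the grouped feature embeddings to a hidden state.
bridge = nn.Sequential(
    nn.Linear(num_features * embed_dim, hidden_size),
    nn.Tanh(),
    nn.Linear(hidden_size, hidden_size),
    nn.Tanh(),
)
encoder = nn.GRU(input_size=64, hidden_size=hidden_size, batch_first=True)

# Stand-in for the per-feature embeddings from the look-up above.
typology = torch.randn(num_features, embed_dim)

# Initial hidden state, shaped (num_layers, batch, hidden_size).
h0 = bridge(typology.flatten()).view(1, 1, hidden_size)

source = torch.randn(1, 10, 64)  # a dummy embedded source sentence
outputs, h_n = encoder(source, h0)
```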
In the third architecture, we compose related features by feature type (e.g., phonological, morphological) and incorporate the feature-type embeddings through an independent attention mechanism.
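
The sketch below shows one way such an independent attention over feature-type embeddings could look, using dot-product attention at each decoding step; the scoring function, names, and dimensions are assumptions for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative dimensions: five feature types (e.g. phonological,
# morphological, ...), each composed into a single type embedding.
num_types, type_dim, dec_dim = 5, 64, 256

query_proj = nn.Linear(dec_dim, type_dim, bias=False)

def typology_attention(decoder_state, type_embeddings):
    """Dot-product attention over feature-type embeddings.

    decoder_state:   (batch, dec_dim) current decoder hidden state
    type_embeddings: (batch, num_types, type_dim) one vector per type
    returns:         (batch, type_dim) typology context vector
    """
    query = query_proj(decoder_state).unsqueeze(1)              # (batch, 1, type_dim)
    scores = torch.bmm(query, type_embeddings.transpose(1, 2))  # (batch, 1, num_types)
    weights = F.softmax(scores, dim=-1)
    return torch.bmm(weights, type_embeddings).squeeze(1)       # (batch, type_dim)

# The typology context can then be combined with the usual source
# context before predicting the next target word.
state = torch.randn(2, dec_dim)
types = torch.randn(2, num_types, type_dim)
context = typology_attention(state, types)  # shape: (2, type_dim)
```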