Towards learning an interpreted language with recurrent models
Denis Paperno


Neural network models are general-purpose learning devices that show good results on many practical tasks, but can they master language the way humans do? To address this question, we train recurrent neural models to interpret simple languages that feature key properties of natural language: recursive syntactic structure and compositional semantics. We find that a long short-term memory network (LSTM) can learn an analog of a recursive rule that allows it to generalize to unseen complex examples, but only under certain favorable conditions.
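
As a rough illustration of the setup described above, the sketch below (not the original model code; the class name, vocabulary handling, and hyperparameters are assumptions) shows an LSTM that reads an expression token by token and predicts its referent as a classification over a fixed set of entities.

```python
import torch
import torch.nn as nn

class ToyInterpreterLSTM(nn.Module):
    """Reads a tokenized expression and outputs a score for each candidate entity."""

    def __init__(self, vocab_size, num_entities, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, num_entities)

    def forward(self, token_ids):
        # token_ids: (batch, sequence_length) word indices
        embedded = self.embed(token_ids)
        _, (h_n, _) = self.lstm(embedded)
        # The final hidden state summarizes the whole expression
        return self.out(h_n[-1])

# Example usage (shapes only; the data itself is assumed):
# model = ToyInterpreterLSTM(vocab_size=20, num_entities=10)
# logits = model(torch.randint(0, 20, (8, 5)))  # 8 expressions of 5 tokens each
```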

The languages in our experiments contain names for entities (e.g. Ann) and functions on entities (e.g. parent). The target language features either a head-initial construction (the parent of Ann) or a head-final construction (Ann's parent), which give rise to right-branching or left-branching recursive structures, respectively. Each expression of the language refers to an entity. Identifying it correctly requires knowledge of the recursive structure, since recombining the same words (Ann's friend's child vs. Ann's child's friend) yields different interpretations.
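
To make the compositional semantics concrete, the following sketch interprets head-final (left-branching) expressions against an invented toy model world; the particular entities and the friend/child mappings are illustrative assumptions, not data from the experiments.

```python
# Toy model world: (function, argument) -> value.
world = {
    ("friend", "Ann"): "Bob",
    ("child", "Ann"): "Carol",
    ("child", "Bob"): "Dave",
    ("friend", "Carol"): "Eve",
}

def interpret_head_final(expr):
    """Interpret a left-branching expression such as "Ann 's friend 's child"."""
    tokens = expr.split()
    referent = tokens[0]          # innermost entity name
    for tok in tokens[1:]:
        if tok != "'s":           # each function applies to the current referent
            referent = world[(tok, referent)]
    return referent

print(interpret_head_final("Ann 's friend 's child"))  # -> Dave
print(interpret_head_final("Ann 's child 's friend"))  # -> Eve
```

The two calls use the same words in a different structural arrangement and therefore denote different entities, which is exactly what the network has to capture.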

The LSTM does learn to interpret unseen complex examples such as Ann's child's friend correctly, which means the network implicitly masters the recursive rule involved. Moreover, like a human language learner, it generalizes from a small amount of input data to examples more complex than those seen at training time. On the other hand, the LSTM shows limitations in its generalization capacity: training data have to be presented in a particular order and quantity, and perfect generalization is achieved only for left-branching recursion.