From partial neural graph-based LTAG parsing towards full parsing
Tatiana Bladier, Jörg Hendrik Janke, Jakub Waszczuk and Laura Kallmeyer


In the present work we address parsing with Lexicalized Tree Adjoining Grammar [LTAG; 3] as dependency parsing combined with supertagging. We show that predicting 1-best supertags and a TAG-driven dependency tree only allows for partial parsing. We demonstrate that the full parsing step is challenging and propose an architecture which uses n-best supertags and k-best dependency arcs to produce a full parse.

TAG parsing algorithms which include an intermediate supertagging step have been shown to improve both parsing speed and computational cost [2, 6]. However, using only the 1-best supertag is not sufficient for full parsing due to the high percentage of erroneously predicted supertags. Experiments with a larger number of n-best supertags lead to a large number of parses per sentence, resulting from lexical and attachment-site ambiguities. Kasai et al. [4, 5] address these issues with a neural dependency-based architecture which jointly predicts supertags and TAG-compliant dependency trees. However, this architecture only allows for partial parsing. The step towards a full parse is difficult, since a complete LTAG derived tree can only be produced if the model predicts mutually compatible arcs and supertags.
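To make the compatibility constraint concrete, the following Python sketch shows one way an arc between two tokens can be checked against their supertags: the dependent either substitutes into an open slot of the head's elementary tree, or it is an auxiliary tree that adjoins to a node of the head's tree with a matching category. The Supertag representation and the arc_is_compatible test are hypothetical simplifications for illustration, not the data structures of the actual system.

```python
from dataclasses import dataclass, field

@dataclass
class Supertag:
    """Hypothetical, flattened stand-in for an LTAG elementary tree."""
    root: str                                        # category of the root node
    subst_slots: list = field(default_factory=list)  # open substitution nodes
    node_labels: frozenset = frozenset()             # categories of internal nodes
    is_auxiliary: bool = False                       # True for auxiliary (adjoining) trees

def arc_is_compatible(head_tag: Supertag, dep_tag: Supertag) -> bool:
    """An arc head -> dep is licensed if the dependent's tree substitutes
    into an open slot of the head's tree, or is an auxiliary tree that can
    adjoin to a node of the head's tree with a matching category."""
    if not dep_tag.is_auxiliary:
        return dep_tag.root in head_tag.subst_slots               # substitution
    return dep_tag.root in head_tag.node_labels | {head_tag.root}  # adjunction

# Example: a transitive verb tree licenses its NP arguments by substitution
# and a VP-recursive adverb tree by adjunction.
verb = Supertag("S", subst_slots=["NP", "NP"], node_labels=frozenset({"VP", "V"}))
noun = Supertag("NP")
adv  = Supertag("VP", is_auxiliary=True)
assert arc_is_compatible(verb, noun)  # NP fills an open substitution slot
assert arc_is_compatible(verb, adv)   # VP auxiliary adjoins at the VP node
```

A test of this kind allows supertag/arc combinations to be pruned early, before any attempt to assemble a full derived tree.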

To address the challenges discussed above, we adapt the A∗-parsing algorithm for TAG and propose an architecture which uses n-best supertags and k-best arc outputs to produce full parse trees. We show that this architecture allows for efficient full TAG parsing while remaining sufficiently accurate. We test our architecture on LTAG grammars extracted from the French Treebank [FTB; 1].
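As a rough illustration of how n-best supertags and k-best arcs can be combined in an A∗-style search, the sketch below selects one supertag and one head per token so as to maximise the summed log-probabilities, pruning arcs whose endpoint supertags fail a compatibility test like the one sketched above. It uses an admissible heuristic (the best achievable remaining per-token score), so the first complete assignment popped from the agenda is optimal under this simplified model. All names and the scoring scheme are illustrative assumptions, and the sketch deliberately omits the chart-based machinery of a full A∗ TAG parser.

```python
import heapq
import itertools

def astar_assign(supertag_nbest, arc_kbest, compatible):
    """supertag_nbest[i]: n-best list of (tag, logprob) for token i.
    arc_kbest[i]: k-best list of (head_index, logprob) for token i.
    compatible(head_tag, dep_tag): licensing test for an arc."""
    n = len(supertag_nbest)
    # Admissible heuristic: best achievable combined score per remaining token.
    best = [max(s for _, s in supertag_nbest[i]) +
            max(s for _, s in arc_kbest[i]) for i in range(n)]
    suffix = [0.0] * (n + 1)
    for i in reversed(range(n)):
        suffix[i] = suffix[i + 1] + best[i]
    tie = itertools.count()  # tie-breaker so the heap never compares assignments
    # Agenda items: (-(g + h), tie, next token index, chosen (tag, head) list).
    agenda = [(-suffix[0], next(tie), 0, [])]
    while agenda:
        neg_f, _, i, chosen = heapq.heappop(agenda)
        if i == n:
            return chosen  # first complete item is optimal (h is admissible)
        g = -neg_f - suffix[i]  # accumulated score so far
        for tag, s_tag in supertag_nbest[i]:
            for head, s_arc in arc_kbest[i]:
                # Arcs to an already-tagged head are checked for compatibility;
                # arcs to later heads are deferred in this simplified sketch.
                if head < i and not compatible(chosen[head][0], tag):
                    continue
                g2 = g + s_tag + s_arc
                heapq.heappush(agenda, (-(g2 + suffix[i + 1]), next(tie),
                                        i + 1, chosen + [(tag, head)]))
    return None  # no compatible full assignment among the candidates
```

Because the heuristic never underestimates the best completion, expanding items in decreasing f-order guarantees that the first fully assigned item found is the highest-scoring compatible combination of the candidate supertags and arcs.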