next up previous contents
Next: Conclusion Up: Discussion and Extensions Previous: Delaying the extra constraint


Memo relations

A possible criticism of the head-driven parser presented here can be formulated as follows. The backtracking search procedure forces the parser to re-compute possible sub-parses. One of the fundamental principles of parsing, on the other hand, is that you should not do things twice. Hence the head-corner parser does not even adhere to this fundamental principle of parsing.

We should make a methodological distinction between techniques that make the search space as small as possible on the one hand, and techniques that search through the search space as efficiently as possible on the other. The proposed algorithm focuses on ways to make the search space as small as possible. Using tabular methods to store previous results, in order to prevent double work, constitutes a different dimension along which parsing algorithms should be judged. Techniques to search a given search space can then be `applied' to a given proof procedure. In the current setting this is possible through the introduction of memo-relations. A similar point of view is defended by [54] in the context of LR parsers.

In the section on memo-relations in chapter 3 I presented a Prolog implementation of so-called `memo-relations' (the name is inspired by the concept of `memo-functions' in functional programming languages). This technique for re-using previously established computation results can be applied to the head-driven parser for TAGs as well. For the (small) example grammars I have tested the parser with, however, the technique did not seem to be practically useful.
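The thesis's implementation is in Prolog; the following is a minimal Python sketch of the same memo-relation idea (all names here are illustrative, not the thesis code): solutions for a goal are stored the first time the goal is proved, and later occurrences of the same goal return the stored solution instead of recomputing it.

```python
# Minimal sketch of a memo-relation: a table of previously proved goals.
table = {}
computed = []  # record which goals actually had to be proved

def memo(goal, prove):
    """Return the stored solution for goal, proving it only on first use."""
    if goal not in table:
        computed.append(goal)
        table[goal] = prove(goal)
    return table[goal]

# A stand-in "prover" (squaring a number takes the place of a sub-parse).
results = [memo(n, lambda g: g * g) for n in (3, 4, 3, 3)]
print(results)   # [9, 16, 9, 9]
print(computed)  # [3, 4]: each distinct goal is proved only once
```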

For the head-driven parser this technique may not be as useful, for the following reason. We might expect that a given noun phrase occurring in the input string is parsed only once for a given NP goal. In fact, however, this noun phrase will be parsed several times for that goal, because the input bag of words may be different each time.

For example, if the sentence to be parsed is ``the man left yesterday'', the memo'ed relation distinguishes the goal np, [s(0),s(0),s(0),0] (for ``the man left'') from the goal np, [s(0),s(0),s(0),s(0)] (for ``the man left yesterday''). This is clearly correct, but less efficient than we might have expected at first.
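This effect can be made concrete in a Python sketch (a toy stand-in for the Prolog relation, with a hypothetical parse_np; the bag is a tuple of per-word counts, mirroring the successor-notation lists [s(0),s(0),...] of the text). The same noun phrase ``the man'' is re-parsed because the two goals carry different bags.

```python
# Memo table keyed on the goal category together with the input bag.
memo = {}
calls = []  # goals that were actually computed (not found in the table)

def parse_np(bag):
    """Toy NP 'parser': succeeds by consuming the first two words ("the man")."""
    key = ('np', bag)
    if key in memo:
        return memo[key]
    calls.append(key)
    if bag[0] > 0 and bag[1] > 0:
        result = (bag[0] - 1, bag[1] - 1) + bag[2:]  # bag with "the man" removed
    else:
        result = None  # NP not derivable from this bag
    memo[key] = result
    return result

# (1,1,1,0) stands for [s(0),s(0),s(0),0]   ("the man left")
# (1,1,1,1) stands for [s(0),s(0),s(0),s(0)] ("the man left yesterday")
out1 = parse_np((1, 1, 1, 0))
out2 = parse_np((1, 1, 1, 1))
# Both goals parse the same noun phrase, yet the table keeps them apart
# because the bags differ:
print(len(calls))  # 2
```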

A possible way to improve upon this is the following. Each time a result has been found, we may generalize it before it is asserted in the database. For example, if we have shown an np with [s(0),s(0),s(0),s(0)] - [0,0,s(0),s(0)], we may generalize this to the result [s(W),s(X),Y,Z] - [W,X,Y,Z]. The problem is that our assumption that no left-recursion arises then becomes problematic. That is, we can no longer be sure that no duplicate solutions are asserted in the database. Of course, it is possible to check before each assertion whether such a result already exists. In that case, however, we need to do subsumption checking on the results of the parser, rather than on the goals. This implies much more overhead than before.
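The generalization step can be sketched as follows (again a Python stand-in for the Prolog terms, with illustrative names): a concrete result In - Out is reduced to the per-word consumption pattern it encodes, which is exactly what the generalized term [s(W),s(X),Y,Z] - [W,X,Y,Z] expresses, and that pattern can then be replayed on any bag that has the required words available.

```python
def generalize(bag_in, bag_out):
    """Keep only the consumption pattern of a concrete result In - Out."""
    return tuple(i - o for i, o in zip(bag_in, bag_out))

def apply_general(delta, bag_in):
    """Replay a generalized result on a new bag; None if a word is missing."""
    out = tuple(b - d for b, d in zip(bag_in, delta))
    return out if all(c >= 0 for c in out) else None

# Concrete result: np with (1,1,1,1) - (0,0,1,1), i.e. "the man" consumed.
delta = generalize((1, 1, 1, 1), (0, 0, 1, 1))
print(delta)                               # (1, 1, 0, 0)
# The generalized entry now also covers the shorter bag "the man left":
print(apply_general(delta, (1, 1, 1, 0)))  # (0, 0, 1, 0)
# ... but fails where a required word is unavailable:
print(apply_general(delta, (0, 1, 1, 1)))  # None
```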


Noord G.J.M. van
1998-09-30