Selective Memoization and Goal-weakening

In the head-corner parser it turns out that the parse/5 predicate is a very good candidate for memoization. The other predicates are not. This implies that each maximal projection is computed only once; partial projections of a head can be constructed during a parse any number of times, as can sequences of categories (considered as sisters to a head). Active chart parsers `memo' everything (including sequences of categories); inactive chart parsers only memo categories, but not sequences of categories. In our proposal, we memo only those categories that are `maximal projections', i.e. projections of a head which unify with the top category (start symbol) or with a non-head daughter of a rule.

The implementation of memoization uses Prolog's internal database to store the tables. The advantage of this technique is that we use Prolog's first argument indexing for such tables. Moreover, during the consultation of the table we need not worry about modifications to it (in contrast to an approach in which the table would be maintained as the value of a Prolog variable). On the other hand, the use of the internal database brings about a certain overhead. Therefore, it may be worthwhile to experiment with a meta-interpreter along the lines of the XOLDT system [32] in which the table is maintained dynamically.

Memoization is implemented by two different tables. The first table encodes which goals have already been searched. Items in the first table are called goal items. The second table represents all solved (i.e. instantiated) goals. Items in this second table are called result items. One may be tempted to use only the second table. But in that case we would not be able to tell the difference between a goal which has already been searched, but did not result in a solution (`fail-goal') and a goal which has not been searched at all. If we have two tables then we can also immediately stop working on branches in the search space for which it has already been shown that there is no solution. The distinction between these two kinds of item is inherited from BUP [11]. The memoized version of the parse predicate can be defined as in (3.1).

$\begin{figure}\begin{verbatim}parse(Cat,P0,P,E0,E) :- ( in_table1(Cat,P0,P,E0,E... ...done ) ), in_table2(Cat,P0,P,E0,E). % pick a solution\end{verbatim}\end{figure}$

The first table is represented by the predicate 'GOAL_ITEM'. This predicate simply consists of a number of unit-clauses indicating all goals that have been searched completely. Thus, before we try to attempt to solve Goal we first check whether a goal item for that goal already exists. Given the fact that Goal may contain variables we should be a bit careful here. Unification is clearly not appropriate here since that may result in a situation in which a more general goal is not searched because a more specific variant of that goal had been solved. We want exactly the opposite: if a more general version of Goal is included in the goal table, then we can continue to look for a solution in the result table. It is useful to consider the fact that if we had previously solved e.g. the goal parse(s,3,X,3,12), then if we later encounter the goal parse(s,3,Y,3,10), we can also use the second table immediately: the way in which the extreme positions are used ensures that the former is more general than the latter. The predicates for the maintenance of the goal table are defined in (3.1).

$\begin{figure}\begin{verbatim}in_table1(Cat,P0,P,E0,E) :- 'GOAL_ITEM'(Cat_d,P0_... ...Cat,P0,P,E0,E) :- assertz('GOAL_ITEM'(Cat,P0,P,E0,E)).\end{verbatim}\end{figure}$

The second table is represented by the predicate 'RESULT_ITEM'. It is defined by unit-clauses which each represent an instantiated goal (i.e. a solution). Each time a result is found, it is checked whether that result is already available in the table. If so, the newer result is ignored. If no (more general version of) the result existed, then the result is added to the table. Moreover, more specific results which may have been put on the table previously are marked. These results need not be used anymore. ⁴ This is not strictly necessary but is often useful because it decreases the size of the tables; in this approach tables are redundancy-free and hence minimal. Moreover, such more specific results cannot be used anymore and no work will be done based on those results. Note that RESULT_ITEMs do not keep track of the extreme positions. This implies that in order to see whether a RESULT_ITEM is applicable we check whether the interval covered by the RESULT_ITEM lies within the extreme positions of the current goal. The predicates dealing with the result table are defined in (3.1).

$\begin{figure}\begin{verbatim}in_table2(Cat,P0,P,E0,E) :- clause('RESULT_ITEM'... ... mark it fail % do this for all such items ; true ).\end{verbatim}\end{figure}$

The implementation uses a faster implementation of memoizating in which both goal items and result items are indexed by the functor of the category and the string positions.

In the head-corner parser, parse goals are memoized. Note that nothing would prevent us from memoing other predicates as well. Experience suggests that the cost of maintaining tables for e.g. the head_corner relation is (much) higher than the associated profit. The use of memoization for only the parse/5 goals implies that the memory requirements of the head-corner parser in terms of the number of items that is being recorded is much smaller than in ordinary chart parsers. Not only do we refrain from asserting so-called active items, but we also refrain from asserting inactive items for non-maximal projections of heads. In practice the difference in space requirements can be enormous. This difference is a significant reason for the practical efficiency of the head-corner parser.

The Occur Check

It turns out that the use of tables defined in the previous subsection can lead to a problem with cyclic unifications. If we assume that Prolog's unification includes the occur check then no problem would arise. But since most versions of Prolog do not implement the occur check it is worthwhile investigating this potential problem.

The problem arises because cyclic solutions can be constructed that would not have been constructed by ordinary SLD-resolution. Furthermore, these cyclic structures lead to practical problems because items containing such a cyclic structure may have to be put in the table. In SICStus Prolog this results in a crash.

An example may clarify the problem. Suppose we have a very simple program containing the following unit clause:

A possible approach to deal with this situation is to index the items of the second table with the item of the first table from which the solution was obtained. In other words: if you want to select a solution from the second table, it must not only be the case that the solution matches your goal, but also that the corresponding goal of the solution is more general than your current goal. This strategy works, but turns out to be considerably slower than the original version given above. The reason seems to be that the size of the second table is increased quite drastically, because solutions may now be added to the table more than once (for all goals that could give rise to that solution).

It turns out that an improvement of the head-corner parser using a goal weakening technique often eliminates this occur check problem. Goal weakening is discussed in the following subsection.

Goal Weakening

The insight behind `goal weakening' (or abstraction [7]) in the context of memoization is that we may combine a number of slightly different goals into a single more general goal. Very often it is much cheaper to solve this single (but more general) goal, than to solve each of the specific goals in turn. Moreover, the goal table will be smaller (both in terms of number of items, and the size of individual items), which can have a very good effect on the amount of memory and CPU-time required for the administration of the table. Clearly, one must be careful not to remove essential information from the goal (in the worst case this may even lead to non-termination of otherwise well-behaved programs).

Depending on the properties of a particular grammar, it may for example be worthwhile to restrict a given category to its syntactic features before we attempt to solve the parse goal of that category. Shieber's restriction operator [19] can be used here. Thus we essentially throw some information away before an attempt is made to solve a (memoized) goal. For example, the category

$\begin{figure}\begin{verbatim}parse_with_weakening(Cat,P0,P,E0,E) :- weaken(Cat... ...Cat), parse(WeakenedCat,P0,P,E0,E), Cat=WeakenedCat.\end{verbatim}\end{figure}$

Note that goal weakening is sound. An answer a to a weakened goal g is only considered if a and g unify. Also note that goal-weakening is complete in the sense that for an answer a to a goal g there will always be an answer a' to the weakening of g such that a' subsumes a.

For practical implementations the use of goal weakening can be extremely important. It is my experience that a well-chosen goal weakening operator may reduce parsing times by an order of magnitude.

The goal weakening technique can also be used to eliminate typical instances of the problems concerning the occur check (discussed in the previous subsection). Coming back to the example in the previous subsection, if our first goal

An important question is how to come up with a good goal weakening operator. For the experiments discussed in the final section all goal weakening operators were chosen by hand, based on small experiments and inspection of the goal table and item table. Even if goal-weakening is reminiscent of Shieber's restriction operator [19], the rules of the game are quite different: in the former case as much information as possible is removed without risking non-termination of the parser. In the latter case information is removed until the resulting parser terminates. For the current version of the grammar of OVIS, it turned out that weakening the goal category in such a way that all information below a depth of 6 is replaced by fresh variables eliminated the problem caused by the absence of the occur check; moreover this goal weakening operator reduced parsing times substantially. In the latest version we use different goal weakening operators for each different functor.

An interesting special case of goal-weakening is constituted by a goal-weakening operator which ignores all feature constraints, and hence only leaves the functor for each goal category. In this case the administration of the goal table can be simplified considerably (the table consists of ground facts, hence no subsumption checks are required). This technique is used in the MiMo2 grammar and the Alvey NL Tools grammar (both discussed in section 7).