Next: The experiment Up: Head-driven Parsing for Lexicalist Previous: Introduction

Two Head-driven Parsers

**Figure 1:** The head-corner parser.
$\begin{figure} \par\begin{picture} (120,100)(-150,0) \multiput(0,0)(5,10){12}{... ...t(0,0){\line(1,0){120}} \put(120,0){\line(-1,2){60}} \end{picture}\end{figure}$

In this section we present two head-driven parsing algorithms. Prolog code for simplifications of the algorithms is included in the appendix. For each grammar rule LHS $\rightarrow$ D₁,..., D_h,..., D_n, it is assumed that there is one daughter D_h which has been identified (by the grammar writer) as the head of that rule.

Head-driven Chart Parsing

The head-driven chart parser scans a sentence from left to right, storing items representing (partial) derivations in a chart. Items are of the form item(Cat,ToParse,BeginPos,EndPos). If ToParse is empty, the item is inactive, otherwise it is active. The parser is a bottom-up active chart parser without prediction, in which the addition of an active item based on a rule R is considered whenever an inactive item H is entered into the chart which matches the head of R. More precisely, if item(Cat,[ ],B,E) is derived, and there is a rule LHS $\rightarrow$ D₁,...,D_{h - 1},Cat,D_{h + 1},...,D_n and there are inactive items matching D₁...D_{h - 1}, ranging from B₀ to B, an item(LHS,D_{h + 1}...D_n,B₀,E) is added to the chart.

If the leftmost daughter of each grammar rule is the head of the rule, then the head-driven chart parser reduces to an ordinary bottom-up active chart parser. If the rightmost daughter of each rule is the head, then the head-driven chart parser reduces to an inactive bottom-up chart parser (i.e. a breadth-first implementation of a shift-reduce parser).

The head-driven strategy has a potential advantage over active bottom-up chart parsers, as it will assert substantially less active items for grammars that contain rules with an underspecified leftmost daughter (as in rule 1). In particular it avoids entering active items into the chart for which it is clear that the missing daughters cannot be derived.

The head-driven parser also has a potential advantage over inactive bottom-up chart parsers for grammars that contain rules with an underspecified rightmost daughter. An inactive chart parser must search in the chart for items matching the remaining daughter of such a rule each time an arbitrary category is derived. The head-driven parser on the other hand only needs to search for matching active items. The difference may lead to important efficiency improvements, especially if searching the chart is expensive. For example this is the case if the unification operation is expensive.

Head-corner Parsing

Head-corner parsing is a more radical approach to head-driven parsing in that it gives up the idea that parsing should proceed form left to right. Rather, the order of processing in a head-corner parser is bidirectional, starting from a head outward (`island'-driven). A head-corner parser can be thought of as a generalization of the left-corner parser [6]. As in the left-corner parser, the flow of information in a head-corner parser is both bottom-up and top-down.

The basic idea of the head-corner parser is illustrated in figure 1. The parser selects the head of the string (1), and proves that this element is the head-corner of the goal. To this end, a rule is selected of which this lexical entry is the head daughter. Then the other daughters of the rule are parsed recursively in a bidirectional fashion: the daughters left of the head are parsed from right to left (starting from the head), and the daughters right of the head are parsed from left to right (starting from the head). The result is a slightly larger head-corner (2). This process repeats itself until a head-corner is constructed which dominates the whole string (3).

Note that a rule is triggered only with a fully instantiated head-daughter. The `generate-and-test' behavior observed for example 1 is avoided in a head-corner parser, because the rule is applied only if the VP is found, and hence Arg is instantiated. For example if Arg = np(sg3,[ ],Subj), the parser continues to search for a singular NP, and need not consider other categories.

The head-relation holds between two categories h and m with respect to a grammar G iff G contains a rule with left hand side m and head daughter h. The relation `head-corner' is the reflexive and transitive closure of the head relation. As in the left-corner parser, a `linking' table is maintained which represents important aspects of this head-corner relation. For some grammars this table simply represents the fact that the HEAD features of a category and its head-corner are shared.

Note that unlike the left-corner parser, the head-corner parser may need to consider alternative words as a possible head-corner of a phrase, e.g. when parsing a sentence which contains several verbs. This problem is reduced because of the following three observations.

The Quicksort Effect.

A simplified version of the head-corner parser is provided in the appendix. The main difference with a simple version of the left-corner parser is -- apart from the head-driven selection of rules -- the use of two pairs of indices, to implement the bidirectional way in which the parser proceeds through the string.

Observe that each parse-goal in the left-corner parser is provided with a category and a left-most position. In the head-corner parser a parse-goal is provided either with a begin or end position (depending on whether we parse from the head to the left or to the right) but also with the extreme positions between which the category should be found. In general the parse predicate is thus provided with a category and two pair of indices. The first pair indicates the begin and end position of the category, the second pair indicates the extreme positions between which the first pair should lay. The following figure illustrates this point with an example:

$\begin{picture} (300,360)(0,-20) \put(70,20){\line(1,0){230}} \put(70,20){\line(... ...\makebox(0,0){$s$}} \multiput(157.5,273)(0,5){8}{\makebox(0,0){.}} \end{picture}$

Suppose we found for a goal category s a possible head-corner v from position 5 to 6. In order to construct a complete tree s for this head-corner, a rule is selected which dictates that a category np should be parsed to the right, starting from position 6. To parse np, we predict the head-corner n between 7 and 8. Suppose furthermore that in order to connect n to np a rule is selected which requires a category adjp to the left of n. It will be clear that this category should end in position 7, but can never start before position 6. Hence the only candidate head-corner of this phrase is to be found between 6 and 7. This example illustrates that the use of two pairs of string positions reduces the number of possible head-corners for a given goal.

String positions in head-corner table.

Secondly, the head-corner table includes information about begin and end positions, following an idea in [9]. For example, if the goal is to parse a phrase with category sbar from position 7, and within positions 7 and 12, then for some grammars it can be concluded that the only possible head-corner for this goal should be a complementizer starting at position 7. Such information is compiled into the table as well. Hence the number of possible head-corners is reduced.

Well-formed substring tables.

Thirdly, the problem of multiple possible heads is reduced because a well-formed substring table is maintained. This is implemented by a memo-ization technique. This reduces the problem because even if the wrong head-corner is predicted for a given goal, it may turn out to be the case that the computations based on this wrong prediction may be useful later (each lexical category usually is the head of some projection).

The well-formed substring table is implemented using an interesting generalization of the subsumption relation. A goal need not be investigated anymore if a more general goal has already been completed. It is easy to see that a certain goal with extreme positions 3 to 6 is more general than an otherwise identical goal with extreme positions 4 and 6.

Head-driven vs. functor-driven parsing.

For categorial unification grammars in which we choose the functor as the head of a rule, the head-corner table is not going to be discriminating, because the grammar rules in such a grammar may simply be (in DCG notation, given appropriate operator definitions): ¹

Val $\displaystyle \rightarrow$ Val/Arg,ArgVal $\displaystyle \rightarrow$ Arg,Arg\Val

(5)

As no information about word-class or morphology is stated in the rules, such information will not be found in the head-corner table.

A possibly useful approach here is to compile some lexical information into the rule set, along the lines proposed in [1]. In that paper it is proposed to compile lexical information into the rule-set, and parse with this `enriched' rule-set. What seems to be most useful here, is to use this enriched grammar only for the compilation of the head-corner table. The parser then uses the general rule schemata themselves.

However, given the usual analysis of modifiers as functors, even this approach may fail to yield an interesting head-corner table. Note that some analyses in categorial grammar prescribe that even in such cases certain morphological features are shared between the functor and its resulting value [2].

Comparison

The important differences between both head-driven parsing algorithms can be summarized as follows (see also table 1). Firstly the head-driven chart parser proceeds from left-to-right as usual, whereas the head-corner parser proceeds bidirectionally. Secondly, the head-driven chart parser is an active chart parser (i.e. it also stores partial analyses of phrases); the head-corner parser uses memo-ization of the parse predicate and the head-corner predicate (i.e. it only stores complete analyses of phrases, and partial analyses of head-corners).

We also implemented an active head-corner chart parser along the lines of [9], but preliminary experiments indicate that (our implementation of) this parser is not useful for the grammars used in the experiments to be discussed in the next section. Note that it is not possible to incorporate top-down filtering in the head-driven chart parser in a simple way, because the necessary active items may not be available yet.

Thirdly, although in both algorithms the way rules are applied is bottom-up in an important sense, there is an important flow of information in top-down direction in the head-corner parser. For grammars in which the head-corner table is discriminating, this should have important effects in practice. This expectation is confirmed in the experiments discussed in the next section.

Next: The experiment Up: Head-driven Parsing for Lexicalist Previous: Introduction

Noord G.J.M. van
1998-09-25