The experiment

This section describes experimental results for the parsing algorithms discussed above, in comparison with some obvious alternative strategies. The experiment consists of two parts.

The first part of the experiment compares parsing strategies which proceed in a bottom-up fashion without the use of any top-down prediction. For CUG such parsers are suitable as no top-down information can be compiled from the rule schemata in a simple way.²It turns out that the head-driven bottom-up chart parser performs better than both an inactive and an active bottom-up chart parser, for a particular CUG for English. If the cost of unification is relatively high, the use of the head-driven chart parser pays off. If unification is cheap, then the inactive chart parser may still be the most efficient choice.

The second part of the experiment concentrates on the comparison between the head-corner parser and the left-corner parser. Both of these parsers proceed in a bottom-up fashion, but use important top-down prediction. Such parsers are interesting for grammars in which interesting top-down information can be extracted from the rule schemata. It can be concluded from the experiment that for a specific lexicalist Definite Clause Grammar for Dutch the head-corner parser performs much better than the left-corner parser.

These results indicate that at least for some grammars it is fruitful to apply parsing strategies which are sensitive to the linguistic notion `head'.

A CUG for English.

If the standard techniques for compiling a left-corner resp. a head-corner table are applied for this grammar, then, at best, the `trivial' link would result, because the rule schemata do not specify any interesting information about morphological features etc.

A lexicalist DCG for Dutch.

The compilation of the left-corner resp. the head-corner table was done using the same restrictor. The left-corner table contained 94 entries, and the head-corner table contained 25 entries.

The parsers.

Table 1: The parsers used in the experiment

	inact	hdc	act	lc	hc
well-formed substrings	+	+	+	+	+
packing	+	+	+	+	+
subsumption-checking	+	+	+	+	+
active items	-	+	+	+	-
left-to-right processing	+	+	+	+	-
top-down filtering	-	-	-	+	+
head-driven processing	-	+	-	-	+

Table 2: Results for the English grammar

		items			recognition			parsing
n	parses	hdc	inact	act	hdc	inact	act	hdc	inact	act
		#	%	%	sec	%	%	sec	%	%
3	1	25	67	170	.8	63	191	1.1	67	168
6	1	43	73	180	.9	87	199	1.4	90	164
9	2	89	74	179	2.5	91	208	3.6	94	179
12	3	141	75	181	4.0	102	211	6.2	101	175
15	4	193	79	184	5.5	111	214	8.2	109	180
18	6	254	82	184	7.0	124	215	10.8	113	175
21	32	369	84	181	10.9	135	224	30.0	117	147
24	98	452	86	181	13.7	140	225	87.0	106	120
27	55	472	87	185	14.1	142	233	29.7	119	164
30	95	592	87	179	19.9	144	218	172.7	107	120

Table 3: Results for the Dutch grammar. For parsers which did not succeed within a given period, the entry in the table has not been filled in.

		recognition					parsing
n	parses	hc	hdc	lc	act	inact	hc	hdc	lc	act	inact
		sec	%	%	%	%	sec	%	%	%	%
3	1	.3	2647	80	2804	390	.5	1699	79	1759	428
6	2	.8	5407	343	5968	1044	1.6	3698	215	4265	1300
9	6	1.2		550		1170	2.9		334		1474
12	5	2.0		428		2333	3.8		285
15	9	3.1		355		2521	6.7		210
18	16	5.1		248		2408	10.7		160
21	20	7.4		195		1918	15.3		127
24	23	10.2		147			19.8		104
27	61	13.8		209			34.3		131
30	87	17.3		145			62.4		102

The parsers used in the experiment have a number of important properties in common (see table 1). First of all, they all use a chart to represent (partially or fully developed) analyses of substrings. Second, as categories are feature-structures or terms, rather than atomic symbols, special requirements are needed to ensure that the chart is always `minimal'. That is, items are only added to the chart if no subsuming item exists, and, if an item is added to the chart, all more specific items are deleted from the chart. Finally, information about the derivational history of phrases is added to the chart in such a way that parse-trees can be recovered. This is done by using `packed structures' (also called `parse-forests') to obtain structure sharing in the case of ambiguities; semantic constraints (if present) are only evaluated when the syntactic analysis phase is completed. Our implementation of `packing' follows that of [5], who implement it for a (unification-based) left-corner parser.

Three different bottom-up chart parsers are implemented. The first one (hdc) is the head-driven chart parser presented above, in which the head of the rule is given by the grammar writer. The active chart parser (act) is the same as the head-chart parser, but now it is assumed that for each rule the left-most daughter is the head (active chart). The inactive chart parser (inact) is a version of the head-corner parser where each right-most daughter is assumed to be the head of the rule. Since the parser does not use active items, some (slight) simplifications of the head-driven chart parser were possible.

The left-corner parser is a generalized version of the chart-based left-corner parser of [6]. As we also add items to construct parse-trees using `packing', the resulting parser should be comparable to the CLE parser [5]. The head-corner parser is the parser discussed in the previous section.³

Results for CUG.

One hundred arbitrarily chosen sentences (10 of length 3, 10 of length 6, etc.) were parsed, using the three pure bottom-up parsers (hdc, inact, and act). The columns in table 2 give, for each sentence length (column 1), the average number of readings (column 2), the average number of items produced by hdc, and the average percentage of items produced by inact and act, when compared with hdc (columns 3-6), the average time it took hdc to parse a sentence without recovering the different analyses and the average percentage of time needed for inact and act to do that (columns 7-9), and finally the average time it took to parse a sentence and recover all analysis trees for hdc and the average percentage of time needed by inact and act to do that.

The number of chart items illustrate clearly that hdc combines features of an inactive chart parser with that of an active chart-parser. Note that, in spite of the fact that English is mostly a head-initial language, act produces 80% more items than hdc, whereas inact almost produces 80% of the items produced by hdc. For languages which are predominantly head-final, the difference between act and hdc will probably be larger, whereas that between inact and hdc should be smaller.

The recognition times show that an active bottom-up chart parser is two-times slower for this grammar than a head-driven chart parser. The difference between the inactive chart parser and the head-driven parser is less extreme, and is notably in favor of the head-driven parser only for relatively long and complex (in terms of number of analyses) sentences. Nevertheless, the difference is of enough significance to establish the superiority of a head-driven strategy in this case.

The final three columns show that if recovery of parse trees is taken into account as well, the differences are much less extreme. The reason for this difference is simply that recovery (for which we used an Earley-style top-down algorithm which reconstructs explicit analysis trees on the basis of inactive items) may take up to eight times as long as doing parsing without recovery. Since the amount of time needed for recovery is (approximately) equal for all three parsers, this explains why the relative differences are much smaller in this case.

The head-corner parser was applied to the same grammar and sentence set as well. It behaves much worse (up to 100 times as slow for recognition of 24-words sentences) than the parsers listed in the tables due to the lack of guiding top-down information. The left-corner parser without top-down prediction reduces to the active chart parser.

We also applied the same sentence set to a compiled version of the same CUG. In this compiled version first-order terms were used, rather than feature structures. Furthermore, we used ordinary Prolog unification on such terms rather than the previously mentioned feature unification including occurs check. This implied that we had to forbid multiple extractions in the compiled version of the grammar. Experiments indicate that in such cases the inactive chart parser performs consistently better than both the head-driven chart parser and the active chart parser. This should not come as a surprise given the discussion in section 2.1 where we expected the head-driven chart parser to be useful for grammars with an `expensive' unification operation.

Results for the DCG.

The head-corner parser improved with a well-formed substring table and packing beats the bottom-up chart parsers. This is explained by the fact that these parsers proceed strictly bottom-up, whereas the left-corner and head-corner parser employ both top-down and bottom-up information. The top-down information is available through a left-corner resp. head-corner table, which turn out to be quite informative for this grammar.

Table 4: Comparison of the size of the parse forest for the left-corner and head-corner parser for a few (longer) sentences.

	hc				lc
# parses	items	recognition	recovery	total	items	recognition	recovery	total
	#	sec	sec	sec	#	sec	sec	sec
26	768	12	12	24	503	33	8	41
100	1420	20	37	57	831	43	29	72
30	543	9	10	19	430	20	8	28

The head-corner parser performs considerably better than the left-corner parser on average, especially if we only take the recognition phase into account. For longer sentences the differences are somewhat less extreme than for shorter sentences. This difference is due to the fact that the left-corner parser seems somewhat better suited for grossly ambiguous sentences. Furthermore, the number of items used for the representation of parse trees is not the same for the left-corner and head-corner parser. For ambiguous sentences the head-corner parser produces more useless items, in the sense that such items can never be used for the construction of an actual parse tree. As a consequence, it is more expensive to recover the parse trees based on this representation, than it is for the recovery of parse trees based on the smaller representation built by the left-corner parser. A few numbers for three typical (long) sentences are shown in table 4.

This is a somewhat puzzling result. Useless items are asserted only in case the parser is following a dead-end. However, the fact that the number of useless items is larger for the head-corner parser than for the left-corner parser implies that the head-corner parser follows more dead-ends, yet the head-corner parser is much faster during the recognition phase. A possible explanation for this puzzling fact may be the overhead involved in keeping track of the active items in the left-corner parser whereas no active items are asserted for the head-corner parser. Clearly for grammars with rules that contain many daughters (unlike the grammar under consideration) the use of active items may start to pay off.

Note that we also implemented a version of the head-corner parser that asserts less useless items by delaying the assertion of items until a complete head-corner has been found. However, given the fact that this technique leads to a more complex implementation of the memo-ization of the head-corner relation, it turned out that this immediately leads to longer recognition times, and an overall worse behavior.