Unpacking and Parse Selection.

In order to select the best parse tree from a parse forest, we assume a parse evaluation function which assigns a score to each parse. In section 6 we describe some initial experiments with a variety of parse evaluation functions. A naive algorithm constructs all possible parse trees, assigns each one a score, and then selects the best one. Since it is too inefficient to construct all parse trees, we have implemented the algorithm which computes parse trees from the parse forest as a best-first search. This requires that the parse evaluation function is extended to partial parse trees. In order to be able to guarantee that this search procedure indeed finds the best parse tree, a certain monotonicity requirement should apply to this evaluation function: if a (partial) tree s is better than s', then a tree t which contains s should be better than t' which is just like t except it has s' instead of s. However, instead of relying on such a requirement, we implemented a variant of a best-first search algorithm in such a way that for each state in the search space, we maintain the b best candidates, where b is a small integer (the beam). If the beam is decreased, then we run a larger risc of missing the best parse (but the result will typically still be a relatively `good' parse); if the beam is increased, then the amount of computation increases too. Currently, we find that a value of b=4 is a good compromise between accuracy and efficiency. In table 2 the effect of various values for b is presented for two development treebanks. The grammar assigns on average about 33 parse trees per sentence for the cdbl10 corpus. This number increases rapidly for longer sentences: for the cdbl20 corpus it is at least 340.⁵

Table 2: Effect of beam-size on accuracy and efficiency of parse selection

beam	cdbl10		cdbl20
	accuracy (%)	speed (msec)	accuracy (%)	speed (msec)
1	79.99	190	73.63	740
2	80.66	270	74.59	1470
4	81.11	350	75.07	2350
8	81.22	530	75.35	3630
16	81.36	590	75.31	5460
32	81.36	790	74.98	7880
$\infty$	81.36	640	-	-