Problems with Top-Down Generators

Next: Problems with Bottom-up Up: Problems with Existing Previous: Problems with Existing

Problems with Top-Down Generators

Consider a naïve top-down generation mechanism that takes as input the semantics to generate from and a corresponding syntactic category and builds a complete tree top-down, left-to-right by applying rules of the grammar nondeterministically to the fringe of the expanding tree. This control regime is realized, for instance, when running a DCG ``backwards'' as a generator.

Clearly, such a generator may not terminate. For example, consider a grammar that includes the rule

(The intention is that VPs like, say, ``loves Mary'' be associated with a nonterminal vp(X)/love(X, mary).) Once this rule is applied to the goal s/love(john, mary), the subgoal np/NP will be considered. But the generation search space for that goal is infinite and so has infinite branches, because all noun phrases, and thus arbitrarily large ones, match the goal.This is an instance of the general problem known from logic programming that a logic program may not terminate when called with a goal less instantiated than what was intended by the program's designer. Dymetman and Isabelle [4], noting this problem, propose allowing the grammar-writer to specify a separate goal ordering for parsing and for generation. For the case at hand, the solution is to generate the VP first-from the goal vp(NP)/loves(john, mary)-in the course of which the variable NP will become bound so that the generation from np/NP will terminate. Wedekind [21] achieves this goal by expanding first nodes that are connected, that is, whose semantics is instantiated. Since the NP is not connected in this sense, but the VP is, the latter will be expanded first. In essence, the technique is a kind of goal freezing [2] or implicit wait declaration [12]. For cases in which the a priori ordering of goals is insufficient, Dymetman and Isabelle also introduce goal freezing to control expansion.

Although vastly superior to the naïve top-down algorithm, even this sort of amended top-down approach to generation based on goal freezing under one guise or another fails to terminate with certain linguistically plausible analyses. For example, the ``complements'' rule given by Shieber [pages 77-78]shieber-introduction in the PATR-II formalism

can be encoded as the DCG-style rule:

Top-down generation using this rule will be forced to expand the lower VP before its complement, since Compl is uninstantiated initially. But application of the rule can recur indefinitely leading to nontermination.

The problem arises because there is no limit to the size of the subcategorization list. Although one might propose an ad hoc upper bound for lexical entries, even this expedient may be insufficient. In analyses of Dutch cross-serial verb constructions [9][5] subcategorization lists such as these may be appended by syntactic rules [16][20][11] resulting in indefinitely long lists. Consider the Dutch sentence

The string of verbs is analysed by appending their subcategorization lists as follows:

Subcategorization lists under this analysis can have any length, and it is impossible to predict from a semantic structure the size of its corresponding subcategorization list merely by examining the lexicon.

In summary, top-down generation algorithms, even if controlled by the instantiation status of goals, can fail to terminate on certain grammars. The case given above, reminiscent of analyses from HPSG, as well as analyses from categorial unification grammar are examples in which the well-foundedness of the generation process resides in lexical information unavailable to top-down regimes.

Next: Problems with Bottom-up Up: Problems with Existing Previous: Problems with Existing

Gertjan van Noord
Thu Nov 24 18:39:44 MET 1994