A DCG Implementation

Next: An Example Up: The New Algorithm Previous: The New Algorithm

A DCG Implementation

To make the description more explicit, we will develop a Prolog implementation of the algorithm for DCGs, along the way introducing some niceties of the algorithm previously glossed over.

In the implementation, a term of the form node(Cat, P0, P) represents a phrase with the syntactic and semantic information given by Cat starting at position P0 and ending at position P in the string being generated. As usual for DCGs, a string position is represented by the list of string elements after the position. The generation process starts with a goal category and attempts to generate an appropriate node, in the process instantiating the generated string.

To generate from a node, we nondeterministically choose a non-chain rule whose left-hand side will serve as the pivot. For each right-hand-side element, we recursively generate, and then connect the pivot to the root.

The processing within generate_rhs is a simple iteration.

The connection of a pivot to the root, as noted before, requires choice of a chain rule whose semantic head matches the pivot, and the recursive generation of the remaining right-hand-side. We assume a predicate applicable_chain_rule(SemHead, LHS, Root, RHS) that holds if there is a chain rule admitting a node LHS as the left-hand-side, SemHead as its semantic head, and RHS as the remaining right-hand-side nodes, such that the left-hand-side node and the root node Root can be themselves connected.

The base case occurs when the root and the pivot are the same. One must be careful that all identity checks like this one be implemented correctly in the generator by using a sound unification algorithm with the occurs check. (The default unification in most Prolog systems is unsound in this respect.) For example, a grammar with a gap-threading treatment of wh-movement [14][13] might include the rule

stating that an NP with agreement Agr and semantics Sem can be empty provided that the list of gaps in the NP can be represented as the difference list [np(Agr)/Sem|X]-X, that is the list containing an NP gap with the same agreement features Agr [page 128]pereira-shieber. The above rule is a non-chain rule and so it will be considered when trying to generate any nongap NP, such as the proper noun np(3-sing,G-G)/john. The base case of connect will try to unify that term with the head of the rule above, leading to the attempted unification of X with [np(Agr)/Sem|X], an occurs-check failure. The base case, incorporating the explicit call to a sound unification algorithm is thus as follows:

Now, we need only define the notion of an applicable chain or non-chain rule. A non-chain rule is applicable if the semantics of the left-hand-side of the rule (which is to become the pivot) matches that of the root. Further, we require a top-down check that syntactically the pivot can serve as the semantic head of the root. For this latter purpose, we assume a predicate chained_nodes that codifies the transitive closure of the semantic head relation over categories. This is the correlate of the link relation used in left-corner parsers with top-down filtering; we direct the reader to the discussion by Matsumoto et al. [10] or Pereira and Shieber [page 182]pereira-shieber for further information.

A chain rule is applicable to connect a pivot to a root if the pivot can serve as the semantic head of the rule and the left-hand-side of the rule is appropriate for linking to the root.

: Grammar Fragment

: Analysis Tree Traversal Example

The information needed to guide the generation (given as the predicates chain_rule, non_chain_rule, and chained_nodes) can be computed automatically from the grammar; a program to compile a DCG into these tables has in fact been implemented. The details of the process will not be discussed further. The careful reader will have noticed, however, that no attention has been given to the issue of terminal symbols on the right-hand sides of rules. This is because during the compilation process, the right-hand side of a rule is converted from a list of categories and terminal strings to a list of nodes connected together by the difference-list threading technique used for standard DCG compilation. At that point, terminal strings can be introduced into the string threading and need never be considered further.

Next: An Example Up: The New Algorithm Previous: The New Algorithm

Gertjan van Noord
Thu Nov 24 18:39:44 MET 1994