Formalism

The formalism that we use for the OVIS2 Grammar is a variant of Definite Clause Grammar (DCG) [34]. We chose DCG for the following reasons:

DCG provides a balance between computational efficiency on the one hand and linguistic expressiveness on the other.
DCG is a (simple) member of the class of declarative and constraint-based grammar formalisms. Such formalisms are widely used in linguistic descriptions for NLP.
DCG is straightforwardly related to context-free grammar. Almost all parsing technology is developed for CFG; extending this technology to DCG is usually possible (although there are many non-trivial problems as well).
The compilation of richer constraint-based grammar formalisms into DCG has been well investigated and forms the basis of several wide-coverage and robust grammar systems (e.g. the Alvey grammar [14,16,13] and the Core Language Engine [3]).

The formalism for the grammar of OVIS2 imposes the following additional requirements:

External Prolog calls (in ordinary DCG these are introduced in right-hand sides using curly brackets) are allowed, but must be resolved during grammar compilation time.
Rules can be mapped to their 'context-free skeleton' (by taking the functor symbol of the terms appearing in the right-hand and left-hand sides of the rule). This implies that we do not allow the under-specification of categories in rules. The requirement is motivated by our desire to experiment with parsing strategies in which part of the work is done on the basis of the context-free skeleton of the grammar. It also facilitates indexing techniques.
An identifier is assigned to each rule. Such rule identifiers have a number of possible uses (debugging, grammar filtering, grammar documentation).
The grammar specifies for each rule which daughter is the head. This allows head-driven parsing strategies.
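
The last three requirements can be illustrated with a small sketch. It is not the OVIS2 implementation, which is a DCG variant and is not written in Python; Python is used here only for illustration. The names Term, Rule, skeleton and the rule identifier vp_v_np, as well as the example categories, are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Term:
    """A category such as np(Agr, Case): a functor symbol plus arguments."""
    functor: str
    args: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Rule:
    rule_id: str                  # identifier (debugging, filtering, documentation)
    mother: Term                  # left-hand side of the rule
    daughters: Tuple[Term, ...]   # right-hand side of the rule
    head: int                     # index of the head daughter

    def __post_init__(self):
        # Every rule must name one of its own daughters as the head.
        if not 0 <= self.head < len(self.daughters):
            raise ValueError("head index must point at a daughter")


def skeleton(rule):
    """Context-free skeleton: keep only the functor symbol of each term."""
    return rule.mother.functor, tuple(d.functor for d in rule.daughters)


# Hypothetical rule: vp(Agr) --> v(Agr, transitive), np(_, acc), head = v.
vp_rule = Rule(
    rule_id="vp_v_np",
    mother=Term("vp", ("Agr",)),
    daughters=(Term("v", ("Agr", "transitive")), Term("np", ("_", "acc"))),
    head=0,
)
```

Because every category in a rule has a known functor, skeleton(vp_rule) always yields a plain context-free rule (here vp -> v np); a rule with an under-specified category would have no such skeleton.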

An efficient head-corner parsing strategy for this formalism is discussed in van Noord [41]. The restriction that external Prolog calls must be resolved at compilation time implies that we do not use delayed evaluation. In particular, lexical rules (deriving a lexical entry from a given 'basic' lexical entry) must be applied at compile time and are not interpreted as (relational) constraints on under-specified lexical entries, as in van Noord and Bouma [43]. Although we have experimented with combinations of delayed evaluation and memoisation, as described in Johnson and Dörre [25], the resulting systems were not efficient enough to be applied in the kind of practical system considered here.
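
As a rough illustration of what applying lexical rules at compile time amounts to, the sketch below expands a list of basic entries into a fully specified lexicon before any parsing takes place. It is a toy under stated assumptions, not the OVIS2 lexicon compiler: the entry format, the function names plural_rule and compile_lexicon, and the English example words are all hypothetical.

```python
def plural_rule(entry):
    """Hypothetical lexical rule: derive a plural noun from a singular one."""
    if entry["cat"] != "n" or entry["num"] != "sg":
        return None  # the rule does not apply to this entry
    return {**entry, "form": entry["form"] + "s", "num": "pl"}


def compile_lexicon(basic_entries, lexical_rules):
    """Apply every lexical rule once to every basic entry, ahead of parsing.

    The result contains only fully specified entries, so the parser never
    has to evaluate a lexical rule (or delay a constraint) at run time.
    """
    compiled = list(basic_entries)
    for entry in basic_entries:
        for rule in lexical_rules:
            derived = rule(entry)
            if derived is not None:
                compiled.append(derived)
    return compiled


# Hypothetical basic entries.
BASIC = [
    {"form": "train", "cat": "n", "num": "sg"},
    {"form": "to", "cat": "p", "num": None},
]
LEXICON = compile_lexicon(BASIC, [plural_rule])
```

The trade-off is a larger compiled lexicon in exchange for a parser that only ever sees fully specified entries, which is the efficiency argument made above.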


2000-07-10