Next: Grammar rules.
Up: A computational grammar for
Previous: A computational grammar for
Formalism
The formalism that we use for the OVIS2 Grammar is a variant of
Definite Clause Grammar (DCG) [34]. We chose DCG because:
- DCG provides a balance between computational
efficiency on the one hand and linguistic expressiveness on
the other.
- DCG is a (simple) member of the class of declarative and
constraint-based grammar formalisms. Such formalisms are widely used
in linguistic descriptions for NLP.
- DCG is straightforwardly related to context-free
grammar. Almost all parsing technology is developed for CFG;
extending this technology to DCG is usually possible (although there
are many non-trivial problems as well).
- The compilation of richer constraint-based grammar formalisms
into DCG is well investigated and forms the basis of several
wide-coverage and robust grammar systems (e.g. the Alvey grammar
[14,16,13] and the Core Language
Engine [3]).
The formalism for the grammar of OVIS2 imposes the following
additional requirements:
- External Prolog calls (in ordinary DCG these are introduced in
right-hand sides using curly brackets) are allowed, but must
be resolved during grammar compilation time.
- Rules can be mapped to their `context-free skeleton' (by
taking the functor symbol of the terms appearing in the right-hand
and left-hand sides of the rule). This implies that we do not allow the
under-specification of categories in rules. This is motivated by our
desire to experiment with parsing strategies in which part of the
work is achieved on the basis of the context-free skeleton of the
grammar. It also facilitates indexing techniques.
- An identifier is assigned to each rule. Such rule identifiers
have a number of possible uses (debugging, grammar filtering,
grammar documentation).
- The grammar specifies for each rule which daughter
is the head. This allows head-driven parsing strategies.
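The last three requirements can be illustrated with a small sketch. It is written in Python rather than Prolog and is not the OVIS2 implementation; the names Term, Rule, skeleton and the example rule s_np_vp are assumptions made for illustration only. It shows a rule that carries an identifier and a head index, and how its context-free skeleton is obtained by keeping only the functor symbols.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Term:
    """A Prolog-style term: a functor symbol plus its arguments."""
    functor: str
    args: Tuple[object, ...] = ()


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule record: identifier, mother, daughters, head index."""
    rule_id: str
    mother: Term
    daughters: Tuple[Term, ...]
    head: int  # position of the head daughter in `daughters`


def skeleton(rule: Rule) -> Tuple[str, Tuple[str, ...]]:
    """Return the context-free skeleton: functor symbols only, arguments dropped."""
    return rule.mother.functor, tuple(d.functor for d in rule.daughters)


# Illustrative rule (an assumption, not taken from the OVIS2 grammar):
#   s(Agr) --> np(Agr), vp(Agr)   with vp marked as the head daughter.
s_rule = Rule(
    rule_id="s_np_vp",
    mother=Term("s", ("Agr",)),
    daughters=(Term("np", ("Agr",)), Term("vp", ("Agr",))),
    head=1,
)

print(skeleton(s_rule))                        # ('s', ('np', 'vp'))
print(s_rule.daughters[s_rule.head].functor)   # vp
```

Because every category in a rule has a fixed functor, the skeleton is always defined. A rule with an under-specified category (a bare variable as a daughter) would have no functor to project, which is why such rules are excluded.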
An efficient head-corner parsing strategy for this formalism
is discussed in van Noord [41]. The restriction that external Prolog
calls must be resolved at compilation time implies that we do not use
delayed evaluation. In particular, lexical rules (deriving a
lexical entry from a given `basic' lexical entry) must be applied at
compile time and are not interpreted as (relational) constraints on
under-specified lexical entries, as in van Noord and Bouma
[43]. Although
we have experimented with combinations of delayed evaluation and
memoisation, as described in Johnson and Dörre
[25], the resulting
systems were not efficient enough to be applied in the kind of
practical system considered here.
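The compile-time treatment of lexical rules can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the OVIS2 compiler: the function compile_lexicon, the tuple representation of lexical entries, and the toy plural_rule are all hypothetical. The point is only that every derived entry is computed before parsing begins, so the parser never has to evaluate a lexical rule as a delayed constraint.

```python
def compile_lexicon(basic_entries, lexical_rules):
    """Apply all lexical rules exhaustively, before any parsing takes place.

    Each rule is a function that maps an entry to a derived entry, or
    returns None if it does not apply. The loop terminates only if the
    rules generate a finite set of entries.
    """
    compiled = list(basic_entries)
    seen = set(basic_entries)
    agenda = list(basic_entries)
    while agenda:
        entry = agenda.pop()
        for rule in lexical_rules:
            derived = rule(entry)
            if derived is not None and derived not in seen:
                seen.add(derived)
                compiled.append(derived)
                agenda.append(derived)  # derived entries may feed other rules
    return compiled


def plural_rule(entry):
    """Toy lexical rule (an assumption): derive a plural noun from a singular one."""
    word, category, number = entry
    if category == "noun" and number == "sg":
        return (word + "s", category, "pl")
    return None


basic = [("train", "noun", "sg"), ("leave", "verb", "inf")]
lexicon = compile_lexicon(basic, [plural_rule])
print(lexicon)
```

The cost of this design is a larger compiled lexicon; the benefit is that lexical lookup at parse time is a plain table access.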
2000-07-10