Next: Motivation for reversible grammars Up: diss Previous: Contents

Introduction

Constraint-based grammars have become increasingly popular within the field of natural language processing. One of the reasons for this success is that constraint-based grammars are completely declarative; that is, the grammar only states facts of the language it describes, without stating how such a grammar should be used for parsing or generation. Such declarative grammars, for this reason, provide for an abstract level of language description, and are easy to understand, and hence relatively easy to debug, extend, and re-use in other applications.

Because constraint-based grammars do not enforce a specific processing regime, it is possible to conceive of constraint-based grammars, which can be used both for parsing and generation. Such grammars may be called reversible [43]. Section 1.3 defines the notion reversibility somewhat more precisely, and discusses some of its properties. I define a reversible grammar as a grammar for which parsing and generation are both guaranteed to terminate. The notion of a grammar that can be used both for parsing and generation is intuitively appealing; but I provide explicit motivation for the use of reversible grammars in section 1.1.

Although declarative grammars can, in principle, be used in a reversible way, it turned out that from a practical point of view, several problems remained to be solved, in order for this ideal to be realized. Historically, constraint-based grammars often were used for parsing only. Attempts to use such grammars (written with parsing in mind) for generation, failed. Some of the problems are discussed in chapter 3. Specialized generation algorithms had to be developed, to be able to use constraint-based grammars for generation successfully.

In chapter 3 I present a generation technique, known as semantic-head-driven generation, which has some interesting properties. Firstly, the algorithm is goal-driven in that the order of processing is geared towards the input (semantic structures). Secondly, the algorithm is lexicon-driven, in that the algorithm proceeds in a bottom-up fashion. These two properties imply that the generation technique is most useful for theories, such as unification-based versions of categorial grammars (CUG, UCG, HPSG), in which semantic structures are defined in a lexical, and head-driven fashion.

I present some evidence that semantic-head-driven generation faces problems in the case of some types of discontinuous constituency. Most notably, in certain analyses of so-called `head-movement', as in straightforward analyses of verb-second phenomena in languages such as German and Dutch, the algorithm faces severe problems. Although I discuss some restricted techniques to repair the problem, a more general solution does not seem available, for grammars which are concatenative.

Linguistic evidence for more powerful operations on strings than concatenation, is presented by various authors. Some proposals are discussed in chapter 4. I argue that analyses, which were problematic from a generation point of view, can be avoided in grammars which allow for more powerful operations on strings. Furthermore, as long as these operations on strings are linear and non-erasing (these terms will be explained later), it might still be possible to define efficient parsing algorithms. Chapter 4 presents such a parsing algorithm: the head-corner parser. The head-corner parser for non-concatenative grammars generalizes Martin Kay's head-driven parser, which in turn is a variation of the left-corner parser. In the head-corner parser, processing proceeds in a head-driven and bottom-up fashion. Furthermore, the bag of words occurring in the input sentence functions as the guide, rather than the sequence of words. I argue that this order of processing provides for a goal-driven, and lexicon-driven flow of control.

Summarizing, the argument can be rephrased as follows. Constraint-based grammars can be used in principle both for parsing and generation. However, to use such grammars in a practically interesting way, the grammars need to be restricted in some way. Historically, the restriction has been (in order for parsing to be efficient), that phrases are built by concatenation. No restriction for generation was assumed, as generation played a minor role. However, once grammars are to be used reversibly, I argue that this division of labor (a strict restriction for parsing, and no restriction for generation) cannot be maintained. To make generation feasible at all, some assumptions about how semantic structures are combined are necessary. In the approach of chapter 3, semantic structures are built in a lexical and head-driven fashion. In order for these assumptions to make linguistic sense, it is also necessary to have more freedom in the way phonological structures are combined. Thus, instead of concatenative grammars, linear and non-erasing grammars are called for.

In chapter 5, I describe a possible application of reversible grammars. This chapter provides motivation that such grammars can be used to implement the relation `linguistically possible translation'. The results of the other chapters were developed partly in the context of the construction of a reversible MT prototype, called MiMo2. This prototype was developed by the author and colleagues at the University of Utrecht. In this architecture, translation is simply defined by a series of three reversible, constraint-based grammars.

Next: Motivation for reversible grammars Up: diss Previous: Contents

Noord G.J.M. van
1998-09-30