next up previous contents
Next: A Powerful Grammar Formalism Up: Overview Previous: Towards reversible grammars

The other chapters

The next chapter provides the basis of the other chapters of this thesis, by defining a constraint-based formalism, called $ \cal {R}$($ \cal {L}$). The constraint-language of the formalism consists of the path-equations known from PATR II [85]. This may seem somewhat restricted, as lately some results have been obtained in the area of disjunctive and negative constraints (see for example [39] and [89]; for an overview of other types of constraints cf. [109]). However, the way in which we define $ \cal {R}$($ \cal {L}$), along the lines of [32], should make it clear that more powerful constraint languages can be exchanged with the simple constraint language we are assuming, provided appropriate constraint-solving techniques are available for such more powerful constraint languages. Therefore, the results presented in the other chapters of this thesis can easily be generalized to more powerful constraint languages.

On the other hand, $ \cal {R}$($ \cal {L}$) is more general than most grammar formalisms, in that it does not prescribe that phrases are combined by concatenation. No assumptions about string combination are enforced in the formalism. The motivation for this enrichment is provided by certain analyses of discontinuous constituency constructions, which employ other possible combinations of strings. Further motivation is provided in chapter 3, where it turns out that certain analyses in concatenative grammars are problematic for generation. These problems can be solved in non-concatenative grammars. In chapter 4 the extra power of non-concatenative grammars will be investigated and exploited. Furthermore, this more general perspective allows the use of the formalism for other tasks as well. In chapter 5 the formalism is used to define transfer grammars in a machine translation system.

The formalism defined in chapter 2 is used in this thesis in two ways. Firstly, and most importantly, we use the formalism to define grammars with. But furthermore, we use the formalism to define meta-interpreters for such grammars in. For this reason the formalism is provided with a simple procedural semantics (essentially as in Prolog). However, a main theme of this thesis will be that for `linguistic deduction' (parsing and generation) different proof strategies are more appropriate. These will then be defined in $ \cal {R}$($ \cal {L}$) as meta-interpreters.

Constraint-based grammars can be used, in principle, both for parsing and generation. This is also true, for grammars defined in $ \cal {R}$($ \cal {L}$). However, to use such grammars in a practically interesting way, the grammars need to be restricted in some way. Historically, the restriction has been (in order for parsing to be efficient), that phrases are built by concatenation. No restriction for generation was assumed, as generation played a minor role. But, generation on the basis of declarative grammars is not without problems. For example, the simple top-down, left-to-right backtrack search strategy leads to non-termination for linguistically motivated grammars. Chapter 3 discusses several (linguistically relevant) problems for some obvious generation techniques. Furthermore, that chapter provides motivation for a different processing strategy, in which generation proceeds essentially bottom-up, and head-driven. A simple version of this strategy is defined, and its properties and short-comings are investigated. Several variants and possible improvements are discussed. Relying on the notion `head', has some implications for the way in which semantic structures should be combined in the grammar. Essentially, the semantic-head-driven generation algorithm embodies the assumption that semantic structures are defined in a lexical and head-driven fashion. It turns out that the head-driven generation strategy faces problems with analyses in which this assumption is violated. As an important example, the semantic-head-driven generation strategy faces problems, if the head of a construction has been `displaced'. Such an analysis is often assumed for verb-second phenomena in for example German and Dutch, if the grammars are restricted to be concatenative. Thus, the assumption that semantic structures are built lexically and head-driven, does not make linguistic sense, if phrases are to be built by concatenation. In order for these assumptions to make linguistic sense, it is therefore necessary to have more freedom in the way phonological structures are combined. Thus, instead of concatenative grammars, non-concatenative grammars are called for.

In chapter 4 linguistic motivation for such powerful operations is discussed. A number of proposals for operations like `head-wrapping' and `sequence-union' are described, and a class of formalisms is introduced in which operations on strings are allowed which are linear and non-erasing. Formalisms such as Head-Grammars [67] and Tree Adjoining Grammars [40], are members of this class. Clearly, parsing algorithms developed for concatenative formalisms cannot be used for these more powerful formalisms. An important question thus is how to parse with non-concatenative grammars. I describe a very general algorithm for this class of grammars which proceeds, again, bottom-up, and head-driven. The motivation for such a processing strategy is discussed. Furthermore, I describe possible extensions and improvements of the basic algorithm. I also show how the parser can be `specialized' for lexicalized, and constraint-based, versions of Tree Adjoining Grammars.

In chapter 5, I describe a possible application of reversible grammars. This chapter provides evidence that such grammars can be used to implement a machine translation system. The results of the other chapters in fact were developed partly in the context of the construction of a reversible MT system, called MiMo2. In this chapter I discuss the notion `linguistically possible translation' On the basis of this discussion a specific architecture for an MT system is proposed, which has been implemented as the MiMo2 prototype. This prototype was developed by the author and colleagues at the University of Utrecht. In this architecture, translation is simply defined by a series of three reversible, constraint-based grammars. It is shown how $ \cal {R}$($ \cal {L}$)-grammars can be used to define `transfer' relations as part of a Machine Translation system. Furthermore, a constraint on transfer grammars is proposed that ensures termination, while certain context-sensitive translations are still possible. It is argued that reversible transfer grammars provide an interesting compromise between expressive power and computational feasibility.

Most of the material in this thesis is based on papers and articles that have appeared elsewhere. The material on translation is partly based on [103,99,104]. The chapter on generation is based on [97,86,98,87], and the material in the chapter on parsing has been published as [101,100].

next up previous contents
Next: A Powerful Grammar Formalism Up: Overview Previous: Towards reversible grammars
Noord G.J.M. van