Next: Beyond concatenation Up: Head Corner Parsing Previous: Head Corner Parsing

Introduction

Although most constraint-based formalisms in computational linguistics assume that phrases are built by concatenation (eg. as in PATR II, GPSG, LFG and most versions of Categorial Grammar) this assumption is sometimes challenged by allowing more powerful operations to construct strings. For example, [19] proposes several versions of `head wrapping'. In the analysis of the Australian free word-order language Guugu Yimidhirr, Mark Johnson uses a `combine' predicate in a DCG-like grammar that corresponds to the union of words [11]. Mike Reape uses an operation called `sequence union' to analyse Germanic semi-free word order constructions [21,22]. Other examples outside the family of constraint-based formalisms include Tree Adjoining Grammars [12,30], versions of Categorial Grammar [4] and references cited there, and some approaches within more traditional branches of generative grammar [2].

The linguistic motivation for such alternative conceptions of string combination are so-called discontinuous constituency constructions. Furthermore, in formalisms that provide more powerful string operations it is much easier to define analyses in which the semantics is built compositionally. This in turn may simplify generation algorithms.

In the next section I describe the proposals by [19,11] and [21]. Based on work by [31] I introduce the notion of Linear Context-free Rewriting systems, augmented with Feature structures (F-LCFRS). In F-LCFRS we abstract away from the actual construction of strings; we only require that these operations are linear and non-erasing. The three approaches mentioned previously are examples of F-LCFRS.

Most `standard' parsing algorithms for constraint-based grammars [15,17,25,8,7] are not applicable in general for F-LCFRS because in these algorithms the assumption that phrases are constructed by concatenation is `built-in'. I describe a head-driven parsing algorithm, based on the head-driven parser by Martin Kay [13]. The parser is generalized in order to be applicable to any F-LCFR grammar. The disadvantages Kay noted for his parser do not carry over to this generalized version, as redundant search paths for CF-based grammars turn out to be genuine parts of the search space for F-LCFR grammars.

The algorithm is closely related to head-driven generators [28,3,26,29,27]. The algorithm proceeds in a bottom-up, head-driven fashion, which provides for bottom-up and top-down filtering in a simple and straightforward way. In modern linguistic theories very much information is defined in lexical entries, whereas rules are reduced to very general (and very uninformative) schemata. More information usually implies less search space, hence it is sensible to parse bottom-up in order to obtain useful information as soon as possible. Furthermore, in many linguistic theories a special daughter called the head determines what kind of other daughters there may be. Therefore, it is also sensible to start with the head in order to know what else you have to look for next. As the parser proceeds from head to head it is furthermore possible to use powerful top-down predictions based on the usual head feature percolations.

This chapter is organized as follows. Firstly I discuss the proposals for more powerful string operations, as presented by [19], [11] and [22]. Then I define two restrictions on possible string combinations for constraint-based grammars, based on [31]. As an example I define a simple grammar for Dutch in which strings are combined by a technique quite similar to Pollard's head wrapping. The main part of the chapter is section 4, in which a parsing strategy for F-LCFR grammars is proposed. Some of its properties and possible modifications are discussed in the final section.

Next: Beyond concatenation Up: Head Corner Parsing Previous: Head Corner Parsing

Noord G.J.M. van
1998-09-30