Unification Linguistics



In MiMo2, grammars covering basic subsets of English, Dutch and Spanish have been developed. The linguistic theory embodied in these grammars is a variant of the emerging family of unification grammars (UG; see [40] for a general introduction), with Head-Driven Phrase Structure Grammar (HPSG, [34]) as the initial source of inspiration. Such grammars are usually implemented in a formalism from the family of logic grammars, such as PATR. MiMo2 adopts two recent developments in the UG tradition: the sign-based approach and a strong lexical orientation.

In sign-based theories like HPSG and Unification Categorial Grammar (UCG, [55]), linguistic objects (grammar rules, lexical entries, etc.) are described as partial information structures that express declarative and monotonic constraints on combinations of (possibly diverse) types of linguistic information ([34], p. 7). In contrast to linguistic theories (such as transformational grammar) and NLP formalisms (such as Eurotra's E-framework [5]), in which linguistic representations are sequentially transformed one into another, sign-based grammars allow for interleaved processing of phonology, syntax, and semantics.
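The notion of partial, monotonic descriptions can be made concrete with a small sketch. It is not MiMo2 or PATR code: it assumes feature structures are modelled as nested Python dictionaries with strings as atomic values, and it leaves out structure sharing (reentrancy), which real unification formalisms support. The names `unify` and `UnificationFailure` are hypothetical.

```python
class UnificationFailure(Exception):
    """Raised when two descriptions carry conflicting atomic values."""


def unify(a, b):
    """Combine two partial descriptions without discarding any information.

    Assumption: nested dicts stand in for feature structures and strings
    for atoms; structure sharing is not modelled in this sketch.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)  # copy, so neither input is modified
        for key, value in b.items():
            merged[key] = unify(merged[key], value) if key in merged else value
        return merged
    if a == b:
        return a
    raise UnificationFailure(f"{a!r} conflicts with {b!r}")


# Two partial descriptions of one sign, each about a different kind of information.
syntactic = {"syntax": {"head": {"cat": "v"}}}
semantic = {"semantics": {"pred": "sleep"}}

# The order in which constraints are applied does not matter.
sign = unify(syntactic, semantic)
same_sign = unify(semantic, syntactic)
```

Because unification only adds information or fails, constraints on syntax and semantics can be applied in any order, which is what makes interleaved processing possible in this simplified model.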

The second development in the UG tradition is a strong lexical orientation, which originated with LFG [7]. MiMo2 follows HPSG [38][34] in having small grammars with few but general rules and rich lexical entries. To minimize redundancy and to capture generalisations, macros can be defined (cf. the `let' definitions of [40], or the `aliases' of [36]) to implement a lexical inheritance hierarchy ([13]; [34], chapter 8). Furthermore, the maintenance of large lexicons is facilitated by a separate lexical preprocessor, which is discussed elsewhere [48].
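How macros can encode a lexical inheritance hierarchy is sketched below. This is an illustration under stated assumptions, not the MiMo2 macro facility: the macro table `MACROS`, the helpers `expand` and `lexical_entry`, the example macros `verb` and `transitive`, and the category value `n` are all hypothetical, and feature structures are again simplified to nested dictionaries without structure sharing.

```python
class UnificationFailure(Exception):
    """Raised when two descriptions carry conflicting atomic values."""


def unify(a, b):
    # Merge two nested-dict descriptions; structure sharing is not modelled.
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)
        for key, value in b.items():
            merged[key] = unify(merged[key], value) if key in merged else value
        return merged
    if a == b:
        return a
    raise UnificationFailure(f"{a!r} conflicts with {b!r}")


# Hypothetical macro table: name -> (inherited macros, own constraints).
MACROS = {
    "verb": ((), {"syntax": {"head": {"cat": "v"}}}),
    "transitive": (("verb",), {"syntax": {"subcat": {
        "first": {"syntax": {"head": {"cat": "n"}}},
        "rest": "nil",
    }}}),
}


def expand(name):
    """Unify a macro's own constraints with everything it inherits."""
    parents, constraints = MACROS[name]
    for parent in parents:
        constraints = unify(expand(parent), constraints)
    return constraints


def lexical_entry(stem, *macros):
    """Build an entry from its stem plus the macros it calls."""
    entry = {"stem": stem}
    for name in macros:
        entry = unify(entry, expand(name))
    return entry


# The entry states only what is word-specific; the rest is inherited.
see = lexical_entry("see", "transitive")
```

In this sketch the entry for `see` mentions neither its category nor its subcat list directly; both follow from the single macro call, so a change to `verb` propagates to every entry that inherits from it.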

The lexicalist approach is partly motivated by the considerable reduction in grammar size that it enabled us to achieve, e.g. by moving subcategorization frames to the lexicon, thereby eliminating the large number of phrase structure rules found in earlier phrase structure grammars (GPSG [15]). This reduction matters given how complex large grammars are to maintain. The possibility of defining linguistic principles separately, and calling them in grammar rules as macros, further reduces grammar complexity.

Examples of principles are HPSG's SUBCAT-principle, which recursively realizes the first element of the argument list that represents the subcategorization frame (cf. functional application in Categorial Grammars), and the Head Feature Principle (the HPSG restatement of GPSG's Head Feature Convention in unification terms). These principles can be defined universally, so that they can be called from all grammar components. Using these macros allows for some modularity: the definition of a principle can be changed without changing any grammar rule.
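The two principles, and the way a rule can call them as macros, can be sketched as follows. This is a simplified reading, not the MiMo2 definitions: the function names are hypothetical, the subcat list is assumed to be encoded with `first`/`rest` features ending in the atom `nil`, and the lack of structure sharing means the argument is only checked against the subcat element rather than identified with it.

```python
class UnificationFailure(Exception):
    """Raised when two descriptions carry conflicting atomic values."""


def unify(a, b):
    # Merge two nested-dict descriptions; structure sharing is not modelled.
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)
        for key, value in b.items():
            merged[key] = unify(merged[key], value) if key in merged else value
        return merged
    if a == b:
        return a
    raise UnificationFailure(f"{a!r} conflicts with {b!r}")


def head_feature_principle(mother, head_dtr):
    """The mother carries the head features of its head daughter."""
    return unify(mother, {"syntax": {"head": head_dtr["syntax"]["head"]}})


def subcat_principle(mother, head_dtr, argument):
    """Realize the first subcat element; the mother keeps the rest of the list."""
    subcat = head_dtr["syntax"]["subcat"]
    if subcat == "nil":
        raise UnificationFailure("no argument left to realize")
    unify(subcat["first"], argument)  # raises if the argument does not fit
    return unify(mother, {"syntax": {"subcat": subcat["rest"]}})


def head_argument_rule(head_dtr, argument):
    """One general rule whose only content is a call to the two principles."""
    mother = head_feature_principle({}, head_dtr)
    return subcat_principle(mother, head_dtr, argument)


# Hypothetical signs; the category value "n" is an assumption.
verb = {"stem": "see", "syntax": {"head": {"cat": "v"}, "subcat": {
    "first": {"syntax": {"head": {"cat": "n"}}}, "rest": "nil"}}}
noun = {"stem": "Mary", "syntax": {"head": {"cat": "n"}}}

vp = head_argument_rule(verb, noun)
```

The rule itself never inspects the subcat list, so either principle could be redefined without touching `head_argument_rule`, which is the kind of modularity described above.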

An empirical motivation for the lexicalist approach is the huge number of word-specific idiosyncrasies. The combination of the sign-based lexicalist approach and the idea of the subcategorisation list enables linguists to describe the idiosyncratic character of idioms in the lexical entry of the head word of the idiom only, by directly specifying the argument on the subcat list. Given the lexical entry in figure 1, the MiMo2 grammar will recognize the VP semantically as the one-place predicate kick_bucket (note that subjects are not on the subcat list in this approach).

Figure 1: The lexical entry for `kick'

    kick
        stem = kick
        syntax head cat = v
        syntax subcat first semantics pred = bucket
        syntax subcat rest = nil
        semantics pred = kick_bucket
        semantics arg1 = syntax subject semantics

Since the grammars are implemented in the PATR formalism, certain HPSG proposals, such as the ID/LP rule format and the obliqueness-hierarchy, cannot be implemented directly. To a certain extent some proposed extensions could be simulated, as will be shown in section 3, but this is not in general the case.
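The effect of the idiom entry in figure 1 can be sketched in the same simplified style. The sketch makes several assumptions that are not taken from MiMo2: feature structures are nested dictionaries, the reentrancy between `semantics arg1` and the subject's semantics is omitted because structure sharing is not modelled, the VP is assumed to take its semantics from the head word, and the function `combine` and the noun entries are hypothetical.

```python
class UnificationFailure(Exception):
    """Raised when two descriptions carry conflicting atomic values."""


def unify(a, b):
    # Merge two nested-dict descriptions; structure sharing is not modelled.
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)
        for key, value in b.items():
            merged[key] = unify(merged[key], value) if key in merged else value
        return merged
    if a == b:
        return a
    raise UnificationFailure(f"{a!r} conflicts with {b!r}")


# The entry of figure 1, minus the arg1/subject reentrancy (not modelled here).
KICK_IDIOM = {
    "stem": "kick",
    "syntax": {
        "head": {"cat": "v"},
        "subcat": {"first": {"semantics": {"pred": "bucket"}}, "rest": "nil"},
    },
    "semantics": {"pred": "kick_bucket"},
}


def combine(head, argument):
    """Check the argument against the head's first subcat element and build the VP.

    Assumption: the VP inherits head features and semantics from the head word.
    """
    subcat = head["syntax"]["subcat"]
    if subcat == "nil":
        raise UnificationFailure("no argument left to realize")
    unify(subcat["first"], argument)  # raises if the argument does not fit
    return {
        "syntax": {"head": head["syntax"]["head"], "subcat": subcat["rest"]},
        "semantics": head["semantics"],
    }


# Hypothetical noun entries.
bucket = {"stem": "bucket", "syntax": {"head": {"cat": "n"}},
          "semantics": {"pred": "bucket"}}
ball = {"stem": "ball", "syntax": {"head": {"cat": "n"}},
        "semantics": {"pred": "ball"}}

vp = combine(KICK_IDIOM, bucket)
```

Only the idiomatic entry for the head word mentions the bucket; an argument with a different predicate, such as `ball`, fails to unify with the subcat element and would have to be handled by a separate, non-idiomatic entry for the verb.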

The lexicalist approach can easily be extended to handle bilingual lexical idiosyncrasies, which makes it fit well with the transfer-based view of translation described in section 1.1.3.


Gertjan van Noord
Thu Nov 24 19:09:23 MET 1994