
MiMo

The MiMo formalism attempts to answer the question of what compositional translation should imply. Strictly compositional systems have to deal with several translation problems. What exactly these problems are depends on how the notion of compositionality is defined. In general, two kinds of problems can be distinguished. First, there are the problems that arise when languages do not really match. Second, there are the problems that occur when the translations of two constructions depend on one another.

The former type of problem is caused by lexical and structural holes, that is, cases in which the source and target representations do not really match. A lexical hole occurs when the target language lacks a word equivalent to one in the source language. In the case of a structural hole, the target language lacks an equivalent construction rather than a word. In these cases a description of the concept has to be used instead. For an example of a lexical hole, compare the Dutch sentence (1) with its English translation (2).

(1) Jan zwemt graag
(2) John likes to swim

Unlike sentences with an adverb like 'vandaag', (1) cannot be translated compositionally in the strictest sense: the translation of (1) is not simply composed of the translations of its parts. In the CAT framework, this problem has been solved by liberalizing the definition of compositionality so that (1) can be rendered directly as (2) by means of a rule like (3).

(3) r1(s1,s2,graag) => r2(t(s1),r3(like,t(s2)))

By (3), a construction composed of three daughters, s1, s2 and 'graag', is translated into a construction with two daughters: the translation of s1 and a construction that itself has two daughters, namely the verb 'like' and the translation of s2. The main disadvantage of this approach is that combinations of exceptions have to be described explicitly again; see (4) and (5).

(4) Jan zwom gewoonlijk
    John used to swim
(5) Jan zwom gewoonlijk graag
    John used to like to swim
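
A rule like (3) can be pictured as a function that matches all daughters of a constituent at once. The following Python sketch is an illustration only and is not CAT notation: the list and tuple encoding of trees, the toy lexicon, the label 'used_to' and the function names are all assumptions made here, and tense is ignored. It suggests why every combination of exceptional adverbs needs a pattern of its own under this approach.

```python
# Toy lexicon for ordinary daughters (hypothetical entries, citation forms only).
LEXICON = {"Jan": "John", "zwemmen": "swim"}


def t(node):
    """Translate an ordinary daughter by simple lexical lookup."""
    return LEXICON[node]


def translate_whole(daughters):
    """Translate a constituent by matching ALL of its daughters in one rule."""
    s1, s2, *adverbs = daughters
    if adverbs == []:
        return (t(s1), t(s2))
    if adverbs == ["graag"]:                 # corresponds to rule (3)
        return (t(s1), ("like", t(s2)))
    if adverbs == ["gewoonlijk"]:            # analogous rule needed for (4)
        return (t(s1), ("used_to", t(s2)))
    if adverbs == ["gewoonlijk", "graag"]:   # yet another rule needed for (5)
        return (t(s1), ("used_to", ("like", t(s2))))
    raise ValueError(f"no rule covers the adverbs {adverbs}")
```

Each further exceptional adverb would add patterns for its combinations with the existing ones, which is the growth in the number of rules at issue here.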

The translation of 'gewoonlijk' requires a rule similar to (3). However, 'graag' and 'gewoonlijk' can also be combined, and an additional rule has to account for this combination. This leads to an explosion in the number of rules, which is one of the main reasons for adopting an alternative definition of compositionality in the MiMo system. Under this definition, 'gewoonlijk' and 'graag' can both be translated when they co-occur, without a separate rule for the combination. A translation rule separates a constituent into an ordinary part and an exceptional part. Both parts are translated separately and then, in the target language, the two translated parts are joined again. A sentence containing both 'graag' and 'gewoonlijk' is separated into an exceptional part, 'graag' for example, and an ordinary part, the rest of the sentence. This rest is in turn separated into an exceptional part, 'gewoonlijk', and an ordinary part, which is again whatever remains after the exceptional part has been extracted. In the end, all these parts are joined to make up a construction in the target language. So, in MiMo not all daughters are translated in one shot; instead, part of a constituent is translated while the rules can still work on the rest of the constituent. An extensive discussion of problems like these can be found in Arnold et al. (1988).
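
The separation into an exceptional part and an ordinary rest can be sketched recursively. The Python below is a minimal illustration under stated assumptions and is not the MiMo rule formalism: the data encoding, the toy lexicon, the label 'used_to' and the names EXCEPTIONAL, translate_predicate and translate_sentence are hypothetical. In particular, extracting the leftmost exceptional word first, so that it ends up outermost in the target structure, is a simplification chosen here.

```python
# Toy lexicon for ordinary words (hypothetical entries, citation forms only).
LEXICON = {"Jan": "John", "zwemmen": "swim"}

# One entry per exceptional word: how its translation is joined
# with the translation of the rest of the constituent.
EXCEPTIONAL = {
    "graag": lambda rest: ("like", rest),
    "gewoonlijk": lambda rest: ("used_to", rest),
}


def translate_predicate(daughters):
    """Peel off one exceptional daughter, translate the rest, then rejoin."""
    for i, daughter in enumerate(daughters):
        if daughter in EXCEPTIONAL:
            rest = daughters[:i] + daughters[i + 1:]
            # The same rules keep working on the remaining daughters.
            return EXCEPTIONAL[daughter](translate_predicate(rest))
    # No exceptional part left: translate the ordinary part directly.
    (verb,) = daughters
    return LEXICON[verb]


def translate_sentence(subject, predicate):
    """Translate a subject and a predicate given as a list of daughters."""
    return (LEXICON[subject], translate_predicate(predicate))
```

For instance, `translate_sentence("Jan", ["zwemmen", "gewoonlijk", "graag"])` handles the combination in (5) with only the two single-word entries, so no separate rule for the pair is required.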

The second type of problem with respect to compositionality in translation involves phrases whose translations are mutually dependent, for instance phrases that are anaphorically linked. Translating them requires that these relations be established first. Examples are given in (6) and (7). In (6), the relation between the subject and the reflexive pronoun must be known to arrive at the correct form of the reflexive in French. In (7), the grammatical function of the wh-word must be known to generate the right case in German.

(6) the women think of themselves
    les femmes pensent à elles-mêmes/*ils-mêmes
(7) who did you see
    wen/*wer/*wem sahest du
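
The dependencies in (6) and (7) can be made concrete with two lookups. The Python sketch below is an assumption-laden illustration, not part of MiMo: the feature names, the function labels and the tables are hypothetical, and the mapping from grammatical function to German case is deliberately simplified (verbs that govern a dative object are ignored, for example). The point is only that the target form cannot be chosen from the source word alone; the relation to another phrase, or the functional status of the phrase, has to be available.

```python
# French emphatic reflexive forms, keyed by (number, gender) of the antecedent.
FRENCH_REFLEXIVE = {
    ("sg", "masc"): "lui-même",
    ("sg", "fem"): "elle-même",
    ("pl", "masc"): "eux-mêmes",
    ("pl", "fem"): "elles-mêmes",
}

# German forms of 'who', keyed by a simplified grammatical function.
GERMAN_WHO = {
    "subject": "wer",           # nominative
    "direct_object": "wen",     # accusative
    "indirect_object": "wem",   # dative
}


def french_reflexive(antecedent):
    """Choose the reflexive form from the features of its antecedent."""
    return FRENCH_REFLEXIVE[(antecedent["number"], antecedent["gender"])]


def german_who(function):
    """Choose the case form of 'who' from its grammatical function."""
    return GERMAN_WHO[function]


# 'les femmes' in (6) is feminine plural; 'who' in (7) is the direct object.
les_femmes = {"number": "pl", "gender": "fem"}
reflexive_in_6 = french_reflexive(les_femmes)
wh_word_in_7 = german_who("direct_object")
```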

In this paper we examine the component of the MiMo formalism that has been developed to enable the formulation of anaphoric relations on the one hand and compositional translation on the other. The system distinguishes itself from other systems in the field of computational linguistics, such as GPSG (Gazdar et al. 1985), PATR (see e.g. Shieber 1986) and DCG (Pereira and Warren 1980), by its central notion of modularity. The formalism enables the writer of rules to express generalizations in a simple and declarative way. This will be exemplified in section 4. In an MT context, however, it is not enough to establish anaphoric relations monolingually. The question is how these relations behave in translation. In MiMo, it is possible to translate the relations compositionally. This will be discussed in section 5.


Gertjan van Noord
Fri Nov 25 13:16:14 MET 1994