next up previous
Next: Permutation Up: The Matching Approach Previous: The Matching Approach

Introduction

In order to illustrate the alternative approach, based on matching we return to the bebop example given earlier, repeated here:


c1:   O[ b ] N[ e ] X[ b ] X[ o ] X[ p ]
c2:   O[ b ] N[ e ] O[ b ] N[ o ] X[ p ]
c3:   X[ b ] X[ e ] O[ b ] N[ o ] X[ p ]

Here an instance of 'X[' is a constraint violation, so c2 is the best candidate. By counting, one can see that c2 has one violation, while c1 and c3 each have 3. By matching, one can see that all candidates have a violation in position 13, but c1 and c3 also have violations in positions not corresponding to violations in c2. As long the positions of violations line up in this manner, it is possible to construct a finite state filter to rule out candidates with a non-minimal number of violations. The filter will take the set of candidates, and subtract from that set all strings that are similar, except that they contain additional constraint violations.

Given the approach of marking up constraint violations introduced earlier, it is possible to construct such a matching filter. Consider again the `bebop' example. If the violations are marked, the candidates of interest are:


O[   b ] N[   e ] X[ @ b ] X[ @ o ] X[ @ p ]
O[   b ] N[   e ] O[   b ] N[   o ] X[ @ p ]
X[ @ b ] X[ @ e ] O[   b ] N[   o ] X[ @ p ]

For the filter, we want to compare alternative mark-ups for the same input string. Any other differences between the candidates can be ignored. So the first step in constructing the filter is to eliminate everything except the markers and the original input. For the syllable structure example, finding the original input is easy since it never gets changed. For the ``bebop'' example, the filter first constructs:


    b   e @ b @ o  @  p
    b   e   b   o  @  p
  @ b @ e   b   o  @  p

Since we want to rule out candidates with at least one more constraint violation than necessary, we apply a transducer to this set which inserts at least one more marker. This will yield an infinite set of bad candidates each of which has at least two markers and with one of the markers coming directly before the final `p'.

In order to use this set of bad candidates as a filter, brackets have to be reinserted. But since the filter does not care about positions of brackets, these can be inserted randomly. The result is the set of all strings with at least two markers, one of the markers coming directly before the final `p', and arbitrary brackets anywhere. This set includes the two candidates c1 and c3 above. Therefore, after applying this filter only the optimal candidate survives. The three steps of deleting brackets, adding extra markers and randomly reinserting brackets are encoded in the add_violation macro given in fig. 4.

Figure 4: Macro to introduce additional constraint violation marks.

macro(add_violation,
      {(bracket x []), ? - bracket}*    % delete brackets
                     o
           [[? *,([] x @)]+, ? *]       % add at least one @
                     o
      {([] x bracket), ? - bracket}*    % reinsert brackets
     ).

The application of an OT constraint can now be defined as follows, using an alternative definition of the optimality operator:


macro(Cands oo Constraint, 
          Cands
            o
   mark_violation(Constraint)
            o
     ~ range(Cands
               o
         mark_violation(Constraint)
               o
          add_violation)
            o
      {(@ x []),(? -  @)}* ).

Note that this simple approach only works in cases where constraint violations line up neatly. It turns out that for the syllabification example discussed earlier that this is the case. Using the syllabify macro given above with this matching implementation of the optimality operator produces a transducer of only 22 states, and can be shown to be exact for all inputs!


next up previous
Next: Permutation Up: The Matching Approach Previous: The Matching Approach

2000-06-29