Consider the following phonological rule (from [15]) in which an underlying nasal segment N is mapped either to an m (if followed by a p) or an n:
A transducer implementing this phonological rule can be illustrated as follows:
This transducer contains a single start state 0, and two final states,
0 and 1. First consider the cyclic transition on state 0, labeled by
the predicate
, i.e., the predicate which is
true of all symbols except the symbol N. As long as the
transducer does not read this special nasal segment N, it remains
in state 0 and simply copies its input. Upon reading an N, the
transducer non-deterministically moves to state 1 or state 2, writing
out an n or m respectively. In the first case, the next
input symbol cannot be a p; in the second case the next input
symbol must be a p.
Note that the transition from state 2 to state 0 simply contains a
p. The idea here is that if the input and output symbol must be
identical, only a single predicate is written for that transition. The
same abbreviation is used for the transition from 1 to 0, as well as
over the looping transition from 0 to 0 with predicate
. The
intention here of course is that every incoming segment which is not
equal to N should be mapped to itself in the output. However, note
that this is quite different from the pair of predicates
. The
latter would map an incoming symbol to an arbitrary output symbol, as
long as both symbols are unequal to N.
The example illustrates an important point: if predicates are introduced in transducers, then for typical examples we must also be able to express the identity of input and output of a transition. In this example, if there were no way to express the identity between input and output, then we would be forced to have multiple transitions such as a:a, b:b, c:c, d:d for all of the relevant symbols; the introduction of predicates can be exploited in transducers only if identity between input and output can be expressed as well.
Expressing identity between input and output is crucial. This notion
of identity can be seen as a consequence of the linguistic principle
of Faithfulness: corresponding input and output segments tend
to be identical. A similar argument is expressed in
[11]. Indeed, many interesting transducers are
of the type `change all occurrences of in some specific
context into
, and pass on the rest of the input unaltered'.
The various replacement and `local extension' operators all produce
transducers of this kind
[16,31,17,20,9].
Identities can be seen as a limited case of backreferencing.
Backreferencing is an extension of regular expressions widely used
in editors, scripting languages and other tools. A limited version of
finite-state calculus backreferences is discussed in
[10].