next up previous
Next: Finite State Transducers with Up: Transducers with Predicates and Previous: Determinization of Transducers


Determinization and identities

To treat identities, we must assume in the definition of proto-transition that if one of the positively occurring predicates in the boolean combination is associated with an identity, then the resulting predicate is associated with an identity as well. As an example consider the following transducer. For simplicity we assume B and C are mutually exclusive predicates; as before ? is a predicate which is true of all symbols. Also, we write $\langle$A$\rangle$:$\langle$A$\rangle$ for a transition A:A with an associated identity constraint.


\begin{pspicture}(-1,-1)(10.0,3.5)
\psset{unit=0.02cm}\rput(0,60.0){\circlenode[...
...15.0]{->}{6}{1}
\nbput{\rnode{x}{\footnotesize {\sc c}:{\sc c}}}
\end{pspicture}

Determinization produces:


\begin{pspicture}(-1,-1.5)(10,1.5)
\psset{unit=0.02cm}\rput(0,0){\circlenode[]{0...
...\mbox{\sc b}\langle
?\rangle\langle ?\rangle\mbox{\sc b}$}}
\par\end{pspicture}

Outputs associated with an identity are delayed like ordinary outputs. Generalizing an idea due to Tamás Gaál and Lauri Karttunen10 transducers with such disconnected identities are interpreted as follows. During the transduction of a string, a queue is maintained. Each time an input symbol is matched by a predicate with an associated identity, this symbol is enqueued. If a symbol matched by the corresponding predicate on the output side has to be written, then that symbol is obtained by a dequeue operation. With this use of a queue, our method for interpreting a transducer is no longer finite state. The transducer itself, however, still encodes a regular transduction.

A complication arises in cases like:


\begin{pspicture}(-2,-1)(10,3.5)
\psset{unit=0.02cm}\rput(0,60.0){\circlenode[]{...
...15.0]{->}{4}{1}
\nbput{\rnode{x}{\footnotesize {\sc b}:{\sc b}}}
\end{pspicture}

Determinization yields:


\begin{pspicture}(-1,0.5)(10.0,5.5)
\psset{unit=0.02cm}\rput(0,120.0){\circlenod...
...15.0]{->}{5}{2}
\nbput{\rnode{x}{\footnotesize {\sc b}:{\sc b}}}
\end{pspicture}

What sequence of output predicates should be put on the position of the *? According to the definitions, we get CE. However, this is not right because then there is a path $0 \rightarrow 3 \rightarrow 4 \rightarrow
1$ which has an identity on the input side without a corresponding identity on the output. Embedding such examples would lead to transducers in which identities are `out of sync'. The determinization algorithm is therefore extended by marking in the output part that the scope of an identity ends; procedurally such a mark is interpreted as a dequeue operation which ignores the dequeued value. We write such a mark as $\langle\rangle$. In the example the sequence of outputs X becomes $\mbox{\sc
ce}\langle\rangle$. In the definition of proto-transition, if at least one of the positively occurring $\pi_k$ has an associated identity then we append a $\langle\rangle$ mark to each of the outputs Trans$^P(\pi_l)$ for which $\pi_l$ was not associated with an identity.


next up previous
Next: Finite State Transducers with Up: Transducers with Predicates and Previous: Determinization of Transducers
Noord G.J.M. van
2001-06-22