next up previous
Next: Per subset and per Up: Variants for -Moves Previous: Variants for -Moves

Per graph

In the per graph variant two steps can be identified. In the first step, efree, an equivalent $\epsilon $-free automaton is constructed. In the second step this $\epsilon $-free automaton is determinised using the subset construction algorithm. The advantage of this approach is that the subset construction algorithm can remain simple because the input automaton is $\epsilon $-free.

An algorithm for efree is described for instance in [8][page 26-27]. The main ingredient of efree is the construction of the function $\epsilon $-CLOSURE, which can be computed by using a standard transitive closure algorithm for directed graphs: this algorithm is applied to the directed graph consisting of all $\epsilon $-moves of M. Such an algorithm can be found in several textbooks (see, for instance, [5]).

For a given finite-state automaton $M=(Q,\Sigma,\delta,S,F)$ efree computes $M'=(Q,\Sigma,\delta',S',F')$, where $S'=\epsilon\mbox{-CLOSURE}(S)$, $F'=\epsilon\mbox{-CLOSURE}^{-1}(F)$, and $\delta'(p,a) = \{ q \vert q'\in \delta(p',a),
p'\in\epsilon\mbox{-CLOSURE}^{-1}(p), q\in\epsilon\mbox{-CLOSURE}(q') \}$. Instead of using $\epsilon $-CLOSURE on both the source and target side of a transition, efree can be optimised in two different ways by using $\epsilon $-CLOSURE only on one side:

Although both variants appear very similar, there are some differences. Firstly, efreet might introduce states which are not co-accessible: states from which no path exists to a final state; in contrast, efrees might introduce states which are not accessible: states from which no path exists from the start state. A straightforward modification of both algorithms is possible to ensure that these states are not present in the output. Thus efreet,c ensures that all states in the resulting automaton are co-accessible; efrees,a ensures that all states in the resulting automaton are accessible. As a consequence, the size of the determinised machine is in general smaller if efreet,c is employed, because states which were not co-accessible (in the input) are removed (this is therefore an additional benefit of efreet,c; the fact that efrees,a removes accessible states has no effect on the size of the determinised machine because the subset construction algorithm already ensures accessibility anyway).

Secondly, it turns out that applying efreet in combination with the subset-construction algorithm generally produces smaller automata than efrees (even if we ignore the benefit of ensuring co-accessibility). An example is presented in figure 2. The differences can be quite significant. This is illustrated in figure 3.

Below we will write per graphX to indicate the non-integrated algorithm based on efreeX.

Figure 2: Illustration of the difference in size between two variants of efree. (1) is the input automaton. The result of efreet is given in (2); (3) is the result of efrees. (4) and (5) are the result of applying the subset construction to the result of efreet and efrees, respectively.
\begin{figure}
\begin{center}
\par\begin{pspicture}(-1,-1)(7.0,1.0)
\psset{unit=...
...{x}{a}}
\put(100,-40){\makebox{(5)}}
\end{pspicture}\par\end{center}\end{figure}

Figure 3: Difference in sizes of deterministic automata constructed with either efrees or efreet, for randomly generated input automata consisting of 100 states, 15 symbols, and various numbers of transitions and jumps (cf. section 4). Note that all states in the input are co-accessible; the difference in size is due solely to the effect illustrated in figure 2.
\begin{figure}
\centerline {\psfig{file=csizes.ps}}\end{figure}


next up previous
Next: Per subset and per Up: Variants for -Moves Previous: Variants for -Moves

2000-07-10