Per graph

Next: Per subset and per Up: Variants for -Moves Previous: Variants for -Moves

Per graph

In the per graph variant two steps can be identified. In the first step, efree, an equivalent $\epsilon$ -free automaton is constructed. In the second step this $\epsilon$ -free automaton is determinised using the subset construction algorithm. The advantage of this approach is that the subset construction algorithm can remain simple because the input automaton is $\epsilon$ -free.

An algorithm for efree is described for instance in [8][page 26-27]. The main ingredient of efree is the construction of the function $\epsilon$ -CLOSURE, which can be computed by using a standard transitive closure algorithm for directed graphs: this algorithm is applied to the directed graph consisting of all $\epsilon$ -moves of M. Such an algorithm can be found in several textbooks (see, for instance, [5]).

For a given finite-state automaton $M=(Q,\Sigma,\delta,S,F)$ efree computes $M'=(Q,\Sigma,\delta',S',F')$ , where $S'=\epsilon\mbox{-CLOSURE}(S)$ , $F'=\epsilon\mbox{-CLOSURE}^{-1}(F)$ , and $\delta'(p,a) = \{ q \vert q'\in \delta(p',a), p'\in\epsilon\mbox{-CLOSURE}^{-1}(p), q\in\epsilon\mbox{-CLOSURE}(q') \}$ . Instead of using $\epsilon$ -CLOSURE on both the source and target side of a transition, efree can be optimised in two different ways by using $\epsilon$ -CLOSURE only on one side:

efree^t: $M'=(Q,\Sigma,\delta',S',F)$ , where $S'=\epsilon\mbox{-CLOSURE}(S)$ , and $\delta'(p,a) = \{ q \vert q'\in \delta(p,a), q\in\epsilon\mbox{-CLOSURE}(q') \}$ .
efree^s: $M'=(Q,\Sigma,\delta',S,F')$ , where $F'=\epsilon\mbox{-CLOSURE}^{-1}(F)$ , and $\delta'(p,a) = \{ q \vert q\in \delta(p',a), p'\in\epsilon\mbox{-CLOSURE}^{-1}(p) \}$ .

Although both variants appear very similar, there are some differences. Firstly, efree^t might introduce states which are not co-accessible: states from which no path exists to a final state; in contrast, efree^s might introduce states which are not accessible: states from which no path exists from the start state. A straightforward modification of both algorithms is possible to ensure that these states are not present in the output. Thus efree^t,c ensures that all states in the resulting automaton are co-accessible; efree^s,a ensures that all states in the resulting automaton are accessible. As a consequence, the size of the determinised machine is in general smaller if efree^t,c is employed, because states which were not co-accessible (in the input) are removed (this is therefore an additional benefit of efree^t,c; the fact that efree^s,a removes accessible states has no effect on the size of the determinised machine because the subset construction algorithm already ensures accessibility anyway).

Secondly, it turns out that applying efree^t in combination with the subset-construction algorithm generally produces smaller automata than efree^s (even if we ignore the benefit of ensuring co-accessibility). An example is presented in figure 2. The differences can be quite significant. This is illustrated in figure 3.

Below we will write per graph^X to indicate the non-integrated algorithm based on efree^X.

**Figure 2:** Illustration of the difference in size between two variants of *efree*. (1) is the input automaton. The result of *efree^t* is given in (2); (3) is the result of *efree^s*. (4) and (5) are the result of applying the subset construction to the result of *efree^t* and *efree^s*, respectively.
$\begin{figure} \begin{center} \par\begin{pspicture}(-1,-1)(7.0,1.0) \psset{unit=... ...{x}{a}} \put(100,-40){\makebox{(5)}} \end{pspicture}\par\end{center}\end{figure}$

**Figure 3:** Difference in sizes of deterministic automata constructed with either *efree^s* or *efree^t*, for randomly generated input automata consisting of 100 states, 15 symbols, and various numbers of transitions and jumps (cf. section 4). Note that all states in the input are co-accessible; the difference in size is due solely to the effect illustrated in figure 2.
$\begin{figure} \centerline {\psfig{file=csizes.ps}}\end{figure}$

Next: Per subset and per Up: Variants for -Moves Previous: Variants for -Moves

2000-07-10