Problem statement

Let $\epsilon$ -move be the relation $\{(q_i,q_j) \vert q_j \in \delta(q_i,\epsilon)\}$ . $\epsilon$ -reachable is the reflexive and transitive closure of $\epsilon$ -move. Let $\epsilon$ -CLOSURE: $2^Q \rightarrow 2^Q$ be a function which is defined as:

$\begin{displaymath} \epsilon\mbox{-CLOSURE}(Q') = \{q \vert q'\in Q', (q',q)\in \epsilon\mbox{-reachable}\} \end{displaymath}$

For any given finite-state automaton $M=(Q,\Sigma,\delta,S,F)$ there is an equivalent deterministic automaton $M'=(2^Q,\Sigma,\delta',\{Q_0\},F')$ . F' is the set of all states in 2^Q containing a final state of M, i.e., the set of subsets $\{Q_i\in 2^Q\vert q\in Q_i, q\in F\}$ . M' has a single start state Q₀ which is the epsilon closure of the start states of M, i.e., $Q_0=\epsilon\mbox{-CLOSURE}(S)$ . Finally,

$\begin{displaymath} \delta'(\{q_1,q_2,\dots,q_i\},a) = \epsilon\mbox{-CLOSURE}(\delta(q_1,a) \cup \delta(q_2,a) \cup \dots \cup \delta(q_i,a)) \end{displaymath}$

An algorithm which computes M' for a given M will only need to take into account states in 2^Q which are reachable from the start state Q₀. This is the reason that for many input automata the algorithm does not need to treat all subsets of states (but note that there are automata for which all subsets are relevant, and hence exponential behaviour cannot be avoided in general).

Figure 1: Subset-construction algorithm.

$\begin{program} \FUNCT \vert subset_construction\vert((Q,\Sigma,\delta,S,F)) \v... ...\rcomment{variant 1: No $\epsilon$-moves} \keyword{return} ~U \END \end{program}$

Consider the subset construction algorithm in figure 1. The algorithm maintains a set of subsets States. Each subset can be either marked or unmarked (to indicate whether the subset has been treated by the algorithm); the set of unmarked subsets is sometimes referred to as the agenda. The algorithm takes such an unmarked subset T and computes all transitions leaving T. This computation is performed by the function instructions and is called instruction computation by [9].

The function index_transitions constructs the function transitions: $Q \rightarrow \Sigma\times 2^Q$ which returns for a given state p the set of pairs (s,T) representing the transitions leaving p. Furthermore, the function merge takes such a set of pairs and merges all pairs with the same first element (by taking the union of the corresponding second elements). For example:

$\begin{program} \vert merge\vert(\{(a,\{1,2,4\}), (b,\{2,4\}), (a,\{3,4\}), (b,\{5,6\})\}) = \mbox{~~~~~~~~~~} \{(a,\{1,2,3,4\}),(b,\{2,4,5,6\}) \} \end{program}$

The procedure add is responsible for `reachable-state-set maintenance', by ensuring that target subsets are added to the set of subsets if these subsets were not encountered before. Moreover, if such a new subset contains a final state, then this subset is added to the set of final states.