The experiments were performed using the FSA Utilities. The FSA Utilities tool-box [20,22,7,23] is a collection of tools to manipulate regular expressions, finite-state automata and finite-state transducers. Manipulations include determinisation, minimisation, composition, complementation, intersection, Kleene closure, etc. Various visualisation tools are available to browse finite-state automata. The tool-box is implemented in SICStus Prolog, and is available free of charge under Gnu General Public License via anonymous ftp at ftp://ftp.let.rug.nl/pub/vannoord/Fsa/, and via the web at http://www.let.rug.nl/~vannoord/Fsa/. At the time of our initial experiments with finite-state approximation, an old version of the tool-box was used, which ran into memory problems for some of these automata. For this reason, the subset construction algorithm has been re-implemented, paying special attention to the treatment of -moves. Three variants of the subset construction algorithm are identified which differ in the way -moves are treated:
The motivation for this paper is the experience that the first approach turns out to be impractical for automata with very large numbers of -moves. An integration of the subset construction algorithm with the computation of -reachable states performs much better in practice for such automata.
Section 2 presents a short statement of the problem (how to determinise a given finite-state automaton), and a subset construction algorithm which solves this problem in the absence of -moves. Section 3 defines a number of subset construction algorithms which differ with respect to the treatment of -moves. Most aspects of the algorithms are not new and have been described elsewhere, and/or were incorporated in previous implementations; a comparison of the different algorithms had not been performed previously. We provide a comparison with respect to the size of the resulting deterministic automaton (in section 3) and practical efficiency (in section 4). Section 4 provides experimental results both for randomly generated automata and for automata generated by approximation algorithms. Our implementations of the various algorithms are also compared with AT&T's FSM utilities [13], to establish that the experimental differences we find between the algorithms are truly caused by differences in the algorithm (as opposed to accidental implementation details).