next up previous
Next: Availability Up: A note on Implementation Previous: Minimization

Practical Experience

The main purpose of the development of the FSA Utilities toolbox has been to experiment with new techniques, rather than the construction of practical programs. However, one might still be interested in the practical efficiency of some of the operations provided. The determinization and minimization operations are the most critical operations. Minimization typically is much slower than determinization. The second version of the minimizer (using reversal and determinization twice) is much faster than the other versions. The current implementation of the determinizer runs much faster than a previous version, in which the transitive closure of jumps was computed as a separate pre-processing step. We also experienced that the second version of the minimizer runs much faster on a determinized automaton.

To get at least some idea of the system's practical behavior, first consider a finite-state automaton recognizing all mono-syllabic Dutch words. This non-deterministic automaton (written by Gosse Bouma) has 185 transitions, 9 jumps, 42 states and 26 symbols. Determinization takes 200 milliseconds (on a 1995 HP-UX 9000/735), resulting in an automaton with 49 states and 392 transitions. Minimization is much slower: it takes about 24 seconds to reduce the number of states to 36 and the number of transitions to 315. If Brzozowski's method is used, minimization can be done in less than three seconds and has the additional advantage that the input need not be determinized first. Hopcroft's method takes about 11 seconds.

As another example consider a finite-state transducer which translates temporal expressions such as two minutes before half past seven into 7 28. This automaton is larger: it has 5824 states, 8809 transitions, 67 input symbols and 61 output symbols. Determinization takes less than 5 seconds. The result has 1714 states and 3362 transitions. In comparison, in order to determinize the automaton defining the domain of the transduction (obtained by simply removing the output part of the symbols) takes less than three seconds. The resulting automaton can be minimized (using Brzozowski's method) in about four seconds.


next up previous
Next: Availability Up: A note on Implementation Previous: Minimization
Noord G.J.M. van
1998-09-28