12. Command-line Arguments

Usage: fsa [Flag=Val]* [ActionOption]

The fsa program can be started with command-line arguments (options). A command-line consists of a number of global variable assignments, following by (at most one) action option, followed by more global variable assignments. If no action option is provided, then the system runs in interactive mode. If the variable interpreter is set to on, then fsa runs in interactive mode after the action indicated by the action option has been performed. If the flag interactive has been set to cmdint, then fsa runs the FSA command interpreter. Otherwise, you get the ordinary SICStus Prolog prompt.

Typical actions that can be performed through the use of an action option are:

regular expression operations such as kleene closure, complementation for given automata
determinization and minimization of automata
construction of automaton on the basis of regular expression
visualization of given automaton
apply automaton to a string or set of strings

12.1. -aux Aux

Aux is a file containing auxiliary regular expression operator definitions. It is loaded into module fsa_regex_aux, and will be used for compiling regular expressions.

Note that your file with definitions of regular expression operators is consulted with the special Prolog-syntax operators for regular expression notation loaded. Thus you can use * .. & etc. in your definitions. Drawback is that you cannot use operator notation for e.g. the is/2 predicate.

A typical auxiliary definition will be:

macro(vowel,{a,e,i,o,u}).

A slightly more interesting one:

macro(free(Expr), ~ $ Expr).

You can also explicitly construct an automaton yourself, e.g.:

rx(my_operator(Expr),Fa) :-
fsa_regex:rx(Expr,Fa0),
my_operator_definition(Fa0,Fa).

so you can call fsa_regex:rx/2 for further compilations.

There can be multiple -aux options.

12.2. -pm File

File is supposed to contain the definition of a predicate module. The file is loaded and moreover the global variable pred_module is set to the name of the module defined in that file. There can be multiple -pm options

12.3. -l File

The File is loaded, using use_module/1. There can be multiple -l options.

12.4. -cmd Goal

evaluates Prolog Goal; Goal is parsed as Prolog term. Example:

fsa -cmd 'listing(library_directory),halt').

There can be multiple -cmd options

12.5. -cmdint

Run interactively with the FSA6 command interpreter.

12.6. -a[ccepts] [In] String

This option can be used to test a given string for acceptance by an automaton read from In (or standard input). Fsa prints `yes' or `no' to standard error. Example:

% fsa -r 'a+' | fsa -a aaa

Prints `yes'.

12.7. -approx [In] String

This option can be used to get all best matches for a given string and an automaton read from In (or standard input). In must contain an automaton.

% fsa -r '[a,a]+' | fsa -approx aaa

Prints:

aa
aaaa

12.8. -fsa2fsm In Syms Aut | -fsa2fsm [In [Out]]

If three file names are given, then the automaton read from the first file is converted to an automaton in AT&T's fsm software format. That automaton is written into Aut; Syms will contain a mapping from the integers used in Aut to the actual symbols. If less than three file names are given, then it is assumed that the actual symbols can be ignored and no Syms is written.

12.9. -fsm2fsa [In [Out]]

Converts an automaton in AT&T's fsm software format into fsa5 format. Note that a separate symbol definition file is currently not supported.

12.10. -c[ompile] [In [Out]]

For a given automaton read from file In, a Prolog program is written to Out. For details, see the module fsa_compiler.

12.11. -c[ompile_to_]c [In [Out]]

For a given automaton read from In a C program is written to Out. For details, see module compiler_to_c.

12.12. -java [In] Out

For a given automaton read from In, a JAVA program is written to Out. For details, see module fsa_java.

12.13. -cpp [In] Out

For a given automaton read from In, a C++ program is written to Out. For details, see module fsa_cpp.

12.14. -c++ [In] Out

For a given automaton read from In, a C++ program is written to Out. For details, see module fsa_cpp.

12.15. -compose A B [Out]

The transducers read from A and B are composed, and the result is written to Out. Equivalent to

fsa -r 'compose(file(A),file(B))' >Out

12.16. -complement [In [Out]]

The complementation operator is applied to the automaton read from In, and the result is written to Out.

12.17. -count [In [Out]]

For the automaton read from In the number of transitions and symbols and some other properties is written to Out. If -short is specified, then the output is given as a single line consisting of the number of states, start states, final states, transitions, jumps, and symbols respectively. Otherwise a more elaborate message is printed meant for human consumption.

12.18. -density [In [Out]]

For the automaton read from In various densities are reported to Out. Deterministic density is the number of transition divided by the number of states times the number of symbols; absolute density is the number of transitions dividided by the number of states squared times the number of symbols. Jump density is the number of jumps dividied by the squared number of states. Deterministic density can be used to characterize the difficulty of determinization. For deterministic densities of around 2, exponential blow-up of the output (and hence processing time) can be expected (Leslie 1995). Jump density can be used to estimate the most efficient subset construction algorithm (van Noord 1998).

12.19. -davinci [In [Out]]

For the automaton read from In a corresponding DaVinci term is written to Out. This can be used to visualize the automaton In using DaVinci:

fsa -davinci a.nd > a.davinci
daVinci a.davinci

12.20. -vcg [In [Out]]

For the automaton read from In a corresponding vcg term is written to Out. This can be used to visualize the automaton In using (x)vcg:

fsa -vcg a.nd | xvcg -

12.21. -dot [In [Out]]

For the automaton read from In a corresponding dot term is written to Out. This can be used to visualize the automaton In using dot or dotty:

fsa -dot a.nd | dotty -
fsa -dot a.nd | dot -Tps | gv -
fsa -dot a.nd | dot -Tgif | xv -

12.22. -d[eterminize] [In [Out]] | -dgraph [In [Out]] -drgraph [In [Out]] -dsubset [In [Out]] | -dstate [In [Out]]

The automaton read from In is determinized and written to Out. FSA6 supports four variants of the determinization algorithm. In the first form, a heuristic is used (based on the jump density) to select the variant of the determinization variant. The other forms indicate the particular variant that is to be used. For details, refer to the fsa_determinizer module.

12.23. -efree [In [Out]]

For the automaton read from In an equivalent automaton without any epsilon transitions (jumps) is written to Out.

12.24. -ignore A B [Out]

Equivalent to:

fsa -r 'ignore(file(A),file(B))' > Out

12.25. -diff[erence] A B [Out]

Equivalent to:

fsa -r 'difference(file(A),file(B))' > Out

12.26. -aa In | -accept_all In | -raa Regex

The program checks each string read from standard input for acceptance by the automaton read from In. Depending on the type of the automaton, the system reports the transductions for each string, or simply `yes' or `no' (for recognizers). In the third form the recognizer is specified by regular expression Regex rather than by an automaton.

12.27. -prolog Goal

Evaluates Prolog goal. Example:

fsa -prolog 'listing(user:file_search_path).'

12.28. -generate States Syms Dens [JDens]

This option is used to generate random finite automata, using the algorithm of Leslie 1995. States is the number of states, Syms is the number of symbols, Dens is absolute density, and JDens is the jump density.

12.29. -intersect A B [Out]

Equivalent to:

fsa -r 'intersect(file(A),file(B))' > Out

12.30. -kleene_star [In [Out]]

Equivalent to:

fsa -r 'file(In)*' > Out

12.31. -minimum_path [In]

Writes the path with the minimum weigth of the automaton read from In.

12.32. -kleene_plus [In [Out]]

Equivalent to:

fsa -r 'file(In)+' > Out

12.33. -reverse [In [Out]]

Equivalent to:

fsa -r 'reverse(file(In))' > Out

12.34. -inverse [In [Out]]

Equivalent to:

fsa -r 'inverse(file(In))' > Out

12.35. -domain [In [Out]]

Equivalent to:

fsa -r 'domain(file(In))' > Out

12.36. -range [In [Out]]

Equivalent to:

fsa -r 'range(file(In))' > Out

12.37. -cleanup [In [Out]]

Equivalent to:

fsa -r 'cleanup(file(In))' > Out

12.38. -identity [In [Out]]

Equivalent to:

fsa -r 'identity(file(In))' > Out

12.39. -option [In [Out]]

Equivalent to:

fsa -r 'option(file(In))' > Out

12.40. -union A B [Out]]

Equivalent to:

fsa -r 'union(file(A),file(B))' > Out

12.41. -concat A B [Out]]

Equivalent to:

fsa -r 'concat(file(A),file(B))' > Out

12.42. -m[inimize] [In [Out]] | -mb [In [Out]] | -mh [In [Out]]

Minimizes the automaton read from In. The first version uses the default minimization algorithm (by Brzozwski). The other options explicitly require the algorithms by, respectively, Brzozowski or Hopcroft. Refer to the fsa_regex module and the fsa_minimizer module for details.

12.43. -t_m[inimize] [In [Out]]

Applies the minimization algorithm for transducers (by Mohri) to the transducer read from In. Refer to the fsa_t_determinizer module for details.

12.44. -produce [In [Out]]

For the recognizer read from In strings accepted by In are written to Out.

12.45. -sample [In [Out]]

For the automaton read from In strings or string pairs accepted by In are written to Out, using a sampling procedure based on the weights in In. If In is not a weighted automaton, then uniform weights are assumed. The size of the sample is determined by the flag nr_sol_max. Examples:

fsa -r '[a::0.3*,b::5*]' |
fsa nr_sol_max=10000 -sample |sort |uniq -c | sort -nr

produces:

5331 []
2519 a
1028 aa
456 aaa
187 aaaa
172 b
108 aaaaa
73 ab
32 aaaaaa
28 aab
24 aaaaaaa
17 aaab
8 aaaaaaaa
5 bb
3 abb
3 aaaab
3 aaaaab
2 aaaaaaaaa
1 aabb

fsa -r '[a*,b*]' |
fsa nr_sol_max=100 -sample |sort |uniq -c | sort -nr

produces:

32 []
16 b
11 a
9 bb
5 abb
5 ab
3 bbbb
3 bbb
3 aa
2 bbbbb
2 abbbb
2 aaa
1 bbbbbbb
1 abbbbbb
1 abbb
1 aab
1 aaabb
1 aaaabbb
1 aaaaaab

12.46. -r[egex] [Regex] [Out]

The automaton described by regular expression Regex are written to Out. If Regex is not specified, it is read from standard input.

12.47. -tk [File] | -tk [-r Regex]

Starts the graphical user interface on the automaton in File, or the automaton defined by the regular expression Regex.

12.48. -postscript [In [Out]]

Produces postscript version of the automaton read from In.

12.49. dict2ph [In [Out]]

A minimal string-to-weight transducer will be written to Out, transducing each of the lines read from In into its rank in alphabetic ordering; in other words, the transducer computes a perfect hash for the keys read from In. For more info, see module fsa_dict.

12.50. dict2m [In [Out]]

A minimal recognizer will be written to Out, recognizing each of the lines read from In.

12.51. -pstricks_tex [In [Out]] | -pstricks_picture [In [Out]]

Produces LaTeX code using PsTricks macro's for the automaton read from In. In the first variant a self-contained LaTeX document is produced; in the second variant a LaTeX picture is produced to be included in another document.

12.52. -copy [In [Out]]

Copies the automaton from In to Out. Useful to convert between different formats using the read and write global variables:

fsa read=fast write=normal -copy a.nd b.nd

12.53. -prefix_tree [In [Out]]

A so-called prefix tree will be written to Out, which is a weighted recognizer of all the lines read from In, where the shape of the recognizer is the corresponding trie, and the weights are derived from the counts that each transition is used in In. Cf. Carrasco and Oncina, 1999.

12.54. -dict [In [Out]]

A so-called trie will be written to Out, which is a deterministic recognizer of all the lines read from In, where no states have more than a single incoming transition.

12.55. -t_d[eterminize] [In [Out]]

The determinization algorithm for transducers (by Mohri) is applied to string-to-string transducer In and yields Out. For details, refer to module t_determinizer.

12.56. -w_d[eterminize] [In [Out]]

The determinization algorithm for transducers (by Mohri) is applied to string-to-weight transducer In and yields Out. For details, refer to module t_determinizer.

12.57. -w_m[inimize] [In [Out]]

The minimization algorithm for transducers (by Mohri) is applied to string-to-weight transducer In and yields Out. For details, refer to module t_determinizer.

12.58. -identical In1 In2

Reports to standard output whether the two automata read from In and In2 are identical. The algorithm used to determine this first determines both automata, and then renames the states of the automata in a canonical way. The automata are identical in case their canonical representations are identical. Note that two non-identical automata might still be equivalent in the sense that they define the same language or the same relation.