FSA is capable of reading and writing finite automata in a number of different formats. The defaul format for reading is determined by the global variable read. The default format for writing is determined by the global variable write.
The following formats are available both for reading and writing:
fast. Binary format of the normal format. Uses library(fastrw). Much faster reading and writing of automata. Drawback: binary files.
normal. Internal representation (single Prolog term).
old. Prolog program defining clauses start/1, final/2, trans/3, jump/2. A variant of this was used by FSA2 and FSA3, but it is still useful, for instance, if you want to input automata directly, rather than by means of regular expressions.
fsm. Format of automata as used in AT&T's fsm library.
compact. text format, fairly compact. Slow (especially for output).
For writing, the following additional formats are available:
ps. PostScript.
vcg. Input to the Xvcg graph visualization tool.
davinci. Input to the DaVinci graph visualization tool.
tk. Starts a interactive tcl/tk widget.
dot. Input for the GraphViz visualization tools dot and dotty.
pstricks. LaTeX code to be included in a document; requires pstricks macro's.
latex. LaTeX document; requires pstricks macro's.
prolog. Prolog program; interface to fsa_compiler module.
c. C program; interface to fsa_compiler_to_c module.
java. JAVA program; interface to fsa_java module.
cpp. C++ program; interface to fsa_cpp module.
fsm. Format of automata as used in AT&T's fsm library.
grail. Format of automata as used in the Grail programme.
The normal format is the internal format used in FSA6. The module fsa_data provides a consistent interface to this format.
Here is a table indicating the relative speed of the standard input and output formats:
format compact fast normal
writing 20 1 4
reading 5 1 4
Here is a table indicating the relative size of the standard input and output formats (measured in bytes):
compact fast normal
1 1.8 1.6
In the old format a finite automaton is given as a Prolog program. The automaton is defined by clauses for the predicates:
start(State)
final(State)
trans(State0,Sym,State)
jump(State0,State)
Note that in this format states can be are arbitrary ground Prolog terms (these will be converted to integers). In the case of transducers, Sym is a pair Left/Right. The empty list [] is used to indicate the empty string. In the case of sequential transducers, Right must be a list of symbols.
The compact format fairly closely follows the normal format. See the documentation on the normal format in the fsa_data module for more information. In this format a file is an ordinary text file. The format is intended to be used for machines only, and is not very helpful for human consumption.
The first line of the file is the string "fsa6". This is used to differentiate the file from the pre-fsa6 compact formats (which can still be read-in).
The second line is the letter r for recognizers or t for transducers.
For recognizers, the third line is the name of the predicate module.
For transducers, there are two such lines. The first line defines the domain predicate module, the second line the range predicate module.
The next line is an integer indicating the number of states
The next line is an ordered sequence of integers, separated by tabs, indicating the start states
The next line is an ordered sequence of integers, separated by tabs, indicating the final states
The next lines each represent a transition, until an empty line is encountered. The transitions are ordered. Each transition is a triple State Symbol State. Seperator is the tab again. States are integers. Symbols are readable as Prolog terms (recognizers) or pairs of In/Out, where In is a term and Out is either a single term or a list of terms. If the source state is identical to the source state of the previous line, it can be left out. If the symbol is identical as well, then it can be left out as well.
The next lines are jumps. Jumps are ordered. Each line consists of two states separated by a tab. If the source state is identical to the source state of the previous line, it can be left out. If the symbol is identical as well, then it can be left out as well.
Example:
fsa write=compact -r '[class(a..f),{g,h}]'
fsa6
r
fsa_preds
3
0
1
0 in([a,b,c,d,e,f]) 2
2 g 1
h 1
The fast format uses the same Prolog term representation as the normal format, except that library(fastrw) is used to read and write the Prolog term. This implies that reading and writing of automata in this format is very fast; the disadvantage is that fast is a binary format and therefore cannot be (easily) treated by other programs.
This module provides a consistent interface to the internal format of finite automata. A finite automaton is a term
fa(Symbols,States,Starts,Finals,Transs,Jumps)
Symbols is a term r(Sig) (for recognizers) or t(SigD,SigR) for transducers. Weighted automata have r(Sig,fsa_frozen) and t(SigD,SigR,fsa_frozen). Here Sig, SigD, SigR are the predicate modules.
States is an integer indicating the number of states in the automaton.
Starts is an ordered list of integers indicating the start states of the automaton.
Finals is an ordered list of integers indicating the final states of the automaton.
Transs is an ordered list of triples trans(A,B,C) where A and C are integers indicating source and target state, and B is the symbol part. The symbol part is different for the different types of automata:
recognizers: P where P is a predicate
weighted recognizers: P/W where P is a predicate and W is a number
letter transducer: P/Q where P and Q is a predicate or the empty list
sequence transducer: P/Q where P is a predicate or the empty list, and Q is a list of predicates
weighted letter transducer: P/(Q/W) where P and Q is a predicate or the empty list, and W is a number
weighted sequence transducer: P/(Q/W) where P is a predicate or the empty list, and Q is a list of predicates
In addition, transducers allow a term '$@'(P) anywhere where a predicate is allowed (P a predicate).
Jumps is an ordered list of pairs jump(A,B) where A and B are integers indicating source and target state. This implies there is an epsilon transition from A to B.