Next: Direct implementation of new
Up: Extendible Regular Expression Operators
Previous: Extendible Regular Expression Operators
A regular expression operator is defined as a pair
macro(ExprA,ExprB) which indicates that the regular expression
ExprA is to be interpreted as regular expression ExprB. For
example, simple nullary regular expression operators (equivalent to
abbreviatory devices found in tools such as lex and flex),
can be defined as in the following example:
![\begin{displaymath}\begin{minipage}[t]{.9\textwidth}\begin{verbatim}macro( vowel, {a,e,i,o,u} )\end{verbatim}\end{minipage}\end{displaymath}](img5.gif) |
(5) |
indicating that the operator vowel/0 can be understood by
assuming that every occurrence of vowel in a regular expression
is textually replaced by {a,e,i,o,u}.
The same mechanism is used to define n-ary operators, exploiting
Prolog variables. For instance, the containment operator
containment(Expr) is the set of all strings which have as a
sub-string any of the strings in Expr. This could be defined as
follows:2
![\begin{displaymath}\begin{minipage}[t]{.9\textwidth}\begin{verbatim}macro(containment(Expr), [? *,Expr,? *])\end{verbatim}\end{minipage}\end{displaymath}](img6.gif) |
(6) |
Naturally, operators defined in this way can be part of the definition
of other operators. For instance, the operator free(A) is the
language of all strings which do not have any of the strings in
A as a substring. This can be defined as:
![\begin{displaymath}\begin{minipage}[t]{.9\textwidth}\begin{verbatim}macro(free(A), ~containment(A))\end{verbatim}\end{minipage}\end{displaymath}](img7.gif) |
(7) |
We have found it useful to define boolean operators using this
mechanism. In fact, if we use the universal language to stand for
true and the empty language to stand for false, then the
standard operators for intersection and union correspond to
conjunction and disjunction:
![\begin{displaymath}\begin{minipage}[t]{.9\textwidth}\begin{verbatim}macro(true,? *).
macro(false,{}).\end{verbatim}\end{minipage}\end{displaymath}](img8.gif) |
(8) |
With these definitions we get the expected properties:
![\begin{displaymath}\begin{minipage}[t]{.9\textwidth}\begin{verbatim}true & tru...
...false = false {false,false} = false\end{verbatim}\end{minipage}\end{displaymath}](img9.gif) |
(9) |
The macros for true and false can also be used to define a conditional
expression in the calculus. The operator coerce_to_boolean
maps the empty language to the empty language, and any non-empty
language to the universal language:
![\begin{displaymath}\begin{minipage}[t]{.9\textwidth}\begin{verbatim}macro(coer...
...~coerce_to_boolean(Cond) o Else }).\end{verbatim}\end{minipage}\end{displaymath}](img10.gif) |
(10) |
Various interesting properties of automata have been implemented which
yield boolean values, such as the predicates
is_equivalent/2 for recognizers, and
is_functional/1 and is_subsequential/1 for transducers
(using the algorithms described in for instance [23]).
Regular expression operator definitions can also be recursive. The
following example demonstrates furthermore that definitions can take
the operands of the operator into account. The operator
set(List) yields the union of the languages given by each of the
expressions in the list List; union(A,B) is a built-in
operator providing the union of the two languages A and B:
![\begin{displaymath}\begin{minipage}[t]{.9\textwidth}\begin{verbatim}macro(set(...
...o(set([H\vert T]),union(H,set(T))).\end{verbatim}\end{minipage}\end{displaymath}](img11.gif) |
(11) |
We can also exploit the fact that these definitions are directly
interpreted in Prolog by providing Prolog constraints on such
rules. This possibility is used in [7] to
define a longest-match concatenation operator which implements the
leftmost-longest capture semantics required by the POSIX standard
(cf. section 4.4).
A simple example is a generalization of the operator free. Suppose
we want to define an operator free(N,Expr) indicating the set of
strings which do not contain more than N occurrences of
Expr. This can be done as follows:
![\begin{displaymath}\begin{minipage}[t]{.9\textwidth}\begin{verbatim}macro(free...
...0 > 0, N is N0-1, free_list(N,X,T).\end{verbatim}\end{minipage}\end{displaymath}](img12.gif) |
(12) |
Another example is an implementation of the N-queens problem:
how to place N queens on an N by N chess-board in such a way that no
queen attacks any other queen. For any N we can create a regular
expression generating exactly all strings of solutions. A solution to
the N-queen problem is represented as a string of N integers between 1
and N. An integer i at position j in this string indicates that a
queen is placed on the i-th column of the j-th row.
![\begin{displaymath}\begin{minipage}[t]{.9\textwidth}\begin{verbatim}macro(n_qu...
...onals(N)
& reverse(diagonals(N)))\end{verbatim}\end{minipage}\end{displaymath}](img13.gif) |
(13) |
The operator n_queens(N) is defined as the intersection of a
number of constraints. The first constraint, sigma(N)*, indicates
that a solution must be a string of integers between 1 and
N. The second constraint indicates that the length of the string
must be N. The remaining constraints ensure that queens do not
attack each other. The definition of length illustrates once
more the use
of Prolog to create a regular expression; the definition of
sigma/1 uses the set operator defined previously.
![\begin{displaymath}\begin{minipage}[t]{.9\textwidth}\begin{verbatim}macro(leng...
...is N0+1, N1 < N+1, between(N1,N,I).\end{verbatim}\end{minipage}\end{displaymath}](img14.gif) |
(14) |
The complete program is given in the appendix. For instance, the
expression n_queens(5) produces the automaton in
figure 1.
Figure 1:
Solution to the 5-queens problem
|
The mechanism described sofar to define new regular expression
operators is already quite powerful. As another illustration, consider
the problem of compiling a given finite automaton into a regular
expression. This problem becomes trivial if we allow the introduction
of new operators. Here is the definition of an operator `fa/5' which
describes an automaton as a listing of its components:
![\begin{displaymath}\begin{minipage}[t]{.9\textwidth}\begin{verbatim}macro(fa(S...
... [[States x [],?]*,States x []]
))\end{verbatim}\end{minipage}\end{displaymath}](img16.gif) |
(15) |
As an example, the automaton given in figure 2.16 of
[10] (given in figure 2) can be specified as:
![\begin{displaymath}\begin{minipage}[t]{.9\textwidth}\begin{verbatim}fa({0,1},{...
...],
[q2,1,q3],[q3,0,q2],[q3,1,q2]})\end{verbatim}\end{minipage}\end{displaymath}](img17.gif) |
(16) |
Figure 2:
Example automaton from figure 2.16 of
[10].
![\begin{figure}
\begin{pspicture}(-1,-1)(6.5,1.4)
\psset{unit=0.025cm}\rput(0,3.0...
...gle=30.0]{->}{2}{1}
\naput{\rnode{x}{0 1}}
\end{pspicture}\centering\end{figure}](img18.gif) |
Next: Direct implementation of new
Up: Extendible Regular Expression Operators
Previous: Extendible Regular Expression Operators
2000-07-03