FSA Transducers Tutorial
Regular expressions for Transducers
-
a:b replaces the symbol
a by the symbol b
-
[{a,b},{a,b}] x [{c,d},e^] replaces
a string matching [{a,b},{a,b}]
by a string matching [{c,d},e^]
-
(a:b)* maps strings of a's
of arbitrary length onto strings of b's of
the same length.
-
{?,{a,e,i}:v}* optionally
replaces the letters a and e
and
i by v in an
input consisting of arbitrary strings.
-
{? - {a,e,i},{a,e,i}:v}* obligatorily
replaces the letters a and e
and i by v in
an input consisting of arbitrary strings.
-
{? -b,[b,[]:a]}* obligatorily
inserts an a after every b.
-
{? -b,b x [b,a]}* obligatorily
inserts an a after every b.
-
(a:b)* o (b:c)* maps strings
of a's onto strings of c's
by composition of an automaton mapping a's
to b's and an automaton mapping b's
to c's
-
{?,{a,e,i}:v}* o ~ $ {a,e,i} obligatorily
replaces the letters a and e
and i by v in
an input consisting of arbitrary strings.
Exercises
-
Give a regular expression which substitutes all vowels in a word by v and
all consonants by c (i.e. acts --> vccc, aunt
--> vvcc ).
-
Assume the language only has syllables ending in a vowel. Give a regular
expression which optionally inserts an h (for
hyphen) after every v.
-
Give a regular expression which obligatorily inserts an h
after every v.
-
Give a regular expression which maps words into cv-patterns
as described above and which obligatorily inserts a hyphen after
every v.
American and British English
There are a number of well-known differences in spelling between American
and British English. Some more or less systematic differences are listed
on Spelling
differences between American and British English. To the extent that
these differences a regular, one might design spelling rules which automatically
transform American to British spelling.
-
Give a regular expression which obligatorily replaces ize
by ise and yze
to yse. One approach is to write an
expression which optionally applies first, and to compose this with a recognizer
(i.e. an identity transducer) which filters ize
and yze atterns.
-
Give a regular expression which changes eling
to elling. Why does this rule `overgenerate'?
-
Look for other systematic differences between American and British spelling
and try to deal with them in a similar way.
Hyphenation
A syllable in general consists of an onset (consisting of zero or more
consonants), followed by a nucleus (a single vowel), followed by a coda
(zero or more consonants). Assume a language where coda's consist of at
most one consonant. Furthermore, if a single consonant occurs between two
vowels, it is part of the second syllable.
abba
|
-->
|
ab-ba
|
aba
|
-->
|
a-ba
|
baab
|
-->
|
ba-ab
|
-
Give a regular expression which inserts a hyphen after each syllable. One
approach is to define a transducer which inserts hyphens optionally first,
and to compose this transducer with a recognizer which filters illegal
hyphenation patterns.
-
Ensure that no hyphen is inserted after the last syllable. One approach
is to add an end-of-word marker (say #) to each word and to specify
a filter which refers to this marker.