#MATH192# P NP.
CYK is a bottom-up algorithm. Other
parsing algorithms, such as top-down parsing, are more selective: the
kinds of subparse that are constructed from a certain input position
onward are dependent on subparses already found to the left of that
input position. An example is Earley's algorithm [9]. Other
examples can be found in e.g. [14].
When complex categories are used (as is the case in OVIS), the
parsing problem becomes even more complicated and further distinctions
can be made between different top-down strategies and different
bottom-up strategies.
A further distinction between parsing algorithms can be made by
identifying in which order the input string is traversed. Many
algorithms proceed from left to right, but it has been observed that a
bidirectional flow of control, as in head-driven parsers
[12,22,20,15], has practical
advantages [4].
Although the end result of parsing is always the same,
the speed of different parsing algorithms may differ significantly.
This is a relevant issue for a real-time system such as OVIS. For this reason we are currently developing implementations of a
number of different parsing algorithms. These implementations are then
compared to find out which parser works best for the OVIS
grammar and input which is typical of the application.
For the construction of parsers from the grammar, we use a parser
generator that is capable of detailed analyses of
the grammar. For example, it performs a separation of grammar symbols
into a part that is required for finding the correct parses, and a
part that is merely required for finding the semantics of correct parses.
During syntactic analysis, the latter part only needs to be applied
for complete parses, and not for all subparses, some of which may
turn out to be useless later. This may speed up the syntactic analysis.
Word graphs
In practice, speech recognition does not allow unique identification
of the list of words that a person actually utters at the
other end of a telephone line. There are a number of reasons why this
is not possible.
First, there are technical limitations in
state-of-the-art speech recognition for connected
speech, which makes that full
recognition of all words is unattainable for all speakers regardless
of dialects and foreign accents, the pitch of their voices, how well
they articulate, etc.
The second reason is the varying quality of
telephone connections, which may blur the spoken text to some degree.
Also background noise, such as slamming doors, noisy pets, and
conversations in the background, constitute problems for speech
recognizers. Finally, some sequences of sounds may actually be
ambiguous, in the sense that they may represent multiple sequences of
words. The reason that people are usually unaware of such ambiguity
is because we unconsciously disambiguate by means of syntax and other
kinds of non-phonetic information.
These considerations have led to the notion of word graph
[16]. A word graph, produced by an automatic speech
recognition device, is a succinct representation of all alternative
sequences of words where the device cannot identify a unique sequence.
Most parsing algorithms for linear input can easily be
generalized to deal with word-graphs [3,13,23].
An example of such a graph is given in figure 6. The
nodes represent the points in time. The left-most node represents the
point in time where a sentence may start, the right-most node
represents the point in time where the sentence may end.
All paths from the initial to the final node represent one alternative
sentence. The edges in the graph are labeled with words, but also
with scores, which indicate some degree of certainty with which
words are recognized. The lower the score, the likelier it is that a
word should represent the sounds perceived between the corresponding
points in time. For a sequence of words, the scores
can be summed. The result is a degree of likelihood that that
sequence of words corresponds with the actual utterance.
One could try to find the path in the graph from the initial
node to the final node which has the lowest score, and take
this to be the intended sentence. However, as we have argued before,
non-phonetic information such as syntax and world knowledge is often
needed to find the correct sentence, and therefore selection
of one particular sentence based on phonetic scores alone may be premature.
In the example of figure 6, the path corresponding with
the `sentence'
Ik willen een Etten snelheid |
I want(plural) a Etten speed |
has the lowest score, yet it is unlikely that this is the sentence
uttered by the user, for obvious reasons.
Therefore, the other sentences should not be discarded before
syntactic and semantic considerations have been taken into account.
Figure 6:
An example of a word graph.
|
In the running example there are 8 paths in the word graph. Only
two are more or less syntactically correct:
Ik wil alleen met de sneltrein |
I want only with the inter-city |
Ik wil alleen met de snelheid |
I want only with the speed |
The second has the lowest score, viz.
5 + 8 + 4 + 8 + 6 + 5 = 36, and may be
taken as the intended sentence, unless later phases of the analysis
(e.g. pragmatic analysis) prefer the first, which is likely in
this case.
As a first approximation, we suggested in the previous section that
the analysis of the input should restrict itself to the sentences in
the word graph that could be processed by the syntactic analysis.
Strictly speaking, syntactic analysis means assigning a structure to a
syntactically correct sentence. In this section we have to refine
this idea. The reason is that typical sentences that occur in spoken
dialogues may not be syntactically correct. Since in everyday
human-human conversations, syntactic errors hardly ever disrupt the
normal course of dialogues, one would like the OVIS system to be
equally robust against any kind of syntactic error in the sentences
spoken by the user.
More specifically, if the grammar does not assign an analysis to a
given input, then the system should still try to construct some form
of parse which allows a semantic value to be computed. This semantic
value should in the ideal case correspond with the intentions of the
user, as expressed in the ungrammatical sentence (note that robustness
concerns both ill-formed input and well-formed input which our
grammar fails to recognize).
The realisation of this idea can be performed by three mechanisms,
which may be combined in order to obtain the desired result:
- The grammatical constraints are relaxed.
- The input is changed in order to satisfy the grammar.
- Partial sentences are recognized.
A typical application of the first mechanism is the treatment of
mismatches of agreement. For example,
Ik wilt naar Amsterdam |
I wants to go to Amsterdam |
is a grammatically incorrect sentence, because there is no agreement
between the subject and the verb. By relaxing this constraint, the
analysis of the input can proceed as if the correct verb form were
substituted. A disadvantage of this approach is that unwanted analyses
now may be derived as well.
Another typical source of violation of grammaticality is the
phenomenon of hesitations. A typical example is
Ik wil naar ...eh...naar Groningen |
I want to go to...hm...to Groningen |
The user hesitates to mention the place of destination, and when he is
ready to continue the sentence, he repeats some of the words mentioned
before (`naar' in the example).
Such errors may be treated by changing the input. Techniques for
accomplishing such repairs are proposed in [7] and
[6]. Since the input for the syntactic analysis
in our case is the word graph, we realise this by adding some edges.
We give an example for the phenomenon of hesitations. The desired
correction is described by the following rewrite rule.
w eh w
w
In words, this rewrite rule says that an occurrence of a string w,
which consists of arbitrary words,
followed by some sign of hesitation (represented by eh),
followed by the same string w, can be changed into the string w.
An example of the application of this rule is given in
figure 7.
Figure 7:
Adding an edge according to a rewrite rule.
|
Many other forms of
ungrammaticality can be corrected by means of rewrite rules.
Some input from the user may not at all be intended to be a complete
sentence: it may consist of phrases or merely partial phrases. For
example, if the system asks
Waar wilt u heen? |
Where do you want to go to? |
the user may answer with any of the following utterances, none of
which is a sentence:
a. |
Delft |
b. |
naar Delft |
c. |
Delft om negen uur |
d. |
naar Delft om negen uur |
|
to Delft at nine o'clock |
It would be fruitless to try to turn these answers into full sentences
in all possible ways. Instead, it obviously suffices that the system
should parse the answer as a phrase or partial phrase, then compute
the semantics, and finally complete this semantics to be the intention
of the user, using the elements of the question.
In the example (3.3a), the syntactic analysis should merely
recognize `Delft' as the name of a station and translate it into the
internal name for this station. Subsequent combination of this data
with the meaning of the question then yields some semantics reflecting
the intention of the user to go to Delft (and not to depart
from Delft, for example).
The above problem can also be solved by designing the grammar in such
a way that the language that is described is not merely the language
of all correct and complete Dutch sentences, but the language of all
relevant answers to questions in the application domain, in colloquial
Dutch, including (partial) phrases.
More interesting is the treatment of cases where the input consists
mostly of uninterpretable subsequences of words, with only a few clear
phrases, for example
de naar een vanuit Amsterdam uur |
the to a from Amsterdam o'clock |
The only information discernible in this sentence is that the user
wants to depart from Amsterdam. The remainder of the intentions of the
user may not become clear from the utterance as the speech recognizer
has processed it (we assume other paths in the word graph are equally
uninterpretable). The OVIS system may however use the partial
information obtained thus far and continue by requesting the desired
time of departure, the place of destination, etc.
In all cases where paths in the word graph cannot be recognized as
grammatical sentences, the score of the paths should be incremented.
This is done under the assumption that the more we change to the
structure of sentences as we receive them through the speech
recognizer, the larger the chances are that the semantics we
eventually derive is inconsistent with the intentions of the user.
For example, in figure 7 the newly added edge has a score
which is not merely the sum of the edges that it was derived from, but an
extra penalty is added, because this edge is a derived edge, not an
original one.
As we have remarked before, word graphs may contain several sentences
that are grammatical, and some that are ungrammatical, but allow easy
correction. In practice, additional ambiguity is introduced by the
grammar. The use of underspecified semantic representations (QLF) can be applied only to a limited extent to reduce ambiguities.
A typical example is PP-attachment:
Ik wil naar Hengelo reizen om zes uur |
I want to travel to Hengelo at six o'clock |
The phrase om zes uur may refer to wil (the desire
to go to Hengelo is restricted to six o'clock) or to reizen
(the travelling is to take place at six); even more
interpretations are possible.
The consequence of such ambiguity is that several parses need to be
constructed, only some of which will prove to be interpretable
in later phases of analysis. In the above example, the first reading
of the sentence will probably not fit in any pragmatic framework,
and will therefore be rejected.
At the current moment it is not clear whether such ambiguity can and
should already be solved during syntactic analysis. One conceptual
objection against this solution is that it undermines the modular
structure of the system. This is because the syntactic module would need
to be contaminated with pragmatic information. In other
application domains where different pragmatic considerations prevail
such a syntactic module would then no longer be applicable.
Next: Status of the work
Up: Grammatical Analysis in a
Previous: The Grammar
Noord G.J.M. van
1998-09-25