Parsing word-graphs

Next: Grammar Up: Robust parsing of word-graphs Previous: Word-graphs

Parsing word-graphs

The normalized word-graph is parsed by an appropriate parser. Parsing algorithms for strings can be generalized to parse such word-graphs (for some examples cf. van Noord [40]). In the ideal case, the parser will find a path in the word-graph that can be assigned an analysis according to the grammar, such that the path covers the complete time span of the utterance, i.e. the path leads from the start state to a final state. The analysis gives rise to an update of the dialogue state, which is then passed on to the dialogue manager.

However, often no such paths can be found in the word-graph, due to:

errors made by the speech recognizer,
linguistic constructions not covered in the grammar, and
irregularities in the spoken utterance.

Even if no full analysis of the word-graph is possible, it is usually the case that useful information can be extracted from the word-graph. Consider for example the utterance: Ik wil van van Assen naar Amsterdam
I want from from Assen to Amsterdam The grammar will not assign an analysis to this utterance due to the repeated preposition. However, it would be useful if the parser would discover the prepositional phrases van Assen and naar Amsterdam since in that case the important information contained in the utterance can still be recovered. Thus, in cases where no full analysis is possible we would like to fall back on an approach reminiscent of concept spotting. The following proposal implements this idea.

Firstly, the grammar is defined in such a way that each maximal projection such as S, NP, PP, etc., can be analysed as a top category. This is well-motivated because utterances very often consist of a single NP or PP (section 3.3).

Often, the task of the parser is to discover all instances of the top category from the start state of the word-graph to a final state. But in our case, we require that the parser discovers all instances of the top category anywhere in the word-graph, i.e. for all partial paths in the word-graph. This has the desired effect for example (12): both PPs will be found by the parser.

Thus we require that the parser finds all major categories anywhere in the word-graph. If a bottom-up chart parser is used, then we might use the inactive chart items for this purpose. However, since we do not want to be forced to a particular parsing strategy, we have chosen to adopt a different approach. In section 3.4 we show that in a logic programming setting the use of underspecification of the state names associated with the top-most goal obtains the desired effect, without loss of efficiency.

Therefore, after the parser has finished, we have a word-graph annotated with a number of instances of top categories. For each of these categories we are interested in the word-graph state where this category starts (v_i), the word-graph state where this category ends (v_j), the sequence of symbols associated with this category (x), the accumulated acoustic score (a), and the qlf (q). Let $\mbox{\it parsed\/}(v_i,v_j,x,a,q)$ refer to such categories.

We are interested in paths from the start state to the final state consisting of a number of categories and transitions in the word-graph (the latter are called skips). The problem consists in finding the optimal path, according to a number of criteria. This problem is formalized by defining the annotated word-graph as a directed acyclic graph (section 3.5). The vertices of this graph are the states of the word-graph; the edges are the transitions of the word-graph and the categories found by the parser.

The criteria which are used to favor some paths over other paths are expressed as a weight function on the edges of the graph. The criteria we might take into account are discussed in section 3.6. For instance, a typical criterion will favor paths consisting of a small number of categories, and a small number of skips. The case in which the parser found a full analysis from the start state of the word-graph to a final state then reduces to a special case: the analysis solely consisting of that category will be favored over sequences of partial analyses.

Obviously, it is not a good idea to generate all possible sequences of categories and skips, and then to select the best path from this set: in typical word-graphs there are simply too many different paths. If a certain uniformity requirement on weights is met, however, then efficient graph search algorithms are applicable. The particular algorithm implemented in OVIS2, namely a variant of the DAG-SHORTEST-PATH algorithm [18] is discussed in section 3.7.

The criteria used to determine the best path may also include Ngram statistics. It turns out that in those cases some complications arise in the definition of the annotated word-graph. This is explained in section 3.8.

In a previous implementation [29] we used a version of Dijkstra's algorithm. A comparison is presented in section 3.9. Finally, section 3.10 discusses methods in which the parser is applied only to a single path of the word-graph.

Next: Grammar Up: Robust parsing of word-graphs Previous: Word-graphs

2000-07-10