In the application for which the head-corner parser was developed, robust processing is essential. In a spoken dialogue system it is often impossible to parse a full sentence, but in such cases the recognition of e.g. temporal expressions might still be very useful. Therefore, a robust processing technique which collects the remnants of the parsing process in a meaningful way seems desirable.
In this subsection we show how the head-corner parser can be used in such circumstances. The approach consists of two parts. Firstly, the parser is modified in such a way that it finds all derivations of the start symbol anywhere in the input. Furthermore, the start symbol should be defined in such a way that it includes all categories which are considered useful for the application.
Normally the head-corner parser will be called as e.g. :
?- parse(s(Sem),0,12).indicating that we want to parse a sentence from position 0 to 12 with category s(Sem) (a sentence with a semantic representation that is yet to be discovered). Suppose however that a specific robustness module is interested in all `maximal projections' anywhere in the sentence. Such a maximal projection may be represented by a term xp(Sem). Furthermore there may be unary grammar rules rewriting such an xp into appropriate categories, e.g.:
Also note that even though the first call to the parse predicate has variable extreme positions, this does not imply that all power of top-down prediction is lost by this move; recursive calls to the parse predicate may still have instantiated left and/or right extreme positions. The same applies with even more force for top-down information on categories.
In the first phase, the parser finds all occurrences of the top category in the input word-graph. Thus, we obtain items for all possible maximal projections anywhere in the input graph. In the second phase, the robustness component selects a sequence of such maximal projections. The robustness procedure consists of a best-first search from the beginning of the graph to the end of the graph. A path in the input graph can be constructed by taking steps of the following two types. To move from position P to Q you can either:
In order to compare paths in the best-first search method, we have experimented with score functions which include some or all of the following factors:
If bigram scores are not included, then this best-first search method can be implemented efficiently because for each state in the word-graph we only have to keep track of the best path to that state.
The resulting `best' path in general consists of a number of maximal projections. In the OVIS application these often are simple time or place expressions. The pragmatic module is able to deal with such unconnected pieces of information and will perform better if given such partial parse results.
In order to evaluate the appropriate combination of the factors determining the scoring function, and to evaluate this approach with respect to other approaches, we use a corpus of word-graphs for which we know the corresponding actual utterances. We compare the sentence associated with the `best' path in the word-graph with the sentence that was actually spoken. Clearly if the robustness component more often uses the information that was actually uttered, then we have more confidence in that component. This notion of word accuracy is an approximation of semantic accuracy (or `concept accuracy'). The string comparison is defined by the minimal number of deletions and insertions that is required to turn the first string into the second (Levenshtein distance), although it may be worthwhile to investigate other measures. For example, it seems likely that for our application it is less problematic to `miss' information, whereas `hallucination' is a more severe problem. This could be formalized by a scoring function in which insertion (into analysis result) is cheaper than deletion.
Currently the best results are obtained with a scoring function in which bigram scores, acoustic scores and the number of skips is included. We have also implemented a version of the system in which acoustic scores and bigram scores are used to select the best path through the word-graph. This path is then sent to the parser and the robustness component. In this `best-1-mode' the system performs somewhat worse in terms of word-accuracy, but much faster (cf. the experiments in the next section).