next up previous
Next: Annotated word-graph Up: Robust parsing of word-graphs Previous: Grammar


Parser

Five different parsing algorithms were implemented and compared (a bottom-up Earley parser, an inactive chart parser, an LR parser, a left-corner parser and a head-corner parser). The most efficient parser (both in terms of CPU-time and memory usage) for this application turned out to be a head-corner parser implemented with goal-weakening and selective memoization. The head-corner parser is presented in detail in van Noord [41].

In order to apply this (or any of the other) parser(s) for robust processing, we use underspecification of the state names for the input parse goal in order to parse the start category anywhere in the word-graph. Normally the parser will be called using a goal such as the following:

\begin{displaymath}\small\begin{minipage}[t]{.9\textwidth}\begin{verbatim}?- parse(start(Sem),q0,q16).\end{verbatim}\end{minipage}\end{displaymath} (29)

indicating that we want to find a path from state q0 to q16 which can be analysed as a category start(Sem) (a sentence with a semantic representation that is yet to be discovered). If we want to recognize top categories at all positions in the input, then we can simply generalize the parse goal to:
\begin{displaymath}\small\begin{minipage}[t]{.9\textwidth}\begin{verbatim}?- parse(start(Sem),_,_).\end{verbatim}\end{minipage}\end{displaymath} (30)

Now it may seem that such an underspecified goal will dramatically slow down the parser, but this turns out to be a false expectation, at least for the head-corner and left-corner parsers. In fact we have experienced no such decrease in efficiency. This can only be understood in the light of the use of memoization: even though we now have a much more general goal, the number of different goals that we need to solve is much smaller.


next up previous
Next: Annotated word-graph Up: Robust parsing of word-graphs Previous: Grammar

2000-07-10