Next: Annotated word-graph
Up: Robust parsing of word-graphs
Previous: Grammar
Parser
Five different parsing algorithms were implemented and compared (a
bottom-up Earley parser, an inactive chart parser, an LR parser, a
left-corner parser and a head-corner parser). The most efficient
parser (both in terms of CPU-time and memory usage) for this
application turned out to be a head-corner parser implemented with
goal-weakening and selective memoization. The head-corner parser is
presented in detail in van Noord [41].
In order to apply this (or any of the other) parser(s) for robust
processing, we use underspecification of the state names for the
input parse goal in order to parse the start category anywhere
in the word-graph. Normally the parser will be called using a
goal such as the following:
![\begin{displaymath}\small\begin{minipage}[t]{.9\textwidth}\begin{verbatim}?- parse(start(Sem),q0,q16).\end{verbatim}\end{minipage}\end{displaymath}](img71.gif) |
(29) |
indicating that we want to find a path from state q0 to
q16 which can be analysed as a category start(Sem) (a
sentence with a semantic representation that is yet to be discovered).
If we want to recognize top categories at all positions in the input,
then we can simply generalize the parse goal to:
![\begin{displaymath}\small\begin{minipage}[t]{.9\textwidth}\begin{verbatim}?- parse(start(Sem),_,_).\end{verbatim}\end{minipage}\end{displaymath}](img72.gif) |
(30) |
Now it may seem that such an underspecified goal will dramatically
slow down the parser, but this turns out to be a false expectation, at
least for the head-corner and left-corner parsers. In fact we
have experienced no such decrease in efficiency. This can only be
understood in the light of the use of memoization: even though we now
have a much more general goal, the number of different goals that we
need to solve is much smaller.
Next: Annotated word-graph
Up: Robust parsing of word-graphs
Previous: Grammar
2000-07-10