Parsing
The annotation process typically starts with parsing a sentence from
the corpus with the Alpino parser. This is a good starting point, since
building dependency trees manually is extremely time-consuming and
error-prone. Usually the parser produces a correct or almost correct
parse. If the parser cannot build a structure for the complete sentence,
it tries to generate as large a structure as possible (e.g. a noun
phrase or a complementizer phrase). The main disadvantage of parsing is
that the parser produces a large set of possible parses (see fig. 3).
This is a well-known problem in grammar development: the more linguistic
phenomena a grammar covers, the greater the ambiguity per sentence.
Because selecting the best parse from such a large set is
time-consuming, we have tried to reduce the set of generated parses.
The interactive lexical analyzer and the constituent marker restrict
the parsing process, which results in smaller sets of parses. A tool
for online addition of lexical information makes parsing of sentences
with unknown words more accurate and efficient.
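
To give a feel for why marking a constituent shrinks the set of parses, the
following is a toy sketch only. It does not use the Alpino grammar or parser;
it simply enumerates all binary bracketings of a word sequence and then keeps
those in which a marked span forms a constituent. The function names
bracketings and with_marked_constituent are hypothetical and chosen for
illustration.

```python
def bracketings(i, j):
    """Enumerate all binary bracketings of words i..j-1.

    Each bracketing is a frozenset of (start, end) spans, one per node.
    This is a toy stand-in for a parser, not the Alpino grammar.
    """
    if j - i == 1:
        return [frozenset({(i, j)})]
    result = []
    for k in range(i + 1, j):  # every possible split point
        for left in bracketings(i, k):
            for right in bracketings(k, j):
                result.append(left | right | {(i, j)})
    return result


def with_marked_constituent(trees, span):
    """Keep only the bracketings in which `span` is a constituent."""
    return [tree for tree in trees if span in tree]


if __name__ == "__main__":
    all_trees = bracketings(0, 6)  # a six-word sentence
    restricted = with_marked_constituent(all_trees, (2, 5))  # words 2..4 marked
    print(len(all_trees), len(restricted))  # prints "42 10"
```

In this toy setting a single marked constituent reduces 42 candidate
structures to 10. A real grammar adds lexical and syntactic ambiguity on top
of bracketing ambiguity, so the actual numbers for Alpino will differ.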
![Number of parses generated per sentence by the Alpino parser](img7.gif)
Figure 3: Number of parses generated per sentence by the Alpino parser
Noord G.J.M. van
2002-06-13