Question Answering for Dutch using Dependency Relations
Both casual computer users and specialized information workers nowadays
have access to large amounts of electronic information. However, on-line
information sources are loosely structured, contain overlap, noise, and inconsistencies,
are multilingual, and may contain speech as well as text. Current information
retrieval technology, as exemplified by web search engines, only partially
answers the demand for information navigation tools for such noisy and redundant
data. In particular, there is a need for tools which help to locate relevant
chunks of information efficiently, which extract and synthesize information
from various sources, and which are interactive, in the sense that they can
enter into an information dialogue with a user. The development of such tools
requires a marriage between information retrieval and natural language processing
(NLP) technology.
Question Answering (QA) is a technique aimed at providing an effective information
retrieval tool based on NLP. Users may pose a question in natural language,
and an answer is given based on relevant sentences found in a collection of
(on-line) documents. The question "On which island is the Etna located?"
is answered with "Sicily", and not with a list of web addresses which
might contain the answer.
This project investigates the use of sophisticated linguistic knowledge
and robust natural language processing for QA. In particular, we will investigate
how syntactic and semantic dependency relations in the question and potential
answer texts can be used to support QA. A demonstrator will be developed in
cooperation with publisher Het Spectrum, which owns the rights to several
major Dutch encyclopedias. In addition, we plan to participate in QA evaluation
efforts of CLEF and similar conferences.
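To make the idea of matching dependency relations concrete, the following is a minimal sketch, not the project's actual method. It assumes that a parser has already turned the question and a candidate sentence into (head, relation, dependent) triples. The relation labels, the "?" placeholder for the questioned element, the candidate sentences, and the function name find_answer are all hypothetical and chosen only for illustration.

```python
from typing import NamedTuple, Optional


class Dep(NamedTuple):
    """One dependency relation: head word, relation label, dependent word."""
    head: str
    rel: str
    dependent: str


WH = "?"  # hypothetical placeholder for the element the question asks about


def find_answer(question: list[Dep], candidate: list[Dep]) -> Optional[str]:
    """Return the word filling the WH slot, or None if the relations do not match.

    Every question relation without a WH slot must occur literally in the
    candidate; the relation with the WH slot must match on head and label.
    """
    candidate_set = set(candidate)
    filler = None
    for q in question:
        if q.dependent == WH:
            matches = [c.dependent for c in candidate
                       if c.head == q.head and c.rel == q.rel]
            if not matches:
                return None
            filler = matches[0]
        elif q not in candidate_set:
            return None
    return filler


# "On which island is the Etna located?" (hand-written, simplified relations)
question = [Dep("located", "subj", "Etna"), Dep("located", "on", WH)]

# Hypothetical candidate sentences, also hand-annotated:
# "The Etna is located on Sicily."
good = [Dep("located", "subj", "Etna"), Dep("located", "on", "Sicily")]
# "The Vesuvius is located near Naples."
bad = [Dep("located", "subj", "Vesuvius"), Dep("located", "near", "Naples")]

print(find_answer(question, good))  # Sicily
print(find_answer(question, bad))   # None
```

Exact matching of this kind is deliberately strict. A realistic system would also have to deal with paraphrases, with the type restriction expressed by "which island", and with parser errors, none of which this sketch attempts.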
Detailed Project Description (pdf)