Question Answering for Dutch using Dependency Relations

Both casual computer users and specialized information workers nowadays have access to large amounts of electronic information. However, on-line information sources are loosely structured, contain overlap, noise, and inconsistencies, are multilingual, and may contain speech as well as text. Current information retrieval technology, as exemplified by web search engines, only partially answers the demand for information navigation tools for such noisy and redundant data. In particular, there is a need for tools which help to locate relevant chunks of information efficiently, which extract and synthesize information from various sources, and which are interactive, in the sense that they can enter into an information dialogue with a user. The development of such tools requires a marriage between information retrieval and natural language processing (NLP) technology.

Question Answering (QA) is a technique aimed at providing an effective information retrieval tool based on NLP. Users pose a question in natural language, and the system returns an answer based on relevant sentences found in a collection of (on-line) documents. For example, the question "On which island is the Etna located?" is answered with "Sicily", and not with a list of web addresses which might contain the answer.

This project investigates the use of sophisticated linguistic knowledge and robust natural language processing for QA. In particular, we will investigate how syntactic and semantic dependency relations in the question and potential answer texts can be used to support QA. A demonstrator will be developed in cooperation with publisher Het Spectrum, which owns the rights to several major Dutch encyclopedias. In addition, we plan to participate in the QA evaluation efforts of CLEF and similar conferences.
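
To make the idea of matching dependency relations concrete, here is a minimal Python sketch. It is an illustration only, not the project's implementation: the `Dep` triple format, the relation labels (`subj`, `on`), the `?x` variable convention, and the function `answer_candidates` are all assumptions made for this example, and the triples are written by hand rather than produced by a parser.

```python
from typing import NamedTuple


class Dep(NamedTuple):
    """A hypothetical dependency triple: head word, relation label, dependent word."""
    head: str
    rel: str
    dep: str


def answer_candidates(question, sentence, var="?x"):
    """Return the words that can fill `var` so that every question
    relation is matched by a relation in the candidate sentence."""
    candidates = None
    for q in question:
        if var in (q.head, q.dep):
            # Relation containing the variable: collect the words that could bind it.
            values = {
                s.head if q.head == var else s.dep
                for s in sentence
                if s.rel == q.rel
                and (q.head == var or s.head == q.head)
                and (q.dep == var or s.dep == q.dep)
            }
            candidates = values if candidates is None else candidates & values
        elif q not in sentence:
            # A fully specified question relation is missing: no match.
            return set()
    return candidates or set()


# Hand-written triples for "On which island is the Etna located?"
question = [Dep("located", "subj", "Etna"), Dep("located", "on", "?x")]

# Hand-written triples for two candidate sentences.
etna_sentence = [Dep("located", "subj", "Etna"), Dep("located", "on", "Sicily")]
other_sentence = [Dep("located", "subj", "Vesuvius"), Dep("located", "on", "mainland")]

print(answer_candidates(question, etna_sentence))
print(answer_candidates(question, other_sentence))
```

Under these assumptions, the first sentence yields the candidate "Sicily", while the second is rejected because its subject relation does not match the question. A real system would need to handle paraphrases and differing syntactic constructions, which exact triple matching does not cover.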

Detailed Project Description (pdf)