The proposed research aims to provide general answers to the above-mentioned questions of efficiency and disambiguation. To arrive at such answers, we strongly favour a methodology in which concrete proposals are developed and compared. For this reason, we want to be able to apply and evaluate such proposals on a specific grammar of Dutch.
Such a grammar should describe a large fragment of the Dutch language, so that it can treat the large majority of sentences in a given corpus of Dutch. The grammar should include a detailed and linguistically sophisticated treatment of constructions such as cross-serial dependencies (verb clusters) and various types of nested dependencies (which potentially give rise to center-embedding, e.g. noun phrases within adjectival phrases within noun phrases), because these constructions are crucial for the qualitative evaluation of finite-state language processing techniques. The grammar should also provide a treatment of government and headed projections, so that it can be used in disambiguation experiments with techniques based on lexical dependency structures. For the same reason, the grammar should treat various kinds of modification constructions, including prepositional phrase attachment. Moreover, a large lexicon should be available so that such disambiguation techniques can be tested under realistic conditions.
In this section we describe a number of tasks aimed at the development of such a linguistically motivated grammar for Dutch.