Progress in computational linguistics will not only improve our understanding of human language, but will also have an important effect by advancing human language technology.
Such progress will make it possible to develop natural language interfaces that drastically improve the accessibility of large amounts of information, especially for people without computer training. It will also enable and improve linguistic applications such as grammar checkers, dictation systems, language instruction, documentation systems and linguistic aids for people with disabilities. Human language technology is one of the key actions of the Fifth Framework Programme of the European Community for research, technological development and demonstration activities.
An innovative aspect of the proposal is that it focuses on the Dutch language, and hence on Dutch linguistics and language technology. A recent overview of Dutch Language and Speech Technology conducted by de Nederlandse Taalunie [11] reports that far fewer language technology resources are available for Dutch than for English. It is important that language technology be developed for Dutch alongside the current developments for languages such as English, German and French. Language technology applications such as those mentioned above are cultural bonuses that should accrue not only to the speakers of majority languages.
The project furthermore aims at significant spin-offs. It will devote resources to extending existing Dutch grammars, both to experiment with the proposed techniques and to test the hypotheses. An extensive Dutch grammar in the public domain will be a major contribution to Dutch computational linguistics and to the international community.
Moreover, we propose to apply some of the innovative techniques in a
linguistic research tool, called lgrep, for searching bare text corpora.
This application can search text corpora (including arbitrary Dutch
texts on the Internet) on the basis of linguistic criteria. It extends
existing search tools with the possibility of specifying search patterns
that include linguistic criteria such as part-of-speech labels (noun,
verb, preposition, etc.), major syntactic categories (noun phrase, verb
phrase, subordinate sentence, etc.), and grammatical relations (subject,
direct object, specifier, etc.). Such a tool would be useful for
researchers who work with corpora, for example in linguistics, applied
linguistics, comparative literature and communication studies. It could
perhaps also serve as an extension of traditional grammars as used by
language learners, enabling them to obtain example sentences of
particular linguistic constructions upon request.
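To make the idea of searching on part-of-speech labels concrete, the following is a minimal sketch and not the lgrep design itself; the proposal does not specify lgrep's query language or implementation. It assumes the text has already been tagged, so that each sentence is a list of (word, label) pairs. The function name `find_tag_sequences`, the label names and the toy sentence are all hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# Assumption: the corpus is already tagged; a token is (word, part-of-speech label).
Token = Tuple[str, str]


def find_tag_sequences(
    sentence: Sequence[Token],
    pattern: Sequence[Optional[str]],
) -> List[List[str]]:
    """Return every run of words whose labels match `pattern` position by position.

    A `None` entry in the pattern acts as a wildcard that matches any label.
    """
    matches: List[List[str]] = []
    width = len(pattern)
    if width == 0:
        return matches
    # Slide a window of the pattern's width over the sentence.
    for start in range(len(sentence) - width + 1):
        window = sentence[start:start + width]
        if all(want is None or want == tag
               for (_, tag), want in zip(window, pattern)):
            matches.append([word for word, _ in window])
    return matches


# Hand-tagged toy sentence (hypothetical labels, for illustration only).
sentence = [
    ("de", "det"), ("man", "noun"), ("leest", "verb"),
    ("een", "det"), ("boek", "noun"),
]

# Find every determiner immediately followed by a noun.
print(find_tag_sequences(sentence, ["det", "noun"]))
```

A search on syntactic categories or grammatical relations, as envisaged for lgrep, would need a parser rather than a flat sequence of labels. The sketch covers only the simplest of the three kinds of criteria.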
A successful implementation of Algorithms for Linguistic Processing will not only provide new insights concerning the way in which natural language is processed, but it will also provide new techniques which are crucial for human language technology, in particular for Dutch.