LASSY

... he refused to be a dog just like Lassy was ...

LASSY (Large Scale Syntactic Annotation of written Dutch) is a STEVIN project. STEVIN is a Flemish-Dutch Language and Speech Processing Technology Programme launched by the Nederlandse Taalunie. The STEVIN programme office is run jointly by the NWO Humanities Division and SenterNovem.

A large corpus of written Dutch texts (1,000,000 words) has been syntactically annotated and manually corrected, based on D-COI and its successor. In addition, a very large corpus (about 1,500,000,000 words) has been syntactically annotated automatically. The project extends the available syntactically annotated corpora for Dutch both in size and in the range of text genres and topical domains covered. Various browse and search tools for syntactically annotated corpora have also been developed and made available. Their potential for applications in corpus linguistics and information extraction is illustrated and evaluated in a series of case studies.

Partners

Lassy is carried out by a consortium consisting of the University of Groningen and the Katholieke Universiteit Leuven. Researchers involved in the project include:

Erik Tjong Kim Sang
Gosse Bouma
Gertjan van Noord

Frank van Eynde
Ineke Schuurman
Vincent Vandeghinste

Lassy Initiatives

List of Resources