Software by Gertjan van Noord

Alpino. Dutch parser, lexicon, grammar, treebank, and more.

FSA Utilities. FSA Utilities is a collection of tools to construct finite automata from regular expressions, and to manipulate, visualise, and apply finite automata.

TextCat Language Guesser. TextCat is a language guesser: given a few lines of text, it attempts to decide which natural language the text is written in. TextCat knows about seventy different languages. It implements the text categorization algorithm presented in a paper by Cavnar and Trenkle. TextCat is part of the SpamAssassin spam filter programme.

Suffix Arrays with Perfect Hash Finite Automata

Hdrug. Hdrug is a graphical user environment for the development of logic grammars and related tools.

Head-corner parser for Alvey NL Tools Grammar. A version of the head-corner parser (as described in my 1997 Computational Linguistics article), packaged with both the Alvey NL Tools grammar and the MiMo2 grammar in DCG format.

Applications I am/was involved with

Old Software

Elex with Prolog. Elex is a scanner generator (a program in the style of lex) that supports multiple output languages. The patched version found here adds the option to produce Prolog output.

SICStus Prolog interface to ISO regular expression functions.