Creating an infrastructure for language and speech technology presupposes coordinating institutions that set priorities, ensure that resources are not only developed but also maintained and distributed, perform evaluations, and coordinate between industry, university research, national governments, and the European Union.
We interviewed approximately thirty researchers from universities and industry, as well as a number of representatives of research organizations. Nearly everyone was of the opinion that coordination of research and development in language and speech technology is currently all but absent in the Netherlands and Flanders. This has a number of undesirable consequences, especially for the development and distribution of resources. First, as most resources are developed within research projects funded for limited periods of time, maintaining them after a project ends is practically impossible. For instance, it is currently unclear how the CELEX database will be maintained in the near future. Second, resources are often developed for use within a specific project. Their use and distribution outside the project is often not properly taken into consideration, and thus several potentially interesting resources cannot be used. Third, collaboration on language and speech technology that takes into account both the language as it is spoken and written in the Netherlands and the language as it is spoken and written in Flanders is rare. Sometimes this cooperation can be achieved within a European context (as was the case for EUROTRA, with a group based in Utrecht (NL) and Leuven (B)). Another example of collaboration is the Dutch-Flemish development of a spoken language corpus.
Such collaboration is incidental, however, and opportunities for collaboration are clearly missed. One example is the Dutch NWO priority programme on language and speech technology and the Flemish short-term programme for speech and language technology, which were initiated and carried out independently. Another example concerns a recent programme for Dutch-Flemish cultural cooperation (initiated by the Vlaams-Nederlandse Comité voor Nederlandse Taal en Cultuur (VNC)), which stimulates collaborative research on language and literature. Language and speech technology projects appear to be outside the scope of this programme.
Finally, the CELEX project has been carried out as a national (Dutch) initiative. A separate effort (FONILEX, see bach.arts.kuleuven.ac.be/fonilex/) was required to provide a pronunciation dictionary for Flemish. Other aspects of the Flemish vocabulary have not been accounted for, however.
The situation for language and speech technology contrasts with that in more traditional areas of linguistics. The INL (Instituut voor Nederlandse Lexicografie) is a leading institute for lexicography and is funded by the Dutch and Belgian governments. The NTU has actively promoted the development of dictionaries (such as the bilingual dictionaries under development by the Commissie Lexicale Vertaalvoorzieningen) and grammars (such as the ANS) by teams in which both Dutch and Flemish partners cooperate.
The NTU could play an important role as a coordinating institution for language and speech technology in the Netherlands and Flanders. As an intergovernmental organization responsible for language policy, it is capable of coordinating between initiatives on both sides of the border. As an organization that has been actively involved in the development of dictionaries and grammars, it should also be capable of stimulating the development of electronic linguistic resources, as well as ensuring the maintenance and distribution of these resources over longer periods of time.