The proposal addresses linguistic questions and therefore belongs to humanities research, but it draws on many of the tools and techniques developed in the physical and technical sciences. We believe that the project can be conducted most effectively by adopting a methodology in which a small number of applications are constructed.
One of the applications that we propose is a linguistically informed search tool for text corpora (lgrep). The tool searches text corpora (including arbitrary texts on the Internet) on the basis of syntactic criteria. This application fits well with the approach of grammar approximation by finite-state techniques. Such a tool should be useful for (computational) linguists working with corpora, and also as an extension to traditional grammars as used by language learners, who could obtain example sentences of particular constructions upon request. Moreover, the tool will be useful for the research proposals discussed in the previous sections.
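To illustrate the kind of query such a tool might answer, the following is a minimal sketch only, not the proposed design. It assumes the corpus has already been part-of-speech tagged, and it approximates one syntactic criterion (a passive clause with a by-phrase) as a regular expression over the tag sequence, which is a finite-state pattern. The tag set, the tiny corpus, and the names `PASSIVE_WITH_BY` and `lgrep_sketch` are all hypothetical and chosen for illustration.

```python
import re

# Hypothetical hand-tagged corpus: each sentence is a list of (word, tag) pairs.
# The tag set is invented for this sketch.
CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("was", "AUX"), ("chased", "VPART"),
     ("by", "BY"), ("the", "DET"), ("cat", "NOUN")],
    [("the", "DET"), ("cat", "NOUN"), ("chased", "VERB"),
     ("the", "DET"), ("dog", "NOUN")],
    [("the", "DET"), ("letter", "NOUN"), ("was", "AUX"), ("written", "VPART"),
     ("by", "BY"), ("an", "DET"), ("old", "ADJ"), ("friend", "NOUN")],
]

# Finite-state approximation of "passive with a by-phrase":
# auxiliary, participle, "by", then a simple noun phrase.
# Each tag is wrapped in <...> so that one tag cannot match inside another.
PASSIVE_WITH_BY = re.compile(r"<AUX><VPART><BY>(?:<DET>)?(?:<ADJ>)*<NOUN>")


def lgrep_sketch(corpus, pattern):
    """Return the sentences whose tag sequence contains a match for pattern."""
    hits = []
    for sentence in corpus:
        # Encode the sentence as a string of bracketed tags, e.g. "<DET><NOUN>".
        tag_string = "".join(f"<{tag}>" for _, tag in sentence)
        if pattern.search(tag_string):
            hits.append(" ".join(word for word, _ in sentence))
    return hits


if __name__ == "__main__":
    for hit in lgrep_sketch(CORPUS, PASSIVE_WITH_BY):
        print(hit)
```

In this sketch the first and third sentences are returned and the active sentence is not. A real tool would need a tagger or parser in front of the matcher, and a richer pattern language than tag regexes, so the example shows only the general shape of a syntactically driven search.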