Finite-state Language Processing

Next: Finite-state Approximation and -moves Up: Introduction Previous: Introduction

Finite-state Language Processing

An important problem in computational linguistics is posed by the fact that the grammars which are typically hypothesised by linguists are unattractive from the point of view of computation. For instance, the number of steps required to analyse a sentence of n words is n³ for context-free grammars. For certain linguistically more attractive grammatical formalisms it can be shown that no upper-bound to the number of steps required to find an analysis can be given. The human language user, however, seems to process in linear time; humans understand longer sentences with no noticeable delay. This implies that neither context-free grammars nor more powerful grammatical formalisms are likely models for human language processing. An important issue therefore is how the linearity of processing by humans can be accounted for.

A potential solution to this problem concerns the possibility of approximating an underlying general and abstract grammar by techniques of a much simpler sort. The idea that a competence grammar might be approximated by finite-state means goes back to early work by Chomsky [3,4]. There are essentially three observations which motivate the view that the processing of natural language is finite-state:

humans have a finite (small, limited, fixed) amount of memory available for language processing
humans have problems with certain grammatical constructions, such as center-embedding, which are impossible to describe by finite-state means [12]
humans process natural language very efficiently (in linear time)

Next: Finite-state Approximation and -moves Up: Introduction Previous: Introduction

2000-07-10