I will restrict the attention to a class of constraint-based
formalisms, in which operations on strings are defined that are more
powerful than concatenation, but which operations are restricted to be
nonerasing, and linear. The resulting class of systems
can be characterized as Linear Context-Free Rewriting Systems (LCFRS),
augmented with feature-structures (F-LCFRS). For a discussion of the
properties of LCFRS without feature-structures, see [31] and
[32]. Note though that these properties do not carry over to
the current system, because of the augmention with feature structures.
The proposals discussed in the previous section can be
seen as examples of F-LCFRS.
As in LCFRS, the operations on strings in F-LCFRS can be characterized as follows. First, derived structures will be mapped onto a string of words; i.e. each derived structure `knows' which string it `dominates'. For example, each derived feature structure may contain an attribute `string' whose value is a list of atoms representing the string it dominates. Furthermore, I will identify this string with the set of occurrences of words (or bag of words) in the string. I will write w(F) for the set of occurrences of words that the derived structure F dominates. Rules combine structures D1...Dn into a new structure M. Nonerasure requires that the union of w applied to each daughter is a subset of w(M):
Furthermore, I will assume that each rule has a designated daughter, called the head. Although I will not impose any restrictions on the head, it will turn out that the parsing strategy to be proposed will be very sensitive to the choice of heads, with the effect that F-LCFRS's in which the notion `head' is defined in a systematic way (Pollard's Head Grammars, Reape's version of HPSG, Dowty's version of Categorial Grammar), may be much more efficiently parsed than other grammars. The notion lexical head of a parse tree is defined recursively in terms of the head. The lexical head of a tree will be the lexical head of its head. The lexical head of a terminal will be that terminal itself. I will write the head of a definite clause as the leftmost daughter of that clause.
A grammar will be a set of definite clauses over a constraint language, following [10]. The constraint language consists of path equations in the manner of PATR. Hence, instead of first-order terms we are using feature structures. The relation defined by the grammar will be the relation `sign'. I assume furthermore that no other recursive relations are being defined. Hence a grammar rule will be something like
where the relation `concatenate_strings' states that the string of the mother is the concatenation of the strings of the daughters. The extra relation should be defined in such a way that given the two instantiated daughter signs, the extra relation is guaranteed to terminate (assuming Prolog like procedural semantics).
Furthermore, each grammar should provide a definition of the predicates head/2 and string/2. The first one defines the relation between the mother sign of a rule and its head; the second one defines for a sign the string (as a list of atoms) associated with that sign.
In the meta-interpreter I use for a nonunit clause