This paper gives a method to estimate the space and time complexity of algorithms implemented in pure Prolog and executed under the OLDT search strategy (also called Earley Deduction).
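The tabling idea behind OLDT resolution can be illustrated with a small sketch (this is not the paper's cost-estimation method): each subgoal's answers are stored in a table, so repeated calls reuse earlier results instead of recomputing them, and cyclic call patterns terminate. The reachability predicate and graph below are invented for illustration.

```python
# Minimal sketch of OLDT-style tabling (memoized evaluation): every
# subgoal's answers are kept in a table, so a repeated subgoal reuses
# its answers instead of being re-evaluated, and cycles do not loop.

def make_tabled_reach(edges):
    table = {}  # subgoal -> set of answers (the "answer table")

    def reach(x):
        if x in table:          # subgoal already tabled: reuse answers
            return table[x]
        table[x] = {x}          # seed the table entry to cut cycles
        for y in edges.get(x, ()):
            table[x] |= reach(y)
        return table[x]

    return reach

edges = {"a": ["b"], "b": ["c", "a"], "c": []}  # note the cycle a-b-a
reach = make_tabled_reach(edges)
print(sorted(reach("a")))  # → ['a', 'b', 'c']
```

A plain depth-first Prolog search would loop on the a-b-a cycle; the table entry seeded before the recursive calls is what guarantees termination here.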
In the DENK dialogue system, we distinguish two levels of semantic representation. At the first level, we employ typed feature structures to derive context-independent semantic representations, using insights from Head-driven Phrase Structure Grammar (HPSG). At the second level, we use a form of Constructive Type Theory to represent both the contextually determined meanings of utterances and the information state of the system, the latter containing the context with respect to which an utterance must be interpreted. We discuss both levels of representation, focusing on the advantages each level offers with respect to the underspecification of semantic ambiguities related to pronouns and quantifiers.
There are several different ways data mining (the automatic induction of knowledge from data) can be applied to the problem of natural language processing. In the past, data mining techniques have mainly been used in linguistic engineering applications to solve knowledge acquisition bottlenecks. In this paper, we show that they can also assist in linguistic theory formation by providing a new tool for the evaluation of linguistic hypotheses, for the extraction of rules from corpora, and for the discovery of useful linguistic categories. Applying Quinlan's C4.5 inductive machine learning method to a particular linguistic task (diminutive formation in Dutch) we show that data mining techniques can be used (i) to test linguistic hypotheses about this process, and (ii) to discover interesting linguistic rules and categories.
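The attribute-selection criterion at the heart of C4.5, information gain, can be sketched as follows. The feature vectors and suffix classes below are invented and far simpler than the actual Dutch diminutive data; the point is only to show how the criterion singles out a linguistically informative attribute.

```python
# Toy illustration of C4.5's core idea (information gain), not Quinlan's
# implementation: choose the attribute that best predicts the class label.
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    base = entropy(labels)
    groups = {}
    for row, lab in zip(rows, labels):
        groups.setdefault(row[attr], []).append(lab)
    rem = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return base - rem

# invented feature vectors: final sound and stress of the stem
rows = [
    {"final": "vowel", "stress": "yes"},
    {"final": "nasal", "stress": "no"},
    {"final": "vowel", "stress": "no"},
    {"final": "obstruent", "stress": "yes"},
]
labels = ["-tje", "-pje", "-tje", "-je"]  # invented suffix classes
best = max(rows[0], key=lambda a: information_gain(rows, labels, a))
print(best)  # → final
```

In this toy data the final sound fully determines the suffix while stress does not, so `final` wins; C4.5 applies the same criterion recursively to grow a full decision tree.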
It has been argued many times that syntactic movement, in the form of leftward extraposition and the wrapping of discontinuous constituents, is of such crucial importance that there is a need to develop grammar formalisms aimed solely at a clear and concise treatment of these phenomena.
Guided by a number of examples of Dutch verb phrase structure, I first discuss head grammar (HG) and show that it inherently lacks the strong generative capacity to describe even very simple fragments of complete Dutch sentences. I then make, in three steps, a progression from linear context-free rewriting systems (LCFRS) to literal movement grammar (LMG), and show that, in contrast to the general feeling about LCFRS, the resulting formalism is an attractive and adequate tool for describing concrete fragments of configurational languages with a nontrivial surface structure.
Casting systems can be used to describe (fragments of) natural language. The formalism associates trees with sentences in the style of classical dependency analysis. Casting systems in their original form are less elaborate in their linguistic structure, but more formal in the mathematical sense, than dependency grammar. This paper describes the notion of casting systems in some detail. Its main topic is the problem of adding the linguistic notion of valency to casting systems. Without valency, a casting system will often produce an unacceptably large number of different analyses for a given sentence, many of which are obviously incorrect or unwanted. Valency can be used to filter out such superfluous analyses.
Discourses, whether written or spoken, are intended to convey information. Obviously, it is important in processing a discourse to be able to recognize which information is relevant. The need for a criterion for the relevance of information arises from the idea of developing a tool that assists in the extraction of definitions from philosophical discourses (the PAPER/HCRAES projects).
One way to analyse a discourse with regard to the information expressed in it is to observe its Topic-Focus Articulation (TFA). A topic of (part of) a discourse can be conceived of as already available information, to which more information is added by means of one or more foci. The various topics and foci of a discourse are organized in certain structures, characterized by a thematic progression (``story-line''). The theories of TFA and thematic progression have been developed by the Prague School of Linguistics.
In order to discern the relevant information in a discourse, we try to establish its thematic progression(s). It turns out to be important not only how topics and foci relate to each other with regard to the thematic progression (sequentially, in parallel, etc.), but also how they are related rhetorically (e.g. by negation). In this paper we define the way in which the information structure of a discourse can be recognized, and what relevant information means in this context.
In this paper a method for generating coherent texts is described, in which only local conditions associated with sentences determine the appropriateness of a sentence at a given point in the text. The method requires no form of planning and concentrates on maximizing the variation in the generated texts.
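A minimal sketch of such local-condition generation, with invented sentences and conditions (not the paper's method): each candidate sentence carries a precondition on the topic of the previous sentence, and the generator picks any locally appropriate continuation, without a global plan.

```python
# Sketch of text generation driven only by local conditions: a sentence
# is appropriate at a given point iff its precondition matches the topic
# introduced by the previous sentence. Sentences and topics are invented.
import random

# (topic introduced, required previous topic, surface form)
sentences = [
    ("intro",  None,     "There is a house."),
    ("garden", "intro",  "Behind it lies a garden."),
    ("tree",   "garden", "In the garden stands a tree."),
]

def generate(n, seed=0):
    rng = random.Random(seed)
    text, topic = [], None
    for _ in range(n):
        options = [s for s in sentences if s[1] == topic]
        if not options:          # no locally appropriate continuation
            break
        key, _, sent = rng.choice(options)
        text.append(sent)
        topic = key
    return " ".join(text)

print(generate(3))
# → There is a house. Behind it lies a garden. In the garden stands a tree.
```

With several candidates per topic, the random choice among locally appropriate sentences is what produces variation across generated texts.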
In his book `Mechanisms of Implicit Learning' (1993), Axel Cleeremans describes how finite state grammars can be modeled successfully with connectionist networks, namely the Simple Recurrent Networks (SRNs) developed by Jeffrey Elman. However, SRNs cannot be used to model arbitrary finite state grammars. In this paper I describe the limitations of this approach.
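For reference, a finite state grammar of the kind modeled in such experiments can be written directly as a transition table. The toy grammar below (strings of the form a b* c) is invented, not the Reber grammar commonly used in the SRN literature.

```python
# A finite state grammar as an explicit transition table: the symbolic
# baseline that an SRN is trained to approximate by predicting, at each
# state, which symbols may come next.

def accepts(transitions, start, finals, string):
    state = start
    for sym in string:
        key = (state, sym)
        if key not in transitions:
            return False         # no transition: string is ungrammatical
        state = transitions[key]
    return state in finals

# toy grammar generating strings of the form a (b)* c
trans = {("q0", "a"): "q1", ("q1", "b"): "q1", ("q1", "c"): "q2"}
print(accepts(trans, "q0", {"q2"}, "abbc"))  # → True
print(accepts(trans, "q0", {"q2"}, "cb"))    # → False
```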
The recognition problem for attribute-value grammars (AVGs) was shown to be undecidable by Johnson in 1988. Therefore, the general form of AVGs is of no practical use. In this paper we study a very restricted form of AVG, for which the recognition problem is decidable (though still NP-complete), the R-AVG. We show that the R-AVG formalism captures all of the context free languages and more, and introduce a variation on the so-called off-line parsability constraint, the honest parsability constraint, which lets different types of R-AVG coincide precisely with well-known time complexity classes.
We apply complexity theory to grammatical theories. This enables us to compare these grammatical theories by an objective measure. In this paper, we concentrate on restricted versions of Lexical Functional Grammar (LFG) and Head-driven Phrase Structure Grammar (HPSG).
We show that the `fixed grammar' recognition problems of both restricted versions are intractable. By analogy with the intractability result for LFG in (Barton Jr. et al. 1987), we present an NP-hard lower bound on the complexity of both recognition problems. We also present an NP upper bound, thus showing that the recognition problems are NP-complete.
Because of the imposed restrictions, the presented versions avoid theory-specific constructions. We therefore conjecture that the complexity results carry over to other unification grammars, e.g., Functional Unification Grammar and PATR-II.
The intractability proofs show that excessive structure sharing and recursive attributes are the sources of the complexity. We conclude that grammatical theories should impose restrictions on these phenomena.
In this paper a head-corner parser for a fragment of the Minimalist Program (Chomsky 1992) is described. It will be argued that, because of the nature of Generalized Transformations (GT) and Move-$\alpha$, head-corner parsing is a suitable parsing technique for the Minimalist Program.
Furthermore, a proposal is made to treat functional and lexical heads differently: functional heads are not treated as head-corners of their mothers by the minimalist head-corner parser described here.
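The head-corner relation that drives such a parser can be sketched as a simple closure computation: a category H is a head-corner of a goal G if H heads a rule for G, possibly via a chain of such rules. The phrase-structure rules and head positions below are invented for illustration, not the minimalist fragment of the paper.

```python
# Sketch of the head-corner relation: close the goal category under
# "head daughter of a rule for", which tells the parser which lexical
# heads can anchor a bottom-up analysis of a given goal.

def head_corners(rules, goal):
    """rules: list of (lhs, rhs, head_index); head_index marks the head."""
    corners, agenda = {goal}, [goal]
    while agenda:
        g = agenda.pop()
        for lhs, rhs, h in rules:
            if lhs == g and rhs[h] not in corners:
                corners.add(rhs[h])
                agenda.append(rhs[h])
    return corners

rules = [("S", ("NP", "VP"), 1),   # VP heads S
         ("VP", ("V", "NP"), 0),   # V heads VP
         ("NP", ("Det", "N"), 1)]  # N heads NP
print(sorted(head_corners(rules, "S")))  # → ['S', 'V', 'VP']
```

Excluding a daughter from the head-corner relation, as the paper proposes for functional heads, amounts to never following that daughter in the closure above.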