Corpus study of negative polarity items

Jack Hoeksema

University of Groningen

1. Introduction

During the last five years I have been involved in a large research project mostly on negative polarity items (see note 1). Polarity items are words or idioms which appear in negative sentences, but not in their affirmative counterparts, or in questions, but not assertions, or in the protasis of a conditional, but not in the apodosis (cf. inter alia Ladusaw 1980, Linebarger 1981, Zwarts 1986, Horn 1989, Van der Wouden 1997). The general topic of the project was the study of distributional restrictions that appeared to be of a formal-semantic nature. In this paper, I will discuss some of the results of this project in the area of corpus linguistics, focussing primarily on the topic of polarity items in the language of mature speakers. Other topics in the project were acquisition of polarity items (partly on the basis of child language corpora) and the distribution of degree adverbs.

Most semantic restrictions on distribution belong to the domain of lexical semantics. These are the selection restrictions of generative grammar (Chomsky 1965). They deal with features like [human] or [edible]. My research is not primarily concerned with such lexical-semantic features (see section 6 below for some discussion), but with features of a more abstract nature, such as implication reversal (Ladusaw 1979), anti-additivity (Zwarts 1981) and veridicality. For example, what do the sentences in (1) have in common?

(1) a Is Fred happy?
b Fred is not happy.

It would seem that what they have in common is a proposition, Fred is happy, whose truth is not entailed by either sentence. Negation and interrogation share a semantic feature called nonveridicality (see e.g. Zwarts 1995, Giannakidou 1997): a proposition which is negated or questioned is not entailed to be true. Veridicality as well as other semantic features appear to play a role in the distribution of many expressions, such as English any:

(2) a She cannot sing any louder. nonveridical
b Can she sing any louder? nonveridical
c If she sings any louder, the neighbours will call the police. nonveridical
d *She sang any louder. veridical

In our project, we decided to study the phenomenon in depth. For example, it is remarkable that the majority of the literature on polarity items appears to be devoted to indefinite pronouns, like English any or ever or French aucun. However, there are hundreds of polarity items in English and French. Muller (1991) mentions an unpublished list of 600 French NPIs. In my own database, I have collected about 500 items in Dutch. One would guess that English cannot lag far behind this number. So a reasonable question to ask is, first, what does the entire set of polarity items look like, and second, do current theories, based as they are on in-depth study of a few items, generalize to the larger set of all NPIs.

To answer these questions, corpora are an invaluable tool. For example, in order to establish what items count as NPIs, it is useful to check potential candidates against a corpus. Second, a corpus search provides quick and dirty information on the distribution of an item. This will make it easier to formulate theories regarding the distributional properties of NPIs in a way that is much more humane than the method of introspective judgments by informants. What do I mean by humane? Consider the size of the problem we were facing. We should test hundreds of items in dozens of contexts. We should check if an item may appear with clause-mate negation and with higher-clause negation. If the latter is the case, we should check whether it is possible only with neg-raising constructions. We should check if an item appears in a question. Not just any question, mind you. Can it appear in a yes/no question? Can it appear in a WH-question? If so, are there differences between WH-sentences of the who/what variety and the how/why variety? For both yes/no questions and WH-questions, we should verify whether the question has to have a rhetorical flavor. Is there a difference between embedded and direct questions? Does it matter under what predicate an indirect question is embedded?

So far, it appears that all these factors can be relevant, which does not make our problem any easier. Then there are all the other environments, such as the scope of various negative adverbs, like only, a notorious problem actually (cf. the debate in Atlas 1993 and Horn 1996, inter alia), or hardly, conditional clauses of all stripes, comparative clauses, relative clauses restricting a universal or superlative expression, complements of adversative verbs and adjectives of various kinds, imperatives, perhaps, and concessive clauses.

An unpublished study (Hoppenbrouwers 1983) reports on a test of some 50 items in 50 sentential frames. The result was a list of over 2500 sentences, which was tested on five informants. It is an amazing feat to find 5 people willing to judge 2500 sentences, but the findings of the study are not always easy to interpret. In some cases, all informants agree, and in these cases, one feels that one informant would have sufficed. Quite often, however, the judgments vary. When there is variation, the judgments are hard to interpret. If two people accept a sentence, two reject it, and one doesn't know, does that mean a dialect split, or are we dealing with an item in a context where people find it hard to make up their minds? Suppose we have a sentence, accepted by 3 speakers, and another one, accepted by 2. Is the former better than the latter? Can we say that the first one is OK, being accepted by the majority, and the second one out? It is fairly obvious, I think, that we cannot jump to such conclusions. We could, of course, broaden the investigation a bit, let us say we study the judgments of 100 informants on 400 x 50 sentences, which would be a total of 2 million judgments.
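The size of such judgment tasks can be checked with a few lines of arithmetic. The sketch below assumes that every informant rates every item in every frame, and it reads "400 x 50 sentences" as 400 items in 50 frames; the function name is only illustrative.

```python
def judgment_count(items, frames, informants):
    """Judgments needed when every informant rates every item in every frame."""
    return items * frames * informants

# Design of the unpublished study: some 50 items, 50 frames, 5 informants.
sentences_small = 50 * 50
judgments_small = judgment_count(50, 50, 5)

# Broadened design, read here as 400 items in 50 frames with 100 informants.
judgments_large = judgment_count(400, 50, 100)

print(sentences_small, judgments_small, judgments_large)
# prints: 2500 12500 2000000
```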

However, just by checking an easily available corpus such as a newspaper on CD-ROM, we can often establish pretty quickly that a certain item only occurs with clause-mate negation, or, say, nearly exclusively in conditional and interrogative clauses. In other cases, we find that the item is so rare that many people may not use it actively or may not be aware of its conditions of use. In yet other cases, we find examples which seem to contradict our own intuitions, until we realize that the sentence has a pronunciation, a prosodic contour which suddenly makes it quite acceptable. When we are given such a sentence to judge, especially if there are several thousand other sentences still waiting, we would simply reject it. Now we see it in a corpus, and must give it the benefit of the doubt: somebody has used this sentence, so presumably somebody accepts it. And we look at it, until we see how it works.

Not so easy, at this moment, are cases of dialect splits, such as the famous cases of positive anymore in American English. Corpora are not usually divided up into regional subcorpora, although of course it is possible to do this, and moreover, it has been done. For medieval Dutch, for instance, there is a corpus covering all remaining documents from the 13th century, with a tag indicating the region where the document was produced. However, for modern Dutch, for instance, there is at present no such thing, if you ignore dialect atlases and the like. There are corpora of varieties of English, like Indian English or Australian English, and you could study them to discover differences among polarity items, but so far, this has not been done, to the best of my knowledge. With the advent of large newspaper corpora on CD-ROM, it is becoming easier to study regional variation, by comparing say Le Monde with Montreal newspapers, or The New York Times with the Manchester Guardian.

For example, you can find the item cop or much cop, meaning `worth much', in British newspaper corpora, but not in their American counterparts. A corpus I use a lot is a collection of Internet texts, mainly postings on bulletin boards. The Internet has the advantage of containing a very broad spectrum of texts, ranging from manuals and technical documents, to White House briefings, science fiction, pornography, chess games and classical literature. For the lexicographer, it is a real gold mine. Much cop showed up only once in this corpus, in the text of a David Bowie song:

(3) Don't pick fights with the bullies or the cads
`Cause I'm not much cop at punching
Other people's dads (David Bowie, `Kooks',
from the album Hunky Dory)

This particular example also illustrates an important feature of polarity items: polysemy and lexical ambiguity. This feature is the topic of the next section.

2. Ambiguity and polysemy

Negative polarity items are nearly always the evil negative twin of some perfectly innocent nonpolarity item. The above-mentioned polarity-sensitive use of cop, for instance, is virtually drowned out, in a corpus, by occurrences of the other words cop, such as the noun and the verb (as in the particle verb combination cop out). If you use a corpus for automatic detection of polarity items, this fact makes your task really hard. Although it can be done in principle, if your corpus is large enough, and your methods are sufficiently sophisticated, I have not, in my own work, found automatic detection something worthwhile to do. When I looked for words that show up in the presence of negation more than other words, I ended up with lists of words only very few of which I could identify as polarity items. A number of problems make automatic detection a hard thing to do.
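The kind of search described above, looking for words that show up with negation more often than other words, can be sketched as follows. This is a minimal illustration and not the procedure used in the project: the marker list, the tokenizer and the four toy sentences are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately small list of negation markers; a real study
# would need a far richer inventory of licensing contexts.
NEGATION_MARKERS = {"not", "n't", "never", "no", "nobody", "nothing"}

def tokenize(sentence):
    """Lowercase a sentence and split it into words, separating the clitic n't."""
    sentence = sentence.lower().replace("n't", " n't")
    return re.findall(r"n't|[a-z]+", sentence)

def negation_attraction(sentences, min_count=2):
    """Map each word to the share of its sentences that contain a negation marker."""
    total, negative = Counter(), Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        is_negative = any(t in NEGATION_MARKERS for t in tokens)
        for word in set(tokens) - NEGATION_MARKERS:
            total[word] += 1
            if is_negative:
                negative[word] += 1
    return {w: negative[w] / total[w] for w in total if total[w] >= min_count}

# Invented toy sentences, for illustration only.
toy_corpus = [
    "I'm not much cop at punching",
    "The cop stopped the car",
    "She is not much cop at singing",
    "The cop did not cop out",
]
scores = negation_attraction(toy_corpus)
```

Even on this toy input the difficulty is visible: much scores 1.0, but so does the preposition at, while cop is pulled down to 0.75 by its use as a noun.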

A main problem is separating the various uses of a word. Consider the items in (4):

(4) matter
mind
bother
much

All these items pose problems which I will briefly discuss.

In the case of matter and mind, there is no polarity sensitivity when these words are nouns. However, when we consider the verbs matter and mind, we see polarity sensitivity, albeit to different degrees: matter is used about 80% of the time in polarity contexts such as negation, questions, and the like, mind in about 99%. If we want automatic detection of this attraction to negative contexts, we need a tagged corpus, in which word occurrences have word-class tags. That will allow us to separate nominal from verbal occurrences of matter and mind.
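With a tagged corpus, the separation of nominal from verbal occurrences can be sketched as below. The input format (sentences as lists of token/tag pairs), the tag names and the crude test for a polarity context are assumptions made for this example; the sentences are invented.

```python
from collections import defaultdict

# Hypothetical marker list; questions are recognized only by a final "?".
NEGATIVE_TOKENS = {"not", "n't", "never", "no"}

def in_polarity_context(tagged_sentence):
    """Crude check: the sentence has a negation marker or ends in a question mark."""
    tokens = [token.lower() for token, _ in tagged_sentence]
    has_negation = any(t in NEGATIVE_TOKENS for t in tokens)
    is_question = bool(tokens) and tokens[-1] == "?"
    return has_negation or is_question

def polarity_share_by_tag(tagged_sentences, word_forms):
    """Map each tag to (occurrences in polarity contexts, total occurrences)."""
    counts = defaultdict(lambda: [0, 0])
    for sentence in tagged_sentences:
        flagged = in_polarity_context(sentence)
        for token, tag in sentence:
            if token.lower() in word_forms:
                counts[tag][1] += 1
                if flagged:
                    counts[tag][0] += 1
    return {tag: tuple(pair) for tag, pair in counts.items()}

# Invented toy sentences with an assumed tagset.
tagged_corpus = [
    [("I", "PRON"), ("do", "AUX"), ("n't", "PART"), ("mind", "VERB")],
    [("Do", "AUX"), ("you", "PRON"), ("mind", "VERB"), ("?", "PUNCT")],
    [("She", "PRON"), ("has", "VERB"), ("a", "DET"), ("sharp", "ADJ"), ("mind", "NOUN")],
    [("He", "PRON"), ("did", "AUX"), ("not", "PART"), ("change", "VERB"),
     ("his", "PRON"), ("mind", "NOUN")],
]
result = polarity_share_by_tag(tagged_corpus, {"mind", "minds", "minded", "minding"})
```

Keeping the counts per tag means that the noun mind, which also turns up in negative sentences by accident, does not blur the figure for the verb.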

In the case of bother, word-class tags are not going to help us very much. Here, we observe two basic meanings, both of which are associated with the use of this word as a verb. The one use can be paraphrased, somewhat roughly, as `annoy', and belongs to the class of psych-verbs. The other use can be paraphrased as `to take the trouble', and belongs to the verbs of effort, along with try, manage, etc. In the first use, there is only a very weak predilection for negative contexts, in the second, a much stronger one. Here, what we could use for automatic detection is a parsed corpus, since the two uses of bother are distinguished by their complements: bother-1 takes NP-complements, whereas bother-2 takes infinitival complements, gerunds, with-PPs or zero-complements:

(5) a John didn't bother Mary with his bragging. bother-1
b It didn't bother Mary that John called. bother-1
c John never bothered to call. bother-2
d John wouldn't bother calling. bother-2
e John didn't bother with the homework. bother-2
f John decided not to bother. bother-2
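Given a parsed corpus, the split illustrated in (5) could be applied mechanically. The sketch below assumes that a parser has already reduced each occurrence of bother to a list of complement labels; the label names are hypothetical and no actual parser is involved.

```python
# Hypothetical complement labels, assumed to come from a parsed corpus.
BOTHER_2_COMPLEMENTS = {"infinitive", "gerund", "with-PP"}

def classify_bother(complements):
    """Assign an occurrence of bother to bother-1 or bother-2 by its complements."""
    if "NP" in complements:
        return "bother-1"   # psych-verb use, roughly `annoy'
    if not complements or set(complements) <= BOTHER_2_COMPLEMENTS:
        return "bother-2"   # effort use, roughly `take the trouble'
    return "unclassified"

# Complement labels hand-assigned to the sentences in (5).
examples_5 = {
    "a": ["NP", "with-PP"],      # bother Mary with his bragging
    "b": ["NP", "that-clause"],  # bother Mary that John called
    "c": ["infinitive"],         # bothered to call
    "d": ["gerund"],             # bother calling
    "e": ["with-PP"],            # bother with the homework
    "f": [],                     # zero complement
}
labels = {key: classify_bother(value) for key, value in examples_5.items()}
```

The NP check comes first because (5a) combines an NP object with a with-PP and still belongs to bother-1.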

In the case of much, a great number of uses must be distinguished. Much can be an adjectival determiner of mass nouns, and an adverb:

(6) a John wasted much time.
b John has much improved.

Both uses must be further subdivided if we want to isolate two uses in which much is polarity sensitive. The first concerns the determiner much in the partitive construction much of a:

(7) a Fred is not much of a hero.
b John did not stand much of a chance.

Notice that omission of not in these sentences leads to illformedness. The second one concerns adverbial much in sentences like:

(8) a John did not much like the idea.
b Tom never much cared for Jerry.

As a matter of fact, whether much is polarity sensitive or not depends on the expressions it modifies. With some verbs, like improve or prefer, often involving some kind of comparison (cf. Bolinger 1972), but also with comparative adjectives (as in much better, much more, much less, much faster etc.) much is used without requiring negation or any other negative context. With other verbs, however, we see that negation is required for grammaticality. The example of much shows that part of speech information is not enough to separate off certain polarity sensitive uses: we need more information, in this case on certain collocation patterns, so that we can distinguish much of the (not polarity sensitive) from much of a, and much prefer from much like. Indeed, we may further complicate matters by noting that the behavior of much may also be affected by the presence of degree adverbs premodifying it:

(9) a *John liked it much.
b John did not much like it.
c John liked it very much.
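The collocation patterns just mentioned can be approximated with surface patterns. The sketch below is only a rough illustration: the two pattern lists are hypothetical, cover just the collocations named in the text, and would need extensive hand-checking on real data.

```python
import re

# Hypothetical surface patterns covering only the collocations named in the text.
NONSENSITIVE_PATTERNS = [
    r"\bmuch of the\b",
    r"\bmuch (prefer|prefers|preferred|improve|improves|improved)\b",
    r"\bmuch (better|more|less|faster)\b",
    r"\b(very|so|too) much\b",   # premodified much, as in (9c)
]
SENSITIVE_PATTERNS = [
    r"\bmuch of an?\b",
    r"\bmuch (like|likes|liked|care|cares|cared)\b",
]

def classify_much(sentence):
    """Guess from the collocation whether this use of much is polarity sensitive."""
    text = sentence.lower()
    if any(re.search(p, text) for p in NONSENSITIVE_PATTERNS):
        return "not polarity sensitive"
    if any(re.search(p, text) for p in SENSITIVE_PATTERNS):
        return "polarity sensitive"
    return "undecided"

print(classify_much("Fred is not much of a hero."))
print(classify_much("John liked it very much."))
```

Sentences such as John wasted much time fall through to "undecided", which is where manual inspection would still be needed.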

Polysemy is rampant among polarity items. The above examples could be multiplied many times, when we consider the various uses of any, ever, yet or need.

In corpus studies of such items, a lot of the work that needs to be done is still best done by hand, unless you really insist on using the best available tools for automatic word sense disambiguation, such as programs which utilize bilingual corpora of considerable size. But note that such corpora always come with a handicap. They are often confined to certain styles or domains of conversation, and therefore will not contain some of the items you might be interested in. I doubt, for instance, if the bilingual proceedings of the Canadian parliament contain the swear words and taboo words of the give-an-X category: give a shit, give a rat's rear, give a tinker's malediction and so on.

However, automatic detection of polarity items is by no means impossible, as long as you are willing to face the complexity of the problem, and to accept that there will always be items of such low frequency that statistical methods will have trouble identifying them. The problems are comparable to those facing automatic tagging and parsing of unrestricted text, although they are even more complex, because in some cases we need not only information about word classes, and constituent structure, but also about the scope of negation, whether a certain construction can be read as a conditional, whether a given question can be rhetorical, and the like. These are hard problems, for which there is often no rigorous linguistic treatment. Unlike automatic tagging, however, which is a useful task since it renders unnecessary a lot of boring human labor and is essential for modern language technology, automatic detection of polarity items is not equally useful and the prospects of automation in this area are consequently not overly bright.

3. Defining the contexts

Another problem facing the corpus study of polarity items is the determination of licensing contexts. Some contexts are fairly clearly marked, such as negative clauses, but others are more difficult to detect automatically. Consider the following two sentences:

(10) a You say anything, and I'll kill you.
b *You said anything, and I killed you.

At first sight, it looks like the tense of the verb is directly responsible for the acceptability of anything, as it is the only thing formally distinguishing the two sentences. Of course, this is only indirectly true: the first sentence can be read as a conditional, equivalent to an if .. then statement, while the second cannot be so interpreted. Syntactic structure and logical connectives are the same in both sentences, yet there is a crucial difference.

For another example of how structurally similar conditional and nonconditional interpretations can be, consider (11a) (with a conditional interpretation) and (11b):

(11) a With any luck, our lean days are over. [= If we have any luck, ..]
b With some luck, we survived the ordeal. [Having had some luck, ...]

I should admit, however, that I am making things seem worse than they are in playing the linguist's favorite game of presenting worst-case scenarios. By far the majority of English conditionals, of course, have a recognizable if .. then marking, and in a statistical approach, they will determine an outcome which should not be hampered much by a few unrecognized occurrences in nonstandard conditionals.

Another basic problem is scope of negation. While material to the right of negation in an English clause can always be claimed to be within the scope of negation, the same is not true of material to the left. Here, numerous factors may play a role in determining whether negation may have wide scope over subjects or topicalized elements. For an overview of some of the intricacies of English and Dutch in this area, with special reference to polarity licensing, see Hoeksema (1997). Corpus study can play an important role in finding out what factors are relevant here, but we cannot yet apply existing theories of scope to help us in our corpus investigations, because none are robust enough, at present, to withstand confrontation with naturally-occurring data.

A third problem is posed by so-called negative predicates: predicates such as lack or impossible, which trigger polarity items in their complements:

(12) a The proposal lacks any plausibility.
b It was impossible to ever return.

The problem is how to determine independently which predicates should count as negative. There are many such predicates, some of which are basic lexical items, such as lack, others are morphologically complex or idioms. There are no tests which could be used in corpus linguistics. Often, the best we can do is to use some `circular' reasoning: we may, at least provisionally, classify a predicate as negative if it appears to trigger a polarity item. We could then make a list, and use that to further study the distribution of polarity items or to discover new items. For more discussion of negative predicates and how they license polarity items, see Hoeksema and Klein (1995).

4. Distribution of Polarity Items over Licensing Environments

Let us now see what other applications of corpora are useful in the study of polarity items. An interesting application is the use of corpora to study how items are distributed over the various environments in which they are licensed. Even when we look at items which occur in roughly the same set of environments, we may find important differences in the frequency with which they occur in them. At the same time, I have found that comparable items in different languages tend to have comparable distributions. I will illustrate this with a number of charts. These charts are ordered on the basis of increasing complexity. Simplest are the distributions which I will call monomodal, in which the bulk of occurrences are located in one area of the spectrum, e.g. in the subset of environments which are strictly negative (as opposed to interrogative or conditional or any of the other contexts). Next come the bimodal distributions, with two distinct peaks. And finally, there are the many examples of polarity items with multimodal distributions, some of which are exemplified here.
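The three-way classification can be given a rough operational form. The sketch below assumes that each corpus occurrence has been hand-labeled with its licensing environment, and it counts how many environments are needed to cover a fixed share of the occurrences. The 80% coverage value, the label names and the toy counts are all invented for illustration and are not figures from the charts.

```python
from collections import Counter

def environments_covering(counts, coverage=0.8):
    """Return the most frequent environments that jointly reach the coverage share."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no occurrences to classify")
    covered, selected = 0, []
    for environment, n in counts.most_common():
        selected.append(environment)
        covered += n
        if covered / total >= coverage:
            break
    return selected

def classify_distribution(environment_labels, coverage=0.8):
    """Label a distribution as monomodal, bimodal or multimodal (assumed criterion)."""
    main = environments_covering(Counter(environment_labels), coverage)
    kind = {1: "monomodal", 2: "bimodal"}.get(len(main), "multimodal")
    return kind, main

# Invented toy annotations, one environment label per corpus occurrence.
anymore_like = ["negation"] * 18 + ["question"] + ["positive"]
in_years_like = ["negation"] * 11 + ["superlative"] * 8 + ["question"]
any_like = (["negation"] * 9 + ["question"] * 4
            + ["conditional"] * 4 + ["comparative"] * 3)

for sample in (anymore_like, in_years_like, any_like):
    print(classify_distribution(sample))
```

A coverage criterion is used instead of counting peaks because a multimodal item may still have one dominant environment, with the remainder spread thinly over many others.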

In Figure 1, I compare the distribution of English anymore with that of its Dutch and German counterparts meer and mehr. The English data are based on a corpus of postings on the Internet (with a size of about 16 million words), the Dutch and German ones on various corpora, including the ones on the CD-ROM of the European Corpus Initiative as well as the online corpora of the Institut für Deutsche Sprache in Mannheim.

Figure 1

The tiny slice of the pie representing positive occurrences is due to the presence, in this corpus, of speakers using so-called `affirmative anymore', a usage which is substandard but widespread in the United States (see e.g. Labov 1975). There is no comparable affirmative usage in either Dutch or German. Had I used a British English corpus, the comparison of the three languages would have yielded a completely parallel picture for the three items.

Expressions with a bimodal distribution appear to be fairly rare. However, a good example is the class of temporal NPIs of the form in X, where X is a temporal noun. In English, this class is represented by expressions such as in months, in years, in ages, in eons etc., and in Dutch by in tijden `in times = in ages', in jaren `in years', in maanden `in months'. The distribution is bimodal in the sense that there are two basic environments in which to find these expressions, namely negative contexts and comparative/superlative constructions (assuming, for the time being, that comparatives and superlatives are a natural class here). The precise causes of this bimodal distribution have not, as yet, been established, nor do I know of other items with comparable distributional properties.

Figure 2

In the next chart, Figure 3 below, I compare English any with its Dutch cognate enig. Just like any, enig is a negative-polarity item, but it lacks the free-choice use which any has (Ladusaw 1979). To allow for a proper comparison, free choice uses of any were excluded from the database. Moreover, enig is not polarity sensitive when it combines with plural or mass nouns (Hoeksema and Klein 1995), so for this chart, only occurrences with singular count nouns were collected. Notice the general shape of the pie, indicating that the items in question are distributed over many environments, with negative contexts as the biggest slice of the pie. One difference between Dutch and English, which will be found in other charts as well, is the fact that Dutch enig occurs far more frequently in without-clauses than its English counterpart. This is likely to be due to the greater syntactic flexibility of Dutch zonder `without': it can introduce noun phrases as well as both finite and infinitival complements in Dutch, and hence has a greater potential use than English without, which is restricted to noun phrase complements and gerunds.

Figure 3

The next chart, Figure 4 below, compares the polarity sensitive auxiliaries need, hoeven and brauchen (cf. Edmondson 1983, van der Wouden 1997). English need is polarity sensitive just in case it does not require the infinitival particle to. There is, in other words, a minimal difference between Fred need not worry and Fred doesn't need to worry. In the former sentence, negation cannot be omitted, but in the latter, it can: cf. Fred needs to worry. Dutch hoeven and German brauchen are polarity items precisely when they are used as auxiliary verbs, as in the following equivalent sentences:

(13) a Fred need not work.
b Fred hoeft niet te werken.
c Fred braucht nicht zu arbeiten.

Again, we see a multimodal distribution, with negative contexts making up the largest slice of the pie. However, there are major differences with the charts for any/enig, the polarity sensitive indefinite pronouns. Note, for instance, that the indefinite pronouns are frequent in comparative constructions (more than 10% of occurrences), but the auxiliaries, although not ungrammatical in comparative clauses, only rarely show up there. The explanation is simple: most comparatives have NP complements, whereas clausal complements are far less common. Any/enig show up in NP complements but the modal verbs of course require a full clausal context, and hence are restricted to the small subset of full clausal comparatives. So simple baseline frequencies can explain this particular difference.

Figure 4

Note that need doesn't show up with any frequency in without-clauses, unlike its Dutch and German counterparts. This is easily explained on the basis of a general difference between English and continental Germanic. Whereas English restricts its modals to finite clauses, continental Germanic also allows for infinitival forms of the modals. If we add this to the information that English without, unlike Dutch zonder or German ohne, does not introduce finite clauses, we can see why need, being a modal verb, does not show up in complements of without much. It is not completely excluded, because it could occur in a finite clause embedded in a gerund, for example without claiming that anyone ever need return, but such constellations are clearly rare. Another striking difference between English and continental Germanic appears to involve the use of need in conditionals. Whereas this use is completely absent (and ungrammatical) in continental Germanic, Figure 4 shows a substantial slice of the pie for the use of need in conditionals. However, upon closer inspection, it turns out that all relevant examples involve the fixed expression if need be. If we leave these out, as well as the without-data for Dutch and German, we arrive at Figure 5.

Figure 5

Now the three pies are practically identical. The distributional similarities are even more striking at a micro-level when we zoom in on small slices of the pies. Take for example occurrences within restrictions of universal quantifiers. In the case of polarity sensitive indefinites, such restrictions form a small but well-established part of the distribution. However, from the literature on hoeven (e.g. Zwarts 1981), we know that this modal verb cannot be used in conditional clauses, or in restrictions of universal quantifiers. English and German pattern like Dutch in this regard:

(14) a *Every student who need study is present. (English)
b *Iedere student die hoeft te studeren is aanwezig. (Dutch)
c *Jeder Student der zu studieren braucht ist anwesend. (German)

However, in a corpus investigation, it was found that there is a well-defined subclass of universal quantifiers that allow for these polarity sensitive modals in their restriction, namely those introduced by all or its Dutch and German counterparts:

(15) a That is all we need know. (English)
b Dat is alles wat we hoeven te weten. (Dutch)
c Das ist alles was wir zu wissen brauchen. (German)

It is not entirely clear to me what makes all different here from other universal quantifiers, but the observation is robust. If we were to continue our investigations further, we would note that occurrences of need (or its Dutch and German counterparts) in a restrictive relative clause modifying all are frequently found within subjects or predicate nominals. And intuitively, it seems that the sentences in (16) are ill-formed:

(16) a ??John detests all we need know. (English)
b ??Jan verafschuwt alles wat we hoeven te weten. (Dutch)
c ??Hans verabscheut alles was wir zu wissen brauchen. (German)

However, the data are too sparse at this point to permit further investigation by means of corpus study. In general, corpus linguistics cannot replace or even complement the use of introspection in areas where usage is exceedingly rare, until the advent of much larger corpora. Nevertheless, the corpus data suggest that the triggering of need or hoeven in relative clauses is probably sensitive to more aspects of the quantificational structure of the sentence than just the properties of the determiner.

A number of conclusions can be drawn from these charts. First, that polarity sensitivity is in fact highly diverse, and that there is not going to be a simple unified theory of polarity items which reduces their distribution to a common property of polarity sensitivity. Another, equally important, conclusion is that the distribution of polarity items must largely be a function of lexical semantics. The items with similar distributions which we looked at here were items with similar lexical meaning. The contribution of lexical semantics cannot be overlooked, even if we do not, at the moment, have a good theory which makes clear predictions about the relation between distribution and meaning.

5. The emergence of polarity sensitivity

The method we have used here, to divide up occurrences of polarity items in a number of classes, determined by trigger elements, can also be applied to items which are not, strictly speaking, polarity items. Some expressions such as the verbs matter and care or the adjective keen have a distribution which is somewhat similar to that of regular polarity items. They occur rather often in negative environments, and moreover, are semantically related to true polarity items. Care, for instance, in I don't care, is semantically on a par with a regular polarity item like give a damn, as in Frankly, I don't give a damn, or one of the many verbal idioms in Dutch for expressing general indifference. These pseudo-polarity items, which still have a substantial positive usage, are interesting for a number of reasons. First of all, it is a reasonable theory that polarity items are selected from among the pseudo-polarity items. We do not expect arbitrary words to develop polarity sensitivity, as if that were some contagious disease, suddenly affecting perfectly normal words. Rather, we expect pseudo-polarity items to gradually drift towards greater polarity sensitivity, until their use is virtually restricted to the negative, conditional, interrogative etc. contexts of strict polarity items. This would make sense if polarity sensitivity is the result of a process of grammaticalization (Hoeksema 1994).

But how does the original statistical preference for such environments come about? In part, there may be pragmatic reasons. A factor which clearly plays a role, as was noted by Horn (1989) and Israel (1996), among others, is what I will call rhetorical stereotyping. Certain expressions tend to be picked out for understatements of various kinds, like litotes, or for hyperbolic usage.

Consider in this connection for a moment some of the idioms involving roses in the European languages. Roses are symbolic for ideal situations. They are beautiful, smell nice, and so it comes as no surprise that roses figure prominently in a number of idiomatic expressions indicating ideal situations or courses of events. We can take this at face value, and we get sentences like

(17) a It was roses all the way. (English)
b Ajax zit op rozen. (Dutch)
"Ajax is sitting on roses"

These idioms can be characterized as hyperbolic. Since ideal situations are not the stuff of everyday life, we must take them with a grain of salt. But some of these idioms are reserved for other purposes. When we deny them, we get understatements, and it is striking that some very close counterparts to the hyperbolic ones occur which are typically used in combination with negation, as the following examples from a variety of European languages illustrate:

(18) a Life is no bed of roses. (English)
b Het leven gaat niet over rozen. (Dutch)
c Sein Leben war nicht wie auf Rosen gebettet. (German)
d La vida no es un lecho de rosas. (Spanish)

Polarity sensitivity develops here, it would seem to me, as a result of selecting certain items for use in hyperbolic contexts and keeping others for understatements. This is what I mean by rhetorical stereotyping.

Other types of pragmatic stereotyping can also be recognized. For instance, the Dutch degree adverb knap is derived from an adjective meaning `able, clever' or `pretty'. Just like its English synonym pretty, this adjective developed an extra use as a degree adverb, but unlike pretty, the expressions it combines with are negative in the evaluative sense of the word. Typical combinations are

(19) knap vervelend pretty annoying
knap rot pretty rotten
knap beroerd pretty lousy
knap lastig pretty difficult
knap irritant pretty irritating
knap jaloers pretty jealous
knap onhandig pretty clumsy

In a corpus of 66 occurrences of this item I found 40 different combinations, all with a negative connotation. Combinations with a positive connotation, such as English `pretty nice' or `pretty good', do not exist with Dutch knap. For English `pretty', one could say that its positive connotation is bleached, or neutralized, and the result is a degree adverb with neutral characteristics, which you can use with any adjective. Knap, on the other hand, seems to have reversed the positive connotation of the original adjective. The most plausible explanation for this would be that the first uses of knap as an adverb of degree were restricted to ironical or sarcastic contexts. But once the negative connotation was fixed, the sarcastic force disappeared.

Degree adverbs are interesting for many reasons, not the least of which, for me, is that they may turn into polarity items. Many languages have such items: consider English all that, as in He's not all that smart, which needs the negation to be acceptable (cf. *He is all that smart). The same is true for exactly, when it is used as a degree adverb, cf. He's not exactly smart. When exactly is not used as a degree adverb, but is applied to nongradable expressions, as in Fred is exactly twice as old or Fiona is exactly like her mother, we do not encounter polarity sensitivity. German has sonderlich, as in

(20) Es war nicht sonderlich erfreulich.
"It was not particularly pleasant"

as well as a polarity-sensitive use of gerade `just, exactly' analogous to the degree use of exactly. Dutch has several such polarity-sensitive adverbs. Why is it that languages tend to develop such degree adverbs? One explanation is clearly pragmatic stereotyping. Adverbs marking a high degree of some property tend to develop a special understating sense under negation. Think of English not very smart. On the surface, this looks like a perfectly compositional expression, the predicate-denial of very smart. Yet it is typically used as an understatement. By saying not very smart, we usually mean something stronger, like rather stupid. For very, this use in understatements is a little job on the side, but for all that and sonderlich, it has become the main occupation, and the result of this pragmatic specialization is two negative polarity items.

In Dutch, we see something slightly more complicated. We have two degree adverbs, bar and bijster, the distribution of which has been puzzling to many observers. A detailed corpus study reveals that these items, both of which mean something like very or all that, have developed an unusual two-way sensitivity to polarity. First of all, they are sensitive to sentential polarity, that is, whether or not the sentence, or main predication, is negative or affirmative. Second, it matters what kind of adjective they modify. Adjectives often come in pairs, called antonym pairs, and the members of these pairs are traditionally referred to as the positive and negative poles on a common scale. Good, for instance, is the positive antonym of bad, and unpleasant is the negative counterpart of pleasant.

It turns out that bar primarily occurs with the negative member of a pair of antonyms as part of a positive predication, and bijster with a positive member of a pair of antonyms as part of a negative predication.

Some typical examples of these items, taken from a corpus of occurrences:

(21) Real Madrid was niet bijster geïnspireerd tegen Albacete.
"Real Madrid was not terribly inspired against Albacete."

(22) Werpers zijn hard nodig bij de bar slecht spelende Giants.
"Pitchers are badly needed with the very badly playing Giants."

However, this distribution is a very recent development. When we compare current distributional patterns with data from earlier parts of this century, we get the pattern in Figure 6 for bar. In this chart, I distinguish four environments, one of which, pos/neg, emerges gradually as the main environment for bar. At the beginning of this century, however, pos/pos was equally important, and the other two environments were also of some importance.

Figure 6

In Figure 7, we see a parallel development in the distribution of bijster. In the 19th century, this item was almost evenly divided over 3 of the 4 environments, with only neg/neg being disfavoured. In the 20th century, we see a steady rise of neg/pos at the expense of the two other environments. Note that the data here start a bit earlier than the data for bar, which was not very common in the 19th century.

Figure 7

As a result of these parallel developments, bar has virtually turned into a positive polarity item, and bijster into a negative polarity item. Historical developments like these are of interest for the general theory of polarity items. They suggest, for example, that the theory that the distribution of polarity items is completely determined by lexical semantics cannot be entirely correct. The two words bar and bijster started out with similar meanings and similar distributions, and without having developed different meanings came to have diametrically opposed distributions.
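The four-way coding of environments behind Figures 6 and 7 (sentence polarity paired with adjective polarity, so that pos/neg is a positive predication with a negative adjective) can be sketched as follows. This is an illustrative sketch only: the rows in `observations` are invented, the period labels are placeholders, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical hand-coded observations:
# (item, period, sentence polarity, adjective polarity).
# These rows are invented for illustration; they are not corpus data.
observations = [
    ("bijster", "1990s", "neg", "pos"),  # cf. (21): niet bijster geinspireerd
    ("bar", "1990s", "pos", "neg"),      # cf. (22): bar slecht
    ("bar", "1990s", "pos", "neg"),
    ("bar", "1900s", "pos", "pos"),
]

def environment(sentence_polarity, adjective_polarity):
    """Label an occurrence as pos/pos, pos/neg, neg/pos or neg/neg."""
    return f"{sentence_polarity}/{adjective_polarity}"

def distribution(rows, item, period):
    """Relative frequency of each environment for one item in one period."""
    counts = Counter(
        environment(s, a) for i, p, s, a in rows if i == item and p == period
    )
    total = sum(counts.values())
    return {env: n / total for env, n in counts.items()} if total else {}

print(distribution(observations, "bar", "1990s"))
print(distribution(observations, "bar", "1900s"))
```

Computing such a distribution per period for each item yields the kind of series plotted in the two figures.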

6. Extralogical factors

Besides the study of licensing environments, we can also consider other aspects of the distribution of polarity items. Many factors come into play, as we have seen above, including some that do not figure prominently in the study of polarity items, such as selection restrictions of various kinds. Selection restrictions are often viewed as purely pragmatic. Violations of selection restrictions as in McCawley's example my toothbrush is pregnant are generally viewed as extralinguistic in nature. The expression is grammatically correct, but odd due to a general fact about the world, namely that inanimate objects, such as toothbrushes, do not procreate. A syntactic approach to selection restrictions in terms of features, as in early generative grammar (Chomsky 1965), would seem to be unnecessary. However, in the area of polarity sensitive expletives, we see selection restrictions of a more grammatical nature. Consider in this connection the Dutch items zier and snars:

(23) a Ik zie geen snars.
I see no ....
"I don't see a thing"
b Ik doe geen zier.
I do no ...
"I am not doing a thing"

The dots indicate that no proper glosses can be given. The items are purely idiomatic, and do not have a referential meaning. One may compare this to English: Fred did not do diddley squat, for instance, contains an expression, diddley squat, which does not refer. Such expressions are polarity sensitive expletives. Even the etymology of snars is unknown, and the origin of zier is nowadays only known to historical linguists. Given that these expressions are meaningless in themselves, we would not expect to find selection restrictions. Yet such restrictions are quite strong, if not categorical. In Figure 8 below, I compare the two items with respect to the predicates which they combine with as objects or adjuncts:

Figure 8

The predicates were lumped together into a few sets for the purposes of comparison, and the result is a rather striking difference: whereas snars is preferred in the area of cognition verbs, zier is preferred with verbs of indifference, e.g. kunnen schelen `care', uitmaken `matter' etc. While we cannot speak of strict selection restrictions, we are certainly dealing with selection preferences here that cannot be reduced to pragmatic considerations such as knowledge of the world.
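The lumping of predicates into classes and the comparison of the two items can be sketched as a simple cross-tabulation. Everything here is illustrative: the mapping `PREDICATE_CLASS` is a hypothetical stand-in for the classification actually used (only kunnen schelen and uitmaken are named in the text; begrijpen and snappen are assumed examples of cognition verbs), and the pairs are invented.

```python
from collections import Counter, defaultdict

# Hypothetical predicate classification; the cognition verbs are assumptions.
PREDICATE_CLASS = {
    "kunnen schelen": "indifference",
    "uitmaken": "indifference",
    "begrijpen": "cognition",
    "snappen": "cognition",
}

# Invented (item, predicate) pairs; NOT the data behind Figure 8.
pairs = [
    ("snars", "begrijpen"),
    ("snars", "snappen"),
    ("snars", "uitmaken"),
    ("zier", "kunnen schelen"),
    ("zier", "uitmaken"),
    ("zier", "doen"),  # unclassified predicates fall under "other"
]

def class_profile(item_predicate_pairs):
    """Count, for each item, how often it combines with each predicate class."""
    profile = defaultdict(Counter)
    for item, predicate in item_predicate_pairs:
        profile[item][PREDICATE_CLASS.get(predicate, "other")] += 1
    return {item: dict(counts) for item, counts in profile.items()}

profile = class_profile(pairs)
print(profile)
```

A skew between the two rows of such a table is what is meant by a selection preference, as opposed to a strict selection restriction, which would leave some cells empty.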

7. Conclusion

In this paper I have reviewed some of the problems and prospects of corpus linguistics in the area of negative polarity phenomena. While there are numerous problems standing in the way of automatic detection of distributional patterns (most importantly: ambiguity and polysemy), it is argued in some detail here that the study of distributional patterns of use sheds new light on polarity items and provides the linguist with the tools for a richer classification, a deeper understanding and a better perspective on the historical changes out of which they arise. In particular, it is argued that similarities in distributional patterns across languages often reflect lexical-semantic similarities, but that, at the same time, lexical semantics does not fully determine polarity sensitivity.

Notes

1. The research reported here was conducted as part of the project Reflections of Logical Patterns in Language Structure and Language Use, funded by a PIONIER-grant from the Dutch Organization for Scientific Research (NWO) and the University of Groningen. I owe a debt of gratitude to my fellow researchers Anastasia Giannakidou, Henny Klein, Charlotte Koster, Hotze Rullmann, Victor Sanchez Valencia, Sjoukje van der Wal, Ton van der Wouden and Frans Zwarts. This paper was presented at a workshop on corpus linguistics, organized by IULA, at the Universidad Pompeu Fabra in Barcelona, spring 1997.

2. For example, of 397 conditional clauses in my database hosting an occurrence of any, 93% were introduced by if, 3% were of the Verb First variety (with fronted auxiliary, as in Should you need any assistance, we have a help desk on the 14th floor) and only 4% belonged to other varieties. Note, however, that the presence of if is not the only thing we need to note, since we must also distinguish it from interrogative if. Sometimes, of course, both readings are possible, making it impossible to determine by automatic means whether we are dealing with a conditional clause or not:

(i) Let me know if you need anything more.

= let me know whether you need anything more

OR = let me know in case you need anything more

3. The discussion of bar and bijster is based on Klein and Hoeksema (1994) and Klein (1997).

4. Zier originally meant a tiny worm or mite.


References

Atlas, Jay D., 1993, `The importance of being "Only": Testing the Neo-Gricean versus neo-entailment paradigms.' Journal of Semantics 10, 301-318.

Bolinger, Dwight, 1972, Degree words. Mouton, The Hague.

Chomsky, Noam, 1965, Aspects of the theory of syntax. MIT-Press, Cambridge, MA.

Edmondson, Jerry A., 1983, `Polarized Auxiliaries.' In: Frank Heny and Barry Richards, eds., Linguistic Categories: Auxiliaries and Related Puzzles, volume 1. Reidel, Dordrecht, 49-68.

Giannakidou, Anastasia, 1997, The Landscape of Polarity Items, dissertation University of Groningen.

Hoeksema, Jacob, 1994, `On the grammaticalization of negative polarity items.' Proceedings of the Twentieth Annual Meeting of the Berkeley Linguistics Society, ed. by S. Gahl, A. Dolbey and C. Johnson, 273-282. Berkeley Linguistics Society, Berkeley.

Hoeksema, Jack, 1997, `Negative Polarity Items: Triggering, Scope and C-Command'. Unpublished MS, University of Groningen.

Hoeksema, Jack and Henny Klein, `Negative Predicates and Their Arguments,' Linguistic Analysis, 25, 3-4, 146-180.

Hoppenbrouwers, Geer, `Resultaten van een onderzoek naar het voorkomen van negatief-polaire elementen in verschillende affectieve omgevingen.' Unpublished paper, Catholic University of Nijmegen.

Horn, Laurence R., 1989, A Natural History of Negation. Chicago University Press, Chicago.

Horn, Laurence R., 1996, `Exclusive company: Only and the Dynamics of Vertical Inference,' Journal of Semantics, 13, 1-40.

Israel, Michael, 1996, `Polarity Sensitivity as Lexical Semantics.' Linguistics and Philosophy 19, 619-666.

Klein, Henny and Jack Hoeksema, 1994, `Bar en bijster: Een onderzoek naar twee polariteitsgevoelige adverbia,' Gramma/TTT 3-2, 75-88.

Klein, Henny, 1997, Adverbs of Degree in Dutch. Dissertation, University of Groningen.

Labov, William, 1975, `Empirical foundations of linguistic theory,' in Robert Austerlitz, ed., The Scope of American Linguistics. The First Golden Anniversary Symposium of the Linguistic Society of America, The Peter de Ridder Press, Lisse, 77-133.

Ladusaw, William A., 1980, Polarity Sensitivity as Inherent Scope Relations. Garland Press, New York.

Linebarger, Marcia, 1981, The Grammar of Negative Polarity. Indiana University Linguistics Club, Bloomington.

Muller, Claude, 1991, La négation en français: syntaxe, sémantique et éléments de comparaison avec les autres langues romanes. Droz, Genève.

Wouden, Ton van der, 1997, Negative Contexts: Collocation, Polarity and Multiple Negation. Routledge, London.

Zwarts, Frans, 1981, Modeltheoretische semantiek en natuurlijke taal: een studie van het moderne Nederlands. Unpublished MS, University of Groningen.

Zwarts, Frans, 1986, Categoriale Grammatica en Algebraïsche Semantiek. Dissertation, University of Groningen.

Zwarts, Frans, 1995, `Nonveridical Contexts', Linguistic Analysis 25, 286-310.