The syntactic constraint on English auxiliary contraction

Richard Hudson; Nikolas Gisborne; Thomas Hikaru Clark; Eva Duran Eppler; Willem Hollmann; Andrew Rosta; Graeme Trousdale

doi:10.1017/S0022226725000131

Published online by Cambridge University Press: 28 March 2025

Richard Hudson: University College London
Nikolas Gisborne*: University of Edinburgh
Thomas Hikaru Clark: MIT
Eva Duran Eppler: University of Roehampton
Willem Hollmann: University of Edinburgh
Andrew Rosta: University of Central Lancashire
Graeme Trousdale: University of Edinburgh
*Corresponding author: Nikolas Gisborne; Email: [email protected]

Abstract

We offer a new explanation for the difference between cases where an auxiliary verb can and cannot contract, such as Kim is coming versus Kim is. Rather than a banning constraint, we argue that there is a positive syntactic licensing constraint. We consider, and reject, both the familiar Gap Restriction and a range of phonological explanations. Our analysis rests on the category of grammatical relations, valent, which includes all non-adjuncts (i.e. all subjects and complements); the analysis consists of a single claim, the Following Valent Constraint: that a contracted auxiliary has an overt following valent. We show how this analysis explains the full range of data that has been discussed in the literature and how a minor variant of the constraint captures the data of the Scots locative discovery expressions. We also propose a sociolinguistic explanation for the inability of auxiliaries to contract in certain environments, such as after a preposed negative. Finally, we suggest a functional explanation for the proposed constraint: It allows the hearer to predict the presence of a following valent and thereby to manage the burden of processing.

Keywords

auxiliary contraction; Following Valent Constraint; Locative Discovery Expression; valent; Word Grammar

Type: Research Article
Information: Journal of Linguistics, First View, pp. 1–32

DOI: https://doi.org/10.1017/S0022226725000131
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2025. Published by Cambridge University Press

1. Introduction

There are constructions that challenge any linguistic theory. English auxiliary contraction (AC), which applies to (1), is such a construction.

AC invites research because of its multifaceted nature as a major grammatical signal of informality and because of the lexical restrictions on it, but this paper is concerned with just one question about AC: What are the formal constraints such that, although everyone agrees that AC is possible in (1), it is not possible – indeed, it is generally characterised as ungrammatical – in (2)?

Precisely what licenses contraction in (1) when it is not possible in (2)? Our answer is a syntactic constraint on English (and as far as we know, only English) called the Following Valent Constraint (FVC), which affects the properties of the contracted auxiliary so that its valency includes an obligatory following subject or complement (a valent). According to our analysis, this valency constraint applies to all contracted auxiliaries (whether syllabic or non-syllabic) but not to full auxiliaries. We develop this analysis in Section 3, but before that we review the history of research in this area.

This question has exercised linguists greatly during recent decades, but it is not new. As King (1970) notes, the first published discussion, from just a century ago, may have been in Harold Palmer’s A Grammar of Spoken English. Palmer offers a general description of the differences between the ‘strong’ and ‘weak’ (i.e. contracted) forms:

The strong form is used when the word is isolated, stressed, or at the end of a sentence or of a more or less complete word group. At the beginning of a sentence, the strong form is frequently used. In most other cases the weak forms are used. (Palmer 1924: 9–10)

Besides this general overview of contraction, Palmer also discusses the various relevant verb forms individually, repeating much the same explanation for each. For the forms of am, for example, he lists [æm], [m], and [əm] and explains when each is used. So [æm] is used ‘when isolated: ðə wə:d “[æm]”, when stressed […], when at the end of a sentence or breath group […], and occasionally when unstressed and followed by the personal pronoun’; [m] is used ‘when unstressed and preceded by the personal pronoun’; and [əm] is used ‘when unstressed and followed by the personal pronoun.’ Palmer treats the form are similarly, but he recognises just two variants with slightly different distributions, giving special rules for the full form and an elsewhere statement (in all other cases) for the weak form (Palmer 1924: 101).

This early discussion is still relevant today. Palmer deserves credit for being the first to notice that AC is sometimes prevented by the auxiliary’s syntactic position, and we should note that his analysis also invokes prosody: what he calls a ‘breath group.’ The challenge for the modern analyst is in two parts: to arrive at a more precise formulation consistent with modern formal theories of grammar and to figure out the precise nature of the interactions, if any, between syntax and prosody.

Returning to the present paper, our main concern is to offer a new explanation for the syntactic restrictions on AC, which Palmer attributed to being at ‘the end of a sentence or of a more or less complete word group’ – in other words, on the auxiliary’s rightward environment. But having presented this explanation, we also discuss more general theoretical issues, namely, What are the implications of our explanation for the theory of grammar? and Why might these restrictions have developed as part of English grammar?

2. Previous explanations

More recent work on AC dates from Labov’s sociolinguistic analysis of the verb BE in the speech of black and white Americans, in which he showed that the only environments in which black speakers deleted BE were those where white speakers contracted it (Labov 1969). Presumably unaware of Palmer’s work, he commented, ‘To the best of my knowledge, the rules for SE [Standard English] contraction have never been explored in print in any detail’ (Labov 1969: 721), but he arrived at a similar analysis to Palmer: contraction of is or are is impossible when they are ‘in clause-final position’. His main concern was to show that the same was true for deletion of these words by black speakers, but he also offered a detailed description of this area of grammar in terms of the then-current version of transformational grammar. For present purposes, then, the most significant aspects of his research were the rediscovery of the constraint on AC and an attempt to interpret his data in terms of a formal theory of grammar.

This publication was followed just a year later by King’s much-cited squib (King 1970), which mentions Palmer but not Labov, and in which he formulated a negative generalisation: that AC is blocked by any kind of immediately following syntactic gap, whether created by ellipsis or by movement, as in (3) and (4):

This restriction has been dubbed the Gap Restriction. Zwicky (1970), who published in the same journal, adopted this restriction, and since then it has been accepted in a tradition of syntactic analyses (e.g. Aoun & Lightfoot 1984; Boeckx 2000; MacKenzie 2012). It has recently been described in these terms:

The Gap Restriction has proven to be a particularly strong generalization, with no counterexamples arising in the syntax literature to date … or from corpus analysis …, and various analysts have taken it to motivate articulated models of the syntax-phonology interface … All derive strong constraints that lead us to expect the Gap Restriction to be exceptionless. (Thoms et al. 2019: 2)

However, alongside this strand of syntactic explanations there is also a strand of phonological work dating at least back to 1984, according to which AC is blocked when its output is unpronounceable (Selkirk 1984; Inkelas & Zec 1993; Anderson 2008; Anttila 2016, 2017; Bresnan 2021). The argument here is that contraction is blocked if it produces a phonological unit (a syllable or a prosodic phrase) that is too small to satisfy the demands of metrical phonology.

In view of these coexisting but contradictory approaches, it is reasonable to conclude that there is no generally agreed explanation for the badness of AC in the examples above. The following subsections present evidence against both traditions, taking their most recent presentations as examples, and introduce a new analysis that is developed more systematically in Section 3. Section 4 then draws conclusions for the general theory of grammar.

2.1. Syntactic gaps

For the syntactic tradition, we focus on the presentation of the Gap Restriction by Thoms et al. (2019), which explores the usual constraints on AC as well as a Scots dialectal variant with somewhat different limits on contraction, which we consider in Section 3.6. This analysis formulates the restriction as follows (Thoms et al. 2019: 1):

The Gap Restriction is presented as an informal description of the problematic data, which the article then goes on to explain in terms of prosody:

The core of our analysis of the gap restriction on auxiliary contraction is that contracted auxiliaries are clitics that must be prosodically incorporated into a prosodic host in their immediate context. If the clitic is incorporated rightward and its host is subsequently deleted, the clitic will be stranded, leading to ungrammaticality. … the gap restriction arises when prosodic incorporation groups a clitic auxiliary with a host that is subsequently deleted, stranding the auxiliary. (Thoms et al. 2019: 22–23)

But, however it may be explained, the Gap Restriction is accepted as an accurate description of the data. We discuss the challenges faced by the prosodic account in the next subsection.

We have five concerns about the Gap Restriction. The first is that ‘an ellipsis site or a gap created by movement’ is not a natural class but a disjunction of different categories. Ellipsis can no doubt be handled in many different ways within the Minimalist Program, but one widely cited and influential approach treats it in terms of a syntactic feature [E], which prevents sister material from being realised phonologically (Merchant 2015). In contrast, movement is generally assumed in the Minimalist Program to leave a bracketed copy of the moved material in the original site – a very different kind of category from that assumed in ellipsis. Although movement and ellipsis both lead to an expected element being inaudible, there are clearly formal differences between the two phenomena.

Our analysis is able to avoid this disjunction by changing perspective: Instead of looking at the cases where contraction is not permitted, we look at those where it is licensed. When seen in this way, the facts turn out to be simpler, and we offer an analysis without disjunction, which explains why some contractions are grammatical while others are not.

Our second concern with the Gap Restriction is related: What is the restriction’s status in a cognitive model of sentence processing? Take the simple example That’s true. The restriction does not block this sentence, so it has no impact on the sentence’s structure, nor does it add any further information about the auxiliary’s informal style. This means that at the point in incremental parsing where the hearer has heard just That’s and is processing the verb, the mental structure is exactly the same whether or not the verb is contracted. In particular, nothing tells the hearer that the contracted verb guarantees a following complement. In contrast, our analysis builds this guarantee into the mental structure, and we build on it in a possible functional explanation for contraction.

Our third reason for doubting the Gap Restriction is that, even if it worked well in the early version of transformational grammar that was available when it was first formulated, it may no longer be compatible with assumptions of the Minimalist Program. In particular, if the subject in a declarative sentence is always raised by A-movement across an auxiliary verb, the Gap Restriction needs to be formulated so that it includes the disjunction of gaps left by wh-movement and the absence of elided material, while ignoring the traces of A-movement. This needs to be worked out given that most versions of minimalism have adopted some version of the verb phrase (VP)-internal subject hypothesis (Larson 1988; Koopman & Sportiche 1991). We give a typical representation in (6) containing a trace t just after the auxiliary (Chomsky 1995: 283):

If traces cannot be systematically excluded from the class of ‘gaps’, then every auxiliary with a complement verb will be followed by a gap and will therefore wrongly be prevented by the Gap Restriction from being contracted.

Our fourth concern is that syntactic gaps seem to be irrelevant to contraction at an even more crucial point: between the contracted auxiliary and its preceding host word. The following examples (Bresnan 2021: 128) illustrate the point, with an underscore to show the position of the gap (where the subject of the auxiliary is missing):

Given that contraction seems able to tolerate a gap just before the auxiliary, the Gap Restriction as presented by Thoms et al. (2019) opens up a research problem, rather than being an explanation: If it is the correct generalisation, why should a following gap block contraction when an immediately preceding gap does not?

And finally, as noted by others (Inkelas & Zec 1993; Bresnan 2021: 119), the Gap Restriction faces apparent counterexamples such as the extraction in (10):

The assumption behind this example is that there is a syntactic gap, due to the movement of how much, at the point marked by the underscore; and the significance of the example is that this gap, located immediately after the auxiliary, does not in fact block contraction. Our explanation for this unexpected result is that left in the tank is, in fact, a complement of is, so our analysis predicts the grammaticality of AC in this example.

Our conclusion, therefore, is that, although the Gap Restriction improves on Palmer’s informal syntactic account of the limits of contraction, it is nevertheless less well founded and less explanatory than has been claimed. We would like to point to two general weak points in the Gap Restriction: that it focuses on cases where AC is not possible and offers a general account of these cases and that it takes AC as the dependent variable, with the following context as the independent variable. As explained below in Section 3.3, our analysis reverses both of these choices by focusing on cases where AC is possible and by treating AC as the independent variable.

2.2. Prosodic phrasing

Alongside the syntactic tradition for explaining the limits of contraction lies a phonological one that focuses on prosody. The most recent manifestation of this tradition summarises the analysis as follows:

The asyllabic forms of contracted tensed auxiliaries share metrical constraints on their right contexts with the unstressed syllabic forms of the same auxiliaries. This relation is what Selkirk (1984: 405) describes as ‘the central generalization’ about auxiliary contraction: ‘only auxiliaries that would be realized as stressless in their surface context may appear in contracted form’. (Bresnan 2021: 117)

This analysis of course presupposes an account of how syntactic structure constrains metrical structure in such a way that some auxiliaries can be stressless while others cannot. The following quotation presents the core of Bresnan’s descriptive claims about the role of prosody in AC (Bresnan 2021: 118):

The right context of both syllabic and asyllabic reduced auxiliaries requires that the auxiliary be followed by a stressed word, as [2]a,b illustrate.

The stressed word need not be adjacent to the auxiliary. In line with Labov’s (1969) observations as well as the corpus evidence of MacKenzie (2012: 79–82), is reduces and contracts before the nonadjacent stressed verb doing in [3]a, but not before unstressed it alone.

Stressed constituents falling outside of the complement phrase of the auxiliaries do not support contraction (Labov 1969). In [4], for example, Inkelas & Zec (1993: 234) analyze the temporal adverbs as outside of the complement phrase of the reduced or contracted is.

This is the phonological explanation that we consider next, followed by a brief discussion of an earlier analysis by Stephen Anderson.

Perhaps the most persuasive part of this account is the discussion of following pronouns, which goes back at least to 1970:

A case in which stress restrictions are clearly operative is the distinction between How is (how’s) the weather in Boston? and How is (*how’s) it in Boston? where is in the second sentence receives greater stress because of the stresslessness of pronouns like it. (Zwicky 1970: 335)

We agree that pronouns trigger extra restrictions on AC and that these restrictions have a phonological component, such as a requirement that after AC a pronoun needs phonological prominence, as in (11).

Unfortunately, we do not have a full explanation of the constraints on pronouns after AC, but they seem to be orthogonal to the other constraints. For one thing, the presence of an adjunct after the pronoun sometimes makes a difference, as in (12).

We find that the addition of now improves the sentence considerably, but the Gap Restriction, like our constraint explained below, specifically excludes adjuncts (such as now) from any relation with AC. And for another, the effect of the pronoun varies with the information structure; for example, AC in WH-interrogatives similar to Zwicky’s (1970: 335) is much better if the subject it is anaphoric, as in (13), rather than referring to the weather, as in Zwicky’s example How is (*how’s) it in Boston?

We conclude, therefore, that Zwicky’s constraint is real but does not show that the constraints on AC are phonological.

We also accept the important role of prosody in deciding which auxiliaries can be contracted, as expressed clearly in Selkirk’s ‘central generalisation’ quoted above. The literature that argues for a phonological explanation of AC notes that there are constructions where contraction is not possible because the auxiliary has to carry stress. Bresnan reviews a number of such cases, including the attested example in (14) (Bresnan 2021: 120):

As Bresnan’s annotation shows, the stressed auxiliary is cannot be contracted. As Selkirk says, contraction is only possible where an auxiliary would otherwise have been unstressed, so the prosody certainly takes priority over any syntactic constraints.

Nevertheless, we have four concerns about Bresnan’s phonological account. The first is that the purportedly phonological analysis actually builds in an important syntactic reservation. Recall Bresnan’s (2021: 118) observation under her example [3] quoted above, ‘Stressed constituents falling outside of the complement phrase of the auxiliaries do not support contraction’ (Labov 1969). This makes it clear that the phonological explanation is at best only partly phonological, resting as it does on the syntactic complement/adjunct distinction. See examples (15) and (16).

The question for Bresnan’s analysis is whether the syntactic difference is mediated in its effect on AC by a phonological one or whether AC is directly related to the syntactic difference between a complement and an adjunct. Arguments can be made either way, but it certainly cannot be taken at face value that AC is phonological, once syntactic considerations are brought into play.

The second concern is the central role of stress in the phonological accounts. It is easy to construct examples where the complement of a contracted auxiliary has little or no stress. In (17), have can be reduced to a pronunciation with unstressed schwa, but as the complement of would it still licenses AC, despite being unstressed.

The stress data, therefore, include a measure of indeterminacy that makes them a poor basis for such an important linguistic generalisation: they do not robustly provide evidence for Bresnan’s position. Indeed, we doubt whether stress is ever as rigidly predictable as this rule implies, given the multiplicity of influences (including information distribution) that determine it (Gussenhoven 2011); in Kim’s here, information distribution allows a very unprominent pronunciation of here, which is the word that licenses AC. On the general difficulty of measuring stress, we note Zwicky’s uncertainty about the amount of stress on is: ‘But note that the stress on is in examples like I wonder how tall he is (*he’s) is not very heavy’ (Zwicky 1970: 334).

Our third concern repeats one of our concerns about the Gap Restriction, to do with the relationship between the proposed analysis and incremental parsing. According to Bresnan, the contracted auxiliary carries no more information than the full form; so after hearing that’s, the hearer’s expectations are just the same as if it had been that is. Certainly the hearer can work out that a stressed complement will follow that’s, but this is an indirect inference rather than information which is conveyed directly by the contracted form. In other words, Bresnan’s constraint only has the negative effect of banning some utterances. In contrast, our proposed analysis identifies a clear processing advantage of the limitation on AC and suggests a functional explanation.

And finally, we see theoretical objections to the idea of phonology vetoing one stylistic variant of a well-formed syntactic structure. After all, most rules of morphophonology have the opposite effect: They make sentences easier to pronounce by bringing them into line with the general phonotactics of the language. For example, rules of epenthesis, deletion, and assimilation generally seem to produce more regular phonological structures that are (therefore) easier to pronounce. So why should the morphophonology of English include a special rule which bans certain pronunciations of a particular syntactic category, in a specific syntactic context, rather than a rule that makes them easier to pronounce?

Having argued against Bresnan’s phonological analysis, we turn briefly to Stephen Anderson’s impressively elegant and simple explanation for the impossibility of AC in some examples:

… the PPhrase [Phonological Phrase] originally built over the phonetic material corresponding to the VP is now left with no phonetic content at all. I propose that this is in fact what renders a reduced auxiliary in this position unacceptable: it leads to a violation of a fundamental principle of prosodic structure to the effect that a PPhrase has to be supported by at least one PWord, which in turn has to be supported by some phonetic content. (Anderson 2008: 11)

Like Bresnan, Anderson argues that AC is blocked if the phonological structure to the right of the auxiliary is incomplete, but their definitions of incompleteness differ: the lack of a stressed word for Bresnan but complete emptiness for Anderson.

Anderson’s analysis makes no reference to stress, so it avoids our second criticism of Bresnan’s analysis. But our other objections to Bresnan’s analysis apply equally strongly to Anderson’s. Our first objection concerns the role of syntax in the supposedly phonological analysis, with the syntactic distinction between complements and adjuncts built into the prosodic structure. His structure for sentence (18) is (19) (Anderson 2008: 13):

In his analysis, the parentheses show the prosodic phrasing while square brackets show syntactic structure. But this analysis assumes that the sentence has only one possible pronunciation, rigidly following the syntactic phrase structure, shown in (20).

In contrast, we assume that rhythm and intonation are influenced not only by syntax but also by information structure, so we can imagine many different prosodic renderings of the sentence, such as those in (21)–(23).

But if flexibility is possible, Anderson’s explanation collapses, because this flexibility offers speakers a way to accommodate the effect of AC, just as it does when on Wednesday is a complement: The concert’s on Wednesday.

Another objection to Bresnan’s analysis was that it gives no clue to any processing advantages of the restriction on AC; the same is true of Anderson’s analysis, which explains the gaps in terms of a rigid mapping from syntactic to phonological structures, which simply blocks contraction as an automatic consequence, without giving hearers any guidance on what is to follow. And finally, we are uncomfortable with the idea that phonology might block a sentence which is otherwise well-formed; rather, we would expect such a block to motivate our ancestors to find a way to make the sentence pronounceable.

We also note a further weakness of Anderson’s analysis: AC should presumably be permitted in sentences like (24) since there is no empty phonological complement phrase.

There is, of course, an ellipsis site before of almonds, but this should not affect the analysis. And yet AC is in fact forbidden.

Our general conclusion, therefore, is that AC is not constrained by the phonology of what follows the auxiliary; rather, the rightwards constraint is a syntactic one, although, as we have shown, the syntactic Gap Restriction is not the correct generalisation.

3. The Following Valent Constraint

This section presents our alternative to the Gap Restriction and to the phonological explanations. It is called the FVC. Our analysis presupposes the theory of Word Grammar (WG), so we start with a brief introduction. Later subsections explain the term valent and present the FVC before discussing three related issues: the morphology of contraction, the effects of style, and the Scots Locative Discovery Constraint.

3.1. Word Grammar

The analysis below was worked out in WG (Hudson 1984, 1990, 1998, 2007, 2010; Gisborne 2010, 2020; Duran Eppler 2011; Traugott & Trousdale 2013), which has served both as a heuristic and as a formal model for constraining our analysis.

One of the fundamental tenets of WG is that a grammar, like the rest of the mind, is a network; indeed, WG was one of the first cognitive theories to make this claim explicit: ‘A language is a network of entities related by propositions’ (Hudson 1984: 1). Note too that this network is not a network of complex items such as constructions; rather, its nodes are atoms. And, as in Stratificational Grammar, the labels on nodes are not central to the analysis – they are simply crutches for the analyst to use in keeping track of the network, or in communicating the network to others (Lamb 1998). Figure 1 illustrates the main ideas of WG through the morphology of regular English nouns.

Figure 1. The morphology of plural nouns.

In words: a word has both a stem and a fif (‘fully inflected form’), which is an instance of (is-a) ‘realisation’; by default, in English a word’s stem is also its fif – i.e. by default, words are not inflected. Two classes of words are verbs and nouns, and two kinds of nouns are the lexeme DOG and the inflectional category plural. The stem of DOG is {dog}, and exceptionally, the fif of a plural consists of its stem followed by the suffix {z}.

For this article, the most important feature of WG analysis is probably the taxonomy of relations, which allows relations to be classified as needed just like the parallel taxonomy of entities. The notation for WG networks distinguishes relations (in ellipses) from entities (in rectangles) but also provides a special notation for the most important relation of all: the is-a relation (as in Pat is a linguist), which underlies classification and generalisation – a small triangle whose base rests on the supercategory and whose apex is connected by a dotted line to the subcategory or subcategories. For example, the lexeme DOG and the inflectional category ‘plural’ both is-a ‘noun’; the relations ‘fif’ (fully inflected form) and ‘stem’ both is-a ‘realisation’; and the relation between the lexeme DOG and the morpheme {dog} is-a ‘stem’. In each case, the is-a relation permits the process of default inheritance, so if A is-a B, A inherits all the properties of B except those that are overridden as exceptions.
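The default inheritance just described can be sketched in a few lines of Python. This is only an illustrative toy, not part of WG's own formalism: the `Node` class, its `inherit` method, the breadth-first search order, and the token nodes `DOG:singular` and `DOG:plural` are all assumptions introduced here. Morphemes are represented as tuples of strings, so `("dog", "z")` stands for {dog} followed by {z}.

```python
from collections import deque


class Node:
    """A node in a toy is-a network (hypothetical helper, not WG notation)."""

    def __init__(self, name, *supers, **props):
        self.name = name
        self.supers = supers  # the categories this node is-a
        self.props = props    # relation name -> value, or rule taking a node

    def inherit(self, relation):
        """Return the nearest value for `relation`, searching is-a links
        breadth-first so that closer categories override more general ones."""
        queue = deque([self])
        seen = set()
        while queue:
            node = queue.popleft()
            if id(node) in seen:
                continue
            seen.add(id(node))
            if relation in node.props:
                value = node.props[relation]
                # Rules are evaluated for the inheriting node, not the owner.
                return value(self) if callable(value) else value
            queue.extend(node.supers)
        raise KeyError(relation)


# Default: a word's fif is simply its stem (words are uninflected by default).
word = Node("word", fif=lambda w: w.inherit("stem"))
noun = Node("noun", word)
dog = Node("DOG", noun, stem=("dog",))
# Exception: the fif of a plural is its stem followed by the suffix {z}.
plural = Node("plural", noun, fif=lambda w: w.inherit("stem") + ("z",))

dog_singular = Node("DOG:singular", dog)      # is-a DOG only
dog_plural = Node("DOG:plural", dog, plural)  # is-a DOG and is-a plural

print(dog_singular.inherit("fif"))  # ('dog',)
print(dog_plural.inherit("fif"))    # ('dog', 'z')
```

The plural token inherits its stem from DOG and its fif rule from ‘plural’, which is found before the general default on ‘word’; that ordering is how the sketch models an exception overriding a default.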

3.2. Valents

The term valent relates to the notion of valency, which was introduced by Tesnière:

The verb may therefore be compared to a sort of atom, susceptible to attracting a greater or lesser number of actants, according to the number of bonds the verb has available to keep them as dependents. The number of bonds a verb has constitutes what we call the verb’s valency. (Tesnière 1959/Reference Tesnière, Osborne and Kahane2015: 239)

Every modern grammatical theory has some way of imposing general or specific restrictions on the valency of a verb; the advantage of the category valent is that it covers all the dependents which are restricted in this way, including the subject, in contrast with adjuncts. The notion of verb valency has stimulated a great deal of productive research on valency grammar s (Ágel Reference Ágel2000; Thielemann & Welke Reference Thielemann and Welke2001; Herbst, et al. Reference Herbst, Heath, Roe and Götz2004; Allerton Reference Allerton and Brown2006; Ágel & Fischer Reference Ágel, Fischer, Heine and Narrog2015).

What Tesnière does not recognise in the quotation is that valency applies not only to verbs but also across the lexicon; for example, we can distinguish between intransitive and transitive instances of many prepositions – compare he looked up and he looked up the chimney. The same observation underscores Emonds’ influential analysis of since, where the difference between the adverb, the preposition, and the subordinating conjunction is just one of valency, so not a word-class distinction: I haven’t seen him since/since the party/since he got married (Emonds Reference Emonds1970; Huddleston & Pullum Reference Huddleston and Pullum2002: 599–601).

A word’s valents, then, are its subject and complement(s), so valent, contrasting with adjunct, is a familiar concept within modern grammatical theory. To be clear, however, valent is not the same as argument (Williams 2015), because it includes both predicatives and the word not, as in (25)–(28).

None of the underlined words would normally qualify as arguments, but they are valents. Since Pullum and Wilson (1977), the underlined words in (25)–(27) have been treated as complements of the verb, and (for reasons given below) we take not to be a complement too. Given that auxiliary verbs also have subjects by the standard diagnostics of subject-verb agreement, the minimal valency of an auxiliary verb is subject and complement.

The main evidence that not is a complement is that negation by a following not is one of the well-known NICE (Negative, Interrogative, Code, Emphasis) properties that define the class of auxiliary verbs (Huddleston 1976). In the mnemonic, C stands for code (the possibility of carrying the meaning of an elided complement), which is not in fact limited to auxiliaries; in contrast, contraction is only found in auxiliaries, so it is tempting to reinterpret the C of NICE as contraction. Since the auxiliary verb licenses a following not, the latter must be among the complements of auxiliary verbs, in contrast with other negative words such as never, which are not licensed by an auxiliary but function as adjuncts – hence, the contrast between the pairs in (29) and (30).

Moreover, despite the name auxiliary verb, two members of this class (BE and HAVE) allow other valents than a verb complement. Most obviously, BE allows a wide range of complements:

All of these complement types, not just the complement verbs, count as valents and therefore allow contraction.

HAVE is more complicated because there are major dialect differences in the contractability of possessive HAVE. Two uses of HAVE need to be distinguished:

Everyone agrees that contraction is possible in the perfect use (She’s finished), but there are dialect differences regarding the possessive use (%She’s brown hair). Anderson (2008: 3) denies that contraction is possible in examples like (37); however, it turns out that contracted possessive HAVE is found in American English, though it is considerably more frequent in British English, with ’ve occurring 7 times more frequently in Britain than in the USA (Algeo 2006: 19–20). Within the United Kingdom, it is claimed that contracted uses of possessive HAVE tend to be geographically restricted to Scotland, Northern Ireland, and the North of England (Trudgill 1978: 15; Robinson 2021). Our observation, therefore, is that, in certain varieties, a direct object following a possessive have allows AC, as in (37).

The term valent, then, includes a verb’s subject and its complement(s), where the latter covers not only the expected non-finite verb but also a wide range of other complement types including not only all the complements of BE for all speakers of English but even the direct object in those varieties that contract possessive HAVE. An auxiliary verb’s valents also include the word not, which we classify as a complement (in recognition of its use with auxiliary verbs in realising negation). Consequently, the term auxiliary contraction actually applies to a wider range of structures than the traditional auxiliary verbs combined with a following verb.

But behind all this diversity, the FVC (unlike the gap constraint) relates to a natural class, the relational category valent. This is defined by a single property: being licensed by the head. This is true of subjects as well as complements, but we have seen that it also includes not (but not never); so valent is a homogeneous category of relation types and contrasts with adjuncts, whose defining feature is that they are not licensed by the head.

The WG analysis of valents can be seen in Figure 2, where the valent/adjunct contrast cuts across an equally fundamental contrast between ‘pre-dependent’ and ‘post-dependent’. The latter contrast may only be found in the grammars of languages where some dependents must precede the head while others must follow it – what might be called ‘head-medial’ languages (Hudson 2021). The relations ‘object’ and ‘predicative’ are meant as placeholders for a much longer and more complete list of complement types.

Figure 2. A taxonomy of grammatical functions.

3.3. The rule

We are now ready to present the FVC generalisation:

To see how the FVC explains why contraction is sometimes not possible, consider the following:

In (39), contraction triggers FVC, so the auxiliary’s valency includes a following valent; and just such a valent appears as the word ready, so the sentence is well-formed. In (40), however, if contraction applies to the second is, the same valency expectation applies to contracted is but it is not fulfilled because too is an adjunct, not a valent; so the sentence fails.
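The licensing logic just described can be sketched as a simple check. The sketch below is an illustration under stated assumptions, not the authors' formalisation: the `Dependent` record and the function name are hypothetical, the dependency analysis is supplied by hand, and the sentences are Kim is coming and Kim is from the abstract plus a schematic case in which only the adjunct too follows the auxiliary.

```python
from dataclasses import dataclass

# Hypothetical record for one dependent of the auxiliary (names are assumptions).
@dataclass
class Dependent:
    form: str
    relation: str        # "subject", "complement" or "adjunct"
    follows_head: bool   # True if the dependent stands after the auxiliary
    overt: bool = True   # False for an elided or reconstructed dependent

VALENT_RELATIONS = {"subject", "complement"}   # valents are all non-adjuncts

def fvc_satisfied(dependents):
    """A contracted auxiliary needs at least one overt following valent."""
    return any(
        d.relation in VALENT_RELATIONS and d.follows_head and d.overt
        for d in dependents
    )

kim_is_coming = [Dependent("Kim", "subject", False),
                 Dependent("coming", "complement", True)]
kim_is = [Dependent("Kim", "subject", False)]
adjunct_only = [Dependent("Kim", "subject", False),
                Dependent("too", "adjunct", True)]

print(fvc_satisfied(kim_is_coming))   # True: contraction licensed
print(fvc_satisfied(kim_is))          # False: no following valent
print(fvc_satisfied(adjunct_only))    # False: too is an adjunct, not a valent
```

Note that the check is positive: it asks whether a following valent is present rather than whether a gap is adjacent to the auxiliary.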

The FVC accounts for almost all the examples from the literature (with exceptions which we discuss below) such as the following (Zwicky 1970: 333):

In all these examples, the auxiliary is followed both by its subject and also by its complement, so the promise of FVC is fulfilled and contraction is possible. It is also fulfilled in (44) and (45), which Zwicky describes as ‘an unsolved problem’ (Zwicky 1970: 335).

Both these examples allow contraction because the subject – a valent – follows the auxiliary.

The key examples are (15) and (16), repeated here as (46) and (47):

These are key because they provide a minimal pair for the contrast between a valent in (46) and an adjunct in (47). The FVC predicts the grammaticality of the former and the impossibility of the latter, but (in contrast with Bresnan’s analysis) does so without requiring the valent in the former to be stressed.

Our analysis also predicts the judgments that Bresnan (2021: 118–119) reports for examples of comparative subdeletion and pseudo-gapping, as well as those for sentences with dummy there. As Bresnan argues, AC is in fact possible with comparative subdeletion as in (48) and the attested (49).

AC is predicted to be possible in these cases by the FVC, though not by the Gap Constraint, because the following complement is overt, even if it contains a gap: an _ archeologist or a _ biker. The underscore marks the position where we may assume a missing adjective with which better is compared.

Pseudo-gapping, on the other hand, does not allow AC. Bresnan’s only example is (50).

Here is has no overt complement but instead we reconstruct playing, with blackjack as its complement. Accordingly, AC is not possible – as predicted both by the Gap Constraint and by the FVC.

Another interesting pattern noted by Bresnan, following Inkelas & Zec (1993), involves the expletive there as in (51):

As Bresnan points out, although there is a gap immediately after the auxiliary, the auxiliary can be contracted. We predict the acceptability of contraction in examples such as (51) because, on the assumption that left in the tank is a complement of is, contraction is in fact licensed by the FVC.

In contrast, the FVC explains why none of the examples in (52)–(55) allow contraction (Zwicky 1970: 334):

Contraction is impossible in these examples because the promised following valent is missing. Example (54) is particularly interesting because the valent can easily be reconstructed and indeed has to be reconstructed in order for the listener to understand the sentence; yet, the promise of FVC is unsatisfied: The following valent must be overt, and an overt dependent of this valent (such as of almonds in (54)) will not do instead.

One commentator has pointed out to us the apparent challenge of examples like (56) and (57).

However, we think such examples are compatible with our rule. In (56), contraction is impossible, even though the tensed auxiliary has an overt complement (been), but what actually prevents contraction is the focus on polarity, which requires a strong form of the auxiliary to contrast with the opposite polarity. For (57), contraction is surely possible if the focus is on I (contrasting with the first I uttered by the other speaker), especially if too is included; but once again, it would be blocked by a focus on the auxiliary itself.

If the FVC does offer processing advantages, as we argue in Section 4.4, then we might expect to find contraction controlled by FVC outside the auxiliary-verb system. We do find it in the case of possessive determiner/pronouns, which show a similar link between allomorphy (e.g. my versus mine) and valency.

As with auxiliary verbs, each determiner/pronoun has two forms, one longer than the other: my/mine, your/yours, her/hers, our/ours, and their/theirs; and the same is also true of another determiner/pronoun pair, no/none. In each case, the shorter form is traditionally called a determiner, because it combines with a common noun, while the longer form is called a pronoun; but this class distinction can be replaced by a valency distinction: Assuming the determiner phrase (DP) analysis, the shorter form has an obligatory complement, just as with auxiliary verbs, in contrast with the longer form’s impossible complement. This alternation is different in some ways from that found in AC: For possessives, the alternation is between impossible and obligatory, in contrast with the optional/obligatory choice for auxiliaries; and there is no stylistic difference between the two forms of a possessive. Nevertheless, the similarities are sufficiently striking to confirm the possibility that both alternations offer the same processing advantages.
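The valency-based view of the possessive alternation can be sketched as a lookup in which the choice of form depends only on whether a complement is present. The function name and the boolean flag are hypothetical; the form pairs are those listed in the text.

```python
# Short form (obligatory complement) -> long form (impossible complement).
POSSESSIVE_FORMS = {
    "my": "mine", "your": "yours", "her": "hers",
    "our": "ours", "their": "theirs", "no": "none",
}

def possessive_form(short_form, has_complement):
    """Pick the allomorph from valency alone (illustrative assumption)."""
    return short_form if has_complement else POSSESSIVE_FORMS[short_form]

print(possessive_form("my", True))    # my
print(possessive_form("my", False))   # mine
print(possessive_form("no", False))   # none
```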

Returning to the FVC, then, it is different from previous explanations in two ways. First, our analysis focuses on the positive effects of AC (and their syntactic consequences) rather than on the cases where it is not possible. Secondly, the omissibility of a following valent is dependent on the contraction of the auxiliary rather than vice versa. But of course, all the usual constraints still apply as well both to the auxiliary and to its following valent. For the auxiliary, contraction is blocked by stress on the auxiliary:

And on the following valent, the usual rules for comparatives force omission of the following valent in (61), thereby also requiring a full form for the auxiliary.

In short, we are suggesting that there is a complex interaction between two different variables, AC and complement omission, each controlled by different and potentially conflicting factors. AC may be blocked by stress on the auxiliary, while complement omission may be controlled by the syntax of comparatives. In cases of conflict, formality always takes second place (with the possible exception of the Scots case, discussed in Section 3.6).

Figure 3 presents the WG analysis of the FVC. In words, a tensed auxiliary may be contracted; and a word may have a valent (shown by the small empty square at the top). But a contracted auxiliary has a particular kind of valent: one that stands after the verb (i.e. ‘after’ its ‘landmark’) and that has to be overt. The overtness is shown by the ‘1’ in the lower square box, linked to ‘contracted’ by the relation ‘#’ for ‘quantity’. In other words, one valent of a contracted auxiliary both follows the auxiliary and also is obligatory.

Figure 3. The Following Valent Constraint.

3.4. Contraction in morphology

Having examined the syntax of contracted auxiliaries, we turn to their morphology. We argue here that we should adopt an analysis in which the syntactic structure is almost the same whether the auxiliary is contracted or not (apart from the classification of the auxiliary as ‘contracted’). The distinction of status lies in the morphology, where contracted auxiliaries are realised by an affix or even just by part of a stem morpheme. For example, pairs like He is here and He’s here have almost the same syntactic structure, but (for reasons presented below) he’s could be morphologically analysed either as {{he}{Z}} or as {hez}.

As Bresnan and others have argued, some host-clitic pairs must be stored as such in the speaker’s memory. The clearest examples have a pronoun as host: you’re, they’re, and a few others; we can be sure that these pairs are stored as a single lexical item because their pronunciation cannot be predicted by general rule and because the unpredictable pronunciation has a vowel that is different from that of the host as well as the clitic. For example, you’re can be pronounced to rhyme with your as /jɔ:/ and a plausible morphological analysis recognises {your} not only as the realisation of your but also as a fused realisation of two words, you and are. There are well-known precedents for morphological fusion, such as the fused realisation /o:/ (realising {o}) for à le (‘to the’) in French; and there are other cases in English, such as wanna for want to (Rosta 1997; Hudson 2007: 100–104).

On the other hand, you’re may also be pronounced /ju:ə/, where you has its normal pronunciation. This pronunciation is always possible for contracted you’re, whereas the shorter pronunciation /jɔ:/ is only possible when you is the subject of are. This is illustrated by the contrast between example (62) and example (63).

The easiest explanation for these facts is that contracted auxiliaries may have two morphological realisations: a clitic or a fused form. The available forms vary from verb to verb. A few auxiliaries have no contracted form: may, might, and ought. The rest all have a clitic form, such as {’re} pronounced /ə/ for are, which can be used after any host word, regardless of the syntactic structure. But a large collection of subject-auxiliary pairs have a fused form which is stored, such as {you’re} and {I’m}. In some cases, the clitics and fused forms have distinct pronunciations, but we would hypothesise, on usage-based grounds (Hudson 2010), that speakers memorise a lot of frequently cooccurring pairs, even if they are phonetically the same as unstored host-clitic pairs.

Fused realisations are generally restricted to pairs where the host is followed by the clitic, but there is one case where the clitic precedes the host: do you, realised as /dʒə/:

However, although the order of elements is different, the two words are once again directly related as verb+subject.

The WG analysis of the fused form you’re /jɔ:/ can be found in Figure 4, where the dotted lines indicate that the relations concerned are indirect. This diagram states that if the pronoun you acts as the subject of the present tense of BE, the pronoun and verb can have a shared realisation /jɔ:/.

Figure 4. The fused /jɔ:/ for you’re.

In contrast, you’re has to be pronounced /ju:ə/, with a separate realisation for each word, in Pictures of you’re popular. This is because the syntactic structure is as shown in Figure 5, where the subject of _’re is not you but pictures. In this case, the fused realisation is not available, so we have to use the slightly longer contracted form in which the verb and you each has a separate realisation.

Figure 5. Dependency structure for ‘Pictures of you’re popular’.

In this case, the contracted auxiliary is an ordinary clitic (rather than a fused form), which in WG means that it is exceptionally realised as an affix in a host word, rather than as the usual full form. This analysis is shown in Figure 6, which says that contracted are is realised by the affix {’re} (which in turn is realised by /ə/). Other parts of the grammar (not shown here) link this affix to the preceding word as its host.

Figure 6. The clitic contracted auxiliary {’re}.
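The two realisations of you’re described in this subsection can be summarised in a small sketch. The function name and the boolean argument are hypothetical, and the syntactic fact (whether you is the subject of are) is assumed to be supplied by an analysis made elsewhere.

```python
def realisations_of_youre(you_is_subject_of_are):
    """List the pronunciations available for contracted you're (illustrative)."""
    forms = ["/ju:ə/"]            # clitic {'re} after host you: always available
    if you_is_subject_of_are:
        forms.append("/jɔ:/")     # stored fused form: needs verb+subject relation
    return forms

# A clause in which you is the subject of are (assumed analysis):
print(realisations_of_youre(True))
# "Pictures of you're popular": the subject of 're is pictures, not you.
print(realisations_of_youre(False))
```

The clitic form is insensitive to syntactic structure, whereas the fused form is conditioned on a direct dependency between the two words.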

3.5. Preposed negatives and style

The exceptions where the FVC fails include the preposed negatives in (65) and (66) (Zwicky 1970: 335).

Contraction in such examples is predicted to be possible because the auxiliary is indeed followed by an overt valent – indeed, by two such valents – yet it is actually impossible. Zwicky points out that the badness of these examples cannot be explained phonologically:

… reference to stress levels will not explain the contrast either, since a perfectly normal pronunciation of Never has Pete seen her is [nevr az piyt siynr], with a reduced and unstressed has.

We agree with Zwicky that has can be unstressed and reduced (by losing its /h/) but cannot be contracted to /z/. The Gap Restriction does not explain these examples because, even if there is a gap showing the default position of the negative, this gap is not next to the auxiliary so contraction should be possible, just as it is in What’s Pete seen _?

Given that the Gap Restriction, the FVC, and phonological arguments, such as those of Bresnan (2021), all fail to account for data such as (65) and (66), we have to look elsewhere for an answer. As has often been noted (and as we show below), AC is sensitive to register and style. We see a potential explanation in terms of style level: The preposed-negative construction is hardly ever found in casual speech, so its style level clashes with that of AC. We now develop this sociolinguistic explanation.

AC is much more frequent in some styles than in others. Figure 7, which we built on the basis of corpus figures presented as graphs (Biber et al. 1999: 1129), shows that the probability of contraction is influenced both by register, with the highest probabilities in conversation, and by the verb concerned.

Figure 7. Contractions as percentage of four auxiliaries in four registers.

A preposed negative, on the other hand, generally stands at the other end of the style scale. It triggers subject-auxiliary inversion and ‘has a rhetorical effect and is virtually restricted to writing’ (Biber et al. 1999: 915). This may explain the typical judgements that reject AC in Zwicky’s examples: The contracted auxiliary and the preposed negative with nobody and never are associated with diametrically opposed styles. A usage-based theory, such as the one we adopt (Hudson 2010: 80–83, 205–209), is at its heart a theory of how users acquire language and induce grammars from the tokens of linguistic experience in their environment. Linguistic experience includes contextual information that can be grammaticised – see for example the tu/vous distinction in French and related distinctions in other languages. Given this, we expect there to be a grammaticality gap brought about by the clash in style between the casual spoken style of AC and the formal written style of negative preposing.

However, as Biber et al. (1999: 915–916) show, there are three negatives that avoid this conflict: nor, neither, and no way, which share their syntax with the positive word so. Unlike the other preposed negatives, these are common in ordinary casual conversation. Given their status as markers of solidarity and style, and given that they are usual in speech and not restricted to writing, we should expect them to allow AC, and we believe that this is indeed the case:

The examples in (67)–(70) show that, as predicted, nor, neither, and no way permit AC, as does so.

A similar style conflict seems to arise with other constructions such as those in (71)–(73):

The examples illustrate Right Node Raising and two special cases of subject-auxiliary inversion (discussed further in the next subsection), all of which belong to a high register; for example, the use of than immediately before the auxiliary is ‘restricted to formal writing’ (Biber et al. 1999: 918). In each case, the style of this construction conflicts with the casual style of AC.

Our conclusion, therefore, is that a full grammatical analysis of these constructions must include their stylistic level and an account of whether they typically belong in speech or writing and that, if this is available, it will explain that some sentences are ‘ungrammatical’ simply because of a clash between features to do with style and register. Providing a more detailed analysis is beyond the scope of this article.

3.6. The Scots locative discovery expression

The discussion so far has assumed that AC is possible only if the auxiliary is followed by a valent, but this is not the case for all varieties of English.¹ Thoms et al. (2019: 2) report a construction in Scots which they call the locative discovery expression (LDE) and which is exemplified in (74) and (75):

Such examples are significant because, contrary to the FVC, the contracted auxiliary (_’s) has no following valent.

The construction concerned combines a number of special properties:

• The verb is the present tense of BE.
• The subject is a personal pronoun (any person or number).
• The first word is either here or there.
• The construction is only used to announce the discovery of something that was being sought.

It is in recognition of the last property that Thoms et al. call this construction the LDE.

LDE is a special subtype of a more general construction which is sometimes called subject-dependent inversion (Huddleston & Pullum 2002: 1385–1390), which has locative inversion (Bresnan 1994) as a further subtype. Locative inversion divides further into two stylistically distinct varieties: In high style (typically literary writing), many different verbs may be preceded by any locative and followed by any subject, as in (76).

The stylistically low variety, which is much more common in ordinary spoken English, is far more constrained in its grammar: The verb is restricted to BE, COME, or GO and the locative can only be here or there. The verb is only ever in the simple present tense, even for COME and GO, which would normally be present progressives (so we find there goes rather than *there is going).

In contrast, (78) shows the tense expected under normal word order.

And lastly, as in the LDE subtype, the utterance announces the ‘discovery’ (loosely interpreted) of the subject.

Crucially, all varieties of subject-dependent inversion impose a constraint on inversion which is part of the structure of the LDE: Inversion is blocked if the subject is a personal pronoun; (79)–(82) are typical examples (for speakers without the Scots LDE):

The Scots LDE pattern reported by Thoms and colleagues is therefore a particular subcase of the subtype of locative inversion found in informal registers, shown in (83).

Such examples are part of the grammar of the variety of Scots that has the LDE (Gary Thoms, personal communication, 2023). This being so, the only peculiarity of this variety of English is that the possibility of contraction is extended to cases where the subject precedes the verb because it is a pronoun, as in (80), leaving the auxiliary without a following valent.

Speakers who allow LDE with AC clearly have different grammars from speakers of other varieties of English or Scots, but the difference is very small. Thoms and colleagues analyse the LDE in terms of a theoretical innovation that they call the ‘specialized mirative complementizer, C_MIR, which is typically null’ and ‘null locative proform, PRO_loc’ (Thoms et al. 2019: 16, 21). We do not see that this area of grammar requires such an innovation. In contrast, we take the view that the only difference between varieties with contraction in the LDE and those without is that the contraction varieties exceptionally permit AC without a following valent in this construction only. In our analysis, the LDE inherits its illocutionary force from the more general construction of which it is an instance (the low-style version of subject-dependent inversion). We explain the lack of a following valent by means of an elementary variable called ‘quantity’, which is needed elsewhere and which distinguishes the obligatory from the optional and the impossible. We elaborate on this and explain it further in Section 4.3.

Our solution also presupposes a theoretical framework in which general rules coexist with exceptions in a way that is familiar both in phonology (where elsewhere expresses the general rule) and in constraint-based and cognitive theories that work with default inheritance. We develop this point further in Section 4.3. This solution locates all the properties of LDEs on the verb – a special subtype of the verb BE called BE_lde. The verb BE_lde has all the default properties of BE plus four special properties: having a front-shifted here or there, being in the present tense, having a subject which is delayed unless it is a pronoun, and – exceptionally – allowing AC even in the absence of a following valent. This final property is the only difference between AC in the LDE and in other contexts. In other words, the FVC is inherited by BE_lde but in a slightly modified form, which makes the following valent optional rather than obligatory – optional rather than impossible because of attested examples like (84), where there is a following overt valent (Thoms et al. 2019: 8).

This exceptional feature is proportionate to the rather minor difference between the dialects that allow AC in LDE and those that do not.

This small exception can be seen in Figure 8. At the heart of this diagram is the FVC, which requires a valent of a contracted auxiliary to have a quantity (#) of 1 – i.e. to be overt. The exception is stored in relation to LDEpro (LDE with a pronoun as subject), which is defined in the next diagram. The exceptional feature is the ‘valent’ link from contracted LDEpro, which turns the default quantity 1 into ‘_’, thereby making the following valent optional.

Figure 8. The exceptionality of the Scots LDE.
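The override just described can be sketched with the quantity values named in the text (‘1’ for obligatory, ‘_’ for optional). Everything else below is an assumption made for illustration: the dictionary encoding, the function names, and the intermediate category ‘contracted BE’, which is included only to show plain inheritance of the default.

```python
# Quantity of the following valent, stated only where it is not inherited.
QUANTITY = {
    "contracted auxiliary": "1",   # the FVC default: following valent obligatory
    "contracted LDEpro": "_",      # the Scots exception: following valent optional
}
# is-a links (the middle category is a hypothetical illustration).
ISA = {
    "contracted BE": "contracted auxiliary",
    "contracted LDEpro": "contracted BE",
}

def following_valent_quantity(category):
    """Inherit the nearest stated quantity up the is-a chain."""
    while category not in QUANTITY:
        category = ISA[category]
    return QUANTITY[category]

def contraction_ok(category, has_overt_following_valent):
    """Contraction fails only if an obligatory following valent is missing."""
    quantity = following_valent_quantity(category)
    return has_overt_following_valent or quantity == "_"

print(following_valent_quantity("contracted BE"))     # inherits the default
print(contraction_ok("contracted BE", False))         # FVC violated
print(contraction_ok("contracted LDEpro", False))     # exception applies
```

Because the exception changes only one stored value, the grammars with and without contraction in the LDE differ minimally, as the text argues.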

Finally, we have to define the category LDEpro in relation to the other constructions discussed above. The relations are laid out in Figure 9, which is basically a partial taxonomy of verbs. Working down from the top of the diagram, we have

• By default, a verb’s subject stands before it and is some kind of noun (which here subsumes personal pronouns); e.g. Her house stands glowing with warmth.
• A tensed verb may be an example of Subject-dependent inversion, in which case its subject follows it and what precedes is some other dependent; e.g. Glowing with warmth stands her house. (Not shown in the diagram: This is a literary construction.)
• If the verb is more specifically an example of Locative inversion, the preceding element is a locative expression; e.g. In the corner of the field stands her house. (This too is a literary construction.)
• If the verb is more specifically still one of BE, COME, and GO, then the above applies but the locative expression is either HERE or THERE, the style shifts radically to everyday colloquial, and the pragmatics are as described for an LDE; e.g. There goes our bus, Here’s your change.
• If the subject of any example of subject-dependent inversion is a personal pronoun, it reverts to the default position before the verb. This is also true of LDEs, giving the verb type labelled LDEpro; e.g. Here you are, Here it comes.

Figure 9. The locative discovery expression.
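The partial taxonomy listed above can be approximated with ordinary class inheritance, where a subclass attribute overrides an inherited default. This is a rough analogy rather than the WG network itself: the class and attribute names are hypothetical, and only the properties mentioned in the bullet points are encoded.

```python
# Rough analogy only: Python class inheritance standing in for default inheritance.
class Verb:
    subject_position = "before"       # default: the subject precedes the verb
    fronted = None                    # no special fronted dependent

class SubjectDependentInversion(Verb):
    subject_position = "after"        # overrides the default
    fronted = "some other dependent"

class LocativeInversion(SubjectDependentInversion):
    fronted = "locative expression"

class LDE(LocativeInversion):
    lexemes = ("BE", "COME", "GO")
    fronted = "here/there"

class LDEpro(LDE):
    subject_position = "before"       # pronoun subject reverts to the default

for cls in (Verb, SubjectDependentInversion, LocativeInversion, LDE, LDEpro):
    print(cls.__name__, cls.subject_position, cls.fronted)
```

LDEpro restates only the subject position; the fronted here or there and the restriction to BE, COME, and GO are inherited from LDE.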

4. Theoretical issues

Section 3 suggested an explanation (FVC) for the rather strict limits to the possibility of AC; the present section highlights four theoretical issues that this constraint raises and comments on the difficulties they pose for any attempt to accommodate the FVC.

4.1. Properties or restrictions?

One issue that we have raised is whether the FVC affects a sentence’s structure. The syntactic Gap Restriction does not: As a constraint on the contraction process, all it does is block some contractions as ungrammatical, and it does so without having any other effect on the structures that are permitted. The same is true of the various phonological accounts. Moreover, these restrictions only apply after the rightward context of the auxiliary has been registered, which is a problem for incremental parsing. Take an example such as (85): whether or not contraction is possible can only be confirmed after true has been encountered.

In these analyses, then, the contraction of an auxiliary does not help the hearer to parse the rest of the sentence. Of course, there could be extra rules of thumb linking AC to a following valent, but that would be an extra complication falling foul of Occam’s razor.

In contrast, the FVC enriches the auxiliary verb’s valency by requiring a following valent, so it guides the hearer’s forward planning of the sentence structure. If there is no following valent, the result is ungrammatical, but this is a side-effect of the structural change. This view of FVC as a constraint is very different from the Gap Restriction in terms of its effect on processing, because the FVC affects the representation of the verb itself and in particular affects its valency features. This means that the information associated with the FVC can affect the incremental parsing of a sentence: As soon as we hear That’s, we know that there must be a following valent.
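The processing claim can be illustrated with a left-to-right sketch in which hearing a contracted auxiliary immediately creates an expectation that is discharged by the next valent. This is a deliberate simplification under stated assumptions: the function and the word labels are hypothetical, and each word's relation to the auxiliary is supplied by hand rather than computed by a parser.

```python
def expectation_trace(words):
    """Track, word by word, whether a following valent is still expected.

    words: list of (form, kind) pairs, where kind is one of
    "contracted_aux", "valent", "adjunct" or "other" (hand-assigned labels).
    """
    pending = False
    trace = []
    for form, kind in words:
        if kind == "contracted_aux":
            pending = True            # FVC: a following valent is now predicted
        elif kind == "valent" and pending:
            pending = False           # the prediction is fulfilled
        trace.append((form, pending))
    return trace

# "That's true": the preceding subject cannot satisfy the expectation,
# because the expectation only arises once the contracted auxiliary is heard.
print(expectation_trace([("That", "valent"),
                         ("'s", "contracted_aux"),
                         ("true", "valent")]))
```

An utterance that ends while the expectation is still pending corresponds to the ungrammatical cases, which on this view are a side-effect of the unfulfilled valency.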

4.2. Grammatical relations

Central to our explanation is the general relational category valent, covering subjects and complements and, among complements, the word not and a range of other complement types from non-finite verbs to predicative complements and (for many speakers, especially in the UK) the direct object in I’ve an idea. All these elements, when following an auxiliary verb, allow the auxiliary to contract, whereas an adjunct, in contrast, does not. Our explanation therefore depends on an analysis of grammatical relations, which includes the general category valent as defined here. However, it also presupposes that these relational categories form a taxonomy – a hierarchical structure – with valent near the top and other categories, such as subject and complement, lower down. Our analysis does not force a choice between valent and complement; on the contrary, it requires these categories to co-exist in the taxonomy.

It is important to stress that there is a good deal of independent evidence for the idea of taxonomically organised function categories; it does not stand or fall with the supercategory valent. For example, it is widely accepted that direct and indirect objects fall together with various other grammatical functions to comprise the supercategory complement, so any theory that recognises this relationship between complement and its subcategories is thereby recognising that grammatical functions form a taxonomy, not a list. The necessary theory of grammatical functions has been worked out in a number of WG publications (Hudson 2007, 2010; Gisborne 2005; Rosta 2005) and the conclusions that we have presented here are drawn from that tradition.

4.3. Optionality and exceptionality

The FVC changes the default status of some valents. By default, the complement of an auxiliary is optional, but the FVC ensures that it (or some other valent) is obligatory. It has no effect on the semantics, where the valent’s meaning is always obligatory, even where the valent word itself has been elided. This clearly requires two mechanisms: one for showing whether an element is optional or obligatory and another for allowing an exception to override a default. The two mechanisms are closely linked because the first mechanism has to be compatible with the second – in particular, an optional valent must be turned, exceptionally, into an obligatory one.

In most modern theories of grammar, optionality is shown by brackets, but it is not at all clear how brackets can be removed in an exceptional case. A general rule for an auxiliary verb’s valency might look something like this:

This rule assigns an auxiliary verb an optional complement X, using whatever notation is available for showing complements. In contrast, the next rule would make the complement obligatory if the auxiliary is contracted:

However, it is not clear how these two rules will interact, if at all. Given that both rules can apply to the same verb, does (87) give this verb an extra complement, on top of the one licensed by (86), or do both rules refer to the same complement? The problem is serious and raises deep questions for the theory of language structure. It concerns the basic logic of exceptionality: How do exceptions override the default?

Where a rule has a context, it is generally accepted that the elsewhere condition applies: A narrower context takes priority over a broader one (Kiparsky 1982), and this approach would indeed help if the auxiliary was subject to change in a given context (as in the Gap Restriction):

However, we have argued for a different kind of analysis in which the auxiliary is the context (the independent variable) and the following valent is the affected (the dependent variable): Rather than saying that (for example) am can change to ’m in the context of a following overt valent, we are saying that the following valent is obligatory in the context of contracted ’m. It is unclear how, if at all, this can be expressed as a context-dependent rule.

One possible reaction would be to treat optional and obligatory as features that can be changed by rule:

However, the categories optional and obligatory are unlikely to appear in any formal grammar because they describe rules rather than words. Moreover, the specification of the context is so primitive that it does not distinguish valents from adjuncts, and it seems to require adjacency of the auxiliary.

Our alternative avoids rules and operations altogether and presents the facts in terms of nodes in a declarative network whose logic is default inheritance. This subsection has established that optionality and word order must be represented as properties of syntactic elements and that only then will it be possible to write a grammar in which the properties of a contracted auxiliary verb override the default properties that it would otherwise inherit.
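
The override logic described in this subsection can be illustrated informally with a minimal Python sketch. It is an analogy only, not the WG formalism: the class name Node, the property name following_overt_valent and the values 'optional' and 'obligatory' are hypothetical labels chosen for illustration. The only claim taken from the text is that a contracted auxiliary inherits from the general auxiliary category but overrides its default valency.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """A node in a toy isa-network; `props` holds locally stated properties."""
    name: str
    isa: Optional["Node"] = None
    props: dict = field(default_factory=dict)

    def get(self, prop: str):
        """Default inheritance: the nearest value up the isa chain wins."""
        node = self
        while node is not None:
            if prop in node.props:
                return node.props[prop]
            node = node.isa
        raise KeyError(prop)


# Default (hypothetical labels): the following overt valent is optional.
auxiliary = Node("auxiliary", props={"following_overt_valent": "optional"})
# Exception: the contracted auxiliary overrides the default, as the FVC requires.
contracted = Node("contracted auxiliary", isa=auxiliary,
                  props={"following_overt_valent": "obligatory"})
# Individual forms state nothing locally and simply inherit.
full_am = Node("am", isa=auxiliary)
clitic_m = Node("'m", isa=contracted)

print(full_am.get("following_overt_valent"))   # optional
print(clitic_m.get("following_overt_valent"))  # obligatory
```

Because the lookup stops at the nearest node that states a value, no rule has to 'remove brackets': the exceptional value simply sits lower in the taxonomy than the default that it overrides.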

4.4. Functional explanations

It is reasonable to wonder what motivated the historical changes that led to the AC and the FVC. It is always possible, of course, that the change was unmotivated, but possible functional explanations are available and worth considering, even if they can never be proved. What, then, may have driven first the development of AC itself and second the FVC?

Why did our ancestors start to contract auxiliaries? A promising sociolinguistic explanation for AC in modern English is that the contraction itself is a signal that the current situation is informal. However, in the early centuries (mainly since the sixteenth century), contraction was apparently random and socially irrelevant; so, the stylistic linkage seems to have come second, after contraction was well established, and may have been influenced by criticisms from prescriptive grammarians (Gailor 2011: 12). An alternative explanation, structural rather than sociolinguistic, is that AC evolved as part of the evolution of our modern auxiliary-verb category as a further way of distinguishing this new category from non-auxiliary verbs (Hudson 1997). Another explanation, this time psycholinguistic, is that it could have been motivated as a way of adapting the duration of an auxiliary to the effort and time needed to process it: Auxiliary verbs are usually rather easy to process – after all, they are very common, and both their syntax and their semantics are easily predicted; so the hearer does not generally need extra time. Of course, it is likely that such a major change had multiple motives, so all these explanations may be right, but we can only speculate about the truth.

Turning to the FVC, the search for explanations may be a little easier because we can exclude sociolinguistic motivation and concentrate on the possible advantages of the FVC for processing. The following account of processing rests on two assumptions: that processing, in particular hearing, is cognitively demanding because of limited working memory and that these cognitive demands are among the functional pressures leading to language change. In this case, the pressure is ‘expectation sensitivity’, a speaker’s sensitivity to the hearer’s expectations (Haspelmath 2023).

The first assumption is a commonplace of cognitive psychology, where it is supported by a mass of research going back at least as far as Miller’s ‘magical number 7’ (Miller 1956) and including a great deal of experimental evidence as well as theoretical analysis (Baddeley 2003; Cowan 1997; Ericsson & Kintsch 1995; Friederici et al. 1998). One particularly interesting and relevant development in cognitive psychology is the idea of a ‘Now-or-Never bottleneck’ in processing (Christiansen & Chater 2016), where the processing demands are in danger of exceeding the hearer’s working memory capacity; this danger is a clear driver for linguistic changes that adjust the grammar to fit the processing needs of the hearer. Another relevant strand of experimental work is concerned with the cognitive demands of complex and potentially ambiguous syntactic structures and how these processing demands can be minimised by linguistic changes (Futrell, Mahowald & Gibson 2015; Futrell, Levy & Gibson 2020).

Yet another relevant strand is work on how speakers control the flow of information according to the principle called Uniform Information Density (Aylett & Turk 2004; Levy & Jaeger 2007; Frank & Jaeger 2008; Jaeger 2010; Meister et al. 2021): To achieve optimum communication, a speaker should aim to keep information flowing at a roughly constant speed, without major spikes or troughs. Fortunately for us, the principle has been studied in relation to AC (Frank & Jaeger 2008), on the assumption that it encourages full forms to be used to give the hearer a little extra time for processing high-density constructions (such as missing valents), while contracted forms are suited to low-density processing. Frank & Jaeger’s statistical study of AC in a corpus measured the likelihood of complements both in terms of overall frequency and also in terms of frequency after the auxiliary, but (unlike us) they treated contraction as the dependent variable. Their hypothesis was confirmed for perfect HAVE, which was more likely to contract if the complement was more likely (and therefore carries less information) but not for BE, where contraction was slightly less likely when its complement was more probable. These results clearly do not support our claim, but they are hard to interpret because of the choice of contraction as the dependent variable.

As explained in the introduction to Section 3, we envisage FVC not as a limitation on possible structures but as a property of certain structures (the property of having an overt following valent). On this view, our hypothesis is that one effect of AC is to override (i.e. to change) the default valency of the auxiliary: By default, a following overt complement is optional, but (except in the Scots LDE) after a contracted auxiliary it is obligatory. This means that when we hear or produce a contracted auxiliary, our minds must represent the auxiliary not only as contracted but also as anticipating an obligatory overt following valent. Consequently, the speaker has to make a calculation when approaching an auxiliary verb, balancing the benefits of contraction against those of a full form: A contracted auxiliary is easier to pronounce but requires a following overt valent, whereas a full form is slightly harder to say but allows syntactic flexibility.

This calculation represents a cost for the speaker but a benefit for the hearer. The cost is the need for the speaker to have planned the next few words at least to the extent of knowing that the following valent will be overt (a cost which is slightly offset by the reduced articulation required by the contracted auxiliary). But the hearer benefits by being able to build a more accurate set of expectations; so instead of knowing (as they do when they hear a full form) that a following valent is possible, they know that it is certain.

The role of valents, rather than adjuncts, follows from these assumptions, because it is following valents, not adjuncts, that are anticipated, so it is only valents that can be made obligatory. For example, having heard I met we can anticipate a direct object (e.g. Pat) but not a time or place adjunct (e.g. yesterday, in the park). Similarly, therefore, hearing I’m allows us to predict a predicative complement (e.g. tired, working) but not an adjunct of time or reason (e.g. now, because it’s late). The processing benefits of anticipation are well known (Pickering & Garrod 2012); so, for example, if a hearer is expecting the direct object of met, they already know, before the word is heard, a great deal about that direct object and its relation to the meaning of the sentence. Similar benefits follow from anticipating an obligatorily overt predicative complement of a contracted auxiliary.

Table 1 suggests the extent of the processing advantage arising from contraction. The data are taken from the Switchboard corpus of short conversations between strangers.² They include all the 7,243 examples of present-tense forms of the auxiliary BE – i.e. all examples of am, is, or are or their contracted forms – classified according to the following syntactic context. Contrary to the normal practice in discussions of AC, we treat contraction as the independent variable, so the figures show the relative likelihood of each context after a full or contracted auxiliary. For this discussion, the crucial figures are in the column headed ‘NONE’, where there is no context – i.e. no following valent.

Table 1. Am, is, are in the Switchboard corpus.

The table shows, as expected, that there is no chance at all of the ‘NONE’ context following a contracted auxiliary, contrasting with a 13% chance after a full form. But it also shows that, although contraction has no significant effect on the probability of a following full NP or preposition phrase, it significantly raises the chances of a following participle or adjective. In other words, contraction does indeed give the hearer quite a lot of information about possible following valents: that they will certainly not be zero (a drop of 13% compared with the full auxiliary) but are more likely to be a participle or adjective (a rise of 7% and 10%). Our guess is that this extra information was at least part of the benefits that motivated the rise of AC subject to the FVC.
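
The direction of this comparison, with contraction as the independent variable, can be made concrete with a small Python sketch that estimates the probability of each following context given the form of the auxiliary. The function name context_given_form and the token list are hypothetical, and the toy tokens are invented purely to show the calculation; they are not Switchboard counts and do not reproduce the figures in Table 1.

```python
from collections import Counter, defaultdict


def context_given_form(tokens):
    """Return P(context | form) estimated from (form, context) pairs."""
    counts = defaultdict(Counter)
    for form, context in tokens:
        counts[form][context] += 1
    return {
        form: {ctx: n / sum(ctxs.values()) for ctx, n in ctxs.items()}
        for form, ctxs in counts.items()
    }


# Invented toy tokens for illustration only (NOT corpus data).
toy_tokens = [
    ("full", "NONE"), ("full", "participle"),
    ("full", "adjective"), ("full", "NP"),
    ("contracted", "participle"), ("contracted", "participle"),
    ("contracted", "adjective"), ("contracted", "NP"),
]

probs = context_given_form(toy_tokens)
print(probs["full"].get("NONE", 0.0))        # 0.25
print(probs["contracted"].get("NONE", 0.0))  # 0.0
```

Conditioning on the form, rather than on the context, is what allows the output to be read as the hearer's expectations after hearing a full or a contracted auxiliary.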

To summarise, then, contraction plays a role in processing by guiding the hearer’s expectations about the following contexts: A contracted form is absolutely certain to be followed by some kind of valent. Conversely, a full form is as likely to have no following valent as it is to be followed by a participle or a prepositional phrase. In short, the historical development of AC and the FVC may have been partly motivated by Haspelmath’s ‘expectation sensitivity’, the speaker’s desire to guide the hearer through the syntactic structure (Haspelmath 2023). Our views are also compatible with Hawkins’s argument that grammar evolves so as to be consistent with hearers’ processing needs; seen in this light, AC developed so as to meet Hawkins’s principle, ‘Maximize Online Processing’ (Hawkins 1994, 2004).

On the other hand, the FVC also has a cost for the speaker because AC has a second function: signalling informality. When the valent is missing, the speaker has to use the full form of the auxiliary, despite its associations with formal writing. It is possible to see the Scots data in Section 3.6 as an alternative resolution to this conflict that gives priority to expressing informality. In the Scots data, Here it’s does express informality but does not guarantee a following valent. On the other hand, these utterances are so common and so embedded in the total situation that they can be assumed to be very easy to process, so the hearer’s needs are less urgent than the need to show informality.

5. Conclusions

Our main conclusion is that AC is controlled by the FVC rather than by the syntactic Gap Restriction or any of the phonological explanations. The FVC treats the overt following valent as part of the valency of the contracted auxiliary, rather than as a condition for contraction. In the morphology, the contracted auxiliary is realised either as a clitic or as part of a fused realisation with the preceding word.

This analysis has theoretical consequences. One is that the grammar of English must provide the category valent, meaning a subject or a complement (as opposed to an adjunct). This is not a standard category (hence the need in this article to introduce and define a name for it) and is not easily compatible with some theoretical frameworks. Moreover, this category must be part of a taxonomy, with more specific categories, such as subject and complement, below it; this degree of structured organisation is not generally available for grammatical relations.

Furthermore, the logic of FVC is challenging because one of the verb’s ordinary valents, either subject or complement, has to be made obligatory even if by default it is optional. This seems to require a default logic which builds on the taxonomy of word classes.

Another consequence is that the FVC works by changing the valency of the auxiliary rather than simply by blocking contraction. This means that the sentence’s structure, in particular that of the auxiliary, is different depending on whether or not the auxiliary is contracted: Contraction guarantees that an overt valent will follow. This information helps the hearer by providing more certainty than without contraction.

The fourth (and last) consequence of the FVC builds on the previous one: It is worth looking for a functional explanation. Why did our linguistic ancestors develop the FVC as part of AC? We can move a little beyond speculation by building on three established facts. First, working memory is limited and can easily be outstripped by processing demands, so hearers need help from the language’s grammar. Second, contraction carries statistically significant information about the upcoming words, so this information is definitely available to hearers. And third, the distribution of full and contracted forms coincides statistically with the processing demands, on the (reasonable) assumption that, because it has to be reconstructed, a missing valent is harder to process than an overt valent: a contracted form signals that processing demands are low.

Acknowledgements

We would like to thank the reviewers for their helpful comments and recommendations. All remaining errors and infelicities are our own.

Footnotes

¹ There is a debate as to whether Scots is a language in its own right or a variety of English. The LDE is found in a variety of Scots, not uniformly throughout Scots, and from that point of view, it would be reasonable to take this case of variation as variation within the Scots language. However, the LDE is only minimally variant from the regular grammar of AC in English, so for the purposes of our grammatical analysis we put English and Scots together.

² https://catalog.ldc.upenn.edu/LDC97S62, May 2023.

References

Ágel, Vilmos. 2000. Valenztheorie. Tübingen: Narr.

Ágel, Vilmos & Fischer, Klaus. 2015. Dependency grammar and valency theory. In Heine, Bernd & Narrog, Heiko (eds.), The Oxford handbook of linguistic analysis, 2nd edn., 225–257. Oxford: Oxford University Press.

Algeo, John. 2006. British or American English? A handbook of word and grammar patterns. Cambridge: Cambridge University Press.

Allerton, David. 2006. Valency grammar. In Brown, Keith (ed.), Encyclopedia of language & linguistics, 301–314. Oxford: Elsevier.

Anderson, Stephen. 2008. English reduced auxiliaries really are simple clitics. Lingue e linguaggio 7, 1–18.

Anttila, Arto. 2016. Phonological effects on syntactic variation. Annual Review of Linguistics 2, 115–137. https://doi.org/10.1146/annurev-linguistics-011415-040845.

Anttila, Arto. 2017. Stress, phrasing, and auxiliary contraction in English. In Gribanova, Vera & Shih, Stephanie (eds.), The morphosyntax-phonology connection: Locality and directionality at the interface, 143–170. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780190210304.003.0006.

Aoun, Joseph & Lightfoot, David. 1984. Government and contraction. Linguistic Inquiry 15, 465–473.

Aylett, Matthew & Turk, Alice. 2004. The smooth signal redundancy hypothesis: A functional explanation for relationships between redundancy, prosodic prominence, and duration in spontaneous speech. Language and Speech 47, 31–56. https://doi.org/10.1177/00238309040470010201.

Baddeley, Alan. 2003. Working memory and language: An overview. Journal of Communication Disorders 36.3, 189–208.

Biber, Douglas, Johansson, Stig, Leech, Geoffrey, Conrad, Susan & Finegan, Edward. 1999. Longman grammar of spoken and written English. London: Longman.

Boeckx, Cedric. 2000. A note on contraction. Linguistic Inquiry 31.2, 357–366.

Bresnan, Joan. 1994. Locative inversion and universal grammar. Language 70, 72–131.

Bresnan, Joan. 2021. Formal grammar, usage probabilities, and auxiliary contraction. Language 97, 108–150.

Chomsky, Noam. 1995. The minimalist program. Cambridge, MA: MIT Press.

Christiansen, Morten & Chater, Nick. 2016. The Now-or-Never bottleneck: A fundamental constraint on language. Behavioral and Brain Sciences 39, e62.

Cowan, Nelson. 1997. Attention and memory: An integrated framework. New York: Oxford University Press.

Duran Eppler, Eva. 2011. Emigranto: The syntax of German-English code-switching. Vienna: Braumüller.

Emonds, Joseph. 1970. Root and structure-preserving transformations. Ph.D. dissertation, Massachusetts Institute of Technology.

Ericsson, K. Anders & Kintsch, W. 1995. Long-term working-memory. Psychological Review 102.2, 211–245.

Frank, Austin & Jaeger, T. Florian. 2008. Speaking rationally: Uniform information density as an optimal strategy for language production. Proceedings of the Annual Meeting of the Cognitive Science Society 30.

Friederici, A.D., Steinhauer, K., Mecklinger, A. & Meyer, M. 1998. Working memory constraints on syntactic ambiguity resolution as revealed by electrical brain responses. Biological Psychology 47.3, 193–221.

Futrell, Richard, Levy, Roger & Gibson, Edward. 2020. Dependency locality as an explanatory principle for word order. Language 96, 371–412.

Futrell, Richard, Mahowald, Kyle & Gibson, Edward. 2015. Large-scale evidence of dependency length minimization in 37 languages. Proceedings of the National Academy of Sciences 112.33, 10336–10341.

Gailor, Denis. 2011. Early modern English contractions and their relevance to present-day English. English Today 27, 10–16.

Gisborne, Nikolas. 2005. Factoring out the subject dependency. In Sugayama, Kensei & Hudson, Richard (eds.), Word grammar: New perspectives on a theory of language structure, 204–224. London: Continuum.

Gisborne, Nikolas. 2010. The event structure of perception verbs. Oxford: Oxford University Press.

Gisborne, Nikolas. 2020. Ten lectures on event structure in a network theory of language. Leiden: Brill.

Gussenhoven, Carlos. 2011. Sentential prominence in English. In The Blackwell companion to phonology. Vol. V: Phonology across languages, 1–29. Oxford: Blackwell.

Haspelmath, Martin. 2023. Ambiguity avoidance vs. expectation sensitivity as functional factors in language change and language structures: Beyond argument marking. Presented at the 26th International Conference on Historical Linguistics at the University of Heidelberg.

Hawkins, John. 1994. A performance theory of order and constituency. Cambridge: Cambridge University Press.

Hawkins, John. 2004. Efficiency and complexity in grammars. Oxford: Oxford University Press.

Herbst, Thomas, Heath, David, Roe, Ian & Götz, Dieter. 2004. A valency dictionary of English: A corpus-based analysis of the complementation patterns of English verbs, nouns and adjectives. Berlin: Mouton de Gruyter.

Huddleston, Rodney. 1976. Some theoretical issues in the description of the English verb. Lingua 40, 331–383.

Huddleston, Rodney & Pullum, Geoffrey. 2002. The Cambridge grammar of the English language. Cambridge: Cambridge University Press.

Hudson, Richard. 1984. Word grammar. Oxford: Blackwell.

Hudson, Richard. 1990. English word grammar. Oxford: Blackwell.

Hudson, Richard. 1997. The rise of auxiliary DO: Verb-non-raising or category-strengthening? Transactions of the Philological Society 95.1, 41–72.

Hudson, Richard. 1998. English grammar. London: Routledge.

Hudson, Richard. 2007. Language networks: The new word grammar. Oxford: Oxford University Press.

Hudson, Richard. 2010. An introduction to word grammar. Cambridge: Cambridge University Press.

Hudson, Richard. 2021. HPSG and dependency grammar. In Abeillé, Anne, Borsley, Robert, Koenig, Jean-Pierre & Müller, Stefan (eds.), Head-driven phrase structure grammar: The handbook, 1447–1495. Berlin: Language Science Press. https://langsci-press.org/catalog/book/259.

Inkelas, Sharon & Zec, Draga. 1993. Auxiliary reduction without empty categories: A prosodic account. Working Papers of the Cornell Phonetics Laboratory 8, 205–253.

Jaeger, T. Florian. 2010. Redundancy and reduction: Speakers manage syntactic information density. Cognitive Psychology 61, 23–62.

King, Harold. 1970. On blocking the rules for contraction in English. Linguistic Inquiry 1, 134–136.

Kiparsky, Paul. 1982. Lexical morphology and phonology. In Linguistic Society of Korea (ed.), Linguistics in the morning calm, 3–91. Seoul: Hanshin.

Koopman, Hilda & Sportiche, Dominique. 1991. The position of subjects. Lingua 85, 211–258.

Labov, William. 1969. Contraction, deletion, and inherent variability of the English copula. Language 45, 715–762.

Lamb, Sydney. 1998. Pathways of the brain: The neurocognitive basis of language. Amsterdam: Benjamins.

Larson, Richard. 1988. On the double-object construction. Linguistic Inquiry 19, 335–392.

Levy, Roger & Jaeger, T. Florian. 2007. Speakers optimize information density through syntactic reduction. Advances in Neural Information Processing Systems 19, 849–856. Cambridge, MA: MIT Press.

MacKenzie, Laurel. 2012. Locating variation above the phonology. Ph.D. dissertation, University of Pennsylvania.

Meister, Clara, Pimentel, Tiago, Haller, Patrick, Jaeger, Lena, Cotterell, Ryan & Levy, Roger. 2021. Revisiting the uniform information density hypothesis. arXiv e-prints. https://ui.adsabs.harvard.edu/abs/2021arXiv210911635M.

Merchant, Jason. 2015. On ineffable predicates: Bilingual Greek–English code-switching under ellipsis. Lingua 166, 199–213.

Miller, George. 1956. The magical number seven plus or minus two: Some limits on our capacity for processing information. Psychological Review 63, 81–97.

Palmer, Harold. 1924. A grammar of spoken English, on a strictly phonetic basis. Cambridge: Heffer.

Pickering, Martin & Garrod, Simon. 2012. An integrated theory of language production and comprehension. Behavioral and Brain Sciences 36, 329–347.

Pullum, Geoffrey & Wilson, Deirdre. 1977. Autonomous syntax and the analysis of auxiliaries. Language 53, 741–788.

Robinson, Jonnie. 2021. Grammatical change in the English language. https://www.bl.uk/british-accents-and-dialects/articles/grammatical-change-in-the-english-language.

Rosta, Andrew. 1997. English syntax and word grammar theory. London: University College London.

Rosta, Andrew. 2005. Structural and distributional heads. In Sugayama, Kensei & Hudson, Richard (eds.), Word grammar: New perspectives on a theory of language structure, 171–203. London: Continuum.

Selkirk, Elisabeth. 1984. Phonology and syntax. Cambridge, MA: MIT Press.

Tesnière, Lucien. 2015. Elements of structural syntax (Osborne, Timothy & Kahane, Sylvain, Trans.). Amsterdam: Benjamins. (Original work published 1959)

Thielemann, Werner & Welke, Klaus. 2001. Valenztheorie – Einsichten und Ausblicke. Münster: Nodus.

Thoms, Gary, Adger, David, Heycock, Caroline & Smith, Jennifer. 2019. Syntactic variation and auxiliary contraction: The surprising case of Scots. Language 95, 1–35.

Traugott, Elizabeth & Trousdale, Graeme. 2013. Constructionalization and constructional changes. Oxford: Oxford University Press.

Trudgill, Peter. 1978. Introduction: Sociolinguistics and sociolinguistics. In Trudgill, Peter (ed.), Sociolinguistic patterns in British English. London: Edward Arnold.

Williams, Alexander. 2015. Arguments in syntax and semantics. Cambridge: Cambridge University Press.