1 Introduction
What do we know about the past? For at least some languages, we have textual (or archaeological) evidence from various periods – beyond that, there is only reconstruction. But even when we have some textual evidence, what does it tell us? The answer to this question crucially depends on the way we approach the question: we can treat texts as decontextualized, linguistic evidence, as Neogrammarian or Structuralist studies have done (see McMahon 1994: 17–32). Such an approach already allows us to discover important generalizations about the linguistic state of affairs of a particular language or historical period. Using decontextualized historical evidence, for example, we can already ascertain with a high degree of certainty that in Old English voiced and voiceless fricatives were allophones, rather than phonemes, that there was no do-periphrasis in Middle English, and that in Early Modern English there was some variability between third-person singular present tense {-s} and {-th} – just as we know that present-day Japanese and Korean use postpositions, rather than prepositions.
However, any language is obviously much more than just a simple collection of plain texts. Languages are means of communication that individual speakers (and writers) possess to interact with other members of a speech community. From a synchronic point of view, the (spoken and written) texts produced by the speakers of a speech community, of course, remain an interesting object of inquiry, as modern corpus linguistic research shows. Yet, in addition to this, modern sociolinguistic research, on the one hand, and psycho- and cognitive linguistic research, on the other hand, have shown that additional important scientific insights about language can be gleaned by investigating the social stratification of language as well as the processing and language use of the individual speaker, respectively.
2 Historical sociolinguistics – historical cognitive linguistics
A crucial finding of sociolinguistic research, for example, is that all languages are characterized by ‘inherent variability’ (Labov 1969; see also Hudson 1997, 2007): the utterances of individuals regularly exhibit variation (‘first-order variation’); there is socially stratified variation across subgroups of a speech community (‘second-order variation’) as well as variation across dialects and languages (‘third-order variation’; see Croft 2006: 98–103). This pervasiveness of variation led Weinreich, Labov & Herzog to describe language ‘as an object possessing orderly heterogeneity’ (1968: 100). They even went so far as to claim that ‘in a language serving a complex (i.e. real) community, it is absence of structured heterogeneity that would be dysfunctional’ (1968: 101). Numerous research projects have indeed established that linguistic systems can and should be described as systems with structured heterogeneity, and that the patterning and use of linguistic variables must be seen in the context of language-internal, social, as well as cognitive and cultural factors (to use the subtitles of Labov's three-volume magnum opus, 1994, 2001, 2010). Historical linguistics has done a remarkable job so far in describing and analysing the internal, social and cultural factors that interact with the historical language systems and their orderly heterogeneity (for an overview, see Burridge & Bergs 2016: 162–87, 260–4): the venerable S-curve of lexical diffusion, the correlation of social factors and morphosyntactic variables in Early Modern English, and the interplay of language and culture in post-conquest England as it can be seen in early mixed-language business writings are just some examples.
Similarly, synchronic psycholinguistic and cognitive linguistic research has revealed a large number of cognitive structures and processes that affect the language production of individual speakers: inter alia, analogy, blending, (prototype-based gradient) categorization, chunking, embodiment, entrenchment, frames, metaphor and metonymy, priming, preemption, and schematization (for an overview, see, e.g., Evans & Green 2006; Geeraerts & Cuyckens 2010; Dabrowska & Divjak 2015; Dancygier 2017). Yet in historical linguistics, the question of cognition, or cognitive linguistics, has not featured prominently (for an exception, see Croft 2006 as well as the discussion below). Speakers in language history must have somehow cognitively processed the language they produced and received. How did they do that? Did they have the same cognitive apparatus, the same cognitive underpinnings in those processes as modern speakers? Can modern findings in cognitive linguistics help us in describing and analysing historical data?
A few attempts have already been made to come to grips with the thought processes of historical speakers (see Labov 2010 and also the summary in Winters 2010). For example, as far as we can tell, the process of ‘categorization’, in both linguistic and non-linguistic matters, has always been the same (Winters 2010: 16). Other studies have investigated semantic changes from a cognitive linguistic point of view, including metaphors (e.g. Dirven 1985; Fabiszak 2001). Grammaticalization (Heine 1993; Hopper & Traugott 2003; Bybee 2003; Brems & Hoffmann 2012) has also been a hot topic from a cognitive perspective. However, previous research has mostly tended to use psycho- or cognitive linguistic findings as a background, merely applying modern theories to historical data. This view neglects the important repercussions of the results of diachronic studies for modern cognitive and psycholinguistic approaches. Fulk (this volume) suggests that we should first assume that whatever factors and mechanisms we find today should also be found in the past. In other words, there is no a priori need to assume a different state of affairs or a different cognitive system for speakers in the past. But what if historical data do not confirm the predictions that we make on the basis of our synchronic findings and theories? These are two of the central questions for the present special issue: to what extent do historical data confirm or disconfirm our present-day findings from cognitive linguistics? And how do we account for the discrepancies, if any?
In this special issue, these questions are addressed in an interactive structure: it contains seven articles (by Antonina Harbus, Peter Petré, Meike Pentrel, Elizabeth Closs Traugott, Hendrik De Smet & Freek Van de Velde, Thomas Hoffmann and Marcelle Cole), each followed by a response squib from an international expert (Mark Turner, Lauren Fonteyn, Peter Petré, Alexander Bergs, Martin Hilpert, Bert Cappelle and Ans van Kemenade), which critically assesses the claims made in the respective target article. Three coda articles then summarize, evaluate and conclude the arguments brought forward in the various contributions: Margaret Winters comments from the perspective of historical cognitive linguistics; Elly van Gelderen provides alternative views from a more generative perspective; and Robert Fulk looks at the merits and problems of this special issue through the lens of philology. In the following section we will briefly outline the two major concerns of the special issue and how the individual contributions address them.
3 Cognitive approaches to historical data: limits and possibilities
A well-known issue in historical linguistics is the ‘bad-data problem’. As Labov (1994: 11) points out: ‘historical documents survive by chance, not by design’ and, on top of that, they ‘are riddled with the effects of hypercorrection, dialect mixture, and scribal error’. Consequently, it is impossible to know with certainty ‘what was understood’ (Labov 1994: 11) by the contemporary readers of these texts. Still, an initial, plausible idea seems to be that the mental grammars of, for example, Old English or Middle English readers were subject to the very same constraints that have been identified by modern cognitive linguistics (see above) and psycholinguistics (see, e.g., Field 2004). The Uniformitarian Principle, therefore, claims that whatever is the case today must also have been the case in the past; what is possible today must have been possible in the past; and what is impossible today must have been impossible in the past (Bergs 2012).
As Elizabeth Closs Traugott in her contribution ‘“Insubordination” in the light of the Uniformitarian Principle’ points out, the principle refers to processes and not necessarily to linguistic states of affairs and must be carefully distinguished from language- and time-specific particulars. Investigating the evolution of insubordination constructions, i.e. finite monoclauses with an initial subordinator, she argues that synchronic interactional analyses reveal that these constructions are mainly used as independent increments that move discourse forward. In line with the Uniformitarian Principle, she then investigates whether the constructions had a similar use in earlier stages of the language. Her results show that insubordinate constructions already fulfilled this interactional function in her historical data and that it is these uses that are the most likely origin of the structure. In his response article, ‘The myth of the complete sentence’, Alexander Bergs takes this finding one step further and asks what the consequences for cognitive historical linguistics would be if the notion of the ‘complete sentence’ was no longer an axiom of syntactic analysis.
In ‘Connecting the present and the past: cognitive processing and the position of adverbial clauses in Samuel Pepys's Diary’, Meike Pentrel tests the linear order of main and temporal adverbial clauses in Samuel Pepys's Diary. By applying cognitive and processing constraints observed with synchronic data (the iconic temporal order of main and subordinate clauses, the length of the adverbial clause, and the implied meaning of both clauses), her article is another test case for the Uniformitarian Principle. As it turns out, most of her findings for Pepys are in line with the effects predicted by synchronic studies (with the iconicity of event order being the best predictor for placement of the adverbial clause), thus corroborating again the validity of the Uniformitarian Principle. At the same time, she notes that not all data can be predicted by cognitive and processing principles. On the one hand, additional discourse-pragmatic factors (which could also be argued to be cognitive in nature) also appear to play an important role. Besides other issues such as a comparison of Pentrel's approach with his own (see below), it is this point which Peter Petré focuses on in his response article, ‘Connecting the past and the present’. Petré calls attention to the fact that synchronic psycholinguistic studies on speech production emphasize that different cognitive and processing factors require different degrees of cognitive effort and consciousness. Yet in most of the contributions to the volume, cognitive processes are treated as equally strong. Petré, therefore, rightly calls for future studies to devote more attention to the issue of hierarchically ordered cognitive constraints. Tentatively, he develops the hypothesis that more automatic and unconscious processes should be diachronically more stable, while more conscious ones might be more susceptible to change. 
On the other hand, Pentrel points out in her article that not every single token can be explained exhaustively by cognitive or processing principles, since there is always the possibility of creative and experimental language use (an observation she attributes to Traugott).
Marcelle Cole's article on ‘Pronominal anaphoric strategies in the West Saxon dialect of Old English’ provides further support for the Uniformitarian Principle: Cole qualitatively as well as quantitatively analyses anaphoric strategies in late West Saxon Old English prose texts. Similarly to findings from present-day cognitive/psycholinguistic studies, information structure turns out to be the main factor affecting the choice between pronominal and demonstrative anaphoric elements in her data (with the former favouring discourse old antecedents and the latter avoiding discourse topics). Yet Cole also notes that, in addition to cognitive factors, text type plays an important role, as the relative frequency of an anaphoric variant also appears to depend on text type and subject matter. To this, Ans van Kemenade (‘Reply to Cole’) adds that recent research indicates that the choice and interpretation of the anaphoric variant also interacts with topic continuity or shift as well as cause or result continuation. Clearly more research is needed to disentangle the various constraints at work in synchronic as well as diachronic anaphor resolution.
Consequently, the above articles largely confirm the idea of the Uniformitarian Principle. The principle entails that historical data should always confirm what we know about – or at least that it should not be in contradiction to – the processes underlying present-day language production and perception. If we do find discrepancies and differences, however, these can be primarily due to two factors: the possibility that data from earlier periods may be unreliable and the likelihood of changes in cognitive processing.
First of all, there may be a data problem for language history. While present-day languages offer an abundance of data and also allow for actual experiments with (living!) speakers, there is only a very limited amount of data for historical periods, and certainly no chance of conducting further experiments or eliciting more data. Yet two contributions to the present volume illustrate how historical corpus data can at least be treated in an experimental fashion, by applying concepts and methodologies from modern psycholinguistics. In ‘The extravagant progressive: an experimental corpus study on the history of emphatic [be Ving]’, Peter Petré looks at the evolution of the present progressive construction and argues that extravagance, i.e. a speaker's desire to express emotionally charged information in an unconventional and thus prominent way, was the main motivation behind the grammaticalization of the construction. For Present-day English, one could design an experiment that crosses variables expressing extravagance with both the present progressive and the simple present to test whether the former construction is significantly more strongly associated with extravagant language use. For his historical data, Petré tries to emulate this approach by drawing on a large corpus of Early Modern English writers, which allows him to sample, for each writer, a large number of tokens of both constructions that contain the same verb (thus precluding any lexical effects). His experiment-like analysis reveals that the present progressive construction is indeed significantly more strongly associated with extravagant markers (such as linguistic elements indicating physical speaker involvement or spatio-temporal deixis).
As Lauren Fonteyn in her response article, ‘The aggregate and the individual: thoughts on what non-alternating authors reveal about linguistic alternations’, notes, this is a very promising approach that enables researchers to reveal the mental factors that led historical writers to choose one linguistic variant over a competing one. At the same time, in order to get a more comprehensive view of the social spread of a variant, she cautions that studies such as Petré’s must also be complemented by analyses that additionally investigate the writers that used only one of the available variants. Moreover, Fonteyn emphasizes that these non-alternating writers potentially raise important questions concerning the alleged cognitive reality of motivations favouring certain variants.
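The logic of such an experiment-like corpus comparison can be made concrete with a minimal sketch. It is not Petré's actual procedure: it assumes, purely for illustration, that tokens have already been annotated for construction (progressive vs simple present) and for the presence of an extravagance marker, and that the counts have been pooled into a single 2x2 table. The function names and the table layout are hypothetical.

```python
from math import comb


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical annotation scheme):
      a = progressive tokens with an extravagance marker
      b = progressive tokens without one
      c = simple present tokens with an extravagance marker
      d = simple present tokens without one
    Raises ZeroDivisionError if b or c is zero.
    """
    return (a * d) / (b * c)


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact p-value: probability of observing at least
    `a` marked progressives if construction and marker were independent,
    with all row and column totals held fixed (hypergeometric null)."""
    row1 = a + b            # all progressive tokens
    col1 = a + c            # all tokens carrying a marker
    n = a + b + c + d       # all tokens
    upper = min(row1, col1)
    favourable = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return favourable / comb(n, col1)
```

An odds ratio above 1 together with a small p-value would point in the direction reported above, namely a stronger association of the progressive with extravagance markers. Pooling across writers discards the per-writer, same-verb matching that the design described above relies on, so a fuller analysis would compute such a table for each writer separately; writers who use only one of the two constructions yield no usable table at all, which is one way of seeing the concern raised by Fonteyn.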
An additional study inspired by psycholinguistic approaches is Hendrik De Smet & Freek Van de Velde's contribution, ‘Experimenting on the past: a case study on changing analysability in English ly-adverbs’. Drawing on the concept of priming, they find that ly-adverbs with a high degree of analysability (such as indirectly or mentally) have a higher number of other ly-adverbs in their preceding and following context than less analysable ly-adverbs in data from the 1950–2005 period of the Hansard Corpus. As they argue, this finding can be interpreted as a stronger priming effect of more analysable ly-adverbs. In line with the Uniformitarian Principle, De Smet & Van de Velde's results thus seem to corroborate synchronic priming effects for diachronic data. Yet, as the authors admit, their analysis still leaves a considerable amount of variation unexplained. On top of that, Martin Hilpert's replication studies do not support the effect observed by De Smet & Van de Velde, leading him to claim that ‘Text frequency does not correlate with priming sensitivity’ in his response article. The jury is thus still out on this issue, and future diachronic research will have to investigate potential priming effects in much more detail.
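The kind of measurement underlying such a priming study can be sketched as follows. This is an illustrative simplification and not De Smet & Van de Velde's actual operationalization: the window size, the tokenization and above all the crude test for ly-adverbs (any token ending in -ly, which wrongly includes words such as only or family) are assumptions made for the sake of a short example.

```python
def ly_neighbours(tokens: list[str], index: int, window: int = 5) -> int:
    """Count other tokens ending in -ly within `window` tokens on either
    side of tokens[index]. The suffix test is a crude stand-in for real
    part-of-speech tagging (assumption)."""
    lo = max(0, index - window)
    hi = min(len(tokens), index + window + 1)
    return sum(
        1
        for i in range(lo, hi)
        if i != index and tokens[i].lower().endswith("ly")
    )


def mean_ly_context(tokens: list[str], targets: set[str], window: int = 5) -> float:
    """Mean number of neighbouring -ly tokens per occurrence of any adverb
    in `targets` (lower-cased forms); 0.0 if no target occurs."""
    counts = [
        ly_neighbours(tokens, i, window)
        for i, tok in enumerate(tokens)
        if tok.lower() in targets
    ]
    return sum(counts) / len(counts) if counts else 0.0
```

Comparing the value of `mean_ly_context` for a set of highly analysable adverbs with the value for a set of less analysable ones mirrors the comparison described above. A higher mean for the analysable set would be compatible with a stronger priming effect, although frequency, genre and topic would all have to be controlled before any such conclusion could be drawn.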
While all these contributions present innovative ways of approaching diachronic data, it nevertheless needs to be emphasized that the ‘limited’ data problem remains and always will remain an issue that all studies in historical cognitive linguistics simply have to deal with – and that they have to take into account in all their explanations. There may not be enough data, or the data that we have may come from genres and text types that do not easily lend themselves to studies of language processing.
Secondly, if we are confident that our historical data are reliable, and yet they fail to support or even seem to contradict cognitive and psycholinguistic processes that have been identified for present-day speakers, another possibility arises: it is at least theoretically possible that the cognitive systems of speakers then were not exactly the same as those of speakers now, i.e. that speakers during earlier periods did not possess identical modern-like constraints on memory and attention, the same affective system, or the same susceptibility to frequency effects or priming. Even though it has been argued that historical text production did not differ from modern text production, considerable changes in mediality and the contexts of text production have clearly taken place over the history of the English language (see Ong 1982; Goody 1987; Clanchy 1993). Questions that are particularly pressing for Old and Middle English (that is, the period before mass book production) include: was a particular text written by one scribe or several? Was it written during one sitting or several? Was it dictated? Was it translated? Is the text conceptually oral or written in nature? (See also Fulk, this volume, for a convincing argument for the continued importance of such philological questions in cognitive historical research.) Besides, it can also be expected that writers of all times are always consciously aware of certain conventions which are going to affect and influence their writing (Tieken-Boon van Ostade 2000: 448–9). Is it perhaps possible that these different text production situations resulted in (slightly) different cognitive systems?
Antonina Harbus, in ‘A cognitive approach to alliteration and conceptualization in Medieval English literature’, for example, investigates the role of alliteration in Old English and Middle English literary texts (Beowulf and Sir Gawain and the Green Knight, respectively). Adopting a Construction Grammar approach, she argues that alliteration in these poetic texts functions as a chunking device that plays an important role for the creation of local and literary conceptual clusters. Her article thus supports the view that the cognitive process of chunking was also at work in the minds of Old and Middle English speakers. Yet at the same time, her historical data also indicate that the particular structures used for chunking may differ considerably from period to period (the alliteration pattern that we find in Beowulf is, e.g., arguably much less important for Present-day English literary texts). As Mark Turner points out in his response to Harbus (‘Conceptual compression and alliterative form’), chunking via alliteration can also be seen as interacting with another cognitive process, namely conceptual blending. Turner notes that alliteration chunks function as very abstract compression templates that prompt the hearer/reader to compress the meaning of the alliterated parts into an integrated meaning packet. Harbus and Turner thus illustrate how cognitive approaches to historical data considerably further our understanding not only of the processing capacities of speakers/writers from earlier periods, but also of the precise nature and interaction of cognitive processes that can inform future synchronic research.
In ‘Construction Grammar as Cognitive Structuralism: the interaction of constructional networks and processing in the diachronic evolution of English comparative correlatives’, Thomas Hoffmann, at first, seems to find evidence that contradicts the Uniformitarian Principle. His diachronic analysis reveals that Old English had only a non-iconic comparative correlative construction (C2C1 (effect C2 before cause C1): you get the fatter C2, the more you eat C1), but not the iconic alternative (C1C2 (cause C1 before effect C2): the more you eat C1, the fatter you get C2) that should be preferred on processing grounds and has a much higher frequency in Present-day English. Yet Hoffmann argues that processing issues only need to be taken into account once variation between structures exists. The more influential factor in the evolution of these constructions is the Construct-i-con (Fillmore 1988; see also Jurafsky 1992 and Goldberg 2003: 223) – the mental network of constructional knowledge, whose basic architecture of taxonomic and horizontal constructional links can be assumed to have been the same for Old English as well as present-day speakers. As Hoffmann shows, his network-based analysis explains the existence of the non-iconic C2C1 structure in Old English as the only constructional choice, the later innovation of the iconic C1C2 construction via analogy as well as the ensuing competition between the two alternatives and the expected processing-based distribution of the two constructions. In addition to this, on a more theoretical level, he argues that the Construct-i-con can straightforwardly be interpreted as the mental equivalent of the Structuralist idea of language as a system (albeit that of an individual and not the collective one of a speech community).
Consequently, he claims that constructionist approaches can be seen as a type of Cognitive Structuralism and that Construction Grammar provides a cognitive explanation of many Structuralist findings. In his response article, ‘Changing the system from within’, Bert Cappelle adds that analogy in the Construct-i-con network might not always be unconscious. While phrasal and constructional innovations might be the result of unconscious analogical thinking, Cappelle provides evidence of analogy-based lexical innovations that clearly seem to be conscious efforts on the part of the speaker. He therefore speculates that, contrary to mainstream constructionist approaches, phrasal/clausal constructions and lexical constructions such as derivation and compounding differ not only in size, but also in kind (with the latter being more consciously accessible for analogical change than the former).
4 Codas and conclusion
The present issue raises crucial questions that should take centre stage in future diachronic research, namely how cultural and social factors generally interact with cognitive constraints and to what degree diachronic change can be accounted for by cognitive principles at all. We argue that one way to investigate this is to establish an important feedback loop between diachronic data and modern linguistic theory: on the one hand, it is possible that historical data are subject to the very same cognitive factors, and that these are unaffected by dynamic, cultural influences. In this case, the diachronic data provide important additional support for these principles. On the other hand, it is also possible that the role of certain cognitive constraints differs in historical texts from their present-day effect. Then, it is necessary to ask how far these differences can be explained by the different text production conditions, styles and (semi-conscious) writing conventions. Alternatively, we might even have to question the universality of some individual cognitive principles.
The three coda articles by Margaret Winters (‘Psycho-historical linguistics: its context and potential’), Elly van Gelderen (‘Generative coda’) and Robert Dennis Fulk (‘Philological coda. Noise: an appreciation’) not only critically assess these claims, they also highlight further important issues that need to be addressed by future research into historical cognitive linguistics: for example, that we still have much to learn about the human mind and its cognitive processes; that alternative frameworks put forward different and challenging accounts that we need to engage with; and that profound philological knowledge and work are still indispensable prerequisites for diachronic linguistic studies.
Historical cognitive linguistics can thus be said to be a fascinating and highly vibrant research field – but one where a lot of cognitive linguistic work still remains to be done.