“I need not tell the world what (to their cost) they know, That Souldiers by action, and Printers by promulgation, are the two great English Factors.”
Captain John Randolph, 1643.
Introduction
In English cultural history, the sixteenth and seventeenth centuries were especially fecund. This era solidified the importance of the book as the dominant vehicle of culture, vastly increasing the flow of vernacular texts and allowing a much deeper penetration of the written word into a wide variety of English-speaking classes and cultural groups (Barnard and McKenzie 2002). “Retrospectively, the history of the book in Britain from 1557 to 1695 looks like a triumphalist progress in which a dominant Protestant vernacular culture, and an emergent canon of English literature, were steadily created and successfully displaced an earlier Latinate and Catholic world looking towards Europe” (Barnard 2002: 1). With religious considerations permeating every aspect of early-modern English society, religious works, particularly those expressing Puritan sentiments, provided an important element of the output of the expanding publishing sector (Collinson et al. 2002; Green and Peters 2002).
At the same time, the early-modern period saw the flourishing of experimental science, as exemplified in the publications of Bacon, Harvey, and Boyle and in the formation of the Royal Society, which itself played an important role in the emergence of scientific publishing (Johns 2002). And fundamental texts on political philosophy and institutional thought appeared, epitomized by the works of Harrington, Locke, and Hobbes. England in the sixteenth and seventeenth centuries thus witnessed a cultural flowering in a multiplicity of domains, with works made all the more accessible by the growing publishing industry. With accessibility, there was increasing scope for a fertile intermingling of ideas, including those about religion, science, and institutions.
But what were the central emphases in the burgeoning English print culture of the sixteenth and seventeenth centuries and when did those emphases emerge? To what extent did ideas about religion, science, and institutions coevolve and when were such coevolutionary processes especially important? In these processes, how consequential were specific historical events, such as the Civil War, the Glorious Revolution, and the formation of the Royal Society? These questions have long intrigued historians and social scientists (e.g., Hill 1965; Merton 1938; Wootton 2015) and lately have increased in importance in economics as a result of works that view culture as central in England’s early economic rise (see, e.g., McCloskey 2016; Mokyr 2016).
Traditional textual analysis constitutes one approach to tackling the above questions. However, providing a balanced summary of a very large corpus is not possible using only the close reading that forms the foundation of traditional methods. Instead, given recent availability of digitized corpora such as the Early English Books Online-Text Creation Partnership (hereafter EEBO-TCP), the application of computational and statistical techniques for analysis of text-as-data (see, e.g., Gentzkow et al. 2019; Grimmer et al. 2021; Livermore and Rockmore 2019) provides an eminently feasible route to systematic investigation of a large corpus of cultural works.
Indeed, aspects of early-modern English cultural history have recently been examined using computational and statistical methods. In particular, in an innovative contribution, Erikson (2021) applies computational methods in an analysis of 2353 EEBO-TCP economics-related texts to illuminate how the actions and writings of merchants led to greater prominence of economic ideas in the early-modern period. Footnote 1
In this paper, we also adopt a computational approach to the investigation of early-modern English cultural history. Our perspective is distinctly macroscopic. Unlike Erikson (2021), we do not limit ourselves to a subset of the available EEBO-TCP texts or focus on specific ideas. Rather, we paint a broad-brush picture of the central cultural emphases discernible in the entire sixteenth- and seventeenth-century EEBO-TCP corpus. We then use macro-econometric methods to study the interactions between three cultural aggregates, which reflect the sets of ideas about religion, science, and institutions discovered in the corpus.
To this end, we first use topic modeling, an unsupervised machine learning method (Grimmer et al. 2021), synthesizing the content of 57,863 EEBO-TCP texts into 110 topics. We thereby provide a quantitative digest of the key substantive emphases in sixteenth- and seventeenth-century English print culture.
We then examine patterns within the dataset for the 110 topics that constitute that digest. We first study temporal changes in attention to specific topics. Our investigation thereby generates new stylized facts about early-modern cultural change and the timing of important cultural developments. Among the latter, debate about the emergence and origins of epistemological ideas later associated with Francis Bacon has been especially prominent (e.g., Grajzl and Murrell 2019; Martin 1992: 164–71; Shapiro 2000: 107–12). Our data and analysis cast direct empirical light on this debate. Given that Bacon has been credited as providing a central impetus to the development of a “culture of growth” (Mokyr 2016), an understanding of the emergence of epistemological thought associated with him provides insight into the cultural roots of England’s early economic rise.
In our primary econometric exercise, we investigate the coevolution of ideas on religion, science, and institutions. We focus especially on the interplay between religion and science, a connection that has been the subject of long-standing interest among social historians (e.g., Barnes 2000; Shapin 1996; Wootton 2015).
One especially prominent theory is the “Merton thesis” (Merton 1938) – that the cultural sentiments imbued by Puritanism played a vital role in facilitating the development of modern science. According to Shapin (1996: 136), variants of that thesis are broadly accepted: “Much of what Merton then wrote about religious motives to science, and religious justifications for science, has passed into historical commonplace.” Footnote 2 But the support is far from universal. Shapin (1988) comments on the hostile reception among many historians in the decades after Merton’s original publication. Brooke (1990) views Merton’s thesis as controversial and Porter (2000) does not even mention religion when discussing the culture of science in the seventeenth century. Wootton (2015: 908) approvingly cites the work of one of the hostile critics, who reaches the conclusion that “in the story of the rise of science, therefore, religion is a peripheral concern” (Rabb 1965: 126). Footnote 3 These differences surely mean that “English society, strongly influenced by Calvinism and deeply involved in the development of science and scientific institutions, constitutes an ideal test case in which to examine the relations between science and religion” (Webster 1974: 15). Moreover, a new opportunity for tests arises with the advent of promising methods offered by the conjunction of machine learning and econometrics.
We employ vector autoregression (VAR), which is a standard technique for modeling the dynamic relationships among variables over time (see, e.g., Stock and Watson 2001), much employed in empirical macroeconomics. Our VAR analysis allows us to illuminate whether the interplay between religion and science also occurred in the comparatively less studied direction, from science to religion. Footnote 4 Additionally, we are able to provide insight into when the influence from religion to science was especially strong. For instance, both Merton (1938) and Mokyr (2016) in this respect emphasize the seventeenth century, but our analysis provides insights into whether any such influence occurred even earlier.
Finally, our VAR estimates allow us to cast new light on several major seventeenth-century events: the Civil War, the Glorious Revolution, and the formation of the Royal Society. In economics especially, the role of the Glorious Revolution has been contested. A long-held theory emphasized the view that the Glorious Revolution was a watershed in the history of English institutional development (e.g., Acemoglu and Robinson 2012; North et al. 2009; North and Weingast 1989). Recent research has challenged this perspective, revealing a more gradual evolution of English institutions than emphasized by earlier contributions (Henriques and Palma 2023; Hodgson 2017; Murrell 2017; Ogilvie and Carus 2014). We bring new empirical evidence to this debate, illuminating the extent to which the developments in 1688–1689 resonated in English print culture and specifically in the emphases on institutions.
The corpus
The EEBO-TCP documents and their processing
Processing print texts from before 1700 entails many challenges: the inscrutable fonts immune to optical character recognition (OCR), the chaotic orthography, the archaic inflections, and the appearance of untranslated Latin. TCP (2022) has solved the first of these problems for a wide-ranging set of texts by using manual keying. This is important because no current OCR software produces satisfactory output for sixteenth- and seventeenth-century texts. Therefore, no alternative machine-readable corpus of commensurate breadth and depth exists in a form that could underpin the type of inquiry into pre-1700 English culture that we undertake. We therefore begin with the corpus of 60,331 texts available from EEBO-TCP, addressing the problems of orthography, inflections, and foreign words with our own Python programs. This subsection provides an overview of the steps we took in processing that data. Appendix A in Supplementary material provides details.
We removed all EEBO-TCP-inserted formatting symbols to begin with versions of the texts that were as close to the originals as possible. We then assigned a year of publication to each text using the information provided by EEBO-TCP. A very small number of texts could not be dated and were discarded.
We converted the non-standardized orthography that was common before the eighteenth century into standard modern orthography. Older-style inflections were modernized. We translated on a word-by-word basis those words that could not be found in a modern English dictionary and were readily identified as Latin. We then dropped documents that contained either an especially small number of words or an uncharacteristically high share of words that could not be matched to any word in the English dictionary even after the processing.
The ensuing corpus was imported into R and further processed using the textProcessor function. We thereby converted all words to lower case, applied the Porter (1980) stemming algorithm, and removed standard English stop words, numbers, words with fewer than three characters, words included in only one document, and punctuation. After the resultant processing, the final corpus consisted of 57,863 documents containing 83,337,912 letter-based strings (i.e., words).
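The filtering steps just described can be illustrated with a minimal Python sketch. This is not the textProcessor implementation used in the paper: the stop-word list is a tiny hypothetical stand-in, stemming is omitted, and the function names `tokenize` and `clean_corpus` are introduced purely for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are far longer).
STOP_WORDS = {"the", "and", "for", "that", "with", "this", "was", "are"}

def tokenize(text):
    """Lower-case a text and keep only letter-based strings.

    Numbers and punctuation are dropped because they never match [a-z]+.
    """
    return re.findall(r"[a-z]+", text.lower())

def clean_corpus(docs, min_len=3):
    """Apply the filters described above to a list of raw document strings."""
    tokenized = [
        [w for w in tokenize(d) if len(w) >= min_len and w not in STOP_WORDS]
        for d in docs
    ]
    # Document frequency: the number of documents in which each word appears.
    doc_freq = Counter(w for toks in tokenized for w in set(toks))
    # Drop words that occur in only one document.
    return [[w for w in toks if doc_freq[w] > 1] for toks in tokenized]

docs = [
    "The Souldiers of the King, 1643!",
    "Printers and souldiers are factors.",
    "An experiment of divinity for printers.",
]
cleaned = clean_corpus(docs)
```

In this toy corpus only `souldiers` and `printers` survive, since every other remaining word appears in a single document.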
Selection issues
The EEBO-TCP project began with lists of works contained in prominent catalogs that “trace the history of English thought from the first book printed in English in 1475 through to 1700” (TCP 2022). The combined catalog comprises more than 125,000 works for which facsimiles of texts are available. Due to resource and other constraints, the texts that EEBO-TCP actually processed amounted to approximately one half of those. The project’s aim was simply “to key as many different works – as much different text – as possible” (TCP 2022). EEBO-TCP implemented the project in a manner in which the individual preferences of project partners, staff, and editors had a negligible impact on text selection.Footnote 5
But, of course, the EEBO-TCP corpus does not provide a random sample of English culture. Much culture was not committed to print, with only a minority of the population in early-modern England literate.Footnote 6 Many texts will have been lost, with survival depending on how much subsequent generations valued the texts. TCP (2022) also focused on first editions. The EEBO-TCP corpus is thus best viewed as capturing new developments rather than reflecting the stock of texts in use at any juncture: our analysis is more likely to reflect the production of print culture than the consumption of print culture.
Figure 1 shows the distribution over time of the texts in our corpus. Two features stand out. First, the growth from 1475 to 1640 reflects the growing importance of print. Second, the years with a large number of texts (e.g., 1642, 1660, 1689) are momentous ones in English history (the beginning of the English Civil War, the Restoration, and the resolution of the Glorious Revolution). None of the findings we reach in this paper simply reflect the number of texts published in any specific year.
Producing a machine learning digest of English print culture
Our first objective is to generate a machine learning digest of the complete EEBO-TCP corpus. In this step, our unit of analysis is a document, a self-standing published work.Footnote 7 We use topic modeling, the standard natural-language-processing approach when the aim is to infer the core emphases in a large corpus (see, e.g., Grimmer et al. 2022, 2021).Footnote 8
Topic modeling is an unsupervised machine learning method (see, e.g., Burkov 2019). The algorithm identifies topics by leveraging patterns in word use across the corpus documents. Each document is conceptualized as a mixture of all topics, while topics are probability distributions over the corpus vocabulary. We estimate a structural topic model (henceforth STM; Roberts et al. 2014, 2016a), a variant of topic modeling that directly incorporates document-level metadata (e.g., publication year) into the estimation to aid topic identification. To implement the estimation, we use R’s stm package (Roberts et al. 2019).
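The mixture representation can be made concrete with a small sketch: if a document's topic proportions and each topic's word distribution are known, the implied probability of any word in that document is the proportion-weighted sum of the topic-level probabilities. The two topics, the three word-stems, and all numbers below are hypothetical, and the sketch shows only this generic mixture identity, not the STM estimation procedure.

```python
def word_probabilities(theta_d, beta):
    """Return p(word | document) = sum_k theta_d[k] * beta[k][word]."""
    vocab = beta[0].keys()
    return {w: sum(t * b[w] for t, b in zip(theta_d, beta)) for w in vocab}

# Two hypothetical topics over a three-stem vocabulary (each sums to 1).
beta = [
    {"faith": 0.7, "experi": 0.2, "law": 0.1},  # a religion-like topic
    {"faith": 0.1, "experi": 0.3, "law": 0.6},  # an institutions-like topic
]
theta_d = [0.75, 0.25]  # the document is 75% topic 0 and 25% topic 1

probs = word_probabilities(theta_d, beta)
```

Because both the topic proportions and each topic's word distribution sum to one, the resulting word probabilities also sum to one.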
Choosing the number of topics
Before estimation, the number of topics must be chosen. There exists no universally agreed-upon methodology for making this decision (see, e.g., Grimmer et al. 2022: Ch. 13; Wang et al. 2019: 258–59). We first estimated a series of STMs by varying the number of topics between 10 and 200. We then examined standard measures of goodness-of-fit such as held-out likelihood and size of residuals (see, e.g., Roberts et al. 2016b; Taddy 2012; Wallach et al. 2009). The model with 110 topics fit the data well, with further increases in the number of topics producing only modest gains. We also directly compared the 110-topic model with models featuring fewer and more topics. None of the alternative models dominated the 110-topic model on standard criteria (see, e.g., Airoldi and Bischof 2016; Weston et al. 2023): semantic coherence (measuring the internal consistency of the topics) and exclusivity (measuring whether topics can be easily distinguished). The decision to use a 110-topic model was confirmed when we contrasted the output of that model with the output of models with different numbers of topics by reflecting on the ease of interpreting topics and distinguishing between them.Footnote 9
Interpreting the estimated topics
To interpret and name the estimated topics, we examined both the word-stems most highly associated with each topic and those documents that featured the topic most prominently.Footnote 10 Our interpretation and naming of the estimated topics therefore also incorporated an element of the close reading typical of conventional text analysis.
Appendix B in Supplementary material describes the content and justifies the assigned names for the 110 topics. Here, we briefly illustrate our process of topic interpretation and naming by using only one example. For the pertinent topic, the top word-stems (those most associated with the topic) emphasize logical connectives: upon, yet, though, thus, mean, inde [thereunto], even, impli [imply]. The top documents (those featuring this topic in the greatest proportion) often contain the word experiment. Nearly all top documents focus on religion using logical arguments and emphasizing inductive learning from facts: biblical, historical, or personal. The top document states that “if a man were but well read in the story and various passages of his life, he might be able to make an experimentall divinitie of his own. He that is observant of Gods former dealings and dispensations towards him, may be thence furnished with a rich treasury of experience against all future conditions.” Accordingly, we name the topic Baconian Theology.Footnote 11
Via this process we were able to readily identify the ideas underlying all 110 estimated topics. This is, in itself, a verification of the quality of text preprocessing, the choice of the number of topics, and the applicability of STM. The topics include both the innocuous and familiar-sounding ones, as well as more unusual ones, all clearly dictated by our estimates. Thus, Salvation via Faith competes for attention with Lusty Entertainments; Emotional Relationships contrasts with Expressing Loving & Loathing; Deductive Theology appears alongside Baconian Theology. The most prevalent topic is an inward-looking one, Self-Reflection, accounting for 4.32 percent of the corpus, while the second most prevalent is other-directed Petitions, Protests, & Proposals, occupying 3.42 percent.
Table 1 lists the names of all topics together with the percentages of the corpus occupied by each. These names, being necessarily brief, might not convey the full meaning of the topic. Appendix B in Supplementary material provides more detailed information on each topic.
Notes: The table lists the 110 STM-estimated topics (non-italicized) as interpreted by the authors. (Appendix B provides the lists of keywords most prominently associated with each topic and a detailed justification for each topic name.) The topics have been grouped into 11 themes (italicized). The numbers in parentheses are expected document-level prevalences, computed as simple (non-weighted) means across documents of the STM-estimated topic and theme prevalences, all expressed in percentages.
Grouping topics into broader themes
The 110 estimated topics constitute one window into nearly two centuries of cultural change. Another perspective emerges from even more aggregation. We grouped the topics into a smaller number of broader themes. Following recent contributions faced with an analogous task (see, e.g., Gennaro and Ash 2022; Grajzl and Murrell 2023), we assigned topics to themes manually, based on our own understanding of the topics.Footnote 12
In grouping topics into themes, we did not pre-commit to a fixed number of themes but rather allowed the number to emerge from the aggregation process. We strove to create themes that were broad enough to achieve a further reduction in dimensionality and narrow enough to resonate with major areas of cultural inquiry. In this process, we assigned each topic to one theme only, using the criterion of the most natural fit. Classifications were straightforward for the overwhelming majority of topics (e.g., Sin, Damnation, & Repentance to religion; Chemistry to science; Constitutional Rules to institutions). However, culture is a seamless web and thus a small set of topics lie close to the boundaries between themes. For example, Autonomous Church Governance and Hierarchical Church Governance were close to the religion and institutions themes. Ultimately, given their emphasis on organization under the law rather than on religion, we allocated these topics to institutions.Footnote 13
The resultant 11 themes differ considerably with regard to the number of included topics and the proportion of the corpus occupied (see Table 1). Unsurprisingly, religion accounts for a large number of topics (20) and a large proportion of the corpus (22 percent). Among religious topics, a number reflect areas of religious thought to which Puritans made major contributions (e.g., Apocalyptic Theology; Salvation via Faith; Baconian Theology; Christian Mental Exercises; Deductive Theology; Self-Reflection). Others reflect more general religious controversies in which Puritans participated in debate (Transubstantiation; Salvation via Virtue).
Characterizing cultural evolution temporally
In this section we provide insights into how attention to various areas of cultural discourse changed over time. To this end, we first describe the construction of the pertinent time series. We then offer a set of observations on particularly interesting episodes identified in the corresponding timelines. The resulting observations provide a unique macroscopic overview of the evolution of a variety of cultural domains.
Constructing the time series of attention to topics and themes
The core output of our topic model is a 57,863 × 110 topic-document matrix. An element of that matrix, ${\theta _{idt}}$ , gives the estimated prevalence of topic $i$ in document $d$ published in year $t$ . To construct annual time series of attention to the topics, we merge the topic-document matrix with metadata on publication year and document length and compute mean yearly attention to each topic. We use weighted means, weighting those documents with a greater number of words more heavily.Footnote 14 Let ${w_{dt}}$ denote the number of words in document $d$ published in year $t$ . The attention to topic $i$ in year $t$ is then:
$${\it\Psi _{it}} = \frac{\sum_{d \in {D_t}} {w_{dt}}\,{\theta _{idt}}}{\sum_{d \in {D_t}} {w_{dt}}} \tag{1}$$
where ${D_t}$ is the set of all documents published in year $t$ . ${\it\Psi _{it}}$ captures the proportion of the corpus occupied by topic $i$ in year $t$ , $i$ = 1,…, 110, $t$ = 1530, …, 1700.Footnote 15 Figures containing the timelines for all ${\it\Psi _{it}}$ appear in Appendix C in Supplementary material.
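As an illustration of this construction, the following minimal Python sketch computes word-weighted yearly means of topic prevalence from a handful of made-up documents. The function name `yearly_attention` and the toy data are hypothetical; the sketch is not the code used to produce the timelines in the appendix.

```python
from collections import defaultdict

def yearly_attention(docs):
    """Word-weighted mean topic prevalence per year.

    docs: iterable of (year, word_count, theta), where theta is the list of
    topic prevalences for one document (summing to 1).
    Returns {year: [Psi_1t, ..., Psi_Kt]}.
    """
    weighted_sum = {}
    total_words = defaultdict(int)
    for year, w, theta in docs:
        if year not in weighted_sum:
            weighted_sum[year] = [0.0] * len(theta)
        for i, th in enumerate(theta):
            weighted_sum[year][i] += w * th  # numerator: sum of w_dt * theta_idt
        total_words[year] += w               # denominator: sum of w_dt
    return {
        year: [s / total_words[year] for s in sums]
        for year, sums in weighted_sum.items()
    }

# Three hypothetical documents and two topics.
docs = [
    (1642, 1000, [0.8, 0.2]),
    (1642, 3000, [0.4, 0.6]),
    (1660, 500, [0.1, 0.9]),
]
psi = yearly_attention(docs)
```

In 1642 the longer document pulls the first topic's share down to 0.5, whereas an unweighted mean of 0.8 and 0.4 would give 0.6.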
Timelines for the themes are constructed using the same principles. For each theme $m$ in Table 1, the average attention to that theme in year $t$ is:
$$\Theta_{mt} = \sum_{i \in {S_m}} {\it\Psi _{it}} \tag{2}$$
where ${S_m}$ is the set of topics comprising theme $m$ , $m$ =1,…, 11, $t$ = 1530,…, 1700. Appendix D in Supplementary material provides figures depicting the resultant timelines.
Some stylized facts about cultural change gleaned from the topic timelines
For a particular topic, periods of rising attention to that topic indicate increasing importance of the corresponding ideas in the cultural discourse. In contrast, times of declining attention suggest a waning interest in the applicable ideas in print culture. Examination of the topic timelines therefore offers a path to establish stylized facts about cultural change. Below, we highlight a subset of the facts that are, in our view, the most intriguing ones.Footnote 16
First, Baconian Theology appears already in the 1570s (i.e., when Bacon was still a teenager) and rises in importance over the next 80 years (Figure 2(a)). Therefore, even before Bacon’s epistemological contributions, theological debates had featured ideas that were later to be associated with Bacon, particularly the importance of experiments or experience to pursue induction. Our results thus suggest that Bacon’s name provided a particularly convenient label for a set of ideas that were already part of the existing culture: his writings served as an important “coordination device” for later thinkers (Mokyr 2016: 73). Hence, our findings suggest a view of Bacon as very much a product of his times, consistent with Harkness (2007) and Grajzl and Murrell (2019). Interestingly, in contrast to the increasing attention to Baconian Theology in the seventeenth century, the more conventional and austere Deductive Theology steadily wanes from the close of the sixteenth century onward (Figure 2(b)).
Second, among other religious topics as well as institutional and political topics touching upon religion, there is clear evidence of a shift toward a less antagonistic form of debate as the seventeenth century progresses. For example, Attacking False Doctrine, which characterizes the beliefs of others with venom and hatred, declines, while a topic that captures debate conducted in non-antagonistic tones, Reasonable Religious Discourse, increases (Figure 2(c) and (d)). Salvation via Virtue, which emphasizes the importance of good works such as charity, rises continually from the mid-seventeenth century (Figure 2(e)). Baconian Theology, also an expression of reasoned views in looking for evidence about how to interpret the world and live a good life, rises to prominence in the second half of the seventeenth century (Figure 2(a)). Hierarchical Church Governance peaks earlier than Autonomous Church Governance and the prevalence of each declines (Figure 2(f) and (g)), indicating that the long struggles between the corresponding sets of ideas became less important. Political Uses of Religion, which expresses judgment of political classes from a religious stance, similarly declines after 1650 (Figure 2(h)).
Third, after 1688, there is a large rise in Economic Lobbying (Figure 2(i)), a finding resonating with Erikson’s (2021) emphasis on the growing importance of economic ideas in the early-modern period. At the same time, within skills, those associated with religion (Catechismal Compilations; Christian Mental Exercises; Practicing Christianity) decline in the latter half of the seventeenth century, while those especially relevant to commerce increase: Student & Practitioner Law, Legal Practice Aids, Using Numbers, and Industrial Arts (Figure 2(j)–(p)).
Viewed as a whole, these patterns suggest that the eighteenth-century relative calm that facilitated economic progress was anticipated by cultural changes that appeared in the latter part of the bellicose seventeenth century: our timelines evidence a turn to a more measured religious and political debate, as well as an increasing emphasis on economic ideas and skills relevant to commercial matters. The cultural origins of the eighteenth-century “nation of shopkeepers” can be clearly seen in the seventeenth century.
The coevolution of ideas about religion, science, and institutions
We next turn to the investigation of the coevolution of ideas on religion, science, and institutions. To this end, we first use our STM estimates to construct the pertinent time series. We then lay out a plausible empirical model and show how that model can be analyzed using standard econometric techniques.
The time series
We construct annual time series of attention to the three themes of interest: religion, science, and institutions. With each theme defined as an aggregate of our STM topics (see Table 1), we must first decide whether to form aggregates of total attention or relative attention to the pertinent topics. Whereas total attention captures the total number of words devoted to a topic in a given year, relative attention reflects the proportion of words. We choose total attention because it better reflects the notion of expansion, and thus flow, of culture.Footnote 17 Moreover, since we are not interested in cultural accumulation that arises purely from a larger population, we measure total attention in per capita terms. Finally, to moderate the influence of outliers, we take the natural logarithm of per capita attention. Thus, we use the following measures of attention to religion, science, and institutions in year $t$ :
$$reli{g_t} = \ln \left( \frac{\Theta_{religion,t}\sum_{d \in {D_t}} {w_{dt}}}{po{p_t}} \right) \tag{3a}$$
$$sc{i_t} = \ln \left( \frac{\Theta_{science,t}\sum_{d \in {D_t}} {w_{dt}}}{po{p_t}} \right) \tag{3b}$$
$$ins{t_t} = \ln \left( \frac{\Theta_{institutions,t}\sum_{d \in {D_t}} {w_{dt}}}{po{p_t}} \right) \tag{3c}$$
where $\Theta_{mt}$ , the average attention to theme $m \in \left\{ {religion,\;science,institutions} \right\}$ in year $t$ , is defined in expression (2). ${w_{dt}}$ is the number of words in document $d$ published in year $t$ , and ${D_t}$ is the set of all documents published in year $t$ . $po{p_t}$ is England’s population in year $t$ .Footnote 18
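A minimal Python sketch of this measure follows, assuming that a theme's attention share is the sum of the shares of its constituent topics and that total attention is that share multiplied by the total number of words published in the year. The function name and all numbers are hypothetical.

```python
import math

def log_per_capita_attention(psi_t, theme_topics, total_words_t, pop_t):
    """Natural log of per capita words devoted to a theme in one year.

    psi_t: list of topic attention shares for year t.
    theme_topics: indices of the topics that make up the theme (S_m).
    total_words_t: total number of words published in year t.
    pop_t: population in year t.
    """
    theta_mt = sum(psi_t[i] for i in theme_topics)  # theme share in year t
    return math.log(theta_mt * total_words_t / pop_t)

# Hypothetical year: four topics, with the theme made up of topics 0 and 2.
psi_t = [0.10, 0.40, 0.15, 0.35]
value = log_per_capita_attention(psi_t, [0, 2], 2_000_000, 5_000_000)
```

Here the theme occupies a quarter of two million words, or 0.1 words per person, so the measure equals ln(0.1). Taking logs means that a doubling of per capita attention moves the series by the same amount at any level.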
Figure 3 plots the three time series defined by (3a), (3b), and (3c). Table E.1 in Appendix E in Supplementary material presents the descriptive statistics.
Empirical model
Our goal is to investigate the interrelated dynamics of ideas on religion, science, and institutions. In particular, we wish to allow the possibility of two-way effects between each pair of the series $reli{g_t}$ , $sc{i_t}$ , and $ins{t_t}$ defined in (3a)–(3c). We therefore study the behavior of the three-variable vector
$${{\boldsymbol y}_t} = \left( reli{g_t},\;sc{i_t},\;ins{t_t} \right)^{\prime} \tag{4}$$
Importantly, the value of each element of ${{\boldsymbol y}_t}$ plausibly depends on its own past values as well as on past values of the other elements of ${{\boldsymbol y}_t}$ . Specifically, the “normal” process of reaction and counter-reaction of ideas in publications makes ${{\boldsymbol y}_t}$ a function of ${{\boldsymbol y}_{t - 1}},\;{{\boldsymbol y}_{t - 2}},\; \ldots $ . This reflects the notion that heightened attention in one area of print culture at time t might change the amount of attention to all areas after t.
But ${{\boldsymbol y}_t}$ is affected by more than this process of lagged reaction and counter-reaction. There are shocks in the form of new ideas that are not a product of the normal process of response to past publications. These are one-time, idiosyncratic or “abnormal,” changes in ${{\boldsymbol y}_t}$ reflecting extrinsic factors: a divine revelation might cause speculation in theology; discovery of a new continent opens up study of new scientific phenomena; an unexpected development in caselaw spurs debate on new constitutional structures.
Finally, there are secular changes in all elements of ${{\boldsymbol y}_t}$ . In sixteenth- and seventeenth-century England, the overall attention in publications to ideas on religion, science, and institutions rose steadily (see Figure 3) due to a variety of factors. These include improvements in printing technology and an increasingly thriving vernacular print culture (Barnard 2002), population growth (Merton 1938: 570–5), improved access to ideas via international commerce (Palma 2016: fn. 25), an expanding influence of the merchant class (Erikson 2021), and early preindustrial economic development (Crafts and Mills 2017).
The structure described above fits a vector autoregressive (VAR) model, which captures the intertwined dynamics of multiple time series (see, e.g., Kilian and Lütkepohl 2017; Sims 1980; Stock and Watson 2001). A cornerstone of macroeconomic methods, VAR has been employed productively by economic historians. Footnote 19
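We posit the following structural VAR:
$${\bf{A}}{{\boldsymbol y}_t} = {{\bf{\Gamma }}_0} + \sum\limits_{i = 1}^3 {{\bf{\Gamma }}_i}{{\boldsymbol y}_{t - i}} + {\bf{D}}t + {{\boldsymbol u}_t} \qquad (5)$$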
where ${{\boldsymbol y}_t}$ is defined above. ${{\bf{\Gamma }}_0}$ is a 3×1 vector of constants. The ${{\bf{\Gamma }}_i}$ , i∈{1, 2, 3}, are 3×3 matrices of coefficients. The model with three lags was selected on the basis of conventional lag length criteria and tests (see, e.g., Kilian and Lütkepohl 2017). Footnote 20 The term $t$ is a linear time trend, capturing the above-noted secular trends in ${{\boldsymbol y}_t}$ . ${\bf{D}}$ is the corresponding 3×1 vector of coefficients. Importantly, the inclusion of the time trend (implicitly detrending the data) implies that our empirical analysis illuminates the variability in attention to each of the three cultural domains around their long-run trends. Finally, ${{\boldsymbol u}_t}$ is a 3×1 vector of orthogonal structural shocks, with ${\rm{E}}\left( {{{\boldsymbol u}_t}{{\boldsymbol u}_{t}^{\prime}}} \right) = {{\boldsymbol I}_3}$ .
Identification assumptions
In model (5), the 3×3 coefficient matrix ${\bf{A}}$ captures how shocks occurring in one domain can immediately affect attention in all domains. Thus, the model (5) allows each element of ${{\boldsymbol y}_t}$ to depend on the contemporaneous values of the other elements of ${{\boldsymbol y}_t}$ . This possibility means that the parameters of (5) are not identified without additional assumptions. However, under reasonable assumptions, estimates of those parameters can be derived from ordinary least squares (OLS) estimates of the following:
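$${{\boldsymbol y}_t} = {{\bf{A}}^{ - 1}}{{\bf{\Gamma }}_0} + \sum\limits_{i = 1}^3 {{\bf{A}}^{ - 1}}{{\bf{\Gamma }}_i}{{\boldsymbol y}_{t - i}} + {{\bf{A}}^{ - 1}}{\bf{D}}t + {{\boldsymbol e}_t} \qquad (6)$$
where ${{\boldsymbol e}_t} \equiv {{\bf{A}}^{ - 1}}{{\boldsymbol u}_t}$ .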
Our approach to making the requisite assumptions rests on short-run restrictions, a bedrock of the VAR literature in macroeconomics (see, e.g., Caldara and Iacoviello 2022; Christiano et al. 1999; Ramey 2016). Footnote 21 Specifically, we assume that (A1) shocks to $sc{i_t}$ or $ins{t_t}$ do not contemporaneously impact $reli{g_t}$ and (A2) shocks to $ins{t_t}$ do not contemporaneously affect $sc{i_t}$ . In Appendix F in Supplementary material, we provide a detailed justification of these assumptions in light of the nature of our data and the workings of English society in the time period under consideration. But we do not claim to provide the last word on these issues. Issues of causality among multiple endogenous variables are notoriously difficult to solve, as attested by modern-day macroeconomics (Nakamura and Steinsson 2018). Our approach – the first ever applied to the coevolution of religion, science, and institutions in a historical setting – offers one route to do so within a viable estimation framework.
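Given assumptions (A1) and (A2), justified in Appendix F in Supplementary material, ${\bf{A}}$ in (5) and, consequently, ${{\bf{A}}^{ - 1}}$ in (6) are lower triangular, and the residuals from the reduced-form VAR in expression (6) can be expressed as:
$$\begin{pmatrix} e_t^{relig} \\ e_t^{sci} \\ e_t^{inst} \end{pmatrix} = \begin{pmatrix} {a_{11}} & 0 & 0 \\ {a_{21}} & {a_{22}} & 0 \\ {a_{31}} & {a_{32}} & {a_{33}} \end{pmatrix} \begin{pmatrix} u_t^{relig} \\ u_t^{sci} \\ u_t^{inst} \end{pmatrix}$$
where ${a_{ij}}$ denotes the $(i,j)$ element of ${{\bf{A}}^{ - 1}}$ and the superscripts index the religion, science, and institutions components of ${{\boldsymbol e}_t}$ and ${{\boldsymbol u}_t}$ .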
With this form, estimates of the structural parameters of (5) can be derived from OLS estimates of (6). Importantly, one can readily obtain estimates of ${{\boldsymbol u}_t} = {\bf{A}}{{\boldsymbol e}_t}$ , the vector of structural shocks. As we stress below, these estimated shocks add new information to the historical record on the occurrence of intellectual developments that our estimates suggest were not simply the product of reactions to earlier intellectual developments.
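The estimation steps described in this subsection can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' code: the data are simulated, the function names `fit_var_ols` and `structural_shocks` are hypothetical, and the Cholesky factor of the residual covariance is used as the conventional way to obtain a lower-triangular ${{\bf{A}}^{ - 1}}$ with the ordering (religion, science, institutions).

```python
import numpy as np

def fit_var_ols(y, p=3):
    """OLS fit of a reduced-form VAR(p) with a constant and a linear trend.

    y has shape (T, k). Returns the coefficient matrix B with shape
    (2 + k*p, k) and the residuals with shape (T - p, k).
    """
    T, k = y.shape
    rows = []
    for t in range(p, T):
        lags = [y[t - i] for i in range(1, p + 1)]
        rows.append(np.concatenate(([1.0, float(t)], *lags)))
    X = np.asarray(rows)
    Y = y[p:]
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return B, Y - X @ B

def structural_shocks(resid):
    """Recover a lower-triangular A^{-1} and the shocks u_t = A e_t."""
    sigma = resid.T @ resid / resid.shape[0]   # residual covariance
    A_inv = np.linalg.cholesky(sigma)          # lower triangular, e_t = A_inv u_t
    u = np.linalg.solve(A_inv, resid.T).T      # u_t = A e_t
    return A_inv, u

# Simulated stand-in for the three series (assumption: not the actual data).
rng = np.random.default_rng(0)
T, k = 171, 3
L_true = np.array([[1.0, 0.0, 0.0], [0.5, 1.0, 0.0], [0.2, 0.3, 1.0]])
y = np.zeros((T, k))
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + 0.01 * t + L_true @ rng.standard_normal(k)

B, resid = fit_var_ols(y, p=3)
A_inv, u = structural_shocks(resid)
```

By construction the recovered shocks have an identity sample covariance matrix, which mirrors the normalization ${\rm{E}}({{\boldsymbol u}_t}{\boldsymbol u}_t^{\prime}) = {{\boldsymbol I}_3}$ .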
Impulse-responses: a first look at coevolutionary dynamics
We first investigate the coevolution of ideas about religion, science, and institutions using impulse response functions (IRFs). The IRFs summarize the average expected response of each element of ${{\boldsymbol y}_{t + s}}$ at time horizon $s \ge 0$ following a one-time shock to one specific element of ${{\boldsymbol y}_t}$ . We model the initial change as a one-time, one-standard-deviation structural shock (i.e., a shock of typical size) that elevates the attention to the pertinent theme (i.e., raises the value of the applicable element of ${{\boldsymbol y}_t}$ ), a manifestation of an exogenous “innovation” in culture. We estimate cumulative IRFs in attention to each of the three areas over a 30-year period, reporting 90-percent confidence intervals. Footnote 22, Footnote 23
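As a rough sketch of how such cumulative responses can be computed, the Python snippet below builds the moving-average coefficients of a VAR from its reduced-form lag matrices, scales them by the impact matrix ${{\bf{A}}^{ - 1}}$ , and sums over horizons. The function name `cumulative_irf` and the toy coefficient values are assumptions for illustration; confidence intervals are not computed here.

```python
import numpy as np

def cumulative_irf(phis, A_inv, horizon):
    """Cumulative structural impulse responses.

    phis: list of (k, k) reduced-form lag matrices, phis[0] for lag 1.
    A_inv: (k, k) impact matrix mapping structural shocks to residuals.
    Returns an array of shape (horizon + 1, k, k); entry [s, i, j] is the
    cumulative response of variable i, s periods after a unit shock j.
    """
    k = A_inv.shape[0]
    p = len(phis)
    psi = [np.eye(k)]                     # moving-average coefficients
    for s in range(1, horizon + 1):
        psi_s = np.zeros((k, k))
        for j in range(1, min(s, p) + 1):
            psi_s += phis[j - 1] @ psi[s - j]
        psi.append(psi_s)
    theta = np.array([m @ A_inv for m in psi])   # structural responses
    return np.cumsum(theta, axis=0)

# Toy example (assumed values): a VAR(1) with persistence 0.5 in each series.
phis = [0.5 * np.eye(3)]
cirf = cumulative_irf(phis, np.eye(3), horizon=30)
```

In this toy case the own-shock cumulative response after two periods is 1 + 0.5 + 0.25 = 1.75 and converges toward 2 as the horizon grows.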
Figure 4 summarizes the results. In each of the nine subfigures, the horizontal axis shows the number of years since the shock. The vertical axis measures the cumulative change in the pertinent series at each horizon, expressed in proportions since our series are measured in natural logs (see expressions (3a)–(3c)). For interpretation of Figure 4, it is helpful to know that a shock to $reli{g_t}$ causes an immediate jump in $reli{g_t}$ (i.e., increases attention to ideas about religion) of 45 percent. Similarly, a shock to $sc{i_t}$ immediately increases $sc{i_t}$ by 75 percent and a shock to $ins{t_t}$ immediately elevates $ins{t_t}$ by 44 percent.
The subfigures along the main diagonal (Figure 4(a), (e) and (i)) capture the extent to which a shock in one domain spurs subsequent developments in the same domain. The corresponding IRFs are positive and statistically significant at all horizons: innovations in a given cultural domain do not dissipate but rather lead to long-lasting elevated attention to that same domain. For example, following a shock to $sc{i_t}$ , the 30-year effect on $sc{i_t}$ is twice the size of the immediate rise in $sc{i_t}$ (Figure 4(e)). This effect corresponds to the mechanism alluded to in Merton’s (1938: 436–69) theorizing, that “a fixed order must prevail in the appearance of scientific discoveries; each discovery must await certain prerequisite [scientific] developments.”
The subfigures off the main diagonal show how innovations (shocks) in one cultural domain affect future developments in the other two domains. Most importantly, innovations in ideas on religion spur strong responses in attention to the other two cultural domains (Figure 4(d) and (g)). The 30-year effect of a shock to $reli{g_t}$ on $sc{i_t}$ is a 109 percent rise in $sc{i_t}$ (Figure 4(d)). Thus, religious thought was central in spurring developments in ideas about science as theorized by Merton (1938: 440): “It was precisely Puritanism which built a new bridge between the transcendental and human action, thus supplying a motive force for the new science.”
Innovations in ideas about science elevate attention to ideas about religion (Figure 4(b)): the 30-year effect of a shock to $sc{i_t}$ on $reli{g_t}$ is a 43 percent rise in $reli{g_t}$ . The coevolution between ideas on religion and ideas on science was therefore a “reciprocal interaction” (Merton 1938: 434), but especially strong from religion to science (Merton 1938; Shapin 1996).
Innovations in ideas about religion also stimulate ideas about institutions: the 30-year effect of a shock to $reli{g_t}$ on $ins{t_t}$ is a 28 percent increase in $ins{t_t}$ (Figure 4(g)). This finding resonates with the recent scholarship emphasizing the general relevance of religious ideas for long-run institutional development. Footnote 24
In contrast, innovations in ideas about institutions lead to a relatively small and statistically insignificant increase in attention to ideas about religion (Figure 4(c)). There is no effect of innovations in ideas about institutions on attention to ideas about science (Figure 4(f)).Footnote 25
The temporal incidence of shocks and relevance of revolutions
Shocks to ${{\boldsymbol y}_t}$ (the ${{\boldsymbol u}_t}$ in (5)) are one-time changes that cannot be predicted within the normal process of evolution and coevolution. They might be due to acts of genius (Newton), a political event (Civil War), or a monarch changing religion (Henry VIII). An understanding of the timing of the shocks provides insight into the times when cultural evolution took (or did not take) an unexpected turn relative to what would have been predicted by the model’s “normal” dynamics.
Figure 5 plots the temporal paths of five-year centered moving averages of the estimated shocks to $reli{g_t}$ , $sc{i_t}$ , and $ins{t_t}$ . The vertical axis measures the magnitude of the shocks. The absolute value of the typical moving average displayed in Figure 5 is approximately one-third.
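The smoothing used for this kind of plot can be sketched as follows. This is only an illustration in Python: the helper name `centered_moving_average` is hypothetical, and dropping the two years at each end of the sample (rather than padding them) is an assumption about edge handling.

```python
import numpy as np

def centered_moving_average(x, window=5):
    """Centered moving average; returns len(x) - window + 1 values.

    Output element i is the mean of x[i : i + window], which is centered
    on original index i + window // 2 when window is odd.
    """
    x = np.asarray(x, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="valid")

# Toy series standing in for one column of estimated structural shocks.
smoothed = centered_moving_average(np.arange(7.0))
```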
The largest positive shocks to $reli{g_t}$ (Figure 5(a)) occur in the years 1547–1549, 1565–1567, 1579–1583, and 1650–1659. During these periods, attention to religion evidences positive shocks that are about three times as large as typical shocks during 1530–1700. The first period (1547–1549) is at the beginning of Edward VI’s reign, when a more radical form of Protestantism was being introduced. The second period (1565–1567) is in Elizabeth’s first decade, when Calvinist ideas were being strongly promoted by those outside the established church. Footnote 26 The third period (1579–1583) coincides with a time when Elizabeth was facing internal and external challenges from Catholics, with Puritan ideas finding more favor. Footnote 27 The fourth period (1650–1659), the Interregnum when Puritans controlled the country, evidences the largest amount of unusual attention to religion.
In the last four decades of the seventeenth century, the only unusually large shocks to $reli{g_t}$ are negative: in 1663–1668 and 1686–1689. During these times, attention to religion experiences negative shocks that are about two-and-a-half times the size of typical shocks occurring between 1530 and 1700. During the first of these periods (1663–1668), the country saw a backlash against the Puritan ideas that had been so prominent during the Interregnum, with pathways to the spread of such ideas largely blocked by the established church (MacCulloch 2004: 544). The latter period (1686–1689) coincides with the reign of an avowedly Catholic King, James II, who for a brief time threatened all flavors of Protestantism, discouraging attention to the corresponding ideas. Thus, echoing the conclusion of the previous paragraph, our religion variable clearly reflects the ebb and flow of the attention to Puritan ideas.
Large positive shocks to $ins{t_t}$ occur in the 1580s (Figure 5(c)). This period is a particularly fecund one in caselaw, the reports of which are not contained in the EEBO-TCP corpus, but the reverberations would certainly enter that corpus.Footnote 28 After that, there are two time periods featuring prominent positive shocks to $ins{t_t}$ : 1636–1647, the era before and during the Civil War; and 1679–1683, coinciding with the exclusion crisis that foreshadowed the Glorious Revolution. Our estimates are thus more consistent with the idea that debates on institutional development spurred revolutions than the notion that revolutions spurred debates on institutional development.
With regard to the Glorious Revolution (1688–1689), our evidence does not lend support to the North and Weingast (1989) view that it marked a critical juncture in English institutional history. The years 1688–1700 do not feature the kind of unusually large shocks (positive or negative) to $ins{t_t}$ that one would expect to observe had the Glorious Revolution fundamentally impacted ideas on institutions. Moreover, with the period leading up to the Glorious Revolution witnessing large negative shocks to $reli{g_t}$ (1686–1689; Figure 5(a)) and large positive shocks to $ins{t_t}$ (1679–1683; Figure 5(c)), our evidence indicates that the struggles leading up to the Glorious Revolution were not centered on religion, but rather on institutional, and especially constitutional, matters. Footnote 29
Finally, the era from 1558 to 1610 saw a series of comparatively large positive shocks to $sc{i_t}$ (Figure 5(b)), an observation that locates the stirring of an English scientific revolution earlier than is conventional (Wootton 2015). In contrast, the negative shocks to $sc{i_t}$ from 1618 to 1650 can be explained by the existence of deep religious and political cleavages. An emphasis on coping with deep divisions within the nation and the accompanying instability would have naturally led to a decline in attention to science.
What drove attention to science? The Royal Society and alternatives
Did the founding of the Royal Society in 1660 elevate attention to science? And more generally, which shocks were particularly important as drivers of attention to science and when? To shed light on these questions, we investigate the $sc{i_t}$ series using historical decomposition, a VAR tool that allows us to apportion the overall fluctuation of $sc{i_t}$ among the effects of different types of shocks (i.e., current and past exogenous events). This tool combines information from the pertinent IRFs (Figure 4(d), (e) and (f)), which show how attention to science on average responded to specific types of shocks, and the timing of the different types of shocks (Figure 5).
Figure 6 presents a historical decomposition of the $sc{i_t}$ series for the 1650–1675 period, the era immediately preceding and following the 1660 founding of the Royal Society. In the figure, “total” indicates the total deviation of $sc{i_t}$ from the long-run mean predicted by the model. This deviation is then apportioned among three possible sources: past and current shocks to $sc{i_t}$ (dark gray line), $ins{t_t}$ (light gray), and $reli{g_t}$ (black).
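The logic of a historical decomposition can be illustrated with a short Python sketch. Everything here is assumed for illustration: the function name `historical_decomposition`, the toy VAR(1) coefficients, and the simulated shocks. Deterministic terms and initial conditions are set to zero, so the "total" is simply the simulated series itself.

```python
import numpy as np

def historical_decomposition(theta, u):
    """Apportion each series among the cumulated effects of each shock.

    theta: (H, k, k) non-cumulative structural responses with H >= T.
    u: (T, k) structural shocks.
    Returns hd of shape (T, k, k); hd[t, i, j] is the contribution of
    current and past shocks of type j to variable i at time t.
    """
    T, k = u.shape
    hd = np.zeros((T, k, k))
    for t in range(T):
        for s in range(t + 1):
            # Column j of theta[s] is scaled by the shock of type j at t - s.
            hd[t] += theta[s] * u[t - s]
    return hd

# Toy VAR(1), ordered (relig, sci, inst); all values are assumptions.
Phi = np.array([[0.5, 0.1, 0.0], [0.2, 0.4, 0.0], [0.1, 0.0, 0.3]])
A_inv = np.array([[1.0, 0.0, 0.0], [0.4, 0.8, 0.0], [0.2, 0.1, 0.6]])
rng = np.random.default_rng(1)
T = 20
u = rng.standard_normal((T, 3))
theta = np.array([np.linalg.matrix_power(Phi, s) @ A_inv for s in range(T)])
hd = historical_decomposition(theta, u)

# Simulate the same process directly to check that the parts add up.
y = np.zeros((T, 3))
prev = np.zeros(3)
for t in range(T):
    prev = Phi @ prev + A_inv @ u[t]
    y[t] = prev
```

Here `hd[:, 1, 0]` would be the path of the science series attributable to religion shocks, the analogue of the black line described above.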
The central implication of Figure 6 is that the founding of the Royal Society leaves no visible trace in our results. Specifically, even though the years 1660–1663 exhibit a somewhat higher than expected attention to science (the total is positive), this is by and large an effect of positive cumulative shocks to $reli{g_t}$ (the black line). These religious shocks, which on average elevated attention to science (Figure 4(d)), occurred especially during the Interregnum (Figure 5(a)), when Puritan sentiments were particularly strong. In contrast, shocks to $sc{i_t}$ , which on average boosted subsequent attention to science (Figure 4(e)), were of limited size throughout the 1650s and the early 1660s (Figure 5(b)). As such, cumulative shocks to $sc{i_t}$ had little effect on attention to science during 1660–1663 (the dark gray line). The same is true of the shocks to $ins{t_t}$ (the light gray line).
Our results are thus not consistent with the hypothesis that the founding of the Royal Society quickly led to an unusual outpouring of science.Footnote 30 Our estimates are more consistent with an interpretation that developments in religion during the Interregnum later heightened attention to ideas about science – ideas that were conceivably important for the founding of the Royal Society in 1660.
Figure 7 shows an analogous historical decomposition of the $sc{i_t}$ series for an earlier period, 1560–1590. The figure reveals that between 1564 and 1587, shocks to $reli{g_t}$ boosted attention to science in all but one year (the black line is above zero). As discussed in the previous subsection, during that time the large positive shocks to $reli{g_t}$ , which on average increased $sc{i_t}$ (Figure 4(d)), plausibly arose from the rising influence of Puritanism.
Figure 7 therefore provides further direct evidence in support of the Merton thesis: ideas about religion stimulated attention to ideas about science. However, Figure 7 also contextualizes Merton’s theory by providing novel insights regarding the timing of the pertinent influence. Merton (1938: 414–8) himself, as well as Mokyr (2016: Ch. 13), focused on the role of religious sentiments in spurring the development of science in the seventeenth century. Our results indicate that these effects were particularly strong already in the last half of the sixteenth century.
Was there secularization of science and institutional thought?
Finally, we draw on our estimated historical decompositions to assess whether the strength of the influence of attention to religion on science and institutional thought changed over time. If the influence of religion waned over time, this would be evidence of secularization of thought in these two areas.
Figure 8 shows five-year moving averages of the relative importance of cumulative shocks to religion as drivers of attention to science (part (a)) and institutions (part (b)). To construct the figure, we use the estimates of the historical decomposition for each of the $sc{i_t}$ and $ins{t_t}$ series. We then calculate the percentage share of the effects of cumulative shocks to $reli{g_t}$ in the sum of the absolute values of the contributions of all three types of cumulative shocks. This is a measure of the relative importance of the effects of cumulative shocks to $reli{g_t}$ on $sc{i_t}$ and, analogously, on $ins{t_t}$ .
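A minimal Python sketch of this share calculation follows. The function name `relig_share` and the toy numbers are hypothetical, and taking the absolute value of the religion contribution in the numerator is an assumption about how the share is defined; years in which all three contributions are exactly zero would need separate handling.

```python
import numpy as np

def relig_share(contrib):
    """Percentage share of religion shocks in the total absolute contribution.

    contrib: (T, 3) array of historical-decomposition contributions to one
    series, with columns ordered (relig, sci, inst).
    """
    contrib = np.asarray(contrib, dtype=float)
    total_abs = np.abs(contrib).sum(axis=1)
    return 100.0 * np.abs(contrib[:, 0]) / total_abs

# Toy contributions for two years (assumed values).
share = relig_share([[1.0, -1.0, 2.0], [3.0, 0.0, -1.0]])
```

With these toy values the shares are 25 and 75 percent.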
The key finding implied by Figure 8 is that innovations in religious thought were an important driver of attention to both science and institutions throughout the period 1530–1700. There is no evidence of any diminishing relevance over time of shocks in religion to debates on science and institutions. In particular, our findings are inconsistent with those of Almelhem et al. (2023: 3) who claim that, in England, “as early as 1600, and certainly by the mid-seventeenth century, there was little overlap of scientific and religious topics.”
Concluding discussion
Our objective has been, first, to present a quantitative macroscope of English print culture during the pivotal sixteenth and seventeenth centuries and, second, to examine the coevolution of ideas on religion, science, and institutions. To this end, we analyzed a major corpus of early-modern publications, combining unsupervised machine learning methods for analysis of text-as-data with time-series econometrics.
Our analysis generated five main substantive findings. First, as the early-modern era unfolded, English religious and political discourse became less antagonistic and, in line with Erikson (2021), witnessed a rise in prominence of topics relevant to the economy and commerce. Within the economics discipline, the existing literature on early-modern England has highlighted institutional developments as the key to subsequent sustained economic progress (Acemoglu and Robinson 2012; Henriques and Palma 2023; North et al. 2009). Our inquiry provides a different perspective, suggesting that the eighteenth-century stability of the “nation of shopkeepers” was foreshadowed by important sixteenth- and seventeenth-century cultural changes.
Second, the epistemology that came to be associated with Bacon was present in theological debates already before Bacon’s epistemological contributions. Our analysis thus casts new light on an influential strand of recent research stressing the role that Francis Bacon played in defining a set of cultural beliefs that were conducive to later economic development (Mokyr 2005, 2016). Our evidence tends to support a view of Bacon as a “synthetic thinker” (Mokyr 2016: 74) or “an influential mouthpiece for a culture that already existed” (Grajzl and Murrell 2019: 112) rather than a fully fledged “intellectual innovator” (Mokyr 2016: 68).
Third, congruent with Merton’s (1938) seminal thesis, innovations in religious ideas stimulated attention to science, especially at times when Puritanism was prominent in religious discourse. We show that the link from religion to science was strong already in the latter half of the sixteenth century, that is, much earlier than stressed by Merton (1938: 414–6) or more recently by Mokyr (2016: Ch. 13), who both emphasized developments after the mid-seventeenth century. Moreover, we demonstrate that the interplay between religion and science in early-modern English print culture was bidirectional.
Fourth, neither science nor institutional thought evidences secularization as the seventeenth century closes. Our evidence is thus not consistent with recent analysis by Almelhem et al. (2023: 35) who use texts available with the HathiTrust Digital Library (HDI) to conclude that, in Britain, “the ‘secularization’ of science was entrenched from the beginning of the Enlightenment.” Importantly, our analysis utilizes the EEBO-TCP corpus, which focuses on the pre-1700 era. This is not true of the HDI corpus available to Almelhem et al., where the coverage of the pre-1700 period is sparse. Moreover, unlike Almelhem et al.’s conclusion, our finding is based on an explicit VAR model that naturally lends itself to the study of coevolutionary patterns (see, e.g., Grajzl and Murrell 2022).
Fifth, the Civil War and the Glorious Revolution did not spur debates on institutions, nor did the founding of the Royal Society markedly elevate attention to science. Our empirical analysis thus provides a novel interpretation of major historical events during a crucial epoch of England’s cultural history. We show that a focus on institutional debates preceded the Civil War. Then, the emphasis on religion during the ensuing Interregnum led to an enhanced focus on science at the time of the formation of the Royal Society. In addition, attention to institutions did not noticeably change around the time of the Glorious Revolution. Thus, rather than supporting the view that 1688–1689 was a pivotal moment in English institutional history, our evidence from print culture echoes recent research emphasizing the gradual character of England’s institutional development (Henriques and Palma 2023; Hodgson 2017; Murrell 2017; Ogilvie and Carus 2014).
A core product of this research has been the creation of a machine learning digest of sixteenth- and seventeenth-century English culture, a self-standing dataset available to researchers for further exploration. Our construction and provision of data is analogous to many other data-producing exercises in social-science and cliometric research that combine large amounts of micro data to build a macro dataset that can be used as an input into further research (see, e.g., Broadberry et al. 2015). At the same time, the availability of our machine-learning-generated data allows others to easily reassess our findings by applying, for example, different assumptions about the construction of cultural aggregates, different identification assumptions, and, indeed, wholly different econometric approaches.
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.1017/ssh.2024.17
Acknowledgments
We thank Katherine Calloway for help with the history of ideas, Boragan Aruoba for econometric advice, Franz Klein for facilitating use of the BSWIFT cluster, Paul Schaffner for help in downloading and understanding the TCP corpus, two anonymous referees for crucial comments, and the Editors for help in polishing the paper.