Introduction
As with any dynamic field of research, the goals and methods of business history are frequently reflected upon.Footnote 1 For example, the focus on big American manufacturers, following the pioneering work of Alfred D. Chandler and John Kenneth Galbraith, was questioned by Youssef Cassis.Footnote 2 Business history today has moved beyond the narrow confines of the United States, and big business no longer dominates research, although there are still arguments in favor of a wider focus.Footnote 3 Building on this, the aim of the present work is twofold. First, we present a new database of Danish dairy cooperatives (“creameries”) 1898–1945, which will allow for a variety of directions for future research due to three qualities: a long time span, a large sample, and in-depth coverage through many variables. Second, the insights for the field of business history, which can be gained from quantitative analyses, provide a motivation for a “new” business history.
Several business historians, most recently Abe de Jong, have argued that case research based on rich primary sources, which dominates the field of business history, could usefully contribute to a theorization of the field, which in turn would allow it to connect more easily to related disciplines, such as economics and economic history.Footnote 4 We are economic historians who have contributed to business history in the past. There is an increasing focus on crossing interdisciplinary boundaries today, and as the theoretical assumptions economics has been based on are increasingly called into question, the core insights from business history have an opportunity to teach important lessons. Indeed, there is a general consensus that economics and economic history today are dominated less by homo economicus and more by advanced statistical methods.Footnote 5 Nevertheless, as the Nobel Laureate George Akerlof has so pertinently pointed out, this has meant that economics stands guilty of the “sins of omission,” whereby economic research ignores important problems when they are hard to tackle using its methodologies taken from the “hard” sciences.Footnote 6 Although this is beyond the scope of the present work, one can easily imagine how the qualitative work of business historians might help fill this gap through knowledge of entrepreneurs and entrepreneurship, and how firms act within their political and social context. In short, economic and business history have much to teach each other, and we hope that the data we present here will whet the appetite of business historians as a complement to work more qualitative in focus.
The type of data we present here allows for the consideration of industries made up of many small firms, which typically do not leave behind the sort of comprehensive archives needed for case studies. Our database covers 1,419 Danish cooperative creameries over the period 1898–1945 and 131 variables. The rapid spread of industrialized dairying from the 1880s until World War I, based on the use of a new technology—the steam-powered automatic cream separator, a centrifuge—was a defining moment in the development of Denmark, leading to its rapid convergence with the leading economies of the time, and an unusually balanced growth path between town and country.Footnote 7 This case helps to shed light on the business history of cooperation, in which there has been a tendency to ignore cooperatives and other noncapitalist enterprises with a few notable exceptions, such as Patrizia Battilani and Vera Zamagni on Italian cooperative enterprises; John F. Wilson, Anthony Webster, and Rachael Vorberg-Rugh on the Co-operative Group in the UK; Eva Fernández on the diffusion of cooperatives in thirteen countries 1880–1930; and Jordi Planas and Samuel Garrido on cooperative wine making in Spain.Footnote 8
De Jong terms the use of empirical techniques in business history “bizhismetrics.”Footnote 9 We demonstrate how our use of big data allows us, for example, to consider firm-level productivity and the determinants of this, and to examine the issue of survival bias, connecting to the work of Laura Panza, Simon Ville, and David Merritt, who use a survival analysis on firm-level data from the Australian Stock Exchange 1901–1930 to investigate the determinants of longevity of firms, finding that firm size is a poor predictor but that age and profitability are statistically significant.Footnote 10 In our case, none of the individual creameries we consider have survived. Waves of mergers over the course of the twentieth century led to a near monopoly for the large Danish-Swedish dairy cooperative Arla. Mads Mordhorst employed a cultural-historical framework, narrative theory, and Pierre Nora’s notion of memory, and argued that the process of globalization has led to Arla being seen as monopolistic and undemocratic, in contrast to the Danish national story of the democratic development of the country, serving to illustrate the importance of our case beyond the confines of economic and business history.Footnote 11 It should be noted that our database also contains information on, for example, technology, the adoption of which reflects the sort of management decisions of interest to business historians, and we present other potential avenues for future research in the conclusion. Quantitative methods are prevalent in teaching and research in business schools today, as they are seen as important core competencies in decision making. This is also the origin of our database, as data was seen as the foundation of cooperative management decisions—the reason, indeed, that the data was collected and published in the first place by contemporaries.
The remainder of this paper proceeds as follows. In the following section, we provide our argument for quantitative business history. In section 3, we explain the background to the collection of the creamery statistics and the challenges that were faced, and in section 4 we provide an overview of the database. Section 5 presents some applications of the data, and section 6 concludes.
Toward a Quantitative Business History
During the past twenty-five years or so, there has been an expansion in business history research, which has seen a shift away from the study of individual firms toward a “richer understanding of business systems” and a methodological expansion from qualitative to quantitative research methods. Some have therefore argued for a plurality of methodologies, among them quantitative hypothesis testing along the lines of that used in the “new” economic history, although business history as a discipline has traditionally shied away from this.Footnote 12 This mirrors earlier calls for a broader business history encompassing more than single company descriptive case studies toward a more generalizable and analytical discipline, although one that is not isolated from other social sciences.Footnote 13 Thus, there have been calls for business history to move beyond the traditional Chandlerian and post-Chandlerian descriptive case study of core industries and large firms.Footnote 14
A special issue of Enterprise & Society from 2013 on “How to Do Business History” took up this very debate. Daniel M. G. Raff noted the survival bias evident in much of business history, argued that we should try to understand all participants in a market, and pushed for a focus on the choices and processes behind the various outcomes reached.Footnote 15 He noted that “it is easiest to carry out this forward-looking approach to history writing when there is information about a genuine cross-section of the population” but that “this is rarely obtained in business history.” Therefore, a major contribution of the present work is that we present such a cross section. Similar issues were taken up in another special issue of Enterprise & Society in 2020, with Raff again arguing for the importance of understanding how decisions are made and Philip Scranton backing this up, noting that, although business history has moved beyond the narrow confines of big business, “outcomes are contingent, unstable, and temporary, and … they are also plainly the wrong place to start an inquiry.Footnote 16 It thus becomes crucial to ask questions that locate actors at the outset (or in the middle) of emergent processes whose trajectory is only partly understood and whose success or failure is hardly within their span of control.” Our database is rooted in the microlevel but allows for quantitative work with implications for the macrolevel, covering a period when the dairy sector expanded before World War I, faced economic difficulties during the war and immediately afterward, and then began a gradual process of consolidation. The industry was highly competitive, both on the domestic market and against rivals abroad, and the period marked considerable technological and institutional changes. As Geoffrey Jones and Walter A. Friedman have argued, “Rigorous analysis of large datasets has been shown to transform our understanding of generalizations based on qualitative research.”Footnote 17 In fact, the dairy industry in Denmark teaches us that, if business history only focuses on qualitative accounts, then the histories of many smaller firms will be lost; much of the information we have on them survives in the form of the data that we describe below.
The dearth of quantitative research in business history is surprising, given the prevalence of quantitative methods in teaching and research in business schools. Broadly speaking, quantitative methods are widely taught within business schools, as they are seen as key competencies for managers to make informed decisions.Footnote 18 Within wider business research, quantitative methodologies have been a dominant strategy in research.Footnote 19 This is evident from a recent survey of all empirical articles published in the Journal of International Business Studies between 1970 and 2019 that found 87 percent of articles were quantitative in nature.Footnote 20 Similarly, David Strang and Kyle Siler highlight the methodological diversity within publication trends in Administrative Science Quarterly from 1958 to 1970, showing the increased engagement of social scientists with organization studies and how this has led to a wider array of approaches with a strong quantitative element.Footnote 21 As it is widely acknowledged that qualitative and quantitative methods are appropriate for different research questions and contexts, we are not arguing for a superior methodological approach.Footnote 22 Rather, the relative absence of one is surprising, given the evolution within social sciences and their influence on business research, thus indicating that there could be scope for greater engagement of business historians outside traditional academic silos if a broader methodological range is used.Footnote 23
The creation of new ventures is a core theme in management, organizational theory, and strategic management.Footnote 24 Explaining variation in the survival, termination, and exit of new ventures requires longitudinal data (repeated measurement of the same subject over time), such as the U.S. Longitudinal Business Database (LBD).Footnote 25 Using data from the U.S. LBD, Vincent Sterk, Petr Sedláček, and Benjamin Pugsley find that ex ante variation in firm characteristics (such as business models) is more important than persistent ex post shocks (such as demand shocks) in explaining the longevity of U.S. start-ups.Footnote 26 Other studies use different longitudinal surveys to assess the influence of entrepreneurial optimism on the establishment of new ventures and the changing risk preferences as ventures age.Footnote 27 Elsewhere, Panayiotis Georgallis, Glen Dowell, and Rudolphe Durand look at the factors that lead to state support for new industries, finding this to be greater where the sectors were uncontested.Footnote 28 Recent work has seen attempts to finesse the research method by separating the likelihood of survival from the magnitude of growth.Footnote 29
Another avenue of interest is the wide disparity of productivity among businesses. Productivity differences across firms (businesses) have been widely documented, even among businesses within the same industry, and this has led to continued use of firm-level (business-level) data to explore this issue in more detail.Footnote 30 The work of Erik Brynjolfsson and Lorin Hitt is an example of an early study using firm-level data to document heterogeneity in productivity of firms adopting information technology.Footnote 31 Another aspect of this, which has been focused on recently, concerns how firms respond to uncertainty in business decision making. Hitesh Doshi, Praveen Kumar, and Vijay Yerramilli find that uncertainty affects firm investment decisions, but that this varies by firm size, with smaller firms more affected.Footnote 32 Alon Kalay, Suresh Nallareddy, and Gil Sadka find a close relation between firm-level performance shocks and economy-wide uncertainty, and show how the two can interact.Footnote 33 Finally, Nicholas Bloom et al. show how manager expectations are influenced by historical experience and how these expectations influence investment decision making.Footnote 34
The Case of the Danish Creameries and the “Operational Statistics”
The case of the emergence of the Danish butter industry is well known, and a sizeable literature, both Danish and international, has been devoted to exploring how a small and relatively backward country, reeling from the loss of most of its territory (Norway in 1814 and Schleswig and Holstein in 1864), managed to create a world-beating dairy industry based on well over one thousand small, democratically organized cooperatives of peasant farmers. Denmark became the largest exporter of butter in the world prior to World War I, successfully outcompeting traditional suppliers of the important British market, such as Ireland, and would later become an important provider of know-how to other countries, such as the United States and Russia, the latter of which would go on to become the second largest exporter before the 1917 October Revolution.Footnote 35 Common to these explanations is an impression that the cooperatives rolled over Denmark in something of a uniform wave, which is often contrasted with relative failure in other countries. In the present work, as a first application of the database, we use a simple fixed effects setup to demonstrate that there was also heterogeneity between regions of Denmark, something that contests a homogeneous linear business history of creameries in Denmark.Footnote 36
Many reasons have been given for the success of the Danish cooperative creameries. Ingrid Henriksen demonstrates that cooperation was particularly well suited to dairying and that, given the technology of the time, cooperatives were ideally suited to overcoming the problems of potential lock-in and asymmetric information, although they were not necessarily technologically “savvy” and often dragged their feet in the implementation of new technologies.Footnote 37 Denmark was, however, quick to embrace “winter dairying,” which allowed producers to enjoy higher prices at times of the year when more traditional operators were unable to produce.Footnote 38 The homogeneous population of Denmark, in contrast to the divisions of Ireland, has also been suggested as a reason that cooperation found special favor, although Henriksen et al. 2012 and Henriksen et al. 2015 demonstrate that social cohesion was not enough, and explain that the legal system, in particular the ability to enforce contracts, was also important. Eoin McLaughlin and Paul Sharp demonstrate that a lack of sizeable proprietary competitors also explained Denmark’s relative success compared to Ireland.Footnote 39 Henriksen et al. 2011 demonstrate that the productivity of the cooperatives in terms of butter production was owed mostly to their rapid adoption of the centrifuge rather than their organizational form, and Markus Lampe and Sharp show that farmers were able to increase milk yields through the introduction of multiple innovations in, for example, breeding and feeding.Footnote 40 Sofia Teives Henriques and Paul Sharp explain that Denmark was also fortunate in having a particular geography with a long coastline, which allowed coal, a vital input, easy and cheap access to the entire country, although this followed centuries of searching for coal in a country where it was practically nonexistent.Footnote 41 Finally, Lampe and Sharp, and Nina Boberg-Fazlic et al., have demonstrated the role of traditional landed elites who introduced new practices to Denmark from the eighteenth century, and through continuous innovation laid the basis for the rapid spread of industrial dairying in the final decades of the nineteenth century.Footnote 42
Understanding the decision-making that allowed for this process means understanding the decisions of hundreds of managers and boards of individual cooperatives, as well as the “knowledge elites” in government, educational institutions, and more. The background of this decision-making is, however, to a large extent the sort of statistical information we present here, and the editors of the statistics offered advice based on their reading of the information they had collected. In fact, a major contribution of the aforementioned elites was the early adoption of sophisticated agricultural accounting, probably unique in a world perspective at the time, which allowed farmers to make rational and informed decisions regarding both how and what to produce.Footnote 43 Part of this was the dissemination of knowledge through educational establishments, scientific journals, extension services, and other publications, which allowed for the rapid diffusion of best practice. Although the collection and standardization of accounting material for landed estates had a longer history, the first attempt to do this for the cooperatives was under M. C. Pedersen in 1884, just two years after the first was established in 1882, at the request of the then chair of the Association of Agricultural Societies of Jutland’s dairy committee, Frederik Friis, who hoped to be able to demonstrate their inferiority compared to privately owned concerns. In fact, Pedersen’s report, which covered just seven cooperative creameries in western Jutland, ended up demonstrating the opposite and thus providing an important boost to the cooperative movement.
The dairy consultant Bernhard Bøggild carried on this work in a number of publications in the 1880s. When creamery associations, representing the cooperatives at a regional level, were established, they took responsibility for this, and various reports were published surveying increasing numbers of creameries for different parts of the country between 1891 and 1897.Footnote 44 In 1897, it was decided to request that all cooperatives submit accounts for publication by a central organization, and it is these “Operational Statistics of Creameries” (Mejeridriftsstatistik, MDS) that we have hand collected from the original published volumes and that allow us to construct detailed microlevel longitudinal data over many decades. Our database comprises 1,419 creameries over the period 1898–1945 and up to 131 variables. We are not aware of any other microlevel database for a major industry of any country that covers such a long period in such detail. MDS continued to be published until the early 1970s, but increased rationalization of the industry and a lack of comparability to earlier volumes meant that we chose to end in 1945. This database provides us with the scale and scope to view the dynamics of the dairying industry over time, revealing key elements of the decision-making process and how factors such as regional geography influenced decisions, even for a small country such as Denmark.
Some background on MDS is necessary in order to understand their representativeness, reliability, and significance. Thus, the first volume of MDS (1897/1899) provides a helpful summary of the battle to establish the Operational Statistics. In the winter of 1897, the creamery associations of Jutland, Funen, and Zealand, which represented the cooperatives in their respective regions, applied for government funding of up to 4,000 kroner from the Ministry of Agriculture to collect statistics from their members.Footnote 45 They received a reply on September 24, 1897, addressed to the chairman of the United Creamery Associations of Jutland (samvirkende jydske Mejeriforeninger). The ministry explained that they had exchanged letters with the Danish Royal Agricultural Society, an organization founded in the eighteenth century and largely representing traditional landed elites, and had ultimately concluded that they could not proceed without first receiving more details on exactly how the information was to be collected and processed, how much it would cost, and whether it was clear that a large proportion of creameries would be willing to share their accounts. Moreover, they requested that further planning should be in collaboration with the Royal Agricultural Society, which had declared itself willing. The Creamery Associations wrote to the Royal Agricultural Society on October 6, 1897, and held their first meeting with them on October 26 of the same year. There was general agreement about the importance of the project, but disagreement regarding how exactly it should be organized. 
Ultimately, however, they agreed on the following points: (1) that it should be led by a committee of three members, one from the Royal Agricultural Society, one from the Creamery Associations, and one from the Ministry of Agriculture’s dairy consultant; (2) that the annual work plan should be determined at a meeting of delegates, at which each Creamery Association should participate with one representative, and other provincial agricultural organizations and the Danish Association of Dairymen (Dansk Mejeristforening) should also be represented (the idea being that this would facilitate fruitful collaboration); and (3) that the costs should be calculated as at least 6,000 kroner the first year, and the requested support was increased to this amount.
One member subsequently decided to take issue with the first point, and this led to continued negotiations, which were additionally delayed by the poor health of the president of the Royal Agricultural Society, Jørgen Carl la Cour. Under the new president, Count Gustav Wedell Wedellsborg, point 1 was changed so that the committee should consist of four members: the president of the Royal Agricultural Society, its dairy consultant, and two representatives of the Creamery Associations, while the other points were left substantially unchanged. The Ministry agreed to support this initiative with 4,000 kroner for the budget year 1898–1899, and parliament accepted this. This was communicated in a letter from the Ministry dated April 15, 1898, to the United Creamery Associations of Jutland, stating that they would leave the responsibility to the Royal Agricultural Society and the Creamery Associations to perform the task to the extent that the amount offered could cover.
On May 16, 1898, the committee as constituted met for the first time, and it was decided that responsibility for organizing the statistics should mostly rest with the Creamery Associations, so that three committee members would now come from them, and just one from the Royal Agricultural Society, and that point 2 should be removed. A new proposal was sent to the ministry proposing that the 4,000 kroner would be administered by those four members. Half was to be distributed in amounts of 5 kroner per creamery that was a member of the associations, under the condition that each association should contribute the same amount, that they submit accounts that could be used for the Operational Statistics, that a report should only be printed for an association if at least fifteen usable accounts were submitted, and that only two copies of the statistics would be sent to each creamery in each association. In order for the accounts to be usable, creameries were required to use a certain standardized form, covering 365 days and with an accounting year ending sometime between October 1, 1897, and January 1, 1898. The other half of the money would be at the disposal of the committee to cover expert advice and printing costs, with 1,000 kroner to each activity. The ministry accepted this revised plan on September 27, 1898.Footnote 46
Although it was originally planned that the statistics would be compiled separately for each association, the committee eventually agreed that it made more sense to collect the reports in one volume, which was subsequently published each year with a short introduction. Ten copies were distributed to each Creamery Association, and one was sent to creameries outside the associations, to parliament members, to chairs of agricultural societies, etc.Footnote 47 By 1901, the organization was running an increasing deficit, and it asked the government for an extra 1,500 kroner, but ultimately the MDS were combined with the publication of butter price statistics, which were previously published elsewhere, with government support of 9,000 kroner.Footnote 48 Production was turned over to a Committee for Creamery Statistics (Udvalg for Mejeri-Statistik), consisting of one representative from the Royal Agricultural Society, two from the United Danish Agricultural Associations (De samvirkende danske Landboforeninger), three from the United Danish Dairy Associations (De samvirkende danske Mejeriforeninger), and one from the Association of Danish Dairymen (Dansk Mejeristforening).Footnote 49
Besides the obvious value of the massive amount of data collected, the introduction to each volume provides a wealth of important qualitative information and analysis of the published statistics, touching on issues such as fuel prices, tuberculosis, and war, as well as the more mundane aspects of how to run a creamery based on the results of their statistical analysis. This material provides the qualitative basis for understanding decisions made for the industry on the macrolevel, which had considerable influence on actors in the individual creameries and were themselves based on the statistical information. These decisions could only be as good as the data collected, and the editors frequently emphasized the importance of submitting accounts, and complained about low compliance rates in the early years, although this improved from around a third in the first years of the twentieth century, to about half by 1910, and two-thirds by the late 1920s, at which point it is stated that “it can be considered completely responsible to use the calculated averages as an expression of the general situation of dairying.”Footnote 50 Although the first submissions of accounts from the individual creameries were patchy and many could not be used, and were thus not published, already in the volume for 1899 improvement is noted, which was greatly facilitated by the publication of standardized accounting books.
From the beginning, the editors encouraged more creameries to submit their accounts, explaining that the comparison would facilitate improvements for the individual creamery, and for the situation of dairying in the country as a whole. One editor provided a particularly pertinent motivation:
It can easily be understood that replying to the given questions gives a very reliable picture of the individual creamery before and now, not to mention the historical and general statistical interest that the collection of this information has at a time when it is still possible to collect authentic data from the beginning of the cooperatives. The present generation will, quite naturally, not attach great value to this information, since the development is occurring before their eyes; but future descendants will quite surely value it highly.Footnote 51
A later editor explained that “accounts are the compass of the creamery. Without them one manages blindly, and if one is to have one’s accounts accounting-wise in order, it is important that they are not misleading, but point in the direction of the greatest possible return under the given conditions.”Footnote 52 Occasionally, editors speculated about the reasons why some creameries did not submit their accounts, for example, because they were embarrassed by bad results, because their accounting system was incompatible with that of the MDS, or because they were simply happy to free ride on the efforts of others, something which was even once described as morally wrong.Footnote 53 There is no doubt that the entire sector was put under a great deal of pressure to provide accurate accounts, and despite the weaknesses in the data discussed below, we believe that our database can be considered to provide a good, if imperfect, snapshot of the condition of dairying in Denmark over the period covered. If nothing else, our case study illustrates some of the challenges involved for our proposed “big data” approach to business history.
Overview of the Database
The data we decided to collect from MDS can be roughly divided into four categories:
1. Basic characteristics (e.g., name, year of establishment, number of shareholders, etc.)
2. Financial statistics (e.g., amount spent on fire insurance, debt, dividends, etc.)
3. Input/Output (number of cows, amount of milk processed, butter produced, milk/butter ratio, etc.)
4. Technology and energy sources (use of refrigeration, types of fuel used, number of centrifuges, etc.)
Appendix A contains tables with a description of all the variables presented by category, as well as the range of years for which the variable is available (Tables A.1–A.4). Furthermore, appendix B contains summary statistics for all of these variables (Tables B.1–B.4). It should be noted that, partly because of time constraints, we did not collect every piece of published information, and chose to exclude some for reasons of lack of comparability between volumes, or our own subjective lack of interest.
As noted above, the data collected is not without its flaws, in particular during the early years, with the most obvious issue being that not every creamery submitted accounts. We can only speculate, as did the editors of MDS, on the reasons for this, and whether there might be certain characteristics common to those who did not send their accounts. For example, although it might be that “worse” or less efficient creameries were less likely to have kept good accounts, which would have made it possible for them to participate, it might also be the case that the most modern creameries simply had less interest in the project, perhaps because they received useful information from elsewhere, or they were content to free ride, which would certainly have been rational at the level of the creamery. Whatever the case, our analysis below demonstrates quite some heterogeneity in productivity between the individual creameries, which might only have been greater if we had access to the full sample.
In their introductions to each volume of MDS, the editors describe the most important issues of the accounts submitted. Some were, of course, simply not compatible with the need to provide meaningful data for comparison between creameries. For example, the first accounts, which only cover Funen and Jutland, totaled 156, of which only 101 could actually be used; the remainder, which were thus not published, either did not cover the relevant period or were unusable for other reasons. Another issue that was raised almost every year was that creameries failed to harmonize their accounting years, although the actual dates covered are recorded in our database. The accounts for cheese production are mentioned as being particularly incomplete for the early years, reflecting the lack of importance attached to this minor part of Danish dairying.Footnote 54 Making the accounts comparable was also a considerable task, although it became easier as standardized accounting forms became more widely used. In the first volume, some associations chose to anonymize their data, and we thus did not include them.Footnote 55 In the MDS for 1899 (1900), very few accounts were received from Maribo, and these were thus merged with Sorø and Præstø, which also included, by mistake, those for Odden (Holbæk). As the accounting forms became more standardized, certain creameries stopped participating for some years, as it took some work to make their accounts compatible, although this issue was only temporary.Footnote 56 Pricing the by-products of butter production (i.e., buttermilk and skim milk, as well as whey from cheese production) also proved difficult, and the system for this was changed with the MDS for 1916 (1917), meaning that the prices before and after this year are not directly comparable. Finally, the small association in Bornholm did not send accounts for 1926 and 1927 because of changes in the accounting year.
The variables not digitized are within the following approximate categories: classical accounts (incomes, expenses, etc.); cheese statistics; and milk production and milk fat distributed over the months of the year.
A key issue with the construction of large-scale hand-entered databases such as this one is the validation of the data. We have, however, used several techniques to ensure the greatest possible accuracy. First, we used a judgement-based method. From previous work, we have knowledge of realistic values of, for example, the milk/butter ratio, which was the main productivity measure recorded (see also below). Thus, we were able to verify or correct unlikely values. Furthermore, we have visually inspected all observations in plots and checked all unlikely entries. Second, we employed statistical methods, in particular a method centered on outliers in individual variables and a regression-based technique.
All of these approaches are presented in detail in appendix C, and might be of value to others considering similar exercises.
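To illustrate what an outlier screen on a single variable can look like, the following is a minimal sketch only. It is not the procedure documented in appendix C: the robust z-score rule, the threshold of 3.5, the function name `flag_outliers`, and the example values are all assumptions made for illustration.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return the indices of values with a large robust z-score.

    The score is 0.6745 * |x - median| / MAD, where MAD is the median
    absolute deviation. Both the rule and the threshold are illustrative
    assumptions, not the method used for the database.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread, so nothing can be flagged
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Made-up milk/butter ratios; the sixth entry mimics a misplaced decimal point.
mb_ratios = [25.1, 24.8, 25.6, 26.0, 25.3, 2.53, 25.9, 24.5]
suspect = flag_outliers(mb_ratios)
print(suspect)
```

Entries flagged in this way would then be checked against the printed source rather than corrected automatically.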
Some Applications of the Database
We now illustrate the power of our database with three applications: first, we consider the extent of heterogeneity in terms of productivity between the individual creameries; second, we investigate the use of fire insurance as just one example of how one might look at decision-making (risk aversion) at the level of the creamery; and finally, we consider the extent of survivor bias.
Heterogeneity of Productivity
As a first application of the database, we aim here to determine whether the productivity of the individual creameries differed to a significant extent between regions, after controlling for trivial explanations for this—after all, the microlevel adds little detail to our prior knowledge without such heterogeneity. As noted above, a common efficiency measure of the time was the milk/butter ratio (MB_ratio), which is the weight of milk necessary to produce one weight-unit of butter.Footnote 57 Thus, in order to document heterogeneity in productivity, we model this ratio nonparametrically, that is, using regional dummies and annual fixed effects:
The set that j iterates over contains all regions (except the excluded region), and the set that t iterates over contains all years. 1[x] is an indicator function returning 1 if x is true and 0 otherwise. The reference year is chosen as 1898 (the first year), and Silkeborg is chosen as the reference region, as it is the one with an average MB_ratio closest to the overall average. If any element of $ \beta $ is different from 0, then this is evidence that there are regional differences in productivity. Thus, we test the hypothesis $ \beta =[0\hskip0.35em \dots \hskip0.35em 0] $ with a classical F-test to see whether this is the case. If the hypothesis is rejected, then this is direct evidence that levels of productivity differed across creameries in different regions.
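Based on the verbal description above, the first specification can be sketched as follows. The symbols $ \alpha $ , $ {\gamma}_{\tau } $ , and $ {\varepsilon}_{it} $ , as well as the names $ Region $ and $ Year $ , are assumed notation for this sketch; only $ \beta $ , the indicator function, and the choice of reference categories are taken from the text.

$$ MB\_ ratio_{it}=\alpha +\sum_j{\beta}_j\,1\left[ Region_i=j\right]+\sum_{\tau \ne 1898}{\gamma}_{\tau}\,1\left[ Year_t=\tau \right]+{\varepsilon}_{it} $$

Here $ i $ indexes creameries and $ t $ indexes years, $ j $ runs over all regions except Silkeborg, and the F-test concerns the joint hypothesis that every $ {\beta}_j $ equals zero.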
Principally, we can think of efficiency of butter production in terms of milk input being affected by two factors: first, the technology used in processing the milk into butter (i.e., the ability to extract as much butterfat from the milk as possible), which depended on a minimum efficient scale of production; and second, the quality of the milk (i.e., the fat or cream content of the milk)—the higher the fat content, the more can be extracted per unit of milk.
Our main specification does not, however, capture possible heterogeneity induced by the differing quality of milk, which might principally be determined by variation in the breed of cows, with varying geographic accessibility.Footnote 58 In order to argue that any heterogeneity observed can be found in the production itself, we need to rule out this source of variation. Another determinant of productivity is the scale of production. Appendix D documents how some regions reported creameries of much larger size than others. Taken together, this motivates the following alternative specification:
Here Cows_now is the number of cows supplying the creamery, and Milk_fat is the fat content of the milk. We only have information on milk fat content from 1929. For this reason, we show results both with and without Milk_fat. We refer to the specification with Milk_fat as 2A and the specification without as 2B.
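One way to write this alternative specification is sketched below. It assumes that both controls enter linearly, and the symbols $ \alpha $ , $ {\gamma}_{\tau } $ , $ {\delta}_1 $ , $ {\delta}_2 $ , and $ {\varepsilon}_{it} $ are notation introduced for this sketch rather than taken from the source.

$$ MB\_ ratio_{it}=\alpha +\sum_j{\beta}_j\,1\left[ Region_i=j\right]+\sum_{\tau }{\gamma}_{\tau}\,1\left[ Year_t=\tau \right]+{\delta}_1\, Cows\_ now_{it}+{\delta}_2\, Milk\_ fat_{it}+{\varepsilon}_{it} $$

Under this reading, specification 2A is the full equation, estimated on the years from 1929 onward, and specification 2B drops the $ {\delta}_2\, Milk\_ fat_{it} $ term so that all years can be used.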
Finally, we might suspect that the regional heterogeneity changes over time, and this motivates the following flexible specification:
A problem in using MB_ratio as an outcome variable is that some of the heterogeneity will be caused by randomness in the production process, making the ratio an unfit measure of efficiency of the individual creamery. This could be addressed by decomposing the random and inefficiency elements of the heterogeneity in a Stochastic Frontier Analysis (SFA) approach.Footnote 59 However, we are not interested in obtaining a measure of individual productivity of each creamery, and for this reason the MB ratio suffices for our purpose and simplifies the analysis somewhat. Nevertheless, an analysis based on SFA was also performed and yielded similar results. We used the specification of George E. Battese and Timothy J. Coelli, in which explanations of the inefficiency are modeled directly in a maximum-likelihood framework.Footnote 60 However, we encountered the so-called wrong skew problem, which causes the Hessian and thereby all classical standard errors to be undefined, thus motivating our focus on the nonparametric specification outlined above.Footnote 61 An alternative method would be to supplement the current analysis with a variance decomposition, for instance by estimating Shapley values, which measure how much a specific variable (or group of variables) contributes to explaining the variation in the outcome (here the MB ratio). Doing this, we find, depending on the specification, that the region contributes between 3 and 44 percent of the variation in the MB ratio. Such a spread highlights the importance of further work with more complete models in order to understand this variation.
Table 1 presents our regression results from the first two specifications.Footnote 62 The reference region is chosen to match the mean MB ratio for that period, meaning that deviations can be roughly compared to the average. Note that there is considerable regional variation in productivity, as revealed by the very strong F-statistic on the regional dummy parameters. This does not qualitatively change when taking milk quality and production scale into account. It should be noted that, although there is regional variation in creamery size, the parameters on the regional dummies do not change much between specifications 1 and 2B. This is evidence that the regional variation is not only determined by regional heterogeneity in creamery size, and that some other factor must be of importance. The parameters do, however, change a lot when controlling for milk quality.
Note: *** p < 0.001; ** p < 0.01; * p < 0.05. “Regional F” is the F-statistic for the null hypothesis that all regional parameters equal zero. Clustered standard errors. P-values are based on Satterthwaite-corrected t-statistics.
Because of the relatively large amount of data, it is not hard to provide precise estimates, but these also show that our results are economically significant and consistent with at least one feasible mechanism. Thus, MB ratios are, in general, larger in regions that are in the geographic periphery of Denmark. In Ringkøbing (the worst performer), around 1.05 kg of additional milk is needed for every kg of butter when compared to Aarhus (the best performer). Moreover, 0.24 kg of this difference can be attributed to factors other than milk quality, at least for the years 1929–1945, for which we have data on this.Footnote 63 Aarhus is the largest city in Jutland, whereas Ringkøbing is in the rural western part of Jutland.
Specification 3 contains 890 individual parameters. For this reason, it is simply not feasible to show a regression table for every one of them. However, a figure for each estimate can communicate the information in a far more efficient manner, and Figure 1 contains all the parameter estimates. This shows changes in MB ratio, which are not attributable to the scale of production each year, compared to Silkeborg in 1898, and each point represents a parameter estimate. We have added error bars representing a 95 percent confidence interval to each estimate, but the error is so small that it is hardly noticeable. Overall, it can be noted that productivity increases across all regions over time, and that this development is fairly consistent. However, the different vertical positions of the curves demonstrate that it was not at all the same across the different regions. Again, a clear core-periphery pattern emerges.
The data does not describe the entire set of creameries in Denmark at the time. Of particular concern is that creameries might be missing because of some process that is correlated with region. The most likely source of such a problem is that less well-performing creameries might have underreported, or that some creameries might even have reported falsified, positively skewed data, and that this issue would vary across regions. The typical MB ratio was well-known and might have provided an incentive for well-performing creameries to submit their impressive reports. On the other hand, underperforming creameries might have reported results that were better than the reality. If this is the case, we expect it to be the case in generally underperforming regions, as the potential signaling gain is greater, and this would thus generate a smaller difference between regions than was actually the case. Consequently, our estimates of regional heterogeneity might be considered a lower bound.
Finally, we have controlled directly for creamery size, but production scale is not a trivial matter. Specifically, there might be external economies of scale. Individual creameries were dependent on a sufficient supply of milk in order to ensure the efficient use of the machinery, given the technology of the time. At the other end of the spectrum, if supplies became too large to handle, they might have needed time to invest in an additional centrifuge.
Fire Insurance
The database contains the nominal amount spent on fire insurance. In mainly coal-powered creameries, fires were frequent, and how best to insure against this was an important decision for the board of a creamery to make. Figure 2 provides a plot of the average amount (in Danish kroner, DKK) spent on fire insurance.
The amount spent on fire insurance roughly follows inflation for the first twenty years, but it then gradually increases while inflation falls. To better understand the actual insurance behavior of the creameries, we must take inflation into account. As insurance does not follow the general inflation patterns throughout the period, it might be misleading to simply deflate using the consumer price estimates from Statistics Denmark. Our data offers a natural way forward: We can use a combination of variables to estimate each creamery’s turnover and thereby estimate how much each creamery spent on insurance as a share of its total budget. This gives a sense of how much weight the individual creamery placed on insurance.
Unfortunately, the reports did not contain estimates of total turnover. We can, however, calculate that 93.6 percent of all milk was used for butter production.Footnote 64 Therefore, a reasonable proxy for the total turnover is the amount of butter produced multiplied by the price of butter. This leaves out an average of 6.4 percent of the production. A simple correction would be to multiply any turnover estimate by $ \frac{100\%}{93.6\%} $ to obtain an estimate of the full turnover. We can do better than this, though, because our rich data allows for this estimate to be made at the individual level.Footnote 65 The variable $ Milk\_ for\_ butter\_ alt $ contains the amount of milk used for butter production as can be inferred from the Milk input and the MB ratio.Footnote 66 For each creamery $ i $ in year $ t $ , we can estimate that they used x percent of milk for butter. We label this the $ Butter\_ share $ . Next, we can define
This can be reduced to
Figure 3 contains turnover and the corrected turnover. The turnover and corrected turnover are almost the same, except for the last few years of the database. This can be explained by creameries in the Copenhagen region starting to produce products other than butter to a much larger extent. Making the same plot by region (not included here) shows that the turnover and corrected turnover only strongly diverge in the København og Frederiksborg region, which covers the greater Copenhagen area.
How much was spent on insurance? This is now a simple matter of dividing one by the other:
Figure 4 shows the insurance share of turnover over time.
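The chain of calculations described above can be sketched in a few lines. This is an illustration under stated assumptions, not the exact definition used for the figures: it assumes that the corrected turnover is the butter turnover scaled up by the creamery's own butter share, and the function names and all numbers are hypothetical.

```python
def butter_share(milk_for_butter, milk_input):
    """Share of a creamery's milk input that went into butter production."""
    return milk_for_butter / milk_input


def corrected_turnover(butter_kg, butter_price, share):
    """Butter turnover scaled up to approximate full turnover.

    Assumed form: turnover from butter divided by the butter share, the
    creamery-level analogue of multiplying by 100% / 93.6%.
    """
    return butter_kg * butter_price / share


def insurance_share(insurance_dkk, turnover_dkk):
    """Fire insurance spending as a share of (corrected) turnover."""
    return insurance_dkk / turnover_dkk


# Hypothetical creamery-year; none of these figures come from the database.
share = butter_share(milk_for_butter=1_872_000, milk_input=2_000_000)
raw = 75_000 * 2.0  # kg of butter times price per kg (DKK)
corrected = corrected_turnover(75_000, 2.0, share)
print(share, raw, corrected, insurance_share(120.0, corrected))
```

For a creamery that uses nearly all of its milk for butter, the raw and corrected figures almost coincide, which is consistent with the two series in Figure 3 diverging only where other products became important.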
What explains insurance at the level of the individual creamery? We can apply the classical tools from the econometrics toolbox. One example is the robust negative relationship between size and how much is spent on fire insurance. We estimate the following equation:
$ Cor. insurance\ share $ is the share of turnover spent on fire insurance using the corrected turnover estimates. Here $ {\beta}_1 $ represents the extra percentage points spent on fire insurance per shareholder. We scale the shareholders variable to count hundreds of shareholders, as this makes the units more comparable. $ {\boldsymbol{z}}_{it}^{\prime}\boldsymbol{\gamma} $ contains a set of controls such as fixed effects.
From Table 2 it should be noted that we find a robust negative relationship between size and the share of turnover spent on insurance. Across specifications we find a negative relationship, and for the comparable specifications (1, 2, 3, 6) this relationship is very similar: between 0.117 and 0.166 percent less is spent on insurance when the creameries grow 1 percent in size. Moreover, we see that this relationship is replicated when measuring the size in terms of cows. Last, a similar negative relationship is found for a first difference specification (i.e., when measuring the relationship between a change in insurance share and a change in number of shareholders for the same creamery). This is in line with studies using modern data (see, e.g., Mayers and Smith Jr, “On the Corporate Demand”), and a fuller analysis could take into account some (although of course, because of data constraints, not all) of the other factors found relevant in the literature, such as geographic concentration.Footnote 67 Such relationships would be very hard to pick up using classical micro-historical methods of business history. As for causality, we can speculate that creameries that were more risk averse would grow less, or that larger creameries were able to negotiate better contracts. The introduction of this industry-wide data allows for further quantitative and/or qualitative research to explore such questions and more.
Note: *** p < 0.001; ** p < 0.01; * p < 0.05. Clustered standard errors. P-values are based on Satterthwaite-corrected t-statistics.
Note: (1) is a simple pooled estimate. (2) introduces year and creamery-level fixed effects. (3) includes only creameries using 99 percent or more of their milk for butter production. (4) is a first-difference estimate. (5) introduces an alternative size measure: the number of cows associated with the creamery. (6) uses the raw insurance share instead of the corrected one.
Survival Analysis
A classical survival analysis would involve a proportional hazards model or similar, and an assessment of whether assumptions for such a model were fulfilled. However, this is beyond what is necessary for the current application, with which we simply want to get a more informed grasp of missing data. We instead focus on a simple adaptation of Kaplan-Meier curves.Footnote 68 A Kaplan-Meier curve illustrates how many units (in this case creameries), as a share of the original number, are still observed at any point in time. This was originally developed in the context of medicine, in which one might be interested in actual survival; however, what we are interested in is seeing how many creameries stop reporting. In the classical medical setting, you cannot have units that are reported dead in one period turning up later. However, this is not the case for MDS, in which creameries might report once, then drop out, and later report again. This motivates a slight adaptation to allow for this. We thus count all the creameries in 1898 and then we see how many of those creameries report in 1899, 1900, 1901, and so on until 1945. This gives us a single pseudo survival curve for the creameries of 1898. We then repeat this exercise for 1899: We count the number of creameries in 1899 and then see how many of them show up later. This process is repeated until we have a survival curve based on the cross section for every single year. All of these survival curves are illustrated in Figure 5, which should be interpreted as the share of units retained after the number of years on the x-axis; for example, after ten years, around 80 percent of the creameries observed at time 0 (1898 or 1899 or …) are still observed.
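The counting procedure just described can be sketched as follows. The data layout (a mapping from year to the set of creameries reporting in that year), the function name `pseudo_survival_curves`, and the toy identifiers are assumptions made for illustration; the actual database will be organized differently.

```python
def pseudo_survival_curves(reports):
    """Compute one pseudo survival curve per origin year.

    reports maps each year to the set of creamery identifiers that
    reported in that year. For every origin year, the curve gives the
    share of that year's creameries that also report k years later.
    Because only co-presence is counted, a creamery that drops out and
    later returns is counted again once it reappears.
    """
    years = sorted(reports)
    curves = {}
    for origin in years:
        cohort = reports[origin]
        if not cohort:
            continue  # no creameries to follow from this origin year
        curves[origin] = {
            year - origin: len(cohort & reports[year]) / len(cohort)
            for year in years
            if year >= origin
        }
    return curves


# Toy example: creamery "D" skips 1899 and reports again in 1900.
reports = {
    1898: {"A", "B", "C", "D"},
    1899: {"A", "B", "C", "E"},
    1900: {"A", "D", "E"},
}
curves = pseudo_survival_curves(reports)
print(curves[1898])
```

Unlike a standard Kaplan-Meier curve, a curve built this way is not forced to decline monotonically, since returning creameries can raise the retained share from one year to the next.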
There is quite a persistent pattern. The dashed curve is a best fit line, with no model structure imposed,Footnote 69 and this follows rather well a curve with a model structure given by the solid line. The model structure is
This is something we can estimate, and the β then gives a sense of the rate of loss of observations per percentage increase in time. The estimates we obtain range from -0.06 to -0.11, depending on which survival curve we run it on (we tried with origin in every single year from 1898 to 1940). Therefore, we roughly know that, if we increase the time passed by 1 percent, an additional 6–11 percent of the creameries cease reporting. It is important to note that this is not a formally identified survival analysis. Any such analysis would have to be case specific to the use of the data. However, it does indicate very specifically the type of dropout to expect, which contrasts with the alternative of starting with extant firms and working backward.
For comparison, we can carry out a counterfactual exercise: What if we started with just a list of creameries that existed in 1945 and then used the common approach of working backward? Given this approach, we would have 36.1 percent of all creameries in 1898 and more than 40 percent for some other years, and it turns out that already ten years prior to 1935 we would have missed 24.8 percent of the creameries by relying on the 1945 list (see the full table in appendix F). This is hard evidence for the selection problem from which any analysis based on surviving firms might suffer.
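The working-backward counterfactual amounts to asking, for each year, what share of the creameries reporting in that year also appear on the final-year list. A minimal sketch follows, using the same assumed year-to-identifiers layout; the function name and the toy data are hypothetical and unrelated to the figures in appendix F.

```python
def coverage_from_final_list(reports, final_year):
    """Share of each year's creameries that appear on the final-year list.

    reports maps each year to the set of creamery identifiers reporting
    in that year. One minus the returned share is the fraction of
    creameries that a survivor-based sample would miss in that year.
    """
    final = reports[final_year]
    return {
        year: len(ids & final) / len(ids)
        for year, ids in reports.items()
        if ids
    }


# Toy example with invented identifiers.
reports = {
    1898: {"A", "B", "C", "D"},
    1935: {"A", "B", "E", "F"},
    1945: {"A", "E", "F"},
}
coverage = coverage_from_final_list(reports, final_year=1945)
print(coverage)
```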
Discussion and Conclusion
We have reflected on the methodological debate within business history and argued that business historians, economic historians, economists, and others have much to learn from one another. In part to avoid this being mere platitudes, and in the spirit of de Jong’s “bizhismetrics,” we have presented a new microlevel longitudinal database of cooperative creameries in Denmark for the period 1898–1945.Footnote 70 We hope that this provides inspiration for future work within business history: collecting an industry-wide picture that includes the smallest and largest of firms, with less survivor bias, and offering a more quantitative approach to assess drivers of productivity across both time and space. Each individual creamery was very small, employing perhaps three workers, and has left little trace beyond the data we present here. Nevertheless, the industry as a whole was central to Danish economic and business history.
We have presented a number of applications of the database. As one of the first uses of the data, and in contrast to the implicit assumption often made in the literature that the creameries were a relatively homogeneous institution, we have demonstrated considerable heterogeneity between the regions of Denmark in terms of productivity as measured by the milk/butter ratio. Our econometric results should, of course, not be given a direct causal interpretation. However, the value of this finding is that it offers an insight into this previously unseen heterogeneity, which might provide inspiration for future research. For now, we note that the productivity of the modern dairies did not spread uniformly, and that the regional heterogeneity we observe seems to be robust and attributable to factors other than the quality of the milk input and the production scale. Finally, we observe a clear core-periphery pattern.
Why might this be? Because the technology exhibited economies of scale, anything that restricted the amount of milk available would have an impact on productivity.Footnote 71 Thus, one possibility is institutional. Harry Haue documents how the new-pietist Inner Mission movement and its associated pressure to keep creameries closed on Sundays caused a loss of productivity.Footnote 72 Because the movement was stronger in certain regions, this might be one explanation. Another possibility is geography, and access to fuel in particular. During World War I and well into the 1920s, coal supplies were limited, creating uncertainty for creamery managers. During this period, access to coal might have differed according to location, and moreover, alternatives, principally peat and firewood, were also mostly available in certain locations. Geography might have also played a role in terms of endowments in infrastructure. Because the main export market was the UK, there might have been an advantage for creameries closest to the coast. Finally, geographical spillover could also be considered in this context. To some degree the pattern might be caused by agglomeration and external economies of scale in urban regions. Also, scale might have been a driver of productivity, although greater productivity might have also encouraged expansion, thus driving a spurious correlation. We leave all these potential determinants, and others, to future work, which is now made possible by opening the “black box” of the Danish creameries through the creation of a new database.
Supplementary Materials
To view supplementary material for this article, please visit http://doi.org/10.1017/eso.2023.5.