Introduction
All over the Western world today, socioeconomic status is positively associated with health and negatively associated with mortality (e.g., Mackenbach et al. 1997; Cutler et al. 2012; Chetty et al. 2016; Mackenbach 2019). The historical origin of this “health gradient” is less certain. Some researchers have argued that socioeconomic differentials in mortality were as big, or even bigger, in the past as they are today (Antonovsky 1967; Marmot 2004; Deaton 2016). In several influential articles, Link and Phelan (1995, 1996) argued that social conditions are fundamental causes of disease and mortality, persisting over time despite changes in disease environments, risks, and the mechanisms linking socioeconomic status and mortality (see also Clouston et al. 2016). Other researchers, however, have claimed that socioeconomic differences in mortality were small historically because the diseases responsible for much of the mortality were highly communicable (Smith 1983; Livi-Bacci 1991).
For childhood mortality, the evidence on socioeconomic differences in the past is mixed and differs by context. Jaadla et al. (2020), for example, reported a significant but non-linear relationship between child mortality and an occupation-based measure of parents’ economic well-being in mid-nineteenth-century England, while Dribe and Karlsson’s recent investigation of long-term trends in child mortality in Sweden (2022) found no evidence for a mortality gradient by socioeconomic status (also based on occupation) until after the onset of the mortality transition in the late nineteenth century. Although children born to parents living in extreme poverty clearly suffered a higher risk of death than other children in all contexts, there is no evidence for a socioeconomic gradient in childhood mortality in Western societies before the mortality transition, if by gradient we mean a consistent negative association between status and mortality across the whole status distribution and not just a difference between the lowest status groups and the rest. As in other areas of historical demography, however, research is limited by the quality and quantity of available data. High-quality data on parents’ economic well-being are especially lacking. More studies are needed, particularly more research on the period prior to the onset of the mortality transition.
In this study, we examine the relationship between parental wealth and child mortality in the mid-nineteenth-century United States immediately prior to the onset of sustained mortality decline, which began about 1880 (Haines 2000). We rely on new IPUMS Multigenerational Longitudinal Project (IPUMS MLP) data (Helgertz et al. 2020, 2022) to construct panel datasets for over 2 million married couples linked between the 1850–60, 1860–70, and 1870–80 censuses. The panel datasets allow us to estimate child mortality using the observed survival of couples’ children over the intercensal intervals. In addition to describing a new approach to indirectly estimating child mortality using linked censuses, a major contribution of this study is its inclusion and analysis of wealth data, which were collected for all individuals in the 1850, 1860, and 1870 censuses. Wealth is a more direct measure of economic well-being than father’s occupation, which has been used frequently in historical studies as a proxy for parents’ income, wealth, class, and social status (e.g., Garrett et al. 2001; Dribe et al. 2020; Jaadla et al. 2020). We look at parents’ real estate wealth, personal estate wealth, and combined wealth and study the association between wealth decile and child mortality, controlling for confounding factors. In addition to analysis at the national level, we make separate analyses by census region. Compared to previous studies, we use national data and a substantially larger number of cases, which allows a more detailed analysis of the association between wealth and mortality in different parts of the nation.
Background
Although a negative relationship between socioeconomic status and mortality has been consistently observed in modern societies (Marmot 2004; Elo 2009; Deaton 2016), the precise causes of these associations have been disputed (House et al. 1988; Marmot et al. 1991; Smith 1999; Kuh and Ben-Shlomo 2004). Some researchers have contended that social conditions are fundamental causes of disease and mortality wherever access to resources is unequal and diseases can be avoided (Link and Phelan 1995, 1996; see also Clouston et al. 2016). Although disease environments, causes of death, and pathways connecting socioeconomic factors and mortality have varied over time, according to this perspective, socioeconomic conditions have remained fundamental causes, resulting in persistent inequalities in health and mortality.
Historical research on the relationship between socioeconomic status and child mortality, however, has failed to find a strong and consistent relationship between parents’ socioeconomic status and child mortality prior to the late nineteenth century, casting some doubt on the long-term applicability of fundamental cause theory. Steckel (1988) and Ferrie (2003), for example, found no consistent association between socioeconomic status and childhood mortality in the mid-nineteenth-century United States, whether measured by occupation, wealth, or literacy. The empirical basis for drawing firm conclusions was thin, however, being based on small and regional samples of census and retrospective mortality data. In a recent analysis of child mortality in southern Sweden, Dribe and Karlsson (2022) found no social class differences in infant and child mortality in the early nineteenth century. Class differentials emerged only in the late nineteenth century in the context of industrialization, economic development, and mortality decline. A few studies have reported mixed evidence in support of a mortality gradient. In their study of child mortality in a mid-nineteenth-century parish in Tuscany, Italy, Breschi et al. (2004) found a weak socioeconomic gradient in infant mortality and a stronger gradient in the mortality of older children. Similar socioeconomic differences in child mortality have also been documented in early-nineteenth-century England. Results, however, were nonlinear and explained more by the disease environment than by socioeconomic status (Jaadla et al. 2020; Razzell and Spence 2006).
Preston and Haines provided the earliest evidence of a negative association between socioeconomic status and child mortality in the United States in Fatal Years: Childhood Mortality in Late Nineteenth-Century America (1991) using a public-use microdata sample of the 1900 U.S. census. The census collected information on the number of women’s children ever born and the number of children surviving, which allowed Preston and Haines to estimate and construct empirical models of child mortality at the level of individual mothers with controls for individual-level, household-level, and area-level correlates. The results showed only small and inconsistent differentials in child mortality by socioeconomic status, as measured by the father’s occupation, home ownership, and unemployment. The difference in mortality between children of white-collar and blue-collar workers was not large, and most of the differences between occupational groups were related to place of residence, nativity, or race (e.g., low mortality for children of farmers and high mortality for children of black urban laborers in the South). Preston and Haines attributed the relatively small socioeconomic differences to poor knowledge about disease transmission and medical care in society, which made it difficult for high-status groups to protect their children from disease and death.
Similar census data documenting the number of children ever born and children surviving were collected in the 1910 census of the United States and the 1911 censuses for Ireland and England and Wales, which have supported similar analyses with similar results. Preston et al.’s analysis of the 1910 public-use sample (1994), which was constructed with a higher density sample than the 1900 sample, found large mortality differences across ethnic groups (e.g., low mortality among children of Jewish parents and high mortality among children of French-Canadian parents) and relatively small differences across occupation groups. In England and Wales, as well as in Dublin, Ireland, childhood mortality was clearly associated with social class (based on occupation), but the contexts in which people lived were often more important than class and confounded the class-mortality association (e.g., Preston and Haines 1991; Woods et al. 1993; Haines 1995; Woods 2000; Garrett et al. 2001; Cormac 2004; Connor 2017). There was no trend over time in mortality differentials in the Netherlands, but poorer regions had more pronounced class differences than richer regions (Van Poppel et al. 2005).
A stronger and more consistent negative association between socioeconomic status and child mortality emerged in the later stages of the mortality transition. Woodbury’s classic study of infant mortality in the United States (1926), for example, identified a large and consistent mortality gradient across income groups. Among infants born to native-born white mothers whose fathers’ earnings exceeded $1,250 annually, the mortality rate was 58 per 1,000 live births; among infants whose fathers’ earnings were less than $450, the rate was 170 per 1,000 live births. In Sweden, the socioeconomic differentials that emerged in the late nineteenth century also developed into a full gradient in childhood mortality in the twentieth century (Dribe and Karlsson 2022).
From a theoretical perspective, socioeconomic status can affect the health and mortality of children through several pathways, related to maternal factors (birth interval and birth order), nutrition, health care, injuries, and environment (e.g., Mosley and Chen 1984). Economic resources improve nutrition and access to health care and reduce exposure to infectious diseases through more hygienic living quarters and access to clean water and sanitation (see, e.g., Edvinsson 2004). Socioeconomic status may also interact with cultural factors in determining different behavior with consequences for child health, such as fertility, breastfeeding, uptake of health innovations, food preparation, and lifestyles. However, many of these variables are also confounded by contextual settings, such as publicly provided services like water and sanitation facilities, and health care access, as well as food security (Antonovsky and Bernstein 1977; Sundin 1995; Preston et al. 1998).
As mortality started to decline, there was also a change in disease patterns, often described as “the epidemiological transition” (Omran 1971). Before the transition, infectious diseases transmitted by air, food, or water dominated among the causes of death. People’s knowledge about what caused disease was rudimentary and sometimes even inaccurate (Preston and Haines 1991; Easterlin 1999), which meant that individuals who otherwise might have had the resources to protect themselves or their children from diseases and death, such as the higher-status groups, were unable to do so (Smith 1983). In the late nineteenth-century United States, for example, children of physicians experienced only 6% lower mortality rates than children of other parents, while the children of teachers had no advantage at all (Condran and Preston 1994). In the first stages of the mortality transition, smallpox vaccination (in the early nineteenth century) and improved sanitation in the cities (from the mid-nineteenth century) reduced deaths from smallpox and various waterborne diseases, respectively (Sköld 1996; Troesken 2004). As knowledge of disease transmission improved, most notably through the acceptance of the germ theory of disease from the 1880s (Preston and Haines 1991; Easterlin 1999), mortality from infectious diseases was further reduced, and then continued to decline thanks to the medical innovations in the twentieth century (see Omran 1971).
The fact that infant and child mortality was high even in high-status groups, where diet and nutrition must have been satisfactory, has led to a questioning of the idea that the mortality decline was caused by improved nutrition (McKeown 1976). Instead, much of the focus in the literature has been on the role of disease environment, maternal care (especially breastfeeding), water and sanitation, and early public health initiatives as leading explanations of the declines in childhood mortality that occurred before the real breakthrough of modern medicine (e.g., Szreter 1988; Preston and Haines 1991; Reid 1997; Woods 2000; Razzell and Spence 2006). As new health interventions became available and knowledge improved, higher-status groups could be expected to have benefitted first, leading to a sharpened mortality gradient (Morris and Heady 1955; Antonovsky and Bernstein 1977; Cutler et al. 2012) or even to the emergence of such a gradient in the first place (Smith 1983). As health knowledge diffused to lower-status groups, urban infrastructure improved, and new health interventions became more accessible and affordable, it could perhaps be expected that the importance of socioeconomic status declined. For example, adequate provision of clean water and sanitary facilities for the whole population, through public provisions, would reduce socioeconomic disparities. Both black and white children benefitted from improved water supplies and sewage in the early twentieth century South, leading to a narrowing of the race differential in child mortality (Troesken 2002, 2004).
Similarly, the establishment and expansion of publicly funded universal health care would reduce the gradient, since access was less dependent on economic status.
The period we study, however, predates many of these developments and the onset of significant mortality decline, which Haines (2000) has identified as beginning in the United States during the 1870s, the period covered by the last of our three panel datasets. In this context, wealthier parents would have found it difficult to leverage their resources to shield their children from communicable diseases or to treat acquired illnesses. To cite just one example from the period, Abraham Lincoln’s second son Eddie died of “chronic consumption” (likely tuberculosis) in 1850 at age three, despite Lincoln’s considerable wealth earned as a lawyer, and his third son Willie contracted and died in 1862 from “bilious fever” (likely typhoid fever) while living in the White House, despite the care of three of Washington, D.C.’s best physicians (Dirck 2019).
Our study, however, does not predate the beginnings of the sanitation movement in the United States (Duffy 1992), a growing emphasis in the period 1750–1900 on cleanliness and domestic hygiene to promote health (Bushman and Bushman 1988; Tomes 1990), or the early phases of the fertility transition, which began among “Yankee” couples living in New England circa 1840 (Hacker 2016). Although early sanitary reforms were limited by lack of knowledge of germ theory and likely benefited all classes, the greater emphasis on cleanliness among upper- and middle-class parents could have contributed to higher survival rates among their children. Recent analyses of linked census data have indicated an inverse “U-shaped” relationship between wealth and marital fertility in the period 1850–80 (Hacker et al. 2021), suggesting that wealthier couples were consciously limiting their fertility, which also could have had indirect benefits on the health and survival of their children.
To summarize, there does not seem to have been a consistent association between socioeconomic status and childhood mortality in Western societies before, or early in, the mortality transition. When the transition was well underway, which also coincided with industrialization in many places, socioeconomic differentials emerged, as the higher-status groups experienced an earlier, or sometimes faster mortality decline. Gradually these differentials developed into a full health gradient. Based on this evidence we should not expect childhood mortality in the mid-nineteenth century United States to have varied markedly by wealth or other measures of socioeconomic status. It is possible, however, that the growing emphasis on cleanliness and domestic hygiene among middle-class and upper-class parents and the early fertility transition with its clear socioeconomic pattern had implications for mortality differentials because of the association between fertility and child survival.
Data
We relied on three panel datasets of married couples constructed using IPUMS Multigenerational Longitudinal Project (MLP) data covering the periods 1850–60, 1860–70, and 1870–80 (Helgertz et al. 2020). The IPUMS MLP datasets and their methods of construction have been described in detail elsewhere (Helgertz et al. 2022). Briefly, the datasets consist of census “crosswalks” identifying individuals linked between two or more of the IPUMS full-count census datasets (Ruggles et al. 2021). Individuals were linked using machine-learning algorithms that relied on time-invariant information captured in both censuses (individuals’ first name, last name, race, sex, and place of birth), supplemented with information related to the local context and other household members to improve accuracy. An evaluation of the method used to generate the IPUMS MLP links indicates a type I error rate (the rate of false positives) dramatically lower than that of other methods of automated record linkage used on U.S. census data (Helgertz et al. 2022), including other projects employing similar automatic linking methods (e.g., Abramitzky et al. 2021).
From each of the three panel datasets we selected an initial analytical population limited to married couples with: (1) spouses successfully linked between the two censuses; (2) spouses who remained married to each other in both censuses; and (3) one or more own children aged 0–9 in the first of the two censuses, whom we considered to be “at risk” of mortality in the intercensal period. Using this dataset, we conducted several preliminary analyses of our ability to estimate the mortality of the at-risk children from their observed survival to the second census, when aged 10–19. Ultimately, as discussed in more detail below and in the Online Appendix, we concluded that our indirect estimation method worked best for children aged 1–3 in the first census. These children were well enumerated in both censuses (children aged 0, in contrast, appeared to have been frequently undercounted by the first census). Assuming they survived the decade, they were also more likely than their older surviving siblings to be living in their parents’ households in the second census, which increased the probability that they would be linked by the IPUMS MLP project and known to survive. Our final analytical dataset, therefore, consisted of linked married couples with one or more children aged 1–3 in the first census, who were the only children considered at risk of death in the subsequent decade. Because approximately 9-in-10 Black individuals living in the United States in 1850 and 1860 were enumerated on a separate schedule for enslaved inhabitants without names, we limited the analysis to white couples.
By requiring that both spouses were linked and survived each decade, we reduced the already low chances of type I errors in the linked datasets and increased the chances that surviving children, relative to children born to parents whose marriages were disrupted by the death of one or both partners, would be living in their parents’ households and linked by the IPUMS MLP project. Although we did not explicitly require couples’ surviving children to be linked between the two censuses, a large majority of children present in the second census who were old enough to have been enumerated in the first census were linked as well.
Linked census records are unlikely to be representative of the populations from which they are drawn, even if methods used to establish the links relied solely on time-invariant matching criteria (Antonie et al. 2020). Unfortunately, our requirement that couples survived each decade makes it impossible to make direct comparisons between couples in the linked analytical datasets and couples in the overall population using the same selection criteria. We have no means of determining – other than from the IPUMS MLP dataset itself – which couples in the overall population survived each decade. In addition, our selection criteria implicitly assume that couples were still living in the United States 10 years after the first census, when they were at risk of being enumerated by the next census, and we have no means of determining which couples remained resident in the country. Consequently, Table 1, which compares couples in the three linked MLP datasets with similar couples in the corresponding IPUMS cross-sectional datasets, is not a comparison of the linked and overall targeted populations, but a comparison of the linked datasets and proxies of the overall targeted populations. An unknown but likely significant number of the couples in the cross-sectional datasets were in marriages that would soon be disrupted by death, divorce, or separation, and an unknown number of couples would be living outside the United States at the time of the next census (a significant problem when analyzing foreign-born couples, who were at high risk of returning to their country of origin and therefore more likely to be unobserved in the second census than U.S.-born couples). These couples likely differed in significant ways from couples in the targeted populations of surviving couples, biasing comparisons.
Sources: IPUMS 1850, 1860, and 1870 complete-count datasets (Ruggles et al. 2021) and IPUMS MLP datasets (Helgertz et al. 2020).
Note: Universe: The cross-sectional dataset includes all currently married white couples with women age 20–49 and with one or more children age 0–4 in the household in the IPUMS complete-count datasets with spouses aged 25–59. Couples in the linked datasets include currently married white couples with women age 20–49 in the first of the two censuses with one or more children age 0–4 in the first census.
Displayed values are the percentages of cases in each group unless labeled otherwise. Values shown for the linked datasets are from the first of two linked censuses (Census A). The one exception is the proportion of children aged 10–19 in the second of the two linked censuses (Census B) who were linked back to a child age 0–9 in Census A. The number of surviving couples was estimated using the 10-year survival probabilities in Hacker (2010) for the native-white population. Because the survival probability of the foreign-born population was likely lower, the true number of couples surviving to the next census was likely somewhat lower than estimated. The estimated number of surviving and enumerated couples relied on the overall net census undercount estimates for both sexes combined reported in Hacker (2013). Because the net undercount included a small but unknown number of double-counted couples, the true number of couples who were potentially linkable in both censuses was likely somewhat lower.
Given these caveats, the results indicate that couples in the panel datasets closely resembled couples in the cross-sectional datasets, with some modest differences in expected areas. Compared with couples in the cross-sectional datasets, couples in the linked panel datasets were moderately more likely to be literate, born in the United States, living in rural areas, living on a farm, and living in the Northeast census region, and they had more children living in the household (although the number of children under age 5 was very similar). Linked couples also had approximately 15% greater wealth than couples in the cross-sectional datasets. Overall, however, the differences were modest and might have resulted from unobserved differences in couples’ survival over the census interval, not biases incurred in the linking process. Rural couples and wealthier couples in the cross-sectional dataset, for example, might have been more likely to survive the decade than urban couples and poorer couples. We would therefore expect to link a higher percentage of these couples to the second census, even if the automatic linking algorithms linked the same percentage of surviving couples. We therefore did not weight our analytical datasets to represent the cross-sectional populations as suggested by some researchers (Antonie et al. 2020; Bailey et al. 2020). Although it was theoretically possible to do so, weighting the dataset had the potential to make our results less representative of the targeted population of surviving couples who remained in the United States. Nevertheless, we caution that the results presented below are strictly applicable only to the population of linked couples. In the Online Appendix we show our results for the 1870–80 panel dataset using weights derived from inverse propensity scores (Bailey et al. 2020) to yield statistics representative of the cross-sectional population in 1870. These results correspond very closely to the results presented below and had no effect on our conclusions. In the specific case of the relationship between wealth and child mortality the results were nearly identical.
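The logic of inverse propensity weighting can be illustrated with a small Python sketch. It is not the procedure used in the Online Appendix: propensity scores are normally estimated with a fitted model, whereas this sketch substitutes a simple cell-based estimate, taking the share of cross-sectional couples in each stratum who were linked. The function name, the stratum labels, and the counts are all hypothetical.

```python
from collections import Counter

def inverse_propensity_weights(population_cells, linked_cells):
    """Return {cell: weight} for linked couples.

    population_cells - stratum label of every couple in the cross-section
    linked_cells     - stratum label of every couple in the linked panel
    The propensity of a cell is linked / population, so the inverse
    propensity weight is population / linked. (Cell-based stand-in for
    a model-based propensity score; an assumption of this sketch.)
    """
    population = Counter(population_cells)
    linked = Counter(linked_cells)
    return {cell: population[cell] / linked[cell] for cell in linked}

# Made-up example: rural couples are linked more often than urban couples.
population_cells = ["rural"] * 4 + ["urban"] * 6
linked_cells = ["rural"] * 2 + ["urban"] * 2
weights = inverse_propensity_weights(population_cells, linked_cells)

# Weighted counts of linked couples reproduce the cross-sectional totals.
weighted_totals = {
    cell: weights[cell] * count for cell, count in Counter(linked_cells).items()
}
```

In this toy case the under-linked urban couples receive the larger weight, so the weighted panel has the same rural and urban composition as the cross-section.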
At the bottom of Table 1, we make rough estimates of the overall size of the potentially “linkable” population, defined as couples who survived the intercensal interval and were enumerated by both censuses. To estimate the size of the linkable population we relied on published ten-year survival estimates and census undercount estimates for the native-born white population (Hacker 2010, 2013). Because our data include foreign-born couples – who likely suffered higher mortality rates and were more likely to be missed by the subsequent census – our estimates of the size of the linkable population were likely biased upwards (and the corresponding linkage rates downwards). The results indicate that our linking methods identified between 62% and 69% of the potentially linkable population, depending on the census interval. The true linkage percentage was likely higher. Among the children enumerated in the second of the two censuses who were old enough to be enumerated in the first census, 84% were linked to the first census. Unlinked couples and unlinked children either had several potential matches (e.g., two or more potential matches for couples with common names), were enumerated poorly (e.g., with an inaccurate age, misspelled name, inaccurate birthplace, or only an initial), were transcribed poorly, or (in the case of the unlinked children) were not enumerated in the first of the two censuses.
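The direction of this bias can be made concrete with a short Python sketch. The formula is our reading of the description above, not a documented calculation: the linkable population is taken to be the number of couples in the first census multiplied by an assumed ten-year couple survival probability and by the share enumerated in the second census. The function name and all input values are hypothetical.

```python
def linkage_rate(n_linked, n_couples_a, p_survive, net_undercount):
    """Share of the potentially linkable population that was linked.

    n_linked       - couples linked between the two censuses
    n_couples_a    - couples enumerated in the first census
    p_survive      - assumed probability that both spouses survive 10 years
    net_undercount - assumed net undercount rate of the second census
    """
    linkable = n_couples_a * p_survive * (1 - net_undercount)
    return n_linked / linkable

# Made-up inputs, for illustration only.
rate_native_assumption = linkage_rate(500, 1000, 0.80, 0.06)

# If true survival was lower (e.g., because foreign-born couples had
# higher mortality), the linkable population shrinks and the rate rises.
rate_lower_survival = linkage_rate(500, 1000, 0.75, 0.06)
```

With a fixed number of linked couples, any overstatement of survival or understatement of the undercount inflates the denominator, which is why the reported linkage percentages are best read as lower bounds.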
Measuring child mortality
Our basic approach is easily described. To estimate child mortality, we began by restricting the analytical datasets to married couples linked between the 1850–60, 1860–70, and 1870–80 censuses who had one or more coresident own children in the first of the two linked censuses (hereafter, Census “A”). We then determined the number of those children who survived to the second of the two censuses (hereafter, Census “B”) using the links provided by the IPUMS MLP project. Finally, we assumed that children in Census A who were not linked to Census B died in the ten-year interval between the two censuses. Dividing the number of children dying by the number of at-risk children in Census A resulted in the proportion of each couple’s children dying in the ten-year interval.
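The calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the analysis: the record layout (`couple_id`, `linked_to_b`) and the function name are hypothetical, and the sketch covers only the basic approach, before any corrective adjustments.

```python
from collections import defaultdict

def proportion_dying(children):
    """Return {couple_id: share of at-risk children not linked to Census B}.

    `children` is an iterable of dicts with hypothetical keys:
      couple_id   - identifier of the linked couple
      linked_to_b - True if the Census A child was linked to Census B
    Children not linked to Census B are assumed to have died in the
    ten-year interval between the two censuses.
    """
    at_risk = defaultdict(int)
    deaths = defaultdict(int)
    for child in children:
        at_risk[child["couple_id"]] += 1
        if not child["linked_to_b"]:
            deaths[child["couple_id"]] += 1
    return {cid: deaths[cid] / n for cid, n in at_risk.items()}

# Made-up records: couple 1 has two at-risk children, couple 2 has one.
sample = [
    {"couple_id": 1, "linked_to_b": True},
    {"couple_id": 1, "linked_to_b": False},
    {"couple_id": 2, "linked_to_b": True},
]
result = proportion_dying(sample)
```

Because every unlinked child counts as a death, any failure to link a surviving child raises the estimated proportion dying.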
This basic approach is subject to several sources of error, which we minimized by restricting our analysis to a narrow age group of at-risk children and by making a few corrective adjustments to the data. An initial exploratory analysis indicated several potentially important sources of error: (1) failures by the IPUMS MLP project to link a few children present in both censuses; (2) differences in the census under-enumeration of children, especially among children under the age of 1; and (3) unobserved differences in the age at which children left their parents’ home. This initial analysis is described in the Online Appendix.
Briefly, our analysis indicated that the IPUMS MLP, while taking steps to limit type I errors, was too conservative in linking children of married couples who could be safely assumed to be the same child. We therefore forced links among couples’ unlinked children with the same sex and approximate birth years in both censuses. In 1870, for example, Missouri couple Joab and Elizabeth Hobson had three male children, named “A L,” “A J,” and “F S,” aged 4, 4, and 1, who were not linked forward to a child in the 1880 census. In 1880, Joab and Elizabeth had three male children, named “Abraham,” “Andrew,” and “Phillip,” aged 14, 14, and 11, all born in Missouri, who were not linked backward to a child in the 1870 census. We forced links between these children and similar unlinked children with the same sex and approximate birth years of other linked couples. We then used children aged 10–19 in Census B to estimate census undercounts among children aged 0–9 in Census A, under the assumption that children coresiding with their mothers and fathers when aged 10–19 also should have been present in their parents’ households a decade earlier when aged 0–9. These estimates indicated that children under the age of 1 were much more likely to be missed by the 1850, 1860, and 1870 censuses than children aged 1–9. Although we were able to estimate and adjust for each couple’s undercounted children, the adjustments were large for children aged 0, and we had no alternate means of evaluating their accuracy. We therefore decided to drop children aged 0 in Census A from the analysis.
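A forced-link rule of the kind described above might look like the following Python sketch. It is an assumption-laden illustration, not the procedure actually applied: the two-year tolerance on implied birth year, the greedy closest-first pairing, and the function name are all hypothetical choices. Names are carried along only for display and play no part in matching.

```python
def force_links(unlinked_a, unlinked_b, year_a, year_b, tolerance=2):
    """Pair one couple's unlinked children across two censuses.

    Each child is a (name, sex, age) tuple. A pair is accepted when the
    sexes match and the implied birth years differ by at most `tolerance`
    years (an assumed threshold). Closest pairs are accepted first, and
    each child is used at most once.
    """
    candidates = []
    for i, (_, sex_a, age_a) in enumerate(unlinked_a):
        for j, (_, sex_b, age_b) in enumerate(unlinked_b):
            gap = abs((year_a - age_a) - (year_b - age_b))
            if sex_a == sex_b and gap <= tolerance:
                candidates.append((gap, i, j))
    used_a, used_b, links = set(), set(), []
    for gap, i, j in sorted(candidates):
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            links.append((unlinked_a[i][0], unlinked_b[j][0]))
    return links

# The Hobson children from the example in the text.
hobson_1870 = [("A L", "M", 4), ("A J", "M", 4), ("F S", "M", 1)]
hobson_1880 = [("Abraham", "M", 14), ("Andrew", "M", 14), ("Phillip", "M", 11)]
links = force_links(hobson_1870, hobson_1880, 1870, 1880)
```

All three Hobson boys are paired under this rule. Which of the two four-year-olds is matched to which fourteen-year-old is arbitrary, but that does not affect the count of surviving children, which is all the mortality estimate requires.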
Finally, our preliminary analysis also indicated that children who were still living in Census B but who had left their parents’ homes were less likely to be linked between the two censuses by the IPUMS MLP project. If not linked, these children will appear in our analysis to be deceased. Because the propensity for children to leave their parents’ households likely varied by the wealth and other characteristics of their parents, the departure of children from their parents’ homes could bias our results. Unfortunately, the age at which a couple’s children left home is unobserved and can only be estimated at the population level using model life tables and an assumed level of mortality. Steckel (Reference Steckel1996) has estimated that the median age of leaving home in the period 1850–60 was 26 years for white males and 25 years for white females, but also concluded that a small percentage of boys and girls left home before age 16. Although we had no means of estimating how the age of leaving home varied by wealth, our exploratory analysis indicated that age-specific 10-year survival probabilities of children closely corresponded with model life tables for children aged 3 and younger in Census A (aged 13 and younger in Census B), but then increasingly diverged from the expected age pattern of survival for children aged 4 and above in Census A (aged 14 and above in Census B). Although the divergence was small for children aged 4 in Census A, we decided to drop children aged 4 and older from the analysis. Together with our decision to drop children aged 0 because of their relatively large undercount in Census A, this limited our final analytical population to linked couples with children aged 1–3 in Census A. Although this narrow age group of at-risk children limits the number of cases available for analysis, the large size of the MLP datasets ensures that the number of cases remains large.
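As a rough illustration of the resulting outcome measure, the sketch below computes one couple's proportion of children dying, restricted to children aged 1–3 in Census A and treating unlinked children as deceased. The tuple layout and function name are hypothetical, and the undercount adjustment described earlier is deliberately left out.

```python
def proportion_dying(children, min_age=1, max_age=3):
    """Return (proportion dying, number at risk) for one couple.

    `children` is a list of (age_in_census_a, linked_to_census_b) tuples.
    Children outside the age window are ignored; unlinked children in the
    window are counted as deaths.
    """
    at_risk = [linked for age, linked in children if min_age <= age <= max_age]
    if not at_risk:
        return None, 0  # couple contributes no at-risk children
    deaths = sum(1 for linked in at_risk if not linked)
    return deaths / len(at_risk), len(at_risk)


# Hypothetical couple: children aged 0, 2, 3, and 6 in Census A.
example = [(0, True), (2, True), (3, False), (6, True)]
share, n_at_risk = proportion_dying(example)
```

Here only the children aged 2 and 3 are at risk, and one of them is not found in Census B, so the couple's proportion dying is 0.5 with two children at risk.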
A more detailed discussion about possible biases and alternative measurements of child mortality, including the sensitivity of the results to the reliance on different age groups of children, is provided in the Online Appendix and concludes that our findings are robust to alternative assumptions.
Measuring parental wealth
Most historical studies of child mortality have relied on fathers’ occupation or occupational group as a measure of socioeconomic status, or economic well-being more generally (e.g., Dribe et al. Reference Dribe, Hacker and Scalone2020; Garrett et al. Reference Garrett, Reid, Schürer and Szreter2001). Although occupations were correlated with income and have been used to assign income scores to individuals in historical censuses (e.g., Sobek Reference Sobek1995), there were likely unknown but significant heterogeneities within occupations in terms of income and wealth. This represents a serious limitation in the mid-nineteenth-century United States, when approximately half of all fathers were enumerated in the census as “farmers.” Wealth data recorded by the 1850, 1860, and 1870 censuses allow us to examine the relationship between parents’ economic well-being and child mortality more directly. It is likely, of course, that economic status was also correlated to some extent with social status, and hence that wealth also captures other aspects of social position than economic living standards.
We first combined wives’ and husbands’ real estate, personal estate, and their total combined wealth, if reported separately, into a measure of couples’ real, personal, and total wealth, measured in United States dollars unadjusted for inflation, which we then divided into deciles for each census year to the extent possible. Although women’s property acts were enacted by an increasing number of states following the passage of the first such act in Mississippi in 1839, nineteenth-century coverture laws typically meant that all property held by women prior to their marriage became the property of their husbands at marriage. Enumerators’ reporting practices also clearly assumed that all wealth held jointly by a couple would be reported on the husband’s record. In each census, less than 1% of married women had wealth reported separately. Because wealth typically increased with the husband’s and wife’s age, we included women’s age and spouses’ age differentials in regression models.
Enumerator instructions stated that only real and personal estate wealth above $100 was to be recorded, so each wealth variable is effectively truncated below that value. On average, nearly half of all households in 1850 with a childbearing woman reported no real estate wealth, although the proportion varied by subgroup (e.g., “farm couples,” in which the husband’s occupation was recorded as “farmer” or “farm laborer,” were more likely to own real estate). Depending on the census year and wealth variable, the percentage of couples reporting no wealth in national models ranged from a high of 44%, in the case of couples’ real estate wealth in the 1870–80 dataset, to a low of 12%, in the case of couples’ total wealth in the 1860–70 dataset.
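Because a large share of couples reported no wealth, the lower deciles cannot be separated from one another. One way to picture deciles "to the extent possible" is sketched below. It is an assumption rather than the procedure documented in the source: couples with no reported wealth form their own group (coded 0), and every other couple is assigned the decile implied by the share of all couples with strictly less wealth.

```python
from bisect import bisect_left


def wealth_groups(wealth):
    """Return a group code per couple: 0 for no reported wealth, else 1-10."""
    ordered = sorted(wealth)
    n = len(ordered)
    groups = []
    for w in wealth:
        if w <= 0:
            groups.append(0)  # separate group: no reported wealth
            continue
        below = bisect_left(ordered, w)  # couples with strictly less wealth
        # Integer arithmetic avoids floating-point edge cases at boundaries.
        groups.append(min(10, (below * 10) // n + 1))
    return groups


# Hypothetical couples: 40% report no wealth, the rest report dollar values.
wealth = [0, 0, 0, 0, 150, 300, 800, 1700, 4000, 12000]
groups = wealth_groups(wealth)
```

With 40% of couples reporting no wealth, the first populated decile in this toy example is the fifth, so a gradient can only be traced across the upper part of the distribution.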
Nineteenth-century census officials lacked the resources to analyze the wealth results and some officials, including Census Superintendent Francis Amasa Walker, expressed skepticism that wealth was accurately reported or that the questions were worth the resources spent for their collection (Magnuson Reference Magnuson1995: 58–9). Census enumerators were told to ignore mortgages and liens. Although mortgages were less common in the nineteenth century – Snowden (Reference Snowden, Carter, Gartner, Haines, Olmstead, Sutch and Wright2006) estimates about 30% of properties in the period were mortgaged – there is some possibility that wealth estimates are biased. In addition, although the censuses were never used to assess direct taxes, returns were not private, and some individuals may have hesitated to report their true wealth, especially their true personal estate wealth, which was easier to conceal and less evident to observers. Distrust in the census among nineteenth-century individuals was generally low, however. Most respondents probably tried to give an accurate approximation. Richard Steckel’s (Reference Steckel1994) comparison of wealth reported by individuals in the 1870 census with independent assessments made by assessors in Boston, Salem, Lexington, Westminster, and Sturbridge, Massachusetts, indicated similar wealth distributions and high correlations between the two measures.
Before conducting our analysis, we made some preliminary examinations of the wealth data. In aggregate, the data appear reasonably accurate. Nationally, mean wealth rose steadily with individuals’ age until about age 45–50, then rose at a slower pace until about age 65, after which it began to decline. This age pattern is expected; men’s age-related debilities increased after middle age, work hours and incomes declined, and wealth bequests were increasingly made to children reaching adulthood (Kearl and Pope Reference Kearl and Pope1983). Although occupational income scores for each occupation (the mean annual income for men reporting each occupation) are not available in the census until 1950 (Sobek Reference Sobek1995), application of those income scores to the nineteenth-century occupations (using the IPUMS “occscore” variable) indicated that wealth and occupation income scores were strongly correlated, increasing confidence in the wealth data (see the Online Appendix for more information on the age-pattern of wealth and the correlation between occupational income scores and wealth). In addition, wealth differentials by income, nativity, and region were consistent with the economic and social history literature, with greater wealth among native-born couples relative to foreign-born couples and greater wealth in 1860 in the South, where the rapidly growing slave population increased slaveowners’ personal estate wealth and where market returns from cotton were high, relative to other regions. As expected, the American Civil War and the emancipation of slaves in the war’s aftermath took a major toll on white men’s wealth in the South, with dramatically lower wealth in the 1870 census relative to the 1860 census (Dupont and Rosenbloom Reference Dupont and Rosenbloom2022). Our interpretation is that while wealth might be inconsistently recorded for individuals, in aggregate the results should be interpretable.
Any measurement error at the individual level will tend to attenuate the estimated coefficients on the wealth variables and reduce their statistical significance, making it less likely that we reject the null hypothesis of no association. Our results, therefore, will tend to be conservative.
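The direction of this bias can be made explicit under the textbook assumption of classical measurement error, which the source does not state formally. As an illustrative assumption, let reported wealth $w$ equal true wealth $w^{*}$ plus an error $u$ that is uncorrelated with both true wealth and the regression error. In a bivariate regression of child mortality on reported wealth, the OLS slope then converges to the true slope $\beta$ multiplied by a factor between 0 and 1 (the notation requires the amsmath package):

```latex
\[
w = w^{*} + u, \qquad
\operatorname{plim}\,\hat{\beta}
  = \beta \cdot \frac{\sigma^{2}_{w^{*}}}{\sigma^{2}_{w^{*}} + \sigma^{2}_{u}},
\qquad
0 < \frac{\sigma^{2}_{w^{*}}}{\sigma^{2}_{w^{*}} + \sigma^{2}_{u}} \le 1 .
\]
```

The larger the error variance $\sigma^{2}_{u}$, the more the estimate shrinks toward zero. The models in this study use categorical wealth deciles and additional covariates, so the expression is only an analogy for the expected direction of the bias, not a formula for its size.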
Descriptive statistics
Table 2 shows the mean proportion of children aged 1–3 dying by father’s occupation group, region, rural–urban residence, and parents’ literacy, nativity, and total combined wealth tercile in the three panel datasets. Overall, the results indicate that child mortality was lowest in the 1870s, which coincides with the approximate timing of the onset of the mortality transition (Haines Reference Haines, Haines and Steckel2000). Mortality was higher in the 1860s than in the 1850s. The increase in mortality between the 1850s and 1860s may have been a consequence of the American Civil War (1861–65), which is known to have facilitated the spread of infectious diseases. The 1866 cholera epidemic may have also contributed to higher child mortality rates in the 1860s (Rosenberg Reference Rosenberg1987). About 9% of children aged 1–3 in 1850 and 1870 died prior to the subsequent census. These proportions correspond to Model West level 9 in the 1850s and Model West level 10 in the 1870s and are close to mortality levels estimated elsewhere (Hacker Reference Hacker2010). The 11% dying in the 1860s roughly corresponds to Model West level 7, a higher level of mortality than is typically estimated for the period.
Source: See Table 1.
Notes: Universe: Currently married white couples with women aged 20–49 with spouses aged 25–59 and with one or more children aged 1–3 in the household after adjustments for undercounting as described in text.
Results were weighted by the number of couples’ children aged 1–3. In 1860, the total wealth terciles for couples in the model were $0–$299, $300–$1699, and $1700 and above. In 1870 they were $0–$299, $300–$1899, and $1900 and above.
Within categories, mortality was lower among children whose fathers were farmers, whose parents were in the upper tercile of total wealth, and who lived in a rural area and the Midwest census region. Mortality was higher among children whose fathers were service workers and general laborers, among children whose parents were in the lower tercile of total wealth, and among children who lived in larger cities and in the Northeast census region. The results show little difference between children born to literate and illiterate parents and, in contrast to research for the turn of the twentieth century (Preston et al. Reference Preston, Ewbank, Hereward and Cotts Watkins1994; Dribe et al. Reference Dribe, Hacker and Scalone2020), relatively little advantage among children born to native-born parents relative to foreign-born parents. Many of these variables were intercorrelated, of course. Literate parents, for example, were more likely to live in urban areas. In a study of aggregate mortality in the period 1830–60, Haines et al. (Reference Haines, Craig and Weiss2003) found higher crude death rates in counties with higher per capita wealth, which were also counties with greater urbanization and transportation links. To assess the importance of parents’ wealth independent of other covariates, we turn to regression analysis.
Regression analysis
We model the proportion of children dying between Census A and Census B as a function of each couple’s socioeconomic characteristics in Census A – including their real estate wealth, personal estate wealth, and combined total estate wealth – to examine the possible relationship between wealth and child mortality. We use weighted OLS regression to model child mortality in each dataset. We weight the results by the number of children aged 1–3 in Census A at risk of death, a standard approach in the literature that reduces the problem of heteroskedasticity and gives more weight to couples with more children, just as they do in overall levels of child mortality in the population (Preston and Haines Reference Preston and Haines1991: 137–8).Footnote 1 In most models we employ county-level fixed effects to control for unobserved heterogeneity, such as county-level variations in the disease environment, which may have been significant (see, for example, county-level maps for 1910 in Dribe et al. Reference Dribe, Hacker and Scalone2020). We also cluster standard errors at the county level.
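The logic of the weighting can be seen in a small worked example with hypothetical couples. Weighting each couple's proportion dying by its number of children at risk reproduces the child-level mortality rate, whereas an unweighted mean of couple proportions does not. The weights also have a statistical rationale: the sampling variance of a proportion based on n children is proportional to 1/n, so couples with more children provide more precise observations.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean of `values`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical couples: (children aged 1-3 at risk, deaths before Census B).
couples = [(1, 0), (3, 1), (2, 0), (2, 1)]

proportions = [deaths / at_risk for at_risk, deaths in couples]
weights = [at_risk for at_risk, _ in couples]

couple_weighted = weighted_mean(proportions, weights)
child_level = sum(d for _, d in couples) / sum(n for n, _ in couples)
unweighted = sum(proportions) / len(proportions)
```

In this toy example the weighted couple-level mean and the child-level rate are both 0.25 (2 deaths among 8 children), while the unweighted couple mean is lower because it gives the one-child couple the same influence as the three-child couple.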
Independent variables, which we observe in Census A prior to the observation of children’s mortality, include dummy variables for residence type (rural, urban 2,500–9,999, urban 10,000–24,999, urban 25,000–99,999, urban 100,000 plus); mother’s age group; age difference between spouses; couple’s nativity (native born or born in Germany, Ireland, Great Britain, Canada, or other foreign country); couple’s literacy; and wealth decile. We construct separate models using couples’ real property wealth decile in Census A (available in the 1850, 1860, and 1870 censuses), personal property wealth decile (available only in the 1860 and 1870 censuses), and total property wealth decile (also available only in the 1860 and 1870 censuses).
For each dataset, we construct a model for the entire nation and a model for each of the three major census regions, Northeast, Midwest, and South (the West census region included too few observations for consistent analysis). We also construct national models for the rural, agricultural population (children of fathers whose occupation was “farmer” or “farm laborer”) and the urban, nonagricultural population. In these two models, we shift the area fixed effects from the county level to the state economic area (SEA) level to ensure enough within-unit variation. SEAs, which were defined by the Census Bureau in 1950, are aggregations of two or more contiguous counties with similar economic orientations and are a level of geographic aggregation between counties and states. In the 1870–80 MLP dataset, for example, couples resided in 2,239 counties, 439 SEAs, and 48 states or territories.
In all models, couples with no reported wealth are the reference group.Footnote 2 Because we are modeling the proportion of each couple’s children dying using OLS and weighting the results by the number of children at risk in Census A, positive coefficients represent the increase in the proportion of children dying relative to the reference group, while negative coefficients represent the decrease in the proportion dying, all else being equal.
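To make this interpretation concrete, consider a stripped-down model with only an intercept and wealth-group dummies, and no other covariates or fixed effects. In that special case each weighted OLS coefficient equals the weighted mean proportion dying in the group minus the weighted mean in the reference group. The sketch below computes those differences directly. The data, group codes, and function name are hypothetical, and the gaps are exaggerated for readability.

```python
from collections import defaultdict


def dummy_coefficients(rows, reference=0):
    """Weighted group-mean differences relative to the reference group.

    `rows` holds (wealth_group, proportion_dying, children_at_risk) per
    couple. With only group dummies in the model, these differences equal
    the weighted OLS dummy coefficients.
    """
    num = defaultdict(float)
    den = defaultdict(float)
    for group, prop, n in rows:
        num[group] += prop * n
        den[group] += n
    means = {g: num[g] / den[g] for g in den}
    return {g: means[g] - means[reference] for g in means if g != reference}


rows = [
    (0, 0.5, 2), (0, 0.0, 2),                  # no wealth: 1 of 4 died
    (5, 0.0, 1), (5, 1 / 3, 3), (5, 0.0, 1),   # decile 5: 1 of 5 died
    (10, 0.0, 4), (10, 0.5, 2), (10, 0.0, 4),  # decile 10: 1 of 10 died
]
coefs = dummy_coefficients(rows)
```

A coefficient of about -0.05 for decile 5 means that the proportion of children dying was 5 percentage points lower than among couples with no reported wealth. Once covariates and area fixed effects are added, as in the models estimated here, the coefficients are adjusted differences rather than raw ones, but they are read in the same way.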
Results
Full model results are shown in the Online Appendix and confirm most expectations, such as higher mortality in urban areas. Here, we limit our discussion to the relationship between parents’ wealth decile and child mortality.
Figures 1, 2 and 3 display the overall national results for the three wealth variables (real, personal, and total property) for couples in each period. Figure 1 plots the relationship between couples’ real-estate wealth and child mortality. The patterns are similar across the three periods. Approximately 40–50% of couples had no real estate wealth. In the upper half of the wealth distribution, we see a remarkably linear negative relationship between wealth and child mortality, with children of parents in each higher decile experiencing progressively lower mortality relative to the reference group of children born to parents with no real estate wealth, and lower mortality than the children born to parents in the next lower decile.Footnote 3 Children of couples in the highest real estate wealth decile experienced mortality about 1.5 percentage points lower in the 1850–60 and 1860–70 panel datasets than children of couples with no real estate wealth. In the 1870–80 panel dataset, the difference was 2 percentage points. Since ten-year mortality for children in the 1–3-year age group averaged about 10% in the three panel datasets, this was approximately equivalent to a 15–20% lower rate of mortality.
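The step from the absolute differences to the relative ones in the paragraph above is simple arithmetic: the gap in mortality is divided by the baseline level of about 10%. A minimal sketch, with a hypothetical helper name:

```python
def relative_difference(gap_points, baseline_percent):
    """Express an absolute gap (percentage points) as a percent of baseline."""
    return 100 * gap_points / baseline_percent


# Gaps reported for the highest real estate wealth decile, against a
# baseline ten-year mortality of about 10%.
rel_1850s_1860s = relative_difference(1.5, 10)
rel_1870s = relative_difference(2.0, 10)
```

These evaluate to 15.0 and 20.0, matching the 15–20% range given in the text. Because the baseline is an approximate average across the three datasets, the relative figures are approximate as well.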
Figures 2 and 3 indicate very similar patterns for the relationships between personal estate wealth and child mortality and between total estate wealth and child mortality. The plots for personal and total wealth (couples’ combined real and personal wealth) for couples in the 1860–70 panel dataset suggest that mortality was modestly higher for children of couples in the lowest non-zero decile of wealth relative to children of no-wealth couples.Footnote 4 The wealthiest 20% of families experienced about 2 percentage points lower child mortality than the bottom 20%, which is a non-trivial magnitude given the mean percentage of children aged 1–3 dying within 10 years, which was 11% in the 1860–70 dataset and 9% in the 1870–80 dataset (see Table 2).
Turning to the regional models, we see similar results. Tables 3, 4 and 5 show the results for couples’ three measures of wealth in the three datasets (full model results can be found in the Online Appendix). In all contexts – whether national or regional, in models limited to the urban nonfarm population or to the rural farm population, or in models for the three periods studied – we find a negative gradient between wealth and child mortality. Although children born to parents in a higher wealth decile occasionally had modestly higher mortality than children born to parents in the immediately adjacent lower decile, the gradients we find are remarkably smooth. Interestingly, despite the different contexts, the patterns are quite similar across regions. The gradient was somewhat steeper in the Northeast census region than in the South census region and in the urban non-agricultural population than in the rural agricultural population (see Figure 4, which plots the relationship between total wealth and child mortality by region in the 1870–80 interval, which was less likely to have been influenced by the American Civil War between 1861 and 1865). In terms of magnitudes, the difference between the top 10% and bottom 10% of the total wealth distribution in 1870–80 was 1.9 percentage points in mortality in the South and 2.8 percentage points in the Northeast. The corresponding differences were about 2.2 percentage points in rural agricultural occupations and 2.8 percentage points in urban non-agricultural occupations. These are minor differences given the dramatically different contexts. Together, these results clearly demonstrate the universal negative association between wealth and child mortality in the United States in this period.
Source: See Table 1.
Notes: The dependent variable is the proportion of children age 1–3 in Census A dying prior to Census B. Results are weighted by the number of children at risk of death. The models for the nation and each of the three major regions employ county-level fixed effects. The models for the nonagricultural, urban, and the agricultural, rural populations employ SEA-level fixed effects.
*p < 0.05. **p < 0.01. ***p < 0.005.
Conclusion
Although a socioeconomic gradient in health and mortality appears universal in modern societies, its historical origins remain unclear. In the United States, analysis of the historical relationship between socioeconomic status and mortality has been limited by poor data. The nation’s death registration was first established in 1900 and not completed until 1933. As a result, most historical studies of the mortality gradient have focused on the turn of the twentieth century, when children ever born and children surviving data collected by the 1900 and 1910 censuses allow the measurement of child mortality and the testing of hypotheses. Even in that context, the lack of income and wealth data limits the investigation to occupation and homeownership. Although modest gradients have been found among children born to parents in different occupational groups and between children born to parents who did and did not own their own homes, heterogeneity of income and wealth within occupation groups may obscure the existence of a larger mortality gradient.
The main contribution of this study is to leverage the recently created IPUMS MLP links for the censuses 1850–60, 1860–70, and 1870–80 to examine the relationship between parental wealth and child mortality. Our findings indicate that a negative relationship between the wealth of parents and the mortality of children was present as early as the first period, 1850–60. We find a remarkably smooth negative gradient, with children born to parents at each higher wealth decile experiencing lower mortality than the children born to parents at lower wealth deciles. The relationship was apparent for both real estate wealth and personal estate wealth.
The wealth gradient in child mortality was also clearly evident within each of the three major census regions, among both the urban non-agricultural and rural agricultural populations, and was quite consistent across the three decades examined (1850–60, 1860–70, and 1870–80). The estimated magnitudes were substantial. The difference in child mortality between the top 10% and bottom 10% of total wealth was 1–3 percentage points. Because only about 10% of children aged 1–3 died in the following ten-year period, these differences represented about a 10–30% higher risk of death among children whose parents were in the lowest decile of wealth relative to children whose parents were in the highest decile of wealth.
These results suggest that the mortality gradient in the United States was long standing, at least for infants and children. Although this conclusion is supportive of fundamental cause theory (Link and Phelan Reference Link and Phelan1995, Reference Link and Phelan1996), historical demographers might find it surprising. The period between 1850 and 1880 was prior to the mortality transition, which commenced in the United States in the 1870s, and the widespread acceptance of the germ theory of disease. Access to medical care was in most cases of dubious value and public health efforts focused only on the elimination of noxious odors and visually unclean water (Duffy Reference Duffy1992). Eighteenth- and nineteenth-century Americans, judging by their heights, appear to have been well fed, although a decline in heights of about one inch on average during the antebellum era may have been associated with growing inequality (Steckel Reference Steckel2009; Carson Reference Carson2013). Evidence for U.S. passport applicants, for example, suggests that the gap between elite and non-elite heights was growing between 1800 and 1860 (Sunder Reference Sunder2013). But higher net nutritional status provided little protection from acute infectious diseases, which were the primary killers of adults and children. Moreover, residential segregation by class and race was low. According to Daniel Scott Smith, it was an era when “one was killed, so to speak, by his or her neighbors” (Reference Smith1983).
Nonetheless, parental wealth can affect the mortality of children through several potential pathways that we are unable to measure directly, and those pathways may have included some role for differentials in nutrition, as well as roles for differences in health care, environment, and parental behaviors, such as breastfeeding and child spacing. Although knowledge about disease was rudimentary in the mid-nineteenth century, the early sanitation movement may have led to new forms of cleanliness and domestic hygiene among wealthier groups that were advantageous to child survival (Bushman and Bushman Reference Bushman and Bushman1988; Tomes Reference Tomes1990; Duffy Reference Duffy1992). Our results are consistent with the hypothesis that wealth allowed parents to buy modest survival advantages for their children, but they do not allow us to distinguish whether those advantages were related to better living conditions, housing, environment, health care and nutrition, or possibly to different parental behavior and lifestyle not associated with wealth or income as such but with higher social status.
While census data are invaluable to estimating the relationship between wealth and child mortality across the entire United States during a period for which we lack civil registration data, they are not as helpful when trying to assess the mechanisms. In future research, other sources could be explored to get more in-depth knowledge about the likely pathways between parental wealth and child mortality.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/ssh.2023.12
Acknowledgements
A previous version of this paper was presented at the annual meeting of the SSHA, Philadelphia, November 11–14, 2021. J. David Hacker and Jonas Helgertz’s research was supported in part by funding from the Minnesota Population Center (P2C HD041023) and by a grant from the Eunice Kennedy Shriver National Institute of Child Health and Human Development (R01-HD082120-01).