Introduction
Researchers increasingly rely on aggregated radiocarbon dates from archaeological contexts to represent relative human population size or settlement density across time and space (e.g., Chaput et al. Reference Chaput, Kriesche, Betts, Martindale, Kulik, Schmidt and Gajewski2015; Chaput and Gajewski Reference Chaput and Gajewski2016; Codding et al. Reference Codding, Roberts, Eckerle, Brewer, Medina, Vernon and Spangler2023, Reference Codding, Brenner Coltrain, Louderback, Vernon, Magargal, Yaworsky, Robinson, Brewer and Spangler2022; Crema and Shoda Reference Crema and Shoda2021; DiNapoli et al. Reference DiNapoli, Crema, Lipo, Rieth and Hunt2021; Glassow Reference Glassow1999; Kelly et al. Reference Kelly, Surovell, Shuman and Smith2013; McCool et al. Reference McCool, Codding, Vernon, Wilson, Yaworsky, Marwan and Kennett2022; Peros et al. Reference Peros, Munoz, Gajewski and Viau2010; Riris Reference Riris2018; Robinson et al. Reference Robinson, Zahid, Codding, Haas and Kelly2019; Shennan et al. Reference Shennan, Downey, Timpson, Edinborough, Colledge, Kerig, Manning and Thomas2013; Steele Reference Steele2010; Timpson et al. Reference Timpson, Colledge, Crema, Edinborough, Kerig, Manning, Thomas and Shennan2014; Williams 2012; Wilson et al. Reference Wilson, McCool, Brewer, Zamora-Wilson, Schryver, Lamson, Huggard, Brenner Coltrain, Contreras and Codding2022; Zahid et al. Reference Zahid, Robinson and Kelly2016). This “dates-as-data” approach (Rick Reference Rick1987) rests on several assumptions, including that datable material is deposited by past people, preserved through time, and sampled by archaeologists all in proportion to the relative number of people living at any one point in time (Contreras and Meadows Reference Contreras and Meadows2014; Crema Reference Crema2021; Williams 2012).
Critiques of the approach highlight that non-random patterning in the cultural deposition (Freeman et al. Reference Freeman, Byers, Robinson and Kelly2018), preservation (Contreras and Codding Reference Contreras and Codding2023; Surovell et al. Reference Surovell, Finley, Smith, Brantingham and Kelly2009), and sampling (Contreras and Meadows Reference Contreras and Meadows2014) of datable material, as well as structure in the calibration curve (Brown Reference Brown2015), may systematically bias aggregated radiocarbon dates in ways that do not reflect past population size. Despite these limitations, several studies validate some aspects of the approach using simulation (Brown Reference Brown2015; Carleton Reference Carleton2021; Carleton and Groucutt Reference Carleton and Groucutt2021; Contreras and Codding Reference Contreras and Codding2023; Contreras and Meadows Reference Contreras and Meadows2014) and multi-proxy archaeological evidence (e.g., Robinson et al. Reference Robinson, Bocinsky, Bird, Freeman and Kelly2021) to highlight the specific conditions under which aggregated dates might reflect patterning in past population size.
Here we add to this conversation by coupling ethnographic population density estimates summarized by Codding and Jones (Reference Codding and Jones2013) with a systematic record of radiocarbon dates from archaeological contexts compiled by Kelly et al. (Reference Kelly, Mackie, Robinson, Meyer, Berry, Boulanger, Codding, Freeman, Garland, Gingerich, Hard, Haug, Martindale, Meeks, Miller, Miller, Perttula, Railey, Reid, Scharlotta, Spangler, Thomas, Thompson and White2022) and updated by Meyer for the modern state of California to see if counts of radiocarbon-dated sites vary as a function of estimated ethnolinguistic population density.
Methods
To help evaluate whether counts of radiocarbon-dated archaeological sites reflect past population size, here we undertake a preliminary empirical validation study in California (see Figure 1). California provides a useful test case given the history of ethnohistoric research examining Indigenous population density (Baumhoff Reference Baumhoff1963; Cook Reference Cook1976, Reference Cook1956, Reference Cook1955; Heizer Reference Heizer1978; Kroeber Reference Kroeber1925), and the state’s robust legal protections for historic resources on state and private lands under the California Environmental Quality Act (CEQA) in addition to work on federal land under the National Historic Preservation Act. Researchers in the region also have a history of systematically compiling radiocarbon dates from archaeological sites (e.g., Breschini et al. Reference Breschini, Haversat and Erlandson2004, Reference Breschini, Haversat and Erlandson1996; Glassow Reference Glassow1999), which contributed to the recent national synthesis led by Robert Kelly and colleagues (Reference Kelly, Mackie, Robinson, Meyer, Berry, Boulanger, Codding, Freeman, Garland, Gingerich, Hard, Haug, Martindale, Meeks, Miller, Miller, Perttula, Railey, Reid, Scharlotta, Spangler, Thomas, Thompson and White2022), since updated by co-author Jack Meyer. The Meyer database includes 12,381 dates from archaeological contexts at sites in California, which has 1,549 additional dates not included in Kelly et al. (Reference Kelly, Mackie, Robinson, Meyer, Berry, Boulanger, Codding, Freeman, Garland, Gingerich, Hard, Haug, Martindale, Meeks, Miller, Miller, Perttula, Railey, Reid, Scharlotta, Spangler, Thomas, Thompson and White2022). While we do not expect ethnographic population density to reflect the actual density of past populations, as these fluctuated over time, we do expect there to be consistency in their relative variation across space. For example, counties with high ethnographic population density likely had high past population density relative to counties with low ethnographic populations. This assumption is supported by the well-documented influence of environmental parameters on hunter-gatherer population size (Baumhoff Reference Baumhoff1963; Codding and Jones Reference Codding and Jones2013; Tallavaara et al. Reference Tallavaara, Eronen and Luoto2018; Zhu et al. Reference Zhu, Galbraith, Reyes-García and Ciais2021). This is addressed more in the discussion.
Here we explore spatial variation in dated site counts, and how it is predicted by spatial variation in the ethnographic population estimates. Applying these findings to understand changes in dated site counts over time relies on making a “space-for-time” substitution, or assuming that the processes that underly spatial relationships also structure relationships over time. Such an assumption is often made when studying ecological processes (Blois et al. Reference Blois, Williams, Fitzpatrick, Jackson and Ferrier2013), though it is not always clear if spatial and temporal processes are equivalent (Lovell et al. Reference Lovell, Collins, Martin, Pigot and Phillimore2023). The approach is also implicit in many archaeological studies that leverage spatial variation in cross-cultural patterns to make analogies about change over time in the archaeological record. Here we suggest this may hold given that environmental factors likely influence population dynamics over time (Bevan et al. Reference Bevan, Colledge, Fuller, Fyfe, Shennan and Stevens2017; DiNapoli et al. Reference DiNapoli, Crema, Lipo, Rieth and Hunt2021; Kelly et al. Reference Kelly, Surovell, Shuman and Smith2013) and space (e.g., Codding and Jones Reference Codding and Jones2013; Tallavaara et al. Reference Tallavaara, Eronen and Luoto2018; Zhu et al. Reference Zhu, Galbraith, Reyes-García and Ciais2021). The analysis of spatial variation in radiocarbon dates from archaeological contexts have well-developed approaches for mitigating the effect of uneven sampling across space (Bevan et al. Reference Bevan, Colledge, Fuller, Fyfe, Shennan and Stevens2017; Chaput et al. Reference Chaput, Kriesche, Betts, Martindale, Kulik, Schmidt and Gajewski2015; Crema et al. Reference Crema, Bevan and Shennan2017; Crema et al. Reference Crema, Habu, Kobayashi and Madella2016) that can further facilitate making inferences through time and space.
Our unit of analysis is the county. While an arbitrary modern administrative boundary subject to the modifiable areal unit problem (Openshaw Reference Openshaw1983), it does have several benefits, including that aggregate information on archaeological sites are easily compiled at the county level given the use of Smithsonian trinomials (site-county-number) as site identifiers, which also allows these data to be openly shared without violating federal law restricting the dissemination of site locations (section 470hh of the Archaeological Resources Protection Act). One additional benefit of retaining analysis at the county level is that it aids in the interpretation of potential factors biasing the record, including the history of research, development-driven investigations, and broad environmental patterns.
All analyses are run in the R Environment for Statistical Computing (R Core Team 2022). We align the two datasets using the county polygons from the state TIGER/Line Shapefiles (California Department of Technology 2016). To calculate average population density per county, we convert estimates for each Indigenous ethnolinguistic group synthesized by Codding and Jones (Reference Codding and Jones2013) to a raster, and then extract the mean population density for each county shapefile using the raster package (Hijmans Reference Hijmans2018). To calculate dated site counts, we then sum the total number of unique radiocarbon-dated archaeological sites by each county name for each time interval of interest. We limit our sample to unique radiocarbon dated sites to avoid double-counting sites with multiple dates. This is a cautious approach that comes at the cost of potentially underrepresenting multi-component late Holocene sites. This is sometimes referred to as ‘thinning’ the sample (Crema and Bevan Reference Crema and Bevan2021). Dated sites on the California Channel Islands are summed by their administering county (i.e., San Miguel, Santa Cruz, Santa Rosa, and Santa Barbara Islands are all assigned to Santa Barbara County; Anacapa and San Nicolas to Ventura County; and San Clemente and Santa Catalina to Los Angeles County). Ethnographic population estimates for these islands are also averaged into the county estimate.
Because this analysis focuses on the relative differences in dated site counts across space, we are not concerned about the specific calendar ages or the influence of the calibration curve on aggregations. As such, we use radiocarbon ages to aggregate dates in radiocarbon years before present (BP). While calibrating each date before aggregation may shift the total number of dates included in each time interval, it would not systematically vary the relative counts of dates across space (i.e., per county). First, we evaluate the relationship with all late Holocene dates (operationally defined as 0–4000 BP). To assess the resilience of this relationship, we then iterate the model fit with increasing 500-year windows of time starting at 0–500 BP up to 0–5000 BP. As noted above, we select only one date per site for each time interval, so as to not bias the number of dates based on overly sampled sites. These data are available in the Supplementary Material.
We describe the relationship between dated site counts and population density using count regression implemented with a generalized linear model (GLM) that specifies a distribution and link appropriate to count data (see below), which have been shown to perform better than log-transformed count data with parametric models (O’Hara and Kotze Reference O’Hara and Kotze2010). We include county area as a log-offset to account for differences in counts caused by variation in sampling area (Bolker Reference Bolker, Fox, Negrete-Yankelevich and Sosa2015; Bolker et al. Reference Bolker, Brooks, Clark, Geange, Poulsen, Stevens and White2009). This approach to modeling count data is standard in ecology (e.g., Bolker et al. Reference Bolker, Brooks, Clark, Geange, Poulsen, Stevens and White2009; Zuur et al. Reference Zuur, Ieno, Walker, Saveliev and Smith2009) and is preferable to modeling the density (rate) directly. For a graphical example of why this approach is taken, see the panel sequence in Figure 2 (see also Codding and Brewer Reference Codding and Brewer2024).
As the response variable is a count of radiocarbon-dated sites per county, we first fit the model with a Poisson distribution which is standard for count regression (Codding and Brewer Reference Codding and Brewer2024; Zuur et al. Reference Zuur, Ieno, Walker, Saveliev and Smith2009). However, as Poisson models assume that the variance is roughly equal to the mean, we check the fit for overdispersion by dividing the sum of squared Pearson’s residuals by the residual degrees of freedom to assess if the mean and variance are of a similar order of magnitude. As diagnostics reveal significant overdispersion (see Supplementary Material), we re-run the model using a negative binomial family from the MASS library (Venables and Ripley Reference Venables and Ripley2002) which can account for overdispersion (variance greater than the mean). We then check for spatial autocorrelation in the model residuals using a global Moran’s I index. Spatial autocorrelation is when neighboring observations are more (or less) like one another than is expected by chance, which can bias model results. As this reveals significant (p < 0.05) autocorrelation, we then estimate the spatial component of the trend from a binary neighbor matrix using a Moran eigenvector filtering function (Bivand et al. Reference Bivand, Pebesma and Gómez-Rubio2013; Dray et al. Reference Dray, Legendre and Peres-Neto2006; Thayn and Simanis Reference Thayn and Simanis2013; Tiefelsdorf and Griffith Reference Tiefelsdorf and Griffith2007; see Supplementary Material). The selected eigenvector(s) effectively partition the spatial structure into a separate variable(s) that can be added to the GLM to reduce residual spatial autocorrelation down to a predetermined alpha value (here set at > 0.05). We then compare if the GLM that includes the spatial eigenvector(s) is an improvement on the model that does not include them using a likelihood ratio test. If a significant improvement (p < 0.05), we proceed with this model. Model diagnostics include an examination of the distribution of deviance residuals, an evaluation of residual deviations in the top and bottom 90th percentile of all deviance residuals, and an assessment of influence using Cook’s distance. We measure goodness-of-fit using the likelihood r-squared (r2 l) value, which is similar to an r-squared (r2) value in ordinary least squares regression but is calculated as the model deviance over the deviance of a null model that includes only the intercept (see Codding and Brewer Reference Codding and Brewer2024). This is sometimes interpreted as the proportion deviance (analogous to variance) accounted for by the predictor variable(s) but may more appropriately be seen as a measure of how close the model fit is to a perfect fit (for more, see Faraway Reference Faraway2006; Zuur et al. Reference Zuur, Ieno, Walker, Saveliev and Smith2009). We evaluate the effect of population density on site counts with the exponentiated model coefficient and its standard error, which represents the rate of change in site counts for each unit change in the predictor. To illustrate the model fit, we plot the best-fit line predicted by the model with 99% confidence intervals and show the predicted values for each county. All code required to replicate this analysis, plus additional details and diagnostics are available in the Supplementary Material.
Results
Figure 1 shows the distribution of unique late Holocene radiocarbon-dated archaeological sites in each county relative to ethnolinguistic population estimates across the state of California. Figure 2 provides a visualization of the relationship between the a) number, b) density, and c) log density of unique late Holocene radiocarbon-dated sites and ethnographic population density per county. Panel a highlights the lack of a clear relationship when not accounting for differences in county area. Panel b shows an emerging trend when examining dated site density (i.e., accounting for differences in county area). Panel c shows the stronger apparent trend when log-transforming the density values. Our analytical approach described above takes these factors into account within the statistical model.
Formally examining this relationship with a generalized linear model that accounts for differences in area and spatial structure reveals a strong positive relationship accounting for about 29% of the deviance in dated site count (p = 0.0024; Table 1). As shown in Figure 3, the predicted logged number of unique radiocarbon-dated sites increases by 1.1 (±0.4) with each unit increase in population density. As this fit is logarithmic, taking the exponent of the coefficient shows that there are about 3.03 (±1.4) times more sites expected per each unit increase in population density. This means we should expect about 44 dated sites in a median sized county with a population density of one person per square kilometer. This should increase to 134 dated sites at a population density of two persons per square kilometer, and to about 405 at a population density of three persons per square kilometer.
As Figure 3 also illustrates, there is variation in the number of unique dated sites that is not accounted for by population density. While the residual variation is normally distributed around zero, indicating a good model fit (Figure 4a), several counties have a greater number of late Holocene radiocarbon-dated sites than expected based on their estimated ethnographic population size and area. These include (from highest to lowest in the bottom 90th percentile of residuals) San Francisco, Marin, Orange, Contra Costa, Calaveras, and Monterey counties (Figure 3). These high residuals may be due to a high intensity of archaeological investigations (more below). San Francisco, Marin, and Orange counties also have a high amount of leverage in the model fit (Figure 4b). Several other counties have many fewer late Holocene radiocarbon-dated sites than expected (from lowest to highest in the bottom 10th percentile of residuals), including Sutter, Trinity, Stanislaus, Sierra, Kings, and San Benito. These residuals may be influenced by a combination of preservation bias, landscape taphonomy, and a limited intensity of archaeological investigations (more in the discussion).
The positive relationship between dated site counts and population density is robust across varying windows of time. Iterating the model with counts of unique dated sites in 500-year bins from 0–500 to 0–5000 BP shows a consistent pattern (Table 2). The rate of increase in the number of sites as a function of population density grows larger with wider windows of time because of higher date counts (Figure 5a). This is confirmed with scaled estimates which show a consistent rate of increase (Figure 5b).
Discussion
Results show that counts of late Holocene radiocarbon-dated archaeological sites positively co-vary with ethnographic estimates of Indigenous population density. These results suggest that counts of radiocarbon-dated archaeological sites reflect some aspects of hunter-gatherer population size, and that relative changes in dated site counts over time should reflect relative changes in population density. However, the model only accounts for about 29% of the variation in dated site counts, so while population density may correlate with dates enough to facilitate the “dates-as-data” approach, some 71% of the variation in date counts is unaccounted for. Examining residual variation highlights other factors that may bias “dates-as-data” estimates away from proxies of population size. Specifically, counties with lower than expected site counts may experience preservation bias which limits the availability of organic material (Collins et al. Reference Collins, Nielsen-Marsh, Hiller, Smith, Roberts, Prigodich, Wess, Csapò, Millard and Turner-Walker2002) especially in acidic soils or regions with higher precipitation, a lower intensity of archaeological investigations (Bird et al. Reference Bird, Miranda, Vander Linden, Robinson, Bocinsky, Nicholson, Capriles, Finley, Gayo and Gil2022; Kelly et al. Reference Kelly, Mackie, Robinson, Meyer, Berry, Boulanger, Codding, Freeman, Garland, Gingerich, Hard, Haug, Martindale, Meeks, Miller, Miller, Perttula, Railey, Reid, Scharlotta, Spangler, Thomas, Thompson and White2022), and landscape taphonomy, or how geologic processes structure the preservation, discovery, and recovery of archaeological sites and materials (Contreras and Codding Reference Contreras and Codding2023; Surovell et al. Reference Surovell, Finley, Smith, Brantingham and Kelly2009).
For example, counties with fewer than expected dated sites include three in the central valley of California—Kings, Stanislaus, and Sutter—where geomorphological processes influenced by deposition from the San Joaquin and Sacramento Rivers and their tributaries (Meyer Reference Meyer2020; Meyer et al. Reference Meyer, Young and Rosenthal2010; Meyer and Rosenthal Reference Meyer and Rosenthal2008; Murphy and Meyer Reference Murphy and Meyer2016; Rosenthal and Meyer Reference Rosenthal and Meyer2004a), such as the Kings River Fan in Kings County, may erode and/or bury sites. Two other counties—Sierra and Trinity—occupy portions of the Sierra Nevada and North Coast ranges, respectively, which are less conducive environments for preservation of organic material. In addition to landscape taphonomy, Sierra and Trinity counties also have the lowest and third lowest GDP per capita, respectively, which may indicate limited development-driven contract work. The last county in this group—San Benito County—provides a telling example when compared to its neighbor, Monterey County.
San Benito County has very few radiocarbon dates from late Holocene archaeological sites and the sixth lowest residual after the five discussed above, but has an estimated population density estimate near the mean of the sample. This underestimate may also be due in part to the effects of landscape taphonomy. San Benito County is in the southernmost end of Santa Clara Valley, which Rosenthal and Meyer (Reference Rosenthal and Meyer2004b) show has a high proportion of buried sites due to active erosion and deposition. Development may also play a role in San Benito County, which is relatively small, has a relatively low contemporary population size (ranks 42 out of 58, US Census Bureau 2022), and a relatively low GDP (ranks 40, US Department of Commerce 2022), which may translate into below average development-driven CRM investigations. It also has no major military instillations which may drive well-funded archaeological investigations. By contrast, neighboring Monterey County has significantly more dated sites than would be expected based on its ethnographic population density and area (the sixth highest residual). This is likely due in part to long-standing, development-driven contract work and research with an emphasis on radiocarbon dating well-preserved shell middens (e.g., Breschini and Haversat Reference Breschini and Haversat2004, Reference Breschini and Haversat2002). The county has a relatively high contemporary population size (ranks 21, US Census Bureau 2022) and GDP (ranks 20, US Department of Commerce 2022), in addition to being the home to current (Presidio of Monterey) and former (Fort Ord) military instillations.
The other counties in the top 90th percentile of residuals are, in order, San Francisco, Marin, Orange, Contra Costa, and Calaveras. San Francisco has the greatest residual variation of any county, with significantly more site counts than expected. San Francisco also has a very high GDP (ranks 5, US Department of Commerce 2022) and contemporary population size (ranks 12 in count and 1 in density, US Census Bureau 2022), which combine to drive significant development-related archaeological investigations. These factors may also explain the higher site counts for adjacent counties in the San Francisco Bay Area, including Marin, Contra Costa, and San Mateo. Orange County similarly has a high GDP (ranks 3, US Department of Commerce 2022) and third highest population size (US Census Bureau 2022). Calaveras county, however, has high residuals and a low contemporary population size (ranks 44, US Census Bureau 2022) and relatively low GDP (ranks 45, US Department of Commerce 2022), which suggests something else may be driving the high dated site count.
Interestingly, some counties with both high and low ethnographic population densities have fewer dated sites than expected (e.g., Sierra and Trinity). We suggest this is in part due to environmental factors that influence both the preservation of organic material, landscape activity, and population size itself. While hunter-gatherer population density scales with environmental productivity in the region (Baumhoff Reference Baumhoff1963; Codding and Jones Reference Codding and Jones2013) and globally (Tallavaara et al. Reference Tallavaara, Eronen and Luoto2018; Zhu et al. Reference Zhu, Galbraith, Reyes-García and Ciais2021), extreme low and high productivity environments are also poor conditions for the preservation of organic material suitable for radiocarbon dating (e.g., Brock et al. Reference Brock, Higham and Ramsey2010; Bronk Ramsey Reference Bronk Ramsey2008; Collins et al. Reference Collins, Nielsen-Marsh, Hiller, Smith, Roberts, Prigodich, Wess, Csapò, Millard and Turner-Walker2002).
Another issue limiting the potential fit between ethnographic density estimates and radiocarbon-dated site counts is potential bias in ethnographic population density estimates themselves. Our interpretation above implicitly assumes that the ethnographic population are “true”, but this is certainly not the case. Some argue post-contact estimates systematically underestimate pre-contact population sizes as a result of protohistoric disease (Erlandson et al. Reference Erlandson, Rick, Kennett and Walker2001; Erlandson and Bartoy Reference Erlandson and Bartoy1996, Reference Erlandson and Bartoy1995; Preston Reference Preston1997, Reference Preston1996). However, population estimates may not be systematic underestimates given recent evidence for a late onset of increased mortality resulting from disruptive Spanish colonial practices (Jones et al. Reference Jones, Schwitalla, Pilloud, Johnson, Paine and Codding2021). Nonetheless, there may be bias in these estimates resulting from other factors including the duration between European contact and when the ethnographic estimates were made, which is systematically greater along the coast adjacent to Spanish missions.
An additional confound on the relationship between ethnographic population estimates and site counts is the degree of economic intensification (Boserup Reference Boserup1965; see, e.g., Codding and Bird Reference Codding and Bird2015; Morgan Reference Morgan2015). Despite sharing the general adaptation of hunting and gathering, Indigenous populations in California varied significantly in their specific subsistence strategies and economic intensification (e.g., Basgall Reference Basgall1987; Beaton Reference Beaton1991; Broughton Reference Broughton1999; Tushingham and Bettinger Reference Tushingham and Bettinger2013; Whitaker and Byrd Reference Whitaker and Byrd2014). This may lead to differences in the deposition and distribution of datable material (Freeman et al. Reference Freeman, Byers, Robinson and Kelly2018), as well as differences in mobility (e.g., Keeley Reference Keeley1988; Kelly Reference Kelly1983) and aggregation (e.g., Codding et al. Reference Codding, Parker and Jones2019; Haas et al. Reference Haas, Klink, Maggard and Aldenderfer2015), all of which may bias dated site counts.
One final complicating factor is the reliance on alternative dating methods in some regions where organic samples for radiocarbon dating are still available. In California this tends to occur in locations where obsidian is abundant, providing a less expensive dating method, albeit one that is less precise than radiocarbon dating (Anovitz et al. Reference Anovitz, Elam, Riciputi and Cole1999), despite significant analytical improvements in recent years (Rogers and Duke Reference Rogers and Duke2011; Rogers and Yohe Reference Rogers and Yohe2020). Research programs in counties including Lake, Sonoma, and Napa have developed longstanding traditions of research examining obsidian hydration dating (Fredrickson Reference Fredrickson1989; Meighan and Haynes Reference Meighan and Haynes1970; Origer and Wickstrom Reference Origer and Wickstrom1982). It is also prevalent in counties where radiocarbon dating is less viable due to poor preservation, such as in the Mojave Desert (Eerkens et al. Reference Eerkens, Rosenthal, Young and King2007; Jenkins and Warren Reference Jenkins and Warren1984) and Sierra Nevada Range (Hull Reference Hull2001; Stevens et al. Reference Stevens, Whitaker and Rosenthal2017). As such, more comprehensive evaluations of population size using dates-as-data will require the incorporation of additional dates from diverse dating methods. This is true not only in California, but in many other regions where obsidian (e.g., Tripcevich et al. Reference Tripcevich, Eerkens and Carpenter2012), tree ring dates (e.g., Bocinsky et al. Reference Bocinsky, Rush, Kintigh and Kohler2016; Liebmann et al. Reference Liebmann, Farella, Roos, Stack, Martini and Swetnam2016) and ceramic chronologies (e.g., Ortman Reference Ortman2016) supplant the use of radiocarbon due to cost, sample ubiquity, and research questions. Combining these methods to estimate site counts is an emerging area of research (Crema and Kobayashi Reference Crema and Kobayashi2020; Palmisano et al. Reference Palmisano, Bevan and Shennan2017).
Despite these complications and limitations, the counts of radiocarbon-dated sites in California reflect population density across increasing windows of time. We suggest this pattern persists through the late Holocene given that populations were likely at spatial equilibrium. As such, while absolute population density may have changed through time, the relative density across space at any one time was likely consistently patterned by habitat suitability (Bettinger Reference Bettinger2015; Codding and Jones Reference Codding and Jones2013). This would result if individuals distributed themselves so to maximize their per capita suitability relative to local population density (Fretwell and Lucas Reference Fretwell and Lucas1969), a pattern increasingly supported by ethnographic (Moritz et al. Reference Moritz, Hamilton, Yoak, Scholte, Cronley, Maddock and Pi2015), historic (Yaworsky and Codding Reference Yaworsky and Codding2018), and archaeological investigations (for reviews, see Codding and Bird Reference Codding and Bird2015; Weitzel and Codding Reference Weitzel and Codding2020). This is likely true globally, as underlying environmental conditions seem to structure hunter-gatherer population density generally (e.g., Binford Reference Binford2001; Birdsell Reference Birdsell1953; Kavanagh et al. Reference Kavanagh, Vilela, Haynie, Tuff, Lima-Ribeiro, Gray, Botero and Gavin2018; Tallavaara et al. Reference Tallavaara, Eronen and Luoto2018; Zhu et al. Reference Zhu, Galbraith, Reyes-García and Ciais2021). While this framework assumes individuals are free to distribute themselves, this was certainly not the case across all of Indigenous California (Bettinger Reference Bettinger2015; Kroeber Reference Kroeber1925), which might further help account for departures between predicted and observed estimates during specific windows of time if populations were territorially circumscribed (e.g., Codding and Jones Reference Codding and Jones2013; Whitaker and Byrd Reference Whitaker and Byrd2014). Nonetheless, population patterning should approximate equilibrium across such a large spatial extent over long spans of time.
Conclusion
We consider this an encouraging, albeit preliminary, empirical appraisal of the “dates-as-data” approach. Though this finding does not mean the approach can or should be applied without caution. Future work should replicate this study in other regions, and examine how to systematically account for patterned bias driven by differences in economic intensity (Freeman et al. Reference Freeman, Byers, Robinson and Kelly2018), landscape taphonomy (Contreras and Codding Reference Contreras and Codding2023; Meyer 2020 Reference Meyer1996; Rosenthal and Meyer Reference Rosenthal and Meyer2004b; Surovell et al. Reference Surovell, Finley, Smith, Brantingham and Kelly2009), and the intensity of archaeological investigations (Bird et al. Reference Bird, Miranda, Vander Linden, Robinson, Bocinsky, Nicholson, Capriles, Finley, Gayo and Gil2022; Rick Reference Rick1987). Future validation studies should also incorporate multiple dating methods (Crema and Kobayashi Reference Crema and Kobayashi2020; Crema and Shoda Reference Crema and Shoda2021). Such studies are critical for identifying the contexts in which “dates-as-data” functions as a valid—although imprecise—proxy of past population size.
Acknowledgments
The authors are grateful for generations of California archaeologists who systematically ran and compiled radiocarbon dates, including Gary Breschini, Trudy Haversat, Michael Glassow, and Jon Erlandson. The authors benefited from discussions with Adie Whitaker, Peter Yaworsky, Nathan Stevens, Kenneth Blake Vernon, and Daniel Contreras. Thanks to two anonymous reviewers and Kimberley Elliott for detailed comments and edits that improved the paper. This work was supported in part by National Science Foundation grants to RLK (BCS-1418858, -1624061, and -1822033) and BFC (BCS-1921072).
Competing interests declaration
The authors declare no competing interests.
Supplementary material
Data and code required to replicate this analysis is available in the supplementary information.