Non-technical Summary
Paleontologists have long struggled to compare fossil biodiversity to the biodiversity we see around us. Yet such comparisons are crucial as we attempt to understand and divert an approaching wave of extinction. Here, we bridge the gap between modern and fossil biodiversity by modeling modern tetrapods as fossils, known only from remains preserved in sedimentary rocks. As the first global model of fossilization potential, this provides a profound and previously unavailable perspective. We find that geography strongly structures fossil diversity, producing deeply heterogeneous preservation rates in different tetrapod groups, and, for the globally threatened amphibians, massively underrepresenting extinction. Our results elucidate how physiological and ecological traits of animals influence our ability to recover the history of life.
Introduction
The fossil record is a profound gift of biological knowledge, allowing us to glimpse the past and prompting us to imagine the future. Fossils are central to understanding the processes and scope of evolution and have been since humans first contemplated evolution. Yet fossils do not record all of life history; there is a contingent structure to the record itself. The primary control over which organisms are preserved in the fossil record is the accumulation of sediment and subsequent lithification to form sedimentary rocks (Nyberg and Howell 2015), because sedimentary rocks preserve and store fossilized remains. This results in vast spatial heterogeneity in fossilization potential. Although paleontologists have long recognized the preservation potential of wet and lowland environments over dry and upland ones (Newell 1959; Knoll and Niklas 1987; Holland et al. 2022), the question of which ancient organisms evaded fossilization is fundamentally moot; the rocks do not exist.
Yet understanding this spatial structure is crucial to our ability to contextualize the dwindling biota of the modern diversity crisis within Earth history. The effects of anthropogenic habitat destruction and climate change on geology (Spalding and Hull 2021), ecology (Palkovacs et al. 2012; Stuart-Smith et al. 2021), and biodiversity (Wake and Vredenburg 2008; Barnosky et al. 2011; Plotnick et al. 2016; Crooks et al. 2017) are an ongoing and time-sensitive area of research. Much research has focused on the possibility of a “sixth mass extinction” (Barnosky et al. 2011), comparable to the five mass extinctions recognized by paleontologists (Raup and Sepkoski 1982; Sepkoski 1984), which coincide with large shifts in climate (Song et al. 2021) and elevated extinction rates (Raup and Sepkoski 1982). Measuring extinction rates in the fossil record presents many difficulties (Foote 2000), and comparing those rates to those observed over the short timescale of recorded history is particularly fraught (McCallum 2007; Barnosky et al. 2011; Ceballos et al. 2015; Spalding and Hull 2021).
But these comparisons could be improved by incorporating a structural model of the fossil record itself, one based on the first-order control over its production: sedimentation (Holland 2016). Organisms that die outside of an area of net sediment accumulation (active sedimentary basins) cannot enter the fossil record, as there are no sedimentary rocks being formed to hold them. This is a profound structuring mechanism; many currently accumulating sediments are geologically “doomed” (Holland 2016) and will be lost to erosion quickly after they are deposited; just 16% of the Earth's terrestrial surface is currently accumulating sediments likely to persist more than 1 million years (Nyberg and Howell 2015; Fig. 1).
We produce a first-order quantification of the potential fossil record of modern tetrapods by measuring the extent to which their geographic ranges overlap with areas of active sedimentation, producing a predicted fossil geographic range (FGR) for 34,266 extant tetrapod species. In doing so, we place bounds on the scope of that record and begin to explore the implications of this geographic structure. Because this is a rough approximation of fossilization potential, we consider a range of geographically structured “fossil records” that selectively include species based on the sizes of their FGRs. Species with smaller FGRs are more likely to have their fossils destroyed by future erosion or to be overlooked by future paleontologists than species with larger FGRs. At our most permissive, we include any species with an FGR area greater than 1 km² in the fossil record. We then consider successively stricter minimum areas for inclusion: 10 km², 100 km², 1000 km², 10,000 km², and 100,000 km².
Using this species-level dataset of FGR areas, we examine how taxonomic differences in fossilization potential contribute to the phylogenetic fidelity of the fossil record by measuring the loss of phylogenetic diversity (PD) when comparing the extant fauna with our predicted fossil records. We find marked heterogeneity in the fossil record's ability to accurately record the evolutionary history of different tetrapod groups.
We also measure the effect of this geographic structuring on the record of extinction. We compare modern tetrapod diversity with the diversity of a depauperate tetrapod fauna resulting from a simulated extinction event that removes species listed by the International Union for Conservation of Nature (IUCN) as Endangered (EN) and Critically Endangered (CR), Extinct (EX) and Extinct in the Wild (EW), or Data Deficient (DD). We then compare this loss in diversity to the loss in diversity that a geographically structured fossil record could preserve, demonstrating its profound inability to accurately record the rate and magnitude of such an extinction event, with a particular focus on the heavily threatened amphibians (Wake and Vredenburg 2008; Alroy 2015).
Materials and Methods
Geographic Methods
We collected all available polygonal geographic range maps for modern tetrapods from the IUCN Red List database (for mammals [accessed 3 September 2021] and amphibians [accessed 7 March 2022]; IUCN 2021), BirdLife's species range maps (for birds [accessed 18 November 2021]; BirdLife International 2021), and the Global Assessment of Reptile Distributions (GARD) range map database (for crocodilians, turtles, squamates, and the tuatara [accessed 23 June 2020]; Roll et al. 2017). We then created a single maximally inclusive polygonal range map for each tetrapod species by using the dissolve function in ArcGIS Pro (ESRI 2022) to combine all polygons associated with a species (which originally were labeled with different IDs) into a single polygonal feature. We performed a pairwise intersect operation between these range polygons and the polygonal sedimentary basin maps from Nyberg and Howell (2015), including both terrestrial and shallow-marine basins, again dissolving all output layers corresponding to one species into a single polygonal feature and repairing geometry as necessary to avoid self-intersections. The resulting polygons estimate where on the Earth's surface that species has a chance to enter the fossil record. We hereafter refer to these areas as fossil geographic ranges (FGRs). The areas (km²) of these FGRs are the basis for our subsequent analyses. All geographic operations were performed in ArcGIS Pro (ESRI 2022), and the resulting data were projected using the WGS 1984 Cylindrical Equal Area projection (https://pro.arcgis.com/en/pro-app/3.1/help/mapping/properties/cylindrical-equal-area.htm).
We include both terrestrial and marine sedimentary basins in our analysis in order to include marine species and to account for inexact matching between the borders of species range polygons and sedimentary basin polygons. Our methods produce a maximally inclusive FGR given the geographic ranges and data available.
Criteria for Inclusion in the Fossil Record
For most of our tests, we consider inclusion in the fossil record as a binary trait; a species is either included in the record or absent (excluded). Because the fossil record is sampled by paleontologists, who cannot sift through all of the Earth's sedimentary layers (whether exposed or buried), species with very small FGRs are less likely to be recognized as part of the fossil record than species with large FGRs. To account for this undersampling, we analyze our data under six geographically structured scenarios, representing more or less complete sampling of the fossil record. In the first scenario, all species with an FGR of greater than 1 km² are included in the fossil record. This is nearly identical to including all species with an FGR greater than zero (less than 2% difference in taxonomic completeness). In the second scenario, all species with an FGR greater than 10 km² are included in the record. In the third, all species with an FGR greater than 100 km² are included, and so on for minimum FGR sizes of 1000 km², 10,000 km², and 100,000 km². Because the inclusion scenarios are hierarchical (i.e., the species “included” in the record in the 10 km² scenario are a subset of those included in the record in the 1 km² scenario), higher minimum FGR sizes result in progressively smaller samples of total species diversity. We analyze taxonomic completion and PD loss under these six geographically structured scenarios.
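The six nested inclusion scenarios can be illustrated with a minimal Python sketch. This is not the authors' code (their analyses were run in R); the species names and FGR areas below are invented purely for illustration, and `included_species` is a hypothetical helper.

```python
# Hypothetical FGR areas in km^2; names and values are illustrative only.
fgr_km2 = {
    "species_a": 0.0,
    "species_b": 4.2,
    "species_c": 57.0,
    "species_d": 830.0,
    "species_e": 12_500.0,
    "species_f": 250_000.0,
}

# Minimum FGR areas (km^2) for the six inclusion scenarios described in the text.
CUTOFFS_KM2 = [1, 10, 100, 1_000, 10_000, 100_000]


def included_species(fgr, cutoff):
    """Return the set of species whose FGR area is greater than the cutoff."""
    return {name for name, area in fgr.items() if area > cutoff}


# One "fossil record" per scenario, keyed by its cutoff.
scenarios = {cutoff: included_species(fgr_km2, cutoff) for cutoff in CUTOFFS_KM2}

for cutoff in CUTOFFS_KM2:
    kept = scenarios[cutoff]
    print(f"FGR > {cutoff:>7} km^2: {len(kept)} of {len(fgr_km2)} species included")
```

Because every scenario applies a strictly larger threshold to the same areas, each stricter record is necessarily a subset of the more permissive ones, which is the hierarchical property noted above.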
Taxonomic Completion
Because our phylogenetic and geographic data sources do not share consistent taxonomies, we first used synonymies from the Integrated Taxonomic Information System (ITIS) database accessed through the R package taxize (Chamberlain and Szöcs 2013) to match taxa in trees to taxa in our geographic data, using only unambiguous cases of synonymy. Then, for each genus in our geographic data represented by five or more species not found in our phylogenetic datasets, we manually checked for synonyms in the IUCN Red List and BirdLife International databases. Using these techniques, we matched more than 90% of species in each geographic database to taxa in a phylogeny. Using the inclusion scenarios described above, we tabulate the number of species whose FGRs meet each minimum FGR cutoff value. To measure genus- and family-level completeness, we simply tabulate the number of genera and families represented by species in the fossil record (Table 1).
We checked our data against the Paleobiology Database (PBDB; Uhen et al. 2023) in order to find false negatives: extant taxa that have a known fossil record, but that we do not predict will enter the fossil record. We compared our dataset against all PBDB taxonomic records for Reptilia, Aves, Amphibia, and Mammalia (accessed 8 November 2022). We found no clear instances of false negatives.
Phylogenetic Signal
We performed all statistical and phylogenetic analyses in R (R v. 4.1.2; R Core Team 2021). Our code is available at https://doi.org/10.5281/zenodo.8417641. Our phylogenetic tests use subsets of 1000 trees downloaded from the full phylogeny datasets provided by VertLife (Jetz et al. 2012; Tonini et al. 2016; Jetz and Pyron 2018; Upham et al. 2019). For our investigation of phylogenetic signal, we built a maximum-clade-credibility tree from each of these datasets using the maxCladeCred function in the R package phangorn (Schliep 2011) and then used the phylosig function from the phytools package (Revell 2012) to measure Pagel's λ for geographic range area and FGR area in these clades and selected subgroups (Table 2). Pagel's λ measures the degree to which continuous data distributed across the tips of a tree fit the expectation of evolution under Brownian motion and varies from 0 (data of sister tips cannot predict each other) to 1 (the data fit a Brownian motion model), rarely exceeding 1 (Freckleton et al. 2000).
Phylogenetic Diversity
We tested each major taxonomic group independently, trimming each initial tree set to include only those taxa that occur both in that tree and in our geographic datasets, which resulted in trees with 9755 species for Squamata (100% complete), 6636 species for Amphibia (91.7% complete), 5564 species for Mammalia (91.7% complete), and 9391 species for Aves (94.0% complete). As the predicted fossil records of crocodilians and turtles are quite complete, we do not report PD (Faith 1992) results for these groups. To understand the phylogenetic signature of the potential fossil record of modern tetrapods, we compare the PD of the maximally inclusive trees described above with the PD of trees representing only those species found in a projected fossil record. Because PD for an entire tree is defined as the sum of all branch lengths, our methodology works by “pruning” branches from trees to remove species we predict will not enter the fossil record, obtaining a subtree containing only the remaining “fossilized” species.
We first measure the total PD for a tree containing all the extant species, then prune the tree to exclude the branches of species whose FGRs are smaller than some cutoff value given by our inclusion scenarios. We then measure the PD of the pruned tree, expressed as a proportion of the total PD. We repeat this process to measure PD loss for that tree in each of the six inclusion scenarios. We then repeat this for each of the 1000 trees in the phylogeny subset to generate a distribution of 1000 PD loss values corresponding to 1000 alternate tree topologies under each inclusion scenario.
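The pruning procedure can be sketched on a toy tree. The Python example below is an illustration only, not the authors' R implementation: the topology, branch lengths, and FGR areas are invented, and it assumes one common convention for PD, summing every branch on the paths from the retained tips to the root. Tools that re-root a pruned tree at the most recent common ancestor of the retained tips would give slightly different values.

```python
# Toy rooted tree: each node maps to (parent, branch length in Myr).
# Topology, lengths, and FGR areas are invented for illustration.
TREE = {
    "A": ("n1", 2.0),
    "B": ("n1", 2.0),
    "C": ("n2", 5.0),
    "D": ("root", 9.0),
    "n1": ("n2", 3.0),
    "n2": ("root", 4.0),
}
FGR_KM2 = {"A": 500.0, "B": 0.0, "C": 20.0, "D": 5000.0}


def phylogenetic_diversity(tree, tips):
    """Sum the lengths of all branches on paths from the given tips to the root.

    Each branch is counted once, even when several tips share it.
    """
    visited = set()
    total = 0.0
    for tip in tips:
        node = tip
        while node in tree and node not in visited:
            visited.add(node)
            parent, length = tree[node]
            total += length
            node = parent
    return total


def pd_fraction_retained(tree, fgr, cutoff):
    """PD of the species passing the cutoff, as a proportion of total PD."""
    all_tips = list(fgr)
    kept = [tip for tip in all_tips if fgr[tip] > cutoff]
    return phylogenetic_diversity(tree, kept) / phylogenetic_diversity(tree, all_tips)


# Proportion of PD retained under three of the inclusion cutoffs (km^2).
fractions = {cutoff: pd_fraction_retained(TREE, FGR_KM2, cutoff) for cutoff in (1, 100, 1000)}
for cutoff, frac in fractions.items():
    print(f"FGR > {cutoff} km^2: {frac:.2f} of total PD retained")
```

In the full analysis this calculation would be repeated for each of the 1000 trees and each of the six cutoffs, giving a distribution of PD loss values per scenario.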
We also tracked the age of the internal nodes of the tree that are lost as taxa are removed. A preservation regime erases phylogenetic history more quickly by removing long branches from the tree at a higher rate than short ones, but removing many short branches can have the same effect as removing a few long branches. By observing the ages of nodes that are removed from the tree, we can understand whether a fossilization regime tends to favor short or long branches and whether branching events are likely to go unrecorded at particular times. For each pruned tree, we count how many internal nodes fall within log-scale time bins from 10^−2 to 10^2.25 Ma, with bin edges spaced at exponent increments of 0.25.
We contrast these data on the diversity lost by a geographically structured fossil record with a null model for how the fossil record should record diversity. For our null model, we model the fossil record as a random preserver of species. We achieve this via Monte Carlo simulation, generating randomized datasets in which observed FGR areas are randomly shuffled between species. In this approach, the chance of fossilization for any species becomes independent of its phylogenetic position and geographic range, but the distribution of FGR areas is identical to that of the geographically structured dataset, and therefore we can easily follow the same PD measurement process as with our geographically structured data. We generate 1000 of these “random” fossil records to compare against our geographically structured data. For each of these randomized datasets, we follow all the steps taken to generate PD loss and node age distributions for our geographically structured data, using the same 1000 phylogenetic trees and the same cutoff values for minimum FGR. We then compare the PD distributions of our geographically structured and random fossil records at each FGR size cutoff level using Cohen's d to determine the effect size of nonrandom fossilization at these different range-size cutoff values. To understand the pattern of node loss versus node age, we first compute the average difference in node number between geographically structured and random records within each node age bin. We then multiply that difference by the node age of the bin to estimate the difference in PD (in Ma) recovered by geographically structured and random fossil records.
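The two ingredients of this null model, shuffling FGR areas among species and comparing distributions with Cohen's d, can be sketched in Python. This is an illustrative sketch rather than the authors' R code: the function names and example values are hypothetical, and the pooled-standard-deviation form of Cohen's d is assumed.

```python
import random
import statistics


def shuffle_fgr(fgr, rng):
    """Reassign the observed FGR areas to species at random.

    The set of areas is unchanged; only which species holds which area differs.
    """
    names = list(fgr)
    areas = list(fgr.values())
    rng.shuffle(areas)
    return dict(zip(names, areas))


def cohens_d(x, y):
    """Cohen's d for two samples, using the pooled sample standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = (
        (nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)
    ) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / pooled_var ** 0.5


# Hypothetical FGR areas (km^2) and one randomized "fossil record".
rng = random.Random(42)  # fixed seed so the sketch is repeatable
observed = {"sp1": 0.0, "sp2": 15.0, "sp3": 900.0, "sp4": 40_000.0}
randomized = shuffle_fgr(observed, rng)

# Hypothetical PD-retained proportions for structured vs. random records.
structured_pd = [0.70, 0.72, 0.74]
random_pd = [0.60, 0.62, 0.64]
effect_size = cohens_d(structured_pd, random_pd)
```

Shuffling preserves the number of species passing each cutoff, so any difference in PD between the structured and randomized records reflects which species are preserved rather than how many.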
Extinction Rates and Extinction Magnitudes
To measure the fossil record's ability to capture the rate and magnitude of large extinction events, we compared our full dataset with a subset excluding species listed by the IUCN as DD, EX, EW, CR, or EN. The full dataset represents current tetrapod diversity, while the reduced dataset represents diversity in a subsequent geologic interval in which the excluded species are all extinct. Plotnick et al. (2016) used a similar approach to investigate potential mass extinction in mammals, though here we include DD species among those present in the modern interval but absent in the postextinction interval, slightly increasing the magnitude of extinction. We have calculated extinction rates following Barnosky et al. (2011) over a 100 year time interval. Extinction magnitudes are those recovered by the potential fossil record, comparing the diversity at a pre-extinction interval and diversity at a postextinction interval at the same FGR cutoff value.
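For readers unfamiliar with the units, the sketch below shows one common way to express an extinction rate in extinctions per million species-years (E/MSY) and an extinction magnitude as a proportion of species lost. It assumes the simple form of the rate (extinctions divided by species multiplied by years, scaled to one million species-years); the function names and species counts are hypothetical and are not taken from the authors' code or data.

```python
def extinction_rate_e_msy(n_extinct, n_species, years):
    """Extinctions per million species-years (E/MSY), in its simple assumed form."""
    return n_extinct * 1_000_000 / (n_species * years)


def extinction_magnitude(n_before, n_after):
    """Proportion of species present before the event that are absent after it."""
    return (n_before - n_after) / n_before


# Hypothetical example: 1000 species, 400 of which go extinct within 100 years.
rate = extinction_rate_e_msy(n_extinct=400, n_species=1000, years=100)
magnitude = extinction_magnitude(n_before=1000, n_after=600)
print(f"rate = {rate:.0f} E/MSY, magnitude = {magnitude:.0%}")
```

Under this convention, losing about 40% of species over a 100 year interval corresponds to roughly 4000 E/MSY, which conveys how a modest-looking magnitude becomes a very high rate when the interval is short.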
Results
Composition of a Geographically Structured Record
Just over 73% of tetrapod species at least partially overlap with areas of long-term deposition, giving them some chance of fossilization. However, there are striking differences in potential fossilization between the four major tetrapod clades at the species (Fig. 2, Supplementary Fig. S1), genus, and family levels. Amphibians are poorly represented, with just 41.9% of species having any chance of entering the fossil record, compared with 72.9% of reptilian species, 81.3% of mammalian species, and 86.0% of bird species (Table 1). None of the taxa that we predict will have an FGR size of zero are represented in the Paleobiology Database (Supplementary Table S1).
Phylogenetic Patterns
Phylogenetic signal (Pagel's λ) of both extant and fossil geographic range areas varies widely between tetrapod groups (Table 2). In squamates and amphibians, extant and FGR areas have a similar level of phylogenetic signal, with FGR signal being slightly stronger, because FGRs can only be equal to or smaller than extant geographic ranges, thereby increasing the overall similarity of tip data. The same is true for mammals, although phylogenetic signal is much higher in general in their geographic ranges. The phylogenetic signal difference is more pronounced between fossil and extant geographic range areas in birds. Salamanders notably depart from the pattern of increased FGR area signal with a Pagel's λ = 0.76 for extant geographic range areas, but only 0.28 for FGR areas.
The PD of a fossil record made up of randomly selected species is very different from the PD of a geographically structured record. Surprisingly, geographic structuring does not have a consistent effect on PD across taxa. PD is consistently underpreserved (compared with a randomly selected fossil record) in reptiles, with diminishing preservation with progressively less complete records (higher FGR cutoff values). In mammals and amphibians, more complete fossil records slightly overpreserve PD, while less complete fossil records underpreserve it. In birds, we find the opposite pattern: PD is consistently overrepresented, and overrepresentation is higher in less complete records (Fig. 3).
The pattern of phylogenetic node loss is likewise dissimilar across the four tetrapod trees (Fig. 4). In birds, a geographically structured fossil record preferentially preserves phylogenetic nodes across the entirety of crown-avian history, regardless of the inclusion scenario. Squamates, amphibians, and mammals exhibit a different pattern; younger nodes are preferentially preserved by a geographically structured fossil record, while older nodes are preferentially lost. However, the timing of the pattern differs among these groups. In squamates, late Neogene nodes are favored by geographic preservation, while early Neogene and Paleogene nodes are preferentially lost. In amphibians, nodes are favored across the Neogene but preferentially lost in the Paleogene. In mammals, nodes in the late Neogene and early Quaternary are preferentially preserved, while nodes in the middle Neogene and, to a small extent, the early Neogene are preferentially lost. The magnitude of preferential recovery or loss generally increases as the fossil record becomes more incomplete (under higher FGR cutoff scenarios). Under more complete geographically structured records (FGR cutoff 1–100 km²), nodes 1 Ma old or younger are preferentially lost in mammals and squamates.
Extinction
The fossil record would severely underrepresent the magnitude of a large extinction event in amphibians. Our simulated extinction produces a loss of 40% of amphibian species and 13% of amphibian genera. The magnitude of this extinction would be severely underrepresented in even our most complete fossil record. When the minimum FGR area for fossilization is 1 km², the fossil record registers a loss of 13.5% of amphibian species and 4% of amphibian genera. In worse fossil records, the magnitude of species diversity loss quickly drops below 5% (Fig. 5B).
This translates to drastically underrepresented extinction rates. For amphibians, the actual rate of our simulated extinction is dramatically high (4025 extinctions per million species-years [E/MSY]) (Fig. 5A). This drops to 600 E/MSY in the most inclusive fossil record and is just 7 E/MSY at an FGR cutoff of 100,000 km²: a 99.83% reduction in the apparent extinction rate. We observe this underrepresentation in all tetrapods, although the most severe underrepresentation of extinction rate is in amphibians (Fig. 5A).
Yet a poor record of extinction rates does not necessarily translate to an underestimate of extinction magnitude. As the minimum FGR size required for inclusion in the fossil record increases, the extinction magnitude that it preserves decreases at the species and genus levels in birds, amphibians, and mammals. However, in reptiles, the species and genus extinction magnitudes recovered by the fossil record hold approximately constant across FGR cutoff scenarios, with a slight increase in approximate genus-level extinction magnitude under the strictest scenarios. At the family level in all four groups, extinction magnitude is quite low under most FGR cutoff scenarios (Fig. 5B, Table 3).
Discussion
Geographic Controls on Taxonomic Diversity
Geography is a major driver of the taxonomic structure of the fossil record, but its effects vary widely between tetrapod clades and different taxonomic levels of analysis. These results imply that ectothermic tetrapods are poorly represented in the fossil record compared with endotherms, with amphibians particularly underrepresented. In part, this may be attributable to their smaller ranges, but the relationship between modern geographic range size and predicted FGR size is not simple; both modern and fossil geographic ranges reflect complex and contingent biogeographic patterns (Fig. 2).
These results reflect fundamental differences in the biogeography of amphibians and amniotes. Compared with biodiversity of other tetrapods, amphibian biodiversity is more constrained by temperature and water availability (Buckley and Jetz 2008), more concentrated in upland environments (Rahbek et al. 2019), and more influenced by topographic complexity (Brown et al. 2014) and elevational heterogeneity (Murali et al. 2021). The proportion of amphibian species we find to be excluded from the fossil record (65%) is remarkably consistent with the 62% of amphibians whose geographic range is primarily upland (Rahbek et al. 2019). This may help to explain the low family stage–level completeness of the lissamphibian fossil record compared with that of other tetrapods (Benton et al. 1989).
Preservation and Extinction
Any gap in fossil preservation hampers our ability to record changes in diversity through time, and we identified potentially large gaps in tetrapod preservation. Thus, extinction rates recovered in the terrestrial fossil record may not reflect actual extinction rates; in our study, they are all much lower. In our simulated modern extinction, extinction rates from even relatively complete fossil records (a minimum FGR size for inclusion of 1000 km²) are at least halved compared with the “true” rates (Fig. 5A). Extinction magnitudes can vary to a similar degree. Plotnick et al. (2016), following a procedure similar to ours, found that an ~16% magnitude extinction event in extant mammals would appear as an ~8% magnitude extinction at the species level due to the small number of extant mammals found in fossil databases. This recovered magnitude is about half of the true magnitude. At the genus level, the recovered extinction magnitude (6.5%) was slightly more than half of the true extinction magnitude (10.7%). This approximate halving of extinction magnitude corresponds again to our mammal dataset when the minimum FGR area for inclusion is between 1000 and 10,000 km².
Amphibian extinction is particularly obscured; poor fossil records underpreserve extinction rates by orders of magnitude (Fig. 5A), because endangered amphibians have smaller FGRs than non-endangered ones (Supplementary Fig. S2). It therefore seems unlikely that future paleontologists would recover any indication of the modern amphibian crisis. The fossil record is simply geographically incapable of representing the type of extinction that appears imminent, and the low stage-level completeness of amphibians (Benton et al. 1989) and low family- and genus-level diversity of Lissamphibia in the fossil record (Paleobiology Database; Uhen et al. 2023) suggest that the record of modern amphibians would be severely restricted by further factors of ecology, anatomy, taphonomy, sedimentation, worker effort, and so on.
This suggests a number of interesting possibilities. First, perhaps amphibian crises are a common phenomenon; amphibian extinction rates are always high (or extinction pulses frequent), but origination rates are fast enough to maintain diversity. Both phylogenetic (Roelants et al. 2007; Jetz and Pyron 2018) and paleontological (Tietje and Rödel 2017) studies have suggested a high turnover rate in amphibians, which bolsters this idea. Yet it is inconceivable that origination could keep pace with this modern crisis; were modern extinction rates to continue, amphibians would be wiped out within a few tens of thousands of years (McCallum 2007; Buckley and Jetz 2008).
Second, perhaps amphibian evolution is a largely “offstage” phenomenon. A sizable amount of modern amphibian diversity is attributable to the effects of topographic heterogeneity, refugia, and the “montane species pump” (Brown et al. 2014), all of which produce or maintain diversity outside the reach of sedimentation. Even in a relatively complete fossil record, this mode of evolution would produce long ghost lineages as clades move out of and back into areas of sedimentation, skirting fossilization just as they evolve new traits that would allow paleontologists to understand their evolutionary history. Given the difficulty of resolving lissamphibian origins and the long ghost lineages involved (Ruta and Coates 2007), this may be a major and intractable problem if lissamphibian ancestors also preferred upland habitats.
Third, if other extinctions follow the geographic structure of our simulated extinction, their real toll on the terrestrial biota will be consistently underestimated. Although we may have a good understanding of the proportions of marine taxa lost in mass extinctions (in some cases, the extinction proportion can hardly increase), our results highlight a mechanism by which a large proportion of terrestrial taxa skirt fossilization. These un-fossilizable species may represent major innovations in the history of life, yet we cannot know them directly. These ephemeral taxa lived and died beyond the scope of history.
Phylogenetic Diversity in the Fossil Record
The fossil record is not a random sampler of PD on land, instead preserving each major tetrapod group in its own idiosyncratic regime. This is not surprising, given that phylogenetic endemism in tetrapods is highest in areas of low seasonality and, crucially, high topographic relief (Murali et al. 2021). Yet the heterogeneous patterns of PD preservation in the fossil record are unexpected.
Birds’ large geographic ranges seem to translate to high predicted preservation rates (Table 1, Supplementary Fig. S1), similar to those of mammals, in our models. Yet bird PD is far better sampled by our geographically structured fossil record than by a random fossil record (Fig. 3). Unlike in all other tetrapods, both recent and deep nodes in the avian tree are preferentially preserved (Fig. 4). This pattern is consistent across all FGR cutoff sizes, including those where the rate of bird preservation is similar to rates of mammalian and reptilian preservation at which older nodes are underpreserved. For instance, when the mammalian preservation rate is 50.4%, nodes around 10 Ma old are underpreserved, but even when the bird preservation rate dips to 40.5%, phylogenetic nodes are all preferentially overpreserved. Because of this, the consistent overpreservation of birds cannot be due to mere abundance in the record. Instead, the preferential retention of deep avian nodes must be due to the particular interaction between the shape of the avian tree and the geographic distribution of modern bird species. While passerine birds inhabit diverse environments worldwide, the older-diverging birds of Aequorlitornithes (Prum et al. 2015) are often closely associated with marine, freshwater, or wetland environments, exactly where deposition occurs. The PD of this group, and of the wide-ranging waterfowl, should therefore be very comprehensively sampled by sedimentary basins and represents a larger number of long branches with few tip taxa than the “bushier” passerine tree. This is consistent with a higher degree of phylogenetic signal in FGR size among nonpasserines than among passerines (Pagel's λ = 0.3 vs. λ = 0.14; Table 2).
The striking difference in phylogenetic signal between salamander extant geographic range and FGR areas (λ = 0.76 vs. λ = 0.28; Table 2) likely stems from a strong tendency for elevational speciation in salamanders. In Plethodontidae, the family containing more than 60% of salamander species, high- and low-elevation species have repeatedly evolved from middle-elevation origins (Wiens et al. Reference Wiens, Parra-Olea, García-París and Wake2007; Kozak and Wiens Reference Kozak and Wiens2010). As species repeatedly invade high-elevation habitats, in which little deposition is likely to occur, their FGR areas become zero, while their middle-elevation sister species may have appreciable FGRs (although likely smaller than their extant geographic ranges). Likewise, species that invade low-elevation environments are likely to increase the sizes of their FGRs, approaching the sizes of their extant geographic ranges. Through elevational speciation from this middle-elevation origin, sister species FGRs become less similar, and the phylogenetic signal of FGR area drastically drops.
We do not recover any signal similar to that seen in salamanders in passerine birds (Table 2). Despite the group's remarkable diversity and elevational diversity gradient, passerines’ high proportion of downslope to upslope dispersal (van Els et al. Reference van Els, Herrera-Alsina, Pigot and Etienne2021) likely brings high- and middle-elevation passerine lineages regularly back into sedimentary basins over long timescales. Alternatively, it may be that signals of salamander-like patterns of elevational diversification in some groups of passerines are simply swamped by other evolutionary dynamics in this incredibly diverse group.
The relatively high levels of amniote PD sampled by our hypothetical record imply that, at least in principle, amniote evolutionary history can be well preserved. Fossils of early members of important modern clades are crucial to understanding the clades’ histories and rates of diversification, and it seems that sedimentation itself is not an insurmountable barrier to the discovery of such fossils. Worker effort in finding and interpreting these fossils is likely a greater impediment to understanding amniote history than initial preservation.
Understanding the Structure of the Terrestrial Fossil Record
How good is the terrestrial fossil record, really? Foote's (Reference Foote1997) and Tietje and Rödel's (Reference Tietje and Rödel2017) estimates of the completeness of the mammalian and amphibian fossil records, though quite disparate in scope, correspond with our predicted record when the minimum FGR size for inclusion is 100–1000 km². In this record, 25–33% of amphibians, 71–80% of birds, 64–75% of mammals, and 51–62% of reptiles might fossilize (Table 1). But our “completeness” is quite different from the stratigraphic “completeness” estimated in these sources. Here, we compare a global living assemblage to a predicted global death assemblage, asking what fraction of “species that lived” might fossilize, rather than what fraction of “species that lived and are detectable” in the fossil record appear between intervals (Žliobaitė and Fortelius Reference Žliobaitė and Fortelius2022), as Foote (Reference Foote1997) and Tietje and Rödel (Reference Tietje and Rödel2017) do. Using this definition, Žliobaitė and Fortelius (Reference Žliobaitė and Fortelius2022) estimate that perhaps 4–10% of Miocene mammalian species left fossil remains. This is a smaller proportion than the ~15% of modern mammalian species represented by fossils in Plotnick et al.'s (Reference R. E., Smith and Lyons2016) analysis, perhaps resulting from a better fossil record for the Pleistocene than the Miocene: a strong “Pull of the Recent.” But what accounts for this tiny proportion of preservation?
Moving hierarchically through the structuring processes of the fossil record, we can begin to estimate the relative contributions of different processes to the completeness (percent of total living fauna preserved) of the record. Here, we estimate that at least 19% of mammalian species can be eliminated from the record based on geography alone (Table 1). To reach Plotnick et al.'s (Reference R. E., Smith and Lyons2016) 15% completeness estimate, another 66% of diversity must be lost due to (1) ecological, anatomical, taphonomic, and sedimentary processes that prevent taxa from entering the sedimentary record; (2) sedimentary, tectonic, and environmental processes that destroy sedimentary rocks or make them unavailable; (3) some combination of worker effort and sampling imperfection that has failed to uncover and correctly identify all of the fossilized species; and (4) taxonomic uncertainty, as some species are not diagnosable based on fossilizable traits.
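The arithmetic behind this budget can be sketched in a few lines. This is only an illustrative calculation, not part of our analysis code: the function name `completeness_budget` is hypothetical, and it assumes that losses are expressed as non-overlapping percentage points of the total living fauna, so that they simply add.

```python
def completeness_budget(geographic_loss_pct: int, observed_completeness_pct: int) -> int:
    """Return the percentage of species that must be lost to processes
    other than geography, given the share lost to geography and the
    share actually represented by fossils.

    Assumes (for illustration) that the three shares are non-overlapping
    percentage points of the total fauna and therefore sum to 100.
    """
    remaining = 100 - geographic_loss_pct - observed_completeness_pct
    if remaining < 0:
        raise ValueError("geographic loss and observed completeness exceed 100%")
    return remaining


# Values quoted in the text for mammals: at least 19% lost to geography,
# ~15% of modern species represented by fossils (Plotnick et al. 2016).
other_losses = completeness_budget(19, 15)
print(f"Lost to non-geographic processes: {other_losses}%")
```

Because 19% is a minimum estimate of geographic loss, the 66% attributed to the four remaining processes is correspondingly a maximum under this simple additive accounting.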
We have designed our study to capture maximally inclusive FGRs based on the data available, making our estimates of diversity lost to geography quite conservative. Geographic ranges are labile and currently in flux for many species spreading into new habitats, retreating from inhospitable environments, or experiencing population declines (Lenoir and Svenning Reference Lenoir and Svenning2015), and these changes may nudge some species toward better fossilization potential and others away from it.
Organisms’ anatomical and ecological characteristics matter a great deal to their probability of preservation. Alongside geographic range size (Wagner and Marcot Reference Wagner and Marcot2013; Plotnick et al. Reference R. E., Smith and Lyons2016), body size (Behrensmeyer et al. Reference Behrensmeyer, Kidwell and Gastaldo2000; Wagner and Marcot Reference Wagner and Marcot2013; Plotnick et al. Reference R. E., Smith and Lyons2016; Mannion et al. Reference Mannion, Chiarenza, Godoy and Cheah2019) and species abundance (Wagner and Marcot Reference Wagner and Marcot2013; Plotnick et al. Reference R. E., Smith and Lyons2016) are known to correlate with preservation, and their effects could be considered in more comprehensive models of the potential fossil record. Moreover, particular traits of clades can influence how well they are represented by fossils. For instance, modern crocodiles are ideal fossilizers: large-bodied, robust-boned animals living in and around low-energy aquatic environments. We predict their inclusion in the future fossil record is secure, continuing the already fairly complete crocodilian record (Markwick Reference Markwick1998; Mannion et al. Reference Mannion, Chiarenza, Godoy and Cheah2019). The skeletons of birds are made up of much smaller elements, usually quite fragile, and are more prone to taphonomic degradation than the skeletons of other tetrapods, probably contributing to their poorer fossil record (Benton et al. Reference Benton, King, Chaloner and Hallam1989), although that record is still somewhat phylogenetically complete (Ksepka and Boyd Reference Ksepka and Boyd2012). Amphibians’ small, often weakly ossified skeletons likewise seem to preserve poorly, again likely contributing to their remarkably spotty fossil record (Benton et al. Reference Benton, King, Chaloner and Hallam1989; Tietje and Rödel Reference Tietje and Rödel2017). In contrast, mammals’ tough, complex, and highly diagnostic teeth are ideal for both preservation and later worker recognition.
We have considered all sedimentary basins equivalent and static, but variation in sediment sources, sediment transport mechanisms, sedimentation rates, fluvial regimes, accommodation rates and basin architecture (e.g., the difference between an active foreland basin and a passive margin), climate, and long-term changes within a given basin will leave some sedimentary basins with far better preservation potential than others. Ecological gradients, which are inherently coupled to elevation gradients and climate variables, also act as a control on the structure of the fossil record (Holland and Loughney Reference Holland and Loughney2021; Holland Reference Holland2022) in a way not considered in these models. However, these gradients are in part expressed in the tetrapod ranges themselves. The climate and environment within a sedimentary basin may also affect taxonomic fidelity in its fossil record, with cool, dry climates producing the highest preservation potential (Behrensmeyer et al. Reference Behrensmeyer, Kidwell and Gastaldo2000). Furthermore, large shifts in geomorphology (e.g., the retreat of the Laurentide Ice Sheet) can cause massive reorganization of the erosional and depositional makeup of sedimentary basins (Wickert Reference Wickert2016), which can lead to the erosion of previously stable sediments. Human impacts on sedimentary rock production are already apparent, producing increases in both erosion (Reusser et al. Reference Reusser, Bierman and Rood2015) and sedimentation (Jenny et al. Reference Jenny, Koirala, Gregory-Eaves, Francus, Niemann, Ahrens and Brovkin2019). These effects will further confound our assumption of preservational homogeneity, perhaps by producing higher sedimentation rates and a higher-fidelity fossil record in more biologically depauperate areas. However, these human impacts on the sedimentary record are likely to be insignificant in the whole of a basin's stratigraphy.
Our model also fails to account for atypical modes of preservation. Cave and fissure-fill deposits can preserve species otherwise absent from the fossil record (Lundelius Reference Lundelius2006; Jass and George Reference Jass and George2010), although these deposits are generally short-lived, and very few are known that represent deep-time upland environments (Lundelius Reference Lundelius2006). Preservation of terrestrial species in marine environments, however, is not uncommon (Schultze Reference Schultze1995; Butler and Barrett Reference Butler and Barrett2008). The extent to which postmortem transport and burial add otherwise unsampled species to the fossil record is unclear, and such transport is a potential source of fossil diversity for which our geographic approximations cannot account.
Moreover, a thorough comparison of modern and fossil faunal diversity must account for exceptional spatial heterogeneity in paleontological effort; western Europe, the United States, and Canada are blanketed with known fossil localities, while the Global South—including areas of extremely high modern tetrapod diversity and endangerment—is poorly represented (Raja et al. Reference Raja, Dunne, Matiwane, Khan, Nätscher, Ghilardi and Chattopadhyay2022).
Beyond collecting, incomplete worker effort in identifying and publishing specimens, as well as keeping up with taxonomic changes, contributes to a mismatch between our accounting of the modern fauna and their recent fossil record. Neontologists are increasingly describing cryptic species based on ecological, genetic, or otherwise un-fossilizable characteristics (Bickford et al. Reference Bickford, Lohman, Sodhi, Ng, Meier, Winker, Ingram and Das2007; Barrowclough et al. Reference Barrowclough, Cracraft, Klicka and Zink2016; Burgin et al. Reference Burgin, Colella, Kahn and Upham2018), making species an increasingly difficult unit of comparison for modern and fossil data.
The Future Fossil Record
It is difficult to consider our results outside the context of global warming and human impacts on the planet. Some researchers refer to this time of heightened human impact as the Anthropocene (Lewis and Maslin Reference S. L. and Maslin2015). While the Anthropocene is not recognized as a subdivision of geologic time by the International Commission on Stratigraphy (ICS) or the International Union of Geological Sciences (IUGS), the stratigraphic record will undoubtedly retain evidence of human activity. Climate change will globally disrupt ecosystems, changing ecological gradients just as sea-level rise changes the sediments accumulating along them (for a detailed discussion, see Holland Reference Holland2022).
Our results indicate a possible existential problem: Will the future fossil record capture the current diversity crisis? How will range shifts appear in the future fossil record, with organisms migrating into and out of sedimentary basins in the midst of species extinction, evolution, and turnover? And ultimately, how many diversity crises are missing from the geologic record? These questions may be ultimately untestable, but we hope that they prompt studies that broaden the taxonomic and phenomenological scope of this study.
Conclusions
Our results are at once reassuring and deeply troubling. We confirm that mammals, long used as a model for terrestrial vertebrate biodiversity, are the most reliable recorders of tetrapod history, having the highest proportions of potential fossilization and eventual recovery (Fig. 2). Yet amphibians, representing a similar proportion of extant vertebrate diversity, leave behind far less; a clear understanding of amphibian history may be fundamentally impossible to recover.
Modern biodiversity is the outcome of contingent geologic, climatic, and evolutionary processes, and as such is not a perfect model for biodiversity in past or future points in Earth history. However, large-scale patterns in tetrapod biogeography are not mere recent developments; they are a momentary state in a long history structured by profound biological differences between major clades, differences that produce a contingent and idiosyncratic fossil record through the interaction of climate, geography, evolution, extinction, geology, and taphonomy. The scale of these differences demonstrates a need to carefully consider the histories of these groups and understand their unique macroevolutionary dynamics. Future work comparing modern and fossil biodiversity will greatly benefit from this geographic perspective, as well as from a broader taxonomic scope. What other evolutionary stories—of insects, conifers, or pulmonates—might the fossil record pass over?
That nearly 20% of modern mammalian species have almost no chance of fossilizing should caution any paleontologist against ignoring the geographic structure of the fossil record. Much past biodiversity may be unknowable, but by understanding the processes that failed to record these species, we can better understand the structure and scope of history itself.
Acknowledgments
We would like to thank S. Finnegan, J. McGuire, C. Marshall, and S. Scarpetta for their helpful comments on an earlier version of this article. We would also like to thank S. Holland and A. Dunhill for their helpful reviews, which greatly improved this article.
Competing Interests
The authors declare that they have no competing interests.
Data Availability Statement
Our supplementary figures, tables, code, and data are available at: https://doi.org/10.5281/zenodo.8417641.