Introduction
Estimation of the real burden imposed by the coronavirus disease-2019 (COVID-19) pandemic in its first year has been challenged by numerous factors including limited testing, the large fraction of asymptomatic or subclinical cases, and questions surrounding whether deceased individuals died of COVID-19 as the primary cause or as one of several contributing conditions [Reference Omori, Mizumoto and Chowell1–Reference Woolf, Chapman and Lee3]. Under these circumstances, analysis of excess mortality data represents one way to assess the actual impact of the pandemic on society [Reference Leon4]. However, because data on excess mortality are provisionally released, reported counts are subject to reporting delays. The lengths of reporting delays are often unclear, and strategies to adjust provisional excess mortality data to account for delays would be helpful to study the impact of the COVID-19 pandemic in real time.
The United States Centers for Disease Control and Prevention (US CDC) releases provisional death counts by week and US jurisdiction on a weekly basis [5]. Incomplete counts in the weeks preceding the publication week are caused by various factors, including administrative and processing time lags, time of year, decedent age and cause of death. According to the National Center for Health Statistics, approximately 80% of deaths are automatically processed by a system, while 20% require manual input. In view of the ongoing COVID-19 pandemic, deaths can take even longer to process. Although the completeness of the data cannot be determined directly, the associated reporting delays can be estimated using well-developed techniques [Reference White and Held6].
The importance of reporting delays in real-time analysis of infectious disease outbreaks has been previously recognised [Reference Lawless7–Reference van de Kassteele, Eilers and Wallinga14]. In some instances, detailed characterisation of reporting delays was hindered by limited available data, for instance during outbreaks in regions with ongoing armed conflicts [Reference Akhmetzhanov11] or among refugee populations [Reference Tsuzuki12, Reference Finger13]. In other instances, the more detailed available data allowed analysis of time-varying trends in reporting delays using P-splines [Reference van de Kassteele, Eilers and Wallinga14] or moving time windows [Reference McGough15]. Both methods can be significantly hampered when only a small fraction of cases is reported, making follow-up inference of reporting delays challenging [Reference McGough15].
Among the published studies on excess mortality in 2020 during the COVID-19 pandemic [Reference Rossen16–Reference Weinberger21], few adjusted their estimates for reporting delays. Kawashima and colleagues [Reference Kawashima20] conducted such an adjustment for monthly all-cause deaths in Japan based on prompt vital statistics. By contrast, Weinberger and colleagues [Reference Weinberger21] analysed more granular data consisting of weekly counts across US jurisdictions and conducted nowcasting of deaths within a Bayesian framework [Reference McGough15]. Their conclusions were that reporting delays significantly differed across US jurisdictions, and that excess mortality was modestly undercounted in recent weeks unless adjustment was done. Although the official CDC report acknowledged this issue [5], there is still no detailed information on differences in reporting delays between jurisdictions as well as follow-up estimation of a common shared mean reporting delay.
In the current study, we fill this gap by explicitly characterising differences in reporting delays between jurisdictions. We also provide estimates of excess mortality at the subnational level in the USA for two timeframes: from March to December and from September to December 2020. The first timeframe of our analysis covers the whole period of the pandemic in the USA in 2020. The second timeframe was chosen to encompass the timeline of the second wave of the COVID-19 pandemic.
Methods
Data
Provisional death counts for 2019–2020 were regularly published on the CDC website (https://www.cdc.gov/nchs/covid19/covid-19-mortality-data-files.htm) on Wednesdays at 5 p.m. Reported deaths were categorised by Morbidity and Mortality Weekly Report (MMWR) week of publication and by US jurisdiction where the death occurred. For the current study, 22 snapshots were collected with publication dates between September 2020 and the first week of February 2021. One snapshot published the week of 16 September 2020 (MMWR week 38) was omitted for technical reasons, which was not critical for the study. The time period containing the most recent week with non-zero deaths covered MMWR week 34 (week ending date: 22 August 2020) in the earliest collected snapshot through week 53 of 2020 (week ending date: 2 January 2021) in the last four snapshots. The death counts for the week of publication and for the preceding week as well as for the weeks of 2021 were likely to be missed in any given snapshot because of zero reported counts; all non-zero death counts less than 10 were masked by the CDC for privacy reasons. The reporting jurisdictions included the 50 states with New York state separated into two jurisdictions: New York City and the rest of the state. Additionally, the District of Columbia and Puerto Rico were among the total 53 jurisdictions.
Historical records of weekly deaths from 2014 to 2018 were retrieved from the same source as the provisional counts for 2019–2020. The structure of the dataset was analogous except that it was not subject to any changes in the future. The jurisdictional counts of reported COVID-19 deaths were assessed via the daily trends published by the CDC (https://covid.cdc.gov/covid-data-tracker/#trends_dailytrendscases (accessed 26 February 2021)).
Reporting delay: parametric estimation (independent and partial pool model)
The reporting delay distribution describes the distribution of time periods between the occurrence of an event and its reporting to the system. The probability distribution function of the reporting delay is usually modelled using one of three unimodal distributions with positive support (f i(○; θ), i = 1, 2, 3): the gamma, Weibull or lognormal distributions. The set θ consists of two parameters: the mean and the standard deviation (s.d.) of the reporting delay distribution.
To estimate the reporting delay, the death count $d_w^{( {s, j} ) }$ reported on week w by jurisdiction j in any of the earlier snapshots s = 1, …, (S − 1) was compared with the death count $d_w^{( {S, j} ) }$ reported in the latest snapshot S [Reference Akhmetzhanov11, Reference Tsuzuki12]. Poisson likelihood was used to infer the unknown parameters θ j:
for any w, j and s = 1, …, (S − 1). The second equation accounts for the continuity factor [Reference White and Held6]. Here, T(s) denotes the publishing times of the snapshots s, and Poissonpmf(d;E[d]) is the probability mass function:
To estimate variation in reporting delays across jurisdictions, two different approaches were employed. In the first approach, the reporting delay for each jurisdiction was estimated independently, such that each likelihood L (j) (j = 1, …, 53) was maximised with respect to θ j. In the second approach, a partial pool model was used to infer the common shared mean reporting delay and its s.d. [Reference Flaxman22–Reference Gelman24]. In the latter context, the reporting delays for various jurisdictions were closely related to each other, sharing a common mean value μ. Any deviations from the shared value of the mean were modelled using a Student's t-distribution:
where the other two parameters were the degree of freedom ν and the standard error of the mean $\sigma _\mu$. The Student's t-distribution (3) was chosen over the normal distribution because it is less sensitive to outliers. Like the nonparametric estimation described below, the first approach showed promise when used to nowcast the number of deaths that have yet to be reported. The second approach was used to identify the common mean μ, and the corresponding P-values (percentiles of the Student's t-distribution) for detecting outliers (i.e. jurisdictions significantly deviating from others in their reporting delays).
A negative binomial distribution could have been used instead of the Poisson distribution in equation (1). However, simulations showed that the value of the overdispersion parameter in the negative binomial distribution approached an arbitrarily large value, implying equivalence of the negative binomial and Poisson likelihood functions. A similar conclusion was reached in another relevant study [Reference McGough15].
Mixture model
Although the reporting delay distribution was chosen from one of three unimodal distributions, this selection induced a constraint to the modelling framework by imposing a structural prior. Another approach to account for all three distributions within a single model is to consider mixtures of distributions. This strategy provides a greater degree of flexibility because each distribution contributes to the total likelihood proportionally to the relative weights π i ($\sum\nolimits_i {\pi _i} = 1$), subject to the data fit. By contrast with the common practice in formulating mixture models, where each component distribution has its own set of parameters (e.g. each of the three distributions would have their own means and s.d.s), we assumed that all distributions shared the same set of parameters (mean and s.d.). This ensures a higher convergence probability of implemented Markov Chain Monte-Carlo simulations [Reference Keller and Kamary25].
Alternatively, the best-fit distribution could be selected based on information criteria (e.g. the widely applicable information criterion (WAIC) or ‘leave-one-out’ information criterion (LOOIC) [Reference Gelman24, Reference Watanabe26]). However, integrating out unobserved (latent) variables from the model, such as the death counts masked by CDC, can be challenging [Reference Watanabe27]. The mixture model implements all three component distributions based on relative weights π i. In this case, there was no need to integrate out latent variables or to manually calculate likelihoods for each data point, as is required using other methods [28].
Following these assumptions, the total likelihood for the mixture model was defined as follows:
where $\pi _i^{( j ) }$ are relative weights ($\sum\nolimits_i {\pi _i^{( j ) } } = 1$) and the subscript i indicates one of the three distributions (i = 1, 2, 3). The component likelihoods $L_i^{( j ) }$ are given by equation (1) and the expected deaths $E[ {d_w^{( {S, j} ) } } ] _i$ (an internal argument of the likelihoods) are given by (2) respective to each distribution i:
where F i denotes the cumulative distribution function of the component distribution i. The posterior probability for each component distribution $q_i^{( j ) }$ could be then determined using the equation:
Reporting delay: nonparametric estimation
For nonparametric estimation of the reporting delay, the reverse-time discrete hazard was defined as previously described [Reference Lawless7, Reference Höhle and der Heiden8]: g j(d) = Pr(delay = d | delay ≤ d) = f j(d)/F j(d). Here, the variable d was introduced such that a zero value (d = 0) corresponds to the death count reported within the first 2 weeks (equivalently, within the first 10 days because all snapshots were published on Wednesdays rather than on the last day of the week). Other values (d = 1, 2, …, D) correspond to reporting delays of weeks, respectively. The upper bound D denotes the maximum delay, implying that F j(delay ≥ D) = 1. Finally, g j(0) = 1 was imposed, and other hazards g j(d > 0) were found by fitting the probability distribution functions $f_j( d ) = g_j( d ) \prod\nolimits_{i = d + 1}^D {( {1-g_j( i ) } ) }$ to the data. Equations (1), (4) were used, accounting for the only difference in defining the cumulative distribution functions:
where the parameter D was set to 20 weeks in the simulations.
Nowcasting procedure
To predict the number of deaths not yet reported by the surveillance system, a prospective nowcasting framework was applied [Reference Akhmetzhanov11, Reference Cowling29]. The number of yet unreported deaths on a given week was sampled from a negative binomial distribution that followed the failure-counting interpretation [Reference Champredon30]. The first parameter of the negative binomial distribution (the number of ‘failures’) was the number of already reported deaths during that week, whereas the second parameter (the probability of ‘success’) was the cumulative distribution function of the reporting delay counted from week of death, w, to the publication date of the latest snapshot.
Expected excess mortality
The expected weekly number of deaths was estimated using a Poisson linear regression model [Reference Weinberger21] involving a seasonal component but neglecting to adjust for severe influenza and associated pneumonia. The posterior median and the 95% upper bound were set as two thresholds. The range of differences between the nowcasted number of deaths and each of these thresholds was then reported as excess deaths as in previous studies [5, Reference Kawashima20]. All negative differences were assigned to zero. The reader is referred to the Appendix for additional mathematical details of the statistical framework.
Technical details
To infer individual mean reporting delays and perform nowcasting, nonparametric estimation of the reporting delay distribution was used. A parametric estimation was implemented only for verification purposes (Fig. 1). A partial pool model was used to calculate the common mean of the reporting delay shared across jurisdictions. Because of excessive computational time requirements, only a lognormal distribution was implemented in the partial pool model. The choice of the lognormal distribution was guided by its dominant selection while fitting the mixture models for various jurisdictions (Appendix Fig. 1).
Statistical inference was conducted within the Bayesian framework realised in CmdStan (version 2.26, https://mc-stan.org). Pre- and post-processing of the data and results were performed in the Python environment (version 3.8). The code snippets are available at http://github.com/aakhmetz/Excess-mortality-in-US-2020.
Results
We started our analysis by fitting the reporting delays for all 53 US jurisdictions independently of one another. The patterns of the mean reporting delays for all-cause excess mortality suggested clustering around a common value (Fig. 1a and Appendix Fig. 1). A common shared mean delay was calculated at 2.8 weeks (95% credible interval (CI): 2.4, 3.0 weeks). Most jurisdictions (32/53, 60.4%) had mean reporting delays within the interquartile range of 2.0–3.4 weeks. All jurisdictions except for four had mean reporting delays within the 95th percentile range of 0.5–5.0 weeks. Connecticut, North Carolina, Puerto Rico and West Virginia had mean reporting delays above the 95th percentile (shown as a dotted line in Fig. 1a). Suspecting those jurisdictions to be outliers, we first identified that North Carolina clearly deviated from the other jurisdictions with a mean delay of 10.4 weeks (median P-value 0.001) [Reference Dushoff, Kain and Bolker31]. Excluding North Carolina from the partial pool model, we determined that the other three jurisdictions also clearly deviated from the remainder: Connecticut reported death counts with a mean delay of 5.8 weeks (median P-value = 0.006), Puerto Rico with a mean delay of 4.7 weeks (median P-value = 0.028) and West Virginia with a mean delay of 5.5 weeks (median P-value = 0.010) (Appendix Fig. 2).
To identify jurisdictions experiencing delays in reporting not as extreme as the four jurisdictions above, we investigated the correlation between fraction of deaths reported within the first 10 days and mean reporting delays. Figure 1b shows clustering of points around the value of 61% identified earlier in the technical notes of the CDC [5]. We hypothesised that points located on the left-hand side of the corresponding dashed vertical line in Figure 1 represented jurisdictions with longer anticipated reporting delays.
Suspecting that longer delays were caused by larger numbers of reported COVID-19 cases in jurisdictions, we assessed the association between mean reporting delay and cumulative number of reported COVID-19 deaths per 100 000 from September to December 2020. We first compared a linear regression model with a non-zero slope and Student's t-distribution to minimise the effect of outliers with a null model with a zero slope. Following an LOOIC, the null model was rejected (ΔLOOIC = 10.1; relative weight for alternative model: 0.93). The alternative model predicted that an additional 4.5 reported COVID-19 deaths per week per 100 000 individuals was associated with 1 additional week in the reporting delay (Fig. 1c).
Figure 2 shows all-cause excess mortality adjusted by the reporting delays for six jurisdictions. Among them were two jurisdictions (Texas and Florida) with the highest numbers of reported COVID-19 deaths from September to December 2020, two jurisdictions (South Dakota and North Dakota) with the highest COVID-19 deaths per 100 000 over the same period, and two jurisdictions (Delaware and Georgia) where adjustment for reporting delay led to an increase instead of a decrease in the unadjusted counts over the last 2 weeks of 2020 (cf. solid and dotted lines in Fig. 2). Conducting a validation procedure for nowcasting using earlier cutoff times (Appendix Fig. 3), we found, similarly to [Reference Weinberger21], that the performance for nowcasting was conservative because the nowcasted death counts are likely to be underestimates of the final counts. The values from the latest snapshot published on 11 February 2021 are expected to reflect the final counts of 2020 with greater certainty compared with prior snapshots because the time elapsed between the publication date and the last week of 2020 exceeds the estimated mean reporting delay in most jurisdictions. The results of nowcasting for all jurisdictions are shown in Appendix Figure 4.
Next, we calculated excess mortality by jurisdiction for the entire period of the COVID-19 pandemic (from March to December 2020; Table 1) and for the second wave (from September to December 2020; Appendix Table 1). As expected, adjustment did not significantly alter the estimated numbers of deaths over the entire period. The jurisdictions with the largest percent changes following adjustment were New York City with the range of 58.4–62.4% (26 212–28 040 excess deaths), New Jersey at 32.4–37.0% (19 571–22 369 excess deaths) and Texas at 23.7–27.1% (40 413–46 127 excess deaths). The jurisdictions with the smallest changes following adjustment were Hawaii at 0.2–3.6% (24–357 excess deaths), Maine at 0.2–3.8% (32–497 excess deaths) and Alaska at 0.3–7.9% (14–316 excess deaths). Comparing the September–December 2020 period with the March–December 2020 period, the jurisdictions with the largest percent changes following adjustment were South Dakota at 36.9–54.0% (1073–1568 excess deaths), North Dakota at 33.9–50.7% (906–1354 excess deaths) and Missouri at 27.8–33.9% (6196–7543 excess deaths), while the jurisdictions with the smallest changes following adjustment were Puerto Rico at 0.2–2.5% (22–271 excess deaths), Hawaii at 0.5–5.9% (22–242 excess deaths) and Maine at 0.6–6.9% (32–374 excess deaths). The provided values indicate deviations from two thresholds of the median and the 95th percentile as it was described above.
The numbers in parenthesis indicate the 95% CI. The range shown in two columns for the excess deaths denotes the range of differences between the nowcasted number of deaths and each of two thresholds: the 95th percentile and the median of the posterior for the expected number of deaths.
Discussion
Adjustment of provisional all-cause excess deaths by reporting delays as currently documented on the CDC website relies on estimates obtained from provisional data for 2018–2019 [5]. Adjustment of delays using recent data from 2020 has been carried out, but not explicitly reported. Here, we quantified jurisdictional reporting delays using the latest data from the second half of 2020. According to our estimates, four jurisdictions out of 53 (Connecticut, North Carolina, Puerto Rico and West Virginia) reported excess mortality with substantial time lags that were likely related to administrative factors. On the one hand, the percentage of deaths reported within the first 10 days in those four prefectures was much smaller compared to the overall mean of 61% (Fig. 1b). On the other hand, there was no evident correlation between the mean reporting delay and the average weekly number of reported COVID-19 cases for September–December 2020 (Fig. 1c). However, longer reporting delays in some other jurisdictions such as South Dakota or North Dakota were likely be caused by the burden of the COVID-19 pandemic (Fig. 1c). We determined that an increase of approximately 4–5 reported COVID-19 deaths per 100 000 individuals per week was associated with an additional 1 week in the reporting delay. Overall, we found that jurisdictional reporting of death counts had delays of 2–3 weeks, which, nevertheless, represents a significant improvement compared with 2015–2016 [32]. In 2015–2016, 61.9% of all-cause deaths were reported within the first 5 weeks. However, the same fraction of deaths was reported within the first 10 days in 2020. Additionally, only some jurisdictions significantly deviated from that value during the second half of 2020 (Fig. 1b).
When we assessed jurisdictional all-cause excess mortality from September to December 2020, we found that Puerto Rico had the lowest estimated number, potentially because of significant delays in reporting. This result confirms the importance of accounting for reporting delays when analysing provisional death counts and performing nowcasting. Excess mortality from March to December 2020 was less affected by reporting delays; however, some underestimation of nowcasted death counts can still be observed.
From a methodological point of view, we employed several different approaches to estimate the reporting delay. Both non-parametric and parametric estimation of the reporting delay yielded similar results, confirming the validity of our methodology. The non-parametric estimation was the easiest to implement, but was prone to overfitting the data. In contrast, a partial pool model less sensitive to overfitting can be used for deriving common characteristics shared across jurisdictions [Reference Flaxman22, Reference Alexander, Zagheni and Barbieri23].
Our study had several limitations. First, we considered only all-cause excess mortality, and different underlying causes of death may have contributed differentially to the reporting delay. COVID-19-associated deaths may require additional post-mortem examinations, leading to longer reporting delays especially during the first year of the COVID-19 pandemic. The reporting delay can also differ based on age, race and ethnicity as described elsewhere [Reference McGough15, Reference Richardson33]. Second, the nowcasting procedure used in our study does not incorporate a time-varying trend in the reporting delay [Reference van de Kassteele, Eilers and Wallinga14, Reference McGough15] and does not include a random effect [Reference Lawless7, Reference Höhle and der Heiden8]. It also considers the contributions across different snapshots and across weeks to be independent. Implementation of a nowcasting procedure incorporating these factors would require a more sophisticated approach with construction of a two-dimensional contingency matrix of number of deaths with the week of death on one margin and the reporting delay on the other margin [Reference van de Kassteele, Eilers and Wallinga14]. This was not feasible for our aggregated dataset consisting of subsequently released snapshots. For example, some re-arrangements of weekly numbers were observed for weekly death counts in Vermont, which would lead to negative differences between subsequent snapshots, and thus negative elements of the contingency matrix. Furthermore, McGough and colleagues [Reference McGough15] also showed that nowcasting remains challenging when low reporting rates were observed (e.g. in Connecticut, North Carolina and Puerto Rico). Under these conditions, both simpler approaches such as those used in our study and the more sophisticated approaches used elsewhere [Reference Höhle and der Heiden8, Reference van de Kassteele, Eilers and Wallinga14, Reference McGough15] will be limited in their performance.
Our study shows necessity for adjustment of excess death counts by the reporting delay which is rather different across jurisdictions of the USA. A more detailed cause-specific and multifactorial analysis (e.g. by age, gender, ethnicity and socio-economic status) is required to further differentiate the reporting delay and allow more accurate real-time assessments of the burden of COVID-19 pandemic in the future.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0950268821001527
Acknowledgements
A.R.A. is grateful to Yun-Chun Wu (National Taiwan University) for helpful discussions, and Edanz Group (https://en-author-services.edanz.com/ac) for editing the initial draft of this manuscript. He also thanks two reviewers for their helpful comments and efforts towards improving the manuscript.
Financial support
The author received no specific funding for this work.
Conflict of interest
The author declares that he has no known competing financial or personal relationships that could have appeared to influence the work reported in this paper.
Ethical standards
The present study used publicly available data, and thus, did not require ethical approval.
Data availability statement
All data used for this study can be found at: http://github.com/aakhmetz/Excess-mortality-in-US-2020.