We estimated the underreporting of the novel coronavirus or COVID-19 as of March 9, 2020, in various countries until the first peak occurred in each country that had reported ≥500 cases of COVID-19 as of March 9, 2020. Our retrospective model-based estimations of underreporting (including those due to underdiagnosis) will be helpful in assessing pandemic preparedness. The ratio of reported COVID-19 cases to model-based predictions of COVID-19 for 8 major countries that had reported ≥500 cases up to March 9, 2020, are provided (Table 1, column l). COVID-19 reporting in France, Germany, Italy, and South Korea was comparatively much better than in other countries. For the United States, the data as of March 9, 2020, were not sufficient to provide a robust estimate.
According to Situational Report 49, released by the World Health Organization (WHO) on March 9, 2020,1 there had been 109,000 cases of COVID-19 and 3,800 related deaths worldwide. Most of these cases (~80,700) were from China and 8 other countries: Italy, South Korea, Iran, France, Germany, Spain, the United States, and Japan. All of these countries have reported ≥500 confirmed cases of COVID-19.1,2 However, identification of possible cases of COVID-19 is arguably more important in controlling high traffic to hospitals and emergency departments.3 Earlier models on COVID-19 did reflect the importance of data collection.4
Actual pandemic preparedness depends on true cases in the population, whether or not they are identified. Preventing transmission to the susceptible from these true cases depends on how well we can assess underreported and underdiagnosed situations promptly. A retrospective analysis of the data will be useful for the next epidemic but not for the current epidemic. Hence, we are proposing to use our methods, which we have been developing in recent years, to provide model-based estimates of underreporting for COVID-19 within a few weeks.
New methods using harmonic analysis and wavelets that we are developing—some of them recently accepted—will be of timely use.5 We propose a model-based evaluation of underreporting of coronavirus (COVID-19) in various countries using the methods we recently developed using harmonic analysis,5 that is, to develop full epidemic data from partial data (using a wavelet approach). However, the current article is a preliminary analysis and modeling was done using the data available as of March 9, 2020. These data do not represent the pandemic in its entire scale; such data will need to be reevaluated when the pandemic is completely controlled. However, our predictions for underreporting as of March 9 in a couple of European countries were close to the reported number of COVID-19 cases as more cases surfaced from March 9 to March 16, 2020. Wavelets of reported cases and adjusted estimates with the underreported cases are shown in Figure 1. We also anticipate using other techniques5–9 to further understand the reporting once more data become available.
Data, Methods, and Models
We collected COVID-19 and population data for each country from the World Health Organization (WHO),1 Worldometer,2 and World Bank10 sources. We used population densities, proportion of the population living in urban areas, and populations delineated by 3 age groups: 0–14 years, 15–64 years, and ≥65 years. Furthermore, we considered daily new cases (>10) up to the first reported peak of COVID-19 cases and the corresponding date ranges for all the countries for which such data were available. This range of days varied between 8 and 16 days (Table 1). We use 2 coupled differential equations $\mathop {s\left( t \right)^\cdot } = - \beta s\left( t \right)k\left( t \right)$ and $\mathop {k\left( t \right)^\cdot } = \beta s\left( t \right)k\left( t \right)$ , where s(t) and k(t) represent susceptible and infected at time t, and β is the transmission rate that is assumed to be invariant within the range of days for which the infection numbers in each country were computed. The respective β values per 100,000 thousands for the age groups 15–64 years and ≥65+ years considered for various countries are as follows: China: 0.8×1.5 and 1.5, 0.75; Italy: 1.5 and 3.0; Iran: 1.5 and 9.0; South Korea: 2.25 and 4.50; France: 1.50 and 3.0; Spain: 3.0 and 6.0; Germany: 1.5 and 3.0; and the United States: 0.75 and 1.5. The difference between model-predicted numbers and the actual numbers reported within the range were treated as underreported, which includes underdiagnosed cases. We constructed the Meyer wavelets for the reported and adjusted data after adjusting the infected number in the population for underreporting. The Meyer wavelet is a differentiable function, ψ(ω), which is infinitely differentiable in the domain with a function u as follows:
Here, u(x) = 0 for x < 0, u(x) = x for x ∈ (0,1), and u(x) = 1 for x1 For further details, please refer to Krantz et alReference Krantz, Polyakov and Rao5 and Krantz.Reference Krantz9
As of March 16, 2020, we did not have enough data on COVID-19 transmissibility rates from infected to uninfected persons based on migration of populations to construct countrywide networks. We also had no clear idea of the duration that SARS-CoV-2 virus remains active on nonliving surfaces such as plastics, metals, paper, etc; thus, we did not consider the interaction between humans and nonliving surfaces. Mathematical modeling can be made more complex by adding more parameters, but caution is necessary to ensure that these studies are well designed and that these parameters use readily available, scientifically collected data. Once we obtain more data on the duration of COVID-19 living on nonliving surfaces, we can build more complex models with more parameters.
Acknowledgments
We thank the journal’s Editor-in-Chief, Associate Editor (Handling), and the Statistical Consultant for their constructive comments.
Financial support
No financial support was provided relevant to this article.
Conflicts of interest
All authors report no conflicts of interest relevant to this article.
Authors contributions
Both the authors contributed in writing. ASRS Rao designed the study, developed the methods, collected data, performed analysis, computing, wrote the first draft. SG Krantz designed the study, contributed in writing, performed analysis, editing the draft.