To the Editor—The basic reproductive number ${R_0}$ in epidemiology is defined as the average number of secondary infections likely to be produced by a primary infected person in a predominantly susceptible population. Mathematically, it is an accurate measure of disease spread (Eisenberg, ref 1). However, the value of ${R_0}$ is difficult to estimate from epidemiological data, for example, during the ongoing coronavirus disease 2019 (COVID-19) pandemic. In recent studies on COVID-19 (refs 2–4, from Hong and Li to Abbott, Hellewell, and Thompson), a time-varying ${R_0}$ has been computed, which the researchers called ${R_t}$. They ascertained that the decline in ${R_t}$ is due to continued lockdowns and nonpharmaceutical interventions. Although the conclusions in those studies are supported by the data, estimates of ${R_t}$ raise methodological issues that require further consideration. Here, we convey the essential and technical difficulties in estimating either ${R_0}$ or ${R_t}$ from the data, and we discuss how a model-based ${R_0}$ may not adequately capture the actual spread of the disease. Although these limitations are generally unavoidable (even after defining appropriate error structures and statistical modeling), the inappropriate use of this metric, especially in the ongoing COVID-19 pandemic, has important implications for infectious disease mitigation planning.
Suppose that ${Y_0}$ is the number of infected people at time ${t_0}$ who could generate secondary infections between ${t_0}$ and ${t_1}$, say, ${Y_1}$. However, the testing of all the potentially infected individuals during this period need not be complete. ${Y_1}$ could generate further secondary infections between ${t_1}$ and ${t_2}$, say, ${Y_2}$, and so on. Again, the testing of the samples through contact tracing need not be complete (Fig. 1). That is, ${Y_{i + 1}}$ at ${t_{i + 1}}$ could be generated by ${Y_i}$ at ${t_i}$ for i = 0, 1, …. In reality, during most epidemics, and especially during the COVID-19 pandemic, only a fraction of ${Y_i}$, say, ${Y'_i}$, is ever reported (and diagnosed, because testing is incomplete), such that ${Y'_i} < {Y_i}$ for all i (Gibbons, Mangen, and Plass, ref 5; Krantz, Polyakov, and Rao, ref 6). This partial reporting (including partial diagnosis and partial testing) could also be due to lockdowns and a lack of proper knowledge regarding COVID-19 (forced or natural behavior changes in the community, eg, lockdowns and use of masks). The average number of secondary infections generated by ${Y_i}$ individuals is ${Y_{i + 1}}/{Y_i}$. If the number of infected people varies or infected people aggregate rapidly, then the geometric mean is more appropriate than the arithmetic mean for determining expected reproductive numbers. Not only is the former far better suited than the latter to deal with both fluctuations and numbers that are not independent of one another, it is also the only correct mean when results are presented as ratios (refs 7–9, from Hayes to Rao, Krantz, and Bonsall).
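The difference between the two means can be illustrated with a minimal sketch. The counts below are hypothetical numbers chosen only to make the arithmetic transparent (they are not data from any study), and the function name `ratio_means` is an assumption introduced for this illustration.

```python
import math


def ratio_means(counts):
    """Return (arithmetic_mean, geometric_mean) of the successive ratios Y[i+1] / Y[i]."""
    ratios = [later / earlier for earlier, later in zip(counts, counts[1:])]
    arithmetic = sum(ratios) / len(ratios)
    # Geometric mean computed through logarithms to avoid overflow on long series.
    geometric = math.exp(sum(math.log(r) for r in ratios) / len(ratios))
    return arithmetic, geometric


# Hypothetical infected counts Y_0 .. Y_4 that fluctuate but end where they started.
counts = [100, 200, 100, 200, 100]
arithmetic, geometric = ratio_means(counts)

print(f"arithmetic mean of ratios: {arithmetic:.3f}")
print(f"geometric mean of ratios:  {geometric:.3f}")
```

In this example the successive ratios are 2, 0.5, 2, and 0.5. Their arithmetic mean is 1.25, which would suggest sustained growth, whereas their geometric mean is 1, which matches the fact that the final count equals the initial count.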
Suppose that ${Y_{i + k}}$ is the number of infected people at time ${t_{i + k}}$ when lockdowns are introduced at k for k = 0, 1, 2, ….
Assume that
The percentages of growth in the number of infected people during the 4 time intervals (${t_{i + k}}$, ${t_{i + k + 1}}$) for k = 0, 1, 2, 3, 4 are, say, ${\gamma_{i + k}}\%$, respectively. These growth percentages are computed as
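One conventional way to write such a growth percentage, offered here as an assumption based on the definitions above and not necessarily the authors' exact expression, is

$$\gamma_{i + k} = 100 \times \frac{Y_{i + k + 1} - Y_{i + k}}{Y_{i + k}},$$

so that, under this convention, each ratio of successive counts satisfies ${Y_{i + k + 1}}/{Y_{i + k}} = 1 + {\gamma_{i + k}}/100$.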
The secondary infections caused by an infected individual (Fig. 1) are the people who were not traced by the system. This step assumes that all of the infected people who were identified by the system were either quarantined or otherwise prevented from spreading the virus further. Only a proportion of the infected people who were tested and identified during lockdowns was reported, and others were either not diagnosed or not reported. Asymptomatic individuals could be anywhere in the process; that is, they were part of the identified and reported group or were among those who had not been contact traced or diagnosed. The geometric mean number of secondary infections would be appropriate because we were considering proportionate secondary infections. Hence, the mean number of secondary infections during (${t_i}$, ${t_{i + 4}}$) is given by
Similarly, if the trend in Eq. (1) continues for $k = 0, 1, \ldots, n$, then the mean number of secondary infections during the lockdown period (${t_i}$, ${t_{i + n}}$) is given by
This point applies to several studies in which reporting is not constant over the study period. Even if the testing numbers and testing patterns are constant over a period, the proportion of underreported cases may not be constant. Thus, the estimation of ${R_0}$ is likely to be highly variable in any given situation. For the practical purposes of computing ${R_0}$ or ${R_t}$, we usually have data on ${Y'_i}$, the number tested.
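The effect of a non-constant reporting proportion can be sketched as follows. The true counts, the reporting fractions, and the function name `observed_ratios` are all hypothetical assumptions for illustration. The sketch treats the reported count as ${Y'_i} = {p_i}{Y_i}$ for a reporting fraction ${p_i}$, so each observed ratio equals the true ratio multiplied by ${p_{i + 1}}/{p_i}$.

```python
def observed_ratios(true_counts, reporting_fractions):
    """Return (true_ratios, observed_ratios) for successive time points.

    Observed counts are modeled as Y'_i = p_i * Y_i, where p_i is the
    (hypothetical) fraction of infections reported at time t_i.
    """
    observed = [y * p for y, p in zip(true_counts, reporting_fractions)]
    true_r = [b / a for a, b in zip(true_counts, true_counts[1:])]
    obs_r = [b / a for a, b in zip(observed, observed[1:])]
    return true_r, obs_r


# Hypothetical true infections growing by a factor of 1.5 per interval.
true_counts = [1000, 1500, 2250, 3375]

# Scenario A: reporting fraction is constant, so observed ratios match true ratios.
true_r, obs_constant = observed_ratios(true_counts, [0.4, 0.4, 0.4, 0.4])

# Scenario B: reporting fraction declines over time, so observed ratios are biased low.
_, obs_declining = observed_ratios(true_counts, [0.5, 0.4, 0.3, 0.2])

print("true ratios:              ", [round(r, 3) for r in true_r])
print("observed, constant p:     ", [round(r, 3) for r in obs_constant])
print("observed, declining p:    ", [round(r, 3) for r in obs_declining])
```

In scenario B the observed ratios fall toward 1 even though the true ratio stays at 1.5 throughout, which illustrates how an apparent decline can arise from reporting changes alone under these assumptions.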
When the ratios ${Y_{i + k + 1}}/{Y_{i + k}}$ for $k = 0, 1, \ldots, n$ are considered, then the geometric mean of these growth rates would be
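Under the standard definition of a geometric mean, and as a sketch that is not necessarily the authors' exact displayed expression, the $n + 1$ ratios combine as

$$\left( \prod_{k = 0}^{n} \frac{Y_{i + k + 1}}{Y_{i + k}} \right)^{1/(n + 1)} = \left( \frac{Y_{i + n + 1}}{Y_i} \right)^{1/(n + 1)}.$$

The product telescopes, so the result depends only on the first and last counts. When only the reported counts ${Y'_i}$ are available, the same expression evaluated on ${Y'_i}$ matches the expression on ${Y_i}$ only if the reported fraction is the same at ${t_i}$ and ${t_{i + n + 1}}$.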
However, ${\widehat R_0}$ or ${\widehat R_t}$ (the estimated basic and time-varying reproductive numbers at the start of or during an epidemic, respectively) may not be at all close to ${R_0}$ or ${R_t}$, even if the ${Y_i}$ values are generated from a mathematical model for a period $i > 0$ that uses data on susceptible, exposed, infected, and recovered individuals and in which the underlying epidemiological processes are time varying. This factor will introduce bias into estimates of model-based basic reproductive rates and time-varying reproductive rates. Some other limitations in various studies arise from computing ${R_t}$ after lockdowns were relaxed. Possibly, heterogeneity exists in the data that could have masked ${R_t}$ measures due to the computation of subnational and regional parameters in several COVID-19–affected countries.
The lesson here is that mathematical models must be used with care. They must be fitted to the data, and their accuracy must be carefully monitored and quantified (Krantz and Rao, ref 10). Any alternative course of action could lead to wrong interpretation and mismanagement of the disease with disastrous consequences.
Acknowledgments
We thank Dr Natasha Martin, University of California San Diego, and Dr Chris T. Bauch, University of Waterloo, for providing useful comments on our original draft and pointing us to critical literature.
Authors’ contributions
All authors contributed to the writing. ASRSR and SGK designed the study, and ASRSR conceptualized the study and wrote the first draft. MBB, TK, SB, DS, RB, and SK contributed to writing, editing, and discussions. All authors approved the manuscript.
Financial support
No financial support was provided relevant to this article.
Conflicts of interest
All authors report no conflicts of interest relevant to this article.