Introduction
Lassa fever (LF), caused by Lassa virus (LASV), is increasingly recognised as an important rodent-borne viral haemorrhagic fever presenting a severe public health threat to some of the communities in sub-Saharan West Africa [Reference Gibb1]. Discovered in 1969 [Reference Carey2], LF is endemic to much of rural Nigeria and regions in the Mano River Union [Reference Richmond and Baglole3]. LASV transmits from human to human, as well as via the zoonotic cycle [Reference Gibb1, Reference Richmond and Baglole3, 4]. LF has a high case fatality rate ranging from 1% in the community to over 60% in hospital settings [Reference Gibb1, 4, Reference Shaffer5]. The common reservoir of LASV is Mastomys natalensis, one of the most widespread rodent species in sub-Saharan Africa [Reference Gibb1, Reference Richmond and Baglole3], which exhibits sensitive population dynamics to the water level, e.g. rainfall, flooded agricultural activities [Reference Granjon6, Reference Leirs7]. Previous studies have recognised the ecological association between the population levels of rodents and rainfall [Reference Leirs, Verhagen and Verheyen8–Reference Leirs10].
LF epidemics typically start in November and last until May of the following year, with the majority of cases occurring in the first quarter of the following year, in addition to sporadic cases reported throughout the year. The 2017–18 epidemic in Nigeria was an unprecedented LF epidemic in the country's history [Reference Roberts11], which resulted in 400 confirmed cases, including 97 deaths, between January and March 2018 [12]. The most recent epidemic in Nigeria has already caused 526 confirmed cases from January to March of 2019, which included 121 deaths [12]. The five states of Edo, Ondo, Ebonyi, Bauchi and Plateau are the only states that have been among the top 10 hit hardest states in terms of number of LF cases in both the 2018 (85.5% of total national cases) and 2019 (85.7% of total national cases) epidemics. While there have been discussions about the association of rainfall level and LF incidence rate [Reference Fichet-Calvet and Rogers13, Reference Mylne14], this association has not yet been demonstrated and quantified. This work aims to study the epidemiological features of epidemics in different Nigerian regions between January 2016 and March 2019. We estimate LF infectivity in terms of the reproduction number (R) and quantify the association between R and state rainfall. We explore the spatial heterogeneity of the LF epidemics and summarise the overall findings with model-average estimates.
Data and methods
Data
Weekly LF surveillance data are obtained from the Nigeria Centre for Disease Control (NCDC), where the data are publicly available from the weekly situation reports released by NCDC [12]. Laboratory-confirmed case time series are used for analysis. We examine the major epidemics that occurred between January 2016 and March 2019 across the whole country and the aforementioned five states that were among the top 10 hardest-hit states in both the 2018 and 2019 epidemics, i.e. Edo, Ondo, Ebonyi, Bauchi and Plateau. The state rainfall records of each state were collected on monthly average basis from the historical records of the World Weather Online website [15]. Figure 1(a) and (b) shows the rainfall time series of the five states and the weekly reported LF cases across the entire Nigeria.
Intuitive coincidence between rainfall and epidemic
To test the credibility of the coincidence between rainfall and LF epidemic, we use a simple statistical regression model of ‘case ~ exp(α × rainfall) + θ', where α and θ are free parameters to be estimated. The ‘rainfall’ in the model represents the state rainfall time series with lag of 4–9 months. This lag term corresponds to the time interval between the rainfall and the development of rodent population [Reference Leirs7]. We check the least-square fitting outcomes of these regression models and select the model of lagged rainfall with the highest goodness-of-fit. The fitting significance is treated as the initiation of the quantitative association between state rainfall and the LF epidemic.
Modelling and estimation
Four different nonlinear growth models are adopted to pinpoint the epidemiological features of each epidemic. The models are the Richards, three-parameter logistic, Gompertz and Weibull growth models. These simple structured models are widely used to study S-shaped cumulative growth processes; e.g. the curve of a single-wave epidemic and have been extensively studied in previous work [Reference Sebrango-Rodríguez16, Reference Tsoularis and Wallace17]. These models consider cumulative cases with saturation in the growth rate to reflect the progression of an epidemic due to reduction in susceptible pools or a decrease in the exposure to infectious rodent populations. The extrinsic growth rate increases to a maximum (i.e. saturation) before steadily declining to zero. The modelling and fitting via the growth models of the epidemic curve are illustrated in Figure 2.
We fit all models to the weekly reported LF cases in different regions and evaluate the fitting performance by the Akaike information criterion (AIC). We adopt the standard nonlinear least squares (NLS) approach for model fitting and parameter estimation, following [Reference Sebrango-Rodríguez16, Reference Hsieh and Ma18]. A P-value <0.05 is regarded as statistically significant and the 95% confidence intervals (CIs) are estimated for all unknown parameters. As we are using the cumulative number of the LF cases to conduct the model fitting, some fitting issues might occur, as per the studies in King et al. [Reference King19], due to the non-decreasing nature in the cumulative summation time series. The models are selected by comparing the AIC to that of the baseline (or null) model. Only the models with an AIC lower than the AIC of the baseline model are considered for further analysis. Importantly, the baseline model adopted is expected to capture the trends of the time series. Since the epidemic curves of an infectious disease commonly exhibit autocorrelations [Reference Tang20], we use autoregression (AR) models with a degree of 2, i.e. AR(2), as the baseline models for growth model selection. We also adopt the coefficient of determination (R-squared) and the coefficient of partial determination (partial R-squared) to evaluate goodness-of-fit and fitting improvement, respectively. For the calculation of partial R-squared, the AR(2) model is used as the baseline model. The growth models with a positive partial R-squared (indicating fitting improvement) against the baseline AR(2) model will be selected for further analyses.
After the selection of models, we estimate the epidemiological features (parameters) of turning point (τ) and reproduction number (R) via the selected models. The turning point is defined as the time point of a sign change in the rate of case accumulation, i.e. from increasing to decreasing or vice versa [Reference Sebrango-Rodríguez16, Reference Hsieh and Ma18]. The reproduction number, R, is the average number of secondary human cases caused by one primary human case via the ‘human-to-rodent-to-human’ transmission path [Reference Hsieh and Ma18, Reference Wallinga and Lipsitch21]. When the population is totally (i.e. 100%) susceptible, the R will equate to the basic reproduction number, commonly denoted as R 0 [Reference Wallinga and Lipsitch21, Reference Zhao22]. The reproduction number (R) is given in Eqn (1),
Here, γ is the intrinsic per capita growth rate from the nonlinear growth models and κ is the serial interval of the LASV infection. The serial interval (i.e. the generation interval) is the time between the infections of two successive cases in a chain of transmission [Reference Wallinga and Lipsitch21, Reference Cori23–Reference Zhao25]. The function h(·) represents the probability distribution of κ. Hence, the function M(·) is the Laplace transform of h(·) and specifically, M(·) is the moment generating function (MGF) of a probability distribution [Reference Wallinga and Lipsitch21]. According to previous work [Reference Iacono26], we assume h(κ) to follow a Gamma distribution with a mean of 7.8 days and a standard deviation (SD) of 10.7 days. Therefore, R can be estimated with the values of γ from the fitted models [Reference Hsieh and Ma18, Reference Wallinga and Lipsitch21, Reference Ma27, Reference Zhao28]. The state Rs were estimated from the γs of the fitted epidemic growth curves of each state. Similarly, the national Rs are estimated from the γs of the epidemic growth curves fitted to the national number of cases time series in different epidemic periods.
We then summarise the κ and R estimates via the AIC-weighted model averaging. The AIC weights, w, of the selected models (with positive partial R-squared) are defined in Eqn (2),
Here, AICi is the AIC of the i-th selected model and the AICmin is the lowest AIC among all selected models. Thus, the i-th selected model has a weight of w i. The model-averaged estimator is the weighted average of the estimates in each selected model, which has been well studied in previous work [Reference Sebrango-Rodríguez16, Reference Anderson and Burnham29].
For the AIC-based model average of the R, there could be the situation that no growth model is selected according to the partial R-squared. In such cases, instead of the model average, we report the range of the R estimated from all growth models.
Testing the spatial heterogeneity of the LF epidemics
After finding the model-averaged estimates, we apply Cochran's Q test to examine the spatial heterogeneity of the epidemics in different regions over the same period of time [Reference Lin30]. For instance, we treat the model-averaged R estimates as the univariate meta-analytical response against different Nigerian regions (states) and further check the heterogeneity by estimating the significance levels of the Q statistics. A P-value <0.05 is regarded as statistically significant.
Association between rainfall and reproduction number
Similar to the approach in the previous study [Reference Zhao31], the association between the state rainfall level and LASV transmissibility are modelled by a linear mixed-effect regression (LMER) model in Eqn (3),
Here, E(·) represents the expectation function and j is the region index corresponding to different regions (states). Term c j is the interception term of the j-th region to be estimated and it is variable from different regions, serving as the baseline scale of transmissibility in different states. The term t denotes the cumulative lag in the model and 〈rainfallj,t〉 represents the average monthly rainfall of the previous t months from the turning point, τ, of the j-th region. The range of lag term, t, will be considered from 4 to 9 months, which is explained by the time interval between the peak of the rainfall and the peak of rodent population [Reference Leirs7]. As illustrated in Figure 3, the reproduction numbers, R js, are estimated for different epidemics from the selected growth models. The regression coefficient, β, is to be estimated. Hence, the term (eβ – 1) × 100% is the percentage changing rate (of R), which can be interpreted as the percentage change in transmissibility due to a one-unit (mm) increase in the average of the monthly rainfall level over the past 7 months. The framework of the regression is based on the exponential form of the predictor to model the expectation of transmissibility (e.g. R); this framework is inspired by previous work [Reference Ali32–Reference Zhao35]. To quantify the impacts of state rainfall, we calculate the percentage changing rate with different cumulative lags (t) from 4 to 9 months and estimate their significant levels. Only the lag terms (t) with significant estimates are presented in this work.
We present the analysis procedure in a flow diagram in Figure 3. All analyses are conducted by using R (version 3.4.3 [36]) and the R function ‘nls' is employed for the NLS estimation of model parameters.
Results and discussion
The rainfall time series of the five states and the weekly reported LF cases of the whole of Nigeria are shown in Figure 1(a) and (b). We observe that the major LF epidemics usually occur in Nigeria between November and May of the following year. The cumulative lagged effects were observed via matching the peak timing of the rainfall and epidemic curves. In Figure 1(c), we shift the rainfall time series of the five states by + 6 months to match the trends of the national LF epidemic curve in Nigeria. In Figure 1(d) and (e), we find that the fit has a P-value <0.0001, which indicates a statistically significant association between the LF cases and shifted rainfall curve.
We fit four different growth models to the LF confirmed cases and estimate the model-average reproduction number (R) after model selection. We show the growth model fitting results in Figure 4 and the model estimation and selection results in Table 1. Most of the models have positive partial R-squared against the baseline AR(2) model. Most of the regions exhibit an epidemic turning point (τ) ranging from the epidemiological week (EW) 4–10, i.e. from the end of January to mid-March, in each year. Out of four epidemics in the states of Bauchi and Plateau, there are three estimated τs after EW 10 (Table 1). Larger τ means the more extension in the duration of the epidemics. The turning point (τ) could be affected by several factors including seasonality, intervention program and depletion of the susceptible pool. The estimated reproduction number (R) of the epidemics in different regions varies from 1.06 to 1.62. At the national level, the R value for the whole of Nigeria in 2016–17 (R = 1.23 with 95% CI 1.22–1.24) is significantly lower than in the epidemics of 2017–18 (R = 1.33 with 95% CI 1.29–1.37) and 2018–19 (R ranged from 1.08 to 1.36). The state of Edo has the highest estimated R (1.62 in 2017–18 and 1.09 in 2018–19) and this state also has the largest number of LF cases in the epidemics of both 2017–18 (41.9% of all cases) and 2018–19 (36.0% of all cases). Hypothesised spatial heterogeneity in the R is tested via Cochran's Q test. The testing results for the Rs in the five states are significant (i.e. P-value <0.05) for both the 2017–18 and 2018–19 LF epidemics. Thus, we report the existence of spatial heterogeneity in LF epidemics in Nigeria.
The ‘improvement’ is the partial R-squared. The ‘weight’ is the AIC-weight of the selected model, which is used for calculating the model-averaged estimates. The ‘NA’ means a summary term that is not applicable to a certain model. The notation ‘–’ means that the model cannot achieve a converging fitting outcome. The model-averaged estimates in each region are highlighted in grey.
Many previous studies adopted the instantaneous reproduction number, commonly denoted by R t, which can be estimated by a renewable equation, to quantify the transmissibility of infectious diseases [Reference Wallinga and Lipsitch21, Reference Cori23, Reference Fraser24, Reference Ali, Kadi and Ferguson37, Reference Wallinga and Teunis38]. The factors that affect the changing dynamics of R t include
(i) the depletion of the susceptible population [Reference Ali32] or decrease in the exposure to infectious sources,
(ii) the change, usually it is the improvement, in the unmeasurable disease control efforts, e.g. contract tracing, travel restriction, school closure, etc., [Reference Zhao, Bauch and He39–Reference Tang42] and local awareness of the epidemic [Reference Camacho33], and
(iii) the natural features of the pathogen, e.g. its original infectivity and other interepidemic factors [Reference Ali32, Reference Camacho33, Reference Zhao35].
In this work, we choose to use the average reproduction number (R) rather than R t, as the measurement of the LASV transmissibility. The estimated R summarises the LASV transmissibility over the whole period of an epidemic. The reasons why we prefer R rather than R t are as follows. First, the temporal changes of the susceptible population or decrease in the exposure to infectious sources are removed from the R estimates due to the nature of the growth models. Second, the changes of the susceptible population and/or disease awareness or control measures and the effect of the rainfall cannot be disentangled in the time-varying reproduction number, R t, the average reproduction number (R) adopted is a better proxy to explore the association between LF infectivity and rainfall. With respect to point (iii) and other heterogeneities of epidemics in different regions, we account for this issue by including the ‘region' dummy variables in the LMER model in Eqn (3). These dummy variables serve as random effects to offset the regional heterogeneities of LF epidemics. Therefore, we can then quantify a general effect, i.e. the β in Eqn (3), of the lagged rainfall on the LASV R estimate among different Nigerian places.
The association between total rainfall in a state and the LASV transmissibility (R) is modelled and quantified by the LMER model. In Figure 5, we find a positive relation between rainfall and R. The estimated changing rate in R under a one-unit (mm) increase in the average monthly rainfall is summarised with different cumulative lag terms from 4 to 9 months (the t in Eqn (3)). The range of lag in the rainfall from 4 to 9 months had previously been explained by the time interval between the peak of the rainfall and the peak of the rodent population [Reference Leirs7]. The estimates of the rainfall-associated changing rate in R with different lag terms were summarised in Table 2. We report the most significant (i.e. with the lowest P-value) regression estimates that appear with a cumulative lag of 7 months. The habitats of the LASV reservoir, i.e. rodents, include irrigated and flooded agricultural lands that are commonly found in and around African villages [Reference Granjon6]. The 7-month lag also coincides with the period between the dry and rainy seasons [Reference Akhmetzhanov, Asai and Nishiura43]. The association between rodent population dynamics and rainfall levels has been demonstrated in a number of previous studies [Reference Granjon6–Reference Leirs10]. Hence, we consider the 7-month lagged estimation as our main results. Namely, a one-unit (mm) increase in the average monthly rainfall over the past 7 months is likely to cause a 0.62% (95% CI 0.20%–1.05%) rise in the R of the LF epidemic. We also remark that this ‘one-unit (mm) increase in the average monthly rainfall over the past 7 months’ is equivalent to ‘7-unit (mm) increase in the total rainfall over the past 7 months’. The present finding of the impact of lagged rainfall on LF epidemics suggests that the knowledge of such weather-driven epidemics could be gained by referring to past rainfall levels. For instance, if a relatively high amount of rainfall occurs, local measures, such as rodent population control, could be effective to reduce the LF risk. This speculation could also be verified by examining the rodent population data of the Nigerian regions included in this work. The findings in this work are of public health interest and are helpful for policymakers in LF prevention and control.
On the one hand, our findings suggest the existence of an association between rainfall and LASV transmissibility, which could be affected by the population dynamics of rodents [Reference Fichet-Calvet and Rogers13]. On the other hand, the positive relation between rainfall and R indicates that rainfall, particularly in states with a high LF risk, can be translated as a warning signal for LF epidemics. The modelling framework in this study should be easily extended to other infectious diseases.
Limitation
Our work contains limitations. As in some African countries, the weather data are available only from a limited number of observatory stations and thus it is not sufficient to capture more accurate spatial variability. In this work, instead of exploring the spatial differences in the associations between rainfall and LF epidemic, we relaxed the setting and studied a general relationship. We qualified the general rainfall-associated changing rate of R in Nigeria. For the transmissibility estimation, our growth modelling framework can provide the estimates of R, but not the basic reproduction number commonly denoted as R 0. However, according to the theoretical epidemiology [Reference Zhao22, Reference Ma27, Reference Zhao35, Reference Earn44, Reference Keeling and Rohani45], the R 0 can be determined by R 0 = R/S, where S denotes the population susceptibility. Although S is not involved in our modelling framework, the information of S could be acquired from local serological surveillances. The existing literature reported 21.3% seroprevalence among Nigerian humans by the enzyme-linked immunosorbent assay (ELISA) [Reference Tomori46]. Hence, the R 0 can be calculated as 1.63 by using S = 1–21.3% = 0.787 and the R = 1.28 as the average of the 2016–18 LF epidemics. This was a data-driven modelling study, and we quantified the effect of rainfall as a weather-driven force of R based on previous ecological and epidemiological evidences [Reference Leirs7, Reference Akhmetzhanov, Asai and Nishiura43]. Since the transmission of LASV mainly relies on the rodent population, the factors including seasonality, agricultural land-using, subtropical or tropical forest coverage that could impact rodent ecology should be relevant and helpful in the analysis. However, due to availability of data, the agricultural land-using factors, e.g. pastureland, irrigated land, flooded agricultural land usage and forest coverage were absent in our analysis, which should be studied in the future if they become available.
Conclusions
The LF epidemic reproduction numbers (R) of the whole of Nigeria in 2017–18 (R = 1.33 with 95% CI 1.29–1.37) and 2018–19 (R ranged from 1.08 to 1.36) are significantly higher than in 2016–17 (R = 1.23 with 95% CI 1.22–1.24). There is significant spatial heterogeneity in the LF epidemics of different Nigerian regions. We report clear evidence of rainfall impacts on LF epidemics in Nigeria and quantify this impact. A one-unit (mm) increase in the average monthly rainfall over the past 7 months could cause a 0.62% (95% CI 0.20%–1.05%) rise in the R. The state rainfall information has potential to be utilised as a warning signal for LF epidemics.
Acknowledgements
We appreciate the helpful comments from anonymous reviewers that improved this manuscript. The funding agencies had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.
Author contributions
SZ conceived, carried out the study and drafted the first manuscript. SZ and DH discussed the results. All authors revised the manuscript and gave final approval for publication.
Conflict of interest
The authors declare that they have no competing interests.
Ethical standards
Since no personal data were collected, ethical approval and individual consent were not applicable.