INTRODUCTION
The largest Ebola virus disease (EVD) epidemic in history began in Guinea in December 2013. As of 30 March 2016, the EVD epidemic had resulted in over 28 600 cases and 11 300 fatalities, mainly in Guinea, Liberia and Sierra Leone [1]. The most recent case was reported in Liberia in March 2016, and WHO has warned that new flare-ups may occur in the affected countries [2]. With over 17 000 survivors in West Africa and chronic persistence of virus in some survivors [Reference MacIntyre and Chughtai3], such flare-ups have already occurred and may continue to be a risk. During an epidemic emergency, it is important to provide scientific justification for rapid policy decisions. Early epidemic models can frame policy decisions by providing epidemiological characteristics to aid effective disease control: they allow policy makers and scientists to estimate disease epidemiology parameters (such as the basic reproduction number (R 0), serial interval and infectious period) from limited, early disease surveillance data, which can assist in understanding how a disease spreads during the early phase of an outbreak.
For previous EVD outbreaks in the Democratic Republic of Congo (1995) and Uganda (2000), modelling approaches consisted of both traditional Susceptible–(Exposed)–Infected–Removed models and more context-specific compartment models (which account for hospital transmission and post-death transmission). Modelling hospital and post-death transmission can provide justification for intervention effects in different transmission contexts [Reference Drake4]. The choice of modelling approach may affect modelling outputs; for example, one study [Reference Weitz and Dushoff5] found that R 0 tends to be underestimated if post-death transmission dynamics are neglected. For the current EVD epidemic, both traditional and context-specific approaches have also been employed; however, the impact of compartment model design on epidemic parameter estimation and trajectory projection has not yet been systematically evaluated – this is one motivation for our review.
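As a minimal illustration of the context-specific structure described above, the sketch below integrates a compartment model in which hospitalised (H) and funeral (F) classes sit alongside the usual S, E, I and R classes. It is a simplified example of this model family, not the model of any reviewed study; the function name, the flow structure and every parameter value are hypothetical placeholders chosen only for illustration.

```python
def simulate_seihfr(days, dt=0.1, population=1_000_000, initial_exposed=10,
                    beta_community=0.30, beta_hospital=0.10, beta_funeral=0.40,
                    latent_days=10.0, community_days=5.0, hospital_days=5.0,
                    funeral_days=2.0, hospitalised_fraction=0.5,
                    fatality_fraction=0.7):
    """Forward-Euler integration of an S-E-I-H-F-R model (rates per day).

    All default values are illustrative assumptions, not fitted estimates.
    """
    s = float(population - initial_exposed)
    e = float(initial_exposed)
    i = h = f = r = 0.0
    cumulative_cases = 0.0
    for _ in range(int(round(days / dt))):
        # Community, hospital and funeral contacts all contribute to infection.
        infections = (beta_community * i + beta_hospital * h
                      + beta_funeral * f) * s / population
        onsets = e / latent_days                  # E -> I
        leaving_community = i / community_days    # I -> H, F or R
        leaving_hospital = h / hospital_days      # H -> F or R
        burials = f / funeral_days                # F -> R (buried)

        admitted = hospitalised_fraction * leaving_community
        not_admitted = leaving_community - admitted
        deaths = fatality_fraction * (not_admitted + leaving_hospital)
        recoveries = (1.0 - fatality_fraction) * (not_admitted + leaving_hospital)

        s += dt * (-infections)
        e += dt * (infections - onsets)
        i += dt * (onsets - leaving_community)
        h += dt * (admitted - leaving_hospital)
        f += dt * (deaths - burials)
        r += dt * (recoveries + burials)
        cumulative_cases += dt * onsets
    return {"state": (s, e, i, h, f, r), "cumulative_cases": cumulative_cases}


# Same outbreak with and without post-death (funeral) transmission.
with_funeral = simulate_seihfr(100)
without_funeral = simulate_seihfr(100, beta_funeral=0.0)
print(with_funeral["cumulative_cases"], without_funeral["cumulative_cases"])
```

Setting `beta_funeral` to zero removes post-death transmission while leaving the other flows untouched, which is one simple way to see how a compartment choice changes the projected case count under otherwise identical assumptions.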
Another motivation for this research is to compare the outputs of models that do or do not account for underreporting. Underreporting is present during most outbreaks, but was thought to be a particularly significant issue during the EVD epidemic, for a number of reasons [Reference Gibbons6]. Firstly, EVD cases can be asymptomatic [Reference Heffernan7, Reference Leroy8]. Secondly, West Africa is one of the poorest regions of the world, and its health systems, let alone its surveillance capabilities, are severely limited – in a study from the Centers for Disease Control and Prevention (CDC), timeliness of reporting was found to be particularly lacking during the early phases of the EVD outbreak [Reference Meltzer9]. Lastly, a prevailing distrust of Western medicine, particularly in more rural regions, has been thought to deter cases from presenting to health facilities [Reference Manguvo and Mafuvadze10]. This review systematically compares models that have or have not accounted for underreporting.
Early epidemic models face many inherent difficulties, such as uncertainty regarding disease epidemiology, modes of transmission and unknown rates of underreporting. In this study, we build on the work of a recently published EVD modelling review [Reference Chretien, Riley and George11] – we systematically evaluate the different modelling methods used to study the current EVD outbreak in West Africa and their outputs on key disease parameters, focusing on the impact of using models with different compartmental structures and models that account for underreporting. Our objective is to provide directions for future modelling efforts in settings where early disease outbreak data may be limited.
METHODS
A systematic review was conducted in compliance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist (http://www.prisma-statement.org/) [Reference Moher, Liberati, Tetzlaff and Altman12].
Search strategy
Three online databases (Pubmed, Embase and Scopus) were searched for relevant literature published between January 2014 and December 2015. We focused only on Ebola modelling studies published within about 2 years of the start of the outbreak, as early publications provided insight and direction to global and national emerging disease response and control decisions. These databases index international journals in the multi-disciplinary fields of public health, biomedical and pharmaceutical research, clinical and experimental research, health policy and management, and scientific, technical and social science research. For each database, we conducted a search with the following search keys: ‘ebola’ AND ‘model*’. All searches included article title, abstract and keywords. The search was limited to studies in the English language. The detailed search strategy was slightly adjusted according to the specific database settings and is reported in Supplementary Document 1. In order to minimise the chance of missing references, we carried out a hand search of key Ebola modelling papers on the internet, and all the included studies were cross-checked against the papers identified from our initial scoping review. The key literature search was carried out on 1 June 2016 and the final search was carried out on 18 November 2016.
Inclusion and exclusion criteria
Studies were included if they aimed: (1) to project the trajectory of disease outbreak or (2) to provide early epidemiological parameters estimation of Ebola (including R 0, serial interval, latent period, infectious period and case fatality rate) using modelling methods. Only those studies using the current EVD outbreak data were included. Studies were excluded if the models presented focused only on evaluating intervention strategies without offering parameter estimates or trajectory projection – evaluating such models was determined to be outside the scope of this review. Studies pertaining to EVD outbreaks prior to 2014 were also excluded. Narrative studies, response policy studies, process model studies, phylogenetic and biological or genetic modelling studies were also excluded. We included all relevant modelling studies based on the above criteria and systematically registered key features and attributes associated with modelling techniques, design and major assumptions taken for epidemic estimation and projection.
Selection of studies
We conducted a two-step screening process. In the first screening, titles and abstracts were independently screened by two reviewers – any discrepancies were resolved by discussion and/or referring to the full-text. In the second screening, remaining studies were included or excluded based on contents within the full-text.
Once all studies were identified, we collected quantitative and qualitative information pertaining to the model presented in each study. We focused on synthesising study outcomes that: (1) accounted for data uncertainty and (2) used hospitalisation and/or funeral compartments. We also summarised and tabulated qualitative descriptions of the questions each model aimed to answer, the country of interest, the type of model presented, the source and date of the case data and the model fitting. We quantitatively analysed estimates of key epidemiological parameters, including the R 0, serial interval, latency period, infectious period and case fatality rate. We then synthesised the estimated R 0 by country, use of compartments and consideration of underreporting, and investigated the relationship between the R 0 estimates and other parameters. Summary statistics, boxplots, scatterplots and non-parametric significance tests (including Mood's median test, the Kruskal–Wallis test and the Spearman test) were used, where appropriate. All analyses were performed in R version 3.3·1 (64-bit) with the packages ‘plotrix’ and ‘zoo’.
RESULTS
Definitions of the modelling terms used in this study are listed in Table 1. Phenomenological models are recognised methods (such as summary statistics, regression models and predictive models) that offer computational solutions to determine values of key disease epidemiological characteristics. Mechanistic models describe the transmission of infectious diseases by categorising host populations into various stages of infection. The R 0 is an important parameter that allows epidemiologists and modellers to quantify how easily a disease can spread, and how effective interventions need to be in order to achieve disease control. R 0 is defined as the expected number of secondary cases generated by one infected individual over the course of their infection in a fully susceptible population [Reference Diekmann, Heesterbeek and Metz13] (i.e. before interventions are put in place or immunity develops).
Of the 874 studies identified through the online database search, 496 duplicates were removed. After preliminary title and abstract screening, 351 studies were excluded. Seventeen were excluded through a second screening of the full-text of articles, based on the exclusion criteria. Four additional studies were identified through hand-searching. Finally, 41 studies met the inclusion criteria and were included in the review (see Figure 1).
Of the 41 studies selected for our review, 16 were published in 2014 and 25 were published in 2015. Thirty-five studies aimed at EVD parameter estimation, 27 offered trajectory projection and 21 provided both parameter estimation and trajectory projection. There were 11 phenomenological modelling studies, 29 mechanistic modelling studies and one study that employed both modelling methods. Among these 41 studies, 14 accounted for data uncertainty or data bias issues. Five of the reviewed studies used homogeneous models and 25 incorporated a heterogeneous mixing assumption. Among the heterogeneous mixing studies, 17 considered a hospitalised compartment, 16 considered a funeral compartment and 12 incorporated both hospitalised and funeral compartments.
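Because several of these categories overlap, the counts can be reconciled with inclusion–exclusion. The short check below uses only the figures reported in this paragraph; the helper name is illustrative.

```python
def count_with_either(first, second, both):
    """Number of studies with at least one of two overlapping features."""
    return first + second - both


# Parameter estimation (35), trajectory projection (27), both (21).
estimation_or_projection = count_with_either(35, 27, 21)

# Hospitalised compartment (17), funeral compartment (16), both (12).
hospital_or_funeral = count_with_either(17, 16, 12)

# Heterogeneous mixing studies (25) that used neither compartment.
heterogeneous_with_neither = 25 - hospital_or_funeral

print(estimation_or_projection, hospital_or_funeral, heterogeneous_with_neither)
```

The first union recovers all 41 included studies, and the second implies that 21 of the 25 heterogeneous mixing studies used at least one of the two compartments.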
Overview of EVD modelling studies
Tables 2–4 summarise our descriptive results by research aim. Thirty-one of the 41 included studies specified that they used case data from the World Health Organization (WHO) [1]. The WHO source provided cumulative numbers of reported confirmed, probable and suspected cases and deaths. Other commonly used data include: National Ministry of Health data from the affected countries (such as [17]), CDC Morbidity and Mortality Weekly Reports [18] and/or synthesised data from public data repositories (such as Caitlin Rivers of Virginia Polytechnic Institute [19] and Virology Down Under blog [20]). Demographic and population data were taken from sources such as the Central Intelligence Agency (CIA) factbook [21], United Nations (UN) reports [22] and national censuses. In many studies, model parameters were derived from models of past Ebola outbreaks [Reference Chowell23, Reference Legrand24] and/or early published models of the current Ebola epidemic [Reference Meltzer9, 25]. For instance, Althaus’ study in 2014 [Reference Althaus26] adopted estimates of incubation and infectious periods from the 1995 EVD outbreak in Congo, and the team published a study for Nigeria in 2015 [Reference Althaus27] that incorporated the estimated incubation period from a 1976 EVD outbreak in Zaire.
When an epidemic model is developed with more detailed compartment structures, many state parameters (i.e. parameters defining the transition between infection stages) need to be specified and calibrated. For example, Barbarossa et al. [Reference Barbarossa28] created a model with seven compartments (including hospitalised and buried) and estimated state parameters based on the outcomes of a range of earlier EVD modelling studies [Reference Legrand24, Reference Gomes29–Reference Hsieh31]. Model parameters should be estimated with caution, as they are prone to biases and the intended prediction outcomes can be highly sensitive to small changes in some parameters. Twenty-eight of the reviewed studies carried out sensitivity analysis (so-called stress tests) to examine the robustness of modelling outcomes.
Synthesised results: estimates of epidemiological parameters
Thirty-five studies offered epidemiological parameter estimation and 29 of them estimated the R 0. We evaluated the estimated mean of R 0 by country, compartment consideration and accounting for underreporting, as shown in Figure 2, and reported the detailed values in Supplementary Document 2 Tables S1–S3. The median of the R 0 mean estimate for the ongoing epidemic (overall) is 1·78 (interquartile range: 1·44, 1·80), 1·30 (interquartile range: 1·24, 1·51) for Guinea, 1·84 (interquartile range: 1·69, 2·10) for Liberia, 1·70 (interquartile range: 1·34, 2·05) for Sierra Leone and 9·01 for Nigeria. A Kruskal–Wallis non-parametric test showed that the estimated R 0 values do not have identical distributions across the included countries (P value ⩽0·05) when the Nigerian estimate, which was much higher than those for the other countries, was included. An additional Kruskal–Wallis test that excluded the Nigerian estimate could not reject the null hypothesis of identical distributions across the remaining countries. When we performed additional pairwise Mood's median tests between each pair of countries (without Nigeria), we found that the median values of estimated R 0 are generally not significantly different (apart from the pair of Guinea and Liberia). The results of the additional Kruskal–Wallis test and the pairwise tests between each pair of countries are reported in Supplementary Document 1 Figure S3. The trend line in the bottom-right panel of Figure 2 indicates the potential temporal pattern of the reported R 0 estimates: there is a slight increasing trend of R 0 as studies used more recent data.
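For readers unfamiliar with the Kruskal–Wallis test used here, the following sketch computes its H statistic (with the usual tie correction) from pooled ranks. It is a plain-Python illustration rather than the R routine used for the analysis, and the three groups are hypothetical values, not the extracted R 0 estimates.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = mean_rank
        pos = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic for a list of groups of numbers."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    ties = sum(c ** 3 - c for c in counts.values())
    return h / (1 - ties / (n_total ** 3 - n_total))


# Hypothetical groups; with 3 groups, H is compared with a chi-square
# distribution on 2 degrees of freedom (5% critical value: 5.991).
h_stat = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(h_stat, h_stat > 5.991)
```

A single extreme group can drive H above the critical value, which is consistent with the result changing once the Nigerian estimate is excluded.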
The median R 0 value is 1·90 (interquartile range: 1·90, 2·10) for models accounting for underreporting and 1·71 (interquartile range: 1·40, 1·96) for those not accounting for underreporting. The median of the estimated mean R 0 values is 1·49, 1·80, 1·78 and 1·73 for models that considered a hospital compartment, a funeral compartment, both hospital and funeral compartments, and neither, respectively. The Kruskal–Wallis and Mood's median tests could not reject the null hypotheses – i.e. the estimated mean of R 0 does not differ significantly with the model's consideration of compartments (with hospital and/or funeral) or with accounting for underreporting. These results are unchanged when the Nigerian study data are excluded. The results are reported in Supplementary Document 1 Figure S3.
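Mood's median test, used for the comparisons above, can be sketched as a chi-square test on counts above and not above the pooled median. The version below omits any continuity correction and assumes the pooled data are not all identical; the input values are hypothetical and serve only to show the mechanics.

```python
from statistics import median


def moods_median_chi2(groups):
    """Chi-square statistic of Mood's median test (no continuity correction).

    Assumes at least one pooled value lies above the grand median.
    """
    pooled = [v for g in groups for v in g]
    grand_median = median(pooled)
    above = [sum(v > grand_median for v in g) for g in groups]
    not_above = [len(g) - a for g, a in zip(groups, above)]
    n_total = len(pooled)
    total_above, total_not_above = sum(above), sum(not_above)
    chi2 = 0.0
    for g, a, b in zip(groups, above, not_above):
        expected_above = len(g) * total_above / n_total
        expected_not_above = len(g) * total_not_above / n_total
        chi2 += (a - expected_above) ** 2 / expected_above
        chi2 += (b - expected_not_above) ** 2 / expected_not_above
    return chi2


# Two hypothetical groups; with 2 groups the statistic is compared with a
# chi-square distribution on 1 degree of freedom (5% critical value: 3.841).
separated = moods_median_chi2([[1, 2, 3, 4], [5, 6, 7, 8]])
overlapping = moods_median_chi2([[1, 2, 3, 4], [1, 2, 3, 4]])
print(separated, overlapping)
```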
We also synthesised the values of other key modelling parameters, including serial interval, latency period, infectious period and case fatality rate, as shown in Figure 3 (and Supplementary Document 1 Figure S4). The median of the mean serial interval is 14·35 days (interquartile range: 12·28, 16·35), latency period is 9·70 days (interquartile range: 8·80, 10·38), the infectious period is 7 days (interquartile range: 4·00, 10·00) and case fatality rate is 0·68 (interquartile range: 0·48, 0·71) (the distributions by country are shown in Figure 3).
We also studied the relationship between the estimated R 0 and the epidemiological parameters used or estimated. Different studies may estimate or fine-tune these parameters, or use previously published values, when producing R 0 estimates. Figure 4 shows the relationship between the estimated R 0 and different serial intervals, latency periods, infectious periods and fatality rates (the same figure without the Nigerian data is reported in Supplementary Document 1 Figure S4). Based on the results, we do not observe any obvious trend in R 0 estimates across studies using or estimating various mean values of serial interval, incubation period and infectious period. We carried out correlation tests between these values (for pairwise complete observations) and the Spearman test results were not significant (on an overall and a per-country basis). The synthesised values of reported epidemiological parameters in terms of R 0, serial interval, latency period, infectious period and case fatality can be found in Supplementary Document 2 Table S2.
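The correlation step can be illustrated with a small Spearman rank correlation that first drops incomplete pairs, mirroring the "pairwise complete observations" approach. This is a plain-Python sketch, not the R call used in the analysis; missing values are represented by `None` as an assumption, and the inputs are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = mean_rank
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation over pairs where both values are present.

    Requires at least two complete pairs and non-constant ranks.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    rank_x = average_ranks([a for a, _ in pairs])
    rank_y = average_ranks([b for _, b in pairs])
    mean_x = sum(rank_x) / len(pairs)
    mean_y = sum(rank_y) / len(pairs)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    var_x = sum((a - mean_x) ** 2 for a in rank_x)
    var_y = sum((b - mean_y) ** 2 for b in rank_y)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical serial intervals (days) and R0 values; None marks a study
# that did not report the parameter.
serial_interval = [12.0, 15.3, None, 14.2, 16.0]
r0_estimate = [1.5, 1.8, 1.7, None, 2.0]
print(spearman_rho(serial_interval, r0_estimate))
```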
Synthesised results and figures stratified by modelling types, approaches, mixing assumptions and sensitivity analyses are provided in Supplementary Document 1 Figures S5–S12 and Supplementary Document 2. We do not observe a significant difference in the distributions of the R 0 mean estimate between included studies that did and did not carry out sensitivity analysis, as most sensitivity analyses are typically conducted in an orthogonal manner to estimate related parameters. Furthermore, R 0 is sometimes used as a response function in sensitivity analyses of other model parameters. Additionally, 27 studies offered epidemic trajectory projection and provided an estimated number of cases (without additional intervention). Figure 5 shows how the ratio of model prediction to WHO case observation (matched by forecast target date) relates to accounting for underreporting and to the compartments considered. The median values of the ratio between prediction and observation are significantly different (P value ⩽ 0·1) between models that do and do not account for underreporting [the median ratio is 4·35 (interquartile range: 1·52, 15·03) for those that account for underreporting and 1·19 (interquartile range: 0·98, 1·53) for those that do not]. However, we do not observe a significant relationship between pairs of different compartments used. We also paired up models that offered predictions with and without considering underreporting and carried out the same set of analyses. The matched-model analysis results are provided in Supplementary Document 1 Figure S12. The synthesised results of epidemic projection, with indication of accounting for data uncertainty and use of hospitalisation and/or funeral compartments, are also provided in Supplementary Document 2 Table S3.
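The prediction-to-observation ratio and its summary can be computed as in the sketch below. The (projected, observed) pairs are hypothetical placeholders, not values from the reviewed studies, and the quartiles use the "inclusive" method of Python's `statistics.quantiles`, which may differ slightly from the quantile definition used in the original analysis.

```python
from statistics import median, quantiles


def prediction_ratio(projected_cases, observed_cases):
    """Ratio of projected to observed cases at the forecast target date."""
    if observed_cases <= 0:
        raise ValueError("observed_cases must be positive")
    return projected_cases / observed_cases


def median_iqr(values):
    """Median and (first quartile, third quartile) of a list of numbers."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), (q1, q3)


# Hypothetical (projected, observed) case counts matched by target date.
pairs = [(1200, 1000), (5000, 1000), (900, 1000), (20000, 4000), (1500, 1000)]
ratios = [prediction_ratio(p, o) for p, o in pairs]
centre, (lower, upper) = median_iqr(ratios)
print(centre, lower, upper)
```

A ratio above 1 means the model projected more cases than were later observed, so the two groups of models can be compared directly on this scale.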
DISCUSSION
We systematically reviewed 2014–2015 Ebola modelling studies, which provided epidemiological insights for the current and future Ebola outbreaks. We evaluated the selected studies based on the sources of the case data used and on modelling approaches by modelling aim, and we further synthesised the reported R 0 results and the distributions of key epidemiological parameters based on compartment designs and consideration of underreporting. We found that the mean R 0 estimates offered by epidemic models for this EVD outbreak are country-specific, but are not associated with several key disease parameters, compartment designs or accounting for underreporting.
In this EVD outbreak, we noticed a significant relationship between the context (i.e. the country of interest) and the estimates of R 0 (particularly with regard to Nigeria, which had a much higher R 0 estimate). We observed differences by country in the values of the estimated/used serial interval, latency period, infectious period and case fatality rate, but the median differences are not statistically significant.
We generally did not observe any apparent systematic pattern in the distribution of estimated R 0 when specifying different compartments. This may be due to one fundamental issue that models are generally fitted based on observed epidemiology data from the same original sources and other associated model parameters within the model could be calibrated altogether to achieve model fitting. This also coincides with our finding – models that utilised different mean serial intervals, incubation periods and infectious periods within plausible ranges yielded similar estimates of R 0.
R 0 can be expressed as the product of the transmission probability per contact, the number of contacts per time unit and the duration of the infectious period; however, these quantities are usually difficult to parameterise directly from observing an outbreak [Reference Keeling and Rohani14]. Under simplified epidemic model assumptions, the lengths (and distributions) of the serial interval [Reference Wallinga and Lipsitch16] and infectious period [Reference Vynnycky and White65] are influential in R 0 estimation. Furthermore, including a latency period in a model results in a slower epidemic growth rate after pathogen invasion, because individuals need to pass through the exposed class before they can contribute to the transmission process [Reference Keeling and Rohani14].
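Both statements can be made concrete with a short sketch. The first function is the product form given above. The second uses a standard relation for a simplified SEIR model with exponentially distributed latent and infectious periods, R 0 = (1 + r·T_latent)(1 + r·T_infectious), where r is the early exponential growth rate; this exponential assumption, the function names and all numerical values are illustrative assumptions rather than estimates from the reviewed studies.

```python
def r0_from_contacts(transmission_prob, contacts_per_day, infectious_days):
    """R0 as transmission probability x contact rate x infectious duration."""
    return transmission_prob * contacts_per_day * infectious_days


def r0_from_growth_rate(growth_rate, latent_days, infectious_days):
    """R0 implied by an early growth rate (per day) in a simple SEIR model.

    Assumes exponentially distributed latent and infectious periods.
    """
    return (1 + growth_rate * latent_days) * (1 + growth_rate * infectious_days)


# Hypothetical values: 10% transmission per contact, 2 contacts/day, 7 days.
print(r0_from_contacts(0.1, 2, 7))

# Same observed growth rate, without and with a 10-day latent period.
print(r0_from_growth_rate(0.05, 0.0, 7.0))
print(r0_from_growth_rate(0.05, 10.0, 7.0))
```

For a fixed observed growth rate, adding a latent period raises the implied R 0; equivalently, for a fixed R 0, a model with a latent period grows more slowly, which is the effect described in the paragraph above.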
Although the serial interval, latency period and infectious period are computationally related to the estimate of R 0 (with differences arising from compartmental contact heterogeneity and epidemic model assumptions), we observed that the use of different mean serial intervals, incubation periods and infectious periods yields similar estimates of R 0. However, we compared the relationship only by the median durations of these epidemiological parameters – different distributions with the same central values may produce a different effect.
Due to changes in reporting systems and/or public awareness of disease over time, there may have been unknown observation biases and errors over time (i.e. there may be higher rates of disease reporting due to greater public awareness of the disease). The included studies that considered underreporting generally offered R 0 estimates similar to those that did not, but offered a larger case forecast compared with the actual case observation at the forecast target date. Without high-quality and reliable data, it is challenging to accurately estimate the impact of time-dependent factors and incorporate them into a model. We recommend that health authorities endeavour to share detailed epidemiological attributes of reported cases, including geographical locations of cases, information on contact networks and date of symptom onset. The availability of ongoing case data can help to recognise hidden changes in disease patterns over time. Health authorities could consider providing time-dependent correction factors that reflect the practical underreporting situation and the force of intervention strategies. Furthermore, incorporating surveillance outcomes from phylogenetic [Reference Kühnert, Wu and Drummond66] and serological [Reference Wu67] studies would potentially be useful for characterising the underlying transmission of an emerging disease and identifying undetected cases. These would help modellers to calibrate epidemiological models accurately to the actual outbreak situation, which can then feed back meaningfully into decision support during the outbreak.
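A time-dependent correction factor of the kind suggested above could be applied to reported counts as in the following sketch. The weekly counts and factors are hypothetical, and the simple multiplicative form is an assumption made for illustration; it is not a method taken from the reviewed studies.

```python
def correct_for_underreporting(reported_by_week, factor_by_week):
    """Scale each week's reported cases by that week's correction factor.

    A factor of 2.5 means an estimated 2.5 true cases per reported case.
    """
    if len(reported_by_week) != len(factor_by_week):
        raise ValueError("need exactly one correction factor per week")
    if any(factor < 1.0 for factor in factor_by_week):
        raise ValueError("underreporting factors should be at least 1")
    return [cases * factor for cases, factor in zip(reported_by_week, factor_by_week)]


# Hypothetical example: reporting completeness improves over three weeks,
# so the correction factor declines towards 1.
reported = [10, 20, 30]
factors = [2.5, 2.0, 1.5]
corrected = correct_for_underreporting(reported, factors)
print(corrected)
```

Letting the factor vary by week, rather than fixing a single value, is what allows improvements in surveillance or awareness to be reflected in the corrected series.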
Furthermore, we observed that many models included in this review inferred disease behaviour using disease parameters calculated from previous EVD outbreaks. We note that the parameter estimates for this Ebola outbreak are similar to those for previous EVD outbreaks given by Drake et al. [Reference Drake4]. It would be useful to develop a centralised reference data platform that allows epidemiological parameters to be shared. This would enable modellers to use and calibrate disease parameters based on comparable metrics. Together with other EVD modelling review outcomes [Reference Drake4, Reference Chretien, Riley and George11], the outcomes of this study lay the groundwork for such a reference.
During the later stage of the EVD epidemic, new evidence emerged that Ebola virus can persist in various body fluids during convalescence [Reference MacIntyre and Chughtai68, Reference Chughtai, Barnes and Macintyre69] and may result in transmission of infection [Reference Rodriguez70–Reference Christie75]. WHO has recently highlighted the potential for EVD flare-ups and disease re-introduction [2]. At the post-outbreak stage, accounting for the risk of transmission from ‘recovered’ EVD patients [Reference Abbate76, Reference Vinson77] should be a priority for future EVD modelling.
Accuracy, transparency and flexibility are the major considerations when formulating models for infectious diseases [Reference Keeling and Rohani14]. The EbolaResponse tool created by the CDC [Reference Meltzer9] allows users to project the number of Ebola cases in Liberia and Sierra Leone using a simple model implemented in a Microsoft Excel worksheet. The WHO Response Team [Reference Hsieh31] study was one of the first studies to provide a detailed epidemiological description of the EVD epidemic using primary data collected from hospitals and patients – the study also provided short-term projections of the epidemic for Guinea, Liberia and Sierra Leone. These early modelling studies are successful examples that demonstrate how timely and effective use of phenomenological and mechanistic modelling methods supports the understanding of emerging diseases and public health responses.
Modelling outcomes can be different depending on the epidemiological characteristics of the emerging diseases and the interplay among pathogen, host and environment. For this EVD outbreak, simpler models generally yielded similar estimates of R 0 regardless of the consideration of additional hospitalisation and/or funeral compartments and underreporting. However, context-specific models, which mimic the way a disease is transmitted among stages realistically and meaningfully, allow for justification for intervention effects under different transmission contexts. Under an epidemic emergency, a “simple enough but not simpler” model, which allows for understanding of key epidemic dynamics (i.e. offer transparency to public health policy makers), may play a critical role for advising rapid policy decisions and predicting outbreak progression at the early phase of an outbreak.
CONCLUSION
Newly emerging infectious diseases are the most challenging to manage due to the uncertainties in clinical impact and transmissibility. Epidemiological parameters are usually difficult to observe directly from an emerging infectious disease outbreak and would require modelling methods to accurately estimate at the early phase of a disease outbreak. Epidemic modelling is a quantitative approach to understanding disease transmission within a specific population and provides indications of future trends. Such models are well-recognised tools to aid policy formulation in the early phase of epidemics.
Despite varied complexity and methods, the estimates of R 0 from numerous studies were reasonably consistent in this EVD outbreak, regardless of the concurrent use of other associated epidemiological parameters. Different model design decisions did not appear to meaningfully affect the resulting R 0 estimates, but models that accounted for data uncertainty offered a larger case forecast compared with the actual case observation at the forecast target date. Simple early models remain an informative reference and provide a foundation for more complex transmission modelling to understand the progression of a disease outbreak.
SUPPLEMENTARY MATERIAL
The supplementary material for this article can be found at https://doi.org/10.1017/S0950268817000164.
ACKNOWLEDGEMENTS
This study was supported by National Health & Medical Research Council (NHMRC) – Project Grant (Australia) – APP1082524.
AUTHORS’ CONTRIBUTION STATEMENT
The study was conceived by Z.W. and R.M. Z.W., C.B., A.C. and R.M. contributed to the study design. Research data/information was retrieved and interpreted by Z.W. and C.B. Z.W. and C.B. led the writing of the paper and all authors revised and refined the arguments. All authors approved the article.
DECLARATION OF INTEREST
None.