INTRODUCTION
Syndromic surveillance (SyS) of pre-diagnostic cases based on signs and symptoms or health-related behaviour is a supplementary approach for timely detection of public health threats and for monitoring events with potential public health impact if information from other surveillance systems is not yet or not at all available [1]. SyS can provide a flexible and cost-effective way to gain timely information about the health impact of known and unknown, communicable and non-communicable, natural and man-made health threats [Reference Buehler2, Reference Ziemann3].
The European landscape of public health surveillance mainly consists of three parallel schemes. The first scheme comprises the specific communicable disease surveillance systems of European Union member states (MS) that provide information on confirmed cases following a common case definition to the European Surveillance System (TESSy) [4]. The second scheme comprises various reporting systems through which MS report communicable or non-communicable events to inform other MS and European institutions, e.g. the Early Warning and Response System (EWRS) [5]. The third scheme comprises unspecific information collated by European networks of different countries, e.g. Influenzanet for self-reported influenza symptoms [6], or EuroMOMO for mortality monitoring [7], and by the Medical Information System (MedISys) that automatically screens online news wires concerning health events [8]. SyS is accomplished in MS at the local, regional, and national levels [Reference Fouillet9]. A systematic approach towards European SyS could support timely, comparable, cross-border surveillance.
Routinely collected emergency-care data from (i) emergency medical dispatch (EMD) centres, (ii) ambulance or emergency medical services (EMS), and (iii) emergency departments (ED) can be a valuable source for SyS. Across Europe, emergency-care data is available following a common structure [Reference Krafft10]. The biggest advantage is the opportunity of real-time reporting of electronic emergency data that can offer timelier and more frequent information compared to established traditional surveillance systems, e.g. based on sentinel doctors [Reference Buehler2]. It provides data based on a form of clinical assessment, e.g. working diagnoses from emergency physicians (EP), which have a higher specificity for SyS compared to non-clinical data from, e.g. over-the-counter drug sales [Reference Das11].
We aimed at developing the first concept for SyS based on three routine emergency data sources that are applicable across Europe. We describe the development of the SyS concept and present results of a case study testing the SyS concept using the example of local gastrointestinal outbreak detection.
METHODS
SyS system concept
Inventory of emergency data availability in Europe
We asked regional (sub-national) emergency service representatives in 12 countries (Austria, Belgium, Czech Republic, Denmark, Finland, France, Germany, Hungary, Italy, Norway, Spain, Turkey) to assess availability and content of routine datasets collected in EMD, EMS, and ED. Using a semi-standardized survey we asked for the method of data collection, i.e. manual or electronic, the frequency of data availability, e.g. daily, and the available data fields in the routine datasets.
Syndrome definition
Based on the inventory, we defined syndromes of potential public health relevance that could be generated using routine emergency data. Based on a focus group discussion with emergency-care and public-health experts from across Europe and examples from the literature, we developed recommendations for generating syndromes based on the most common diagnostic coding systems used in EMD, EMS, and ED, i.e. Advanced Medical Priority Dispatch System (AMPDS), versions 11.3 and 12.0 (Priority Dispatch Inc., USA), International Classification of Disease (ICD) 9th and 10th revisions, chief complaints based on Canadian Triage and Acuity Scale (CTAS), and the Minimum Dataset for Emergency Physicians (MIND).
SyS system design
Based on a review of the literature and material published on existing SyS systems and a consultation with European emergency-care, public-health and information technology experts, we developed a design concept for an emergency data-based SyS system. We defined a minimum standard dataset as input for the SyS system that is applicable for EMD, EMS and ED, defined the data flow, selected statistical analytical methods for detecting unusual aberrations, and described ways of reporting the output.
Case study on local gastrointestinal outbreak detection
We tested our SyS concept for EMD, EMS and ED, and for different syndromes and purposes, based on retrospective analyses of historical data from regional emergency systems in four countries [Reference Rosenkötter12]. In this paper we present the results of a case study on local gastrointestinal outbreak detection.
Datasets
We analysed data from the EMD centre in the state of Tyrol, Austria (EMD-AT dataset), data from EMS staffed by EP in the state of Tyrol, Austria (EP-AT dataset), the county of Goeppingen, Germany (EP-DE dataset) and the country of Belgium (EP-BE dataset), and data from an ED in a university hospital in the city of Santander, Spain (ED-ES dataset). Table 1 describes the main characteristics of the datasets.
Abbreviations used in Table 1: EMD, emergency medical dispatch; EMS, emergency medical services; ED, emergency department; EP, emergency physician; AMPDS, Advanced Medical Priority Dispatch System; ICD, International Classification of Diseases; MIND, Minimum Dataset for Emergency Physicians; CTAS, Canadian Triage and Acuity Scale.
Gastrointestinal syndrome case definition
Table 2 details the definition of gastrointestinal syndrome cases for five common emergency-care coding systems as an example for a syndrome that can be generated based on routine emergency-care data. An emergency case which received any code included in Table 2 was included in the case study.
Abbreviations used in Table 2: AMPDS, Advanced Medical Priority Dispatch System; ICD, International Classification of Diseases; MIND, Minimum Dataset for Emergency Physicians; CTAS, Canadian Triage and Acuity Scale.
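In practice, a case definition of this kind amounts to a look-up from a record's diagnostic code to a syndrome. The following is a minimal sketch in Python under stated assumptions: the ICD-10 prefixes shown (A08, A09, R11) are illustrative stand-ins and are not the contents of Table 2, and the names `GI_CODES` and `is_gi_case` are hypothetical.

```python
# Hypothetical code sets for illustration only; NOT the contents of Table 2.
GI_CODES = {
    "ICD-10": ("A08", "A09", "R11"),  # matched by prefix, e.g. "A09.0"
}


def is_gi_case(coding_system: str, code: str) -> bool:
    """Return True if the code falls under a gastrointestinal prefix."""
    prefixes = GI_CODES.get(coding_system, ())
    # str.startswith accepts a tuple; an empty tuple never matches.
    return code.strip().upper().startswith(prefixes)


print(is_gi_case("ICD-10", "a09.0"))  # code with a listed prefix
print(is_gi_case("ICD-10", "J10"))    # code outside the illustrative set
```

Keeping one code set per coding system mirrors the idea that the same syndrome can be generated from AMPDS, ICD, CTAS or MIND data without changing the downstream analysis.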
Temporal aberration detection algorithms
As a first step, detection algorithms based on cumulative sums were applied to analyse aberrations in the time series of gastrointestinal syndrome cases: three algorithms based on short-term baselines (C1, C2, and C3) [Reference Hutwagner13], and two cumulative sum algorithms based on longer baselines, one for normally distributed (CUSUM-N) and one for Poisson-distributed data (CUSUM-P) [Reference Burkom, Lombardo and Buckeridge14]. If the case counts for a specific syndrome followed neither a normal nor a Poisson distribution, as was the case for gastrointestinal syndrome cases, we applied all algorithms in parallel. The CUSUM algorithms were enhanced with the fast initial response (FIR) technique, which ensures that large chart values do not inflate subsequent values and thereby prevents the production of excessive signals [Reference Lucas and Crosier15]. In the case study the algorithms were applied retrospectively. We analysed periods of 6 (EP-BE dataset) or 12 (EMD-AT, ED-ES datasets) months and produced a daily CUSUM value. For each analysis period, we calculated baseline means, to which the actual values were compared, from the 6 or 12 months preceding the analysis period (Table 1). For the CUSUM-P analysis, the accepted mean was set close to the actual mean and the threshold value h was taken from the look-up table of Lucas [Reference Lucas16]. The temporal aberration detection algorithms were applied using Microsoft Excel 2003 (Microsoft Corp., USA).
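To make the temporal step concrete, the following Python sketch shows one common formulation of the short-baseline statistics (C1, C2, C3) and of a one-sided CUSUM for normally distributed data with an FIR head start. It is not the Excel implementation used in the case study. The 7-day baselines, the conventional thresholds (3 for C1 and C2, 2 for C3), the values of `k` and `h`, the standard-deviation floor `min_sd` and the restart after an alarm are assumptions made for illustration.

```python
from statistics import mean, stdev


def ears_c(counts, t, lag, min_sd=0.5):
    """Standardized deviation of day t from a 7-day baseline.

    lag=1 gives C1 (baseline t-7..t-1); lag=3 gives C2 (baseline t-9..t-3).
    min_sd is an assumed floor that avoids division by zero on flat baselines.
    """
    start = t - lag - 6
    if start < 0:
        raise ValueError("not enough history for the baseline")
    baseline = counts[start : t - lag + 1]
    return (counts[t] - mean(baseline)) / max(stdev(baseline), min_sd)


def ears_c3(counts, t):
    """C3: sum of the C2 excesses above 1 over days t-2, t-1 and t."""
    return sum(max(0.0, ears_c(counts, i, lag=3) - 1.0) for i in (t - 2, t - 1, t))


def cusum_normal(counts, mu0, sigma0, k=0.5, h=4.0, fir=True):
    """One-sided CUSUM on standardized counts; returns one alarm flag per day.

    k (reference value) and h (threshold) are illustrative defaults.
    With fir=True the chart starts, and restarts after an alarm, at h/2.
    """
    head_start = h / 2 if fir else 0.0
    s = head_start
    alarms = []
    for x in counts:
        s = max(0.0, s + (x - mu0) / sigma0 - k)
        alarms.append(s > h)
        if s > h:
            s = head_start  # restart so one large value does not inflate later days
    return alarms


counts = [2, 3] * 6 + [12]           # made-up daily counts with a final spike
t = len(counts) - 1
print(ears_c(counts, t, lag=1) > 3)  # C1 and C2 conventionally signal above 3
print(ears_c3(counts, t) > 2)        # C3 conventionally signals above 2
print(cusum_normal([2, 2, 2, 10], mu0=2, sigma0=1))
```

In a routine system, `mu0` and `sigma0` would come from the 6- or 12-month baseline period described above, and a Poisson variant would replace the standardized increment with one derived from the accepted mean.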
Spatio-temporal cluster detection algorithm
In a second step, outbreak periods that were identified based on temporal aberration detection analysis (see definition of outbreaks in the next section) were analysed by a prospective spatio-temporal scan statistic [Reference Kulldorff, Lawson and Kleinman17]. The scan statistic process can be explained as a cylindrical scanning window that moves flexibly over the study area. The width of the cylinder base represents the geographical area and the height represents the time period which is scanned. The scan statistic evaluates for all possible cylinder locations and sizes if an observed cluster of cases is caused by chance. The scan statistic can be applied to different levels of spatial aggregation of cases. In the case of spatially aggregated datasets, the cases are concentrated on the centroids of an area. In our case study, a prospective spatio-temporal Bernoulli model-based scan statistic was applied to the exact addresses of the emergency sites in the EMD-AT dataset. A prospective spatio-temporal Poisson model was applied to the EP-AT, EP-BE, EP-DE, and ED-ES datasets, based on the centroids of each administrative area (Table 1) [Reference Kulldorff, Lawson and Kleinman17]. During the scanning process the rates of gastrointestinal cases divided by the total number of emergency cases within the scanning window were compared to the rates outside of the window. The baseline populations were generated using the total number of emergencies in the previous 12 months (EP-AT, EP-DE) and the previous 6 months (EP-BE, ED-ES). The likelihood that a cluster exists by chance was characterized by a P value based on 999 Monte-Carlo simulations [Reference Kulldorff, Lawson and Kleinman17].
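The comparison of rates inside and outside a scanning window can be illustrated with the log-likelihood ratio commonly used for the Poisson scan statistic. The Python sketch below evaluates a single window only and is not SaTScan's implementation; the function names and the example numbers are hypothetical. The expected count is assumed to be proportional to the window's share of all emergency cases, as in the baseline populations described above.

```python
from math import log


def expected_cases(cases_total, emergencies_in, emergencies_total):
    """Expected syndrome cases in a window if risk were equal everywhere."""
    return cases_total * emergencies_in / emergencies_total


def poisson_llr(cases_in, expected_in, cases_total):
    """Log-likelihood ratio for one window, scanning for high rates only."""
    c, e, total = cases_in, expected_in, cases_total
    if c <= e:
        return 0.0  # no excess inside the window
    inside = c * log(c / e)
    outside = (total - c) * log((total - c) / (total - e)) if total > c else 0.0
    return inside + outside


# Made-up numbers: 100 syndrome cases overall, 12 of them in a window that
# holds 2% of all emergency cases.
e = expected_cases(cases_total=100, emergencies_in=200, emergencies_total=10_000)
print(poisson_llr(cases_in=12, expected_in=e, cases_total=100))
```

The scan statistic takes the maximum of this ratio over all cylinders and judges it against the maxima obtained from randomly generated replicate datasets, which is where the Monte-Carlo simulations enter.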
For each syndrome, different parameters have to be defined for detecting relevant clusters. For local gastrointestinal outbreak detection, only clusters with the parameters of 1 day temporal length, enclosing a circular area of up to 1 km radius, and with a significance level of P < 0·001 that the cluster exists by chance were defined as relevant. Pre-tests with different parameters showed that for longer and larger cluster sizes the number of cases that formed a cluster was too low and/or the cases were scattered over too large an area to reflect a true positive outbreak. The analyses were performed using SaTScan™ (v. 9.1.1., M. Kulldorff and Information Management Services Inc., USA). The identified spatio-temporal clusters were visualized using ESRI ArcGIS® v. 10.1. (Environmental Systems Research Institute Inc., USA).
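The relevance criteria can be expressed as a simple filter over detected clusters. The sketch below assumes a hypothetical `Cluster` record; it does not reflect the output format of SaTScan, and the example values are made up.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical summary of one detected cluster."""
    duration_days: int
    radius_km: float
    p_value: float


def is_relevant(cluster, max_days=1, max_radius_km=1.0, alpha=0.001):
    """Apply the criteria for local gastrointestinal outbreak detection."""
    return (
        cluster.duration_days <= max_days
        and cluster.radius_km <= max_radius_km
        and cluster.p_value < alpha
    )


clusters = [Cluster(1, 0.5, 0.0005), Cluster(3, 0.5, 0.0005), Cluster(1, 5.0, 0.0005)]
print([is_relevant(c) for c in clusters])
```

Exposing the limits as parameters makes it straightforward to define different values for each syndrome.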
Definition of an outbreak
We followed a decision tree as suggested by Meyer et al. [Reference Meyer18] and Ansaldi et al. [Reference Ansaldi19] to define inclusion criteria for outbreaks based on the signals given by, first, the temporal and, second, the spatio-temporal detection algorithm. For the case of local gastrointestinal outbreak detection, these were (i) at least 2 days of consecutive temporal aberration detection signals, or (ii) days with an exceptionally high aberration in case numbers from the mean [>3 standard deviations (s.d.) from the baseline mean of the previous 6 or 12 months], and (iii) outbreaks identified by the temporal aberration analyses with corresponding spatio-temporal clusters.
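One possible reading of these criteria is that a day qualifies as an outbreak candidate if it meets (i) or (ii), and that a candidate is retained only if it also meets (iii). The Python sketch below encodes that reading; the grouping of the criteria, the function names and the example data are assumptions for illustration.

```python
def candidate_days(signals, counts, baseline_mean, baseline_sd):
    """Indices of days meeting criterion (i) or (ii)."""
    days = []
    for t, flagged in enumerate(signals):
        prev_flag = t > 0 and signals[t - 1]
        next_flag = t + 1 < len(signals) and signals[t + 1]
        consecutive = flagged and (prev_flag or next_flag)       # criterion (i)
        extreme = counts[t] > baseline_mean + 3 * baseline_sd    # criterion (ii)
        if consecutive or extreme:
            days.append(t)
    return days


def confirmed_days(candidates, cluster_days):
    """Criterion (iii): keep candidates with a matching spatio-temporal cluster."""
    clustered = set(cluster_days)
    return [t for t in candidates if t in clustered]


signals = [False, True, True, False, False, True, False]  # made-up temporal signals
counts = [1, 4, 5, 1, 1, 2, 20]                           # made-up daily counts
candidates = candidate_days(signals, counts, baseline_mean=2, baseline_sd=1)
print(candidates)
print(confirmed_days(candidates, cluster_days=[2, 5]))
```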
Validation of outbreaks
The comparison with reference data from other (traditional) surveillance systems can give additional assurance that a signal could represent a real event. For the case study on local gastrointestinal outbreaks, we compared the detected outbreaks with notifiable surveillance reports of foodborne diseases. This reference data was available for Tyrol (Austria), Belgium, and Goeppingen (Germany) [20] (Table 1).
RESULTS
SyS system concept
Availability of emergency data
Routine electronic data was available daily in 11 of 12 regions from EMD, EMS and/or ED (Table 3). Information on the patients' chief complaints was available daily and electronically in ten systems, information on age and sex in nine systems (Table 3). Although the datasets comprised common data fields across Europe such as date, age, sex, and diagnostic information, the items were defined differently. In particular, diagnostic information varied. Sometimes international coding systems were used, and sometimes data was collected following regional or national coding systems (Table 1).
Abbreviations used in Table 3: EMD, emergency medical dispatch; EMS, emergency medical services; ED, emergency department; EP, emergency physician; n.a., information not available; –, data not available.
SyS system design
We defined a standard dataset for SyS that can be generated based on routine data collected in the majority of EMD, EMS, or ED across Europe: (1) date, (2) syndrome, (3) geographical reference, (4) modifier I: age, (5) modifier II: sex, (6) modifier III: severity.
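As an illustration, the six fields of the standard dataset could be represented as a record type. The Python sketch below is one possible representation; the field names, the types and the choice to make the three modifiers optional are assumptions and are not prescribed by the concept.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class SySRecord:
    """One emergency case in the standard SyS dataset (hypothetical field names)."""
    day: date                       # (1) date
    syndrome: str                   # (2) syndrome
    geo_ref: str                    # (3) geographical reference
    age: Optional[int] = None       # (4) modifier I
    sex: Optional[str] = None       # (5) modifier II
    severity: Optional[str] = None  # (6) modifier III


record = SySRecord(day=date(2010, 8, 3), syndrome="gastrointestinal", geo_ref="AREA-1")
print(record)
```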
Figure 1 shows the generic functions and data flow of the automated SyS system. The system can be implemented by emergency institutions using the institution's already established health information technology infrastructure. The emergency institution should programme a permanent, daily translation between the emergency database and the surveillance system following the standard SyS dataset, e.g. an extract, transform, load (ETL) process. Afterwards, the syndromic data should automatically be analysed by applying temporal and spatio-temporal aberration detection algorithms in parallel. The proposed algorithms can be operationalized using open source software such as R [21] and SaTScan, or can be programmed directly in other data analysis software that is already in use. The parameters of the algorithms have to be calculated once for each monitored syndrome and each emergency dataset, based on historical emergency data. During regular operation of the SyS system, these parameters should be updated regularly and after changes in the data collection procedure. The outputs of the SyS analyses are statistical signals that can be displayed in tables, charts and maps, which can be disseminated within the emergency-care institution and to the local/regional public health authority. Reporting can be accomplished by establishing a regular automatic email message, by incorporating the results in already established reports, or by allowing stakeholders to access a virtual dashboard online that is automatically updated on a regular basis. The public health authority and/or emergency institution decide if the signals could represent a real event following a pre-defined decision tree for each syndrome. The public health authority can incorporate SyS alerts into existing surveillance systems and response procedures. The emergency institution can use the information for resource planning.
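One step in this data flow is the transformation of individual emergency cases into the daily time series that the temporal algorithms analyse. A minimal Python sketch follows; the function name `daily_counts` and the example dates are hypothetical. Days without any case are filled with zeros, which matters for syndromes with low case numbers.

```python
from collections import Counter
from datetime import date, timedelta


def daily_counts(case_days, start, end):
    """Count cases per calendar day from start to end, inclusive."""
    tally = Counter(case_days)
    n_days = (end - start).days + 1
    return [tally.get(start + timedelta(days=i), 0) for i in range(n_days)]


# Made-up case dates for one syndrome.
case_days = [date(2010, 8, 1), date(2010, 8, 3), date(2010, 8, 3)]
print(daily_counts(case_days, date(2010, 8, 1), date(2010, 8, 4)))
```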
At the time of writing this paper, two institutions have implemented an automatic SyS system following this concept, the EMD centre of the State of Tyrol, Austria, and the ED of the University Hospital in Santander, Spain.
Case study on local gastrointestinal outbreak detection
The case study showed that the case numbers in the datasets based on data from EMS staffed with EP in the Austrian (EP-AT dataset) and the German (EP-DE dataset) regions, with an average of 0·14 and 0·31 cases per day, respectively, were too low for providing valid results based on the temporal aberration detection analysis. Figure 2 shows the time series of the number of gastrointestinal syndrome cases and the signals of the temporal aberration detection analyses for the EMD-AT, EP-BE and ED-ES datasets.
The temporal aberration detection analyses resulted in many signals. When applying the decision tree to identify outbreaks, there were many events with high aberration from the mean and with signals on at least 2 consecutive days. When applying the spatio-temporal analysis during these outbreak periods, we were able to further narrow down the number of relevant outbreaks. Figure 3 provides an overview of the number of signals and the application of the decision tree for each dataset. One outbreak was located in Tyrol, Austria (EP-AT) (14 February 2007, 12 cases within a circle of 0 km radius, P < 0·0001), and one in Santander, Spain (3 August 2010, seven cases within a circle of 0·68 km radius or distribution across postal code areas of 2·2 km², P < 0·0001). Figure 4 exemplifies the cluster in Santander, Spain.
The comparison with notifiable disease reporting data confirmed the alert on 14 February 2007 as a norovirus outbreak in a group of foreign students who stayed in one hotel in the city of Kufstein (n = 26 cases). The alert was not confirmed by the EMD-AT dataset which refers to the same region. Two subsequent norovirus outbreaks in the following days in two foreign tourist groups in the same hotel (n = 10 and n = 53 cases) were not identified in the syndromic datasets. No other notified foodborne outbreak in Tyrol, Austria (n = 42), and Belgium (n = 105) could ultimately be linked to signals in the syndromic datasets. The reference data from Goeppingen, Germany did not provide the number of outbreaks.
DISCUSSION
SyS system concept
We developed the first concept for a SyS system based on routinely collected emergency medical care data from EMD, EMS and ED for different countries in Europe.
Routine emergency data was available in many regions in Europe in electronic form and on a daily basis. It provided relevant information for SyS, such as date and geographical information and the patients' chief complaints. We defined recommendations for syndrome coding, based on the most common coding systems in emergency care, and designed a concept for an emergency data-based SyS system able to be implemented at the local/regional level in Europe. Two regional emergency institutions in Austria and Spain have initially implemented an automatic SyS system following our concept.
As the emergency data inventory revealed differences in data coding and availability across Europe, we conceptualized the system to be implemented at single emergency institutions or in one jurisdiction. This allows for raw data to be analysed in the emergency institution, respecting data privacy. This flexibility of the concept supports a relatively rapid set-up of a SyS system as no agreements or technical connections outside of the emergency institution have to be established. The syndrome definitions based on the most common emergency-care coding systems ease the implementation and support the portability of the SyS concept across Europe. Next to the gastrointestinal syndrome, the expert consortium defined syndromes for respiratory and influenza-like illness, for heat-related illness and unspecific syndrome ( = volume of medical cases without specification) [Reference Rosenkötter12, Reference Garcia-Castrillo Riesgo22]. The results of case studies analysing these syndromes are discussed elsewhere [Reference Schrell23, Reference Rosenkötter24].
Case study on local gastrointestinal outbreak detection
Our SyS concept was tested for the detection of local outbreaks of gastrointestinal illness in four regions in Europe. In this case study, we identified two potentially relevant outbreaks. The outbreak identified in Spain could not be confirmed due to missing reference data. The alert in Austria was confirmed as a norovirus outbreak in a group of foreign students. No other notified outbreak was identified by the SyS analyses. This low validity shows that our SyS concept cannot replace traditional surveillance of gastrointestinal diseases.
Gastrointestinal diseases are often the focus of SyS applications [Reference Buckeridge25], pursuing three major purposes: (i) early information on the onset of expected seasonal outbreaks such as winter vomiting disease [Reference Loveridge26], (ii) situational awareness during potentially health-threatening events such as disasters or mass gatherings [Reference Meyer18], and (iii) detection of local gastrointestinal illness clusters [Reference Edge27]. Earlier studies suggested that comparatively large outbreaks at the local or regional levels were successfully detected by SyS systems [Reference Moore, Edgar and McGuinness28]. Rather small outbreaks, however, appear to be difficult to detect as Xing et al. [Reference Xing, Burkom and Tokars29], Balter et al. [Reference Balter30] and Heffernan et al. [Reference Heffernan31] found based on ED data. Moreover, in our study most notified outbreaks in the study regions, which mainly consisted of few cases, were not detected by our SyS analyses. Emergency-care data, similar to other health services-based data sources for surveillance, are unlikely to reflect outbreaks with few or dispersed cases such as foodborne outbreaks comprised of visitors to a restaurant who later develop symptoms when they are in different areas [Reference Balter30].
Another explanation for the low validity is the fact that emergency-care data sources are not anticipated to catch all gastrointestinal outbreaks as most gastrointestinal illness patients with mild symptoms would self-treat their symptoms or utilize primary-care services. This assumption would suggest additional analysis of other data sources for SyS with a bigger coverage of mild gastrointestinal illness cases. Andersson and colleagues [Reference Andersson32] compared three syndromic data sources able to cover people affected by gastrointestinal illness who were not seeking care in Sweden: telephone helplines, web queries and over-the-counter drug sales. This study also confirmed the finding that only larger outbreaks were detected by SyS. From nine point-source outbreaks only the four largest were detected with case numbers between 369 and 27 000. Five smaller outbreaks with case numbers between 100 and 185 were not detected. We could not test our concept on large outbreaks as no outbreaks with more than 53 cases occurred during the study period. The reference data in Belgium and Germany did not provide the number of cases per outbreak.
Emergency care especially comes into contact with gastrointestinal illness in case of severe illness, e.g. during the Shiga toxin-producing Escherichia coli outbreak in Germany in 2011 during which ED reported on bloody diarrhoea cases [Reference Wadl33]. Further, emergency services are approached by gastrointestinal patients during crisis situations such as the 2003 blackout in the USA [Reference Marx34]. In ED in the USA, seasonal increases of gastrointestinal cases are seen during winter suggesting that gastrointestinal patients visit emergency services not only for severe illness but most likely because other health facilities are not accessible, e.g. during Christmas holidays [Reference Balter30]. In addition, emergency services cover patients with special characteristics, e.g. as in our case of Austria foreign tourists that might have decided to use emergency care as the easiest point of access to care. Hence, compared to other SyS data sources, emergency-care data-based SyS can have an added value for gastrointestinal surveillance if patients with severe symptoms or in special circumstances are using emergency care instead of other health services.
In the case study, we received many temporal signals for aberrations consisting of small case numbers which could not be confirmed by data from notifiable disease surveillance, which was also the case in other studies [Reference Steiner-Sichel35, Reference Yih36]. This could be due to the choice or calibration of the statistical methods applied for temporal aberration detection analysis [Reference Hadler, Siniscalchi and Dembek37]. The application of other detection algorithms such as regression analysis or moving averages could yield more valid results. However, we saw the greatest potential to increase validity by additionally applying spatio-temporal detection algorithms which are expected to add information to solely temporal analyses of local gastrointestinal outbreaks as many cases tend to cluster in relatively small areas [Reference Horst and Coco38].
Other studies applying spatio-temporal scan statistics detected rather large or severe outbreaks [Reference Yih36, Reference Greene39]. In order to enhance the validity of detecting small clusters, adjustment of the analysis parameters was suggested [Reference Horst and Coco38]. Our case study showed promising results for identifying smaller outbreaks and reducing the number of potential false alerts when applying relatively restrictive parameters to the analysis. This limited our analysis to only detect point-source outbreaks although it increased the probability of receiving alerts for true positive outbreaks. We also tested less restrictive parameters to scan for clusters up to 1 week and up to 5 km radius but found only insignificant results.
The aggregation of cases to a larger geographical area lowers the validity of the identified clusters [Reference Chen40]. In our case study, the Spanish study area contained both urban areas and rural areas with very large postal code areas. If a detected cluster included such a large postal code area, the risk of it being a false alert would be much higher than for a cluster comprising only small urban postal code areas. Another limitation of the applied scan statistic is the fixed circular form of the scanning window, which cannot identify clusters of another shape. Flexible shapes have been tested but are not commonly used [Reference Kulldorff, Lawson and Kleinman17]. Due to high computing time we applied the prospective spatio-temporal analysis only to shorter outbreak periods previously defined by the temporal analysis for the whole study area. This might have led to missing outbreaks that cluster in space and time but are not visible in the purely temporal analysis. This problem would be diminished if the analyses ran automatically.
We are the first to have used run-sheet data from EMS staffed with EP for SyS. Although in two areas the case numbers were too low to perform a valid temporal aberration detection analysis, the data source appears to be promising for SyS. The true positive norovirus outbreak in Tyrol, Austria, was only captured by the data from the EP run sheets and not by the EMD data covering the same area. This indicates a higher specificity of EP-staffed EMS compared to EMD data. It also indicates that SyS based on data sources with such low case numbers tends to detect point-source outbreaks with a high number of cases rather than continuous or propagated source outbreaks with low case numbers or cases dispersed over space and time. We encourage further research using ambulance data for SyS to confirm our findings.
The case study was performed retrospectively and was not based on results from active automated SyS systems. The performance of the two currently implemented automated systems needs to be evaluated prospectively in the future to further confirm the usefulness of our concept.
CONCLUSIONS
We have provided a practical concept for implementing SyS in Europe based on routine emergency-care data from EMD, EMS and ED that can be used as a supplementary and timely source of surveillance information at the local/regional level. Emergency-care data-based SyS can supplement local surveillance with near real-time information on gastrointestinal patients, especially in special circumstances or with special treatment-seeking behaviour, e.g. foreign tourists. It should be able to detect large outbreaks and outbreaks comprised of patients with severe symptoms. It is unlikely to detect the majority of local gastrointestinal outbreaks with few, mild or dispersed cases. We recommend using a combination of temporal and spatial outbreak detection algorithms in parallel and applying a decision tree for initiating public health action based on statistical signals, in order to increase the validity of SyS.
ACKNOWLEDGEMENTS
This paper describes results from the SIDARTHa project, which is an initiative of the European Emergency Data (EED) Research Network. We thank all SIDARTHa Consortium partners for their valuable contributions to conceptualizing and testing the SIDARTHa concept. Furthermore, we thank the Belgian Scientific Institute of Public Health and the Tyrolean Government for providing reference data for the case study of gastrointestinal outbreak detection and the Belgian Ministry of Health for providing national ambulance service data for Belgium. We also thank the anonymous reviewer for providing valuable comments on earlier versions of this paper.
This paper arose from the SIDARTHa project which has received funding from the European Union, within the framework of the Public Health Programme (grant agreement no. 2007208).
DECLARATION OF INTEREST
None.