INTRODUCTION
The spatial and behavioural associations between people and their pets, together with their shared risks and susceptibilities, have motivated recommendations to use companion animals as sentinels of environmental hazards for people [Reference Cleaveland, Laurenson and Taylor1–Reference Rabinowitz3]. However, companion animal surveillance has not been fully exploited as a method for identifying changing human environmental risk. There is no mandate for companion animal surveillance in Canada and only a limited number of companion animal diseases are federally or provincially reportable.
Laboratory surveillance may have limited value. An aetiological diagnosis based on laboratory confirmation is often not pursued in companion animal practice, as veterinarians frequently rely on empirical management of their patients [Reference Anholt4]. When a diagnosis is sought, samples from clinical cases are often tested in private laboratories from which there is currently no formal means of sharing data. New approaches should be explored. Syndromic surveillance has shown promise for early detection of changing disease patterns in human populations. The last decade has seen a focus on veterinary syndromic surveillance as a means of identifying signals associated with changing population health in a variety of available data [Reference Dórea, Sanchez and Revie5].
Syndromic surveillance uses existing data that are easily and electronically available and may include behaviour (e.g. school attendance) or the signs and symptoms of disease (e.g. emergency room attendance with complaints) in a population. It aims to detect changing patterns in the data over time and/or location [Reference Henning6–Reference Babin, Lombardo and Buckeridge8]. The utility of syndromic surveillance for early disease detection, for research initiatives, and for supporting planning and policy development has resulted in its continued growth in veterinary medicine [Reference Dórea, Sanchez and Revie5]. The adoption of electronic medical records by veterinary practitioners provides an opportunity to employ informatics for data collection, management and analysis [Reference Savel and Foldy9] to enable syndromic surveillance of pet populations.
Many methods are available for detecting unusual disease patterns in syndromic data [Reference Mandl7, Reference Burkom, Lombardo and Buckeridge10]. Spatial-temporal methods have been used to identify aberrations in disease frequency from an expected baseline [Reference Carpenter11]. Kulldorff et al. [Reference Kulldorff12] developed a space–time permutation scan statistic that uses case counts from one data stream to identify disease clusters without requiring information about the underlying population at risk. Without denominator data, this method relies on historical data from normal time periods to serve as the control. Therefore the space–time permutation scan statistic is well suited to surveillance contexts where the catchment area is undefined, the population at risk is unknown, and the volume of data is large [Reference Kulldorff12, Reference Robertson13]. These are characteristics of veterinary practice surveillance. Kulldorff's space–time permutation scan statistic has been applied in human [Reference Heffernan14–Reference Greene16] and veterinary [Reference Recuenco17, Reference Van den Wijngaard18] surveillance systems. Maciejewski et al. [Reference Maciejewski19] used the electronic medical record and syndromic definitions to detect a change in time and spatial patterns of eye inflammation in cats and gastrointestinal syndrome in dogs following a release of propyl mercaptan from a waste-processing facility in Fairburn, Georgia, USA.
The objective of space–time scan analysis of surveillance data is to detect unusual aggregations (clusters) of disease occurrences and to identify the location, size and duration of the aggregation. Suspected clusters must be further investigated to determine whether they represent actionable signals originating from true changes in patterns of disease in the population. In veterinary medicine this may be accomplished using protocols that are similar to the frameworks described for outbreak investigations [Reference Pavlin20, Reference Hurt-Mullen and Coberly21]. In this approach, additional data are sought for the post-hoc descriptive epidemiological characterization of detected clusters. Characterization often includes a description of the location and dates of the clusters, a description of the animals involved (species, age, sex), their clinical histories, clinical findings and aetiological diagnoses [Reference Pavlin20, Reference Hurt-Mullen and Coberly21]. The distribution of cluster characteristics is compared to an expected distribution for the population at risk over the same time and space [Reference Hertz-Picciotto, Rothman, Greenland and Lash22]. If the cluster is determined to be unexpected or unusual, then a hypothesis can be formulated as to its cause. However, once a signal has been detected in the data, there must be enough information available with which to make a decision with regard to a response [Reference Wagner23].
In prospective surveillance with the objective of early warning, the final step is to determine if the cluster is of animal or public health importance and to communicate this information to those who need to know so that an intervention can be undertaken [Reference Pavlin20, Reference Hurt-Mullen and Coberly21]. The objectives of retrospective space–time analysis may include an epidemiological investigation of a health event, an enhanced understanding of the natural history of the disease, or the facilitation of planning for disease eradication or management [Reference Teutsch, Teutsch and Churchill24].
In this paper we present a retrospective analysis of companion animal enteric syndrome data collected from electronic medical records extracted from participating veterinary practices. This research had the following objectives:
(1) Use time-series analysis to visualize and characterize the trends and patterns of enteric syndrome in companion animals seen in private veterinary practices.
(2) Determine if it was possible to identify statistically significant spatial-temporal clusters of enteric syndrome in the study area.
(3) Determine if there was sufficient information to infer the potential cause(s) of the cluster.
(4) Determine if sufficient information could be accessed in electronic medical records to determine the biological and epidemiological significance of the clusters to human or animal health.
METHODS
Figure 1 provides an overview of the study's methods.
Study population and enteric syndrome data
Electronic medical records were extracted from 12 participating veterinary practices in the city of Calgary, Alberta and the surrounding communities of Cochrane, Airdrie, Chestermere, Strathmore and Okotoks (Fig. 2). The participating practices represented a convenience sample of companion animal practices. Participating practices had completely computerized medical records and used the same veterinary management software for which a customized data extraction program was written to accurately extract data. Records from each practice were aggregated into a single file which contained 428 783 companion animal records for the period 1 January 2007 to 31 December 2010. The file was stored in a secure data warehouse at the University of Calgary [Reference Anholt25]. We used text-mining technology and an enteric syndromic case definition (all animals presenting to the veterinarian with complaints or clinical signs consistent with diarrhoea) to identify and retrieve enteric syndrome positive records from the warehoused medical records [Reference Anholt26]. Enteric syndromes resulting from infectious or parasitic aetiologies most often have an acute presentation which is defined as being ⩽14 days' duration [Reference Triolo, Lappin and Tams27]. Records from individual animals within 14 days of the initial visit were combined to represent one enteric syndrome case. There were 15 928 enteric syndrome cases. Data were stored and managed in Microsoft Office Excel 2007 (Microsoft, USA) and Konstanz Information Miner 2.2.2 (Knime, http://www.knime.org).
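The 14-day rule for combining records into cases can be illustrated with a minimal sketch. It assumes each record is reduced to a hypothetical (animal identifier, visit date) pair and that a visit exactly 14 days after the initial visit still belongs to the same case; neither the field names nor the boundary handling are taken from the study's extraction program.

```python
from datetime import date, timedelta

def collapse_to_cases(records, window_days=14):
    """Collapse (animal_id, visit_date) records into enteric syndrome cases.

    A record dated within `window_days` of the initial visit of the animal's
    current case is merged into that case; a later record opens a new case.
    Returns a list of (animal_id, first_visit_date) tuples.
    """
    cases = []
    first_visit = {}  # animal_id -> initial visit date of its current case
    for animal_id, visit in sorted(records):
        start = first_visit.get(animal_id)
        if start is None or (visit - start) > timedelta(days=window_days):
            first_visit[animal_id] = visit
            cases.append((animal_id, visit))
    return cases

# Illustrative records only (not study data).
records = [
    ("A", date(2008, 1, 1)),
    ("A", date(2008, 1, 10)),  # 9 days after the initial visit: same case
    ("A", date(2008, 2, 1)),   # 31 days after the initial visit: new case
    ("B", date(2008, 1, 5)),
]
cases = collapse_to_cases(records)
```

The date retained for each case is the first presentation, which is also the time value used in the temporal analysis.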
The date that the animal first presented to the veterinarian with symptoms of enteric syndrome served as the time value for temporal analysis. To protect the identity of pet owners, only the first three digits of the owner's home postal code (the forward sortation area; FSA) were obtained and these provided the spatial data (Canada Post, http://www.canadapost.ca/business/tools/pg/manual/PGaddress-e.asp#1382487). There were 35 FSAs from within the city of Calgary and 15 FSAs in the remaining study area. The animal's demographic data (species, age, sex, dog breed) and the medical notes recorded by the veterinarian or his/her staff were used to further characterize the cases.
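As a hedged illustration of how the FSA and the first-visit date together define the unit of analysis, the sketch below truncates a postal code to its first three characters and tallies cases per FSA per day. The function names and the postal codes are hypothetical, and in the study itself only the FSA was extracted, so the truncation step stands in for what the extraction program did at source.

```python
from collections import Counter
from datetime import date

def forward_sortation_area(postal_code):
    """Return the FSA: the first three characters of a Canadian postal code."""
    return postal_code.replace(" ", "").upper()[:3]

def daily_counts_by_fsa(cases):
    """Tally cases per (FSA, first_visit_date).

    `cases` is an iterable of (postal_code, first_visit_date) pairs.
    """
    return Counter((forward_sortation_area(pc), d) for pc, d in cases)

# Made-up postal codes for illustration only.
example_cases = [
    ("T2N 1N4", date(2009, 9, 1)),
    ("t2n4z6", date(2009, 9, 1)),
    ("T3A 0A1", date(2009, 9, 2)),
]
counts = daily_counts_by_fsa(example_cases)
```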
Time series
The total number of enteric syndrome cases recorded for each day of the week was counted. The daily count and the 7-day moving average of daily case counts were plotted against time (Stata/IC 10.0, Stata Corp, USA). The results of these analyses were used to determine appropriate parameter values required for the space–time permutation scan statistic. To characterize temporal trends, the daily number of enteric syndrome cases divided by the daily number of all cases was plotted against time and a regression line was fitted to the data (Stata/IC 10.0).
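The two time-series summaries can be sketched in a few lines. The analyses reported here were run in Stata; the Python below is only an illustration. It assumes a trailing (rather than centred) 7-day window, which the text does not specify, and fits an ordinary least-squares line of the daily proportion against the day index.

```python
def moving_average(counts, window=7):
    """Trailing moving average; the first window-1 days have no value (None)."""
    out = []
    for i in range(len(counts)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(counts[i + 1 - window:i + 1]) / window)
    return out

def daily_proportion(enteric_counts, all_counts):
    """Enteric cases divided by all cases for each day (days with no cases skipped)."""
    return [e / a for e, a in zip(enteric_counts, all_counts) if a > 0]

def ols_slope(y):
    """Least-squares slope of y regressed on the day index 0..n-1."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    return sxy / sxx

# Illustrative values only (not study data).
smoothed = moving_average([1, 2, 3, 4, 5, 6, 7, 8])
slope = ols_slope(daily_proportion([4, 4, 4], [40, 40, 40]))
```

A slope close to zero for the proportion series corresponds to the absence of a long-term trend.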
Scan statistic for space–time clusters
The space–time scan statistic software, SaTScan™, v. 9·1·1 (Kulldorff Information Management Services; http://www.satscan.org), and the retrospective space–time permutation model [Reference Kulldorff12] were used to detect clusters of enteric syndrome occurrences in the data. This analysis required two data files: the geographical coordinates (latitude and longitude) for the centroid of each FSA and the number of enteric cases for each FSA on each day of the study. Temporal aggregation was based on the variation in the number of cases that presented to the participating practices on each day of the week (see results below) and was set at 7 days. Temporal aggregation also served to reduce the computing time. We used 100 days as the maximum size of the temporal window to reflect the seasonal pattern of enteric syndrome cases (see results below). The default maximum spatial window of 50% of the data was used. Statistical significance was evaluated using Monte Carlo re-sampling (999 repetitions) whereby the observed data were permuted under a null hypothesis of no disease clustering and the observed data were compared to this random distribution [Reference Kulldorff12]. The relative risk of a companion animal presenting with clinical signs of enteric syndrome inside the clusters compared with the surrounding area was estimated.
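The logic of the permutation model can be sketched as follows. This is not SaTScan code and does not search over candidate cylinders. It only illustrates, for one candidate set of FSAs and days, how the expected count is derived from the marginal totals without denominator data, how the Poisson generalized likelihood ratio described by Kulldorff et al. is formed, and how a Monte Carlo rank gives the P value. All function names and the toy counts are hypothetical.

```python
import math

def expected_in_cylinder(counts, zones, days):
    """Expected cases in zones x days under space-time independence.

    counts: dict {(zone, day): cases}. For each cell the expectation is
    (zone total * day total) / grand total; cells are summed over the cylinder.
    """
    total = sum(counts.values())
    zone_tot, day_tot = {}, {}
    for (z, d), c in counts.items():
        zone_tot[z] = zone_tot.get(z, 0) + c
        day_tot[d] = day_tot.get(d, 0) + c
    cells = sum(zone_tot.get(z, 0) * day_tot.get(d, 0) for z in zones for d in days)
    return cells / total

def log_glr(observed, expected, total):
    """Log generalized likelihood ratio; zero when there is no excess of cases."""
    if observed <= expected:
        return 0.0
    rest = total - observed
    inside = observed * math.log(observed / expected)
    outside = rest * math.log(rest / (total - expected)) if rest > 0 else 0.0
    return inside + outside

def monte_carlo_p(observed_stat, simulated_stats):
    """Rank of the observed statistic among itself and the simulated maxima."""
    rank = 1 + sum(1 for s in simulated_stats if s >= observed_stat)
    return rank / (len(simulated_stats) + 1)

# Toy data: two zones, two time periods, 20 cases in total.
counts = {("A", 1): 8, ("A", 2): 2, ("B", 1): 2, ("B", 2): 8}
mu = expected_in_cylinder(counts, zones=["A"], days=[1])
stat = log_glr(counts[("A", 1)], mu, sum(counts.values()))
p_min = monte_carlo_p(5.0, [1.0] * 999)
```

With 999 replicates the smallest attainable P value is 1/1000, which is why the number of repetitions bounds the reported significance.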
Evaluating the cluster signals
In each of the statistically significant (α = 0·05) clusters, the cases within the clusters were characterized by the animals involved, their histories and clinical features using the information available within the electronic medical records stored in the data warehouse. If necessary the records from previous or subsequent days (and outside the time-frame of the cluster) were reviewed to further characterize a case included in a cluster. For example, if an animal's record indicated that a faecal sample had been submitted to a laboratory for diagnostic testing, subsequent records were searched to find the laboratory results.
The median was used to describe the age of the animals within the cluster. The remaining variables were reported as a proportion of cases within the cluster with the variable of interest and the 95% confidence interval (CI). This descriptive statistic was used for: (i) species (dogs, cats and ‘other species’ which included rabbits, ferrets and small rodents); (ii) intact animals (i.e. not spayed or neutered); (iii) vaccination history; (iv) exposure history; (v) disease severity (including haematochezia, admitted for intravenous fluid therapy, died or euthanized); and (vi) aetiological diagnosis.
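The descriptive statistics above can be illustrated with a short sketch. The method used to compute the 95% CI is not stated in the text; the Wilson score interval below is an assumption chosen for its reasonable behaviour with small cluster sizes, and the counts and ages are illustrative rather than taken from the study.

```python
import math
from statistics import NormalDist, median

def wilson_ci(successes, n, level=0.95):
    """Wilson score interval for a proportion (assumed method, see lead-in)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# Illustrative cluster: 8 of 64 cases with the variable of interest.
proportion = 8 / 64
low, high = wilson_ci(8, 64)

# Illustrative ages in years.
median_age = median([0.2, 0.5, 0.5, 1.0, 3.0])
```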
The next step was to determine if the characteristics of cases within a cluster identified the cluster as either expected (similar to) or unexpected (unusual) compared to the characteristics of referent cases. To establish the parameters of the reference population against which the clusters could be compared, the median age, and proportions (with 95% CI) for species, and sexually intact animals, were calculated from all of the enteric syndrome cases. For each of the remaining variables, the medical notes from a random sample of 500 enteric syndrome cases were reviewed to describe the proportions of cases by the history of vaccinations, exposure history, disease severity and the aetiological diagnosis. This sample size was sufficient to estimate the reference population's proportions with a 4·4% precision, assuming the a priori estimate of the proportion to be (conservatively) 0·5, and a 5% significance level [Reference Dohoo, Martin and Stryhn28].
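The stated precision can be checked with the usual normal-approximation margin of error for a proportion, z·sqrt(p(1 − p)/n). The sketch below assumes this is the formula behind the 4·4% figure; the function name is hypothetical.

```python
import math
from statistics import NormalDist

def margin_of_error(n, p=0.5, alpha=0.05):
    """Half-width of the normal-approximation CI for a proportion."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * math.sqrt(p * (1 - p) / n)

# n = 500 reviewed cases, conservative prior proportion of 0.5.
precision = margin_of_error(500)
```

For n = 500 and p = 0·5 this gives a half-width of about 0·044, consistent with the precision reported above.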
After the characteristics of the cluster were defined and it was determined that the findings were unexpected when compared to the reference population, we evaluated the information available within the electronic medical records: (i) for the possibility of developing a hypothesis as to the cause of the outbreak, (ii) to assess the possible risk factors for enteric syndrome in the cluster, and/or (iii) to inform a response by animal health or public health authorities.
RESULTS
Time series
There were 1242 enteric cases presenting to the participating veterinary practices on Sundays during the study period compared to an average of 2446 cases on each of the other 6 days of the week. This day of the week effect was a reflection of the veterinary practices' operating hours and informed the 7-day time aggregation in SaTScan.
The daily and 7-day moving average of counts of enteric cases were plotted against time. The numbers of enteric syndrome cases presenting to veterinarians increased in late summer and autumn (over a window of ~100 days) for each of the 4 years of the study (Fig. 3). The seasonal pattern was still evident when the daily enteric cases were normalized by all cases presenting to the participating practices each day. The linear regression examining the number of enteric syndrome cases over all cases indicated there was no long-term trend (slope coefficient <0·0001) in the proportion of enteric syndrome cases over the time of the study (Fig. 4).
Space–time analysis
There were four significant (P < 0·05) clusters identified, one each in 2007, 2008, 2009, and 2010 (Table 1, Figs 3 and 5).
Evaluating the cluster signals
The characteristics of the reference population were estimated and are reported in Table 2.
Notes to Table 2: n.a., not applicable; CI, confidence interval.
* Other species includes ferrets, rabbits and small rodents.
† Feed includes raw diet, diet change, dietary indiscretion.
‡ Housing includes pet store, breeder, boarding kennel.
§ Environmental exposure includes hunter/scavenger, off leash parks, camping, hiking.
|| Determined by morphological appearance under light microscopy, not identified by microbiological culture.
The distributions of cats and dogs within each of the clusters were as expected (similar to the reference population). Other species were not represented in the 2007, 2008, or 2010 clusters. The 2009 cluster had an unexpectedly high proportion (0·125) of ferret cases. Four of the eight ferret cases had hyperadrenocorticism as a comorbidity, one was positive for parvovirus (Aleutian disease) and an aetiological diagnosis was not obtained in the other three cases.
The vaccination status and severity of disease for cases within all of the clusters were similar to those in the reference population.
Two clusters (2008 and 2009) had a larger proportion of cases positive for canine parvovirus (CPV) than the reference population (Fig. 6). The time-frames for these two clusters overlapped the seasonal peak for enteric syndrome identified in the time series (Fig. 3). The median age of the animals in these two clusters was different from that of enteric cases in the reference population (Fig. 7). The 2009 cluster contained more animals that had not been spayed or neutered compared to the reference population (Fig. 8). The 2009 cluster also had an unexpectedly high proportion of Shepherd-crosses in both the cases without an aetiological diagnosis and the CPV-positive cases; ten of the CPV cases presented to the veterinary practice together, were the same breed and age and so were presumed to be from the same litter. In the 2008 cluster, 10 of the 22 CPV-positive cases were from the same pet store, another four were from the same litter, and three were from the same rescue organization.
The remaining two clusters (2007 and 2010) occurred outside the seasonal peak of enteric syndrome and had expected values for the proportion of cases with any positive aetiological diagnosis. The cases from the 2010 cluster had a median age of 0·13 years and most were still intact. Nine of these animals were boxer puppies from the same litter; there was no aetiological diagnosis. The increased number of cases in the 2007 and 2010 clusters may have been the result of a true increase in disease burden in the pet population, or they may have been an artifact. There was insufficient diagnostic or exposure information in the data to make this distinction.
DISCUSSION
This study demonstrates that electronic medical records extracted from private veterinary practices have potential for characterizing disease trends and patterns, and for identifying clusters of excess cases of enteric syndrome in pets. However, the validity of the surveillance system's performance reported in this study could not be estimated because there was no alternative data source against which to measure the system.
The use of a syndromic case definition provided a sufficient number of cases to identify clusters using the retrospective space–time permutation model. In this study the investigation of clusters was limited to a review of the clinical records. The lack of information in the medical records often limited the case assessment to a few demographic parameters. Most of the case histories were only briefly documented, especially in terms of the animals' potential risk behaviours and risk exposures. A detailed examination of the potential risk factors for these clusters based on the records alone was compromised by a lack of data. This does provide an opportunity to educate private practice veterinarians about the importance of collecting epidemiologically relevant data.
Describing clusters by species, vaccination status, or disease severity was not useful for determining whether the clusters were unexpected or unusual. An increased number of cases diagnosed with CPV was useful for determining that two clusters (2008 and 2009) were unusual. These clusters were of animal health importance and being informed of an increased risk could be of interest to pet owners. CPV is capable of causing disease in pets and spreading rapidly through direct faecal–oral contact or environmental contamination [Reference Greene, Descaro and Greene29]. There were too few aetiological diagnoses to understand the animal or human health importance of the 2007 and 2010 clusters.
The records from the 2008 and 2009 clusters provided some information from which risk factor hypotheses could be developed. Dog breeders were linked to increased numbers of CPV cases in these two clusters. A pet store and a rescue organization were associated with CPV cases in the 2008 cluster. The animals in both of these clusters were younger. The animals in the 2009 cluster were less likely to have been surgically sterilized compared to the reference population despite a median age of 0·5 years; an age when most animals can be spayed or neutered. One possible hypothesis is that the pets in the 2009 cluster received less preventive veterinary care than the pets in other regions of the study area. The vaccination history would have provided some insight into this hypothesis but this information was not routinely recorded in these veterinary records. Further investigations such as surveys of the pet owners and of the veterinarians practising in this area of Calgary would be needed to understand the context of these findings.
An alternative approach to the analysis would be to examine the data by species. However, we expect that the evaluation of any additional species-specific clusters identified with this approach would still be compromised by the lack of clinical data.
If this study had been conducted prospectively and in real-time, the syndromic surveillance data could yield information that may inspire a targeted educational campaign. For example, dog owners in the region could be notified of an increased number of cases of CPV and the importance of vaccination stressed.
Clusters identified in a prospective study would require further investigations if there was insufficient information recorded in the medical records to determine if the cases were biologically or epidemiologically related. An obvious first step would be to contact veterinary practices to seek additional information about risk behaviours and exposures in order to determine if the increased number of cases is important. Veterinarians are likely to be interested in knowing if there is an unusual increase in enteric cases within their practice area. Given that few enteric cases in this study were routinely subjected to aetiological testing, communication of an unusual number of cases of diarrhoea could motivate veterinarians in the cluster area to increase diagnostic testing.
Public health responds to human cases of disease; there have been no studies indicating an association between companion animal enteric disease and human enteric disease in Alberta. There is no mandate or provision for companion animal disease outbreak investigations in the province of Alberta and it is unlikely that any action would be taken by the public sector upon cluster detection. Implementation of this surveillance system would need to be led by the private or academic sector because these are the two sectors most likely to benefit from or use the outputs.
The results of this study were similar to those of Balter et al. [Reference Balter30]. These authors reported on a human syndromic surveillance system using emergency-department chief complaint data for gastrointestinal syndrome. Their system was capable of detecting seasonal outbreaks of diarrhoeal illness due to norovirus. However, an outbreak that occurred outside of the seasonal peak of diarrhoeal disease and that followed a widespread power outage had insufficient information recorded in the records to characterize that cluster by aetiology or exposures. Furthermore, Maciejewski et al. [Reference Maciejewski19] reported that they were unable to associate a pet's clinical signs, which may have been an indication of exposure to propyl mercaptan, with those of its owners.
We have demonstrated that cluster detection is possible, but the ability to assess the importance of a cluster to animal or public health is compromised by the lack of data in the medical records. Such data are a critical information requirement if inspiring action is the desired outcome of the disease surveillance. To overcome this limitation, including data from faecal examinations and faecal culture results from the laboratories outside the veterinary practices may be helpful. It is also possible that if this system were adapted for surveillance of alternative syndromes such as respiratory or urinary tract diseases, the information biases affecting the data would be different from those identified in this study.
The use of a convenience sample of veterinary practices and the associated selection bias may have affected this study's results. In order for a case of enteric syndrome to be captured by this syndromic surveillance system, an animal exhibiting clinical signs of enteric disease needed to have an owner, the clinical signs needed to be severe enough for the owner to justify a visit to the veterinarian, and the animal had to be seen by a participating practice. Previous work [Reference Anholt25] demonstrated that the data collected by this system were not geographically or demographically representative of the underlying companion animal population. Pets in northeast Calgary were underrepresented in the data but enteric syndrome signals were detected in this region. Dogs and pets aged <1 year were overrepresented in the data. This bias may have enhanced the ability of the system to detect the clusters with higher than expected numbers of CPV cases. It is also possible that the limitations found in the medical records in this study (few diagnostic tests performed and limited exposure information) were not consistent across the study area. Therefore the results of this study cannot be generalized to all of the pet population in Calgary and area.
We have described how syndromic electronic medical record data from companion animal veterinary practices can be used for retrospective space–time surveillance in Alberta. The tools can also be used prospectively and could be useful to inform additional surveillance and control strategies. Demonstrating the value of these methods to veterinary practitioners may result in improved data collection. This study highlights the value of developing both the technological and human capacities to strengthen animal disease surveillance.
ACKNOWLEDGEMENTS
The authors thank Peter Peller, Librarian, Cartographic Materials and Spatial Data, University of Calgary.
DECLARATION OF INTEREST
None.