INTRODUCTION
Legionnaires' disease (LD) is an atypical pneumonia caused by the inhalation of aerosols contaminated by bacteria of the genus Legionella, which can grow in artificial aquatic environments such as cooling towers or hot-water systems. The case-fatality rate can be high, particularly in vulnerable groups (e.g. people with immunosuppression, chronic lung disease or advanced age). The majority of cases occur on a sporadic basis but outbreaks involving a high number of cases are regularly documented [Reference Nguyen1, Reference Vanaclocha2]. In order to limit the spread of the disease, a rapid investigation aimed at identifying and controlling the source of contamination is necessary. The early detection of cases and the high sensitivity of the surveillance system are thus two key elements for controlling the disease. In France, the surveillance of LD has been based on mandatory notifications (MN) since 1987. In 1996, a first capture–recapture study showed considerable under-reporting of the disease [Reference Infuso3]. The surveillance of the disease was therefore strengthened in 1997 (information to clinicians, guidelines for the surveillance and control of LD, modification of the MN form) and the urinary antigen detection test was introduced. The sensitivity of the surveillance system improved over the following years [Reference Nardone4] and the notification rate increased, reaching 2·5/100 000 population in 2005 [Reference Campese5]. New regulations were implemented and the notification rate of LD decreased slightly until 2009 (1·9/100 000) [Reference Campese6]. However, in 2010 France experienced an unexpected increase in LD (2·4/100 000) [Reference Campese7]. Moreover, a geographical west–east gradient was described, with a higher notification rate in eastern than in western administrative regions, suggesting that the sensitivity of the MN system was not homogeneous.
The aim of this study was to estimate the number of confirmed LD cases that occurred in France in 2010 and the sensitivity of MN by region in order to assess the geographical gradient.
MATERIALS AND METHODS
Study design
A capture–recapture approach was used to estimate the number of cases. Information about cases diagnosed in 2010 was gathered through two sources of data: the MN database and a survey of all hospital laboratories.
Case definition
A confirmed case of LD was defined as a patient presenting clinical and/or radiological signs of pneumonia associated with isolation of Legionella species from a culture of broncho-pulmonary secretions and/or a positive urinary antigen test and whose diagnosis was made between 1 January and 31 December 2010. Cases diagnosed by serology (single high titre or fourfold increase in antibody titres) were not included because of possible cross-reactions [Reference Boswell8] and because of the low positive predictive value of this method [Reference Plouffe9, Reference Rudbeck10].
Sources of data
For the MN system, physicians and biologists are required to report all LD cases to the regional health authority (Agence régionale de santé; ARS) by completing a standardized form. The ARS is in charge of the rapid implementation of epidemiological and environmental investigations. The form is then sent to the National Institute for Public Health Surveillance (Institut de Veille Sanitaire; InVS). The information held in the InVS database and used for this study comprised: the patient's anonymous identifier (based on an algorithm using the first name, the initial of the surname, the date of birth and sex), residential postcode, date of onset of illness, date of hospitalization, and bacteriological diagnosis (type of methods and results). The date of diagnosis is not available on the MN form; we therefore used the date of hospitalization, considering that the delay between hospitalization and diagnosis is very short in France [Reference Chidiac11].
In 2011, a survey of the 423 private and public hospital laboratories able to perform a test for LD diagnosis was conducted to collect information on each case of LD diagnosed in 2010. Laboratories were identified from the ‘Epibac network’, implemented by the InVS in 1987. Epibac is a national hospital-based laboratory network that collects data on severe invasive bacterial diseases [Reference Lepoutre12]. The high and sustained coverage of the Epibac laboratory network has been confirmed by extrapolations based on a reliable source of information (the French national Hospital Annual Statistic database). Collected information included first name, initial of surname, date of birth and sex (used to build the patient's anonymous identifier), postcode of the laboratory, type of methods used (isolation, urinary test) and date of diagnosis. We assumed that all patients tested for Legionella presented signs of pneumonia.
Identification of duplicate cases
The patient's code was used to identify duplicate cases between or within the sources of data. A more sensitive algorithm (checking postcode, year of birth and month of diagnosis/onset) was then used to identify further duplicates by linking records whose patient's codes were not identical for one of the following reasons: a difference in age or sex, a different or missing first name, or a different or missing date of diagnosis.
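The two-step matching described above can be illustrated with a minimal sketch. It is not the authors' procedure: the `Record` fields, the `link_records` and `relaxed_key` names, the use of exact postcode equality and the "first unused candidate wins" rule are all assumptions made for illustration, and the sketch covers only matching between the two sources, not duplicates within one source. In practice, candidate pairs from the relaxed pass would presumably be reviewed before being accepted.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical record layout; field names are illustrative only."""
    patient_code: str            # anonymous identifier
    postcode: str
    birth_year: Optional[int]
    event_month: Optional[int]   # month of diagnosis or onset


def relaxed_key(rec: Record) -> Optional[Tuple[str, int, int]]:
    """Relaxed matching key; None when a component is missing."""
    if rec.birth_year is None or rec.event_month is None:
        return None
    return (rec.postcode, rec.birth_year, rec.event_month)


def link_records(mn: List[Record], lab: List[Record]) -> List[Tuple[int, int]]:
    """Return sorted (mn_index, lab_index) pairs judged to be the same case.

    Pass 1 matches on the exact patient code; pass 2 matches the
    remaining records on (postcode, birth year, event month).
    """
    pairs: List[Tuple[int, int]] = []
    used_lab: Set[int] = set()
    for key_fn in (lambda r: r.patient_code, relaxed_key):
        # Index the still-unmatched laboratory records by the current key.
        index: Dict[object, List[int]] = {}
        for j, rec in enumerate(lab):
            key = key_fn(rec)
            if j not in used_lab and key is not None:
                index.setdefault(key, []).append(j)
        matched_mn = {i for i, _ in pairs}
        for i, rec in enumerate(mn):
            if i in matched_mn:
                continue
            key = key_fn(rec)
            candidates = [j for j in index.get(key, []) if j not in used_lab]
            if key is not None and candidates:
                pairs.append((i, candidates[0]))  # assumption: first candidate wins
                used_lab.add(candidates[0])
    return sorted(pairs)
```

The number of pairs returned corresponds to the overlap between the two sources used in the capture–recapture calculation.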
Capture–recapture analysis
The capture–recapture method is frequently used in epidemiological studies to assess the burden of disease. Its validity depends on several assumptions: all cases identified by each source are true cases that occurred within the time period and geographical area studied; the study population is closed; all matches are true matches; sources are independent (i.e. reporting a case to one source does not affect its reporting to the other); and all cases have an equal probability of being captured by a given source. When two data sources (A and B) are involved and these conditions are fulfilled, the estimated total number of cases, $N_{est}$, equals the number of cases in source A, $N_A$, multiplied by the number of cases in source B, $N_B$, divided by the number of cases common to both sources, $N_{AB}$ ($N_{est} = N_A \times N_B / N_{AB}$). To correct for the bias generated by small registers, the nearly unbiased estimator proposed by Chapman [Reference Chapman13] was used; from the 2 × 2 table, the estimate is

$$N_{est} = \frac{(N_A + 1)(N_B + 1)}{N_{AB} + 1} - 1$$

and the variance is

$$\mathrm{Var}(N_{est}) = \frac{(N_A + 1)(N_B + 1)(N_A - N_{AB})(N_B - N_{AB})}{(N_{AB} + 1)^2 (N_{AB} + 2)}$$

The sensitivity of source A is $N_A$ divided by $N_{est}$. Finally, to estimate the total number of confirmed LD cases in France, and also by region, in 2010, we divided the number of confirmed cases notified by MN in 2010 by the estimated sensitivity. We used Excel® software for calculations.
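For readers who prefer code to a spreadsheet, the two-source calculation can be sketched as follows. This is an illustration, not the authors' Excel workbook: it uses the standard form of the Chapman estimator and its variance, the function names `chapman_estimate` and `sensitivity` are hypothetical, and the normal-approximation confidence interval (z = 1.96) is an assumption, since the text does not state how the intervals were derived.

```python
import math
from typing import NamedTuple


class ChapmanResult(NamedTuple):
    n_est: float      # estimated total number of cases
    variance: float   # variance of the estimate
    ci_low: float     # lower bound of the approximate 95% CI
    ci_high: float    # upper bound of the approximate 95% CI


def chapman_estimate(n_a: int, n_b: int, n_ab: int, z: float = 1.96) -> ChapmanResult:
    """Chapman's nearly unbiased two-source capture-recapture estimator.

    n_a, n_b: cases in source A and source B; n_ab: cases in both.
    The confidence interval uses a normal approximation (an assumption).
    """
    if min(n_a, n_b, n_ab) < 0 or n_ab > min(n_a, n_b):
        raise ValueError("counts must be non-negative and n_ab <= min(n_a, n_b)")
    n_est = (n_a + 1) * (n_b + 1) / (n_ab + 1) - 1
    variance = ((n_a + 1) * (n_b + 1) * (n_a - n_ab) * (n_b - n_ab)
                / ((n_ab + 1) ** 2 * (n_ab + 2)))
    half_width = z * math.sqrt(variance)
    return ChapmanResult(n_est, variance, n_est - half_width, n_est + half_width)


def sensitivity(n_source: int, n_est: float) -> float:
    """Proportion of the estimated total that one source captured."""
    return n_source / n_est
```

For example, `chapman_estimate(120, 100, 80)` describes a hypothetical region with 120 notified cases, 100 laboratory cases and 80 common cases; `sensitivity(120, result.n_est)` then gives the estimated sensitivity of notification in that region.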
Ethical approval
The objectives of the study, the process regarding patients' consent and access to the patients' identity and medical data were approved by the national committee (Commission Nationale de l'Informatique et des Libertés), according to French law.
RESULTS
Out of the 423 laboratories, 57 did not routinely perform the diagnosis of LD in 2010. Of the 366 remaining laboratories, 343 (94%) responded to the survey. All regions participated including overseas regions. From MN, 1469 cases corresponded to the case definition including 162 cases diagnosed by laboratories that had not been included in the survey. Therefore, a total of 1307 cases were included in the MN data source. The laboratory survey yielded a total of 1352 LD cases.
A total of 1463 distinct cases were identified by the two sources combined, of which 1196 were common to both. Using the Chapman formula, we estimated that 14 cases were not captured by either data source (Table 1). Thus, the estimated sensitivity of MN was 88·5% [1307/(1463 + 14); 95% CI 88·0–89·0].
* Cases estimated using the Chapman formula, not captured by either data source.
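As a check on the arithmetic, the reported counts can be substituted into the Chapman estimator, taken here in its standard form $N_{est} = (N_A + 1)(N_B + 1)/(N_{AB} + 1) - 1$, with $N_A = 1307$ (MN), $N_B = 1352$ (laboratory survey) and $N_{AB} = 1196$ (common cases):

$$N_{est} = \frac{(1307 + 1)(1352 + 1)}{1196 + 1} - 1 = \frac{1\,769\,724}{1197} - 1 \approx 1477$$

$$N_{est} - (N_A + N_B - N_{AB}) \approx 1477 - 1463 = 14, \qquad \frac{N_A}{N_{est}} \approx \frac{1307}{1477} \approx 0.885$$

These values agree with the 14 uncaptured cases and the 88·5% sensitivity reported above.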
The sensitivity of MN ranged from 70% to 100% by region. The sensitivity was below 80% in only two of the 26 regions. In six regions where all laboratories responded (two in mainland France and four in overseas regions) all LD cases were common to both sources and the estimated sensitivity of MN was 100%.
Applying the global estimated sensitivity to the 1469 cases notified (1307 included in the analysis and 162 not included, assuming that the sensitivity was the same for included and non-included cases), the estimated number of confirmed LD cases in France in 2010 was 1661 (95% CI 1621–1700).
The LD notification rate was 2·4/100 000 population in 2010. Considering that the sensitivity was 88·5%, the adjusted incidence of LD in France was 2·7 cases/100 000 population in 2010. At the regional level, and taking into account the regional sensitivity of MN, incidence rates ranged from 0·6/100 000 population in the western regions to 6·4/100 000 population in the eastern regions and the west–east gradient persisted (Fig. 1).
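The sensitivity adjustment used in the two preceding paragraphs amounts to dividing an observed count or rate by the estimated sensitivity. A minimal sketch follows; the function name `adjust_for_sensitivity` is hypothetical, and the sketch assumes, as the text does, that sensitivity is the same for all notified cases.

```python
def adjust_for_sensitivity(observed: float, sensitivity: float) -> float:
    """Scale a notified count or rate up to the estimated true value.

    observed: notified cases, or notification rate per 100 000 population.
    sensitivity: estimated proportion of true cases notified (0 < s <= 1).
    """
    if not 0 < sensitivity <= 1:
        raise ValueError("sensitivity must be in (0, 1]")
    return observed / sensitivity


# National figures reported in the text: 1469 notified cases,
# notification rate 2.4/100 000, sensitivity 1307 / 1477.47 (about 88.5%).
national_sensitivity = 1307 / 1477.47
estimated_cases = adjust_for_sensitivity(1469, national_sensitivity)
adjusted_rate = adjust_for_sensitivity(2.4, national_sensitivity)
```

With these inputs the estimated number of cases rounds to 1661 and the adjusted rate to 2·7/100 000, matching the figures in the text. The same function applies region by region with each region's own sensitivity.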
DISCUSSION
This study enables the estimation of the total number of confirmed LD cases diagnosed in France in 2010. We estimate that the annual incidence was 2·7 cases/100 000 population in 2010, close to the notification rate (2·4/100 000 in 2010). The sensitivity of the MN system increased from 10% (95% CI 9–11) in 1995 [Reference Infuso3] to 33% (95% CI 29–38) in 1998 [Reference Nardone4] and 88·5% (95% CI 88–89) in 2010. This improvement is probably due to the growing awareness of practitioners, as suggested by the high participation in the laboratory survey. Clusters and outbreaks that have occurred in France over the last decades [Reference Alsibai14–Reference Schmitt and Bitar16] may have contributed to this progress, particularly the large outbreak in the north of France in 2003–2004 [Reference Nguyen1], which attracted major media attention. Moreover, the introduction of urinary tests in hospital laboratories has facilitated the diagnosis of LD and has contributed to shortening the delay between diagnosis and notification [Reference Campese5]. Date of diagnosis is not included in the MN form but the median delay between date of onset and notification to the regional health authority decreased from 28 days in 1998 to less than 7 days in 2006 [Reference Campese5]. Based on these results, we consider that our surveillance system currently gives a representative description of the epidemiology of LD in France, especially as no changes in the surveillance system have occurred in recent years.
Some limitations of the study have to be considered. The sensitivity and the number of LD cases were obtained from a capture–recapture study using only two sources of data. A capture–recapture method with three sources would allow the use of log-linear models, which can control for dependencies between sources. However, in a three-source study performed on the 2002 data (C. Campese, InVS, unpublished results), we found dependencies between the three available sources of data (MN, laboratory survey and national reference centre; NRC). The NRC and InVS have daily contact about notified cases and their databases are now comparable. For this reason, and in the absence of another nationally available source of information, we had to use only two sources of data for the present study. The national hospital discharge database could not be used because neither the information needed to build the patient's identifier nor the method of diagnosis was available in that database. Another bias has to be considered: biologists have been requested to notify cases since 1999. This could have led to a positive dependence between sources and an overestimation of the sensitivity, but the magnitude of this bias is currently difficult to evaluate. Only confirmed LD cases diagnosed by a positive culture or urinary antigen test were included, which could have led to an underestimation of the number of cases. However, in recent years the proportion of cases diagnosed by serology (fourfold rise in antibody titres or single high titre) or PCR represented less than 3% of cases, thus limiting the impact of such bias. Finally, we consider that the assumptions for the correct use of the capture–recapture method were satisfied, except for homogeneity of capture. For example, it was not possible to assess whether the outcome (death or recovery) influenced the probability of being notified.
From an epidemiological point of view, it is important to monitor the sensitivity of any surveillance system over time. LD requires the rapid implementation of public health actions, and an improvement in the sensitivity of the surveillance system, together with a reduction in notification delay, contributes to the effectiveness of the control and prevention system by allowing clusters to be detected earlier. Many clusters continue to be identified in France but they are rapidly investigated, and since 2007 no outbreaks (more than 10 cases) have been identified [Reference Campese7].
With an estimated incidence of 2·7/100 000 population in 2010, France has the highest incidence of LD among European countries [17]. Nevertheless, other countries have rarely documented the sensitivity of their LD surveillance systems. In Italy in 2002, a capture–recapture method was used; the sensitivity was estimated to be 78·6% and the incidence of LD was 1·4/100 000 population [Reference Rota18] (vs. 1·7/100 000 in France in 2002). In The Netherlands, the estimated sensitivity was 42·1% in 2000–2001 and the estimated incidence was 2·8/100 000 population [Reference Van19] (vs. 1·3/100 000 in France in 2001). Comparisons with the notification rates of other European countries are difficult without accurate information on the performance of their respective surveillance systems. Adjusted estimates of their incidence rates should be encouraged before further comparisons are made.
The regional estimates of MN sensitivity provide important information. It had been suggested that the geographical variation in LD notification rates could be partly related to regional disparities in MN sensitivity. However, the west–east gradient remains, even after taking into account the sensitivity of MN by region. Further studies will be necessary to understand this gradient, particularly those focusing on the impact of environmental factors. It has already been demonstrated that climatic factors, such as humidity or temperature [Reference Fisman20–Reference Ricketts23], can influence the survival of the bacteria in the atmosphere and are associated with incidence levels. As several climates are found across the country (e.g. with temperature and humidity differences), ecological analyses should be further developed to better describe the association between LD incidence and climatic variation.
In conclusion, this study documents the substantial improvement in the sensitivity of our LD surveillance system and helps us to understand the observed trends. Estimating the sensitivity of a surveillance system is necessary to estimate the true incidence of a disease and to ensure that the surveillance system will enable the timely identification of clusters. Another important finding of this study is the persisting west–east gradient, which is not explained by regional disparities in notification. Additional studies are therefore necessary to explore other issues, particularly the relationship between the ecology of Legionella and the occurrence of cases.
ACKNOWLEDGEMENTS
The authors thank the microbiologists from 343 hospitals who participated in the laboratory survey and Dounia Bitar for her helpful comments.
DECLARATION OF INTEREST
None.