INTRODUCTION
Streptococcus agalactiae (group B streptococcus; GBS) is a Gram-positive, β-haemolytic coccus found in chains or diplococci. S. agalactiae has been a leading cause of neonatal morbidity and mortality for at least the past 30 years. Also affected are elderly adults, and adults with underlying conditions such as diabetes. A high carriage rate has been found in the general population, with ∼25% of healthy adults colonized at any one time point.
Although S. agalactiae has been studied for decades, relatively little is known about virulence factors that affect development of disease. The bacterial capsule, which induces serotype-specific antibodies, has been found to play a critical role in virulence [Reference Spellerberg1]. Other putative virulence factors include the surface proteins Rib and the alpha and beta antigens of the C protein (encoded by the genes rib, bca, and bac respectively). The C protein is thought to play a role in epithelial cell adherence and invasion, and resistance to phagocytosis [Reference Bolduc2], while Rib is a related high-molecular-weight protein [Reference Stalhammar-Carlemalm, Stenberg and Lindahl3].
Additionally, the recently identified spb1 gene [Reference Adderson4] has been suggested to play a role in epithelial cell adherence and invasion in serotype III isolates, and to serve as a marker for the invasive serotype III-3 (ST-17) lineage. Adderson et al. [Reference Adderson4] initially identified the spb1 gene using a subtractive hybridization method. Although this method can also lead to the discovery of new putative virulence genes [Reference Pettigrew5, Reference Saxena, Li and Caufield6], the availability of two sequenced GBS isolates [Reference Tettelin7, Reference Glaser8] enabled in silico screening. We searched the published genomes of 2603V/R (serotype V) [Reference Tettelin7] and NEM316 (serotype III) [Reference Glaser8] for genes present in only one of these isolates and absent in the other, and with characteristics of previously identified virulence genes. We chose two genes that were specific to serotype V: orf1867 (pag, phage-associated gene), a hypothetical DNA-binding protein associated with phage proteins, and orf2000 (psp), a putative surface protein containing an LPXTG motif. One gene specific to the serotype III sequenced isolate was chosen: orf1223, a hypothetical protein of unknown function that was also associated with phage proteins. This gene was most closely related to an abortive infection bacteriophage resistance protein from Lactococcus gasseri (48% identical); as such, we refer to it as brp, bacteriophage resistance protein. We investigated the distribution of these genes within a cross-section of commensal and invasive isolates, and examined correlations with a number of epidemiological parameters, including serotype and site of isolation.
MATERIALS AND METHODS
Bacterial strains
A total of 418 isolates sampled from five epidemiological collections were screened. Colonizing isolates were systematically chosen to maximize serotypes included in the tester set, and to minimize inclusion of clonal isolates. We selected 124 out of 687 anal orifice isolates and 17 vaginal, seven throat, and four urine isolates from healthy non-pregnant female and male students enrolled at the University of Michigan [Reference Bliss9]. In addition, 80 isolates from neonatal disease, 17 isolates from pregnant women with invasive disease acquired from Houston between 1995 and 2003 [Reference Zaleznik10], and 79 isolates from Wisconsin neonates with GBS disease (both early and late onset) were screened. A random selection of 90 out of 240 Wisconsin isolates from elderly adults (aged >65 years) with invasive GBS disease was also screened [Reference Borchardt11]. Wisconsin isolates consisted of invasive strains collected from patients through the Wisconsin public health surveillance system between 1998 and 2002. Invasive GBS disease was defined as any symptomatic GBS infection identified from a normally sterile site. An additional three well-characterized isolates were added as controls: A909 (Ia) (www.tigr.org), the sequenced serotype III strain NEM316 [Reference Glaser8], and the sequenced serotype V strain 2603V/R [Reference Tettelin7]. Streptococcus pyogenes ATCC strain 19165 and Streptococcus pneumoniae strain 02J1175 were negative controls. Three additional strains were employed for PFGE typing: 874391 (serotype III-3a), a vaginal isolate from a colonized woman; 620593 (serotype III-3b), isolated from the blood of a 37-day-old infant; and 865043 (serotype III-2), a vaginal isolate from a colonized woman.
Serotyping
All 421 isolates (418 from our collections, plus three GBS control isolates) had been serotyped using either the Lancefield capillary precipitin test at the Streptococcal Immunology Laboratory, Baylor College of Medicine, Houston, Texas or dot-blot capsular typing (DBCT), or both, as previously described [Reference Borchardt12].
Polymerase chain reaction
DNA lysates were prepared from 1 ml GBS overnight culture in Todd–Hewitt broth (BBL, Sparks, MD, USA). Each reaction mixture (25 μl) included 25 pmol of each primer, 0·4 mM deoxynucleoside triphosphate mixture, 4 mM MgCl2, 2 μl DNA lysate, 2 U platinum Taq DNA polymerase and PCR buffer (Gibco-BRL Life Technologies Inc., Gaithersburg, MD, USA). PCR amplification of the probe sequence for each gene examined was performed by using the primer pairs described in Table 1. The denaturation, annealing and elongation temperatures and times used were 94°C for 30 s, 55°C for 30 s and 72°C for 2 min, respectively, for 35 cycles. A total of 10 μl of PCR product was separated by electrophoresis on a 1·5% agarose gel, stained with 0·02 μg of ethidium bromide/ml, and visualized by UV trans-illumination.
Dot-blot hybridization
Probe sequences were amplified by PCR and purified using QIAquick Gel Extraction kit (Qiagen Inc., Valencia, CA, USA). The purified DNA was sequenced at the University of Michigan DNA Core Facility using an Applied Biosystems (Foster City, CA, USA) model 3730 sequencer. These sequences were subsequently labelled in accordance with the manufacturer's protocol (Gene Images AlkPhos Direct Labeling and Detection System; Amersham Biosciences, Piscataway, NJ, USA).
GBS isolates were inoculated into 1200 μl of Todd–Hewitt broth, grown overnight at 37°C with 5% CO2, and then centrifuged at 8000 g for 20 min (224 microplate rotor, HN-SII centrifuge; International Equipment Co., Needham Heights, MA, USA) to pellet the bacteria. Cells were lysed with 1400 μl of lysis buffer (0·4 M NaOH, 10 mM EDTA) overnight at 68°C. Samples were then centrifuged at 8000 g for 20 min to pellet cellular debris, and 50 μl of DNA in the crude extract was applied to a nylon transfer membrane (Hybond-N+; Amersham Biosciences). DNA hybridization was performed in accordance with the manufacturer's protocol (Gene Images AlkPhos Direct Labeling and Detection System, Amersham Biosciences), and visualization was done with a Storm PhosphorImager (Molecular Dynamics, Sunnyvale, CA, USA) as described previously for Escherichia coli [Reference Zhang13]. The DNA hybridization data were quantified relative to positive and negative controls present on each membrane using ImageQuant (Molecular Dynamics). After subtraction of the mean negative control value from the signals of the positive controls and test strains, strains with a signal intensity greater than ∼30% of the mean fluorescence of the positive controls were considered positive. The cut-point was determined empirically based upon signal intensity achieved in known positive and negative isolates using criteria described previously [Reference Zhang13].
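The calling rule described above can be sketched in Python. This is a minimal illustration and not the ImageQuant workflow itself: the function name `call_dot_blot`, the example intensities, and the treatment of the approximate 30% cut-point as exactly 0.30 are all assumptions made for the sketch.

```python
from statistics import mean

def call_dot_blot(test_signal, positive_controls, negative_controls, cutoff=0.30):
    """Return True if a background-corrected test signal exceeds the cut-point.

    cutoff=0.30 stands in for the approximate 30% threshold in the text
    (an assumption; the published cut-point is only given as ~30%).
    """
    background = mean(negative_controls)
    # Subtract the mean negative-control signal from controls and test strain.
    corrected_positive = mean(positive_controls) - background
    corrected_test = test_signal - background
    if corrected_positive <= 0:
        raise ValueError("positive controls do not exceed background")
    return corrected_test > cutoff * corrected_positive

# Hypothetical intensities from one membrane (arbitrary units).
positives = [1000.0, 1100.0]
negatives = [90.0, 110.0]
print(call_dot_blot(500.0, positives, negatives))  # well above background
print(call_dot_blot(150.0, positives, negatives))  # close to background
```

Because the controls are passed in per call, the same function can be applied membrane by membrane, matching the statement that signals were quantified relative to the controls present on each membrane.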
All isolates were tested in duplicate, using separate membranes. Isolates with discordant results were retested. If discordant results were observed upon repeat testing, then the isolate was analysed by PCR.
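The duplicate-testing procedure can be written as a small decision function. This is a sketch under stated assumptions: the name `resolve_call` is hypothetical, and the repeat test is assumed to be run in duplicate as well, which the text does not specify.

```python
def resolve_call(first, second, retest=None, pcr=None):
    """Combine duplicate dot-blot calls (True = gene present).

    first, second: calls from the two original membranes.
    retest: optional (call, call) pair from repeat testing
            (duplicate retesting is an assumption of this sketch).
    pcr: optional PCR result, used only if the retest is still discordant.
    Returns the resolved call, or None if further testing is needed.
    """
    if first == second:
        return first
    if retest is None:
        return None  # discordant duplicates: retest required
    if retest[0] == retest[1]:
        return retest[0]
    return pcr  # still discordant: defer to PCR (None if not yet run)

print(resolve_call(True, True))
print(resolve_call(True, False, retest=(False, False)))
print(resolve_call(True, False, retest=(True, False), pcr=True))
```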
Statistical analysis
χ2 tests were used to assess differences in gene frequencies by collection and serotype. SAS was used for all statistical analyses [14].
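For readers without access to SAS, the comparison of a gene's frequency between two groups can be illustrated with a Pearson χ2 test on a 2×2 table. The sketch below uses only the Python standard library, applies no continuity correction, and uses hypothetical counts; it is not the SAS procedure used in the study, and the function name `chi_square_2x2` is an assumption.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Rows are groups (e.g. invasive, colonizing); columns are gene-positive
    and gene-negative counts. Returns (statistic, p_value) with 1 degree
    of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts, not data from the study.
stat, p = chi_square_2x2(20, 80, 10, 90)
print(f"chi2 = {stat:.2f}, P = {p:.3f}")
```

With small expected cell counts an exact test would usually be preferred over this approximation.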
RESULTS
Overall prevalence of genes
Prevalence of the selected genes varied widely in the population (see Fig. 1). pag was seen with the least frequency (present in only 5% of isolates), while bca and psp were found most often (in 43% and 42% of the isolates, respectively). The prevalence of bac, bca, and rib was similar to that observed in previous studies [Reference Stalhammar-Carlemalm, Stenberg and Lindahl3, Reference Berner15–Reference Madoff18]. A total of 53 isolates representing 13% of the collection were negative for all genes examined. The majority of these were serotype V (30/53, 57%) and serotype Ia (10/53, 19%).
Prevalence of genes by type of isolate
We examined the prevalence of the selected genes as a function of type of isolate. Isolates were divided into two groups, ‘invasive’ (isolates from neonates, pregnant women, and elderly patients) and ‘colonizing’ (isolates taken from healthy asymptomatic individuals, which included rectal, throat, urine, and vaginal samples) (Table 2). The distribution of the genes examined was similar between the two groups, with spb1 appearing slightly more often in invasive than colonizing isolates (12% vs. 8%, P=0·23), and pag more frequently in colonizing isolates (8% vs. 4%, P<0·01, see Fig. 2a). More striking differences were seen when only the three largest groups of isolates were examined, but none achieved statistical significance (see Fig. 2b).
Distribution of genes by serotype
The serotypes of isolates that were positive for genes of interest were examined. Most of the genes were common in only one or two serotypes; bca was an exception, as this gene was present in >50% of all serotype Ib, II, and V isolates in our collection. However, no gene was limited to a single serotype, and all of the genes examined were present in at least one isolate of all of the most common serotypes (Ia, Ib, II, III, and V; see Fig. 3).
Genetic signatures and gene distribution
A binary genetic signature was assigned to all isolates based on the presence or absence of each gene examined, where 0 or 1 refers to absence or presence of genes in the following order: spb1, brp, pag, psp, bac, bca, rib. A total of 43 different genetic signatures were found among the 421 GBS isolates examined. The 11 most common signatures, each found in more than 10 isolates, together accounted for >80% of all isolates examined. All of the most common genetic signatures contained three or fewer genes of interest, and the bulk of the isolates examined contained only one or two of the genes.
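The signature scheme can be made concrete with a short Python sketch. The gene order is taken from the text; the function names and the example isolates are hypothetical and serve only to show how signatures are built, read back, and tallied.

```python
from collections import Counter

# Gene order given in the text for the seven-digit signature.
GENE_ORDER = ("spb1", "brp", "pag", "psp", "bac", "bca", "rib")

def encode_signature(genes_present):
    """Return the binary signature string for a set of gene names."""
    unknown = set(genes_present) - set(GENE_ORDER)
    if unknown:
        raise ValueError(f"unknown genes: {sorted(unknown)}")
    return "".join("1" if g in genes_present else "0" for g in GENE_ORDER)

def decode_signature(signature):
    """Return the list of genes present for a seven-digit signature."""
    if len(signature) != len(GENE_ORDER) or set(signature) - {"0", "1"}:
        raise ValueError("signature must be seven characters of 0 or 1")
    return [g for g, bit in zip(GENE_ORDER, signature) if bit == "1"]

# Hypothetical isolates, used only to show the tally of signatures.
isolates = [{"bac", "bca"}, {"bca"}, set(), {"spb1", "rib"}, {"bca"}]
counts = Counter(encode_signature(genes) for genes in isolates)
print(counts.most_common())
print(decode_signature("1000001"))
```

Seven genes allow at most 2^7 = 128 distinct signatures, so the 43 observed signatures cover about one third of the possible combinations.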
Genetic signatures of spb1-positive isolates
Previous studies have suggested that spb1-positive isolates represent a lineage of invasive neonatal isolates [Reference Adderson4]. This lineage (serotype III-3) was initially identified on the basis of MLEE [Reference Musser19], and more recently by MLST [Reference Jones20] profiles of invasive neonatal isolates (see also [Reference Jones21, Reference Lin22]). spb1 was identified via subtractive hybridization, using one of these invasive III-3 isolates as the tester strain and a colonizing serotype III-2 isolate as the driver. Subsequent investigation found spb1 to be present only in serotype III-3 strains [Reference Adderson4], although previous work by our group reported its presence in a number of serotypes (T. C. Smith et al., unpublished observations). In our current study, spb1-positive isolates were represented by 18 different genetic signatures (see Table 3) and many different serotypes (see Fig. 3). However, the presence of both spb1 and rib was correlated with invasive neonatal infection in serotype III isolates. With the exception of one non-typable isolate, all isolates containing both spb1 and rib were serotype III, and 21/25 (84%) of the spb1+, rib+ isolates were taken from invasive infections; 19/25 (76%) of the spb1+, rib+ isolates were serotype III-3 by PFGE, while 19/41 (46%) of the total spb1+ isolates in our collection were III-3. Therefore, we suggest that the presence of both spb1 and rib is a better marker for serotype III-3 isolates than spb1 alone.
Are isolates with the same genetic signature clonal?
In order to determine relationships between genetic signature and serotype, isolates were grouped by genetic signature, and serotype distribution within each signature was examined (see Fig. 4). While one serotype dominated each genetic signature, there was heterogeneity within most signatures. Exceptions to this were isolates with the genetic signature 0010011 (containing pag, bca, and bac), which was found only in isolates of serotype Ib, and 1000001 (containing spb1 and rib), which was found only in isolates of serotype III. Using these methods, however, we were unable to determine whether this variety in serotype within the remaining genetic signature groups was due to capsule switching or other lateral transfer events.
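The grouping step described above, in which isolates are binned by signature and the serotypes within each bin are inspected, can be sketched as follows. The function names and the records are hypothetical and are not data from the study.

```python
from collections import defaultdict

def serotypes_by_signature(isolates):
    """Map each signature to the set of serotypes in which it was observed.

    isolates: iterable of (signature, serotype) pairs.
    """
    groups = defaultdict(set)
    for signature, serotype in isolates:
        groups[signature].add(serotype)
    return dict(groups)

def single_serotype_signatures(isolates):
    """Return, in sorted order, the signatures seen in exactly one serotype."""
    groups = serotypes_by_signature(isolates)
    return sorted(sig for sig, sero in groups.items() if len(sero) == 1)

# Hypothetical records; not data from the study.
records = [
    ("1000001", "III"), ("1000001", "III"),
    ("0000010", "V"), ("0000010", "Ib"),
    ("0000000", "V"), ("0000000", "Ia"),
]
print(single_serotype_signatures(records))
```

A signature confined to one serotype is consistent with clonality but does not demonstrate it, which is why the text treats serotype heterogeneity within a signature as the more informative observation.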
DISCUSSION
A number of previous studies have examined the molecular epidemiology of putative virulence factors, most notably surface proteins, in GBS (e.g. [Reference Stalhammar-Carlemalm, Stenberg and Lindahl3, Reference Adderson4, Reference Berner15–Reference Madoff18, Reference Bohnsack23–Reference Kvam25]). They have often found heterogeneity within serotypes, with serotype V frequently being singled out as a more homogeneous serotype, particularly when antibiotic-resistant strains are examined [Reference von Both26]. In the current study, we observed a large amount of serotype diversity within each genetic signature. This is probably due to lateral gene transfer, either of the capsule genes (capsule switching) or of the other virulence genes examined. Several other investigators have reported the appearance of capsule switching when examining the molecular epidemiology of GBS [Reference Davies27–Reference Luan29]. Ferrieri et al. have suggested that genetic exchange may occur within the genital tract of women who are colonized with strains representing multiple serotypes [Reference Ferrieri30]. It is possible that this exchange may occur during long-term rectal carriage of multiple serotypes as well [Reference Ferrieri24].
Our study confirms prior research [Reference Stalhammar-Carlemalm, Stenberg and Lindahl3, Reference Berner15–Reference Madoff18] which found the bac and bca genes primarily in serotypes Ia, Ib, and II [Reference Kong17, Reference Kvam25]. However, earlier reports using PFGE have suggested that bac-positive strains are genetically homogeneous [Reference Dmitriev31]. In our study, we found 75 isolates that contained the bac gene. These isolates could be divided into 14 different genetic signatures. Most commonly found was 0000110, containing only bac and bca (31/75 bac gene-positive isolates, 41%); eight other signatures were represented by four or more bac gene-positive isolates, while the remaining five genetic signatures were represented by a single bac gene-positive isolate. This suggests that a measure of diversity is missed when relying on PFGE as the main method of determining diversity within a population, as has been noted in the literature [Reference Manning32].
In line with previous studies, bca was also found in ∼15% of serotype V isolates, while rib was found primarily in serotype III isolates [Reference Stalhammar-Carlemalm, Stenberg and Lindahl3, Reference Kong17]. In a similar proportion of isolates (13%), none of the genes under investigation were detected (giving a genetic signature of 0000000). It is possible this is due to limitations of the dot-blot hybridization detection system. However, two lines of evidence suggest otherwise. When a random subset of isolates was examined by PCR, the results obtained corroborated the data from dot-blot hybridization methods. Additionally, dot-blot hybridization results obtained for the sequenced isolates (2603V/R, NEM316, and A909) were consistent with sequence information. It is also possible that genes similar to the ones under investigation were present in these 0000000 isolates, but too divergent to be detected using our hybridization conditions.
The 0000000 signature was particularly common in serotype V and Ia isolates; 45% (24/53) of the 0000000 isolates were invasive isolates, suggesting that there are additional, unidentified genes which contribute to the virulence of these strains. It is particularly important to discover virulence factors in serotype V isolates, as they have often been reported to be resistant to antibiotics [Reference Borchardt12, Reference von Both26, 33–Reference von Both35]. As serotype V isolates in our collection were largely 0000000 or 0000010 (containing bca only), this study is unable to shed much light on this problem. It is unfortunate that 2603V/R, the sequenced serotype V isolate, is atypical of that serotype. Rather, it has been found to be similar to many serotype III isolates [Reference Davies27, Reference Herbert36]. Our data support this finding, as genes initially identified in 2603V/R (pag and psp) were rarely found in other serotype V strains (see Fig. 2). Therefore, further genomic research should assist in elucidating additional potential virulence factors, paying special attention to identification of novel virulence factors important for serotype V isolates.
With the exception of spb1, we observed no difference in prevalence of the selected genes among invasive compared to colonizing isolates. This is similar to a previous report that found an association between the presence of putative virulence genes and genetic lineage, but not disease [Reference Hauge28]. A recent study using MLST to examine a population of invasive and colonizing isolates similarly found no association between MLST pattern and disease in their isolate population collected in Alberta, Canada [Reference Davies27]. However, also using MLST, Luan et al. [Reference Luan29] did find a lineage associated with neonatal disease in their study of Swedish invasive isolates. While this discrepancy could be due to genetic differences in the two geographical locations, another possibility is that it is an artifact of their isolate selection, as Davies et al. [Reference Davies27] examined both invasive and colonizing serotype III isolates, while Luan et al. looked only at a population of invasive isolates. Additionally, the very characterization of isolates as ‘invasive’ or ‘colonizing’, particularly when dealing with neonatal isolates, introduces variability into the study, as an isolate that may only be colonizing in a pregnant woman may result in invasive disease in the neonate. The increased finding of one lineage in invasive disease may be the result of hypervirulence in that lineage, or it may be a function of an increased ability to adhere to neonatal host cells, to be transferred to the neonate, or to adapt preferentially to the neonate rather than the adult.
The advantages of our study over several previous ones include a population-based approach to isolate selection, and a large number of isolates which were diverse in serotype and PFGE type. However, our colonizing and invasive isolates are taken from different geographic locations and study populations, making it difficult to generalize to all bacterial populations, although no statistically significant differences in gene distribution were noted in different geographic regions. Moreover, the initial sequenced isolates (NEM316 and 2603V/R) are both isolates taken from patients with clinical disease, introducing a potential bias in our initial selection of genes of interest. Future work examining both invasive and colonizing isolates in a number of community populations would improve our understanding of the ecology and epidemiology of S. agalactiae.
Finally, the variety of genetic signatures among spb1-positive isolates indicates that spb1-positive isolates are more genetically diverse than previously suggested, and that this gene has limited use as a single marker for invasive III-3 isolates. However, while the presence of spb1 alone was not enough to identify a virulent isolate, the presence of both spb1 and rib was highly associated with invasive neonatal disease. It remains undetermined whether the pathogenicity of these strains is due to the Rib and Spb1 proteins, or whether they are simply markers of this lineage. Additionally, we looked solely at gene presence or absence, and did not examine different levels of gene transcription. It is possible that small changes in gene regulation could lead to large differences in disease outcome. Future studies examining the role of these genes individually and together in a variety of genetic backgrounds could further elucidate the role each protein plays in these infections.
ACKNOWLEDGEMENTS
We thank Patricia Tallman and Joan DeBusscher for assistance with management of the bacterial collections and excellent technical advice; Sarah Smathers for assistance with DNA isolation; the Wisconsin Public Health Department for the use of their collection of invasive neonatal and elderly isolates; and Carol Baker for Texas neonatal isolates. This work was supported by NIH grant number R01AI051675 (B.F.).
DECLARATION OF INTEREST
None.