INTRODUCTION
Persistent infection by at least one type of high-risk human papillomavirus (HR-HPV) is necessary but not sufficient for such infection to progress to cervical cancer (CC) [Reference Bosch1]. Bearing in mind that the main HPV transmission route is sexual, females infected by at least one viral type have a greater probability of acquiring new viral types [Reference Liaw2–Reference Thomas4].
Variable coinfection prevalence has been reported [Reference Chaturvedi5–Reference Garcia7] and the number of cases of multiple HPV infection has been found to be significantly greater than expected [Reference Chaturvedi5, Reference Chaturvedi8]. Infection by multiple HPV types has been described as playing an important role in the development and/or progression of precursor CC lesions [Reference Chaturvedi5, Reference Trottier9]. The inverse relationship between coinfection prevalence and cervical lesion severity [Reference Garcia7] has prompted interest in which HPV genotypes tend to co-occur with which other types; such knowledge could be useful for understanding the impact of multiple infection and identifying females at risk of developing CC.
Whether particular groupings of HR-HPV types exist in multiple infection at phylogenetic or genotype level has been investigated in different populations and with different detection techniques, unfortunately leading to inconclusive findings so far [Reference Chaturvedi5, Reference Vaccarella10]. Moreover, limitations related to primer sets leading to underestimating the number of types [Reference Trottier9], as well as restricted sample size and the need for multiple statistical analyses [Reference Chaturvedi8], make grouping patterns difficult to interpret.
The present study was aimed at investigating whether multiple HPV types (the six most prevalent high-risk types in Colombia [Reference Camargo11]), particularly certain combinations of types, tend to become grouped and which factors (related to women and HPV) are associated with such grouping patterns. This could provide additional information for understanding the possible impact of multiple viral types regarding carcinogenesis and thus establish a baseline for future studies aimed at monitoring the impact of vaccination on a target population.
METHODS
Study population
All the females who participated in this study voluntarily attended the existing CC promotion and prevention (P&P) programmes being offered by healthcare institutions in three Colombian cities (Bogotá, Girardot and Chaparral). The methodology used in this study has previously been described [Reference Soto-De Leon12].
Females with complete medical history information whose cervical samples were positive for at least one of the six HR-HPV types identified by real-time polymerase chain reaction (PCR) were included for analysis in this study (n = 242). Patients whose samples did not allow amplification of the constitutive gene hydroxymethylbilane synthase (HMBS) were excluded (n = 23). The sample thus consisted of 219 females.
Each participating institution's ethics committee approved the study and supervised it (i.e. the Ethics Committees at Fundación Instituto de Inmunología de Colombia, Nuevo Hospital San Rafael E.S.E in Girardot, the Hospital San Juan Bautista de Chaparral E.S.E. and the Hospital de Engativá).
Detecting and quantifying DNA from HPV
The methodology used in this study has previously been described by our research group [Reference Soto-De Leon12]. Briefly, DNA from the six high-risk viral types previously reported as having high frequency in Colombia was detected and quantified by real-time PCR. A 1:10 serial dilution (10^11 to 10^6) was prepared from an initial plasmid concentration for each HPV type and HMBS. These six dilution points were used for standardising real-time PCR by constructing standard curves for each viral type and the HMBS gene. Each sample was analysed for detecting and quantifying HPV-16, HPV-18, HPV-31, HPV-33, HPV-45 and HPV-58. The HMBS gene was amplified to verify DNA integrity and determine the number of viral copies per cell. Four parallel duplex real-time PCR assays were run for each patient (the first for HPV-16; the second included HPV−18 and −31; the third for HPV−33 and −45; and the last one for HPV−58 and HMBS), using the CFX96 Touch (BIO-RAD) detection system. Six plasmid dilution points were included in each analysis for the respective viral types or HMBS, as well as an NTC (no template control). Absolute viral load and normalised viral load (HPV copies/cell = number of HPV copies/(number of HMBS copies/2)) [Reference Carcopino13] were thus quantified for each viral type.
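The quantification step can be illustrated with a minimal sketch. It assumes a linear standard curve of quantification cycle (Cq) against log10 plasmid copies, which is a common way of reading such curves but is not detailed in the text; the Cq values and all function names are hypothetical. Only the normalisation formula (HPV copies divided by half the HMBS copies) is taken from the description above.

```python
def fit_standard_curve(log10_copies, cq_values):
    """Least-squares line Cq = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate the copy number of a sample."""
    return 10 ** ((cq - intercept) / slope)


def normalised_viral_load(hpv_copies, hmbs_copies):
    """HPV copies per cell = HPV copies / (HMBS copies / 2)."""
    if hmbs_copies <= 0:
        raise ValueError("HMBS must be amplified to normalise viral load")
    return hpv_copies / (hmbs_copies / 2)


# Six 1:10 dilution points (10^11 to 10^6); the Cq values are hypothetical.
dilution_log10 = [11, 10, 9, 8, 7, 6]
dilution_cq = [8.7, 12.0, 15.3, 18.6, 21.9, 25.2]

slope, intercept = fit_standard_curve(dilution_log10, dilution_cq)
sample_copies = copies_from_cq(18.6, slope, intercept)   # hypothetical sample Cq
copies_per_cell = normalised_viral_load(1000, 200)       # hypothetical copy numbers
```

In this sketch a sample lacking HMBS amplification raises an error, mirroring the exclusion of samples in which HMBS could not be amplified.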
Statistical analysis
Frequencies and percentages were used for univariate analysis of categorical variables; medians and interquartile ranges (IQR) were used for quantitative variables, bearing the non-normal distribution of data in mind. The presence or absence of infection for each HPV type was established by viral load, using >0 viral copies as cut-off point for defining viral load positivity.
Multiple infection status was defined as real-time PCR detection of two or more viral types (coinfection). Differences regarding socio-demographic variables and risk factors between females with single infection and those with coinfection were evaluated. Fisher's exact test was used for identifying such differences in categorical variables. The Mann–Whitney test was used for quantitative variables, evaluating differences in distribution between both groups of females. STATA 10 software was used for analysing data; all hypothesis tests were two-tailed, at a 0·05 significance level.
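As an illustration of the categorical comparison, the following is a minimal sketch of a two-sided Fisher's exact test restricted to a 2 × 2 table (for example, a risk factor present/absent against single/multiple infection). It is not the STATA routine used in the study; the function name and the counts are hypothetical, and the two-sided P value is assumed to be the sum of the probabilities of all tables no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def table_probability(x):
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = table_probability(a)
    # Sum every table at least as extreme (no more probable) than the observed one
    p_value = sum(
        table_probability(x)
        for x in range(lowest, highest + 1)
        if table_probability(x) <= observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Hypothetical counts: rows = factor present/absent, columns = single/multiple
p = fisher_exact_two_sided(3, 1, 1, 3)
significant = p < 0.05   # two-tailed test at the 0.05 level
```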
Logistic regression analysis was used for modelling multiple infection risk. The variables included in the model were age, origin, ethnicity, history of cigarette smoking, age at first sexual intercourse, lifetime number of sexual partners, contraceptive method used, history of births and previous abortions, menopause and STD (sexually transmitted disease). A Poisson regression model, suited to count data, was used for evaluating the variables associated with the number of infecting types (1–6 types); this included the same set of variables as the logistic model. The assumption of equivalence between variance and expected value had previously been verified through over-dispersion tests.
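The equivalence between variance and expected value can be inspected descriptively before any formal test. The sketch below computes a simple dispersion index (sample variance divided by mean) for the number of infecting types; it is only a rough check, not the over-dispersion test applied in the study (which is not specified), and the counts and function name are hypothetical.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by the mean; values near 1 are compatible
    with the Poisson assumption, values well above 1 suggest over-dispersion."""
    return variance(counts) / mean(counts)


# Hypothetical number of infecting HR-HPV types per woman (1 to 6)
types_per_woman = [1, 2, 3, 2, 1, 3, 2, 2]
index = dispersion_index(types_per_woman)
```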
Multiple correspondence analysis (MCA) was used for exploring HPV genotype grouping. MCA is a form of factor analysis aimed at grouping a set of variables into a smaller set of related factors or profiles. Such analysis evaluated the degree to which each of the six HPV types participated in configuring determined profiles and the simultaneous association between multiple variables within the profiles so detected (coinfection patterns, cytological findings and their relationship with some variables considered risk factors for HPV infection). This method enabled analysing the profiles configured by multiple variables in terms of different modalities’ similarity or proximity [Reference Escofier and Pagès14]. The significant representation of each variable was taken into account when selecting variables to avoid results being due to chance (as in other types of multivariate analysis) [Reference Escofier and Pagès14, Reference Lebart, Morineau and Piron15]. This led to identifying groups having clinical significance from different groupings of categories of variables and continuous variables’ patterns, i.e. using this method led to identifying how HPV types were grouped and how socio-demographic characteristics, risk factors and cytology result were associated with determined groups of viral types.
Two groups of variables were chosen in this study for such analysis: active variables used in constructing factorial axes and supplementary or illustrative variables enriching interpretation of factorial axes once they have been constructed [Reference Lebart, Morineau and Piron15]. The presence/absence of infection caused by each viral type was considered an active variable for this analysis, whilst some HR-HPV infection risk factors, cytological findings and viral load for each HR type were treated as illustrative variables.
The factorial axes on which the grouping of variables had to be evaluated were initially established by analysing the variance and eigenvalues in the histogram constructed from active variables (i.e. infection by each of the six viral types detected). Two factorial axes were selected, bearing six initial variables in mind (HPV-16, −18, −31, −33, −45 and −58), the percentage of variance explained by each variable and the variance accumulated by the first two factors (close to 50%). Furthermore, contribution values were analysed for interpreting these axes, thereby establishing which categories were meaningful within each axis. The average contribution of all categories of active variables was thus determined (each of the six variables had two categories, yes and no, giving 100/12 = 8·3%) and categories contributing more than 8·3% were selected.
Cosine squared values were also evaluated to complement the contribution values, thereby giving an estimate of the quality of the representation of each active variable on each axis (the closer to 1, the better). Test values were evaluated for determining whether the representation of each viral type within a determined grouping of viral types (axis) was significantly different from 0, giving an assessment of each variable's significance.
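The selection rule based on average contribution can be written out as a short sketch. The threshold follows the arithmetic given in the text (six active variables with two categories each); the function names are hypothetical, the two HPV-31 contributions (10·90 and 11·62) are those reported for axis 2 in the Results, and the remaining value is an invented placeholder.

```python
def average_contribution(n_variables, categories_per_variable=2):
    """Average contribution (%) when every category contributes equally."""
    return 100 / (n_variables * categories_per_variable)


def meaningful_categories(contributions, threshold):
    """Keep the categories whose contribution (%) exceeds the threshold."""
    return [name for name, value in contributions.items() if value > threshold]


threshold = average_contribution(6)   # 100/12, about 8.3%

axis2_contributions = {
    "HPV-31 (category 1)": 10.90,     # reported for axis 2
    "HPV-31 (category 2)": 11.62,     # reported for axis 2
    "placeholder category": 3.00,     # hypothetical value for illustration
}
selected = meaningful_categories(axis2_contributions, threshold)
```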
Selecting illustrative variables involved independently analysing the representation of each category for each variable on each factorial axis (or group of HPV types). Test values were thus taken into account, indicating whether the location coordinates for each category on the two-dimensional plane (determined by both factors) were significantly different from 0, thereby ascertaining whether the representation of each category in a factor was significant, taking ⩽−2 or ⩾2 values as cut-off points (determined by MCA). This led to identifying categories having ⩽−2 or ⩾2 test values, thereby determining the variables having the greatest representation in the grouping of HPV types. Variables associated with a particular HPV group were identified according to the test value sign (negative or positive). Categories having ⩽−2 test values were associated with the grouping found at the negative pole of each factor; categories having ⩾2 test values were associated with the grouping found at the positive pole. The final model only included variables having test values establishing significant representation (⩽−2 or ⩾2) for a particular group of HPV types.
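The cut-off rule for illustrative categories can be summarised in a few lines. This is only a sketch of the decision rule stated above (⩽−2 or ⩾2), not of how SPAD computes test values; the function name and the example values are hypothetical.

```python
def pole_for_test_value(test_value, cutoff=2.0):
    """Return the pole a category is associated with, or None when its
    representation on the axis is not significant."""
    if test_value <= -cutoff:
        return "negative"
    if test_value >= cutoff:
        return "positive"
    return None


# Hypothetical test values for three illustrative categories on one axis
example_test_values = {"category A": -3.1, "category B": 0.8, "category C": 2.0}
poles = {name: pole_for_test_value(value)
         for name, value in example_test_values.items()}
retained = [name for name, pole in poles.items() if pole is not None]
```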
Viral loads were included as continuous illustrative variables. Their representation on each axis was analysed bearing correlation values in mind.
The different profiles’ formation and structure were analysed using a two-dimensional graphical representation giving the socio-demographic variables, risk factors and cytological findings’ projection and each viral type's contribution, as well as the correlation of the six HR-HPV viral loads for each profile configured here. Bearing in mind the distance between the variables represented on the graph, possible dependence and similarity relationships were identified regarding the categories so represented. SPAD-5 software was used for all multiple correspondence statistical analysis.
Absolute and normalised viral loads were analysed; however, only absolute load was used for multivariate analysis (MCA) since this allowed a closer estimation of the total amount of viral DNA. Absolute load values were log10-transformed to facilitate interpreting the results.
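A minimal sketch of the transformation follows. It assumes that only positive samples (more than 0 viral copies, the positivity cut-off used in this study) are transformed, since log10 is undefined at 0; the function name and the copy numbers are hypothetical.

```python
import math


def log10_load(copies):
    """log10 of an absolute viral load; None for samples with no detected copies."""
    if copies <= 0:
        return None
    return math.log10(copies)


# Hypothetical absolute loads (viral copies) for four samples
absolute_loads = [2.5e9, 4.0e7, 1.0e6, 0]
log_loads = [log10_load(value) for value in absolute_loads]
```

On this scale, a difference of one unit corresponds to a tenfold difference in viral copies.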
RESULTS
Socio-demographic characteristics, risk factors and cervical findings
The study population's median age was 43 years (IQR 37–51 years). No statistically significant differences were found when comparing the medians for female age between those with single and multiple infections (P = 0·308).
The prevalence of socio-demographic characteristics, risk factors and cytology and colposcopy findings was determined in the study population (Table 1); however, no statistically significant differences were found when comparing both groups of patients.
P, P value; IUD, intrauterine device; STD, sexually transmitted disease; ASCUS, atypical squamous cells of undetermined significance; LSIL, low-grade squamous intraepithelial lesion; HSIL, high-grade squamous intraepithelial lesion.
* The minimum average monthly income (2014 rate) would be roughly US$ 300.
Even though only five females were diagnosed as having atypical squamous cells of undetermined significance (ASCUS), all of them were positive for infection by more than one viral type. There was greater prevalence of coinfection in females having negative cytology and colposcopy than those with squamous intraepithelial lesions (SIL).
HPV infection and viral load
Multiple infection prevalence was 75·80% (n = 166). HPV-18 was the most frequently occurring viral type for both single (n = 17, 32·08%) and multiple infections (n = 114, 68·67%), followed by HPV-16 in multiple infections and HPV-31 in single infections (Table 2). Statistically significant differences were seen when comparing each viral type's frequency in single and multiple infections (P < 0·001), except for HPV-33 (P = 0·079).
HPV, human papillomavirus; n/a, not applicable.
Absolute viral loads have been given in log10.
Values shown in bold denote statistically significant ones (P < 0·05).
* The Mann–Whitney test was used for comparing single and multiple infection for all HPV types.
Table 2 gives the medians for viral loads for each HR-HPV type. Statistically significant differences were only found for HPV-31 absolute viral load when determining whether there were differences regarding viral load distribution according to the presence of single or multiple infections, this being greater regarding coinfection (P < 0·05). It was seen that viral load was greater for HPV-31 and HPV-33 regarding multiple infections whilst this became reduced regarding HPV-58.
Multivariate analysis (logistic regression, Poisson and MCA models)
A multivariate logistic regression taking infection by more than one HPV type as outcome was carried out; no significant results were found for any of the variables considered (data not shown). A Poisson regression using the number of infecting HPV types as dependent variable was also performed. No significant results were found for any of the variables considered in the univariate analysis, so MCA was used for exploring HPV genotype grouping and the associated variables.
Forming factorial axes
MCA led to identifying two factorial axes, thereby enabling the grouping of variables into two main profiles. When reviewing each active variable's contribution, cosine squared and test values (Table 3), it was found that the presence of HPV-16 and HPV-45 was grouped on the positive pole of axis 1 and the absence of infection by these types was representative on the negative pole of axis 1. Regarding axis 2, HPV-33 and HPV-58 made the highest contribution and had the highest test and cosine squared values for this factor. HPV-31 was also representative for this axis; even though equal cosine squared values were obtained on both axes and test values were higher on axis 1, its contribution values were significant for axis 2 but not for axis 1 (10·90% and 11·62%, both above the 8·3% average contribution) (Table 3). It was found that the presence of HPV-31, HPV-33 and HPV-58 was most representative for the negative pole, whilst the absence of infection by HPV-31 and HPV-58 was most representative for the positive pole of axis 2.
HPV, human papillomavirus.
The values in bold show the axis where HPV types were representative, except for HPV-18 (which had low representation on both axes). The test value's sign indicates the pole of the respective axis where the variable was grouped.
Regarding HPV-18, it was found that contribution and cosine squared values, as well as correlation coefficients were not as high as those for other viral types, thereby indicating this variable's low representation on both axes, i.e. it was not significant for the profiles of groupings configured by the other viral types.
Grouping nominal and continuous illustrative variables on factorial axes (profiles)
Correlation coefficients were obtained for viral loads on each axis (Table 3). HPV-16 and HPV-45 viral loads had the highest coefficients for axis 1, thereby agreeing with that found regarding HPV-16, HPV-31 and HPV-45 representativeness on axis 1. Likewise, HPV-33 and HPV-58 viral loads had higher coefficients for axis 2.
The first profile grouped HPV-16 and HPV-45 (from different species: A9 and A7, respectively), having high viral loads (represented by the length and projection of the vectors). The variables associated with HPV-16 and HPV-45 (bearing significant test values in mind) were Girardot, background of 3–4 births, infection by more than three viral types and the corresponding viral loads. Bogotá, being infected by three or fewer viral types and using no type of contraceptive were the variables associated with absence of infection by these types (Table 4).
IUD, intrauterine device; ASCUS, atypical squamous cells of undetermined significance; LSIL, low-grade squamous intraepithelial lesion; HPV, human papillomavirus.
Test values were obtained from analysing every category for each variable on each factorial axis (or groups of HPV types). Such values indicate the location of coordinates for each category on a two-dimensional plane (constructed from the axes shown in Fig. 1). Significant representation was ⩽−2 or ⩾2 as cut-off points (values in bold). Variables associated with a particular HPV group were identified according to the test value sign (negative or positive).
The second profile (species related) so identified was represented by grouping HPV-31, HPV-33 and HPV-58 (A9 species) on axis 2. The closeness between HPV-33 and HPV-58 denoted greater association between these two types. The variables associated with infection by these types involved Bogotá, linked healthcare regime, hormonal contraceptives, secondary level of education, nulliparity, coinfection by more than three viral types, ASCUS and the corresponding types’ viral loads. On the other hand, when there was no infection by these three types, the representative variables involved Girardot, background of 3–4 pregnancies, primary school educational level, subsidised healthcare regime and coinfection by three or fewer viral types (Table 4).
The two-dimensional graph (Fig. 1) shows how illustrative nominal (socio-demographic characteristics, risk factors and cytological findings), illustrative continuous (viral loads) and active variables (viral types) were associated regarding profile formation. This graph is like a Cartesian plane formed by two axes, each having a positive and negative pole. Each HPV type could be loaded onto each axis (either positively or negatively). This reflected the data given in Table 3; this showed the cosine squared indicating representation, the contribution or association of each viral type on each axis and test value sign showing the position on a determined quadrant. These values (such as coordinates on a Cartesian plane) indicated the position and association of a variable within a profile. An arrow represented each type's viral load whose direction towards a determined quadrant indicated the association with a particular grouping and its length revealed the contribution within the profile. Regarding socio-demographic variables, risk factors and cytological findings (empty boxes), the proximity between boxes and the distance to viral types indicated the strength of their association.
DISCUSSION
MCA led to identifying and evaluating the profiles configured by the dependency relationships between HPV infection and other variables. Two grouping patterns were identified for the six most relevant HR-HPV types in the Colombian population. Two high-risk viral types (HPV-16 and −45) from different species (A9 and A7, respectively) were grouped in one profile and three types from the same species (HPV-31, −33 and −58) were grouped in the other. HPV-16 was not grouped with types from the same species (HPV-31, −33 and −58), which could be related to competition between types; likewise, HPV-18, in spite of being the most frequently occurring viral type in both single and multiple infections, did not tend to become grouped with either of these two profiles.
Previous studies have found negative associations for infection by another viral type when HPV-16 is present, as well as positive associations for other combinations [Reference Mejlhede16]. The present study found two grouping profiles for HR-HPV types, each type being representative for the corresponding profile (except HPV-18). This cannot be used for proving a biological dependence between viral types; however, it might suggest that infection by some types could depend on the existence of one or more viral types and the interaction between coexisting types [Reference Dickson17].
It has been reported that some type of competition between HPV types for the ecological niche is probable in natural coinfection [Reference Dickson18, Reference Murall, McCann and Bauch19]; however, different viral types could coexist in the same cell population, or even in the same cell, whilst interacting with each other [Reference Dickson17]. The viral types could compete in the cervical epithelium for factors necessary for replication, transcription, translation and/or persistence [Reference Xi20]. Such competition could affect the replication of both genomes or just that of a single viral type, which could be reflected in differences in viral load [Reference Dickson17]. However, a species-specific type of association has been described where HPV-16 and HPV-18 viral loads became reduced in the presence of types from the same species, suggesting an interaction between coexisting types and the induction of a cross-reactive immune response [Reference Xi20]. Due to the homology regarding the genome's sequence and similar structure, HPV types could induce such responses and also favour a trend towards grouping [Reference Vaccarella10, Reference McCarthy, Youde and Man21].
However, such differences between viral loads in coinfection could also represent viral integration into the host genome [Reference Dickson17]. HPV-16 had lower absolute viral load regarding coinfection compared with HPV-45. HPV-31 and HPV-58 viral loads were greater in coinfection compared with single infection, whilst HPV-33 had a lower load in multiple infection. These findings suggest an induced response against HPV-16 and HPV-33 or integration into the host genome [Reference Dickson17].
No competition between viral types from the same species was seen for the second profile (HPV-31, −33 and −58); this may have facilitated infection by the three types as cell regulators were available which promoted each viral type's replication, thereby leading to cooperativity between these types [Reference Dickson17, Reference Xi20]. Another factor which could have facilitated the grouping of these viral types (from the same species) was infection temporality. It has been described that the probability of acquiring HPV infection increases if there is already infection by another high risk type [Reference Liaw2, Reference Mendez3, Reference Rousseau22]. HPV-31, HPV-33 and HPV-58 infection would thus have been subsequent, contrary to that observed for HPV-16 and HPV-31 (concurrent).
HPV-18 has frequently been reported in Colombia as being the second most prevalent type after HPV-16, contrary to that found in this study where it was reported most frequently. However, it is worth highlighting that women from just three cities in Colombia were included in the study and that such frequency changes depending on the region being evaluated according to the women's age and whether they have cervical lesions [Reference Camargo11]. P&P programmes must thus pay special attention to such changes regarding prevalence due to HPV-18 being associated with 15% of CC cases [Reference Clifford and Franceschi23].
One of the variables associated with clustering patterns was the origin. Differences between Girardot and Bogotá have been identified in relation to the prevalence and type-specific distribution of HPV, multiple infections (being higher for Girardot) and the time required for clearance (lower in Girardot when A9 species types were present) [Reference Camargo11, Reference Soto-De Leon12]. However, so far there are no reports of previous studies about the association of the origin of women with the grouping of HPV genotypes in coinfection. Factors related to sexual behaviour and cultural characteristics, which might be different between these cities, were not evaluated in this study but could modulate the associations found here.
Regarding parity, 3–4 births was associated with the presence of HPV-16 and HPV-45; this is consistent with the finding that multiparous women have a higher risk of HPV infection [Reference Zhang24]. This has been explained by the prolonged exposure to HPV of the transformation zone on the exocervix [Reference Castellsague, Bosch and Munoz25]. However, having no history of births was found to be associated with HPV-31, HPV-33 and HPV-58 coinfection, which agreed with previous reports [Reference Camargo11]. This could suggest that the risk of infection conferred by a history of births and pregnancies depends on viral genotype and probably on the sexual behaviour of, and contraceptive methods used by, those women [Reference Castellsague, Bosch and Munoz25].
Our findings agree with what has previously been stated, that hormonal contraception is related to HPV infection [Reference Castellsague, Bosch and Munoz25]. As HPV infection is sexually transmitted, it is likely that women who use a hormonal contraceptive method do not use barrier methods such as condoms, which would affect the transmission of the virus and therefore facilitate the acquisition of various HPV types.
Regarding healthcare scheme affiliation, the subsidised and linked regimes include people without capacity to pay, from low socio-economic strata [Reference Guerrero26]. The association between the linked regime and coinfection with HPV-31, HPV-33 and HPV-58, and also with secondary education, agreed with 85% of the HPV infection burden occurring in developing countries [Reference Ferlay27], as well as with studies showing that low-income women are more likely to have HPV [Reference Schluterman28]. Factors such as risky sexual behaviour [Reference Schluterman28] and possibly nutritional deficiencies [Reference Goodman29] in low-income populations may make them more susceptible to HPV infection.
An association was found between the second profile and the presence of ASCUS in cytology. Three viral types were grouped in this profile, having absolute loads of 10^9, 10^7 and 10^6 viral copies (HPV-31, −33 and −58, respectively). It has been described that developing cervical lesions is directly proportional to the level of viral load [Reference Hernández-Hernández30–Reference Ylitalo33]. It has also been found that the number of viral types is significantly correlated with lesion severity [Reference Bello34]. The foregoing would explain ASCUS being found in this profile but not in the first one. However, no associations were found with other cytological findings, probably due to the size of the samples from these groups. Coinfection prevalence in ASCUS patients has been reported as being 10%–40·7% [Reference Rousseau22, Reference Schmitt35]. Spinillo et al. [Reference Spinillo36] suggested that the presence of coinfection is common in ASCUS and LSIL (low-grade SIL) women and is associated with a greater risk of invasive SIL and CC compared with single infection. The foregoing has implications for managing women having abnormal cytology.
As the present study was cross-sectional, most infections may have been prevalent rather than incident; similarly, subsequent and concurrent infection could not be evaluated. However, the results from this study are consistent with some follow-up studies [Reference Liaw2–Reference Thomas4]. Moreover, no variables associated with multiple HPV infection were found when multivariate logistic regression or Poisson analyses were carried out. However, previous studies with a larger sample size in the Colombian population have described factors such as women's place of origin (Andean region, Leticia), ethnicity (indigenous) and a history of births or abortions as being associated with multiple HPV infection [Reference Camargo11, Reference Soto-De Leon37, Reference Del Rio-Ospina38]. Some of these variables configure the profiles identified through the MCA.
This study has provided relevant information regarding multiple HPV infection, information that has not been evaluated or reported by other studies. As real-time PCR was used as the detection technique, infection having low viral load was probably identified which could not have been detected by a qualitative technique [Reference Mendez3]. The present study dealt with the most relevant types for Colombia, meaning that information could be established as baseline for future evaluation of the impact of vaccination on the Colombian population. There is concern that viral types already covered (HPV-16, −18, −6 and −11) may become eliminated and positive selection may occur regarding viral types which are not covered by vaccines, increasing their frequency. Their interaction in natural infection must be taken into account as a tendency or grouping pattern in multiple infections, as has been shown [Reference Rousseau22, Reference Rousseau39, Reference Tota40]. Future research is needed regarding the mechanisms and other factors behind the relationships found between HPV genotypes.
The findings suggested that in coinfection, HR-HPV types tended to become grouped and the relationship between the grouping profiles was different regarding viral type. Biological characteristics such as the species to which they belonged, the similarity of their genomes and the number of viral copies, temporality regarding acquiring infection by different HPV genotypes and competition between viral types could influence the configuration of different grouping patterns. Unlike previous studies, host characteristics were included, allowing a better approach to their association with the presence of one pattern or another. Other yet-to-be-evaluated factors may intervene in the interaction between different coexisting viral types, such as the women's susceptibility, which predisposes some females to HPV infection more than others; such susceptibility could also be type specific, involving synergism between viral types, immune system alterations and sexual partners’ characteristics [Reference Thomas4]. Within the configured profiles, there is an interaction with both host and virus factors, which can be associated with specific grouping patterns, reflecting the importance of evaluating both viral and host characteristics when carrying out studies about infection by multiple HR-HPV types.
ACKNOWLEDGEMENTS
This study was supported by the Basque Development Cooperation Agency and the Spanish International Development Cooperation Agency (AECID) (Project 10-CAP1-0197). The authors would like to thank the people from health institutions who participated in this study, collected samples and filled out questionnaires. They would also like to thank Jason Garry for translating and reviewing this document.
DECLARATION OF INTEREST
None.
ETHICAL STANDARDS
The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.