Introduction
Human papillomaviruses (HPVs) are members of the Papillomaviridae family, with over 150 HPV types having been identified [Reference de Villiers1, Reference Bernard2]. They are small, non-enveloped DNA viruses with a double-stranded circular DNA molecule of ~8000 base pairs (bp) [Reference de Villiers1, Reference Bernard2]. HPVs, usually HPV types 6 or 11, are the aetiological agent of recurrent respiratory papillomatosis (RRP) [Reference Seedat, Combrinck and Burt3]. HPV6 is classified into two variant lineages (A and B), and five B sublineages (B1, B2, B3, B4, B5) [Reference Jelen4]. HPV variant lineages and sublineages are defined based on the nucleotide differences across the complete HPV genome sequences of 1·0–10·0% and 0·5–1·0%, respectively [Reference Burk, Harari and Chen5].
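The lineage and sublineage thresholds quoted above can be illustrated with a minimal sketch. It assumes two sequences that are already aligned to the same length, counts every differing column (including gap columns) as one difference, and assigns a difference of exactly 1·0% to the lineage range; the function names and these boundary choices are illustrative assumptions, not part of the cited classification.

```python
def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of differing columns between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Assumption: a gap opposite a base counts as a single difference.
    mismatches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)
    return 100.0 * mismatches / len(seq_a)


def classify_variant(pct_diff: float) -> str:
    """Map a whole-genome percentage difference onto the quoted ranges."""
    if pct_diff > 10.0:
        return "outside the variant range"
    if pct_diff >= 1.0:       # 1.0-10.0%: different variant lineage
        return "different lineage"
    if pct_diff >= 0.5:       # 0.5-1.0%: different sublineage
        return "different sublineage"
    return "same sublineage"


if __name__ == "__main__":
    # Toy aligned sequences, purely illustrative.
    print(classify_variant(percent_difference("ACGTACGTAC", "ACGTACGTAA")))
```

In practice the percentage would be computed over a complete-genome alignment rather than short toy strings.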
The HPV genome is organized into early (E) and late (L) open reading frames (ORFs) and includes a non-coding region [Reference Lazarczyk6]. The early region ORFs code for proteins that play a role in viral regulation as well as for proteins that are essential in initiating viral DNA replication. The late ORFs code for the two structural capsid proteins [Reference Lazarczyk6, Reference Zheng and Baker7]. The non-coding region, also known as the long control region (LCR) or the upstream regulatory region (URR), contains the crucial elements necessary for control of viral replication and transcription as well as the origin of DNA replication [Reference Lazarczyk6, Reference O'Connor, Chan, Bernard, Myers, Bernard and Delius8]. The LCR extends from the termination of the L1 gene to the first methionine of the E6 gene. It is less conserved than one would expect such a region to be and contains a variety of transcriptional regulatory motifs, the early promoter, virus-encoded E2 regulator binding sites and the viral origin of DNA replication [Reference O'Connor, Chan, Bernard, Myers, Bernard and Delius8].
Nucleotide alterations in the HPV6 LCR have previously been described [Reference Jelen4, Reference Kocjan9–Reference Heinzel11]. The clinical significance of these variations is still not clear. Conflicting results have been described in studies investigating the effect of LCR alterations on viral enhancer activity in assays using reporter gene constructs [Reference Heinzel11–Reference Lace15]. While almost equal luciferase activities were demonstrated with HPV6 variants where point mutations were observed in the LCR, reporter gene activity using HPV-derived LCRs from two variants that included duplications of 459 bp and 236 bp within the 3’ region of the LCR was significantly raised [Reference Grassmann14]. Similarly, functional analysis of nucleotide substitutions in seven different HPV16 LCR isolates revealed enhanced transcriptional activity of the P97 promoter for two variants and no change in transcriptional activity for the other five [Reference Kämmer12]. It is therefore unclear whether mutations within the LCR influence transcriptional activity, and it is possible that the site where the mutation occurs, or the type of mutation, might play a role in the alteration of replication or transcription.
In a previous study, we identified a 170 bp duplication within the LCR of an HPV6 isolate obtained from a patient with aggressive RRP [Reference Combrinck10]. Given the uncertainty regarding the influence of such alterations, further investigation of the effect of the duplication on function was warranted. Hence the complete genome of the isolate was determined, the promoter sequence from the HPV LCR including the 170 bp duplication was placed upstream of a heterologous reporter gene, and the activity of the reporter gene product was determined using transfected cells.
Materials and Methods
The complete nucleotide sequence data was determined for the HPV6 isolate laboratory number VBD19/10 with a 170 bp duplication within the LCR (GenBank accession no. JN573172.1). Sequence data were also determined for HPV6 isolate VBD02/10 (GenBank accession no. JN573167.1), which did not have alterations within the LCR relative to the prototypic strain and was selected as a control for reporter gene studies.
The biopsies were collected from patients treated for RRP at the Universitas Academic Hospital in Bloemfontein, South Africa in 2010. The clinical features of the patients are summarized in Table 1. In neither patient had disease spread distal to the subglottis, and neither had required a tracheostomy. Written informed consent for inclusion in the study was obtained from the parents of these patients. The study was approved by the Ethics Committee of the Faculty of Health Sciences, University of the Free State (ECUFS 6/2011).
Two pairs of primers were designed that amplified two overlapping fragments of the full-length HPV6 genome. The complete genome sequence data of one HPV6a (L41216), two HPV6b (NC_001355 and X00203) and one HPV6vc (AF092932) isolates were retrieved from GenBank. The sequence data was aligned in Clustal X version 2.1 [Reference Larkin16]. Using the alignment file, a primer pair designated HPV6_F1 (5’-CAG TTA TAG GGG AAG CAC CAG-3’) at position 1826–1846 relative to HPV6b, and HPV6_R1 (5’-CTG GTA ATA AGT TCT AAG GGC GG-3’) at position 6354–6332 relative to HPV6b, was designed to amplify a region of ~4529 bp. A second primer pair designated HPV6_F2 (5’-GTA TCC AAA GTT GTT GCC ACG G-3’) at position 5837–5858 relative to HPV6b, and HPV6_R2 (5’-CTC ATA CAA AAG TAC GAT TTC CCA G-3’) at position 2300–2276 relative to HPV6b, was designed to amplify an overlapping region of ~4460–4631 bp.
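The primer coordinates and expected amplicon sizes can be checked with a short sketch. The second amplicon spans the point where numbering of the circular genome restarts, so its length depends on the genome length; the value of 7996 bp used here is an assumption (it is not stated in the text) chosen because it reproduces the lower bound of ~4460 bp quoted above. The helper names are hypothetical.

```python
GENOME_LENGTH = 7996  # assumed length of the HPV6b reference (not given in the text)

# Primer name -> (sequence as printed, 5' coordinate, 3' coordinate) relative to HPV6b.
PRIMERS = {
    "HPV6_F1": ("CAG TTA TAG GGG AAG CAC CAG", 1826, 1846),
    "HPV6_R1": ("CTG GTA ATA AGT TCT AAG GGC GG", 6354, 6332),
    "HPV6_F2": ("GTA TCC AAA GTT GTT GCC ACG G", 5837, 5858),
    "HPV6_R2": ("CTC ATA CAA AAG TAC GAT TTC CCA G", 2300, 2276),
}


def primer_length(sequence: str) -> int:
    """Number of nucleotides in a primer written with spaces between triplets."""
    return len(sequence.replace(" ", ""))


def span_length(start: int, end: int) -> int:
    """Inclusive length of a coordinate span, in either orientation."""
    return abs(end - start) + 1


def amplicon_length(fwd_5prime: int, rev_5prime: int, genome_length: int) -> int:
    """Distance from forward to reverse primer 5' ends on a circular genome."""
    if rev_5prime >= fwd_5prime:
        return rev_5prime - fwd_5prime + 1
    # The product wraps past the last base and continues from position 1.
    return (genome_length - fwd_5prime + 1) + rev_5prime


if __name__ == "__main__":
    for name, (seq, start, end) in PRIMERS.items():
        print(name, primer_length(seq), span_length(start, end))
    print(amplicon_length(1826, 6354, GENOME_LENGTH))
    print(amplicon_length(5837, 2300, GENOME_LENGTH))
```

An isolate carrying insertions between the second primer pair would give a correspondingly longer product, which is one reading of the size range quoted for that amplicon.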
The primer pairs amplified two overlapping regions that targeted the complete HPV6 genome of each isolate. The amplification of each region was performed using the Phusion HotStart DNA polymerase-mediated PCR amplification kit (Finnzymes, Finland) according to the manufacturer's instructions using the following cycling conditions: denaturation at 98 °C for 30 s, followed by 30 cycles of 98 °C for 10 s, 47 °C for 30 s and 72 °C for 135 s, and a final extension at 72 °C for 10 min. PCR amplicons were purified for nucleotide sequence determination.
Purified amplicons were submitted to the National Institute for Communicable Diseases (NICD), Sandringham, Johannesburg for determination of nucleotide sequences using the Roche GS Junior System (Roche 454 Life Sciences, USA). The sequence data obtained was mapped against the reference HPV6b using GS Reference Mapper Roche software and assembled using the GS De Novo Assembler Roche software. The contiguous files obtained from the GS Reference Mapper and GS De Novo Assembler Roche software were assembled and aligned against the sequence of the corrected HPV6b reference genome (PAVE ID: HPV6REF), HPV6a (GenBank accession no. L41216) and HPV6vc (GenBank accession no. AF092932) using Geneious Pro v. 4.8.5 (www.geneious.com) [Reference Kearse17]. The aligned sequence data was edited in Geneious Pro v. 4.8.5 according to the phred confidence values observed in Gap4 v. 4.11 [Reference Bonfield, Smith and Staden18].
The complete genome sequence data of the isolate was aligned with complete sequence data of 193 HPV6 isolates from GenBank in MEGA v. 6.06 [Reference Tamura19] using the ClustalW algorithm, and a maximum likelihood phylogenetic tree was constructed.
Functional analysis of LCR region
The purified DNA amplicons were cloned into pBlue-TOPO reporter vector (Invitrogen, USA) according to the manufacturer's instructions. Positive transformants were identified and confirmed by sequence analysis. Purified recombinant plasmid was prepared from overnight cultures and transfection grade DNA plasmid was purified using the Qiagen Plasmid Mini kit (Qiagen, USA) according to the manufacturer's instructions.
Baby hamster kidney (BHK) 21 (ATCC CCL-10) cells were grown to 90–95% confluency in growth media, Dulbecco's Modified Eagle's Medium (DMEM; Lonza, Belgium) with 5% fetal calf serum (FCS; Delta Bioproducts, South Africa), 1% L-glutamine (L-Glu; Sigma Aldrich, UK), 1% non-essential amino acids (NEAA; Lonza), and 1% penicillin/streptomycin (Pen/Strep) antibiotics (Sigma Aldrich). A ratio of 1·6–1·7 µg DNA:5 µl lipofectamine was determined as optimal based on transfection efficiencies and plasmid. Untransfected BHK cells were used as negative controls. Levels of active β-galactosidase expressed from BHK cells transfected with plasmids expressing the lacZ gene were determined using the β-galactosidase assay kit (Invitrogen) according to the manufacturer's instructions. Absorbance values were measured at 420 nm using a Spectronic Genesys 5 spectrophotometer (Thermo Electron Corp., USA). The protein concentration of the cell lysate was determined using the Qubit protein assay (Invitrogen) according to the manufacturer's instructions. The specific activity of the cell lysate, determined in a total volume of 8 × 10⁵ nl, was calculated as follows:
where ONPG = ortho-nitrophenyl-β-galactoside; 4500 = the extinction coefficient; t = time of incubation (min) at 37 °C; and mg protein = the amount of protein assayed. Two independent transfection experiments were performed, each with duplicates and three readings. The ratio of pBlueTopo_19/10 to pBlueTopo_02/10 was determined for each experiment in order to normalize data from different experiments.
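The calculation referred to above can be written out from the quantities defined in the text. The form below is a reconstruction that assumes the usual two-step calculation for an ONPG-based β-galactosidase assay; the units of the extinction coefficient and the 1 cm path length are assumptions not stated in the text.

```latex
\[
\text{nmol ONPG hydrolysed} =
  \frac{A_{420} \times \left(8 \times 10^{5}\ \text{nl}\right)}
       {4500\ \text{nl}\,\text{nmol}^{-1}\,\text{cm}^{-1} \times 1\ \text{cm}}
\]
\[
\text{specific activity} =
  \frac{\text{nmol ONPG hydrolysed}}{t \times \text{mg protein}}
\]
```

Under these assumptions the specific activity is expressed in nmol ONPG hydrolysed per minute per mg protein.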
Data analysis was performed using IBM SPSS Statistics v. 22 (IBM Corp., USA). The specific activities of the cell lysate measured were analysed using the independent samples t test and 95% confidence intervals (CIs) were calculated for the ratio of pBlueTopo_19/10 to pBlueTopo_02/10.
Results
The complete genome sequence data for isolate VBD19/10 was aligned for identification of variants. Nucleotide positions were numbered relative to the corrected HPV6b reference genome (PAVE ID: HPV6REF). Genome data for HPV6b from lineage A, HPV6vc (AF092932) from sublineage B1, and HPV6a (L41216) from sublineage B3, were included in the alignment for comparing genomic variation. Mutations observed in the full-length genome of VBD19/10 with base positions relative to the HPV6b reference genome are summarized in Table 2 with the amino acid substitutions and corresponding residue positions in Table 3.
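Numbering variants relative to a reference, as described above, can be sketched as follows. The sketch assumes a pairwise alignment in which gaps are written as "-", reports an insertion at the last reference position preceding it, and reports multi-base insertions or deletions one base at a time; the function name and the toy sequences in the usage example are hypothetical.

```python
def list_variants(ref_aln: str, iso_aln: str):
    """List differences of an aligned isolate, numbered by reference position."""
    if len(ref_aln) != len(iso_aln):
        raise ValueError("aligned sequences must have the same length")
    variants = []
    ref_pos = 0  # position in the ungapped reference
    for ref_base, iso_base in zip(ref_aln.upper(), iso_aln.upper()):
        if ref_base != "-":
            ref_pos += 1
        if ref_base == iso_base:
            continue
        if ref_base == "-":
            # Extra base in the isolate, placed after reference position ref_pos.
            variants.append((ref_pos, "insertion", iso_base))
        elif iso_base == "-":
            variants.append((ref_pos, "deletion", ref_base))
        else:
            variants.append((ref_pos, "substitution", f"{ref_base}>{iso_base}"))
    return variants


if __name__ == "__main__":
    # Toy alignment, purely illustrative.
    for variant in list_variants("ACG-TAC", "ATGGT-C"):
        print(variant)
```

Counting reference positions only where the reference has a base keeps the numbering stable when the isolate carries insertions such as the duplication in the LCR.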
Base positions where mutations were observed are indicated at the top vertically. The shaded blocks indicate no variation from the reference genome. Insertions are indicated by I and deletions are indicated by d.
I1 = TACATTATTGTATA.
I2 = ATATGTTTATTGCCACTGCA.
I3 = TCACCTGGCGCCAGGGTGCGGTATTGCCTTACTCATATGTTTATTGCCACTGCAATAAACCTGTCTTTGTGTTATACTTTTCTGCACTGTAGCCAACTCTTAAAAGCATTTTTGGCTTGTAGCAGAACATTTTTTTGCTCTTACTGTTTGGTATACAATAACATAAAAATG.
I4 = T.
* NNCR3 is a non-classic non-coding region between the stop codon of genomic region E5b and start codon of genomic region L2.
None of the mutations observed in the L1 gene modified the amino acid.
* Residue positions are relative to the HPV6b prototype.
† Nucleotide positions are relative to the HPV6b prototype.
The phylogenetic tree showed the isolate to form part of the B3 sublineage, with the closest isolates being ZA54/10 and ZA65/11 (Fig. 1). Pairwise distances relative to these isolates and to the corrected reference HPV6b isolate from lineage A, HPV6vc from sublineage B1, and HPV6a from sublineage B3 are shown in Table 4.
Functional analysis
The level of β-galactosidase activity expressed from the lacZ gene of cells transfected with the plasmid containing the duplication within the LCR (pBlueTopo_19/10) was significantly higher than the level expressed from cells transfected with the plasmid without the duplication (pBlueTopo_02/10) in both experiments. In experiment 1, the specific activity for pBlueTopo_02/10 ranged from 9·4 to 12·7 (mean 10·89) and that for pBlueTopo_19/10 from 35·3 to 57·2 (mean 44·6) (P < 0·001), while in experiment 2, pBlueTopo_02/10 ranged between 39·1 and 92·1 (mean 70·0) and pBlueTopo_19/10 ranged between 113·5 and 287·8 (mean 193·6) (P = 0·002) (Fig. 2). The ratio of specific activity detected in cells transfected with plasmid pBlueTopo_19/10 to that detected in cells transfected with plasmid pBlueTopo_02/10 was 3·4 (95% CI 2·9–3·9). The difference in absolute values between the two experiments is probably because they were performed at different times using different passages of cells.
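The normalization by ratio can be illustrated from the reported means alone. The sketch below assumes that the overall ratio is the average of the per-experiment ratios of means, which is one plausible reading of the analysis rather than a confirmed description of it; the variable and function names are hypothetical. The confidence interval cannot be reproduced this way because the individual readings are not given.

```python
# Mean specific activities as reported for each experiment.
REPORTED_MEANS = {
    "experiment 1": {"pBlueTopo_02/10": 10.89, "pBlueTopo_19/10": 44.6},
    "experiment 2": {"pBlueTopo_02/10": 70.0, "pBlueTopo_19/10": 193.6},
}


def fold_change(means: dict) -> float:
    """Ratio of the duplication construct to the control construct."""
    return means["pBlueTopo_19/10"] / means["pBlueTopo_02/10"]


# One ratio per experiment removes the between-experiment difference in scale.
ratios = {name: fold_change(means) for name, means in REPORTED_MEANS.items()}

# Assumption: the overall ratio is the simple average of the two ratios.
mean_ratio = sum(ratios.values()) / len(ratios)

if __name__ == "__main__":
    for name, ratio in ratios.items():
        print(f"{name}: {ratio:.2f}")
    print(f"mean ratio: {mean_ratio:.1f}")
```

Expressing each experiment as a ratio to its own control is what allows the two experiments, whose absolute values differ several-fold, to be combined.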
Discussion
The complete genome sequence was determined and in vitro functional analysis was performed on HPV6 isolate VBD19/10. In total, mutations were observed at 157 nucleotide positions and included nucleotide substitutions, deletions and insertions, resulting in amino acid changes at 43 residue positions.
Four amino acid mutations that have been described previously, at residue 222 in E2, residue 6 in E4, and residues 65 and 55 in E5b, resulted in changes in the hydrophobic property of the protein. These are probably more significant than the two novel mutations that did not alter the hydrophobicity index, as hydrophobic amino acids are likely to be located within the protein folds. The E2 gene regulates the transcription of the HPV genome and a difference in transcriptional activation can be a result of a difference in the specific binding of E2 to DNA. Thus, a modification in the amino acid properties could alter the folding of the protein and may influence the specific binding of E2 to DNA and ultimately influence transcription of the viral genome. Moreover, the E4 and E5 proteins modify the cellular environment to assist indirectly in amplification of the viral genome, with reports of transforming activities for the E5 gene. Therefore, mutations in the E4 and E5 proteins, especially non-synonymous mutations, could have extensive effects on the structure of the proteins. The implications and significance of each amino acid change would require further investigation, possibly using mutagenesis studies.
The biological functioning of promoter sequences can be investigated using reporter vectors in which PCR-amplified regions are inserted into promoterless vectors upstream of a reporter gene. A 170 bp duplication was previously identified in the central segment of the LCR of an HPV6 isolate from a patient with RRP [Reference Combrinck10]. The duplication was located where the majority of transcription factor-binding sites are and between two E2-binding sites [Reference O'Connor, Chan, Bernard, Myers, Bernard and Delius8]. In HPV11, this central segment of the LCR has been shown to serve as an enhancer specific for epithelial transcription [Reference Chin, Broker and Chow20]. Quantitation of the enzyme β-galactosidase expressed by the reporter gene, lacZ, was used as an indicator of the functioning of the promoter sequences. Although the specific activity level of β-galactosidase expressed from cells transfected with the recombinant plasmid pBlueTopo_19/10 was 3·4 times higher than the activity determined from cells transfected with recombinant plasmid pBlueTopo_02/10, the conclusions drawn need to take into consideration the limitations of the assay. Estimations of transfection efficiencies based on staining cells for β-galactosidase protein showed higher transfection efficiencies for the recombinant plasmid pBlueTopo_19/10. Higher transfection efficiency could be responsible for detection of higher levels of specific activity. Equally, however, higher transfection efficiencies may have been detected by staining because of higher levels of protein expression. Hence the outcome suggests that the influence of the 170 bp duplication on the levels of transcription of downstream genes warrants further investigation.
Mutations that occur in the LCR could have an enhancing or inhibiting effect on the replication and transcription of the virus. Reporter gene technology has been used successfully in the analysis of transcriptional regulation by HPV6 E2 proteins and in the evaluation of the effect of sequence modifications in the LCR of HPV6 isolates on gene expression [Reference Grassmann14, Reference Kovelman21]. Sequence rearrangements in the LCR did not seem to affect the promoter activity, but where large duplications in the LCR were identified, these resulted in enhanced promoter activity, and it was therefore stated that the duplication may have caused an increase in the oncogenic potential of HPV6 variants ascribed to overexpression of E6 and E7 [Reference Grassmann14]. Further investigations and long-term studies of patients are warranted to determine if it is possible to use duplications as prognostic indicators or biomarkers of disease. The effect may also depend on the site at which the mutation occurs, for instance at the origin of replication or at E2-binding sites, or on whether additional transcription factor-binding sites were produced. This, however, needs further investigation, namely mutagenesis to introduce the same duplication into the LCR of the control in order to determine whether an increase in expression levels is due to an increase in promoter activity or to a higher transfection efficiency. If the duplication enhanced the activity and consequently the expression of the protein, it could be used as a molecular determinant in the prediction of aggressiveness of RRP disease. Isolate VBD19/10, which was obtained from a patient with aggressive RRP disease, showed mutations in other regions of the genome in addition to the duplication observed in the LCR. The role of these mutations also warrants investigation. The combination of the mutations could also contribute to the aggressive behaviour of the disease, but this requires further investigation.
In conclusion, this study provides complete genome sequence data on an HPV6 isolate from a patient with aggressive RRP. We have possibly identified a molecular determinant that could influence promoter activity. However, further mutagenesis investigations will be required to substantiate the exact role of duplications on promoter activity and possibly disease severity.
ACKNOWLEDGEMENTS
We acknowledge Mr Stephanus Riekert for providing resources and assistance on the High Performance Computing Unit's infrastructure at the University of the Free State.
This project was funded by grants from the South African Society of Otorhinolaryngology-Head and Neck Surgery Research Fund, National Health Laboratory Service Research Trust and University of the Free State Cluster funding.
DECLARATION OF INTEREST
None.