Journal of Virology, November 2005, p. 13630-13640, Vol. 79, No. 21
0022-538X/05/$08.00+0 doi:10.1128/JVI.79.21.13630-13640.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
Department of Molecular Biology and Biochemistry, University of California, Irvine, Irvine, California 92697,1 Ludwig Institute for Cancer Research, Sao Paulo, Brazil,2 Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences,3 National Health Laboratory Service, University of Cape Town, South Africa,4 Queen Mary Hospital and the University of Hong Kong, Hong Kong,5 Department of Pathology, University of Oklahoma Health Sciences Center, Oklahoma City, Oklahoma 73104,6 Department of Obstetrics and Gynecology, Buddhist Tzu Chi General Hospital, Hualian, Taiwan,7 Royal Infirmary of Edinburgh, Edinburgh, Scotland,8 Institute for Molecular Pathology, University of Heidelberg, Heidelberg, Germany,9 Departamento de Biologia Celular, Instituto de Biologia, Universidade de Brasilia, DF, Brazil,10 HPV Laboratory, IDIBELL, Institut Català d'Oncologia, Barcelona, Spain, and Group Infection and Cancer, Universidad de Antioquia, Medellín, Colombia,11 Epidemiology and Cancer Registration Unit, IDIBELL, Institut Català d'Oncologia, Barcelona, Spain,12 International Agency for Research on Cancer, Lyon, France,13
Received 17 June 2005/ Accepted 4 August 2005
An HPV type is defined as a separate taxon when the nucleotide sequence of its L1 gene differs from that of any other HPV type by at least 10%. The additional term "subtype" defines an HPV genome whose L1 nucleotide sequence differs by more than 2% and less than 10% from that of the closest type (11). Only very few HPV subtypes have been described. Our recent analyses of many isolates of two HPV types (HPV-44 and HPV-68) that have given rise to subtypes led to the calculation of dichotomic phylogenetic trees, which indicated an ancient origin of the "subtype" taxa (6).
While subtypes of HPV types are rare, each HPV type comprises numerous genomic variants whose numbers are nevertheless relatively small (tens or hundreds of variants) in comparison to the quasispecies formed by rapidly evolving RNA viruses, whose diversity is several orders of magnitude higher. Variants of HPV types differ by less than 2% of their L1 nucleotide sequences but slightly more in the long control region (LCR), which does not encode genes and is therefore less restricted in its ability to accumulate and tolerate mutations (15, 34). Genomic variation of HPV-16 and HPV-18 has been particularly well studied. Variants of these two types form phylogenetic trees with branches formed by variants with high prevalence in cohorts in Africa, Europe, or East Asia, with one of the East Asian phylogenetic branches of variants extending into populations of Native Americans (16, 24). Thus, these trees reflect the evolution and worldwide migration patterns of the human host (3) and suggest that certain variants diverged at approximately the same time as the major human ethnic groups formed.
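The L1-based thresholds given above for types, subtypes, and variants can be written as a small decision rule. The following is a minimal sketch, not a published tool: the function name is hypothetical, and because the text leaves a difference of exactly 2% unassigned, the sketch assumes that this boundary case is treated as a variant.

```python
def classify_l1_divergence(percent_difference: float) -> str:
    """Classify an HPV genome by the percent difference of its L1
    nucleotide sequence from the closest known HPV type.

    Thresholds follow the text: >= 10% is a separate type, more than 2%
    and less than 10% is a subtype, less than 2% is a variant.
    Assumption: exactly 2% (not assigned in the text) counts as a variant.
    """
    if not 0.0 <= percent_difference <= 100.0:
        raise ValueError("percent_difference must be between 0 and 100")
    if percent_difference >= 10.0:
        return "type"
    if percent_difference > 2.0:
        return "subtype"
    return "variant"


if __name__ == "__main__":
    for value in (12.0, 5.0, 1.0):
        print(value, classify_l1_divergence(value))
```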
Subsequent studies have addressed the genomic variabilities of HPV-6 and -11 in genital warts (14); HPV-2, -27, and -57 in common warts (8); HPV-5 in epidermodysplasia verruciformis (10); and rare HPV types in small numbers of cervical lesions (29). All of these studies agree that all HPV types (i) have given rise to a limited and relatively small number of genomic variants, (ii) show a small fraction of genomic diversity (less than 2% in the L1 gene and only slightly more in hypervariable regions like the LCR), and (iii) show geographic specificity, at least for some variants. However, these studies did not document a close linkage between intratype HPV evolution and the evolution and migration of humans as revealed in the studies of HPV-16 and -18 (16, 24).
This paper is the third of three recent publications that address the issue of intratype evolution of HPVs. In the first report, we studied the diversity of HPV types 44 and 68 and their subtypes and variants (6). In the second report, we addressed the diversity of HPV-53, -56, and -66 (26), which form HPV species 6. This represents the third major group in the taxonomy of HPVs in addition to the two species formed by the relatives of HPV-16 and -18. The current study reports on the diversity of four HPV types, HPV-31, -35, -52, and -58, which are closely related to HPV-16. We have been able to examine the intratype diversity of these rare HPVs in collaboration with colleagues involved in HPV studies throughout the world. The data we report confirm the notion of a limited genomic diversity for each of these HPV types and geographic clustering of some HPV variants that suggests slower rates of molecular evolution than previously proposed for HPV-16 and -18 (16, 24). In contrast to HPV-16 and -18, which diversified only by nucleotide exchanges, some of these rarer HPV types show significant amounts of deletions and insertions. This appears to be a rarely encountered mechanism of HPV evolution. Amino acid substitutions of HPV oncoproteins and capsid proteins are rare but do occur and have to be considered in studies of pathogenicity and vaccine development.
Phylogenetically informative amplicons of the LCR. As in our previous studies on the genomic diversity of HPV types, we aimed to amplify a segment of the LCR that would likely contain a sufficient number (e.g., 20 to 50) of mutations to generate stable phylogenetic trees. No effort was made to amplify the whole LCR. For studying HPV-31, a 523-bp segment between the genomic positions 7527 and 137 was amplified with primers 31-8aF (5'-AGTAGTTCTGCGGTTTTTGGTTTC-3') and 31-8aR (5'-CCGAGGTCTTTCTGCAGGATTTTT-3'). The genomic sequence was established for 503 bp of the 523-bp fragment. Of the 70 isolates, the previously known sequences are deposited in GenBank under the accession codes AY453992 to AY454037. In order to exclude possible PCR artifacts, all samples were amplified twice and both strands were sequenced twice. The same precaution applied to the treatment of the HPV-35, -52, and -58 samples. For studying HPV-35, we amplified an 893-bp fragment between genomic positions 7146 and 187 with the primers 35LCR-F (5'-TATATTATGTGTTGTGGTGCCTGTTTG-3') and 35LCRa-R (5'-CGTTTTCGGTCACTCCCTGTTTT-3'). The genomic sequence was established for 814 bp of the 893-bp fragment. Of the 45 isolates, the previously known sequences are deposited in GenBank under the accession codes AY454038 to AY454064. Genomic segments of HPV-52 were amplified by PCR using the primers HPV-52-LF (5'-TTGTCTGTTGGGTAATTGTCTGTG-3') and HPV-52-LR (5'-CGTAACCGGTCGTGTAGTGC-3'), which generated a 750-bp segment between the genomic positions 7158 and 7907. The genomic sequence was established for 637 bp of the 750-bp fragment. Genomic segments of HPV-58 were amplified by PCR with the primers 58-UF (5'-TATGAGTAAGGTGCTGTCCCT-3') and 58-UR (5'-CGGTCTGACCGAAACCGGTGC-3'), which generated a 545-bp segment between the genomic positions 7345 and 68. The genomic sequence was established for 461 bp of the 545-bp fragment.
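Several of the amplicons above span the origin of the circular HPV genome, so the end position is numerically smaller than the start position (for example, 7527 to 137 for HPV-31). The following minimal sketch shows how such a length can be computed. The function name is hypothetical, and the genome lengths used in the example (7,912 bp for HPV-31 and 7,942 bp for HPV-52) are assumptions that are not stated in the text.

```python
def amplicon_length(start: int, end: int, genome_length: int) -> int:
    """Length in bp of a segment from `start` to `end` (1-based, inclusive)
    on a circular genome of `genome_length` bp.

    If `end` is smaller than `start`, the segment is taken to wrap around
    the origin of the circular genome.
    """
    if not (1 <= start <= genome_length and 1 <= end <= genome_length):
        raise ValueError("positions must lie within the genome")
    if end >= start:
        return end - start + 1
    # Wrap-around: from start to the last base, then from base 1 to end.
    return (genome_length - start + 1) + end


if __name__ == "__main__":
    # Assumed genome lengths; the positions are those given in the text.
    print(amplicon_length(7527, 137, 7912))   # HPV-31 LCR segment
    print(amplicon_length(7158, 7907, 7942))  # HPV-52 LCR segment
```

Under the assumed genome lengths, the two calls reproduce the 523-bp and 750-bp fragment sizes reported above.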
Amplification with E6 and L1 consensus primers. The HPV-52, -58, -31, and -35 E6 genes were amplified with type-specific primers. For HPV-31, E631-F (5'-AAAAGTAGGGAGTGACCGAAAGTGG-3') and E631-R (5'-TCGGGTAATTGCTCATAACAGTGGA-3') were used, resulting in a 625-bp fragment; for HPV-35, E635-F (5'-CGAAAACGGTTGCCATAAAAG-3') and E635-R (5'-TGCCTCGGGTTCCAAATCTA-3') were used, resulting in a 578-bp fragment; for HPV-52, E652-F (5'-ACGCACGGCCATGTTTGAGGAT-3') and E652-R (5'-TAATTGCTTGTGGCTTGTTCTGCTTGTC-3') were used, resulting in a 622-bp fragment; and for HPV-58, E658-F (5'-AGGCTACTGCAGGACTATGTTC-3') and E658-R (5'-AGCGTTGGGTTGTTTCCTCTCA-3') were used, resulting in a 503-bp fragment. The genomic sequences were established for 447 bp (HPV-52) and 450 bp (HPV-58, -31, and -35). A PCR fragment (450 bp) of the L1 gene was amplified with the consensus primers MY09/MY11 (2, 4), and the genomic sequences were established for 351 bp (HPV-31 and -35) and 421 bp (HPV-52 and -58) of these 450 bp.
PCR amplification, sequence analysis, and phylogenetic evaluations. PCR mixtures contained 20 mM Tris, pH 8.0, 100 mM KCl, 200 µM of each deoxynucleoside triphosphate, 2 mM MgCl2, 10 µM of each sense and antisense oligonucleotide primer, and 1 unit of Taq polymerase (Promega, Madison, WI). Forty amplification cycles were run in the Eppendorf Master Cycler with a 94°C denaturing step (30 s), a 60°C annealing step (30 s), and a 72°C extension step (60 s). PCR amplicons were separated electrophoretically on a 2% agarose gel, purified with Exo I and alkaline phosphatase (USB, Cleveland, OH), and applied to enzymatic extension reactions for DNA sequencing using the ABI PRISM BigDye Cycle sequencing kit (AB Applied Biosystems). Both strands were sequenced with the same forward and reverse primers as those used for PCR amplification of the LCR unless stated differently. The sequencing reactions were purified by ethanol-sodium acetate precipitation and then run on an ABI Prism 3100 sequencer. The mutations were analyzed and determined by the ALIGN program at the GENESTREAM network server (25) (http://www2.igh.cnrs.fr/bin/align-guess.cgi). The neighbor-joining and unweighted-pair group method with arithmetic average (UPGMA) trees were constructed by using Mega version 2.1. All reference (or prototype) sequences were taken from the Los Alamos-based HPV sequence database (http://hpv-web.lanl.gov/stdgen/virus/hpv/compendium/htdocs/). In the case of HPV-35, the prototype sequence is reported at this site as HPV-35H, which corrected several mistakes published in a previous report.
Nucleotide sequence accession numbers. The new sequences of HPV-31 are published with the GenBank accession codes DQ057247 to DQ057270. The new sequences of HPV-35 are published with the GenBank accession codes DQ057271 to DQ057289. The sequences of all 66 isolates of HPV-52 are published with the GenBank accession codes DQ057080 to DQ057145. The sequences of all 101 isolates of HPV-58 are published with the GenBank accession codes DQ057146 to DQ057246. The sequences obtained by E6 gene amplification are published with the following GenBank accession codes: for E6-amplified HPV-31, DQ057302 to DQ057308; for E6-amplified HPV-35, DQ057309 to DQ057314; for E6-amplified HPV-52, DQ057290 to DQ057295; and for E6-amplified HPV-58, DQ057296 to DQ057301. The GenBank accession codes for the L1 MY09/MY11 segment are as follows: for HPV-52, DQ057315 to DQ057320; for HPV-58, DQ057321 to DQ057326; for HPV-31, DQ057327 to DQ057333; and for HPV-35, DQ057334 to DQ057339.
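Accession ranges such as those listed above can be expanded into individual codes, which also allows a quick check that a range matches the stated number of isolates. The following is a minimal sketch with a hypothetical helper name. It assumes that both ends of a range share the same letter prefix and the same number of digits, as is the case for every range in this section.

```python
import re


def expand_accession_range(first: str, last: str) -> list[str]:
    """Expand an accession range (e.g. DQ057080 to DQ057145) into a list
    of individual accession codes, preserving zero padding.

    Assumption: both codes have the form <letters><digits> with the same
    prefix and digit count. The two ends may be given in either order.
    """
    pattern = re.compile(r"([A-Z]+)(\d+)")
    m_first = pattern.fullmatch(first)
    m_last = pattern.fullmatch(last)
    if not m_first or not m_last:
        raise ValueError("accession codes must look like letters plus digits")
    if m_first.group(1) != m_last.group(1):
        raise ValueError("accession prefixes differ")
    if len(m_first.group(2)) != len(m_last.group(2)):
        raise ValueError("accession digit counts differ")
    prefix = m_first.group(1)
    width = len(m_first.group(2))
    low, high = sorted((int(m_first.group(2)), int(m_last.group(2))))
    return [f"{prefix}{number:0{width}d}" for number in range(low, high + 1)]


if __name__ == "__main__":
    hpv52 = expand_accession_range("DQ057080", "DQ057145")
    hpv58 = expand_accession_range("DQ057146", "DQ057246")
    print(len(hpv52), hpv52[0], hpv52[-1])
    print(len(hpv58), hpv58[0], hpv58[-1])
```

For the HPV-52 and HPV-58 ranges, the expanded lists contain 66 and 101 codes, in agreement with the isolate counts given in the text.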
We analyzed a 503-bp fragment of the LCR of 69 HPV-31 samples and identified 28 variants (including the reference genome) with 28 point mutations and a 7-bp deletion relative to the HPV-31 genome. The deletion occurred in only four samples, namely, MX80, MX635, MX701, and MX1144. Maximal distance between any two variants was 14 mutations (2.8% of the 503-bp sequence), considering the deletion a single event. A total of 17 of the 28 variants were found in only a single sample, and 6 variants were found in two samples. The HPV-31 reference clone was found in 26 samples.
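The maximal distance above is reported both as a mutation count and as a percentage of the sequenced segment. The following minimal sketch (the function name is hypothetical) shows that conversion. The example uses the HPV-31 figures from this paragraph together with the corresponding HPV-35 and HPV-58 figures reported in this article (15 mutations in 814 bp and 10 mutations in 461 bp), with each insertion or deletion counted as a single event, as in the text.

```python
def percent_divergence(mutations: int, segment_length_bp: int) -> float:
    """Return the number of mutational events as a percentage of the
    length of the sequenced segment."""
    if segment_length_bp <= 0:
        raise ValueError("segment_length_bp must be positive")
    if mutations < 0:
        raise ValueError("mutations must not be negative")
    return 100.0 * mutations / segment_length_bp


if __name__ == "__main__":
    # (type, maximal distance in mutations, sequenced LCR length in bp)
    reported = [("HPV-31", 14, 503), ("HPV-35", 15, 814), ("HPV-58", 10, 461)]
    for hpv_type, mutations, length in reported:
        print(hpv_type, round(percent_divergence(mutations, length), 1))
```

Rounded to one decimal place, the three values are 2.8, 1.8, and 2.2, matching the percentages given in the text.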
Gagnon and colleagues (12) identified 18 variants (including the reference genome). The amplicons studied by this group and by us overlapped but extended in different directions. Within the overlap, we observed 23 variants and Gagnon and colleagues observed 13 variants, 5 of which were not observed in our study. Based on these numbers (23 of the 28 variants known for the overlapping segment, or about 82%), our study detected more than 80% of the more common and widely distributed HPV-31 variants.
Figure 1 shows a phylogenetic tree of HPV-31 isolates based on the UPGMA algorithm (large figure) and a second tree based on the neighbor-joining method, where each variant is represented by a single isolate (small insert). These trees were calculated weighting each mutation equally except in the case of the deletion. The deletion was given the weight of two mutations, since we assumed that such a rare and specific event likely occurred only a single time. In phylogenetic trees calculated without introducing this bias, the variants MX800, MX635, MX701, and MX1144 did not cluster together.
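The weighting described above affects only the pairwise distances that enter the tree calculation. The following minimal sketch illustrates that step and is not a reconstruction of the Mega workflow used in the study. It assumes that each variant is represented as a set of mutation labels and that the distance between two variants is the weighted count of mutations present in one but not the other. All labels and variant names in the example are hypothetical.

```python
def weighted_distance(variant_a: frozenset, variant_b: frozenset,
                      weights: dict) -> float:
    """Distance between two variants given as sets of mutation labels.

    Every mutation found in exactly one of the two variants contributes
    its weight; mutations without an entry in `weights` count as 1.
    """
    differing = variant_a ^ variant_b  # symmetric difference
    return sum(weights.get(mutation, 1) for mutation in differing)


if __name__ == "__main__":
    # Hypothetical labels: "del7" stands for the 7-bp deletion, weighted
    # as two mutations as described in the text; "m1" is a point mutation.
    weights = {"del7": 2}
    variant_1 = frozenset({"del7", "m1"})
    variant_2 = frozenset({"del7"})
    variant_3 = frozenset({"m1"})
    print(weighted_distance(variant_1, variant_2, weights))
    print(weighted_distance(variant_1, variant_3, weights))
    print(weighted_distance(variant_2, variant_3, weights))
```

Giving the deletion a weight above 1 increases the distance between variants that carry it and variants that do not, which favors clustering of the deletion-carrying isolates.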
FIG. 1. The intratype diversity of HPV-31. The phylogenetic trees represent the relationship between HPV-31 variants based on a 503-bp segment of the LCR. The large tree is based on the UPGMA, and the small tree is based on the neighbor-joining algorithm. The UPGMA tree represents all isolates, while those that were chosen to represent a particular variant in the neighbor-joining tree are indicated by black triangles. BR, Sao Paulo (Brazil); ED, Edinburgh (Scotland); HE, Heidelberg (Germany); HK, Hong Kong; ML, Mali; MR, Morocco; MX, Monterrey (Mexico); OK, Oklahoma City (Oklahoma); SA, Cape Town (South Africa); PH, The Philippines; TL, Thailand; TW, Taipei (Taiwan); USA, Los Angeles (California).
Genomic diversity of HPV-35 isolates. The HPV-35 reference genome was isolated from an endocervical adenocarcinoma (21). HPV-35 is considered a high-risk HPV type and, worldwide, is the eighth most common HPV type in squamous cervical carcinomas and the seventh most common HPV type in asymptomatic control patients (23). Previously, only two genomic variants of HPV-35 were found through worldwide sampling (29), while a recent pilot study from our lab detected nine variants in a small number of cohorts (5). Some of the samples described in this article are identical to those in our previous publication and are identified by the same abbreviations. For the genomic sequence of the HPV-35 reference clone, we used the sequence HPV-35H as corrected and discussed at http://hpv-web.lanl.gov/stdgen/virus/hpv/compendium/htdocs/.
We analyzed an 814-bp fragment of the LCR of 47 HPV-35 samples and identified 12 variants with nucleotide exchanges, a single nucleotide deletion, and a 16-bp insertion relative to the reference genome. The latter was present in 33 samples. Within this 16-bp insert, we observed an additional single nucleotide exchange in two variants (10 samples) and 10 nucleotide exchanges in one sample (SA2299), suggestive of an ancient origin of this insertion. To clarify this unique observation, the raw sequencing data are shown in Fig. 2. Maximal distance between any two variants was 15 mutations (1.8%), considering the insertion a single event and counting the additional point mutations in SA2299. It is evident from the phylogenetic tree in Fig. 3 that the HPV-35 reference clone was found in only four samples (8.5%). In contrast, two variants with the insertion and specific patterns of point mutations elsewhere occurred in 16 (34%) and 9 (19.1%) samples, respectively, and seem to be the predominating HPV-35 genomes worldwide.
FIG. 2. Mutational patterns in HPV-35 variants. The two top rows indicate the genomic position in the HPV-35 reference clone and the corresponding nucleotides. In the following rows, nucleotide exchanges are shown by letters, deletions relative to the reference clone by a hyphen, and an insert in some variants by an open square in variants that lack this insert. The positions 7412 and 7413 are listed to indicate the position of the insert.
FIG. 3. The intratype diversity of HPV-35. The phylogenetic trees represent the relationship between HPV-35 variants based on an 814-bp segment of the LCR. The large tree is based on the UPGMA, and the small tree is based on the neighbor-joining algorithm. For further details, see the legend to Fig. 1.
Genomic diversity of HPV-52 isolates. The HPV-52 reference genome was isolated from a cervical intraepithelial neoplasia (27). HPV-52 is considered a high-risk HPV type. Worldwide, it is the sixth most common HPV type in cervical cancer but is not among the eight most common HPV types in cervical smears from asymptomatic patients (23). A preliminary study of the worldwide diversity of this virus based on the MY09/11 segment of L1 identified seven different variant genomes (29).
In total, we analyzed a 637-bp sequence from 66 samples, which led to the identification of 17 HPV-52 variant genomes relative to the reference HPV-52 genome (Fig. 4). All except one of our samples were missing the sequence 5'-TTATG-3' at the genomic positions 7387 to 7391 (Fig. 4). In order to clarify this unexpected finding, Wayne D. Lancaster kindly supplied us with the original HPV-52 reference isolate. Resequencing of this isolate confirmed a lack of this sequence, indicating an error in the establishment of the original HPV-52 sequence (27). We refer to the corrected sequence, i.e., the omission of the 5-bp segment, as the HPV-52 reference sequence and have submitted this correction to GenBank. However, in order to maintain an unequivocal discussion, the numbers in the following two paragraphs refer to the genomic positions of the uncorrected reference sequence.
FIG. 4. Mutational patterns in HPV-52 variants. The two top rows indicate the genomic position in the published HPV-52 reference sequence and the corresponding nucleotides. Positions 7387 to 7391 could not be detected by resequencing the original reference clone or in any variants, leading to the corrected sequence of the HPV-52 reference clone in the third row (refer. cor.). In the following rows, nucleotide exchanges are shown by letters, deletions relative to the reference clone by a hyphen, and an insert in one variant (between positions 7701 and 7702) by an open square in variants that lack this insert.
FIG. 5. The intratype diversity of HPV-52. The phylogenetic trees represent the relationship between HPV-52 variants based on a 637-bp segment of the LCR. The large tree is based on the UPGMA, and the small tree is based on the neighbor-joining algorithm. For further details, see the legend to Fig. 1.
Genomic diversity of HPV-58 isolates. The HPV-58 reference genome was cloned from a squamous cervical carcinoma (22). HPV-58 is considered the seventh most common high-risk HPV type in cervical cancer and the sixth most common in asymptomatic cervical samples (23). Two previous studies identified seven (29) and eight (7) genomic variants of HPV-58 based on partial L1 sequences.
We analyzed a 461-bp segment of 101 samples, which led to the identification of 21 variants (including the reference genome) relative to the reference HPV-58 genome. Surprisingly, none of the isolates represented the sequence of the reference genome, while a variant that differed by a single nucleotide substitution was found in 61 of the 101 samples. Eighteen of the remaining variants differed only by nucleotide substitutions, while one (ED18136) had a single nucleotide deletion. The maximal distance between any two variants was 10 mutations (2.2%).
The most common variant, represented by BR63, occurred in South African samples (6 of 11), Scottish samples (4 of 7), and East Asian samples (2 of 7), preventing a clear geographic association of this variant. However, four samples (represented by SA013) contained a unique South African variant, and three samples contained a variant found only in Taiwan (but not in other Asian countries), showing a limited amount of diversification in unique geographic locations. Figure 6 shows a phylogenetic tree of all isolates based on the UPGMA algorithm (large figure) and a second tree based on the neighbor-joining algorithm, in which each variant is represented by a single isolate (small insert).
FIG. 6. The intratype diversity of HPV-58. The phylogenetic trees represent the relationship between HPV-58 variants based on a 461-bp segment of the LCR. The large tree is based on the UPGMA, and the small tree is based on the neighbor-joining algorithm. For further details, see the legend to Fig. 1.
The sequence data are summarized in Fig. 7 and show the nucleotide and amino acid sequence changes of six or seven variants of each of the four HPV types. Within the L1 segment of 25 variants, we found the prototype amino acid sequence in 19 variants, single amino acid sequence exchanges in 1 HPV-35 variant and 1 HPV-52 variant, and a triple amino acid exchange in an HPV-58 variant. These changes were the consequence of zero to seven nucleotide exchanges, amounting to a maximal divergence of 1.6% from the prototype and maximal intervariant diversity of 11 nucleotides (2.4%). In the E6 gene, maximal divergence from the prototype was five nucleotides (1.1%) and maximal intervariant diversity was eight nucleotides (1.8%). Altogether, the E6 protein of 14 of the 25 variants was unaltered. There were single amino acid exchanges in four variants of HPV-35, in two of HPV-52, and in one of HPV-58, and there were two amino acid exchanges in one HPV-31 variant and one HPV-52 variant.
FIG. 7. Diversity of the E6 genes (four panels on the right side of the figure) and part of the L1 genes (left side of the figure) in distantly related variants of HPV-31, -35, -52, and -58. Within each panel, the first column lists the variants, whose relative phylogenetic position can be found in Fig. 1, 3, 5, and 6. The central part of the figure identifies nucleotide exchanges (letters) or maintenance of the sequence of the reference genome (gray squares). The box on the right side of each panel indicates whether the amino acid sequence of the reference clone has been maintained ("prototype") and if not, what kind of amino acid exchanges have occurred.
The limited amount of diversity is astonishing, considering the fact that HPV types likely originated from progenitor papillomaviruses quite unlike the extant HPVs by a process of continuous accumulation of mutations. It is therefore not trivial to reason why all the intermediary genomes that led to present day HPV types are now missing. One possibility likely involves biologic advantages acquired by genomic change. Even small improvements (e.g., slightly faster replication or higher infectivity) may, over extended periods of time, alter the composition of an HPV population. By positive selection, functionally improved genomes would eventually outnumber parent genomes over time merely by stochastic processes. Another facet of papillomavirus evolution almost certainly involves genetic drift. Genetic drift can become manifested by founder effects and bottlenecks, i.e., the limited genomic diversity of an HPV type in very small populations of infected individuals who eventually give rise to large populations, for example, at the very origin of the evolution of Homo sapiens. This hypothesis is supported by the fact that the chimpanzee papillomaviruses are more closely related to some HPV types (notably HPV-13) (30) than are closely related HPV types to one another, e.g., HPV-13 to HPV-6 and -11. Such observations support the idea that even closely related HPVs had already diversified into separate types in prehuman primates and the small populations of primates that evolved into new host species, such as Homo sapiens, may have carried only a fraction of the genomic diversity that existed in the total population of those primates. Thus, a likely mechanism for the evolution of new HPV types was the extinction of hosts close to the evolutionary roots of Homo sapiens. Positive selection and bottlenecks could suffice to explain the origin of closely related HPV types. 
Presently, available data neither support nor exclude the possibility that in addition, competition between HPV types and selection of the host against virus mutations contributed to the phylogenetic processes.
In the study of HPV-16 and -18 variation, we found a nucleotide divergence of 1 to 5% in the LCR between those variants that predominate today in Africa, Europe, and East Asia. Since the emergence of the predominant ethnic groups in these continents took 10,000 to a few tens of thousands of years, we interpret our findings to indicate that most HPV types evolved at a rate of roughly 1% per 10,000 years. This estimate may well be at the high end. For example, the ubiquitous occurrence of E variants of HPV-16 in all three continents indicates that these variants may have existed already while the other variants emerged. Our data may indicate an even lower speed of genomic change for HPV-31, -35, -52, and -58, since we found some variants of each type in Africa, Europe, and Asia. These widespread variants may have existed well before the spread of humans out of Africa, while clusters that are specific for ethnic groups may have evolved subsequently in specific locations.
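The order-of-magnitude estimate above divides an observed LCR divergence by the time over which it is assumed to have accumulated. The following minimal sketch shows that arithmetic only. The function name is hypothetical, and the value of 50,000 years is an assumed stand-in for "a few tens of thousands of years", which the text does not specify further.

```python
def divergence_rate_per_10k_years(percent_divergence: float,
                                  years: float) -> float:
    """Return sequence divergence expressed as percent per 10,000 years."""
    if years <= 0:
        raise ValueError("years must be positive")
    return percent_divergence / (years / 10_000)


if __name__ == "__main__":
    # Lower end of the ranges given in the text: 1% over 10,000 years.
    print(divergence_rate_per_10k_years(1.0, 10_000))
    # Upper end of the divergence range with an assumed 50,000 years.
    print(divergence_rate_per_10k_years(5.0, 50_000))
```

Both pairings give 1% per 10,000 years. If a variant lineage is older than the ethnic group it is associated with, the same divergence is spread over a longer time and the rate is lower, which is consistent with the remark that the estimate may be at the high end.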
Our data support the notion that HPV sequences encoding proteins developed a lower diversity than the noncoding LCR. However, our limited study of the E6 and L1 genes of the most distant variants of each type, aside from prototype sequences, indicates 0.7 to 2.1% diversity at the level of protein sequences. This finding points to the possibility of functional differences between variants within each type, which is of relevance for epidemiological, etiological, pharmaceutical, and vaccination research. It should be noted that molecular as well as epidemiological studies have identified quite significant functional differences between variants of HPV-16, which seem to stem from diversity in the LCR as well as in protein-encoding regions of the genome (13, 17-19, 28, 29, 31, 32-34).