Journal of Virology, October 2006, p. 9569-9576, Vol. 80, No. 19
0022-538X/06/$08.00+0 doi:10.1128/JVI.00835-06
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
Department of Clinical Virology, Göteborg University, Göteborg, Sweden,1 National VZV Laboratory, Centers for Disease Control and Prevention, Atlanta, Georgia,2 Scientific Resources Program, Centers for Disease Control and Prevention, Atlanta, Georgia3
Received 22 April 2006/ Accepted 20 July 2006
The genetic diversity of various human herpesviruses (cytomegalovirus, Epstein-Barr virus, herpes simplex virus type 1, human herpesvirus 7, and human herpesvirus 8) has been described, and methods for genotyping clinical isolates have been proposed, by a number of laboratories (7, 11, 27, 31, 34, 35, 38, 41-43, 46). Genetic diversity among VZV strains was initially defined using either restriction enzyme mapping (1, 22) or variation in the copy number at the tandem repeat regions (6, 20, 21, 39). Phylogenetic analyses of partial VZV genome sequences have recently identified a divergence of VZV into different genotypes and recombinant strains (25, 30, 44). Loparev et al. (25) performed targeted (multiple locus) analysis of VZV polymorphisms for a large number of clinical isolates obtained from all six continents, establishing a divergence into the three genotypes E (European), J (Japanese), and M (mosaic). The M genotype could be further subdivided into the groups M1 and M2. A seminal finding of the report was that strains belonging to M1 and M2 were most common in tropical regions, whereas E and J strains tended to occur in temperate latitudes. In addition, it was demonstrated that sequence analysis of a 447-nucleotide (nt) region located in open reading frame 22 (ORF22r1) could unerringly recapitulate the genotyping results obtained from analyzing 23 single-nucleotide polymorphisms located in seven open reading frames distributed over the entire genome. These analyses implied that recombination events, which were recently found to be a common feature among wild-type isolates of another alphaherpesvirus, i.e., herpes simplex virus type 1 (5, 31), were of interest for future study as a possible mechanism for VZV strain variation (30). In the studies reported here, we determined the complete genomic sequences of two M genotype VZV strains presented in a previous study (25).
The sequences were analyzed with different phylogenetic algorithms applied on the complete genome as well as on shorter segments. The results suggest a divergence of clinical VZV isolates into at least four genotypes, designated E, J, M1, and M2, and further suggest that recombination is an important contributor to the evolution of VZV.
TABLE 1. Characteristics of VZV strains analyzed in this study
Sequence analysis. The complete sequences were readily aligned manually thanks to the high level of conservation among all VZV genomes. The tandem repeat elements (R1 to R5) as well as hypervariable regions (i.e., loci with variable numbers of insertions or deletions of single nucleotides) were excluded prior to phylogenetic analyses because it is uncertain how phylogenetically informative those regions are. Phylogenetic analyses were completed using algorithms included in the Phylip package (10) and the SimPlot program (24). Owing to the limited number of complete VZV genomic sequences available, the following assumptions were made. (i) P-Oka was regarded as the representative consensus sequence for all strains of genotype J. (ii) The consensus sequence of BC, Dumas, and MSP was considered to represent the consensus sequence for all strains of genotype E. (iii) The genomic sequence for strain 123 served as the representative consensus sequence for all M1 genotype strains. (iv) The genomic sequence for strain DR was assumed to represent the consensus sequence for all M2 genotype strains. Since no outgroup sequences were included in the analyses, all phylogenetic trees generated for the study are unrooted.
Phylogenetic algorithms and methods. All phylogenetic analyses (except for the bootscan method) were performed by the maximum likelihood method (DNAML) included in the Phylip 3.62 package (10). For simplicity, the faster neighbor-joining algorithm was selected for the bootscan method (included in the SimPlot program). To investigate whether the tree topologies were robust to the choice of algorithm, the maximum parsimony method (DNAPARS) was used in all analyses in parallel and a Bayesian inference algorithm (MrBayes 3.1) (23, 36) was applied to the complete genome alignment. Bootstrap replicates of the sequence alignments were constructed by using SEQBOOT. To select a suitable evolutionary model and parameters, the complete genome alignment of all included strains was investigated by using MrModeltest 2.2 (32). The program suggested the use of the GTR+I+G (general time-reversible model using invariant sites and gamma-distributed rates at variable sites) model. The parameters were set as proposed. The simpler Jukes-Cantor model was applied in parallel to validate the robustness of the tree topologies to the choice of evolutionary model.
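As a rough illustration of two ideas named in this paragraph, the following Python sketch shows what a bootstrap replicate of an alignment is (columns resampled with replacement) and how a Jukes-Cantor distance is computed from the proportion of differing sites. It is not the SEQBOOT or DNAML implementation; the function names and the toy sequences are assumptions made only for this example.

```python
import math
import random


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("aligned sequences must all have the same length")
    # Draw `length` column indices with replacement and reuse them for every
    # strain, so that each resampled column stays intact across strains.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][c] for c in columns) for name in names}


def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor distance d = -3/4 * ln(1 - 4p/3), p = fraction of differing sites."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    if p >= 0.75:
        raise ValueError("distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy alignment (hypothetical sequences, not VZV data).
toy = {"E": "ACGTACGTAC", "J": "ACGTACGTAT", "M1": "ACGAACGTAT"}
replicate = bootstrap_replicate(toy, random.Random(0))
distance_e_j = jukes_cantor_distance(toy["E"], toy["J"])
```

In a real analysis each replicate would be passed to the tree-building program, and the fraction of replicate trees containing a given grouping would be reported as its bootstrap value.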
Phylogenetic analysis of the complete genome. Although strains belonging to the M1 and M2 genotypes have previously been suggested as possible recombinants (25), phylogenetic analyses including all seven strains were performed on the complete genome without fragmentation. Unrooted trees were created based on 100 bootstrap replicates of the alignment.
Analysis of phylogenetically informative sites. The four genotypes E, J, M1, and M2 can be ordered in unrooted bifurcating trees with three different topologies. In addition to the phylogenetic analysis described above, analyses of all single phylogenetically informative sites in the alignment of the consensus sequences representing E, J, M1, and M2 were performed. Each informative site, localized using the SimPlot program, supports one of the three alternative tree topologies. We investigated whether any topology was uniformly supported by a majority of the informative sites throughout the genome or if blocks of sequence were present in the genome supporting different topologies that served as evidence for recombination.
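The site-by-site logic described above can be sketched in Python as follows. For four sequences, a site is informative only when two nucleotides each occur exactly twice, and the way the sequences pair up determines which of the three unrooted topologies the site supports. This is a minimal sketch under stated assumptions: the topology labels are written as pairings and are not mapped to the tree numbers used in Fig. 1B, the helper names are hypothetical, and the toy sequences are not VZV data.

```python
from itertools import groupby

BASES = {"A", "C", "G", "T"}


def site_topology(e, j, m1, m2):
    """Return the pairing supported by one alignment column, or None."""
    column = (e, j, m1, m2)
    if any(base not in BASES for base in column):
        return None  # gaps or ambiguity codes are skipped
    if len(set(column)) != 2:
        return None  # invariant, or more than two states
    if e == j and m1 == m2:
        return "(E,J)|(M1,M2)"
    if e == m1 and j == m2:
        return "(E,M1)|(J,M2)"
    if e == m2 and j == m1:
        return "(E,M2)|(J,M1)"
    return None  # 3:1 split: variable but not informative


def informative_blocks(consensus):
    """Group consecutive informative sites that support the same pairing."""
    sites = []
    columns = zip(consensus["E"], consensus["J"], consensus["M1"], consensus["M2"])
    for position, column in enumerate(columns, start=1):
        topology = site_topology(*column)
        if topology is not None:
            sites.append((position, topology))
    return [
        (topology, [position for position, _ in run])
        for topology, run in groupby(sites, key=lambda site: site[1])
    ]


# Toy consensus sequences (hypothetical).
toy = {
    "E": "AACCCTTA",
    "J": "AACTTTCA",
    "M1": "GACCCTCA",
    "M2": "GACTTTTA",
}
blocks = informative_blocks(toy)
```

A switch from one pairing to another between adjacent blocks is the kind of signal that the text treats as possible evidence of recombination.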
Segmentation analysis. To investigate whether viruses classified as the M1 and M2 genotypes were derived from the E and J strains through recombination events, strains 123 and DR were separately evaluated segment by segment. Since the genome is highly conserved, with few informative sites, the SimPlot program was unable to produce reliable recombination analyses. However, using the bootscan method it was possible to locate putative recombination sites and evaluate them manually. Each predicted segment was analyzed together with the corresponding segments from the E and J strains using the maximum likelihood algorithm. In all likelihood, if M1 and M2 strains represent the culmination of recombination activity, they would have descended from different recombination events since these strains are positioned on different branches in phylogenetic trees (25). Although analyses that included both strains 123 and DR were performed for purposes of comparison, strains 123 and DR were analyzed separately. Thus, strain DR was excluded from the analyses when strain 123 was under evaluation and vice versa. Regions lacking recombination sites were excluded from the analyses. Finally, all segments with a similar phylogenetic topology were concatenated to larger segments and further reanalyzed using the maximum parsimony algorithm applied on 100 bootstrap replicates. The maximum likelihood and neighbor-joining algorithms were applied in parallel for comparison.
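The idea of scanning a putative recombinant segment by segment can be illustrated with a deliberately simplified Python sketch. The actual bootscan method builds bootstrapped neighbor-joining trees in each window; the version below substitutes plain percent identity to each candidate parent, which is an assumption made to keep the example short. Function names, window settings, and sequences are hypothetical, and the sequences are assumed to be aligned and gap-free.

```python
def window_identity(query, reference, start, size):
    """Fraction of identical positions in query[start:start+size] vs. reference."""
    q = query[start:start + size]
    r = reference[start:start + size]
    return sum(a == b for a, b in zip(q, r)) / len(q)


def scan_closest_parent(query, parents, window, step):
    """Label each window with the parent most similar to the query there."""
    results = []
    for start in range(0, len(query) - window + 1, step):
        scores = {
            name: window_identity(query, sequence, start, window)
            for name, sequence in parents.items()
        }
        best = max(scores.values())
        winners = [name for name, score in scores.items() if score == best]
        label = winners[0] if len(winners) == 1 else "tie"
        # Report 1-based inclusive coordinates for readability.
        results.append((start + 1, start + window, label))
    return results


# Toy mosaic: first half resembles "E", second half resembles "J".
parents = {"E": "AAAAAAAAAAAA", "J": "CCCCCCCCCCCC"}
mosaic = "AAAAAACCCCCC"
scan = scan_closest_parent(mosaic, parents, window=4, step=4)
```

Windows where the closest parent changes mark candidate breakpoints, which would then be checked manually and by per-segment tree building as described in the text.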
Nucleotide sequence accession numbers. The complete nucleotide sequences for both 123 and DR are being deposited in GenBank (accession numbers not yet assigned).
0.2%. In a comparison among the European strain Dumas, the Japanese strain P-Oka, and the mixed strain 123, 62% of all intragenic nucleotide substitutions were silent.
Phylogenetic analysis of the complete genome. Based on analyses of selected genomic regions of clinical VZV isolates (25, 30), the selected VZV strains segregated into at least three genotypes. We performed phylogenetic analyses based on the complete genome (Fig. 1A). After aligning and cleaning the sequence data as described above (removal of repeat regions), trees were constructed from 100 bootstrap replicates. The E, J, M1, and M2 genotypes were unambiguously separated from each other, with high bootstrap values (>85%), and the complementary algorithms and evolutionary models tested in parallel for comparison yielded the same topology and branch lengths.
FIG. 1. (A) Phylogenetic tree based on the complete VZV genome. Bootstrap values supporting each genotype are shown. (B) Schematic view of phylogenetically informative sites (n = 49) in the VZV genome included in 18 genetic blocks. The topology supported by each informative site, the position of the site, nucleotides for the respective strain, and transversions (TV) are marked to the left. The possible tree topologies (no. 1 to 3) of the four genotypes are illustrated to the right. Nucleotide positions refer to the European strain Dumas.
99.89% (1.15 × 10⁻³ substitutions per site), while the genetic similarity between the most distantly related groups, J and M1, is 99.82% (1.76 × 10⁻³ substitutions per site). By comparison, the genetic similarity between E and J is 99.85% (1.47 × 10⁻³ substitutions per site). The highest genetic divergence between strains of the same genotype was observed for genotype E (Dumas versus BC, 99.96% similarity or 0.39 × 10⁻³ substitutions per site). Previously published analyses of the genomic sequences for P-Oka and V-Oka established the high level of similarity between these viruses; our analysis established a similarity of 99.98% (0.15 × 10⁻³ substitutions per site). Similar topologies were observed regardless of whether the maximum likelihood or the neighbor-joining algorithm was used to conduct the analysis.
Each phylogenetically informative site in the sequence alignment was separately compared to the three possible tree topologies of the four genotypes E, J, M1, and M2. Altogether, 49 informative sites were identified in the alignment of E, J, M1, and M2, including 41 transitions and 8 transversions. The results indicate that the topology obtained from a phylogenetic tree based on the complete genome is not uniformly supported (Fig. 1B). Eighteen genetic blocks were identified in the VZV genome, each supporting one of three possible phylogenetic topologies. The topology described in Fig. 1A, which is represented by tree no. 2 in Fig. 1B, is supported by only 25 informative sites (n = 25) included in a total of eight blocks: one block with eight informative sites, one block with five informative sites, one block with four informative sites, three blocks with two informative sites each, and two blocks with a single informative site each (Fig. 1B). The topology illustrated as tree no. 1 was supported by 11 informative sites (n = 11) included in six blocks: one block with three informative sites, three blocks with two informative sites each, and two blocks with one informative site each. Finally, the topology illustrated as tree no. 3 was supported by 13 informative sites (n = 13) included in four blocks: one block with seven informative sites, one block with four informative sites, and two blocks with one informative site each.
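The block counts reported above can be cross-checked with a few lines of Python. The block sizes are taken directly from the text; the share of sites per topology is simple arithmetic derived here and is not a figure reported in the study. The variable and function names are chosen only for this sketch.

```python
# Informative sites per block, as listed in the text for each topology in Fig. 1B.
blocks_by_tree = {
    "tree 1": [3, 2, 2, 2, 1, 1],
    "tree 2": [8, 5, 4, 2, 2, 2, 1, 1],
    "tree 3": [7, 4, 1, 1],
}


def summarize(blocks):
    """Return the total number of sites and per-topology block/site counts."""
    total_sites = sum(sum(sizes) for sizes in blocks.values())
    summary = {}
    for tree, sizes in blocks.items():
        summary[tree] = {
            "blocks": len(sizes),
            "sites": sum(sizes),
            "share": sum(sizes) / total_sites,
        }
    return total_sites, summary


total_sites, summary = summarize(blocks_by_tree)
total_blocks = sum(entry["blocks"] for entry in summary.values())
```

The totals agree with the 49 informative sites and 18 blocks stated in the text, and the whole-genome topology (tree no. 2) accounts for roughly half of the informative sites.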
Recombination analysis. The M genotype VZV strains DR and 123 were previously postulated to represent putative recombinant strains (25). We performed separate fragmentation analyses of DR and 123 to evaluate whether or not a complete genome analysis would support the hypothesis that M1 and M2 genotypes were derived through a series of recombination events between E and J genotype strains.
The sequence data were evaluated for the presence of possible recombination sites predicted by the bootscan method. Genomic segments identified through this approach revealed putative sites where viral recombination events have occurred. We performed phylogenetic analyses on each identified segment through the application of the maximum likelihood algorithm. Results from these analyses show different phylogenetic topologies as well as branch lengths.
Fifteen distinct segments of interest for further evaluation were identified in our analysis of VZV genomic sequence data in the context of the DR strain (Fig. 2B). Trees based on each segment were constructed separately using the maximum likelihood algorithm. The analyses showed that 11 segments clustered DR with P-Oka and four segments clustered DR with the European strains.
FIG. 2. (A) Fragmentation analysis of strain 123 (genotype M1). Phylogenetic trees were calculated for each fragment, and the larger trees on top and below are based on 100 bootstrap replicates of a concatenation of all segments supporting similar topologies. Nucleotide positions refer to the European strain Dumas. (B) Fragmentation analysis of strain DR (genotype M2) with phylogenetic trees based on each fragment. The larger trees on top and below are based on 100 bootstrap replicates of a concatenation of all segments supporting similar topologies. Nucleotide positions refer to the European strain Dumas. (C) Fragmentation analysis of the VZV genome with M1 and M2 genotypes included. The tree topology of the six green segments clusters M1 and M2 together and clearly separates them from the E and J genotypes. The larger tree below is based on 100 bootstrap replicates of a concatenation of all six segments. Nucleotide positions refer to the European strain Dumas.
Although concatenation would be expected to yield the same topology, the results based on the extended segments were used to estimate branch lengths more accurately and to obtain bootstrap values. The more extensive concatenated segment was reanalyzed with 100 bootstrap replicates. In addition, the four segments clustering DR with the European strains were also concatenated and reanalyzed. The results showed that the topology was supported by high bootstrap values (Fig. 2B), an observation that provides additional support for the occurrence of recombination events in the derivation of the M2 strain DR.
Fourteen segments were identified in strain 123. Again, each segment was analyzed separately using the maximum likelihood algorithm. As with the DR analysis, topology depended on which part of the genome was under investigation (Fig. 2A). Ten segments clustered 123 with P-Oka, and four segments clustered 123 with the European strains. As with strain DR, these topologies remained stable, with high bootstrap values, after the segments were concatenated and reanalyzed (Fig. 2A).
In addition to the separate analyses of the DR and 123 strains, we performed an analysis that included both DR and 123. Notably, the results revealed that six blocks clustered strains 123 and DR closely together, with wider separation from both the E and J genotypes (Fig. 2C). Analogous to the results obtained from the separate analyses of DR and 123, the topology of the six blocks was also supported when the concatenated sequences were reanalyzed by the maximum likelihood algorithm applied to 100 bootstrap replicates (Fig. 2C).
VZV genotyping. ORF22 has previously been suggested as a suitable target for genotyping of clinical VZV isolates (25). In the present study, we investigated the complete genome in an effort to uncover alternative regions that can be used to distinguish between the four major VZV genotypic groups E, J, M1, and M2. Even though these four genotypes are hitherto the only ones described, we cannot exclude the possibility that more genotypes will be revealed as more strains are sequenced, which will require additional genotyping strategies. However, although the genome is highly conserved, several regions with nucleotides specific for the four genotypes are present. The region including ORF51 to -58 contains at least six regions (Fig. 3, no. 1 to 6) that might be utilized for VZV genotyping. One region each is located in ORF51 to -52, ORF53 to -54, ORF56, and ORF58, and two are located in ORF54. Each region includes single-nucleotide polymorphisms that specifically associate with each of the four genotypes. The flanking sequences of each region have additional sites that are also likely to prove useful for VZV genotypic analysis. In addition, five sites are present in ORF51 to -58 (nt 90202, 92861, 99186, 99709, and 100123) that could be useful for the identification of stable subgroups of genotype E.
FIG. 3. Genomic variation of a 10,456-nt stretch including ORF51 to -58 of the seven VZV strains included. Nucleotide positions refer to the European strain Dumas. Six genotyping targets are suggested where no. 4, 5, and 6 are all shorter than 430 nt.
Interpretations of VZV evolution are hampered by the scarcity of point mutations. The limited numbers of fully sequenced isolates and the conservation of the genome complicate both the reconstruction of the evolutionary history and the detection of individual recombination events. Such reconstruction is complex even without these additional obstacles. Nonetheless, we were able to construct a model of the evolution of VZV based on both phylogenetic and recombination analyses (Fig. 4). We propose that the E and J genotypes have evolved from a common ancestor and that subsequent recombination between these genotypes in superinfected persons led to the emergence of at least two mosaic genotypes. Since then, the viruses have continued to evolve both independently (through point mutations) and dependently (through additional recombination events). This model of evolution is supported by the results from the informative site analysis as well as the fragmentation analyses. That said, we cannot exclude that additional hitherto unknown genotypes and recombination events may be involved in the evolutionary history of VZV. Such information could affect the proposed model and could also reveal information about the evolutionary origin of the nonspecified segments in DR and 123 (Fig. 2A and B). Nevertheless, it appears likely that recombination events occurring in dually infected persons have contributed to the evolution of VZV.
FIG. 4. A suggested evolutionary model based on the results from the phylogenetic and recombination analyses. An ancestral VZV strain (t0) diverged into the E and J strains (t1). E and J are postulated to have recombined at least twice (t2 and t3) to form mosaic recombinant M strains. Following a period of independent evolution, the mosaic strains are postulated to have recombined at least once to establish M1 and M2 (t4). Finally, E, J, M1, and M2 have evolved independently to the present time (t5).
Evidence for strain recombination has also been observed among clinical isolates of human herpesvirus 8 (34), cytomegalovirus (18), and Epstein-Barr virus (28, 45). As such, homologous recombination appears to be a general feature of human herpesvirus evolution and an important mechanism for maintaining genetic diversity among human herpesvirus strains. An approximately 34-kb DNA comparison between V-Oka and the wild-type parental strain P-Oka revealed that nucleotide substitutions were preferentially located in ORF62, which encodes the major transactivating protein (2). V-Oka vaccine was attenuated in vitro from a wild-type J strain, and vaccine preparations cluster phylogenetically with genotype J. The vaccine is gaining more widespread use globally and has already been introduced broadly into a number of globally distributed populations. V-Oka establishes latency and can reactivate to cause zoster, and superinfection with wild-type VZV strains has been documented frequently among vaccinated individuals (3, 13-15, 30, 37). In addition, a recent population-based study indicated that second cases of wild-type varicella occur more frequently among naturally infected persons than previously appreciated (19). These findings suggest that recombination is possible both between wild-type VZV strains and V-Oka strains in vaccine recipients and, more rarely, between wild-type viruses in persons dually infected with wild-type strains. These new opportunities for VZV recombination should provide an interesting avenue for further study.
We investigated the complete genome for isolates from each of the previously identified VZV genotypes to detect sequence variation that can be used for the development of practical VZV genotyping methods. On the basis of these analyses, we suggest ORF51 to -58 as alternative regions for genotyping of VZV clinical isolates. These regions are relatively variable and include several polymorphic sites that can be used to distinguish between the genotypes E, J, M1, and M2. We also present six examples of shorter subregions that are suitable for genotyping (no. 1 to 6 in Fig. 3). Regions 4, 5, and 6 are all shorter than 430 nucleotides and can therefore easily be amplified by a single PCR. Although the three European isolates included here are classified into the same genotype (E), we cannot exclude the possibility that as yet unidentified subgenotypes exist. At least five sites in ORF51 to -58 should be useful for the classification of variants of genotype E.
The present and previous studies (25, 30, 44) clearly demonstrate limited but informative genetic diversity among globally distributed VZV clinical strains. It is quite likely that both recombination and point mutations, in addition to polymorphic TR regions, have all played important roles in the generation of VZV variability. VZV is essentially ubiquitous, at least in temperate climates, and establishes lifelong latent infection; in addition, VZV is transmissible from cases of zoster, albeit at low efficiency. A relatively small number of individuals in regular global transit could have spread and established relatively few VZV strains in new geographic regions in recent centuries. Analyses of herpesvirus family evolution indicate that the major herpesvirus species emerged in parallel with primate evolution, probably multiple millions of years ago (26, 29). This would explain the intricate balance of power that exists between all human herpesviruses and their hosts; herpesviruses only rarely cause serious human disease, but are able to persistently infect over a human lifespan and thus expand their opportunities for further transmission. VZV may have passed through periodic evolutionary bottlenecks, which may explain the geographic separation of genotypes recently observed (25). To further explore this issue, it may be useful to analyze the evolutionary model proposed here in the context of the migration patterns of modern humans over the past 30,000 years.
In conclusion, our results suggest that the evolutionary history of VZV included recombination events. Regardless of the method employed, different phylogenetic topologies were observed, depending on the region of the genome being investigated. We suggest that strains 123 and DR consist of genome fragments acquired from the E and J genotypes during homologous recombination. In addition, it is likely that strain 123 has acquired at least six fragments from strain DR during a more recent recombination event or vice versa.