Journal of Virology, December 2007, p. 13158-13167, Vol. 81, No. 23
0022-538X/07/$08.00+0 doi:10.1128/JVI.01310-07
Copyright © 2007, American Society for Microbiology. All Rights Reserved.

Department of Virology, Göteborg University, Göteborg, Sweden,1 Center for International Health,2 Department of Microbiology and Immunology, The Gade Institute, University of Bergen, Bergen, Norway,4 Department of Microbiology and Immunology, Muhimbili University, College of Health Sciences, Dar es Salaam, Tanzania3
Received 15 June 2007/ Accepted 7 September 2007
Herpesviruses are among the most extensively studied DNA viruses. The evolutionary relationships among the different herpesviruses infecting humans, reptiles, and other vertebrates, as well as invertebrates, have been described, and the speciation can be traced back 10 million to hundreds of millions of years ago (39). Based on DNA sequence data from clinical isolates, divergence into different genogroups has been described for human herpesviruses HSV-1 (2, 47, 55), VZV (45, 48, 50), Epstein-Barr virus (57), cytomegalovirus (7, 8), human herpesvirus 6 (HHV-6) (11, 20), HHV-7 (18), and HHV-8 (42). Information regarding the genetic variability of clinical HSV-2 isolates has been limited to date. This is the first evolutionary study based on HSV-2 DNA sequence data.
Recombination is a general molecular process that generates new combinations of genetic material (29). Similarly, viral recombination occurs when two viruses of different parental strains coinfect the same cell and interact during replication to generate progeny whose genomes consist of genetic segments obtained from both parental strains. Recombination has been shown to participate in the evolution of several alphaherpesviruses (65) as well as betaherpesviruses (21) and gammaherpesviruses (44). The underlying mechanisms are poorly understood but are associated with DNA replication (67) and different cell factors (14, 63). Several studies have shown that homologous recombination occurs frequently under experimental conditions (43, 58), and HSV-1 recombinants, for example, have been detected in vitro (3, 4, 24, 66) as well as in animal models (32, 69). Recombination has also been shown experimentally for several varicelloviruses, i.e., VZV, pseudorabies virus, bovine herpesvirus-1, and feline herpesvirus (12, 19, 22, 58). Furthermore, recombinants between bovine herpesvirus-1 mutants after coinoculation of calves by the natural route of infection have been demonstrated recently (58). Also, wild-type recombinants have recently been described for, e.g., HSV-1 (2, 47), VZV (45, 48, 50), and pseudorabies virus (9).
The HSV-2 genome contains approximately 155,000 nucleotides (nt) (13) and is related to that of HSV-1 with an overall nucleotide identity of approximately 50%. HSV DNA has two covalently linked segments consisting of the unique long (UL) and unique short (US) components. Recently, Norberg et al. sequenced the glycoprotein G (gG), gI, and gE genes, localized in the US segment, for 28 clinical HSV-1 isolates and defined three distinct genogroups by phylogenetic analysis (47). The clustering into different genogroups facilitated further analysis of the gene sequences, revealing that a substantial portion of the isolates presented evidence of homologous recombination. Thus, recombination is a mechanism used not only for repair of DNA damage but also to exchange genetic segments between different HSV-1 strains.
In the present study, we have sequenced and analyzed clinical HSV-2 isolates from Tanzania, from Bergen in Norway, and from Göteborg in Sweden. We have focused on the US4 gene, encoding HSV-2 gG (gG-2); the US7 gene, coding for gI; and the US8 gene, coding for gE. These genes were selected because the orthologous genes had been sequenced and analyzed for clinical HSV-1 isolates. Here we found, despite low overall genetic diversity, a divergence into at least two genogroups, designated A and B, and evidence of frequent recombination events.
Virus stocks were prepared by infecting baby hamster kidney (BHK) cells or GMK-AH1 cells grown in Eagle's minimal essential medium supplemented with 2% calf serum and antibiotics. The clinical isolates were sequenced at a low passage number (<5). The laboratory strain B4327UR (27) was also sequenced and compared to HSV-2 HG52 (38), used as a reference. Strain B4327UR was passaged 10 times before sequencing at the laboratory in Göteborg.
PCR amplification and sequencing. Two regions of the HSV-2 genome were amplified prior to sequencing. The US4 gene (encoding gG-2) was amplified as a 2,194-bp fragment spanning the region from 57 bp upstream of the start codon to 39 bp downstream of the termination codon (the positions refer to strain HG52). Several additional primers were used for sequencing purposes, and all primers have been published previously (31). Because the carboxy-terminal half of the gG-2 gene has been described previously for the Swedish isolates (30), only the gene segment coding for the amino-terminal secreted portion of gG-2 was sequenced. Amplification with the other set of primers, listed in Table 1, resulted in a 3,182-bp segment starting 57 bp upstream of the start codon of the US7 gene (encoding gI-2) and extending to 47 bp downstream of the stop codon for the US8 gene (encoding gE-2), thus including the noncoding sequence between the two genes.
TABLE 1. Primers used for amplification and sequencing
Sets of overlapping primers as shown in Table 1 and the ABI Prism BigDye Terminator cycle sequencing ready reaction kit (Applied Biosystems) were used for sequencing. The reaction mixture contained 1 µl 5x sequencing buffer, 2 µl BigDye, 4.4 µl H2O, 1 µl PCR product, 1.6 µl primer at a concentration of 1 pmol/µl, and 10 µl deionized H2O in a total volume of 20 µl. Incubation was carried out according to the following program: 1 min at 96°C, followed by 25 cycles of 10 s at 96°C, 5 s at 50°C, and 4 min at 60°C. The reaction mixtures were then treated with a Sequencing Reaction Cleanup kit (Biomek 2000) according to the manufacturer's protocol. Both strands were sequenced in an ABI Prism 3700 DNA analyzer (Applied Biosystems). For the gG gene and the combined gI and gE genes, a minimum of two sequences were obtained in parallel experiments. Sequences were assembled using DNA Sequence Assembly software, version 3.7 (Applied Biosystems). The sequences were further analyzed by using the Staden sequence analysis package (61) and were compared to those of the reference strain HG52. No mixed-strain infections were detected.
Sequence analysis. The sequences were easily aligned manually due to a high degree of similarity. To avoid interference by possible hypervariable repeat regions, all gaps in the alignment were analyzed separately and excluded prior to further analyses. The US4 gene and the US7-to-US8 segment were analyzed separately as well as concatenated.
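The exclusion of gapped alignment columns described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure used in the study; the function name `drop_gap_columns`, the `-` gap symbol, and the toy isolate names are assumptions.

```python
def drop_gap_columns(alignment):
    """Return a copy of the alignment without any column that contains a gap.

    `alignment` maps an isolate name to its aligned sequence; "-" is assumed
    to be the gap symbol (hypothetical convention for this sketch).
    """
    if not alignment:
        return {}
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[n]) != length for n in names):
        raise ValueError("all aligned sequences must have equal length")
    # Keep only the column indices where no isolate carries a gap.
    keep = [i for i in range(length)
            if all(alignment[n][i] != "-" for n in names)]
    return {n: "".join(alignment[n][i] for i in keep) for n in names}


# Toy alignment with two gapped columns (indices 2 and 3).
toy = {"iso1": "ACG-TA", "iso2": "ACGGTA", "iso3": "AC-GTT"}
cleaned = drop_gap_columns(toy)
```

Removing whole columns keeps the remaining sites aligned across all isolates, so downstream distance or network calculations see only positions that every sequence shares.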
To obtain information about evolutionary relationships among clinical isolates, sequences are traditionally analyzed by using algorithms for constructing bifurcating phylogenetic trees. However, recombination is a reticulate event, which may be difficult to detect. Because recombinants are the progeny of at least two parental strains, traditional phylogenetic trees are insufficient to represent the evolutionary history of, or evolutionary relationships among, isolates including recombinants. Since recombinants have been detected frequently for HSV-1 (47), the US4 and US7-to-US8 segments of the HSV-2 isolates were here analyzed first by using the SplitsTree program (25). In contrast to traditional bifurcating phylogenetic trees, SplitsTree constructs recombination networks, illustrating the evolutionary relationships among taxa in the presence of recombination. If recombination events have participated in the evolution of the isolates included, the sequence alignment will contain conflicting phylogenetic signals. These signals are utilized by SplitsTree to illustrate the evolutionary history of the isolates, including recombination events, as a non-tree-like recombination network. Thus, in such a network, a recombinant will have not one but several branches connected to the parental strains. That is, if a complex network represents the relationships between the isolates, it is likely that several recombination events have participated in their evolutionary history. In contrast, a recombination network with no recombinants included typically appears as a traditional bifurcating phylogenetic tree.
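The idea of conflicting phylogenetic signals can be illustrated with a four-gamete check on two biallelic alignment columns: if all four combinations of states occur, no single bifurcating tree can explain both sites without a repeated mutation or a recombination event. The following is a minimal sketch of that check, not the algorithm implemented in SplitsTree; the function name and the toy columns are hypothetical.

```python
def sites_incompatible(col_a, col_b):
    """Four-gamete check for two biallelic alignment columns.

    Each column is a string (or sequence) holding one character state per
    isolate, in the same isolate order. Returns True when all four state
    combinations are observed, i.e. the two sites conflict.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must cover the same isolates")
    if len(set(col_a)) != 2 or len(set(col_b)) != 2:
        raise ValueError("this sketch handles biallelic sites only")
    return len(set(zip(col_a, col_b))) == 4


# Both sites split the isolates as {1,2} versus {3,4}: no conflict.
same_split = sites_incompatible("AAGG", "CCTT")
# The second site splits the isolates as {1,3} versus {2,4}: conflict.
crossing_split = sites_incompatible("AAGG", "CTCT")
```

In a network view, a pair of conflicting sites like the second example is what produces a box-like structure instead of a single branch.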
In an additional analysis, the US4 and US7-to-US8 segments were concatenated and analyzed by the SplitsTree program. Owing to a suspected high degree of recombination crossovers in this relatively long genomic segment, we first constructed a network based only on four randomly selected isolates. The analysis was then extended by randomly adding more isolates to the data set, one at a time, and new networks were constructed based on each data set. In total, nine networks including 4 to 12 isolates were constructed. To account for random sampling errors, this procedure was repeated several times with new isolates randomly added to the analysis. A network including all isolates, based solely on silent mutations in the concatenated US4 and US7-to-US8 segments, was constructed for purposes of comparison.
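A network restricted to silent mutations requires identifying codons whose nucleotides vary among isolates while the encoded amino acid stays the same. The sketch below shows one way this could be done for an in-frame coding alignment using the standard genetic code; it is an assumption about the general approach, not the authors' procedure, and the function name and toy sequences are hypothetical. It does not handle gaps, ambiguity codes, or noncoding sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def silent_codon_columns(seqs):
    """Return 0-based codon indices that vary in nucleotides but not in amino acid.

    `seqs` is a list of equal-length, in-frame coding sequences.
    """
    n_codons = len(seqs[0]) // 3
    silent = []
    for k in range(n_codons):
        codons = {s[3 * k:3 * k + 3] for s in seqs}
        residues = {CODON_TABLE[c] for c in codons}
        if len(codons) > 1 and len(residues) == 1:
            silent.append(k)
    return silent


# Codon 1 varies silently (GTT/GTC, both valine); codon 2 changes D to E.
toy_cds = ["ATGGTTGAC", "ATGGTCGAC", "ATGGTTGAA"]
silent = silent_codon_columns(toy_cds)
```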
To further analyze the evolutionary history of HSV-2 regarding genetic divergence, isolates presenting conflicting signals (recombinant candidates), identified by using the SplitsTree program, were removed. Traditional phylogenetic trees were then constructed by using the maximum-likelihood method included in the Phylip package, version 3.66. To estimate the robustness of the trees, the calculations were based on 100 bootstrap replicates of each alignment. Phylogenetic trees including all isolates were constructed in parallel for purposes of comparison.
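Bootstrap replicates of an alignment are generated by resampling its columns with replacement. The sketch below illustrates that resampling step only; it is not the Phylip implementation, and the function name, seed, and toy alignment are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: columns drawn with replacement.

    `alignment` maps isolate names to equal-length aligned sequences.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    # Draw as many column indices as the alignment has columns.
    cols = [rng.randrange(length) for _ in range(length)]
    return {n: "".join(alignment[n][c] for c in cols) for n in names}


toy = {"iso1": "ACGTTA", "iso2": "ACGGTA", "iso3": "ACAGTT"}
rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(100)]
```

A tree would then be inferred from each replicate, and the bootstrap value of a clade is the percentage of replicate trees in which that clade appears.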
To validate the presence of recombination in HSV-2, the bootscan method included in the SimPlot program (35) was applied to the recombinant candidates. We used the bootscan method on isolates appearing in different clades in the phylogenetic trees based on US4 and US7-to-US8 segments, as well as on the recombinant candidates identified by using the SplitsTree program.
Conflicting signals (i.e., phylogenetically incompatible sites) may be explained either by recombination or by parallel mutations (i.e., true homoplasies) caused by chance or by selection pressure on specific sites. Although evidence for parallel mutations is rarely detected, the pairwise homoplasy index (PHI) test (5) was applied to the sequence alignments in order to determine whether the conflicting signals detected were due to recombination or to parallel mutations. The PHI test is based on the principle that recombination results in fragmented genomes and that each fragment contains several phylogenetically informative sites. In the case of a finite level of recombination, distant loci tend to have a higher degree of incompatibility than adjacent sites. Thus, the presence of two or more nonconflicting informative sites near each other in the sequence alignment will increase the statistical probability of recombination. However, if the genetic distance between the parental strains is low, an insufficient number of informative sites may be present in each fragment, and the probability that the PHI test will falsely reject the hypothesis of recombination will increase.
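The principle behind the PHI test can be sketched as a permutation test on the mean incompatibility of neighbouring informative sites. The code below is a deliberately simplified illustration: it uses a binary four-gamete score on biallelic sites in place of the refined incompatibility score of the published test (5), and all function names, the window size, and the toy alignment are hypothetical.

```python
import random


def informative_sites(seqs):
    """Columns with exactly two states, each carried by at least two isolates."""
    cols = []
    for col in zip(*seqs):
        counts = {}
        for state in col:
            counts[state] = counts.get(state, 0) + 1
        if len(counts) == 2 and min(counts.values()) >= 2:
            cols.append(col)
    return cols


def incompatible(col_a, col_b):
    """True when all four state combinations occur (four-gamete check)."""
    return len(set(zip(col_a, col_b))) == 4


def mean_neighbour_incompatibility(cols, window):
    """Mean incompatibility over site pairs at most `window` sites apart."""
    scores = [incompatible(cols[i], cols[j])
              for i in range(len(cols))
              for j in range(i + 1, min(i + window, len(cols) - 1) + 1)]
    return sum(scores) / len(scores) if scores else 0.0


def permutation_p_value(cols, window, n_perm, rng):
    """Share of site-order permutations scoring as low as the observed order."""
    observed = mean_neighbour_incompatibility(cols, window)
    hits = 0
    for _ in range(n_perm):
        shuffled = list(cols)
        rng.shuffle(shuffled)
        if mean_neighbour_incompatibility(shuffled, window) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy mosaic: the left half supports {1,2}|{3,4}, the right half {1,3}|{2,4}.
toy = ["AAAACCCC", "AAAATTTT", "GGGGCCCC", "GGGGTTTT"]
sites = informative_sites(toy)
observed, p_value = permutation_p_value(sites, window=1, n_perm=999,
                                        rng=random.Random(1))
```

In the toy alignment only the pair of sites flanking the single crossover conflicts, so neighbouring sites agree far more often than they would in a random site order. With few informative sites per fragment, as the paragraph above notes, that contrast weakens and the test loses power.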
Nucleotide sequence accession numbers. The sequences obtained in this study were deposited in GenBank with accession numbers EU106374 to EU106469.
The similarity between the two most distant isolates was approximately 99.6%, indicating that the interstrain variability of clinical HSV-2 isolates is less than that described previously for HSV-1 based on the same gene segments (98%) (47). A deletion of 3 nt (CGT) was detected in US4 at nt 876 to 878 (positions refer to strain HG52) in nine isolates. In addition, a duplication of 3 nt (GGC) was detected in the US4 gene (nt 1284 to 1286) in five isolates. In US7, an insertion of nucleotides CCCGCG (isolate S_99-3322) or nucleotides CCCGCA (isolate T_64-3300) was detected at nt 706 to 711. Two variable regions were detected in the noncoding region located between US7 and US8; one run contained 6 to 16 cytosine residues, and the other run contained 5 to 14 guanine residues. In US8, one isolate (T_3034) presented an insertion of a single nucleotide (C) at position 584 and an insertion of 11 nt (CCCCCCCGACG) at position 612, leading to an insertion of four novel amino acids and a V-to-D shift at amino acid position 206. In the US8 alignment, we noted that the sequence from strain HG52, derived from GenBank, displayed a deletion of nucleotide G at position 542 compared to all the other isolates. In addition, at position 574, an extra nucleotide C was found, resulting in 12 altered amino acids from position 181 to 192. Strain HG52 was first received at the virology laboratory in Göteborg in 1981. This virus and HG52 from Bergen were sequenced and were found not to have these nucleotide changes.
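The frameshift arithmetic behind the US8 observations can be checked with a short calculation. The sketch below assumes that the reported nucleotide positions are 1-based and counted from the first nucleotide of the US8 coding sequence, which is an assumption not stated in the text; the function names are hypothetical.

```python
def codon_number(nt_position):
    """1-based codon index containing a 1-based coding-sequence position."""
    return (nt_position - 1) // 3 + 1


def net_frame_change(indel_lengths):
    """Signed indel lengths summed modulo 3; 0 means the frame is restored."""
    return sum(indel_lengths) % 3


# GenBank HG52 sequence: 1-nt deletion at 542, 1-nt insertion at 574.
hg52_first_codon = codon_number(542)
hg52_last_codon = codon_number(574)
hg52_altered = hg52_last_codon - hg52_first_codon + 1
hg52_frame = net_frame_change([-1, +1])

# Isolate T_3034: 1-nt insertion at 584 and 11-nt insertion at 612.
t3034_frame = net_frame_change([+1, +11])
t3034_added_codons = (1 + 11) // 3
```

Under this numbering the deletion and the insertion in the GenBank HG52 sequence fall in codons 181 and 192, which agrees with the 12 altered amino acids reported, and the two insertions in T_3034 add 12 nt in total, that is, four codons with the reading frame restored.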
SplitsTree recombination analysis. Separate recombination networks based on the US4 gene and based on the US7-to-US8 segment were constructed, including all isolates. The results show that both networks presented a reticulate topology, consistent with an evolutionary history involving recombination. The network based on US4 (Fig. 1A) was more complex than the network based on the US7-to-US8 segment (Fig. 2A), suggesting a higher rate of recombination crossovers in the US4 gene.
FIG. 1. Sequence analysis of the US4 gene for 27 Tanzanian (red), 10 Swedish (green), and 10 Norwegian (blue) clinical HSV-2 isolates and the two laboratory strains HG52 and B4327. (A) Recombination networks, including all sequences, were first constructed by using the SplitsTree program. (B) The isolates inferring phylogenetically conflicting signals (recombinant candidates) were removed, and new phylogenetic networks were constructed based solely on nonrecombinants. (C) Trees based on all isolates using the maximum-likelihood algorithm. The recombinant isolates are connected with dotted branches and the nonrecombinant isolates with solid branches. The bootstrap values are derived from a consensus tree based on 100 bootstrap replicates, including solely nonrecombinant isolates. Only bootstrap values above 60 are shown. (D) Recombinant candidates were analyzed by using the bootscan method implemented in the SimPlot program. Results from analyses of the US4 gene segment in two isolates are shown.
FIG. 2. Sequence analysis of the US7-to-US8 segment. See the legend to Fig. 1 for details.
Network analysis was also performed on the concatenated US4 and US7-to-US8 segments by including an increasing number of isolates. First, a network was constructed based on four randomly selected isolates, followed by networks constructed by randomly adding new isolates, one at a time. The majority of the networks presented reticulate topologies, consistent with recombination. Furthermore, the complexity of the networks increased drastically as more isolates were included in the analysis, and networks including 12 isolates were too complex to reveal specific evolutionary relationships among the isolates (data not shown). The network based solely on silent mutations in the concatenated US4 and US7-to-US8 segments also presented a reticulate topology (Fig. 3), a finding that further supports recombination rather than parallel mutations caused by selection pressure.
FIG. 3. Recombination network based solely on silent mutations. The network includes all isolates and is based on the concatenated genes US4, US7, and US8. The Tanzanian, Swedish, and Norwegian isolates are shown in red, green, and blue, respectively.
The nonrecombinant isolates in the trees shown in Fig. 1C and 2C diverge into at least two genogroups, arbitrarily designated genogroups A and B, supported by high bootstrap values. The topologies of the two trees are similar, and genogroup A contains only isolates collected in Tanzania, while genogroup B contains isolates collected in Tanzania as well as in Scandinavia. Although genogroup B may be further divided into subgenogroups, more sequences would have to be included in order to confirm such subdivisions.
The orthologous gI and gE genes in the HSV-1 genome present high similarity to the corresponding HSV-2 genes (80%). In contrast, the gG-1 gene has a large internal deletion and includes only 717 nt (strain 17), in comparison to 2,097 nt described for the HSV-2 strain HG52. Thus, only the gI and gE genes from HSV-1 could be used as an outgroup to establish a reliable root, which was localized between the two HSV-2 genogroups, A and B (Fig. 2C). The tree based on US4 includes 20 nonrecombinant isolates and 29 recombinant candidates. Nine nonrecombinants cluster to genogroup A and 11 to genogroup B. The tree based on the US7-to-US8 segment includes 39 nonrecombinant isolates and 10 recombinant candidates. Ten nonrecombinants cluster to genogroup A and 29 to genogroup B. Furthermore, both phylogenetic trees, the tree based on US4 and that based on the US7-to-US8 region, showed a star-like topology, an appearance that remained when the recombinant candidates were excluded from analysis.
In an additional analysis, the gap regions in the sequence alignment were compared to the topology of the most parsimonious tree as well as the maximum-likelihood tree. In the first gap region in US4, the deletion of the triplet CTG was shared among isolates N_3, N_6, N_7, N_10, T_43-742, T_70-3486, S_99-3140, S_96-1217, and S_B4327, a pattern that supports the subgenogroup divergence within genogroup B (Fig. 1C and 2C). In contrast, in the second gap region in US4, only isolates HG52, S_99-3140, T_64-3300, S_97-1643, and S_95-580 displayed an insertion of the nucleotides GGC, a finding that does not support the subgenogroup divergence within genogroup B. Thus, the two gap regions in US4 are phylogenetically conflicting. Furthermore, the repeat regions in the noncoding region between US7 and US8 showed no correlation to the topology of the most parsimonious tree or the maximum-likelihood tree, suggesting hypervariability and/or recombination.
Bootscan and the PHI test. All three genomic regions—US4, the US7-to-US8 segment, and the concatenated US4 and US7-to-US8 segments—were analyzed by the bootscan method. Recombinant candidates were compared to each other as well as to nonrecombinants. In these analyses, variable degrees of segmentation were detected, findings that further support the idea that the isolates have been involved in recombination events. The results obtained from analyzing US4 in two isolates are shown in Fig. 1D, and those from analysis of the US7-to-US8 segment in two other isolates are shown in Fig. 2D. Bootscan analysis of three isolates based on the complete concatenated region is shown in Fig. 4.
FIG. 4. Results from the bootscan analysis of three recombinant candidates based on concatenations of the US4 and US7-to-US8 segments.
The genetic variability of the HSV-2 isolates was significantly lower than that previously described for HSV-1 (47). The underlying reason for this observation is currently unknown, but an explanation might be that HSV-2 has historically been a smaller population than HSV-1. Since small populations are more sensitive to genetic drift and fixation, the divergence of HSV-2 may have been restricted. In contrast, large populations are less sensitive to random sampling errors, which may have allowed HSV-1 to diverge at an earlier stage than HSV-2, implying that the most recent ancestor for the HSV-1 isolates investigated previously is older than that for the HSV-2 isolates investigated here. Furthermore, other factors, such as different transmission routes, may also have influenced the population genetics of HSV-1 and HSV-2, exerting different selection pressures, bottlenecks, and founder effects on the two populations. In addition, more-frequent recombination events in the HSV-2 population during evolution may have decreased genetic diversity. Taken together, the lower genetic variability for HSV-2 may not necessarily be explained by a higher mutation rate for HSV-1 than for HSV-2, but by differences in variables affecting the population genetics of HSV-1 and HSV-2.
By constructing recombination networks and using the bootscan method, we showed that the sequence alignment of the clinical HSV-2 isolates contained a substantial number of phylogenetically conflicting signals. We suggest that the majority of these conflicting signals are results of homologous recombination. However, in addition to recombination, phylogenetically conflicting signals may also be explained by parallel mutations (i.e., true homoplasies), either randomly introduced into the genome or caused by selection pressure on specific, functionally important amino acids. While conflicting signals caused by selection pressure are usually found accumulated at specific sites or epitopes, the conflicting signals described here for the HSV-2 alignment were present in all three genes investigated, US4, US7, and US8, and had not accumulated in certain regions, which argues against an overt selection pressure. Furthermore, the recombination network based solely on silent mutations also presented a reticulate topology, which may not be explained by selection pressure. Owing to the close homology of the sequences, the probability of such frequent conflicting signals arising by chance is considered extremely low. In addition, the gap regions were also phylogenetically conflicting in that the number of repeated triplets or nucleotides for each isolate was not entirely reflected by the topology of the most parsimonious tree including all isolates. When the PHI test was applied solely to silent mutations, a high statistical probability of recombination was found. The fact that the PHI test failed to prove a statistical probability of the presence of recombination in the sequence alignments including all isolates and nucleotide substitutions may not necessarily reject the hypothesis of recombination. 
When more isolates are included in the analysis, the number of crossovers will increase in the alignment if the frequency of recombination is high, and hence, the distances between the crossovers will decrease. Since the average genetic distance between the isolates is smaller than 0.4%, the probability that two or more phylogenetically informative sites will be identified between each pair of crossovers is low. Thus, it is likely that the number of such sites in the sequence alignment analyzed here is insufficient to be utilized for calculation of the statistical probability of recombination by the PHI test. Furthermore, it has been shown that the probability of the PHI test falsely rejecting the hypothesis of recombination increases if the evolutionary history includes exponential population growth (5).
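The argument about sparse informative sites can be made concrete with rough arithmetic. The sketch below uses the two amplicon lengths from the methods (2,194 bp and 3,182 bp) as a stand-in for the alignment length and the 0.4% figure as an upper bound on pairwise distance; both choices are assumptions for illustration, and pairwise differences are only a loose proxy for phylogenetically informative sites. The crossover counts are hypothetical.

```python
def expected_differences(length_nt, distance):
    """Expected number of differing sites between two sequences."""
    return length_nt * distance


# Amplicon lengths taken from the methods; used here as a rough alignment length.
total_length = 2194 + 3182
upper_bound = expected_differences(total_length, 0.004)

# Average differences per fragment if k crossovers cut the region into k + 1 parts.
per_fragment = {k: upper_bound / (k + 1) for k in (0, 5, 10, 20)}
```

Even for the two most distant isolates, about 21 differences are expected over the whole region, so with roughly ten or more crossovers the average fragment would carry fewer than two differing sites.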
Conflicting phylogenetic signals may also be explained by random introduction of nucleotide substitutions by Taq polymerase during the PCR. Although we cannot entirely exclude this possibility for all nucleotide substitutions described, this explanation is unlikely. First, all sequences were generated from two strands in both directions. Second, most isolates have been resequenced from new PCRs, giving identical results. Third, to achieve a Taq polymerase artifact, the error must be introduced early in the amplification reaction, where the original sample contains a low number of DNA copies. To avoid this possibility, the PCRs were initiated with a high number of DNA copies (>10^6 genome copies) measured by real-time PCR (46). Finally, although some nucleotide substitutions are specific for single isolates, most substitutions are shared among two or more isolates, a feature that would have been unlikely if the substitutions were introduced randomly by the Taq polymerase. In addition, since the isolates were sequenced after a low number of passages in cell culture (<5), and it has been found that multiple passages of HSV-2 in cell cultures do not alter the DNA sequences (30, 64), it seems unlikely that the substitutions described here were cell culture artifacts.
Since nucleotide substitutions are rare events for herpesviruses compared to many RNA viruses, recombination may act as a powerful and essential driving force of evolution. For example, it has been shown experimentally that two avirulent herpes simplex viruses may generate lethal recombinants in vivo (26). Recombination can break down associations between deleterious and beneficial mutations at different loci (negative disequilibrium). A genome can also collect beneficial mutations from several other genomes, which is advantageous when different individuals in a population carry different beneficial mutations (16). A consequence is that recombination can increase the additive genetic variance and, by Fisher's fundamental theorem for natural selection, can increase the rate of adaptation (15, 17). In addition, all organisms randomly introduce harmful mutations into their genomes, and a recent study has demonstrated that recombination is a powerful mechanism for the deletion of such mutations (28). Whether HSV-2 uses recombination for adaptation is unknown.
Star phylogenies may typically be explained either by an evolutionary history with frequent recombination or by an exponentially growing population. This pattern of evolution has been described for several RNA viruses, such as HIV (36, 62), hepatitis C viruses (68), Puumala hantaviruses (60), and enteroviruses (37). In addition, HPV-16 (23), HPV-44, HPV-55, and HPV-68 (6) present a tree topology similar to that described here, i.e., two dichotomous clusters and a star-like appearance of the variants within the clusters. Due to the suggested high frequency of recombinants described here for HSV-2, it is likely that the star phylogeny of the trees including all isolates is caused, at least partly, by recombination. However, since the trees based solely on nonrecombinants also presented star-like topologies, two possibilities can be considered: (i) some of the nonrecombinants are in fact recombinants, which we failed to classify, or (ii) the HSV-2 population has expanded exponentially. The latter suggestion corresponds well with what we know of the HSV-2 host, i.e., the human population. Nevertheless, the conservation of the genome and the complex pattern of recombination crossovers complicate the predictions of which isolates are recombinants. Hence, it may be possible that some of the isolates classified here as recombinants are nonrecombinants, and vice versa. On the other hand, the phylogenetic trees based solely on nonrecombinants presented similar topologies for both the US4 region and the US7-to-US8 region. These results not only increase the support for the evolutionary history illustrated by the two trees, i.e., the divergence into genogroups A and B and the subgroups within genogroup B, but also decrease the probability that the isolates classified here as nonrecombinants are recombinants.
In conclusion, based on clinical data from HSV-2 isolates collected in Scandinavia and Tanzania, the results presented here demonstrate a divergence into at least two genogroups. By applying different algorithms, it was also possible to perform a novel and thorough analysis of recombination of clinical HSV-2 isolates. The results of this study suggest that, as has been described for HSV-1 and VZV, recombination is a prominent feature of the evolution of HSV-2 as well. To increase the understanding of the genetic variability and divergence of HSV-2, it would be of interest to analyze isolates from other regions of Africa, as well as from the rest of the world. Possible biological implications of genetic variability for tissue tropism, symptomatic versus asymptomatic transmission, vaccine development, and different clinical manifestations of HSV-2 infection need to be addressed further.
Financial support was received from the Swedish International Development Agency, the Swedish Research Council, the ALF Foundation at Sahlgren's Hospital, the Swedish Society for Medical Research, and the Western Norway Regional Health Authority.
Published ahead of print on 19 September 2007.