Journal of Virology, October 2006, p. 9928-9933, Vol. 80, No. 20
0022-538X/06/$08.00+0 doi:10.1128/JVI.00441-06
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
Center for Infectious Disease Dynamics, Department of Biology, The Pennsylvania State University, Mueller Laboratory, University Park, Pennsylvania 16802,1 Department of Zoology, University of Oxford, South Parks Road, Oxford OX1 3PS, United Kingdom,2 Fogarty International Center, National Institutes of Health, Bethesda, Maryland 208923
Received 21 June 2006/ Accepted 3 August 2006
4 × 10⁻⁷ synonymous substitutions per site per year (subs/site/year), is based on the assumption of codivergence with human populations, with host divergence times used to calibrate those of the virus (17, 32). Although long-term codivergence and consequently low rates of nucleotide substitution have been supported in some DNA viruses, specifically herpesviruses and papillomaviruses (5, 24, 25), the extent of codivergence between JCV and human populations has not been rigorously tested. However, a previous study of genetic variation noted differences in the demographic histories of JCV and human populations, implying that factors besides human population structure have shaped viral diversity (39). Similarly, an evolutionary rate for JCV that is independent of calibration through codivergence has not been obtained, making it difficult to ascertain if rates derived so far are valid. Indeed, if the polyomaviruses do evolve as slowly as estimated under a codivergence hypothesis, then an independent estimate, based on sequence variation observed over a short time period (as described in reference 15), should be impossible, as substitutions would accumulate too slowly to measure. Further, the assumption that all DNA viruses evolve orders of magnitude more slowly than RNA viruses has recently been challenged. In particular, an interhost rate of approximately 10⁻⁴ subs/site/year, close to that of many RNA viruses, has been observed in the small, 5-kb single-stranded DNA parvoviruses (30), and an intrahost rate of roughly 10⁻⁵ subs/site/year was estimated for the human polyomavirus BK (BKV) (10). Herein we provide a systematic study of JCV-human codivergence.
To compare the population dynamics (including population growth rates) of JCV and its human hosts, we compiled a corresponding data set of 158 human mitochondrial DNA (mtDNA) sequences from the mitochondrial database mtDB (http://www.genpat.uu.se/mtDB/). This data set reflected, as far as possible, the populations from which the viral isolates were sampled. Entire mtDNA genomes, with the exception of the 1,120-bp noncoding "D-loop," which evolves at a higher rate than the rest of the mitochondrial genome (19), were aligned manually. Accession numbers and mitochondrial host populations are given in Table S2 in the supplemental material.
Phylogenetic analyses.
To determine the phylogenetic relationships of all JCV strains, we used maximum likelihood (ML), neighbor-joining (NJ), and Bayesian Markov chain Monte Carlo (MCMC) approaches to infer three individual trees. ML and NJ phylogenies were estimated with the GTR+I+Γ4 model of nucleotide substitution, available in PAUP* (33). ML trees used SPR (subtree pruning regrafting) and TBR (tree bisection-reconnection) branch swapping, with 100,000 and 200,000 rearrangements, respectively. All parameter values were estimated from the data, and bootstrap values were calculated using 1,000 replicate NJ trees on the ML substitution model. Bayesian trees were estimated with the program MrBayes 3, with the HKY85+I+Γ4 model of nucleotide substitution (29). The MCMC chain was run for 9 million generations (with a burn-in of 850,000 generations), with sampling every 1,000 generations. The tree with the highest posterior probability, i.e., the MAP (maximum a posteriori) tree, was found, and posterior probabilities for nodes were calculated from a consensus tree derived from the same MCMC chain, sampling every 500 generations after a burn-in of 850,000 generations. All trees were midpoint rooted, as no suitable outgroup is known. (The most closely related virus, BKV, is ~22% divergent from the JCV strains in our data set. As the maximum diversity of the JCV sequences is only 2.7%, BKV is not sufficiently similar to constitute a reliable outgroup.)
These three trees were used as a basis for constructing the JCV phylogeny that was used as input for the TreeMap analysis (see below). Because TreeMap does not allow viruses to have multiple hosts, JCV subtypes which both shared a host population and were located on sister or neighboring branches were combined onto a single branch. The ML phylogeny gave a polytomy comprising the three clades 7A, 7C1/C2, and 7B1/B2, yet the NJ tree indicated an initial divergence of 7A and the MAP tree indicated an initial divergence of 7B1/B2. To ensure all possible topologies were explored, concise trees were constructed from both resolutions (labeled a and b, respectively). Subtypes 2B, 2E, and 2D3 were excluded because of either their unresolved positions or limited sampling.
The phylogenetic tree of human populations, labeled i, was constructed in accordance with those proposed by Cavalli-Sforza and Feldman (6) and Cavalli-Sforza et al. (7). The variant human tree ii was constructed by placing the Caucasoid branch in the position proposed by Ayub et al. (3) and Uinuk-ool et al. (36). Finally, tree iii shows the Indian population branching separately because of controversy regarding the affinity of Indian populations to Asians and Europeans (4) (see Fig. S3 in the supplemental material).
Cophylogenetic reconciliation and significance testing.
To determine the degree of JCV and human phylogenetic congruence, we used the program TreeMap v2.0 (http://taxonomy.zoology.gla.ac.uk/rod/treemap.html) (9, 20). A "tanglegram" was created by matching each subtype with the host population in which it is predominantly found. From this, a graph (a "jungle") was created which includes all optimal mappings of viral tree nodes onto host tree nodes. The potentially optimal solutions, or POpt, for each of the tanglegrams were determined by weighing the noncoevolutionary events (NCEs) required to reconcile the host and virus trees (see Table S3 in the supplemental material). NCEs include viral duplication, host population transfer, and the loss of a virus by a host population. When determining the POpt, an upper limit was put on NCEs at the point where NCE + 1 fails to result in reconciliations with a greater number of codivergence events. Those maps in the jungle that were optimal with respect to these evolutionary events (i.e., maps that infer the maximum number of codivergences with the minimum number of NCEs to explain the phylogenetic congruencies and incongruencies) were analyzed (20). To test the null hypothesis that the JCV tree is no more congruent with the host tree than a random tree would be, 100 viral phylogenies in which the branches were randomized were mapped onto host phylogenies. We then determined which proportion of these reconciliations showed the same or more codivergence events, or the same or fewer NCEs, as the "optimal" trees. Using the same analysis, the consensus tree of the 158 human mtDNA genome data set was also compared to the host phylogenies described above.
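The randomization test described above reduces to counting how many randomized viral trees match or beat the observed reconciliation. The following is a minimal sketch of that counting step only, not of TreeMap itself; the function name and the event counts are hypothetical and do not come from the study.

```python
from typing import Sequence


def randomization_p_value(
    observed: int,
    random_values: Sequence[int],
    fewer_is_better: bool = False,
) -> float:
    """Proportion of randomized trees that do at least as well as the observed tree.

    With fewer_is_better=False the statistic is the number of codivergence
    events (same or more counts as a hit); with fewer_is_better=True it is
    the number of NCEs (same or fewer counts as a hit).
    """
    if not random_values:
        raise ValueError("at least one randomized tree is required")
    if fewer_is_better:
        hits = sum(1 for value in random_values if value <= observed)
    else:
        hits = sum(1 for value in random_values if value >= observed)
    return hits / len(random_values)


# Hypothetical codivergence counts for 10 randomized viral trees (illustration only).
random_codivergences = [3, 5, 6, 4, 5, 2, 7, 5, 4, 6]
p_codivergence = randomization_p_value(5, random_codivergences)
print(p_codivergence)
```

A large proportion means that random trees reconcile with the host tree about as well as the observed tree does, so the null hypothesis of no special congruence cannot be rejected.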
Nucleotide substitution rates and population dynamics.
To estimate rates of nucleotide substitution in JCV and to compare the population dynamics of JCV and human mtDNA, we used a Bayesian MCMC approach (the BEAST v1.3 package; http://evolve.zoo.ox.ac.uk). This method considers differences in branch lengths among viruses sampled at different times and explores evolutionary models whose parameters include tree topology, substitution rate, and population size changes. Bayesian skyline plots with 10 grouped intervals, each with its own population size, were used to infer demographic history (16). Phylogenies were evaluated using a chain length of 40 million states under the HKY85+Γ4 substitution model and with uncertainty in the data reflected in the 95% highest posterior density (HPD) intervals. An uncorrelated lognormal relaxed molecular clock model (14) was employed for JCV genome analyses, while a strict clock and a fixed (known) substitution rate were used for the analysis of human mtDNA, as these sequences have been shown to evolve in a roughly clock-like manner (19). Population growth curves were estimated over a 22-million-state chain with a fixed substitution rate parameter.
FIG. 1. Phylogenetic tree of 333 JCV genomes inferred using a maximum likelihood approach. The tree is midpoint rooted, and clades are labeled with the range of host ethnicities found in the clade (designations are those submitted by the publishing author) followed by the subtype designation. Bootstrap values, calculated using 1,000 replicate NJ trees, are labeled for relevant nodes with >50% support.
To compare the evolutionary histories of virus and host, the consensus JCV trees were each mapped onto all three possible human population phylogenies (i, ii, and iii), creating "tanglegrams" (Fig. 2; see Fig. S3 in the supplemental material; Table 1). Although some mismatch is to be expected given human population admixture, it is striking that none of the six cophylogenetic solution sets had reconciliations with more than five lineage codivergences (i.e., 10 codivergence events). A significance test demonstrated that, given 100 random viral topologies, more than 60 can be mapped onto each host tree and still give five or more codivergences (P value of 0.61 ± 0.05). Similarly, nonsignificant P values were obtained when testing the minimal number of NCEs (P value of 0.58 ± 0.05). In sum, there does not appear to be any significant support for codivergence, as the observed JCV trees are no more congruent with human trees than random viral trees would be. In contrast, the consensus tree of the 158 globally sampled mtDNA genomes, which clearly have codiverged with that of the human population, shows a significant level of congruence with one of the three consensus human phylogenies (P = 0.02 ± 0.014), confirming the robustness of the TreeMap approach employed here.
FIG. 2. Tanglegrams a to i of JCV (right) and human population (left) phylogenies (see Table 1; see Fig. S3 in the supplemental material).
TABLE 1. JCV-human phylogenetic reconciliation analysis
The predicted age of the human mtDNA phylogeny was consistent with accepted estimates of a most recent common ancestor (MRCA) 100,000 to 200,000 years ago (ya) (27) and an increase in population size corresponding to major cultural changes, beginning approximately 50,000 ya (21) (Table 2; Fig. 3B). Furthermore, the posterior estimates of the 10 θ parameters (equivalent to Ne × g, where g is generation length and Ne is the effective population size) ranged from 5.3 × 10⁴ to 9.2 × 10⁶. Given that the human lineage has had a long-term harmonic mean Ne of ~10,000 (34, 37) and a g of approximately 20 years, the estimated θ's are also consistent with human population history. In contrast, the posterior population parameters estimated for JCV did not correspond to those of its human host, as would be predicted under a model of codivergence. The upper 95% confidence interval for the inferred MRCA of JCV did not exceed 3,100 ya (Table 2), while the values estimated for JCV θ's only ranged from 5.6 × 10² to 3.4 × 10⁴. Given the prevalence of JCV in the human population, if the viral g was truly equivalent to that of humans (i.e., transmitted essentially vertically), these θ's should be the same order of magnitude as those estimated for the human population.
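The consistency argument above can be checked with simple arithmetic. The sketch below uses only the values quoted in the text (Ne of about 10,000, g of about 20 years, and the reported ranges); the variable and function names are illustrative, and applying the human generation time to JCV is the hypothetical scenario the text describes, not a measured quantity.

```python
def scaled_population_size(effective_size: float, generation_years: float) -> float:
    """Return Ne x g, the quantity the skyline population parameters estimate."""
    return effective_size * generation_years


HUMAN_NE = 10_000                 # long-term harmonic mean Ne quoted in the text
HUMAN_G = 20.0                    # approximate human generation length in years
HUMAN_RANGE = (5.3e4, 9.2e6)      # reported range for human mtDNA
JCV_RANGE = (5.6e2, 3.4e4)        # reported range for JCV

# Expected human value, and whether it falls inside the estimated range.
human_expected = scaled_population_size(HUMAN_NE, HUMAN_G)
human_consistent = HUMAN_RANGE[0] <= human_expected <= HUMAN_RANGE[1]

# If JCV shared the human generation time (strictly vertical transmission),
# the JCV estimates would imply these effective population sizes.
jcv_implied_ne = tuple(value / HUMAN_G for value in JCV_RANGE)

print(human_expected, human_consistent, jcv_implied_ne)
```

Under that scenario the JCV estimates would correspond to effective sizes of only tens to low thousands, well below the human figure, which is the mismatch the paragraph points to.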
TABLE 2. Bayesian MCMC estimates of JCV and human mtDNA evolutionary dynamics
FIG. 3. Skyline plots estimated from Bayesian MCMC analyses of (A) 158 JCV genomes found in geographically diverse populations and (B) 158 human mtDNA sequences with a similar geographical distribution. A priori nucleotide substitution rates of 1.7 × 10⁻⁵ and 1.7 × 10⁻⁸ subs/site/year were specified, respectively, while all other parameters were allowed to vary. The black line shows the median estimate of θ (Ne × g) throughout the given time period. The gray area gives the 95% HPD interval of these estimates.
Thus, our analysis suggests that JCV has not strictly codiverged with human populations. While specific strains do exist predominantly within certain populations and some parts of the JCV phylogeny hint at codivergence, such as the close association of subtypes 2A1 (east Asians) and 2A2 (Native Americans), geographical association does not provide adequate evidence for long-term codivergence. As such, we caution against using this virus to make extensive inferences about the evolution of human populations. That factors other than codivergence could account for the similar geographical distribution of humans and JCV was proposed by Wooding (39) after finding differences in demographic history. Both JCV and human subpopulations exhibit distinctive genetic features likely caused by population isolation and genetic drift, but human population structure cannot explain many of the observed phylogenetic patterns in JCV, such as the relative similarity of viruses sampled in subpopulations from Africa and Asia and the genetic diversity of European strains. In contrast, despite their limitations, sequence data derived from human mitochondria and Y chromosomes may be far better suited for deducing the details of human population migration.
Under an assumption of JCV-human codivergence, the rate of viral nucleotide substitution estimated by calibrating the viral phylogeny with host divergence times should approximately match the rate obtained in a host-independent analysis of substitution rate. While host-calibrated viral clocks have resulted in estimates of the synonymous substitution rate at 4 × 10⁻⁷ subs/site/year (17), no host-independent rate had been estimated, either to confirm this rate or to test assumptions of vicariance upon which it is based. Here, we attempted to obtain such an estimate, based solely on the extent of sequence variation observed in JCV isolates over a period of 34 years. Our analyses suggest a significantly more rapid rate of evolution than that obtained under a model of codivergence. Although limited long-term viral sampling resulted in fairly large confidence intervals for the rate of JCV evolution (indicating that all estimations of the substitution rate must be made with caution and that further data are required), these rapid rates, together with the lack of evidence for codivergence, suggest that human phylogenetic history does not provide suitable calibration points for JCV.
A high rate of evolutionary change in JCV is also compatible with analyses of intrahost diversity in populations of BKV, the only other known human polyomavirus. In particular, a 52-year-old patient was found to harbor BK viruses that differed by 0.55% (10). Assuming a clonal infection ~50 years before, this level of diversity would indicate a rate of 5 × 10⁻⁵ subs/site/year, similar to the estimates for JCV we find here. Likewise, the 0.15% diversity in a patient infected for approximately 40 years yields a rate of 2 × 10⁻⁵ subs/site/year (10). A separate study of healthy transplant recipients found less intrahost diversity, yet the phylogenetically grouped BKV populations in four out of six patients still showed nucleotide differences (35). Performing the same conservative calculation as that described above suggests that if the patients contracted the virus as infants, rates of evolution are between 4.0 × 10⁻⁶ and 7.0 × 10⁻⁶ subs/site/year. If, on the other hand, the virus was contracted at the time of the kidney transplant, evolutionary rates would range from 1.8 × 10⁻⁴ to 7.8 × 10⁻⁴ subs/site/year. In either of these scenarios, the rate appears to be closer to the rate derived independently of codivergence assumptions than to the codivergence-based rate estimates. Finally, employing the covarion model of nucleotide substitution, which has been proposed as a means to reconcile virus and host divergence times (18), failed to extend the divergence times of JCV; although the covarion model gave a significantly better fit than a noncovarion model, the total length of the phylogenetic tree was reduced in the former (analysis performed using MrBayes; methods and results are available from the authors on request).
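The first two intrahost rates quoted above can be reproduced with a back-of-the-envelope calculation. The sketch below assumes that the reported diversity is a pairwise distance between two lineages that have each accumulated substitutions since a single clonal infection, so the distance is divided by twice the elapsed time. That factor of two is an assumption on our part, chosen because it reproduces the rounded figures in the text; the function name is hypothetical.

```python
def rate_from_pairwise_diversity(diversity_fraction: float, years_since_infection: float) -> float:
    """Substitution rate (subs/site/year) from pairwise diversity.

    Assumes two lineages diverging from one clonal ancestor, so the observed
    distance accumulated over 2 * years_since_infection of evolution in total.
    """
    if years_since_infection <= 0:
        raise ValueError("years_since_infection must be positive")
    return diversity_fraction / (2.0 * years_since_infection)


# 0.55% diversity, clonal infection roughly 50 years earlier.
rate_patient_1 = rate_from_pairwise_diversity(0.0055, 50)
# 0.15% diversity, infection roughly 40 years earlier.
rate_patient_2 = rate_from_pairwise_diversity(0.0015, 40)

print(f"{rate_patient_1:.1e}", f"{rate_patient_2:.1e}")
```

Shortening the assumed time since infection raises the inferred rate proportionally, which is why the transplant-time scenario gives much higher rates than the infant-infection scenario.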
In the case of some DNA viruses, notably herpesviruses and papillomaviruses, it has been possible to estimate substitution rates using well-established patterns of host-virus codivergence (5, 24, 25). The rates inferred in these viruses are generally low, in the range of 10⁻⁷ to 10⁻⁹ subs/site/year (5, 23). In the case of the herpesviruses (which range from 150 to 230 kb in length), these rates are also compatible with the notion that DNA microbes share a roughly constant mutation rate per genome per replication (estimated at ~0.003 mutations/genome/replication [12, 13]), so that the per-site rate scales inversely with genome size. While the rapid rates of evolution recently observed in the 5-kb autonomous carnivore parvoviruses of ~10⁻⁴ to 10⁻⁵ subs/site/year (30) would seem to support this notion, the low rates found in the 8-kb papillomaviruses imply that genome size is not the only factor influencing substitution rates. It is evident that viral generation times (replication rates) will strongly influence substitution rates per unit of time, although accurate measurements of generation times are lacking for most viruses. Our study suggests that it is necessary to reexamine many previously held suppositions regarding substitution rates in DNA viruses and to more accurately determine the similarities and differences between long- and short-term as well as intra- and interhost substitution rates.
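To make the genome-size argument concrete, the sketch below divides the quoted per-genome figure of about 0.003 mutations/genome/replication by the genome lengths mentioned in the text. The result is a per-site rate per replication, not per year; converting to a yearly rate would require generation times, which the text notes are lacking for most viruses. The names are illustrative only.

```python
MUTATIONS_PER_GENOME_PER_REPLICATION = 0.003  # approximate figure quoted in the text


def per_site_mutation_rate(genome_length_bp: int) -> float:
    """Per-site, per-replication mutation rate implied by a constant per-genome rate."""
    if genome_length_bp <= 0:
        raise ValueError("genome_length_bp must be positive")
    return MUTATIONS_PER_GENOME_PER_REPLICATION / genome_length_bp


# Genome lengths taken from the text (in base pairs).
genome_lengths = {
    "herpesvirus (150 kb)": 150_000,
    "herpesvirus (230 kb)": 230_000,
    "papillomavirus (8 kb)": 8_000,
    "parvovirus (5 kb)": 5_000,
}

for name, length in genome_lengths.items():
    print(f"{name}: {per_site_mutation_rate(length):.2e} mutations/site/replication")
```

On this basis alone the 8-kb papillomaviruses and the 5-kb parvoviruses would be expected to have similar per-site rates, so their very different observed substitution rates point to other factors, such as generation time.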
This work was supported by a Howard Hughes Medical Institute fellowship to L.A.S.
Supplemental material for this article may be found at http://jvi.asm.org/.