Journal of Virology, April 2008, p. 3584-3589, Vol. 82, No. 7
0022-538X/08/$08.00+0 doi:10.1128/JVI.02506-07
Copyright © 2008, American Society for Microbiology. All Rights Reserved.
Institut de Recherche pour le Développement (IRD), UMR RPB, BP 64501, 34394 Montpellier cedex 5, France,1 Centre National de la Recherche Appliquée au Développement Rural (FOFIFA), BP 289, Mahajanga 401, Madagascar,2 Botany Department, Dar es Salaam University, P.O. Box 35091, Dar es Salaam, Tanzania,3 Institut de l'Environnement et de Recherches Agricoles (INERA), Laboratoire de Biotechnologie et de Virologie Végétale, 01 BP 476, Kamboinsé, Ouagadougou, Burkina Faso,4 Institut de Recherche Agronomique du Niger, BP 60, Kollo, Niger,5 Africa Rice Center (WARDA), 01 BP 2031, Cotonou, Bénin6
Received 22 November 2007/ Accepted 4 January 2008
Interestingly, some RNA viruses change little or not at all over time. The best-documented example is an RNA plant virus, Tobacco mild green mosaic virus, which showed no increase in genetic diversity over the 90 years considered, in the longest series of isolates with known isolation times for any virus (20). Indeed, many studies have shown the remarkable genetic stability of RNA plant virus populations from different geographical regions, hosts, and collection times (21). It was claimed that most tobamovirus populations are very stable and do not evolve at a measurable rate (22). It was even observed that populations of Turnip yellow mosaic virus from Europe and Australia that probably separated more than 12,000 years ago differed by less than 1% (4).
Actually, the lack of estimates of evolution rates of RNA plant viruses over time may merely reflect the absence of a large enough number of isolates collected over a sufficiently long period. Heterochronous data for plant virus isolates are particularly difficult to gather compared to data for animal viruses, for which isolates are readily recovered from blood samples that have been collected over many decades and stored for medical purposes. Even then, the temporal component of variation can be blurred or biased by other sources of diversity, such as long-range dispersal, recombination events, and population subdivision, which are common features of plant viruses.
Since the 1920s, experimental evidence has established that RNA plant viruses can evolve rapidly, especially under selection pressures such as a change of host (21, 22). The evolution rate of Wheat streak mosaic virus was extrapolated from serial passage experiments (38). This method assumed that the mean rate of change measured in the laboratory reflected that of the natural viral populations, although constraints on evolution in nature and in experiments may differ. Recently, the evolution rate of Barley yellow dwarf virus was calculated by comparing an isolate preserved in old herbarium specimens to present-day specimens (29). This method postulates that the genetic diversity of the population at the time of sampling is negligible, so that sequences isolated at different times differ only by substitutions accumulated during the time interval. If this condition is not met, this method has an upward bias and overestimates the evolution rate (11). In addition, both attempts assumed that the molecular clock remained constant during the evolution of the viruses. The estimates of the evolution rates of these two RNA plant viruses therefore rely on critical but nontestable assumptions. Estimates of the evolution rates of plant viruses based on historical evidence such as outbreak records can be tentative only. Altogether, the evolution rate of an RNA plant virus has never been estimated by applying statistical methods developed to analyze temporally spaced sequences. This contrasts with the recent advances made for RNA animal viruses by using this statistical approach (11).
Rice yellow mottle virus (RYMV), of the Sobemovirus genus, was used to estimate the evolution rate of an RNA plant virus. RYMV has a high natural molecular diversity (18) and reaches a high concentration in rice (19). Experimentally, RYMV adapts rapidly to alternative hosts through accumulation of point mutations (25, 34). These features make RYMV an appropriate model with which to estimate the evolution rate of an RNA plant virus species. RYMV is an emergent virus, indigenous to Africa. It was first noticed in Kenya in East Africa in 1966 (3) and since then in almost all African countries where rice is grown, including Madagascar (39). RYMV is a major threat to rice cultivation (28). It has a narrow host range, restricted to wild and cultivated rice species and a few related grasses (2). RYMV is transmitted primarily by coleopterous beetles of the family Chrysomelidae and is disseminated by contact during cultural practices (28). Its genome contains four open reading frames (ORFs) (18). ORF1, located at the 5' end of the genome, encodes a protein involved in virus movement and gene silencing suppression. ORF2, which encodes the central polyprotein, comprises two overlapping ORFs. ORF2a encodes a serine protease and a viral-genome-linked protein (VPg). ORF2b, which is translated through a –1 ribosomal frameshift mechanism as a fusion protein, encodes the RNA-dependent RNA polymerase. The coat protein (CP) gene (ORF4) is expressed by a subgenomic RNA at the 3' end of the genome.
RYMV isolates were collected in 16 African countries between 1966 and 2006. The CP genes of 253 isolates were sequenced in this or in earlier studies (1, 18, 33, 39). Such a large collection of sequences over a 40-year period is unique for a plant virus and is used here to estimate the rate of change of RYMV. RYMV diversity is geographically structured, with different strains in East, West, and Central Africa (1, 33, 39). Substitution rates were first estimated by pairwise distance linear regressions from five phylogeographically based groups of isolates. Each group comprised isolates collected over the longest possible period of epidemiological record while other factors influencing diversity, which might adversely affect the analysis of temporally spaced viral sequences, were minimized (11). The five groups comprised 135 isolates in total. Rates were further assessed by Bayesian coalescent methods using 253 isolates originating from all regions of Africa. The rates were calculated under strict and relaxed molecular clock hypotheses (10) and under constant-size and skyline population models (14). The synonymous evolution rate was estimated from the evolution rate of the third codon position. The overall and synonymous evolution rates of the CP gene of RYMV were compared to the evolution rates of 50 RNA animal viruses (24, 26). Experimentally, the number of changes was calculated by comparing the full sequence of each isolate at inoculation with that 1 to 6 months later. Then the number of changes was compared to that estimated from the evolution rate. Altogether, we found that the overall and synonymous evolution rates of RYMV were within the ranges of those of RNA animal viruses. This shows that an RNA plant virus such as RYMV evolves as rapidly as most RNA animal viruses.
Five phylogeographically based groups of isolates were defined for pairwise distance linear regression estimates (see Table S1 in the supplemental material). Within each group, isolates belonged to the same phylogenetic cluster, originated from the same region, had the least population subdivision, and showed no evidence of long-distance movement or of recent recombination events. Group I, also referred to as S2/S3-CI, comprised 40 isolates from Côte d'Ivoire collected over 27 years. Strains S2 and S3, considered earlier to be distinct, were combined as a more intensive survey revealed a continuum between the two strains and are now referred to as strain S2/S3 (34). Group I had the densest sampling survey, the least residual variability, and minimal population subdivision. Group II (S2/S3-WA), with 68 isolates collected over 31 years, was an extension of group I that included 28 additional S2/S3 isolates from Guinea, Mali, and Sierra Leone, three countries that neighbor Côte d'Ivoire in West Africa. Group III (S1-WA) comprised 23 S1 isolates from Burkina Faso, Côte d'Ivoire, and Mali in West Africa, collected over 11 years, a shorter period, and displaying a higher residual diversity. Group IV (S1'-CA) included 31 isolates from Benin, Cameroon, Chad, Niger, Nigeria, and Togo collected over 23 years. These isolates formed a monophyletic cluster referred to earlier as the S1'-Central African strain to designate isolates from Cameroon and Chad in Central Africa; then, by extension, this designation was applied to isolates of the same lineage from countries to the west (Benin, Niger, Nigeria, Togo), although geographically these countries belong to West Africa. The isolates originated from a large geographic area and had a relatively high residual diversity. Group V (S4-LV) comprised 13 isolates of strain S4 collected over 34 years that originated from western Kenya, northern Tanzania, and southern Uganda around Lake Victoria. 
In group V, the sampling period was the longest but the number of isolates was the smallest. The five groups comprised a total of 135 isolates. An additional set of 118 isolates from all parts of Africa was included to make a group of 253 isolates for the Bayesian coalescent methods.
Pairwise distance linear regressions. Pairwise distance linear regressions are fast and useful methods to estimate the substitution rate from large data sets of heterochronous sequences. These methods are also used to determine if the sequence diversity exhibits an adequate temporal structure and to verify that no outlier substantially alters the substitution rate (26). With RYMV, the genetic distance (percentage of nucleotide differences calculated over the 720-nt CP gene) and the separation time (absolute value of the sampling interval, in years) of each pair of isolates were determined. The genetic distance was plotted against the separation time. The significance of the correlation coefficient between genetic distance and separation time (30) was tested after 100,000 permutations by using GENETIX (version 4.05; Laboratoire Génome, Populations, Interactions, CNRS UMR 5171, Université Montpellier II [http://www.genetix.univ-montp2.fr/genetix/intro.htm]). The slope of the linear regression of the pairwise genetic distance against the separation time is an estimate of the evolution rate (in nucleotide substitutions per site per year) (11, 12). A parametric bootstrap method for obtaining the variance of the rate was implemented in STATISTICA (StatSoft). The confidence interval was calculated with the percentile method: the 26th and 975th estimates after 1,000 replicates (when ranked) were, respectively, the lower and upper 95% confidence limits of the original estimate. However, this bootstrap procedure underestimates the width of the true confidence interval, since it does not consider the correlation of bootstrap replicates due to shared ancestry (11). Thus, pairwise distance linear regression is used here as an exploratory analysis of the evolution rate of the CP gene of RYMV.
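The exploratory procedure described above can be sketched in a few lines of Python. This is a minimal illustration only, not the GENETIX or STATISTICA implementation used in the study: the function names, the Mantel-style permutation of sampling years among isolates, and the three toy sequences are all assumptions introduced here.

```python
import random
from itertools import combinations

def p_distance(a, b):
    """Percentage of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return 100.0 * sum(x != y for x, y in zip(a, b)) / len(a)

def pairwise_points(seqs, years):
    """(separation time, genetic distance) for every pair of isolates."""
    return [(abs(years[i] - years[j]), p_distance(seqs[i], seqs[j]))
            for i, j in combinations(range(len(seqs)), 2)]

def slope_and_r(points):
    """Least-squares slope and Pearson correlation of distance on time."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

def permutation_p_value(seqs, years, n_perm=1000, seed=0):
    """One-sided p-value for r, shuffling sampling years among isolates
    (an assumed Mantel-style scheme, not necessarily the one in GENETIX)."""
    rng = random.Random(seed)
    _, r_obs = slope_and_r(pairwise_points(seqs, years))
    shuffled = list(years)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        _, r = slope_and_r(pairwise_points(seqs, shuffled))
        hits += r >= r_obs
    return (hits + 1) / (n_perm + 1)

def percentile_interval(estimates):
    """Percentile limits; for 1,000 replicates these are the 26th and
    975th ranked estimates, as in the text."""
    ranked = sorted(estimates)
    n = len(ranked)
    return ranked[n * 25 // 1000], ranked[n * 975 // 1000 - 1]

# Hypothetical toy data (not RYMV sequences): one change every 10 years.
toy_seqs = ["AAAAAAAAAA", "CAAAAAAAAA", "CCAAAAAAAA"]
toy_years = [1970, 1980, 1990]
slope, r = slope_and_r(pairwise_points(toy_seqs, toy_years))
rate = slope / 100.0  # percent per year -> substitutions/site/year
```

The slope is expressed in percent difference per year, so it is divided by 100 to obtain substitutions per site per year. Because pairs of isolates share ancestry, the points are not independent, which is why the text treats this regression as exploratory.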
Bayesian estimates. The best-fitting nucleotide substitution model was selected with ModelTest-style hierarchical likelihood ratio tests (35), as implemented in HyPhy (27). The best-fitting model was the Hasegawa-Kishino-Yano (HKY) model with gamma-distributed rate heterogeneity. The evolution rate was estimated from the full group of 253 heterochronous sequences within a Bayesian coalescent framework by a Markov chain Monte Carlo (MCMC) method using the BEAST program (13). The Bayesian MCMC method estimates the parameter as the mean of its marginal posterior distribution while simultaneously incorporating uncertainty in the underlying genealogy and other parameters. Evolutionary rates were estimated using both strict and relaxed molecular clocks (uncorrelated lognormal model) as implemented in BEAST (10). Two population genetic models were applied to analyze the data. The first model assumed that the population size was constant over time. The second, the skyline population model, inferred the population dynamics from the data themselves (14). The unweighted-pair group method using average linkages was used to construct the starting tree. Uncertainty in the estimated parameter values is summarized by the highest posterior density interval that contains 95% of the marginal posterior distribution. The length and number of MCMC chains were chosen so that the effective sample size for each parameter was >100, indicating that parameter space was sufficiently explored.
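To make the notion of a 95% highest posterior density (HPD) interval concrete, the sketch below computes it from a list of posterior samples as the narrowest window containing the requested share of the samples. This is a generic illustration under that assumption, not the code used by BEAST; the function name `hpd_interval` and the toy sample values are hypothetical.

```python
import math

def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the samples."""
    ranked = sorted(samples)
    n = len(ranked)
    # Number of samples the interval must cover (small tolerance guards
    # against floating-point noise in mass * n).
    k = max(1, math.ceil(mass * n - 1e-9))
    # Slide a window of k consecutive ranked samples and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: ranked[i + k - 1] - ranked[i])
    return ranked[start], ranked[start + k - 1]

# Hypothetical skewed sample: the interval excludes the extreme value.
toy_posterior = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
low, high = hpd_interval(toy_posterior, mass=0.9)
```

Unlike a percentile interval, the HPD interval need not cut equal tails, which matters for skewed posteriors such as those of rate parameters.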
Currently implemented coalescent methods assume panmictic populations, free of recombination and selection. Substantial selection, recombination, or population subdivision may adversely affect the analysis of temporally spaced viral sequences. The assumptions regarding selection and recombination are realistic: earlier studies showed that RYMV evolves under marked purifying selection, with few sites under diversifying selection (18), so that positive selection makes a minor contribution to the molecular evolution of the genome, and there is no evidence of recombination (18). By contrast, the assumption of panmixia does not hold, because the diversity of RYMV has a strong spatial basis (1, 39). Therefore, whether population subdivision affected the analysis of temporally spaced viral sequences was checked by comparing the estimates from the group of 253 isolates to estimates from the geographically based groups with minimal population subdivision. The synonymous evolution rate of RYMV was computed using the nucleotides at the third codon position, many of which are fourfold degenerate at the amino acid level and where most synonymous substitutions occur. This approach was chosen in order to compare the synonymous evolution rate of RYMV to the rates of 50 RNA animal viruses, calculated similarly (26). The evolution rate of the CP gene was extrapolated to the other ORFs and to the full genome, assuming proportionality between the evolution rate and nucleotide diversity. For this purpose, the nucleotide diversities of the ORFs and of the full genome were calculated from a set of 22 isolates representative of the geographic and molecular diversity by using DnaSP (37).
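Restricting the analysis to third codon positions amounts to keeping every third nucleotide of an in-frame coding sequence. A minimal sketch, assuming the alignment starts at the first base of the first codon and contains no frame-shifting gaps (the function name and the toy sequence are hypothetical):

```python
def third_codon_positions(cds):
    """Return the nucleotides at codon position 3 of an in-frame sequence."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return cds[2::3]

# Toy in-frame sequence: codons ATG, GCC, TTA.
toy_third = third_codon_positions("ATGGCCTTA")

# A 720-nt gene such as the RYMV CP gene yields 240 third-position sites.
n_sites = len(third_codon_positions("A" * 720))
```

The reduced alignment can then be analyzed with the same rate-estimation procedure as the full gene.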
Experimental studies. The highly susceptible cultivar Oryza sativa indica cv. IR64, the partially resistant cultivar Oryza sativa japonica cv. Azucena, and the highly resistant cultivars Oryza sativa indica cv. Gigante and Oryza sativa indica cv. Bekarosaka were inoculated with RYMV isolates of different strains. Infection after host change was monitored over time. Isolates were fully sequenced before inoculation and 1 to 6 months later. In all, six experiments were conducted. For each experiment, the number of nucleotide changes between the two stages was counted, and the numbers of synonymous and nonsynonymous substitutions were distinguished. The total number of changes observed experimentally for the six experiments pooled was subsequently compared to that estimated from the evolution rate of the full genome for the same time period.
FIG. 1. Genetic distance (percent nucleotide difference) and sampling interval (number of years in absolute value) of each pair of isolates for each of the five groups and the regression line between the two variables. r, estimate of the evolution rate; R, correlation coefficient.
TABLE 1. Estimates of the evolution rate of Rice yellow mottle virus by Bayesian analysis with different molecular clock and population genetic models
Comparison to RNA animal viruses. The RYMV evolution rate of 5.2 x 10–4 nt/site/year, estimated under the strict molecular clock and constant population models, was compared to the rates of 50 RNA animal viruses calculated with similar assumptions and using broadly similar analytical techniques (26). The evolution rate of RYMV was below the average (7.0 x 10–4 nt/site/year) but above the distribution median (3.6 x 10–4 nt/site/year) (Fig. 2). A similar conclusion was reached when the evolution rate of the CP gene (ORF4) was extrapolated to the other ORFs. The nucleotide diversities calculated from 22 isolates representative of the genetic and geographic diversities of RYMV were 0.098 for ORF4, 0.117 for ORF1, 0.057 for ORF2a, and 0.062 for ORF2b. Assuming proportionality between the nucleotide diversity and the substitution rate gave the following estimates: ORF1, 6.1 x 10–4 nt/site/year; ORF2a, 3.0 x 10–4 nt/site/year; ORF2b, 3.3 x 10–4 nt/site/year.
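The proportional extrapolation can be written as rate(ORF) = rate(CP) x diversity(ORF) / diversity(CP). The sketch below applies it to the values quoted above; the function name is hypothetical, and the inputs are the rounded figures from the text, so the last digit of a result may differ slightly from the published estimates.

```python
CP_RATE = 5.2e-4  # nt/site/year, strict clock, constant population size
DIVERSITY = {"ORF4": 0.098, "ORF1": 0.117, "ORF2a": 0.057, "ORF2b": 0.062}

def extrapolate_rates(cp_rate, diversity, reference="ORF4"):
    """Scale the CP (ORF4) rate by each ORF's diversity relative to the CP."""
    ref = diversity[reference]
    return {orf: cp_rate * pi / ref for orf, pi in diversity.items()}

rates = extrapolate_rates(CP_RATE, DIVERSITY)
for orf, value in rates.items():
    print(f"{orf}: {value:.2e} nt/site/year")
```

With these rounded inputs, ORF2a and ORF2b reproduce the quoted 3.0 x 10–4 and 3.3 x 10–4, whereas ORF1 comes out at about 6.2 x 10–4 rather than 6.1 x 10–4, a gap that is presumably due to rounding of the published inputs.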
FIG. 2. Overall (top) and synonymous (bottom) evolution rates of a range of 50 RNA animal viruses (Jenkins et al., 2002 [26]) and of the RYMV CP gene (solid bars and arrows). Horizontal dashed lines indicate the median and average evolution rates of the 50 RNA animal viruses.
The overall and synonymous rates of evolution of RYMV fell within the range of those of RNA animal viruses and were higher than those of many animal viruses. This shows that an RNA plant virus can evolve as rapidly as most RNA animal viruses. The substitution rate of RYMV was similar to that of Tomato yellow leaf curl virus (TYLCV), a fast-evolving plant DNA virus, recently calculated by the same Bayesian coalescent method (15). Altogether, the evolution rates of RYMV and TYLCV show that plant viruses can evolve as rapidly as most animal viruses but not as fast as the fastest-evolving animal viruses, such as human immunodeficiency virus type 1 or influenza A virus (24, 26). Evolution rates of RNA plant viruses, like those of RNA animal viruses (24, 26), may encompass a wide range of values. Similar studies should be conducted to determine the range of evolution rates of plant viruses.
Nucleotide changes in experimental studies. The number of changes observed experimentally was compared to that expected from the estimated evolution rate. The number of nucleotide changes observed experimentally on the full genome is quite variable between experiments (Table 2). A large variation in the number of changes was also observed when ORFs 2a and 4 of several other isolates were sequenced 1 to 6 months after inoculation (data not shown). Thus, inferring the rate of evolution from such low and variable numbers of experimental changes is speculative. Furthermore, the total number of changes observed experimentally greatly exceeded that estimated from the evolution rate. The ratio of the nucleotide diversity of the full genome to that of the CP gene for 22 representative isolates of RYMV is 0.8 (see above). Based on this ratio, the evolution rate of the full-length genome would be between 3.2 x 10–4 and 6.5 x 10–4 nt/site/year by extrapolation from the evolution rate of the CP gene (4 x 10–4 to 8 x 10–4 nt/site/year). Three to 6 nucleotide changes are expected when these rates are applied to the 4,450-nt genome for the length of time of the experiments, a number three to six times lower than the 19 changes observed experimentally (Table 2).
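The expected number of changes follows from a simple product: rate x genome length x duration. The sketch below, assuming this linear expectation, converts the extrapolated full-genome rates into changes per genome per year and computes the fold excess of the 19 observed changes over the 3 to 6 expected. The function names are hypothetical; the durations of the individual experiments are not restated here, so only per-year figures are computed.

```python
GENOME_LENGTH = 4450                        # nt, as given in the text
FULL_GENOME_RATES = (3.2e-4, 6.5e-4)        # nt/site/year, extrapolated range

def expected_changes(rate, genome_length, years):
    """Expected substitutions in one genome over `years`."""
    return rate * genome_length * years

# Expected changes per genome per year at the two ends of the range.
per_year = [expected_changes(r, GENOME_LENGTH, 1.0) for r in FULL_GENOME_RATES]

def fold_excess(observed, expected):
    """How many times the observed count exceeds the expected count."""
    return observed / expected

# 19 observed changes against the 3 to 6 expected for the pooled experiments.
excess = (fold_excess(19, 6), fold_excess(19, 3))
```

The per-year expectation is roughly 1.4 to 2.9 changes per genome, and the observed count exceeds the expectation about three- to sixfold, in line with the comparison made in the text.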
TABLE 2. Numbers of nucleotide changes observed in experimental studies with several RYMV isolates, rice cultivars, and durations of infection
Plant viruses as measurably evolving populations. Molecular studies have enlarged the scope of plant virus epidemiology (17, 21, 31). Recently, renewed attention has been paid to ecological studies dealing with topics such as plant viruses in wild ecosystems, virus spread from wild to cultivated plants, virus emergence, and host plant domestication or movement (5, 16, 23, 29, 40). The resulting ecological and epidemiological scenarios are related to plant virus history or even prehistory. They raise the recurrent question of the role of agriculture in the emergence and diversification of plant viruses. Indeed, a sound time scale of virus evolution is necessary in order to validate and to calibrate these scenarios. Our results show that the evolution rates of plant viruses can be calculated and consequently that these time scales can be estimated. For instance, the substitution rate of TYLCV was used recently to date the diversification of TYLCV species (15).
This work was supported by the ECOGER action of the French National Program ANR "Ecosphère continentale: processus, modélisation et risques environnementaux," supervised by INRA.
Published ahead of print on 16 January 2008.
Supplemental material for this article may be found at http://jvi.asm.org/.