Journal of Virology, March 2006, p. 2349-2357, Vol. 80, No. 5
0022-538X/06/$08.00+0 doi:10.1128/JVI.80.5.2349-2357.2006
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
UMR Biologie et Génétique des Interactions Plantes-Parasites, CIRAD-INRA-ENSAM, TA 41/K, Campus International de Baillarguet, 34398 Montpellier Cedex 05, France,1 UMR Biologie et de Gestion des Populations, CIRAD-ENSAM-INRA-IRD, Campus International de Baillarguet, 34398 Montpellier Cedex 05, France2
Received 27 September 2005/ Accepted 6 December 2005
The concept of the viral population has been hotly debated around the term quasispecies (6, 12, 22, 47), but it is undisputed that an RNA virus population is a swarm of mutant genomes among which complementation occurs to a variable extent for different vital functions (16). The limit or frontier of virus populations is difficult to determine and varies widely depending on the scientific questions addressed by various authors. This limit is often logically determined by the physical or geographical barriers that separate host populations. Infected pluricellular hosts have been, and are still, often considered as delineating the minimal virus population since no clear isolation is perceived (or known) among groups of viral genomes replicating in various locations within this host, and mixing can occur through the vascular system, resembling a panmictic situation. Consequently, in animals and most particularly in plants, the structure of virus populations is seldom described in detail at an intrahost level. When sequence analysis of a genome pool originating from a single host is reported, this pool is usually considered a unique population sample and is rarely regarded as a mix of several possible bulk genomes originating from various organs and tissues (14, 19, 45).
The few data available in the literature at this intrahost level suggest a more complex pattern than the genuine panmictic situation. Remarkably, for Human immunodeficiency virus type 1 (HIV-1), it was shown that the assumption that populations are panmictic within a host is inconsistent with the observed data. Instead, the metapopulation concept, where a population is made up of several discrete subpopulations that undergo turnover through frequent foundation and extinction events, could play a central role in the evolution of HIV-1 (17). Differentiation of subpopulations in various organs of the same host has been further confirmed more recently for HIV (24, 30, 33, 37) and has also been reported for other animal virus species such as Hepatitis C virus (1, 9, 25) and TT virus (29).
In plants, available information is scarcer. Analysis of the genetic composition of virus populations during systemic invasion of tobacco plants by Tobacco mosaic virus (36) and Cucumber mosaic virus (26) or wheat plants by Wheat streak mosaic virus (15, 20) has clearly revealed the existence of severe bottlenecks when a virus population is colonizing new leaves, possibly inducing strong genetic drift and differentiation within each leaf, related to the founder effect. Perennial plants are particularly well suited as models for the study of the structure of virus populations. Indeed, infections commonly persist for many years without killing the host, and the viral population can potentially be analyzed after a much longer within-host evolution than in annual plants. It is thus surprising that very few studies have investigated the structure of virus populations within perennial hosts. Previous hints, however, indicate that the composition of sequence variants (also designated haplotypes) may differ in various locations of a single host tree for Apple stem grooving virus (27) and Citrus tristeza virus (10), although other data appear to be contradictory (11).
To further study the differentiation of several subpopulations of plant viruses at the intrahost level and to elucidate the mechanisms of such differentiation, we have decided to use a perennial plant model chronically infected by a potyvirus following a unique inoculation event. The extensive analysis presented in this report reveals a series of striking phenomena that have thus far been mostly overlooked in plant virology. Thirteen years after the initial inoculation, we demonstrate that a high genetic diversity has built up in the virus population, which is structured by the architecture of the tree. Beyond the observation that viral genetic diversity increases when moving up from old (trunk and then limb) to newly formed (branches and then leaves) organs, our data clearly demonstrate that several viral subpopulations differentiate over the years as they become isolated in different limbs of the host tree and evolve independently by "contiguous range expansion." In addition, we present evidence that each individual leaf is infected by a single haplotypic variant; thus, the tree harbors, through its myriad leaves, a huge collection of mutant genomes comprising the complex sum of a virus population(s).
Sampling protocol. In June 2004, 13 years after PPV inoculation, we carried out a large-scale stratified sampling on the infected Prunus tree. As schematized in Fig. 1, we collected the following samples on the tree: (i) 3 main roots (bark only) and 18 young terminal roots (pieces of well-separated fibrous root); (ii) 12 bark samples from the trunk, evenly distributed around the periphery of a virtual transverse section; (iii) 12 bark samples from the base of each of the five constitutive limbs, evenly distributed at the periphery of virtual transverse sections; (iv) 1 bark sample from the base of each of 23 newly formed branches, with a maximum of 5 branches (when available) being sampled on each of the five constitutive limbs; and (v) 10 whole-leaf samples (when available) from each of the 23 young branches. In total, 230 whole leaves, 23 bark samples of young branches, 60 bark samples of limbs, 12 bark samples of the trunk, 3 bark samples of main roots, and 18 young terminal roots were collected. All samples were stored at −80°C until use for immunocapture reverse transcription (RT)-PCR and single-strand conformation polymorphism (SSCP) analysis.
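As a quick consistency check on the stratified design, the per-stratum counts given above can be tallied. This is only a minimal sketch: the dictionary keys are illustrative labels, not names used in the article, and the counts are the planned numbers stated in the protocol.

```python
# Planned sample counts per stratum, taken from the sampling protocol.
# Keys are illustrative labels (assumptions), not identifiers from the article.
planned_samples = {
    "whole leaves": 23 * 10,        # 10 leaves on each of 23 young branches
    "branch bark": 23 * 1,          # 1 bark sample per young branch
    "limb bark": 5 * 12,            # 12 bark samples on each of 5 limbs
    "trunk bark": 12,
    "main root bark": 3,
    "young terminal roots": 18,
}

total_samples = sum(planned_samples.values())
print(total_samples)  # 346
```

The tally gives 346 collected samples, whereas the results refer to 333 samples. The legend of Fig. 2 mentions samples for which immunocapture RT-PCR failed repeatedly, which plausibly accounts for the difference, although the text does not state this explicitly.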
FIG. 1. Sample locations within the Prunus persica host. The architecture of the tree is drawn schematically, comprising terminal roots, main roots, trunk, constitutive limbs, yearling branches, and leaves. An extensive sampling was conducted according to the stratified pattern as follows: dots situated at the periphery of each of the upper branches indicate whole-leaf samples; the dot situated at the base of each of the upper branches indicates a branch bark sample; dotted circles illustrate virtual transverse sections, around which each dot represents one bark sample; and dots on the main and young terminal roots indicate bark and whole tissue samples, respectively. Groups of samples Br1 to Br23, Li1 to Li5, Tr, and Ro are hypothesized to contain distinct subpopulations (see text). Samples from the main and young terminal roots were grouped in a single Ro subpopulation.
Plant extracts were prepared by grinding leaves, barks, and fragments of roots 1/20 (wt/vol) in phosphate-buffered saline (PBS)-Tween (PBS containing 0.05% [vol/vol] Tween 20 and 2% [wt/vol] polyvinylpyrrolidone). For each sample, 100 µl of supernatant from a low-speed centrifugation (12,000 × g, 5 min) was incubated overnight at 4°C in tubes previously coated with PBS-Tween containing 1 µg ml⁻¹ polyclonal anti-PPV immunoglobulin G; the supernatant was then washed away with two rinses with the same buffer. RT was performed directly in these immunocapture tubes using a reverse transcription kit (Promega, Madison, WI), according to the supplier's recommendations. One microliter of the cDNA-containing mixture was then used for further PCR amplification. The yields of the resulting RT-PCR products were systematically checked on agarose gels and then stored at a concentration of 100 ng µl⁻¹.
SSCP analysis and sequencing. For SSCP analysis, 5 µl (0.5 µg) of each of the RT-PCR products was mixed with 12 µl of denaturing solution (95% formamide, 20 mM EDTA, pH 8.0, 0.05% bromophenol blue, and 0.05% xylene cyanol), heated for 10 min at 95°C, and immediately cooled on ice. The DNA strands were separated by nondenaturing 8% polyacrylamide gel electrophoresis in Tris-borate-EDTA buffer (89 mM Tris-borate, pH 8.0, 2 mM EDTA), with a constant current of 28 mA per minigel, and the temperature was maintained at 4°C. The duration of the electrophoresis run varied with the size of the fragment analyzed and was 4, 5, and 6 h for regions A, B, and C, respectively. The gels were finally stained with a DNA silver staining kit (Amersham Biosciences, Little Chalfont, United Kingdom).
Tissue samples yielding different SSCP patterns in at least one of the genomic regions analyzed were considered to contain different haplotypes. On the one hand, sequencing of all different haplotypes confirmed that they indeed differ by one or more mutations; on the other hand, sequencing of up to five samples (when available) containing identical haplotypes confirmed that they displayed strictly identical sequences.
Fragmentation of the data set for statistical analysis. Samples from each of the 23 newly formed branches (10 leaf samples and one bark sample per branch, named Br1 to Br23), from each of the five constitutive limbs (12 bark samples per limb, named Li1 to Li5), from the trunk (12 bark samples, named Tr), and from the whole root system (named Ro) were a priori considered to contain discrete population units. Thus, in total, we initially hypothesized that we were working with 30 putatively distinct subpopulations positioned as shown in Fig. 1.
Descriptive statistics of genetic structure. Genetic differentiation between the 30 PPV subpopulations defined above was estimated by the FST statistic (46), using the Arlequin version 2.000 program, and the significance was tested by bootstrap analysis based on 10,000 replicates.
The potential effect of a limb that supports branches and leaves on the total variation of haplotype distribution was studied by analysis of molecular variance (AMOVA) (13). This procedure was performed by evaluating the type and frequency of haplotypes detected in each limb (including associated branches and leaves) and by estimating the probability that their distribution was random. This test was evaluated using a bootstrap analysis with 10,000 replicates. The statistic for differentiation between groups of subpopulations on different limbs was FCT, the significance of which is similar to the corresponding percentage of variance. The second level of variation was between subpopulations within each limb. This source of variation was tested by exchanging the haplotypes among Br and Li subpopulations in each limb, and the statistic of differentiation was FSC, the significance of which is also evaluated as a percentage of the corresponding variance. The third level of variation was within subpopulations, and the statistic of differentiation was FST, the significance of which is also evaluated as a percentage of the corresponding variance.
For neutrality testing, the program DnaSP (34) was used to study Tajima's D and Fu and Li's F* statistics.
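To make the neutrality statistic concrete, the following is a minimal sketch of Tajima's D as defined in Tajima's 1989 formulation, computed from an ungapped alignment. It is not the DnaSP implementation, which may treat gaps and multiple hits differently; the function name `tajimas_d` and the toy alignment are hypothetical and are not data from this study.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for a list of aligned, ungapped sequences (n >= 4).

    Returns None when there are no segregating sites (D is undefined).
    """
    n = len(sequences)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to equal length")

    # S: number of alignment columns showing more than one nucleotide.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        return None

    # Mean number of pairwise differences between sequences.
    pairs = list(combinations(sequences, 2))
    mean_pairwise = sum(
        sum(a != b for a, b in zip(x, y)) for x, y in pairs
    ) / len(pairs)

    # Constants from Tajima's variance derivation.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise - seg_sites / a1) / sqrt(variance)


# Hypothetical toy alignment: one common sequence plus three singleton variants.
toy_alignment = ["AAAA", "TAAA", "ATAA", "AATA"]
toy_d = tajimas_d(toy_alignment)
print(toy_d)
```

In this toy alignment every mutation is a singleton, so D is negative. Whether a value departs significantly from neutral expectations still has to be judged against the appropriate null distribution, which this sketch does not compute.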
Nested clade analysis. We used a nested clade analysis (NCA) to evaluate the association between haplotypes and their geographical range. The NCA was performed according to the procedure described previously by Templeton et al. (43) and can be divided into a series of successive steps implemented with the computer programs TCS version 1.13 (2) and Geodis (32). First, a haplotype tree or network was constructed according to a statistical parsimony procedure (23, 40). The same TCS program was used to reconstruct a nonrooted haplotype genealogy and to identify the most ancestral haplotype. Second, the nesting procedure (3, 4, 40, 42, 44) was applied to the haplotype network. The resulting nested clades are designated by C-N, where C is the nesting level of the clade and N is the number assigned to a particular clade at a given nesting level (41). Once the cladogram had been converted into a nested series of linked clades, the geographical data were quantified via a distance matrix (in centimeters) based on the tree (host) structure. Finally, the computation of different distance statistics (Dc, the geographical range of a particular clade, and Dn, the distribution of a particular clade relative to its sister clades) was performed using Geodis (32) according to the procedures described previously by Templeton et al. (43).
In cases where clades were nonrandomly distributed, the biological causes of the haplotype-geography association were interpreted using the inference key previously proposed by Templeton et al. in 1995 (43) and updated recently (31). This inference key aims to discriminate between distinct evolution patterns throughout the hierarchical clade levels. These patterns are designated as restricted gene flow, past fragmentation, and range expansion.
TABLE 1. Distribution of the 33 haplotypes within samples composing the 23 putative subpopulations
FIG. 2. Haplotype distribution superimposed on the schematic architecture of the tree. Each haplotype is represented as a thin rectangle composed of three distinct compartments illustrating the different sequences of regions A, B, and C. To facilitate the schematic representation of haplotypes, the first detected haplotype, H1, for regions A, B, and C is represented in green. When the sequence of a given genomic region varies, the color of the corresponding compartment changes. Blank haplotypes correspond to samples from which the immunocapture RT-PCR failed repeatedly. (1) Haplotypes were detected in the 18 young terminal root samples. (2) Haplotypes were detected in the three main root samples. (3) Haplotypes were detected in the 12 bark samples from the trunk. (4) Haplotypes were detected in the 60 bark samples from constitutive limbs. (5) Haplotypes were detected in the 23 bark samples from yearling branches. (6) Haplotypes were detected in the 230 leaf samples.
To test for possible differentiation of PPV populations within this single host tree, the whole sample set was divided into 30 putative subpopulations according to the architecture of the tree, with 23 representing subpopulations from each yearling branch (Br1 to Br23), 5 representing subpopulations from the bark of each major limb (Li1 to Li5), and 2 representing subpopulations from the bark of the trunk and from the root system (Tr and Ro, respectively). We estimated the haplotype (H) and nucleotide (π) diversity within each of the 30 putative subpopulations (Table 2). The values were highly variable and ranged from 0.0 to 0.800 (±0.114) and from 0.0 to 0.0240 (±0.00054) for H and π, respectively. By comparing H and π obtained from bulk samples from the trunk, the limbs, the branches, and the leaves (Table 2), it appears that the diversity increases significantly when sampling is done upward in the host tree from the trunk to the leaves, from older to younger organs. Our sampling of the root system was only exploratory and not as extensive as that of the rest of the tree. For this reason, the variability found in young fibrous roots is not further considered in the detailed analysis described below and will be discussed later.
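The two diversity indices can be illustrated with a minimal sketch. It assumes the standard Nei estimators (haplotype diversity as n/(n − 1) × (1 − Σp²) and nucleotide diversity as the mean number of pairwise differences per site), which are the definitions commonly reported by population genetics software. The function names and the example subpopulation are hypothetical and are not data from Table 2.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotype_labels):
    """Nei's unbiased haplotype (gene) diversity for one subpopulation."""
    n = len(haplotype_labels)
    if n < 2:
        return 0.0
    counts = Counter(haplotype_labels)
    sum_squared_freqs = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_squared_freqs)


def nucleotide_diversity(sequences):
    """Mean number of pairwise differences per site (aligned, ungapped)."""
    n = len(sequences)
    if n < 2:
        return 0.0
    length = len(sequences[0])
    pairs = list(combinations(sequences, 2))
    total_differences = sum(
        sum(a != b for a, b in zip(x, y)) for x, y in pairs
    )
    return total_differences / len(pairs) / length


# Hypothetical subpopulation: 10 leaves, two haplotypes ("X" and "Y") at equal
# frequency, differing at a single position of a 10-nucleotide fragment.
labels = ["X"] * 5 + ["Y"] * 5
fragments = ["AAAAAAAAAA"] * 5 + ["AAAAAAAAAT"] * 5

h_example = haplotype_diversity(labels)
pi_example = nucleotide_diversity(fragments)
print(round(h_example, 4), round(pi_example, 4))  # 0.5556 0.0556
```

A subpopulation in which every sample carries the same haplotype gives 0.0 for both indices, which corresponds to the lower bound of the ranges reported above.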
TABLE 2. Haplotype and nucleotide diversity within PPV putative subpopulations
In order to evaluate whether the differentiation detected between the subpopulation units is structured, and how it is structured, we performed an AMOVA test on the haplotype distribution at different hierarchical levels: the major limbs, the yearling branches within each limb, and the leaves within each yearling branch. While the AMOVA results (Table 3) showed significant genetic structure at the three hierarchical levels examined, the limb hierarchical level appears to explain the majority of the variation (55.04%). Differences between yearling branches of the same limb contributed to only 13.05% of the total variation. The remarkably large residual variation of 31.9% of the total is due to the high polymorphism within the 23 yearling branches (discussed in detail below).
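The percentages of variation quoted above can be related to hierarchical fixation indices. The sketch below assumes the usual AMOVA definitions, in which FCT, FSC, and FST are ratios of the three variance components and satisfy (1 − FST) = (1 − FSC)(1 − FCT). The function name is hypothetical, and the derived values are illustrative: they are computed here from the rounded percentages in the text and are not copied from Table 3.

```python
def fixation_indices(among_groups, among_pops_within_groups, within_pops):
    """Hierarchical fixation indices from three AMOVA variance components.

    Components may be given as absolute variances or as percentages of the
    total; only their ratios matter.
    """
    total = among_groups + among_pops_within_groups + within_pops
    f_ct = among_groups / total
    f_sc = among_pops_within_groups / (among_pops_within_groups + within_pops)
    f_st = (among_groups + among_pops_within_groups) / total
    return f_ct, f_sc, f_st


# Percentages of variation quoted in the text: among limbs, among yearling
# branches within limbs, and within yearling branches.
f_ct, f_sc, f_st = fixation_indices(55.04, 13.05, 31.9)
print(round(f_ct, 3), round(f_sc, 3), round(f_st, 3))  # 0.55 0.29 0.681
```

Under these assumptions, the limb level alone accounts for an FCT of about 0.55, in line with the statement that the limbs explain the majority of the variation.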
TABLE 3. AMOVA within the PPV sample set
FIG. 3. Maximum parsimony network and nested clades of the 33 PPV haplotypes. Haplotypes are named as described in Table 1, and each connecting line represents a single mutational step between two haplotypes. Missing (hypothetical) haplotypes are represented by a "0." Dotted-line rectangles enclose step 1 level clades and are designated by "1-n"; plain-line rectangles enclose step 2 level clades and are designated "2-n"; and the thick line separates step 3 level clades (3-1 and 3-2). Haplotype H1 has the highest outgroup probability, as indicated by the TCS program.
Clade 1-1 contains the largest share of the samples (141/333) and, in particular, samples containing the haplotype H1 (133/333), which is proposed to be the root of the network.
We measured the geographical distance (in centimeters) between the positions of all samples in the Prunus tree and compared the haplotype network against the distance matrix. A significant nonrandom geographical association was found for 10 of the 20 clades tested. According to Templeton's inference key (31, 43), the inferred evolutionary process (Table 4) explaining the geographical association of clades was contiguous range expansion for clades 1-1, 1-4, 1-6, and higher clade 2-2 and allopatric fragmentation for clade 1-7. However, for reasons discussed below, we believe the case of clade 1-7 is to be considered with caution. Finally, no conclusive inference could be determined for clade 1-2 and higher clades 2-1, 2-5, 3-1, and 3-2.
TABLE 4. Evolution process of PPV populations within a single host tree
The observation that a unique sequence variant is detected in each of the very numerous samples is remarkably novel and has considerable implications. For main roots, trunk, and constitutive limbs, both the number of bark samples and the haplotypic diversity are low, and mixes could easily be overlooked under such conditions. In contrast, in leaves, the genetic diversity is very large, and up to 230 samples were analyzed. This situation was also confirmed in unrelated experiments, where single-sequence PPV variants were found in each of 450 additional whole-leaf samples from experimentally inoculated Prunus trees (C. Jridi, unpublished results). Previous reports have demonstrated the existence of strong genetic bottlenecking during the invasion of leaves by plant virus populations (15, 20, 26, 36). However, these studies suggested that more than one infectious unit was most often initiating the within-leaf population. To make sure that the surprising observation reported here was not an artifact of our experimental SSCP conditions, we pooled leaf samples in pairs and verified that haplotype mixes were readily detectable (data not shown). Moreover, we conducted SSCP on variable ratios of two PPV sequence variants in mixed solutions and observed that a sequence variant present at a frequency of 0.1 is still detectable (data not shown), a value also reported previously (35). From the latter test, we concluded that each leaf contains one major sequence with no other variant reaching a frequency of 0.1. Even in branches where two haplotypes are found with a high frequency among the 10 corresponding leaves (i.e., Br6, Br8, and Br11), no mix was ever detected within a single leaf. The only logical conclusion from these results is that the virus population is subjected to biological cloning at most (if not all) leaf invasions, and the tree thus harbors a tremendous number of clones within the complex virus population, each individually isolated in a single leaf.
In apparent contradiction, a previous study on PPV has reported mixes of sequence variants within single leaves (5). In that study, healthy plants were coinoculated with two variants that were then colonizing "free-land" cells and tissues at their own pace, obviously reaching some leaves concomitantly. Interestingly, in an analysis of coinfection at the cellular level, those same authors reported that single cells were rarely (if ever) coinfected, thus suggesting that PPV does not easily invade "occupied land," which makes reconciliation with our results possible. Indeed, under our experimental conditions, with the virus being present well before expansion of new leaves, we propose that the cells initiating buds might become rapidly infected by very few virus genomes (most often a single one) present in neighboring cells, subsequently precluding secondary infection by other variants. In this hypothesis, newly formed tissues are colonized by locally established haplotypes and not by those circulating in the phloem, as one would expect from the general scheme that describes virus primary invasion of a healthy host (38).
H1 was the only haplotype detected in the bark of the trunk. Interestingly, the bark samples of the major limbs contain H1 and three additional haplotypes, the bark samples of yearling branches contain the four haplotypes detected below plus two additional ones, and the leaves contain the six detected below plus 27 new ones. This addition of new haplotypes when going from old to young organs suggests that the tree reflects the chronology of the appearance of virus diversity. Consistently, the phylogeny network containing all haplotypes (Fig. 3) proposes that H1 is the most probable root from which diversity has built up slowly over the years. Although indirect, these data partially compensate for the fact that the original virus source plant, on which the aphids were fed prior to inoculation of the tree analyzed here, was lost 13 years ago, preventing any possibility of directly identifying the ancestral haplotype. H1 was also found in old main roots, whereas additional haplotypes appeared in young terminal roots, perhaps suggesting a similar phenomenon of PPV diversification downward with the growth of the root system. The presence (together with H1) of some haplotypes at a low frequency (below 0.1) in the initial inoculum, and their subsequent increase in frequency due to genetic drift and/or founder effect in samples of limbs, branches, and leaves, cannot be excluded. However, because the virus has replicated for 13 years in the tree and because the haplotypic diversity increases in comparable younger tissues (see H and π in barks of trunk, barks of limbs, and barks of branches in Table 2), it is likely that most haplotypes were derived from H1, accumulating new mutations after the original inoculation event.
As mentioned above, fragmentary data had previously hinted at a within-host differentiation of virus populations in perennial plants. AMOVA based on the breakdown of our data set into 30 discrete subpopulations confirms the importance of this phenomenon for PPV in a Prunus host. More innovatively, AMOVA demonstrates that this differentiation is structured by the architecture of the tree, with subpopulations diverging most significantly in different major limbs. From Fig. 2, one can immediately perceive that different PPV lines become isolated in major limbs, with haplotypes that occur frequently in samples from one limb, with its associated branches and leaves, being totally absent in samples from another limb. To support this hypothesis statistically, we applied NCA to evaluate the association between haplotype and geographical range within the host. First of all, the phylogeny network shows that haplotypes abundant in more than one major limb (H1 and H4) are likely the most ancestral, suggesting that they existed prior to limb development and corresponding PPV line isolation. Moreover, the inferred evolution process for clades 1-1, 1-4, 1-6, and 2-2 is contiguous range expansion, defined previously by Templeton (41) as a gradual, moving front of expansion for a given population. The possibility of range expansion for a virus population in a tree that is already systemically infected is limited to the newly formed branches and leaves (and roots) appearing over time. This result is consistent with the observation that PPV haplotypes seem to be unable to recolonize previously infected cells and confirms our conclusion that the virus population is expanding in one direction, toward young extremities as they develop, with no possibility of return and mixing via the vascular system in previously infected organs and tissues. We believe that this mechanism satisfactorily explains how several lines become isolated in different parts of the tree.
Because neutrality tests detected no selection acting on any haplotypes, it is also likely that upon isolation, the lines differentiate rapidly due to the drastic bottlenecks discussed above and the associated genetic drift. Contiguous range expansion is the most likely inference, as it concerned 282 of the 333 samples analyzed. Other clades gave inconclusive inferences, except clade 1-7, the evolutionary process of which was inferred as allopatric fragmentation. We believe that this latter case is not reliable since clade 1-7 groups a very limited number of samples.
Overall, the results presented in this report have important implications, as they challenge some views commonly adopted among plant virologists and beyond. While panmixia has been rejected for within-host HIV populations, with the concept of the metapopulation applying better, we demonstrate here that differentiation can go further in other virus species such as PPV, with several populations becoming isolated and evolving independently for years in a single host. PPV is a member of the genus Potyvirus, which alone accounts for nearly 25% of described plant virus species. It will be interesting to evaluate how the situation reported here applies to other potyviruses and to other virus genera infecting annual or perennial plants. Finally, the fact that the leaves harbor a tremendous collection of clones from the virus population(s) is also a striking situation that has been largely overlooked, with huge implications for both theoretical and practical aspects of virus biology, epidemiology, and evolution; how these findings apply to other tree-infecting virus species is an interesting prospect, since viruses are predominantly acquired from the leaves during transmission by aerial vectors.
This work was supported by the government of Tunisia, by the French ministry of Agriculture, and by a grant from DADP-INRA/Région Languedoc-Roussillon.
Supplemental material for this article may be found at http://jvi.asm.org/.