Journal of Virology, February 2007, p. 1746-1761, Vol. 81, No. 4
0022-538X/07/$08.00+0 doi:10.1128/JVI.01390-06
Copyright © 2007, American Society for Microbiology. All Rights Reserved.
Institute of Ecology and Evolutionary Biology, National Taiwan University,1 Department of Plant Pathology and Microbiology, National Taiwan University, Taipei 106, Taiwan2
Received 3 July 2006/ Accepted 16 November 2006
The BBTV genome consists of at least six integral single-stranded circular DNA (ssDNA) components (BBTV DNA 1 to 6) (6, 22, 37). BBTV DNA 1 encodes a replication initiation protein (Rep). An open reading frame (ORF) of DNA 2 was proposed by Beetham et al. (3) on the basis of Northern blot hybridization and sequences obtained by 3' rapid amplification of cDNA ends from an Australian isolate, but the function of this gene is still unknown (22). DNA 3 encodes a viral coat protein for encapsidation (83). The products of DNA 4 and DNA 6 genes are very similar to BC1 and BV1 of the bipartite Begomovirus, respectively, and both genes have been suggested to be involved in virus movement (53, 62, 84). DNA 6 possibly encodes a nuclear shuttle protein, and DNA 4 encodes a protein that can redirect the products of DNA 6 to the cell periphery (53, 62, 84). DNA 5 has been shown to contain an LXCXE motif, and binding between DNA 5 products and retinoblastoma (Rb), demonstrated by yeast two-hybrid analysis, may be involved in host-cell cycle manipulation (82). In addition, more than one Rep-encoded component may be associated with virus infection in nanoviruses (22, 29, 30, 39, 63). Only one of these Rep genes, master Rep, can directly replicate non-self-replicable components, and the remaining Rep-encoded components are considered to be satellite viruses (29, 78, 80). In BBTV, DNA 1 encodes the master Rep, and several additional Rep-encoded components have been reported in the Asian strains (29, 30, 88, 91).
All BBTVs and associated components have a stem-loop structure with a conserved sequence TA(G/T)TATTAC in the loop region that is common to all nanoviruses. In addition, a stretch of 69 nucleotides (nt) flanking the stem-loop region shares at least 62% homology among BBTV DNA 1 to 6 and has been defined as the stem-loop common region (CR-SL) (6). Another stretch of 66 to 92 nt located 5' to CR-SL and sharing at least 76% sequence homology among BBTV DNA 1 to 6 has also been identified and defined as the major common region (CR-M) and is only present in BBTV (6).
In previous studies, two groups of BBTV, the South Pacific (including Indian and Egyptian isolates) and the Asian groups, have been identified from sequence analysis of coding and noncoding sequences of DNA 1, 3, and 6 in isolates obtained from different geographic regions (37, 38, 84). The past few years have seen the accumulation of extensive sequence information about BBTV components, including several full sets of BBTV genomes. These sequences provide the basis for our analysis of the evolutionary history of BBTV.
Although BBTV is an ssDNA virus, the basic mechanisms of molecular evolution, including mutation, recombination, and reassortment (for multipartite viruses), are qualitatively similar to those of most other plant viruses with a plus-sense RNA genome (for a review, see reference 19). However, studies of Rep and coat protein genes of cotton leaf curl geminivirus, a bipartite plant DNA virus, revealed a higher degree of natural variation than several plant and animal RNA viruses (64). Whether plant DNA and RNA viruses show distinct properties remains to be examined since the information on plant DNA viruses is mostly limited to Geminiviridae and Caulimoviridae. BBTV is an important and unusual multipartite ssDNA virus, unique among all known DNA viruses and nanoviruses, and serves as a great example for studying plant DNA virus evolution.
In this study, we first performed phylogenetic network analyses based on coding regions of the master Rep (DNA 1) and the coat protein (DNA 3) genes of BBTV. Phylogenetic network methods were developed to visualize a non-tree-like network of target sequences and are particularly useful in studying organisms with reticulate evolutionary history (34). Such methods have been incorporated in phylogenetic analysis of viruses as evidence of recombination or conflicting signals in the genomes of, for example, Dengue virus (28), primate lentivirus (61), and hepatitis E virus (81). To examine the congruence of the phylogenetic trees obtained from study of DNA 1 and 3 sequences, we conducted a preliminary screening of conflicting phylogenetic signals by an incongruence length difference (ILD) test (14). An ILD test is usually used to assess whether the data partitions are combinable, but it can be applied to detect hybridization (2) or horizontal gene transfer (42). Since the test can give false-positive results (9, 11, 36), we conducted nonparametric and parametric analyses to compare the obtained trees, using the Kishino-Hasegawa (KH) (40) and Shimodaira-Hasegawa (SH) (69) tests. In addition, we performed a phylogenetic analysis using CR-M regions from all six components of BBTV.
We included two BBTV isolates from Taiwan, one from a typical BBTV Taiwanese strain (Taiwan), and one from a newly identified mild strain of BBTV, the Taiwan type V (TW4) strain (72). The TW4 genome has only five components, corresponding to DNA 1 to DNA 5 of other BBTV genomes.
Finally, to obtain more evolutionary information of BBTV genomes, we conducted multivariate analysis based on codon and amino acid usages for different components of BBTV and other nanoviruses. Such analysis has been used for examining disparity in codon usages among genes in the same genome and/or patterns of biochemical composition among gene products of orthologous genes (56). An unusual pattern showing a specific gene positioned away from the gene clusters of the same genome on the two-dimensional (2-D) plot will indicate a different codon usage (codon bias) of the particular gene (7, 52). Although the virus genome is small, the coding sequence lengths of the BBTV genome are all longer than the minimal requirement for multivariate analysis of codon usage (i.e., 150 nt) (56). Clustering of codon usage according to specific genomes, as seen on the 2-D plot, is expected, and any outliers of the clusters are possible indicators of horizontal transferred genes, as has been demonstrated in Escherichia coli (49).
TABLE 1. Voucher information of BBTV sequences used in this study
or Top 10. DNA 1, 3, and 5 of BBTV were differentiated by restriction enzyme AluI digestion patterns. Nucleotide sequences of the cloned cDNAs were determined in both directions with use of an ABI Prism DNA sequencer 310 or 377 (Applied Biosystems, Foster City, CA). At least three individual clones of each component were sequenced, and the consensus sequence derived from the alignment of sequences of each component was used in this study.
Phylogenetic analysis. To obtain detailed phylogenetic information, we compiled data sets of coding regions of 43 nucleotide sequences of DNA 1 isolates and 29 sequences of DNA 3 isolates of BBTV. Corresponding sequences according to Hughes (32) from three other nanoviruses, Faba bean necrotic yellow virus (FBNYV), Milk vetch dwarf virus (MDV), and Subterranean clover stunt virus (SCSV), were used as outgroups. Phylogenetic network analysis involved use of Neighbor-Net (5), a distance method for constructing phylogenetic networks, incorporated in the program SplitsTree4 (34). The HKY85 nucleotide substitution model was used in the analysis, and branch support was estimated by bootstrapping with 1,000 replicates.
Phylogenetic analysis of DNA 4 and DNA 5 ORFs involved Bayesian inference (BI), neighbor-joining (NJ), and maximum parsimony (MP) methods. BI analysis was by use of MrBayes, version 3.1 (31), and NJ and MP by PAUP*, version 4.0b10 (73). In BI analysis, nucleotide substitution of the HKY85 model (23) was used with gamma distribution (alpha = 0.5) set as the rate distribution among sites in the analyses. We performed BI analyses with four chains of Markov chain Monte Carlo, sampling 1 tree per 5,000 generations for 1,500,000 generations, and the first 500 trees were excluded for calculation of posterior probability on each node. An HKY85 model for the nucleotide sequences was incorporated in NJ analysis by PAUP*, version 4.0b10 (73). In MP analysis, we conducted heuristic searches with 1,000 random addition replicates and tree bisection-reconnection branch swapping, and 10 trees were saved from each replicate. Because of the limited availability of data, only 10 and 11 sequences were included in the data sets for DNA 4 and DNA 5, respectively. The branch support was estimated by bootstrapping with 1,000 replicates for both NJ and MP analyses.
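The bootstrap support values mentioned above rest on resampling alignment columns with replacement. The following is a minimal Python sketch of a single resampling step, not the PAUP* or SplitsTree4 implementation; the function name `bootstrap_alignment` and the toy alignment are hypothetical and serve only as illustration.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of an alignment.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    rng: a random.Random instance, passed in so that runs are reproducible.
    Columns (sites) are drawn with replacement, and the same columns are
    used for every sequence so that homologous positions stay together.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_sites = lengths.pop()
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}

# Hypothetical toy alignment; real input would be the aligned ORF sequences.
toy = {"isolate_A": "ACGTAC", "isolate_B": "ACGAAC"}
replicate = bootstrap_alignment(toy, random.Random(1))
```

In practice this step would be repeated 1,000 times, a tree built from each replicate, and the support for a branch reported as the percentage of replicate trees that contain it.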
Phylogenetic analysis of 102 CR-M sequences involved NJ with PAUP*, version 4.0b10 (73), and the HKY85 nucleotide substitution model, and support for the branches was evaluated by bootstrap analyses with 1,000 bootstrap replicates. For all analyses, gaps were treated as missing data, and no sites containing insertions or deletions were excluded.
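To illustrate what treating gaps as missing data means for a pairwise comparison, the sketch below computes an uncorrected proportion of differing sites (p-distance) and skips any site where either sequence has a gap or an ambiguous character. This is a simplification: the analyses described here used the HKY85 model in PAUP*, which corrects for multiple substitutions, and the function name `p_distance` and the set of characters treated as missing are assumptions made for this example.

```python
def p_distance(seq_a, seq_b, missing="-?N"):
    """Uncorrected pairwise distance between two aligned sequences.

    Sites where either sequence carries a gap or missing-data symbol are
    ignored for this pair only; they are not removed from the alignment.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in missing or b in missing:
            continue  # treat as missing data for this pair
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared

# Five comparable sites, one mismatch: distance is 1/5.
d = p_distance("ACGT-A", "ACGAAA")
```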
To compare the phylogenies of DNA 1 and DNA 3, two smaller data matrices were constructed for strains with both DNA 1 and DNA 3 sequences, since the comparison among components should be based on the same source of data. We selected 17 isolates with both DNA 1 and DNA 3 sequences, including 14 BBTV and three other nanoviruses, FBNYV, MDV, and SCSV, as outgroups. Phylogenetic analysis was conducted by BI methods as described above.
The incongruence of phylogenetic information among data partitions of BBTV DNA 1 and DNA 3 was first evaluated by the ILD test (14), followed by KH and SH tests. Three partitions, CR-SL, ORF, and CR-M, were categorized for each BBTV component and subjected to the ILD test, which is implemented in PAUP*, version 4.0b10 (73), and the branch-and-bound option was chosen for tree searching with 1,000 replicates. The tree topological test was conducted by use of KH tests (40) under parsimony criteria for comparing tree lengths of the obtained trees from the BI analysis above and SH tests (69) for comparing the likelihood scores between trees. SH tests involved a resampling estimated log-likelihood method (41). Both analyses were conducted by use of PAUP*, version 4.0b10 (73), for the DNA 1 or DNA 3 data matrix.
Measurement of codon and amino acid usage. A total of 132 coding sequences from all components of BBTV and other nanoviruses, including five identified additional Rep genes (as described in reference 32), were used in codon and amino acid usage analysis. Relative synonymous codon usage (RSCU) values (67) were used for multivariate analysis to measure codon bias. The RSCU value is the observed frequency of a codon divided by the expected frequency under equal usage of all codons for a given amino acid. Amino acid usage was calculated from the direct counts of amino acids in a particular sequence.
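The RSCU definition above can be written out as a short calculation. The Python sketch below assumes the standard genetic code and excludes stop codons; it is an illustration of the formula, not the CodonW implementation, and the function name `rscu` is hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def rscu(cds):
    """Relative synonymous codon usage for one in-frame coding sequence.

    For each amino acid, RSCU of a codon is its observed count divided by
    the count expected if all synonymous codons were used equally.
    Amino acids absent from the sequence are omitted from the result.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for aa in set(AMINO) - {"*"}:
        synonyms = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in synonyms)
        if total == 0:
            continue
        expected = total / len(synonyms)
        for codon in synonyms:
            values[codon] = counts[codon] / expected
    return values

# Four alanine codons: GCT twice, GCC and GCA once each, GCG never.
example = rscu("GCTGCTGCCGCA")
```

An RSCU of 1 means a codon is used as often as expected under equal usage; values above or below 1 indicate over- or under-representation among its synonyms.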
Multivariate analyses involved principal component analysis (PCA) for both RSCU and amino acid usage data for all 101 coding sequences of BBTV (PCA/RSCU and PCA/AA, respectively, hereafter), and 31 sequences of nanoviruses, including Coconut foliar decay virus (CFDV), FBNYV, MDV, and SCSV. PCA was chosen for multivariate analyses of codon and amino acid usages as suggested by Perriere (56), because other methods such as correspondence analysis are very sensitive to codons of rarely used amino acids in the sequences. The multivariate analyses involved use of ADE-4 (75), in which the indices of codon usage were calculated by CodonW (55) on a web server hosted by the Institut Pasteur, Paris, France (http://bioweb.pasteur.fr/seqanal/interfaces/codonw.html). Stop codons were excluded from the analyses because only one stop codon exists for each gene.
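As a rough illustration of how a 2-D PCA map is obtained from a table of usage values, the sketch below centers each column of a genes-by-variables matrix and projects the rows onto the first two principal axes with a singular value decomposition. It uses NumPy rather than ADE-4, applies no scaling or weighting, and the function name `pca_scores` and the toy matrix are assumptions for this example only.

```python
import numpy as np

def pca_scores(matrix, n_axes=2):
    """Project rows (genes) onto the first principal axes.

    matrix: rows are genes, columns are variables (e.g., RSCU values or
    amino acid frequencies). Columns are mean-centered but not scaled.
    Returns the row coordinates and the fraction of variance per axis.
    """
    x = np.asarray(matrix, dtype=float)
    centered = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_axes] * s[:n_axes]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_axes]

# Hypothetical toy table: 4 genes described by 3 usage values each.
toy = [[1.2, 0.8, 1.0],
       [1.1, 0.9, 1.0],
       [0.4, 1.6, 1.0],
       [0.5, 1.5, 1.0]]
coords, variance_fraction = pca_scores(toy)
```

Plotting the two columns of `coords` against each other gives the kind of map on which clusters and outlying genes are inspected.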
Nucleotide sequence accession numbers. The obtained sequences of BBTV Taiwan strain isolated by this study were deposited in GenBank under the following accession numbers: DNA 1, DQ817617; DNA 2a, DQ817867; DNA 2b, DQ825708; DNA 3, DQ817893; DNA 4, DQ825714; DNA 5, DQ817921; and DNA 6, DQ825730. Accession numbers for the TW4 strain are as follows: DNA 1a, 853027; DNA 1b, 857902; DNA 2, 853087; DNA 3, 857918; DNA 4, 853095; and DNA 5, 857944.
TABLE 2. Genetic diversity of all used BBTV genomic DNA, calculated as the number of substitutions per site
FIG. 1. Neighbor-Net trees based on 46 ORF nucleotide sequences of BBTV DNA 1 and three other nanoviruses. The Pacific group and three Asian groups (Vietnam-N, Vietnam-S, and Asian s.s.) are marked accordingly. Bootstrap values over 50 are marked on the branches. The branches leading to FBNYV, MDV11, and SCSV8 sequences are shortened and marked by broken lines to save space. The inset is a phylogram based on neighbor-joining criteria, showing the relative branch length of sequences of BBTV and other nanoviruses. Accession numbers are preceded by abbreviations for the places of origin. Ina, Indonesia; Jap, Japan; Vie, Vietnam; Fij, Fiji; Haw, Hawaii; Phi, Philippines; Ton, Tonga.
FIG. 2. Neighbor-Net trees based on 32 ORF nucleotide sequences of BBTV DNA 3 and three other nanoviruses. The Pacific group is marked on the figure, and the rest of the BBTV sequences are Asian group. Bootstrap values over 50 are marked on the branches. The branches leading to FBNYV, MDV11, and SCSV8 sequences are shortened and marked by broken lines to save space. The inset is a phylogram based on neighbor-joining criteria, showing the relative branch length of sequences of BBTV and other nanoviruses. Accession numbers are preceded by abbreviations for the places of origin. Ina, Indonesia; Jap, Japan; Vie, Vietnam; Fij, Fiji; Phi, Philippines; Bur, Burundi.
FIG. 3. Phylogenies obtained by the BI method based on ORF nucleotide sequences of DNA 4 and DNA 5 BBTV isolates. (A) DNA 4 tree. (B) DNA 5 tree. Along the branches are the posterior probabilities from BI, followed by bootstrap supports from the neighbor-joining and maximum parsimony methods; only values that are >50% are shown. Accession numbers are preceded by abbreviations for the places of origin. Haw, Hawaii.
FIG. 4. (A) Unrooted phylogram of the neighbor-joining tree based on nucleotide sequences of the CR-Ms from 102 BBTV isolates under an HKY85 model. Bootstrap values for internal support of the branches are given along the branches. Two major clades are marked: the Asian and Pacific groups. The component names are given before the sequence names (i.e., DNA 1 to DNA 6 are named D1 to D6, respectively). CR-M sequences of additional Reps have full names (i.e., BBTV S1, S3, W3, and W4) before the accession numbers. (B and C) Hypothetical trees showing two scenarios of CR-M evolution: a phylogeny showing a single origin of CR-M regions, in which each component forms monophyletic groups (B) and a phylogeny showing two origins of CR-M, grouped in two major clades, accordingly (C). See text for more explanation. Accession numbers are preceded by abbreviations for the places of origin. Ina, Indonesia; Jap, Japan; Vie, Vietnam; Fij, Fiji; Haw, Hawaii; Phi, Philippines; Ton, Tonga; Bur, Burundi.
FIG. 5. Alignment of 102 CR-Ms of BBTV and other nanoviruses. The putative concerted evolution region in the Pacific group is marked by a square. The tandem repeated sequence proposed by Burns et al. (6) is underlined. Accession numbers are preceded by abbreviations for the places of origin. Ina, Indonesia; Jap, Japan; Vie, Vietnam; Fij, Fiji; Haw, Hawaii; Phi, Philippines; Ton, Tonga; Bur, Burundi.
FIG. 6. Phylogenies obtained by BI based on ORF nucleotide sequences of DNA 1 and DNA 3 of 14 BBTV and three other nanovirus isolates. (A) DNA 1 tree. (B) DNA 3 tree. Posterior probabilities for each branch are marked. Ina, Indonesia; Jap, Japan; Vie, Vietnam; Phi, Philippines.
TABLE 3. ILD test result based on 14 BBTV and three other nanovirus strains with full sets of DNA 1 and DNA 3
TABLE 4. KH and SH tests comparing the trees resulting from analysis of DNA 1 and DNA 3 data matrices from Fig. 6
FIG. 7. Result of combined analysis of PCA on amino acid and codon usage values of 132 nanovirus genes. (A) PCA map of the two first factors realized by amino acid usage of gene products. (B) PCA map plotted by the two first factors realized by RSCU of all genes. Different components of BBTV are marked in different colors.
Use of Rep and coat protein gene sequences to reconstruct phylogenies has been the prime approach for elucidating the evolutionary history of BBTV and other nanoviruses. Our results indicate that although such an approach could reflect the genealogy of individual genes of BBTV, the phylogenies might not be applicable to other components since different components can have different phylogenies. We demonstrated a clear case of a chimeric BBTV isolate, TW4, as having a mixture of the Asian and the Pacific components in a single isolate; TW4 likely contains DNA 2, 3, and 5 of the Pacific group and DNA 4 of the Asian group by genome reassortment, as supported by the CR-M phylogeny (Fig. 4) and the ORF phylogenies (Fig. 1, 2, and 3). Interestingly, TW4 has both the Pacific and the Asian type of DNA 1. Two copies of DNA 1 may be maintained because each is needed for its own type of component, since the CR-M regions are quite distinct between the Pacific and the Asian groups.
Phylogenetic network analysis is designed for analyzing non-tree-like phylogenies in the case of reticulate evolution such as hybridization, horizontal gene transfer, and recombination (34). It has also proven very useful in studying virus evolution (28, 81). Our results indicate that, in general, DNA 1 of BBTV is more tree-like than DNA 3. The inclusion of FBNYV, MDV, and SCSV sequences did not help much with the rooting problems in BBTV, because they showed a network-like pattern at the base of BBTV (Fig. 1 and 2). A network-like pattern, however, simply reflects conflicting phylogenetic signals; the actual cause of the pattern could be reassortment, recombination, or concerted evolution. Other methods, such as use of genetic distances or substitution distribution as estimates (58), are needed for further evaluation of the possibility of recombination.
Our phylogenetic results for DNA 1 and DNA 3 mostly agree with the two-group hypothesis of DNA 1 and 3 in BBTV as suggested by previous studies (37, 38, 84). However, results of BI-based phylogenies of DNA 1 and 3 may be problematic in the rooting. One of the error sources for rooting in phylogenetic reconstruction is the inclusion of distantly related outgroup sequences. Use of a very divergent outgroup can cause precarious rooting in phylogenetic analysis (74, 85), likely due to the long-branch attraction effect (15), a well-known source of error in phylogenetic analysis. Although FBNYV, MDV, and SCSV represent the closest known viruses to BBTV, they are still quite different from BBTV, averaging more than 40% variation compared to DNA 1 and more than 57% compared to DNA 3 at the nucleotide level. In contrast, the three nanoviruses show only 8% to 20% variation on Rep genes and 21% to 38% variation on coat protein genes among themselves. Unfortunately, no other nanoviruses closer to BBTV are known; thus, no better outgroup is currently available.
The phylogeny of the CR-M complicates the story of BBTV. The rationale is that, if the ancestor genome of BBTV possessed all six components, the CR-M of each component should follow the evolution of BBTV, in accordance with the coding regions; therefore, the CR-M sequences should form separate monophyletic groups on the resulting tree according to each component. If not, then the evolution of CR-M will be shown to be independent of the evolution of components as a whole (i.e., some genomic reassortment or recombination has occurred). Our results strongly suggest two origins of the CR-M in BBTV, one that gave rise to the Pacific group and another that gave rise to the Asian group of BBTV. However, when or where these two CR-M types originated is not clear because no other comparable homologous sequence has been found, even in other nanoviruses. Since the additional Rep genes have been considered as outgroups of all nanovirus DNA 1 genes (32) and their CR-M grouped with the Asian group, the CR-M of the Asian group is likely the ancient type. A possible scenario is that the original BBTV of the Pacific group acquired a new CR-M sequence and then a homogenization occurred among the six components in this region. This "new" set of the BBTV genome became the origin of the Pacific group BBTV and then gave rise to strains in Australia, the Pacific, India, and Egypt. The phylogeny shows that the CR-Ms of additional Rep genes are similar to the CR-M of DNA 1, which suggests that they may be regulated by the master Rep (80). The phylogeny also shows that the CR-Ms of DNA 1 and 2 form a monophyletic group distinct from that of DNA 3 to 6 (Fig. 4A). In sum, this evidence suggests a concerted evolution in the CR-M of BBTV components, and the most likely cause is recombination in these regions. 
Our further analysis with use of the Recombination Analysis Tool (13) to rapidly screen possible recombinations showed a likely recombination breaking point between the ORF and CR-M of DNA 1 and 3 (J.-M. Hu, unpublished data). Even though the trees we obtained for individual components are identical (e.g., two groups, Asian and Pacific, for three components, DNA 1, 3, and 6), the true evolutionary history of BBTV may be more complicated than demonstrated in this study. Since the function of the CR-M has yet to be identified in BBTV, whether the difference indicates distinct regulation patterns in replication or signals of virion encapsidation for the two groups of components remains to be examined.
The ability of incongruence tests such as the ILD test to detect incongruent partitions in phylogenies has been questioned (8, 11, 36). We consider the ILD test a first screen for identifying potential incongruence, since it is generally more susceptible to type I error, as suggested by several authors (27, 47, 57). Our ILD test result showing significance in the CR-SL of DNA 3 with other regions could have been due to random effects and type I error, since the noncoding regions in our analysis are quite short, with CR-SL and CR-M approximately 100 bp or shorter. Further analysis is certainly needed to validate this incongruence result. Therefore, we focused on the incongruence in coding regions between DNA 1 and 3. The tree topologies based on DNA 1 and DNA 3 are significantly different as calculated from the DNA 1 data matrix on SH and KH testing (Table 4), although the difference was less significant when the DNA 3 data matrix was used. The results suggest that the evolutionary histories of DNA 1 and 3 are very likely different from each other. The results of both the ILD test and the phylogenies indicate that the DNA 1 and 3 coding sequences contain incongruence signals, likely because of a birth-and-death evolution and/or genome reassortment among components, which is not uncommon in viruses (32). Our PCA/AA and PCA/RSCU results also indicate that the Rep and coat protein genes, including the additional Rep genes, have a distinct codon bias compared with other nanovirus genomes.
One other problem we encountered is the identification of the DNA 2 ORF in BBTV. Although the corresponding mRNA of DNA 2 has been detected by Beetham et al. (3), the identification of an ORF in DNA 2 is problematic in other isolates. Two possible explanations may account for the ambiguity. First, BBTV may utilize a nontypical translation start site for protein initiation as reported for Rice tungro bacilliform virus (18) and Tobacco mosaic virus (66). Alternatively, the DNA 2 gene product could act at the RNA rather than the protein level, because increasing evidence has indicated that noncoding RNA plays an important role in eukaryotic cell gene regulation (48). However, the function of DNA 2 of BBTV remains to be resolved. In addition, an inoculation assay with cloned Faba bean necrotic yellow virus DNA revealed a possible functional redundancy or complementation between distinct nanovirus genomic components. It is possible that other components may have compensated for the function of DNA 2 and allowed DNA 2 to accumulate mutations (79).
Multivariate analysis of codon usage shows an unusual bias in BBTV DNA 1 and 3 that differs from other components of BBTV, whereas the codon usage of DNA 4 to 6 of BBTV is similar to that of other nanoviruses. Although the eigenvalues were generally low from PCA/AA and PCA/RSCU and the significance of the clustering was not assessed with randomized data sets, we still think the analyses reflect certain patterns that bear evolutionary information. We have examined four axes of the data, and most of the axis pairs show the clustering of DNA 1 and 3, respectively (data not shown). The unusual codon usages of DNA 1 and 3 might be due to their having origins different from the origins of other components, as explained below.
The main causes of codon bias among genes in a specific genome are limitations in translational efficiency reflected by expression level and/or the need to maintain a certain genomic composition (i.e., the GC content) (43, 68). Although quantitative expression data for different components of BBTV are largely unavailable, we assume that expression level is not a major force in shaping codon usage in BBTV, since corresponding genes in FBNYV, MDV, and SCSV all clustered together with other components of BBTV. In addition, since virus gene expression depends on the host translation apparatus, codon bias may be influenced by host background. However, no such effect was detected in our analysis as no clustering patterns according to host species were seen in the PCA plot (data not shown).
The second source of codon bias is maintenance of genomic nucleotide composition. The base compositions are assumed to be similar among components of BBTV since they represent a single and complete genome. The average nucleotide content among all ORFs of nanoviruses is similar. In addition, the third-position nucleotide composition of nanoviruses does not influence any corresponding patterning in the 2-D PCA/RSCU plot (Fig. 7B). For example, although thymine content in the synonymous third codon position (T3s) of BBTV ORFs ranges from 27% to 48%, the values are not as extreme as in other nanoviruses (e.g., 23% in CFDV or 63% in SCSV4, where they are clustered in the middle of the PCA/RSCU plot) (Fig. 7B). Therefore, codon bias due to differences in genome base composition is also not a major factor in nanoviruses.
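For readers unfamiliar with T3s, the sketch below shows one way such a value can be computed. It assumes the standard genetic code and the following working definition, which is our reading of the index rather than a confirmed reproduction of CodonW: among codons for amino acids that have at least two synonymous codons, one of which ends in T, T3s is the fraction that end in T. The function name `t3s` is hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def t3s(cds):
    """Fraction of T at synonymous third codon positions (assumed definition).

    Only amino acids with two or more synonymous codons, at least one of
    which ends in T, contribute; Met, Trp, and stop codons are ignored.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    t_ending = 0
    eligible = 0
    for aa in set(AMINO) - {"*"}:
        synonyms = [c for c, a in CODON_TABLE.items() if a == aa]
        if len(synonyms) < 2 or not any(c.endswith("T") for c in synonyms):
            continue
        eligible += sum(counts[c] for c in synonyms)
        t_ending += sum(counts[c] for c in synonyms if c.endswith("T"))
    if eligible == 0:
        raise ValueError("no eligible codons in sequence")
    return t_ending / eligible

# GCT and GCC (Ala) are eligible; AAA (Lys, no T-ending synonym) and ATG are not.
value = t3s("GCTGCCAAAATG")
```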
Gene length has been shown to shape codon usage in both prokaryotes and eukaryotes (12, 51), although the factor is usually significant only among genes with a 10-fold or greater difference in length. The length differences of the genes used in this study are within 100 to 300 codons. Therefore, the codon bias cannot be due to gene length differences; furthermore, FBNYV and MDV genes, with different lengths, were clustered together, which suggests that gene length is not a major factor in this case. Another possible source of codon bias is the nature of overlapping genes in many viruses, but the effect has only been extensively evaluated by a few studies (33, 54, 70). The possible influence of overlapping genes in codon usage should be limited in nanoviruses since most are monopartite for each component; thus, the core genes do not overlap.
We favor a reassortment event as the most plausible cause of codon bias among components of BBTV. This explanation has been demonstrated in prokaryotes, for which multivariate analyses revealed horizontally acquired genes showing distinct codon usage patterns (20, 49). Our results also showed the five identified additional Rep genes in BBTV with a codon usage similar to that of DNA 4 to 6 of BBTV and other nanoviruses, which suggests that they all derived from the same origin.
The combination and reassortment of virus components from different sources, known as a "symbiogenesis" process, have been recently reviewed by Roossinck (60). Mixed infection of viruses in a single host plant is common in the field. In plants with such mixed infections, genome reassortment events are frequently observed in viruses with segmented genomes, and these events have a strong impact on virus speciation (44, 60). Examples can be seen in tobravirus strains I6 and N5, containing RNA 1 sequences from Tobacco rattle virus and RNA 2 sequences derived from Pea early browning virus (59), and Bean distortion mosaic virus, which contains RNAs 1 and 2 of Cucumber mosaic virus and RNA 3 of Peanut stunt virus (86). The potential for genetic reassortment among nanoviruses has been demonstrated, as the Rep genes of FBNYV, MDV, and SCSV are able to trigger replication of heterologous nanovirus DNAs (80). In addition, several Rep-encoded components associated with nanovirus infection have been discovered. Meanwhile, a replication-competent, nanovirus-like DNA component was found to be associated with Cotton leaf curl begomovirus (46, 65). It is believed that this particular nanovirus-like DNA component requires Cotton leaf curl begomovirus for encapsidation and transmission. Collectively, this information supports a possible scenario in which a nanovirus Rep gene was introduced into the ancestors of BBTV populations and took over the function of replication. The original Rep genes were able to remain in the BBTV genome only in some isolates and eventually lost the ability to replicate themselves. The case of the mild strain TW4 demonstrated here provides strong evidence that genome reassortment did occur in BBTV.
The results of this study suggest that the traditional approach of using a single component to reconstruct phylogenies in BBTV allows only limited inferences about the evolutionary history of the virus even though it is still informative in comparing isolates from different regions. We demonstrated a special case of genome reassortment in one Taiwanese isolate (TW4) and a concerted evolution in the CR-M of the Pacific group of BBTV. In addition, we demonstrated the possible uses of multivariate analyses of codon and amino acid usage for nanovirus genes for detecting underlying evolutionary patterns that are not obvious from other analyses; these methods are applicable to other virus genomes with similar situations. Finally, since little information is available for BBTV with full sets of components, we urge that such data be collected for other components of BBTV to better elucidate evolution in nanoviruses.
H.-H.Y. is supported by grants from National Science Council, Taiwan (NSC 92-2313-B-002-066, NSC 93-2313-B-002-112, and NSC 94-2313-B-002-105).
Published ahead of print on 29 November 2006.