Journal of Virology, November 2003, p. 11517-11530, Vol. 77, No. 21
0022-538X/03/$08.00+0 DOI: 10.1128/JVI.77.21.11517-11530.2003
Copyright © 2003, American Society for Microbiology. All Rights Reserved.
CRUK Institute for Cancer Studies, University of Birmingham, Birmingham B15 2TT,1 MRC Virology Unit, Institute of Virology, University of Glasgow, Glasgow G11 5JR, United Kingdom2
Received 24 March 2003/ Accepted 30 July 2003
One possible example of such circumstances emerged during the study of CD8+-cytotoxic-T-lymphocyte (CTL) responses to Epstein-Barr virus (EBV), a gammaherpesvirus widespread in human populations. In HLA-A11-positive Caucasians, the memory CTL response to EBV latent cycle antigens is frequently dominated by T cells restricted through this allele and recognizing one of two peptide epitopes derived from the virus-encoded nuclear antigen EBNA3B; these are the immunodominant IVTDFSVIK epitope (EBNA3B codons 416 to 424, called IVT) and the next-most-dominant AVFDRKSDAK epitope (EBNA3B codons 399 to 408, called AVF) (7, 9). These same responses are also prominent during primary infection (32), at a time when successful EBV transmission to the naive host appears critically dependent upon the proliferation of latently infected B cells expressing the immunodominant EBNA3A, 3B, and 3C antigens (28). We found that the IVT and AVF epitope sequences were very frequently altered (relative to the Caucasian type 1 prototype strain, B95.8) in EBV strains isolated from highly HLA-A11-positive populations in lowland Papua New Guinea (6) and southern China (7). Furthermore, most nucleotide changes caused amino acid replacement in the key anchor positions (amino acids 2 or 9/10 in the epitope sequence) that are the major determinants of epitope affinity for HLA-A11 molecules. Accordingly, the variant epitopes formed less stable complexes with HLA-A11 and/or were not well-recognized by Caucasian donor CTLs specific for the wild-type (i.e., B95.8) IVT and AVF sequences (2, 6, 7, 17, 21). In a paper accompanying the present report, our group went on to describe more accurately the range of IVT and AVF variants found in Chinese EBV strains and provided evidence that these variants were indeed nonimmunogenic in vivo (21).
Such evidence is consistent with the view that, under pressure from the HLA-A11-restricted CTL response in a population where >50% of individuals carry this allele, EBV strains with epitope-loss sequence variants have enjoyed a selective advantage. However, a number of other observations have cast doubt upon the significance of such HLA-A11 epitope changes (13). Thus, one of the combinations of IVT and AVF variants seen among EBV isolates from China was very common not just in lowland Papua New Guinea but also in highland people, in whom HLA-A11 is rare. Furthermore, the same Papua New Guinea viruses also showed coincidental changes, affecting anchor residues, in two epitopes within EBNA3A that were restricted through HLA alleles (B8 and B35) not found at all in these populations (2). A broader study comparing Caucasian, African, Chinese, and Papua New Guinea virus strains across several other known epitope regions, mainly in EBNA3A, 3B, and 3C, again found no correlation between sequence variation within the epitope and representation of the restricting HLA allele in the host population. That study also found no evidence for positive selection in these regions from an analysis of replacement/silent mutation ratios (14).
These various studies highlight the debate as to whether the loss of HLA-A11 epitopes seen among Chinese EBV strains really reflects immune selection or is simply a product of random evolutionary drift. Certainly other sequence polymorphisms have been described that distinguish Chinese from Caucasian and African EBV strains (1, 4, 11, 18, 22, 23, 33), consistent with slow evolutionary divergence of the virus through random drift in geographically separate host populations. Here, we have carried out extensive sequencing of both type 1 and type 2 Chinese EBV isolates across several latent genes in order to analyze HLA-A11 epitope polymorphism in a broader genomic context.
Sequencing of EBV latent genes. Total genomic DNA was prepared from LCL pellets by standard methods, and the relevant regions of the EBV genome were amplified by PCR to generate suitable templates for DNA sequence analysis. For each isolate, sequences corresponding to EBNA1 (codons 475 to 535, or in selected cases codons 460 to 641), EBNA2 (codons 109 to 259), EBNA3A (codons 114 to 320), EBNA3B (entire gene), EBNA3C (codons 121 to 293), and LMP1 (codons 318 to 386) were amplified by using the primer combinations described in Table 1; note that in the case of EBNA3B, the entire coding region was initially amplified as a series of short fragments with primer pairs which contiguously spanned the open reading frame and the intervening intron sequence. PCR amplifications were done under the following conditions: EBNA1, 35 cycles of 94°C for 60 s, 62°C for 90 s, and 72°C for 240 s; EBNA2, 35 cycles of 94°C for 30 s, 45°C for 90 s, and 72°C for 120 s; EBNA3A, 3B, and 3C, 40 cycles of 94°C for 60 s, 45°C for 90 s, and 72°C for 120 s; LMP1, 40 cycles of 94°C for 60 s, 62°C for 45 s, and 72°C for 120 s. For the Chinese prototype 1 virus strain NPC15, the entire coding regions of EBNA3A and EBNA3C were also amplified with the primers listed in Table 1, together with the complete coding sequence of EBNA2 as previously described (39). PCR products were gel purified with a QIAquick gel extraction kit (Qiagen, Crawley, West Sussex, United Kingdom) and directly sequenced by using a BigDye, version 3.0, PCR sequencing kit (Applied Biosystems, Warrington, United Kingdom) with a suitable primer. All samples were analyzed with an Applied Biosystems 3700 automated sequencer (Functional Genomics Laboratory, School of Biosciences, University of Birmingham).
TABLE 1. Oligonucleotide primers used to amplify EBV sequences
Evaluation of positively selected loci in the coding regions of the EBNA3 genes was performed by using the program Codeml in the PAML package (version 3.13) (24), obtained from http://abacus.gene.ucl.ac.uk. Codeml (using its option of codon-based analysis) is a program for probabilistic modeling of characteristics of evolutionary diversity in aligned sets of protein coding sequences. It calculates by maximum-likelihood methods the instantaneous rates of synonymous change (dS) and nonsynonymous change (dN) for an aligned set of protein coding sequences. The possible occurrence of positive selection in a data set is then detected as a situation in which the dN/dS ratio (called ω) is estimated to be greater than unity, i.e., mutations that cause amino acid changes are being fixed in the population represented by the data set at a higher rate than silent changes. Evaluation of specific sites in the alignment at which positive selection may have occurred is accomplished by Bayesian estimation of the probability of ω being greater than one for each such site. Note that the maximum-likelihood analysis does not assume that the contemporary prototypes (B95.8 and Ag876) predate the test isolates.
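As a compact restatement of the two quantities just described, the following sketch uses standard notation; it is not quoted from the Codeml documentation. The symbols $p_k$ (proportion of sites in class $k$), $x_h$ (the aligned data at codon site $h$), and $f$ (the site likelihood) are labels introduced here for illustration.

$$
\omega = \frac{d_N}{d_S},
\qquad
P(\omega_k \mid x_h) = \frac{p_k \, f(x_h \mid \omega_k)}{\sum_{j} p_j \, f(x_h \mid \omega_j)}
$$

Under this reading, a site is listed as positively selected when the posterior probabilities of the classes with $\omega > 1$ sum to more than the chosen cutoff (0.95 in the analyses reported here).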
Codeml provides a number of models for fitting distributions of ω values to the input sequence data, and the following were employed in analyzing the EBNA3 gene sequences, with compatible results: model 1, neutral, 2 classes, namely ω = 0 and ω = 1; model 2, selection, 3 classes, ω = 0, ω = 1, and ω > 1; model 3, discrete, 2, 3 and 8 classes of ω values examined, with a value of ω for each class assigned by the program. Models 7 and 8, with more complex distributions of ω, were also investigated, but their outputs were judged unsatisfactory because in each case the set of ω classes computed contained several of identical ω value, presumably because the input sequence data were not sufficiently diverse to allow nontrivial fitting to the model. Lastly, model B as described by Yang and Nielsen (37) was applied to examine the combined site and lineage specificity of positive selection. This model provides two classes of ω values for the whole sequence set and an additional class with ω of >1 for application in foreground lineages that had been specified as of specific interest for evaluation of occurrence of positive selection. In all Codeml runs, equilibrium frequencies of codons were estimated as products of nucleotide frequencies at each codon position, the transition/transversion rate ratio was calculated in the program, and no molecular clock was imposed. In all appropriate cases, the program was run separately with high and low initial values of ω to check the validity of computational convergence.
TABLE 2. Sequence divergence of Chinese prototype 1 virus NPC15 from B95.8 prototype
Sequencing of EBNA3B identifies families of sequence variation among type 1 viruses. We went on to sequence the complete EBNA3B gene in a panel of 25 more type 1 Chinese virus isolates (derived both from healthy donors and from NPC patients). The panel was found to contain one virus with a wild-type (i.e., B95.8-like) sequence in the IVT and AVF epitopes and representatives of all but one of the nine combinations of IVT and AVF sequence variants identified among type 1 Chinese virus strains in the accompanying report; the one variant combination not available (AVF/SIL2, IVT/N9) was found in only 1 of 64 type 1 viruses analyzed in that study (21).
Figure 1 presents the results of this analysis for all 25 type 1 isolates as well as for NPC15; the figure shows only those codons where nucleotide changes occurred relative to the B95.8 prototype sequence and, in cases of amino acid replacement, identifies the amino acid change. Apart from one virus (C1) which was identical to B95.8 throughout EBNA3B, there were three main patterns of sequence change, identified in the figure by different degrees of shading. It is important to stress that these changes always occurred against a background of >98% nucleotide identity between Chinese strains and the B95.8 prototype. The Li family of viruses (C4 to C10) almost all showed the same 16 nucleotide and 11 amino acid changes relative to B95.8 EBNA3B, whereas the Wu family (C11 to NPC11) had 51 nucleotide and 31 amino acid changes in common; only 6 of these nucleotide changes and 3 of the amino acid changes were shared between the two families. The distinction between the Li and Wu families was also apparent at the 60-bp repeat locus lying between codons 758 and 797 on the figure, with Wu family viruses all having one extra repeat. Interestingly, there were three virus strains (the original prototype NPC15 plus C17 and C5) with identical sequences which followed the Wu family consensus throughout the 5' half of EBNA3B but switched to the Li consensus in the 3' half between indicator polymorphisms at codons 610 and 645. These Wu/Li interfamily recombinants are placed between the Wu and Li family groups in Fig. 1.
FIG. 1. Sequence changes in the entire EBNA3B gene of type 1 Chinese EBV strains relative to the Caucasian prototype 1 B95.8 EBNA3B gene. Strains are aligned vertically under the B95.8 prototype and are identified in the left-hand column of each block; strains within the same box in that column have identical sequences. All other columns represent individual EBNA3B codons (numbered at the top) where a nucleotide change relative to the B95.8 sequence was detected in one or more Chinese viruses. The nucleotide changes are shown in boldface type and resulting amino acid changes are also shown in boldface type; unchanged nucleotides and amino acids are not in boldface type. To illustrate the familial relationships among the Chinese strains, blocks of the Li family sequence are shown in light shading, blocks of the Wu family sequence are shown in medium shading, and blocks of sporadic (Sp) sequence are shown in dark shading. One Chinese virus strain, C1, is entirely B95.8-like. Strains C4 to C10 constitute the Li family, strains C11 to NPC11 constitute the Wu family, strains C5, C17, and NPC15 are Wu/Li recombinants, and strains C6 to C13 represent various Wu/Sp recombinants. Codons lying within the AVF (399 to 408) and IVT (416 to 424) epitope regions are identified above the relevant columns. The 60-bp repeat locus within EBNA3B is also identified, and the number of repeats in each virus strain is shown.
Familial sequence variation extends to the EBNA3A, 3C, and 2 loci. We next sequenced the same panel of 26 type 1 viruses across regions of the EBNA3A and 3C genes, i.e., genes situated immediately upstream and downstream of EBNA3B in the viral genome. The regions sequenced (EBNA3A codons 114 to 320 and EBNA3C codons 121 to 293) were chosen since they each contained a number of changes relative to B95.8 in the NPC15 prototype sequence. Figure 2 summarizes all of the sequence polymorphisms observed in both EBNA3A and 3C for all 26 viruses, with the isolates being grouped as in Fig. 1 into Li family, Wu/Li recombinants, Wu family, and Wu/Sp recombinants.
FIG. 2. Sequence changes in parts of the EBNA3A (codons 114 to 320) and EBNA3C (codons 121 to 293) genes of type 1 Chinese EBV strains relative to the Caucasian prototype 1 B95.8 sequence. The different virus strains are aligned vertically under the B95.8 prototype by using the same format as adopted in Fig. 1, with the nucleotide and amino acid changes identified in boldface type and the different blocks of family sequences identified by different degrees of shading as described in the legend to Fig. 1.
We then selected representative viruses from each of the above-described family groups for sequencing at the EBNA2 locus some 40 kb upstream of EBNA3A in the viral genome. Sequence divergence from B95.8 within the region analyzed (EBNA2 codons 109 to 259) is shown in Fig. 3. Familial patterns were again apparent at this locus, with five Li family viruses (NPC3 to C10) showing 11 nucleotide and 6 amino acid changes plus the addition of an extra codon (for a leucine residue) after codon 211 in the B95.8 sequence and four Wu family viruses (C2 to NPC11) showing only 2 nucleotide changes and a single amino acid change. The two representatives of viruses with Wu/Li recombination within EBNA3B (C5 and NPC15) were Wu-like at EBNA2, as would have been predicted for a single crossover, whereas the three members of the more diverse Wu/Sp recombinant family were Wu-like in two cases (C15 and C12) and Li-like in one (C13).
FIG. 3. Sequence changes in a part of the EBNA2 gene (codons 109 to 259) of type 1 Chinese EBV strains relative to the Caucasian prototype 1 B95.8 sequence. The data are derived from a subset of the full panel of Chinese strains including five members of the Li family (NPC3 to C10), two Wu/Li recombinants (C5 and NPC15), four members of the Wu family (CT1 to NPC11), and three Wu/Sp recombinants (C15 to C13). The different virus strains are aligned vertically under the B95.8 prototype, and the nucleotide and amino acid changes and different blocks of family sequences are identified as described in the legend to Fig. 1.
FIG. 4. Sequence changes in the EBNA2 (codons 109 to 259), EBNA3A (codons 114 to 320), EBNA3B (entire gene), and EBNA3C (codons 121 to 293) genes of type 2 Chinese EBV strains relative to the African prototype 2 Ag876 sequence. Virus strains are aligned vertically under the Ag876 prototype, and strains in the same box have identical sequences. Nucleotide and amino acid changes are identified as described in the legend to Fig. 1. Note that only the C19 and C20 strains are type 2 at all four gene loci. The other three strains are type 1-type 2 recombinants with type 2 sequences only at the EBNA3A, 3B, and 3C loci (C18) or only at the EBNA3B and 3C loci (NPC13 and NPC14).
TABLE 3. Sequence patterns at latent gene loci and AVF/IVT epitope change
Ile amino acid change. Likewise, within LMP1 codons 318 to 386, the great majority of type 1 viruses carried a characteristically Chinese allele (4, 33), here called Ch' (22), with several amino acid changes relative to B95.8 plus a 30-bp deletion removing codons 343 to 352; all Ch' sequences were identical, except at codon 335, where just the Li family viruses and Wu/Li recombinants showed an additional Gly→Asp amino acid change. We also noted that the two type 1 viruses C4 and C13, which carried a different LMP1 allele (called Ch", lacking the deletion but with its own pattern of sequence divergence from B95.8) (22), were distinct from the rest in other ways; thus, C4 was an outlier within the Li family group with its own unique IVT/AVF epitope sequences and C13 had a unique intratypic recombinant structure at the EBNA2 and 3C loci and also unique IVT/AVF epitope variation. It is also worth noting that, while neither EBNA1 nor LMP1 is type specific in its sequence divergence among EBV isolates worldwide, the two Chinese type 2 virus strains studied here were distinct from type 1 in carrying a T allele at the EBNA1 locus and a unique Ch''' sequence at LMP1 (22). Table 3 not only illustrates how familial relationships within type 1 viruses run throughout all six latent gene loci analyzed but also emphasizes how the different combinations of IVT and AVF epitope mutation within EBNA3B tend to align with these family groups. Thus, all of the viruses with the AVF/N4, IVT/N9 combination of epitope sequences (C10 to NPC12) fall within the Li family, as does the rarer AVF/A2, IVT/L5 virus (C4). Likewise, all of the AVF/S1F2, IVT/L2 viruses (C2 to NPC11) fall within the Wu family, as does the rarer AVF/S1F2, IVT/N9 virus (C11). Interestingly, the three Wu/Li recombinants (C5, C17, and NPC15) are the only ones to carry the AVF/P1L2, IVT/N9 epitope combination, whereas the four members of the more diverse Wu/Sp recombinant family (C6, C15, C12, and C13) show four different patterns of epitope change, three of which are not seen in any other virus on the panel.
The relationship between A11 epitope polymorphism and broader patterns of sequence divergence is better appreciated in Fig. 5, which shows a phylogenetic tree of evolutionary distance (horizontal axis) between Chinese EBV strains based on available sequences at the EBNA3A, 3B, and 3C genes. Type 2 strains apparently derive from an early branch point in EBV evolution, which predates human migration out of Africa, since the Chinese type 2 viruses are clearly much closer to the contemporary African type 2 prototype Ag876 than to contemporary Chinese type 1 viruses. Indeed, from the limited data available, type 2 EBNA3 gene sequences appear not to have diversified with geographic isolation as much as type 1 sequences, and the conservation of A11 epitope loci in type 2 viruses reflects this more general trend. Among type 1 strains, the phylogenetic tree clearly identifies the separate Li family and Wu family groups, placing Li viruses closer to the B95.8 prototype. We have included in the tree the several type 1 sequences showing intratypic recombination. Strictly, recombinants cannot be accommodated accurately in a tree-based depiction of relationships, but the tree does serve to highlight that the intratypic recombinants fall into three separate groups, namely, the Wu/Li recombinants and two groups of Wu/Sp recombinants. From Fig. 5, these different branches of type 1 EBV evolution in the Chinese population are clearly associated with distinct patterns of A11 epitope variation.
FIG. 5. Phylogenetic tree based on EBNA3 sequences. The tree shown was obtained by the neighbor-joining method (PHYLIP program Neighbor) by using maximum-likelihood distances between pairs of DNA sequences in the concatenated alignment of EBNA3A (part), EBNA3B (whole), and EBNA3C (part) sequences. The horizontal branches represent substitutions per nucleotide site, with the scale indicated at the foot of the tree. Branches for type 1 intratypic recombinants are drawn in grey (see text). The tree is rooted between types 1 and 2, with the branch joining types 1 and 2 (dashed line) compressed with respect to the rest of the figure. On the right, the AVF and IVT epitope sequences for each virus strain are indicated (using the same nomenclature as in Table 3), and viruses of the Li and Wu families, Wu/Li recombinants, Wu/Sp recombinants, and type 2 viruses are identified. wt, wild type; wt1, Caucasian B95.8 prototype 1 epitope sequence; wt2, African Ag876 prototype 2 epitope sequence.
(i) Incidence of sequence changes affecting unrelated CD8+-T-cell epitopes. Following the approach used by Khanna et al. (14), we investigated how frequently sequence changes in type 1 virus strains coincidentally affected any of the other known CD8+-T-cell epitopes within EBNA3A, 3B, or 3C. Table 4 lists all CD8+-T-cell epitopes currently known within the Caucasian prototype 1 EBNA3A, 3B, and 3C sequences (3, 24) and records whether or not these sequences are coincidentally altered in Chinese type 1 viruses. Note that a large majority of these epitopes are restricted through HLA alleles that are poorly represented in the Chinese population, and in any case, most of them elicit responses that are much weaker than those induced by IVT and AVF. We therefore assume that mutations occurring in these epitopes probably reflect the coincidental effects of random genetic drift during EBV evolution.
TABLE 4. Conservation or variation of known CTL epitopes in Chinese EBNA3A, 3B, and 3C
Leu change in just a subset of Wu family viruses, another (VEI/B44) was altered in Wu family viruses and in Wu/Sp recombinants but not in anchor positions for HLA class I binding, and a third (AVL/B35) showed the same sequence change (Ala→Thr position 1) in essentially every isolate. This latter epitope change is an example of a geographic marker shared by virtually all Chinese and Papua New Guinea virus strains and is already known not to alter epitope antigenicity (14). Taken overall, these results stand in contrast to the situation at the IVT and AVF epitopes in EBNA3B, which are mutated in a large majority of type 1 Chinese strains, including the Li family, Wu family, and Wu/Sp recombinants, which exhibit a variety of different mutations, and where these mutations regularly affect immunogenicity (21).
(ii) Evaluation of positively selected sites by computer-based modeling. We then used a computer-based modeling approach to look for evidence of sites within EBNA3A, 3B, and 3C with patterns of sequence change indicative of positive selection. A concatenated alignment of the EBNA3 gene coding sequences for 21 Chinese isolates was constructed for analysis with Codeml. These included 19 type 1 and 2 type 2 virus strains (duplicate entries of identical sequence and type 1/type 2 recombinants were excluded) that were each analyzed across 1,326 codons representing 207 codons of EBNA3A, all 946 codons of EBNA3B (inclusive of alignment gapping but omitting the stop codon), and 173 codons of EBNA3C. Maximum-likelihood distances were calculated for all pairs of sequences (with PHYLIP Dnadist) and used to derive a neighbor-joining tree (with PHYLIP Neighbor) whose topology (unrooted) was supplied as input to Codeml.
A number of runs of Codeml were made to evaluate the existence and identity of positively selected sites, under a range of models for distribution of ω values across the codon set of the input alignment. Under Codeml model 1, only neutral or silent changes are allowed, whereas model 2 also allows positively selected changes. Models 1 and 2 yielded log likelihood values (for the data, given the model) of -8,904.35 and -8,850.29, respectively. By the criterion of the likelihood ratio test, these figures indicate that model 2 gave a highly significantly improved fit over model 1 and thus that the sequence set contains positively selected elements. From the model 2 analysis, 12 codon sites were identified as exhibiting positive selection on the basis of having a probability of greater than 0.95 that ω is >1; the sites identified are listed in Table 5. In model 3, the number of classes for ω values is specified to the program, which then computes the ω value for each class. Model 3 identified expanded numbers of codon sites as showing positive selection, namely 34, 34, and 27 sites for runs with 2, 3, and 8 ω classes, respectively (data not shown). Finally Codeml model B was applied to examine combined site specificity and lineage specificity of positive selection, with the EBV type 1 branches of the input tree designated as the foreground locus of interest for analysis of lineage specificity. Model B gave a log likelihood figure of -8,834.34 compared with a value for the comparable analysis that lacked the foreground-specific ω class (i.e., model 3 [discrete] with 2 ω classes) of -8,851.50, and the likelihood ratio test then showed that the addition of the foreground class of ω value represented a highly significant improved fit to the data. As shown in Table 5, model B identified five sites showing positive selection (P > 0.95 that ω is >1) specific to EBV type 1, two of which corresponded to codons 399 and 400 encoding amino acids 1 and 2 of the AVF epitope and a further two codons, 417 and 424, encoding amino acids 2 and 9 of the IVT epitope. Two other sites were identified by model B as representing positive selection not specific to the EBV type 1 branches; both refer to codons where all three nucleotides differed between type 1 and 2 sequences rather than to codons showing substantial diversification within viruses of the same type.
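The likelihood ratio comparisons reported above can be reproduced from the published log likelihoods with a few lines of Python. This is a minimal sketch, not the authors' procedure: it assumes that each comparison involves 2 degrees of freedom (the text does not state the value used), and the function name `lrt_df2` is introduced here for illustration. With 2 degrees of freedom the chi-square upper-tail probability has the closed form exp(-x/2), so no statistics library is needed.

```python
import math


def lrt_df2(lnl_null, lnl_alt):
    """Return (2 * delta lnL, p-value) for a likelihood ratio test.

    ASSUMPTION: the alternative model has exactly 2 more free parameters
    than the null, so the statistic is compared with chi-square, 2 d.f.
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    # Upper-tail probability of chi-square with 2 d.f. is exp(-x / 2).
    p_value = math.exp(-stat / 2.0) if stat > 0 else 1.0
    return stat, p_value


# Log likelihoods as reported in the text.
m1_vs_m2 = lrt_df2(-8904.35, -8850.29)   # model 1 (neutral) vs model 2 (selection)
m3_vs_b = lrt_df2(-8851.50, -8834.34)    # model 3 (2 classes) vs model B

for label, (stat, p) in [("M1 vs M2", m1_vs_m2), ("M3 vs B", m3_vs_b)]:
    print(f"{label}: 2*dlnL = {stat:.2f}, p = {p:.2e}")
```

Under the stated assumption, both test statistics (about 108.1 and 34.3) lie far beyond the 1% critical value of 9.21 for 2 degrees of freedom, consistent with the description of the improvements in fit as highly significant.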
TABLE 5. Codon sites in EBNA3 genes showing positive selection
Irrespective of the situation at HLA-A11 epitope loci, there is clearly a process of EBV sequence diversification which has occurred through random mutation and is reflected in the presence of several contemporary polymorphisms that serve as geographic markers of EBV identity (1, 4, 11, 18, 22, 23, 33). Their existence accords with the view that EBV has been evolving for many thousands of years within host populations that have remained geographically separate. Among Chinese virus strains studied to date, almost all of which have been type 1, the most frequently described geographic markers include a characteristic LMP1 sequence (here called the Ch' allele) with a 30-bp deletion and numerous other changes relative to B95.8 (4, 11, 22, 23, 33) and a characteristic EBNA1 sequence with several codon changes in the C-terminal half of the molecule and referred to as the V allele, after the signature amino acid found at position 487 (11, 22). This diversification of the EBNA1 sequence among different human populations is particularly informative since it affords an example of evolution that is presumably independent of CD8+-T-cell-mediated immune pressure. Thus, the EBNA1 protein is protected from CD8+-T-cell recognition by the presence of a glycine-alanine repeat domain that prevents peptides being generated from the endogenously expressed protein by proteasomal digestion (16, 30).
It is therefore very likely that the EBNA3A, 3B, and 3C genes will also have accumulated sequence changes via random mutation that, as in EBNA1 and LMP1, have become embedded in contemporary Chinese strains as a result of earlier founder effects. Indeed, such signature mutations, which may or may not be reflected as amino acid changes, are likely to be in the majority and may serve to mask local examples of positive or negative selection if the analysis of synonymous and nonsynonymous nucleotide change is conducted only at the level of a whole gene sequence. An interesting, but still unexplained, feature of the present results is that the EBNA3A, 3B, 3C, and 2 genes (Fig. 1 to 3) appear to display greater polymorphism among Chinese type 1 viruses as a group than is seen for the same viruses at the EBNA1 and LMP1 loci. Moreover, these polymorphisms at the EBNA3A, 3B, 3C, and 2 loci are linked and serve to identify separate families of type 1 Chinese viruses with characteristic combinations of alleles at all four loci. These familial relationships even appear to be reflected at the EBNA1 and LMP1 loci but only as a single nucleotide polymorphism within the V-allelic and Ch'-allelic sequences, respectively (Table 3). Genetic linkage between the EBNA3 gene loci and the EBNA2 locus some 40 kb upstream in the viral genome was first recognized as a feature of the type 1/type 2 division of EBV strains (29), but the present data provide the first clear evidence that this is also the case among EBV strains of the same type. It is possible that the coevolution of these loci reflects the fact that their protein products interact functionally during the virus-driven growth transformation of latently infected B cells, with both EBNA3A and 3C serving as regulators of EBNA2-induced gene activation through competitive inhibition of EBNA2's interaction with the cellular transcriptional regulator RBP-JK (12, 41).
Changes at the IVT and AVF epitope regions therefore need to be assessed against this background of slow evolutionary drift at the EBNA2, 3A, 3B and 3C gene loci. Indeed, the IVT and AVF mutations clearly do align with type 1 family divergence on a phylogenetic tree of Chinese EBV strains (Fig. 5). Thus, the Li family and the Wu family of isolates each display two family-specific patterns of epitope change, the Wu/Li recombinants display another pattern, and the more disperse Wu/Sp family displays four other patterns, three of them unique to that family. Of itself, this concordance between IVT/AVF sequence variation and other markers of type 1 virus diversification can be accommodated with either interpretation of epitope change (immune selection or neutral mutation). The question again becomes whether epitope change is anything other than a fortuitous marker of general diversification at the EBNA2, 3A, 3B, and 3C loci.
We examined the available evidence in two ways. The first was to pursue the approach of Khanna et al. (14) and look at the incidence with which random sequence change in the EBNA3 genes of Chinese viruses has coincidentally affected known CD8+-CTL epitope sequences other than the observed examples of epitope loss in the two A11 epitopes within EBNA3B. This type of analysis can now be conducted more systematically since we have much more sequence information from Chinese viruses and an increasing number of defined CTL epitopes (3, 24) of varying strength and restricted through a variety of HLA alleles, most of which are relatively rare or absent in the Chinese population (Table 4). While conservation of an epitope sequence could in some instances reflect an absolute requirement for that sequence in order to maintain protein function, such constraints are unlikely to apply in every case. Thus, it was significant that of the 15 epitopes that lie within those EBNA3A, 3B, and 3C gene sequences available from all 26 Chinese type 1 isolates, only the AVL epitope in EBNA3B (restricted through an HLA allele, B35, not seen in Chinese populations) showed evidence of consistent divergence from the B95.8 prototype, and this reflects a single nucleotide change with all the hallmarks of a founder mutation, being present in virtually all Chinese and Papua New Guinea strains but absent from Caucasian or African viruses (14). Among the 15 other epitopes that lie in areas of EBNA3A and 3C for which only the Chinese prototype (NPC15) sequence was available, again only one (the YPL/B35 epitope in EBNA3A) was altered; coincidental mutations of this epitope have been observed before in Chinese virus strains (15), although their potential effects on antigenicity have not been fully investigated. 
Our data suggest that, within the limits of currently available sequence information and currently known epitopes in EBNA3 proteins, the focusing of multiple epitope-loss mutations as seen at the IVT and AVF epitopes in Chinese viruses is not replicated coincidentally at other epitope loci.
A second means of examining the significance of our results was to use computer-based modeling to analyze the available Chinese EBNA3 gene sequences for evidence of sites under positive selection. In a series of papers since 1994, Yang and colleagues have developed probabilistic methods for evaluating evolutionary change in sets of diverging protein coding sequences, building on a codon-based model of sequence evolution (10), with particular reference to detecting and analyzing instances of positive (or diversifying) selection. The strategy involves estimation of the overall rates of nonsynonymous change (dN) and synonymous change (dS) for the data and identifying situations where the dN/dS ratio (termed ω) is greater than 1. This is a general approach of superior power and resolution for evaluating the occurrence of positive selection compared to the commonly used device of counting synonymous and nonsynonymous differences between pairs of aligned sequences (35). The resulting program, Codeml, incorporates modeling capabilities that allow for multiple ω values across lineages (34, 36) or across sites in the alignment (26, 38) or in a specified subset of lineages (37).
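For orientation, the simpler pairwise counting device referred to above can be sketched in a few lines of Python. This is an illustration of that counting approach only, not of the likelihood models implemented in Codeml. The function names and the toy sequences are hypothetical, the standard genetic code is assumed, and the raw counts are not normalized by the numbers of synonymous and nonsynonymous sites, as a proper dN/dS estimate would require.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_differences(seq_a, seq_b):
    """Count codons that differ synonymously and nonsynonymously.

    Both inputs are aligned, in-frame coding sequences of equal length.
    Returns a (synonymous, nonsynonymous) tuple of codon-level counts.
    """
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


def interpret_omega(omega):
    """Conventional reading of a dN/dS ratio (omega)."""
    if omega > 1:
        return "positive selection"
    if omega < 1:
        return "purifying selection"
    return "neutral"


# Hypothetical toy sequences (NOT taken from any EBV isolate).
reference = "ATTGTTACC"  # Ile-Val-Thr
variant = "ATCGTTAAC"    # Ile-Val-Asn: one silent and one replacement change
counts = count_codon_differences(reference, variant)
```

Counting of this kind treats every pair of sequences independently and ignores the phylogeny, which is one reason the codon-model approach is described in the text as having greater power and resolution.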
Analysis of the EBNA3 gene diversity by Codeml under different models of ω distribution gave lists of sites identified as undergoing positive selection whose numbers ranged from a total of 7 with model B to 34 with model 3. This variability reflects the probabilistic nature of the modeling process and the limitations of each model, together with the arbitrary cutoff of a P value of >0.95 applied for inclusion in the lists. In regard to our major aim of investigating whether the HLA-A11 epitopes in EBNA3B have been subject to positive selection over the set of EBV type 1 sequences, we can summarize the findings as follows: (i) four variable sites in the two epitopes were scored as exhibiting positive selection in all analyses (with models 2, 3, and B); (ii) in the analysis with model 2, three of the epitope sites were in the high-scoring category of a P value of >0.99, while in the analyses with model 3, all four of these sites were in the P value of >0.99 class; and (iii) in the model B analysis, the epitope-associated sites comprised four of only five sites scored as exhibiting positive selection specifically in the EBV type 1 portion of the tree. Overall, these results indicate strongly that the epitope coding sites have been subject to positive selection.
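The way in which such per-model site lists depend on the cutoff, and how sites common to all models can be extracted, may be illustrated with a minimal Python sketch. The site numbers and scores below are invented for illustration and are not the values obtained in this study; the function names are likewise hypothetical, and the sketch assumes only that each model assigns each candidate site a score P between 0 and 1.

```python
def sites_above_cutoff(site_scores, cutoff=0.95):
    """Return the set of sites whose score strictly exceeds the cutoff."""
    return {site for site, p in site_scores.items() if p > cutoff}


def sites_shared_by_all(models, cutoff=0.95):
    """Return the sorted sites that pass the cutoff under every model."""
    selected = [sites_above_cutoff(scores, cutoff) for scores in models.values()]
    return sorted(set.intersection(*selected)) if selected else []


# Hypothetical scores per codon site under three models (illustrative only).
toy_models = {
    "model 2": {101: 0.996, 102: 0.97, 250: 0.96, 300: 0.80},
    "model 3": {101: 0.999, 102: 0.995, 250: 0.99, 300: 0.97},
    "model B": {101: 0.98, 102: 0.96, 250: 0.90},
}

shared_095 = sites_shared_by_all(toy_models)             # sites passing in all models
strict_m3 = sites_above_cutoff(toy_models["model 3"], 0.99)  # stricter class, one model
```

In this toy example the list for each model changes in length with the cutoff, while the sites that pass under every model form a smaller and more robust core, which is the sense in which concordance across models is emphasized in the text.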
Our study therefore sets the A11 epitope variants in their broader genomic context and shows that, while the epitope changes map coherently onto a phylogenetic tree of Chinese type 1 EBV strains, a number of their features are still difficult to explain as the result of mere chance. First, both epitopes show a number of different patterns of nonsynonymous coding change; second, all amino acid changes yet examined appear to render the epitopes nonimmunogenic; third, searching the Chinese virus sequences for other instances of coincidental epitope loss produced few parallels; and fourth, computer-based analysis identifies two codons within each of the two epitope coding regions as being subject to positive selection. Collectively, these observations strengthen the view that IVT and AVF epitope change in Chinese EBV strains could have arisen through immunological pressure. Perhaps the strongest additional support for an immune selection hypothesis would now be to find an independent example of just such a phenomenon. Efforts should therefore be made to find other situations, however rare, where the possibility of CTL-mediated immune pressure shaping herpesvirus evolution within a human population can be examined.
We would like to thank the Functional Genomics Laboratory (supported by BBSRC grant 6/JIF13209) and the Glaxo Wellcome Biocomputing Laboratory (supported by MRC grant 4600017), School of Biosciences, University of Birmingham, for help with DNA sequencing.
in Epstein-Barr virus-transformed B lymphocytes. J. Virol. 70:4179-4183.
ß
with the proteasome: a new mechanism for selective inhibition of proteolysis. Nat. Med. 4:939-944.[CrossRef][Medline]
. J. Virol. 70:4228-4236.