Journal of Virology, October 2006, p. 9519-9529, Vol. 80, No. 19
0022-538X/06/$08.00+0 doi:10.1128/JVI.00575-06
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
John McNevin,4,
Jianhong Cao,4
Hong Zhao,1
Indira Genowati,1
Kim Wong,1
Sherry McLaughlin,1
Matthew D. McSweyn,4
Kurt Diem,2
Claire E. Stevens,2
Janine Maenza,2
Hongxia He,1
David C. Nickle,1
Daniel Shriner,1,
Sarah E. Holte,5
Ann C. Collier,2
Lawrence Corey,1,2,3,4
M. Juliana McElrath,2,3,4 and
James I. Mullins1,2,3*
Departments of Microbiology,1 Medicine,2 Laboratory Medicine, University of Washington School of Medicine, Seattle, Washington 98195,3 Program in Infectious Diseases,4 Program in Biostatistics and Biomathematics, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109,5
Received 20 March 2006/ Accepted 8 July 2006
17 time points during his first 3 years of infection, and in his infecting partner near the time of transmission. Autologous peptides covering amino acid sites inferred to be under positive selection were powerful for identifying HIV-1-specific cytotoxic-T-lymphocyte (CTL) epitopes. Positive selection and mutations resulting in escape from CTLs occurred across the viral proteome. We detected 25 CTL epitopes, including 14 previously unreported. Seven new epitopes mapped to the viral Env protein, emphasizing Env as a major target of CTLs. One-third of the selected sites were associated with epitopic mutational escapes from CTLs. Most of these resulted from replacement with amino acids found at low database frequency. Another one-third represented acquisition of amino acids found at high database frequency, suggesting potential reversions of CTL epitopic sites recognized by the immune system of the transmitting partner and mutation toward improved viral fitness in the absence of immune targeting within the recipient. A majority of the remaining selected sites occurred in the envelope protein and may have been subjected to humoral immune selection. Hence, a majority of the amino acids undergoing selection in this subject appeared to result from fitness-balanced CTL selection, confirming CTLs as a dominant selective force in HIV-1 infection.
In vitro studies indicate that cytotoxic T lymphocytes (CTL) can both effectively kill HIV-1-infected cells and inhibit viral replication through cytolytic and noncytolytic mechanisms (57). Studies of simian immunodeficiency virus-infected rhesus monkeys showed that depletion of CTL during primary infection led to the loss of initial control of viral replication and that depletion of CTL during chronic infection was associated with a rapid and marked increase in viremia (50). In humans, CTL responses have been associated with the decline of HIV-1 viremia following primary infection (8, 30). CTL responses are detectable in most study subjects, and responses to all nine HIV-1 proteins have been found to occur (2, 10, 15). Nef, Gag, and Pol (2, 7, 45) are reported to be recognized more frequently than Env, perhaps as a result of exceptionally high variability in Env (6, 15).
Despite these responses, adaptive host immunity is not able to clear HIV-1 infection, and numerous studies implicate CTL escape as an important mechanism for HIV-1 evasion. CTL responses to HIV-1 appear to select for escape mutants during both acute (9, 11, 24, 46) and, perhaps to a more limited extent (14, 27), chronic infection (4, 5, 17, 20, 55). In vitro studies demonstrate that incomplete HIV-1 suppression by CTL can result in rapid emergence of immune escape mutants (56). Viral adaptation to HLA-restricted CTL responses at the population level has also been inferred (25, 33, 34, 40, 61). Furthermore, recent reports indicate that reversion of CTL escape mutations can occur upon transmission to a new host (4, 34), likely reflecting recovery of a viral fitness cost exacted by amino acid substitutions that resulted in immune escape (16, 19, 24).
In an effort to affirm the aforementioned findings and to comprehensively examine the relationship between CTL responses and overall positive selective pressure acting throughout the HIV-1 proteome, we intensively studied the first 3 years of HIV-1 infection in an antiretroviral-therapy-naïve subject (PIC1362). Longitudinal viral genome or subgenome sequences were determined from plasma virions. Using computational methods, we identified sites potentially having experienced positive selection throughout the HIV-1 proteome. In addition, over the same time frame as viral sequencing was done, overlapping peptides representing the amino acid sequences of the entire HIV-1 proteome were tested for their ability to elicit CD8+ gamma interferon (IFN-γ)-secreting-T-cell responses. Our studies demonstrate that CTL responses are a major contributor to the positive selection shaping the viral population during the early years of HIV-1 infection.
FIG. 1. Clinical progression, venipuncture time points, and positive selection following acute HIV-1 infection in PIC1362. (A) Longitudinal viral load and CD4+ and CD8+ T-cell counts. The arrow indicates the single treatment of interleukin 2 (IL-2) taken by the subject. (B) Time points of IFN-γ ELISPOT analysis (row 1) and full-length (row 2) as well as subgenomic (row 3) sequencing, indicated by X's. Open circles indicate additional time points at which Gag p17 and p24 coding regions were sequenced. (C) Coding elements of the viral genome encompassed by targeted gene sequencing. (D) Codon positions of amino acids potentially under positive selection in the HIV-1 genome and their relationships to CTL responses and other potential selective forces. Reference epitopes were identified using peptides derived from HIV-1 subtype B lab strains or consensus sequences. Autologous epitopes were identified using peptides whose amino acid sequences were derived from the viral gene sequences of the subject. The numbers in parentheses correspond to the numbers of epitopes or amino acid sites within each category. Arrows pointing downward show sites mutated to a lower database (db) frequency, meaning the amino acid at a particular position had a database frequency of >50% during the initial infection and later mutated to a state with a database frequency of less than 50% and less than half of the database frequency of the initial state. Arrows pointing upward show sites mutated to a higher database frequency, meaning the amino acid at a particular position had a database frequency of <50% during the initial infection and later mutated to a state with a database frequency of more than 50% and at least twofold higher than the database frequency of the initial state. NGLS, N-linked glycosylation site.
10 years at the time of the transmission, and his class I HLA alleles were determined to be A*0101, A*0201, B*0801, B*5001/5004, Cw*0602, and Cw*0701.
Amplification and sequencing of viral genomes and fragments. RNA was purified from plasma virions by use of a QIAamp viral RNA mini kit (QIAGEN, Valencia, CA) following the manufacturer's recommendations. Viral RNA was reverse transcribed and targeted gene regions amplified by nested PCR as previously described (36). Primers used in this study and their corresponding products are summarized in Table S1 in the supplemental material.
For half- and full-length viral genome cDNA synthesis, a mixture of 34 µl viral RNA, 40 mM deoxynucleoside triphosphate (Invitrogen, San Diego, CA), and 80 pmol of 24-mer oligo(dT) primers was preincubated for 5 min at 65°C and then incubated for 1.5 h at 45°C with prewarmed 40 U RNase inhibitor (Roche Diagnostics, Indianapolis, IN), 800 U SuperscriptII, and 5X First-Strand buffer (Invitrogen, San Diego, CA). The mixture was supplemented with 400 U of SuperscriptII and incubated for another 1.5 h. The resulting cDNA was treated with 4 U RNase H (Invitrogen, San Diego, CA) for 30 min prior to PCR amplification. An Expand long-template PCR system (Roche Diagnostics, Indianapolis, IN) was then used for amplification by following the manufacturer's recommendations for buffer system 1. Hot-start PCR was carried out using paraffin in 0.5-ml thin-walled tubes in a DNA engine thermal cycler (MJ Research, Waltham, MA) under the following conditions: 2 min at 94°C; 10 cycles of 10 s at 94°C and 8.5 min at 68°C; and 20 cycles of 10 s at 94°C and 8.5 min at 68°C (with the extension step incremented by 20 s per cycle).
Targeted-gene PCR products were cloned into the TOPO TA vector (Invitrogen, San Diego, CA) and selected for sequencing as described previously (52) to avoid template resampling (35). PCR products from half- and full-length genome amplifications were gel purified, cloned into the pCR-XL-TOPO vector (Invitrogen, San Diego, CA), and sequenced by primer walking. To avoid template resampling, only one clone from each PCR was sequenced. Sequencing primers were designed for every 600 to 700 bases from HIV-1 clade B alignments, with a total of 32 primers. Due to viral variation, sequencing primers often had to be redesigned to amplify specific sequences. Sequences were determined with an automated DNA sequencer (Applied Biosystems, Foster City, CA) and edited using SEQUENCHER, version 3.0 (Gene Codes Corp., Ann Arbor, MI).
Synthetic peptides and IFN-γ ELISPOT assays.
By use of selected reference strains and consensus sequences as guides, 769 peptides spanning the entire HIV-1 subtype B proteome were synthesized for enzyme-linked immunospot (ELISPOT) assays. Env peptides (212 11- to 15-mers) were based on HIV-1MN; Gag (122 15-mer), Pol (248 12- to 15-mer), Tat (23 15-mer), and Nef (49 15-mer) peptides were based on HIV-1HXB2; and Vpr (22 15-mer), Rev (27 15-mer), Vif (47 15-mer), and Vpu (six 9-mer and 13 15-mer) peptides were based on the 2001 HIV-1 subtype B consensus sequence (32). In addition, to cover the differences between the major variants of the viruses infecting PIC1362 and the initial testing peptide panel, for Nef and variable loops of Env, a total of 91 additional peptides (41 Env 15-mer peptides [20 for V1/V2, 8 for V3, 8 for V4, and 5 for V5] and 50 Nef 15-mer peptides) were derived from autologous sequences from plasma viruses sequenced 8 days after onset of symptoms. Last, we synthesized 78 additional autologous 15-mer peptides whose sequences corresponded to 29 selected sites outside Nef and the variable regions of Env but were not included within the initial testing peptide panels. All 15-mers overlapped by 11 amino acids. For ELISPOT assay-positive 15-mers, shorter peptides were synthesized to define optimal CTL epitopes. All peptides used in this study are provided in Table S2 in the supplemental material. Peptides were synthesized by the Biotechnology Center at Fred Hutchinson Cancer Research Center, Synpep (Dublin, CA), or kindly provided by the National Institutes of Health AIDS Research and Reference Reagent Program.
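The overlap scheme described above (15-mers overlapping by 11 amino acids, i.e., a step of 4 residues between consecutive peptides) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the toy sequence, and the choice to add one end-anchored peptide so that the C terminus is covered are hypothetical and are not taken from the study.

```python
def overlapping_peptides(protein, length=15, overlap=11):
    """Return peptides of `length` residues that overlap by `overlap` residues."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and smaller than length")
    if len(protein) <= length:
        return [protein] if protein else []
    step = length - overlap  # 15 - 11 = 4 residues between window starts
    starts = list(range(0, len(protein) - length + 1, step))
    # Assumption (not from the source): add one end-anchored window
    # so that the last residues of the protein are always covered.
    last_start = len(protein) - length
    if starts[-1] != last_start:
        starts.append(last_start)
    return [protein[s:s + length] for s in starts]


if __name__ == "__main__":
    toy = "ACDEFGHIKLMNPQRSTVWYACDEFG"  # hypothetical 26-residue sequence
    for peptide in overlapping_peptides(toy):
        print(peptide)
```

With a step of 4 residues, every 9-mer falls entirely within at least one 15-mer, which is presumably why this spacing suits an initial screen before shorter peptides are used to define optimal epitopes.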
IFN-γ ELISPOT assays and determination of 50% effective concentration (EC50), the effective peptide concentration eliciting 50% of the peak IFN-γ response in the ELISPOT assay, were performed as previously described (10).
Estimation of diversifying selection.
Amino acid sites under diversifying selection, in which mutations to other amino acids are advantageous, are indicated by an excessive nonsynonymous-substitution distance compared to the synonymous-substitution distance within the codon (e.g., see references 18 and 48). Mutations within a codon that change the encoded amino acid are referred to as nonsynonymous, whereas mutations that do not change the encoded amino acid are referred to as synonymous. The number of nonsynonymous or synonymous changes per potential nonsynonymous or synonymous site is referred to as the nonsynonymous or synonymous distance ($d_n$ and $d_s$, respectively). The ratio $d_n/d_s$, or $\omega$, has been used to assess selection (26). Generally speaking, $\omega > 1$ indicates diversifying selection. For this analysis, the program PAUP* (54) was used to generate maximum-likelihood trees. CODEML, from a package for phylogenetic analysis by maximum likelihood (PAML) (58, 59), was then used to identify sites potentially under diversifying selection. The latter program employs codon-based models that allow for variable selection intensities among sites within protein-coding DNA sequences. By estimating $d_n$ and $d_s$ from the sample's phylogeny, the program can identify individual sites experiencing diversifying selection, i.e., those with $\omega > 1$. The sequence data were fit to a model (M7) in which all sites belonged to a category in which $\omega$ was beta-distributed and bound between 0 and 1 and to a model (M8) in which a fraction of sites belonged to a category in which $\omega$ was beta-distributed and bound between 0 and 1, with the remaining sites belonging to a category with a freely estimated $\omega$ value. A significantly better fit of M8 than of M7 (by a likelihood ratio test with 2 degrees of freedom) and a freely estimated $\omega$ value greater than 1 were taken as evidence of diversifying selection. Next, if a codon's posterior probability of belonging to the category of codons with $\omega > 1$ was greater than 99%, it was considered to have experienced diversifying selection.
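The decision rule described above can be sketched as follows. This is a minimal illustration, not the CODEML implementation: it assumes that the maximized log-likelihoods of M7 and M8, the freely estimated ω, and the per-codon posterior probabilities have already been read from the program's output. The function names and the 0.05 significance level are assumptions introduced here (the text states only that the fit must be significantly better). For 2 degrees of freedom the chi-square tail probability has the closed form exp(-x/2), so no statistics library is needed.

```python
import math


def lrt_m7_vs_m8(lnl_m7, lnl_m8, alpha=0.05):
    """Likelihood ratio test of M8 against M7 with 2 degrees of freedom."""
    stat = max(2.0 * (lnl_m8 - lnl_m7), 0.0)
    # Chi-square survival function for df = 2 is exactly exp(-x / 2).
    p_value = math.exp(-stat / 2.0)
    # alpha is an assumed threshold, not a value given in the text.
    return stat, p_value, p_value < alpha


def selected_codons(posteriors, omega_free, lrt_significant, cutoff=0.99):
    """Codons whose posterior probability of omega > 1 exceeds the cutoff.

    `posteriors` maps codon position -> posterior probability of belonging
    to the freely estimated omega category under M8.
    """
    if not (lrt_significant and omega_free > 1.0):
        return []  # no evidence of diversifying selection in this region
    return [codon for codon, pp in sorted(posteriors.items()) if pp > cutoff]
```

Both conditions (a significant test and a free ω above 1) are checked before any codon is reported, mirroring the order in which the criteria are stated.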
Estimation of directional selection. Amino acid sites experiencing directional selection, in which mutation to one particular amino acid is advantageous, are inferred when a mutant accumulates in the population faster than expected due to random drift. Suppose the frequency of a mutant at a site increases from $f_1$ to $f_2$ in time $t_1$ to $t_2$. A simulation program was written to determine if the frequency increase was likely due to random genetic drift. One thousand simulations of random genetic drift were repeated with a population of $n$ individuals, with the parents being randomly sampled with replacement each generation. The effective population size of HIV-1 within a chronically infected individual has been estimated to be on the order of $10^3$ (1, 31, 51, 53); thus, $n$ was set to 1,000. The probability that an allele increases in frequency from $f_1$ to $f_2$ from time $t_1$ to $t_2$ by random genetic drift, $P(\text{time} = t_2 - t_1 \mid f_{\text{initial}} = f_1, f_{\text{final}} = f_2, n = 1{,}000)$, can thus be estimated. If $P$ is >5%, the frequency increase can be explained by random genetic drift. Otherwise, the mutated site is considered to be potentially under directional selection. The generation time of HIV-1 has been estimated to be 2.0 days (38); thus, the time interval between $t_1$ and $t_2$ can be converted from days to generations.
Longitudinal sequence alignments of PIC1362 were screened for amino acid sites with mutations that had increasing frequencies over time. For a mutant, $t_1$ was taken as the time point just before the observation of the mutation and $f_1$ was set at $1/n_1$, with $n_1$ being the sample size at $t_1$. If the sequences in the data set were not all in the mutant form at the last time point, $t_2$ was taken as the last time point, and the observed last time point frequency of the mutant was used as $f_2$. Otherwise, the time point after which the mutant frequency remained at 100% in the data set was used as $t_2$ and $f_2$ was set at $(n_2 - 1)/n_2$, with $n_2$ being the sample size at $t_2$.
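The drift test described above can be sketched as a neutral Wright-Fisher simulation. The authors' program is not shown in the text, so this is a reconstruction under assumptions: P is taken to be the fraction of simulations in which the mutant frequency is at least f2 after t2 − t1 generations, each generation is formed by sampling n parents with replacement, and the function names are hypothetical.

```python
import random


def days_to_generations(days, generation_time_days=2.0):
    """Convert a time interval in days to HIV-1 generations (2.0 days each)."""
    return round(days / generation_time_days)


def drift_probability(f1, f2, generations, n=1000, simulations=1000, seed=None):
    """Fraction of neutral Wright-Fisher runs whose final frequency is >= f2."""
    rng = random.Random(seed)
    target = f2 * n
    hits = 0
    for _ in range(simulations):
        count = round(f1 * n)  # number of mutant genomes at the start
        for _ in range(generations):
            if count == 0 or count == n:
                break  # allele lost or fixed; drift cannot change it further
            p = count / n
            # Sampling n parents with replacement is a binomial draw.
            count = sum(1 for _ in range(n) if rng.random() < p)
        if count >= target:
            hits += 1
    return hits / simulations


if __name__ == "__main__":
    # Illustrative inputs only: f1 = 1/n1 and f2 = (n2 - 1)/n2 with
    # n1 = 15 and n2 = 10, over 54 days. Fewer simulations than the
    # 1,000 described in the text are used here to keep the demo fast.
    p = drift_probability(1 / 15, 9 / 10, days_to_generations(54),
                          simulations=200, seed=0)
    print("potentially under directional selection" if p <= 0.05
          else "explained by drift")
```

Pure-Python sampling keeps the sketch dependency-free; a vectorized binomial sampler would be the natural choice for screening many sites.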
Estimation of selection coefficient. We assumed that HIV-1 would behave as a haploid organism and that its evolution could be described adequately by a discrete-generations model due to fast turnover (47). Therefore, the effect of selection can be described by $\log(p_t/q_t) = \log(p_0/q_0) + t \times \log(1 + s)$ (22), where $p_0$ is the initial frequency of the advantageous mutant and $q_0$ is the initial frequency of the original amino acid. After $t$ generations, the frequency of the advantageous mutant becomes $p_t$, and the frequency of the original amino acid becomes $q_t$ (note that $p_0 + q_0 = 1$ and $p_t + q_t = 1$). When multiple mutant forms were observed for an epitope, all of the mutants were treated equally: $p_0$ and $p_t$ are the sums of the frequencies of all of the mutants at generation 0 and generation $t$, respectively. For each epitope or site of interest, the selection coefficient, $s$, was estimated from longitudinal data from the time that any mutant was observed until the time at which no wild type was observed. Estimation of $s$ was conducted with nonlinear regression using SAS version 9 PROC MODEL. $s$ was assumed to be constant over all observation times. Profile likelihood was used to obtain $s$ for each case where two or more data points were available for analysis.
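To make the log-odds relationship concrete, the sketch below solves it for s from just two time points and also runs it forward to predict a later frequency. It is a simplification and is not the authors' procedure, which used nonlinear regression with profile likelihood over all available time points, so it should not be expected to reproduce the reported coefficients. The function names are hypothetical, and both frequencies must lie strictly between 0 and 1.

```python
import math


def log_odds(p):
    """log(p / q) with q = 1 - p."""
    return math.log(p / (1.0 - p))


def selection_coefficient(p0, pt, generations):
    """Two-point estimate of s from log(pt/qt) = log(p0/q0) + t * log(1 + s)."""
    if not (0.0 < p0 < 1.0 and 0.0 < pt < 1.0):
        raise ValueError("frequencies must lie strictly between 0 and 1")
    if generations <= 0:
        raise ValueError("generations must be positive")
    slope = (log_odds(pt) - log_odds(p0)) / generations  # equals log(1 + s)
    return math.exp(slope) - 1.0


def expected_frequency(p0, s, generations):
    """Mutant frequency after `generations` under a constant coefficient s."""
    odds = (p0 / (1.0 - p0)) * (1.0 + s) ** generations
    return odds / (1.0 + odds)
```

Because the mutant-to-wild-type odds are multiplied by (1 + s) every generation, the two functions are exact inverses of each other when s is constant.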
Calculation of the database representation of each amino acid at positions of interest. The database frequencies of amino acids at each position were calculated using a collection of subtype B HIV-1 sequences chosen at random, one sequence per individual, from the HIV-1 sequence database (28, 32): 139 sequences for p17, 150 for p24, 130 for protease, 87 for reverse transcriptase (RT), 75 for IN, 155 for Vpr, 94 for Tat, 91 for Vpu, 111 for gp160, and 226 for Nef.
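A database frequency of this kind amounts to counting residues in one alignment column. The sketch below assumes the sequences are already aligned, one per individual, and that gap or ambiguity characters are left out of the denominator; that exclusion rule, the 0-based position index, and the function name are assumptions made for illustration.

```python
from collections import Counter


def residue_frequencies(aligned_sequences, position):
    """Frequency of each amino acid at a 0-based alignment column."""
    residues = [
        seq[position]
        for seq in aligned_sequences
        # Assumption: skip gaps and ambiguous residues.
        if position < len(seq) and seq[position] not in "-X?"
    ]
    if not residues:
        return {}
    counts = Counter(residues)
    total = len(residues)
    return {aa: n / total for aa, n in counts.items()}
```

For example, a column that reads L, F, L, L across four sequences gives L a frequency of 0.75 and F a frequency of 0.25.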
Nucleotide sequence accession numbers. The sequences generated in this study were submitted to GenBank under the accession numbers DQ853426 to DQ854622.
One thousand sixty-three amino acid sites across the viral proteome exhibited polymorphism within this data set. We used computational methods to identify sites among them that potentially had experienced diversifying selection or directional selection (see Materials and Methods). We found 39 sites undergoing diversifying selection and 34 others undergoing directional selection. Of the total of 73 sites, 36 were in Env, 14 in Nef, 8 in Pol, 5 in Gag, a total of 10 in Vif, Vpr, Tat, and Vpu, and none in Rev (Fig. 1C). Table 1 summarizes all of the selected amino acid sites and their associations with known selective forces.
TABLE 1. Summary of selected amino acid sites
-secreting CD8+ T cells by use of PBMCs cryopreserved from 17 time points. We defined 25 CD8+ T-cell epitopes, including 14 not reported previously (Table 2). Eighteen epitopes were identified using a panel of 769 peptides derived from HIV-1HXB2, HIV-1MN, or consensus sequences covering the whole proteome. Sequences of gp120 and Nef in this subject were largely different from those of the tested peptides. We therefore examined an additional 91 autologous 15-mer peptides derived from the viral sequences found at day 8 across Nef and all five variable regions of Env. The viral population sampled at this time point was virtually homogeneous, with the only genetic diversity corresponding to scattered, clone-specific changes (data not shown). In this process, two additional epitopes (Nef WW9 and Env EY10) were identified.
TABLE 2. HIV-1-specific CD8+ T-cell epitopes recognized by PIC1362
CD8+ T-cell-mediated responses were restricted by all six of this subject's HLA class I alleles: 9 by HLA-A, 10 by HLA-B, and 6 by HLA-C alleles (Table 2). Notably, Env was the richest region for CTL recognition, with eight reactive epitopes, including five identified only by use of autologous peptides encompassing selected sites.
Relationship between amino acid sites under positive selection and CD8+ T-cell recognition and escape. For each CTL epitope identified, we also tested T-cell recognition of the mutant forms detected over time. Functional avidity, summarized by EC50 values, was characterized for responses to both the wild-type and mutant forms by use of T cells obtained at the time point at which the epitope elicited peak responses. Mutant forms with EC50 values at least 10-fold higher than those of the corresponding wild-type epitope forms were designated escape mutants. Escape mutations occurred at 25 amino acid sites within 16 epitopes. Of note, 13 sites of mutation were within seven Env epitopes (Table 3).
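The escape criterion stated above is a simple fold-change rule on EC50 values, sketched here. The function name is hypothetical, and both EC50 values are assumed to be expressed in the same concentration units.

```python
def is_escape_mutant(ec50_wild_type, ec50_mutant, fold_threshold=10.0):
    """True if the mutant EC50 is at least `fold_threshold` times the wild type's."""
    if ec50_wild_type <= 0 or ec50_mutant <= 0:
        raise ValueError("EC50 values must be positive")
    # A higher EC50 means more peptide is needed for a half-maximal response,
    # i.e., lower functional avidity for the mutant form.
    return ec50_mutant >= fold_threshold * ec50_wild_type
```

Comparing against the product rather than dividing keeps the boundary case (exactly 10-fold) from depending on floating-point division.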
TABLE 3. HIV-1-specific CTL escape mutants in PIC1362
Selected sites flanking CTL epitopes. CTL escape can be mediated both by mutations within an epitope that result in poor binding to the presenting HLA class I molecules and/or the T-cell receptor and by amino acid changes flanking an epitope, with escape resulting from impaired processing and presentation (4, 13, 60). Of the 51 selected sites not in CTL epitopes, 12 were located in the flanking regions within 10 amino acids of the epitopes (Fig. 1C), of which 7 were located within 5 amino acids of the epitopes.
Apparent selective advantage of escape mutants. CTL responses to two epitopes, CCFHCQVC (Tat CC8, amino acids [aa] 30 to 37) and EAVRHFPRI (Vpr EI9, aa 29 to 37), were detected as early as day 8, and escape mutants were observed as early as day 50 after onset of acute symptoms (11). The Vpr EI9 escape variant completely replaced the epitope within 54 days (0/15 at day 22 versus 10/10 at day 76), corresponding to approximately 27 generations (assuming a 2.0-day replication cycle [38]). For Tat CC8, escape mutants increased from 0/17 to 9/9 within 29 days. Selection coefficients of the mutants were thus calculated to be 0.24 for Vpr EI9 and 0.41 for Tat CC8 (see Materials and Methods), indicating that in the presence of CTL responses, the mutants had a 24% or 41% increased likelihood for survival and replication over the wild type. The selection coefficients of other CTL escape mutants ranged from 0.0024 to 0.15, with an average of 0.03. Thus, it appeared that the escape mutants from the earliest CTL recognition conferred a higher survival advantage than the later-appearing mutants.
Association of amino acid sites under positive selection with other adaptive immune responses. To estimate the possible effects of other immune responses, namely, helper T-lymphocyte (HTL) and humoral immune responses, on the selection of HIV-1 variants, we compared the observed viral sequences to epitopes of CD4+ HTL and neutralizing antibodies reported in the HIV-1 immunology database (29). We found that eight of the selected amino acid sites were located in known HTL epitopes (Fig. 1C). However, four of these, p24-15I, RT-135I, RT-211R, and gp120-226L, were also within confirmed CD8+ T-cell epitopes. No selected site in Env was located within a known neutralizing antibody binding site, and only one change, I399S in gp120, resulted in the acquisition of an N-linked glycosylation site (Fig. 1C).
Association of amino acid mutations mediating CTL escape with viral database frequency. Amino acids associated with high viral replicative fitness are likely to be carried with high frequency in HIV infection. Indeed, infection of a new host results in the reacquisition of mutations that are more ancestral or consensus-like (23). Therefore, viruses with amino acids found at higher frequencies in the Los Alamos National Laboratory (LANL) HIV sequence database (32) might reasonably be hypothesized to be associated with better replicative capacity or to be more stable in the absence of host immune pressure (3). Exceptions may include those within peptides that have been adapted extensively to a human population with shared HLA alleles (33, 40). We compared the frequencies of amino acids at each escape position in their original and escape mutant forms in the subtype B database (32). In our study, the database frequencies of the amino acids responsible for CTL escape were significantly lower than those of the corresponding amino acids found in the recognized epitope (P = 0.0021, Wilcoxon signed-rank test).
Amino acid changes associated with transmission and outgrowth early in infection. To identify sites associated with viral outgrowth in the newly infected host, we sequenced 10 full-length viral genomes from plasma of the subject's sexual partner (PIC1365) obtained 1 month after onset of acute symptoms in subject PIC1362. Phylogenetic analysis of random sequences of C2-V5 of Env (and other regions) from the LANL sequence database and those from PIC1362 and PIC1365 showed monophyletic clustering between sequences from PIC1362 and PIC1365 (data not shown). This is consistent with historical documentation that PIC1365 was the transmitting source partner. Comparing sequences from PIC1362, PIC1365, and known HIV-1 CTL epitopes specific only to PIC1365's class I HLA alleles (A*0101, B*0801, B*5001/5004, Cw*0602, and Cw*0701) in the LANL HIV immunology database (29), we identified two selected sites in the recipient (PIC1362) within p17 (aa 31) and Nef (aa 94). Of interest, these amino acid changes resided at putative anchor residues (underlined) within two HLA-B8-restricted epitopes, GGKKKYKL, Gag GL8, of p17 (aa 24 to 31) and FLKEKGGL, Nef FL8, of Nef (aa 90 to 97), and were transmitted from an HLA-B8-positive donor (PIC1365) to the HLA-B8-negative recipient (PIC1362) in the form of p17-31F and Nef-94E (Table 4).
TABLE 4. Longitudinal viral sequence analysis of and CTL responses to Gag GL8 and Nef FL8
The database frequency is 100% for p17-31L and 91% for Nef-94K for subtype B. Therefore, the selection for p17-31L and Nef-94K in the recipient suggests that viruses evolved toward the higher-database-frequency form, likely the more replication-fit form, in the absence of corresponding CTL responses. We detected no CTL responses to either p17-31 or Nef-94 peptides in the donor at the time point just following transmission (Table 4). However, 4 years later, all of the sampled viruses from the donor were in the form p17-31L and a low-level ELISPOT assay response (185 spot-forming cells [SFC]/$10^6$ PBMCs) was detected. This is consistent with the idea that the viruses with p17-31F were escape mutants in PIC1365 and that this escape might be associated with fitness cost. Loss of the epitope form, leading to a decay of CTL pressure in PIC1365, may have accounted for the late development of the epitope form of p17-31L, which was then followed by reemergence of a CTL response.
Of the 73 selected sites (Fig. 1C), 21 (29%), including p17-31 and Nef-94, were found with sequence changes evolved from those at a lower subtype B database frequency (<50% at early infection) to those at a database frequency at least twofold higher and with over 50% database frequency. Moreover, 86% (18/21) of these sites evolving to amino acids of higher database frequency were not within CTL epitopes recognized in PIC1362. Of the 18 sites, 15 were identified only by the method for detection of directional selection. Therefore, evolution to enhance replicative fitness, including reversion of CTL escapes with fitness costs in the donor, is very likely to be another significant positive-selection force in HIV-1 infection.
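The two-part frequency criterion used here, together with its mirror image for shifts toward lower frequency given in the legend to Fig. 1, can be written as a small classifier. This is a sketch only: frequencies are expressed as fractions between 0 and 1, and the function name and return labels are assumptions.

```python
def classify_frequency_shift(initial, later):
    """Classify a change in database frequency (both given as fractions)."""
    # Toward lower frequency: from >50% to <50% and to less than half
    # of the initial database frequency.
    if initial > 0.5 and later < 0.5 and later < initial / 2.0:
        return "lower"
    # Toward higher frequency: from <50% to >50% and to at least twofold
    # the initial database frequency.
    if initial < 0.5 and later > 0.5 and later >= 2.0 * initial:
        return "higher"
    return "unclassified"
```

Note that crossing the 50% line alone is not sufficient: a hypothetical change from 40% to 60% is only a 1.5-fold increase and would remain unclassified.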
We were not able to implicate known selective forces for the remaining 20 selected amino acid sites not accounted for as discussed above (Fig. 1C). However, 15 of these sites were in Env, the principal target of the humoral immune response. It is therefore possible that some of these sites in Env were selected as a result of conferring escape from presently undefined humoral immune responses.
We found CTL epitopic responses, as well as escape mutations, within seven of the HIV-1 proteins, which together were restricted by all six of the subject's HLA alleles. Sixteen of the 25 epitopes detected (64%) developed escape mutations over the course of the first 3 years of infection. Four of the epitopes that developed escape mutations were identified only by using peptides encompassing selective sites. Thus, we may have overestimated the percentage of epitopes with escape mutations. However, excluding these four epitopes, we still found that 12 of 21 (57%) epitopes developed escape mutations. A total of 22 of the 73 selected sites identified (30%) were in 16 confirmed CTL epitopes, and 20 of these sites were located in 14 CTL epitopes in which escape mutations occurred. An additional 12 selected sites were located within 10 amino acids flanking an epitope, where mutations might impair epitope processing and presentation (4, 13, 60). Fewer responses could be readily attributed to HTL (≤8 sites) or neutralizing antibodies (1 site). In addition, 21 selected sites (29%), including 18 not located in CTL epitopes recognized in this subject, were found to mutate from a form of low database representation to a form of high database representation. Mutations at two of these sites appeared to revert two donor-specific epitopes from CTL escape mutant forms in the donor to their epitopic forms in the recipient. Thus, in the absence of the selective pressure of a targeted immune response, these amino acids may have reverted to the more common, epitopic forms.
In a recent study (3), Allen et al. monitored four HIV-1-infected subjects for viral sequence changes in the 4 to 5 years following acute infection. Consensus whole-genome sequences, excluding Env, were determined over time. Fifty-three percent of the accumulating mutations (equivalent to the selected sites described here) they reported were associated with CTL responses. This compares with 30% of the mutations we studied over the whole proteome in recognized epitopes, or 43% if we exclude Env. In addition, 18% of the mutations they detected were associated with mutations towards more common HIV-1 subtype B consensus sequences, and these sites were inferred to be reversions of transmitted CTL escape mutants. This compares to 25% of the mutations we found, or 30% if we exclude Env. Last, 29% of the mutations detected by Allen et al. were not linked to cellular immune responses or reversion, compared to 45% overall in our study, or 38% outside Env. Our results are therefore in substantial agreement with this and other previous studies (3, 24) and confirm that in the first few years of infection, escape from CTL responses and improvements in replication fitness are two major positive-selection forces shaping the natural course of HIV-1 evolution across the viral proteome.
Amino acid sites under positive selection were typically inferred when the codon exhibited an excessive nonsynonymous-substitution distance compared to the synonymous-substitution distance. This method is sensitive for identifying amino acid sites under diversifying selection, in which mutations to multiple other amino acids are advantageous, or parallel evolution, in which the same mutation recurs in separate lineages. However, this method largely fails to identify amino acid sites under directional selection, in which mutation to one particular advantageous amino acid occurs once. Hence, to identify amino acid sites under directional selection, we developed a method to screen sequences for sites showing a mutant frequency that increased faster than one would expect under random genetic drift. Thirty-four of the 73 selected sites were identified by this method but were missed by nonsynonymous/synonymous-distance analysis. Of the directionally selected amino acid sites, 7 were associated with CTL responses and 15 were correlated with fitness improvement. Therefore, our method for identification of sites under directional selection should be very useful for epitope identification as well as studies of viral evolution and host-virus interactions.
CTL responses against HIV-1 infection are commonly assessed using peptides derived from lab strains. The use of peptides derived from consensus sequences has improved the detection rate of CTL responses, with further increases noted by use of peptides based on autologous viral sequences (6, 15). However, due to the high degree of genetic variation between HIV-1 strains both among and within infected individuals, it is impractical to estimate CTL responses by using all autologous peptides, and such an approach is likely to have a low yield of immune-targeted sites. Indeed, when we synthesized 91 autologous 15-mer peptides derived from the viral sequences dominating initial infection, and covering full-length Nef and the variable loops of Env, we identified only two new epitopes. In a more targeted approach, we used positively selected sites detected by computational analyses to guide identification of autologous epitopes. In this way, we identified these two plus five new epitopes by using 78 autologous 15-mers covering selected sites, a threefold- to fourfold-better rate of detection of epitopes. Thus, designing and testing autologous peptides based on amino acid sites under positive selection is a powerful and cost-saving method to identify CTL epitopes missed by conventional methods that employ only peptides derived from reference strains or consensus sequences. It also needs to be noted that selection-based methods are biased to identify CTL epitopes that escaped recognition, whereas epitopes with minimum evolution are missed by these methods.
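The "threefold- to fourfold-better" figure can be checked with simple arithmetic on the counts given above. The sketch assumes two readings of the targeted result: five new epitopes from 78 peptides, or seven if the two epitopes found in the untargeted screen are counted as well. The function name is hypothetical.

```python
def fold_improvement(targeted_epitopes, targeted_peptides,
                     baseline_epitopes, baseline_peptides):
    """Ratio of epitopes found per peptide: targeted versus baseline screen."""
    targeted_rate = targeted_epitopes / targeted_peptides
    baseline_rate = baseline_epitopes / baseline_peptides
    return targeted_rate / baseline_rate


if __name__ == "__main__":
    # Baseline: 2 epitopes from 91 autologous peptides (about 2.2%).
    # Targeted: 5 or 7 epitopes from 78 peptides covering selected sites.
    print(round(fold_improvement(5, 78, 2, 91), 1))  # counting new epitopes only
    print(round(fold_improvement(7, 78, 2, 91), 1))  # counting all seven
```

The two readings give ratios of roughly 2.9 and 4.1, which brackets the range quoted in the text.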
Prior studies have either omitted Env from analysis (3) or generally failed to show that Env is as important a CTL target as Nef, Gag, and Pol (2, 7, 45). The use of autologous peptides chosen to include selected sites, however, showed that Env was the gene product most frequently targeted by CTLs in this subject, with eight reactive epitopes, of which seven developed escape mutations. Indeed, five of the eight Env epitopes identified were detected only by using autologous peptides covering selected sites. Thus, because of the exceptionally high variability of Env, CTL responses to Env will remain underappreciated if only peptides derived from consensus or lab strains are examined.
Although CTL escape mutants are advantageous in terms of escaping immunity, they may have replication fitness costs (4, 16, 19, 24, 34). Our study showed that compared to the epitopic forms, mutants mediating CTL escapes were associated with low database frequency, implying fitness costs of these mutations. In addition, 18/51 (35%) of the selected amino acid sites not in recognized CTL epitopes were found to have mutated from lower-database-frequency forms to higher-database-frequency forms, including two sites at which mutations appeared to result in reversion of CTL escape mutants present in the donor. This suggests the existence of selective pressure in the new host to improve replication fitness when the site is no longer targeted by an active immune response.
Given the extensive CTL escape observed in PIC1362 and in other studies (4, 5, 9, 11, 17, 20, 24, 46, 55), it is surprising that we identified only two amino acid sites at which mutations appeared to result in reversion of potential CTL escape mutants present in the infecting partner, PIC1365. It may be that many of the escapes that occur within a host have little fitness impact on the virus and thus revert slowly or not at all. Alternatively, compensatory mutations outside the epitopes that affect presentation and restore fitness may obviate the requirement for reversion. In addition, these two sites were identified by comparing sequences from PIC1362, PIC1365, and known HIV-1 CTL epitopes specific to PIC1365's class I HLA alleles in the LANL HIV immunology database (29). It should be noted that this database is substantially incomplete. For example, in the case of PIC1362, 14 of the 25 recognized epitopes were not reported previously. In any case, we did not have specimens available to follow the infecting partner over time and thus are unable to document most of his cellular responses or escape mutations.
Figure 2 illustrates the balancing selective pressures we hypothesize to play critical roles in shaping the viral proteome. Escape mutants are selected in an HIV-1-infected donor due in large measure to CTL responses. However, many of these escape mutations have a replication fitness cost. Therefore, viruses with these escape mutations are less fit when transmitted to a new host. CTLs in a recipient with different HLA alleles target different stretches of the viral proteome. Thus, a new set of escape mutants, many of which have reduced replicative fitness, is selected. Balancing this effect, escape mutations that occurred in the donor, are no longer under immune pressure in the recipient, and impair fitness come under selective pressure to revert to an amino acid that confers greater fitness. Therefore, pressures to repair impaired replicative fitness of the transmitted viruses and to escape immune responses, especially CTLs, are two major positive-selection forces shaping the natural course of HIV-1 evolution.
FIG. 2. Two major positive-selection forces hypothesized to act on HIV-1 during the natural course of infection: escape from CTL responses and increasing replication fitness. Dark-gray bars indicate immune escape mutations that are balanced by a partial loss in replication fitness. Light-gray bars indicate immune escape mutations that have little or no replication fitness cost. Thick vertical lines indicate sequence differences from the theoretically most replication-fit virus that have little or no effect on viral fitness.
Our findings also have important implications for HIV-1 vaccine designs that seek to reduce disease progression and transmission rates if the goal of sterilizing immunity cannot be attained (41). Immunization against more-fit viral forms should force replicating viruses that break through this immunity to escape into forms of increasingly lower fitness. Thus, the challenge suggested by our and other recent findings is to develop vaccines that target amino acid variants mediating high fitness and provide wide coverage of different variants (43). Given the frequent CTL targeting of Env observed in this subject by use of autologous peptides, it is also important to include Env as an immunogen in order to elicit maximum cellular as well as humoral responses.
This work was supported by grants from the U.S. Public Health Services, including support for the Seattle Primary Infection Program (AI57005) and the University of Washington Center for AIDS Research (AI27757), and by grant M01-RR-00037 for leukapheresis.
Supplemental material for this article may be found at http://jvi.asm.org/.
These authors contributed equally to this work.
Present address: University of Alabama, RPHB 327, 1530 3rd Ave. S, Birmingham, AL 35294-0022.