Impact of Antiretroviral Therapy Duration on HIV-1 Infection of T Cells within Anatomic Sites

HIV-1 persists as an integrated genome in CD4+ memory T cells during effective therapy, and cessation of current treatments results in resumption of viral replication. To date, the impact of antiretroviral therapy duration on HIV-infected CD4+ T cells and the mechanisms of viral persistence in different anatomic sites are not clearly elucidated. In the current study, we found that treatment duration was associated with a reduction in HIV-infected T cells. Our genetic analyses revealed that CD4+ effector memory T (TEM) cells derived from the lymph node appeared to contain provirus that was genetically identical to plasma-derived virions. Moreover, we found that cellular proliferation counterbalanced the decay of HIV-infected cells throughout therapy. The contribution of cellular proliferation to viral persistence is particularly significant in TEM cells. Our study emphasizes the importance of HIV-1 intervention and provides new insights into the location of memory T cells infected with HIV-1 DNA, which is capable of contributing to viremia.


HIV-1 infection frequencies of T cells located in different anatomic sites during effective ART. The impact of ART duration on the proportion of HIV-1-infected T cells is not clearly defined.
To evaluate the effect of ART duration on the proportion of infected T cells, we performed a cross-sectional/interparticipant analysis of the proportion of HIV-1-infected cells in CD4+ T cell subsets sorted from peripheral blood (PB), lymph node (LN), and gut tissues. We sorted a broad range of CD4+ T cell subsets from the anatomic sites using their specific cellular markers in 26 participants after they had been on effective ART for 3.0 to 17.8 years: 12 who initiated therapy during acute/early HIV-1 infection (≤6 months of infection before initiation of therapy) (AHI group) and 14 who initiated therapy during chronic HIV-1 infection (≥1 year of infection before initiation of therapy) (CHI group) (Tables 1 to 3 and Fig. 1 to 3). The anatomic regions and cellular subsets were collected after the stated duration of ART for each participant (Table 1). These participants were continuously suppressed during the study, except for one participant who had a viral rebound at the time of sampling (Table 1). We used a previously described maximum-likelihood method to estimate the proportion of cells infected within each T cell subset and within each anatomic site (23). The influence of ART duration on the proportions of infected cells is presented as a fold effect per year of ART. We estimated the fold differences in the proportions of cells infected between earlier and later time points. A fold effect per year of ART greater than 1 indicates that a higher proportion of HIV-1-infected T cells is associated with each additional year on therapy, while a value of 1 means a stable proportion of infected T cells during treatment. A fold effect per year of less than 1 indicates that a lower proportion of infected T cells is associated with each additional year of ART.
The statistical significance indicates the evidence for increase (a fold effect per year of greater than 1) or decrease (a fold effect per year of less than 1) versus the null hypothesis of no change over the duration of ART (a fold effect per year of 1). Furthermore, we estimated the impact of ART duration on the proportions of infected cells when at least four participants contributed to the fold effect per year on ART within each T cell subset and tissue.
In PB-derived T cells from the AHI group, each additional year of ART was associated with an overall lower proportion of HIV-1-infected T cells that was statistically significant (fold effect = 0.82/year; 95% confidence interval [CI] = 0.70 to 0.97; P = 0.034) (Fig. 4A). This equates to an 18% reduction in the proportion of cells infected during each additional year on therapy. The fold effect on the proportion of cells infected per year in naive (TN), central memory (TCM), transitional memory (TTM), and effector memory (TEM) T cell subsets sorted from the AHI group ranged from 0.71 to 0.90, all indicating that a lower proportion of infected T cells was associated with each additional year of therapy. This lower proportion of HIV-1-infected T cells was statistically significant in TN and TCM cells. As we obtained stem cell memory T (TSCM) cells from only two participants from the AHI group, we were not able to examine the association between the ART duration and the proportion of infected T cells within this T cell subset (Table 2). For the CHI group, the fold effect per year of ART within the infected T cells from the PB was similar to that for the AHI group (fold effect = 0.84/year; 95% CI = 0.38 to 1.82; P = 0.59) but not statistically significant (Fig. 4B). The fold effects on the proportion of infected T cells associated with each additional year on therapy in TN, TCM, TTM, and TEM cells derived from the CHI group were also similar to those from the AHI group. The PB-derived TSCM cells were obtained from four participants in the CHI group. The reduction in the proportions of infected TSCM cells during each additional year on therapy was not statistically significant (fold effect = 0.70/year; 95% CI = 0.15 to 3.37; P = 0.56).
Of interest, TEM cells derived from the CHI group had strong evidence for a stable proportion of HIV-1-infected T cells during each additional year on therapy, with a fold effect of 1.01 per year (95% CI = 0.87 to 1.17). However, CXCR5+ CCR6− (X5+ R6−) memory CD4+ T cells derived from the CHI group had a substantially lower proportion of infected T cells associated with each additional year of ART.
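The relationship between a fold effect per year and the corresponding percent change can be illustrated with a short sketch. It assumes that the per-year effect compounds multiplicatively over time, as a log-link regression model would imply; the function names and the 5-year horizon are illustrative choices and are not part of the study. The fold effects used are the PB point estimates quoted above.

```python
def percent_change_per_year(fold_effect: float) -> float:
    """Percent change in the proportion of infected cells per additional year of ART."""
    return (fold_effect - 1.0) * 100.0


def cumulative_fold(fold_effect: float, years: float) -> float:
    """Fold change after `years`, ASSUMING the per-year effect compounds multiplicatively."""
    if fold_effect <= 0:
        raise ValueError("fold effect must be positive")
    return fold_effect ** years


if __name__ == "__main__":
    # Point estimates quoted in the text; the 5-year horizon is hypothetical.
    estimates = {"AHI PB overall": 0.82, "CHI PB overall": 0.84, "CHI PB TEM": 1.01}
    for label, fold in estimates.items():
        change = percent_change_per_year(fold)
        after_5 = cumulative_fold(fold, 5)
        print(f"{label}: {change:+.0f}% per year, {after_5:.2f}-fold after 5 years")
```

A fold effect of 0.82 therefore corresponds to the 18% per-year reduction stated above, while a value near 1 (such as 1.01) leaves the proportion essentially unchanged. Because the estimates are cross-sectional, the projected values describe the fitted trend across participants rather than a measured within-participant decay.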
The LN data were derived from 4 participants from the AHI group, with samples obtained from 3.6 to 7.3 years of therapy (Tables 1 and 2). For 8 participants from the CHI group, the LN samples were obtained from 3 to 10.8 years of therapy (Tables 1 and 2). For these LN samples, we found an overall lower proportion of HIV-1-infected T cells associated with each additional year on therapy, with a fold effect of 0.15/year and 0.74/year for the AHI and CHI groups, respectively (Fig. 4A and B). For the AHI group, the fold effect per year on therapy was influenced by all the T cell subsets sorted from LNs (Fig. 4A). For the CHI group, all the T cell subsets sorted from LNs also had a lower proportion of infected cells per year on ART, with a fold effect of 0.62 to 0.88 per year on therapy (Fig. 4B). Overall, our results provide some evidence that HIV-1-infected T cells decay substantially during each additional year of therapy within the LN-derived T cell subsets we sorted from 4 participants in the AHI group compared to the 8 participants from the CHI group.

[Table footnotes: b, Years (y) and months (m) before ART initiation. Only the CHI group was used for analysis. NA, not applicable. c, Years, months, or days (d) after ART initiation. Only the CHI group was used for analysis. NA, not applicable. d, Duration on ART at the time of sample isolation. e, CD4 cell count at the time of sample isolation. f, Viral load at the time of sample isolation. g, Therapeutic regimen at the time of sample isolation. 3TC, lamivudine; ABC, abacavir; ATV, atazanavir; AZT, zidovudine; DRV, darunavir; ECV, entecavir; EFV, efavirenz; ETR, etravirine; FPB, fosamprenavir; FTC, emtricitabine; MVC, maraviroc; NVP, nevirapine; RGV, raltegravir; RTV, ritonavir; TDF, tenofovir. h, Excluded from analyses of the fold effect of the proportion of cells infected for cross-sectional analysis.]
We also found a strong correlation between the proportion of HIV-1-infected T cells sorted from LNs and PB within the acute/early participants for whom we had both LN and PB samples available (LN and PB correlations within T cell subsets were 0.98 to 0.99 [Table 4]). Among the chronic participants who had both samples available, we also found a strong correlation between LNs and PB within the TN and TCM cell subsets (0.93 and 0.96, respectively). We used mixed-effects models of both PB and LN data together to obtain extrapolated fold effects per year in LN-derived T cell subsets. These models exploit the PB-LN correlations to estimate what the effects would have been for all 12 AHI and all 14 CHI participants. These models estimated an overall 13% reduction in the proportion of infected T cells during each additional year on therapy in LNs of the AHI group, with a fold effect of 0.87/year (95% CI = 0.43 to 1.77) (Fig. 4A). This lower proportion of cells infected per year on ART was influenced by TN (fold effect = 0.57/year) and TCM (fold effect = 0.67/year) cells. In the CHI group, the fold effects per year on therapy derived from the extrapolation of T cell subsets located in the LNs were comparable to those measured within the actual samples (0.86/year and 0.74/year, respectively) (Fig. 4B). This equates to a 14% reduction in the proportion of infected cells in the CHI group and was comparable to the overall reduction in LNs derived from the AHI group.
To assess the association between the ART duration and the proportion of infected cells within the gut tissue, we included nine and eight participants from the AHI and CHI groups, respectively (Tables 1 and 2). In the gut, the change in the proportion of HIV-1-infected T cells, as measured by fold effect, across all T cell subsets was similar to those for PB and LNs (AHI overall = 0.89/year; 95% CI = 0.64 to 1.25; P = 0.29; CHI overall = 0.65/year; 95% CI = 0.30 to 1.40; P = 0.17) but was not statistically significant for either the AHI or the CHI group (Fig. 4A and B). We also performed a correlation analysis within paired T cell subsets between gut and PB (Table 4). This correlation analysis was performed on participants who had more than 1,000 cells within each T cell subset derived from the gut (Table 2). In the AHI group, we found a moderate correlation with PB-derived T cell subsets within central/transitional memory T (TCTM) and TEM cells sorted from the gut. For the CHI group, the correlations between gut and PB within TCTM and TEM cells were 0.79 and 0.93, respectively. However, only the TEM cell values were statistically significant.
To investigate the changes in cellular infection frequencies within each anatomic site in absolute terms, we determined the number of HIV-1-infected T cells per million within each T cell subset derived from PB, LNs, and gut for all participants (Fig. 5 and 6). These data recapitulated the results of the fold effect analysis per year on ART within each T cell subset.
Defective HIV-1 DNA in the p6-RT region during therapy. HIV-1 replication is characterized by rapid and highly error-prone reverse transcription, which lacks proofreading capacity and causes genetic defects in viral genomes (15, 36-40). Also, an antiviral mechanism called G-to-A hypermutation renders in-frame stop codons and produces replication-deficient HIV-1 DNA during reverse transcription (15,41,42). These genetic defects have been identified in memory T cells during therapy (43-45). However, the impact of ART duration on the accumulation of genetically defective HIV-1 DNA is unclear. Therefore, we used a mixed-effects logistic regression model to calculate the proportion of genetically defective HIV-1 DNA p6-RT sequences during each additional year of therapy. Included in this analysis were 3,963 intracellular HIV-1 p6-RT sequences isolated from PB, LNs, and gut (Tables 5 and 6). HIV-1 sequences were defined as genetically defective due to insertions and/or deletions causing a frameshift, G-to-A hypermutations, stop codons, and/or internal deletions.
Overall, the odds of an HIV-1 DNA sequence being genetically defective did not substantially increase with the duration of ART within PB, LN, and gut tissues obtained from both participant groups (Fig. 7). Similar results were found in specific CD4+ T cell subsets sorted from the anatomic sites obtained from the AHI group (Fig. 8) and the CHI group (Fig. 9). Our findings provide evidence that defective HIV-1 DNA p6-RT sequences do not increase substantially in the cell subsets and anatomic sites we analyzed during therapy regardless of when ART was initiated.
Pre- and on-therapy plasma samples contained few defective HIV-1 RNA p6-RT sequences. Studies have found that the majority of HIV-1 isolated from virions in the plasma is infectious and replication competent (46-48). Therefore, we compared the quantities of defective HIV-1 DNA sequences in T cell subsets sorted from PB, LNs, and gut sampled during ART to the 1,134 HIV-1 RNA sequences isolated from pre- and on-therapy plasma samples in the p6-RT region (Tables 5 and 6).
HIV-1 DNA sequences derived from the anatomic sites were more often genetically defective than HIV-1 RNA sequences derived from pretherapy plasma samples (Fig. 10). The odds that a PB-derived HIV-1 DNA sequence was genetically defective were about 3-fold higher than those for pretherapy plasma HIV-1 RNA sequences in both participant groups. In LNs, the odds that an HIV-1 DNA sequence was genetically defective were 2.5 and 3.0 times higher than those for pretherapy plasma sequences in the AHI and CHI groups, respectively. In the gut, the odds that a viral DNA sequence was defective were 4.9- and 2.1-fold higher than those for pretherapy plasma RNA sequences in the AHI and CHI groups, respectively. However, they did not reach statistical significance in the CHI group. HIV-1 DNA sequences derived from the anatomic sites contained more genetic defects than viral RNA sequences derived from on-therapy plasma samples, but in some cases, they did not reach statistical significance (Fig. 10). In PB, the odds that a viral DNA sequence was defective were 5- to 6-fold greater than those for on-therapy plasma RNA sequences for both participant groups. Similarly, the odds that a viral DNA sequence was defective were about 5 times higher for LNs than for on-therapy plasma sequences in both participant groups. In the gut, the odds that a viral DNA sequence was genetically defective were 9.2-fold and 3.7-fold higher than for on-therapy plasma HIV-1 RNA sequences derived from the AHI and CHI groups, respectively.
Furthermore, a majority of CD4+ T cell subsets derived from the anatomic sites also showed higher odds that a viral sequence was genetically defective than pre- and on-therapy plasma-derived sequences in both the AHI and CHI groups (Fig. 11).

[Figure caption fragment: ... Table 1). The error bars indicate the 95% confidence intervals of the infection frequency fold effect per year. *, P < 0.05; **, P < 0.01; ***, P < 0.001. The effects were estimated by negative binomial regression.]

Impact of Anti-HIV Therapy Length on T Cell Infection. Journal of Virology.
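To make the odds comparisons above concrete, the following minimal sketch computes an unadjusted odds ratio and a Wald-type 95% confidence interval from a 2x2 table of defective versus intact sequence counts. This is a simplification: the study estimated its odds ratios with logistic regression models, and the counts used here are hypothetical values chosen only for illustration. The function name and the z value of 1.96 are likewise assumptions.

```python
import math


def odds_ratio_with_ci(defective_a, intact_a, defective_b, intact_b, z=1.96):
    """Unadjusted odds ratio (group A vs. group B) with a Wald-type interval.

    The interval is built on the log scale using the standard error
    sqrt(1/a + 1/b + 1/c + 1/d); every cell count must be positive.
    """
    counts = (defective_a, intact_a, defective_b, intact_b)
    if min(counts) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (defective_a * intact_b) / (intact_a * defective_b)
    se_log_or = math.sqrt(sum(1.0 / c for c in counts))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # HYPOTHETICAL counts: 30 defective / 70 intact cellular DNA sequences
    # versus 10 defective / 90 intact plasma RNA sequences.
    or_, lo, hi = odds_ratio_with_ci(30, 70, 10, 90)
    print(f"OR = {or_:.2f} (95% CI = {lo:.2f} to {hi:.2f})")
```

An interval that excludes 1 corresponds to a statistically significant difference at the 5% level in this simplified setting. A regression model, as used in the study, can additionally account for repeated sequences from the same participant, which a single 2x2 table cannot.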
HIV-1 sequences from LN-derived TEM cells were more often genetically identical to pre- and on-therapy plasma viral RNA sequences. We found that HIV-1 RNA p6-RT sequences isolated from pre- and on-therapy plasma contained few genetic defects. The lack of defects within the p6-RT region of HIV-1 RNA sequences derived from pre- and on-therapy plasma indicated that these sequences came from replication-competent virions that could contribute to recrudescence. Therefore, we compared HIV-1 RNA and DNA p6-RT sequences to identify cell subsets that contained viral genomes most closely related to plasma-derived HIV-1. We compared the HIV-1 RNA and DNA sequences from the CHI group (Fig. 12), since HIV-1 sequences in participants treated during chronic infection are genetically diverse (17,23,24). A representative phylogenetic tree showing the genetic comparisons between HIV-1 RNA sequences derived from plasma and HIV-1 DNA sequences obtained during ART is presented in Fig. 13. We did not have pre- and on-therapy plasma samples collected from five of the CHI participants. Therefore, nine participants were included in the genetic comparison between HIV-1 DNA sequences isolated from the PB-derived T cell subsets and HIV-1 RNA sequences obtained from pretherapy or on-therapy plasma samples. For the comparison between the LN-derived T cell subsets and the plasma samples, LN tissue was collected from seven participants in the CHI group.

We observed that the proportions of identical HIV-1 DNA sequences increased while the proportions of unique viral sequences decreased with duration on therapy in PB and LNs (Fig. 15A, B, D, and E). In the gut, however, we found that the proportions of identical HIV-1 DNA sequences decreased while the proportions of unique viral sequences increased with duration on therapy when data from participants who were on ART for 3 to 17.8 years were included (Fig. 15C and F). However, the trends in LNs and gut did not reach statistical significance. Only PB showed a statistically significant increase in the odds that an HIV-1 DNA sequence was genetically identical to another by a factor of 1.09 per year on ART (95% CI = 1.03 to 1.15; P = 0.003) and a decrease in the odds that a viral sequence was unique by a factor of 0.941 (95% CI = 0.903 to 0.981; P = 0.004) (Fig. 15A and D). The phylogenetic analysis of HIV-1 DNA sequences derived from PB clearly showed increasing expansions of identical HIV-1 DNA sequences as ART duration increased (Fig. 15G to I).
We observed that genetically identical HIV-1 DNA sequences increased whereas unique viral sequences decreased in a majority of CD4+ T cell subsets sorted from PB, LNs, and gut during therapy (Fig. 16 and 17). In PB-derived TEM cells, however, we observed the strongest evidence that the proportions of identical HIV-1 DNA sequences increased while the proportions of unique viral sequences decreased during each additional year on therapy (Fig. 16D). The odds of an HIV-1 DNA sequence being genetically identical to another in TEM cells increased with the duration of therapy by a factor of 1.11 per year (95% CI = 1.03 to 1.20; P = 0.007) (Fig. 16D). The odds that a viral sequence would be unique, however, decreased per year on ART by 0.909-fold in TEM cells (95% CI = 0.850 to 0.971; P = 0.004) (Fig. 16D).
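A per-year odds factor acts on odds rather than directly on proportions, which the following sketch illustrates. It converts a starting proportion to odds, applies the per-year factor for a given number of years, and converts back. The per-year factors (1.09 and 0.941) are the PB estimates reported above; the starting proportions and the 10-year horizon are hypothetical, and constant multiplicative change in the odds is assumed, as in a logistic model with years on ART as a linear term.

```python
def prob_to_odds(p: float) -> float:
    """Convert a proportion in (0, 1) to odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("proportion must lie strictly between 0 and 1")
    return p / (1.0 - p)


def odds_to_prob(odds: float) -> float:
    """Convert non-negative odds back to a proportion."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)


def projected_proportion(start: float, odds_factor_per_year: float, years: float) -> float:
    """Proportion after `years`, ASSUMING the odds change by a constant factor each year."""
    return odds_to_prob(prob_to_odds(start) * odds_factor_per_year ** years)


if __name__ == "__main__":
    # HYPOTHETICAL starting proportions; per-year factors are the PB estimates in the text.
    identical = projected_proportion(0.30, 1.09, 10)
    unique = projected_proportion(0.60, 0.941, 10)
    print(f"identical sequences: 0.30 -> {identical:.2f} after 10 years")
    print(f"unique sequences:    0.60 -> {unique:.2f} after 10 years")
```

Because the factor multiplies odds, the change in the proportion itself depends on the starting value: the same factor of 1.09 moves a proportion near 0.5 more, in absolute terms, than one near 0 or 1.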

DISCUSSION
Understanding the impact of ART duration on the dynamics and genetic composition of HIV-1-infected cells is critical for effective curative strategies. We therefore performed detailed genetic analyses of HIV-1 RNA sequences derived from pre- and on-therapy plasma samples and HIV-1 DNA sequences derived from a broad range of CD4+ T cell subsets sorted from blood, lymph node, and gut tissues from participants who initiated treatment during acute/early and chronic infection. These studies allowed us to elucidate how the duration of therapy affects the number of HIV-1-infected T cells and the genetic composition of the proviruses they contain. In this cross-sectional analysis, we analyzed samples obtained after at least 3 years of effective therapy when HIV-1 replication was fully suppressed to below the limit of detection to minimize the impact of episomal HIV-1 DNA in the single-proviral sequencing (SPS) data (49,50).

FIG 10
Odds that an HIV-1 sequence was defective in anatomic sites versus plasma samples. Shown is a comparison of the odds that an HIV-1 RNA sequence from pre- and on-ART plasma samples was defective versus the odds that a viral DNA sequence from PB, LN, and gut tissues was defective for the AHI group (open squares) and the CHI group (solid squares). The comparison of the odds is indicated as the OR; the error bars indicate the 95% confidence intervals for the ORs. *, P < 0.05; **, P < 0.01; ***, P < 0.001; ****, P < 0.0001. p6-RT sequences were used for the genetic comparisons. The odds ratios and their confidence intervals were estimated by logistic regression models.

FIG 11
Odds that an HIV-1 sequence was defective in CD4+ T cell subsets versus plasma samples. Shown are comparisons of the odds that a viral sequence was defective in CD4+ T cell subsets sorted from PB, LN, and gut tissues derived from the AHI group (A) and the CHI group (B) to the odds that they were defective in pre- and on-therapy plasma samples. Blue shading, odds ratio of the CD4+ T cell subset to pretherapy plasma; yellow shading, odds ratio of CD4+ T cell subset to on-therapy plasma. *, P < 0.05; **, P < 0.01; ****, P < 0.0001. p6-RT sequences were used for the comparisons. The odds ratios and their confidence intervals were derived by logistic regression models.

We investigated the genetic composition of HIV-1 within the p6-RT region. It is well known that no subgenomic region can accurately represent the genetic diversity of full-length HIV-1 sequences (43,44,51). However, it has been shown that the gag-p6-pro region is better than other subgenomic regions at predicting the genetic diversity of full-length HIV-1 genomes in the viral populations derived from participants who initiated therapy during acute and chronic infection (51). The HIV-1 gene region we analyzed was longer than gag-p6-pro and had a clonal prediction score of 95, meaning that identical sequences in this region are identical throughout the viral genome with a probability of 95% (51). We analyzed over 5,000 HIV-1 RNA and DNA sequences, including both genetically intact and defective sequences, in this p6-RT region. Therefore, the subgenomic region selected for our genetic analyses, the clonal prediction score of the region, and the large number of sequences we analyzed increase the likelihood that the results of our study reflect the findings that would have resulted from full-length HIV sequence analyses. Moreover, the p6-RT HIV-1 DNA sequences that contain defects are not functional, as the region encodes viral enzymes important for replication. Previous studies have shown that defective HIV-1 genomes can produce viral proteins, including gag and pol (52,53). Also, CD4+ T cells from HIV-1-infected participants showed an increase in the expression of p24 when incubated with gag and pol viral proteins, indicating that these viral proteins can induce viral replication within infected cells (54). The p6-RT region we sequenced includes gag and pol, and based on the studies described above, we believe that quantifying how the proportions of defective or intact p6-RT sequences change over the duration of ART can help assess the risk of viral rebound and spread when ART is interrupted.
Several studies have demonstrated that the number of HIV-1-infected cells decays more rapidly in HIV-1-infected individuals who initiated therapy during early infection than in those who started treatment during chronic infection (55-57). Within the peripheral blood derived from the AHI group, we found an 18% reduction in the proportions of cells infected during each additional year of therapy.
This result was similar to that of a recent longitudinal study that estimated the decay rate within CD4+ T cells from peripheral blood at 13% per year on ART (55). However, one of the studies included participants who had several episodes of intermittent viremia per year, indicating that their infections were not fully and continuously suppressed by the treatment (57). Moreover, the studies assessed the virus from total CD4+ T cells (55,57) or sorted a limited population of memory CD4+ T cells (56) from peripheral blood when demonstrating a decay of HIV-1-infected cells during therapy. In contrast to these previous findings, we found similar decreases in the number of HIV-1-infected cells within the peripheral blood from both participant cohorts irrespective of whether treatment was initiated during acute/early or chronic infection. In both participant cohorts, this decrease in HIV-1-infected cells resulted from a statistically significant association of lower HIV-1 infection frequencies of naive and central memory T cells with each additional year of ART. However, in the case of participants treated during chronic infection, we found a large decrease in the proportion of HIV-1-infected cells over the years of therapy in the CXCR5+ CCR6− cell subset. A recent study describes cells expressing PD-1 and CXCR5 as circulating follicular helper T cells (58). We did not sort for the PD-1 cell marker, but our data may suggest that HIV-1 infection of circulating follicular helper T cells is less stable and even decays over time. Moreover, it has been found that CXCR5+ memory CD4+ T cells sorted from peripheral blood coexpress CCR7, indicating these cells comprise a subpopulation of peripheral-blood-derived TCM cells (59). Thus, the CXCR5+ CCR6− cell subset could also have contributed to the decrease in the number of HIV-1-infected TCM cells during each year on therapy in the peripheral blood of the participants treated during chronic infection.
Overall, these findings indicate that the numbers of HIV-1-infected naive and TCM cells within the peripheral blood decrease during effective therapy. This observed decrease could be due to the decay of T cells that harbor defective proviruses, as cells contributing to the HIV-1 reservoir are known to contain predominantly defective viral genomes (44).
For AHI and CHI participants with PB and LN samples available for analysis, HIV-1-infected cells from the lymph node decreased per year on ART. A substantial decrease in the proportion of HIV-1-infected T cells during each additional year on ART was found for all lymph node-derived CD4+ T cell subsets from four participants treated during acute/early infection with paired PB and LN samples. Extrapolation to all 12 acute/early participants estimated smaller reductions in the proportion of HIV-1-infected cells within LNs over 3 to 17.8 years of therapy that were similar to those for the participants treated during chronic infection. This finding suggests that tissue restoration and T cell reconstitution in lymph nodes reduced the number of HIV-1-infected cells during therapy for both participant groups (60).
Due to errors made during HIV-1 replication, genetic mutations that reduce viral fitness accumulate (15,37,38,61). Genetic defects, such as G-to-A hypermutation, internal deletions, and nucleotide insertions and/or deletions that cause a frameshift, also accumulate in proviruses during viral replication (15, 37-42, 45, 62). However, our previous studies have shown a lack of HIV-1 evolution during ART, indicating that there is little evidence for ongoing viral replication during therapy (17,23). We combined all the defective features within the p6-RT region and found that the proportion of genetically defective HIV-1 DNA sequences did not appear to increase substantially with the duration of therapy. This was observed in most of the cell subsets and the anatomic sites we analyzed regardless of when ART was initiated. Our results revealing that defective HIV-1 sequences do not accumulate during therapy agree with those of a recent study showing that cytotoxic T cells can target cells containing defective proviruses (63). Taken together, our findings provide evidence that defective viral sequences are established during multiple rounds of viral replication before viral suppression rather than during effective therapy.
We found that plasma samples obtained before ART and during the early phase of treatment contained a lower proportion of defective viral sequences than HIV-1 DNA sequences derived from blood or tissue CD4+ T cells. This result indicates that plasma-derived viral sequences most likely represent a population of HIV-1 that is infectious and capable of producing new virions (46-48). Compared to intracellular HIV-1 sequences from the other T cell subsets, TCM cells from the peripheral blood were least likely to contain HIV-1 DNA sequences that were identical to pre- and on-therapy plasma RNA sequences. Our recent study revealed that TCM cells contain the smallest amount of genetically intact HIV-1 proviruses in peripheral blood (43). In agreement with the previous study, our genetic analysis involving intracellular HIV-1 DNA sequences and plasma-derived viral RNA sequences suggests that the TCM cell subset within the peripheral blood is the least likely cellular source for infectious HIV-1. Importantly, our genetic analyses revealed that, in the lymph node tissue, TEM cells were highly enriched with viral sequences that were genetically identical to plasma-derived HIV-1 sequences. These results suggest that TEM cells are a probable source for infectious HIV-1 in the lymph node compared to all other T cell subsets.

[Figure caption fragment: ..., and x represents the years on ART. Each HIV-1 DNA sequence is the unit of analysis, and the denominator for each sequence is a single HIV-1-infected T cell (33,73). The confidence interval of each data point was derived from the binomial distribution. Odds ratios and their confidence intervals were estimated by logistic regression models.]
The presence of genetically identical HIV-1 DNA sequences indicates that cellular proliferation plays a role in the maintenance of persistent HIV-1 during therapy (17, 23, 25-28, 43, 64). Our in-depth genetic analysis revealed that the genetically identical HIV-1 DNA sequences increased whereas the unique viral sequences decreased in a majority of CD4+ T cell subsets sorted from PB and LNs during therapy. This indicates that cellular proliferation compensates for the decay of HIV-1-infected CD4+ T cells in the peripheral blood and lymph nodes over time.
We found that expansions of genetically identical HIV-1 DNA sequences increased over the years of therapy at a statistically significant rate in the peripheral blood and that peripheral-blood-derived TEM cells were the main contributors. This is in agreement with our previous longitudinal study, which showed that identical HIV-1 DNA sequences increased in PB-derived TEM cells over 6 months of therapy in both p6-RT and env genomic regions (17). In contrast, the population of unique HIV-1 DNA sequences decayed in peripheral blood at a statistically significant rate, particularly in TEM cells. A previous study found that an increase in the proportion of identical HIV-1 integration sites was associated with a decrease in the proportion of unique viral integration sites in PB-derived CD4+ T cells during therapy (65). This supports the inverse relationship between identical and unique p6-RT HIV-1 DNA sequences within PB, particularly within the PB-derived TEM cell subset. Also, it has been noted that the change in the frequency of PB-derived TEM cells containing HIV-1 DNA was less than 2-fold during 6 months of effective therapy (17). In agreement with this previous study, we found that the HIV-1 infection frequency appeared to be stable during 3 to 18 years of therapy in TEM cells sorted from peripheral blood. Taken together, our findings provide strong evidence that the overall stability of HIV-1-infected TEM cells is regulated by cellular proliferation that restores T cell loss during therapy with the clonal expansion of particular TEM cells containing identical HIV-1 sequences (possibly in response to an antigen) and the reduction/extinction of TEM cells containing unique HIV-1 sequences. The one anatomic region that revealed a decrease in identical sequences and an increase in unique sequences was the gut; however, the findings in the gut did not reach statistical significance.
This suggests that cells located in the gut are under strict immune regulation preventing their response to normal gut flora, which limits cellular proliferation and expansion of identical HIV-1 sequences (66-70).
Although there are limitations to cross-sectional analyses, this study of more than 5,000 HIV-1 single-genome sequences from 26 participants has provided important insights into the HIV-1 reservoir during 3 to 17.8 years of therapy. A longitudinal study using full-length HIV-1 sequencing and integration site analysis of individual participant samples over time would be ideal to reconfirm our findings, but it would be timeconsuming and costly (43,44,64,65,71,72). Moreover, the number of cells required to adequately sort specific T cell subsets for full-length HIV-1 sequencing and integration site analysis would limit the ability to conduct such a study using frozen peripheral blood cells and tissue samples, which most laboratories have for longitudinal samples.
In conclusion, our in-depth genetic characterization of HIV-1 within anatomic sites after 3 to 17.8 years of ART revealed several important findings. The number of HIV-1-infected memory T cells decays in all anatomic sites analyzed with early and late ART initiation. The absence of substantial increases in the pool of defective HIV-1 sequences during effective therapy in both participant cohorts suggests that the defects are established during multiple rounds of viral replication before viral suppression rather than during effective ART. Moreover, lymph node-derived TEM cells are a probable source of HIV-1 genomes capable of producing infectious virus. Importantly, the complex interplay of identical and unique HIV-1 DNA sequences indicates that cellular proliferation plays a significant role in the maintenance of persistent HIV-1, and these mechanisms are most pronounced in peripheral-blood-derived TEM cells.

MATERIALS AND METHODS
Study approval. This project was approved by the institutional review board at the Western Sydney Health Department for the Westmead Institute for Medical Research (AU RED LNR/13/WMEAD/315) and the ethics review committees at the University of California San Francisco (UCSF) (10-01330/068192 and 10-02631/083640) and the Vaccine Gene Therapy Institute-Florida (VGTI-FL) (FWA 00004139). All participants provided written informed consent prior to inclusion in the study.
Participant selection. HIV-1-infected adults on effective ART for 3.0 to 17.8 years were recruited for the study at UCSF, San Francisco, CA, USA. The inclusion criteria for the study were at least 3 continuous years of therapy, with undetectable viral loads since month 6 of therapy and HIV-1 RNA at <40 copies/ml for at least 3 years. All the participants had viral loads of <40 HIV-1 RNA copies/ml during therapy except for participant 2450, whose viral load rebounded to 3,418 HIV-1 RNA copies/ml at the time of sampling (Table 1).
Clinical samples. PB samples from 26 participants infected with HIV-1 subtype B who were on long-term suppressive ART (duration, 3.0 to 17.8 years) were analyzed. Of those participants, LN and gut (gut-associated lymphoid, ileum, and/or rectum) tissues were available from 12 and 17 participants, respectively. Samples were collected from 12 participants who initiated therapy ≤6 months after infection (acute/early) and 14 participants who initiated therapy ≥1 year after HIV-1 infection (chronic) (Table 1). Note that all the clinical samples analyzed were collected after the stated duration of therapy for each participant (Table 1). The CD4 counts of all participants ranged from 339 to 1,165 cells/µl at the time of sample collection (Table 1). We completed and published in-depth longitudinal genetic characterizations of the HIV-1 of 8 participants (Tables 5 and 6) (17,23). Here, we conducted an interpatient cross-sectional study of the impact of treatment duration on the HIV-1 reservoir, which included our published data (17,23).
Isolation of cells from peripheral blood, lymph node, and gut tissues. The T cell subsets were isolated from peripheral blood, lymph node, and gut tissues as previously described (17,23). Briefly, 230-ml peripheral blood samples were collected in tubes containing acid citrate dextrose as an anticoagulant or by leukapheresis. Peripheral blood mononuclear cells (PBMCs) were separated from plasma using Ficoll within 30 min after collection. Total CD4+ T cells were isolated from PBMCs using magnetic negative selection (Stem Cell Technologies, Vancouver, Canada) according to the manufacturer's protocol. For isolation of cells from lymph nodes, 1/2 to 1 inguinal lymph node was removed under local anesthesia. Mechanical dissociation, followed by filtration and washing, was applied to the lymph node tissue. The isolation of cells from gut tissues and rectal and ileal biopsy specimens was accomplished using enzymatic digestion with Liberase TL or Liberase DL (Roche), respectively, in association with DNase I (Sigma) and mechanical disruption using GentleMacs (Miltenyi).
Cell sorting. Fluorescence-activated cell sorting (FACS) (FACSAria; BD Biosciences) was used to sort CD4+ T cell subsets, as previously described (17,23). CD4+ TN, TCM, TTM, and TEM cells were sorted from peripheral blood sampled from all 26 participants (Tables 1 to 3 and Fig. 1) (17,23). CD4+ TSCM, memory CXCR5−CCR6+ (X5−R6+), and X5+R6− T cell subsets were sorted from blood samples from 6 participants whose blood samples were collected after at least 15 years of therapy (Tables 1 to 3 and Fig. 1 and 2). CD4+ TN, TCM, TTM, and TEM cells were sorted from lymph nodes obtained from 12 participants (Tables 1 to 3 and Fig. 3) (17). Figures 1 to 3 show representative cell-sorting strategies from PB and LNs. CD4+ T cells were sorted from gut tissues of 5 participants whose gut samples were collected after >15 years of therapy (participants 2115, 2518, 2046, 2013, and 2026) (Tables 1 to 3). CD4+ TN, TCTM, and TEM cell subsets were derived from gut samples obtained from another 12 participants using a gating strategy published previously (Tables 1 to 3). The detailed gating strategy is presented in Fig. 1 to 3. From rectal and ileal tissues, CD4+ T cells were sorted from single-cell suspensions using the following combinations of antibodies: CD3-Alexa 700 (clone UCHT1; BD no. 557943), CD4-APC (clone RPA-T4; BD no. 555349), CD8-PB (clone RPA-T8; BD no. 557943), CD14-V500 (clone M5E2; BD no. 561391), and LIVE/DEAD aqua marker (Invitrogen no. L34957). The markers used for sorting each CD4+ T cell subset from each anatomic site are presented in Table 3. For each cell type derived from peripheral blood, lymph nodes, and gut, 62 × 10^6 to 30 × 10^6 cells were sorted into FACS tubes. The cells were divided and spun down in 1.5-ml Eppendorf tubes. The supernatant was removed, and the cell pellets were stored at −80°C for further analysis.
Postsort analysis of cellular purity was done for each cell type from peripheral blood, with means of 93.4% for TN, 90.3% for TCM, 90.9% for TTM, 95.3% for TEM, 97.3% for CXCR5−CCR6+, and 94.9% for CXCR5+CCR6− cell subsets. The cellular purity for TN and CD4+ memory T cell subsets sorted from lymph node tissue and gut was similar to that in the previous report (17). The purity of CD4+ T cells sorted from gut tissues was 94.9%. The cell purity for TSCM cells was not assessed due to the low number of cells sorted. For HIV-1 sequencing, we analyzed an average of 5.0 × 10^3 to 2.4 × 10^6 cells in each CD4+ T cell subset (Table 2).
DNA extraction. DNA was extracted from CD4+ T cell subsets sorted from peripheral blood and gut samples collected from 6 participants who were on ART for at least 15 years (Table 1). Four hundred microliters of RNAzol RT (MRC, Inc.) was added to a 1.5-ml Eppendorf tube containing the cell pellet; a 0.4× RNAzol RT volume of sterile nuclease-free water (Invitrogen) was added and mixed by inversion for 15 s, followed by incubation for 15 min. The mixture was centrifuged at 16,000 × g for 15 min at room temperature. The top phase was removed, and the bottom phase was used for DNA extraction. Nine hundred microliters of DNAzol (MRC, Inc.), followed by 10 µl of glycogen (20 µg/µl; Qiagen), was added to the bottom phase. DNA was precipitated by adding 500 µl of 200 proof ethanol (Sigma-Aldrich). The mixture was incubated for 10 min at room temperature and centrifuged at full speed for 30 min. The supernatant was removed, and the DNA pellet was washed with 75% ethanol twice. The pellet was air dried until no ethanol was visible. The pellet was dissolved in 300 µl of 8 mM NaOH (Sigma-Aldrich), followed by neutralization by adding 24 µl of 0.1 M HEPES (Gibco).
Single-proviral sequencing. The SPS assay was developed to obtain many individual intracellular HIV-1 DNA sequences from cells of participants on long-term therapy to assess the viral DNA population characteristics, diversity, evolution, and HIV infection frequencies of the cells (17,23,33,73). We validated the technique using tissue culture cells with known numbers of genetically distinct HIV-1 proviruses. We applied this proven technique to quantify and genetically characterize HIV-1 DNA sequences from T cell subsets. Using advanced fluorescence-activated cell sorting, CD4+ TN, TSCM, TCM, TTM, TEM, and memory X5+R6− and X5−R6+ cells were sorted from peripheral blood and lymph node tissue samples based on their cellular phenotypes (Fig. 1 and 3 and Table 3) (17,23,74). From gut tissues, we sorted CD4+ TN, TCTM, TEM, and total CD4+ T cells (Table 3). Briefly, the lysate of the sorted cells or the extracted DNA was serially diluted (1:1 to 1:729). Single HIV-1 DNA molecules were amplified using primers flanking the gag-pol region (p6-RT). PCR amplification and sequencing of the DNA molecules in each well allowed quantification and analysis of the genetic relationship of HIV-1 DNA sequences in each infected cell subset.
Single-genome sequencing. Plasma was collected before and during ART, and HIV-1 RNA sequences were obtained using single-genome sequencing (SGS). The HIV-1 RNA sequences were compared to the HIV-1 DNA sequences identified using SPS, as described previously (75)(76)(77). At least 22 ml of plasma was used to pellet the virus. The pre-ART plasma samples were collected from 20 out of 26 participants from 4.7 years before to just prior to the initiation of treatment (Table 1) (17,23). Eighteen out of 26 participants had plasma samples collected after ART initiation (Table 1). The on-therapy plasma samples from 5 previously reported participants were collected at 5.8 to 13.4 years of ART (17,23). The on-therapy plasma samples from the remaining 12 participants were collected at ≤3 months after the initiation of therapy (Table 1). Only the sequences from the CHI group were used for the comparison of HIV-1 RNA sequences derived from pre- and on-ART plasma samples and HIV-1 DNA sequences obtained from CD4+ T cell subsets sorted from different anatomic sites during therapy.
Phylogenetic analyses. Using previously described methods (17,23), HIV-1 sequences derived from plasma and CD4+ T cell subsets were phylogenetically analyzed. The SPS and SGS methods preclude the resampling of HIV-1 DNA and RNA sequences, respectively. The sequence data were generated over 3 to 4 years in two different physical locations with protocols designed to prevent cross-contamination between clinical samples and laboratory-related HIV strains. p6-RT sequences were aligned using MAFFT (78). Hypermutants (HIV-1 sequences containing G-to-A hypermutations) were identified using the Los Alamos Hypermut tool (79). Sequences containing premature stop codons, insertions/deletions causing a frameshift, and/or internal deletions were identified by manual screening and the Los Alamos quality control tool. The HIV-1 RNA sequences derived from pre-ART and early on-ART plasma samples and HIV-1 DNA sequences derived from a broad spectrum of CD4+ T cell subsets sorted from PB, LN, and gut tissues were used to construct maximum-likelihood phylogenetic trees for each participant using MEGA 6 (1,000 bootstrap replicates; general time-reversible substitution model with gamma distribution and proportion of invariant sites; gamma category 4) (80). The phylogenetic trees containing all viral sequences were used to visually locate EIS (80). The genetically identical HIV-1 DNA sequences included at least 2 viral DNA sequences with 100% pairwise identity and were found in monotypic groups of a phylogenetic tree without any internal branches. Also, these identical HIV-1 DNA sequences were derived from a specific CD4+ memory T cell subset sorted from an anatomic site. Maximum-likelihood phylogenetic trees at 3 different time points on ART (3.5, 10.8, and 17.3 years, respectively) were constructed to visualize the number of EIS from PB-derived intracellular HIV-1 DNA sequences at these specific time points.
The unique HIV-1 DNA sequences are located on single branches within a phylogenetic tree and are genetically distinct and not genetically identical to any other viral RNA and/or DNA sequence (Fig. 14). The denominator for identical and unique HIV-1 DNA sequences, whether they are defective or not, is a single infected T cell. This is due to the fact that more than 90% of the HIV-1-infected T cells contain a single HIV-1 DNA molecule when analyzed by single-cell sequencing (33,73). Previously published HIV-1 DNA and RNA sequences are available in GenBank (accession no. KP065816 to KP067089, KP113063 to KP113482, KP152533 to KP152580, and KP152658 to KP153066) (17).
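The definitions of identical and unique sequences given above can be illustrated with a minimal Python sketch. It assumes the sequences are already aligned and compares them by exact string match, which is a simplification of the tree-based inspection described in the text; the function name, the record layout, and the example records are hypothetical and are not part of the study's pipeline.

```python
from collections import Counter, defaultdict


def classify_sequences(records):
    """Classify aligned sequences as identical (within a source) or unique.

    records: list of (name, site, subset, aligned_sequence) tuples.
    Returns (identical_groups, unique_names):
      identical_groups: groups of >= 2 sequences with 100% identity that
        come from the same cell subset and anatomic site.
      unique_names: sequences that match no other sequence from any source.
    Sequences that match only a sequence from a different source fall into
    neither category in this simplified sketch.
    """
    # Count every sequence across all sources (plasma RNA and cellular DNA).
    overall = Counter(seq.upper() for _, _, _, seq in records)

    # Group names by (site, subset, sequence) to find within-source identity.
    by_source = defaultdict(list)
    for name, site, subset, seq in records:
        by_source[(site, subset, seq.upper())].append(name)

    identical_groups = sorted(
        (site, subset, tuple(sorted(names)))
        for (site, subset, _), names in by_source.items()
        if len(names) >= 2
    )
    unique_names = sorted(
        name for name, _, _, seq in records if overall[seq.upper()] == 1
    )
    return identical_groups, unique_names


# Hypothetical example records (not study data).
example = [
    ("a", "PB", "TEM", "ACGT"),
    ("b", "PB", "TEM", "ACGT"),
    ("c", "PB", "TCM", "ACGA"),
    ("d", "LN", "TEM", "ACGT"),
]
groups, uniques = classify_sequences(example)
```

Exact string matching treats gaps and ambiguous bases as ordinary characters, so a real analysis would need to decide how to handle them before grouping.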
Statistics. We used previously described maximum-likelihood methods (23) to estimate the proportion of infected cells in each cell population in each tissue type from each participant. The maximum-likelihood method was used to calculate the proportion of infected cells that would be most consistent with the observed numbers of PCR-positive wells at different dilutions with various total numbers of cells (81). We then created simpler "stand-in" data to convey equivalent information for further analysis. For each cell population from a given tissue type in a given participant, we defined stand-in data as r infected cells out of n examined, with r and n chosen so that a Poisson regression model with only an intercept term would reproduce the same estimated infection rate and confidence interval around it as obtained in the maximum-likelihood analyses. When no wells were positive, we set r equal to 0 and n equal to the total number of cells tested. We performed statistical analyses of these data using negative binomial regression, because this generalizes Poisson regression to allow additional variation. We used the SAS NLMIXED procedure (SAS Institute, Cary, NC; version 9.4) to fit negative binomial regression models for each cell population from each tissue with a linear effect of years on ART, which estimates the fold effect per year on therapy. The fold effect per year on therapy is a multiplicative effect that is equivalent to the fold change in the proportion of HIV-1-infected T cells from earlier to later time points. We summarized the results across cell populations within each tissue type with geometric means. The intercept and slope coefficients from the models were used to fit curves to the number of HIV-1-infected T cells per million over years on ART, as shown in equation 1:
$$y = 1{,}000{,}000 \times e^{(k + xs)} \tag{1}$$
where y is the number of HIV-1-infected T cells per million, k is the intercept, s is the slope per year on ART, and x is the number of years on ART.
This statistical calculation was applied when at least four participants contributed to the fold effect per year on ART within each T cell population and tissue.
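As an illustration of the estimation steps described above, the following Python sketch estimates the proportion of infected cells from PCR-positive wells across a dilution series and then evaluates equation 1. It assumes a standard single-hit Poisson model, in which a well holding c cells is positive with probability 1 − exp(−pc). The exact likelihood used in reference 23 is not restated in this section, so the model, the function names, and the well counts below are illustrative assumptions rather than the study's SAS implementation.

```python
import math


def mle_infected_proportion(wells):
    """Maximum-likelihood proportion of infected cells (assumed model).

    wells: list of (cells_per_well, n_wells, n_positive), one per dilution.
    Assumes a well with c cells is PCR positive with probability
    1 - exp(-p * c). The search is restricted to 0 < p <= 1.
    """
    total_pos = sum(pos for _, _, pos in wells)
    total_neg = sum(n - pos for _, n, pos in wells)
    if total_pos == 0:
        return 0.0  # no positive wells: the estimate is zero
    if total_neg == 0:
        raise ValueError("all wells positive; the estimate is unbounded")

    def score(p):
        # Derivative of the log-likelihood with respect to p.
        total = 0.0
        for cells, n, pos in wells:
            x = p * cells
            # exp(-x) / (1 - exp(-x)) == 1 / expm1(x); negligible for large x.
            gain = pos * cells / math.expm1(x) if x < 700 else 0.0
            total += gain - (n - pos) * cells
        return total

    # The score decreases in p, so bisect (on a log scale) for its root.
    lo, hi = 1e-15, 1.0
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def infected_per_million(k, s, years):
    """Equation 1: y = 1,000,000 * exp(k + x * s)."""
    return 1_000_000 * math.exp(k + s * years)


# Hypothetical dilution series (not study data).
wells = [(9000, 8, 8), (3000, 8, 6), (1000, 8, 3), (333, 8, 1)]
p_hat = mle_infected_proportion(wells)
print(f"{p_hat * 1e6:.0f} infected cells per million")
```

Under equation 1, exp(s) is the fold effect per year on ART, so a negative slope s corresponds to a fold effect below 1.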
To obtain a P value for the differences in rates of decline, we fitted a mixed-effects negative binomial regression model to the PB and LN data together, with a random intercept to account for within-person correlation of HIV-1 infection rates in PB and LN samples of the same population from the same person. These models unexpectedly showed nonsignificant P values and very different estimates of the fold effect per year of ART in LNs in the AHI group, suggesting that the 4 participants with LN samples available might not be representative of the entire set of 12. The mixed-effects models used the within-person PB-LN correlations to estimate fold effects for LNs that are less susceptible to bias caused by having the data from only four participants of the AHI group (the extrapolated estimates in Fig. 4). We calculated the Pearson correlations of the logarithmically transformed proportions of T cells that were HIV-1 infected, specifically using the following formula: log[max(r,0.5)/n]. When estimated proportions were zero, we changed r to 0.5 in order to permit logarithmic calculation. We excluded gut measurements based on fewer than 1,000 cells when performing the Pearson correlations, because they were likely to be imprecise.
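The log transformation and correlation described above can be sketched in a few lines of Python. The formula log[max(r,0.5)/n] and the 1,000-cell threshold for gut measurements come from the text; the function names, the pairing of one other site with gut per participant, and the example counts are hypothetical.

```python
import math


def log_proportion(r, n):
    """log[max(r, 0.5) / n]; r = 0 is replaced by 0.5 so the log is defined."""
    return math.log(max(r, 0.5) / n)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlate_with_gut(pairs, min_gut_cells=1000):
    """Correlate log proportions between another site and gut.

    pairs: list of ((r_site, n_site), (r_gut, n_gut)), one per participant.
    Gut measurements based on fewer than min_gut_cells cells are excluded.
    """
    kept = [(site, gut) for site, gut in pairs if gut[1] >= min_gut_cells]
    xs = [log_proportion(r, n) for (r, n), _ in kept]
    ys = [log_proportion(r, n) for _, (r, n) in kept]
    return pearson(xs, ys)


# Hypothetical stand-in counts (not study data); the last pair is excluded
# because its gut measurement is based on fewer than 1,000 cells.
pairs = [
    ((3, 20000), (1, 5000)),
    ((40, 50000), (6, 4000)),
    ((0, 30000), (0, 2500)),
    ((12, 10000), (2, 600)),
]
rho = correlate_with_gut(pairs)
```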
We defined dichotomous dependent variables for each genetic sequence of HIV-1 RNA derived from pre- and on-ART plasma samples and HIV-1 DNA sequences derived from CD4+ T cell subsets sorted from peripheral blood, lymph node, and gut tissues obtained after 3 to 17.8 years of ART. These variables indicated whether a viral sequence was genetically defective, unique, or coming from EIS. An additional variable indicated whether an HIV-1 DNA sequence had an identity of 100% compared to a viral sequence derived from pre- or on-ART plasma samples (82). We applied mixed-effects logistic models with random-person effects to compare the odds of a viral sequence being genetically defective between the plasma samples, the anatomic sites, and the CD4+ T cell subsets. We applied similar models to compare the odds of an HIV-1 DNA sequence being genetically identical to an HIV-1 RNA sequence derived from pre- or on-ART plasma samples, using CD4+ T cell subsets (TN, TCM, TTM, and TEM) of peripheral blood and lymph nodes as the categorical predictor variable. For analyses involving the duration of ART, we applied mixed-effects logistic regression to estimate the odds ratio (OR) per year on ART. We plotted the proportions of defective, unique, or identical HIV-1 DNA sequences across ART duration and fitted nonlinear curves derived from mixed-effects logistic regression models using equation 2:
$$y = 100 \times \frac{e^{(k + xs)}}{1 + e^{(k + xs)}} \tag{2}$$
where y is the fitted percentage of defective, unique, or identical HIV-1 DNA sequences; k is the intercept; s is the log odds ratio per year on ART; and x is the number of years on ART. The fitted curves were derived from the models, with a viral sequence as the unit of analysis. Moreover, the curves account for the plotted confidence intervals, which include the number of sequences available for each time point.
Accession number(s). The HIV-1 DNA and RNA sequences have been submitted to GenBank under accession numbers MH830518 to MH834389.