Previous Article | Next Article ![]()
Journal of Virology, October 2007, p. 10625-10635, Vol. 81, No. 19
0022-538X/07/$08.00+0 doi:10.1128/JVI.00985-07
Copyright © 2007, American Society for Microbiology. All Rights Reserved.

Gayathri Athreya,3
Marcus Daniels,3
Ha Youn Lee,3,
William Bruno,3 and
Thomas Leitner3
Department of Virology, Swedish Institute for Infectious Disease Control, SE-171 82 Solna, Sweden,1 Department of Microbiology, Tumor and Cell Biology, Karolinska Institute, SE-171 77 Stockholm, Sweden,2 Theoretical Biology and Biophysics, T-10, MS K710, Los Alamos National Laboratory, Los Alamos, New Mexico 875453
Received 7 May 2007/ Accepted 9 July 2007
|
|
|---|
|
|
|---|
It has been shown that the selective pressure of the human immune system is one of the driving forces of HIV-1 intrahost evolution (42). In addition to this, the intrahost evolutionary rate has been suggested to be influenced by the initial fitness of the HIV-1 population, depending on the route of infection (40). Strikingly, HIV-1 intravenous drug user (IDU) sequences, such as those collected from the region of the former Soviet Union (FSU), show very high genetic similarity to each other. This high similarity was not observed in HIV-1 sequences spreading through the sexual route. Previous studies have suggested that the intrahost evolutionary rate of HIV-1 either does (18) or does not (28, 31) differ between patients belonging to the IDU and sexual risk groups.
Currently, Eastern Europe and central Asia are the regions of the world with the fastest-growing HIV epidemics. The fast spread of the virus in the FSU region is closely linked with a rise in injecting drug use after the collapse of the Soviet Union in the 1990s, connected to the increase and diversification of drug trafficking routes through Central Asia and eastern Europe from the world's largest opium producer, Afghanistan (58). Until 1995, countries of the FSU experienced only a few hundred cases of HIV-1 infection, most of which were associated with sexual transmissions. The first outbreaks of HIV-1 among IDUs in the FSU were reported in 1995 in southern Ukraine (48), followed by rapid spread of the virus in this risk group in the Russian Federation (3, 5) and its neighboring countries Belarus and the Republic of Moldova in 1996 (41, 50). In 1997, the virus had spread to the Baltic region, inducing a fast-growing epidemic among IDUs in Latvia and later Estonia (2, 15, 64). Other regions of the FSU also started to experience the rapid growth of IDU HIV-1 infections, such as Kazakhstan, with more than 90% of HIV-1 cases belonging to this risk group (6), and Uzbekistan, with an exponential rise in HIV-1 prevalence among IDUs between 2001 and 2002 (29).
In South and Southeast Asia, a sudden rise in HIV-1 prevalence to an epidemic proportion was first observed in Thai IDUs in 1988 (61), mainly involving subtypes B and B'. Thailand, like Laos, Myanmar, and China, is a part of the Golden Triangle, a main heroin trafficking route in Southeast Asia. Consequently, only a year later, high HIV prevalence rates of the same subtype were detected among IDUs in Myanmar and the Yunnan province of China (19, 30, 65). Then, an increase of HIV-1 CRF01_AE infections was detected among female sex workers in Thailand, and soon thereafter, CRF01_AE spread to IDUs, suggesting an early and complicated interconnection between epidemics among drug users and commercial sex workers (49). By 1995, CRF01_AE was widespread among Thai IDUs (20, 56), and IDU populations in other countries, such as Vietnam and Malaysia, had experienced rapid increases in CRF01_AE infections (21, 43, 46). In 1996 and 1997, an outbreak of CRF01_AE, genetically highly similar to the Vietnamese strain, was detected among IDUs in China (22). Today, the HIV-1 epidemic in South and Southeastern Asia is complicated not only by the interplay of injecting drug use and commercial sex work but also by the circulation of numerous HIV-1 subtypes in these risk groups, resulting in the emergence of novel intersubtype and unique recombinant strains (57, 62).
Slower heterosexual spread of HIV-1 can be found in Western Europe and North America, consisting mainly of subtype B. Other mainly heterosexual epidemics are those of the predominant HIV-1 subtypes in Africa, A and C. Subtype A is mostly found in central Africa, while subtype C is predominant in the south and southeast. Although it is hard to estimate how and when these epidemics started, there have been some indications that the spread of subtype C in Africa has been greater than the spread of other subtypes (1, 17, 24, 59).
The high homogeneity of HIV-1 sequences derived from IDU outbreaks throughout the world suggests, thus, that the evolutionary rate of the virus spreading on the population level is lower in the IDU risk group. An introduction of HIV-1 into a social standing network, such as an IDU population, results in a very fast spread of the virus. Moreover, the virus is transmitted shortly after initial infection, possibly before the host immune system exerts a selective pressure on the viral population. This may result in the transmission of viral variants that have not experienced the pressure to evolve and, consequently, are genetically very similar to each other. On the other hand, slow HIV-1 spread in a heterosexually driven epidemic, where the virus spreads only as new social contacts are formed, may involve some amount of intrahost evolution before being transmitted to a new person. Here, we hypothesize that it is not the route of infection, but rather the amount of time between the initial infection and the further transmission of the virus, that determines the evolutionary rate and similarity of sequences in an epidemic.
|
|
|---|
|
View this table: [in a new window] |
TABLE 1. Epidemiological data
|
Constructing maximum likelihood trees.
All the trees were constructed on a parallel computing cluster using a maximum likelihood code with a general-time-reversible model with variable rates among sites (T. Bhattacharya and M. Daniels, unpublished) (26). For each epidemic, phylogenetic trees were constructed using sequences sampled at two separate time points, corresponding to a certain time interval (
t). Trees were made for all possible combinations of sampling years resulting in
t values from 1 to 17 years. For example, phylogenetic trees representing a
t of 5 years would be made with sequences sampled in 1992 and 1997, in 1993 and 1998, in 1994 and 1999, and so on. In cases where there were few sequences sampled in one year, these were pooled together with sequences sampled in the following year and analyzed as if they were sampled together, using a weighted sampling time point for that pool. The weighting expression used was (YiNi + YjNj)/(Ni + Nj), where Yi is sampling year i, Yj is sampling year j, Ni is the number of sequences sampled in year i, and Nj is the number of sequences sampled in year j.
Root and rate optimization.
We developed a tool for analysis of genetic distances (
ds) between two groups of sequences in a phylogenetic tree. The input file is a phylogenetic tree in Newick format. The taxa in the tree can be divided into two groups, between which
d is calculated. The
d between the two groups is calculated as the average genetic distance of group 2 to the root (d2) minus the average genetic distance of group 1 to the root (d1) (Fig. 1). This approach removes all errors in branch lengths prior to time point 1, since only the time and distance between the two defined groups are used. Since the
d between the groups can differ depending on how the tree is rooted,
d values for all possible rooting points of the given tree are calculated. The tree rooted in a way that gives a minimum P value, and thus the best separation of group 1 and 2 sequences, is scored as the best tree. This is calculated with Welch's t test, which is designed to provide a valid t test in the presence of unequal population variances. The output file gives
d, P values, and variances for not only the best-scored root of the tree but all possible rooting points in the given tree. An additional group in the tool is the discard group, where sequences not desired in the rate calculation can be put. In this way, taxa in one phylogenetic tree can be rearranged between groups and reanalyzed in several different ways. The root-and-rate-optimization tool is fast and easy to use. It will be available on the Los Alamos HIV-1 sequence database homepage (www.hiv.lanl.gov).
![]() View larger version (13K): [in a new window] |
FIG. 1. Root and rate optimization. The distance ( d) between two groups of sequences is optimized by a t test finding the best rooting point in a phylogenetic tree. d is calculated by d2 – d1, where d2 is the average distance of group 2, and d1 is the average distance of group 1, from the rooting point of the tree. The sequences are separated by a t.
|
) in each of the six different epidemics studied here was estimated using Poisson error correction such that
![]() |
d/
t, calculated for an individual maximum likelihood tree, and var(R) is the expected Poisson error in that evolutionary rate.
d is the delta distance from the root and rate optimization, and
t is as described above. The error is given by the following equation:
![]() |
was calculated from 100 replicates derived by resampling R with replacement at the same sample size as that for the original data in each epidemic.
Tree simulations.
To test our root-and-rate-optimization method, we simulated trees with different amounts of information to cover the situations that we encountered in the real data. Random trees with 40 taxa, 20 at sampling time 1 (t1) and 20 at sampling time 2 (t2), were generated using MacClade (version 4 [analysis of phylogeny and character evolution]; W. P. Maddison and D. R. Maddison, Sinauer Associated, Sunderland, MA). Branch lengths were scaled to reflect an expected
d (
dexp) between t1 and t2 between 0.001 and 0.1 substitutions site–1, and the fraction of the tree with this distance was 0.2, 0.5, or 0.8. A Poisson error process was used to get realistic branch lengths, assuming a sequence length of 1,000 characters. For each expected rate and tree fraction, 100 simulations were performed (3,300 trees in total).
BEAST analyses. In order to further validate the results generated by root and rate optimization, we also estimated the evolutionary rates of HIV-1 subtype A1 from the FSU and Africa by using Bayesian Evolutionary Analysis Sampling Trees (BEAST), a program for Bayesian Markov Chain Monte Carlo analysis of molecular data (BEAST v1.4; A. Drummond and A. Rambaut [http://evolve.zoo.ox.ac.uk/beast/]). The substitution rates were generated using a general-time-reversible substitution model with gamma distribution and invariable rates among sites, with Markov Chain Monte Carlo runs of 50,000,000 steps sampled every 1,000 steps and analyzed with Tracer (A. Rambaut and A. J. Drummond [http://evolve.zoo.ox.ac.uk/software.html?id=tracer]) with a discarded burn-in of 10%. Since the number of sequences in each epidemic was too large to analyze simultaneously, 50 sequences from each epidemic were randomly sampled 10 times and the average evolutionary rate was estimated for each epidemic. In addition, both datasets were tested using three different clock models available in BEAST 1.4: the strict clock, the relaxed uncorrected exponential clock, and the relaxed uncorrected lognormal molecular clock (13). Furthermore, two different a priori assumptions for population growth were used, coalescent constant size and coalescent exponential growth.
Modeling different HIV-1 epidemics.
We developed a structured epidemiological model to investigate differences in the evolutionary rates in fast epidemics and in slow epidemics. The model is an abstraction of the complex processes involved in transmission of HIV, using a standard susceptible-infected-population model to track the infection in different population groups (7). At each time point, we are interested in calculating the average divergence since the start of the epidemic across all infected individuals. Thus, we simulated a structured population where infected individuals are stratified by the divergence (since the beginning of the epidemic) of the infecting virus and the time since the individual's infection. For instance, at any given time, we know the number of individuals (Ixy) who were infected by a virus that had diverged for y (say, 3) years before infection and who themselves have been infected for x (say, 2) years. For this group, the virus has diverged for 5 (x + y) years since the beginning of the epidemic. We then follow the transitions occurring among the infected groups (different x and y values) as time goes by and new infections arise. For each year of the epidemic, the equations that describe the epidemic are as follows:
![]() |
![]() |
![]() |
Here,
is the influx of new susceptibles (S), which was chosen as µ x S0, where µ is the natural mortality (0.001 year–1) and S0 is the susceptible population at the start of the epidemic.
x is the infected-individual death rate, which depends on the time since infection (x) and is set as in reference 11. And
x is the infection rate, which during the initial 6 months of the infection is 10 times higher for drug users than for heterosexual transmission (10–4 year–1 per person early on for drug users and 10–5 year–1 per person for heterosexual transmission throughout and also for drug users after the first 6 months). In addition, we tested a situation with a higher infection rate during the first 6 months in the heterosexual population (10–4 year–1 per person early on for drug users, 10–5 year–1 per person early on for heterosexual transmission, and 10–6 year–1 per person for both of the epidemics after the initial 6 months). We start the epidemics with 100 infected people, we let y range between 0 and 15, and the total number of infected individuals is given by
.
The main assumptions of the model are as follows: (i) initially, over the first 6 months, when the immune pressure is low, the viral evolution is negligible; (ii) after that, the virus evolves and linearly diverges within an infected individual from the initial virus at 1% per year; (iii) the initial infection rate for drug users, over the first 6 months, is 10 times higher than that for heterosexual transmission; and (iv) after that time, the infection rates are the same for both populations. Also, we note that the within-individual divergence rate chosen (1% per year) is the same for both drug users and sexual transmission and is somewhat arbitrary, although it was chosen for consistency with the empirical data. The model was implemented with programming language C, and solution of the equations was carried out by a fourth-order Runge-Kutta method.
We calculated the average divergence of the virus in the population by summing the viral divergence (x + y) for all groups (Ixy), weighted by the fraction of all infected individuals (I) that Ixy represents. Thus, the divergence d at any time point is given by the following equation:
![]() |
|
|
|---|
ts, and trees (n = 332) are indicated in Table 1.
Accurate
d estimates.
Tree simulations showed that the root-and-rate-optimization method accurately estimated the
d between two groups of sequences in a tree. When the
dexp in the simulated trees was 0.01 or above, the method estimated a correct
d (Fig. 2). When
dexp was below 0.01, the method tended to somewhat overestimate
d, with the overestimation being greater at a lower
dexp. However, there was less overestimation for greater fractions of the tree containing
d. In trees simulated with a perfect molecular clock (infinite long sequences), our method always found the correct root and rate.
![]() View larger version (9K): [in a new window] |
FIG. 2. Accuracy of the root-and-rate-optimization method in estimating d. The graph shows observed d against dexp based on tree simulations. White circles represent data for when the fraction of the tree containing d is 0.8, black triangles represent a fraction of 0.5, and stars represent a fraction of 0.2. subst., substitution.
|
d and evolutionary rates generated from each maximum likelihood calculation against
t, we observed stochastic error-like noise for the evolutionary rate in each epidemic. These fluctuations for subtype A1 from the FSU and Africa can be seen in Fig. 3. The evolutionary rate fluctuated around the true value, with the variation being greater the closer in time the samples were to each other. Eventually, the rate stabilized to its true value. Interestingly, fluctuation appeared to correlate with evolutionary rate in that higher evolutionary rates had larger variations, supporting a Poisson-like process. Thus, a Poisson-corrected average evolutionary rate was calculated for each epidemic, with correction for the greater error present in rates for trees with low
t values (Table 2). Generally, the fluctuations of evolutionary rate stabilized after an approximate
t of 3 to 4 years, suggesting that a period of at least 3 to 4 years between collection of samples is needed for accurate estimation of an evolutionary rate on the population level. An exception was subtype B, with an unusually long stabilization time of approximately 10 years (data not shown).
![]() View larger version (12K): [in a new window] |
FIG. 3. Fast (FSU) and slow (Africa) HIV-1 subtype A1 epidemics. Delta distances ( d) (a) and evolutionary rates (R) (b) are plotted against time between sampling dates ( t). Black dots represent FSU data, and white dots represent African data. d values were calculated using a t test root optimization on maximum-likelihood reconstructed phylogenetic trees (Fig. 1). Each data point represents the d and R from an individual maximum likelihood tree optimization. The lines indicate the estimated Poisson-corrected evolutionary rate ( ) in each epidemic (solid line, FSU; dashed line, Africa). subst., substitution.
|
|
View this table: [in a new window] |
TABLE 2. Poisson corrected average evolutionary rates
|
ts (1 to 6 years), a slow increase in
d was observed (Fig. 3a), resulting in an average evolutionary rate of 2.02 x 10–3 substitutions site–1 year–1 (Table 2). There was no sign of this rate changing over the studied time period (Fig. 3b).
High rate in slow epidemic.
The evolutionary rate of the V3 region of subtype A1 virus, spreading in a slow heterosexual manner in Africa, was calculated for the period between 1987 and 2004 based on maximum likelihood trees and root optimizations (Table 1). As in the fast-epidemic case,
d increased with increased
t (Fig. 3a). The average rate was estimated to be 16.9 x 10–3 substitutions site–1 year–1, that is, 8.4 times higher than the rate of the fast-spreading virus of the same subtype in the FSU (Table 2). The rate of this virus was the highest one observed among the epidemics studied here. The evolutionary rate over
t for this virus is shown in Fig. 3b.
Intermediate rates in mixed epidemics. The epidemics of subtypes B' and CRF01_AE in South and Southeastern Asia were classified as mixed epidemics, since these subtypes spread among IDUs, commercial sex workers, and their customers, using both standing social networks and forming of new contacts. The average V3 evolutionary rate of B' spreading in this geographical region between 1991 and 1996 was found to be 10.7 x 10–3 substitutions site–1 year–1. The evolutionary rate of CRF01_AE for the period between 1994 and 2002 was found to be 8.3 x 10–3 substitutions site–1 year–1 (Table 2). Interestingly, the evolutionary rates for these mixed epidemics were between the evolutionary rates for the more clearly defined epidemics for IDUs in the FSU and heterosexual transmission in Africa.
Other epidemic scenarios.
Although HIV-1 subtype C in Africa is mainly spreading heterosexually, it has been shown that the rate of spread for this virus was somewhat higher than that for other subtypes in Africa (17, 24, 59). The average V3 evolutionary rate of the virus collected from samples from this region between 1987 and 2004, as estimated by 130 maximum likelihood trees, was 9.7 x 10–3 substitutions site–1 year–1 (Table 2), i.e., an evolutionary rate somewhat lower than what was observed in the African A1 epidemic, agreeing with a somewhat faster spread. The average evolutionary rate for the hetero- and homosexual epidemic of subtype B from Europe and North America in the period between 1981 and 2002 was estimated to be lower than those for the other subtype epidemics (4.3 x 10–3 substitutions site–1 year–1). We also attempted to estimate the early (1978 to 1984) subtype B epidemic separately, but a shortage of samples unfortunately limited reliable estimates of that rate (1 to 2% per year). Subtype B from the Western world differs from all the other subtypes by the fact that it is subjected, in addition to immune pressure, to selection pressure from powerful antiretroviral treatment. However, if the evolutionary rate was calculated after an unusually long stabilization time (
t = 10 years), it was estimated to be 9.4 x 10–3 substitutions site–1 year–1, i.e., similar to the other intermediate evolutionary rates. The other studied subtype epidemics did not have a discrepancy between average evolutionary rate and rate after stabilization (data not shown).
Calculation of HIV evolutionary rates using BEAST. BEAST was used to calculate the evolutionary rates of subtype A1 in fast and slow epidemics in the FSU and Africa, respectively. The rates were calculated for two different population growth scenarios, constant and exponential, combined with three different molecular-clock models: strict, relaxed uncorrected exponential, and relaxed uncorrected lognormal. While the 95% highest-posterior-density (HPD) values for several comparisons overlap, and the evolutionary rates of A1 in Africa differed greatly between different models used, significant differences were observed in all models between the evolutionary rates of A1 in the FSU and A1 in Africa (Table 3). Thus, the rate of the fast-spreading IDU virus was found to be consistently lower than the rate of the virus spreading in a slow heterosexual way. For A1 in Africa, however, the mean posterior values were lowest for the relaxed-exponential-clock model, suggesting the best fit. Here, the evolutionary rates were the highest ones, on average estimated to be 16.6 x 10–3 substitutions site–1 year–1 for the constant population growth and 10.6 x 10–3 for the exponential population growth (Table 3), thus suggesting a 3.3- to 7.8-times-higher evolutionary rate among virus that spread slower.
|
View this table: [in a new window] |
TABLE 3. BEAST evolutionary rates
|
![]() View larger version (16K): [in a new window] |
FIG. 4. Viral divergence during an epidemic. The average divergence of the virus over time is shown both (a) when all the population is sampled and (b) when only recently infected people are sampled. In both cases, we see that the average divergence is less for an epidemic among a cohort of drug users (thick line) than for one among heterosexual cohorts. For the latter, we simulated both a small, susceptible population with the same size as that of the drug user epidemic (normal line) and a 10-fold-larger initial susceptible population (dashed line).
|
|
|
|---|
![]() View larger version (13K): [in a new window] |
FIG. 5. Schematic drawing of the impact of immune selection on the evolutionary rate of HIV-1. Timeline A represents the first stage of infection, where the immune system is not yet activated. Timeline B represents the second stage of infection, where immune selection has started. The dotted line (no selection) is representative of the evolutionary rate of virus spreading rapidly among IDUs, transmitted from person to person during stage A, therefore experiencing no diversifying selection pressure and hence very little divergence. The normal line is representative of the evolutionary rate of virus spreading from person to person during infection stage B, i.e., in a slow heterosexual spread.
|
It has been suggested that bypassing the mucosal barriers by injecting HIV-1 may remove selective mechanisms (18, 60). Such selective mechanisms may constitute parallel and convergent evolution that would lower the evolutionary rate on the population level or diversify the evolution, increasing the evolutionary rate (9, 39). Since we do see a lower rate for IDUs than for heterosexual transmissions, however, convergent evolution in sexual transmission could not have had a great impact. Also, it was recently shown that coreceptor adaptation was independent of the route of infection (8); thus, such adaptation would affect the rates in both risk groups. Similarly, a mucosal barrier selection or other bottleneck effects occurring in sexual transmission but not in intravenous transmission, resulting in host-specific virus adaptations, would most likely affect only certain amino acid positions. Thus, it is unlikely that any of these effects would overshadow the general diversifying and neutral evolution seen on the population level.
In this study, we used a root-and-rate-optimization method to calculate the evolutionary rates of HIV-1 spreading in different epidemics throughout the world. The molecular clock was originally suggested by Zuckerkandl and Pauling (67), and many later methods have been based on Kimura's neutral evolution theory (23). Our method was inspired by earlier work by Li et al. (38), which also has inspired related methods, such as TipDate (52). The advantages of our method include that it is fast and thus can be used on large data sets, it can use a tree calculated with or without the explicit use of a molecular clock (using any model or method that the user finds adequate), and it gives the possibility of finding and comparing genetic distances and node heights for all possible rooting points in a given phylogenetic tree. Furthermore, the branch lengths prior to the first group of sequences are subtracted in the calculation so that errors in this part of the tree are essentially ignored. For each epidemic investigated here, we divided the samples into as many
ts as was possible, while retaining enough data in each sampling year (usually above 20 sequences per year). This means that some data points (
ts) partly reuse some data, and thus, there is covariance among some data points. The fact that we use large-scale meta-data, however, mostly overcomes any bias that a certain sampling year might have added to the overall results. Similarly, tree distances (node heights) usually contain shared branches, and thus, also in this case there will be a lack of independence among distances based on phylogenetic trees. Note, however, that such branches prior to the first time point are removed in the analysis, so the distances may actually be independent segments in the part that is used for the rate estimate. Strictly speaking, and generally for tree methods, however, since the whole tree is optimized, not even the terminal branches of a tree are fully independent estimates. Importantly, our tree simulations showed that our method was accurately estimating
ds between two groups of sequences in a tree (Fig. 2). However, when the
d between the two groups was very small (<<0.01), the method tended to somewhat overestimate the rate. When a molecular clock, whether strict or sloppy, had operated on the studied sequence set, we found that our method was able to find a root that reasonably estimated the rate. Recombination and different types of selection make the accuracy and the value of a molecular clock debatable. Therefore, we compared the results from our method with calculations of an evolutionary rate using BEAST, a software program with which we could test relaxed-molecular-clock models (13). In agreement with the estimates from our method, the relaxed-clock models suggested similar differences in the evolutionary rates of viruses spreading in a slow or rapid manner in IDU or heterosexual transmissions, respectively.
Our results showed that HIV-1 subtype A1 spreading heterosexually in Africa had an evolutionary rate 8.4 times higher than the same subtype spreading rapidly among IDUs in the FSU (Table 2). The IDU HIV-1 epidemic in the FSU started with the introduction of two HIV-1 strains into networks of IDUs, subtype A in IDUs in Odessa and subtype B in Nikolaev (45). From here, the virus rapidly spread to other regions of the FSU, causing IDU HIV-1 outbreaks (2, 3, 6, 15, 29, 41, 50). Molecular analyses of these outbreaks show, however, that this is mainly a single interconnected IDU epidemic caused by the rapid spread of subtype A virus (4). In contrast, the heterosexual A1 epidemic in central and eastern Africa is characterized by a stabilizing or even decreasing prevalence of HIV infections (1). Although injecting drug use remains the driving force for the rapid spread of HIV-1 in the FSU, the patterns of the epidemic are now starting to change, with an increase in heterosexual transmissions mainly due to the growth of the sex industry. At this point, we predict that the evolutionary rate will increase due to a change in spread patterns to an epidemic that is more similar to the mixed-risk-group epidemics.
In South and Southeast Asia, rapid growth of HIV-1 infections was first observed in Thai IDUs in 1988 (61), mainly involving subtypes B and B'. Shortly after, an increase of CRF01_AE was observed among female sex workers in this region, spreading it further to the IDU population (49). Since many sex workers also belong to the IDU risk group, a complicated interconnection between commercial sex work and injecting drug use in South and Southeast Asia arose, where the epidemics of subtype B' and CRF01_AE spread simultaneously in both IDU and sexual risk groups, making them mixed epidemics. Consistent with this, the evolutionary rates of B' and CRF01_AE were estimated to be 10.7 x 10–3 and 8.3 x 10–3 substitutions site–1 year–1, respectively, putting them between the evolutionary rates of subtype A1 in fast spread in the FSU and slow spread in Africa. These intermediate evolutionary rates mirror the mixed nature of these epidemics, where the virus is spreading fast among IDUs and slow by sexual transmissions.
The evolutionary rate of subtype C spreading heterosexually in south and southeast Africa was estimated to be 1.7 times lower than the rate of subtype A1 in central and east Africa (Table 2). This intermediate evolutionary rate of subtype C is not, however, a result of the virus spreading by different risk groups, as seen in the mixed epidemics of subtypes B' and CRF01_AE. Instead, it has been shown that the rate of spread for subtype C has been higher than that for other subtypes in Africa (1, 14, 17, 24, 59); thus, a higher proportion of these infections must have occurred during the first stage of infection. In addition, it has been shown that the V3 loop of this virus is relatively well conserved compared to those of other subtypes and that it is the region downstream of the loop that is highly variable (14, 27, 51). Hence, both the conservation of the V3 loop and the more rapid spread of this subtype across the African continent may have influenced the evolutionary rate of this virus.
The evolutionary rate of subtype B from Europe and North America was much lower than expected in a slow epidemic. However, subtype B differs greatly from all the other subtypes by the fact that it is subjected to the selective pressure of antiretroviral treatment, common in these parts of the world. If antiretroviral therapy is successful, the viral replication within a host will be diminished, and there will be no measurable accumulation of substitutions in env. It has previously been shown that effective antiretroviral treatment can slow down and even totally abolish the evolution of HIV-1 in the envelope region (12, 47, 53). It is possible that this effect is reflected in the low evolutionary rate found in our subtype B analyses. On the other hand, it has also been suggested that suboptimal therapy can lead to increased evolutionary rates (35). Furthermore, the initial spread of subtype B showed a rapid increase within specific risk groups, such as the homosexual risk group, until it slowed down during the 1990s (44, 55). Thus, the overall evolutionary rate for subtype B may have been affected also by the combination of initial rapid and late slow spread.
In conclusion, there are large differences in the evolutionary rates in different epidemics worldwide, and there may be different rates during different phases of an individual epidemic, all depending on the specific epidemic growth rate. On the pandemic level, this is yet another component in the evolutionary dynamics of HIV and it is interesting to note that such rate differences may affect estimates of the times of origin of HIV-1 in different parts of the world. Thus, it is possible that more-accurate dates could be inferred if the epidemic rate were taken into account.
|
|
|---|
The GenBank accession numbers for Southeast Asia subtype B' are as follows: L07449 to L07456, L07460, L19238, AF209202 to AF209204, U15576 to U15587, U65538 to U65548, U22543 to U22547, U22549, U22551, U22552, U22554 to U22556, U22558 to U22560, U22562 to U22566, U22568 to U22574, U22576 to U22603, U22605 to U22608, U22610, U22613 to U22616, U22618 to U22623, U22626, AF082536 to AF082555, AF151767 to AF151773, AF101121, AF101122, AF385992, and AF417200.
The GenBank accession numbers for Southeast Asia CRF01_AE are as follows: U22542, U22548, U22553, U22557, U22561, U22567, U22575, U22604, U22609, U22611, U22612, U22617, U22624, U22625, U48719, U48720, U90068 to U90073, U90077, U90079, U90082, U90085 to U90088, AF081710 to AF081780, AF151737 to AF151766, AF151774, AF151775, L07457, L07462, AF160732 to AF160736, AF080193 to AF080199, AF080201 to AF080203, AF160737 to AF160740, AF160746 to AF160750, AB034839 to AB034844, AB034846 to AB034848, AB034852, AB034853, AB034855 to AB034860, AY358036 to AY358039, AY358041 to AY358049, AY358051 to AY358054, AY358056 to AY358068, AY635620 to AY635626, AY635628 to AY635639, AF326123 to AF326129, AF326131, AF326133 to AF326135, AF326139, AY283413 to AY283416, and AY283418 to AY283426.
The GenBank accession numbers for Africa subtype A1 are as follows: AY905583, AY905586 to AY905590, AY905596, AY905601 to AY905605, AY669700 to AY669705, AY669769, AY713407, U51190, U88823, AY288084, AY713406, AF004885, AF028319, AF028324, AF028328, AF119189, AF119200, AF119201, AB052867, AF361871, AF361873, AF361876, AF361879, AF372392, AJ404009, AJ404013, AJ404015, AJ404016, AJ404020, AJ404022, AJ404052, AJ404085 to AJ404087, AJ404090, AJ404100, AJ404101, AJ404106, AJ404108, AJ404110, AJ404114, AJ404116, AJ404118, AJ404140, AJ404141, AJ404143, AJ404150, AJ404152, AJ404153, AY140594 to AY140596, AY140598, AY140607, AY322190, AY322193, AF484478, AF484479, AF484491 to AF484493, AF484496, AF484503, AF484507 to AF484509, AF484520, AF457052, AF 457055, AF457056, AF457059, AF457062 to AF457070, AF457073, AF457075, AF457078, AF457079, AF457081, AF457083, AF457084, AF457086 to AF457089, AJ490697, AJ490700, AJ490704, AJ490705, AJ490708, AJ490711, AJ490714, AJ490721 to AJ490724, AJ490726, AJ490729, AJ490731, AJ490734, AJ490740 to AJ490743, AJ490745, AJ490746, AJ490748, AJ490752, AJ490753, AJ490755, AJ490757 to AJ490759, AJ490761, AJ490763 to AJ490766, AJ490768, AJ490770, AJ490777, AJ490781, AJ490782, AJ490789, AJ490790, AJ490792, AJ490793, AJ490795, AJ490798, AJ490799, AJ490803, CQ891777, CQ891778, AY253305, AY253306, AY253314 to AY253316, AY253318, AY243319, AY371143, AY371163, AY371164, AY456285, AY456286, AY456291 to AY456294, AY456300, AY456301, AY456303, AY456306, AY676582 to AY676586, AY174897, AY174925, AY174981, AY175006, AY175045, AY175079, AY175104, AY175152, AY175183, AY175234, AY175291, AY175340, DQ155223, and DQ155257.
The GenBank accession numbers for Europe and North America subtype B are as follows: Z29219 to Z29222, Z29224 to Z29258, L08312, L08334, AF065562, AF065590, U27434, AF041125, AF112539, U23112, U23126, U23128, U23130, U68502, U68503, U68508, U58777, U28668, U28671, U28675, U28676, U28680, U28681, AF041132, AF041134, AF191349, AF191365, AF191372, AF191375, AF191377, AF191404, AF191407, AF191410, AF191415, AF191423, AF191380, AF191381, AF191384, AF191429, AF016548, AF016550, AF016553, AF016554, AF016556, AF016558, AF016561, AF016563, AF016564, AF016568 to AF016570, AF016572, AF016575, AF016577, AF016579, AF181291 to AF181293, AF181295, AF181296, AF181298, AF181326, AF181328, AF181329, AF181359 to AF181362, AF181364, AF181365, AF181377, AF181378, AF181380, AF181382, AY308760, AY314044, AY322103, AY331291, AY331282, AF223984, AF223985, AF223987, AF223988, AF223994, AF223997, AF224001, AF224003, AF125300, AF325930, AF325933, AF325948, AF325950, AF325959, AY535455, AY835447, AY835451, AF466232, AY290956, AY290957, AY835449, and AY524180.
The GenBank accession numbers for Africa subtype C are as follows: AY146085, AY146087 to AY146090, AY146091 to AY146104, AY146106, AY146108 to AY146140, AY146142 to AY146151, AY146153 to AY146180, AY146182 to AY146201, AY146203 to AY146214, AF085304, AF085315, AF085320, AF254773, U06716 to U06719, U07237, Z76039, Z76306, L48067, L48068, U88730 to U88732, U88745, U88746, U88766, U88778 to U88780, U88787 to U88790, U88794 to U88799, U88802 to U88812, U33762, AF101462, AY669739, AY669749, U08453, U08455, AF095826, AF101458, AF101460, AF101461, AF101463, AF101465, AF110959, AF110962, AF110969, AF110972, AF110973, AF110976, AF110979, AF245525 to AF245529, AF245531 to AF245534, AF245536, AF245545 to AF245548, AF245556, AF245566, AF245568 to AF245571, AF245579, AF245582, AF245588, AF245590 to AF245592, AF245596, AF245602 to AF245608, AF245611 to AF245613, AF290027, AF095831, AF158871, AF159869, AF159872, AF160813 to AF160828, AF361874, AF361875, AJ272645, AJ272653, AJ272663, AY140600, AF254766 to AF254772, AF254774 to AF254777, AF254782, AF254783, AF391240, AY158534, AY158535, AY423929, AY423965, AY424151, AY424167, AY424176, AY424195, AY529659, AY529661 to AY529663, AY585272, AY135970, AY136014, AY136041, AY424034, AY424061, AY424078, AY424099, AY529668 to AY529673, AY529675, AY529676, AY529678, AY135965, AY463217 to AY463222, AY463225 to AY463234, AY463236, AY529667, AY585266, AY838565 to AY838568, AY137011 to AY137026, AY137028 to AY137031, AY137042 to AY137051, AY137053 to AY137062, AY137064, AY137065, AY137069 to AY137073, AY529671, AY529672, AY196497, AY736823, AY703908, AY772690 to AY772692, AY772694 to AY772696, AY772700, AY878056, AY878057, AY878060, AY878061, AY878063 to AY878065, AY878068, AY878070, AY901966 to AY901972, AY901975, AY901981, DQ011165, AY703909 to AY703911, AY772697 to AY772699, AY821310, AY878054, AY878055, AY878058, AY878059, AY878062, AY878071, AY878072, AY901973, AY901974, and AY901976 to AY901980.
This study was supported by an NIH-DOE interagency agreement (Y1-AI-1500-07).
Published ahead of print on 18 July 2007. ![]()
Present address: Department of Genetics, Case Western Reserve University, School of Medicine, 10900 Euclid Ave., BRB 632, Cleveland, OH 44106. ![]()
Present address: Department of Biostatistics and Computational Biology, University of Rochester, School of Medicine and Dentistry, 601 Elmwood Ave., Box 630, Rochester, NY 14642. ![]()
|
|
|---|
This article has been cited by other articles:
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Copyright © 2009 by the American Society for Microbiology. For an alternate route to Journals.ASM.org, visit: http://intl-journals.asm.org | More Info»