The human immunodeficiency virus type 1 (HIV-1) epidemic in
Southeast Asia has been largely due to the emergence of clade E
(HIV-1E). It has been suggested that HIV-1E is derived from a
recombinant lineage of subtype A (HIV-1A) and subtype E, with multiple
breakpoints along the E genome. We obtained complete genome sequences
of clade E viruses from Thailand (93TH057 and 93TH065) and from the
Central African Republic (90CF11697 and 90CF4071), increasing the total
number of HIV-1E complete genome sequences available to seven.
Phylogenetic analysis of complete genomes showed that subtypes A and E
are themselves monophyletic, although together they also form a larger
monophyletic group. The apparent phylogenetic incongruence at
different regions of the genome that was previously taken as evidence
of recombination is shown to be not statistically significant.
Furthermore, simulations indicate that bootscanning and pairwise
distance results, previously used as evidence for recombination, can be
misleading, particularly when there are differences in substitution or
evolutionary rates across the genomes of different subtypes. Taken
jointly, our analyses suggest that there is inadequate support for the
hypothesis that subtype E variants are derived from a recombinant
lineage. In contrast, many other HIV strains claimed to have a
recombinant origin, including viruses for which only a single parental
strain was employed for analysis, do indeed satisfy the statistical
criteria we propose. Thus, while intersubtype recombinant HIV strains
are indeed circulating, the criteria for assigning a recombinant origin to viral structures should include statistical testing of alternative hypotheses to avoid inappropriate assignments that would obscure the
true evolutionary properties of these viruses.
INTRODUCTION
Viruses involved in the human
immunodeficiency virus type 1 (HIV-1) pandemic are grouped into the
main (M), the outlier (O), and the non-M, non-O (N) groups. Phylogenetic
analysis of the env and gag genes of the M group
has established 10 distinct subtypes, or clades (A through H, K, and J)
(11, 26, 33, 60; information found in the HIV
Molecular Immunology database [http://hiv-web.lanl.gov/immuno/ctl]). A high degree of genetic diversity has developed among and within these
clades through nucleotide substitution, duplication, deletion, and
recombination of closely related or divergent viral strains (1, 6,
18, 27, 42, 43, 47, 49). The relatively high level of genetic
divergence between the M group clades has led to the hypothesis that
multiple vaccines against HIV-1 may have to be made against the
different subtypes of the virus (20, 35). Sequence
information on most of these subtypes is currently limited,
so more sequence data will be needed if subtype-specific vaccines are to be produced.
Previous studies of clade E viruses from both Thailand and the Central
African Republic suggest that HIV-1E originated in Africa and then
spread through a single introduction into Southeast Asia (12, 35,
38, 39). HIV-1E predominates in a growing epidemic in Southeast
Asia and is expected to represent a major proportion of new HIV
infections in the coming decades (65). The number of
HIV-1-infected individuals in Thailand is estimated to be 750,000, with
90% of the sexually transmitted viruses belonging to subtype E
(61, 64) and over half of the recently infected intravenous
drug users infected with subtype E (24, 32, 57).
Phylogenetic analysis of HIV-1 group M viruses has led to the discovery
of intersubtype recombinants, each having genome regions that are
evolutionarily associated with different subtypes (4, 5, 9, 12,
22, 30, 37, 43, 46, 49). Typically, unique recombinants are
represented by individual strains. However, in some instances, entire
groups appear to be descended from a recombinant lineage: viruses in
subtype E, those ascribed to the circulating recombinant form IbNG
(3, 36), and those in recent outbreaks in Russia
(31) and in China (51, 52) are examples. The clade E viruses are
unusual among these, however, in that only a single
parental strain has been identified. These viruses, recently designated
HIV-1 subtype A/E, are described as recombinant lineages, with regions
of the gag and pol genes derived from the A
subtype and regions of the env and vpu genes
derived from an unknown, subtype E parental strain (4, 12).
Several techniques have been designed to detect recombination
events. Some examples include Stephens' method, based on
incompatible sites (56); Sawyer's method, based on
imbalances in the distribution of sequence segments (50);
Smith's chi-square method (55); Jakobsen and Easteal's
method of displaying compatibility matrices (21); Grassly
and Holmes' sliding window likelihood approach (14);
Weiller's graphical method, based on character partitions (63); and the RIP program of Siepel et al. (54).
The techniques originally used in identifying recombination events
within the present HIV-1E genome, which is thought to have had one of
its parental lineages either die out or go undetected, include a
combination of bootscanning (48) and pairwise distance
analyses (12).
The bootscanning analysis uses bootstrapped phylogenetic analyses
(7) on a sliding window of sequential and overlapping segments of the HIV-1 genome alignment. Previous HIV-1 subtype E
recombination analyses have relied on the three HIV-1 subtype E
complete genomes available in the literature. We have sequenced four
more subtype E complete genomes, two from Thailand and two from the
Central African Republic. In order to locate putative regions of
intersubtype recombination, we performed bootscanning and pairwise
distance analyses that yielded results similar to those obtained by other
researchers (4, 12). However, using more-stringent statistical techniques (described below), we found no support for the
hypothesis that HIV-1E is derived from a recombinant ancestor. Using
simulations, we also found that variations in substitution and/or
evolutionary rates across the HIV-1 genome can appear as putative
recombination events with bootscanning and pairwise distance analyses.
Our analyses indicate that the evolutionary relationship of subtype A
and E viruses is similar to that of subtype B and D viruses, the latter
of which have never been classified as recombinants. Our analyses also
suggest that subtype E viruses were not derived from a recombinant
lineage and in fact form an independent monophyletic clade that,
similar to the relationship described for clades B and D, maintains a
relatively close evolutionary affinity to HIV-1A.
MATERIALS AND METHODS
Terminology.
Throughout this article, all previously defined
HIV-1 subtypes are assumed to form monophyletic groups. All variants
that have recently been termed HIV-1 subtype A/E are referred to as "HIV-1 subtype E" variants. The term "evolutionary rate" refers to the measure of nucleotide divergence over time from a common ancestor: we assume that two contemporary lineages that share a common
ancestor have different evolutionary rates if the branch lengths are
different. The term "substitution rate" refers to the expected rate
of substitution of one nucleotide for another. A "recombinant
lineage" is broadly defined as a phylogenetic subset of
variants that share a common ancestor that acquired different parts of
its genome from two or more parental strains. In the case of
"intersubtype recombination," the founder for the recombinant lineage would be formed by a recombination event between viruses from
two different viral subtypes.
Sample acquisition, viral gene amplification, cloning, and
sequencing.
HIV-1 subtype E viral isolates used in this study were
obtained from individuals in the Central African Republic
(90CF11697 and 90CF4071) (38) and Thailand (93TH057
and 93TH065) (62). A nested set of PCRs was used to
amplify either half-genome (~5,000 bp)- or quarter-genome (~2,500
bp)-sized fragments from end point-diluted proviral samples (Expand
High Fidelity PCR System; Boehringer Mannheim, Indianapolis, Ind.)
according to the manufacturer's instructions (44).
Either overlapping PCR fragments from each patient were subcloned into
pAMP1 (Gibco BRL, Grand Island, N.Y.) and sequenced or the PCR product
was directly sequenced using dye terminator chemistry on an automated
DNA sequencer model 373A or 377 (Applied Biosystems, Foster City,
Calif.) according to the manufacturer's instructions. Sequence data
were assembled using the computer program Sequencher (Gene Codes Corp.,
Ann Arbor, Mich.).
Sequence alignment.
Newly determined clade E sequences
93TH057, 93TH065, 90CF4071, and 90CF11697 were aligned, using the
program ClustalW (59), with the complete genomes of 42 subtype reference sequences that are representative of the variability
of the current HIV-1 pandemic (2, 25) (Table
1). The initial ClustalW-generated
multiple sequence alignment was manually refined to maximize the
positional homology of nucleotides while minimizing the number of gaps
introduced into the alignment. Amino acid coding information for each
sequence was also used in the manual refinement of the sequences.
Regions of the alignment where positional homology was undeterminable were excluded, but the alignment was not completely gap-stripped. Thus,
8,650 sites were included in the final alignment. Since some analyses
require that the reference alignment does not include recombinants, a
second alignment (alignment 2) was made with the newly determined clade
E sequences and the complete genomes of 32 of the 42 subtype reference
sequences. This second alignment included only a single outgroup
sequence, CPZGAB, and excluded the sequences of subtypes AG and AGI,
which are thought to be derived from recombinant lineages.
Bootscan analysis.
For bootscan analysis (48),
alignment 2 was divided into sequential overlapping segments of 500 nucleotides, with a step of 100 nucleotides between successive segments. The window
size of 500 nucleotides was chosen to correspond with previous studies
that describe subtype E variants as recombinants (4, 12).
Bootstrapping using 100 replicates was performed on each segment using
the program PAUP* 4.0 (58). The bootstrapping analysis used
a general time-reversible (GTR) model (28) of nucleotide
substitution and a gamma-distributed site-to-site heterogeneity of
rates with invariant sites, all estimated from the original data set.
The bootstrap values that supported the monophyly of all HIV-1E and
HIV-1A sequences were then plotted.
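The windowing scheme described above can be sketched as follows. This is a minimal illustration only: the function names `sliding_windows` and `slice_alignment` are hypothetical, the alignment is assumed to be a dictionary mapping sequence names to aligned strings, and the bootstrapping of each window (performed with PAUP* in this study) is not reproduced.

```python
def sliding_windows(alignment_length, window=500, step=100):
    """Yield half-open (start, end) column ranges for overlapping windows.

    Only full-length windows are produced when the alignment is at least
    one window long, so a short tail of columns may remain uncovered.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    last_start = max(alignment_length - window, 0)
    for start in range(0, last_start + 1, step):
        yield start, min(start + window, alignment_length)


def slice_alignment(alignment, start, end):
    """Return the sub-alignment covering columns [start, end).

    `alignment` is assumed to be a dict of name -> aligned sequence string.
    """
    return {name: seq[start:end] for name, seq in alignment.items()}


# Windows for an alignment of 8,650 sites (the length reported above).
windows = list(sliding_windows(8650))
```

Each sub-alignment returned by `slice_alignment` would then be passed to a separate bootstrap analysis, and the support for the clade of interest plotted against the window position.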
Distance analysis.
The pairwise distance analysis performed
was similar to that described by Gao et al. (12), with our
sequence alignment divided into sequential overlapping segments of 500 nucleotides and moving in steps of 100 nucleotides between each
segment. Maximum likelihood distances were calculated under the
Felsenstein 84 model of evolution for each segment using the program
DNADIST from the PHYLIP package (8). The intersubtype and
intrasubtype pairwise distances were calculated and plotted across the
HIV-1 genome.
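As a rough illustration of a windowed pairwise distance calculation, the sketch below computes the mean distance between two groups of sequences over one window. It deliberately swaps in the simpler Jukes-Cantor (JC69) correction rather than the Felsenstein 84 maximum likelihood distances computed with DNADIST in this study, and the function names and the dictionary-based alignment are assumptions made for the example.

```python
import math
from itertools import product


def jc69_distance(seq_a, seq_b):
    """Jukes-Cantor distance between two aligned sequences.

    Columns containing gaps or ambiguity codes in either sequence are skipped.
    """
    valid = "ACGT"
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in valid and b in valid]
    if not pairs:
        return float("nan")
    p = sum(a != b for a, b in pairs) / len(pairs)  # proportion of differing sites
    if p >= 0.75:
        return float("inf")  # saturated: the JC69 correction is undefined
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def mean_between_groups(alignment, group_a, group_b, start, end):
    """Mean distance over all cross-group sequence pairs in columns [start, end)."""
    dists = [jc69_distance(alignment[x][start:end], alignment[y][start:end])
             for x, y in product(group_a, group_b)]
    return sum(dists) / len(dists)
```

Calling `mean_between_groups` once per window for each pair of subtypes (and, with both groups set to the same subtype but excluding self-comparisons, within a subtype) would give distance profiles of the kind plotted across the genome.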
Kishino-Hasegawa testing.
Likelihood ratio tests
(19) were performed on substitution rate models to identify
the model that produced significantly better likelihood scores for the
given data set. The program Modeltest was used to compare the different
nested models of DNA substitution in a hierarchical hypothesis-testing
framework (40). The nested nature of this group of
nucleotide substitution models also allows measurement of the effects
that different model parameters have on the goodness of fit.
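For reference, the usual form of the likelihood ratio statistic for two nested substitution models can be written as below. The symbols are introduced here for illustration and are not taken from the text: $L_0$ and $L_1$ are the maximized likelihoods of the simpler and the more general model, and $p_0$ and $p_1$ are their numbers of free parameters. The chi-square approximation is the standard one and may need adjustment when a parameter is fixed at a boundary value.

```latex
\delta = 2\,\bigl(\ln L_1 - \ln L_0\bigr),
\qquad
\delta \;\sim\; \chi^2_{k} \quad \text{under the simpler model},
\qquad
k = p_1 - p_0
```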
Kishino-Hasegawa (KH) tests were conducted using the PAUP* 4.0 program
(58). The KH test compares the likelihood scores of two
phylogenetic trees defined a priori (e.g., trees that correspond to
hypotheses derived independently of the observed data) over a given
region of an alignment and tests the null hypothesis that there is no
statistical difference in the likelihoods of the two trees. The KH test
is designed to compare only tree topologies and thus does not use fixed
branch lengths in the calculation of likelihood scores. Therefore, two
identical topologies will always produce identical likelihood scores
and will satisfy the null hypothesis of no difference. We were
interested in knowing if there is a statistically significant
difference between topologies for which HIV-1E sequences share a common
node with HIV-1A sequences and those for which the two subtypes are
separate monophyletic groups that do not exclusively share a common
node. The complete genome reference sequence alignment was divided into
nine regions that corresponded to variations in the tree topology, as
indicated by the bootscan results. The maximum likelihood (ML) tree
topology for each region was constructed. Constrained tree topologies, which forced clade E sequences to share a common node with the clade A
sequences, were also constructed for regions where the best topology
grouped clade E viruses evolutionarily closer to subtypes other than
HIV-1A. The KH test was used to test the null hypothesis that there is
no statistical difference in the likelihoods of the best-tree topology
and the constrained-tree topology.
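The logic of a KH-type comparison can be sketched from per-site log-likelihoods. The example below uses a normal approximation to the distribution of the summed per-site difference; this is one common formulation and is not necessarily the variant applied by PAUP* in this study. The function name `kh_test` and its inputs are assumptions: the per-site log-likelihoods of the two trees are taken as given, for example exported from a maximum likelihood program.

```python
import math


def kh_test(site_lnl_1, site_lnl_2):
    """Two-sided KH-style test from per-site log-likelihoods of two trees.

    Returns (delta, z, p), where delta is the total log-likelihood
    difference (tree 1 minus tree 2) and p is the two-sided P value
    under a normal approximation.
    """
    if len(site_lnl_1) != len(site_lnl_2):
        raise ValueError("both trees must be scored on the same sites")
    n = len(site_lnl_1)
    if n < 2:
        raise ValueError("at least two sites are required")
    diffs = [a - b for a, b in zip(site_lnl_1, site_lnl_2)]
    delta = sum(diffs)
    mean = delta / n
    # Estimated variance of the summed difference across sites.
    var_delta = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    if var_delta == 0.0:
        # Degenerate case: every site shows the same difference.
        return (delta, 0.0, 1.0) if delta == 0.0 else (delta, math.inf, 0.0)
    z = delta / math.sqrt(var_delta)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return delta, z, p
```

A P value below the chosen threshold (0.05 in this study) would indicate that the two topologies differ significantly in likelihood over the region examined.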
Novel likelihood-based statistical tests of topologies have been
developed to correctly compare tree topologies that are derived from
the observed data (i.e., the ML tree topology) (13, 53). These tests are computationally intensive, limiting the size of sample
sets to 10 or fewer sequences. The KH test, however, can be used with
larger sample sets, although using the KH test to compare
prederived topologies may force the KH test to be too conservative
because the test was designed to statistically compare the topologies
of two trees defined a priori (13).
Simulated data.
In evaluating recombination, it is
instructive to ask whether nonrecombining sequences can show
bootscan and pairwise distance patterns similar to those obtained previously with
subtype E sequences. Differences in the rates of evolution or
substitution across the genome can potentially confound phylogenetic
signal (15, 16). To address this issue, we simulated
multiple nucleotide alignments corresponding to the 36 complete genome
sequences (alignment 2) used in our bootscanning and pairwise distance
analyses. Since we were interested in knowing whether an equivalent
pattern can be produced in the absence of recombination, we simulated
data using a single identical phylogeny across the genome, broken into nine segments (see Fig. 1). The nine contiguous segments corresponded to regions that had previously been identified by bootscanning as
having potentially different phylogenetic affinities (possibly as a
result of recombination). This segmenting of the genome allowed us to
vary the rate of evolution along the genome to correspond with the
observed data while maintaining a consistent tree topology. The nine
regions were then concatenated to form a simulated complete genome
alignment that maintained variations in rates of evolution along the
genome. The complete genome phylogeny for the 36 isolates included in
alignment 2 was used as the tree topology for all nine regions. In the
absence of recombination, the complete genome ML topology should
provide an accurate reconstruction of relationships among lineages
(17). This topology places E and A viral isolates as
separate monophyletic groups that together form a larger monophyletic group. These simulations were performed using the program Seq-Gen, version 1.1 (41). To produce the simulated data sets,
Seq-Gen used a GTR model (28) of nucleotide substitution,
gamma-distributed site-to-site heterogeneity of rates, and nucleotide
frequencies that were estimated from the observed sequence data (Table
2). Therefore, the evolutionary rate
differences along the simulated genome reflect the differences seen in
the observed data. The program also used a phylogenetic tree as the
true tree for each data set that it produced. The true trees for each
of the nine regions were produced by building a constrained tree for
each region based on the whole genome ML tree topology, which maintains clades A and E as two separate monophyletic groups. Therefore, the same
tree topology is maintained along the entire genome, thus ensuring that
recombination has not taken place in the simulated data but accounting
for variations in the evolutionary rate by allowing for variations in
the branch lengths for each of the nine simulated regions.
The complete genome ML topology was used as the model tree to generate
100 independent simulations. The model of substitution and the branch
lengths of the model tree varied along the length of the genome
corresponding to regional variations in the observed data. Bootscan and
pairwise distance analyses were performed on the simulated data, as
with the original data.
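The idea behind the simulation design, one fixed topology with region-specific amounts of change, can be illustrated with a toy sketch. It is not the procedure used here: the study used Seq-Gen with a GTR model, gamma-distributed rate heterogeneity, and region-specific branch lengths, whereas the sketch below substitutes the simpler Jukes-Cantor model and a single rate multiplier per region. The tree `TOY_TREE`, the function names, and the region settings are all hypothetical.

```python
import math
import random

BASES = "ACGT"

# Toy tree: an internal node is a list of (child, branch_length) pairs,
# and a leaf is a name string. Clades "A" and "E" are sister groups.
TOY_TREE = [
    ([([("A1", 0.05), ("A2", 0.05)], 0.02),
      ([("E1", 0.05), ("E2", 0.05)], 0.02)], 0.03),
    ("B1", 0.15),
]


def evolve(seq, branch_length, rng):
    """Evolve a sequence along one branch under the Jukes-Cantor model."""
    # Probability that a site ends the branch in a different base.
    p_change = 0.75 * (1.0 - math.exp(-4.0 * branch_length / 3.0))
    out = []
    for base in seq:
        if rng.random() < p_change:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def simulate(node, seq, rate_scale, rng, result):
    """Recursively pass a sequence down the tree, recording leaf sequences."""
    if isinstance(node, str):
        result[node] = seq
        return
    for child, length in node:
        simulate(child, evolve(seq, length * rate_scale, rng),
                 rate_scale, rng, result)


def simulate_genome(tree, regions, seed=0):
    """Simulate each (length, rate_scale) region on the same tree and concatenate."""
    rng = random.Random(seed)
    genome = {}
    for length, rate_scale in regions:
        root = "".join(rng.choice(BASES) for _ in range(length))
        region = {}
        simulate(tree, root, rate_scale, rng, region)
        for name, seq in region.items():
            genome[name] = genome.get(name, "") + seq
    return genome


# Two regions on the same topology: a slowly and a rapidly evolving one.
genome = simulate_genome(TOY_TREE, [(300, 0.5), (200, 3.0)], seed=1)
```

Because every region is generated on the same topology, any apparent change in clustering along the simulated genome cannot be attributed to recombination.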
Nucleotide sequence accession numbers.
GenBank accession
numbers for the full-length HIV-1 sequences reported in this study are
AF197338 (93TH057), AF197339 (93TH065), AF197340 (90CF11697), and
AF197341 (90CF4071).
RESULTS
Bootscan analysis.
Complete genome phylogenetic analysis
can show the broad relationships among viral populations, but a more
detailed analysis that breaks up the genome into smaller sections may
reveal relationships that are more intricate. Bootscan analysis, which
breaks the genome into small sections and then analyzes each section
independently, has been used to identify areas of recombination within
an HIV-1 genome (48). Ideally, bootscan analysis of
nonrecombining genomes would show consistently high bootstrap support
(i.e., >70%) across the genome for major clades containing the same
sets of sequences. If some sequences were recombinants, then their
phylogenetic placement at different regions of the genome would vary,
grouping with one parental lineage for one region of the genome and
with another lineage for a different region (48). However,
it is unclear whether inconsistent bootstrap support for clade
membership or phylogenetic incongruence in and of itself is sufficient
evidence of the presence of recombinant lineages.
The bootscan plot we obtained using the full set of 36 genomes was
comparable to previous clade E bootscanning results produced by Carr et
al. (4) and Gao et al. (12). Sequences from
clades A and E group with a high bootstrap value over most of the
genome, with regions of lower bootstrap values located in
env, nef, and vif (Fig.
1).

FIG. 1.
Plot showing bootstrap clustering values comparing
subtypes A and E along the genome. The x axis shows the
relative position across the HIV-1 genome. Numbered sections along the
HIV-1 gene map identify the nine contiguous genomic regions that were
used in later analyses.
Pairwise distance analysis.
Pairwise distance analyses have
been used in conjunction with bootscan analyses to support the idea of
recombination within the clade E genome. Previous pairwise distance
analyses have shown that clade E and clade A viruses are relatively
close to one another in gag but more distant in
env (4, 12). Also, there is no apparent parental
lineage for the subtype E viruses for part of the genome (i.e., there
are no known extant lineages that appear to be more closely related
phylogenetically to subtype E viruses over regions of the
vif, env, and nef genes). Without a
parental clade E available, the env region of HIV-1E
variants does not show close genetic affinity with any subtype. Our own
pairwise distance analyses show similar results, with HIV-1E sequences sharing great similarity to HIV-1A sequences in gag and
pol but diverging in env. Interestingly, the same
pattern is also seen with HIV-1B and HIV-1D sequences, but it has never
been suggested that subtypes B and D are recombinant lineages (Fig.
2).

FIG. 2.
Nucleotide divergence measurements across the HIV-1 M
group genome for alignment 2. (a) Pairwise distance plot showing intra-
and intersubtype maximum likelihood distances. The gray lines indicate
the intersubtype distances across the genome for all subtype pairs
except A versus E and B versus D. The teal lines indicate intrasubtype
distances across the genome. (b) Pairwise distance plot showing intra-
and intersubtype maximum likelihood distances as described for panel a,
with A/E and B/D intersubtype distances included. The x axis
shows the relative position across the HIV-1 genome.
Complete genome phylogenetic analysis of subtype E.
A
phylogenetic tree that included all available complete clade E
genomes, three from Gao et al. (12) and Carr et al.
(4) and four determined in this study, along with the
complete genomes from the 39 other subtype reference sequences, was
produced (Fig. 3). HIV-1E and HIV-1A
sequences were found to form separate monophyletic groups on this tree.
Larger monophyletic groups can also be formed with the A and E clades,
and with the A, AG, AGI, and E clades. Viruses from the AG and AGI
clades are reported to be derived from recombinant lineages, with
portions of their genomes closely associating but forming separate
monophyletic groups with subtype A viruses (3, 10) (Fig.
4). The clade E viruses from Thailand cluster separately from the Central African Republic (CAR) viruses, the latter having a greater overall level of divergence (12). The lower diversity among the Thailand clade E viruses supports the concept that the Thailand clade E epidemic is relatively recent. The clustering of the Thailand viruses within the more diverse CAR clade E subgroup supports the
hypothesis that the clade E viruses in Southeast Asia were introduced
from Africa (12, 38).

FIG. 3.
Maximum likelihood phylogenetic relationships of newly
derived HIV-1 clade E viral genomes and complete genomes representative
of other HIV-1 group M clades, with bootstrap values of 70% or greater
indicated. The tree was constructed using the ML method and the GTR
substitution model as described in Materials and Methods and reference
29.
FIG. 4.
Locations of the nine adjacent regions used in the
analysis of clade E and A variants (numbered 1 through 9), along with
the inferred mosaic structures for subtypes AG and AGI (3,
10). Approximate genomic coordinates are indicated by the
position within the HXB2 reference sequence. Regions of different
subtypes located in the AG and AGI sequences are indicated by the
single letters A, G, and I. U, unknown. Dashed lines indicate the
breakpoints for the nine analyzed regions superimposed onto the AG,
AGI, and HIV-1 gene maps.
Kishino-Hasegawa tests.
Nine ML phylogenetic tree
topologies were constructed across the genome to identify regions of
topological incongruence (Fig. 5). The
model of evolution used for each region is shown in Table 3. Each of the nine topologies was
constructed from a genomic region that had previously been identified
by bootscanning as having a potentially different phylogenetic affinity
to subtype E sequences (Fig. 1), and together the nine regions spanned
the entire HIV-1 genome. In six of the nine regions, the best ML tree topology grouped clade E viruses evolutionarily more closely to subtypes other than HIV-1A (Fig. 5). For these, constrained topologies were constructed which forced HIV-1A and HIV-1E viruses to exclusively share a common node on the tree. The constrained topology in regions 4, 5, 8, and 9 forced clade E viruses to group with clade A viruses, while
the constrained topology in regions 6 and 7 forced the clade E viruses
to group with viruses from clades A and AG (the mosaic subtype AG has
been reported to be subtype A in these regions [3])
(Fig. 4). To test for significant differences between the best-ML
tree topologies and the constrained topologies, the Kishino-Hasegawa test was used (23). In every set
tested, we found that a topology that maintained clades A and E as
separate but closely related monophyletic groups never produced a
likelihood score statistically worse than those produced by topologies
that group clade E with other subtypes (Fig.
6). Therefore, the KH test could not
reject the hypothesis that, across the genome, clades A and E are
maintained as separate monophyletic groups that share a close
association to one another. These results indicate that no significant
topological incongruence occurs along the genome and thus that no
recombination is required to account for these topologies.

FIG. 5.
Maximum likelihood trees for each of the nine genome
segments (shown in Fig. 4), showing the phylogenetic relationships of
clade E viruses in comparison to representative sequences of other
HIV-1 group M clades. The coordinates of each segment are given
relative to the HXB2 genome. Subtype designations are indicated in the
names of the sequences (i.e., J_ indicates subtype J).
FIG. 6.
Kishino-Hasegawa test of topological
incongruence across the entire genome. For each region indicated, the
best-tree topology (shown in Fig. 5) was compared to the
constrained-tree topology shown that forced clade E sequences to group
either with clade A exclusively or with clades A and AG. In regions 6 and 7, where clade E was forced to group with clades A and AG, clade AG
has been reported to be of subtype A (3). Likelihood scores
for the best and constrained topologies are indicated, along with
P values for each regional comparison (the significance
threshold was set at 0.05). A P value of <0.05
indicates that the best topology is significantly better than the
constrained topology.
To evaluate the significance of separate
monophyletic groupings of clades A and E, we constructed
two sets of constrained topologies for each of the nine regions. These
constrained topologies forced the most basal clade E sample to be
grouped within clade A or vice versa. In all regions, these constrained
topologies, which did not maintain A and E as separate monophyletic
groups, produced significantly worse likelihood scores when
compared to the best-ML tree topology (data not shown).
Simulated data.
Bootscan analyses showed a consistently low
level of bootstrap support for grouping clades A and E over portions of
the env and vif regions (Fig. 1). Intersubtype
pairwise distance analyses also showed that the env region
of clades A and E are more dissimilar than the gag and
pol regions (Fig. 2). To determine if factors other than
recombination can cause such changes in the bootscan and distance
measures, we performed bootscanning and pairwise distance analysis on
alignment simulations. Interestingly, the simulated data produced
bootscan patterns similar to those obtained with the real sequence data
(Fig. 7a). Since the true tree (i.e., the
tree used to construct the simulated sequences) maintained the same
topology over the entire genome, it is interesting to note that this
topology is not consistently supported by bootstrap analysis as one
moves along the genome in 500-nucleotide windows. Likewise, pairwise
distance measurements on the simulations show patterns similar to
those obtained with the real data (Fig. 7b).

FIG. 7.
Bootscan and pairwise distance analyses of real and
simulated nucleotide sequence alignment data sets. (a) Bootscan plot
showing the bootstrap value along the genome of subtypes A and E
forming a monophyletic group for the real data (black line) and 10 independent simulations (gray lines) across the genome. The
x axis shows the relative position across the HIV-1 genome,
with the nine contiguous regions indicated by the alternately shaded
sections. (b) Pairwise distance plots showing intersubtype maximum
likelihood distances of the observed data (black lines) and 10 independent simulations (gray lines) across the genome. The
x axis shows the relative position across the HIV-1 genome,
with the nine contiguous regions indicated by the alternately shaded
sections.
Our studies suggest a clear framework for the evaluation of
potential recombinant genomes, in which topological relationships are
evaluated with bootscan or other techniques, followed by KH testing for
the statistical significance of these relationships. To evaluate the
general utility of the KH test to identify recombinant structures, we
also applied it to the analysis of other reported recombinant HIV-1
genomes (Table 4). In each of seven cases
evaluated, we found statistical support for the hypothesis of a
recombinant origin. In all but one of the 16 regions tested, the KH
test found statistical support for recombination. The single region
that did not produce a significant result was a putative subtype G region in the isolate 94CY032 (Table 4). As a further test of the
robustness of this test, we evaluated the reported B/F recombinant virus 93BR029 (11), without the benefit of one parental
sequence (Fig. 8), thus simulating the
situation with the subtype E viruses, in which only an A-like parental
sequence has been identified. In this instance too, statistical support
for the recombination hypothesis was evident (Fig. 8). Thus, among
the genomes examined, only the subtype E analysis fails to support a
recombinant origin.

FIG. 8.
Evaluation of the B/F recombinant 93BR029 in analyses of
only one of the two parental strains. For these analyses, subtype F
sequences were removed from the sequence alignments, creating a
situation where subtype B was the only parental sequence studied. (a)
Bootscan plot showing bootstrap clustering values comparing the variant
93BR029 and subtype B variants along the genome. The x axis
shows the relative position across the HIV-1 genome. Numbered sections
along the HIV-1 gene map identify the 10 contiguous genomic regions
that were used for Kishino-Hasegawa testing. (b) Kishino-Hasegawa test
of topological incongruence across the entire genome. For each region
indicated, the best-tree topology that grouped 93BR029 with the subtype
B sequences was compared to the best-tree topology that placed 93BR029
outside of the subtype B clade. The tree topology that produced the
highest likelihood score was labeled "Best" and was compared to the
remaining topology using the Kishino-Hasegawa test. P values
for each regional comparison are indicated, with the significance
threshold set at 0.05 (statistically significant results are
marked with an asterisk). A P value of <0.05 indicates that the "Best" topology is significantly better than the
other topology.
DISCUSSION
Our initial bootscanning and pairwise distance analyses yielded
results consistent with those obtained by other investigators (4,
12): they indicated that a recombination event might have
occurred in the history of the HIV epidemic, giving rise to a lineage
of HIV-1A/E variants. Bootscan analysis showed that gag and
pol regions of the genome group HIV-1E and HIV-1A together with strong bootstrap support, whereas analyses of portions of vif, env, and nef failed to provide
strong evidence for this same grouping (Fig. 1).
The intersubtype pairwise distance analyses comparing
clades A and E have also been used to support the idea of
recombination within the clade E genome (4, 12). The
distance between clades A and E in gag and pol is
within the intrasubtype distance range, while the distance in
env, which is thought to be descended from a nonrecombinant
proto-E parental virus, reaches the intersubtype distance range (Fig.
2). The close distances in gag and pol and the
more extreme distances in env seem to support the
recombination hypothesis that clades A and E share an intrasubtype
relationship in gag and pol while they are
separate subtypes in the env region. However, the analysis
of clades B and D also shows a pattern of close distances in
gag and pol, with a more divergent region located in env (Fig. 2). Therefore, a hypothesis of recombination
taking place between clades B and D would also be supported by these pairwise distance analyses, although this suggestion has not been put
forward. In fact, these extreme changes in pairwise distances over
different regions of the genome may not be due to recombination but may
be the result of variations in evolutionary rates across the genome
(43). If variations in evolutionary rate can dramatically affect the pairwise distance analysis, then the ability of pairwise distance analysis to determine or confirm putative areas of
recombination is questionable.
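The windowed comparison discussed above can be illustrated with a minimal sketch. It is not the procedure used in this study: it assumes two pre-aligned sequences of equal length and uses the uncorrected proportion of differing sites (p-distance) as a stand-in for whatever distance model an analysis would actually apply. The function names and the window and step sizes are hypothetical.

```python
def p_distance(a, b):
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return float("nan")
    return sum(x != y for x, y in pairs) / len(pairs)


def windowed_distances(a, b, window=500, step=100):
    """Return (window_start, p_distance) pairs along two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (start, p_distance(a[start:start + window], b[start:start + window]))
        for start in range(0, len(a) - window + 1, step)
    ]


if __name__ == "__main__":
    # Toy alignment: identical first half, fully divergent second half.
    seq_1 = "A" * 20
    seq_2 = "A" * 10 + "C" * 10
    for start, dist in windowed_distances(seq_1, seq_2, window=10, step=10):
        print(start, dist)
```

A profile of this kind reports only how distance changes along the genome. It carries no information about why the distance changes, which is why a regional rise in distance cannot by itself distinguish recombination from regional rate variation.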
Since regional variations in the evolutionary rate may cause extreme
changes in the patterns of pairwise distances, it is conceivable that
these same variations may also affect the bootscan results. To test
this hypothesis, simulations were performed to determine the effect of
rate variation on bootscan and pairwise distance analyses. Although the
data were simulated on a single nonrecombining phylogenetic topology that was
consistent across the genome, the bootscan plots
of the simulated data erroneously showed support for a recombination
event in the env and vif regions (Fig. 7a). The
simulations also showed that clades A and E maintained a pairwise
distance similar to that of most intersubtype pairings in the
env region but resembling an intrasubtype relationship in
the gag and pol regions (Fig. 7b), all without a
recombination event having taken place in the generation of these data
sets. These simulations highlight the fact that changes in the
evolutionary rate across the genome can have profound effects on the
bootscan and distance analyses and can produce apparent support for the hypothesis of recombination when none has occurred. In particular, because bootscanning samples relatively small (500-nucleotide) sections of the genome, it may be sensitive to
sitewise heterogeneity in rates of substitution and may give poor support (by
virtue of the short length of sampled sequence) for short interior
branches on a topology. Thus, simulations that invoke differences in
the evolutionary rate across the genome can provide a reasonable
explanation for the bootscan and pairwise distance results without
invoking an intersubtype recombination event.
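The effect of regional rate variation on pairwise distance can be reproduced in a toy setting. The sketch below is not the simulation procedure used in this study: it assumes the Jukes-Cantor model, two lineages descending from one ancestor on a single tree, and two arbitrary per-branch rates (0.02 and 0.10 expected substitutions per site) standing in for a slowly and a rapidly evolving region. All names and parameter values are hypothetical.

```python
import math
import random


def evolve(seq, branch_length, rng):
    """Evolve seq along one branch under the Jukes-Cantor (JC69) model.

    branch_length is the expected number of substitutions per site.
    """
    # JC69 probability that a site ends in a state different from its start.
    p_change = 0.75 * (1.0 - math.exp(-4.0 * branch_length / 3.0))
    out = []
    for base in seq:
        if rng.random() < p_change:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


def p_distance(a, b):
    """Uncorrected proportion of differing sites between equal-length sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def simulate_two_regions(n_sites=2000, slow=0.02, fast=0.10, seed=1):
    """Return (slow_region_distance, fast_region_distance) for two lineages
    that share one ancestor and one tree, with no recombination anywhere."""
    rng = random.Random(seed)
    distances = []
    for branch_length in (slow, fast):
        ancestor = "".join(rng.choice("ACGT") for _ in range(n_sites))
        lineage_1 = evolve(ancestor, branch_length, rng)
        lineage_2 = evolve(ancestor, branch_length, rng)
        distances.append(p_distance(lineage_1, lineage_2))
    return tuple(distances)


if __name__ == "__main__":
    slow_d, fast_d = simulate_two_regions()
    print(f"slow region: {slow_d:.3f}  fast region: {fast_d:.3f}")
```

Both regions share the same history, yet the pair looks closely related in the slow region and much more divergent in the fast region. This is the pattern that a distance plot alone might be read as a recombination breakpoint.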
A likely explanation for the differences in the
evolutionary rates across the genome is that different regions of the
genome are under different selective pressures. Random mutations in
gag or pol are more likely than those in
env to adversely affect the fitness of the viral progeny.
Furthermore, the rate of nonsynonymous substitutions equals or exceeds
that for synonymous substitutions in the gp120 encoding region of
env, indicative of diversifying selection (29,
45). On the other hand, purifying selection appears to
predominate in gag and pol, as evidenced by a
relatively low rate of nonsynonymous substitutions (29, 45).
Therefore, gag and pol reside in a more
constrained sequence space than env. As the HIV-1A and
HIV-1E variants diverge from their most recent common ancestor, the
gag and pol regions evolve more slowly into regions of sequence space that maintain fit viruses, while the less
constrained env regions diverge more rapidly.
The last piece of evidence cited for the hypothesis of clade E viruses
being derived from a recombinant lineage resides in the topological
incongruence seen in different regions of the genome. In the
gag and pol regions, clade E and clade A viruses are more closely related to each other than to any other subtype, with
HIV-1E and HIV-1A each forming a separate monophyletic group (Fig. 5).
The env region, on the other hand, groups clade E viruses closer to subtypes other than HIV-1A, supporting the idea that HIV-1E
is derived from a recombinant lineage (Fig. 5). It is important, however, to test whether the sequence data in different regions provide
statistically greater support for the recombination hypothesis than for
the alternative hypothesis of nonrecombinant descent. The Kishino-Hasegawa
test (23) was used to test these hypotheses, and it revealed
that the hypothesis that HIV-1E and HIV-1A are separate monophyletic
assemblages with a close association to each other is not significantly
less well supported than an HIV-1E recombination hypothesis (Fig. 6).
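The logic of a Kishino-Hasegawa comparison can be sketched as follows, assuming that per-site log-likelihoods for the two competing topologies have already been exported from a phylogenetics package. This is a simplified, two-sided normal-approximation version given for illustration only. The function name is hypothetical, and the toy input values are invented rather than taken from any analysis reported here.

```python
import math
import statistics


def kh_test(site_lnl_a, site_lnl_b):
    """Compare two topologies from their per-site log-likelihoods.

    Returns (delta, z, p): the total log-likelihood difference (A minus B),
    its standardized value, and a two-sided P value from the normal
    approximation.
    """
    if len(site_lnl_a) != len(site_lnl_b) or len(site_lnl_a) < 2:
        raise ValueError("need two equal-length lists with at least two sites")
    diffs = [a - b for a, b in zip(site_lnl_a, site_lnl_b)]
    n = len(diffs)
    delta = sum(diffs)
    # Standard deviation of the summed difference, estimated from the
    # sample variance of the per-site differences.
    sd_sum = math.sqrt(n * statistics.variance(diffs))
    if sd_sum == 0.0:
        raise ValueError("per-site differences have zero variance")
    z = delta / sd_sum
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return delta, z, p


if __name__ == "__main__":
    # Invented values: topology A is slightly better at some sites, worse at others.
    lnl_a = [-1.0, -2.0, -1.0, -2.0]
    lnl_b = [-2.0, -1.0, -2.0, -1.0]
    print(kh_test(lnl_a, lnl_b))
```

A P value of 0.05 or greater means that the data do not discriminate between the two topologies at the 5% level. This is the sense in which a constrained topology is described as not significantly worse than the best tree.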
The Kishino-Hasegawa (KH) test showed that the constrained topologies that forced clade E
viruses to group with clade A viruses were not significantly different
from the ML topology across the genome, as long as the clades remained
monophyletic. However, these constrained topologies in the
vif and env regions may group the HIV-1A and E
clades on long branches that are approximately equidistant from every
other subtype. These longer branches may still support the hypothesis of recombination in these regions. To address this issue, we
constructed further constrained-tree topologies in regions 6 and 7 (Fig. 9) that placed the common node of
HIV-1A and HIV-1E above that of HIV-1A and HIV-1AG, the latter pair
having been classified as the same subtype in these regions (Fig. 4). A
KH test comparing these new constrained topologies to the
best ML tree topologies indicated that the difference between the best
topology and the constrained topology in either region was not
statistically significant (Fig. 9), although the constrained topology
in region 7 came close to the significance threshold. Therefore, the
long clade E branch seen in these regions does not provide significant
evidence of a recombination event. On the weight of this evidence, we
find it more reasonable to believe that HIV-1E variants are not
the descendants of a recombinant lineage but are monophyletic, albeit with relatively close evolutionary affinities to HIV-1A. We contend that HIV-1A and HIV-1E maintain separate monophyletic lineages, with a relationship similar to that of HIV-1B and HIV-1D. Finally, we
conclude that the use of bootscan and pairwise distance analyses without other statistical tests to locate areas of recombination may
result in false identification due to variations in evolutionary rates
across the genome.

|
FIG. 9.
Kishino-Hasegawa test results over the vif
and env regions of the HIV-1 genome. The best-tree topology
(shown in Fig. 5) was compared to the constrained-tree topology shown
here, which forced clade E sequences to form a common node with clade A
sequences that was evolutionarily closer to clade A than clade AG was
to clade A. In these regions clade AG is reported to be of subtype A,
and thus the constrained tree places clade E sequences within the A/AG
cluster. Likelihood scores and P values were derived as described in
the legend to Fig. 6.
|
|
We have shown that more-stringent statistical analyses need to be
performed to achieve a greater understanding of the intricate processes
involved in HIV-1 evolution. Indeed, these tests confirmed the
recombinational origin of several other viruses reported in the
literature, and this confirmation was not dependent upon the presence
of both parental strains (Table 4).
HIV-1 nomenclature is primarily based on the evolutionary relationships
of viral variants. The point can be made that if the clade E variants
are indeed evolutionarily similar to clade A variants across the entire
genome, then clade E may in fact be just a subclade of a larger A
subgroup that would include present clade A sequences along with the
clade E sequences and portions of clade AG and AGI sequences. Under
the current nomenclature requirements (D. L. Robertson, J. P. Anderson, J. A. Bradac, J. K. Carr, B. Foley, R. K. Funkhouser, F. Gao, B. H. Hahn, M. L. Kalish, C. Kuiken, G. H. Learn, T. Leitner, F. McCutchan, S. Osmanov, M. Peeters, D. Pieniazek, M. Salminen, P. M. Sharp, S. Wolinsky, and
B. Korber, Letter, Science 288:55-56), the similarity between these two monophyletic clades would preclude them from achieving the rank of two individual subtypes. These two monophyletic clades have historically been named subtypes A and E, however, just as
the monophyletic clades of B and D have historically been designated
subtypes even though they do not meet present requirements (D. L. Robertson et al., Letter, Science 288:55-56). Hence, we
propose a nomenclature that maintains the E subtype and argue that the
A/E nomenclature is unwarranted. Most importantly, the inappropriate
attribution of recombinant origins to divergent sequences obscures the
true evolutionary properties of these viruses.
This work was supported by grants from the Centers for Disease
Control and Prevention (CDC), The University of Washington Center for
AIDS Research (CFAR), and the U.S. Public Health Service.
| 1.
|
Burke, D. S.
1997.
Recombination in HIV: an important viral evolutionary strategy.
Emerg. Infect. Dis.
3:253-259.
|
| 2.
|
Carr, J. K.,
B. T. Foley,
T. Leitner,
M. Salminen,
B. Korber, and F. McCutchan.
1999.
Reference sequences representing the principal genetic diversity of HIV-1 in the pandemic, vol. 1999.
Los Alamos National Laboratory, Los Alamos, N.M.
|
| 3.
|
Carr, J. K.,
M. O. Salminen,
J. Albert,
E. Sanders-Buell,
D. Gotte,
D. L. Birx, and F. E. McCutchan.
1998.
Full genome sequences of human immunodeficiency virus type 1 subtypes G and A/G intersubtype recombinants.
Virology
247:22-31.
|
| 4.
|
Carr, J. K.,
M. O. Salminen,
C. Koch,
D. Gotte,
A. W. Artenstein,
P. A. Hegerich,
D. St. Louis,
D. S. Burke, and F. E. McCutchan.
1996.
Full-length sequence and mosaic structure of a human immunodeficiency virus type 1 isolate from Thailand.
J. Virol.
70:5935-5943.
|
| 5.
|
Cornelissen, M.,
G. Kampinga,
F. Zorgdrager,
J. Goudsmit, and the UNAIDS Network for HIV Isolation and Characterization.
1996.
Human immunodeficiency virus type 1 subtypes defined by env show high frequency of recombinant gag genes.
J. Virol.
70:8209-8212.
|
| 6.
|
Diaz, R. S.,
E. C. Sabino,
A. Mayer,
J. W. Mosley,
M. P. Busch, and The Transfusion Safety Study Group.
1995.
Dual human immunodeficiency virus type 1 infection and recombination in a dually exposed transfusion recipient.
J. Virol.
69:3272-3281.
|
| 7.
|
Felsenstein, J.
1985.
Confidence limits on phylogenies: an approach using the bootstrap.
Evolution
39:783-791.
|
| 8.
|
Felsenstein, J.
1993.
PHYLIP (phylogeny inference package) version 3.5c.
Department of Genetics, University of Washington, Seattle, Washington.
|
| 9.
|
Gao, F.,
S. G. Morrison,
D. L. Robertson,
C. L. Thornton,
S. Craig,
G. Karlsson,
J. Sodroski,
M. Morgado,
B. Galvao-Castro,
H. von Briesen,
S. Beddows,
J. Weber,
P. M. Sharp,
G. M. Shaw,
B. H. Hahn, and the WHO and NIAID Networks for HIV Isolation and Characterization.
1996.
Molecular cloning and analysis of functional envelope genes from human immunodeficiency virus type 1 sequence subtypes A through G.
J. Virol.
70:1651-1667.
|
| 10.
|
Gao, F.,
D. L. Robertson,
C. D. Carruthers,
Y. Li,
E. Bailes,
L. G. Kostrikis,
M. O. Salminen,
F. Bibollet-Ruche,
M. Peeters,
D. D. Ho,
G. M. Shaw,
P. M. Sharp, and B. H. Hahn.
1998.
An isolate of human immunodeficiency virus type 1 originally classified as subtype I represents a complex mosaic comprising three different group M subtypes (A, G, and I).
J. Virol.
72:10234-10241.
|
| 11.
|
Gao, F.,
D. L. Robertson,
C. D. Carruthers,
S. G. Morrison,
B. Jian,
Y. Chen,
F. Barré-Sinoussi,
M. Girard,
A. Srinivasan,
A. G. Abimiku,
G. M. Shaw,
P. M. Sharp, and B. H. Hahn.
1998.
A comprehensive panel of near-full-length clones and reference sequences for non-subtype B isolates of human immunodeficiency virus type 1.
J. Virol.
72:5680-5698.
|
| 12.
|
Gao, F.,
D. L. Robertson,
S. G. Morrison,
H. Hui,
S. Craig,
J. Decker,
P. N. Fultz,
M. Girard,
G. M. Shaw,
B. H. Hahn, and P. M. Sharp.
1996.
The heterosexual human immunodeficiency virus type 1 epidemic in Thailand is caused by an intersubtype (A/E) recombinant of African origin.
J. Virol.
70:7013-7029.
|
| 13.
| Goldman, N., J. P. Anderson, and A. Rodrigo.
Likelihood-based tests of topologies in phylogenetics. Syst. Biol., in
press.
|
| 14.
|
Grassly, N. C., and E. C. Holmes.
1997.
A likelihood method for the detection of selection and recombination using nucleotide sequences.
Mol. Biol. Evol.
14:239-247.
|
| 15.
|
Griffiths, C. S.
1997.
Correlation of functional domains and rates of nucleotide substitution in cytochrome b.
Mol. Phylogenet. Evol.
7:352-365.
|
| 16.
|
Hillis, D. M., and J. P. Huelsenbeck.
1992.
Signal, noise, and reliability in molecular phylogenetic analyses.
J. Hered.
83:189-195.
|
| 17.
|
Hillis, D. M.,
C. Moritz, and B. K. Mable (ed.).
1996.
Molecular systematics, 2nd ed.
Sinauer Associates, Sunderland, Mass.
|
| 18.
|
Howell, R. M.,
J. E. Fitzgibbon,
M. Noe,
Z. Ren,
D. Gocke,
T. A. Schwartzer, and D. T. Dubin.
1991.
In vivo sequence variation of the human immunodeficiency virus type 1 env gene: evidence for recombination among variants found in a single individual.
AIDS Res. Hum. Retrovir.
7:869-876.
|
| 19.
|
Huelsenbeck, J. P., and B. Rannala.
1997.
Phylogenetic methods come of age: testing hypotheses in an evolutionary context.
Science
276:227-232.
|
| 20.
|
Ichimura, H.,
S. C. Kliks,
S. Visrutaratna,
C. Y. Ou,
M. L. Kalish, and J. A. Levy.
1994.
Biological, serological, and genetic characterization of HIV-1 subtype E isolates from northern Thailand.
AIDS Res. Hum. Retrovir.
10:263-269.
|
| 21.
|
Jakobsen, I. B., and S. Easteal.
1996.
A program for calculating and displaying compatibility matrices as an aid in determining reticulate evolution in molecular sequences.
Comput. Appl. Biosci.
12:291-295.
|
| 22.
|
Kampinga, G. A.,
A. Simonon,
P. Van de Perre,
E. Karita,
P. Msellati, and J. Goudsmit.
1997.
Primary infections with HIV-1 of women and their offspring in Rwanda: findings of heterogeneity at seroconversion, coinfection, and recombinants of HIV-1 subtypes A and C.
Virology
227:63-76.
|
| 23.
|
Kishino, H., and M. Hasegawa.
1989.
Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order of the Hominoidea.
J. Mol. Evol.
29:170-179.
|
| 24.
|
Kitayaporn, D.,
S. Vanichseni,
T. D. Mastro,
S. Raktham,
T. Vaniyapongs,
D. C. Des Jarlais,
C. Wasi,
N. L. Young,
S. Sujarita,
W. L. Heyward, and J. Esparza.
1998.
Infection with HIV-1 subtypes B and E in injecting drug users screened for enrollment into a prospective cohort in Bangkok, Thailand.
J. Acquir. Immune Defic. Syndr. Hum. Retrovirol.
19:289-295.
|
| 25.
|
Korber, B.,
B. Hahn,
B. Foley,
J. W. Mellors,
T. Leitner,
G. Myers,
F. McCutchan, and C. L. Kuiken.
1997.
Human retroviruses and AIDS 1997: a compilation and analysis of nucleic acid and amino acid sequences.
Los Alamos National Laboratory, Los Alamos, N.M.
|
| 26.
|
Kostrikis, L. G.,
E. Bagdades,
Y. Cao,
L. Zhang,
D. Dimitriou, and D. D. Ho.
1995.
Genetic analysis of human immunodeficiency virus type 1 strains from patients in Cyprus: identification of a new subtype designated subtype I.
J. Virol.
69:6122-6130.
|
| 27.
|
Kuwata, T.,
Y. Miyazaki,
T. Igarashi,
J. Takehisa, and M. Hayami.
1997.
The rapid spread of recombinants during a natural in vitro infection with two human immunodeficiency virus type 1 strains.
J. Virol.
71:7088-7091.
|
| 28.
|
Lanave, C.,
G. Preparata,
C. Saccone, and G. Serio.
1984.
A new method for calculating evolutionary substitution rates.
J. Mol. Evol.
20:86-93.
|
| 29.
|
Leigh-Brown, A., and P. Monaghan.
1988.
Evolution of the structural proteins of human immunodeficiency virus: selective constraints on nucleotide substitution.
AIDS Res. Hum. Retrovir.
4:399-407.
|
| 30.
|
Leitner, T.,
D. Escanilla,
S. Marquina,
J. Wahlberg,
C. Brostrom,
H. B. Hansson,
M. Uhlen, and J. Albert.
1995.
Biological and molecular characterization of subtype D, G, and A/D recombinant HIV-1 transmissions in Sweden.
Virology
209:136-146.
|
| 31.
|
Liitsola, K.,
I. Tashkinova,
T. Laukkanen,
G. Korovina,
T. Smolskaja,
O. Momot,
N. Mashkilleyson,
S. Chaplinskas,
H. Brummer-Korvenkontio,
J. Vanhatalo,
P. Leinikki, and M. O. Salminen.
1998.
HIV-1 genetic subtype A/B recombinant strain causing an explosive epidemic in injecting drug users in Kaliningrad.
AIDS
12:1907-1919.
|
| 32.
|
Limpakarnjanarat, K.,
K. Ungchusak,
T. D. Mastro,
N. L. Young,
C. Likhityingvara,
O. Sangwonloy,
B. G. Weniger,
C. P. Pau, and T. J. Dondero.
1998.
The epidemiological evolution of HIV-1 subtypes B and E among heterosexuals and injecting drug users in Thailand, 1992-1997.
AIDS
12:1108-1109.
|
| 33.
|
Louwagie, J.,
F. E. McCutchan,
M. Peeters,
T. P. Brennan,
E. Sanders-Buell,
G. A. Eddy,
G. van der Groen,
K. Fransen,
G.-M. Gershy-Damet,
R. Deleys, and D. S. Burke.
1993.
Phylogenetic analysis of gag genes from seventy international HIV-1 isolates provides evidence for multiple genotypes.
AIDS
7:769-780.
|
| 34.
|
Mascola, J. R.,
J. Louwagie,
F. E. McCutchan,
C. L. Fischer,
P. A. Hegerich,
K. F. Wagner,
A. K. Fowler,
J. G. McNeil, and D. S. Burke.
1994.
Two antigenically distinct subtypes of HIV-1: viral genotype predicts neutralization serotype.
J. Infect. Dis.
169:48-54.
|
| 35.
|
McCutchan, F. E.,
A. W. Artenstein,
E. Sanders-Buell,
M. O. Salminen,
J. K. Carr,
J. R. Mascola,
X. F. Yu,
K. E. Nelson,
C. Khamboonruang,
D. Schmitt,
M. P. Kieny,
J. G. McNeil, and D. S. Burke.
1996.
Diversity of the envelope glycoprotein among human immunodeficiency virus type 1 isolates of clade E from Asia and Africa.
J. Virol.
70:3331-3338.
|
| 36.
|
McCutchan, F. E.,
J. K. Carr,
M. Bajani,
E. Sanders-Buell,
T. O. Harry,
T. C. Stoeckli,
K. E. Robbins,
W. Gashau,
A. Nasidi,
W. Janssens, and M. L. Kalish.
1999.
Subtype G and multiple forms of A/G intersubtype recombinant human immunodeficiency virus type 1 in Nigeria.
Virology
254:226-234.
|
| 37.
|
McCutchan, F. E.,
M. O. Salminen,
J. K. Carr, and D. S. Burke.
1996.
HIV-1 genetic diversity.
AIDS
10(Suppl. 3):S13-S20.
|
| 38.
|
Murphy, E.,
B. Korber,
M. Georges-Courbot,
B. You,
A. Pinter,
D. Cook,
M. Kieny,
A. Georges,
C. Mathiot,
F. Barre-Sinoussi, and M. Girard.
1993.
Diversity of V3 region sequences of human immunodeficiency viruses type 1 from the Central African Republic.
AIDS Res. Hum. Retrovir.
9:997-1006.
|
| 39.
|
Nkengasong, J. N.,
W. Janssens,
L. Heyndrickx,
K. Fransen,
P. M. Ndumbe,
J. Motte,
A. Leonaers,
M. Ngolle,
J. Ayuk,
P. Piot, et al.
1994.
Genotypic subtypes of HIV-1 in Cameroon.
AIDS
8:1405-1412.
|
| 40.
|
Posada, D., and K. A. Crandall.
1998.
MODELTEST: testing the model of DNA substitution.
Bioinformatics
14:817-818.
|
| 41.
|
Rambaut, A., and N. C. Grassly.
1997.
Seq-Gen: an application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees.
Comput. Appl. Biosci.
13:235-238.
|
| 42.
|
Robertson, D. L.,
B. H. Hahn, and P. M. Sharp.
1995.
Recombination in AIDS viruses.
J. Mol. Evol.
40:249-259.
|
| 43.
|
Robertson, D. L.,
P. M. Sharp,
F. E. McCutchan, and B. H. Hahn.
1995.
Recombination in HIV-1.
Nature
374:124-126.
|
| 44.
|
Rodrigo, A. G.,
P. C. Goracke,
K. Rowhanian, and J. I. Mullins.
1997.
Quantitation of target molecules from polymerase chain reaction-based limiting dilution assays.
AIDS Res. Hum. Retrovir.
13:737-742.
|
| 45.
|
Rodrigo, A. G., and J. I. Mullins.
1996.
HIV-1 molecular evolution and the measure of selection.
AIDS Res. Hum. Retrovir.
12:1681-1685.
|
| 46.
|
Sabino, E. C.,
E. G. Shpaer,
M. G. Morgado,
B. T. M. Korber,
R. Diaz,
V. Bongertz,
S. Cavalcante,
B. Galvao-Castro,
J. I. Mullins, and A. Mayer.
1994.
Identification of human immunodeficiency virus type 1 envelope genes recombinant between subtypes B and F in two epidemiologically linked individuals in Brazil.
J. Virol.
68:6340-6346.
|
| 47.
|
Saksena, N. K.,
B. Wang,
Y. C. Ge,
S. H. Xiang,
D. E. Dwyer, and A. L. Cunningham.
1997.
Coinfection and genetic recombination between HIV-1 strains: possible biological implications in Australia and South East Asia.
Ann. Acad. Med. Singapore
26:121-127.
|
| 48.
|
Salminen, M. O.,
J. K. Carr,
D. S. Burke, and F. E. McCutchan.
1995.
Identification of breakpoints in intergenotypic recombinants of HIV-1 by bootscanning.
AIDS Res. Hum. Retrovir.
11:1423-1425.
|
| 49.
|
Salminen, M. O.,
J. K. Carr,
D. L. Robertson,
P. Hegerich,
D. Gotte,
C. Koch,
E. Sanders-Buell,
F. Gao,
P. M. Sharp,
B. H. Hahn,
D. S. Burke, and F. E. McCutchan.
1997.
Evolution and probable transmission of intersubtype recombinant human immunodeficiency virus type 1 in a Zambian couple.
J. Virol.
71:2647-2655.
|
| 50.
|
Sawyer, S.
1989.
Statistical tests for detecting gene conversion.
Mol. Biol. Evol.
6:526-538.
|
| 51.
|
Shao, Y.,
L. Su,
F. Zhao,
H. Xing, et al.
1998.
Genetic recombination of HIV-1 strains identified in China, abstr. 11179, p. 429.
In
12th World AIDS Conference, Geneva, Switzerland.
|
| 52.
|
Shao, Y.,
F. Zhao,
W. Yang, et al.
1999.
The identification of recombinant HIV-1 strains in IDUs in Southwest and Northwest China.
Chin. J. Exp. Clin. Virol.
13:109-112.
|
| 53.
|
Shimodaira, H., and M. Hasegawa.
1999.
Multiple comparisons of log-likelihoods with applications to phylogenetic inference.
Mol. Biol. Evol.
16:1114-1116.
|
| 54.
|
Siepel, A. C.,
A. L. Halpern,
C. Macken, and B. T. M. Korber.
1995.
A computer program designed to rapidly screen for HIV-1 intersubtype recombinant sequences.
AIDS Res. Hum. Retrovir.
11:1413-1416.
|
| 55.
|
Smith, J. M.
1992.
Analyzing the mosaic structure of genes.
J. Mol. Evol.
34:126-129.
|
| 56.
|
Stephens, J. C.
1985.
Statistical methods of DNA sequence analysis: detection of intragenic recombination or gene conversion.
Mol. Biol. Evol.
2:539-556.
|
| 57.
|
Subbarao, S.,
K. Limpakarnjanarat,
T. D. Mastro,
J. Bhumisawasdi,
P. Warachit,
C. Jayavasu,
N. L. Young,
C. C. Luo,
N. Shaffer,
M. L. Kalish, and G. Schochetman.
1998.
HIV type 1 in Thailand, 1994-1995: persistence of two subtypes with low genetic diversity.
AIDS Res. Hum. Retrovir.
14:319-327.
|
| 58.
|
Swofford, D. L.
1999.
PAUP 4.0: Phylogenetic analysis using parsimony (and other methods), 4.0b2a ed.
Sinauer Associates, Inc., Sunderland, Mass.
|
| 59.
|
Thompson, J. D.,
D. G. Higgins, and T. J. Gibson.
1994.
CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.
Nucleic Acids Res.
22:4673-4680.
|
| 60.
|
Triques, K.,
A. Bourgeois,
N. Vidal,
E. Mpoudi-Ngole,
C. Mulanga-Kabeya,
N. Nzilambi,
N. Torimiro,
E. Saman,
E. Delaporte, and M. Peeters.
2000.
Near-full-length genome sequencing of divergent African HIV type 1 subtype F viruses leads to the identification of a new HIV type 1 subtype designated K.
AIDS Res. Hum. Retrovir.
16:139-151.
|
| 61.
|
UNAIDS.
1997.
UNAIDS/WHO Working Group on Global HIV/AIDS and STD Surveillance Report on the global HIV/AIDS epidemic.
World Health Organization, Geneva, Switzerland.
|
| 62.
|
Wasi, C.,
B. Herring,
S. Vanichseni,
S. Raktham,
T. D. Mastro,
N. L. Young,
H. Rübsamen-Waigmann,
H. von Briesen,
M. L. Kalish,
C.-C. Luo,
C.-P. Pau,
A. Baldwin,
J. I. Mullins,
E. L. Delwart,
J. Esparza,
W. L. Heyward, and S. Osmanov.
1995.
Determination of HIV-1 subtypes in injecting drug users in Bangkok, Thailand, using peptide binding enzyme immunoassay and the heteroduplex mobility assay: evidence of increasing infection with HIV-1 subtype E.
AIDS
9:843-849.
|
| 63.
|
Weiller, G. F.
1998.
Phylogenetic profiles: a graphical method for detecting genetic recombinations in homologous sequences.
Mol. Biol. Evol.
15:326-335.
|
| 64.
|
Weniger, B.
1994.
Experience from HIV incidence cohorts in Thailand: implications for HIV vaccine efficacy trials.
AIDS
8:1007-1010.
|
| 65.
|
WHO Working Group.
1995.
The HIV/AIDS pandemic: 1995 overview. Global Programme on AIDS.
World Health Organization, Geneva, Switzerland.
|