Journal of Virology, February 2006, p. 1367-1375, Vol. 80, No. 3
0022-538X/06/$08.00+0 doi:10.1128/JVI.80.3.1367-1375.2006
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
Section of Virology, Department of Medical Sciences, Uppsala University, SE-751 85 Uppsala, Sweden,1 Unit of Physiology, Department of Neuroscience, Uppsala University, SE-751 23 Uppsala, Sweden2
Received 3 August 2005/ Accepted 14 November 2005
In this paper, we demonstrate unique sequences in the human and chimpanzee genomes. We found a difference in recent activity between the beta-like and gamma-like retroviruses. This difference applied to both genomes inversely, with one group expanding in each genome. It indicates the importance of environmental factors and random reactivation events in preexisting elements for determining the retroviral genetic setup. Several cross-species transfers of nonhuman, nonchimpanzee primate gamma-like retroviruses to chimpanzee have occurred since the Homo-Pan sp. split.
FIG. 1. Pol NJ phylogeny of recently integrated hg16 proviruses (indicated by white boxes), PanTro1 proviruses (indicated by black circles), and reference retroviral sequences using the entire proteins or "puteins." Shadows indicate the different groups discussed in the text. Abbreviations are spelled out in the last paragraph of Materials and Methods. Bootstrap supports for the corresponding NJ cladogram rooted on midpoint are presented in Fig. S1 in the supplemental material.
Sequences and accession numbers. The genome sequences for human (hg16) and chimpanzee (PanTro1) were retrieved from the UCSC Genome Browser (http://genome.ucsc.edu/).
GenBank accession numbers or chromosomal positions in hg16 for reference (putein) sequences used in the analysis were as follows: avian leukosis virus (ALV) (NC001408), Rous sarcoma virus (RSV) (NC001407), mouse mammary tumor virus (MMTV) (NC001503), Mason-Pfizer monkey virus (MPMV) (NC001550), Jaagsiekte sheep retrovirus (JSRV) (M80216), HML1 (chr19-21849393), HML2 (chr11-101600013), HML3 (chr1-48344461), HML4 (chr8-75679221), HML5 (AC004536), HML6 (consensus), HML7 (chr6-121300220), HML8 (chr3-131452286), HML9 (chr9-62700428), HML10 (chr6-32017925), HERV-H (consensus), HERV-H/RGH2 (D11078), HERV-H/RTVLH2 (M18048), HERV-Fc1 (AL354685), HERV-Fc2 (AC019088), HERV-W (chr7-9105739), ERV9 (AC073410), ERV3 (chr7-63865366), HERV-E (M10976), murine leukemia virus (MLV) (NC001501), Moloney murine leukemia virus (MoMLV) (AF033811), baboon endogenous retrovirus (BaEV) (D10032), gibbon ape leukemia virus (GaLV) (M26927), HERV-ADP (AC005741), HERV-FRD (AC004022), HERV-I (chr16-72821350), HERV-T (chr14-104635791), HERV-S (AC004385), feline leukemia virus (FLV) (NC001940), porcine endogenous retrovirus (PERV) (AJ293656), walleye dermal sarcoma virus (WDSV) (NC001867), Xenopus laevis endogenous retrovirus (Xen1) (AJ506107), snakehead fish retrovirus (SnRV) (NC001724), bovine leukemia virus (BLV) (NC001414), human T-cell leukemia virus 1 (HTLV-1) (NC001436), HTLV-2 (NC001488), Gypsy (AJ000387), HERV-L (RepBase), human spumaretrovirus (HSRV) (AF033816), human foamy virus (HFV) (NC001736), MER4like (chr13-54208300), HERVL66 (RepBase), HERVL74 (RepBase), HERVL40 (RepBase), and Python molurus endogenous retrovirus (AAN77283).
TABLE 1. Properties of recent ERV integrations in the human (hg16) and chimpanzee (PanTro1) genomes
TABLE 2. Grouping and characterization of recent Pol-containing ERV integrations in the Homo (hg16) and Pan (PanTro1) genomes
In the Homo sp., one betaretroviral and one gammaretroviral group were detected (Fig. 1). The groups were selected using an 80% Pol similarity criterion. All recent human betaretroviruses were members of the HERV-K(HML2) group (2), hereafter called "HML2." They were 97.5 to 99.9% similar to each other in Pol (see the similarity matrix in Fig. S2 in the supplemental material) and 98.7 to 99.8% similar to an HML2 consensus sequence (V. Blikstad, G. O. Sperber, and J. Blomberg, unpublished). This homogeneous subgroup of HML2 corresponds to what was earlier named "human-specific HERV-K" (4, 38, 39). All recent human gammaretroviruses grouped within the HERV-H-like group (28).
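The grouping rule mentioned above can be illustrated with a small sketch. It assumes that grouping by the 80% Pol similarity criterion works as single linkage: two sequences fall into the same group if a chain of pairwise similarities of at least 80% connects them. The article does not spell out the linkage rule, so this is one possible reading. The function name, the sequence labels, and the similarity values are hypothetical and are not taken from Fig. S2.

```python
def group_by_similarity(names, similarity, threshold=80.0):
    """Single-linkage grouping on a pairwise percent-similarity table.

    names: list of sequence labels.
    similarity: dict mapping frozenset({a, b}) to percent similarity.
    Returns a list of sets, one per group, ordered by first member in names.
    """
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links up to the group representative.
        while parent[name] != name:
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if similarity[frozenset((a, b))] >= threshold:
                parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for name in names:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical similarity values, for illustration only.
toy_names = ["seqA", "seqB", "seqC", "seqD"]
toy_similarity = {
    frozenset(("seqA", "seqB")): 92.0,
    frozenset(("seqA", "seqC")): 78.0,  # below threshold, joined via seqB
    frozenset(("seqA", "seqD")): 55.0,
    frozenset(("seqB", "seqC")): 85.0,
    frozenset(("seqB", "seqD")): 60.0,
    frozenset(("seqC", "seqD")): 58.0,
}
toy_groups = group_by_similarity(toy_names, toy_similarity)
print(toy_groups)
```

In this toy table seqA and seqC are only 78% similar, yet they end up in one group because both are at least 80% similar to seqB. A stricter rule (for example, requiring every pair in a group to pass the threshold) would split them.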
In the Pan sp., the recent integrations were dominated by 27 gammaretroviral sequences, but there was also one betaretroviral sequence (Fig. 1). The single recent betaretroviral Pan sp. sequence was assigned to the HML2 group, based on 98.8% Pol similarity to the HML2 consensus Pol and the dendrogram position (Fig. 1). Among the recent gammaretroviral Pan sp. integrations, two were similar to the HERV-H-like group (28).
We defined two major sequence groups, PtG1 (18 PtG1a elements and 3 PtG1b elements) and PtG2 (3 PtG2a elements and 1 PtG2b element), which had no similarity to other chimpanzee or human proviruses (Fig. 1). The names were derived from Pan troglodytes (Pt) and gammaretrovirus-like (G). The numerals refer to the sequences joined by the 80% Pol similarity criterion (distance matrix in Fig. S2 of the supplemental material). Subgroups (a, b, etc.) derive from seemingly monophyletic branches within the groups (Fig. 1 and Fig. S1 to S3 in the supplemental material). We favor the use of this Pol similarity limit together with data from phylogenetic analyses in grouping ERVs over the use of phylogenetic analyses alone. The similarity criterion is unambiguously related to evolutionary distance, regardless of exogenous (with higher evolutionary rate) or endogenous (with lower evolutionary rate) retroviral states. Classification based on phylogenetic branching is relative and can split closely related retroviruses into separate clades, depending on the selection of included sequences. The rapidly growing number of retroviral sequences will facilitate classification (29). The PtG nomenclature may have to be revised then. The recently described PtERV1 (61), whose sequence was kindly provided by the group of Evan Eichler, clustered together with our PtG1a subgroup, with >80% (median, 89%) Pol similarity (Fig. S2 in the supplemental material). The average Pol similarities within the respective subgroups were 92% (PtG1a), 92% (PtG1b), 93% (PtG2a), and 82% in the HERV-H-like group (Fig. S2 in the supplemental material). The branch orders were essentially the same for the gammaretroviral sequences in an NJ cladogram and a maximum-parsimony (MP)-derived cladogram (Fig. S3 in the supplemental material). The minor inconsistencies within and at the borderlines of the subgroups of PtG1 and PtG2 are reflected in the somewhat lower MP bootstrap values. As in Fig. 1, the monophyletic PtG2a is paraphyletic to PtG2b in the additional NJ and MP cladograms (Fig. S3 in the supplemental material). In an attempt to increase the resolution for the gammaretroviruses and the PtG2 subgroups, we included additional nonmammalian gamma-like RT sequences (22) in the analysis. Using these shorter sequences, the early gammaretrovirus-like branch topology in the RT tree was consistent with the Pol tree in Fig. 1 (data not shown), although with less confidence. The recent PtG2b provirus came out in an ancestral position to HERV-E. Currently, no additional Pol sequences are available to further "pin down" the relationships of the PtG2b element to other gamma-like retroviruses.
PtG1a and 1b (Fig. 1) could be treated as a group according to the similarity matrix (Fig. S2 in the supplemental material). However, an exception to the otherwise robust PtG groups is the Papio anubis (clone AC091754) provirus, which groups inconsistently in the different analyses (see Fig. S1 to S3 in the supplemental material). This may theoretically be caused by convergent evolution, or recombination, within the two PtG groups and AC091754. The gammaretrovirus-like PtG1 group is distinct from the ICTV-defined gammaretroviruses (including MLV and BaEV), despite sequence similarity to another baboon ERV. There are thus distinct gammaretrovirus-like baboon ERVs as well (see below).
Horizontal transfers. There were signs of cross-species transfer involving the PtG retroviruses. We performed tBLASTn searches against the whole nonredundant database at GenBank with a consensus of the aligned PtG1a Pol sequences and with original Pol sequences to represent PtG1b, PtG2a, and PtG2b (Table 2). A few primate nonchimpanzee sequences, yielding high tBLASTn scores (data not shown), proved to be closely similar to the Pol of each PtG group (Table 2). They grouped as novel gammaretroviruses in Fig. 1. Also, they had typical gammaretroviral gene structures, with gag, pro, and pol in one reading frame separated from env and no obvious accessory genes (see Fig. S4 in the supplemental material). It was further noted that the human retroviral sequences had primer binding sites (PBSs) complementary to either Lys-tRNA in HML2 or His-tRNA in the HERV-H-like group, while detected PBSs in the recent Pan sp. ERV groups (PtG1 and PtG2) were complementary to Pro-tRNA (Table 2). Other primate gammaretroviral sequences, like HUERSP3 and a number of ERV3-like sequences (1), also have a Pro-PBS but were only 50 to 76% similar in Pol to PtG1 (data not shown).
The nonchimpanzee sequences were also included in the Pol similarity matrix (Fig. S2 in the supplemental material). Both PtG groups were >80% similar in Pol to both murine and feline leukemia viruses (Fig. 1). The PtG1 elements were highly (84 to 96%) similar in Pol to two previously not described ERVs: a baboon (Papio anubis) sequence, clone AC093133, and a macaque (Macaca mulatta) sequence, clone AC148703 (Fig. S2 in the supplemental material and Table 2). The PtG2 elements were similar to the baboon endogenous retrovirus (BaEV) and to a Papio cynocephalus sequence, clone AF142988 (Fig. S2 in the supplemental material). The three PtG2a elements were approximately 96% similar to each other (Fig. S2 in the supplemental material). They were more than 92% similar to BaEV Pol (56) and to the Papio cynocephalus ERV Pol (Fig. 1 and Fig. S2 in the supplemental material). Further, they were 85% similar to MLV Pol. The single PtG2b sequence was more separate in the retroviral tree (Fig. 1) but was 93% similar to a Papio anubis ERV in clone AC091754 and 88% similar to MLV. A supplementary comparison of the PtG groups with a range of previously described BaEV-related Pol (RT, 108 amino acids) sequences (55), obtained through the courtesy of Antoinette van der Kuyl, showed that the PtG1 group also was similar to Papio and Colobus ERVs other than those described in Table 2 (see Fig. S5 in the supplemental material). The PtG2a elements grouped within the BaEV (55) clade. The PtG2b element grouped together with the Papio anubis sequence, however inconsistently (see above) (Fig. S1 to S3 in the supplemental material). Although a recent integration, it came out in an ancestral position relative to much older gamma-like HERVs, relatively close to HERV-E (Fig. 1 and Fig. S1 to S3 in the supplemental material). This could mean that more-or-less close relatives of gammaretrovirus-like HERVs may still be spreading among primates.
Subsequent LTR analysis (Fig. S5 in the supplemental material) showed that the PtG1 elements were highly similar (average, 88% nucleotide identities using pairwise deletion) to the chimpanzee LTR homologues of the colobus CPC-1 proviruses described by Bonner et al. (10). PtG1 LTRs were also similar to the macaque MAC-1 LTR (average, 80%) but less similar to the colobus CPC-1 LTR (average, 58%). However, the reference LTR sequences were not full-length and therefore merely indicate kinship between the ERVs. Due to separate ERV data sets and genes, a strict comparison between the Pol and LTR phylogenies (Fig. S5 in the supplemental material) is not possible. What is clear is that there are several novel sequences in the chimpanzee. They were probably transmitted to chimpanzees several times, in a complex way, from other primates in the recent past. We here demonstrate close similarities of the PtG groups to at least three different primate ERV groups, including BaEV.
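The phrase "nucleotide identities using pairwise deletion" can be made concrete with a short sketch. For each pair of aligned sequences, only the alignment columns in which either member of that pair has a gap are dropped before identity is computed. The function name and the two toy sequences are hypothetical; the article's own alignment and software settings are not reproduced here.

```python
def identity_pairwise_deletion(seq_a, seq_b, gap="-"):
    """Percent nucleotide identity between two aligned sequences.

    Columns where either sequence carries a gap are skipped for this pair
    only (pairwise deletion), so other pairs in the same alignment may use
    a different set of columns.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == gap or y == gap:
            continue  # drop this column for this pair
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only.
toy_a = "ACGT-ACGTA"
toy_b = "ACCTTAC-TA"
toy_identity = identity_pairwise_deletion(toy_a, toy_b)
print(toy_identity)
```

Two of the ten columns contain a gap and are skipped, leaving eight compared positions with one mismatch. With truncated reference LTRs, as noted in the text, the number of comparable columns can be small, which is one reason such values only indicate kinship.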
The PtG1a, the PtERV1 (61), and the CPC-1 elements may derive from the same virus. CPC-1 was earlier described to occur in chimpanzee and gorilla but not in human, gibbon, and orangutan. Further, it was supposed to have transmitted to chimpanzees from colobus (10). Although mostly based on hybridization data derived from the presequencing era, that information on type C retroviral sequences in chimpanzees (6, 12) supports our findings. It was here corroborated and extended by a bioinformatic approach. The transmission hypothesis gained support from the (however limited) Pol and LTR phylogenies (Fig. S5 in the supplemental material), in addition to the pairwise LTR similarities from that alignment (see above).
Based on gag and env analyses, Yohn et al. recently showed that the PtERV1 elements (similar to our PtG1a group in the Pol analysis) (Fig. S2 in the supplemental material) may possibly have more than one origin (61). Exogenous retroviruses from at least three host groups, including (i) chimpanzee together with gorilla, (ii) baboon, and (iii) macaque, were suggested to have contributed. Their phylogenetic trees differed from generally accepted primate species trees, thus indicating horizontal transfer (61). Our data agree with those results. Although theoretically transmissions could first have reached chimpanzees and then gone from chimpanzees to other primates, the simplest explanation is transmission of nonhuman, nonchimpanzee primate gamma-like retroviruses to chimpanzee since the human-chimpanzee split. The presence of the BaEV-like PtG2 proviruses in Pan sp. but not Homo sp., and the presence of additional PtG2 LTR BLAST hits in PanTro1 but not in hg16, signifies transmissions after the Homo-Pan speciation (Table 2). As noted by van der Kuyl et al., BaEV-like viruses spread among African primates, and probably also to cats, in recent evolutionary time (54-56). In fact, MLV-like gammaretroviruses infected, and occasionally became endogenized in, a number of mammals during this period (11, 37, 49). This is an ongoing process, demonstrated by the widespread polymorphism of MLV integrations among inbred mice (46, 47, 52, 53). The inferred pattern of horizontal spread of the PtG groups is similar to that of simian immunodeficiency virus SIVcpz (3) and human immunodeficiency virus (20), which most likely arose by transfer from smaller primates to chimpanzee and from chimpanzee to human, respectively. Based on the phylogenetic analysis (Fig. S5 in the supplemental material), the PtG2 elements were, like SIVcpz (3), judged to originate from one or several small primates. 
The relative scarcity of primate sequences prevented the demonstration of a specific transmission route(s) for the PtG1 elements. They were similar to baboon and macaque sequences. The macaque retroviral similarity was unexpected. African macaques are present in only a small North African population, remote from the ancestral chimpanzee habitats in forests and forested savanna. Consequently, they are unlikely to have been in close contact with the chimpanzees. However, the possibility that macaques and chimpanzees have overlapped geographically in the past should not be excluded. The relatedness of PtG1 to macaque proviral sequences may represent similarity to widely distributed non-ape primate retroviruses. Our observations are thus consistent with the existence of a network of relatively frequent horizontal retroviral transmissions, followed by occasional endogenization, among primates. Exposure of wounds to prey blood during predation or eating of retrovirus-rich placentae (3, 20, 54) are possible explanations. Interspecies retroviral transmission via blood-sucking insects may also occur (58). Finally, it cannot be excluded that retrotransposon DNA may be integrated after uptake from the alimentary canal (19). Chimpanzees frequently eat other primates, like baboons, geladas, and colobi (21, 45), which harbor ERVs similar to the PtG retroviral sequences. The probable cases of horizontal transfer of gammaretroviruses to chimpanzees thus agree with the predatory practices of chimpanzees. It is, however, intriguing that the human genome seems to have been spared from the PtG integrations. Human ancestors and chimpanzees may have been differentially exposed through differing hunting practices (13, 40, 45). Differences in ERV fixation due to population size and distribution could also be reasons.
Differences in recent activity between beta- and gamma-like retroviruses. If LTR divergences for all RetroTector-derived ERVs are used as surrogate markers for integration times, an expansion of a limited number of human HML2 (i.e., hg16-beta) integrations appears to have started around 1.5 to 2 million years ago (approximately 0.5% LTR nonidentities) (Fig. 2). The expansion may still be ongoing, because the curve peaks at 0% LTR divergence. However, LTR-LTR homogenization by gene conversion could lead to falsely low LTR divergences, precluding an exact interpretation (24, 30). The betaretrovirus-like HML2 is a relatively large HERV group, of which the majority is common to Homo and Pan spp. (data not shown). A simple explanation for the recent HML2 expansion in humans could be back mutation to replication competence ("breakout"; see below). Even if these recently integrated HML2 elements have been labeled "human specific" (4, 38, 39), the sequence record of primates and other possible contributors of HMLs is incomplete. Cross-species transmissions of HML2 thus cannot be entirely excluded. Figure 2 indicates that the gammaretroviral integrational activity was separate in time from that of betaretroviruses. The recent gammaretroviral integrations of both species differ from the HML2 ones by containing more disrupted genomes yet with low LTR differences (Table 2). Retroviruses may differ substantially in mutation frequency (23, 31, 44). Although the frequency of gene conversion is highly dependent on sequence similarity, the low LTR divergences in otherwise disrupted, and thus probably ancient, gammaretroviral elements may have been caused by LTR homogenization (24, 30).
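The use of LTR divergence as a surrogate for integration time can be sketched numerically. The two LTRs of a provirus are identical at integration and then accumulate substitutions independently, so an age estimate is the divergence divided by twice the per-site substitution rate. The rate values below are assumptions chosen for illustration; this section does not state which rate was applied, and the function name is hypothetical.

```python
def integration_age_years(ltr_divergence, rate_per_site_per_year):
    """Estimate provirus age from the divergence between its two LTRs.

    ltr_divergence: fraction of differing sites (0.005 means 0.5%).
    rate_per_site_per_year: assumed neutral substitution rate.
    The factor 2 reflects that both LTRs mutate independently.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return ltr_divergence / (2.0 * rate_per_site_per_year)


# Assumed rates (not from the article), applied to 0.5% LTR nonidentity.
for assumed_rate in (1.25e-9, 1.67e-9):
    age = integration_age_years(0.005, assumed_rate)
    print(f"rate {assumed_rate:.2e}: about {age / 1e6:.2f} million years")
```

Under these assumed rates, 0.5% divergence corresponds to roughly 1.5 to 2 million years, which matches the range quoted in the text. Gene conversion between LTRs lowers the observed divergence and would make such an estimate too young, as the paragraph cautions.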
FIG. 2. LTR differences in recently integrated beta-like (HML2 group) and gamma-like ERVs in the human (hg16) and chimpanzee (PanTro1) genomes.
Thus, both the human and chimpanzee genomes have been subjected to different kinds of recent retroviral integrations. A BLAT search for LTRs of recent unique ERVs with a stringent criterion (>98% of maximum BLAT score, using either of the 5' and 3' LTRs) resulted in numerous hits (Table 2), but only in the cognate genome. This also applies to the otherwise homogeneous HML2 group. Consequently, in the past, there must have been many more integrations of these elements in the chimpanzee and human genomes than the currently residing ones (Table 2). They may have become looped out through homologous recombination between the LTRs, as postulated previously (48). PCR would be a suitable method to address the amount of solitary LTRs but was out of scope for this study. It is also noteworthy that a BLAT search with LTRs of the selected human HERV-H-like elements resulted in fewer hits (Table 2), suggesting that retroviral RNAs with mutated R, U5, and U3 LTR portions resulted in these integrations, as expected from the "midwife" master model (28, 36). HERV-H probably did not reintegrate recently to the same extent as the HML2 elements did (Fig. 2 and Table 1). It is likely that, for more than 30 million years, the HML2 group multiplied mainly through reinfections rather than cis retrotransposition or trans complementation (5). The high Pol similarities among the recent HML2 elements (approximately 98% [Fig. S2 in the supplemental material]), together with the low (<2%) LTR divergence, indicate a common origin after Homo-Pan speciation. LTR homogenizations by gene conversions are unlikely to have occurred simultaneously in all different HML2 loci (Table 2) after the Homo-Pan sp. split, which concurs with the uniqueness of these ERVs in Homo sp. The recent expansion of highly related HML2 integrations (Fig. 2) may have derived either (i) from a random mutational activation of a slightly damaged, preexisting HML2 element or (ii) by reinfection of humans with HML2 from another species. We currently favor the first hypothesis, since there is no known source of infectious HML2 in animals close to humans or human predecessors. According to the "breakout" hypothesis (7), copackaged RNA of partially defective ERV elements occasionally may recombine, thereby rescuing and optimizing retroviral function during reinfections from within. HML2 elements are in general the most complete of all HERVs. Old, relatively intact HML2 elements could have assisted in a stochastic fashion in the recent HML2 activation in Homo sp. and done so to a lesser extent in Pan sp. This possibility should be further investigated.
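The stringent criterion used in the LTR search (>98% of the maximum BLAT score) amounts to a simple score filter, sketched below. The sketch assumes that "maximum BLAT score" means the score the query LTR would obtain against a perfect copy of itself; the text does not define it further. The `Hit` record, its fields, and all numbers are hypothetical and do not reproduce BLAT's actual output format.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    chrom: str
    start: int
    score: int


def stringent_hits(hits, max_score, fraction=0.98):
    """Keep hits scoring strictly above `fraction` of the maximum score."""
    cutoff = fraction * max_score
    return [hit for hit in hits if hit.score > cutoff]


# Hypothetical hits for one query LTR with an assumed maximum score of 968.
toy_hits = [
    Hit("chr1", 1_200_000, 968),   # the provirus's own LTR
    Hit("chr5", 8_450_000, 955),   # near-identical LTR elsewhere
    Hit("chr9", 3_310_000, 940),   # just below the cutoff of 948.64
    Hit("chrX", 77_000_000, 600),  # distant relative
]
kept = stringent_hits(toy_hits, max_score=968)
print([hit.chrom for hit in kept])
```

Running the same filter with the 5' LTR and the 3' LTR as queries and pooling the results would correspond to "using either of the 5' and 3' LTRs" in the text. Counting kept hits that lack an adjacent provirus would then give a rough tally of solitary LTRs.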
Conclusion and a caveat. Inevitably, though the screening strategy is likely to detect most young integrations, the genome may contain an undefined portion of older elements where gene conversion caused LTR homogenization. If supplemented with a check for uniqueness (i.e., ERVs occurring in only one or the other species) against the next genome, as conducted here, the approach should correctly detect recent integrations which are unique to humans or chimpanzees. However, a complication is that the assessment of uniqueness for Pan versus Homo sp. elements is stronger than vice versa due to the poorer Pan sp. draft sequence quality. Eventually, this could lead to an erroneous impression of uniqueness for a human sequence. The matching of flanks of the seemingly human-specific HERV-H integrations into the chimpanzee sequence was not convincing. They were therefore not underlined in Table 2. The low number of LTRs with high similarity to the suspected human unique HERV-H proviruses (Table 2) is consistent with the "midwife" master hypothesis (28, 36), because (re)integration-competent proviruses would be more likely to give such single LTRs, while copackaged defective ones would not. Thus, the human unique HERV-H-like sequences (Fig. 1 and Table 2) eventually will need additional experimental analysis. Allelic variation and deletions are additional obstacles. However, precise deletions of proviruses are unlikely to occur (35), especially in the many different loci on different chromosomes presented here. Thus, it is more likely that our observed genomic ERV differences are the results of gain rather than loss. Further, as shown here, LTRs corresponding to the unique proviruses occur frequently but only in the cognate genome. In the BLAT search, they outnumber their proviral counterparts (Table 2).
Instead of precise proviral deletions, the higher number of recognized LTRs shows a more likely event of ERV loss through homologous recombination and looped-out proviruses (48). Allelic variation cannot be covered in the single-sequence genome assemblies. A locus-specific PCR, preferably with many individuals, could be used to address this problem but was out of the scope of this study. The false LTR similarity (gene conversion) and false uniqueness problems were addressed by bioinformatic means, as discussed above.
The numerous recent species-unique proviruses, and the larger number of similar species-unique (mainly solitary) LTRs, show that both Homo and Pan sp. genomes have distinct sets of recently active ERVs. The comparison of retroviral sequences in Homo and Pan sp. genomes highlights the importance of (i) habitat, interspecies contact, and predator-prey relations facilitating cross-species retroviral infection from "outside" and/or (ii) probable stochastic reactivation of preexisting ERVs followed by reinfection from "inside" as determinants of the retroviral genetic setup of a species.
We also thank Antoinette van der Kuyl and Evan Eichler for sequence contributions, Tove Airola for assistance in genomic data collection, and Michael Tristem for valuable discussions.
Supplemental material for this article may be found at http://jvi.asm.org/.