Journal of Virology, November 2003, p. 12363-12368, Vol. 77, No. 22
0022-538X/03/$08.00+0 DOI: 10.1128/JVI.77.22.12363-12368.2003
Copyright © 2003, American Society for Microbiology. All Rights Reserved.
Paul-Ehrlich-Institut, D-63225 Langen, and Robert Koch-Institut, D-13353 Berlin, Germany
Received 17 June 2003/ Accepted 10 August 2003
Endogenous retroviruses (ERV) are present in the genomes of all vertebrate species analyzed (8), and their general organization corresponds to that of exogenous retroviruses (8, 35, 41). Although multiple copies of ERV are present in their host genomes, most are truncated or mutated, rendering them replication incompetent. Nevertheless, some of the viral integration sites still display transcriptional activity and even produce viral particles by complementation in trans. Only a minority of the distinct proviruses are functional, as reported for pigs (1, 2, 20, 25, 28, 29). While porcine endogenous retroviruses (PERV) belong to the gamma-retroviruses, other ERV also share homologies with beta- and delta-retroviruses but not with lentiviruses (12, 16, 28, 41).
For PERV, three different classes, designated PERV-A, PERV-B, and PERV-C (1, 22), exist. PERV-A and PERV-B are polytropic and productively infect human cells in vitro, thus posing a serious risk in xenotransplantation and xenogeneic cell therapies. Ecotropic PERV-C (1) does not replicate in human cells and is therefore not included in this study. There are only minor genetic differences between the classes, which are most prominent in the receptor binding domain of the Env protein (Fig. 1C and D). In addition, there are two different types of long terminal repeats (LTR) (Fig. 1B) that significantly affect the replication properties of individual viruses (33) by recruiting a specific set of transcription factors (32). Both PERV-A and PERV-B proviruses can carry LTRs that harbor repeats in U3, whereas repeatless LTRs were found in PERV-A and PERV-C (33). LTRs displaying repeat structures and possible multimerization mechanisms were first described for murine retroviruses (23, 43). These retroviruses, along with other gamma-type retroviruses, share a homology of approximately 60% with each other and with PERV. To avoid complicating the analysis, we excluded these viruses from this study and concentrated on determining the relative age of the two different PERV LTR types, which are not found in any other virus. It was shown that PERV in cell culture actively adapt their LTR repeat structure, if present, to match the optimal but not the maximum replication performance in a given host cell (33). Conversely, the recombinant incorporation of an artificially created repeat structure, i.e., 10 repeats of 39 bp, into U3 of the 5' LTR of a molecular clone caused rapid cell death after transfection into susceptible cells (Scheef and Tönjes, unpublished data).
While different env isoforms are present in a variety of retroviruses (7), only PERV is known to harbor two related but profoundly different LTR types in addition to three env classes.
FIG. 1. Genomic organization of proviral PERV. (A) PERV displays genes for group specific antigen (gag), protease/polymerase polyprotein (pro/pol), and envelope protein (env), flanked by LTR. Both LTR and env vary significantly between individual proviruses, with the LTR determining the transcriptional activity and the envelope protein determining the host range with two polytropic virus classes, designated A and B. The gag and pol genes show only minor variations between proviruses. (B) Two different PERV LTR structures exist, one type harboring different numbers of a distinct 39-bp repeat in U3 (I to III) which is composed of subrepeat I (18 bp) and subrepeat II (21 bp) (26, 33). The second LTR type has no repeat structure, although sequences homologous to the subrepeats (designated Ia and IIa) can be found scattered across U3. The numbers given for each LTR designate the presence of proviral PERV LTR found in 21 BAC clones of a genomic large white library (26, 30). (C and D) Entropy plots for PERV envelope genes (panel C, PERV-A; panel D, PERV-B). For calculating entropy, all env sequences (see Appendix) were aligned, and the number of nucleotide differences was determined for every position. This difference is plotted against the position, thus indicating regions of high sequence variation. The sequence variation at the 3' end of both env genes is caused by two factors. Since sequences used for entropy analysis were taken unmodified from GenBank, some truncated sequences led to an overall increase in diversity. The more important fact, however, is a structural divergence of the R-peptide structure of various PERV. This divergence corresponds in large part, but not exclusively, with the LTR structure (see "Phylogenetic analysis of env" in the text for details). 
Broadly striped arrowhead, hot spot with very high sequence variation; finely striped arrowhead, variations at the C terminus of Env; cap, cap site; SD, splice donor; PBS, primer binding site; SA, splice acceptor; ppt, polypurine tract; p(A), polyadenylation site.
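The per-position variation described in the legend of Fig. 1C and D can be illustrated with a minimal sketch. It assumes a set of already aligned sequences of equal length and reports, for every alignment column, the number of sequences that differ from the most frequent nucleotide together with the Shannon entropy of that column. The function name `column_variation` and the toy alignment are hypothetical and are not taken from the study, whose exact entropy calculation is not specified beyond the legend.

```python
from collections import Counter
import math


def column_variation(aligned):
    """Return a list of (differences, shannon_entropy) tuples, one per column.

    `aligned` is a list of equal-length strings. Gap characters, if present,
    are simply treated as one more symbol (an assumption of this sketch).
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    profile = []
    for column in zip(*aligned):
        counts = Counter(column)
        n = len(column)
        # Sequences that do not carry the most frequent symbol in this column.
        differences = n - counts.most_common(1)[0][1]
        # Shannon entropy in bits; 0.0 for a fully conserved column.
        entropy = sum(-(c / n) * math.log2(c / n) for c in counts.values())
        profile.append((differences, entropy))
    return profile


# Hypothetical toy alignment, not real PERV env data.
toy_alignment = ["ACGT", "ACGA", "ACTA", "ACGA"]
profile = column_variation(toy_alignment)
for position, (differences, entropy) in enumerate(profile, start=1):
    print(position, differences, round(entropy, 3))
```

Plotting either value against the column index gives a profile in which variable regions stand out as peaks, which is the kind of display the legend describes.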
Besides determining the age of PERV, we were particularly interested in investigating which of the two LTR types is the phylogenetic predecessor.
We have analyzed the prevalence of six well-characterized full-length PERV, five of them being replication competent and four of them being chromosomally assigned (20, 25). These analyses revealed a heterogeneous distribution of PERV among individuals (26), and since no PERV is present in every pig, it seems feasible to generate pigs free of functional PERV by conventional breeding. In addition, specific proviruses show internal point mutations which significantly affect their replication capacities. The two different types of PERV LTR structures (Fig. 1B) show different levels of transcriptional capacity (33), and an analysis of 21 distinct chromosomal locations revealed that PERV harboring highly active LTRs with repeat elements in U3 are dominant (26). In addition, the two polytropic envelope genes were assayed for sequence variations and displayed class-specific hot spots of variation, as well as variations in the R-peptide region (Fig. 1C and D).
Sample acquisition. In order to study the phylogenetics of PERV, we have analyzed the LTRs and env genes of 17 mostly intact and chromosomally localized proviruses (25, 30) by PCR and subsequent sequencing. These sequences were submitted to GenBank, and the accession numbers of those as well as of instrumental full-length proviral sequences are given in the Appendix. In addition, sequences of 27 envelope genes published in GenBank were analyzed. Their accession numbers are given in the Appendix as well.
Molecular clock analysis. In order to evaluate the clocklike behavior of PERV proviruses, distinct trees were constructed using the data sets of LTR and env sequences. Maximum-likelihood testing showed that the molecular clock hypothesis (5, 24, 45, 46), which assumes a constant rate of evolution, could not be rejected and is the best approximation for both data sets (Table 1). When both data sets were tested for substitution saturation using the DAMBE software package (44), no saturation was observed when transitions (s) and transversions (v) were plotted versus the evolutionary distance (Fig. 2). The plot shows a linear increase of s and v values with increasing divergence between different PERV. Since transitions occur much more often than transversions, s should increase faster than v. In the case of substitution saturation, where multiple substitutions have occurred at each site, the phylogenetic signal is essentially lost, and its effect is detectable because v gradually outnumbers s. Thus, the graphs (Fig. 2) indicate that no substitution saturation has occurred in the investigated data sets of PERV LTR (Fig. 2A) and env (Fig. 2B) sequences, supporting the reliability of the chronological dating based on the molecular clock.
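The quantities s and v used in the saturation test can be made concrete with a short sketch that counts transitions and transversions between two aligned sequences. This is only an illustration of the definitions, not a reimplementation of DAMBE: it returns raw counts and does not compute the genetic distance against which the study plots them. The function name `count_s_v` and the example sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_s_v(seq_a, seq_b):
    """Count transitions (s) and transversions (v) between two aligned sequences.

    Positions where either sequence carries a gap or an ambiguous base are
    skipped (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    s = v = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        # A change within purines or within pyrimidines is a transition;
        # any purine/pyrimidine exchange is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            s += 1
        else:
            v += 1
    return s, v


# Hypothetical pair: A->G and C->T are transitions, T->A is a transversion.
s, v = count_s_v("ACGTACGT", "GCGTATGA")
print(s, v)
```

Repeating the count for every pair of sequences and plotting s and v against the pairwise distance gives a plot of the kind shown in Fig. 2; in an unsaturated data set, s is expected to stay above v across the whole range.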
TABLE 1. Maximum-likelihood test for the molecular clock hypothesis
FIG. 2. Transitions and transversions of PERV sequences plotted versus Tamura-Nei's two-parameter genetic distance. Nucleotide sequences of LTRs (A) and env (B) used to determine the evolutionary age of PERV were analyzed with the program DAMBE (44) for their transitions (x and s) and transversions (v), employing the Tamura-Nei substitution model (36). Transitions are the nucleotide changes T ↔ C or A ↔ G and should happen more frequently than transversions, which are the changes T ↔ A, T ↔ G, C ↔ A, or C ↔ G. While the s/v plot of env sequences (B) shows a linear increase of both transitions and transversions, with transitions progressing much faster, the s/v plot of LTR sequences (A) shows a gap (II) caused by LTRs with different numbers of repeats, leading to a stepwise increase in sequence variations. When comparing viruses with different LTR types, each additional repeat in a repeat-harboring LTR increases the genetic distance by 39 bp, while the repeatless LTRs deviate considerably from the repeat-harboring LTRs on the basis of pairwise comparison, thus producing the gap in genetic distance that is observed in the LTR s/v plot. The significant dominance of transitions and the absence of any crossing of the symbols representing transitions and transversions suggest no substitution saturation and support a constant rate of evolution, therefore allowing a determination of age by molecular clock. Each symbol in the plot represents a transition (x) or transversion event at a given genetic distance. n, number of sequences analyzed for each plot (see Appendix).
Phylogenetic analysis of LTR sequences. The clocklike evolution of PERV allowed an attempt at dating by sequence comparison of the 5' and 3' LTR of distinct proviruses. Since the number of fully sequenced PERV is not large enough to allow for phylogenetic analyses based on intact and defective proviral sequences, we amplified the LTRs of distinct chromosomally assigned PERV present in a genomic pig library constructed in bacterial artificial chromosomes (BAC) (30). Because every BAC statistically harbors only one provirus (checked by PCR; data not shown), we could amplify 5' and 3' LTRs of different PERV from this quasigenomic template without interference from LTRs of unrelated proviruses.
Analyses of primate ERV showed that these ERV accumulate mutations at a rate of approximately 2.3 × 10⁻⁹ to 5.0 × 10⁻⁹ substitutions per nucleotide per year (17). Assuming the same mutation rate and a mean LTR length of 700 bp, this translates to 250,000 to 650,000 years per mutation for PERV, with a mean of 450,000 years used for the calculations in Table 2. The fact that transcriptionally active viruses exist (11, 19, 25) favors a relatively recent phylogenetic origin of PERV. While the results of this attempt at chronological dating are given in Table 2 and the actual estimation of time may be improved in the future, it is clear that the repeatless LTR (Fig. 1B) is phylogenetically younger than the repeat-harboring LTR (Fig. 1B).
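The conversion from a per-nucleotide substitution rate to "years per mutation" for a whole LTR can be sketched as follows. The rates, the 700-bp LTR length, and the 450,000-year mean are taken from the text; the function names and the example difference count are hypothetical. The sketch follows the text literally in multiplying the number of differences between the 5' and 3' LTR by the mean years per mutation; whether the values in Table 2 apply any further correction is not stated here, so this is an assumption.

```python
LTR_LENGTH_BP = 700                 # mean LTR length given in the text
MEAN_YEARS_PER_MUTATION = 450_000   # mean value the text uses for Table 2


def years_per_mutation(rate_per_site_per_year, length_bp=LTR_LENGTH_BP):
    """Expected time for one substitution to arise somewhere in the LTR."""
    return 1.0 / (rate_per_site_per_year * length_bp)


def estimated_age(ltr_differences, years_per_diff=MEAN_YEARS_PER_MUTATION):
    """Rough provirus age from the number of 5'/3' LTR differences."""
    return ltr_differences * years_per_diff


slow = years_per_mutation(2.3e-9)   # about 621,000 years
fast = years_per_mutation(5.0e-9)   # about 286,000 years
print(round(slow), round(fast))

# Hypothetical provirus with two differences between its LTRs.
print(estimated_age(2))
```

The exact values obtained this way, roughly 286,000 and 621,000 years, fall inside the rounded range of 250,000 to 650,000 years quoted in the text.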
TABLE 2. Estimation of PERV age by comparison of differences in the 5' and 3' LTRs
Phylogenetic analysis of env sequences. Since the reliability of the molecular clock was established for PERV LTR and env sequences (Table 1 and Fig. 2), an alternative method for phylogenetic dating is the comparison of coding sequences by generating phylogenetic trees based on Env (Fig. 3). In general, the homologies between PERV-A and PERV-C are approximately 85%, while the similarities between PERV-B and both PERV-A and PERV-C barely exceed 70%. This and the occurrence of repeatless LTRs in both PERV-A and PERV-C but not in PERV-B (see above) lead to the assumption of a common evolutionary origin of these two classes, both seeming younger than PERV-B. Unfortunately, there are currently not enough PERV-C sequences available to conduct a thorough analysis. The exclusion of PERV-C sequences may present a weak point. There is the possibility that PERV-C recombined with PERV-A, as is evident from A/C env recombinants (13, 42). Therefore, a PERV-A population with mixed LTR sequences is conceivable. PERV-C obviously was not able to spread in the porcine genome (initial results of screening several porcine genomic libraries suggest a very limited number of PERV-C sequences in the pig genome; Preuß and Tönjes, unpublished data); however, PERV-A with both types of LTR did. We admit that if more PERV-C sequences become available, the study must be extended. While the observed hot spots of variation specific for each class are most likely due to different selective pressures (Fig. 1C and D), there is a fundamental sequence difference between PERV-A and PERV-B in the receptor binding region, while other regions display high levels of homology. Using such sequences together for computational analysis may lead to intrinsic bias. Therefore, both subpopulations were assayed separately.
However, the comparison of coding sequences, in contrast to the comparison of LTR sequences, lacks an internal reference and must be calibrated with an externally generated fixed time point, preferably one related to archaeological data. Since such data are limited for pigs, an appropriate analysis includes studying whether peccary (Tayassuidae) genomes harbor PERV sequences. Peccaries are the closest evolutionary relatives of pigs, live isolated in America, and were separated from the Suidae approximately 7.4 × 10⁶ years ago (18). Analysis of different peccary genomic DNA samples failed to identify any PERV-specific signals by PCR using either degenerate primers (13a) suitable for amplification of gamma-type retroviral sequences or PERV-specific primers (11); however, both approaches produced faint, unspecific signals (data not shown). We are currently investigating whether these unspecific signals belong to an unknown gamma-type retrovirus, which would not be surprising, since ERV sequences have been found in every vertebrate species analyzed (14). These results are in line with a recent publication which identified several new PERV-related sequences in pigs and peccaries but also did not detect any PERV sequences in peccaries (28). The absence of PERV sequences in peccaries indicates that the age of PERV cannot exceed the calculated 7.4 × 10⁶ to 7.6 × 10⁶ years. According to calculations based on the LTR data, the repeatless LTR evolved between 400,000 and 3.1 × 10⁶ years ago (Table 2). We assume that the appearance of this LTR, which shows low transcriptional activity, enabled the virus to pursue an endogenous replication cycle more easily, since the low replication level does not damage the host cell while the virus still retains its ability to replicate outside its host.
FIG. 3. Phylogenetic tree of PERV-A envelope protein sequences aligned for evolutionary time, calculated as in Table 2. Env sequences were shortened to amino acid position 654 to avoid any bias due to R-peptide variations or truncation (see the text and Fig. 1C and D for details). The analyzed PERV-A Env sequences cluster into three groups, with their branching points indicated by letters. The most divergent group (A) is the oldest one, the second group (B) being only slightly younger. This behavior correlates with LTR types with 1 1/2 and 2 1/2 repeats, respectively. The least divergent group (C) is the youngest one, with all members belonging to proviruses with repeatless LTRs. The distance between clusters A and B is approximately one-fourth of the distance between cluster C and cluster A. When assuming a difference of approximately 0.9 million years between LTRs with 1 1/2 and 2 1/2 repeats (white box) (see Table 2), these two groups are about 3.5 million years older than group C (black box) (compare Table 2). *, PERV-C (1) was chosen as the outgroup because of its close relationship to PERV-A sequences. The arrow indicates increasing divergence and therefore increasing evolutionary age. Please note that no absolute age is indicated, since an external reference is missing (see the text), and that only relative age can be determined from the graph.
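The relative dating in the legend of Fig. 3 amounts to a simple proportion, which the following sketch spells out. It assumes, as the molecular clock implies, that branch distance scales linearly with time. The 0.9-million-year gap and the one-fourth distance ratio are the approximate values quoted in the legend; the function name is hypothetical.

```python
def scaled_age_gap(calibrated_gap_myr, distance_ratio):
    """Scale a known time gap to a longer branch distance.

    `distance_ratio` is the calibrated distance divided by the distance
    of interest (here A-B divided by C-A).
    """
    return calibrated_gap_myr / distance_ratio


# A-B gap of about 0.9 million years; A-B distance about 1/4 of C-A distance.
gap_c_to_a = scaled_age_gap(0.9, 0.25)
print(gap_c_to_a)
```

With these rounded inputs the proportion gives about 3.6 million years, in line with the figure of about 3.5 million years stated in the legend.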
Conclusion. A number of authors have pointed out that molecular clock calibrations are subject to a wide margin of error (3, 4, 15). Therefore, the calculations given in Table 2 are only rough estimates of absolute time, but they are nonetheless useful for comparing the relative ages of different PERV. In summary, we have shown that the repeatless PERV LTR evolved from the repeat-harboring LTR, most likely by insertional mutation. This is supported by two different sets of data (LTR and env/Env). PERV is calculated to be approximately 7.4 million years old. This age correlates with the separation of pigs and their closest relatives, the peccaries, approximately 7.6 × 10⁶ years ago, and the event that led to the generation of the repeatless type of LTR occurred between 400,000 and 3 million years ago (Table 2). Upon integration into the germ line, both LTR variants will have coexisted exo- and endogenously for some time. It is evident from observations of other endogenous retroviruses (especially HERV) that these ERV acquire mutations which reduce and finally abolish their replicative performance. The acquisition of a weakly performing promoter may help the virus to do no damage to its host cell while retaining some level of replication competence. The generation of repeatless LTRs may therefore reflect an adaptation process of the virus, switching from an exogenous to an endogenous life cycle. The repeatless LTR is present in the relatively closely related PERV-A and PERV-C proviruses, while the more distant PERV-B proviruses exclusively display repeat-harboring LTRs.
APPENDIX
Specific PERV loci isolated from the BAC library (30) are grouped according to their respective PERV subgroup, while the accession numbers are given for 5' LTR, 3' LTR, and env for the respective clone. PERV-A: 242D4 (AY312539, AY312555, AY312523); 141G12 (AY312537, AY312553, AY312521); 305F5 (AY312542, AY312558, AY312526); 258A11 (AY312541, AY312557, AY312525); 135E5 (AY312536, AY312552, AY312520); 253B6 (AY312540, AY312556, AY312524); 383E10 (AY312543, AY312559, AY312527); and 1079D9 (AY312535, AY312551, AY312519). PERV-B: 161B7 (AY312538, AY312554, AY312522); 484G4 (AY312544, AY312560, AY312528); 498D8 (AY312545, AY312561, AY312529); 534G4 (AY312546, AY312562, AY312530); 647G4 (AY312547, AY312563, AY312531); 667G4 (AY312548, AY312564, AY312532); 783D7 (AY312549, AY312565, AY312533); 80H6 (AY312550, AY312566, AY312517); and 1058D6 (AY312534, AY312567, AY312518). PERV-C: none.
In addition, we have used full-length sequences of previously described proviruses isolated in our laboratory as well as sequences available in the international databases. PERV-A: AY099323 (6); AJ288585, AJ288584 (9); AJ133817 (11); AJ293656 (20); AF426917, AF426921, AF426924, AF426925, AF426927, AF426928, AF426942 (21); AJ279056, AJ279056, AF435966 (25); and AF296168 (Chang et al., unpublished data). PERV-B: AY099324 (6); AJ288589, AJ288592, AJ288591, AJ288590, AJ288586, AJ288588 (9); AJ133816, AJ133818 (11); AJ293657 (20); AF426916, AF426940, AF426937, AF426936, AF426935, AF426933, AF426946 (21); Y12239 (22); and AJ293656 (25). PERV-C: AF038600 (1).