Journal of Virology, July 2005, p. 8316-8329, Vol. 79, No. 13
0022-538X/05/$08.00+0 doi:10.1128/JVI.79.13.8316-8329.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
Department of Molecular Biology and Microbiology, Tufts University, Boston, Massachusetts 02111
Received 12 December 2004/ Accepted 16 March 2005
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
Gammaretroviruses are simple retroviruses that are widespread, both as exogenous infectious agents and as endogenous proviruses in mice, cats, baboons, and other mammalian, avian, and reptilian species (30). A major group of these viruses, the murine leukemia viruses (MLVs) of mice, makes an ideal subject for study of the recent evolution of the host-virus relationship. There are currently five recognized major subgroups of MLV distributed into two broader classes based on their host ranges and sequence relationships: ecotropic and nonecotropic. The nonecotropic viruses include the endogenous polytropic, modified polytropic, and xenotropic viruses as well as exogenous amphotropic and 10A1 viruses. Despite their distinctive receptor usage (43), these viruses all have highly related env genes, with regions critical for receptor interaction, known as variable region A (VRA) and variable region B (VRB), exhibiting receptor-specific variation within an otherwise largely conserved framework. The overall diversity of sequence in VRA and VRB among the subgroups indicates that MLV has had a very active evolutionary history and continues to evolve rapidly, adapting to use different receptors. This evolution is most likely driven by the appearance of host animals that have become resistant due to polymorphisms in receptor genes or to the presence of endogenous proviruses whose env genes can block receptor access. Analysis of the endogenous proviruses in different species of mice, particularly their receptor utilization, should reveal important aspects of the coevolution of virus and host. Fortunately, evolution of the host, mice of the genus Mus, has similarly been the focus of extensive research for many years (2, 5, 6, 35). Until recently, however, attempts to resolve the interplay of virus and host have been hampered by the number and complexity of proviruses present in the genomes of wild mice.
In prior work, we developed the use of specific oligonucleotide probes and PCR primers aimed at polymorphisms within the env gene and U3 region to comprehensively identify endogenous MLVs and isolate individual proviruses themselves (14, 40, 45, 46). A catalogue of the nonecotropic proviral MLV content of wild mice based upon U3 region polymorphisms led to the description of nine sub-subgroups: four xenotropic related and five polytropic related. These data further allowed the parsimonious construction of MLV phylogenies for both env and U3 sequences. A novel endogenous provirus found in the distantly related Mus spicilegus (formerly M. hortulanus) appeared to occupy a relatively central position in both the U3 and env trees, near the inferred common ancestor of MLV-related viruses, including those in nonmouse species like cats and baboons (46).
The sequence of a partial clone of the hortulanus endogenous murine leukemia virus (HEMV) provirus, including the env gene and the complete U3 region, revealed some unusual features. Based on hybridization with specific oligonucleotide probes, HEMV was originally identified as a member of the X-IV class of endogenous MLVs, the most extensively represented group in the many mouse species tested. This relationship suggested that the X-IV viruses, like HEMV, were active within a deeply ancestral mouse population. The phylogenetic analysis also placed HEMV on a branch separate from the X-IV viruses and other MLVs, although this split occurred very early in the MLV history. Confirming HEMV's distinctiveness, an oligonucleotide probe designed to identify only HEMV-like U3 regions failed to hybridize to other X-IV loci. HEMV is specifically resident in the genome of M. spicilegus only, possibly indicating that it independently evolved into its current form within this Mus branch.
Despite its central phylogenetic position among MLV-like proviruses, the HEMV env gene differs in significant ways from that of related retroviruses. Although it lacks nonsense or frameshift mutations in its open reading frame (ORF), HEMV has a significantly truncated VRA region and a very short VRB region. For this reason, we speculated that the loss of information in these regions might render its Env protein nonfunctional. However, during the course of this study, the sequence and partial tropism data of the virus M813 (an exogenous virus of M. cervicolor originally identified by Benveniste et al. [4]) were published (36). While not identical, M813 is similar to HEMV in VRA and VRB, raising the possibility that the HEMV env gene might encode a functional product.
To test the functional consequences of the unique aspects of the HEMV provirus, we have cloned the proviral DNA and found that, contrary to our speculation, it is intact, is fully capable of replication as a virus, and, despite its apparent age, is a relatively recent insertion into the germ line of M. spicilegus. Consistent with its replication competence, the HEMV Env protein is fully functional and furthermore represents a novel subgroup of MLV, with functional receptors found only on species of Mus, including M. spicilegus. The implications of this more detailed characterization of this otherwise ancestral-appearing MLV on our understanding of the mechanism of retroviral evolution are discussed.
| MATERIALS AND METHODS |
|---|
Mouse DNA. Mus cervicolor popaeus (J53), Mus caroli (J136), Mus cookii (J135), and Mus spicilegus (Halbturn) (J131) genomic DNA preparations were donated by Christine Kozak. Mus famulus (FAM), Mus platythrix (PTX), Mus spicilegus (ZRU), Mus macedonicus (XBS), Mus cervicolor cervicolor (CRV), and Mus musculus bactrianus (BIR) genomic DNA preparations were provided by Francois Bonhomme. Mus dunni DNA was prepared from M. dunni tail fibroblasts using the DNeasy tissue kit (QIAGEN). All other mouse DNAs were purchased from the Mouse DNA Resource at Jackson Laboratories, Bar Harbor, ME.
Plasmids and constructs. The vectors used in this study include pHEMV18 (HEMV env gene) (46); pSV-ψ⁻-E-MLV (26); pMLVgagpol (structural and enzymatic protein expression), pMCF 247 5' (polytropic env gene), and pSVA-MLV (amphotropic env gene) (28, 29); pFBXsalf (xenotropic env gene) (44); pMOV-GaLV SEATO (gibbon ape leukemia virus [GALV] env gene) (47); pLacPuro (34); pB6 (replication-competent 10A1 provirus) (33); and pNCS (replication-competent Moloney MLV [MoMLV]) (10). Digestion of pHEMV18 with BsaAI and NheI released the HEMV env gene coding sequence. Digestion of pB6 completely with NheI and partially with BsaAI released the 10A1 env gene. These env genes were cloned into both pSV-ψ⁻-E-MLV (to create pSV-ψ⁻-HEMV-MLV and pSV-ψ⁻-10A1-MLV) and pNCS (to create replication-competent pseudotype HEMV-MoMLV). pSV-ψ⁻-E-MLV and pNCS had been completely digested with PmlI and partially digested with NheI prior to the introduction of the env genes. The entire HEMV provirus was cloned into the EcoRI sites of pCR2.1-TOPO to make pHEMV-TOPO.
PCR and primers. PCRs were performed with either Taq (Sigma) (original HEMV cloning) or Taq Platinum High Fidelity (Invitrogen) in 50-µl volumes. In general, reactions were set up with the following mixture: 5 pmol each primer, 1 unit polymerase, 2 mM deoxynucleoside triphosphates, and distilled water to a final volume of 50 µl plus buffers (regular Taq polymerase with 5 µl of 10x "genome amp" reaction buffer [100 mM Tris-HCl, pH 8.3, 20 mM Tris-HCl, pH 8.0, 250 mM KCl, 35 mM MgCl2, and 250 nM EDTA] and Taq Platinum with 5 µl of 10x "HiFi buffer" and 2 µl 50 mM MgSO4). Cycling parameters generally followed this protocol: 1 min at 95°C, to dissociate the enzyme from its inhibitory antibody, followed by 30 cycles of a 30-second 95°C melting step, a 30-second annealing step at 5°C below the primer Tm, and an extension step (1 min/kb) at 68°C. An exception to this cycling protocol was the 47°C annealing temperature found to be optimal for the inosine-containing primer Tris2F.
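The generic cycling rule above (anneal 5°C below the primer Tm, extend 1 min per kilobase at 68°C) can be written out as a small helper. This is only an illustrative sketch: the function name, the dictionary layout, and the example Tm and amplicon sizes are assumptions made for the example and are not taken from the study.

```python
def cycling_program(primer_tm_c, amplicon_kb, cycles=30, anneal_override_c=None):
    """Return the generic PCR program as (temperature in deg C, seconds) steps.

    primer_tm_c       -- primer melting temperature supplied by the caller
    amplicon_kb       -- expected product length in kilobases
    anneal_override_c -- optional empirically chosen annealing temperature
                         (e.g. 47 for an inosine-containing primer)
    """
    if anneal_override_c is not None:
        anneal_c = anneal_override_c
    else:
        anneal_c = primer_tm_c - 5              # 5 deg C below the primer Tm
    extension_s = round(60 * amplicon_kb)       # 1 min per kb
    return {
        "initial_denaturation": (95, 60),       # releases the antibody-bound enzyme
        "cycles": cycles,
        "melt": (95, 30),
        "anneal": (anneal_c, 30),
        "extend": (68, extension_s),
    }


# Hypothetical example: a 2.5-kb amplicon with a primer Tm of 60 deg C.
program = cycling_program(60, 2.5)
```

For a primer with an empirically determined optimum, the override argument replaces the Tm-based rule, as was done for Tris2F.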
Many primers were chosen using the Primer3 program available at the Whitehead Institute website (http://frodo.wi.mit.edu/cgi-bin/primer3/primer3_www.cgi). For primers not chosen due to optimal Tm or other considerations, sequences were screened using the "oligonucleotide properties calculator" at the Northwestern University website (http://www.basic.nwu.edu/biotools/oligocalc.html). Primers used in this study are listed in Table 1.
minus MoMLV, HEMV, or 10A1 plasmids with pLacPuro. Triple transfections were done with mixtures of pMLVgagpol and pLacPuro with the env gene constructs pMCF 247 5', pSVA-MLV, pMOV-GaLV SEATO, and pFBXsalf. Replication-competent 10A1, MoMLV, and HEMV were grown in NIH 3T3 cells. Virus-containing medium was filtered through a 0.45-µm membrane and then either used directly or stored at −70°C. Transfection of 293T cells was modified from a method described previously (32). 293T cells were grown to confluence in 100-mm plates and then split 1:4 the day before transfection to yield 90% confluent cultures. One hour before transfection, the medium was replaced with 9 ml of fresh medium. For each 100-mm plate, 30 µg of total DNA was mixed with sterile distilled water to a final volume of 450 µl. To this mixture, 60 µl of 2 M CaCl2 was added dropwise. This solution was immediately added dropwise to 500 µl 2x HBS (42 mM HEPES, 274 mM NaCl, 10 mM KCl, and 1.8 mM Na2HPO4, pH 7.10) and was allowed to sit for 1 min. This final mixture was added dropwise to the 9 ml of medium on the cells. One hundred microliters of 10 mM chloroquine was then added to the plate and swirled to mix. After 3 to 5 h of incubation at 37°C and 5% CO2, this medium was replaced with fresh medium. Lipofectamine Plus transfection was carried out according to the manufacturer's protocol. Assay or virus collection was performed 36 to 48 h later.
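As a worked check of the volumes given for the calcium phosphate transfection, the working concentrations follow from C1V1 = C2V2. The sketch below assumes that volumes are simply additive and that exactly 9 ml of medium is on the plate; the helper name is hypothetical.

```python
def diluted_conc(stock_conc, stock_ul, total_ul):
    """Solve C1*V1 = C2*V2 for C2; the result keeps the units of stock_conc."""
    return stock_conc * stock_ul / total_ul


# DNA/water mix (450 ul) plus 2 M CaCl2 (60 ul), added to 500 ul of 2x HBS.
precipitate_ul = 450 + 60 + 500                             # 1,010 ul in total
cacl2_mM = diluted_conc(2000, 60, precipitate_ul)           # CaCl2 in the precipitate, mM

# Precipitate added to 9 ml of medium, followed by 100 ul of 10 mM chloroquine.
plate_ul = 9000 + precipitate_ul + 100                      # 10,110 ul on the plate
chloroquine_uM = diluted_conc(10_000, 100, plate_ul)        # chloroquine on the cells, uM

print(f"CaCl2 in precipitate: {cacl2_mM:.1f} mM")
print(f"Chloroquine on cells: {chloroquine_uM:.1f} uM")
```

Under these assumptions the precipitate contains roughly 119 mM CaCl2 and the cells are exposed to roughly 100 µM chloroquine.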
Titering of single-round infections/determination of infectious units. Target cells were plated onto 6-well plates the day before to give 85% to 90% confluency on the day of the infection. Cells were infected with 1 ml of each of three 10-fold serial dilutions (undiluted, 1:10, and 1:100) of LacZ+ PuroR replication-defective virus plus 8 µg/ml polybrene. Three to 5 h after the addition of virus, 3 ml of fresh medium was added to each well. After incubation for 36 to 48 h at 37°C and 5% CO2, cells were washed once with phosphate-buffered saline (PBS) (Ca²⁺ and Mg²⁺ free) and then fixed with 0.05% glutaraldehyde in PBS for 15 to 20 min. Cells were then washed once with PBS and then stained with X-Gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside) solution [0.1 M NaPO4, pH 7.3, 1.3 mM MgCl2, 3 mM K3Fe(CN)6, 3 mM K4Fe(CN)6, 1 mg/ml X-Gal]. Blue cells were counted by eye using a light microscope.
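The conversion from blue-cell counts to infectious units per milliliter is simple arithmetic, sketched below. The sketch assumes that every blue cell (or focus) in a well is counted and represents one infection event, and that wells with no blue cells are excluded from the average; the function names and the example counts are hypothetical and are not data from this study.

```python
def infectious_units_per_ml(blue_cells, dilution_factor, inoculum_ml=1.0):
    """Titer implied by one well: count x dilution factor / inoculum volume."""
    return blue_cells * dilution_factor / inoculum_ml


def mean_titer(counts_by_dilution, inoculum_ml=1.0):
    """Average the titers from all wells that contained blue cells.

    counts_by_dilution maps a dilution factor (1, 10, 100) to the count in that well.
    """
    titers = [
        infectious_units_per_ml(count, factor, inoculum_ml)
        for factor, count in counts_by_dilution.items()
        if count > 0
    ]
    return sum(titers) / len(titers) if titers else 0.0


# Hypothetical counts: 380 blue cells at 1:10 and 42 at 1:100, 1 ml per well.
example_titer = mean_titer({10: 380, 100: 42})
```

In practice only dilutions with a countable number of blue cells would be used, so the caller decides which wells to include.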
Interference assay. NIH 3T3 cells chronically infected with ecotropic (MoMLV), amphotropic, polytropic, 10A1, HEMV, and M813 viruses were plated in 6-well plates at 1 x 10⁵ cells/well the day before the assay. LacZ+ replication-defective viruses were then added, and infectious units were assayed as described above.
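One common way to summarize an interference assay is the fold reduction in titer on chronically infected cells relative to uninfected cells. The sketch below illustrates that calculation only; the floor applied to titers below the detection limit, the function name, and the example titers are assumptions for illustration and do not describe how the results in this study were scored.

```python
def fold_interference(titer_on_uninfected, titer_on_infected, detection_limit=1.0):
    """Fold reduction in challenge-virus titer caused by a resident virus.

    Titers below the detection limit are floored at the limit so that the
    result is a finite lower bound rather than a division by zero.
    """
    return titer_on_uninfected / max(titer_on_infected, detection_limit)


# Hypothetical titers in infectious units per ml.
same_receptor = fold_interference(1e6, 1e2)        # strong block by the resident virus
different_receptor = fold_interference(1e6, 5e5)   # little or no interference
```

A large fold reduction indicates that the resident and challenge viruses compete for the same receptor, whereas a ratio near 1 indicates distinct receptors.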
Virus growth and RT assay. NIH 3T3 cells in two wells each (of a 24-well plate) were transfected with 0.3 µg of retroviral construct mixed with 0.1 µg pLacPuro using Lipofectamine Plus. Forty-eight hours after transfection, one well was stained with X-Gal to determine transfection efficiency. Cells were then split 1:5 every 3 days (to achieve 90 to 95% confluence) for the duration of the experiment. Medium was collected at each passage and immediately frozen at −70°C to assay for reverse transcriptase (RT) activity. Other cell lines were treated similarly. At passage 4 for HEMV and passage 5 for 10A1 and MoMLV, 100 µl of 0.45-µm-filtered medium was added to a subconfluent well of a 24-well plate of SC-1 and MMK cells. These cells were split 3 days later, and media representing passage 1 were collected.
Medium collected from virus-producing cells was loaded in triplicate into 96-well round-bottomed plates at 10 µl/well. To each well, 50 µl of assay buffer (50 mM Tris-HCl, pH 8.3, 10 mM β-mercaptoethanol, 10 mM MgCl2, 5.15 mM NaCl, 1 µM ATP, 1 U poly(rA)/oligo(dT), 5 µCi [α-32P]TTP) was added, and then the plates were incubated at 37°C for 1 h. The reaction mixture was filtered through 96-well DEAE filters (Millipore). The filters were washed seven times with 2x SSC (1x SSC is 0.15 M NaCl plus 0.015 M sodium citrate) and dried, and radioactivity was determined as counts per minute (cpm) in a scintillation counter.
Ligation-mediated PCR, flanking sequence cloning, and integration site occupation. Ligation-mediated PCR was performed using the "Universal GenomeWalker" kit (BD Biosciences). Internal nested HEMV primers were coupled with the provided adaptor primers. Primers were as follows: 5' junction HGAG5R and HGAG7R and 3' KT61 and KT59. The amplification reaction mixture used 2.5 µg of PANCEVO/Ei DNA digested with the blunt cutters DraI, EcoRV, PvuII, and StuI before ligation of the adaptor termini as the template. The amplicons were sequenced starting from the M13 forward and M13 reverse sites flanking the inserts. The location of the HEMV integration was initially discovered by BLASTn (http://www.ncbi.nlm.nih.gov:80/BLAST/) and then confirmed using the Ensembl mouse genome viewer (http://www.ensembl.org/Mus_musculus/).
Based upon the flanking sequence data, a pair of primers, Int5F and Int6R, was designed. These primers were then used to investigate site occupation in other species, both across the integration site and with an HEMV env gene-specific primer, 3LTRF, to amplify a possible HEMV 3' junction.
Phylogenetic analysis. Sequences used to construct the phylogenies were either determined from samples sent to the Tufts Core Facility or acquired from Entrez. Alignments were initially performed by the ClustalW (version 1.4) algorithm included with the MacVector (version 7.0) program (Oxford Molecular Group) and were then corrected by hand. Trees were constructed by exhaustive parsimony analysis using PAUP (version 4.0b10) (42). Branches were swapped using the TBR algorithm using the steepest descent. Gaps were treated as "missing data."
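The parsimony criterion scores a candidate tree by the minimum number of character changes it requires, and the tree search then looks for the topology with the lowest score. The sketch below illustrates that scoring step with Fitch's algorithm on a rooted binary tree, treating gaps as missing data as described above. It is an illustration of the principle only: it is not PAUP's implementation and performs no tree search, and the tree encoding, taxon names, and toy alignment are hypothetical.

```python
NUCLEOTIDES = frozenset("ACGT")
MISSING = {"-", "?", "N"}            # gaps are scored as missing data


def fitch_site_score(tree, states):
    """Minimum number of changes one alignment column needs on a binary tree.

    tree   -- nested 2-tuples whose leaves are taxon names, e.g. (("a", "b"), "c")
    states -- dict mapping each taxon name to its character at this site
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):                      # leaf
            state = states[node]
            return set(NUCLEOTIDES) if state in MISSING else {state}
        left, right = (visit(child) for child in node)
        shared = left & right
        if shared:                                     # no change needed here
            return shared
        changes += 1                                   # one change on a child branch
        return left | right

    visit(tree)
    return changes


def parsimony_score(tree, alignment):
    """Sum the Fitch score over all columns of an aligned set of sequences."""
    length = len(next(iter(alignment.values())))
    return sum(
        fitch_site_score(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(length)
    )


# Toy alignment with hypothetical taxa; the second tree needs one extra change.
toy = {"a": "AC", "b": "AC", "c": "GC", "d": "G-"}
score_1 = parsimony_score((("a", "b"), ("c", "d")), toy)
score_2 = parsimony_score((("a", "c"), ("b", "d")), toy)
```

Here the first topology requires one change and the second requires two, so the first would be preferred under parsimony.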
Nucleotide sequence accession number. The nucleotide sequence of the HEMV provirus has been submitted to GenBank under accession number AY818896.
| RESULTS |
|---|
minus-E-MLV (Fig. 1A). 293T cells were then cotransfected with this construct and the ψ-plus pLacPuro marker vector. Two days posttransfection, virus was harvested, filtered, and assayed for LacZ expression following infection of NIH 3T3 cells. In this initial experiment, we found that the HEMV env gene could lead to production of infectious virus capable of inducing LacZ expression in NIH 3T3 cells at titers comparable to those achieved with the positive-control MoMLV (up to about 10⁷ infectious units/ml).
minus-E-MLV as had been done for HEMV or by triple transfection of 293T cells with an envelope expression construct, pLacPuro, and the viral structural protein encoding plasmid pMLVgagpol (Fig. 1B). Cell lines used to test for tropism were selected mostly for their species representation but in the case of 293mCAT1, E36, and Mus molossinus kidney (MMK) cells for other properties. 293mCAT1 cells stably express the ecotropic MLV receptor mCAT1 and are therefore infectible by this subgroup of MLV (1). E36 cells are Chinese hamster lung cells that express a variant of the Pit2 protein that allows them to be infected by both amphotropic MLV and GALV as well as the other nonecotropic MLVs (8, 11). MMK cells are also reported to be susceptible to GALV infection (48).
HEMV exhibited an ecotropic MLV-like species tropism, meaning that it could infect only cells derived from the Mus subgenera including M. spicilegus itself (Table 2). However, these results further indicated a tropism distinct from the established ecotropic viruses, which can also infect rat cells. Furthermore, HEMV could not infect the human 293mCAT1 cells that are susceptible to ecotropic subgroup infection. We conclude that although it has an ecotropic host range, HEMV must use a receptor distinct from that used by classical ecotropic MLV.
The HEMV provirus appears to be replication competent and of recent origin. The species tropism results suggested that the HEMV provirus might represent a heretofore-uncharacterized MLV subgroup using a receptor whose susceptible orthologue was limited to species of Mus. To analyze the properties of this unusual provirus further, it was cloned by assembling a collection of partial PCR amplification products (Fig. 2).
The results showed that HEMV is replication competent, matching the rise to peak RT activity of 10A1 but lagging 1 week behind the MoMLV culture (Fig. 4A). This difference in kinetics may reflect either stochastic differences in the initial transfection or more efficient replication of the laboratory-adapted strain. A notable drop in RT activity towards the end of the experiment perhaps reflected the increased viral burden and was mirrored by a drop in cellular growth rate. Despite the apparent successful spread of HEMV throughout the NIH 3T3 cells in the previous experiment, there remained a remote possibility that the rise in RT activity observed in these HEMV-transfected cells was the result of stable production of defective virus. To determine if exogenous HEMV could initiate and support an active infection, virus collected from the peaks of RT activity (passage 5 for HEMV and 10A1 and passage 4 for MoMLV) was filtered to remove cellular debris and used to initiate infection of the other mouse cell lines, SC-1 and MMK (Fig. 4B and C). HEMV readily and rapidly infected these cells as well, as indicated by the high levels of RT activity.
The HEMV provirus is likely integrated at the distal end of M. spicilegus chromosome 7. Despite the complete cloning and sequencing of HEMV, we did not know the location of the endogenous copy within the M. spicilegus genome. This information was important pragmatically for verifying the identity of our clone. Furthermore, the location would reveal the 4- to 6-base flanking repeats of the host genome target sequence that are characteristic of an actual integration event, indicating that HEMV arrived via an exogenous infection (or at least by the activity of a retroviral IN). Finally, probes for the chromosomal flanking sequence would allow us to assay for this site's occupation in other species. It was also of interest to determine the proximity of the provirus to any genes or other features of the mouse genome.
The HEMV integration site was cloned using ligation-mediated PCR as described in Materials and Methods (Fig. 5). Comparison of the sequence of the HEMV flanking region with the mouse genome sequence implied that HEMV is integrated in a region corresponding to the distal end of chromosome 7 in M. musculus C57BL/6, specifically at base 125,863,352 (of 137,389,636) according to the Ensembl mouse genome viewer at the Sanger Institute website (http://www.ensembl.org/Mus_musculus/) (Fig. 6A). It is not located in or near any genes, the closest being Ebf3, a transcription factor possibly involved in neuronal differentiation (15), at a distance of about a megabase.
The HEMV integration site is unoccupied in other Mus species, and an HEMV-like U3 sequence is present only in M. spicilegus. The determination of the integration site allowed us to test for the presence of HEMV at that genomic location in other species. We did this search by looking for the chromosome/provirus junctions as well as unoccupied integration sites using PCR with primers complementary to flanking as well as proviral sequences. We compared DNA from the laboratory strain AKR/J, from 3 unrelated M. spicilegus individuals, and from 15 other species and subspecies of Mus. As shown schematically in Fig. 6A, PCRs were performed using primers to detect the presence of the provirus at the site identified (Fig. 6B), the unoccupied integration site (Fig. 6C), and related proviruses independent of integration site (Fig. 6D and E). The presence of amplified fragments of the predicted size for integrated provirus only in the four M. spicilegus individuals and of the size expected for the unoccupied site in all mice except M. spicilegus implies that the HEMV provirus on distal chromosome 7 is unique to this species. Furthermore, it must be homozygous in at least three of the four individuals which had been collected from the limits of the species' range, suggesting residence in the species long enough to become fixed in the genome, or nearly so. In addition, no proviruses containing related LTRs were found in any species outside of M. spicilegus. More distantly related mouse species, such as M. caroli, M. dunni, M. (Coelomys) pahari, and M. (Pyromys) platythrix, failed to yield amplification products of the correct size with any primer pair, most likely due to local genomic polymorphisms that defeated the PCR strategy. All products were sequenced to confirm their identity. The approximately 500-bp fragments in the two M. cervicolor lanes and the one M. caroli lane were determined to be artifacts.
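The logic used to interpret the two site-specific PCRs (provirus/host junction and unoccupied site) can be summarized as a small decision table. The sketch below is illustrative only: the function name and the labels are hypothetical, and it assumes that each band has already been confirmed by sequencing, as was done here.

```python
def call_integration_site(junction_band: bool, empty_site_band: bool) -> str:
    """Interpret the junction PCR and the unoccupied-site PCR for one animal.

    junction_band   -- product of the size predicted for the provirus/host junction
    empty_site_band -- product of the size predicted for the unoccupied site
    """
    if junction_band and empty_site_band:
        return "heterozygous"            # one occupied and one empty allele
    if junction_band:
        return "homozygous occupied"     # provirus present on both alleles
    if empty_site_band:
        return "unoccupied"              # no provirus at this locus
    return "no call"                     # neither primer pair amplified


# Example: junction product present and no unoccupied-site product.
example_call = call_integration_site(True, False)
```

The "no call" outcome corresponds to DNAs in which neither primer pair yields a product of the expected size, for example because of polymorphisms at the primer-binding sites.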
Phylogenetic analysis of the HEMV provirus. Previous phylogenetic analyses of the U3 regions and env genes of MLV-like viruses suggested that HEMV occupies a relatively central position. The availability of the complete nucleotide sequence of HEMV allowed us to determine this relationship for all other regions.
Trees relating HEMV sequences to those of other MLV-like viruses were constructed for gag, RT, SU, and TM DNA using maximum parsimony (42) (Fig. 7). The RT and TM trees were consistent with those derived for U3 and the env gene (46), establishing HEMV as the sole resident of an evolutionary branch derived from the node that separates the MLVs from other MLV-like viruses. The RT tree indicates that the MLVs are otherwise tightly associated but that there is a clear split of endogenous and exogenous viruses (Fig. 7A). When TM sequences were used for analysis, HEMV again straddled the node between the MLVs and other gammaretroviruses. Interestingly, there was a less-clear division between endogenous and exogenous MLVs in the TM tree (Fig. 7B).
The gag gene tree again preserves the division between endogenous and exogenous viruses and viruses of mouse origin as opposed to those of nonmouse origin, but HEMV is located on a branch that is positioned slightly differently from its location in the RT and TM trees (Fig. 7D).
Overall, the phylogenetic analysis, taken together with our previous report (46), is consistent with HEMV occupying a position distinct from all other gammaretroviruses, with a relatively close relationship to the common ancestor of MLV-like viruses.
| DISCUSSION |
|---|
The HEMV provirus was identified in the course of continuing work to identify distinct features of endogenous MLVs and to catalogue their representatives in laboratory and wild-derived mouse strains. Analysis of the U3 regions of multiple endogenous MLV loci isolated from wild-derived mice indicated that they are highly polymorphic and that the previously defined xenotropic, polytropic, and modified polytropic groups could be further subdivided into even more detailed classes, which were labeled X-I through X-IV (xenotropic subgroup-related I-IV) and P-I through P-V (polytropic subgroup related). The class X-IV viruses were the most broadly represented within the Mus subgenus and could be found in many species, including the single provirus in M. spicilegus. This distribution suggests that this class of virus was active in a Mus population that preexisted a period of radiation, termed the West Palearctic split, approximately 2 million years ago and therefore existed prior to all other classes (Fig. 8). The single X-IV provirus within the genome of the distantly related M. spicilegus was christened HEMV for hortulanus endogenous MLV (46) (Mus hortulanus is the prior name for Mus spicilegus).
Consistent with the assumption of antiquity, PCR amplification of a virus-host DNA junction fragment showed that HEMV is present and homozygous in three individuals (or their descendants) from different parts of the range of M. spicilegus. This result implies that HEMV was integrated sufficiently long ago to be fixed in the host species. However, PCR analysis and hybridization with an HEMV-specific oligonucleotide probe failed to detect similar elements in M. m. musculus, M. spretus, M. cookii, M. caroli, and M. cervicolor (popaeus). Thus, HEMV appears to be unique to its host species.
Relative to other endogenous MLVs, the env gene of HEMV is also significantly shortened, with notable deletions in the receptor-interacting VRA and possibly VRB regions (46). Like the LTR and pol genes, it also occupies a relatively central location on the MLV tree. Although it lacked other obvious defects, these sizable deletions implied that HEMV might be defective. Somewhat surprisingly, experiments using an MLV pseudotyped with the HEMV Env protein showed that it was fully functional and that HEMV represented a novel subgroup of MLV. Initial results indicated that the host range dictated by the HEMV Env protein resembled that of the ecotropic MLV subgroup in that it could infect only cells derived from the genus Mus. Unlike ecotropic MLV, it could not infect the two strains of rat cell tested. In support of this distinction, HEMV could not infect human cells stably transfected with the ecotropic receptor mCAT1. The fact that HEMV represents a novel subgroup of MLV was confirmed by performing an interference assay using a replication-competent Moloney MLV expressing the HEMV env gene. HEMV did not cross-interfere with the other traditional MLV subgroups. We thus conclude that HEMV represents a novel subgroup of MLV, in that it uses a receptor distinct from those of all other known viruses tested here. The results of a similar study with M813 support this conclusion (see below).
Although the competence of the HEMV Env protein to mediate entry was unexpected due to its unusual primary sequence, it remained possible that the HEMV locus was ancient. There are numerous examples of defective endogenous loci whose env genes have been preserved either as protective against future infections or coopted for host function (19, 20, 22, 25, 31). Inconsistent with a role in protection is the observation that primary tail fibroblast cells derived from a provirus-positive M. spicilegus individual were readily infectible by the HEMV pseudotype, implying that expression of the provirus, or at least the env gene, is not high enough to protect these cells from exogenous HEMV infection. Although levels of expression are yet to be tested directly, it is likely that, analogous to other replication-competent MLVs and ALVs, the HEMV provirus is strongly suppressed by CpG methylation (7, 17).
To assess the replication competence of HEMV, the entire provirus was cloned and sequenced. We found it to be intact, lacking in obvious defects in the form of deletions, nonsense mutations, and frameshift mutations. Furthermore, its LTRs were identical in sequence, consistent with a relatively recent insertion into the M. spicilegus genome. The lack of a mismatch between the LTRs implies that it is younger than the probable appearance of a single nucleotide change. Using a rate for silent substitutions in rodents of 7.9 x 10⁻⁹ substitutions/site/year, a single mismatch in a 1,000-base-pair region (i.e., both LTRs) should occur about once every 125,000 years (18, 21, 38). Furthermore, the lack of a provirus at this site in M. macedonicus, a species which split from M. spicilegus within the last 200,000 years, further caps the age of HEMV. The likeliest date of insertion of this element probably lies towards the extreme end of these estimates because it is present and homozygous in the three tested M. spicilegus isolates, all of which were collected from the edges of its range. M. spicilegus is an aboriginal species commonly known as the mound-building mouse, and it has adapted to its environment partially by forming tight-knit kin groups with little migration. It is unlikely, given this social behavior, that HEMV's ubiquity stems from the recent migration of provirus-carrying individuals between the regions. HEMV must therefore have been active as an exogenous virus just after the split from M. macedonicus.
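The age estimate above follows directly from the substitution rate and the number of sites compared. The sketch below reproduces that arithmetic and, as an added assumption not made in the text, treats substitutions as a Poisson process to show how likely identical LTRs would be after a given time; the function names are hypothetical.

```python
import math

RATE = 7.9e-9      # silent substitutions per site per year (value quoted in the text)
SITES = 1000       # combined length of both LTRs, in base pairs (as in the text)


def years_per_expected_substitution(rate=RATE, sites=SITES):
    """Time after which one substitution is expected somewhere in the region."""
    return 1.0 / (rate * sites)


def prob_identical_ltrs(years, rate=RATE, sites=SITES):
    """Poisson probability (added assumption) of zero substitutions after `years`."""
    return math.exp(-rate * sites * years)


print(f"{years_per_expected_substitution():,.0f} years per expected mismatch")
print(f"P(identical LTRs after 200,000 years) = {prob_identical_ltrs(200_000):.2f}")
```

The first value is approximately 127,000 years, in line with the figure of about 125,000 years quoted above. Under the Poisson assumption, identical LTRs remain reasonably probable (about 0.2) even at the 200,000-year limit set by the M. macedonicus split, so LTR identity alone bounds the age only loosely.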
The 5' and 3' integration junctions between virus and cell DNA were also cloned. Their sequence revealed that HEMV is located in a region corresponding to the distal end of C57BL/6 chromosome 7. According to the current information provided by the assembled mouse genome and the mouse/endogenous retrovirus linkage map, this region is otherwise devoid of retroviral integration and is not densely packed with host coding regions or even expressed sequence tags that suggest active transcripts (14) (http://www.ensembl.org/Mus_musculus/). Although the ends of the provirus were exactly as predicted, a surprising finding was that the lesions in the host chromosome created by the HEMV IN were not perfectly repaired to create the canonical 4-base direct repeats. However, alignment with the C57BL/6 mouse genome does suggest that there was a 4-base offset to the enzymatic attack. This conclusion is based upon the assumption that the M. spicilegus genome and the C57BL/6 genome are concordant at this site, which is not unreasonable, considering the similarity of genomic regions flanking the HEMV integration to C57 DNA. If the HEMV provirus is completely fixed, it may not be possible to determine the sequence of the unoccupied site. In support of this misrepair hypothesis, the sequence of the unoccupied site in the closely related species M. macedonicus is AAAC, the same as the 3' flanking repeat, suggesting that the mismatch resulted from error-prone repair of the 5' repeat.
Although apparently intact, the HEMV provirus contained numerous single base changes that made it different from other MLV subgroups. To test its competence for replication, a complete clone of the HEMV provirus was created in a vector that contained no integral eukaryotic promoter. We found that this proviral clone could seed an active infection of NIH 3T3 cells following its initial transfection, as determined by assaying for the appearance of reverse transcriptase activity in the culture supernatant. Furthermore, viruses harvested from this transfection at the peak appearance of RT activity rapidly spread through freshly infected mouse cell cultures and were able to induce superinfection resistance to HEMV-pseudotyped virus. Thus, this provirus has maintained full biological activity since integration into an ancestor of Mus spicilegus, perhaps 100,000 or more years ago.
For the most part, phylogenetic analysis of HEMV supported the relationships suggested by previous consideration of its entire env gene and U3 region (46). Exhaustive maximum-parsimony searches that included only HEMV and its closer MLV-like relatives were performed. Sequences for RT, the entire gag gene, and the entire env gene were included in separate analyses. A key difference in the trees presented in this study is that, unlike the previous work, the env gene was split into SU and TM regions. We analyzed each Env protein subunit separately due to large differences in conservation of the two sequences, because there are multiple examples of env genes that are recombinants of these two regions (41, 45) and because TM and RT tree topologies tend to agree (3). This latter point is useful because complete sequence information for many gammaretroviruses is not available. The SU tree is the only one not to place HEMV at the junction of the MLVs and the other gammaretroviruses, although it did indicate that this region of HEMV shares (along with M813) ancestry with a very early progenitor of the nonecotropic MLVs and one not far removed from a hypothetical ancestor of feline leukemia virus. Unlike trees based on other regions of the genome, which tend to group exogenous and endogenous MLVs separately, the SU tree, not surprisingly, groups viruses by subgroup.
The closest relative of the HEMV SU protein is that of MLV M813, thought to be an exogenous virus from M. cervicolor, a species that last shared a common ancestor with M. spicilegus approximately 2 million years ago (2, 5, 6). The M813 receptor was recently discovered to be the sodium-dependent myo-inositol transporter SMIT1 (16). Recently, we have found that HEMV also uses SMIT1 as a receptor, although the two viruses have quite different host ranges. The conclusions of this research have broader implications and will be detailed in an upcoming paper (Tipper and Coffin, unpublished).
There are two possibilities for the origin of HEMV. One is that it represents a solitary endogenous member of a class of retrovirus that has evolved independently in the M. spicilegus lineage. The presence of HEMV-like X-IV viruses within the M. spretus genome indicates that a putative HEMV progenitor existed within the lineage that established the two species. Alternatively, it is possible that the endogenous HEMV locus resulted from infection of M. spicilegus with a virus from a sympatric species. The hypothetical age of the endogenous element puts its integration soon after the split from M. macedonicus, a time when an early M. spicilegus population may not have yet adopted its peculiar social habits. The lack of shared (prespeciation) fixed HEMV loci would better fit this hypothesis. Further data are required to distinguish between these alternatives.
In conclusion, HEMV closely resembles the hypothetical MLV common ancestor by most of the genotypic and phenotypic data reported above. Unlike most other infectious endogenous proviruses of chickens and mice, it has a widespread distribution in the M. spicilegus genome but is fully capable of supporting an exogenous infection.
| ACKNOWLEDGMENTS |
|---|
J.M.C. was an American Cancer Society Research Professor of Molecular Biology and Microbiology with support from the F. M. Kirby Foundation. This work was supported by grant R01 CA 89441-01 from the National Cancer Institute.