Journal of Virology, March 2002, p. 2410-2423, Vol. 76, No. 5
0022-538X/02/$04.00+0 DOI: 10.1128/jvi.76.5.2410-2423.2002
Copyright © 2002, American Society for Microbiology. All Rights Reserved.
Department of Biochemistry and Molecular Biology,1 Institute of Molecular Medicine and Genetics, School of Medicine, Medical College of Georgia, Augusta, Georgia,2 Hematology Division, Department of Medicine, Vanderbilt University and VA Medical Center, Nashville, Tennessee3
Received 11 September 2001/ Accepted 26 November 2001
The human genome contains approximately 50 copies of the ERV-9 endogenous retrovirus and an additional 3,000 to 4,000 copies of solitary ERV-9 LTRs (15, 17, 19, 35, 36). Compared with the LTRs of other families of endogenous retroviruses, the ERV-9 LTRs exhibit an unusual sequence feature: the U3 regions contain from 5 to 17 tandem repeats of 37 to 41 bases (17, 18) with recurrent GATA (23), CCAAT (28), and CCACC (21) motifs potentially capable of binding to cognate transcription factors expressed in embryonic and hematopoietic cells. This suggests that the ERV-9 enhancer and promoter could be active in those cells.
To gain further insight into the stability and functional significance of the ERV-9 LTRs, here we have mapped the erythroid ß-globin and embryonic axin gene loci in primates by using human primers in PCR. We found a solitary ERV-9 LTR that is conserved in identical locations in the 5" boundary area of the ß-globin gene locus and in the axin gene in the higher primates orangutan, gorilla, chimpanzee, and human, whose ancestors diverged over an evolutionary period of 15 million years (12). In the lower primates gibbon and monkey, whose ancestors diverged from the human ancestor 18 and 25 million years ago, respectively (12), the globin and axin LTRs are absent in the respective gene loci. However, other ERV-9 LTRs are detectable in the monkey genome. These results indicate that copies of the ERV-9 LTRs present in the lower primates were inserted into the globin and axin gene loci in the common ancestor of the higher primates 15 to 18 million years ago and have remained stably integrated in the host sites during the ensuing years of primate evolution.
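The dating argument above can be written out as a short sketch. It is an illustration only, not part of the original analysis: the dictionary and function names are hypothetical, the divergence times are the approximate figures quoted in the text (reference 12), and the orangutan is taken as the deepest lineage that carries the LTR.

```python
# Illustrative sketch (not from the original study): bracket the time of an
# LTR insertion from its presence or absence at the orthologous site.
# Divergence times are in millions of years ago (Mya), as quoted in the text.
DIVERGENCE_MYA = {"orangutan": 15, "gibbon": 18, "monkey": 25}

# Presence of the globin/axin LTR at the orthologous site, as reported.
LTR_PRESENT = {"orangutan": True, "gibbon": False, "monkey": False}


def insertion_window(divergence, present):
    """Return (youngest, oldest) bounds in Mya for a single insertion event."""
    # The insertion must predate the deepest lineage that carries the LTR...
    youngest = max(t for sp, t in divergence.items() if present[sp])
    # ...and postdate the split from the closest lineage that lacks it.
    oldest = min(t for sp, t in divergence.items() if not present[sp])
    if youngest >= oldest:
        raise ValueError("pattern is not consistent with a single insertion")
    return youngest, oldest


window = insertion_window(DIVERGENCE_MYA, LTR_PRESENT)
print(f"Insertion between {window[0]} and {window[1]} million years ago")
```

The same bracketing logic applies to both the globin and the axin LTR, since the two show the same presence and absence pattern.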
To assess the functional significance of the ERV-9 LTR retrotransposons, we have developed a simple and quantitative transfection assay using the green fluorescent protein (GFP) gene as the reporter and fluorescence-activated cell sorter (FACS) analyses to determine the ERV-9 LTR enhancer activity in a wide spectrum of human cells and cell lines. The results show that the U3 region of the globin LTR in both human and chimpanzee and of the axin LTR in human possesses strong enhancer activity in cells of hematopoietic lineages and even stronger enhancer activity in many embryonic cell lines of widely different tissue origins. RNA analyses using rapid amplification of cDNA 5" ends (5"RACE) (11) demonstrate that the U3 enhancer initiates transcription from a specific site downstream of the AATAAA (TATA) motif in the U3 promoter. A second AATAAA motif in the R region of the LTR does not serve as the TATA box or as the polyadenylation signal for the LTR-initiated RNAs. The LTR RNAs extend through the R and U5 regions into the GFP reporter gene in integrated recombinant constructs and into the downstream genomic DNA in the endogenous genomes of embryonic cells and adult erythroid cells. The possible functional significance of the ERV-9 LTR enhancer in regulating transcription of the cis-linked gene loci during early ontogeny and hematopoietic differentiation is discussed.
Slot blots. Membranes containing genomic DNAs from various primate and nonprimate sources were hybridized to the 5"HS5 LTR probe at 60°C overnight in buffer solution without carrier DNA (7% sodium dodecyl sulfate, 1% bovine serum albumin, 1 mM EDTA, 250 mM Na2HPO4 [pH 8]). After hybridization, the membranes were washed four times in 2x SSC (1x SSC is 0.15 M NaCl plus 0.015 M sodium citrate)-0.1% sodium dodecyl sulfate at room temperature and twice for 30 min in 0.5x SSC at 60°C. Signal intensity was quantified with a PhosphorImager (Molecular Dynamics).
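For readers unfamiliar with the SSC shorthand, the salt concentrations of the two washes follow directly from the 1x definition given above. The sketch below is only a convenience calculation; the function and variable names are hypothetical.

```python
# 1x SSC composition as defined in the text (molar concentrations).
SSC_1X = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_molarity(fold):
    """Scale the 1x SSC recipe to an arbitrary fold concentration."""
    return {solute: fold * conc for solute, conc in SSC_1X.items()}


wash_2x = ssc_molarity(2.0)      # room-temperature washes
wash_0_5x = ssc_molarity(0.5)    # 60 degree C washes (lower salt)

for label, recipe in (("2x SSC", wash_2x), ("0.5x SSC", wash_0_5x)):
    print(label, {k: round(v, 4) for k, v in recipe.items()})
```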
Construction of recombinant GFP plasmids. The GFP plasmids (see Fig. 5a) were made from pEGFP-C1 vector (Clontech) which was digested with AseI and NheI to generate the vector backbone containing the GFP reporter gene and the simian virus 40 poly(A) signal downstream of the GFP gene. The inserts were generated by PCR with forward and reverse primers containing the corresponding AseI and NheI ends either from a phage template spanning the 5" boundary area of the human ß-LCR (18) or from human K562 or chimpanzee genomic DNAs. To generate the following PCR DNAs spanning the human and chimpanzee ß-globin (E-P-r), the positions of forward and reverse PCR primers in AF064190 were 2650 to 2672 and 3965 to 3987, and those for the axin (E-P-r) in AC005202 were 12908 to 12931 and 13931 to 13955. The (E-P-r)-fragments contained the first 55 bases of the R region upstream of the AATAAA motif in the R region (see Fig. 5a). The authenticity of the PCR fragment was confirmed by DNA sequencing. The reference GFP plasmid was made by recircularizing the vector with an AseI-NheI adapter.
FIG. 5. Transcriptional initiation sites of 5"HS5 LTR in transfected plasmids and in the endogenous human genomes mapped by 5"RACE. (a) Mapping the 5" ends of RNAs transcribed from 5"HS5 (E-P-R)-GFP plasmid integrated into K562 cells. Top, plasmid map. E, P, R, and GFP, same as in Fig. 1b. Angled arrow, transcriptional initiation site in the LTR marking the 5" border of the R region. Left-to-right arrows, GFP mRNA. Two right-to-left arrows, cDNA reverse transcribed from the GFP mRNA and DNA amplified from the cDNA template after 35 cycles of PCR, respectively. The PCR fragment depicted was the subsequently sequenced DNA strand; the arrows are aligned with the plasmid map on top. (C)n, poly(dC) tails added to the 3" ends of the cDNAs by the terminal deoxynucleotide transferase enzyme. Arrowheads at the 5" ends of the cDNA or the PCR fragment, reverse primers used for cDNA synthesis, PCR amplification, or DNA sequencing (seq.). Numbers, sizes in nucleotides of the cDNAs estimated from the locations of the initiation sites of the GFP mRNAs and of the PCR fragments determined from gel electrophoresis and DNA sequencing (see panels c and d). (b) Mapping of the 5" ends and initiation sites of the 5"HS5 LTR RNAs transcribed from the endogenous genome of K562 cells. Angled arrows, locations of the three transcriptional initiation sites mapped by 5"RACE; R with *, RNA initiation site found also in placental cells (see panel c, lane P); U5 with **, RNA initiation site found also in HeLa cells (see panel c, lane H). Left-to-right arrows, LTR R, U5(2), and U5(3) RNAs. The 5" ends of these three RNAs are located at the 5" borders of the R region and the second and third repeats of the U5 region, respectively. The 3" ends of the RNAs were drawn to the locations of the reverse primers used in the cDNA synthesis, although the actual 3" ends of the RNAs may be located further downstream. Right-to-left arrows and other designations, same as in panel a. 
(c) Gel electrophoresis of PCR fragments used for sequencing. The PCR fragments were generated from the following RNAs: Lane K1, GFP mRNA transcribed from the integrated 5"HS5 (E-P-r)-GFP plasmid; lanes K2, P, H, and K3, endogenous (End.) RNAs of nontransfected K562 cells, placental cells, and HeLa cells and a duplicate sample of the K562 RNAs, respectively. The band in lane K1 was generated by 35 cycles of PCR; the bands in lanes K2, P, H, and K3 were generated by 2 x 35 cycles of PCR with nested primers; and the 580-bp band in lane K3 was skewed upward due to a tear in the gel. Numbers on the right margins, sizes of the PCR DNAs in base pairs; lanes M, size markers; numbers on the left margins, sizes of the size marker bands in base pairs (the top three bands are 1,500, 1,250, and 1,000 bp, respectively; shorter bands are spaced 100 bp apart). (d) Electropherograms showing the locations of the 5" ends of GFP mRNA and the endogenous R RNA (panel I), the endogenous U5(2) RNA (panel II), and the U5(3) RNA (panel III). The electropherogram presented in panel I is generated by endogenous K562 5"HS5 LTR R RNA (the identical electropherograms of GFP mRNA and placental 5"HS5 LTR R RNA are not shown). 5" 3", the 5" 3" direction of the DNA sequences in the electropherograms. The vertical arrows mark the 3" ends of the cDNAs abutting the poly(dC) tails; the corresponding RNA initiation sites are marked with angled arrows in the DNA templates shown below the electropherograms. Boldface letters, transcribed bases in the sense DNA strand (top strands) and in the complementary cDNAs (bottom strands). For complete DNA sequences of the R and U5 regions, see reference 18.
FIG. 4. Enhancer activities of human and chimpanzee 5"HS5 LTRs and human axin LTR determined by transfection assays. (a) Maps of the transfected GFP plasmids. Hu 5"HS5 (E-P-r)-GFP and Ch 5"HS5 (E-P-r)-GFP, human and chimpanzee 5"HS5 LTR enhancer and promoter coupled to the GFP gene. The reference GFP plasmid contains no enhancer and promoter sequences 5" of the GFP gene. (b) Calculation of the enhancer-promoter activity of the transiently transfected human 5"HS5 (E-P-r)-GFP plasmid. Left, middle, and right panels, dot plots by FACS analyses of K562 cells transfected with Tris buffer, reference GFP plasmid, and (E-P-r)-GFP plasmids, respectively. x axis, GFP fluorescence intensities of the transfected cells; y axis, FL2 channel. The dot plots are the same whether FL2 or side scatter was used as the y axis in the Cellquest program. However, using FL2 as the y axis produced more compact and thus more easily gated fluorescent and nonfluorescent cell populations. R2 region, nonfluorescent cells; R3 region, fluorescent cells. The table below the dot plots shows quantitative analysis by the Cellquest program of the dot plot data. Xmean, mean fluorescence intensities (fl. inten.) of the gated cells. Below the table is a calculation of the enhancer-promoter activity of human 5"HS5 (E-P-r)-GFP in K562 cells. The enhancer-promoter activity after normalization with respect to the ratio of the copy number of transfected (E-P-r)-GFP plasmid to that of transfected GFP plasmid is 412/1.6 = 258. *, in transfections where this number was less than 1 for the reference GFP plasmid, a value of 1 was used in the calculation to obtain minimum enhancer-promoter activities.
RNA isolation. Total cellular RNAs were isolated with the Totally RNA kit (Ambion) from nontransfected Bewo, K562, and HeLa cells and transfected K562 cells containing integrated plasmids and also from the chorionic trophoblasts of fresh placentas of newborn infants (6). The isolated RNAs were treated with RNase-free DNase I to eliminate possible DNA contamination before being used as templates in reverse transcription-PCR (RT-PCR) and 5"RACE.
5" RACE. The 5" RACE kit (Gibco BRL) was used according to the vendor's protocol. In brief, cDNAs were first synthesized from the total cellular RNAs by using reverse primers specific to the GFP gene or the HS5 site (see Fig. 5a). Polydeoxycytosines were then added to the 3" ends of the cDNAs by using terminal deoxynucleotide transferase. The cDNAs with the poly(dC) tails were then amplified with 35 cycles of PCR using a nested gene-specific reverse primer and a universal anchored forward primer, poly(dG). Following this, another 35 cycles of PCR were carried out, using a second set of nested, gene-specific reverse primers and the universal anchored forward primer. After the second round of PCR, the amplicons were purified by agarose gel electrophoresis and sequenced by the Molecular Biology Core Laboratory using the cycle sequencing technique. The positions of the gene-specific, nested reverse primers used for cDNA synthesis, two rounds of PCR amplifications, and DNA sequencing, respectively, were as follows. In the GFP gene (see Fig. 5), the primers used were at positions 826 to 850, 736 to 757, 617 to 640, and 617 to 640 (see Clontech manual on pEGFP-C1 for corresponding primer sequences). In the endogenous DNA region between the 5"HS5 LTR (located at positions 3250 to 4349 [accession number AF064190]) and the HS5 site (located at positions 5472 to 6710), the primers used were at positions 6267 to 6289, 5522 to 5545, 4950 to 4974, and 4470 to 4493 or 4355 to 4379 (accession number AF064190).
RT-PCRs. Two to five micrograms of the endogenous total cellular RNAs isolated from various cells was used as the template in each RT reaction; aliquots of cDNAs transcribed from 400 ng of RNAs were used in the subsequent PCRs with appropriate primer pairs. The RT step was carried out with Moloney murine leukemia virus reverse transcriptase (Gibco-BRL) at 42°C for 60 min. The PCR conditions were as follows: denaturation at 94°C for 1 min, annealing at 58°C for 1 min, and extension at 72°C for 1 min, repeated for 32 cycles, if not otherwise specified. PCR products (5 µl of 50 µl) were analyzed by electrophoresis in 2% agarose gels. For Fig. 8, to detect polyadenylated RNAs, the reverse primer used for cDNA synthesis was (T)33 (5" 33[T]-C/G/A-C/G/A/T 3"). The coordinates in AF064190 of the PCR primers were as follows: F1, 3247 to 3271; F2, 4003 to 4028; G1, 4469 to 4493; F3, 5522 to 5545; and G2, 6267 to 6289. In nested PCRs for Fig. 5c, 3 µl of 50 µl of first-round PCR products after 25 cycles was used as templates for the nested, second-round PCR for an additional 25 cycles. For Fig. 7, to determine the transcriptional direction of the 5"HS5 LTR and downstream genomic DNA up to the HS5 site, the coordinates in AF064190 of the forward and reverse primers in primer pairs 1 to 5 were as follows: 1, 4003 to 4028 and 4469 to 4493 (same as F2-G1 in Fig. 6); 2, 4482 to 4502 and 4950 to 4974; 3, 4950 to 4974 and 5522 to 5542; 4, 5522 to 5545 and 6267 to 6289 (same as F3-G2 in Fig. 6); and 5, 6267 to 6289 and 6695 to 6717. To detect sense RNA colinear with the transcriptional direction of the ß-LCR and the ß-like globin genes, the reverse primers in each primer pair were used in the RT step to synthesize cDNAs; to detect antisense RNAs, the forward primers in each primer pair were used in the RT step. For Fig. 8, to determine the transcriptional direction of the axin LTR and the axin gene, the coordinates in AC005202 of the forward and reverse primers in primer pairs 1 to 4 were as follows: 1, 18040 to 18064 and 18477 to 18501; 2, 13969 to 13994 and 14605 to 14629; 3, 13969 to 13994 and 14221 to 14244; and 4, 11549 to 11573 and 11941 to 11965.
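The sizes of the RT-PCR bands expected from these primer coordinates can be checked with a few lines of arithmetic. The sketch below is a convenience calculation under two assumptions that are not stated in the text: that all coordinates are given on the same strand numbering of AF064190, and that each amplicon runs inclusively from the first base of the forward primer to the last base of the reverse primer. The names used are hypothetical.

```python
# Primer-pair coordinates copied from the text (accession AF064190):
# (first base of forward primer, last base of reverse primer).
GLOBIN_PAIRS = {
    "1 (F2-G1)": (4003, 4493),
    "2": (4482, 4974),
    "3": (4950, 5542),
    "4 (F3-G2)": (5522, 6289),
    "5": (6267, 6717),
}


def amplicon_size(forward_start, reverse_end):
    """Inclusive length in bp of the segment bounded by the two primers."""
    return reverse_end - forward_start + 1


sizes = {name: amplicon_size(*coords) for name, coords in GLOBIN_PAIRS.items()}
for name, bp in sizes.items():
    print(f"primer pair {name}: {bp} bp")
```

Under these assumptions the F3-G2 amplicon comes out at 768 bp, matching the band size reported in the Results, and the F2-G1 amplicon at 491 bp, within one base of the reported 490-bp band.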
FIG. 8. Transcriptional direction of the human axin gene locus. (a) Map of the second exon and second intron of the human axin locus. Angled arrows, transcriptional directions of the axin gene and the intronic ERV-9 LTR. Horizontal arrows, RNAs detected by RT-PCR primer pair 1, located in exon 2; by primer pairs 2 and 3, which spanned the ERV-9 LTR and part of intron 2 between the LTR and the second intron; and by primer pair 4, which spanned a further-downstream region of the second intron. Right-to-left arrows, sense RNAs colinear with the transcription direction of the axin gene. Left-to-right arrow, antisense RNAs transcribed from within the ERV-9 LTR and extended into intron 2 DNA. Numbers below the arrows, sizes of the RT-PCR bands in base pairs. (b) RT-PCR products. Lanes 1 to 4, RT-PCR bands generated by primer pairs 1 to 4, respectively. + lanes, sense RNAs colinear with the axin gene. - lanes, LTR-initiated RNAs in the antisense direction to the axin gene. White dots, anticipated sizes of the RT-PCR bands. Lanes M, 100-base DNA size markers from 100 to 1,000 bp. The top two bands are 1,250 and 1,500 bp, respectively.
FIG. 7. Transcriptional direction of the RNAs between the 5"HS5 LTR and HS5 site. (a) The RNAs transcribed from DNA between the 5"HS5 LTR and HS5 site of the ß-LCR were exclusively in the sense direction in nontransfected K562 cells as determined by RT-PCRs. Horizontal left-to-right arrow, sense RNAs transcribed from the LTR into the HS5 site as amplified with primer pairs 1 to 5. Numbers below the PCR products, sizes of the RT-PCR bands in base pairs. (b) Gel electrophoresis of RT-PCR products synthesized with locus-specific primers. Lanes 1 to 5, PCR bands amplified from cDNAs by primer pairs 1 to 5, respectively. + lanes, PCR bands generated from the sense R RNAs using the reverse primer of primer pairs 1 to 5 in the RT step. - lanes, PCR bands generated from the antisense RNAs using the forward primers of primer pairs 1 to 5 in the RT step. White dots on the right margins, PCR bands anticipated to be generated by primer pairs 1 to 5. (c) PCR products amplified from cDNAs synthesized with random hexamer primers in the RT step. Lanes 1 to 5, PCR bands amplified from the cDNAs by primer pairs 1 to 5, respectively.
FIG. 6. Polyadenylated 5"HS5 LTR RNAs detected by RT-PCR. (a) DNA sequences of the U3 promoter (P) and R regions of 5"HS5 LTR. The two AATAAA motifs with heavy underlines are the TATA box in the promoter and the potential polyadenylation signal in the R region, respectively. Angled arrow, LTR transcriptional initiation site. Underlined bases in the promoter region, binding sites for transcription factors AML1 (GTGGT), CCAAT, CACCC, and GATA (21, 23, 28). Thin horizontal arrows, F1 and F2 forward primers used in the RT-PCRs to amplify LTR cDNAs. (b) The AATAAA motif in the R region did not serve as the polyadenylation signal to terminate the LTR R RNAs. Top, genomic map of the region between the 5"HS5 LTR and the HS5 site. Angled arrow, transcriptional initiation site of the LTR R RNA. *, locations of the AATAAA motifs in the LTR and six additional AATAAA motifs or potential polyadenylation signals between the 5"HS5 LTR and the 3" end of the HS5 site. Left-to-right arrows, polyadenylated LTR R RNAs; dotted lines, different DNA sequences present in the 3" ends of the R RNAs generated by different potential polyadenylation signals in the region. (A)n, poly(A) tails of unknown lengths. (T)33, oligo(dT) primer with 33 Ts used as the reverse primer in cDNA synthesis and PCR. Horizontal line bracketed by forward F1 and reverse (T)33 primers, PCR products generated from the cDNAs by primer pair F1-(T)33. Thick lines bracketed by arrowheads, second-round nested PCRs amplified from the F1-(T)33 PCR products by nested primer pairs F2-G1 and F3-G2. The positions of all horizontal arrows, lines, and arrowheads are aligned with the genomic map of the region at the top. (c) Gel electrophoresis of PCR and nested PCR products. Left panel, PCR bands generated by the F1-(T)33 primer pair from LTR R RNAs of placenta (lane P), Bewo (lane B), and K562 (lane K) cells. Lane M, DNA size markers, the same as in Fig. 2c. The PCR bands were generated from cDNA templates after 25 PCR cycles. 
Center and right panels, the nested PCR bands were amplified from the F1-(T)33 PCR products after 25 additional PCR cycles by nested primer pair F2-G1 or F3-G2.
FIG. 1. Conservation of a solitary ERV-9 LTR in the ß-globin gene locus during primate evolution. (a) Map of the ß-globin gene locus in primates. Hu, Ch, Go, Or, Gi, and Mo, ß-globin gene loci in human, chimpanzee, gorilla, orangutan, gibbon, and monkey, respectively. Hatched box, ERV-9 LTR. Vertical arrows, the five DNase I-hypersensitive sites defining the ß-LCR. Black bars, embryonic ε-, fetal γ-, and adult δ- and ß-globin genes. Angled arrows, direction of transcription of the ERV-9 LTR, ß-LCR, and globin genes. Bent line, absence of the ERV-9 LTR in gibbon and monkey. (b) Structure of the 5"HS5 ERV-9 LTR in primates. Boxes marked 1, 2, 3, and 4, the four subtypes of the 40-bp enhancer repeats in U3 (18). Arrowheads, the 72-bp U5 repeats in U5. attagtat and gtatgtca flanking the LTR, DNA bases in the primate genomes flanking the integration site of the 5"HS5 LTR. Numbers in parentheses, lengths in DNA bases of the respective primate LTRs.
FIG. 2. Conservation of a solitary ERV-9 LTR in the axin gene locus during primate evolution. (a) Map of the axin gene locus in primates. Unfilled boxes, α-like globin genes. Black bars, the 11 exons of the axin gene. Other designations are the same as for Fig. 1a. The 300-kb locus in 16p13.3 from the axin gene to the α-globin gene was assembled from GenBank files under accession numbers AC005202, AC004652, Z99754, Z69667, Z69075, Z69890, Z69706, Z84721, Z69666, Z84813, and Z84722. (b) Structure of the axin ERV-9 LTR. Designations are the same as for Fig. 1b. (c) Alignments of the U3 promoter and R regions of the 5"HS5 LTR and axin LTR in human, chimpanzee, and orangutan. Highlighted bases, ACCAC (GTGGT), CCAAT, GGGTG (CACCC), GATA, and AATAAA sequence motifs. Arrow, transcription initiation site of LTR RNAs marking the 5" boundary of the R region.
α-globin gene cluster on chromosome 16. Sequence alignments of the human axin cDNA (accession number AF009674) with the GenBank sequence files spanning the axin gene locus (see Fig. 2a legend) showed that the human axin gene contains 11 exons and spans 58 kb of DNA. An ERV-9 LTR is located, in the antisense orientation, in the second intron at a location 4 kb from exon 2 of the axin gene (accession number AC005202) (Fig. 2a). The human axin LTR bears extensive sequence identity of over 90% with the human 5"HS5 LTR, although the axin LTR is shorter, containing six U3 enhancer repeats and two U5 repeats. As in the 5"HS5 LTR, the identifiable transcription factor binding motifs GTGGT, CCAAT, CACCC, and GATA and the AATAAA box in the U3 region of the axin LTR are 95 to 100% conserved during primate evolution from orangutan to human (Fig. 2c). However, a number of deletions and base mutations are observed: in the chimpanzee U3 promoter, the 15 bases spanning the CACCC and the GATA motifs are deleted and the AATAAA motif (TATA box) is mutated to AACAAA, indicating that the U3 promoter in the chimpanzee axin LTR was considerably weakened. In addition, the second AATAAA motif in the R region of orangutan is mutated to AGTAAA in gorilla and chimpanzee and to AGTAAG in human (Fig. 2c). Like the 5"HS5 LTR, the axin LTR is conserved in an identical location in the higher primates from orangutan to human (Fig. 2b) and thus has been stably integrated in the primate genome for at least 15 million years. The 5"HS5 and axin LTRs are not found in the lower primates gibbon and monkey; however, ERV-9 LTRs are detectable in the monkey genome. Slot blots of primate DNAs show that the monkey genome contains approximately 2,000 copies of the ERV-9 LTRs (Fig. 3). The conservation of the ERV-9 LTRs in the primate genomes for at least 25 million years suggests that the ERV-9 LTRs are not detrimental to the hosts and may be conserved during primate evolution to serve useful cellular functions.
FIG. 3. ERV-9 LTRs are present in both the higher and the lower primates. (a) Slot blots of primate and nonprimate genomic DNAs. The membranes containing the DNA samples were hybridized to the human 5"HS5 LTR probe. (b) Copy numbers of ERV-9 LTRs in primates and nonprimates relative to the haploid copy numbers in human.
The mean fluorescence intensity of the fluorescent cells is a measure of the combined strengths of the enhancer and promoter coupled to the GFP gene, since the GFP gene in the enhancerless and promoterless reference GFP plasmid exhibited very weak mean fluorescence intensity (Fig. 4b). In the dot plots of FACS analysis, the percentage of fluorescent cells in the R3 region reflects the transfectability of the cells, which theoretically should be a constant for a specific cell type even when transfected with different plasmids. However, we observed that the percentages of fluorescent cells correlated positively with the enhancer and promoter strengths (the mean fluorescence intensities) of the transfected plasmids (see percentages of fluorescent cells of each cell type transfected with different plasmids in Table 1). Hence, we took the product of the mean fluorescence intensity of the fluorescent cells and the percentage of the fluorescent cells as a quantitative measure of the combined enhancer and promoter strengths of the transfected plasmid. In calculating the relative enhancer-promoter activities of the test plasmids, the activity of the enhancerless and promoterless GFP plasmid was used as the reference standard (Fig. 4b).
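The calculation described above can be sketched as follows. This is a minimal illustration rather than the authors' actual worksheet: the function names are hypothetical, the FACS readouts in the example are invented so that the uncorrected ratio equals 412, and the floor of 1 applied to the reference signal is one reading of the footnote to Fig. 4b. Only the figures 412 and 1.6 come from the Fig. 4b legend.

```python
def gfp_signal(mean_intensity, percent_fluorescent):
    """Product of mean fluorescence intensity and percentage of fluorescent cells."""
    return mean_intensity * percent_fluorescent


def enhancer_promoter_activity(test, reference, copy_ratio=1.0, reference_floor=1.0):
    """Activity of a test plasmid relative to the reference GFP plasmid.

    test, reference: (mean intensity, percent fluorescent cells) tuples.
    copy_ratio: copy number of test plasmid / copy number of reference plasmid.
    reference_floor: assumed minimum for the reference signal, so that the
        result is a minimum estimate when the reference is very weak.
    """
    test_signal = gfp_signal(*test)
    reference_signal = max(gfp_signal(*reference), reference_floor)
    return test_signal / reference_signal / copy_ratio


# Hypothetical readouts chosen so that the uncorrected ratio is 412.
test_plasmid = (206.0, 20.0)       # signal = 4120
reference_plasmid = (5.0, 2.0)     # signal = 10

activity = enhancer_promoter_activity(test_plasmid, reference_plasmid, copy_ratio=1.6)
print(f"normalized enhancer-promoter activity: {activity:.1f}")
```

With a copy-number ratio of 1.6, the example reproduces the normalization 412/1.6, which rounds to the value of 258 quoted in the Fig. 4b legend.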
TABLE 1. Enhancer and promoter activities of the human 5"HS5 LTR, chimpanzee 5"HS5 LTR, and human axin LTR in the (E-P-r)-GFP plasmids and the reference GFP plasmid determined by transient-transfection assays
The U3 enhancer of ERV-9 LTRs is active in embryonic cells and erythroid cells. Using the transfection assays described above, we determined the U3 enhancer and promoter activities of the human and chimpanzee 5"HS5 LTR and the human axin LTR in a wide variety of human cells. The (E-P-r)-GFP plasmids contained the U3 enhancer and promoter and the 5" half of the R region before the second AATAAA motif (see Fig. 6a). The transfection results show that the enhancer and promoter activities of the ERV-9 LTRs were 2- to 10-fold higher in embryonic cells derived from placenta, embryonic kidney, and liver than in hematopoietic cells of erythroid and lymphoid lineages and were 10- to 100-fold higher in embryonic cells than in some adult nonhematopoietic cells (Table 1). The (P-r)-GFP plasmids containing only the U3 promoter activated the GFP reporter gene to levels approximately one-fifth to one-quarter of those of the (E-P-r)-GFP plasmids (data not shown).
The U3 enhancer initiates mRNA synthesis of the cis-linked GFP gene from a site 25 bases downstream of the AATAAA motif (TATA box) in the U3 promoter. To further investigate the LTR enhancer and promoter activities in the (E-P-r)-GFP construct, we used the 5"RACE technique (11) to map the 5" ends of the GFP RNAs transcribed from the 5"HS5 (E-P-r)-GFP plasmid integrated into K562 cells. In particular, we wished to determine whether the LTR enhancer-promoter activated synthesis of GFP mRNA from the presumptive AATAAA (TATA) box in the U3 promoter as identified by sequence analysis.
5"RACE showed that a single PCR band of 140 bases anticipated to be generated by the nested primer pairs from the GFP mRNA was indeed observed (Fig. 5a and c, lane K1). DNA sequencing of this PCR band showed that the GFP mRNA was initiated from a specific site located 25 bases downstream of the AATAAA (TATA) box in the U3 promoter (Fig. 5d, panel I). This LTR RNA initiation site thus defines the 5" border of the R region in the 5"HS5 LTR (27).
In the endogenous genomes of erythroid and embryonic cells, the 5"HS5 LTR initiates RNA synthesis from the same site 25 bases downstream of the AATAAA box in the U3 promoter, and the LTR-initiated RNAs extend through the intervening DNA into the HS5 site. Correlating with the RNA initiation site in integrated plasmids, the 5"HS5 LTR in the endogenous genome of nontransfected K562 cells also initiated RNA synthesis from the same site at the 5" border of the R region (Fig. 5b). The cDNA reverse transcribed from this endogenous R RNA produced a nested PCR band of the anticipated size of 580 bp in duplicate 5"RACE reactions (Fig. 5c, lanes K2 and K3). DNA sequence analyses of this band confirmed that the endogenous R RNA, like the GFP mRNA transcribed from transfected plasmids, was initiated from the same base located 25 bases downstream of the AATAAA box in the U3 promoter (Fig. 5d, panel I).
In K562 cells, two additional endogenous RNA initiation sites were detected in the U5 region, generating the U5(2) and U5(3) RNAs (Fig. 5b). These two U5 RNAs produced, respectively, the PCR bands of 370 and 260 bp (Fig. 5c, lanes K2 and K3). DNA sequencing of these two bands showed that the 5" ends of the U5(2) and U5(3) RNAs were located at the respective 5" ends of the second and third U5 repeats (Fig. 5d, panels II and III). However, unlike the R RNA, whose 5" end was reproducibly mapped to the C base 25 nucleotides (nt) downstream of the AATAAA box in the U3 promoter, the U5 RNAs were transcribed from regions that did not contain identifiable AATAAA (TATA) boxes located 25 to 30 bases upstream of their respective 5" ends (Fig. 5d, panels II and III). These U5 RNAs were not reproducibly detectable: the U5(2) RNA producing the PCR band of 370 bp was not detected in duplicate K562 RNA samples (compare lanes K2 and K3 in Fig. 5c). Moreover, their 5" ends were not fixed, varying between K562 and other cell types within a range of 20 bases in the 5" borders of the respective U5 repeats (5" end analyses of U5 RNAs in these latter cell types are not shown).
The R and U5 RNAs all extended into the HS5 site in K562 cells, since the PCR bands of 580, 370, and 260 bp were generated from cDNA templates that were synthesized from a reverse primer located within the HS5 site (Fig. 5b and c, lanes K2 and K3).
In nontransfected placental trophoblasts, the endogenous R RNA initiated from the 5" border of the R region produced a single detectable PCR band of 580 bp (Fig. 5c, lane P). DNA sequence analysis showed that, as in K562 cells, this R RNA was initiated from the identical C base located 25 bases downstream of the AATAAA box in the U3 promoter (electropherogram not shown). This indicates that the U3 enhancer and promoter of the endogenous 5"HS5 LTR were active in placental trophoblasts, which confirms the transfection results (shown above) that the 5"HS5 LTR enhancer and promoter in transfected GFP plasmids were active in the Bewo placental trophoblast cell line (Table 1).
In HeLa cells, the R RNA was not detectable and did not produce the PCR band of 580 bp (Fig. 5c, lane H), indicating that the U3 enhancer and promoter of the 5"HS5 LTR in the endogenous genome of HeLa cells were not active. This finding is again consistent with the transfection result that the 5"HS5 (E-P-r)-GFP plasmid exhibited weak enhancer-promoter activities in HeLa cells (Table 1). However, the U5 region in the endogenous 5"HS5 LTR apparently initiated the transcription of U5(2) RNA from the second U5 repeat, which generated the PCR band of 370 bp (Fig. 5c, lane H). This indicates that the U5 region in HeLa cells, as in K562 cells, may be transcriptionally active and capable of initiating RNA synthesis independent of identifiable AATAAA boxes located 25 to 30 bases upstream of the apparent transcriptional initiation sites.
In summary, RNA analyses by 5"RACE indicate that the 5"HS5 LTR is transcriptionally active in placental trophoblasts and erythroid K562 cells and initiates sense RNA synthesis in these cells from a specific site 25 bases downstream of the AATAAA motif in the U3 promoter.
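The positional relationship between the AATAAA motif and the mapped initiation site can be illustrated on a toy sequence. Everything in the sketch below is hypothetical: the sequence is synthetic rather than the 5"HS5 LTR sequence, the function names are invented, and "25 bases downstream" is interpreted as 25 bases beyond the last base of the motif, which is an assumption about the counting convention.

```python
MOTIF = "AATAAA"


def find_motifs(seq, motif=MOTIF):
    """Return 0-based start positions of every (possibly overlapping) motif match."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def predicted_start(motif_pos, offset=25, motif_len=len(MOTIF)):
    """0-based index of the base `offset` bases beyond the last base of the motif."""
    return motif_pos + motif_len - 1 + offset


# Synthetic sequence: 10 filler bases, the motif, 24 spacer bases, then a C
# standing in for the mapped initiation base, then more filler.
toy_seq = "G" * 10 + MOTIF + "T" * 24 + "C" + "G" * 20

motif_positions = find_motifs(toy_seq)
tss_index = predicted_start(motif_positions[0])
print(motif_positions, tss_index, toy_seq[tss_index])
```

Applied to a real LTR sequence, the same search would return both the promoter motif and the second motif in the R region, so the first match should not be assumed to be the functional TATA box without the kind of 5"RACE evidence described above.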
The 5"HS5 LTR RNAs are polyadenylated, but the AATAAA motif in the R region located downstream of the U3 promoter does not serve as a polyadenylation signal for the LTR RNAs. In the 5"HS5 LTR, a second AATAAA motif is present in the R region, at a location 80 bases downstream of the AATAAA (TATA) motif in the LTR promoter (Fig. 6a). This second AATAAA motif did not serve as the TATA box in initiating transcription of the 5"HS5 LTR RNAs, since RNAs initiated from this AATAAA motif would have generated in 5"RACE a nested PCR band of 480 bp, which was not detected (Fig. 5b and c). Such duplicated AATAAA motifs in the promoter and R regions are found not only in the solitary ERV-9 LTRs (7, 17, 18) but also generally in both the 5" and the 3" LTRs of retroviruses (5). In retroviruses, the second AATAAA box in the 3" LTR, but not the one in the 5" LTR, has been reported to serve as the polyadenylation signal for retroviral RNAs (2). In the 5"HS5 LTR of the endogenous K562 genome, if the AATAAA motif in the R region served as a poly(A) signal, it would have produced very short, polyadenylated LTR RNAs of approximately 100 nt which consisted of 55 nt of RNA between the transcriptional initiation site and the AATAAA poly(A) signal in the R region (Fig. 6a) plus approximately 20 additional bases between the poly(A) signal and the polyadenylation site (2, 5) and a poly(A) tail of 33 nt complementary to the (T)33 primer used in the reverse transcription step (Fig. 6b). To detect such short, polyadenylated RNAs of 100 nt, we carried out the following RT-PCR experiments.
The templates for the RT-PCRs were total cellular RNAs isolated from nontransfected placental cells and the Bewo and K562 cell lines, in which the 5′HS5 LTR enhancer and promoter were active as shown by transfection and 5′RACE results. The polyadenylated RNAs were first transcribed into cDNAs with the reverse primer (T)33. The cDNAs were then amplified in PCR with the (T)33 reverse primer and the F1 forward primer located at the transcriptional initiation site in the R region (Fig. 6b). In gel electrophoresis, PCR bands of 100 bp were not produced by RNAs isolated from any of the cell types, although much longer PCR bands were detectable (Fig. 6c, left panel). To confirm that the long PCR fragments generated with the F1-(T)33 primer pair indeed were amplified from the 5′HS5 LTR locus, we performed a second-round nested PCR using the nested primer pair F2-G1 to amplify the F1-(T)33 PCR products (Fig. 6c). The nested PCR band of 490 bp anticipated to be generated by the F2-G1 primer pair was indeed produced (Fig. 6c, middle panel). DNA sequencing showed that this 490-bp band contained the 5′HS5 LTR as well as the unique, downstream genomic DNA and was specific to the 5′HS5 LTR locus (18). In addition, the nested PCR using primer pair F3-G2, which spanned the unique genomic DNA further downstream of the F2-G1 region, also produced the anticipated nested PCR band of 768 bp from the F1-(T)33 PCR products (Fig. 6c, right panel). This F3-G2 band was not amplified directly from shorter cDNA templates synthesized from the (T)33 reverse primer, since PCRs using the cDNAs as the direct templates produced barely detectable PCR bands of 768 bp (Fig. 6c, right panel, last three lanes). These results indicate that the polyadenylated LTR RNAs spanned both the F2-G1 and F3-G2 regions and thus the entire region between the 5′HS5 LTR and the HS5 site.
Together, these results indicate that in placental, Bewo, and K562 cells, the RNAs transcribed from the 5′HS5 LTR were polyadenylated but the AATAAA motif in the R region of the 5′HS5 LTR did not serve as the poly(A) signal, so that the polyadenylated LTR RNAs extended into the downstream genomic DNA. A corollary of the absence of the short, approximately 100-nt polyadenylated LTR RNA is that the AATAAA motifs in the R regions of many other solitary ERV-9 LTRs in the human genome also did not serve as the polyadenylation signal to terminate the ERV-9 LTR RNAs.
The 5′HS5 LTR- and axin-initiated RNAs are transcribed predominantly in a direction toward the downstream genomic DNA. Both the 5′RACE and the RT-PCR studies indicate that the 5′HS5 LTR RNAs were transcribed in the sense direction toward the HS5 site, colinear with the direction of transcription of the further downstream β-like globin genes (Fig. 5b and 6b). This suggests that the LTR enhancer, like the further-downstream HS2 enhancer in the β-LCR (16, 32), initiated transcription predominantly in the sense direction. To confirm this, we carried out the following RT-PCRs to determine whether LTR RNAs were also transcribed in the antisense direction from the HS5 site into the 5′HS5 LTR. We used five overlapping primer pairs that spanned the 2.7 kb of DNA from the R region to the 3′ end of the HS5 site (Fig. 7a). To synthesize cDNAs from the sense transcripts, reverse primers 1 to 5 were used in the RT step; to synthesize cDNAs from the antisense transcripts, forward primers 1 to 5 were used in the RT step (Fig. 7a). The cDNAs were then amplified separately in PCRs with the respective primer pairs 1 to 5. As expected, the sense transcripts produced RT-PCR bands of the anticipated lengths (+ lanes in Fig. 7b), but the antisense transcripts did not produce RT-PCR bands of the anticipated lengths (- lanes in Fig. 7b).
The shorter bands in the + lanes in Fig. 7b were not amplified from shorter sense transcripts, nor were shorter bands in the - lanes amplified from shorter antisense transcripts of the region. They were spurious bands produced in RT-PCRs when the same primer was used in both cDNA synthesis and PCR amplification. In RT-PCRs using random hexamers as primers for cDNA synthesis, which should be able to anneal to and transcribe both the sense and the antisense RNAs, followed by PCR with primer pairs 1 to 5, the spurious shorter bands were not detected, and only the bands of the anticipated lengths produced from the sense RNAs were detected (Fig. 7c). These results indicate that the 5′HS5 LTR and the genomic DNA downstream of it were transcribed exclusively in the sense direction toward the downstream HS5 site.
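The strand logic behind using reverse primers to copy sense transcripts and forward primers to copy antisense transcripts can be illustrated with a minimal sketch. The sequences, primer lengths, and function names below are hypothetical and chosen only for illustration; they are not the primers or loci used in this study, and annealing is reduced to exact complementarity.

```python
# Toy model of strand-specific RT-PCR. All sequences and names here are
# hypothetical illustrations, not the primers or loci used in the study.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA-alphabet sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def primes_cdna(primer: str, transcript: str) -> bool:
    """A primer can start cDNA synthesis only if it anneals to the transcript,
    i.e. the transcript contains the primer's reverse complement."""
    return reverse_complement(primer) in transcript


# Hypothetical top-strand (sense) genomic segment, written 5' to 3'.
top_strand = "GATTACAGGCTTCAGTCCGATAGGCATCGTTAGCCA"

# Transcripts are kept in the DNA alphabet (T instead of U) for simplicity.
sense_rna = top_strand
antisense_rna = reverse_complement(top_strand)

# The forward primer matches the top strand; the reverse primer is the
# reverse complement of a downstream stretch of the top strand.
forward_primer = top_strand[:10]
reverse_primer = reverse_complement(top_strand[-10:])

for name, primer in (("forward", forward_primer), ("reverse", reverse_primer)):
    print(
        f"{name} primer in RT step: "
        f"sense={primes_cdna(primer, sense_rna)}, "
        f"antisense={primes_cdna(primer, antisense_rna)}"
    )
```

In this simplified model, a product obtained only when the reverse primer is used in the RT step points to sense transcripts, which is the pattern reported for the + and - lanes above.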
In the human axin gene locus, the ERV-9 LTR is located in the second intron in reverse orientation to the transcription direction of the axin gene (Fig. 2a and 8a). RT-PCR studies indicate that the axin LTR enhancer still initiated RNA synthesis predominantly in the direction toward the downstream genomic DNA, i.e., in an antisense direction to the transcription of the axin gene (Fig. 8b, lanes 3). This antisense transcription did not extend beyond 400 bases downstream of the axin LTR, since the direction of transcription of the further-downstream axin intron and exon DNA was exclusively in the sense direction of axin gene transcription (Fig. 8b, lanes 1 and 2). These results indicate that both the 5′HS5 LTR and the axin LTR enhancer initiated transcription into the downstream genomic DNAs regardless of the orientation of the LTR with respect to the associated gene loci.
Transfection studies showed that the 5′HS5 and axin LTRs possessed strong enhancer and promoter activities that exhibited tissue preference. The ERV-9 LTR enhancer activities in embryonic cells were 2- to 10-fold higher than those in hematopoietic cells of erythroid and lymphoid lineages and were over 100-fold higher than those in some adult nonhematopoietic cells. In the endogenous genomes of embryonic placental cells and erythroid K562 cells and in plasmids integrated into K562 cells, the U3 enhancer activated RNA synthesis from a specific site located 25 bases downstream of the AATAAA motif (TATA box) in the U3 promoter. The specific location of the transcriptional initiation site downstream of a TATA box suggests that the LTR RNAs were transcribed by RNA polymerase II (pol II). The LTR RNAs extended through a second AATAAA motif in the R region into the downstream genomic DNA and the HS5 site. This second AATAAA motif thus did not serve as a TATA box or as a polyadenylation signal for LTR transcription.
In the endogenous genome of erythroid K562 cells, the 5′HS5 LTR RNAs extended through the R and U5 regions into the HS5 site exclusively in the sense direction colinear with the direction of transcription of the β-LCR (1, 16, 32) and the further downstream β-like globin genes. The sense LTR RNAs were polyadenylated, indicating again that the LTR RNAs were transcribed by pol II, since pol II through its unique C-terminal domain has been reported to be instrumental in polyadenylation of the RNAs it transcribes (20). These rare, endogenous LTR RNAs, which are detectable only after PCR amplifications, do not appear to be mRNAs encoding translatable gene products, as the 1.1-kb DNA between the ERV-9 LTR and the HS5 site and the DNA in the second intron of the axin gene carry no long open reading frames and do not appear to contain a gene (GenBank accession numbers AF064190 and AC005202).
The β-LCR defined by DNase I-hypersensitive sites HS1 to HS5 is conserved during mammalian evolution from mouse to human and serves an indispensable role in transcriptional activation of the β-like globin genes in erythroid cells (13). The HS2 enhancer in the β-LCR located further downstream of the 5′HS5 LTR has also been reported to initiate HS2 enhancer transcription preferentially in the sense direction toward the far-downstream globin promoters and genes (1, 16, 32). These and other observations suggest that the enhancer-initiated transcription process plays a role in mediating enhancer function over distance. We are currently studying the effects of deletion of the 5′HS5 LTR and thus abolition of the 5′HS5 LTR-initiated transcription process on transcription of the downstream LCR and the β-like globin genes during ontogeny and hematopoietic differentiation.
In the mouse axin gene locus, a murine endogenous retrovirus, an intracisternal A particle (IAP) whose LTRs possess enhancer-promoter activities (4), has been reported to regulate transcription of the cis-linked axin gene and cause the Fused and Knobbly mutations in mice (33). In the Fused mutation, the insertion in intron 6 of an IAP in the antisense orientation to the axin gene creates a gene that produces wild-type transcripts as well as mutant transcripts that initiate from the LTRs of the IAP. In the Knobbly mutation, the insertion of an IAP also in the antisense orientation in exon 7 interrupts transcription of axin mRNA and precludes the production of wild-type axin protein. These mutations are manifested by a dominant gain-of-function phenotype of a kinked tail in heterozygotes. Homozygous Knobbly mutants are embryonic lethal, showing duplication of the embryonic axis and neuroectodermal and cardiac defects (33). These findings strongly suggest that the ERV-9 LTR enhancer inserted in the second intron of the human axin gene in reverse orientation to the gene may modulate the transcription of the human axin gene in embryonic and hematopoietic cells through synthesis of the antisense LTR RNAs.
In the human genome, the ERV-9 LTRs are middle repetitive DNAs present at 3,000 to 4,000 copies. Many of these LTRs share extensive sequence identities of over 90% with the 5′HS5 and axin LTRs, as revealed by BLAST searches of the GenBank database. It remains to be determined whether these ERV-9 LTRs possess similar enhancer and promoter activities and regulate the transcription of the cis-linked genes during ontogeny and hematopoietic differentiation.
This work was supported in part by NIH grants DK-1-5555 (to S.K.) and HL 39948 and 62308 (to D.T.).