Journal of Virology, February 2001, p. 1722-1735, Vol. 75, No. 4
0022-538X/01/$04.00+0 DOI: 10.1128/JVI.75.4.1722-1735.2001
Copyright © 2001, American Society for Microbiology. All rights reserved.
Integrated Program in Cellular, Molecular and Biophysical Studies1 and Department of Microbiology,2 Columbia University, New York, New York 10032
Received 18 October 2000/Accepted 22 November 2000
ABSTRACT
A consensus binding site for the human papillomavirus (HPV) E2 protein was determined from an unbiased set of degenerate oligonucleotides using cyclic amplification and selection of targets (CASTing). Detectable DNA-protein complexes were formed after six to nine cycles of CASTing. A population of selected binding sites was cloned, and a consensus was determined by statistical analysis of the DNA sequences of individual isolates. Starting from a pool with 20 random bases, a consensus binding site of ACAC-N5-GGT was derived. CASTing and electrophoretic mobility shift analyses demonstrate that human but not bovine papillomavirus E2 proteins recognize this sequence. The presence of this sequence in papillomavirus genomes suggests that it has a functional role. We demonstrate that this site functionally substitutes for the canonical E2 binding site (ACCG-N4-CGGT) in both transient-transcription and DNA replication assays. This sequence, in most instances, is interchangeable with the resident E2 binding sites in the context of the HPV type 16 long control region. Where the novel sequence does not support E2-mediated effects on gene expression or DNA replication, we demonstrate that changing the orientation of the novel sequence restores this effect.
INTRODUCTION
Papillomaviruses (PVs) cause a benign hyperproliferation of epithelial cells that sometimes progresses to form carcinomas. The model system for studying papillomaviruses has been bovine papillomavirus type 1 (BPV1) because of its ability to infect and transform a variety of rodent cells in tissue culture (41).
Papillomaviruses encode early and late proteins that are involved in
regulation of virus gene expression and replication, and assembly of
the virion, respectively. E2 is a dimeric multifunctional early protein
that is intimately involved in the regulation of gene expression and
viral genome replication. The primary structures of papillomavirus E2
proteins are highly conserved. They consist of an N-terminal domain
that can act as a transcriptional activator and is also involved in
viral DNA replication and interaction with the viral DNA replication
protein E1; a central, poorly conserved hinge region; and the
C-terminal DNA binding and protein dimerization domains
(17). Analysis of the crystal structure of the C terminus of the BPV1 E2 protein bound to its DNA binding site revealed that two
α-helices, one from each monomer of an E2 dimer, bind in the major
groove of the DNA (8, 9). The original studies of the BPV1
E2 protein identified the high-affinity DNA binding sites within the
BPV genome to be 12 bp long and to minimally contain the sequence
ACC-N6-GGT (canonical site), with the highest-affinity site
being ACCG-N4-CGGT (2, 15).
The occurrence of BPV1 E2 high-affinity DNA binding sites in the long control region (LCR) of the human papillomaviruses (HPVs) has led investigators to assume that the HPV E2 proteins preferentially bind this site (1, 33, 39, 40, 42).
Many E2 functions are dependent on the relative affinity of the protein for its various DNA binding sites. HPVs that infect the genital mucosal epithelium contain multiple E2 binding sites within their LCRs, both proximal and distal to the transcription initiation site for early-gene expression. The distal, higher-affinity sites apparently act as enhancers, and the proximal, lower-affinity sites act as repressors of early gene expression (17). These sites are thought to be part of a switching mechanism that modulates the levels of early gene expression during the viral life cycle. In addition, it has been shown that E2 can compete with cellular transcription factors for binding at neighboring or overlapping sites within the LCR (5, 37, 38).
The papillomavirus proteins E1 and E2 are necessary for HPV genome replication (12, 32, 36). For HPVs, the minimal origin of replication is defined as an E1 (E1BS) and an E2 binding site (E2BS) in close proximity flanked by an A/T-rich region, with additional E2BSs facilitating replication (3, 16, 36). In the absence of an E1BS, two E2BSs near the A/T-rich region can support transient HPV replication (36). Deletion analyses around the BPV1 origin of replication have revealed that the location of E2BSs with respect to the E1BS and A/T-rich region is flexible. Moreover, the affinity of the E2 protein for a particular site directly correlates with its ability to stimulate DNA replication (6, 43, 45).
The primary objective of our studies was to elucidate the highest-affinity binding sites for HPV E2 proteins. To address this question, we employed the nonbiased cyclic amplification and selection of targets (CASTing) technique (48). We identified a unique set of sequences, ACAC-N5-GGT, that HPV E2 proteins (but not the BPV1 E2 protein) bind with a relative affinity that is indistinguishable from that for the canonical high-affinity site. Our studies also suggest the existence of preferred nucleotides within the flanking and core (N5) sequences. Comparisons of the relative affinities and binding complex half-lives were made for the HPV51, HPV16, and BPV1 E2 proteins with the different DNA binding sites. These novel sites are located within papillomavirus genomes at locations where E2BSs are typically found.
In order to assess how an E2 protein might utilize the novel E2BSs, we used them in place of the wild-type sites found in the early promoter and replication origin of the better-characterized HPV type 16. HPV16 infects the epithelial cells of the genital mucosa and, like all other high-risk HPVs, is strongly associated with cervical carcinoma. We have designed a single plasmid that allows the assay of both transient transcription and transient replication. This plasmid, pOri16L, was built with a portion of the HPV16 LCR that contains both the origin of replication and the early promoter driving the expression of the firefly luciferase reporter gene. There are three canonical E2BSs within this portion of the LCR (see Fig. 5). In this study, we mutated each of the wild-type E2BSs to either eliminate binding (BS-KO) or create new E2BSs with the novel binding site sequence ACACAAATCGGT. Here we demonstrate that the novel E2BS functionally substitutes for the native E2BSs within this portion of the LCR in both transient-transcription and replication assays. Because bp 3 of the novel E2BSs disrupts the canonical site's palindrome, we also addressed the influence of the orientation of this binding site on E2 function. The results of our experiments with these mutated LCRs demonstrate that the functional role for the ACAC-N5-GGT sites in E2-mediated replication of and transcription from the HPV genome can be dependent on binding site orientation.
MATERIALS AND METHODS
DNA constructs for protein expression.
DNA constructs used
for expression of proteins in bacteria were made by PCR amplification
of portions of the genes encoding the E2 proteins from HPV types 51 and
16 and BPV type 1. The resulting PCR products were digested with
BamHI and EcoRI, whose sites are in the primers
used for amplification (underlined in the primer sequences listed in
Table 1) and then ligated in frame with
and C-terminal to the glutathione S-transferase (GST) gene
of pGEX-3X (Amersham Pharmacia Biotech, Piscataway, N.J.). These
constructs were designed to express GST fusion proteins with either the
full-length (fl) or the short C terminus (sct) E2 proteins that contain
only the DNA binding and dimerization domain of the respective E2
proteins. The fl constructs GST-51E2fl and
GST-B1E2fl contain the entire E2 coding sequences. The sct
constructs GST-51E2sct, GST-16E2sct, and
GST-B1E2sct contain papillomavirus nucleotide
sequences 3536 to 3811 from HPV51, 3584 to 3892 from HPV16, and 3457 to
3840 from BPV1, respectively. Nucleotide numbering corresponds to
papillomavirus genome sequences listed in the Human Papillomavirus
Compendium, HPV database (22). The primers used to create
these constructs are listed in Table 1.
β-Glucuronidase-expressing baculovirus was provided as part of the
Bac-to-Bac kit and was used as a control. All constructs were sequenced
to confirm their identity.
Protein purification. Escherichia coli strain BL21/DE3 was used for bacterial expression of recombinant proteins. Proteins were extracted from cultures (500 to 1,000 ml) of bacteria grown in liquid overnight at 25°C without IPTG (isopropylthiogalactoside) induction because induction resulted in partitioning of most of the E2 proteins to the insoluble fraction upon extraction (data not shown). Total-cell extracts were made by sonication, and proteins were purified by their affinity for glutathione-agarose beads (Amersham Pharmacia Biotech). Cleavage of the GST from the E2 protein was performed by addition of factor Xa protease (Roche Molecular Biochemicals, Indianapolis, Ind.) to the glutathione-eluted fraction and incubation with gentle agitation at 4°C for 16 to 24 h in factor Xa cleavage buffer (50 mM Tris-HCl [pH 8.0], 100 mM NaCl, 10 mM MgSO4, 1 mM CaCl2, 5 mM dithiothreitol [DTT]). To further purify this protein, it was applied to an S-Sepharose (Amersham Pharmacia Biotech, Piscataway, N.J.) column that was equilibrated with S-Sepharose buffer (20 mM Tris-HCl [pH 8.5], 100 mM NaCl, 5 mM EDTA [pH 8.0]). Protein was eluted with a linear salt gradient (100 mM to 1 M NaCl) in S-Sepharose buffer, and 1-ml fractions were collected and screened for active E2 protein by electrophoretic mobility shift assay (EMSA) and Western blotting (data not shown).
Proteins expressed from Sf-21 cells infected with recombinant baculoviruses for 36 to 72 h postinfection (depending on virus and multiplicity of infection) were harvested, and nuclear extracts were made from them as described previously (24).
Cloning of PCR-amplified sequences. Sequences amplified by PCR were cloned using the TA cloning kit (Invitrogen Corp., Carlsbad, Calif.) with the pCRII and pCR2.1 vectors according to the manufacturer's instructions.
Sequencing reactions.
DNA sequencing was performed using the
Sequenase version 2.0 DNA sequencing kit as per the manufacturer's
instructions (Amersham/USB) with M13 forward and M13 reverse primers
end labeled with [γ-32P]ATP using T4 polynucleotide kinase.
CASTing. The CASTing method was performed as previously described (48) with minor modifications. Each oligonucleotide in the degenerate oligonucleotide library (DOL) contains PCR primer binding sites that flank a core region of 20 randomly generated nucleotides. The oligonucleotides used were DOL, 5'-AGACGGATCCATTGCA-N20-CTGTAGGAATTCGGA-3', and the primers used for PCR amplification were N20-B (5'-AGACGGATCCATTGCA-3') and N20-R (5'-TCCGAATTCCTACAG-3'). CASTing was performed by adding about 10 µg of double-stranded DOL to 20 µl of glutathione-agarose beads with approximately 15 to 20 µg of GST fusion protein (GST-HPV51E2fl, GST-His6, or GST-BPV1E2fl) captured from dialyzed extracts in a total volume of 100 µl of binding buffer (10 mM Tris-HCl [pH 7.5], 50 mM NaCl, 1 mM EDTA, 4 mM DTT, 250 µg of bovine serum albumin [BSA] per ml, 5% glycerol). This mix was allowed to interact at room temperature for 3 h with gentle agitation followed by three washes with binding buffer. The complexes of glutathione-agarose beads/GST protein/double-stranded DOLs were then resuspended in the PCR buffer mix (30 µl of 10× Taq buffer [Promega], 18 µl of 25 mM MgCl2, 3 µl of 10 mM deoxynucleoside triphosphate [dNTP] mix, 3 µl of Taq DNA polymerase, 3 µl of 500-pmol/µl N20-B primer, and 3 µl of 500-pmol/µl N20-R primer in a total volume of 300 µl) and amplified according to the following protocol: 95°C for 5 min, then 10 PCR cycles of 94°C for 1 min, 40°C for 1 min, and 72°C for 30 s, followed by cooling to 4°C, and a 100-µl aliquot was removed and stored on ice. The remaining 200 µl was cycled four more times (14 in total) and cooled to 4°C, and another 100-µl aliquot was removed and stored on ice. The remaining 100 µl was subjected to four more rounds of amplification (18 in total), followed by cooling to 4°C.
A 10-µl aliquot (plus 1 µl of loading dye) from each of the PCR amplifications (10, 14, and 18 cycles) was electrophoresed on a 2.5% agarose-1× TBE (90 mM Tris-borate, 1 mM EDTA [pH 8.5]) gel and visualized by ethidium bromide staining and UV transillumination. A portion of the pool obtained from each round of CASTing was amplified using radiolabeled primers and tested for enrichment by EMSA. After nine rounds of CASTing with GST-HPV51E2fl, a protein-probe complex was easily detected by EMSA, and the CASTing procedure was stopped. The amplified pools were cloned into the pCRII vector (Invitrogen), the resulting plasmid DNAs were isolated from individual clones, and their sequences were determined.
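The step from sequenced clones to selected cores can be sketched in code. The following is a minimal illustration, not the authors' procedure: it assumes each clone read is a plain string, uses the DOL flanks given above, and checks both strands because a TA-cloned insert can lie in either orientation. The function names are hypothetical.

```python
import re

# Constant flanks of the degenerate oligonucleotide library (DOL), from the text.
LEFT_FLANK = "AGACGGATCCATTGCA"   # same sequence as primer N20-B
RIGHT_FLANK = "CTGTAGGAATTCGGA"   # reverse complement of primer N20-R

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Exactly 20 unambiguous bases between the two flanks.
_CORE = re.compile(LEFT_FLANK + r"([ACGT]{20})" + RIGHT_FLANK)


def extract_core(read):
    """Return the 20-nt core of a clone read, or None if no intact insert is found.

    Both strands are searched, so the core is always reported in the
    orientation of the DOL as written in the text.
    """
    read = read.upper()
    for candidate in (read, reverse_complement(read)):
        match = _CORE.search(candidate)
        if match:
            return match.group(1)
    return None
```

Reads whose insert is not exactly 20 nt long (for example, PCR artifacts) return `None` in this sketch and would need separate handling.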
CASTing with the GST-BPV1E2fl protein was performed as above with the exception that the alignment was based on 65 independent clones from two independent CASTing experiments.
Additional plasmid constructs. (i) E2 binding sites.
The construct pBPV1E2BS (a gift from Eliot
Androphy) is a pUC18 derivative containing the sequence
tcgagaACCGAATTCGGTagcc cloned into the polycloning sequence.
This clone was used as a positive control in EMSAs that analyzed the
products of the CASTing reactions and for screening TA clones (last two
lanes of Fig. 1). Additional E2 binding sites used as probes for EMSAs
were cloned by annealing two complementary oligonucleotides (H and B
clones; see Table 2) that contain the same flanking sequences: +,
5'-CCAGAGTGAATTCCAGA-(12-bp binding
site)-TCCCAAGCTTGGCG-3', and −,
5'-CGCCAAGCTTGGGA-(complement of 12-bp binding
site)-TCTGGAATTCACTCTGG-3'. These were then cleaved with
EcoRI and HindIII and ligated into a pUC18 vector between its unique EcoRI and HindIII sites.
(ii) pOri16L. The pOri16L plasmid was constructed by building a portion of the HPV16 LCR and the firefly luciferase gene into a pUC19 plasmid backbone. The portion of the HPV16 LCR from nucleotide positions 7800 to 73 (nucleotide numbering as in reference 22) that includes the HPV16 origin of replication and the P97 early promoter, including its TATA box, was amplified by PCR. The primers used to amplify this portion of the LCR contain novel PstI and BamHI restriction sites to facilitate cloning: 16Ori-upper/PstI primer, 5'- CATGAACTGTCTGCAGGTTAGTCATAC-3', and 16Ori-lower/BamHI primer, 5'- GTGCATAAAGGATCCGCTTTTATAC-3'.
The plasmid backbone was provided by pUC19-EX. The pUC19-EX plasmid was constructed by digestion of pUC19 DNA at its EcoRI and XmaI sites in EcoRI buffer (New England Biolabs, Beverly, Mass.) and filling in the 5' overhangs with deoxyribonucleotides using T4 DNA polymerase in T4 DNA polymerase buffer (New England Biolabs) with 100 mM dNTPs. The blunt ends were then ligated using T4 DNA ligase in T4 DNA ligase buffer (New England Biolabs). The pUC19-EX and the HPV16 LCR PCR products were digested with PstI and BamHI, gel purified, and ligated to each other. The resulting plasmid is referred to as pOri16. The open reading frame (ORF) encoding firefly luciferase was purified after BamHI digestion of the p19luc plasmid (46) and inserted into the pOri16 plasmid at the BamHI site to yield pOri16L.
(iii) pOri16L mutants.
The binding site knockouts and
sequence substitutions were all created by site-directed mutagenesis of
pOri16L using either the MORPH kit (5 Prime→3 Prime, Inc., Boulder,
Colo.) or the QuikChange site-directed mutagenesis kit (Stratagene
Cloning Systems, La Jolla, Calif.) as per the manufacturers'
instructions. For plasmids with mutations in more than one E2BS, one
site was altered using the pOri16L plasmid as the template and the
second site was altered using the partially mutated plasmid as the
template. All mutated plasmids were screened by DNA sequence analysis.
Upon isolation of mutant clones, origin-containing fragments were
removed by digestion with PstI and EcoRI (the
EcoRI used for this recloning step is found in the
luciferase BamHI cassette) and then ligated into a pOri16L
plasmid from which the wild-type LCR was removed by digestion with the
same enzymes.
(iv) pCMV series. The pCMV-E216 and pCMV-E116 expression plasmids were kindly provided by Peter Howley (32). They express the full-length HPV16 E2 and HPV16 E1 proteins, respectively, driven by the cytomegalovirus (CMV) promoter. The pCMV4-XS plasmid was constructed by digesting pCMV-E116 with XbaI and SmaI, filling in the resulting overhangs with deoxyribonucleotides using T4 DNA polymerase, and ligating the resulting blunted ends as described above for the pUC19-EX plasmid construct. The identities of all of the above constructs were confirmed by DNA sequence analysis.
EMSA.
M13rev primers were end labeled with
[γ-32P]ATP using T4 polynucleotide kinase. PCR was then
performed using this radiolabeled primer plus an unlabeled M13for
primer to create a single end-labeled, double-stranded DNA probe for
EMSA. The constructs used as templates for making probes were either
the CASTing TA clones in the pCRII vector or pH and pB clones (see
above for cloning details and Table 2 for binding site sequences) in
pUC18. The resulting PCR products were gel purified. EMSAs were
performed in binding buffer with purified protein, 1 µg of sonicated
salmon sperm DNA (Sigma, St. Louis, Mo.), 3 µg of BSA, and
10⁴ cpm of radiolabeled DNA probe. This binding reaction
was incubated at room temperature for 30 min and then loaded directly
onto 6 to 8% native polyacrylamide gels containing 0.25× TBE and 2.5% glycerol.
Tissue culture. J2-3T3 cells were cultured in Dulbecco's modified Eagle's medium with 10% bovine calf serum. SCC-13 cells (29) were grown on mitomycin C-treated J2-3T3 cell feeder layers in E medium as described previously (19).
Luciferase expression assays. SCC-13 cells were plated on mitomycin C-treated J2-3T3 feeder layers in 35-mm dishes 24 h before transfection. The plasmids pOri16L, pRL-TK, and pCMV-E216 or pCMV4-XS (see the legends to Fig. 6 and 7 for details) were introduced into the SCC-13 cells using LipofectAmine, as per the manufacturer's instructions (Life Technologies). Cells were harvested and assayed at 36 h posttransfection using the dual luciferase kit (Promega, Madison, Wis.), and units of luciferase activity were determined using a Berthold Lumat LB9501 luminometer (Berthold Systems, Inc., Pittsburgh, Pa.). Expression levels were determined from duplicate transfections in three independent experiments.
Transient-replication assays. Transient-replication assays were performed with SCC-13 cells as previously described for HPV31 (12). Briefly, plasmid DNAs (quantities and identities detailed in the legend to Fig. 5) were electroporated into SCC-13 cells as detailed by Hubert et al. (12) and Ustav and Stenlund (44). Replication was assayed after DpnI digestion of low-molecular-weight DNA extracted by a modified Hirt protocol (10) followed by gel electrophoresis and Southern blot hybridization.
RESULTS
Cyclic amplification and selection of targets. High-affinity DNA binding sites for the HPV E2 protein were identified using the CASTing technique (48). A GST fusion with the full-length HPV type 51 E2 protein (GST-51E2fl) was used to select specific DNA binding sites from a random pool of degenerate double-stranded oligonucleotides (see Materials and Methods for details).
DNAs from the degenerate oligonucleotide pool were amplified by PCR, cloned, and sequenced to determine if representation within the central 20-bp region was truly random. DNA sequence analysis shows that the starting DNA pool used for CASTing contained a stretch of 20 bp where the abundance of all four bases was the same (data not shown). This demonstrates the random nature of the starting degenerate oligonucleotide N20S pool and that there was no apparent overrepresentation of any particular sequence. Each round of the CASTing consisted of three steps: binding of the double-stranded DNA oligonucleotide pool (N20S) to the GST-E2 protein, removal of unbound DNA, and PCR amplification of DNA that remained bound to the GST-E2 protein-glutathione-agarose bead complex. EMSAs were performed after each round of CASTing to assess the efficiency of binding site selection (Fig. 1). The enriched population pools were cloned (see Materials and Methods), their DNA sequences were determined, and a consensus binding site was determined.
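Two of the checks described above, the base composition of the starting pool and the derivation of a consensus from aligned sites, can be illustrated with a short sketch. This is an assumption-laden example rather than the statistical analysis used in the study: the 60% threshold, the function names, and any input sequences are hypothetical, and the sites are assumed to be already aligned and of equal length.

```python
from collections import Counter

BASES = "ACGT"


def base_frequencies(cores):
    """Fraction of each base over all positions of all cores.

    A truly random pool is expected to give values near 0.25 for each base.
    """
    counts = Counter("".join(cores).upper())
    total = sum(counts[b] for b in BASES)
    return {b: counts[b] / total for b in BASES}


def position_counts(aligned_sites):
    """Per-position base counts for equal-length, pre-aligned sites."""
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all sites must have the same length")
    return [Counter(site[i].upper() for site in aligned_sites)
            for i in range(length)]


def consensus(aligned_sites, min_fraction=0.6):
    """Call a base where it reaches min_fraction of the sites, else 'N'.

    min_fraction is an arbitrary illustrative cutoff, not a value from the paper.
    """
    n = len(aligned_sites)
    called = []
    for column in position_counts(aligned_sites):
        base, count = column.most_common(1)[0]
        called.append(base if count / n >= min_fraction else "N")
    return "".join(called)
```

With real data, the input to `consensus` would be the 12-bp windows aligned from the sequenced clones; a stricter or looser cutoff changes how many core positions are reported as N.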
1 flanking positions as described for the sites preferred by
HPV16 E2 (40). However, two nucleotides from only one half
of the 12-bp site (ACAC, also underlined above) differ from
the canonical BPV1 E2 binding site.
affinity site
(ACC-N6-GGT) is recognized by HPV51E2,
though it was identified in only 15% of the sequenced pool.
As a further control for specificity, a random set of the round 9 TA
clones were subjected to analysis by EMSA. Radiolabeled probes were
amplified by PCR using the TA-cloned DNAs as the template, and EMSA
analyses were performed (data not shown). From differences in the
abundance of the complexes, we concluded that there was a wide range of
relative affinities between the GST-HPV51E2fl protein and
the various DNA sites. In contrast, when a similar analysis was
performed using probes made from the clones isolated from the
GST-His6 CASTing experiment (data not shown), neither the
GST-HPV51E2fl nor GST-His6 protein formed
complexes with these DNAs. Therefore, neither the GST portion of the
fusion protein nor glutathione-agarose beads were selecting specific
DNA sequences.
Because the CASTing consensus sequence differed from the canonical BPV1
E2 site, we were concerned that the CASTing procedure was not correctly
identifying the HPV E2 consensus binding site. To address this
possibility, we repeated the CASTing using the GST-BPV1E2fl
protein to ensure that its high-affinity, palindromic site was
efficiently recognized within the pool of degenerate oligonucleotides.
Under our conditions of selection and enrichment, the canonical,
high-affinity, palindromic BPV1 E2 DNA binding site (ACCGggatCGGT)
but not the novel site (ACAC-N4-CGGT) was identified (data not shown). These results confirm that, despite the poor representation of canonical sites among the TA clones isolated from the GST-HPV51E2fl CASTing experiment, this procedure
readily identifies the canonical, high-affinity DNA binding site of the
BPV1 E2 protein.
HPV E2 proteins bind to the novel nonpalindromic site with
higher relative affinities than to a canonical BPV E2 site.
EMSAs were performed using E2 proteins derived from different
mucosal HPVs to ask if binding to the nonpalindromic HPV site was a
general characteristic of HPV E2 proteins (Fig.
3). The probes used were the
CASTing consensus sequence and a high-affinity canonical E2
binding site. Equal amounts of extract were used for each EMSA
reaction; however, because the expression levels of the different E2
proteins varied, meaningful conclusions can only be drawn from
comparisons between the relative amounts of the probes shifted by the
same E2 protein extract. In addition, the expression levels of the
baculoviruses expressing BPV1E2fl and HPV16E2ct
were very low, making the complexes in Fig. 3, lanes 6 and 10, difficult to see in this reproduction of the autoradiograph. Note also
that the complexes formed with nuclear extracts may contain more
proteins that interact with the probes than just the
baculovirus-expressed E2. Thus, the relative mobilities of the
various E2 complexes cannot be meaningfully compared. The GST-51E2fl protein has a higher relative affinity for the
CASTing consensus site than for the canonical E2 binding site (Fig. 3, lanes 1 and 2). The same result is found with the full-length HPV51 E2
protein purified from a baculovirus expression system (Fig. 3, lanes 3 and 4), demonstrating that recognition of the novel binding site was
not a property of either the GST fusion or bacterial expression of the
protein. Note that in the case of the GST fusions with full-length E2
proteins, because both the E2 proteins and the GST tag can dimerize
independently, there is a pair of shifted complexes that likely
represent dimers and tetramers of these proteins (Fig. 3, lanes 1 and
2). When a full-length BPV1 E2 protein was used in the EMSA, there was
no detectable shift of the GST-51E2 CASTing consensus site
(Fig. 3, lanes 5 and 6), consistent with the fact that this site was
not represented within the pool of sequenced GST-BPV1E2fl
CASTing clones (data not shown).
Analysis of DNA binding by highly purified E2 proteins.
So
that a more stringent determination of relative affinities and
off-rates could be made, a purification scheme was designed to isolate
highly purified 16E2sct, 51E2sct, and
GST-BPV1E2sct. Attempts to cleave the GST portion of the
GST-BPV1E2sct protein with factor Xa resulted in loss of
all DNA binding activity (data not shown). Therefore, factor Xa
cleavage was not performed on the GST-BPV1E2sct protein.
The GST-E2sct constructs used above were expressed in bacteria and
purified as described in Materials and Methods. Fractions were screened
for activity by EMSA and examined for the level of purity by sodium
dodecyl sulfate (SDS)-PAGE (Fig. 4A). The
active, highly purified fractions were pooled and used for all further
work defining the relative affinities and off-rates for these proteins
with selected DNA binding sites. The E2 proteins used in this study
were of comparable length, and this choice was based on the results of
Pepinsky et al., who demonstrated a correlation between the length of
BPV1 E2 C-terminal constructs and their binding affinity
(25).
Relative affinities and off-rates for E2BSs.
To further
confirm the CASTing results, we constructed a novel set of binding site
clones with defined nucleotide differences based on the consensus E2BSs
determined from the GST-HPV51E2fl and
GST-BPV1E2fl CASTing experiments. Figures 4B and C are
representative EMSAs used to define the relative affinities and
off-rates for these protein-DNA complexes. Table 2
lists the binding sites utilized and the
EMSA results for HPV16E2sct, HPV51E2sct, and GST-BPV1E2sct. The relative affinities suggested by the
frequency with which a sequence was identified in the
GST-HPV51E2fl CASTing experiment correlate with the
relative affinities of both HPV E2sct proteins in the EMSAs. Changes in
the most highly conserved nucleotides, the first and last two base
pairs, result in abrogation of binding (Table 2, compare Hwt with
Hm#7). HPV E2 proteins prefer A/T base pairs in the core of their
recognition sites (Table 2, compare Hwt with Hm#9 and Bm#3 with
Bwt). The HPV E2sct proteins bound both the CASTing consensus and
canonical E2BSs with very similar affinities (Table 2, compare Hwt with
Bm#3). This result does not directly correlate with the
GST-HPV51E2fl CASTing experiment because the
canonical E2BS was identified with a much lower frequency than
the novel E2BS (see Fig. 2B). A novel palindromic sequence,
ACACAAATGTGT (Table 2, Hm#5), was also bound by the HPV E2sct proteins with relative affinities that
were lower but still within the same order of magnitude as the
CASTing consensus and canonical E2 binding sites (Table 2, compare Hm#5 with Hwt and Bm#3).
Novel DNA binding sites are located throughout HPV genomes.
As
a first step in addressing if these novel E2BSs may have biological
relevance, we searched the papillomavirus genome database (21) for the occurrence of ACAC-N5-GGT,
ACC-N5-GTGT, and
ACAC-N4-GTGT. Most papillomavirus genomes
contain one or more of these sites in regions where E2BSs are commonly
found. Table 3 identifies the locations
of these sites in selected papillomavirus genomes. The HPV51 genome has
two such novel high-affinity sites within the LCR proximal to the
putative E1BS. The site ACCGATTTGTGT (Table 3, column LCR
[ori/E]) closest to the E1BS is identical to the HPV51 E2
CASTing consensus sequence (Fig. 3). For other HPVs that infect mucosal
epithelium (e.g., HPV11 and HPV18), the analogous E2BS is involved in
initiation of replication (E2BS 3) (3, 4, 31, 36),
suggesting that this novel site may serve a similar function in
replication of HPV51.
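A search of this kind can be sketched as a pattern scan. The example below is illustrative only and is not the tool used for the database search; it assumes a genome is available as a plain string of A, C, G, and T, and the function names are hypothetical. Note that ACC-N5-GTGT is the reverse complement of ACAC-N5-GGT, so scanning a single strand for both patterns covers both orientations of the novel site.

```python
import re


def motif_to_regex(motif):
    """Convert shorthand such as 'ACAC-N5-GGT' into a regular expression.

    Tokens of the form N<k> become k unconstrained positions; all other
    tokens are matched literally.
    """
    parts = []
    for token in motif.upper().split("-"):
        if token.startswith("N") and token[1:].isdigit():
            parts.append("[ACGT]{%s}" % token[1:])
        else:
            parts.append(token)
    return "".join(parts)


def find_sites(sequence, motif):
    """Return (0-based start, matched site) for every match, including overlaps."""
    # The lookahead lets overlapping sites be reported as separate hits.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]
```

Positions returned here are 0-based offsets into the supplied string; converting them to the genome numbering used in Table 3 would require the numbering convention of the database entry.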
Construction of the pOri16L reporter plasmids. We next asked if the novel E2BS could functionally substitute for the canonical sites. The E2 protein is involved in the regulation of both papillomavirus gene transcription and genome replication. By substituting the novel E2BS for the canonical site in a system that allows assay of E2-mediated effects, we can assess the functional role of the sites. The well-characterized HPV type 16 genome was used for these experiments. To this end, the pOri16L reporter was constructed; it contains the 3' portion of the HPV16 LCR that includes the DNA origin of replication and the overlapping early promoter structure that contains three canonical E2BSs fused to a luciferase reporter. All three sites are involved in the HPV16 E2 protein-mediated modulation of viral replication and transcription (4, 26, 32).
Figure 5A presents a schematic of the HPV16 LCR from which the HPV portion of pOri16L originates. Below it is another schematic representing the pOri16L construct, with the relevant features of the plasmid highlighted, including each of the wild-type E2BSs within the construct and the E1BS and A/T-rich sequence that are required for replication of virus DNA. In addition, two cellular transcription factor binding sites (Sp1 and the TATA box) that are known to facilitate the initiation of transcription from this promoter are present in this portion of the HPV16 LCR (27, 37).
a and BS1-c→t). Because
the novel site is not a perfect palindrome, it was placed in the
pOri16L plasmid in both orientations to determine if this would affect
the function of E2 (Fig. 5B, hE2BS [the hE2BS orientation found at the
HPV 51 ori; see Table 3] and hFLIP). The mutated pOri16L
plasmids are all named for the mutation(s) made within their E2BS(s).
Relative affinities and off-rates of the pOri16L E2BSs.
Relative affinity and stability studies with the HPV16E2sct
protein and the pOri16L construct E2BSs were performed to analyze the
effects of the sequence changes on two important biochemical parameters. Table 4 lists the results of
these HPV16 E2 binding affinity and complex stability analyses.
t and BS3-t→a clones
do not appreciably alter the binding affinity of the
16E2sct protein for these targets compared with the BS3-wt
site (Table 4). When both of these nucleotides are altered, hE2BS, a
member of the novel E2BS family, is created. The relative affinity of E2 for this binding site is 2.5-fold greater than it is for BS3-wt. Comparisons between the Rep-wt and hFLIP sites, which differ by 6 bp,
show an approximately 8.5-fold-higher relative binding affinity for
hFLIP. These two comparisons reveal that substitution of the novel E2BS
in either orientation for the Rep-wt site results in an increase in the
binding affinity of the E2 protein for that site.
The affinity of E2 for the BS1-wt site is increased by approximately
4.5-fold when a single-nucleotide change was made that converts it to
the BS2-wt core sequence (BS2-wt/BS1-c→t clone). Changing two
nucleotides to create hFLIP results in a 3.5-fold-higher affinity
compared with BS1-wt (Table 4). There is no appreciable difference in
the relative binding affinity of E2 for the BS1-wt and the hE2BS site
despite the fact that 5 of the 12 bp differ between these two E2BSs.
E2 had the highest relative affinity for the BS2-wt/BS1-c→t site. Its
relative affinity for the novel E2BSs hFLIP and hE2BS was 78 and 22%,
respectively, of that for the BS2-wt/BS1-c→t site (Table 4).
Alteration of the first two and last two nucleotides of the 12-bp E2BS,
as in hE2BS-KO, eliminated any detectable binding in these assays.
There is an apparent threefold difference in the relative binding
affinities, depending on the binding site orientation (hE2BS versus
hFLIP), that correlates with the differences in the flanking sequences.
Despite this, except for the hE2BS-KO, all binding affinities for this
collection of DNA sequences are within the same order of magnitude.
Considering that Kds for the HPV16 E2 protein
binding to its recognition sites are in the range of 10⁻¹⁰ to 10⁻¹¹ M (33), all of the interactions
described above are very strong. Thain et al. (40) have
determined the Kds for each of the E2BSs within
the HPV16 LCR in the context of their wild-type flanking sequences. The
results reported here agree with both of those studies, as the binding
sites ranked from lowest affinity to highest affinity are BS3-E2BS,
BS1-E2BS, and BS2-E2BS (Table 4).
Off-rates of the protein-DNA complexes were also determined for each of
the E2BSs (Table 4). The relative stabilities
(T1/2) of the complexes formed with the tested
E2BS variations are indistinguishable in these assays.
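The fold-changes quoted for the E2BS#1 series can be cross-checked with simple arithmetic. The sketch below normalizes them to the BS2-wt/BS1-c→t site. It uses only the approximate fold-changes stated in the text (not the measured values in Table 4), treats hE2BS as equal to BS1-wt because no appreciable difference was reported, and uses variable names invented for this example.

```python
# Fold-changes as quoted in the text (approximate; Table 4 holds the measured values).
bs1_wt = 1.0                # arbitrary reference unit
bs1_c_to_t = 4.5 * bs1_wt   # single-nucleotide change, ~4.5-fold higher
hflip = 3.5 * bs1_wt        # ~3.5-fold higher than BS1-wt
he2bs = 1.0 * bs1_wt        # assumed equal: "no appreciable difference"

def percent_of(value: float, reference: float) -> int:
    """Express value as a rounded percentage of reference."""
    return round(100 * value / reference)

hflip_pct = percent_of(hflip, bs1_c_to_t)  # 78
he2bs_pct = percent_of(he2bs, bs1_c_to_t)  # 22
orientation_ratio = hflip / he2bs          # 3.5, the "apparent threefold" difference

print(hflip_pct, he2bs_pct, orientation_ratio)
```

The 78% and 22% figures and the roughly threefold orientation effect therefore follow directly from the 4.5-fold and 3.5-fold comparisons with BS1-wt.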
Effects of site substitutions on basal promoter activity. This experiment was designed to determine if the novel E2BS can functionally substitute for the wild-type E2BSs within pOri16L in either orientation. Previous studies have shown that the HPV16 E2 protein affects expression from the early promoter at the level of transcription (26). Therefore, a luciferase cassette driven by the pOri16L constructs was used as a reporter to monitor the effects of the E2 protein on transcription from the mutated HPV16 promoters. These assays were first performed in the absence of the E2 protein to determine what effects the sequence alterations might have on the basal activity of the HPV16 early promoter. The pOri16L wild-type and mutant plasmids were introduced into SCC-13 cells along with a reference plasmid, pRL-TK (see Materials and Methods and Fig. 6 legend for details).
Figure 6A shows the effects on promoter activity when changes were made in E2BS#3. Alteration of the first two and last two nucleotides of the 12-bp palindrome in BS3-KO (Fig. 5B), which are critical for E2 protein-E2BS contact (8, 15), had little effect on the basal activity of the promoter (Fig. 6A). Similarly, changing a single nucleotide within the core 4 bp of the E2BS, as in BS3-t→a (Fig. 5B), also had little effect on
expression levels (Fig. 6A). However, the 2-bp substitution that
creates BS3-hE2BS (Fig. 5B) decreased the accumulation of luciferase
activity by 60% (Fig. 6A). Creation of BS3-hFLIP results in a more
extensive sequence change from the wild type (involving 6 of the core
12 bp of the E2BS; Fig. 5B) but only reduces the promoter activity by
30% (Fig. 6A). There are no known cellular transcription factor binding sites that directly overlap E2BS#3, but the reductions in basal
promoter activity suggest that the BS3-hE2BS and BS3-hFLIP sequence
changes interfere with some aspect of the gene expression process.
|
t), luciferase expression is reduced by 50%. More extensive mutations, such as the
4-bp changes made to create BS1-hE2BS (Fig. 5B), result in a similarly
reduced level of luciferase expression (Fig. 6C). These results suggest
that the dC→dT transition affected the basal activity of the
promoter. However, when this mutation is combined with alterations to
the third and fourth nucleotides of this E2BS to create the BS1-hFLIP
clone (Fig. 5B), basal promoter activity is partially restored (Fig.
6C). There are no known transcription factor binding sites that overlap
E2BS#1 whose interruption or alteration could explain the effects of
these sequence changes on basal promoter activity. The TATA box is
spaced only 3 bp from E2BS#1 in the HPV16 LCR and pOri16L constructs
(Fig. 5A). It would not be surprising if any or all of the mutations
that we made to E2BS#1 had effects on the interaction of the
transcription initiation complex with this region of the promoter.
Comparison of the results of mutations to E2BS#3 and E2BS#2 reveals
that substitution with the novel E2BS in a particular orientation does
not cause an equivalent decrease in levels of basal promoter activity
(Fig. 6A and B, compare BS3-hE2BS and BS3-hFLIP versus BS2-hE2BS
and BS2-hFLIP).
We also intended to determine if the novel E2BSs might affect the way
that E2 cooperatively mediates repression by making substitutions to
more than one of the three E2BSs found in the pOri16L construct.
However, changes to multiple E2BSs, in the combinations that we tested,
resulted in severely reduced levels of basal expression compared to the
wild type (Fig. 6D). Lewis et al. (14) described a similar
phenomenon when making mutations to multiple E2BSs within plasmids
containing the full HPV16 LCR driving a luciferase reporter. This
suggests that cellular transcription factors interact with many
sequences overlapping the E2BSs within the viral LCR even in the
absence of viral proteins. Because of these results, clones with
mutations to multiple E2BSs were not used for subsequent analyses.
Novel E2BS functionally substitutes for wild-type E2BSs in transient transcription assays. The E2 protein can repress transcription from the HPV early promoter (30). Therefore, an HPV16 E2 protein expression construct was cotransfected into SCC-13 cells along with each of the pOri16L mutants and the reference plasmid pRL-TK to determine whether the novel E2BS could functionally substitute for the wild-type sites in an E2-responsive manner.
Although many of the E2BS mutations resulted in reduced levels of basal expression from this cassette, the activities of all the promoters were still high enough to assay for repression by the HPV16 E2 protein. The HPV16 E2 protein repressed luciferase expression from the BS3-KO, BS3-t→a, and BS3-hFLIP plasmids almost as efficiently as it did from
the wild-type promoter (Fig. 7A). In
stark contrast, expression from the BS3-hE2BS reporter, which had only
40% of the basal luciferase activity of the wild-type promoter (Fig. 6A), was stimulated slightly by E2 (Fig. 7A). There is no precedent for
alteration of the E2 function from a repressor to an activator of early
promoter activity when only a 2-bp change in an E2BS sequence is made.
A similar binding-site orientation-dependent effect was observed for
E2 protein function in transient-replication assays (see below). We
propose that E2's functional dependence on the orientation of the
novel E2BS indicates that the E2 protein may asymmetrically bind to the
nonpalindromic, novel E2BS and that this profoundly affects how it is
able to interact with other proteins.
|
Novel E2BS functionally substitutes for the wild-type E2BSs in transient-replication assays. Another major function of the E2 protein in the viral life cycle is the stimulation of DNA replication in conjunction with the papillomavirus E1 protein. To determine if the novel E2BS can functionally substitute for the canonical E2BSs in E2-mediated DNA replication, we used the pOri16L mutants in transient-replication assays. The pOri16L constructs were designed to be analogous to plasmids used in other studies of HPV replication (32, 36). Each of the E2BSs contained in the pOri16L plasmid is known to influence the efficiency of replication. Transient-replication assays were performed as described for HPV31 (12, 28, 32).
There is no detectable replication in the absence of E1- and/or E2-expressing plasmids (Fig. 8A, −E1/−E2, −E2, and −E1) (32). In addition, various
amounts of E1 and E2 expression plasmids relative to the pOri16L
plasmid were tested to determine if they had any effect on transient
replication. Replication of pOri16L was more readily detected when the
E1 and E2 expression plasmids were transfected in molar excess to the
pOri16L target. However, regardless of what the ratios of pOri16L
mutant to E1 and E2 expression plasmids were, the relative replication
activities of the mutant pOri16L targets remained unaffected.
These results agree with those of Sakai et al. (32).
Figure 8A is a representative Southern blot from one of the three
experiments used to determine the replication activities summarized in
Fig. 8B to D.
|
a) has little effect on pOri16L replication
(Fig. 8B, BS3-t→a). In contrast, substitution with the novel E2BS
sequence (Fig. 5B, BS3-hE2BS) results in an almost twofold increase in
DNA replication. A similar increase in replication efficiency is
detected when the novel E2BS is placed in the opposite orientation
(Fig. 8B, BS3-hFLIP). Therefore, the novel E2BS can substitute in
either orientation for E2BS#3 and it enhances E2's stimulation of
transient replication activity 1.5- to 2-fold. This increase in
replication efficiency correlates with an increase in relative affinity
as detected by EMSAs (Table 4).
Figure 8C shows the effect of changes to sequences in
E2BS#2 (see Fig. 5B for sequence details). Elimination of the E2
binding activity at this site virtually eliminated detectable levels of transient replication (Fig. 8C, BS2-KO). Substitution of this site with the novel E2BS stimulates replication levels twofold (Fig.
8C, BS2-hE2BS), as it did for the BS3-wt to BS3-hE2BS and BS3-hFLIP
changes (Fig. 8B). However, placement of the novel E2BS at E2BS#2
in the opposite orientation abrogates replication (Fig. 8C,
BS2-hFLIP). This result contrasts with those from the expression assay, where the HPV16 E2 protein repressed expression from the BS2-KO, BS2-hE2BS, and BS2-hFLIP plasmids to similar levels (Fig. 7B). Thus, at this position in the LCR, the orientation of the novel
E2BS influences E2-mediated replication. This orientation dependence
may reflect changes in the conformation of the E2 protein-DNA complexes
that form at this site with respect to the replication machinery. We
detected a threefold effect of orientation on binding affinities in our
EMSAs. These binding affinity differences between the BS2-hE2BS and
BS2-hFLIP site, in the context of the pOri16L plasmid, could explain
the differences in their replication efficiencies.
Figure 8D shows the effect of changes to sequences in E2BS#1. The
replication capacity of the BS1-KO mutant template is only 40% of that of the
wild-type template (Fig. 8D, BS1-KO). In contrast, replacement of
E2BS#1 with the BS1-c→t site or the novel E2BS in either orientation
(BS1-hE2BS and BS1-hFLIP) has little effect on the replication
efficiency (Fig. 8D). Therefore, in contrast to its activity when used
to replace E2BS#2, the novel E2BS can functionally substitute in either
orientation for the wild-type E2BS#1.
These replication studies demonstrate that the novel E2BS can
functionally substitute for the wild-type E2BSs in the context of
transient replication at each of the three HPV16 LCR E2BSs studied here,
but in the case of the E2BS#2, functional substitution depends on
orientation. These results, taken together with the similar orientation
dependence seen in our luciferase expression assays for the E2BS#3
site, suggest that the ability of the E2 protein to interact with
cellular proteins can be affected by the orientation of the novel E2BS.
| |
DISCUSSION |
|---|
|
|
|---|
CASTing was used to identify a novel family of HPV E2BSs whose members are bound with affinities similar to that of the canonical E2BS. The CASTing results reveal the promiscuous nature of the HPV E2 protein with respect to DNA binding site selection. Our results point to the potential flexibility of the HPV E2 protein's conformation upon binding to the E2BSs, as these proteins can bind to either the CASTing consensus (ACAC-N4-CGGT) or the canonical E2BS (ACCG-N4-CGGT) with similar relative affinities if their core and flanking nucleotides are conserved (Table 2, Hwt versus Bm#3). These novel sites are present in HPV genomes at locations where E2BSs are commonly found.
To show that the novel site also has biological significance, we substituted it for each of three of the wild-type canonical sites within the HPV16 LCR (Fig. 5). We demonstrated that in both transient-transcription assays (Fig. 7) and transient-replication assays (Fig. 8), substitution for the wild-type E2BSs with the novel E2BSs had, in some instances, very strong and unpredicted effects on the ability of the HPV16 E2 protein to modulate transcription and replication. In the case of BS3-hE2BS, reporter expression was activated rather than repressed. For BS2, replication was only supported when the site was replaced with the novel hE2BS in one orientation. Thus, binding to this novel, asymmetric site affects the HPV16 E2 protein's activities in an orientation-dependent manner.
Binding properties of HPV E2 proteins. The CASTing experiment (Fig. 2) and relative affinity studies (Tables 2 and 4) reveal that the HPV E2 proteins that we examined have a preference for an A/T-rich 4-bp core. There was a >100-fold difference in binding affinities of the HPV16 E2 protein for an A/T-rich core versus a core that contained only 2 G/C bp (e.g., Table 2, Hwt versus Hm#9).
In agreement with studies by Thain et al. (40), we noted a preference by the HPV E2 proteins for purine (R) and pyrimidine (Y) residues at the
±1 positions flanking the binding site (Fig. 2A,
78% R adjacent to the ACAC and 92% Y adjacent to the CGGT). This property of the E2BSs is preserved throughout the HPV
genomes. The LCRs of the mucosa-specific HPVs contain three E2BSs
involved in the initiation of replication (4). In all of
these sites, the flanking purine and pyrimidine residues are well conserved.
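A tally of flanking residues of the kind summarized above (percent purine next to the ACAC half-site, percent pyrimidine next to the CGGT half-site) could be computed as in the following sketch. The four 14-nt sequences are made-up placeholders, not CASTing isolates, and the function name and input layout (one flanking base on each side of a 12-bp site) are assumptions made for illustration.

```python
PURINES = set("AG")      # R
PYRIMIDINES = set("CT")  # Y

def flank_percentages(sites):
    """Return (% purine at the 5' flank, % pyrimidine at the 3' flank).

    Each entry is assumed to be 14 nt: one flanking base, the 12-bp site,
    and one flanking base.
    """
    if not sites:
        raise ValueError("need at least one sequence")
    five_prime_r = sum(s[0] in PURINES for s in sites)
    three_prime_y = sum(s[-1] in PYRIMIDINES for s in sites)
    n = len(sites)
    return 100 * five_prime_r / n, 100 * three_prime_y / n

# Hypothetical inputs for illustration only (not data from the paper).
toy_sites = [
    "AACACAATTCGGTT",
    "GACACAAATCGGTC",
    "CACACTTTTCGGTT",
    "AACACAAAACGGTA",
]
pct_r, pct_y = flank_percentages(toy_sites)
print(pct_r, pct_y)  # 75.0 75.0
```

Applied to a real set of aligned isolates, the same counts would yield percentages comparable to the 78% R and 92% Y reported for Fig. 2A.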
The novel E2BS was the most frequently isolated sequence in two
independent CASTing experiments with GST-HPV51E2fl.
Yet, when the core and flanking sequences are conserved, the HPV E2
proteins bind the canonical palindromic sites (2) with a
slightly higher affinity (e.g., Table 2, Hwt versus Bm#3). This paradox
may reflect the sensitivity of the CASTing procedure to subtle
differences in the binding affinities or stability of these protein-DNA
complexes that were undetectable by EMSA.
Finally, EMSAs confirm that binding to the novel site is not a property
shared by the BPV1 E2 protein (Table 2, Hwt and Hm#9).
Structural determinants for E2 binding site preferences. The
molecular basis for the differences in the DNA binding activities of
these proteins is not readily apparent. Cocrystal structures have been
published of only the BPV1 and HPV18 E2 proteins bound to canonical DNA
binding sites (9, 13) (Fig.
9). If we make two assumptions, that the
affinity for the novel hE2BS site and the specific protein-DNA contact
points are conserved between the HPV16 and HPV18 E2 proteins, then the
cocrystal structures cannot explain the differences in binding
properties between the HPV and BPV E2 proteins. The identity of all but
one of the amino acids that contact the DNA is conserved among the E2
proteins used in this study. The single difference is a change from
Phe343 in the BPV1 E2 protein to a Tyr in the corresponding site in the HPV E2 proteins (Fig. 9A). As both crystal structures indicate that these residues make comparable DNA contacts, this sequence difference cannot explain the differences in the binding properties of
the BPV and HPV E2 proteins. In addition, this aromatic residue makes
contact with the T at the 3′ end of the DNA binding site (-GGT) that is
absolutely conserved in all functional E2BSs examined to date.
|
α-helix oriented in the major groove
of the DNA. Figures 9C and D show the amino acids that make specific
contacts with the DNA base pairs. The E2 proteins contain a conserved
Lys residue that makes direct contact with the adjacent dG's in the
DNA binding site (CGGT) (Fig. 9C and D). Because the Lys
residue is conserved in all of the papillomavirus E2 proteins sequenced
to date, binding to the novel sequences ACCG-N4-GTGT
and ACAC-N4-GTGT by the HPV E2 proteins
suggests a fundamental difference in the amino acid-DNA contacts and/or the positioning of the DNA-contacting α-helices.
The data available to us suggest that protein flexibility is an
important determinant of sequence recognition. Nuclear magnetic resonance and X-ray crystallographic studies comparing the bound and
free states of the homodimeric DNA binding domain of BPV1 E2 describe a
conformational change upon binding to DNA. This change involves both
the dimerization domain and the α-helix that contacts DNA (9,
47). It is plausible that HPV E2 proteins are more flexible than
the BPV1 E2 protein. The fact that the HPV E2 proteins strongly prefer
a flexible and/or prebent A/T-rich, 4-bp core within their 12-bp
binding sites also supports the concept that the HPV E2 proteins can
undergo a greater conformational change to achieve a stable complex
with DNA.
Structural analyses of an HPV E2 protein complexed with both the
canonical and novel HPV E2 DNA binding sites would help to address
these issues.
Effect of E2 protein affinity for its binding sites. Steger et al. (35) used in vitro transcription assays to demonstrate that the amount of HPV18 E2 can determine whether it acts as an activator or repressor of transcription from the HPV18 early promoter. When template containing the HPV18 LCR (with four E2BSs) is in excess, addition of the HPV18 E2 protein results in stimulation of transcription. By contrast, when the HPV18 E2 protein is in excess, transcription is repressed. Thus, E2 abundance may act as a switch to differentially control early-gene expression at distinct stages of the viral life cycle.
Transient-transfection assays in which the E2 protein is expressed from a strong promoter, as used in this study, resemble the situation where the E2 protein is in excess of the available E2BSs. Therefore, we would predict that if the new sites had affinities similar to those of the wild-type E2BS, E2 should repress early-gene expression. The 16E2sct protein used in this study had similar or elevated affinities for all of the sites studied here, with the exception of the KO sites (Table 4). Despite this, the substitution of the BS3-hE2BS site in pOri16L resulted in a promoter that was weakly stimulated rather than repressed by the HPV16 E2 protein (Fig. 7A). The BS3-hE2BS construct retains wild-type E2BS#1 and E2BS#2, which E2 utilizes to repress transcription from this promoter. Even if the E2 protein is not expressed in excess, it does not have a higher affinity for the BS3-hE2BS site than for E2BS#1 and E2BS#2 (Table 4). Thus, the stimulatory effect of E2 binding to BS3-hE2BS overrides the repressive effects of E2 binding to E2BS#1 and E2BS#2. These data suggest that binding affinity alone is not sufficient to explain the effect of E2 protein interactions with an E2BS.
Mechanisms of E2 protein-mediated transcription repression. All three of the E2BSs within pOri16L facilitate E2-mediated repression of gene expression from this promoter (17). The spacing and relative locations of the E2BSs are well conserved, with respect to the TATA box for the early promoter and with respect to the binding sites of the known cellular transcription factors and the viral E1 protein, among the mucosal HPVs. At two E2BSs, E2BS#1 and E2BS#2, the E2 protein can compete for binding with the cellular transcription factors TFIID and Sp1, respectively (5, 37, 38). These two factors are involved in the stimulation of early-gene expression from this promoter in the absence of the E2 protein.
The TATA box utilized for the initiation of early-gene expression is adjacent to E2BS#1. If the E2 protein is bound to E2BS#1, it will interfere with the binding of the transcription initiation complex that forms around the TFIID complex (11). Therefore, direct interference with the formation of the transcription machinery complex is a m