Identification of a Short, Hydrophilic Amino Acid Sequence Critical for Origin Recognition by the Bovine Papillomavirus E1 Protein

ABSTRACT The E1 protein of bovine papillomavirus (BPV) is a site-specific DNA binding protein that recognizes an 18-bp inverted repeat element in the viral origin of replication. Sequence-specific DNA binding function maps to the region from approximately amino acids 140 to 300, and isolated polypeptides containing this region have been shown to retain origin binding in vitro. To investigate the sequence and structural characteristics which contribute to sequence-specific binding, the primary sequence of this region was examined for conserved features. The BPV E1 DNA binding domain (E1DBD) contains three major hydrophilic domains (HR1, amino acids 179–191; HR2, amino acids 218 to 230; and HR3, amino acids 241 to 252), of which only HR1 and HR3 are conserved among papillomavirus E1 proteins. E1DBD proteins with lysine-to-alanine mutations in HR1 and HR3 were severely impaired for DNA binding function in vitro, while a lysine-to-alanine mutation in HR2 had a minimal effect on DNA binding. Mutation of adjacent threonine residues in HR1 (T187 and T188) revealed that these two amino acids made drastically different contributions to DNA binding, with the T187 mutant being severely defective for origin binding whereas the T188 mutant was only mildly affected. Helical wheel projections of HR1 predict that T187 is on the same helical face as the critical lysine residues whereas T188 is on the opposing face, which is consistent with their respective contributions to DNA binding activity. To examine E1 binding in vivo, a yeast one-hybrid system was developed. Both full-length E1 and the E1DBD polypeptide were capable of specifically interacting with the E1 binding site in the context of the yeast genome, and HR1 was also critical for this in vivo interaction. Overall, our results indicate that HR1 is essential for origin binding by E1, and the features and properties of HR1 suggest that it may be part of a recognition sequence that mediates specific E1-nucleotide contacts.

Bovine papillomavirus (BPV) replication requires only two viral proteins, E1 and E2 (40), with the rest of the replication machinery supplied by the host cell (27). Both the E1 and E2 proteins are site-specific DNA binding proteins which recognize sequences in the viral origin of replication (1,41,42). The E2 binding site (E2BS) is a 12-bp partial palindrome (1,22), while E1 binds to an 18-bp imperfect inverted repeat sequence (18,19,34). In vitro, only the E1 protein is absolutely required, indicating that the E2 protein does not supply a requisite replication function (3,5). Instead, E2 appears to act as an auxiliary factor that interacts directly with E1 (24,35,44) and increases binding site specificity (5,32). Moreover, at low protein concentrations, E2-E1 complexes facilitate loading and assembly of additional E1 molecules on the origin to form the active initiation complex (30). Initial binding of E1 induces distortion in the origin DNA that is a likely prelude to origin unwinding, and this distortion of the origin DNA is also enhanced by E2 (13,30). Ultimately, a hexameric E1 complex with ATP-dependent helicase activity is assembled (11,33), though the precise steps in the formation of this hexamer are undefined. E1 also interacts with several host cell proteins, including DNA polymerase α and replication protein A (4,14,29), and presumably recruits these host replicative factors to the viral initiation complex.
The origin binding function of E1 is central to its role in viral genomic replication. Truncation studies of the 605-amino-acid BPV E1 protein localized the DNA binding domain (DBD) of E1 (the E1DBD) to approximately amino acids 140 to 300 (21,31). Isolated polypeptides expressing this DBD region in the absence of other E1 sequences retain origin-specific binding activity (7,21,31,39). Since the ATP binding site and helicase activity map to the C-terminal portion of E1 and are not present in the DBD, neither of these activities is required for origin recognition (39). The E1DBD does retain the ability to interact with E2 protein (21), and this interaction cooperatively enhances origin binding (7). Clearly then, the E1DBD is a functional subdomain with binding properties similar to those of full-length E1 protein.
In contrast to the E2DBD, which has been crystallized and studied in detail (15,16,17), little is known about the structure of the E1DBD or the sequences which mediate origin binding. A previous study identified a pair of conserved heptad repeats of hydrophobic residues from amino acids 249 to 282 (39). A triple mutant with substitutions at two of the conserved hydrophobic residues failed to bind origin DNA, consistent with a role for these elements in origin binding. However, the third mutation in this triple mutant changed a conserved proline located between the heptads to an alanine. The proline-to-alanine substitution could well have a significant effect on protein folding in this region, which makes interpretation of the role of the heptad repeats uncertain. Two additional mutants with basic residue-to-alanine substitutions in the region adjacent to the N-terminal side of the first heptad (amino acids 241 to 247) were both severely defective for origin binding, demonstrating that this hydrophilic region was critical for E1-DNA interaction. One additional mutation at a distal site, an arginine 180-to-alanine substitution, reduced E1 DNA binding activity to less than 1% of the wild-type (WT) E1 level, though this region was not investigated further (39). In the present study, the hydrophilic region that includes E1 amino acid 180 was investigated in detail and was shown to be critical for site-specific DNA binding both in vitro and in vivo. Features of this region suggest that it may be part of a recognition sequence that mediates specific contact with nucleotides in the E1BS.

MATERIALS AND METHODS
Plasmids and mutant construction. The DNA fragment encoding the WT E1DBD region from amino acids 121 to 311 (E1 121-311) was amplified under standard conditions from a cloned E1 gene, using Ultma DNA polymerase (Perkin-Elmer) and primers E1AA121 (5′-GAGCTGAATTCGCTAACCGTGTTCTTACGCCC-3′) and E1AA311 (5′-CGTCCGTCGACCGCGAATTTCTCGGTCTGCAAGCT-3′). The amplified product was ligated directly to pCRBLUNT (Invitrogen) and transformed into TOP10 cells (Invitrogen). A recombinant clone was identified, and the E1DBD fragment was excised from purified plasmid DNA by EcoRI digestion. The E1DBD fragment was ligated to pGEX-5X-1 DNA (Pharmacia Biotech) that had been linearized by EcoRI digestion and was transformed into TOP10F′ (Invitrogen) cells. Clones were screened for the presence of the insert fragment, and a recombinant clone with the correctly oriented fragment was identified. The entire sequence of the insert fragment was determined and found to be the correct, WT sequence and to be in frame with the glutathione S-transferase (GST) coding region. This clone, designated pGEX-E1DBD, was used for the production of all point mutants. Ten individual point mutations were constructed in pGEX-E1DBD by site-directed mutagenesis using mutagenic oligonucleotides and a QuikChange mutagenesis kit (Stratagene) as specified by the manufacturer. Mutant clones were identified by direct DNA sequencing of plasmid minipreparations; for each clone, only the desired mutation was present in the E1DBD coding region.
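As a quick check on primer design, restriction sites carried by the two amplification primers can be located programmatically. The sketch below is illustrative only: it assumes that the spaces inside the printed primer sequences are line-wrap artifacts, it uses the standard recognition sequences for EcoRI (GAATTC) and SalI (GTCGAC), and the function name `find_sites` is hypothetical.

```python
# Standard recognition sequences for two common restriction enzymes.
SITES = {"EcoRI": "GAATTC", "SalI": "GTCGAC"}

# Primer sequences from the text, with internal spaces removed (assumption:
# the spaces are typesetting artifacts, not part of the oligonucleotides).
PRIMERS = {
    "E1AA121": "GAGCTGAATTCGCTAACCGTGTTCTTACGCCC",
    "E1AA311": "CGTCCGTCGACCGCGAATTTCTCGGTCTGCAAGCT",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for every site found in seq."""
    hits = {}
    for enzyme, motif in sites.items():
        positions = [
            i
            for i in range(len(seq) - len(motif) + 1)
            if seq[i:i + len(motif)] == motif
        ]
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

Only the top strand of each primer is scanned; both recognition sequences are palindromic, so a reverse-complement scan would report the same positions.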
For in vivo expression in yeast, E1 and the E1DBD were produced as fusions with the GAL4 activation domain (AD), using the pGAD424 vector (Clontech). For full-length E1, the 2,185-bp NruI-to-FspI fragment encompassing the E1 coding region was isolated from pdBPV.1 DNA by gel purification. BamHI linkers were added to the fragment, and it was ligated to BamHI-digested pGAD424. Two clones were identified with the insert in either the correct, forward orientation [pGAD-E1(F)] or the reverse orientation [pGAD-E1(R)]. The vector-insert junctions were sequenced to confirm the insert orientation and to ensure that the forward clone could produce the AD-E1 fusion protein. For the E1DBD, the E1 121-311 fragment was excised from the above pCRBLUNT clone by EcoRI digestion and ligated to EcoRI-digested pGAD424. Again, forward (pGAD-E1DBD) and reverse [pGAD-E1DBD(R)] clones were identified and confirmed by DNA sequencing. The expression of both the AD-E1 and the AD-E1DBD proteins from these vectors in yeast was confirmed by Western blotting (data not shown). pGAD53m (Clontech) expresses the AD-p53 protein (mouse p53).
Purification of E1DBD proteins. For expression of WT and mutant E1DBD proteins, the respective pGEX-E1DBD plasmids were transformed into Escherichia coli BL21. BL21 transformants were grown to mid-log phase at 37°C in 2XYT medium with ampicillin (50 µg/ml) and then induced by addition of isopropyl-β-D-thiogalactopyranoside to 1 mM. After a 2-h induction, cultures were packed in ice for 30 min and harvested by centrifugation (15 min at 10,000 × g), and the pellets were frozen at −20°C. For lysis and extraction, the cell pellets were thawed on ice, resuspended in a minimal volume of cold GST-C buffer (50 mM Tris-HCl [pH 7.9], 250 mM NaCl, 5 mM EDTA, 5 mM dithiothreitol [DTT], 10 mM phenylmethylsulfonyl fluoride [PMSF], 10% glycerol), and incubated on ice for 60 to 120 min with lysozyme at a final concentration of 0.1 mg/ml. After the lysozyme digestion, the cell suspensions were subjected to two rounds of passage through a prechilled French press cell at 16,000 lb/in², followed by addition of NP-40 to 0.1%. Alternatively, some preparations were lysed by resuspension of the original cell pellets in B-Per reagent (Pierce) followed by lysozyme treatment. To further solubilize the proteins extracted by either procedure, the extracts were sonicated twice for 15 s at maximum power with a microtip in an Ultrasonics sonicator. The extracts were clarified by centrifugation for 30 min at 20,000 × g, and the supernatants were rotated overnight at 4°C with glutathione-Sepharose beads (Pharmacia). Subsequently, the beads were collected at 4°C by centrifugation for 5 min at 500 × g and were washed twice with 10 ml of cold GST-C, three times with 10 ml of cold GST-D (50 mM Tris-HCl [pH 8.0], 200 mM NaCl, 5 mM EDTA, 5 mM DTT, 10 mM PMSF, 10% glycerol), three times with 10 ml of cold GST-E (50 mM Tris-HCl [pH 8.0], 1.0 M NaCl, 5 mM EDTA, 5 mM DTT, 10 mM PMSF, 10% glycerol), and twice with 10 ml of cold GST-C.
The washed beads were incubated for 10 min at room temperature in 500 µl of GST-C with 10 mM reduced glutathione followed by centrifugation for 1 min in a microcentrifuge. The supernatant containing the GST-E1DBD fusions was collected, adjusted to a final concentration of 50% glycerol, and stored at −20°C. Protein concentration was determined by a Bradford assay (6), and purity was assessed by scanning sodium dodecyl sulfate (SDS)-polyacrylamide gels with an IS1000 digital imaging system (Innotech Corp.). For removal of the GST moiety, factor Xa cleavage was performed as specified by Pharmacia. However, GST-E1DBD was found to be much more stable in gel shift assays than the free E1DBD, and so the fusion protein was used for all the studies reported here. Comparison of the properties of GST-E1DBD and free E1DBD will be reported elsewhere.
Gel mobility shift assay. Mobility shift assays were performed essentially as described by Chen and Stenlund (7). Purified WT or mutant GST-E1DBD proteins were incubated with 2.5 fmol of radiolabeled substrate and 20 ng of pUC18 DNA in 10-µl reaction mixtures consisting of gel shift assay buffer supplemented to a final concentration of 5 mM DTT and 0.07% bovine serum albumin. Samples were incubated for 30 min at 25°C and then loaded onto 8% polyacrylamide gels in 0.5× Tris-borate-EDTA buffer. The 10× Tris-borate stock used for both the gel and the tank buffer was adjusted to pH 7.5, as this pH provided better resolution of protein-DNA complexes. Gels were electrophoresed at 100 V for 4 to 5 h. Dried gels were visualized and quantitated with a Molecular Dynamics PhosphorImager.
Yeast one-hybrid system. Plasmid p53BLUE (Clontech) contains three tandem copies of the p53 binding site (p53BS) inserted into a minimal yeast promoter (P CYC1) upstream of the lacZ gene in plasmid pLacZi. A similar E1 reporter plasmid was constructed by cloning a double-stranded oligonucleotide consisting of three tandem copies of the 18-bp E1BS into the P CYC1 region of pLacZi. The sequence of the resultant recombinant, designated pLacZi-E1BST, was confirmed by DNA sequencing. Purified p53BLUE and pLacZi-E1BST DNAs were linearized by NcoI digestion to promote vector integration and transfected into Saccharomyces cerevisiae YM4271 by the polyethylene glycol-lithium acetate procedure (12). Transformants were selected on minimal medium lacking uracil and were passaged several times on this medium to eliminate cells harboring unintegrated vector. The resulting yeast reporter strains containing either the integrated p53BS promoter-lacZ gene or the E1BST promoter-lacZ gene were designated p53BS-LACZ and E1BST-LACZ, respectively. Purified plasmid DNAs encoding the various GAL4 AD fusions were transfected into both of the reporter strains, and transformants were isolated on minimal medium lacking uracil and leucine.
Western blots. Whole-cell extracts were prepared from the yeast transformants by freeze-thawing cell pellets three times and then boiling them for 10 min in cracking buffer (40 mM Tris-HCl [pH 6.8], 0.1 mM EDTA, 8 M urea, 5% SDS, 0.04% bromophenol blue, 0.88% β-mercaptoethanol, 0.077% PMSF). Equivalent amounts of protein from each sample were electrophoresed on a 12% polyacrylamide gel, and the proteins were electrophoretically transferred to a Protran nitrocellulose membrane (Schleicher & Schuell) as previously described (21). The blots were probed with a 1/1,000 dilution of anti-GAL4 AD (Upstate Biotechnology) and visualized by enhanced chemiluminescence with Super Signal (Pierce).
β-Galactosidase assays. Two procedures were used for assessing the interaction of AD fusion proteins with the E1BST or p53BS promoter in the yeast reporter strains: a colony-lift filter assay and a liquid culture CPRG (chlorophenol red-β-D-galactopyranoside) assay. Both assays were performed as described in the Clontech Yeast Protocols Handbook. A brief description of each assay follows.
(i) Colony-lift filter assay. Transformants were inoculated onto an appropriate selective medium and grown for 2 to 3 days at 30°C until colonies were 1 to 2 mm in diameter. Colonies were replica lifted onto Whatman filter paper and lysed by freezing in liquid nitrogen followed by a thaw at room temperature. Subsequently the filter was wetted by incubation on top of a second filter soaked in Z buffer (50 mM sodium phosphate [pH 7.0], 10 mM KCl, 1 mM MgSO4, 0.27% β-mercaptoethanol) with 330 µg of X-Gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside) per ml. Development of blue color was monitored visually, and the filter was photographed with an IS1000 digital imaging system (Innotech).
(ii) Liquid culture CPRG assay. Overnight cultures were grown in the appropriate selective medium for each transformant, and the optical density at 600 nm was recorded for later normalization. Cultures (1.5 ml) were harvested by centrifugation in microcentrifuge tubes, washed one time with buffer 1 (10 mM HEPES [pH 7.3], 150 mM NaCl, 0.065% L-aspartate, 1% bovine serum albumin, 0.05% Tween 20), and resuspended in 300 µl of buffer 1. Cells were lysed by three cycles of alternating between liquid nitrogen and a 37°C water bath. Aliquots of 100 µl of lysed cells were mixed with 700 µl of buffer 1 containing 2.23 mM CPRG and vortexed thoroughly. Samples were incubated at room temperature until visible color formation was observed or for up to 3 h. After color formation, or at 3 h if no color was observed, the reactions were stopped by addition of 500 µl of 3 mM ZnCl2. Cell debris was removed by centrifugation for 1 min in a microcentrifuge, the optical density at 578 nm of the supernatants was read for each sample, and the β-galactosidase units were calculated. All samples were assayed in triplicate, and the values reported are the averages.
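The unit calculation itself is not spelled out in the text. A minimal sketch follows, assuming the Miller-style formula associated with the handbook's CPRG protocol: units = 1,000 × OD578 / (t × V × OD600), with t in minutes and V = 0.1 ml × concentration factor. The concentration factor of 5 is inferred from resuspending 1.5 ml of culture in 300 µl; the function names are hypothetical, and any readings passed in are the user's own.

```python
def beta_gal_units(od578, minutes, od600, concentration_factor=5.0):
    """Miller-style units: 1000 * OD578 / (t * V * OD600).

    V is 0.1 ml multiplied by the concentration factor (assumed to be 5,
    i.e. 1.5 ml of culture resuspended in 0.3 ml of buffer).
    """
    if minutes <= 0 or od600 <= 0:
        raise ValueError("incubation time and OD600 must be positive")
    volume = 0.1 * concentration_factor
    return 1000.0 * od578 / (minutes * volume * od600)


def mean_units(replicates):
    """Average the units over replicate (od578, minutes, od600) readings."""
    values = [beta_gal_units(*reading) for reading in replicates]
    return sum(values) / len(values)
```

Because the elapsed time enters the denominator, samples stopped early (strong color) and samples stopped at 3 h (weak or no color) are placed on the same scale before the triplicates are averaged.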

RESULTS
Conservation of hydrophilic domains in the E1DBD. We previously showed that the origin-specific DNA binding activity of the BPV E1 protein resides in the E1 121-311 fragment and that this isolated polypeptide retained site-specific recognition function (21). To initiate a functional study of the E1DBD, the primary sequence was examined for physical features. Hydropathy plots of the BPV E1DBD were prepared by using six different algorithms. All six plots were very similar (data not shown) and revealed three subregions with a high degree of intrinsic hydrophilicity, possibly reflecting accessible sequences capable of interacting with origin nucleotides. In the Kyte-Doolittle plot shown in Fig. 1A, the hydrophilic regions are designated HR1, HR2, and HR3. Given the cross-species functional conservation of E1 proteins (2,8,38) and the sequence conservation of the E1BS among different papillomaviruses (18), it was likely that subregions of the E1DBD critical for origin interaction would show some conservation of overall properties and specific primary amino acid sequence. Consequently, similar examination of hydropathy plots was conducted for the predicted DBDs of eight human papillomavirus (HPV) E1 proteins (Table 1). All eight HPV E1 proteins had significant hydrophilic character in the regions corresponding to HR1 and HR3 of BPV E1, while only three of the eight were hydrophilic in the region corresponding to HR2. This conservation of HR1 and HR3 implicates them both as potentially important for E1 DNA binding function, while HR2 may be of lesser or no significance. Consistent with this prediction, amino acids within HR3 have already been shown to be critical for origin binding by BPV E1 (39). HR1 has not been thoroughly investigated, although mutation of amino acid 180 at the N-terminal boundary of HR1 dramatically reduced the origin binding ability of E1 in an earlier study (39).
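For readers unfamiliar with hydropathy plots, the sliding-window calculation behind a Kyte-Doolittle profile can be sketched as follows. The scale values are the published Kyte-Doolittle indices; the window size of 9, the threshold of −1.5, and the function names are illustrative assumptions and not the settings used to generate Fig. 1A.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Mean hydropathy of each window; negative values are hydrophilic."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def hydrophilic_windows(seq, window=9, threshold=-1.5):
    """0-based start positions of windows whose mean falls below threshold."""
    profile = hydropathy_profile(seq, window)
    return [i for i, value in enumerate(profile) if value < threshold]
```

Runs of consecutive positions returned by `hydrophilic_windows` correspond to candidate hydrophilic regions; applying the same function to aligned homologs is one way to ask whether such a region is conserved in character, as was done here for HR1 to HR3.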
Charge-to-alanine mutations identify critical amino acids for E1DBD origin recognition function. To investigate the contribution of HR1 to origin binding, lysine-to-alanine mutations were constructed at conserved residues 183 and 186 in the context of the isolated E1DBD polypeptide (Fig. 1B). Additional lysine-to-alanine mutations were created elsewhere in the DBD at both conserved and nonconserved residues located in sequence contexts of varying hydrophilicity. Each mutant E1DBD protein was expressed as a GST fusion and affinity purified to greater than 90% homogeneity (Fig. 2). The purified proteins were assayed for in vitro DNA binding activity using a gel shift assay with short (50- or 60-bp) double-stranded oligonucleotides as the substrates (Fig. 3). Oligonucleotides of this length were previously shown to be sufficient for efficient binding of E1 protein (18). The binding test substrate contained the authentic BPV origin sequence consisting of the AT-rich element, the E1BS, and the low-affinity E2BS12, while the control substrate was identical except that the 18-bp E1BS was replaced with an unrelated palindromic sequence.
At low concentrations of WT E1DBD, a single predominant E1-DNA complex was observed with the origin substrate, while at higher concentrations additional retarded complexes were formed (Fig. 3, lanes 3, 4, 14, and 15). Mixing studies with GST-E1DBD and a GST-E1 1-311 protein indicate that the predominant complex is a monomer of GST-E1DBD bound to the substrate (C. Bazaldua-Hernandez and V. G. Wilson, unpublished results). This suggests that the more slowly migrating complexes represent bound dimers and trimers of GST-E1DBD; more detailed studies of the various WT E1DBD-DNA complexes will be presented elsewhere. Under the assay conditions used, there was no binding of WT E1DBD to the control substrate (lanes 1, 2, and 12), demonstrating the specificity of the E1DBD-DNA complexes. Both the K183A and K186A mutations in HR1 severely impaired origin binding function to less than 5% of WT activity (lanes 7 to 10). The K241A mutation in HR3 showed a similar reduction in DNA binding (lanes 18 and 19), and its impairment was comparable to that previously observed by Thorner et al. for a K241A/R243A double mutant in the context of full-length E1 (39). In contrast, a lysine-to-alanine mutation at position 222 in the nonconserved HR2 had only a minimal effect on overall DNA binding activity. The K222A mutant exhibited the same qualitative pattern of protein-DNA complexes and showed only a 10 to 20% reduction in the amount of complexes formed at the higher protein concentration (lanes 16 and 17). Similarly, nonconserved lysine K157 appeared to make no significant contribution to origin binding since the K157A mutant E1DBD protein bound the origin substrate as well as the WT protein, both qualitatively and quantitatively (lanes 5 and 6). Interestingly, E1DBD proteins with alanine mutations at conserved lysines 267 or 279 showed only modest reduction in binding activity (lanes 20 to 23), suggesting that these lysines may be conserved for some function(s) besides direct DNA binding.
Overall, the in vitro binding results are consistent with HR1 and HR3 being critical motifs for E1 DNA binding activity.
Both full-length E1 and the E1DBD exhibit site-specific DNA binding activity in vivo. The above results, as well as previously published reports of E1 DNA binding, were all in vitro studies of E1-DNA interactions. Little is known about the interaction of the E1 protein with its binding site in an in vivo context where the host cell milieu might influence E1 binding activity. In vivo transient replication assays are sensitive to E1 binding ability but are not useful for direct comparison of mutant binding activities since each mutation may also affect other E1 replication functions. Consequently, to evaluate the DNA binding activity of E1 protein in vivo more directly, we developed a yeast one-hybrid assay. An E1 reporter strain, designated E1BST-LACZ, was constructed with three tandem copies of the 18-bp E1BS integrated into the yeast genome within a minimal promoter sequence adjacent to a lacZ gene. The control strain had a similar organization except that it contained three copies of a p53BS rather than the E1BS and was designated p53BS-LACZ. Both the reporter and control strains were transfected with a series of plasmids expressing various fusions between the GAL4 AD and either p53 or E1. Transformants were isolated and replated on selective medium, and the colonies were assayed for β-galactosidase activity by an X-Gal overlay procedure (Fig. 4). Both the E1BST-LACZ and the p53BS-LACZ reporter strains expressing the AD protein alone produced no detectable β-galactosidase, and the colonies remained white even after overnight incubation with X-Gal. The absence of β-galactosidase activity demonstrated the extremely low endogenous expression from either the E1BS promoter or the p53BS promoter and confirmed that the AD alone showed no interaction with these promoter elements.
In contrast to the AD alone, expression of the AD-p53 fusion and the AD-E1 fusion resulted in β-galactosidase production in their respective binding site promoter strains. In each case, dark blue color was observed within 1 h after incubation with X-Gal. Neither fusion protein activated transcription from the promoter lacking its cognate binding site, demonstrating the specificity of these interactions. Furthermore, a clone with the E1 gene in the reverse orientation failed to activate transcription in either strain. This control confirmed that the production of β-galactosidase seen with the AD-E1 fusion in the E1BST-LACZ strain required expression of the E1 protein and was not due to some intrinsic property of the expression vector sequences. Therefore, full-length E1 protein is capable of specifically recognizing and binding to its binding site in vivo in a yeast genomic background.
Since the full-length E1 protein showed specific in vivo binding activity, we tested whether the in vitro-defined DBD also functioned in vivo. Like full-length E1, the AD-E1DBD fusion specifically activated the E1BS-containing promoter but not the p53BS promoter. Again, the reverse E1DBD clone failed to activate either promoter. These results demonstrate that amino acids 121 to 311 constitute a functional DBD capable of specific in vivo recognition of the E1BS in the nuclear environment. Interestingly, the E1DBD fusion was two- to threefold more active than the full-length E1 fusion in quantitative β-galactosidase assays when data were normalized for fusion protein expression levels (Bazaldua-Hernandez and Wilson, unpublished results). This discrepancy could be the result of trivial differences relating to folding or aggregation of the

HR1 is critical for E1 DNA binding in vivo. Our in vitro results with the E1DBD identified sequences in HR1 as critical for E1-E1BS interaction. To examine the importance of this region in vivo, the E1 K183A mutation was constructed in the AD-E1DBD fusion for analysis in the one-hybrid system. As a control, the K157A mutation, which had no effect on E1DBD binding in vitro, was also transferred to the AD-E1DBD protein. Each mutant vector was transfected into both the E1BST-LACZ and p53BS-LACZ strains, and 10 transformants from each set were tested for β-galactosidase activity (Fig. 5A). All 10 transformants expressing AD-E1DBD K157A in the E1BST-LACZ strain were β-galactosidase positive within 1 h; however, the AD-E1DBD K183A transformants remained white for up to 8 h. After overnight incubation, the AD-E1DBD K183A transformants exhibited a faint blue color indicative of a very low level of β-galactosidase expression. Neither the AD-E1DBD K157A nor the AD-E1DBD K183A transformants produced any detectable β-galactosidase activity in the p53BS-LACZ strain, even after overnight incubation.
To more accurately compare the binding activities of the K157A and K183A mutants, individual transformants were grown in liquid culture and tested by a quantitative spectrophotometric assay for β-galactosidase (Fig. 5B). Consistent with both the plate assay and in vitro results, the K157A mutant exhibited WT levels of β-galactosidase activity whereas the level for the K183A mutant was reduced approximately 25-fold. No activity was detected with either mutant in the p53BS-LACZ strain. The dramatically decreased binding activity of the K183A mutant was not due to reduced expression of the AD-E1DBD protein, as comparable amounts of the fusions were observed by Western blotting in each of the four transformants tested (Fig. 5C).
Orientation of the critical residues in HR1. The in vitro and in vivo studies implicated HR1 as critical for DNA binding activity of the BPV E1 protein. Projection of this region of E1 in a helical wheel format revealed an extremely hydrophilic face which included all three of the residues whose mutation abolished site-specific DNA binding: R180, K183, and K186 (Fig. 6). All three of these residues also are highly conserved in E1 proteins among all papillomavirus groups, which supports a critical functional role for this region. To further evaluate the possible importance of this hydrophilic surface, additional mutations were constructed. In the helical projection, threonine 187 lies on the hydrophilic face between critical residues R180 and K183, while threonine 188 is on the opposing half of the helix. Thus, these two threonines which are adjacent in the primary sequence are potentially located in very different environments with respect to possible DNA interaction. Supporting this possible functional difference, threonine at the position corresponding to amino acid 187 in BPV E1 is absolutely conserved whereas the 188 position is variable. However, there are also conserved residues on the opposing face such as tryptophan 192 and aspartic acid 185. To determine if a conserved residue on this opposing face contributes significantly to DNA binding activity, D185 was chosen for analysis since it is located within the short stretch of critical primary sequence defined by inactivating mutations (residues 180 to 186) yet projects to the opposite face from these essential residues. Each of these three residues, D185, T187, and T188, was mutated to an alanine in the context of the E1DBD protein and then tested for in vitro DNA binding activity (Fig. 7). Mutation of T187 completely abolished DNA binding activity (<2% of the WT level), while the nonconserved T188 mutation showed only a modest 20 to 40% reduction in origin binding.
The D185A mutant was more impaired than the T188A mutant but still exhibited 30 to 50% of WT binding activity in repeated trials, and it was substantially more active than the T187A mutant. These results are consistent with the helical wheel prediction that the amino acids critical for DNA binding in HR1 are located on a common surface. The greater impact of the D185A mutation on binding than of the T188A mutation may reflect the effect of charge loss on overall structure in this region.
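The geometry behind a helical wheel projection can be illustrated with a short calculation. The sketch below assumes an ideal α-helix with 3.6 residues per turn (100° of rotation per residue) and measures angles relative to residue 180; these are modeling assumptions, since the actual secondary structure of HR1 is unknown, and the function names are hypothetical.

```python
import math

DEGREES_PER_RESIDUE = 100  # ideal alpha-helix: 360 degrees / 3.6 residues


def wheel_angle(residue, reference=180):
    """Angular position (0-359 degrees) relative to the reference residue."""
    return ((residue - reference) * DEGREES_PER_RESIDUE) % 360


def mean_direction(residues, reference=180):
    """Circular mean (degrees) of the wheel angles of the given residues."""
    x = sum(math.cos(math.radians(wheel_angle(r, reference))) for r in residues)
    y = sum(math.sin(math.radians(wheel_angle(r, reference))) for r in residues)
    return math.degrees(math.atan2(y, x)) % 360


def angular_distance(residue, direction, reference=180):
    """Smallest angle (0-180 degrees) between a residue and a direction."""
    diff = abs(wheel_angle(residue, reference) - direction) % 360
    return min(diff, 360 - diff)


if __name__ == "__main__":
    critical = [180, 183, 186, 187]  # residues whose mutation abolished binding
    face = mean_direction(critical)
    for r in (180, 183, 185, 186, 187, 188):
        print(r, wheel_angle(r), round(angular_distance(r, face), 1))
```

Under this idealized geometry, residues 180, 183, 186, and 187 all fall within 90° of their common mean direction, whereas residues 185 and 188 lie more than 90° away, in line with the helical wheel prediction described above.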

DISCUSSION
Papillomavirus E1 proteins are absolutely critical for initiation of viral DNA replication, serving as both origin recognition factors and helicases (36,39,41,42,44,45). While several functional domains of E1 proteins have been defined (21,25,45), little is known about the actual three-dimensional structure of these proteins. Definition of the primary, secondary, and tertiary structures required for various biochemical activities and molecular interactions will be essential for understanding how initiation complexes assemble on the viral origin. Previously we showed that the DBD of BPV E1 is functional as an isolated polypeptide of approximately 190 amino acids (21). The E1DBD protein specifically recognizes the E1BS sequence in origin DNA fragments in vitro and also interacts with the viral E2 protein (21). A subsequent study with a slightly smaller E1DBD confirmed these properties and also demonstrated cooperative origin binding between the E1DBD and both full-length E2 and the E2DBD (7). The ability of the isolated E1DBD to function in origin recognition and E2 interaction in the same fashion as full-length E1 enables studies of E1-DNA interactions to be performed with this smaller molecule that is more easily expressed, purified, and analyzed.
In this study we examined predicted HRs of the E1DBD as potential molecular surface tracts that might be directly involved in DNA contact. The BPV E1DBD has three major HRs, two of which, HR1 (BPV E1 amino acids 179 to 191) and HR3 (amino acids 241 to 252), are well conserved among papillomavirus E1 proteins. A previous mutational analysis showed that conserved basic residues in HR3 are critical for E1 DNA binding activity (39), and we now demonstrate a similar requirement for basic residues in HR1. In contrast, mutation of a lysine residue in nonconserved HR2 caused only a small decrease in origin binding activity, indicating that this hydrophilic region is not as important for E1-DNA interaction. Inspection of the HR1 sequence suggested that all three basic residues known to be required for DNA binding could be physically juxtaposed on the helical face. Consistent with this prediction, mutation of T187 located on the same predicted face as the basic residues produced an E1DBD protein severely impaired for sequence-specific DNA binding in vitro. In contrast, the adjacent T188 residue is predicted to be on the opposing helical face, and its mutation had a relatively small effect on DNA binding activity. Likewise, mutation of a conserved residue on the same face as T188, D185, also resulted in a protein that was substantially more active than the T187A mutant. From these results, residue D185 is clearly not as critical for DNA binding as those on the hydrophilic helical face, though its high conservation suggests that it may function in some other E1 activity. While the actual secondary structure of HR1 is unknown, these mutational results support the existence of a hydrophilic face including at least four residues critical for E1 DNA binding function: R180, K183, K186, and T187.
FIG. 6. Helical projection and conservation of amino acids in HR1. Amino acids 176 to 193 of BPV E1 protein are displayed on the left in a helical wheel projection. The horizontal line divides the helix into two halves: an upper, highly basic, hydrophilic face, and a lower, more hydrophobic face. Lysine and arginine residues shown to be critical for DNA binding activity are marked with asterisks. The arrows indicate the two consecutive threonines and the aspartic acid that are functionally evaluated in Fig. 7. On the right is a comparison of the BPV E1 sequence from amino acids 176 to 193 aligned with the corresponding regions from the papillomavirus groups A to E and unclassified. The sequence shown for each papillomavirus group is the consensus sequence for these BPV E1-equivalent regions. Numbered residues refer to the BPV E1 amino acid number. Boxes indicate residues that are absolutely conserved with the exception of positions 183 and 185, which diverge in the group C and group B sequences, respectively.
In the absence of crystallographic or spectroscopic data on the structure of E1, it is useful to compare E1 with the simian virus 40 (SV40) large T antigen, with which it shares sequence and functional homology (9,26). Even though these two proteins recognize completely different nucleotide sequences, there are some intriguing parallels between their DBDs (Fig. 8). First, their DBDs are of similar size, and each comprises a relatively small portion of the full-length protein: approximately 130 amino acids out of 708 for SV40 T antigen (20,28) versus approximately 170 out of 605 for E1 (7,21). (Note that the E1DBD might actually be slightly smaller, as its boundaries have not been precisely defined.) Furthermore, the two DBDs are located in similar positions in the primary sequences, both beginning approximately 130 to 140 amino acids from the N terminus. Mutational analysis of T antigen defined two sets of critical amino acids for specific origin DNA binding, regions A (residues 152 to 155) and B2 (residues 203 to 207) (37,43). Recent solution nuclear magnetic resonance studies on the isolated T-antigen DBD indicate that the A and B2 elements form a juxtaposed surface region that is the likely pentanucleotide contact site (23). Our present results, combined with those of Thorner et al. (39), identify two separate regions necessary for E1 binding activity, HR1 and HR3. HR1 and HR3 are separated by 49 intervening residues, compared to 47 amino acids between T-antigen regions A and B2. These similarities in organization of the DBDs for T antigen and E1 suggest that they could possess related three-dimensional structures. As the T-antigen DBD structure shares folding features with the DBDs of papillomavirus E2 proteins and the Epstein-Barr virus nuclear antigen 1 (10,23), the E1DBD may also be part of this superfamily.
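The spacing and size comparisons between the two DBDs follow directly from the residue coordinates quoted in the text, as the short sketch below shows. The variable and function names are illustrative only; the numbers are those given above.

```python
def intervening(end_of_first: int, start_of_second: int) -> int:
    """Number of residues lying strictly between two sequence elements."""
    return start_of_second - end_of_first - 1

# Element boundaries (first residue, last residue) as quoted in the text.
E1_HR1, E1_HR3 = (179, 191), (241, 252)
TAG_A, TAG_B2 = (152, 155), (203, 207)

e1_gap = intervening(E1_HR1[1], E1_HR3[0])   # residues 192 to 240
tag_gap = intervening(TAG_A[1], TAG_B2[0])   # residues 156 to 202

# Approximate DBD size as a fraction of the full-length protein.
e1_fraction = 170 / 605
tag_fraction = 130 / 708

print(f"E1 HR1-HR3 gap: {e1_gap} residues")
print(f"T antigen A-B2 gap: {tag_gap} residues")
print(f"DBD fraction: E1 {e1_fraction:.0%}, T antigen {tag_fraction:.0%}")
```

The DBD sizes are themselves approximate, so the computed fractions should be read as rough proportions rather than precise values.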
In addition to the in vitro DNA binding studies, a yeast one-hybrid system was developed to assay E1-DNA interactions in vivo. The full-length E1 protein exhibited site-specific DNA binding activity, which is the first direct demonstration that the minimal 18-bp E1BS sequence is sufficient to mediate E1-DNA interaction in the environment of the host cell nucleus. Given the relatively modest sequence specificity of E1 protein in vitro (32), the ability of E1 to locate and specifically bind the E1BS in the background of the yeast genome, and in the absence of E2 protein, was initially somewhat surprising. While the yeast genome is approximately 250-fold smaller than mammalian genomes, the absolute ratio of total nuclear DNA to E1BS DNA in the yeast strain is still at least 1,000-fold larger than the ratio of competitor DNA to E1BS DNA that will prevent E1 binding in vitro (32). One factor that may facilitate the in vivo binding of E1 is that the E1BS sequence in the E1BST-LACZ reporter strain is a tandem triplet, which should provide a higher local concentration of the target sequence. A second factor is that much of the vast excess of nuclear DNA may be inaccessible due to association with histones and host cell regulatory proteins, while the promoter region would naturally be more available. Because of this uncertainty in the actual amount of functional competitor, it is difficult to make quantitative comparisons between the in vivo and in vitro binding results.
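The scale of the in vivo search problem can be approximated with a back-of-the-envelope calculation. The genome sizes used here (about 12.1 Mb for haploid Saccharomyces cerevisiae and about 3 Gb for a haploid mammalian genome) are round textbook figures rather than values from this study, and the sketch assumes a single integrated copy of the triplet reporter per haploid genome. It ignores chromatin accessibility entirely, which is precisely the uncertainty noted above.

```python
# Assumptions (not from this study): approximate haploid genome sizes.
YEAST_GENOME_BP = 12_100_000
MAMMALIAN_GENOME_BP = 3_000_000_000

# From the text: an 18-bp E1BS present as a tandem triplet in the reporter.
E1BS_BP = 18
E1BS_COPIES = 3

# Assumption: one integrated reporter copy per haploid genome.
target_bp = E1BS_BP * E1BS_COPIES

fold_smaller = MAMMALIAN_GENOME_BP / YEAST_GENOME_BP
background_ratio = YEAST_GENOME_BP / target_bp

print(f"Yeast genome is ~{fold_smaller:.0f}-fold smaller than mammalian")
print(f"Total nuclear DNA : E1BS DNA is ~{background_ratio:.1e} (bp basis)")
```

With these assumed values the yeast genome is on the order of 250-fold smaller than a mammalian genome, and nonspecific DNA exceeds E1BS DNA by roughly five orders of magnitude on a base-pair basis.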
Like the full-length E1 protein, the E1DBD specifically interacted with the E1BS in the one-hybrid system. These results confirmed that this in vitro-defined DBD is also stable and functional in vivo. The activity of the E1DBD in the yeast system allowed us to examine the role of HR1 in vivo. Mutation of K183 in HR1 severely impaired binding by the AD-E1DBD fusion, demonstrating that this region is as essential in vivo as it is in vitro. Furthermore, HR1 mutants are the most frequently obtained clones when the one-hybrid system is used to screen for randomly generated mutants lacking DNA binding activity (M. West, unpublished data). In contrast to the K183A mutant, the K157A mutant was equivalent to WT both in vitro and in vivo. The concordance between the in vitro and in vivo binding activities indicates that the in vitro defects in binding are not simply the result of reduced stability of the mutant proteins during extraction and purification and instead represent intrinsic properties of the proteins. Overall, our results strongly implicate HR1 as a critical element for origin-specific binding by E1 protein and, by analogy with SV40 T antigen, suggest that HR1 comprises at least part of the recognition sequence.