Journal of Virology, May 2006, p. 4304-4312, Vol. 80, No. 9
0022-538X/06/$08.00+0 doi:10.1128/JVI.80.9.4304-4312.2006
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
Tufts University School of Medicine and the Sackler School of Graduate Biomedical Sciences, Department of Biochemistry, 136 Harrison Avenue, Boston, Massachusetts 02111
Received 16 December 2005/ Accepted 7 February 2006
Fundamental insights into the initiation of eukaryotic DNA replication, at the molecular level, have been provided by model systems, in particular, the small DNA tumor viruses including simian virus 40 (SV40) and papillomavirus (reviewed in reference 52). In contrast to the eukaryotic systems (19), the origins of replication in these viruses are well defined and relatively simple (see, e.g., reference 32) and the viral "initiators" that locate and assemble on these origins are composed of just one or two proteins. These proteins recognize the origin-specific DNA sequences, melt the duplex DNA, and act as helicases exposing single-stranded DNA (ssDNA) for replication using host-encoded polymerases which are recruited along with ssDNA-binding proteins to the site of replication. Of considerable interest, all of the origin binding domains (obd's) of the viral initiators analyzed to date adopt the same fold (10, 22). This structural conservation is surprising since there is no obvious similarity in the DNA sequences to which these domains bind or the amino acid sequence which adopts this fold.
One of the more versatile model systems for studies of the initiation of viral DNA replication is SV40 (reviewed in references 8, 15, and 44). Many viral origins contain multiple initiator binding sites that consist of short sequences of 5 or 6 base pairs organized as pairs of inverted repeats; the SV40 origin contains four GAGGC binding sites, termed P1 through P4 (Fig. 1A). These repeats are flanked by an AT-rich region on one side and an early palindrome region on the other.
FIG. 1. The SV40 core origin and structure of SV40 large T-antigen origin binding domain. (A) DNA sequence of the SV40 core origin. The 64-base-pair SV40 core origin is depicted as B-form DNA, with the central, site II region colored the same in the sequence and in the DNA double helix. The GAGGC pentameric sequences labeled P1 to P4 are indicated in cyan and their complements in magenta. The cyan arrows indicate the 5'→3' orientation of the pentameric sequence. The arrangement of the pentamers positions the major grooves of P1 and P3 on approximately the same face of the DNA. The same is true for P2 and P4. In addition, P1 and P2 (like P3 and P4) are on opposite faces of the DNA. The AT-rich and early palindrome (EP) sequences are gray; the mononucleotide spacer between the pentamers is yellow. (B) The amino acid sequence of T-ag obd with corresponding secondary-structure assignments (using the program DSSP) indicated by cylinders (alpha helices) or arrows (beta sheets); the A1 and B2 motifs are indicated. Residues closer than 4 Å to the protein-protein interface are indicated by boxes, magenta in one subunit and green in the adjacent subunit. (C) Ribbon diagram of the T-ag obd monomer with secondary structure elements labeled. Residues at the protein-protein interface that generates the spiral are magenta and green as in panel B. (D) Ribbon diagram of six spirally arranged obd molecules. The sixfold screw axis is perpendicular to the page. Each monomer is colored differently. The inner diameter of the channel is 30 Å. These six obd's form an "open ring" or spiral, and the position of the gap is shown. (E) Ribbon diagram of the spiral shown in side view to visualize the gap and the screw translation. Each monomer is colored as in panel D. The black line through the center of the spiral represents the sixfold screw axis (panel D rotated by 90°). The pitch of the spiral is 35.8 Å. The opening or gap between the first molecule (yellow) and the last molecule (red) is indicated. Figures were made with programs NUCCYL and PYMOL unless otherwise indicated.
While informative, the structural studies have yet to resolve important issues, such as how the precise arrangement of the four GAGGC pentanucleotides helps to orchestrate the assembly of 12 molecules of T-ag on the origin or how the obd's are arranged relative to the DNA and the hexamer of helicase domains. As both the repeating sequences within origins and dodecamer formation of initiator proteins are common features of replication initiation in many DNA viruses, establishing the protein-protein interactions that occur during assembly will provide fundamental insights into the architecture of the initiation complex. Herein, we report that, within crystals, the T-ag obd (residues 131 to 260) adopts a left-handed spiral with six obd's per turn and we discuss the implications of this structure for the mechanism by which DNA replication is initiated.
10 mg/ml using a Vivaspin 5K (Vivascience). The protein was aliquoted, flash frozen in liquid nitrogen, and stored at -80°C. Typical yields were ~10 to 15 mg purified T-ag obd/liter E. coli culture.
Overexpression of selenomethionine-substituted T-ag obd was achieved using the nonauxotrophic method (58). Briefly, E. coli strain BL21 cells were grown at 37°C in M9 medium supplemented with glucose; thymine; the amino acids L-Thr, L-Phe, L-Leu, L-Ile, L-Val, and L-Lys; ampicillin; and 60 mg/liter medium of L-selenomethionine (Sigma) to an optical density of ~0.6, at which point the temperature was reduced to 28°C and 0.1 mM IPTG was added. The cells were harvested after 13 to 16 h. Purification of selenomethionine-substituted T-ag obd was identical to that of the T-ag obd. Typical yields were ~7 mg purified selenomethionine-substituted T-ag obd/liter E. coli culture.
Crystallization. Crystals of the T-ag obd were grown by vapor diffusion using the hanging-drop method at 20°C. One microliter of the T-ag obd (8.8 mg/ml) in storage buffer was mixed with 1 µl of a reservoir solution consisting of 1.6 M trisodium citrate, pH 6.7. The drop was equilibrated over 0.4 ml reservoir solution. Hexagonal-rod-shaped crystals of dimensions 0.2 by 0.1 by 0.1 mm grew in 3 to 5 days. Similar conditions were used to obtain crystals of the selenomethionine-substituted T-ag obd.
This same crystal form was also grown under very different conditions using polyethylene glycol (Peg) 4000 as the precipitant. In this case, crystals were obtained at 4°C by mixing 1 µl of the T-ag obd as before with 1 µl of a reservoir solution containing 30% (vol/vol) Peg 4000, 0.1 M ammonium acetate, and 0.1 M sodium citrate, pH 5.5. These crystals had cell parameters which were virtually identical to those obtained from crystals grown from 1.6 M trisodium citrate.
X-ray data collection. Single crystals grown using citrate were transferred to a cryoprotectant solution consisting of 1.28 M trisodium citrate, pH 6.7, and 20 to 25% glycerol. After approximately 1 minute, the crystals were flash-cooled (100 K) in a rayon loop (Hampton Research Inc.). Native (native1) X-ray diffraction data were collected at 100 K using an X-Calibur diffraction system (Oxford Diffraction). A high-resolution 1.5-Å native set (native2) and a 1.45-Å selenomethionine-substituted T-ag obd (Se1) data set were collected at 100 K at beamline X29 at the National Synchrotron Light Source (Brookhaven, NY), which was equipped with a Quantum 315 detector. The data were integrated and processed with MOSFLM (30) and SCALA (1) or HKL2000 (39). Crystallographic data and refinement statistics are given in Table 1.
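The concentrations quoted for the hanging drop and the cryoprotectant can be checked with simple dilution arithmetic. The following is a minimal sketch, not the authors' protocol: the helper `diluted` is a hypothetical name, and the recipe of 4 parts reservoir solution to 1 part neat glycerol is an assumption chosen because it reproduces the stated 1.28 M citrate at 20% glycerol.

```python
def diluted(stock_conc, stock_vol, total_vol):
    """Concentration after bringing stock_vol of a stock to total_vol (C1*V1 = C2*V2)."""
    return stock_conc * stock_vol / total_vol

# Hanging drop at setup: 1 µl protein (8.8 mg/ml) + 1 µl reservoir (1.6 M citrate).
drop_protein_mg_ml = diluted(8.8, 1.0, 2.0)
drop_citrate_molar = diluted(1.6, 1.0, 2.0)

# Cryoprotectant, ASSUMING 4 volumes of reservoir + 1 volume of neat glycerol
# (the mixing recipe is not given in the text).
cryo_citrate_molar = diluted(1.6, 4.0, 5.0)
cryo_glycerol_percent = 100.0 * 1.0 / 5.0

print(f"drop at setup: {drop_protein_mg_ml:.1f} mg/ml protein, "
      f"{drop_citrate_molar:.2f} M citrate")
print(f"cryoprotectant: {cryo_citrate_molar:.2f} M citrate, "
      f"{cryo_glycerol_percent:.0f}% glycerol")
```

Under these assumptions the drop starts at half the stock concentrations and concentrates as it equilibrates against the reservoir.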
TABLE 1. Data collection and refinement statistics
The coordinates have been deposited in the Protein Data Bank (PDB) under accession code 2FUF.
s, and the secondary structure assignments for the X-ray structure differ somewhat from the NMR model. These differences are distributed throughout the structure and involve residues within the interior of the protein and within the A1 and B2 DNA-binding regions (residues 147 to 159 and 203 to 207, respectively). Of particular note, the entirety of helix C is translated 1.5 to 2.0 Å relative to the NMR structure, and in the A1 motif, the backbone atoms of Asn 153 are shifted by 2.6 Å. As the experimental data for both the NMR and X-ray structures are quite good, we believe these differences likely reflect intrinsic conformational flexibility within the obd and are not the result of errors in the structure determination.
The structures of three other viral obd's have already been reported. These are from papillomavirus E1 (13, 14), the Rep protein from adeno-associated virus (22), and the tomato yellow leaf curl virus (10). Of these, the most closely related to the SV40 T-ag obd is the papillomavirus initiator E1 obd (PDB accession codes 1KSY, 1KSX, and 1F08) (13, 14). In each of the E1 obd structures, a small dimer interface between two head-to-head obd's is observed. The T-ag obd and papillomavirus E1 obd share only ~10% sequence identity, and their structures have an RMSD of 3.3 Å between 109 Cαs. Though all of these obd's are known to function in the context of hexameric assemblies, the structure of T-ag obd described herein is the first instance where the obd's interact with symmetry consistent with the stoichiometry of the full-length protein upon assembly on the origin.
The spiral hexamer of T-ag obd. The T-ag obd crystallized in the hexagonal space group P6₅, with one molecule in the asymmetric unit. Assembly of isolated T-ag obd's into dimers, hexamers, or larger complexes is not observed in solution. In the crystals, however, the obd molecules are arranged as a spiral with a left-handed twist having six obd's per turn. At the high protein concentration used for crystallization, this spiral arrangement of the T-ag obd's is remarkably insensitive to pH, salt concentration, temperature, and other factors. This is evidenced by our ability to grow virtually identical crystals over a 3-pH-unit range, at room temperature or 4°C, and using either 1.6 M citrate or 30% Peg 4000 as the precipitating agent.
This arrangement of obd's could equally well be described as helical, but we have used the word "spiral" rather than "helix" throughout this paper to avoid confusion with the amino acid and nucleic acid helices that are also discussed. Adjacent obd molecules within a given turn of the spiral are tightly packed. The spiral surrounds an ~30-Å central channel and has an outer dimension of ~95 Å (Fig. 1D). The pitch of the spiral (defined as the distance to complete one turn along the helical axis) is 35.8 Å as shown in the side view (Fig. 1E). The inner wall of the channel consists of α-helices (αB and αC). The orientation of αC is almost parallel to the sixfold screw axis. In the context of the hexameric assembly of full-length T-ag molecules, the spiral we observe contains a gap (Fig. 1D and E). As discussed below, such a gap may play a significant role in DNA unwinding.
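The screw geometry described above (six subunits per turn, 35.8-Å pitch, left-handed) can be illustrated numerically. This is a sketch under stated assumptions rather than a reconstruction of the deposited coordinates: the function name `spiral_positions` and the 31-Å radius for subunit centers are hypothetical, and subunits are simply numbered 1 to 6 along the screw axis.

```python
import math

PITCH_A = 35.8            # Å per full turn, as reported in the text
SUBUNITS_PER_TURN = 6     # six obd's per turn
ASSUMED_RADIUS_A = 31.0   # Å; hypothetical radius of subunit centers (not from the paper)

def spiral_positions(pitch=PITCH_A, n=SUBUNITS_PER_TURN,
                     radius=ASSUMED_RADIUS_A, left_handed=True):
    """Return (index, angle_deg, x, y, z) for each subunit in one turn."""
    rise = pitch / n              # axial translation per subunit
    step = 360.0 / n              # rotation per subunit
    sign = -1.0 if left_handed else 1.0   # left-handed: rotate clockwise as z increases
    positions = []
    for i in range(n):
        angle = sign * step * i
        theta = math.radians(angle)
        positions.append((i + 1, angle,
                          radius * math.cos(theta),
                          radius * math.sin(theta),
                          rise * i))
    return positions

positions = spiral_positions()
rise_per_subunit = PITCH_A / SUBUNITS_PER_TURN
# Axial offset between the first and last subunit of one turn (five rises).
axial_offset_first_to_last = positions[-1][4] - positions[0][4]

for idx, angle, x, y, z in positions:
    print(f"subunit {idx}: angle {angle:7.1f} deg, z = {z:5.2f} Å")
print(f"rise per subunit: {rise_per_subunit:.2f} Å")
print(f"first-to-last axial offset: {axial_offset_first_to_last:.2f} Å")
```

With these values each subunit is translated by roughly 6 Å along the axis. The first and sixth subunits are angular neighbors but sit about 30 Å apart axially, which is one way to picture the gap in the open ring.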
The ~30-Å-wide central channel can easily accommodate double-stranded DNA (dsDNA), and the inner surface of the spiral is highly positively charged, as seen by the electrostatic potential mapped onto the surface of the spiral hexamer (Fig. 2A). Furthermore, the diameter of this channel is consistent with that from models made from electron microscopy (EM) reconstructions of intact T-ag (59). Curiously, this channel is larger than that needed to accommodate duplex DNA; however, given the roughly spherical shape of the T-ag obd's and the tight packing observed between adjacent molecules in the spiral, no flat-ring hexamer of these domains is likely to produce a significantly smaller channel than that observed in this crystal structure.
FIG. 2. Structure of spirally arranged T-ag obd's. (A) Surface representation of the T-ag obd hexamer. The surface is colored according to the electrostatic potential. Blue surfaces indicate positive potential and red negative (±10 kT/e). Three views showing each face of the hexamer, as well as a side view to better illustrate the spiral, are included. A cartoon schematic of the spirally arranged T-ag obd's is presented below each view to indicate the direction of rotation. A yellow triangle is placed at the same position on each hexamer to aid in orientation. This figure was made using the SwissPDB viewer and rendered using POVRAY. (B) Residues involved in the protein-protein interface between each adjacent monomer. Residues listed are within 4 Å of the interface. Magenta ovals indicate residues from one subunit, and green squares indicate residues from the adjacent subunit. An asterisk indicates that mutagenesis resulted in a T-ag molecule that was defective in the initiation of replication (see text). (C) Surface representation displaying the protein-protein interface described in panel B. The contact surfaces of adjacent subunits are depicted in magenta and green. (D) Close-up of the protein-protein interface. Side chains of residues of adjacent subunits at the protein-protein interface are magenta and green. Residues known to be important for hexamerization by mutagenesis (see text) are cyan. The van der Waals surface for Phe 151 (magenta), Phe 183, and Ser 185 (both cyan) shows the tight packing of the protein interface. All other protein residues are yellow.
~1,300 Å² (~650 Å² per monomer) (Fig. 2B and C). The size of this interface is on par with those of many other physiologically important protein-protein contacts (33). SV40 large T-antigen has been extensively studied as a model for DNA replication, and as a consequence, approximately 40% of residues within the T-ag obd have been mutagenized and biochemically characterized (41). Collectively, these mutants provide a significant resource for analyses of the X-ray structure. In support of the crystallographic interface, previous mutagenesis studies identified residues Phe 183 and Ser 185 as being important for hexamer formation of full-length T-ag (46). These residues are located at or very near the obd-obd interface observed in the crystal structure (Fig. 2B), in which Phe 151 from one monomer protrudes into a hydrophobic pocket located in the adjacent monomer, at the base of which is Phe 183 (Fig. 2D). Mutant T-ags with either Phe 183→Leu or Ser 185→Thr bind DNA and melt and untwist origin DNA like the wild type but are defective in oligomerization and DNA unwinding. In the crystal structure, the side chain of Ser 185 is buried within the interior of the T-ag obd monomer but packs tightly against Phe 183. Mutation of either Phe 183 to a smaller residue or Ser 185 to a bulkier residue could disrupt the side-to-side interface observed and prevent hexamerization. In addition, mutation of other interface residues (e.g., 150, 151, 153, 181, 200, 204, 206, 213, or 214) results in T-ags that are defective in replication or origin recognition (8, 44).
Most of the residues implicated in DNA binding map to a contiguous patch that includes the inner surface of the channel and one side (normal to the sixfold screw) of the hexamer (Fig. 3A). This surface contains the A1 and B2 motifs, as well as residues His 201 and Arg 202, which are also important for DNA binding. In the spiral, the residues critical for sequence-specific binding to GAGGC pentanucleotides (Asn 153, Arg 154, and Thr 155 from the A1 motif and residues His 203 and Arg 204 from the B2 motif [45]) are predominantly solvent accessible (Fig. 3A). Many mutations within this region disrupt DNA binding and/or cause defects in replication. Many other T-ag mutants (45, 46, 47, 62) defective in dsDNA binding also map primarily to the exposed surface of the hexamer (see Fig. S1 in the supplemental material).
FIG. 3. Surface representation of the T-ag obd hexamer and the modeled DNA complex. (A) The surface representation of the T-ag obd hexamer (yellow) shows that the A1 (amino acids 147 to 159) motif (red) and B2 (amino acids 203 to 207) motif (purple) map onto the inner channel and one surface of the T-ag obd hexamer. The subunits are labeled A to F. This view is the same as the left panel of Fig. 2A. (B) Model of the T-ag obd hexamer with duplex DNA running through the central channel of the T-ag obd hexamer; the DNA is colored as in Fig. 1A. The protein is shown as a surface representation colored as in panel A. The view is rotated 45° relative to the view of the model presented in panel A. The spiral positions two of the six T-ag obd's proximal to the repeating GAGGC sequences in P1 and P2, and the duplex DNA easily fits in the positively charged central channel. (C) Cartoon of a T-ag obd spiral with DNA along the central channel. The T-ag obd's are represented as orange spheres and the DNA as a cylinder. The positions of pentamers P1 and P2 are indicated on the DNA. This schematic shows how subunit E is poised for interaction with P1 while subunit B (180 degrees away from the first) is positioned to interact with P2.
The hexamer has two surfaces (one contains the A1 and B2 motifs) and thus two possible orientations for docking to a given pentanucleotide. For reasons outlined below, we favor the docking of T-ag obd spirals such that the faces containing the A1 and B2 motifs are oriented towards the helicase domains. For convenience we have designated the six monomers in a spiral A through F (Fig. 3B and C). If the T-ag obd spiral is positioned such that the E subunit is placed atop P1, the B subunit (three subunits away from E) is juxtaposed opposite P2. This occurs because the 35.8-Å helical pitch of the obd spiral is very similar to that of B-form DNA. However, site-specific interactions of T-ag obd subunits B and E with pentanucleotides 1 and 2 cannot occur simultaneously; the subunits are too far from the DNA to make the predicted major groove contacts without distortions in the DNA, the T-ag obd spiral, or both (see Fig. S3 in the supplemental material). Non-sequence-specific interactions may, however, take place with duplex DNA centered on, or slightly off of, the sixfold axis.
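The pitch comparison in this paragraph can be made concrete with a short calculation. This is a hedged sketch: the B-form DNA parameters (about 3.4 Å rise per base pair and about 10.5 base pairs per turn) are standard textbook values rather than numbers taken from this paper, and the variable names are illustrative. Only magnitudes are compared; the opposite handedness of the left-handed obd spiral and right-handed B-form DNA is not modeled.

```python
OBD_PITCH_A = 35.8          # Å, obd spiral pitch from the text
SUBUNITS_PER_TURN = 6

# ASSUMED textbook values for idealized B-form DNA (not from the paper).
BDNA_RISE_PER_BP_A = 3.4    # Å per base pair
BDNA_BP_PER_TURN = 10.5     # base pairs per helical turn

bdna_pitch_a = BDNA_RISE_PER_BP_A * BDNA_BP_PER_TURN
pitch_difference_a = abs(OBD_PITCH_A - bdna_pitch_a)

# Subunits E and B are three positions apart in the spiral.
subunits_apart = 3
obd_rotation_deg = subunits_apart * 360.0 / SUBUNITS_PER_TURN
obd_axial_shift_a = subunits_apart * OBD_PITCH_A / SUBUNITS_PER_TURN

# How far the DNA duplex rotates over the same axial distance.
bp_spanned = obd_axial_shift_a / BDNA_RISE_PER_BP_A
dna_rotation_deg = bp_spanned * 360.0 / BDNA_BP_PER_TURN

print(f"B-DNA pitch: {bdna_pitch_a:.1f} Å vs obd spiral pitch: {OBD_PITCH_A:.1f} Å")
print(f"E to B: {obd_rotation_deg:.0f} deg rotation, {obd_axial_shift_a:.1f} Å along the axis")
print(f"DNA over {obd_axial_shift_a:.1f} Å: {bp_spanned:.2f} bp, {dna_rotation_deg:.1f} deg")
```

Under these assumptions the two pitches differ by about 0.1 Å, and a half turn of the spiral spans close to a half turn of the duplex, which is consistent with subunits three positions apart facing opposite sides of the DNA.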
The formation of a spiral double hexamer of the T-ag obd has also been modeled. Given that two orientations are possible, two models were constructed: one where the hexamer-hexamer interface contained the A1 and B2 motifs and the second in the opposite orientation (the A1 and B2 motifs are proximal to the helicase domain). We favor the latter model (helicase proximal; Fig. 4A) primarily because this model positions loop B3 at the hexamer-hexamer interface (Fig. 4A). Mutations of residues within the B3 motif (residues 213 to 220; Fig. 4B; Gln 213→His, Leu 215→Val, Thr 217→Ser, or Phe 220→Tyr) impair the ability of T-ag to form double hexamers (60). In our crystal structure, loop B3 has high B factors and thus is probably relatively flexible. It is possible that, upon formation of the double hexamer, these flexible residues reorganize to stabilize the hexamer-hexamer interface. Additional support for this arrangement of the obd's in the double hexamer stems from the orientation of the T-ag obd termini. The "helicase-proximal" model places the C termini away from the obd-obd interface and toward the expected position of the helicase domain. In this model, the N termini are oriented toward the hexamer-hexamer interface and situated on the periphery of the hexamer. This orientation would allow the N-terminal J domains, which are dispensable for initiation of DNA replication in vitro (9, 17), to be in close proximity without interfering with DNA binding.
FIG. 4. Model of the T-ag obd double hexamer with DNA. (A) The T-ag obd double-hexamer model is shown as a ribbon diagram with B-form DNA along the sixfold screw axis. The DNA is colored as in Fig. 1A. The T-ag obd hexamers are green and yellow. The A1 (red) and B2 (purple) loops are oriented away from the double-hexamer interface and proximal to the expected position of the C-terminal helicase domains. The C termini point away from the hexamer-hexamer interface, while the N termini point toward the interface and are situated on the periphery of the model. Mutagenesis of amino acid residues 213, 215, 217, and 220 (shown as orange van der Waals spheres) impairs double-hexamer formation (see text). (B) Mutants which impair double-hexamer formation map to one face of the T-ag obd hexamer. Shown is a view of a surface representation of the T-ag obd hexamer (yellow) which displays the putative double-hexamer interface. Amino acid residues 213, 215, 217, and 220 (orange) are solvent accessible.
Alternatively, the spiral may represent a structure that is adopted after origin recognition and DNA melting have already occurred. In such a scheme, the GAGGC-specific interactions take place early, prior to assembly of the helicase-competent T-ag hexamers on the DNA, and a hexameric ring (spiral or flat) of the obd's forms around the melted DNA, after the duplex pentanucleotide binding sites have melted and are no longer available for binding.
Spirally symmetric protein complexes are known to play important roles in biology, as evidenced by the E. coli protein RecA (53) and the Saccharomyces cerevisiae clamp loader RFC (6). In addition, the bacterial Rho transcriptional terminator, a hexameric helicase, was recently shown to form an open-ring structure (a spiral), and the authors suggest the open ring may provide a mechanism whereby the single-stranded nucleic acid may enter the central channel of the ring without disrupting the hexamer or cutting the nucleic acid (49). Nevertheless, the spiral nature of the sixfold T-ag obd structure observed in our crystals was unexpected. A flat ring having sixfold symmetry was anticipated; indeed, it is possible that the spiral is a crystallographic artifact and that the physiological ring structure is a donut (flat ring) and not a spiral. The left-handed spiral has a relatively small translational component (each monomer is translated 5.9 Å along the sixfold screw axis). Thus, the spiral could be flattened with relatively minor alterations in the residues that participate in the obd-obd interface. If that is the case, our structure would be analogous to that of T7 replicative helicase, which has been shown to have significant plasticity in its interface, as it has been crystallized as a spiral or open-ring hexamer (42), as well as a flat hexamer (48) and a heptamer (56). The following sections, however, summarize additional evidence that the spiral hexamer is biologically important.
The spiral model for double-hexamer formation of the T-ag obd (Fig. 4A) is consistent with phenanthroline-copper DNA footprinting studies which demonstrated that pentanucleotides P1, P2, and P3, but not P4, are protected by assembly of either the T-ag obd or full-length T-ag (24). While a spiral double hexamer of the T-ag obd bound to pentanucleotides 1 and 3 would protect the central region of the origin (also known as site II), two flat rings of the T-ag obd formed on pentanucleotides 1 and 3 would not (Fig. 4A).
As mentioned previously, the overall dimensions of the spiral are consistent with those seen by EM. Earlier EM studies of T-ag double hexamers indicated that the sixfold axes of the helicase domains and the axes of the obd's were coincident in T-ag (35, 57, 59). More-recent EM analyses revealed a twist in the relative orientations of the hexamers and showed bending and distinct asymmetry in the central region (21, 43). It is possible the disorder observed in the central region, where the obd's are believed to be, could be explained by a spiral arrangement of obd's. As the asymmetry in the EM images is most pronounced in the region of the obd's, the T-ag obd's may be engaged in a complex and dynamic interaction with DNA. This suggests that the protein sequence connecting the T-ag obd to the helicase domain adopts different conformations. Indeed, the crystal structure of the SV40 T-ag helicase domain demonstrated that the linker sequence (residues 255 to 266) is flexible (31), and this region is sufficiently long to extend from the spirally symmetric T-ag obd's to the flat-ring helicase domains. Thus, the two different symmetries present in the obd and helicase domains could be accommodated in a T-ag hexamer. A number of other multidomain proteins that interact with DNA and form oligomers contain multiple symmetries within a single complex. Examples include the trimeric heat shock transcription factor HSF (50) and the dimeric nuclear retinoid X receptor (54).
Implications of the spiral for DNA unwinding.
When assembled on the core origin as a double hexamer, T-ag is capable of unwinding origin-containing DNA provided ATP and the single-stranded DNA-binding protein RPA are included in the reaction mixture (reviewed in reference 5). As previously noted, a single pentanucleotide supports hexamer formation; however, interactions with two pentanucleotides are necessary for unwinding (11, 24). This suggests that a second site-specific interaction, between the initially unoccupied pentanucleotide and a correctly juxtaposed T-ag obd subunit (e.g., the interaction between T-ag obd monomer "B" with pentanucleotide 2 [Fig. 3C]), takes place at some point during the RPA-dependent unwinding process. While significant structural perturbations are needed to complete these binding events, the architectural features of the spiral would likely facilitate these interactions. Furthermore, during unwinding the T-ag obd's must transit from binding dsDNA to ssDNA. Therefore, it is of interest that the ~30-Å channel present in the hexamer could accommodate either dsDNA or two separate strands of ssDNA. Recently, the binding site of human RPA onto T-ag obd has been identified via NMR titration experiments (2) and shown to include residues in the A1 and B2 motifs, the same surface as that identified for binding dsDNA.
Finally, from the perspective of DNA unwinding, a very interesting feature of the spiral hexamer is the gap. During DNA unwinding, the gap could enable ssDNA to pass through the T-ag obd hexamer without disassembling the hexamers or breaking the DNA. Thus, the presence of the gap helps to explain studies indicating that "rabbit ears," thought to be ssDNA, emanate from the central region of the double hexamer (21, 61). The presence of a gap in helicases has been noted in other systems. In addition to the open-ring structure of the Rho helicase mentioned previously, recently published EM images of the archaeal MCM helicase exhibit six- and seven-member closed- and open-ring structures (20). The plasticity of the protein interface and the ability to form open-ring structures which have a gap may be a general feature of eukaryotic helicases.
Conclusions. The left-handed spiral hexamer of the T-ag origin binding domains observed in the high-resolution crystal structure is intriguing in many respects; it is consistent with the T-ag obd interactions suggested by EM reconstructions, and it explains a large body of mutagenesis and biochemical data. This structure may represent an important step along the path of assembly of the active dodecamer helicase, after origin recognition and prior to unwinding of DNA. As such, it provides insight into the arrangement of the pentanucleotides required for origin recognition. However, further structural information and biophysical data will be needed to understand how SV40 T-ag accomplishes the highly complex events associated with initiation of DNA replication.
This work was supported by NIH grants RO1 GM055397-12 and RO1 GM055397-12S1.
Supplemental material for this article may be found at http://jvi.asm.org/.