Journal of Virology, December 2007, p. 13015-13027, Vol. 81, No. 23
0022-538X/07/$08.00+0 doi:10.1128/JVI.01703-07
Copyright © 2007, American Society for Microbiology. All Rights Reserved.

and
Peter Tattersall1,2*
Departments of Laboratory Medicine1 and Genetics,2 Yale University Medical School, 333 Cedar Street, New Haven, Connecticut 06510
Received 6 August 2007/ Accepted 12 September 2007
170 nucleotides effectively compete for NS1, often binding with higher affinity to these internal sites than to sites in the origins. We explore the diversity of the internal sites using competitive binding and DNase I protection assays and show that they vary between two extreme forms. Simple sites with three somewhat degenerate, tandem TGGT reiterations bind effectively but are minimally responsive to ATP, while complex sites, containing multiple variably spaced TGGT elements arranged as opposing clusters, bind NS1 with an affinity that can be enhanced approximately 10-fold by ATP. Using immuno-selection procedures with randomized sequences embedded within specific regions of the genome, we explore possible binding configurations in these two types of site. We conclude that binding is modular, combinatorial, and highly flexible. NS1 recognizes two to six variably spaced, more-or-less degenerate forms of the 5'-TGGT-3' motif, so that it binds efficiently to a wide variety of sequences. Thus, despite complex coding constraints, binding sites are configured at frequent intervals throughout duplex forms of viral DNA, suggesting that NS1 may serve as a form of chromatin to protect and tailor the environment of replicating genomes.
5-kb), linear, single-stranded-DNA genomes flanked by short imperfectly palindromic hairpin telomeres. These terminal palindromes serve as hinges, sequentially folding and unfolding during DNA synthesis to adapt an ancient unidirectional strand displacement mechanism, called rolling-circle replication, for use with a linear substrate (15, 16). Like many rolling-circle replicons, parvoviruses encode a single essential replication protein, an initiator endonuclease, otherwise usurping the replication machinery of the host cell for their own preferential amplification. However, the highly processive unidirectional strand displacement synthesis that characterizes rolling-circle replication, while relying predominantly on cellular enzymes and cofactors, is likely not permitted at cellular forks (4, 43). Thus, parvoviruses have to create an environment in which they can escape normal cellular replication constraints.
Parvovirus initiator proteins direct replication fork assembly at two duplex origin sequences created from the viral telomeres by introducing a site-specific single-strand nick, which creates the base-paired DNA primer needed to recruit a leading-strand-specific host polymerase, most likely DNA polymerase
. Minute virus of mice (MVM) is a member of the Parvovirus genus in the family Parvoviridae. MVMs differ from their adeno-associated virus (AAV) and parvovirus B19 cousins in the Dependovirus and Erythrovirus genera, respectively, by being heterotelomeric (51). While homotelomeric viruses have inverted terminal repeats, so that the hairpins and origins at the two ends of the genome are identical, in heterotelomeric viruses, the two termini have different sequences and predicted secondary structures, and the resulting origins are structurally and functionally different. Thus, it is necessary for the MVM initiator protein NS1, an 83-kDa nuclear phosphoprotein, to establish two very different ternary initiation complexes—with duplex origin sequences expressed in different replicative-form (RF) DNA intermediates and with specific cellular cofactors that are absolutely required at one or other origin (15, 16). To form a nicking complex, NS1 first binds site specifically, in a highly asymmetric way, to a duplex cognate recognition sequence in the origin. Its binding site is characterized by the presence of two to three iterations of the tetranucleotide 5'-ACCA-3', although the sequences and spacings of these motifs in the two origins differ (11, 12). Due to its orientation relative to the nick site, we now generally refer to this motif by its complementary sequence, 5'-TGGT-3'. The binding site in the left origin (OriL) has the sequence 5'-TGGT-TGGT-CAGT, with the resulting nick occurring on the opposite strand 23 bp downstream of the 5' T, whereas in the right origin (OriR), the nick-proximal NS1 binding site is TGGT-T-CAGT-TGGT and the nick occurs on the opposite strand 20 nucleotides (nt) downstream of the 5' T. From these binding sites, NS1 then sets up stabilizing interactions with its DNA-bound cofactors. At the OriL, this is the p79 subunit of parvovirus initiation factor (PIF), a heterodimeric cellular transcription factor (7, 9). 
A distant, DNA-bound NS1 complex positioned at the tip of the hairpin performs this function at OriR, where initiation is critically mediated by cellular-DNA-bending proteins from the HMG1/2 family that create a specific double-strand loop in the intervening DNA (12). In a reaction that consumes ATP, NS1 then unwinds the adjacent duplex DNA, exposing and engaging the future nick site in a single-stranded form (40). Nicking is somewhat sequence specific and is mediated by a trans-esterification reaction that leaves NS1 covalently attached to the new 5' end by a phosphodiester bond, from which position it is thought to remain in the fork, serving as part of the 3'-to-5' replicative helicase.
DNase I protection analysis has shown that, unlike their AAV counterparts (called Rep68 and Rep78 [36]), MVM NS1 molecules only exhibit site-specific DNA binding if they are preassembled into some form of oligomer, at least a dimer. Dimerization can be induced by the addition of antibodies directed against their N- or C-terminal peptides or by the binding of ATP, presumably into the canonical NTP-binding pocket of the NS1 helicase domain, although ATP hydrolysis is not required (8). This approach was used in vitro to show that NS1 binds to an internal MVM sequence, called the transactivation region (TAR; nt 1863 to 2005), from which position its acidic C-terminal peptide potently upregulates transcription from the nearby capsid gene promoter in vivo (21, 31, 33, 45). Thus, NS1's ability to bind DNA site specifically has been co-opted at the TAR to fulfill an additional role in the viral life cycle, in which binding does not apparently lead to activation of the virus's endonuclease. At all sites analyzed to date, NS1 protects an extensive region, some 41 to 46 nt on each strand, from attack by DNase I. The NS1 footprint is positioned asymmetrically so that it projects
5 nt to the 5' side of the first TGGT motif in the cluster but extends over and beyond the potential nick site, by some 12 to 14 nt. Again, unlike that of its AAV Rep counterpart (36), NS1 binding cannot be demonstrated by electrophoretic mobility shift analysis in either the presence or the absence of ATP. Rather, demonstration of complex formation requires an immuno-selection procedure in which DNA fragments are incubated with NS1 in the presence of antibodies directed against one of its terminal peptides, and the bound species then coprecipitated with protein A-Sepharose (8, 11). Here we use this type of assay to explore the characteristics and distribution of NS1 binding sites in duplex viral DNA and show that there are many of these sites dispersed throughout the genome, some of which bind with substantially higher efficiency than those used to mediate nicking in the origins.
Plasmids used for binding assay substrates. pMVMp-D1 is a derivative of pMM984 (25, 37), which contains a full-length infectious copy of the genome of the prototype strain of MVM (MVMp) inserted between the ClaI and BamHI sites of pAT153 (a pBR322 derivative that lacks pBR nt 1730 to 2352). When used as a substrate for NS1 binding assays, pMVMp-D1 was fragmented by double digestion with the restriction enzymes AciI and BglII, which cut a total of 14 times within the viral insert and at many sites in the vector, generating a range of fragments of different sizes that all have 5' overhangs containing a single dGTP residue. These were labeled, according to their molar ratio, with the modified T7 polymerase Sequenase (U.S. Biochemicals, Cleveland, OH) in the presence of 32P-deoxycytidine and three cold deoxynucleoside triphosphates (dNTPs). pRM9/12 is a pCR2.1 (Invitrogen, San Diego, CA) derivative containing PCR-generated MVMp sequences from nt 4086 to 4912. For binding assays, this was fragmented with AciI and XbaI and labeled as described above.
Analytical NS1 DNA binding assays. Binding assays were carried out at 4°C in 0.1 ml of buffer A, which contains 20 mM Tris-HCl (pH 8.0), 10% glycerol, 1% Nonidet P-40, 5 mM dithiothreitol, and 100 mM NaCl containing 200 ng of poly(dI)-poly(dC) (Sigma, St. Louis, MO), unless otherwise specified in the text. ATP was added to 0.5 mM, as indicated in the text. Proteins were added as crude translation extracts (4 µl) or as purified products. In standard assays, 50 ng NS1 (±25 ng p79/96 PIF) was incubated, with or without 0.5 mM ATP, for 10 min before the 32P-labeled DNA fragments (10 ng) were added. Samples were incubated for 1 h on ice before rabbit antiserum (2 µl) was added, and the incubation continued for a further hour. Antisera against NS1 were as previously described (11), either directed against its C-terminal 16 amino acids or against a 91-amino-acid segment of its N terminus. Antibodies specific for the C-terminal peptide of the NS2P isoform have been described previously (13), and antisera specific for the p79 and p96 subunits of PIF were raised against synthetic 14-mer peptides derived from nonconserved, nonoverlapping domains of either subunit, conjugated to keyhole limpet hemocyanin (6). After incubation, 30 µl of buffer A containing 1.5 mg of prewashed protein A-Sepharose was added, and the mixtures were tumbled for 30 min at 4 to 8°C. After removal of the supernatant, immunoprecipitates were washed twice with 1 ml of cold buffer A and the DNA was deproteinized by incubation with proteinase K in the presence of 0.5% sodium dodecyl sulfate for 1 h at 55°C. Samples were analyzed by electrophoresis on 2.5% SupraSieve 3:1 GPG agarose gels (American Bioanalytical, Natick, MA) and exposed to film and/or quantitated with a Molecular Dynamics PhosphorImager.
Competitive NS1 selection assays. (i) OriL substrate. NS1 binding sites in the context of the MVM minimal OriL (OriLTC) were selected from a library of 92-bp fragments generated by annealing the two oligonucleotides N-ACCA (5'-gag aga tga gcg atg CAC GTC ACT TAC GTG AAC ATG GTN NNN NNN NNN NNA AAA TGA TAA GCG GTT CAG GGA GTT acg acg cga tac aga gc-3', where capital letters denote the MVM sequence) and AJP7B (5'-gct ctg tat cgc gtc gtA ACT CC-3'). The resulting 5' overhang was then filled in using Sequenase and all four dNTPs. Binding assays followed the analytical protocol described above, except that they contained 25 ng of substrate DNA (in the first round), 0.5 mM ATP, and 300 ng of His6-NS1. The substrate was bound in the presence of 2 µl of rabbit antibody directed against the NS1 N-terminal peptide for 2 h, 2.5 mg protein A-Sepharose was added, and samples were tumbled for a further hour, before precipitates were collected by centrifugation and washed.
Using a 32P-labeled wild-type version of this substrate, we first standardized the binding assay using 25 ng of the duplex 92-mer substrate and various NS1 concentrations. Resulting precipitates were deproteinized and analyzed by electrophoresis through nondenaturing 10% polyacrylamide gels, followed by PhosphorImager quantitation. This assay was set up to scavenge immunoglobulin G from 2 µl of anti-NS1 N-terminal serum, and we empirically determined that this allowed us to use a maximum of 300 ng of recombinant NS1 per reaction. Under standard conditions, this typically coprecipitated approximately 5 ng of wild-type DNA fragments from the 25-ng pool. When binding to completion, 300 ng of NS1, if binding as dimers, could theoretically complex with 90 ng of a 92-mer duplex substrate or, if binding as hexamers, could complex with 30 ng, so that our system appeared to be functioning with reasonable efficiency, albeit probably not to completion. Since the selection substrate had random nucleotides at 12 positions, if all possible nucleotides were equally represented in the starting pool, we theoretically needed to sample approximately 1.7 x 10^7 molecules to access a single copy of each sequence. By retrieving a maximum of 5 ng of DNA per reaction mixture (or 9.1 x 10^9 molecules), we thus exceeded the minimum necessary sample of the entire population complexity by approximately 500-fold. A control oligonucleotide, which resembled the test substrates except that the wild-type NS1 binding site (TGGTTGGTCAGT) was replaced by the sequence TGTGTGTGCATG (SCRAM-ACCA), was not detectably precipitated in this assay, and neither were wild-type substrates precipitated using negative-control serum or immuno-selected in the absence of NS1 (data not shown).
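The sampling argument above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the number of randomized positions (12) and the retrieved molecule count (9.1 x 10^9) are taken from the text, and equal representation of all four nucleotides at each randomized position is assumed, as the text itself assumes.

```python
# Minimal sketch of the library-complexity arithmetic described in the text.
# Function names are hypothetical; the numeric inputs are quoted from the text.

RANDOM_POSITIONS = 12        # randomized positions in the OriL selection substrate
BASES = 4                    # A, C, G, T, assumed equally represented
MOLECULES_RETRIEVED = 9.1e9  # molecules in about 5 ng of DNA, as stated in the text


def library_complexity(n_positions, alphabet=BASES):
    """Number of distinct sequences possible at n fully randomized positions."""
    return alphabet ** n_positions


def fold_coverage(molecules, complexity):
    """How many times over the full sequence space is sampled."""
    return molecules / complexity


complexity = library_complexity(RANDOM_POSITIONS)
coverage = fold_coverage(MOLECULES_RETRIEVED, complexity)
print(f"{complexity:,} possible sequences; sampled about {coverage:.0f}-fold")
```

The same function applied to a substrate with 8 randomized positions, such as the TAR-N8 oligonucleotide described in the Methods, gives a far smaller sequence space (4^8 = 65,536 sequences).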
Immuno-complexed products were introduced directly into PCRs and amplified through 26 cycles using the primers AJP6 (5'-gag aga tga gcg atg CAC G-3') and AJP7B, annealed at 56°C. Products were then diluted 1:4 into fresh PCR mix and reamplified through a single cycle to ensure that both strands of the resulting duplexes were identical. Two hundred nanograms of these amplified products, purified through a Sephadex G-50 spin column or by agarose gel electrophoresis, served as the substrate for subsequent rounds of selection. After three or four rounds of selection, the product was cloned into pCR2.1 and the inserts of individual clones were subjected to DNA sequencing.
(ii) TAR substrate. NS1 binding sites in the context of the TAR sequence were selected using a library of 74 -bp DNA fragments based on MVMp nt 1849 to 1922. The initial library was generated by annealing the two oligonucleotides TAR-N8 (5'-AAT GGC CCA TGA TTT GTG CTT GGT NNN NNN NNA ATG GTT ACC AAT CTA CCA TGG CAA GCT ACT GTG CTA AAT GG-3') and TAR-rev (5'-CCA TTT AGC ACA GTA GC-3') and filling in the 5' overhang as before. Binding assays were carried out as described above, with the following modifications: assays contained 0.25 mM ATP and 150 ng of poly(dI)-poly(dC), substrate concentrations varied from 10 to 30 ng per assay, salt concentrations varied from 100 to 150 mM, and in some cases MgCl2 was added to 0.5 mM. Sequences obtained using these various conditions did not differ conspicuously from one another, and results were accordingly pooled to determine the underlying trends. His6-NS1 (100 ng) was preincubated with buffer for 10 min, the substrate was added, and binding was allowed to proceed for 50 min prior to the addition of 2 µl of antibody directed against the NS1 C-terminal peptide. Incubation was continued for a further 50 min, protein A-Sepharose was added, and samples were tumbled for 30 min. Immunoprecipitates were washed, deproteinized, purified through Sephadex G-50 spin columns, and reamplified by PCR. For the first round of amplification, primers TAR-for (5'-AAT GGC CCA TGA TTT GTG C-3') and TAR-rev were used, with a 45°C annealing step. In subsequent rounds of selection/amplification, more-stringent primers, TAR-for2 (5'-AAT GGC CCA TGA TTT GTG CTT GGT-3') and TAR-rev2 (5'-CCA TTT AGC ACA GTA GCT TGC CAT GGT AG-3'), were used, with a 62°C annealing step. Selected DNA sequences were determined by clonal analysis of the PCR products after rounds 3 and 4, as described above.
DNase I protection assays. Substrates were generated from plasmid pRM9/12 as follows: for the "307" site, a 149-bp fragment (MVM nt 4213 to 4361) whose bottom strand was 3'-end labeled with 32P was obtained by digestion with BglII, 3'-end labeling, and redigestion with RsaI; and for the "65-bp repeat," a fragment of 256 nt (MVM nt 4662 to 4912) whose top strand was 3'-end labeled with 32P was obtained by first cutting pRM9/12 with EcoRI, 3'-end labeling the products, and redigesting them with MseI.
Protection assays were as previously described (12), using reaction mixtures (25 µl) that contained 200 ng His6-NS1 and DNA fragments (approximately 4 x 10^4 cpm) in a solution containing 25 mM HEPES-KOH (pH 7.8), 75 mM sodium acetate, 0.1 mM EDTA, 2 mM MgCl2, 2.5 mM dithiothreitol, 1 mM γS-ATP, 250 µg of bovine serum albumin per ml, 0.005% NP-40, 0.125 mg of an unrelated double-stranded nonspecific competitor oligonucleotide, and 0.125 mg of blunt-ended, nonspecific duplex DNA fragments (100 to 800 bp in length) derived from vector plasmid pCR2.1 by digestion with RsaI and NciI. Products were extracted with phenol-chloroform, precipitated with ethanol in the presence of an oyster glycogen carrier, and analyzed by electrophoresis through denaturing 7% acrylamide gels. DNA probes were also chemically cleaved at G residues by the procedure of Maxam and Gilbert (35) and electrophoresed as markers to allow sequence alignment.
In vitro replication assays. In vitro replication assays contained extracts from uninfected A9 cells, prepared essentially as described by Wobbe and colleagues (53). Assay mixtures (20 µl) contained purified recombinant vaccinia virus His6-NS1 (25 mg/ml), dNTPs, MgCl2, ATP, an ATP-regenerating system, substrate plasmid DNA (10 mg/ml), and a 32P-labeled dNTP. Assays were terminated and results analyzed as previously described (14).
, since when replicated in bacteria, this sequence is susceptible to internal deletions which ultimately remove a cruciform element near the tip of the viral hairpin (3). Since these deletions are somewhat heterogeneous, they potentially generate a smear of DNA, as well as the two discrete C and C
fragments, so that this sequence may appear somewhat underrepresented in the input.
FIG. 1. NS1 binds to and coprecipitates all viral sequences over 170 bp. On the right, an agarose gel shows total [32P]dCTP-labeled pMVMp-D1 fragments generated by AciI and BglII digestion prior to immunoprecipitation in lane 1 (4% of the fragments used per immunoprecipitation), while lanes 2 to 4 show species coimmunoprecipitated with in vitro translation products containing NS2 and anti-NS2 antibodies (lane 2), NS1 and anti-NS1 antibodies (lane 3), and NS1 with nonimmune serum (lane 4). Individual viral species coprecipitated with NS1 are designated by the letters A to J. P denotes a fragment from the vector, which is also represented by myriad smaller species near the bottom of lane 1. On the left, a diagram of the genome and NS1 and VP transcripts is aligned with a horizontal-bar diagram representing the AciI and BglII sites in the genome (short vertical bars); precipitated viral fragments are indicated by vertical bars that represent the fractions of input recovered, as determined by PhosphorImager analysis.
92% of the total), distributed throughout the length of the genome. Migrating along with this relatively high-molecular-weight cluster is a single plasmid sequence, marked P, at 362 bp, while all other plasmid sequences are smaller, at <151 bp, and run as a dense cluster toward the bottom of the lane. Additional viral sequences of 165 bp (nt 1953 to 2118), 114 bp (nt 2118 to 2232), 104 bp (nt 2822 to 2926), 50 bp (nt 1028 to 1078), and 30 bp (nt 2232 to 2262) are also present in this mixture, as discussed below. Following incubation with the NS1 in vitro translation products, viral fragments A to J could be immuno-selected with antibodies directed against the C-terminal 16 amino acids of NS1, while fragment P, the 362-bp plasmid-derived sequence, and all sequences under 170 bp failed to precipitate (Fig. 1, lane 3). No fragments were precipitated with prebleed rabbit serum (lane 2) or after incubation with in vitro-translated NS2 and coprecipitation with specific anti-NS2 C-terminal peptide antibodies (lane 4). This indicates that all viral sequences of 171 bp or more contain one or more NS1 binding sites that render them susceptible to specific immuno-selection.
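As a quick consistency check on the coordinates quoted above, the lengths of the small viral fragments can be recomputed from their nucleotide positions and compared with the size threshold. This is only a sketch: the helper name and the 170-bp cutoff constant are assumptions introduced for illustration, and fragment length is taken simply as the difference between the two quoted coordinates.

```python
# Sketch: recompute fragment lengths from the nt coordinates quoted in the text
# and compare them with the approximate size cutoff for NS1 immuno-selection.

SIZE_CUTOFF_BP = 170  # assumption: approximate threshold described in the text

# (start_nt, end_nt) pairs quoted for the small viral fragments
small_fragments = [
    (1953, 2118),
    (2118, 2232),
    (2822, 2926),
    (1028, 1078),
    (2232, 2262),
]


def fragment_length(start_nt, end_nt):
    """Length in bp, taken as the difference between the quoted coordinates."""
    return end_nt - start_nt


lengths = [fragment_length(start, end) for start, end in small_fragments]
below_cutoff = [n for n in lengths if n < SIZE_CUTOFF_BP]
print(lengths, "all below cutoff:", len(below_cutoff) == len(lengths))
```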
The intensities of each band in Fig. 1, lane 3, were then quantified with a PhosphorImager and plotted along a horizontal line representing the viral genome and on which the positions of the various AciI and BglII sites are indicated. Certain species clearly compete for the available NS1 and are coprecipitated more efficiently than others. Thus, fragments A, H, and I and the sum of fragments C and C
appear specifically enriched in the precipitate, while fragment J, which is known to contain the NS1 binding site from OriL, is surprisingly underrepresented. The high-affinity species A (875 bp) and C/C
(
600 bp) contain NS1 binding sites of known function (TAR and OriR, respectively), but fragments H (307 bp) and I (215 bp) do not. These represent MVM nt 4212 to 4519, derived from the downstream end of the VP coding sequence (hereafter called the "307" fragment), and nt 2262 to 2477, which we refer to as the "intron" because it contains the minor splice region that is removed first from all spliced viral transcripts. These data show, therefore, that under conditions of considerable stringency, NS1 binds site specifically throughout the viral sequence, but it interacts more efficiently with some sequences than with others and specifically favors the TAR element, the minor intron, and elements towards the right end of the genome, both within the viral hairpin and extending back into the C terminus of the VP coding region.
ATP differentially potentiates NS1 binding to certain sites and allows complex assembly at higher salt concentrations.
In vitro translation products contain a wide range of nonviral factors that potentially influence complex formation. Thus, for further analysis, we used purified recombinant NS1 expressed with an N-terminal His6 tag in HeLa cells from a vaccinia virus vector. As shown in Fig. 2, we assessed the binding of His6-NS1 under various salt concentrations, from 100 to 150 mM NaCl as indicated below, and in the presence or absence of ATP, again precipitating complexes with antibodies directed against the extreme C-terminal peptide of NS1. Without ATP, site-specific binding was observed only in buffer containing 100 mM NaCl, with the same range of viral fragments (A to J) being selected as described in the legend to Fig. 1, while the 362-bp vector band and the myriad species under
170 bp remained unbound (Fig. 2A, lanes 6 to 8). Addition of ATP enhanced overall selection of these same fragments by approximately sixfold in 100 mM NaCl (cf. lanes 6 and 9) and allowed this binding to proceed in 125 mM salt (lane 10), although fragments A, H, and I substantially out-competed other sequences under these conditions. Thus, addition of ATP enhances the affinity with which NS1 is able to bind to sequences of >170 bp distributed throughout the viral genome, but binding becomes increasingly nonuniform, with fragments A, H, and I showing the same relatively enhanced affinity seen previously for NS1 translation products.
FIG. 2. Relative binding intensities of viral fragments coprecipitated with NS1 at various levels of stringency. (A) Agarose gel displaying [32P]dCTP-labeled pMVMp-D1 fragments before (lane 5 [5% of fragments used per immunoprecipitation]) and after (lanes 1 to 4 and 6 to 11) coimmunoprecipitation with 50 ng His6-NS1. Samples in lanes 1 to 4 additionally received 25 ng of purified recombinant PIF heterodimers. Samples were incubated in the presence (lanes 9 to 11) or absence (lanes 1 to 4 and 6 to 8) of 0.5 mM ATP and in buffer containing 100 mM NaCl (lanes 1 to 2, 4, 6, and 9), 125 mM NaCl (lanes 3, 7, and 10), or 150 mM NaCl (lanes 8 and 11) and precipitated with an anti-NS1 C-terminal peptide antibody (lanes 3 to 4 and 6 to 11) or antibody directed against the PIF79 subunit (lane 1) or the PIF96 subunit (lane 2). (B) Samples in lanes 1 to 7 exactly correspond to those shown in Fig. 2A, lanes 5 to 11, except that NS1 complexes were precipitated with an antibody directed against the N-terminal 84 amino acids of NS1. Asterisks denote the small bands precipitated by this antibody as discussed in the text. (C) Agarose gel displaying [32P]dCTP-labeled fragments of pRM9/12, digested with AciI and XbaI, before coimmunoprecipitation (lane 4 [5% of fragments used per immunoprecipitation]) and after coimmunoprecipitation with 50 ng His6-NS1 (lanes 1 to 2) or 100 ng NS1 (lane 3) in the absence (lane 1) or presence (lane 2 to 3) of 0.5 mM ATP, using the anti-NS1 C-terminal peptide antibody. 65R, 307, and Rsa denote the fragments containing the 65-bp repeat, the "307" fragment, and the RsaI A and B segments, respectively. , anti.
20 nt, but NS1 will bind to these same sequences if they are self-ligated or cloned into a vector backbone (8, 11). Thus, the precise position of the binding site within a particular fragment may influence its binding affinity. If the small bands seen in Fig. 2B, lane 5, do indeed represent the missing viral fragments, this would leave just one MVM sequence, 165 bp, spanning nt 1953 to 2118, which cannot be specifically coprecipitated with NS1. This particular sequence does not contain a perfect TGGT tetranucleotide, although such motifs are present within its flanking DNA 16 nt and 40 nt upstream and downstream, respectively, of its AciI termini. Overall, these data suggest that NS1 probably does bind site specifically throughout the viral genome, although clearly, certain regions are closer to optimal than others. Figure 2A (lanes 1 to 6) also shows the effect of adding recombinant PIF heterodimers to an NS1 binding reaction. PIF binds to two tetranucleotide half-sites that are positioned next to the NS1 binding site in OriL, making direct contact with NS1 molecules on the active form of the origin (OriLTC) but not on the inactive arm (OriLGAA). Comparison of otherwise identical binding reactions carried out in the presence and absence of PIF (cf. Fig. 2A, lanes 4 and 6, respectively) indicates that this cofactor substantially enhances NS1 binding to the origin fragment (J), but the interaction still remains relatively weak compared to those of other viral sequences, and all binding is eliminated by raising the salt concentration to 125 mM, as seen in lane 3. Thus, the NS1 binding sites that direct nicking at OriLTC, and likely also mediate melting and refolding of the left hairpin, are surprisingly weak compared to other sites in the genome, even in the presence of their specific cellular cofactor. Precipitation with antibodies directed against the PIF79 or PIF96 subunits, shown in Fig. 2A, lanes 1 and 2, respectively, leads to coprecipitation of only OriL fragment J, suggesting that PIF is not able to form a stable complex with DNA-bound NS1, except when it is also bound to an adjacent PIF binding site.
The AciI/BglII digest of pMVMp-D1 is useful because it fractionates a large proportion of the viral genome into readily distinguished species. However, many fragments are likely to contain multiple TGGT-enriched regions, so that binding is potentially influenced by the clustering of sites, as well as by individual affinity differences. To explore this in more detail, we selected one particular region located upstream of the hairpin at the right end of the genome, where NS1 binding is especially strong and where there are various sequence elements that have been reported to be important for the efficient replication or packaging of internally deleted forms of the viral DNA (5, 10, 28, 49, 50). In particular, we examined the C-terminal end of the VP coding sequence and 3' untranslated region (UTR), which was previously contained within BglII/AciI fragments H and C and within which there is a tandem 65-bp repeat sequence in MVMp. For this, we cloned MVM nt 4086 to 4920 into pCR2.1, fragmented the resulting plasmid with AciI and XbaI, and assessed NS1 binding in the presence and absence of ATP, as before. This digest gave the fragments seen in Fig. 2C, lane 4, which include three viral fragments: (i) a fragment from the VP coding region flanked by 34 nt of the upstream vector sequence, which spans MVM nt 4086 to the XbaI site at nt 4342, giving a fragment of 290 bp; (ii) an internal 175-bp XbaI/AciI fragment (nt 4343 to 4518), which contains the so-called Rsa A and B segments analyzed by Tam and Astell (50); and (iii) a 415-bp fragment, which contains the 393-bp MVM sequence spanning nt 4519 to 4912, which comprises most of the viral 3' UTR, including the 65-bp repeat element. As seen in Fig. 2C, lane 1, these three fragments were all specifically selected when the reaction mixture was incubated with 50 ng of His6-NS1, while vector fragments were excluded, but addition of ATP enhanced their binding to different extents (lane 2).
Thus, the 65-bp repeat sequence, which showed the strongest binding in the absence of ATP, bound approximately twice as efficiently in its presence. The Rsa A- and B-containing fragment bound approximately four times as efficiently in the presence of ATP as in its absence, but binding of the 290-bp sequence from the VP coding region was elevated almost 10-fold by the addition of ATP, so that most of this fragment was precipitated from the mixture, and addition of a further 50 ng of NS1 (lane 3) recruited few additional copies. This suggests that the 290-bp fragment (MVM nt 4086 to 4342) contains an ATP-inducible high-affinity site, which was probably responsible for the elevated binding of the "307" fragment shown in Fig. 1 (fragment H, MVM nt 4212 to 4519), and while the 65-bp repeat region binds well, it is only minimally influenced by the addition of ATP. It was previously hypothesized that elements in the Rsa A and B sequences present in the 167-bp fragment might function as an atypical internal replication origin (5). However, while NS1 clearly binds to this fragment, such binding is neither particularly strong nor responsive to ATP, so that patterns of NS1 binding are unlikely to explain the results on which that model was based.
The 65-bp repeat region contains three separate NS1 binding sites, but only a single, more complex site dominates the 290-bp fragment.
To confirm the exact positions of the NS1 binding sites in the 65-bp repeat and 290-bp fragments, we used DNase I protection assays. As shown in Fig. 3A, the 65-bp repeat fragment contains three distinct, tandemly arranged TGGT clusters, each of 12 nt, arranged as follows: TAGT-TAGT-TGGT, ending at nt 4719, which is also the start of the first 65-bp repeat element, and two TGGT-TGGT-AGGT sequences ending at nt 4785 and 4849, which coincide with the ends of the first and second repeats, respectively. These resulted in three sharp, nonoverlapping NS1 footprints, each protecting
42 nt of DNA and spanning the junctions of the 65-bp repeat elements. Thus, at this position in the genome, there are three separate NS1 binding sites within 200 bp of viral sequence. The edges of these footprints were quite distinct, and induced overcutting of the first 5' nucleotide in site 3 can be seen. As observed previously at other NS1 binding sites, the footprints were positioned asymmetrically with respect to the TGGT clusters, with the most 5' TGGT motif positioned within a few nucleotides of the 5' end of each protected area. By analogy to AAV Rep68 nuclease domain-DNA interactions detailed by crystallography (26), the finding that each discrete site contains three contiguous, tandem, perfect, or three out of four nucleotide matches to the TGGT motif suggests that NS1 binds these sites as a lower-order oligomer, possibly a dimer or trimer. However, despite the presence of three separate cognate sites, NS1 binding to this DNA fragment was only weakly responsive to the addition of ATP (as seen in Fig. 2C), suggesting that the induction of higher-order NS1 oligomers does not markedly potentiate this type of tandem, repetitive interaction.
FIG. 3. Two extreme forms of the NS1 binding site. Autoradiographs of lanes from sequencing gels showing DNase I digestion patterns obtained in the absence (–) or presence (+) of NS1. Rows marked "G" indicate the products of G-specific chemical cleavage reactions run on each substrate. (A) NS1 binding sites surrounding the MVMp 65-bp repeat region in the 3' UTR. NS1 protects three separate sequences in this region from digestion with DNase I (indicated by horizontal arrows with unfilled arrowheads). The three specific NS1 binding sites within these footprints are indicated by thick horizontal bars, and the sequences are detailed below. Site 1 is positioned immediately prior to the start of the first copy of the 65-bp repeat, ending at MVMp nt 4719, while site 2 terminates at the end of this first copy, at nt 4785 (65-bp repeats are indicated by arrows with filled heads) and site 3 terminates at the end of the second copy, at nt 4849. (B) ATP-inducible high-affinity NS1 binding site near the end of the VP coding region (fragment "307," MVMp nt 4273 to 4325). NS1 protects an extended region from digestion with DNase I, as indicated in the sequence panel beneath the gel, where individual TGGT motifs are boxed, and the direction that these sequences would be expected to orient an asymmetric footprint are indicated by arrows with unfilled heads. A central span of 42 nt, indicated by the leftward-pointing arrow within the sequence diagram, is strongly protected, but at each end protection may extend into flanking sequences, which contain additional TGGT motifs.
Overview of additional NS1 binding sites in the genome. We selected the region between MVM nt 4086 and 4920 to characterize in detail because it illustrates two very different types of NS1 binding site. However, data presented in Fig. 1 and 2 indicate that there are many other potent sites distributed throughout the genome, all of which have unique sequences and motif arrangements. For example, the highly ATP-inducible interaction with the intron fragment (fragment I in Fig. 1, spanning MVM nt 2262 to 2477) maps to a major site, which allows protection of the 45-bp sequence 2329-AGGGATGGTTGGTTGGTGGGGTATTAATGTTTAATTACCTGTTTT-2373 from DNase I digestion, followed by a second, opposed site, 2428-GGACCAGGGAACAGCCTTGACCAAGGAGAACCAACCAATCCA-2469 (data not shown). These sequences contain six tandem motifs that are either perfect TGGT matches (bold and underlined motifs) or match it in three of four nucleotides (underlined), with only a single match of three of four nucleotides in the opposing direction, suggesting that NS1 oligomers might bind each of these sites in a single orientation.
MVM genome fragments that competed less effectively in the immunoselection assays (fragments G, D, F, E, and B in Fig. 1 and 2) all contain between seven and nine perfect matches to the TGGT module, as well as many matches of three of four nucleotides. Some of these are clustered in ways that suggest cooperative binding could occur, as illustrated by the sequences 251-ACCAACTAACCATGGCTGGAAATGCTTACTCTGATGAAGTTTTGGGAGCAACCAACTGGT-310 in fragment G, 711-TGTTTACTGGAGCAGATGGTTGGTAACA-738 in fragment D, 2597-TGGGGAGGCAAGGTTGGTCACTA-2619 in fragment F, and 3388-TGGTTTCTACCCCTGGAAACCAACCATAGCATCACCA-3424 in fragment E, but these have yet to be assessed by DNase I footprinting. However, there are also relatively isolated, single TGGT motifs distributed throughout the genome. This is most apparent in fragment B (nt 3450 to 4212), where there are seven perfect TGGTs, but none of these are clustered, and only four show tandem clustering with even a single match of three of four nucleotides. While showing modular affinity for NS1, these scattered motifs may not exist in sufficient proximity to each other to allow cooperative binding following higher-order NS1 oligomerization.
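The motif tallies quoted above (perfect TGGT matches and matches of three of four nucleotides, in either orientation) can be reproduced with a simple sliding-window scan. The following Python sketch is illustrative only and is not the procedure used in this study; the function names `reverse_complement`, `scan_strand`, and `scan_both` are hypothetical, and it assumes that a motif on the bottom strand is scored by scanning the reverse complement of the top strand.

```python
from typing import Dict, List, Tuple

MOTIF = "TGGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def scan_strand(seq: str, min_match: int = 3) -> List[Tuple[int, str, int]]:
    """List (0-based offset, tetranucleotide, score) for every 4-nt window
    that matches MOTIF in at least `min_match` positions."""
    hits = []
    for i in range(len(seq) - len(MOTIF) + 1):
        window = seq[i:i + len(MOTIF)]
        score = sum(a == b for a, b in zip(window, MOTIF))
        if score >= min_match:
            hits.append((i, window, score))
    return hits


def scan_both(seq: str, min_match: int = 3) -> Dict[str, List[Tuple[int, str, int]]]:
    """Scan the given (top) strand and its reverse complement (bottom strand).
    Bottom-strand offsets are relative to the reverse-complemented sequence."""
    seq = seq.upper()
    return {
        "top": scan_strand(seq, min_match),
        "bottom": scan_strand(reverse_complement(seq), min_match),
    }


if __name__ == "__main__":
    # Fragment D sequence quoted in the text (MVM nt 711 to 738).
    fragment_d = "TGTTTACTGGAGCAGATGGTTGGTAACA"
    for strand, hits in scan_both(fragment_d, min_match=3).items():
        for offset, window, score in hits:
            print(strand, offset, window, f"{score}/4")
```

Setting `min_match=4` restricts the scan to perfect matches; clustering and spacing between hits would still need to be judged separately, as in the text.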
Selection of optimal NS1 binding sites. The extensive disparities in sequence and binding affinities described above prompted us to examine in more detail what constitutes a stable NS1 binding element within the confines of specific viral sequences, first within the relatively weak site present in OriL and second within the high-affinity TAR cluster. For this, we used a competitive immuno-selection procedure based on the binding assays but in which duplex oligonucleotides bearing randomized sequences downstream of a single anchoring TGGT motif were allowed to compete in vitro for binding by a limiting amount of NS1. This was then immunoprecipitated, and the coselected oligonucleotides amplified by PCR to provide an enriched substrate for a second round of selection. In each case, we performed several (a maximum of four) rounds of competitive selection and amplification before cloning the products into pCR2.1 and sequencing the inserts.
(i) NS1 binding sites in the context of the OriL. To mimic the environment of OriL, we used a synthetic 92-mer oligonucleotide containing the sequence of the top strand of OriLTC but in which 12 nucleotides downstream of the first TGGT were randomized, as shown in Fig. 4. This potential minimal origin sequence was flanked on each side by specific primer binding sites. As a control for the selection procedure, the starting randomized substrate was directly cloned into pCR2.1 and 10 clones were sequenced. The nucleotide distribution through the randomized sequences in these control clones and in the final immuno-selected sequences is summarized in Table 1. It is readily apparent that under selection, nucleotide distribution became strongly biased, with T and G favored, both overall and at certain positions, while A and especially C residues were suppressed.
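A per-position nucleotide tally of the kind summarized in Table 1 can be sketched as follows. This is a minimal illustration, not the analysis code used for the table: the function names are hypothetical, and the three 12-nt sequences in the usage example are invented placeholders rather than selected clones from this study.

```python
from collections import Counter
from typing import Dict, List

NUCLEOTIDES = "ACGT"


def position_frequencies(clones: List[str]) -> List[Dict[str, float]]:
    """For equal-length sequences, return one dict per position giving the
    fraction of clones carrying each nucleotide at that position."""
    if not clones:
        raise ValueError("at least one sequence is required")
    length = len(clones[0])
    if any(len(c) != length for c in clones):
        raise ValueError("all sequences must have the same length")
    table = []
    for pos in range(length):
        counts = Counter(c[pos] for c in clones)
        table.append({nt: counts[nt] / len(clones) for nt in NUCLEOTIDES})
    return table


def overall_composition(clones: List[str]) -> Dict[str, float]:
    """Return the overall fraction of each nucleotide across all sequences."""
    counts = Counter("".join(clones))
    total = sum(counts[nt] for nt in NUCLEOTIDES)
    return {nt: counts[nt] / total for nt in NUCLEOTIDES}


if __name__ == "__main__":
    # Hypothetical 12-nt randomized regions, for illustration only.
    example = ["TGGTTGGTAGTT", "ATGGTTGGTTGA", "TTGGTGGTTGGT"]
    for pos, freqs in enumerate(position_frequencies(example), start=1):
        print(pos, {nt: round(f, 2) for nt, f in freqs.items()})
    print(overall_composition(example))
```

Comparing the output for unselected control clones with that for selected clones would show the kind of bias toward T and G described above.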
FIG. 4. Selection experiments to determine preferred NS1 binding sites in the context of OriLTC. The sequence of the origin region from the MVM left-end hairpin is depicted, with essential elements either named or indicated by boxes. Of these, the two perfect TGGT motifs of the NS1 binding site are enclosed in a box of solid lines, while the more degenerate downstream tetranucleotide from this site is in a box of broken lines. The light-gray shading indicates the consensus nick site, and dark-gray shading indicates the two half-sites recognized by PIF. The sequence of the N-ACCA oligonucleotide used to generate substrates for the selection procedure is depicted below this, with the 12 randomized nucleotides (N) positioned downstream of the TGGT anchor and the primer binding sites indicated. Sequences selected in these randomized positions are shown in the table, with perfect matches and matches of three of four nucleotides to the TGGT motif indicated in bold; perfect matches are also underlined. Clones marked with asterisks were used as substrates in NS1-driven in vitro replication assays, and the synthesis they supported relative to that of the wild-type sequence is reported in the right-hand column. A frequency diagram for these 12 positions is included at the bottom of the figure. Here each nucleotide is represented as a letter that is proportional in height to the frequency with which that nucleotide occurred at that position in the selected DNAs.
TABLE 1. Distribution of nucleotides in OriL clones
Since the cloned products encompass the entire minimal origin region, it was of interest to ask how well these preferred NS1 binding sites were able to support NS1-mediated replication initiation. Accordingly, the plasmids indicated by asterisks in Fig. 4 were compared with cloned forms of the wild-type and negative-control (SCRAM-ACCA) oligonucleotides for their ability to support NS1-mediated replication in vitro. Clone 1, which was the only sequence in which the first selected TGGT was directly contiguous with the anchor, could not be used in this analysis because it carried a single-base deletion at position 12 and an additional downstream mutation. However, sequences with one, two, and three base insertions at this position were tested. As indicated to the right of Fig. 4, all mutated sequences showed impaired replication relative to the wild type, but clones with a single nucleotide between the anchoring TGGT and first selected TGGT motif retained the highest activity, which ranged from 28 to 29% of wild-type activity for those with two perfect TGGT sequences (clones 2 and 4) to 10 to 13% for those with one perfect and one more degenerate site (clones 9 and 13). Clones with two or three bases inserted at this position fared conspicuously less well. This suggests that the nonconsensus NS1 binding site in the wild-type origin is optimally suited for establishing the nicking complex, perhaps because it focuses direct interactions with NS1 to the upstream end of this site, where it is critically positioned to interact with the p79 subunit of PIF bound to its proximal ACGT half-site (7).
(ii) Selection of optimal NS1 binding sites in the context of the TAR element. Finally, we asked whether it was possible to select for enhanced NS1 binding in the context of the TAR element, a strong site that is highly responsive to the addition of ATP (8). This element contains multiple tandem and opposed copies of the TGGT motif, as illustrated in Fig. 5. For this analysis, we randomized eight nucleotides that fall under the NS1 footprint, including one preexisting TGGT motif. The composition percentages for individual nucleotides in the randomized sequences of 15 unselected control sequences were 22% T, 27% G, 27% A, and 23% C, whereas in the selected group they were 39% T, 40% G, 15% A, and 6% C, again indicating clear preferences for multiple Ts and Gs in this already enriched environment. Results from the sequencing of 29 clones, shown in Fig. 5, indicated that even in this context, the tetranucleotide TGGT was strongly selected, being present in 19 of the 29 clones. Of these, one had two overlapping TGGT motifs and six also matched the motif in three of four nucleotides. Of the 10 remaining clones, all had at least one motif that matched in three of four nucleotides and two had two such motifs. The favored trinucleotide was again GGT (32), followed by TGG (21), GTT (10), and TTG (10), in that order, as found in the origin site, whereas in random sequences of this complexity, three copies of each would be expected. Of the 16 possible dinucleotides, GT again predominated (48), followed by GG (35), TG (28), TA (19), AG (19), and TT (17), whereas seven copies would be expected if nucleotides were randomly distributed. All other dinucleotides were underrepresented, with AC occurring only once in the entire group. Thus, despite the complexity of this location, NS1 binding selection strongly favored sequences that contained additional TGGT motifs in the same orientation as the locally predominating cluster. 
This supports a model in which NS1 molecules bound to adjacent TGGT cores interact, but, in this context, selection accumulates five to six tetranucleotide motifs that are oriented in the same direction and in close proximity, suggesting that here NS1 molecules bind as part of a colinear pentamer or hexamer.
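The comparison between observed and expected oligonucleotide counts can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' analysis: the function names are hypothetical, windows are assumed to be counted only within the randomized positions, and nucleotides are assumed to be uniformly distributed in the random case. On those assumptions, 29 clones of eight randomized positions give 29 × 6 = 174 trinucleotide windows, or about 2.7 per trinucleotide, in line with the roughly three copies quoted above.

```python
from collections import Counter
from typing import Iterable


def kmer_counts(seqs: Iterable[str], k: int) -> Counter:
    """Count every overlapping k-nt window across a set of sequences."""
    counts: Counter = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def expected_per_kmer(n_seqs: int, length: int, k: int) -> float:
    """Expected count of any one k-mer if all four nucleotides are equally
    likely: total windows divided by the 4**k possible k-mers."""
    return n_seqs * (length - k + 1) / 4 ** k


if __name__ == "__main__":
    # Trinucleotides in 29 clones, each with 8 randomized positions.
    print(round(expected_per_kmer(29, 8, 3), 2))
    # Hypothetical 8-nt selected sequence, for illustration only.
    print(kmer_counts(["TGGTTGGT"], 3).most_common())
```

Dividing an observed count by the value from `expected_per_kmer` gives a simple enrichment ratio for each oligonucleotide.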
FIG. 5. Selection experiments to determine preferred NS1 binding sites in the context of the TAR. The sequence of MVMp nt 1852 to 1912, containing the TAR element, is depicted, and matches of three of four or four of four nucleotides to the TGGT tetranucleotide are boxed, with arrows to indicate the direction that these motifs would be expected to orient an asymmetric NS1 footprint. The sequence of the TAR-N8 oligonucleotide used to generate substrates for the selection procedure is depicted below this, with the eight randomized nucleotides (N) positioned downstream of the TGGT anchor and the primer binding sites indicated. Sequences selected in these randomized positions are shown in the table, with perfect matches and matches of three of four nucleotides to the TGGT motif indicated in bold; perfect matches are also underlined. A frequency diagram for these eight positions is included at the bottom of the figure and was constructed as described in the legend for Fig. 4.
Such interactions are modular and highly variable, which makes it difficult to describe a consensus binding site, but certain basic principles have become apparent. Sites typically involve between three and six more-or-less degenerate tandem TGGT motifs, with the stronger sites having more copies of a perfect match, but register is flexible, so that spaces between adjacent tetranucleotides are tolerated or even preferred, while the highest-affinity sites typically contain opposing tetranucleotide clusters. In consequence, effective NS1 binding sites are reiterated within almost every viral sequence block of >100 nt, despite the highly efficient sequence usage shown by parvovirus genomes, where proteins are encoded in multiple overlapping reading frames, and these same sequences also support transcriptional control, splicing, and replication elements.
Comparisons with the Rep binding sites of AAV. In contrast, binding sites for the initiator proteins of viruses from the related Dependovirus genus are not found distributed throughout their genomes, although there is one highly degenerate site with relatively low binding affinity positioned upstream of the AAV2 P5 promoter (24, 36, 39). In AAV origins, the initiator proteins Rep68 and Rep78 bind to the Rep binding element (RBE), which comprises 22 nt that include five contiguous tandem copies of the tetranucleotide GCTC (47). Some degeneracy is allowed within the RBE, especially at flanking motifs, and this varies somewhat between AAV serotypes. Rep also binds to a similar site (CGGC GCTC GCTC GCTC GCTG), present on human chromosome 19q13.4, where it induces site-specific nicking that can lead to integration of viral sequences into host DNA (23, 29, 52).
X-ray crystallography of the AAV5 Rep nuclease domain complexed with an oligonucleotide representing the RBE (26) has revealed the basis for this apparent tetranucleotide binding modality. In this crystal structure, five Rep monomers spiral around the DNA, offset from one another by 4 bp (138°), with each Rep molecule binding independently, such that each molecule within an oligomeric complex makes individual pincer-like contacts with elements from successive tetranucleotide repeats. This creates potentially redundant contacts with each motif, which explains how some degeneracy in the DNA sequence can be accommodated. The possibility that a similar pattern of interactions occurs between NS1 and its binding site is supported by the immuno-selection data presented here, with the preferred trinucleotide, GGT, in both the OriL and TAR locales likely supporting the strongest contacts. While this underlying pattern of modular interactions may well be conserved between Rep and NS1, for NS1, binding flexibility appears massively enhanced, allowing a succession of relatively stable interactions with a wide range of disparate sites that have now become interspersed throughout the coding sequences.
Effects of varied NS1 oligomerization states on site-specific binding. A requirement for modular interactions between multiple NS1 molecules and multiple tetranucleotide motifs may explain why site-specific NS1 binding can be demonstrated in vitro only after NS1 has been induced to oligomerize. Although it appears monomeric in dilute solution, within the cell, NS1 likely self-associates through a variety of oligomeric states, as has been demonstrated by means of cross-linking for the AAV Rep proteins (20, 48). Within NS1 there are three separate protein domains that are potentially capable of mediating this complex formation. These include its N-terminal site-specific DNA-binding and endonuclease module (residues 1 to 275) (38), which would, at least, be expected to reinforce complex formation if bound to multiple tetranucleotide motifs. NS1 also supports a second type of interaction, mediated by a discrete bipartite oligomerization element (residues 261 to 276) in the C terminus of this domain, which is known to be essential for the cotransport of otherwise transport-negative NS1 mutants into the cell nucleus (44). Finally, like all parvoviral replication initiators, NS1 contains a highly conserved AAA-positive ATPase and SF3 helicase domain (residues 399 to 486), which likely functions as a hexameric toroid. A closely related AAA-positive SF3 helicase domain in simian virus 40 large T antigen is known to assemble into toroidal hexamers in the presence of ATP and to form a double hexamer on the viral replication origin (18, 19, 32), and a similar toroidal hexameric structure has been persuasively modeled from X-ray diffraction data for a monomeric form of the homologous Rep helicase domain (27). Characteristics of the binding sites in viral RF suggest that at least two oligomeric forms of NS1 are able to bind DNA in a site-specific manner.
Thus, one form of the site, typified by those in the 65-bp repeat region, contains three tandem TGGT reiterations and allows stable binding that is only weakly potentiated by ATP. The second type, typified by the TAR or the "307" element, contains multiple, more disparately spaced tandem motifs and antiparallel clusters and is highly responsive to ATP-mediated higher-order oligomer formation.
Consequences for the viral life cycle: remodeling the nuclear microenvironment.
Interactions with genome-bound NS1 could serve to recruit and sequester the necessary cellular proteins to intranuclear viral replication foci, ensuring an optimal microenvironment for rapid virus expansion. At early stages in infection, members of the genus Parvovirus replicate in subnuclear compartments known as autonomous parvovirus replication (APAR) bodies (1, 17), which do not coincide with any other known subnuclear domain but which sequester various essential replication proteins and almost all of the available NS1. This compartmentalization may well be induced by NS1 binding to multiple sites on RF DNA, and required cellular proteins would then be recruited to these microdomains by direct interactions with DNA-bound NS1. Critical NS1-mediated interactions of this type are known to occur. Thus, for example, NS1 is known to bind directly to the transcription factors SP1, TFIIA(α/β), and TBP (30, 34) and to the cellular single-strand-DNA-binding protein RPA (7). RPA is specifically sequestered in APAR bodies along with other essential components of cellular replication forks, including some, such as polymerase α-primase, that are not required for viral replication. Whether NS1 recruits RPA to APAR bodies, or this interaction rather allows NS1 to hijack preexisting cellular replication foci, remains uncertain, and both situations may prevail. APAR bodies become distinct within 2 to 3 h of the start of S phase as a series of tiny speckles, a small fraction of which evolve over the course of several hours into enlarged replication foci (46). Finally, after >15 h, these begin to coalesce into large amorphous structures that occupy most of the nuclear space and contain components of various formerly distinct subnuclear structures, such as Cajal bodies, PODs, and speckles (54, 55). Now termed SAABs, these bodies facilitate new interactions between NS1 and additional cellular proteins, such as the product of the survival of motor neuron gene (Smn), and may be instrumental in orchestrating the final events of the viral life cycle.
NS1 bound to replicating viral DNA would also play a more structural role in the viral life cycle, perhaps limiting access to the components of normal chromatin or modifying cellular assemblies, creating a nuclear scaffold where the virus can most effectively amplify. For example, it is possible that the leading-strand-only DNA synthesis that occurs at a typical parvoviral fork is prohibited in normal chromatin. Other viruses that use such forks, like adenovirus, provide their own, specialized single-strand-DNA-binding proteins, and this may also explain why AAV, which uses the single-strand-DNA-binding proteins of either its adenovirus or herpesvirus helper, may not require the same widely dispersed Rep binding sites. While at present we know relatively little about the proteins associated with MVM RF DNA in vivo, Doerig and colleagues (22) found that MVM DNA from infected cells did not give the characteristic nucleosome repeat pattern associated with cellular DNA when digested with micrococcal nuclease. In addition, they demonstrated that intracellular MVM RF DNA exists in some form of nucleoprotein complex that could be separated into 110S and 40S forms by velocity sedimentation. Our preliminary chromatin immunoprecipitation studies support this concept of a modified chromatin environment and suggest that NS1 is intimately associated with internal fragments from replicating MVM DNA and that histone associations are limited, both in terms of histone type and binding locale (S. F. Cotmore, Z. Ruiz, and P. Tattersall, unpublished results). Further studies of the composition and function of this unique viral pseudochromatin are ongoing.
R.L.G. was a Howard Hughes Predoctoral Fellow. This work was supported by Public Health Service grant number AI26109 from the National Institutes of Health.
Published ahead of print on 26 September 2007.
Present address: Department of Internal Medicine, The University of Iowa, 200 Hawkins Drive, Iowa City, IA 52242.