ABSTRACT
Simian parvovirus (SPV) is a member of the genus Erythrovirus and is closely related to the human parvovirus B19. Natural and experimental infection of monkeys with SPV resembles B19 infection of humans. We report a detailed characterization of the viral RNAs and proteins generated following transfection of cloned SPV into COS cells and SPV infection of the human erythroid progenitor line UT-7/Epo-S1. SPV and B19 are 50% identical at the nucleotide level, and although their basic transcription and protein expression profiles were generally similar, there were also significant differences. SPV pre-mRNAs contain three introns, compared to two found for B19: an additional intron was found within the capsid-coding region. RNAs in which this intron was spliced were abundant and encoded the SPV 14-kDa protein (analogous to the B19 11-kDa protein), which initiated at an AUG in the exon preceding the third intron. Unlike B19, SPV RNAs were also spliced between the donor of the first intron and the acceptor of the second intron. The third intron was additionally spliced from a portion of these molecules; these mRNAs encoded the 14-kDa protein. A portion was not spliced further and encoded VP2. Like B19, SPV has a polyadenylation signal [AAUAAA (pA)p] in the middle of the genome, which directed efficient polyadenylation of both spliced and unspliced mRNAs (encoding a putative 10-kDa protein, analogous to the B19 7.5-kDa protein, and SPV NS1, respectively). The 14-kDa protein was localized to both the nucleus and cytoplasm.
Autonomous parvoviruses with tropism for erythroid cells are classified as erythroviruses. This group includes the human erythroviruses, represented by human B19 virus (B19) (27) and its variants V9 (26) and A6 (13), and the nonhuman primate erythroviruses simian parvovirus (SPV) (14), pig-tailed macaque parvovirus (8), rhesus parvovirus (8) and, possibly, the chipmunk parvovirus (32). SPV was initially isolated from cynomolgus monkeys (14) and is remarkably similar to the human B19 parvovirus, exhibiting a predilection for host bone marrow in vitro and the ability to cause serious anemia in some infected animals (15, 16).
SPV has a single-stranded genome of approximately 5,600 nucleotides (nt) (3), and its nucleotide sequence, except for its inverted terminal repeats, has been determined. Comparison of the SPV sequence with that of other parvoviruses shows 50% overall homology with parvovirus B19 (27). At the protein level, there is approximately 70% homology with B19 capsid proteins and 50% homology with B19 NS proteins (3). SPV has been cultured in vitro in both human and monkey bone marrow cells and in the human erythroid progenitor cell lines UT-7/Epo-S1 and KU812Ep6 (4), albeit with low infectivity. Similar to the organization of B19, SPV has two large open reading frames (ORFs), one in each half of the genome. The left ORF encodes the nonstructural NS1 protein, and the right ORF encodes the two capsid proteins VP1 and VP2 (3, 31). Expression of the VP2 ORF of SPV results in the generation of virus-like particles and has been useful as a diagnostic reagent for SPV antibody detection (4).
Little is known concerning the detailed expression of the SPV genome. The splice junctions of SPV RNA isolated from the liver of SPV-infected monkey fetuses have been determined by reverse transcription-PCR (RT-PCR) and Northern blot analysis (31). In this report, we present a detailed characterization of the transcription and protein expression profile of SPV RNA generated following transfection of monkey COS cells. The main features of this profile were confirmed with RNA generated following infection of UT-7/Epo-S1 cells. Our results show that although the transcription maps of SPV and B19 are generally similar, they differ in a number of important ways. SPV RNA pre-mRNAs contain three introns, instead of the two introns found for B19. The third intron is within the capsid gene region, and RNAs in which this intron was spliced were abundant and encoded the SPV 14-kDa protein (analogous to the B19 11-kDa protein), which initiated at an AUG in the exon preceding the third intron. Also, SPV RNAs were found that were spliced between the donor of the first intron and the acceptor of the second intron. A portion of these molecules were not additionally spliced and encoded VP2; those from which the third intron was also removed encoded the 14-kDa protein. Similar to B19, SPV has a polyadenylation signal [(pA)p] in the middle of the genome which directs efficient polyadenylation; however, unlike B19, the SPV signal is a consensus AAUAAA motif. Efficient polyadenylation at (pA)p of both spliced and unspliced mRNAs (encoding a putative 10-kDa protein, analogous to the B19 7.5-kDa protein, and SPV NS1, respectively) was detected. The 14-kDa protein, which has an unknown function, was localized to both the nucleus and cytoplasm.
MATERIALS AND METHODS
Cells. COS-7 cells (ATCC CRL-1651) were maintained in Dulbecco's modified Eagle's medium with 10% fetal calf serum (FCS) at 37°C with 5% CO2. UT-7/Epo-S1 cells (11), a subclone of the UT-7/Epo cell line previously shown to be permissive for parvovirus B19 (28), were maintained in RPMI 1640 with 10% FCS and 1 U of erythropoietin (Amgen, Thousand Oaks, Calif.)/ml.
SPV and viral infection. Serum from a viremic cynomolgus monkey was used as a source of SPV (14). A total of 10^6 UT-7/Epo-S1 cells, collected in 0.5 ml of RPMI with 10% FCS and 1 U of erythropoietin/ml, were incubated with 50 μl of SPV serum for 2 h at 4°C. Cells were maintained on a 60-mm plate in 5 ml of RPMI with 10% FCS and 1 U of erythropoietin/ml for 2 days at 37°C prior to RNA isolation.
Plasmids. (i) Plasmids used to map the (pA)p region. The SPVNSCap plasmid was constructed by insertion of SPV sequences (nt 92 to 4986) (3) into pSK+ (Stratagene). All the nucleotide numbers for SPV sequences follow the SPV sequence deposited in GenBank (accession no. AF085716). SPVNSCapmpAp, SPVNSCapmUSEI, SPVNSCapmUSEII, and SPVNSCapmDSE were constructed by mutation of the putative upstream region (upstream element I [USE I] and USE II), downstream region (the downstream element [DSE]), and the AAUAAA signal (see the detailed mutations in Fig. 5, below) in the SPVNSCap backbone.
(ii) HA-tagged constructs. The nine-amino-acid hemagglutinin (HA) tag (YPYDVPDYA) was used to study SPV protein expression. SPVNSHA, SPV10kDaHA, SPVVP2HA, and SPV14kDaHA were constructed by insertion of the HA sequence (5′-TAC CCA TAC GAT GTT CCA GAT TAC GCT-3′) in frame into the C-terminal region of the NS1 (nt 2367), 10-kDa (nt 2059), VP2 (nt 4816), and 14-kDa (nt 4949) proteins. SPVVP2AUGHA was made by insertion of an HA sequence after the AUG (nt 3154) of the VP1/VP2 open reading frame (ORF). For efficient protein expression, SPVVP2HA, SPV14kDaHA, and SPVVP2AUGHA were constructed from a parent plasmid in which SPV sequences (nt 92 to 4986) were inserted into pcDNA3.1 (Invitrogen). The splicing pattern of transcripts from these SPV plasmids was the same as for SPVNSCap (data not shown).
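As a quick consistency check on the tag sequence, the 27-nt insert can be translated codon by codon. The sketch below is illustrative only: the function name is hypothetical, and the table holds just the standard-genetic-code entries needed for the codons in this insert.

```python
# Standard genetic code, restricted to the codons that occur in the HA insert.
CODON_TABLE = {"TAC": "Y", "CCA": "P", "GAT": "D", "GTT": "V", "GCT": "A"}

def translate(dna: str) -> str:
    """Translate an in-frame DNA string codon by codon (spaces are ignored)."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

# Insert sequence as given in the text (5' to 3').
HA_INSERT = "TAC CCA TAC GAT GTT CCA GAT TAC GCT"
ha_peptide = translate(HA_INSERT)
print(ha_peptide)
```

Because the insert is a multiple of three nucleotides, placing it in frame leaves the downstream reading frame unchanged.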
(iii) GFP-tagged constructs. The enhanced green fluorescent protein (GFP) coding sequence from pEGFPC1 (Clontech) was fused to the C terminus of the putative 14-kDa protein at nt 4949 (SPV14kDaGFP) or VP2 at nt 4816 (SPVVP2GFP).
(iv) Constructs for RPA probe generation. RNase protection assay (RPA) probe clones for mapping transcription, splicing, and polyadenylation sites, PD1, PA1, PD2, P(pA)p, PA2, PA3, and P(pA)d, which were designed based on the published B19 transcription map (1, 18), were constructed by cloning the following regions of SPV: nt 161 to 400 (PD1), nt 1601 to 1920 (PA1), nt 1921 to 2160 (PD2), nt 2481 to 2700 [P(pA)p], nt 2961 to 3200 (PA2), nt 4481 to 4800 (PA3), and nt 4721 to 4977 [P(pA)d], into BamHI-HindIII-digested pGEM-3Z (Promega). For the probe clone to map the novel splicing junction (D3, the third donor site), SPV nt 3121 to 3360 were inserted into BamHI-HindIII-digested pGEM-3Z (Promega). Homologous probes were developed to map the cis elements required for polyadenylation at (pA)p and were constructed by insertion of the PCR-amplified nt 2481 to 2700 into BamHI-HindIII-digested pGEM-3Z (Promega). The probes for (pA)p0 and (pA)p2 mapping were constructed by insertion of SPV nt 2241 to 2480 and nt 2801 to 3058 PCR fragments, respectively, into BamHI-HindIII-digested pGEM-3Z (Promega).
All the DNA constructs were sequenced at the DNA core at the University of Missouri—Columbia to confirm the sequences.
RPA. Plasmid DNA (2 μg/60-mm-diameter dish) was transfected into 60 to 80% confluent COS cells by using Lipofectamine and Plus reagent (Gibco BRL, Gaithersburg, Md.) as previously described (23). Total RNA was isolated 36 to 40 h later by using guanidine isothiocyanate as previously described (25). mRNA was isolated from the total RNA using the Dynabeads mRNA purification kit (Dynal Biotech, Oslo, Norway). RPA was performed as previously described (12, 23, 25). Probes were generated from linearized templates by in vitro transcription with SP6 polymerase, as previously described (25). RNA hybridizations for RPAs were done with a substantial probe excess, and RPA signals were quantified with the Molecular Imager FX and Quantity One version 4.2.2 image software (Bio-Rad, Hercules, Calif.). Relative molar ratios of individual species of RNAs were determined after adjusting for the number of 32P-labeled uridines in each protected fragment as previously described (25).
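The uridine adjustment can be illustrated with a minimal sketch. It assumes that band intensity scales with the number of labeled uridines in the protected probe fragment, so that dividing each signal by its uridine count gives a per-molecule value. The function name and all numbers below are hypothetical and are not data from this study; the cited procedure (25) may differ in detail.

```python
def relative_molar_ratios(signals, uridines):
    """Divide each band signal by its labeled-uridine count, then scale
    the results so that the least abundant species equals 1."""
    per_molecule = {band: signals[band] / uridines[band] for band in signals}
    smallest = min(per_molecule.values())
    return {band: value / smallest for band, value in per_molecule.items()}

# Hypothetical example values, for illustration only.
example_signals = {"unspliced": 6000.0, "spliced": 6000.0}  # arbitrary imager units
example_uridines = {"unspliced": 60, "spliced": 30}         # labeled U per protected fragment

ratios = relative_molar_ratios(example_signals, example_uridines)
print(ratios)
```

In this toy case the two bands have equal raw signal, but the shorter fragment carries half as many labeled uridines, so it represents twice as many molecules.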
Northern blotting. Northern analyses were done as previously described (19), using either 10 μg of total RNA or mRNA isolated from 20 μg of total RNA. The 32P-labeled DNA probes (NS probe, Cap probe, and NSCap probe) are indicated below in Fig. 4.
5′ RACE, 3′ RACE, and RT-PCR. Primers for 5′ rapid amplification of cDNA ends (RACE) were as follows: R1982 (5′-CATTCTTCTTGCAGGTCGTCGATC-3′), R3200 (5′-TGCCACCGCCTCCAGCTCC-3′), and R4800 (5′-TTTTTGGCACTCCAGCATTC-3′). F2481 (5′-AGGAAGAAGTACCTGTGTTAG-3′) and F4721 (5′-TACGATACCGAGGTCACC-3′) were used for 3′ RACE. The anchored primer oligo d(T)19V (V = G or A or C) was used for both 3′ and 5′ RACE. The locations of these primers are shown below in Fig. 2.
5′ RACE was performed according to the instructions of the 5′ RACE kit (Roche, Nutley, N.J.). Briefly, cDNA was synthesized using the reverse primers (R1982, R3200, and R4800), a poly(A) tail was added to the cDNA by terminal transferase, and PCR was then carried out using the oligo d(T)19V as forward primer and the cDNA synthesis primer as a reverse primer. All RT-PCR was performed with a Titan-one RT-PCR kit (Roche) with the primers indicated below in Fig. 2. For 3′ RACE, the oligo d(T)19V primer was used as the cDNA synthesis primer and reverse primer for PCR. For RNA isolated from SPV-infected UT-7/Epo-S1 cells, RT-PCR or 3′ RACE was performed with 45 cycles. PCR DNA fragments were separated on a 2% agarose gel, and all bands were excised and purified by using a QIAGEN gel extraction kit (QIAGEN). Sequence determination was performed at the University of Missouri-Columbia DNA Core Facility using one of the PCR primers as the sequencing primer.
Western blot analysis of HA-tagged SPV protein expression. Plasmid DNA (1 μg/well of a six-well plate) was transfected into 60 to 80% confluent COS cells by using Lipofectamine and Plus reagent (Gibco BRL) as previously described (23). Two days later, cells were harvested and washed twice with phosphate-buffered saline, and pellets were lysed in 200 μl of 1× Laemmli protein loading buffer containing 1× proteinase inhibitor cocktail (Roche). A 36-μl aliquot of the total lysed protein was loaded onto a sodium dodecyl sulfate-12% polyacrylamide gel electrophoresis (SDS-PAGE) gel and subjected to immunoblot analysis as previously described (20), using a monoclonal antibody to the HA tag (HA.11; Covance Research Products, Inc., Berkeley, Calif.). In some experiments, Super Signal West Dura chemiluminescent substrate (Pierce, Rockford, Ill.) was used, followed by quantification with a Molecular Imager FX phosphorimager using Quantity One version 4.2.2 image software (Bio-Rad, Hercules, Calif.).
Confocal microscopy. A 0.5-μg aliquot of plasmid DNA was transfected into 60 to 80% confluent COS cells in one chamber of a two-chamber coverglass slide (Lab-Tek; Nunc, Inc.) using Lipofectamine and Plus reagent (Gibco BRL). Two days later, cells were labeled with 5 μM SYTO-59 dye (Molecular Probes, Inc., Eugene, Oreg.) in phosphate-deficient Dulbecco's modified Eagle's medium (Invitrogen) at least 15 min before observation (7). An Olympus IX70 inverted microscope coupled with a Bio-Rad Radiance 2000 confocal system was used; the images were taken with a 60× Uplan objective lens with 4× digital zoom using dual excitations with both fluorescein isothiocyanate and rhodamine filters. Images were acquired using a Kalman average of three scans and analyzed with LaserSharp2000 software (Bio-Rad). Image manipulation involved converting the digital images of GFP and SYTO-59 into 24-bit color images in which GFP was represented as green and SYTO-59 as red. In addition, the files were merged to create overlays for comparison.
RESULTS
Mapping of SPV transcription units by RPAs. There is currently no characterized fully permissive tissue culture system for SPV replication. Therefore, we chose to characterize the transcription profile of SPV following plasmid transfection of monkey COS cells. We developed an RPA to determine the architecture and relative accumulation levels of SPV RNAs using the known landmarks of B19 RNA and the published SPV sequence as a guide (1, 18). The transcript profile determined in this way was confirmed following infection of UT-7/Epo-S1 cells in a sensitive RT-PCR analysis. A schematic diagram of the SPV genome and the seven antisense SPV probes used for this analysis [PD1, PA1, PD2, P(pA)p, PA2, PA3, and P(pA)d] are shown in Fig. 1A.
RNase protection analysis of SPV RNA generated following transfection of COS cells. (A) Schematic diagram of the SPV genome and the probes used in this study. The promoter (P6), the intron donors (D1 to -3) and acceptors (A1-1, A1-2, A2, and A3), and the (pA)p and (pA)d polyadenylation sites are shown. The locations of the PD1 (nt 161 to 400), PA1 (nt 1601 to 1920), PD2 (nt 1921 to 2160), P(pA)p (nt 2481 to 2700), PA2 (nt 2961 to 3200), PA3 (nt 4481 to 4800), and P(pA)d (nt 4721 to 4977) probes used in this study are shown. The expected protected bands for each probe are depicted. (B) Mapping of the SPV transcription units by RPA. Ten micrograms of total RNA isolated 36 to 40 h after transfection of COS cells by SPVNSCap was protected by the PD1, PA1, PD2, PA2, PA3, P(pA)p, and P(pA)d probes, as indicated. Lane 1, 32P-labeled RNA ladder (22), with sizes indicated to the left. The origins of the protected bands in lanes 2 to 8 are indicated. Spl, spliced species; Unspl, unspliced species. The band designated by a star in lane 2 was protected by a read-around RNA transcript from the plasmid. The band at ∼160 nt designated by an arrow in lane 4 was only variably detected and is likely incompletely digested, spliced RNA.
Probe PD1. Probe PD1, which spans the putative P6 promoter and putative donor (5′ splice site) of the first intron, protected bands centering at approximately 175 and 54 nt (Fig. 1B, lane 2). These bands mapped the initiation site of P6-generated RNA to approximately nt 226 and mapped the first donor site (D1) to nt 280. The relative ratio of unspliced RNA to RNA spliced at D1 was approximately 1:2, which is similar to that found for B19 following transfection of COS cells (Y. Yoto, J. Qiu, and D. Pintel, unpublished data). For B19, the unspliced P6-generated RNAs terminate at the proximal polyadenylation site and encode the NS1 protein (10, 18). An additional 5′ donor site, downstream of the major D1 donor of the first intron, has been detected by RT-PCR analysis of RNA generated during B19 infection of MB-02 cells (5) and in vivo SPV infection of monkey fetal liver (31). Neither the RPA described here nor the RT-PCR analysis described below of RNA generated following SPV transfection of COS cells detected usage of an additional 5′ splice site.
Probe PA1. Probe PA1, which spans the acceptor region of the putative first intron, protected bands of approximately 320, 181, and 129 nt (Fig. 1B, lane 3). These bands reflect unspliced RNA and spliced RNAs using acceptors at nt 1740 (A1-1) and nt 1792 (A1-2), respectively. The ratio of unspliced to spliced RNA across this region was approximately 1:1, which was slightly less than the ratio determined by probe PD1, above. This suggested that splicing from the first donor D1 likely used an acceptor in addition to A1-1 and A1-2. This was indeed the case, as discussed in the next section. The ratio of the usage of A1-1 versus A1-2 was approximately 1.5:1, which is similar to what has been observed for acceptor usage for the first intron of B19 in COS cells (Yoto et al., unpublished). As described more fully below, a portion of RNAs in which the first intron is spliced are polyadenylated at a site in the center of the genome, and a portion extends to the distal polyadenylation site (with or without further splicing). In SPV there is an AUG located at nt 1782, 8 nt in front of the A1-2 acceptor site, which would have the capacity to initiate translation of an ORF (which terminates at nt 2060) that would generate a protein analogous to the B19 7.5-kDa protein.
Probe PD2. Probe PD2, which spans the donor site of the putative second intron assumed from the map of B19, protected a band of approximately 240 nt, which represents RNAs unspliced through this region [approximately 90% of which are polyadenylated at the proximal (pA)p site; see below], and a minor spliced band of approximately 143 nt (Fig. 1B, lane 4), which mapped the donor of the second intron to nt 2063. The ratio of spliced RNA to unspliced RNA in this region was approximately 2:1. A faint band migrating at approximately 160 nt was also detected at various levels in these assays. The nature of this band is not clear; however, it is likely that this band is an artifact of the assay (perhaps anomalous migration of the 143-nt band) because, as described below, subsequent RT-PCR analysis did not identify an additional spliced band using a donor site consistent with the generation of an RNA migrating at 160 nt in this assay.
Probe PA2. Probe PA2, which spanned the putative acceptor site of the second intron, protected bands of approximately 240 and 114 nt (Fig. 1B, lane 5), mapping a single acceptor site (A2) for the second intron to nt 3087. The ratio of spliced to unspliced RNA through this region was approximately 5:1.
Probe PA3. Probe PA3, which spans the putative second acceptor site of the second intron assumed from the map of B19, protected bands of approximately 320 and 163 nt (Fig. 1B, lane 6). This mapped an acceptor site at nt 4638. Splicing at that acceptor site is also quite efficient: the ratio of spliced to unspliced RNA through this region is approximately 3:1. B19 RNAs spliced between the second intron donor and the second acceptor of the second intron encode the B19 11-kDa protein (29, 30).
Probe P(pA)p. Probe P(pA)p, which spans the putative proximal (pA)p in the middle of the genome, protected bands of approximately 240 and 158 nt (Fig. 1B, lane 7), mapping the (pA)p site to nt 2638. Greater than 90% of the RNA protected by this probe was polyadenylated at this site. This result suggested that the SPV AAUAAA signal at nt 2618 was used efficiently in COS cells, a feature shared with B19 (although the analogous polyadenylation motif in B19 is a nonconsensus AUUAAA) (9, 18). There are two additional consensus polyadenylation signals within the (pA)p region (at nt 2449 and 2959); however, RPAs using probes across those two sites did not detect their usage (see Fig. 5, lanes 6 and 7, below). Probe P(pA)d, which spans the right-hand-end poly(A) signal, protected a band of approximately 239 nt (Fig. 1B, lane 8), mapping the cleavage site for distal polyadenylation [(pA)d] to nt 4960.
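The protected-fragment sizes reported for these probes follow from simple coordinate arithmetic, which the sketch below reproduces. It assumes inclusive nucleotide numbering and uses the probe coordinates from Materials and Methods together with the mapped sites given above; the helper names are illustrative. Under this convention two values (55 nt for PD1 spliced at D1 and 240 nt for P(pA)d) differ by 1 nt from the approximate band sizes quoted in the text.

```python
# Probe coordinates (SPV nt, inclusive) from Materials and Methods.
PROBES = {
    "PD1": (161, 400), "PA1": (1601, 1920), "PD2": (1921, 2160),
    "PA2": (2961, 3200), "PA3": (4481, 4800),
    "P(pA)p": (2481, 2700), "P(pA)d": (4721, 4977),
}

def span(start, end):
    """Length in nt of the region start..end, both ends included."""
    return end - start + 1

def upstream_fragment(probe, site, rna_start=None):
    """Fragment from the left end of the probed region (or the RNA start
    site, if that lies further right) to a donor or cleavage site."""
    start = PROBES[probe][0]
    if rna_start is not None:
        start = max(start, rna_start)
    return span(start, site)

def downstream_fragment(probe, acceptor):
    """Fragment from an acceptor site to the right end of the probed region."""
    return span(acceptor, PROBES[probe][1])

predicted = {
    "PD1 unspliced": span(226, PROBES["PD1"][1]),               # RNA starts at nt 226
    "PD1 spliced (D1)": upstream_fragment("PD1", 280, rna_start=226),
    "PA1 unspliced": span(*PROBES["PA1"]),
    "PA1 spliced (A1-1)": downstream_fragment("PA1", 1740),
    "PA1 spliced (A1-2)": downstream_fragment("PA1", 1792),
    "PD2 spliced (D2)": upstream_fragment("PD2", 2063),
    "PA2 spliced (A2)": downstream_fragment("PA2", 3087),
    "PA3 spliced (A3)": downstream_fragment("PA3", 4638),
    "P(pA)p cleaved": upstream_fragment("P(pA)p", 2638),
    "P(pA)d cleaved": upstream_fragment("P(pA)d", 4960),
}

for band, length in predicted.items():
    print(f"{band}: {length} nt")
```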
5′ RACE, 3′ RACE, and RT-PCR identify additional splicing patterns during SPV expression in COS and UT-7/Epo-S1 cells. RT-PCR, with subsequent sequencing of the products generated, essentially confirmed the splice junctions, initiation sites, and termination sites determined by the RPAs described above (data not shown). In addition, however, RT-PCR and 5′ RACE experiments identified additional splicing patterns not detected by RNase protections, as described below.
5′ RACE was performed using a set of primers for cDNA synthesis (R3200, R4800, and R1982) (Fig. 2A, lanes 1, 2, and 3, respectively). Sequence determination indicated that all the species of SPV RNAs amplified in this manner were initiated at nt 226, consistent with the presence of a single promoter at map unit 6. A transcription start site in the center of the genome was also not identified by 5′ RACE with the primers R3200 and R4800 (Fig. 2A, lanes 1 and 2).
RACE and RT-PCR of SPV RNA. (A) RACE of RNA from transfected COS cells. 5′ RACE was performed using three reverse primers, R3200 (lane 1), R4800 (lane 2), and R1982 (lane 3), along with d(T)19V, and 3′ RACE was performed using two forward primers, F2481 (lane 4) and F4721 (lane 5), along with d(T)19V, as described in the text. The positions of the primers, as well as the map and expected sizes in nucleotides for all potential transcripts using different splice junctions and different poly(A) sites, are diagrammed relative to the transcription map. The sizes of DNA marker fragments are shown. The PCR conditions used did not efficiently amplify the longer RNA products. (B) RT-PCR and 3′ RACE using RNA from SPV-infected UT-7/Epo-S1 cells. For lanes 1, 2, and 3, a forward primer, F226, and three different reverse primers, which were the same as those used for the 5′ RACE in panel A, were used for RT-PCR. Primers F2961-R4800, F2001-R4800, and F2001-R3200 were used for more-extensive RT-PCR in lanes 6, 7, and 8, respectively. 3′ RACE (lanes 4 and 5) was performed exactly as for panel A. All the expected amplified fragments, as well as the expected sizes of these products, are depicted in the diagram next to the number corresponding to the lanes in the gel below. Forty-five cycles were used to amplify transcripts from infected cells.
5′ RACE with the R3200 primer generated a predominant band of approximately 168 nt (Fig. 2A, lane 1, lower band). Sequencing of this band revealed a product spliced from D1 to A2. Although sequence determination on the minor band of approximately 500 nt (Fig. 2A, lane 1, upper bands) was unsuccessful, it was the size of an mRNA species initiating at nt 226 and splicing either from D1 to A1-1 or D1 to A1-2 and from D2 to A2.
5′ RACE with primer R4800 generated a predominant band of approximately 339 nt (Fig. 2A, lane 2, lower band), and sequencing of this band revealed a product spliced from D1 to A2 and from a previously unidentified donor, D3, at nt 3208 (see below) to acceptor A3. Faint bands around 600 nt (Fig. 2A, lane 2, upper bands) were the size of mRNA triply spliced from D1 to either A1-1 or A1-2, from D2 to A2, and from the newly identified donor D3 to acceptor A3. These experiments identified an unanticipated donor site D3 at nt 3208 and an accompanying intron, not seen for B19, in the VP2 capsid region of SPV.
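The sizes of the two predominant 5′ RACE products are consistent with the exon segments implied by these splice junctions, as the following sketch shows. It assumes inclusive numbering, that the primer names R3200 and R4800 mark the rightmost SPV nucleotide covered by each primer (an assumption based on the primer names), and it ignores the tail and anchor sequence added during 5′ RACE. The computed lengths are each within about 1 nt of the approximate sizes reported.

```python
def amplicon_length(exons):
    """Sum the lengths of exon segments given as inclusive (start, end) pairs."""
    return sum(end - start + 1 for start, end in exons)

# Mapped sites from the text (SPV nt).
TSS, D1, A2, D3, A3 = 226, 280, 3087, 3208, 4638

# Assumed 3'-most SPV positions covered by the two reverse primers.
R3200_END, R4800_END = 3200, 4800

# D1-to-A2 spliced product amplified with R3200.
r3200_product = amplicon_length([(TSS, D1), (A2, R3200_END)])

# D1-to-A2 and D3-to-A3 doubly spliced product amplified with R4800.
r4800_product = amplicon_length([(TSS, D1), (A2, D3), (A3, R4800_END)])

print(r3200_product, r4800_product)
```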
To confirm the SPV splice junctions that were identified following SPV transfection of COS cells, total RNA isolated following SPV infection of UT-7/Epo-S1 cells (a human erythroid cell line permissive for B19 infection) was analyzed by RT-PCR (the abundance of SPV RNA following infection of UT-7/Epo-S1 cells was not great enough for RPA). Rather than the 5′ RACE described above, a forward primer at nt 226 was used with the same set of reverse primers used above (R3200, R4800, and R1982) (Fig. 2B, lanes 1, 2, and 3). Sequence determination of the bands obtained identified the same splice junctions shown in Fig. 2A, lanes 1 to 3. Furthermore, 3′ RACE analysis (Fig. 2B, lanes 4 and 5) also showed similar results to those obtained by RPA and 3′ RACE of RNA generated in COS cells (Fig. 1B; Fig. 2A, lanes 4 and 5). More-extensive RT-PCR with different sets of primers (Fig. 2B, lanes 6, 7, and 8) followed by sequence determination also showed results similar to those obtained by RNase protection (Fig. 1B; Fig. 3) and RT-PCR (Fig. 2B, lanes 1, 2, and 3). Surprisingly, as described further above, a portion of those RNAs from which the first intron, but not the second intron, was removed were also spliced using an unanticipated third intron in the capsid gene (Fig. 2B, lane 6).
Identification of the SPV third donor site. Schematic diagram of the SPV genome and the location of the PD3 probe (nt 3121 to 3360) are shown; the expected protected bands are depicted below. Ten micrograms of total RNAs isolated from SPVNSCap-transfected COS cells (lane 2) or mRNA isolated from 20 μg of the same total RNA preparation (lane 3) was protected by the PD3 probe. A 32P-labeled RNA ladder is shown in lane 1. The origins of protected bands are indicated.
Because a third donor site had not previously been reported for B19, we sought to further confirm the existence of this novel third donor site at nt 3208 and determine its relative usage. RPAs using probe PD3 (Fig. 3), which spanned the D3 site, identified bands of approximately 240 and 88 nt (Fig. 3), which reflected unspliced RNA and RNA using a donor at nt 3208, consistent with the RT-PCR and 5′ RACE and sequencing results described above. This ratio was compatible with the splicing of A3 only to D3, and RT-PCR and sequence analysis did not identify any mRNAs spliced between the D1 or D2 donor sites and the A3 acceptor site (data not shown) (31).
The RPAs using PD2 and PA2 (Fig. 1B, lanes 4 and 5), together with the 5′ RACE results (Fig. 2A), indicated that significant amounts of RNA were spliced between A2 and D1 and A2 and D2. Thus, SPV exon 2, which lies between the first and second introns, was skipped at a significantly high level.
Northern blot analysis of SPV RNA. Northern analysis with a complete SPV genome probe showed an SPV-generated RNA profile similar to that generated by B19 (Fig. 4, lanes 2 and 5). RNAs predicted to encode VP1 (3.2 kb), NS1 (2.4 kb), and VP2 (2.3 and 1.9 kb) were clearly evident on the gel and were found at abundances consistent with the RNase protection data described above and with what has been reported for B19. The small RNAs resolve poorly on these gels, and it is likely that transfer and detection of these bands are not quantitative.
Northern blot analysis of SPV RNA. Northern blot analysis was performed using RNA isolated 36 to 40 h posttransfection of COS cells with the SPV NSCap plasmid (lanes 2, 3, and 4), from COS cells transfected with the B19 NSCap plasmid (lane 5), or from murine A9 cells transfected with the infectious clone of minute virus of mice (MVM) (lane 1). Northern blots were hybridized to either an SPV whole-genome probe (NSCap; lane 2), an SPV NS gene probe spanning SPV nt 300 to 1360, which is located within the first intron, or an SPV Cap gene probe spanning SPV nt 3090 to 3440, which are depicted relative to the transcription map on the right. As controls, whole-genome probes were used to detect MVM RNA and B19 RNA. The sizes of the SPV RNA species are shown, and the putative identities of these transcripts are shown in parentheses for lanes 2, 3, and 4. The sizes of the MVM and B19 transcripts, which were used as size markers, were previously determined and are shown in lanes 1 and 5.
Hybridization of SPV mRNA with a small probe from within the SPV Cap gene (spanning nt 3090 to 3440) (Fig. 4, lane 4), however, revealed two small RNAs of 0.5 kb (R11) and 0.8 kb (R12 and R13) (see Fig. 8A, below), which were of a size consistent with RNAs spliced between the third donor (D3) and the third acceptor (A3) located in the capsid gene region.
The small B19 RNAs detected by the B19 NSCap probe (Fig. 4, lane 5) are encoded by the B19 NS1 gene and by the 3′-terminal exon (see Fig. 8B, below). While hybridization of B19 RNA with a capsid gene region probe spanning nt 3090 to 3440 would be predicted to not detect the small B19 RNAs (compare Fig. 4 with Fig. 8A and B, below), hybridization of this probe to a Northern blot of SPV mRNA (Fig. 4, lane 4) detected bands of both 2.3 kb (see R9 and R10 in Fig. 8A, below) and 0.8 kb (R12 and R13 in Fig. 8A), which likely represent mRNAs spliced at D2, and unique transcripts at 1.9 kb (R8 in Fig. 8A) and 0.5 kb (R11 in Fig. 8A, below), respectively, which comprise mRNA spliced at D1 (Fig. 4, lane 4). Quantification of this Northern blot determined that the 2.3- and 1.9-kb RNAs were present at a ratio of approximately 1:1 and, together with the RPAs using PD2 and PA2 described above (Fig. 1B, lanes 4 and 5), which also demonstrated significant splicing from D1 to A2, suggests that approximately half of the RNA spliced to the A2 acceptor used the D1 donor. Thus, SPV exon 2, which lies between the first and second intron, is frequently skipped. This is in contrast to the situation in B19, in which this exon is predominately included. Perhaps efficient splicing at the second B19 donor (18; Yoto et al., unpublished) prevents skipping of B19 exon 2. Because exon 2 of both B19 and SPV contains the ORF for the putative 7.5-kDa protein and also multiple in-frame AUGs, which have been demonstrated to reduce translation efficiency (17), it may be that skipping of SPV exon 2 allows for more efficient expression of the major capsid protein VP2.
Hybridization of SPV mRNA with probes encompassing either the NS gene or the Cap gene revealed the presence of a large species, likely initiated at P6, which extended through to the right-hand end of the genome (Fig. 4, lanes 2, 3, and 4). Such a 4.7-kb full-length transcript has not been previously identified during B19 infection (18); however, it was detected here following plasmid transfection (Fig. 4, lane 5).
Characterization of a polyadenylation site within the middle of the SPV genome. RNAs are polyadenylated efficiently at the (pA)p site in the center of the genome within the second intron. There are three consensus AAUAAA polyadenylation signals that lie distal to the terminating TGA codon for the NS1 protein at nt 2368. As described above, RPAs using the probe P(pA)p suggested that the AAUAAA signal at nt 2618 directs polyadenylation at (pA)p (Fig. 5A, lane 1). RPAs using two additional probes, P(pA)p0 (spanning nt 2321 to 2560) and P(pA)p2 (spanning nt 2801 to 3058), yielded detectable bands of only 240 and 258 nt, respectively (Fig. 5A, lanes 6 and 7), suggesting that the consensus polyadenylation signals at nt 2449 and 2959 were not used.
Characterization of the cis-acting signals that govern SPV polyadenylation at the (pA)p site. (A) COS cells were transfected with either the SPVNSCap plasmid [the (pA)p region between D2 and A2 is diagrammed in panel B] or SPVNSCap-based plasmids which contained various mutations of the putative DSE and USE and the poly(A) signal (indicated in panel C). Total RNA from COS cells transfected with these plasmids, as indicated above each lane, was protected with three different probes, as indicated at the top of the gel. Lanes 7 and 8 are identical. (B) The probes span the potential (pA)p0 site (nt 2321 to 2560), the (pA)p2 site (nt 2801 to 3058), and the (pA)p site. RNAs generated by SPVNSCapmDSE, SPVNSCapmpAp, SPVNSCapmUSEI, and SPVNSCapmUSEII were protected by homologous P(pA)p probes. A representative experiment is shown with the identities of the protected bands on the left and includes a lane with RNA markers, which are labeled. Quantification of the ratios of RNAs polyadenylated at (pA)p relative to the total protected RNAs is shown as averages and is the result of at least three individual experiments; all standard deviations were less than 4%. RNA from untransfected cells generated no protection products in these experiments (data not shown).
The core sequences for polyadenylation consist of a CPSF binding site which is either a consensus AAUAAA motif or a variant thereof (33), a G/U-rich DSE which binds CstF and, in certain instances, a USE whose function is not well defined (33). To define those elements required in cis for efficient polyadenylation of SPV RNAs at (pA)p, mutations were introduced into the potential DSE, USE, and the poly(A) signal in that region. Destroying the AAUAAA signal at nt 2618 prevented polyadenylation of SPV RNAs at the (pA)p site (Fig. 5, lane 3 compared to lane 1). Mutation of the G/U-rich DSE 26 nt downstream of the AAUAAA reduced polyadenylation efficiency at (pA)p from 90 to 36%, demonstrating that this DSE plays a critical role in efficient polyadenylation at (pA)p (Fig. 5, lane 2). Mutational analysis did not, however, identify an upstream region important for polyadenylation of SPV RNA at (pA)p (Fig. 5, lanes 4 and 5). The consensus, but cryptic, AAUAAA polyadenylation motifs at nt 2449 and 2959 are not followed by G/U-rich sequences. It is interesting that the CPSF binding site for the analogous (pA)p site in B19 is nonconsensus (1, 18).
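The distinction drawn above, between an AAUAAA hexamer that is followed by a G/U-rich region and one that is not, can be illustrated with a small sequence scan. The Python sketch below is only an illustration: the toy RNA is an invented sequence, not SPV sequence, and the offset and window sizes are arbitrary assumptions chosen for the example rather than values taken from this study.

```python
def find_signals(rna: str, motif: str = "AAUAAA") -> list[int]:
    """Return the 1-based start position of every occurrence of motif."""
    positions = []
    index = rna.find(motif)
    while index != -1:
        positions.append(index + 1)
        index = rna.find(motif, index + 1)
    return positions


def gu_fraction(rna: str, signal_start: int, offset: int = 10,
                window: int = 20, motif_len: int = 6) -> float:
    """Fraction of G or U residues in a window downstream of a signal.

    The window begins `offset` nt after the last nucleotide of the motif.
    Both `offset` and `window` are illustrative assumptions.
    """
    begin = signal_start - 1 + motif_len + offset
    segment = rna[begin:begin + window]
    if not segment:
        return 0.0
    return sum(base in "GU" for base in segment) / len(segment)


# Invented toy RNA: the first hexamer is followed by C/A-rich sequence,
# the second by a G/U-rich stretch (a stand-in for a DSE).
TOY_RNA = ("CA" * 3 + "AAUAAA" + "CA" * 5 + "CA" * 10
           + "AAUAAA" + "CA" * 5 + "GU" * 10)

if __name__ == "__main__":
    for start in find_signals(TOY_RNA):
        print(start, round(gu_fraction(TOY_RNA, start), 2))
```

In this toy case, only the second hexamer has a G/U-rich window downstream of it. A scan of this kind can flag candidate signals, but it cannot replace the mutational analysis described above.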
Protein expression profile of SPV. In B19, RNAs polyadenylated at (pA)p have the capacity to encode the large NS1 protein and a small 7.5-kDa protein (1, 10). Similarly, unspliced RNAs generated from the SPV P6 promoter which are polyadenylated at (pA)p have an ORF for an NS1 protein whose molecular mass is predicted to be approximately 70 kDa, while those spliced between D1 and A1-1 have an intact ORF analogous to the 7.5-kDa protein of B19, as previously discussed. However, the mass of this protein is predicted to be approximately 10 kDa. RNAs polyadenylated at (pA)p and spliced between D1 and A1-2 also contain a short ORF in the same frame as the 7.5-kDa protein that would be predicted to produce a protein with a molecular mass of approximately 5 kDa. To characterize the protein expression from this area of the genome, a nine-amino-acid HA tag was fused to the C terminus of the NS ORF or the 10-kDa ORF in an SPV NSCap expression plasmid (SPVNSHA and SPV10kDaHA, respectively). Western blot analysis of the expression of these two plasmids following transfection showed that, as expected, the NS1 protein was expressed as a protein of ∼70 kDa (Fig. 6A, lane 1), and the putative 10-kDa protein-encoding ORF was expressed as a protein of approximately 11 kDa (including the HA tag) (Fig. 6A, lane 2). We could not detect a protein in the 5-kDa range either by SDS-12% PAGE (Fig. 6, lane 2) or SDS-15% PAGE (data not shown).
Western blot analysis of SPV-expressed proteins. COS cells were transfected with constructs of SPVNSHA (lane 1), SPV10kDaHA (lane 2), SPVVP2HA (lane 3), SPVVP2AUGHA (lanes 4, 6, and 7), and SPV14kDaHA (lane 5), or SPVNSCap as an HA negative control (lane 8) (described in the text). The position of the HA tag (diamond), which was inserted after the AUG of VP2, within the mRNAs generated from the SPVVP2AUGHA construct is diagrammed in panel C. A monoclonal antibody recognizing the HA tag (Covance Research Products, Inc.) was used at a 1:1,000 dilution (A) or 1:500 dilution (B) to identify SPV proteins expressed by each construct, which were compared to a series of prestained protein markers (Invitrogen). The positions and apparent molecular masses of the markers are shown on the right of the immunoblot, and the identities of the SPV proteins are shown to the left. The band in panel A marked with a star likely represents a degradation product. The band in panel B indicated by an arrow is most likely a protein with the unique region of VP1 fused to the 14-kDa ORF by splicing of the third intron and is referred to as uVP1/14 kDa (approximately 40 kDa), as described in the text.
In B19, RNAs polyadenylated at the right-hand-end site, (pA)d, encode the minor capsid protein VP1 (which comprises less than 4% of the total capsid protein) (17, 18), the major capsid protein VP2 (95% of the total capsid protein), and the 11-kDa protein, the function of which remains unknown (1, 29, 30). While VP2 is encoded in the same ORF as VP1, the B19 11-kDa protein is translated from a separate ORF that starts at an AUG after the second 3′ acceptor site of the second intron and stops at a TAA in front of the AAUAAA signal for (pA)d (29, 30). It was striking that for SPV, the analogous ORF does not have an initiating in-frame AUG (1, 3). It has been predicted that an initiating AUG capable of expressing the 11-kDa protein analog in SPV might be acquired by splicing (1, 3). Alternative splicing from D3 (nt 3208) to A3 (nt 4638) joins the VP2 AUG at nt 3149 (60 nt in front of D3) in frame with the 11-kDa ORF, and RNAs compatible with this architecture were identified by both RPA and Northern blot analysis (Fig. 3 and 4). This ORF would be predicted to encode a protein of approximately 14 kDa and, surprisingly, the N-terminal amino acids of this protein would be shared by both VP1 and VP2.
To test this possibility, an HA tag was fused in frame with the C terminus of the VP1 and VP2 ORF (at nt 4817; SPVVP2HA), which is out of frame with the putative 14-kDa protein; in frame exclusively with the C-terminal region of the putative 14-kDa protein (at nt 4950; SPV14kDaHA); or in frame with the N-terminal region of both the capsid proteins and the 14-kDa protein (at nt 3155; SPVVP2AUGHA). Western blot analysis following transfection of SPVVP2HA detected a VP1 protein of approximately 90 kDa (apparent on longer exposure [data not shown]) and a VP2 protein of approximately 60 kDa (Fig. 6A, lane 3), while transfection of SPV14kDaHA revealed a protein of approximately 15 kDa (Fig. 6A, lane 5). Quantification of the signal indicated that VP1 was detected at levels 1/25 that of VP2 in these assays; however, whether this surprisingly low level of VP1 represented its true relative abundance in cells or was an artifact of our experiment is not fully clear. VP1, VP2, and the 14-kDa protein were also detected following transfection of SPVVP2AUGHA (Fig. 6A, lane 4 [the VP1 band in lane 4 was apparent on longer exposure] and B, lanes 6 and 7), and their sizes were identical to those of the proteins expressed from the C-terminal HA-tagged constructs (Fig. 6A, compare lane 4 with lanes 3 and 5; also, compare Fig. 6B, lanes 6 and 7, with lanes 3 and 5 in A). This confirmed that the 14-kDa protein of SPV was initiated by the AUG at nt 3149 that was joined to the 3′-terminal exon ORF by splicing. Consistent with our results concerning the relative accumulated levels of the encoding mRNAs (Fig. 3B), expression of the 14-kDa protein was greater than that of the VP2 protein. Quantification of these gels demonstrated that the ratio of VP1 to VP2 to 14-kDa protein was approximately 1:25:77.
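The 1:25:77 ratio can be restated as fractions of the total tagged signal, which makes the comparison with the mRNA levels easier to follow. The Python sketch below performs only this normalization. The variable names are illustrative assumptions, and the input values are the approximate ratio stated above, not raw densitometry data.

```python
# Approximate relative Western blot signal reported in the text.
RELATIVE_SIGNAL = {"VP1": 1, "VP2": 25, "14-kDa": 77}

total_signal = sum(RELATIVE_SIGNAL.values())

# Share of the summed signal contributed by each protein.
signal_fraction = {
    protein: value / total_signal for protein, value in RELATIVE_SIGNAL.items()
}

# Fold difference between the 14-kDa protein and VP2.
fold_14kda_over_vp2 = RELATIVE_SIGNAL["14-kDa"] / RELATIVE_SIGNAL["VP2"]

if __name__ == "__main__":
    for protein, fraction in signal_fraction.items():
        print(f"{protein}: {fraction:.1%} of summed signal")
    print(f"14-kDa / VP2: {fold_14kda_over_vp2:.2f}-fold")
```

On these numbers the 14-kDa protein is roughly threefold more abundant than VP2, and VP1 contributes about 1% of the summed signal.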
To determine the subcellular localization of the 14-kDa protein, which shares 20 amino acids with the capsid proteins, we fused the enhanced GFP gene to the C terminus of the 14-kDa ORF in the SPV NSCap backbone. Following transfection of SPV14kDaGFP into COS cells, the 14-kDa protein was distributed in both the cytoplasm and nucleus, similar to the subcellular distribution of the VP2 protein expressed from SPVVP2GFP (GFP fused at the C terminus of VP2) (Fig. 7).
Subcellular localization of the SPV GFP-tagged 14-kDa and VP2 proteins in cultured COS cells. COS cells were transfected with the SPV14kDaGFP and SPVVP2GFP constructs (described in the text; see also Materials and Methods), and 36 to 48 h later the cells were counterstained with SYTO59 (Molecular Probes, Inc.) at a final concentration of 5 μM. Confocal microscopy was used to collect images with the 60× Uplan objective lens at a zoom of ×4. The green (GFP signal), red (SYTO), and merged images (overlay) from representative transfected cells are shown for SPV14kDaGFP (14kDa/GFP) and SPVVP2GFP (VP2/GFP) constructs. The SYTO59 panel shows whole-cell counterstaining with SYTO59 nucleic acid-binding dye, with stronger staining in the nuclei and weak staining in the cytoplasm. The 14kDa/GFP fusion protein is localized within both the nucleus and cytoplasm, which is similar to the VP2/GFP fusion protein (compare 14kDa and VP2 panels). The GFP-tagged mRNAs generated by SPV14kDaGFP are diagrammed at the bottom.
DISCUSSION
In this paper, we report a detailed characterization of the transcription and expression profile of SPV following transfection of an SPV NSCap plasmid into nonpermissive monkey COS cells. Analysis in a fully permissive system is limited by the difficulty in obtaining significant quantities of monkey bone marrow. The splice junctions determined by this analysis were confirmed in RNA generated by SPV infection of human erythroid progenitor UT-7/Epo-S1 cells. A map depicting these results is shown in Fig. 8A. A similar strategy has been employed for the analysis of the expression of the human parvovirus B19 (2, 6, 10, 30).
Transcription maps of SPV and B19. (A) Transcription map of SPV. The approximately 5.6-kb SPV genome is shown to scale with the major transcription landmarks, including the terminal repeats (TRs), P6 promoter, the initiation site for the various RNAs, splice donors (D1, D2, and D3) and acceptors (A1-1, A1-2, A2, and A3), and the (pA)p and (pA)d sites. (B) Transcription map of B19. The approximately 5.6-kb genome of B19 isolate J35 (GenBank accession no. AY386330) (13, 34) is shown to scale with the major transcription landmarks, including the TRs, P6 promoter, RNA initiation site, splice donors (D1 and D2) and acceptors (A1-1, A1-2, A2-1, and A2-2), and the (pA)p and (pA)d sites. All the RNA splice junctions, the RNA initiation site, and polyadenylation sites of this isolate were confirmed to be identical to those in the published B19 map (1, 18) by RT-PCR (Qiu and Pintel, unpublished). The major transcripts observed in this study are listed below each map (designated R1 through R13 for SPV and R1 to R9 for B19), with their sizes and encoded proteins shown to the right.
The transcription profile of SPV is generally similar to that of B19 in COS cells; however, there are also significant differences (Fig. 8). One significant difference is that an abundant mRNA predicted to encode SPV VP2 joins the first intron donor (D1) to the second intron acceptor (A2), effectively skipping the intervening exon. Approximately half the RNAs that utilize A2 splice from D1, and half splice from D2 (Fig. 4). For B19, splicing between the first intron donor site and the second intron acceptor has not been reported (1) in either permissive or nonpermissive cells; splicing of the second B19 intron (D2-A2) is highly efficient (18; Y. Yoto and D. J. Pintel, unpublished data). For both B19 and SPV, pre-mRNA splicing at A2 is critical for expression of the major capsid protein VP2, because only mRNAs spliced at A2 can encode VP2.
Another striking difference between the maps of SPV and B19 is that there is a third intron in the SPV capsid region, so that an additional third exon is included in SPV small RNAs. Splicing of this intron joins a novel donor site, D3, located 122 nt downstream of the second intron acceptor site A2, to the third acceptor site, A3. As described below, the 5′ exon upstream of D3 contributes the AUG used to translate the ORF in the 3′-terminal exon. This strategy is in contrast to B19, in which splicing to the downstream A2-2 acceptor site is from the donor site of the second intron. Although pre-mRNA processing is different in these two cases, the RNAs produced generate seemingly analogous proteins: the 11-kDa protein for B19 and the 14-kDa protein for SPV. An SPV RNA which was spliced between D2 and A3 was not detected by RT-PCR. Such an mRNA is the major species encoding the 11-kDa protein in B19. The same splice junctions were used in SPV RNA generated following viral infection of UT-7/Epo-S1 cells in vitro or monkey fetal liver in vivo (31).
SPV uses a consensus poly(A) signal (AAUAAA) and a typical DSE for efficient polyadenylation at (pA)p in the center of the genome. Parvoviruses B19, adeno-associated virus type 5 (AAV5), and goose parvovirus also use a polyadenylation site in the center of the genome at high efficiency (18, 22) (J. Qiu and D. J. Pintel, unpublished data). Similar to B19, SPV RNAs that read through (pA)p and remain unspliced in this region encode the minor capsid protein VP1. Interestingly, this results in the retention in the VP1 mRNA of a potent polyadenylation signal. RNAs that encode the major capsid protein VP2 remove the SPV (pA)p signal from the final mRNA by splicing; however, this implies that, in a productive viral infection, these SPV pre-mRNA transcripts must also read through the (pA)p site at high efficiency to be available for subsequent splicing.
Polyadenylation of B19 pre-mRNA at its internal site has been proposed to be a key feature governing its tissue-specific replication (9). It was observed that polyadenylation at the internal nonconsensus poly(A) signal (AUUAAA) was less efficient following B19 infection of bone marrow cells from sickle cell anemia patients than it was following transfection of nonpermissive cells (9). While the interplay between splicing and internal polyadenylation that leads to the proper steady-state levels of parvovirus AAV5 RNA has been characterized initially (21), the mechanism that governs the relative accumulation of spliced SPV or B19 RNAs versus SPV or B19 RNAs polyadenylated at an internal consensus site has not yet been determined.
Because the (pA)p site in SPV is consensus and has a strong DSE, polyadenylation site choice for SPV, and by analogy perhaps for B19, may not be governed directly at the cleavage and polyadenylation reaction itself but may instead be controlled in other ways. Our laboratory has reported in the AAV5 system (24) that U1 snRNP binding to the donor site both inhibits downstream polyadenylation and enhances splicing of AAV pre-mRNA and so is a key determinant that governs the fate of AAV5 RNA. As a corollary of this, it has been demonstrated that the splicing process competes with polyadenylation for pre-mRNA molecules within the same precursor pool. In SPV, the D2 donor site is approximately 550 nt upstream of the (pA)p site, within a distance that would be compatible with a potential inhibitory effect on polyadenylation at the downstream (pA)p site. In addition, splicing at the A2 acceptor site is constitutively strong, suggesting that splicing may be a major competitor with polyadenylation at (pA)p, allowing VP2 pre-mRNAs to read through the (pA)p site.
The profile of SPV protein expression was basically similar to that of B19. SPV VP1 expression was detected at unexpectedly low levels, however, and the reason for this is not clear. The stoichiometry of VP1 in the SPV capsid was not determined. Reduced cellular levels of VP1 could be due, in part, to highly efficient polyadenylation at (pA)p, which precludes the accumulation of VP1 transcripts. mRNAs encoding VP2 and those encoding the 14-kDa protein were present at approximately a 1:3 ratio, and Western blot analysis demonstrated that these proteins were present at approximately these relative levels. The abundant 14-kDa protein was distributed in both the nucleus and cytoplasm, and its function warrants further investigation.
We also detected at low levels the excision of intron 3 (D3-A3) from mRNAs in which the first intron (D1-A1) (Fig. 8A, R6 and R7) was also removed (Fig. 2B, lane 6). This spliced molecule would be predicted to encode a protein that contains the unique sequence of VP1 (uVP1) at its N terminus fused in frame to the 14-kDa protein at its C terminus. The predicted molecular mass of this protein (uVP1/14kDa) would be ∼40 kDa. Although not abundant, a band of this size was consistently detected by Western blotting following transfection of both C-terminally tagged (Fig. 6 and 7) and N-terminally tagged (data not shown) constructs. Generation of the spliced RNA encoding this protein may be at the expense of the VP1-encoding RNA. Determination of whether such a fusion protein is generated during SPV infection will require further validation.
As is the case for B19, all SPV RNAs that are spliced at the first intron using the A1-1 acceptor retain the small ORF that could putatively encode a 7.5-kDa protein. Whether this protein is encoded in a bicistronic manner from any of the mRNAs which retain this ORF in front of another is not known.
Overall, our characterization of the SPV gene expression profiles highlights both the similarities and the significant variability in pre-mRNA processing and protein expression found for the erythroviruses. SPV provides an excellent system in which to study the relationship between alternative polyadenylation and alternative splicing in an overlapping transcription unit and, in addition, is a good model for comparison and illumination of B19 RNA processing.
ACKNOWLEDGMENTS
We thank Lisa Burger for excellent technical assistance.
This work was supported by Public Health Service grants RO1 AI46458 and RO1 AI56310 from the National Institute of Allergy and Infectious Diseases to D.J.P.
FOOTNOTES
- Received 11 May 2004.
- Accepted 20 July 2004.
- Copyright © 2004 American Society for Microbiology