Journal of Virology, January 2004, p. 980-994, Vol. 78, No. 2
0022-538X/04/$08.00+0 DOI: 10.1128/JVI.78.2.980-994.2004
Copyright © 2004, American Society for Microbiology. All Rights Reserved.
Centro Nacional de Biotecnología, CSIC, Department of Molecular and Cell Biology, Campus Universidad Autónoma, Cantoblanco, 28049 Madrid, Spain
Received 14 July 2003/ Accepted 1 October 2003
Coronavirus transcription is based on RNA-dependent RNA synthesis. The result of this process is the generation of a nested set of six to eight mRNAs of various sizes, depending on the coronavirus strain. These mRNAs are 5'- and 3'-coterminal with the genome. The largest mRNA is the genomic RNA (gRNA), which also serves as the mRNA for the rep1a and rep1b genes. A leader sequence of 93 nucleotides (nt), derived from the 5' end of the genome, is fused to the 5' end of the mRNA coding sequence (body) by a discontinuous transcription mechanism (18, 32).
Sequences at the 5' end of each gene represent signals that regulate the discontinuous transcription of subgenomic mRNAs (sgmRNAs). These are the transcription-regulating sequences (TRSs) that include a core sequence (CS; 5'-CUAAAC-3'), highly conserved in all TGEV genes, and the 5' and 3' flanking sequences (5' TRS and 3' TRS, respectively) that modulate transcription (2). Previous studies using TGEV minigenomes have shown that the CS was required for transcription and that the synthesis of sgmRNAs only proceeds when this CS is located in an appropriate sequence context (2).
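As a small illustration of the point that a CS must sit in an appropriate sequence context, the sketch below lists every occurrence of the core sequence in an RNA string together with its flanking nucleotides. The function name `find_core_sequences`, the `flank` parameter, and the toy sequence are hypothetical and are not taken from the study; only the CS itself (5'-CUAAAC-3') comes from the text.

```python
def find_core_sequences(genome: str, cs: str = "CUAAAC", flank: int = 3):
    """Return (position, 5' flank, 3' flank) for every occurrence of cs.

    Positions are 0-based. Overlapping occurrences are also reported.
    """
    hits = []
    start = genome.find(cs)
    while start != -1:
        left = genome[max(0, start - flank):start]
        right = genome[start + len(cs):start + len(cs) + flank]
        hits.append((start, left, right))
        start = genome.find(cs, start + 1)
    return hits


# Hypothetical toy sequence (not TGEV): two CS copies in different contexts.
toy_genome = "AACGAACUAAACGAAAUGGCUAAACUUGC"
hits = find_core_sequences(toy_genome)
for position, left, right in hits:
    print(position, left, right)
```

Both copies carry the same CS, but only the first is flanked by GAA on each side, which is the kind of contextual difference the TRS concept is meant to capture.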
Two major models have been proposed to explain the discontinuous transcription in coronavirus and arterivirus (18, 32). The discovery of transcriptionally active, subgenomic-size negative strands containing the antileader (cL) sequence and of transcription intermediates active in the synthesis of mRNAs (30, 31, 33, 34) favors the model of discontinuous transcription during the negative-strand synthesis (32). This concept was reinforced by demonstrating in arterivirus that the CS included in the sgmRNA was derived from the CS preceding each gene (CS-B) and not from the CS present at the 3' end of the leader sequence (CS-L) (26, 38) (Fig. 1). According to this model of discontinuous sgRNA synthesis during production of the negative strand, the TRS-B acts as a slow-down and detaching signal for the transcription complex.
FIG. 1. Diagram of the elements involved in coronavirus transcription. (A) The scheme represents all of the sequence elements probably involved in the discontinuous negative-strand synthesis model. CS-L, leader CS; CS-B, body CS. TRS-L and TRS-B, transcription-regulating sequences from the leader and body, respectively. An, poly(A). (B) Representation of the discontinuous transcription during negative-strand synthesis. cCS-B and cTRS-B represent the CS-B and TRS-B complementary sequences, respectively. Un, poly(U). (C) Leader and body sequences are probably located close to one another in higher-order structures maintained by RNA-protein and protein-protein interactions.
The synthesis of a negative sgRNA is most probably mediated by a direct base-pairing interaction between the nascent negative body TRS (cTRS-B) and the 3' end of the leader (TRS-L). The conserved sequence of this TRS, the CS-L, is probably exposed in a stem-loop at the 5' end of the viral genome both in TGEV (S. Alonso, I. Sola, S. Zúñiga, and L. Enjuanes, unpublished) and in equine arteritis virus (EAV) (26, 38), although this RNA structure has not been experimentally proven.
Proximity to the 3' end of the genome probably influences the relative amount of sgmRNAs, because the polymerase complex encounters fewer slow-down and detaching signals during synthesis of the smaller negative sgRNAs. Therefore, in principle, these RNAs could be the most abundant. Although this is the case in the order Mononegavirales (15, 39) and, in general, in coronaviruses, the relative amounts of coronavirus mRNAs are not strictly related to their proximity to the viral 3' end (28, 37). Therefore, other factors may also regulate coronavirus transcription.
The interaction of RNA with viral and cellular proteins is probably involved in coronavirus transcription. The discontinuous synthesis of the negative RNA strand resembles a high-frequency copy-choice RNA recombination (3, 21, 26), in which the TRS-B (donor) and TRS-L (acceptor) sequences, located in distal domains in the RNA primary structure, are probably brought into physical proximity by RNA-protein and protein-protein interactions (Fig. 1C).
In arterivirus, base pairing between the leader CS and the negative-sense body CS (cCS-B) has been implicated in transcription, although the roles of other factors, such as relative TRS position in the genome and secondary structure, have led to less clear conclusions (25-27).
In this report, the role of CS sequences in coronavirus transcription is analyzed for the first time by using TGEV full-length genomes constructed with an infectious cDNA clone (1). The role of each nucleotide within the leader and body CSs has been studied by introducing point mutations in these sequences. A key strategy in these studies has been analysis of gene 3a transcription, because this gene is nonessential for TGEV replication (36). Therefore, infectious virus was rescued for all gene 3a CS-B mutants, allowing subsequent analysis. We show in the studies reported here that the presence of the highly conserved CS was associated with sgmRNA production and high virus titers, but that this sequence was not essential for sgmRNA synthesis when the TRS-L to cTRS-B duplex formation involved a high release of free energy (ΔG). In fact, the genome positions in which a negative sgRNA most frequently fused to the leader could be predicted in silico by determining the identity between the TRS-L and sequence domains of the genome. To this end, a computer-based program has been developed to assess the strength of base pairing between body and leader TRS that successfully predicts the authentic products as well as novel, mutant-derived sgmRNAs. In addition, it has been shown that nucleotide substitutions in the canonical CS led to the use of alternative noncanonical CSs, providing that sequences flanking the CS-L were also flanking the CS-B, leading to a favorable ΔG in duplex formation between TRS-L and cTRS-B. It has also been shown that during the synthesis of TGEV negative sgRNAs, template switching always took place after copying the canonical or noncanonical CS sequence, supporting the finding that coronavirus RNA discontinuous synthesis takes place during production of the negative strand. A three-step mechanism has been proposed as a working model for coronavirus mRNA transcription.
Plasmid constructs.
TGEV cDNAs with point mutations in the leader and body CS were generated by overlapping PCR. To get leader CS mutants, the plasmid pBAC-TGEV(SrfI-NheI), which bears nt 1 to 15062 from the TGEV genome (GenBank accession no. AJ271965) except a ClaI-ClaI fragment (nt 4417 to 9615) (1), was used as template. Overlapping PCR fragments with point mutations were amplified by using the oligonucleotides described in Table 1. The final PCR product (2,415 bp), amplified with outer oligonucleotides Oli 5'I and Oli 3'D, was digested with SfiI and ApaLI and cloned into the same restriction sites of plasmid pBAC-TGEV(SrfI-NheI). To introduce mutations in the TGEV infectious cDNA, SfiI-ClaI fragment (5,277 bp) from pBAC-TGEV(SrfI-NheI) with the corresponding mutation was cloned into the same sites of pBAC-TGEVΔCla; after that, the toxic ClaI-ClaI fragment (5,198 bp) was introduced as previously described (1).
TABLE 1. Oligonucleotides used for site-directed mutagenesis
C1a. To obtain the full-length TGEV cDNA, the toxic ClaI-ClaI fragment (5,198 bp) was introduced as previously described (1).
Double CS-L and CS-B mutants were obtained by introducing SfiI-ApaLI fragment from pBAC-TGEV(SrfI-NheI) plasmid with the leader mutation into the same restriction sites of pBAC-TGEVΔCla bearing the corresponding CS-3a mutation. The plasmid containing the full-length TGEV cDNA with point mutations was then generated as previously described.
All cloning steps were checked by sequencing the PCR-amplified fragments and cloning junctions.
Transfection and recovery of infectious TGEV from cDNA clones. BHK-pAPN cells were grown to confluence in 35-mm-diameter plates and transfected with 4 µg of the appropriate full-length TGEV cDNA clone and 12 µl of Lipofectamine 2000 (Invitrogen) according to the manufacturer's specifications. The estimated transfection efficiency of the TGEV cDNA using this system was around 20% in all cases. Cells were incubated at 37°C for 6 h, and then the transfection medium was discarded, 200 µl of trypsin-EDTA was added, and trypsinized cells were plated over a confluent ST monolayer grown in a 35-mm-diameter plate. After a 2-day incubation period, the cell supernatants (referred to as passage 0) were harvested and stored. Virus from passage 0 supernatant was cloned by three plaque purification steps. Recombinant TGEV (rTGEV) viruses were grown and titrated as described previously (16).
RNA analysis by Northern blotting. Total intracellular RNA was extracted at 18 to 24 h postinfection (hpi) from virus-infected ST cells by using the RNeasy Mini kit (Qiagen) according to the manufacturer's instructions. RNAs were separated in denaturing 1% agarose-2.2 M formaldehyde gels and blotted onto positively charged nylon membranes (BrightStar-Plus; Ambion) as described previously (2). The 3' untranslated region (UTR)-specific single-stranded DNA probe was complementary to nt 28300 to 28544 of the TGEV strain PUR46-MAD genome (28). Probe labeling was performed with the BrightStar psoralen-biotin nonisotopic labeling kit (Ambion), and Northern hybridizations were performed according to the manufacturer's instructions. Detection was done with the BrightStar BioDetect kit (Ambion).
RNA analysis by RT-PCR. Analysis of mutant virus RNAs was performed by reverse transcription-PCR (RT-PCR). Total intracellular RNA was extracted at 18 hpi from ST cells infected with rTGEV viruses as previously described. cDNAs were synthesized at 42°C for 1 h with Moloney murine leukemia virus reverse transcriptase (Mo-MuLV-RT) (Ambion) and the antisense primers described in Table 2. The cDNAs generated were used as templates for specific PCR amplification using the reverse primers described in Table 2 and the forward primer SP (5'-GTGAGTGTAGCGTGGCTATATGTGT-3'), complementary to nt 15 to 39 of the TGEV leader sequence. RT-PCR products were separated by electrophoresis in 0.8% or 1.5% agarose gels, purified, and used for direct sequencing with the SP oligonucleotide and the same reverse primer used for PCR.
TABLE 2. Reverse oligonucleotides used for RT-PCR analysis of RNA from rTGEV-infected cells
TABLE 3. Oligonucleotides used for real-time RT-PCR analysis
The in silico analysis was performed with TRS-L sequences of different lengths and several coronavirus genomes: TGEV, human and bovine coronavirus (HCoV and BCoV, respectively). Since viral mRNAs always were generated from a TRS with a base-pairing score of ≥35, this value was selected as the threshold, although all of the values were taken into account. In these analyses, a score below 18 was never obtained, because the LALIGN program provides only the best local alignments. For the same reason, score values were discrete points in several positions distributed along the genome, but to facilitate data visualization, a continuous line representation was selected as the graphical output.
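The idea of scanning a genome for domains with high identity to the TRS-L can be sketched as follows. This is a deliberately simplified stand-in, not the LALIGN-based program used in the study: it counts ungapped identical positions in a sliding window, so its scores are not on the same scale as the base-pairing scores reported here. Identity with the positive-strand genome is used as a proxy for Watson-Crick complementarity with the nascent negative strand. The names `pairing_profile` and `candidate_sites`, the `min_matches` cutoff, and the toy sequences are all hypothetical.

```python
def pairing_profile(genome: str, trs_l: str) -> list[int]:
    """Ungapped identity count between trs_l and every genome window."""
    width = len(trs_l)
    return [
        sum(a == b for a, b in zip(genome[i:i + width], trs_l))
        for i in range(len(genome) - width + 1)
    ]


def candidate_sites(genome: str, trs_l: str, min_matches: int) -> list[tuple[int, int]]:
    """Return (0-based position, score) pairs whose score reaches min_matches."""
    profile = pairing_profile(genome, trs_l)
    return [(i, s) for i, s in enumerate(profile) if s >= min_matches]


# Hypothetical toy data (not TGEV sequence): one perfect site and one
# imperfect site that differs from the query at two positions.
toy_trs_l = "GAACUAAAC"
toy_genome = "UUGAACUAAACGGCAUCCUUAACUCAACAG"
profile = pairing_profile(toy_genome, toy_trs_l)
sites = candidate_sites(toy_genome, toy_trs_l, min_matches=7)
print(sites)
```

A local-alignment tool additionally allows gaps and weights matches and mismatches, so a real analysis would replace `pairing_profile` with the alignment scores themselves.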
FIG. 2. Mutations introduced in the TGEV full-length cDNA and virus recovery. Nucleotide substitutions were introduced in the 3a gene CS (CS-B mutants [A]), the leader CS (CS-L mutants [B]), in both the CS-L and CS-B (double mutants [D]), and leader CS mutants with changes allowing non-Watson-Crick base pairing with the body cCS (non-Watson-Crick mutants [C]). Virus titers (PFU per milliliter) obtained for the passage 0 supernatant are indicated in the figure.
Interestingly, infectious rTGEV was recovered from all non-Watson-Crick leader mutants, with titers ranging from wild-type levels, like those obtained for L-C6U mutant, to 10⁵-fold lower for the L-C1U mutant (Fig. 2C). Overall, these data indicated the requirement of base pairing between CS-L and cCS-B for sgmRNA synthesis.
Relationship between CS-L and CS-3a sequences and sgmRNA levels. It was postulated that synthesis of negative sgRNAs is mediated by direct base pairing between the TRS-L and the cTRS-B. This being the case, the CS-L and CS-3a sequences should modulate sgmRNA-3a levels. To determine whether this was the case, the pattern of sgmRNA synthesis produced by different rTGEVs with CS point mutations was analyzed by Northern blotting (Fig. 3). Nucleotide substitutions within the first 3 nt of CS-L led to no virus rescue, and it was not possible to analyze the sgmRNA pattern. Because mutations in CS-L positions 4 to 6 considerably reduce sgmRNA production, the multiplicity of infection (MOI) and the amount of total RNA loaded in the gel were increased for the leader and double mutants in order to obtain similar levels of viral RNA for Northern blot analysis (Fig. 3). The viral sgmRNA pattern for the wild-type virus was the expected one, but new bands were identified in all CS mutants (Fig. 3). Some of these unexpected bands were amplified by RT-PCR and sequenced; they corresponded to alternative sgmRNAs for the S, 3a, and N genes. These data indicated that changes in the CS-L or CS-B opened new base-pairing possibilities throughout the genome, leading to the generation of alternative sgmRNAs.
FIG. 3. Northern blot analysis of rTGEVs. ST cells were infected with rTGEV at an MOI of 0.5 (for the wild type [wt] and CS-B mutants) or 1 (for CS-L and double mutants). Total RNA was extracted at 20 hpi and analyzed by Northern blotting with a probe complementary to the 3' end of the gRNA. To normalize the amount of viral RNA in the gel, lanes L and D were loaded with three times the amount of the other lanes. L, CS-L mutant; B, CS-B mutant; D, double mutant. Viral mRNAs are indicated on the left side of the figure, and new sgmRNAs that have been clearly identified are indicated on the right (some of them correspond to the alternative sgRNAs analyzed in this work, indicated by the same number). n.i., still unidentified sgmRNAs.
FIG. 4. RT-PCR analysis of the CS-B mutants. (A) Scheme of the RT-PCR strategy for testing the gRNA and the mRNA-3a. Arrows indicate the approximate oligonucleotide position in the genome and sgmRNA. UTR, 3' untranslated region. (B) mRNA-3a specific RT-PCR products were resolved in an agarose gel. mRNA-3a species were numbered 3a.1 (wild type [wt]), 3a.2, and 3a.3. MW, molecular weight markers. (C) Sequence analysis of the leader-body junction sites in the three mRNA-3a species. The sequence in the light-gray box corresponds to the leader (L) sequence. The CS appears as white letters in a dark-gray box in all cases. The sequence on top corresponds to the gRNA sequence in the fusion site; the sequence at the bottom is the mRNA sequence with nucleotides from the leader in a light-gray box. CS in white letters in a dark-gray box represents the mutated CS in each case; two examples of leader-to-body junction sites generating mRNA-3a.1 are presented: the B-C1G and B-A3C mutants. The GAA motif appears in a medium-gray box. Vertical bars represent the identity between the sequences, with thick bars at the possible fusion site. Dotted vertical bars represent the possible non-Watson-Crick interaction. Crossover should occur in any of the nucleotides above the arrow.
Sequencing of the leader-to-body junction sites in the three sgmRNA-3a species showed that there was an extended identity between TRS-L and gRNA in sequence domains around the noncanonical CSs used (Fig. 4C). Interestingly, all of the mutations introduced in the CS-3a appeared in the mRNA-3a.1, including a substitution in the first CS-B nucleotide (B-C1G mutant), indicating that at least the whole body CS was copied before template transfer. Nevertheless, because an extended upstream sequence identity is observed between the CS-L and CS-3a flanking sequences, the strand transfer point could not be accurately established. Even for the B-C1G mutation that remained in the mRNA-3a.1 sequence, strand transfer could happen in any of the 5'-GAA-3' nucleotides upstream of CS-3a. However, in mutants B-A3C (Fig. 4C) and B-A5C (data not shown), template transfer had to occur at the A nucleotide preceding the GAA sequence upstream of CS-3a, because the mRNA-3a.1 included the sequence 5'-AGAACUAAAC-3' (Fig. 4C) derived from the gRNA sequence. The identity between leader and body sequences was frequently extended by including all or part of the sequence 5'-GAA-3', at either the CS 5' or 3' end, or at both ends (Fig. 4C), suggesting that template switching during transcription required high complementarity between TRS-L and cTRS-B.
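The reason the strand transfer point cannot be pinned down within a stretch of leader-body identity can be illustrated with a short sketch. Given pre-aligned, equal-length strings for the leader context, the genomic body context, and the sequenced mRNA junction (all in mRNA sense), it lists every split point at which the mRNA could have switched from leader-derived to body-derived sequence. The function name `possible_crossovers` and the three toy sequences are hypothetical and do not reproduce the sequences in Fig. 4C.

```python
def possible_crossovers(leader: str, body: str, mrna: str) -> list[int]:
    """Split points k where mrna[:k] matches the leader and mrna[k:] the body.

    The three sequences must already be aligned and of equal length.
    """
    if not (len(leader) == len(body) == len(mrna)):
        raise ValueError("sequences must be pre-aligned to the same length")
    return [
        k for k in range(len(mrna) + 1)
        if mrna[:k] == leader[:k] and mrna[k:] == body[k:]
    ]


# Hypothetical toy junction: leader and body share the stretch GAACUAAAC.
toy_leader = "AACUCGAACUAAACGA"
toy_body = "GGUAAGAACUAAACUU"
toy_mrna = "AACUCGAACUAAACUU"
crossovers = possible_crossovers(toy_leader, toy_body, toy_mrna)
print(crossovers)
```

Every split point inside the shared stretch is equally compatible with the sequenced product, so only a mutation that breaks the identity on one template narrows the interval.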
The transcription pattern in CS-3a mutants of proximal (gene E) or distal upstream (gene S) or downstream (gene 7) TGEV genes was analyzed by RT-PCR using specific oligonucleotides (Table 2), and no alteration was observed in the relative synthesis of these TGEV mRNAs (Fig. 5). These data suggested that the template switch was dependent on the nature of local sequences and was not influenced by sequences mapping 5' or 3' downstream.
FIG. 5. Effect of CS-B mutations in the transcription of other TGEV mRNAs. mRNAs from genes S, 3a, E and 7 were analyzed by RT-PCR using specific oligonucleotides (Table 2). WT, wild-type virus; B-C1G and B-A3C, CS-3a mutants with mutation at positions 1 and 3, respectively.
Base-pairing scores throughout the 5' two-thirds of the genome were very low (below a value of 35), except at the TRS-L, which obviously has the maximum base pairing score (a value of 70) (data not shown). Interestingly, potential base pairing throughout the one-third 3' end of the genome, encoding the structural and nonstructural proteins, showed that the sequences with highest local identity correlated with template transfer sites leading to generation of the standard TGEV mRNAs (Fig. 6A). Intermediate values of local complementarity (between 32 and 40) were associated with the generation of sgmRNAs alternative to those generated by template transfer at positions of canonical CS-Bs. In contrast, no sgmRNAs were detected at sequence positions with a low potential base-pairing score (data not shown), suggesting a dominant role for the complementarity between TRS-L and cTRS-B in the control of sgmRNA levels.
FIG. 6. In silico analysis of the identity between TRS-L and the TGEV genome. As indicated in Materials and Methods, a continuous line graph was selected to facilitate visualization of the data. (A) Graphical plot of the potential base-pairing score versus the genome position. All peaks assigned to the viral CSs are indicated as the peaks corresponding to the new 3a sgmRNA species. (B) Graphical plot of the potential base-pairing score versus the genome position around CS-3a. Each three-dimensional line represents either the wild-type (wt) situation or the body mutants. The peaks assigned to each 3a sgmRNA species are indicated.
Influence of CS-L to cCS-B duplex ΔG on sgRNA-3a levels. To study the influence of base pairing between the nascent negative sgRNA and the CS-L on sgmRNA synthesis, mRNA-3a.1 levels were quantified in all CS-B mutants by real-time RT-PCR using specific oligonucleotides (Table 3) and the gRNA as an internal standard for mRNA evaluation. The concentration of mRNA-3a.1 in CS-B mutant viruses was expressed in relation to that of the wild type. The results showed a significant decrease in mRNA-3a.1 levels of up to 10³-fold and a good correlation between mRNA-3a.1 concentration and duplex ΔG except for nucleotide substitutions at both the 5' and 3' ends of the CS (B-C1G and B-C6G mutants) that had a higher effect than expected on sgmRNA levels (Fig. 7A). The additional decrease in the amount of mRNA-3a.1 in the B-C1G mutant could be due to the importance of this nucleotide to prime the synthesis of negative sgRNA after template switching. In addition, both the first and last CS nucleotides could play the extra role of stabilizing the formation of a duplex between the exposed CS-L and the cCS-B.
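Expressing an mRNA level relative to an internal standard and then relative to the wild type can be sketched with the widely used 2^(-ΔΔCt) calculation. The text does not state which formula was applied, so this is only one plausible way to do it, and it assumes that both amplicons double every cycle. The function name `relative_level` and all threshold-cycle (Ct) values below are hypothetical.

```python
import math


def relative_level(ct_target: float, ct_ref: float,
                   ct_target_wt: float, ct_ref_wt: float) -> float:
    """Target level in a mutant relative to wild type (2^-ddCt).

    Each sample is first normalized to its own reference (here, gRNA),
    then the mutant is compared with the wild type.
    Assumes 100% amplification efficiency for both amplicons.
    """
    delta_mut = ct_target - ct_ref
    delta_wt = ct_target_wt - ct_ref_wt
    return 2.0 ** -(delta_mut - delta_wt)


# Hypothetical Ct values: the mutant target crosses threshold 10 cycles
# later than in the wild type, with equal gRNA in both samples.
ratio = relative_level(ct_target=30.0, ct_ref=18.0,
                       ct_target_wt=20.0, ct_ref_wt=18.0)
print(ratio, math.log10(ratio))
```

A 10-cycle shift corresponds to a 2^10 (1,024-fold) difference, which is the order of magnitude of a 10³-fold decrease.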
FIG. 7. mRNA-3a quantification by real-time RT-PCR. (A) Amount of mRNA-3a.1, quantified by real-time RT-PCR, in the body mutants relative to the wild-type (wt) levels. Shown is a graphical representation of the ΔG (as -ΔG in kilocalories per mole) of the CS-L with cCS-B duplex and the relative amount of mRNA-3a.1 (represented as log [mRNA-3a.1] in relative units) for each virus. The data presented are the average of six independent experiments with duplicates in each case. Error bars represent the standard deviation in each case. (B) Graphical plot of the amounts of mRNA-3a.1 and mRNA-3a.2 relative to the level of gRNA, expressed as [mRNA] in relative units.
The amount of the alternative mRNA-3a species was also analyzed by real-time RT-PCR using specific oligonucleotides (Table 3). The level of mRNA-3a.2 in the CS-3a mutants did not change significantly when compared with that of the wild-type virus (Fig. 7B). The apparent discrepancy between the relative abundance of the mRNA-3a.2 bands (Fig. 4B) and the quantitative RT-PCR results for the wild-type virus (Fig. 7B) can be explained by primer sequestration by mRNA-3a.1, which was about 10³-fold more abundant in the wild type than in the CS-3a mutants. As a consequence, the ratio of mRNA-3a.1 to mRNA-3a.2 was altered in all CS-B mutants. The alternative mRNA-3a.2 was also expressed in the wild-type virus as determined by real-time RT-PCR, although it was not detected by conventional RT-PCR due to the competition between the primers used. Unfortunately, real-time RT-PCR did not allow the quantification of mRNA-3a.3, since the design of specific oligonucleotides was not possible because a duplication of the sequence appears at the leader-to-body fusion site.
Effect of leader CS mutants on sgmRNA levels. The introduction of nucleotide substitutions at CS-L could affect the potential base pairing between the TRS-L and cTRS-B of all TGEV genes, with the consequent reduction in sgmRNA and virus production. Alternatively, the decrease in virus titers could also be due to an effect of CS-L nucleotide substitutions in the TRS-L secondary structure. The transcription model proposed in this article, like the one proposed for arterivirus (26, 38), postulates exposure of the CS-L in a stem-loop within the TRS-L. In agreement with this model, virus production was only observed in TGEV mutants with a CS-L presented as a single-strand RNA according to secondary structure predictions (19; data not shown).
Construction of rTGEVs with nucleotide substitutions not allowing base pairing with cCS-B at each CS-L position led to the rescue of infectious viruses when these mutations were introduced within positions 4 to 6 of the CS, but not in positions 1 to 3. Therefore, the analysis of the sgmRNA generated after infection of cells was only possible in mutants with substitutions in positions 4 to 6. Total RNA from infected cells was analyzed by RT-PCR using specific oligonucleotides (Table 2) to amplify gRNA and mRNAs (Fig. 8). Nucleotide substitutions in CS-L positions 4 to 6 led to a reduction in virus titers higher than 10⁴-fold in relation to wild-type virus (Fig. 8, bottom). rTGEV mRNAs could be clustered into two sets: one that in general led to a unique sgmRNA (genes E, M, and N) and another leading to alternative sgmRNAs (genes S, 3a, and 7). The sgmRNA corresponding to gene 3b was only produced when the mismatch in the sixth nucleotide of the CS-B present in the parental TGEV strain, considered in this report as the wild-type strain (2, 40), was compensated for by the mutation introduced within the CS-L (mutant L-C6U).
FIG. 8. Analysis by RT-PCR of viral sgmRNAs generated by rTGEVs with CS-L substitutions. After ST cell infection with rTGEVs, total RNA was analyzed by RT-PCR with specific oligonucleotides to detect all viral mRNAs. Viruses with CS-L substitutions are indicated on top of the figure. The viral mRNA detected is shown to the left of the figure. The titer (PFU per milliliter) of each virus is shown at the bottom.
All nucleotide substitutions introduced in the cDNA remained in the rescued virus genome (data not shown). Moreover, sequencing of 72 viral mRNA leader-to-body junction sites included in the sgmRNAs identified (Fig. 8) showed that nucleotide substitutions within the CS-L did not appear in the mRNA sequence, confirming that the CS sequence in the mRNA came from CS-B (data not shown). These results strongly suggest that the template switch was produced during negative sgRNA synthesis.
Synthesis of alternative sgmRNAs in viruses with nucleotide substitutions in CS-L. Mutations in CS-L led to the formation of at least five different sgRNA-S species, named mRNA-S.1 (wild type) to mRNA-S.5 (Fig. 9A). Some of these sgmRNA species, such as mRNA-S.2 and mRNA-S.4, were indistinguishable in agarose gel electrophoresis because of their similar size. RT-PCR amplification and sequencing of leader-to-body junction sites showed four new junction domains (extending nt 20291 to 20644 of the TGEV genome) leading to the synthesis of new sgmRNA species (Fig. 9B). Extended complementarity between leader and body sequences, mediated by the 5'-GAA-3' sequence involved in the TRS-L with cTRS-B base pairing with noncanonical junction sites, was possible in all cases (Fig. 9B). Most likely, this extended complementarity leads to a higher base-pairing score between the nascent negative RNA strand and the TRS-L, promoting a template switch in these sequence positions and production of the corresponding sgmRNAs.
FIG. 9. RT-PCR analysis of the S mRNA species present in leader mutants. (A) mRNA S detection by RT-PCR in leader and double mutants. sgmRNA species are named mRNA S.1, S.2, S.3, S.4, and S.5, as shown to the right of the panel. The oligonucleotides used for the analysis did not allow the detection of sgmRNAs S.6 and S.7. (B) Sequence analysis of the leader-to-body fusion site in all of the S gene sgmRNAs generated. The sequence in the light-gray box at the bottom represents the wild-type (wt) or mutated leader; the sequence on top is the gRNA sequence in the junction sites. CS is in white letters in a dark-gray box. The GAA motif is in a medium-gray box. Vertical bars represent the identity between the sequences; thick bars correspond to the possible fusion site, because crossover should occur in any nucleotide above the arrow. Dotted vertical bars represent the possible non-Watson-Crick interaction. Numbers indicate the position in the TGEV genome.
Nucleotide substitutions in the CS-L led to two gene 3a sgmRNA species, named mRNA-3a.1 (wild type) and mRNA-3a.2 (Fig. 10A). These sgmRNAs corresponded to the two larger gene 3a sgmRNA species found in the corresponding CS-3a mutants. The potential base-pairing score between the mutated TRS-L and the TRS-B, leading to mRNA-3a.3 synthesis, was smaller in the mutants than in the wild-type virus (Fig. 10B). Furthermore, the estimated ΔG indicated that base pairing in this junction site was energetically disfavored, providing a justification for the lack of template switch and production of mRNA-3a.3.
FIG. 10. Analysis of 3a and 7 sgmRNAs present in leader mutants. (A) mRNA-3a detection by RT-PCR. sgmRNA species are named as mentioned before. (B) In silico analysis of the identity between the wild-type (wt) or mutated TRS-L and the TGEV genome surrounding the 3a gene CS. Data are graphically plotted as potential base-pairing score versus the genome position. (C) mRNA-7 detection by RT-PCR. The sgRNA species are named mRNA 7.1 and 7.2. (D) Sequence analysis of the leader-to-body junction sites in all of the 7 gene sgRNAs generated. The sequence at the bottom (light-gray box) represents the wild-type or mutated leader, and the one on top represents the gRNA in the fusion site context. CS is in white letters in a dark-gray box. Vertical bars show the identity between the sequences, and thick bars represent the possible fusion site, because strand transfer should occur in any of the nucleotides above the arrow. Dotted vertical bars represent the possible non-Watson-Crick interaction. Numbers indicate the position in the TGEV genome.
Overall, analysis of the alternative sgRNAs produced in virus with nucleotide substitutions in the CS-L indicated that production of novel sgRNAs was associated with the possibility of duplex formation between the TRS-L and cTRS-B with a high base-pairing score.
A key strategy in our study of transcription regulation was the selection of gene 3a, a nonessential gene for TGEV growth both in vitro and in vivo, since modification of this gene did not affect the recovery of mutant viruses (36).
The requirement of complementarity between the CS-L and the cCS-B for the synthesis of a negative sgRNA was reinforced by showing a reduction in the sgmRNA synthesis associated with point mutations reducing complementarity between CS-L and cCS-B, and by demonstrating that, in general, sgmRNA synthesis was partially or completely restored by the introduction of nucleotide substitutions allowing formation of non-Watson-Crick or Watson-Crick base pairs.
The extent of sgmRNA synthesis was related to the free energy of the duplex between CS-L and cCS-B. The potential base-pairing score of sequence domains complementary to genomic RNA with the TRS-L ranged between 15 and 70. According to this score, the local sequence domains could be classified into domains with low (<35) and high (>35) base-pairing potential. In general, sequences with local base pairing of <35 led to no significant production of sgmRNA; in contrast, local base pairing higher than 35 led to the synthesis of standard viral mRNAs. These findings validated the in silico analysis method. This method was also found reliable for the prediction of most sgmRNAs synthesized by TGEV leader mutants and by other coronaviruses, such as HCoV 229E and BCoV (data not shown). The presence of a canonical CS within the TRS promoted higher sgmRNA levels. Nevertheless, the presence of a canonical CS within the TGEV genome did not guarantee the synthesis of an sgmRNA. The requirement of an appropriate sequence context was confirmed by showing that a 5'-CUAAAC-3' sequence present 121 nt downstream of the gene S initiation codon (CS-S2) did not lead to synthesis of the corresponding sgmRNA as a consequence of the 5' and 3' flanking TRS sequences (2). The lack of sgmRNA synthesis could be explained by the relatively low potential base-pairing score and ΔG values (32 and -3.0 kcal/mol, respectively) between the corresponding TRS-L and cTRS-B, values lower than those estimated for the canonical CS-S1 used (35 and -4.3 kcal/mol, respectively).
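The comparison between the two gene S core sequences can be written out as a small threshold check. The scores and ΔG values are the ones quoted in the text; treating the 35 threshold as inclusive is an assumption made here because CS-S1, with a score of exactly 35, is the site that is used. The names `THRESHOLD`, `gene_s_sites`, and `predicted_active` are hypothetical.

```python
THRESHOLD = 35  # base-pairing score threshold; inclusive use is an assumption

# (base-pairing score, duplex free energy in kcal/mol), as quoted in the text.
gene_s_sites = {
    "CS-S1": (35, -4.3),  # canonical site that is used
    "CS-S2": (32, -3.0),  # CUAAAC copy 121 nt downstream, not used
}


def predicted_active(score: int, threshold: int = THRESHOLD) -> bool:
    """Predict sgmRNA synthesis when the score reaches the threshold."""
    return score >= threshold


prediction = {name: predicted_active(score)
              for name, (score, _dg) in gene_s_sites.items()}

# The site with the more negative free energy forms the more stable duplex.
most_stable = min(gene_s_sites, key=lambda name: gene_s_sites[name][1])
print(prediction, most_stable)
```

Both criteria point the same way for this pair: the used site has the higher score and the more favorable duplex free energy.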
The presence of TRS-L complementary sequences flanking noncanonical cCS-Bs led to the use of alternative TRS-B sequences. These sgmRNAs, such as mRNAs 3a.3 and 7.2, although produced in significant amounts, were generally not translated into truncated proteins, because there was no initiation codon in their sequence. Nevertheless, in a minority of cases, new sgmRNAs could encode essential proteins, and the use of alternative noncanonical CSs could be a safeguard mechanism for virus survival. In fact, this could be the case of gene S alternative sgmRNAs that could lead to the production of truncated S proteins, similar to those found in field variants of TGEV, such as the porcine respiratory coronavirus (PRCV) (4, 29). The synthesis of alternative sgmRNAs using noncanonical CSs as part of the viral life cycle has also been reported for other coronaviruses (17, 23), for arteriviruses (24), and for nidovirus-derived expression systems (7, 11, 36).
Nucleotide substitutions within the first 3 nt of the CS-L that did not allow base pairing with cCS-B led to the inhibition of sgmRNA synthesis and, consequently, to a failure in infectious virus production. In contrast, mutation of CS-L nt 4 to 6 allowed infectious virus recovery, although virus production was reduced by more than 10^4-fold. The higher restriction within the first 3 CS-L nt during template switch fits the discontinuous negative-strand synthesis model (18, 32), because extension of the negative sgRNA body sequences to add the cL sequence could not proceed in the absence of complementarity between the 3' end of the growing negative strand and the TRS-L. However, alternative explanations, such as the effect of these nucleotide substitutions on key CS-L structural motifs, cannot be ruled out.
Our data indicate that template switching during synthesis of the negative strand takes place after the end of the complementarity between the 3' end of the nascent RNA strand (negative polarity) and the TRS-L. In this process, mismatches may be tolerated, provided that several complementary nucleotides between the TRS-L and cTRS-B are present upstream of the CS-B. This conclusion is based on the sequence of the leader-to-body junctions in a collection of 72 sgmRNAs generated after the introduction of point mutations within CS-L and CS-3a. For instance, nucleotide substitutions introduced in the first CS nucleotide (B-C1G) of gene 3a were transferred to the sgmRNA, probably because the identity between the TRS-B and the TRS-L was extended 3 nt upstream of the CS (Fig. 11). Similarly, nucleotide substitutions in the first CS-L nucleotide (L-C1U) never appeared in the sgmRNA sequence, except in the mRNA-E sequence, since in this case there was no immediately adjacent upstream complementarity (Fig. 11) (data not shown). This reinforces the concept that template switching takes place at the end of the complementarity between the TRS-L and the cTRS-B.
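The idea that identity extending upstream of the CS shifts the switching point can be made concrete with a short sketch. It counts how many contiguous nucleotides immediately 5' of the CS are identical between a TRS-L and a TRS-B. The function name `upstream_identity` and all three sequences are hypothetical toy inputs, not TGEV data; only the 6-nt CS (CUAAAC) is taken from the text.

```python
def upstream_identity(trs_l, trs_b, cs_len=6):
    """Count contiguous identical nucleotides immediately 5' of the CS.

    Both inputs are assumed to end with the CS (cs_len nucleotides), so the
    comparison starts at the nucleotide just upstream of the CS and walks
    toward the 5' end until the first mismatch.
    """
    flank_l = trs_l[:-cs_len][::-1]  # upstream flank, read 3' to 5'
    flank_b = trs_b[:-cs_len][::-1]
    run = 0
    for a, b in zip(flank_l, flank_b):
        if a != b:
            break
        run += 1
    return run


# Hypothetical toy sequences (NOT TGEV data); the CS, CUAAAC, is at the 3' end.
leader = "CGAACUAAAC"
body_extended = "UGAACUAAAC"  # same 3 nt directly upstream of the CS
body_none = "CGAUCUAAAC"      # mismatch directly upstream of the CS

print(upstream_identity(leader, body_extended))
print(upstream_identity(leader, body_none))
```

In the reading given above, a TRS-B of the first kind would let a mutated first CS nucleotide be copied before the switch, whereas one of the second kind, as described for mRNA-E, would not.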
FIG. 11. CS adjacent flanking sequences identity. Identity between the TRS-L sequence and TRS-Bs for all TGEV sgmRNAs is shown in the figure. The CS sequence is in white letters in a black box. White boxes highlight the identity in the sequences immediately flanking CS both at the 5' and 3' ends.
FIG. 12. Three-step working model of coronavirus transcription. (A) The 5'-3' complex formation step. Proteins binding the 5'- and 3'-end TGEV sequences are represented by the green ovals. The leader sequence is red, and CS sequences are yellow. An, poly(A) tail. (B) Base-pairing scanning step. Negative-strand RNA is in a lighter color than positive-strand RNA. The transcription complex is represented by the hexagon. Vertical dotted bars represent the base-pairing scanning by the TRS-L sequence in the transcription process. Vertical solid bars indicate complementarity between gRNA and the nascent negative strand. Un, poly(U) tail. (C) Template switch step. The thick arrow indicates the switch in the template made by the transcription complex to complete the synthesis of negative sgRNA.
The base-pairing scanning step (Fig. 12B) has been postulated based on the observed relationship between a high potential base-pairing score between the TRS-B and the TRS-L and the synthesis of sgmRNA. This correlation implies that during the synthesis of the negative RNA, the nascent chain has to be screened by the TRS-L, probably partially exposed at the top of a stem-loop. If the complementarity is above a certain threshold, the third step, the template switch, takes place (Fig. 12C) in a proportion of the nascent chains, and a copy of the gRNA leader is made, leading to termination of a negative sgRNA that will be used to generate an sgmRNA of the same length. The existence of this step is required to explain the primary structure of the large number (>60) of sgmRNAs described in this article. The transcription model postulated in this article reinforces and extends a previously proposed model for coronavirus and arterivirus (26, 27, 30, 32, 38).
The proposed coronavirus transcription mechanism implies a close interaction between TRS-L and each cTRS-B present in the gRNA. Therefore, there should be a restriction in the evolution of these TRSs, because changes in a given TRS-B would affect synthesis of the specific sgmRNA. More importantly, changes within the TRS-L should have a pleiotropic effect on the synthesis of all viral mRNAs. Therefore, the degree of freedom in the evolution of these TRSs should be limited, particularly for essential genes. Nucleotide substitutions within a TRS should only be fixed if the sequences flanking a canonical or noncanonical CS-B compensate for the decrease in the base pairing between cCS-B and CS-L. Therefore, the probability of a nucleotide substitution within a TRS-B should be lower than that of an average nucleotide substitution within RNA virus genomes (i.e., <1 x 10^-4) (8). Furthermore, the fixation of a nucleotide substitution within the TRS-L, particularly within the first 3 CS-L nt, would require the simultaneous incorporation of complementary mutations within the CS-B of at least the four essential TGEV genes encoding proteins S, E, M, and N. Therefore, this event should have an even lower probability (<1 x 10^-20 in TGEV). These theoretical considerations are supported in TGEV by the lack of CS-L evolution throughout more than 50 years of virus replication.
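The text gives the bound of 10^-20 without a derivation. One way to arrive at that order of magnitude, offered here only as an assumption, is to treat the change in CS-L and the compensating change in the CS-B of each of the four essential genes as five independent events, each with the per-site upper bound of 10^-4:

```python
from fractions import Fraction

# Upper bound on the per-site substitution probability quoted in the text.
p_single = Fraction(1, 10**4)

# Assumption: one change in CS-L plus one compensating change in the CS-B of
# each of the four essential genes (S, E, M, N), all treated as independent.
n_events = 1 + 4

# Joint probability of all five changes occurring together.
p_joint = p_single ** n_events
print(float(p_joint))
```

Exact fractions are used so that the product is not affected by floating-point rounding. Independence is a simplification, and the true constraint may be stronger or weaker than this estimate suggests.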
Multiple factors seem to regulate the transcription process (2, 26, 27). These factors would probably imply protein-RNA and protein-protein recognition, including viral and host cell components that will be the subject of future studies.
This work was supported by grants from the Comisión Interministerial de Ciencia y Tecnología (CICYT), La Consejería de Educación y Cultura de la Comunidad de Madrid, Fort Dodge Veterinaria, and the European Communities (Frame V, Key Action 2, Control of Infectious Disease Projects). I.S., S.A., and S.Z. received postdoctoral fellowships from the Community of Madrid and the European Union (Frame V, Key Action 2, Control of Infectious Disease Projects: QLRT-1999-00002, QLRT-1999-30739, and QLRT-2000-00874).