Journal of Virology, January 2007, p. 650-668, Vol. 81, No. 2
0022-538X/07/$08.00+0 doi:10.1128/JVI.01327-06
Copyright © 2007, American Society for Microbiology. All Rights Reserved.

Department of Biology, University of Nebraska at Omaha, Omaha, Nebraska 68182-0040
Received 23 June 2006/ Accepted 20 October 2006
The enterovirus cap-independent translation mechanism has been studied extensively, particularly in poliovirus, and serves as a model for all enteroviruses, including CVB3. In this translation mechanism, the 5'NTR contains a cis-acting internal ribosome entry site (IRES) (26, 50, 60) that recruits ribosomes directly to a downstream AUG codon, thereby circumventing canonical cap-dependent initiation (8). Elements of the 5'NTR are recognized by cellular translation factors, ribosomes, and other cellular proteins to assemble an initiation complex directly on the downstream AUG (3, 4). For replication, the 5'NTRs of CVB3 and other enteroviruses contain an RNA element known as the cloverleaf, which is required in cis for initiating negative-strand RNA synthesis. These negative strands then serve as templates for production of genomic positive strands (1, 18, 39, 45). Again, RNA-protein complexes, dependent upon structural elements in the 5'NTR RNA, are required for this function. In addition to their direct role in each function, specific RNA structures in the 5'NTR also regulate the conversion from translation to replication on the positive-strand genome (18, 22).
Structural integrity of the 5'NTR is fundamentally important for efficient viral replication and also for virulence. Numerous examples in CVB3 and other picornaviruses prove that mutations in the 5'NTR markedly decrease multiplication efficiency (62), alter cell tropism (54), and attenuate virulence (10, 14, 61). In the best-known example, each of the three attenuated Sabin vaccine strains for poliovirus, considered the prototype picornavirus, contains nucleotide substitutions in domain V of the 5'NTR that are responsible for attenuation (17, 27, 40). These mutants have been shown to multiply poorly in neuronal cells (31), accounting for their decreased neurovirulence and inability to cause poliomyelitis. In CVB3, a cardiovirulent determinant has been identified in domain II of the 5'NTR by characterizing naturally occurring genomes from noncardiovirulent strains (14) and by chimeric studies using echovirus 12 and CVB3 (6). Infection studies clearly show reduced multiplication efficiency in cardiomyocytes by noncardiovirulent strains and attenuation of virulence in mice (33). Multiple sequence differences between noncardiovirulent strains and cardiovirulent strains in domain II suggest that virulence depends upon the structure of this domain (33). It is clear that RNA structure in the 5'NTR holds the key to infection and virulence in enteroviruses.
The current structural model for the CVB3 5'NTR (Fig. 1) shows seven secondary structure domains (I to VII) defined by long-range base-pairing interactions. Between these domains are connecting segments that range in length from just 2 nucleotides to over 25 nucleotides. It is generally agreed that domains II to VI house the IRES element (46), although the minimal IRES requires only domains II, IV, and V (11, 21, 46). Domain I is the cloverleaf structure, which contributes to the efficiency of the IRES (55) but is absolutely required for replication functions (2, 45). The structure shown in Fig. 1 was derived from a combination of biochemical studies, energy minimization, and comparative sequence analysis (51, 56, 65). In support of this model, short regions of the molecule have been explored experimentally (41, 47, 56, 58). However, a comprehensive biochemical study of the structure of the 5'NTR RNA in the context of the entire folded molecule has not been completed. We have used chemical modification to analyze the solution structure of domains I to VI of the CVB3 5'NTR in the context of the entire 5'NTR. This analysis provides critical information about both secondary and tertiary interactions within the molecule. The accessibility of each of the four nucleotides to solvent was assessed by probing the molecule with dimethyl sulfate (DMS), 1-cyclohexyl-3-(2-morpholinoethyl)carbodiimide metho-p-toluenesulfonate (CMCT), and α-keto-β-ethoxybutyraldehyde (kethoxal) (16, 44). We show that much of the theoretical structure of the 5'NTR shown in Fig. 1 is supported experimentally but that there are critical exceptions, particularly in domains II and III, the connecting region between the cloverleaf and domain II, and the connecting region between domains V and VI. Proposals for new secondary and tertiary interactions in the 5'NTR, which include a loop E motif in domain III, an expanded single-stranded region in domain II, and a long-range pairing interaction, are based on the chemical probing analysis, comparative sequence analysis of 105 enterovirus sequences, and folding using the RNASTRUCTURE algorithm (43). Together, these results provide key structural insights into critically important functional regions of the picornaviral 5'NTR.
FIG. 1. Secondary structure map of the CVB3 5'NTR (65). The sequence is that of CVB3/28 (59). The map shows the proposed secondary structures of the seven predicted structural domains (I to VII).
Plasmid DNA was isolated using a Wizard (Promega) miniprep kit according to the procedure provided by the manufacturer. To generate a template for transcription, purified plasmid DNA was digested with an enzyme that leaves a blunt end. Two enzymes were used: EcoRV, which cuts at position 919 of the genome, and Ecl136II, which cuts at position 748.
Viral RNA transcription. The transcriptions were done in vitro using the Megascript kit (Ambion, Austin, TX) according to the procedure provided by the manufacturer. The reaction mixture included 1 µg of digested template DNA, and the total reaction volume was 20 µl. The transcription reaction mixtures were incubated overnight (12 to 16 h) at 37°C. Following incubation, reaction mixtures were treated with 1 µl of DNase I solution for 30 min at 37°C. The reaction was stopped by adding 115 µl of nuclease-free water and 15 µl of 0.5 M ammonium acetate. RNA was then extracted with phenol-chloroform-isoamyl alcohol (25:24:1), precipitated by incubating at -20°C overnight with 300 µl of isopropyl alcohol, pelleted for 20 min at 4°C, washed with 100 µl of 70% ethanol, dried, and resuspended in 24 µl of Tris-EDTA, pH 7.6. RNA was stored frozen at -70°C for not more than 4 weeks.
Composite gel electrophoresis. Folded RNA molecules were analyzed by electrophoresis on nondenaturing agarose-acrylamide composite gels (20). A sample containing 10 µg of RNA transcript was placed in folding buffer (25 mM Tris-HCl [pH 7.6], 10 mM MgCl2, 60 mM KCl) and denatured by heating in an 80°C heat block for 2 min. The sample was allowed to cool slowly in the heat block to 37°C to renature the RNA. The sample was loaded onto a 3% acrylamide-0.5% agarose composite gel in folding buffer and electrophoresed at 150 V for 6 h at 4°C with buffer changes every 2 h. Bands were detected by staining in 0.2% methylene blue, 0.2 M Na-acetate, 0.2 M acetic acid.
RNA structural analysis. In vitro structural analysis was accomplished by subjecting the folded 5'NTR RNA to chemical modification (16, 44). The nucleotide-modifying agents used were DMS, kethoxal, and CMCT. For each modification reaction, 15 µg of RNA was denatured in either 100 µl of DMS-kethoxal buffer (40 mM K-cacodylate [pH 7.2], 10 mM MgCl2, 50 mM NH4Cl, 0.75 mM dithiothreitol) or 50 µl of CMCT buffer (40 mM K-borate [pH 8.0], 10 mM MgCl2, 50 mM NH4Cl, 0.75 mM dithiothreitol) by incubation at 80°C for 2 min. The reaction mixtures were then slowly cooled to 42°C or lower (over a period of approximately 20 min) to allow the RNA to fold into its native structure and then transferred to 37°C for the modification reactions (16). For each modified RNA, a corresponding control RNA was subjected to all the steps in the protocol without the addition of a modifying agent. For the kethoxal modification, 5 µl of a 1.5 M solution of kethoxal (U.S. Biochemicals) was added to the RNA and incubated at 37°C for 30 min. For the DMS modification, 2 µl of a 20% DMS solution in 95% ethanol was added to the RNA and incubated at 37°C for 10 min. For the CMCT modification, 50 µl of a solution containing 42 mg/ml of CMCT dissolved in CMCT buffer was added to the RNA and incubated for 10 min at 37°C. Kethoxal reactions were stopped by adding 50 µl of 150 mM sodium acetate, 250 mM potassium borate (pH 7.0). DMS reactions were stopped by adding 25 µl of 1 M Tris-HCl (pH 7.5), 1 M β-mercaptoethanol, 0.1 M EDTA, and CMCT reactions were stopped by adding 300 µl of 95% ethanol. Modified RNA was recovered by ethanol precipitation with ammonium acetate.
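The working concentration of each modifying agent follows from the volumes given above. The short calculation below is a sketch only: it assumes that volumes are simply additive and that the RNA is already contained in the stated buffer volume. The function and variable names are illustrative and are not part of the protocol.

```python
def final_concentration(stock, added_ul, starting_ul):
    """Concentration after adding `added_ul` of stock to `starting_ul` of sample.

    Assumes additive volumes; the result is in the same unit as `stock`.
    """
    return stock * added_ul / (starting_ul + added_ul)


# Kethoxal: 5 µl of a 1.5 M (1,500 mM) stock into 100 µl of DMS-kethoxal buffer.
kethoxal_mM = final_concentration(1500.0, 5.0, 100.0)

# DMS: 2 µl of a 20% stock into 100 µl of DMS-kethoxal buffer.
dms_percent = final_concentration(20.0, 2.0, 100.0)

# CMCT: 50 µl of a 42 mg/ml stock into 50 µl of CMCT buffer.
cmct_mg_per_ml = final_concentration(42.0, 50.0, 50.0)

print(f"kethoxal: {kethoxal_mM:.1f} mM")
print(f"DMS: {dms_percent:.2f}%")
print(f"CMCT: {cmct_mg_per_ml:.1f} mg/ml")
```

Under these assumptions the reactions contain roughly 71 mM kethoxal, 0.39% DMS, and 21 mg/ml CMCT.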
Primer extension. Sites of chemical modification were identified by reverse transcriptase primer extension analysis (16, 44). To cover the entire 5'NTR, oligonucleotides were designed to prime the extension from different sections of the RNA molecule. Table 1 lists the oligonucleotides used in the extension reactions, their positions on the 5'NTR, and their sequences.
TABLE 1. Oligonucleotide primers
[γ-32P]ATP (10 µCi/µl), and 10 units of T4 polynucleotide kinase in a total reaction volume of 20 µl. The reaction mixture was incubated for 40 min at 37°C, and then the kinase enzyme was inactivated by incubation at 60°C for 20 min. After labeling, the reaction mixture was diluted with 30 µl of Tris-EDTA (pH 7.6) to bring the final concentration of the radiolabeled oligonucleotide to 2 pmol/µl. For the annealing reaction, 1 µg of chemically modified RNA, 2 µl of 5x annealing buffer (250 mM Tris-HCl [pH 8.3], 200 mM KCl) and 1 µl (2 pmol) of 32P-labeled oligonucleotide were placed into a reaction mixture of 10 µl total volume. The reaction mixture was heated to 80°C for 2 min and then cooled slowly to 42°C. Once the reaction mixtures reached 42°C, 2 µl of each annealing mixture was added to 2 µl of 2x extension mix (100 mM Tris-HCl [pH 8.3], 80 mM KCl, 12 mM MgCl2, 4 mM of each deoxynucleoside triphosphate) and 1 unit of avian myeloblastosis virus reverse transcriptase (Life Sciences, St. Petersburg, FL). To generate a sequence ladder, 2 µl of unmodified RNA was added separately to one of the four different termination mixes (50 mM Tris-HCl [pH 8.3], 40 mM KCl, 6 mM MgCl2, 1 mM of each deoxynucleoside triphosphate, and 0.1 mM of one dideoxynucleoside triphosphate [ddATP, ddCTP, ddGTP, or ddTTP]) and 1 unit of avian myeloblastosis virus reverse transcriptase. All the reaction mixtures were incubated at 42°C for 23 min. Reactions were stopped using 2 µl of stop solution (95% formamide, 20 mM EDTA, 0.05% bromophenol blue, 0.05% xylene cyanol), and the mixtures were stored at -70°C.
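The dilution and mixing steps in this protocol imply the amounts worked out in the following sketch. It rests on simplifying assumptions that are not stated in the protocol: volumes are additive, the enzyme volume is neglected, and the small amount of Tris carried in with the Tris-EDTA-diluted oligonucleotide is ignored. The helper `mix` is hypothetical.

```python
def mix(components):
    """Combine (volume_ul, {solute: concentration}) pairs.

    Returns the total volume and the resulting concentration of each solute,
    assuming additive volumes.
    """
    total_ul = sum(volume for volume, _ in components)
    amounts = {}
    for volume, concentrations in components:
        for solute, conc in concentrations.items():
            amounts[solute] = amounts.get(solute, 0.0) + conc * volume
    return total_ul, {solute: amt / total_ul for solute, amt in amounts.items()}


# 20 µl labeling reaction + 30 µl Tris-EDTA at a final 2 pmol/µl
# implies this much oligonucleotide in the labeling reaction.
labeled_oligo_pmol = 2.0 * (20.0 + 30.0)

# Annealing: 2 µl of 5x buffer in a 10 µl reaction (remaining 8 µl treated as water).
_, annealing = mix([
    (2.0, {"Tris-HCl": 250.0, "KCl": 200.0}),
    (8.0, {}),
])

# Extension: 2 µl of annealing mixture + 2 µl of 2x extension mix.
_, extension = mix([
    (2.0, annealing),
    (2.0, {"Tris-HCl": 100.0, "KCl": 80.0, "MgCl2": 12.0, "dNTP (each)": 4.0}),
])

print(labeled_oligo_pmol, "pmol oligonucleotide labeled")
print("annealing (mM):", annealing)
print("extension (mM):", extension)
```

With these assumptions, the labeling reaction contains 100 pmol of oligonucleotide, the annealing reaction is 50 mM Tris-HCl and 40 mM KCl, and the extension reaction is 75 mM Tris-HCl, 60 mM KCl, 6 mM MgCl2, and 2 mM of each deoxynucleoside triphosphate.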
Gel electrophoresis. Primer extension products were analyzed on 12% sequencing gels. The samples were electrophoresed for either 4 or 6 h at 60-W constant power in 90 mM Tris-borate (pH 8.3), 1 mM EDTA. Images of the gel were captured using a Packard Cyclone phosphorimager.
Sequence comparisons. Table 2 shows the 105 enterovirus sequences used in the sequence comparisons. Sequences were downloaded from GenBank, National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov) and analyzed in Vector NTI 9.1. A complete comparative sequence analysis of proposed base pairs is available at http://www.unomaha.edu/biology/tapprich.html. For several domains, specific examples of comparisons that provide strong support for inclusion of a base pair or strong support for removal of a base pair have been included in the figures. These examples are indicative but are not a comprehensive representation of the analysis that was conducted. The complete analysis is available at the website.
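The kind of tally that underlies a comparative analysis of proposed base pairs can be sketched as follows. This is an illustration only: the four-nucleotide toy alignment is hypothetical (it is not drawn from the 105 enterovirus sequences), the function names are invented for the example, and treating G-U as a pair is an assumption made here. It does not reproduce the Vector NTI workflow.

```python
from collections import Counter

# Watson-Crick pairs plus G-U wobble pairs (the latter counted as paired by assumption).
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def tally_pair(aligned_seqs, i, j):
    """Count nucleotide identities at alignment columns i and j (0-based)."""
    return Counter((seq[i], seq[j]) for seq in aligned_seqs)


def summarize(counts):
    """Split a tally into sequences that keep a pair and sequences with a mismatch."""
    paired = sum(n for pair, n in counts.items() if pair in CANONICAL)
    return {"paired": paired, "mismatch": sum(counts.values()) - paired}


# Hypothetical aligned sequences, used only to exercise the functions.
toy_alignment = ["GAUC", "GGUC", "GAUC", "CAUG", "GCUC"]

outer = tally_pair(toy_alignment, 0, 3)   # G-C in four sequences, C-G in one
inner = tally_pair(toy_alignment, 1, 2)   # A-U, G-U, and one C-U mismatch

print(dict(outer), summarize(outer))
print(dict(inner), summarize(inner))
```

In the toy data, the outer columns show a compensatory change (G-C to C-G) that preserves pairing, whereas the inner columns include one C-U mismatch; these are the two kinds of evidence used to support or question a proposed pair.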
TABLE 2. Enterovirus sequences
Three chemical probes were used to determine the accessibility of each of the four nucleotides to solvent. These chemical probes modify positions on the base that are involved in Watson-Crick base pairing, so only bases that are single stranded and are also accessible to solvent in the folded molecule are reactive. DMS modifies the N1 position of adenosine and the N3 position of cytosine, CMCT modifies the N3 position of uridine, and kethoxal creates a cyclic adduct using the N1 and N2 positions of guanosine (16, 44).
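The probe specificities just described can be written as a small lookup table. The sketch below is a deliberately simple model: it reports which probe would be expected to react at each position of a stretch that is assumed to be fully single stranded and solvent accessible, and it ignores protection by tertiary contacts. The function names are illustrative; the example input is the 5'-CACG-3' tetraloop at positions 62 to 65 of stem-loop d discussed in the domain I results.

```python
# Probe -> {nucleotide: atom(s) modified}, as stated in the text.
PROBE_TARGETS = {
    "DMS": {"A": "N1", "C": "N3"},
    "CMCT": {"U": "N3"},
    "kethoxal": {"G": "N1 and N2"},
}


def probe_for(base):
    """Return (probe, modified position on the base) for an RNA nucleotide."""
    for probe, targets in PROBE_TARGETS.items():
        if base in targets:
            return probe, targets[base]
    raise ValueError(f"not an RNA nucleotide: {base!r}")


def expected_hits(sequence, first_position):
    """Map each position of an unpaired, accessible stretch to its reporting probe."""
    return {
        f"{first_position + offset}{base}": probe_for(base)[0]
        for offset, base in enumerate(sequence)
    }


tetraloop_hits = expected_hits("CACG", 62)
print(tetraloop_hits)
```

Each of the four nucleotides is covered by exactly one of the three probes, which is why the combination reports on every position of the molecule.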
5'NTR conformation. Folding of the 5'NTR transcripts was accomplished by slow cooling of heat-denatured molecules to a temperature of 37°C in a folding buffer containing 10 mM MgCl2 and 60 mM KCl. To determine whether the 5'NTR transcripts folded into a single conformer, molecules taken through the heat denaturation and slow cooling procedure were analyzed on a 3% acrylamide-0.5% agarose native composite gel that was electrophoresed in folding buffer (25 mM Tris-HCl [pH 7.6], 10 mM MgCl2, 60 mM KCl). Composite gel electrophoresis is very sensitive to RNA conformation (20), and native gels have previously demonstrated conformational heterogeneity in the 5'NTR derived from hepatitis C virus (28). Migration as a single tight band indicated that the molecules folded into conformers of similar overall three-dimensional shape (data not shown).
Chemical probing. (i) Domain I. The cloverleaf structure of domain I is well supported by comparative sequence analysis, and the 3' stem-loop, known as stem-loop d, has been analyzed by nuclear magnetic resonance (NMR) (12, 23, 47). Our probing results provide solid experimental support for the cloverleaf structure. Figure 2 shows an example of the primer extension results for domain I (Fig. 2A) and the connecting region between domain I and domain II (Fig. 2A and B), as well as a summary of the modifications on the predicted secondary structure map (Fig. 2C). Comparative sequence analysis for selected positions is shown in Fig. 2D.
FIG. 2. Chemical probing results for domain I (cloverleaf). A. A 12% sequencing gel showing primer extension analysis of modified and unmodified CVB3 RNA. Labels on the left indicate nucleotide positions according to the sequencing tracks (lanes A, C, G, and U), and labels on the right identify positions that are modified. Lane Un, unmodified; lane K, kethoxal; lane D, DMS; lane C, CMCT. B. A 12% sequencing gel showing detailed results for the connecting region between domain I and domain II. Labels on the right indicate nucleotide positions according to the sequencing tracks (lanes U, G, C, and A), and labels on the left identify positions that are modified. Lane Un, unmodified; lane K, kethoxal; lane D, DMS; lane C, CMCT. C. Predicted secondary structure map of domain I, showing modified positions. Filled circles identify strongly modified positions; open circles identify moderately modified positions. Red boxes indicate examples of pairs that are not supported by the comparative sequence analysis results shown in panel D. Green boxes indicate examples of pairs that are supported by phylogenetic analysis in panel D. D. Analysis of representative paired positions, showing the number of occurrences of nucleotide identities among 105 enterovirus sequences.
Modifications at positions 48U and 49A confirm the bulged loop in stem d. The RNA then closes for the remainder of the stem region. Bases 54 to 56 in this stem are predicted to be available for modification as part of an internal loop opposite positions 71 to 73. However, NMR analysis indicates that these bases participate in noncanonical pyrimidine-pyrimidine pairs (47), creating a continuous stem. Our probing results confirm the paired arrangement, as none of the pyrimidines are accessible. Comparative sequence analysis shows that 54U changes to A 37 times, which creates an A-U pair with 73U in every case. The 55C-72U pair is completely conserved. The 56U-71U pyrimidine-pyrimidine pair changes to 56C-71C 18 times and so is either U-U or C-C in 104 of 105 sequences. There is only one occurrence of C-U. The G-U pair at the end of stem-loop d is modified, showing a light hit at 61U and a strong hit at 66G. The proposed tetraloop at the end of stem loop d is completely exposed, with modifications between 62C and 65G. The 5'-CACG-3' tetraloop sequence found in CVB3/28 is not one of the canonical tetraloop motifs (12). Interestingly, 49 of the 105 enterovirus sequences have 62U rather than 62C, which creates a canonical 5'-UNCG-3' tetraloop, and NMR results show that the 5'-CACG-3' sequence present in 53 of the 105 enterovirus strains adopts a fold very similar to a 5'-UNCG-3' tetraloop (12). In fact, the NMR results for these two sequences have helped to expand the tetraloop motif to include 5'-CACG-3' (12).
The long pyrimidine-rich connecting region between domain I and domain II is protected from our chemical probes, aside from strong modifications at 88U and 104U and a weak modification at 92C (Fig. 2B). We have observed this protection in multiple analyses and using two different oligonucleotide primers. At 16 nucleotides in length, this connecting region is the longest inaccessible stretch of bases in the entire 5'NTR. We suggest that the connecting region between domains I and II is folding into a unique structure that is involved in interactions with other regions of the molecule. Furthermore, as outlined in the section "Domain II" below, we propose that the connecting region between domain I and domain II extends much further than the present model suggests and terminates with a long-range pairing interaction.
(ii) Domain II. Domain II is a critical region of the enterovirus 5'NTR. Studies of both natural isolates and laboratory strains have shown that the sequence of domain II is a determinant for virulence in CVB3 (6, 14, 15, 33) and poliovirus (10). From our chemical probing results, it is clear that the structure of domain II is extremely complex. Figure 3 shows examples of the primer extension results (Fig. 3A) and also shows the modifications on the predicted secondary structure map (Fig. 3B). In the basal stem region of domain II there are several unusual probing results. In the lower stem (nt 105-117:171-181), the 5' strand shows protections that would be expected for a stem region, with modification only at the bulged uridines at positions 110 and 111. However, the 3' strand of this same stem region is largely exposed, showing prominent hits at 5 of the 10 positions and light hits at the other 5. Given the variable intensity of the modifications in the 3' strand and the protection of the 5' strand, we suggest that this stem region is a dynamic element in the molecule, with the 5' partner consistently involved in either secondary or tertiary interaction and with the 3' partner adopting an unstructured state. Comparative sequence analysis also suggests that several positions do not conserve pairing. For example, 106U-180A is a mismatch more often than it is complementary, and 107A-179U is a mismatch in 21 cases (Fig. 3C).
FIG. 3. Chemical probing results for domain II. A. A 12% sequencing gel showing primer extension analysis of modified and unmodified CVB3 RNA. Labels on the right indicate nucleotide positions according to the sequencing tracks (lanes U, G, C, and A), and labels on the left identify positions that are modified. Lane Un, unmodified; lane K, kethoxal; lane D, DMS; lane C, CMCT. B. Predicted secondary structure map of domain II, showing modified positions. Filled circles identify strongly modified positions; open circles identify moderately modified positions. Red boxes indicate examples of pairs that are not supported by the comparative sequence analysis results shown in panel C. Green boxes indicate examples of pairs that are supported by phylogenetic analysis in panel C. C. Analysis of representative paired positions, showing the number of occurrences of nucleotide identities among 105 enterovirus sequences.
FIG. 7. Chemical probing results for domain VI. A. A 12% sequencing gel showing primer extension analysis of modified and unmodified CVB3 RNA. Labels on the right indicate nucleotide positions according to the sequencing tracks (lanes U and G), and labels on the left identify positions that are modified. Results are shown for CVB3/28. Lane Un, unmodified; lane K, kethoxal; lane D, DMS; lane C, CMCT. B. Predicted secondary structure map of domain VI, showing modified positions. Filled circles identify strongly modified positions; open circles identify moderately modified positions. The pyrimidine-rich sequence (box A) and the AUG (box B) are indicated on the map.
In contrast to the basal stem, the apical stem-loop of domain II produces a modification pattern that is largely consistent with the predicted secondary structure. Strong modifications at 134C, 135G, 136U, and 137C confirm the predicted bulge loop. On the opposite side of the bulge, we see accessibility of a string of uridines from position 160 to 162. Comparative sequence analysis also suggests these uridines are not paired. The proposed 132A-162U pair is a mismatch 50 times, and the proposed 133A-161U pair is a mismatch more often than it is a canonical pair. Interestingly, neither 132A nor 133A is modified. These bases must be involved in other interactions. The stem leading up to the hairpin loop is protected at all seven paired positions except 142G, and the nucleotides in the loop region (position 147 to 152) are all exposed to solvent. Comparative sequence support for the apical stem is convincing, with a large number of compensatory substitutions at most positions (an example is shown in Fig. 3C). The connecting region between domain II and domain III consists of adjacent cytidines that are conserved in all 105 enterovirus sequences. Both of these positions are accessible to chemical probes.
(iii) Domain III. Deletion analysis with poliovirus has proven that domain III is dispensable for both IRES function and viral replication (11, 21, 46). The predicted secondary structure of domain III shows numerous internal loops separated by only a few base pairs. Our probing results suggest that the majority of the domain is not stably involved in Watson-Crick base pairing. An example of the primer extension results for domain III and a map showing the modifications on a predicted secondary structure map are shown in Fig. 4A and B.
FIG. 4. Chemical probing results for domain III. A. A 12% sequencing gel showing primer extension analysis of modified and unmodified CVB3 RNA. Labels on the left indicate nucleotide positions according to the sequencing tracks (lanes A, C, G, and U), and labels on the right identify positions that are modified. Lane Un, unmodified; lane K, kethoxal; lane D, DMS; lane C, CMCT. B. Predicted secondary structure map of domain III, showing modified positions. Filled circles identify strongly modified positions; open circles identify moderately modified positions. The two potential loop E motifs are indicated by E1 and E2. Red boxes indicate examples of pairs that are not supported by the comparative sequence analysis results shown in panel D. Green boxes indicate examples of pairs that are supported by phylogenetic analysis in panel D. C. Diagram of the loop E motif, showing the noncanonical base pairs and the bulged nucleotide. Elements of the loop E motif are numbered: 1, sheared A-G pair; 2, trans-Hoogsteen U-A; 3, bulged A; 4, trans (locally parallel)-Hoogsteen-Hoogsteen A-A. D. Analysis of representative paired positions, showing the number of occurrences of nucleotide identities among 105 enterovirus sequences.
Through the series of three internal loops above the basal stem nearly every base is accessible, including most of the proposed pairs that separate the loops. Our probing results do not support the proposed Watson-Crick pairs, and neither does comparative sequence analysis. Sequence comparisons for the four positions separating the large internal loops (193U-196C and 221G-224A) are shown in Fig. 4D. While not likely to be involved in Watson-Crick pairing, the internal loop sequences do, however, present the intriguing possibility for loop E motifs (34). The loop E motif uses a series of noncanonical base pairs to form a distorted helical region containing an S turn (9). The sequence signature for loop E is as follows: (i) an A-G pair (which can be A-A), (ii) an absolutely conserved U-A pair (noncanonical H bonding), (iii) a bulged nucleotide, and (iv) an A-A pair (which can vary to R-R, Y-Y, A-Y, and Y-A but rarely G-Y or Y-G). Figure 4C shows the loop E motif with the elements numbered accordingly. The noncanonical pairing in the loop E motif produces a characteristic pattern of chemical modification. Both purines of the A-G pair are accessible, the U in the U-A pair is protected while the A is accessible, the bulged nucleotide is protected, and the A-A pair is accessible (35). Two consecutive loop E sequence patterns are present in the proposed internal loops of domain III from CVB3, an upper loop utilizing bases 197-200:217-219 and a lower loop utilizing bases 191-194:202-204. These are labeled E1 and E2, respectively, in Fig. 4B. Our chemical modification results lend support to the upper loop E sequence (E1) but do not support the lower loop E sequence (E2). In the upper loop E sequence, 197A, 199U, 200A, 217G, 218A, and 219A are modified. Of these, all but 199U are predicted to be accessible in the loop E structure (34). An H bond to the backbone by the bulged nucleotide in the loop E structure should provide protection from chemical modification. 
Our results show such protection for 198A. Sequence comparison also provides solid evidence for the upper loop E motif. Among the 105 enterovirus sequences, none show deviation from the signature loop E sequence. Element 1 is A-G in 96 sequences, A-A in 8 sequences, and G-A in one sequence. Element 2 is conserved as a U-A pair, occurring in 104 of 105 sequences, with the only exception a C-U. The bulged nucleotide (element 3) is an A 104 times and a G once. Element 4 is variable, being A-A in 64 cases, A-G in 2, U-A in 22, and C-A in 17, but never G-Y or Y-G. Evidence for the lower loop E is lacking, both in the modification pattern and in the sequence comparison. For example, element 1, which should be A-A or A-G, is C-G 9 times, U-A 11 times and U-G 19 times; element 2, which should be U-A only, shows examples of A-A (11 times), A-G (1 time), C-A (4 times), G-A (14 times), G-G (7 times), and G-U (1 time).
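The counts reported for the upper loop E sequence (E1) can be checked for internal consistency with a few lines of code. The sketch below encodes only the numbers given in the text; the dictionary layout and the helper name are illustrative. It confirms that each element accounts for all 105 sequences and that no G-Y or Y-G combination occurs at element 4.

```python
PYRIMIDINES = {"C", "U"}

# Occurrences among the 105 enterovirus sequences, as reported in the text.
upper_loop_e = {
    "element 1": {"A-G": 96, "A-A": 8, "G-A": 1},
    "element 2": {"U-A": 104, "C-U": 1},
    "element 3": {"A": 104, "G": 1},
    "element 4": {"A-A": 64, "A-G": 2, "U-A": 22, "C-A": 17},
}


def is_g_y_or_y_g(pair):
    """True for G paired with a pyrimidine, in either orientation."""
    left, right = pair.split("-")
    return (left == "G" and right in PYRIMIDINES) or (
        left in PYRIMIDINES and right == "G"
    )


totals = {name: sum(counts.values()) for name, counts in upper_loop_e.items()}
forbidden_at_element_4 = [p for p in upper_loop_e["element 4"] if is_g_y_or_y_g(p)]
ua_fraction = upper_loop_e["element 2"]["U-A"] / totals["element 2"]

print(totals)
print("G-Y or Y-G at element 4:", forbidden_at_element_4)
print(f"U-A at element 2: {ua_fraction:.1%}")
```

Every element sums to 105, and the U-A pair of element 2 is present in about 99% of the sequences.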
In contrast to the lower part of domain III, the apical stem-loop is solidly supported by experimental evidence. The stem region is protected, and the proposed loop nucleotides are all modified. Sequence comparison also adds to this support. In the six proposed pairs in the stem, four never show examples of mismatches and the other two show a total of only four mismatches. Even these mismatches are balanced by a substantial number of compensatory base changes at these positions. The hairpin loop capping domain III is always 4 nucleotides, which is suggestive of a tetraloop. However, the sequence of the loop is highly variable among enteroviruses and does not show any clear tetraloop consensus pattern. All four of these bases are accessible to chemical probes. The connecting region between domain III and domain IV is very highly conserved. In fact, 234U, 236C, 238C, and 239C do not vary among 105 sequences, and 235A and 240G change only one time. All of the connecting-region bases are susceptible to modification.
(iv) Domain IV. The predicted structure for domain IV is very complicated and contains a long complex helical region topped by a junction loop, from which three stem-loop regions radiate (labeled A to C in Fig. 5B). Primer extension results for domain IV are shown in Fig. 5A, and a modification map is shown in Fig. 5B. The lower stem region of domain IV has 14 proposed base pairs interrupted only by a C-C mismatch. All of the proposed pairs are protected from modification, and most of the paired positions are amply supported by compensatory base changes. It is fascinating that the 249C-436C mismatch is conserved in all 105 enterovirus sequences. Only the 3' partner (436C) shows exposure to chemical modification. The lower stem is followed by an extremely large internal loop, whose bases are largely protected. On the 5' side of the loop only 1 base of the 12, 257A, is strongly modified, while 255A-256A and 261U-266A are lightly modified. On the 3' side of the loop, 421G-424G and 429U-430A are strongly modified, but five bases are protected. These results suggest noncanonical pairing or other types of interaction for this region of the molecule. Above the internal loop, a long helical region is interrupted by a small internal loop and a bulge loop involving nt 275-281:404-412. This proposed structure is supported by modification data showing that most of the single-stranded nucleotides are accessible. The exception is the trio 408A, 409C, and 410A, which show protection. Again, comparative sequence analysis provides excellent support for the vast majority of the complex stem region.
FIG. 5. Chemical probing results for domain IV. A. A 12% sequencing gel showing primer extension analysis of modified and unmodified CVB3 RNA. Labels on the right indicate nucleotide positions according to the sequencing tracks (lanes U, G, C, and A), and labels on the left identify positions that are modified. Lane Un, unmodified; lane D, DMS; lane C, CMCT. B. Predicted secondary structure map of domain IV, showing modified positions. Filled circles identify strongly modified positions; open circles identify moderately modified positions.
In stem-loop B the modification pattern is as expected from the model. The long stem region is protected, aside from the bulge-loop nucleotides near the center of the helix. The pyrimidine-rich bulge loop from position 335 to 340 is exposed, as is the tetraloop from position 347 to 349. The initial base of the tetraloop, 347G, is not marked on the diagram in Fig. 5 because a strong stop in primer extension did not allow analysis of this base. Stem-loop C displayed a pattern that matches the proposed structure, with only the tetraloop nucleotides in the hairpin showing modification.
The connecting region between domain IV and domain V is seven nucleotides, four of which are completely conserved (448C to 451C). All seven positions are accessible to chemical probes.
(v) Domain V. Attenuating mutations for the Sabin vaccine strains of poliovirus are located in domain V. Consequently, this domain has received a great deal of experimental attention. As shown in Fig. 6A and B, our probing results for domain V of CVB3 reveal that the predicted secondary structure is very well supported experimentally. Our modification results support the current model suggesting that the molecule folds into a long complex stem-loop containing several bulge loops and internal loops. The majority of the basal stem region is protected from modification, including a proposed internal loop composed of 455C, 555U, 556G, and 557U. A bulged U at position 459 is partially modified, as are the bases in the adjacent pair involving 460U-551G. Overall, the protection of this section of domain V indicates that it is involved in a structure beyond the simple base-pairing interactions. Every base pair between 452C-560G and 474C-537G is extremely conserved. Of the 15 proposed pairs, 11 are completely conserved, 2 have a single compensatory change, and 2 have two compensatory changes.
FIG. 6. Chemical probing results for domain V. A. A 12% sequencing gel showing primer extension analysis of modified and unmodified CVB3 RNA. Labels on the left indicate nucleotide positions according to the sequencing tracks (lanes A, C, G, and U), and labels on the right identify positions that are modified. Lane Un, unmodified; lane K, kethoxal; lane D, DMS; lane C, CMCT. B. Predicted secondary structure map of domain V, showing modified positions. Filled circles identify strongly modified positions; open circles identify moderately modified positions.
(vi) Domain VI. Two features in domain VI are critical for translation initiation directed by the IRES element of picornaviruses: a pyrimidine-rich region (known as box A) followed by an AUG (box B) (Fig. 7B). These features are separated by a spacer element of 15 to 25 nucleotides (57). In enteroviruses and rhinoviruses, the pyrimidine-rich region is located in a long connecting region between domain V and domain VI and the AUG is located in a stem region of domain VI. Our probing results for these regions are shown in Fig. 7A and B. Most of the nucleotides in the pyrimidine-rich connecting region are accessible for modification. However, within this region we encountered a large number of strong stops; we are unable to determine the exposure of these bases. Strong stops occurred at 561U, 562G, 566C, 567A, and 572A. Interestingly, we observed protection of positions 563U, 564U, and 565U. These bases are part of the conserved 5'-UGUUUC-3' sequence that is complementary to domain II and that we propose to be involved in a long-range pairing interaction. The long stretch of nucleotides from 568U to 582U showed accessibility; this stretch contains the putative Shine-Dalgarno-like sequence proposed to serve as the initial interaction site for the ribosome during translation initiation (64). In the hairpin-loop of domain VI, all of the proposed base pairs were protected from modification, including the pairs housing the AUG of box B and the mismatch 599A-610C. The GAGA tetraloop capping the domain was accessible, as was the bulged U at position 620. Sequence analysis reveals that this tetraloop changes to a triloop 34 times and often varies from the GNRA motif.
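The two sequence features described above (a pyrimidine-rich box A followed by the box B AUG after a spacer of 15 to 25 nucleotides, and the GNRA tetraloop consensus) can be expressed as a short sketch. The function names, the minimum tract length of six, and the test sequence are hypothetical; requiring an uninterrupted run of pyrimidines is a simplification of "pyrimidine-rich".

```python
import re

# GNRA consensus: G, any nucleotide, a purine (A or G), then A.
GNRA = re.compile(r"G[ACGU][AG]A")


def is_gnra(loop):
    """Return True if a 4-nt loop sequence fits the GNRA consensus."""
    return GNRA.fullmatch(loop) is not None


def find_box_a_box_b(rna, min_tract=6, min_spacer=15, max_spacer=25):
    """Find pyrimidine tracts followed by an AUG at the stated spacing.

    Returns (tract_start, aug_start) tuples (0-based). The spacer is
    counted from the end of the tract to the A of the AUG.
    min_tract is an illustrative assumption, not a value from the text.
    """
    hits = []
    aug_starts = [m.start() for m in re.finditer(r"(?=AUG)", rna)]
    for tract in re.finditer(r"[CU]{%d,}" % min_tract, rna):
        for aug in aug_starts:
            if min_spacer <= aug - tract.end() <= max_spacer:
                hits.append((tract.start(), aug))
    return hits


# Hypothetical sequence: an 8-nt pyrimidine tract, a 20-nt spacer, an AUG.
example = "GG" + "UUUCCUUU" + "G" * 20 + "AUG" + "GC"
print(find_box_a_box_b(example))
print(is_gnra("GAGA"), is_gnra("UUCG"))
```

The GAGA loop capping domain VI fits the consensus, whereas a triloop variant would fail the four-nucleotide match by construction.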
FIG. 8. Proposed structure model for the CVB3 5'NTR. For each domain, chemical probing results were entered into the RNASTRUCTURE algorithm (43) to generate a proposed structure. This structure was refined using results from the comparative sequence analysis. Strongly modified positions are indicated in red. Weakly modified positions are indicated in green.
The primary role of the cloverleaf structure is in replication. Ample evidence points to its function in assembling PCBP, viral proteins 3CD and 3Cpro, and poly(A) binding protein to initiate and regulate replication (1, 2, 18, 19, 22, 47, 49, 66). Our probing results confirm that the cloverleaf region consists of a four-stem structure emanating from a single junction loop. The exposure of nucleotides in the loop regions of stem-loop b, stem-loop c, and the tetraloop on stem-loop d provides available bases for recognition by the proteins that have been shown in poliovirus to interact with this region of the molecule: PCBP, viral proteins 3CD and 3Cpro, and the complex formed between poly(A) binding protein, PCBP, and 3CD.
A much more complex interpretation is needed for the long, pyrimidine-rich connecting region between domains I and II and the basal region of domain II. In the connecting region, probing results show an extended string of protection in bases predicted to be single stranded. It is likely that this region is involved in some type of interaction, perhaps folding into the interior of a globular portion of the molecule or participating in a triple-stranded structure. Since this pyrimidine-rich region of the 5'NTR is very highly conserved in class I IRES elements, this complex structural role is likely to be critical for the function of the molecule. Protection of the connecting region by an alternative secondary structure would be expected if this region serves as a buttress for the adjacent long-range pairing interaction that we have proposed for domain II. Together, these elements would control the structural relationship between the cloverleaf and the IRES element downstream.
Several studies with poliovirus, particularly those conducted in vitro, have established that the cloverleaf and IRES function independently, with the cloverleaf directing replication and the IRES directing translation (39, 45, 46). However, solid evidence also exists for cloverleaf-mediated effects on translation and IRES-mediated effects on replication (5, 24, 25, 45, 53, 55). Our model for a structural organization in the enterovirus 5'NTR that brings the cloverleaf and the IRES together provides a foundation for the mechanism underlying these synergistic effects. The model also accentuates the importance of the connecting region between the cloverleaf and domain II. The cloverleaf is generally reported as nt 1 to 88 and the IRES as nt 127 to 608 (CVB3 numbering) (10, 45, 63), which leaves nt 89 to 126 unassigned. It is clear from the recent mutagenesis study by De Jesus et al. (10) that this region is critical for poliovirus virulence. In addition, deletion studies designed to prove the role of the cloverleaf in replication used constructs that included nt 89 to 110 (38, 39). Our results call attention to the important structural role of the region between the cloverleaf and the IRES and emphasize the need to focus studies in that area.
Based on the experimental results, major changes are proposed for domain II. The helix originally suggested at the base of the domain shows the expected protections on the 5' side, but many of the partnering bases on the 3' side are modified. This behavior continues through the large internal loop above the helix, where the 5' side is protected but the 3' side is exposed. These results suggest that the base of the domain is not consistently paired. This conclusion is supported by comparative sequence analysis showing that some class I IRES elements do not have the potential to pair in this region (65). Indeed, our own analysis of 105 enterovirus sequences shows that several positions in the basal helix change to mismatches in a substantial number of genomes. We have proposed a long-range pairing interaction that involves the 5' strand of domain II (Fig. 8). This pairing would bring domain II together with domain V, which links the 5' end of the IRES with the critically important polypyrimidine tract (box A) and AUG (box B) at the 3' end of the IRES. A functional interaction between domain II and domain V is also amply supported by evidence showing that both are critical for virulence in enteroviruses (6, 14, 17, 27, 33, 40). As alluded to earlier for the connecting region between the cloverleaf and domain II, the long-range interactions may account for the protection of bases in domain II on either side of the interaction if these bases are buried in a globular domain of the molecule. The apical stem-loop of domain II is present in all class I IRES elements, and our sequence analysis shows numerous examples of compensatory base changes. Modification results provide strong support for this predicted structure.
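As a rough illustration of how candidate partners for a long-range pairing could be screened, the sketch below scans a target strand for windows able to pair in antiparallel orientation with the conserved 5'-UGUUUC-3' sequence noted in the domain VI results. The function name, the option to allow G-U wobble pairs, and the target sequences are hypothetical; the actual domain II sequence is not reproduced here.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def pairing_sites(target, motif="UGUUUC", allow_wobble=True):
    """Return 0-based start positions in target that can pair with motif.

    Both sequences are written 5'->3'. Pairing is antiparallel, so the
    motif is compared with each target window read in reverse.
    """
    allowed = WATSON_CRICK | WOBBLE if allow_wobble else WATSON_CRICK
    n = len(motif)
    sites = []
    for start in range(len(target) - n + 1):
        window = target[start:start + n]
        if all((m, w) in allowed for m, w in zip(motif, reversed(window))):
            sites.append(start)
    return sites


# Hypothetical target containing the exact complement GAAACA at position 2.
print(pairing_sites("CCGAAACACC"))
```

Sequence complementarity alone only nominates candidate sites; the probing and covariation evidence discussed in the text is what supports or rejects a given pairing.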
Studies with poliovirus show that domain III is dispensable for IRES activity and for viral infection (21). However, comparative sequence analysis shows that this domain is quite variable among class I IRES elements (65), raising the possibility that it accounts for biological differences between one type of virus and another. For example, bovine enterovirus contains only the apical stem-loop region, while other enteroviruses and rhinoviruses have a lower complex helix in addition to the apical stem-loop (Fig. 4). For all of these viruses, the apical loop is 4 nucleotides, but the sequence motif is highly variable and does not follow a tetraloop pattern. When a lower helix is present, there is at least one loop E sequence signature, and for most there are two such sequence motifs. The loop E motif is a helical element composed of three noncanonical base pairs surrounding a bulged purine (34). The motif was first discovered in the loop E region of 5S RNA and has since been found in numerous RNA molecules (35), where it provides a recognition feature for interactions with protein and RNA. The non-Watson-Crick base pairs in the motif form a characteristic tertiary structure that includes an important recognition feature called an S turn (9). The tertiary structure also gives a signature pattern of chemical modifications (34). In rRNAs, loop E motifs provide the key structures for complex protein interactions by translation factors (9) and zinc fingers (32). A loop E in domain III of the enterovirus 5'NTR may play a similar role. In CVB3 there are two potential loop E motifs in the complex helix of domain III (shown in Fig. 4B). Modification results are consistent with the formation of an upper loop E motif but not with the lower loop E motif. This interpretation is strongly supported by sequence comparison, where the loop E signature is present in the upper portion of the stem in every enterovirus genome, while the signature is not found in many cases in the lower portion of the stem. The upper loop E has been incorporated into our proposed 5'NTR structure (Fig. 8).
With two exceptions, modifications in domain IV follow the predicted pattern. One exception occurs in the extremely large internal loop involving bases 254 to 266 and 420 to 430. Within these supposed single-stranded regions there are long stretches of protected nucleotides. Such behavior suggests that the bases are involved in structural interactions, likely at the tertiary level. The other exception is in the stem that leads to stem-loop A of the junction loop. Modification results suggest that this stem is very flexible near the junction loop. Three elements in the junction loop region of domain IV, i.e., the C-rich loop of stem-loop A, the bulge loop in stem-loop B, and the bulge loop at position 370, are recognized by the protein PCBP in poliovirus (19). This is a key interaction for IRES function and for the regulatory step controlling translation and replication. Presentation of the PCBP recognition features is sure to depend upon the three-dimensional folding of the junction loop, which our results show to be dynamic. Indeed, a recent NMR study by Du et al. (13) showed that a single-base heterogeneity in enteroviral sequences at position 337 produces drastically different structures in stem-loop B. Whereas the loop containing 337C adopts a flexible L shape, the loop containing 337U adopts a rigid U shape.
Domain V has received a great deal of experimental attention because the attenuation mutations for the vaccine strains of poliovirus are located in this domain (17, 40). In our chemical probing experiments, the predicted structure of domain V is well supported. The large bulge loop around position 520 is of particular interest because it is conserved and is critical for poliovirus virulence (41). The mutagenesis study by Malnou et al. (41) specifically addressed the presence and role of two conserved GNRA motifs in this bulge loop. Their results showed that mutation of both sequences had the most dramatic effect, but when assayed separately, mutation of 524G-525C had the most impact. Our chemical probing is also consistent with their biochemical results, which showed the bulge loop to be somewhat protected from RNase attack and lead probes (41). However, the majority of the loop (nt 517 to 522) is exposed to solvent, as shown by Stewart and Semler (58) for poliovirus. Based upon probing results and a convincing sequence covariation pattern, we suggest that an intraloop pairing involving 523G-528C, which frames a GCAA tetraloop, is present in the bulge loop. This has been incorporated into our model of the 5'NTR (Fig. 8).
The connecting region between domains V and VI, as well as domain VI itself, contains sequences important for translation initiation. One such region is an extended pyrimidine-rich sequence called box A, and the other is a properly spaced AUG called box B (57). The box A sequence is complementary to 18S rRNA and has been implicated in a Shine-Dalgarno-like interaction (37, 64). The box B sequence is predicted to be part of a stem structure in domain VI. We show that the long connecting region containing box A is accessible to modification in the region complementary to 18S rRNA, making it available for a potential base-pairing interaction. We also propose a long-range pairing interaction that brings box A into proximity with domain II. In a study by Yang et al. (64), designed to test the participation of the polypyrimidine box A in ribosome binding by enhancing the pairing potential to 18S rRNA, bases 561 to 566 were mutated to nucleotides that would no longer participate in the pairing interaction proposed here. In bicistronic assays, the mutant IRES neither significantly decreased nor increased in vitro translation. These results suggest that the long-range pairing interaction is not required for the in vitro bicistronic translation assays conducted in that study, but they do not rule out the existence or importance of the interaction. Since the assays in the study by Yang et al. (64) were conducted using HeLa cell extracts, the effects of disrupting the long-range pairing interaction may have been masked. HeLa cells are extremely permissive for CVB3 and other enteroviruses. In fact, CVB3 mutants that show dramatic growth and virulence phenotypes in mouse fetal heart fibroblasts show no altered phenotype in HeLa cells (14, 33). Similarly, the Sabin strains of poliovirus show no growth phenotype in HeLa cells but are severely attenuated in cells of neuronal origin (31).
Overall, the predicted model shown in Fig. 1 for the secondary structure of the CVB3 5'NTR is supported by our chemical modification and comparative sequence analyses. In our working model of the CVB3 5'NTR, we have identified regions that need revision, including previously uncharacterized interactions between both local and distant nucleotides, and we have localized sites likely to participate in a long-range pairing interaction that could mediate the functions of the cloverleaf and IRES. We have pinpointed a loop E motif likely to serve as a protein recognition feature, and we have provided evidence for a tetraloop closed by a lone pair. Although the probing results were generated with CVB3, the proposals for novel interactions can be extended to all enteroviruses, as they were all supported by sequence comparison. The structural insights presented here will be important references for guiding mechanistic studies designed to gain a more complete understanding of the picornaviral 5'NTR.
This work was supported by NIH grant P20 RR16469 from the INBRE Program of the National Center for Research.
Published ahead of print on 1 November 2006.
Present address: Eppley Institute for Research in Cancer and Allied Diseases, University of Nebraska Medical Center, Omaha, NE 68198.