Journal of Virology, April 2007, p. 3913-3921, Vol. 81, No. 8
0022-538X/07/$08.00+0 doi:10.1128/JVI.02236-06
Copyright © 2007, American Society for Microbiology. All Rights Reserved.

Jeremiah S. Joseph,1 Vanitha Subramanian,1 Benjamin W. Neuman,2 Michael J. Buchmeier,2 Raymond C. Stevens,3 and Peter Kuhn1*
Department of Cell Biology,1 Department of Molecular Integrative Neurosciences,2 and Department of Molecular Biology,3 The Scripps Research Institute, 10550 N. Torrey Pines Road, La Jolla, California 92037
Received 11 October 2006/ Accepted 1 December 2006
29-kb genomic RNA that is thought to be organized as a helical filamentous ribonucleoprotein (RNP) complex. Several copies of the N protein self-associate and form a template for binding RNA during nucleocapsid formation (13, 16, 18, 35, 61). As noted in studies of murine hepatitis virus (MHV), the initial steps of virus assembly, including the formation of the RNP complex and its eventual packaging into the virion lumen, occur in a temporally regulated manner, mainly at the endoplasmic reticulum-Golgi intermediate compartments just prior to budding (1, 8, 22, 55). Successful targeting of the RNP into the virion lumen is thought to be facilitated by its anchoring onto the membrane-embedded M protein through a specific interaction between their respective C-terminal tails (10, 23, 32, 39, 56). Despite extensive studies on several model coronaviruses spanning 25 years, our structural understanding of these assembly events remains incomplete (5, 7, 8, 15, 24, 34).
SARS-CoV N protein is translated from the smallest of the eight subgenomic RNAs (the bicistronic sg-mRNA 9) (15, 26, 54), which spans the genomic 3'-most open reading frame, ORF9a (Fig. 1a). Coronaviral N proteins are typically ca. 45 to 50 kDa, very basic (with typical pIs of about 10), prone to aggregate into large homopolymers (16), phosphorylated at multiple sites (3, 50, 58), and extremely labile to proteolytic degradation (39, 57, 61). These characteristics have hindered in vitro structural studies on full-length N. The N-terminal domains of coronaviral N proteins (N-NTDs) typically share about 30 to 40% sequence identity (Fig. 1c). As in most nidoviruses, the full-length SARS-CoV N protein (430 residues) has three main protein domains: an N-terminal RNA-binding domain (i.e., the N-NTD), a poorly structured central serine-rich region that is thought to house the primary sites of phosphorylation (33, 58), and a C-terminal domain (N-CTD [52]) that is mainly involved in oligomerization and self-association (4; Fig. 1b). In addition, a few coronaviruses have about 20 residues upstream of the NTD that are rich in serine, glycine, and arginine (SRG motif; Fig. 1b). N protein is also known to undergo sumoylation (28). Several other ancillary functions have been ascribed to coronaviral N proteins. In MHV as well as infectious bronchitis virus (IBV), N binds not only to genomic RNA but also to the six subgenomic RNAs (62). It is involved in cell signaling (19, 20) and is known to interact with several human proteins, including human cyclophilin A (31) and human RNP A1. Anti-N monoclonal antibodies protect mice from lethal coronaviral infection (43). SARS-CoV N is known to elicit a well-defined immunological response, as evidenced by its peptides binding to human lymphocyte antigens with nanomolar affinities (2, 53), which underscores the importance of N as a potential target in neutralizing SARS infection (30). The structure of a highly conserved nine-residue peptide corresponding to the region 362KTFPPTEPK370 has been solved in complex with a class I major histocompatibility complex molecule (2, 53).
FIG. 1. (a) Organization of the SARS-CoV genome. Locations of the open reading frames (ORFs) are indicated. The boundaries of the 16 nonstructural proteins (nsp1 to nsp16) that result from proteolytic processing of the replicase polyprotein (PP1ab) by PL-Protease (green) and 3CLpro (black) are marked by vertical lines. (b) Domain organization of coronaviral N proteins. The four domains labeled are as follows: SGRD, serine-glycine-arginine-rich domain; NTD, N-terminal domain; SRD, serine-rich domain; and CTD, C-terminal domain. (c) Multiple sequence alignment of NTDs. The region for which structural coverage is provided in this study is marked by vertical lines. Hydrophobic residues are shown in yellow. Secondary structures observed for SARS-CoV N-NTD are shown above the alignment as arrows (strand) and cylinders (helix). Positively charged residues that have been implicated in RNA binding are indicated by asterisks above the sequence. The ICTV acronyms used for each viral sequence and their corresponding database accession numbers were as follows: HCoV-229E, human coronavirus 229E (NP_073556); TCoV-NC95, turkey coronavirus NC95 strain (gi32129798); BCoV-Lun, bovine coronavirus (AAL57313); HEV-VW572, porcine hemagglutinating encephalomyelitis virus (YP_459957); TGEV-Purdue, transmissible gastroenteritis virus Purdue strain (NP_058428); HCoV-NL63, human coronavirus NL63 (YP_003771); PEDV-CV777, porcine epidemic diarrhea virus CV777 strain (NP_598314); FCoV-79-1146, feline coronavirus (YP_239358); SARS-CoV-Tor2, severe acute respiratory syndrome coronavirus-Tor2 strain (AAP41047); MHV-JHM, murine hepatitis virus JHM strain (YP_209238); HCoV-OC43, human coronavirus OC43 (NP_937954); HCoV-HKU1, human coronavirus HKU-1 (YP_173242); RCoV, rat coronavirus (AAD33104); HECoV-4408, human enteric coronavirus 4408 (AAQ67202); CCoV, canine coronavirus; ECoV-NC99, equine coronavirus NC99 (Q9DQX6); PgCoV, pigeon coronavirus (gi58416203); PCoV, puffinosis coronavirus (gi28460530); and IBV-Beaudette, avian infectious bronchitis virus (NP_040838).
In the present study, we report the crystallographic characterization of SARS-CoV N-NTD spanning residues 47 to 175 in two crystal forms that have been solved at 1.17 Å and 1.85 Å resolution. The structures were phased by molecular replacement using the IBV homolog. Comparison of the two crystal forms with the solution conformation of this domain (residues 45 to 181 [19]) and with the two published IBV N-NTD structures (residues 29 to 160 [21]) shows several commonalities, as well as many subtle structural differences. The crystal packing observed in the cubic form of SARS-CoV N-NTD and in the C2 lattice of its IBV homolog suggests that the two viruses probably employ different modes of oligomeric self-association during RNP core formation. Modeling studies on this domain from related coronaviruses suggest that both the assembly of RNA around the helical N protein polymer and the manner in which the N proteins recognize RNA are likely to differ. These observations might explain why the fully packaged nucleocapsids of different nidoviruses often exhibit different morphologies in cryo-electron microscopy (cryoEM) studies (15).
Crystallization and data collection. Crystals were grown by the nano-volume sitting-drop method. Typically, 100 nl of protein was mixed with 100 nl of well solution. Monoclinic crystals grew in a solution containing 0.2 M sodium bromide, 0.1 M sodium acetate (pH 5.5), and 25% polyethylene glycol 2000 MME. The crystal that was used for data collection contained BCIP (5-bromo-4-chloro-3-indolylphosphate) as an additive. The cubic crystal form grew in 40% methyl pentanediol and 0.1 M Tris (pH 8.0), typically within 2 weeks. These were cryoprotected in a solution containing mother liquor and 15% glycerol and flash frozen in liquid nitrogen. Crystal screening and data collection were done by using the BLU-ICE (36) interface at the remote facility at the Stanford Synchrotron Radiation Laboratory Beamline-11.1, and all diffraction data were processed using HKL2000 (41).
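Because equal volumes of protein and well solution are mixed, each well-solution component starts at half its reservoir concentration in the drop. The following is a minimal sketch of that dilution arithmetic only; the function name is hypothetical, and the calculation describes the drop immediately after mixing, before any equilibration against the reservoir.

```python
def initial_drop_concentration(stock_conc, stock_volume_nl, other_volume_nl):
    """Concentration of one component immediately after mixing two volumes.

    stock_conc is in any unit (M, % w/v, ...); the result is in the same unit.
    Equilibration against the reservoir is deliberately not modeled.
    """
    total_volume_nl = stock_volume_nl + other_volume_nl
    if total_volume_nl <= 0:
        raise ValueError("total drop volume must be positive")
    return stock_conc * stock_volume_nl / total_volume_nl


if __name__ == "__main__":
    # Well-solution components quoted in the text, mixed 100 nl + 100 nl.
    components = [
        ("sodium bromide (M)", 0.2),
        ("sodium acetate (M)", 0.1),
        ("PEG 2000 MME (%)", 25.0),
        ("methyl pentanediol (%)", 40.0),
    ]
    for name, conc in components:
        diluted = initial_drop_concentration(conc, 100, 100)
        print(f"{name}: {conc} in well, {diluted:g} in the fresh drop")
```

The same function applies to the protein if its stock concentration is known; that value is not given here, so it is left out.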
Phasing and refinement. Initial phases for the monoclinic crystal form were obtained by molecular replacement with the program Phaser (44), using a full-atom model of the corresponding domain of the IBV nucleocapsid (PDB ID 2BTL) and data from 20.0 to 3.0 Å. Rigid body refinement using Refmac5 revealed a clearly interpretable electron density map. Both the phases and the model were further improved by one cycle of automated model building followed by a solvent atom search in Arp/wARP (25). The resulting model was improved by subsequent rounds of manual model building in Coot (9) alternated with restrained refinement in Refmac5 (37) of CCP4 using anisotropic B factor refinement. Optimum TLS parameters were analyzed at the TLSMD Web server (42), and 11 TLS groups covering 139 residues were used during refinement. Similarly, the structure of the cubic crystal form was solved by molecular replacement using the IBV N-NTD structure as the query model and data from 20.0 to 3.0 Å, followed by rigid body refinement and finally by alternating cycles of manual model building in Coot and restrained refinement in Refmac5 (37). The final model statistics, validation, and stereochemical quality for the two structures are reported in Table 1.
TABLE 1. Data collection and refinement statistics
Protein structure accession numbers. The structure factors and coordinates of SARS-CoV N-NTD in the two crystal forms have been deposited in PDB under accession numbers 2OFZ (monoclinic form) and 2OG3 (cubic form), respectively.
FIG. 2. (a) Structural representation of the N-NTD monomer. The structure is colored from the N terminus (blue) to the C terminus (red). (b) Distribution of electrostatic potential on the surface of N-NTD. The potential distribution was calculated by using the APBS module in PyMOL (6). The values range from -5 kT (red) to 0 (white) and to +5 kT (blue), where k is the Boltzmann constant and T is the temperature. The molecule is rotated by about 180° around the y axis relative to panel a. (c) The crystal structure of the monoclinic form of SARS-CoV N-NTD superimposed on the average coordinates of the NMR structure of the same domain as reported by Huang et al. (19). The four regions along the polypeptide that differ the most between the two structures are indicated by L1 to L4. Loop L1 is colored cyan for the NMR structure and blue for the crystal structure. (d) Stereo diagram showing the Cα trace of superimposed structures of SARS-CoV N-NTD and IBV N-NTD. The cubic and monoclinic forms of SARS-CoV N-NTD are shown in green and blue, respectively, while the structure from IBV is traced in red.
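To give the 5 kT limits of the color scale in Fig. 2b a physical magnitude, the thermal voltage kT/e can be evaluated directly. The sketch below is an illustration only: it assumes that the potential is expressed per unit charge (kT/e) and that T = 298.15 K, neither of which is stated in the caption, and the function name is hypothetical.

```python
# Illustrative conversion of an electrostatic potential given in units of kT/e
# to millivolts. The temperature (298.15 K) is an assumption, not a value
# taken from the article.

BOLTZMANN_J_PER_K = 1.380649e-23       # Boltzmann constant k (exact SI value)
ELEMENTARY_CHARGE_C = 1.602176634e-19  # elementary charge e (exact SI value)


def kt_over_e_to_millivolts(potential_kt_per_e, temperature_k=298.15):
    """Convert a potential expressed in kT/e into millivolts."""
    thermal_voltage_v = BOLTZMANN_J_PER_K * temperature_k / ELEMENTARY_CHARGE_C
    return potential_kt_per_e * thermal_voltage_v * 1000.0


if __name__ == "__main__":
    for limit in (-5.0, 0.0, 5.0):
        print(f"{limit:+.0f} kT/e = {kt_over_e_to_millivolts(limit):+.1f} mV")
```

At the assumed temperature, one kT/e is roughly 25.7 mV, so the color scale spans about 128 mV on either side of zero.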
atoms. These might account for the failure of attempts at phasing the two datasets using the solution structure coordinates (the average, the ensemble in toto, or individual models) as the query model. Similarly, Fan et al. reported large RMSDs between the IBV crystal structure and the NMR spectroscopy structure of SARS-CoV N-NTD as a possible cause of failure in phasing the IBV structure using the SARS-CoV NMR coordinates (14). The most dramatic differences are in the loops L1 (residues 91 to 100) and L3 (residues 118 to 123), which show a concerted inward shift by as much as 4.2 Å in the X-ray structure (Fig. 2c). This movement, combined with a corresponding outward hinge motion of the β-hairpin L2 (Glu 106 and Gly 107) and the loop L4 (residues 127 to 134), results in the RNA-binding cleft (discussed below) being both narrower and shallower in the X-ray structure than in the solution structure.
The Cα traces of the superimposed structures of the two crystal forms of SARS-CoV N-NTD and the corresponding domain of IBV are shown in Fig. 2d. As one would anticipate (given the success of molecular replacement), the crystal structures of the IBV and SARS-CoV N-NTDs are quite similar (RMSD = 1.22 Å for 110 superimposed Cα atoms). The two SARS-CoV N-NTD crystal forms themselves superimpose quite well, with an RMSD of 0.3 Å over 111 Cα atoms. They differ in the side chain rotamers of a few residues. The most important difference is in the positively charged β-hairpin (residues 57 to 72), implicated in RNA binding, which is disordered in the cubic form (Fig. 2d).
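The RMSD values quoted above are root-mean-square distances between paired atoms after superposition. As a minimal sketch of that final step only, the function below computes the RMSD for two coordinate lists that are assumed to be already superimposed and paired atom by atom; the superposition itself is not shown, the function name is hypothetical, and the coordinates in the example are made up rather than taken from any deposited structure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z)
    coordinates that have already been superimposed and paired atom by atom."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures must contribute the same number of atoms")
    if not coords_a:
        raise ValueError("at least one atom pair is required")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Made-up coordinates in Å, for illustration only.
    model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    model_b = [(0.0, 0.3, 0.0), (3.8, -0.3, 0.0), (7.6, 0.0, 0.3)]
    print(f"RMSD = {rmsd(model_a, model_b):.2f} Å")
```

Because the sum runs only over paired atoms, residues that are disordered in one structure simply drop out of the comparison, which is why the number of superimposed atoms is reported alongside each RMSD.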
Putative RNA binding surface. Solution studies by NMR spectroscopy of SARS-CoV N-NTD and in vitro RNA binding studies in IBV clearly indicate that this domain binds to viral RNA corresponding to a highly conserved region at the genomic 3' end (61). We also observed that the region encompassed by the N-NTD construct binds single-stranded RNA, double-stranded RNA, single-stranded DNA, and double-stranded DNA (in decreasing order of affinity) in gel shift assays (data not shown). As observed in the previous structures, there is a clear segregation of positive and negative charges in the crystal structure of SARS-CoV N-NTD. The positively charged residues are largely confined to a groove that includes the β-hairpin and the cleft, whereas the negatively charged residues are clustered around the β-sheet core (Fig. 2b). It is therefore likely that the model of RNA-N interaction proposed for IBV (a group III coronavirus), in which the phosphate backbone of RNA stacks against the conserved arginine and lysine residues of the β-hairpin due to favorable electrostatics, holds for SARS-CoV as well. The two tyrosine residues of IBV N-NTD (Y92 and Y94) that have been proposed to potentially stack with consecutive nucleotide bases are conserved in SARS-CoV N-NTD (Y76 and Y78) and lie at the base of the RNA binding groove. Similar modes of RNA-protein interactions (mostly in mRNA cap recognition) have been reported in unrelated RNA-binding viral proteins such as VP40 of Ebola virus (16) and vaccinia virus VP39 (22, 51). The positively charged β-hairpin was reported to be flexible on the basis of weaker-than-expected nuclear Overhauser effects in the heteronuclear NMR experiment and higher-than-average B factors observed in the IBV crystal structure. This hairpin is completely disordered in one of the crystal forms (the cubic form) reported here. However, it is clearly ordered in the high-resolution structure of the monoclinic form, where it adopts a conformation similar to that in one of the IBV structures and is almost perpendicular to the central core domain (Fig. 2a).
Although N-NTD binds to all four forms of nucleic acid polymers (single- and double-stranded RNA and single- and double-stranded DNA), specificity for RNA over DNA is probably provided more by context (localization to the replicase complex) than by biochemical selectivity. Whether coronaviral N protein traffics to the nucleus (20, 59) or, more specifically, to the nucleolus (17) has been the subject of debate over many years. Given the shallow nature of the RNA binding groove of N-NTD, it is possible that full-length SARS N might bind to one or both RNA forms (single and double stranded) within infected cells.
Packing of SARS-CoV N-NTD monomers in the two crystal forms. We observed distinctly different modes of packing of N-NTD monomers in the two crystal forms. The asymmetric units of both crystal forms contain one monomer each. In the monoclinic crystal form, the symmetry mates pack in a linear three-dimensional array as head-to-head dimers, with most of the interfacial interactions being made by residues of the positively charged β-hairpin (Fig. 3a and b). In the cubic form, the N-NTD monomers pack as helical tubules. Here, individual monomers are organized as trimeric units (Fig. 3c), where two consecutive trimers exhibit a right-handed twist, coiling around a pseudohelical axis. Three trimers arch around this axis and form one full turn of the helix (total of nine monomers per turn; Fig. 3d). A trimeric form of nucleocapsid has been observed for full-length N proteins of the MHV A59 strain in RNA-protein overlay blot assays on a nonreduced preparation of purified virions (46). The relationship between these in vivo observations and the crystal packing we observed, however, remains unclear in the absence of bound RNA in our structure. Nonetheless, the possibility that this helical arrangement might be of physiological relevance cannot be ignored in light of the observation of a similar tubular mode of crystal packing highlighted in the IBV N-NTD structure by Jayaram et al. (21).
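The counting behind the pseudohelix in the cubic form can be written out explicitly. The sketch below is an idealization: it assumes that successive trimers are related by equal angular steps around the axis, which the text does not state, and the function name is hypothetical.

```python
def helical_turn_geometry(trimers_per_turn=3, monomers_per_trimer=3):
    """Return (monomers per turn, rotation per trimer in degrees,
    average rotation per monomer in degrees) for an idealized helix
    in which successive units are related by equal angular steps."""
    if trimers_per_turn <= 0 or monomers_per_trimer <= 0:
        raise ValueError("counts must be positive")
    monomers_per_turn = trimers_per_turn * monomers_per_trimer
    rotation_per_trimer = 360.0 / trimers_per_turn
    rotation_per_monomer = 360.0 / monomers_per_turn
    return monomers_per_turn, rotation_per_trimer, rotation_per_monomer


if __name__ == "__main__":
    monomers, per_trimer, per_monomer = helical_turn_geometry()
    print(f"{monomers} monomers per turn")
    print(f"{per_trimer:.0f} degrees between consecutive trimers (idealized)")
    print(f"{per_monomer:.0f} degrees per monomer on average (idealized)")
```

With three trimers per turn this reproduces the nine monomers per turn described above; the angular values follow only from the equal-step assumption and are not measurements from the crystal structure.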
FIG. 3. Crystal packing in the two crystal forms of SARS-CoV N-NTD. (a) Side-on view of three crystallographically related monomers showing stacking interactions in the monoclinic form. (b) Larger end-on view of the same crystal showing two primary modes of packing between the β-sheet cores (green and blue monomers on the left) and the protruding hairpins of adjacent monomers (yellow and brown monomers in the middle). (c and d) Three symmetry-related monomers viewed along a threefold axis of the cubic crystal form (c) and a zoomed-out stereo view showing one turn along the helical axis of the cubic form (d). Equivalent trimers are labeled A, B, and C and colored green, red, and blue.
FIG. 4. Electrostatic charge distribution on the surfaces of homology models of coronaviral N-NTDs. As in Fig. 2b, the values range from -5 kT (red) to 0 (white) and to +5 kT (blue), where k is the Boltzmann constant and T is the temperature. The sequences and database accession numbers that were used as templates are the same as in Fig. 1. The boundaries of the models correspond to the regions that align with that of the SARS construct as shown in Fig. 1c.
29-kb) SARS-CoV genome into newly formed virion spherules approximately 82 to 120 nm in size (14, 45) necessitates an extremely well-packed, largely helical, supercoiling of the nucleic acid within the RNP core. Mature virions are thought to have about 50 to 100 copies of spike trimers and ca. 200 to 400 copies of N in the membrane-proximal region arranged in a paracrystalline lattice. Our recent cryoEM study on the supramolecular organization of the structural proteins on the coats of both SARS-CoV and feline-CoV showed that the RNP appears as punctate electron-dense features that are clearly associated with M protein and organized as a linear S-M-RNP layer (outside to inside [45]). Our results and those reported by Risco et al. (47) suggest a two-layered organization wherein thread-like densities project from the inner face of the top S-M layer into the two-dimensionally ordered quasilattice of the RNP layer. If the trimeric helical arrangement of N-NTD seen in the cubic form is indeed a physiologically relevant form of the RNP, it is likely to be oriented toward the other face of the quasilattice. This orientation would enable the terminal residue of the N-CTDs to adhere to M, thus anchoring the two layers. The absence of structures of the other domains (or of full-length N protein), and especially the presence of the intervening serine-rich domain that is largely unstructured, precludes the development of molecular models that explain the higher-order organization of the RNP.
Virus assembly and maturation. Enveloped viruses use one of three main mechanisms of assembly and budding (reviewed in references 14 and 15). Previously published studies have suggested a process that is independent of functional N protein (26, 38, 55). Experiments on tunicamycin-treated infected cells suggest that the role of spike in the budding process is also limited. Instead, assembly and budding of mature virions appear to be largely driven by correct folding and assembly of M and E proteins. Interference with M-N protein interaction has little effect on the correct incorporation of M protein into the envelope in the early stages of assembly, leading to morphologically indistinguishable virions. An even less-understood process is viral closure or the pinching-off event (17). Nonetheless, it is becoming increasingly clear from multiple independent studies that an ordered lattice formation of the RNP in the immediate vicinity of the luminal face of the virion envelope is integral to coronavirus budding. In SARS, MHV, and TGEV coronaviruses, the predominant forces at play in this region are those between the C-terminal tails of the M and N proteins, with the interacting residues of the M protein coming from its C-terminal luminal domain (residues 194 to 205 in the case of SARS-CoV [12]). Since the last few residues from N-CTD (the last residue being the most important) are thought to play the main anchoring role in the N-M layer, there is increasing consensus that, within the RNP, the CTDs of individual N monomers are oriented such that their C-terminal tails point toward the envelope (45). However, determining both the position and the orientation of the NTD remains nontrivial because of the fibrous organization of the helical RNP and the complex curved path that a fully assembled RNP traverses within the viral lumen. Further studies on the full-length N protein and complementation studies between the NTDs and CTDs of N protein are needed to understand the interplay between these two domains within the N-M layer of coronaviruses.
Conclusion. This study describes the high-resolution structures of two crystal forms of the N-terminal RNA-binding domain of SARS-CoV N protein. Structure analysis in the context of ribonucleocapsid assembly of SARS-CoV, IBV, and porcine reproductive and respiratory syndrome virus hints at both common features and differences in the ribonucleocapsid assembly of these three closely related Nidovirales members. The high degree of similarity of SARS-CoV N-NTD with other coronaviral N-NTDs compared to the IBV homolog has allowed the construction of accurate homology models. The lack of conserved electrostatic profiles in the RNA binding groove in these homology models suggests the use of disparate mechanisms for RNA recognition and RNP assembly by different coronaviruses. In conjunction with the structures of N-CTD oligomerization domains, these results are beginning to provide important insights into generic and unique aspects of coronaviral ribonucleocapsid assembly and set the stage for further structural studies on full-length N proteins by cryoEM and related techniques, which would hopefully shed further light on this very important aspect of coronaviral genome assembly and packaging.
This study was supported by National Institute of Allergy and Infectious Diseases/NIH contract HHSN 266200400058C (Functional and Structural Proteomics of the SARS-CoV) to P.K.
Published ahead of print on 17 January 2007.
K.S.S. and J.S.J. contributed equally to this study.