Journal of Virology, November 2005, p. 13685-13693, Vol. 79, No. 21
0022-538X/05/$08.00+0 doi:10.1128/JVI.79.21.13685-13693.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
Department of Life Science, Tokyo Institute of Technology, Yokohama 226-8501,1 Department of Virology II, National Institute of Infectious Diseases, Shinjuku, Tokyo 162-8640,2 RIKEN Harima Institute/SPring-8, Mikazuki, Hyogo 679-5148, Japan3
Received 11 May 2005/ Accepted 9 August 2005
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
The NV genome, which consists of positive-sense, single-stranded RNA, contains three open reading frames (ORFs). ORF1 encodes a large polyprotein, the nonstructural protein, which is probably processed intracellularly into six proteins by the viral 3C-like protease (6). ORF2 encodes a capsid protein (the major structural protein), of which 180 molecules form a regular icosahedral virion (45). ORF3 encodes a small basic protein, which is probably the minor structural protein (20). The six NV ORF1 nonstructural proteins are homologous to picornaviral nonstructural proteins and are named accordingly: N-terminal protein, 2C-like nucleoside triphosphatase, 3A-like protein, 3B VPg (genome-linked viral protein), 3C-like protease, and 3D RNA-dependent RNA polymerase (3Dpol). NV 3B VPg binds to the 5' end of the genome and to eukaryotic initiation factors such as eIF3 (15). The protein may function like the 5'-cap structure of eukaryotic mRNAs involved in translation initiation. NV 3Dpol is an RNA-dependent RNA polymerase which is unique in that its polymerase activity is not poly(A) tail/primer dependent (19). 3C-like proteases are the key enzymes for ORF1 polyprotein processing (6, 37, 49, 50) and also cleave the poly(A)-binding protein, causing cellular translation inhibition (34). NV 3C-like proteases (NV 3Cpro) belong to the chymotrypsin-like protease family, in that they appear to have chymotrypsin-like folds. Picornaviral 3C proteases are also members of the chymotrypsin-like family (17). Probably, most of these chymotrypsin-like viral proteases have a cysteine, rather than a serine, as the active site nucleophile. By individual replacement of the charged residues and the putative active site cysteine of the 3Cpro from the norovirus Chiba virus (genogroup I strain) with alanines, His30 and Cys139 were identified as a catalytic dyad, which functions without active participation of an active site carboxyl moiety (51). 
An active site acidic residue seems nonessential for activity of this 3Cpro, although a Glu54-to-Gly mutation abolishes protease activity (26). The results of the mutagenesis study identified five additional residues (Arg8, Lys88, Arg89, Asp138, and His157) as indispensable for protease activity, but their precise roles were not ascertained (51). Characterization of the protease tertiary structure should clarify the functional roles of the residues identified as essential to activity by mutagenesis.
For the study reported herein, we determined the first crystal structure of an NV 3Cpro (at 2.8-Å resolution) using the Chiba virus 3C-like protease (CVP) (50) as the paradigm. The Chiba virus was isolated from a patient with gastroenteritis acquired during the 1987 oyster-associated outbreak in Chiba Prefecture, Japan (56). CVP, which is released from the ORF1 polyprotein by autoproteolysis, has 181 amino acid residues (19.4 kDa). The CVP amino acid sequence is homologous to those of the picornavirus and coronavirus proteases, as well as to those of the proteases of other noroviruses.
Inspection of the CVP structure clarifies the catalytic mechanism and the interpretation of a mutagenesis study (51). We compared the CVP structure with the structures of proteases from poliovirus (PV) (40), human rhinovirus (HRV) (38), hepatitis A virus (HAV) (1, 7), coronavirus (2, 3), and severe acute respiratory syndrome coronavirus (58), as well as the HRV 2A protease structure (42). We expect that drug development for NV-associated gastroenteritis and other diseases caused by the viruses which have the proteases of this group will be facilitated by a global study of the available protease crystal structures.
| MATERIALS AND METHODS |
|---|
Diffraction data collection and processing. For multiwavelength anomalous dispersion (MAD) data, mercury derivatives were prepared by soaking native crystals in 1 mM mercury chloride crystallization buffer for 24 h at 20°C or by cocrystallization of the protein with a mercury derivative in a solution similar to that used to prepare the underivatized crystal.
X-ray diffraction data were collected at SPring-8 (Hyogo, Japan). All data were collected using crystals flash cooled at 100 K, which were prepared by rapidly transferring the crystal into a cryoprotectant containing up to 30% (vol/vol) glycerol. MAD data were collected at four wavelengths at beamline BL26B2, which was equipped with a Jupiter charge-coupled device detector (Rigaku MSC). MAD data were collected for the f" maximum, for the f' minimum, and for remote wavelengths below and above the Hg LIII edge. All data were processed using the program CrystalClear. Subsequent calculations were performed using the CCP4 suite of programs (13).
Structure determination. The Hg MAD data were treated as four Hg derivatives, each with an anomalous scattering contribution. All data sets were scaled to a common level using the CCP4 program SCALEIT. Six of the eight mercury atom positions and the initial phases were identified using SOLVE. Next, the minor sites were identified using the initial phases. Then, the phases were refined using the positions of the eight Hg atoms and the program MLPHARE (13, 53, 54). Additional density modification was carried out in several steps using the CCP4 program DM (14). After the noncrystallographic symmetry operators were determined from the eight Hg positions, solvent flattening, histogram matching, and averaging with three twofold operators were performed. Density modification, as carried out originally by the program DM, was repeated, and the figure of merit was forced to decrease 0.4-fold. These operations were repeated three times. The electron density map showed almost the entire polypeptide chain for each of the four molecules of the asymmetric unit. The initial model was built using XtalView (39). The model was refined at 2.8-Å resolution using Refmac5 and CNS. The crystallographic R factor converged to a value of 0.208, with an associated Rfree of 0.240 (10, 41). Data quality and refinement statistics are given in Tables 1 and 2, respectively. The quality of the structural model and its agreement with the structure factors were checked using the programs PROCHECK, PROMOTIF, and SFCHECK.
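For readers unfamiliar with the R factor quoted above, the following minimal sketch shows the conventional definition, R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|. The amplitudes are invented toy numbers, the function name is hypothetical, and the observed and calculated amplitudes are assumed to be on a common scale already; this is not the procedure implemented in Refmac5 or CNS. Rfree is the same quantity evaluated on reflections held out of refinement.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R factor for paired amplitude lists.

    Assumes f_obs and f_calc are already on a common scale.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("amplitude lists must have equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

# Toy amplitudes (hypothetical, not from the 1WQS data).
toy_f_obs = [100.0, 50.0, 25.0, 25.0]
toy_f_calc = [90.0, 55.0, 30.0, 20.0]

print(f"R = {r_factor(toy_f_obs, toy_f_calc):.3f}")
```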
Figures were made using PyMOL (16), DINO (http://www.dino3d.org), MSMS (18), MEAD (4), and Secseq.
Protein structure accession number. The coordinates and structure factors were deposited in the Protein Data Bank (entry 1WQS).
| RESULTS AND DISCUSSION |
|---|
α-helices and seven β-strands (aI, bI, cI, dI, eI, fI, and gI). The β-strands form a twisted antiparallel β-sheet resembling an incomplete β-barrel. N-terminal domain antiparallel β-barrels exist in the viral chymotrypsin-like cysteine proteases, including HAV 3Cpro (1, 7), foot-and-mouth disease virus 3Cpro (8), tobacco etch virus protease (43), HRV 3Cpro (11, 38), and PV 3Cpro (9, 24, 28, 36, 40), and in the coronavirus 3C-like proteases (also known as main proteases, Mpro [2, 3, 58]). However, the CVP N-terminal domain incomplete β-barrel is intermediate in structure between the four N-terminal domain β-strands of HRV 2Apro (42) and the corresponding β-barrels of other chymotrypsin-like proteases. The core of the CVP incomplete β-barrel contains the hydrophobic residues Phe12, Phe25, Phe39, Phe40, Phe58, Phe60, Trp19, Ile32, Ile44, Ile47, and Ile49 (see Fig. S1 in the supplemental material). The active site residue His30 is found in the N-terminal domain. The C-terminal domain is a six-stranded antiparallel β-barrel, formed by strands aII, bII, cII, dII, eII, and fII. The active site Cys139 is found in the C-terminal domain. The catalytic site is situated deep within a cleft between the N- and C-terminal domains.
When the four monomers of the asymmetric unit are compared, several structural features are noteworthy. The root mean square deviation (RMSD) for the positions of 71 core Cα atoms, with four equivalent atoms per asymmetric unit, is 0.33 (±0.02) Å, whereas the RMSD for 162 equivalent Cα atoms is 0.64 (±0.10) Å (cutoff, 2.0 Å). The RMSDs given above and below, with the associated uncertainties reported as standard deviations, are the averages of the RMSDs for the coordinates of the included Cα positions. Excluding residues sequentially adjacent to the N and C termini, the largest conformational variations are found for (i) a flexible surface loop (residues 33 to 36; RMSD, 1.68 ± 1.09 Å) following the dI strand, (ii) a hairpin loop (residues 107 to 113; RMSD, 1.33 ± 0.62 Å), (iii) a long loop (residues 124 to 132; RMSD, 2.10 ± 0.98 Å), (iv) a loop (residues 147 to 150; RMSD, 1.17 ± 0.55 Å), and (v) a β-hairpin loop (residues 162 to 164; RMSD, 1.42 ± 0.89 Å) (Fig. 2). These loops are all solvent accessible and probably flexible. The C-terminal residues, 174 to 181, could not be traced in the electron density map for any of the four crystallographically independent molecules, perhaps because of disorder.
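One plausible way to tabulate statistics of this kind is sketched below: the Cα RMSD is computed for every pair among four chains and summarized as a mean with a sample standard deviation. This is only an illustration under stated assumptions. The coordinates are made-up toy values rather than those of entry 1WQS, the function names are hypothetical, the chains are assumed to be superposed already, and the article does not state that its averages were formed over pairs in exactly this way.

```python
import math
from itertools import combinations
from statistics import mean, stdev

def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) points (already superposed)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have equal length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(coords_a, coords_b)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(coords_a))

def pairwise_rmsd_stats(chains):
    """Mean, sample standard deviation, and count of RMSDs over all chain pairs."""
    values = [rmsd(chains[i], chains[j]) for i, j in combinations(sorted(chains), 2)]
    return mean(values), stdev(values), len(values)

# Toy coordinates (hypothetical): three C-alpha atoms for each of chains A to D.
toy_chains = {
    "A": [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)],
    "B": [(0.1, 0.0, 0.0), (3.9, 0.1, 0.0), (7.6, 0.2, 0.0)],
    "C": [(0.0, 0.2, 0.0), (3.8, 0.0, 0.1), (7.5, 0.0, 0.0)],
    "D": [(0.3, 0.0, 0.0), (3.8, 0.3, 0.0), (7.6, 0.0, 0.3)],
}

avg, sd, n_pairs = pairwise_rmsd_stats(toy_chains)
print(f"{n_pairs} pairs: RMSD = {avg:.2f} (±{sd:.2f}) Å")
```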
α-helix that follows strand cI, and Glu54 is part of a loop connecting strands fI and gI. Cys139 is part of a hairpin turn between an α-helix (Pro136-Asp138) and strand dII. His30, Glu54, and Cys139 are conserved in all NV 3C-like proteases (Fig. 3). Mutagenesis of His30 to other residues always caused the enzyme to lose activity, indicating that His30 is indispensable (51). A mutant Ser139 CVP retains activity, whereas introduction of a threonine, tyrosine, or methionine at position 139 abolishes activity (51). Cys139 and His30 seem to work as the nucleophilic residue and the general acid-base catalyst residue, respectively. On the other hand, the replacement of Glu54 by an alanine does not significantly affect protease activity. Only two residues, Cys139 and His30, are essential for CVP activity; although it is not essential, Glu54 seems to fulfill an additional role in activity. The situation for trypsin is different. In that case, if the active site aspartic acid is replaced by an asparagine, activity is severely diminished (52). The pKa of a cysteine thiol is usually about 8.3, whereas that of a serine hydroxyl is about 13. Therefore, a cysteine thiol will ionize more readily than will a serine hydroxyl. This difference in ionization strengths probably accounts for the difference in the requirement for a catalytic carboxyl moiety. In addition, it suggests the possibility that the His30 imidazolium might exist and that the chemical state of the Cys139-His30 interaction might be a thiolate-imidazolium ion pair, which is also found in papain-like proteases (44). For HAV 3Cpro, although the position of Asp84 is spatially equivalent to the third triad member, its side chain points away from the general base His44 (7). Additionally, for coronavirus Mpro, the residue equivalent to Glu54 is Val84. Its aliphatic side chain obviously cannot participate in catalysis (2), and a water molecule occupies the position of the third catalytic residue of the conventional triad. A catalytic dyad must be operational in Mpro.
Therefore, in general, a carboxyl moiety may not be necessary for catalysis by viral chymotrypsin-like cysteine proteases. For some proteases, an asparagine residue occupies the position corresponding to the active site carboxyl moiety of chymotrypsin-like proteases and can only hydrogen bond with the active site histidine. Sárkány et al. (47) suggested that a catalytically competent thiolate-imidazolium ion pair, not an imidazole general base catalyst, functions in the HRV 2Apro catalytic mechanism. While, in the C139S mutant, His30 is likely to work as a general base, rather than an imidazolium ion, it is still uncertain whether His30 works as a general base or as an imidazolium ion in native CVP, which has Cys139 (51). Kinetic and mutagenesis studies are needed to confirm that a third member is not necessary for the activity and to confirm the possibility of a CVP thiolate-imidazolium ion pair.
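The pKa argument made above can be put into rough numbers with the Henderson-Hasselbalch relation, as in the sketch below. The pKa values are the free-solution figures quoted in the text (about 8.3 and about 13); the pH of 7.0 is an assumption chosen only for illustration, and pKa values inside an enzyme active site can be shifted substantially, so the output should not be read as a statement about CVP itself.

```python
def fraction_deprotonated(pka, ph):
    """Henderson-Hasselbalch: fraction of an acid present as its conjugate base."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))

PH = 7.0  # assumed pH, for illustration only

for name, pka in (("cysteine thiol", 8.3), ("serine hydroxyl", 13.0)):
    fraction = fraction_deprotonated(pka, PH)
    print(f"{name}: pKa {pka}, fraction ionized at pH {PH} = {fraction:.2e}")
```

Under these assumptions, roughly 5% of a free thiol is ionized at neutral pH, against about one part in a million for a serine hydroxyl, which is the sense in which a thiol "ionizes more readily".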
Nε2 of His30 is 3.4 Å in molecules B and C, which is that expected theoretically (23) and is similar to the distances measured for HRV 3Cpro (3.5 Å [38]) and for PV 3Cpro (3.4 Å [40]). The Cys sulfur-His Nε2 distance in molecule A (3.2 Å) is a little shorter than the distances found for the other molecules of the asymmetric unit. Tartrate was included in the crystallization buffer, and electron density corresponding to tartrate is found between Cys139 and His30 of molecule A (see Fig. S4 in the supplemental material). It is possible that the interaction between molecule A and tartrate mimics the substrate binding state. For molecule D, the Cys sulfur-His Nε2 distance (approximately 4 Å) is longer than average. Coronavirus Mpro and some of the HAV 3Cpro structures represent inactivated enzymes, and the separation between their catalytic Cys sulfur and His Nε2 increased when the active site sulfurs were oxidized to either sulfinic acid or sulfonic acid during crystallization.
As noted above, although Glu54 is not essential for CVP activity, it is plausible that Glu54 decreases the range of motion for His30 if a negatively charged carboxylate and the positively charged imidazolium interact. The average distances between the Glu54 carboxyl oxygen and the His30 Nδ1 of molecules A, B, C, and D are 3.4 Å, 2.8 Å, 2.8 Å, and 4.1 Å, respectively; many of the values are within hydrogen bonding distance. Therefore, as indicated in the work of Someya et al. (51), CVP could be more active with Glu54 than without Glu54.
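The comparison of these distances with a hydrogen-bonding criterion can be made explicit as follows. The four distances are those quoted in the text; the 3.5-Å donor-acceptor cutoff is an assumption introduced here for illustration (the article does not state the criterion it used), and the function name is hypothetical.

```python
# Glu54 carboxyl O to His30 N-delta-1 distances (Å) quoted in the text, per molecule.
distances = {"A": 3.4, "B": 2.8, "C": 2.8, "D": 4.1}

HBOND_CUTOFF = 3.5  # Å; assumed cutoff, not taken from the article

def within_hbond_distance(distance, cutoff=HBOND_CUTOFF):
    """True if a donor-acceptor distance is at or below the assumed cutoff."""
    return distance <= cutoff

for molecule, distance in distances.items():
    verdict = "within" if within_hbond_distance(distance) else "outside"
    print(f"molecule {molecule}: {distance:.1f} Å, {verdict} the {HBOND_CUTOFF} Å cutoff")
```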
Examination of the CVP crystal structure shows that several hydrogen bonds help maintain the integrity of the active site and/or substrate binding site. These hydrogen bonds include one between the side chain nitrogen of Lys88 and the backbone carbonyl oxygen of Val9 and a second one between the guanidinium nitrogen of Arg8 and the backbone carbonyl oxygen of Thr69 (Fig. 1D). These hydrogen bonds involve residues bridging the N- and C-terminal domains. Identification of these hydrogen bonds clarifies why Arg8 and Lys88 are conserved in NV 3Cpros and why, when they are mutated to other residues in CVP, activity is lost (51). An interaction between the Asp90 side chain oxygens and an Arg11 guanidinium nitrogen helps orient the N- and C-terminal domains. Mutants with either the first five residues or the first eight residues deleted retain a low level of activity, whereas a mutant missing the first 11 residues is inactive (51). Therefore, the hydrogen bond made by the main chain oxygen of Val9 and the side chain amino group of Lys88 appears vital for activity. The integrity of the active site and/or of the substrate binding site, and indeed of the entire molecule, may depend heavily on the hydrogen bonds described above.
Oxyanion hole. The amino acid sequence Gly-Xaa-Ser/Cys-Gly, which includes the active site nucleophile (Ser/Cys), is highly conserved in chymotrypsin-like proteases (5). Aspartic acid occupies the Xaa position in most eukaryotic proteases, such as chymotrypsin and trypsin; in bacterial serine proteases; and in viral 2A cysteine proteases. However, for viral 3C or 3C-like cysteine proteases, the residue occupying the Xaa position varies, with Asp, Gln, Met, Tyr, or Trp often present. For CVP, the Gly-Asp-Cys-Gly motif forms the pocket of the oxyanion hole. The oxyanion hole helps to bind the substrate tightly and to stabilize the tetrahedral transition state by hydrogen bonding with the negatively charged P1-backbone carbonyl oxygen.
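As a small illustration, the Gly-Xaa-Ser/Cys-Gly motif can be located in a one-letter sequence with a regular expression. The sketch below is not part of the original analysis: the function name is hypothetical and the two test strings are invented fragments, not the CVP or chymotrypsin sequences. A lookahead is used so that overlapping occurrences would also be reported.

```python
import re

def find_nucleophile_motifs(sequence):
    """Find Gly-Xaa-Ser/Cys-Gly motifs in a one-letter amino acid sequence.

    Returns a list of (1-based position of the Ser/Cys, Xaa residue, nucleophile).
    """
    hits = []
    # Zero-width lookahead: G, any residue (Xaa), S or C, then G.
    for match in re.finditer(r"(?=G(.)([SC])G)", sequence):
        nucleophile_position = match.start() + 3  # 0-based index + 2, made 1-based
        hits.append((nucleophile_position, match.group(1), match.group(2)))
    return hits

# Invented fragments for illustration only.
print(find_nucleophile_motifs("AAGDCGAA"))  # cysteine-type motif
print(find_nucleophile_motifs("MGDSGGPL"))  # serine-type motif
```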
The CVP oxyanion hole is formed by a region preceding Cys139 and two backbone amides (those of Gly137 and Cys139), which point towards the P1-carbonyl oxygen. Additionally, a side chain Asp138 oxygen clearly forms a good hydrogen bond with the Arg89 side chain nitrogen (Fig. 1E) as reflected by the well-ordered corresponding regions of the electron density map. These two residues are conserved in all noroviruses, and no other residues tested could replace either Arg89 or Asp138 (51). Neither an R89K mutant nor a D138N mutant was active, although the R89K mutant had a positive charge at position 89 and the D138N should have been able to form a hydrogen bond between residues 89 and 138. Clearly, Arg89 and Asp138 are essential for activity and function by stabilizing the oxyanion hole.
In molecule A, the oxyanion hole is large enough to accommodate the main chain carbonyl oxygen of the P1 residue. However, the oxyanion holes of molecules B, C, and D are much narrower than that of molecule A. The volume differences are attributed to the conformations of Pro136 and Gly137 (Fig. 1F). The Pro136 carbonyl oxygen is displaced to the exterior of the molecule A oxyanion hole, whereas, for the other molecules, the oxygen faces inward. In the electron density maps, the different orientations could be clearly observed. Recall that a tartrate was found only in the active site of molecule A and mimics a substrate binding state. Therefore, in the absence of substrate, the oxyanion hole is stabilized by an electrostatic interaction between the carbonyl oxygen of Pro136 and the numerous peptide amides that line the hole, plus the aforementioned interaction involving Asp138/Arg89. Perhaps substrate binding reduces the structural flexibility of the main chain of Pro136 and Gly137, so that the carbonyl oxygen of Pro136 rotates to the exterior of the oxyanion hole by almost 180 degrees, while the Gly137 backbone amide turns inward. A similar effect of substrate binding on the oxyanion hole structure has been deduced for HRV 3Cpro (38).
Substrate binding site. We built a model of an oligopeptide bound to CVP. This peptide, Glu-Thr-Thr-Leu-Glu*Gly-Gly-Asp, corresponds to P5-P3' of a substrate, and the asterisk marks the scissile bond. The notation Pn-P1-P1'-Pn' for the residues of substrate or inhibitor is that of Schechter and Berger (48). (P1-P1' are the scissile bond residues.) We felt that a peptide/substrate model would increase our understanding of CVP substrate specificity. To model the peptide/CVP complex, we consulted various protease/inhibitor complex studies performed previously and used, as the starting point for the substrate conformation, the conformation of residues 14 to 21 of OMTKY3 when bound to SGPB (46) (see Materials and Methods for modeling details). For the peptide/CVP model, canonical substrate-binding interactions exist, including interactions between the enzyme and the substrate P4-P2 residues. In order to confirm the validity of this model, we determined the CVP-substrate complex crystal structure (H-Glu-Ala-Leu-Phe-Gln-pNA [Bachem] was used as a substrate). As a result, although the resolution is low, the determined electron density maps indicate the main chain of the substrate and support this substrate binding model (data not shown; see Fig. S5 in the supplemental material). Figure 4C portrays the docked substrate, with the CVP oxyanion hole and eII strand visible. The substrate fills the CVP binding pocket of molecule A without any severe van der Waals clashes. However, for molecules B, C, and D, the P1 amino acid is too close to the Pro136 carbonyl oxygen to fit in the substrate binding pocket. The inability of molecules B, C, and D to accommodate the P1 residue is a reflection of the Pro136 carbonyl oxygen position, which, as noted previously, protrudes into the oxyanion hole. The oxyanion hole of molecule A in the substrate binding model is shown in Fig. 4B.
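The Schechter and Berger labels for the modeled peptide can be generated mechanically, as the sketch below shows. The peptide string is the one given in the text; the function name and the use of "-" and "*" as separators are conventions adopted here only for illustration.

```python
def label_subsites(peptide, scissile="*"):
    """Assign Schechter-Berger labels to a peptide written with three-letter codes.

    Residues are separated by '-', and `scissile` marks the cleaved bond.
    Residues N-terminal to the bond are P1, P2, ... counting away from it;
    residues C-terminal to the bond are P1', P2', ... counting away from it.
    """
    n_side, c_side = peptide.split(scissile)
    labels = {}
    for i, residue in enumerate(reversed(n_side.split("-")), start=1):
        labels[f"P{i}"] = residue
    for i, residue in enumerate(c_side.split("-"), start=1):
        labels[f"P{i}'"] = residue
    return labels

model_peptide = label_subsites("Glu-Thr-Thr-Leu-Glu*Gly-Gly-Asp")
for label, residue in model_peptide.items():
    print(label, residue)
```

For this peptide the labels run from P5 (Glu) to P3' (Asp), with Glu at P1 and Gly at P1', in agreement with the description above.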
The S1 and S2 sites. The residues at NV ORF1-polyprotein cleavage sites are glutamine or glutamic acid at the P1 site and glycine or alanine at the P1' site. No other residues occupy these sites (Fig. 3). His157, which is part of the S1 site, is important to substrate binding (Fig. 4C). Mutagenesis of His157 to any other tested residue severely reduces activity (51). For HRV 3Cpro, HAV 3Cpro, PV 3Cpro, and coronavirus Mpro, there are histidines at the S1 sites and their imidazoles interact with substrate P1 carboxamide side chains (1, 2, 3, 7, 38, 40, 58).
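The P1/P1' rule stated above (Gln or Glu at P1, Gly or Ala at P1') can be written as a simple sequence scan. The sketch below is illustrative only: the function name is hypothetical and the test sequence is invented, not part of any NV ORF1 polyprotein. Matching this rule is a necessary condition taken from the text, not a sufficient predictor of cleavage, since other subsites such as P2 also contribute.

```python
P1_ALLOWED = {"Q", "E"}        # glutamine or glutamic acid at P1
P1_PRIME_ALLOWED = {"G", "A"}  # glycine or alanine at P1'

def candidate_cleavage_sites(sequence):
    """Return 1-based positions of P1 residues followed by a permitted P1' residue."""
    return [
        i + 1
        for i in range(len(sequence) - 1)
        if sequence[i] in P1_ALLOWED and sequence[i + 1] in P1_PRIME_ALLOWED
    ]

# Invented one-letter sequence for illustration only.
toy_sequence = "MTTLEGGDKAFQAPLK"
print(candidate_cleavage_sites(toy_sequence))
```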
In the CVP/peptide model, the His157 imidazole is positioned to interact with the P1 glutamine side chain. His157 is centered in a hydrophobic pocket, which is composed of Ile85, Ile87, Leu97, Val99, Leu121, Pro136, Tyr143, Ala160, and Val167. Together, the Leu135 backbone, the Pro136 pyrrolidine ring, and the Ala160 methyl form the entrance to the S1 pocket. The His157 imidazole is held in the proper position by a hydrogen bond with the buried Tyr143 hydroxyl, which has no other hydrogen-bonding partner. The average His157 Nδ1-Tyr143 Oη distance is 2.77 Å ± 0.12 Å (see Fig. S3 in the supplemental material). NV 3Cpro S1 pocket residues are highly conserved, and Tyr143 is always present. The S1 site histidines are similarly stabilized in other 3C or 3C-like proteases. For PV 3Cpro, the hydroxyl of Tyr138 hydrogen bonds to the S1 His161 imidazole, and for coronavirus Mpro, the analogous interaction is between the side chains of Tyr160 and His162. For HAV 3Cpro, Glu132 interacts through two water molecules with the S1 His191. Examination of the CVP/peptide model shows that the oxygen of the P1 glutamine side chain can hydrogen bond with the hydroxyl of Thr134, which is a residue that is conserved in all noroviruses (Fig. 4C). The distance between the two atoms is 2.7 Å. A P1 residue may be stabilized by both His157 and Thr134 in the NV 3Cpros. In addition, as mentioned above, NVs have glutamic acid or glutamine at the P1 site, but PV, HAV, HRV, and coronavirus have only a glutamine at the P1 site. A comparison of the CVP structure with those of the other 3Cpros revealed two major differences. The first is the size of the S1 site. The S1 site of CVP is larger than those of other viral 3Cpros; therefore, the χ1 of the P1 residue seems to rotate easily. The second is Asp138 of CVP. Although the other residues of the S1 site are similar to those of other 3Cpros, CVP Asp138, which faces His157, is different: other viral 3Cpros have Gln at this position. If χ1 of the P1 residue rotates, Asp138 of CVP seems to be able to receive the side chain of the P1 glutamine and form a hydrogen bond more readily than the Gln of other viral 3Cpros. Additional structural and biochemical studies are required to test this speculation.
A bulky hydrophobic amino acid, such as Leu, Met, or Phe, preferentially occupies the P2 positions of the polyprotein cleavage sites. The S2 site is also a hydrophobic pocket. It consists of the Ile109, Val114, and Ala159 side chains and two carbon atoms of Arg112. The S2 pocket is large enough to accommodate a bulky P2 side chain (Fig. 4C).
Conclusions. Examination of the CVP crystal structure identified the following important structural features: (i) in its overall structure, CVP, which has a chymotrypsin-like fold, resembles other viral 3C, 3C-like, and 2A proteases; (ii) the active site is located in a deep cleft between the N- and C-terminal domains and includes His30, Cys139, and the nonessential Glu54; (iii) on the basis of the substrate binding model, substrate/enzyme binding involves antiparallel β-strand interactions and interactions of the substrate P1 and P2 residues with the enzyme S1 specificity site His157 and the hydrophobic S2 pocket, respectively; (iv) a hydrogen bond network, which is formed by several residues identified as essential by mutagenesis, contributes to the structural integrity of the active site and the domains; and (v) the backbone amides of the oxyanion hole differ in conformation among the molecules, probably as a result of substrate binding.
We believe that the CVP crystal structure is an appropriate model for structural studies involving other viral 3C or 3C-like proteases. A global comparison of viral 3Cpros would be especially useful if it led to drug development for nonbacterial acute gastroenteritis and other diseases associated with viruses expressing 3Cpro.
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
Supplemental material for this article may be found at http://jvi.asm.org/.
| REFERENCES |
|---|