Journal of Virology, January 2003, p. 1415-1426, Vol. 77, No. 2
0022-538X/03/$08.00+0 DOI: 10.1128/JVI.77.2.1415-1426.2003
Copyright © 2003, American Society for Microbiology. All Rights Reserved.
Institute of Virology and Immunology, University of Würzburg, Würzburg, Germany,1 Cooperative Research Center for Aquaculture, CSIRO Livestock Industries, Long Pocket Laboratories, Indooroopilly, Australia,2 Department of Medical Microbiology, Center of Infectious Diseases, Leiden University Medical Center, Leiden, The Netherlands3
Received 27 June 2002/ Accepted 15 October 2002
|
|
|---|
VRTGN2836. The trans-processing activity of the purified recombinant 3CLpro (pp1a residues 2832 to 3126) was used to identify another cleavage site, 6441KVNHE↓LYHVA6450, in the C-terminal pp1ab region. Taken together, the data tentatively identify VxHE↓(L,V) as the substrate consensus sequence for the GAV 3CLpro. The study revealed that the GAV and potyvirus 3CLpros possess similar substrate specificities which correlate with structural similarities in their respective substrate-binding sites, identified in sequence comparisons. Analysis of the proteolytic activities of MBP-3CLpro fusion proteins carrying replacements of putative active-site residues provided evidence that, in contrast to most other 3C/3CLpros but in common with coronavirus 3CLpros, the GAV 3CLpro employs a Cys2968-His2879 catalytic dyad. The properties of the GAV 3CLpro define a novel RNA virus proteinase variant that bridges the gap between the distantly related chymotrypsin-like cysteine proteinases of coronaviruses and potyviruses. |
|
|---|
GAV and yellow head virus have recently been placed in a new genus, Okavirus, within a new family, Roniviridae (8, 11), that, together with the Coronaviridae and Arteriviridae, forms the order Nidovirales (6, 12). The phylogenetic relationship between GAV and nidoviruses became evident from comparative sequence analyses of the 20-kb 5'-terminal region of the GAV genome (8), which revealed striking similarities in the organization and expression of the viral replicase genes. In common with nidoviruses, the 5'-terminal replicase gene of GAV encodes two large open reading frames, ORF1a and ORF1b, comprising 12,248 and 7,941 nucleotides, respectively. In vitro data also demonstrated that the downstream ORF1b, which overlaps ORF1a by 99 nucleotides, is expressed by ribosomal frameshifting, as in all nidoviruses. Most probably, slippage into the -1 frame occurs at the sequence 12215AAAUUUU12221 and involves an RNA pseudoknot located immediately downstream of this slippery sequence (8). Accordingly, ORFs 1a and 1b are translated as two polyproteins, pp1a (460 kDa) and its C-terminally extended form, pp1ab (758 kDa), which are expected to mediate the functions required for genome replication and transcription of a 3'-coterminal nested set of subgenomic mRNAs encoding the viral structural proteins (9).
Comparative sequence analysis revealed several putative functional domains in the GAV polyproteins, including helicase and polymerase motifs, ordered similarly to the cognate domains in the viral polyproteins of other nidoviruses (8). This observation, combined with the fact that the GAV polymerase domain contains the SDD motif unique to nidovirus polymerases, strongly suggested that GAV (infecting invertebrates) and nidoviruses (infecting vertebrates) have a common ancestor (14). However, the presence of a number of regions with low sequence similarity in ORF1b and, in particular, the extremely poor pp1a conservation suggested that GAV has diverged significantly from the vertebrate nidoviruses (corona- and arteriviruses). Indeed, the only region in pp1a with significant sequence similarity proved to be a putative chymotrypsin-like (3C-like) proteinase domain (3CLpro), flanked by hydrophobic (probably membrane-spanning) domains.
In vertebrate nidoviruses, the 3CLpro cleaves the viral polyproteins at multiple conserved sites and is responsible for posttranslational release of the key replicative proteins. It has therefore also been referred to as the main proteinase (Mpro) to distinguish it from accessory nidovirus proteinases, which cleave at only a few sites in the N-terminal pp1a/pp1ab regions (51). Although no 3CLpro cleavage sites could be readily predicted in the pp1a/pp1ab polyproteins of this invertebrate nidovirus, it seems likely that this GAV proteinase may have a similar critical role in viral replication, as has been demonstrated conclusively for its vertebrate nidovirus homologs (8, 51). Based on sequence comparisons, it has been proposed that the GAV 3CLpro is distantly related to the main proteinases of arteri- and coronaviruses as well as the NIa proteinases of plant potyviruses, which all have an (E,Q)↓(G,S,A) substrate specificity (8). Throughout this article, amino acid residues flanking the scissile bond (indicated by ↓) are given from N to C terminus in the single-letter code, where x indicates any residue. If various residues are found at a given position, these are listed in parentheses.
In this report, we provide direct evidence for the predicted proteolytic function of GAV 3CLpro. Predictions of putative active-site residues identified by sequence comparisons were substantiated by site-directed mutagenesis, and information on the GAV 3CLpro substrate specificity was obtained. The theoretical and experimental data presented in this study define a new member of the constantly growing group of viral 3C-like proteinases, which may combine the Cys-His catalytic dyad of the main proteinase of coronaviruses with a potyvirus-like substrate-binding pocket.
|
|
|---|
DNA sequences encoding different GAV pp1a/pp1ab regions were amplified by PCR with the primers listed in Table 1. The PCR products were treated with T4 DNA polymerase, phosphorylated with T4 polynucleotide kinase, digested with EcoRI, and inserted into the XmnI and EcoRI sites of pMal-c2 (New England Biolabs, Frankfurt, Germany). The resulting plasmids, which are shown in Table 1, allowed the expression of GAV pp1a/pp1ab sequences fused to the maltose-binding protein (MBP) of Escherichia coli (Fig. 1). Site-directed mutagenesis was done by a recombination-PCR method (19, 47). E. coli TB1 cells transformed with the appropriate pMal-c2 derivatives (Table 1) were grown at 37°C in Luria-Bertani (LB) medium containing 100 µg of ampicillin per ml until they reached a culture density (A595) of 0.6. Expression of the recombinant proteins was induced by addition of 0.5 mM isopropyl-β-D-thiogalactopyranoside (IPTG) for 3 h at 24°C. For analysis of recombinant protein expression, aliquots of the cell cultures were suspended in 2x Laemmli sample buffer and heated at 94°C for 3 min, and the lysates were analyzed by electrophoresis in sodium dodecyl sulfate (SDS)-polyacrylamide gels and Western immunoblotting with standard protocols.
|
TABLE 1. Oligonucleotides used for the amplification or mutagenesis of GAV sequences
|
FIG. 1. Expression of GAV replicase gene. The 20,000-nucleotide gene comprises ORFs 1a and 1b, which occupy the 5'-terminal region of the GAV genome and encode two replicase polyproteins, pp1a and pp1ab. Expression of pp1ab requires a -1 frameshift during translation, which is predicted to be mediated by a slippery heptanucleotide sequence and an RNA pseudoknot structure (8). The primary GAV pp1a/pp1ab-derived protein constructs used in this study are shown schematically. The N- and C-terminal residues of the GAV-specific amino acid sequences are given in the one-letter code. The numbering of pp1a/pp1ab amino acids is based on predictions of the GAV frameshift site, AAAUUUU (nucleotides 12215 to 12221 of the GAV genome) (8) (GenBank accession number AF227196). Fusions of GAV pp1a/pp1ab amino acids with E. coli MBP are indicated. Also, the positions of putative active-site Cys and His residues and the GAV 3CLpro cleavage sites characterized in this study are given (C, H, and E↓V, E↓L, respectively).
|
N-terminal protein sequence analysis. Following SDS-polyacrylamide gel electrophoresis (PAGE), the proteins were transferred to polyvinylidene difluoride membranes (162-0180; Bio-Rad Laboratories, Munich, Germany) and subsequently stained with Coomassie brilliant blue. The membrane regions containing the proteins of interest were isolated as described previously (49), and the proteins were subjected to six cycles of Edman degradation by use of a pulsed-liquid protein sequencer (ABI 467A; Applied Biosystems, Weiterstadt, Germany).
Preparation of antiserum α-MBP-2948-3143. The MBP-2948-3143 fusion protein was purified by amylose affinity chromatography from TB1[pMal-GAV-2948-3143] cells as described above. The protein was cleaved with factor Xa (Amersham Biosciences) and used to immunize rabbits as described previously (49). The antiserum was designated α-MBP-2948-3143.
trans-cleavage assay. Typical 20-µl reaction mixes contained recombinant GAV 3CLpro (2832-3126 or 2832-3126_C2968A) and the substrate protein, MBP-6338-6673 (each at 1.6 µM), in a buffer containing 20 mM Tris-HCl (pH 7.5), 200 mM NaCl, 1 mM EDTA, and 1 mM dithiothreitol. Following incubation at 22°C for 16 h, the reaction products were separated on SDS-15% polyacrylamide gels that were stained with Coomassie brilliant blue R-250.
Computer-aided comparative sequence analyses. Amino acid sequences were derived from the Genpeptides database. 3CLpro sequence alignments were produced with the Clustal X program (42) and the BLOSUM series of interresidue scoring tables (18). The virus interfamily alignments were generated in the profile mode. The alignments obtained were used in the PhD program (34, 35) to predict secondary structures and also to build profiles with the Profileweight program (43). These profiles were compared in pairs with the Proplot program (43). Two profiles, where one profile may be a sequence, were compared by sliding a window of the selected length along each possible register for a given dot plot. Several window lengths were tested. Matches between two profiles that were within the top 0.05% or between the top 0.1% and 0.05% were marked by two different types of dots.
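The sliding-window comparison described above can be illustrated with a minimal Python sketch. It is not the Proplot program: it compares two plain sequences rather than profiles, and the identity-count score and the function names `window_scores` and `top_fraction` are assumptions introduced here for illustration only.

```python
from typing import List, Tuple

Hit = Tuple[int, int, int]  # (start in sequence A, start in sequence B, score)


def window_scores(seq_a: str, seq_b: str, window: int) -> List[Hit]:
    """Score every register (i, j) of a fixed-length window.

    Assumption: the score is simply the number of identical residues in the
    window; a profile-based method would use position-specific weights.
    """
    hits: List[Hit] = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            score = sum(1 for k in range(window) if seq_a[i + k] == seq_b[j + k])
            hits.append((i, j, score))
    return hits


def top_fraction(hits: List[Hit], fraction: float) -> List[Hit]:
    """Return the best-scoring windows, i.e. the top `fraction` of all registers."""
    keep = max(1, int(len(hits) * fraction))
    return sorted(hits, key=lambda h: h[2], reverse=True)[:keep]
```

Each retained hit would correspond to one dot in the plot; calling `top_fraction` with two different cut-offs mimics the two classes of dots mentioned in the text.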
|
|
|---|
Comparison of the entire replicase gene revealed that, among all viruses sequenced to date, the Coronaviridae represent the most closely related family to GAV (unpublished data). In the case of the 3CLpro, however, the most significant matches were found in homologs from the Potyviridae family (8) (data not shown). Comparison of the GAV 3CLpro with both corona- and potyvirus 3CLpros revealed conservation of two regions: (i) the segment containing the catalytic His residue, which is most similar between the GAV and coronavirus 3CLpros, and (ii) the segment containing the catalytic Cys residue, which is most similar between the GAV and potyvirus 3CLpros (Fig. 2). No conservation was evident in the segment between the catalytic His and Cys residues, which contains the catalytic Asp residue of potyvirus (and many other) 3C-like proteinases.
FIG. 2. Profile-versus-profile dot plot cross-comparisons of GAV 3CLpro with coronavirus and potyvirus 3CLpros. Alignments of coronavirus and potyvirus 3C-like proteinases were converted into profiles and compared in a dot plot fashion, as described in Materials and Methods. Shown are the dot plots generated with a window of 35 amino acid residues. The projected positions of the catalytic residues (H46/H41 versus H2879, D81, C151/C144 versus C2968), as well as the substrate-binding H167/H162 residues versus H2983, are shown at each axis. Putative catalytic residues are designated by asterisks. Dots that lie at any of the four possible crosses of projections of two functionally equivalent residues (e.g., H46 and H2879), or close to a nonvisible diagonal passing through these crosses, belong or may belong to the true matches between two profiles. The remaining dots are background hits (false positives).
|
The computer-aided analysis of putative substrate-binding residues of 3CLpro produced a low-resolution model. GAV His2983, the previously proposed counterpart to the key S1 subsite His residues of other 3C/3CLpros (8), was either at the edge or even outside of a stretch of matching residues in the GAV-versus-potyvirus and GAV-versus-coronavirus dot plots, respectively (Fig. 2). The low similarity in this region is due to the unusually short size of this segment in GAV 3CLpro and unique amino acid replacements in the immediate vicinity of GAV His2983 and the corresponding His residues in coronavirus 3CLpros (15, 17) (Fig. 3). Accordingly, when the GAV 3CLpro was compared separately with each of the two proteinase groups with Clustal X, another closely located residue of GAV, Ser2988, was aligned with the substrate-binding His (not shown). Five residues upstream of the catalytic Cys, a Thr/Ser residue which, in many 3C/3CLpros, together with His, makes contact with the substrate's P1 Gln/Glu side chain (3, 16, 28-30), was found to be conserved in the GAV sequence (GAV Thr2963), suggesting that His (rather than Ser) is the most probable candidate to assume the key position in the S1 subsite.
FIG. 3. Multiple sequence alignment of GAV, coronavirus, and potyvirus 3CLpro domains. The Clustal X-based alignment of corona- and potyvirus 3CLpros produced previously (17) was modified slightly to accommodate the results of the tertiary-structure analysis of a porcine coronavirus 3CLpro (2) and used to align the GAV 3CLpro sequence. For GAV and coronaviruses, this alignment was further expanded by including upstream and downstream sequences with Clustal X. Shown are the regions enriched in hydrophobic amino acid residues and flanking the 3CLpro from both the N terminus (C-terminal part of hydrophobic domain [HD3]) and the C terminus (entire HD4). These hydrophobic domains are conserved in all nidoviruses (14). For GAV and coronaviruses, the pp1a/1ab amino acid positions are given on the right; for potyviruses, the numbers refer to the amino acid positions in the 3CLpro. The column conservation in the two groups of coronavirus/GAV versus potyvirus sequences was highlighted separately with different colors for the following groups of amino acids: green for G, A, L, I, V, M, F, Y, and W; blue for H, K, and R; red for N, Q, E, and D; yellow for P; and violet for S and T. Columns with conserved or identical residues in all sequences are indicated by colons and solid squares, respectively, in the line separating the coronavirus/GAV and potyvirus groups. Empty squares highlight columns with identical residues in the GAV and potyvirus sequences. #, conserved catalytic Cys and His residues; @, P1-binding His residue conserved in all sequences and Thr residue conserved among GAV and potyviruses; solid circle, catalytic Asp residue of potyviruses. ><, positions of cleavage sites separating 3CLpro from flanking domains in corona- and potyviruses. 
Abbreviations of virus names and DDBJ/EMBL/GenBank accession numbers for the sequences are as follows: HCoV, human coronavirus (strain 229E) (X69721); TGEV, transmissible gastroenteritis virus (strain Purdue 115) (Z34093); PEDV, porcine epidemic diarrhea virus (strain CV777) (NC_003436); MHVA, murine hepatitis virus (strain A59) (NC_001846); BCoVl, bovine coronavirus (isolate LUN) (AF391542); IBV, avian infectious bronchitis virus (strain Beaudette) (M95169); TVMV, tobacco vein mottling virus (P09814); TUMVQ, turnip mosaic virus (strain Quebec) (Q02597); TEV, tobacco etch virus (P04517); PVY, potato virus Y (strain N) (P18247); PSBMV, pea seed-borne mosaic virus (strain DPD1) (P29152); PPVRA, plum pox virus (strain Rankovic) (P17767); PRSVH, papaya ringspot virus (strain P/mutant HA) (Q01901); PEMVC, pepper mottle virus (California isolate) (Q01500); BSMRV, Brome streak mosaic rymovirus (strain 11-Cal) (Q65730).
|
Nidovirus 3CLpros comprise two catalytic β-barrels and an extra C-terminal domain. In the viral polyprotein, they are flanked by well-conserved cleavage sites that are used to release the proteinase from adjacent transmembrane domains (15, 51). A similar domain organization was unraveled in GAV, although the sequence conservation was rather low, especially outside the catalytic domains (Fig. 3). In striking contrast to other nidoviruses, we were unable to identify conservation in the immediate flanking regions of 3CLpro or, at least, dipeptides conforming to canonical 3CLpro cleavage sites [(Glu,Gln)↓(Ser,Ala,Gly)], indicating that the GAV 3CLpro may have a deviant specificity and release itself from the precursor in a unique fashion.
Proteolytic activity of GAV 3CLpro domain.
To address the predicted proteolytic activity of the GAV 3CLpro, pp1a/pp1ab residues 2793 to 3143 (containing the presumed 3CLpro and a short N-terminal flanking region) were expressed as part of an MBP fusion protein (MBP-2793-3143) in E. coli. Based on studies on the related human coronavirus 3CLpro (49), the N-terminal region was expected to contain a 3CLpro site that could be autoprocessed in E. coli. As Fig. 4A (lanes 2 and 3) shows, induction of expression resulted in the synthesis of two proteins of ~47 and ~38 kDa that were not detectable in the noninduced control, suggesting proteolytic cleavage of the primary translation product, for which a molecular mass of 82 kDa was calculated. The fact that the control protein, MBP-2793-3143_H2879R, in which Arg replaced the putative active-site His2879 residue, gave rise to the full-length protein (Fig. 4A, lanes 4 and 5) provided conclusive evidence that, as predicted, GAV pp1a/pp1ab residues 2793 to 3143 contain a functional proteinase domain.
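The masses quoted above can be checked roughly with a short Python sketch. It rests on two assumptions that are not taken from this article: an average residue mass of about 110 Da, and an MBP fusion partner of about 42.5 kDa. The split position (cleavage between residues 2831 and 2832) is the site reported in this study; exact masses would need the actual sequences.

```python
AVG_RESIDUE_KDA = 0.110  # assumption: rule-of-thumb average mass per residue
MBP_TAG_KDA = 42.5       # assumption: approximate mass of the MBP fusion partner


def segment_length(first: int, last: int) -> int:
    """Number of residues in an inclusive pp1a/pp1ab range."""
    return last - first + 1


def approx_mass_kda(n_residues: int, tag_kda: float = 0.0) -> float:
    """Crude mass estimate: residue count times average mass, plus optional tag."""
    return n_residues * AVG_RESIDUE_KDA + tag_kda


# Full-length fusion protein MBP-2793-3143
full_length = approx_mass_kda(segment_length(2793, 3143), MBP_TAG_KDA)
# N-terminal product: MBP plus residues 2793-2831
n_product = approx_mass_kda(segment_length(2793, 2831), MBP_TAG_KDA)
# C-terminal product: residues 2832-3143 (3CLpro-containing fragment)
c_product = approx_mass_kda(segment_length(2832, 3143))

print(f"full length ~{full_length:.1f} kDa, "
      f"N-terminal ~{n_product:.1f} kDa, C-terminal ~{c_product:.1f} kDa")
```

Under these assumptions the estimates come out near 81, 47, and 34 kDa, in line with the calculated 82-kDa precursor and the observed ~47-kDa product; the C-terminal fragment is estimated somewhat below its observed ~38-kDa gel mobility.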
FIG. 4. Proteolytic activity of GAV pp1a/pp1ab amino acids 2793 to 3143. (A) Total cell lysates from E. coli TB1 cells transformed with pMal-GAV-2793-3143 (lanes 2 and 3, WT) and pMal-GAV-2793-3143_H2879R (lanes 4 and 5, H2879R) were separated by SDS-PAGE in a 12.5% polyacrylamide gel and stained with Coomassie brilliant blue R-250. The bacteria were mock induced (lanes 2 and 4) or induced with 1 mM IPTG for 3 h (lanes 3 and 5). The positions of the fusion proteins and cleavage products are indicated, and the molecular masses of marker proteins (lane 1) are given (in kilodaltons). (B) The protein lysate shown in panel A (lane 3) was separated by SDS-PAGE in a 10% polyacrylamide gel, transferred to a nitrocellulose membrane, and immunostained with MBP-2948-3143-specific rabbit antiserum (lane 1) or MBP-specific antiserum (New England Biolabs) (lane 2). The positions of the N-terminal (i.e., MBP-containing) and C-terminal cleavage products are indicated, and the positions of marker proteins are given (with masses in kilodaltons).
|
trans-cleavage activity of recombinant GAV 3CLpro. From the data presented above, it could not be concluded whether the N-terminal 3CLpro cleavage had occurred in cis or was mediated by trans-acting precursors. Although the high cleavage efficiency indicated by the virtual absence of detectable precursors strongly suggested a cotranslational monomolecular reaction, we expected that the recombinant 3CLpro might also have trans-cleavage activity required by the native proteinase to process the full spectrum of cleavage sites assumed to exist in the 460-kDa and 758-kDa GAV replicase polyproteins. The demonstration of such trans-cleavage activity would also formally exclude the involvement of E. coli proteinases in the processing described in Fig. 4.
trans-cleavage activity was examined with purified, recombinant 3CLpro (for details, see Materials and Methods). Because of the uncertainty regarding the C-terminal border of 3CLpro (see below), we initially tested bacterially expressed proteins with C termini of different lengths (2832 to 3143 and 2832 to 3126). Both proteins had proteolytic activity. We decided to use 2832-3126 in subsequent trans-cleavage experiments because of its superior stability. As a control, a protein with the same sequence but containing a substitution of the putative nucleophilic active-site Cys2968 residue (2832-3126_C2968A) was produced (Fig. 5). The purified proteins were incubated with bacterially expressed MBP-6338-6673 containing the C-terminal GAV pp1ab sequence corresponding to the coronavirus pp1ab region with the most C-terminal 3CLpro cleavage site (20, 25, 51). The data (Fig. 5) revealed that the wild-type proteinase but not the active-site mutant was active in trans, proving that GAV 3CLpro is indeed a proteinase.
FIG. 5. trans-cleavage activity of GAV 3CLpro. Recombinant GAV 3CLpro encompassing 295 amino acids (2832 to 3126) and an active-site mutant (2832-3126_C2968A) were bacterially expressed, purified, and incubated with an MBP fusion protein substrate, MBP-6338-6673, containing the C-terminal GAV pp1ab sequence (see Materials and Methods for details). Lanes: 1, marker proteins, with molecular masses indicated in kilodaltons; 2, MBP-6338-6673 incubated with buffer; 3, MBP-6338-6673 incubated with 2832-3126; 4, 2832-3126 incubated with buffer; 5, MBP-6338-6673 incubated with buffer; 6, MBP-6338-6673 incubated with 2832-3126_C2968A; 7, 2832-3126_C2968A incubated with buffer. Cleavage products of MBP-6338-6673 are indicated by arrowheads.
|
VRTGN2836, which identifies Val2832 as the N terminus of 3CLpro. The observed molecular mass of the 3CLpro-containing cleavage product (38 kDa) slightly surpassed that calculated for this peptide sequence (34.8 kDa), making a second, C-terminal cleavage of MBP-2793-3143 unlikely.
FIG. 6. Characterization of N-terminal GAV 3CLpro autoprocessing site by protein sequencing. The C-terminal MBP-2793-3143 cleavage product (Fig. 4A, lane 3) was subjected to Edman degradation, and phenylthiohydantoin (PTH)-amino acids generated during each reaction cycle were detected by their absorbance at 269 nm (expressed as milliabsorption units) and identified by their characteristic retention times on a reversed-phase high-pressure liquid chromatography support. (A) Chromatogram of PTH-amino acid standards. (B to F) Chromatograms of PTH-amino acids from reaction cycles 1 to 5. Specific peaks of PTH-amino acids are indicated by the single-letter code.
|
27-kDa C-terminal cleavage product from the trans-cleavage reaction documented in Fig. 5. This analysis unambiguously identified the scissile bond as 6441KVNHE↓LYHVA6450. As no other processing product was detected, it is reasonable to assume that the C-terminal processing product of GAV pp1ab is a 27-kDa protein encompassing amino acids 6446 to 6673. The data provided additional information on the GAV 3CLpro substrate specificity, which allows us to preliminarily propose VxHE↓(L,V) as the consensus sequence of GAV 3CLpro cleavage sites. Although the picture is still incomplete, our data indicate that the substrate specificity of the GAV 3CLpro is well defined, as in vertebrate nidovirus main proteinases and many of their viral relatives, but differs from that of typical 3C/3C-like enzymes.
Dispensability of C-terminal sequences for 3CLpro autoprocessing activity. The observed preference for substrates containing HEL or HEV tripeptides lends additional support to our hypothesis that there is no cleavage site between the 3CLpro domain and the downstream putative membrane-spanning domain. It is thus tempting to speculate that, in contrast to the main proteinases of vertebrate nidoviruses, the GAV 3CLpro is the N-terminal component of a larger protein. To determine whether the sequences downstream of the predicted two-β-barrel domain are essential for 3CLpro cleavage activity, we compared the proteolytic activities of two C-terminal MBP-2793-3143 deletion mutants with that of the parental protein. As Fig. 7 shows, the two C-terminally truncated proteins had reduced but clearly detectable proteolytic activities, suggesting that the N-terminal region from 1 to 197 contains all the structural elements and residues required for substrate binding and catalysis. Furthermore, comigration of the processed N-terminal product (Fig. 7) suggests that, in all three proteins with proteolytic activity, cleavage occurred at the same peptide bond.
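The residue range "1 to 197" uses proteinase numbering, whereas the constructs are named by pp1a/pp1ab positions. The conversion is sketched below in Python, assuming that Val2832 (the N terminus determined by Edman degradation) is residue 1 of the proteinase; the constant and function names are hypothetical.

```python
PROTEINASE_START = 2832  # Val2832, taken as residue 1 of 3CLpro (assumption)


def to_proteinase_numbering(pp1a_position: int) -> int:
    """Convert a pp1a/pp1ab position to 3CLpro-internal numbering."""
    if pp1a_position < PROTEINASE_START:
        raise ValueError("position lies upstream of the 3CLpro N terminus")
    return pp1a_position - PROTEINASE_START + 1


# C-terminal borders of the constructs discussed in the text
for end in (3028, 3059, 3126, 3143):
    print(end, "->", to_proteinase_numbering(end))
```

With this offset, the shorter deletion construct (ending at pp1a residue 3028) ends at proteinase residue 197, and the 2832-3126 protein used for trans-cleavage spans 295 residues, matching the figures given in the text and in Fig. 5.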
FIG. 7. Effect of C-terminal deletions on the self-processing activity of MBP-2793-3143. (A) Total cell lysates from E. coli TB1 cells transformed with pMal-GAV-2793-3143 (lanes 1 and 2; 2793-3143), pMal-GAV-2793-3143_C2968A (lanes 3 and 4; 2793-3143_C2968A), pMal-GAV-2793-3028 (lanes 5 and 6; 2793-3028), and pMal-GAV-2793-3059 (lanes 7 and 8; 2793-3059) were separated by SDS-PAGE in a 12.5% polyacrylamide gel and stained with Coomassie brilliant blue R-250. The bacteria were mock induced (lanes 1, 3, 5, and 7) or induced with 1 mM IPTG for 3 h (lanes 2, 4, 6, and 8). The positions of the fusion proteins and cleavage products are indicated, and the molecular masses of marker proteins (lane M) are given (in kilodaltons). (B) The cell lysates shown in panel A were separated by SDS-PAGE, transferred to a nitrocellulose membrane, and immunostained with anti-MBP antiserum (New England Biolabs). The positions of the uncleaved fusion proteins and the N-terminal (i.e., MBP-containing) cleavage products are indicated, and the positions of marker proteins are given (with masses in kilodaltons).
|
FIG. 8. Mutational analysis of active center of GAV 3CLpro. (A) The proteolytic activities of bacterially expressed MBP-2793-3143 proteins carrying substitutions of putative active-site residues were examined by SDS-PAGE of cell lysates obtained after IPTG-induced (3 h, 24°C) protein expression. The introduced amino acid substitutions and the positions of both uncleaved fusion proteins and cleavage products are indicated. The proteolytic activity of the wild-type MBP-2793-3143 (WT) (see also Fig. 4) served as a positive control. (B) The cell lysates shown in panel A were separated by SDS-PAGE, transferred to a nitrocellulose membrane, and immunostained with anti-MBP antiserum (New England Biolabs). The positions of the uncleaved fusion proteins and the N-terminal (that is, MBP-containing) cleavage products are indicated. Also shown are the positions of molecular mass markers (with masses given in kilodaltons).
|
|
|
|---|
Previous studies of coronavirus 3CLpros suggested that ancestors of these enzymes accepted unprecedented substitutions in most of the conserved positions of the catalytic system and the substrate pocket, making this group of enzymes an outlier among the huge family of viral and cellular chymotrypsin-like homologs (2, 15, 17). We now provide evidence that GAV 3CLpro represents an evolutionary link between the 3CLpros of coronaviruses and those of all other positive-stranded RNA viruses. Specifically, our data indicate that the unique replacements in coronavirus 3CLpros of otherwise strictly conserved residues must have been acquired gradually in the nidovirus lineage. In this context, the GAV 3CLpro emerges as an important model for studying separately the functional effects of the abridged Cys-His catalytic system. This is possible because, in contrast to coronavirus 3CLpros, which feature both a Cys-His catalytic center and a noncanonical substrate pocket, the GAV 3CLpro Cys-His catalytic center seems to be combined with a canonical (potyvirus-like) substrate pocket (see below and Fig. 9).
FIG. 9. Variations in catalytic and substrate-binding residues of RNA viral chymotrypsin-like proteinases. PV, poliovirus; HAV, hepatitis A virus; TBRV, tomato black ring virus; PEMV, pepper mottle virus; HCoV, human coronavirus; EAV, equine arteritis virus. The key catalytic (*) and substrate-binding pocket (#) residues are indicated. The catalytic Asp residue of hepatitis A virus is shown in brackets because its side chain orientation in the hepatitis A virus 3Cpro crystal structure (1, 4) argues against the proposed catalytic function (see text for details).
|
It should be noted that an equivalent of the Asp residue of the chymotrypsin catalytic triad is also missing in coronavirus 3CLpros (2, 17, 24, 50). Also, in the crystal structure of the hepatitis A virus 3C proteinase, the side chain of the conserved Asp residue adopts an unexpected orientation (1, 4). Even though the hepatitis A virus Asp84 residue occupies the expected position in the main chain, it forms a salt bridge with the ε-amino group of a Lys side chain from strand fII (4) rather than interacting with the catalytic His44, and thus, a catalytic function is unlikely. Apparently, in an appropriate environment, the relatively low pKa of the Cys nucleophile (compared to that of Ser) may fully or partially relieve some 3C/3C-like cysteine proteinases from dependence on an Asp (Glu) carboxylate group, which is usually required to stabilize the developing positive charge on the catalytic histidine residue during serine proteinase catalysis (13, 23, 27).
Substrate specificity.
In this study, initial information on the substrate specificity of the GAV 3CLpro was obtained by determining the N-terminal 3CLpro autoprocessing site and a second 3CLpro cleavage site in the C-terminal region of pp1ab. The sequences flanking the scissile bonds, 2827LVTHE↓VRTGN2836 and 6441KVNHE↓LYHVA6450, share the VxHE↓(L,V) motif. Inspection of coronavirus/GAV replicase alignments (A. E. Gorbalenya and J. Ziebuhr, unpublished data) leads us to believe that Val/Thr/Ser and Leu/Val/Ile/Gly/Ser/Ala at the substrate P4 and P1' positions, respectively, may be compatible with proteolysis by GAV 3CLpro. This conservation pattern suggests that the P4, P2, P1, and P1' positions are the major 3CLpro specificity determinants. The same positions are critical in corona- and potyvirus 3CLpro cleavage sites, which provides further support to combine the GAV, corona-, and potyvirus 3CLpros in a separate group.
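The tentative consensus (Val at P4, any residue at P3, His at P2, Glu at P1, and Leu or Val at P1') can be written as a simple pattern. The Python sketch below scans a sequence fragment for that pattern and reports the P1 and P1' residue numbers. The function name `find_sites` is hypothetical, and the strict pattern deliberately ignores the broader set of P4 and P1' residues that may also be tolerated.

```python
import re
from typing import List, Tuple

# Lookahead so that overlapping candidate sites would also be reported.
# Pattern positions: P4 = V, P3 = any, P2 = H, P1 = E, P1' = L or V.
SITE_PATTERN = re.compile(r"(?=(V.HE[LV]))")


def find_sites(sequence: str, first_position: int) -> List[Tuple[int, int, str]]:
    """Return (P1 number, P1' number, matched P4-P1' residues) for each hit.

    `first_position` is the polyprotein number of the first residue in
    `sequence`.
    """
    sites = []
    for match in SITE_PATTERN.finditer(sequence):
        p1_number = first_position + match.start() + 3  # P1 is 3 residues after P4
        sites.append((p1_number, p1_number + 1, match.group(1)))
    return sites


# The two experimentally determined sites from this study
print(find_sites("LVTHEVRTGN", 2827))
print(find_sites("KVNHELYHVA", 6441))
```

For the two fragments above, the scan places the scissile bonds between residues 2831 and 2832 and between residues 6445 and 6446, in agreement with the sequencing data. A hit in any other sequence would only mark a candidate site, not a demonstrated cleavage.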
Whereas the presence of Glu (or Gln) at the P1 position is a typical feature of RNA virus 3C/3CLpro substrates (16, 36), the GAV 3CLpro preferences at the other conserved positions are less common and, taken together, give this proteinase a unique substrate specificity formula. Interestingly, some plant potyvirus NIa 3C-like proteinases (21, 33, 44, 48) share the P2 His substrate specificity with the GAV 3CLpro. It is also noteworthy that, unlike most other 3C/3C-like proteinases, GAV 3CLpro seems to possess a relatively large (hydrophobic) S1' subsite, which would accommodate the branched side chains of valine and leucine.
A striking parallel between GAV 3CLpro and various well-characterized positive-stranded RNA virus homologs (3, 16, 28-30) is the conservation of the pair of His/Thr residues in the S1 subsite. Our hypothesis that the corresponding GAV 3CLpro residues (Thr2963 and His2983) may play an equivalent role is further supported by the local conservation of the corresponding region among GAV and potyvirus 3CLpros (Fig. 2 and 3) and our mutagenesis data (see above). Despite these similarities, it is likely that additional (poorly recognized) determinants may tune the P1 specificity in a virus-specific manner. Thus, for example, it is conceivable that the 3CLpros of GAV and arteriviruses, which both recognize a P1 Glu (rather than Gln) side chain (38; this paper), have similarly organized S1 subsites.
Cleavage at C terminus of 3CLpro.
RNA virus (including vertebrate nidovirus) 3C/3CLpros are commonly released from the replicase polyproteins by autocatalytic processing. In some cases, the N- and C-terminal sites are cleaved with different kinetics. Thus, for example, C-terminal 3C/3CLpro cleavage occurs more slowly (picornaviruses) (36), is tightly regulated (arteriviruses) (45), or is totally lacking (some caliciviruses) (39, 46). In our experiments, no evidence was obtained for cleavage in the region immediately downstream of the GAV 3CLpro which, according to comparative sequence analysis (Fig. 3), also does not contain potential [that is, VxHE↓(L,V)] cleavage sites.
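As an illustration of how a polyprotein region could be screened for candidate sites of this kind, the following minimal Python sketch searches a sequence for the tentative VxHE(L,V) consensus. The function name `find_candidate_sites` and the `first_residue` parameter are hypothetical and were not part of this study; the example input is the KVNHELYHVA stretch (pp1ab residues 6441 to 6450) reported above. A match indicates only agreement with the consensus, not demonstrated cleavage.

```python
import re

# Tentative consensus from the text: V-x-H-E at P4-P1, followed by L or V at P1'.
# The lookahead is zero-width, so overlapping candidates would also be reported.
CONSENSUS = re.compile(r"(?=(V.HE)([LV]))")


def find_candidate_sites(seq, first_residue=1):
    """Return (P1 residue number, 'P4-P1|P1'' string) for each consensus match.

    first_residue is the polyprotein number of seq[0] (hypothetical helper
    parameter, used only to report positions in polyprotein numbering).
    """
    hits = []
    for m in CONSENSUS.finditer(seq.upper()):
        # P1 (Glu) is the fourth residue of the four-residue P4-P1 match.
        p1_number = first_residue + m.start() + 3
        hits.append((p1_number, m.group(1) + "|" + m.group(2)))
    return hits


if __name__ == "__main__":
    # Stretch spanning the pp1ab cleavage site identified in this study.
    print(find_candidate_sites("KVNHELYHVA", first_residue=6441))
```

An empty result for a given region would be consistent with, but would not prove, the absence of a 3CLpro cleavage site there, since the consensus is based on only a small number of identified sites.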
It is possible that a site immediately downstream of the proteinase domain might be cleaved by a cellular proteinase. However, this would be unprecedented based on data for other viral 3CLpros. Alternatively, domains from other regions of the viral polyprotein, which are missing in our constructs, might assist in autoprocessing at a C-terminal 3CLpro site with a deviant structure. For instance, studies of the arterivirus equine arteritis virus have revealed that the C-terminal release of the nsp4 proteinase from the nsp4-8 precursor requires nsp2 as a cofactor (45). Further studies with larger GAV 3CLpro-containing precursor proteins and alternative expression systems, including insect cells and primary crustacean cells (31), may help to address this question more rigorously.
If GAV 3CLpro and the downstream hydrophobic domain are not separated by proteolytic cleavage, as our results suggest, then the proteinase would remain anchored to intracellular membranes throughout the replication cycle. To some extent, this association would resemble the situation in the arterivirus equine arteritis virus and the coronavirus mouse hepatitis virus, in which significant amounts of nsp4 and 3CLpro, respectively, are known to remain part of long-lived (or even stable) precursors which possess flanking hydrophobic domains on either one or both sides (22, 37, 45).
Domain structure of 3CLpro.
In contrast to other 3C/3CLpros, which consist of two catalytic β-barrel domains (1, 4, 28-30), nidovirus and potyvirus 3CLpros possess an extra C-terminal domain of variable size (51). This additional domain is also present in the GAV 3CLpro, although its precise size remains to be determined. In coronavirus 3CLpros, the C-terminal domain is involved in trans-cleavage activity (2, 26, 32, 50). Recent crystal structure analysis of the transmissible gastroenteritis virus 3CLpro showed that the domain adopts a unique α-helical structure that interacts with the enzyme's N terminus. This interaction fixes the orientation of a loop region involved in substrate binding (2).
The fact that the C-terminally truncated, 197-residue GAV 3CLpro (Fig. 1 and 7) retained significant autoprocessing activity when expressed as an MBP fusion protein argues against an equally important role for the C-terminal domain of GAV 3CLpro, at least in cis reactions. The effects of C-terminal deletions on the activity in trans remain to be determined. This experiment is of special interest because coronavirus 3CLpros have been shown to be differentially affected by C-terminal deletions in cis- versus trans-cleavage reactions (2, 26, 32, 50).
Taken together, the differences and similarities revealed in this study between the main proteinase of a crustacean nidovirus and its viral homologs indicate a novel pattern of functional and structural conservation that has not been observed in any of the previously characterized proteinases from mammalian and plant pathogens. We are confident that, from an evolutionary perspective, the characterization of proteins of positive-stranded RNA viruses isolated from less-characterized habitats will allow valuable insights into the evolution of viruses and help identify both missing phylogenetic links and evolutionary forces operating in specific biological systems.
We thank Viviane Hoppe for protein sequence data.
α-chymotrypsin. Nature 214:652-656.