Journal of Virology, November 2007, p. 12049-12060, Vol. 81, No. 21
0022-538X/07/$08.00+0 doi:10.1128/JVI.00969-07
Copyright © 2007, American Society for Microbiology. All Rights Reserved.

Departments of Molecular Biology,1 Molecular and Integrative Neurosciences,2 Cell Biology,3 Chemistry,4 Skaggs Institute for Chemical Biology,5 Joint Center for Structural Genomics, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037,6 Institute for Molecular Biology and Biophysics, ETH Zürich, CH-8093 Zürich, Switzerland7
Received 4 May 2007/ Accepted 1 August 2007
α-helices that are common to ubiquitin-like folds, the globular domain of nsp3a contains two short helices representing a feature that has not previously been observed in these proteins. Nuclear magnetic resonance chemical shift perturbations showed that these unique structural elements are involved in interactions with single-stranded RNA. Structural similarities with proteins involved in various cell-signaling pathways indicate possible roles of nsp3a in viral infection and persistence.
The SARS-CoV represents one of the largest currently known RNA genomes. It is composed of at least 14 functional open reading frames that encode three classes of proteins, i.e., structural proteins (the S, M, E, N, 3a, 7a, and 7b proteins), nonstructural proteins (nsp1 to nsp16), and the accessory proteins (3b, 6, 8, 9b, and 14) (38). With regard to the nonstructural proteins, the translation of the SARS-CoV genome produces two large replicase polyproteins (pp1a and pp1ab), which are processed by two proteases to yield 16 mature nonstructural proteins that mediate RNA replication and processing. Since the SARS outbreak in 2003, knowledge of the structure, activity and function of some of these proteins has increased considerably (30, 32, 35, 41, 45); however, the biological roles of many of the SARS-CoV proteins remain unknown. In this paper we describe the nuclear magnetic resonance (NMR) structure determination and a preliminary functional characterization of nsp3a, the N-terminal domain of the largest of the nonstructural proteins, nsp3.
SARS-CoV nsp3 is a 213-kDa polypeptide involved in RNA replication and has been proposed to consist of seven domains, nsp3a to nsp3g, which have been identified based on phylogenetic conservation and predicted amino acid secondary structure (38). The biological role of nsp3 is only partially understood, and so far structures have been determined of only the two domains nsp3b, which has been described as an ADP ribose-1''-phosphatase (34), and nsp3d, which is a papain-like protease (PLpro) involved in the proteolytic processing of pp1a and pp1ab. nsp3d contains three domains, two of which are involved directly in proteolysis, while the third one has a ubiquitin-like fold (31).
nsp3a exhibits less than 35% sequence identity with other known proteins, and the closest homologues are found in other CoVs. The alignment shown in Fig. 1 indicates that group 2a CoVs (e.g., murine hepatitis virus and porcine hemagglutinating encephalomyelitis virus) exhibit higher similarity with nsp3a than proteins from groups 1 (e.g., human coronavirus 229E) and 3 (e.g., avian infectious bronchitis virus). The 183-residue nsp3a domain consists of a C-terminal subdomain of residues 113 to 183 that is rich in acidic residues (38% E and 12% D) and a 112-residue N-terminal subdomain with a more homogeneous content of amino acids (Fig. 1). This report presents a structural characterization of residues 1 to 183 of nsp3a [nsp3a(1-183)] and the structure determination of the subdomain nsp3a(1-112) in solution by NMR spectroscopy.
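The acidic-residue content quoted above (38% E and 12% D) is a plain per-residue percentage. The following minimal sketch shows that calculation; the function name and the short example sequence are hypothetical placeholders and are not the nsp3a sequence, which appears only in Fig. 1b.

```python
from collections import Counter


def residue_percentages(sequence: str) -> dict:
    """Return the percentage of each one-letter residue code in a sequence."""
    seq = sequence.upper()
    total = len(seq)
    if total == 0:
        return {}
    counts = Counter(seq)
    return {aa: 100.0 * n / total for aa, n in counts.items()}


# Hypothetical 10-residue example (NOT the nsp3a sequence).
example = "EEEEDAGKLV"
pct = residue_percentages(example)
print(f"E: {pct['E']:.0f}%  D: {pct['D']:.0f}%")
```

Applied to residues 113 to 183 of the real sequence, the same function would be expected to reproduce the percentages stated in the text.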
FIG. 1. (a) Sequence alignment of human SARS-CoV nsp3a(1-112) and the homologous regions from bat SARS-CoV (accession no. AAZ67050), murine hepatitis virus (HV) (strain A59; accession no. NP_740609), porcine hemagglutinating encephalomyelitis virus (HEV) (strain VW572; accession no. YP_459949), human CoV (hCoV 229E; accession no. NP_835345), and avian infectious bronchitis virus (IBV) (strain Cal99; accession no. AAS00078). The residue numbers at the top correspond to the sequence of the human SARS-CoV and do not account for the insertions shown in the drawing. In each sequence the conserved residues relative to SARS-CoV nsp3a are in bold. The regular secondary structure elements of SARS-CoV nsp3a are indicated by boxes. (b) Sequence of the subdomain of residues 113 to 183 of human SARS-CoV.
Production of nucleic acid-free protein for NMR spectroscopy. nsp3a prepared as described in the preceding section copurifies with nucleic acids, as was readily observed in the 1-D 1H NMR spectrum (see Fig. 8a). Nucleic acid-free samples were obtained by the following modification of the purification procedure. After the anion-exchange chromatography, the sample was kept at 25°C for 18 h. The protein solution was subsequently loaded onto a size exclusion column (Superdex 75; Amersham) equilibrated with 50 mM sodium phosphate buffer (pH 6.5) containing 150 mM NaCl and eluted with the same buffer. Under these conditions, the protein and the nucleic acid eluted separately. The fractions containing nucleic acid-free nsp3a(1-112) were again pooled and concentrated to a final volume of 550 µl, for a final protein concentration of 1 to 2 mM. The 1-D 1H NMR spectrum of the sample used for the NMR structure determination (see Fig. 8b) confirms the absence of nucleic acids.
FIG. 8. (a) 1-D 1H NMR spectrum of nsp3a(1-112) before removal of copurifying nucleic acids. Spectra were measured at 25 °C with water presaturation on a Bruker DRX700 spectrometer. Sixty-four scans were accumulated. The presence of characteristic nucleic acid signals in the area from 4.8 to 6.4 ppm (*) is readily apparent (1'H, 2'H, 3'H, 4'H, 5'H, 5''H of all nucleotides and pyrimidine 5H are typically observed in this spectral region). (b) 1-D 1H NMR spectrum of the nucleic acid-free nsp3a(1-112) sample used for the NMR structure determination (see Materials and Methods). The weak peaks between 4.8 and 6.4 ppm are part of the protein spectrum. (c) Isolation of RNA that copurified with nsp3a(1-112). The chromatogram was obtained after loading a sample of unfolded nsp3a(1-112) in 6 M guanidinium-HCl onto a size exclusion column. Absorbance at 280 nm and conductivity are shown in blue and brown, respectively. The protein and ssRNA absorption peaks are labeled; the high conductivity observed after 320 minutes is due to guanidinium-HCl.
Steady-state 15N{1H}-NOEs were measured using transverse relaxation optimized spectroscopy-based experiments (32, 46) on a Bruker Avance 600 spectrometer with a saturation period of 3.0 s and a total interscan delay of 5.0 s. Diffusion experiments were recorded on a Bruker DRX700 spectrometer using a longitudinal eddy current delay pulse scheme (1), with a diffusion time of 50 ms and sine-shaped gradients of 4.5 ms. The data were processed with TopSpin software (Bruker BioSpin, Billerica, MA).
The interaction of nsp3a(1-112) with single-stranded RNA (ssRNA) was evaluated by comparison of the 2-D [15N,1H]-HSQC spectra of nsp3a(1-112) recorded at four nsp3a(1-112):ssRNA2 ratios, i.e., 16:1, 8:1, 4:1, and 2:1. As controls, 2-D [15N,1H]-HSQC spectra were obtained after addition of eight units of uridine (Octa-U) and Octa-A in fourfold excess with respect to the protein concentration, using otherwise identical conditions. The weighted average of the 1H and 15N chemical shift differences, $\Delta\delta_{av}$, was calculated as follows: $\Delta\delta_{av} = \{0.5[\Delta\delta(^{1}H^{N})^{2} + (0.2\,\Delta\delta(^{15}N))^{2}]\}^{1/2}$ (28).
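The weighted average defined above can be evaluated per residue with a few lines of code. This is a minimal sketch of the stated formula; the function name, argument names, and the example shift differences are assumptions made for illustration and are not measured values.

```python
import math


def weighted_shift_change(delta_h_ppm: float, delta_n_ppm: float,
                          n_weight: float = 0.2) -> float:
    """Weighted average of the 1H and 15N chemical shift differences (ppm).

    Implements {0.5 * [dH^2 + (n_weight * dN)^2]}^(1/2), with the 15N
    difference scaled by 0.2 as in the text.
    """
    return math.sqrt(0.5 * (delta_h_ppm ** 2 + (n_weight * delta_n_ppm) ** 2))


# Hypothetical shift differences for one residue (free vs. RNA-bound).
value = weighted_shift_change(0.05, 0.30)
print(f"weighted shift change = {value:.4f} ppm")
```

The 0.2 scaling compensates for the wider ppm range of 15N shifts relative to 1H shifts, so that neither nucleus dominates the combined value.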
Structure determination. The structure calculation was based on a 3-D 15N-resolved [1H,1H]-NOESY spectrum and on two 3-D 13C-resolved [1H,1H]-NOESY spectra recorded with the carrier frequency in the aliphatic and the aromatic regions, respectively. All three data sets were recorded with mixing times of 60 ms. In the input for the stand-alone version of the software package ATNOS/CANDID (9, 10), these NOE data were supplemented with the amino acid sequence and the chemical shift lists from the independently obtained sequence-specific resonance assignment (36). Seven cycles of automated NOESY peak picking and NOE cross-peak identification with ATNOS (9), automated NOE assignment with CANDID (10), and structure calculation with the torsion angle dynamics algorithm of CYANA (8) were performed. In the second and subsequent cycles, the intermediate protein structure was used as an additional guide for the interpretation of the NOESY spectra. During the first six cycles, ATNOS/CANDID/CYANA uses ambiguous distance restraints. In the final cycle, only distance restraints which could be attributed to a single pair of hydrogen atoms were retained. The 20 conformers with the lowest residual CYANA target function values obtained from the seventh ATNOS/CANDID/CYANA cycle were energy minimized in a water shell with the program OPALp (18, 21), using the AMBER force field (5). The program MOLMOL (19) was used to analyze the ensemble of 20 energy-minimized conformers.
Structure validation and data deposition. Analysis of the stereochemical quality of the molecular models was accomplished using the Joint Center for Structural Genomics Validation Central Suite (http://www.jcsg.org), which integrates seven validation tools: Procheck, SFcheck, Prove, ERRAT, WASP, DDQ, and Whatcheck.
Protein stoichiometry determination. Perfluoro-octanoic acid-polyacrylamide gel electrophoresis (PFO-PAGE) was performed according to the method of Ramjeesingh et al. (30). Purified protein samples were mixed 1:1 with PFO loading buffer containing 8% (wt/vol) PFO, 100 mM Tris, 20% (vol/vol) glycerol, and 0.05% (wt/vol) orange G. Samples with protein concentrations of 250 µM, 500 µM, and 1 mM were loaded onto precast 4 to 20% Tris-glycine gels, and electrophoresis was performed with a standard Tris-glycine running buffer (Invitrogen) to which 0.5% (wt/vol) PFO was added. Protein was detected by SYPRO-ruby poststain (Invitrogen).
Electrophoretic mobility shift assay (EMSA). Protein samples (twofold dilutions from 128 µM to 1 µM) were mixed with 0.8 µg of RNA substrate in 20 µl of assay buffer containing 150 mM NaCl, 50 mM Tris (pH 8.0), and 5 mM CaCl2. The RNA sequences used included ssRNA1, AAAUACCUCUCAAAAAUAACACCACACCAUAUACCACAU, and ssRNA2, GGGGAUAAAA. Samples were incubated at 37°C for 1 h and analyzed by native electrophoresis on precast 6% acrylamide DNA retardation gels (Invitrogen). RNA was detected by SYBR-gold poststain and photographed using a UV light source equipped with a digital camera. Protein was then detected by SYPRO-ruby poststain. Densitometric analysis was performed using a flatbed scanner with ImageJ software (NIH). The mobility shift of RNA at each protein concentration was calculated relative to the maximum shift observed in each experiment. Kd (dissociation constant) values were determined from the midpoints of the fitted titration data (37).
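The text states only that Kd values were taken from the midpoints of the fitted titration data. As an illustration of what such a fit can look like, the sketch below assumes a single-site binding isotherm, f = c / (Kd + c), and scans log-spaced candidate Kd values for the least-squares optimum. The model, the function names, and the synthetic data (generated with a made-up Kd of 20 µM) are all assumptions and do not reproduce the authors' fitting procedure or results.

```python
def fraction_bound(conc_um: float, kd_um: float) -> float:
    """Assumed single-site isotherm: fraction bound = c / (Kd + c)."""
    return conc_um / (kd_um + conc_um)


def fit_kd(concs_um, fractions, kd_min=0.1, kd_max=1000.0, steps=4001):
    """Return the log-spaced candidate Kd with the smallest squared error."""
    best_kd, best_sse = None, float("inf")
    for i in range(steps):
        kd = kd_min * (kd_max / kd_min) ** (i / (steps - 1))
        sse = sum((fraction_bound(c, kd) - f) ** 2
                  for c, f in zip(concs_um, fractions))
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


# Twofold dilution series from 128 uM to 1 uM, as in the assay description.
concs = [128 / 2 ** i for i in range(8)]
# Synthetic, noise-free fractions generated with a hypothetical Kd of 20 uM.
synthetic = [fraction_bound(c, 20.0) for c in concs]
kd_est = fit_kd(concs, synthetic)
print(f"estimated Kd is about {kd_est:.1f} uM")
```

Under this model the Kd equals the protein concentration at which half of the maximum shift is reached, which is why the midpoint of the titration curve is a natural estimate.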
Nuclease susceptibility assay. nsp3a(1-183) and nsp3a(1-112) were incubated with several different nucleases in order to characterize nucleic acids that copurified with both proteins. RNase-free DNase I (NEB), T7 endonuclease I (NEB), RNase If (NEB), RNase A (Invitrogen), and RNase T1 (Ambion) cleavage assays were thus performed at 37°C for 1 h with the manufacturer's recommended buffer conditions. Digested samples were analyzed by native electrophoresis on precast 6% acrylamide DNA retardation gels (Invitrogen). Nucleic acid was detected by SYBR-gold poststain.
Protein structure accession numbers. The 1H, 13C, and 15N chemical shifts have been deposited in the BioMagResBank (http://www.bmrb.wisc.edu) under accession number 7029 (36). The atomic coordinates of the bundle of 20 conformers used to represent the solution structure of nsp3a(1-112) and of the conformer closest to the mean coordinates of the ensemble have been deposited in the Protein Data Bank (PDB; http://www.rcsb.org/pdb/) under accession numbers 2GRI and 2IDY, respectively.
FIG. 2. NMR structure of nsp3a(1-112). (a) Stereo view of the polypeptide backbone of a bundle of 20 energy-minimized CYANA conformers superimposed for minimal RMSD value of the backbone atoms of residues 20 to 108. The N-terminal segment of residues 1 to 19 is flexibly disordered (Fig. 5). (b) Stereo view of a ribbon representation of the conformer with the smallest RMSD relative to the mean coordinates of the ensemble of panel a. In both panels, β-strands are cyan and helices are red. Selected residue positions are indicated in panel a, and the regular secondary structures are identified in panel b.
TABLE 1. Input for the structure calculation and characterization of the bundle of 20 energy-minimized CYANA conformers that represent the NMR structure of nsp3a(1-112)
α1-β2-α2-3₁₀-α3-β3-β4 (Fig. 1 and 2). The long helix α2 and the presence of the α1- and 3₁₀-helices, which have not been observed in other ubiquitin-like proteins, make the overall structure more elongated than other ubiquitin-related folds. The strand β1 spans residues 20 to 24 and is connected via a well-defined nine-amino-acid linker to the helix α1 containing residues 34 to 37. A short turn then leads to β2 with residues 42 to 46. The helix α2 with residues 52 to 66 is followed by a short loop that leads to the 3₁₀-helix of residues 70 to 75, which is further connected by a short turn with the helix α3 of residues 79 to 84. The last two regular secondary structures, β3 with residues 89 to 91 and β4 with residues 101 to 106, form an antiparallel β-sheet, and they are connected to each other by a tight turn followed by an extended chain segment. The electrostatic potential surface of nsp3a(1-112) shows a pronounced polarity (Fig. 3), with the helices α2, α3, and 3₁₀ exhibiting mainly negative charges to the solvent while the strands β1 and β3 and helix α1 contain primarily positive or hydrophobic surface residues.
FIG. 3. Electrostatic surface potential of nsp3a(1-112). Positive and negative electrostatic potential is represented in blue and red, respectively. On the left we show the surface of helices α2, α3, and 3₁₀ and of the loop between strands β3 and β4, which contain a high density of acidic residues (Fig. 1). On the right are shown the surface of helix α1 and strands β1, β2, and β4, which contain mainly neutral and basic residues. Positions of selected charged residues are indicated.
α1, nsp3a has two additional helices inserted between strands β2 and β3, and helix α2 is much longer in nsp3a than in nsp3d.
FIG. 4. Superposition of nsp3a(1-112) (green, regular secondary structures that superimpose with nsp3d; yellow, segments not present in nsp3d; gray, other segments) and the ubiquitin-like domain of nsp3d (31) (PDB code 2FE8) (red, regular secondary structures that superimpose with nsp3a; gray, other segments). The structure superposition was performed using the SSM module of Coot (7). Thirty Cα atoms were superimposed with an RMSD value of 2.22 Å, i.e., from nsp3a(1-112) residues 20 to 26, 40 to 46, 49 to 54, 87 to 91, and 100 to 104 and from nsp3d residues 725 to 731, 739 to 745, 748 to 753, 754 to 758, and 773 to 777.
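The residue ranges in the legend above can be checked for internal consistency: both lists should expand to the same number of paired atoms. The sketch below does that bookkeeping and also shows the RMSD formula for coordinates that have already been superimposed. It does not perform the superposition itself, which the authors carried out with the SSM module of Coot; the function names and the toy coordinates are hypothetical.

```python
import math


def expand_ranges(ranges):
    """Expand inclusive (start, end) residue ranges into a flat list."""
    return [r for start, end in ranges for r in range(start, end + 1)]


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    The coordinate sets are assumed to be superimposed already.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Residue ranges quoted in the Fig. 4 legend.
nsp3a = expand_ranges([(20, 26), (40, 46), (49, 54), (87, 91), (100, 104)])
nsp3d = expand_ranges([(725, 731), (739, 745), (748, 753), (754, 758), (773, 777)])
pairs = list(zip(nsp3a, nsp3d))
print(f"{len(pairs)} paired residues")

# Toy coordinates (hypothetical): every atom displaced by 2 units along x.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_b = [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
print(f"toy RMSD = {rmsd(toy_a, toy_b):.2f}")
```

Both range lists expand to 30 residues, in agreement with the thirty superimposed atoms stated in the legend.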
FIG. 5. 15N{1H}-NOE values plotted as relative intensities (Irel), versus the sequence of nsp3a(1-112). Diamonds represent experimental measurements, which are linked by straight lines along the sequence. Gaps represent proline residues, which lack a backbone 1H atom, or overlapping residues in the 15N-1H correlation spectrum that could not be integrated accurately. The experiment was recorded at a 1H frequency of 600 MHz, using a saturation period of 3.0 s and a total interscan delay of 5.0 s.
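For readers who want to reproduce a plot of this kind, the sketch below computes relative intensities under the common convention that Irel is the peak intensity with 1H saturation divided by the intensity in the reference spectrum without saturation. That definition, the 0.5 cutoff used to flag flexible residues, the function names, and the intensity values are all assumptions for illustration; none of them is taken from the article.

```python
def relative_noe(i_saturated: float, i_reference: float) -> float:
    """Assumed definition: Irel = I(saturated) / I(reference)."""
    if i_reference == 0:
        raise ValueError("reference intensity must be non-zero")
    return i_saturated / i_reference


def flexible_residues(peaks, cutoff=0.5):
    """Return sorted residue numbers whose Irel falls below the cutoff.

    peaks maps residue number -> (I_saturated, I_reference).
    """
    return sorted(res for res, (i_sat, i_ref) in peaks.items()
                  if relative_noe(i_sat, i_ref) < cutoff)


# Hypothetical peak intensities (arbitrary units).
peaks = {5: (-20.0, 100.0), 12: (30.0, 100.0),
         25: (80.0, 100.0), 60: (85.0, 100.0)}
flex = flexible_residues(peaks)
print("residues below cutoff:", flex)
```

Low or negative values indicate fast internal motion, which is how a disordered segment such as residues 1 to 19 would show up in such a plot.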
FIG. 6. (a) Superposition of the 2-D [15N,1H]-HSQC spectra of nsp3a(1-183) (blue) and nsp3a(1-112) (red). (b) High-contour-level presentation of a 2-D [15N,1H]-HSQC spectrum of nsp3a(1-183). (c) Heteronuclear NOE experiment with nsp3a(1-183), using a saturation period of 3.0 s and an interscan delay of 5.0 s. Negative and positive peaks are shown in pink and green, respectively.
nsp3a(1-112) is a monomer in solution. During the purification of nsp3a(1-112), we noticed that the retention volume of nsp3a by size exclusion chromatography (Superdex 75; Amersham) was lower than expected for a globular protein with a molecular mass of 12.6 kDa. In view of the implications for the structure determination and the biological activity of the protein, we decided to further investigate the oligomeric state of nsp3a(1-112) in solution using NMR diffusion experiments and PFO-PAGE.
In diffusion NMR experiments, the decay of the signal intensity versus the square of the magnetic field gradient was used to estimate the translational diffusion properties of the proteins (40). In Fig. 7a we compare data obtained for 1 mM solutions of nsp3a(1-112), RNase A, and chymotrypsinogen, which have molecular masses of 12.6 kDa, 13.7 kDa, and 25.0 kDa, respectively. The nsp3a(1-112) intensity decay curve is located between the two standards, which is indicative of the presence of the monomeric form, since the elongated shape of nsp3a(1-112) should result in a lower diffusion coefficient than near-spherical proteins with similar molecular masses. A PFO-PAGE gel also indicates that nsp3a(1-112) exists predominantly in the monomeric form at room temperature. The assays performed at the three different protein concentrations of 1 mM, 500 µM, and 250 µM (Fig. 7b) show that even at a 1 mM concentration the monomeric form predominates, and only a small amount of the dimeric form can be observed.
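The comparison described above rests on the slope of ln(I/I0) plotted against the squared gradient strength: for fixed experimental timing, a steeper decay corresponds to faster translational diffusion. The sketch below extracts that slope by linear least squares. The gradient values, the decay constants, and the labels are synthetic and hypothetical; they only illustrate how a sample curve can be ranked between two standards.

```python
import math


def decay_slope(gradients, intensities):
    """Least-squares slope of ln(I/I0) versus G^2.

    I0 is taken as the first intensity in the list.
    """
    x = [g ** 2 for g in gradients]
    y = [math.log(i / intensities[0]) for i in intensities]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def synthetic_decay(gradients, rate):
    """Noise-free decay I = exp(-rate * G^2); rate is a made-up constant."""
    return [math.exp(-rate * g ** 2) for g in gradients]


# Hypothetical gradient strengths (arbitrary units) and decay constants.
grads = [0.0, 10.0, 20.0, 30.0, 40.0]
rates = {"faster standard": 1.2e-3, "sample": 1.0e-3, "slower standard": 0.8e-3}
slopes = {name: decay_slope(grads, synthetic_decay(grads, k))
          for name, k in rates.items()}
for name, slope in slopes.items():
    print(f"{name}: slope = {slope:.2e}")
```

Converting a slope into an absolute diffusion coefficient would additionally require the gradient calibration, pulse lengths, diffusion delay, and gradient shape factor, which is why only relative slopes are compared here.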
FIG. 7. Study of the oligomeric state of nsp3a(1-112). (a) Data obtained from NMR diffusion experiments at 700 MHz. The relative NMR signal intensity, ln(I/I0), is plotted versus the square of the gradient field strength, G², for nsp3a(1-112), ribonuclease A, and chymotrypsinogen. (b) PFO-PAGE of nsp3a(1-112); the sizes of the protein complexes were estimated from the benchmark protein ladder shown on the left (Invitrogen). The protein concentration increases from right to left in three steps of 250 µM, 500 µM, and 1 mM. The filled arrowheads indicate the positions of the monomeric (12.6 kDa) and dimeric (25.2 kDa) forms of nsp3a(1-112).
FIG. 9. Mass spectrum of the isolated ssRNA fragment. The proposed structures for the main peaks are presented together with their corresponding molecular weights and atomic composition.
FIG. 10. Association of nsp3a(1-183) and nsp3a(1-112) purified from E. coli with nucleic acids. (a) Nucleic acid was visualized with SYBR-gold staining before or after digestion with nucleases specific to DNA (DNase I or T7 endonuclease) or RNA (RNase I, RNase A, or RNase T1). Cleavage assays were performed at 37°C for 1 h, and digested samples were analyzed by native electrophoresis on precast 6% polyacrylamide gels. Open arrowheads denote copurified nucleic acid species associated with nsp3a(1-112) or nsp3a(1-183), respectively. (b) EMSAs were performed to estimate the RNA binding affinity of nsp3a(1-112). Samples containing ssRNA1 or ssRNA2 were incubated at 37°C for 1 h with variable concentrations of protein and analyzed by native electrophoresis on precast 6% polyacrylamide gels. RNA was detected by SYBR-gold poststain, and the fraction of bound RNA was calculated relative to the maximum binding observed in each experiment. Lane P, protein only; lanes 0, ssRNA only; lanes 1 to 7 (left panel), ssRNA with twofold dilutions of protein from a final concentration of 128 µM to 2 µM for ssRNA1; lanes 2, 4, 6, and 8 (right panel), ssRNA with fourfold dilutions of protein from 64 µM to 1 µM for ssRNA2. Electrophoretic mobilities of free (f) and bound (b) forms of each ssRNA species are indicated with arrowheads. (c) ssRNA1-binding at variable concentrations of nsp3a(1-112), as calculated from the EMSA data shown in panel b.
FIG. 11. EMSAs were performed to evaluate the affinity of nsp3a(1-112) for different nucleic acid species. (a) Gels obtained after loading mixtures of nsp3a(1-112) with 10 different ssDNA fragments (1 to 10). Lanes labeled P and M correspond to nucleic acid-free protein and nucleic acid marker, respectively. Comparison of the two gels, using nucleic acid-specific (left) and protein-specific (right) stains, indicates that nsp3a(1-112) does not exhibit affinity for ssDNAs. (b) Gels containing decreasing concentrations (100 to 1.6 µM) of nsp3a(1-112), in the presence of 800 ng of an ssRNA 40-mer lacking the sequence AUA (left), a double-stranded RNA 20-mer (center), and an ssDNA 40-mer (right). In lanes labeled N, only nucleic acid species were loaded. No interaction of nsp3a(1-112) and nucleic acids (NA) was observed under any of the above conditions. All experiments were performed after incubation of nsp3a(1-112) and the corresponding nucleic acid fragment for 1 h at 37 °C.
α1, and the helices α1 and 3₁₀, which contain a surplus of negatively charged amino acid side chains (Fig. 3). There is thus an indication that these chemical shift perturbations might result primarily from long-range effects on the protein conformation rather than from direct protein-RNA contacts.
FIG. 12. (a) Superposition of the [15N,1H]-HSQC spectra of nsp3a(1-112) in the absence (blue) and presence (red) of a fourfold excess of the exogenous ssRNA2 (see text). (b) Plot versus the amino acid sequence of the chemical shift changes in the backbone 1HN-15N moieties of nsp3a(1-112) due to ssRNA2 binding. $\Delta\delta_{av}$ is a weighted average of the 1H and 15N chemical shift differences determined from comparison of the [15N,1H]-HSQC spectra shown in panel a: $\Delta\delta_{av} = \{0.5[\Delta\delta(^{1}H^{N})^{2} + (0.2\,\Delta\delta(^{15}N))^{2}]\}^{1/2}$. (c) Superposition of the [15N,1H]-HSQC spectra of nsp3a(1-112) in the absence (blue) and presence (red) of a fourfold excess of Octa-U.
α3-helices is less well conserved, and helix α3 actually appears to be absent in the groups 1 and 3 CoVs. Additionally, the regions corresponding to β1 and α1 in nsp3a exhibit a high number of conservative amino acid substitutions. It is worth mentioning that β1, α1, and β4 define the positively charged surface areas of nsp3a (Fig. 3, right-hand panel). The α1- and 3₁₀-helices, which have not been observed in other ubiquitin-like proteins, seem to be important for the interaction of nsp3a with ssRNAs, since they exhibit extensive chemical shift perturbations upon ssRNA interaction and since other ubiquitin homologues do not exhibit RNA binding activity. Although the observed affinity of nsp3a for ssRNA cannot by itself define a unique biological function, it seems to be important for the overall nsp3 biological role. As indicated above, nsp3 is a large multidomain protein, and only two of its domains, nsp3b and nsp3d, have been structurally and functionally characterized to date. The analysis of these domains indicates that nsp3 is a multifunctional protein involved in multiple biological processes, such as proteolysis (31) and RNA processing (34). The fact that the presently studied N-terminal region of nsp3 and two of its other domains, nsp3c and nsp3e, exhibit RNA binding activity (B. W. Neuman et al., unpublished data) together with the ADP-ribose-1''-phosphate dephosphorylation activity of nsp3b (34) suggests that this protein could also be involved in the replication and processing of viral RNA. Although the short sequences AUA and GAUA are common in the genome, a possible biological function for the sequence-specific RNA-binding activity observed for nsp3a might be in binding to the 5' end of the SARS-CoV genome. The sequence AUA occurs several times in the 5' untranslated region (UTR) of the genome, including at the extreme 5' end. Proteins that specifically recognize the 5' UTR might function in cap-dependent translation or, alternatively, in genome replication or subgenomic RNA synthesis.
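Locating the short motifs AUA and GAUA in an RNA sequence is a simple substring scan. The sketch below applies it to the two EMSA substrates whose sequences are given in Materials and Methods; the function name is hypothetical, and the same scan could be run on a 5' UTR sequence, which is not reproduced here.

```python
def find_motif(rna: str, motif: str):
    """Return 0-based start positions of all occurrences of motif in rna.

    Overlapping occurrences are reported separately.
    """
    rna, motif = rna.upper(), motif.upper()
    width = len(motif)
    return [i for i in range(len(rna) - width + 1)
            if rna[i:i + width] == motif]


# Sequences as given for the EMSA substrates.
SSRNA1 = "AAAUACCUCUCAAAAAUAACACCACACCAUAUACCACAU"
SSRNA2 = "GGGGAUAAAA"

for name, seq in (("ssRNA1", SSRNA1), ("ssRNA2", SSRNA2)):
    print(name,
          "AUA at", find_motif(seq, "AUA"),
          "GAUA at", find_motif(seq, "GAUA"))
```

Both substrates contain AUA, while only ssRNA2 contains GAUA, because ssRNA1 has no guanine.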
The observation of two ubiquitin-like structures within nsp3 (nsp3a and the N-terminal domain of nsp3d) has important implications in attempting to assign its likely biological function. In addition to being a cysteine protease, nsp3d is also a potent deubiquitinating enzyme that has been extensively studied (2, 3, 31). It has been speculated earlier that the ubiquitin-like domain of SARS-CoV nsp3d might act as a decoy for cellular ubiquitinating enzymes, thereby protecting nascently synthesized viral proteins from proteasome-mediated degradation. Alternatively, the two ubiquitin-like domains might be involved in modulation of protein-protein interaction pathways of cellular immunomodulators, such as interferons and ISGylating enzymes. This view is reinforced by the structural similarity of the two ubiquitin-like domains of nsp3 with ISG15, an interferon-stimulated gene that is induced as a primary response to diverse stimuli, including viral infections. The SARS-CoV proteins 3b and 6 and the nucleocapsid protein have recently been shown to function as effective interferon antagonists (16).
It seems possible that other SARS-CoV proteins, such as nsp1 (13) (and possibly host proteins as well) might also be part of these pathways, acting at either the RNA or protein levels. Several studies probing the intricate interplay of viral and host proteins during the progression of the SARS-CoV viral cycle have been reported (22, 33, 39). Since the biological role of nsp3a still remains unclear, structural homology studies could at this point provide insights into the potential function of this domain and its role within the viral cycle.
nsp3a exhibits 3-D structural similarity with Ras-interacting domains.
Many of the structural homologues of nsp3a interact with other polypeptides to regulate processes such as protein degradation, cell signaling (12), and antiviral response (24). It seems significant that five of them are Ras-interacting proteins. Based on the primary sequences, the ubiquitin α/β-roll superfold comprises five families (14). Members of three of these families, RA (RalGDS/AF6 Ras-association domain), RBD (Raf-like Ras-binding domain) and PI3K_rbd (Ras-binding domain of phosphatidylinositol 3-kinase-like proteins), interact with Ras (14). A large fraction of the structural homologues of nsp3a(1-112) identified using the software DALI are members of these families. The protein nsp3a(1-112) has the highest structural similarity with the Ras-interacting domain (RID) of RalGDS, a member of the RA family with which it shares the topology of the ubiquitin-like fold (Fig. 13d). This effector of Ras is a stimulator of the guanine nucleotide dissociation mechanism specific for Ral. RID-RalGDS binds Ras through its C-terminal domain and presents low sequence identity with other Ras-interacting proteins but similar hydrophobic profiles (12). The superposition of the 3-D structures of RID-RalGDS and nsp3a(1-112) reveals a region with conserved residues located in strand β1 of nsp3a(1-112) (Fig. 13a and b) that is intimately involved in the Ras contact interface. Similarly, the Ras-binding domain of the AF6 protein (29), which is also a member of the RA family, shows 3-D structural homology with nsp3a(1-112) (Fig. 13d) and similar residues located in the β1 region (Fig. 13b). Both RalGDS and AF6 are known as Ras effectors. Similar patterns are also found in other RA domains with significant levels of structural homology with nsp3a(1-112), e.g., the human Grb7 protein and the guanine nucleotide exchange factor for Rap1 (25).
FIG. 13. Comparison of nsp3a with Ras-interacting proteins. (a) In a complex consisting of a Ras dimer (gray) bound to two RID-RalGDS subunits (yellow) (PDB code 1LFD), nsp3a(1-112) (red) is superimposed on one of the two RID subunits. The residues used for the superposition were identified using the software DALI with the NMR structure of nsp3a(1-112) and the X-ray structure of the Ras-RID-RalGDS complex (12): for nsp3a(1-112), residues 17 to 29, 33 to 37, 41 to 63, 83 to 87, 88 to 94, 95 to 98, and 101 to 108; for RID-RalGDS, residues 14 to 26, 27 to 31, 32 to 54, 55 to 59, 63 to 69, 74 to 77, and 93 to 100. The Cα atoms of these residues could be superimposed with an RMSD of 2.3 Å. (b) Sequence alignment of a dodecapeptide containing strand β1 of nsp3a (box) with the corresponding segments in some members of the Ras-interacting protein family, with the residue numbers of nsp3a indicated. (c) Electrostatic potential surfaces of nsp3a(1-112), RID-RalGDS, and Ra-AF6. The positions of the conserved residues corresponding to R23 in nsp3a(1-112) are indicated. (d) Ribbon presentations of the same structures as in panel c.
Structure and potential functional role of the C-terminal Glu-rich subdomain. The Glu-rich C-terminal polypeptide segment 113 to 183 of nsp3a shows less than 25% sequence identity with the corresponding acidic regions in other CoV genomes, whereas the SARS-CoV protein contains overall a somewhat higher percentage of acidic residues than the acidic regions of other CoVs. Similar motifs are found in some eukaryotic proteins (11, 23). In mammals, these acid-rich polypeptide segments are mainly involved in the transport of mRNA from the nucleus to the cytoplasm by association with RNA binding proteins. For example, the pp32/leucine-rich acidic protein associates with HuR, which binds to AU-rich elements of mRNAs to export mRNAs from the nucleus to the cytoplasm (11). Interestingly, Higashino et al. reported that several viruses interfere with this transport in order to increase the production of their virions by the cellular machinery (11).
The NMR data of Fig. 5 and 6 now show that the Glu-rich region of nsp3a forms a flexible tail attached to the globular region of residues 1 to 112. Although sequence similarity to other proteins is not identifiable, several cellular proteins also contain regions with high percentages of acidic residues. The homopolymer of glutamic acid, poly-L-Glu, is unstructured at pH 8 but can adopt helical structures at pH 5 (15). Some acidic regions of polypeptides exhibit a well-defined regular secondary structure when interacting with other proteins. The structure of the Ran (35) complex with RanBP1 and RanGAP presents examples of both situations. Ran is a nuclear Ras-related protein that regulates both transport between nucleus and cytoplasm and the formation of the mitotic spindle or nuclear envelope in dividing cells (35). The C-terminal region of RanGAP, which is important for binding affinity, exhibits an acidic motif that is flexibly disordered in both the complexed and the uncomplexed forms of RanGAP, whereas other acid-rich segments of this protein comprise folded secondary structure elements. Given the high degree of similarity of nsp3a(1-112) with some of the other polypeptides involved in protein-protein interaction processes, as well as its location in the large multidomain protein nsp3, the long, flexibly extended Glu-rich segment could have an important role in interactions with other SARS-CoV or host cell molecules, and this domain might adopt a well-defined fold during interactions with other polypeptides. Overall, the structural data reported in this paper indicate that the globular and nonglobular subdomains of nsp3a are important for SARS-CoV infection and persistence and thus represent new potential targets for therapeutic intervention.
Published ahead of print on 29 August 2007.