Journal of Virology, September 2006, p. 8379-8389, Vol. 80, No. 17
0022-538X/06/$08.00+0 doi:10.1128/JVI.00750-06
Copyright © 2006, American Society for Microbiology. All Rights Reserved.
Integrated Program in Cellular, Molecular and Biophysical Studies,1 Howard Hughes Medical Institute,2 Department of Biochemistry and Molecular Biophysics, College of Physicians and Surgeons, Columbia University, New York, New York 100323
Received 12 April 2006/ Accepted 17 May 2006
ΔC). The 1.6-Å resolution structure resembles the known structures of the human immunodeficiency virus type 1 (HIV-1) and Escherichia coli RNase H. The structure revealed the coordination of a magnesium ion within the catalytic core comprised of the highly conserved acidic residues D524, E562, and D583. Surface charge mapping of the Mo-MLV structure revealed a high density of basic charges on one side of the enzyme. Using a model of the Mo-MLV structure superimposed upon a structure of HIV-1 reverse transcriptase bound to an RNA/DNA hybrid substrate, Mo-MLV RNase H secondary structures and individual amino acids were examined for their potential roles in binding substrate. Identified regions included Mo-MLV RNase H β1-β2, αA, and αB and residues from αB to αD and its following loop. Most of the identified substrate-binding residues corresponded with residues directly binding nucleotides in an RNase H from Bacillus halodurans as observed in a cocrystal structure with RNA/DNA. Finally, superimposition of RNases H of Mo-MLV, E. coli, and HIV-1 revealed that a loop of the HIV-1 connection domain resides within the same region of the Mo-MLV and E. coli C-helix. The HIV-1 connection domain may serve to recognize and bind the RNA/DNA substrate major groove.
The structures of several RNases H have been determined, including those from Escherichia coli, human immunodeficiency virus type 1 (HIV-1; both alone as a subdomain and also in the context of the RT holoenzyme), Thermus thermophilus HB8, archaeal RNase HII, and, more recently, Bacillus halodurans RNase H bound to an RNA/DNA hybrid (10, 21, 26, 27, 30, 32, 37, 50). The structures of these enzymes are quite similar to each other. Moreover, the crystal structures of other enzymes that possess nuclease activity show tertiary folding that is very similar to that of the RNases H. These include the catalytic domain of HIV-1 integrase, phage Mu transposase, RuvC (an endonuclease that cleaves Holliday junctions), E. coli exonuclease I, and the exonuclease domain of E. coli DNA polymerase I (1, 3, 8, 12, 39). A high-resolution Mo-MLV RNase H structure should extend the basis for comparisons among the family members and provide a physical explanation for the distinct characteristics of the Mo-MuLV polymerase and RNase H activities.
All of the RNases H and the aforementioned nucleases contain a conserved catalytic triad of acidic residues that coordinate one or two Mg2+ or Mn2+ cations. RNases H contain four conserved acidic residues that coordinate divalent cation binding, though only the first three seem to be required for activity (25). Both a one-metal-ion and a two-metal-ion mechanism of transesterification have been postulated for the RNases H (7, 10, 15, 28, 35, 38, 44, 51). The cocrystal structure reported by Nowotny et al. reveals two magnesium ions bound within the active site (37). These authors proposed a two-metal-ion-dependent mechanism of action, with one magnesium to activate a nucleophile and the other to stabilize the transition state.
Retroviral RNases H from different viruses have broadly similar activities, although there are important differences, including their recognition of substrate structure and sequence (14). Amino acid sequence alignments show that Mo-MLV RNase H and E. coli RNase H both contain a positively charged α-helix (the C-helix) and loop that are absent in HIV-1 and the avian sarcoma-leukosis virus RNases H (Fig. 2) (10, 22, 23, 26, 50). Modeling with the E. coli enzyme suggests that the C-helix facilitates contacts with the RNA/DNA substrate, and functional studies confirm that the C-helix contributes to nucleic acid binding (24). Whereas Mo-MLV RT ΔC retains low in vitro RNase H activity, viruses encoding the Mo-MLV RT ΔC mutant enzyme do not replicate within cells (6, 46). Such results suggest that the C-helix has a specific in vivo role other than simple nonspecific RNA/DNA substrate binding. Mutational analysis of the C-helix in Mo-MLV RT has identified specific residues that are important for both polymerase and RNase H activity (33). That study demonstrated the importance of the C-helix in the efficiency and specificity of PPT recognition and cleavage.
FIG. 2. Structural alignment of Mo-MLV, E. coli, HIV-1, and B. halodurans (Bh) RNases H. Each Mo-MLV RNase H secondary structure is marked above in a color scheme of secondary structures similar to that shown in Fig. 1A to C. The putative helix C is marked with a dotted blue line above, and all deleted residues from the Mo-MLV RNase H sequence are underscored in orange. Residues that contact the RNA strand are labeled in blue, while residues contacting DNA are highlighted in red. Residues contacting both RNA and DNA are shown as red on blue. Hydrophobic core residues are highlighted in gray. Highly conserved catalytic aspartate and glutamate residues are highlighted in yellow (Mo-MLV residues 524, 562, 583, and 653). The three conserved regions are boxed.
ΔC) lacks detectable activity (46). The independently expressed HIV-1 RNase H domain, which naturally lacks the C-helix, also lacks enzymatic activity. Insertion of the E. coli C-helix into the single-domain HIV-1 RNase H, remarkably, restores its activity (29, 43). Reconstitution of nuclease activity can also occur by the addition of the entire p51 subunit or the thumb and connection subdomains in trans (20, 41). The presence of fusion partners or epitope tags during expression can also permit the preparation of separate HIV-1 RNase H domains with enzymatic activity in some cases (13, 16). These observations suggest that the C-helix may stabilize the isolated RNase H domain or its interaction with substrate.
The study undertaken here was initially aimed toward crystallizing the wild-type Mo-MLV RNase H. These efforts failed but did result in the successful crystallization and structure determination of Mo-MLV RNase H ΔC. An analysis of this structure and a comparison with the known structures of HIV-1 RT and other RNases H are presented here. This structure has already been important in studies with the complete Mo-MuLV RT. A previously determined full-length Mo-MLV RT structure exhibited a high degree of disorder, especially in the region of the RNase H domain (D. Das and M. M. Georgiadis, unpublished data). The full-length Mo-MLV RT structure, however, was ultimately resolved with the use of the high-resolution RNase H ΔC structure described here (9).
ΔC as a template, resulting in a 489-bp fragment. pNCS ΔC contains a deletion of the C-helix of the Mo-MLV RNase H (deletion of residues I593 through L603); synthesis of this clone was previously described (46). The fragments were ligated into the bacterial expression vector pGEX-3x, which contains glutathione S-transferase (GST) and the BamHI-SmaI-EcoRI cloning site 3' of GST to form fusion proteins. Clones were sequenced to verify the fidelity of polymerase amplification and cloning.
Oligonucleotides.
The 5' primer for PCR amplification of the final construct was 5'-CGGGATCCTGGCCGAAGCCCACGGAACCCGA-3'. The 3' primer used for both constructs was 5'-CGGAATTCTCTAGAGGAGGGTAGAGGTGTCTGG-3'. Oligonucleotides were synthesized at the Howard Hughes Protein Chemistry Core Facility.
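As a quick sanity check on the cloning strategy, the two primer sequences quoted above can be scanned for the restriction sites of the pGEX-3x cloning site. The sketch below is only illustrative: the function name `find_sites` is hypothetical, and XbaI is included as an extra enzyme that the text does not mention. The recognition sequences themselves are the standard ones for these enzymes.

```python
# Standard recognition sequences. BamHI, SmaI, and EcoRI are the enzymes named
# for the pGEX-3x cloning site; XbaI is an extra added here for illustration.
RECOGNITION_SITES = {
    "BamHI": "GGATCC",
    "SmaI": "CCCGGG",
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
}

# Primer sequences as quoted in the text (5' to 3').
FORWARD_PRIMER = "CGGGATCCTGGCCGAAGCCCACGGAACCCGA"
REVERSE_PRIMER = "CGGAATTCTCTAGAGGAGGGTAGAGGTGTCTGG"


def find_sites(sequence, sites=RECOGNITION_SITES):
    """Return {enzyme: [0-based start positions]} for each site found."""
    sequence = sequence.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = sequence.find(motif)
        while start != -1:
            positions.append(start)
            # Step by one so overlapping occurrences are not missed.
            start = sequence.find(motif, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    print("5' primer:", find_sites(FORWARD_PRIMER))
    print("3' primer:", find_sites(REVERSE_PRIMER))
```

Under these assumptions the 5' primer carries a BamHI site and the 3' primer an EcoRI site, each two bases in from the 5' end, which is consistent with directional cloning into the BamHI-SmaI-EcoRI site described above.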
Protein purification.
Relevant bacterial expression plasmids included a pGEX-3x plasmid that contained our most stable and enzymatically active Mo-MLV RNase H (45, 46) and a second plasmid that contained the RNase H ΔC construct. Each was used to transform a BL-21-CodonPlus(DE3)-RIL or -RIL-X strain of E. coli (-RIL-X is methionine auxotrophic). Standard growth conditions were utilized for nonselenomethionyl protein preparations, whereas selenomethionyl protein preparations utilized a nonauxotrophic protocol (even though the cell strains were auxotrophic) (11). Cells were induced with 0.1 mM IPTG (isopropyl-β-D-thiogalactopyranoside), harvested, and resuspended in 200 mM NaCl-50 mM Tris-HCl (pH 8)-1 mM EDTA-5 mM dithiothreitol (DTT). Protease inhibitors were added (aprotinin, 1 µg/ml; leupeptin, 1 µg/ml; pepstatin, 1 µg/ml; phenylmethylsulfonyl fluoride, 100 µg/ml), and the cells were then lysed with lysozyme and sonication. The suspension was next centrifuged to remove cellular debris. Clarified supernatants were collected and rocked in a 50% slurry of glutathione-Sepharose beads for 30 min at 4°C. Beads were centrifuged and washed three times with resuspension buffer and three times with Factor Xa buffer (250 mM NaCl, 50 mM Tris-HCl [pH 7.5], 1 mM CaCl2). For each milliliter of glutathione-Sepharose (100% bed volume), 50 µg of Factor Xa (Boehringer Mannheim) was added, and the beads were incubated at 4°C for 16 h. Beads were then centrifuged, and supernatants were collected and modified by the addition of 100 µg of phenylmethylsulfonyl fluoride/ml, 5 mM DTT, and 2 mM EDTA.
The protein solution was dialyzed into 100 mM NaCl-10 mM PIPES (pH 6.5)-0.1 mM EDTA-5 mM DTT, and the proteins were then purified over a mono-S, cation-exchange chromatography column (Pharmacia). Protein solutions were then concentrated and run through a Superdex 200 gel filtration column (Pharmacia). Proteins were analyzed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis and Coomassie blue staining and were then concentrated and quantitated by the Bradford method. Protein solutions were >95% pure, and small samples were submitted for mass spectrometry and limited N-terminal sequencing to verify mass and protein identity. Cloning into pGEX-3x resulted in the addition of glycine to the N terminus of the viral RNase H sequence after Factor Xa treatment. Mass spectrometry of selenomethionyl proteins revealed >99% selenomethionine incorporation.
Gel filtration resulted in the isolation of one fraction for wild-type RNase H but, unexpectedly, two fractions for RNase H ΔC (data not shown). Wild-type RNase H eluted just under the 20-kDa range, which was consistent with its expected size of 19 kDa. The larger ΔC fraction eluted in the 35-kDa range, which was consistent for an RNase H ΔC dimer. The smaller ΔC fraction eluted under the 20-kDa range, which was closer to the expected 17.7-kDa size for a ΔC monomer. About 10 to 15% of the RNase H ΔC protein consistently eluted in the supposed dimer form. Further investigation of possible Mo-MLV RNase H ΔC dimerization has been hampered thus far by the poor quality of its X-ray diffraction data (explained below).
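The assignment of the two ΔC fractions as monomer and dimer follows from comparing each apparent mass with the expected monomer mass. A minimal sketch of that arithmetic is given below; the function name is hypothetical, and the 20-kDa and 35-kDa inputs are the approximate elution ranges quoted above rather than measured values.

```python
def apparent_oligomer_state(apparent_kda, monomer_kda):
    """Nearest whole number of subunits implied by an apparent mass."""
    if apparent_kda <= 0 or monomer_kda <= 0:
        raise ValueError("masses must be positive")
    return max(1, round(apparent_kda / monomer_kda))


if __name__ == "__main__":
    monomer_kda = 17.7  # expected RNase H ΔC monomer mass from the text
    for label, apparent in (("larger fraction", 35.0), ("smaller fraction", 20.0)):
        n = apparent_oligomer_state(apparent, monomer_kda)
        print(f"{label}: ~{apparent} kDa -> {n} subunit(s)")
```

Gel filtration reports hydrodynamic size rather than true mass, so an estimate of this kind only supports, and does not prove, the dimer assignment.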
Crystallization and data collection.
Proteins were concentrated to 7- to 15-mg/ml solutions depending on the yield and used in crystal screens (Hampton Research) by hanging-drop vapor diffusion at 20°C. Crystallization attempts with concentrated wild-type Mo-MLV RNase H resulted in no crystals. Crystals, however, were obtained for RNase H ΔC.
Crude crystals for both ΔC monomers and dimers were initially isolated in drops equilibrated against a reservoir solution of 30% PEG 4000 and 0.2 M ammonium sulfate. Starting drops contained equal volumes of stock protein solution (7 to 10 mg/ml; 10 mM HEPES [pH 7], 150 mM NaCl, 5 mM DTT, 0.1 mM EDTA) and reservoir solution. Single, large crystals were regularly grown for ΔC dimers with a reservoir solution that contained buffer between pH 5.2 and 6.0, NaCl, and polyethylene glycol (PEG; molecular weight of 1,450, 3,000, or 4,000). Crystallization occurred either with or without the presence of ammonium sulfate. The concentrations of protein, PEG, and ammonium sulfate were all correlated: the higher the protein concentration, the less PEG and/or ammonium sulfate required. Small crystals were visible within a few hours and grew to terminal size by 5 days. Seeding was not required for single crystal growth of ΔC dimers. Typical crystals were rhomboidal, with dimensions of 0.4 by 0.15 by 0.08 mm. Similar-sized crystals of selenomethionyl RNase H ΔC dimers were obtained under the same conditions.
Crystals were diffracted at X4A at the National Synchrotron Light Source at Brookhaven National Laboratory. The ΔC dimer crystals had unit cell parameters of a = b = 40.60 Å, c = 158.03 Å, α = β = 90°, and γ = 120°. The solvent content was 40.5%, and the unit cell volume was 225,585 Å³. The number of monomers per asymmetric unit was estimated to be 2 but was not confirmed, and the space group of the crystals was either P3₁ or P3₂. Although morphologically very beautiful, the ΔC dimer crystals were not good candidates for structural work because of their weak diffraction. A full four-wavelength data set was collected on these crystals, but the data were weak and possessed high Rsym values. Statistically, the crystals were unsatisfactory for MAD (multiwavelength anomalous diffraction) phasing beyond 3.7 Å and refinement to 3 Å.
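The reported unit cell volume can be checked directly from the cell parameters with the general (triclinic) volume formula, V = abc·sqrt(1 − cos²α − cos²β − cos²γ + 2·cosα·cosβ·cosγ), which reduces to a²c·sin(120°) for this cell. The sketch below assumes nothing beyond the parameters quoted above; the function name is illustrative.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Unit cell volume in cubic angstroms from edges (Å) and angles (degrees)."""
    ca = math.cos(math.radians(alpha_deg))
    cb = math.cos(math.radians(beta_deg))
    cg = math.cos(math.radians(gamma_deg))
    # General triclinic expression; valid for every crystal system.
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


if __name__ == "__main__":
    # ΔC dimer crystal: a = b = 40.60 Å, c = 158.03 Å, α = β = 90°, γ = 120°.
    volume = unit_cell_volume(40.60, 40.60, 158.03, 90.0, 90.0, 120.0)
    print(f"dimer cell volume: {volume:.0f} Å^3")
```

The result is about 225,590 Å³, which agrees with the reported 225,585 Å³ to within rounding of sin(120°).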
The ΔC monomer crystallization failed to yield single, large crystals. Instead, crystallization was limited to needle bundles, with the largest needles occurring with a reservoir solution that contained pH 5 buffer composed of NaCl, zinc sulfate, PEG 1500, PEG MME 550 (MME referring to monomethyl ether), ammonium sulfate, 2-propanol, and 2-morpholinoethanesulfonic acid (MES). Small nucleations were visible within 1 day and grew to terminal size by 1 week. Seeding did not result in single crystal formation or any improvement in needle size. Similar crystallization results were obtained with a selenomethionyl derivative of RNase H ΔC monomers. The needle bundles were not amenable to crystallographic studies, but individual needles broken from the clusters proved to diffract exceptionally well.
The ΔC monomer crystal was of space group P1 and had unit cell parameters of a = 32.19 Å, b = 33.96 Å, c = 34.40 Å, α = 78.11°, β = 69.77°, and γ = 64.80°. The solvent content was 30%, and the unit cell volume was 31,852 Å³. The number of monomers per asymmetric unit was 1. A complete four-wavelength MAD data set was collected from one exceptional ΔC monomer crystal needle.
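Solvent content is conventionally estimated from the Matthews coefficient, V_M = V / (Z × M), with the solvent fraction approximated as 1 − 1.23 / V_M. The sketch below applies that standard estimate to the two crystal forms. Several inputs are assumptions rather than values stated by the authors: the 17.7-kDa nominal monomer mass is taken from the gel filtration discussion, the dimer crystal is assumed to hold six monomers per cell (three asymmetric units in P3₁ or P3₂, two monomers each), and 1.23 is the usual protein-density constant.

```python
def matthews_coefficient(cell_volume_a3, molecules_per_cell, mass_da):
    """Matthews coefficient V_M in Å^3 per dalton."""
    return cell_volume_a3 / (molecules_per_cell * mass_da)


def solvent_fraction(v_m, protein_constant=1.23):
    """Approximate solvent fraction from V_M (conventional 1.23 constant)."""
    return 1.0 - protein_constant / v_m


if __name__ == "__main__":
    mass_da = 17_700  # assumed nominal ΔC monomer mass
    crystals = {
        # name: (reported cell volume in Å^3, assumed monomers per unit cell)
        "dimer form (P3_1 or P3_2)": (225_585, 6),
        "monomer form (P1)": (31_852, 1),
    }
    for name, (volume, z) in crystals.items():
        v_m = matthews_coefficient(volume, z, mass_da)
        print(f"{name}: V_M = {v_m:.2f} Å^3/Da, solvent ~ {solvent_fraction(v_m):.0%}")
```

With these inputs the estimates come out near 42% and 32%, slightly above the reported 40.5% and 30%. A somewhat larger assumed protein mass would narrow that gap, so the sketch is best read as a rough consistency check rather than a reproduction of the authors' calculation.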
For cryo experiments during X-ray diffraction, ΔC monomer crystals were soaked in 15% PEG 4000-20% PEG 400-0.1 M ammonium sulfate-150 mM NaCl-1% 2-propanol-1.25% PEG MME 550-5 mM MES (pH 6.5)-0.5 mM zinc sulfate for 5 min. Crystals were then flash frozen in liquid nitrogen and maintained at 100 K in a nitrogen Oxford Cryosystem. The MAD data set was collected at the NSLS Beamline X4A on a CCD Quantum 4 detector.
Protein structure accession numbers. Structure coordinates have been deposited in the RCSB Protein Data Bank and assigned RCSB identification code rcsb038154 and PDB identification code 2HB5.
ΔC variants of the domain, however, resulted in the preparation of large and useful crystals (see Materials and Methods). The most stable RNase H ΔC domain isolated for crystallization was identical to one previously constructed (46). The selenomethionyl variant of the Mo-MLV RNase H ΔC was expressed as a fusion protein with GST in methionine auxotrophic bacteria and isolated by using glutathione-linked Sepharose beads, Factor Xa cleavage release, and further purification through cation-exchange and gel filtration columns. Successful RNase H ΔC crystallization was achieved by standard crystallization screening methods. Crystals were diffracted by using four wavelength data sets and analyzed using MAD phasing. Diffraction data and structure determination were performed by using standard crystallography software programs. Refinement resolved the structure to a 1.6-Å resolution. Pertinent statistics for structure determination are provided in Table 1.
TABLE 1. Diffraction data and structure refinement
ΔC structure reveals a central five-stranded, mixed β-sheet (four parallel and one antiparallel) surrounded by four α-helices (Fig. 1A to C). Despite only sharing 26% amino acid identity with E. coli RNase H and 18.5% identity with HIV-1 RNase H, Mo-MLV RNase H ΔC has an overall structure very similar to that of both enzymes (Fig. 2 and 3). Several gross differences between the RNase H structures are apparent. Both the sequence alignment and the ribbon diagrams show that Mo-MLV RNase H ΔC has a longer β-strand 1 and α-helix E and a shorter β-strand 3 compared to the other two RNases H. The longer β-strand 1 actually bends toward α-helix E, making the loop between β1 and β2 lie prominently on the side of the entire five-stranded β-sheet where α-helix E resides alone. In E. coli RNase H the loop between β1 and β2 actually bends away from α-helix E, while in HIV-1 RNase H, the loop bends only slightly toward α-helix E. In Mo-MLV RNase H ΔC, the β1-to-β2 transition region is held by hydrogen bonding between the backbone amide and carbonyl oxygen of Q530 to the backbone carbonyl oxygen and amide of Q533, respectively. It is likely that this loop and surrounding region closely interact with the RNA/DNA hybrid minor groove based on the E. coli RNase H structure modeled with the substrate (50). The same bending toward α-helix E is observed for the transition region between Mo-MLV RNase H ΔC β2 and β3. In this region, the side chain hydroxyl group of T541 forms a hydrogen bond with the side chain hydroxyl group of T542, causing a sharp bend in β2 toward α-helix E. No functional assignment has yet been determined for this region through mutational analysis.
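The sequence identities quoted in this section (26% with E. coli RNase H, 18.5% with HIV-1 RNase H) are the kind of figure obtained by counting identical residues across an alignment. A minimal sketch of that calculation follows. The aligned strings are hypothetical placeholders, not the sequences from Fig. 2, and the choice to count only columns in which neither sequence has a gap is an assumption; other denominators are in common use and give different percentages.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over alignment columns without gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence is gapped
        compared += 1
        if res_a == res_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Hypothetical toy alignment for illustration only.
    seq_a = "ACDE-FG"
    seq_b = "ACDQKFG"
    print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```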
FIG. 1. Ribbon diagrams of the Mo-MLV RNase H ΔC in three different perspectives. The structure shows a central five-stranded, mixed β-sheet surrounded by four α-helices. The N terminus (NH3-) to the C terminus (-COOH) is color coordinated from dark blue, green, yellow, orange, and finally red. The magenta-shaded loop represents the unsolved His-loop between β5 (darker yellow) and αE (red), residues P636 to K640. Strictly for ribbon diagram purposes only, we have represented this loop as a best fit from the E. coli RNase H structure. The gray sphere represents magnesium. The figure was made by using MOLSCRIPT and Raster 3D (31, 34). (A) Mo-MLV RNase H is oriented as if the fingers, palm, and thumb of the polymerase domain are above the RNase H (top of page) and opened in right-handed fashion with the fingers on the left, the palm in the middle, and the thumb on the right. (B) Mo-MLV RNase H is rotated +145° around the x axis compared to Fig. 1A. This orientation best shows the putative active site and substrate binding face of the protein. (C) Mo-MLV RNase H is rotated +45° around the y axis compared to Fig. 1B. This orientation best displays the magnesium ion coordination site. (D) Close-up of magnesium ion coordination in the same perspective as depicted in Fig. 1C. Conserved residues aspartate 524 and 583, glutamate 562, and two water moiety oxygens (labeled W1 WAT and W2 WAT [represented in red]) interact with the magnesium ion. The magnesium ion is coordinated in tetragonal bipyramidal fashion (the additional water molecule not shown here).
FIG. 3. Structure comparisons. The figure was made by using MOLSCRIPT and Raster 3D (31, 34). (A) Stereo images of superimposition of Mo-MLV RNase H (blue) with crystal structure of E. coli RNase H (green). Note the gap in the Mo-MLV structure for residues P636 to K640, which account for the unsolved His-loop. The gray sphere represents the magnesium ion coordinated within the active site. The orientation of structures is as depicted in Fig. 1B to best display the active site and substrate binding face. (B) Superimposition of Mo-MLV RNase H (blue) with crystal structure of HIV-1 RNase H (red).
α-helix E and contain a histidine (H638) conserved in all RNases H. This region is also known as the "His-loop" and was not resolved in the initial crystal structure of the isolated HIV-1 RNase H (10). This loop is considered to be highly flexible, thus impeding structure determination for certain crystals of RNase H. The electron density map for monomeric RNase H ΔC contains clearly defined regions for much of the missing sequences (data not shown). For the strict purpose of providing a continuous ribbon structure, the Mo-MLV ΔC His-loop was manually traced as a best fit from the E. coli RNase H structure for Fig. 1 and 4. This region is represented in magenta in both figures and should not be assumed to portray the true His-loop structure.
FIG. 4. (A) Electrostatic potential surface computed with GRASP. Blue represents positive potential, red represents negative potential, and white represents neutral potential. The catalytic or active site is electrostatically negative, whereas the putative substrate binding region is positive. The locations of the surface amino acids implicated in substrate interactions are labeled accordingly. Two different orientations differing by roughly 50° rotation around the y axis are shown. (B) Worm diagrams of RNase H ΔC in the same orientations as those shown in the corresponding diagrams above in panel A. The worm color (white) does not reflect electrostatic potential. All figures were made by using GRASP (36).
α-helices A and D. This core is protected and surrounded by electrostatic interactions at the helical termini, as well as the hydrogen bond networks each from the β-sheet and α-helical interactions (A+B, B+D, and C+D). For Mo-MLV RNase H ΔC, there is a similar hydrophobic core between α-helices A and D, but it is only bounded by one electrostatic interaction on the N-terminal end of the helices. This electrostatic interaction is mediated by the side chain carbonyl oxygen of N613 forming a hydrogen bond with the side chain ε-amino group of R560. The side chain carbonyl oxygen of E616 also forms a hydrogen bond with a terminal side chain amino group of R560. Between the two helices, there are a series of hydrophobic residues that interact with one another. Mo-MLV ΔC also has an extensive hydrophobic core on the other side of α-helix A at the interface with β-strands 1, 2, and 3. Structural analysis shows that both E. coli and HIV-1 RNases H do not possess such an extensive hydrophobic network between β-strands 1, 2, and 3 and the N-terminal half of α-helix A. This extended hydrophobic core of Mo-MLV RNase H ΔC may be unique for the murine retroviral RNases H. Table 2 summarizes all residues involved with the hydrophobic core region of Mo-MLV RNase H. In addition, the sequence alignment in Fig. 2 highlights in gray all hydrophobic core residues identified in this structure, as well as RNases H from E. coli, HIV-1, and B. halodurans.
TABLE 2. Secondary structures and residues of hydrophobic core of Mo-MLV RNase H ΔC
α-helix E are highlighted in yellow in Fig. 2. Of note, this region of the structure is accessible to surrounding solvent and faces toward the putative substrate-binding side of RNase H (Fig. 1, 4, and 5). More recent data from Nowotny et al. revealed that RNase H crystallized in the presence of RNA/DNA substrate contained two magnesium ions within the catalytic core (37). The second magnesium ion (not observed in our structure) is highly coordinated in position by the substrate itself within the cocrystal structure. The substrate-guided coordination of this second magnesium ion is postulated to designate the substrate specificity of the RNases H to cleave RNA/DNA hybrid structures. The magnesium ion observed in our structure corresponds to the nonactivating magnesium (metal ion B as designated by Nowotny et al.). This magnesium ion is believed to stabilize the transition state intermediate during catalysis.
FIG. 5. Model of Mo-MLV RNase H ΔC binding RNA/DNA substrate based upon superimposed modeling upon the HIV-1 RT cocrystal structure. DNA is represented in red, while RNA is represented in blue. The color scheme, gray sphere, and magenta-shaded His-loop are depicted as described in Fig. 1. The model is oriented to best display the interface between RNase H and RNA/DNA substrate. The figure was made by using MOLSCRIPT and Raster 3D (31, 34).
ΔC.
A molecular surface model of Mo-MLV RNase H ΔC was rendered and analyzed with respect to surface charges that might potentially interact with substrate (Fig. 4) (36). The surface reflects the solvent-accessible regions of the protein and is labeled blue in basic regions, red in acidic regions, and white where there is no net electrostatic potential charge. The catalytic site that contains the four conserved carboxylate groups is clearly acidic and resides slightly within the interior of the protein. Here the two magnesium cations bind to mediate nuclease activity (magnesium not shown). The surface that is proposed to interact with the RNA/DNA hybrid is highly basic, with many arginines and lysines responsible for the electrostatic charge. The predominant positively charged binding pocket is consistent with the model of RNase H primarily binding negatively charged DNA and RNA backbone phosphates as a basic mode of substrate recognition. The rest of the shown surface of Mo-MLV RNase H ΔC in Fig. 4 is fairly neutral, whereas the back surface (not shown) contains small regions of basic charge, but none as large and as dense as the putative substrate binding region. The most apparent surface residues composing the positively charged surface were identified as R534, Q559, R560, R585, K609, K612, N613, and K614. Most of these residues correspond to previously identified substrate-binding residues in E. coli, HIV-1, and B. halodurans identified by crystal structure or biochemical analysis (Fig. 2) (4, 5, 17, 24, 25, 37, 47). To further characterize these surface residues, a substrate-binding model was constructed as described below.
Superimposed model of Mo-MLV RNase H ΔC complexed with RNA/DNA.
Since there was strong structural homology between Mo-MLV RNase H ΔC and the RNases H from E. coli and HIV-1, structural coordinates were submitted to the Dali Server of the European Bioinformatics Institute of EMBL (www.ebi.ac.uk/dali/) to obtain structural alignments of the RNases H (18, 19). Rotation coordinates were obtained to superimpose the E. coli RNase H and Mo-MLV RNase H ΔC structures upon a base structure of HIV-1 RT bound to an RNA/DNA hybrid composed of sequences from the PPT (40). This allowed the construction of structural models of E. coli RNase H and Mo-MLV RNase H ΔC complexed with an RNA/DNA substrate. The superimposed models showed that many of the identified regions and residues with known functions in HIV-1 RNase H had conserved or homologous residues in the E. coli and Mo-MLV structures correctly aligned in space (Fig. 2, 3, and 5). This allowed the identification and characterization of potential binding domains and residues within the Mo-MLV ΔC structure with a high degree of confidence. These findings, however, must be interpreted with caution since they are not actual cocrystal structures. In addition, since the Mo-MLV RNase H ΔC is only a single-domain structure, the superimposed model constructed here may not reflect how wild-type Mo-MLV RNase H binds to its substrate in the context of full-length RT. Finally, the actual PPT substrate crystallized by Sarafianos et al. displayed a number of weakly paired, unpaired, and mismatched bases. Such phenomena may be a result of the inherent structure within the sequences of the PPT and the manner in which it binds HIV-1 RT. Thus, the data from that study may not accurately reflect how random-sequence, RNA/DNA hybrid substrates actually bind RT and become cleaved by the RNase H domain.
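The superimposition step described above amounts to applying a rigid-body transformation (a rotation matrix plus a translation vector) to one structure's coordinates and then comparing equivalent atoms with the reference structure. The sketch below illustrates that operation only. The matrix, vector, and coordinates are invented for the example and are not output from the Dali server or taken from any deposited structure; the function names are likewise hypothetical.

```python
import math


def apply_rigid_transform(coords, rotation, translation):
    """Return coords mapped by x' = R·x + t for each (x, y, z) tuple."""
    moved = []
    for x, y, z in coords:
        moved.append(tuple(
            rotation[i][0] * x + rotation[i][1] * y + rotation[i][2] * z + translation[i]
            for i in range(3)
        ))
    return moved


def max_pair_distance(coords_a, coords_b):
    """Largest distance between equivalent atoms of two coordinate lists."""
    return max(math.dist(p, q) for p, q in zip(coords_a, coords_b))


if __name__ == "__main__":
    # Hypothetical transform: 90° rotation about z, then a shift along x.
    rotation = [[0.0, -1.0, 0.0],
                [1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0]]
    translation = (1.0, 0.0, 0.0)
    mobile = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]        # invented atom positions
    reference = [(1.0, 1.0, 0.5), (-1.0, 0.0, 0.0)]    # invented equivalents
    superimposed = apply_rigid_transform(mobile, rotation, translation)
    print("largest deviation (Å):", max_pair_distance(superimposed, reference))
```

A deviation check of this form is one way to state how closely a set of equivalent atoms coincides after superimposition.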
Figure 5 illustrates the overall structural model of Mo-MLV RNase H ΔC bound to substrate. The perspective shown here best illustrates how the RNA/DNA substrate interacts with the structure and how the RNA strand fits into the active site for cleavage. Closer examination of the active site residues shows that the structural alignments have superimposed all three structures such that the catalytic site carbonyl oxygen atoms are all within 2.5 Å of their homologous atom in each of the other two structures (data not shown). It is clear from Fig. 5 that the major portion of Mo-MLV RNase H preferentially binds the minor groove of the RNA/DNA substrate. Binding seems to take place in two overall locations. First, DNA is relatively close and accessible to α-helices A and B and the loop preceding α-helix D. The second binding region appears to interact with the RNA strand. RNA is close and accessible to β-strand 1 and α-helices A and B.
Substrate binding determinants of Mo-MLV RNase H ΔC.
An analysis of the superimposed model of Mo-MLV RNase H ΔC was performed to identify those regions and their residues that may be important for RNA/DNA interactions. Four different regions from the primary structure were identified as potential binding sites for Mo-MLV RNase H ΔC. These include the end of β-strand 1 to the beginning of β-strand 2, the first few residues of α-helix A, the first few residues of α-helix B, and the loop region prior to the beginning of α-helix D (summarized in Table 3). In the region from β-strand 1 to β-strand 2, two residues directly face into the minor groove of the RNA/DNA hybrid. L529 resides on β-strand 1 and faces into the base pairs of the hybrid substrate. It is the only hydrophobic residue that is directly exposed into the substrate. The side chain, terminal amino group of R534 resides at a distance of 3.25 Å from a phosphate group of the DNA backbone. For the purposes of identification, this phosphate will be labeled phosphate (4). DNA nucleotide (4) lies 4 nucleotides 3' from the base pair that contains the "scissile phosphate" (the 5' phosphate that is retained by the RNA nucleotide after cleavage).
TABLE 3. Amino acids potentially involved in substrate binding for Mo-MLV RNase H ΔC
α-helix A, three residues make potential contacts with a DNA phosphate group that is located just 3' of phosphate (4). This phosphate will thus be labeled the DNA phosphate (5). S557, Q559, and R560 all cluster close to one another along the DNA strand. The side chain hydroxyl of S557, the side chain carbonyl oxygen of Q559, and the terminal side chain amino group of R560 are only 3.46, 3.59, and 5.55 Å, respectively, from the DNA phosphate group (5) (data not shown). All of these polar groups potentially form electrostatic interactions with the DNA phosphate moiety.
The N-terminal region of α-helix B contains R585 and Y586 that may interact with the RNA and DNA strands, respectively. The terminal side chain amino groups of R585 lie close to the 2'-hydroxyl of the RNA ribose sugar. This RNA nucleotide complements the DNA nucleotide just 5' of the DNA nucleotide (4). The side chain OH group of Y586 lies close to a DNA phosphate group just 3' of DNA phosphate (5). This phosphate will thus be referred to as DNA phosphate (6). The distance between this tyrosyl hydroxyl group and the DNA phosphate (6) oxygen is 3.25 Å. Interestingly, Blain and Goff have previously synthesized and studied a Mo-MLV RT mutation that substituted phenylalanine for Y586 (Y586F) (4, 5). Mutant Y586F resulted in a nonreplicative virus with the mutant RT having only 5% of the wild-type RNase H activity in an in situ RNase H assay. In another study, Zhang et al. determined that the Mo-MLV Y586F mutant RT resulted in a 17-fold-higher rate of substitution mutations during polymerization especially along adenine-thymine tracts (52). It is postulated that Y586 not only binds to substrate but also conforms and bends RNA/DNA substrate to optimize DNA polymerase fidelity. Given that RT has no proofreading abilities, RNase H domain binding determinants must have had important influences in the accurate genomic replication and survival of retroviruses as a whole.
For α-helix D and its preceding loop, K612 lies close to DNA phosphate (6), whereas N613 actually lies closer to DNA phosphate (5). N613 is the first residue of the D-helix and lies closer to the residues of α-helix B identified around DNA phosphate (5) (see Fig. 4 [GRASP diagram]). The terminal side chain amino group of K612 resides 2.86 Å from DNA phosphate (6), whereas the terminal side chain amino group of N613 resides 3.59 Å from DNA phosphate (5).
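The side chain-to-phosphate distances reported so far can be collected and sorted into coarse classes. The following Python sketch is only an illustration: the distances and phosphate labels are taken from the text, but the two cutoffs (`CLOSE_CUTOFF`, `LONG_RANGE_CUTOFF`), the class names, and the helper functions are assumptions introduced here and are not criteria used in this study.

```python
# Distances (in angstroms) from each side chain polar group to the DNA
# phosphate it approaches, copied from the text. Labels (5) and (6) follow
# the phosphate numbering used in the text.
REPORTED_CONTACTS = [
    ("S557", "DNA phosphate (5)", 3.46),
    ("Q559", "DNA phosphate (5)", 3.59),
    ("R560", "DNA phosphate (5)", 5.55),
    ("Y586", "DNA phosphate (6)", 3.25),
    ("K612", "DNA phosphate (6)", 2.86),
    ("N613", "DNA phosphate (5)", 3.59),
]

# ASSUMPTION: illustrative cutoffs chosen for this sketch, not values
# taken from the paper.
CLOSE_CUTOFF = 3.5        # roughly hydrogen-bonding range
LONG_RANGE_CUTOFF = 6.0   # upper limit for a longer-range electrostatic contact


def classify(distance):
    """Return a coarse label for one distance given in angstroms."""
    if distance <= CLOSE_CUTOFF:
        return "close"
    if distance <= LONG_RANGE_CUTOFF:
        return "longer-range"
    return "distant"


def group_by_class(contacts):
    """Map each label to its residues, ordered by increasing distance."""
    groups = {"close": [], "longer-range": [], "distant": []}
    for residue, _target, distance in sorted(contacts, key=lambda c: c[2]):
        groups[classify(distance)].append(residue)
    return groups


if __name__ == "__main__":
    for label, residues in group_by_class(REPORTED_CONTACTS).items():
        print(f"{label}: {', '.join(residues) or 'none'}")
```

Changing either cutoff moves residues between classes, so the grouping should be read as a way of organizing the reported numbers rather than as evidence for or against a particular contact.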
One last residue of note is K609, which resides in the loop region between α-helices B and D. Its terminal side chain amino group points toward the RNA strand phosphate groups at a distance of 6.5 to 7.5 Å. Although this is a fairly large distance, one must take into consideration the absence of the C-helix. Overlay of the E. coli RNase H in this same region shows that the homologous loop between α-helices C and D and the homologous residue (K96) reside a few angstroms closer to the RNA backbone phosphate (data not shown). E. coli residue K99, however, overlaps its homologue, Mo-MLV K612, suggesting that the only differences between the two are the displacement of the loop and the missing C-helix. K609, therefore, may have a role in RNA backbone phosphate contacts.
The residues of Mo-MLV RNase H ΔC identified here as potentially involved in binding substrate are highlighted in red (for DNA interaction) and in blue (for RNA interaction) in Fig. 2. Figure 2 also highlights in similar fashion the residues identified as binding substrate for the RNases H of E. coli, HIV-1, and B. halodurans. Most of the Mo-MLV RNase H residues identified by our model are in concordance with the previous data for the various species of RNases H. This suggests that our model accurately represents Mo-MLV RNase H binding to substrate. Such a model may be used for future mutagenesis, biochemical analysis, and viral replication studies of Mo-MLV. We have tabulated the identified Mo-MLV RNase H residues, their respective substrate binding locations, and the corresponding HIV-1 RT residues in Table 3. In similar fashion, Table 4 lists the identified HIV-1 p66 connection and RNase H domain residues, their respective substrate binding locations, and the corresponding Mo-MLV RT residues. The RNase H structures of both Mo-MLV and HIV-1 possess the appropriate or homologous residue for most of the specified substrate interactions. Each sequence, however, contained a few substrate-binding residues that had no obvious corresponding residue, or a nonbinding corresponding residue, in the other species. Such differences in binding determinants may partially be explained by the role of the HIV-1 p66 connection domain.
TABLE 4. Amino acids involved in substrate binding for p66 HIV-1 connection and RNase H domainsᵃ
To see how HIV-1 RT compensates for the functional capacities of the Mo-MLV RNase H C-helix, the HIV-1 RT p66 connection domain was added to the superimposed structures of E. coli, Mo-MLV, and HIV-1 RNases H bound to substrate. Only this domain was added, since all other domains of HIV-1 RT were too distant to be of any relevance for this particular analysis. HIV-1 RT p66 residues K353 to T365 comprise a loop within the connection domain. This loop is structurally located in the exact region where the E. coli RNase H C-helix resides (also the assumed Mo-MLV RNase H C-helix location) (Fig. 6). Closer analysis of the sequence shows a cluster of three positively charged residues (K353, R356, and R358) that are similar to Mo-MLV RNase H residues R599, R600, and R601 (data not shown). The positively charged residues of HIV-1 RT, however, are more widely spaced, lie closer to the substrate, and approach the major groove from a different angle. HIV-1 RT R358 lies the closest to E. coli RNase H R88 (which corresponds to Mo-MLV RT R601). Unexpectedly, HIV-1 RT H364 and T365 almost exactly superimpose on E. coli RNase H W81 and N84 (which correspond to Mo-MLV RT H594 and I597, respectively). As described previously, threonine was a functional residue when substituted for Mo-MLV RT residue I597 (33). The potential three-dimensional alignment of the HIV-1 connection domain residues R358, H364, and T365 with the predicted Mo-MLV RNase H C-helix residues R601, H594, and I597 strongly suggests that this connection domain region of HIV-1 RT has replaced the substrate binding abilities of the missing HIV-1 RNase H C-helix. HIV-1 RT connection domain residues G359, A360, and H361 have also been identified as binding DNA phosphates (7) and (6) (Table 4) (40). These residues, however, did not overlay in space as well with the E. coli RNase H C-helix.
The substitution of HIV-1 RT p66 connection domain residues K353 to T365 for the RNase H C-helix explains why an inactive, single-domain HIV-1 RNase H regains nuclease activity with the addition of the connection domain either in cis or in trans (41). Mutational analysis of this region of the HIV-1 RT p66 connection domain may reveal its role in nucleotide binding.
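The three-way residue correspondences described above can be written out as a small lookup table. This Python sketch is an illustration only: the residue pairings are those stated in the text, while the field names and the helper `same_amino_acid` are hypothetical conveniences introduced here.

```python
# Each row pairs an HIV-1 RT connection-domain residue with the E. coli
# RNase H C-helix residue it overlays and the corresponding Mo-MLV RT
# residue, as stated in the text.
CORRESPONDENCES = [
    {"hiv1": "R358", "ecoli": "R88", "momlv": "R601"},
    {"hiv1": "H364", "ecoli": "W81", "momlv": "H594"},
    {"hiv1": "T365", "ecoli": "N84", "momlv": "I597"},
]


def same_amino_acid(residue_a, residue_b):
    """Compare the one-letter amino acid codes of two labels such as 'R358'."""
    return residue_a[0] == residue_b[0]


def summarize(rows):
    """Report, for each row, whether the HIV-1 residue type is conserved."""
    summary = []
    for row in rows:
        summary.append({
            "hiv1": row["hiv1"],
            "matches_ecoli": same_amino_acid(row["hiv1"], row["ecoli"]),
            "matches_momlv": same_amino_acid(row["hiv1"], row["momlv"]),
        })
    return summary


if __name__ == "__main__":
    for entry in summarize(CORRESPONDENCES):
        print(entry)
```

The comparison only checks residue identity. It says nothing about spatial overlap, which in the text comes from the superimposed structures.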
FIG. 6. Stereo-superimposed structures of the RNase H C-helical regions of Mo-MLV ΔC (in blue), E. coli (in green, with the C-helix residues W81 to G89 in orange-red), and HIV-1 RNase H (in yellow) and the p66 connection domain residues K353 to T365 (in magenta). The RNA/DNA substrate is on the left (with DNA in red and RNA in blue). The figure was made by using MOLSCRIPT and Raster3D (31, 34).
Copyright © 2009 by the American Society for Microbiology.