Crystal Structures of GII.10 and GII.12 Norovirus Protruding Domains in Complex with Histo-Blood Group Antigens Reveal Details for a Potential Site of Vulnerability

ABSTRACT Noroviruses are the dominant cause of outbreaks of gastroenteritis worldwide, and interactions with human histo-blood group antigens (HBGAs) are thought to play a critical role in their entry mechanism. Structures of noroviruses from genogroups GI and GII in complex with HBGAs, however, reveal different modes of interaction. To gain insight into norovirus recognition of HBGAs, we determined crystal structures of norovirus protruding domains from two rarely detected GII genotypes, GII.10 and GII.12, alone and in complex with a panel of HBGAs, and analyzed structure-function implications related to conservation of the HBGA binding pocket. The GII.10- and GII.12-apo structures as well as the previously solved GII.4-apo structure resembled each other more closely than the GI.1-derived structure, and all three GII structures showed similar modes of HBGA recognition. The primary GII norovirus-HBGA interaction involved six hydrogen bonds between a terminal αfucose1-2 of the HBGAs and a dimeric capsid interface, which was composed of elements from two protruding subdomains. Norovirus interactions with other saccharide units of the HBGAs were variable and involved fewer hydrogen bonds. Sequence analysis revealed a site of GII norovirus sequence conservation to reside under the critical αfucose1-2 and to be one of the few patches of conserved residues on the outer virion-capsid surface. The site was smaller than that involved in full HBGA recognition, a consequence of variable recognition of peripheral saccharides. Despite this evasion tactic, the HBGA site of viral vulnerability may provide a viable target for small molecule- and antibody-mediated neutralization of GII norovirus.

Human noroviruses are an important etiological agent of sporadic gastroenteritis and the dominant cause of outbreaks of gastroenteritis around the world (21,35). Although the disease is self-limiting, symptoms can persist for days or even weeks, and transmission from person to person is difficult to control once an outbreak has occurred. Cross-protection from future norovirus infections is uncertain, and reinfection with a genetically similar strain is not uncommon (20,27,46). Currently, there are no vaccines for noroviruses (14,23). The norovirus positive-sense, single-stranded RNA genome has three open reading frames (ORF1 to ORF3): ORF1 encodes the nonstructural proteins, ORF2 encodes the capsid protein, and ORF3 encodes a small basic structural protein. Based on complete capsid gene sequences, human noroviruses can be divided into 2 main genogroups (GI and GII), which can be further subdivided into at least 25 different genotypes (GI.1 to -8 and GII.1 to -17) (18,47).
Human noroviruses are uncultivable, but expression of the capsid protein in a baculovirus expression system results in the self-assembly of virus-like particles (VLPs) that are morphologically and antigenically similar to the native virion (16). The X-ray crystal structure of the VLP from the prototypic GI.1 Norwalk virus (genus, Norovirus; type species, Norwalk virus) identifies two domains, the shell and protruding (P) domains (30). The P domain is further divided into P1 and P2 subdomains, with the P1 subdomain interacting with the shell and the P2 subdomain residing on the outer surface of the capsid and likely containing the determinants for antigenicity and receptor binding (16,37). The P domain can be crystallized separately, and structures of P domains for GI.1 and GII.4 have been determined (3,4). These replicate many of the structural details of the VLP, including a P domain dimer interface.
The human histo-blood group antigens (HBGAs) have been identified as potential receptors of norovirus (15). The HBGAs are complex carbohydrates linked to proteins or lipids present on epithelial cells and other cells in the body or found as free antigens. At least nine different HBGAs that can bind to norovirus have been described (13,19,31,34,37,40,41), although relatively weak interactions, differential quality of reagents, pH, binding time, and other experimental variables have led to conflicting results concerning the specifics of HBGA binding to different noroviruses (reviewed in reference 39). Defined interactions from crystal structures, however, have been determined for three HBGAs in complex with P domains of two norovirus genotypes (Norwalk virus GI.1 and VA387 GII.4) (3,4,6). These studies identified a number of HBGA binding differences between the norovirus GI and GII genogroups. The GI.1 genotype bound HBGAs at the outer (P2) surface of the capsid with a monomeric interaction involving a single P2 subdomain. GII.4 also bound HBGAs at the top of the P2 subdomain but with a completely different set of residues, which spanned a P2 subdomain dimer interface.
To better understand the molecular basis of HBGA binding and to examine the relationship between HBGA recognition and norovirus sequence conservation, we determined 11 different crystal structures of P domains from two rarely detected noroviruses, Vietnam026 (026) GII.10 and Hiro GII.12, alone and in complex with a panel of HBGAs. Structure-function relationships derived from analyses of the GII norovirus-HBGA structures were used to provide insight into both the sequence conservation and the potential vulnerability of the HBGA site of recognition to small molecule- or antibody-mediated neutralization.
FIG. 1. Structures of the norovirus GII.10 and GII.12 P domains. The GII.10 and GII.12 P1 subdomains are very similar, with greater differences observed in the P2 subdomains. (A) The GII.10 VLP was modeled from the shell domain of the norovirus (NV) VLP (PDB ID, 1IHM) and the unbound GII.10 P domain (PDB ID, 3ONU). The GII.10 VLP (T = 3) was modeled with different monomer interactions, A/B and C/C, where each A, B, and C monomer was colored light blue, salmon, and orange, respectively. The boxed region showed the location of the P domain capsid dimer. (B) The X-ray crystal structure of the unbound GII.10 P domain dimer was determined to have 1.4-Å resolution and colored according to monomers (chains A and B) and P1 and P2 subdomains, i.e., chain A P1 (blue), chain A P2 (light blue), chain B P1 (violet), and chain B P2 [...]

MATERIALS AND METHODS
Sequence analysis. Amino acid sequences of the entire norovirus capsid were aligned with ClustalX, and the distances were calculated by Kimura's two-parameter method. Phylogenetic trees with 1,000 bootstrap replicates were generated using the neighbor-joining method with ClustalX. GenBank accession numbers were described elsewhere (10), with the addition of VA387 (GenBank accession number AAK84679) (see Fig. S1A in the supplemental material).
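As a minimal sketch of the distance step, the following Python computes an uncorrected p-distance between two aligned amino acid sequences and then applies Kimura's empirical protein correction, d = -ln(1 - p - p²/5). This assumes that the correction ClustalX applies to protein alignments is this empirical formula; the text itself names only "Kimura's two-parameter method". The function names and the toy sequences are hypothetical and are not norovirus data.

```python
import math


def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of differing residues over aligned columns with no gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def kimura_protein_distance(p):
    """Kimura's empirical correction for multiple substitutions (assumed here)."""
    inner = 1.0 - p - 0.2 * p * p
    if inner <= 0.0:
        raise ValueError("p-distance too large for this correction")
    return -math.log(inner)


# Toy aligned fragments (hypothetical, for illustration only).
toy_a = "MKTAYIAKQR-GLS"
toy_b = "MKSAYIGKQRQGLT"
p = p_distance(toy_a, toy_b)      # 3 differences over 13 gap-free columns
d = kimura_protein_distance(p)    # corrected distance, always >= p
```

A pairwise matrix of such distances is the input that a neighbor-joining routine would consume; bootstrap support comes from repeating the calculation on resampled alignment columns.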
Protein expression, purification, and crystallization of the norovirus P domain. The norovirus Vietnam026 GII.10 strain (GenBank accession number AF504671) was isolated from a stool specimen obtained from a male infant under 12 months of age presenting with acute sporadic gastroenteritis in December 1999 at the General Children's Hospital No. 1 in Ho Chi Minh City, Vietnam (9). The norovirus Hiro GII.12 strain (GenBank accession number AB044366) was isolated from an adult male in a small outbreak of gastroenteritis in November 1999 in Hiroshima, Japan (9). An amino acid alignment of Norwalk virus, VA387, Vietnam026, and Hiro was used to predict the N and C termini of the Vietnam026 and Hiro P domains. Because residues at the N and C termini of the VA387 P domain structure were disordered (4), we designed our constructs to omit these regions. The near-full-length GII.10 (residues 224 to 538) and GII.12 (residues 224 to 525) P domains (314 and 301 amino acids in length, respectively) were optimized for Escherichia coli expression, cloned in a modified pMal-c2x vector at BamHI and NotI (New England Biolabs), and transformed into BL21 cells (Invitrogen). Expression was induced with IPTG (isopropyl-β-D-thiogalactopyranoside; 1 mM) for 18 h at 22°C. A His-tagged fusion-P domain protein was purified on an Ni column (Qiagen) and digested with HRV-3C protease (Novagen) overnight at 4°C, and the P domain was separated on the Ni column. The P domain was further purified by size exclusion chromatography with a Superdex-200 column (GE), concentrated to 2 to 10 mg/ml, and stored in GFB (0.35 M NaCl, 2.5 mM Tris [pH 7.0], 0.02% NaN3) before crystallization. Dynamic light scattering (DLS) of the P domains determined that the majority of the protein was dimeric (data not shown). Crystals of the P domain were obtained by the hanging-drop vapor diffusion method.
The GII.10 P domain crystallized under different conditions using Hampton Research reagents, but for this study, we chose to use two similar crystallization conditions. The first condition contained ammonium citrate (0.66 M, pH 6.5) and isopropanol (1.65%, vol/vol). The second condition contained imidazole (0.1 M, pH 6.5), polyethylene glycol 8000 (PEG 8000) (4.95%, wt/vol), and isopropanol (13.2%, vol/vol). The GII.12 P domain crystals were grown in PEG 1500 (30%, wt/vol), magnesium sulfate hydrate (0.2 M), sodium acetate anhydrous (0.1 M, pH 5.5), and 2-methyl-2,4-pentanediol (3%, vol/vol). Crystals were grown in a 1:1 mixture of the protein sample and mother liquor at 25°C for 2 to 6 days. For the P domain and HBGA complexes, we soaked a 60-fold molar excess of HBGA into premade crystals and/or cocrystallized the HBGA and P domain. Prior to data collection, crystals were transferred to a cryoprotectant containing the mother liquor in 30% ethylene glycol, and those bound to HBGAs also contained a 30- to 60-fold molar excess of HBGA.
Data collection, structure solution, and refinement. X-ray diffraction data were collected at the Southeast Regional Collaborative Access Team (SER-CAT) beamlines 22-ID and 22-BM at the Advanced Photon Source, Argonne National Laboratory, Argonne, IL, and processed with HKL2000 (26) or XDS (17). Structures were solved by molecular replacement in PHASER (24) using Protein Data Bank (PDB) identifier (ID) 2OBR as a search model. Structures were refined through multiple rounds of manual model building in COOT (8) and refinement with TLS in REFMAC (7) and PHENIX (1). Parameters for the stereochemistry of saccharide residues were taken from a new monomer library (version 5.21) incorporated in REFMAC/CCP4 (G. Murshudov, unpublished data). Glycosidic bonds for di-, tri-, and tetrasaccharides were defined in PHENIX during refinement.
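The R values reported for each refined structure follow the standard crystallographic residual, R = Σ| |Fobs| - |Fcalc| | / Σ|Fobs|, computed over the working reflections (R work) or over the reflections held out of refinement (R free). The sketch below illustrates that definition only; it assumes that the calculated amplitudes are already on the same scale as the observed ones, and the amplitudes and free-set flags are hypothetical numbers, not data from this study.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic residual over one set of reflections."""
    if len(f_obs) != len(f_calc) or len(f_obs) == 0:
        raise ValueError("need two equal-length, non-empty amplitude lists")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_work_and_r_free(f_obs, f_calc, free_flags):
    """Split reflections by the free-set flag and score each subset."""
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if not free]
    test = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in test], [c for _, c in test])
    return r_work, r_free


# Hypothetical structure-factor amplitudes (illustration only).
f_obs = [120.0, 85.0, 40.0, 200.0, 65.0, 150.0]
f_calc = [110.0, 90.0, 44.0, 190.0, 70.0, 162.0]
free_flags = [False, False, False, False, True, True]
r_work, r_free = r_work_and_r_free(f_obs, f_calc, free_flags)
```

Because the free reflections never enter refinement, an R free only slightly above R work, as in the values reported in Results, is the usual sign that the model is not overfitted.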

RESULTS
Unbound structure of the GII.10 P domain. The GII.10 P domain MBP fusion protein was expressed at a level of ~10 mg/liter in E. coli. The cleaved GII.10 P domain formed rectangular plates that diffracted to better than 1.5-Å resolution (Table 1). A molecular replacement solution with the previously determined GII.4 P domain (4) was obtained in space group P2₁, with one P domain dimer in the asymmetric unit (Fig. 1A and B). Refinement of the GII.10 structure led to an Rwork value of 0.151 (Rfree = 0.167) and well-defined density for most of the P domain dimer (Table 1). Following the nomenclature established by Prasad and colleagues (30), the GII.10 P1 subdomain spanned residues 222 to 277 and residues 427 to 549, whereas the P2 subdomain spanned residues 278 to 426. The GII.10 P1 subdomain was formed primarily by a single α-helix, which was flanked by seven antiparallel β-strands (Fig. 1B). The GII.10 P2 subdomain contained 12 antiparallel β-strands, 6 from each subunit, which formed 2 antiparallel β-sheets (Fig. 1B). Overall, the secondary structure of the GII.10 P domains was highly reminiscent of previously published GI and GII structures (4,30). On one of the asymmetric unit monomers, residues 344 to 351 (chain B) were disordered; these residues were not modeled into the GII.10-apo structure.
Unbound structure of the GII.12 P domain. The GII.12 P domain MBP fusion protein was expressed at a level of ~2 mg/liter in E. coli. The cleaved GII.12 P domain formed rectangular parallelepipeds that diffracted to 1.75-Å resolution (Table 2). The GII.12 P domain structure was determined by molecular replacement with the GII.10 P domain; structure solution indicated that the space group was C222₁, with one P domain monomer in the asymmetric unit (Fig. 1C, with its monomeric P1 and P2 subdomain partners shown in green and cyan, respectively). Refinement of the GII.12 structure led to an Rwork value of 0.185 (Rfree = 0.203) and well-defined density for most of the P domain monomer (Table 2). The GII.12 P1 subdomain spanned residues 222 to 277 and residues 414 to 536, whereas the P2 subdomain spanned residues 278 to 413.
Comparisons of unbound structures of the GII.10, GII.12, GI.1, and GII.4 P domains. Despite the great genetic diversity of noroviruses, the GII.4 strains have been responsible for the majority of outbreaks around the world over the past 10 or so years (25,35,36). To examine whether the rare versus outbreak status had bearing on the overall structures, we compared rare and outbreak GII strains. The P domains from rare GII.10 and GII.12 were highly similar in structure, with a root mean square deviation (RMSD) for Cα atoms of 0.64 Å. However, in addition to their shared rare status, they were also more closely genetically related to each other than to the GII.4 outbreak strain. Pairwise analysis of RMSD differences in the P domain structures (Fig. 1) found that the three GII P domain structures, two rare and one outbreak, were more similar to each other than to the GI structure. Overall structural differences thus appeared to reflect genetic distance (see Fig. S1A in the supplemental material) rather than rare or outbreak status.
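For reference, a Cα RMSD of the kind quoted here is the square root of the mean squared distance between equivalent Cα atoms. The sketch below shows only that final step; it assumes that the two structures have already been superimposed and that the atom lists are in matching residue order, neither of which is described in the text. The coordinates are invented for illustration.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD in Å over paired atoms of two pre-superimposed structures."""
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("need two equal-length, non-empty coordinate lists")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical Cα positions for three equivalent residues (Å).
structure_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
structure_b = [(0.3, 0.4, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 1.2)]
rmsd = ca_rmsd(structure_a, structure_b)
```

In practice the superposition itself (for example, a least-squares fit) is done first, and the choice of which residues to pair affects the reported value.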
Structures of HBGA H type 2-trisaccharide and -disaccharide bound to the GII.10 P domain. HBGAs are a group of short oligosaccharides that are expressed in a polymorphic manner on cell surfaces or found as free antigens and have been shown through a number of studies, including the aforementioned crystallographic ones, to interact with norovirus (Fig. 2) (11,19). HBGAs are generated from a number of different precursor disaccharides, with additional saccharides added by enzymes, which are variably present in the human population (see Fig. S2 in the supplemental material) (22). One distinction is made by the presence of α1,2-fucosyltransferase, which adds a terminal αfucose1-2 unit; HBGAs with this saccharide are termed secretors, while those missing the terminal αfucose1-2 are termed nonsecretors.
Because the GII.10 P domain protein was expressed in larger amounts and its crystals diffracted to higher resolution than those of GII.12, we chose to examine first the GII.10 P domain by X-ray crystallography in complex with a panel of HBGAs (see Fig. S2 in the supplemental material) representing an assortment of secretor and nonsecretor HBGAs. The secretor HBGAs used were H type 2-disaccharide, H type 2-trisaccharide, A-trisaccharide, B-trisaccharide, Leʸ-tetrasaccharide, and Leᵇ-tetrasaccharide, whereas the nonsecretor HBGAs used were Leᵃ-trisaccharide and Leˣ-trisaccharide.
The HBGA H type 2-trisaccharide is α-L-fucose-(1-2)-β-D-galactose-(1-4)-2-N-acetyl-β-D-glucosamine, which is the first secretor in one of the major biosynthetic HBGA pathways (see Fig. S2 in the supplemental material). Cocrystallization of the GII.10 P domain with H type 2 resulted in P2₁ crystals that diffracted to 1.40 Å, with cell constants virtually isomorphous with those of the unbound crystals (Table 1). Structure solution and refinement with the unbound P domain resulted in a single clearly defined patch of electron density that spanned two P domain monomers (Fig. 2A and 3A). Placement of the trisaccharide was assisted by a well-defined fucose density, which led to an unambiguous orientation of this HBGA. Refinement led to an Rwork value of 0.169 (Rfree = 0.188) and well-defined density for all of the saccharide units (Fig. 3A). No unassigned electron density was observed in the corresponding position of the HBGA on the P domain dimer, around the molecular 2-fold. Inspection of the lattice indicated a lattice contact at this position, which would occlude the presence of a second HBGA molecule (see Fig. S3A in the supplemental material).
The fucose showed the most well-defined density and was fixed by a network of P2 subdomain hydrogen bonds, two contributed by the side chain of Asp385, two by the side chain of Arg356, and one by the main chain of Asn355 (Fig. 3B; see also Fig. S1B in the supplemental material). A sixth hydrogen bond was contributed from the backbone of Gly451 from across the P domain dimer interface, with the aromatic ring of Tyr452 packing over the fucose methyl. Both Gly451 and Tyr452 are located on a loop that extends from the P1 subdomain to form part of the P domain dimer interface (Fig. 1E). Meanwhile, the galactose was fixed by one hydrogen bond, and the N-acetyl-glucosamine by three, contributed by a mix of backbone and side chain interactions, including Lys449 on the aforementioned P1-interface loop (Fig. 3B; see also Fig. S1B).
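The hydrogen bonds counted above were drawn in the figures with distances of less than 3.2 Å, as stated in a figure legend. A minimal sketch of a distance-only screen for such contacts is given below. It assumes that a candidate hydrogen bond is any nitrogen or oxygen pair between saccharide and protein closer than that cutoff; donor-acceptor geometry is not checked, and the atom labels and coordinates are placeholders rather than values from the deposited structures.

```python
import math

H_BOND_CUTOFF = 3.2  # Å, the distance limit quoted in the figure legend


def polar_contacts(ligand_atoms, protein_atoms, cutoff=H_BOND_CUTOFF):
    """List (ligand atom, protein atom, distance) for N/O pairs under cutoff.

    Each atom is a tuple of (label, element, (x, y, z)).
    """
    polar = ("N", "O")
    contacts = []
    for lig_label, lig_elem, lig_xyz in ligand_atoms:
        if lig_elem not in polar:
            continue
        for prot_label, prot_elem, prot_xyz in protein_atoms:
            if prot_elem not in polar:
                continue
            distance = math.dist(lig_xyz, prot_xyz)
            if distance < cutoff:
                contacts.append((lig_label, prot_label, round(distance, 2)))
    return contacts


# Placeholder atoms (hypothetical labels and coordinates, illustration only).
ligand = [("FUC O2", "O", (0.0, 0.0, 0.0))]
protein = [
    ("protein O (near)", "O", (2.8, 0.0, 0.0)),  # within the cutoff
    ("protein N (far)", "N", (3.5, 0.0, 0.0)),   # beyond the cutoff
    ("protein C (near)", "C", (2.0, 0.0, 0.0)),  # close, but not polar
]
found = polar_contacts(ligand, protein)
```

A distance-only screen overcounts relative to a full geometric definition, so its output is best read as a list of candidates to inspect against the electron density.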
To better understand H type 2 recognition, we also determined the structure of an H type 2-disaccharide [α-L-fucose-(1-2)-β-D-galactose] in complex with the GII.10 P domain (Table 1). The fucose appeared well ordered, but the galactose ring was substantially less well defined (Fig. 3C). Apparently the single observed hydrogen bond to the galactose ring in the trisaccharide structure was not sufficient to fix the galactose in the disaccharide structure when not also sandwiched by an N-acetylglucosamine, as in the H type 2-trisaccharide (Fig. 3).
Overall, the unbound and H type 2-bound structures of the GII.10 P domain were virtually indistinguishable, except that in the bound structures, saccharides replaced a number of surface waters. Within the bound H type 2 HBGAs, the primary interactions were observed to be through the terminal αfucose1-2 moiety, which was tightly held by both hydrophobic and hydrophilic interactions at the P domain dimer interface and involved the P1-interface loop from one monomer and the P2 subdomain from another monomer (Fig. 2A and 3).
Structure of HBGA Leʸ-tetrasaccharide bound to the GII.10 P domain. The Leʸ-tetrasaccharide HBGA is α-L-fucose-(1-2)-β-D-galactose-(1-4)-2-N-acetyl-β-D-glucosamine-(3-1)-α-L-fucose, which is the product of α1-3-fucosyltransferase on H type 2-trisaccharide HBGA (see Fig. S2 in the supplemental material). Cocrystallization of the GII.10 P domain with Leʸ resulted in P2₁ crystals that diffracted to 1.48 Å, with cell constants virtually isomorphous with those of the unbound and H type 2-bound crystals (Table 1). Similar to the H type 2 structure described above, the Leʸ complex structure solution and refinement resulted in a single patch of electron density, which overlapped with the position of the αfucose1-2 in the H type 2 complex structure (Fig. 2A and 4A). The Leʸ-tetrasaccharide was tested in the following two orientations: either with αfucose1-2 or with αfucose1-3 placed in the P domain interface. Only the αfucose1-2 placement refined well. Refinement led to an Rwork value of 0.185 (Rfree = 0.204) and well-defined density for all of the saccharide units (Fig. 4A).
As described for the H type 2 complex structures, the αfucose1-2 of Leʸ was fixed by a network of six hydrogen bonds, i.e., two by Asp385, two by Arg356, one by Asn355, and one by Gly451, and a Tyr452 hydrophobic interaction (Fig. 4B; see also Fig. S1B in the supplemental material). The galactose of Leʸ was fixed by one water-mediated hydrogen bond, the N-acetylglucosamine by two backbone hydrogen bonds, and the terminal αfucose1-3 by a hydrogen bond to the side chain of Trp381. Interestingly, the positions of the saccharides, other than αfucose1-2, in Leʸ were quite different from those in H type 2 (Fig. 5A). In Leʸ, the galactose kinks up away from the protein, the N-acetylglucosamine swivels closer to the protein, and the terminal αfucose1-3 ends up being positioned close to the location of the third saccharide (N-acetylglucosamine) from H type 2.
HBGA Leᵇ-tetrasaccharide bound to the GII.10 P domain as a single ordered fucose. The Leᵇ-tetrasaccharide HBGA is α-L-fucose-(1-2)-β-D-galactose-(1-3)-2-N-acetyl-β-D-glucosamine-(4-1)-α-L-fucose, which is the product of α1-4-fucosyltransferase on H type 1-trisaccharide HBGA (see Fig. S2 in the supplemental material). Cocrystallization of the GII.10 P domain with Leᵇ resulted in C-centered orthorhombic crystals that diffracted to 1.85 Å, and structure solution with the unbound GII.10 P domain structure revealed the crystals to be in space group C222₁, with three monomers of the P domain in the asymmetric unit (see Fig. S3B in the supplemental material). Two of these three monomers formed the previously observed dimer, and the third monomer was arranged around a crystallographic 2-fold, so that it also formed the standard dimer.
Refinement to an Rwork value of 0.164 (Rfree = 0.189) revealed that the molecular dimer and the crystallographic dimer were virtually identical to each other (RMSD = 0.20 Å) and to the unbound dimer (RMSDs of 0.19 and 0.21 Å for the molecular and crystallographic dimer, respectively). Each of the three independent monomers contained a single somewhat poorly ordered αfucose1-2 (average B value of 49 Å²), held in place by the standard six hydrogen bonds that spanned between two P domain monomers (Fig. 2A). Notably, other than this single fucose, no additional saccharides were observed (Fig. 4C and D).
Comparison of the structures of the H type 2-di- and -trisaccharide HBGAs indicated that without a third saccharide, the intervening galactose became partially disordered (compare Gal in Fig. 3A and C). Moreover, examination of the differences between the Leᵇ and Leʸ chemistries indicated that the differences between these two could be envisioned as a swapping of the chemistries around the critical third saccharide ring, such that the two hydrogen bonds which are made at the first and second positions of that ring in the well-ordered Leʸ-bound HBGA would be disrupted (compare GlcNAc in Fig. 4B and D). Thus, while we could not rule out completely different potential orientations for the bound Leʸ HBGA, analysis of the other bound HBGAs indicated that only the αfucose1-2 of Leᵇ could bind in a manner similar to that of Leʸ, consistent with the singly ordered fucose that was observed.
[...] Fig. 1B). The HBGA outline was shaded in blue, the black dotted lines represent the hydrogen bonds, the red dotted line represents the hydrophobic interaction from Tyr452, and the sphere represents water molecules. For simplicity, only the backbone was shown for residues that were backbone mediated. Hydrogen bond distances were less than 3.2 Å, though the majority was ~2 [...]
Structures of HBGA type A- and B-trisaccharides bound to the GII.10 P domain. Cocrystallization of the GII.10 P domain with HBGA types A and B also resulted in P2₁ crystals that diffracted to 1.48 and 1.28 Å, respectively (Table 1). Similar to the structures described above, type A and B complex structure solutions and refinements resulted in a single patch of electron density, which overlapped with the position of the αfucose1-2 in the H type 2 complex structures (Fig. 2A). Placement of the αfucose1-2 of types A and B at the P domain interface allowed for the other two saccharides to be easily built into the remaining density. Refinement led to Rwork values of 0.178 and 0.167 (Rfree = 0.198 and 0.181) for type A and B bound structures, respectively, and well-defined density for all of the saccharide units (Fig. 6).
In addition to the six hydrogen bonds described above, αfucose1-2 was fixed by another water-mediated hydrogen bond to Lys449 (Fig. 6B and D). In total, five hydrogen bonds were contributed by one monomer of the P2 subdomain (Asn355, Arg356, and Asp385), and two were contributed by the P1-interface loop on the other monomer (Lys449 and Gly451), which also contributed the Tyr452 hydrophobic interaction (see Fig. S1B in the supplemental material). For type A, the galactose was fixed by one backbone-mediated hydrogen bond to Gly451, and the N-acetylgalactosamine by two water-mediated hydrogen bonds to Glu382. For type B, interactions were virtually identical, with the α-D-galactose also fixed by two water-mediated hydrogen bonds to Glu382. In contrast to H type 2 and Leʸ, types A and B bound in remarkably similar manners, with all atoms of fucose and galactose superimposing after alignment of P domain, with an RMSD of less than 0.01 Å (Fig. 5B).
Nonsecretor HBGAs Leᵃ- and Leˣ-trisaccharides were not observed to bind to the GII.10 P domain. The HBGAs Leᵃ-trisaccharide and Leˣ-trisaccharide are the products of the α1,3/4-fucosyltransferase, which adds a terminal αfucose1-3/4 unit to the standard galactose-N-acetylglucosamine precursor. These HBGAs are termed nonsecretors because they lack an αfucose1-2 unit. Cocrystallization of these with the GII.10 P domain resulted in monoclinic crystals that diffracted to 1.40 and 1.43 Å for Leᵃ and Leˣ, respectively, and molecular replacement and refinement revealed the standard P2₁ structure (Table 1), though in both cases, the patch of electron density was quite weak and no saccharide could be fitted (structures deposited without HBGA).
Structure of HBGA type B-trisaccharide bound to the GII.12 P domain. Having determined structures of the GII.10 P domain with a panel of HBGAs, we next turned to the GII.12 P domain. Cocrystallization of the GII.12 P domain with the type B-trisaccharide HBGA resulted in C222₁ crystals that diffracted to 1.60 Å (Table 2). Structure solution and refinement with the unbound GII.12 P domain resulted in a small patch of electron density, located at the P domain interface (Fig. 2B). Refinement led to an Rwork value of 0.219 (Rfree = 0.237). The fucose appeared very well ordered, while the two other saccharides were less well defined (Fig. 7A). The fucose was held in place by the standard six hydrogen bonds that spanned between two P domain monomers (Fig. 7B). However, in the case of GII.12, a main-chain hydrogen bond from cysteine (Cys345) replaced the GII.10 main-chain hydrogen bond from asparagine (Asn355).
Conservation of the HBGA binding motif in GII noroviruses. The structure of the outbreak GII.4 (VA387) strain of norovirus previously determined with HBGA type A- and B-trisaccharides closely resembles the GII.10 and GII.12 norovirus structures with HBGAs described here. Taken together, they reveal a coherent picture of HBGA recognition, dominated by αfucose1-2 binding, as observed by Tan et al. (41).
Of the 13 potential hydrogen bonds made by a terminal fucose, 6 are made by all 3 GII P domains in all 9 different HBGA P domain structures. These six, which are located in almost exactly the same places in all HBGA-bound structures, consist of five from a P2 subdomain and one from the P1-interface loop on another P domain monomer (Fig. 2A to C). These extensive contacts are quite specific for αfucose1-2, with αfucose1-3 unable to fit. The GII.10 and GII.4 interactions are further strengthened by a hydrophobic contact with the side chains of Tyr452 and Tyr443 on the P1-interface loop, respectively. Saccharides other than αfucose1-2 are attached in diverse ways, held in place by a rotating cast of surface residues.
To identify regions of high/low structural conservation, the six structures of GII.10 bound to different HBGAs were further analyzed. Per-residue nonhydrogen-atom RMSDs were computed for each pair of structures, and the average RMSD among all structure pairs for each residue was obtained. The RMSD values for the GII.10 binding site residues were then compared to the RMSD values of nonbinding site residues, with a range of solvent accessibility cutoffs. In all cases, residues interacting with the different HBGAs were more conserved structurally than nonbinding site residues, though the average RMSD values were generally low for both sets of residues (see Fig. S4 in the supplemental material).
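The per-residue averaging described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the structures are taken to be already superimposed, hydrogen atoms are taken to be absent from the input, and the residue numbers, atom names, and coordinates are hypothetical.

```python
import math
from itertools import combinations


def residue_rmsd(atoms_a, atoms_b):
    """RMSD over the atoms that one residue shares between two structures."""
    shared = sorted(set(atoms_a) & set(atoms_b))
    if not shared:
        raise ValueError("residue has no atoms in common")
    squared = sum(math.dist(atoms_a[name], atoms_b[name]) ** 2 for name in shared)
    return math.sqrt(squared / len(shared))


def mean_pairwise_residue_rmsd(structures):
    """Average each residue's RMSD over every pair of structures.

    structures: list of {residue_id: {atom_name: (x, y, z)}}.
    """
    if len(structures) < 2:
        raise ValueError("need at least two structures to form a pair")
    common = set.intersection(*(set(s) for s in structures))
    averages = {}
    for res in sorted(common):
        values = [residue_rmsd(a[res], b[res]) for a, b in combinations(structures, 2)]
        averages[res] = sum(values) / len(values)
    return averages


def toy_structure(shift):
    """Hypothetical two-residue model; residue 2 is displaced by `shift` Å."""
    return {
        1: {"N": (0.0, 0.0, 0.0), "CA": (1.5, 0.0, 0.0)},
        2: {"N": (10.0, shift, 0.0), "CA": (11.5, shift, 0.0)},
    }


models = [toy_structure(0.0), toy_structure(0.3), toy_structure(0.6)]
per_residue = mean_pairwise_residue_rmsd(models)
```

With six structures there are 15 pairs per residue; comparing the resulting averages between binding site and nonbinding site residues, at several solvent accessibility cutoffs, corresponds to the comparison reported in the text.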
Sequence conservation of GII noroviruses and comparison with GI noroviruses. The conserved GII recognition of HBGAs requires conservation of interacting residues. To understand the effect on sequence conservation engendered by this conserved recognition, we aligned a panel of GII norovirus sequences onto the atomic-level structures of GII.10 norovirus and analyzed conservation of surface residues relative to HBGA recognition. The residues on the surface of the P domain corresponding to the outer surface of the capsid were substantially less conserved than the inward facing surface residues (Fig. 8A). On the outer facing surface, two major regions of high conservation were observed. These overlapped with the two dimer-equivalent regions that interact with αfucose1-2 of the HBGA (Fig. 8A, middle, and B). Notably, the residues forming the surface of the P domain that interacts with the peripheral saccharides were generally less conserved than the αfucose1-2-interacting residues (see Fig. S5 in the supplemental material). Thus, the structure-function relationships involved in HBGA recognition appear to be reflected in the conservation of the GII norovirus surface residues.
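The conservation measure used for this mapping is not specified in the text. One simple possibility, shown here purely as an assumption, is the fraction of aligned sequences that share the most common residue at each alignment column, with gaps counted against conservation. The toy alignment is hypothetical and is not drawn from the norovirus panel.

```python
from collections import Counter


def column_conservation(aligned_seqs, gap="-"):
    """Per-column fraction of sequences carrying the most common residue."""
    if not aligned_seqs:
        raise ValueError("alignment is empty")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all sequences must have the alignment length")
    scores = []
    for column in zip(*aligned_seqs):
        residues = [r for r in column if r != gap]
        if not residues:
            scores.append(0.0)  # a column of gaps carries no conservation
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


# Hypothetical four-sequence alignment (illustration only).
toy_alignment = ["NRHD", "NRHE", "NKHD", "NR-D"]
scores = column_conservation(toy_alignment)
```

Scores of this kind can then be assigned to the corresponding residues of a reference structure and displayed on its surface, which is the style of comparison made between fucose-interacting and peripheral residues.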
To test whether this conservation was indeed a reflection of HBGA recognition, we aligned a panel of GI norovirus sequences (10) onto the previously determined structures (6) of GI.1 norovirus in complex with the HBGA type A and type H saccharides. The residues forming the surface of the GI P domain corresponding to the outer surface of the capsid were also substantially less conserved than the inward facing surface residues (see Fig. S6 in the supplemental material). On the outer facing surface, two regions of high conservation were observed. These overlapped with the dimer-equivalent regions on each monomer that interact with the HBGAs (Fig. S6). Notably, the surface patch formed by conserved residues in the GI noroviruses was in a different location than the patch in the GII noroviruses. In both cases, the sites of sequence conservation related to the regions involved in HBGA recognition, which is in agreement with previous observations (4,6,41). Thus, the structure-function relationships involved in HBGA recognition appear to be reflected in surface-residue conservation for both GI and GII noroviruses.
The region of high conservation on the GII.10 outer facing surface included an additional residue, His358, which was not part of the identified HBGA binding sites (see Fig. S7 in the supplemental material). In our structures and in the GII.4 structures determined previously (4), this residue was observed to make a potential hydrogen bond with the side chain of Asp385. The conservation of both Asp385 and His358 suggests that these two residues form a hydrogen bonding network that may be essential for HBGA binding of GII viruses. Due to its solvent exposure and adjacency to the fucose-binding site residues, it may be possible for His358 to also participate in direct binding interactions with some HBGAs. Likewise for GII.12, His348 (GII.12 numbering) was observed to form a similar hydrogen bond with the side chain of Asp375 (data not shown).

DISCUSSION
Viruses often use genetic variability to escape host recognition. Such variation, however, is limited by function: the virus cannot alter functionally critical elements while retaining infectivity. In particular, recognition of host factors, such as receptors or cofactors, generally requires regions on the outer surface of the virus to remain conserved. In the case of HIV-1, interaction with the CD4 receptor requires part of the HIV-1 gp120 envelope glycoprotein to remain conserved, and this same site is recognized by antibody VRC01, which is able to neutralize over 90% of circulating HIV-1 isolates (45,48). In the case of influenza virus, interaction with the sialic acid receptor results in conservation of a small surface patch on the hemagglutinin trimer, and small molecules and antibodies that target this patch have been less successful at broadly neutralizing diverse strains of influenza virus (43,44). With noroviruses, functional requirements related to HBGA recognition could potentially require substantial portions of the capsid surface to remain conserved and thereby serve as sites of vulnerability to small molecule- or antibody-mediated neutralization.
One way that noroviruses might alter such conservation requirements is by varying their modes of interaction with HBGAs. If different noroviruses were to use different modes of interaction, then different conservation schemes, and enhanced variation, would result. Indeed, different modes of HBGA recognition are observed between the GI and GII genogroups of human noroviruses (3,4,6). The crystal structures obtained here from rare GII isolates (GII.10 and GII.12), however, show means of HBGA recognition virtually identical to those of the previously determined outbreak GII.4 structures (4). These results suggest that within GII, a single mode of recognition occurs.
The size of a HBGA is roughly half the size of an antibody epitope. If HBGA recognition were to require a conserved surface of roughly this size, such conservation could lead to significant vulnerability to antibody-mediated neutralization. Structure-function analysis of the GII.10 norovirus with a panel of HBGAs, however, indicates conserved binding at only one saccharide unit, terminal αfucose1-2, with variable recognition at peripheral saccharide units. Apparently, norovirus uses variation in human HBGAs, along with flexibility between saccharide units within each HBGA and variation in amino acid side-chain stereochemistry, so that the same amino acids can recognize diverse HBGAs in different ways. This allows the GII noroviruses to reduce the size of the conserved interaction surface to residues under a single critical saccharide rather than the entire HBGA. Nevertheless, this conserved surface defines a potential site of vulnerability on GII viruses (Fig. 8C) and may thus present a useful target for therapeutic and/or vaccine design efforts.
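As an illustration of how a conserved surface patch of this kind can be located from sequence data, the following is a minimal sketch of a per-position conservation score computed from an alignment. It is not the procedure used to generate Fig. 8; the scoring scheme (one minus normalized Shannon entropy), the function name `column_conservation`, and the toy sequences are all assumptions introduced here for illustration, and the sequences are not real norovirus sequences.

```python
import math
from collections import Counter


def column_conservation(alignment):
    """Score each alignment column from 0 (variable) to 1 (fully conserved).

    Assumed scoring scheme: 1 minus the Shannon entropy of the column,
    normalized by the maximum entropy for 20 amino acids. Gap characters
    ("-") are ignored; an all-gap column scores 0.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    max_entropy = math.log2(20)
    scores = []
    for i in range(length):
        residues = [seq[i] for seq in alignment if seq[i] != "-"]
        if not residues:
            scores.append(0.0)
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        scores.append(1.0 - entropy / max_entropy)
    return scores


# Hypothetical toy alignment (NOT real norovirus sequences): the first four
# columns are invariant, the last column differs in every sequence.
toy_alignment = ["NRDGA", "NRDGS", "NRDGT", "NRDGV"]
scores = column_conservation(toy_alignment)
conserved_columns = [i for i, s in enumerate(scores) if s > 0.999]
print(conserved_columns)
```

In practice, such per-position scores would be mapped onto the surface residues of a structure, so that invariant positions clustered under the bound fucose stand out from the variable positions around them.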
The HBGAs analyzed here represent only a fraction of known HBGAs (22). Those described here are involved in a primary major biosynthetic pathway, happen to be commercially available, and were described in a number of previous papers characterizing norovirus-HBGA interactions (11-13, 19, 31, 37, 38). We provide definition for this panel with GII.10 and GII.12 noroviruses, with crystal structures at ~1.5-Å resolution. The high resolution revealed unexpected details. In the structure with H type 2-trisaccharide, the αfucose1-2 refined as a βfucose; nuclear magnetic resonance (NMR) analysis of the commercially obtained trisaccharide shows a mixture of at least four components, including a βfucose-containing impurity (data not shown). The impurities in the commercially available HBGAs may also explain some of the inconsistencies among the different laboratories, as recently reported (39). Nonetheless, as the αfucose-(1-2)-β-D-galactose disaccharide unit is common to most of the HBGAs analyzed here, the placement of the correct disaccharide unit was clear from other structures. We note, however, that the density observed for a βfucose variant of the H type 2-trisaccharide looked very good, indicating that βfucose is accommodated by the norovirus binding pocket, in addition to the standard αfucose.
One reason that the recognition of the HBGAs could be reduced to a single saccharide unit may relate to the avidity between norovirus and HBGAs on the cell surface. It is likely that HBGA affinity correlates with the number of saccharide units fixed in the norovirus-HBGA interaction, and in some cases, only a single fucose was fixed. The expected low affinity between a single fucose and a norovirus virion is unlikely to be sufficient for receptor or cofactor function; interactions between a number of cell-associated fucoses and multiple binding sites on the polyvalent norovirus capsid, however, might suffice. Similar avidity considerations have been observed with influenza, where relatively weak interactions with sialic acid are sufficient to serve as receptors (33). The observed primary binding to αfucose along with avidity considerations opens up a number of possibilities for norovirus entry: in addition to HBGAs, for example, the α1-2fucosylation of mucin (see references 32 and 42) may potentially allow mucin to act as a receptor or cofactor. Indeed, since the rarely detected GII.10 P domain bound a panel of HBGAs and the αfucose1-2 binding interface was similar to that of the dominant outbreak GII.4 strain, other receptors or cofactors may be important determinants for genotype prevalence and/or viral entry.
FIG. 8. Surface representations of GII amino acid conservation and putative site of vulnerability for GII noroviruses. Antigenic diversity of noroviruses is seen primarily on the outermost surface of the capsid, although patches of conservation on the top surface are observed. The most prominent of these patches correspond to the P domain-binding sites of the HBGAs described here. (A) An alignment of GII genotypes was used to map the amino acid conservation and variability on the GII.10 P domain dimer structure. The color-coded conservation ranged from a deep purple, represented by highly conserved amino acids, to white, represented by highly variable amino acids. GII conservation was mapped onto a model of the viral capsid (left), with a zoomed-in P domain dimer outer-facing surface (middle) and a 90° dimer rotation that shows the difference in conservation of the outer- and inner-facing surfaces (right). The outer-facing surface (top portion) is substantially less conserved, with two major surface patches of conserved residues overlapping the HBGA binding site. (The highly conserved but nonprotruding portions of the capsid correspond to the shell domain.) (B) Close-up stereo view of panel A, middle, showing the six different HBGAs bound to the GII.10 P domain. (C) Surface representation of GII.10 amino acid conservation was obtained as described above and mapped onto the GII.10 P domain structure. The identified site of vulnerability (yellow) was defined as the surface area of the following GII.10 residues participating in conserved hydrogen-bonding interactions with αfucose1-2: Asn355, Arg356, and Asp385 from one subunit and Gly451 from the other subunit.
Our structural analysis strengthens the previous observation that GII norovirus recognition of HBGAs requires the preservation of a conserved binding site across a dimer interface, which involves interactions with both the P1 and P2 subdomains (41). It has been previously suggested that the P2 subdomain is an insertion into P1 and may be the determinant of strain specificity due to its high variability and surface exposure (30). In contrast, P1 is more conserved and more internal (30), suggesting that its role as a specificity determinant may be diminished. Our structures, as well as previously determined GII.4 complex structures (4), indicate that HBGA binding involves important contacts with residues on a P1 interface loop (Fig. 3, 4, 6, and 7). These results indicate that, in addition to being partially responsible for homodimerization (30), the P1 subdomain plays a prominent role in recognizing HBGAs and thus may play a more prominent role in strain specificity than previously suggested.
Overall, the results provide a framework for understanding how requirements for HBGA interactions influence norovirus sequence conservation and lead to a highly conserved site on the outer surface of the capsid. This highly conserved site is a potential site of vulnerability for inhibition of virus entry. Whether small-molecule competition at, or antibody targeting of, this conserved site allows for effective inhibition of norovirus entry remains to be seen. As we observe here, diversity in HBGA recognition (between different genotypes, different HBGAs, and different units of each HBGA) and reductions in required HBGA affinity (through avidity) provide a mechanism for viral reduction in the size of the conserved surface area while maintaining functional requirements for interactions with the host during entry.
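The avidity argument made above can be made concrete with an idealized sketch. It assumes identical, independent binding sites that engage simultaneously, so that the apparent dissociation constant for n sites is approximated by Kd^n / C_eff^(n-1), where C_eff is an effective local concentration of tethered ligand. This is a textbook idealization, not a model fitted to norovirus, and the function name `effective_kd` and all numerical values below are hypothetical; no affinities were measured in this study.

```python
def effective_kd(single_site_kd, n_sites, effective_conc):
    """Idealized apparent Kd (molar) for n identical sites bound at once.

    Assumption: sites are independent and the tethered ligands experience
    an effective local concentration `effective_conc` (molar), giving
    Kd_apparent = Kd**n / C_eff**(n - 1).
    """
    if n_sites < 1:
        raise ValueError("n_sites must be at least 1")
    return single_site_kd ** n_sites / effective_conc ** (n_sites - 1)


# Hypothetical values chosen only to illustrate the trend.
single_site_kd = 1e-3   # assumed weak millimolar single-fucose affinity
effective_conc = 1e-2   # assumed effective local ligand concentration

for n in (1, 2, 3):
    print(n, effective_kd(single_site_kd, n, effective_conc))
```

Under these assumed values, each additional engaged site lowers the apparent Kd tenfold, which illustrates how a set of individually weak fucose contacts on a polyvalent capsid could add up to functionally useful binding.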