GII.13/21 Noroviruses Recognize Glycans with a Terminal β-Galactose via an Unconventional Glycan Binding Site

Evidence from both phenotypic binding assays and structural studies supports the interaction of human noroviruses (huNoVs) with histo-blood group antigens (HBGAs) as receptors or attachment factors, which affects host susceptibility. The GII.13 and GII.21 genotypes form a unique genetic lineage that differs from the mainstream GII huNoVs in its unconventional glycan binding site. In contrast to previous findings that GII.13/21 genotypes recognize only the Lea antigen, we found in this study that they interact with a group of glycans sharing a common terminal β-Gal, including Lec, lactose, and mucin core 2. However, this wide glycan binding spectrum, achieved through a unique binding mode, appears not to increase the prevalence of GII.13/21 huNoVs, probably because decoy glycan receptors in the human gastrointestinal tract limit their infection. Our findings shed light on the host interaction and epidemiology of huNoVs, which would impact strategies for huNoV control and prevention.

(ORFs) (2). ORF1 encodes six nonstructural (NS) proteins responsible for viral genome replication, while ORF2 and ORF3 encode major and minor capsid proteins. The major capsid protein (VP1) is composed of a shell (S) domain and a protruding (P) domain, and the P domain is further divided into the P1 and P2 subdomains. The P2 subdomain is the main determinant of NoV diversity, antigenicity, and glycan binding patterns.
NoVs are classified into seven genogroups (GI to GVII), among which GI, GII, and GIV cause human infections (1). HuNoVs recognize histo-blood group antigens (HBGAs) as receptors or attachment factors that are believed to be important for huNoV infection and host susceptibility (3-6). HBGAs are fucose-containing glycans and the determinants of various blood types, including ABO, Lewis, and secretor status. While the HBGAs are polymorphic and diverse in interacting with different huNoVs, GI and GII huNoVs show genogroup-specific binding modes (6). Specifically, the majority of GII huNoVs recognize the α-fucose (α-Fuc) of HBGAs as the major binding saccharide (MaBS), while GI huNoVs interact with the β-galactose (β-Gal) as the MaBS (5). In all these cases, huNoVs also interact with at least one other saccharide to support or stabilize the binding outcomes.
In addition to HBGAs, huNoVs can also recognize other glycans or small chemicals. For example, some huNoVs (GI.3 VA115 and GII.4 VA387) reportedly bind gangliosides and sialic acid-containing glycoconjugates (7), similar to murine norovirus (MNoV) (8) and feline calicivirus (9). Animal caliciviruses are also reported to recognize carbohydrates as attachment factors; for example, bovine NoV binds to α-Gal (10) and canine and bat NoVs recognize the A and H antigens (11,12). Also, some huNoVs and MNoVs bind bile acids as cofactors, which may enhance huNoV-HBGA interactions (13). Moreover, human milk oligosaccharides (HMOs) act as natural decoys for huNoVs, which may reduce the risk of huNoV infection of milk-fed infants (14-16). The above scenarios indicate the complexity and diversity of huNoV-glycan interactions.
The HBGA binding profiles of huNoVs change over time within a genotype, which may contribute to their prevalence. For example, a new variant of the previously rare genotype GII.17 became the predominant strain causing outbreaks during the epidemic season of 2014-2015 (17,18). Along with the antigenic change (17), the new GII.17 variant had a broader HBGA binding spectrum due to minor mutations at the conserved GII glycan binding site (GBS) (18-21). Therefore, identification of new huNoV-glycan interactions, especially those of the rare genotypes, is necessary to help in the prevention and control of huNoV-associated diseases. Strikingly, unlike the mainstream GII huNoVs, which share a highly conserved GBS that binds various combinations of HBGAs, the GII.13 and GII.21 genotypes form a special genetic lineage with a novel GBS that was reported to bind only the Lea antigen among HBGAs (22-24). These previous data raise the question of what advantage a new GBS with such a narrow binding spectrum would have over the original GII GBS. We provide phenotypic and structural data to demonstrate that this new GBS recognizes a group of glycans that share a terminal β-Gal. In addition, we discuss the potential mechanism of the low prevalence of the GII.13/21 lineage in humans.

RESULTS
Glycan binding specificity of GII.13 P domain proteins. Using saliva- and oligosaccharide-based binding assays, we determined the glycan binding specificities of recombinant P proteins of three GII.13 NoVs isolated at different times: Goulburn Valley/1983, 08N2045/2008, and SC1065/2016 (Fig. 1A). All three GII.13 P proteins exhibited broad-spectrum binding to all saliva samples representing type A, B, and O secretors and nonsecretors (Fig. 1B). Oligosaccharide-binding assays showed that these GII.13 P proteins bound to Galβ1-3GlcNAc (Lec) and Lac (Galβ1-4Glc). 08N2045/2008 and SC1065/2016 also bound weakly to mucin core 2 [Galβ1-3(GlcNAcβ1-6)GalNAc] (Fig. 1B). Furthermore, GII.14 GZ/2016 P protein, which has been shown to bind Lewis a antigen, served as a positive control (Fig. 1C). This is the first study to show that huNoV P domains bind mucin core oligosaccharides. We noted that these glycans share a terminal β-Gal, which may play a critical role in the binding outcomes. GII.13 and GII.21 P domains did not show obvious binding signals to the Lea antigen in our glycan binding experiments. In the saliva-based binding assays using saliva samples representing various Lewis phenotypes, including Lewis-positive nonsecretor, Lewis-negative A secretor, Lewis-negative B secretor, and Lewis-negative O secretor, GII.13 P proteins did not show any binding signals to the Lewis-positive nonsecretor samples but showed binding signals to the Lewis-negative samples. This result indicated that binding of these GII.13 P proteins was not dependent upon the Lea antigen (Fig. 1D).
Crystal structure of the GII.13 P dimer-Lec/core 2 complex. To further explore the structural basis of the observed GII.13 huNoV-glycan interaction, we solved the crystal structure of the GII.13 SC1065/2016 P domain in complex with Lec disaccharide to a 1.6-Å resolution in the P12₁1 space group (Fig. 2A). Lec disaccharide was visible in the (2mFo-DFc) omit difference electron density map (Fig. 2B), and the two sugar rings fitted into the map. The Lec-binding sites are located on the top of each P domain (Fig. 2A). Eight residues from the P2 subdomain were involved in binding to the β-Gal of Lec via hydrogen bonding and hydrophobic interactions (Fig. 2C). Specifically, W298 from the B loop, S357 from the N loop, and N395 and T398 from the T loop formed the bottom region of the binding pocket, and N297 from the B loop, T359 and S360 from the N loop, and N397 from the T loop constituted the edge region of the binding pocket. The β-Gal interacts with the side chains of N297, S357, T359, N395, and N397 through hydrogen bonds, while T398, S360, and W298 form hydrophobic interactions with β-Gal to support the binding outcomes (Fig. 2C and D). In contrast, the other saccharide of the Lec disaccharide, N-acetyl-β-glucosamine (β-GlcNAc), points away from the surface of the P dimer and, as a result, formed only a hydrophobic interaction with E396.
The crystal structure of the GII.13 08N2045/2008 P domain in complex with mucin core 2 trisaccharide was also determined at a 1.7-Å resolution (Fig. 2E). All three sugar rings were evident in the electron density map (Fig. 2F). Similar to Lec, mucin core 2 interacted with the GII.13 P domain mainly through the β-Gal. Seven residues were involved in this interaction, including W297, S356, T358, S359, N394, N396, and T397 (Fig. 2G). Residues S356, T358, N394, N396, and T397 contributed to the major interactions by forming hydrogen bonds with β-Gal, while W297 and S359 were involved in hydrophobic interactions (Fig. 2H). In addition, the GalNAc formed a hydrogen bond with N396, while the GlcNAc of the mucin core 2 trisaccharide pointed away from the surface of the P dimer without directly participating in the interactions. These structural data show that the GII.13 P dimer interacts with Lec/core 2 glycans mainly through the common terminal β-Gal.
Mutation study of the GII.13 GBS. Although the GII.13 and GII.21 OIF GBSs share high genetic and structural similarity, we noted two amino acid differences: the residue at position 297 in GII.13 SC1065 is an asparagine (N), whereas the corresponding residue in GII.21 OIF is a tyrosine (Y), and G361 in GII.13 SC1065 corresponds to E358 in GII.21 OIF. Because the GII.21 OIF P domain was reported to bind the Lea antigen (23) but the GII.13 P domains in our study did not, three reverse mutants of the GII.13 (SC1065/2016) P domain were constructed to explore the role of these amino acids in glycan binding. These included two single mutants (N297Y and G361E) and a double mutant (N297Y/G361E) (Fig. 1A), designed to restore the Lea binding function. Unexpectedly, oligosaccharide-based binding assays showed that all mutant P domains retained their original pattern of binding to Lec/Lac/mucin core 2 glycans and did not bind to the Lea antigen (Fig. 3A). Also, the single mutant N297Y and the double mutant N297Y/G361E showed stronger binding signals to all Lec/Lac/mucin core 2 glycans, particularly to mucin core 2, while the G361E mutant showed the same binding affinity for the glycans as the native P proteins (Fig. 3A).
We determined the crystal structure of the SC1065/2016 N297Y mutant. Further structural comparison showed that all other residues involved in glycan binding were conserved among GII.13/21 NoVs, with the exception of a change from Y295 in GII.21 OIF to N297 in GII.13 (Fig. 3B), likely presenting different orientations. Y297 in the crystal structure of the SC1065 N297Y mutant displayed an orientation opposite that in GII.21 OIF (Fig. 3B). The residue E358, which forms a hydrogen bond with the fucose of Lea in GII.21 OIF (23), is replaced by G361 in GII.13 (Fig. 3B), likely losing the potential interaction. These differences in amino acid conformation may affect the binding specificity and/or affinity of the P protein for the glycan ligands, which may contribute to the lack of binding of GII.13 to the Lea antigen in this study.

[...] lineage as controls (Fig. 1A). Oligosaccharide-based binding assays showed that all three GII.21 P proteins bound to Lec, Lac, and mucin core 2 glycans, a binding profile similar to that of the GII.13 NoVs (Fig. 4A). In contrast, two GII.17 P proteins bound to type H disaccharides (Fucα1-2Gal), consistent with the previously reported binding between the GII.17 Kawasaki308 P domain and 2'-FL (Fucα1-2Galβ1-4Glc) (15) (Fig. 4B). Thus, GII.13 and GII.21 P proteins showed similar binding specificities in this study.
Inhibition of glycan and saliva binding of GII.13/21 P domains by nonfat milk and lactose. Consistent with the binding to lactose, nonfat milk blocked the glycan binding function of the GII.13 and GII.21 P proteins (Fig. 5A and B) but did not affect the HBGA binding of the GII.17 P protein (Fig. 5C). Two percent nonfat milk blocked the binding of GII.13/21 P proteins to saliva (Fig. 6A and B). However, even high-concentration (10%) nonfat milk did not inhibit the binding of GII.17 P protein to saliva (Fig. 6C). These data suggest that some components of nonfat milk occupy the GBS of the GII.13/GII.21 P domain and thus inhibit binding to Lac/Lec/mucin core 2 glycans and saliva.
Similarly, a lactose blocking assay showed that free Lac blocked binding of the GII.13 and GII.21 P domain proteins to Lec and mucin core 2-PAA conjugates (Fig. 7A and B) and to saliva (Fig. 8A and B), but not the binding of GII.17 P domain protein to H antigen or saliva (Fig. 7C and 8C). Therefore, lactose is an effective blocking reagent for the observed GII.13 and GII.21 P domain-glycan binding.

DISCUSSION
The GII.13/21 genetic lineage has gained a novel GBS distinct from those of the other GII NoVs. However, why, how, and when this GII.13/21 lineage emerged from mainstream GII NoVs, most likely from a GII.17 NoV (22), remains unclear. A basic question is what glycans this novel GBS binds. Previous studies on this genetic lineage concluded that the GII.13/21 GBS binds to the Lea antigen, which is one of the many HBGAs (22,23). However, such a narrow binding spectrum seems an incomplete answer, because it lacks the diversity typical of huNoVs and does not show an advantage of the new GBS over the original one. Part of this puzzle has been solved by this study, as we have offered solid evidence to demonstrate that the GII.13/21 GBS recognizes a group of glycans that share a terminal β-Gal, including Lec, Lac, and mucin core 2. Importantly, the crystal structures of the GII.13 P domains in complex with Lec and mucin core 2 suggest that the interaction of the GII.13/21 GBS with these glycans is mediated by [...] core 2 but did not bind Lea or Lex, which contains a terminal β-Gal. To investigate this, we performed a structural superimposition of the three P domains in complex with Lea, Lec, and mucin core 2 (Fig. 9A). This structural comparison showed that the GBSs were highly conserved but did not indicate a clash of the three glycans with any part of the GBSs (Fig. 9B). Also, an attempt to shift the binding of GII.13 SC1065/2016 to Lea by reverse mutation of residues 297 and/or 361 failed. Thus, why the GII.13/21 P domain binds some β-Gal-containing glycans but not others, as well as the reason for the various binding strengths, is unclear. The glycan binding profiles of huNoVs have been shown to be associated with their infection risk (3,24-29), and possibly also their prevalence (19,21). We showed that the GII.13/21 huNoVs with the new GBS recognize a broad spectrum of terminal β-Gal-containing glycans, including mucin core 2.
Mucin core 2 polymerizes at the C and N termini of many proteins in the gastrointestinal tract, and these mucin core 2-bearing proteins were shown to be secreted into the intestinal lumen, forming a gel-like network as the main structure of mucus (30). This mucin core 2-containing mucus may act as an attachment factor for GII.13/21 NoVs, similar to HBGAs for other huNoVs. However, except for a slight increase of GII.13 prevalence in Nepalese children between 2005 and 2011 (31), the worldwide prevalence of GII.13/21 NoVs appears not to be high according to previous reports (32,33) and the NoroNet/CaliciNet public databases (34). This seems to conflict with the fact that GII.13/21 P domains bound the majority of human saliva samples tested in this study, which are rich in β-Gal-containing glycans. While the distribution of β-Gal-containing glycans on the human intestinal mucosal epithelium is unknown, it is plausible to assume that many β-Gal-containing glycans are present there, because β-Gal is a common component of mammalian glycans. So, why is the prevalence of GII.13/21 huNoVs low? One possibility is related to the β-Gal saccharide-binding mode of the GII.13/21 GBS, which may not bind the glycan attachment factor with high affinity. Although we detected binding by enzyme-linked immunosorbent assay (ELISA), the binding affinity between single-β-Gal- or terminal-β-Gal-containing glycans and the GII.13/21 GBS should be determined using methods with greater accuracy.
A more important factor contributing to the low prevalence of the GII.13/21 huNoVs may be that their GBS is easily interfered with. The function of the GII.13/21 GBSs appears to be easily blocked or inhibited by low levels of glycerol molecules (22,23), suggesting that other small molecules with a glycerol-like structure may also inhibit the GII.13/21 GBS. Indeed, we also detected a glycerol molecule in the GII.13 08N2045/2008 P dimer. Moreover, nonfat milk and free lactose inhibited the binding of GII.13/21 P proteins to their ligands. Because lactose and lactose-like molecules are abundant in milk, these molecules may be extensively present in the gastrointestinal tract of breastfeeding infants and milk-drinking adults. Thus, lactose and other free glycans containing a terminal β-Gal may function as decoy receptors (35) for GII.13/21 huNoVs, reducing the risk of infection. This may explain the low prevalence of GII.13/21 huNoVs. Based on this principle, compounds containing β-Gal glycans may function as antivirals against infection by GII.13/GII.21 huNoVs. Another potential significance of the emergence of the new GBS in GII huNoVs may be escape from herd immunity. By targeting individuals spared by the dominant GII strains, the new GBS might allow GII.13/GII.21 strains to find a niche of susceptible individuals, even though it is quite limited. This may represent one of the ways in which NoVs have reached their current diversity.
In conclusion, GII.13/21 huNoVs have developed a novel GBS that interacts with glycans via the common β-Gal through a unique glycan binding mode. This binding mode differs from that of the conventional GBS of the mainstream GII huNoVs, which relies on the α-Fuc as the MaBS (Fig. 9C). Although GII.13/21 huNoVs exhibit wide glycan binding spectra that should facilitate their infection and prevalence, the decoy glycan receptors in nature may limit their prevalence. Our findings provide new insights into the host interaction, evolution, and epidemiology of huNoVs, which may facilitate development of strategies for control and prevention of huNoVs.

MATERIALS AND METHODS
[...] conjugated goat anti-rabbit antibody (1:10,000) were used as the primary and secondary antibodies, respectively. In nonfat milk and lactose blocking saliva assays, the saliva coating and nonfat milk blocking steps were performed as described above. For nonfat milk blocking and lactose blocking saliva assays, eight P proteins were added at 0.5 µg/well to 2%, 5%, or 10% skimmed milk powder and 2, 5, or 10 mg/well of free lactose, respectively. Anti-SC1065 (1:6,000) and anti-GII.17 KW323 (1:8,000) (18) primary antibodies were used in the assays of GII.13/GII.21 and GII.17 P proteins, respectively. Other steps were as described above (41).
Protein crystallization. The purified P domains of SC1065 and 08N2045 were concentrated to ~10 mg/ml. Native crystals of the SC1065 and 08N2045 P proteins were grown using the sitting-drop vapor diffusion method by mixing 1 µl of protein solution with an equal volume of reservoir solution containing 0.2 M sodium citrate tribasic dihydrate, 20% (wt/vol) polyethylene glycol 3350, 0.2 M ammonium acetate, 0.1 M HEPES (pH 7.5), and 25% (wt/vol) polyethylene glycol 3350. SC1065 P protein and Lec disaccharide (Dextra) were mixed at a 1:50 molar ratio, incubated for 5 h at 4°C, and crystallized under the same conditions as the native protein; complex crystals of the 08N2045 P domain and core 2 trisaccharide were obtained in the same way. SC1065 N297Y was crystallized in 8% (vol/vol) Tacsimate (pH 8.0) and 20% (wt/vol) polyethylene glycol 3350. After incubation for 7 days at 18°C, the native and complex crystals were transferred to a cryoprotectant containing mother liquor and 20% (vol/vol) glycerol and subsequently flash-frozen in liquid nitrogen.
Data collection and processing. X-ray diffraction data were collected at Shanghai Synchrotron Radiation Facility (SSRF) BL19U and processed with HKL2000. Additional processing was performed using CCP4 software. The structure of the GII.13 SC1065 P domain was determined using the molecular replacement module of PHASER, with the GII.21 P structure (Protein Data Bank [PDB] code 4RLZ) as a search model. The model was further refined using phenix.refine (42) in PHENIX (43) with energy minimization, ADP refinement, and bulk solvent modeling. The stereochemical quality of the final model was assessed using MolProbity. Data collection and refinement statistics are summarized in Table 2. The structural analysis was performed using PyMOL software (https://pymol.org/2/). The representative GII.13 P structures were superimposed with the align function in PyMOL.
a Values in parentheses are given for the highest-resolution shell.
b $R_{\mathrm{merge}} = \sum_{hkl} |I - \langle I \rangle| / \sum_{hkl} I$, where $I$ is the intensity of unique reflection $hkl$ and $\langle I \rangle$ is the average over symmetry-related observations of unique reflection $hkl$; $hkl$ is the reflection index.
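The R merge definition in the footnote above can be illustrated with a short Python sketch. It is not the HKL2000 implementation; it assumes that each observation has already been reduced to its unique reflection index hkl, and that both sums run over every symmetry-related observation of every unique reflection. The function name `r_merge` and the input layout are hypothetical.

```python
from collections import defaultdict


def r_merge(observations):
    """Compute R merge from (hkl, intensity) pairs.

    Assumption: hkl is the unique reflection index (a tuple), so that
    symmetry-related observations share the same key.
    """
    # Group the measured intensities by unique reflection.
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0    # sum of |I - <I>| over all observations
    denominator = 0.0  # sum of I over all observations
    for intensities in groups.values():
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)

    if denominator == 0:
        raise ValueError("total intensity is zero; R merge is undefined")
    return numerator / denominator


# Illustrative values only, not data from this study.
example = [
    ((1, 0, 0), 100.0),
    ((1, 0, 0), 110.0),
    ((0, 1, 0), 50.0),
    ((0, 1, 0), 50.0),
]
print(round(r_merge(example), 4))
```

A reflection whose symmetry-related observations agree exactly contributes nothing to the numerator, so lower values indicate better internal consistency of the data set.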
Statistical analysis. All data were analyzed using SPSS 20.0 software. Analysis of variance (ANOVA) was used to compare the differences between untreated and free-Lac-treated groups. In figures, statistical significance is indicated as follows: *, P < 0.05, and **, P < 0.01.
Data availability. The structures of 08N2045, SC1065-Lec, 08N2045-Core2, and SC1065-297Y P domains have been deposited under PDB codes 6JYR, 6JYN, 6JYS, and 6JYO, respectively.
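As an illustration of the statistical analysis described above, the following Python sketch computes the one-way ANOVA F statistic for a set of groups and maps a P value to the asterisk notation used in the figures. The analysis in this study was run in SPSS 20.0; the function names here are hypothetical, the numbers are illustrative and not data from this study, and the P value itself would still have to come from an F distribution routine in a statistics package.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)

    # Between-group sum of squares: group size times the squared
    # deviation of each group mean from the grand mean.
    ss_between = 0.0
    # Within-group sum of squares: squared deviations from each group mean.
    ss_within = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)
        ss_between += len(group) * (group_mean - grand_mean) ** 2
        ss_within += sum((value - group_mean) ** 2 for value in group)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def significance_label(p_value):
    """Map a P value to the figure notation: ** for P < 0.01, * for P < 0.05."""
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


# Illustrative absorbance-like values only.
untreated = [1.0, 2.0, 3.0]
low_lac = [2.0, 3.0, 4.0]
high_lac = [5.0, 6.0, 7.0]
f_stat, df_b, df_w = one_way_anova_f([untreated, low_lac, high_lac])
print(f_stat, df_b, df_w)
```

With two groups only (untreated versus one treated condition), the same F statistic is equivalent to the square of the two-sample t statistic with pooled variance.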