Identification of Ebola Virus Inhibitors Targeting GP2 Using Principles of Molecular Mimicry

The most recent Ebola virus disease outbreak, from 2014 to 2016, resulted in approximately 28,000 individuals becoming infected, which led to over 12,000 casualties worldwide. The particularly high pathogenicity of the virus makes paramount the identification and development of promising lead compounds to serve as inhibitors of Ebola infection. To limit viral load, the virus-host membrane fusion event can be targeted through the inhibition of the class I fusion glycoprotein of Ebolavirus. In the current work, several promising small-molecule inhibitors that target the glycoprotein GP2 were identified through systematic application of structure-based computational and experimental drug design procedures.

ular modeling tools and experimental characterization. Specifically, large-scale virtual screening of ~1.7 million small molecules was performed with GP2 using the program DOCK6 (42). We hypothesize that small molecules that interact with the GP2 NHR pocket will interfere with assembly of the 6HB required for EBOV-host membrane fusion (Fig. 1). Interfering with 6HB formation is a strategy previously employed successfully against HIV (43-52) through targeting an analogous pocket on the viral protein gp41 (53,54). The computational screening resulted in the prioritization and purchase of 165 compounds for experimental characterization, which led to 11 hits that inhibit viral entry in both EBOV-GP-pseudotyped virus and EBOV transcription- and replication-competent virus-like particle (trVLP) systems. Compounds were further evaluated to assess (i) potential activity artifacts using detergent-containing experiments, (ii) specificity for EBOV using a vesicular stomatitis virus glycoprotein (VSV-G)-pseudotyped virus particle counterscreen, and (iii) step(s) within the EBOV replication cycle where they exerted the majority of inhibitory activity using time-of-addition (TOA) analysis. Results suggest that 4 of the 11 compounds act to specifically inhibit EBOV entry after attachment but prior to virus-host membrane fusion. Molecular dynamics (MD) simulations in conjunction with genome analysis identified 7 highly conserved residues across different Ebola virus strains (E564.A, A568.A, L571.A, F572.A, T566.C, L569.C, and L573.C) that contribute a majority of the favorable interactions between the compounds and GP2.

RESULTS
Virtual screening outcomes. The goal of this study was to identify molecules that inhibit EBOV infection by interfering with the interactions required for formation of the GP2 six-helix bundle (6HB). Since the conformational change required to produce the postfusion structure is dependent on CHR binding the NHR region of GP2 (Fig. 1), a virtual screen of approximately 1.7 million compounds was conducted against a five-helix bundle model of GP2 constructed by the removal of one CHR from a high-resolution postfusion structure (PDB entry 2EBO [25]) (see Materials and Methods, below). Compound prioritization, which led to 83 candidates purchased for experimental testing (Fig. 2), employed five distinct scoring functions: DCE SUM (DOCK Cartesian van der Waals and electrostatic energy), FPS VDW (footprint comparison of the van der Waals energy of the reference peptide and selected ligands), FPS ES (footprint for electrostatic energy), FPS SUM (footprint for both van der Waals and electrostatic energy), and TS (total score; the combination of DCE SUM and FPS SUM). A large number of molecules was prioritized based on their structural and spatial similarity to the reference ligand, which was composed of the segment of the CHR that made the most favorable interactions with our model of a GP2 five-helix bundle. As a rule, the 83 compounds chosen for experimental testing showed good overlap with the reference (Fig. 2A). However, those selected based on favorable footprint similarity (FPS) have somewhat better overlap than those selected based on DCE or TS (Fig. 2B).
Consistent with visual inspection (Fig. 2), molecules in each of the five groups share similar size and flexibility, with a mean molecular weight (MW) of 467.3 g/mol and a mean number of rotatable bonds of 9.5 (Table 1). Compounds purchased based on similarity in electrostatic (ES) interaction profiles (FPS ES) had the smallest overall MW (414.0 g/mol) and the fewest rotatable bonds (8.3), while those selected from the TS list were largest (492.3 g/mol) (Table 1). As expected (45,55), compounds selected using a specific scoring function (Table 1, scoring function column) generally showed the best average score with regard to that specific chemical or physical property (Table 1, Property columns). For example, compounds prioritized using the DCE SUM function yielded a more favorable (lower) average DCE SUM energy (−65 kcal/mol) than those obtained using other functions (−49 to −59 kcal/mol). Likewise, molecules selected using FPS SUM resulted in a more favorable average FPS SUM score (5.5) than the other groups (7.8 to 19.1). For compounds prioritized using FPS ES and FPS VDW footprint components, the scores were the lowest (1.6) and second lowest (3.9), respectively, among their respective FPS ES and FPS VDW groups.
For the DCE SUM-selected group, the favorable scores can be attributed to strong ES interactions resulting in an average DCE ES score of −15.5 kcal/mol, over 2-fold greater in magnitude than the ensemble average (−6.9 kcal/mol). The overall strength of the DCE SUM scores, in conjunction with being the second smallest group in terms of MW and number of rotatable bonds (9.2), suggests that the DCE SUM list compounds are highly polar. In contrast, the TS list interactions are dominated by strong VDW interactions due to their larger size (MW = 492 g/mol) (Table 1). Consistent with the fact that FPS SUM is a part of the TS scoring function, the FPS score components are better than those observed using DCE SUM. However, the overlap is relatively moderate (FPS SUM = 10.5, FPS VDW = 6.8, FPS ES = 3.7); therefore, future work could explore increasing the contribution of the FPS component of TS. In summary, molecular property analysis confirms that the 83 purchased candidates are similar in size and flexibility but diverse in terms of interaction energy and overlap with the reference peptide.
Nine molecules from the initial in silico screen inhibit EBOV-pseudotyped virus entry in vitro. The 83 compounds identified from the aforementioned in silico screen were tested for their ability to inhibit EBOV entry and for cytotoxicity at 25 μM (6,30,45). EBOV (HIV-1/EBOV)-pseudotyped virus entry into 293T cells was quantified by luciferase signal normalized by cytotoxicity and dimethyl sulfoxide (DMSO) control to yield the infectivity signal per cell as a fraction of the maximum (see Materials and Methods). Encouragingly, nine compounds resulted in a normalized luciferase signal of ≤0.25 (Fig. 3, blue). Additionally, the observed luciferase signal for the nine compounds was approximately 1.5 standard deviations below the average infectivity signal for all 83 purchased molecules, 0.76 ± 0.40. Although the two compounds with the most activity (I01 and I49) were also the most cytotoxic (Fig. 3, lower, blue), all nine hits with activity were retained and used as starting points for identification of structurally related analogs in a secondary computational screen (see Discussion).
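The hit-selection step described above can be sketched in a few lines. This is only an illustration: the exact normalization used in the study is given in its Materials and Methods, so the formula below (luciferase divided by viability, relative to the DMSO control) is an assumption, and the compound names and numbers are hypothetical.

```python
def normalized_infectivity(luciferase, viability, dmso_luciferase, dmso_viability):
    """Luciferase signal per viable cell, as a fraction of the DMSO control.

    Assumed form of the normalization; not taken from the study's methods.
    """
    return (luciferase / viability) / (dmso_luciferase / dmso_viability)


def select_hits(signals, cutoff=0.25):
    """Return names of compounds whose normalized signal is at or below the cutoff."""
    return sorted(name for name, signal in signals.items() if signal <= cutoff)


# Hypothetical normalized signals for four made-up compounds.
signals = {"cmpd_A": 0.10, "cmpd_B": 0.80, "cmpd_C": 0.25, "cmpd_D": 1.20}
hits = select_hits(signals)
half_signal = normalized_infectivity(50.0, 0.5, 200.0, 1.0)
```

With the ≤0.25 cutoff quoted in the text, only the two low-signal compounds in this toy set would be carried forward.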
Secondary similarity screen. To identify additional compounds with enhanced activity, a second similarity-based computational screen was conducted to explore the chemical search space around the nine initial hits. Each of the hits in turn was used to rescore and rerank the top 100,000 docked molecules from the initial screen to identify compounds with similar functionality and three-dimensional (3D) shape using the DOCK Hungarian similarity (HMS) scoring function (56). The 500 top-scoring molecules from the nine unique lists were further interrogated using five additional functional methods to assess energy score (DCE) and similarity to the initial hit (footprint [FPS], pharmacophore [FMS], volume overlap [VOS], and Tanimoto). Figure 4 compares docked geometries for four of the initial hits (gray) overlaid with two representative compounds each (orange) from the secondary screen. In these examples, with the exception of I49, the compounds generally showed strong overlap and made residue-based interaction patterns similar to those of their respective references (Fig. 4), corresponding to a high average VOS score of ~0.7 and a low average FPS score of ~5.6. Despite the overall similarity of ligand scaffolds within each group, the use of different DOCK functions generally resulted in the selection of chemically diverse molecules at the atomic level. In some cases, however, the same ligand was the top-ranked candidate across the different groups. For example, rank ordering by pharmacophore or volume overlap yielded the same top-scored results for I01 (FMS = 1.56, VOS = 0.82), which suggests high structure and functional similarity with the initial hit (Fig. 4, FMS and VOS). Overall, the secondary virtual screen resulted in the selection of 82 additional candidates, which were subsequently evaluated for inhibition and cytotoxicity at 25 μM against EBOV-pseudotyped virus. A luciferase signal of ≤0.25, which was more than 1 standard deviation below the population mean luciferase signal of 0.54 ± 0.30, was used to identify 16 additional hits with moderate to low cytotoxicity (Fig. 3, green, S prefix).
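Of the similarity measures listed for the secondary screen, the Tanimoto coefficient is the simplest to state: the number of shared features divided by the number of features present in either molecule. The sketch below assumes each molecule is represented as a set of fingerprint bit indices; the fingerprints shown are hypothetical and are not the ones used in the study.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as sets of 'on' bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


# Hypothetical fingerprints for an initial hit and a candidate analog.
hit_bits = {1, 2, 3}
analog_bits = {2, 3, 4}
similarity = tanimoto(hit_bits, analog_bits)
```

A value of 1.0 means identical fingerprints and 0.0 means no shared features, so analogs would be ranked from high to low.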
Dose-response characterization of candidates against HIV/EBOV-GP-pseudotyped virus. To further explore the 25 most promising candidates identified from the two in silico screens (9 initial plus 16 secondary), in terms of reducing infectivity and their effects on cell viability, the dose-dependent activity for each was measured. Of the 25 tested from Fig. 3, 11 compounds exhibited generally well-behaved entry inhibition compared to that of the known control inhibitor, E64, seemingly independent of cytotoxicity, especially at the observed 50% inhibitory concentration (IC50) values, as shown in Fig. 5. The structures of the 11 compounds, with code names, are shown in Fig. 6.
Encouragingly, of the 11 molecules, 7 exhibited IC50 values under 10 μM, comparable to the results observed for the control inhibitor E64 (IC50 = 5.70 ± 5.67 μM) under the same conditions (Fig. 5 and Table 2). Specifically, the IC50 values for I01, I49, and S31 were less than 5 μM, and the IC50 values for S03, S33, S36, and S49 were less than 10 μM (Fig. 5 and Table 2). An accurate cytotoxic concentration that results in 50% cell death (CC50) could be obtained for 9 of the 11 compounds. For S42, S58, and E64, the computed CC50 values had large standard deviations, although examination of the cytotoxicity curves suggests minimal impact on cell viability. The two most potent molecules in this assay, I01 and I49, displayed CC50 values of approximately 11 to 15 μM (Table 2). All other hits had observed CC50 values of 29 μM or greater. Selectivity index (SI = CC50/IC50) values were also calculated. The higher the SI ratio, the more potent and the safer the compound is projected to be in vivo. Examination of the data showed a range of SI values from 3 to 14 for pseudotyped virus (Table 2). Of the compounds with computable SI, the two hits with the greatest SI were S03 and S49, which have SI values around 13 (Table 2).
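The selectivity index defined above is a simple ratio, shown here as a short sketch. The concentrations in the example are hypothetical and are not values from Table 2.

```python
def selectivity_index(cc50_um, ic50_um):
    """SI = CC50 / IC50, with both concentrations in the same units (here, micromolar)."""
    if ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    return cc50_um / ic50_um


# Hypothetical compound: CC50 of 40 uM and IC50 of 4 uM.
si_example = selectivity_index(40.0, 4.0)
```

A compound that is toxic at ten times its inhibitory concentration has SI = 10; a larger ratio means a wider window between antiviral activity and cytotoxicity.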

Candidate compounds show improved or comparable inhibition of EBOV trVLPs.
To test the effects of the inhibitors in an EBOV system that utilizes virus particles of a size and shape similar to that of native EBOV (57), the 11 compounds were assessed for inhibitory effect against the EBOV trVLP system at various concentrations (Fig. 7). Notably, 8 of the 11 yielded IC50s under 5 μM (Table 2). Of particular interest, comparative linear regression analysis between the IC50s observed for each candidate against EBOV-pseudotyped virus and trVLPs yielded an r² value of 0.55 (n = 12), which increased to 0.98 (n = 10) with the removal of the outliers I53 and S58 (Table 2). As expected, based on the good correspondence between the two dose-response assays, I01 remained the most potent compound, with an IC50 of 1.10 ± 0.99 μM (Fig. 7 and Table 2). Furthermore, S29, which exhibited an IC50 of approximately 26 μM in the pseudotyped experiment, was the only compound found to have an IC50 greater than 20 μM. The range of SI values from the trVLP experiments was between 2.9 and 25.8, with five candidates yielding selectivity indices greater than that of E64 (Table 2). Of the aforementioned five inhibitors, the two compounds with the largest SI values are S03 (14.3) and S58 (25.8) (Table 2). In summary, the results indicate good reproducibility between pseudotyped virus and trVLP assays, affirming the observed activity of the tested hits.
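The comparison between the two assays rests on the coefficient of determination of a simple linear regression, which equals the squared Pearson correlation. The sketch below shows how a single outlier can depress that value; the IC50 pairs are hypothetical and do not reproduce Table 2.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical IC50 pairs (pseudotyped virus, trVLP); the last pair is an outlier.
pseudo = [1.0, 2.0, 3.0, 4.0, 5.0]
trvlp = [2.0, 4.0, 6.0, 8.0, 0.0]
r2_all = r_squared(pseudo, trvlp)
r2_trimmed = r_squared(pseudo[:-1], trvlp[:-1])
```

Removing outliers should be justified independently of the fit, since dropping points always has the potential to inflate the correlation.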
Specificity of candidates for EBOV-GP. The 11 hits were also examined using computational and experimental methods to ascertain if the observed activity involved nonspecific effects as a result of colloidal aggregation, pan-assay interference compound (PAINS) liabilities (58-60), or promiscuity. As an initial step to assess whether activity was a result of colloids, the compounds were screened for structural similarity to known aggregators using Aggregation Advisor (http://advisor.bkslab.org) (61). Eight candidates exhibited no known similarity to compounds in the current database. The three remaining compounds (I01, S29, and S31) were found to have 75%, 70%, and 77% structural similarity to a known aggregator. As described by Irwin and Shoichet (60), the addition of detergent should lead to a decrease in activity if a compound inhibits exclusively due to colloidal aggregation. Thus, activity was also tested in the presence of 0.025% Tween 80 (Table 3). Compounds here were defined as not sensitive to detergent if their IC50 values with and without detergent were similar, if their IC50 ranges with and without detergent overlapped, or if their activity increased. Based on these criteria, none of the hits appeared to be sensitive, although S49 was classified as ambiguous due to the absence of a computable error associated with the IC50 value. The 11 compounds were also subjected to an evaluation for PAINS alerts using 3 distinct computational filters (CBLigand [62], FAFdrugs3 [63], and SwissADME [64]). I49 was the only compound with a PAINS warning, which occurred for all three programs due to the possibility of Mannich reaction (64). Despite this warning, we opted to retain compound I49 at this early stage given the fact that multiple FDA-approved drugs elicit PAINS alerts (60). Finally, PubChem (65) was searched to assess if any of the compounds were previously reported as being active against multiple targets (i.e., whether or not they were promiscuous inhibitors). Results were only available for I01, which had been tested in 708 independent studies. In these prior works, I01 was reported as active in 14 studies to different targets, as an inconclusive inhibitor in 11 experiments, and as a nonspecific inhibitor of steroidogenic acute regulatory protein (BioAssay AID 651611; https://pubchem.ncbi.nlm.nih.gov/bioassay/651611) (66). Due to its apparent promiscuity, I01 was not considered further.
FIG 7 Dose-response infectivity of EBOV trVLPs with the treatment of 11 hits. The 11 most promising candidates identified from assays using pseudotyped virus were retested against EBOV trVLPs. Molecules from the initial and secondary screens are labeled with the prefixes I and S, respectively. Dose-response curves (black) and cytotoxicity results (red) were generated from replicate experiments (n ≥ 3). IC50s are displayed above each graph with the number of biological replicates performed to calculate the viral entry results.
A counterscreen using VSV (HIV-1/VSV-G) was performed to experimentally determine the specificity of the prioritized set of 10 compounds. In a procedure similar to that of the EBOV-pseudotyped virus screen (Fig. 3), cells were treated with DMSO, the EBOV inhibitor E64, the nonspecific endosome acidification inhibitor bafilomycin A1 (67), or the candidates (Fig. 8). Compound I49 was tested at 10 μM due to its low CC50 (Table 2), while the other 9 candidates were tested at 25 μM. Notably, all compounds showed less inhibitory activity against the VSV-G screen (Fig. 8) than the initial EBOV-GP screen (luciferase signal, ≤0.25) (Fig. 3). The four compounds with the least average inhibitory activity against VSV-G and, therefore, likely higher specificity for EBOV were I49, S29, S31, and S58 (Fig. 8). These hits showed minimal effects on cell viability. Based on the aforementioned analysis, although other compounds shown in Fig. 8 would also be promising to explore, at this stage only I49, S29, S31, and S58 were selected for further characterization.
Candidate compounds exhibited maximal inhibition postattachment and before membrane fusion. To explore the stage in the EBOV entry cascade at which the candidates act, time-of-addition (TOA) experiments (6,30,31,68) were performed (Fig. 9) for the four compounds showing the most specificity, as suggested by the averaged activity results depicted in Fig. 8. In this TOA assay, 293T cells were treated with the four candidates and the cathepsin inhibitor E64d at various time points postinfection. Compounds were tested at the concentration required to reach maximum inhibition without a significant effect on cell viability as described by the dose-response curves against pseudotyped virus (Fig. 5 and Table 2). Importantly, the four candidate molecules exhibited an activity trend similar to that of the known control E64d, where maximum inhibition occurred up until the 80-min time point and then began to decrease (Fig. 9). The fact that the compounds track with E64d suggests they act after pinocytosis, after cleavage to the NPC1 binding form, but prior to the fusion step, as expected for molecules targeted to disrupt the interaction between the CHR and NHR necessary for 6HB formation.
I49 and S31 exhibit reproducible pose stability in MD simulations. Experimental characterization through concentration-dependent analysis, counterscreening, and TOA experiments suggested that I49, S29, S31, and S58 were the most potent, specific inhibitors of the premembrane fusion stage of EBOV entry identified from virtual screening. To more fully explore the energetic and geometric compatibility of these inhibitors with GP2 at the proposed pocket, all-atom MD simulations of the DOCK-predicted poses were executed. As previously described (45,55,69), six replica 20-ns simulations for each candidate-GP2 complex were performed in explicit solvent, where each replica employed a different random seed. Ligand movement was quantified using RMSDs (root mean squared deviations) that accounted for translation, rotation, and differences in internal geometry relative to the initial predicted pose.
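An RMSD that counts translation and rotation as well as internal geometry is one computed without superimposing the ligand onto its reference pose. A minimal sketch follows, under the assumption that each trajectory frame has already been aligned on the receptor so that ligand coordinates share a common frame; the coordinates are hypothetical.

```python
import math


def ligand_rmsd_no_fit(reference, pose):
    """RMSD between matched atom coordinates without fitting the ligand.

    Because no superposition is applied, rigid-body translation and rotation
    of the ligand contribute to the value alongside internal changes.
    """
    if len(reference) != len(pose) or not reference:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for ref_atom, pose_atom in zip(reference, pose)
        for a, b in zip(ref_atom, pose_atom)
    )
    return math.sqrt(squared / len(reference))


# Hypothetical two-atom ligand translated by 2 angstroms along y.
docked_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
md_frame = [(0.0, 2.0, 0.0), (1.0, 2.0, 0.0)]
rmsd = ligand_rmsd_no_fit(docked_pose, md_frame)
```

A fitted RMSD for the same pair would be zero, since the internal geometry is unchanged; the no-fit value reports the 2 Å displacement within the pocket.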
Analysis of the trajectories showed that of the four compounds simulated, I49 and S31 maintained their DOCK-predicted poses more closely across all six simulations, as observed by the reproducible average RMSDs of 2.65 ± 0.77 Å and 2.75 ± 0.25 Å, respectively (Fig. 10). Since the average RMSDs of I49 and S31 were less than or equal to 2.75 Å, which is close to the typical benchmark (2.0 Å) commonly used in redocking validation tests (42), additional characterization for these two compounds was performed as described further below. In contrast, S29 and S58 adopted a wider variety of ligand poses during MD simulations, resulting in a larger range of RMSDs (Fig. 10). Visual inspection showed S29 adopted two overall geometries during its MD simulations, one closer to the original DOCK pose, which contributed to its bimodal RMSD histogram (Fig. 10). In general, compound S58 showed a much larger overall spread in RMSDs (mean of >5.5 Å) as a result of larger changes in internal geometry and/or movement in the pocket.
Footprint interaction analysis. As a step toward understanding the hypothesized mechanism of action inhibiting six-helix bundle formation, the interactions of I49 and S31 with GP2 were characterized. To determine which residues had the greatest contribution to the ligand-receptor interactions across both hits, footprint interaction profiles were generated for each compound from the energies obtained over the MD trajectories (Fig. 11). Overall, the footprints showed striking similarity to the reference, especially in terms of the VDW profile (Fig. 11), suggesting good molecular mimicry of the CHR region. Moreover, I49 and S31 maintained strong contacts to a similar degree with the same residues, consistent with their overlap in the binding site and structural similarity. The residues with the most favorable interactions across the two candidates, which resulted in combined average energies greater than −2.  (Fig. 11).
Regarding the ES energies, the reference profile contains two ES peaks corresponding to E564.A and Q567.A; however, E564.A was the only consensus residue with a combined average energy (−5.18 ± 2.55 kcal/mol) of less than −2.5 kcal/mol (Fig. 11). Notably, S31 also had a considerable interaction with Q567.A (−0.85 ± 0.64 kcal/mol) (Fig. 11). Further inspection of the individual footprint profiles of I49 and S31 showed that S31 interacted slightly more favorably with the EBOV five-helix bundle than I49 across multiple residues in addition to Q567.A. For instance, S31 had stronger predicted interactions with E564.A in both the VDW (−6.05 ± 1.12 kcal/mol) and ES (−6.58 ± 1.28 kcal/mol) plots than I49 (VDW, −4.62 ± 1.17 kcal/mol; ES, −3.78 ± 2.72 kcal/mol). Although simulation of S31 resulted in slightly greater energies over 6 of the 8 key residues (Fig. 11), the energies of the candidates are within one standard deviation from the means and therefore are not significantly different, highlighting E564.A, A568.A, L571.A, F572.A, T566.C, L569.C, and L573.C as the key GP2 residues that interact with the reference ligand, I49, and S31.
Of the corresponding residues, notable favorable VDW interactions were visualized at F572.A and T566.C. Specifically, F572.A was involved in strong nonspecific VDW interactions with the 4-methyloxy,6-carboxylphenyl substituent of I49 and the phenyl substituent of S31 (Fig. 12). Additionally, although both hits interact with T566.C, I49 was the only compound to exhibit a VDW interaction with T566.C throughout approximately 30.03% of the 6 simulations. Regarding ES interactions, the two inhibitors established and maintained strong ES contacts with E564.A across one main substituent throughout the majority of their MD simulations. For instance, the protonated nitrogen of the methylpiperidine substituent of I49 maintained water-mediated hydrogen-bonding interactions (~25%) with the backbone and sidechain of E564.A and direct hydrogen-bonding interactions with the sidechain of E564.A about 32% of the time (Fig. 12). On the other hand, S31 retained water-mediated interaction with E564.A through approximately 28% of the simulations and direct hydrogen-bonding interactions for a total of approximately 57% of the simulations (Fig. 12). In summary, results suggest that I49 and S31 have the potential to establish and retain strong VDW and ES interactions with the predicted GP2 binding site.
Sequence conservation across the key residues. To assess whether the inhibitors have the potential to interact favorably with other Ebolavirus species and related Filoviridae viruses, a comprehensive sequence alignment study was conducted. Specifically, 811 human sample sequences containing the complete GP genome for the five known Ebolavirus species, Zaire, Bundibugyo, Reston, Sudan, and Tai Forest, were selected via the Virus Pathogen Resource (ViPR) database (www.viprbrc.org; NIH). An additional 285 virus sequences were selected using BLAST (70) based on similarity to the core GP2 sequence (PDB entry 2EBO_A) used to conduct the virtual screens. Multiple-sequence alignment was then performed using COBALT (71) to align the above-mentioned 1,096 GP2-containing sequences to the full-genome sequence of GP2 (Zaire ebolavirus strain Mayinga-76; GenBank accession number AHC70246). Ultimately, 581 sequences seen in humans and nonhuman primates were retained with fragmented or complete GP sequences, which were used for sequence comparison analysis (Fig. 13).
Overall, the high sequence conservation among the subset of surveyed genomes for the five Ebolavirus species, for which a representative example is shown in Fig. 13, suggests that I49 and S31 have the potential to interact with the seven key residues in analogous GP2 binding sites (Fig. 13, shaded bars) and thereby inhibit sequence variants of Zaire ebolavirus and different Ebolavirus species. However, experimental testing would be required to characterize the activity of the small molecules against the different viruses.
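One simple way to quantify conservation at a given alignment position is the fraction of sequences that carry the most common residue in that column. The sketch below assumes the sequences have already been aligned to equal length (for example, by a tool such as COBALT) and skips gap characters; the four short sequences are made up and are not GP2 sequences.

```python
from collections import Counter


def column_conservation(aligned_sequences, column):
    """Fraction of non-gap residues in a column that match the most common residue."""
    residues = [seq[column] for seq in aligned_sequences if seq[column] != "-"]
    if not residues:
        return 0.0
    most_common_count = Counter(residues).most_common(1)[0][1]
    return most_common_count / len(residues)


# Hypothetical toy alignment; '-' marks a gap.
alignment = ["ELAF", "ELAF", "ELSF", "E-AF"]
conservation = [column_conservation(alignment, i) for i in range(4)]
```

Applied to the columns corresponding to the seven key residues, values near 1.0 would correspond to the high conservation described in the text.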

DISCUSSION
EBOV particles enter the cell through macropinocytosis (16), where they are later trafficked to the endosome and a conformation change is induced in the viral envelope protein GP2 that leads to membrane fusion (17-20). During this conformational change, the three CHR regions bind to the NHR trimer, forming a six-helix bundle (6HB) and mediating host-virus membrane fusion (25). Due to the current lack of FDA-approved therapeutics available to treat EVD and the key involvement of GP2 in virus entry, this study focused on the identification of small-molecule leads to inhibit the formation of the 6HB necessary for virus entry by targeting GP2 at the interface where the CHR interacts with the NHR. It is important, however, to note that our GP2 docking model is only an approximation of the EBOV prehairpin and, thus, is not likely to reflect all of the subtleties inherent in the actual biological system. Nevertheless, as the approach was successfully used by our group in prior work (45,46,73) and led to the identification of entry inhibitors targeting HIV gp41, we believe that adapting the methods to target Ebola is a reasonable strategy.
In this work, an initial virtual screen followed by a second similarity screen was performed to prioritize molecules with energetically favorable interactions with the GP2 NHR pocket. This led to a total of 165 compounds for experimental testing, of which 25 appeared promising in an EBOV-pseudotyped virus entry assay. Subsequent dose-response analyses narrowed down the group to 11 inhibitors with low to moderate cytotoxicity. To further validate activity, the hits were tested against EBOV trVLPs, which are more similar in shape and size to the native virus. The trVLP results correspond well with those obtained using pseudotyped virus, affirming the hits are promising EBOV inhibitors. To probe specificity, the hits were also tested using VSV-G-pseudotyped virus-like particles (Fig. 8). At this stage, four compounds (I49, S29, S31, and S58) were prioritized for additional analysis given their strong inhibition, low cytotoxicity, and apparent specificity for EBOV.
In the time-of-addition assay, the control curve for E64d showed the maximal level of inhibitory activity occurring between time 0 and up to the 80-min time point at which inhibition starts to decrease (Fig. 9). This is consistent with other studies (68) that have a lag in the EBOV entry pathway compared to that of influenza virus due to trafficking to the late endosome/lysosome. The timing of loss of inhibition of E64d and the experimental compounds, as reported in Mingo et al. (68), occurred with full restoration of infectivity by the 3-h time point. In contrast, full infectivity was not restored in our system until approximately 6 h. This could be due to differences in VLPs, cell types, or the readout assay. Importantly, all four hits (I49, S29, S31, and S58) exhibited a time-of-addition trend similar to that of E64d, suggesting that they are acting late in entry at a step that is after NPC1 binding.
Although the hits are hypothesized to prevent the collapse of the metastable intermediate into the stable 6HB, it is possible they interact with an earlier GP2/GP1 prefusion conformation. They could also disrupt interactions with other partner proteins, the lipid bilayer, or bilayer components or disrupt the putative E64d-sensitive cleavage step (68). Additional mechanistic investigation, such as site-directed mutagenesis and structural studies, will be required to confirm our hypothesis that the hits are inhibiting 6HB formation. Further study of the candidate compounds and future work on analogs and additional target sites could uncover important details about the fusion trigger. As an initial step, to help validate that the inhibitors prevent 6HB formation, the steric and energetic compatibility of the hits were explored via MD simulations. For the two hits with the most reproducible ligand poses (lower RMSDs), the MD analysis identified seven key GP2 residues (E564.A, A568.A, L571.A, F572.A, T566.C, L569.C, and L573.C) engaged in significant favorable protein-ligand interactions (Fig. 11). Notably, these residues are highly conserved (Fig. 13) across different Ebolavirus species, suggesting the hits have the ability to inhibit different types of EVD-causing viruses.
The compounds identified in this work have efficacy similar to that of other reported inhibitors of virus entry. Specifically, we identified 7 compounds with IC50 values of less than 10 μM, and three of the hits had IC50 values of less than 5 μM. Previously reported inhibitors include ZMapp, which is a combination of three antibodies, two of which appear to prevent conformational changes in the NPC-1-primed GP that are necessary for progression to late-stage entry (74). The estimated IC50 value for ZMapp is 5 to 10 μM (estimated from literature values reported by Holtsberg et al. [75] of 0.75 to 1.5 μg/ml). Other examples include C-peptide inhibitors (76) designed on the concept of the successful HIV peptides T20 (enfuvirtide) and C34, which prevent 6HB collapse (77). In contrast to HIV C-peptides, EBOV C-peptides showed weak or insignificant antiviral activity due to their inability to access the endosomal compartment (76). However, inhibition was significantly improved when researchers added the HIV Tat protein transduction domain (PTD), for which the resulting Ebo-Tat hybrid showed 99% inhibition at 75 μM (76). Other peptide-based inhibitors include prehairpin intermediate mimics reported by Clinton et al. (28), which showed mid-nanomolar inhibition in a pseudotype assay, and a series of cyclopeptides (78) with IC50 values ranging from 3.2 to 5.9 μM.
In terms of small molecules, Basu et al. (6) reported a benzodiazepine derivative hypothesized to bind in a pocket observed in a prefusion conformation of GP1/GP2 that inhibited entry with an IC50 of 12.1 μM. Another study identified that the G protein-coupled receptor (GPCR) antagonist benztropine inhibited EBOV with an IC50 of 3.7 μM. Subsequent crystallographic studies by Stuart and coworkers (32,35) showed that benztropine and other compounds, including bepridil, paroxetine, sertraline, toremifene, and, interestingly, ibuprofen, bound to the GP1/GP2 site and are thought to destabilize the protein complex (35). In contrast, the present compounds are hypothesized to stabilize a GP2 fusion intermediate, which prevents conformational changes required for formation of the 6HB. Notably, an investigation of drug synergy reported by Dyall et al. (79) using FDA-approved drugs showed that the majority of pairs identified as synergistic inhibitors of Ebola virus included an entry inhibitor. This suggests it is worthwhile to determine if there is synergy between the entry inhibitors identified in this work and other compounds.
In summary, this study has demonstrated the utility of computer-aided modeling, in conjunction with experimental testing, to identify four compounds (I49, S29, S31, and S58) that appear to be specific inhibitors of EBOV entry. We targeted a previously unexploited site on EBOV GP2 in a conformation representative of a prehairpin intermediate and utilized protein mimicry to select for small-molecule GP2 mimics. The identified inhibitors, hypothesized to prevent formation of the critical 6HB, serve as proof of principle for this technique and as a starting point for further GP2-targeted studies.

MATERIALS AND METHODS
Computational methods. In this work, several computational methods, arranged into five distinct protocols, were employed to target GP2: (i) GP2 binding site and reference ligand designation through hot-spot identification, (ii) receptor and reference preparation, (iii) DOCK receptor setup, (iv) DOCK virtual screening protocols and compound prioritization, and (v) MD simulations. The work employed several software packages, including antechamber, tleap, cpptraj (80), sander, and pmemd from the AMBER suite of programs (University of California San Francisco) and dms, grid (81), and sphgen (82), which are part of the DOCK suite of programs (University of California San Francisco).
[...] divided into 42 chunks of at most 50,000 molecules. Compounds were flexibly (FLX) (42) docked to the GP2 five-helix bundle in parallel, using the MPI version of DOCK6.6 (University of California San Francisco). For each docked compound, the best-scoring pose was retained and then energy minimized using the standard DOCK Cartesian energy (DCE) function to further fine-tune the interactions between the receptor and candidate ligands and to permit footprint similarity (FPS) scoring (42,84), in which the similarity in van der Waals (VDW) and electrostatic (ES) interaction profiles between the reference and screened molecules was quantified using Euclidean distance.
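The footprint comparison described above can be illustrated with a minimal Python sketch. It assumes each footprint is simply a list of per-residue interaction energies in the same residue order; the function name and the energy values are hypothetical, and DOCK6 itself offers additional options that are not reproduced here.

```python
import math


def footprint_distance(reference, candidate):
    """Euclidean distance between two per-residue energy footprints.

    Both arguments are sequences of interaction energies (kcal/mol),
    one value per receptor residue, in the same residue order.
    A smaller distance means a more reference-like interaction profile.
    """
    if len(reference) != len(candidate):
        raise ValueError("footprints must cover the same residues")
    return math.sqrt(sum((r - c) ** 2 for r, c in zip(reference, candidate)))


# Hypothetical per-residue energies for four receptor residues.
ref_vdw = [-3.0, -1.5, -0.5, 0.0]
cand_vdw = [-2.0, -1.5, -0.5, 0.0]
ref_es = [-1.0, 0.0, -2.0, 0.0]
cand_es = [-1.0, 0.0, 1.0, 4.0]

fps_vdw = footprint_distance(ref_vdw, cand_vdw)  # VDW footprint distance
fps_es = footprint_distance(ref_es, cand_es)     # ES footprint distance
fps_sum = fps_vdw + fps_es                       # combined footprint score
print(fps_vdw, fps_es, fps_sum)
```

In this toy case the candidate matches the reference closely in its VDW profile but poorly in its ES profile, which is the kind of distinction the separate VDW and ES footprint scores are meant to expose.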
Key descriptors were computed with the program MOE for the 100,000 top-scoring molecules based on DCE score, including the number of Lipinski violations, number of chiral centers, and logP, to aid in compound prioritization. The MOE MACCS clustering method was concurrently employed, using a best-first approach, to group compounds into structurally related families, with the best DCE-scored compound in each family serving as the clusterhead. To promote diversity in compound selection, the top-scored clusterheads were rank ordered using five distinct scoring criteria: (i) the sum of the van der Waals and electrostatic DOCK Cartesian energy scores (DCE_SUM), (ii) the van der Waals FPS score (FPS_VDW), (iii) the electrostatic FPS score (FPS_ES), (iv) the sum of the FPS_VDW and FPS_ES scores (FPS_SUM), and (v) the combined DCE_SUM and FPS_SUM scores (total score, or TS) (45). Following 3D visual inspection of the top-scoring members from each of the five lists, 83 compounds, referred to with the prefix I (initial screen), were purchased for experimental testing. A second set of 82 ligands, referred to with the prefix S (secondary screen), was purchased based on similarity comparisons to hits identified in the initial screen. Similarity was computed using the following DOCK6 scoring functions: Hungarian similarity (56), footprint similarity (83), pharmacophore similarity (93), and volume overlap. For both screens, additional ligand properties considered included central location in the pocket, number of chiral centers (less than 2), formal charge between −1 and +1, favorable overall score with respect to the particular rank-order method, and favorable electrostatic score.
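As a rough illustration of the five rank-ordered lists, the sketch below sorts a handful of hypothetical clusterheads by each criterion and pools the top entries. The compound names, scores, and dictionary keys are invented for the example, lower values are treated as better for every criterion, and TS is assumed here to be the plain sum of DCE_SUM and FPS_SUM; the actual selection also involved 3D visual inspection, which code cannot capture.

```python
def rank_lists(clusterheads, top_n):
    """Return {criterion: [names of the top_n clusterheads]}, best first.

    Each clusterhead is a dict with hypothetical keys 'name', 'dce_vdw',
    'dce_es', 'fps_vdw', and 'fps_es'. Lower scores rank higher.
    """
    criteria = {
        "DCE_SUM": lambda c: c["dce_vdw"] + c["dce_es"],
        "FPS_VDW": lambda c: c["fps_vdw"],
        "FPS_ES": lambda c: c["fps_es"],
        "FPS_SUM": lambda c: c["fps_vdw"] + c["fps_es"],
        # Assumption: total score is the unweighted sum of DCE_SUM and FPS_SUM.
        "TS": lambda c: c["dce_vdw"] + c["dce_es"] + c["fps_vdw"] + c["fps_es"],
    }
    return {
        name: [c["name"] for c in sorted(clusterheads, key=key)[:top_n]]
        for name, key in criteria.items()
    }


# Hypothetical clusterheads with invented scores.
heads = [
    {"name": "A", "dce_vdw": -40.0, "dce_es": -5.0, "fps_vdw": 6.0, "fps_es": 3.0},
    {"name": "B", "dce_vdw": -30.0, "dce_es": -2.0, "fps_vdw": 2.0, "fps_es": 4.0},
    {"name": "C", "dce_vdw": -35.0, "dce_es": -8.0, "fps_vdw": 4.0, "fps_es": 1.0},
]

lists = rank_lists(heads, top_n=1)
# Pool the top entries of all five lists into one candidate set.
candidates = sorted({name for names in lists.values() for name in names})
print(lists)
print(candidates)
```

Because each criterion favors a different compound in this toy data, pooling the five lists yields a more diverse candidate set than any single ranking would.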
MD simulations and analysis. For the most promising candidates, MD simulations were performed to assess geometric and energetic stability. The AMBER14 accessory programs antechamber and tleap were used to protonate, solvate, assemble, and assign force-field parameters for the protein receptor (ff14SB) (94), solvent (TIP3P) (95), and ligand (GAFF) (91). Ligand partial charges were obtained from those preassigned by the ZINC database (92). The termini of the five-helix bundle were capped, the N terminus with ACE and the C terminus with NME.
As previously described (69), a nine-step protocol was used to equilibrate each solvated ligand-protein complex. All simulations were performed using the CUDA-accelerated version of pmemd (96)(97)(98) in AMBER16. First, the solvent and protein-ligand hydrogens were minimized with a restraint weight of 20.0 kcal mol⁻¹ Å⁻² on all complex heavy atoms for 10,000 cycles. Second, the restraint was lifted and the entire complex was minimized for 5,000 cycles. Third, over 250 ps, the system was heated from 50 to 300 K. Fourth, a short MD simulation of 500 ps, with an all-atoms restraint weight of 20.0 kcal mol⁻¹ Å⁻², was performed to optimize the water box density to 1. Lastly, each complex underwent five equilibration steps, each 200 ps in length, with lessening restraint weights on all protein and ligand heavy atoms. For the protein, the restraint weights were (i) 10 [...]
Visualization of MD trajectories was conducted using VMD (99) and Chimera (88). The AMBER14 accessory program cpptraj (80) and in-house protocols were utilized to extract VDW and ES energies (with distance-dependent dielectric) and compute molecular footprints, RMSDs (root mean squared deviations), and hydrogen-bonding interactions of each compound throughout its MD trajectories (4,000 frames for each simulation). As previously described (45,55), predicted interaction energies from all six replica MD trajectories were used to calculate the mean VDW and ES energies between the small molecule and each residue of the five-helix bundle. Residues with energies of less than −2.5 kcal/mol for the reference ligand and experimentally verified GP2 entry inhibitors were selected as key GP2 residues contributing to the interaction energy. To compute ligand RMSDs, a two-step protocol was executed (73). First, the protein-ligand complex in each frame of the trajectory was aligned using cpptraj so that the protein's alpha carbons overlapped.
Second, atomic-level small-molecule translation and rotation compared to that of the docked pose was quantified. For interpretation, RMSDs were binned based on frequency using cpptraj and plotted using Python (Python Software Foundation). The AMBER accessory program cpptraj was used to extract the direct and water-mediated hydrogen-bonding interactions from each trajectory and provide a frequency, location, and frame.
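The second step of the RMSD protocol and the subsequent binning can be sketched in plain Python as follows. This is a minimal illustration, not the cpptraj workflow itself: it assumes the frames have already been aligned on the protein alpha carbons, that ligand atoms appear in the same order in every frame, and it uses hypothetical coordinates and an arbitrary bin width.

```python
import math
from collections import Counter


def ligand_rmsd(pose, reference):
    """RMSD (no refitting) between ligand coordinates and the docked pose.

    Both arguments are lists of (x, y, z) tuples in matching atom order.
    Because the frames are assumed to be pre-aligned on the protein, the
    value reflects ligand translation and rotation within the pocket.
    """
    if len(pose) != len(reference):
        raise ValueError("atom counts differ")
    squared = sum(
        (a - b) ** 2 for p, r in zip(pose, reference) for a, b in zip(p, r)
    )
    return math.sqrt(squared / len(pose))


def bin_rmsds(rmsds, width):
    """Histogram of RMSD values: {lower bin edge: frame count}."""
    counts = Counter(int(r // width) for r in rmsds)
    return {round(b * width, 6): counts[b] for b in sorted(counts)}


# Hypothetical two-atom ligand: docked pose and three trajectory frames.
docked = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],  # unchanged
    [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],  # shifted 1 Å along x
    [(0.0, 3.0, 4.0), (1.0, 3.0, 4.0)],  # shifted 5 Å
]

rmsds = [ligand_rmsd(frame, docked) for frame in frames]
histogram = bin_rmsds(rmsds, width=2.0)
print(rmsds, histogram)
```

A histogram dominated by low-RMSD bins would indicate that the ligand stays close to its docked pose over the trajectory.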
Experimental methods. The experimental methods to characterize the inhibitory activity of the small molecules identified from in silico screening are described below. Three different assays were employed: (i) pseudotyped HIV-1/EBOV-GP was utilized to assess viral entry, (ii) pseudotyped HIV-1/VSV-G was utilized to assess inhibitor specificity, and (iii) EBOV trVLP was utilized as a second confirmatory assay of viral entry.
EBOV trVLP preparation. A transient-transfection-based transcription- and replication-competent system that models the entire replication cycle at biosafety level 2 was utilized to confirm inhibition. This system is more physiologically relevant than pseudotyped virus due to the native size and shape of the EBOV particles. EBOV trVLPs were prepared as previously described (57,108). Briefly, 293T cells were seeded in 2 ml in a 6-well plate at ~50% confluence. Twenty-four h postseeding, the cells were transfected with the following plasmids per well: 75 ng pCAGGS-VP30, 125 ng pCAGGS-NP, 250 ng pCAGGS-T7, 125 ng pCAGGS-VP35, 1 μg pCAGGS-L, and 250 ng p4cis-vRNA-Rluc, using 5.5 μg PEI transfection reagent. Twenty-four h posttransfection, medium was replaced with 4 ml DMEM-PS-5% FBS. Seventy-two h posttransfection, the supernatant containing the trVLPs was pooled, clarified by low-speed centrifugation, and stored at 4°C.
Screening of in silico-selected compounds in viral entry assays. Viral entry was measured using a luciferase reporter. Testing of selected compounds and controls against all three types of virus particles, EBOV (HIV-1/EBOV-GP) pseudotyped, VSV (HIV-1/VSV-G) pseudotyped, and EBOV trVLP, followed a similar procedure. 293T cells were seeded at 2 × 10^4 cells/well in 96-well tissue culture-treated white-bottom plates (Greiner) that were precoated with 25 μg/ml linear PEI (Sigma). For EBOV trVLP infection, helper ribonucleoprotein (RNP) components must be provided in trans through expression plasmid transfection 24 h postseeding (amounts of helper RNP plasmids per well were 4.16 ng pCAGGS-VP30, 6.94 ng pCAGGS-NP, 6.94 ng pCAGGS-VP35, 55.55 ng pCAGGS-L, and 13.88 ng pCAGGS-Tim1, with 262.41 ng PEI transfection reagent). Twenty-four h postseeding (pseudotyped virus particles) or posttransfection (trVLPs), 293T cells were pretreated with selected compounds or controls for 1 h at 37°C. The medium then was removed and the cells were infected with virus particles that had also been pretreated for 1 h at 37°C. After 2 h the inoculum was removed, the cells were washed briefly with PBS, and fresh medium was added. Plates were incubated for 48 h, and viral entry was measured using the luciferase reporter. The experiment was also performed in the absence of virus to determine the toxicity of the selected compounds and controls. Viral entry and cell viability were measured using ONE-Glo + Tox Luciferase reporter and cell viability assay (Promega) according to the manufacturer's protocol using a SpectraMax M5 plate reader (Molecular Devices). Luciferase signal was normalized to the cell viability and then further normalized to the luciferase signal in the DMSO-treated samples (45). Compounds with infectivity signal per cell as a fraction of the maximum below 0.25 were considered active hits in the initial screening.
Additionally, for the dose-response assays, 50% inhibitory concentration (IC50), 50% cytotoxicity concentration (CC50), and 95% confidence intervals (CI95) were computed, and IC50 was plotted using Prism 7.0c (GraphPad Software, La Jolla, California, USA). CC50s were reported if a standard deviation within 2-fold of the CC50 could be calculated.
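The two-stage normalization and the hit criterion used in the initial screening can be expressed as a short Python sketch. It assumes that the DMSO control is itself normalized to its own viability before serving as the maximum; the function names and raw readings are hypothetical, and only the 0.25 cutoff comes from the text.

```python
HIT_THRESHOLD = 0.25  # cutoff for active hits in the initial screening


def relative_infectivity(luciferase, viability, dmso_luciferase, dmso_viability):
    """Infectivity signal per cell as a fraction of the DMSO control.

    The luciferase signal is first divided by the viability signal from
    the same well, then expressed relative to the DMSO-treated sample
    (assumed here to be normalized to its own viability as well).
    """
    per_cell = luciferase / viability
    dmso_per_cell = dmso_luciferase / dmso_viability
    return per_cell / dmso_per_cell


def is_hit(fraction):
    """True when the normalized infectivity falls below the screening cutoff."""
    return fraction < HIT_THRESHOLD


# Hypothetical raw plate-reader values (arbitrary units).
strong = relative_infectivity(100.0, 0.5, 1000.0, 1.0)  # strongly reduced entry
weak = relative_infectivity(600.0, 1.0, 1000.0, 1.0)    # modestly reduced entry
print(strong, is_hit(strong))
print(weak, is_hit(weak))
```

Dividing by viability first keeps a cytotoxic compound from appearing to be an entry inhibitor merely because fewer cells remained to produce signal.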
As previously described (6), selected controls were dissolved in DMSO. Cathepsin inhibitor E64 (Millipore) is a cysteine protease inhibitor that prevents cleavage events that are necessary specifically for EBOV fusion with the endosomal membrane. It is used as a positive control for inhibition in HIV/EBOV-GP and EBOV trVLP assays and as a negative control in VSV-G assays, as it does not inhibit VSV-G fusion. E64d has the same action as E64 but is cell permeable. Bafilomycin A1 (Calbiochem) is a vacuolar ATPase inhibitor that prevents both EBOV and VSV entry by alkalinizing the endosome and is used as a positive control for inhibition in both assays.
Cells were infected with either EBOV- or VSV-G-pseudotyped virus at a multiplicity of infection (MOI) of 0.1 or with 50 μl of EBOV trVLPs. Where indicated, 0.025% Tween 80 (Sigma) was also added to the assay to test for colloidal aggregation.
Time-of-addition assay. 293T cells were seeded at 2 × 10^4 cells/well in PEI-precoated 96-well tissue culture-treated white-bottom plates. The next day, EBOV-pseudotyped virus was added to the cells at an MOI of 0.1. The plates were centrifuged for 1 h at 4°C at 1,000 × g to allow the virus to attach to the cells and to synchronize the infection. The plates were washed with PBS to remove unbound virus. The plates were then moved to 37°C to allow for viral entry (0 h). Small molecules I49 (10 μM), S31 (10 μM), S29 (50 μM), and S58 (50 μM) and the E64d control (10 μM; Millipore) were added to the plates at various time points as indicated. Cell viability and viral entry were measured and analyzed 48 h postinfection as described above.