Ega3 from the fungal pathogen Aspergillus fumigatus is an endo-α-1,4-galactosaminidase that disrupts microbial biofilms

Aspergillus fumigatus is an opportunistic fungal pathogen that causes both chronic and acute invasive infections. Galactosaminogalactan (GAG) is an integral component of the A. fumigatus biofilm matrix and a key virulence factor. GAG is a heterogeneous linear α-1,4–linked exopolysaccharide of galactose and GalNAc that is partially deacetylated after secretion. A cluster of five co-expressed genes has been linked to GAG biosynthesis and modification. One gene in this cluster, ega3, is annotated as encoding a putative α-1,4-galactosaminidase belonging to glycoside hydrolase family 114 (GH114). Herein, we show that recombinant Ega3 is an active glycoside hydrolase that disrupts GAG-dependent A. fumigatus and Pel polysaccharide-dependent Pseudomonas aeruginosa biofilms at nanomolar concentrations. Using MS and functional assays, we demonstrate that Ega3 is an endo-acting α-1,4-galactosaminidase whose activity depends on the conserved acidic residues, Asp-189 and Glu-247. X-ray crystallographic structural analysis of the apo Ega3 and an Ega3-galactosamine complex, at 1.76 and 2.09 Å resolutions, revealed a modified (β/α)8-fold with a deep electronegative cleft, which upon ligand binding is capped to form a tunnel. Our structural analysis coupled with in silico docking studies also uncovered the molecular determinants for galactosamine specificity and substrate binding at the −2 to +1 binding subsites. The findings in this study increase the structural and mechanistic understanding of the GH114 family, which has >600 members encoded by plant and opportunistic human pathogens, as well as in industrially used bacteria and fungi.

GAG is a linear heteropolymer of α-1,4-linked galactose (Gal) and partially deacetylated N-acetylgalactosamine (GalNAc) (11). Analysis of A. fumigatus GAG found a random distribution of the monosaccharide constituents, with the percentage of galactose within each chain ranging between 15 and 60% (11). The ratio of GalNAc/Gal varies among Aspergillus species and is higher in A. fumigatus than in less-virulent Aspergillus spp. (8). Increasing the GalNAc content of GAG in the relatively nonpathogenic Aspergillus nidulans increased the virulence of this species in an immunosuppressed mouse model, highlighting the importance of GalNAc content in GAG function (8). The location and percentage of deacetylation have not been determined to date, but this modification is required for biofilm formation and GAG adherence (8). Production of exopolysaccharides containing α-1,4-linked GalNAc and galactosamine (GalN) has also been confirmed in non-Aspergillus spp., including Neurospora crassa, Penicillium frequentans, Paecilomyces sp., and Trichosporon asahii (6,12-15). These GAG-like polymers were found to be involved in adherence to surfaces or flocculation depending on the species, suggesting that GAG-like galactosamine-containing polymers are utilized by a variety of fungi and may be of importance in agriculture and food industries, as well as human health (12,14,15).
A comparative transcriptomic analysis of A. fumigatus regulatory mutants deficient in GAG production identified a cluster of genes on chromosome 3 linked to biofilm formation and GAG synthesis (6). This cluster encodes five putative carbohydrate-active enzymes. A model of the GAG biosynthetic system mediated by the products of these genes has been proposed (6), and to date, three of the genes, uge3, agd3, and sph3, have been experimentally linked to GAG production and virulence (6,16,17). Uge3 is a bifunctional cytoplasmic uridine diphosphate (UDP)-glucose-4-epimerase that mediates the production of UDP-GalNAc and UDP-Gal (7,17); Agd3 is a secreted protein required for the partial deacetylation of newly synthesized GAG polymer (6); and Sph3 is a GH135 member with endo-α-1,4-N-acetylgalactosaminidase activity that is required for the production of GAG (16,18). The other two genes in the cluster, gtb3 and ega3, are predicted to encode an integral membrane glycosyltransferase and a second glycoside hydrolase (GH), respectively (6,16). Up-regulation of ega3 expression has been reported during biofilm formation (19), suggesting that Ega3 may play a role in both GAG biosynthesis and biofilm formation.
Bioinformatics analysis predicts that Ega3 has an N-terminal transmembrane domain followed by an extracellular GH domain belonging to GH family 114. GH families are created based on sequence identity (20). Substrate specificity can vary within a family, but the identity of the catalytic residues and the mechanism are generally shared among family members (20-22). There are presently 616 members of GH114, most of which are of bacterial origin (http://www.cazy.org, June 11, 2019) (92). A single GH114 protein from Pseudomonas sp. 881 (GH114Ps, GenBank accession no. D14846.1) is the only member of this family that has been functionally characterized to date (23-25). GH114Ps is specific for poly-α-1,4-GalN with no activity on poly-GalNAc substrates (24). The enzyme exhibits endo-galactosaminidase activity, releasing galactosamine disaccharides and trisaccharides from the nonreducing end of galactosamine polysaccharides (25). Low levels of transglycosylation were found, suggesting that this GH114 uses a retaining mechanism (25). An endo-α-1,4-galactosaminidase has also been purified from Streptomyces griseus, but its amino acid sequence was not determined, and thus it cannot be assigned to a specific GH family (26). To date, endo-α-1,4-galactosaminidase activity has not been found in any other GH family. No structures of a GH114 family member are currently available, and the identity of the catalytic residues within these enzymes remains unknown.
Herein, the extracellular region of Ega3 was recombinantly expressed in the yeast host Pichia pastoris and purified to homogeneity. The Ega3 crystal structure revealed a modified (β/α)8-barrel that lacked β-strand 5 (β5) and α-helices 1 (α1) and 8 (α8). A structural insertion after β3 helps to create a deep cleft. Soaking Ega3 crystals in galactosamine allowed structural determination of an Ega3-GalN complex. Binding of galactosamine in the conserved cleft results in a conformational change and the formation of a tunnel, a structural feature that correlates with processive glycoside hydrolase activity (27). We show that Ega3 disrupts GAG-dependent biofilms at nanomolar concentrations and, using pure oligosaccharides, demonstrate that the enzyme is an endo-α-1,4-galactosaminidase specific for GalN-GalN linkages. Our identification of the acidic catalytic residues at the ends of β4 and β6 supports a retaining mechanism for this GH family.

Ega3 is predicted to have an extracellular GH114 domain
To gain insight into the structure and function of Ega3, we first examined its amino acid sequence to determine its domain structure and identify boundaries that could be used for construct design. The primary amino acid sequence of Ega3 from the UniProt database (gene AFUA_3G07890) was submitted to a number of bioinformatics servers. The TMHMM server (28) predicted that the N-terminal region of the protein contains a putative transmembrane helix between residues 22 and 45, with the N terminus residing in the cytosol (Fig. 1A). The extracellular region was predicted by BlastP and the dbCAN2 annotation server to include a GH114 domain between residues 83 and 314 (29,30). The first 22 residues of the linker between the transmembrane helix and the GH114 domain, residues 46-68, have high glycine content and are predicted to be disordered according to Phyre2 (31).

Ega3 adopts a (β/α)-barrel fold with a deep, highly-conserved groove
Based on these bioinformatics analyses, the predicted extracellular region of Ega3 (Ega3₄₆₋₃₁₈, referred to as Ega3 herein) was recombinantly produced in P. pastoris for in vitro structure-function studies. Attempts to produce soluble protein in Escherichia coli were unsuccessful. Ega3 was purified to homogeneity and used in crystallization trials both with and without its hexahistidine purification tag. Both tagged and untagged constructs crystallized in multiple conditions. Preliminary hits appeared as irregular square plates and long rods. A crystal produced using untagged Ega3 grown in 0.2 M lithium acetate and 20% (v/v) PEG 3350 yielded the highest-resolution data set, diffracting to 1.76 Å (Table 1). Conventional molecular replacement using the structure of highest sequence identity, the hypothetical protein TM1410 from Thermotoga maritima (18% sequence identity, PDB 2AAM), was unsuccessful. Instead, phases were determined using ARCIMBOLDO_SHREDDER based on fragments of this distant homologue. After model building and refinement, the resulting structure encompassed residues 68-318 and had an Rwork and Rfree of 16.4 and 19.6%, respectively. No interpretable density was found for residues 46-67, suggesting that the N-terminal region that links the GH114 domain to the transmembrane helix is largely disordered or prone to proteolytic cleavage.
The structure of Ega3 consists of a central β-barrel of seven strands surrounded by six α-helices (Fig. 1B). β5, α1, and α8 of a canonical (β/α)8-barrel are replaced by regions with no regular secondary structure. The six cysteine residues present form three disulfide bonds, including the C-terminal amino acid Cys-318 that cross-links to Cys-294 on α7 (Fig. 1B). Density for five N-glycans was found linked to Asn-69, Asn-92, Asn-161, Asn-222, and Asn-253. Three sites, Asn-69, Asn-92, and Asn-161, were predicted to be glycosylated by the NetNGlyc 1.0 server (32). Asn-222 was given a low score by the server, and Asn-253 was not predicted because its sequon is Asn-Xaa-Cys instead of Asn-Xaa-(Ser/Thr) (32). Whether any of these sites are glycosylated in the native protein has yet to be determined, but it is interesting to note that only the sequons at Asn-92 and Asn-222 are conserved within Ega3 orthologues. The longest ordered N-glycan, linked to Asn-92, contained eight sugar moieties (Manα6[Manα3]Manα6[Manα2Manα3]Manβ4GlcNAcβ4GlcNAc). Both the Asn-92- and Asn-69-linked glycans are highly ordered due to their participation in crystal contacts.
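The distinction drawn above between the canonical Asn-Xaa-(Ser/Thr) sequon and the Asn-Xaa-Cys variant can be illustrated with a short pattern search. The sketch below is only a minimal illustration and is not the NetNGlyc algorithm: it assumes that Xaa may be any residue except proline (applied here to both sequon types), the function name `find_sequons` is hypothetical, and the toy sequence is invented rather than taken from Ega3.

```python
import re

# Canonical N-glycosylation sequon: Asn-Xaa-(Ser/Thr), Xaa assumed to be any residue but Pro.
# The rarer Asn-Xaa-Cys variant is matched separately so it can be reported as such.
# Lookaheads are used so that overlapping sequons are all reported.
CANONICAL = re.compile(r"(?=N[^P][ST])")
CYS_VARIANT = re.compile(r"(?=N[^P]C)")


def find_sequons(sequence, first_residue_number=1):
    """Map the residue number of each sequon Asn to the sequon type found there."""
    hits = {}
    for label, pattern in (("N-X-S/T", CANONICAL), ("N-X-C", CYS_VARIANT)):
        for match in pattern.finditer(sequence):
            hits[match.start() + first_residue_number] = label
    return dict(sorted(hits.items()))


# Hypothetical toy sequence (not the Ega3 sequence).
toy_sequence = "MANGSAANPTWNACKNNTS"
for position, kind in find_sequons(toy_sequence).items():
    print(f"Asn-{position}: {kind}")
```

A predictor restricted to the canonical pattern would not flag the Asn-Xaa-Cys site in the toy sequence, which parallels why a site such as Asn-253 can be glycosylated in the crystal structure without being predicted.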
On the C-terminal end of the β-strands of the barrel there is a deep cleft. One side of the cleft is created by the loops after β3 and β4. There is a 29-amino acid insertion between β3 and α3 (β3-insertion) that contains a two-strand anti-parallel sheet with two small 3₁₀-helices (Fig. 1B). The other side of the cleft is formed by the loops following strands β1, β7, and β8. Mapping of the conserved, surface-exposed amino acids shows highest conservation in the central cleft, which correlates with an electronegative surface potential (Fig. 1, C and D). The glycans do not obstruct the cleft and are located away from the conserved zone.

Ega3 shares structural similarity to PelAh and TM1410
As Ega3 is the first member of GH114 to be structurally characterized, we next sought to determine Ega3's nearest structural neighbors to gain insight into its potential function. In addition to T. maritima TM1410, which was used for phase determination, the structural similarity server DALI (33) also found that Ega3 is similar to the hydrolase domain of Pseudomonas aeruginosa PelA (PelAh, PDB 5TCB). Secondary structure alignment yielded a root-mean-square deviation (RMSD) of 1.76 Å over 187 α-carbons for TM1410 and 3.25 Å over 189 α-carbons for PelAh. Ega3 and PelAh share 14.4% sequence identity as determined by ClustalOmega but only 12% according to structural alignment. The structure of TM1410 was determined as part of a structural genomics effort and has not been functionally characterized. PelAh has recently been shown to be an endo-α-1,4-N-acetylgalactosaminidase belonging to family GH166 (18,34,35). PelA is involved in the biosynthesis of the Pel polysaccharide, which is similar to the GAG polysaccharide in that it contains 1,4-linked GalNAc and is partially deacetylated (36). The overall structures of Ega3, TM1410, and PelAh align with high similarity in the central barrel motif (Fig. 2A). All three structures have a structural insertion after β3, although the number of strands and helices in this insert differs (Fig. 2A). In both PelAh and Ega3, this insert contains a two-strand sheet, whereas the insert in TM1410 is five residues longer and has a three-strand sheet.

Figure 1. A, predicted domain arrangement of Ega3. B, crystal structure of the (β/α)8-barrel fold of Ega3 shown in cartoon representation. The (β/α)-barrel is colored in blue (β-strands) and teal (α-helices), with the five N-glycans that could be built into the electron density displayed as gray sticks. The secondary structure elements of the β3-insertion are shown in dark red, and the three disulfide bonds are shown in yellow. The missing elements typically found in a (β/α)8-barrel, β5, α1, and α8, are labeled in bold and italic. C, surface representation colored from variable in teal to conserved in fuchsia, as calculated by Consurf (91). D, electrostatic surface, calculated using APBS in PyMOL, shows a highly negatively charged cleft (+10 kT to −10 kT) (60).

EDITORS' PICK: Structure and activity of Ega3
Further similarity was found between the PelAh, TM1410, and Ega3 structures, which lack the eighth helix of the canonical (β/α)8-barrel and instead have an extended coil that packs against the barrel (Fig. 2D). In Ega3, this extended coil region represents the C terminus of the protein and is anchored to α7 by a disulfide bond (Fig. 2D). Ega3 is unique in that it also lacks α1. Both PelAh and TM1410 contain a β-hairpin after β6, whereas Ega3 has a single-turn 3₁₀-helix (Fig. 2B).
The active-site residues of (β/α)-barrel glycoside hydrolases are usually found at the C termini of the β-strands of the barrel. The activity of PelAh depends on a highly-conserved glutamic acid (Glu-218) at the C terminus of β4 and Asp-160 at the C terminus of β6 (18,37). Structural alignment to PelAh shows conservation of these acidic residues in both TM1410 and Ega3 (Fig. 2D). Higher sequence conservation between these proteins was found in the region of Asp-189 of Ega3 (Asp-160 of PelA) as compared with the region around Glu-247 of Ega3 (Glu-218 of PelA) (Fig. 2E). Asp-189 and Glu-247 were previously identified as putative active-site residues in a bioinformatic analysis of the GH114 family (38). The distance between these residues is congruent with the retaining mechanism that was previously proposed for GH114Ps.

Ega3 disrupts both Pel- and GAG-dependent biofilms
Table 1. Summary of data collection and refinement statistics for Ega3. Values in parentheses correspond to the highest resolution shell. Columns: Ega3; Ega3-GalN.
a […] where $I_{hkl}(j)$ and $\langle I_{hkl} \rangle$ represent the diffraction intensity values of the individual measurements and the corresponding mean values, respectively, for each unique reflection. The summation is over all unique measurements.
b $CC_{1/2}$ is the ratio of Pearson correlation coefficients ($CC = \sum (x - \langle x \rangle)(y - \langle y \rangle) / [\sum (x - \langle x \rangle)^2 \sum (y - \langle y \rangle)^2]^{1/2}$) between random half-sets of data.
c $R_{work} = \sum \big| |F_{obs}| - k|F_{calc}| \big| / \sum |F_{obs}|$, where $F_{obs}$ and $F_{calc}$ are the observed and calculated structure factors, respectively.
d $R_{free}$ is the sum extended over a subset of reflections (4.89% apo and 9.27% GalN-bound) excluded from all stages of the refinement.
e Data are as calculated using MolProbity (89).
f Maximum-likelihood-based coordinate error, as determined by PHENIX (81).

The α-1,4-N-acetylgalactosaminidase PelAh disrupts both P. aeruginosa Pel-dependent and A. fumigatus GAG-dependent biofilms (35,37). Given the structural similarities between Ega3 and PelAh, we next investigated whether Ega3, like PelAh, was an active glycoside hydrolase with cross-kingdom anti-biofilm activity. Biofilm disruption assays performed using A. fumigatus Af293 biofilms revealed that Ega3 disrupted Af293 biofilms with a half-maximal effective concentration (EC50) of 0.85 nM (log EC50 −9.07, S.E. 0.21; Fig. 3A). Ega3 was also able to disrupt P. aeruginosa PA14 Pel-dependent biofilms within 1 h with an EC50 of 96 nM (log EC50 −7.01, S.E. 0.11; Fig. 3B). Mutation of either putative active-site residue (Ega3 D189N and Ega3 E247Q) abrogated GAG and Pel biofilm disruption activity. Collectively, these findings suggest that Ega3 is an active glycoside hydrolase that requires Asp-189 and Glu-247 for its cross-kingdom activity.
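Because the EC50 values are reported together with their log EC50 and standard error, the two representations can be cross-checked with a few lines of arithmetic. The sketch below assumes that log EC50 is the base-10 logarithm of the molar concentration and that the S.E. applies on the log scale; neither assumption is stated explicitly in the text, and the function name `ec50_from_log` is hypothetical.

```python
def ec50_from_log(log_ec50, se=None, unit_scale=1e9):
    """Convert log10(EC50 in M) to EC50 in nM (unit_scale=1e9).

    If a standard error on the log scale is given, also return the
    (low, high) range obtained from log_ec50 -/+ one standard error.
    """
    ec50 = 10 ** log_ec50 * unit_scale
    if se is None:
        return ec50
    low = 10 ** (log_ec50 - se) * unit_scale
    high = 10 ** (log_ec50 + se) * unit_scale
    return ec50, (low, high)


# Values quoted in the text for GAG-dependent (Af293) and Pel-dependent (PA14) biofilms.
for label, log_value, se in (("GAG", -9.07, 0.21), ("Pel", -7.01, 0.11)):
    ec50, (low, high) = ec50_from_log(log_value, se)
    print(f"{label}: EC50 = {ec50:.2f} nM (range {low:.2f} to {high:.2f} nM)")
```

Under these assumptions the back-converted values agree with the reported EC50 values to within the rounding of the logarithms, and the asymmetric range illustrates why uncertainties fitted on a log scale should not be read as symmetric errors in nM.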

Ega3 is specific for galactosamine regions of GAG
To probe the substrate specificity of Ega3, a mass spectrometry (MS) approach was used. Secreted GAG present in the supernatant of A. fumigatus cultures contains a mixture of GalNAc and GalN with relatively low Gal content (18). Treatment of secreted GAG with Ega3 resulted in the release of products with ions corresponding to mass-to-charge (m/z) ratios consistent with tri- to pentadecasaccharides (15-mers) containing a mixture of hexosamine (HexN) and N-acetylhexosamine (HexNAc) moieties (Fig. 4A). There was evidence of disaccharide products; however, due to experimental limitations, these ions were poorly resolved and not quantifiable. The released oligosaccharides also suggest that there are regions of GAG that are highly deacetylated, ranging from 50 to 100% deacetylated moieties.
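Assigning ions to HexN/HexNAc compositions amounts to comparing observed m/z values with masses computed from residue compositions. The sketch below is a simplified illustration of that bookkeeping for native, underivatized, singly protonated linear oligosaccharides. That is an assumption made for clarity: the fragmentation analysis described in this study used reduced and propionylated products, whose masses differ. The function names are hypothetical.

```python
# Monoisotopic atomic masses (u).
ATOM = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Elemental composition of each residue as it occurs within a chain
# (free monosaccharide minus one water).
RESIDUE = {
    "Hex": {"C": 6, "H": 10, "O": 5},
    "HexN": {"C": 6, "H": 11, "N": 1, "O": 4},
    "HexNAc": {"C": 8, "H": 13, "N": 1, "O": 5},
}
WATER = {"H": 2, "O": 1}
PROTON = 1.00727647


def formula_mass(formula):
    """Monoisotopic mass of an elemental composition given as {element: count}."""
    return sum(ATOM[element] * count for element, count in formula.items())


def oligosaccharide_mass(counts):
    """Neutral monoisotopic mass of a linear, underivatized oligosaccharide."""
    residues = sum(formula_mass(RESIDUE[name]) * n for name, n in counts.items())
    return residues + formula_mass(WATER)


def mz(counts, charge=1):
    """m/z of the [M + nH]n+ ion for the given composition."""
    return (oligosaccharide_mass(counts) + charge * PROTON) / charge


def percent_deacetylated(counts):
    """Percentage of hexosamine residues that lack the N-acetyl group."""
    hexn = counts.get("HexN", 0)
    hexnac = counts.get("HexNAc", 0)
    if hexn + hexnac == 0:
        raise ValueError("composition contains no hexosamine residues")
    return 100.0 * hexn / (hexn + hexnac)


# Example: a hexasaccharide with four HexN and two HexNAc residues.
composition = {"HexN": 4, "HexNAc": 2}
print(f"[M+H]+ = {mz(composition):.4f}, "
      f"{percent_deacetylated(composition):.1f}% deacetylated")
```

Each HexNAc that replaces a HexN shifts the mass by one acetyl unit (C2H2O), which is what allows the degree of deacetylation of a released oligosaccharide to be read from its m/z.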
In contrast to Ega3, treatment of secreted GAG with Sph3, which we have previously shown is an endo-α-1,4-N-acetylgalactosaminidase […] (18). Fragmentation analysis of the reduced and propionylated tetra-deacetylated hexasaccharide products of Ega3 digestion revealed the presence of a HexN moiety at the reducing end of the oligosaccharide (Fig. 4C). These results support the specificity of Ega3 for galactosamine at the site of cleavage. To determine whether deacetylation of GalNAc was required for Ega3 activity, fully acetylated GAG isolated from the Δagd3 strain was used as the substrate for Ega3 and Sph3 treatment. Exposure of fully acetylated GAG to Ega3 did not result in the release of any detectable oligosaccharide products, further supporting Ega3 specificity for deacetylated GAG (Fig. 4C). In contrast, Sph3 treatment resulted in a product profile similar to that observed with partially deacetylated GAG from WT A. fumigatus (Fig. 4D). We previously determined that sph3 was required for GAG production and biofilm formation in a strain that exhibited WT levels of ega3 expression (16). The difference in specificity between Sph3 and Ega3 is congruent with the lack of functional redundancy found in these in vivo experiments and suggests that these hydrolases play different roles in A. fumigatus GAG synthesis.

Figure 2. […] (18), and the hypothetical protein TM1410 from T. maritima (gray, PDB 2AAM). The β3-insertion and structural elements following β6 are found in each structure and are colored. B, tertiary structure alignment of Ega3 (teal) with Sph3 (purple, PDB 5C5G (16)). The β3-insertion is circled and is not present in Sph3. C, comparison of surface electrostatics between Ega3 and Sph3 as done in Fig. 1. D, aligned structure from A emphasizing the lack of helical structure after β8 of the barrel. E, highly-conserved aspartic and glutamic acid residues occur at the end of β4 and β6, respectively, and are 6.4 Å apart in Ega3. F, sequence alignment of A. fumigatus Ega3 (Ega3Af) and its orthologues from Aspergillus clavatus (Ega3Ac), Aspergillus niger (Ega3An), and Fusarium oxysporum (Ega3Fo) with TM1410, PelAh, and Sph3 showing the degree of amino acid conservation surrounding the putative catalytic residues (blue). Secondary structure is represented in blue for Ega3 above the corresponding residues. Sequence identity to Ega3 is listed for the entire sequence, with sequence conservation represented by colored dots; identities for Ega3 orthologues were calculated by ClustalOmega, and TM1410, PelAh, and Sph3 sequence identities were determined through the secondary structure alignments in Coot.

Ega3 binds galactosamine creating a substrate tunnel
To probe the specificity of Ega3 at the molecular level, cocrystallization and crystal soaking trials were performed using either galactosamine or GalNAc. Although crystals formed in the presence of both monosaccharides, interpretable density for a monosaccharide was only found for galactosamine. A single galactosamine monomer was found bound in the active-site cleft (Fig. 6A). The galactosamine occupies substrate subsite −2, based on the orientation and distance from the site of cleavage (between the −1 and +1 sites). The overall structures of the galactosamine complex and apo-Ega3 are very similar with the exception of the β3-insertion, which moved up to ∼8 Å and folded over the galactosamine moiety, creating a tunnel (Fig. 6B). The loop contains a highly-conserved tryptophan, Trp-154, which binds the galactosamine amino group through a cation-π interaction and moved 12.3 Å compared with the apo-structure (Fig. 6B). The side chain of Glu-157 also moved 1.6 Å compared with the unbound structure. Glu-133 and Glu-157 create an electronegative pocket that accommodates the amine of the galactosamine (Fig. 6C). Three leucines, Leu-87, Leu-88, and Leu-311, form a hydrophobic pocket close to the C6 of the galactosamine (Fig. 6C). The hydroxyl oxygens of C6 and C3 are coordinated by the backbone carbonyls of Asn-310 and Arg-136, respectively.
Comparing the β3-insertion of Ega3 to that of PelAh and TM1410, Trp-154 was found to be conserved in all three proteins and throughout Ega3 orthologues (Fig. 6, D and E). PelAh has an "open" conformation similar to apo-Ega3 (Fig. 6D). TM1410 was crystallized with an unknown ligand that contains a ring reminiscent of a carbohydrate. The Trp-154 equivalent (Trp-121 of TM1410) folds over the ligand in a "capped" conformation similar to that observed in the Ega3-GalN structure (Fig. 6D). Sequence alignment of the β3-insertions shows little conservation in sequence between Ega3 and TM1410, or PelAh, besides the tunnel-forming tryptophan (Fig. 6E). TM1410 and Ega3 share slightly higher identity within the insertion, with conservation at Glu-133 and an asparagine replacing Glu-157. In the place of Trp-135 and Arg-136 in Ega3, TM1410 has a Tyr/Arg pair. These similarities suggest a comparable binding site for a hexosamine. PelAh does not have comparable residues, with a serine aligning to Glu-157 and an aspartic acid replacing Glu-133 (Fig. 6, D and E). Thus, PelAh has a larger and less electronegative pocket than the one created by Glu-133 and Glu-157 in Ega3.

Docking of α-1,4-(GalN)5 on Ega3 reveals six substrate-binding subsites
Although the co-crystal structure of Ega3-GalN revealed a single sugar-binding site within the deep active-site cleft, the substrate for Ega3 is polymeric. The disappearance of (GalN)5 upon oligo-α-1,4-GalN cleavage suggests that a pentasaccharide is a productive substrate of Ega3 (Fig. 5C). To determine potential binding sites for an oligo-α-1,4-(GalN)5 substrate, docking studies were performed using Glide in the Schrödinger software suite (41-43). The docking of an α-1,4-GalN pentasaccharide in the Ega3-GalN structure, after removal of the monosaccharide found in this structure, yielded multiple conformations within the predicted binding cleft (Fig. 7A). The 10 highest-scoring conformations all predicted a galactosamine moiety binding at the −2 subsite oriented in the same manner as the galactosamine monomer found in the Ega3-GalN structure, supporting the validity of the docking studies (Fig. 7, B and D). There were two predominant groups of conformers: one group docked in the −3 to +2 subsites, and the second group docked in the −2 to +3 subsites (Fig. 7, B and C).
These docking studies shed light on the mechanism of Ega3, as the positioning of sugars in the +1 and −1 subsites would allow Asp-189 to attack the anomeric carbon and Glu-247 to protonate the oxygen of the glycosidic bond (Fig. 7E). Glu-247 is then well-positioned to activate a water molecule to attack the anomeric carbon of the glycosyl-enzyme intermediate (Fig. 7E). Thus, the docking supports the proposed retaining enzyme mechanism of GH114 family members and suggests roles for Asp-189 and Glu-247 as the nucleophile and catalytic acid/base, respectively (Fig. 7F). As noted above, single point mutants of D189N and E247Q were unable to disrupt either GAG- or Pel-dependent biofilms (Fig. 3), supporting the role of Asp-189 and Glu-247 as the catalytic nucleophile and acid/base, respectively. The docking studies also provide a rationale for the specificity of Ega3 for galactosamine substrates over GalNAc, as many hydrogen bonds are created between the polar and charged residues in the −2 to +3 sites and the α-1,4-(GalN)5 (Fig. 7C). Addition of an acetate group to the moiety in subsite −2 would lead to steric clashes, suggesting that this site serves as a filter for galactosamine substrates.
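The two groups of docked conformers can be related to the expected hydrolysis products through the standard subsite nomenclature, in which negative subsites bind the non-reducing side of the substrate, positive subsites bind the reducing side, and cleavage occurs between −1 and +1. The sketch below enumerates the productive binding registers of a linear oligosaccharide under the simplifying assumption that every sugar must sit in a subsite (no overhang beyond the cleft); the function name `binding_registers` is hypothetical and the enumeration is not part of the docking protocol.

```python
def binding_registers(n_sugars, minus_sites, plus_sites):
    """Enumerate registers in which a linear oligosaccharide spans the cleavage point.

    Subsites are numbered -minus_sites..-1 (non-reducing side) and
    +1..+plus_sites (reducing side); cleavage occurs between -1 and +1.
    Each register is returned as
    (first_subsite, last_subsite, glycon_length, aglycon_length),
    where the glycon is the product released from the non-reducing side.
    """
    registers = []
    for glycon in range(1, n_sugars):
        aglycon = n_sugars - glycon
        # Keep only registers in which every sugar occupies a subsite.
        if glycon <= minus_sites and aglycon <= plus_sites:
            registers.append((-glycon, aglycon, glycon, aglycon))
    return registers


# A pentasaccharide in a cleft with six subsites (-3 to +3).
for first, last, glycon, aglycon in binding_registers(5, minus_sites=3, plus_sites=3):
    print(f"subsites {first} to +{last}: products of {glycon} and {aglycon} sugars")
```

For a pentasaccharide in six subsites, only the −2 to +3 and −3 to +2 registers are possible under this assumption, and both would yield one disaccharide and one trisaccharide.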

Conserved residues in the deep binding cleft are important for activity
The results of our docking studies suggest determinants of substrate binding and catalysis. To experimentally validate which residues are important for catalysis and substrate binding, alanine and conservative point mutants were made to the residues in the binding cleft and those identified in the Ega3-GalN structure and docking studies (Fig. 8A).
Examination of the Ega3 structures reveals that Tyr-126 creates a hydrogen bond network between Asp-189 and the galactosamine amino group in the −2 subsite (Fig. 7D). Tyrosines neighboring catalytic residues have been found to increase catalytic activity by positioning the carboxyl group and modulating its pKa (44-47). Mutation of Tyr-126 to phenylalanine (Y126F) abolishes this network and reduces biofilm disruption activity 215-fold compared with WT Ega3 (Fig. 8B). This decrease in activity suggests that Tyr-126 may be involved in activating the catalytic nucleophile, Asp-189.
Tunnel formation has previously been correlated with processivity of glycoside hydrolases and a decrease in substrate off-rates (27,48-51). Replacing the tunnel-forming tryptophan with phenylalanine (W154F) or alanine (W154A) decreased biofilm disruption activity compared with WT Ega3 (Fig. 8B). Mutation to alanine would prevent the cation-π interactions between the substrate and tryptophan, which could be important for tunnel formation. These results suggest that Trp-154 is important for substrate binding and that tunnel formation may increase Ega3 efficiency by increasing processivity.
Residues in the β3-insertion that coordinate the monosaccharide in the Ega3-GalN structure were also found to be important for activity. Glu-157 and Glu-133 hydrogen bond to the GalN amine at the −2 subsite in both the Ega3-GalN structure and the docking studies. Replacement of Glu-157 with serine completely abolishes the ability of the mutant enzyme to disrupt the biofilm. The conservative mutation of Glu-157 to glutamine (E157Q) also had no measurable activity on GAG biofilms. These findings, along with the structure and docking studies, suggest that the negative charge is required at this position for galactosamine binding. Similar to the results for Glu-157, mutation of Glu-133 to aspartate, which conserves the charge, led to a significant decrease in the ability to disrupt GAG. These residues are conserved in Ega3 homologues, and the mutagenesis results support an important role in substrate affinity. PelAh has less bulky residues at the −2 subsite, with a serine and aspartate aligning to Glu-157 and Glu-133, respectively. Ega3 thus has a smaller, more negatively charged binding pocket than PelAh, which may account for the differences in substrate specificity between these enzymes.

Figure 7. In silico docking of α-1,4-(GalN)5 reveals six substrate-binding subsites. A, transparent surface representation of the Ega3 structure with the galactosamine (dark gray) found in the crystal structure and the top two scoring conformations (no. 1 is yellow and no. 2 is orange). The β3-insertion is shown as a cartoon labeled with an arrowhead indicating the change between the apo structure (gray) and Ega3-GalN (teal). Putative catalytic residues are in blue. B, cartoon representation of the Ega3-GalN structure (white) and putative catalytic residues (blue). The galactosamine (dark gray) found in the crystal structure aligns with the top two scoring conformations (no. 1 is yellow and no. 2 is orange). The subsites are numbered with the putative site of cleavage between −1 and +1. C, side view of the lowest energy conformer (yellow) with residues that participate in binding the oligosaccharide labeled. Dashed lines indicate H-bonds and salt bridges to ligand amines. All interaction distances are less than 3.1 Å. D, saccharide in subsite −2 overlaps with the galactosamine (dark gray) in the Ega3-GalN structure and has an identical hydrogen bond network. The hydrophobic pocket created by Leu-88 and Leu-311 is indicated by the dashed light orange lines. E, Ega3 active site with the hydrogen bond network indicated by the dashed black lines. The catalytic nucleophile, Asp-189, is aligned to attack the anomeric carbon (red dashed line). F, proposed mechanism of Ega3 with Asp-189 acting as the catalytic nucleophile.
The results of the docking studies also suggest that the acidic residues that line subsites +2 and +3 may play a role in hydrogen bonding of the oligosaccharide substrate. Single mutation of any of these residues to alanine had no significant effect on Ega3 activity levels. The participation of multiple acidic residues in binding may provide some redundancy, thus leading to minimal effects on activity when only one residue is mutated to alanine. Taken as a whole, the mutagenesis data support that Asp-189 and Glu-247 at the termini of β4 and β6, respectively, are the catalytic residues. Furthermore, the residues at the −2 subsite were found to affect Ega3 activity, strengthening the results of the substrate docking and Ega3-GalN structure and suggesting that this site acts as a filter for galactosamine specificity and affinity.

Discussion
The location of ega3 within the GAG cluster and its up-regulation during biofilm formation suggest that the GH114 domain encoded by this gene likely plays a role in GAG biosynthesis. The hydrolase domain of Ega3 is predicted to be extracellular and thus could interact with GAG during or after secretion. Herein, using structural and biochemical characterization, we show that Ega3 is an endo-α-1,4-galactosaminidase specific for galactosamine regions of the GAG heteropolymer. A flexible loop, which creates a tunnel upon substrate binding, and the orientation and distance between the key catalytic residues suggest that Ega3 has a processive, retaining enzyme mechanism (Fig. 7).
Previously, a GH114 from Pseudomonas sp. 881 was shown to be an endo-α-1,4-galactosaminidase specific for α-1,4-GalN-GalN bonds using an α-1,4-GalNAc/GalN substrate isolated from Paecilomyces sp. (24,26). The sequence identity between Ega3 and the GH domain of GH114Ps is 49%, with high identity around Asp-189 and Glu-247. This is much higher than the sequence identity between Ega3 and PelAh (14.4%, ClustalOmega). Because substrate specificity is not always shared within a GH family, and to determine whether Ega3's activity is more similar to that of PelAh or GH114Ps, defined-length substrates that represent sections of the GAG polymer were synthesized and their hydrolysis products analyzed. Ega3 was found to have endo-α-1,4-galactosaminidase activity and, unlike PelAh, had no measurable activity on the fully-acetylated polymer (18).
The specificity of Ega3 for galactosamine is supported by the structure of the Ega3-GalN complex. The amino group of the galactosamine is coordinated by Glu-133, Glu-157, and Tyr-126. These residues are conserved in Ega3 orthologues as well as GH114Ps. Addition of an acetate group would create steric clashes, suggesting that Ega3 could not accommodate GalNAc at this site (−2 subsite). PelAh has smaller residues in the structurally equivalent positions, suggesting that PelAh could accommodate both acetylated and deacetylated oligosaccharides. Furthermore, mutation of Glu-157 to serine or glutamine greatly decreases Ega3's ability to hydrolyze GAG biofilms (Fig. 8). The fact that only the −2 subsite was occupied in the galactosamine crystal structure, despite the high concentration of monosaccharide used, suggests that this subsite has the highest affinity for the monosaccharide and may be important for the alignment and affinity of the polysaccharide substrate. This hypothesis is supported by the in silico docking of α-1,4-(GalN)5 into the "capped" Ega3 structure. The high-scoring ligand conformations all had a galactosamine moiety in subsite −2 that superimposes with the galactosamine found in the crystal structure (Fig. 7). Thus, Ega3 may require a galactosamine at the −2 subsite to orient the substrate prior to cleavage. Recently, a similar requirement for a deacetylated moiety to help orient the substrate was found for the endo-acting glycoside hydrolase PgaB, which hydrolyzes partially deacetylated PNAG polysaccharide. PgaB requires the sequence GlcN-GlcNAc-GlcNAc in the −3 to −1 subsites for substrate binding and cleavage (52).
Galactosamine binding caused a large conformational change in the β3-insertion of Ega3 (Fig. 6). The monosaccharide participates in cation-π stacking with a highly conserved tunnel-forming tryptophan. Tunnel formation has been strongly correlated with processive activity, in which the polysaccharide substrate, once bound, may be cleaved repetitively from one end (27,49,53,54). Endo-acting processive enzymes have been found to produce predominantly short oligosaccharides, including di- to tetrasaccharides, from polysaccharide substrates. Although the experimental approach precluded clear quantification of disaccharides or detection of monosaccharide products, Ega3 treatment of secreted GAG produced significant levels of (GalN)3 (Fig. 4). Processivity could not be directly measured using the methods employed herein, but the accumulation of (GalN)3 and tunnel formation upon substrate binding are both suggestive of processivity, in which Ega3 would bind galactosamine regions of GAG and release (GalN)3 multiple times from one end before dissociating from the substrate. GH114Ps was also found to produce largely (GalN)3 and some (GalN)2 from polysaccharide substrates when incubated for extended periods (25). GalNAc may affect the degree of processivity but can be accommodated in some Ega3 subsites, as shown by the products from soluble GAG after Ega3 treatment (Fig. 4A). Further experiments would be necessary to measure the degree of processivity exhibited by Ega3 in vitro. The insertion after β3 and the tryptophan are conserved in TM1410 and PelAh, and structures of these proteins represent the "capped" and "open" states of the binding cleft, respectively (Fig. 6D). The two structures of Ega3 presented herein strongly suggest that the flexibility of the β3-insertion plays a crucial role in substrate binding in all three enzymes and that tunnel formation is conserved.
Ega3 was found to disrupt GAG- and Pel-dependent biofilms with nanomolar EC50 values comparable with those of PelAh. The EC50 values for PelAh are 2.8 and 35.7 nM on GAG and Pel biofilms, respectively (34,35). We have recently shown that PelAh is an endo-α-1,4-N-acetylgalactosaminidase with a preference for partially deacetylated substrates. The Pel polysaccharide itself has been reported to be a partially deacetylated 1,4-linked polymer of GalNAc and GlcNAc at a 5:1 ratio (36). The percentage of deacetylation and the identity of the deacetylated component (GalNAc, GlcNAc, or both) have not yet been determined. Herein, digestion of GAG by Ega3 supports high heterogeneity, with blocks that are over 50% deacetylated and regions that are fully acetylated and susceptible to Sph3 digestion. The cross-reactivity of Ega3 against Pel-dependent biofilms suggests the presence of α-1,4-(GalN)n within the Pel polysaccharide.
Glycoside hydrolases are important for optimal exopolysaccharide export and biofilm formation in multiple bacterial species (55–59). For example, the synthesis of cellulose by Acetobacter xylinus (57) and Gluconacetobacter xylinus (56) requires the GH8 enzyme CMCax for cellulose assembly. In the absence of CMCax, the nascent cellulose adopts highly twisted conformations that have been suggested to stall cellulose production (56). A similar role has been proposed for Sph3 in GAG biosynthesis but has not been experimentally verified (16). The Candida exoglucanase Xog1 also plays a role in matrix β-1,3-glucan production, but again its mechanism is not known (61, 62). The role of Ega3 in GAG biosynthesis has yet to be determined; however, its putative localization at the cell surface is incongruent with activity on deacetylated GAG. The GAG polymer is predicted to remain fully acetylated until it reaches the cell wall, where the deacetylase Agd3 is found. It is possible that the transmembrane helix tether is cleaved, thus allowing Ega3 to localize closer to deacetylated GAG. Further studies of the biological role of Ega3 are required to resolve these questions.
This study presents the first structure-function analysis of a GH114 family member, and it identifies the importance of the conserved aspartic and glutamic acid residues in catalysis (38). The studies presented herein will aid in the understanding of the other members of the GH114 family that are present throughout bacteria and fungi. Previously, it was shown that the GAG gene cluster, including ega3 orthologues, is present in numerous fungal plant pathogens and emerging human pathogens (6). GH114 members are found in many Streptomyces spp., a genus of industrial importance that is cultivated for the production of antibiotics, hydrolytic enzymes, and other biomolecules, as well as for agricultural uses (63,64). This includes Streptomyces lydicus and Streptomyces griseoviridis, which both encode GH114 enzymes and have been commercially developed as plant-growth-promoting products. It is possible that there is interplay in plant microbiomes between poly-α-1,4-galactosamine-producing organisms and those that encode GH114 enzymes to degrade these polymers. Recently, glycoside hydrolases have been gaining attention as possible anti-biofilm agents (34,35,65–67). The activity of Ega3 against biofilms of A. fumigatus and P. aeruginosa suggests that this hydrolase may have potential therapeutic applications for the treatment of infections with these organisms.

Ega3 expression and purification
A pUC57 plasmid containing an E. coli codon-optimized version of the gene encoding the extracellular region of A. fumigatus Ega3 (Ega3 46–318) was obtained from BioBasic. The ega3 46–318 gene was subcloned into the pET28a vector between the NdeI and XhoI sites. Expression trials of the predicted GH114 domain were attempted in E. coli BL21 and Origami cells. As no soluble protein was produced using this construct, primers 68NdeI, 72NdeI, and 75NdeI were paired with either 310HindIII or 318HindIII to generate shorter protein constructs (Table 2). As little to no soluble protein was obtained for any of these constructs, and Ega3 is predicted to contain three N-glycosylation sites and possible disulfide bonds, protein expression was moved into yeast.
The region including the N-terminal tag and the ega3 46–318 gene was cloned from the pET28a vector into the pPink α-HC vector between the StuI and KpnI sites using primers His Fwd and Ega3 Rev (Table 2). This resulted in a pPink α-HC plasmid encoding the α-factor signal sequence with the pET28a thrombin-cleavable hexahistidine tag N-terminal to Ega3 46–318. Point mutants Y126F, E133D, W154A, W154F, D189A, D189N, D192A, D195A, D232A, E247Q, E254Q, and Y273F were created from this plasmid using the QuikChange Lightning site-directed mutagenesis kit (Agilent Technologies) with the primer pairs listed in Table 2. Sequences were confirmed at ACGT Corp. (Toronto, Ontario, Canada) with primers against the α-factor and CYC1-encoding regions. Plasmids were linearized with either AflII or SpeI in the TRP2 gene before transformation into PichiaPink™ strain 4 (ade2, prb1, and pep4) by electroporation. For each clone, colonies with the highest relative expression were either restreaked and stored at 4°C or used to make glycerol stocks and kept at −80°C.
Large-scale expression of WT Ega3 46–318, or point mutants thereof, was carried out as outlined in the PichiaPink™ expression system manual with minor modifications. Briefly, a 50-ml starter culture was grown in YPD (1% (w/v) yeast extract, 2% (w/v) peptone, 1% (w/v) dextrose) growth medium for about 20 h at 28°C. The starter was added to 500 ml of BMGY (1% (w/v) yeast extract, 2% (w/v) peptone, 100 mM potassium phosphate, pH 6, 1.34% (w/v) yeast nitrogen base, 0.00004% (w/v) biotin, and 1% (v/v) glycerol) growth medium in a baffled Fernbach flask. Growth continued for 24 h at 28°C with shaking. Cells were harvested, resuspended in 250 ml of BMMY medium (BMGY without the glycerol), and incubated for a further 24 h. Expression was induced by the addition of methanol in three staggered feedings at 1-h intervals, starting 1 h after resuspension, to a final concentration of 1% (v/v) methanol. After harvesting, the supernatant containing the secreted protein was filtered through a Whatman filter. The sample was then buffered with HEPES, pH 8, to a final concentration of 20 mM.
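The induction and buffering steps above are stated as final concentrations. The short Python sketch below shows one way to turn them into volumes to pipette. It assumes that volumes are additive, that methanol is added neat (100% v/v) in three equal feedings, and that HEPES is added from a 1 M stock to 250 ml of supernatant; none of these details are given in the text, and the function name is hypothetical.

```python
def stock_volume_to_add(initial_volume_ml: float,
                        stock_conc: float,
                        final_conc: float) -> float:
    """Volume of stock (ml) to add so the mixture reaches final_conc.

    Solves stock_conc * v = final_conc * (initial_volume_ml + v),
    assuming additive volumes. Concentrations share the same unit.
    """
    if not 0 <= final_conc < stock_conc:
        raise ValueError("final concentration must be below the stock")
    return final_conc * initial_volume_ml / (stock_conc - final_conc)


culture_ml = 250.0  # BMMY resuspension volume from the protocol

# Methanol: 1% (v/v) final from an assumed 100% (v/v) stock
methanol_total_ml = stock_volume_to_add(culture_ml, 100.0, 1.0)
methanol_per_feeding_ml = methanol_total_ml / 3  # assumed equal feedings

# HEPES: 20 mM final from an assumed 1 M (1000 mM) stock
hepes_ml = stock_volume_to_add(culture_ml, 1000.0, 20.0)

print(f"methanol: {methanol_total_ml:.2f} ml total, "
      f"{methanol_per_feeding_ml:.2f} ml per feeding")
print(f"1 M HEPES, pH 8: {hepes_ml:.2f} ml")
```

Because the added volume is small relative to the culture, the simpler approximation of 1% of 250 ml (2.5 ml of methanol) differs from the exact solution by only about 1%.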
The protein was purified using a two-step procedure: ammonium sulfate precipitation followed by size-exclusion chromatography (SEC). Ega3 was precipitated with 80% (w/v) ammonium sulfate at 4°C, and the precipitate was pelleted at 10,000 × g for 30 min. The precipitate was resuspended in SEC buffer (20 mM HEPES, pH 8.0, 150 mM NaCl), and the ammonium sulfate was removed by dialysis prior to loading the sample onto a HiLoad 16/60 Superdex 200 prep-grade column (GE Healthcare) for purification. This expression and purification protocol yielded ~5 mg of Ega3 per liter of medium.
For untagged Ega3, the purification procedure was modified. First, the ammonium sulfate precipitation step was replaced by nickel-affinity chromatography to isolate the His-tagged protein. The eluate from the Ni-NTA column was subsequently buffer-exchanged into standard SEC buffer. The hexahistidine tag was cleaved with thrombin by incubating for

Ega3 crystallization and data collection
Purified Ega3 46–318 with and without the N-terminal hexahistidine tag was concentrated to 12.8 and 11 mg/ml, respectively. Crystallization conditions were screened in 3-µl drops at a 1:1 ratio of Ega3 to mother liquor, using hanging-drop vapor diffusion in VDX 48-well plates (Hampton). Crystals formed in over a third of the conditions in the MCSG suite #1 (Microlytics). Two crystal forms were dominant: singular rods and flat rectangles. A crystal of untagged Ega3 grown in MCSG 1 #86 was cryoprotected in 25% (v/v) PEG 3350 and 50% (v/v) mother liquor (0.2 M potassium iodide, 20% (v/v) PEG 3350) for 30 s before vitrification in liquid nitrogen. X-ray diffraction data were collected at a wavelength of 0.9794 Å at the Canadian Light Source (CLS) using beamline 08B1-1. A total of 720 images of 0.5° oscillation were collected on a Rayonix MX300 CCD detector with an exposure time of 2.0 s per image. The data were indexed, integrated, and scaled using Autoprocess (Table 1) (71,72). Phasing was achieved using the distant homologue TM1410 (PDB 2AAM) as a template for ARCIMBOLDO_SHREDDER (73). TM1410 was identified by HHPRED (74) as the most similar structure available, with about 17% sequence identity to Ega3. ARCIMBOLDO_SHREDDER performed expected log-likelihood gain-guided placement of template-derived fragments with Phaser (75,76). Additional degrees of freedom, including gyre refinement against the rotation function and gimble refinement after placement, were used to refine fragment location during molecular replacement (77). Consistent fragments were combined in reciprocal space with ALIXE (78). The best-scoring phase sets were subjected to density modification and autotracing with SHELXE (79), which led to a main-chain trace comprising 225 residues and characterized by a CC of 35%. Side chains were added in Coot (80), followed by iterative rounds of structure refinement in PHENIX.REFINE (81) and manual building in Coot.
The TLSMD server was used to create three TLS groups that were used in refinement.
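The collection parameters for the apo data set can be cross-checked with a few lines of arithmetic. The Python sketch below derives the total rotation range, the cumulative exposure time, and the photon energy corresponding to the stated wavelength. The class and field names are illustrative assumptions, and the conversion uses the standard relation E (keV) ≈ 12.3984 / λ (Å).

```python
from dataclasses import dataclass

HC_KEV_ANGSTROM = 12.398419843  # Planck constant x speed of light, keV·Å


@dataclass(frozen=True)
class DataCollection:
    """Hypothetical container for the parameters quoted in the text."""
    n_images: int
    oscillation_deg: float
    exposure_s: float
    wavelength_angstrom: float

    @property
    def total_rotation_deg(self) -> float:
        return self.n_images * self.oscillation_deg

    @property
    def total_exposure_min(self) -> float:
        return self.n_images * self.exposure_s / 60.0

    @property
    def energy_kev(self) -> float:
        return HC_KEV_ANGSTROM / self.wavelength_angstrom


# Apo Ega3 data set as described above
apo = DataCollection(n_images=720, oscillation_deg=0.5,
                     exposure_s=2.0, wavelength_angstrom=0.9794)

total_rotation_deg = apo.total_rotation_deg
total_exposure_min = apo.total_exposure_min
energy_kev = apo.energy_kev

print(f"rotation range: {total_rotation_deg:.0f} degrees")
print(f"cumulative exposure: {total_exposure_min:.0f} min")
print(f"photon energy: {energy_kev:.3f} keV")
```

With these values the sweep covers a full 360° rotation, which is consistent with collecting a complete, redundant data set from a single crystal.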

Ega3 co-crystallization, crystal soaking, and data collection
Co-crystallization of Ega3 with galactosamine or GalNAc was attempted by hanging-drop vapor diffusion in 2–3-µl drops at a 1:1 ratio, using the previous hit conditions as described above. Crystal screens were set up to final concentrations of 275, 330, and 550 mM galactosamine or 200, 250, and 500 mM GalNAc. Multiple crystals formed in these conditions, and data were collected at NSLS II on the AMX beamline (17ID-1).
Apo-crystals were grown in MCSG 1 #18 (0.2 M calcium acetate, 0.1 M MES, pH 6.0, and 20% (v/v) PEG 8000) at 9–10 mg/ml Ega3 as described above. These crystals were soaked in MCSG 1 #18 supplemented with varying concentrations of galactosamine or GalNAc. Crystals were vitrified in liquid nitrogen with the monosaccharides acting as cryoprotectant. Crystals were screened, and data were collected at NSLS II on the AMX beamline (17ID-1). A data set was collected on a crystal soaked in 550 mM galactosamine (0.2° oscillations, 360°, 0.01 s/image) and processed using fast_dp (82–85). The structure was solved using molecular replacement with apo-Ega3 as the search model. Iterative rounds of model building and refinement were performed as described above for the apo structure using PHENIX and Coot, accessed through SBGrid (80,81,86,87).

Biofilm disruption assays
Biofilm assays were completed as described previously (35). Briefly, 10⁵ conidia of WT A. fumigatus Af293 were grown in Brian medium in non-tissue culture-treated polystyrene 96-well plates for 21 h at 37°C. Biofilms were treated with the indicated concentration of WT or mutant Ega3 in 1× PBS for 1 h at room temperature under gentle agitation. Biofilms were then washed, and the remaining biomass was stained with 0.1% (w/v) crystal violet and destained with 100% ethanol for 10 min. The optical density of the destain solution was measured at 600 nm.
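Crystal violet readings of this kind are commonly converted to percent biomass remaining and summarized as an EC50. The text here does not state how EC50 values were derived, so the Python sketch below is only a rough illustration: it normalizes OD600 readings to an untreated control and estimates the 50% point by linear interpolation on a log10 concentration axis. All function names and data values are hypothetical, and a nonlinear dose-response fit would normally be preferred for reported values.

```python
import math


def normalize(od_values, od_untreated, od_blank=0.0):
    """Convert OD600 readings to percent biomass remaining."""
    span = od_untreated - od_blank
    if span <= 0:
        raise ValueError("untreated control must exceed the blank")
    return [100.0 * (od - od_blank) / span for od in od_values]


def ec50_by_interpolation(concs_nm, percent_remaining, target=50.0):
    """Estimate the concentration giving `target` percent remaining.

    Interpolates linearly in log10(concentration) between the two
    adjacent points that bracket the target. Concentrations must be > 0.
    """
    pairs = sorted(zip(concs_nm, percent_remaining))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(pairs, pairs[1:]):
        brackets = (r_lo - target) * (r_hi - target) <= 0
        if brackets and r_lo != r_hi:
            frac = (r_lo - target) / (r_lo - r_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("response does not cross the target")


# Hypothetical readings, not data from this study
concs_nm = [0.1, 1.0, 10.0, 100.0]
od600 = [1.9, 1.5, 0.5, 0.1]
percent = normalize(od600, od_untreated=2.0)
ec50_nm = ec50_by_interpolation(concs_nm, percent)
print(f"estimated EC50: {ec50_nm:.2f} nM")
```

Interpolating on a logarithmic axis matches the usual spacing of enzyme dilution series, but the estimate depends only on the two bracketing points and carries no confidence interval.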

Ega3 degradation of secreted GAG
Secreted GAG was purified as reported previously (6). Briefly, culture supernatants of 3-day-old Af293 or Af293 Δagd3 cultures were filtered on Miracloth prior to ethanol precipitation. The precipitate was washed twice with 70% (v/v) ethanol, then with 150 mM NaCl, and then with water. The precipitate was then freeze-dried. 1 mg of precipitated secreted GAG was resuspended in 500 µl of 1× PBS containing 1 µM Ega3 or 1 µM Sph3. After incubating for 1 h, the sample was dried, reduced, and then propionylated. Reduction was performed by incubating the oligosaccharides in 10 mg/ml sodium borohydride in 1 M ammonium hydroxide overnight at room temperature. The reaction was then quenched with 30% acetic acid prior to the propionylation reaction. Oligosaccharides were resuspended in methanol/pyridine/propionic anhydride (10:2:3) for 1 h at room temperature. Reduced and propionylated oligosaccharides were then purified using the Hypercarb Hypersep SPE cartridge and eluted with 50% (v/v) acetonitrile (ACN). The dried eluate was resuspended in 0.2% trifluoroacetic acid (TFA) and spotted on the MALDI-TOF plate in a ratio of 1:1 (v/v) with 5 mg/ml 2,5-dihydroxybenzoic acid matrix reconstituted in ACN/0.2% (v/v) TFA (70:30, v/v). Spectra were recorded on a Bruker UltrafleXtreme in positive reflector mode with an accumulation of 5000 laser shots.

In silico docking of GalN oligosaccharide
The Ega3-GalN structure was prepared using the Protein Preparation Wizard (88) in the Schrödinger suite after the removal of the bound galactosamine monomer. Receptor tautomeric and protonation states were optimized for pH 7.0. The α-1,4-GalN ligand was created in Coot (80) by removing the acetyl groups from an α-1,4-(GalNAc)5 molecule that had been built using the Glycam Carbohydrate Builder. Ligand preparation was done using LigPrep from the Schrödinger suite with the OPLS2005 force field and charges at pH 7.0 to create 512 tautomers. Docking was executed by Glide (41–43) with default settings, and results were viewed through Maestro (Maestro, Schrödinger, LLC, New York). The highest-scoring ligand conformers were exported for figure creation in PyMOL (Version 2.0.7).