Functional Analyses of Resurrected and Contemporary Enzymes Illuminate an Evolutionary Path for the Emergence of Exolysis in Polysaccharide Lyase Family 2

Background: The evolutionary history of family 2 polysaccharide lyases is unknown. Results: Functional analysis highlights a key lysine-tryptophan transition involved in exolysis. Conclusion: Subtle changes in amino acid structure can transform enzyme activity. Significance: Combinatorial use of ancestral sequence reconstruction, gene resurrection, and structure-function analysis is valuable for elucidating the function and evolutionary history of polysaccharide lyases.

Family 2 polysaccharide lyases (PL2s) preferentially catalyze the β-elimination of homogalacturonan using transition metals as catalytic cofactors. PL2 is divided into two subfamilies that have been generally associated with secretion, Mg2+ dependence, and endolysis (subfamily 1) and with intracellular localization, Mn2+ dependence, and exolysis (subfamily 2). When present within a genome, PL2 genes are typically found as tandem copies, which suggests that they provide complementary activities at different stages along a catabolic cascade. This relationship most likely evolved by gene duplication and functional divergence (i.e. neofunctionalization). Although the molecular basis of subfamily 1 endolytic activity is understood, the adaptations within the active site of subfamily 2 enzymes that contribute to exolysis have not been determined. In order to investigate this relationship, we have conducted a comparative enzymatic analysis of enzymes dispersed within the PL2 phylogenetic tree and elucidated the structure of VvPL2 from Vibrio vulnificus YJ016, which represents a transitional member between subfamilies 1 and 2. In addition, we have used ancestral sequence reconstruction to functionally investigate the segregated evolutionary history of PL2 progenitor enzymes and illuminate the molecular evolution of exolysis. This study highlights that ancestral sequence reconstruction in combination with the comparative analysis of contemporary and resurrected enzymes holds promise for elucidating the origins and activities of other carbohydrate active enzyme families and the biological significance of cryptic metabolic pathways, such as pectinolysis within the zoonotic marine pathogen V. vulnificus.

Polysaccharide lyases (PLs) are a class of carbohydrate active enzymes (i.e. "CAZymes") (1, 2) that have proven useful for investigating convergent enzyme evolution (3-5). PLs deploy a β-elimination mechanism to cleave glycosidic linkages within uronic acid-containing polysaccharides, such as homogalacturonan (HG), a homopolymer of galacturonic acid and a primary component of pectin within the cell wall of plants (6). This reaction generates products with a 4,5-unsaturation at the non-reducing end (Fig. 1A). To perform β-elimination, unrelated PL families are dependent on three convergent structural features: a Brønsted base (most commonly an arginine) to deprotonate the C5 carbon, a catalytic metal cofactor (most often Ca2+) to acidify the departing C5 proton and stabilize the oxyanion intermediate, and a stabilizing arginine residue to interact with O2 and O3 of the modified GalA residue (3-5). Cleavage can occur indiscriminately at internal linkages throughout the polysaccharide (i.e. endolysis) or exclusively at the terminus of the substrate (i.e. exolysis; Fig. 1B).
The majority of PL family 2 members (PL2s) partition into one of two functionally distinct subfamilies. Intriguingly, many species contain two paralogous PL2 copies that appear to have arisen by gene duplication and functional divergence (i.e. neofunctionalization). Insights into the functional landscape of these two subfamilies of PL2 have identified a correlation between cellular localization, mode of activity, and metal selectivity (3, 7). Subfamily 1 (e.g. YePL2A) contains secreted endolytic members, whereas subfamily 2 members (e.g. YePL2B) are intracellular and exolytic and preferentially harness Mn2+ during catalysis (3, 8). Interestingly, PaePL2 from Paenibacillus sp. Y412MC10, an outlier that is endolytic and preferentially utilizes Mg2+ (Fig. 1C) (7), has provided a snapshot into the evolution of PL2s and the activity of the progenitor enzyme (7) (Table 1). A similar relationship has been described for the structurally unrelated PL22 cytoplasmic lyase family (Table 1) (4). Preferential use of transition metals in PL2s and PL22s is mediated by histidines (PL2 coordination pockets display two histidines; PL22 coordination pockets display three histidines), which displace acidic residues found within Ca2+-selective PLs (3, 4). Nitrogen ligands provide more favorable coordination chemistries for transition metals (9).
The earliest diverging outgroup of PL2s is strictly cytoplasmic (4, 7), which suggests that transition metals are a prerequisite for intracellular β-elimination. Ca2+ is an intracellular signaling molecule, and it is present at limiting levels in the cytoplasm of bacteria (0.1-2 μM) to prevent signaling interference and modification of subcellular structures (10, 11). In contrast, the periplasm is believed to be a more heterogeneous metallo-environment because extracellular ions are free to passively diffuse across the outer membrane (12). The β-helix PLs (PL1, PL3, and PL9), the largest group of PLs most commonly associated with phytopathogens and saprophytes, are secreted into the periplasm or extracellular environment. β-Helix PL families active on HG preferentially coordinate Ca2+ (13) and appear to have evolved for colonization and modification of the plant cell wall. Ca2+ plays a crucial role in the maintenance of plant cell wall integrity, and its levels are high in this environment (10 μM to 10 mM) (14).
Pectins are ubiquitous nutrients for environmental saprophytes, target substrates for macerating phytopathogens (e.g. soft rot), and components of dietary fibers that are digested by symbiotic microbes within the intestines of animals. Perhaps surprisingly, HG utilization and functional pectinases have also been reported for several human enteric pathogens, including Yersinia spp. (4, 15, 16) (Fig. 1C). Although the biological significance of pectinolysis within human pathogens is not clearly understood, several possible roles have been hypothesized, including environmental persistence, colonization of agricultural crops as vectors for transmission, and utilization of pectic nutrients within the intestine of an infected animal host (17). In this light, the presence of a pectinolytic pathway, complete with transport machinery (KdgM-like porin and solute binding protein), polysaccharide lyases (PL2, PL9, and PL22), and a homologue of a unique HG-binding protein (CBM32) (18), has been identified within the genome of Vibrio vulnificus (Fig. 2A). V. vulnificus is a marine-borne bacterium most commonly associated with gastroenteritis caused by the consumption of contaminated seafood or septicemia resulting from wading in contaminated water with open wounds (19). Correspondingly, pectin represents a nutrient niche that is not consistent with its lifestyle (20). This pathway is not strictly conserved within Vibrionaceae, and whether it represents a historical remnant of a pectinolytic ancestor of V. vulnificus or evolved by horizontal gene transfer in response to its coastal water-zoonotic infectious life cycle remains to be determined.
Further insights into the evolutionary history of PL2s after the gene duplication event will help illuminate the adaptations required for metal-dependent activity and cellular specialization of pectin utilization, in addition to the biological significance of pectinolysis for various enteric pathogens. This study describes the structure and function of VvPL2, which is the first reported pectinase from V. vulnificus. Based upon its phylogenetic position within subfamily 2, VvPL2 represents a potential endolytic-exolytic transitional remnant. Additionally, we perform ancestral sequence reconstruction (ASR) of the PL2 family and resurrect progenitor PL2s to compare their activities with contemporary enzymes from subfamilies 1 and 2. We propose that ASR is an underexploited approach within the CAZyme field that will assist in streamlining the characterization of unknown enzyme activities and illuminating the evolutionary basis of substrate recognition and modification in other PL and CAZyme families.

Biochemical Characterization of VvPL2
Purification of Enzymes-Synthesized codon-optimized VvPL2, DdPL2, and PaPL2B genes were subcloned into pET28 (BioBasic Inc., Mississauga, Canada). These constructs and the YePL2A and YePL2B plasmids (3) were transformed into Escherichia coli BL21 Star (DE3) cells, which were grown in LB broth containing 50 μg ml−1 kanamycin sulfate. Cells were grown at 37°C with agitation at 180 rpm until cell density reached an A600 of ~0.8. Cultures were cooled to 16°C, agitation was reduced to 120 rpm, and gene expression was induced with a final concentration of 200 μM isopropyl 1-thio-β-D-galactopyranoside. Overnight cultures were centrifuged at 7,000 × g for 10 min. Cells were chemically lysed by resuspension in a solution of 8% (w/v) sucrose, 0.65% (v/v) deoxycholate, 0.65% (v/v) Triton X-100, 30 mM NaCl, 350 μg ml−1 lysozyme, 6 μg ml−1 DNase, 30 mM Tris, pH 8.0. After lysis, lysate was centrifuged at 13,000 × g for 45 min. The clarified supernatant was passed through a 0.45-μm filter and applied to a gravity flow nickel affinity chromatography column and eluted with 0.5 M NaCl, 20 mM Tris, pH 8.0, with a stepwise increase in imidazole concentration of 5, 10, 100, and 500 mM. Samples containing the protein of interest were concentrated with an Amicon ultrafiltration cell (EMD Millipore) and passed through a HiPrep 16/60 Sephacryl S-200 HR size exclusion chromatography column (GE Healthcare) in 20 mM Tris-HCl, pH 8.0. Pure samples were pooled and concentrated.
TABLE 1 (footnotes only). a, the preferential metal that provides maximal activity in recovery assays. b, these assays were done with the exhaustive dialysis technique as opposed to the depletion-supplementation approach.
Generation of Loop Swap Mutants-Two mutants were created by replacing residues 601-654 of YePL2A with 562-632 of YePL2B as described previously (48). Native YePL2A and YePL2B in pET28a were used as templates. 3′-Regions were grafted to 5′-regions in a secondary PCR, and full gene sequences were ligated into NheI and XhoI restriction enzyme cut sites in pET28a. Ligated transformants were sequenced. Enzymes were produced and purified as above.
Enzyme Assays-Optimal pH was determined by dialyzing samples of enzyme overnight into buffers: BisTris, pH 6.6-7.2; Tris, pH 7.1-9.0; CAPSO, pH 8.9-10.3; CAPS, pH 9.7-11.1; and CABS, pH 10.0-11.0. After equilibration, enzyme was incubated at 37°C in 1 mg ml−1 HG, 50 mM buffer, and the reaction was monitored at 232 nm. Optimal temperature was determined by incubating samples of enzyme in water baths at temperatures ranging from 5 to 60°C for 15 min. Enzyme was then added to 1 mg ml−1 HG, 20 mM CAPSO, pH 9.0, preequilibrated to temperature. Reactions were run for 3 min and monitored at 232 nm. Divalent metal cation preference was determined by dialyzing samples of enzyme into 2 mM EDTA in 20 mM CAPSO, pH 9.0, to remove divalent metal cations from solution. Fractions were then dialyzed into deionized water and incubated with 1 mg ml−1 HG, 50 mM CAPSO, pH 9.0, at 37°C to ensure that activity had been ablated. Fractions were further dialyzed into solutions containing 1 mM CaCl2, MgCl2, MnCl2, CoCl2, NiCl2, or CuCl2, CAPSO, pH 9.0. After equilibration, samples were incubated in 1 mg ml−1 HG, 50 mM CAPSO, pH 9.0, and monitored at 232 nm.
Time course experiments were performed to determine product profiles. Enzymes were incubated in 1 mg ml−1 HG, 50 mM CAPSO, pH 9.0, 1 mM MnCl2 at 37°C. Reactions were stopped by heating the samples to 95°C for 10 min followed by flash freezing in liquid nitrogen. Samples were then resolved by thin layer chromatography with 1-butanol/distilled water/acetic acid (5:3:2, v/v/v) running buffer and visualized with 1% orcinol in a solution of ethanol/sulfuric acid (70:3, v/v), followed by heating at 110°C for 10 min. Samples were compared with samples of GalA (Sigma, catalog no. 48280), GalA2 (Sigma, catalog no. D4288), and GalA3 (Sigma, catalog no. T7407).
Enzyme Kinetics-PLs (100 nM to 1 μM) were incubated in increasing concentrations of HG and GalA3 with 50 mM CAPSO, pH 9.0, 1 mM MnCl2. Samples were monitored in real time at 232 nm, and product formation was determined using the extinction coefficient 5,200 M−1 cm−1. Data were analyzed, and kinetic values were determined using GraphPad Prism version 6.
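The conversion from absorbance to product concentration described above can be illustrated with a minimal Python sketch. It applies the Beer-Lambert relationship with the stated extinction coefficient (5,200 M−1 cm−1) and estimates an initial rate from a least-squares line through early time points. The 1-cm path length, the function names, and the example readings are assumptions made for illustration; they are not taken from this study, and the sketch does not reproduce the GraphPad Prism analysis.

```python
# Extinction coefficient for the unsaturated product at 232 nm (from the text).
EXTINCTION_COEFF = 5200.0  # M^-1 cm^-1


def product_concentration(delta_a232, path_length_cm=1.0):
    """Convert a change in A232 into molar product concentration (Beer-Lambert).

    The 1-cm default path length is an assumption, not a value from the study.
    """
    return delta_a232 / (EXTINCTION_COEFF * path_length_cm)


def initial_rate(times_s, a232, path_length_cm=1.0):
    """Estimate an initial rate (M s^-1) from the least-squares slope of A232 vs time."""
    if len(times_s) != len(a232) or len(times_s) < 2:
        raise ValueError("need at least two paired time/absorbance readings")
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_a = sum(a232) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_s, a232))
    slope = sxy / sxx  # absorbance units per second
    return slope / (EXTINCTION_COEFF * path_length_cm)


if __name__ == "__main__":
    # Hypothetical readings: A232 rising by 0.052 every 10 s.
    print(product_concentration(0.52))
    print(initial_rate([0, 10, 20], [0.0, 0.052, 0.104]))
```

Rates obtained this way at several substrate concentrations would then be fit to a kinetic model to obtain kcat and Km.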

Crystallization and Structure Solution of VvPL2
Crystals of VvPL2 were developed via the hanging drop vapor diffusion method at a protein concentration of 15 mg ml−1 by mixing 1.0 μl of the protein solution with an equal volume of mother liquor consisting of 16% (w/v) polyethylene glycol 3350, 0.14 M Na/K tartrate, and 0.1 M HEPES (pH 7.0) at 19°C. Crystals were cryoprotected by brief crystal immersion into reservoir solution supplemented with 25% ethylene glycol and subsequently frozen in a liquid nitrogen stream prior to diffraction experiments.
VvPL2 crystallized in space group P6₅ with one protein molecule in the asymmetric unit. Diffraction data for VvPL2 in complex with two molecules of tartrate were collected at beamline 12-2 of the Stanford Synchrotron Radiation Lightsource. The data set was processed with XDS and scaled with Scala (21). The correct phases were derived via molecular replacement with the program Phaser (22) using the Yersinia enterocolitica PL2 structure (YePL2A, Protein Data Bank code 2V8J) as a search model (3). The structure of VvPL2 was rebuilt with the program Buccaneer and iteratively improved with cycles of manual building with Coot and positional refinement with Refmac (23-25). Data collection, processing, and refinement statistics were generated by Molprobity (26) and are presented in Table 2. Ramachandran statistics were generated using Rampage (27). Mapping of the degree of residue conservation was performed with the program Consurf (28), and figures were produced using PyMOL (Schrödinger, LLC, New York).

JOURNAL OF BIOLOGICAL CHEMISTRY • AUGUST 28, 2015 • VOLUME 290 • NUMBER 35

Phylogenetic Analysis of the PL2 Family
PL2 sequences were retrieved from the CAZy database (2) and curated to remove truncated or duplicated sequences. An initial amino acid sequence alignment was built using Gblocks (29), and a guide tree was subsequently generated using PhyML (30). This guide tree was then utilized as described (31) to align the full-length sequences. A maximum likelihood (ML) phylogeny was constructed using GARLI version 2.0 (32) and the appropriate model of evolution (LG + I + G), as determined by ProtTest version 3.4 (33). A Bayesian phylogeny was also generated using MrBayes version 3.2.4 (34) and a mixed amino acid model with two parallel runs in order to ensure convergence. Both phylogenies were rooted on the branch between the γ-proteobacteria and outgroup sequences. Bootstrapping was performed in GARLI 2.0 using 1,024 replicates and a 10% burnin. All trees were visualized using Geneious version 6.1.8 (35).

Ancestral Inference
Maximum likelihood ancestral inference was performed in PAML 4.3 (Ziheng Yang) on the basis of nucleotide, codon, and amino acid sequences using the ML phylogeny constructed for PL2. For nucleotide inference in BASEML, a nucleotide alignment exactly matching the PL2 amino acid alignment generated by PRANK was constructed using Geneious version 6.1.8, and the appropriate model of nucleotide substitution was implemented as determined by jModelTest version 2 (36). For codon and amino acid inference in CODEML, the WAG amino acid rate file was employed. The sequences inferred by the three methods were compiled, and a majority-rules approach was taken, with any remaining ambiguous sites resolved after consideration of the physicochemical properties of the inferred amino acids, their frequency among the contemporary sequences, and the inference made by the codon method (which is considered to be the most accurate). Bayesian ancestral inference was performed in MrBayes version 3.2.4 using a mixed amino acid model. Ancestral gaps were inferred using PRANK and incorporated into the ancestral sequences inferred by PAML and MrBayes.
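The majority-rules step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' pipeline: it assumes that the nucleotide- and codon-based inferences have already been translated to amino acids and that all three inferred sequences are aligned to the same length. The function name and the use of "X" to mark unresolved sites are hypothetical choices. Sites flagged as ambiguous would still require the manual resolution described in the text (physicochemical properties, frequency among contemporary sequences, and the codon-based inference).

```python
from collections import Counter


def majority_rule_consensus(from_nucleotides, from_codons, from_amino_acids):
    """Combine three aligned ancestral sequences by majority rule.

    Returns (consensus, ambiguous_positions). A site is resolved when at least
    two of the three methods agree; otherwise it is marked 'X' and its
    zero-based index is reported for manual inspection.
    """
    seqs = (from_nucleotides, from_codons, from_amino_acids)
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("inferred sequences must have the same aligned length")

    consensus = []
    ambiguous_positions = []
    for index, residues in enumerate(zip(*seqs)):
        residue, count = Counter(residues).most_common(1)[0]
        if count >= 2:
            consensus.append(residue)
        else:
            consensus.append("X")  # placeholder for a site needing manual resolution
            ambiguous_positions.append(index)
    return "".join(consensus), ambiguous_positions


if __name__ == "__main__":
    # Hypothetical three-residue example: the last site has no majority.
    print(majority_rule_consensus("MKW", "MKF", "MRG"))
```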

Biochemistry of Ancestral PL2s
Node 49, 52, 54, and 74 sequences were codon-optimized, synthesized, and subcloned into a pET28 (Novagen, catalog no. 69864) expression vector using NheI and XhoI directional restriction sites (Biobasic). The Node 52 gene was subsequently subcloned into pET32 (Novagen, catalog no. 69015) to increase soluble yields. Gene products were purified by immobilized metal affinity chromatography, eluted with a 0-500 mM imidazole gradient, and dialyzed into 20 mM Tris-HCl, pH 8.0. Digests were performed using 0.1 μM enzyme and 1 mg ml−1 HG at 37°C. Digests with Nodes 52 and 54 were performed at pH 8.0 (4 mM Tris-HCl), and digests with Node 74 were performed at pH 9.4 (4 mM CAPS). Direct method metal recovery assays were performed by adding EDTA to a final concentration of 1 mM and supplementing with 10 mM CaCl2, MgCl2, or MnCl2. Reactions were heat-killed by boiling for 5 min and clarified by centrifugation at 13,000 × g.
Products were analyzed directly or following a 10-fold concentration by TLC (as above) or high performance anion exchange chromatography with pulsed amperometric detection. This analysis was performed with a Dionex ICS-3000 chromatography system (Thermo Scientific) equipped with an autosampler as well as a pulsed amperometric detector for total carbohydrates and a UV-visible detector for unsaturated galacturonides. Aqueous sample (typically 10 μl) was injected into an analytical (4 × 250-mm) CarboPac PA1 column (Thermo Scientific) and eluted at a 0.4-ml min−1 flow rate with a sodium acetate gradient (0-1 min, 250 mM; 1-17.5 min, 250-1,000 mM; 17.5-20 min, 1,000 mM; 20-21 min, 1,000 to 250 mM; 21-35 min, 250 mM) in a constant background of 100 mM NaOH.
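For readers who wish to reproduce or plot the elution program, the sodium acetate gradient listed above can be expressed as a piecewise-linear function of time. The following Python sketch encodes the breakpoints given in the text; the function and variable names are hypothetical, and linear ramps between breakpoints are assumed.

```python
# (time in min, sodium acetate in mM) breakpoints taken from the gradient in the text.
GRADIENT_BREAKPOINTS = [
    (0.0, 250.0),
    (1.0, 250.0),
    (17.5, 1000.0),
    (20.0, 1000.0),
    (21.0, 250.0),
    (35.0, 250.0),
]


def sodium_acetate_mM(time_min):
    """Return the programmed sodium acetate concentration (mM) at a given time.

    Assumes linear interpolation between consecutive breakpoints.
    """
    start, end = GRADIENT_BREAKPOINTS[0][0], GRADIENT_BREAKPOINTS[-1][0]
    if not start <= time_min <= end:
        raise ValueError("time is outside the 0-35 min program")
    for (t0, c0), (t1, c1) in zip(GRADIENT_BREAKPOINTS, GRADIENT_BREAKPOINTS[1:]):
        if t0 <= time_min <= t1:
            return c0 + (c1 - c0) * (time_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole program")


if __name__ == "__main__":
    for t in (0.5, 9.25, 18.0, 20.5, 30.0):
        print(t, sodium_acetate_mM(t))
```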

Site-directed Mutagenesis of YePL2B
Nucleotide substitutions were generated via PCR-mediated site-directed mutagenesis. Mutagenic primer sets were extended with KOD polymerase (Novagen, catalog no. 71086) using pETPL2B (3), encoding YePL2B, as template. The entire reaction mixture was then digested with DpnI (New England BioLabs, catalog no. R0176), and one-tenth of the reaction mixture was transformed into DH5α-competent cells. Plasmid was extracted (Omega, catalog no. D6945), and constructs were verified by Sanger sequencing (McGill University and Génome Québec Innovation Centre).
In order to compare specificities and activities, we have performed a comparative kinetic analysis between VvPL2, YePL2A, and YePL2B on both HG and the pectic fragment GalA3 (Fig. 2 (G-I) and Table 3). YePL2A is preferentially active on HG over GalA3 (~11-fold), whereas its paralog YePL2B is preferentially active on GalA3 over HG (~13-fold). This inverse relationship is consistent with their assigned roles in a degradative pathway. YePL2A is secreted and endolytic, which is tailored for upstream activity on polymeric HG, and YePL2B is intracellular and exolytic and performs a downstream role in oligogalacturonide depolymerization (17). In comparison, VvPL2 displays a relatively high catalytic rate on both GalA3 and HG, with preferential activity on GalA3 (~5-fold). This plasticity may be explained by V. vulnificus containing only one PL2 copy. In this context, VvPL2 appears to take on the roles of both YePL2A and YePL2B, and its position within the PL2 family tree suggests that it represents a transitional member based upon both sequence relatedness and function (Fig. 1C).
Previously, PL2s have been reported to preferentially utilize transition metals over Ca2+ during catalysis (3, 7, 8). To further explore a differential relationship between PL2 subfamilies and metal specificity, YePL2A, YePL2B, and VvPL2 were subjected here to a new preparative treatment that exchanges metal cofactors by performing an exhaustive dialysis against EDTA-buffered solutions, followed by exhaustive dialysis in divalent metal-buffered solutions. This alternative method was developed to promote gradual exchange of cofactors and supplant the direct depletion-supplementation method that has been routinely used previously (3, 8). The direct approach introduces highly concentrated metallo-microenvironments and can be deleterious to protein stability. For example, the characterization of YePL2A metal dependence was not previously possible using the direct method (3). With the dialysis substitution method implemented here, YePL2A displayed very little precipitation during its preparation with all metals tested. Initial velocities for YePL2A, YePL2B, and VvPL2 in the presence of Co2+, Mn2+, Ni2+, Mg2+, Ca2+, and Cu2+ were determined to compare related metal dependence of HG modification (Fig. 2J). For YePL2A and VvPL2, the highest catalytic rates were observed in the presence of Mg2+ (at 1 mg ml−1 HG), which agrees with what was reported for PaePL2 and supports a prominent role for Mg2+ as a metal cofactor in endolytic PL2s (3, 8). In contrast, YePL2B displayed the highest activity when supplemented with Mn2+.
To investigate the mechanistic contributions of various metals and potentially the biological significance of metal selectivity, full Michaelis-Menten kinetics were determined for YePL2A and YePL2B using the same spectrum of cofactors (Table 4 and Fig. 2 (K and L)). Mg2+ and Co2+ were confirmed to promote the highest turnover rate for YePL2A. Mn2+, however, was associated with the lowest Km, which translated into a 4-fold higher catalytic specificity constant for Mn2+ than for Mg2+ (Table 4). This suggests that Mn2+ is optimal under limiting concentrations of substrate; regardless, YePL2A demonstrates remarkable plasticity in cofactor selection. This property reflects the adaptation of secreted PL2s to the heterogeneous ionicity of the periplasm (12). The results for YePL2B indicate that the cytoplasmic enzyme has more selectivity for Mn2+ in both the rate of substrate turnover (kcat) and catalytic efficiency (kcat/Km) (Table 4). Higher specificity (Km) for and catalytic turnover (kcat) with Mn2+ translate into an 8-fold increase over Mg2+ and an increase of ~3 orders of magnitude over Ca2+ in catalytic rate. This finding highlights that within the confines of the cell, cofactor selection by PL2s for Mn2+ is more stringent. In the presence of Mn2+, YePL2B appears to adopt a substrate inhibition profile when active on HG (Mn2+*; Ki = 2.25 ± 1.07 mg ml−1; Fig. 2L and Table 3). When fit to this model, the catalytic efficiency is lowered into the range of Mg2+; however, this result should be interpreted with caution because the error values increase, which may compensate for this effect. Although this observation underpins the complexity of the metal-protein-HG interaction for YePL2B, such inhibitory effects are probably negligible in nature because HG concentrations would be limited inside the cell.
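The difference between a standard Michaelis-Menten profile and a substrate inhibition profile can be illustrated with a short Python sketch. The exact equation used for the fit is not stated here, so the commonly used form v = Vmax[S]/(Km + [S](1 + [S]/Ki)) is assumed. Only the Ki value (2.25 mg ml−1) comes from the text; the Vmax and Km values and the function names are hypothetical and chosen solely to show the shape of the curves.

```python
import math


def michaelis_menten(s, vmax, km):
    """Standard Michaelis-Menten rate at substrate concentration s."""
    return vmax * s / (km + s)


def substrate_inhibition(s, vmax, km, ki):
    """Rate under an assumed substrate inhibition model.

    v = Vmax*S / (Km + S*(1 + S/Ki)); the rate passes through a maximum at
    S = sqrt(Km*Ki) and declines at higher substrate concentrations.
    """
    return vmax * s / (km + s * (1.0 + s / ki))


if __name__ == "__main__":
    KI = 2.25    # mg ml^-1, value reported in the text for YePL2B with Mn2+
    VMAX = 10.0  # hypothetical, arbitrary rate units
    KM = 1.0     # hypothetical, mg ml^-1
    s_opt = math.sqrt(KM * KI)
    print("substrate concentration giving the maximal rate:", s_opt)
    for s in (0.25, 1.0, s_opt, 4.0, 8.0):
        print(s, michaelis_menten(s, VMAX, KM), substrate_inhibition(s, VMAX, KM, KI))
```

At low substrate concentrations the two models are nearly indistinguishable, which is consistent with the suggestion that inhibition would matter little if intracellular HG concentrations are low.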
Structural Analysis of VvPL2, an Endolytic Member of Subfamily 2-In order to provide insight into the molecular determinants of subfamily 2 PL2 activities, we attempted to crystallize DdPL2, PaPL2B, YePL2B, and VvPL2. Although we were able to produce high levels of each recombinant protein, they varied in solubility, stability, and crystallizability. Of the four proteins, only VvPL2 produced diffraction quality crystals, which were used to solve the protein structure by molecular replacement to 1.90 Å using YePL2A as a homology model (Protein Data Bank code 2V8K) (3). VvPL2 adopts an (α/α)7 barrel fold with an extensive active site cleft that is characteristic of endolytic enzymes (Fig. 3A). Superimposition of VvPL2 (residues 26-566) onto YePL2A using PDBeFold highlighted the close structural similarity between the two lyases (root mean square deviation = 1.39 Å over 509 residues) (37). Significantly, the Brønsted base (Arg-191), catalytic pocket, and stabilizing residue (Arg-304) are structurally conserved, with Arg-304 ~13.8 Å from the metal center with reasonable geometry for interacting with 2-OH and 3-OH of the GalA in the +1 subsite (Fig. 3B). These three convergent features have been suggested to be critical for β-elimination in unrelated PL folds (3-5). Based on an inspection of B-factors and residual electron density, the metal coordinated in the VvPL2 structure appears to be Ni2+ or Mn2+ (Fig. 3C). Although this may be influenced by the purification conditions, VvPL2 function was modest in the presence of Ni2+ (Fig. 2J). Therefore, whereas the natural metalloenzyme complex is assumed to contain Mn2+, we cannot rule out the possibility with the available data that the metal cofactor present in the crystal structure has been substituted with Ni2+. Intriguingly, the uncleaved N-terminal histidine tag from VvPL2 introduced two artifactual interactions between the Nε2 atoms of residues His-4 and His-6 and the bound metal, which assisted in stabilizing the coordination sphere with a perfect octahedral symmetry (Fig. 3C). Other interactions involve His-129 (Nε2), Glu-150 (Oε2), His-192 (Nε2), and an ordered water molecule (HOH 383) that is activated by a 2.8-Å hydrogen bond with the Nδ1 imidazole nitrogen of His-546.
FIGURE 1 (partial legend). Exolytic enzymes digest HG strictly from the terminus of the polysaccharide and release a single defined product. Endolytic enzymes cleave internal glycosidic linkages to generate a mixed product profile. C, representative unrooted tree highlighting the phylogeny of subfamily 1 and 2 members discussed in this study. Boundaries identified by CAZy are indicated with dashed circles. The associated modes of activity are represented with endolytic and exolytic models. The presumed transitional sequence space between these activities is shown with a black triangle. Ye, Y. enterocolitica; Dd, D. dadantii; Pa, P. atrosepticum; Pae, Paenibacillus sp.
A sequence comparison between the metal binding pockets of YePL2A and VvPL2 reveals the presence of a histidine residue in VvPL2 (His-546) that replaces a glutamate in YePL2A (Glu-515), which was presumed to contribute to Mn2+-selective chemistries for subfamily 2. Structural superimpositions of the metal binding pockets, however, reveal that His-546 and Glu-515 are spatially and functionally conserved (Fig. 3D). Both residues interact with an ordered water molecule, charging it for coordination of the metal cofactor. To investigate whether the Nε of the imidazole group provided any definitive function, we performed both a single substitution (H530E) and insertion of the tripeptide sequence from YePL2A (Phe-512/Thr-513/Glu-514) into the equivalent sequence space on YePL2B (Tyr-528/Ile-529/His-530). These mutations did not reverse the metal selectivity in YePL2B because optimal digestion was still observed in the presence of Mn2+ after EDTA treatment; however, they did appear to be deleterious for the utilization of Ca2+ (not shown). This observation does not rule out a differential role for His-546 in exolytic enzymes but does suggest that the more stringent metal selectivity observed in the catalytic activity of YePL2B probably depends on other structural features within the metal binding site, such as residue geometry and distance, which would be supported by a greater network of interactions within the enzyme scaffold.
Based upon its activity (Table 3) and position within subfamily 2 (Fig. 1C), VvPL2 appears to represent a transitional sequence within the phylogeny of PL2. Therefore, we examined the surface of VvPL2 to identify any structural features near the active site cleft that might help illuminate the structural transition between the endolytic and exolytic subfamilies. Subfamily 1 and subfamily 2 sequences were independently mapped onto the structure of VvPL2 using Consurf (28) (Fig. 3, E and F). This program scales the conservation (magenta) and divergence (cyan) of residues to identify similar and distinct structural elements. Near its catalytic center, VvPL2 displays a high level of structural similarity with subfamily 1 sequences, which is consistent with its observed activity (Fig. 3E). Apart from the core catalytic residues, there is notably less conservation with PL2 subfamily 2 sequences (Fig. 3F). One such region includes a hallmark lysine residue (Lys-300) that is poised near the exit of the active site cleft. This lysine is invariant in subfamily 1 and is replaced with a tryptophan through the majority of subfamily 2 sequences (Fig. 4). Intriguingly, the small outgroup of early diverging sequences in subfamily 2 that includes two Marinomonas spp. and Acholeplasma brassicae display a phenylalanine and glycine, respectively, at this position (not shown).
Ancestral Sequence Reconstruction of Family 2 PLs-The biochemical properties of VvPL2 have revealed that subfamily boundaries assigned within the CAZy database do not provide enough sequence resolution to elucidate the evolutionary history of endolytic to exolytic transition in the PL2 family. Therefore, to trace lineage at the sequence level, we constructed a robust ML phylogeny of all available PL2 sequences and used this analysis to infer the sequences of ancestral PL2s positioned at a range of branch points (Fig. 4A). Almost all of the contemporary PL2 sequences available are from members of the ␥-proteobacteria, with the exception of two sequences from Paenibacillus sp., a member of the Firmicutes, and Haloterrigena turkmenica, an archaeon, which were used as an outgroup and to root the tree. The topology of the ML phylogeny shown in Fig. 4A is supported by high bootstrap percentages; furthermore, a Bayesian phylogeny was also constructed for comparison and found to display identical topology (not shown). The  contemporary PL2 sequences form two major clades, subfamily 1 and subfamily 2, with subfamily 1 positioned closest to the root. The ancestral nodes 49, 52, 54, and 74 were selected for ancestral inference and reconstruction, given their positions at major branch points within the phylogeny (Fig. 4A). Node 49 represents the last common ancestor of all PL2 sequences, including the outgroup sequences, whereas Node 74 represents the last common ancestor of the subfamily 1 PL2s alone. Nodes 52 and 54 both represent ancestors of subfamily 2 after divergence of VvPL2, with Node 54 being the ancestor of all subfamily 2 PL2s from plant pathogens and Node 52 being the ancestor of these same enzymes plus the endolytic VvPL2 and PL2s from Vibrio furnissi and Acholeplasma brassicae. The positions of all four of these PL2 ancestors are supported by bootstrap percent-ages Ն98%. 
Ancestral inference was performed under the ML criterion, and the average posterior probability for each of the four ancestors was >0.7 (this increases to >0.8 for Nodes 52, 54, and 74 when inference at ancestral gaps is not considered).
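As an illustration of how excluding gap positions can raise the reported average, the following minimal sketch averages per-site posterior probabilities with and without gap sites. The data layout (a list of inferred state and posterior pairs, with "-" marking an ancestral gap), the function name, and the toy values are assumptions made for this example and are not taken from the analysis itself.

```python
from statistics import mean

def mean_posterior(sites, skip_gaps=False):
    """Average the per-site posterior probability of an inferred ancestor.

    sites: list of (inferred_state, posterior) tuples; "-" marks a gap
    (a hypothetical encoding chosen for this sketch).
    """
    kept = [p for state, p in sites if not (skip_gaps and state == "-")]
    if not kept:
        raise ValueError("no sites left to average")
    return mean(kept)

# Toy ancestor: gap sites are assumed to be inferred with lower confidence.
toy_sites = [("K", 0.95), ("-", 0.40), ("W", 0.85), ("-", 0.50), ("R", 0.90)]

with_gaps = mean_posterior(toy_sites)
without_gaps = mean_posterior(toy_sites, skip_gaps=True)
print(f"all sites: {with_gaps:.2f}, gaps excluded: {without_gaps:.2f}")
```

In this toy case, the average rises once the low-confidence gap sites are dropped, which mirrors the direction of the change reported for Nodes 52, 54, and 74.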
FIGURE 3. Three-dimensional structure of VvPL2. A, schematic model of VvPL2 color-ramped blue (N terminus) to red (C terminus) and with its catalytic metal modeled as a Mn2+ shown as a purple sphere. B, superimposition of GalA from the +1 site of the YePL2A complex (Protein Data Bank entry 2V8K) within the active center of VvPL2. The backbone of VvPL2 is shown as a gray schematic with the metal-binding residues displayed as gray sticks, ordered waters as red spheres, Mn2+ as a purple sphere, and the stabilizing residue (Arg-304) and Brønsted base (Arg-191) as cyan sticks. The distances between the 2-OH and 3-OH of GalA and Arg-304, C5 and Arg-191, and the Mn2+ ion and the uronate group oxygens are labeled and shown as red dashes. C, the metal binding pocket of VvPL2 with N-terminal His tag. The map of the active center residues coordinating the transition metal is presented as maximum likelihood/σA-weighted 2Fo − Fc densities, contoured at 1.0 σ and carved at 1.5 Å. The coordinated Mn2+ and ordered water are displayed as silver and red spheres, respectively. The two tartrate molecules present within the N-terminal His tag complex are rendered as yellow ball-and-stick models. D, alignment of the YePL2A (Protein Data Bank entry 2V8J; gray) and VvPL2 (green) metal coordination pocket. Residues are modeled as sticks, Mn2+ as purple spheres, and waters as red spheres. Residues are labeled using VvPL2/YePL2A numbering. Bond distances are indicated with yellow dashed lines. Consurf mapping displaying the conserved (magenta) and divergent (cyan) surface features of VvPL2 with subfamily 1 (E) and subfamily 2 (F) members. The highly conserved residues (Lys-152, Arg-191, Arg-304, and metal pocket) and location of the lysine-tryptophan site (K300W) are labeled.
Previously characterized enzyme activities are boxed, activities reported in this study are noted with a single asterisk (see Table 1 for references), and family members with solved three-dimensional structures are designated with a double asterisk (3). B, primary structure alignment of YePL2A, YePL2B, VvPL2, Node 52, Node 54, and Node 74. Residues involved in catalysis and metal coordination are indicated with black and white triangles, respectively. The putative lysine to tryptophan switch is highlighted with a black circle.
AUGUST 28, 2015 • VOLUME 290 • NUMBER 35
The four ancestral PL2s vary from their closest contemporary descendant by at least 15% (~83 amino acids). As expected from its phylogenetic position, the closest contemporary descendant of Node 74 is a subfamily 1 PL2 from Pectobacterium wasabiae (84% sequence identity), and it possesses a lysine residue (Lys-286) conserved within the endolytic subfamily 1. In contrast, Node 54 shares the greatest sequence identity with a subfamily 2 enzyme from Yersinia pseudotuberculosis (82%) and contains the conserved tryptophan residue (Trp-286) at this same position in the active site cleft. Node 52 shares only 60% sequence identity with its closest contemporary descendant (VvPL2). Despite Node 54 sharing only 45% sequence identity with VvPL2, Nodes 52 and 54 share 65% sequence identity. Interestingly, Node 52 does not contain either the conserved lysine found in subfamily 1 or the conserved tryptophan found in subfamily 2; rather, it contains an arginine residue (Arg-283) (Fig. 4B). The most divergent of the inferred ancestral PL2s is Node 49, sharing just 56% sequence identity with its closest contemporary descendant (a subfamily 1 PL2 from P. wasabiae). Based upon sequence alignments, Node 49 does not appear to contain a lysine or tryptophan residue at this position; however, structural modeling determined that this ancestor has a truncated loop, and a lysine (Lys-268) is spatially conserved (not shown).
This observation suggests that a lysine at this position is correlated with endolytic activity and that the last common ancestor of PL2s was endolytic, which would be consistent with what was previously proposed for the early diverging PaePL2 (7).
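The pairwise identities quoted for the node enzymes and their closest contemporary descendants can be illustrated with a short sketch. It assumes two rows taken from an existing alignment and counts identical residues over columns in which neither sequence has a gap; this denominator is only one of several conventions, and the one used in the original analysis is not stated here. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two rows of the same alignment.

    Columns with a gap in either row are ignored (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences carry a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)

# Toy alignment: five comparable columns, one K/R mismatch.
identity = percent_identity("MKT-AW", "MRTLAW")
print(f"{identity:.1f}% identity")
```

Because the result depends on how gapped columns are treated, values computed this way should be compared only with values obtained under the same convention.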

Structure of a PL2 from V. vulnificus
Characterization of Resurrected Ancestral PL2s-From their inferred sequences, it appears that Node 74 is endolytic and Node 54 is exolytic, but, as we have observed with VvPL2, sequence information alone cannot fully predict function. Furthermore, Node 52 contains a divergent amino acid in place of the highly conserved lysine or tryptophan residue associated with endolytic and exolytic activity, respectively. Therefore, we resurrected and characterized the enzymes from Nodes 49, 52, 54, and 74 in vitro using gene synthesis and enzyme product profiling. Nodes 52, 54, and 74 were produced as soluble protein in E. coli and found to be active on HG (Fig. 5A); however, Node 49 was not produced and could not be studied further. In agreement with its phylogenetic position as the last common ancestor of subfamily 1 and the presence of Lys-286, Node 74 displayed characteristic endolytic activity on HG, with detected products ranging in size from ΔGal2 to ΔGal4. Similarly, as an ancestor of subfamily 2, Node 54 displayed an exolytic profile and almost exclusively generated ΔGal2. Node 52 also generated an exolytic-like digestion profile of HG despite possessing an arginine at the Lys-268 position. This residue may represent a key transition in the evolution of subfamily 2 sequences because, despite having related charge potentials, arginine has more steric bulk than lysine. Intriguingly, both residues can be encoded with identical adenine bases in their first and third codon positions (lysine, AAA/AAG; arginine, AGA/AGG), which suggests that one residue can be converted into the other by a single in-frame nucleotide substitution at the second codon position. Analysis of the nucleotide sequence of YePL2A reveals that Lys-291 is encoded by a triadenine codon, and the second position of Trp-286 in YePL2B contains a guanine. Therefore, it seems plausible that the AAA-lysine-encoding position may have evolved first to an AGA-arginine and subsequently to a TGG-tryptophan (Fig. 5B).
Alternatively, it could have proceeded through an AAA → AAG silent mutation and then an AGG-arginine intermediate. In either case, this pathway would suggest that exolysis arose in part through increases in the steric bulk of this positional residue (lysine → arginine → tryptophan) and neutralization of its charge (positive → neutral) (Fig. 5B).
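The codon argument can be checked with a small enumeration. The sketch below builds the standard genetic code and lists every shortest series of single-nucleotide substitutions from AAA (lysine) to TGG (tryptophan) that never passes through a stop codon. It assumes the standard code and considers only minimal-length paths; the function and variable names are introduced here for illustration and do not describe the analysis performed in the study.

```python
from itertools import permutations, product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}

def stop_free_paths(start, end):
    """Shortest single-substitution paths from start to end codon
    in which every intermediate codon encodes an amino acid."""
    differing = [i for i in range(3) if start[i] != end[i]]
    paths = []
    for order in permutations(differing):
        codon, path = start, [start]
        for pos in order:
            # Change one position at a time toward the target codon.
            codon = codon[:pos] + end[pos] + codon[pos + 1:]
            path.append(codon)
        if all(CODON_TABLE[c] != "*" for c in path):
            paths.append(path)
    return paths

paths = stop_free_paths("AAA", "TGG")
for path in paths:
    print(" -> ".join(f"{c}({CODON_TABLE[c]})" for c in path))
```

Under these assumptions, only two of the six possible orderings avoid a stop codon, and both pass through an arginine codon (AAA → AGA → AGG → TGG and AAA → AAG → AGG → TGG). This agrees with the two routes proposed in the text, with the AGA to TGG step resolved into two single-nucleotide changes via AGG.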
Node 74 purified in high yields and was therefore used to further probe ancestral functions and relationships between PL2 subfamilies. This enzyme displays a pH profile (not shown) similar to that of YePL2A and maximal enzyme recovery with Mg2+ (Fig. 5C). These data indicate that ASR can accurately determine the functional relatedness of PL2s back to the subfamily divergence in their lineage and suggest that ASR will have utility for helping to define the phylogenetic relationships in other CAZyme families.
Evolution of Exolytic and Endolytic Activities within the PL2 Family-Despite numerous attempts (e.g. YePL2B, DdPL2, and PaPL2B) we have been unable to solve the structure of an exolytic PL2, and currently the molecular basis of exolytic activity in this family remains to be determined. Previously, an endolytic-exolytic switch was proposed to be the result of a loop insertion near the catalytic center (residues 200-218 of YePL2A and 188-212 of YePL2B) (3). Loop insertions have commonly been observed to be responsible for exolytic-endolytic transformations within CAZymes, including polygalacturonases (3) and family 11 PLs (38). In YePL2B and, by extension, other exolytic PL2s, the catalytic cleft would need to be remodeled to accommodate the reducing end of HG with subsites +1 and +2 for the exclusive release of ΔGalA2 (39) (Fig. 5D). Therefore, we attempted to define the structural role of the loop in YePL2B by performing loop-swapping experiments between YePL2A and YePL2B to generate the hybrid enzymes YePL2A-B (containing the B-loop) and YePL2B-A (containing the A-loop). Swapping the loops between the two enzymes lowered the rate of HG digestion but did not alter their respective product profiles (Table 3) (data not shown), which suggests that the predicted YePL2B loop is not the molecular determinant of exolysis. Intriguingly, the YePL2B loop shifted the pH optimum of YePL2A toward that of YePL2B (not shown) and reduced the affinity for HG but not GalA3 (Table 3), which indicates that the loop may contain residues that contribute to the formation of distal subsites for accommodating polymerized HG.
In order to identify more subtle features that contribute to the structural basis of exolytic activity, we next performed a thorough examination of the primary structure alignments of the node enzymes and a homology model of YePL2B (not shown). Consistent with what was revealed through the ASR analysis, there is a surface-exposed tryptophan conserved within all contemporary exolytic enzymes (YePL2B, Trp-286) and Node 54 (Trp-286), which suggests that it may have a functional role. To test this possibility, we performed substitutive mutagenesis on this tryptophan (W286K and W286A). The product profiles of YePL2B-W286K and YePL2B-W286A contained several product populations, which suggests that the mutants had become endolytic (Fig. 5E). This effect was enhanced in the presence of EDTA (Fig. 5F). It appears that Trp-286 functions to stabilize the exolytic cleft, perhaps by occluding access to the active site and restricting interactions with polymerized HG to the reducing end (Fig. 5D). Additionally, the role of EDTA in generating this phenotype suggests that the modified cleft structure of YePL2B may be fortified by interactions with the catalytic metal. In the absence of a structure from an exolytic PL2, these results shed new light on how subtle transitions in primary structure can transform enzyme activity within closely related enzyme families.
Biological Significance and Evolution of HG Utilization Pathways within Human Enteric Pathogens-PL2s are disproportionately found in human enteric pathogens, and there are often paralogous copies within a genome that partition into subfamilies 1 and 2 (Figs. 1C and 4A) (7). The presence of two distinct PL2 activities that display differential cellular localization highlights that they are contributing to upstream endolytic (secreted) and downstream exolytic (cytoplasmic) stages of HG depolymerization (Fig. 6) (20). Several examples of species with a single copy of either an exolytic or endolytic PL2 entry do exist (Fig. 4A) (7); however, in these cases, alternative pathways for HG saccharification have evolved (17,20). The pectinolytic pathway from V. vulnificus is one such example (Fig. 6). V. vulnificus is predicted to deploy an extracellular PL9, a periplasmic HG binding protein (endoVvCBM32) and endoVvPL2, and an intracellular oligogalacturonate lyase (exoVvPL22). HG transport is facilitated through a KdgM-like anionic porin (40,41), and intracellular transport is predicted to be facilitated by a solute-binding protein and an ABC transporter that is distally located in the genome but under similar regulation (20). This pathway differs from what has been biochemically defined for Y. enterocolitica (Fig. 6) (17). Y. enterocolitica deploys an extracellular pectin methylesterase (YeCE8) (15); periplasmic HG-binding protein (endoYeCBM32) (18), endolytic PL2 (endoYePL2A) (3), and exolytic polygalacturonase (exoYeGH28) (16); and two intracellular depolymerases, which cleave ΔGalA2 (exoYePL2B) (3) and ΔGalA (exoYePL22) (4), respectively, from oligogalacturonide substrates. The signature architectures of these pathways may reveal subtle variances in the structure of pectic nutrients and symbioses (e.g. marine versus terrestrial) or differential roles in environmental persistence and colonization of competitive ecosystems, such as the gastrointestinal tract of animals. Further investigation into biochemical function and evolution of pectin utilization pathways containing PL2s will be foundational for understanding the roles of these enzymes in the life cycle, and potentially in the pathogenesis, of human enteric pathogens.

Conclusion-Assigning ancestry and biological function to sequence-based CAZyme families has been complicated by the realization that many families display great diversity in substrate specificity or mode of activity. In this light, recent efforts to partition CAZyme families into subfamilies (2,42-44) and develop predictive tools to inform function based upon structural signatures (45-47) have helped to facilitate the sequence-to-function-based characterization of protein-carbohydrate interactions and carbohydrate-modifying enzymes. We have demonstrated here, however, that a higher level of resolution may be required to define the functional boundaries and evolution of activities within some CAZyme subfamilies. Within PL2s, the progenitor enzyme appears to have been endolytic and to preferentially harness Mg2+ for β-elimination (PaePL2 and Node 74); however, it is clear that contemporary endolytic PL2s (subfamily 1) display plasticity in metal selectivity, which can be explained by the heterogeneous metallo-environment of the periplasm and extracellular environment of the niches that these bacteria colonize. In contrast, intracellular PL2s (subfamily 2) display the highest rate of substrate turnover in the presence of Mn2+ and are exolytic.
Insights into the structure of VvPL2, which represents a "transitional" enzyme that exhibits some properties of both subfamilies, and the biochemical characterization of resurrected enzymes from the lineages of both subfamilies have revealed that the molecular basis of an endolysis to exolysis transition is not loop-dependent but rather relies on subtle changes to functional groups at the opening to the active site cleft. For example, it appears that a lysine to tryptophan transition is in part responsible for the emergence of exolysis, and this mutation may have evolved through a lysine (AAA) → arginine (AGA) → tryptophan (TGG) transition. Future research aimed at illuminating the evolution of CAZyme subfamily substrate specificity and mode of activity will be central to defining general properties in the evolution of pectin recognition and modification and the colonization of intriguing nutrient niches by microbes, such as pectinolysis by food-borne pathogens.