Structural Basis of Stereospecificity in the Bacterial Enzymatic Cleavage of β-Aryl Ether Bonds in Lignin

Lignin is a combinatorial polymer comprising monoaromatic units that are linked via covalent bonds. Although lignin is a potential source of valuable aromatic chemicals, its recalcitrance to chemical or biological digestion presents major obstacles to both the production of second-generation biofuels and the generation of valuable coproducts from lignin's monoaromatic units. Degradation of lignin has been relatively well characterized in fungi, but it is less well understood in bacteria. A catabolic pathway for the enzymatic breakdown of aromatic oligomers linked via β-aryl ether bonds typically found in lignin has been reported in the bacterium Sphingobium sp. SYK-6. Here, we present x-ray crystal structures and biochemical characterization of the glutathione-dependent β-etherases, LigE and LigF, from this pathway. The crystal structures show that both enzymes belong to the canonical two-domain fold and glutathione binding site architecture of the glutathione S-transferase family. Mutagenesis of the conserved active site serine in both LigE and LigF shows that, whereas the enzymatic activity is reduced, this amino acid side chain is not absolutely essential for catalysis. The results include descriptions of cofactor binding sites, substrate binding sites, and catalytic mechanisms. Because β-aryl ether bonds account for 50–70% of all interunit linkages in lignin, understanding the mechanism of enzymatic β-aryl ether cleavage has significant potential for informing ongoing studies on the valorization of lignin.


The primary obstacle in the production of lignocellulosic biofuels is the release of sugars in high quantities at low cost from recalcitrant biomass feedstocks (1). Lignin is the prime source of this recalcitrance, and there has been renewed interest in the microbial enzymes capable of lignin degradation and catabolism of lignin-derived compounds (2,3). Generally, white rot fungi secrete lignin peroxidases, versatile peroxidase, manganese peroxidases, and laccases that are involved in the initial degradation of lignin (4,5), whereas bacteria are thought to play a role in further degradation of lignin-derived lower molecular weight compounds (6).
GST superfamily members are multifunctional enzymes often involved in cellular detoxification processes via GSH conjugation (13). However, some bacterial GSTs are implicated in basal metabolism and supply bacterial cells with carbon (14). GSTs with >40% sequence identity are traditionally considered to be in the same class, whereas proteins of different classes typically have <25% protein sequence identity (15). However, these classifications are also based on a number of other considerations, including structure, function, and biochemical properties (15). Although there are seven classes of GSTs in mammals (Alpha, Mu, Pi, Sigma, Theta, Omega, and Zeta), there is an ever-increasing number of non-mammalian classes, including Beta, Chi, Delta, Epsilon, Lambda, Phi, and Tau, as well as a number of more recently defined novel classes (15-17). Previous studies have suggested that the β-etherase enzymes LigE and LigF might be classified in the fungal GSTFuA class of GSTs based on sequence phylogeny (18).
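The sequence-identity rule of thumb quoted above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the input is assumed to be a pair of already-aligned sequences with "-" as the gap character, and, as the text notes, real class assignments also weigh structure, function, and biochemical properties.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def class_relationship(identity_pct: float) -> str:
    """Apply the >40% / <25% thresholds quoted in the text."""
    if identity_pct > 40.0:
        return "same class (by sequence)"
    if identity_pct < 25.0:
        return "different classes (by sequence)"
    return "ambiguous; needs structural or functional evidence"


if __name__ == "__main__":
    # Toy aligned fragments, not real GST sequences.
    identity = percent_identity("ACDE-F", "ACDQGF")
    print(identity, class_relationship(identity))
```

The middle band between 25% and 40% is deliberately reported as ambiguous, since the thresholds in the text leave it undecided.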
Because plant lignins are racemic polymers, complementary stereospecificities of the multiple enzymes in the β-ether degradation pathway are required by bacteria to oxidize and cleave the various stereoisomers that are present in lignin polymers (19-22). Here, we describe three protein crystal structures and provide the corresponding biochemical data for the LigE and LigF enzymes involved in the β-ether cleavage step of the Sphingobium sp. strain SYK-6 degradation pathway. The modest structural homology of these two enzymes highlights the fitness adaptation afforded in this and probably other microbial catabolic pathways that can degrade lignin-derived materials, required for enzymatic degradation of such racemic products. This work provides new insights into the structure-function relationships and biochemistry of this pathway, expanding our knowledge of the bacterial catabolism of lignin-derived compounds. Because lignin is the most abundant aromatic polymer in nature, this study informs broader lignin valorization efforts that will ultimately enable the development of efficient pathways for the conversion of lignin into renewable aromatics with applications in advanced biofuels and chemicals (23).
of interest and CPD. The pVP80K_LigFΔ242 vector was prepared using polymerase incomplete primer extension as described previously using Phusion High-Fidelity PCR master mix with HF buffer (New England Biolabs Inc., Ipswich, MA), and primers from Integrated DNA Technologies (Coralville, IA) (25). The pVP80K vector was provided by the Center for Eukaryotic Structural Genomics (Madison, WI), and the pVP102KSSLigF vector containing full-length wild type LigF was prepared as described previously (9). Insert and vector backbone PCR products were mixed 1:1 and immediately transformed into Escherichia coli One Shot TOP10 cells (Invitrogen). The pVP80K_LigFΔ242 vector was purified from E. coli (One Shot TOP10, 10 ml of LB with kanamycin, 18 h at 37°C) using the QIAprep spin miniprep kit (Qiagen, Germantown, MD) and transformed into the laboratory strain E. coli B834(DE3) Z-competent cells (Zymo Research, Orange, CA).
Enzyme Expression and Purification-NEB Express protein expression cells (New England Biolabs Inc., Ipswich, MA) containing pCPD-LigE were grown in autoinducing selenomethionine medium as described previously (26) and harvested via centrifugation. Harvested cells were resuspended in 30 ml of lysis buffer (50 mM HEPES buffer, pH 7.4, 150 mM NaCl, and 40 mM imidazole) and lysed by an Avestin EmulsiFlex-C3 homogenizer. The C-terminally His-tagged proteins were purified from the clarified supernatant using precharged nickel-IMAC resin (GE Healthcare). After protein binding and washing twice with lysis buffer, inositol hexakisphosphate was added to a final concentration of 200 μM. Note that the inositol hexakisphosphate was first diluted to 10 mM in lysis buffer to neutralize the acidic pH of the stock solution. After 1 h of incubation, the resin was washed with 1 ml of lysis buffer to elute the cleaved protein.
Following buffer exchange into 20 mM Tris, pH 8, the LigE protein was further purified using a HiTrap Q HP anion exchange column. Fractions containing LigE, as confirmed by SDS-PAGE, were pooled and concentrated. Final protein cleanup was done using gel filtration on a Superdex 200 10/300 GL column (GE Healthcare).
Laboratory strain E. coli B834(DE3) Z-competent cells (Zymo Research, Orange, CA) containing the pVP80K_LigFΔ242 plasmid were grown in autoinducing selenomethionine medium as described previously (26) and harvested via centrifugation. Harvested cells were resuspended in 20 ml of lysis buffer (20 mM sodium phosphate buffer, pH 7.5, 500 mM sodium chloride, 20% ethylene glycol) and lysed by sonication. The N-terminally His-tagged LigFΔ242 fusion protein was purified from the supernatant by immobilized nickel affinity chromatography using a HiTrap Q HP anion exchange column on an ÄKTA FPLC system (GE Healthcare, Piscataway, NJ). Fractions containing LigFΔ242, as determined by SDS-PAGE, were combined and dialyzed overnight at 4°C. LigFΔ242 was cleaved from the fusion protein using tobacco etch virus protease (1 mg/100 mg of protein; provided by the Center for Eukaryotic Structural Genomics). Following cleavage, LigFΔ242 and the polyhistidine tag were separated using a HiTrap Q HP anion exchange column. Fractions containing LigFΔ242, as confirmed by SDS-PAGE, were pooled and concentrated to 3

is the intensity of an individual measurement of the reflection, and ⟨I(h)⟩ is the mean intensity of the reflection.
where F_obs and F_calc are the observed and calculated structure factor amplitudes, respectively.
c R_free was calculated as R_cryst using 5.0% of randomly selected unique reflections that were omitted from the structure refinement.
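For orientation, the footnotes above describe the conventional crystallographic agreement factors. The expressions below are the standard textbook forms, given here as an assumption because the table header and the original equations are not reproduced in this section; the symbol I_i(h) for an individual measurement is likewise assumed.

```latex
R_{\mathrm{merge}} = \frac{\sum_{h}\sum_{i}\left| I_i(h) - \langle I(h) \rangle \right|}{\sum_{h}\sum_{i} I_i(h)},
\qquad
R_{\mathrm{cryst}} = \frac{\sum_{h}\bigl|\,|F_{\mathrm{obs}}| - |F_{\mathrm{calc}}|\,\bigr|}{\sum_{h}|F_{\mathrm{obs}}|}
```

R_free uses the same expression as R_cryst, evaluated only over the 5.0% of reflections held out of refinement.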
Michaelis-Menten curves were generated by measuring the enzymatic specific activities over a range of initial substrate concentrations (1.50, 1.25, 1.00, 0.75, 0.50, and 0.25 mM) obtained from serial dilution of a 1.5 mM substrate buffer made immediately prior to conducting the assays. The 1-ml assays were conducted in triplicate and were managed as follows: 1) the substrate was dissolved in DMSO at 60 mM, and 25 μl were added to a 2-ml vial; 2) 875 μl of 25.7 mM Tris, pH X, was added (where X is higher than the intended pH of the assay to account for the acidic effect of GSH; e.g. pH X = 11.5 drops to pH 8.0 after the addition of 5 mM GSH); 3) 50 μl of 100 mM GSH was added (100 mM GSH stock solution was prepared by adding GSH to 25 mM Tris (pH X)); 4) 50 μl of 20× concentrated enzyme was added; 5) 150-μl samples were collected after 0, 6, 12, 18, 24, and 30 s of incubation, and enzymatic activity was abolished by pipetting each sample into 5 μl of 5 M phosphoric acid; and 6) the remaining reaction volume was used to measure the pH of the mixture with pH paper.
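The pipetting scheme above can be checked with simple dilution arithmetic (C1·V1 = C2·V2). The sketch below is an illustration only: the dictionary layout and function names are hypothetical, while the volumes and stock concentrations are those stated in the protocol for the 1.5 mM substrate condition.

```python
# Volumes (microlitres) and stock concentrations (mM) as stated in the assay description.
# The enzyme is specified only as a 20x stock, so it carries no mM value here.
COMPONENTS = {
    "substrate_in_dmso": {"volume_ul": 25.0, "stock_mM": 60.0},
    "tris_buffer": {"volume_ul": 875.0, "stock_mM": 25.7},
    "gsh": {"volume_ul": 50.0, "stock_mM": 100.0},
    "enzyme_20x": {"volume_ul": 50.0, "stock_mM": None},
}


def total_volume_ul(components: dict) -> float:
    """Sum of all pipetted volumes in microlitres."""
    return sum(c["volume_ul"] for c in components.values())


def final_concentration_mM(components: dict, name: str) -> float:
    """Final concentration after dilution into the full assay volume."""
    part = components[name]
    if part["stock_mM"] is None:
        raise ValueError(f"no molar stock concentration given for {name}")
    return part["stock_mM"] * part["volume_ul"] / total_volume_ul(components)


def fold_dilution(components: dict, name: str) -> float:
    """How many-fold a component is diluted in the final mix."""
    return total_volume_ul(components) / components[name]["volume_ul"]


if __name__ == "__main__":
    print(total_volume_ul(COMPONENTS))                              # assay volume, microlitres
    print(final_concentration_mM(COMPONENTS, "substrate_in_dmso"))  # substrate, mM
    print(final_concentration_mM(COMPONENTS, "gsh"))                # GSH, mM
    print(fold_dilution(COMPONENTS, "enzyme_20x"))                  # enzyme dilution factor
```

With these volumes the mix comes to 1 ml with 1.5 mM substrate and 5 mM GSH, and the 20× enzyme stock ends at 1×, in agreement with the figures quoted in the text.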
Each sample was then subjected to C18 reversed-phase HPLC using a Beckman 125NM solvent delivery module equipped with a Beckman 168 UV detector. Samples and external standards were quantified by UV absorption at 280 nm. The HPLC mobile phase was a mixture of aqueous buffer (5 mM formic acid in 95:5 water/acetonitrile) and methanol at a flow rate of 1.0 ml/min. The ratio of buffers was adjusted as follows: 0-6 min, 30% methanol; 6-15 min, gradient from 30 to 80% methanol; 15-25 min, 80% methanol; 25-26 min, gradient from 80 to 30% methanol; 26-33 min, 30% methanol. Vanillin concentrations were quantified for each time point, and a linear regression was generated over the 30-s assay period in order to calculate the specific activity of each reaction. Averages of the triplicate assays were reported.
MARCH 4, 2016 • VOLUME 291 • NUMBER 10
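The rate calculation described in the HPLC paragraph (a linear regression of vanillin concentration over the 30-s time course) can be sketched as an ordinary least-squares fit. Everything here beyond the sampling times is an assumption made for illustration: the function names, the choice of μmol min⁻¹ mg⁻¹ as the activity unit, the enzyme concentration, and the synthetic vanillin readings are not taken from the paper.

```python
def linear_fit(x: list, y: list) -> tuple:
    """Ordinary least-squares slope and intercept for paired observations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def specific_activity(times_s: list, vanillin_mM: list, enzyme_mg_per_ml: float) -> float:
    """Specific activity in micromol product per minute per mg enzyme (assumed unit).

    1 mM equals 1 micromol/ml, so slope (mM/s) * 60 gives micromol/ml/min;
    dividing by the enzyme concentration (mg/ml) gives micromol/min/mg.
    """
    slope_mM_per_s, _ = linear_fit(times_s, vanillin_mM)
    return slope_mM_per_s * 60.0 / enzyme_mg_per_ml


if __name__ == "__main__":
    times = [0, 6, 12, 18, 24, 30]          # sampling times from the protocol, seconds
    vanillin = [0.002 * t for t in times]   # synthetic, perfectly linear readings (mM)
    print(specific_activity(times, vanillin, enzyme_mg_per_ml=0.01))  # hypothetical enzyme level
```

Triplicate assays would each be fitted this way and the resulting activities averaged, as the text describes.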

X-ray Data Collection and Structure Determination-The LigE crystals were placed in a reservoir solution containing 10-20% (v/v) glycerol and then flash-cooled in liquid nitrogen. The x-ray data sets for LigE were collected at the Berkeley Center for Structural Biology beamlines 8.2.1 and 8.2.2 of the Advanced Light Source at Lawrence Berkeley National Laboratory. LigF crystals were cryoprotected with a reservoir solution containing 30% polyethylene glycol monomethyl ether 2000 and 1 mM GSH. X-ray diffraction data were collected at Life Sciences Collaborative Access Team Sector 21 with x-ray wavelength 0.9793 Å at the Advanced Photon Source at Argonne National Laboratory. Data sets were indexed and scaled using HKL2000 (28). The LigF crystal structure was determined by molecular replacement using the program PHASER (29) within the Phenix suite (30) with the coordinates of a LigF homologue (Lig37), whose sequence was identified from a metagenomic analysis of a rice-straw-enriched compost microbial community (Berkeley, CA) (31,32). The crystal structure of LigE was solved using selenomethionine-labeled protein by the single-wavelength anomalous dispersion method (33) with the phenix.autosol (34) and phenix.autobuild (35) programs. Structure refinement was performed using the phenix.refine program (36). Manual rebuilding using COOT (37) and the addition of water molecules allowed construction of the final models. Root mean square deviation differences from ideal geometries for bond lengths, angles, and dihedrals were calculated with Phenix (30). The overall stereochemical quality of all final models was assessed using the program MOLPROBITY (38), and all figures were generated in PyMOL (39). Structures were observed and analyzed using a stereoscopic television display (40).
Small Angle X-ray Scattering-LigE and LigF were dialyzed for 15 h at 4°C into buffer containing 10 mM HEPES, pH 7.5, 50 mM sodium chloride, 1 mM GSH, and 0.5 mM tris(2-carboxyethyl)phosphine. Prior to data collection, samples were filtered through a 0.2-μm syringe filter and diluted to the working concentrations. After dilution, samples were clarified via centrifugation. The buffer blank was also syringe-filtered and clarified by centrifugation. Small angle scattering data were collected on a Bruker NANOSTAR x-ray generator located at the National Magnetic Resonance Facility at the University of Wisconsin (Madison, WI). Three data collections of 1 h each were taken for each sample and buffer. Data were merged and indexed using the Bruker NANOSTAR small angle x-ray scattering system software (Bruker AXS, Madison, WI). The scattering intensity was obtained by subtracting the scattering of the buffer blank from the sample scattering using the PRIMUS software (41). All SAXS data were processed using GNOM, integrated in the PRIMUS software, to obtain the pair distance distribution function (42). The GNOM output was used with DAMMIF to calculate 10 ab initio dummy atom models (43). Models were averaged using DAMAVER and aligned to x-ray crystal structures using SUPCOMB (44,45). Theoretical scattering curves for the x-ray crystal structure of LigE and a model of the dimer of LigF were calculated using CRYSOL (46).
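The first reduction step described above (combining the three 1-h collections and subtracting the buffer blank) amounts to point-by-point arithmetic on intensity curves. The sketch below shows only that arithmetic; it is not the Bruker or PRIMUS implementation, the function names are hypothetical, all curves are assumed to share one q grid, and the intensities are made up. Real software also propagates errors and may scale the curves before subtraction.

```python
def average_curves(curves: list) -> list:
    """Point-by-point mean of several intensity curves recorded on the same q grid."""
    if not curves:
        raise ValueError("no curves supplied")
    length = len(curves[0])
    if any(len(c) != length for c in curves):
        raise ValueError("all curves must share the same q grid")
    return [sum(values) / len(curves) for values in zip(*curves)]


def subtract_buffer(sample: list, buffer: list) -> list:
    """Buffer-subtracted scattering intensity, I_sample(q) - I_buffer(q)."""
    if len(sample) != len(buffer):
        raise ValueError("sample and buffer must share the same q grid")
    return [s - b for s, b in zip(sample, buffer)]


if __name__ == "__main__":
    # Made-up intensities at three q values for three exposures each.
    sample_runs = [[10.0, 6.0, 3.0], [11.0, 6.5, 3.5], [12.0, 7.0, 4.0]]
    buffer_runs = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
    net = subtract_buffer(average_curves(sample_runs), average_curves(buffer_runs))
    print(net)
```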
Molecular Docking-Docking of MPHPV to the LigFΔ242-GSH structure was performed using the SwissDock server (47,48). Docking was performed using the "Accurate" parameter and otherwise default parameters, with the search space limited to a 10 × 10 × 10 Å region around the GSH binding site. Both the protein and the MPHPV ligand were rigid during docking. The structure of MPHPV was built in ChemDraw (49) and converted to three-dimensional coordinates using Open Babel (50). Docking results were visualized and screened using the UCSF Chimera molecular modeling system (51).
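The restricted search space can be pictured as an axis-aligned cube placed on the bound GSH. The sketch below is a geometric illustration only and does not reproduce how SwissDock defines or applies its search region: the helper names are hypothetical, the cube is assumed to be centered on the centroid of the GSH atoms, and the coordinates are invented.

```python
def centroid(coords: list) -> tuple:
    """Arithmetic mean position of a set of (x, y, z) coordinates."""
    if not coords:
        raise ValueError("no coordinates supplied")
    n = len(coords)
    return tuple(sum(axis) / n for axis in zip(*coords))


def in_search_box(point: tuple, center: tuple, edge: float = 10.0) -> bool:
    """True if the point lies inside a cube of the given edge length (in angstroms)."""
    half = edge / 2.0
    return all(abs(p - c) <= half for p, c in zip(point, center))


if __name__ == "__main__":
    # Invented GSH atom positions; a real run would read them from the crystal structure.
    gsh_atoms = [(10.0, 10.0, 10.0), (12.0, 10.0, 10.0), (11.0, 13.0, 10.0)]
    box_center = centroid(gsh_atoms)
    print(box_center)
    print(in_search_box((14.0, 12.0, 9.0), box_center))   # candidate ligand atom near GSH
    print(in_search_box((25.0, 12.0, 9.0), box_center))   # candidate ligand atom far from GSH
```

A pose filter of this kind could be used when screening docking output, keeping only poses whose atoms fall in the region of interest.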

Results
Structural Analysis-Attempts to solve the structure of full-length wild-type LigE (282 residues) and LigF (254 residues) were unsuccessful, but C-terminal truncation constructs of both proteins were generated, successfully crystallized, and used for structural analysis. Truncations of LigE and LigF were designed based on homology models generated by I-TASSER Online and disorder predictions generated using PONDR (52,53). LigEΔ255 and the LigEΔ255-GSH complex crystallized in the space group C2 with four molecules in the asymmetric unit with electron density for the bound GSH molecule. LigFΔ242-GSH crystallized in the space group P6₃22 with one molecule in the asymmetric unit. Well defined electron density corresponding to the GSH molecule is also visible in the structure. Data collection, refinement, and model statistics for LigE and LigF are summarized in Table 1. Consistent with their classification as GST enzymes, LigE and LigF each adopt the canonical GST domain fold with an N-terminal thioredoxin domain (residues 1-82 and 1-76, respectively) and a C-terminal α-helical domain (residues 93-255 and 93-242, respectively) connected by a short linker (residues 83-92 and 77-92, respectively) (Fig. 2). In both LigE and LigF, the thioredoxin domain comprises four β-strands and three α-helices following the topology β1α1β2α2β3β4α3. The loop between β1 and α1 is longer in LigE than in LigF and occupies the space between the thioredoxin domain and the α-helical domain, whereas in LigF, this loop is moved away from the domain interface toward the surface of the thioredoxin domain. The loop between β2 and α2 is longer in LigF than in LigE, but both interact with the α-helical domain on the protein face opposite the linker (Fig. 2). The C-terminal domains of LigE and LigF are composed of six and eight α-helices, respectively.
The root mean square deviation between the Cα locations of monomers of LigE and LigF is 4.42 Å, indicating that, although they catalyze very similar reactions, the enzymes display significant structural differences.
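A root mean square deviation of this kind is computed over paired atomic positions. The sketch below assumes the two Cα sets have already been matched residue-to-residue and superposed; the pairing and superposition procedure behind the 4.42 Å value is not described in this section, and the function name and coordinates are illustrative.

```python
import math


def rmsd(coords_a: list, coords_b: list) -> float:
    """Root mean square deviation between paired, pre-superposed (x, y, z) positions."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Two toy "structures" whose atoms are each displaced by 2 angstroms along z.
    model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    model_b = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
    print(rmsd(model_a, model_b))
```

Because the value depends on which residues are paired and how the structures are superposed, RMSDs from different alignment protocols are not directly comparable.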
Biochemical and small angle x-ray scattering data suggest that both LigE and LigF exist as dimers in solution, and these dimers, related by 2-fold symmetry, can be seen in the respective crystal structures. The dimer interface accounts for 1,066 and 1,092 Å² of buried surface area in LigE and LigF, respectively (PISA European Bioinformatics Institute) (54). The overall dimeric shapes of both LigF and LigE were confirmed using small angle x-ray scattering on both the truncated and full-length proteins. The protein envelopes determined by ab initio modeling align well with the crystal structures of both proteins (Fig. 2). The theoretical scattering curves predicted from the x-ray structures match well with the experimentally determined scattering curves, with χ values of 2.4 and 1.4 for LigF and LigE, respectively (Fig. 3).
The LigF dimer forms via interactions between helices α3 and α4, in the thioredoxin and C-terminal domains, respectively, of each monomer, forming a four-helix bundle. The dimer interface is largely polar, lacking the traditional lock-and-key motif or hydrophobic surface common in other GST dimers, specifically the Alpha, Pi, and Mu classes (14,15). The LigF dimer more closely matches those of the Beta or Theta class, which, like LigF, lack a hydrophobic lock-and-key motif, and there is no open V-shape to the dimer interface (14). Although the arrangement and characterization of the dimer forms in GST structures differ within and between classes, most are canonically anchored through contacts between α3 (or the final helix of the thioredoxin domain) and α4 (the first helix in the C-terminal domain) (13,15,55,56). Variability in the arrangement of secondary structural elements away from the α3/α4 four-helix bundle changes the total buried surface area of the various GST dimers as well as changing the architecture of the enzyme in the vicinity of the active site (57). Representative structures demonstrating the variability of dimer packing in GSTs are shown in Fig. 4. The Alpha (Protein Data Bank entry 1GUH, human GST A1-1), Mu (2GST, rat), Pi (2GSR, pGST P1-1 from pig), Sigma (1GSQ, squid), Theta (1LJR, human hGST T2-2), Beta (2PMT, bacterial GST from Proteus mirabilis), Omega (3LFL, human GST Omega-1), and LigG (4G10, Sphingobium sp. SYK-6) dimers show variations on the α3/α4 canonical four-helix bundle dimer structure (58-65). In the LigE dimer, helix α4 of one monomer is interdigitated between α4 and α7 of the other monomer, and the entire dimer interface is contained within the α-helical domain. The dimer is anchored by a hydrophobic lock-and-key motif in which Phe-101 of each monomer is in a hydrophobic pocket formed in the second monomer.
This motif is seen in several GST classes, including Alpha, Mu, and Phi, which display the more typical four-helix bundle dimer mode rather than the elongated dimer of LigE (57). This elongated atypical dimer form in a GST was first described for GSTFuA1 from Phanerochaete chrysosporium (Fig. 4) in the GSTFuA class of GSTs, of which LigE is a member (13,18). An additional β-hairpin motif between α2 and β3 in GST5118 hinders the formation of the regular α3/α4 GST dimer; however, this β-hairpin is not present in LigE (13). An extended loop between α5 and α6 in the C-terminal domain, which protrudes above the normal α3/α4 packing site, may be responsible for the alternate dimer formation in LigE. LigG, a known GST lyase in the Sphingobium β-ether degradation pathway, is an Omega class GST with the canonical α3/α4 GST dimer with a wider opening in the C-terminal domain, allowing for a proposed substrate binding site on the same face as that in LigE (65).
The enzymatic active sites of these GST family members are often located in a cleft between the thioredoxin domain and the α-helical domain (15). Both the LigE and LigF enzymes contain the ββα motif required for anchoring GSH in the active site (56). In LigF, Glu-65 and Ser-66, located in the turn connecting β4 and α3, recognize the γ-glutamyl moiety of GSH as part of the ββα motif (Fig. 5A). Additionally, Gln-52 and the backbone of Val-53 interact with the cysteinyl moiety, whereas Gln-144, His-40, Tyr-148, and Gln-39 anchor the glycine residue of the active site GSH molecule. In LigE, Asp-71 and Ser-72, both located in the turn between β4 and α3, hydrogen-bond with the amino and carboxylate groups, respectively, of the γ-glutamyl residue of the GSH molecule (Fig. 5B). Additionally, the backbone of Val-59 interacts with the cysteinyl moiety, whereas weak hydrogen bonds are formed between the GSH glycine and Arg-138 and Tyr-133.
Due to the occlusion of one face of the GSH binding pocket in LigF, we propose that the substrate binding site is located on the opposite face of the LigF monomer from the dimer interface (Fig. 2B, black circle). In the absence of a substrate-bound structure, SwissDock (47,48) was used to generate a LigFΔ242-GSH·(βS)-MPHPV complex model (Fig. 6A) from the LigFΔ242-GSH structure and a molecular model of (βS)-MPHPV. The model supports our assignment of the substrate binding site. However, in LigE, this side of the GSH binding pocket is blocked by a number of loops, whereas the face of the GSH binding site shared with the dimer has been opened, due to the dimer rearrangement (Fig. 2D, black circle). Based on the binding site of GSH in LigE, we propose a potential location for the native substrate-binding site at the highly hydrophobic region consisting of residues Tyr-23, Phe-45, Trp-107, Phe-115, Phe-142, and Trp-197 (Fig. 6B). The aromatic rings of these hydrophobic residues are probably important in stacking interactions with the aromatic compounds from low molecular weight lignin derivative compounds.
Enzymatic Analysis and Mutagenesis-To analyze the enzymatic activities of the GSH-dependent β-etherase enzymes, FPHPV degradation rates were measured by the accumulation of vanillin, a monoaromatic product of FPHPV cleavage (Figs. 7 and 8). Whereas β-etherase catalysis with MPHPV results in the release of guaiacol (Fig. 1), vanillin is more easily detected by UV absorption, thus improving the sensitivity of the assays. In addition to LigE and LigF, we tested the rates of β-etherase catalysis for LigE variant LigE-S21A and two LigF variants, LigFΔ242 and LigFΔ242-S13A.
We found that LigE catalysis resulted in stereospecific (βR)-FPHPV cleavage, whereas LigF selectively degraded the (βS)-FPHPV enantiomer, as is consistent with previous reports (7,9). The effect of pH on β-etherase activities was determined for each enzyme, revealing that LigE, LigF, LigFΔ242, and LigFΔ242-S13A have pH optima at pH 8.0 (Fig. 8). The activity of LigE was relatively unaffected by pH, whereas the activity of LigFΔ242 and LigFΔ242-S13A was significantly reduced above pH 8.0. The truncated LigFΔ242 exhibited higher rates of catalysis than full-length LigF at all pH values, indicating that the predicted disordered region in the C terminus may actually be inhibitory to β-etherase activity. The specific activities of LigE-S21A and LigFΔ242-S13A were 14% and <5% (Fig. 8 and Table 2) of the wild type and LigFΔ242, respectively, consistent with the structure-based predictions that these serine residues are involved in catalysis. Given the proximities of LigE Ser-21 and LigF Ser-13 hydroxyls to the GSH thiol (4.1 and 5.4 Å, respectively; Fig. 5) and because the specific activities of the β-etherases did not steadily increase as a function of increasing pH (Fig. 8), it is unlikely that these serine residues activate the GSH thiol for nucleophilic attack; rather, they act in GSH binding or thiol orientation or serve a different catalytic purpose.
Because LigF Ser-13 was the only potential acid-base catalyst revealed in the active site of the LigFΔ242-GSH structure (Fig. 5A), we hypothesized that an SN2-type nucleophilic attack mechanism is responsible for catalysis in LigF. The LigFΔ242-GSH·(βS)-MPHPV complex model, generated using SwissDock (47,48), revealed that the GSH thiolate is in the appropriate orientation for an SN2 attack relative to the substrate β-carbon.
The LigEΔ255-GSH structure revealed several potential catalytic residues in the active site, leaving open the possibility that the LigE β-etherase mechanism involves additional acid-base reactions. A substrate analog model compound, (βS)-fluoro-(1′-formyl-3′-methoxyphenoxy)-γ-hydroxypropioveratrone [(βS)-F-FPHPV] (Fig. 7), was used to test the possibility of a non-SN2 mechanism that would involve the deprotonation of the β-carbon of the substrate. (βS)-F-FPHPV and (βR)-FPHPV (despite their Cahn-Ingold-Prelog-derived R/S notations (66)) have the same enantiomeric configuration with respect to the orientation of their β-ether bonds and differ only in replacement of the hydrogen at the β-carbon in (βR)-FPHPV with a fluorine in (βS)-F-FPHPV, and this fluorine is predicted to prohibit deprotonation. We found that LigE catalyzed conversion of (βS)-F-FPHPV to vanillin and a glutathione-conjugated coproduct, albeit at a much lower velocity compared with cleavage of (βR)-FPHPV (Table 2), exactly as predicted based on the hypothesis that an SN2 catalytic mechanism would not involve deprotonation of the β-proton. Based on NMR analysis of the reaction products, we conclude that the LigE-catalyzed β-ether cleavage of (βS)-F-FPHPV resulted in formation of the expected glutathione-conjugated product, (βS)-F-GS-HPV. Although it is unclear why the reaction with (βS)-F-FPHPV was some 3 orders of magnitude slower than LigE-catalyzed cleavage of (βR)-FPHPV (Table 2), we hypothesize that the fluorine atom affects the β-ether bond angle and inhibits the approach of the thiolate ion for SN2 elimination. It is possible that these effects were even more pronounced in the active site of LigF, because LigF showed no detectable activity with the (βR)-F-FPHPV enantiomer.

Discussion
The biocatalytic breakdown of lignin-derived compounds represents a potential source of aromatic products that would be valuable for the chemical, food, and pharmaceutical industries (2). In contrast to known fungal systems, the bacterium Sphingobium sp. strain SYK-6 possesses an enzymatic route to the breakdown of lignin-derived components that is stereospecific and independent of chemical mediators and requires common cellular cofactors, such as pyridine nucleotides and glutathione. These combined structural and biochemical studies of the β-aryl ether cleavage pathway enzymes provide insights into the features important for substrate and cofactor binding and catalysis. We propose that both LigE and LigF cleave β-ether-linked lignin dimer molecules via an SN2 nucleophilic attack on the β-carbon of the substrate, which is consistent with previous results showing inversion of the chiral center at the β-carbon (9). Because LigE catalyzed the conversion of (βS)-F-FPHPV to (βS)-F-GS-HPV, we conclude that the LigE mechanism is unlikely to involve formation of an enzyme-substrate adduct and does not involve Cβ deprotonation or substrate enolization.
Although the sequences and x-ray crystal structures show a conserved serine in the active site of both LigE and LigF (serine 21 and 13, respectively) near the thiol of the bound glutathione (4.1 and 5.4 Å, respectively; Fig. 5), the serine is not essential for catalysis. In both LigE and LigF, mutation of the active site serine greatly reduced, but did not abolish, the enzymatic activity and did not shift the pH optimum, indicating that it may play a role other than deprotonation of the GSH thiol or perturbation of the apparent pKa of the bound glutathione. A conserved catalytic serine is a characteristic of the Theta class, Zeta class, and some bacterial GSTs (15), but there is evidence of GSTs from the bacteria P. mirabilis, Ochrobactrum anthropi, and E. coli in which this active site serine is not critical for catalytic activity (67-69). Based on the data presented here and support from previous studies, it is clear that although the active site serine is not responsible for the direct activation of the thiolate anion by deprotonation or perturbation of the pKa of the bound glutathione, it may be active in binding GSH in the active site, orienting the sulfhydryl group of GSH in the catalytic step, or stabilizing the transition state. Because GSH-dependent cleavage of these molecules does not occur readily in vitro in the absence of enzyme, it may be that the enzyme is able to stabilize the thiolate anion via a network of interactions within the active site or that the binding of the substrates in the optimal orientation and distance for the SN2 attack is sufficient for catalysis.
The structures of the LigE and LigF enzymes also highlight the nature of stereospecific control that is key to this pathway. These enzymes possess dramatically different structural arrangements within the monomers and different dimer interfaces, reflected in very different dimer shapes. As a result, the substrate binding surfaces of the two enzymes are on opposite faces of the thioredoxin domain and glutathione binding site. This observation means that if a substrate with the wrong stereochemistry were to bind, it would not be in the correct orientation with respect to the glutathione for catalysis, hence introducing stereospecificity. Due to the completely different geometry of the active site, there is no simple set of mutations that would switch substrate specificity or make each individual enzyme more promiscuous.
Based on structural properties, LigE is most similar to the fungal GSTFuA class (13), suggesting that the enzymes in this class are present in both prokaryotes and fungi. Other representatives in this class are from saprotrophic fungi, suggesting a functional connection among the members of the class (18). Although it has been suggested that LigF also belongs in the GSTFuA class (13), the dimer interface present in the structure is inconsistent with other members of the class. Based on our data, LigF is best placed in a new structural class closely related to GSTFuAs or as a fungal Ure2p-like GST based on structural similarities and function in saprotrophic organisms, although it does not strictly fit the class (70). Assignments to different GST family classes, combined with the structural and biochemical information presented here, suggest that LigE and LigF evolved to cleave unique stereoisomers of the aromatic dimers that are predicted to be found in plant lignins.
The detailed structural and biochemical characterization of LigE and LigF in this study and other members of the β-aryl etherase pathway reveal important new aspects of the enzyme mechanism and the determinants of substrate stereospecificity. Future enzyme engineering studies informed by these results may focus on optimizing the pathway for catalysis of specific lignin-derived compounds, formed as the byproducts of industrial biomass processing, into suitable products for use as, or precursors of, advanced biofuels and renewable chemicals.

[Table 2, residual row and footnotes]
(βR)-F-FPHPV NDA
a Where noted (i.e. in the absence of Michaelis-Menten curves), activity is reported as the velocity from assays in which the initial substrate concentration was 1.5 mM.
b Where noted, independent assays using substrate (βR)-MPHPV and either LigE or LigE-S21A as catalysts indicated that the LigE V_max was approximately 7-fold greater than that for LigE-S21A.