Structure and specificity of a new class of Ca2+-independent housekeeping sortase from Streptomyces avermitilis provide insights into its non-canonical substrate preference

Surface proteins in Gram-positive bacteria are incorporated into the cell wall through a peptide ligation reaction catalyzed by transpeptidase sortase. Six main classes (A–F) of sortase have been identified of which class A sortase is meant for housekeeping functions. The prototypic housekeeping sortase A (SaSrtA) from Staphylococcus aureus cleaves LPXTG-containing proteins at the scissile T–G peptide bond and ligates protein-LPXT to the terminal Gly residue of the nascent cross-bridge of peptidoglycan lipid II precursor. Sortase-mediated ligation (“sortagging”) of LPXTG-containing substrates and Gly-terminated nucleophiles occurs in vitro as well as in cellulo in the presence of Ca2+ and has been applied extensively for protein conjugations. Although the majority of applications emanate from SaSrtA, low catalytic efficiency, LPXTG specificity restriction, and Ca2+ requirement (particularly for in cellulo applications) remain a drawback. Given that Gram-positive bacteria genomes encode a variety of sortases, natural sortase mining can be a viable complementary approach akin to engineering of wild-type SaSrtA. Here, we describe the structure and specificity of a new class E sortase (SavSrtE) annotated to perform housekeeping roles in Streptomyces avermitilis. Biochemical experiments define the attributes of an optimum peptide substrate, demonstrate Ca2+-independent activity, and provide insights about contrasting functional characteristics of SavSrtE and SaSrtA. Crystal structure, substrate docking, and mutagenesis experiments have identified a critical residue that dictates the preference for a non-canonical LAXTG recognition motif over LPXTG. These results have implications for rational tailoring of substrate tolerance in sortases. Besides, Ca2+-independent orthogonal specificity of SavSrtE is likely to expand the sortagging toolkit.

Several proteins of Gram-positive bacteria after their translocation across the membrane are covalently attached to the cell wall by action of a unique family of cysteine transpeptidases referred to as sortases (1)(2)(3). Surface proteins meant for covalent anchoring harbor a cell wall sorting signal in the C-terminal region that contains a sortase-recognition LPXTG type of pentapeptide sequence motifs (4). The general mechanism of sortase-mediated anchoring involves two steps in which the enzyme first cleaves the T-G peptide bond and captures the protein by forming an enzyme-thioacyl intermediate. This thioacyl intermediate is resolved by nucleophilic attack of an amine moiety from the peptidoglycan cross-bridge (1). Importantly, many surface proteins that are sortase substrates play crucial roles in immune evasion, biofilm formation, and other critical processes associated with bacterial pathogenesis (5,6). Sortase knock-outs display reduced virulence with retention of cellular viability (7)(8)(9). Hence, sortases are viewed as attractive targets for the development of novel therapeutics, especially against antibiotic-resistant strains (10,11).
Gram-positive bacteria encode multiple sortases and a variety of substrates (12). Accordingly, sortases are grouped into six classes (A-F) based on their sequence similarity, membrane topology, function, or substrate specificity (13)(14)(15). Class A enzymes anchor a large number of proteins and act as housekeeping sortases. Sortases belonging to classes B-D perform specialized functions. The roles of sortases E and F remain somewhat unclear, although class E sortases are believed to function like housekeeping enzymes. Sortase A (SrtA), first discovered in Staphylococcus aureus, is considered the prototype sortase (3). SaSrtA recognizes the LPXTG motif in S. aureus surface proteins and anchors them to the cell wall by forming a peptide bond to the pentaglycine arm of the peptidoglycan (16). Sortase-mediated peptide ligation between short synthetic peptides or large engineered proteins embedded with the LPXTG motif and an aminoglycine-derivatized moiety occurs well in vitro (17) and has enabled unprecedented applications of SaSrtA in protein engineering (18). SaSrtA or its variants remain the workhorse of most sortagging applications (19). The housekeeping sortases of Streptococcus pyogenes and Lactobacillus plantarum are the only other enzymes that have been employed, although with limited utility (20,21). Furthermore, newer challenging applications of sortases would require the availability of enzymes endowed with enhanced catalytic efficiency as well as orthogonal specificities in both donor and acceptor polypeptides. Moreover, Ca2+-independent sortase activity is desirable to facilitate intracellular sortagging applications (19,22,23). The mining of bacterial genomes for improved sortase enzymes capable of recognizing diverse sorting motifs appears very attractive in this regard.
The structure of SaSrtA obtained by both NMR and X-ray crystallography highlights the presence of a typical "sortase" fold comprising an eight-stranded β-barrel architecture (24,25). The active site is composed of the catalytic residues His-120, Cys-184, and Arg-197 situated at the end of the long groove along one side of the β-barrel connected by random coil loops and two short helices. The loops connecting the β2/β3, β3/β4, β6/β7, and β7/β8 strands form the walls of the groove. Residues located in the β6/β7 loop play crucial roles in substrate sorting and facilitate catalytic turnover of those peptide substrates that are endowed with a "kinked" conformation due to the Pro residue in the LPXTG substrate (24). Accordingly, LPXTG substrates containing Pro analogs and homologs capable of generating Pro-like conformational features are well tolerated, unlike LAXTG or LGXTG peptide substrates (26). Interestingly, class E sortases present in some bacteria are annotated to process non-canonical LAXTG or other previously unknown sorting signals (13). The presence of multiple class E sortases is noted in the GC-rich genomes of actinomycetes. Duong et al. (27) detected two class E sortases (SrtE1 and SrtE2) in Streptomyces coelicolor and used knock-out experiments to demonstrate the requirement of sortase activity in aerial development. Aerial hyphae formation requires "chaplin" proteins, some of which contain LAXTG as a potential sorting signal (28). The genome of Streptomyces avermitilis has revealed the presence of four putative SrtE enzymes of which SrtE3 (SAV4333) is distinct from SrtE1 and SrtE2 of S. coelicolor (27,29).
This study was initiated with a view to define the substrate specificity of SrtE3 from S. avermitilis and to delineate the structural features responsible for recognition of a non-canonical LAXTG peptide substrate. Our results of transpeptidation assays carried out using a battery of peptides reveal the attributes of a good sortase E substrate not previously recognized in classical SaSrtA or other sortases. The crystal structure of the enzyme together with bioinformatics, modeling, and mutagenesis provide insights into the altered substrate specificity of this new class E housekeeping sortase.

Design and expression of a catalytic domain of SrtE3 (SavSrtE)
S. avermitilis genome encodes at least four putative class E sortases. SrtE3 (SAV4333) is a 230-residue protein composed of an N-terminal transmembrane region (residues 12-32) and a catalytic domain (residues 83-214). We chose to express the N-terminal truncated version of the enzyme to facilitate solubility. Accordingly, the sequence representing residues 51-230 was expressed with a hexa-His tag at the N terminus. The N-terminal 50-residue deleted construct of SrtE3 referred to as SavSrtE was used for the study.
The Ni-NTA affinity-purified SavSrtE migrated as a single band around 20-25 kDa on SDS-PAGE indicating high purity of the protein preparation (Fig. 1A). However, ES-MS analysis yielded two components (22,157 and 22,335 Da, respectively) whose masses differed by 178 Da (Fig. 1B). The mass of the first component fits the calculated mass of the expressed protein sequence (22,289 Da) minus the mass of a Met residue, indicating processing of the N-terminal Met by Escherichia coli aminopeptidases. The addition of 178 Da presumably arose due to N-gluconylation of the des-Met protein. Such modifications have been noted earlier in many N-terminal His-tagged proteins (30).
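The mass assignment above can be checked with simple arithmetic. The sketch below is illustrative only: the average residue mass of Met (131.19 Da) and the gluconoyl adduct mass (178.14 Da) are standard textbook values assumed here, not numbers reported in this work, and the function names are hypothetical.

```python
# Average masses in Da; assumed textbook values, not taken from the text.
MET_RESIDUE = 131.19   # methionine residue lost on N-terminal processing
GLUCONOYL = 178.14     # adduct added by N-gluconylation

CALCULATED_MASS = 22289.0          # calculated mass of the expressed sequence
OBSERVED = (22157.0, 22335.0)      # the two ES-MS components


def des_met_mass(full_mass: float) -> float:
    """Expected mass after removal of the N-terminal Met."""
    return full_mass - MET_RESIDUE


def gluconylated_mass(mass: float) -> float:
    """Expected mass after addition of one gluconoyl group."""
    return mass + GLUCONOYL


expected_des_met = des_met_mass(CALCULATED_MASS)
expected_gluconylated = gluconylated_mass(expected_des_met)

for label, expected, observed in (
    ("des-Met", expected_des_met, OBSERVED[0]),
    ("des-Met + gluconoyl", expected_gluconylated, OBSERVED[1]),
):
    print(f"{label}: expected {expected:.1f} Da, observed {observed:.0f} Da, "
          f"difference {observed - expected:+.1f} Da")
```

Under these assumptions both observed components fall within about 1 Da of the expected values, which is consistent with the interpretation given in the text.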

Expressed SavSrtE is an active sortase and prefers Ala over Pro at the second position of the pentapeptide-sorting motif
We carried out transpeptidation assays with a variety of peptide substrates to define the substrate preference and tolerance of SavSrtE. We generated several substrate variants (Fig. 2A) by incorporating proline (Pro), 4-hydroxyproline, 3,4-dehydroproline, azetidine-2-carboxylic acid, glycine (Gly), alanine (Ala), α-aminoisobutyric acid, and β-alanine (β-Ala) at the second position of the pentapeptide motif in a YALXNTGK model peptide template based on our earlier studies with SaSrtA (26).
We first evaluated the SavSrtE-catalyzed transpeptidation reaction of the Ala-containing peptide (YALXNTGK, where X = Ala) using GGGKY as an acceptor. The reaction was carried out by incubating the peptide (0.5 mM) with GGGKY (1 mM) and SavSrtE (50 μM) at 20°C. An aliquot from the reaction mixture was analyzed by RP-HPLC (inset, Fig. 2B) as described under "Experimental procedures," and the product was characterized by mass spectrometry (data not shown). The time course of the reaction depicted in Fig. 2B shows that the reaction attained equilibrium in about 5-8 h and produced a 20% yield. Subsequently, we carried out the transpeptidation assay of each YALXNTGK peptide (X = 1-8) under identical conditions for 6 h and quantified the product (Fig. 2C). Interestingly, Gly- or β-Ala-containing peptides did not yield any product, indicating a role of peptide conformation in substrate recognition. The α-aminoisobutyric acid peptide produced about 15% transpeptidation yield as compared with about 20% by the Ala peptide. However, product formation with the Pro peptide was reduced to almost half that of the Ala peptide (~9%) and was significantly diminished (5% or less) with the other Pro analogs.

Residues flanking the LAXTG motif play a significant role in substrate recognition
The above results show that Pro analogs with Pro-like conformational attributes are poor substrates of SavSrtE as compared with SaSrtA. In contrast, Ala or its analogs at this position are more effective substrates. However, the overall equilibrium yield of about 20% obtained with SavSrtE using the YALANTGK substrate was less than half that produced by SaSrtA (~45%) with the counterpart YALPNTGK peptide (26).
What could be the reason for this lower product conversion? Can residues flanking the LAXTG motif play a role? A closer inspection of the putative sortase protein substrates encoded in the S. avermitilis genome revealed the presence of neutral residues, such as Asn, Ala, or Gly, flanking the LAXTG pentapeptide motif. This prompted us to investigate whether the presence of a Lys residue in YALANTGK exerted any deleterious effect on the transpeptidation reaction. Accordingly, we synthesized the Ala (neutral)- or Glu (acidic)-terminated peptides in addition to Lys-terminated YALANTGK to generate the LANTG substrate flanked by Ala/Ala, Ala/Glu, and Ala/Lys residues. Interestingly, the peptide containing the Ala/Ala-flanking residues produced an equilibrium transpeptidation yield (~40%) that was about 5-fold higher than that of the Ala/Glu peptide (~8%) and 2-fold or more higher than that of the Ala/Lys peptide (~18%) (Fig. 3A).
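The fold differences quoted for the flanking-residue peptides follow directly from the approximate equilibrium yields. A minimal sketch, using only the rounded percentages given in the text (the dictionary and function names are hypothetical):

```python
# Approximate equilibrium transpeptidation yields (%) quoted in the text
# for the LANTG substrate with different flanking residue pairs.
YIELDS = {"Ala/Ala": 40.0, "Ala/Glu": 8.0, "Ala/Lys": 18.0}


def fold_over(reference: str, other: str) -> float:
    """Ratio of the reference yield to the yield of another peptide."""
    return YIELDS[reference] / YIELDS[other]


for pair in ("Ala/Glu", "Ala/Lys"):
    print(f"Ala/Ala versus {pair}: {fold_over('Ala/Ala', pair):.1f}-fold")
```

Because the yields are themselves rounded estimates, the ratios should be read as approximate.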
We also replaced the Ala residue preceding Leu of the motif to create an Asn/Ala-flanking pair and generated two peptides corresponding to the LAXTG and LPXTG motifs, respectively. The YNLAETGA peptide yielded about 35% product, which was comparable with that obtained with YALANTGA, suggesting that the Asn/Ala pair behaved in a similar fashion to the Ala/Ala pair. Interestingly, the equilibrium yield in the case of the YNLPETGA peptide was found to be about 12-14% as compared with about 9% obtained with YALPNTGK, suggesting that flanking residues are important in the context of both LAXTG and LPXTG substrates (Fig. 3B).

Structure of SavSrtE
SavSrtE crystals (dimensions ~0.5 × 0.1 × 0.05 mm) diffracted to 1.65 Å. The protein crystallized in the P3₂21 space group with unit cell parameters a = b = 85.84 Å, c = 48.20 Å, α = β = 90°, γ = 120° and a monomer in the asymmetric unit (Table 1). The modeled structure lacked interpretable electron density for residues 51-73 (KGTVRPTAAPGASARTSPEKPAP) and residues 201-204 (EWGH). SavSrtE displayed the typical eight-stranded β-barrel sortase fold despite low sequence identity with other sortases (Fig. 4A). The overall structure of SavSrtE resembles SrtA more closely than sortases of class B or C (Fig. 4B). However, the β6/β7 loop (21 residues) in SavSrtE contains two 3₁₀ helices and is slightly longer than the equivalent loop in class A sortases (16-18 residues) but is shorter than in class B sortases (~38 residues). Although the first 3₁₀ helix is similar to that of SaSrtA and is expected to contact the bound peptide, the second helix partially overlaps with the signature α-helix present in sortase B. Additionally, the loop joining the β1 and β2 strands is somewhat longer relative to other sortases. A cis-peptide (Gly-165-Pro-166) is present toward the C terminus in the β6 strand, resulting in an additional anti-parallel wide β-bulge following the anti-parallel classic β-bulge present in the β6 strand of other sortase structures.
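As a plausibility check on the statement that the asymmetric unit holds one monomer, the unit cell parameters can be converted into a Matthews coefficient. This calculation is not reported in the text; the sketch below assumes six asymmetric units per cell for P3₂21, the des-Met mass from ES-MS (22,157 Da) as the protein mass, and Matthews' conventional factor of 1.23 Å³/Da for the protein partial volume. All function names are hypothetical.

```python
import math


def trigonal_cell_volume(a: float, c: float, gamma_deg: float = 120.0) -> float:
    """Volume (cubic angstroms) of a cell with a = b and alpha = beta = 90 degrees."""
    return a * a * c * math.sin(math.radians(gamma_deg))


def matthews_coefficient(volume: float, z: int, mass_da: float,
                         n_per_asu: int = 1) -> float:
    """Crystal volume per unit of protein mass (cubic angstroms per Da)."""
    return volume / (z * n_per_asu * mass_da)


def solvent_fraction(vm: float) -> float:
    """Approximate solvent fraction from Matthews' relation 1 - 1.23 / Vm."""
    return 1.0 - 1.23 / vm


# Cell parameters from the text; Z = 6 and the mass are assumptions.
volume = trigonal_cell_volume(a=85.84, c=48.20)
vm = matthews_coefficient(volume, z=6, mass_da=22157.0)
solvent = solvent_fraction(vm)

print(f"Cell volume: {volume:.0f} cubic angstroms")
print(f"Matthews coefficient: {vm:.2f} cubic angstroms per Da")
print(f"Estimated solvent content: {100 * solvent:.0f}%")
```

Under these assumptions the Matthews coefficient is about 2.3 Å³/Da, a value in the range commonly seen for protein crystals, so a single monomer per asymmetric unit is reasonable.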

Description of active site
Catalysis in sortases is performed by a triad of highly conserved His, Cys, and Arg residues. The Cys residue is responsible for the formation of the thioacyl-enzyme intermediate; His acts as a proton donor for the leaving group, and Arg is believed to stabilize the intermediate and also facilitate the positioning of the substrate in the active-site cleft (31,32). The SavSrtE active-site residues, His-129 (C-terminal to β4), Cys-198 (C-terminal to β7), and Arg-207 (N-terminal to β8), are located at one edge of the β-barrel (Fig. 5, A and B). The walls of the active-site groove in SavSrtE are formed by residues from the β2/H1 loop, β3/β4 loop, β4 strand, β6/β7 loop, β7 strand, β7/β8 loop, and β8 strand. Residue Cys-198, however, was associated with an additional blob of electron density that resembles a partial modification of Cys to S,S-(2-hydroxyethyl)thiocysteine (Cme, occupancy 0.7 beyond the SG atom in the crystal structure) by β-mercaptoethanol present in the protein solution. The distance between the SG atom of the modified Cys and the ND1 atom of His-129 is 4.8 Å, compared with >5 Å in other apo-sortase A structures except for a value of 3.6 Å observed in Bacillus anthracis sortase A (PDB code 2kw8).

SavSrtE lacks a Ca2+-binding site and contains a "preformed" pocket for substrate binding
The β6/β7 loop in SaSrtA and other sortases has been demonstrated to play an incisive role in substrate recognition (32)(33)(34). In SaSrtA, this highly mobile loop undergoes conformational transition from a "disordered open state" (PDB codes 1ija and 1t2p) to an "ordered closed conformation" (PDB code 2kid) upon substrate binding with the formation of a 3₁₀ helix that is stabilized by a Ca2+ ion (Fig. 5C). Analysis of the holo-structure bound with Ca2+ (PDB code 2kid) showed that Ca2+ balances the electrostatic repulsion among residues Glu-105, Glu-108, and Asp-112 on the β3/β4 loop and Glu-171 following the 3₁₀ helix in the β6/β7 loop (Fig. 5, D and E), thereby stabilizing the closed conformation of the loop (24). The equivalent residues in SavSrtE, namely Ala-113, Ala-116, Gln-120, and Pro-179, are neutral and interact via non-bonded interactions with their neighboring residues, obviating the need for an extraneous Ca2+ ion. The first 3₁₀ helix in the β6/β7 loop of SavSrtE shows relatively lower B-factors compared with the second half of the β6/β7 loop (Fig. 5B), and it assumes an ordered closed conformation, indicating the presence of a preformed binding pocket as observed in SpySrtA and BaSrtA, which lack the calcium-binding site (35,36). Consistent with the crystal structure data, the time courses of the SavSrtE-catalyzed transpeptidation reaction carried out in the absence or presence of Ca2+ ion (5 mM) under identical conditions generated almost superimposable product yield curves (Fig. 5F).

Figure 5. A, ... loop whose electron density is not observed). The insertion in the β6/β7 loop, containing a 3₁₀ helix, is highlighted by a dotted box. B, surface representation of SavSrtE, colored by B-factors from low (blue) to high (red). Active-site loops and catalytic residues are labeled. The first 3₁₀ helix in the β6/β7 loop, which is expected to interact with the bound peptide, has low B-factors, whereas the portion of the β7/β8 loop with observed electron density has high B-factors (the dots represent part of the β7/β8 loop for which interpretable electron density is not observed). C, the β6/β7 loop in the apo-structure of S. aureus SrtA (SaSrtA, PDB code 1ija, pink) is in an open conformation, whereas a 3₁₀ helix is induced upon Ca2+ binding (green sphere), which closes over the bound substrate (not shown) in the holo-structure (PDB code 2kid, cyan). Comparison of apo-SavSrtE (blue) with holo-SaSrtA and the apo-structure of B. anthracis SrtA (BaSrtA, PDB code 2kw8, orange) shows the 3₁₀ helix in the β6/β7 loop and its closed conformation, resulting in a preformed binding pocket. D, SaSrtA contains a cluster of negatively charged residues that are stabilized by Ca2+ binding. The equivalent residues in SavSrtE do not form a charged cluster that requires neutralization by a Ca2+ ion. Interactions of equivalent residues in BaSrtA involve a Lys in charge neutralization. E, electrostatic surface potential in SaSrtA showing the negatively charged residues, whereas SavSrtE has no such charged cluster. Surface electrostatics was calculated using Accelrys Discovery Studio software. The color scheme ranges from blue (for electropositive regions) to red (for electronegative regions). F, effect of Ca2+ on the transpeptidation reaction. The reactions were carried out at 20°C using 50 μM SavSrtE with YNLAETGA (0.5 mM) and GGGKY (1 mM) in the absence and presence of Ca2+ (5 mM). The reaction mixture was processed by RP-HPLC, and the product was quantified as described under "Experimental procedures."

Binding of the second substrate
Comparison of the apo- and substrate-bound structures of SaSrtA (PDB codes 1ija and 2kid) and BaSrtA (PDB codes 2kw8 and 2rui) showed that the β7/β8 loop undergoes a large conformational change upon substrate binding. The active-site Cys residue is shifted away from the peptide, displacing the β7/β8 loop and resulting in a second groove leading to the active site (Fig. 6, A and B). This second groove, formed by the β4/H2 loop, H2 helix, and β7/β8 loop, has been suggested to be the binding site for the amine nucleophile (24). A similar groove is also observed in the SavSrtE structure. Interestingly, this groove contained additional electron density that could be modeled as a glycine residue (Fig. 6, C-E).

SavSrtE shows strict specificity for Gly-based amine nucleophiles
We further explored the propensity of AAKY, Tri-DAP (L-alanyl-γ-D-glutamyl-meso-diaminopimelic acid), and a pilin motif VHVYPKN sequence corresponding to residues 185-191 of FimA pilus protein (37) to undergo the SavSrtE-catalyzed transpeptidation reaction with the YNLAETGA peptide substrate (Fig. 6F). However, none of these peptide nucleophiles were able to generate any transpeptidation product. The presence of a small amount of YNLAET peptide was observed due to slow hydrolysis of the substrate in the absence of a productive amine nucleophile (data not shown).
Thus, SavSrtE appears to display strict specificity for GGGKY as the second substrate.

Modeling of protein-substrate interactions gives insight into preference for LAXTG peptide and implicates a role for Tyr-112
In the absence of a substrate-bound structure of SavSrtE, we modeled the enzyme-substrate complex to understand the structural basis of substrate preference. The ALPNT or ALANT peptides were docked into the preformed binding pocket in the SavSrtE structure. The best docking pose for each ligand, as ascertained from the docking scores, contained the recognition motif placed in the active-site pocket of SavSrtE in an orientation similar to that observed in the peptide-bound SaSrtA structure (Fig. 7A). The side chain of Leu in the peptide is positioned in a hydrophobic cavity formed by the N-terminal stretch of the β6/β7 loop, whereas the succeeding Ala/Pro interacts with residues from the β2/H1 loop and β3/β4 loop.
The binding interaction was analyzed in terms of docking energy score and residue interaction network (Table 2). Lower docking energy scores for ALPNT binding to SaSrtA as compared with ALANT binding suggest the former as the preferred substrate for SaSrtA. Consistent with this, ALPNT forms a stronger interaction network with SaSrtA than ALANT does. Inspection of the docked structures shows that replacing the bulky Pro by an Ala residue reduces the interaction of the peptide with Leu-169 in the β6/β7 loop.
In contrast, ALANT establishes a stronger interaction network than ALPNT in SavSrtE in accordance with the preference of Ala at the second position of the pentapeptide motif (ALANT) relative to Pro (ALPNT). This may be partly ascribed to the presence of a bulky Tyr-112 as against an equivalent Ala residue (Ala-104) in SaSrtA, which is implicated in substrate recognition (38). Tyr-112 in SavSrtE presumably hinders the Pro-containing peptide from extending deeper into the active-site groove (Fig. 7, B and C), resulting in limited protein-peptide interactions. Besides, the -OH group of Tyr-112 is involved in a hydrogen bond with the backbone nitrogen of the Ala residue of the recognition motif in the ALANT, which is abolished in ALPNT peptide.

Mutational analyses reveal a critical role for Tyr-112 residue in substrate recognition and specificity discrimination
To probe the possible role of Tyr-112 in substrate recognition, we generated four mutants of SavSrtE by replacing Tyr at this site with Ala, Gly, Trp, and Phe (Table 3). The choice of Ala and Phe was based on equivalent sites in SaSrtA and sortases of class D, respectively. Gly and Trp were chosen to gauge the effect of local flexibility or steric constraint on substrate recognition. The mutants were expressed and purified as described under "Experimental procedures," and their ability to catalyze the transpeptidation reaction was explored with YNLAETGA (Ala substrate) and YNLPETGA (Pro substrate) using GGGKY as the nucleophile. The RP-HPLC profiles of the reactions carried out with either the Ala or the Pro substrate and the Tyr-112 mutants in which Tyr was replaced by Ala (Y112A), Gly (Y112G), or Trp (Y112W) did not show any transpeptidation or hydrolytic product peak. In contrast, the Y112F mutant was active on both substrates, although to different extents relative to the wild-type enzyme (Fig. 8A). At equilibrium, the Y112F mutant yielded about 18% product with the Pro substrate, which was higher than or comparable with the wild-type enzyme (~14%). However, the product yield associated with the Ala substrate for Y112F was considerably diminished (~2% for Y112F versus 38% for wild type), indicating that the Y112F mutation was particularly detrimental to the Ala substrate specificity of the enzyme (Fig. 8B).

Kinetics of Y112F mutant with Ala and Pro substrates
We carried out steady-state kinetic experiments to further characterize the role of the Tyr-112 residue in substrate recognition and specificity discrimination. Accordingly, kinetic parameters of wild-type SavSrtE and the Y112F mutant were determined on the Ala substrate (acetyl-YNLAETGA) and the Pro substrate (acetyl-YNLPETGA), respectively (Table 4). The individual values of kcat and Km with the Ala substrate for Y112F or wild-type SavSrtE were readily obtained by fitting the data to standard Michaelis-Menten kinetics or to the substrate inhibition model. However, both enzymes were subject to inhibition by the Pro substrate above 3 mM and yielded an inhibition profile that did not fit well to the classical substrate inhibition model. Frankel et al. (39) have also observed such complex inhibition of SaSrtA mutants with the Pro substrate and estimated kcat/Km values of the Pro substrate at low concentrations from the slope of the linear portion of the Lineweaver-Burk plot. We followed similar analyses to compare the relative specificity of the Pro substrate for wild-type and mutant sortase.
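For orientation, the two rate laws named above can be written as short functions. This is a sketch of the textbook forms only, assuming the "substrate inhibition model" refers to the classical uncompetitive form v = Vmax[S]/(Km + [S] + [S]²/Ki); the parameter values are hypothetical placeholders and are not fitted values from Table 4.

```python
import math


def michaelis_menten(s: float, vmax: float, km: float) -> float:
    """Initial rate under standard Michaelis-Menten kinetics."""
    return vmax * s / (km + s)


def substrate_inhibition(s: float, vmax: float, km: float, ki: float) -> float:
    """Initial rate under the classical substrate inhibition model."""
    return vmax * s / (km + s + s * s / ki)


def optimal_substrate(km: float, ki: float) -> float:
    """Substrate concentration at which the inhibited rate peaks."""
    return math.sqrt(km * ki)


# Hypothetical parameters for illustration only (concentrations in mM).
VMAX, KM, KI = 1.0, 0.5, 20.0
s_opt = optimal_substrate(KM, KI)

for s in (0.1, KM, s_opt, 10.0):
    print(f"[S] = {s:5.2f} mM  MM rate = {michaelis_menten(s, VMAX, KM):.3f}  "
          f"inhibited rate = {substrate_inhibition(s, VMAX, KM, KI):.3f}")
```

At low substrate concentration both expressions reduce to v ≈ (Vmax/Km)[S], which is why kcat/Km can still be estimated from the low-concentration regime when the full inhibition profile does not fit the classical model.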
The Y112F mutation in SavSrtE resulted in a decrease of about 12-fold in kcat and an increase of 4-fold in Km for the Ala substrate, yielding a kcat/Km (1.03 ± 0.08 M⁻¹ s⁻¹) that was about 50-fold less than that of the wild-type enzyme (50.94 ± 7.09 M⁻¹ s⁻¹). In contrast, kcat/Km of the mutant for the Pro substrate was found to be quite similar to that of the wild-type enzyme (8.60 versus 10.71 M⁻¹ s⁻¹). The data suggest that the Y112F mutation preferentially affected the Ala substrate specificity of SavSrtE.
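The comparison above can be reproduced from the four kcat/Km values quoted in the text. The sketch below uses only those numbers; the dictionary layout and function names are hypothetical.

```python
# kcat/Km values (per molar per second) quoted in the text.
KCAT_KM = {
    ("WT", "Ala"): 50.94,
    ("Y112F", "Ala"): 1.03,
    ("WT", "Pro"): 10.71,
    ("Y112F", "Pro"): 8.60,
}


def fold_loss(substrate: str) -> float:
    """Wild-type kcat/Km divided by Y112F kcat/Km for one substrate."""
    return KCAT_KM[("WT", substrate)] / KCAT_KM[("Y112F", substrate)]


def ala_over_pro(enzyme: str) -> float:
    """Ala/Pro specificity ratio (ratio of kcat/Km values) for one enzyme."""
    return KCAT_KM[(enzyme, "Ala")] / KCAT_KM[(enzyme, "Pro")]


print(f"Ala substrate, loss on Y112F: {fold_loss('Ala'):.1f}-fold")
print(f"Pro substrate, loss on Y112F: {fold_loss('Pro'):.2f}-fold")
print(f"Ala/Pro ratio, wild type: {ala_over_pro('WT'):.2f}")
print(f"Ala/Pro ratio, Y112F: {ala_over_pro('Y112F'):.2f}")

# Cross-check: a ~12-fold drop in kcat combined with a ~4-fold rise in Km
# predicts a ~48-fold drop in kcat/Km.
print(f"Predicted from kcat and Km changes: {12 * 4}-fold")
```

The roughly 48-fold loss predicted from the separate kcat and Km changes agrees with the roughly 50-fold loss computed from the kcat/Km values, and the Ala/Pro ratio inverts from greater than 1 in the wild type to less than 1 in Y112F.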
We also generated kinetic parameters for SaSrtA with the above substrates to contrast the divergent substrate specificity of classical housekeeping class A sortase (SaSrtA) with class E sortase (SavSrtE) in view of the fact that SaSrtA is known to

Discussion
Development of novel sortases endowed with enhanced catalytic efficiency and newer specificity is a topical area of intense interest (19). The major motivation for this emanates from the immediate need to expand sortagging applications beyond the specificity restrictions of SaSrtA. In this endeavor, current efforts are centered on generating newer variants of SaSrtA employing rational protein engineering (34,40) and directed evolution strategies (33,38,41). Although these approaches appear promising, contemporaneous mining of natural sortases offers rich possibilities of excavating useful enzymes with altered specificity, improved turnover, and stability. Here, we report the substrate specificity and crystal structure of a hitherto unexplored class E enzyme from S. avermitilis.
Our results demonstrating the preference of the "LAXTG" over the "LPXTG" sorting sequence (600-fold shift in Ala/Pro substrate specificity ratio from SaSrtA), the influence of the residue following the pentapeptide motif on substrate recognition, and the calcium ion-independent catalytic activity of SavSrtE distinguish this enzyme from the prototype housekeeping sortase of S. aureus. Importantly, equilibrium product conversions obtained using corresponding substrates (SavSrtE, YALANTGA, versus SaSrtA, YALPNTGK) under optimized conditions for SavSrtE (~40%) compare well with SaSrtA (~45%), corroborating the utility of SavSrtE in peptide ligation and protein engineering endeavors. This is important considering that sortases with robust activity are hard to find, and perhaps for this reason the activity of many new sortases is demonstrated through in situ detection of the product by mass spectrometric analysis of the reaction mixture. It is pertinent to mention here that Duong et al. (27) recently probed the substrate specificity of SrtE1 and SrtE2 of S. coelicolor by direct ES-MS analyses of the total reaction mixture and found multiple cleavages in the LAXTG sorting signal. Curiously, they reported predominant cleavage between the second (Ala) and the third residue (Xaa) rather than at the T-G peptide bond of the motif that is recognized by all sortases. We, however, did not see any trace of such aberrant proteolysis or transpeptidation, indicating high specificity of SavSrtE for the scissile T-G peptide bond of the pentapeptide recognition motif.
The finding that the polar residue adjacent to the T-G scissile peptide bond of the sorting motif exerts detrimental influence on substrate recognition by SavSrtE is an important result that differs from the prototype housekeeping SaSrtA, which tolerates both Ala and Lys residues at this position equally well (26). Interestingly, residues juxtaposed to the LPXTG motif in pilin sortases are known to play a critical role in substrate discrimination by pilin and housekeeping sortases during pilus biogenesis (42). The neutral residues flanking pentapeptide sorting sequence may be relevant in vivo for performing some regulatory function.
The housekeeping sortases prefer either oligoglycine, oligoalanine, or both as the second substrate in the transpeptidation reaction. For example, SaSrtA prefers oligoglycine, and SrtA of S. pyogenes prefers oligoalanine (24,35). Some housekeeping class A or other sortases might use diaminopimelic acid, as has been experimentally demonstrated in the transpeptidation reaction catalyzed by SrtB of Clostridium difficile (43). In contrast, pilin sortases use an ε-amine of a lysine residue present in the "pilin motif" of pilus constituent proteins (44). The stringent specificity of SavSrtE for aminoglycine is consistent with the serendipitous presence of a bound Gly residue in the putative site for the second substrate (Fig. 6D). Interestingly, the docking program ClusPro (45) positioned the N-terminal Gly of a GGG peptide into the same pocket in a model of LANT-bound SavSrtE, reinforcing this observation.
Modeling of the peptide substrate in the crystal structure of SavSrtE in the absence of an enzyme-substrate complex structure has been quite insightful and implicates a role for the Tyr-112 residue, which is strategically located for interaction with the Ala/Pro residue of the sorting sequence and may better complement the Ala side chain in the substrate as compared with Pro. This notion is further corroborated by the fact that Tyr-112 is conserved in all class E sortases (with LAXTG-containing substrates) in the Sortase database despite the lack of high sequence identity between them (Fig. 7, D and E), although it is not seen as conserved in the sequence alignment of sortases from all classes (Fig. 4A). The equivalent position in class D sortases, which are annotated to process an LPXTA-sorting motif, is generally occupied by a Phe residue (from structure-based sequence alignment of all sortases in the Sortase database). Analysis of B. anthracis SrtD (BaSrtD, PDB code 2ln7, NMR structure) shows that the side chain of an equivalent Phe-46 residue adopts a range of orientations of which only six are similar to the conformation of Tyr-112 in SavSrtE. There appears to be some flexibility at this Phe position in SrtD structures, whereas Tyr-112 in SavSrtE is found to be relatively rigid (all-atom B-factor, 13.2 Å²) and may cause partial steric occlusion of the Pro-containing substrate from the binding pocket. The experimental results showing the absence of activity in the Y112A, Y112G, and Y112W mutants are consistent with this analysis. Furthermore, the display of Pro substrate specificity similar to that of the wild-type enzyme, concomitant with a 50-fold decrease in Ala substrate specificity in the Y112F mutant, indicates the importance of the hydrogen bond between the -OH group of Tyr-112 and the Ala residue of the LAXTG recognition motif.
The demonstration of Tyr-112 residue of SavSrtE as a critical determinant of specificity is instructive in view of the fact that no single residue of a sortase in isolation has been observed to exert considerable discriminatory influence on substrate preference. Previously reported SaSrtA variants evolved for improved activity toward the LAETG motif contained multiple mutations together with mutation of Ala-104 residue equivalent to Tyr-112 in SavSrtE (38). Interestingly, mutation of Ala-104 in isolation (single mutant) produced an inactive enzyme (29), but reversion of the 104 site to Ala in the multiple mutation setting resulted in severalfold gain of preference for the LPETG substrate (38). The retention of native-like Pro specificity concomitant with a severalfold loss of Ala specificity in a single Y112F mutant suggests some degree of natural divergence of LPXTG to LAXTG specificity in class E sortase. SavSrtE is endowed with considerable propensity for LAXTG specificity and may be exploited as a suitable scaffold for directed evolution of sortases with enhanced catalytic efficiency. In summary, the crystal structure and substrate specificity data of SavSrtE presented here represent the first detailed description of a class E sortase. SavSrtE produces useful transpeptidation yield and can fruitfully complement SaSrtA in protein engineering applications. The Ca2+-independent activity together with preference for an altered substrate makes this enzyme an attractive tool for intracellular protein labeling. Besides, the SavSrtE crystal structure can form the basis for interrogation of substrate specificity and facilitate inhibitor design especially against the homologous class E sortases of pathogenic organisms.

Cloning, expression, and purification of SrtE3 (SavSrtE) from S. avermitilis
The gene corresponding to full-length SrtE3 (residues 1-230) was custom-synthesized by GenScript and cloned in the pUC57 vector. A gene fragment encoding a truncated version of the protein (residues 51-230) was amplified using pUC57 as template with appropriate primers (5′-CCCCCCATATGAAGGGGACGGTCCGTCCGA-3′ and 5′-CCCCCCTCGAGCTAACGGCGTAGAGCCTC-3′) carrying NdeI and XhoI restriction sites and subcloned in pET28b(+). The identity of the clone was established by DNA sequencing. The plasmid DNA isolated from the clone was used to transform BL21 (DE3) pLysS (Novagen) or BL21-CodonPlus-RP strain (Agilent) for protein expression. Subsequent to transformation, colonies of BL21-CodonPlus-RP cells were inoculated in 5 ml of LB medium containing 50 μg/ml kanamycin and allowed to grow overnight at 37°C. This overnight culture was used as the seed for large scale culturing of cells.
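The primer design described above can be sanity-checked with a short script. The following is a minimal Python sketch, not part of the original protocol: it locates the NdeI (CATATG) and XhoI (CTCGAG) recognition sequences in the two primers and reverse-complements the gene-specific part of the reverse primer to expose the stop codon. The primer strings are copied from the text with line-break hyphens removed (an assumption about the extraction), and all function names are illustrative.

```python
# Primer sequences from the text (5' -> 3'); hyphens introduced by line
# breaks in the source are assumed to be artifacts and have been removed.
FORWARD = "CCCCCCATATGAAGGGGACGGTCCGTCCGA"
REVERSE = "CCCCCCTCGAGCTAACGGCGTAGAGCCTC"

# Recognition sequences of the two restriction enzymes named in the text.
SITES = {"NdeI": "CATATG", "XhoI": "CTCGAG"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def site_position(primer: str, enzyme: str) -> int:
    """0-based index of the enzyme's recognition site in the primer, or -1."""
    return primer.find(SITES[enzyme])


def gene_specific_part(primer: str, enzyme: str) -> str:
    """Portion of the primer that follows the restriction site."""
    start = site_position(primer, enzyme)
    if start < 0:
        raise ValueError(f"{enzyme} site not found in primer")
    return primer[start + len(SITES[enzyme]):]


if __name__ == "__main__":
    print("NdeI site at index", site_position(FORWARD, "NdeI"))
    print("XhoI site at index", site_position(REVERSE, "XhoI"))
    # The ATG inside CATATG doubles as the start codon of the insert.
    print("Start codon:", FORWARD[site_position(FORWARD, "NdeI") + 3:][:3])
    # Read on the coding strand, the reverse primer should end in a stop codon.
    coding_end = reverse_complement(gene_specific_part(REVERSE, "XhoI"))
    print("Coding-strand 3' end:", coding_end)
```

Read on the coding strand, the gene-specific part of the reverse primer ends in TAG, so the construct carries its own stop codon immediately upstream of the XhoI site.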
The cells were allowed to grow at 37°C until the optical density at 600 nm reached 0.6. Protein expression was induced by the addition of 1 mM isopropyl β-D-thiogalactopyranoside, and the culture was grown for 10 h at 25°C. Cells were harvested by centrifugation (12,000 rpm for 10 min), resuspended in 10 mM Tris buffer (pH 7) containing 40 mM NaCl and 10 mM imidazole, and lysed by sonication. The lysate was further clarified by centrifugation (10,000 rpm for 30 min), and the protein was purified using standard Ni-NTA affinity chromatography. The excess imidazole present in the eluted protein was removed using a PD-10 desalting column. The purity of the protein was checked by SDS-PAGE and mass spectrometry.

Site-directed mutagenesis
The above SavSrtE pET28b(+) clone was used as a template to introduce the Y112A, Y112G, Y112W, and Y112F mutations by PCR with the desired primers (Table 3) using the QuikChange mutagenesis kit. The identity of each clone was confirmed by sequencing. The mutant proteins were expressed, purified, and characterized in the same way as wild-type SavSrtE.

Expression and purification of sortase A (SaSrtA) from S. aureus
The sequence comprising residues 60-204 of SaSrtA, previously cloned in the pET23b vector, was expressed in E. coli BL21 cells. The expressed protein was purified as described earlier (46).

Peptide synthesis
Peptides were synthesized using solid-phase methodology employing standard Fmoc/1-hydroxybenzotriazole/N,N′-dicyclohexylcarbodiimide chemistry as described previously (26). The desired amino acids pre-loaded on Wang resin were used for elaboration of peptide sequences. After cleavage from the resin with 95% TFA, crude peptide was precipitated in cold ether and purified by RP-HPLC on a C18 column (Phenomenex Luna 10 μm, 250 × 30 mm) using an acetonitrile/TFA/water solvent system. A linear gradient of 8-72% acetonitrile containing 0.1% TFA was used to effect the separation of the peptides. The mass of each peptide was confirmed using MALDI-TOF or ES-MS measurements.
Acetylation of the terminal amino group was carried out on the resin after completion of the final coupling and Fmoc deprotection step. Briefly, the dried resin was treated with a mixture of 10% acetic anhydride and 10% N,N-diisopropylethylamine in N,N-dimethylformamide for 1 h at room temperature. The resin was filtered, dried, and processed for cleavage by TFA, and the peptide was purified and characterized by mass spectrometry as described above.

Transpeptidation assay
The SavSrtE-catalyzed transpeptidation reaction was carried out using relevant donor (peptides with LAXTG or LPXTG sortase-recognition motif) and acceptor (GGGKY) peptides (26). Initially, test assays were performed to determine the optimum pH and temperature of the reaction. Accordingly, assays were carried out in 100 mM Tris-HCl buffer (pH 7) containing 150 mM NaCl and 2 mM β-mercaptoethanol at 20°C. The reaction was set up with appropriate amounts of peptide substrates and initiated by addition of known amounts of SavSrtE. The reaction was quenched at the desired time by adding a 10-fold excess of 0.1% TFA and analyzed by analytical RP-HPLC (C18, 5 μm, 4.6 × 250 mm, linear gradient of 4-72% acetonitrile in 0.1% TFA). The product yield was estimated from the peak area using software provided by the manufacturer (LCsolution, Shimadzu Corp., Japan). The product in each case was characterized by mass spectrometry (data not shown).
The above protocol was also followed for assaying SaSrtA activity except that the reaction was carried out at 37°C in Tris-HCl buffer (pH 7.5) containing 5 mM calcium chloride, 150 mM NaCl, and 2 mM β-mercaptoethanol.

Steady-state kinetics
For measurement of steady-state kinetics, transpeptidation assays were carried out with varying concentrations of acetyl-YNLAETGA (Ala substrate) or acetyl-YNLPETGA (Pro substrate) against GGGKY fixed at 1 mM. All kinetic assays were performed ensuring linearity of the product formation. Furthermore, the product yields at all tested substrate concentrations were in the range of 0-10%. The enzyme concentration and transpeptidation assay times were as follows: 50 [...]
$$v_0 = \frac{V_{\max}[S]}{K_m + [S]} \qquad \text{(Eq. 1)}$$
$$v_0 = \frac{V_{\max}[S]}{K_m + [S] + [S]^2/K_I} \qquad \text{(Eq. 2)}$$
where $v_0$ is the observed velocity at a given substrate concentration; $V_{\max}$ is the apparent maximal velocity; $[S]$ is the substrate concentration; $K_m$ is the Michaelis-Menten constant; and $K_I$ is the apparent inhibitor dissociation constant for substrate binding.
The data obtained with the Ala substrate for SaSrtA (substrate concentration range, 0.25-17.5 mM) and for the Y112F mutant of SavSrtE (range, 0.25-14 mM) were analyzed by standard Michaelis-Menten kinetics (Equation 1), and the inhibition model (Equation 2) was used for the analysis of kinetics data obtained with the Ala substrate (range, 0.25-18 mM) for wild-type SavSrtE. In the case of the Pro substrate, kcat/Km values for both wild-type SavSrtE and the Y112F mutant were ascertained from the linear portion of the Lineweaver-Burk plot.
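The two rate laws and the double-reciprocal analysis mentioned above can be illustrated with a small Python sketch. It assumes the textbook forms of the models: the Michaelis-Menten hyperbola, and a substrate-inhibition law with an additional [S]²/KI term in the denominator. The parameter values used in the example are hypothetical and are not the fitted constants of this study. Concentrations are taken in mM and rates in mM/min, so kcat/Km comes out in mM⁻¹ min⁻¹.

```python
import math


def michaelis_menten(s, vmax, km):
    """Hyperbolic rate law: v0 = Vmax*[S] / (Km + [S])."""
    return vmax * s / (km + s)


def substrate_inhibition(s, vmax, km, ki):
    """Assumed inhibition model: v0 = Vmax*[S] / (Km + [S] + [S]^2/KI)."""
    return vmax * s / (km + s + s * s / ki)


def inhibition_optimum(km, ki):
    """[S] at which the inhibition model peaks; dv0/d[S] = 0 gives sqrt(Km*KI)."""
    return math.sqrt(km * ki)


def double_reciprocal_slope(conc, rates):
    """Least-squares slope of 1/v0 versus 1/[S] (Lineweaver-Burk plot).

    For Michaelis-Menten behaviour this slope equals Km/Vmax.
    """
    xs = [1.0 / s for s in conc]
    ys = [1.0 / v for v in rates]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def kcat_over_km(conc, rates, enzyme_conc):
    """kcat/Km = Vmax/([E]*Km) = 1 / (slope * [E])."""
    return 1.0 / (double_reciprocal_slope(conc, rates) * enzyme_conc)


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    conc = [0.5, 1.0, 2.0, 4.0]
    rates = [michaelis_menten(s, vmax=2.0, km=5.0) for s in conc]
    print("Lineweaver-Burk slope:", double_reciprocal_slope(conc, rates))
    print("kcat/Km:", kcat_over_km(conc, rates, enzyme_conc=0.1))
    print("Rate optimum under inhibition:", inhibition_optimum(km=4.0, ki=9.0))
```

Under the inhibition model the rate passes through a maximum at [S] = √(Km·KI) and then declines, which is the qualitative signature that distinguishes it from the simple hyperbola when the two are compared on the same data.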

Crystallization, diffraction data collection, and processing
The protein was screened for crystallization conditions with commercially available screens using the hanging drop vapor diffusion method. Initial crystals were obtained in 2-3 days in a condition containing 2.0 M ammonium sulfate, 0.1 M citric acid (pH 3.5). Bigger crystals were grown in a drop containing 1 μl of protein (4 mg/ml in 10 mM Tris buffer (pH 7.2), 100 mM NaCl, and 2 mM β-mercaptoethanol) and 1 μl of a solution containing 1.6 M ammonium sulfate, 0.1 M citric acid (pH 3.75). The crystals were cryo-protected in a 10% sucrose solution, prepared by adding a concentrated stock of sucrose to the crystallization drop. Diffraction data were collected using synchrotron radiation (λ = 0.95372 Å) at the European Synchrotron Radiation Facility (BM-14, ESRF), Grenoble, France, with a CCD MARmosaic 225 detector. The crystal was annealed by blocking the liquid N2 stream for a few seconds to improve the diffraction pattern. The data were processed using MOSFLM and scaled to 1.65 Å using AIMLESS from the CCP4 program suite (47)(48)(49).

Structure solution and refinement
Cell content analysis based on the molecular weight and space group predicted a monomer in the asymmetric unit with 46.6% solvent content and a Matthews coefficient of 2.3 Å³ Da⁻¹ (50, 51). SavSrtE shares low sequence identity (25-34%) with sortase A enzymes of known structure. Hence, the scaled data, sequence information, and model coordinates (sortase A from Streptococcus agalactiae, PDB code 3rcc) were submitted to the MR phasing option in the EMBL-Hamburg AutoRickshaw pipeline (52). The model generated from the server was used as input to PHASER, which generated a solution with scores RFZ = 6.9, TFZ = 13.5, PAK = 0, LLG = 208, and LLG = 13,923, respectively (53). The MR solution was subjected to one cycle of rigid body refinement followed by several cycles of restrained refinement using REFMAC from the CCP4 suite, with alternate rounds of inspection and manual model building in COOT for model improvement (54,55). The convergence of the refinement procedure was checked from the R-factors. Rfree was calculated using 5% of the reflections set aside at the beginning of the refinement procedure.
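The relationship between the Matthews coefficient and the solvent content quoted above can be reproduced with a few lines of Python. This is a minimal sketch using the standard Matthews relations, with 1.23 Å³ Da⁻¹ as the conventional value for the specific volume of protein; the unit-cell volume and molecular weight in the usage example are hypothetical, because the cell parameters are not given in this section.

```python
def matthews_coefficient(cell_volume_a3, asu_per_cell, copies_per_asu, mol_weight_da):
    """V_M in A^3/Da: unit-cell volume divided by the total protein mass in the cell."""
    return cell_volume_a3 / (asu_per_cell * copies_per_asu * mol_weight_da)


def solvent_fraction(v_m, protein_volume_per_da=1.23):
    """Fraction of the crystal volume occupied by solvent: 1 - 1.23 / V_M."""
    return 1.0 - protein_volume_per_da / v_m


if __name__ == "__main__":
    # Value reported in the text (rounded to one decimal place there).
    print(f"Solvent content at V_M = 2.3: {100 * solvent_fraction(2.3):.1f}%")
    # Hypothetical cell: 100,000 A^3, 4 asymmetric units, 1 copy of a 10 kDa protein.
    print("Hypothetical V_M:", matthews_coefficient(100_000.0, 4, 1, 10_000.0))
```

A coefficient of exactly 2.3 Å³ Da⁻¹ corresponds to about 46.5% solvent, so the reported 46.6% is consistent with an unrounded coefficient marginally above 2.3.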

Structure analyses
The geometry of the refined model was analyzed using PROCHECK (56). Pairwise root mean square deviations between sortase structures were calculated using SSM Superpose in CCP4. Structure-based multiple sequence alignment was performed using MUSTANG and rendered using ESPript 3.0 (57,58). Structural motifs were studied from the PDBSUM output (59). Surface electrostatics was calculated using Accelrys Discovery Studio software.

Modeling protein-substrate interactions from peptide docking
Hetero-atoms and water molecules were removed from the SavSrtE crystal structure, and the missing residues in the protein were modeled using COOT (55). The peptides YALANTGA and YALPNTGA were generated using the coordinates of LPAT* in the substrate-bound structure (PDB code 2kid) of S. aureus SrtA (SaSrtA as scaffold), and additional residues in the peptide were added using COOT (24,55). The substrate-bound structures of SavSrtE were generated using the ClusPro server, which generates multiple docked conformations using the FFT-based rigid body docking program PIPER (60). Residues from the putative binding site region of SavSrtE were provided to the program through the "attraction" option. The docked conformations were filtered by ClusPro using an empirical potential, clustered by root mean square deviation, ranked by cluster size, and subjected to a minimization procedure to remove side chain clashes. The energy score for the best docked conformation for each peptide, obtained from the top ranked cluster, was obtained using the FiberDock server (61). To compare the substrate preference of SavSrtE with SaSrtA, ALPNT- and ALANT-bound models of SaSrtA were generated by appropriate modifications in the peptide-bound NMR structure (PDB code 2kid), and the structures were energy-minimized.
Interactions between the protein and the peptide were studied from residue interaction networks (RINs) generated by the RINerator (62). Hydrogen atoms were added to the peptide-bound protein model using "reduce," and the non-covalent interactions (van der Waals interaction, hydrogen bond, and atomic clashes) were identified by Probe (63,64). Interactions with a positive score (>0) were studied, and a higher score indicated stronger interaction between the connected residue pair. The RIN consisted of nodes (n) representing residues connected by edges (E), which represent the interaction between them. The edges were weighted by the interaction score. A sub-network of the protein residues directly interacting with the peptide was derived using Cytoscape and the edge-to-node ratio (E/n) was calculated (65). A higher E/n ratio is an indicator of a stronger interaction network (enhanced interaction).
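The edge-to-node ratio described above can be sketched in Python as follows. This is one possible reading of the procedure, not the authors' actual Cytoscape workflow: edges with a non-positive score are discarded, the sub-network is taken to consist of every protein residue that contacts the peptide together with the peptide residues it contacts, and all positive-score edges among those residues are counted. The residue labels and scores in the example are hypothetical.

```python
def interface_edge_to_node_ratio(edges, peptide_chain):
    """E/n for the sub-network around the protein-peptide interface.

    edges: iterable of (residue_a, residue_b, score), where each residue is a
           (chain_id, residue_number) tuple.
    peptide_chain: chain identifier of the docked peptide.
    """
    # Keep only interactions with a positive score, as stated in the text.
    positive = [(a, b) for a, b, score in edges if score > 0]

    # Nodes: residues joined by an edge that crosses the protein-peptide boundary.
    nodes = set()
    for a, b in positive:
        if (a[0] == peptide_chain) != (b[0] == peptide_chain):
            nodes.update((a, b))
    if not nodes:
        return 0.0

    # Edges: every positive interaction between two selected nodes (undirected).
    sub_edges = {
        frozenset((a, b))
        for a, b in positive
        if a in nodes and b in nodes and a != b
    }
    return len(sub_edges) / len(nodes)


if __name__ == "__main__":
    # Hypothetical toy network: chain "A" is the enzyme, chain "P" the peptide.
    toy_edges = [
        (("A", 112), ("P", 3), 2.0),
        (("A", 110), ("P", 3), 1.0),
        (("A", 112), ("A", 110), 0.5),
        (("A", 50), ("A", 51), 1.0),    # far from the peptide, excluded
        (("A", 112), ("P", 4), -0.3),   # clash (non-positive score), excluded
        (("P", 3), ("P", 4), 0.8),      # intra-peptide only, P4 not selected
    ]
    print("E/n =", interface_edge_to_node_ratio(toy_edges, "P"))
```

Comparing this ratio between the Ala- and Pro-peptide models gives a single number per complex, which is how the text uses it as an indicator of a denser interaction network.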