Substrate Specificity of MarP, a Periplasmic Protease Required for Resistance to Acid and Oxidative Stress in Mycobacterium tuberculosis

Background: MarP is a serine protease critical for pH homeostasis in Mycobacterium tuberculosis.
Results: MarP is localized in the periplasm, and its substrate specificity was uncovered using synthetic peptides.
Conclusion: The periplasmic location of MarP is essential for function and the substrate profile shows high selectivity at multiple subsites.
Significance: Understanding the MarP proteolytic pathway may lead to the development of novel anti-tuberculosis chemotherapeutics.

The transmembrane serine protease MarP is important for pH homeostasis in Mycobacterium tuberculosis (Mtb). Previous structural studies revealed that MarP contains a chymotrypsin fold and a disulfide bond that stabilizes the protease active site in the substrate-bound conformation. Here, we determined that MarP is located in the Mtb periplasm and showed that this localization is essential for function. Using the recombinant protease domain of MarP, we identified its substrate specificity using two independent assays: positional-scanning synthetic combinatorial library profiling and multiplex substrate profiling by mass spectrometry. These methods revealed that MarP prefers bulky residues at P4, tryptophan or leucine at P2, arginine or hydrophobic residues at P1, and alanine or asparagine at P1′. Guided by these data, we designed fluorogenic peptide substrates and characterized the kinetic properties of MarP. Finally, we tested the impact of mutating MarP cysteine residues on the peptidolytic activity of recombinant MarP and its ability to complement phenotypes of Mtb ΔMarP. Taken together, our studies provide insight into the enzymatic properties of MarP, its substrate preference, and the importance of its transmembrane helices and disulfide bond.

Tuberculosis is the second highest cause of death from a communicable disease, killing ~1.5 million people annually (1). This can in part be attributed to the multiple mechanisms Mycobacterium tuberculosis (Mtb) has evolved to resist bactericidal host defenses, including acid and oxidative stresses (2-5). Although Mtb can stall phagosome-lysosome fusion (6-8), activation of macrophages with the T-cell-derived cytokine IFN-γ turns the relatively neutral (pH ~6.2) environment of the Mtb-containing phagosome into the much more hostile, acidic environment of the phagolysosome with a pH as low as 4.5 (9-13). Mtb uses mechanisms of pH homeostasis to persist within host cells and maintain a near neutral intrabacterial pH despite the acidic phagolysosomal pH surrounding the bacteria (3).
Mechanisms of pH homeostasis have been well characterized in several Gram-negative and Gram-positive bacteria (14,15), but those active in Mtb remain elusive. A small number of proteins has been implicated in the mycobacterial response to low pH, including MgtC, a predicted magnesium transporter (16); OmpATB, an outer membrane protein (17); PhoPR, a two-component regulator (18); aprABC, a gene locus regulated by PhoPR in response to acid (19); and Rv3671c, a transmembrane serine protease (3,20). A mutant with a transposon insertion in Rv3671c failed to maintain its neutral intrabacterial pH in acidic conditions both in vitro and in IFN-γ-activated macrophages and lost viability (3). pH homeostasis was dependent on Rv3671c proteolytic activity, because a catalytically inactive mutant (Rv3671c-S343A) failed to rescue the acid sensitivity of the transposon mutant, whereas the wild-type enzyme fully restored the phenotypes of the mutant. Furthermore, disruption of Rv3671c significantly attenuated Mtb in both the acute and chronic phases in a mouse model of Mtb infection. A better mechanistic understanding of the Rv3671c protease, which we named MarP (mycobacterial acid resistance protease), may aid in the discovery of new inhibitors that target pH homeostasis.
MarP is highly conserved in mycobacteria (21). Computational predictions of its secondary structure reveal four transmembrane helices in the N-terminal region connected by a linker region to the C-terminal protease domain (22). The predicted topology places the protease domain of MarP outside of the cytoplasm, in the periplasmic compartment. The presence of an outer membrane and, hence, a periplasm in mycobacteria, was recently demonstrated by cryo-electron microscopy studies (23,24). Structural and biochemical analyses demonstrated that the C-terminal domain of MarP is a chymotrypsin-like serine protease, with a classic catalytic triad composed of His-235, Asp-264, and Ser-343 (20). Recombinant MarP lacking its transmembrane domain readily cleaved β-casein. Furthermore, a reversible disulfide bond between Cys-214 and Cys-395 regulated proteolysis in vitro by stabilizing the active site in a catalytically active conformation. A construct of the protease containing a part of the linker that connects the N-terminal transmembrane helices with the C-terminal protease domain underwent autoproteolysis within the linker at several sites, but the major cleavage (↓) occurred after Leu-160 in the sequence PKRL↓SALL (20). The biological significance of this autocleavage site is unknown, as the truncated form of the native protease has not been observed in Mtb lysates.
The closest characterized homologues of MarP are the HtrA (high temperature requirement A) serine proteases, whose protease domains exhibit ~20% sequence identity with that of MarP. HtrA-family proteases are periplasmic serine proteases characterized by at least one C-terminal PDZ protein-protein interaction domain and an association with the cytoplasmic membrane (25,26). They function as both proteases and oligomeric chaperones depending on the molecular signal (27). Mtb contains three HtrA proteases (28,29). One of them, HtrA2 (PepD), was shown to be important for Mtb stress response and for virulence in the mouse model of infection (29,30). However, MarP differs from these proteases because it lacks a PDZ domain, contains more than one predicted transmembrane helix (31), and exists as a monomer (3,20). Despite knowledge about the structure of the protease domain and functional role of MarP in pH homeostasis, the localization of the protease domain, the function of the transmembrane helices, and the substrate preference of the protease remain unclear.
In this study we investigated the importance of the transmembrane helices in MarP-mediated acid resistance and established the periplasmic location of the protease domain experimentally. We used two distinct substrate profiling methods (32,33) to determine the peptide sequence cleavage specificity of the enzyme. Knowledge of the preferred cleavage sequences enabled us to design fluorogenic substrates and characterize the proteolytic activity of MarP. MarP was active over a broad pH range and sensitive to serine protease inhibitors and reducing reagents. This work will facilitate future studies to identify in vivo substrates and elucidate the MarP proteolytic pathway.

EXPERIMENTAL PROCEDURES
Protease Constructs-MarP_179-397 and MarP_179-397-S343A were purified as previously described (20). MarP_179-397-C214A and MarP_179-397-C214A-C395A were constructed by mutating Cys-214 and Cys-395 to alanine using the QuikChange site-directed mutagenesis kit (Stratagene). PCR products were cloned into the pET28b vector including a C-terminal His6 tag and purified as previously described (20). MarP complementation plasmids were created using Gateway Cloning Technology (Invitrogen). For MarP-TM1234, full-length marP was amplified from the chromosome. To generate MarP-TM12, the MarP protease domain at Leu-119 was fused to Val-52 of the second MarP transmembrane helix using overlap PCR. Similarly, the MarP protease domain at Leu-119 was fused to Val-102 of TM helix three to create MarP-TM123. A C-terminal FLAG tag was introduced to all constructs by PCR, and all genes were cloned behind the Hsp60 promoter on plasmids integrating into the Mtb chromosome. In MarP-C214A and MarP-C214A/C395A, cysteine residues were mutated to alanine using the QuikChange site-directed mutagenesis kit (Stratagene). Both constructs were expressed under the control of the Hsp60 promoter from plasmids that integrate into the chromosome. ΔMarP was electroporated, and transformants were selected on 7H10 plates with kanamycin (20 μg/ml). Expression of each protein was confirmed by immunoblotting with FLAG-tag-specific antibodies (Sigma) or MarP-specific antibodies.
Immunoblotting-Mtb lysates were prepared and fractionated as described (3), and 60-80 μg of cytosolic fractions and an equal volume of membrane and cell envelope fraction based on total volume of these fractions were separated by SDS-PAGE and transferred to a nitrocellulose membrane. Antibodies and dilutions used are as follows: 1:1,000 anti-alkaline phosphatase (Sigma); 1:2,000 anti-MarP; 1:10,000 anti-PrcB (proteasome β subunit); 1:3,000 anti-FLAG (Sigma).
Alkaline Phosphatase Fusions and Activity-MarP fragments were amplified using PCR and cloned into pMB111 using the BamHI restriction sites to create in-frame fusions to alkaline phosphatase (PhoA). The pMB111 plasmid (a gift from Miriam Braunstein) contains Escherichia coli PhoA lacking a signal sequence. Plasmid pMB124 (a gift from Miriam Braunstein) contains PhoA fused to the signal sequence of the Mtb secreted antigen, antigen 85B (34), and served as a positive control. Plasmids were electroporated into Mycobacterium smegmatis mc²155 and plated on LB agar plates containing 60 μg/ml 5-bromo-4-chloro-3-indolyl phosphate, 0.2% glucose, and 0.05% Tween 80.
Mtb Susceptibility to Low pH and Hydrogen Peroxide-Early to mid-log phase cultures were washed with PBS containing 0.05% Tween 80 and centrifuged at 800 rpm for 12 min. Single-cell suspensions were inoculated into stress media at an A600 = 0.05. Colony forming units (cfu) were determined by plating serial dilutions of the suspensions on 7H10 agar plates. Mtb was exposed to 8 mM H2O2 in 7H9 medium for 4 h at 37°C. To measure low pH susceptibility, Mtb was inoculated into phosphate citrate buffer at pH 4.5 and incubated at 37°C for 3 and 6 days.
The Complete Diverse Positional Scanning Synthetic Combinatorial Library (PS-SCL)-PS-SCL consists of four sublibraries of fluorogenic substrates with the general structure acetyl-P4-P3-P2-P1-amino-4-carbamoyl-methylcoumarin. One position is fixed in each sublibrary, whereas the remaining positions contain an equimolar mixture of all amino acids (32). 2 μM MarP_179-397 was assayed with the library in 50 mM Tris-HCl, pH 7.4, 500 mM NaCl, 0.01% Triton X-100 for 1 h at 37°C. The enzyme mixture was preincubated for 15 min at 37°C before adding it to the PS-SCL substrates. All reactions were run in triplicate at 37°C in 96-well Microfluor-1 Black "U" bottom plates (Dynex Technologies, Chantilly, VA) on a SpectraMax Gemini fluorescence spectrometer (Molecular Devices) with the excitation and emission wavelengths set at 380 and 460 nm, respectively.
Peptide Cleavage Site Identification by Multiplex Substrate Profiling (MSP)-Mass Spectrometry-MarP (250 nM) was profiled using the MSP-MS assay as described (33). Each peptide pool was incubated at room temperature in 50 mM Tris-HCl, pH 7.4, 500 mM NaCl, and aliquots were removed and acid-quenched to pH 3 or less with formic acid (4% final) after 60, 240, and 1200 min. A control sample without enzyme was prepared under identical conditions and quenched at the first and last time point of the assay to account for non-enzymatic degradation of the substrates. Cleaved bonds derived from non-enzymatic degradation were subtracted from the MarP cleavage sites. Cleavage site identification was performed using peptide sequencing by mass spectrometry. Samples containing 1-3 μg of total peptide (calculated as 60 μl of enzyme reaction containing peptide pools at 500 nM) were desalted using C18 zip tips (Millipore) and rehydrated in 0.1% formic acid. The peptide mixture containing 0.1 μg of peptide was injected onto the column.
For LC-MS/MS, an LTQ-FT mass spectrometer (Thermo) equipped with a 10,000 p.s.i. system nanoACQUITY (Waters) UPLC instrument for reverse phase chromatography with a C18 column (BEH130, 1.7-μm bead size, 100 μm × 100 mm) was used with standard operating conditions as previously reported (33). Mass spectrometry peak lists were generated using in-house software called PAVA, and Protein Prospector software Version 5.10.0 (35) was used to search a database containing 620 14-mer sequences. This database contains the sequences of the 124 14-mer synthetic peptides, concatenated with 4 different copies of randomized sequences for the same 124 entries to create a final database of 620 sequences for estimation of false discovery rate (33). For database searching, peptide sequences were matched with no enzyme specificity requirement, and potential modifications included oxidation of Trp, Pro, and Phe and N-terminal pyroGlu from Gln. Protein Prospector score thresholds were selected to be as follows: a minimum protein score of 22, a minimum peptide score of 15, and maximum expectation values of 0.01 for "protein" and 0.05 for peptide matches. This resulted in a peptide false discovery rate of <0.2%. Cleavage site data were extracted from Protein Prospector using an in-house script called "MSP extractor" (33). iceLogo software was used to generate substrate specificity profiles for amino acids at ±4 positions on both sides of the identified scissile bond (36). Raw mass spectrometry data can be obtained from the ProteomeCommons Tranche network.
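The database arithmetic above (124 target peptides plus 4 randomized copies of each, 620 entries in total) can be illustrated with a minimal sketch. The shuffling procedure, the placeholder target sequences, and the `estimate_fdr` formula are assumptions made for illustration only; the text does not specify how the randomized sequences were produced or how Protein Prospector computes the false discovery rate.

```python
import random

RESIDUES = "ADEFGHIKLMNPQRSTVWYn"  # hypothetical alphabet; "n" stands for norleucine


def build_search_database(targets, decoys_per_target=4, seed=0):
    """Concatenate targets with shuffled decoy copies of each target."""
    rng = random.Random(seed)
    database = [("target", seq) for seq in targets]
    for seq in targets:
        for _ in range(decoys_per_target):
            shuffled = list(seq)
            rng.shuffle(shuffled)  # assumption: decoys are residue shuffles
            database.append(("decoy", "".join(shuffled)))
    return database


def estimate_fdr(n_target_hits, n_decoy_hits, decoys_per_target=4):
    """Assumed estimator: decoy hits scaled by the decoy:target ratio."""
    if n_target_hits == 0:
        return 0.0
    return (n_decoy_hits / decoys_per_target) / n_target_hits


# Placeholder 14-mers standing in for the real library, which is not listed here.
rng = random.Random(1)
targets = ["".join(rng.choice(RESIDUES) for _ in range(14)) for _ in range(124)]

database = build_search_database(targets)
print(len(database))  # 124 + 4 * 124 entries
```

Using several decoys per target, as in this sketch, makes the decoy hit count less noisy for a small library, at the cost of needing the scaling factor in the estimator.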
The effect of pH on enzymatic activity was determined in different buffers in the pH range from 4.5 to 11.0 (50 mM MES, pH 4.5-6.5; 50 mM Tris-HCl, pH 7.0-9.0; 50 mM CAPS, pH 9.0-11.0) with 500 mM NaCl, 0.01% Triton X-100, and 50 μM WKLL-AMC as substrate. Fluorescence of the reaction mixture without the enzyme in each of the above buffers was monitored to control for peptide hydrolysis.
Enzyme inhibition was characterized with 750 nM MarP_179-397 in 50 mM Tris-HCl, pH 7.4, 500 mM NaCl, 0.01% Triton X-100, and the inhibitors listed in Table 2.
Modeling of Peptides Bound to the MarP Protease Domain-The models were generated based on the crystal structure of MarP in the active state (20) (PDB ID 3K6Y) using Coot (37). The backbone conformation of the modeled peptide was the same as that of the bound peptide in the crystal structure. The residues of the peptide were mutated from those in the crystal structure to a given sequence, and the side chain rotamers of Lys, Tyr, and Trp that do not clash sterically with the protease were chosen among the most energetically preferred ones, as defined in Coot. Side chain torsion angles were manually adjusted slightly whenever needed to avoid minor steric clashes. The figure was created in PyMOL (Schrödinger, LLC) with the surface electrostatic potential generated by the APBS plug-in (38).

RESULTS
Localization of the MarP Protease Domain-MarP is predicted to contain four transmembrane (TM) helices by the TransMembrane server, TMHMM (22), with both the N terminus and C-terminal protease domain located in the periplasm (Fig. 1A). The TM helices are predicted to consist of amino acids 5-23 (TM1), 30-52 (TM2), 62-84 (TM3), and 98-119 (TM4). TM4 and the protease domain are connected by a 58-residue-long linker region (20). We tested these predictions experimentally. PhoA from E. coli has been used to determine the topology of membrane proteins (39) including those in mycobacterial species (34,40). PhoA is active only when located in the oxidizing, extracytoplasmic environment and can thus be used as a periplasmic reporter. We created four PhoA fusions at different residues of MarP, resulting in constructs containing different numbers of the predicted transmembrane helices. Two fusions, after Leu-123 and Leu-119, were C-terminal of all four predicted transmembrane helices, which places PhoA in the periplasm. Two fusions, at Val-102 and Pro-95, were made C-terminal of the first three helices, which localizes PhoA in the cytosol (Fig. 1A). M. smegmatis clones expressing Leu-123 and Leu-119 fusions demonstrated alkaline phosphatase activity, whereas those at Val-102 and Pro-95 were inactive (Fig. 1B). The plasmid containing PhoA lacking a signal sequence served as a negative control, and a plasmid expressing PhoA fused to the signal sequence of the secreted antigen 85B served as the positive control (34). Immunoblotting with a PhoA-specific antibody verified that all fusion proteins were expressed (Fig. 1C). These experiments confirm the topology predictions and indicate that MarP activity occurs in the periplasm.
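The logic behind the PhoA fusion design can be sketched as a simple parity rule: starting from a periplasmic N terminus, each complete transmembrane helix upstream of the fusion point flips the reporter to the other side of the membrane. The helix boundaries and fusion residues below are taken from the text; the function name and the rule that a helix counts only when it lies entirely N-terminal to the fusion point are assumptions of this illustration, not part of the TMHMM output.

```python
# Predicted TM helix boundaries (residue numbers) as reported in the text.
TM_HELICES = {"TM1": (5, 23), "TM2": (30, 52), "TM3": (62, 84), "TM4": (98, 119)}

# PhoA fusion points described in the text.
FUSION_SITES = {"Leu-123": 123, "Leu-119": 119, "Val-102": 102, "Pro-95": 95}


def predicted_compartment(fusion_residue, n_terminus="periplasm"):
    """Predict the compartment of a reporter fused after `fusion_residue`.

    Assumption: a helix is a membrane crossing only if it ends at or
    before the fusion residue.
    """
    crossings = sum(1 for _, end in TM_HELICES.values() if end <= fusion_residue)
    other = "cytoplasm" if n_terminus == "periplasm" else "periplasm"
    return n_terminus if crossings % 2 == 0 else other


for name, residue in FUSION_SITES.items():
    print(name, predicted_compartment(residue))
```

Under these assumptions the two fusions downstream of TM4 are predicted to be periplasmic and the two within or before TM4 cytoplasmic, which is the pattern of alkaline phosphatase activity reported above.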
Importance of the Transmembrane Helices-MarP proteolysis is important for resistance to both acid and oxidative stress; however, the role of the transmembrane helices in these functions has not been explored. To address this, we created two Mtb strains in the background of a MarP knock-out strain (ΔMarP). In ΔMarP the entire marP gene is replaced with a hygromycin cassette (data not shown). MarP-TM12 expressed MarP lacking transmembrane helices 3 and 4, placing the protease domain in the periplasm (Fig. 2A, upper panel). MarP-TM123 expressed MarP lacking transmembrane domain 4, so that the protease domain is located in the cytoplasm (Fig. 2A, lower panel). MarP-TM1234 contained full-length MarP. To test the importance of the transmembrane helices for pH homeostasis, we exposed the bacteria to phosphate-citrate buffer, pH 4.5, and measured viability (Fig. 2B). As predicted, the ΔMarP strain was hypersusceptible to acid; its cfu decreased 30- and 300-fold compared with that of wild-type (WT) Mtb after 3 and 6 days, respectively. MarP-TM1234 and MarP-TM12 were significantly more acid-resistant and displayed survival rates similar to WT Mtb, whereas MarP-TM123 phenocopied ΔMarP. A similar trend was observed for susceptibility to H2O2 stress. Mtb lacking MarP was previously shown to be hypersusceptible to hydrogen peroxide (20). Cfu of MarP-TM123 and ΔMarP were 10-fold lower than those of WT after exposure to 8 mM H2O2 for 4 h, whereas MarP-TM1234 and MarP-TM12 survived like WT (Fig. 2C). All MarP proteins were expressed and were of expected sizes (Fig. 2D), demonstrating that the lack of complementation with MarP-TM123 was not due to lack of protein expression. These data indicate that transmembrane helices 3 and 4 are dispensable for resistance to acid and oxidative stress, whereas the periplasmic localization of the protease is essential for these protective functions.
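Fold changes in cfu are often easier to compare on a log10 scale. As a small worked example, the fold decreases quoted above for the knock-out strain relative to WT can be converted as follows; the function name is illustrative.

```python
import math


def log10_reduction(fold_decrease):
    """Convert a fold decrease in cfu into a log10 reduction."""
    return math.log10(fold_decrease)


# Fold decreases relative to WT reported in the text.
FOLD_DECREASES = {"acid, day 3": 30, "acid, day 6": 300, "H2O2, 4 h": 10}

for condition, fold in FOLD_DECREASES.items():
    print(f"{condition}: {log10_reduction(fold):.2f} log10 units")
```

On this scale the acid phenotype grows by exactly one log10 unit between days 3 and 6, because the fold decrease rises tenfold.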
Determination of Substrate Specificity-We previously purified two recombinant MarP proteins without their transmembrane domains: MarP_142-397 and MarP_179-397 (20). The longer construct, MarP_142-397, underwent autocleavage at its N terminus, in the linker region. Liquid chromatography-mass spectrometry (LC/MS/MS) revealed that autocleavage occurred preferentially between Leu-160 and Ser-161 in the sequence PKRLSALL; however, additional cleavage positions were also observed in this region (20). We generated two internally quenched fluorescent peptides based on the autocleavage site sequences of MarP, namely PKRLSALL and SALLNTSG. Each peptide contained an MCA group on the N terminus and a 2,4-dinitrophenol group coupled to a C-terminal lysine. In a kinetic assay, no cleavage of these peptides was detected. However, MALDI-TOF MS analysis revealed that PKRLSALL was partially hydrolyzed between Leu and Ser and between Ser and Ala after 48 h of incubation with 2 μM MarP. SALLNTSG was not cleaved under these conditions (supplemental Fig. 1, A-D).
To uncover the substrate specificity of MarP, the enzyme was assayed with two distinct substrate-profiling methods. The complete diverse PS-SCL has been previously used to profile the P1 to P4 substrate specificity of more than 85 proteases, most of which are serine and cysteine proteases (32,41). The PS-SCL consists of tetrapeptide substrates containing a fixed amino acid at a specific position linked to a fluorogenic 7-amino-4-carbamoylmethylcoumarin group on the C terminus. The unfixed positions contain equimolar mixtures of all natural amino acids, except cysteine, and norleucine is included to increase the information content (20,32). In the PS-SCL assay, MarP specificity was dominated by leucine in the P2 position, and it displayed a 12.5-fold or higher preference over tryptophan, norleucine (n), and all other amino acids (Fig. 3). Cleavage also occurred when leucine and, to a lesser extent, methionine, proline, and arginine were present in the P1 position. Little preference was evident in the P3 position, whereas the P4 position was selective for bulky hydrophobic residues such as tryptophan, tyrosine, and phenylalanine as well as histidine.
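The positional-scanning layout described above can be made concrete with a short enumeration. The residue set follows the text (all natural amino acids except cysteine, plus norleucine, written "n"); the function names and the one-well-per-fixed-residue layout are assumptions of this sketch rather than a description of the physical plates.

```python
from itertools import product

# 19 proteinogenic residues (cysteine excluded) plus norleucine ("n").
RESIDUES = list("ADEFGHIKLMNPQRSTVWY") + ["n"]
POSITIONS = ["P4", "P3", "P2", "P1"]  # order in which the tetrapeptide is written


def wells():
    """Yield (fixed_position, fixed_residue) for every well in the library."""
    for position in POSITIONS:
        for residue in RESIDUES:
            yield position, residue


def sequences_in_well(fixed_position, fixed_residue):
    """Enumerate the tetrapeptides (P4 to P1) mixed together in one well."""
    index = POSITIONS.index(fixed_position)
    for combo in product(RESIDUES, repeat=3):
        seq = list(combo)
        seq.insert(index, fixed_residue)
        yield "".join(seq)


n_wells = sum(1 for _ in wells())
n_per_well = sum(1 for _ in sequences_in_well("P2", "L"))
print(n_wells, n_per_well)
```

Because every well averages over all residues at the three unfixed positions, the assay reports the preference at each subsite separately and cannot, on its own, reveal cooperativity between subsites.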
In addition, the extended substrate specificity of MarP was determined by incubating the enzyme with a mixture of 124 tetradecapeptides. These substrates were designed to contain a unique dipeptide at each terminus (positions 1-2 and 13-14) to profile exo-acting proteases and a central decapeptide sequence (positions 3-12) consisting of all combinations of neighbor and near-neighbor amino acid pairs that is used to profile endoproteases. This assay is termed Multiplex Substrate Profiling by Mass Spectrometry and has been used to profile numerous endo- and exo-acting proteases (33,42). After the addition of a protease to mixtures of these peptides, the proteolytic reaction is quenched at various time intervals, and cleaved bonds are identified by peptide sequencing using mass spectrometry.
MarP was assayed with the peptide library, and the activity was quenched using concentrated acid to pH < 3 after incubation for 60, 240, and 1200 min. Cleavage (↓) of four peptides was evident after 60 min, namely: APSLI↓AKWVGFEPH, GnYYKRFn↓AHWVGI, AnTDR↓GWYLAIQAV, and AYNnWSLYRnI↓RQE, where n corresponds to norleucine. By 4 h, 8 additional cleavage sites were observed, and by 20 h a total of 25 sites were cleaved (Fig. 4A). An iceLogo plot was generated to illustrate the frequency of amino acids found at the P4 to P4′ positions of all cleaved sequences (n = 25). The P1 position was enriched with norleucine, isoleucine, and arginine, whereas small polar and negatively charged residues were not tolerated at this position (Fig. 4B). The P2 position was enriched with tryptophan and, to a lesser extent, leucine and norleucine, whereas bulky hydrophobic residues such as tyrosine, phenylalanine, norleucine, and isoleucine were most often found in the P4 position. The MSP-MS assay also yielded information on the prime-side substrate specificity. The P1′ position was frequently occupied by alanine and asparagine, whereas P4′ was enriched with leucine and histidine. In general, MarP cleaved at sites that were distal from the termini of the peptides with 84% of substrates having amino acids present in the P4 to P3′ positions; however, there was no statistically significant enrichment (p > 0.05) of amino acids in the P3, P2′, or P3′ subsites.
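To make the subsite nomenclature concrete, the sketch below extracts the P4 to P4′ window around a scissile bond. The four peptides are the early cleavage products listed in the text, written here with "↓" marking the cleaved bond and "n" for norleucine; the function names are illustrative and are not those of the in-house extraction script.

```python
def parse_marked(marked, marker="↓"):
    """Split a marked peptide into (sequence, number of residues before the cut)."""
    cut_after = marked.index(marker)
    return marked.replace(marker, ""), cut_after


def subsite_window(sequence, cut_after, flank=4):
    """Map subsite labels (P4..P1, P1′..P4′) to residues; "" if unoccupied."""
    window = {}
    for i in range(flank, 0, -1):  # non-prime side, P4 down to P1
        idx = cut_after - i
        window[f"P{i}"] = sequence[idx] if idx >= 0 else ""
    for i in range(1, flank + 1):  # prime side, P1′ up to P4′
        idx = cut_after + i - 1
        window[f"P{i}′"] = sequence[idx] if idx < len(sequence) else ""
    return window


EARLY_CLEAVAGES = [
    "APSLI↓AKWVGFEPH",
    "GnYYKRFn↓AHWVGI",
    "AnTDR↓GWYLAIQAV",
    "AYNnWSLYRnI↓RQE",
]

for marked in EARLY_CLEAVAGES:
    sequence, cut_after = parse_marked(marked)
    window = subsite_window(sequence, cut_after)
    print(marked, "".join(window.values()))
```

For the first peptide the window is PSLIAKWV, the sequence used for the internally quenched octapeptide substrate. For the last peptide the cut lies three residues from the C terminus, so P4′ is returned empty, which corresponds to the blank subsites mentioned for cleavages near a peptide terminus.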
Design and Synthesis of Fluorescent MarP Substrates-Two fluorescent substrates were generated to enable enzymatic characterization of MarP. First, a tetrapeptide substrate containing a C-terminal 7-amino-4-methylcoumarin (AMC) group was synthesized. The tetrapeptide sequence (WKLL-AMC) corresponded to the preferred amino acids at the P4, P2, and P1 positions as established by the PS-SCL profiling assay. No obvious preference was evident at the P3 position, so Lys was chosen at P3 to add hydrophilicity to an otherwise hydrophobic peptide sequence. Second, an internally quenched octapeptide, MCA-PSLIAKWV-K(DNP), was synthesized that corresponded to the P4-P4′ amino acids of a substrate observed in the MSP-MS assay at the earliest time point.
FIGURE 4. Substrate specificity of MarP determined using multiplex substrate profiling by mass spectrometry. A, a list of all cleavage sites observed over the course of the assay is shown. Only P4 to P4′ residues are shown, and the time at which the cleavage was first observed is indicated at the top of each list. When cleavage occurs near the N or C termini, then some subsites are not occupied by an amino acid and are represented as a blank space. B, an iceLogo represents amino acids that are most frequently (above the axis) and least frequently (below the axis) observed in each of the 25 cleavage sites. Residues that are highlighted in black text are significantly (p = 0.05) enriched in the MarP subsites relative to the frequency that these same amino acids are found in the peptide library (5.2 ± 0.5%).
Steady-state kinetic studies were performed with each substrate, and kcat/Km values of 14 and 62 M⁻¹ s⁻¹ were obtained for WKLL-AMC and MCA-PSLIAKWV-K(DNP), respectively (Table 1; supplemental Fig. 2, A and B). Mass spectrometry analysis of MCA-PSLIAKWV-K(DNP) confirmed that cleavage occurred between Ile and Ala (supplemental Fig. 2, C and D) as was observed for the 14-mer substrate in the MSP-MS assay. To identify key interactions of these synthetic peptides with the protease substrate binding pocket, we performed structural modeling of these substrates bound in the active site of MarP (Fig. 5). The peptides were modeled based on the experimentally observed peptide AVLEPFSR bound in the active site pocket of MarP in the crystal structure of this protease in the active state (20) (PDB ID 3K6Y). In the model of MarP-WKLL-AMC, the P1 leucine is stabilized by hydrophobic interactions with the side chains of Ala-361 and Val-338, which line the S1 pocket. Leucine at P2 interacts with Ile-320 and the catalytic His-235. At P3, which had no preferred amino acid, lysine points into the solvent. P4 tryptophan makes hydrophobic interactions with Leu-315, Phe-359, and Phe-370. For MCA-PSLIAKWV-K(DNP), interactions at P1 (isoleucine), P2 (leucine), and P3 (serine) with the protease are similar to those described above. Proline at P4 makes nonpolar contacts with Phe-370, Leu-315, and Phe-359. On the other side of the scissile bond, alanine at P1′ contacts Leu-218, lysine at P2′ likely forms a salt bridge with Glu-219, and tryptophan at P3′ is sandwiched by stacking between Lys-216 and Gln-340. The rest of the peptide chains on this side of the scissile bond as well as the fluorophores are highly solvent-exposed, and therefore, their specific interactions with the protease are difficult to predict. In summary, substrate profiling using PS-SCL and MSP-MS identified two peptide substrates that were cleaved by MarP and bound in the active site consistent with our previous structural analysis.
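The specificity constants reported for the two substrates can be put to work in a small calculation. When the substrate concentration is far below Km, the Michaelis-Menten equation reduces to v0 ≈ (kcat/Km)[E][S], so kcat/Km alone determines the initial rate. The sketch below applies that approximation; the enzyme and substrate concentrations are hypothetical values chosen for illustration, and whether [S] is actually far below Km for either substrate is an assumption, because the individual kcat and Km values are not given in this section.

```python
# kcat/Km values from the text, in M^-1 s^-1.
KCAT_OVER_KM = {"WKLL-AMC": 14.0, "MCA-PSLIAKWV-K(DNP)": 62.0}


def initial_rate(kcat_over_km, enzyme_molar, substrate_molar):
    """Initial rate in M/s, valid only in the low-substrate limit ([S] << Km)."""
    return kcat_over_km * enzyme_molar * substrate_molar


ENZYME = 1e-6      # hypothetical: 1 uM enzyme
SUBSTRATE = 1e-5   # hypothetical: 10 uM substrate

for name, value in KCAT_OVER_KM.items():
    print(name, initial_rate(value, ENZYME, SUBSTRATE))

ratio = KCAT_OVER_KM["MCA-PSLIAKWV-K(DNP)"] / KCAT_OVER_KM["WKLL-AMC"]
print(f"octapeptide / tetrapeptide efficiency ratio: {ratio:.2f}")
```

The ratio of the two constants shows that the octapeptide, which occupies both prime and non-prime subsites, is turned over roughly four times more efficiently than the tetrapeptide under this approximation.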
Enzymatic Characterization of MarP-Because MarP-mediated proteolysis was important for resistance to acid stress, we were interested in the pH optimum of the protease. Fluorescence of free AMC was unaffected by pH (supplemental Fig. 3); therefore, a pH titration was performed using WKLL-AMC. MarP displayed optimal activity at pH 9.0 but retained >95% activity between pH 7.4 and pH 10 (supplemental Fig. 3B).
We next determined the activity of protease inhibitors against MarP using the WKLL-AMC substrate (Table 2). The serine protease inhibitors PMSF (1 mM) and FP-Rho (20 μM) inhibited catalysis, resulting in 100 and 94.8% inhibition, respectively. The calculated IC50 values for PMSF and FP-Rho were 284 and 4.5 μM, respectively (Fig. 6, A and B). 20 μM pepstatin A, which targets aspartic acid proteases, partially inhibited MarP activity. We have previously shown that MarP autocleavage activity is significantly reduced as a result of disruption of the intramolecular disulfide bond between Cys-214 and Cys-395 by reducing agents and mutagenesis of the two Cys residues (20). To examine if this disulfide is also important for cleavage of a peptide substrate, we tested the effect of 2 mM DTT on catalysis and observed 97.5% inhibition of protease activity. Finally, we tested the effect of complete disruption of the disulfide bond by mutagenesis on the peptide cleavage activity by MarP. We generated recombinant MarP with a single C214A mutation and double C214A/C395A mutations. Neither mutant protease cleaved WKLL-AMC (supplemental Fig. 4), which confirmed that the disulfide bond is an important structural element required for MarP activity in vitro.
Role of the MarP Disulfide Bond in Vivo-The role of the disulfide bond in regulating MarP protease in vitro, demonstrated here and in previous work (20,29), suggested that the disulfide bond (Fig. 7A) may have an important function in vivo. To test this hypothesis, we transformed Mtb ΔMarP with MarP-C214A and MarP-C214A/C395A. Expression of the mutant proteases was confirmed by immunoblot with anti-MarP antiserum (data not shown). Surprisingly, both mutant proteases complemented the ΔMarP mutant and phenocopied WT Mtb with respect to survival after exposure to acid (Fig. 7B) and H2O2 (Fig. 7C) stresses, whereas cfu of ΔMarP were significantly reduced. Thus, the disulfide bond, although important for in vitro catalysis, appears to be dispensable for MarP-mediated resistance to low pH or H2O2.

DISCUSSION
Proteolysis is an important aspect of bacterial pathogenesis, with proteases contributing to quality control processes and regulatory pathways. Several proteases have been shown to play an important role in the pathogenesis of tuberculosis including Rip1, a protease important for regulation of cell envelope composition and virulence (43,44), the mycobacterial proteasome (45), HtrA2 (29), the ClpP1P2 complex (46), the serine protease MycP1 (47), and MarP (3). The periplasmic location of MarP is consistent with its role in resistance against extracellular stresses; its substrates must be present in the extracytoplasmic space and may include periplasmic proteins or proteins that are part of the cell envelope. In this study we determined that MarP was optimally active at pH 9, consistent with our previous observation that autoproteolysis increased at alkaline pH (20). This pH profile suggests that MarP can cleave its substrates independently of acid stress. The pH of Mtb periplasm is unknown, and whether it is actively controlled when the bacteria are exposed to acid has not been investigated. E. coli is unable to control the pH of its periplasm (48), and recent work demonstrated that the pH optimum of the E. coli HtrA protease, DegQ, is pH 5.5, consistent with its role in degrading misfolded proteins during acid stress (49). In contrast, MarP may play a different function in the periplasmic space and is perhaps involved in cell envelope maintenance, which when disrupted results in hypersusceptibility to acid and oxidative stress. This hypothesis is further supported by the increased susceptibility of Mtb lacking MarP to lipophilic antibiotics and detergent (3).
Previously, we reported that recombinant MarP underwent autoproteolysis at its N-terminal linker region. However, MarP did not efficiently cleave two fluorescent peptides containing the autocleavage sequence (PKRLSALL and SALLNTSG). In addition, we were unable to detect truncated MarP in Mtb lysates, suggesting that autocleavage may be observed only at high concentrations of the recombinant enzyme. Alternatively, the absence of the transmembrane domain results in an exposed linker region that is prone to proteolysis.
Using two independent substrate-profiling methods, we uncovered the specificity of MarP. Both methods determined that the enzyme preferred bulky residues at P4 and hydrophobic residues at P2 and P1; however, the exact residues often differed between assays, and the dominant preference for leucine at P2 in the PS-SCL was not as evident in the MSP-MS assay. The differences observed between methods may be due to the size of the peptide substrates (14-mer versus 4-mer) or contribution of the prime side residues to non-prime side specificity. In particular, the MSP-MS assay revealed that MarP prefers alanine and asparagine in the S1′ pocket, both of which are dissimilar to the bulky fluorescent acetyl-P4-P3-P2-P1-amino-4-carbamoylmethylcoumarin moiety present in the PS-SCL library. To make substrates for enzymatic characterization of MarP, we generated a fluorescent tetrapeptide (WKLL-AMC) based on the preferred residues identified from the PS-SCL assay. In addition, we synthesized an internally quenched octapeptide substrate based on the P4 to P4′ residues of a robustly cleaved 14-mer peptide, namely APSLI↓AKWVGFEPH. This octapeptide sequence was chosen over others because Leu was present in the P2 position and isoleucine and alanine were frequently found in the P1 and P1′ sites of the other substrates in the MSP-MS assay.
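The subsite numbering used for the 14-mer can be made concrete with a short script. This is an illustrative sketch only, not part of the study's methods: the function name `subsite_window` and the use of `|` to mark the scissile bond are assumptions made for the example. It labels the residues flanking the cleaved bond as P4 to P1 and P1′ to P4′ and joins them into the corresponding octapeptide.

```python
def subsite_window(peptide: str, marker: str = "|", flank: int = 4) -> dict:
    """Map subsite labels (P4..P1, P1'..P4') to residues around the scissile bond.

    `marker` is a hypothetical convention for this sketch: it marks the
    cleaved bond inside the peptide string.
    """
    cut = peptide.index(marker)            # position of the scissile bond
    non_prime = peptide[:cut]              # residues N-terminal to the bond
    prime = peptide[cut + len(marker):]    # residues C-terminal to the bond
    if len(non_prime) < flank or len(prime) < flank:
        raise ValueError("peptide too short for the requested window")

    sites = {}
    # Non-prime side: P4, P3, P2, P1 (P1 is adjacent to the bond).
    for i in range(flank, 0, -1):
        sites[f"P{i}"] = non_prime[-i]
    # Prime side: P1', P2', P3', P4' (P1' is adjacent to the bond).
    for i in range(1, flank + 1):
        sites[f"P{i}'"] = prime[i - 1]
    return sites


# The 14-mer from the MSP-MS assay, cleaved between Ile and Ala.
sites = subsite_window("APSLI|AKWVGFEPH")
octapeptide = "".join(sites.values())
print(sites)
print(octapeptide)
```

With this numbering, Leu falls at P2, Ile at P1, and Ala at P1′, which matches the rationale given for choosing this peptide.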
Using these substrates, we noted that the catalytic efficiency (kcat/Km) of MarP was significantly lower than that of other well characterized serine proteases, which generally cleave their substrates with kcat/Km values of >10,000 M⁻¹ s⁻¹ (50). MarP lacks a PDZ domain that regulates protease activity upon binding to substrates in HtrA family proteases. However, the C-terminal region of MarP may also be involved in protein-protein interactions or form an exosite (20) regulating protease activity. Exosites that enhance enzymatic activity upon binding to a preferred ligand have been found in other proteases (51). In addition, it is possible that the transmembrane domain or the full-length linker region, neither of which is present in the recombinant enzyme, is important for substrate recognition. Finally, we cannot rule out that the optimal peptide substrate for MarP has not been identified, because structural features of protein substrates that are not close to the cleavage site may also dictate the specificity of the protease. Nevertheless, the substrate specificity profiles obtained in this study will help predict extracytoplasmic proteins as potential native MarP substrates. Additional rigorous studies will need to be performed to determine which of these proteins are true substrates of the membrane-anchored protease.
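As a rough illustration of how a specificity profile could be used to shortlist candidate cleavage sites, the sketch below scans a sequence and counts how many of the profiled subsites (P4, P2, P1, P1′) carry a preferred residue. It is a toy example under stated assumptions, not the prediction procedure of this study: the membership of the "bulky" and "hydrophobic" residue sets, the function name `candidate_sites`, and the match threshold are all hypothetical choices, and a real predictor would weight residues by their measured frequencies.

```python
# Residue sets per subsite, keyed by label, with the offset of that residue
# relative to the index of P1' (the first residue after the scissile bond).
# ASSUMPTION: the exact members of the "bulky" (P4) and "hydrophobic" (P1)
# sets are placeholders chosen for illustration.
SUBSITES = {
    "P4": (-4, set("FWYLIM")),            # bulky residues (assumed set)
    "P2": (-2, set("WL")),                # tryptophan or leucine
    "P1": (-1, set("R") | set("LIVMFA")), # arginine or hydrophobic (assumed set)
    "P1'": (0, set("AN")),                # alanine or asparagine
}


def candidate_sites(seq: str, min_matches: int = 3) -> list:
    """Return (cut_index, window, matched_labels) for bonds meeting the threshold.

    `cut_index` is the index of the P1' residue; the bond lies just before it.
    """
    hits = []
    # Start at 4 so that P4 exists; stop at len(seq) - 1 so that P1' exists.
    for cut in range(4, len(seq)):
        matched = [
            label
            for label, (offset, allowed) in SUBSITES.items()
            if seq[cut + offset] in allowed
        ]
        if len(matched) >= min_matches:
            window = seq[cut - 4:cut] + "|" + seq[cut:cut + 4]
            hits.append((cut, window, matched))
    return hits


# Example: the 14-mer from the MSP-MS assay, written without the bond marker.
hits = candidate_sites("APSLIAKWVGFEPH", min_matches=3)
for cut, window, matched in hits:
    print(cut, window, matched)
```

A simple match count like this ignores structural context away from the cleavage site, so any hits would be starting points for experimental testing rather than predictions of true substrates.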
Proteins in the periplasmic space can encounter harsh conditions due to their proximity to the extracellular environment. To tolerate these conditions, many secreted proteins contain disulfide bonds that provide extra stability to their tertiary structures (52,53). Disulfide bonds are introduced into periplasmic proteins by a dedicated disulfide bond-forming system, the DSB system (54). Mtb encodes a DSB system with vitamin K epoxide reductase (VKOR) instead of DsbB promoting disulfide bond formation (55,56). In E. coli, the HtrA protease is stabilized by a disulfide bond in the N-terminal region whose formation depends on the DsbA oxidase (57,58). In contrast to the above systems, the disulfide bond of MarP is not required per se for structural stability of the protease fold as a whole; instead, the disulfide constrains the conformation of the protease in the active state (20). We tested the importance of the MarP disulfide bond for proteolysis in vitro and for complementation of the MarP mutant phenotypes. Surprisingly, the cysteine mutant proteins fully complemented acid and H2O2 hypersusceptibility of ΔMarP. In contrast, the in vitro experiments determined that the proteases lacking the disulfide were inactive. However, autocleavage of these mutant proteases was not completely abolished (20), indicating that, although important, the disulfide bond is not essential for MarP-mediated proteolysis. Together, these data suggest that the structural support provided by the disulfide bond can be compensated for when the protease is expressed in the Mtb periplasm under the conditions tested here. This could be due to its attachment to the transmembrane domain, which is missing in the recombinant protease, or afforded by interacting proteins including protease substrate(s). Finally, we may not have identified the condition in which the disulfide becomes essential when MarP is expressed in Mtb.
In summary, we confirmed that MarP is a periplasmic protease, determined the substrate specificity profile of recombinant MarP, and characterized its enzymatic properties. Catalytic inactivation of MarP by mutagenesis yielded an acid- and peroxide-sensitive Mtb strain, which may be mimicked chemically by inhibitors directed against the active site of the protease. To facilitate the screening of MarP inhibitors, we have used the substrate profiling data to develop sensitive fluorogenic peptide substrates. Further understanding of the mechanism by which MarP protects Mtb from acid and oxidative stress and the development of MarP-selective inhibitors may lead to the discovery of novel anti-TB chemotherapeutics.