Distinct Specificities of Mycobacterium tuberculosis and Mammalian Proteasomes for N-Acetyl Tripeptide Substrates

The proteasome of Mycobacterium tuberculosis (Mtb) is a validated and drug-treatable target for therapeutics. To lay the groundwork for developing peptide-based inhibitors with a useful degree of selectivity for the Mtb proteasome over those of the host, we used a library of 5,920 N-acetyl tripeptide-aminomethylcoumarins to contrast the substrate preferences of the recombinant Mtb proteasome (wild type and open gate mutant), the Rhodococcus erythropolis proteasome, and the bovine proteasome with activator PA28. The Mtb proteasome was distinctive in strictly preferring P1 = tryptophan, particularly in combination with P3 = glycine, proline, lysine, or arginine. Screening results were validated with Michaelis-Menten kinetic analyses of 21 oligopeptide aminomethylcoumarin substrates. Bortezomib, a proteasome inhibitor in clinical use, and 17 analogs varying only at P1 were used to examine the differential impact of inhibitors on human and Mtb proteasomes. The results with the inhibitor panel confirmed those with the substrate panel in demonstrating differential preferences of Mtb and mammalian proteasomes at the P1 amino acid. Changing P1 in bortezomib from Leu to m-CF3-Phe led to a 220-fold increase in IC50 against the human proteasome, whereas changing a P1 Ala to m-F-Phe decreased the IC50 400-fold against the Mtb proteasome. The change of a P1 Ala to m-Cl-Phe led to an 8000-fold shift in inhibitory potency in favor of the Mtb proteasome, resulting in 8-fold selectivity. Combinations of preferred amino acids at different sites may thus improve the species selectivity of peptide-based inhibitors that target the Mtb proteasome.

Mycobacterium tuberculosis (Mtb), the single leading cause of death from bacterial infection, is growing even more dangerous with the spread of resistance to the existing chemotherapeutic agents (1). After four decades without the introduction of new anti-tuberculosis agents into the clinic, efforts are underway to find new chemophores that inhibit new targets and shorten the 6-9-month treatment regimen that makes compliance problematic (2-4). Many authorities believe that this will require inclusion of agents active against nonreplicating Mtb (5-8).
Proteasomes provide the major pathway of intracellular protein degradation in eukaryotic cells, where they are essential for adaptation to changing circumstances and avoidance of toxicity from oxidized or denatured proteins (9-11). Mtb is one of the few bacteria known to express a proteasome (12). The Mtb proteasome is emerging as a potential anti-tuberculosis drug target. It defends the bacterium against host nitrosative stress (12) and is essential for Mtb to persist in mice (13). The Mtb proteasome can be inhibited by small molecules both as a recombinant protein complex (14,15) and within intact mycobacteria (12). The structure of the Mtb proteasome has been solved with a peptidyl boronate (15). However, for proteasome inhibition to be a viable anti-infective approach, the inhibitors must be much more active against the proteasome of the bacterium than against that of the host. Partial inhibition of host proteasomes has been successful in the treatment of certain malignancies (16-18). Toxicity in this setting is often considered acceptable but would be highly undesirable in the treatment of an infectious disease. The extensive conservation of proteasome structure makes it challenging to find inhibitors with species selectivity.
A standard approach to developing selective protease inhibitors is to attach a warhead to a selective substrate (19). Among proteasome inhibitors, peptidyl boronates exemplify this approach and have achieved clinical utility (16-18). Thus, we have embarked on a program to develop peptidyl boronates with greater potency for the Mtb proteasome than for the β5 subunit of the human proteasome, whose inhibition is chiefly responsible for the toxicity of these compounds (16,17). As a step toward that goal, the present study sought potentially exploitable differences in the preferences of bacterial and mammalian proteasomes for tripeptide substrates.
The proteasome core, here called "20S," is a cylindrical 28-meric protein complex with proteolytic sites shielded within a central chamber (11), to which access is regulated by additional subunits that bind and hydrolyze ATP. The proteasomes of eukaryotes and prokaryotes (archaea and eubacterial Actinomycetes) share a barrel structure, with α subunits forming two heptameric outer rings and β subunits forming two heptameric inner rings. Proteolysis occurs at N-terminal threonines on β subunits that are exposed by autocatalytic removal of a propeptide upon assembly (20). In eukaryotes there are seven types of α subunits and seven types of β subunits, with only three of the β subunits (β1, β2, and β5) displaying proteolytic activities. With oligopeptide substrates, these activities are caspase-like, trypsin-like, and chymotrypsin-like, respectively (21) (see Fig. 1a). In contrast, prokaryotic proteasomes usually include only one or two types of α and β subunit, and the active sites are usually chymotrypsin-like with oligopeptide substrates. Mtb β subunits are of a single type. Cryo-electron microscopy, x-ray crystallography, and mutation analysis suggested that the α subunits have a gating function and confirmed that the 14 β subunits each provide an active site N-terminal threonine OH (14,15) (see Fig. 1a). In the present work we used a mutant form of the recombinant Mtb proteasome (Mtb20SOG) in which deletion of the N-terminal octapeptide from the α subunits mimics the presumed "open gate" configuration and increases specific activity toward oligopeptide substrates by approximately an order of magnitude (14).
Studies of substrate specificities have been reported for proteasomes of human (22,23) and another eukaryote, Trypanosoma brucei (24), using positional scanning substrate or inhibitor libraries. The positional scanning libraries were usually constructed as 20 pools/position (P1, P2, P3, and P4), each of which contained large numbers of peptides fixed at a single residue and otherwise variant. These studies provided a global analysis of substrate preference at defined positions in the peptide adjacent to the cleavage site (P1) and more distally (P2, P3, and P4) and served well for identification of selective inhibitors for different β subunits (25). However, positional scanning libraries of substrates or inhibitors are likely to miss interactions among amino acids at different positions. They are also likely to make it more challenging to identify differences in substrate preference between two highly homologous β subunits from different species that are both predominantly chymotryptic. To discern potentially subtle differences in substrate preferences, we needed to take into account the influence of the amino acid at each position on the preferences at each other position. We approached this goal by using a robotic microfluidic device to assess the specific activity of Mtb and bovine proteasomes toward a combinatorial library of 5,920 fluorescently tagged tripeptides, assaying each substrate individually. For comparison, we also tested recombinant proteasomes from another Actinomycete, Rhodococcus erythropolis. Finally, we validated our findings through kinetic analyses with 21 oligopeptide substrates and 18 P1 amino acid analogs of bortezomib.

MATERIALS AND METHODS
Overexpression and purification of recombinant Mtb proteasome, Mtb PrcAB-OG, and Rhodococcus 20S followed the reported method (14). Bovine RBC 20S, a generous gift of Dr. George DeMartino (University of Texas Southwestern Medical Center), was purified as described (26). Human RBC 20S was purchased from Boston Biochem (Cambridge, MA). The recombinant α subunit of the rat PA28 activator was purified as described (27,28). The concentrations of the proteasomes were calculated based on their molecular mass (~700 kDa); multiplicity of active sites was not taken into account. The ChemRX Protease Profiler library (29) was purchased from Discovery Partners International (South San Francisco, CA). The library was reconfigured from 96- to 384-well format. For assay, the substrate plates were prepared by mixing 1 µl of the 1 mM stock and 70 µl of microfluidics buffer (50 mM Tris, pH 7.8, 20 mM NaCl, 0.5 mM EDTA, 0.005% Triton X-100) in 384-well polypropylene plates, yielding a substrate concentration of 14 µM. On-chip dilution of substrate was 70%, for a final substrate concentration of 10 µM. Individual substrates for kinetic analysis were custom synthesized by AnaSpec (San Jose, CA). Suc-LLVY-AMC was from Bachem Biosciences (King of Prussia, PA). Z-VLR-AMC was from MD Biosciences, Inc. (St. Paul, MN). Bortezomib and its analogs were synthesized in-house (Millennium Pharmaceuticals Inc., Cambridge, MA).
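The substrate concentrations quoted above follow from simple dilution arithmetic, which can be sketched as below. This is an illustrative check only, not the authors' procedure; it assumes that "on-chip dilution of 70%" means the substrate reaches the reaction channel at 70% of its plate concentration, and the function name is hypothetical.

```python
def diluted_concentration_uM(stock_uM, stock_volume_ul, buffer_volume_ul):
    """Concentration after mixing a stock aliquot with buffer (C1*V1 = C2*V2)."""
    total_volume_ul = stock_volume_ul + buffer_volume_ul
    return stock_uM * stock_volume_ul / total_volume_ul

# 1 ul of 1 mM (1000 uM) stock mixed with 70 ul of microfluidics buffer.
plate_uM = diluted_concentration_uM(stock_uM=1000.0, stock_volume_ul=1.0,
                                    buffer_volume_ul=70.0)

# Assumption: the substrate is present on-chip at 70% of the plate concentration.
final_uM = plate_uM * 0.70

print(f"plate: {plate_uM:.1f} uM, on-chip: {final_uM:.1f} uM")
```

Under these assumptions the plate concentration is about 14 µM and the on-chip concentration about 10 µM, in line with the rounded values stated in the text.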
Caliper Chip Assay-A detailed description of the method will be published elsewhere. In brief, the Caliper 220 Drug Discovery System (Caliper Technologies, Mountain View, CA) was set up according to the manufacturer's instructions. Taking into account an on-chip dilution of 30%, the final protein concentrations of the Mtb20SWT, Mtb20SOG, and Rhod20S were 500, 250, and 24 nM, respectively. The final concentrations of bov20S and PA28α were 225 and 675 nM, respectively. The final concentration of each substrate was 10 µM. The screening was carried out in buffer (50 mM Tris, pH 7.8, 100 mM NaCl, 0.5 mM EDTA, 0.005% Triton X-100, 5 mM β-mercaptoethanol). The instrument reaction chamber was at 26°C with 60% humidity. An FS267 or TF460 microfluidics chip was loaded into a Caliper 220 or 250 robot. The run pressure was set to make the on-chip reaction time ~90 s for Mtb20SWT, ~60 s for Mtb20SOG and Rhod20S, and ~30 s for bov20S with PA28α. Relative fluorescence units were recorded (excitation, 355 nm; emission, 420-480 nm). The data were collected in less than 12 h by the Caliper LabChip HTS software. Fluorogenic activity was determined for each sample using Caliper HTS Well Analyzer software. Sipper-to-sipper variations in the raw data were normalized by using the Cy5 (Caliper Technologies, Mountain View, CA) marker (FS267) or AMC control as a reference (TF460). The background fluorescence of the unhydrolyzed substrate was detected in real time in a parallel channel lacking enzyme (TF460: Mtb20WT, Mtb20SOG, Rhod20S; see PA28α.
All of the reactions were done in 200-µl triplicate on a PolarStar Galaxy spectrofluorimeter plate reader (BMG Labtech, Durham, NC). Hydrolysis of AMC substrates was monitored fluorometrically with periodic shaking (excitation, 340 nm; emission, 460 nm). The linear ranges of the time course curves were used to calculate the initial reaction velocities. Raw fluorescence was converted to nM/min using an AMC standard curve.
The steady-state parameters kcat and Km were determined by nonlinear regression in Kaleidagraph (Synergy Software). Occasional outliers were omitted from the analysis, but no fewer than five concentrations were used. In some cases, only kcat/Km values could be obtained because of substrate inhibition and/or precipitation occurring at high concentrations. The error in the fit was less than 10% for bov20S and 15% for Mtb20SOG.
The plate was then spun at 1000 rpm for 1 min, and the progress of the reactions was monitored at 37°C in a BMG plate reader as above for the AMC substrates. IC50 curves were determined by nonlinear regression to the Hill equation in ActivityBase (IDBS, Burlington, MA). The human RBC 20S assays were performed as described for the Mtb20S proteasomes, with the following modifications. The 2× substrate solution was 30 mM Ac-WLA-AMC, 0.05% SDS activator in 20 mM HEPES, 0.5 mM EDTA, pH 7.5. The 2× enzyme solution was 3 nM human RBC proteasome in 20 mM HEPES, 0.5 mM EDTA, pH 7.5. The error in the fit was less than 10% for both hu20S and Mtb20SOG.
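As an illustration of how kcat and Km can be extracted from initial velocities, the sketch below fits the Michaelis-Menten equation by least squares. It is not the Kaleidagraph routine used in this study: it replaces a general nonlinear optimizer with a log-spaced scan over candidate Km values, solving for the best Vmax in closed form at each one. All data, the enzyme concentration, and the function names are hypothetical.

```python
import math

def michaelis_menten(s, vmax, km):
    """Initial velocity predicted by v = Vmax*[S] / (Km + [S])."""
    return vmax * s / (km + s)

def fit_michaelis_menten(substrate, velocity, n_grid=4000):
    """Least-squares fit: scan Km on a log grid; Vmax is linear, so solve it exactly."""
    lo = math.log(min(substrate) / 100.0)
    hi = math.log(max(substrate) * 100.0)
    best = None
    for i in range(n_grid + 1):
        km = math.exp(lo + (hi - lo) * i / n_grid)
        f = [s / (km + s) for s in substrate]
        # For a fixed Km the model is v = Vmax * f, so Vmax has a closed-form optimum.
        vmax = sum(v * fi for v, fi in zip(velocity, f)) / sum(fi * fi for fi in f)
        sse = sum((v - vmax * fi) ** 2 for v, fi in zip(velocity, f))
        if best is None or sse < best[0]:
            best = (sse, vmax, km)
    return best[1], best[2]

# Hypothetical noise-free data: Vmax = 100 nM/min, Km = 40 uM, six concentrations.
substrate_uM = [5.0, 10.0, 20.0, 40.0, 80.0, 160.0]
velocity_nM_min = [michaelis_menten(s, 100.0, 40.0) for s in substrate_uM]

vmax_fit, km_fit = fit_michaelis_menten(substrate_uM, velocity_nM_min)
enzyme_nM = 25.0                      # hypothetical enzyme concentration
kcat_fit = vmax_fit / enzyme_nM       # min^-1, per proteasome particle
print(f"Vmax ~ {vmax_fit:.1f} nM/min, Km ~ {km_fit:.1f} uM, kcat ~ {kcat_fit:.2f} min^-1")
```

Because proteasome concentrations here were calculated from the ~700-kDa molecular mass without counting active sites, a kcat obtained this way is per particle rather than per active site.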

RESULTS
Ac-P3P2P1-AMC Library and Assay Design-The design of the combinatorial Ac-P3P2P1-AMC library is depicted in Fig. 1b. In addition to the 20 amino acids commonly incorporated into polypeptides, two additional natural amino acids, L-citrulline and L-ornithine, were incorporated in the P1 position for a total of 22. 20 amino acids (omitting L-citrulline and L-cysteine) were incorporated into the P2 and P3 positions. The N terminus of P3 was capped with an acetyl group. This has the distinct advantage of minimizing the influence of the artificial blocking group, as compared with the aromatic Z or acidic succinyl (Suc) often used in peptides. Of the possible 8,800 tripeptides (22 P1 × 20 P2 × 20 P3), 5,920 were usable based on yield and purity. The average purity was >90% and not less than 85% for any individual substrate (29). A robotically controlled microfluidic chip facilitated the performance of the 5,920 reactions with each of four proteasome preparations (Fig. 1c): Mtb20SWT, Mtb20SOG, Rhod20S, and bov20S. Substrate concentrations were kept at 10 µM, equal to or a few-fold lower than the values of Km (10-200 µM) previously reported for tri/tetrapeptide substrates (24,30). Under these conditions, reaction rates likely reflected the specific activity (kcat/Km) in the absence of substrate saturation effects. Stream splitting and simultaneous fluorescence detection facilitated subtraction of the background fluorescence of each unhydrolyzed substrate (Fig. 1c).
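The statement that rates at 10 µM substrate approximate kcat/Km can be examined with a short calculation. The sketch below, which assumes simple Michaelis-Menten behavior and uses hypothetical function names, compares the true rate with the low-substrate linear approximation v/[E] = (kcat/Km)[S] across the reported Km range of 10-200 µM.

```python
def michaelis_menten_rate(s_uM, kcat_per_min, km_uM):
    """Turnover per enzyme (min^-1) from the full Michaelis-Menten equation."""
    return kcat_per_min * s_uM / (km_uM + s_uM)

def linear_rate(s_uM, kcat_per_min, km_uM):
    """Low-substrate approximation: rate = (kcat/Km) * [S]."""
    return (kcat_per_min / km_uM) * s_uM

def fraction_of_linear_rate(s_uM, km_uM):
    """True rate divided by the linear estimate; kcat cancels, leaving Km/(Km + [S])."""
    kcat = 1.0
    return michaelis_menten_rate(s_uM, kcat, km_uM) / linear_rate(s_uM, kcat, km_uM)

for km in (10.0, 50.0, 200.0):
    frac = fraction_of_linear_rate(10.0, km)
    print(f"Km = {km:5.0f} uM: observed rate is {frac:.2f} of the (kcat/Km)[S] estimate")
```

At the low end of the Km range the observed rate falls to about half the linear estimate, so screening rates understate kcat/Km most for the tightest-binding substrates; this is consistent with the hedged wording that rates "likely" reflect specific activity.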
Overall Intra- and Interspecies Comparisons-As reported with other substrates (14), Mtb20SWT was up to 20-fold less active than Mtb20SOG. In the initial set of experiments, we used twice as much Mtb20SWT as Mtb20SOG and an increased reaction time to evaluate the correlation of specific activities for different substrates between Mtb20SWT and Mtb20SOG. This allowed us to establish that Mtb20SOG could serve as an authentic reporter of the Mtb proteasome for subsequent interspecies comparisons. Thus, a plot of the activity of Mtb20SWT and Mtb20SOG for each substrate yielded the correlation coefficient R² = 0.78 (supplemental Fig. S1), indicating that the substrate preferences of the two Mtb proteasome preparations were highly similar.
FIGURE 1. a, the Mtb proteasome has seven identical β subunits, whereas eukaryotic proteasomes have seven different β subunits. Only three β subunits of eukaryotic proteasomes are active once their Thr1 active sites are exposed by autocatalytic removal of propeptide: β1, caspase-like; β2, trypsin-like; and β5, chymotrypsin-like. Eubacterial proteasomes are generally considered to be chymotrypsin-like when assayed with small peptide substrates. b, structure of the acetyl-P3-P2-P1-AMC substrate library. R1, R2, and R3 refer to the side chains of P1, P2, and P3 amino acids, respectively; and S1, S2, and S3 refer to the binding pockets of the proteasome for P1, P2, and P3 amino acid side chains, respectively. c, schematic illustration of the microfluidic TF460 assay system. Stream splitting and simultaneous fluorescence detection facilitated subtraction of the background fluorescence of each unhydrolyzed substrate; after substrate in buffer was pulled into the channel through vacuum, the stream was split 50:50 into parallel channels and mixed with enzyme and buffer, respectively. The final annotated data were obtained by subtracting background fluorescence of the unhydrolyzed substrate from that of the enzymatic reaction.
DECEMBER 5, 2008 • VOLUME 283 • NUMBER 49

In contrast, the correlation coefficients between results for Mtb20SOG and Rhod20S (R² = 0.11), Mtb20SOG and bov20S (R² = 0.03), and Rhod20S and bov20S (R² = 0.45) (supplemental Fig. S1) showed that the Mtb proteasome stood apart in its tripeptide substrate preferences from the preferences exhibited by proteasomes of the other two species.
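The pairwise comparisons above rest on a squared correlation coefficient computed over per-substrate activities. A minimal sketch of such a calculation follows; it assumes R² is the square of the Pearson correlation between two activity profiles measured on the same substrates, which the text does not state explicitly, and the toy numbers are invented for illustration.

```python
def r_squared(x, y):
    """Square of the Pearson correlation between two equally long activity profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return (sxy * sxy) / (sxx * syy)

# Toy activity profiles for four substrates (arbitrary units, not data from this study).
enzyme_a = [1.0, 2.0, 3.0, 4.0]
enzyme_b = [1.0, 3.0, 2.0, 4.0]

print(f"R^2 = {r_squared(enzyme_a, enzyme_b):.2f}")
```

A value near 1 indicates that two preparations rank and scale substrates alike, whereas a value near 0, as seen between Mtb20SOG and bov20S, indicates essentially unrelated preferences.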
Preferences of Mtb, Rhodococcus, and Bovine Proteasomes for Individual N-Ac-tripeptides-Our earlier biochemical characterization of the Mtb proteasome with Z-capped tripeptide and tetrapeptide substrates revealed multiple proteolytic activities with a preference for basic P1 residues (14). In contrast, with N-acetyl tripeptides, Mtb proteasomes demonstrated only chymotryptic activity (P1 = Phe, Leu, Ile, Trp, or Tyr) (Fig. 2, a and b), with the exception of a few tripeptides with P1 = His (Table 1). These observations confirmed that constituents at one position (in this case, the N terminus) could markedly affect the preference of the proteasome for particular side chains at a different position (in this case, P1). Among the N-acetyl tripeptides, Mtb20SOG was strongly biased toward P1 = Trp. The most preferred substrate, WQW, allowed a 2-fold higher specific activity than RWH, the most favored tripeptide that lacked P1 = Trp. Among the 30 most preferred substrates (top 0.5% of the library), 28 had P1 = Trp and 2 had P1 = His. At P2, Gln and Trp were the most favored amino acids. The S3 site appeared to be more accommodating, because most amino acid residues were represented at the P3 position in the top 1% of the most preferred substrates. However, acidic residues (Asp and Glu) were disfavored at P3. This might be explained by the presence of Asp30 at the bottom of the S3 pocket (25).
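The tally of P1 residues among the most preferred substrates can be reproduced in outline as follows. This is a sketch under stated assumptions: sequences are written P3-P2-P1 in one-letter code so that the last letter is P1, the activity values are invented toy numbers rather than screening data, and the function names are hypothetical.

```python
from collections import Counter

def top_substrates(activity_by_sequence, fraction):
    """Return the highest-activity (sequence, activity) pairs making up `fraction` of the set."""
    n_top = max(1, round(len(activity_by_sequence) * fraction))
    ranked = sorted(activity_by_sequence.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n_top]

def p1_counts(ranked):
    """Count P1 residues, taking P1 as the last letter of each P3-P2-P1 sequence."""
    return Counter(sequence[-1] for sequence, _ in ranked)

# Toy activities in arbitrary units (illustrative only, not values from the screen).
toy_activity = {
    "WQW": 100.0, "GWW": 80.0, "PWW": 70.0, "RWH": 50.0,
    "YQL": 5.0, "AAF": 3.0, "KKR": 1.0, "LLD": 0.0,
}

top_half = top_substrates(toy_activity, fraction=0.5)
print(p1_counts(top_half))

# For the real library, the top 0.5% of 5,920 substrates corresponds to 30 sequences.
print(round(5920 * 0.005))
```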
Rhod20S behaved as a typical chymotryptic proteasome (Fig. 2c), with a preference for hydrophobic and aromatic amino acids at P1 (Table 1). Unlike the Mtb proteasome, there was no strong bias toward one P1 amino acid. Also in contrast to the Mtb proteasome, Rhod20S displayed no activity for any substrates having P1 = His. Among the top 1% of the most preferred substrates for Rhod20S, most of the uncharged amino acids were found at P2 and P3.
As expected, bov20S displayed three protease-like activities (Fig. 2d). A preference at P1 for hydrophobic and aromatic amino acid residues reflected chymotryptic activity; for Arg and Lys, tryptic activity; and for Asp and Glu, caspase-like activity. For chymotryptic activity, the preferred tripeptide sequences were P1 = Leu/Phe/Tyr, P2 = Gln/Ser/Leu, and P3 = Tyr/Phe/Trp (Table 1). Of these, P1 = Leu and P3 = Tyr were the most favored, whereas the hydrophilic P1 = Asn and P3 = Orn/Asn were the least favored (supplemental Fig. S2). For tryptic activity, the preferred amino acids were P1 = Arg and P3 = Arg and Lys, with acidic residues strongly disfavored at P3. This differed from the profile of human 20S determined in a tetrapeptidyl 7-amino-4-carbamoylmethylcoumarin positional scanning library when P1 was fixed as Arg, where the preferred P3 was Glu and Lys, whereas P3 = Arg was disfavored (22,31). For caspase-like activity of bov20S, the P2 amino acids appeared to be more determinant than the P3 amino acids, because the preferred sequences were Ac-X(F/W/Y)(D/E)-AMC, whereas the most disfavored were Ac-(E/K/R)X(D/E)-AMC (supplemental Fig. S2). The preference of bov20S for basic residues at P3 of tryptic substrates (supplemental Fig. S2) may reflect the presence of Asp28 of the β2 chain at the bottom of the S3 pocket, which can form hydrogen bonds or salt bridges with protonated guanidino or amino groups (25). P2 amino acids were less of a determinant than P1 and P3.
Comparison of Preferences of Mtb and Bovine Proteasomes for Shared Substrates-To aid in comparison of large data sets, we performed hierarchical clustering analysis of substrate specificity for proteasomes from the three species, as illustrated in Fig. 3 for the comparison of Mtb20SOG and bov20S at P1 and P3. The clustergram revealed 12 subgroups with different pairs of residues at P1 and P3 that contain more preferred substrates for Mtb20SOG than for bov20S: Trp/Ala, Trp/Gly, Trp/Lys, Trp/Orn, Trp/Pro, Trp/Arg, Thr/Glu, Thr/Gln, Arg/Pro, Arg/Asp, Orn/His, and Orn/Asn. Because Mtb20SOG was only substantially active on substrates with P1 = Trp, we next focused on this subgroup of substrates in comparing preferences of Mtb20SOG and bov20S. The correlation between their activities yielded R² = 0.33 (not shown), indicative of a partial overlap in substrate preferences. However, two subgroups were much more favored by Mtb20SOG than by bov20S: those with sequences Ac-G-Xaa-W-AMC and Ac-P-Xaa-W-AMC (Fig. 4, a and b). The activity of Mtb20SOG on Ac-GXW-AMC appeared to depend on bulkiness and neutral hydrophilicity (Gln/Asn) of P2; the bulkier the P2 residue, the more preferable the substrate was for the Mtb proteasome. Although the effect of P2 amino acids on Ac-PXW-AMC was variable, Ac-PWW-AMC was a selective substrate for Mtb20SOG.
In a tetrapeptide inhibitor with a vinyl sulfone warhead, Ac-PRLN-VS, inclusion of P3 = Arg in combination with P1 = Asn conferred selectivity for human and yeast 20S β2 subunits over their β5 subunits (23). This was attributable to interaction between the guanidino group of Arg and the Asp28 of the β2 subunit of the proteasome of both species at the bottom of the S3 pocket (23,25), given that Asp28 is not conserved in β5 of human or yeast 20S. However, the homologous residue, Asp30, is conserved in the β subunits of the Mtb proteasome (15). As shown in Fig. 4 (c and d), comparison of the two subgroups of substrates with formulas Ac-(R/K)XW-AMC for Mtb20SOG and bov20S did not reveal a sharp difference in preferences. However, hydrophobic or aromatic P2 amino acids were favored by Mtb20SOG and basic P2 amino acids by bov20S.
Kinetics of Hydrolysis of Individual Substrates-For a better understanding of the kinetic behavior of these proteasomes, we undertook a Michaelis-Menten steady-state analysis of specificity with a group of 19 N-acetyl-tripeptidyl-AMC substrates, along with Z-VLR-AMC and Suc-LLVY-AMC. Of this set of 21 substrates, 16 would be conventionally considered chymotryptic, two tryptic, and three caspase-like substrates (Table 2). Although Z-VLR-AMC was a good substrate for Mtb proteasomes (14), activity of N-acetyl-VLR-AMC was undetectable. Thus, Z appears to contribute significantly to binding affinity.
In general, the steady-state kinetic analysis of individual, resynthesized substrates agreed well with the large scale substrate specificity assay conducted at a fixed concentration and time. With few exceptions, the Mtb proteasome was active only on chymotryptic substrates with P1 = Trp, whereas bov20S was active on all types of substrates. The Km values of all substrates for which activity was detected were between 10 and 180 µM for bov20S and similarly between 10 and 100 µM for Mtb proteasomes (Table 2). No kcat and Km could be estimated for a few substrates because of either precipitation or substrate inhibition at high concentrations before saturation was achieved. The kcat/Km values for these substrates were estimated as the slopes of the linear plots of each set of data. Almost all of the preferred substrates had similar Km values for Mtb20SWT as for Mtb20SOG, but with a 20-30-fold reduction of kcat (Mtb20SWT data not shown). The correlation of kcat/Km values of Mtb20SOG and Mtb20SWT yielded an R² of 0.82 (supplemental Fig. S3), which matched well with the R² of 0.78 for the correlation of results from high throughput screening of substrates between Mtb20SOG and Mtb20SWT. For the Mtb proteasome, the change of P1 = Trp to P1 = Leu in Ac-YQW-AMC resulted in a 66-fold decrease in kcat/Km, from 0.66 to 0.01 µM⁻¹ min⁻¹, underscoring the importance of a bulky, hydrophobic residue at P1.
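Where saturation could not be reached, kcat/Km was taken from the slope of the linear portion of the rate versus substrate plot. A minimal sketch of that estimate is given below; it assumes a least-squares line forced through the origin and a known enzyme concentration, neither of which is specified in the text, and it uses invented rate data. The final lines recompute the 66-fold change from the two kcat/Km values reported above.

```python
def kcat_over_km_from_slope(substrate_uM, velocity_nM_per_min, enzyme_nM):
    """Estimate kcat/Km (uM^-1 min^-1) from the initial linear region of v versus [S].

    Uses a least-squares line through the origin: slope = sum(s*v) / sum(s*s).
    """
    sum_sv = sum(s * v for s, v in zip(substrate_uM, velocity_nM_per_min))
    sum_ss = sum(s * s for s in substrate_uM)
    slope = sum_sv / sum_ss          # (nM/min) per uM of substrate
    return slope / enzyme_nM         # dividing by [E] in nM gives uM^-1 min^-1

def fold_change(reference, altered):
    """Ratio of a reference kcat/Km to the value after a substitution."""
    return reference / altered

# Hypothetical low-substrate data for 25 nM enzyme (illustrative, not measured values).
estimate = kcat_over_km_from_slope([2.0, 4.0, 6.0, 8.0], [25.0, 50.0, 75.0, 100.0], 25.0)
print(f"kcat/Km ~ {estimate:.2f} uM^-1 min^-1")

# Reported values for the Mtb proteasome: P1 = Trp (0.66) versus P1 = Leu (0.01).
print(f"fold decrease ~ {fold_change(0.66, 0.01):.0f}")
```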
For the bovine proteasome β5 subunit, hydrophobic and aromatic amino acid residues at P1 contributed to activity as substrates. To a lesser degree, activity scaled with increasing size of the residue at P2. For example, changing P2 from Ala to Phe (Ac-RAW-AMC to Ac-RFW-AMC) resulted in a 2.4-fold increase in the kcat/Km, from 0.26 to 0.63 µM⁻¹ min⁻¹. The corresponding impact on the Mtb proteasome was a 1.8-fold increase, from 0.79 to 1.4 µM⁻¹ min⁻¹. For the bovine proteasome, we confirmed that changing P3 from Tyr to Trp to yield Ac-WQW-AMC resulted in a 1.4-fold decrease in kcat/Km, from 5.12 to 3.7 µM⁻¹ min⁻¹. The result was opposite in the Mtb proteasome: a 2-fold increase, from 0.66 to 1.25 µM⁻¹ min⁻¹.
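The fold changes quoted in this paragraph can be recomputed directly from the reported kcat/Km pairs, as in the sketch below. The values are those given in the text; the labels and the helper name are added here for illustration.

```python
def fold_and_direction(before, after):
    """Return (fold, direction), with fold always expressed as a ratio >= 1."""
    if after >= before:
        return after / before, "increase"
    return before / after, "decrease"

# kcat/Km pairs (uM^-1 min^-1) as reported in the text: (before, after) substitution.
reported = {
    "bov20S, P2 Ala to Phe (Ac-RAW to Ac-RFW)": (0.26, 0.63),
    "Mtb, P2 Ala to Phe (Ac-RAW to Ac-RFW)": (0.79, 1.4),
    "bov20S, P3 Tyr to Trp (Ac-YQW to Ac-WQW)": (5.12, 3.7),
    "Mtb, P3 Tyr to Trp (Ac-YQW to Ac-WQW)": (0.66, 1.25),
}

for label, (before, after) in reported.items():
    fold, direction = fold_and_direction(before, after)
    print(f"{label}: {fold:.1f}-fold {direction}")
```

The last ratio evaluates to about 1.9, which the text rounds to 2-fold.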
The most selective of the 21 substrates whose kinetics were tested was Ac-LWW-AMC, similar to Ac-GWW, which ranked as the most selective substrate in the larger library (Fig. 4a). Ac-LWW-AMC showed a 6-fold difference in kcat/Km (0.09 µM⁻¹ min⁻¹ versus 0.55 µM⁻¹ min⁻¹) in favor of the Mtb proteasome versus bov20S β5. The Km of Ac-LWW-AMC was 9.1 µM against the Mtb proteasome, a 10-fold increase in binding affinity compared with other substrates with measurable Km values for Mtb proteasome.
We have previously reported that the Mtb20S possesses a broad specificity, primarily tryptic in nature, as judged by activity on Z-VLR-AMC, Suc-LLVY-AMC, and other canonical proteasome substrates (13). This finding is supported here, with kcat/Km for Z-VLR-AMC being 5-fold greater than for Suc-LLVY-AMC. However, the P1 = Trp substrates identified in the library screen are better than either Suc-LLVY-AMC (17.5-fold in kcat/Km versus Ac-RFW-AMC) or Z-VLR-AMC (3.3-fold). Although the balance is now shifted slightly in favor of chymotryptic activity, these results reinforce the unusual observation that the Mtb20S proteasome recognizes multiple substrate types in a single active site. This affords an opportunity for drug discovery.
Mtb versus Human Proteasome Inhibition by Bortezomib P1 Amino Acid Analogs-A selective substrate can sometimes be converted into a selective inhibitor. To test the results for substrate preferences, we used a library of 18 P1-amino acid analogs of bortezomib (analog 1), which inhibits the human proteasome β5 subunit with high potency (IC50 0.016 µM) and selectivity. The bortezomib analog library was divided into two groups: compounds containing aliphatic, branched side chains (analogs 2-9) and those containing Phe and substituted benzyl (analogs 10-18) (Table 3). We performed experiments with both Mtb proteasome preparations (wild type and open gate mutant) in comparison with human proteasomes purified from red blood cells and activated with PA28. We used both a chymotryptic substrate (Ac-RFW-AMC) and a tryptic substrate (Z-VLR-AMC) for the Mtb proteasomes and varied the concentration of inhibitors over the range of 5 nM to 100 µM. IC50 values were obtained by fitting the data to the Hill equation. The IC50 values of Mtb20SWT and Mtb20SOG were almost identical; only those of Mtb20SOG are shown. Moreover, the IC50 values of the inhibitors were generally the same with either substrate.
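To make the IC50 determination concrete, the sketch below fits a two-parameter Hill equation, fractional activity = 1 / (1 + ([I]/IC50)^h), to dose-response data. It is not the ActivityBase nonlinear regression used in the study: for simplicity it linearizes the Hill equation (a Hill plot) and applies ordinary least squares, and it assumes activities are normalized to the uninhibited control. The data and function names are hypothetical.

```python
import math

def hill_activity(inhibitor_uM, ic50_uM, hill):
    """Fractional residual activity at a given inhibitor concentration."""
    return 1.0 / (1.0 + (inhibitor_uM / ic50_uM) ** hill)

def fit_hill_linearized(inhibitor_uM, fractional_activity):
    """Fit IC50 and Hill slope from log10(1/a - 1) = h*log10[I] - h*log10(IC50)."""
    xs, ys = [], []
    for conc, act in zip(inhibitor_uM, fractional_activity):
        if 0.0 < act < 1.0:          # the transform is undefined at 0% or 100% activity
            xs.append(math.log10(conc))
            ys.append(math.log10(1.0 / act - 1.0))
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    hill = sxy / sxx
    intercept = mean_y - hill * mean_x
    ic50 = 10.0 ** (-intercept / hill)
    return ic50, hill

# Hypothetical noise-free titration spanning 5 nM to 100 uM (IC50 = 0.5 uM, h = 1.2).
doses_uM = [0.005, 0.05, 0.5, 5.0, 50.0, 100.0]
activities = [hill_activity(d, 0.5, 1.2) for d in doses_uM]

ic50_fit, hill_fit = fit_hill_linearized(doses_uM, activities)
print(f"IC50 ~ {ic50_fit:.3f} uM, Hill slope ~ {hill_fit:.2f}")
```

With noisy data the linearized fit weights points near 0% and 100% activity heavily, which is one reason direct nonlinear regression, as used in the study, is generally preferred.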
Among the inhibitors with aliphatic side chains, that with R = isobutyl (bortezomib) 1 was the most potent, being 3-fold more potent than the next, which had R = 2-butyl (isoleucine side chain) 8. The n-propyl 9, n-butyl 7, or isopentyl 4 side chains further increased IC50 values to 1.11-1.88 µM. Further reduction of the size of the side chain to n-ethyl 3 and methyl 5 raised the IC50 values to 8-44 µM, a 154-fold decrease of potency from bortezomib 1. The neopentyl side chain 2 eliminated inhibitory potency. In sum, substitution of P1 with a 4-carbon branched side chain improved the inhibitory potency the most, whereas extremely bulky substitution reduced it.
Consistent with P1 = Trp being a preferred substrate among the tripeptides for the Mtb proteasome, the aromatic side chain analogs of bortezomib were generally better inhibitors than those with aliphatic side chains (Table 3). The side chain R = phenyl of 10 appeared to be the least potent in the group, followed by m-CF3-benzyl 11. The replacement of m-CF3 11 by m-CH3 13 improved potency by 3-fold, whereas p-CH3

DISCUSSION
The Mtb proteasome is a nonredundant pathway by which Mtb protects itself against nitrosative stress and metabolic stringency in vitro (12), and it is essential for Mtb to survive in mice, even if they are immunodeficient (13). This information, combined with the ability of small, drug-like compounds to enter Mtb, inhibit its proteasome, and kill the bacterium (12), focuses interest on developing selective proteasome inhibitors as potential leads for chemotherapeutics. Such compounds might offer a range of features currently prized in potential new agents for treatment of tuberculosis: that they be new chemophores active against new targets and effective against nonreplicating Mtb. However, it remains a major hurdle to attain selectivity between β subunits that are predominantly chymotryptic in both pathogen and host.
Using a microfluidic assay, we analyzed the substrate specificity of Mtb, Rhodococcus, and bovine proteasomes with a library of 5,920 tripeptidyl-AMC substrates tested individually. When each was given a preferred substrate, the bovine proteasome was ~3-fold more active than the Rhodococcus protea-
The bovine proteasome demonstrated three distinct protease activities, as anticipated based on studies of all other eukaryotic proteasomes to date. The Rhodococcus proteasome demonstrated only chymotryptic activity, as reported for all prokaryotic proteasomes to date except that of Mtb (14) and consistent with our earlier results using Z-capped tripeptide or tetrapeptide substrates (14). Moreover, comparing large numbers of individual N-acetyl tripeptides having hydrophobic P1 residues, the chymotryptic substrate preferences of the bovine and the Rhodococcus proteasomes were similar (R² = 0.49).
In contrast, the Mtb proteasome stood apart from the other two in several respects. We previously reported that the Mtb proteasome displays multiple proteolytic activities with Z-capped tripeptide or tetrapeptide substrates (14). However, with acetyl tripeptide substrates, the Mtb proteasome showed only chymotryptic activity. Compared with the chymotryptic activity of bovine and Rhodococcus proteasomes, the chymotryptic activity of the Mtb proteasome was strikingly restricted to substrates with P1 = Trp. Moreover, P1 and P3 amino acids appeared to be the major determinants of substrate preference for the bovine and Rhodococcus proteasomes, whereas for the Mtb proteasome, only P1 had a major effect. The strong preference of the Mtb proteasome for P1 = Trp was reflected in the potency of bortezomib P1-variant analogs. As P1 amino acid analogs changed from Ala to m-Cl-Phe, the ratio of IC50 values between the human proteasome and the Mtb proteasome changed from 0.001 (analog 5) to 8.0 (analog 16), a remarkable 8000-fold shift in favor of the Mtb proteasome. However, the modest 8-fold selectivity for compound 16 in favor of the Mtb proteasome over the human proteasome suggests that much work needs to be done to improve the selectivity. Incorporating the selective preference of the Mtb proteasome for side chains at the S1 and S3 sites, together with other proven proteasome or protease warheads, may further improve the species selectivity toward the Mtb proteasome. Incorporating non-natural amino acids into the inhibitor design would also provide more avenues for improving species selectivity of proteasome inhibitors.
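The selectivity shift described above is a ratio of ratios, as the short sketch below makes explicit. It takes the two reported IC50 ratios as given; the function name is hypothetical, and the ratio is read here as IC50(human) divided by IC50(Mtb), so that values above 1 indicate greater potency against the Mtb proteasome.

```python
def selectivity_shift(ratio_before, ratio_after):
    """Fold change in the IC50(human)/IC50(Mtb) ratio between two analogs."""
    return ratio_after / ratio_before

# Reported IC50 ratios: P1 = Ala (analog 5) and P1 = m-Cl-Phe (analog 16).
ratio_ala = 0.001
ratio_m_cl_phe = 8.0

shift = selectivity_shift(ratio_ala, ratio_m_cl_phe)
print(f"shift in favor of the Mtb proteasome: about {shift:.0f}-fold")
print(f"net selectivity of analog 16 for Mtb: {ratio_m_cl_phe:.0f}-fold")
```

The calculation shows why a large shift can still leave modest absolute selectivity: the starting analog favored the human proteasome by about 1000-fold, so most of the 8000-fold change is spent reaching parity.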
In sum, we discovered differences in substrate preferences, along with the differential influence of substituents at different positions in the substrate, through the use of a ~6,000 acetyl tripeptide AMC substrate library, suggesting that the primary and extended sites at the active center of the Mtb proteasome are very different from those of mammalian proteasomes. A limitation of this study is that substrate screening was performed with cow rather than human proteasomes. However, the studies with bortezomib analogs involved human proteasomes and were confirmatory. Thus, the findings reported here may provide opportunities to design peptide-based inhibitors with a useful degree of selectivity for the Mtb proteasome over that of the human host.