Crystal Structure of the Peptidase Domain of Streptococcus ComA, a Bifunctional ATP-binding Cassette Transporter Involved in the Quorum-sensing Pathway*

ComA of Streptococcus is a member of the bacteriocin-associated ATP-binding cassette transporter family and is postulated to be responsible for both the processing of the propeptide ComC and secretion of the mature quorum-sensing signal. The 150-amino acid peptidase domain (PEP) of ComA specifically recognizes an extended region of ComC that is 15 amino acids in length. It has been proposed that an amphipathic α-helix formed by the N-terminal leader region of ComC, as well as the Gly-Gly motif at the cleavage site, is critical for the PEP-ComC interaction. To elucidate the substrate recognition mechanism, we determined the three-dimensional crystal structure of Streptococcus mutans PEP and then constructed models for the PEP·ComC complexes. PEP had an overall structure similar to the papain-like cysteine proteases as has long been predicted. The active site was located at the bottom of a narrow cleft, which is suitable for binding the Gly-Gly motif. Together with the results from mutational experiments, a shallow hydrophobic concave surface of PEP was proposed as a site that accommodates the N-terminal helix of ComC. This dual mode of substrate recognition would provide the small PEP domain with an extremely high substrate specificity.

The quorum-sensing system is a bacterial cell-cell signaling system mediated by an inherent signal molecule (pheromone) to properly respond to environmental changes and survive as a "community" (1). This signal pathway alters the gene expression profile of the target cells after a sufficient number of cells are accumulated to form the "quorum," and thereby the concentration of a released pheromone reaches the threshold to bind to either the cell surface or intracellular receptors (2,3). The quorum-sensing system is believed to regulate the diversities of the physiological functions in Gram-positive and Gram-negative bacteria, including bioluminescence (4), virulence factor expression (5), and competence for genetic transformation (6). In addition, many opportunistic pathogens form a biofilm in response to this system, which causes chronic infections. The biofilm complicates the clinical treatment because of its resistance to antibiotics (7). Oral streptococci such as Streptococcus mutans are not only cariogenic but also cause life-threatening infective endocarditis by forming a biofilm on the native or prosthetic heart valves. The biofilm formation has been observed to be regulated through quorum sensing in S. mutans (8), Streptococcus gordonii (9), and Streptococcus intermedius (10).
Although several quorum-sensing systems have already been reported, the most thoroughly described systems are the acylhomoserine lactone-mediated system of Gram-negative bacteria and the peptide-based signaling system of Gram-positive bacteria (11). Signal-producing enzymes play a key role in the first step of this pathway. In Gram-negative bacteria, the cytosolic LuxI-type protein, which was initially discovered in Vibrio fischeri, synthesizes the acyl-homoserine lactone, and the pheromone molecule is thought to be excreted by passive diffusion or membrane transporters. Extensive studies have been made on these proteins (12), and now their three-dimensional structures, including those of Pantoea stewartii EsaI (13), Pseudomonas aeruginosa LasI (14), and Vibrio cholerae CqsA (15), are available. In contrast, signal-producing proteins of Gram-positive bacteria are membrane-embedded proteins that are responsible for both the production and secretion of the peptide-based pheromone molecules. Thus, their biochemical and structural characterization has fallen far behind that of their counterparts in Gram-negative bacteria.
The mechanism of Gram-positive Streptococcus quorum sensing has been well studied in S. pneumoniae. The competence-stimulating peptide, which functions as a quorum-sensing signal in this bacterium, is cleaved from the precursor peptide ComC and concomitantly exported to the extracellular milieu by ComA with the help of the accessory protein, ComB. The accumulated competence-stimulating peptide binds to the extracellular domain of the ComD receptor, which subsequently phosphorylates the response regulator, ComE, by the histidine kinase activity of the cytosolic domain of ComD. Phosphorylated ComE up-regulates the transcription of the early genes such as comX and comW, leading to DNA uptake (competence) and recombination. Homologues of these com genes were found in other Streptococcus species (6,16).
ComA of Streptococcus is a member of the bacteriocin-associated ATP-binding cassette (ABC) transporter family, which comprises the three following domains: the N-terminal peptidase domain, a transmembrane domain consisting of six membrane-spanning segments, and a C-terminal ATP-binding domain located on the cytoplasmic face of the membrane (17). The peptidase domains of this family are thought to cleave the propeptides after the consensus Gly-Gly motif. To date, peptidase domains of this family, such as LagD, a transporter of lactococcin G in Lactococcus lactis (17), and CvaB, a transporter of colicin V in Escherichia coli (18), have been demonstrated to have proteolytic activity. Recently, we succeeded in the heterologous overexpression of the 150-amino acid peptidase domain (PEP) of ComA and the propeptide ComC from several species of Streptococcus as soluble proteins in E. coli (19,20). These advances enabled us to carry out a detailed biochemical analysis of the initial steps of the pheromone production in Streptococcus.
Based on sequence alignments, PEP is thought to belong to a papain-like cysteine protease family (17,21). The analysis with purified PEP and ComC from S. pneumoniae (PPEP and PComC, respectively) identified the essential Cys residue (19). Using combinations of PEPs and ComCs from the species of Streptococcus, together with the mutagenesis analyses of ComC, it was found that four conserved hydrophobic residues in the N-terminal leader region of ComC extending from the −15 to −4 positions, as well as the Gly-Gly motif at the cleavage site, are critical for the interaction between PEP and ComC (20). It was also suggested that ComC undergoes a structural transition from the random coil to helix upon binding to PEP and that the four conserved residues of ComC could form a hydrophobic face of the helix (20). Based on these findings, we have hypothesized that there is a large hydrophobic patch on the surface of the PEP protein, which interacts with the hydrophobic face of the N-terminal helix of ComC, and there is a narrow region in the vicinity of the active center, which interacts with the Gly-Gly motif of ComC.
We have now determined the crystal structure of PEP from S. mutans (MuPEP1) and further constructed models of the PEP·ComC complexes. Based on these results, a series of mutations was then introduced into the putative substrate-binding site of PEP for the kinetic analysis and CD measurement. These experimental results supported the model, which proposes a unique substrate recognition mechanism of PEP. The present study provides a first glimpse into the signal production mechanism in the quorum-sensing pathway of Gram-positive bacteria.

EXPERIMENTAL PROCEDURES
Protein Expression and Purification-For crystallization of both the native and selenomethionine-labeled MuPEP1, a methionine auxotrophic E. coli strain, B834 (DE3), carrying an expression plasmid for MuPEP1, pSMuP1 (20), was used. The E. coli cells were grown by shaking at 37°C in LB medium containing 50 μg/ml ampicillin. When the density of the cultures reached 1 × 10⁸ cells/ml, 10 ml of this culture was transferred to 1 liter of LeMaster medium (22), containing 10 g of lactose and 50 μg/ml ampicillin, preincubated at 37°C for 1 h. The 1-liter medium was supplemented with 25 mg of L-methionine or seleno-L-methionine for the native or selenomethionine-labeled MuPEP1, respectively. These cells were grown by shaking at 37°C for 23 h and then harvested by centrifugation and resuspended in 20 ml of a buffer containing 20 mM Tris-HCl, 500 mM NaCl, and 5 mM imidazole, pH 7.9. The cell suspension was frozen and stored at −80°C.
For the biochemical studies, the PEPs and ComCs were expressed and purified as previously described (20). Briefly, the wild-type and mutant PEPs were expressed in an E. coli strain, BL21 (DE3) pLysS, carrying each expression plasmid. The cells were grown and induced by 0.2 mM isopropyl-β-D-thiogalactopyranoside for 2 h at 37°C for the MuPEP1 or for 5 h at 30°C for the PPEPs. The PEPs were purified with His·Bind resin (Novagen) and dialyzed at 4°C against a buffer containing 20 mM Tris-HCl, 200 mM ammonium sulfate, and 2 mM β-mercaptoethanol, pH 7.0.
For the expression of the ComCs, an E. coli strain, JM109 (DE3) pLysS, carrying each expression plasmid was grown and induced by 0.2 mM isopropyl-β-D-thiogalactopyranoside for 1 h at 37°C for PComC or for 2 h at 37°C for Streptococcus cristatus ComC (CComC). The ComCs were purified with the His·Bind resin followed by chromatography with a Mono Q column connected to an ÄKTA fast protein liquid chromatography system (Amersham Biosciences). The concentrations of the PEPs and ComCs were spectrophotometrically determined based on the number of the aromatic amino acid residues (19).
Protein Crystallization and Data Collection-Crystals of the native MuPEP1 were obtained by the sitting drop, vapor diffusion method at 20°C (24-well Linbro plates were used). Drops were assembled with 2 μl of the 5.5 mg/ml protein, which was dissolved in 70 mM sodium phosphate and 2 mM β-mercaptoethanol, pH 7.0, mixed with an equal volume of the precipitant solution comprising 16% polyethylene glycol 3350 and 0.16 M di-ammonium hydrogen citrate (no buffer). Crystals were observed on day 5. The crystals were briefly soaked in the precipitant solution containing 24.5% polyethylene glycol 3350 and 0.12 M di-ammonium hydrogen citrate and then flash-frozen in a liquid nitrogen stream (−173°C). Crystals of the selenomethionine-labeled MuPEP1 were obtained under conditions similar to the native MuPEP1 with a protein concentration of 4.9 mg/ml. Multiple-wavelength anomalous dispersion data sets for the selenomethionine-labeled crystals and the single-wavelength data set for the native crystals were collected at the RIKEN Structural Genomics Beamline II BL26B2, SPring-8 (Hyogo, Japan) (23) by using a MarCCD 225 detector. The data sets were processed and scaled with the program package HKL2000 (24).
Structure Determination and Analysis-The initial phases were calculated by the program SHELX C/D/E implemented in the HKL2MAP interface (25) using the multiple-wavelength anomalous dispersion data sets. The space group of the crystal was determined to be P4₃2₁2. The resulting phases were improved with the program PHENIX (26), followed by automatic model tracing with the program ARP/wARP (27). Model refinement was performed with the programs XtalView/X-fit (28) and CNS (29). The model was subsequently refined against the native data set at a 1.9-Å resolution. The final model was validated using the program PROCHECK in the CCP4 suite (30). The refinement statistics are summarized in supplemental Table S1. Figures were drawn using the programs PyMOL (31) and VMD (32). The atomic coordinates and structure factors have been deposited in Protein Data Bank (www.rcsb.org/) under the accession code 3K8U.
Molecular Modeling-The initial models for the subsequent molecular dynamics (MD) simulations were constructed using Xfit (28) (see text for details). The protonation states of these models were checked using PDB2PQR (33) with PROPKA (34). The proteins were placed in a box of water molecules, and sodium ions were randomly added to the system to maintain electrical neutrality using VMD (32). This resulted in a total atom count of 24,433 with the dimensions of 67 × 60 × 63 Å³ for the MuPEP1·MuComC model, 20,480 with the dimensions of 57 × 61 × 64 Å³ for the PPEP model, and 30,511 with the dimensions of 69 × 67 × 68 Å³ for the PPEP-PComC model.
MD simulations were carried out using NAMD2.6 (35) and the CHARMM22 force field for the protein and ligands (36), along with the TIP3P model for water (37). A cutoff of 12 Å (switching function starting at 10 Å) for the van der Waals interactions and real space electrostatic interactions was assumed. Periodic boundary conditions were used. The particle-mesh Ewald method (38) was used to compute the long range electrostatic forces. The SHAKE algorithm was used to constrain all bonds between the hydrogen atoms and heavy atoms (39). An integration time step of 2 fs was used, thus permitting a multiple time-stepping algorithm (40) to be employed in which the interactions involving covalent bonds and short range non-bonded interactions were computed for every time step, and long range electrostatic forces were computed every two time steps. Langevin dynamics was utilized to maintain a constant temperature of 300 K in all the simulations. Constant pressure simulations at 1 atm were conducted using the Nosé-Hoover Langevin piston method (41) with a decay period of 100 fs and a damping time scale of 50 fs. After an initial conjugate gradient minimization for 500 iterations, the equilibrated system was simulated for 1 ns. Analysis of the computed trajectories was performed by VMD.
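The cutoff and time-stepping settings above can be illustrated with a short sketch. It assumes the CHARMM-style polynomial switching function for the van der Waals term, which is the form commonly used with this force field; the text states only the 10-Å and 12-Å distances, so the functional form, the function names, and the helper for step counts are illustrative assumptions, not the configuration actually used.

```python
def switching_factor(r, r_on=10.0, r_off=12.0):
    """Smoothly scale an interaction from 1 (r <= r_on) to 0 (r >= r_off).

    Assumed form: the CHARMM-style polynomial switch. The source states only
    that switching starts at 10 A and that the cutoff is 12 A.
    """
    if r <= r_on:
        return 1.0
    if r >= r_off:
        return 0.0
    num = (r_off**2 - r**2) ** 2 * (r_off**2 + 2.0 * r**2 - 3.0 * r_on**2)
    den = (r_off**2 - r_on**2) ** 3
    return num / den


def md_step_counts(total_ns=1.0, timestep_fs=2.0, long_range_every=2):
    """Return (integration steps, long-range electrostatics evaluations)."""
    n_steps = round(total_ns * 1.0e6 / timestep_fs)  # 1 ns = 1e6 fs
    return n_steps, n_steps // long_range_every


if __name__ == "__main__":
    for r in (9.0, 10.0, 11.0, 12.0):
        print(f"r = {r:4.1f} A  switch = {switching_factor(r):.3f}")
    print(md_step_counts())
```

With a 2-fs step, the 1-ns production run corresponds to half a million integration steps, and the long range electrostatic forces are evaluated on every second step.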
Site-directed Mutagenesis-Mutagenesis was done using a QuikChange II site-directed mutagenesis kit (Stratagene) according to the manufacturer's instructions. All the single mutations of PPEP were introduced into pSPP1 (19). For a double mutant, L52A/V55A PPEP, the V55A mutation was introduced into L52A PPEP. The primer pairs used for the mutagenesis are listed in supplemental Table S2. The nucleotide sequences of the entire coding regions were verified.
Determination of Kinetic Parameters-The PPEP activity was assayed in a 100-μl reaction mixture containing 50 mM Tris-HCl, 150 mM ammonium sulfate, pH 7.0, and various concentrations of the substrate PComC. High-performance liquid chromatography analysis was performed as previously described (19). Briefly, the reaction mixtures were loaded onto a Waters Bondasphere C8 reversed-phase column (Waters) connected to a Beckman System Gold high-performance liquid chromatography system (Beckman-Coulter), and the peptides were separated on a linear gradient from 10 to 55% (v/v) acetonitrile containing 0.1% trifluoroacetic acid over 10 min at the flow rate of 1 ml/min at ambient temperature. Typically, <10% of the substrates were consumed in the reactions.
CD Measurements-The CD spectra were recorded using a Jasco spectropolarimeter, model J-720WI (Jasco), equipped with a thermocontroller using a 0.1-cm light-path sample cell. The buffers and protein concentrations are described in the legend of Fig. 6.

RESULTS
Overall Structure of MuPEP1-After an initial crystallization trial for PPEP yielded no diffraction-quality crystals, MuPEP1 was chosen for the subsequent crystallization, because it showed the highest thermostability among the PEPs from six species of Streptococcus that we obtained (20). The structure of MuPEP1 was determined to a near-atomic resolution of 1.9 Å with an R_work of 21.2% (R_free = 23.4%), and the root-mean-square deviations (r.m.s.d.) from the ideal values in bond lengths and bond angles were 0.005 Å and 1.20°, respectively (supplemental Table S1). The final structure included residues 5-141 and 95 water molecules. No electron density was visible for the 4 N-terminal residues and the 15 C-terminal residues, including the His tag. The numbering of the MuPEP1 residues is based on the truncated form, in which the 44 N-terminal hydrophobic amino acid residues are removed from the native protein (20).
The structure of MuPEP1 has an α/β fold with six β-strands and five α-helices (Fig. 1A). The structure is organized around a central six-stranded antiparallel β-sheet with α-helices packing on both sides of the sheet to form two subdomains. The N-terminal subdomain, residues 5-62, consists of three α-helices, α1, α2, and α3. The C-terminal subdomain, residues 63-141, consists of six β-strands and two α-helices, and the order of these components is as follows: β1, α4, β2, β3, β4, β5, α5, and β6. The active site, containing the catalytic triad Cys-17, His-96, and Asp-112, is located in a narrow cleft at the interface of these two subdomains. Cys-17 and His-96 are at the N termini of α1 and β3, respectively, and Asp-112 is in a loop between β4 and β5 (Fig. 1A). The Sγ atom of Cys-17 is located 3.6 Å from the Nδ1 atom of His-96. The Oδ1 atom of Asp-112 forms a hydrogen bond with the Nε2 of His-96 at a distance of 2.7 Å (Fig. 1B).
Despite the lack of a significant sequence similarity, the crystal structures show a common structural scaffold between MuPEP1 and the papain-like cysteine proteases (Fig. 2A). The secondary structure elements of MuPEP1 are analogous to those of the papain-like cysteine proteases, including three helices in the N-terminal subdomain and most of a central β-sheet and a helix (α4) in the C-terminal subdomain (Fig. 2A). MuPEP1 has a Cα r.m.s.d. of 3.0 Å for 137 selected residues and of 2.7 Å for 133 selected residues when compared with papain and staphopain A, respectively, using the FATCAT server (42). In particular, the spatial arrangements of the catalytic triad and the oxyanion hole, Gln-11 of MuPEP1, are highly conserved. Superposition of the Cα atoms of these four catalytic residues of MuPEP1 on those of papain or staphopain A yielded an r.m.s.d. value of 0.59 Å in both cases (Fig. 2B).
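As a reminder of what the Cα r.m.s.d. values quoted above measure, the following is a minimal sketch of the calculation over matched atom pairs. It assumes that the two coordinate sets have already been superposed and paired (the alignment itself was done by the FATCAT server and is not reproduced here); the function name and the coordinates in the example are made up for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in A) between matched (x, y, z) atom pairs.

    Assumes both structures are already superposed and listed in the same
    order; no fitting or alignment is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Hypothetical coordinates for four matched C-alpha atoms.
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    shifted = [(x + 0.5, y, z) for x, y, z in reference]
    print(f"r.m.s.d. = {rmsd(reference, shifted):.2f} A")
```

Because the deviation is averaged over the selected pairs only, values computed over a handful of catalytic residues and over the whole chain are not directly comparable.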
Acyl-intermediate Model of MuPEP1·MuComC-To investigate the substrate recognition mechanism of PEP, we first attempted to express the recombinant ComC from S. mutans (MuComC), but it did not accumulate in E. coli as a soluble protein. Next, MuComC was chemically synthesized, but was found to be almost insoluble in aqueous buffers. A fusion protein of MuPEP1 (C17A)-MuComC with a six-glycine linker was then constructed, expressed as a soluble protein, and purified, but it did not yield any crystals. We further tried to obtain the complex structure of MuPEP1 (C17A, C17S, or H96A) with ComCs from other Streptococcus species by co-crystallization and soaking methods. Some combinations yielded crystals, but the diffraction data revealed no ComC molecules.
Thus, we set out to construct a computer model of the MuPEP1·MuComC complex. As the first step, the surface of the MuPEP1 protein was searched for a hydrophobic region, which is suitable for the interaction with the hydrophobic face of the putative N-terminal helix of ComC (20). The HotPatch program (43) was run in the mode that predicts protein-protein interactions based on the hydrophobicity of a protein surface. Six major hydrophobic patches were identified. After taking into account the general consensus for the substrate orientation in other papain-like proteases, only one shallow concave surface adjacent to the active site remained as a candidate for the binding site for the N-terminal helix of ComC (Fig. 3A). This concave surface is located at the subdomain interface and comprises a cluster of hydrophobic residues, such as Ala-51, Leu-52, Val-55, Leu-94, and Leu-134.
Next, we constructed the acyl-intermediate formed between MuPEP1 and MuComC, in which the nucleophilic Cys-17 of MuPEP1 is acylated by the carbonyl carbon of Gly-(−1) of MuComC. The cleaved C-terminal region of MuComC is omitted in the present model. The backbone structure from Ile-(−3) to Gly-(−1) of MuComC was fitted into the active-site cleft based on the structure of the ubiquitin C-terminal hydrolase in a complex with its inhibitor ubiquitin aldehyde (PDB ID code 1CMX), which also has the Gly-Gly sequence at the cleavage site and in which a covalent bond is formed between the carbonyl carbon of the C-terminal Gly and the Sγ atom of a nucleophilic Cys residue (44). The position of the backbone of Ile-(−4) was restricted by this fitting. Because the length of the putative helix of MuComC was unknown, the longest possible region from Met-(−25) to Ile-(−5) was initially forced to form a helix. This helix was then roughly docked on MuPEP1 at an angle consistent with the assumption that the conserved hydrophobic residues (Fig. 3B), Phe-(−15), Ile-(−12), and Leu-(−7), interact with the hydrophobic concave surface of MuPEP1. To connect these two regions of MuComC, the backbone structure from Leu-(−7) to Ile-(−4) was manually adjusted to yield a suitable conformation with distance and dihedral angle constraints considered. Finally, the MD simulations were performed at 300 K. During the MD simulation, the r.m.s.d. of the MuPEP1·MuComC complex increased to ~1.5 Å for the first 0.4 ns and reached a plateau phase continuing until 1.0 ns (supplemental Fig. S1). Hence, the structure obtained at 1.0 ns, after being subjected to the energy minimization, is shown in Fig. 3C.
In the model of Fig. 3C, the region of MuComC from −25 to −10 retains the α-helix conformation. The side chains of the conserved residues, Phe-(−15) and Ile-(−12), in this helix and Leu-(−7) in the following turn structure, are stably accommodated by the hydrophobic surface of MuPEP1. The backbone of MuComC bends in the region from Leu-(−7) to Ile-(−4), which is located at the inlet to the active site. The side chain of Ile-(−4) is tightly locked by the side chains of Gln-47 and Thr-50 of MuPEP1. On the other hand, the hydrophilic residues of MuComC, such as Lys-(−14), Glu-(−13), Lys-(−11), and Thr-(−10), are fully exposed to the solvent.
The Gly-Gly motif of MuComC fits into the narrow cleft of MuPEP1 in this model (Fig. 4). This cleft is located at the interface of the two subdomains described above. The left side wall of the cleft (in the left and right views of Fig. 4) is mainly composed of Lys-92 through Gln-95 in the loop between β2 and β3, and the right side wall is Ile-12 through Arg-15 in the N-terminal chain and Lys-46 through Gly-48 in the loop between α2 and α3. The active-site residues are located at, and comprise a part of, the bottom of the cleft. The side chain of Arg-93 closed the active site like a lid (shown in magenta in Fig. 3C) after the MD simulations. The Nη1 and Nη2 atoms of Arg-93 form indirect hydrogen bonds with the backbone carbonyl oxygen of Lys-46 via two different water molecules. The aliphatic portion of the side chain of Arg-93 makes van der Waals contacts with Gly-(−2) of MuComC. These interactions would partly suppress the fluctuation of the loop between β2 and β3 (a "dip" signal in supplemental Fig. S2A).
Experimental Verification of the Model-To corroborate the present model, mutations were introduced into the putative substrate-binding site of PPEP. The PPEP-PComC pair was chosen, because it is the only native pair that can be experimentally examined (20).
For this purpose, an acyl-intermediate model of PPEP-PComC was newly constructed. First, the PPEP model was constructed using the MuPEP1 structure as a template. The 137 residues of the MuPEP1 structure were replaced with the corresponding residues of PPEP, which shows 57% identity to MuPEP1 in the amino acid sequence alignment of these proteins (20); the MD simulations were then run, followed by energy minimization. The resultant PPEP model indicated that there is a hydrophobic concave surface analogous to that of MuPEP1 predicted by the HotPatch program (supplemental Fig. S3). The PPEP-PComC acyl-intermediate model was constructed according to the same procedures as described for the MuPEP1·MuComC modeling. Although the final model is largely comparable to the MuPEP1·MuComC model, the helix structure of PComC is partly unfolded during the MD simulation in the region between Leu-(−18) and Glu-(−17) (Fig. 5A). The r.m.s.d. positional fluctuations of the PComC residues relative to PPEP during the MD simulation suggest that the N-terminal region from Met-(−24) to Leu-(−18) of PComC is detached from PPEP (data not shown).
From the inspection of this model, Thr-50, Ala-51, Leu-52, Val-55, Ala-67, Leu-94, and Val-134 were picked for the mutational analysis, because each side chain of these residues was expected to make contact with one or two of the four conserved hydrophobic residues of PComC (Fig. 5). Also, these residues were predicted to be exposed to the solvent, and therefore, mutagenesis on these residues would minimally affect the overall structure of PPEP. Table 1 shows the kinetic parameters of the mutant PPEPs for PComC. The catalytic efficiencies decreased by 1.8- to 16-fold upon the mutations of T50S, L52A, V55A, L94A, and V134A. A double mutant, L52A/V55A PPEP, showed a significant decrease in the affinity of PPEP for PComC (16-fold increase in Km), whereas the catalytic rate for this mutant only slightly decreased. In contrast, the mutations of A51W and A67W, which are expected to introduce steric hindrance into the putative substrate-binding site of PPEP, severely impair the catalytic efficiency (1400- and 110-fold, respectively) by reducing both the affinity and the catalytic rate for ComC.
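The relation between the fold changes reported here can be made explicit with a small sketch. Because the catalytic efficiency is kcat/Km, a fold decrease in efficiency is the product of the fold increase in Km and the fold decrease in kcat. The example combines the 16-fold increase in Km stated above with the 27-fold decrease in kcat/Km quoted for the L52A/V55A mutant in the Discussion; the individual values of Table 1 are not reproduced in the text, so the resulting change in kcat is an inferred figure, and the function names are illustrative.

```python
def efficiency_fold_decrease(km_fold_increase, kcat_fold_decrease):
    """Fold decrease in kcat/Km implied by separate changes in Km and kcat."""
    return km_fold_increase * kcat_fold_decrease


def implied_kcat_fold_decrease(efficiency_fold, km_fold_increase):
    """Fold decrease in kcat implied by an efficiency loss and a Km increase."""
    if km_fold_increase <= 0:
        raise ValueError("fold changes must be positive")
    return efficiency_fold / km_fold_increase


if __name__ == "__main__":
    # L52A/V55A PPEP: 27-fold lower kcat/Km, 16-fold higher Km (from the text).
    kcat_change = implied_kcat_fold_decrease(27.0, 16.0)
    print(f"implied decrease in kcat: {kcat_change:.1f}-fold")
```

Under these numbers the inferred decrease in kcat is less than 2-fold, in line with the statement that the catalytic rate of the double mutant only slightly decreased.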
We have demonstrated that the structural transition of ComC upon the interaction with PEP can be observed by CD measurements (20). To examine how the steric hindrance caused by the A51W and A67W mutations affects the binding and helix formation of ComC, the CD spectra of ComC were measured in the absence and presence of PEP bearing the A51W or A67W mutation. These mutations were introduced into C17A PPEP (CAPEP), because CAPEP has lost its catalytic activity and cannot cleave PComC (19). CComC was chosen because of its lowest Km value for PPEP among the examined ComCs (20). The CD spectrum of 20 μM CComC in the aqueous buffer (Fig. 6, blue line) and that in the same buffer containing 35% trifluoroethanol (TFE) (black line) exhibit typical random coil and α-helix structures, respectively, as previously shown (20).

DISCUSSION
The assumption based on primary structures that the peptidase domain of the bacteriocin-associated ABC transporters is a papain-like cysteine protease (17) has been fully supported by the three-dimensional structure of MuPEP1 in the present study. C17S, C17A, and H96A PPEPs (19) and C17A MuPEP1 exhibited complete loss of activity, indicating the essential role of the Cys residue in the catalysis. The Sγ atom of Cys-17 is located 3.6 Å from Nδ1 of His-96, and the distance is slightly longer than the ideal distance of a hydrogen bond, 2.6-3.5 Å. The distances between the corresponding atoms of Cys and His of papain and staphopain A were also reported to be 3.7 Å and 4.0 Å, respectively (Fig. 2B). The pKa values of Cys-17, His-96, and Asp-112 of MuPEP1 were predicted to be 3.5, 12, and 0.7, respectively, by the PROPKA program (34). Therefore, as proposed in other papain-like cysteine proteases (45,46), Cys-17 and His-96 would form a thiolate-imidazolium ion pair under physiological conditions. Asp-112, which is within hydrogen-bonding distance of His-96, would function to stabilize the spatial orientation and protonated form of His-96. Thus, MuPEP1 has all the structural components required for the catalytic mechanism of a cysteine protease.
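The step from the predicted pKa values to the thiolate-imidazolium ion pair can be checked with the Henderson-Hasselbalch relation. The sketch below treats each residue as an independent titratable site and assumes pH 7.0 (the pH of the assay buffer) as a stand-in for physiological conditions; both are simplifying assumptions, and the function names are illustrative.

```python
def fraction_deprotonated(pka, ph):
    """Fraction of a titratable group in its deprotonated form at a given pH.

    Henderson-Hasselbalch for an isolated site; coupling between neighboring
    residues is ignored (simplifying assumption).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def fraction_protonated(pka, ph):
    """Fraction of the same group in its protonated form."""
    return 1.0 - fraction_deprotonated(pka, ph)


if __name__ == "__main__":
    ph = 7.0  # assumed; taken from the assay buffer described in the text
    print(f"Cys-17 thiolate fraction:    {fraction_deprotonated(3.5, ph):.4f}")
    print(f"His-96 imidazolium fraction: {fraction_protonated(12.0, ph):.5f}")
    print(f"Asp-112 carboxylate fraction: {fraction_deprotonated(0.7, ph):.6f}")
```

With the predicted values, more than 99.9% of Cys-17 would be deprotonated and more than 99.9% of His-96 protonated at neutral pH, which is what the ion-pair description requires.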
Despite the overall structural resemblance to other protein-degrading cysteine proteases (Fig. 2A), the PEPs exhibit a peculiar substrate specificity, which strictly discriminates ComCs from other proteins (19,20). Our model of the PEP·ComC complexes would provide the structural basis for the substrate recognition of the PEPs. Consistent with our prediction, two distinct regions were found on the surface of MuPEP1 to interact with the ComCs. One of these regions is the active-site cleft, which accommodates the consensus Gly-Gly motif of the ComCs (Fig. 4). When either Gly-(−1) or Gly-(−2) of MuComC in our model is mutated to Ala, the side-chain methyl group clashes with the side walls of the cleft: Ala-(−1) clashes with the main-chain atoms of Thr-14, and Ala-(−2) clashes with those of Gln-95. Previous studies showed that even a minimum change (Gly to Ala) at these two sites of PComC resulted in an 800-fold decrease in the catalytic efficiency of PPEP (19). In addition, PPEP was not sensitive to the typical peptide-mimetic inhibitors, such as antipain, leupeptin, and E-64, which broadly inhibit the cysteine proteases (19). These findings support the restricted geometry at the S1 and S2 sites of PEPs that only the Gly-Gly motif can fit.
The other region, a hydrophobic surface adjacent to the active-site cleft (Fig. 3A and supplemental Fig. S3), would interact with the conserved hydrophobic residues of ComCs at the positions of −15, −12, −7, and −4. Although the entire region of this N-terminal leader sequence of ComC was postulated to form an α-helix (Fig. 3B) in the starting structure, the present model of the PEP·ComC complexes showed that the region from −7 to −4 is in a random coil or turn structure (Figs. 3C or 5A, respectively). The hydrophobic surface is slightly concave in shape, and the residues comprising the surface, such as Thr at position 50, Ala at 51, Leu at 52, Val at 55, His at 87, Leu at 94, Thr at 132, Leu/Val at 134, and Ile/Leu at 136, are highly conserved among the Streptococcus PEPs (20). The importance of this concave surface was examined by analyzing the effects of the mutations introduced into this region of PPEP on its catalytic activity (Table 1). The decreases in the catalytic efficiencies by the mutations at the positions of 50, 52, 55, 94, and 134 were rather moderate (1.8- to 16-fold) compared with those (31- to 180-fold) observed in the reactions of PPEP with the mutant PComCs, Phe-(−15) → Ala, Leu-(−12) → Ala, Leu-(−7) → Ala, and Ile-(−4) → Ala (20). This was expected because each of the four conserved residues of ComC would interact with several residues of PEP. For example, the side chain of Phe-(−15) of PComC is supposed to make contacts with Ala-51, Leu-52, Val-55, and Ala-67 of PPEP (Fig. 5B), and although the Phe-(−15) to Ala mutation of ComC removes all these interactions, the Leu-52 to Ala mutation of PEP removes only part of them. Indeed, the affinity of PPEP for PComC was reduced in an additive manner by the double mutations at positions 52 and 55. The kcat/Km value of PPEP for Phe-(−15) → Ala PComC decreased by 110-fold (20), and that of L52A/V55A PPEP for PComC decreased by 27-fold.
Thus, the contribution of Leu-52 and Val-55 of PPEP to the stabilization of the transition state is estimated to be 70% (RT ln 27/RT ln 110) of the total effect attributed to the interaction of Phe-(−15) of PComC and residues of PPEP. To impair the activity more severely with the fewest mutations, and thus to show more clearly that this hydrophobic region is the substrate-binding site, steric hindrance was introduced into this region. As expected, the A51W and A67W mutations severely decreased the catalytic efficiency, and the effect of the mutation of Ala-51, which is located at the center of this putative binding site, was larger than that of Ala-67 located at the periphery. The A51W mutation was also introduced into MuPEP1, resulting in almost complete loss of the activity, which implies the conserved importance of this region in the Streptococcus PEPs. These results are well correlated with the impairment in the helical transition of ComC in the presence of A51W CAPEP and A67W CAPEP (Fig. 6). These findings further support the idea that the N-terminal leader region of ComC undergoes a structural transition from the random coil to helix upon binding to the hydrophobic region of PEP and that this is a prerequisite step for the PEP·ComC complex to take the productive form. The A51W and A67W mutations resulted in large decreases in the kcat value, not only increases in the Km value. Together with the fact that PEPs do not cleave proteins with only the Gly-Gly sequence, it is thought that the substrate backbone flanking Gly-Gly bears a strain around the cleavage site in the ground state (the E·S complex). This strain would be released in the transition state, thereby lowering the activation energy and increasing the kcat. Apparently, a large binding energy between the hydrophobic regions of PEP and ComC described above would be necessary to cause the strain.
However, this interaction is not enough by itself, and the C-terminal region must also be tightly held by binding to PEP and/or other parts of ComA to realize the strain around the cleavage site. To clarify this, the models for the ground state and the tetrahedral intermediate, a metastable structure near the transition state, are required. The acyl-intermediate structure shown here should be different from the structures of both the ground state and the tetrahedral intermediate. In fact, the acyl carbon oxygen does not point to the putative oxyanion hole Gln-11 in the models. Additionally, because the cleaved C-terminal region, the mature signal molecule, must be transferred to the transmembrane domain to be excreted without leaking into the cytoplasm, exploring the binding site for the C-terminal region of ComC would be very important for the biological function of ComA.
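The estimate of the transition-state stabilization given earlier in this discussion (RT ln 27/RT ln 110) follows from converting each fold decrease in kcat/Km into a free energy, ΔΔG = RT ln(fold). A minimal sketch of this arithmetic is given below; the temperature of 298.15 K is an assumption made only to express the energies in kJ/mol (the reaction temperature is not restated here), and the function names are illustrative. The ratio itself does not depend on the temperature.

```python
import math

GAS_CONSTANT_KJ = 8.314462618e-3  # kJ mol^-1 K^-1


def ddg_from_fold(fold_decrease, temperature_k=298.15):
    """Free-energy change (kJ/mol) corresponding to a fold decrease in kcat/Km.

    temperature_k is an assumed value used only for the absolute scale.
    """
    if fold_decrease <= 0:
        raise ValueError("fold decrease must be positive")
    return GAS_CONSTANT_KJ * temperature_k * math.log(fold_decrease)


def fractional_contribution(fold_partial, fold_total):
    """Share of the total effect, RT ln(partial) / RT ln(total)."""
    return math.log(fold_partial) / math.log(fold_total)


if __name__ == "__main__":
    ddg_enzyme = ddg_from_fold(27.0)      # L52A/V55A PPEP with PComC
    ddg_substrate = ddg_from_fold(110.0)  # PPEP with Phe-(-15) -> Ala PComC
    share = fractional_contribution(27.0, 110.0)
    print(f"ddG (L52A/V55A):     {ddg_enzyme:.2f} kJ/mol")
    print(f"ddG (Phe-(-15)Ala):  {ddg_substrate:.2f} kJ/mol")
    print(f"fraction of total:   {share:.2f}")
```

Expressing the comparison on a logarithmic scale is what allows the enzyme-side and substrate-side mutations to be compared as energies rather than as raw rate ratios.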
The present study also proposes a model that explains how a small protease like PEP can specifically recognize an amino acid residue of the substrate that is 15 residues apart from the cleavage site. In the MuPEP1·MuComC model, the Cα atom of Phe-(−15) of ComC is ~20 Å away from that of Gly-(−1) (Fig. 7). To the best of our knowledge, PEP specifically recognizes the most extended region of the substrate of any known protease. The residues comprising the hydrophobic concave surface of the PEPs, except for the residue at position 55, are also conserved in the family of the bacteriocin-associated ABC transporters, implying that these transporters share a common substrate recognition mechanism with ComA. Therefore, the present results would provide a prototypical model for studying the peptidase domains of the other members of bacteriocin-associated ABC transporters. Because the family of the bacteriocin-associated ABC transporters has so far been found only in prokaryotes, the PEP domains of ComAs would be an ideal target for the development of drugs that inhibit the biofilm formation of Streptococcus, and the structure of MuPEP1 might help this development.