Distinction between Major and Minor Bacillus Signal Peptidases Based on Phylogenetic and Structural Criteria*

The processing of secretory preproteins by signal peptidases (SPases) is essential for cell viability. As previously shown for Bacillus subtilis, only certain SPases of organisms containing multiple paralogous SPases are essential. This allows a distinction between SPases that are of major and minor importance for cell viability. Notably, the functional difference between major and minor SPases is not reflected clearly in sequence alignments. Here, we have successfully used molecular phylogeny to predict major and minor SPases. The results were verified with SPases from various bacilli. As predicted, the latter enzymes behaved as major or minor SPases when expressed in B. subtilis. Strikingly, molecular modeling indicated that the active site geometry is not a critical parameter for the classification of major and minor Bacillus SPases. Even though the substrate binding site of the minor SPase SipV is smaller than that of other known SPases, SipV could be converted into a major SPase without changing this site. Instead, replacement of amino-terminal residues of SipV with corresponding residues of the major SPase SipS was sufficient for conversion of SipV into a major SPase. This suggests that differences between major and minor SPases are based on activities other than substrate cleavage site selection.

Signal peptidases (SPases) play a key role in the transport of proteins across membranes in all living organisms. The type I SPases are integral membrane proteins that remove signal peptides from preproteins during or shortly after translocation across the cytoplasmic membrane, thereby releasing the mature proteins from the trans side of the membrane (for reviews, see Refs. 1 and 2).
In recent years, type I SPases from many different organisms have been identified. Comparison of these SPases showed that they can be divided into two subfamilies: P (prokaryotic)-type and ER (endoplasmic reticulum)-type SPases (3). The P-type SPases are found in eubacteria and organelles of eukaryotes. In contrast, the ER-type SPases are typical for the endoplasmic reticular membrane. Strikingly, a few ER-type SPases were shown to be present in sporulating Gram-positive eubacteria such as Bacillus subtilis (3). In fact, B. subtilis was the first eubacterium in which the presence of both P- and ER-type SPases was demonstrated. With respect to SPases, B. subtilis is exceptional not only because it contains both P- and ER-type SPases but also because it is the organism with the largest known number of type I SPases. These include the chromosomally encoded P-type SPases SipS, SipT, SipU, and SipV, the plasmid-encoded P-type SPases SipP1015 and SipP1040, and the chromosomally encoded ER-type SPase SipW (3-8). These observations suggested that at least some of the SPases of B. subtilis have specialized functions. Indeed, it was shown recently that SipS, SipT, and SipP1015 are of major importance for the secretion of degradative enzymes and cell viability, whereas SipU, SipV, and SipW have only a minor role in protein secretion and are probably involved in specific nonessential processes (3,8). For example, SipW is specifically required for the processing of two precursors, pre-TasA and pre-YqxM, but not for cell viability (9,10). Because the presence of a single major SPase (i.e. SipS, SipT, or SipP1015) is sufficient for the growth and cell viability of B. subtilis, it seems that the secretory precursor processing machinery of this organism is functionally redundant (3,8).
In addition to an amino-terminal membrane anchor domain (A), all P-type SPases contain four well conserved domains (B-E) (2). These conserved domains include residues involved in substrate recognition and catalysis. Specifically, domain B contains a strictly conserved Ser residue, and domain D contains a strictly conserved Lys residue. Together, these residues form a Ser-Lys catalytic dyad (11). The domains B-E of the P-type SPases are conserved in the ER-type SPases. Nevertheless, instead of a Lys residue, domain D of the ER-type SPases contains a strictly conserved His residue, which is required for catalysis. At present, it is not known whether this His residue is part of a Ser-His catalytic dyad or a Ser-His-Asp catalytic triad as described for the classic serine proteases (12,13).
The distinction between P-type and ER-type SPases can readily be made on the basis of the conserved Lys or His residues in domain D. In contrast, it is not presently clear which properties determine whether a type I SPase is a major or a minor SPase of B. subtilis. A clear definition of these properties is important to understand the molecular basis for SPase substrate specificity. Therefore, the present studies were aimed at the characterization of differences between major and minor Bacillus SPases and the identification of domains in these enzymes that are critical for their specificity. The results show that major and minor Bacillus SPases can be distinguished by phylogenetic analyses and that critical information for their role in cell viability is provided by residues that are located amino-terminally of the catalytic Ser residue. Strikingly, molecular modeling of the active site of major and minor P-type SPases of B. subtilis suggests that the active site cleft of the minor SPase SipV is significantly smaller than those of the other known Bacillus P-type SPases. Nevertheless, this difference cannot explain why SipV is a minor SPase.

Plasmids, Bacterial Strains, and Media-Table I lists the plasmids and bacterial strains used. Tryptone/yeast-extract medium contained Bacto tryptone (1%), Bacto yeast extract (0.5%), and NaCl (1%). If required, the media for Escherichia coli were supplemented with ampicillin (50 μg/ml) or kanamycin (20 μg/ml); the media for B. subtilis were supplemented with kanamycin (20 μg/ml) or chloramphenicol (Cm) (5 μg/ml).
Evolutionary Tree Computations-Amino acid sequences of Bacillus SPases were collected from the SubtiList and GenBank databases. Alignments were performed with ClustalX software (14) using the "Gonnet 250" and "Gonnet series" matrices as the pairwise alignment parameters and multiple alignment parameters, respectively. Default gap opening and extension parameters were applied. When the SPase of E. coli (Lep) was included in the alignments using the same parameters, the aligned sequences showed highly congruent areas that correspond to α-helices, β-strands, and previously defined conserved domains (D) of Lep and other type I SPases (2,11). Therefore, the complete data set, predicted α-helices, β-strands, and conserved domains were used in the phylogenetic analyses. Autapomorphic insertions or deletions were removed from all data sets. Tree reconstructions were performed according to two different methods. First, the maximum likelihood method was used as implemented in the PUZZLE 4.02 software (15). The variable time matrix was applied with four γ rates. One thousand replications were used to calculate the quartet puzzling values. Second, the maximum parsimony (MP) method was used as implemented in the program PAUP 4.03b. The MP tree reconstruction was done with the branch-swapping/tree-bisection-reconnection algorithm by applying 10 random additions of sequences. One thousand replications were used to calculate the bootstrap values.
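To make the parsimony criterion concrete, the following minimal sketch counts the minimum number of character-state changes ("steps") that a fixed rooted binary tree requires for an alignment, using Fitch's algorithm. It is an illustration only and not the PAUP implementation: the taxon names A-D, the three-column toy alignment, and the two candidate topologies are hypothetical, and the heuristic tree search (branch swapping, random addition of sequences) and the bootstrap are not reproduced.

```python
from typing import Dict, Set, Tuple, Union

# A tree is either a leaf name or a pair of subtrees (rooted, binary).
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_steps(tree: Tree, states: Dict[str, str]) -> int:
    """Minimum number of state changes for one alignment column (Fitch)."""

    def visit(node: Tree) -> Tuple[Set[str], int]:
        if isinstance(node, str):          # leaf: its observed residue
            return {states[node]}, 0
        left, right = node
        lset, lsteps = visit(left)
        rset, rsteps = visit(right)
        common = lset & rset
        if common:                         # children agree: no extra step
            return common, lsteps + rsteps
        return lset | rset, lsteps + rsteps + 1  # disagreement costs one step

    return visit(tree)[1]


def tree_length(tree: Tree, alignment: Dict[str, str]) -> int:
    """Sum of Fitch steps over all columns of an ungapped alignment."""
    n_columns = len(next(iter(alignment.values())))
    return sum(
        fitch_steps(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(n_columns)
    )


# Hypothetical toy data (not taken from the SPase alignments).
toy_alignment = {"A": "SKV", "B": "SKV", "C": "SHL", "D": "SHI"}
tree_1: Tree = (("A", "B"), ("C", "D"))
tree_2: Tree = (("A", "C"), ("B", "D"))

print(tree_length(tree_1, toy_alignment))  # 3 steps
print(tree_length(tree_2, toy_alignment))  # 4 steps
```

Under the parsimony criterion the topology with fewer steps (here tree_1) is preferred; a full analysis searches over many topologies and reports all trees of minimal length.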
DNA Techniques-Procedures for DNA purification, restriction, ligation, agarose gel electrophoresis, and the transformation of E. coli were carried out as described (16). Enzymes were from Roche Molecular Biochemicals. The polymerase chain reaction (PCR) was carried out with Vent DNA polymerase (New England Biolabs) as described (17).
DNA and protein sequences were analyzed with the PCGene Analysis program (version 6.7, Intelligenetics, Inc.) and ClustalW (version 1.74) (18). B. subtilis was transformed as described (3). Correct integration of plasmids or resistance markers into the chromosome of B. subtilis was verified by Southern blotting or PCR.
The plasmid pGDL90, specifying Sip (Bli) from Bacillus licheniformis, was constructed by ligating an EcoRI- and SalI-cleaved PCR-amplified fragment of sip (Bli) into the corresponding sites of pGDL48. The sip (Bli)-specific fragment was amplified with primers Lbl1 (5′-ACGCGTCGACTATGCTGTGACAGACTG-3′) and Lbl2 (5′-CGGAATTCGCAGTGCTGGCATCA-3′) using B. licheniformis chromosomal DNA as a template. Plasmid pM0V, specifying a hybrid of SipS and SipV from B. subtilis, was constructed by ligating an EcoRI- and SalI-cleaved PCR-amplified fragment of sipV into the corresponding sites of pM0. The sipV-specific fragment was amplified with primers JBV1 (5′-TGTCGTCGACGGTGACAGTATGAACCCGACCTTCC-3′) and JBV2 (5′-CGGAATTCGCTAGCGACGCCTCTTCAATTAGCA-3′) using B. subtilis chromosomal DNA as a template. Note that primer JBV1 is designed such that the resulting SipSV fusion protein contains the amino-terminal fragment of SipS (residues 1-43) that includes the active site Ser-43 residue fused to the carboxyl-terminal fragment starting at the corresponding position in SipV (residues 34-168).
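As a quick consistency check on the cloning strategy, the sketch below scans the four primers for the EcoRI (GAATTC) and SalI (GTCGAC) recognition sequences. It assumes that the hyphens inside the printed primer sequences are line-break artifacts and can be dropped; the helper name find_sites is hypothetical. Because both recognition sequences are palindromic, scanning one strand is sufficient.

```python
from typing import Dict

# Recognition sequences of the two enzymes used for cloning.
SITES = {"EcoRI": "GAATTC", "SalI": "GTCGAC"}

# Primer sequences as given in the text (5' to 3'), with internal hyphens
# removed on the assumption that they are typesetting artifacts.
PRIMERS = {
    "Lbl1": "ACGCGTCGACTATGCTGTGACAGACTG",
    "Lbl2": "CGGAATTCGCAGTGCTGGCATCA",
    "JBV1": "TGTCGTCGACGGTGACAGTATGAACCCGACCTTCC",
    "JBV2": "CGGAATTCGCTAGCGACGCCTCTTCAATTAGCA",
}


def find_sites(primer: str) -> Dict[str, int]:
    """Return the 0-based offset of the first occurrence of each site found."""
    primer = primer.upper()
    return {
        enzyme: primer.find(site)
        for enzyme, site in SITES.items()
        if site in primer
    }


for name, sequence in PRIMERS.items():
    print(name, find_sites(sequence))
```

With the sequences as transcribed here, Lbl1 and JBV1 each carry a SalI site and Lbl2 and JBV2 each carry an EcoRI site, which matches the EcoRI- and SalI-cleaved fragments described above.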
Western Blot Analysis-Polyclonal antibodies against SipS of B. subtilis were prepared by the immunization of rabbits (Eurogentec) with a purified soluble form of this protein (sf-SipS-His K83A). This protein consists of the catalytic domain of SipS lacking residues 2-29. Furthermore, sf-SipS-His K83A of B. subtilis (Bsu) contains a carboxyl-terminal hexa-histidine tag, facilitating the purification by metal affinity chromatography (19). Western blotting was performed as described (20). After separation by SDS-polyacrylamide gel electrophoresis, proteins were transferred to Immobilon-polyvinylidene difluoride membranes (Millipore). To detect SPases, B. subtilis or E. coli cells were separated from the growth medium by centrifugation (5 min, 10,000 × g, room temperature), and samples for SDS-polyacrylamide gel electrophoresis were prepared as described previously (19,21). SPases were visualized with specific antibodies and horseradish peroxidase anti-rabbit-IgG conjugates (Amersham Pharmacia Biotech).
Molecular Modeling and Molecular Dynamics Simulations-Three-dimensional models of Bacillus SPases were built on the basis of homology with the E. coli SPase (Protein Data Bank entry 1B12) using the molecular modeling program What-If (22). The molecular dynamics program GroMacs (University of Groningen, MD group) was used to perform a standard energy minimization in vacuo of a pentapeptide substrate in the three-dimensional model of SipS.

RESULTS
Phylogenetic Clustering of Major Bacillus Signal Peptidases-To investigate the relationships between major and minor SPases of the Bacillus species, phylogenetic analyses were performed by applying the maximum likelihood and MP methods. For this purpose, either the complete sequences, conserved α-helices, β-strands, or domains were used (Table II). Consistent with the fact that only few α-helices are present in type I SPases (11), the maximum likelihood analysis that was based on α-helices resulted in a tree with a poorly resolved topology, and the equivalent MP analysis in 108 "most parsimonious trees" (112 steps long, CI excluding uninformative characters = 0.8077, RI = 0.7101, RC = 0.5833). Far better results were obtained when complete sequences, β-strands, or conserved domains were used in the maximum likelihood and MP analyses (Table II). In fact, the MP analysis with complete sequences resulted in one most parsimonious tree (805 steps, CI excluding uninformative characters = 0.8188, RI = 0.7152, RC = 0.5988) (Fig. 1). One most parsimonious tree was obtained when conserved β-strands were used (283 steps, CI excluding uninformative characters = 0.8618, RI = 0.8046, RC = 0.7079), and three most parsimonious trees were obtained with the conserved domains (134 steps, CI excluding uninformative characters = 0.8319, RI = 0.7938, RC = 0.6753).
Notably, all data sets are congruent with respect to the clustering of the four best supported groups of Bacillus SPases: (i) the SipW group, (ii) the SipV group, (iii) the SipT of Bsu + SipT of Bacillus amyloliquefaciens (Bam) group, and (iv) the SipS group (Table II). As shown in Fig. 1, SipT of Bacillus anthracis (Ban) seems to be more closely related to the SipW group (bootstrap percentages/quartet puzzling values are 100/not calculated for complete sequences and 100/70 for β-strands) than to the SipT (Bsu) + SipT (Bam) group. To prevent the possible misinterpretation that SipT (Ban) is related to the major SPases SipT (Bsu) and SipT (Bam), the SipT (Ban) protein was renamed SipX (Ban). Furthermore, SipC of Bacillus caldolyticus (Bca) seems to be most closely related to SipV (Bsu) and SipV (Bam). Most importantly, the functionally defined major SPases SipS, SipT, and SipP1015 of B. subtilis cluster together (Fig. 1, circled). This clustering is supported by bootstrap percentages of 79/93 and 77/60 when complete sequences or β-strands were used for the analyses, respectively. This observation suggests that the other SPases in this cluster, SipP1040, SipS (Bam), SipT (Bam), and Sip (Bli), should also be classified as major SPases. In contrast, all enzymes not included in this cluster would be minor SPases.
Functional Identification of Major Bacillus SPases-Because the distinction between major and minor SPases is based on functional differences, we tested the outcome of the phylogenetic analysis in complementation experiments with two representative SPases: SipC of B. caldolyticus (23), which clusters with the minor SPases (Fig. 1), and Sip (Bli) of B. licheniformis (24), which clusters with the major SPases. For this purpose, the sipC gene was expressed in the B. subtilis strain ΔS, which lacks the sipS gene, by transformation with the pGDL48-derived plasmid pGDL46.36.
Subsequently, we tried to disrupt the sipT gene of the resulting strain with a Cm resistance marker by transformation with chromosomal DNA of B. subtilis ΔT-Cm. Even though this experiment was repeated several times, no Cm-resistant transformants were obtained, indicating that SipC of B. caldolyticus behaves as a minor SPase in B. subtilis that cannot replace SipS and SipT. A completely different result was obtained in parallel experiments with Sip (Bli). To test whether this SPase could replace SipS and SipT of B. subtilis, the sip (Bli) gene was amplified by PCR and cloned.
Because the antibodies raised against SipS of B. subtilis cross-reacted with Sip (Bli) (these studies) and the major SPase SipP1015 (8), we investigated whether these antibodies could be used to discriminate between major and minor SPases of B. subtilis. For this purpose, Western blotting experiments were performed with strains containing plasmids for the overproduction of the respective SPases. Only SipT was shown to cross-react with the antibodies raised against SipS, which implies that the major SPases SipS, SipT, and SipP1015 share at least one antigenic determinant that is absent from the minor SPases SipU, SipV, and SipW (Fig. 3). This idea is supported by the observation that the major SPases SipS (Bam), SipT (Bam), and SipP1040 cross-reacted with the antibodies against SipS (Bsu) (data not shown). It has to be noted, however, that the antibodies against SipS also cross-reacted with the catalytic domain of SipC (Bca) upon overproduction in E. coli (data not shown), indicating that these antibodies do not allow the discrimination between major and minor Bacillus SPases in general.
SPase Active Site Modeling by Homology-To investigate whether the active site geometries of the known major and minor P-type SPases of B. subtilis are significantly different, three-dimensional models of these SPases were constructed on the basis of the crystal structure of the E. coli SPase as determined by Paetzel et al. (11). For this purpose, the sequences of these SPases were aligned with the ClustalW program (Fig. 4). SipS (Bsu) and the E. coli SPase show an overall sequence identity of 26%, which is low for modeling by homology. However, the four conserved domains B-E of these SPases show 62% sequence identity. Notably, the active site of the E. coli SPase is almost entirely composed of these four conserved domains that are typical for all P-type SPases (11). We have therefore based our conclusions exclusively on modeled active site regions of Bacillus SPases. The homology modeling program What-If was used to generate the three-dimensional models of various known SPases of bacilli. As shown for SipS of B. subtilis (Fig. 5), Met-44 and Leu-48 (marked in blue), Val-39 and Val-82 (marked in green), and Lys-83 form the S1 substrate binding pocket, whereas Tyr-37, Val-54, Val-73, and His-80 (marked in yellow) together with the residues marked in green form the S3 pocket. These findings are in good agreement with the structure of the S1 and S3 substrate binding pockets of the SPase of E. coli (25). Furthermore, the idea that the latter residues make contact with the substrate (i.e. the SPase recognition sequence in a precursor protein) was supported by a molecular dynamics analysis in which a pentapeptide of five Ala residues in a β-strand conformation was modeled into the substrate binding pockets of SipS (Fig. 5). This pentapeptide was placed at the position that corresponds to that of the penem inhibitor in the crystal structure of the E. coli SPase I (11).
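The contrast between overall identity and identity within the conserved domains can be illustrated with a small calculation on a pairwise alignment. The sketch below is not the procedure used for the figures reported above: the function name percent_identity, the convention of skipping columns in which either sequence has a gap, and the two short toy sequences are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple


def percent_identity(
    seq_a: str,
    seq_b: str,
    regions: Optional[Sequence[Tuple[int, int]]] = None,
    gap: str = "-",
) -> float:
    """Percent identical residues between two aligned sequences.

    regions: optional list of 0-based, end-exclusive column ranges
    (e.g. the columns of conserved domains); None means all columns.
    Columns where either sequence has a gap are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if regions is None:
        regions = [(0, len(seq_a))]
    compared = 0
    identical = 0
    for start, end in regions:
        for x, y in zip(seq_a[start:end], seq_b[start:end]):
            if x == gap or y == gap:
                continue
            compared += 1
            identical += x == y
    return 100.0 * identical / compared if compared else 0.0


# Hypothetical aligned toy sequences (not real SPase sequences).
toy_a = "ACDE-GKLMN"
toy_b = "ACDFHGQRMN"

print(percent_identity(toy_a, toy_b))                     # all columns
print(percent_identity(toy_a, toy_b, regions=[(0, 3)]))   # one "domain"
```

Restricting the calculation to well-conserved column ranges raises the identity relative to the full-length value, which is the rationale for limiting conclusions to the modeled active site regions.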
The comparison of our models for the P-type SPases of B. subtilis showed that the substrate binding pockets of SipS, SipT, SipP1015, and SipP1040 were highly similar, whereas that of SipV was significantly smaller (Fig. 5). Conversely, the substrate binding site of SipU seemed to have a wider S1 pocket than the equivalent sites of the other P-type SPases of B. subtilis. Upon close examination, the volume of the substrate binding pocket of SipV is relatively small because the side chains of Leu-73, Ile-82, and possibly Leu-54 (SipS numbering) protrude into the S3 pocket (Fig. 5). The latter side chains are larger than those of the equivalent residues in SipS of B. subtilis (Val-73, Val-82, and Val-54, respectively) and other SPases (Fig. 4). Taken together, these observations indicate that the active site geometries of the minor SPases SipU and SipV of B. subtilis are different from the active site geometries of the known major SPases.
[Fig. 4 legend (fragment): ... B. subtilis, B. amyloliquefaciens, B. licheniformis, B. caldolyticus, B. anthracis, and E. coli, which contain residues that form the S1 and S3 substrate binding regions, were aligned. Residues predicted to belong to the S1 or S3 pockets are labeled with 1 or 3, respectively. Residue numbers below the alignment are derived from SipS (Bsu).]
A SipS-SipV Fusion Protein Is a Major SPase-To investigate whether the active site geometry is a critical determinant for major and minor Bacillus SPases, a SipS-SipV hybrid protein (denoted SipSV) was constructed, which is specified by plasmid pM0V. Notably, this fusion between SipS and SipV of B. subtilis was made at the catalytic Ser residue of these SPases. Consequently, SipSV consists of the first 43 residues of SipS and the carboxyl-terminal part of SipV. The major advantage of this approach is that the active site geometry of SipSV is nearly identical to that of SipV (data not shown). Next, we tested whether SipSV is a major or minor SPase by introducing pM0V into B. subtilis ΔST as described above for the sipC (Bca) and sip (Bli) genes. Strikingly, viable ΔST transformants containing pM0V were obtained, showing that SipSV can replace SipS and SipT. As shown in Fig. 2, SipSV is not recognized by the antibodies raised against SipS. In conclusion, these observations show that SipSV is a major SPase and that SipV is not a minor SPase because of the geometry of its catalytic site but rather that some residues of its amino-terminal stretch determine SipV to belong to the class of minor SPases. Furthermore, the antibodies against SipS do not distinguish between major and minor SPases.

DISCUSSION
On the basis of their importance for cell viability, we previously have classified the type I SPases of B. subtilis as major (SipS, SipT, and SipP) and minor SPases (SipU, SipV, and SipW) (3,8). Thus far, it was not clear which properties of these SPases are important for this functional distinction, particularly with respect to the P-type SPases. Consequently, simple amino acid sequence alignments could not be used to predict the group to which certain Bacillus P-type SPases would belong. In the present studies, we show for the first time that major and minor SPases can be distinguished via phylogenetic analyses.
Surprisingly, the subsequent molecular analyses demonstrate that the distinction between major and minor SPases does not relate specifically to the catalytic domain of a Bacillus P-type SPase but rather to its amino-terminal domain, which contains the membrane anchor. The latter result was unexpected, because it was shown recently by Carlos et al. (26) that the transmembrane domains of P-type SPases are not important determinants for cleavage fidelity in vitro.
The most important outcome of the phylogenetic analyses of the Bacillus type I SPases is that the major SPases form a distinct cluster, which is well supported by the maximum parsimony and maximum likelihood methods. Moreover, these phylogenetic analyses have predictive value, as exemplified by the complementation experiments with SipC (Bca) and Sip (Bli), showing that these behave as minor and major SPases, respectively, when the corresponding genes are expressed in B. subtilis. This, however, does not exclude the possibility that SipC is a major SPase in B. caldolyticus. Furthermore, the phylogenetic analyses indicate the existence of two clusters of minor SPases: the SipC/SipV and SipW clusters. The latter cluster was identified previously because it contains the known ER-type SPases of bacilli (3,13). Only two Bacillus SPases, SipU (Bsu) and SipX (Ban), do not belong to any of the three clusters (major SPases, SipC/SipV, or SipW). This suggests that these two SPases represent possible evolutionary intermediates between different clusters, which is particularly interesting in the case of SipX of B. anthracis, because this P-type SPase might represent a link between the P- and ER-type Bacillus SPases.
The present observation that a SipSV hybrid protein containing the largest part of the catalytic domain of the minor SPase SipV behaves as a major SPase indicates that the catalytic domain of the P-type SPases is not the most important determinant for the difference between major and minor SPases. This view is supported by the fact that, according to our models, the active site geometry of SipSV is identical to that of the minor SPase SipV. In this respect it is important to bear in mind that the S3 substrate binding pockets of SipV and SipSV are relatively small compared with those of other P-type SPases of B. subtilis, particularly SipU. Nevertheless, the possibility that subtle changes in the active site geometry of SipSV caused by the fusion between the SipS and SipV moieties result in the conversion of a minor SPase into a major SPase can presently not be excluded. The idea that the catalytic domain is not important for the difference between major and minor SPases would explain why the antibodies raised against the catalytic domain of SipS (Bsu) cannot be used to distinguish between these two functionally defined groups of SPases.
[Fig. 5 legend (fragment): Residues rendered in spheres are likely to be involved in the formation of the S1 pocket (Met-44 and Leu-48 of SipS, blue), S1 and S3 pockets (Val-39 and Val-82 of SipS, green), or S3 pocket (Tyr-37, Val-54, Val-73, and His-80 of SipS, yellow) for substrate binding. The SipS model contains a pentapeptide of Ala residues, which are docked in the substrate binding pocket. The P1 and P3 residues of this model substrate are shown in orange. The active site Ser-43 and Lys-83 residues as well as Tyr-81 of SipS and the corresponding Leu residue of SipV are shown as "ball and stick" models. The latter residues are probably involved in substrate stabilization by interaction with P2 residues of the substrate.]
What could be the role of the amino-terminal residues of SipS in determining its role as a major SPase? Carlos et al. (26) have recently provided compelling evidence that the transmembrane segments of type I SPases such as SipS are not important for substrate cleavage site selection. Furthermore, we have shown recently that the membrane anchor of SipS is not required for its activity (23). Together with our present results, these observations imply that the major-minor difference of SPases is not based on the recognition of residues at the −1 and −3 positions relative to the scissile peptide bond per se. This leaves at least three alternative possibilities open. First, the amino-terminal residues might position the catalytic site of a major SPase in such a way in the membrane that it can interact with the cleavage site of one or more as yet unidentified preproteins that have to be processed for cell viability. Second, the amino-terminal residues might be required for an as yet unidentified essential interaction of a major SPase with preprotein translocases. Third, the amino-terminal residues might target the respective SPases to topologically distinct regions of the membrane such as the septa of dividing cells. Notably, the regions preceding the active site Ser residues of Bacillus SPases, which include their membrane anchor, show a relatively high degree of sequence variation (23). To elucidate the role of the amino-terminal region in SPase function, we are presently investigating the role of the first 42 residues of SipS by site-directed mutagenesis.