Structural Basis for Ufm1 Processing by UfSP1

Ubiquitin-fold modifier 1 (Ufm1) is a newly identified ubiquitin-like protein. Like ubiquitin and other ubiquitin-like proteins, Ufm1 is synthesized as a precursor that must be processed to expose the conserved C-terminal glycine prior to its conjugation to target proteins. Two novel proteases, named UfSP1 and UfSP2, have been shown to be responsible for the release of Ufm1 from Ufm1-conjugated cellular proteins as well as for the processing of its precursor. They show no sequence homology with known proteases. Here, we describe the 1.7 Å resolution crystal structure of mouse UfSP1, which consists of 217 amino acids. The structure reveals that it is a novel cysteine protease with a papain-like fold, in which Cys53, Asp175, and His177 form a catalytic triad and Tyr41 participates in the formation of the oxyanion hole. This differs from the canonical catalytic triad of papain-like proteases in that the aspartate and the histidine residues are from the "Asp-Pro-His" box. UfSP1, together with Atg4B and M48USP, which share the Asp-Pro-His configuration, seems to form a new subfamily of the cysteine protease superfamily. The mutagenesis study of the active site residues confirms the structural basis for catalysis. The interaction between UfSP1 and Ufm1 appears quite substantial, since the KD value was estimated to be 1.6 μM by isothermal titration calorimetry analysis. Furthermore, the NMR data show that the loop between β3 and α2, in addition to the C-terminal region of Ufm1, plays a role in binding to UfSP1.

In eukaryotes, the posttranslational modification of cellular proteins by ubiquitin (Ub) is an essential step in the regulatory mechanisms of many cellular processes. These include cell cycle progression, signal transduction, and targeting of misfolded proteins for degradation by the proteasome (1-4). Ub, a 76-residue polypeptide with a well defined α/β fold, is highly conserved in all eukaryotes but absent in bacteria and archaea (5). It is synthesized as an inactive precursor and needs to be processed by deubiquitinating enzymes (DUBs) to expose the C-terminal glycine. The C terminus of Ub is then covalently ligated to the ε-amino group of lysine residues on target proteins by the sequential action of three classes of enzymes: a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), and a ubiquitin protein ligase (E3). Ub may be attached to a target protein as a monomer or as a polyubiquitin chain. The ubiquitination pathway is further regulated by DUBs, which also remove Ub from Ub-conjugated proteins (6-11).
In addition to Ub, there are a number of distinct ubiquitin-like proteins (UBLs), such as SUMO (small Ub-related modifier), ISG15, NEDD8, and Atg8, that function as protein modifiers. Like Ub, these UBLs also function as critical regulators of many cellular processes, such as transcription, DNA repair, signal transduction, autophagy, and the cell cycle (12,13). The UBLs share further similarities with Ub. For example, UBLs display tertiary structure similar to that of Ub, although they lack obvious sequence identity. They also require a series of E1, E2, and E3 enzymes in order to be conjugated to a target protein. Like Ub, most UBLs, if not all, are synthesized as precursors that are processed by UBL-specific proteases (ULPs). ULPs also play a role in the release of UBLs from their conjugated proteins. However, unlike Ub, most UBLs are conjugated to their substrates as a monomer. Dysregulation of UBL substrate modification and/or mutations in the UBL conjugation machinery lead to a number of human diseases (11,14,15).
Recently, a novel UBL called ubiquitin-fold modifier 1, or Ufm1, has been identified (16). It has the same tertiary structure as Ub despite the fact that they share only 16% sequence identity (17). Similar to other UBLs, Ufm1 is synthesized as a precursor. Interestingly, however, the Ufm1 precursor possesses a single glycine at its C terminus followed by a Ser-Cys dipeptide, unlike Ub and other UBLs, which have a conserved diglycine at the C terminus. The matured Ufm1 is specifically activated by a novel E1-like enzyme called Uba5 and then transferred to its cognate E2-like enzyme, called Ufc1. Ufm1 has been shown to be conjugated to several proteins in human HEK293 cells and various mouse tissues, and the identification of these target proteins is under way (16). Since Ufm1, Uba5, and Ufc1 are conserved in metazoa and plants but not in yeast, potential roles of Ufm1 modification in various multicellular organisms are of special interest. Earlier, two novel proteases specific for Ufm1, named UfSP1 and UfSP2, were identified (18). Both UfSP1 and UfSP2 cleave only the C-terminal extension of Ufm1 but not that of Ub or other UBLs. In addition, they are both capable of releasing Ufm1 from Ufm1-conjugated cellular proteins. The long UfSP2 is present in most, if not all, multicellular organisms, whereas the short UfSP1 is not found in plants or nematodes. Both UfSP1 and UfSP2 possess highly conserved Cys and His residues, indicating that they are cysteine proteases, and biochemical and mutational analysis confirmed that they are indeed thiol proteases (see Fig. 1). However, they share no sequence homology with other known proteases, including DUBs or ULPs, suggesting a new subfamily of cysteine proteases. Here, we report the crystal structure of mouse UfSP1 at 1.7 Å resolution and discuss the structural basis for Ufm1 processing.

EXPERIMENTAL PROCEDURES
Protein Expression and Purification-The cDNAs for Ufm1 (SwissProt entry P61961) and UfSP1 (SwissProt entry Q9CZP0) from mice were cloned into pET28a and pET22b (Novagen) to generate N- or C-terminal His-tagged proteins. The resulting vectors were transformed into Escherichia coli BL21(DE3) codon plus RIL (Stratagene) cells, and the His-tagged proteins were purified using nickel affinity resins (GE Healthcare) equilibrated with 20 mM Tris-HCl (pH 8.0), 100 mM NaCl, and 1 mM Tris(2-carboxyethyl)phosphine hydrochloride. They were further purified by gel filtration on a Superdex 75 26/60 column (GE Healthcare). The purified UfSP1 was concentrated to 40 mg/ml in a buffer containing 20 mM HEPES (pH 7.5), 100 mM NaCl, and 1 mM dithiothreitol using Amicon Ultra-15 (Millipore). Selenomethionine-substituted UfSP1 was generated as described previously (19). The enzymatic activity was measured as described earlier (18).
Crystallization-Initial screening for crystallization conditions was carried out at 295 K using 96-well Intelli plates (Hampton Research) and a Hydra II Plus One (MATRIX Technology) robotics system, and the conditions were further optimized using the hanging drop method. Diffraction quality crystals were obtained by mixing equal volumes of 40 mg/ml mouse UfSP1 in 20 mM HEPES (pH 7.5), 100 mM NaCl, 1 mM dithiothreitol with a reservoir solution containing 0.1 M succinic acid (pH 7.0) and 5% (v/v) polyethylene glycol 3350. Attempts to crystallize UfSP1 complexed with Ufm1 did not yield crystals suitable for high resolution data collection.
X-ray Data Collection and Processing-The crystals of UfSP1 belong to the space group R32, with a = b = 71.70 Å, c = 209.23 Å, α = β = 90°, and γ = 120°. There is one molecule per asymmetric unit. Crystals were equilibrated in a cryoprotectant buffer containing reservoir buffer plus 20% (v/v) ethylene glycol and then flash-cryocooled in a cold nitrogen stream at 100 K. Both the native and the MAD data sets were collected at beamline 4A of Pohang Accelerator Laboratory (Pohang, Korea). The data were processed and scaled by using the HKL2000 program suite (20), and the statistics are summarized in Table 1.
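As a rough plausibility check on the statement of one molecule per asymmetric unit, the cell volume and Matthews coefficient can be estimated from the cell constants above. The following is a minimal sketch, not part of the reported analysis. The count of 18 asymmetric units applies to R32 in the hexagonal setting; the molecular mass is an assumption (210 residues for the UfSP1-(6-215) construct at an average of 110 Da per residue, ignoring any tag), and all function names are hypothetical.

```python
import math


def hexagonal_cell_volume(a, c, gamma_deg=120.0):
    """Volume (in cubic angstroms) of a cell with a = b and alpha = beta = 90 degrees."""
    return a * a * c * math.sin(math.radians(gamma_deg))


def matthews_coefficient(cell_volume, n_asym_units, mass_da, molecules_per_asu=1):
    """Matthews coefficient V_M = V / (Z * M), in cubic angstroms per dalton."""
    return cell_volume / (n_asym_units * molecules_per_asu * mass_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction from the common 1 - 1.23 / V_M estimate."""
    return 1.0 - 1.23 / v_m


# Cell constants reported in the text (hexagonal setting of R32).
A, C = 71.70, 209.23
N_ASU_R32_HEX = 18  # asymmetric units per cell for R32, hexagonal axes

# ASSUMPTION: mass estimated as 210 residues x 110 Da; not a value from the paper.
ASSUMED_MASS_DA = 210 * 110.0

volume = hexagonal_cell_volume(A, C)
v_m = matthews_coefficient(volume, N_ASU_R32_HEX, ASSUMED_MASS_DA)

print(f"cell volume      = {volume:.0f} A^3")
print(f"V_M (1 per ASU)  = {v_m:.2f} A^3/Da")
print(f"solvent fraction = {solvent_fraction(v_m):.2f}")
```

Passing `molecules_per_asu=2` halves V_M, which is one simple way to see why a single molecule per asymmetric unit is the more reasonable packing under these assumed inputs.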
Structure Determination and Refinement-The crystal structure of UfSP1 was determined by the MAD phasing method. All three selenium sites were found and refined, and the initial phases were calculated using the programs SOLVE (21) and RESOLVE (22). About 84% of the residues were automatically modeled as a polyalanine chain by RESOLVE, and the model was completed using the molecular modeling program COOT (23). The refinement was then performed using CNS and REFMAC (24,25). The final refinement statistics are summarized in Table 1.
Site-directed Mutagenesis and UfSP1 Activity Assay-Site-directed mutagenesis of the residues that might be involved in catalysis was carried out using the QuikChange site-directed mutagenesis kit (Stratagene) by following the manufacturer's instructions. Mutants of UfSP1 were produced as MBP-fused proteins with single point mutations at positions Tyr41, Cys53, Trp98, Gln154, Asp175, and His177. Ufm1 processing activity was assayed by using GST-Ufm1-HA as a substrate as described previously (18). Briefly, 100 ng of wild type or mutant MBP-UfSP1 was incubated with 5 μg of GST-Ufm1-HA for 1 h at 37 °C. The reaction was stopped by the addition of SDS sample buffer and analyzed using SDS-PAGE. The gels were then stained with Coomassie Blue R250.
Ufm1 Binding by NMR Chemical Shift Perturbation-Uniformly 15N-labeled or 13C/15N-labeled mouse Ufm1 was produced in M9 minimal medium with 15NH4Cl or a mixture of 15NH4Cl and 13C-labeled glucose, respectively. The C53S mutant of mouse UfSP1 was used. The proteins were purified as described above and concentrated to 0.3-0.8 mM. For backbone resonance assignment, CBCA(CO)NH and HNCACB experiments were performed using 0.8 mM 13C/15N-enriched Ufm1 in 20 mM deuterated Tris-HCl, pH 7.5, with 100 mM NaCl, 1 mM dithiothreitol, and 10% (v/v) D2O. NMR titration experiments were performed by measurement of 1H-15N TROSY on 0.3 mM 15N-Ufm1 with various amounts of unlabeled UfSP1 in 20 mM sodium phosphate, 150 mM NaCl, 1 mM dithiothreitol, and 10% (v/v) D2O at pH 6.8. The molar ratio of 15N-labeled Ufm1 to unlabeled UfSP1 was about 1:0.1, 1:0.2, 1:0.3, 1:0.5, and 1:1, respectively. 1H-15N resonances of Ufm1 at pH 6.8 with or without UfSP1 were assigned from the analysis of the 15N-edited NOESY-HSQC spectrum of the respective sample. All NMR spectra were collected at 25 °C on a Varian Unity Inova 900 MHz spectrometer equipped with a Coldprobe(TM) at KIST. All NMR data were processed using NMRPipe software (26) and analyzed using NMRView software (27).
Isothermal Titration Calorimetry-Isothermal titration calorimetric experiments were performed using a MicroCal VP isothermal titration calorimetry (ITC) instrument (MicroCal, Northampton, MA) at 22 °C, and the data were analyzed using the program ORIGIN. Protein samples were prepared in 20 mM HEPES (pH 7.5) and 200 mM NaCl buffer. The purified C53S mutant of UfSP1 protein was concentrated to 1.5 mM, whereas Ufm1 was concentrated to 0.1 mM using an Amicon Ultra-15 concentrator. All protein solutions were degassed for 10 min under vacuum with a ThermoVac accessory prior to performing the experiment at 22 °C. During a typical titration experiment, 250 μl of Ufm1 protein solution was added to 1.4 ml of UfSP1 protein, and, as a control, 250 μl of buffer was added to UfSP1 protein under the same conditions. Typically, 5-μl injections of 10-s duration were made every 180 s, and the sample cell was stirred with the flat tip of the syringe rotating at 307 rpm.

RESULTS
Both Ufm1-(1-85) and UfSP1-(6-215) from mice were expressed in E. coli and purified. The apo form of UfSP1 was crystallized in rhombohedral form, and the crystal structure was determined using multiple-wavelength anomalous dispersion data collected from the selenomethionine-substituted UfSP1 and refined to a final R value of 18.6% (Rfree = 22.7%) at 1.7 Å resolution. The electron density map is clearly defined for all atoms except for residues 149-154. For those six residues, the electron density was weak but sufficient to trace, so they were included in the final model. The side chain of Trp98 can be modeled in two distinct orientations. Table 1 summarizes statistics on the crystallographic data. Binding of Ufm1 to UfSP1 was analyzed using ITC and NMR chemical shift perturbation.
Overall Crystal Structure of Mouse UfSP1-The overall structure, with dimensions of 45 × 30 × 38 Å, consists of an α/β-fold with a seven-stranded antiparallel β-sheet flanked by α-helices on either side of the sheet and the N terminus forming an antiparallel β-structure. The central antiparallel β-sheet is composed of β1, β4, β5, and β6; the C-terminal antiparallel β-sheet consists of β2, β3, and β7 (Fig. 2A). The helices α1, α2, and α3 are located at one side of the β-sheet, whereas the other three are on the opposite side (Fig. 2B). The active site cysteine is located at the N terminus of helix α1, whereas the histidine is at the loop between β5 and α5. The overall structure mimics the papain fold despite the fact that the two proteins share no obvious sequence homology (11%). Since the C-terminal catalytic domain of UfSP2 shares 36% sequence identity with UfSP1, the catalytic domain of UfSP2 is expected to assume the same structure as UfSP1.
Comparison of UfSP1 with other structures in the Protein Data Bank using the DALI algorithm (28) yielded a few structural homologues from the cysteine protease superfamily. The most significant match, based on having the highest Z-score, was Atg4B (Protein Data Bank accession code 2CY7 (29) and Protein Data Bank accession code 2D1I (30)), which showed a Z-score of 12.6. Atg4B, an essential enzyme in autophagy, cleaves nascent Atg8 at its C-terminal arginine residue and also deconjugates Atg8 family proteins from a small adduct, phosphatidylethanolamine. The second highest hit was the recently reported protease domain of murine cytomegalovirus, M48USP, with a Z-score of 6.7 (Protein Data Bank accession code 2J7Q (31)). DUBs, such as the 40-kDa catalytic core domain of HAUSP (Protein Data Bank accession code 1NBF (32)) and UCH-L3 (Protein Data Bank accession codes 1UCH and 1XD3 (33,34)), showed only low structural similarity with UfSP1, as indicated by Z-scores of 3.9 and 1.9, respectively. Fig. 2B shows the superposition of Atg4B, M48USP, and papain on UfSP1.
Despite their overall structural similarity, Atg4B and M48USP exhibit a number of significant local structural differences from UfSP1 (Fig. 2B). Atg4B has an additional domain called the "short fingers domain" or "auxiliary domain," which is composed of three α-helices and two β-strands forming an α/β structure whose function is unknown (29,30). On the other hand, M48USP has an "extended β-hairpin" structure that participates in extensive hydrophobic interaction with the bound Ub core. This interaction is quite distinct from what has been seen in the DUB·Ub or ULP·UBL complexes, and this Ub binding mode has been suggested to contribute to the Ub specificity of M48USP (31). HAUSP has three domains, described as fingers, palm, and thumb, and its active site is located between the palm and thumb. In this case, however, binding of Ub aldehyde induces a drastic conformational change in the active site that realigns the catalytic triad residues for catalysis (32). The root mean square deviations for the core structures are 2.8 Å for 138 Cα atoms in papain, 2.9 Å for 169 Cα atoms in Atg4B, 3.9 Å for 137 Cα atoms in M48USP, and 4.0 Å for 121 Cα atoms in HAUSP.
Active Site of UfSP1-Cys53 and His177 of UfSP1 were identified as catalytic residues earlier (18). They are part of highly conserved stretches (i.e. Gly-Trp-Cys53-Ala/Gly-Tyr-Arg-Ser/Thr and Ile-Leu-Asp175-Pro-His177-Tyr-Thr/Trp-Gly, in which positions with a single residue are strictly conserved and the others are given in the order UfSP1/UfSP2). As shown in Fig. 3A, Cys53 is located at the N-terminal end of the α1 helix, and His177 is situated at the loop between strand β5 and helix α5. Indeed, the structure shows that the Nε2 atom in the imidazole ring of His177 is only 3.4 Å away from the Sγ atom of the catalytic Cys53 and that His177 is further hydrogen-bonded to a water molecule (W1325) through Nδ1. Asp175 is 4.5 Å away from the Nδ1 of His177 in the current structure, but with a simple χ1 and χ2 rotation of His177 it can easily make a hydrogen bond to the side chain oxygen of Asp175. Thus, Cys53, His177, and Asp175 are reasonably well poised for catalysis. The proline residue between Asp175 and His177 plays a role in ensuring the proper positioning of the two. In addition to the catalytic triad, cysteine proteases have another conserved residue, typically Asn or Gln, which participates in the formation of the oxyanion hole that is critical for catalysis. In UfSP1, the hydroxyl oxygen of Tyr41 makes a hydrogen bond to a water molecule (W1221), which in turn makes a hydrogen bond to the backbone amide of Cys53. This tyrosine is conserved throughout UfSP1 and UfSP2. Therefore, the residues involved in catalysis are Cys53, His177, Asp175, and Tyr41, and the catalytic mechanism of UfSP1 appears to parallel that of the papain family of cysteine proteases.
This was somewhat surprising, since the catalytic triad of canonical cysteine proteases is such that cysteine is located at the N terminus of an α-helix, and histidine and aspartate are at the end of two adjacent β-strands that form the central β-sheet. In papain, the residues corresponding to the triad are Cys25, Asn175, and His159, whereas Gln19 participates in forming the oxyanion hole. In UfSP1, the position of the canonical histidine is occupied by Gln154, and this residue is histidine in all known UfSP2 and alanine in Drosophila UfSP1. This suggests that it is unlikely that Gln154 participates in catalysis. In the current structure, the electron density in this region (residues 149-154) is not as well defined but is sufficient to trace the chain, and the side chain of Gln154 faces away from the active site cysteine and histidine.
In order to validate our structural analysis, we employed site-directed mutagenesis to alter the key residues in UfSP1 predicted to participate in catalysis. Alanine mutants were made for Tyr41, Gln154, Asp175, His177, and Trp98 in addition to a serine mutant of Cys53. In vitro enzymatic activity was assessed by using GST-Ufm1-HA as a substrate. As shown in Fig. 3C, the C53S, H177A, D175A, and Y41A mutants show no activity, whereas Q154A retains activity, confirming the conclusions from the structural data. Since the residues at the active site are highly conserved, UfSP2 is expected to have the same active site and to operate via the same mechanism. It is striking to note that the active sites of Atg4B (29,30) and M48USP (31) have similar arrangements of the active site residues; namely, the catalytic histidine and aspartate residues are from "Asp-Pro-His," which is located at the tip of a β-strand. The oxyanion position is filled by a tyrosine (Tyr54) in the case of Atg4B and a glutamine (Gln10) in M48USP.
Fig. 3 legend (partial): Although the catalytic triad residues are located in a similar geometry as those from papain and some DUBs, the aspartate and histidine are juxtaposed with a proline in between. This "Asp-Pro-His" is a signature of a new subfamily of the cysteine protease superfamily. Similar arrangements were observed in M48USP and Atg4B. C, in vitro processing activities of C53S, D175A, H177A, Y41A, Q154A, and W98A mutants. Proteins were produced as MBP-UfSP1, and GST-Ufm1-HA was used as substrate. WT, wild type.
Interaction of UfSP1 with Ufm1-In order to examine Ufm1 binding to UfSP1, we first analyzed the solution structure of mouse Ufm1 by NMR. As reported previously (17), most of the backbone resonances of Ufm1 could be assigned except for the two residues at the N terminus, residues 9-18, which correspond to the flexible loop, and the two residues at the C terminus. Second, the interaction between UfSP1 and Ufm1 was characterized by chemical shift changes in the two-dimensional 1H-15N TROSY experiment using 15N-labeled Ufm1 with unlabeled UfSP1. Fig. 4A shows the overlay of the HSQC spectra of Ufm1 in the free form and complexed to the C53S mutant of UfSP1. Upon the addition of UfSP1, most of the backbone resonances of Ufm1 changed, and the change was complete at a 1:1 molar ratio of Ufm1 and UfSP1. In fact, when the molar ratio of Ufm1 and UfSP1 is 1:0.5, most of the residues showed two resonances (shown in purple in Fig. 4A), which represent the unbound and the bound states. The bottom of Fig. 4A shows the weighted average of chemical shift perturbation (Δδ = (δH^2 + δN^2/5)^(1/2)) calculated for the individual residues. We also tried an NMR titration experiment using 15N-labeled UfSP1 with the addition of Ufm1. Although 1H-15N TROSY experiments gave well dispersed amide proton-nitrogen correlations (data not shown), we were not able to assign the resonances of UfSP1. Nevertheless, we were able to confirm the binding between UfSP1 and Ufm1 using 15N-labeled UfSP1 as a probe for the chemical shift perturbation experiment (data not shown).
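The weighted chemical shift perturbation used for Fig. 4A can be written out as a short calculation. The sketch below is illustrative only: it assumes the weighting is the square root of (δH squared plus δN squared divided by 5), as the formula in the text reads, and the function names, residue labels, and shift values are hypothetical placeholders rather than measured data. The 0.1 cutoff follows the threshold the text uses to list perturbed residues.

```python
import math


def weighted_csp(delta_h_ppm, delta_n_ppm):
    """Weighted shift change: sqrt(dH^2 + dN^2 / 5), both inputs in ppm."""
    return math.sqrt(delta_h_ppm ** 2 + (delta_n_ppm ** 2) / 5.0)


def perturbed_residues(shift_changes, threshold=0.1):
    """Return (residue, csp) pairs whose weighted change exceeds the threshold.

    shift_changes maps a residue label to (dH, dN), the 1H and 15N shift
    differences between the bound and free spectra.
    """
    hits = []
    for residue, (d_h, d_n) in shift_changes.items():
        csp = weighted_csp(d_h, d_n)
        if csp > threshold:
            hits.append((residue, csp))
    return hits


# HYPOTHETICAL shift differences for three placeholder residues (not real data).
example = {
    "resA": (0.08, 0.30),
    "resB": (0.01, 0.05),
    "resC": (-0.12, 0.10),
}

for residue, csp in perturbed_residues(example):
    print(f"{residue}: weighted shift change = {csp:.3f} ppm")
```

Residues whose resonances disappear on binding have no bound-state shift and so cannot be scored this way; they would need to be tracked separately, as the text does.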
Residues whose resonances completely disappeared include Ile8, Ser12, Ala48, Ile55, Gly56, Ile57, and the residues from Arg79 to Gly83 excluding Arg81, and the residues with Δδ of >0.1 are Lys19, Leu21, Phe29-Lys34, Glu38, Ser47, Asp53, Asn58, Gln61, Thr62, Phe67, Arg75, Ile77, and Arg81. When these residues are mapped on Ufm1, they are mostly located on one side of the Ufm1 surface, forming a somewhat contiguous region, as seen in Fig. 4B. These chemical shift perturbations may be the result of direct interactions between Ufm1 and UfSP1 or of indirect structural changes due to complex formation. Nevertheless, the results strongly suggest that the C-terminal part of β3 and the loop between β3 and α2 of Ufm1 play an important role in UfSP1 binding. Also, the intermediate or slow exchange time scale observed suggests that the interactions between Ufm1 and UfSP1 are relatively strong.
ITC was carried out using the C53S mutant of UfSP1 and Ufm1, and the results of a calorimetric titration of Ufm1 into UfSP1 are shown in Fig. 4C. The top panel shows the heat effects associated with the injection of Ufm1 into the calorimetric cell containing UfSP1, whereas the bottom panel shows the binding isotherm corresponding to the data in the top panel and the best fitted curve. Almost identical results were obtained when UfSP1 was added to Ufm1 (data not shown). The binding is endothermic, and analysis of the data yields a binding affinity of KD = 1.6 μM. The dissociation constants (KD) for SENP1, using the C604A mutant, were reported to be 0.79 and 0.49 μM toward the full-length SUMO1 and SUMO2, respectively (35).
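To put the dissociation constant on an energy scale, the standard binding free energy can be estimated from KD and the experimental temperature of 22 °C. This is a minimal sketch assuming a 1 M standard state and the relation ΔG° = RT ln(KD). The enthalpy value used below is a hypothetical placeholder, since the text reports only that binding is endothermic, and the function names are illustrative.

```python
import math

R_GAS = 8.314462618  # gas constant, J mol^-1 K^-1


def binding_free_energy(kd_molar, temp_kelvin):
    """Standard free energy of binding (J/mol): dG = R * T * ln(KD / 1 M)."""
    return R_GAS * temp_kelvin * math.log(kd_molar)


def entropy_term(delta_g, delta_h):
    """T*dS (J/mol) from dG = dH - T*dS."""
    return delta_h - delta_g


TEMP_K = 22.0 + 273.15  # ITC temperature given in the text
KD_UFSP1_UFM1 = 1.6e-6  # 1.6 micromolar, from the text

dg = binding_free_energy(KD_UFSP1_UFM1, TEMP_K)

# HYPOTHETICAL positive enthalpy; the paper states only that binding is endothermic.
ASSUMED_DH = 10_000.0  # J/mol
tds = entropy_term(dg, ASSUMED_DH)

print(f"dG  = {dg / 1000:.1f} kJ/mol")
print(f"TdS = {tds / 1000:.1f} kJ/mol for an assumed dH of {ASSUMED_DH / 1000:.0f} kJ/mol")
```

For any positive ΔH, TΔS must exceed the magnitude of ΔG, which is the quantitative form of the statement that an endothermic binding reaction has to be driven by entropy.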

DISCUSSION
Another Cysteine Protease, UfSP1-Among the five different families of proteases described for DUBs so far, cysteine protease is the most common (36). A large number of them have conserved Cys and His boxes, and the Cys box contains a catalytic cysteine residue, which is thought to undergo deprotonation before the nucleophilic attack on the carbonyl carbon atoms of the incoming substrate. This deprotonation is assisted by histidine, which, in turn, is stabilized by either aspartate or asparagine. Although ULP family proteases have different topology, their catalytic triad residues are located in the same geometry as those of papain.
The X-ray structure of UfSP1 shows that it also has a papain-like fold, with the catalytic residues being Cys53, His177, Asp175, and Tyr41, as confirmed by mutagenesis analysis. The arrangement of the catalytic residues is similar to that observed in the canonical catalytic triad of cysteine proteases. However, unlike the canonical triad, the aspartate and histidine residues in the active site of UfSP1 are from the highly conserved "Asp-Pro-His" box. This unique arrangement is also found in the recently reported structures of Atg4B, with the catalytic residues being Cys74, Asp278, His280, and Tyr54 (29,30), and M48USP, with the catalytic residues being Cys23, Asp156, and His158 with Gln10 (31). Although there is a histidine (His141) at the position equivalent to the canonical histidine in the case of M48USP, it is not conserved. Therefore, in the two enzymes, the catalytic aspartate and histidine come from a strictly conserved "Asp-Pro-His" box as well. Collectively, these results indicate that UfSP1 together with Atg4B and M48USP form a new subfamily of the cysteine protease papain superfamily.
Ufm1 Binding to UfSP1-Both NMR and ITC data suggest that there is a relatively strong interaction between UfSP1 and Ufm1 and that this interaction involves not only the C terminus of Ufm1 but also the residues of β3 and the loop between β3 and α2, including both hydrophobic and polar residues. The endothermic nature of the interaction shown by the ITC data suggests that the entropy change has to not only compensate for the unfavorable enthalpy change but also provide the binding energy necessary to achieve the observed binding affinity. The main factors contributing to the entropy change would be the loss of conformational degrees of freedom and the solvation entropy arising from the release of water molecules upon binding. Upon the addition of UfSP1, most of the backbone resonances of Ufm1 changed. These changes may well account for the necessary entropy change.
It is difficult to predict structural changes in the complex structure of UfSP1 and Ufm1, since in some complexes rather large changes are observed in DUBs or ULPs upon Ub or UBL binding (19,32). For example, in the case of USP14, in the free form, the active site is blocked by two surface loops, but upon Ub binding, a significant conformational change occurs that translocates the two surface loops, thereby allowing the C terminus of Ub access to the active site (19). In the case of UfSP1, since it has a relatively open configuration at the active site, and it lacks an analogous crossover loop seen in UCH enzymes, it is most likely that UfSP1 could allow large protein conjugates to come within a distance suitable for recognition of the isopeptide bond by the active site. Assuming this, we tried manual docking of the Ufm1 onto the active site without any modification. Since the sequence of mouse Ufm1 is identical to that of human Ufm1, we used the earlier reported NMR structure (Protein Data Bank code 1WXS (18)), and the results from the peak shift as well as the surface electrostatic potentials of both Ufm1 and UfSP1 were used as a general guide. Fig. 4D shows the result.
In the resulting model, the surfaces of the two proteins are reasonably complementary to each other. For example, the hydrophobic patch around residues Ala48 and Ile49 of Ufm1 complements Val102, Leu106, and Leu108 of UfSP1, and Ile77 of Ufm1 faces Phe208 and His120 of UfSP1. These residues of UfSP1 are not strictly conserved, but they are substituted by amino acids with hydrophobic side chains in UfSP2 (see Fig. 1B). Also, Arg79 of Ufm1 comes near Asp87 or Glu103 of UfSP1. In this case, both Asp87 (Gly-Asp87-Lys-Pro) and Glu103 of UfSP1 are conserved, suggesting the possible formation of a salt bridge. This model puts the C terminus of Ufm1 near the shallow and narrow "cleft" formed by the loop between α2 and α3 and the loop between β3 and β4 of UfSP1. The residues between α2 and α3 are generally conserved, whereas the residues on the loop between β3 and β4 are not (see Fig. 1B). In particular, the Cα of Trp98 (in Gly-Ser-Arg-Gln/Asn-Trp98-Ile-Gly) is 6.4 Å away from the Cα of Cys53 (see Fig. 3, A and B), and it is tempting to suggest that it may play a role in stabilizing substrate binding, although the mutagenesis study on Trp98 showed no significant effect on proteolysis (see Fig. 3C). In the current crystal structure, there is not enough room at the cleft to accommodate the C terminus of Ufm1. However, noting that residues 149-154 are somewhat disordered in the current model, one can easily imagine rearrangement of these residues.
Conclusion-In this study, we report the crystal structure of mouse UfSP1 at 1.7 Å resolution. Despite no obvious sequence homology with known proteases, the structure of UfSP1 shows a papain-like fold, with catalytic residues from the Cys and Asp-Pro-His boxes. This is similar to what is seen for Atg4B and M48USP. Thus, UfSP1 together with Atg4B and M48USP seems to form a new subfamily of the cysteine protease superfamily. Since the catalytic domains of UfSP1 and UfSP2 share 36% sequence identity (45.6% similarity), with the mentioned residues conserved throughout, UfSP2 appears to belong to this new family as well. The ITC data suggest that the KD between Ufm1 and UfSP1 is 1.6 μM. The NMR data also suggest that about 10 residues between β3 and α2, in addition to the 10 or so residues at the C terminus of Ufm1, may be important in the binding of Ufm1 to UfSP1.