Molecular cloning and expression of a 26 S protease subunit enriched in dileucine repeats.

The 26 S protease is a multisubunit enzyme required for ubiquitin-dependent proteolysis. Recently, we identified a 50-kDa subunit (S5) of this enzyme that binds ubiquitin polymers (Deveraux, Q., Ustrell, V., Pickart, C., and Rechsteiner, M. (1994) J. Biol. Chem. 269, 7059-7061). We have now isolated, sequenced, and expressed a cDNA encoding a novel 50-kDa subunit of the 26 S protease. The recombinant protein does not bind ubiquitin polymers. Two-dimensional electrophoresis reveals that two subunits of the 26 S protease have apparent molecular masses of 50 kDa. Antibodies specific for the recombinant protein recognize the more basic of the two subunits (S5b), whereas the more acidic subunit (S5a) binds ubiquitin chains. Thus, the 26 S protease contains at least two distinct subunits with apparent molecular masses of 50 kDa.

Ubiquitin has been implicated in diverse cellular phenomena including proteolysis of key regulatory molecules such as GCN4 and some cyclins (1-5). Although ubiquitin may have nonproteolytic roles, covalent attachment of ubiquitin to proteins clearly produces substrates for the 26 S ATP-dependent protease. This enzyme is composed of the multicatalytic protease or proteasome and a regulatory ATPase complex (6). Both multicatalytic protease and the regulatory complex are multisubunit structures that associate in the presence of ATP to form the 26 S enzyme (7). Although the multicatalytic protease can degrade a variety of peptides, few intact proteins have been shown to be substrates. Subunits of the regulatory complex are, therefore, thought to confer the ability to recognize and degrade intact proteins, especially those conjugated to ubiquitin chains. In support of this idea, we identified a 50-kDa subunit of the regulatory complex that binds ubiquitinated lysozyme as well as free polymers of ubiquitin (8). We called this protein subunit 5 (S5) based upon its relative mobility on SDS-polyacrylamide gels (9). S5 efficiently binds tetrameric ubiquitin and selects for longer ubiquitin polymers (8). This property is consistent with characteristics expected of a component that selects ubiquitin conjugates for proteolysis since it has been shown that β-galactosidase molecules attached to long ubiquitin chains are degraded more efficiently than those conjugated to shorter chains (10). As a further characterization of the ubiquitin-binding subunit, we have employed standard procedures to obtain a cDNA encoding the 50-kDa subunit of the 26 S protease. We have found that two distinct proteins comprise the band previously designated as S5 on SDS gels. Here we distinguish between the two 50-kDa proteins and describe the cloning, sequence, and expression of a cDNA encoding the more basic subunit, S5b.

EXPERIMENTAL PROCEDURES
Peptide Sequencing-Human red blood cell 26 S proteases (11) were resolved on SDS gels, and the subunits were transferred to polyvinylidene difluoride (PVDF) 1 membranes (9). The PVDF membrane was stained with Ponceau S, and the S5 band was excised. Direct sequencing of S5 indicated that it has a blocked amino terminus. Therefore, S5 bands were excised from PVDF membranes and treated with CNBr. The resulting peptides were eluted from the membrane, electrophoresed on SDS gels, and transferred to PVDF. The S5 fragments were then excised and sequenced directly using an ABI gas phase sequenator. Amino acid sequences were obtained from three of the isolated peptides.
Isolation of a cDNA Encoding S5b-Degenerate oligonucleotides derived from two of the S5 peptides were used in the polymerase chain reaction to synthesize a nondegenerate probe. This sequence was labeled by the random prime method and was used to isolate two clones from a ZAP II HeLa cDNA library (Stratagene). Both cDNAs encoded all three of the peptides obtained from S5 digestion. Expression of these cDNAs, however, resulted in products smaller than the 50-kDa subunit 5 of the 26 S protease. To obtain the complete coding region for S5, we employed the 5′ rapid amplification of cDNA ends (5′-RACE) procedure (Life Technologies, Inc.). Numerous subclones containing the 5′-RACE products were sequenced to identify polymerase chain reaction errors prior to construction of the full-length cDNA. The 5′-RACE products were similar in length, indicating that the reverse transcriptase terminated at a single position. The longest open reading frame predicted a 56-kDa protein. This full-length construct was subcloned into pAED4 (12), and both sense and antisense DNA strands were sequenced using the Sequenase protocol (U. S. Biochemical Corp.). The complete sequence was compared with a cDNA sequence submitted to GenBank (accession number D31889) and found to be identical except for the initiating methionine.
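As an illustration of why the oligonucleotides derived from peptide sequence are degenerate, the following minimal sketch counts how many distinct coding sequences could specify a given peptide under the standard genetic code, and finds the least degenerate stretch of a peptide. The function names and the example peptide are hypothetical; the CNBr peptide sequences used for primer design are not given in the text, and this is not a description of the procedure actually followed.

```python
# Number of codons specifying each amino acid in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide: str) -> int:
    """Return the number of distinct nucleotide sequences encoding `peptide`."""
    total = 1
    for residue in peptide.upper():
        total *= CODON_COUNTS[residue]
    return total


def least_degenerate_window(peptide: str, width: int):
    """Return (0-based start, sub-peptide, degeneracy) of the window of
    `width` residues that requires the fewest oligonucleotide variants."""
    starts = range(len(peptide) - width + 1)
    best = min(starts, key=lambda i: degeneracy(peptide[i:i + width]))
    window = peptide[best:best + width]
    return best, window, degeneracy(window)


if __name__ == "__main__":
    example = "LSRMWKDE"  # hypothetical peptide, for illustration only
    print(degeneracy(example))
    print(least_degenerate_window(example, 3))
```

Stretches rich in Met and Trp (one codon each) give the lowest degeneracy, whereas Leu, Ser, and Arg (six codons each) inflate it rapidly.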
Expression and Purification of S5b-The full-length S5b construct was transcribed using T7 polymerase and was translated in reticulocyte lysate containing [35S]methionine (Stratagene). For in vivo expression, the same plasmid was transformed into BL21 cells and was induced with 1 mM isopropyl-1-thio-β-galactopyranoside under standard conditions. An insoluble fraction was prepared following sonication in lysis buffer (50 mM Tris, pH 8.0, 50 mM NaCl, 0.25% Triton X-100) by centrifugation for 30 min at 16,000 × g. The insoluble material was washed twice with lysis buffer and then suspended in sample buffer and boiled for 2 min prior to electrophoresis. Induction of authentic S5b was confirmed by NH2-terminal amino acid sequence analysis of protein resolved on SDS gels and transferred to PVDF. Purified S5b was obtained by subcloning the S5b sequence into the pET 16b vector (Novagen), which places a histidine-containing leader sequence at its amino terminus. The pET 16b vector containing the S5b sequence was transformed into BL21 cells, expressed, and purified as recommended by Novagen.
Antibody Preparation, Western Blotting, and Ubiquitin-Polymer Binding Assays-The purified histidine-S5b fusion protein was used for antibody production in rabbits using Titermax as described by the manufacturer (Vaxcel, Inc.). These antibodies react with both the his-

* This work was supported by National Institutes of Health Grant GM37009 and by a grant from the Lucille P. Markey Charitable Trust. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

RESULTS AND DISCUSSION
Subunit 5 was identified as a ubiquitin recognition component of the 26 S protease by its ability to bind ubiquitin-lysozyme conjugates (8). Subsequent analyses using a mixture of ubiquitin polymers of different lengths revealed that S5 selects for longer ubiquitin polymers even after SDS denaturation of the subunit (8). In addition, our calculations of subunit stoichiometry suggest that only one S5 molecule is present per regulatory complex. In view of these findings, we hypothesized that repeated sequences in S5 might bind multiple sites present in ubiquitin chains. This idea can be tested by obtaining the subunit's sequence. Consequently, we used a combination of polymerase chain reaction, cDNA library screening, and 5′-RACE to obtain a composite full-length nucleotide sequence encoding the S5 subunit (see "Experimental Procedures"). This sequence contains an open reading frame that encodes all three peptides recovered from CNBr digestion of S5 (Fig. 1). Both initiation and polyadenylation sequences present in the S5 cDNA conform to those known for eukaryotic organisms. The predicted protein, however, does not have the repeated regions that we hypothesized would be present in the ubiquitin conjugate-binding subunit. It does have numerous leucines, many of which occur as dileucine repeats. Comparison of the S5 amino acid sequence with entries in current protein databanks did not reveal significant similarities with other proteins. However, the nucleotide sequence is identical to a partial cDNA obtained by direct sequencing of random cDNAs (see "Experimental Procedures"). Using the 5′-RACE procedure, we identified four more nucleotides at the 5′ end of the S5 cDNA that encode a methionine. The resulting open reading frame predicts a 56-kDa protein.
Expression of the S5 sequence in E. coli resulted in a protein that co-migrates on SDS gels with S5 of the 26 S protease (Fig. 2). Western blots show that antibodies raised against the expressed protein react with a 50-kDa subunit of the protease. In addition, antibodies specific for the human red blood cell 26 S protease recognized the recombinant S5 protein (data not shown). Taken together, these data indicate that the cDNA presented in Fig. 1 encodes the entire S5 protein.
The expressed protein, however, does not bind ubiquitin polymers (data not shown). This finding raised the possibility that more than one 26 S protease subunit comprises the band designated as S5 on SDS gels. This was tested by resolving components of the 26 S protease on two-dimensional gels. Proteins were transferred to nitrocellulose and incubated with either 125I-labeled ubiquitin polymers or antibodies specific for the recombinant S5 protein. As shown in Fig. 3, the labeled ubiquitin polymers bound to a 50-kDa subunit with a pI of 4.6 (panel B, white arrow), and the anti-S5 antibodies recognized a protein focusing at pH 5.3 (panel C, black arrow). These results demonstrate that there are two distinct subunits of the 26 S protease in the band previously called S5. We have termed the two 50-kDa proteins S5a (acidic) and S5b (basic). The cDNA sequence shown in Fig. 1 codes for S5b.

FIG. 2, continued. The induced protein present in the insoluble fraction was expressed as a nonfusion product. This protein was excised and subjected to sequence analysis confirming the S5b amino-terminal sequence. The purified His6-S5b fusion protein was used for antibody production. Two faster migrating species in the 26 S cross-react with anti-S5b antibodies. These may be fragments of S5b or possibly subunits 9 and 11 as indicated by two-dimensional gel analysis (see Fig. 3).
The three CNBr peptides initially sequenced from the S5 band are present in the cDNA encoding S5b. However, the data in Fig. 3 clearly reveal two distinct 50-kDa proteins at the S5 position. For this reason, we separated large amounts of the regulatory complex subunits on SDS gels and asked whether peptides other than those present in S5b could be identified. By excising proteins that migrated slightly above the major S5 band, we obtained the sequences of eight peptides generated by endoproteinase Lys-C digestion, three of which are not encoded by S5b. These three peptides (RIIAFVGSPVELDTK, VNVDIINFGEEEVNTE, and AGTGSHLVTVPPG) are homologous to a recently identified multiubiquitin-binding protein from Arabidopsis thaliana. 2 The identification of peptides encoded by two distinct cDNAs confirms the presence of two 50-kDa subunits in the 26 S protease.
When 26 S protease and regulatory complexes were electrophoresed on native gels and transferred to nitrocellulose, antibodies raised against S5b recognized both native particles. Similarly, these antibodies identified a 50-kDa protein after subunits of each native complex were resolved by SDS-PAGE (data not shown). Therefore, S5b is an integral component of the 26 S protease. The function of subunit 5b, however, is not currently understood. Its sequence is enriched in leucine residues; 26 of the first 100 amino-terminal residues are leucines. In addition, 9 dileucine repeats can be found throughout S5b (see Table I). Similar repeats have been implicated in protein sorting to Golgi cisternae and lysosomes and in the internalization of certain transmembrane proteins (15). Haft and co-workers (15) suggested that at least two motifs contribute to internalization and/or targeting to lysosomes. The first motif, referred to as the tyrosine-based motif (GPLY and NPEY), contains an essential aromatic residue, usually a tyrosine. The S5b sequence also contains a similar tyrosine-based motif (NPNY, residues 476-479). The second motif, the dileucine repeat, has been implicated in trafficking of a variety of transmembrane proteins including degradation of the insulin receptor. Residues flanking dileucine repeats are often charged (15), and interestingly, the residues surrounding the dileucine pairs in S5b are also highly enriched in charged and polar side chains (Table I). Conceivably, the presence of dileucine motifs in S5b localizes some 26 S protease complexes to membranes where they may function to down-regulate receptors or other transmembrane proteins. A transmembrane protein could be degraded through two separate pathways, the cytoplasmic domain by the 26 S protease and the extracellular domain within lysosomes.
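The kind of sequence scan described above (locating Leu-Leu pairs, their flanking residues, and the tyrosine-based tetrapeptides) can be sketched as follows. This is a minimal illustration, not the analysis used to build Table I: the function names are hypothetical, the choice of two flanking residues on each side is an assumption, and the example sequence is a made-up toy peptide rather than the S5b sequence.

```python
import re

# Residues counted as charged in the Table I legend (K, D, E, H, and R).
CHARGED = set("KDEHR")


def dileucine_sites(seq: str, flank: int = 2):
    """Return (1-based position, left flank, right flank) for every Leu-Leu pair.

    A lookahead is used so that overlapping pairs (e.g. in LLL) are all reported.
    `flank` residues on each side is an illustrative assumption.
    """
    sites = []
    for match in re.finditer(r"(?=LL)", seq):
        i = match.start()
        left = seq[max(0, i - flank):i]
        right = seq[i + 2:i + 2 + flank]
        sites.append((i + 1, left, right))
    return sites


def charged_flank_fraction(sites) -> float:
    """Fraction of flanking residues that are charged, pooled over all sites."""
    flanks = "".join(left + right for _, left, right in sites)
    if not flanks:
        return 0.0
    return sum(aa in CHARGED for aa in flanks) / len(flanks)


def find_motifs(seq: str, motifs=("GPLY", "NPEY", "NPNY")):
    """Return 1-based start positions of each tetrapeptide named in the text."""
    return {m: [x.start() + 1 for x in re.finditer(f"(?={m})", seq)]
            for m in motifs}


if __name__ == "__main__":
    toy = "MKELLDRAGNPNYSTELLKQ"  # hypothetical sequence, not S5b
    sites = dileucine_sites(toy)
    print(sites)
    print(charged_flank_fraction(sites))
    print(find_motifs(toy))
```

Applied to the full S5b sequence of Fig. 1, a scan of this kind would be expected to report the NPNY motif at residue 476 and the dileucine pairs listed in Table I, although the exact pair count depends on how runs of three or more leucines are treated.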
In fact, ubiquitin has been implicated in the negative regulation of the platelet-derived growth factor β-receptor, possibly by promoting the degradation of the ligand-activated receptor (16). In addition, the apparent role of the 26 S protease in antigen presentation (17) might require its direct association with the endoplasmic reticulum. Peptides generated by the protease would then be immediately available for transport across the endoplasmic reticulum membrane. In summary, we have isolated a cDNA that encodes a 50-kDa subunit of the 26 S protease. Although we cannot currently assign a function to S5b, the availability of both antibodies to S5b and a histidine-tagged version of the recombinant protein should facilitate future characterization of this 26 S protease component.

FIG. 3. Western blot analysis of 26 S subunits separated on two-dimensional SDS-polyacrylamide gels. 26 S protease subunits were separated by two-dimensional gel electrophoresis as described under "Experimental Procedures." The separated proteins were stained with Coomassie Brilliant Blue (panel A) or transferred to nitrocellulose and incubated with 125I-labeled ubiquitin polymers (panel B) or antibodies raised against S5b (panel C). Arrows denote S5a (white) and S5b (black) subunits. For panels B and C, the position of each subunit was determined by Ponceau S staining prior to incubations with 125I-labeled ubiquitin chains or anti-S5b antibodies, respectively. Regulatory complexes were loaded in the far left lane of each second dimension gel shown and resolved based upon molecular weight only. The migration of molecular weight markers is pictured to the left of panel A. In panel C, the immune-reactive protein migrating faster than S5b and just to the right of pI 6.0 appears to be subunit 9.

TABLE I
A comparison of dileucine repeats in S5b and various transmembrane proteins
The sequences shown in the center column are dileucine motifs from various transmembrane proteins (15). Some of the dileucine repeats have been shown to be essential for sorting of the listed molecules (17). The column at the right lists the nine dileucine pairs and flanking amino acids from subunit 5b of the 26 S protease. Whereas the frequency of dileucine pairs is expected from the high percentage of leucines in S5b (13%), the number of charged residues flanking the Leu-Leu pairs (17) is double the expected eight based upon the frequencies of K, D, E, H, and R in S5b. The fraction of charged residues surrounding the dileucine repeats is 0.44 in the transmembrane proteins and 0.47 in S5b. The dileucine repeats and surrounding residues are presented in one-letter amino acid code.
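The arithmetic behind the Table I legend can be checked with a short sketch. Several quantities here are assumptions and not values stated in the text: the legend does not say how many flanking positions were scored, but 17 charged residues at a fraction of 0.47 implies about 36 positions, i.e. four per pair; a length of 500 residues is a round illustrative figure for a 56-kDa protein; and the binomial tail is an added illustration that treats positions as independent, not a test reported by the authors.

```python
from math import comb

# Values taken from the Table I legend.
N_PAIRS = 9            # dileucine pairs in S5b
OBSERVED_CHARGED = 17  # charged residues flanking the pairs
EXPECTED_CHARGED = 8   # expected from K, D, E, H, R frequencies in S5b
LEU_FRACTION = 0.13    # fraction of leucine in S5b

# Assumptions (not stated in the text).
FLANKS_PER_PAIR = 4    # inferred from 17 / 0.47, about 36 positions in total
ASSUMED_LENGTH = 500   # illustrative length for a 56-kDa protein

flanking_total = N_PAIRS * FLANKS_PER_PAIR
observed_fraction = OBSERVED_CHARGED / flanking_total  # about 0.47
background = EXPECTED_CHARGED / flanking_total         # implied background, about 0.22


def expected_dileucines(length: int, leu_fraction: float) -> float:
    """Expected number of adjacent Leu-Leu pairs if residues were independent."""
    return (length - 1) * leu_fraction ** 2


def binomial_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    print(round(observed_fraction, 2))  # 0.47, matching the legend
    print(round(background, 2))
    print(round(expected_dileucines(ASSUMED_LENGTH, LEU_FRACTION), 1))
    print(binomial_tail(flanking_total, OBSERVED_CHARGED, background))
```

Under these assumptions, the expected number of dileucine pairs is close to the nine observed, in line with the legend's statement that their frequency follows from the leucine content, whereas the excess of charged flanking residues is not explained by composition alone.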