J Biol Chem, Vol. 273, Issue 26, 16400-16408, June 26, 1998
From the Departments of Molecular Biology and
We describe a new cystatin in both mice and
humans, which we termed leukocystatin. This protein has all the
features of a Class II secreted inhibitory cystatin but contains lysine
residues in the normally hydrophobic binding regions. As determined by cDNA library Southern blots, this cystatin is expressed selectively in hematopoietic cells, although fine details of the distribution among
these cell types differ between the human and mouse mRNAs. In
addition, we have determined the genomic organization of mouse leukocystatin, and we found that in contrast to most cystatins, the
leukocystatin gene contains three introns. The recombinant proteins
corresponding to these cystatins were expressed in Escherichia coli as N-terminal glutathione S-transferase or
FLAG™ fusions, and studies showed that they inhibited papain and
cathepsin L, but with lower affinities than other cystatins. The unique
features of leukocystatin suggest that this cystatin plays a role in
immune regulation through inhibition of a unique target in the
hematopoietic system.
Cysteine proteases play many important roles in the
immune system. For instance, the deubiquitinating enzymes are cysteine proteases, whereas lysosomal proteases are involved in antigen presentation both through the degradation of proteins to antigenic peptides and by processing the invariant chain of class II major histocompatibility complexes (1). However, the overexpression of these
proteases can be detrimental to cells, as can their release into the
extracellular space. Therefore, their activities in these cells are
controlled by a variety of mechanisms, including the presence of
macromolecular protease inhibitors.
The cystatins make up a class of tight-binding, reversible, competitive
inhibitors of the papain family of cysteine proteases. Cystatins have
been divided into four classes based upon their sequences and
properties. Class I, also called the stefins, are a group of
intracellular proteins of approximately 100 residues that contain no
disulfide bonds. Class II cystatins are secreted inhibitors of about
120 amino acids containing two disulfide bonds. Class III cystatins,
known as the kininogens, contain three domains, each of which resembles
Class II cystatins; two of these domains possess inhibitory activity.
Finally, Class IV cystatins constitute a poorly understood group of
glycoproteins with two nonfunctional cystatin domains. The amino acid
sequences and genomic structures within each family are highly
conserved. Cystatins are expressed throughout the body in a
tissue-specific manner. Mutations in some cystatins or alterations in
the balance of these with their cognate cysteine proteases have been
implicated in several diseases (2, 3). Many studies, involving changes
of peptide sequence, have shown that three regions of the cystatin,
which form a "wedge" that can associate with the active-site cleft,
are all required for tight binding to the protease. These studies have
been confirmed by the crystal structure of the cystatin B-papain
complex (4) and supported by other structural studies showing that
chicken egg white cystatin, a Class II cystatin, has the same fold as the Class I cystatin B (5, 6).
In this paper, we describe the characterization of a new Class II
cystatin, leukocystatin, specifically expressed by hematopoietic cells.
The unique features of the amino acid sequence suggest that the as yet
unidentified target protease is not one of the commonly studied
lysosomal cysteine proteases, although leukocystatin is an active
inhibitor of these cathepsins. In addition, the unusual genomic
structure of the mouse protein and the amino acid sequences of both the
human and mouse inhibitors suggest that they are quite divergent from
other Class II cystatins.
General--
Antisera to the human protein (Josman Laboratories,
Napa, CA), used for protein blotting, were produced in rabbits against the peptide GFPKTIKTNDPGVLQAAR, which was synthesized on a lysine matrix (Biosynthesis, Inc., Lewisville, TX). Protein blots were detected with the Enhanced Chemiluminescent Detection System (Amersham Pharmacia Biotech). Chicken egg white cystatin was from PanVera (Madison, WI). Automated DNA sequencing was performed with a
DyeTerminator Cycle Sequencing Ready Reaction kit on an Applied
Biosystems Prism 377 DNA sequencer (both from Perkin-Elmer). PCRs were
performed on a Perkin-Elmer 9600 with a GeneAmp PCR kit (Perkin-Elmer)
followed by purification with QIAquick Gel Extraction kits (Qiagen,
Santa Clarita, CA). Oligonucleotides were synthesized using an Applied Biosystems 394 synthesizer (Perkin-Elmer).
cDNA Libraries--
cDNA libraries are listed on the
Southern blots (see Figs. 5 and 6) and were made with the SuperscriptII
system (Life Technologies Inc.) as
detailed elsewhere (Refs. 7 and 8 and references therein). Details are
also available directly from the authors. In cases where the derivation
is not obvious from the name, conditions used are listed below. Mouse
libraries: 2) Braf:ER transfectant NIH3T3 cell line, ethanol-treated;
3) Mel14 bright CD4+ cells from spleen, polarized for 7 days with
IFN-
Identification and Characterization of Human Leukocystatin
cDNA--
An average of 375 base pairs of unambiguous
sequence (EST) was determined from individual clones of cDNA libraries 84 and 86. The sequences were analyzed for possible encoded
function by BLASTN searches versus the public data bases,
followed by BLASTP searches of the open reading frames (9). By this
method, two leukocystatin ESTs (from a total of 1190) were identified.
Both cDNAs were completely sequenced on each DNA strand and were
found to be full-length.
Isolation and Characterization of Mouse Leukocystatin
cDNA--
160 pools of approximately 500 clones each from a mouse
TH2 cDNA library (Ref. 10; Library 4) were amplified overnight to form sublibraries. Southern blots were performed on the sublibraries using a 32P-labeled 321-base pair probe to the human
sequence (see below), washing with cross-species conditions (2× SSC,
0.1% SDS at 65 °C). One of the 160 sublibraries showed a positive
signal. A bacterial stock from this pool was plated out, and colony
hybridization was conducted under the same conditions to yield several
possible positive clones, one of which was selected and found to encode a full-length copy of mouse leukocystatin.
cDNA Library Southern Blots--
The method presented by
Bolin et al. (7) was followed. Briefly,
NotI/SalI digests of 5 µg of cDNA library
released the cDNA inserts from the vector. Digestion reactions were
run on 1% agarose gels, transferred to Nytran+ filters (Schleicher and Schuell), and cross-linked with a UV Stratalinker 1800 cross-linker (Stratagene, La Jolla, CA). For the human blot, a 321-base pair 32P-labeled probe was synthesized with
[32P]dCTP (Amersham Pharmacia Biotech) using the
Rediprime system (Amersham Pharmacia Biotech). Hybridization with
1.5 × 106 cpm/ml was performed at 60 °C in
ExpressHyb (CLONTECH, Palo Alto, CA) followed by
washes in 0.5× SSC, 0.1% SDS. The 343-base pair 32P-labeled mouse probe was made using a Prime-IT II kit
(Stratagene) followed by purification on a Centrisep column (Princeton
Separations, Adelphia, NJ). Hybridization was performed in 0.5 M sodium phosphate, pH 7.2, 7% SDS, 0.5 mM
EDTA at 65 °C followed by washes in 0.1× SSC, 0.1% SDS.
Intensities of the bands were quantitated with a Molecular Dynamics
Personal Densitometer (Sunnyvale, CA) scan of the developed x-ray film
(Kodak BioMax, Rochester, NY).
Genomic DNA Sequence--
A 129SV mouse genomic library
(Stratagene) was screened with a 617-base pair 32P-labeled
probe (complementary to the mouse cDNA sequence) in QuikHyb
(Stratagene) using conditions recommended by the manufacturer. Of
approximately 10 million clones, 4 were identified as being positive.
Through a series of PCRs using primers that hybridized to various
portions of the leukocystatin sequence, one clone was shown to contain
the entire gene. In order to fully sequence this gene, a series of PCRs
(26 reactions total) were performed. The resulting overlapping
fragments were sequenced in both directions, generating 12 kilobases of
sequence consistent with the cDNA sequence.
Recombinant Protein Expression and Purification--
The PCR
primers listed in Table I were used to
amplify the leukocystatin sequence with appropriate restriction sites.
These amplimers were subcloned into the
BamHI/NotI or BamHI/EcoRI
sites of pGEX-4T-1 (Amersham Pharmacia Biotech) or
HindIII/EcoRI sites of pFLAG (IBI, Eastman
Kodak). The coding regions of the constructs were completely sequenced,
and DNA preparations using QIA filter plasmid maxi kits (Qiagen) were
made for transformation into the Escherichia coli strains
used for protein expression.
,
thi-1); human long, X156F (leu-6, proC34, purE42, trpE38,
thi-1, ara14, lacY1, galK2, xyl-5, tsx67, azi-6, rpsL109, supE44);
mouse short, Ut4400 (azi-6, lacY1, leu6, mtl-1, proC14, rpsL109,
thi-1, trpE38, tsx67, entA403, fepA). The E. coli
strain containing the pGEX-cystatin fusion was grown as a 15-liter
fermenter batch in Lym-1 medium (3% casamino acids, 2% yeast extract,
0.5% KH2PO4, 20 g/liter glycerol, 1 g/liter
MgSO4, 50 mg/liter ampicillin). The culture was induced
with 0.4 mM
isopropyl-1-thio-β-D-galactopyranoside at an
A600 of 4 for 4 h at 37 °C. Cell pellets
were resuspended in 1 liter of TE with a Brinkmann (Westbury, NY)
polytron PT3000 homogenizer and ruptured with three passes through a
microfluidizer. The homogenate was centrifuged at 23,000 × g for 45 min, and then the pellets were washed with TE/1%
Triton X-100, TE alone, and finally 2 M guanidinium
chloride/20 mM Tris-HCl, pH 8; they were then solubilized
in 8 M guanidinium chloride, 50 mM Tris-HCl, pH
8.25, 2 mM EDTA, 1 mM Pefabloc (Enterchem,
Stamford, CT), 10 mM DTT (10 ml/g of inclusion bodies).
Following centrifugation at 23,000 × g for 30 min to
remove insoluble material, the denatured protein was diluted 1000×
into 0.4 M guanidinium chloride, 50 mM
Tris-HCl, pH 8.25, 2 mM EDTA, 1 mM Pefabloc,
2.5 mM reduced glutathione, 1 mM oxidized
glutathione and renatured at 4 °C overnight. The refolded material
was filtered, concentrated, and diafiltered into 50 mM
Tris-HCl, pH 8. The protein was loaded on a glutathione-Sepharose 4B
column at 2 ml/min. The column was washed with phosphate-buffered saline (PBS, Life Technologies, Inc.) and then PBS containing 0.5 M NaCl and subsequently eluted with 20 mM
reduced glutathione/PBS. The fractions containing leukocystatin (as
determined by Western blot) were pooled, diluted with 50 mM
Tris-HCl, pH 7, and loaded on an S-Sepharose column. The bound protein
was eluted with a linear 0-1 M NaCl gradient, the
leukocystatin-containing fractions were pooled, and the fusion was
cleaved with thrombin (4 units/µg of protein) for 1 h at
37 °C. After thrombin inhibition with hirudin (0.2 units/µg of
protein; Sigma) and 1 mM Pefabloc, the cleaved protein was
loaded onto a Poros reverse phase column (PerSeptive Biosystems,
Framingham, MA). The column was washed with 2% acetonitrile/0.1% trifluoroacetic acid and eluted with a 2-80% acetonitrile
gradient containing 0.1% trifluoroacetic acid.
Leukocystatin-containing fractions, as determined by Western blot, were
pooled, dialyzed into 50 mM NaOAc, pH 4, and stored at
4 °C.
The expression of the FLAG-tagged cystatin was similar, with the
following E. coli strains being used to produce the protein: human short, UT4400; human long, W3110. However, following cell harvesting, the periplasmic fraction was obtained by osmotic shock (1 h
at 4 °C in 50 mM Tris-HCl, pH 8, 2 mM EDTA,
20% sucrose, 0.1 mg/ml lysozyme). The volume was doubled using the
above buffer, and benzonase (25,000 units/liter of extract; American
International Chemical Inc., Natick, MA) was added. After incubation
for 10 min, the suspension was centrifuged at 27,500 × g for 45 min. The inhibitor was purified from the
supernatant by chromatography over a 5-ml M2 column (Kodak Scientific
Imaging Systems) and eluted with 20 mM glycine
hydrochloride, pH 3. Following dilution into 20 mM sodium
citrate, pH 4, those fractions containing cystatin were further
chromatographed on an S-Sepharose column and eluted with a 0-1 M NaCl gradient in 20 mM sodium citrate, pH
4.
N-terminal amino acid sequencing (ABI 476 Protein Sequencer) was
performed for all forms and agreed with the predicted sequences. The
mouse short and FLAG-human long materials were quantitated by amino
acid analysis (Hewlett Packard AminoQuant using the manufacturer's standards), and the concentration obtained for the FLAG-human long form
agreed within 2-fold with that obtained by densitometry (Molecular
Dynamics Personal Densitometer) scanning of a silver-stained (Daiichi,
Integrated Separation Systems, Natick, MA) 10% Novex Bis-Tris gel
using lysozyme as a concentration standard. Determination of protein
concentrations for the other variants was performed by densitometry of
silver-stained gels using both lysozyme and the FLAG-human long
cystatin as standards. Final yields of protein were as follows: mouse
short, 0.57 mg/15 liters; human short, 0.012 mg/15 liters; human long,
1 mg/11 liters; FLAG-human short, 0.6 mg/15 liters; FLAG-human long, 1 mg/2 liters.
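Because the preparations were grown at different culture volumes, the yields listed above are easier to compare per liter of culture. A minimal sketch of that normalization follows. The masses and volumes are the reported ones; the dictionary layout and the function name `yield_per_liter` are illustrative choices and not part of the original work.

```python
# Reported final yields as (mg of purified protein, liters of culture).
# The numbers come from the text; the data layout is an illustrative choice.
REPORTED_YIELDS = {
    "mouse short": (0.57, 15),
    "human short": (0.012, 15),
    "human long": (1.0, 11),
    "FLAG-human short": (0.6, 15),
    "FLAG-human long": (1.0, 2),
}


def yield_per_liter(yields):
    """Return mg of protein per liter of culture for each construct."""
    return {name: mg / liters for name, (mg, liters) in yields.items()}


if __name__ == "__main__":
    per_liter = yield_per_liter(REPORTED_YIELDS)
    # Print constructs from highest to lowest normalized yield.
    for name, value in sorted(per_liter.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:18s} {value:.4f} mg/liter")
```

On this basis the FLAG-human long form gave the highest yield per liter and the human short form the lowest.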
Refolding of Chicken Egg White Cystatin--
2 mg of chicken egg
white cystatin was concentrated to 100 µl and then incubated in 1 ml
of 8 M guanidinium chloride in 50 mM Tris-HCl,
pH 8.25, 10 mM DTT at room temperature for 2 h. The unfolded protein was diluted into 50 mM Tris-HCl, pH 8.25, 2.5 mM reduced glutathione, 1.0 mM oxidized
glutathione and incubated for 12 h at 4 °C. Following dialysis
into 50 mM Tris-HCl, pH 8, the refolded material was
purified on a Poros reverse phase column, eluting with a 2%
Inhibition Studies--
The method described by Abrahamson (11)
was used, with some alterations. Papain activity (Sigma; final
concentration, 290 pM) was fluorimetrically monitored (SPEX
Fluorolog, Instruments SA, Edison, NJ) with 5-25 µM
carbobenzoxy-L-phenylalanyl-L-arginine-7-amino-4-methylcoumarin (Bachem, King of Prussia, PA) in 100 mM sodium phosphate,
100 mM NaCl, 5 mM DTT, 1 mM EDTA,
1% N,N-dimethylformamide, 0.01% Brij-35, pH
6.7, at room temperature (25 °C). The inhibition of human liver
cathepsin B (290 pM; Calbiochem, San Diego, CA) and human
liver cathepsin L (170 pM; Calbiochem) were determined in 100 mM NaOAc, 100 mM NaCl, 5 mM
DTT, 1 mM EDTA, 1%
N,N-dimethylformamide, 0.01% Brij-35, pH 5.5. The substrates were
carbobenzoxy-L-argininyl-L-arginine-7-amino-4-methylcoumarin (Bachem) and
carbobenzoxy-L-phenylalanyl-L-arginine-7-amino-4-methylcoumarin for the two cathepsins, respectively. Uninhibited rates,
vo, were measured, and then inhibitor was added,
ensuring that in all studies, at least 1 eq of inhibitor was present.
The reaction was allowed to re-equilibrate for 25 min (a time course
showed that equilibrium was achieved at this point for this range of inhibitor concentrations), and the inhibited rate,
vi, was determined. Apparent Ki
values were determined from the equation [I]t/(1
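The equation above is cut off in this copy. Its visible left-hand side, [I]t/(1 ..., matches the start of the Henderson linearization for tight-binding inhibitors, [I]t/(1 - vi/vo) = [E]t + Ki,app (vo/vi), so the sketch below assumes that form. This is an assumption, not a recovery of the missing text. The code simulates fractional activities for a reversible 1:1 tight-binding inhibitor and then recovers the apparent Ki as the slope of a straight-line fit. All function names, and the numerical values in the note that follows, are hypothetical.

```python
import math


def fractional_activity(e_total, i_total, ki_app):
    """Return vi/vo for a reversible tight-binding inhibitor (1:1 complex).

    Solves the binding quadratic for the enzyme-inhibitor complex. All
    concentrations must share one unit (for example nM).
    """
    s = e_total + i_total + ki_app
    complex_conc = (s - math.sqrt(s * s - 4.0 * e_total * i_total)) / 2.0
    return 1.0 - complex_conc / e_total


def henderson_fit(i_totals, activities):
    """Fit a least-squares line to Henderson-transformed data.

    x = vo/vi and y = [I]t / (1 - vi/vo). Returns (ki_app, e_total), that is
    (slope, intercept), under the ASSUMED Henderson form described above.
    """
    xs = [1.0 / a for a in activities]
    ys = [i / (1.0 - a) for i, a in zip(i_totals, activities)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

For example, with hypothetical values of 0.29 nM enzyme and a 0.5 nM apparent Ki, passing activities simulated by `fractional_activity` at inhibitor concentrations of 0.3, 0.6, 1.2, and 2.4 nM to `henderson_fit` returns the input constants to within rounding error.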
Mouse and Human Leukocystatin cDNA Cloning and
Sequence--
Almost 1200 ESTs were determined from a cDNA library
of resting and activated human dendritic cells. Of these, 67%
corresponded to sequences in the GenBankTM data base with
known function (as of May 1996) or to repeat elements. Included among
these were the known cystatins A (three copies), B (one copy), and C
(four copies). Many of the remaining sequences were identical to
sequences in public EST data bases. Of the unknown sequences, two
corresponded to the same 867-base pair cDNA and appeared to encode
a protein related to known cystatins. Further sequence analysis of
these clones confirmed this notion, revealing a full-length open
reading frame corresponding to the protein we now designate
leukocystatin. Two possible in-frame start codons, at nucleotide
positions
roughly the same amount as mouse and human cystatin C they are only about 22%
identical to the cystatin C N termini of the same species. As noted
above, the human sequence, but not that of the mouse, has two possible start codons. However, PSORT would not predict a signal sequence for
the amino acids coded by 66 to 1 but would instead predict that
leukocystatin is a Type II transmembrane protein.
Other interesting features of the leukocystatin amino acid sequence
include additional cysteine residues, one of which may be involved in
stabilizing a homodimeric form of the protein (see below), and lysine
residues at positions 35 and 84. Both lysine residues lie in the
putative protease binding regions and replace nonpolar residues found
in all other cystatins. The possible significance of these residues is
examined under "Discussion."
Genomic Structure-- A mouse genomic sequence corresponding to leukocystatin was found, and approximately 12 kilobases was sequenced. The mouse leukocystatin gene (deposited in the GenBankTM data base under accession no. AF031826) contains four exons (Fig. 4). The N-terminal intron represents an additional break in the sequence relative to other Class II cystatin genes, splitting the codon for Asp24. This first intron is particularly large, about 5 kilobases. The sites of the two C-terminal exon-intron junctions are relatively conserved in comparison with other known Class II and Class III junctions from animals. The second intron, lying between the codons for Gln81 and Val82, is approximately 1.8 kilobases, whereas the third is 0.6 kilobases and is located between the codons for Arg120 and Thr121. The sequences at all the junctions are in agreement with GT/AG consensus sequences.
mRNA Tissue Distribution-- Using cDNA library Southern blots (7), we determined the tissue distribution of mouse and human leukocystatin mRNAs (Figs. 5 and 6). This technology uses high-quality representational cDNA libraries in place of the more typical Northern blots and is especially useful for examining mRNA tissue distributions in cases where these mRNAs are particularly hard to obtain. We have confirmed many of these results with semiquantitative PCR (8, data not shown). In an analysis that addressed 61 human and 47 rodent cell types and/or activation conditions, human leukocystatin mRNA was found mainly in resting T-cells, premonocytic cells, activated dendritic cells derived from stem cells, and some natural killer cell clones. Interestingly, however, human leukocystatin was not seen in dendritic cells derived from monocytes, only in those from stem cells. Many differences between these two cell types have been observed by others, although it is not clear what these differences mean functionally (15). The mouse leukocystatin mRNA, in contrast, is found primarily in differentiated T-cells, although there seems to be little difference in the levels between TH1 and TH2 cells; however, little is found in naive and pre-T-cells. A moderate amount is also found in monocytes, whereas B-cells, dendritic cells, and some macrophage libraries show small amounts of cDNA corresponding to this protein. The small amounts seen in lymph nodes, thymus, and spleen probably result from resident lymphocytes. The absence of readily detectable mouse leukocystatin mRNA in splenic and bone marrow mouse dendritic cells may reflect their lineage.
Protein Production-- Proteins of two types were made in order to study the role of the N terminus in binding: the short form beginning with the conserved glycine at position 37, and the long form commencing with Gly19. Gly19 is predicted to be the last amino acid of the signal sequence, and so this version should represent the complete mature protein. Recombinant leukocystatin was produced in E. coli as either an N-terminal GST or FLAG fusion. Similar methods have been used previously to express other family members (16-20). The FLAG-tagged material was isolated as soluble material from the periplasm; the GST fusion, however, was primarily recovered as inclusion bodies which had to be refolded. Although the Class I cystatin A requires refolding after expression in E. coli (21, 22), no other Class II cystatin is insoluble when expressed in E. coli, even when expressed as the GST fusion (20). The mature protein was isolated following thrombin cleavage of the GST moiety or enterokinase cleavage of the FLAG tag. SDS-polyacrylamide gel electrophoresis showed the expected molecular weights and that the proteins were reasonably pure (Fig. 7). Nonreduced gels, however, suggest that the long form may exist primarily as a dimer (Fig. 8): as DTT concentrations are varied from 0 to 8 mM, this protein product exists as a dimer and as a monomer, respectively, as determined by apparent molecular weights. Amino acid sequencing and detection with a leukocystatin-specific antibody confirmed that the correct proteins were isolated. Because freeze-drying the purified protein resulted in material that could not be resolubilized, the protein was stored at pH 4.0 at 4 °C.
Inhibition of Cysteine Proteases-- We studied the inhibition of three cysteine proteases (papain, cathepsin B, and cathepsin L) by standard methods (11). All studies were carried out near the pH optima of the proteases and in the presence of 5 mM DTT, which may partially denature the cystatins but was necessary to maintain maximal activity of the cysteine proteases. Using this method, we confirmed that chicken egg white cystatin binds tightly to cysteine proteases: an apparent inhibition constant of 90 pM was obtained versus cathepsin B, whereas binding to papain was too tight to quantitate. Identical results were obtained following denaturing and renaturing of the chicken egg white cystatin (data not shown). These results are consistent with published values (Table II). We found that binding of leukocystatin to the cysteine proteases studied is slow and is also weaker than that of other Class II cystatins (Table II). In fact, we could detect no inhibition of cathepsin B activity with leukocystatin (lower limit of detection, Kd approximately 200 nM), although subnanomolar inhibition constants were found versus papain and cathepsin L. Whereas the different N-terminal forms of leukocystatin had little effect on the inhibition of papain, a 10-fold increase in affinity for cathepsin L was seen with the leukocystatin long form relative to the short form. Finally, although an affinity could not be determined quantitatively, we found that a cysteine-linked dimer of the leukocystatin long form was not as effective a papain inhibitor as the reduced form. So, although leukocystatin is a functional Class II inhibitor, its unique amino acid sequence appears to interfere with binding to the commonly assayed cysteine proteases.
We have discovered a novel hematopoietic cell-specific Class II cystatin from an EST analysis of human dendritic cells. This protein, which we have called leukocystatin, has all the features of a Class II cystatin, but it has some notable characteristics. For example, leukocystatin contains lysine residues at two positions that are occupied by strictly hydrophobic (residue 35) and small, uncharged (residue 84) amino acids in all other characterized cystatins. Position 35 is thought to bind to the P3 site of the target protease (4, 5), so it is possible that this lysine substitution results in an especially high affinity for a cysteine protease with this preferred specificity. Because residues 81-85 in other cystatins usually form nonspecific hydrophobic interactions with the cognate protease (4), it is likely that the contacts formed by this region also differ from those observed previously. Supporting this, computer modeling has shown that Lys84 would interfere with binding of leukocystatin to papain in this region (see below).
Leukocystatin contains a total of eight cysteines: the four that are
conserved with other Class II cystatins, and four unique cysteines, two
of which are in the leader region (Fig. 3). Conserved cysteine residues
in the N-terminal portion are not seen in any other mature Class II
cystatin molecule, although they do occur sporadically in other
cystatin leader sequences. Two cysteine residues, in positions
different from leukocystatin, also appear in the N-terminal region of
each inhibitory kininogen domain, and a polymorphism in the cystatin D
sequence introduces a cysteine in this area (18, 23). It is possible
that the two additional leukocystatin cysteines in the putative mature
protein form an intrachain disulfide and provide added stability. This,
however, is not supported by the evidence. Participation of
Cys63 in an intrachain disulfide could only occur if the
leukocystatin structure is markedly different from chicken egg white
cystatin or if the N terminus folds back because the structure of
chicken egg white cystatin shows the amino acid corresponding to
Cys63 at the end of an
The mouse leukocystatin gene contains four exons (Fig. 4), unlike most other members of the Class II cystatins, which have three (25). Soyacystatin is the only known molecule containing one cystatin domain and having a gene encoding four exons. The additional exon in that case, however, lies in a unique C-terminal extension (26). Because the amino acids encoded by the first two exons of leukocystatin are very different from other Class II cystatins, it is clear that the evolution of this region is very different from other family members. The C-terminal genomic organization, however, is similar to the other Class II cystatins, with the intron/exon boundaries being conserved. Furthermore, the N-terminal portion of leukocystatin is not similar to Class I cystatins: the Class I genomic organization is different, with the first intron lying at a position between the first and second leukocystatin introns and with the second lying between the second and third leukocystatin introns (27, 28).
Several forms of leukocystatin were produced in E. coli and were active cysteine protease inhibitors. Although some of these products require refolding, the FLAG-tagged material was soluble, similar to other Class II members expressed in E. coli, including those used for comparison in Table II (16, 17). Although the Class I cystatin A requires refolding following overexpression in E. coli, this was shown to have no adverse effect on activity (21, 22). We further controlled for any effects that refolding may have on activity by examining denatured/renatured chicken egg white cystatin and found no difference in activity following this step. We determined the apparent Ki values of leukocystatin with papain and cathepsin L. These are compared in Table II with published values for other Class II cystatins.
Ki values in the literature vary for the same cystatin-protease pair, probably due to the differing lengths of the N termini in various cystatin preparations; these residues are easily proteolyzed during isolation of native cystatins. In general, the affinity of cystatins for cathepsin B is much weaker than the binding to cathepsin L or papain, a trend that holds for the leukocystatins as well. In fact, no inhibition of cathepsin B was detected, although a reasonable apparent Ki was found for chicken egg white cystatin. One possible explanation for the weaker association of leukocystatin with all examined proteases is the presence of lysine residues at positions 35 and 84, which replace amino acids that are uncharged in every other known cystatin. Both residues are also known to be intimately involved in the binding process of these inhibitors to cysteine proteases (4), and N-terminal truncation or substitution at either of these sites can lead to dramatic decreases in the ability of these cystatins to inhibit cysteine protease activity. For instance, some substitutions in the Gln81-Gly85 loop have been shown to be detrimental to binding, although position 84 was not itself modified in these studies (16). Modeling of a lysine at position 84, based upon the known complexed structure of stefin B to papain (4), indicated that this substitution would cause serious steric interactions at this interface in addition to penalties resulting from desolvation and burying the positive charge of the lysine side chain (data not shown). Furthermore, it has been shown that the residues preceding the conserved glycine 37 are important for binding because removal of these can reduce binding drastically (29-31), in some cases largely due to slower association rates. For the leukocystatins, we have observed that the time to reach equilibrium is rather slow, taking several minutes. 
In our case, this may indicate a necessary conformational change in the protease and/or leukocystatin for binding to occur. Furthermore, Hall et al. (17) postulate that the residue at position 35 may be a primary determinant of protease specificity because these N-terminal residues associate with the protease binding sites (4, 5). Lindahl et al. (32) have shown that an arginine substitution in cystatin C at the position equivalent to Pro36 can have a large impact on binding to cathepsin B or to papain and may even cause displacement of the N terminus from the protease. Although that position is probably more critical to tight binding to the cognate protease than the residue at 35, it demonstrates that changes in the amino acids at these positions can greatly affect the ability of cystatins to inhibit cysteine proteases. It is therefore likely that the native binding partner of leukocystatin is unlike the examined proteases. It is possible that the target is some as yet unidentified lysosomal protease, or even a protease from a different family. For instance, the ubiquitin-hydrolase UCH-L3 has recently been shown by x-ray crystallography to have a papain-like fold, being particularly similar to cathepsin B in the active-site cleft (33), and so may very well be inhibited by cystatins, although no evidence of this is yet in the literature. Particularly intriguing is the fact that this isozyme is primarily found in hematopoietic cells (34, 35) and is specific for the RGG sequence of ubiquitin. Furthermore, a domain of kininogen has shown inhibitory activity against calpains (36), and legumain is inhibited by chicken egg white cystatin (37), demonstrating that other families of cysteine proteases can be inhibited by these sorts of structures. In support of a unique target for the leukocystatins, we found little difference in the binding abilities of long and short forms of cystatin with papain in the presence of 5 mM DTT. 
Based upon the results of experiments with chicken cystatin, in which N-terminally truncated forms were found to not be as efficient inhibitors as the full-length molecules (29-31), we would expect to see dramatic differences in the abilities of these variants to inhibit this enzyme. One possible explanation is that the 8-amino acid N-terminal extension of the longer form impedes binding, although extensions have been shown to have little effect for other cystatins (38). Furthermore, the absence of an effect of the FLAG tag supports the idea that N-terminal extensions do not influence leukocystatin binding. The unique lysine at position 35 may, however, interfere with complex formation. There is also some evidence that the long forms dimerize, and this may interfere with binding. Under the assay conditions, however, a large proportion is likely to be monomeric, as evidenced by the titration shown in Fig. 8. That the interchain cysteine primarily mediates this dimerization is evidenced by the fact that activity increases for the long form with increasing DTT concentrations. If we were to assume that the dimeric form did not bind at all to the studied cysteine proteases (and that all of the inhibition resulted from the monomeric form) this would only result in changing the apparent Ki by a factor of less than 2 (in favor of tighter binding), because the effective concentration would be changed by this amount. Although there was no effect on papain inhibition, there was a 10-fold increase in binding affinity to cathepsin L with the long form, showing that at least for this particular case, the N terminus contributes to binding. This supports the idea that the various portions of cystatins are differentially involved in association to individual proteases, even though the three-dimensional structures of these proteases are very similar. We would expect the native binding partner of leukocystatin to fully take advantage of the unique features in these sites. 
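The bound of less than 2-fold discussed above follows from simple scaling: if only the monomer binds, the effective inhibitor concentration is the total concentration multiplied by the monomeric fraction, and to a first approximation the apparent Ki scales the same way. A minimal sketch under that assumption is given here. The monomeric fraction is treated as a free parameter because no numerical value is reported for it, and the function names are illustrative.

```python
def corrected_ki(ki_app, monomer_fraction):
    """Scale an apparent Ki by the fraction of inhibitor assumed to be active.

    Assumes only the monomer binds and that Ki scales linearly with the
    effective inhibitor concentration (a first approximation).
    """
    if not 0.0 < monomer_fraction <= 1.0:
        raise ValueError("monomer_fraction must be in (0, 1]")
    return ki_app * monomer_fraction


def fold_change(monomer_fraction):
    """Fold tightening of the apparent Ki implied by a monomeric fraction."""
    if not 0.0 < monomer_fraction <= 1.0:
        raise ValueError("monomer_fraction must be in (0, 1]")
    return 1.0 / monomer_fraction
```

Any monomeric fraction above one half therefore corresponds to a correction of less than 2-fold, which is consistent with the statement that a large proportion of the protein is monomeric under the assay conditions.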
Leukocystatin was shown by cDNA library Southern blots to be expressed selectively in hematopoietic cells. Examination of a wide variety of immune cell types suggests that the highest levels are expressed in T-cells, monocytes, and dendritic cells. Clearly, a search for a specific target protease should focus on the effector functions of these immune cell types. Currently, we are developing other tagged versions of leukocystatin, additional antibody reagents, and a mouse gene knockout to probe in depth the biological role of this novel Class II cystatin. In conclusion, we have characterized a new Class II cystatin, termed leukocystatin, which has a novel sequence, including unique lysine residues at two important protease binding sites, and a distinct distribution in hematopoietic cells.
We thank Felix Vega for peptide sequencing, Connie Huffine and Anh Quan for assisting with DNA sequencing, David Cambell for preliminary protease assays, and Jackie Timans for making the FLAG-human short cystatin construct.
* DNAX is supported by Schering-Plough Corporation. The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. The nucleotide sequence(s) reported in this paper has been submitted to the GenBankTM/EMBL Data Bank with accession number(s) AF031824, AF031825, and AF031826.
¶ To whom correspondence should be addressed: Dept. of Molecular Biology, DNAX Research Institute, 901 California Ave., Palo Alto, CA 94304-1104. Tel.: 650-496-5255; Fax: 650-496-1214; E-mail: Gerard{at}dnax.org.
1
The abbreviations used are: IFN-
2 Short form, amino acids 37-146; long form, amino acids 20-146. Amino acid numbering is defined in Fig. 3.