Equistatin, a New Inhibitor of Cysteine Proteinases from Actinia equina, Is Structurally Related to Thyroglobulin Type-1 Domain*

It is well known that the activities of the lysosomal cysteine proteinases are tightly regulated by their endogenous inhibitors, cystatins. Here we report a new inhibitor of cysteine proteinases isolated from sea anemone Actinia equina. The inhibitor, equistatin, is an acidic protein with pI 4.7 and molecular weight of 14,129. It binds tightly and rapidly to cathepsin L (k_a = 5.7 × 10⁷ M⁻¹ s⁻¹, K_i = 0.051 nM) and papain (k_a = 1.2 × 10⁷ M⁻¹ s⁻¹, K_i = 0.57 nM). The lower affinity for cathepsin B (K_i = 1.4 nM) was shown to be due mainly to a lower second order association rate constant (k_a = 0.04 × 10⁶ M⁻¹ s⁻¹). The inhibitor is composed of 128 amino acids forming two repeated domains with 48% identity. Neither of the domains shows any sequence homology to cystatins, but they do show a significant homology to thyroglobulin type-1 domains. A highly conserved consensus sequence motif of Cys-Trp-Cys-Val together with conserved Cys, Pro, and Gly residues is present in major histocompatibility complex class II-associated p41 invariant chain, nidogen, insulin-like growth factor proteins, saxiphilin domain a, pancreatic carcinoma marker proteins (GA733), and chum salmon egg cysteine proteinase inhibitor. In each of the domains of equistatin, the three residues are similarly conserved, and the sequences Val-Trp-Cys-Val and Cys-Trp-Cys-Val are present in domains a and b, respectively. We suggest that equistatin belongs to a new superfamily of protein inhibitors of cysteine proteinases named thyroglobulin type-1 domain inhibitors. This superfamily currently includes equistatin, major histocompatibility complex class II-associated p41 invariant chain fragment, and chum salmon egg cysteine proteinase inhibitor.
Sea anemones are known to be a rich source of a variety of polypeptide neurotoxins (1, 2) and neuropeptides (3), but little is known about the presence of proteolytic enzymes and their inhibitors. A chymotrypsin-like protease was first isolated from Metridium senile and shown to possess the same zymogen activation and active site chemistry as the proteinase from mammalian pancreas (4). Early reports on the existence of proteinase inhibitors in different species of sea anemones (5-8) were followed by the isolation (9) and primary structure determination of an elastase inhibitor from Anemonia sulcata (10). The inhibitor was found to be a nonclassical Kazal-type inhibitor with respect to positioning of the half-cystines. More recently, the structure of a Kunitz-type proteinase inhibitor purified from the Caribbean sea anemone Stichodactyla helianthus has been determined by NMR spectroscopy (11).
Cysteine proteinases are members of one of the four mechanistic classes of proteinases and, together with their endogenous protein inhibitors, cystatins, play an important role in intracellular degradation (12). They have not yet been found in sea anemones.
In this article we describe the isolation of a new inhibitor of papain-like cysteine proteinases from sea anemone, Actinia equina, designated as equistatin, the kinetic properties of its interaction with papain-like cysteine proteinases, and its amino acid sequence.

EXPERIMENTAL PROCEDURES
Enzymes-Papain (2× crystallized) and clostripain were purchased from Sigma (Germany), and Ep-475, a specific inhibitor of cysteine proteinases, was obtained from Peptide Research Foundation (Japan). The Staphylococcus aureus V8 proteinase was obtained from Miles (UK), and glycyl endopeptidase was a gift from Dr. Alan J. Barrett (The Babraham Institute, Cambridge, UK) and was prepared as described (13). Recombinant human cathepsin B and human cathepsin L were prepared as described previously (14, 15).
Inhibitor Purification-A. equina specimens were collected on the northern coast of the Adriatic Sea. The anemones (3 kg) were frozen, partially thawed, cut into small pieces, and homogenized in 4.5 liters of deionized water. Nonsoluble material was removed by centrifugation at 13,000 × g for 45 min. The supernatant was adjusted to pH 10.5 and incubated at room temperature for 1 h. Neutralization to pH 7.0 was followed by additional centrifugation at 13,000 × g for 45 min. The clear supernatant was applied to a carboxymethyl papain-Sepharose column (6 × 10 cm) previously equilibrated with 0.01 M Tris/HCl buffer, pH 8.0, containing 1 M NaCl and 0.1% Brij. After thorough washing of the column, bound proteins were eluted with 0.01 M NaOH. Fractions (20 ml) were collected and assayed for inhibitory activity toward papain using benzoyl-DL-Arg-β-naphthylamide as substrate (16). The inhibitory fractions were pooled and concentrated by ultrafiltration (Amicon YM-5). The concentrate was applied to a Sephadex G-50 column (4.5 × 140 cm) equilibrated with 0.01 M Tris/HCl buffer, pH 7.7, containing 0.1 M NaCl, and eluted at a flow rate of 18 ml/h. Inhibitory fractions with molecular weights of about 16,000 were pooled, concentrated (Amicon YM-5), and dialyzed against 0.01 M Tris/HCl buffer, pH 7.2. The dialyzed sample was then applied to a DEAE-Sephacel column (2 × 25 cm) equilibrated with the same buffer. The column was washed extensively, and bound proteins were eluted with a linear salt gradient (0-0.1 M NaCl in 0.01 M Tris/HCl buffer, pH 7.2) at a flow rate of 18 ml/h. Equistatin eluted at 0.07 M NaCl.
SDS-PAGE and Analytical Isoelectric Focusing-SDS-PAGE and isoelectric focusing were performed on a PhastSystem apparatus (Pharmacia Biotech Inc.) following the manufacturer's instructions. The inhibitor and molecular weight markers ranging from M_r 14,400 to 94,000 were run in the presence of 0.5% SDS and 5% 2-mercaptoethanol on an 8-25% gradient polyacrylamide gel. The pI of the inhibitor was determined by calibrating the gel with isoelectric focusing marker proteins with pI values ranging from 3.5 to 8.15.
Protein Sequence Determination-Equistatin was reduced overnight with β-mercaptoethanol at 37°C and S-pyridylethylated (17). Pyridylethylated equistatin was hydrolyzed with glycyl endopeptidase as described (18). 4 nmol of pyridylethylated equistatin were fragmented using 2% (w/w) S. aureus V8 proteinase in 0.5 M sodium lactate buffer, pH 4.0, at 37°C for 20 h. Both enzyme hydrolyses were performed in a final volume of 500 μl. Reactions were stopped by the addition of trifluoroacetic acid. The resulting peptide mixtures were separated by high performance liquid chromatography (Milton Roy Co.) using a reverse phase ChromSpher C18 column equilibrated with 0.1% (v/v) trifluoroacetic acid in water. Elution was performed using various linear gradients of 80% (v/v) acetonitrile containing 0.1% (v/v) trifluoroacetic acid. The absorbance was monitored at 215 nm. Protein samples were hydrolyzed in 6.0 M HCl at 110°C for 24 h. Analyses of the peptide hydrolysates were performed on an Applied Biosystems 421A amino acid analyzer with precolumn phenylisothiocyanate derivatization. An Applied Biosystems liquid pulse sequencer 475A, connected on line to a phenylthiohydantoin analyzer 120A from the same manufacturer, was used for automated amino acid sequence analyses.
* This work was supported by the Ministry of Science and Technology of the Republic of Slovenia. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡ To whom correspondence should be addressed.
Determination of Protein Concentration-Protein concentration of equistatin was determined by absorption measurements at 280 nm using a molar absorption coefficient of 28,600 M⁻¹ cm⁻¹ determined by the method of Pace et al. (19) from the amino acid sequence or by the method of Lowry et al. (20) using bovine serum albumin as standard. The concentration of papain was determined spectrophotometrically using a molar absorption coefficient of 56,200 M⁻¹ cm⁻¹ (21).
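The absorbance-based determination above rests on the Beer-Lambert relation, c = A / (ε · l). The following is a minimal sketch of that arithmetic, assuming a 1-cm path length and a hypothetical absorbance reading of 0.286; only the molar absorption coefficients and the molecular weight of 14,129 are taken from the text, and the function name is illustrative.

```python
# Beer-Lambert sketch: concentration from absorbance at 280 nm.
# Coefficients are those quoted in the text; the reading is hypothetical.

EPS_EQUISTATIN = 28_600   # M^-1 cm^-1, from the text
EPS_PAPAIN = 56_200       # M^-1 cm^-1, from the text
MW_EQUISTATIN = 14_129    # g/mol, from the text


def concentration_from_a280(absorbance, epsilon, path_cm=1.0):
    """Return molar concentration c = A / (epsilon * l)."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance / (epsilon * path_cm)


a280 = 0.286  # hypothetical reading, chosen only for illustration
molar = concentration_from_a280(a280, EPS_EQUISTATIN)
mg_per_ml = molar * MW_EQUISTATIN  # mol/L * g/mol = g/L = mg/ml

print(f"equistatin: {molar * 1e6:.2f} uM, {mg_per_ml:.4f} mg/ml")
```

The same function applies to papain by passing `EPS_PAPAIN` instead.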
Active Site Titration-The following buffers were used in all kinetic and equilibrium studies: 0.1 M phosphate buffer, pH 6.0, containing 5 mM dithiothreitol and 1 mM EDTA (for papain and cathepsin B) or 0.34 M sodium acetate buffer, pH 5.5, containing 5 mM dithiothreitol and 1 mM EDTA (for cathepsin L). Active site titrations of cathepsins B and L were performed using cysteine proteinase inhibitor Ep-475 as described previously (22). Papain, further purified by affinity chromatography (23), had a thiol content of 0.92 ± 0.05 mol/mol of enzyme as determined by reaction with 5,5′-dithiobis(2-nitrobenzoic acid).
Active site-titrated papain was used to titrate equistatin as follows. Papain (0.1 μM final concentration) was incubated with increasing amounts of equistatin (0-0.2 μM final concentration) in 200 μl of 0.1 M phosphate buffer, pH 6.0, containing 5 mM dithiothreitol and 1 mM EDTA at 25°C. After 15 min of incubation, 1800 μl of 100 μM Z-Phe-Arg p-nitroanilide was added, and the residual activity of papain was monitored as described previously at 410 nm with a Perkin-Elmer Lambda 18 spectrophotometer (22). The data were analyzed by computer fitting to the theoretical binding equation (24).
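The shape of such a titration curve can be illustrated with a short sketch. The text does not reproduce the binding equation of reference 24, so the code below assumes the common quadratic solution for 1:1 tight binding; the papain concentration (0.1 μM) and inhibitor range (0-0.2 μM) follow the protocol above, the K_i of 0.57 nM is the value reported for papain in the abstract, and the function names are illustrative.

```python
import math

# Assumed model: 1:1 tight binding, E + I <=> EI, with
# [EI] = (s - sqrt(s^2 - 4*E0*I0)) / 2 and s = E0 + I0 + K_i.
# This is a generic form, not necessarily the equation of ref. 24.


def complex_concentration(e_total, i_total, k_i):
    """Concentration of EI complex; all arguments in the same units."""
    s = e_total + i_total + k_i
    return (s - math.sqrt(s * s - 4.0 * e_total * i_total)) / 2.0


def residual_activity(e_total, i_total, k_i):
    """Fraction of enzyme left free, taken as fractional activity."""
    return 1.0 - complex_concentration(e_total, i_total, k_i) / e_total


E_TOTAL_UM = 0.1      # papain, uM (from the protocol)
K_I_UM = 0.57e-3      # 0.57 nM expressed in uM (abstract value)
inhibitor_um = [0.0, 0.05, 0.1, 0.15, 0.2]

curve = [residual_activity(E_TOTAL_UM, i, K_I_UM) for i in inhibitor_um]
for i, a in zip(inhibitor_um, curve):
    print(f"[I] = {i:.2f} uM -> residual activity {a:.3f}")
```

Because K_i is far below the enzyme concentration, the curve is close to a straight line that reaches zero near the equivalence point, which is what allows the stoichiometry to be read from the titration.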
Kinetics of Inhibition of Papain and Cathepsins B and L by Equistatin-The kinetics of the reaction between equistatin and papain, cathepsin B, and cathepsin L were analyzed by continuous measurements of the loss of enzymatic activity in the presence of substrate under pseudo first-order conditions with at least a 10-fold molar excess of inhibitor. Equistatin in increasing concentrations and the fluorogenic substrate (10 μM Z-Phe-Arg 4-methyl-7-coumarylamide) were mixed in a cuvette with buffer (see above) to a final volume of 1.97 ml. The enzyme (30 μl) was added, and the release of product was monitored continuously at excitation and emission wavelengths of 370 and 460 nm, respectively, by a Perkin-Elmer LS50 spectrofluorimeter. The biphasic progress curves were recorded and analyzed according to the model of slow tight binding kinetics using the equation of Morrison (25):

$$[P] = v_s t + \frac{(v_z - v_s)\left(1 - e^{-kt}\right)}{k}$$

where [P] is the product concentration, v_z and v_s are the initial and the steady-state velocities, respectively, t is time, and k is the observed pseudo first-order rate constant for the establishment of equilibrium between enzyme and inhibitor.
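As an illustration of this analysis, the sketch below evaluates a slow-binding progress curve and then recovers an association rate constant from the linear dependence of k on inhibitor concentration. It assumes the standard form of the progress-curve equation (stated in the docstring) and the simple one-step relation k = k_d + k_a[I], ignoring any correction for substrate competition; all numerical inputs are synthetic and the function names are illustrative.

```python
import math


def product(t, v_z, v_s, k):
    """Assumed slow-binding form: [P] = v_s*t + (v_z - v_s)*(1 - exp(-k*t))/k."""
    return v_s * t + (v_z - v_s) * (1.0 - math.exp(-k * t)) / k


def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic inputs (hypothetical, for illustration only).
TRUE_K_A = 1.2e7     # M^-1 s^-1
TRUE_K_D = 6.8e-3    # s^-1
inhibitor_m = [1e-9, 2e-9, 4e-9, 8e-9]
k_obs = [TRUE_K_D + TRUE_K_A * i for i in inhibitor_m]

# Slope of k versus [I] estimates k_a; the intercept estimates k_d.
k_a_fit, k_d_fit = linear_fit(inhibitor_m, k_obs)
print(f"k_a = {k_a_fit:.3e} M^-1 s^-1, k_d = {k_d_fit:.3e} s^-1")

# A biphasic curve: fast initial rate decaying to a slower steady state.
trace = [product(t, v_z=1.0, v_s=0.1, k=k_obs[0]) for t in range(0, 601, 60)]
```

With real data, k would first be obtained by nonlinear fitting of each recorded trace to the progress-curve equation; that step is omitted here.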

RESULTS AND DISCUSSION
Purification of Equistatin-Equistatin was purified from A. equina by a procedure similar to that used for the isolation of cysteine proteinase inhibitors of human origin (28). Initially, the supernatant was exposed to alkaline pH to dissociate the complexes between the inhibitor and other proteins. The most selective purification step, affinity chromatography on carboxymethyl papain-Sepharose, then allowed separation of papain-inhibiting proteins from the majority of noninhibitory proteins. This was followed by gel filtration on Sephadex G-50 (Fig. 1A), where the low molecular weight inhibitor (equistatin) was separated from high M_r inhibitor(s) of cysteine proteinases, which were not further characterized. Final purification was achieved by DEAE-Sephacel chromatography, from which the inhibitor eluted as a single peak at 0.07 M NaCl (Fig. 1B). About 5 mg of pure equistatin was obtained from 3 kg (fresh weight) of sea anemones.
SDS-PAGE and Analytical Isoelectric Focusing-On SDS-PAGE under reducing conditions, equistatin migrates as a single band with M_r of about 16,000 (Fig. 2A). The molecular weight is higher than the molecular weights of either stefins or cystatins (M_r ~11,000 and 13,000, respectively) but lower than those of kininogens (M_r ~50,000-100,000) (12). The stefins, the cystatins, and the kininogens are proteins with similar sequences and, until recently, were the only known endogenous inhibitors of papain-like cysteine proteinases. On analytical isoelectric focusing, the inhibitor is shown to be an acidic protein with a pI value of 4.7. Very faint bands with pI values of 4.9 and 4.5, probably corresponding to the isoforms of the inhibitor (see below for explanation), could also be seen (Fig. 2B).
Amino Acid Sequence of Equistatin-The major and minor N-terminal amino acid sequences, labeled NI-1 and NI-2, respectively, are shown in Fig. 3A. Sequence analyses of the peptides derived from glycyl endopeptidase digestion provided the amino acid sequence of the whole molecule (Fig. 3A). The largest peptide, G-3, spanned the middle part of the inhibitor and overlapped with both NI sequences. The C-terminal sequence was confirmed by peptides G-5 and G-(5+6), which ended with a pyridylethylated Cys residue that is not a glycyl endopeptidase cleavage site. Additional overlapping peptides, designated as E peptides (Fig. 3A), were obtained by S. aureus V8 proteinase digestion. During protein sequence analysis we observed sequence polymorphism, mainly in the middle part of the molecule (Fig. 3A). However, the yield of these residues was lower than 20% when compared with the main sequence. The observed sequence heterogeneity, together with the results of isoelectric focusing, reveals the presence of at least two closely related isoforms. As the isolation procedure involves the use of many anemone specimens, the difference in amino acid composition could arise from allelic polymorphism.
The inhibitor comprises 128 amino acid residues including these 11 cysteines and has a molecular weight of 14,129. The inhibitor has no potential glycosylation sites of the Asn-X-(Thr/Ser) type.
Amino Acid Sequence Comparison-Alignment of equistatin residues 1-64 with 65-128 shows that the inhibitor consists of a tandem repeat with 48% identity and 60% similarity (Fig. 3B). This indicates that equistatin derives from a single ancestral gene that was duplicated and modified during evolution. However, neither domain shows any sequence homology with the members of the cystatin superfamily (12, 29). A BLAST (30) search of the Swiss-Prot databases (31) revealed high sequence similarity with thyroglobulin type-1 domain, a domain of about 65 amino acid residues that repeats 10 times in the N-terminal part of thyroglobulin (32). A number of other proteins containing the thyroglobulin type-1 domain motif were found. These proteins display a variety of physiological functions in different organisms. Major histocompatibility complex class II-associated p41 invariant chain fragment and chum salmon egg cysteine proteinase inhibitor are potent inhibitors of papain-like cysteine proteinases (33, 34), the former being involved in antigen presentation (35). Nidogen is a glycoprotein that probably plays a central role in the supramolecular organization of basement membranes and is tightly associated with laminin (36). Insulin-like growth factor-binding proteins act as inhibitors of insulin-like growth factor (37). Saxiphilin, characterized by a high affinity for a neurotoxin, saxitoxin (38, 39), and a tumor-associated cell surface antigen known also as GA733 are proteins whose functions are not yet elucidated (40). Fig. 4A shows a schematic diagram of the regions of similarity between equistatin and proteins containing the thyroglobulin type-1 domain, and Fig. 4B shows a sequence alignment relative to the thyroglobulin type-1 domains of equistatin.
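A percent-identity figure such as the 48% quoted for the two halves of equistatin is obtained by counting identical residues over aligned positions. The sketch below shows one common convention, which skips any column containing a gap; the two short sequences are invented for illustration and are not the equistatin domains, and the function name is illustrative.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over aligned columns without gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences carry a residue.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments (not the real domain sequences).
fragment_a = "GQCVWCVDPE"
fragment_b = "GKCCWCVD-E"

identity = percent_identity(fragment_a, fragment_b)
print(f"identity: {identity:.1f}%")
```

Other conventions divide by the full alignment length or by the shorter sequence, so reported identities are only comparable when the denominator is stated.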
By introducing only a few short gaps in the alignment, the amino acid identities between the thyroglobulin type-1 domains in equistatin
Active Site Titration and Kinetics of Inhibition-The sequence data suggest that the two sequentially homologous parts of equistatin (Fig. 3B) may form two potential proteinase binding sites. The binding stoichiometry of papain (active concentration ≥95%) and equistatin was therefore determined by titration monitored by the loss of enzymatic activity. 0.95 ± 0.04 mol of equistatin was needed to saturate 1 mol of papain, indicating that the two proteins formed an equimolar complex (Fig. 5). It could be suggested that binding of one proteinase molecule to equistatin prevents binding of the second proteinase molecule, probably by steric hindrance. However, there are a number of other possibilities. (i) One of the domains is not inhibitory at all, as observed in the kininogens (41). (ii) One of the domains has substantially lower affinity for proteinases, as found for the mucus proteinase inhibitor interaction with various serine proteinases (42). (iii) Both domains bind to the same proteinase molecule but only one of them binds to the active site; the other binds to another site distant from the active site, as reported for rhodniin binding to thrombin (43). Additional spectroscopic and structural studies involving mutant proteins will therefore be needed to clarify which of the above hypotheses is correct.
The kinetics of binding of equistatin to papain and cathepsins B and L were studied under pseudo first-order conditions assuming 1:1 binding stoichiometry (see above). The pseudo first-order rate constants were found to increase linearly with increasing concentrations of inhibitor [I], in agreement with the proposed binding mechanism (25). Values of the second-order rate constants (k_a), the dissociation rate constants (k_d), and the equilibrium constants (K_i) are presented in Table I. Rapid binding of equistatin to cathepsin L and papain was observed, but the complexes with papain were ~10-fold less stable, with a 5-fold lower association rate constant and a 2-fold higher dissociation rate constant. The rate of complex formation between equistatin and cathepsin B was substantially slower. Its k_a value is >30-fold lower than those for cathepsin L and papain, which is also reflected in the increased K_i value, although the overall effect is partially compensated by a lower k_d value. Cathepsin B (44) differs from papain (45) and its homologue cathepsin L (46) by having an additional loop of about 20 amino acids, which partially occludes the active site, thus interfering with inhibitor binding (47).
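The fold differences described above can be checked against the k_a and K_i values quoted in the abstract. The sketch below assumes the relation K_i = k_d / k_a, so that k_d = K_i · k_a; the k_d values it prints are back-calculated here for illustration and are not the entries of Table I.

```python
# k_a in M^-1 s^-1 and K_i in M, as quoted in the abstract.
constants = {
    "cathepsin L": {"k_a": 5.7e7, "K_i": 0.051e-9},
    "papain": {"k_a": 1.2e7, "K_i": 0.57e-9},
    "cathepsin B": {"k_a": 0.04e6, "K_i": 1.4e-9},
}

# Assumption: K_i = k_d / k_a, hence k_d = K_i * k_a (units: s^-1).
for values in constants.values():
    values["k_d"] = values["K_i"] * values["k_a"]

cat_l, papain, cat_b = (constants[name] for name in
                        ("cathepsin L", "papain", "cathepsin B"))

ki_ratio = papain["K_i"] / cat_l["K_i"]   # stability difference
ka_ratio = cat_l["k_a"] / papain["k_a"]   # association difference
kd_ratio = papain["k_d"] / cat_l["k_d"]   # dissociation difference

for name, values in constants.items():
    print(f"{name}: k_d = {values['k_d']:.2e} s^-1")
print(f"papain vs cathepsin L: K_i x{ki_ratio:.1f}, "
      f"k_a /{ka_ratio:.1f}, k_d x{kd_ratio:.1f}")
```

Under this assumption the ratios come out near 11, 5, and 2, in line with the roughly 10-fold, 5-fold, and 2-fold differences stated in the text, and the back-calculated k_d for cathepsin B is the lowest of the three.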
The kinetic and equilibrium constants for the interaction of equistatin with cathepsins L and B and papain are similar to those reported for the interactions of these enzymes with cystatins (12, 22, 47). The K_i values are also in reasonable agreement with those obtained for various forms of chum salmon egg cysteine proteinase inhibitor (34, 48), although they differ significantly from the values for the p41 form of invariant chain fragment. The latter was found to be a stronger inhibitor of cathepsin L (~10-fold) and a weaker inhibitor of papain (~3-fold) but did not inhibit cathepsin B at all (33).
In conclusion, a new protein inhibitor of papain-like cysteine proteinases was isolated from sea anemone A. equina. The inhibitor, equistatin, is distinct from cystatins but shares significant sequence homology with two other cysteine proteinase inhibitors, p41 invariant chain fragment and chum salmon egg cysteine proteinase inhibitor. The three inhibitors were therefore suggested to form a new superfamily of cysteine proteinase inhibitors. The thyroglobulin type-1 domain motif, common to all three inhibitors, has been identified in a variety of other proteins. Whether this highly conserved thyroglobulin type-1 element indeed acts as an inhibitor of cysteine proteinases in these proteins remains to be established, as does the mechanism of binding to cysteine proteinases.
TABLE I. [...] and cathepsin B. The association rate constants, k_a, together with their standard errors were calculated from the dependence of the pseudo first-order rate constant on inhibitor concentration. Dissociation rate constants, k_d, were calculated for each inhibitor concentration as described under "Experimental Procedures." The equilibrium inhibition constants (K_i) were calculated from k_a and k_d. The number of measurements is given in parentheses.