A novel aspartyl proteinase from apocrine epithelia and breast tumors.

GCDFP-15 (gross cystic disease fluid protein, 15 kDa) is a secretory marker of apocrine differentiation in breast carcinoma. In human breast cancer cell lines, gene expression is regulated by hormones, including androgens and prolactin. The protein is also known under different names in different body fluids such as gp17 in seminal plasma. GCDFP-15/gp17 is a ligand of CD4 and is a potent inhibitor of T-cell apoptosis induced by sequential CD4/T-cell receptor triggering. We now report that GCDFP-15/gp17 is a protease exhibiting structural properties relating it to the aspartyl proteinase superfamily. Unexpectedly, GCDFP-15/gp17 appears to be related to the retroviral members rather than to the known cellular members of this class. Site-specific mutagenesis of Asp(22) (predicted to be catalytically important for the active site) and pepstatin A inhibition confirmed that the protein is an aspartic-type protease. We also show that, among the substrates tested, GCDFP-15/gp17 is specific for fibronectin. The study of GCDFP-15/gp17-mediated proteolysis may provide a handle to understand phenomena as diverse as mammary tumor progression and fertilization.

GCDFP-15 (gross cystic disease fluid protein, 15 kDa), also known as prolactin-inducible protein (1), gp17 (2), secretory actin-binding protein (3), and extraparotid glycoprotein (4), is a protein secreted by various exocrine glands, including the seminal vesicle, salivary gland, and sweat glands. The protein is also expressed by cancer cells derived from a limited number of tissues, most prominently primary and secondary breast carcinomas exhibiting apocrine differentiation (5). The factor exists as a dimer and a tetramer of a glycosylated 17-kDa subunit in various body fluids (6,7). GCDFP-15 is a highly specific marker for differential cytological diagnosis of metastatic mammary tumors (8), and its expression is up-regulated by androgens, glucocorticoids, and progesterone (9). However, its role in tumors and the prognostic value of its expression are not yet clearly established (10). Studies from our laboratory on the gp17 protein purified from human seminal plasma have shown that it is a ligand for CD4 (11), a T-cell co-receptor that plays a key role in antigen recognition and T-cell activation. Further analysis has indicated that early exposure of peripheral blood T-cells to GCDFP-15/gp17 inhibits the CD4+ T-cell apoptosis induced by CD4 cross-linking and subsequent T-cell receptor triggering (12). GCDFP-15/gp17 was also found in the post-acrosomal region of ejaculated spermatozoa and remains bound to the sperm cell surface after capacitation, suggesting a possible role in fertilization (13).
In this work, we further defined the properties of GCDFP-15/gp17 with the aim of gaining a deeper understanding of its role in tumor progression and reproduction. We found, in fact, that the factor is a retrovirus-like aspartyl protease and has the potential to modify the extracellular matrix by fibronectin degradation. The finding that a significant percentage of breast carcinomas have the ability to synthesize and secrete GCDFP-15/gp17 (14), together with its absence in normal resting mammary gland, raises the possibility that this proteinase might play a role in the lytic processes associated with invasive breast cancer lesions, as already found for other proteinases, including matrix metalloproteinases (15), plasminogen activators (16), and secreted lysosomal enzymes (17).

EXPERIMENTAL PROCEDURES
Gelatin-PAGE and Gelatin/SDS-PAGE Assays-15% SDS-polyacrylamide and 10% nondenaturing polyacrylamide gels were prepared, and gelatin (Sigma) was incorporated at a 0.1% final concentration as originally described by Heussen and Dowdle (18). Prior to electrophoresis, aliquots of purified proteins or of proteins from culture supernatants of yeast were mixed with an equal volume of 2× Laemmli sample buffer with reducing agent for gelatin/SDS-PAGE and with 0.25 volume of 5× glycerol-tracking dye buffer (containing 5% glycerol and 0.01% bromphenol blue) for gelatin-PAGE. After electrophoresis, the gelatin/SDS-polyacrylamide gels were washed twice with 2.5% (v/v) Triton X-100 for 30 min each to remove SDS. These gels and the gelatin-polyacrylamide gels were then rinsed briefly with buffer (50 mM NaCl and 25 mM Tris-HCl (pH 7.5)) and incubated in Hanks' balanced salt solution (Sigma) overnight (16 h) at 37°C. Following incubation, the gels were stained with Coomassie Brilliant Blue and destained according to standard procedures. Protease activity was revealed as transparent bands against the stained background. For Western blotting, the separated proteins were electrotransferred onto Immobilon-P membrane for 1 h at 250 mA in a transfer unit (Bio-Rad). The blots were incubated with specific antibodies (7) and developed with a commercial ECL kit (Amersham Pharmacia Biotech, Buckinghamshire, United Kingdom). As a negative control, the blots were incubated with the secondary antibody alone, and no reactive protein was detected.
Modeling Analysis-The model of GCDFP-15/gp17 was built with the SWISS MODEL server using the ProMod and Gromos 96 programs for comparative modeling and energy minimization, respectively. The Swiss Protein Database Viewer program (19) was used for the analysis of the models. Several structures of pepsin-like family proteases (1psa, 2psg, 1smr, 5apr, and 1eag) were used separately as templates, and the best model was obtained with candidapepsin (1eag). The models were validated with the "What If" program.
* This work was supported by grants from the Associazione Italiana Ricerca sul Cancro and Consiglio Nazionale delle Ricerche-Progetto Finalizzato Biotechnology and by Ministero Università Ricerca Scientifica e Tecnologica Biotechnology Program L.95/95. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
§ To whom correspondence should be addressed. Fax: 39-81-5936123; E-mail: caputo@iigbna.iigb.na.cnr.it.
Site-specific Mutagenesis-Site-directed mutagenesis according to Kunkel (20) was carried out to introduce the desired mutation (D22S) into the GCDFP-15/gp17 cDNA cloned into plasmid pET22b(+) (Novagen, Madison, WI) between the EcoRI and BamHI sites. The oligodeoxynucleotide used in the mutagenesis reaction was 5′-CAGTACGTCCAAATAGCGAAGTCACTGCA-3′ (GENSET, Paris, France), which carries the AGC mutant codon in place of the GAC aspartic acid codon. The mutation was verified by DNA sequencing of the complete cDNA. Mutant GCDFP-15/gp17 cDNA was amplified by polymerase chain reaction to add SnaBI and NotI sites at its 5′ and 3′ ends, respectively, to allow its cloning into plasmid pPIC9. After this cloning, GCDFP-15/gp17 cDNA was checked again by DNA sequencing. The resulting pPIC9-GCDFP-15/gp17 D22S construct was used to express mutant D22S protein in Pichia pastoris yeast as described previously (7).
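The effect of the mutagenic oligodeoxynucleotide can be checked with a short translation. The sketch below is illustrative only: it assumes that the oligonucleotide is written in the sense orientation and that the reading frame is the one in which AGC falls on a codon boundary. The codon table is restricted to the codons needed here, and all function and variable names are hypothetical.

```python
# Minimal codon table: only the codons that occur in this example.
CODONS = {
    "GTA": "V", "GTC": "V", "CGT": "R", "CCA": "P", "AAT": "N",
    "AGC": "S", "GAC": "D", "GAA": "E", "ACT": "T", "GCA": "A",
}

def translate(dna, offset):
    """Translate dna starting at offset, using complete codons only."""
    codons = [dna[i:i + 3] for i in range(offset, len(dna) - 2, 3)]
    return "".join(CODONS[c] for c in codons)

# Oligonucleotide sequence as given in the text, written without separators.
MUTANT_OLIGO = "CAGTACGTCCAAATAGCGAAGTCACTGCA"

# Assumption: the reading frame is the one that puts AGC in frame.
site = MUTANT_OLIGO.find("AGC")
offset = site % 3

# Restore the wild-type GAC codon at the same position for comparison.
wild_type_oligo = MUTANT_OLIGO[:site] + "GAC" + MUTANT_OLIGO[site + 3:]

mutant_peptide = translate(MUTANT_OLIGO, offset)
wild_type_peptide = translate(wild_type_oligo, offset)
print(wild_type_peptide, "->", mutant_peptide)
```

Under these assumptions the two translations differ at a single position, where aspartate (D) is replaced by serine (S), as expected for a D22S substitution.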
Proteinase Assays-Proteinase activity was measured in a 50-µl reaction containing 20 µg of substrate in buffer A (20 mM NaOAc (pH 3.5) and 150 mM NaCl). Reactions were initiated by the addition of protease and incubated at 37°C for 24 h, and 20 µl of each reaction were then analyzed by SDS-PAGE, followed by Coomassie Blue staining. Human plasma fibronectin was purchased from Roche Molecular Biochemicals (Mannheim, Germany). The protein content was determined by the Bio-Rad protein assay using bovine serum albumin as a standard. The proteins were separated by SDS-PAGE under reducing conditions according to standard procedures (6% spacer gel and 10% separation gel). The molecular mass standards (Bio-Rad) were myosin (207 kDa), β-galactosidase (121 kDa), bovine serum albumin (81 kDa), ovalbumin (51.2 kDa), carbonic anhydrase (33.6 kDa), and soybean trypsin inhibitor (28.6 kDa). The purity of the native and recombinant proteins was assessed by silver-stained SDS-PAGE, PAGE, two-dimensional SDS-PAGE, and NH2-terminal sequencing.
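As a quick check of the assay quantities, the following sketch computes the substrate concentration in the reaction and the amount of substrate loaded on the gel. It assumes that the reaction volumes are in microliters and the substrate mass in micrograms; the function names are hypothetical.

```python
def substrate_concentration(mass_ug, volume_ul):
    """Substrate concentration in ug/ul (numerically equal to mg/ml)."""
    return mass_ug / volume_ul

def mass_loaded(mass_ug, reaction_ul, aliquot_ul):
    """Substrate mass (ug) contained in the aliquot taken for SDS-PAGE."""
    return mass_ug * aliquot_ul / reaction_ul

# 20 ug of substrate in a 50-ul reaction; 20 ul analyzed by SDS-PAGE.
conc = substrate_concentration(20, 50)
loaded = mass_loaded(20, 50, 20)
print(f"{conc:.1f} ug/ul in the reaction, {loaded:.1f} ug loaded per lane")
```

With these values each lane carries 8 µg of substrate at the start of the reaction, an amount readily visible by Coomassie Blue staining if the substrate is not degraded.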

RESULTS
Structural Homology between GCDFP-15/gp17 and the Aspartyl Proteinases-To gain insight into the possible function(s) of GCDFP-15/gp17, we used the TOPITS "threading" method (21), a powerful means of searching for remote homologous protein structures with sequence identity around the so-called "twilight zone" (22), to compare the predicted secondary structure of GCDFP-15/gp17 with those of proteins of known three-dimensional structure in the Protein Data Bank (Research Collaboratory for Structural Bioinformatics, Rutgers University). The highest sequence identity (23% over an alignment of 141 residues), with a high similarity score, was given by an acidic aspartic proteinase from Candida albicans, also known as candidapepsin (23). A lower sequence identity (16%) was detected with pepsinogen (24). Both enzymes belong to the superfamily of acid proteinases and, in particular, to the pepsin-like family (25), suggesting a possible hydrolytic activity of GCDFP-15/gp17. Significant similarity scores were also observed for a few other hydrolases such as epidermolytic toxin A from Staphylococcus aureus (1agj) and neuraminidase from Vibrio cholerae (1kit).
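For orientation, percent sequence identity over a pairwise alignment can be computed as in the sketch below. This is not the TOPITS procedure; it is a minimal illustration that assumes identity is counted over aligned positions where neither sequence has a gap. The toy sequences and all names are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned, non-gap positions of two sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only the columns in which both sequences carry a residue.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)

# Toy alignment (hypothetical sequences, not taken from the paper).
toy = percent_identity("ACDE-FG", "ACDQWFG")

# 23% identity over 141 aligned residues corresponds to about 32 identities.
approx_identical = round(0.23 * 141)
print(f"{toy:.1f}% identity in the toy alignment; about {approx_identical} identical residues")
```

The second calculation shows that the reported 23% identity amounts to roughly 32 identical residues, which is why structure-based methods rather than sequence comparison alone were needed to detect the relationship.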
GCDFP-15/gp17 Protease Activity-We then assessed whether GCDFP-15/gp17 carries any hydrolytic activity. To do so, the proteolytic activity of GCDFP-15/gp17, obtained from human seminal plasma (HSP) of healthy donors by anion-exchange chromatography and affinity and gel filtration as described previously (7), was examined by gelatin-PAGE and gelatin/SDS-PAGE. As previously reported (7), GCDFP-15/gp17 from human seminal plasma exists under these conditions as a dimeric molecule, which tends to form tetrameric aggregates from which dimers, however, can be separated. In addition, native monomeric forms can be obtained by a different purification procedure (26). The dimeric and monomeric forms were thus separately analyzed. Fig. 1A (lane a) shows that the GCDFP-15/gp17 dimer actually exhibits two electrophoretic forms (slow (S) and fast (F)) in PAGE experiments; both forms were reactive in Western blotting with a panel of anti-GCDFP-15/gp17 monoclonal antibodies, one of which (D6) is shown in Fig. 1A (lane b). The two bands observed on PAGE are very likely charge isoforms of GCDFP-15/gp17 exhibiting the same molecular mass, as shown by the following observations: (i) two peaks of GCDFP-15/gp17 were resolved by anion-exchange chromatography (Q-Fast Flow) using a shallower gradient (0-0.3 M) of NaCl compared with the previously reported procedure (7); (ii) the two peaks are dimers by gel filtration; and (iii) both are composed of a different ratio of the four monomeric charge isoforms resolved by two-dimensional SDS-PAGE (data not shown). The two dimeric forms of the protein were both endowed with proteolytic activity as shown by a zone of negative staining in gelatin-PAGE in correspondence to the GCDFP-15/gp17 electrophoretic bands (Fig. 1A, lane c). Furthermore, GCDFP-15/gp17 migrated as a single band under denaturing SDS-PAGE conditions (Fig. 1B, lane a); this monomer, after renaturation, was devoid of proteolytic activity (Fig. 1B, lane b).
Similarly, the monomeric form of GCDFP-15/gp17, purified by the protocol previously described (26), was devoid of activity in gelatin-PAGE assays (data not shown). In conclusion, only the dimer of GCDFP-15/gp17 is functionally relevant.
Aspartate 22 of GCDFP-15/gp17 Is Required for Proteolytic Activity-We then generated a multisequence alignment based on the correspondence of secondary structures (PHDsec method) of GCDFP-15/gp17 to members of known three-dimensional structure of the pepsin-like family (data not shown). The overall sequence identity between GCDFP-15/gp17 and these members of the aspartic proteinase pepsin-like family was low (<25%).
To confirm the threading results and considering both the current lack of structural information concerning GCDFP-15/gp17 and the absence of a better structural homolog of known three-dimensional structure, a homology modeling approach was adopted (Fig. 2A) using, as a reference, the three-dimensional structure of C. albicans aspartic proteinase (23) complexed with the A70450 inhibitor (1eag) solved at a 2.1-Å resolution (for details, see Fig. 2A legend). The structural topological model consisted of three β-sheets shown schematically in Fig. 2B. The model involves most of the N-terminal domain and only part of the C-terminal domain of 1eag from Gln11 to Glu202 (Fig. 2). The root mean square deviation of the α-carbon atoms was 0.65 Å for a superimposition of 101 residues and 2.08 Å for 115 residues. Asp32 and Asp218 have been identified as catalytically important for the active site in 1eag (23). Interestingly, in our model, the GCDFP-15/gp17 Asp22 is superimposed on 1eag Asp32. The structural alignment of GCDFP-15/gp17 with some members of the pepsin-like family is shown in Fig. 3. A search in the Class Architecture Topology Homologous Superfamily Data Bank (27) indicated that domain 1 of 1eag (chain A) belongs to the class of mainly β-proteins with barrel architecture and topology represented by cathepsin D domain 1 and by several homodimeric retroviral aspartic proteases (28).
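The root mean square deviation quoted above measures the average distance between matched atoms of two superimposed structures. The sketch below illustrates the calculation only. It assumes that the two coordinate sets are already superimposed and listed in matching order, performs no fitting step, and uses hypothetical toy coordinates and names rather than data from the model.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD (same units as the input) between two matched coordinate lists.

    Assumes the structures are already superimposed and that the atoms
    are listed in the same order in both lists.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Toy alpha-carbon coordinates in angstroms (hypothetical values).
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
template = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
toy_rmsd = rmsd(model, template)
print(f"RMSD = {toy_rmsd:.2f} angstroms")
```

Because the deviation is averaged over the matched atoms, extending the superimposition from 101 to 115 residues can raise the value sharply if the added residues lie in poorly conserved regions, consistent with the two figures reported for the model.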
To further characterize the proteolytic activity of GCDFP-15/gp17, an expression system in yeast and a purification procedure from culture supernatants for bona fide recombinant GCDFP-15/gp17 were developed (7). In addition, we exploited the yeast expression system to express a GCDFP-15/gp17 mutant [...] (Fig. 4B, lane a), the supernatant derived from the clone expressing wild-type GCDFP-15/gp17 (GS115/pPIC9-GCDFP-15/gp17) also exhibited a proteolytic activity in the position expected for GCDFP-15/gp17 (lane b). Interestingly, the degrading activity of GCDFP-15/gp17 was abolished upon introduction of the D22S mutation (lane c, containing a 5-fold excess of proteins from culture supernatants of the clone expressing GCDFP-15/gp17 D22S).
Fibronectin Is a Specific Substrate of GCDFP-15/gp17-As GCDFP-15/gp17 is normally present in human seminal plasma, which contains proteolytic enzymes implicated in the mechanism of liquefaction of the seminal coagulum (30), we tested the ability of GCDFP-15/gp17 obtained from HSP of healthy donors (3) to degrade fibronectin (31), one of the major components of ejaculated semen contributing to the formation of the seminal gel (32) and one of the ligands of GCDFP-15/gp17 (6). Fig. 5A (lanes a and c) shows that, although fibronectin is known to be highly susceptible to proteolysis under certain conditions, it is highly stable in buffer A after incubation for 24 h at 37°C in the absence of GCDFP-15/gp17. Conversely, the 220-kDa fibronectin band (Fig. 5A, lane a) was completely degraded after incubation with seminal GCDFP-15/gp17 under the same conditions (lane b). An identical pattern of digestion was observed when highly purified GCDFP-15/gp17 from breast cyst fluid was used (data not shown). Furthermore, a homogeneously pure preparation of the recombinant molecule (r-GCDFP-15/gp17) was tested for fibronectin-degrading activity in comparison with native GCDFP-15/gp17 from HSP in buffer A. As shown in Fig. 5A (lane d), the pattern of fragmentation of fibronectin caused by r-GCDFP-15/gp17 was identical to that obtained when native GCDFP-15/gp17 was employed (lane b).
To determine the substrate specificity of GCDFP-15/gp17, laminin, vitronectin, or bovine serum albumin in its native form was incubated with the recombinant protease. In all cases, no fragmentation was observed (see, for example, vitronectin in Fig. 5B). r-GCDFP-15/gp17 was then tested for its ability to degrade resorufin-labeled casein, a universal protease substrate used for the determination of the activity of proteases such as Pronase, trypsin, endoproteinase Asp-N, and endoproteinase Lys-C. The activity was below detection (data not shown). Because GCDFP-15/gp17 binds domains 1 and 2 of CD4 (2, 11), we also examined whether recombinant CD4 is degraded upon incubation with recombinant GCDFP-15/gp17. As shown in Fig. 5C, CD4 was also resistant to the proteolytic activity of this factor.
GCDFP-15/gp17 Activity Is Inhibited by Pepstatin A-The inhibition of the fibronectin-degrading activity of GCDFP-15/gp17 by antipain (an inhibitor of trypsin-like serine proteases, papain, and some cysteine proteases), aprotinin (a broad spectrum serine protease inhibitor), E-64 (an irreversible inhibitor of cysteine proteases), pepstatin A (a reversible inhibitor of aspartic proteases such as cathepsin D, pepsin, and renin), phenylmethylsulfonyl fluoride (an irreversible inhibitor of serine proteases), or EDTA was then determined. As predicted by the model and demonstrated by site-specific mutagenesis, hydrolysis of fibronectin was inhibited only by pepstatin A and not by the other inhibitors (Fig. 5D), thus further supporting the conclusion that GCDFP-15/gp17 belongs to the aspartic proteinase family.

DISCUSSION
The gross cystic disease fluid protein GCDFP-15 or prolactin-inducible protein was independently isolated as an abundant protein of the fluid of gross cystic disease of the human breast (5) and as a glycoprotein secreted by T47D human breast cancer cells in response to steroids and lactogenic hormones (9). This protein was also independently identified as gp17 (2) and secretory actin-binding protein (3) from human seminal plasma and as extraparotid glycoprotein (4) from human submandibular/sublingual saliva.
In this report, we provide the first evidence that GCDFP-15/gp17 is a protease. In particular, we were able to build a model of GCDFP-15/gp17 using, as a guide, candidapepsin (1eag), an aspartic protease from C. albicans (23); we also observed that GCDFP-15/gp17 displays the "all β-fold" typical of several aspartic proteinases (25).
Reportedly, the aspartic proteases known so far are two-domain proteins >300 residues in length, with one aspartyl residue in each domain contributing to the active site (33). By contrast, the GCDFP-15/gp17 sequence does not exceed 118 amino acids in length and contains a single aspartyl residue at position 22, corresponding to the Asp32 catalytic residue of domain 1 of the 1eag protease. The bilobate structure of cellular acidic proteases carrying a two-fold symmetric axis is believed to have evolved from the duplication of an ancestral protein whose proteolytically active form was a dimer bearing a similar two-fold symmetry (33). Dimeric retroviral proteases are considered to be less evolved examples of a progenitor common to cellular aspartic proteases (25). Duplication and fusion of the cognate gene would have then allowed divergent evolution, generating a monomeric protein of ~300 residues with an intrinsic two-fold symmetry, like the aspartic proteases from lower eukaryotes (23,33).
We have developed an in vitro assay to check the hydrolytic activity of GCDFP-15/gp17, and we have demonstrated that it is, in fact, a protease. By site-specific mutagenesis and by inhibition with pepstatin A, a specific inhibitor of aspartyl proteinases, we confirmed that GCDFP-15/gp17 is an aspartic protease. Our results indicate that only the dimeric form of GCDFP-15/gp17 is active, suggesting that GCDFP-15/gp17 may function like a retroviral protease. The GCDFP-15/gp17 protease, which is encoded by the prolactin-inducible protein gene located in the q32-36 region of chromosome 7, may thus be considered either a living "fossil" of an aspartic protease ancestor or a product of convergent evolution deriving from mutational changes of the cellular aspartic proteases of higher eukaryotes (33).
As reported above, we found that GCDFP-15/gp17 specifically degrades the fibronectin molecule under buffer conditions in which the latter would be otherwise stable. The facts that fibronectin is one of the major protein constituents of the seminal coagulum (32) and that GCDFP-15/gp17 constitutes at least 1% of seminal plasma proteins (2) suggest that GCDFP-15/gp17 may contribute to fibronectin cleavage during liquefaction.
Furthermore, fibronectin is a multifunctional extracellular matrix protein that plays a central role in cell adhesion. It interacts in multiple ways with the cell surface as well as with other extracellular matrix components and is sensitive to digestion by various proteases (34,35). It is possible that fibronectin-degrading proteases, and thus GCDFP-15/gp17, might facilitate cell invasion by cleaving the extracellular matrix scaffold between cells, thus detaching cell membranes from adhesion sites. Consistent with this, a number of studies have shown a relationship between fibronectin-degrading proteases and cell invasion (36,37).
In this report, we show the specificity of this protease for fibronectin as compared with other extracellular matrix components such as laminin and vitronectin, which were not degraded by GCDFP-15/gp17. Thus, GCDFP-15/gp17 produced by tumor cells may interfere with matrix deposition. Many studies have reported that malignant cells fail to deposit components into the extracellular matrix in vitro, and many seem to express this defect also in vivo (38,39). In particular, fibronectin is among the matrix components that malignant cells fail to deposit into a matrix.
Finally, given that GCDFP-15/gp17 blocks T-cell apoptosis induced by CD4/T-cell receptor triggering (12) and that fibronectin-derived peptides modulate apoptotic cell death (40), it would be of interest to assess whether the proteolytic activity of GCDFP-15/gp17 is required for such an inhibition or whether its CD4-binding activity is sufficient per se.