S100A11, S100A10, Annexin I, Desmosomal Proteins, Small Proline-rich Proteins, Plasminogen Activator Inhibitor-2, and Involucrin Are Components of the Cornified Envelope of Cultured Human Epidermal Keratinocytes*

The cornified envelope (CE) is an insoluble sheath of ε-(γ-glutamyl)lysine cross-linked protein, which is deposited beneath the plasma membrane during keratinocyte terminal differentiation. We have probed the structure of the CE by proteolytic cleavage of purified CE fragments isolated from CEs formed spontaneously in cell culture. CNBr digestion, followed by trypsin and then proteinase K treatment, released 25%, 42%, and 18%, respectively, of the CE protein. Purification and sequencing of released peptides identified two novel CE precursors, S100A11 (S100C, calgizzarin) and S100A10 (calpactin light chain). We also sequenced peptides derived from annexin I and plasminogen activator inhibitor 2, two putative envelope precursors, as well as portions of the well established CE precursor proteins SPR1A, SPR1B, and involucrin. Many desmosomal components were identified (desmoglein 3, desmocollin A/B, desmoplakin I, plakoglobin, and plakophilin), indicating that desmosomes

In the present report, we study the composition of envelopes formed by cultured keratinocytes, as a model of the early stages in cornified envelope formation. We purify envelope fragments from CEs formed in cell culture, digest them sequentially with CNBr, trypsin, and proteinase K, and determine the sequence of the released peptides. Using this procedure, we identify involucrin, PAI-2, keratins, SPR1A and SPR1B, annexin I, desmoplakin I, plakoglobin, envoplakin, plakophilin, desmocollin 3a/3b, desmoglein 3, S100A11 (S100C, calgizzarin), S100A10, and several unidentified proteins as components of the cornified envelope. The SPRs are, by far, the most abundant components released by the digestion protocol. Based on the pattern of fragments released by the enzymatic digestion, we identify the amino- and carboxyl-terminal regions of S100A11 and the SPRs, and the amino terminus of annexin I, as sites of cross-link formation.

MATERIALS AND METHODS
Digestion of Cornified Envelopes-Cornified envelopes, formed spontaneously or induced to form by treatment with NaCl, were used to prepare highly purified cornified envelope fragments from human foreskin keratinocyte cultures (36). These envelope fragments were sequentially digested using the scheme shown in Fig. 1. CE fragments (10 mg) were resuspended in 16 ml of 70% formic acid containing 2 g of CNBr and digested, with continuous agitation, for 24 h at room temperature. The mixture was then centrifuged to yield soluble and pellet fractions. The soluble fraction was diluted 5-fold with water and lyophilized to remove CNBr. This was repeated two times. The resulting residue was dissolved in 0.4-1.0 ml of H2O, acidified with trifluoroacetic acid (for this and subsequent acidifications, the final trifluoroacetic acid concentration = 0.2%), and centrifuged to yield soluble and pellet fractions.
The soluble fraction (fraction 2) was analyzed using a C-18 HPLC column, and the pellet fraction (fraction 1) was dissolved in Laemmli sample buffer for electrophoresis.
The CNBr pellet was washed three times with H2O and resuspended by sonication in 1-2 ml of 50 mM Tris-HCl, pH 8.0, containing 11.5 mM CaCl2, and an aliquot was saved for amino acid analysis. The remaining suspension was digested at 1% w/w (based on the initial concentration of CE fragments) with L-(tosylamido-2-phenyl)ethyl chloromethyl ketone-treated trypsin (Worthington) for 24 h at 25°C with agitation. The material was centrifuged to yield pellet and soluble fractions. The soluble fraction was acidified with trifluoroacetic acid and centrifuged to yield an acid-insoluble pellet and an acid-soluble supernatant. The soluble fraction (fraction 4) was analyzed by HPLC, and the pellet (fraction 3) was resuspended in Laemmli buffer and analyzed by denaturing gel electrophoresis. In some cases this pellet was redissolved in proteinase K digestion buffer (10 mM Tris-HCl, pH 8.0, containing 1 mM CaCl2) containing 1 μg of proteinase K and digested for 30 h. After acidification with trifluoroacetic acid and centrifugation, this mixture yielded a pellet (fraction 5) and supernatant (fraction 6). The supernatant (fraction 6) was analyzed by HPLC. The pellet was not analyzed.
The pellet from the trypsin digestion was washed with proteinase K digestion buffer, suspended by sonication in 0.5 ml of the same buffer, and an aliquot was removed for amino acid analysis. The remaining sample was digested with 1 μg of proteinase K (Promega) for 30 h. Centrifugation of this reaction mixture yielded a pellet and supernatant. Acidification of the supernatant with trifluoroacetic acid and centrifugation yielded pellet and soluble fractions. The soluble fraction (fraction 8) was analyzed by HPLC, and the pellet (fraction 7) was resuspended in Laemmli buffer for gel electrophoresis.
Determination of Protein Concentration-Protein concentration of CE fragments was determined using the solid-phase, dot blot assay as described previously (36). The protein content of the fractions released by the various proteolytic digestions was measured using the Bio-Rad Dc protein assay kit.
HPLC Purification of Peptide Fragments-Soluble peptides released by proteolytic digestion (Fig. 1, #2, #4, #6, and #8) were filtered through a 0.2-μm Uniflo-3 syringe filter (Schleicher & Schuell) and the samples were fractionated on a C18 reverse phase column (Advantage-100, 5 μm, 240 × 4.6 mm, Thomson Liquid Chromatography, Springfield, VA) at a flow rate of 1 ml/min. Peptides eluting during the acetonitrile gradient (gradient conditions indicated in Fig. 3 legend) were monitored at 220 nm using a Waters model 484 absorbance detector (Waters, Milford, MA). Peptides were collected, dried by rotary evaporation, and purified by re-chromatographing on the Advantage C18 column using a shallow acetonitrile gradient (0.1%/min). The purified peptides were concentrated by rotary evaporation and dissolved in 10 μl of 70% acetonitrile containing 1% trifluoroacetic acid for microsequencing.
Microsequencing and Protein Assignment of Purified Peptides-Purified peptides were sequenced using a Perkin Elmer/Applied Biosystems Procise model 494 microsequenator. HPLC-purified peptides were applied to a BioBrene-Plus-treated (Applied Biosystems) glass-fiber filter prior to sequencing. Polyacrylamide gel-purified peptides were sequenced directly on the excised PVDF membrane. In most cases, 10 sequencing cycles were performed for each peptide. Peptide sequences of >6 amino acids were matched to candidate proteins by searching the non-redundant protein data base using the Blastp facility, the Blosum-62 matrix, and an e-value of 10-100. Sequences of 6 or 5 amino acids were similarly analyzed using the PAM 40 matrix at an e-value of 100-1000. Sequences of 4 or 3 amino acids were used to screen a library of candidate CE precursor proteins made in PC Gene (IntelliGenetics, Inc., Mountain View, CA). The proteins in this library were those listed in the section on molecular modeling. Only matches to human protein sequences are reported in this article.
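The length-based assignment rules described above can be summarized as a small dispatch function. This is only a sketch of the decision logic: the function name and dictionary keys are hypothetical, no search is actually run, and peptides shorter than 3 residues are treated as unassignable, which is an assumption based on the statement that such sequences were too short for assignment.

```python
def search_strategy(peptide: str) -> dict:
    """Choose the search settings described in the text from the peptide length.

    The returned keys ("method", "matrix", "e_value_range") are illustrative
    names, not parameters of any particular program.
    """
    n = len(peptide)
    if n > 6:
        # Longer peptides: blastp against the non-redundant data base.
        return {"method": "blastp", "matrix": "BLOSUM62", "e_value_range": (10, 100)}
    if n in (5, 6):
        # Short peptides: PAM40 matrix with a more permissive e-value.
        return {"method": "blastp", "matrix": "PAM40", "e_value_range": (100, 1000)}
    if n in (3, 4):
        # Very short peptides: screen only the library of candidate CE precursors.
        return {"method": "candidate_library", "matrix": None, "e_value_range": None}
    # Assumption: 1-2 residue sequences are left unassigned.
    return {"method": "unassigned", "matrix": None, "e_value_range": None}


# Peptide P13 from this study (10 residues) falls in the first category.
p13_settings = search_strategy("GPAPCPAPAP")
```

Keeping the thresholds in one place makes it easy to see that the three categories do not overlap: a 6-residue peptide is handled by the PAM 40 rule, not the Blosum-62 rule.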
FIG. 1. Cornified envelope fragments were prepared as described previously (36) and then processed as outlined under "Materials and Methods." Fractions 1, 3, and 7 (#1, #3, and #7) were characterized by gel electrophoresis. Fractions 2, 4, 6, and 8 (#2, #4, #6, and #8) were characterized by HPLC. Since sodium dodecyl sulfate can affect the resolution of HPLC purification, CE fragments were washed with H2O prior to CNBr digestion for generation of fraction 2. Sample 5 was not characterized after gel fractionation, because the smeared peptides could not be resolved (data not shown).
Amino Acid Analysis and Mathematical Modeling-CE preparations were analyzed for amino acid composition using an Applied Biosystems 420A derivatizer/analyzer. The % molar data were analyzed by least squares regression analysis essentially as described previously (25,26). The data were modeled using the following equations.
y = Ax + E (Eq. 1)
Here x is the unknown m-dimensional vector whose jth component is the number of moles of protein j in the sample, A is the known n × m matrix whose (i,j)th entry is the number of moles of amino acid i in 1 mol of protein j (n > m and A is full rank), y is the n-dimensional vector whose ith component is the number of moles of amino acid i in the sample, and E is a random n-dimensional error vector. It is assumed that E has mean of zero and covariance Σ. Multiplying Equation 1 by B, the left inverse for A (i.e. an m × n matrix satisfying BA = I, where I is the m × m identity matrix) yields the following equation.
By = x + BE (Eq. 2)
The term By defines a random vector with mean x and covariance BΣB^T. The protein composition x is estimated by choosing B to minimize the trace of the covariance, BΣB^T, subject to the constraints that BA = I and (By)_i ≥ 0 for i = 1, ..., m. The nonnegative constraint was imposed because large negative solutions, which are physiologically irrelevant, were generated without the constraint. Since the number of observations was fairly small, the estimates of sample covariance were unreliable and we used Σ = I. Therefore, the estimate for x is the positive part of the usual least squares estimate. The values calculated from the molar data were converted to percent mass. The proteins included in this analysis were involucrin (38) (SWISS-PROT P07476); loricrin (8) (SWISS-PROT P23490); proelafin (21) (SWISS-PROT P19957); cystatin A (19) (SWISS-PROT P01040); filaggrin (39) (GenPept M24355); desmoplakin I (40) (GenPept M77830); plakoglobin (41) (EMB Z62228); annexin I (42) (SWISS-PROT P04083); envoplakin (34) (GenBank U53786); S100A11 (S100C) (43) (DNA Database of Japan D38583); S100A10 (44)
analysis, since Met is destroyed by the CNBr digestion and Trp was not measured.
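To make the estimator concrete, the following minimal sketch fits the model y = Ax + E with Σ = I by solving the normal equations and then keeps the positive part of the solution, as described above. The function names and the toy composition matrix are hypothetical; the numbers are not taken from this study, and the original analysis may have been implemented differently.

```python
def least_squares(A, y):
    """Ordinary least squares: solve (A^T A) x = A^T y by Gauss-Jordan elimination."""
    n, m = len(A), len(A[0])
    # Augmented matrix [A^T A | A^T y], one row per protein.
    M = [[sum(A[i][r] * A[i][c] for i in range(n)) for c in range(m)]
         + [sum(A[i][r] * y[i] for i in range(n))]
         for r in range(m)]
    for col in range(m):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(m):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[r][m] / M[r][r] for r in range(m)]


def estimate_protein_moles(A, y):
    """Positive part of the least squares estimate (negative entries set to zero)."""
    return [max(0.0, value) for value in least_squares(A, y)]


def percent_mass(x, molar_masses):
    """Convert moles of each protein to percent of total mass."""
    masses = [xi * mi for xi, mi in zip(x, molar_masses)]
    total = sum(masses)
    return [100.0 * mass / total for mass in masses]


# Toy example (hypothetical): 4 amino acids (rows) and 2 proteins (columns).
A = [[10.0, 2.0],
     [3.0, 8.0],
     [5.0, 5.0],
     [1.0, 9.0]]
x_true = [2.0, 0.5]
y = [sum(a * x for a, x in zip(row, x_true)) for row in A]
x_hat = estimate_protein_moles(A, y)  # recovers x_true for noise-free data
```

With Σ = I, the left inverse that minimizes the trace of BB^T is B = (A^T A)^-1 A^T, so By is the ordinary least squares solution computed here. Note that clipping negative entries to zero is the "positive part" rule stated in the text, not a full nonnegative least squares optimization.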
Polymerase Chain Reaction Cloning of Calgizzarin (S100A11)-Poly(A)+ RNA, isolated from cultured human foreskin keratinocytes (53), was used as the template to clone a cDNA encoding the human S100A11 protein coding region by reverse transcription-polymerase chain reaction (RT-PCR) using 5′-CAT ATG GCA AAA ATC TCC AGC CC as the forward primer and 5′-GGA TCC TGA GGT GGT TAG TGT GCT CA (43) as the reverse primer. Poly(A)+ RNA (1 μg) was reverse-transcribed in a standard reaction using a RT-PCR kit (Boehringer Mannheim, catalog no. 1483188) and 30 pmol of the reverse primer.
One half of this reaction was PCR-amplified using 2.5 units of Pwo polymerase (Boehringer Mannheim) in the presence of 0.5 mM MgCl2, 3 mM MgSO4, and 30 pmol of each primer (10 cycles of 94°C for 15 s, 55°C for 30 s, 72°C for 45 s; and 15 additional cycles for which the elongation step was increased by 20 s/cycle). The PCR product was blunt end-cloned at the EcoRV site of pZERO1.1 (Invitrogen, San Diego, CA), and the DNA sequence of both strands was determined using the M13-reverse and T7 promoter primers and an Applied Biosystems model 377 sequencer.

RESULTS
Release of Peptide Fragments-Fig. 1 shows the scheme followed for digestion of the purified CE fragments. The strategy was to sequentially digest CE fragments with proteolytic agents and then to sequence the released peptides. At each stage, the residual pellet from the preceding step was digested with the next cleavage agent. The goal was to release smaller peptides at each step. We first cleaved with CNBr. CNBr cleaves at methionine residues, which occur relatively infrequently in proteins. The residual pellet from the CNBr step was treated with trypsin, which cleaves after lysines and arginines, and the pellet from this step was treated with the nonspecific protease, proteinase K.
In preparation for HPLC separation of peptide fragments, the released material from each digestion was acidified with trifluoroacetic acid. Addition of trifluoroacetic acid resulted in the precipitation of larger peptides from each sample. The precipitates (fractions 1, 3, and 7) were characterized by gel electrophoresis, while the soluble fractions (fractions 2, 4, 6, and 8) were separated using a C18 HPLC column. Fig. 2 shows the Coomassie Blue-stained profile of a 12% polyacrylamide gel of fractions 1, 3, and 7. The CNBr digestion products, prepared from either spontaneous (S) or induced (I) envelope fragments, yielded a similar pattern of discrete peptide bands. The peptides released following trypsin digestion yielded a smear. In addition, a greater amount of material was released from the induced envelopes compared with the spontaneous envelope preparation (normalized based on the amount of starting CE protein). Proteinase K, in contrast, did not release large peptide fragments from induced envelope fragments; however, some high molecular weight material, which remained at the top of the separatory gel, was released from spontaneous envelope fragments. Table I lists the quantity of protein released, from each major fraction, following sequential digestion of 1 mg of spontaneously formed CE fragments with CNBr, trypsin, and proteinase K. CNBr released 25% of the material, trypsin released 42%, and proteinase K released 18%. The total yield for all steps was 85% of the initial CE protein.
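The cleavage specificities used in the sequential digestion (CNBr after Met, trypsin after Lys and Arg) can be illustrated on a free, uncross-linked peptide. The sketch below is a simplification under stated assumptions: the sequence is a made-up toy string, the function names are hypothetical, cross-links that retain fragments in the envelope are ignored, and the rule that trypsin usually does not cleave before proline is not modeled.

```python
def cleave_after(sequence, residues):
    """Split a one-letter amino acid sequence after each listed residue."""
    fragments, start = [], 0
    for i, aa in enumerate(sequence):
        if aa in residues:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


def cnbr(sequence):
    """CNBr: cleavage on the carboxyl side of methionine."""
    return cleave_after(sequence, "M")


def trypsin(sequence):
    """Trypsin (simplified): cleavage after lysine and arginine."""
    return cleave_after(sequence, "KR")


toy = "MAKISSPMTEKRLLGAMQQK"                 # hypothetical sequence
cnbr_fragments = cnbr(toy)                   # ["M", "AKISSPM", "TEKRLLGAM", "QQK"]
tryptic = trypsin("TEKRLLGAM")               # ["TEK", "R", "LLGAM"]
```

In the envelope itself, a fragment is only released if it carries no isopeptide cross-link to the insoluble residue, which is why the observed peptides are a subset of what such an in silico digest predicts.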
Sequence Analysis of CNBr-released Peptides-To identify proteins of the cornified envelope, selected CNBr-released peptide fragments (Fig. 2, C1, C2, C3, and C4) were separated by gel electrophoresis, transferred to PVDF membrane, and microsequenced. The sequences were then used to search the nonredundant protein sequence data base to identify each protein. One peptide matched desmoplakin I, and the other three matched segments of annexin I (Table II). A methionine (Met) residue always preceded the sequence, a result that is consistent with release by CNBr. The desmoplakin band (Fig. 2, C1) migrated at 46 kDa in the CNBr digest of spontaneous envelope fragments. It is interesting that this band is virtually absent in the CNBr digest of induced envelope fragments. This observation is in agreement with our previously published report that desmosomal remnants are present in electron micrographs of spontaneous CE fragments but are absent in CE fragments prepared from induced envelopes (36). The other three peptides (C2, C3, and C4) were all derived from human annexin I. Peptide C2 appears to be a partial digest, since the calculated size of the Val56-Met127 fragment, based on the known sequence (42), is 7.8 kDa and the molecular mass of C2 is 16.9 kDa. Likewise, peptides C3 and C4 have the same NH2-terminal sequence, but different molecular masses. C4, as judged by Coomassie staining intensity, was the most abundant peptide released by CNBr.
The CNBr-released, trifluoroacetic acid-soluble fraction prepared from spontaneous envelope fragments (Fig. 1, #2) was analyzed by C18 HPLC. Since SDS adversely affects the resolution of the HPLC purification, CE fragments were extensively washed with H2O and then CNBr-digested to yield fraction 2. HPLC fractionation of this material revealed a complex profile (Fig. 3A). To assure purity, the indicated peaks, C6-C20, were rechromatographed on the C18 column, using a shallower (0.1%/min) acetonitrile gradient. The samples were then concentrated and sequenced (Table III). In some cases, this second chromatography step resolved the sample into multiple peaks. The additional peaks are indicated by alphabetical extensions (a and b). Also, in some cases, sequencing of a single peak yielded two sequences, one predominant sequence and a second, less abundant sequence. The less abundant sequence is labeled "minor" (e.g. C9, C9-minor).
FIG. 2. Characterization of peptides present in fractions 1, 3, and 7. Cultured human foreskin keratinocytes were permitted to form cornified envelopes spontaneously, or envelopes were induced to form by addition of NaCl, according to our previously described protocol (36). Fractions 1, 3, and 7 (#1, #3, and #7) were prepared by digestion as outlined in Fig. 1, and then electrophoresed on a denaturing 12% polyacrylamide gel. Loading was normalized based on the initial concentration of undigested envelope protein as follows: spontaneous (S) envelopes (1, 120 μg; 3, 400 μg; and 7, 400 μg) and induced (I) envelopes (1, 120 μg; 3, 200 μg; and 7, 400 μg). The asterisks indicate lanes loaded with control reactions containing trypsin (#3) or proteinase K (#7), but no envelope fragments. Peptide fragments C1, C2, C3, and C4 from lane 1 (S) were sequenced after transfer to PVDF membrane. The molecular size standards are indicated to the right of the panel.
Fig. 1; fractions 5 and 6 were not assayed, as they are subsets of fraction 3.
b CE envelopes (spontaneously formed) were digested sequentially with CNBr, trypsin, and proteinase K as outlined in Fig. 1
FIG. 3. HPLC separation of peptides present in fractions 2, 4, and 6. Cornified envelope fragments, prepared from spontaneously formed envelopes, were digested as shown in Fig. 1. Peptide fragments present in fractions 2 (panel A), 4 (panel B), and 6 (panel C) were separated by HPLC. The injected sample was derived from digestion of 2 mg (panel A), 1 mg (panel B), or 3 mg (panel C) of initial envelope fragments. The C18 column was pre-equilibrated in 0.1% trifluoroacetic acid at a flow rate of 1 ml/min. The samples were injected at time zero and, after 15 min, an acetonitrile gradient was initiated. The percent acetonitrile was increased at a rate of 0.67%/min (panels A and C) or 0.33%/min (panel B). The left vertical axis indicates OD units at 220 nm. The horizontal axis is time in minutes, and the right vertical axis is % acetonitrile. The acetonitrile gradient is shown by the dashed line in each panel. Peaks that were collected for purification and sequence analysis are indicated as C5-C20 (panel A), T1-T9 (panel B), and P1-P13 (panel C). The asterisk denotes peaks that were present in control digestion reactions from which envelope fragments were omitted.
Sequence Analysis of Trypsin-released Peptides-We next characterized trypsin-released fraction 4 (Fig. 1). HPLC fractionation of this sample (Fig. 3B) yielded a simple profile, compared with the profile observed in Fig. 3A. Nine peaks (T1-T9) were purified and sequenced. As shown in Table IV, this fraction contained SPRs, S100A11, and annexin I. The SPRs were the most abundant proteins found in fraction 4. Due to the sequence similarity between SPR1A and SPR1B, it was not always possible to make definitive assignments of the protein from which the sequence was derived (e.g. peptide T5). Also, because of the repetitive nature of the amino acid sequence of the SPR family, it was not always possible to assign the sequence to a specific region of the protein (e.g. peptide T4). All of the peptide fragments, with the exception of T8, ended with the Lys or Arg residue characteristic of trypsin cleavage. T8 probably terminates with Pro-Lys (based on the known sequences of SPR1A and SPR1B) (15,17), but these residues were not detected due to a limiting amount of sample. In all cases, the peptides could be assigned to positions in the known protein sequences that are preceded by Lys, Arg, or Met. Seven of the nine peptides were identified as coming from SPRs (T1-T6, T8), two were derived from S100A11 (T7, T9), and one was derived from annexin I (T7-minor). The annexin I peptide is designated "T7-minor" because it was detected as a minor component of T7.
Sequence Analysis of Proteinase K-released Peptides-Fig. 3C shows the HPLC profile of peptides present in fraction 6 (Fig. 1). The indicated peaks (P1-P13) were isolated and repurified. Upon repurification some were separated into multiple peptides, which are designated by alphabetic extensions (Table V). Many of the peptides generated were ≤5 amino acids in length and were too short for definitive assignment of the protein from which they were derived. Several fractions contained only 1-3 amino acids (P1, P2, P5, P6, P11, and P12). Although 4-amino acid length sequences cannot be definitively assigned to a single protein sequence, they do identify possible proteins when screened against a data base that includes the known cornified envelope precursor proteins. Desmoglein (P3a) and involucrin (P3a-minor, P4a) were identified in this group. Three of the sequences (P3b, P8, and P8-minor) were not found in the CE precursor data base. Peptides that were 5 or 6 amino acids were screened using the PAM40 matrix of the blastp facility. Sequences P7 and P9a matched SPR1A protein and annexin I, respectively. P7-minor matched human desmoplakin I and II and one other human protein (Table V, legend). Peptides of ≥6 amino acids were screened using the Blosum-62 matrix. P4b matched S100A11, and P9b and P10 matched SPR proteins. P13, with the amino acid sequence GPAPCPAPAP, did not find a perfect match in the non-redundant protein data base. The closest matches were bovine β-crystallin B1 (residues 34-43, GPAPAPAPAP) and myosin light chain 1 (-PAPAPAPAP-) from several species. Since the HPLC profile obtained from fraction 8 (Fig. 1) was virtually identical to that observed for fraction 6 (data not shown), these peptides were not sequenced.
Identification of Potential Cross-link Sites within S100A11-S100A11 (calgizzarin) is a newly identified precursor of the cornified envelope. To verify its presence in keratinocytes, we cloned and sequenced the S100A11 cDNA from cultured human keratinocytes. The sequence (Fig. 4) matches the previously published sequence of a S100A11 cDNA clone isolated from human colon carcinoma cells (43), indicating that the S100A11 protein detected in skin and in cultured keratinocytes (54,55), and the colon carcinoma protein are identical. Our sequential proteolytic digestion results suggest that S100A11 is cross-linked in specific locations. Based on the S100A11 sequence, CNBr cleavage (at Met residues 1, 43, 63/64, and 89) is expected to release four peptide fragments (Fig. 4). CNBr cleavage of the envelope fragments releases peptide C12 (Table II) (Asn44-Arg62), suggesting that this region is not cross-linked to the envelope structure. After this initial CNBr cleavage, the undigested residual was treated with trypsin. Trypsin digestion released peptide T9 (Ala90-Lys97). Peptide T9 (Table IV) was produced by CNBr cleavage downstream of Met89, followed by tryptic cleavage downstream of Lys97. The sequence of this peptide did not reveal any cross-links. These results strongly suggest that a cross-link at Gln102 and/or Lys103 is anchoring the carboxyl terminus of S100A11 to the envelope. Peptide T7 (Table IV) and peptide P4b (Table V) are positioned within the amino-terminal CNBr cleavage peptide (Ala2-Phe42) of S100A11. This segment was not released from the envelope following CNBr digestion, suggesting that this region contains cross-links. Lys3, Lys23, Lys27, Lys36, and Gln22 are possible cross-linking residues. However, trypsin cleaved immediately downstream of both Lys27 and Lys36, releasing peptide T7, suggesting that these sites are not cross-linked.
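The count of four expected CNBr fragments can be checked with a short calculation from the Met positions given above and the 105-residue length of S100A11 stated in the Fig. 4 legend. The function name and the minimum-length filter are hypothetical choices for this sketch. The ranges below include the carboxyl-terminal Met of each fragment (which CNBr converts to homoserine), whereas the text names some fragments without it, e.g. Ala2-Phe42 and Asn44-Arg62.

```python
def cnbr_fragment_ranges(length, met_positions, min_length=2):
    """Return 1-based inclusive (start, end) ranges produced by cleavage after each Met.

    Fragments shorter than min_length (e.g. a lone Met) are dropped.
    """
    ranges, start = [], 1
    for met in sorted(met_positions):
        ranges.append((start, met))
        start = met + 1
    if start <= length:
        ranges.append((start, length))
    return [(s, e) for s, e in ranges if e - s + 1 >= min_length]


S100A11_LENGTH = 105                 # residues, from the Fig. 4 legend
S100A11_MET = [1, 43, 63, 64, 89]    # Met positions listed in the text

fragments = cnbr_fragment_ranges(S100A11_LENGTH, S100A11_MET)
# fragments == [(2, 43), (44, 63), (65, 89), (90, 105)]
```

Of these four segments, only the second (containing C12) was recovered after CNBr alone, which is the basis for placing cross-links in the amino-terminal and carboxyl-terminal segments.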
b Retention time in minutes. c Multiple names for the same protein are separated by a slash, some sequences did not find a match in the data base, and blank places indicate that sequence could not be assigned. SPR3 appears to be the 22-kDa pancornulin (16).
d Residues indicate only the segment of the peptide sequenced; 8× indicates that the sequence occurs eight times in protein. e Picomoles of peptide detected (by sequencing)/mg of starting CE fragments. f Sequence was screened against a library of known CE precursor proteins, and possible matches are given. However, definite assignment is not possible.
g Sequence is present more than once in the same protein.
Thus, Lys3, Lys23, and Gln22 appear to be the best candidates as sites of cross-link formation within the amino-terminal segment of S100A11. The carboxyl residue of peptide P4b is Arg, a known tryptic cleavage site. However, P4b was not released from the envelope until a subsequent digestion with proteinase K, providing strong additional evidence that Lys3 is a cross-link site. We did not sequence any portion of fragment Lys65-Met89; therefore, it is not possible to draw any conclusions regarding the cross-link status of this segment.
Cross-linking Sites in Annexin I-The annexin I amino acid sequence is shown in the lower panel of Fig. 4. Fragments spanning most of the length of the annexin I protein were recovered. CNBr digestion released peptides C2, C3, C4, C20, C11, C6, and C9. Peptide T7-minor was found after CNBr/trypsin digestion, and peptide P9a was released after proteinase K digestion of the trypsin acid-insoluble fraction 3 (Fig. 1). The P9a peptide is contained within the Val4-Ile55 peptide. The fact that the amino terminus was not released by CNBr, and that subsequent proteinase K digestion released peptide P9a, suggests that the NH2-terminal segment of annexin I contains a cross-link(s). Possible sites for cross-link formation include Lys9, Lys26, and Lys29, and Gln10, Gln19, and Gln23. Three of these sites, Gln19, Gln23, and Lys26, have been previously identified as cross-linking sites in vitro (30,31). No peptides matching the annexin I COOH terminus, Tyr319-Asn346, were recovered in this study.
Identification of Potential Cross-link Sites within SPR1A and SPR1B-We collected multiple fragments from SPR1A and SPR1B proteins. There are no internal methionine residues in SPR proteins, and, as expected, we did not recover any SPR1A or SPR1B peptides in the CNBr-released fractions. There are 10 trypsin cleavage sites in these proteins, three of which yield peptides that are only 2 amino acids long (Fig. 5A). Six SPR-derived, trypsin-cleaved peptides were found in fraction 4, and all were derived from the central portion of the protein (within residues 23-68). The region from Val69-Lys89 was not released by trypsin. As this region contains tryptic cleavage sites, this result suggests that one or more cross-links are present within the carboxyl terminus. Proteinase K digestion of the trypsinized, acid-insoluble pellet (Fig. 1, #6) released two peptides from within this region. The fact that the Val69-Lys89 region was not released by trypsin, but fragments located between Val69 and Ala82 were released by subsequent proteinase K digestion, confirms that it contains cross-links. This region contains Gln83, Gln84, Gln88, Lys85, Lys87, and Lys89 as potential cross-link sites.
Our data also suggest that the NH2 terminus (Met1-Lys22) of the protein may contain cross-links, as it was not recovered. This segment of the protein has 10 candidate cross-linking sites (Gln4, Gln5, Gln6, Lys7, Gln15, Gln17, Gln18, Gln19, Gln20, and Lys22) (Fig. 5A). Lys22 is not a likely site for cross-link formation, since trypsin cleaved at the carboxyl side of this residue, and cleavage would not be expected if an isodipeptide cross-link was present at Lys22. If we plot the picomoles of each SPR peptide released (per milligram of CE) versus fragment position within the protein, it is clear that the central region of the protein is preferentially released (Fig. 5B), and that peptides representing the amino and carboxyl termini are not detected.
FIG. 4. Sequence and potential cross-linking sites of keratinocyte S100A11 and annexin I. S100A11 was cloned from cultured human foreskin keratinocytes using RT-PCR and sequenced. The sequence of the 105-amino acid keratinocyte S100A11 protein derived from the cDNA sequence is shown. The shaded areas represent peptide fragments that were sequenced following release with CNBr (C12); CNBr and trypsin (T7 and T9); and CNBr, trypsin, and proteinase K (P4b). The lower panel shows the sequence of the 346-amino acid annexin I protein (42). The shaded areas indicate peptide fragments that were detected following release with CNBr (C2, C3, C4, C6, C9, C11, and C20); CNBr and trypsin (T7-minor, shown in brackets); or CNBr, trypsin, and proteinase K (P9a). The upward pointing short arrows indicate possible cross-linking sites, and the upward pointing long arrows indicate probable cross-linking sites (see "Results"). It is possible that residue Lys53 of annexin I is also a cross-linking site; however, this seems unlikely. Therefore, it is not indicated by an arrow. The methionine residues (M, CNBr cleavage sites) are shown in bold.
Prediction of Envelope Protein Composition-Using mathematical modeling (26), we attempted to characterize the protein composition of the CE fragments. This method has been reported to accurately predict CE precursor content (25,26). We performed amino acid composition analysis on undigested envelope fragments and on residual CE material (i.e. the material that remained following CNBr, CNBr/trypsin, or CNBr/trypsin/proteinase K digestion). The amino acid composition data are shown in Table VI. These envelope composition data and the amino acid composition for each precursor protein (obtained from the literature) were used in a least squares analysis to estimate the percent of the total CE mass contributed by each protein. The precursor proteins included in this analysis are listed in the legend to Table VII. A nonnegative constraint was imposed upon the calculations (see "Materials and Methods"). In contrast to previous reports (25,26), we obtained large negative values in the absence of this constraint. The results of the least squares analysis are given in Table VII. Keratins, plakoglobin, S100A11, desmoplakin, SPRs, and proelafin were predicted to be the major CE components. However, the values of the residuals (i.e. root-mean-square discrepancy and the median discrepancy, Table VII) were much greater than 1, indicating a poor fit. A good fit is indicated by residuals of ≤1.
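The two goodness-of-fit summaries mentioned above can be sketched as follows. The text does not state how the discrepancies were scaled, so this example assumes they are absolute differences between observed and model-predicted amino acid values in the same units; the function name and the toy numbers are hypothetical.

```python
import math
import statistics


def fit_discrepancies(observed, predicted):
    """Return (root-mean-square, median) of the absolute observed-predicted differences."""
    diffs = [abs(o - p) for o, p in zip(observed, predicted)]
    rms = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return rms, statistics.median(diffs)


# Toy values (hypothetical), e.g. mole percent for four amino acids.
observed = [10.0, 5.0, 8.0, 2.0]
predicted = [9.0, 7.0, 8.0, 5.0]
rms, median = fit_discrepancies(observed, predicted)

# Criterion quoted in the text: residuals of <= 1 indicate a good fit.
good_fit = rms <= 1.0 and median <= 1.0
```

Reporting both values is useful because the root-mean-square is dominated by a few badly fitted amino acids, while the median reflects the typical discrepancy.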
A second, straightforward method was also employed to estimate the relative contributions of the precursor proteins. The number of picomoles of amino acid released during peptide sequencing was used to calculate the percentage of mass contributed by each precursor. When multiple fragments of the same protein were identified, the quantity of the most abundant peptide was used in the calculations. The estimates obtained using this method are given in Table VIII. This method is limited by several considerations. First, we know that losses occur during purification of each peptide. Second, there is no assurance that each peptide is quantitatively released by the digestion protocol. Last, this method does not give a comprehensive description of protein composition, since there may be precursors that are not released by our digestion protocols. These proteins would be missed in the analysis. Thus, this method yields the minimum amount of a protein present in the envelope. This method predicts that SPRs, desmoplakin I, and annexin I comprise a significant percentage of the CE mass.
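The unit conversion behind this second estimate can be written out explicitly. The sketch below assumes that peptide yields are expressed as picomoles per milligram of starting CE and that the molecular mass of the full-length precursor is used; the function name and the example numbers are hypothetical and are not values from Table VIII.

```python
def percent_of_ce_mass(pmol_per_mg, molecular_mass_da):
    """Minimum percent of CE mass contributed by one precursor protein.

    pmol_per_mg: yields of all peptides assigned to the protein (pmol/mg CE);
                 the most abundant peptide is used, as described in the text.
    molecular_mass_da: molecular mass of the precursor (g/mol).
    """
    picograms_per_mg = max(pmol_per_mg) * molecular_mass_da  # pmol x g/mol = pg
    # 1 mg = 1e9 pg, so the mass fraction is pg / 1e9; multiply by 100 for percent.
    return picograms_per_mg / 1e9 * 100.0


# Toy example: three peptides from a 10-kDa protein, most abundant at 100 pmol/mg.
example = percent_of_ce_mass([40.0, 100.0, 25.0], 10000.0)  # about 0.1 percent
```

Because purification losses and incomplete release both lower the measured picomoles, the value returned is a lower bound on the true contribution, consistent with the limitations listed above.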

DISCUSSION
In the present study, we use sequential proteolytic digestion to identify structural components of the cornified envelope of cultured keratinocytes. CNBr digestion, which cleaves after Met residues, released fragments of involucrin, desmoplakin I, plakoglobin, envoplakin, desmoglein 3, plakophilin, PAI-2, desmocollin 3a/3b, annexin I, S100A11, and S100A10 (Tables II and III). Trypsin digestion of the CNBr-resistant pellet released SPR1A and SPR1B protein fragments, as well as portions of S100A11 and annexin I (Table IV). Subsequent proteinase K digestion of the CNBr-digested and trypsin-digested, acid-insoluble residue released peptide fragments of S100A11, desmoplakin I, involucrin, SPR1A, and SPR1B (Table V). These proteins can be divided into three functional groups: (i) annexin and associated proteins, (ii) soluble precursors, and (iii) desmosomal proteins. These groups are discussed below.
Annexin I and Associated Proteins-Annexin I is a member of a family of calcium-dependent, phospholipid-binding proteins (56). These proteins share a common carboxyl-terminal core, consisting of four to eight repeats of a 70-amino acid motif. The amino terminus of each member is unique and is thought to confer functional differences among the proteins (56). The crystal structure of annexin I shows that the carboxyl-terminal core forms a curved plate (57). Calcium binding to the convex surface of this plate triggers binding of this surface to membranes. The unique NH2-terminal region is postulated to form a flexible arm that is located adjacent to the concave face (57) and is the site of modification by transglutaminase (30,31) and phosphorylation by epidermal growth factor receptor (58) and protein kinase C (59). The target residues for transglutaminase-dependent cross-link formation are Gln18, Gln22, and Lys28 (31).
(Fig. 1). The open bar indicates proteinase K fragments (Table V; P7, P9b, and P10), that are contained within the Val69-Lys85 SPR tryptic fragment (the segment Gln83-Lys85 was not found). The amount (picomoles) given for Thr37-Lys44 includes the number of picomoles of Thr37-Lys44 and Glu39-Lys44. The listing for Thr86-Lys89 includes fragments Thr86-Lys87 and Thr88-Lys89 (none of these were recovered). Fragments Met1-Lys7, Gln8-Lys22, and Thr86-Lys89 were not found. The values (picomoles) for the tryptic and proteinase K fragments were taken from Tables IV and V, respectively.
In epithelial cells, annexin I has been shown to become detergent- and reducing agent-insoluble or to form multimers, when SqCC/Y1 keratinocytes (29) or A431 (30, 31) cells, respectively, are challenged with ionophore. Our results provide evidence that annexin I is incorporated into the cornified envelope of epidermal keratinocytes. We hypothesize that, in vivo, annexin I is soluble in basal layer keratinocytes, and then moves to the inner cytoplasmic surface of the plasma membrane as calcium levels increase in more differentiated cells. Most (60-62), but not all (63), studies of annexin I intracellular localization in epidermis are consistent with this model. Once at the plasma membrane, annexin I becomes cross-linked to itself (64), and possibly other proteins, via its NH2 terminus. This model is consistent with annexin I having multiple functions in keratinocytes. In undifferentiated cells, it may function as a signal transduction protein (58,59), or to anchor the cytoskeleton to the cell membrane (65). In cells that have differentiated, it becomes incorporated into the CE. It should be noted that not all annexin family members participate in envelope assembly. In keratinocyte cell cultures, annexins I, II, IV, V and possibly VI and VII are synthesized (29,66). Among this group, only annexin I is a transglutaminase substrate (29,30). Thus, although all annexins bind to membranes, annexin I may be the only member that is incorporated in the CE, suggesting a role for this specific annexin in envelope assembly.
The present report is the first to establish that S100 proteins are CE precursors. The S100 proteins are a family of small, acidic, Ca2+-binding proteins that contain two Ca2+-binding EF hands (67,68). S100 proteins are thought to function like calmodulin, having an important role in calcium-dependent signaling (67,68). However, unlike calmodulin, S100 proteins are expressed in a tissue-specific manner. It is thought that binding of calcium to the EF hands of the S100 proteins results in a conformational change that exposes protein-interaction sites (68). The activated S100 proteins then bind to and regulate the function of target proteins (67,68). Many of the S100 genes, including S100A10 (69, 70) and S100A11 (54,71), have been co-localized on the same chromosome as other epidermal structural protein genes, suggesting a possible function in skin. Thus, the S100 proteins join another keratinocyte envelope precursor, profilaggrin, as an EF hand-containing protein (72). Trichohyalin, a hair follicle protein, which may also be an envelope precursor, also contains an EF hand that may be functionally important (73). S100A11 (S100C, calgizzarin) has been identified in gizzard, lung, heart (74,75), skin (54), and cultured keratinocytes (55). Our cDNA sequence shows that epidermal S100A11 is identical to S100A11 isolated from human colon carcinoma cells (43). As mentioned above, the S100 proteins are thought to affect Ca2+ signaling pathways by binding to and regulating the activity of target proteins. S100A11 interacts with annexin I (76,77) in a calcium-dependent manner, an interaction that requires the first 12 amino acids of the annexin I amino terminus (77). The central hinge and carboxyl terminus are the least conserved regions among S100 proteins and are likely to provide the sites that non-covalently bind other proteins (68).
It has been suggested (77) that the COOH-terminal segment is not required for the interaction, implicating the hinge region as the annexin I interaction site. It is possible that annexin I, via its interaction with the plasma membrane, serves to anchor S100A11 near the plasma membrane, positioning S100A11 for crosslinking.
This study also identifies S100A10 as a component of the keratinocyte cornified envelope. S100A10 (also referred to as p10, p11, and calpactin light chain) binds to annexin II to form a tetramer, called calpactin 1, consisting of two annexin II and two S100A10 proteins (78). S100A10 binds tightly to the annexin II amino terminus via an interaction that does not require calcium. Annexin II differs from annexin I in that it is not a transglutaminase substrate (29,30); however, annexin II does bind to phospholipids in a calcium-dependent manner and thus could be expected to be associated with plasma membranes. In vitro results suggest that S100A10 and annexin II form a high affinity, specific interaction that requires urea for dissociation (78) and that S100A10 cannot bind to membranes in the absence of annexin II (78). The fact that we find S100A10 as a component of the keratinocyte cornified envelope, but not annexin II, suggests that annexin II may position S100A10 for cross-linking. In this model, annexin II could be released during envelope assembly, or it could remain associated with the CE, as a non-cross-linked or disulfide-linked precursor that is extracted by our CE purification procedures. This would suggest that annexin II may serve as an "envelope organizer protein" that never becomes covalently associated.

[Table footnotes, beginning not recovered]
a ... SPR1A and SPR1B (SPR), desmoplakin, S100A11, S100A10, plakoglobin, involucrin, loricrin, desmoglein, desmocolin, plakophilin, and envoplakin were included in this analysis. (The underlined proteins are not listed because they are not estimated to be present by the analysis.)
b The CE amino acid composition data shown in Table VI, and the known amino acid composition of each protein (from the literature), were used to estimate, using least squares analysis, the protein composition of undigested CE fragments and of residual pellet following digestion with CNBr, CNBr and trypsin (Trypsin), or CNBr, trypsin, and proteinase K (Prot K).
c The first value is the root-mean-square discrepancy, and the second is the median discrepancy. Both are expressed as % per amino acid residue.
Soluble Precursors-Consistent with previous reports (13,14,16,18,35,36), our study identifies involucrin and SPR1A and SPR1B as cytoplasmic, hydrophilic proteins that are incorporated into the cornified envelope. Our results, showing that the central portion of the SPR1A and SPR1B proteins is preferentially released by digestion of cornified envelopes, indicate that SPR1A and SPR1B are cross-linked via the amino- and carboxyl-terminal ends. This suggests that these SPRs may function as molecular cross-bridges to connect two proteins (16). A similar role has been proposed for involucrin (11). The yield of the SPR peptides suggests that SPR proteins are major components of the cornified envelope of cultured keratinocytes.
We also detected PAI-2. PAI-2, a serine protease inhibitor that inhibits urokinase-type plasminogen activator, has previously been suggested to be an envelope precursor (32). PAI-2 is one of a growing number of proteinase inhibitors, including elafin (21,35) and cystatin A (20,79), that are thought to be components of the CE. These inhibitors may have a role in regulating the process of envelope formation and/or protecting envelope integrity, by differentially inhibiting specific proteinases (1).
Desmosomal Proteins-Our studies identify desmocolin 3a/3b, desmoglein 3, desmoplakin I, plakoglobin, envoplakin, and plakophilin as envelope components. The desmosome, beginning on the intracellular side, consists of an inner plaque, outer plaque, and membrane-associated desmosome core. The desmosome core contains the extracellular domains of desmogleins and desmocolins (80). Plakoglobin, plakophilin, and the intracellular domains of the desmogleins and desmocolins are components of the outer plaque (80). Desmoplakin I, and presumably envoplakin, are components of the inner plaque (80). The structure of envoplakin, a 210-kDa membrane-associated protein, has recently been presented (34). This protein was originally shown to be a component of keratinocyte envelopes, following induced envelope formation in cultured keratinocytes (33). Thus, our studies show that desmosomal proteins become cross-linked components of the keratinocyte cornified envelope, confirming previous electron microscopic and immunohistological studies that identify desmosomes as part of purified CE fragments (13,27,36). Interestingly, the presence or absence of desmosome-like structures in CEs is affected by the cell culture conditions and method used to initiate envelope formation (36).
Location of Cross-linking Sites-Our results are the first to suggest that S100A11 is covalently modified; furthermore, although we did not directly sequence the cross-linking sites, our results suggest that Lys3, Gln102, and Lys103 are likely sites of cross-link formation. Information regarding S100A11 cross-linking sites was derived from analysis of the release pattern of the envelope precursor peptides. In contrast to the central region, the carboxyl and amino termini of S100A11 are not readily released by CNBr treatment. However, pieces of these regions are released by subsequent trypsin or proteinase K digestion. These results suggest that the amino- and carboxyl-terminal ends are sites of attachment to the envelope. These results are consistent with the possibility that S100A11 specifically interacts with annexin I via the central region of S100A11. Annexin I binding to this region may prevent it from being available as a TG substrate. SPR1A and SPR1B have been shown to be precursors of the cornified envelope in in vivo corneocytes (35). Cross-links have been identified at positions Lys7, Gln88, and Lys89 (35). Our results, indicating that the amino- and carboxyl-terminal ends are the sites of cross-link formation, are consistent with this result. In this previous report, loricrin was identified as the partner of SPR in cross-link formation. Although loricrin is likely to be expressed in our 3T3-dependent, retinoid-deficient culture system (81), proteolytic digestion did not release loricrin peptides. This is surprising as loricrin has been shown to be a frequent participant in cross-link formation in vivo (35). It is possible that loricrin is produced at low levels in our culture system, or that it is produced but is not efficiently delivered to the site of cross-link formation. It is also possible that our fractionation system does not favor retention of loricrin fragments.
However, this appears unlikely, since our isolation and fractionation conditions are similar to those used to identify loricrin fragments from in vivo envelopes (35). Our results suggest that SPRs are cross-linked to other proteins in the cultured cells.
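The release-pattern reasoning used above to locate likely attachment regions can be sketched as a simple coverage calculation: residues that are never covered by a recovered peptide are candidates for cross-linked segments. The function name, the protein length, and the peptide coordinates below are hypothetical illustrations, not data from this study, and an unrecovered segment may also reflect losses during purification.

```python
def uncovered_regions(protein_length, recovered_peptides):
    """Return 1-based (start, end) ranges not covered by any recovered peptide.

    `recovered_peptides` is a list of inclusive (start, end) residue ranges.
    """
    covered = [False] * (protein_length + 1)  # index 0 is unused
    for start, end in recovered_peptides:
        for position in range(start, end + 1):
            covered[position] = True

    regions = []
    run_start = None
    for position in range(1, protein_length + 1):
        if not covered[position] and run_start is None:
            run_start = position  # an uncovered stretch begins here
        elif covered[position] and run_start is not None:
            regions.append((run_start, position - 1))
            run_start = None
    if run_start is not None:
        regions.append((run_start, protein_length))
    return regions


# Hypothetical 20-residue protein with two overlapping recovered peptides.
candidate_sites = uncovered_regions(20, [(5, 10), (9, 15)])
print(candidate_sites)
```

In this toy case only the two termini remain uncovered, which mirrors the argument that poorly released terminal segments are the likely sites of attachment to the envelope.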
Precursor Composition-We used two methods to estimate the content of precursors in the cornified envelopes. The first method, using the experimentally determined amino acid composition of the CE fragments and the known amino acid composition of the precursor proteins, predicts envelope composition using a least squares best fit analysis (25,26). Using this method, it has been reported that involucrin, cystatin A, and cysteine-rich protein (elafin) each comprise >25% (by mass) of cornified envelopes prepared from cultured cells (26). Mathematical modeling of our amino acid composition data predicts that keratins, plakoglobin, S100A11, desmoplakin, SPRs, and proelafin are the most abundant proteins (Table VII). Loricrin, cystatin A, involucrin, annexin I, desmoglein, desmocolin, plakophilin, and envoplakin are predicted not to be present. To obtain these estimates, we were forced to apply the constraint that the composition values could not be negative. When this constraint was removed, large negative (non-physiologic) values were obtained. In addition, very large residuals were obtained, indicating that the method did not accurately predict the relative content of each CE precursor. Moreover, as described below, the estimates obtained using this method are not consistent with those obtained by peptide sequencing.
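A least squares fit with a non-negativity constraint of the kind described above can be sketched as follows. This is only an illustration under stated assumptions: the solver is a plain projected gradient descent written for clarity (in practice a library routine such as `scipy.optimize.nnls` would normally be used), it is not the procedure of Refs. 25 and 26, and the three "proteins", their compositions, and the mixture are hypothetical numbers chosen so that the answer is known.

```python
def nonnegative_least_squares(compositions, observed, iterations=5000):
    """Minimise ||A w - b||^2 subject to w >= 0 by projected gradient descent.

    `compositions` (A) has one row per amino acid and one column per protein;
    `observed` (b) is the measured composition of the envelope sample.
    """
    n_rows, n_cols = len(compositions), len(compositions[0])
    # The trace of A^T A bounds its largest eigenvalue, so 1/trace is a safe step.
    step = 1.0 / sum(value ** 2 for row in compositions for value in row)
    weights = [0.0] * n_cols
    for _ in range(iterations):
        residual = [
            sum(compositions[i][j] * weights[j] for j in range(n_cols)) - observed[i]
            for i in range(n_rows)
        ]
        gradient = [
            sum(compositions[i][j] * residual[i] for i in range(n_rows))
            for j in range(n_cols)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        weights = [max(0.0, weights[j] - step * gradient[j]) for j in range(n_cols)]
    return weights


def rms_discrepancy(compositions, weights, observed):
    """Root-mean-square difference between fitted and observed composition."""
    squared = [
        (sum(row[j] * weights[j] for j in range(len(weights))) - target) ** 2
        for row, target in zip(compositions, observed)
    ]
    return (sum(squared) / len(squared)) ** 0.5


# Hypothetical mole fractions of four residues (rows) in three proteins (columns).
compositions = [
    [0.6, 0.1, 0.1],
    [0.1, 0.6, 0.1],
    [0.1, 0.2, 0.7],
    [0.2, 0.1, 0.1],
]
# Hypothetical sample: an equal mixture of protein 1 and protein 3 only.
observed = [0.35, 0.10, 0.40, 0.15]

weights = nonnegative_least_squares(compositions, observed)
print(weights, rms_discrepancy(compositions, weights, observed))
```

With real data the fitted residual is rarely near zero, and a large root-mean-square discrepancy, as reported here, signals that the candidate precursor set does not explain the measured composition well.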
The second method we used estimates protein abundance by measuring the amount of each peptide fragment detected in sequencing experiments. By mass, SPRs, desmoplakin I, and annexin I are predicted to be the most abundant proteins. The estimation of a high content of annexin I is consistent with the gel shown in Fig. 2, which shows an abundance of annexin I peptide released by CNBr cleavage of envelopes. It seems unlikely that desmoplakin I is the second most abundant CE component. On a molar basis, SPR1A, SPR1B, annexin I, S100A11, and S100A10 are the most abundant proteins. It should be noted that this method, for reasons outlined under "Results," only provides a minimum estimate of protein content. However, analysis using this method suggests that annexin I and SPRs are abundant components of the CE.
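The difference between the mass-based and molar rankings follows from simple arithmetic: a large protein recovered in few picomoles can outweigh a small protein recovered in many. The sketch below uses hypothetical picomole yields and rounded, illustrative molecular masses (not values from Tables IV and V); the names are placeholders.

```python
def mass_in_nanograms(picomoles, molecular_mass_da):
    """Convert a molar amount to mass: 1 pmol of a 1-Da species weighs 1 pg."""
    return picomoles * molecular_mass_da / 1000.0


def ranked(names_to_values):
    """Return names sorted from the largest to the smallest value."""
    return sorted(names_to_values, key=names_to_values.get, reverse=True)


# Hypothetical yields (pmol) and illustrative molecular masses (Da).
recovered = {
    "small_precursor": {"pmol": 100.0, "mass_da": 10_000},
    "large_precursor": {"pmol": 10.0, "mass_da": 300_000},
}

by_moles = {name: entry["pmol"] for name, entry in recovered.items()}
by_mass = {
    name: mass_in_nanograms(entry["pmol"], entry["mass_da"])
    for name, entry in recovered.items()
}

print(ranked(by_moles))
print(ranked(by_mass))
```

Because only recovered peptides are counted, both rankings remain minimum estimates, as noted above.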
Facilitated Movement of Precursors to the Inner Face of the Plasma Membrane-An issue that has not been addressed in great detail in the past is how envelope precursors become localized at the inner surface of the plasma membrane. It has been proposed, for involucrin and other soluble precursors, that localization is determined by the location of transglutaminase (11). In this model, the precursors diffuse passively to the vicinity of the transglutaminase, an acyl enzyme intermediate is formed, and the protein is cross-linked in place, perhaps to an integral membrane protein. A second mechanism, which has been proposed for the insoluble envelope precursors, such as loricrin, involves delivery via a vesicle (10). Our present results, which identify S100A11, S100A10, and annexin I as envelope components, suggest a third mechanism, which uses what we call "envelope organizer proteins." Annexin I is known to bind to membrane phospholipids in a calcium-dependent manner (56). It is also known that calcium levels rise as keratinocytes differentiate (82,83). This suggests that, as cells differentiate and calcium levels rise, annexin I, as a complex with S100A11, moves to the plasma membrane and binds to the inner surface. In this manner, annexin I can be synthesized as a soluble precursor, and only later, when calcium levels rise, be transferred to the site of cross-linking. A similar model can be envisaged for S100A10 and annexin II. In this model, the annexins function as envelope organizer proteins (i.e. proteins that move precursors to the appropriate location, but are not themselves necessarily cross-linked).
The Precursor Availability Hypothesis-Based on previously published data (84) and our results, we suggest that the process of CE formation will utilize those reactive precursors (i.e. transglutaminase substrates) that are available at the time of cornified envelope assembly. Thus, depending upon substrate availability, envelope composition can vary considerably. This implies that the proteins that form cross-linked partners will vary in envelopes formed under different conditions (cell culture conditions, different body locations, presence of disease, etc.). We further suggest that the sites of cross-link formation within individual protein precursors will be constrained. The presence of amino- and carboxyl-terminal cross-links in the SPRs and in the amino terminus of annexin I, described in the present study, is consistent with previous in vivo and in vitro studies showing cross-links in these regions (31,35). The molecular details regarding envelope formation must depend upon (i) the precursors that are available at the time of cross-linking and (ii) the distribution and type of transglutaminase present. This model, the precursor availability hypothesis, differs from the dustbin hypothesis (84), in that we hypothesize that CE proteins, regardless of their role in other cellular processes, also function as envelope components (i.e. are not waste proteins), and that only specific cross-linking sites are utilized on each protein. Moreover, the site of cross-link formation within each protein does not depend upon the identity of the cross-linked partner. This model predicts that there is a family of proteins that function as CE precursors and that much of the difference in envelope composition results from differences in the abundance and availability of each precursor at the time of CE assembly.