Silica-precipitating peptides from diatoms. The chemical structure of silaffin-A from Cylindrotheca fusiformis.

Two silica-precipitating peptides, silaffin-1A(1) and -1A(2), both encoded by the sil1 gene from the diatom Cylindrotheca fusiformis, were extracted from cell walls and purified to homogeneity. The chemical structures were determined by protein chemical methods combined with mass spectrometry. Silaffin-1A(1) and -1A(2) consist of 15 and 18 amino acid residues, respectively. Each peptide contains a total of four lysine residues, all of which were found to be post-translationally modified. In silaffin-1A(2) the lysine residues are clustered in two pairs in which the epsilon-amino group of the first residue is linked to a linear polyamine consisting of 5 to 11 N-methylated propylamine units, whereas the second lysine is converted to epsilon-N,N-dimethyllysine. Silaffin-1A(1) contains only a single lysine pair exhibiting the same structural features. One of the two remaining lysine residues was identified as epsilon-N,N,N-trimethyl-delta-hydroxylysine, a lysine derivative containing a quaternary ammonium group. The fourth lysine residue again is linked to a long-chain polyamine. Silaffin-1A(1) is the first peptide shown to contain epsilon-N,N,N-trimethyl-delta-hydroxylysine. In vitro, both peptides precipitate silica nanospheres within seconds when added to a monosilicic acid solution.

Silicon oxide minerals, the main constituents of the earth's crust, are not exclusively formed by geological processes. In fact, hydrated silicon dioxide (silica), the second most abundant biogenic mineral (biomineral), is produced by a wide range of organisms including animals and higher plants (1). A large proportion of biogenic silica is formed by diatoms (2), which are unicellular algae that are ubiquitously present in marine and freshwater habitats (3). The main attribute of a diatom cell is its silica-based cell wall. The intricate and ornate silicified cell walls of diatoms are one of the most outstanding examples of nanoscale-structured materials in nature. In the past, diatoms have been studied as model organisms to investigate the biochemical basis of biological silica formation (4-6). This has led to the discovery of silicic acid transporter proteins (7,8) and unique organic components that are associated with biosilica (9-11).
Interest in silica biomineralization has been greatly increased by the recognition that the organic molecules that mediate the formation of silica structures in vivo could be useful tools in materials technology for biomimetic production of nanostructured silica in vitro (12-14). Recently, silica-associated components from different diatom species were identified that mediate the formation of silica nanospheres in vitro from a silicic acid solution (10,11). These components are long-chain polyamines and polycationic polypeptides termed silaffins. The chemical structures of the polyamines have been completely elucidated. They are composed of linear chains of 8 to 20 N-methylated propylamine units that are attached to putrescine or a putrescine derivative (11). In contrast, there is only incomplete information about the chemical structure of the silaffins. Recently, a silaffin-encoding gene, termed sil1, has been cloned from the diatom Cylindrotheca fusiformis. The encoded polypeptide sil1p serves as a precursor molecule, which becomes proteolytically processed and post-translationally modified to produce the silica-precipitating peptides silaffin-1A and silaffin-1B. It has been demonstrated that silaffin-1A represents a mixture of peptide isoforms, and that their silica-precipitating activity depends on the presence of modified lysine residues (10). So far, two types of modified lysine residues (denoted Lys x and Lys y) have been characterized from the N-terminal octapeptide fragment SSK x K y SGSY that is common to all silaffin-1A isoforms. Lys x represents a lysine residue that carries on its ε-amino group a linear polyamine consisting of 5 to 11 N-methylated propylamine units. Lys y has been shown to represent ε-N,N-dimethyllysine. However, no information was available for the chemical structures of the remaining parts of the silaffin-1A isoforms.
In the present study we describe the silica-precipitating properties and complete the chemical structures of the peptides silaffin-1A 1 and silaffin-1A 2 , which together account for all peptide isoforms of silaffin-1A.

EXPERIMENTAL PROCEDURES
Culture Conditions-Cylindrotheca fusiformis was grown in artificial sea water medium as described previously (15).
Isolation of silaffin-1A 1 and silaffin-1A 2 -The silaffin-1A fraction was isolated from purified cell walls of C. fusiformis as described previously (10) and subjected to high pressure liquid chromatography (HPLC) on a Sephasil C18 2.1/10 column using the SMART-System (Amersham Pharmacia Biotech). Elution of peptides was performed by increasing the concentration of buffer B from 0 to 25% in 35 min (buffer A: 0.1% trifluoroacetic acid in H2O; buffer B: 0.085% trifluoroacetic acid in acetonitrile). Fractions containing silaffin-1A 1 and silaffin-1A 2, respectively, were lyophilized, and the residues were dissolved in H2O.
Digestion of Silaffin-1A 1 with Chymotrypsin and Separation of Peptides-Silaffin-1A 1 (180 μg) was dissolved in 100 μl of 50 mM Tris/HCl, pH 8, and 9 μg of chymotrypsin (Nα-p-tosyl-L-lysine chloromethyl ketone-treated; Sigma) was added. Incubation was at 37°C for 15 h. The resulting chymotryptic peptides were separated by HPLC using the same conditions as described above.
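The expected outcome of this digestion can be sketched in silico. The following is a minimal illustration rather than the procedure used in this study: it assumes a simplified specificity rule (cleavage on the C-terminal side of Phe, Trp, or Tyr, but not before Pro), writes all modified lysines as plain K, and uses the 15-residue silaffin-1A 1 sequence established in this work. The function and constant names are hypothetical.

```python
def chymotryptic_fragments(sequence, sites="FWY"):
    """Split a one-letter peptide sequence after each residue in `sites`.

    Simplified rule (an assumption for illustration only): cleave on the
    C-terminal side of Phe, Trp, or Tyr unless the next residue is Pro.
    """
    fragments = []
    start = 0
    # The last residue is skipped: cleaving after it would yield an empty piece.
    for i, residue in enumerate(sequence[:-1]):
        if residue in sites and sequence[i + 1] != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments


# Silaffin-1A 1, with every modified lysine written as plain K (simplification).
SILAFFIN_1A1 = "SSKKSGSYSGSKGSK"

if __name__ == "__main__":
    for fragment in chymotryptic_fragments(SILAFFIN_1A1):
        print(fragment)
```

Under these assumptions the single tyrosine gives one cleavage site, so the peptide splits into an N-terminal octapeptide and a C-terminal heptapeptide. Whether the neighboring lysine modifications affect cleavage efficiency is not modeled here.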
Acid Hydrolysis and Hydrazinolysis-Complete degradation of silaffins to yield free amino acids was performed by gas phase hydrolysis using 6 M HCl at 110°C for 6-16 h. For hydrazinolysis, anhydrous hydrazine was prepared from hydrazonium cyanurate (Fluka) according to the method of Nachbaur and Leiseder (16). Silaffin peptides were lyophilized in glass reaction tubes (ReactiVial, Pierce), and 100 μl of anhydrous hydrazine was added to each sample under a nitrogen atmosphere. The reaction tubes were sealed tightly and incubated at 110°C for 24 h. Subsequently, hydrazine was evaporated by lyophilization, and the residues were dissolved in 50 mM ammonium acetate.
* This work was supported by Grant A2 (SFB 521) from the Deutsche Forschungsgemeinschaft and by the Fonds der Chemischen Industrie. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡ To whom correspondence should be addressed.
Alkylation of Modified Lysine Residues-A dried acid hydrolysate of silaffin-1A was dissolved in 50 mM sodium phosphate, pH 7, and subjected to reductive ethylation with sodium cyanoborohydride and acetaldehyde according to a previously described protocol (17).
Amino Acid Sequencing-Peptides were sequenced by automated Edman degradation on a Procise 492A sequencer (PE Biosystems) with on-line detection of the phenylthiohydantoin amino acids according to the manufacturer's instructions.
Mass Spectrometry-Electrospray ionization mass spectrometry (ESI-MS) and fragmentation analysis were performed using an Ion Trap ESQUIRE LC (Bruker) instrument. Samples were infused by a nanospray source in either 1 mM ammonium acetate, 50% CH3CN (for analysis of polyamine-modified lysine residues) or 0.5% acetic acid, 50% methanol (for analysis of peptides and amino acids).
Determination of Peptide Concentration and Silica Precipitation Assay-These procedures were performed as described previously (10).
Electron Microscopy-Silica precipitates were washed with H2O, mounted onto a graphite-coated coverslip, and analyzed on a LEO1530 field-emission scanning electron microscope.

RESULTS
Isolation and Sequence Analysis of Silaffin-1A Isoforms-
Amino acid sequencing of silaffin-1A exhibited the N-terminal sequence SSK x K y SGSYSG(S/Y) (10), indicating the existence of at least two different but related peptide species within the silaffin-1A preparation. Reversed phase HPLC on a C18 column produced two peptide fractions, which were termed silaffin-1A 1 and silaffin-1A 2, respectively (Fig. 1A). Amino acid sequencing resulted in the sequence SSK x K y SGSYSGSK z GS for silaffin-1A 1 (Lys x and Lys y were previously shown to represent modified lysine residues; the chemical nature of Lys z will be described below) and the sequence SSK x K y SGSYSGYSTK x K y SGS for silaffin-1A 2. No further amino acid signals were observed after sequencing cycles 14 and 18, respectively. Mass spectrometry indicated that both of these fractions contain peptide isoforms differing by increments of 71 mass units (Table I). This mass heterogeneity can be explained precisely by the variation in chain length (number of methylated propylamine units) of the polyamine modification present on lysine derivative Lys x. A comparison of the silaffin-1A 1 and silaffin-1A 2 sequences with the sequence of the silaffin precursor polypeptide sil1p clearly demonstrates that the silaffin-1A 1 peptides are derived from any of the repeats R3-R7 of sil1p, whereas silaffin-1A 2 is derived from repeat R2 (Fig. 1B).
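The 71-unit spacing can be checked against elemental composition. The sketch below assumes that each additional N-methylated propylamine unit adds a net C4H9N to the polyamine chain and uses standard monoisotopic atomic masses; the helper names are illustrative only and the calculation is not taken from Table I.

```python
import re

# Standard monoisotopic atomic masses (Da).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}


def formula_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C4H9N'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


# Assumed net composition of one N-methylated propylamine unit.
UNIT_MASS = formula_mass("C4H9N")


def ladder_offsets(n_min=5, n_max=11):
    """Mass offsets relative to the shortest chain for n_min..n_max units."""
    return {n: (n - n_min) * UNIT_MASS for n in range(n_min, n_max + 1)}


if __name__ == "__main__":
    print(f"mass per unit: {UNIT_MASS:.4f} Da")
    for n, offset in ladder_offsets().items():
        print(f"{n:2d} units: +{offset:.2f} Da")
```

With this assumption the unit mass comes out very close to 71 Da, so isoforms whose chain lengths differ by one unit are separated by the observed increment.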
The ratio of peak areas of silaffin-1A 1 and silaffin-1A 2 in the HPLC chromatogram is about 5 to 1. Because sil1p contains five repeats of the R3-R7 type but only a single repeat R2, it is reasonable to assume that each of the repeats R3-R7 of sil1p contributes to silaffin-1A 1 production (Fig. 1B). A comparison of the sil1p sequence (10) with the results from amino acid sequencing of isolated silaffins reveals the following structural features of silaffin-1A 1 and silaffin-1A 2. 1) All of the lysine residues present are post-translationally modified. Three types of modified lysine residues (designated Lys x, Lys y, and Lys z) can be distinguished in silaffin-1A 1. Lys y produces a signal between Arg and Tyr in the chromatogram of the amino acid sequencer and was previously identified as ε-N,N-dimethyllysine (10). Lys z is a so far unidentified lysine derivative exhibiting a characteristic peak between Ala and Arg. Lys x denotes a lysine derivative that does not show up at all in automated amino acid sequencing. For lysine derivative Lys x at position 3, a long-chain polyamine was previously shown to be attached to the ε-amino group (10). 2) Neither the C-terminal amino acid sequence RRIL predicted by the gene sequence of repeat R2 nor the corresponding sequences KRRNL and KRRIL predicted from repeats R3 to R7 showed up in amino acid sequencing of silaffin-1A 2 and silaffin-1A 1, respectively. Therefore, these residues may either have been removed or post-translationally modified during processing of the silaffin precursor polypeptide.
FIG. 1. Analysis of silaffin-1A isoforms. A, separation of silaffin-1A 1 and silaffin-1A 2 on a reversed phase C18 column. B, schematic structure of the silaffin precursor polypeptide sil1p. The black pentagons denote the repeating unit elements from which the silaffins are generated. The amino acid sequences of repeating units R2 and R3-R7, respectively, are bracketed. Repeating unit R1 gives rise to silaffin-1B (10). The white bar denotes the signal peptide, and the gray oval indicates the highly acidic prosequence (10).
Complete Amino Acid Sequence of Silaffin-1A 1 -To further analyze the chemical structure, silaffin-1A 1 was digested with chymotrypsin, which generates the previously described N-terminal octapeptide SSK x K y SGSY (see the Introduction) and the C-terminal fragment SGSK z GS (Fig. 2). In reversed phase C18 HPLC, the N-terminal octapeptide separates into five fractions (Fig. 2), which differ in masses by multiples of 71 Da (Table II) due to heterogeneity with respect to the chain length of the polyamine modification (10). Surprisingly, the same mass differences were observed in the subfractions derived from the C-terminal peptide SGSK z GS (Table II), suggesting that this peptide also contains a lysine residue carrying the polyamine modification. Because this type of lysine derivative is not detectable by automated amino acid sequencing, we hypothesized that it may constitute the C terminus, and therefore the sequence of the C-terminal peptide may rather be represented by SGSK z GSK x. To investigate this, the material from fraction 3 (C-terminal peptide) and fraction 8 (N-terminal octapeptide), respectively, was subjected to hydrazinolysis, and the resulting products were analyzed by ESI-MS. Hydrazinolysis leads to the breakdown of the peptide backbone, and all amino acid residues originally placed within the polypeptide chain become converted to the corresponding hydrazides. Only the C-terminal residue is released as free amino acid (18).
As expected, hydrazinolysis of the material from fraction 8 (N-terminal octapeptide) generated a molecule of mass (m + H)+ = 729.8 Da (Fig. 3A). This molecule corresponds to the hydrazide of a lysine derivative carrying eight methylated propylamine units (the molecular mass of the hydrazide is 14 Da higher compared with the free amino acid derivative). In contrast, a molecule of mass (m + H)+ = 715.8 Da was found among the hydrazinolysis products from fraction 3 (C-terminal peptide), indicating that a lysine residue carrying eight methylated propylamine units is indeed present at the C terminus of the original peptide (Fig. 3B). This result demonstrates that the correct peptide sequence of silaffin-1A 1 is represented by SSK x K y SGSYSGSK z GSK x.
Identification of ε-N,N,N-Trimethyl-δ-hydroxylysine-To identify the chemical structure of lysine derivative Lys z, complete acid hydrolysis of silaffin-1A 1 was performed, and the masses of the resulting products were analyzed by ESI-MS (data not shown). Apart from the masses of the known amino acid constituents of silaffin-1A 1 (Ser, Gly, Tyr, Lys x, and Lys y), a molecule of mass (m + H)+ = 205.1 was detected. This molecule was present only in the acid hydrolysate of the C-terminal peptide of silaffin-1A 1 (data not shown), indicating that (m + H)+ = 205.1 corresponds to the molecular mass of the lysine derivative Lys z. The (m + H)+ = 205.1 ion was isolated using the ion trap mode of the mass spectrometer and subjected to collision-induced fragmentation. The elimination of 59 and 18 mass units indicated the presence of a trimethylammonium group as well as a hydroxy group within this lysine derivative. Indeed, the obtained product ion spectrum (Fig. 4A) matches the spectrum obtained from authentic ε-N,N,N-trimethyl-δ-hydroxylysine (Fig. 4B). This lysine derivative had previously been found in acid hydrolysates obtained from total cell wall preparations of different diatom species including C. fusiformis (19).
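These assignments can be cross-checked with a short monoisotopic mass calculation. The sketch below rests on several assumptions made only for illustration: the compositions C6H14N2O2 for lysine, a net C4H9N per methylated propylamine unit, C9H21N2O3 for the ε-N,N,N-trimethyl-δ-hydroxylysine cation (whose quaternary ammonium group already carries the charge), replacement of the carboxyl OH by NHNH2 in a hydrazide, and neutral losses of trimethylamine (C3H9N) and water (H2O). The function and constant names are hypothetical.

```python
# Standard monoisotopic atomic masses (Da) and the proton mass.
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646


def mass(**atoms):
    """Monoisotopic mass from element counts, e.g. mass(C=6, H=14, N=2, O=2)."""
    return sum(MASS[element] * count for element, count in atoms.items())


LYSINE = mass(C=6, H=14, N=2, O=2)
UNIT = mass(C=4, H=9, N=1)                      # one N-methylated propylamine unit
HYDRAZIDE_SHIFT = mass(N=2, H=2) - mass(O=1)    # -OH replaced by -NHNH2


def lys_x_mh(n_units, hydrazide=False):
    """(m + H)+ of Lys x with n_units units, as free acid or as hydrazide."""
    value = LYSINE + n_units * UNIT + PROTON
    return value + HYDRAZIDE_SHIFT if hydrazide else value


# Lys z is treated as a preformed cation; the electron mass is neglected.
LYS_Z_CATION = mass(C=9, H=21, N=2, O=3)
TRIMETHYLAMINE = mass(C=3, H=9, N=1)
WATER = mass(H=2, O=1)

if __name__ == "__main__":
    print(f"Lys x, 8 units, free acid: {lys_x_mh(8):.2f}")
    print(f"Lys x, 8 units, hydrazide: {lys_x_mh(8, hydrazide=True):.2f}")
    print(f"Lys z cation:              {LYS_Z_CATION:.2f}")
    print(f"trimethylamine loss:       {TRIMETHYLAMINE:.2f}")
    print(f"water loss:                {WATER:.2f}")
```

Under these assumptions the calculated values fall within about 0.1 Da of the reported ions at 715.8, 729.8, and 205.1, and the two neutral losses correspond to the observed eliminations of 59 and 18 mass units.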
Chemical Structures of Silaffin-1A 1 and Silaffin-1A 2 -Taken together, the data of silaffin-1A 1 analysis lead to the structural model presented in Fig. 5. This model exactly explains the molecular masses of all peptide isoforms found to be present within the silaffin-1A 1 fraction (Table I).
The molecular masses of the silaffin-1A 2 peptide isoforms (Table I) are consistent with the assumption that both the lysine derivatives Lys x at positions 3 and 14 carry the polyamine modification and that the lysine derivatives Lys y found at positions 4 and 15 are ε-N,N-dimethyllysine. This was confirmed by acid hydrolysis of silaffin-1A 2 and subsequent ESI-MS analysis. ε-N,N-Dimethyllysine and polyamine-modified lysine derivatives were the only modified amino acids present in this hydrolysate. The structural model for silaffin-1A 2 is shown in Fig. 5.
FIG. 2. Analysis of silaffin-1A 1 fragments. Silaffin-1A 1 was digested with chymotrypsin, and the resulting peptides were separated on a reversed phase C18 column. The amino acid sequences of the C-terminal peptides (fractions 1-5) and the N-terminal peptides (fractions 6-10) as determined by complete amino acid sequencing are indicated.
FIG. 3. Determination of the C-terminal amino acid of silaffin-1A 1. Chymotryptic peptides 3 and 8 (see Fig. 2 for numbering) generated from silaffin-1A 1 were subjected to hydrazinolysis, and the products were analyzed by ESI-MS. Only the m/z regions corresponding to the singly charged ions of lysine derivative Lys x are shown.
Methylation Pattern of the Long-chain Polyamine Modification-Structural analysis of long-chain polyamines by mass spectrometry revealed that their collision-induced fragmentation is caused exclusively by the cleavage of C-N bonds (11). According to this finding, the previously proposed structure of the polyamine moiety linked to lysine derivatives Lys x in silaffin-1A (10) has to be reconsidered. A shift of two methyl groups within the polyamine chain enables the interpretation of all the fragment ions observed (10) by allowing C-N bond cleavages only. Therefore, a modified structural model is proposed for lysine derivative Lys x (included in Fig. 5) in which the polyamine moiety is dimethylated at its terminal amino group, thus representing a methylation isomer of the previously proposed structure. The modified structural model was confirmed by reductive ethylation of lysines Lys x and fragmentation analysis (by mass spectrometry) of the resulting ethylated derivatives. Reductive ethylation introduced exactly four ethyl groups into each Lys x isoform, irrespective of chain-length variations in the polyamine moiety (data not shown). The fragment ion spectrum of one of these ethylated derivatives (Fig. 6A) demonstrates that the terminal amino group of the polyamine moiety is dimethylated, because it is only the ε-amino group of the lysine core as well as the amino group of the very first propylamine unit that can be converted to an N-ethyl derivative (in addition to the α-amino group of the lysine core; Fig. 6B). This fact clearly indicates that both of these amino groups exist as secondary amines in the parent molecule, i.e. they are not methylated (these amino groups were assumed to be methylated in the previous structural model).
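The count of four ethyl groups follows from simple bookkeeping over the amine classes in the revised model. The sketch below is an illustration under stated assumptions: reductive ethylation is taken to convert each primary amine to an N,N-diethyl derivative and each secondary amine to an N-ethyl derivative while leaving tertiary amines unchanged, and every internal propylamine nitrogen is taken to carry one methyl group. The list encoding of Lys x and the function names are hypothetical.

```python
# Assumed number of ethyl groups each amine class can accept.
ETHYLS_ACCEPTED = {"primary": 2, "secondary": 1, "tertiary": 0}


def lys_x_amines(n_units):
    """Amine classes in the revised Lys x model with n_units propylamine units."""
    if n_units < 2:
        raise ValueError("the model needs at least two propylamine units")
    amines = ["primary"]                          # alpha-amino group of the lysine core
    amines.append("secondary")                    # epsilon-amino group (not methylated)
    amines.append("secondary")                    # first propylamine unit (not methylated)
    amines.extend(["tertiary"] * (n_units - 2))   # internal N-methylated units
    amines.append("tertiary")                     # dimethylated terminal amino group
    return amines


def ethyl_groups_added(n_units):
    """Total ethyl groups introduced by exhaustive reductive ethylation."""
    return sum(ETHYLS_ACCEPTED[kind] for kind in lys_x_amines(n_units))


if __name__ == "__main__":
    for n in range(5, 12):
        print(n, ethyl_groups_added(n))
```

Under these assumptions the total is independent of chain length, because every nitrogen beyond the first propylamine unit is already tertiary. A model in which the ε-amino group and the first propylamine nitrogen were methylated would instead predict only two ethyl groups, both on the α-amino group.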
Silica Precipitation-It was previously shown that the silica precipitation activity of silaffin-1A peptides at pH values < 7 is dependent on the lysine modifications (10). Because the presence of ε-N,N,N-trimethyl-δ-hydroxylysine clearly distinguishes silaffin-1A 1 from silaffin-1A 2 (Fig. 5), we investigated whether the two silaffin-1A isoforms have different pH-dependent silica-precipitating properties. In an in vitro assay, the amount of silica precipitated by silaffin-1A 1 and silaffin-1A 2, respectively, was found to be fairly constant at different pH values and almost identical for silaffin-1A 1 (9.0-11.9 nmol of Si/nmol of peptide) and silaffin-1A 2 (10.3-11.7 nmol of Si/nmol of peptide). Only at pH 5 did silaffin-1A 1 show a slightly lower silica-precipitating activity as compared with silaffin-1A 2 (Fig. 7A). The structures of the silica precipitates were analyzed by scanning electron microscopy. At all pH values silaffin-1A peptides induced the formation of spherical silica nanoparticles, but at the level of scanning electron microscope resolution, no morphological difference was noted between the precipitates induced by silaffin-1A 1 (Fig. 7B) and silaffin-1A 2 (Fig. 7C).

DISCUSSION
The present study describes for the first time the complete chemical structures of silica-precipitating peptides found in cell walls of the diatom C. fusiformis. These are silaffin-1A 1 and silaffin-1A 2, which consist of 15 and 18 amino acid residues, respectively. Both peptides contain a total of four lysine residues, and all of these are targets for post-translational modifications. In silaffin-1A 2, the lysine residues are clustered in two pairs with the first residue being linked to a long-chain polyamine and the second lysine being converted to ε-N,N-dimethyllysine. In silaffin-1A 1, the same type of modified lysine pair is present only once within the N-terminal part of the peptide. The remaining two lysine residues in the C-terminal part are separated by two intercalated amino acids; this motif appears to alter the strategy of post-translational modification. The lysine residue at position 12 becomes modified to ε-N,N,N-trimethyl-δ-hydroxylysine, and it is now the C-terminal lysine residue that carries a long-chain polyamine modification. Remarkably, more than 30 years ago, Nakajima and Volcani (19) isolated and characterized for the first time ε-N,N,N-trimethyl-δ-hydroxylysine in acid hydrolysates of total cell wall preparations from a number of diatoms. However, the corresponding proteins in diatoms containing this special type of modification remained elusive. Silaffin-1A 1 is (to our knowledge) the first polypeptide found in nature containing the ε-N,N,N-trimethyl-δ-hydroxylysine residue.
Despite the structural differences of silaffin-1A 1 and silaffin-1A 2, both polycationic peptides show almost identical silica-precipitating activities and promote the formation of silica nanospheres in vitro (see Fig. 7). This result suggests that the silica-precipitating activities of silaffin-1A 1 and silaffin-1A 2 are dependent mainly on the polyamine modification attached to lysine residues. This is consistent with the finding that long-chain polyamines attached to putrescine that were isolated from diatom cell walls are also able to precipitate silica nanospheres (11), whereas synthetic silaffin peptides lacking the lysine modifications are unable to precipitate silica at pH < 7 in vitro (10). In this respect it is interesting to note that silica formation in diatoms takes place in an acidic, intracellular compartment (20), and thus the polyamine moieties of the silaffin-1A peptides appear to be essential to mediate silica precipitation under physiological conditions. The ε-N,N,N-trimethyl-δ-hydroxylysine present in silaffin-1A 1 is a structural element that might influence the ultrastructure of the precipitating silica. Remarkably, quaternary ammonium compounds are used in the technical production of zeolites for patterning of silicate structures in the nanometer size range (21). Possibly, the ε-N,N,N-trimethyl-δ-hydroxylysine residue exerts a similar function in biosilica formation.
The role of the polypeptide backbones in silaffin-1A-mediated silica formation is much less clear. Isolation of silaffins from diatom biosilica involves treatment with anhydrous hydrogen fluoride, which converts silica to volatile silicon tetrafluoride. Although this treatment does not attack peptide bonds, it does specifically cleave O-glycosidic bonds (22). Silaffins contain a large number of hydroxyamino acid residues, which may be targets for O-glycosylation. However, a completely different technique for the extraction of silaffins from biosilica is required to investigate this possibility.
Comparison of the silaffin-1A 1 and silaffin-1A 2 sequences with the sequences deduced from the sil1 gene revealed that during maturation of the silaffins, the C-terminal tetrapeptides RRIL and RRNL, respectively, become cleaved off. This processing step completely removes all arginine residues that are originally present in the silaffin precursor polypeptide sil1p (see Fig. 1B). Remarkably, arginine is the biosynthetic precursor of putrescine (23), and the latter has been shown to serve as the attachment site for long-chain polyamines in C. fusiformis and other diatoms (11). Therefore, it is intriguing to speculate that sil1p is also the precursor for the putrescine-linked polyamines. After conversion of the arginine residues in the silaffin precursor to ornithine residues, the latter may become modified by the same enzymatic machinery that attaches propylamine units to the appropriate lysine residues in silaffins. Subsequently, silaffin peptides and putrescine-based polyamines could be produced simultaneously by proteolytic processing and decarboxylation of the polyamine-modified ornithine residues. If so, sil1p of C. fusiformis would give rise to two different sets of silica-precipitating molecular species.