Sequence Determination of an Extremely Acidic Rat Dentin Phosphoprotein*

The mineralization process associated with the conversion of predentin to dentin is believed to be initiated and controlled by a set of acidic regulatory noncollagenous proteins (NCPs), which include phosphophoryn, the major NCP in dentin. Phosphophoryn binds tightly to collagen and is believed to initiate the formation of apatite crystals, which play a central role in the mineralization process. While analyzing the 3′ end of an odontoblast-specific cDNA that codes for dentin sialoprotein (Ritchie, H. H., Hou, H., Veis, A., and Butler, W. T. (1994) J. Biol. Chem. 269, 3698–3702), we discovered an 801-base pair open reading frame. This downstream open reading frame encodes a putative leader sequence and a very acidic mature protein sequence whose deduced amino acid composition contains high percentages of both Ser (43%) and Asp (31%) residues, closely matching the amino acid composition of phosphophoryns from human, bovine, rat, and rabbit (i.e. Asp (30–40%) and Ser (38–50%)). This newly identified cDNA therefore encodes a protein with characteristics similar to phosphophoryn. Here we present the cDNA sequence, the deduced amino acid sequence, and the prospective Ser residue-specific casein kinase I and II phosphorylation sites for this putative phosphophoryn.

The calcification process that accompanies the transition of predentin to dentin is poorly understood, due in part to the difficulties in isolating and characterizing unique sets of extracellular matrix molecules that contribute to this complex process (1-4). Phosphophoryn, the most abundant noncollagenous protein in dentin, is secreted by odontoblasts through odontoblastic processes and appears at the mineralization front within a short time after labeling with [33P]phosphate (5, 6). Phosphophoryn is known to bind large amounts of calcium with a relatively high affinity (7) and then to form an insoluble aggregate in the presence of Mg2+ and Ca2+ (8). Because of its affinity for calcium, phosphophoryn may concentrate these ions and participate in the formation of apatite crystals. For example, Linde and co-workers (9) have demonstrated that when phosphophoryn is immobilized on a stable support and incubated in physiological solutions of calcium and phosphate, it induces the formation of hydroxyapatite (HAP). Studies by the same group (10) and by Boskey et al. (11) also suggested a dual role for phosphophoryn as both an initiator of HAP formation at low phosphophoryn concentrations and an inhibitor of HAP formation at higher phosphophoryn concentrations.
Phosphophoryn is also believed to have a specific affinity for collagen (2, 12, 13), which comprises as much as 80% of the protein in dentin. Furthermore, phosphophoryn was found to be specifically associated with the "e" band of collagen (14). This site-specific protein-protein interaction, coupled with phosphophoryn's ability to initiate or inhibit HAP formation when calcium is present, has led to the currently accepted view that phosphophoryn plays a central role in the mineralization process by virtue of its ability to target mineralization to selected sites as well as to couple mineralization, temporally, with organ development.
Only a few NH2-terminal residues and small, proteolytically obtained segments of internal amino acid sequence are currently known for phosphophoryn; its complete amino acid sequence has not yet been reported. During the process of analyzing the 3′ end of the cDNA for dentin sialoprotein (DSP), another recently cloned dentin-specific protein (15), an open reading frame with a size of 801 bp was revealed. This open reading frame was found to encode a putative leader sequence and a deduced very acidic mature protein sequence with an amino acid composition composed primarily of Ser (43%) and Asp (31%) residues, which coincides with the amino acid composition of phosphophoryns from human, bovine, rat, and rabbit (i.e. Asp (~30-40%) and Ser (~38-50%)) (2, 16-21).
Here we present the first reported cDNA sequence, the deduced amino acid sequence, and the postulated Ser residue-specific casein kinase I and II phosphorylation sites for this putative phosphophoryn.

EXPERIMENTAL PROCEDURES
RNA Preparation and Reverse Transcription-PCR-The total RNA was extracted from adult rat incisors using RNAzol™ (Biotecx Laboratories, Inc., Houston, TX). A cDNA pool was synthesized from total RNA using an oligo(dT) primer and reverse transcriptase. This cDNA pool was then denatured at 95°C for 5 min and amplified with a primer set comprising an oligoprimer corresponding to rat DSP cDNA nucleotide sequence 1054-1069 (15) and a poly(dT) primer. PCR was then performed as follows: denaturation (1 min at 94°C), reannealing (1 min at 56°C), and amplification (3 min at 65°C), for 40 cycles.
DNA Sequencing-The PCR products recognized by the 3′ end rat DSP probe were subcloned into TA vectors using standard techniques (22). Following the manufacturer's procedures, the Erase-A-Base Kit (Promega, WI) was used to generate unidirectional deletions of the 2-kb insert for DNA sequencing. DNA was sequenced according to Sanger et al. (23).
Northern Blot Analysis-The total RNA from rat incisor was electrophoresed on a 1.2% agarose gel containing 2.2 M formaldehyde (22). RNA was transferred onto nitrocellulose paper and hybridized overnight with a 32P-labeled putative rat phosphophoryn probe at 42°C in 50% formamide and 6× SSC. The filter was washed with 2× SSC and
* This work was supported by National Institutes of Health Grant DE11442-01 (to H. H. R.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence reported in this paper has been submitted to the GenBank™/EBI Data Bank with accession number U63111.

RESULTS
The Deduced Phosphophoryn cDNA Sequence and the Deduced Amino Acid Sequence-Using an oligoprimer corresponding to DSP cDNA nucleotide sequence 1054-1069 (15) and a poly(dT) primer to amplify the cDNA pool from rat incisors, a 2-kb PCR fragment recognized by the 32P-labeled 3′ end rat DSP probe was obtained and subcloned into a TA vector for DNA sequencing by the Erase-A-Base technique. An open reading frame, located immediately downstream from the 3′ end coding region of DSP (i.e. identical to the reported DSP sequence) (15), was identified in this 2-kb insert. This open reading frame contained 801 nucleotides, representing 267 amino acids, including the 27-amino acid putative leader sequence and a 240-amino acid mature protein sequence (Fig. 1). The deduced 27-residue leader sequence is Met-Gly-His-Ser-Arg-Ile-Gly-Ser-Ser-Ser-Asn-Ser-Asp-Gly-His-Asp-Ser-Tyr-Asp-Phe-Asp-Asp-Glu-Ser-Met-Gln-Gly.
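The length bookkeeping in this paragraph can be checked with a few lines of Python. This is only an illustrative sketch: the leader sequence and the 801-nucleotide figure are taken from the text, whereas the variable names and the one-letter conversion table are our own additions.

```python
# Three-letter to one-letter codes for the residues that occur in the leader.
THREE_TO_ONE = {
    "Met": "M", "Gly": "G", "His": "H", "Ser": "S", "Arg": "R", "Ile": "I",
    "Asn": "N", "Asp": "D", "Tyr": "Y", "Phe": "F", "Glu": "E", "Gln": "Q",
}

# Putative leader sequence exactly as listed in the text.
LEADER_3 = ("Met-Gly-His-Ser-Arg-Ile-Gly-Ser-Ser-Ser-Asn-Ser-Asp-Gly-His-"
            "Asp-Ser-Tyr-Asp-Phe-Asp-Asp-Glu-Ser-Met-Gln-Gly")

leader = "".join(THREE_TO_ONE[res] for res in LEADER_3.split("-"))

ORF_NT = 801                        # open reading frame length (from the text)
codons = ORF_NT // 3                # number of triplets in the reading frame
mature_len = codons - len(leader)   # residues left after removing the leader

print(leader, len(leader), codons, mature_len)
```

The leader converts to a 27-letter string, and 801 nucleotides correspond to 267 triplets, leaving 240 residues for the mature protein, in line with the figures quoted above.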
The completed DNA sequence for this clone was found to contain a translation start site (ATG) for the secreted protein at nucleotide position 43 (Fig. 1). At position −3 from the translation ATG start site there is an adenine nucleotide, representative of a Kozak initiation sequence, and there is a purine residue at position +4 (24). The four NH2-terminal amino acids (i.e. Asp-Asp-Pro-Asn) deduced from the cloned DNA sequence (Fig. 1) were identical to those previously reported for mature rat dentin phosphophoryns (25). Hydropathy distribution analysis (not shown) revealed that the protein is extremely hydrophilic. The net charge of the secreted protein (before phosphorylation) was calculated to be −78 with an isoelectric point of 2.95.
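The Kozak context check described here (a purine at −3 and at +4, counting the A of ATG as +1) can be sketched as follows. The function name and the toy sequences are hypothetical and are not taken from the clone; only the two positions examined follow the text.

```python
PURINES = {"A", "G"}

def kozak_context(seq, atg_index):
    """Return (base at -3, base at +4, True if both are purines).

    atg_index is the 0-based index of the A of the ATG; that A is position +1,
    so -3 is three bases upstream and +4 is the base right after the ATG.
    """
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index < 3 or atg_index + 3 >= len(seq):
        raise ValueError("not enough flanking sequence")
    minus3 = seq[atg_index - 3]
    plus4 = seq[atg_index + 3]
    return minus3, plus4, (minus3 in PURINES and plus4 in PURINES)

# Hypothetical toy sequence (NOT the phosphophoryn cDNA): A at -3, G at +4.
toy = "CCGACCATGGGT"
print(kozak_context(toy, 6))
```

Because the second residue of the deduced leader is Gly, whose codons all begin with G, a purine at +4 is what one would expect for this reading frame.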
Amino Acid Composition-Table I compares the deduced rat amino acid percentages obtained from our cDNA to the actual amino acid percentages (%) for phosphophoryns purified from rat, rabbit, bovine, and human (2, 16-20). This putative phosphophoryn contains high percentages of Ser (43%) and Asp (31%) residues, which yield a highly acidic molecule and closely coincide with the amino acid composition of authentic phosphophoryns from rat, rabbit, bovine, and human (i.e. Asp (30-40%) and Ser (38-50%)).
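A percentage composition of this kind is a simple residue count. The sketch below uses a hypothetical helper, `percent_composition`, applied to the six NH2-terminal residues of the mature protein (Asp-Asp-Pro-Asn-Ser-Ser), a fragment far too short to be representative. The final check assumes that the 103 Ser residues cited in the phosphorylation analysis are counted within the 240-residue mature protein.

```python
from collections import Counter

def percent_composition(seq):
    """Return {residue: percentage of all residues} for a one-letter sequence."""
    counts = Counter(seq)
    total = len(seq)
    return {aa: 100.0 * n / total for aa, n in counts.items()}

# NH2-terminal hexapeptide of the mature protein, one-letter code.
n_term = "DDPNSS"
comp = percent_composition(n_term)
print({aa: round(pct, 1) for aa, pct in sorted(comp.items())})

# Assumption: 103 Ser residues in a 240-residue mature protein.
ser_percent = 100.0 * 103 / 240
print(round(ser_percent, 1))
```

Under that assumption the Ser content comes to roughly 43%, matching the value given in Table I.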
Phosphorylation Sites-The core protein of our putative phosphophoryn is acidic in nature (i.e. 31% Asp and 4% Glu) and contains a high content of Ser residues (i.e. 43% Ser; see Table I). Phosphophoryn protein is secreted by odontoblasts in a highly phosphorylated form (2, 6, 26, 27). In vitro studies have demonstrated that membrane-bound forms of casein kinases I and II isolated from osteoblast-like cells can catalyze phosphorylation of nascent dentin phosphophoryns (28). Because Ser residues are the predominant amino acids in our putative phosphophoryn sequence, we examined the potential casein kinase I and II sites, since these ubiquitous enzymes are known to phosphorylate serine and/or threonine residues in a variety of proteins involved in different cellular functions (29, 30).
Casein Kinase I Sites-As a conservative estimate for the number of casein kinase I phosphorylation sites, we have used the consensus sequence (Asp/Glu)-X-X-Ser (29). Phosphophoryn contains 29 putative primary casein kinase I sites (Fig. 2). Interestingly, in some cases, the phosphorylation of the serine residue in this consensus sequence enables casein kinase I to then phosphorylate the serine residue located three positions downstream, i.e. separated by two amino acids from this newly phosphorylated serine (Ser(P)-X-X-Ser). When this secondary target site is phosphorylated, it then determines the tertiary Ser site for casein kinase I and so forth. For example, Ser103 is a primary target for casein kinase I. Based on this phosphorylation mechanism, once Ser103 is phosphorylated, it can trigger phosphorylation of the following 17 target Ser sites ending at Ser154 (1a through 1r) (see Fig. 2). When all of these subsequent secondary, tertiary, and further sites are included, a total of 66 potential phosphorylation sites may be available for casein kinase I.
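The primary-site search and the downstream cascade described above can be illustrated with a minimal sketch. The function `ck1_sites` and the toy peptide are hypothetical; the sketch applies only the two motifs named in the text, (Asp/Glu)-X-X-Ser and Ser(P)-X-X-Ser, and is not the procedure used to produce Fig. 2.

```python
def ck1_sites(seq):
    """Return (primary, cascade) sets of 1-based Ser positions.

    primary: Ser three positions after an Asp/Glu, i.e. (D/E)-X-X-S.
    cascade: further Ser reached by repeated S(P)-X-X-S steps downstream.
    """
    n = len(seq)
    primary = {i + 4 for i in range(n - 3)
               if seq[i] in "DE" and seq[i + 3] == "S"}
    phosphorylated = set(primary)
    frontier = sorted(primary)
    while frontier:
        pos = frontier.pop()
        nxt = pos + 3                      # next candidate, 1-based
        if nxt <= n and seq[nxt - 1] == "S" and nxt not in phosphorylated:
            phosphorylated.add(nxt)
            frontier.append(nxt)
    return primary, phosphorylated - primary

# Hypothetical toy peptide: one acidic residue followed by Ser every third position.
toy = "DAASAASAASAAG"
primary, cascade = ck1_sites(toy)
print(sorted(primary), sorted(cascade))

# Spacing check for the example in the text: Ser103 plus 17 further steps of 3.
chain = list(range(103, 155, 3))
print(chain[0], chain[-1], len(chain))
```

In the toy peptide, Ser4 is the only primary site and Ser7 and Ser10 follow by the cascade. The spacing check gives 18 positions from 103 to 154, which matches one primary site plus 17 subsequent sites (labels 1a through 1r).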
Casein Kinase II Sites-This rat phosphophoryn contains 23 potential primary casein kinase II sites ((Ser/Thr)-X-X-(Glu/Asp)). Many of these sites overlap with the sites for casein kinase I, and 13 sites are specific for casein kinase II. It is plausible that the overlapping sites ensure the phosphorylation of specific phosphophoryn domains that may be crucial to collagen and/or calcium binding activities during dentinogenesis (2, 4, 7, 9, 11-13) (see Fig. 2). For example, Ser154, once phosphorylated, becomes the primary target site of casein kinase II and enables the phosphorylation of Ser151. Following this mechanism, casein kinase II could then phosphorylate 17 serines, from Ser151 to Ser103, spaced at every third residue upstream (i.e. separated by two intervening amino acids) (2a-2r in Fig. 2). Overlapping kinase I and II activities within this Ser-rich domain (103-154) could therefore provide a "safeguard" mechanism to ensure the phosphorylation of this particular domain. There are, in total, 55 potential casein kinase II sites, of which 37 overlap with casein kinase I sites.
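The casein kinase II search runs in the opposite direction, since the acidic determinant lies downstream of the target residue. The following is a minimal sketch under the same caveats as any motif scan: `ck2_sites` and the toy peptide are hypothetical, and the cascade is seeded only from primary casein kinase II sites, which is a simplifying assumption.

```python
def ck2_sites(seq, include_thr=True):
    """Return (primary, cascade) sets of 1-based Ser/Thr positions.

    primary: Ser/Thr three positions before an Asp/Glu, i.e. (S/T)-X-X-(D/E).
    cascade: further Ser/Thr reached by repeated (S/T)-X-X-S(P) steps upstream.
    """
    targets = "ST" if include_thr else "S"
    n = len(seq)
    primary = {i + 1 for i in range(n - 3)
               if seq[i] in targets and seq[i + 3] in "DE"}
    phosphorylated = set(primary)
    frontier = sorted(primary)
    while frontier:
        pos = frontier.pop()
        prev = pos - 3                     # next candidate upstream, 1-based
        if prev >= 1 and seq[prev - 1] in targets and prev not in phosphorylated:
            phosphorylated.add(prev)
            frontier.append(prev)
    return primary, phosphorylated - primary

# Hypothetical toy peptide: Ser every third position, acidic residue at the end.
toy = "GAASAASAASAAD"
primary, cascade = ck2_sites(toy)
print(sorted(primary), sorted(cascade))

# Spacing check for the example in the text: Ser151 back to Ser103 in steps of 3.
upstream = list(range(151, 102, -3))
print(upstream[0], upstream[-1], len(upstream))
```

Here Ser10 is the primary site and Ser7 and Ser4 follow upstream. The spacing check yields 17 positions from 151 down to 103, consistent with the 17 serines mentioned above.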
Overall, 78% of the Ser residues (81 out of 103) could potentially become phosphorylated by casein kinases I and/or II. This number is consistent with the reported 85-87% of phosphorylated Ser residues in native phosphophoryn (2, 21). With this number of phosphoserines, phosphophoryn would carry an additional charge of −130 from phosphate groups alone (i.e. −1.6/Ser(P) or Thr(P)). Furthermore, if 3 of the 9 Thr residues are phosphorylated by casein kinase II, the charge from the combined phosphoserine and phosphothreonine residues would be −134. Therefore, phosphophoryn would carry an overall net charge of −213 at physiological pH. Such a molecule would have a very high capacity for binding divalent cations such as calcium and magnesium, as reported for phosphophoryn (2, 7, 8, 21).
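The charge estimate is straightforward arithmetic and can be reproduced as a sketch. All numerical inputs below come from the text (81 of 103 Ser, 3 Thr, −1.6 per phosphate, −78 for the unphosphorylated protein); the variable names are ours.

```python
CHARGE_PER_PHOSPHATE = -1.6   # per Ser(P) or Thr(P), as stated in the text
CORE_NET_CHARGE = -78         # secreted protein before phosphorylation
SER_TOTAL = 103

ser_p = 81                    # Ser residues predicted to be phosphorylated
thr_p = 3                     # Thr residues assumed phosphorylated by CK II

fraction_ser = ser_p / SER_TOTAL
charge_ser = ser_p * CHARGE_PER_PHOSPHATE
charge_all = (ser_p + thr_p) * CHARGE_PER_PHOSPHATE
net_charge = CORE_NET_CHARGE + charge_all

print(f"Ser phosphorylated: {100 * fraction_ser:.1f}%")
print(f"charge from Ser(P): {charge_ser:.1f}")
print(f"charge from Ser(P) + Thr(P): {charge_all:.1f}")
print(f"net charge: {net_charge:.1f}")
```

The phosphate contributions round to −130 and −134 as stated. The straightforward sum for the net charge comes to about −212, within one unit of the −213 given above; the small difference presumably reflects rounding of the input values.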
Potential N-Glycosylation Sites-Seven potential N-glycosylation sites (Asn-Xaa-(Ser/Thr)) (31, 32) are present in this novel dentin protein at amino acid positions 31, 37, 69, 170, 202, and 261 (Fig. 1). Because of the overlap of potential casein kinase sites and N-glycosylation sites, only positions 31 and 37 are likely to be glycosylated. At the other sites, the Ser residues are more likely to be phosphorylated.
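A scan for the Asn-Xaa-(Ser/Thr) sequon can be sketched with a regular expression. The helper `n_glyc_sequons` is hypothetical and follows the pattern exactly as written in the text, with Xaa taken to be any residue. The test string is assembled from sequence given elsewhere in this paper: the 27-residue leader followed by the six NH2-terminal residues of the mature protein (Asp-Asp-Pro-Asn-Ser-Ser).

```python
import re

def n_glyc_sequons(seq):
    """Return 1-based positions of Asn in every Asn-Xaa-(Ser/Thr) match.

    A lookahead is used so that overlapping sequons are all reported.
    """
    return [m.start() + 1 for m in re.finditer(r"(?=N[A-Z][ST])", seq)]

LEADER = "MGHSRIGSSSNSDGHDSYDFDDESMQG"   # putative leader, one-letter code
N_TERM = "DDPNSS"                        # first six residues of mature protein

print(n_glyc_sequons(LEADER))
print(n_glyc_sequons(LEADER + N_TERM))
```

The leader alone contains no sequon (its single Asn is followed by Ser-Asp), whereas the Asn of Asp-Asp-Pro-Asn-Ser-Ser falls at position 31 when numbering starts at the initiating Met, in agreement with the first position listed above.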
Northern Blot Analysis-We examined newborn rat tooth germs for putative phosphophoryn mRNA expression. To exclude the DSP DNA sequence, we constructed a cDNA probe for 32P random primer labeling containing only the phosphophoryn DNA sequence (i.e. from nucleotide position 208). Northern blot analysis detected multiple transcripts of approximately 4.6 kb in the newborn tooth germs (Fig. 3). Therefore, transcripts for the putative phosphophoryn are expressed in rat tooth germs.
FIG. 2. A, the potential casein kinase I (*1) and II (*2) phosphorylation sites for the putative phosphophoryn. *1a, primary target site (i.e. (Asp/Glu)-X-X-Ser) for casein kinase I (29, 36); *1b-*1r, secondary, tertiary (and so forth) sites (i.e. Ser(P)-X-X-Ser) for casein kinase I (35); *2a, primary target site (i.e. (Ser/Thr)-X-X-(Asp/Glu)) for casein kinase II (29, 39-41); *2b-*2r, secondary, tertiary (and so forth) sites for casein kinase II (i.e. (Ser/Thr)-X-X-Ser(P)). B, the acidic residues and potential acidic regions generated by casein kinases I and II are depicted in boxes. Many of these acidic regions consist of (DD)n, (pSpS)n, and (pSD)n units (where n = 1, 2, or 3). Several significant acidic domains are particularly evident, for example a 21-residue domain (96-116), immediately followed by a series of 2-residue (i.e. mainly pSD) repeats (118-155), and a 24-residue domain (176-199).
FIG. 3. Northern blot analysis. Total RNA isolated from newborn rat tooth germs was subjected to Northern blot analysis (see "Experimental Procedures") and then probed using a 32P-cDNA probe containing only the phosphophoryn DNA sequence (i.e. from nucleotide position 208). Multiple transcripts for putative phosphophoryn were detected near 4.6 kb in the newborn rat tooth germs.

DISCUSSION
During the process of analyzing the 3′ end of an odontoblast-specific cDNA which codes for dentin sialoprotein (15), we discovered an open reading frame with a size of 801 bp. Our newly discovered cDNA was found to encode a novel rat dentin protein whose characteristics agree closely with the following reported features of phosphophoryn: (i) the predicted NH2-terminal amino acid sequence (i.e. Asp-Asp-Pro-Asn) is identical to that derived from protein microsequencing for one form of rat phosphophoryn (25); (ii) the six NH2-terminal amino acids (i.e. Asp-Asp-Pro-Asn-Ser-Ser) obtained from our putative phosphophoryn (Fig. 1) agree with those reported by Reynolds and co-workers (33) (i.e. Asp-Ser(P)-Pro-Asn-Ser(P)-Ser(P)) for bovine phosphophoryn; (iii) amino acid sequences of Asp-Ser and Asp-Ser-Ser were found interspersed in our putative rat phosphophoryn. Additionally, a sequence of Asp-Ser-Ser-Ser-Ser was identified in our putative protein. These sequence combinations of Asp-Ser, Asp-Ser-Ser, and Asp-Ser-Ser-Ser were also observed in the NH2-terminal 50 residues of bovine phosphophoryn (33); and (iv) its deduced amino acid composition contained high percentages of Ser (43%) and Asp (31%) residues, which coincided with the amino acid composition of phosphophoryns from human, bovine, rat, and rabbit (i.e. Asp (~30-40%) and Ser (~38-50%)) (2, 16-21).
Phosphophoryn is the most acidic protein so far discovered. The interspersed arrangement of Ser and Asp residues makes phosphophoryn an excellent substrate for phosphorylation by casein kinases I and II. The 78% of Ser residues in this protein that can potentially be phosphorylated is consistent with the reported 85-87% of Ser(P) in authentic phosphophoryn (21). As discussed previously (see "Results"), many of these phosphorylation reactions occur at secondary and tertiary Ser sites and therefore result from a phosphorylation cascade-type mechanism involving casein kinases I and II operating over similar domains but in opposite directions to ensure complete phosphorylation within these specific Ser-rich domains.
Roach and co-workers (34-36) have reported that threonine can also serve as a substrate for casein kinase I. Based on this information, all 9 threonine residues and an additional 10 serine residues could also be phosphorylated. This mechanism could enable the sequential phosphorylation of a 47-residue acidic patch extending from Ser96 to Ser142. In this case, 88% of the Ser residues (91 out of 103) could potentially become phosphorylated by casein kinases I and/or II.
The full phosphorylation of the presumed phosphorylatable serines in phosphophoryn therefore leads to the generation of many acidic patches consisting of (DD)n, (pSpS)n, and (pSD)n repeat units. Furthermore, it was reported that bovine phosphophoryn can undergo conformational folding in the presence of Cd(II) as well as pH-dependent conformational folding (37, 38). These folding experiments, suggesting that bovine phosphophoryn is composed of (DD)n, (pSpS)n, and (pSD)n structures arranged into polyelectrolytic cluster regions (37), are in agreement with the predicted (DD)n, potential (pSpS)n, and (pSD)n acidic patches shown for our putative phosphophoryn (Fig. 2). Taken together, these observations indicate that our deduced acidic protein likely represents one form of rat phosphophoryn.
By using a cDNA probe containing only the phosphophoryn DNA sequence beginning from position 208, we determined by Northern blot analysis whether this putative phosphophoryn cDNA was indeed transcribed in tooth germ. Multiple transcripts, sized around 4.6 kb, were detected (Fig. 3). The presence of multiple phosphophoryn transcripts may be due to more than one phosphophoryn gene or to alternative splicing. However, it is equally likely that these multiple transcripts are due to the use of multiple polyadenylation signals, as is the case for many other mRNAs encoding extracellular matrix proteins. Further experiments are needed to determine the origin of these multiple transcripts. Nevertheless, our Northern blot strongly suggests that the putative phosphophoryn mRNA was present in the rat tooth germ total RNA pool. Therefore, the DNA sequence for this novel protein is unlikely to be an artifact. Taken together, these findings lead us to conclude that rat tooth germs actively synthesize the mRNA of this putative phosphophoryn.
The presence of both DSP and phosphophoryn DNA sequences in the PCR product could be due to an artifact generated during reverse transcription-PCR or subsequent cloning processes. However, the possibility of a bicistronic gene cannot be excluded. Further work, such as the examination of these two genes at the genomic level, is needed to investigate these possibilities.