Identification of a Drosophila Vitamin K-dependent γ-Glutamyl Carboxylase*

Using reduced vitamin K, oxygen, and carbon dioxide, γ-glutamyl carboxylase post-translationally modifies certain glutamates by adding carbon dioxide to the γ position of those amino acids. In vertebrates, the modification of glutamate residues of target proteins is facilitated by an interaction between a propeptide present on target proteins and the γ-glutamyl carboxylase. Previously, the gastropod Conus was the only known invertebrate with a demonstrated vitamin K-dependent carboxylase. We report here the discovery of a γ-glutamyl carboxylase in Drosophila. This Drosophila enzyme is remarkably similar in amino acid sequence to the known mammalian carboxylases; it has 33% sequence identity and 45% sequence similarity to human γ-glutamyl carboxylase. The Drosophila carboxylase is vitamin K-dependent, and it has a Km toward a model pentapeptide substrate, FLEEL, of about 4 mM. However, unlike the human γ-glutamyl carboxylase, it is not stimulated by human blood coagulation factor IX propeptides. We found the mRNA for Drosophila γ-glutamyl carboxylase in virtually every embryonic and adult stage that we investigated, with the highest concentration evident in the adult head.

The γ-glutamyl carboxylase catalyzes the post-translational modification of specific glutamates to γ-carboxyglutamate (Gla)¹ in a number of proteins. To accomplish this reaction, the γ-glutamyl carboxylase requires, in addition to its peptide substrate, reduced vitamin K, oxygen, and carbon dioxide as cosubstrates. In vertebrates, it appears that the recognition between the nascent protein substrate and enzyme is dependent upon a "docking" reaction between the propeptide of the substrate and the enzyme. In the mollusk Conus, a propeptide also appears to dock the substrate to the carboxylase, but this propeptide is quite different in sequence from the vertebrate propeptides (1).
Gla was first identified in prothrombin as a modified amino acid necessary for the activity of this vitamin K-dependent protein. Historically, Gla has been associated with coagulation factors. However, in recent years, other Gla proteins with varied functions and locations have been identified: two proteins, bone Gla protein (osteocalcin) and matrix Gla protein, have been isolated from bone (2); the Gla protein Gas-6 is known to be a ligand for the receptor tyrosine kinase Axl (3); and two proline-rich Gla proteins, PRGP1 and PRGP2, whose functions are not known, are found in the spinal cord and thyroid gland, respectively (3).
Drosophila melanogaster (the fruit fly) has been used as a model organism because of its relatively small genome, short growth cycle, and long history of study. The Drosophila genome, with approximately 165 million bases, is about 20-fold smaller than that of mammals. Furthermore, the sequence of the Drosophila genome has been completed and will soon be available through GenBank™. One of the most remarkable and valuable characteristics of Drosophila as a model organism is the fact that many of its genes have functional counterparts in humans. For example, the homeobox genes of Drosophila have been used to find their mammalian homologues. In the case of the Drosophila Deformed gene, the human homologue, HOX4B, can functionally replace its Drosophila counterpart (4).
Using the human γ-glutamyl carboxylase (hGC) amino acid sequence as query, we found a region on Drosophila chromosome 3L with strong similarity to human γ-glutamyl carboxylase. We report here the isolation of a cDNA corresponding to this gene. When expressed in Spodoptera insect cells, it produces vitamin K-dependent carboxylase activity. Mammalian substrates are not efficiently carboxylated by Drosophila γ-glutamyl carboxylase (dGC), however, which suggests differences in recognition specificities between Drosophila and mammalian substrates.

EXPERIMENTAL PROCEDURES
GenBank™ Search for Vitamin K-dependent γ-Glutamyl Carboxylase Homologues-The amino acid sequence of hGC was used to search GenBank™.
RT-PCR-Forward primer 1140-F (CTTCATCACCAAGGGCTATA) and reverse primer 1842-R (CAGCATCGTTTTGTTGGTGT) were selected to avoid introns and to give a fragment of about 700 bp (ordered from Life Technologies, Inc.). 100 mg of adult Drosophila in 1 ml of TRIZOL reagent (Life Technologies, Inc., catalogue number 15598) was ground in a mortar, and total RNA was isolated according to the manufacturer's protocol. Total RNA was further treated with RQ1 RNase-free DNase (Promega, catalogue number TB518). First-strand cDNA was synthesized with SuperScript II RNase H reverse transcriptase (Life Technologies, Inc., catalogue number 18064-014) with oligo(dT) 12-18 as primer (Life Technologies, Inc., catalogue number 18418-012). PCR conditions were: predenaturing the template at 94°C for 2 min; 94°C for 1 min, 55°C for 1 min, and 72°C for 1 min for 30 cycles; followed by 72°C for 5 min.
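As a rough sanity check on the primer pair above, the following sketch estimates the GC content and melting temperature of each primer with the Wallace rule (Tm = 2(A+T) + 4(G+C)). The Wallace rule is a coarse approximation chosen here for illustration; the paper does not say how the primers were designed. The amplicon estimate further assumes that the numbers in the primer names (1140 and 1842) are cDNA coordinates, which the text does not state.

```python
# Primer sequences as given in the text.
PRIMERS = {
    "1140-F": "CTTCATCACCAAGGGCTATA",
    "1842-R": "CAGCATCGTTTTGTTGGTGT",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a primer."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Wallace-rule melting temperature estimate in degrees C.

    Tm = 2*(A+T) + 4*(G+C); only a rough guide for short oligonucleotides.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def approx_amplicon_span(forward_name: str, reverse_name: str) -> int:
    """Approximate product size from the primer names.

    ASSUMPTION: the numeric prefix of each name is a cDNA coordinate.
    """
    start = int(forward_name.split("-")[0])
    end = int(reverse_name.split("-")[0])
    return end - start


for name, seq in PRIMERS.items():
    print(name, f"GC={gc_fraction(seq):.2f}", f"Tm~{wallace_tm(seq)} C")
print("approximate span:", approx_amplicon_span("1140-F", "1842-R"), "bp")
```

Under this rule both primers come out at the same estimate, a few degrees above the 55°C annealing step, and the coordinate difference is close to the stated fragment size of about 700 bp.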
RT-PCR Using a Drosophila Expression Panel-Drosophila Rapid-Scan™ gene expression panel (catalogue number DSCC-101) was purchased from OriGene Technologies, Inc. (Rockville, MD). The product contains first-strand cDNAs prepared from different Drosophila tissues and developmental stages. The 12 cDNAs have been normalized against the transcript for RP49 (a constitutively expressed ribosomal protein gene), serially diluted over a 4-log range, arrayed onto a 48-well PCR plate, and dried. PCR conditions were: predenaturing the template at 94°C for 3 min; 94°C for 30 s, 55°C for 30 s, and 72°C for 2 min for 35 cycles; followed by 72°C for 5 min.
* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡ To whom correspondence should be addressed: Dept. of Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-3280. Tel.: 919-962-0597; Fax: 919-962-9266; E-mail: dws@email.unc.edu.
¹ The abbreviations used are: Gla, γ-carboxyglutamic acid; hGC, human γ-glutamyl carboxylase; dGC, Drosophila γ-glutamyl carboxylase; PCR, polymerase chain reaction; RT, reverse transcriptase; bp, base pair(s).
Screening of the Drosophila Adult Head cDNA Library-The Drosophila cDNA libraries were a generous gift from the Rubin Lab (BDGP EST Project, University of California, Berkeley, CA). The adult head library was screened with the 700-bp Drosophila DNA fragment as probe (5). The positive clones were further characterized by sequence analysis.
Expression of γ-Glutamyl Carboxylase in High Five Cells-The full-length cDNA encoding the dGC was cloned into the pVL1392 vector; the recombinant clone with correct orientation and BacVector 3000 were co-transfected into SF9 cells. Plaques of recombinant virus were identified via PCR using the same primer pair that was used for RT-PCR. The selected viruses were used to infect additional cells, which were checked for γ-glutamyl carboxylase activity. Plaques identified to contain recombinant virus were grown and then titered according to the manufacturer's instructions. Expression of the dGC was done by infection of ~2 × 10⁶/ml High Five cells with the recombinant virus at a multiplicity of infection of ~1. After 48 h, cells were collected by centrifugation and stored at −80°C.
Preparation and Assay of Microsomes from High Five Cells-A total of 1.6 × 10⁶ cells from 1.0 liter of culture expressing the recombinant Drosophila carboxylase was washed twice with Buffer A (20 mM phosphate, pH 7.4, 150 mM NaCl, 1× protease inhibitor mixture (6), and 15% glycerol) and resuspended in 100 ml of Buffer A. The sample was homogenized with 15 strokes using a Dounce homogenizer and then sonicated with four 5-s pulses using an Ultrasonic Heat Systems sonicator. Cellular debris was removed by centrifugation at 4000 × g for 15 min, and the supernatant was centrifuged at 105,000 × g for 1 h. The microsomal pellet was resuspended in 20 mM phosphate, pH 7.4, 500 mM NaCl, 1× protease inhibitor mixture, and 15% glycerol and stored at −80°C. Carboxylase activity was assayed as described previously (7).

RESULTS
Identification of the Drosophila Homologue to Human γ-Glutamyl Carboxylase-A "tblastn" search of GenBank™, using the hGC amino acid sequence as query, identified a 1385-bp putative homologue of γ-glutamyl carboxylase in the region 62A10 of D. melanogaster chromosome 3L. In many of the cases where there were conservative amino acid differences between Drosophila and human sequences, the same position was variable in the γ-glutamyl carboxylases of other species as well.
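A tblastn search compares a protein query against a nucleotide database that is translated in all six reading frames. The sketch below illustrates only that translation step, using the standard genetic code, and replaces the scored alignment that BLAST performs with an exact substring match. The DNA string is a made-up toy sequence, not Drosophila genomic sequence, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate from the first base; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna: str) -> dict:
    """Translations for frames +1..+3 and -1..-3."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def frames_containing(dna: str, peptide: str) -> list:
    """Frames whose translation contains the peptide exactly.

    Exact matching is a simplification; tblastn scores inexact alignments.
    """
    return [
        label
        for label, protein in six_frame_translations(dna).items()
        if peptide in protein
    ]


toy_dna = "ATGTTTCTGGAAGAACTG"  # toy sequence, chosen to encode MFLEEL
print(six_frame_translations(toy_dna))
print(frames_containing(toy_dna, "FLEEL"))
```

A real search would be run with the BLAST programs themselves; this sketch is only meant to show why a protein query can locate a coding region on either strand of unannotated genomic DNA.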
To determine whether the γ-glutamyl carboxylase homologue was expressed in Drosophila, we performed RT-PCR to amplify a 700-bp sequence from total Drosophila adult mRNA (Fig. 1, lane 2). Trace genomic DNA was not responsible for the positive result, because no PCR product was obtained when total RNA was used as the template (Fig. 1, lane 1). These results indicate that this γ-glutamyl carboxylase-like sequence is expressed in adult Drosophila.
cDNA Library Screening-To determine which of the three libraries available to us (embryonic, larval/pupal, or adult head) contained our target gene, we performed PCR reactions using ~10⁵ clones from each cDNA library as template and the same primers used for RT-PCR (see above). The expected 700-bp band was amplified from the adult head library but not from the embryo or larval cDNA libraries (data not shown). Seven positive clones were obtained from the 7 × 10⁵ clones that were screened; of these, the two largest clones were chosen for sequence analysis.
Sequence of the Drosophila γ-Glutamyl Carboxylase Gene-The full-length carboxylase cDNA is about 2300 nucleotides, which predicts a protein of 672 residues and a deduced molecular mass of 78.4 kDa (Fig. 2). Using the alignment tool CLUSTALW in Biology WorkBench, we performed a multiple sequence alignment of the Drosophila, mouse, rat, bovine, and human carboxylases (Fig. 3). When the translated Drosophila sequence was compared with human, the alignment found 33% identical amino acids and 45% similar amino acids. Those amino acids designated as "similar" were based on the PAM250 setting (8). There are 13 cysteines in dGC compared with the 10 cysteines in hGC. Three of the cysteines were found to be aligned in all of the known γ-glutamyl carboxylases (Fig. 3), thus making it highly likely that they preserve a critical structure or have an important catalytic function. Compared with the human carboxylase gene, the dGC gene contains two very short introns, which correspond to introns 4 and 7 of the human carboxylase gene. In addition, a consensus promoter sequence is present at 525 bp upstream from the first ATG coding for the amino-terminal methionine.
FIG. 3 legend. From top to bottom, human, mouse, bovine, rat, and Drosophila sequences are shown. "Similar" residues are assigned based on a PAM250 matrix (8). Those residues called "identical" refer to cases where three or more of the aligned amino acids are identical. "Completely conserved" refers to those cases where the amino acids are identical in all five sequences. The three completely conserved cysteines are shown in violet.
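Percent identity and similarity figures such as those quoted above are computed from a pairwise alignment. The following is a minimal sketch of that bookkeeping. It makes several assumptions that the text does not settle: gapped columns are excluded from the denominator, "similarity" counts identical plus conservatively substituted positions, and a crude set of residue groups stands in for the PAM250-based rule the authors used. The aligned strings are toy examples, not carboxylase sequences.

```python
# ASSUMPTION: simple residue groups as a stand-in for a PAM250 threshold.
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def _group_of(residue: str):
    """Return the similarity group containing the residue, or None."""
    for group in SIMILAR_GROUPS:
        if residue in group:
            return group
    return None


def identity_and_similarity(aligned_a: str, aligned_b: str, gap: str = "-"):
    """Percent identity and percent similarity over ungapped columns.

    Both inputs must be rows of the same alignment (equal length).
    Similarity here includes identical positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = conservative = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
        elif _group_of(a) is not None and _group_of(a) == _group_of(b):
            conservative += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    identity = 100.0 * identical / compared
    similarity = 100.0 * (identical + conservative) / compared
    return identity, similarity


identity, similarity = identity_and_similarity("FLEELAK", "FIDEL-W")
print(f"identity {identity:.1f}%, similarity {similarity:.1f}%")
```

Different denominator conventions (alignment length, shorter sequence length, ungapped columns) give different percentages, so reported values are only comparable when the convention is known.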
The dGC gene is located in D. melanogaster chromosome 3L region 62A10. P1 clone DS02777 contains the dGC gene. Using the MapViewer at the Berkeley Drosophila Genome Project site, we found that there is a P-element (EP(3)0304) located about 2,000 bp upstream of the 5′ end of dGC, but no P-element inside the dGC gene has been reported. Moreover, there is a P-glycoprotein/multidrug resistance protein (Mdr50) gene about 500 bp away from the 3′ end of the dGC gene.
Expression Profile of Drosophila γ-Glutamyl Carboxylase-To determine where and when the dGC is expressed, we used the same PCR primers to amplify the 700-bp fragment from Drosophila cDNA prepared at different stages of development and diluted over a 3-log range. Each cDNA was diluted in water to a series of three concentrations (labeled 1×, 0.1×, and 0.01×), with the highest concentration (1×) being approximately 1 ng of cDNA/well. Fig. 4 demonstrates that γ-glutamyl carboxylase is expressed at least 10 times more in the adult head than in any other tissue and that the first measurable expression is after 4 h of embryonic development.
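The dilution series described above can be written out explicitly. The sketch below lists the approximate cDNA amount per well at each dilution and shows one way a semi-quantitative fold difference might be read from such a panel: if one sample still gives a band at a lower relative concentration than another, the ratio of those concentrations is a rough lower bound on the difference in template. This reading is an assumption made for illustration; the text does not describe how the comparison in Fig. 4 was quantified.

```python
def dilution_series(top_ng: float = 1.0, fold: float = 10.0, steps: int = 3):
    """Return (label, ng per well) pairs for a serial dilution.

    Defaults follow the text: three 10-fold steps starting at about 1 ng.
    """
    series = []
    for i in range(steps):
        relative = 1.0 / fold ** i
        series.append((f"{relative:g}x", top_ng * relative))
    return series


def min_fold_difference(lowest_detected_a: float, lowest_detected_b: float) -> float:
    """Rough lower bound on how much more template sample A has than sample B.

    Each argument is the lowest relative concentration at which that
    sample still gives a visible band (ASSUMED read-out, for illustration).
    """
    return lowest_detected_b / lowest_detected_a


for label, ng in dilution_series():
    print(f"{label}: ~{ng:g} ng cDNA/well")

# Hypothetical read-out: sample A visible down to 0.01x, sample B only to 0.1x.
print("fold difference >=", round(min_fold_difference(0.01, 0.1), 6))
```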
Drosophila Carboxylase Activity-To examine whether the Drosophila homologue to hGC has vitamin K-dependent activity, we expressed the dGC in SF9 cells and examined its activity in an in vitro assay. Fig. 5 shows that, in the presence of vitamin KH2, carboxylase activity is 11-fold higher in dGC-transfected cells than in mock-transfected SF9 cells. Using kinetic studies, the Km of dGC toward FLEEL was determined to be 4 mM (Fig. 6), which is similar to that of hGC. Through a time course, we found that the carboxylase activity for dGC was linear for at least 2 h (data not shown). It should also be noted that the propeptide of human factor IX fails to activate carboxylation of FLEEL by dGC (data not shown).
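To make the Km determination concrete, the sketch below shows one common way to estimate Km and Vmax from initial-rate data: a Lineweaver-Burk (double-reciprocal) linear fit of 1/v against 1/[S], which follows from the Michaelis-Menten equation v = Vmax[S]/(Km + [S]). The text does not state which fitting method was used, so this is an illustration rather than the authors' procedure. The data are synthetic and noise-free, generated with Km = 4 mM and an arbitrary Vmax; they are not the measurements behind Fig. 6.

```python
def michaelis_menten(s: float, vmax: float, km: float) -> float:
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def fit_lineweaver_burk(substrate, rates):
    """Estimate (Km, Vmax) by least squares on 1/v versus 1/[S].

    The line is 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax, so
    Vmax = 1/intercept and Km = slope * Vmax.
    """
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# Synthetic FLEEL concentrations (mM) and rates (arbitrary units).
fleel_mM = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
synthetic_rates = [michaelis_menten(s, vmax=100.0, km=4.0) for s in fleel_mM]

km_est, vmax_est = fit_lineweaver_burk(fleel_mM, synthetic_rates)
print(f"Km ~ {km_est:.2f} mM, Vmax ~ {vmax_est:.1f} (arbitrary units)")
```

The double-reciprocal transform weights low-substrate points heavily, so with noisy real data a direct nonlinear fit of the Michaelis-Menten equation is often preferred.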

DISCUSSION
There are several interesting implications resulting from our observation that γ-glutamyl carboxylase is present in Drosophila. First, this discovery should contribute to the understanding of the evolutionary relationship among the coelomate phyla. Second, it may be a key to clarifying the relationship between structure and function in hGC. Third, the genetics of Drosophila may facilitate the discovery of additional properties of various important Gla proteins and, furthermore, may promote the identification of the vitamin K-epoxide reductase enzyme.
It is now clear that functional γ-glutamyl carboxylase enzymes exist in the phyla chordata, mollusca (7), and arthropoda. For convenience, a simplified evolutionary tree outlining the phyla discussed in this paragraph is shown in Fig. 7. From the phylogenetic tree it is apparent that the carboxylase arose in evolution before the protostome/deuterostome split. Thus, one would expect to find the γ-glutamyl carboxylase in the annelida and echinodermata and in some more primitive organisms as well. Further research to determine where γ-glutamyl carboxylase first appears in evolution will be important and interesting. A search of the complete genomic sequence of a member (Caenorhabditis) of the pseudo-coelomate phylum nematoda revealed no significant homology to γ-glutamyl carboxylase. This lack of a homologue to the carboxylase is inconsistent with the notion that nematodes (9) are closely related to the arthropods. This is probably the most interesting evolutionary aspect of our observations. However, one must remember that each organism is only one representative of thousands of diverse organisms within its phylum and that Caenorhabditis is rapidly evolving and highly specialized; therefore, it is possible that other nematodes possess γ-glutamyl carboxylase, whereas some members of the phyla arthropoda and mollusca may lack it.
The comparison of sequences of γ-glutamyl carboxylase from different species has additional significance in understanding the relationship between the structure and function of the enzyme. For example, one would expect that the regions of the protein with the most sequence identities would have some crucial purpose relating to either the structure or function of the γ-glutamyl carboxylase. Thus, the substrate recognition site should be conserved because glutamic acid must be recognized by all species. Similarly, one would expect the vitamin K-binding site to be conserved. The expectation that the vitamin K site is conserved is less certain, however, because Drosophila may use a slightly different vitamin K hydroquinone.
Vitamin K1 hydroquinone does stimulate carboxylation by dGC; therefore, if a different quinone is used, it must be very similar to vitamin K1. In the mammalian enzyme, substrate affinity appears to be a result of its propeptide docking it to the enzyme rather than the affinity of the Gla domain per se. However, the dGC appears not to recognize a mammalian propeptide and in this respect is similar to the carboxylase of Conus. It has been shown that the mammalian propeptide is either poorly recognized or not recognized at all by the Conus carboxylase. Nevertheless, a hydrophobic propeptide different from the commonly known mammalian propeptides appears to be present in the substrate of the Conus carboxylase (1, 10). Therefore, one might expect to find a propeptide-binding site on the dGC that is not homologous to the hGC-binding site but that is located in the same region we previously determined to be the location of the mammalian site (11). One major difference between the Drosophila and mammalian γ-glutamyl carboxylases is that the dGC is 86 amino acids shorter than the human enzyme at the carboxyl terminus. Notably, this is approximately the number of amino acids that Roth et al. (12) found could be removed from the carboxyl terminus of hGC without seriously affecting its enzymatic activity. A comparison of conserved hydrophobic amino acids between these species is also instructive. It has been known, since a number of related proteins were crystallized, that the packing of amino acids in the core of a protein must be conserved for enzyme stability (13). Also, it has since become clear that many protein-protein interactions take place through hydrophobic patches on the surface of a protein (14). Thus, examining the stability and enzymatic activity of mutations in conserved hydrophobic regions is expected to reveal whether these are surface regions or buried ones and whether they are important in particular substrate or peptide interactions.
It is interesting that the Drosophila head has the highest level of γ-glutamyl carboxylase expression. Although Gla proteins traditionally have been associated with coagulation, there has been considerable evidence for several years that certain Gla proteins serve other important functions. For example, it has been recognized since the mid-1970s that warfarin therapy during pregnancy results in characteristic embryopathies, including hypoplasia and a phenotype similar to chondrodysplasia punctata. In addition, blindness and mental retardation occur in a significant number of cases (15). The importance of carboxylation in bone was implied by the isolation of osteocalcin and matrix Gla protein from bone (2). The recent demonstration that mice lacking matrix Gla protein died shortly after birth because of rupture of calcified arteries (16) provides further proof for the more general importance of γ-glutamyl carboxylase. Still another class of Gla proteins was identified when a gene selected during growth arrest of cells in culture was found to code for a protein containing a Gla domain. This protein, called Gas-6 (17), has subsequently been shown to serve as a ligand for the receptor tyrosine kinase Axl (3). It was also found that protein S, a Gla protein whose absence or defect results in severe thrombotic events (18), is a ligand for Tyro-3, which is closely related to Axl. Both proteins are expressed in neurons (3, 19) as well as a number of other tissues. Two additional Gla proteins, proline-rich Gla proteins 1 and 2 (PRGP1 and PRGP2), were identified by searching the human genome data base for sequences homologous to the Gla domain of coagulation proteins (20). Although the function of neither protein is known, both are expressed widely, with PRGP1 having its highest levels in neural tissues and PRGP2 having its highest levels in thyroid tissue.
Until now, we have been unable to identify a Gla domain in the Drosophila genomic sequence that is recognizably similar to the Gla domains of most mammalian Gla proteins. A consensus sequence was made using Emotif (21) to scan the domain sequences of blood coagulation factors IX and X, bone Gla protein, and matrix Gla protein. This consensus sequence, EXX(E/R)EXCXXXXXXXX(L/F/Y)XXXXXXXX(A/F)(Y/W)XX(F/Y/H) (where X equals any amino acid and the letters within parentheses indicate the amino acids found at that position in the pattern), identified all of the known Gla proteins, including osteocalcin and matrix Gla protein. However, a search of the Drosophila data base using this pattern failed to identify any recognizable Gla domain. Failure to find a Gla domain similar to the canonical Gla domain of coagulation proteins is not, however, evidence against the presence of Gla proteins in Drosophila. It has been shown in mammals that attaching a propeptide of a vitamin K-dependent protein to random substrates results in the carboxylation of those substrates, both in vivo and in vitro (22, 23). This suggests that the structure of the Gla domain exists, not for recognition by the γ-glutamyl carboxylase, but because it is a characteristic necessary for the function of the particular vitamin K-dependent protein itself. Thus, it is likely that still other vertebrate Gla proteins, which bear no resemblance to the known Gla domains, remain to be identified. Because of its easily accessible genetics, Drosophila presents a unique opportunity to identify these new Gla proteins and then, via homology cloning, to identify the same proteins in mammals. The identification of these sequences is likely to lead to an understanding of additional roles for vitamin K-dependent proteins in mammalian cells.
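The consensus pattern quoted above maps directly onto a regular expression: X becomes "any residue" and each parenthesized set becomes a character class. The sketch below performs that conversion and scans a protein string for matches. The helper names are hypothetical, the pattern is written without the line-break hyphen, and the test sequence is an artificial string built to satisfy the pattern, not a real Gla protein. How Emotif itself represents or searches patterns is not described in the text, so this is only an equivalent-looking illustration.

```python
import re

# Consensus pattern from the text (X = any amino acid).
GLA_MOTIF = "EXX(E/R)EXCXXXXXXXX(L/F/Y)XXXXXXXX(A/F)(Y/W)XX(F/Y/H)"


def motif_to_regex(motif: str) -> str:
    """Convert the paper-style motif into a regular expression string.

    "(E/R)" becomes "[ER]" and "X" becomes ".".
    """
    pattern = re.sub(
        r"\(([A-Z/]+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        motif,
    )
    return pattern.replace("X", ".")


def find_motif(motif: str, protein: str):
    """Return (start index, matched text) for non-overlapping matches."""
    regex = re.compile(motif_to_regex(motif))
    return [(m.start(), m.group(0)) for m in regex.finditer(protein.upper())]


# Artificial sequence constructed to contain one match (illustration only).
core = "EAAEEAC" + "A" * 8 + "L" + "A" * 8 + "AY" + "AA" + "F"
toy_protein = "MK" + core + "GG"

print(motif_to_regex(GLA_MOTIF))
print(find_motif(GLA_MOTIF, toy_protein))
```

Scanning a whole proteome would apply `find_motif` to every predicted protein; a rigid pattern like this finds only domains that keep the exact spacing, which is consistent with the caution in the text that a negative result does not rule out Gla proteins.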
In summary, we have identified and made preliminary characterizations of a functional γ-glutamyl carboxylase in Drosophila. This enzyme can utilize reduced vitamin K1 as a substrate, making it very likely that the powerful genetics of Drosophila can be used to find the gene, still not known, for the vitamin K reductase. It will also aid in our understanding of evolution and in understanding the mechanism by which the carboxylase modifies its substrates.