Composition of Drosophila melanogaster proteome involved in fucosylated glycan metabolism.

The whole genome approach enables the characterization of all components of any given biological pathway. Moreover, it can help to uncover all the metabolic routes for any molecule. Here we have used the genome of Drosophila melanogaster to search for enzymes involved in the metabolism of fucosylated glycans. Our results suggest that in the fruit fly GDP-fucose, the donor for fucosyltransferase reactions, is formed exclusively via the de novo pathway from GDP-mannose through enzymatic reactions catalyzed by GDP-D-mannose 4,6-dehydratase (GMD) and GDP-4-keto-6-deoxy-D-mannose 3,5-epimerase/4-reductase (GMER, also known as FX in man). The Drosophila genome does not have orthologs for the salvage pathway enzymes, i.e. fucokinase and GDP-fucose pyrophosphorylase synthesizing GDP-fucose from fucose. In addition we identified two novel fucosyltransferases predicted to catalyze alpha1,3- and alpha1,6-specific linkages to the GlcNAc residues on glycans. No genes with the capacity to encode alpha1,2-specific fucosyltransferases were found. We also identified two novel genes coding for O-fucosyltransferases and a gene responsible for a fucosidase enzyme in the Drosophila genome. Finally, using the Drosophila CG4435 gene, we identified two novel human genes putatively coding for fucosyltransferases. This work can serve as a basis for further whole-genome approaches in mapping all possible glycosylation pathways and as a basic analysis leading to subsequent experimental studies to verify the predictions made in this work.

Glycans play an important role in a number of biological processes. On the cellular level N-glycans are involved in protein folding, quality control, and targeting (for a review, see Ref. 1). As constituents of the cell membrane and extracellular matrix, glycosylated proteins regulate, through intercellular recognition and signaling, a plethora of biological processes such as fertilization, pattern formation during embryogenesis, hematopoiesis, neuronal development, wound healing, inflammation, tumor cell metastasis, host-microbial interactions, and infection (for reviews, see Refs. 2-9). O-Glycans are involved in signal transduction, regulation of transcription, and translation (for reviews, see Refs. 10 and 11).
Fucose is an essential component of various glycan structures. Among the best known examples of molecules with fucose-containing modifications are the ABO blood group antigens, which carry α1,2-fucosylated lactosamine. Other examples, representing the α1,3-fucosylated modifications, are the sialyl Lewis x glycans, a crucial decoration of selectin ligands. Sialyl Lewis x glycans have a central role in inflammation, initiating extravasation of the leukocytes by mediating their tethering and rolling on the endothelium (12). α1,3-Fucosylation has also been shown to be involved in tumor metastasis via the blood circulation (2). Further, fucosylated proteins play an essential role during the normal development of an organism. For example, the glycosylation of O-linked fucose of the Notch receptor by Fringe has been shown to modulate the Notch-dependent signaling pathway that establishes the dorsoventral boundaries during embryogenesis of Drosophila (13,14).
Fucosylation requires GDP-L-fucose, as the donor of fucose, and fucosyltransferases, which catalyze the transfer of L-fucose to the glycans or directly to serine/threonine residues on the proteins that act as acceptors (15). GDP-L-fucose can be synthesized in vivo either via the de novo pathway from GDP-D-mannose or by the salvage pathway from fucose (16) (see Fig. 1). After synthesis in the cytoplasm, the GDP-L-fucose is transported into the Golgi apparatus by a specific transporter (17,18). A number of enzymes, fucosyltransferases, acting in the Golgi and using GDP-L-fucose as a donor have been characterized in various species (19). Not only the fucosylation but also the degradation of fucosylated glycans is an important step in fucose metabolism. The removal of fucosylated glycans is catalyzed by fucosidases (for a review, see Ref. 20). In humans, two inherited disorders caused by defects of proteins involved in fucosylated glycan metabolism are known. A congenital disorder of glycosylation of type IIc (OMIM accession number 266265) is caused by mutation in the GDP-fucose transporter gene and results in impaired GDP-L-fucose transport to the Golgi (17,18). An impaired lysosomal degradation of fucosylated glycans caused by a defect in the α-L-fucosidase gene (FUCA1) leads to fucosidosis, a recessive autosomal disorder (OMIM accession number 230000; Refs. 21 and 22).
An ever-growing list of completely sequenced genomes offers both a great opportunity and a challenge for systematic in silico searches for every component of any particular pathway. We have initiated an effort to identify genes putatively involved in the metabolism of various monosaccharides. We started out with the characterization of the metabolic pathway of fucosylated glycans in Drosophila melanogaster. The fruit fly is a well-studied model organism, and its genome is the most comprehensively annotated genome of any multicellular organism.
Experimental studies have shown that in the fruit fly fucose can be attached to glycan acceptors through α1,6- and α1,3-linkages (23,24). Recently O-linked fucose-containing glycoforms on the Notch protein have been characterized (13,14). Before this study little was known of the Drosophila fucosylation on the genomic level (for a review, see Ref. 25). Three recently described α1,3-fucosyltransferases are the only characterized genes involved in fucosylation in the fruit fly (24). In addition a putative candidate for the Drosophila GDP-fucose transporter has been inferred from sequence homology (CG9620) (17,18).
In this paper, we have used genome-wide bioinformatics to identify enzymes involved in the synthesis of GDP-L-fucose, in the transfer of its sugar moiety to either glycans or protein acceptors, and in degradation of fucosylated glycans.

EXPERIMENTAL PROCEDURES
The sequences used for homology searches were obtained from the Swiss-Prot and TrEMBL data bases using the Sequence Retrieval System (SRS) at the European Bioinformatics Institute (EBI, www.ebi.ac.uk/ and srs6.ebi.ac.uk/). The identifiers (accession numbers) are listed in Table I. The homology searches were performed at Internet sites offering genome information and tools. Most notably they were EBI, the National Center for Biotechnology Information (NCBI, www.ncbi.nlm.  Madison, WI). DNA sequences were aligned with the program PILEUP (Wisconsin Package, op. cit.) or with ClustalW (version 1.7 (28)) using an identity matrix, a gap weight of 8, and a gap length weight of 0.1. Amino acid sequences were aligned with the same programs using a Blosum32 protein weight matrix, a gap weight of 12, and a gap length weight of 0.5. The DNA alignments were visually examined and edited if needed using the GeneDoc program and corrected to avoid alignments with disrupted reading frames. Trees were constructed from the data using maximum parsimony and neighbor joining using programs from the PHYLIP and Treecon packages and the GCG implementation of PAUP* (Wisconsin Package, op. cit.). Heuristic searches were utilized in parsimony analyses due to the great number of taxa examined. Branch swapping was done by tree bisection-reconnection. For neighbor joining analyses, distance measures (difference scores) were employed using a Kimura two-parameter correction for multiple hits and a transition/transversion rate of two. Bootstrap analyses (not shown) of 1000 replicates were performed to examine the relative support of each relationship in the resultant topologies. GeneDoc and TreeView (32) were used to prepare illustrations of the alignments and the trees. Protein domains were briefly investigated using Multiple EM for Motif Elicitation (MEME) (33).
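The distance correction and bootstrap resampling mentioned above can be illustrated with a minimal sketch. It is not the code used in this work, which relied on the PHYLIP, Treecon, and PAUP* packages. The sketch assumes the standard closed-form Kimura two-parameter estimate computed from the observed proportions of transitions and transversions, whereas the packages named above may instead take a fixed transition/transversion ratio as a parameter. The function names `k2p_distance` and `bootstrap_columns` are hypothetical.

```python
import math
import random

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Assumption: the closed-form estimate from observed proportions of
    transitions (p) and transversions (q); gapped or ambiguous sites
    are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    arg1 = 1 - 2 * p - q
    arg2 = 1 - 2 * q
    if arg1 <= 0 or arg2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(arg1) - 0.25 * math.log(arg2)


def bootstrap_columns(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    n = len(alignment[0])
    cols = [rng.randrange(n) for _ in range(n)]
    return ["".join(seq[i] for i in cols) for seq in alignment]


if __name__ == "__main__":
    # Toy sequences, invented purely for illustration.
    toy = ["ACGTACGTAC", "ACGTACGTAT", "ACATACGTGC"]
    print(k2p_distance(toy[0], toy[1]))
    print(bootstrap_columns(toy, random.Random(0)))
```

In a full analysis each bootstrap replicate would be turned into a distance matrix and a neighbor joining tree, and the fraction of replicates recovering a given grouping would be reported as its support.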

RESULTS AND DISCUSSION
Synthesis of GDP-L-fucose. Fucosylation requires a nucleotide sugar, GDP-L-fucose, as the donor of fucose, and enzymes, fucosyltransferases, which catalyze the transfer of L-fucose onto the glycan or serines/threonines on proteins acting as an acceptor (16). In mammals GDP-L-fucose can be synthesized via two different pathways, either by the prominent de novo pathway or by the minor salvage pathway (Fig. 1).

the conceptual Drosophila sequences with previously known GMD and GMER proteins from a wide evolutionary range of species reveal a high degree of amino acid conservation (Figs. 2 and 3). The Drosophila Gene Collection (35) contains expressed sequence tag sequences corresponding to both genes, showing that they are transcribed (Fig. 2B). Three P-element strains with the transposon (36) residing in the 5′ region of CG8890 (the gmd gene) were identified in FlyBase. Two of these strains (l(2)k10001 and l(2)k1003a), with a transposon inserted within 0.5 kb upstream of the putative coding sequence, are lethal (Fig. 2C), suggesting a vital role for the fruit fly gmd gene.
A salvage pathway is used as an alternative route for synthesizing GDP-fucose from fucose in the absence of the de novo pathway. We searched for the salvage pathway enzymes; however, searches with the mammalian fucokinase and fucose-1-phosphate guanylyltransferase sequences failed to identify any similar sequences in the fly genome. In Drosophila, the presence of salvage GDP-L-fucose synthesis pathways could be experimentally tested using available GMD mutant strains (Fig. 2C).
Probes generated from human FUT1 and FUT2, both with α1,2-specificity (37-39), did not find any significant relatives in the Drosophila genome based on the primary sequence analysis. It is somewhat surprising that, while both bacteria (40) and mammals (41) carry α1,2-specific fucosylation, no signs of genes belonging to this family could be detected in the Drosophila genome. In a nematode, Caenorhabditis elegans, 13 α1,2-fucosyltransferases have been predicted to exist (19). However, our result is in good agreement with the fact that in insects no α1,2-fucosylated glycans have been described so far (for a review, see Ref. 5). It should also be noted that the loss of α1,2-fucosylation found in human H blood group-deficient individuals (Bombay phenotype; OMIM accession number 211100) results in no apparent phenotype (42). The absence of α1,2-fucosylation in Drosophila may be explained by the disappearance of the particular selection pressure that once promoted the emergence of this type of modification (for a review, see Ref. 43).

FIG. 8. Phylogenetic analysis of the known and predicted human (indicated by Fut numbers) and mouse (indicated by *) and putative Drosophila fucosyltransferases (indicated by CG numbers). The analysis clusters the polypeptides based on their distance to each other, measured in numbers of mutations per site. The clustering follows the known enzyme activities indicated by the numbers. The scale indicates distances in number of mutations per site.
FIG. 9. Alignment of the putative O-fucosyltransferases from human and fly (CG-prefixed lines). The alignment emphasizes the groupwise similarity between the "1-type" (upper two) and the "2-type" (lower two) families and the low overall similarity between the families. Three motifs suggested by the MEME motif discovery tool are indicated by Roman numerals I, II, and III.

By aligning all the human α1,3-fucosyltransferases, FUT3-7 and FUT9, we identified four conserved sequences that were used as "baits" to search the fly genome (Fig. 4). This approach yielded three candidate genes from the Drosophila genome (CG4435, CG6869, and CG9169) similar to mammalian genes. The Drosophila Gene Collection (35) contains expressed sequence tag sequences for all of them, showing that they are transcribed.
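The idea of extracting conserved stretches from a multiple alignment for use as search "baits" can be sketched as follows. This is an illustration only: the criterion actually used to select the four conserved sequences is not specified here, and the sketch assumes the strictest possible definition (columns that are identical and gap-free in every sequence). The function name `conserved_blocks`, the `min_len` threshold, and the toy sequences are all hypothetical.

```python
def conserved_blocks(alignment, min_len=4):
    """Return (start, end, motif) for runs of identical, gap-free columns.

    `alignment` is a list of equal-length aligned sequences with '-' as
    the gap symbol; `end` is exclusive. Assumption: a column counts as
    conserved only if every sequence carries the same residue there.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    reference = alignment[0]
    blocks = []
    start = None
    # Iterate one position past the end so a trailing run is closed too.
    for i in range(length + 1):
        conserved = (
            i < length
            and reference[i] != "-"
            and all(seq[i] == reference[i] for seq in alignment)
        )
        if conserved:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                blocks.append((start, i, reference[start:i]))
            start = None
    return blocks


if __name__ == "__main__":
    # Invented toy alignment, not real fucosyltransferase sequences.
    toy = ["MKVLWAAGHT", "MKVLQ-AGHT", "MKVLWTAGHT"]
    for start, end, motif in conserved_blocks(toy):
        print(start, end, motif)
```

In practice a less strict criterion, such as tolerating conservative substitutions scored with a substitution matrix, would usually recover longer baits from divergent families.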
No P-element insertion phenotypes were identified for any of the genes in FlyBase. Alignments of the conceptual translations of CG4435, CG6869, and CG9169 to mammalian fucosyltransferases of known α1,3/1,4-specificity showed several conserved stretches/domains (Fig. 4). Of the three genes, CG9169 was the most divergent as compared with the mammalian transferases. Nevertheless, a search performed with CG9169 listed the human fucosyltransferase FUT3 (44) as the most similar known protein. Pairwise similarity analysis of the CG9169 and FUT3 sequences showed that the area of similarity is located in the carboxyl-terminal part of the protein (Fig. 5). The fact that the Drosophila genome contains several novel candidate α1,3-fucosyltransferases suggests that α1,3-fucosylated glycans are important in the fruit fly.
We searched public sequence data bases to identify human candidates encoding additional fucosyltransferases using as a query the Drosophila sequence CG4435 characterized in this study. The two best candidates represented novel fucosyltransferases and were termed FUT10 and FUT11 (NCBI RefSeq data base accession numbers NT_008076 and NT_024037, respectively). The human FUT10 and FUT11 showed higher sequence similarity to the Drosophila probe than to any characterized human fucosyltransferase sequence (Fig. 6).
The human fucosyltransferase FUT8 has an α1,6-specific activity toward the proximal GlcNAc residue on N-linked glycans. Thus, this enzyme participates in the synthesis of hybrid and complex types of N-glycans on glycoproteins. FUT8 is widely expressed in mammalian tissues, and it has distinctly high expression during fetal development and in liver tumors (45). Our homology searches identified the gene CG2448 as the best ortholog candidate with a significant sequence similarity of over 75% to the human FUT8 (Fig. 7).
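As a rough illustration of how a pairwise percentage such as the one quoted above is derived from an alignment, the following sketch computes percent identity over the aligned (gap-free) positions of two sequences. It is a simplification under stated assumptions: the figure reported for CG2448 and FUT8 is a similarity score, which also counts conservative substitutions under a scoring matrix and is therefore generally at least as high as the identity computed here. The function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aln_a, aln_b):
    """Percent identity over positions where neither sequence has a gap.

    Assumption: the two strings come from the same pairwise alignment
    and use '-' as the gap symbol.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("sequences must have the same aligned length")
    matches = aligned = 0
    for a, b in zip(aln_a, aln_b):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        aligned += 1
        if a == b:
            matches += 1
    if aligned == 0:
        raise ValueError("no aligned positions")
    return 100.0 * matches / aligned


if __name__ == "__main__":
    # Invented toy alignment; four of the five aligned positions match.
    print(percent_identity("MKV-LW", "MKVALQ"))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so any percentage should be read together with the definition used to compute it.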
In an attempt to refine the function of the novel genes and to obtain a firmer handle on their putative specificities, the DNA sequences of the coding regions of the fly, human, and rodent genes were aligned, and all pairwise distances were calculated using standard phylogenetic tools. The resulting unrooted tree visualization of the distances (Fig. 8) confirmed the observations: CG2448 was closely related to human and mouse FUT8 with α1,6-transferase activity, while the other three new fly genes belong to a larger group with α1,3- or α1,3/4-specific transferase activity. Furthermore, our phylogenetic analysis suggests that the two novel human enzymes (FUT10 and FUT11) may belong to an evolutionarily distinct group of fucosyltransferases (Fig. 8). This distinct clustering may reflect a difference either in enzyme activity or in acceptor specificity. Nevertheless, the closest relatives to this group are the α1,3/1,4-fucosyltransferases, so we favor the latter possibility.
Drosophila O-Fucosyltransferases. In addition to glycans, proteins can also be fucosylated directly at serine/threonine residues (for a review, see Ref. 15). O-Linked fucose exists both as a monosaccharide and as an elongated glycoform. It has been shown in Drosophila that the elongation of O-linked fucose (by addition of GlcNAc) of the Notch receptor by Fringe has a crucial role in modulating its affinity toward the Delta and Serrate ligands. Notch signaling is important in establishing the dorsoventral boundaries during embryogenesis (13,14). So far no O-fucosyltransferase, an enzyme providing the acceptor for the reaction catalyzed by Fringe, has been described in Drosophila. Recently a human O-fucosyltransferase has been identified by Wang and Spellman (29). We have searched the Drosophila genome using partial sequence data from this patent and identified two putative candidates for O-fucosyltransferase: CG12366 and CG14789, called O-FUT1 and O-FUT2 here (Fig. 9). Both of these predicted genes are transcribed, as can be deduced from the existence of similar expressed sequence tag sequences in the data bases. No P-element insertions were found in these genes in FlyBase.
Degradation of Fucosylated Glycans. The degradation of fucosylated glycans is an important step in fucose metabolism, and defects in the human α-L-fucosidase gene (FUCA1) result in fucosidosis, a recessive autosomal disorder (21,22). We looked for fucosidases in the Drosophila genome using human α-L-fucosidase (FUCA1) as the "bait" and identified CG6128 as the orthologous gene. The alignment of the Drosophila putative fucosidase with human and rat counterparts (including Q9UJM5, a human protein similar to fucosidase) is shown in Fig. 10.
Conclusions. In this study we have applied computational methods to characterize, in a genome-wide manner, the molecular pathways responsible for fucosylation of cellular proteins in D. melanogaster. We have identified two novel enzymes putatively involved in the synthesis of GDP-L-fucose. Our data suggest that in the fruit fly GDP-L-fucose is formed solely through the de novo synthesis pathway with no salvage pathway present. A putative candidate for the Drosophila GDP-fucose transporter has been previously inferred from sequence homology (CG9620) (17,18). We further identified two novel fucosyltransferases predicted to catalyze α1,3- and α1,6-specific linkages, two O-fucosyltransferases, and one fucosidase gene in the Drosophila genome. The Drosophila genome apparently has no genes encoding α1,2-specific fucosyltransferases. Finally, we have identified two novel human putative N-fucosyltransferase sequences using one of the newly detected Drosophila fucosyltransferases as a query sequence. Our results show that new members of protein families of the same organism can readily be identified using homology searches that traverse the genomes of different species. This work can serve as the basis for further whole-genome approaches in mapping all possible glycosylation pathways, and it can also serve as the basic analysis leading to experimental studies verifying the predictions made in this work.