Molecular Cloning of Pigeon UDP-galactose:β-d-Galactoside α1,4-Galactosyltransferase and UDP-galactose:β-d-Galactoside β1,4-Galactosyltransferase, Two Novel Enzymes Catalyzing the Formation of Galα1–4Galβ1–4Galβ1–4GlcNAc Sequence*

We previously found that pigeon IgG possesses unique N-glycan structures that contain the Galα1–4Galβ1–4Galβ1–4GlcNAc sequence at their nonreducing termini. This sequence is most likely produced by putative α1,4- and β1,4-galactosyltransferases (GalTs), which are responsible for the biosynthesis of the Galα1–4Gal and Galβ1–4Gal sequences on the N-glycans, respectively. Because no such glycan structures have been found in mammalian glycoproteins, the biosynthetic enzymes that produce these glycans are likely to have distinct substrate specificities from the known mammalian GalTs. To study these enzymes, we cloned the pigeon liver cDNAs encoding α4GalT and β4GalT by expression cloning and characterized these enzymes using the recombinant proteins. The deduced amino acid sequence of pigeon α4GalT has 58.2% identity to human α4GalT and 68.0 and 66.6% identity to putative α4GalTs from chicken and zebra finch, respectively. Unlike human and putative chicken α4GalTs, which possess globotriosylceramide synthase activity, pigeon α4GalT preferred to catalyze formation of the Galα1–4Gal sequence on glycoproteins. In contrast, the sequence of pigeon β4GalT revealed a type II transmembrane protein consisting of 438 amino acid residues, with no significant homology to the glycosyltransferases so far identified from mammals and chicken. However, hypothetical proteins from zebra finch (78.8% identity), frogs (58.9–60.4%), zebrafish (37.1–43.0%), and spotted green pufferfish (43.3%) were similar to pigeon β4GalT, suggesting that the pigeon β4GalT gene was inherited from the common ancestors of these vertebrates. The sequence analysis revealed that pigeon β4GalT and its homologs form a new family of glycosyltransferases.

The structures of glycans attached to glycoproteins or glycolipids vary greatly, with diverse glycosidic linkages arising from sialylation, fucosylation, galactosylation, and/or other glycan modifications. Different glycan structures are often expressed in a species-specific fashion in animals and plants and are thought to be involved in species-specific interactions between hosts and foreign organisms, such as bacteria, viruses, and parasites (1,2). The various kinds of species-specific glycans seem to have been generated, maintained, and/or lost during the evolution and diversification of animals, but the evolutionary process underlying glycan diversity remains to be elucidated because of the lack of sufficient information (3).
Pigeon IgG possesses another unique glycan sequence, Galα1-4Galβ1-4Galβ1-4GlcNAc, at the nonreducing termini of N-glycans (12). The same sequence is found in O-glycans from the salivary gland mucin of the Chinese swiftlet (13). This fact indicates that these birds express not only the Galα1-4Gal sequence but also the Galβ1-4Gal sequence. Unlike Galα1-4Gal, the distribution of the Galβ1-4Gal sequence among various avian species remains to be investigated. Among mammals, it has been reported that pigs express glycolipids containing the Galβ1-4Galβ1-4Glc-Cer sequence at an internal position (14), but Galβ1-4Gal is not found in glycoproteins in mammals. In contrast, high titers of natural antibodies against Galβ1-4Gal were detected in human sera (15), suggesting that there may be some (Galβ1-4Gal)-containing antigens, either microbes or macromolecules, in human habitats.
* This work was supported by Grant-in-aid for Scientific Research 18770081
In this study, to understand the molecular basis of species-specific glycan differentiation, we isolated cDNAs encoding pigeon α4GalT(Gal) and β4GalT(Gal), which are responsible for the production of Galα1-4Gal and Galβ1-4Gal on glycoproteins, respectively. The deduced amino acid sequence of pigeon α4GalT(Gal) is homologous to those of human and putative chicken α4GalT(Gal)s. However, by comparing these α4GalT(Gal)s, we demonstrated that the substrate specificity of pigeon α4GalT(Gal) is distinct from those of the mammalian and chicken enzymes. In contrast, the deduced amino acid sequence of pigeon β4GalT(Gal) does not resemble those of known members of the β4GalT(GlcNAc) family, which produce Galβ1-4GlcNAc in vertebrates, or those of other known glycosyltransferases. However, genes encoding "hypothetical proteins" with sequence similarity to pigeon β4GalT(Gal) were found in the molecular databases of frogs and fishes as well as zebra finch, suggesting that the expression of Galβ1-4Gal in birds is genetically related to that in these vertebrates. Moreover, the results of sequence analysis revealed that pigeon β4GalT(Gal) and its homologs form a new family of glycosyltransferases.

EXPERIMENTAL PROCEDURES
Additional Experimental Procedures are described in the supplemental material.
In general, to detect the activity of α4GalT(Gal) or β4GalT(Gal) and to determine their biochemical properties, a 2-aminopyridine (PA)-derivatized biantennary N-glycan (PA-N-glycan A in Fig. 1) was used as an acceptor substrate and analyzed by normal phase HPLC using an Amide-80 column (Tosoh Co., Tokyo, Japan). Synthetic glycans, such as p-nitrophenyl β-D-lactopyranoside (Lac-pNP), were also used as acceptor substrates for kinetic analysis, NMR measurements, and methylation analysis. To determine the specificity for acceptor substrates, free monosaccharides or oligosaccharides were incubated with UDP-[³H]Gal in the presence of pigeon α4GalT(Gal) or β4GalT(Gal).

RESULTS
Expression Cloning of Pigeon Liver α4GalT(Gal) and β4GalT(Gal) cDNAs-We selected 293T cells as the host recipient cells and expected Galα1-4Galβ1-4GlcNAc structures to be expressed on the cell surfaces when cDNA encoding α4GalT(Gal) was transfected into these cells. Pigeon liver was utilized as a source of mRNA to construct the cDNA library for expression cloning, because we detected (Galα1-4Gal)-containing glycoproteins in this tissue (12). After selection by cell sorting using anti-P1 mAb (specific for Galα1-4Galβ1-4GlcNAc), the plasmid designated pcDNA-pigeon-α4GalT(Gal) was isolated.
Sequences of the Pigeon α4GalT(Gal) and β4GalT(Gal)-The entire nucleotide sequence of the inserted cDNA in pcDNA-pigeon-α4GalT(Gal) consisted of 2263 bp with a single open reading frame encoding a protein of 360 amino acids (supplemental Fig. S1). As found in many other Golgi-resident glycosyltransferases, which possess type II transmembrane topology, the cytoplasmic tail of pigeon α4GalT(Gal) is most likely located toward the N terminus and followed by the transmembrane domain, whereas the putative stem region and the catalytic domain are toward the C terminus. Three potential N-glycosylation sites were present in the stem and/or catalytic regions, and the DXD motif was found in the catalytic domain. The gene products with the highest identity to pigeon α4GalT(Gal) were found in the NCBI Protein Database for two avian species, chicken and zebra finch (Table 1 and supplemental Fig. S2). There are three sequences in zebra finch and one sequence in chicken, and these sequences were annotated "similar to α1,4-galactosyltransferase." The BLAST analysis also revealed human (16,17), mouse (18), and rat (19) α4GalT(Gal)s as homologous proteins, as well as some other predicted proteins from mammals that were annotated "similar to α1,4-galactosyltransferase." The pigeon α4GalT(Gal) also has some homology with the human α1,4-N-acetylglucosaminyltransferase (36.3% identity; accession number AAD48406) and putative α1,4-N-acetylglucosaminyltransferases from chicken (34.4%, accession number XP_426692) and zebra finch (31.5%, accession number XP_002189476). These results suggest that the newly isolated clone from pigeon liver also belongs to the α4GalT family.
The entire sequence of the insert of pcDNA-pigeon-β4GalT(Gal) revealed a 1690-bp fragment with a single open reading frame encoding a protein of 438 amino acids with type II transmembrane topology (supplemental Fig. S3). Five potential N-glycosylation sites were present in the stem and/or catalytic regions, and the DXD motif was found in the catalytic domain. The gene products in the NCBI Protein Database with the highest identity to pigeon β4GalT(Gal) were the hypothetical proteins from zebra finch, one of the birds belonging to Passeriformes (Table 1 and supplemental Fig. S4). It is notable that all the other gene products with high levels of identity are hypothetical proteins from frogs and fishes, as shown in Table 1. The sequence of pigeon β4GalT(Gal) also has some identity (20-26%) with hypothetical proteins from several insects, such as the honeybee, jewel wasp, red flour beetle, and fruit fly, and from nematodes. In contrast, no homologous proteins were found in the NCBI DNA/Protein Databases (as of September 2009) of mammals and chicken. None of the proteins with known functions, including β4GalT(GlcNAc), showed significant homology with pigeon β4GalT(Gal) in a BLAST search.
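The sequence features described above can be illustrated with a minimal Python sketch. It relates protein length to the length of the coding sequence (three nucleotides per residue plus one stop codon) and scans a peptide for the two motifs mentioned: the DXD motif (Asp, any residue, Asp) and the N-glycosylation sequon N-X-S/T, where X is any residue except Pro. The function names and the toy peptide are hypothetical and serve only as an illustration; they are not the pigeon sequences, which are given in the supplemental figures.

```python
import re
from typing import List


def coding_length_bp(n_residues: int) -> int:
    """Nucleotides needed to encode n_residues plus one stop codon."""
    return 3 * (n_residues + 1)


def find_motif(seq: str, pattern: str) -> List[int]:
    """Return 1-based start positions of (possibly overlapping) matches."""
    # A zero-width lookahead reports overlapping hits, e.g. both DXD in "DDDD".
    return [m.start() + 1 for m in re.finditer(f"(?={pattern})", seq)]


DXD = "D.D"            # Asp, any residue, Asp
SEQUON = "N[^P][ST]"   # Asn, any residue except Pro, then Ser or Thr

# Hypothetical toy peptide, NOT a sequence from the paper.
toy_peptide = "MKNGSAADLDPNPSWNVTR"

print(coding_length_bp(360))            # 1083
print(coding_length_bp(438))            # 1317
print(find_motif(toy_peptide, DXD))     # [8]
print(find_motif(toy_peptide, SEQUON))  # [3, 16]
```

Under this arithmetic, proteins of 360 and 438 residues need about 1083 and 1317 bp of coding sequence, both shorter than the 2263-bp and 1690-bp inserts, so the remainder of each insert is presumably untranslated sequence.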
Structural Analysis of the Glycans Produced by the Action of Pigeon α4GalT(Gal) and β4GalT(Gal)-Triton X-100 extracts (0.2 and 1 mg of protein/ml) of 293T cells transfected with pcDNA-pigeon-α4GalT(Gal) transferred one or two Gal residues from UDP-Gal to PA-derivatized oligosaccharides with a β4-galactosylated biantennary structure (PA-N-glycan A in Fig. 1), as revealed by normal phase HPLC analysis (Fig. 2), whereas the extracts of wild-type cells or mock transfectants did not transfer Gal residues under the same conditions (data not shown). No Gal transfer was observed in the absence of UDP-Gal. Affinity-isolated FLAG-tagged soluble pigeon α4GalT(Gal) also transferred one or two Gal residues to the PA-N-glycan A substrate in the presence of UDP-Gal (data not shown). Matrix-assisted laser desorption/ionization time of flight-mass spectrometry analysis revealed that the m/z values of the substrate and the final product were 1888 and 2213, respectively, confirming that two Gal residues were added to the substrate. When the product of soluble pigeon α4GalT(Gal) was digested with α-galactosidase from green coffee beans, the m/z value changed to 1888, exactly the same as that of PA-N-glycan A (Fig. 1). These results suggest that two α-Gal residues were added to the substrate by the action of the recombinant enzyme, and the structure of the product is most likely PA-N-glycan B (Fig. 1). No products of α4GalT(Gal) were detected when UDP-GalNAc instead of UDP-Gal was used as a donor, whereas products with UDP-Glc (14% of those with UDP-Gal) or UDP-GlcNAc (1.8%) were detected. FLAG-tagged soluble pigeon α4GalT(Gal) also transferred one Gal residue to synthetic monosaccharide or disaccharide substrates, such as p-nitrophenyl 2-acetamide-2-de[…] (Fig. 2B), p-nitrophenyl β-D-lactopyranoside (Galβ1-4Glcβ1-pNP, Lac-pNP) (Fig. 2C), and Galβ1-pNP (data not shown).
The enzymatic products were 162 mass units larger than the substrates (data not shown), confirming that the products contain one additional Gal residue. The linkage of the newly formed Gal-Gal sequence on Lac-pNP was confirmed to be Galα1-4Gal by NMR (supplemental Fig. S5) and methylation analysis (supplemental Fig. S6), as described in the supplemental material. These results clearly indicate that the recombinant enzyme acts as an α4GalT(Gal).
The enzymatic activity of pigeon β4GalT(Gal) was detected using the cell lysates of transfectants and the FLAG-tagged soluble form of pigeon β4GalT(Gal) by the same in vitro assay as described above for pigeon α4GalT(Gal). The HPLC profiles of the products formed from PA-N-glycans or pNP-glycans by the action of β4GalT(Gal) (data not shown) were similar to those of α4GalT(Gal) shown in Fig. 2. However, the products of β4GalT(Gal) were digested by β4-galactosidase from Streptococcus pneumoniae. For instance, the product formed from PA-N-glycan A (Fig. 1) by β4GalT(Gal), which is most likely PA-N-glycan C (m/z = 2213), was digested with β4-galactosidase and became PA-N-glycan D (m/z = 1564). These results suggest that two β-Gal residues were added to the substrate, PA-N-glycan A, by the action of the recombinant enzyme. No products of β4GalT(Gal) were detected when it was incubated with UDP-GalNAc, UDP-Glc, or UDP-GlcNAc instead of UDP-Gal. FLAG-tagged soluble pigeon β4GalT(Gal) also transferred one Gal residue to synthetic disaccharide substrates, such as LacNAc-pNP and Lac-pNP (data not shown but similar to Fig. 2, B and C, respectively), but not to Galβ1-pNP (data not shown). The linkage of the newly formed Gal-Gal sequence on Lac-pNP was confirmed to be Galβ1-4Gal by NMR (supplemental Fig. S5) and methylation analysis (supplemental Fig. S6), as described in the supplemental material. These results clearly indicate that the recombinant enzyme acts as a β4GalT(Gal).
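The mass spectrometry reasoning in this section reduces to simple arithmetic: each added Gal residue shifts the mass by 162 units, so the number of residues gained or lost can be estimated from the difference between two m/z values. The following minimal Python sketch applies that arithmetic to the m/z values quoted in the text. The function name and the tolerance of 2 mass units are assumptions made for illustration, not parameters taken from the paper.

```python
HEXOSE_RESIDUE = 162.0  # mass units per Gal residue, as stated in the text


def hexoses_from_shift(mz_before: float, mz_after: float, tol: float = 2.0) -> int:
    """Estimate how many hexose residues account for an m/z difference.

    tol is a hypothetical tolerance for rounding in the reported m/z values.
    """
    delta = abs(mz_after - mz_before)
    n = round(delta / HEXOSE_RESIDUE)
    if abs(delta - n * HEXOSE_RESIDUE) > tol:
        raise ValueError(f"shift of {delta} is not a multiple of a hexose residue")
    return n


# PA-N-glycan A (1888) to the final galactosyltransferase product (2213)
print(hexoses_from_shift(1888, 2213))  # 2
# PA-N-glycan C (2213) to PA-N-glycan D (1564) after beta4-galactosidase
print(hexoses_from_shift(2213, 1564))  # 4
```

The second result would be consistent with the β4-galactosidase removing the two newly added β-Gal residues together with the two β4-linked Gal residues already present on PA-N-glycan A.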
Biochemical Properties of Pigeon α4GalT(Gal) and β4GalT(Gal)-As shown in Fig. 3, the highest activities of α4GalT(Gal) and β4GalT(Gal) were observed under neutral conditions, around pH 7.0. The enzymatic activities of α4GalT(Gal) and β4GalT(Gal) were cation-dependent, as summarized in Table 2.
When α4GalT(Gal) was added to the reaction mixtures, most of the monosaccharides or oligosaccharides containing Gal residues accepted additional Gal residues (Table 3). In contrast, no incorporation of Gal was observed into monosaccharides or oligosaccharides that do not possess any Gal residues, such as GalNAc, GlcNAc, Man, Glc, Fuc, or maltooligosaccharides. As expected, LacNAc was one of the preferred acceptors and was used as a standard to indicate the relative activities of the other acceptor substrates. Galβ1-4Gal and Galβ1-4Galβ1-4Glc were notably better acceptors than LacNAc. Galβ1-4Glc and Galβ1-4Man sequences were, however, less preferred substrates, although they both possess the same Galβ1-4Hex sequence as Galβ1-4Gal. β-Galactosides were clearly better substrates than α-galactosides, as seen in Galβ1-methyl versus Galα1-methyl activity or Galβ1-4Gal versus Galα1-4Gal activity. A type II linkage (Galβ1-4GlcNAc) was preferred to a type I linkage (Galβ1-3GlcNAc), and the Galβ1-6GlcNAc linkage was an even better acceptor than the type II linkage, although the Galβ1-6GlcNAc linkage is not as common as the type II linkage in vertebrates. Fucosylation at either the inner GlcNAc (Leᵃ and Leˣ) or the outer Gal (H-trisaccharide and lacto-N-fucopentaose I) significantly decreased the incorporation of Gal residues, suggesting that fucosylation sterically hinders the action of pigeon α4GalT(Gal).
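The use of LacNAc as a standard can be sketched as a simple normalization: the incorporation measured for each acceptor is divided by the incorporation measured for LacNAc. The minimal Python example below expresses the result as a percentage, which is an assumption about the scale. The function name and all numerical values are hypothetical, chosen only to mirror the qualitative ordering described in the text; they are not data from Table 3.

```python
from typing import Dict


def relative_activities(incorporation: Dict[str, float],
                        standard: str = "LacNAc") -> Dict[str, float]:
    """Express each acceptor's Gal incorporation as a percentage of the standard."""
    reference = incorporation[standard]
    if reference <= 0:
        raise ValueError("the standard acceptor must show positive incorporation")
    return {name: 100.0 * value / reference for name, value in incorporation.items()}


# Hypothetical incorporation values (arbitrary units), NOT measurements from the paper.
toy_counts = {
    "LacNAc": 2000.0,
    "Galβ1-4Gal": 3000.0,   # better acceptor than LacNAc
    "Galβ1-4Glc": 1000.0,   # less preferred acceptor
    "GlcNAc": 0.0,          # no Gal residue, no incorporation
}

for acceptor, percent in relative_activities(toy_counts).items():
    print(f"{acceptor}: {percent:.0f}%")
```

Normalizing to a single well-behaved acceptor makes values comparable across assays in which the absolute incorporation differs, for example between enzyme preparations.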
Distribution of the Pigeon α4GalT(Gal) and β4GalT(Gal) Transcripts in Various Tissues-Real time PCR analysis using gene-specific primers revealed that both α4GalT(Gal) and β4GalT(Gal) mRNA/cDNAs were detected in all tissues examined (Fig. 4). However, the expression levels varied from tissue to tissue.
Comparison of the Substrate Preferences of Pigeon, Human, and Chicken α4GalT(Gal)s in Vivo and in Vitro-Wild-type 293T cells and cells transfected with mock or full-length cDNAs encoding pigeon, human, and putative chicken α4GalT(Gal)s were compared by immunostaining with anti-P1 mAb or anti-Gb3 mAb. The FACS analysis revealed that pigeon α4GalT(Gal) transfectants were stained with anti-P1 mAb on the cell surface, whereas no staining was observed on wild-type cells and mock transfectants (Fig. 5A). The pigeon α4GalT(Gal) transfectants did not stain with anti-Gb3 mAb. In contrast, 293T cells transfected with cDNA encoding human α4GalT(Gal) (also named Gb3 synthase) stained slightly with anti-P1 mAb but stained more strongly with anti-Gb3 mAb. Similarly, the putative chicken α4GalT(Gal) transfectants stained strongly with anti-Gb3 mAb, suggesting that this chicken α4GalT(Gal) possesses Gb3 synthase activity. These results suggest that pigeon α4GalT(Gal) is unlikely to act as a Gb3 synthase, although 293T cells express the precursor of Gb3, i.e. lactosylceramide.
Although proteins extracted from wild-type 293T cells (data not shown) and from mock, pigeon, human, and chicken α4GalT(Gal) transfectant cells on the Western blotting membrane had equal levels of staining with Coomassie Brilliant Blue R-250 (CBB, Fig. 5B), only pigeon α4GalT(Gal) transfectants stained with anti-P1 mAb. Proteins extracted from human α4GalT(Gal) transfectants did not stain with either anti-P1 or anti-Gb3 mAb, confirming that the human α4GalT(Gal) transfers Gal residues onto glycolipids rather than onto glycoproteins. These results strongly suggest that the preferred substrates, either on glycoproteins or glycolipids, differed significantly between pigeon and human α4GalT(Gal)s. Protein extracts of chicken α4GalT(Gal) transfectants also failed to stain with either anti-P1 or anti-Gb3 mAb, suggesting that the substrate preferences of chicken α4GalT(Gal) are closer to those of human α4GalT(Gal) than to those of pigeon α4GalT(Gal).
The substrate preferences of pigeon, human, and chicken α4GalT(Gal)s were also compared by in vitro GalT assays. Although cell extracts of pigeon α4GalT(Gal) transfectants transferred one or two Gal residues to the PA-derivatized N-glycan A, as shown in Fig. 2A, no products were detected by HPLC analysis when cell extracts (0.2 and 1 mg of protein/ml for the assay) of human or chicken α4GalT(Gal) transfectants were used as enzyme sources (data not shown). Similarly, affinity-isolated FLAG-tagged soluble pigeon α4GalT(Gal) (0.1 μg/25 μl/reaction) transferred Gal residues to PA-N-glycan A, LacNAc-pNP, or Lac-pNP after incubation at 37°C for 4 h, although no products were detected when the same amount of FLAG-tagged soluble human or chicken α4GalT(Gal) was included in the reaction mixtures (data not shown).
This result suggests that glycoproteins possessing Galβ1-4Gal sequences were produced by transfection with pigeon β4GalT(Gal).
Molecular Phylogeny of α4GalT(Gal) and β4GalT(Gal)-A phylogenetic analysis was performed using aligned DNA sequences encoding α4GalT(Gal)s from human, mouse, rat, chicken, and pigeon and the predicted α4GalT(Gal)s from several mammals (platypus, dog, cattle, horse, rhesus monkey, orangutan, and chimpanzee) and a bird (zebra finch), as shown in Fig. 6A. As expected, among mammals, the phylogenetic distances between α4GalT(Gal)s correlate with the taxonomy of the animals. Among birds, three putative α4GalT(Gal)s from zebra finch were the closest to pigeon α4GalT(Gal), and chicken α4GalT(Gal) was closer to these avian α4GalT(Gal) or its homolog than to those from mammals. It should be noted that chicken α4GalT(Gal), i.e. Gb3 synthase, is a much closer relative of pigeon α4GalT(Gal) than of the mammalian enzymes, although the substrate preferences of chicken α4GalT(Gal) are more similar to those of the mammalian enzymes than to those of the pigeon enzyme.
A phylogenetic analysis was also performed using aligned sequences of DNA encoding pigeon β4GalT(Gal) and its homologs from zebra finch, African clawed frog, Western clawed frog, zebrafish, and spotted green pufferfish, as shown in Fig. 6B. The β4GalT(Gal) homolog from zebra finch was the closest to pigeon β4GalT(Gal). The hypothetical proteins from frogs were closer to each other on the phylogenetic tree and relatively closer to pigeon β4GalT(Gal) than those from fishes.
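The distance measure and tree-building method used for Fig. 6 are not stated in this section. As a generic illustration of how a distance between two aligned DNA sequences can be quantified, the minimal Python sketch below computes the proportion of differing sites (p-distance) and applies the Jukes-Cantor correction, d = -(3/4) ln(1 - 4p/3). This is a swapped-in textbook approach, not necessarily the authors' method, and the function names and toy sequences are hypothetical.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences, skipping gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(seq_a.upper(), seq_b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance in substitutions per site."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical toy alignment, NOT sequences from the paper.
toy_a = "ACGTACGT"
toy_b = "ACGTACGA"

p = p_distance(toy_a, toy_b)
print(p)                          # 0.125
print(round(jukes_cantor(p), 4))  # 0.1367
```

A matrix of such pairwise distances is the usual input to distance-based tree methods such as neighbor joining.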

DISCUSSION
Pigeon α4GalT(Gal) Possesses Distinct Substrate Preferences from Those of Human and Chicken Gb3 Synthases-The distribution of (Galα1-4Gal)-containing glycoproteins in nature was initially believed to be limited, because the mammalian glycoproteins studied so far do not possess Galα1-4Gal. However, we found that the expression of Galα1-4Gal on glycoproteins is correlated with the avian phylogeny (8). To investigate the enzyme catalyzing the formation of Galα1-4Gal on glycoproteins, we successfully isolated the cDNA encoding pigeon α4GalT(Gal), which had not been identified previously. This pigeon α4GalT(Gal) showed a substrate specificity distinct from that of human α4GalT(Gal), i.e. Gb3 synthase, because pigeon α4GalT(Gal) is capable of transferring Gal residues to β-galactosides on glycoproteins to form Galα1-4Gal linkages. The deduced amino acid sequence was, however, homologous to those of human, mouse, and rat Gb3 synthases (Table 1), which belong to the GT32 family in the Carbohydrate-Active Enzyme Database (22-24). We also confirmed that the putative chicken α4GalT(Gal), which was identified from chicken genome/protein databases in NCBI, acts as a Gb3 synthase, like human α4GalT(Gal). This observation is consistent with the absence of Galα1-4Gal on glycoproteins in chicken (4,25). The high similarity between pigeon α4GalT(Gal) and chicken Gb3 synthase implies that they most likely diverged from a common ancestral gene relatively recently. To estimate the period of divergence of these genes, the sequences of α4GalT(Gal) genes from various avian species will be compared in more extensive investigations.
The isolated pigeon α4GalT(Gal) will also be useful in generating the building blocks of Galα1-4Gal by chemoenzymatic synthesis. Because the binding of pathogenic microbes (e.g. uropathogenic E. coli) or enterotoxins (e.g. Shiga toxins) to glycolipids containing the Galα1-4Gal sequence on mammalian cells is the first step in their invasion of the host cell, high affinity glycoconjugate inhibitors containing Galα1-4Gal may serve as therapeutic and preventive agents. As indicated in this study, this recombinant enzyme has the ability to transfer Gal residues to β-galactosides at the nonreducing termini of Galβ1-4GlcNAc, Galβ1-4Glc, and Galβ1-4Gal or even to monosaccharides, such as Galβ1-methyl and Galβ1-pNP. Unlike human Gb3 synthase, which has a preference for transferring Gal residues onto Galβ1-4Glc-Cer over Galβ1-4GlcNAcβ1-3Galβ1-4Glc-Cer (20), the pigeon α4GalT(Gal) has broad substrate specificity. Therefore, this enzyme has great potential for use in the production of various kinds of glycoconjugates containing Galα1-4Gal.
β-Galactoside α1,4- and β1,4-Galactosyltransferases in Birds FEBRUARY 19, 2010 • VOLUME 285 • NUMBER 8

JOURNAL OF BIOLOGICAL CHEMISTRY 5185
Pigeon β4GalT(Gal) Defines a New Sequence Family of Glycosyltransferases Not Found in Mammals-The protein sequence of the newly isolated pigeon β4GalT(Gal) showed no significant homology with known glycosyltransferases in the Carbohydrate-Active Enzyme Database (22-24). An NCBI-BLAST search using the Conserved Domain Database, which serves to identify conserved domains in protein sequences from over 12,000 model domain collections in the database (26,27), indicated that the sequence contains a conserved domain designated DUF23 (accession number, pfam01697), located between amino acid residues 176 and 430. Members of the DUF23 family are found in some hypothetical proteins from various eukaryotic species, including some vertebrates (fishes, frogs, and a bird), nematodes, insects, and plants, as well as from bacteria. The DUF23 region consists of ~300 amino acid residues containing several conserved cysteine residues and several charged residues, which are predicted to serve as catalytic residues. Therefore, proteins containing DUF23 had been expected to possess some enzymatic activities, although there was no direct evidence of this until we identified β4GalT(Gal) activity in one of the member proteins, as shown in this study. Moreover, the Conserved Domain Database indicated that DUF23 belongs to the GT-A superfamily, which possesses one of the two known structural folds of nucleotide-sugar-dependent glycosyltransferases (24). Because most GT-A enzymes possess a DXD motif in which the carboxylates coordinate a divalent cation and/or ribose, this observation is consistent with the fact that the isolated pigeon β4GalT(Gal) possesses cation-dependent activity (Table 2).
Among vertebrates whose data are currently (September 2009) available in the NCBI DNA/Protein Databases, some fishes, frogs, and a bird, zebra finch, but no mammals or chicken, were indicated as possessing hypothetical proteins containing DUF23. Since the presence of Galβ1-4Gal on N-glycans from zebrafish was previously reported (28,29), the proteins containing DUF23 are candidates for β4GalT(Gal) in zebrafish. Galβ1-4Gal on N-glycans was also reported to exist in medaka (30-32) and dace (33,34). The presence of Galβ1-4Gal on O-glycans from eggs of salmonid fishes was also reported (35-37). Several species of amphibians were reported to possess O-glycans with Galβ1-4Gal in the mucin of their egg jelly coats (38-42). Therefore, we assume that the hypothetical proteins of clawed frogs similar to pigeon β4GalT(Gal) possess the cognate enzyme activity.
Although only a limited number of mammalian species have been selected for complete genomic sequence projects, DNA/protein sequences from mammalian species spanning monotremes (duck-billed platypus), marsupials (gray short-tailed opossum), rodents (mouse and rat), dog, bovine, and primates (rhesus macaque, chimpanzee, and human) are now available in the NCBI DNA/Protein Databases (43). Despite this wide spectrum, no sequences homologous to pigeon β4GalT(Gal) or DUF23 were found in the mammalian databases. Therefore, the genes of β4GalT(Gal) or DUF23 may have been lost in several lineages of mammals or in the common ancestors of various species of mammals after the ancestors of mammals and birds separated (~310 million years ago) (44).
Conclusion-Species-specific glycans are often involved in interactions between hosts and foreign organisms, and even a subtle difference in glycan structure can greatly influence their associations (2). For instance, Galα1-4Gal on glycolipids is expressed on cell surfaces and mediates the cellular internalization of Shiga toxins, which are activated by cellular proteases for intoxication of cells (45-47). In contrast, the presence of glycoproteins containing Galα1-4Gal in tissues and body fluids may prevent such glycolipid-mediated endocytosis and/or intracellular processing of toxins. The biological features of Galβ1-4Gal are currently less well understood, but the high titers of natural antibodies against Galβ1-4Gal in human sera (15) imply that this glycan sequence is also involved in host-foreign body interactions. It should be emphasized that the presence of Galα1-4Gal and Galβ1-4Gal is not an evolutionary relic and that there seems to be a driving force for the generation and conservation of these glycan sequences in some animals. As we have indicated, glycoprotein-specific α4GalT(Gal) was generated relatively recently in a lineage of avian species. This fact suggests that birds were able to produce novel glycan forms, and they might have entered into selective associations with some microorganisms to evade pathogenic relationships and/or to coevolve in symbiotic or commensal relationships (48). The wide distribution of Galβ1-4Gal in vertebrates and its conservation in some avian species may indicate a benefit of possessing oligosaccharides with Galβ1-4Gal. The availability of genes encoding pigeon α4GalT(Gal) and β4GalT(Gal) should help accelerate investigations into the biological necessity of producing species-specific glycans, as well as studies on the molecular evolution of these enzymes.
Addendum-While this manuscript was being reviewed, Titz et al. (49) reported the cloning of Caenorhabditis elegans N-glycan core α1,6-fucoside β1,4-galactosyltransferase, which defined a new GT family in carbohydrate-active enzymes (GT92). The amino acid sequence of this enzyme is similar to that of pigeon β4GalT(Gal) (20.9% identity), suggesting that these two enzymes belong to the same GT family.