The Molecular Basis for the Absence of N-Glycolylneuraminic Acid in Humans

N-Glycolylneuraminic acid (NeuGc) is abundantly expressed in most mammals, but it is not detectable in humans. The expression of NeuGc is controlled by cytidine monophospho-N-acetylneuraminic acid (CMP-NeuAc) hydroxylase activity. We previously cloned a cDNA for mouse CMP-NeuAc hydroxylase and found that the human genome contains a homologue. We report here the molecular basis for the absence of NeuGc in humans. We cloned a cDNA for human CMP-NeuAc hydroxylase from a HeLa cell cDNA library. The cDNA encodes a 486-amino acid protein, and its deduced amino acid sequence lacks a domain corresponding to the N-terminal 104 amino acids of the mouse CMP-NeuAc hydroxylase protein, although the human protein is highly identical (93%) to the rest of the mouse hydroxylase protein. The N-terminal truncation of the human hydroxylase is caused by deletion of a 92-base pair exon in human genomic DNA. The human hydroxylase expressed in COS-7 cells exhibited no enzymatic activity, and a mouse hydroxylase mutant, which lacks the N-terminal domain, was also inactive. A chimera composed of the human hydroxylase and the N-terminal domain of the mouse hydroxylase displayed the enzyme activity. These results indicate that the human homologue of CMP-NeuAc hydroxylase

Sialic acids are components of the carbohydrate chains of glycoconjugates and are involved in cell-cell recognition (1,2) and cell-pathogen interactions (3,4). Sialic acid is a generic designation used for N-acylneuraminic acids and their derivatives (5,6). N-Acetylneuraminic acid (NeuAc) and N-glycolylneuraminic acid (NeuGc) are two of the most abundant derivatives. Previous studies suggest that some adhesion molecules recognize glycoconjugates containing NeuGc and NeuAc with different affinities. Sialoadhesin, a cell adhesion molecule of marginal zone macrophages, recognizes NeuAcα2-3Gal but not NeuGcα2-3Gal (7). On the other hand, mouse CD22, a B cell-restricted adhesion molecule, binds more strongly to NeuGcα2-6Gal than to NeuAcα2-6Gal (7). Influenza hemagglutinins recognize glycoconjugates that contain NeuAc and NeuGc with different affinities (8). These studies suggest that the diversity of sialic acids is biologically important in recognition events mediated by glycoconjugates.
Although NeuGc-containing glycoconjugates are found in most mammals (5,6), NeuGc is not detectable in normal human tissues. Glycoconjugates containing NeuGc are immunogenic in humans, and an antibody against NeuGc, known as the Hanganutziu-Deicher antibody, is produced by patients who receive therapeutic injections of animal antisera (9,10).
NeuGc is assumed to be produced from NeuAc through enzymatic hydroxylation of the N-acetyl residue of free NeuAc, CMP-NeuAc, or glycoconjugate-linked NeuAc (11,12). Previous studies showed that the major mechanism for biosynthesis of NeuGc is hydroxylation of CMP-NeuAc (13-16). Therefore, the activity of CMP-NeuAc hydroxylase plays a key role in the regulation of NeuGc expression in glycoconjugates. We showed previously that the conversion of CMP-NeuAc to CMP-NeuGc is carried out by an electron transport system that includes NADH-dependent cytochrome b5 reductase, cytochrome b5, and CMP-NeuAc hydroxylase (17-19). The activity of CMP-NeuAc hydroxylase regulates the overall velocity of the hydroxylation reaction (19). Therefore, we purified (20) and cloned mouse CMP-NeuAc hydroxylase (21) and showed that the ratio of NeuGc to NeuAc in glycoconjugates is regulated by the expression level of CMP-NeuAc hydroxylase (21). The human genome contains a band that hybridizes with the mouse CMP-NeuAc hydroxylase cDNA (21). To clarify the molecular basis for the absence of NeuGc in humans, we cloned and expressed the human homologue. We report here that the human CMP-NeuAc hydroxylase protein is inactive because of a partial deletion in the human CMP-NeuAc hydroxylase gene.

EXPERIMENTAL PROCEDURES
Isolation of a cDNA Clone for Human CMP-NeuAc Hydroxylase-To amplify partial fragments of human CMP-NeuAc hydroxylase, four oligonucleotide primers were synthesized on the basis of the sequence of the mouse CMP-NeuAc hydroxylase cDNA (21). PCR amplification was performed with human genomic DNA as a template. The amplified DNAs, HE1 (70 bp) and HE2 (224 bp), were sequenced. To amplify a longer cDNA fragment, primers (5′-ATGATGAGTGATTTTGCTGGAGGAGC-3′ and 5′-CCCAAGTGAAGTATTCTTTTATCCAGG-3′) were synthesized on the basis of the sequences of HE1 and HE2. PCR amplification was performed with a reverse-transcribed cDNA prepared from total RNA of HeLa cells as a template. Amplified cDNA fragments were sequenced. One amplified clone (HE3) was 485 bp long and showed 87% homology to the mouse CMP-NeuAc hydroxylase.
A HeLa cDNA library constructed in TriplEX (CLONTECH) (6 × 10^5 clones) was screened by hybridization with HE3. The hybridization was carried out at 65°C in 6 × SSC (900 mM NaCl and 90 mM sodium citrate) containing 5 × Denhardt's solution (0.1% Ficoll, 0.1% polyvinylpyrrolidone, and 0.1% bovine serum albumin) and 0.5% SDS. Filters were washed at 65°C in 2 × SSC containing 0.1% SDS. Two independent positive clones were isolated and sequenced. One representative clone, H111, with a 2201-bp insert was used for further analyses.
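The buffer strengths quoted above follow from simple proportional scaling. The minimal sketch below assumes only the 6 × SSC composition given in the text (900 mM NaCl and 90 mM sodium citrate) and derives the corresponding concentrations at other strengths, such as the 2 × SSC wash; the function name `ssc_concentrations` is illustrative and not part of the original protocol.

```python
# Composition of the 6x stock as stated in the text (millimolar).
SSC_6X_MM = {"NaCl": 900.0, "sodium citrate": 90.0}


def ssc_concentrations(fold):
    """Return the millimolar concentration of each SSC component at `fold`x strength.

    Assumes simple linear dilution from the 6x composition given in the text.
    """
    return {name: mm / 6.0 * fold for name, mm in SSC_6X_MM.items()}


if __name__ == "__main__":
    # Hybridization (6x), wash (2x), and the 1x reference strength.
    for fold in (6, 2, 1):
        print(f"{fold}x SSC:", ssc_concentrations(fold))
```

Under this scaling, the 2 × SSC wash corresponds to 300 mM NaCl and 30 mM sodium citrate.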
Construction and Expression of CMP-NeuAc Hydroxylase Mutants in COS-7 Cells-A cDNA coding for the Flag epitope tag (DYKDDDDK) was incorporated into the human and the mouse CMP-NeuAc hydroxylase cDNAs using PCR. A cDNA for a mouse hydroxylase mutant, which lacks the N-terminal 104 amino acids, was constructed by PCR with primers 5′-CGCGGATCCGCCGCCACCATGGATTACAAGGACGACGACGATAAGGAGAACAATGGGCTTTCCC-3′ and 5′-CAGCTCGAGACTAATCACAGTGCATTAGG-3′. A cDNA for a chimeric mutant was constructed by sequential PCRs. A cDNA for the N-terminal part of the chimera was amplified with primers 5′-CGCGGATCCGCCGCCACCATGGATTACAAGGACGACGACGATAAGAGGAAACAGACAGCTGAG-3′ and 5′-AAGCTCGAGGTTCGTCTCCATTTCAATAACGAGCTCG-3′. A cDNA for the C-terminal part of the chimera was amplified with primers 5′-GGAACGTCTCGAAATGGATGAAAACAACGGAC-3′ and 5′-CGCCTCGAGTCATGTGGTTTTGCATTCTTGC-3′. The two PCR products were digested with BsmBI and ligated to each other. The cDNAs were inserted into a eukaryotic expression vector, pcDNA3.1(+) (Invitrogen), and sequenced. The cDNAs were transfected into COS-7 cells with LipofectAMINE plus reagent (Life Technologies, Inc.). The relative amounts of mRNAs expressed by the transfectants were measured by reverse transcriptase-PCR (22) with primers (5′-ATTCCCATTTATGTTGG-3′ and 5′-TTCCACCACTGAAAGTC-3′) designed on the basis of the mouse and the human CMP-NeuAc hydroxylase.
Immunoblot Analysis-The cytosolic proteins of the transfectants were separated by SDS-polyacrylamide gel electrophoresis and transferred to a polyvinylidene difluoride membrane. Anti-Flag antibody M2 (Eastman Kodak Co.) was used as a first antibody, and horseradish peroxidase-labeled F(ab′)2 fragment of sheep anti-mouse Ig (Amersham Pharmacia Biotech) was used as a second antibody. The blots were developed by the ECL system (Amersham Pharmacia Biotech).
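The primer design described above can be checked computationally. The following is a minimal sketch, not the authors' procedure: it translates the first nine codons after the initiator ATG in the forward primer for the N-terminally truncated mouse mutant to confirm that they encode Met followed by the Flag epitope, and it looks for the BsmBI recognition sequence (CGTCTC) in the two chimera junction primers. It assumes that the hyphens inside the printed primer sequences are line-break artifacts; the helper names are illustrative.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Primer sequences from the text, with line-break hyphens removed (assumption).
DELTA_N_FORWARD = "CGCGGATCCGCCGCCACCATGGATTACAAGGACGACGACGATAAGGAGAACAATGGGCTTTCCC"
CHIMERA_N_REVERSE = "AAGCTCGAGGTTCGTCTCCATTTCAATAACGAGCTCG"
CHIMERA_C_FORWARD = "GGAACGTCTCGAAATGGATGAAAACAACGGAC"

BSMBI_SITE = "CGTCTC"  # BsmBI recognition sequence


def translate(dna):
    """Translate all complete codons of `dna`, reading from position 0."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def tag_peptide(primer, n_codons=9):
    """Translate `n_codons` codons starting at the first ATG in `primer`."""
    start = primer.find("ATG")
    return translate(primer[start:start + 3 * n_codons])


def has_bsmbi_site(primer):
    """True if the BsmBI site occurs on either strand of `primer`."""
    return BSMBI_SITE in primer or BSMBI_SITE in reverse_complement(primer)


if __name__ == "__main__":
    print(tag_peptide(DELTA_N_FORWARD))       # Met plus the Flag epitope
    print(has_bsmbi_site(CHIMERA_N_REVERSE))  # junction primer, N-terminal part
    print(has_bsmbi_site(CHIMERA_C_FORWARD))  # junction primer, C-terminal part
```

Both junction primers carry the BsmBI site, which is consistent with joining the two PCR products by BsmBI digestion and ligation.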
Genomic Analysis-Partial fragments of genomic DNA of the human CMP-NeuAc hydroxylase were amplified with a PromoterFinder DNA walking kit (CLONTECH) according to the manufacturer's protocol. The amplified DNAs were isolated and sequenced. The sequence similarity between the amplified DNAs and the mouse CMP-NeuAc hydroxylase was analyzed by the program Clustal W (24).

RESULTS
Isolation of a cDNA for Human CMP-NeuAc Hydroxylase-In our previous paper, we showed that the human genome appears to contain a homologue of mouse CMP-NeuAc hydroxylase (21). In a preliminary experiment, we used RT-PCR to amplify first strand cDNA prepared from total RNA of HeLa cells. We obtained a human cDNA fragment of 485 bp that was 87% homologous to mouse CMP-NeuAc hydroxylase cDNA (data not shown). This result indicated that HeLa cells express mRNA for the human homologue of the mouse hydroxylase. To isolate a complete cDNA for the human homologue, we screened a HeLa cell cDNA library using the 485-bp fragment as a probe and isolated two clones. Restriction mapping and sequence analyses of these clones showed that they contained a common sequence and overlapped each other. Fig. 1 shows the nucleotide and deduced amino acid sequences of one of these clones, H111. The deduced amino acid sequence was assigned from the longest open reading frame of the cDNA. The polypeptide consists of 486 amino acid residues with a predicted molecular weight of 56,508. Using a rabbit reticulocyte lysate, a 57-kDa protein was synthesized by in vitro transcription and translation of clone H111 (data not shown). This result confirms the position of the initiation site. Fig. 2 shows a comparison of the deduced amino acid sequences of the human clone and the mouse CMP-NeuAc hydroxylase. The human hydroxylase peptide lacks the sequence corresponding to the N-terminal 104 amino acids of the mouse peptide because of a deletion in the gene. The human protein is highly identical (93%) to the rest of the mouse hydroxylase peptide, which suggests that the cDNA encodes a human homologue.
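A figure such as the 93% identity to "the rest of" the mouse protein is obtained by comparing only the aligned positions that both sequences share. The sketch below illustrates one common way to compute this, counting matches over alignment columns in which neither sequence has a gap; the text does not state the exact convention used, so this definition is an assumption, and the two short sequences are hypothetical toy strings, not data from the paper.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs must be rows of the same alignment (equal length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = [(x, y) for x, y in zip(aligned_a, aligned_b)
                if x != gap and y != gap]
    if not compared:
        return 0.0
    matches = sum(1 for x, y in compared if x == y)
    return 100.0 * matches / len(compared)


if __name__ == "__main__":
    # Hypothetical toy alignment: the second sequence lacks the first three
    # residues, mimicking an N-terminal truncation.
    toy_full = "MKTAYIAKQR"
    toy_truncated = "---AYIAKHR"
    print(round(percent_identity(toy_full, toy_truncated), 1))
```

Because gapped columns are skipped, a missing N-terminal segment does not lower the identity reported for the remainder of the protein.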
Northern Blot Analysis of the Human CMP-NeuAc Hydroxylase-Expression of mRNA for the human CMP-NeuAc hydroxylase was observed in many tissues (Fig. 3). Thymus expresses the mRNA most abundantly, and brain does not express a detectable amount of mRNA. In mice, CMP-NeuAc hydroxylase mRNA is also expressed in many tissues, but not in brain (21). Thus, suppression of mRNA transcription in brain is conserved in both human and mouse.
Expression of the Human CMP-NeuAc Hydroxylase in COS-7 Cells-We showed previously that in mice the expression of NeuGc in glycoconjugates is related to the level of CMP-NeuAc hydroxylase mRNA (21). However, although CMP-NeuAc hydroxylase mRNA is abundantly expressed in human tissues (Fig. 3), NeuGc is hardly detectable (5). We tested the hypothesis that the human hydroxylase protein is an inactive form of the enzyme because of the deletion of the N-terminal 104 amino acids. We constructed hydroxylase mutants and expressed them in COS-7 cells (Fig. 4). The ΔN mouse hydroxylase is a mutant that lacks the N-terminal 104-amino acid sequence of the mouse hydroxylase (Fig. 4A). The mouse/human chimera is a fusion protein in which the 104-amino acid N-terminal domain of the mouse hydroxylase is attached to the N terminus of the human hydroxylase (Fig. 4A). The Flag epitope tag was attached to the N termini of all the recombinant proteins for detection.
RT-PCR analysis showed that mRNA for each hydroxylase construct was expressed in the transfected COS-7 cells (Fig. 4B). In addition, production of the recombinant proteins was confirmed by an immunoblot analysis using an anti-Flag antibody (Fig. 4C). Fig. 4D shows CMP-NeuAc hydroxylase activity in the transfectants. The COS-7 cells expressing the native mouse hydroxylase show the hydroxylase activity, but the cells expressing the ΔN mouse hydroxylase show no enzyme activity. These results suggest that the 104-amino acid N-terminal domain of the mouse hydroxylase is essential for enzyme activity. The cells expressing the human hydroxylase show no hydroxylase activity, but the cells expressing the chimeric hydroxylase show enzyme activity. These results indicate that the protein encoded by the human CMP-NeuAc hydroxylase gene is inactive because it lacks the N-terminal domain, which is essential for CMP-NeuAc hydroxylase activity.
Genomic Organization of the Human CMP-NeuAc Hydroxylase-The mouse CMP-NeuAc hydroxylase gene comprises 18 exons. Fig. 5A shows the comparison of the cDNA sequences of the mouse and the human CMP-NeuAc hydroxylase. The human cDNA lacks a 92-bp sequence corresponding to exon 6 of the mouse hydroxylase. An in-frame TAG termination codon (nucleotides −18 to −16) is present upstream of the methionine codon of the human cDNA (underlined in Fig. 5A) because of the 92-bp deletion. Therefore, the human protein lacks the N-terminal 104 amino acids present in the mouse enzyme due to the deletion of 92 bp in the human cDNA. RT-PCR analyses of 10 human cell lines and tissues from 21 people did not amplify any sequences corresponding to the mouse exon 6 (data not shown), suggesting that the 92-bp deletion does not represent a polymorphism. To examine the molecular basis of the deletion of the 92-bp sequence corresponding to the mouse exon 6 in the human genome, we cloned and sequenced human genomic DNA corresponding to the region flanked by the mouse exon 5 and exon 7. The human genomic region comprises 22,515 bp, but we could not obtain any sequences identical or similar to the mouse exon 6 (Fig. 5, B and C). PCR with human genomic DNA as a template and primers designed on the sequence of the mouse exon 6 did not amplify a DNA fragment corresponding to the mouse exon 6 (data not shown). These results suggest that the exon is absent in the human genome. Thus, the lack of the N-terminal 104 amino acids of the human hydroxylase is due to the deletion of one exon in the CMP-NeuAc hydroxylase gene.

DISCUSSION
Carbohydrate structures differ markedly from one animal species to another and even among tissues in the same species. NeuGc is one of the abundant derivatives of sialic acid and is expressed in most mammals. For example, NeuGc comprises 95% and 11% of total sialic acids expressed in mouse and rat liver glycoconjugates, respectively (16).
Almost all sialic acid in rat thymus CD45 and Thy-1 is NeuGc (25). In contrast, no expression of NeuGc was observed in normal human tissues. The absence of NeuGc in humans is supported by the evidence that NeuGc is highly immunogenic to humans. Antibodies that recognize gangliosides containing NeuGc appear after administration of animal serum to humans (9,10). Another example of the species-specific absence of a carbohydrate sequence and its immunogenicity in humans is the Galα1-3Gal epitope. This epitope is expressed in various mammals but not in Old World monkeys, apes, and humans (26). Antibodies against glycoconjugates containing the Galα1-3Gal epitope are naturally occurring in humans (27). The reason for the absence of the Galα1-3Gal epitope in humans is that the human α-1,3-galactosyltransferase gene is a nonfunctional pseudogene, which contains nonsense codons in all three reading frames because of multiple frameshift mutations (28,29). The molecular mechanism for the absence of NeuGc differs from that of the Galα1-3Gal epitope. Southern blot analysis of the human genome revealed that the CMP-NeuAc hydroxylase gene is present in the human genome in one copy (data not shown). mRNA for the human CMP-NeuAc hydroxylase homologue is transcribed (Figs. 3 and 4B), and the protein was translated ex vivo (Fig. 4C). However, the human hydroxylase is inactive because of the deletion of one exon in the hydroxylase gene (Fig. 5).
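The reading-frame consequence of the 92-bp exon deletion reported in this paper can be verified with simple arithmetic. The sketch below is an illustration under stated assumptions rather than an analysis from the paper: it shows that 92 is not a multiple of three, so sequence upstream of the deletion would be read in a different frame from sequence downstream of it, which is consistent with the in-frame TAG codon found upstream of the human initiator methionine. It also computes the coding length implied by a 486-residue polypeptide. The function names are illustrative.

```python
def frame_offset(deletion_bp):
    """Bases by which a deletion shifts the downstream reading frame (0 = in frame)."""
    return deletion_bp % 3


def orf_length_bp(n_residues, include_stop=True):
    """Length in base pairs of an open reading frame encoding `n_residues` residues."""
    return 3 * n_residues + (3 if include_stop else 0)


if __name__ == "__main__":
    # The missing exon is 92 bp long; 92 = 3 * 30 + 2, so the frame shifts.
    print("frame offset after 92-bp deletion:", frame_offset(92))
    # The human polypeptide has 486 residues.
    print("coding length incl. stop codon:", orf_length_bp(486), "bp")
```

A deletion whose length was a multiple of three would instead remove whole codons and leave the downstream frame intact.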
As shown in Fig. 4D, the N terminus of the mouse CMP-NeuAc hydroxylase is essential for enzyme activity. Schlenzka et al. (30) cloned a partial cDNA for porcine CMP-NeuAc hydroxylase and proposed that CMP-NeuAc hydroxylase is an iron-sulfur protein of the Rieske type, and the N-terminal domain of the mouse CMP-NeuAc hydroxylase contains the Rieske iron-sulfur center. Evidence for the essential nature of this domain for hydroxylase activity is provided by the transfection experiments in this paper (Fig. 4D). Schlenzka et al. (30) also suggested that the probable binding sites for CMP-NeuAc and for cytochrome b5 are in the middle of the mouse CMP-NeuAc hydroxylase sequence. There are no data concerning the binding sites for CMP-NeuAc or cytochrome b5. However, these sites are highly conserved in the mouse, the porcine, and the human hydroxylases. Amino acid sequences of functionless proteins are likely to undergo mutations, but the amino acid sequences of the mouse, the porcine, and the human hydroxylases are highly conserved, even though the human hydroxylase is inactive. We cloned a cDNA for a mouse CMP-NeuAc hydroxylase isoform that lacks 46 amino acids in the middle of the normal full-length hydroxylase (31). The short isoform is expressed in various mouse tissues and shows no enzyme activity. The amount of the short isoform mRNA is about 10-25% of total hydroxylase mRNA (31). The biological function of the inactive mouse isoform is unknown. However, the presence of the enzymatically inactive human hydroxylase and the inactive mouse hydroxylase isoform suggests the possibility of other biological functions of CMP-NeuAc hydroxylase. Further studies are required to answer these questions.
The absence of NeuGc results in a partial loss of diversity of human glycoconjugate structures. Mouse CD22 binds more strongly to NeuGcα2-6Gal than to NeuAcα2-6Gal (7,25). On the other hand, human CD22 binds equally to NeuAcα2-6Gal and to NeuGcα2-6Gal (2), so NeuAc compensates for NeuGc in the function of human CD22 binding. Polymorphic variations of the expression of NeuGc-containing gangliosides have been reported in dog erythrocytes (32,33), cat erythrocytes (34), and rat small intestine (35). These studies suggest that NeuGc is not essential for these animals. Therefore, it is possible that NeuAc or other components of cells compensate for the lack of NeuGc in humans or in NeuGc-deficient animals. The absence of NeuGc may confer protection against infection by certain pathogens. For example, K99, an enterotoxigenic strain of Escherichia coli, adheres to ganglioside GM3(NeuGc) but not to GM3(NeuAc) in intestinal epithelial cells (3). K99 causes diarrhea in neonatal calves, lambs, and piglets, which express GM3(NeuGc). On the other hand, K99 does not adhere to human intestine because of the absence of GM3(NeuGc) in humans (36). These results indicate that the loss of diversity of sialic acids is not necessarily unfavorable to humans.

FIG. 4. Expression of the CMP-NeuAc hydroxylase cDNA in COS-7 cells. A, schematic structures of hydroxylase mutants expressed in COS-7 cells. ΔN mouse CMP-NeuAc hydroxylase lacks the N terminus of the mouse CMP-NeuAc hydroxylase, and mouse/human chimera has the N terminus of the mouse CMP-NeuAc hydroxylase fused to the human CMP-NeuAc hydroxylase. Flag tags were attached to N termini. Each cDNA was transfected in COS-7 cells. B, RT-PCR analysis of mRNA expression in the transfectants. Total RNAs from the transfectants were reverse-transcribed with random primers. The PCR amplification was performed using primers whose sequences are common to both the human and the mouse CMP-NeuAc hydroxylase cDNAs. Each PCR product was resolved in a 2% agarose gel. The gel was stained with ethidium bromide. C, immunoblot analysis of protein expression in the transfectants. The cytosolic fractions of the transfectants were analyzed by SDS-polyacrylamide gel electrophoresis, followed by immunoblot analysis using anti-Flag M2 antibody. Arrows indicate the 67-kDa (upper) and the 57-kDa (lower) proteins. D, CMP-NeuAc hydroxylase activity in the cytosolic fractions of the transfectants. CMP-NeuAc hydroxylase activity in the transfectants was measured using mouse liver microsomal fraction solubilized with 0.5% Triton X-100 as the source of cytochrome b5 and cytochrome b5 reductase. The data represent the mean ± S.E. for triplicate determinations.
Several studies have suggested that certain human tumor cells express NeuGc-containing glycoconjugates (6), but these data are controversial. Most of these studies used antibodies against NeuGc-containing glycoconjugates, but not chemical analysis, to detect NeuGc in tumors. Although a few studies identified NeuGc expression in tumors by chemical analyses, their results are also controversial. Furukawa et al. (37) reported that no NeuGc expression was observed in human melanoma or astrocytoma cell lines, but Marquina et al. (38) reported that NeuGc-containing gangliosides were expressed in breast tumor tissues. Our present finding excludes the possibility of the biosynthesis of NeuGc by CMP-NeuAc hydroxylation in humans. We assume that the hydroxylation of CMP-NeuAc is the major biosynthetic pathway and that other pathways are minor, because the level of mRNA and activity of CMP-NeuAc hydroxylase correlate with the expression of NeuGc (21). Therefore, we consider that NeuGc is biosynthesized almost exclusively by the hydroxylation of CMP-NeuAc, and molecular mechanisms for the possible expression of NeuGc in human tumors require further study.
It remains to be determined whether NeuGc and hydroxylase activity are present in primates. On the basis of our present findings, molecular cloning of CMP-NeuAc hydroxylases of primates is now in progress in our laboratory to identify when the partial deletion of the hydroxylase gene occurred in primate evolution.
In summary, we found that the human CMP-NeuAc hydroxylase is an inactive protein because it lacks the N-terminal domain, which is essential for CMP-NeuAc hydroxylase activity. We conclude that the absence of NeuGc in humans is due to the deletion of one exon in the CMP-NeuAc hydroxylase gene. This study will contribute to understanding of the molecular basis for the diversity of glycoconjugate structures in mammals.

FIG. 5. Gene structures of the human and the mouse CMP-NeuAc hydroxylase. A, the nucleotide sequences at the boundaries of exons 5, 6, and 7 of the mouse CMP-NeuAc hydroxylase. The mouse (upper) and the human (lower) CMP-NeuAc hydroxylase cDNAs are aligned to achieve the maximal homology. Only the 5′ and 3′ ends of the mouse exon 6 are shown; the remainder of exon 6 is indicated by the dotted line. B, schematic representation of the mouse (upper) and the human (lower) CMP-NeuAc hydroxylase genomic DNA. C, restriction map of the human genomic DNA. The GenBank™ accession number for the 22.5-kilobase pair sequence is AB009668.