Evolution of Fucosyltransferase Genes in Vertebrates*

Cloning and expression of chimpanzee FUT3, FUT5, and FUT6 genes confirmed the hypothesis that the gene duplications at the origin of the present human cluster of genes occurred between (i) the great mammalian radiation 80 million years ago and (ii) the separation of man and chimpanzee 10 million years ago. The phylogeny of fucosyltransferase genes was completed by the addition of the FUT8 family of α(1,6)fucosyltransferase genes, which are the oldest genes of the fucosyltransferase family. By analysis of data banks, a new FUT8 alternative splice expressed in human retina was identified, which allowed mapping of the human FUT8 gene to 14q23. The results suggest that the fucosyltransferase genes have evolved by successive duplications, followed by translocations, and divergent evolution from a single ancestral gene.

Previous cloning of a bovine α(1,3)fucosyltransferase gene (futb) gave a single transcript, and the corresponding cognate enzyme had properties in common with the products of the three human FUT3, FUT5, and FUT6 genes. The position of this futb gene in the phylogenetic tree of fucosyltransferases showed that the separation of the bovine species from the common evolutionary pathway, during the great mammalian radiation some 80 million years ago, occurred before the duplication events that gave rise to the present cluster of human FUT3, FUT5, and FUT6 genes, and suggested that this bovine enzyme is the orthologous homologue of the ancestor of the FUT3, FUT5, and FUT6 human genes (6).
The present cloning and expression of three chimpanzee α(1,3)fucosyltransferase genes provides evidence for the existence of at least three distinct but related α(1,3)fucosyltransferase enzymes in this species, each one being the orthologous homologue of one of the human FUT3, FUT5, and FUT6 genes. The position of these chimpanzee genes in the fucosyltransferase phylogenetic tree suggests that the separation of man and chimpanzee from the common evolutionary pathway, about 10 million years ago, occurred after the duplication events at the origin of the present cluster of FUT3, FUT5, and FUT6 genes. Addition of the FUT8 gene family to the phylogenetic tree suggests that the appearance of this family preceded the α(1,2) and the α(1,3)fucosyltransferase gene families, and the analysis of sequences in GenBank™/EBI, EST, and UniGene data banks allowed us to map the human FUT8 gene to 14q23.

EXPERIMENTAL PROCEDURES
Cloning-PCR was used to amplify the coding regions and immediately adjacent sequences of FUT3, FUT5, and FUT6 from a chimpanzee, with the primers already used for the human genes, which contain extra bases with specific restriction sites (7,8). PCR products were digested with EcoRI and XbaI for FUT3, HindIII for FUT6, and HindIII and EcoRI for FUT5. Each gene was cloned between the respective restriction sites of pcDNA1 (Invitrogen). The isolated inserts were sequenced by the dideoxynucleotide chain termination method using a DNA sequencer model 373A (Applied Biosystems, Inc.) (9), the vector primers T7 and SP6, and the internal primers shown in Figs. 1, 2, and 3.
Transfection Expression-COS-7 cells were transfected by the DEAE-dextran method (10) with the human and chimpanzee FUT3, FUT5, and FUT6 constructs. After 48 h, transfected cells were trypsinized and distributed in 96-well V-bottom microtiter plates (3 × 10⁵ cells/well). Leᵃ, sialyl-Leᵃ, Leˣ, sialyl-Leˣ, and B oligosaccharide epitopes were detected by 30-min incubation with primary antibodies (11,12). The antibody-treated cells were washed and incubated for 30 min with fluorescein-labeled sheep anti-mouse Ig secondary antibodies (Pasteur Diagnostics, Marnes-la-Coquette, France). Stained and washed cells were suspended in 10 µl of phosphate-buffered saline, 4% paraformaldehyde. Then 5 µl of Mowiol 4:80 (Hoechst, Frankfurt, Germany) were added, and the samples were mounted under coverslips for
* This work was supported in part by Grant 9514111 "Action Concertée Coordonnée des Sciences du Vivant" (ACCSV14) from "Ministère de l'Education Nationale de l'Enseignement Supérieur et de la Recherche" (France), the Immunology Concerted Action Grant (6).
Fucosyltransferase activity of the homogenates of transfected cells was measured by incorporation of GDP-L-[14C]fucose onto synthetic acceptor oligosaccharides with the Sep-Pak C18 product isolation procedure (13).
Sequence Analysis-Homologous sequences were searched in GenBank™ with FASTA (14). Local alignments were made with LALIGN (15). Consensus sequences from overlapping EST were obtained by HUCAP (16). Multiple alignments of cDNA sequences of fucosyltransferase genes (Table I) were performed with the CLUSTALW 1.5 program. A matrix of genetic distances was calculated with the PHYLIP phylogeny package, and the phylogenetic tree was calculated with the Fitch-Margoliash least-squares method with evolutionary clock (17). The tree was drawn from the PHYLIP dendrogram with the NJplot program on a Power Macintosh 6100/66 computer.
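The distance-matrix step can be illustrated with a minimal sketch. It is not the PHYLIP procedure used here: PHYLIP applies its own distance models, whereas this example substitutes the simplest uncorrected measure (the proportion of differing sites, or p-distance). The function names and the toy sequences are hypothetical and are not taken from Table I.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence carries a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_matrix(alignment):
    """Build a symmetric matrix of pairwise p-distances.

    `alignment` maps a sequence name to its aligned sequence.
    """
    names = list(alignment)
    matrix = {n: {m: 0.0 for m in names} for n in names}
    for n, m in combinations(names, 2):
        d = p_distance(alignment[n], alignment[m])
        matrix[n][m] = d
        matrix[m][n] = d
    return matrix


# Hypothetical toy alignment (not real fucosyltransferase sequences).
toy_alignment = {
    "seqA": "ATGGATCCCCTG",
    "seqB": "ATGGATCCTCTG",
    "seqC": "ATGGA---TTTG",
}
toy_matrix = distance_matrix(toy_alignment)
```

A matrix of this kind is the input that a least-squares tree-building method then fits branch lengths to; uncorrected distances underestimate divergence between distant sequences, which is one reason model-based corrections are normally preferred.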

RESULTS AND DISCUSSION
Chimpanzee Single Point Mutations-FUT3, FUT5, and FUT6 genes were cloned using the same PCR strategy previously described for human FUT3 (8), FUT5, and FUT6 (7). Coding sequences (cds) of the same length as the human ones were found in a single exon for FUT5 (1125 bp) and FUT6 (1080 bp), and a longer fragment was found for FUT3 (1119 instead of 1086 bp). A minibank was made with each of the chimpanzee genes, and 12 clones for each chimpanzee gene were sequenced on both strands in their entirety. The twelve chimpanzee FUT5 clones had only seventeen missense single base differences with the human FUT5, corresponding to 98.5% sequence identity between the FUT5 genes of the two species (Fig. 1).
We found 20 FUT6 single base differences in eight chimpanzee clones, corresponding to 98.2% sequence identity between the FUT6 genes of the two species (chimpanzee FUT6A allele, Fig. 2). In the remaining four clones of this chimpanzee, we found three additional single point base changes compared with FUT6A, suggesting the existence of a second allele, chimpanzee FUT6B, with 97.9% identity to human FUT6. These polymorphisms were C371T, C514G, and T575C, which induced the respective amino acid changes Pro124 → Leu, Gln172 → Glu, and Val192 → Ala.
Evidence for the existence of two different alleles was also found among the chimpanzee FUT3 clones. Half of the clones had 17 missense single base differences between chimpanzee and human FUT3, i.e. 98.4% sequence identity (chimpanzee FUT3A allele, Fig. 3), and the other half had three additional single base missense changes (chimpanzee FUT3B allele, with 98.2% identity to human FUT3). The differences were A484G, C822T, and G910A. The first and the last induce the amino acid changes Arg162 → Gly and Val304 → Met, while the second does not change the peptide sequence.
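The correspondence between the nucleotide positions and the amino acid numbers quoted above can be checked with a short sketch. It assumes that nucleotide 1 is the first base of the start codon and that the cds is read in frame; the function names are hypothetical.

```python
def codon_site(nt_position):
    """Map a 1-based cds nucleotide position to (codon number, position in codon).

    Both returned values are 1-based. Assumes nucleotide 1 is the first base
    of the start codon.
    """
    if nt_position < 1:
        raise ValueError("nucleotide positions are 1-based")
    index = nt_position - 1
    return index // 3 + 1, index % 3 + 1


def parse_change(label):
    """Split a label such as 'C371T' into (reference base, position, new base)."""
    return label[0], int(label[1:-1]), label[-1]


# Polymorphisms separating the A and B alleles, as listed in the text.
ALLELE_CHANGES = {
    "FUT6": ["C371T", "C514G", "T575C"],
    "FUT3": ["A484G", "C822T", "G910A"],
}

codon_table = {}
for gene, labels in ALLELE_CHANGES.items():
    for label in labels:
        _, position, _ = parse_change(label)
        codon_table[label] = codon_site(position)
```

Under this numbering, C371T, C514G, and T575C fall in codons 124, 172, and 192, and A484G and G910A in codons 162 and 304, in agreement with the amino acid changes listed. C822T falls on the third base of codon 274, a position where many substitutions are synonymous, which is consistent with the unchanged peptide sequence.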
Insertion of 33 Base Pairs in the Chimpanzee FUT3 Gene-Besides the single point differences, the FUT3 genes of human and chimpanzee differ in size; this difference is due to a single 33-base pair insert at position 136 of the chimpanzee gene (Fig. 3). This is similar to the previously described human FUT5 (1125 bp) which also has an insert of 42 bp with respect  (3). Indeed, the peptide alignments of the Fuc-TIII, Fuc-TV, and Fuc-TVI sequences confirmed that the Fuc-TV of both species have an insert of 14 amino acids lacking in the Fuc-TVI enzymes of both species, and the first 11 amino acids of this insert are also lacking in the human Fuc-TIII. In contrast, the chimpanzee Fuc-TIII has the 14-amino acid peptide segment with a sequence identical to the Fuc-TV sequence (Fig. 4). This difference could be due to either deletions in FUT6 and human FUT3 genes or insertions in FUT5 and chimpanzee FUT3 genes. However, similar gaps in the same area of all the other α(1,3)fucosyltransferase genes, bovine futb (6) and a mouse pseudogene homologous to the human FUT3 gene (18), and all the other cloned human and animal α(1,3)fucosyltransferase genes (FUT4 and FUT7) (19), suggest that the insertion scenario is more probable. An early insertion in the ancestor of the FUT5 gene before the separation of man and chimpanzee, followed by a recombination of the FUT5 and FUT3 genes in the chimpanzee, might be at the origin of this phenomenon. However, a single recombination is not enough, since three Fuc-TIII-specific amino acids (Ala6, Pro12, and Ala19) are present in the chimpanzee and human enzymes before the Fuc-TIII insertion, and 17 Fuc-TIII-specific amino acids plus a Fuc-TIII-specific deletion of two amino acids are present after the insertion in chimpanzee and human Fuc-TIII sequences (Fig. 4). Therefore, this requires either a double recombination in the chimpanzee or a similar insertion event occurring twice in evolution, first in the common FUT5 ancestor gene and then in the chimpanzee FUT3 gene.
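The insert sizes quoted in this section are all multiples of three, so they add whole codons without shifting the reading frame. A minimal sketch of that arithmetic, using only the lengths given in the text (the function and constant names are hypothetical):

```python
def codons_if_in_frame(length_bp):
    """Return the number of whole codons if length_bp is a multiple of 3, else None."""
    quotient, remainder = divmod(length_bp, 3)
    return quotient if remainder == 0 else None


# Lengths quoted in the text, in base pairs.
HUMAN_FUT3_CDS = 1086
CHIMP_FUT3_CDS = 1119
CHIMP_FUT3_INSERT = 33
FUT5_INSERT = 42

# The 33-bp insert accounts for the whole length difference between the two FUT3 cds.
length_difference = CHIMP_FUT3_CDS - HUMAN_FUT3_CDS

chimp_fut3_insert_codons = codons_if_in_frame(CHIMP_FUT3_INSERT)
fut5_insert_codons = codons_if_in_frame(FUT5_INSERT)
```

The 42-bp FUT5 insert corresponds to the 14-amino acid segment, and the 33-bp difference between chimpanzee and human FUT3 corresponds to the 11 amino acids of that segment that are lacking in human Fuc-TIII.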
Transfection Expression on COS-7 Cells-Since the genomic DNA of a single chimpanzee was studied, we could not know which allele was the most frequent or wild type. Therefore, in addition to the FUT5 constructs, both A and B alleles of FUT3 and FUT6 were expressed separately in COS-7 cells. For each type of gene, the comparison between human and chimpanzee constructs gave a similar acceptor specificity pattern of enzyme activity. Neither the observed evolutionary mutations nor the polymorphisms found between chimpanzee A and B alleles of FUT3 and FUT6 inactivated the enzymes or changed their substrate specificity pattern.
Cells transfected with chimpanzee and human genes expressed similar amounts of Leᵃ, sialyl-Leᵃ, Leˣ, and sialyl-Leˣ epitopes. However, the relative amounts of type-1 and type-2 antigens were different for the three genes. The COS-7 cells transfected with the two chimpanzee FUT3 alleles (A and B) and the human FUT3 expressed larger amounts of type-1 than type-2 epitopes, while the reverse was obtained for chimpanzee and human FUT5 constructs. A completely different acceptor specificity pattern was observed with human FUT6 and the two chimpanzee FUT6 alleles (A and B), which expressed only the type-2 epitopes, Leˣ and sialyl-Leˣ (Table II), like the bovine Futb enzyme (6).
Fucosyltransferase assays with type-1 and type-2 substrates confirmed the tendencies observed by immunofluorescence (Table III). Chimpanzee and human Fuc-TIII enzymes worked better on type-1 than on type-2 acceptors. Chimpanzee and human Fuc-TV enzymes worked better on type-2 than on type-1 acceptors, and chimpanzee and human Fuc-TVI enzymes worked only on type-2 acceptors, again similarly to the bovine Futb enzyme (6). However, in both FUT3 and FUT6, the products of the chimpanzee A alleles incorporated more fucose than the products of the B alleles.
Acceptor Specificity-The type-1/type-2 acceptor specificity of human Fuc-TIII, Fuc-TV, and Fuc-TVI has been shown to depend on the hypervariable region of the stem domain, based on domain swapping experiments between FUT3 and FUT5 (20) or between FUT3 and FUT6 genes (21). Eleven amino acids in subdomains 4 and 5 of this region were proposed to be crucial because they were unique to human Fuc-TVI (21) (Fig. 4). The cloning of the bovine (6) and chimpanzee genes now shows that three enzymes (human and chimpanzee Fuc-TVI and bovine Futb) have the same strict type-2 acceptor specificity pattern and make only Leˣ and sialyl-Leˣ (Table III). These three enzymes have only 4 of the 11 amino acids mentioned above in common (Arg, Glu, Val, and Gln) (Fig. 4). In these same four positions, the remaining four enzymes, which are able to work on both type-1 and type-2 acceptor substrates (human and chimpanzee Fuc-TIII and Fuc-TV), have three identical amino acids (Trp, Asp, Ile), and one position is specific for Fuc-TIII (Arg) or for Fuc-TV (Asn) (Fig. 4). It is interesting to note that the four Fuc-TIV and the two Fuc-TVII enzymes studied also had Arg at the first position of these four amino acids, a negatively charged amino acid in the second position (Glu for Fuc-TVII and Asp for Fuc-TIV) and a neutral amino acid (Leu in both Fuc-TIV and Fuc-TVII), and all the thirteen α(1,3)fucosyltransferases studied had two histidines (His) before these four amino acids. Histidine, carboxylates, or amines can participate in the hydrogen bond interactions (22) predicted to be essential for the reaction with the C-6 of Gal, and the C-3 of GlcNAc in type-2 acceptors or the C-4 of GlcNAc in type-1 acceptors, which are key polar gates of the fucosyltransferase-acceptor reaction (23,24).
Phylogeny-A phylogenetic tree of the first seven human fucosyltransferase genes cloned illustrated that the two main families of genes, the α(1,2) and the α(1,3)fucosyltransferases, may have evolved by successive duplications followed by divergent evolution of a single ancestral gene (25). Then, a duplication event in the α(1,2)fucosyltransferase branch gave rise to the H and secretor (Se) subfamilies of genes, and two duplication events in the α(1,3)fucosyltransferase branch gave rise to the leukocyte (leu), myeloid (mye), and Lewis (Lew) subfamilies of fucosyltransferase genes.
Subsequent addition of some mammalian fucosyltransferase genes to this tree (19) illustrated that the separation of the different mammalian species occurred after the duplication events at the origin of these five subfamilies of fucosyltransferase genes, since members of the five subfamilies were represented in the different mammalian species studied. In contrast, the duplication events at the origin of the present cluster of FUT5-FUT3-FUT6 genes occurred only in the Lewis subfamily and after the great mammalian radiation, since a single Lewis-like transcript with properties of each of the three human genes was found in the bovine species (6). Chicken FUTB (26) can also be the orthologous homologue of the putative ancestor of the Lewis gene (Fig. 5).
A side prediction of this evolutionary model was that anthropoid apes, which diverged from the common evolutionary trunk much later, might have FUT3, FUT5, and FUT6 genes homologous to the three human genes, provided that their separation from the common evolutionary trunk had occurred after the duplication events at the origin of the present FUT3, FUT5, and FUT6 genes. The present cloning of the three chimpanzee genes confirms this prediction and locates the duplications at the origin of FUT3, FUT5, and FUT6 genes between the great mammalian radiation (80 million years ago) and the separation of man and chimpanzee (10 million years ago) (Fig. 5).
FUT8-A new α(1,6)fucosyltransferase gene has been recently cloned in pig (27) and man (28). This gene, with ubiquitous tissue expression in the human species, should be called FUT8 since the previously cloned human fucosyltransferase genes were numbered FUT1 to FUT7 according to their sequential order of cloning. Addition of these FUT8 sequences to the phylogenetic tree (Fig. 5) suggests that this α(1,6)fucosyltransferase gene is the orthologous homologue of the ancestor of the α(1,2) and α(1,3)fucosyltransferases, and therefore, the duplication event at its origin must be the oldest fucosyltransferase duplication event known to date. This idea is in good agreement with the substrate specificity of the FUT8-encoded enzyme since it adds a fucose in α(1,6) linkage to the first GlcNAc residue, next to the peptide chain in N-glycans. Due to the general pathway of biosynthesis of oligosaccharides, by sequential addition of single or multiple saccharide blocks, the appearance of the FUT8 natural substrate preceded the appearance of the external polylactosamine branches, which are the natural substrates of the other fucosyltransferases in N-glycans.
Analysis of GenBank™/EBI and EST sequence data suggests that the FUT8 gene is better represented among the available EST than the other fucosyltransferase genes since 37 human EST plus 3 mouse EST of FUT8 were found, whereas only between 0 and 4 EST have been found for each of the other fucosyltransferase genes. Therefore, the FUT8 gene has a higher expression than the other fucosyltransferases.
A FASTA search of GenBank identified 15 full EST, and 5 large portions of EST from retina, as part of the human FUT8 transcript (Accession Number D89289). The local alignment of these last five retina EST sequences with D89289 provided evidence for the existence of a second small exon, identical in the five EST, corresponding to positions 770-835 of D89289 and defining an alternative splice with the splice consensus motif AG/GT (Fig. 6). The discovery of this alternative splice allowed us to establish the link between a large unidentified contig of 25 EST containing the poly(A), from the UniGene data base, and the 20 EST contig found by homology with the human FUT8 sequence, since eight of the EST were common to both contigs (Fig. 6).
A consensus sequence was defined for this new retina alternative splice, which can be used as a working tool for cloning, but the definitive sequence will have to await cloning and expression experiments. The presence of sequence errors in EST makes the consensus sequences of segments with three or fewer overlapping EST unreliable, and the retina EST sequenced to date do not cover a stretch of 321 bp between position 1703 (end of the AA019285 EST) and 2024 (start of the AA015928 EST), which is expected to contain the stop signal (Fig. 6).
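The two reliability criteria mentioned here, uncovered stretches and segments covered by three or fewer EST, can be located from the EST coordinates with a simple depth count. The sketch below is illustrative only: the function names are hypothetical, and the interval list is invented apart from the two boundary positions quoted in the text (an EST ending at 1703 and another starting at 2024).

```python
def coverage_depth(intervals, start, end):
    """Count how many intervals cover each position from start to end inclusive.

    Intervals are (first, last) pairs with 1-based inclusive coordinates.
    """
    depth = [0] * (end - start + 1)
    for first, last in intervals:
        for pos in range(max(first, start), min(last, end) + 1):
            depth[pos - start] += 1
    return depth


def runs_at_or_below(depth, start, threshold):
    """Return (first, last) runs of positions whose depth is <= threshold."""
    runs = []
    run_start = None
    for offset, d in enumerate(depth):
        if d <= threshold:
            if run_start is None:
                run_start = start + offset
        elif run_start is not None:
            runs.append((run_start, start + offset - 1))
            run_start = None
    if run_start is not None:
        runs.append((run_start, start + len(depth) - 1))
    return runs


# Hypothetical EST coordinates; only the end at 1703 and the start at 2024
# are taken from the text.
toy_ests = [
    (1500, 1703),
    (1550, 1700),
    (1600, 1703),
    (1620, 1690),
    (2024, 2200),
    (2030, 2200),
]
depth = coverage_depth(toy_ests, 1500, 2200)
uncovered = runs_at_or_below(depth, 1500, 0)
weak = runs_at_or_below(depth, 1500, 3)
```

With these toy coordinates the uncovered run is 1704 to 2023. The 321 bp quoted in the text is the difference 2024 − 1703 between the two boundary positions; counting only the positions strictly between them gives 320.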
The genetic distances among human, pig, and mouse FUT8 genes (5-10% difference) are smaller than the genetic distances of the other fucosyltransferase genes among mammals (15-20% differences), suggesting that FUT8 might diverge more slowly than the other mammalian fucosyltransferase genes. However, the existence of conserved amino acid positions, a common donor substrate, and common folding features in α(1,2) and α(1,3)fucosyltransferases (31), plus the recent finding of conserved peptide motifs for the families of α(1,6) and α(1,2)fucosyltransferases, and common peptide motifs for the α(1,3)fucosyltransferase genes (32), strongly suggest that all these genes have a common origin and have evolved by successive duplications followed by translocations and divergent evolution.
Addendum-After submission of the present paper, a complete rat FUTA AB006137 (homologous to human FUT1), rat FUTB AB006138 (closer to human FUT2 than to human Sec1), chimpanzee AB006612, gorilla AB006611, pongo AB006610, hylobates AB006609 (homologous to human Sec1), and hamster U78737 (homologous to the bovine futb gene and to the ancestor of primate FUT3, FUT5, and FUT6 genes) have appeared in GenBank™/EBI. These new sequences fit well with the proposed evolutionary model.
FIG. 5. Combined phylogenetic tree of 9 paralogous human fucosyltransferase genes (bold type), one human pseudogene (Sec1), and 21 orthologous vertebrate genes. Relative genetic distances were estimated from the differences observed among the cds. The divergent points of α(1,6)fucosyltransferase, α(1,2)fucosyltransferase, and α(1,3/4)fucosyltransferase genes preceded the great mammalian radiation, while the divergent points of FUT3, FUT5, and FUT6 preceded only the separation of man and chimpanzee. The relative tree positions of the chicken FUTB gene and the bovine futb gene suggest that they are orthologous homologues of the ancestor of the Lewis gene, which gave rise by two successive duplications to the present cluster of human FUT3, FUT5, and FUT6 genes. Consequently, these species cannot have separate genes homologous to FUT3, FUT5, and FUT6. In contrast, the chimpanzee has at least three α(1,3)fucosyltransferase genes, and each one is the orthologous homologue of the corresponding FUT3, FUT5, and FUT6 human genes.
FIG. 6. Twenty EST found by a FASTA search of GenBank with the human FUT8 transcript (28) and 25 overlapping EST from an unidentified contig of the UniGene data bank. The twenty human FUT8 EST found by FASTA had between 84 and 94% sequence identity with the pig FUT8 transcript (27). The right column identifies the tissue origin of each EST and suggests ubiquitous expression of the FUT8 transcript, but the shaded rectangles suggest the existence of an isoform with an alternative splice, with an intron of 419 bp between positions 836 and 1255, expressed only in retina. Two EST were identified as chromosome 14 markers D14S842 and D14S666E by two independent teams; both were mapped to 14q23. All EST are located relative to the human sequence (D89289), number 1 being the beginning and number 1728 the end of the human cds.