Cloning and Expression of a Proteoglycan UDP-Galactose:β-Xylose β1,4-Galactosyltransferase I

A seventh member of the human β4-galactosyltransferase family, β4Gal-T7, was identified by BLAST analysis of expressed sequence tags. The coding region of β4Gal-T7 predicts a type II transmembrane protein with sequence similarity to β4-galactosyltransferases, but the sequence was distinct in known motifs and did not contain the cysteine residues conserved in the other six members of the β4Gal-T family. The genomic organization of β4Gal-T7 was different from previous β4Gal-Ts. Expression of β4Gal-T7 in insect cells showed that the gene product had β1,4-galactosyltransferase activity with β-xylosides, and the linkage formed was Galβ1–4Xyl. Thus, β4Gal-T7 represents galactosyltransferase I enzyme (xylosylprotein β1,4-galactosyltransferase; EC 2.4.1.133), which attaches the first galactose in the proteoglycan linkage region GlcAβ1–3Galβ1–3Galβ1–4Xylβ1-O-Ser. Sequence analysis of β4Gal-T7 from a fibroblast cell line of a patient with a progeroid syndrome and signs of the Ehlers-Danlos syndrome, previously shown to exhibit reduced galactosyltransferase I activity (Quentin, E., Gladen, A., Rodén, L., and Kresse, H. (1990) Proc. Natl. Acad. Sci. U. S. A. 87, 1342–1346), revealed two inherited allelic variants, β4Gal-T7 186D and β4Gal-T7 206P, each with a single missense substitution in the putative catalytic domain of the enzyme. β4Gal-T7 186D exhibited a 4-fold elevated Km for the donor substrate, whereas essentially no activity was demonstrated with β4Gal-T7 206P. Molecular cloning of β4Gal-T7 should facilitate general studies of its pathogenic role in progeroid syndromes and connective tissue disorders with affected proteoglycan biosynthesis.

These six β4Gal-Ts catalyze biosynthesis of Galβ1-4GlcNAc and/or Galβ1-4Glc linkages in different glycoconjugates and free saccharides (for a review see Ref. 8). The six β4Gal-Ts have highly conserved sequence motifs in the putative catalytic domain, including four conserved cysteine residues. The genomic organization of the first four genes is similar and includes conservation of spacing for five intron/exon boundaries in the coding regions (4, 7, 9, 10). This suggests that these genes arose late in evolutionary terms as a result of gene duplication and subsequent sequence divergence. Detailed analysis of the kinetic properties of these enzymes clearly shows that each has a distinct function in biosynthesis of different glycoconjugates and saccharide structures, but in accordance with their close evolutionary relationships the linkages formed are similar.
Quentin-Hoffmann et al. (14, 15) showed that partial inactivation of galactosyltransferase I represented the primary defect in one patient with progeroidal appearance and symptoms of the Ehlers-Danlos syndrome. As a consequence of the enzyme deficiency, only about half of the core proteins of the small proteoglycans decorin and biglycan were linked with glycosaminoglycan chains (16), whereas no abnormality in the biosynthesis of large dermatan sulfate proteoglycans and of heparan sulfate proteoglycans could be observed (14). We sequenced the β4Gal-T7 coding region of DNA from fibroblasts established from this patient and his family. Two alleles, β4Gal-T7 186D and β4Gal-T7 206P, were identified in the affected patient, and the two alleles were shown to be derived from different parents. Expression of the variant alleles showed that one exhibited a significantly higher Km for the donor substrate and the other was inactive. The results show that β4Gal-T7 represents one galactosyltransferase I that is involved in proteoglycan synthesis. Identification of the molecular basis of the genetic defect in the progeroid patient with signs of the Ehlers-Danlos syndrome opens the possibility of further studies on the role of this gene in progeroid syndromes and connective tissue disorders.

EXPERIMENTAL PROCEDURES
Identification and Cloning of β4Gal-T7 - tBLASTn analysis of the dbEST data base at the National Center for Biotechnology Information (National Institutes of Health, Bethesda, MD) with sequences from human β4Gal-T1 to β4Gal-T6, performed as described previously (4, 8), revealed several ESTs covering a total of 931 base pairs of the 3′ coding sequence of β4Gal-T7. Additional sequence was obtained by 5′-rapid amplification of cDNA ends with a fetal brain cDNA library (CLONTECH) using antisense primer EBER1218 (5′-CTGAAGTGGTCCACCTGGTTG-3′) and sequencing on PAC genomic DNA. ESTs from β4Gal-T7 are represented in two Unigene clusters, Hs.54702 and Hs.45208, where the latter originates from priming in the second intron of the five introns identified in β4Gal-T7. Hs.45208 was mapped to 5q35.1-5q35.3 and was flanked by D5S498 and D5S408 (stSG40105, 184.7-195.8 cM). The complete compiled cDNA sequence was confirmed by sequencing of a PAC genomic DNA clone. A human PAC genomic library (Genome Systems) was screened using the primer pair EBER1207 (5′-CAGAGAACGGGTCTGTCACAGG-3′) and EBER1215 (5′-GATGTGGTGCCGGATCTTCTT-3′). Three clones for β4Gal-T7 (99/C24, 143/B4, and 222/H10) were obtained from Genome Systems Inc. Intron/exon boundaries were determined by comparison with the cDNA sequence, optimizing for the gt/ag rule (17).
Expression of β4Gal-T7 in Insect Cells - Expression constructs designed to encode the full coding sequence and a secreted construct encoding amino acid residues 63-327 of β4Gal-T7 were prepared by reverse transcription-PCR with fetal brain mRNA. Products were cloned initially into pBluescript KS+ (Stratagene) and subsequently into pVL1393 or pAcGP67 (Pharmingen). Expression constructs of β4Gal-T7 variants were prepared similarly using mRNA from fibroblasts of a patient with galactosyltransferase I deficiency (14). Plasmids pVL-β4Gal-T7-full and pAcGP67-β4Gal-T7-sol were co-transfected with Baculo-Gold™ DNA (Pharmingen) and virus amplified as described previously (18). The kinetic properties were determined with the secreted enzyme expressed in Sf9 or High Five™ cells. Purification of the secreted enzyme from High Five cells was performed by consecutive chromatographic steps on DEAE or Amberlite and S-Sepharose as described previously (19). Standard assays were performed in 50-μl reaction mixtures containing 25 mM cacodylate (pH 7.0), 40 mM MnCl2, 0.25% Triton X-100, 100 μM UDP-[14C]Gal (2,000 cpm/nmol) (Amersham Pharmacia Biotech), and the indicated concentrations of acceptor substrates (Sigma and Dextra Laboratories Ltd.) (see Table I for structures). The full-length construct was assayed with 1% Triton X-100 homogenates of cells washed twice in phosphate-buffered saline. Assays for determination of Km for the acceptor substrates were performed with semi-purified enzyme in the standard reaction mixture modified to include 200 μM UDP-[14C]Gal for β4Gal-T7 and 400 μM for β4Gal-T7 186D. Assays for donor substrate Km were performed with 2.0 mM MeUmb-β-Xylose.
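The conversion from measured radioactivity to transferred galactose implied by the standard assay can be sketched as follows. This is a minimal illustration, not the authors' data reduction: the function names and the example counts are hypothetical, and only the specific activity (2,000 cpm/nmol), the donor concentration (100 μM), and the reaction volume (50 μl) are taken from the assay description.

```python
# Minimal sketch: converting scintillation counts to nmol of Gal transferred.
# Only the three default values below come from the text; everything else
# (function names, example counts) is an illustrative assumption.

SPECIFIC_ACTIVITY_CPM_PER_NMOL = 2000.0  # UDP-[14C]Gal, from the assay description
DONOR_CONC_UM = 100.0                    # UDP-Gal concentration in the standard assay
REACTION_VOLUME_UL = 50.0                # reaction volume in the standard assay


def nmol_transferred(sample_cpm, blank_cpm,
                     specific_activity=SPECIFIC_ACTIVITY_CPM_PER_NMOL):
    """Return nmol of Gal incorporated into product, after blank subtraction."""
    net_cpm = max(sample_cpm - blank_cpm, 0.0)
    return net_cpm / specific_activity


def donor_nmol_in_assay(conc_um=DONOR_CONC_UM, volume_ul=REACTION_VOLUME_UL):
    """Total donor substrate in the reaction (uM x ul = pmol; /1000 gives nmol)."""
    return conc_um * volume_ul / 1000.0


def fraction_donor_consumed(sample_cpm, blank_cpm):
    """Fraction of the donor pool converted; useful to check initial-rate conditions."""
    return nmol_transferred(sample_cpm, blank_cpm) / donor_nmol_in_assay()


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the functions.
    print(nmol_transferred(1200.0, 200.0))
    print(donor_nmol_in_assay())
    print(fraction_donor_consumed(1200.0, 200.0))
```

Under these conditions the reaction contains 5 nmol of donor, so the fraction consumed gives a quick indication of whether a measurement still reflects initial-rate conditions.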
For product characterization 5 mg of MeUmb-β-Xylose were glycosylated to completion with semipurified β4Gal-T7, using thin layer chromatography to monitor reaction progress. The reaction product was purified on an octadecyl-silica cartridge (Bakerbond, J. T. Baker), deuterium exchanged by repeated lyophilization from D2O, and then dissolved in 0.5 ml of D2O for NMR analysis. One-dimensional 1H NMR, two-dimensional 1H-1H TOCSY, and 1H-detected, 13C-decoupled, phase-sensitive gradient 13C-1H HSQC and HMBC experiments were performed as described previously (Ref. 20 and references cited therein) on a Varian Unity Inova 600 MHz spectrometer using standard acquisition software available in the Varian VNMR software package. One-dimensional reference 13C NMR spectra were acquired using direct detection on a Varian Unity Inova 500 MHz spectrometer. A saturated solution of Xylβ1→7MU was prepared for NMR analysis in a similar manner, and spectra were acquired under identical conditions for comparison. Chemical shifts are referenced to internal acetone (2.225 and 30.00 ppm for 1H and 13C, respectively).
Northern Analysis - The cDNA fragment of soluble β4Gal-T7 was used as a probe. The probe was labeled by random priming using [α-32P]dCTP and a Strip-EZ DNA labeling kit (Ambion). A multiple human tissue Northern blot, MTN I (CLONTECH), was probed as described previously (7).
Monoclonal Antibody - A purified secreted form of β4Gal-T7 was used for immunizing BALB/c mice, and monoclonal antibodies were selected and characterized by immunocytology on Sf9 cells infected with various β4Gal-transferase expression constructs, as described previously (21). The specificity was further evaluated by SDS-polyacrylamide gel electrophoresis Western blot analysis using precast 8-25% gradient gels and the Phast system™ (Amersham Pharmacia Biotech).
Analysis of β4Gal-T7 Gene in a Family with a Genetic Defect in Galactosyltransferase I - Skin fibroblast cell lines from one affected patient, the parents, and two siblings were established and grown as described previously (14). mRNA was isolated, and reverse transcription-PCR products were directly sequenced and/or subcloned and sequenced. The identified sequence variations were confirmed by direct sequencing of PCR products obtained from genomic DNA. One missense mutation (557C→A) was also identified by restriction digestion (HinfI) of a PCR product from the mutant allele.
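Genotyping by restriction digestion relies on a point mutation creating or abolishing an enzyme recognition site. The sketch below scans a sequence for HinfI sites (recognition sequence GANTC). The two short sequences are invented for illustration only and are not taken from the β4Gal-T7 gene; the text does not state whether the 557C→A change creates or destroys a site, so the example simply shows how a single C→A change can alter the site pattern.

```python
# Minimal sketch: locating HinfI recognition sites (GANTC) in a DNA sequence.
# The example sequences are hypothetical and NOT from the beta4Gal-T7 gene.
import re

# Lookahead so that overlapping sites would also be reported.
HINFI_SITE = re.compile(r"(?=GA[ACGT]TC)")


def hinfi_sites(sequence):
    """Return 0-based start positions of every GANTC site in the sequence."""
    return [match.start() for match in HINFI_SITE.finditer(sequence.upper())]


def differs_by_digest(wild_type, variant):
    """True if the two alleles would give different HinfI site patterns."""
    return hinfi_sites(wild_type) != hinfi_sites(variant)


if __name__ == "__main__":
    wild_type = "TTGGCCTCAA"  # hypothetical fragment, no HinfI site
    variant = "TTGGACTCAA"    # same fragment with one C->A change (GACTC)
    print(hinfi_sites(wild_type))
    print(hinfi_sites(variant))
    print(differs_by_digest(wild_type, variant))
```

In practice the fragment sizes on a gel, not the site positions themselves, distinguish the alleles; the positions returned here would be used to predict those sizes.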

RESULTS
Identification and Cloning of Human β4Gal-T7 - The EST cloning strategy produced a novel gene with an open reading frame of 327 amino acids exhibiting sequence similarity to members of the β4Gal-T gene family. The predicted coding region of β4Gal-T7 has a single initiation codon in agreement with Kozak's rule (22), which precedes a sequence encoding a potential hydrophobic transmembrane segment (Fig. 1A) (DNA sequence available in GenBank™). β4Gal-T7 is predicted to be a type II transmembrane glycoprotein with an N-terminal cytoplasmic domain of 28 residues, a transmembrane segment of 30 residues, and a stem region and catalytic domain of 269 residues. The calculated molecular weight of the protein derived from the full coding sequence is 37,404, and that of proteolytically cleaved secreted forms is predicted to be less than 31,065 (calculated from Arg59 immediately after the hydrophobic transmembrane signal sequence). Multiple sequence alignment of the seven human β4Gal-transferases (Fig. 1A) shows that the sequence of β4Gal-T7 is distinct from other β4Gal-Ts in two potentially significant regions: the major conserved sequence motif (WGWGG/REDDD/E) is not conserved in two positions, and none of the four cysteine residues conserved among the first six β4Gal-Ts are conserved in β4Gal-T7. β4Gal-T7 has a single N-linked glycosylation consensus site, which is different from a site conserved among β4Gal-T2 to β4Gal-T6.
Expression of β4Gal-T7 - Expression of full coding or secreted constructs of β4Gal-T7 in insect cells resulted in a marked increase in galactosyltransferase activity with a number of βXyl-containing acceptor substrates, compared with uninfected cells or cells infected with a control construct (Table I). The best acceptor substrate identified was β-MeUmb-Xyl, for which the Km was estimated at 0.89 ± 0.29 mM. The Km for UDP-Gal was 56 ± 12 μM using β-MeUmb-Xyl as an acceptor. Low activity was observed with βGlcNAc acceptors, but there was no activity with other mono- or disaccharide substrates tested. Structural characterization by 1H NMR of the product formed with β-MeUmb-Xyl showed that β4Gal-T7 forms the Galβ1-4Xylβ1-R linkage (Table II and Fig. 2). Comparison of a one-dimensional 1H NMR spectrum of the product (Fig. 2A) with that of substrate obtained under similar conditions (in D2O, 25°C; not shown) clearly showed an additional H-1 resonance (4.512 ppm) from a sugar residue linked in the β-configuration (3J1,2 = 7-9 Hz). We did not find NMR data for the specific expected product in the literature or in glycoconjugate NMR data bases. The substantial anisotropic effects of the 4-methylumbelliferyl group obviate direct comparison of chemical shift data with those of other glycosides, e.g. linked to L-serine (23). Thus, a de novo sequence analysis of the product was undertaken by consecutive application of two-dimensional 1H-1H TOCSY, 1H-13C HSQC, and 1H-13C HMBC NMR experiments (for a review of this strategy, see Ref. 24). Although 1H chemical shift data were available for the Xylβ1→7MeUmb substrate (25), these were acquired in dimethyl sulfoxide-d6, which is known to alter proton shifts when compared with D2O solutions (26). Therefore, spectral data for direct comparison with the product were derived from an additional series of NMR experiments on the substrate glycoside dissolved in D2O.
Thus, all 1H and 13C resonances were uniquely and unambiguously assigned by the TOCSY and HSQC experiments (Table II). The H-5ax and H-5eq resonances for β-Xyl were assigned on the basis of their 3J4,5 coupling constants; the trends for 1H resonances and coupling constants were similar to those observed for the corresponding L-serine glycosides (24); and the linkages were unambiguously established by observation of interglycosidic H1-C1-O1-Cx and C1-O1-Cx-Hx correlations in the HMBC spectrum. As shown in Fig. 2B, evidence of the newly formed Galβ1→4Xylβ linkage in the product is clearly demonstrated by strong cross-peaks correlating β-Gal H-1 (4.512 ppm) with β-Xyl C-4 (76.02 ppm) and the corresponding β-Gal C-1 (101.60 ppm) with the β-Xyl H-4 resonance (3.972 ppm). Because β-Xyl H-4 is completely resolved and unique in its coupling pattern with H-3, H-5eq, and H-5ax and because there are no instances of strong coupling between any of the β-Xyl ring proton resonances that might otherwise create uncertainty in their assignments, the latter correlation in particular renders assignment of the linkage unambiguous. Consistent with this, the β-Xyl C-4 resonance shows a substantial downfield glycosylation-induced shift (Δδ = 7.27 ppm) that is unique when comparing product to substrate, whereas C-3 and C-5 exhibit small upfield shift changes, as expected (27). All of the proton resonances of the β-Xyl residue exhibit downfield glycosylation-induced shifts; that of H-4 was the largest (Δδ = 0.231 ppm). These results confirm the linkage structure of the product as β1-4.
FIG. 1. A, multiple sequence analysis (ClustalW) of human β4Gal-Ts and two C. elegans homologues. Introduced gaps are shown as hyphens, and aligned identical residues are boxed (black for all sequences, dark gray for eight and seven sequences, and light gray for six and five sequences). The putative transmembrane domain of β4Gal-T7 is underlined with a single line. The amino acid substitutions in β4Gal-T7 186D and β4Gal-T7 206P are indicated above the β4Gal-T7 sequence. Positions of intron/exon boundaries in β4Gal-T7 are indicated by solid arrows below the amino acid sequence, and the conserved boundaries in β4Gal-T1 to β4Gal-T4 are indicated by solid arrows below the amino acid sequence of β4Gal-T1. Intron/exon boundaries in the two C. elegans homologues are indicated by solid arrows below the respective amino acid sequences. The genomic organizations of β4Gal-T5 and β4Gal-T6 are not completed. B, phylogenetic tree of human and two C. elegans β4Gal-T homologues. The phylogenetic tree (unrooted) was produced with DNASIS software (Hitachi) based on the ClustalW multiple sequence alignment analysis presented in A using the full coding sequences of the nine genes.
Expression Pattern of β4Gal-T7 - Northern analysis with mRNA from eight human adult organs showed a ubiquitous pattern of expression for β4Gal-T7 (Fig. 3). The transcript size of β4Gal-T7 was approximately 2 kilobases.
β4Gal-T7 Variants in a Patient with a Defect in Galactosyltransferase I and Proteoglycan Biosynthesis - Sequence analysis of β4Gal-T7 mRNA and DNA in a patient with defective galactosyltransferase I activity revealed two missense mutations in coding exon III, C557→A and T617→C, which result in the amino acid changes Ala186→Asp and Leu206→Pro, respectively (Figs. 1 and 4). A genotyping strategy involving selective restriction digestion with HinfI of a PCR product confirmed the C557→A mutation (not shown). Both substitutions are in the putative catalytic domain. The Ala186→Asp substitution results in introduction of an acidic residue in a fairly conserved position among β4Gal-Ts (Ala/Val/Ser) (Fig. 1A). The Leu206→Pro substitution results in a nonconservative change of a fully conserved Leu residue (Fig. 1A). As shown in Fig. 4, sequence analysis of the patient's family confirmed that the mother was heterozygous for the Ala186→Asp allele and the father heterozygous for the Leu206→Pro allele. Two siblings were also heterozygous for one or the other variant allele. The sibling with the Leu206→Pro allele was previously judged to be heterozygous based on analysis of galactosyltransferase I activity (H. Kresse, unpublished observation). Expression of the secreted forms of the variant alleles revealed that β4Gal-T7 186D was active (Table I), but the Km for the donor substrate was elevated (Km 230 ± 64 μM). The Km for the acceptor MeUmb-β-Xyl was comparable with wild type (Km 0.54 ± 0.10 mM). In contrast, expression of the β4Gal-T7 206P variant did not result in significant activity in the supernatant or in cell extracts of infected insect cells (Table I). SDS-polyacrylamide gel electrophoresis Western blot analysis with a monoclonal antibody to human β4Gal-T7 confirmed that proteins of appropriate sizes were expressed in all cases (Fig. 5).
The monoclonal antibody, URH1(2C3) (IgG1), reacted specifically with cells infected with full coding or secreted constructs of β4Gal-T7 by immunocytology (not shown), and in Western blot analysis only one band corresponding to a molecular weight of approximately 35,000 was detected in extracts of insect cells if these were infected with β4Gal-T7 constructs (Fig. 5).
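The correspondence between the nucleotide changes and the amino acid substitutions can be checked with a short calculation. The sketch below assumes that nucleotide numbering starts at the A of the initiation codon (an assumption, not stated in the text) and uses the standard genetic code; the function names are illustrative. Under that assumption both changes fall on the second base of codons 186 and 206, and a second-position C→A in an Ala codon or T→C in a Leu codon can give Asp and Pro, respectively.

```python
# Minimal sketch: mapping coding-sequence nucleotide positions to codons and
# listing what a single-base substitution can do to a given residue.
# Assumption: nucleotide 1 is the A of the ATG initiation codon.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_position(nucleotide):
    """Return (codon number, position within codon), both 1-based."""
    return (nucleotide - 1) // 3 + 1, (nucleotide - 1) % 3 + 1


def substitution_outcomes(residue, pos_in_codon, ref, alt):
    """For each codon of `residue` carrying `ref` at the given position,
    return the mutated codon and the amino acid it encodes."""
    outcomes = {}
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid == residue and codon[pos_in_codon - 1] == ref:
            mutated = codon[:pos_in_codon - 1] + alt + codon[pos_in_codon:]
            outcomes[codon] = (mutated, CODON_TABLE[mutated])
    return outcomes


if __name__ == "__main__":
    print(codon_position(557), codon_position(617))
    print(substitution_outcomes("A", 2, "C", "A"))  # Ala codons, C->A
    print(substitution_outcomes("L", 2, "T", "C"))  # Leu codons, T->C
```

The enumeration also shows which codons are compatible with the reported changes: Ala to Asp requires GCT or GCC, and Leu to Pro requires one of the four CTN codons.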

DISCUSSION
The β4Gal-T7 gene characterized in this report encodes a β4-galactosyltransferase with galactosyltransferase I activity. Such an enzyme is required for the synthesis of the tetrasaccharide linkage region of proteoglycans (11-13, 28). The finding that two allelic variants of the β4Gal-T7 gene in a patient previously shown to exhibit a defect in galactosyltransferase I activity and in the biosynthesis of small chondroitin/dermatan sulfate proteoglycans had reduced or impaired functions strongly indicates that β4Gal-T7 is one of potentially multiple galactosyltransferase I genes. Furthermore, β4Gal-T7 is the galactosyltransferase I gene that is functionally important for small proteoglycan biosynthesis in skin fibroblasts.
Given the number of genes encoding β4-galactosyltransferases transferring galactose to βGlcNAc and βGlc and considering the multitude of enzymes involved in catalyzing specific steps in the biosynthesis of glycosaminoglycans (e.g. glucosamine:3-O-sulfotransferases (29)), we hypothesize that additional βXyl β4-galactosyltransferase genes exist. This is in agreement with the finding that in the patient fibroblasts no abnormality in the biosynthesis of versican and of heparan sulfate proteoglycans had been found (14). However, it is also possible that only a single galactosyltransferase I gene exists and that the active mutant allele, β4Gal-T7 186D, has differential catalytic efficiency for transferring galactose to proteoglycans with different densities of glycosaminoglycan attachment sites. Independent studies on chemically mutagenized Chinese hamster ovary cells are in agreement with the proposal that either only a single galactosyltransferase I exists or that, alternatively, a single auxiliary protein for these enzymes is required. Interestingly, the mutant Chinese hamster ovary cells, which exhibited only about 2% of the normal level of galactosyltransferase I activity, could prime glycosaminoglycan synthesis on exogenously added β-xylosides (30), although this activity may stem from βGlc(NAc) β4Gal-Ts (7, 31).
3 H. Kresse, unpublished observation.
Table I, footnote b: Enzyme source was 1% Triton X-100 homogenates of approximately 4 × 10^5 infected Sf9 cells harvested at day 3 (the total extraction volume was 100 μl).
FIG. 2. A, sections of a one-dimensional 1H NMR spectrum of the β4Gal-T7 product, Galβ1→4Xylβ1→7MU, showing all nonexchangeable monosaccharide ring methine and exocyclic methylene resonances. Residue designations for β-Gal (Galβ4-) and β-Xyl (Xylβ-) are followed by proton designations (1-6 and 6′ and 1-5ax/eq, for the two residues, respectively). B, section of the 1H-detected 1H-13C HMBC spectrum showing interglycosidic H1-C1-O1-Cx and C1-O1-Cx-Hx correlations. Cross-peaks are marked by ovals or cross-hairs. The unmarked cross-peaks are all intraresidue correlations.
The catalytic properties of β4Gal-T7 resemble those of a partially purified galactosyltransferase I activity derived from embryonic chick cartilage (11). Both enzyme activities showed a Km for UDP-Gal of approximately 50 μM. Analysis of Km for the donor substrate with cell extracts appears to give a higher Km of 100 μM (12) or 170 μM (14). Apparent Km values for different acceptor substrates tested vary around 0.5-6 mM (14). The properties of the recombinant allelic variants of β4Gal-T7, β4Gal-T7 186D and β4Gal-T7 206P, were not in agreement with the properties of the galactosyltransferase I activity measured in extracts of fibroblasts from a patient with galactosyltransferase I deficiency (14). The catalytically active allele, β4Gal-T7 186D, exhibited an approximately 2-fold lower Km for the acceptor substrate MeUmb-β-Xyl, whereas the extract of patient cells had a 2-fold higher Km for xylosyl-serine compared with control cells. Furthermore, β4Gal-T7 186D had a 4-fold higher Km for the donor substrate, whereas extracts exhibited an almost 7-fold lower Km than control cells. Additionally, only about 30-70% of the secreted decorin was devoid of the glycosaminoglycan chain, whereas the mature proteoglycan even contained longer dermatan sulfate chains. These discrepancies are likely related in part to experimental variation. Analysis of glycosyltransferase activities in extracts may be influenced by a number of factors. It is possible that multiple enzymes catalyze formation of the same linkage with different kinetic properties. Other factors in extracts may interfere or bind substrates. Furthermore, the properties of recombinant secreted forms of the enzyme may be different from those found with the transmembrane enzyme in cell homogenates. The total activity assessed in the patient fibroblasts was approximately 5% compared with controls, whereas both parents showed 50% reduction in activities.
The 5% activity measured in the patient is likely to originate from the β4Gal-T7 186D allele. The recombinant form of this allele was comparatively more active (Table I), but its poorer kinetic properties, potentially combined with a lower stability in cells or extracts (14), may account for the reduction in activities observed in the patient and parents. One unexplained observation from the study of Quentin et al. (14) was that the β3-galactosyltransferase II activity forming the Galβ1-3Galβ1-4Xylβ1-O-Ser structure was also reduced in the patient and parents. Further studies of the in vivo functions of the allelic variants of β4Gal-T7 are required to fully assess their functions in proteoglycan biosynthesis.
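The effect of the elevated donor Km can be made concrete with the Michaelis-Menten relation v/Vmax = [S]/(Km + [S]). The sketch below uses the donor Km values reported in Results (56 μM for wild type, 230 μM for β4Gal-T7 186D) and the donor concentrations used in the assays; it assumes simple Michaelis-Menten behavior with respect to UDP-Gal at a fixed acceptor concentration, which is an idealization, and it says nothing about the relative Vmax of the two enzymes.

```python
# Minimal sketch: fractional saturation of wild-type and variant enzyme with
# UDP-Gal, assuming simple Michaelis-Menten kinetics for the donor substrate.
# Km values are the means reported in the text; error ranges are ignored here.

KM_DONOR_UM = {
    "wild type": 56.0,  # Km for UDP-Gal, uM
    "186D": 230.0,      # Km for UDP-Gal of the variant, uM
}


def fractional_velocity(substrate_um, km_um):
    """Return v/Vmax for a substrate concentration and Km in the same units."""
    return substrate_um / (km_um + substrate_um)


def km_fold_change(variant="186D", reference="wild type"):
    """Ratio of variant Km to reference Km."""
    return KM_DONOR_UM[variant] / KM_DONOR_UM[reference]


if __name__ == "__main__":
    print(f"Km fold change: {km_fold_change():.1f}")
    for donor_um in (100.0, 200.0, 400.0):  # donor concentrations used in the assays
        for name, km in KM_DONOR_UM.items():
            fraction = fractional_velocity(donor_um, km)
            print(f"{donor_um:5.0f} uM UDP-Gal, {name:9s}: v/Vmax = {fraction:.2f}")
```

At the 100 μM donor concentration of the standard assay the variant operates at roughly half the fractional saturation of the wild-type enzyme, which illustrates why higher donor concentrations were used when measuring acceptor Km for β4Gal-T7 186D.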
The identified mutations in β4Gal-T7 are in the putative catalytic domain and involve residues that are partly or fully conserved among members of the β4Gal-T gene family (Fig. 1A). The effects of these substitutions are in agreement with predictions based on x-ray crystallography data on the catalytic unit of β4Gal-T1. 4  segment in β4Gal-T1 that is included in the protein core. The adjacent Pro and His residues at positions 182 and 184 in β4Gal-T7 are strictly conserved among all human β4Gal-T sequences. Ala186 of β4Gal-T7 corresponds to Ser274 in β4Gal-T1. Ser274 is in a region close (less than 5 Å) to the UDP-Gal binding site and may interact with it, which could explain the poorer kinetic properties of the β4Gal-T7 186D variant. The Gly201-Gly202 residues adjacent to Leu206 are strictly conserved among β4Gal-T sequences, and they are included in the catalytic pocket of β4Gal-T1. Leu206 is also strictly conserved in all sequences, and the corresponding Leu296 in β4Gal-T1 is included in a network of hydrophobic interactions in the protein core with the aromatic residues Phe301, Phe307, and Phe290 and with Leu325. The Pro206 substitution in β4Gal-T7 206P is predicted to markedly change the fold of the protein core, which is in agreement with the observed lack of activity of this variant.
Galactosyltransferase I has been considered as a cis-Golgi located enzyme in epiphyseal cartilage (32) and is believed to be noncovalently associated with the protein xylosyltransferase (33, 34). The sequence of β4Gal-T7 predicts relatively long cytoplasmic (28 residues) and transmembrane (30 residues) domains compared with other β4Gal-Ts and Golgi-located glycosyltransferases in general. The cytoplasmic domain and the stem region are hydrophilic. We have not identified putative motifs that are predicted to mediate binding to the protein xylosyltransferase. The availability of recombinant β4Gal-T7 and antibodies thereto may provide tools for studying the interaction and possibly cloning the xylosyltransferase.
Two homologues of the β4Gal-T gene family have been identified in the nematode Caenorhabditis elegans (35). In a phylogenetic analysis presented by Lo et al. (36) the gene W02B12.11 (designated C. elegans 2) (GenBank™ accession number Z66521) clustered with the β4Gal-T5 and β4Gal-T6 subgroup, while the gene R10E11.4 (designated C. elegans 1) (GenBank™ accession number Z29095) was not clustered. If β4Gal-T7 is included in this analysis, the R10E11.4 gene and β4Gal-T7 form a separate cluster (Fig. 1B). The close relationship between W02B12.11 and β4Gal-T1 to β4Gal-T6 is further supported by the finding that two of the four intron positions in W02B12.11 align with the conserved intron/exon boundaries in β4Gal-T1 to β4Gal-T4 (Fig. 1A). In addition, the predicted coding region of W02B12.11 includes the four conserved cysteine residues in β4Gal-T1 to β4Gal-T6. An evolutionary relationship between R10E11.4 and β4Gal-T7 is suggested by the sequence similarity, and this includes substitutions in the same positions in the major conserved sequence motif among most β4Gal-Ts (Fig. 1). However, the substituted residues are not similar, and cysteine residues are not conserved, although two residues in the very C-terminal sequences do align between the two sequences (Fig. 1A). The coding region of R10E11.4 is organized in six exons, and none of the intron/exon boundaries align with those of β4Gal-T1 to β4Gal-T4, nor do they align with β4Gal-T7.
The R10E11.4 gene corresponds to the sqv-3 gene found to play a critical role in vulval invagination in C. elegans (37). Glycosyltransferase activity of the protein encoded by either of the C. elegans β4Gal-T homologues has not been reported to our knowledge, but recent expression of a soluble, secreted construct of R10E11.4 in insect cells demonstrated activity similar to that reported here for human β4Gal-T7. Another gene found to play a role in vulval invagination in C. elegans is sqv-8 (37), which showed high sequence similarity to the recently cloned β1,3-glucuronosyltransferase that adds the fourth residue to the proteoglycan linkage region tetrasaccharide (GlcAβ1-3Galβ1-3Galβ1-4Xylβ1-O-Ser) (38). The finding that impairment of the sqv-3 and sqv-8 genes in nematodes produces the same defect in vulval invagination would be in agreement with the hypothesis that both of these genes are involved in synthesis of the proteoglycan linkage region tetrasaccharide, albeit at different steps. Identification of the molecular genetic basis for the defect in proteoglycan biosynthesis of the patient studied here suggests that extensive studies of genetic defects in patients with progeroid syndromes and other inherited connective tissue disorders should be undertaken.