Molecular Cloning and Characterization of Chondroitin-4-O-sulfotransferase-3

We have identified and characterized an N-acetylgalactosamine-4-O-sulfotransferase designated chondroitin-4-sulfotransferase-3 (C4ST-3) (GenBank™ accession number AY120869) based on its homology to HNK-1 sulfotransferase (HNK-1 ST). The cDNA predicts an open reading frame encoding a type II membrane protein of 341 amino acids with a 12-amino acid cytoplasmic domain and a 311-amino acid luminal domain containing a single potential N-linked glycosylation site. C4ST-3 has the greatest amino acid sequence identity when aligned with chondroitin-4-O-sulfotransferase 1 (C4ST-1) (45%) but also shows significant amino acid identity with chondroitin-4-O-sulfotransferase 2 (C4ST-2) (27%), dermatan-4-O-sulfotransferase 1 (29%), HNK-1 ST (26%), N-acetylgalactosamine-4-O-sulfotransferase 1 (26%), and N-acetylgalactosamine-4-O-sulfotransferase 2 (23%). C4ST-3 transfers sulfate to the C-4 hydroxyl of β1,4-linked GalNAc that is substituted with a β-linked glucuronic acid at the C-3 hydroxyl. The open reading frame of C4ST-3 is encoded by three exons located on human chromosome 3q21.3. Northern blot analysis reveals a single 2.1-kilobase transcript. C4ST-3 message is expressed in adult liver and at lower levels in adult kidney, lymph nodes, and fetal liver. Although C4ST-3 and C4ST-1 have similar specificities, the highly restricted pattern of expression seen for C4ST-3 suggests that it has a different role than C4ST-1.


Molecular Cloning of a cDNA Encoding Human Chondroitin-4-O-sulfotransferase-3-A human genomic BAC clone, RP11-390G14, derived from human chromosome 3 (GenBank™ accession number AC024558) was identified in TBLASTN searches (12) against the nonredundant data base at the NCBI using deduced protein sequences of human and rat HNK-1 sulfotransferases (1, 2) as query sequences. The putative open reading frame (ORF) encodes a protein that shows homology to the luminal domain of C4ST-1, a member of the HNK-1 family of sulfotransferases (8, 9). Subsequent BLASTN queries of the dbEST data set using this region of homology identified two matching EST sequences (GenBank™ accession numbers BF448098 and AI074149, respectively). The corresponding cDNA clones (IMAGp998H017603 and IMAGp998M214153, respectively) were obtained from the RZPD (Berlin, Germany) (13) and sequenced on both strands. The partial ORF of C4ST-3, encoding most of its luminal domain, was amplified by PCR. The 5′-specific primer, 5′-cca agc ttg cca cca tgt ttg gaa aca gag ccc t-3′, contains a HindIII restriction enzyme site, the Kozak consensus sequence GCCACC, and an artificial start codon. The 3′-specific primer, 5′-gct cta gac tag agc agc cgc agg tag ga-3′, contains an XbaI site and a stop codon. The product was directionally subcloned into pcDNA3.1 (Invitrogen, Karlsruhe, Germany) and designated pcDNA3.1-C4ST-3-ORF309.
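The features built into the two cloning primers can be checked directly from their sequences. The following is a minimal sketch, not part of the original procedure; the function names are hypothetical, and the only inputs are the primer sequences and the recognition sites for HindIII (AAGCTT) and XbaI (TCTAGA). Because the 3′-specific primer anneals to the sense strand, its stop codon is read from the reverse complement.

```python
# Sketch: locate the engineered features in the two cloning primers.
# Function names are illustrative and not taken from the paper.

def clean(primer: str) -> str:
    """Remove the triplet spacing and upper-case the sequence."""
    return "".join(primer.split()).upper()


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


FORWARD = clean("cca agc ttg cca cca tgt ttg gaa aca gag ccc t")
REVERSE = clean("gct cta gac tag agc agc cgc agg tag ga")

HINDIII_SITE = "AAGCTT"
XBAI_SITE = "TCTAGA"
KOZAK = "GCCACC"


def describe_forward(primer: str) -> dict:
    """0-based offsets of the HindIII site, Kozak sequence, and start codon."""
    hind = primer.find(HINDIII_SITE)
    kozak = primer.find(KOZAK, hind + len(HINDIII_SITE))
    start = primer.find("ATG", kozak + len(KOZAK))
    return {"HindIII": hind, "Kozak": kozak, "ATG": start}


def describe_reverse(primer: str) -> dict:
    """Offset of the XbaI site and the sense-strand codon that follows it."""
    xba = primer.find(XBAI_SITE)
    antisense_codon = primer[xba + len(XBAI_SITE): xba + len(XBAI_SITE) + 3]
    return {"XbaI": xba, "stop_codon_sense": reverse_complement(antisense_codon)}


if __name__ == "__main__":
    print(describe_forward(FORWARD))
    print(describe_reverse(REVERSE))
```

In both primers the restriction site is preceded by two extra 5′ nucleotides, and in the forward primer the Kozak sequence sits immediately upstream of the artificial ATG.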
Subsequent BLASTN queries of the dbEST using the partial ORF of C4ST-3 identified an EST (GenBank™ accession number BI908522) that overlaps the NH2 terminus of the sequence used to construct pcDNA3.1-C4ST-3-ORF309. BI908522 was used to evaluate a segment of the working draft of chromosome 3 (GenBank™ accession number NT_005588; also see below) for the presence of exons predicted by the program FGENES (version 1.6). Based on genomic sequence information, the start codon of C4ST-3 was predicted to be ~40 bp to the 5′-end of the genomic sequence represented by BI908522. The 5′-terminal part of the C4ST-3 ORF was amplified by PCR from human placenta first-strand cDNA (generated with Omniscript Reverse Transcriptase; Qiagen, Hilden, Germany) using the 5′-specific primer 5′-atg ggg agg cgc tgc tgc cgg cgg cgc g-3′ and the 3′-specific primer 5′-ttg atc tcg gcg ggg ctg aag t-3′. The 482-bp product was subcloned into the pGEM-T Easy vector (Promega) and sequenced on both strands. Finally, the full-length ORF was assembled in the pBluescript KS vector (Stratagene, Amsterdam, The Netherlands) by sequentially subcloning an 850-bp PstI-XbaI fragment derived from pcDNA3.1-C4ST-3-ORF309 and an additional 170-bp HindIII-PstI fragment from the pGEM-T construct above to produce pBKS-C4ST-3-ORF341.
5′-Rapid amplification of cDNA ends using Marathon Ready cDNA (adult kidney; BD Biosciences-CLONTECH, Heidelberg, Germany) as a template was performed using a number of different primer combinations; however, the 5′-UTR could not be amplified. The 3′-UTR was obtained by amplifying the cDNA using two different primers predicted on the basis of the genomic sequence. The 5′-specific primer 5′-tca act act ccg ccc cct cct acc-3′ and the 3′-specific primer 5′-acc gcc cag ctc acc aaa gtc c-3′ were used to amplify the predicted 3′-end of the C4ST-3 cDNA from a SMART™ cDNA library (BD Biosciences-CLONTECH) derived from a mixture of several human tissues/cell lines (e.g. generated from mRNA representing the human kidney cell line XYZ). The 714-bp DNA product was gel-purified (Invitrogen) and directly sequenced on both strands. The amplified sequence was assembled in conjunction with corresponding EST sequences (GenBank™ accession numbers AA677272, AI041547, AA887547, and BI464064, respectively) using the Lasergene (DNASTAR Inc., Madison, WI) software suite.
The deduced protein sequences of all members of the HNK-1 family of sulfotransferases were analyzed by multiple alignments using the ClustalW (version 1.4) algorithm (14) implemented in the BioEdit suite.
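Percent identity between two sequences taken from such an alignment can be computed column by column. The sketch below is illustrative only: the function name and the two short aligned strings are hypothetical, and the choice to exclude gapped columns from the denominator is an assumption, since the text does not state how identity was normalized.

```python
# Sketch: percent identity between two rows of a multiple alignment.
# Assumption: columns with a gap in either row are excluded from the count.

def percent_identity(row_a: str, row_b: str, gap: str = "-") -> float:
    """Percent of ungapped aligned columns that carry identical residues."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(row_a.upper(), row_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


if __name__ == "__main__":
    # Hypothetical toy rows, not real sulfotransferase sequences.
    print(round(percent_identity("MKT-LLVA", "MRTALL-A"), 1))
```

A different normalization (for example, dividing by the length of the shorter sequence) would give somewhat different values, so identities should be compared only when computed the same way.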
Genomic Organization and Chromosomal Localization of C4ST-3 and C4ST-1-The cDNA sequences of C4ST-3 and C4ST-1 (GenBank™ accession number AJ289131; see also RefSeq entry NM_018413) were used to query human genome (Genomic BLAST accessible at the NCBI Web site) sequences using BLASTN in order to elucidate the genomic structure of these human genes. Filtering of repetitive elements (Repeat Masker, version 07/16/00, University of Washington Genome Center) and BLASTN (version 2.1.3) searches against the databases of GenBank™, EMBL, and DDBJ EST divisions (dbEST) at the NCBI were carried out on the identified working draft sequence segments to verify exon sequences and to locate neighboring genes. CpG islands as defined by Gardiner-Garden and Frommer (16) were analyzed by using the WWWCPG program (17).
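The Gardiner-Garden and Frommer definition can be expressed as three thresholds on a sequence window: a length of at least 200 bp, a G + C content above 50%, and an observed/expected CpG ratio above 0.6. The sketch below applies those thresholds to a single sequence; it is not the WWWCPG program, whose scanning procedure is not described here, and the function names are hypothetical.

```python
# Sketch: test one DNA window against the Gardiner-Garden and Frommer criteria.
# This is not the WWWCPG implementation; it only evaluates the three thresholds.

def cpg_stats(seq: str) -> tuple:
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    s = seq.upper()
    n = len(s)
    c_count = s.count("C")
    g_count = s.count("G")
    cpg_count = s.count("CG")
    gc_fraction = (c_count + g_count) / n if n else 0.0
    # Expected CpG count is (C count * G count) / length.
    if c_count and g_count:
        obs_exp = (cpg_count * n) / (c_count * g_count)
    else:
        obs_exp = 0.0
    return gc_fraction, obs_exp


def is_cpg_island(seq: str, min_len: int = 200,
                  min_gc: float = 0.5, min_obs_exp: float = 0.6) -> bool:
    """True if the window satisfies all three criteria."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and obs_exp > min_obs_exp


if __name__ == "__main__":
    print(is_cpg_island("CG" * 100))  # synthetic CpG-rich window
    print(is_cpg_island("AT" * 100))  # synthetic AT-only window
```

To scan a genomic segment, the same test would be applied to sliding windows and overlapping positive windows merged; the window and step sizes are left to the user.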
Northern Blot and Expression Array Analysis-Human Multiple Tissue Northern (MTN) blots and human multiple tissue expression

FIG. 1. Nucleotide and deduced amino acid sequence of human C4ST-3 cDNA (GenBank™ accession number AY120869). A, the predicted amino acid sequence of C4ST-3 is denoted by capital letters below the nucleotide sequence. The single predicted membrane-spanning domain and a single potential N-linked glycosylation site are indicated by the thick underlines and by the underlines with a black dot below the glycosylated Asn, respectively. B, Kyte-Doolittle mean hydrophobicity plot for C4ST-3 (scan window size was 13 amino acids).
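A Kyte-Doolittle profile of the kind shown in Fig. 1B is a sliding-window mean of per-residue hydropathy values. The sketch below uses the published Kyte-Doolittle scale and the 13-residue window stated in the legend; the function name and the short test peptide are hypothetical, and the C4ST-3 sequence itself is not reproduced here.

```python
# Sketch: sliding-window mean hydropathy using the Kyte-Doolittle scale.

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 13) -> list:
    """Mean hydropathy of each full window; one value per window start."""
    seq = protein.upper()
    if window < 1 or window > len(seq):
        return []
    scores = [KYTE_DOOLITTLE[residue] for residue in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical peptide: a short polar stretch followed by a hydrophobic run.
    peptide = "MGRRSSDK" + "LLVALLIVLLAVLLG" + "RSDEKTNQ"
    profile = hydropathy_profile(peptide, window=13)
    peak = max(range(len(profile)), key=profile.__getitem__)
    print(f"most hydrophobic window starts at residue {peak + 1}")
```

For a type II membrane protein, a single sustained peak near the amino terminus is the expected signature of the membrane-spanning segment.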
Transient Expression of Human D4ST-1, C4ST-1, and C4ST-3-CHO/Tag cells were transfected with 13 µg of pcDNA3.1-D4ST-1, pcDNA3.1-C4ST-1, pcDNA3.1-C4ST-3, or pcDNA3.1 using 35 µg of LipofectAMINE (Invitrogen) in serum-free medium for 6 h according to the manufacturer's protocol. Sixty hours after transfection with native forms of the sulfotransferases, the cells and medium were collected separately for analysis. Cells were lysed with 200 µl of 20 mM HEPES buffer, pH 7.4, 5 mM MgCl2, 175 mM KCl, 2% Triton X-100, protease inhibitors (23 millitrypsin inhibitor units of aprotinin and 4 µg each of leupeptin, antipain, pepstatin, and chymostatin) per 100-mm diameter culture plate. The homogenate was mixed by rotation for 1 h and sedimented at 12,000 × g for 20 min. The supernatant was designated as the cell extract. The culture medium was pooled and sedimented at 12,000 × g for 20 min. The culture supernatant was adjusted to a final concentration of 20 mM HEPES, pH 7.4, and protease inhibitors were added as noted above.
The cytosolic and transmembrane domains of D4ST-1, C4ST-1, and C4ST-3 were substituted with the signal sequence of Ig in the pSec constructs. Furthermore, the myc epitope followed by six histidine residues was added at the carboxyl terminus of each of these sulfotransferases in the pSec constructs. Following transfection with pSec-D4ST-1, pSec-C4ST-1, pSec-C4ST-3, or pSec, the culture medium was collected as described above. The culture medium (20 ml) was incubated with 200 µl of Ni2+-NTA-agarose (Qiagen) overnight at 4°C. The Ni2+-NTA-agarose was washed with 20 mM HEPES, pH 7.4, 200 mM NaCl, 10 mM imidazole. Ni2+-NTA-agarose with bound sulfotransferase was suspended in 800 µl of 10 mM imidazole, pH 6.8, and aliquots were assayed directly for sulfotransferase activity.

FIG. 4. Expression of pSec-C4ST-3, pSec-C4ST-1, and pSec-D4ST-1 in CHO cells. CHO/Tag cells were transfected with pSec-C4ST-3, pSec-C4ST-1, pSec-D4ST-1, or the pSec vector alone. Medium was collected at 60 h following transfection, and the proteins were precipitated with chloroform/methanol (38). The precipitated proteins were subjected to SDS-PAGE in 7.5% acrylamide gels and electrophoretically transferred to polyvinylidene difluoride membranes. Western blot analysis with anti-myc antibody was used to estimate the amount of sulfotransferase secreted into the medium. Lane 1, pSec-C4ST-1 from 400 µl of medium; lane 2, pSec-D4ST-1 from 400 µl of medium; lane 3, pSec-C4ST-3 from 400 µl of medium. The location and molecular masses in kDa for standards are indicated to the right of the gel.

TABLE I Substrate specificities of C4ST-3, C4ST-1, C4ST-2, D4ST-1, GalNAc-4-ST1, and GalNAc-4-ST2
CHO/Tag cells were transfected with pSec-C4ST-3, pSec-C4ST-1, pSec-C4ST-2, pSec-D4ST-1, pSec-GalNAc-4-ST1, pSec-GalNAc-4-ST2, or the pSec vector. After 60 h in culture, the medium was collected, and secreted sulfotransferases were incubated with Ni2+-NTA-agarose (Qiagen). The beads were washed, and sulfotransferase activity was determined as described under "Experimental Procedures" using fixed aliquots of beads. Results are expressed as pmol incorporated/h/100-mm plate.

Identification of a Human cDNA Related to HNK-1 ST-The nonredundant data base at the NCBI was probed with the deduced amino acid sequences of human and rat HNK-1 STs (GenBank™ accession number AF033827). A BAC clone RP11-390G14 derived from human chromosome 3 (GenBank™ accession number AC024558) that contained an ORF with a length of 843 bp in the region displaying homology was identified. This sequence was used for further BLASTN searches against dbEST. Retrieval of EST BI908522 that overlapped the 5′-region of the ORF in the BAC clone allowed us to evaluate FGENES-predicted exons on a corresponding working draft sequence segment (GenBank™ accession number NT_005588) of chromosome 3. An ORF with a length of 1026 bp that is encoded by three exons was identified and cloned (Fig. 1A). 5′-Rapid amplification of cDNA ends and amplification of the 3′-UTR were carried out to obtain the full-length sequence of the cDNA but only succeeded in elongation of the 3′-end (see "Experimental Procedures"). Fig. 1A shows the 1747-bp C4ST-3 cDNA including the 1026-bp ORF that encodes a protein of 341 amino acid residues with a single potential N-glycosylation site and a calculated molecular mass of 39 kDa. The deduced protein, designated C4ST-3 (GenBank™ accession number AY120869), is a type II transmembrane protein with a 12-amino acid cytosolic domain at the amino terminus (see Kyte-Doolittle hydrophobicity profile in Fig. 1B). Multiple alignment of the protein sequence of C4ST-3 with other members of the HNK-1 sulfotransferase family was performed using the ClustalW algorithm as implemented in the BioEdit software suite (Fig. 2). The alignment indicates that C4ST-3 is 45% identical to C4ST-1, 27% identical to C4ST-2, 26% identical to GalNAc-4-ST1, 23% identical to GalNAc-4-ST2, and 26% identical to HNK-1 ST (all of the protein sequences shown in Fig. 2 are of human origin). The regions with the highest degree of identity are the putative 5′-phosphosulfate binding site (5′-PSB), the putative 3′-phosphate binding site (3′-PB), and three regions of unknown function designated III, IV, and V that are carboxyl-terminal to the 3′-phosphate binding site (Fig. 2).
Identical and similar amino acids are shaded if they occur at a specific position in at least five of the seven sequences shown in the multiple alignment in Fig. 2.
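The lengths reported for the C4ST-3 cDNA, ORF, and protein domains are mutually consistent, which can be verified with simple arithmetic. The sketch below uses only figures stated in the text; the variable and function names are hypothetical, and the size of the membrane-spanning segment is inferred here by subtraction rather than quoted from the paper.

```python
# Sketch: consistency check on the reported C4ST-3 sequence lengths.
# All inputs are taken from the text; the transmembrane length is inferred.

CDNA_BP = 1747          # full cDNA shown in Fig. 1A
ORF_BP = 1026           # open reading frame, including the stop codon
PROTEIN_AA = 341        # predicted protein length
CYTOPLASMIC_AA = 12     # amino-terminal cytoplasmic domain
LUMINAL_AA = 311        # carboxyl-terminal luminal domain


def residues_encoded(orf_bp: int) -> int:
    """Number of amino acids encoded by an ORF that ends in a stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1  # the final codon is the stop codon


encoded = residues_encoded(ORF_BP)
inferred_transmembrane_aa = PROTEIN_AA - CYTOPLASMIC_AA - LUMINAL_AA
non_orf_bp = CDNA_BP - ORF_BP

if __name__ == "__main__":
    print(f"residues encoded by the ORF: {encoded}")
    print(f"inferred membrane-spanning residues: {inferred_transmembrane_aa}")
    print(f"cDNA sequence outside the ORF: {non_orf_bp} bp")
```

The 1026-bp ORF corresponds to 341 codons plus a stop codon, and the residues not assigned to the cytoplasmic or luminal domains leave a segment of a length typical for a single membrane-spanning helix.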
Genomic Organization and Chromosome Localization of C4ST-3-A BLAST analysis of the genomic sequence available through the NCBI Web site using C4ST-3 cDNA as a query sequence mapped the C4ST-3 gene to human chromosome 3q21.3. The ORF and the 3′-UTR of C4ST-3 are encoded by three exons (Fig. 3). The genomic sequence was examined for the presence of CpG islands as defined by Gardiner-Garden and Frommer (16). A CpG island was identified extending from 620 bp upstream to 880 bp downstream of the C4ST-3 start codon. Such CpG islands have been detected in 82% of analyzed genes that show widespread expression and are indicative of the presence of a promoter region.

C4ST-3 Is a Chondroitin-specific GalNAc-4-O-sulfotransferase-Whereas C4ST-3 displayed the highest percentage of identical amino acids, 45.1%, when compared with C4ST-1, it also displayed significant homology with other members of the HNK-1 family of sulfotransferases. Initial experiments using C4ST-3 expressed by CHO/Tag cells revealed detectable levels of an activity in both cell extracts and medium able to transfer sulfate from [35S]PAPS to chondroitin (not shown). Due to the low levels of activity, a secreted form of C4ST-3 was prepared by substituting the cytosolic and transmembrane domains of C4ST-3 with the signal sequence of human IgG (Invitrogen) to produce pSec-C4ST-3. Identical constructs were prepared for C4ST-1, C4ST-2, D4ST-1, HNK-1 ST, GalNAc-4-ST1, and GalNAc-4-ST2 and designated pSec-C4ST-1, pSec-C4ST-2, pSec-D4ST-1, pSec-GalNAc-4-ST1, and pSec-GalNAc-4-ST2. The myc epitope followed by six histidines was located at the carboxyl terminus of each of these constructs. Following transfection into CHO/Tag cells, the secreted transferases were allowed to bind to Ni2+-NTA-agarose and assayed for transfer of sulfate to chondroitin, dermatan, and GGnM-MCO while bound to the agarose beads as summarized in Table I. pSec-C4ST-3 transferred sulfate to chondroitin (195 pmol/h/plate) but not to either dermatan or GGnM-MCO (Table I). Likewise, pSec-C4ST-1 was highly active with chondroitin but not dermatan or GGnM-MCO. pSec-D4ST-1 and pSec-GalNAc-4-ST1 and -ST2 were active with dermatan and GGnM-MCO, respectively. Whereas the transfer of sulfate to chondroitin was 10-fold higher than seen with mock-transfected cells for pSec-C4ST-3 (195 versus 20 pmol/h/plate), the rate of transfer was one-tenth of that seen for pSec-C4ST-1. Since this may reflect differences in the level of expression, the relative amounts of pSec-C4ST-1 and pSec-C4ST-3 were estimated by Western blot analysis using an anti-myc antibody following SDS-PAGE and electrophoretic transfer to polyvinylidene difluoride (Fig. 4). The level of expression was significantly lower for pSec-C4ST-3 than for either pSec-C4ST-1 or pSec-D4ST-1, accounting for much of the apparent lower level of activity.
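The comparison with mock-transfected cells quoted above reduces to a ratio and a background subtraction. The sketch below uses only the two chondroitin values given in the text for pSec-C4ST-3 and the pSec vector; the function names are hypothetical, and no other Table I values are assumed.

```python
# Sketch: fold-over-mock and background-corrected activity for pSec-C4ST-3.
# Inputs are the two values quoted in the text, in pmol/h/plate.

C4ST3_CHONDROITIN = 195.0   # pSec-C4ST-3, chondroitin acceptor
MOCK_CHONDROITIN = 20.0     # pSec vector alone, chondroitin acceptor


def fold_over_mock(sample: float, mock: float) -> float:
    """Ratio of sample activity to mock-transfected background."""
    if mock <= 0:
        raise ValueError("mock activity must be positive")
    return sample / mock


def net_activity(sample: float, mock: float) -> float:
    """Activity remaining after subtracting the mock background."""
    return sample - mock


if __name__ == "__main__":
    fold = fold_over_mock(C4ST3_CHONDROITIN, MOCK_CHONDROITIN)
    net = net_activity(C4ST3_CHONDROITIN, MOCK_CHONDROITIN)
    print(f"fold over mock: {fold:.2f}")
    print(f"net activity: {net:.0f} pmol/h/plate")
```

The ratio of 9.75 rounds to the 10-fold difference stated in the text. Normalizing activity to the amount of secreted enzyme would additionally require quantified Western blot signals, which are not given numerically here.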
In addition to lower levels of expression, pSec-C4ST-3 was found to be labile at temperatures above 28°C (Fig. 5B). Whereas pSec-C4ST-1 and pSec-D4ST-1 displayed similar levels of activity at 37 and 28°C, the transfer of sulfate by pSec-C4ST-3 at 37°C was 35% of that seen at 28°C (Fig. 5B). As with pSec-C4ST-3, the transfer of sulfate by pSec-C4ST-1 was reduced at 45 and 55°C, whereas the transfer of sulfate to dermatan by D4ST-1 was not markedly reduced at these temperatures. When transferase reactions were carried out at 28°C, incorporation of sulfate by C4ST-3 into chondroitin remained linear for up to 2 h under the assay conditions utilized (Fig. 5A). As with C4ST-1 and pSec-C4ST-1, the incorporation of [35S]SO4 into chondroitin by pSec-C4ST-3 is saturated at an acceptor concentration of 200 µg/ml (Fig. 6). pSec-C4ST-3 has a pH optimum of 6.5 for transfer of sulfate to chondroitin (Fig. 5C).
Tissue Expression Pattern of C4ST-3-Macroarray and Northern blot analyses were used to determine the expression pattern for C4ST-3 in human tissues. As evident from MTE™ macroarray analysis (Fig. 8), C4ST-3 transcripts are most highly represented in adult liver (A9). Significantly lower expression levels were detected in adult kidney (A7), lymph nodes (F7), and fetal liver (D11). By Northern blot analysis (not shown), a single 2.1-kb transcript was detected in adult liver. The level of expression was not sufficient to detect a transcript in the kidney upon Northern blot analysis.

DISCUSSION

C4ST-3 represents the seventh member of the HNK-1 family of sulfotransferases. C4ST-3 has the highest percentage of identical amino acids when aligned with C4ST-1, 45.1% identical. Like C4ST-1, C4ST-3 transfers sulfate to the C-4 hydroxyl of GalNAc substituted at C-3 with β-linked GlcUA (i.e. GlcUAβ1,3-GalNAc). Neither sulfotransferase transfers sulfate to the C-4 hydroxyl of GalNAc substituted at C-3 with α-linked IdoUA (i.e. dermatan) or to terminal GalNAc in the sequence GalNAcβ1,4GlcNAcβ1,2Man found at the terminus of certain N-linked oligosaccharides. Based on the levels of mRNA expression, C4ST-1 is widely expressed in tissues (8, 9), whereas C4ST-3 has a restricted pattern of expression. Since C4ST-3 and C4ST-1 have similar, if not identical, specificities and are both expressed in liver, kidney, and lymph nodes, the biological role played by C4ST-3 is not clear. The lability of C4ST-3 at 37°C as compared with C4ST-1 and the lower levels of expression seen following transfection as compared with C4ST-1 suggest that C4ST-3 may have a different and perhaps highly specialized role in vivo.
With the exception of HNK-1 ST itself, the seven members of the HNK-1 family of sulfotransferases have all proved to be GalNAc-4-O-sulfotransferases. The GalNAc-sulfotransferases add sulfate to the C-4 hydroxyl of either terminal β1,4-linked GalNAc (GalNAc-4-ST1 (3, 5) and GalNAc-4-ST2 (6)) or to β1,4-linked GalNAc that is substituted at its C-3 hydroxyl with either GlcUA (C4ST-1 (8, 9) and C4ST-3) or IdoUA (D4ST-1 (7)). In contrast, HNK-1 ST adds sulfate to the C-3 hydroxyl of terminal β1,3-linked GlcUA (1, 2). Each member of this family of sulfotransferases is thus highly specific, and the unique sulfated structures that are produced have distinct biological roles. The HNK-1 structure is involved in neural recognition and synaptogenesis in the central and peripheral nervous system (21-23). Oligosaccharides terminating with SO4-4-GalNAcβ1,4GlcNAcβ1,2Man are recognized by a receptor, the Man/GalNAc-4-SO4 receptor (24-26), that regulates the half-life of glycoproteins such as the glycoprotein hormone lutropin (27, 28). This recognition is essential for attaining maximal biologic activity in vivo (29). The highly regulated expression of GalNAc-4-ST1 (3, 30) in other tissues such as the brain suggests that it will have additional roles that remain to be defined. In addition to their importance for the formation and maintenance of cartilage (31, 32), there is also evidence that chondroitin sulfate proteoglycans are important for neural cell adhesion, neurite outgrowth, synaptic plasticity, and regeneration (33-37). A number of chondroitin sulfate-bearing proteoglycans are produced by tissues (31, 32). Since the functions of these proteoglycans are modulated and/or require the addition of sulfate, the existence of multiple chondroitin-4-GalNAc sulfotransferases provides the potential for additional regulation and specificity. The availability of the cloned sulfotransferases provides tools to investigate the biological roles of these complex sulfated structures.