Multiple isoforms of heparan sulfate D-glucosaminyl 3-O-sulfotransferase. Isolation, characterization, and expression of human cDNAs and identification of distinct genomic loci.

3-O-Sulfated glucosaminyl residues are rare constituents of heparan sulfate and are essential for the activity of anticoagulant heparan sulfate. Cellular production of the critical active structure is controlled by the rate-limiting enzyme, heparan sulfate D-glucosaminyl 3-O-sulfotransferase-1 (3-OST-1) (EC 2.8.2.23). We have probed the expressed sequence tag data base with the carboxyl-terminal sulfotransferase domain of 3-OST-1 to reveal three novel, incomplete human cDNAs. These were utilized in library screens to isolate full-length cDNAs. Clones corresponding to predominant transcripts were obtained for the 367-, 406-, and 390-amino acid enzymes 3-OST-2, 3-OST-3A, and 3-OST-3B, respectively. These type II integral membrane proteins are composed of a divergent amino-terminal region and a highly homologous carboxyl-terminal sulfotransferase domain of approximately 260 residues. Also recovered were partial length clones for 3-OST-4. Expression of the full-length enzymes confirms the 3-O-sulfation of specific glucosaminyl residues within heparan sulfate (Liu, J., Shworak, N. W., Sinaÿ, P., Schwartz, J. J., Zhang, L., Fritze, L. M. S., and Rosenberg, R. D. (1999) J. Biol. Chem. 274, 5185-5192). Southern analyses suggest the human 3OST1, 3OST2, and 3OST4 genes, and the corresponding mouse isologs, are single copy. However, 3OST3A and 3OST3B genes are each duplicated in humans and show at least one copy each in mice. Intriguingly, the entire sulfotransferase domain sequence of the 3-OST-3B cDNA (774 base pairs) was 99.2% identical to the same region of 3-OST-3A. Together, these data argue that the structure of this functionally important region is actively maintained by gene conversion between 3OST3A and 3OST3B loci. Interspecific mouse back-cross analysis identified the loci for mouse 3Ost genes, and syntenic assignments of corresponding human isologs were confirmed by the identification of mapped sequence-tagged site markers.
Northern blot analyses indicate brain exclusive and brain predominant expression of 3-OST-4 and 3-OST-2 transcripts, respectively; whereas, 3-OST-3A and 3-OST-3B isoforms show widespread expression of multiple transcripts. The reiteration and conservation of the 3-OST sulfotransferase domain suggest that this structure is a self-contained functional unit. Moreover, the extensive number of 3OST genes with diverse expression patterns of multiple transcripts suggests that the novel 3-OST enzymes, like 3-OST-1, regulate important biologic properties of heparan sulfate proteoglycans.

Heparan sulfate proteoglycans are hybrid molecules composed of a protein core to which is attached one or more linear glycosaminoglycan chains of the heparan sulfate variety. Extreme structural diversity of the heparan sulfate side chains enables interactions with a broad array of protein effector molecules that modulate a wide range of biologic processes. The specificity of any given heparan sulfate-protein interaction is largely dictated by placement of sulfate groups along the chain's length. Thus the order and ring position of sulfate substituents create distinct oligosaccharide sequences (fine structure) and define corresponding biologic activities (reviewed in Refs. 1-3). The profound functional diversity of heparan sulfate proteoglycans necessitates a mechanism that can generate and independently regulate the production of a myriad of fine structures. Such control is predominantly imposed in a cell type-specific fashion by varying the functional status of the Golgi apparatus, with the core proteins potentially contributing only a minor degree of influence (4,5). Thus, heparan sulfate biosynthetic enzymes are implicated as key components in generating regions of defined monosaccharide sequence. The production of the antithrombin-binding site by the enzyme heparan sulfate D-glucosaminyl 3-O-sulfotransferase-1 (3-OST-1) (EC 2.8.2.23) reveals a mechanism for the independent biosynthesis of a specific heparan sulfate sequence that regulates an important biologic activity.
Antithrombin is a natural anticoagulant that neutralizes serine proteases of the intrinsic blood coagulation cascade through the formation of a 1:1 enzyme-antithrombin covalent complex. The rate of complex formation is dramatically enhanced via interactions with glycosaminoglycans containing the antithrombin-binding site; i.e. pharmaceutical heparin and anticoagulant heparan sulfate. The latter is generated by endothelial cells, which line the blood vessel wall. The importance of the anticoagulant heparan-antithrombin interaction is evidenced by the arterial thrombotic events that occur in patients producing antithrombin variants defective in anticoagulant heparan binding (reviewed in Ref. 6). Given this critical role, it is not surprising that the cellular production of anticoagulant heparan is regulated independently of the general bulk of heparan sulfate (5,7).
Antithrombin specifically recognizes the structure: -Glc(NS or Ac)6S-GlcA-GlcNS3S±6S-IdoA2S-GlcNS6S-, which triggers a conformational change that results in the accelerated neutralization of specific coagulation proteases (reviewed in Refs. 3 and 8). The central 3-O-sulfate group is absolutely essential for induction of the conformational change and high affinity antithrombin binding. Binding specificity additionally requires the 6-O-sulfate groups on residues 1 and 5, the amino group at residue 5 and carboxyl groups at other sites (9). The critical role of the 3-O-sulfate group and the extreme paucity of this substituent within heparan sulfate (5,7) raise the possibility of a key regulatory role. Indeed, we have recently demonstrated that the enzyme 3-OST-1 performs the rate-limiting biosynthetic reaction that determines cellular production of anticoagulant heparan (10,11). The enzyme recognizes a specific precursor structure, corresponding to the antithrombin-binding site devoid of just the 3-O-sulfate, and adds this rare substituent to complete the formation of anticoagulant heparan (10). Thus, 3-OST-1 activity controls cellular anticoagulant phenotype. This example raises the possibility that additional heparan sulfate biosynthetic enzymes may function in an analogous fashion, controlling production of other important heparan sulfate fine structures.
The molecular cloning of the cDNA for the precursor protein of 3-OST-1 showed that the enzyme undergoes removal of an amino-terminal leader sequence to generate a Golgi intraluminal resident of ~290 amino acids (12). Most importantly, the carboxyl-terminal ~260 residues have striking homology to the comparable region of the bifunctional biosynthetic enzymes heparan sulfate N-deacetylase/N-sulfotransferase-1 and -2 (~50% similarity to both NST-1 and NST-2), and at least 30% similarity to virtually every type of sulfotransferase enzyme previously identified. Consequently, this conserved structure that spans the majority of 3-OST-1 has been presumptively designated as the sulfotransferase domain (12).
In the present article, we have employed this conserved structure to molecularly clone related cDNAs, which encode homologous carboxyl-terminal sulfotransferase domains but distinct amino-terminal structures. Unlike 3-OST-1, the novel 3-OST-2, 3-OST-3A, and 3-OST-3B enzymes are predicted to have type II integral membrane architecture. An incomplete cDNA encoding the sulfotransferase domain portion of the enzyme 3-OST-4 was also obtained. Comparison of the enzyme structures predicts motifs that may govern sequence specific modification. Northern hybridizations show isoform-specific expression patterns, whereas genomic characterizations identified at least 7 human 3-OST genes. Thus, the 3-OST multigene family is exquisitely suited to encode key enzymes that regulate the production of many distinct heparan sulfate fine structures.

EXPERIMENTAL PROCEDURES
Isolation of 3-OST-2, 3-OST-3, and 3-OST-4 cDNA Clones-The National Center for Biotechnology Information data bank of I.M.A.G.E. Consortium (Lawrence Livermore National Laboratory) expressed sequence tag cDNA clones (13) was probed with the deduced sulfotransferase domain region of mouse 3-OST-1 (12), which identified partial length clones that were obtained from the TIGR/ATCC Special Collection (ATCC). Complete sequencing of the inserts revealed three clone categories: 3-OST-2, I.M.A.G.E. Consortium (Lawrence Livermore National Laboratory) Clone ID c-20d10 (GenBank™ accession number F07258) (14) from a normalized library generated from total brain of a 3-month muscular atrophy female (15); 3-OST-3A CTF, Clone ID 284542 (GenBank™ accession number N71828), from a library of 4 multiple sclerosis lesions isolated from a 46-year-old male (13); and 3-OST-4, Clone IDs HIBCX69 (GenBank™ accession number T33472) from human brain (16), IB727 (GenBank™ accession number T03677) from infant brain (17,18), 166466 (GenBank™ accession number R88592) from adult brain (13), 23279 (GenBank™ accession number T75445) from infant brain, c-3ie01 (GenBank™ accession number F13088) (from the same library as c-20d10) (14). To obtain full-length clones, we first identified cDNA regions, described below, which would function as isoform-specific probes by hybridizing Southern blots of human genomic DNA with expressed sequence tag fragments 32P-labeled by random priming. The corresponding fragments were used to screen TriplEx brain and liver cDNA libraries (CLONTECH), as described previously (12). Positive plaques were purified, TriplEx-based plasmids were excised in vivo according to the manufacturer's protocol, and inserts were sequenced as described below.
Characterization of cDNA Clones-The 5′ and 3′ insert regions were enzymatically sequenced from flanking primer sites of the respective cloning vectors. The remaining sequence of both strands was obtained with internally priming oligonucleotides. Primers were spaced no more than 400 base pairs (bp) apart with a 200-bp offset between + and − strands, thus each nucleotide was detected within 200 bp of a primer. Automated fluorescence sequencing was performed with Perkin-Elmer Applied Biosystems Models 373A and 477 DNA Sequencers. Each reaction typically yielded 400 to 600 bases of high quality sequence.
Typically, 10-μg samples were restriction digested, resolved by 0.8% or 1% agarose gel electrophoresis, and transferred to GeneScreen Plus membranes. Membranes were hybridized for 16 h in 1% SDS, 10% dextran sulfate buffer containing SSC and formamide. The concentrations of these latter two components were adjusted so that all homologous hybridizations were incubated at Tm − 25°C and all heterologous hybridizations were incubated at Tm − 35°C, where for a DNA:DNA hybrid Tm = 81.5°C + (16.6 × log([Na+])) + (41 × percentage GC) − (500 ÷ length of probe template in bp) − (62 × percentage formamide) (25). Membranes were washed in 1% SDS and sufficient SSC to generate a final incubation stringency of at least Tm − 25°C or Tm − 35°C, as described above, respectively.
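The DNA:DNA melting temperature expression above can be evaluated with a short script. This is a minimal sketch, not the procedure used by the authors. It assumes that "log" is the base-10 logarithm of the molar Na+ concentration and that the GC and formamide terms are entered as fractions between 0 and 1 (the coefficients 41 and 62 match the usual form of this equation only under that reading). The function name and the example conditions are hypothetical.

```python
import math

def tm_dna_dna(na_molar, gc_fraction, probe_len_bp, formamide_fraction):
    """Melting temperature (deg C) of a DNA:DNA hybrid, term by term as in the text.

    Assumptions: base-10 log of molar [Na+]; GC and formamide given as fractions (0-1).
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 41.0 * gc_fraction
            - 500.0 / probe_len_bp
            - 62.0 * formamide_fraction)

# Hypothetical conditions: 0.3 M Na+, 50% GC, 500-bp probe, 50% formamide.
tm = tm_dna_dna(0.3, 0.50, 500, 0.50)
homologous_temp = tm - 25.0    # incubation temperature for homologous probes
heterologous_temp = tm - 35.0  # incubation temperature for heterologous probes
```

Under these illustrative conditions the hybrid melts near 61°C, so a homologous hybridization would be run near 36°C and a heterologous one near 26°C.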
Northern Blot Analysis-Tissue and cell type-specific expression of 3-OST forms was assessed with human multiple tissue and human cancer cell line Northern blots, respectively (CLONTECH). Endothelial expression of 3-OST forms was tested on 10 μg of total RNA from immortalized rat fat pad endothelial cells and primary mouse cardiac microvascular cells, as well as 5 μg of poly(A)+ RNA prepared from primary human umbilical vein endothelial cells, as described previously (12). Samples were resolved on 1.2% formaldehyde-agarose gels and capillary transferred to GeneScreen Plus membranes. Membranes were hybridized for 16 h in 1% SDS, 10% dextran sulfate buffer containing SSC and formamide. The concentrations of these latter two components were adjusted so that all homologous hybridizations were incubated at Tm − 25°C and all heterologous hybridizations were incubated at Tm − 35°C, where for a DNA:RNA hybrid Tm = 79.8°C + (18.5 × log([Na+])) + (58.4 × percentage GC) + (11.8 × (percentage GC)²) − (820 ÷ length of probe template in bp) − (50 × percentage formamide) (26). Membranes were washed in 1% SDS and sufficient SSC to generate a final incubation stringency of at least Tm − 25°C or Tm − 35°C, as described above, respectively.
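The DNA:RNA expression can be handled the same way, and it can also be inverted to find the formamide level that places a chosen incubation temperature at the required offset below Tm. The sketch below is illustrative only. It assumes a base-10 logarithm of molar [Na+] and fractional (0-1) values for GC and formamide. The function names, the 42°C incubation temperature, and the probe parameters are hypothetical.

```python
import math

def tm_dna_rna(na_molar, gc_fraction, probe_len_bp, formamide_fraction):
    """Melting temperature (deg C) of a DNA:RNA hybrid, term by term as in the text.

    Assumptions: base-10 log of molar [Na+]; GC and formamide given as fractions (0-1).
    """
    return (79.8
            + 18.5 * math.log10(na_molar)
            + 58.4 * gc_fraction
            + 11.8 * gc_fraction ** 2
            - 820.0 / probe_len_bp
            - 50.0 * formamide_fraction)

def formamide_for_offset(na_molar, gc_fraction, probe_len_bp, incubation_c, offset_c):
    """Formamide fraction at which incubation_c equals Tm - offset_c."""
    tm_without_formamide = tm_dna_rna(na_molar, gc_fraction, probe_len_bp, 0.0)
    # Each unit of formamide fraction lowers Tm by 50 deg C in this expression.
    return (tm_without_formamide - offset_c - incubation_c) / 50.0

# Hypothetical homologous hybridization at 42 deg C (Tm - 25), 0.3 M Na+, 50% GC, 500-bp probe.
needed = formamide_for_offset(0.3, 0.50, 500, incubation_c=42.0, offset_c=25.0)
```

A result outside the range 0 to 1 would indicate that the salt concentration, rather than formamide alone, has to be changed to reach the target stringency.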
Interspecific Mouse Back-cross Mapping-Interspecific back-cross progeny were generated by mating (C57BL/6J × Mus spretus)F1 females and C57BL/6J males as described (27). A total of 205 N2 mice were used to map the 3Ost loci, as described under "Results." DNA isolation, restriction enzyme digestion, agarose gel electrophoresis, Southern blot transfer, and hybridization were performed essentially as described (28). All blots were prepared with Hybond-N+ nylon membrane (Amersham). The employed hybridization probes are described above. The 3Ost1 probe, mST-1, was labeled with [α-32P]dCTP using a random primed labeling kit (Stratagene); washing was done to a final stringency of 1.0 × SSCP, 0.1% SDS, 65°C. A fragment of 6. kb. The presence or absence of the M. spretus-specific fragments was followed in back-cross mice.
A description of the probes and restriction fragment length polymorphisms for the loci linked to the 3Ost genes has been reported previously. These include Adra2c, Msx1, and Bst1, chromosome 5 (29); Pth, Pkcb, Spn, and Mgmt, chromosome 7 (29,31); and Adra1a, Csfgm, Myhsf1, and Trp53, chromosome 11 (32,33). Recombination distances were calculated using Map Manager, version 2.6.5. Gene order was determined by minimizing the number of recombination events required to explain the allele distribution patterns.
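The principle of ordering loci by minimizing recombination events can be illustrated with a brute-force sketch. This is not the algorithm implemented in Map Manager, which is not described here. It assumes each back-cross animal is typed as "B" (C57BL/6J) or "S" (M. spretus) at every locus, and the five animals shown are invented for illustration.

```python
from itertools import permutations

def count_crossovers(haplotype, order):
    """Count allele switches between adjacent loci for one back-cross animal."""
    alleles = [haplotype[locus] for locus in order]
    return sum(a != b for a, b in zip(alleles, alleles[1:]))

def best_order(animals, loci):
    """Return the locus order that requires the fewest recombination events."""
    def total(order):
        return sum(count_crossovers(h, order) for h in animals)
    # An order and its reverse are equivalent, so keep a single orientation.
    candidates = [p for p in permutations(loci) if p[0] < p[-1]]
    return min(candidates, key=total)

# Hypothetical typing data for three chromosome 5 loci.
animals = [
    {"Msx1": "B", "3Ost1": "B", "Bst1": "B"},
    {"Msx1": "S", "3Ost1": "S", "Bst1": "S"},
    {"Msx1": "B", "3Ost1": "B", "Bst1": "S"},
    {"Msx1": "S", "3Ost1": "B", "Bst1": "B"},
    {"Msx1": "S", "3Ost1": "S", "Bst1": "B"},
]
order = best_order(animals, ["Msx1", "3Ost1", "Bst1"])
```

With these invented data the order that places 3Ost1 between Msx1 and Bst1 needs three crossovers, whereas the two alternatives need four and five. Exhaustive enumeration is practical only for a handful of loci.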
Chromosomal Mapping of Human 3OST Genes-Data base searching identified bacterial artificial chromosome clones containing human 3OST2, 3OST3A1, and 3OST3B1 genes (GenBank™ accession numbers AC003661, AC002287, AC005411, AC005375, AC005224). Data base searching with a combination of genomic and cDNA sequences identified expressed sequence tag and sequence-tagged site markers (GenBank™ accession numbers G24436, T03677, G21216, and G03581). The chromosomal location of these markers was then determined through the Human Genome Sequencing Index (http://www.ncbi.nlm.nih.gov/HUGO/).

Isolation and Characterization of cDNAs Encoding 3-OST-2, 3-OST-3, and 3-OST-4 Isoforms
We probed the National Center for Biotechnology Information data base of expressed sequence tag cDNA clones (13) with the deduced amino acid sequence of the presumptive sulfotransferase domain from the human 3-OST-1 cDNA to reveal several human partial length cDNAs encoding novel related species, as described under "Experimental Procedures." Sequencing the contained cDNA inserts confirmed three distinct forms designated as 3-OST-2, 3-OST-3A CTF, and 3-OST-4. Isotype-specific probes were generated, 32P-labeled, and screened against TriplEx human cDNA libraries made from brain (3 × 10⁶ plaques for 3-OST-2 and 3-OST-4) and liver (4.5 × 10⁶ plaques for 3-OST-3) to identify 7, 8, and 4 additional clones of 3-OST-2, -3, and -4 groups, respectively. The contained inserts of the corresponding isolates were completely sequenced, which revealed two forms for 3-OST-2 (-2 and -2 CTF) and 4 kinds of 3-OST-3 cDNAs (-3A, -3A CTF, -3B, and -3B CTF) (Fig. 1). The 3-OST-4 clones overlapped with clone 23279, but were all shorter partial-length clones and so are not presented; thus, the longest obtained 3-OST-4 cDNA contains an incomplete coding sequence. The primary structures of 3-OST-2, -3A, and -3B composite cDNAs are presented in Figs. 2, 3, and 4; whereas the sequence data for the incomplete 3-OST-4 cDNA can be obtained from the GenBank™/EMBL Data Bank. The accompanying article (34) describes the analysis of recombinantly expressed 3-OST-2, -3A, and -3B cDNAs, which confirms that the encoded enzymes specifically 3-O-sulfate glucosaminyl residues within heparan sulfate. Table I summarizes the major structural features of all composite cDNA forms. The length of the 5′-untranslated region from the full-length cDNAs varies widely (from 72 to 798 bp) and all ATG codons within this region are followed by in-phase termination codons.
For each full-length cDNA, the assigned coding region is by far the longest open reading frame and begins with an initiation ATG conforming to Kozak's consensus sequence (a purine at −3 and/or a G at +4) (35). Moreover, each initiator sequence is preceded by one or more in-phase termination codons. A consensus polyadenylation signal (AATAAA) occurs within 20-30 bp of the 3′-untranslated region termini and is followed by a poly(A) tail for all cDNAs except 3-OST-3B (Fig. 1). This distinction indicates the 3-OST-3B composite cDNA contains only an incomplete 3′-untranslated region; especially since the cDNA is 4.2 kb shorter than its corresponding transcript (Table I). For the 3-OST-2 cDNA, an alternate site for polyadenylation is also predicted by an extra signal occurring ~200 bp from the 3′-untranslated region termini.
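The Kozak criterion quoted above (a purine at −3 and/or a G at +4, with the A of the ATG counted as +1) is simple to test programmatically. The following sketch is illustrative. The function name and the short test sequences are hypothetical and are not taken from the 3-OST cDNAs.

```python
def matches_kozak(seq, atg_index):
    """True if the ATG at atg_index has a purine at -3 and/or a G at +4.

    The A of ATG is position +1 and there is no position 0, so -3 lies three
    bases upstream of the A and +4 is the base immediately after the G.
    """
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    minus3 = seq[atg_index - 3] if atg_index >= 3 else ""
    plus4 = seq[atg_index + 3] if atg_index + 3 < len(seq) else ""
    return minus3 in ("A", "G") or plus4 == "G"

# Hypothetical contexts around an ATG starting at index 6.
strong = matches_kozak("GCCACCATGG", 6)   # A at -3 and G at +4
weak = matches_kozak("CCCTCCATGT", 6)     # T at -3 and T at +4
```

An ATG too close to either end of the sequence is scored only on the position that is present.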
The composite 3-OST-2 cDNA sequence presented in Fig. 2 was derived from clones pJL-2.1, pJL-2.2, pJL-2.3, and pJL-2.7 (Fig. 1). However, clones c-20d10 and pJL-2.6 both differ from the composite cDNA sequence at four positions (G804→A, T1249→G, T1350→C, C1507→T) (Figs. 1 and 2). These two clones were isolated from different libraries and so the sequence discrepancies could not have possibly arisen by errors in reverse transcription or cDNA amplification. Given that the human 3OST2 gene is single copy, described below, these differences indicate allelic variation. The G804→A transition is the only coding region variant, but does not alter the amino acid sequence. The remaining mutations are found in the 3′-untranslated region; thus, all mutations may be silent.
Most importantly, significant nucleic acid sequence conservation only occurs for the sulfotransferase domain portion of the cDNAs. Within this span, each of the novel cDNAs exhibits ~55% identity to 3-OST-1. However, sulfotransferase domains share ~72% identity between 3-OST-2, -3, and -4 classes. Conservation is most extreme between 3-OST-3A and 3-OST-3B, with 99.2% identity over 774 bp that encodes the entire sulfotransferase domain region of 3-OST-3B (between α and γ in Figs. 3 and 4). Immediately after this shared region the 3-OST-3A coding sequence extends two codons (Gly and Stop), whereas the 3-OST-3B cDNA just encodes a Stop codon. Thus, the predicted sulfotransferase domain of 3-OST-3A is 1 amino acid longer than that of 3-OST-3B. The nearly identical regions could have resulted from a single genetic locus by alternative splicing, but only if the nonidentical residues stem from allelic variation. However, this possibility is statistically unlikely (p = 0.016).⁵ Alternative splicing is completely excluded by genomic restriction mapping, which reveals separate 3-OST-3A and 3-OST-3B genes, as described below (Fig. 6). The profound conservation of a genomic segment between distinct loci is indicative of gene conversion, as described below.
⁵ Within all five 3-OST-3B clones the region from α to β contains 6 silent point mutations in wobble positions (C729→G, A762→C, C798→T, C843→T, C852→T, and C876→T), whereas both of the two 3-OST-3A clones lack these mutations (Fig. 1). If these differences reflect the sequences of two distinct alleles of the same gene, then the probability that both 3-OST-3A clones would be of either single allele is 1/2 and the probability that all 3-OST-3B clones would be of the opposite allele is (1/2)⁵. Thus, the observed exclusive distribution would randomly occur with a frequency of 1/2 × (1/2)⁵ = 1/64 = 0.016. Thus, it is extremely unlikely that this exclusive distribution could have resulted from allelic variation of a single gene.

Characterization of 3OST Genomic Loci
Four 3OST3 Genes Exist-The genomic loci of the various 3OST genes were characterized to identify the origins of these structurally related cDNAs. The copy number of all known 3OST genes was assessed by Southern blot analysis of human genomic DNA with gene-specific probes (Fig. 5A). This analysis suggests that 3OST1, 3OST2, and 3OST4 only occur as single copy genes. Heterologous hybridization of the same probes to mouse genomic DNA separately digested with the same 5 restriction enzymes, described under Fig. 5, yielded comparable results (data not shown). The combined analyses strongly argue that both humans and mice possess only single copies of 3OST1, 3OST2, and 3OST4 genes.
Southern analysis targeting distinct gene regions reveals the human 3-OST-3 multigene subfamily. Sulfotransferase domain sequences common to all 3-OST-3 cDNAs were detected with the probe ST-3, whereas 3′-untranslated regions specific to 3-OST-3A or 3-OST-3B cDNAs were detected with probes 3′A or 3′B, respectively (probe locations shown in Fig. 1). The existence of at least two 3OST3 genes was initially suggested by hybridizations to EcoRI-digested genomic DNA. ST-3 revealed two bands, one exclusively detected by 3′A and the other identified only by 3′B (Fig. 5A). Indeed, we have recently identified genomic clones of two distinct genes 3OST3A1 and 3OST3B1, as noted under "Experimental Procedures." However, BstXI digestions suggest greater complexity as ST-3 displayed three bands of about 2.0, 1.1, and 0.5 kb in a 1:2:1 stoichiometry, respectively. 3′A detected both of the weak bands; whereas 3′B identified only the strong band (Fig. 5A). This pattern could only result from just two 3OST3 genes if a single copy 3OST3A gene has an allelic BstXI restriction fragment length polymorphism. Alternatively, if such an allelic polymorphism is not present, then the pattern must result from a minimum of four 3OST3 genes with BstXI sites differing in the two 3OST3A forms but being invariant in the two 3OST3B copies. The possibility of only two 3OST3 genes was excluded by ST-3 probing of BstXI-digested genomic DNA from an additional 16 unrelated individuals. In contrast to an allelic segregation pattern, all samples generated the identical 1:2:1 band pattern described above (data not shown).
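The frequency quoted in footnote 5 (p = 0.016) for the single-gene, two-allele scenario can be reproduced with exact fractions. The sketch below follows the reasoning of the footnote and assumes, as the footnote implicitly does, that each clone derives independently and with equal probability from either allele. The function name is hypothetical.

```python
from fractions import Fraction

def exclusive_split_probability(n_a_clones, n_b_clones):
    """Chance that all A-type clones carry one allele and all B-type clones the other.

    Assumes two equally represented alleles and independently derived clones.
    """
    half = Fraction(1, 2)
    same_allele_a = 2 * half ** n_a_clones    # all A clones from one allele (either one)
    opposite_allele_b = half ** n_b_clones    # all B clones from the remaining allele
    return same_allele_a * opposite_allele_b

# Two 3-OST-3A clones and five 3-OST-3B clones, as in footnote 5.
p = exclusive_split_probability(2, 5)
```

For two 3-OST-3A clones and five 3-OST-3B clones the function returns 1/64, which rounds to the stated 0.016.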
Duplication of the amino-terminal portions of 3OST3A and 3OST3B genes was also documented by Southern analyses with the isoform-specific probes SPLAG-A and SPLAG-B, respectively (Fig. 5B). Only some of the detected fragments were predicted from a computer-generated restriction map of 3OST3A1 and 3OST3B1 gene sequences, described above. Consequently, these fragments (A1 and B1 in Fig. 5B) are derived from 3OST3A1 and 3OST3B1. The additional unanticipated bands (A2 and B2) document the duplicated amino-terminal regions of genes 3OST3A2 and 3OST3B2. Single bands were occasionally detected (A1 & 2 and B1 & 2 in Fig. 5), which indicates conservation of sequence within a gene pair. We conclude from the above data that the human genome contains two 3OST3A genes and two 3OST3B genes.
Inspection of genomic clone sequence also reveals that discriminating 0.5- and 2.0-kb BstXI fragments are derived from 3OST3A1 and 3OST3A2 genes, respectively. Similarly, a BamHI polymorphism between ST-3 and 3′B differentiates the genes 3OST3B1 from 3OST3B2, respectively.⁶ Examination of the individual cDNA inserts that encompass these defining regions shows the 3-OST-3A clone pJL-3.4 derives from 3OST3A1; whereas the 3-OST-3B clones pJL-3.6, -3.7, and -3.9 all originate from 3OST3B1. However, the limited number of analyzed clones is insufficient to exclude functionality of 3OST3A2 and 3OST3B2. Moreover, it remains unclear whether each gene pair produces identical or distinct products.
⁶ On BamHI digests, 3′B detects two equal intensity bands but only one co-hybridizes to ST-3. In addition, ST-3 reveals an unaccounted band that is not detected with 3′B or 3′A. Given that 3OST3B1 lacks a 3′ BamHI, the data indicate that 3OST3B2 contains a BamHI site between ST-3 and 3′B (N. W. Shworak, unpublished data).
Southern analysis does not always resolve each member of a human 3OST3 gene pair (e.g. EcoRI of Fig. 5), suggesting a high degree of sequence homology between each pair of copied genes. Accordingly, we assessed the extent of similarity by performing genomic restriction mapping on the 3′-untranslated regions of 3OST3A and 3OST3B forms, given that 3′-untranslated region sequences are typically divergent, even within multigene families (36). The data demonstrate a high degree of identity in the 3′-untranslated regions of each pair of copies; indeed, discrimination between members of each gene pair was not observed with any of the employed enzyme combinations (Fig. 6). This suggests either a very late duplication of 3OST3A and 3OST3B forms, or a concerted mechanism, i.e. gene conversion, to maintain primary structures. We note the murine genome must contain at least one copy of both forms,⁷ which indicates that human 3OST3A and 3OST3B genes cannot have resulted from late duplication. Accordingly, the human 3OST3 genes have apparently been subjected to gene conversion. At the minimum, gene conversion homogenizes the sulfotransferase domain sequences between human 3OST3A1 and 3OST3B1 loci. It is even possible that conversion maintains the 3′ structural similarities between the 3OST3A gene pair and between the 3OST3B gene pair.
Chromosomal Localization of Mouse 3Ost Loci-The mouse chromosomal location of each 3Ost locus was determined by interspecific back-cross analysis using progeny derived from matings of [(C57BL/6J × M. spretus)F1 × C57BL/6J] mice. This interspecific back-cross mapping panel has been typed for over 2500 loci that are well distributed among all the autosomes as well as the X chromosome (27). C57BL/6J and M. spretus DNAs were digested with several enzymes and analyzed by Southern blot hybridization for informative restriction fragment length polymorphisms using cDNA probes specific for each gene. The strain distribution pattern of each polymorphism in the interspecific back-cross mice was then determined and used to position the 3Ost loci on the interspecific map (Fig. 7).
3Ost1 mapped to the proximal region of mouse chromosome 5, 0.5 centimorgan distal of Msx1 and 3.7 centimorgan proximal of Bst1. 3Ost2 and 3Ost4 mapped to the distal region of chromosome 7: 3Ost2 did not recombine with Pkcb in 165 animals typed in common, suggesting that the two loci are within 1.8 centimorgans (upper 95% confidence limit), and 3Ost4 mapped 2.3 centimorgans distal of this cluster and 0.7 centimorgans proximal of Spn. Finally, 3Ost3a and 3Ost3b mapped to the central region of mouse chromosome 11 and did not recombine with each other in 141 mice typed in common, suggesting the two loci are within 2.1 centimorgans of each other (upper 95% confidence limit). The cluster of the two murine 3Ost3 genes is 3.8 centimorgans distal of Csfgm and 2.4 centimorgans proximal of Myhsf1 on mouse 11. The very tight linkage between 3Ost3a and 3Ost3b suggests that the genes arose by a tandem duplication event.
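The upper 95% confidence limits quoted for loci that showed no recombination are consistent with a standard binomial calculation: if no recombinants are seen among n animals, the largest recombination fraction r still compatible with that outcome at the 5% level satisfies (1 − r)^n = 0.05. The sketch below assumes this is the method behind the stated values (the text does not say so) and equates 1% recombination with 1 centimorgan, which is reasonable only for short distances. The function name is hypothetical.

```python
def upper_limit_cm(n_animals, confidence=0.95):
    """Upper confidence limit (in cM) on map distance when 0 of n animals recombine.

    Assumes 1% recombination corresponds to 1 cM (short-distance approximation).
    """
    alpha = 1.0 - confidence
    # Solve (1 - r)**n = alpha for the recombination fraction r.
    r = 1.0 - alpha ** (1.0 / n_animals)
    return 100.0 * r

limit_3ost2_pkcb = upper_limit_cm(165)    # 3Ost2 versus Pkcb, 165 animals
limit_3ost3a_3ost3b = upper_limit_cm(141)  # 3Ost3a versus 3Ost3b, 141 animals
```

Rounded to one decimal place, these evaluate to 1.8 and 2.1 centimorgans, matching the limits given for 165 and 141 animals.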
We have compared our interspecific map of chromosomes 5, 7, and 11 with a composite mouse linkage map that reports the map location of many uncloned mouse mutations (provided from Mouse Genome Data Base, a computerized data base maintained at The Jackson Laboratory, Bar Harbor, ME). The 3Ost loci mapped in regions of the composite map that lack mouse mutations with a phenotype that might be expected for an alteration in these loci (data not shown).
The proximal region of mouse chromosome 5 shares a region of homology with human chromosome 4p (Fig. 7). Our placement of 3Ost1 in this interval suggests that the human isolog 3OST1 will map to 4p, as well. The distal region of mouse chromosome 7 shares regions of homology with human chromosomes 11p, 16p, and 10q. Both Pkcb and Spn have been mapped to 16p in human chromosomes. The tight linkage in mouse between Pkcb and Spn, and 3Ost2 and 3Ost4 suggests that the human isologs 3OST2 and 3OST4 will also map to human 16p. Indeed, the identification of cloned mapping markers confirms that 3OST2 and 3OST4 localize to human 16p12 and 16p11.2, as described under "Experimental Procedures." Similarly, 3Ost3a and 3Ost3b map between Csfgm and Myhsf1 in mouse. These two latter genes have been assigned to 5q31 and 17pter-p11 in humans, respectively, which suggests the human 3OST3 genes will map to 5q or 17p. The identification of cloned markers resolves this ambiguity and shows that 3OST3A1 and 3OST3B1 both localize to 17p12-p11.2. The 3OST chromosomal regions lack human disorders with a phenotype that might be expected for an alteration in these loci (data not shown).

Tissue and Cell-type Specific Expression of Multiple Transcripts
Northern analyses with isoform-specific probes reveal tissue-specific expression for members of the multigene family (Fig. 8). Moreover, the more ubiquitously expressed members produce multiple transcripts that predominantly show coordinate regulation. 3-OST-4 exhibits the most selective pattern with only a single transcript detected in brain. The transcripts of 3-OST-2 predominantly occur in brain, but low expression is also observed in heart, placenta, lung, and skeletal muscle. Levels of the two 3-OST-1 transcripts are predominant in kidney and brain, intermediate in heart and lung, and low but detectable in the remaining analyzed organs (Fig. 8). The 3-OST-3 forms show the broadest expression pattern and the largest number of transcript forms. Although most tissues express both 3-OST-3A and 3-OST-3B, quantitative differences are evident. For example, the highest expression of 3-OST-3A occurs in heart and placenta, whereas 3-OST-3B is most abundant in liver and placenta. Furthermore, each tissue exhibits a distinct ratio of 3-OST-3A subtypes and 3-OST-3B subtypes. For 3-OST-2, 3-OST-3A, and 3-OST-3B the small transcripts of minor abundance are alternative splice variants that encode the unusual carboxyl-terminal fragments. The characterization of these and additional 3-OST-3 transcript classes shall be provided in a separate communication (as described above).
Interestingly, 3-OST-3 versus 3-OST-2 and 3-OST-4 transcripts show essentially reciprocal tissue expression. In contrast, the tissue-specific pattern of 3-OST-1 has overlap with all
⁷ We have found mouse expressed sequence tag clones derived from each gene (GenBank™ accession numbers W14854, W49404, and W71608 from 3Ost3a, AA254888 and AA288201 from 3Ost3b), and have detected the corresponding genes with SPLAG-A and SPLAG-B probes by heterologous hybridization to genomic DNA, as described under "Interspecific Mouse Back-cross Mapping."
Fig. 8. For certain forms, discrepancies between transcript and cDNA sizes not only reflect differences in polyadenylation, but also stem from incomplete isolation of 5′-UTR, 3′-UTR, and coding regions.
A, BstXI or EcoRI digests were sequentially hybridized to gene-specific probes respectively generated from the sulfotransferase domains of 3-OST-1, -2, -3, and -4, as described under "Experimental Procedures" and shown in Fig. 1 (ST probes). Note that gene copy number could be overestimated if the hybridization region is bisected by an unanticipated intron. To minimize this possibility, relatively short probes (192-575 bp) were selected. Moreover, genomic restriction mapping confirms the target regions of all 3OST3 genes are devoid of latent introns (Fig. 6). The 3-OST-3 probe, ST-3, detects both 3OST3A and 3OST3B bands, as sulfotransferase domain sequences are identical for both genes. The origins of these bands were determined by hybridization to 3′A or 3′B probes (Fig. 1), which are specific to the respective 3′-untranslated regions. Similar data were also generated with genomic DNA digested with BamHI, PstI, and PvuII. B, duplication of amino-terminal sequences was shown by hybridizations with SPLAG-A or SPLAG-B probes (Fig. 1). For each band, the gene(s) of origin were determined by comparing the observed size to that expected from the 3OST3A1 or 3OST3B1 gene sequence. Note that BstXI bisects the SPLAG-B probe, consequently the 3OST3B1 gene generates two bands (3,275 and 314 bp). The absence of a BstXI site in 3OST3B2 shifts the expected lower band up to 1967 bp.

Predicted Protein Structures
Extensive data base searching revealed the full-length 3-OST-2, -3A, and -3B enzymes and the partial length 3-OST-4 sequence to all be novel proteins. The 3-OST-2, -3A, and -3B cDNAs predict type II integral membrane proteins (37) of 367, 406, and 390 residues, respectively. Each is comprised of four domains beginning with a short (19-32 residues) amino-terminal cytoplasmic tail that exhibits a net positive charge (3-OST-2, -3A, and -3B contain 32, 12, and 19% basic residues but only 0, 4.2, and 3.1% acidic residues, respectively) and terminates with 2 or 3 basic residues (Figs. 2, 3, and 4). Interestingly, this segment of 3-OST-3B contains a polyproline run of 7 residues (residues 22-28). The second domain is hydrophobic, has a high probability of forming α-helix, and is flanked by charged residues; thus, it is anticipated to function as a membrane-spanning segment (Figs. 2, 3, and 4). Kyte-Doolittle hydropathy analysis reveals this section to be the only hydrophobic region of sufficient length to cross a membrane. The lengths of the hydrophobic regions of 3-OST-2 and 3-OST-3A (22 and 19 residues, respectively) are typical for transmembrane domains; however, 3-OST-3B has a 33-amino acid stretch of hydrophobic groups. Although the extent to which the 3-OST-3B hydrophobic region is buried in the membrane is presently unclear, sequence analysis with trained neural networks favors a transmembrane helix extending from Leu 35 to Gly 53 (21). Interestingly, the hydrophobic regions contain 3, 2, and 5 Cys residues (3-OST-2, -3A, and -3B, respectively), which is atypical of transmembrane domains.
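As an illustration of the kind of sliding-window hydropathy scan referred to above, the following is a minimal sketch and not the procedure actually used in this study. It uses the published Kyte-Doolittle index; the 19-residue window and the cutoff of 1.6 are the commonly quoted defaults for that method and are assumptions here, and the toy sequence is invented and is not a 3-OST sequence.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Return the mean hydropathy of every full-length window along seq."""
    scores = [KD_SCALE[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def membrane_candidates(seq, window=19, cutoff=1.6):
    """Return 1-based start positions of windows whose mean exceeds cutoff.

    The window and cutoff values are assumed defaults, not values
    taken from this paper.
    """
    profile = hydropathy_profile(seq, window)
    return [i + 1 for i, mean in enumerate(profile) if mean > cutoff]


# Hypothetical toy sequence (NOT a 3-OST sequence): a basic tail,
# a 19-residue hydrophobic core, and an acidic tail.
toy = "KKRR" + "L" * 19 + "DDEE"
profile = hydropathy_profile(toy)
peak_start = profile.index(max(profile)) + 1  # 1-based start of best window
```

For a real analysis, each deduced protein sequence would be scanned in the same way and the positions returned by `membrane_candidates` compared with the flanking charged residues noted in the text.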
The third domain ranges from 67 to 104 residues, and is designated as the SPLAG domain due to an extreme enrichment in Ser, Pro, Leu, Ala, and Gly (comprising 69, 62, and 70% of third domain residues in 3-OST-2, -3A, and -3B, respectively). Consequently, this region is predicted to be predominantly devoid of secondary structure, with only 4.4, 10, and 13% of contained residues having potential to form α-helix or β-sheet, for 3-OST-2, -3A, and -3B, respectively. Thus, this segment is likely to act as a flexible stem which links the catalytic sulfotransferase domain to the membrane anchor. Only the stem region of 3-OST-2 contains cysteines (two residues present), with Cys 55 and Cys 73 potentially forming a disulfide bond that generates a peptide loop of 19 amino acids (Fig. 2). Within the SPLAG domain, 3-OST-2 contains a single potential N-glycosylation site but all enzymes harbor potential O-glycosylation sites (5, 2, and 6 sites for 3-OST-2, -3A, and -3B) with mucin-like clustering (Figs. 2-4). A similarly high enrichment of SPLAG residues occurs in the amino-terminal stretch that abuts the sulfotransferase domain of the intraluminal resident 3-OST-1,
8 Exclusive expression of 3-OST-3A occurs in HeLa S3 (cervical carcinoma) and G361 (melanoma) cells; exclusive expression of 3-OST-3B in HL-60 (promyelocytic leukemia), MOLT-4 (lymphoblastic leukemia), and Raji (Burkitt's lymphoma) cells; whereas both transcript types are found in K-562 (chronic myelogenous leukemia), SW480 (colorectal adenocarcinoma), and A549 (lung carcinoma) cells (N. W. Shworak, unpublished data).
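The composition figures quoted for the SPLAG domain (and the basic/acidic percentages given for the cytoplasmic tail) are simple residue fractions. A minimal sketch of that calculation follows; the helper name `residue_fraction` and the example segment are hypothetical and were invented for illustration, so the printed value does not correspond to any 3-OST isoform.

```python
def residue_fraction(seq, residues):
    """Fraction of positions in seq occupied by any residue in `residues`."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    wanted = set(residues.upper())
    return sum(aa in wanted for aa in seq) / len(seq)


SPLAG = "SPLAG"   # Ser, Pro, Leu, Ala, Gly
BASIC = "HKR"     # His, Lys, Arg
ACIDIC = "DE"     # Asp, Glu

# Hypothetical stem-like segment, invented for illustration only.
segment = "SPLAGSPLAGSPLAGKDTQE"
splag_pct = 100 * residue_fraction(segment, SPLAG)
print(f"SPLAG content: {splag_pct:.0f}%")
```

The same helper applied with `BASIC` and `ACIDIC` gives the kind of charge composition used to describe the cytoplasmic tail and the sulfotransferase domain.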
FIG. 6. Genomic restriction mapping of 3′-untranslated regions of 3OST3A and 3OST3B gene pairs. Restriction mapping of genomic DNA by Southern hybridization with isoform-specific probes reveals a conservation in 3′-untranslated region restriction sites for 3OST3A and 3OST3B gene pairs. Blots were consecutively hybridized to 3′-specific probes (3′A and 3′B), and the 3-OST-3 common probe (ST-3). Genomic DNA was double digested with BstEII (sulfotransferase domain cleavage) in conjunction with enzymes that cleave near the known 3′ limit of 3-OST-3A (EcoRV, BclI) or 3-OST-3B (BsaI, AvrII) cDNAs. The obtained fragments indicate the distance of each 3′ site from BstEII. Restriction site position was confirmed by triple digests supplemented with EcoNI. Each 3′-specific probe detected single fragments for all tested enzyme combinations. Thus, both members of a gene pair have identical restriction site maps for these enzymes. Upper, deduced restriction maps for the 3′-portions of 3OST3A and 3OST3B gene pairs. Regions that are colinear with 3-OST-3A and 3-OST-3B cDNA sequences are schematically represented as described in the legend to Fig. 1. Presented are the size and location of hybridization probes. Lower, detected restriction fragments. Thick lines indicate anticipated fragments predicted from the respective cDNA structures, whereas thin lines represent novel products generated by scission of downstream unknown sequences.
The final region of ~260 residues extends to the carboxyl terminus and is the putative sulfotransferase domain. Although the 3-OST-2, -3A, and -3B enzymes all show a common regional organization, only the primary structures of the sulfotransferase domain show significant homology (Fig. 9A). Indeed, the 3-OST-3A and 3-OST-3B sulfotransferase domains are almost identical, except that the 3-OST-3A form contains an additional carboxyl-terminal residue (Gly 406). As described above, this identity results from the 3-OST-3A and 3-OST-3B cDNAs exclusively sharing a common sulfotransferase domain sequence. The entire sulfotransferase domain is extremely basic (about 20% His, Lys, Arg versus 10% Glu and Asp); however, this region does not exhibit previously recognized heparin binding motifs (38). Only two cysteine residues are present, which are closely spaced and could form a disulfide bond to generate peptide loops of 13 amino acids, respectively (Figs. 2-4). The 3-OST-2 and the common 3-OST-3 domains contain 3 and 2 potential sites for N-glycosylation, respectively, but all show a single potential O-glycosylation site. Interestingly, all 3-OST enzymes show a conserved potential N-glycosylation signal just before the potential peptide loop (Fig. 9A, consensus residues 214-216).

DISCUSSION
The 3-OST Multigene Family and Heparan Diversity-Heparan sulfate proteoglycans bearing glycosaminoglycans with distinct fine structures have been implicated in a myriad of biologic roles; however, the means to independently regulate the production of such a broad array of functionally important structures has remained largely unclear. Indeed, such a mechanism is only exemplified by the rate-limiting action of 3-OST-1. To find new candidates for regulating heparan sulfate structure, we identified expressed sequence tag clones homologous to the sulfotransferase domain of 3-OST-1 and subsequently isolated human cDNAs encoding 3-OST-2, -3A, -3B, and an incomplete clone of 3-OST-4. We also obtained novel splice variants encoding carboxyl-terminal fragments, which shall be separately described. Southern analyses revealed a surprisingly extensive multigene family, with 7 human members (3OST1, 3OST2, 3OST3A1, 3OST3A2, 3OST3B1, 3OST3B2, and 3OST4). However, the functionality of 3OST3A2 and 3OST3B2 remains to be established. Localization of the mouse isologs (3Ost1, 3Ost2, 3Ost3a, 3Ost3b, and 3Ost4) and bioinformatic identification of cloned markers predict the chromosomal loci of the corresponding human genes. These analyses suggest that the human genes are not candidates for previously mapped genetic disorders.
Northern analyses show that the human 3-OST genes are differentially regulated in both tissue and cell type-specific fashions, testifying to distinct functional roles. Moreover, multiple transcript sizes occur for most isoforms. Multiplicity has also been observed for the transcripts of heparan biosynthetic enzymes NST-1, 2-OST, and uronosyl C5-epimerase (39-41). Additional mRNAs might engender enhanced regulatory control or distinct functional properties. On one hand, the two 3-OST-1 messages probably differ by alternative splicing within the 5′-untranslated region, which occurs extensively for the murine counterpart (12). Such differences in noncoding regions can provide for differential regulation of translational efficiency or message accumulation (42, 43). On the other hand, alternative splicing within the coding region produces minor transcript variants of 3-OST-2, -3A, and -3B, which encode carboxyl-terminal fragments that likely serve a nonenzymatic function. Presumably, the large number of 3-OST-3 transcripts implies participation in several biologic processes.

cDNAs of Human Heparan Glucosaminyl 3-O-Sulfotransferases
Distinct biologic roles for each isoform are also indicated by our elucidation that 3-OST-1, -2, and -3 forms each generate unique 3-O-sulfated structures (34). Given the paucity of 3-O-sulfated glucosaminyl residues within heparan sulfate (7, 23), the novel isoforms may mimic 3-OST-1 by functioning in a critical rate-limiting capacity (5, 10). The newly isolated enzymes should then serve as key regulatory components that enhance the functional diversity of heparan sulfate. We speculate that 3-OST-2 may play a role in the nervous system, whereas the 3-OST-3 isoforms might contribute to the permselectivity of the glomerular basement membrane (elaborated in Ref. 34). However, the extreme complexity of the multigene family suggests these enzymes may serve to modulate a rather diverse array of biologic functions.
Structural Features of the Divergent Amino-terminal Region-Examination of the deduced structures of the novel enzymes reveals several common as well as distinctive features and provides a foundation for exploring the molecular basis of heparan sequence diversity. The 3-OST-2, -3A, and -3B enzymes are type II integral membrane proteins and so are structurally comparable to all previously cloned glycosaminoglycan biosynthetic enzymes except for 3-OST-1, which has an intraluminal resident style (12, 40, 41, 44-49). The architecture of type II enzymes is akin to that of the glycosyltransferases (46), which show two major functional regions. The large carboxyl-terminal region accounts for most of the intraluminal portion and forms a globular catalytic domain. The smaller amino-terminal region encompasses the cytoplasmic, transmembrane, and flexible stem domains; however, residues from each of these regions have been shown to direct localization to Golgi subcompartments (50). Thus, the entire amino-terminal region may be considered in terms of compartmentalization and protein-protein interactions.
The 3-OST family parallels this division via the conserved carboxyl-terminal sulfotransferase domain and the divergent amino-terminal regions. That these two regions may be functionally discrete is supported by examination of the presumptive Caenorhabditis elegans 3-OST. In this organism, we have identified only a single gene and the encoded enzyme shows features of a primordial 3-OST. Specifically, the sulfotransferase domain is most closely related to the type II enzymes (Fig. 9B); however, the amino-terminal domain shows an intraluminal resident style like 3-OST-1 (see Footnote 9). If this hybrid structure represents the primordial enzyme, then the type II amino-terminal domain must have evolved long after the elaboration of a functional sulfotransferase domain. Functional distinctiveness is also favored by the determination that 3-OST-3A and 3-OST-3B generate identical 3-O-sulfated disaccharides (34). Thus, sulfation specificity corresponds to the nearly identical sulfotransferase domains and is not perturbed by the unique amino-terminal regions.
That the amino-terminal region serves a compartmentalization/protein interaction role is supported by an analysis of NST-1, which occurs in the trans-Golgi network. The amino-terminal 161 residues are sufficient for retention within the Golgi (51). Within this region of NST-1, NST-2, and 6-OST the flexible stem shows a SPLAG enrichment comparable to the 3-OST stem region (SPLAG domain). However, the absence of such enrichment in 2-OST suggests that extreme SPLAG enrichment is not exclusively necessary for conveying flexibility and the SPLAG domain may thereby participate in an additional process, such as compartmentalization. Such a role could also account for the intraluminal retention of 3-OST-1, which is simply composed of an amino-terminal SPLAG domain fused to a carboxyl-terminal sulfotransferase domain. Compartmentalization/protein interactions may additionally involve residues within the transmembrane region or the cytoplasmic tail. In the first case, the unusual placement of cysteine residues within the transmembrane segment of 3-OST-2, -3A, and -3B raises the possibility of a covalent interaction with a retention partner or with biosynthetic components. Such a role has previously been proposed for the conserved cysteine residue that occurs in the membrane-spanning domain of the syndecan-1 core protein (52). In the second case, the cytoplasmic tail of 3-OST-3B contains a polyproline tract. Poly-L-proline can form a rigid left-handed helix and such motifs are critical elements bound by protein interaction modules such as SH3 and WW domains (53-55). In summation, protein-protein interactions within the amino-terminal regions may control the formation of specific heparan sulfate sequences by constraining the enzyme's spatial organization or functional interactions. Consequently, the unique amino-terminal regions of 3-OST-3A and 3-OST-3B may confer distinctive biologic roles on the virtually identical sulfotransferase domains.
9 C. elegans 3-OST was identified from data banks as described in the legend to Fig. 9. The amino-terminal portion (residues 1-22; MKYRLLLILHLIDLISC↓GVIPN) shows striking similarities to 3-OST-1. In particular, the short hydrophobic stretch with internal charged residues (single underline) and a potential signal peptidase cleavage site (↓) (63) suggest C. elegans 3-OST is an intraluminal resident just like 3-OST-1 (12). Furthermore, residues immediately preceding the sulfotransferase domain are nearly identical between C. elegans 3-OST (double underline) and human 3-OST-1 (residues 44-48, GVAPN).

FIG. 8. Tissue-specific expression of human 3-OST isoforms.
Northern blot analysis was performed on 5 μg of poly(A)+ RNA isolated from various human organs (CLONTECH). Displayed are transcript sizes determined by co-electrophoresis of mRNA size standards. Membranes were hybridized with ³²P-labeled isoform-specific probes, as described under "Experimental Procedures." In particular, 3-OST-3A and 3-OST-3B isoforms were detected with the 3′-untranslated region-specific probes. Note that this analysis must at least detect products of 3OST3A1 and 3OST3B1 genes, but potentially may also reveal transcripts from the structurally similar 3OST3A2 and 3OST3B2 genes, respectively. Note that with ST-3, the 3-OST-3 common sulfotransferase domain probe, the liver 2.6-kb 3-OST-3A transcript and the 6.2-kb 3-OST-3B transcript exhibited equal intensities, which provides a reference point for comparing 3-OST-3A and 3-OST-3B expression levels.

FIG. 9. Comparison of sulfotransferase domains. A, the program "Pileup" was used to align amino acid sequences of the HS1 type sulfotransferase domains from the human 3-OST and NST isoforms, described below. Indicated are sequence spans highly conserved among virtually all sulfotransferases that are predicted to serve in 5′-phosphate binding/lysyl catalysis (··K··), 3′-phosphate binding (··3′ PO4··), and possibly 5′-sulfate interaction (··? 5′ SO4). Also shown is the presumptive cystine-bridged region that is conserved among HS1 type sulfotransferase domains ({··C··C··}). Consensus residues (shaded) are indicated for each position where at least 5 candidates exhibit identical or similar amino acids. Numeration is given for each full-length enzyme and for a consensus sequence. B, a dendrogram comparing the relatedness of sulfotransferase
Structural Features of the Conserved Sulfotransferase Domain-3-OST family members are defined by the highly conserved sulfotransferase domain. The importance of this structure is highlighted by our finding that gene conversion maintains virtually identical sulfotransferase domains between 3OST3A1 and 3OST3B1 genes. Gene conversion occurs in the germ line as a transfer of genetic information from donor to acceptor loci without alteration of the donor material. This process can prevent mutational drift and proceeds quite efficiently between nonallelic loci on the same chromosome (reviewed in Ref. 56), which would be consistent with the proposed 3OST3 multigene cluster. It is especially striking that the limits of the converted DNA sequence correspond exactly to the limits of the sulfotransferase domain of 3-OST-3B.
We have previously employed simultaneous multiple sequence alignment to show that the sulfotransferase domain of 3-OST-1 shows homology to a broad range of sulfotransferases, including cytosolic and Golgi enzymes isolated from animals, plants, and bacteria (12). Critical features are revealed by extending this comparison to include virtually all known adenosine 3′-phosphate 5′-phosphosulfate (PAPS)-requiring sulfotransferases found in GenBank™. Collectively, this group modifies a broad range of molecules, yet these enzymes show a 260-290-residue carboxyl-terminal region with at least 25-30% similarity to each 3-OST sulfotransferase domain. Such conservation reflects common structural and functional constraints imposed by the obligate cofactor PAPS (57). In particular, we have observed that the consensus sequence (L/I/V)₃₋₄-X₃₋₅-K-S-G-T-X₁₋₂-(W/L) occurs near the amino terminus of the sulfotransferase domain of all enzymes (each consensus residue occurs in at least 50 of 66 tested sequences; minor conservative substitutions not presented). This consensus predominantly overlaps conserved region I (of cytosolic sulfotransferases) that appears to be a critical active site component, as indicated by affinity labeling with a PAPS analog and by mutational analysis (58-60). The central basic residue, typically lysine (92%), is considered essential for stabilization of a transition state intermediate, as the Lys → Ala mutant of flavonol 3-sulfotransferase dramatically reduces enzymatic activity with minimal effect on PAPS binding (59). These assertions are confirmed by x-ray crystallography of the estrogen sulfotransferase bound to adenosine 3′-phosphate 5′-phosphate (a PAPS analog), where the consensus region forms a β-strand/P-loop/α-helix motif. The P-loop corresponds to the underlined tetrapeptide, and amide nitrogens from each residue may hydrogen bond with the 5′-phosphate. Moreover, Nζ from the central lysine neutralizes the negative charge of this phosphate (61).
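The consensus described above can be written as a regular expression to locate candidate 5′-phosphate binding sites in a protein sequence. The sketch below is only an illustration under stated assumptions: the pattern is a strict, literal translation of the consensus (it does not allow the minor conservative substitutions mentioned in the text, nor a non-lysine central basic residue), the function name `find_p_loop` is hypothetical, and the toy sequence is invented rather than taken from any sulfotransferase.

```python
import re

# Literal translation of (L/I/V)3-4 - X3-5 - K-S-G-T - X1-2 - (W/L).
# The capture group marks the central lysine.
P_LOOP = re.compile(r"[LIV]{3,4}.{3,5}(K)SGT.{1,2}[WL]")


def find_p_loop(seq):
    """Return (motif_start, lysine_position) pairs, both 1-based."""
    return [
        (m.start() + 1, m.start(1) + 1)
        for m in P_LOOP.finditer(seq.upper())
    ]


# Hypothetical toy sequence, invented for illustration only.
toy = "AAALIVLPQRSKSGTTALQQ"
hits = find_p_loop(toy)
print(hits)  # [(4, 12)]
```

Because the pattern is strict, a real scan of the 66 compared sequences would be expected to miss members carrying conservative substitutions; relaxing individual character classes would be one way to accommodate them.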
Thus, the above consensus ascribes a fundamental sulfotransferase structure that is critically required for both the binding of PAPS and the catalysis of sulfate transfer. This consensus region is almost invariant among the human 3-OST enzymes (Fig. 9A) and secondary structure analysis predicts a strand-loop-helix motif for each enzyme. Moreover, the conserved lysine occurs in all heparan sulfate sulfotransferases and likely serves an equivalent catalytic role. Indeed, alanine mutagenesis of the conserved lysyl residue has recently been shown to dramatically reduce sulfotransferase activity of 3-OST-1 (see Footnote 10) and NST-1 (62).
A second, less well conserved, consensus K-(aliphatic)₅-R-N-X₂-(D/E)-X₃-S-X-Y forms a sheet-turn-helix structure in the estrogen sulfotransferase, and side groups from underlined residues interact with oxygens of the 3′-phosphate (61). This region is predicted to form a sheet-loop-helix structure in the 3-OSTs, which would also be consistent with phosphate binding. Recently, Kakuta et al. (57) have similarly noted the importance of the above two regions; however, our analysis additionally reveals a previously unidentified structure, G-X-(W/Y)-X₂-H-X₃-(W/L)₂. We have determined that this third sequence maps to a loop-helix structure at the active site and the underlined residues lie in the vicinity of the 5′-sulfate of PAPS. These interactions could facilitate sulfate binding or enzymatic transfer. Of course, such contacts could not have been crystallographically observed because estrogen sulfotransferase was co-crystallized with the sulfate-free analog adenosine 3′-phosphate 5′-phosphate (61). This potential sulfate interaction region is predicted to also form a loop-helix structure in the 3-OSTs.
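The second and third consensus sequences can be scanned for in the same pattern-matching style. The following sketch is illustrative only: it assumes that "aliphatic" means Ala, Ile, Leu, or Val (the text does not define the class), it matches the consensus strictly without conservative substitutions, and the names `MOTIFS` and `locate_motifs` as well as the toy sequence are hypothetical.

```python
import re

# ASSUMPTION: "aliphatic" is taken to mean A, I, L, or V.
MOTIFS = {
    # K-(aliphatic)5-R-N-X2-(D/E)-X3-S-X-Y
    "3'-phosphate": re.compile(r"K[AILV]{5}RN.{2}[DE].{3}S.Y"),
    # G-X-(W/Y)-X2-H-X3-(W/L)2
    "5'-sulfate": re.compile(r"G.[WY].{2}H.{3}[WL]{2}"),
}


def locate_motifs(seq):
    """Map each motif name to the 1-based start positions of its matches."""
    seq = seq.upper()
    return {
        name: [m.start() + 1 for m in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }


# Hypothetical toy sequence, invented for illustration only.
toy = "KAILVARNQQDPPPSTY" + "MM" + "GAWQQHAAAWL"
hits = locate_motifs(toy)
print(hits)  # {"3'-phosphate": [1], "5'-sulfate": [20]}
```

In a genuine sulfotransferase domain the two motifs would be expected in this order, with the putative sulfate interaction region downstream of the 3′-phosphate binding region.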
Comparison of just the heparan sulfate sulfotransferases allows the designation of three distinct types of sulfotransferase domains (HS1, HS2, and HS3; Fig. 9B). Although four sulfotransferase families are clearly delineated, the N- and 3-O-groupings both possess a related HS1 structure (~50% similarity between families) (Fig. 9B). Presumably, unique features of individual sulfotransferase domains enable discrimination of distinct precursor structures and thereby provide a mechanism for generating and regulating heparan sulfate sequence diversity. In this regard, the HS1 form is distinguished by a carboxyl-terminal region of ~30 residues that contains the presumptive cystine-bridged peptide loop (Fig. 9A, consensus residues 211-240). Within this highly conserved region, cysteines are invariant but the intervening 8-11 amino acids are poorly conserved. Indeed, the peptide loop is structurally distinct for each 3-OST isoform. Thus, this variable loop might serve to discriminate between different heparan sulfate structures and thereby account for the distinct sequences generated by individual 3-OST isoforms (34).
10 J. Liu, unpublished data.
In conclusion, the multiple functions of heparan sulfate proteoglycans necessitate a biosynthetic mechanism that tightly regulates the generation of a myriad of distinct heparan sulfate fine structures. The paradigm of 3-OST-1 shows that a biologic activity of heparan sulfate can be individually regulated by controlling the level of a sulfotransferase that contributes a rare modification to complete the formation of a critical heparan sulfate sequence. The utility of this mechanism may account for the large number of 3OST genes with distinct tissue and cell type-specific expression patterns. 3-OST isoforms with different sulfotransferase domains differentially place the rare 3-O-sulfate in different sequence contexts to presumably regulate discrete biologic activities. This capacity of the sulfotransferase domain to generate distinct sequences may in turn be modulated by distinct amino-terminal domains. The elucidation of the critical nonconserved and conserved residues which determine the sequence specificity for sulfation and enzyme interactive properties is fundamental groundwork toward understanding the regulated production of defined monosaccharide sequences.