Identification of Amino Acids Important for the Catalytic Activity of the Collagen Glucosyltransferase Associated with the Multifunctional Lysyl Hydroxylase 3 (LH3)*

Collagen glucosyltransferase (GGT) activity has recently been shown to be associated with human lysyl hydroxylase (LH) isoform 3 (LH3). The LH and GGT activities of the multifunctional LH3 protein modify lysyl residues in collagens posttranslationally to form hydroxylysyl and glucosylgalactosyl hydroxylysyl residues, respectively. We now report that in the nematode Caenorhabditis elegans, where only one ortholog is found for lysyl hydroxylase, the LH and GGT activities are also associated with the same gene product. The aim of the present studies is the identification of amino acids important for the catalytic activity of GGT. Our data indicate that the GGT active site is separate from the carboxyl-terminal LH active site of human LH3, the amino acids important for the GGT activity being located at the amino-terminal part of the molecule. Site-directed mutagenesis of a conserved

others are not essential for development and survival (1,2).
Collagens are a large family of glycoproteins, which are ubiquitously distributed in almost all tissues of the body. It is known that collagens are structural building blocks of tissues and, in addition, have regulatory functions important for cell behavior. The biosynthesis of collagen is a multistep process including many posttranslational modifications, some of which are unique to collagens (3)(4)(5)(6). These include the hydroxylation of lysyl residues and glycosylation of hydroxylysyl residues. The sugars linked to hydroxylysyl residues are galactose or a disaccharide glucosylgalactose and are present only in collagens and proteins having collagenous amino acid sequence in their structure (3)(4)(5)(6).
The number of hydroxylysyl and glycosylated hydroxylysyl residues varies among different collagen types. The numbers may also vary in the same collagen type in different tissues and in different physiological states (3)(4)(5)(6). Studies on bone collagen, for instance, indicate that the lysyl modifications vary among different skeletal regions according to the main function the bone exerts in different locations (7). Hydroxylysyl residues play an important role in collagen cross-link formation, hydroxylysyl derived cross-links being typically found in skeletal and major internal connective tissues (3,6,8). Many of the cross-links are glycosylated (9). Recent studies have demonstrated changes in the cross-link profile in osteoporosis, a disease with a decrease in bone strength (8). There is evidence that hydroxylation of lysyl residues in collagen is associated with mineralization of fibrils, a high hydroxylation level preventing the deposit of minerals between the fibrils (10). Furthermore, studies on fibrillar collagens have indicated that an increased content of glycosylated hydroxylysyl residues of collagen is associated with a decrease of fibril diameter both in vivo and in vitro (11,12). The detailed biological functions of galactosylhydroxylysyl and glucosylgalactosylhydroxylysyl residues are not known, however. There are no studies so far using transgenic animals to study the consequences of abnormalities of the glycosylated hydroxylysyl residues in vivo, and there is no information regarding possible heritable disorders caused by defects in these modifications.
Lysyl hydroxylase (LH, EC 1.14.11.4) in the mouse and human is present in three different molecules, LH1, LH2, and LH3, originating from three different genes (16-21). Galactosylhydroxylysyl glucosyltransferase (GGT, EC 2.4.1.66) catalyzes the transfer of glucose from UDP-glucose to galactosylhydroxylysyl residues in the presence of Mn2+ (3)(4)(5)(6)(13). We recently found (14) that GGT activity is associated with the

* This work was supported by grants from the Research Council for Biosciences and Environment within the Academy of Finland, a grant from the Sigrid Juselius Foundation, and a grant from the Biocenter Oulu. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The ability to selectively delete or alter predetermined genes in an animal model provides enormous power for the study of gene function. The deletion of the mouse or human LH3 gene, for instance, will lead to a disturbance of the consecutive steps in hydroxylysine-linked carbohydrate formation. To investigate the specific function of glucosylation, however, the selective inhibition of the GGT activity of LH3 is required. To prepare for these studies, more information about the amino acids of LH3 responsible for the GGT activity is needed. In this study we have investigated the catalytic properties of the single ortholog of lysyl hydroxylase found in the nematode Caenorhabditis elegans (22) and found that LH and GGT activities are associated with the same gene product. We have characterized amino acids important for GGT activity in human LH3 and C. elegans LH and shown that these amino acids are localized at the amino-terminal part of the molecule, separate from the lysyl hydroxylase active site. This enables us to manipulate the gene to remove the GGT activity without affecting the LH activity of the gene product.

EXPERIMENTAL PROCEDURES
Cell Cultures-Sf9 (Spodoptera frugiperda) insect cells were grown in Sf-900 II serum-free medium (Invitrogen). Escherichia coli XL1-Blue cells transformed with different LH constructs were grown in LB broth containing 100 μg/ml ampicillin.
Expression of cDNAs in Insect Cells-Baculovirus transfer vector pFastBac1 in the BAC-TO-BAC™ (Invitrogen) expression system was used in the production of recombinant proteins. The LH3 construct (18) expressed in insect cells produced a recombinant protein having a signal peptide from the human LH1 sequence, a His tag at the amino terminus after signal peptide cleavage, and the amino acids encoded by nucleotides 289-2455 of the LH3 cDNA sequence. The insect cells were harvested 72 h after infection and homogenized as described earlier (18).
Expression of cDNAs in Bacterial Cells-The LH3 cDNA construct used in this study was described elsewhere (14). The insert for the C. elegans LH (corresponding to amino acids 14-730 in the cDNA) was generated by PCR (Dr. Robert Barstead kindly provided us with a C. elegans cDNA library) and ligated into the E. coli SacI-PstI site of a pQE30 vector. The recombinant proteins have the whole coding sequence without the signal sequence and a His tag at the amino terminus. Constructs were transformed into the E. coli XL1-Blue strain, and cultures were grown to an OD600 of 0.6-0.7 at 37°C. Proteins were induced by the addition of 1 mM isopropyl-1-thio-β-D-galactopyranoside at 30°C for 5 h. For activity measurements, the cells were resuspended in lysis buffer (0.4 M NaCl, 0.5% Nonidet P-40, and 20 mM Tris-HCl, pH 7.8, at 20°C) and incubated in the presence of lysozyme (50 μg/ml) and RNase A (10 μg/ml) at room temperature for 30 min. Lysis was completed by sonication. The cell debris was removed by centrifugation, and the supernatant was used in the measurements. For the protein mobility shift assay, 10 mM iodoacetamide was used in the lysis buffer to alkylate the proteins.
Enzyme Activity Assays-LH activity was assayed by a method based on the hydroxylation-coupled decarboxylation of 2-oxo[1-14C]glutarate (23) with the synthetic peptide IKGIKGIKG as a substrate. GGT activities were measured by a method based on the transfer of radioactively (tritium) labeled glucose from UDP-sugar to galactosylhydroxylysyl residues in a calf skin gelatin substrate (24). The specific detection of the reaction products was performed after alkaline hydrolysis.
Other Assays-In vitro translation was carried out as described elsewhere (14) by using a pCITE 4a vector (Novagen) for the transcription and a STP3 kit (Novagen) for the translation. Western blot analysis was carried out using monoclonal antibodies against the His tag (Sigma). The proteins were fractionated under reducing or non-reducing conditions by 10% SDS-PAGE, blotted onto an Immobilon-P membrane (Millipore), and incubated with the primary antibody. Anti-mouse IgG peroxidase conjugate (Zymed Laboratories Inc.) was used as the secondary antibody. Bound antibodies were visualized using the ECL detection system (Amersham Biosciences) and x-ray film (Eastman Kodak Co). The QuikChange site-directed mutagenesis kit (Stratagene) was used to make mutations in the cDNA sequences. The nucleotide changes of the mutations were confirmed by sequencing.

RESULTS AND DISCUSSION
GGT Activity in E. coli Cells Transformed with the C. elegans Lysyl Hydroxylase cDNA Construct-The whole genome of C. elegans has recently been sequenced (15), and the sequence data indicate that there is a single gene for lysyl hydroxylase (22). To express the protein in E. coli and to determine whether the LH gene product possesses the GGT activity, which has been found to be associated with human LH3 (14), we have prepared a cDNA construct for C. elegans LH. Control E. coli cells, containing only the pQE30 vector, as well as cells expressing DHFR, have only very low residual GGT activity (<300 dpm/mg soluble cell protein). When the C. elegans LH cDNA was expressed in E. coli cells, GGT activity was present in the soluble protein fraction. Due to different expression levels, the activity varied in different experiments, giving values in the range of 21,000 up to 30,700 dpm/mg soluble protein.
Amino Acids Conserved in LH3 and C. elegans-Our data indicate that GGT activity is associated with the LH of C. elegans and LH3 of human (14) and mouse, whereas human LH1 and LH2 possess only LH activity (14). To determine the conserved amino acids for the multifunctional GGT/LH protein (Fig. 1) we have aligned the C. elegans LH and mouse and human LH3 cDNA sequences and compared them to the sequences of LH1 (bovine, rat, chicken, mouse, human) and LH2 (mouse, human). There are 29 amino acids conserved between the LH3 sequences and the C. elegans sequence (LH3-C. elegans-specific amino acids) but differing from the LH1 and LH2 sequences (shaded black, Fig. 1). The amino acids are scattered evenly throughout the molecule. The whole Drosophila genome has also recently been sequenced (25), and only one gene was found for LH in this genome. As seen in Fig. 1, 11 of the 29 amino acids are also conserved in the Drosophila lysyl hydroxylase.
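The selection rule described above (residues shared by the LH3 and C. elegans sequences but absent from LH1 and LH2 at the same alignment position) can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length; the function name, the sequence labels, and the five-residue toy fragments are hypothetical and are not taken from Fig. 1.

```python
def lh3_specific_columns(alignment, ggt_group, lh_only_group):
    """Return (column, residue) pairs where every sequence in ggt_group shares
    one residue that no sequence in lh_only_group has at that column.

    alignment: dict mapping a sequence name to an aligned string ('-' = gap).
    Columns are 0-based alignment columns, not residue numbers.
    """
    length = len(next(iter(alignment.values())))
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")

    hits = []
    for col in range(length):
        ggt_residues = {alignment[name][col] for name in ggt_group}
        other_residues = {alignment[name][col] for name in lh_only_group}
        # The GGT-positive group must agree on a single non-gap residue.
        if len(ggt_residues) != 1 or "-" in ggt_residues:
            continue
        # That residue must not occur in any LH-only sequence.
        if ggt_residues.isdisjoint(other_residues):
            hits.append((col, next(iter(ggt_residues))))
    return hits


# Hypothetical toy alignment (made-up fragments, for illustration only).
toy_alignment = {
    "human_LH3": "ACDLK",
    "mouse_LH3": "ACDLK",
    "celegans_LH": "GCDLR",
    "human_LH1": "AIDIK",
    "human_LH2": "ALDIK",
}
specific = lh3_specific_columns(
    toy_alignment,
    ggt_group=["human_LH3", "mouse_LH3", "celegans_LH"],
    lh_only_group=["human_LH1", "human_LH2"],
)
print(specific)  # [(1, 'C'), (3, 'L')]
```

In this toy input, columns 1 and 3 are reported because all three GGT-positive sequences share a residue (C and L) that neither LH-only sequence carries at that column.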
LH3-C. elegans-specific Amino Acids Necessary for GGT Activity-In vitro mutagenesis combined with an in vitro translation system was used in our experiments (Fig. 2). Both the C. elegans and human LH3 sequences were used in the studies. We began the search for amino acids important for GGT activity with C. elegans LH. The most inhibitory changes, and one change in the carboxyl-terminal area as a control, were tested further with human LH3. Aspartic acid at position 392 in human LH3, which is conserved in the LH isoforms of all species, was mutated to alanine because it was one candidate for the DXD motif of the molecule (see later). The data were then confirmed in an E. coli system for the human LH3 (not shown). We mutated the LH3-C. elegans-specific amino acids to the amino acids present in LH1 and/or LH2. As seen in Fig. 2, mutations at the amino-terminal end of the molecule caused an inhibition of GGT activity, whereas mutations at the carboxyl-terminal end of the molecule had no effect on this activity. Two mutations, C144I and L208I in the human sequence and C132I and L196I in the C. elegans sequence, inhibited GGT activity markedly in both species, suggesting that these amino acids, conserved in LH3 and C. elegans LH, are important for the catalytic activity of GGT. Our data also reveal some differences in the GGT active sites between C. elegans and human. Two leucine changes, L124V and L125V, in the C. elegans sequence inhibited the GGT activity by 40 and 70%, respectively, whereas the corresponding mutations in the human sequence increased GGT activity. Furthermore, a mutation of A453I in the C. elegans sequence (corresponding to Ala 464 of the human LH3 sequence) caused a marked inhibition of GGT activity, whereas the corresponding change in the human sequence had no effect on this activity.

FIG. 1. Alignment of lysyl hydroxylase isoform sequences of bovine, rat, chicken, human, mouse, C. elegans, and Drosophila. Bovine (GenBank™ accession no. AF054274), rat (GenBank™ accession no. L25331), and chicken (GenBank™ accession no. M59183) sequences most probably represent the LH1 isoform of lysyl hydroxylase. Only one isoform for lysyl hydroxylase is found in the genomes of C. elegans (15) (GenBank™ accession no. Z66512) and Drosophila (24) (GenBank™ accession no. AE003545). The amino acids conserved between the LH3 sequences (mouse, GenBank™ accession no. AF046783; human, AF046889) and the C. elegans sequence but differing from LH1 (mouse, GenBank™ accession no. AF046782; human, L06419) and LH2 (mouse, GenBank™ accession no. AF080572; human, U84573) sequences are shaded black, whereas DXD-like motifs in these molecules are shaded gray.
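The substitution labels used in this section, such as C144I, name the wild-type residue, its 1-based position, and the replacement residue. As a hedged illustration only, the sketch below applies such a label to a protein string and checks that the expected wild-type residue is actually present. The helper name and the toy sequence are hypothetical; this is an in silico bookkeeping aid and not the QuikChange protocol itself.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement.
_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitution(protein, label):
    """Return protein with the substitution in label (e.g. 'C144I') applied."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type = match.group(1)
    position = int(match.group(2))
    replacement = match.group(3)
    if not 1 <= position <= len(protein):
        raise ValueError(f"position {position} is outside the sequence")
    if protein[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, "
            f"found {protein[position - 1]}"
        )
    return protein[: position - 1] + replacement + protein[position:]


# Hypothetical five-residue toy sequence, not a real LH fragment.
mutant = apply_substitution("MKCLA", "C3I")
print(mutant)  # MKILA
```

The wild-type check is the useful part of the design: it catches numbering mismatches, for example when a position is counted with or without the signal peptide.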
During in vitro mutagenesis studies we also generated a C. elegans molecule, in which Leu 124 (C. elegans sequence) was mutated to valine, and the molecule also contained a tandem duplication of 11 amino acids (His-Tyr-Ser-Glu-Lys-Arg-Val-Leu-Phe-Gly-Ala) in the amino-terminal region. This change reduced the GGT activity dramatically when compared with a single amino acid change in the molecule, giving a GGT activity value of 0.5% (not shown). These data suggest that disruption of the structure of the molecule by the insertion of amino acids at this position abolishes the GGT activity.
There are no data about the multifunctionality of the LH gene in Drosophila, but it is interesting to note that the conserved amino acids, Cys 144 and Leu 208 in the human sequence, are also found in the Drosophila LH, suggesting a possible association of GGT activity with the insect LH gene. It is worth noting that the amino acid corresponding to Ala 464 of the human sequence (Ala 453 in the C. elegans sequence) is isoleucine in the Drosophila sequence, which corresponds to the LH1 sequence in this position.

FIG. 2. The effect of mutation of LH3-C. elegans-specific amino acids on GGT activity. C. elegans LH and human LH3 were expressed by an in vitro translation system. The translation level was analyzed by SDS-PAGE and autoradiography (not shown). The activity without any mutation is taken as 100.
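The normalization stated for Fig. 2 (the activity without any mutation is taken as 100) amounts to a simple ratio. The sketch below is one way to compute it. The optional background subtraction is an assumption added for illustration and is not described in the text, and the numbers in the usage line are hypothetical, not measured values.

```python
def relative_activity(mutant_dpm, wild_type_dpm, background_dpm=0.0):
    """Express a mutant's activity as a percentage of the unmutated protein.

    background_dpm is an assumed, optional correction (default: none).
    """
    reference = wild_type_dpm - background_dpm
    if reference <= 0:
        raise ValueError("wild-type activity must exceed the background")
    return 100.0 * (mutant_dpm - background_dpm) / reference


# Hypothetical counts, chosen only to show the arithmetic.
print(relative_activity(500.0, 10000.0))  # 5.0
```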
Cys 144 Forming a Disulfide Bond in LH3 Structure?-We used the alkylating agent iodoacetamide to block free sulfhydryl groups to determine whether the human Cys 144 forms a disulfide bond in the LH3 molecule (26). E. coli cells producing His-tagged LH3 were lysed in the presence of the alkylating reagent, and the proteins purified on a nickel column were then analyzed under non-reducing and reducing conditions by SDS-PAGE combined with Western blotting. As seen in Fig. 3, under non-reducing conditions LH3 migrates as a monomer. There is a shift in the electrophoretic mobility after reduction of the native LH3 by DTT, suggesting the presence of disulfide bonds in the native molecule (Fig. 3). The molecule in which Cys 144 was mutated to isoleucine behaved like the native molecule in the gel, suggesting that Cys 144 is not disulfide-linked, in contrast to Cys 494 and Cys 577, which seem to be disulfide-bonded in human LH3 (Fig. 3). Cys 494 and Cys 577 are conserved in all lysyl hydroxylase isoforms (Fig. 1). We measured GGT activity in alkylated and non-alkylated LH3 samples by using both crude cell supernatant and nickel column-purified enzyme preparations in our assay. The data indicate no decrease in the activity in the alkylated samples (not shown). As already indicated above (Fig. 2), mutation of Cys 144 to Ile reduced GGT activity dramatically. Accordingly, our data suggest that Cys 144 is required for the functional integrity of GGT but does not form a disulfide link and probably does not participate directly in the enzyme catalysis.
The Amino Acid Changes Inhibiting Glucosyltransferase Activity Do Not Affect LH Activity-The baculovirus expression system was used to determine whether amino acid changes that inhibited the GGT activity of LH3 have an effect on the LH activity. We studied the most effective mutations of human LH3, i.e. C144I and L208I, in an LH activity assay. For GGT, our results (data not shown) indicate that the effect of the mutations was independent of the expression system used for recombinant protein production: the baculovirus, in vitro translation, and E. coli systems gave similar values for the mutated molecules. Our data for LH indicate, however, that these mutations did not have any effect on LH activity. This is consistent with our earlier data obtained from LH1, that the amino acids responsible for LH activity are located in the carboxyl-terminal portion of the molecule (27,28). The carboxyl-terminal part of the molecule is the most conserved region among the LH isoforms (16-20).
The Amino-terminal Part of LH3 Is Able to Generate GGT Activity-Our data above suggest that the amino-terminal portion of the LH3 molecule is important for GGT activity. To determine whether the carboxyl-terminal portion of the molecule is required for GGT activity, we removed amino acids from the carboxyl-terminal end of the human LH3. We generated a translational stop codon in different parts of the molecule, and the truncated LH3 molecules produced in E. coli and insect cells (baculovirus system) were assayed for GGT activity (Fig. 4). These data indicate that shortening the carboxyl-terminal portion of the LH3 molecule reduces, but does not eliminate, GGT activity. A 355-amino acid-long portion of the amino-terminal moiety is still able to generate a low GGT activity (Fig. 4). None of the truncated molecules were able to hydroxylate lysyl residues (not shown), confirming our previous finding that amino acids important for lysyl hydroxylase activity are located in the carboxyl-terminal portion of the molecule (27,28).
A DXD Motif Required for GGT Activity-Sequence alignments and x-ray crystal structures have revealed a so-called DXD motif in many glycosyltransferases, and this motif has been found in different glycosyltransferase families (29). The motif is thought to stabilize the Mn2+ and thus indirectly stabilize the binding of the diphosphate moiety of the UDP-sugar (29,30). Therefore we looked for this motif in GGT. There are at least three DXD-like motifs in the LH3 sequence (Fig. 1). Our data from the first screenings, a point mutation of DAD to AAD (Figs. 1 and 2), indicate that the DXD-like motif at position 392-394 of the human sequence is not important for GGT activity. Our data from the shortened molecules (see above) show that the amino-terminal portion of the molecule retains some of the GGT activity, suggesting that the DXD-like sequence in the carboxyl-terminal portion of the molecule (at position 489-491) is not essential for the activity. A short conserved motif containing many aspartate residues is located in the sequence at position 187-191 (Fig. 1), and mutation of aspartates in this sequence eliminated the GGT activity (Fig. 5). This suggests that this short region is important for catalytic activity and might be a Mn2+ binding site in GGT. The crystal structure of the catalytic domain of GGT is required to confirm this suggestion, however.

FIG. 4. ... (18,19). M2, M7, and M5, the molecules having a translational stop codon after amino acid 520, 401, and 388, respectively. The activity value of M2 in insect cells has been reported earlier (14).

FIG. 5. DXD-like motif in LH3 required for GGT activity. Human LH3 was expressed in an E. coli system. Aspartates in the sequence, amino acids 187-191 (18,19), were selectively mutated to alanine, and the GGT activity was measured. The expression level of recombinant protein was analyzed by His tag antibodies on Western blot as shown in the figure.
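The search for DXD-like motifs described in this section can be sketched as a simple pattern scan over a protein sequence. This is an illustrative sketch only: the function name and the toy sequence are hypothetical, and a sequence match by itself says nothing about function, which is why the candidates here were tested by mutagenesis. A lookahead is used so that overlapping matches in aspartate-rich stretches are all reported.

```python
import re

# Asp, any residue, Asp; the lookahead allows overlapping matches.
_DXD = re.compile(r"(?=(D[A-Z]D))")


def find_dxd_like(protein):
    """Return (1-based start position, motif) for every DXD-like tripeptide."""
    return [(m.start() + 1, m.group(1)) for m in _DXD.finditer(protein)]


# Hypothetical toy sequence, not a fragment of LH3.
candidates = find_dxd_like("MDADGKDDDDL")
print(candidates)  # [(2, 'DAD'), (7, 'DDD'), (8, 'DDD')]
```

Each reported position could then be used to build substitution labels for the aspartates, in the way the aspartates of the motif were mutated to alanine in this study.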
Could GGT Activity Be Restored in LH1, LH2a, and/or LH2b by Mutating Amino Acids to the LH3-C. elegans-specific Amino Acids?-Mutation analyses were used to determine whether LH1 and/or LH2 (LH2a and LH2b) would possess GGT activity if some of the LH3-C. elegans-specific amino acids were introduced into their structure. Six amino acid changes were analyzed: Ile to Cys in LH1 and Leu to Cys in LH2a/LH2b (corresponding to the change C144I or C144L between LH3 and LH1/LH2, see Fig. 1), and Ile to Leu (corresponding to the change L208I between LH3 and LH1/LH2a/LH2b, see Fig. 1) in LH1 and LH2. We also generated molecules where both changes were introduced in the same molecule. The mutated proteins were expressed in E. coli, and GGT activity was measured. However, the activity measurements revealed no GGT activity (not shown). These data suggest that LH1 and LH2 have diverged during evolution from the ancestral LH and LH3 (20) such that introducing the two most conserved amino acids into their structures is not sufficient to restore GGT activity. As observed in our data, there are evolutionary changes also between the LH3 and C. elegans sequences; the active sites are not identical in these species although GGT activity is present in both molecules.
Conclusions-We report in this paper amino acids important for collagen glucosyltransferase activity, the glycosyltransferase associated with unique hydroxylysine-linked glycosylation of collagens. The activity has recently been found to be associated with a single human gene product (14), which is also a lysyl hydroxylase isoform (18,19). In the human LH3 sequence, Cys 144, which is probably not disulfide-linked, and Leu 208 are important for GGT activity. These amino acids are also found in the C. elegans LH gene product, which possesses GGT activity. The corresponding amino acids in LH1 and LH2a/LH2b are isoleucine, and neither isoform has GGT activity. Mutation of the cysteine to isoleucine and the leucine to isoleucine in LH3 reduces GGT activity dramatically. These conserved amino acids are not responsible for the LH activity of LH3, suggesting that the LH and GGT active sites are separated on the LH3 molecule. The distinct locations of the active sites of LH and GGT on the LH3 molecule were also demonstrated by truncation of the LH3 molecule, which removed the LH active site. The truncated molecules still retained some GGT activity. Our data furthermore reveal a DXD-like sequence, a motif characteristic of many glycosyltransferases, required for GGT activity in the multifunctional LH molecule.
The results of this study are highly significant. We have identified amino acids important for GGT activity, which provide tools to manipulate this activity of the LH3 molecule. This knowledge is essential for the design of knock-out and transgenic studies to elucidate the functions of the glucosylation of galactosylhydroxylysyl residues in vivo. These experiments may also provide information about possible heritable disorders associated with abnormal glucosylation of galactosylhydroxylysyl residues in vivo.