Cloning and characterization of a novel human lysyl hydroxylase isoform highly expressed in pancreas and muscle.

We report the isolation and characterization of cDNA clones for a novel isoform of lysyl hydroxylase (lysyl hydroxylase 2), a posttranslational enzyme of collagen biosynthesis. The open reading frame predicted a protein of 737 amino acids, including an amino-terminal signal peptide. The amino acid sequence has an overall similarity of over 75% to the lysyl hydroxylase (lysyl hydroxylase 1) characterized earlier. This similarity is even higher in the carboxyl-terminal end of the molecules. Lysyl hydroxylase 2 contains nine cysteine residues, which are conserved in lysyl hydroxylase 1. Furthermore, the conserved histidine and aspartate residues required for lysyl hydroxylase activity are present in the sequence. Northern analysis identified a transcript of 4.2 kilobases, which was highly expressed in pancreas and muscle tissues. Expression of the cDNA in insect cells using a baculovirus vector yielded proteins with lysyl hydroxylase activity, and an antiserum against a synthetic peptide of the deduced amino acid sequence recognized proteins with molecular weights of 88 and 97 kDa in homogenates of the transfected cells.

Collagens are the most abundant proteins in the human body, and they are found in essentially all tissues. To date, 19 different collagen types have been identified (1-4). The biosynthesis of collagens is characterized by several posttranslational modifications, one of which is hydroxylation of lysyl residues. Hydroxylysine occurs in the Y position of the repeating X-Y-Gly triplets within the helical region of the collagen molecule. Hydroxylysine also occurs in the nonhelical telopeptide regions of some collagen molecules, where glycine is replaced by either serine or alanine (1,2,5). The amount of hydroxylysyl residues varies considerably between the different collagen types. Additional variation is found within the same collagen type in different tissues and even within the same tissue in different physiological and pathological states (1,2,5). Hydroxylysyl residues have an important role in the structure and stability of collagens. The hydroxy groups participate in the formation of intermolecular cross-links and serve as sites of attachment for carbohydrate units that are unique to collagens (1-4,6).
Lysyl hydroxylase (EC 1.14.11.4) catalyzes the hydroxylation of lysyl residues in collagens (1). The enzyme requires Fe²⁺, 2-oxoglutarate, O₂, and ascorbate in the reaction. The active enzyme is a homodimer consisting of subunits with a molecular weight of about 85,000 (1) and is constitutively expressed in a variety of tissues (7,8). The complete cDNA-derived amino acid sequence has been reported for the enzyme from chick (9), human (10,11), and rat (12). The enzyme has also been characterized at the genomic level (7), and the first mutations in the lysyl hydroxylase gene have been characterized in patients with type VI of the Ehlers-Danlos syndrome (EDSVI) (13-17).
We report here the cloning and characterization of a novel human lysyl hydroxylase (lysyl hydroxylase 2), which is highly expressed in pancreas and muscle tissues. The similarity in the amino acid sequence is about 75% when compared with the human and chicken lysyl hydroxylase characterized earlier (lysyl hydroxylase 1).
* This work was supported by grants from Sigrid Juselius Foundation and the Research Councils for Natural Sciences and Medicine in the Academy of Finland. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence reported in this paper has been submitted to the GenBank™/EBI Data Bank with accession number U84573.
5′/3′ RACE-The 3′ end and the 5′ end of the novel cDNA were obtained using the Marathon cDNA amplification kit (Clontech) and the 5′/3′ RACE kit (Boehringer Mannheim).
To clone the 3′ end of the noncoding region of the novel cDNA, double-stranded cDNA was prepared using poly(A) RNA from human kidney and an oligo(dT) primer (Marathon kit) that contained two degenerate nucleotides following the poly(T) sequence. The Marathon cDNA adaptor was ligated to the blunt-ended double-stranded cDNA, and amplification was carried out by PCR using an adaptor-specific primer and a gene-specific primer obtained from the sequence of clone 44.
To clone the 5′ end of the novel cDNA, double-stranded cDNA was prepared using a sequence-specific primer. The Marathon cDNA adaptor was ligated to the blunt-ended double-stranded cDNA, and amplification was carried out using an adaptor-specific primer paired with a primer obtained from the sequence of clone 44. Using the 5′/3′ RACE kit, a single-stranded, tailed cDNA was converted to a double-stranded molecule by PCR using an oligo(dT) anchor primer paired with a sequence-specific oligonucleotide. Amplification was carried out using a PCR anchor primer paired with a sequence-specific primer.
Sequencing-DNA sequencing was performed by standard dideoxynucleotide sequencing using the T7 sequencing kit (Pharmacia Biotech Inc.), the ΔTaq cycle sequencing kit (U. S. Biochemical Corp.), and the PRISM™ AmpliTaq FS dye terminator cycle sequencing kit (Perkin-Elmer). Part of the sequencing was carried out manually and part using an ABI PRISM 377 DNA sequencer (Perkin-Elmer). All clones were sequenced, despite the overlap in the sequence of the clones. The sequence of the ends of the molecule was confirmed by sequencing DNA fragments obtained with different oligonucleotide pairs in 5′/3′ RACE.
Northern Blot Analysis-A human Multiple Northern blot containing poly(A) RNA from different tissues (Clontech) was hybridized for 16 h in 5× SSPE, 10× Denhardt's, 50% formamide, 2% SDS, 100 mg/ml denatured herring DNA at 42°C using radioactively labeled clone 44 as a probe. The blot was washed in 2× SSC, 0.05% SDS at room temperature for 40 min and then at 50°C for 40 min.
Expression of cDNA in Insect Cells-Expression of the novel cDNA was carried out using a baculovirus transfer vector (18) in the BAC-TO-BAC™ Expression system (Life Technologies, Inc.). PCR was used to generate two different constructs using human kidney cDNA as a template. Construct 1 contained nucleotides 1 to 2272 and construct 2 contained nucleotides 27 to 2272. The PCR products were confirmed by sequencing. Insect cells were harvested 48 or 72 h after infection according to the protocol described for human lysyl hydroxylase (19).
Other Assays-Western blot analysis was carried out using a polyclonal antibody, produced in chicken, against the synthetic peptide NPRTLKILIEQNRKI (amino acids 399-413 of lysyl hydroxylase 2). The homogenates of baculovirus-infected cells were fractionated under reducing conditions by SDS/10% polyacrylamide gel electrophoresis, blotted onto an Immobilon membrane (Millipore), and incubated with the antibodies against the synthetic peptide. Bound antibodies were visualized using the ECL detection system (Amersham Life Science, Inc.) and x-ray film (Eastman Kodak Co.). Lysyl hydroxylase activity was assayed by a method based on the hydroxylation-coupled decarboxylation of 2-oxo-[1-¹⁴C]glutarate (20); the synthetic peptide ARGIKGIRGFSG or IKGIKGIKG was used as substrate.
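Both substrate peptides are built from X-Y-Gly triplets with lysine in the Y position, the context in which hydroxylysine occurs in the collagen helix. As an illustration only (it is not part of the assay described here), the following Python sketch lists the Y-position lysines of a peptide. The function name is hypothetical, and the sketch assumes that the triplet frame starts at the first residue of the peptide.

```python
def y_position_lysines(peptide: str) -> list[int]:
    """Return 1-based positions of lysines in the Y position of X-Y-Gly triplets.

    Assumption: the triplet frame starts at the first residue of the peptide.
    """
    positions = []
    # Walk the peptide in steps of three residues: X, Y, Gly.
    for start in range(0, len(peptide) - 2, 3):
        _x, y, third = peptide[start:start + 3]
        if y == "K" and third == "G":
            positions.append(start + 2)  # 1-based index of the Y residue
    return positions


for substrate in ("ARGIKGIRGFSG", "IKGIKGIKG"):
    print(substrate, y_position_lysines(substrate))
```

Under this reading, ARGIKGIRGFSG offers one Y-position lysine (residue 5) and IKGIKGIKG offers three (residues 2, 5, and 8).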

RESULTS AND DISCUSSION
Isolation and Characterization of cDNA Clones-The first cDNA clone (PS1-PS2) was generated by amplification of oligo(dT)-primed human kidney cDNA using oligonucleotides PS1 and PS2 as primers (Fig. 1). The amplification yielded a fragment of about 300 nucleotides in length. Human kidney and pancreas cDNA libraries were then screened using the PS1-PS2 clone as a probe. One positive recombinant was identified in the pancreas library (clone 44) and one in the kidney library (clone 6) (Fig. 1). Clone 44 was used as a probe to obtain four additional overlapping clones (Fig. 1; clones 5, 25, 48, and 86). The 5′ and 3′ ends of the novel cDNA were obtained using 5′ and 3′ RACE.
Nucleotide and Derived Amino Acid Sequences of the cDNA-The cDNA clones (Fig. 1) cover an open reading frame that starts with a methionine codon and encodes a polypeptide of 737 amino acids, together with 1290 nucleotides of 3′-untranslated sequence. The sequence contains two internal EcoRI restriction sites, and the 3′-untranslated sequence contains two potential polyadenylation signals. A putative signal peptide is present at the amino-terminal end of the protein. The predicted molecular weight of the encoded polypeptide is 84,659 when calculated including the signal peptide (Fig. 1).
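The text does not state how the predicted molecular weight was calculated. A common approach, sketched here in Python as an assumption rather than as the authors' method, sums the average residue masses of the amino acids and adds one water molecule for the free termini. The function name is hypothetical, the residue masses are approximate standard values, and the 15-residue synthetic peptide from the Western blot analysis serves only as a short example input.

```python
# Approximate average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water for the free amino and carboxyl termini


def predicted_mass(sequence: str) -> float:
    """Return the approximate average mass (Da) of an unmodified polypeptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


# Example input: the synthetic peptide used to raise the antiserum.
print(round(predicted_mass("NPRTLKILIEQNRKI"), 1))
```

For scale, the reported value of 84,659 for 737 residues corresponds to an average of about 115 Da per residue. A calculation of this kind ignores signal peptide cleavage and glycosylation.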
The nucleotides of the coding region of the novel cDNA sequence are 63 and 59% identical to the chicken (9) and human (10) lysyl hydroxylase sequences, respectively. There are no long regional stretches of identical nucleotide sequence in the molecules; the identical nucleotides are distributed evenly along the molecule. The 3′-untranslated region of the molecule is clearly different from that in the previously characterized human lysyl hydroxylase 1.
Comparison of the Amino Acid Sequence with Lysyl Hydroxylase-The amino acid sequence of the novel polypeptide is compared with the chicken (9) and human (10) lysyl hydroxylase sequences in Fig. 1. The translated product of the novel cDNA is 11 amino acids longer than the translation product of human lysyl hydroxylase. A high similarity of the amino acid sequence (overall similarity over 75%) was observed between the novel polypeptide and human and chicken lysyl hydroxylases. The COOH-terminal region is especially similar to lysyl hydroxylase. Furthermore, a 62-amino acid sequence in the central region of the molecule shows a high similarity (over 90% identity covering amino acids 414-475 of the novel molecule). Examples of particularly variable regions are a 13-amino acid sequence that is totally dissimilar (amino acids 283-295) and an 18-amino acid sequence (amino acids 340-357) with an 11% identity to lysyl hydroxylase. The 50 amino acids at the amino terminus of the molecule have a 56% identity to human lysyl hydroxylase. The nine cysteine residues conserved in lysyl hydroxylase amino acid sequences are also conserved in the novel polypeptide. The amino acid sequence contains seven potential attachment sites for asparagine-linked oligosaccharides, two of which occupy the same positions as in the sequence of human lysyl hydroxylase.
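Regional identities such as those quoted above are computed over a window of a pairwise alignment. The following Python sketch shows one way to do this, assuming two pre-aligned sequences of equal length with "-" as the gap character and a window given in residue numbers of the first sequence. The function name and the toy alignment are hypothetical and are not taken from Fig. 1; the sketch counts identical residues only and does not score similarity.

```python
def percent_identity(aln_a: str, aln_b: str, first: int, last: int) -> float:
    """Percent identity over residues first..last (1-based) of sequence A.

    aln_a and aln_b are rows of a pairwise alignment ("-" marks a gap).
    Columns where A has a gap are skipped; a gap in B counts as a mismatch.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    residue_no = 0  # running residue number in sequence A
    compared = identical = 0
    for a, b in zip(aln_a, aln_b):
        if a == "-":
            continue
        residue_no += 1
        if first <= residue_no <= last:
            compared += 1
            identical += a == b
    if compared == 0:
        raise ValueError("window contains no residues")
    return 100.0 * identical / compared


# Toy alignment (hypothetical sequences, for illustration only).
toy_a = "MKT-LLVA"
toy_b = "MRTGLLIA"
print(round(percent_identity(toy_a, toy_b, 1, 7), 1))  # 5 of 7 residues identical
print(percent_identity(toy_a, toy_b, 4, 5))            # both residues identical
```

By the same arithmetic, an identity of 11% over an 18-residue window would correspond to 2 identical positions (2/18 ≈ 11.1%).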
The carboxyl-terminal end of lysyl hydroxylase 1 is especially well conserved between chicken and human (10), with an identity of over 90% for the last 139 amino acid residues of the molecule. This finding suggests that this region of the molecule is important for the function of the enzyme. A search for conserved residues within the sequences of 2-oxoglutarate dioxygenases and a related dioxygenase, isopenicillin N synthase, suggested that two histidine-containing motifs in the carboxyl terminus, residues 656 and 708 in the human lysyl hydroxylase sequence, may function as Fe²⁺-binding ligands in the catalytic mechanism common to all 2-oxoglutarate dioxygenases (21,22). Site-directed mutagenesis of prolyl 4-hydroxylase (23) and lysyl hydroxylase (19), enzymes belonging to the group of 2-oxoglutarate-dependent dioxygenases, demonstrated the importance of these histidine residues for the catalytic activity of the enzymes. Mutation analysis revealed three other functionally important histidines in the carboxyl-terminal portion of the lysyl hydroxylase polypeptide (19). It is remarkable that all these histidine residues are also conserved in the novel molecule (amino acids His-666, His-667, His-710, His-716, and His-718). Studies on human lysyl hydroxylase furthermore indicated that mutation of aspartate residues in the carboxyl-terminal region of the molecule causes dramatic inactivation of the enzyme (19); two of these residues (amino acids Asp-668 and Asp-684) are also found in the carboxyl-terminal region of the novel polypeptide. Asparagine-linked glycosylation has been shown to be a requirement for maximal lysyl hydroxylase activity (19,24). One can speculate that the requirement also holds true for the novel protein, because (i) the sequence contains many putative glycosylation sites and (ii) two putative glycosylation sites are highly conserved; at least one of these is glycosylated in lysyl hydroxylase expressed in insect cells, and both asparagines (corresponding to Asn-209 and Asn-696 of the novel sequence) are required for lysyl hydroxylase activity (19).
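Potential attachment sites for asparagine-linked oligosaccharides are conventionally predicted from the consensus sequon Asn-X-Ser/Thr, where X is any residue except proline. The text does not state how the sites in the novel sequence were identified, so the Python sketch below is an assumption about the usual approach. The function name and the test sequence are hypothetical.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead keeps matches zero-width after N so overlapping sequons are found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_glycosylation_sites(sequence: str) -> list[int]:
    """Return 1-based positions of asparagines in N-X-S/T sequons (X != Pro)."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Toy sequence (hypothetical): N-G-S and N-K-T match, N-P-T does not.
print(n_glycosylation_sites("MNGSANPTNKTA"))
```

A scan of this kind identifies potential sites only; whether a given asparagine is actually glycosylated must be determined experimentally.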
A Novel Polypeptide with Lysyl Hydroxylase Activity-To determine whether the novel polypeptide has lysyl hydroxylase activity, a recombinant baculovirus carrying the novel cDNA was generated and expressed in insect cells (Table I). Two constructs were prepared for the experiments, one containing the whole coding region, including the signal peptide (construct 1), and the other containing the coding region but lacking the first nine amino acids of the signal peptide (construct 2). Recombinant proteins were harvested by homogenization of the cells in a buffer containing Nonidet P-40, and the insoluble pellet was further homogenized in a buffer containing glycerol (19). Activity measurements of the supernatants indicated that the majority of the lysyl hydroxylase activity was found in the Nonidet P-40 fraction. Both peptide substrates were hydroxylated by the recombinant proteins.
Table I, footnote b: One plate contained about 10 × 10⁶ cells. All values are corrected further for minor differences in cell numbers of the plates.
Immunological Studies-An antiserum was prepared against a synthetic peptide derived from the cDNA sequence of lysyl hydroxylase 2. The recombinant proteins produced by the constructs were analyzed by SDS-polyacrylamide gel electrophoresis and then immunostained using the peptide antibodies for detection. The peptide antibodies recognized protein bands with molecular weights of 88,000 and 97,000 in cells transfected with the constructs for lysyl hydroxylase 2 (not shown). The lower molecular weight band corresponds exactly to the size of an immunostained band found in kidney tissue and kidney cells (not shown). It is not known, however, whether both the 88,000 and the 97,000 molecular weight proteins have lysyl hydroxylase activity. As indicated above, lysyl hydroxylase 2 contains many potential sites for glycosylation, and it is possible that the two forms on the immunoblot are due to variation in the glycosylation of proteins in insect cells.
Expression of the Novel Lysyl Hydroxylase in Various Tissues-Expression of the novel gene in various tissues was determined by Northern hybridization. A single band of 4.2 kilobases was obtained in the hybridization from heart, brain, placenta, lung, liver, skeletal muscle, kidney, and pancreas. The data indicate that the expression of the gene is highly regulated. A prominent hybridization signal was present in muscle tissue such as heart and placenta, as well as pancreas, whereas the signal was very faint in the lung (Fig. 2).
Conclusions-A novel isoform of lysyl hydroxylase (lysyl hydroxylase 2) has been described in this study. The molecule has a high identity to the lysyl hydroxylase characterized earlier (lysyl hydroxylase 1, Refs. 9-12), but the novel molecule is clearly a product of a different gene. This finding confirms the hypothesis that isoforms of lysyl hydroxylase exist, which was based on observations that hydroxylation of telopeptidyl lysyl residues may be under enzymic control distinct from that active toward lysyl residues within the helical regions of type I collagen (5,25). The discovery of isoforms of lysyl hydroxylase is highly significant, because it may explain the great variation of lysine hydroxylation in collagen of different tissues in patients with EDSVI (26,27), a disease resulting from a gene defect in lysyl hydroxylase 1. The isoforms are also in agreement with the finding that lysines in collagen types II, IV, and V in EDSVI patients were hydroxylated normally (27). Furthermore, the finding that the residual activity of EDSVI cells is preferentially directed toward type IV collagen (28) suggests the presence of isoforms of lysyl hydroxylase. The presence of isoforms may also explain our previous observation that cells from an EDSVI patient producing a truncated form of lysyl hydroxylase 1, which lacks the highly conserved carboxyl-terminal portion of the molecule, nevertheless have detectable lysyl hydroxylase activity (13).