Molecular Cloning, Expression, and Sequence Analysis of the Endoglycoceramidase II Gene from Rhodococcus Species Strain M-777*

Endoglycoceramidase (EGCase (EC 3.2.1.123)) is a hydrolase that cleaves the linkage between the oligosaccharide and ceramide of various glycosphingolipids. This paper describes the molecular cloning and expression of EGCase II, one of the isoforms of EGCase. The gene encoding EGCase II was obtained by screening a genomic DNA library from Rhodococcus sp. strain M-777, constructed in pUC19, with oligonucleotide probes deduced from a partial amino acid sequence of the enzyme protein. Recombinant Escherichia coli cells in which the EGCase II gene was expressed produced 14 units of the enzyme per liter of culture medium but did not produce sphingomyelinase. Recombinant EGCase II was a functioning enzyme with substrate specificity identical to that of the wild-type enzyme. Sequence analysis showed the presence of an open reading frame of 1470 base pairs encoding 490 amino acids. The N-terminal region of the deduced amino acid sequence had the general pattern of signal peptides of secreted prokaryotic proteins. Interestingly, the consensus sequence in the active site region of the endo-1,4-β-glucanase family A was found in the amino acid sequence of EGCase II.

Glycosphingolipids are amphipathic compounds consisting of oligosaccharides and ceramide moieties and are cell surface components of all vertebrates. Some glycosphingolipids are tumor-associated antigens and receptors of bacterial toxins and hormones and modulate cell growth and differentiation (1). Some of these biological functions of glycosphingolipids have been elucidated in experiments in which glycosphingolipids are added to cells. However, information on the roles of endogenous glycosphingolipids in biological processes is limited.
Endoglycoceramidase (EGCase), found first in a culture supernatant of Rhodococcus sp. strain G-74-2 (2), cleaves the linkage between oligosaccharides and ceramides of various glycosphingolipids. EGCases have also been found in bacterial cells (3), earthworms (4), leeches (5), rabbit mammary tissues (6), and clams (7). Three molecular species of the enzyme, EGCases I, II, and III, each with different specificity, have been isolated from the culture supernatant of Rhodococcus sp. strain M-750, a mutant of the wild strain G-74-2 (8). EGCase II hydrolyzed globo-type glycosphingolipids more slowly than did EGCase I. EGCase III specifically hydrolyzed the galactosylceramide linkage of gala-type glycosphingolipids, which were resistant to EGCase I and II. These enzymes are useful in structural studies of glycosphingolipids (7, 9-14).
Protein activators of EGCase activity in the absence of detergents have been purified from the culture supernatant of Rhodococcus sp. strain M-777, another mutant of strain G-74-2 (15). When an activator is used, sugar chains of cell surface glycosphingolipids of living cells are removed without cell viability being decreased (16, 17). It is thus possible to use EGCase II in conjunction with an activator to elucidate the biological functions of glycosphingolipids (18-20). However, many steps of chromatographic separation are needed to separate EGCase from contaminating enzymes, especially sphingomyelinases, before such use, and so preparation of purified EGCase II in large amounts is still difficult.
We report here the isolation of an EGCase gene and its expression in Escherichia coli. The protein sequence obtained from a clone of EGCase II included the consensus sequence of the active site of endo-1,4-β-glucanase.
DNA Techniques-Cells of Rhodococcus sp. strain M-777 were lysed with lysozyme and proteinase K, and genomic DNA was prepared as described elsewhere (21). Recombinant plasmid DNA from E. coli clones was isolated by the alkaline lysis method (22). Agarose gel electrophoresis, DNA restriction, and treatment with alkaline phosphatase were done by standard procedures (22). DNA was ligated with a DNA ligation kit as described by the manufacturer.
Amino Acid Sequencing-EGCase II from Rhodococcus sp. strain M-777 was purified as described before (8). About 30 μg of purified EGCase II was treated at 100°C for 5 min with vapor from a mixture of 4 μl of pyridine, 1 μl of 4-vinylpyridine, 1 μl of tributylphosphine, and 5 μl of water and dried in a desiccator under reduced pressure (23). The pyridylethylated EGCase II was digested with 8 pmol of lysylendopeptidase at 37°C in 50 μl of 4 M urea in 20 mM Tris-HCl, pH 9.0, for 16 h. The digest was put on a reverse-phase column (RPC C2/C18 SC 2.1/10, 2.1 × 100 mm; Pharmacia Biotech Inc.) and eluted with a linear gradient from 0 to 50% acetonitrile in 0.1% trifluoroacetic acid at a flow rate of 0.1 ml/min by a SMART system for micropreparative liquid chromatography (Pharmacia). The isolated peptides were numbered in
* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence reported in this paper has been submitted to the GenBank™/EBI Data Bank with accession number U39554.
Molecular Cloning-Genomic DNA (5 μg) was digested with MluI, and the digest was fractionated by 0.7% agarose gel electrophoresis by the standard method (22). DNA was transferred from agarose gels to nylon membranes (Hybond N+, Amersham Corp.) as described by the manufacturer. On the basis of the amino acid sequence from residues 1 to 15 of peptide L1 (see Table I) obtained by lysylendopeptidase digestion, and in the expectation that the residue at position −1 of peptide L1 must be lysine (lysylendopeptidase cleaves on the C-terminal side of lysine), a mixed oligonucleotide probe, AAGTC(G/C)GCCCCCGACGG(C/T)ATGCC(G/C)CAGTTCAC(G/C)GA(A/G)GCCGACCTCGC, was designed and designated probe 1. T4 polynucleotide kinase from the MEGALABEL kit was used to label the probe with [γ-32P]ATP. Hybridization was done in 5× SSC (1× SSC = 0.15 M NaCl and 0.05 M sodium citrate) containing 0.5% SDS, 5× Denhardt's solution (1× Denhardt's solution = 0.02% (w/v) each of Ficoll 400, bovine serum albumin, and polyvinylpyrrolidone-40), and 0.1 mg/ml denatured salmon sperm DNA at 65°C for 16 h. After hybridization, the membranes were washed for 30 min with 0.2× SSC containing 0.1% SDS at 65°C and used to expose an imaging plate, which was examined later on an imaging analyzer (BAS2000, Fuji Photo Film Co., Tokyo). Hybridization with probe 1 to Southern blots of the MluI digest showed that only the 4.4-kbp fragment contained the EGCase II gene. For cloning of this gene, a digest was prepared with 20 μg of genomic DNA. Restriction fragments of genomic DNA of Rhodococcus sp. strain M-777 were fractionated by preparative 0.7% agarose gel electrophoresis. Fragments (4.4 kbp long) were extracted from the gel by adsorption to glass beads from the EASYTRAP Version 2 kit. A phosphorylated MluI linker, pGACGCGTC, was inserted into the HincII site of pUC19, and the resulting plasmid was designated pUC19M. The MluI fragments from the fractionated DNA digest were ligated to the MluI site of pUC19M. The recombinant plasmids obtained were used to transform E. coli JM109, giving a gene library enriched with the EGCase II gene fragment that hybridized with probe 1. Colony hybridization was done by the standard procedure (22), and hybridized clones were detected with probe 1 under the same conditions as in Southern blotting. One clone was selected, and the plasmid in the clone was designated pEGCM36.
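The design of probe 1 can be checked with a short script. The sketch below is illustrative only and is not part of the original methods: it expands the mixed positions of the probe as printed above, counts the distinct oligonucleotides in the mixture, and translates each one. The function names and the reduced codon table (only the codons that occur in this probe, taken from the standard genetic code) are assumptions made for the example.

```python
from itertools import product

# Probe 1 as printed in the text (line-break hyphens removed).
PROBE_1 = "AAGTC(G/C)GCCCCCGACGG(C/T)ATGCC(G/C)CAGTTCAC(G/C)GA(A/G)GCCGACCTCGC"

# Subset of the standard genetic code: only codons that can occur in probe 1.
CODON_TABLE = {
    "AAG": "K", "TCG": "S", "TCC": "S", "GCC": "A", "CCC": "P", "CCG": "P",
    "GAC": "D", "GGC": "G", "GGT": "G", "ATG": "M", "CAG": "Q", "TTC": "F",
    "ACG": "T", "ACC": "T", "GAA": "E", "GAG": "E", "CTC": "L",
}


def parse_mixed_oligo(text):
    """Turn 'AC(G/C)T' into [['A'], ['C'], ['G', 'C'], ['T']]."""
    positions = []
    i = 0
    while i < len(text):
        if text[i] == "(":
            close = text.index(")", i)
            positions.append(text[i + 1:close].split("/"))
            i = close + 1
        else:
            positions.append([text[i]])
            i += 1
    return positions


def expand(positions):
    """List every oligonucleotide present in the mixed probe."""
    return ["".join(bases) for bases in product(*positions)]


def translate(dna):
    """Translate complete codons from the first base; ignore a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


positions = parse_mixed_oligo(PROBE_1)
variants = expand(positions)
peptides = {translate(v) for v in variants}

print(len(positions), "nucleotides per oligo")
print(len(variants), "oligos in the mixture")
print(peptides)
```

Under these assumptions every oligonucleotide in the mixture encodes the same peptide, beginning with the lysine expected at position −1; the final two bases (GC) form an incomplete codon and are not translated.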
A DNA probe was prepared by digestion of pEGCM36 with HincII. The probe, which was the 3′-end of the EGCase II gene in pEGCM36, was 210 base pairs long. This probe, designated probe 2, was labeled by random priming with [α-32P]dCTP using the BcaBEST labeling kit. In the same way as when pEGCM36 was prepared, a gene library enriched with the 3′-end of the EGCase II gene was constructed from genomic DNA digested with BamHI, with probe 2 used for hybridization and detection. BamHI fragments (2.7 kbp long) were ligated to the BamHI site of pUC19 and used to transform E. coli JM109, giving the enriched gene library. The library was screened by colony hybridization with probe 2.
DNA Sequencing and Sequence Analysis-The nucleotides were sequenced by the dideoxy chain termination method with BcaBest DNA polymerase and a DNA sequencer (Applied Biosystems, model 373A). Computer analysis including comparison of DNA sequences was done with DNASIS and GENEBRIGHT software (Hitachi Software Engineering, Tokyo). Frame analysis was done as described elsewhere (24).
Construction of Expression Plasmid with EGCase II Gene-The vector pTV118N was treated with SalI, Klenow fragment, and SphI. An insert that included the 5′-end of the EGCase II gene was prepared by digestion of clone pEGCM36 with AccIII, blunting, digestion with MluI, and gel purification. An insert that included the 3′-end of the EGCase II gene was prepared by digestion of clone pEGCB20 with MluI and SphI and then gel purification. The inserts and vector were ligated and used to transform E. coli JM109. The recombinant plasmid was purified and designated pTEG3.
Expression and Purification of Recombinant EGCase II-E. coli JM109 cells transformed with pTEG3 were grown at 37°C in Luria-Bertani medium containing 100 μg/ml ampicillin until the optical density (absorbance at 600 nm) reached about 0.5. Isopropylthio-β-D-galactopyranoside was then added to a final concentration of 1 mM to induce transcription, and culture was continued at 37°C for a further 4 h. Cells were harvested by centrifugation, suspended in extraction buffer (10 mM Tris-HCl, pH 8.0, and 0.5 mM 4-(2-aminoethyl)-benzenesulfonyl fluoride hydrochloride), and sonicated. Crude extracts were further purified by ion-exchange chromatography on Q-Sepharose.
Enzyme Assay-EGCase activity was assayed with purified asialo-GM1 as the substrate in the presence of Triton X-100 as described before (8). One unit of enzyme was defined as the amount needed to catalyze the hydrolysis of 1 μmol of substrate/min. The substrate specificity was examined as reported elsewhere (8).

RESULTS AND DISCUSSION
Molecular Cloning of EGCase II Gene-The N-terminal amino acid of the EGCase II purified from Rhodococcus sp. strain M-777 was blocked, so we could not identify the N-terminal sequence. Seven peptides, L1 to L7, were purified from a lysylendopeptidase digest and sequenced (Table I). Southern blotting with probe 1 showed a 4.4-kbp band in the hybridization pattern of the genomic DNA digested with MluI. A clone, pEGCM36, containing a 4.4-kbp insert was isolated from the gene library enriched with the EGCase II gene fragment that hybridized with probe 1. Its nucleotides were sequenced, and it was found to contain a putative open reading frame. The amino acid sequences L1, L2, L3, L4, L6, and L7 were found in the deduced amino acid sequence of the open reading frame, but sequence L5 and a stop codon were not found, so we screened the genomic library for the missing sequences. Probe 2 was prepared from pEGCM36 for use in this screening. Southern blotting of the BamHI digest with probe 2 gave a 2.7-kbp band. A clone, pEGCB20, containing a 2.7-kbp insert was isolated from the gene library enriched with the 3′-end of the EGCase II gene and sequenced. The deduced amino acid sequence contained sequence L5 and a stop codon in the same frame. A partial sequence of the 3′-end of the EGCase II gene in pEGCM36 was also found in pEGCB20; that is, pEGCM36 and pEGCB20 overlapped (Fig. 1).
DNA and Amino Acid Sequence Analysis-We cloned and sequenced 2012 nucleotides of the two contiguous clones, pEGCM36 and pEGCB20, including the coding regions of the EGCase II gene and found an open reading frame at positions 1-1860. The initiation codon was not identified because the N-terminal amino acid of native EGCase II was blocked. To tentatively identify the initiation codon and open reading frame, we analyzed the nucleotide sequence by frame analysis plotting, which shows codon position-specific differences in the GC content. In organisms with a high GC content, bases of coding regions have a high GC content at the third codon position, a low content at the second position, and an intermediate content at the first position (24). Frame analysis plotting of the 2012 nucleotides that included the EGCase II gene suggested that there was a coding region between the nucleotide positions of approximately 350 and 1850 (Fig. 2). A hydrophobic motif was found in the deduced amino acid sequence (Fig. 3). This sequence motif, a putative secretion signal peptide (25), had a positively charged N terminus followed by a hydrophobic core and a string of polar residues. The finding of a signal peptide sequence was in agreement with EGCase II being secreted into the culture medium. The signal sequence was coded at nucleotides 391-480, starting with GTG. A possible Shine-Dalgarno ribosome binding sequence started 4 bases upstream from the GTG. These results are consistent with nucleotides 391-393 (GTG) being the initiation codon. The DNA sequence and deduced amino acid sequence of the open reading frame of EGCase II are shown in Fig. 4. The open reading frame was 1470 base pairs long with 490 codons. All of the peptide sequences shown in Table I were in the deduced amino acid sequence.
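The idea behind frame analysis can be illustrated with a few lines of code. The following is a minimal sketch of the principle only, not the plotting procedure of reference 24: it measures GC content at each codon position for a given reading frame and picks the frame in which the third position is most GC-rich relative to the second. The toy sequence and the function names are hypothetical; the coordinate checks at the end use only the positions and lengths reported in the text.

```python
def gc_by_codon_position(seq, frame=0):
    """Return GC fractions at codon positions 1, 2 and 3 for a reading frame (0, 1 or 2)."""
    seq = seq.upper()
    gc = [0, 0, 0]
    total = [0, 0, 0]
    for i in range(frame, len(seq)):
        pos = (i - frame) % 3
        total[pos] += 1
        if seq[i] in "GC":
            gc[pos] += 1
    return [g / t if t else 0.0 for g, t in zip(gc, total)]


def most_likely_frame(seq):
    """Pick the frame with the largest (GC at position 3) - (GC at position 2)."""
    def score(frame):
        _, gc2, gc3 = gc_by_codon_position(seq, frame)
        return gc3 - gc2
    return max(range(3), key=score)


# Hypothetical toy sequence (NOT the EGCase II gene): codons chosen so that
# position 3 is GC-rich, position 2 is AT-rich, and position 1 is intermediate.
TOY = "GACATCGTGCAG" * 10

print(gc_by_codon_position(TOY, frame=0))
print(most_likely_frame(TOY))

# Coordinate arithmetic using the values reported in the text.
ORF_START = 391          # first base of the GTG initiation codon
ORF_LENGTH = 1470        # base pairs
SIGNAL_END = 480         # last base of the putative signal sequence

orf_end = ORF_START + ORF_LENGTH - 1
orf_codons = ORF_LENGTH // 3
signal_codons = (SIGNAL_END - ORF_START + 1) // 3

print(orf_end, orf_codons, signal_codons)
```

With the reported values, an open reading frame starting at nucleotide 391 and 1470 base pairs long ends at nucleotide 1860 and spans 490 codons, and the putative signal sequence (nucleotides 391-480) corresponds to 30 codons. A real frame plot would apply the same per-position calculation in a sliding window along the 2012-nucleotide sequence.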
Legend to Fig. 4: The deduced amino acid sequence of the EGCase II is shown as 1-letter symbols below the nucleotide sequence, and amino acid residues are numbered beginning with the first methionine. Amino acids known from peptide sequencing are underlined; the possible Shine-Dalgarno sequence is double underlined; the translation termination codon is denoted by an asterisk. Numbers to the right and left of the sequence correspond to nucleotides and amino acids, respectively.
Expression of EGCase II-The expression plasmid pTEG3 was constructed by insertion of a fragment of the coding sequence at nucleotide positions 482-1997, without the putative secretion signal sequence, into plasmid pTV118N between the SalI and SphI sites and in frame with the initiation codon of the plasmid. In pTEG3, transcription of recombinant genes is controlled by the promoter plac and can be induced by isopropylthio-β-D-galactopyranoside. E. coli JM109 cells transformed with pTEG3 were cultured in a medium containing 1 mM isopropylthio-β-D-galactopyranoside and separated from the medium by centrifugation. The EGCase II activity of the cell lysate was assayed with asialo-GM1 as the substrate. Recombinant E. coli cells produced 14 units of EGCase II activity per liter of culture medium, whereas Rhodococcus sp. strain M-777 produced 3 units of a mixed enzyme activity of EGCase II and EGCase I (8). Extracts from the negative control strain containing plasmid pTV118N without the EGCase II gene had no EGCase II activity, so the enzyme activity found was due entirely to expression of the cloned EGCase II gene. The specific activity of purified recombinant EGCase II was 3.2 units/mg, the same as that of native EGCase II with asialo-GM1 as the substrate. The substrate specificities of recombinant and native EGCase II were examined with various glycosphingolipids. Recombinant EGCase II hydrolyzed various glycosphingolipids at the same rates as native EGCase II under conditions I and II in Table II.
No protease, exoglycosidase, or sphingomyelinase activity was found in the EGCase II purified from E. coli. No sphingomyelinase activity was detected in the lysate of recombinant E. coli cells. The protein activator of EGCase II activity, activator II (15), was not detected in the lysate of recombinant E. coli cells with polyclonal antibodies against activator II purified from Rhodococcus sp. strain M-777.
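The activity figures above can be related to each other with simple arithmetic. The sketch below is a rough illustration, not a result reported by the authors: it takes one unit as 1 μmol of substrate hydrolyzed per minute and assumes, hypothetically, that all activity in the lysate comes from enzyme with the specific activity of the purified protein (3.2 units/mg). The function names and the example assay numbers are made up for the illustration.

```python
def units_from_rate(umol_hydrolyzed, minutes):
    """Enzyme units, taking 1 unit = 1 umol of substrate hydrolyzed per minute."""
    return umol_hydrolyzed / minutes


def active_enzyme_mg(total_units, specific_activity_units_per_mg):
    """Mass of active enzyme implied by a total activity and a specific activity."""
    return total_units / specific_activity_units_per_mg


# Values reported in the text.
RECOMBINANT_UNITS_PER_LITER = 14.0
SPECIFIC_ACTIVITY = 3.2  # units/mg, purified recombinant EGCase II

# Assumption: lysate activity is entirely EGCase II at the purified specific activity.
mg_per_liter = active_enzyme_mg(RECOMBINANT_UNITS_PER_LITER, SPECIFIC_ACTIVITY)
print(f"about {mg_per_liter:.1f} mg of active EGCase II per liter of culture")

# Hypothetical assay: 2 umol of substrate hydrolyzed in 4 min corresponds to 0.5 unit.
print(units_from_rate(2.0, 4.0))
```

Under that assumption, 14 units per liter corresponds to roughly 4 mg of active enzyme per liter of culture. The figure is an estimate only, since activity in a crude lysate need not match that of the purified protein.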
The molecular weight of EGCase II isolated from Rhodococcus sp. strain M-777 was 58,900 by SDS-polyacrylamide gel electrophoresis. The molecular weight of the recombinant EGCase II was 57,500 by SDS-polyacrylamide gel electrophoresis, although the recombinant enzyme contained 18 additional amino acids from pTV118N. Native EGCase II might be cleaved within the signal sequence upstream from the putative cleavage sites, or its N terminus might be blocked with a bulky residue.
With various glycosphingolipids, we found that activator II increased the activities of both recombinant and native EGCase II. However, only native EGCase II hydrolyzed cell surface glycosphingolipids in the presence of activator II (15). That is, recombinant EGCase II did not hydrolyze the cell surface glycosphingolipids under the conditions we used (data not shown); the reason is not known. The N-terminal amino acid of native EGCase II was blocked (15). We speculate that the N-terminal structure of native EGCase II could be needed for the hydrolysis of cell surface glycosphingolipids.
Neither the N-terminal sequence of EGCase I nor the sequence of activator II, both of which have been determined for proteins isolated from a Rhodococcus sp. (15), was found in the deduced amino acid sequence of EGCase II. These results suggest that the EGCase II gene is independent of the EGCase I and activator II genes.