Identification and characterization of a novel human aldose reductase-like gene.

We have identified a novel human protein that is highly homologous to aldose reductase (AR). This protein, which we called ARL-1, consists of 316 amino acids, the same size as AR, and its amino acid sequence is 71% identical to that of AR. It is more closely related to the AR-like proteins such as mouse vas deferens protein, fibroblast growth factor-regulated protein, and Chinese hamster ovary reductase, with 81, 82, and 83%, respectively, of its amino acid sequence identical to the amino acid sequence of these proteins. The cDNA of ARL-1 was expressed in Escherichia coli to obtain recombinant protein for characterization of its enzymatic activities. For comparison, the cDNA of human AR was also expressed in E. coli and analyzed in parallel. These two enzymes differ in their pH optima and salt requirement, but they act on a similar spectrum of substrates. Similar to AR, ARL-1 can efficiently reduce aliphatic and aromatic aldehydes, and it is less active on hexoses. While AR mRNA is found in most tissues studied, ARL-1 is primarily expressed in the small intestines and in the colon, with a low level of its mRNA in the liver. The ability of ARL-1 to reduce various aldehydes and the locations of expression of this gene suggest that it may be responsible for detoxification of reactive aldehydes in the digested food before the nutrients are passed on to other organs. Interestingly, ARL-1 and AR are overexpressed in some liver cancers, but it is not clear if they contribute to the pathogenesis of this disease.

Besides kidney and testes, AR is present in many other tissues where its function is not clear. Interestingly, glucose is not its preferred substrate (Km = 100 mM, kcat/Km = 2.8 × 10² M⁻¹ min⁻¹) (5). It is more efficient in reducing various aromatic and aliphatic aldehydes, including glyceraldehyde, benzaldehyde, and pyridine aldehyde (6). In particular, AR's ability to reduce methylglyoxal (Km = 7.8 μM, kcat/Km = 1.8 × 10⁷ M⁻¹ min⁻¹) (7), a toxic byproduct of glucose metabolism, and 4-hydroxynonenal (Km = 22 μM, kcat/Km = 4.6 × 10⁶ M⁻¹ min⁻¹) (8), a toxic lipid aldehyde produced during oxidative stress, suggests that it may be responsible for detoxifying these and other harmful aldehydes generated by cellular metabolism.
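To make the comparison between substrates concrete, the turnover number kcat can be recovered from each quoted pair of constants, since kcat = Km × (kcat/Km). The short Python sketch below does this arithmetic. It assumes that the Km values for methylglyoxal and 4-hydroxynonenal are in micromolar units; the dictionary and function names are illustrative only.

```python
# Kinetic constants for AR as quoted in the text.
# Km is stored in molar units, kcat/Km in M^-1 min^-1.
# Assumption: methylglyoxal and 4-hydroxynonenal Km values are micromolar.
SUBSTRATES = {
    "glucose": {"km_molar": 100e-3, "efficiency": 2.8e2},
    "methylglyoxal": {"km_molar": 7.8e-6, "efficiency": 1.8e7},
    "4-hydroxynonenal": {"km_molar": 22e-6, "efficiency": 4.6e6},
}


def kcat_per_min(km_molar, efficiency):
    """Return kcat (min^-1) from Km (M) and kcat/Km (M^-1 min^-1)."""
    return km_molar * efficiency


def relative_efficiency(name, reference="glucose"):
    """Return the ratio of kcat/Km for `name` to that of `reference`."""
    return SUBSTRATES[name]["efficiency"] / SUBSTRATES[reference]["efficiency"]


if __name__ == "__main__":
    for name, constants in SUBSTRATES.items():
        kcat = kcat_per_min(constants["km_molar"], constants["efficiency"])
        ratio = relative_efficiency(name)
        print(f"{name}: kcat = {kcat:.1f} per min, "
              f"kcat/Km relative to glucose = {ratio:.0f}")
```

Under these assumptions the turnover numbers are of the same order for all three substrates, so the large differences in catalytic efficiency come mainly from the differences in Km.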
A previous report showed that AR is induced in rat hepatoma (9), and suggested that it may be essential for detoxifying harmful metabolites produced by fast growing cancer cells. In another report, partial sequence determination of a protein induced in rat hepatoma called Spot 17, showed that it is highly homologous to the rat AR (10). We were interested in seeing if the AR and AR-like genes are also overexpressed in human liver cancers and contribute to the pathogenesis of this disease. We found that about 29% of the liver cancers overexpressed AR, and about 54% of them overexpressed a novel aldose reductase-like gene. We cloned the cDNA of this gene, and sequence analysis showed that the amino acid sequence of this protein we called ARL-1 is 71% identical to that of AR. As a first step toward understanding the function of ARL-1, we determined its sites of expression and also expressed its cDNA in Escherichia coli to obtain protein for the analysis of its enzymatic activities. Since this enzyme is highly homologous to the much better characterized AR, in this study we compared its enzymatic characteristics and its substrate specificity with that of AR.

MATERIALS AND METHODS
Human Liver Cancer Samples-Liver cancers were diagnosed by computerized tomography or ultrasonography and confirmed postoperatively by histological examination. After surgical operation, part of the liver cancer and some surrounding nontumorous tissues were collected for RNA extraction. These samples were quickly frozen in liquid nitrogen and stored at −80°C until RNA extraction.
Isolation of Total RNA-RNA was extracted by a modification of the protocol of Auffray and Rougeon (11). Frozen tissues were placed in a solution of 3 M LiCl, 6 M urea and homogenized with a POLYTRON™ homogenizer (Kinematic, Lucerne, Switzerland) for 1-2 min with a 30-s interval with the tubes kept on ice. The homogenate was kept at 4°C overnight, and then the RNA was pelleted by centrifugation at 15,000 × g for 30 min at 4°C. Pellets were washed twice with the 3 M LiCl, 6 M urea solution, drained, and resuspended in 0.4 ml of TE buffer containing 0.5% SDS and 200 μg of proteinase K. The resuspensions were incubated at 37°C for 30-60 min. RNA was extracted twice with phenol/chloroform (1:1, v/v) and once with chloroform/isoamyl alcohol (24:1, v/v) and precipitated with 1/10 volume of 3 M sodium acetate and 2.5 volumes of absolute ethanol at −20°C for 2 h. The mixtures were centrifuged for 20 min at 15,000 × g, 4°C, and the RNA was resuspended in diethylpyrocarbonate-treated distilled water.
The amount of RNA was estimated by measuring the absorption at 260 nm.
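As an illustration of this estimate, the sketch below converts an A260 reading into an RNA concentration using the common approximation that one A260 unit in a 1-cm cuvette corresponds to about 40 μg/ml of RNA. The conversion factor, the dilution factor, and the example reading are assumptions for illustration and are not taken from the protocol.

```python
# Common approximation: 1.0 A260 unit (1-cm path) ~ 40 ug/ml of RNA.
# This factor is an assumption; the text does not state which one was used.
RNA_UG_PER_ML_PER_A260 = 40.0


def rna_concentration_ug_per_ml(a260, dilution_factor=1.0):
    """Estimate the RNA concentration of the undiluted stock in ug/ml."""
    if a260 < 0 or dilution_factor <= 0:
        raise ValueError("a260 must be >= 0 and dilution_factor must be > 0")
    return a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor


def volume_for_mass_ul(mass_ug, concentration_ug_per_ml):
    """Return the stock volume (ul) that contains `mass_ug` of RNA."""
    if concentration_ug_per_ml <= 0:
        raise ValueError("concentration must be > 0")
    return mass_ug / concentration_ug_per_ml * 1000.0


if __name__ == "__main__":
    # Hypothetical reading: A260 = 0.5 on a 1:100 dilution of the stock.
    stock = rna_concentration_ug_per_ml(0.5, dilution_factor=100)
    print(f"stock concentration: {stock:.0f} ug/ml")
    print(f"volume holding 20 ug: {volume_for_mass_ul(20, stock):.1f} ul")
```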
Northern Blot Hybridization-20 μg of total RNA was loaded onto each lane of a 1.2% agarose gel containing 3% formaldehyde. After electrophoresis, the RNA was transferred onto Hybond-N+ membranes (Amersham International plc, Buckinghamshire, UK) by capillary blotting in 20× SSC (3 M NaCl, 0.3 M sodium citrate) buffer. Membranes containing RNA from various adult and fetal human tissues were purchased from CLONTECH Laboratories, Inc. (Palo Alto, CA). They contained 2 μg of mRNA from each tissue.
For standard hybridization, the membranes were first incubated with prehybridization buffer (7% SDS, 0.25 M sodium o-phosphoric acid; manufacturer's protocol) at 65°C for 1-2 h, then transferred to hybridization buffer supplemented with α-32P-labeled cDNA probes and incubated at 65°C for 12 h. The 3′-untranslated region of AR (MscI cut at nucleotide 1109 to the end, 1366, of AR cDNA; Ref. 11) and ARL-1 (polymerase chain reaction-amplified product nucleotide 1020-1300, GenBank™ accession number HARL U37100) were used as probes. The sequence identity between these two probes is less than 10%. After hybridization, the filters were briefly washed twice with a solution of 0.1× SSC and 0.5% SDS at room temperature and twice in the same solution for 30 min each at 65°C. The radioactivity on the membranes was quantitated and visualized by PhosphorImager (Molecular Dynamics), and the membranes were also exposed to x-ray film (X-MAT™ AR, Eastman Kodak Co.) at −70°C with intensifying screens for a better quality image.
The procedures were essentially the same for reduced stringency hybridization except that hybridization was done at 50°C, and the membranes were washed in 0.5× SSC and 0.5% SDS at 50°C.
Construction and Screening of Tumor cDNA Library-Poly(A) RNA was purified from about 750 μg of total RNA from the human hepatocellular carcinoma (HCC) specimen and used to construct the cDNA library with unidirectional ZAP Express™/EcoRI/XhoI vector (Stratagene, La Jolla, CA) following the manufacturer's protocols.
In the primary screening, duplicate plaque lifts were first hybridized with 32P-labeled human AR cDNA (12) at 50°C in a solution of 7% SDS, 0.25 M sodium o-phosphoric acid. 1 × 10⁹ cpm of 32P-labeled probe per 40 ml of hybridization solution was used. After hybridization, the membranes were washed at 50°C in 0.5× SSC, 0.5% SDS for 1 h. The membrane filters were then exposed to x-ray film. After exposure, the filters were washed at 60°C with 0.1× SSC, 0.5% SDS and reexposed to x-ray film. Plaques that showed a positive hybridization signal on duplicate filters with the signal disappearing after the 60°C wash were selected for further analysis. These were eluted and plated on 80-mm plates to obtain well isolated plaques. These plaques were rescreened with radioactively labeled human AR cDNA as above. Three clones were isolated after the second round of screening. These three phages were used to infect bacteria XPORT along with helper phage and plated on a lawn of XLOLR bacteria according to the manufacturer's protocol (Stratagene, La Jolla, CA). The phagemid DNA portions excised by the helper phage from the phage vector DNA were propagated as plasmids in XLOLR for further analysis.
DNA Sequence Analysis-The nucleotide sequences at both ends of the clones were determined by the dideoxy chain termination sequencing method (fmol™ DNA Sequencing System, Promega Corp., Madison, WI) using the T3 and T7 universal primers. From the sequence information, other oligonucleotides were synthesized to serve as primers to determine the remaining sequence. Both strands of the longest clone, pCDL3, were sequenced at least once.
Expression of AR and ARL-1 in E. coli-AR and ARL-1 cDNA were inserted into the E. coli expression vector pQE-30 (Qiagen) such that their proteins would be translated in frame from the vector's translation start site. The plasmid DNA was then transformed into the bacterial host M15. Induction of expression of the cDNA inserts and purification of the recombinant proteins were done according to the manufacturer's manual. Briefly, bacteria grown to A600 = 0.8 were induced by 2 mM isopropyl-1-thio-β-D-galactopyranoside for 3 h. Cells were harvested and lysed by sonication. After centrifugation to remove debris, the supernatant was mixed with a 50% slurry of nickel-nitrilotriacetic acid resin, which binds to the 6 histidine residues at the amino terminus of the recombinant protein. The slurry was then loaded onto a column and eluted with 0-0.5 M imidazole wash buffer. One-ml fractions were collected, and samples from each column were analyzed by SDS-polyacrylamide gel electrophoresis. Fractions containing the recombinant protein were pooled, dialyzed, and stored at −20°C.
Enzyme Assays-The standard reaction mixture for AR was 135 mM sodium phosphate buffer (pH 6.2), 0.2 mM NADPH, 0.3 M lithium sulfate, 0.5-2.5 μg of enzyme with the appropriate amount of substrate as indicated in Table II. The reaction was carried out at 30°C, and the decrease in NADPH was monitored by spectrophotometer at 340 nm. Enzyme activity was calculated as the amount of NADPH oxidized/min/μg of protein. Reaction conditions for ARL-1 were the same except that the phosphate buffer was pH 7.0. For the determination of pH optima and the effect of chloride and sulfate ions on enzymatic activity, 4 mM DL-glyceraldehyde was used as substrate.
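The activity calculation described here can be sketched in a few lines of Python. The sketch uses the widely cited molar extinction coefficient of NADPH at 340 nm (6220 M⁻¹ cm⁻¹). The reaction volume, the cuvette path length, and the example absorbance change are assumptions, since the text does not state them, and the function name is illustrative.

```python
# Molar extinction coefficient of NADPH at 340 nm (M^-1 cm^-1).
NADPH_EXT_COEFF = 6220.0


def specific_activity(delta_a340_per_min, enzyme_ug, volume_ml=1.0, path_cm=1.0):
    """Return nmol of NADPH oxidized per min per ug of protein.

    delta_a340_per_min: magnitude of the absorbance decrease per minute.
    volume_ml, path_cm: assumed assay volume and cuvette path length.
    """
    if enzyme_ug <= 0 or volume_ml <= 0 or path_cm <= 0:
        raise ValueError("enzyme_ug, volume_ml and path_cm must be > 0")
    # Beer-Lambert law: rate of concentration change in mol/L/min.
    molar_rate = delta_a340_per_min / (NADPH_EXT_COEFF * path_cm)
    # Convert to nmol/min in the assay volume.
    nmol_per_min = molar_rate * (volume_ml / 1000.0) * 1e9
    return nmol_per_min / enzyme_ug


if __name__ == "__main__":
    # Hypothetical reading: A340 falls by 0.0622 per min with 2 ug of enzyme.
    print(f"{specific_activity(0.0622, 2.0):.2f} nmol/min/ug")
```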

RESULTS
Cloning of AR-like cDNA-To see if AR is induced in human liver cancer, we used AR cDNA to probe a Northern blot containing RNA from the first two HCC tissues and the surrounding normal tissues from one of the HCCs we obtained (samples T1, T2, and N2 in Fig. 1). No overexpression of AR was found, but when the hybridization and washing temperature was reduced, an RNA band larger than that of AR was found in HCC T2 (data not shown). RNA from HCC T2 was used as template to construct a cDNA library where the 5′-3′ orientation of the cDNAs is defined (see "Materials and Methods"). Approximately 2 × 10⁵ plaques were screened by hybridization using the human AR cDNA (12) as a probe. Plaques that hybridized under reduced stringency conditions and whose hybridization signal disappeared when rewashed under standard conditions were selected. After rescreening under the same conditions, three clones named pCDL1, pCDL2, and pCDL3 were isolated for further analysis. The plasmid DNAs containing the cDNAs were excised from the phage arms as described under "Materials and Methods," and their cDNA inserts were released by digestion with restriction enzymes. The insert sizes of pCDL1, pCDL2, and pCDL3 were determined to be 0.68, 1.0, and 1.35 kilobases, respectively. The nucleotide sequences at both ends of these clones were determined, and the sequences at the 3′-end of these clones were found to be identical (data not shown), indicating that they are cDNAs of the same message. The sequences at the 5′-ends of these clones are highly homologous to the corresponding regions of AR cDNA (data not shown), suggesting that all three clones are cDNAs of a gene that is very similar in sequence to AR. We called this AR-like gene ARL-1.
Expression of AR and ARL-1 genes in HCC-To confirm that the ARL-1 gene is the AR-like gene expressed in liver cancers and to survey a larger number of cancers, RNA from 24 HCC specimens and the surrounding normal tissues from 16 of these HCCs were analyzed by Northern blot hybridization under standard conditions using ARL-1 cDNA as a probe. As shown in Fig. 1, 13 HCCs (T2, T3, T4, T11, T12, T14, T16, T18, T19, T20, T21, T23, and T24) contained higher levels of ARL-1 mRNA, indicating that about 54% of HCCs overexpress the AR-like gene.
The RNA blot was reprobed with labeled AR cDNA, and HCC samples T5, T7, T11, T14, T16, T22 and T23 were found to overexpress this gene (Fig. 1). None of the 24 HCCs tested overexpressed aldehyde reductase, an enzyme with 49.5% of its amino acids identical to AR (13) (data not shown).
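The frequencies quoted for the two genes follow directly from the sample lists given for Fig. 1. The Python sketch below tabulates them and also reports which tumors appear in both lists; the variable and function names are illustrative, and the sample identifiers are copied from the two paragraphs above.

```python
TOTAL_HCC = 24

# HCC samples reported to overexpress each gene (from the Fig. 1 description).
ARL1_HIGH = {"T2", "T3", "T4", "T11", "T12", "T14", "T16",
             "T18", "T19", "T20", "T21", "T23", "T24"}
AR_HIGH = {"T5", "T7", "T11", "T14", "T16", "T22", "T23"}


def percent(count, total=TOTAL_HCC):
    """Return count as a percentage of total."""
    return 100.0 * count / total


def sample_order(name):
    """Sort key that orders sample names such as 'T11' numerically."""
    return int(name[1:])


both = sorted(ARL1_HIGH & AR_HIGH, key=sample_order)
either = ARL1_HIGH | AR_HIGH

if __name__ == "__main__":
    print(f"ARL-1 overexpressed: {len(ARL1_HIGH)}/{TOTAL_HCC} "
          f"({percent(len(ARL1_HIGH)):.0f}%)")
    print(f"AR overexpressed: {len(AR_HIGH)}/{TOTAL_HCC} "
          f"({percent(len(AR_HIGH)):.0f}%)")
    print(f"both genes: {', '.join(both)}")
    print(f"neither gene: {TOTAL_HCC - len(either)} of {TOTAL_HCC}")
```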
FIG. 2. [...] (17), and CHO reductase (19), only the amino acid that is different from ARL-1 is shown. For rat Spot 17, only part of the amino acid sequence is available (10). The key conserved amino acids described under "Discussion" are shown in boldface type.

Sequence Analysis of ARL-1 cDNA-The sequence of both strands of pCDL3, the longest clone of ARL-1, was determined and submitted to GenBank™ (HARL U37100). This clone contains 1337 base pairs (not counting the poly(A) tail). The first ATG codon at nucleotide 70 signals the beginning of an open reading frame coding for a protein of 316 amino acids, the same size as AR from human and other mammalian species studied (14-16). The AATAAA sequence at position 1316 is most likely the polyadenylation signal, indicating that this clone contains the complete 3′-end of the ARL-1 mRNA. It is not clear whether it contains the entire 5′-untranslated sequence.
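The coordinates reported for the pCDL3 open reading frame can be cross-checked with simple arithmetic. The sketch below assumes 1-based nucleotide numbering and a stop codon that immediately follows codon 316; both are assumptions, as is the function name.

```python
CLONE_LENGTH = 1337      # base pairs, excluding the poly(A) tail
START_CODON_NT = 70      # first nucleotide of the initiating ATG
PROTEIN_LENGTH = 316     # amino acids
POLYA_SIGNAL_NT = 1316   # first nucleotide of AATAAA


def orf_coordinates(start_nt, n_residues):
    """Return ORF landmarks using 1-based, inclusive coordinates."""
    coding_end = start_nt + 3 * n_residues - 1
    return {
        "five_prime_utr_length": start_nt - 1,
        "coding_start": start_nt,
        "coding_end": coding_end,
        # Assumes the stop codon directly follows the last sense codon.
        "stop_codon": (coding_end + 1, coding_end + 3),
    }


orf = orf_coordinates(START_CODON_NT, PROTEIN_LENGTH)
three_prime_utr_length = CLONE_LENGTH - orf["stop_codon"][1]
# AATAAA occupies six nucleotides starting at POLYA_SIGNAL_NT.
signal_to_clone_end = CLONE_LENGTH - (POLYA_SIGNAL_NT + 5)

if __name__ == "__main__":
    print(orf)
    print(f"3' untranslated region: {three_prime_utr_length} nt")
    print(f"nucleotides after AATAAA: {signal_to_clone_end}")
```

Under these assumptions the stop codon ends at nucleotide 1020, which fits the use of nucleotides 1020-1300 as a 3′-untranslated region probe in "Materials and Methods."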
The protein sequence deduced from the cDNA sequence is 71% identical to that of human AR (12). ARL-1 is more homologous to a group of AR-like proteins. Its amino acid sequence is 80, 82, and 83% identical to that of the mouse vas deferens protein (MVDP) (17), the mouse fibroblast growth factor-regulated protein (FR-1) (18), and the Chinese hamster ovary (CHO) reductase (19), respectively. Further, it is 94% identical to the partial sequence of Spot 17 in rat hepatomas (10), indicating that it is most likely the human homologue of this protein. A comparison of the amino acid sequences of these proteins is shown in Fig. 2. ARL-1 is thus a new member of the structurally related family of proteins that includes aldehyde reductase (13), bovine prostaglandin F synthase (16), frog lens ρ-crystallin (20), and many others (21).
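For readers unfamiliar with how such identity figures are obtained, the sketch below computes percent identity between two aligned sequences of equal length. It is a minimal illustration, not the alignment procedure used for Fig. 2, which the text does not describe. The short peptide strings are made up for the example and are not ARL-1 or AR sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return percent identity over the columns of a pairwise alignment.

    Both inputs must already be aligned (equal length). Columns in which
    either sequence has a gap count toward the length but not as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Made-up five-residue peptides that differ at one position.
    print(percent_identity("MATFV", "MASFV"))
    # For scale: 71% identity over 316 residues is about 224 matches.
    print(round(0.71 * 316))
```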
Expression of AR and ARL-1 in Adult and Fetal Human Tissues-To determine the sites of gene expression of ARL-1 in normal tissues, RNA from 16 adult and four fetal organs was analyzed by Northern blot hybridization. AR was also included in this survey for comparison and also because there was no systematic study of its expression at the RNA level in humans. As shown in Fig. 3, ARL-1 is not expressed in many tissues. Its message is most abundant in the small intestines and the colon, and a low level of its mRNA is found in the liver, thymus, prostate, testis, and skeletal muscle. No ARL-1 mRNA is detected in the four fetal tissues tested. AR, on the other hand, is expressed in all tissues tested. In the adult, AR message is most abundant in the prostate, testis, skeletal muscle, and the heart; substantial amounts are found in the ovary, small intestines, colon, thymus, spleen, kidney, and the placenta; lower levels are found in the brain, lung, leukocytes, and the pancreas; a trace amount is found in the liver. Of the four fetal tissues tested, AR mRNA is most abundant in the kidney, a substantial amount is found in the brain, and a low level is found in the lung and liver.
Enzymatic Activities of AR and ARL-1 Recombinant Proteins Expressed in E. coli-Since ARL-1 is highly homologous to the well characterized AR, we wanted to compare its enzymatic activities with those of this enzyme. The cDNAs of these two genes were expressed in E. coli as described under "Materials and Methods," and the purified recombinant proteins were used for enzyme assays. Preliminary experiments showed that ARL-1 is active in reducing DL-glyceraldehyde, and it was used as a substrate to optimize the assay conditions. The optimum pH for AR and ARL-1 was found to be 6.2 and 7.0, respectively (Fig. 4). As reported for AR from tissues, recombinant AR is stimulated by ammonium sulfate and lithium sulfate and not affected by ammonium chloride (22). The activity of ARL-1, on the other hand, was not affected by any of these salts (Table I).
Using their respective optimum assay conditions, the abilities of AR and ARL-1 to reduce various substrates were tested, and the result is shown in Table II. Similar to the AR purified from tissues, recombinant AR can reduce aromatic aldehydes such as nitrobenzaldehyde and carboxybenzaldehyde; aliphatic aldehydes such as methylglyoxal and diacetyl; pyridine aldehydes; glyceraldehyde; and hexoses. Steroids and quinones were poor substrates for AR. The spectrum of substrates that ARL-1 can reduce is very similar to that of AR. The kinetics of these two enzymes in reducing some of these substrates was further analyzed. The results, shown in Table III, [...] and 0.7-fold of the Km of human AR determined in the absence of sulfate (23), while the kcat values for these substrates are, respectively, 6-, 1.7-, 6-, and 4-fold of that determined without lithium sulfate.

DISCUSSION
We have identified a novel protein that is a member of the growing family of structurally related aldo-keto reductases (21). The sequence of this protein called ARL-1 is 71% identical to that of aldose reductase, and together with aldose reductases from other organisms, it belongs to the AKR1B group of aldo-keto reductases according to the new nomenclature system for these proteins (24). ARL-1 is more closely related to MVDP (17) and FR-1 (18), and the recently identified Chinese hamster protein, the CHO reductase (19). The amino acid sequences of these four proteins are about 80-90% identical to each other. They form a distinct subgroup within the AKR1B group because they are about 70% identical to the aldose reductase from various mammalian organisms, which share about 80% sequence identity with each other.
Among these four AR-like proteins, MVDP is unique in that its gene is primarily expressed in the vas deferens and the adrenal gland (17,25). Using Northern blot analysis, we found that ARL-1 mRNA is not present in human vas deferens (data not shown). This indicates that the physiological functions of MVDP and ARL-1 are quite different. ARL-1 is more closely related to FR-1 in terms of sequence homology and sites of gene expression. Similar to ARL-1, FR-1 is also expressed quite strongly in the small intestines, but it differs from ARL-1 in that its mRNA is quite abundant in the ovary and testis, with lower levels in the brain, heart, kidney, liver, lung, skeletal muscle, skin, and spleen (18). It appears that FR-1 also differs from ARL-1 in that it is expressed in fetal liver, while ARL-1 is not. However, this could be because the fetal RNA samples of these two organisms may have been taken from fetuses at different stages of development. Interestingly, although CHO reductase and FR-1 are 93% identical to each other, there are similarities as well as major differences in the sites of their gene expression. CHO reductase mRNA is quite abundant in the urinary bladder and the jejunum, which, unfortunately, had not been screened for FR-1 expression. Both genes are expressed in the testis and the ovary. However, CHO reductase mRNA is not detectable in brain, heart, liver, kidney, and muscle, where FR-1 mRNA is found (19). This family of aldo-keto reductases has a broad range of substrate specificity. They probably participate in a number of physiological functions depending on the type of cells and tissues in which they are located.
Although AR and ARL-1 have similar enzymatic activities, the difference in the site of expression of the genes for these two proteins suggests that they may have different physiological functions. AR is quite abundant in kidney and testis, where it is thought to be involved in osmoregulation (3) and fructose synthesis (1), respectively. AR is also found in many other tissues, where its function is not clear. Judging from its ability to reduce a number of cytotoxic aldehydes, it has been proposed that AR may serve as a detoxification enzyme to eliminate harmful aldehydes generated by cellular metabolism (7,8). Since ARL-1 is predominantly found in the small intestines and in the colon, it is likely to be responsible for inactivating toxic aldehydes in digested food before the nutrients are transported to other organs.
FIG. 4. pH optima for AR and ARL-1. The enzyme activity of AR and ARL-1 at various pH levels was determined using 4 mM DL-glyceraldehyde as substrate. The reaction conditions were the same as the standard condition described under "Materials and Methods" except for the pH as indicated.

TABLE I Effect of chloride and sulfate ions on AR and ARL-1 activities
Reduction of DL-glyceraldehyde by AR and ARL-1 was done under standard conditions as described under "Materials and Methods" except for the addition of salts as indicated. The enzyme activities of AR and ARL-1 without chloride and sulfate ions are set as 100%, and the activities in the presence of various ions are expressed as percentages of the enzyme activity in the absence of chloride and sulfate ions.

We found that the enzymatic activity of ARL-1 is very similar to that of the well characterized human AR. This is not surprising [...] (26-30). These two proteins differ from each other in terms of pH optimum, salt requirement, and kinetics in reducing some of these substrates. Sulfate ion had been shown to increase the maximum velocity and Km of aldose reductase for DL-glyceraldehyde (31), suggesting that it interacts with the catalytic site or affects the conformation of the catalytic site. The activity of ARL-1, on the other hand, was not affected by 0.3 M sulfate ion, the optimum concentration for stimulating AR activity. However, we cannot rule out the possibility that sulfate ion may have a stimulatory effect on ARL-1 at lower concentrations. Cysteine 298 in AR is thought to be involved in sulfate activation (29). This amino acid is conserved in ARL-1, suggesting that this is not the only amino acid involved in sulfate stimulation in AR. Comparative analysis of the structure and enzymatic properties of these proteins should be helpful to our understanding of the structure-function relationship of these proteins. Although CHO reductase is more homologous to ARL-1 than to AR, its kcat values for various substrates are more similar to those of AR than to ARL-1 (Table III). It is interesting to note that CHO reductase and FR-1, which are 93% identical to each other, have very different kinetic constants toward DL-glyceraldehyde, the only substrate that has been studied for FR-1. The Km of FR-1 to DL-glyceraldehyde is 0.92 mM (32), while that for CHO reductase is 28 mM (19).
The amino acids proposed to be involved in NADPH binding and catalysis for AR mentioned above are also conserved in FR-1 and CHO reductase, suggesting that small differences in other amino acid positions may contribute significantly to the conformation of the co-factor binding and catalytic sites. Comparing these small structural differences between these proteins will be helpful in elucidating the enzymatic mechanism of this class of enzymes. AR is thought to be a key enzyme in the etiology of diabetic complications such as cataract, retinopathy, neuropathy, and nephropathy. At this point, it is not clear if ARL-1 also contributes to these diabetic complications. Although Northern blot analysis showed that it is not expressed in tissues prone to diabetic lesions, more detailed studies such as in situ hybridization analysis are necessary to see if it is expressed in a small population of cells within these tissues. If ARL-1 is also involved in the pathogenesis of diabetic complications, it will be important to find AR inhibitors that would inhibit both AR and ARL-1. The binding site of an AR inhibitor, Zopolrestat, on FR-1 has been determined by x-ray crystallographic analysis of the AR inhibitor-enzyme complex (31). Binding of Zopolrestat to FR-1 involves hydrophobic interaction with the amino acids Trp20, Tyr48, Trp79, Trp111, Phe122, Leu300, Met306, Tyr309, and Pro310. With the exception of Leu300 and Met306, which are replaced by Val and Leu, respectively, these amino acids are all conserved in ARL-1. Since the substituted amino acids are also hydrophobic, this suggests that Zopolrestat should also be effective in inhibiting ARL-1.
This project was prompted by previous reports that showed that AR (9) and an AR-like protein called Spot 17 (10) are induced in rat hepatomas, suggesting that these proteins may be needed to detoxify methylglyoxal or other metabolites generated by fast growing cancer cells. We looked at individual human HCCs and found that about 29% of them overexpressed AR and 54% of them overexpressed an AR-like gene we called ARL-1. The deduced protein sequence of ARL-1 is 94% identical to that of Spot 17, indicating that it is most likely the human homologue of that protein. Since not all HCCs overexpressed these genes and since there is no correlation between size of tumor and expression of these genes (data not shown), it is unlikely that they are essential for the growth of the cancer cells.
In the rat studies, the analyses were done on a pool of liver tumors from a number of rats. Therefore, it is not clear if there are variations among individual hepatomas in the expression of these genes as we have observed in the human liver cancer. Since the rat hepatomas were induced by a single carcinogen, while human liver tumors were probably caused by different agents, comparing the pattern of expression of AR and AR-like genes in individual rat and human hepatomas may tell us whether or not the variations in the expression of these genes observed in human liver tumors are the result of different mechanisms of induction of these tumors. Serum stimulation of quiescent fibroblasts induces the expression of AR but not FR-1 (32). On the other hand, fibroblast growth factor stimulates the expression of FR-1 much more than AR (4), indicating that the transcription of these two genes is induced by different growth factors. It is likely that the different liver cancers produce different growth factors or different amounts of these growth factors such that some of these cancers overexpress AR and some overexpress ARL-1. A better understanding of how these genes are induced in the liver cancers may help us unravel the pathogenesis of this deadly disease.