Expression, secretion, and processing of rice alpha-amylase in the yeast Yarrowia lipolytica.

The gene encoding rice α-amylase in Oryza sativa was expressed in the yeast Yarrowia lipolytica, which is a potential host system for heterologous protein expression. For efficient secretion, the strong and inducible XPR2 promoter was used in the construction of four kinds of expression vectors with the following configurations between the XPR2 promoter and terminator: 1) XPR2 prepro-region-rice α-amylase coding sequence, 2) rice α-amylase signal peptide-rice α-amylase coding sequence, 3) XPR2 signal peptide-rice α-amylase coding sequence, and 4) XPR2 signal peptide-dipeptide stretch-rice α-amylase coding sequence. Secretion of active recombinant rice α-amylase into the culture medium was achieved only in the first two cases, demonstrating that the XPR2 signal peptide is not sufficient to direct the secretion of heterologous protein. Furthermore, our study shows that the XPR2 prepro-region causes imprecise processing (after Pro150-Ala151 or Val135-Leu136 instead of Lys156-Arg157) and leads to N-terminal amino acid sequences that differ from that of native rice α-amylase. Secondary structure analysis proposed that the structural form in the vicinity of the KEX2-like endopeptidase processing site in the XPR2 pro-region might play a critical role in the processing of heterologous proteins. These results suggest that the XPR2 pro-region is dispensable for obtaining the precise N-terminal amino acid in heterologous protein secretion. In contrast, utilizing the rice α-amylase signal peptide was sufficient in directing secretion of recombinant protein with the expected N-terminal sequence, indicating that the signal peptide of rice α-amylase was effectively recognized and processed by the Y. lipolytica secretory pathway.

There is currently a strong interest in the development of new eukaryotic hosts for the secretion of heterologous proteins. Yeasts are attractive hosts for the production of foreign proteins because they combine the advantages of prokaryotic and higher eukaryotic systems (1,2). Saccharomyces cerevisiae has been used extensively for the production of many heterologous proteins since host-vector systems, genetic information, and recombinant DNA techniques for this organism are well-established (1-3). However, several drawbacks have emerged in the use of this yeast (1-5). For instance, S. cerevisiae is regarded as a non-optimal host for the large scale production of foreign proteins due to reduced biomass yield caused by aerobic alcohol fermentation. In many cases, it has been difficult for S. cerevisiae to secrete large quantities of proteins 40 kDa or larger. Hyperglycosylation of recombinant proteins is another concern. Because of these problems, non-conventional yeasts such as Pichia pastoris (6,7), Kluyveromyces lactis (8), and Hansenula polymorpha (9) have been explored as new hosts for foreign gene expression (5).
Yarrowia lipolytica is a dimorphic yeast and is heterothallic for mating (10). This yeast has been used to produce citric acid, isopropylmalic acid, erythritol, and mannitol (11-13). Recently, Y. lipolytica has received special attention as a potential host for the production of heterologous proteins due to its ability to secrete high levels of large proteins such as alkaline extracellular protease (AEP) and RNase (2,5,10,14). In fact, approximately 1 g/liter AEP can be secreted into the medium under optimal conditions, indicating a significant secretion capacity.
The XPR2 gene encoding AEP has been cloned by three different groups (15-17). The DNA sequence of XPR2 and its deduced amino acid sequence suggest that AEP is produced as a prepro-protein (15,16). The pre-region is a 15-amino acid signal peptide followed by a stretch of 9 X-Ala or X-Pro dipeptides (X is any amino acid) that are substrates for a dipeptidyl aminopeptidase (18). The pro-region is composed of 122 amino acids attached to the N terminus of the mature AEP (16). The Lys156-Arg157 dipeptide, which is a substrate for a KEX2-like endopeptidase encoded by the XPR6 gene, is present at the junction between the pro-region and the mature AEP (19). Because the XPR2 promoter is strong and regulated by pH and nitrogen source, it has been used to express homologous and heterologous genes in Y. lipolytica (20-22).
In this study, we investigated the secretion and processing of rice α-amylase in Y. lipolytica using either its own signal peptide or fusions with the XPR2 pre-region (signal peptide) or prepro-region. Rice α-amylase was chosen as a model protein for studying heterologous protein secretion because it has a moderate molecular mass (45 kDa), contains one N-linked glycosylation site, and can be easily assayed.

MATERIALS AND METHODS
Strains and Plasmids-Yeast strains and plasmids used in this work are described in Table I. Plasmid pOS103, which carries the rice α-amylase cDNA at the XbaI site of pBluescript (Stratagene, La Jolla, CA), was kindly provided by Dr. R. L. Rodriguez (23,24). The Escherichia coli strain DH5α (25) was used for expression vector construction and plasmid DNA propagation. A derivative strain of Y. lipolytica, SMS397A (Mat A, ade1, ura3, xpr2), was used as the host strain for the expression vectors.
DNA Manipulation and Transformation-General recombinant DNA techniques were performed as described in Sambrook et al. (25). E. coli transformation was performed by the SE method (27). Yeast transformation was carried out by the lithium acetate method (28).
Selection of Transformants Producing Rice α-Amylase-For the first selection, yeast cells transformed with expression vectors were grown on minimal medium without uracil. The growing yeast cells (Ura+) were transferred onto starch-containing YPD medium for the second selection. After 2 days of incubation at 28°C, the plate was stained with iodine vapor. The transformants producing recombinant rice α-amylase were distinguished by the clear zone around the colonies.
α-Amylase Enzyme Assay-The starch-degrading activity of recombinant rice α-amylase was determined by monitoring reducing sugars using the modified dinitrosalicylic acid method (23). Enzyme solution (0.5 ml) was added to 0.5 ml of substrate solution (100 mM sodium acetate buffer with 5 mM CaCl2 and 1% soluble starch, pH 5). After 10 min of incubation at 30°C, the reaction was terminated by adding 0.5 ml of dinitrosalicylic acid solution to 0.5 ml of the reaction solution, and the mixture was boiled for 5 min. The solution was then diluted with 4 ml of distilled water, and absorbance was measured at 540 nm. Glucose solution was used as a standard. One enzymatic unit corresponds to the amount of enzyme required to produce 1 µmol of glucose from soluble starch per min.
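The unit definition above can be turned into a short calculation. The following is a minimal sketch, assuming a linear glucose standard curve and assuming the glucose standards are processed exactly like the 0.5-ml reaction aliquot; the function names, the slope, and the absorbance value are hypothetical and are not taken from this work.

```python
# Volumes and time from the assay description; the standard curve is hypothetical.
REACTION_ML = 1.0   # 0.5 ml enzyme + 0.5 ml substrate
ALIQUOT_ML = 0.5    # portion of the reaction mixed with dinitrosalicylic acid
ENZYME_ML = 0.5     # enzyme solution added per reaction
MINUTES = 10.0      # incubation time at 30 degrees C


def glucose_umol_in_aliquot(a540, slope, intercept=0.0):
    """Invert an assumed linear standard curve: A540 = slope * umol + intercept."""
    return (a540 - intercept) / slope


def amylase_units_per_ml(a540, slope, intercept=0.0):
    """Units (umol glucose equivalents per min) per ml of enzyme solution."""
    umol_aliquot = glucose_umol_in_aliquot(a540, slope, intercept)
    # Scale from the sampled aliquot back to the whole reaction volume.
    umol_reaction = umol_aliquot * (REACTION_ML / ALIQUOT_ML)
    return umol_reaction / MINUTES / ENZYME_ML


if __name__ == "__main__":
    # Hypothetical numbers: slope of 0.2 absorbance units per umol, A540 of 0.4.
    print(amylase_units_per_ml(0.4, slope=0.2))
```

Dividing the result by the protein concentration (mg/ml) of the same enzyme solution would give a specific activity in units/mg, the quantity reported for the purified enzyme.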
To construct the hybrid between the XPR2 prepro-region and the rice α-amylase coding sequence, the first two PCRs were performed separately. One PCR mixture contained primers a and 1 with pIMR52 as a template, and the other contained primers b and 2 with pOS103 as a template. After the first PCR products were purified, a second PCR was carried out with primer a, primer b, and the two purified PCR products as templates. For the other three fusions, the same PCR method was used with different primers (primers 3 and 4, 5 and 6, and 7 and 8 were used for pXOS103, pXOX103, and pXOP103, respectively). The final PCR fragments, which had NheI and EagI restriction enzyme sites at the 5′ and 3′ ends, respectively, were purified and exchanged with the small fragment of pXO103 digested with NheI and EagI. Consequently, four precise fusion plasmids were constructed (pXOM103, pXOX103, pXOP103, and pXOS103). The PCR conditions were as follows: 1 µmol of each primer and 5 units of Vent polymerase (New England BioLabs, Beverly, MA) were used for each reaction in a 100-µl total volume. The PCRs were performed for 25 cycles of 1 min at 94°C, 2 min at 60°C, and 3 min at 72°C, followed by a 7-min incubation at 72°C.
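The two-step PCR described above resembles what is commonly called overlap-extension PCR: the inner primers give the two first-round products a shared junction sequence, so that in the second round they anneal to each other and are extended into one seamless fusion. The sketch below only illustrates that joining logic on strings. The sequences are toy placeholders, not the real XPR2 or rice α-amylase sequences, and the function name and the 12-nucleotide minimum overlap are assumptions made for illustration.

```python
def fuse_by_overlap(upstream, downstream, min_overlap=12):
    """Join two fragments that share an identical junction sequence.

    The 3' end of `upstream` must equal the 5' end of `downstream` over at
    least `min_overlap` bases; the overlap is kept only once in the product.
    """
    longest = min(len(upstream), len(downstream))
    for k in range(longest, min_overlap - 1, -1):
        if upstream[-k:] == downstream[:k]:
            return upstream + downstream[k:]
    raise ValueError("fragments do not share a sufficient overlap")


if __name__ == "__main__":
    # Toy placeholder sequences (hypothetical, for illustration only).
    leader_part = "AAACCCGGGTTT"    # stands in for the end of the leader region
    amylase_part = "CATCATGGTACC"   # stands in for the start of the coding region

    # First-round products: each inner primer adds 6 bases of the partner fragment.
    product_1 = leader_part + amylase_part[:6]
    product_2 = leader_part[-6:] + amylase_part

    fused = fuse_by_overlap(product_1, product_2)
    print(fused == leader_part + amylase_part)
```

Because the junction is defined entirely by the inner primers, no restriction site or linker codon is left between the two parts, which is what allows the fusions to be described as precise.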
Polyacrylamide Gel Electrophoresis-Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) was carried out as described by Laemmli (30). Samples were resolved on 10% polyacrylamide gels containing 0.1% SDS at room temperature. After electrophoresis, gels were stained with Coomassie Brilliant Blue R-250 and dried on cellulose acetate film. To detect amylolytic activity, the gels were rinsed repeatedly with distilled water for 15 min at room temperature to remove SDS. The gels were then immersed in the enzyme reaction buffer (50 mM sodium acetate, 5 mM CaCl2, pH 5) containing 1% (w/v) soluble starch for 30 min at room temperature. The gels were rinsed with the same buffer without soluble starch for 15 min, stained with iodine solution (0.3% iodine, 3% potassium iodide) for 10 min, and rinsed with distilled water. Amylolytic activity appeared as a clear zone on a brown background. The N-terminal amino acid sequence was determined by an automated Edman degradation apparatus (model 477A, Applied Biosystems) with on-line high performance liquid chromatography (model ABI 120). One µg of each sample was separated by SDS-PAGE and electroblotted onto a polyvinylidene difluoride membrane. After staining with Coomassie Brilliant Blue R-250, the sample bands were cut out and used for sequencing.
Prediction of the Protein Secondary Structure-The secondary structures of AEP and the hybrid protein were predicted by the profile network method called PHDsec (Protein Design Group, European Molecular Biology Laboratory, Heidelberg, Germany). This prediction method is rated at 72.1% average accuracy for water-soluble globular proteins in the three states helix, strand, and loop.
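A three-state accuracy figure of this kind is usually a per-residue score: the percentage of residues whose predicted state matches the observed state. The sketch below assumes that this is the meaning of the 72.1% quoted above. The one-letter state codes (H for helix, E for strand, L for loop), the function name, and the example strings are illustrative assumptions and are not output from PHDsec.

```python
def three_state_accuracy(predicted, observed):
    """Percentage of residues whose predicted state equals the observed state.

    Both arguments are equal-length strings over the assumed alphabet
    H (helix), E (strand), L (loop).
    """
    if len(predicted) != len(observed):
        raise ValueError("state strings must have the same length")
    if not predicted:
        raise ValueError("state strings must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(predicted)


if __name__ == "__main__":
    # Hypothetical 10-residue example: 8 of 10 states agree.
    print(three_state_accuracy("HHHHLLEEEL", "HHHLLLEEEE"))
```

An average accuracy near 72% means that roughly one residue in four may be assigned the wrong state, so a predicted change at a single site is better read as a hypothesis than as a structural determination.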
Western Blot Analysis-After separation by SDS-PAGE, proteins were electroblotted onto a nitrocellulose membrane (Schleicher & Schuell) in ice-cold transferring buffer (15.6 mM Tris, 120 mM glycine, and 20% methanol, pH 8.3) at 100 V for 1 h. The first antibody for Western blot analysis was anti-barley α-amylase antibody which was generously given by Dr. S. Katoh and Dr. M. Terashima (Kyoto University, Japan). The second antibody was anti-rabbit IgG antibody conjugated with alkaline phosphatase supplied by Vector (Burlingame, CA). The detection procedure followed was the same as that recommended by the manufacturer.
Affinity Chromatography-Recombinant rice α-amylase produced by the Y. lipolytica transformants was purified using β-cyclodextrin-substituted, epoxy-activated Sepharose 6B affinity chromatography (23,31). The recombinant strain was grown in GPP medium with a working volume of 1 liter at 28°C for 30 h in a 2-liter jar fermentor. After centrifugation (5000 rpm with Sorvall GSA rotor, 10 min at 4°C), the secreted proteins in the supernatant were precipitated with 75% (w/v) ammonium sulfate. A protein pellet was obtained by centrifugation (5000 rpm with Sorvall GSA rotor, 15 min at 4°C) and redissolved into the buffer (50 mM sodium acetate, 5 mM CaCl2, pH 5). After dialysis, 40 ml of protein solution was subjected to β-cyclodextrin affinity chromatography. After the sample was applied to a column (2.0 × 10.0 cm), the column was washed with 50 ml of low salt buffer (50 mM sodium acetate, 5 mM CaCl2, 5 mM NaCl, pH 5) and 50 ml of high salt buffer (50 mM sodium acetate, 5 mM CaCl2, 0.3 M NaCl, pH 5). The rice α-amylase was eluted with 8 mg/ml β-cyclodextrin, and each fraction volume was 5 ml.

RESULTS
Plasmids pXOM103-In, pXOX103-In, pXOP103-In, and pXOS103-In were independently transformed into Y. lipolytica SMS397A. Chromosomal integration of the plasmids into the genome of the host Y. lipolytica strain was confirmed by Southern blot analysis using the rice α-amylase cDNA as a probe (data not shown). The resulting transformants were named YLAMIn (Y. lipolytica producing rice α-amylase with pXOM103-In), YLAXIn, YLAPIn, and YLASIn, respectively. The secretion of recombinant rice α-amylase was determined with the plate clear zone assay described under "Materials and Methods." After 2 days of incubation at 28°C, transformants producing rice α-amylase were detected by the clear zone around the colonies (Fig. 2). YLAXIn and YLAPIn were not able to produce rice α-amylase either intracellularly or extracellularly (Fig. 2), implying that the signal peptide cleavage sites in the fusion proteins of pXOX103 and pXOP103 were not recognized efficiently. In contrast, YLAMIn and YLASIn showed clear zones (Fig. 2) and α-amylase activity in the culture medium, indicating secretion of recombinant rice α-amylase. However, the plate clear zone and rice α-amylase assays demonstrated that YLASIn strains produced approximately 3-fold more rice α-amylase than YLAMIn strains. Since YLAMIn and YLASIn produced recombinant rice α-amylase, the secreted proteins from both strains were studied further.
The Rice α-Amylase Signal Peptide Is Sufficient to Direct Secretion of Rice α-Amylase in Y. lipolytica-The presence of recombinant rice α-amylase in the YLASIn culture supernatant was confirmed by SDS-PAGE (Fig. 3) and Western blot analysis (Fig. 4). The rice α-amylase secreted by YLASIn was purified by β-cyclodextrin affinity chromatography (Table II). Figs. 3 and 4 show that the purified recombinant rice α-amylase is homogeneous with a molecular mass of 45 kDa, which is similar to that of the native enzyme produced in germinating rice seed (23). However, the specific activity of purified rice α-amylase was 143 units/mg, which is lower than that reported for the enzyme produced in S. cerevisiae (226 units/mg) (23). The difference in specific activities may be due to variations in the methods used for determining total protein concentration and enzyme activity. To determine glycosylation and the size of the N-linked carbohydrate chain, purified rice α-amylase was treated with endoglycosidase H, which cleaves the N-linked oligosaccharide chain from the glycoprotein. Fig. 3A demonstrates that the N-linked glycosyl group is approximately 3 kDa, and Fig. 3B shows that deglycosylation does not affect enzyme activity.
To determine the signal peptide cleavage site of recombinant rice α-amylase, N-terminal amino acid sequence analysis was performed. Previous attempts to determine the N-terminal amino acid sequences of native and recombinant (S. cerevisiae) rice α-amylases have failed (23). However, a tentative signal peptide processing site between Gly31 and Gln32 has been proposed based on a comparison with the amino acid sequence of barley α-amylase (23,24). The recombinant rice α-amylase produced by YLASIn had Gln-Val-Leu-Phe-Gln as an N-terminal amino acid sequence, which is identical to the proposed sequence of native rice α-amylase (Fig. 6). Although the XPR2 signal peptide with and without a dipeptide stretch failed to direct the secretion of rice α-amylase (Fig. 2), our results indicate that the signal peptide of rice α-amylase is recognized and efficiently processed in the secretory pathway of Y. lipolytica. In fact, the rice α-amylase signal peptide is the first exogenous signal sequence reported to direct heterologous protein secretion in Y. lipolytica.
The XPR2 Prepro-region Causes Imprecise N-terminal Sequences in Recombinant Rice α-Amylase and Thus Is Dispensable for Heterologous Protein Secretion-When the rice α-amylase produced by YLAMIn was purified by affinity chromatography, two distinct bands appeared on the SDS-PAGE gel (Fig. 5A). Interestingly, both bands showed enzyme activity (data not shown) and reacted with the anti-barley α-amylase antibody (Fig. 5B). The molecular masses of the major and minor recombinant rice α-amylases were approximately 45 and 47 kDa, respectively. After endoglycosidase H treatment, two bands still appeared, and the size difference between them remained unchanged, indicating that glycosylation does not account for the difference between the two proteins. Moreover, the specific activity of these proteins (136 units/mg) was almost the same as that of the recombinant rice α-amylase produced by YLASIn.
To investigate the processing of recombinant rice α-amylase secreted by YLAMIn, the N-terminal amino acid sequences of the two proteins appearing in SDS-PAGE were determined. Both proteins had unexpected N-terminal amino acid sequences. Initially, we predicted that the recombinant rice α-amylase produced by YLAMIn might be processed after Lys156-Arg157 at the end of the XPR2 pro-region, which is a substrate for a KEX2-like endopeptidase encoded by the XPR6 gene (19). However, results show that the major and minor proteins are processed after Pro150-Ala151 and Val135-Leu136, respectively (Fig. 6).
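As a rough consistency check, the observed cleavage positions can be compared with the apparent size difference between the two bands. The sketch below assumes that "processed after X-Y" means cleavage directly after the second residue of the pair, that the α-amylase sequence begins immediately after Arg157 in the fusion, and that an average residue contributes about 110 Da; the last figure is a generic approximation, not a value from this work, and the names are hypothetical.

```python
EXPECTED_LAST_RESIDUE = 157    # Arg157, end of the XPR2 pro-region
AVG_RESIDUE_MASS_DA = 110.0    # assumed rough average mass per residue


def retained_pro_residues(last_residue_before_cut):
    """Pro-region residues left on the N terminus after cleavage at this position."""
    return EXPECTED_LAST_RESIDUE - last_residue_before_cut


def mass_difference_kda(major_cut, minor_cut):
    """Approximate mass difference (kDa) between the minor and major forms."""
    extra = retained_pro_residues(minor_cut) - retained_pro_residues(major_cut)
    return extra * AVG_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    major_cut, minor_cut = 151, 136   # after Ala151 and after Leu136
    print(retained_pro_residues(major_cut))
    print(retained_pro_residues(minor_cut))
    print(mass_difference_kda(major_cut, minor_cut))
```

Under these assumptions the major form retains 6 pro-region residues and the minor form 21, a difference of 15 residues or roughly 1.7 kDa, which is in line with the approximately 2-kDa difference between the 45- and 47-kDa bands on SDS-PAGE.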
To determine whether the secondary structure of the XPR6 processing site (Lys156-Arg157) in AEP was altered in the hybrid protein, computer analysis of the protein secondary structure was carried out. In the predicted secondary structure of AEP, the Lys156-Arg157 region is a loop exposed to the surface (data not shown). Interestingly, however, the same region in the hybrid protein, in which the XPR2 prepro-region was fused with the rice α-amylase coding region, is predicted to be an α-helix buried away from the surface. These data imply that the XPR6 endopeptidase cannot access the Lys156-Arg157 cleavage site in the hybrid protein. Therefore, we have concluded that the structural change in the XPR6 cleavage site between AEP and the hybrid protein directs an altered processing of recombinant rice α-amylase. However, there is no known protease that reacts with the cleavage sites described above. Hence, the enzyme that may be involved in the processing of recombinant rice α-amylase secreted by YLAMIn remains an open question. From these results, we have concluded that even though the prepro-region of XPR2 has been successfully employed to direct the secretion of some heterologous proteins in Y. lipolytica (20), it may result in imprecise cleavage of the prepro-fusion protein and produce the wrong N-terminal amino acid in the secreted protein; thus, the XPR2 prepro-region is dispensable for foreign protein secretion in Y. lipolytica.

DISCUSSION
Since Yarrowia lipolytica is an attractive host for the production and secretion of heterologous proteins (5,14), it has been used to secrete foreign proteins such as bovine prochymosin, S. cerevisiae invertase, porcine α-interferon, and human blood coagulation factor XIIIa (20-22, 32). In all cases, the XPR2 signal sequence was used for directing endoplasmic reticulum translocation. However, utilizing the XPR2 signal peptide for heterologous protein secretion is not always successful, as demonstrated in this study. We found that although the prepro-region of XPR2 was sufficient to direct secretion of recombinant rice α-amylase, the signal peptide (pre-region) of XPR2, either by itself or with a dipeptide stretch, did not direct secretion of rice α-amylase. Instead, the heterologous signal peptide of rice α-amylase was efficiently recognized and processed by the Y. lipolytica endoplasmic reticulum translocation machinery, and it directed foreign protein secretion in this yeast. This result demonstrates that the signal peptide of rice α-amylase can be used for achieving heterologous protein secretion in Y. lipolytica.
The results of this study show that the recombinant α-amylase produced by YLASIn is a 45-kDa glycoprotein, which is similar to native rice α-amylase, with Gln-Val-Leu-Phe-Gln as an N-terminal amino acid sequence. Previously, the N-terminal amino acid sequence of native and recombinant rice α-amylases could not be determined because the N-terminal amino acid was modified or blocked (23). However, the site between Gly31 and Gln32 was proposed as a signal peptide cleavage site based on the sequence comparison to barley α-amylase (23,24). Our data confirm that this proposed signal peptide cleavage site is correct.
Interestingly, two types of active recombinant rice α-amylase were secreted by YLAMIn. Using endo H treatment and N-terminal amino acid sequence analysis, it was shown that the two secreted recombinant proteins are processed differently (Figs. 5 and 6). Tokunaga et al. (33) also detected a minor portion of mouse α-amylase which was processed differently when it was expressed and secreted by S. cerevisiae using the pGKL 128-kDa killer toxin secretion signal sequence. They assumed that the secreted α-amylase was processed by a second protease in the secretory pathway after removal of the signal sequence. Since our results show that the XPR6 cleavage site (Lys156-Arg157) is not processed in the XPR2 prepro-rice α-amylase fusion protein, it was presumed that an alternative protease might attack the fusion protein instead of XPR6. However, we failed to identify a specific protease that may be involved in this processing. Interestingly, Valverde et al. (34) found a putative α-helix structural motif at the C terminus of the pro-region which might play a critical role in directing the secretion of active endothiapepsin in S. cerevisiae. In this study, we utilized secondary structure computer analysis for fusion proteins to determine if this is the case. Although secondary structure computer analysis revealed that the α-helical structure is not conserved in the C terminus of the AEP pro-region, the predicted secondary structure of the hybrid protein shows that the Lys156-Arg157 site is altered from the exposed loop form in AEP to the buried helical structure in the fusion protein. Therefore, we presume that the XPR6 endopeptidase cannot access the Lys156-Arg157 cleavage site in the hybrid protein. In conclusion, although the pro-region is known to play an essential chaperone-like role in the secretion of several proteinases and is indispensable for the secretion of AEP (35-37), the AEP pro-region is unnecessary for the secretion of heterologous protein in Y. lipolytica because it can cause imprecise processing of the hybrid protein and lead to the wrong N-terminal amino acid in the final product.