Incorporation of Norvaline at Leucine Positions in Recombinant Human Hemoglobin Expressed in Escherichia coli*

We report here a novel finding that norvaline can be incorporated in place of leucine in recombinant human hemoglobin expressed in Escherichia coli. The presence of the norvaline was confirmed by several analytical methods such as amino acid analysis, peptide mapping, electrospray mass spectrometry, and Edman protein sequencing. It appears that substitution is distributed across both the β- and di-α-globins in purified recombinant hemoglobin. The level of misincorporation correlated with the ratio of the free norvaline/leucine pool available in the cell culture. This suggests that the incorporation of norvaline for leucine occurs through misaminoacylation of tRNALeu, similar to the misincorporation of norleucine for methionine found in many recombinant proteins expressed in E. coli.

The best known example of misincorporation of uncoded amino acids into recombinant proteins is substitution of norleucine for methionine. This phenomenon has been found in many recombinant proteins and is well documented in the literature (1-9). Norleucine can be substituted for methionine in proteins both at internal residues and at the initiating amino terminus. Bogosian (1) postulated that high level synthesis of leucine-rich proteins may promote the biosynthesis of norleucine in bacteria. Norleucine can then become randomly incorporated into cloned proteins at methionine residues. The incorporation of norleucine occurs through tRNA misaminoacylation (mischarging) during protein synthesis.
Norvaline can be intentionally incorporated into proteins either by semi-synthesis at designed positions (10-13) or by feeding microorganisms expressing heterologous proteins with norvaline (14). Also, it has been reported that norvaline is a natural component of an antifungal peptide produced by Bacillus subtilis (15). We report here that a low but readily detectable level of norvaline can be naturally misincorporated into recombinant hemoglobin in place of leucine. To determine the extent of norvaline (Nva) incorporation and to establish which of the 72 leucine residues were involved in the misincorporation, we undertook a detailed structural characterization of (putatively) norvaline-enriched samples of rHb1.1. We used a variety of analytical tools, including amino acid analysis, peptide mapping, Edman sequencing, and several mass spectrometry techniques. Our results demonstrate that the misincorporation of norvaline is affected by fermentation conditions. The extent to which misincorporation occurs strongly correlates with the ratio of free norvaline to leucine found in the total pool of lysed bacterial cells in fermentation broth. In purified recombinant hemoglobin, the misincorporation appears to occur specifically at leucine positions and is distributed across both the β- and di-α-globins. We hypothesize that norvaline misincorporation may occur in other heterologous proteins expressed in Escherichia coli under conditions where norvaline can be produced and accumulated during the expression phase.
Recombinant hemoglobin (rHb1.1) expressed in E. coli is a human hemoglobin stabilized by the fusion of two α chains through a glycine linking the carboxyl terminus of the first α chain to the amino terminus of the second α chain (di-α-globin). At the amino termini of both the di-α- and the β-globin chains of rHb1.1, the wild type valine residues have been replaced with methionine residues. In addition, rHb1.1 also contains the Presbyterian mutation (Asn108 → Lys) in the β chain of the molecule, which confers reduced oxygen affinity (17-19). Recombinant hemoglobin contains 72 leucine residues (12.5%) out of 575 total amino acids. In contrast, typical E. coli proteins contain on average 8.4% leucine (16). That distinction classifies hemoglobin as a leucine-rich protein, which under certain conditions appears to trigger norleucine and norvaline production in E. coli in a manner similar to that postulated by Bogosian (1).

EXPERIMENTAL PROCEDURES
Materials-Amino acid standards for amino acid analysis were from Pierce. Reagents for OPA amino acid analysis were from Hewlett-Packard. AccQ-Tag chemistry package was purchased from Waters. Acetone and 4-vinylpyridine were from Aldrich. Proteolytic enzymes were as follows: V8 was from Pierce and trypsin was from Worthington, and Porozyme cartridge of immobilized trypsin was purchased from PerSeptive Biosystems. Restriction enzymes and DNA ligase were purchased from New England Biolabs. All sequencing reagents for the Porton sequencer were from Beckman. Horse heart myoglobin, norvaline, and formic acid were purchased from Sigma. HPLC-grade acetonitrile and water were from Burdick and Jackson, and general chemicals and buffer components were obtained from Amresco.
Rare Leucine Codon Replacement-The leucine codons at positions 110 and 141 in the β-globin were replaced by polymerase chain reaction amplification of the carboxyl-terminal region of β using two DNA primers in which the CUA codons were replaced by a sequence creating CUG codons. The primers were CBG111 (5′-GGTAAAGTACTGGTTTGCGTT-3′) and CBG112 (5′-AAAACAGAAGCTTCATTAGTGGTATTTGTGAGCCAGCGCGTTAGC-3′). The polymerase chain reaction product was digested with restriction enzymes ScaI and HindIII and ligated into the identically digested beta gene in plasmid pSGE720, replacing the original coding sequence including the rare codons for leucine. These leucine codon mutations removed SpeI and NheI restriction sites, so clones were digested with these enzymes to eliminate any background carryover, and the digest was transformed into competent E. coli cells by the method of Hanahan (21). Correct codon replacement clones were identified by resistance to SpeI and NheI restriction digestion and confirmed by dideoxy DNA sequencing using the Sequenase (U. S. Biochemical Corp.) kit and protocol. The strain resulting from the replacement of the two rare codons was SGE2770, containing plasmid pSGE761, identical to pSGE720 except for the A to G change in the two β leucine codons.
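The primer design described above can be checked with a short script. The following is a minimal sketch, not part of the original procedure: it assumes that CBG111 reads in frame from its first base, that CBG112 is a reverse primer whose reverse complement reads in frame from its first base, and that the primer sequences are continuous (the line-break hyphens in the printed sequences are dropped). The helper names `translate` and `reverse_complement` are introduced here for illustration only. In DNA notation the CUG codon appears as CTG.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq):
    """Translate DNA in frame 0; '*' marks a stop codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))

# Primer sequences as given in the text (hyphens removed).
CBG111 = "GGTAAAGTACTGGTTTGCGTT"
CBG112 = "AAAACAGAAGCTTCATTAGTGGTATTTGTGAGCCAGCGCGTTAGC"

forward_peptide = translate(CBG111)
reverse_peptide = translate(reverse_complement(CBG112))

# Codon encoding the single leucine in each translated stretch.
leu_codon_fwd = CBG111[3 * forward_peptide.index("L"):][:3]
leu_codon_rev = reverse_complement(CBG112)[3 * reverse_peptide.index("L"):][:3]

has_scai = "AGTACT" in CBG111      # ScaI recognition site
has_hindiii = "AAGCTT" in CBG112   # HindIII recognition site (palindromic)
```

Under these assumptions both leucines in the translated stretches are encoded by CTG, and each primer carries the recognition site of the enzyme used for cloning.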
Protein Sequencing-The sequence of proteins and peptides was determined by automated Edman degradation chemistry using a Porton 2090 gas phase sequencer. Beckman peptide supports were used as recommended. Phenylthiohydantoin-derivatives were identified by reversed phase chromatography on a modified Hewlett-Packard 1090L HPLC using a Hewlett-Packard AminoQuant column.
Amino Acid Analysis-Protein and peptides were subjected to gas phase hydrolysis at 165°C for 1 h in the presence of HCl containing 1% phenol using a Savant AP100 AminoPrep Hydrolyzer. Amino acids were analyzed using pre-column derivatization with 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate (AQC). The standard AccQ-Tag method was used for derivatization of amino acids (22). AQC derivatives of amino acids were separated on a HP 1090IIM using a NovaPack C18 (2 × 150 mm) column. Solvent A was diluted AccQ-Tag Eluant A from Waters (1000 ml of water, 85 ml of concentrated Eluant A). Solvent B was 60% acetonitrile in HPLC grade water. Separation was performed using the following gradient conditions: 0 min, 0.0% B; 0.
A modified OPA pre-column derivatization was used for quantitation of norvaline (23, 24). Hydrolyzed protein and peptides were derivatized in-line with OPA and injected on an AminoQuant RP column at 35°C. To improve resolution of Nva from surrounding amino acids, the elution gradient was modified as follows: 0 min, 5% B, 0.45 ml/min flow; 0.
Analysis of Free Amino Acids in Lysate-1-ml samples of fermentation broth were collected and stored at −80°C until analyzed. Prior to analysis, samples were thawed and lysed by sonication for 1 min on ice. Lysate was filtered through 3K Centricon membranes and analyzed for content of free amino acids using the modified OPA method described above.
Pyruvate Assay-Fresh cell broth was spun down and the supernatant was immediately frozen at −20°C. Thawed supernatant was analyzed for pyruvate by HPLC using an Aminex HPX-87H column (from Bio-Rad) employing isocratic elution with 0.004 N H2SO4 as recommended by the manufacturer.
Analytical C3 HPLC Separation-Recombinant hemoglobin was diluted to 2 mg/ml in HPLC-grade water. Where specified, globin was precipitated with 20 volumes of cold 0.6% HCl in acetone and solubilized in 2% (v/v) formic acid to a final concentration of 2 mg/ml. 100 μg of protein was injected onto a HP 1090IIM using a Zorbax CBC3 HPLC analytical column (0.46 × 25 cm) at 40°C. Solvent A was 0.1% (v/v) trifluoroacetic acid in water, and solvent B was acetonitrile with 0.1% (v/v) trifluoroacetic acid. The flow rate was regulated at 1 ml/min. Gradient conditions were as follows: 35% solvent B for 5 min, then the percentage of B was increased to 49% over 45 min. Eluant was monitored at 215, 280, and 400 nm.
Electrospray Mass Spectrometry-Electrospray mass spectra were obtained using a Vestec electrospray source and a model 201 single quadrupole mass spectrometer with a 2000 atomic mass unit range. Samples were delivered to the source by direct infusion at 1.6 μl/min. Horse heart myoglobin at a concentration of 0.02 mg/ml was used daily to calibrate the instrument.
Distribution of norvaline substitution was obtained by LC-MS using a Finnigan MAT SSQ 710C single quadrupole mass spectrometer with a Finnigan MAT electrospray ionization source. The quadrupole was calibrated on the day of analysis with an MRFA tetrapeptide/apomyoglobin solution using standard procedures. Mass spectrometer and source parameters were optimized for LC-MS analysis and were as follows: electrospray ionization source voltage, 4500 V; electrospray ionization N2 sheath gas, 60 p.s.i.; heated capillary, 250°C; scan range, 500-2000 m/z in 2 s; split flow, 1:100.
The HPLC system used in this study was a TSP 4100MS with an HP1050 autosampler and an ABI785A absorbance detector.
Peptide Mapping-Staphylococcus aureus V8 protease was used to map recombinant hemoglobin. V8 mapping of the entire hemoglobin molecule was performed without separating the globin chains. The protein was precipitated with acetone containing 0.6% (v/v) HCl and reduced and alkylated with vinylpyridine in specified experiments. After alkylation, the chains were desalted using a PD-10 disposable column (Sephadex G-25M), 2% (v/v) formic acid mobile phase. Dried globins (alkylated or nonalkylated) were reconstituted in 100 mM ammonium acetate, pH 4, containing 2 M urea. The conditions for this digest were 1:30 enzyme/substrate (w/w) with an overnight incubation at room temperature. The following day, additional V8 protease was added to bring the final enzyme/substrate ratio to 1:25 (w/w), and the incubation was continued for an additional 2 h at room temperature. The products of the V8 digestion were separated on a 4.6 × 250 mm Vydac C4 HPLC column at a flow rate of 1 ml/min. Solvent A was 0.1% (v/v) trifluoroacetic acid in water, and solvent B was 70% (v/v) acetonitrile with 0.1% (v/v) trifluoroacetic acid. The peptides were eluted isocratically with 5% solvent B for 5 min, and then solvent B was increased to 70% over 65 min. The absorbance was monitored at 215 and 400 nm.
Tryptic mapping was performed using Porozyme trypsin cartridges (25)(26). Separation was performed using a Zorbax C18 column. The peptides were eluted starting with 1% solvent B at 25 min, and then solvent B was increased to 50% over 65 min. Solvent A was 0.1% (v/v) trifluoroacetic acid in water and solvent B acetonitrile with 0.1% (v/v) trifluoroacetic acid. The absorbance was monitored at 215 and 400 nm.
Oxygen Affinity Measurements-Oxygen equilibrium was measured using a Hemox analyzer, and values of p50 and nmax were calculated as described previously (17-19).
and isolated shouldering fractions (see Fig. 1 for fraction collection) were in general agreement with the expected composition (Table I). Amino acid analysis of the di-α-globin showed low recovery of methionine (3.4 versus the expected 5 residues). This is due to approximately 40% methylation of the amino-terminal methionine in the di-α chain (27). In both shouldering fractions, recovery of methionine was discernibly increased relative to the respective main fraction, and the methionine peak was atypically broad, possibly suggesting the presence of an unusual or modified amino acid eluting close to the methionine peak in our amino acid analysis. Later it was found that the AQC derivative of norvaline elutes very close to methionine, causing overestimation of methionine content (see Fig. 4). LC-MS analysis of hemoglobin sample B showed a mass loss of 14 ± 1.2 Da for the shouldering fraction on the β versus the main fraction and a mass loss of 10 ± 6.6 Da for the shouldering fraction on the di-α versus the main fraction (Table II). It should be noted that the observed mass of the di-α chain (30,329.3 ± 3.5) was higher than expected (30,323.9). We ascribe this discrepancy to methylation of 40% of the di-α-globin amino-terminal methionine (27), which translates to an expected 5.6 Da mass gain over the predicted mass for the di-α-globin.
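The 5.6 Da figure follows from simple arithmetic, sketched below under the assumption that methylation adds one CH2 unit (average mass about 14.03 Da) and that the measured mass is the population-weighted average of methylated and unmethylated chains. The variable names are illustrative and are not taken from the original work.

```python
# Average atomic masses (standard values, rounded).
MASS_C = 12.011
MASS_H = 1.008

# A methyl group replaces one hydrogen, so the net gain is CH2.
methyl_gain = MASS_C + 2 * MASS_H          # about 14.03 Da

methylated_fraction = 0.40                  # from the text (about 40%)
expected_shift = methylated_fraction * methyl_gain

predicted_unmodified = 30323.9              # predicted di-alpha mass (Da)
predicted_with_methylation = predicted_unmodified + expected_shift

observed, observed_sd = 30329.3, 3.5        # LC-MS value and its uncertainty
agrees = abs(observed - predicted_with_methylation) <= observed_sd
```

With these inputs the weighted shift is about 5.6 Da, and the corrected prediction falls within the stated uncertainty of the observed mass.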

V8 Mapping of Globins
A control sample A of rHb1.1 and sample B containing a new modification were mapped with V8 proteinase (Fig. 2A). Prior to the digestion both samples were alkylated with 4-vinylpyridine. V8 does not cleave at all available Glu sites and therefore generates a rather simple and consistent peptide set for rHb1.1 (Fig. 2B). V8 mapping of sample B showed small differences when compared with control sample A (Fig. 2A). A small difference peak was seen in front of the G peptide (carboxyl-terminal peptide from the β-globin), and another small difference peak was visible in front of the M peptide (internal peptide from the di-α-globin). Mass analysis (Table III) of these difference peptides indicated that they each exhibited a mass loss of approximately 14 Da when compared with their respective neighboring peptides G and M. This mass loss was strikingly similar to the mass difference found between the leading shoulder fractions of the β- and di-α-globins and their respective main peaks (Table II and Fig. 1). These difference peptides were isolated and further characterized.
V8 peptide mapping of the nonalkylated globin showed improved separation of the M and N peptides and additionally revealed a pre-N peptide eluting at the back side of the M peptide which was not evident in the map of the alkylated globins (chromatogram not shown). Mass analysis is included in Table IIIB.

Characterization of the β-Globin Difference Peptide Eluting before G Peptide
Electrospray Mass Spectrometry-Electrospray mass spectra of the pre-G difference peptide (which showed an asymmetric elution profile in the peptide map suggestive of more than one component) showed an envelope indicative of a single component with mass loss of 14 Da when compared with the G peptide (Table III).
Edman Protein Sequencing-Edman protein sequencing confirmed that the pre-G peptide has the same amino-terminal sequence as the G peptide. However, during further sequencing we observed partial substitution of norvaline for leucine at positions 110, 114, and 141 (cycles 9, 14, and 40). Fig. 3 shows an example chromatogram detecting both Leu and Nva at position 110 in cycle 9. The level of norvaline substitution in the pre-G peptide varied by position, with about 25% at position 110, more than 50% at position 141, significantly lower amounts at positions 114 and 106, and virtually none at the leucine in position 105.
Amino Acid Analysis-The pre-G difference peptide was analyzed for amino acid composition. Amino acid analysis clearly indicated a new unusual amino acid eluting between S-pyridylethylcysteine and methionine (Fig. 4). The G peptide does not contain any methionine (for which it could be mistaken), further suggesting the presence of a new modified or unusual amino acid. By comparison with the retention times of a standard amino acid mixture spiked with norvaline, it was established that the new peak in amino acid analysis corresponds to norvaline (Fig. 4). Elution of the norvaline next to methionine is consistent with the amino acid analysis of shoulders from the β- and di-α-globins.

Characterization of the Di-α-globin Difference Peptide Eluting in Front of M Peptide
Electrospray Mass Spectrometry-Electrospray mass spectra of the pre-M difference peptide showed an envelope indicative of a single component with mass loss of about 14 Da when compared with the M peptide (Table III).
Trypsin Digest and Edman Sequencing-The nominal M peptide was too large to establish the source of the observed difference in mass by sequencing. Therefore, mapping with trypsin was used to identify smaller sub-peptide(s) containing the difference. Trypsin digest of the pre-M peak revealed one difference peptide when compared with a trypsin map of the original M peptide (Fig. 5). The difference peptide was isolated and analyzed by electrospray mass spectrometry and Edman sequencing. Again this difference peptide revealed a 13.7-Da mass loss when compared with the disappearing reference subpeptide from trypsin digestion of the M peptide. This new difference peptide was sequenced and was found to be a 28-mer (Leu100-Lys127 in di-α notation) containing 7 leucines: LLSHCLLVTLAAHLPAEFTPAVHASLDK.
We did not observe significant substitution at the first 2 leucines; however, at the subsequent 5 leucines, we found evidence of partial norvaline substitution for leucine. The level of substitution ranged from 15 to 25%.
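As a consistency check on the numbers above, the following sketch computes the mass difference between a leucine and a norvaline residue from their elemental compositions and counts the leucines in the sequenced 28-mer. It uses rounded average atomic masses; the residue formulas are the standard ones (Leu, C6H11NO; Nva, C5H9NO), and all variable names are illustrative.

```python
# Average atomic masses (standard values, rounded).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

def residue_mass(formula):
    """Average mass of an amino acid residue given its elemental formula."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())

LEU_RESIDUE = {"C": 6, "H": 11, "N": 1, "O": 1}   # leucine residue
NVA_RESIDUE = {"C": 5, "H": 9, "N": 1, "O": 1}    # norvaline residue

# One Leu -> Nva substitution removes a single CH2 unit.
delta_per_substitution = residue_mass(LEU_RESIDUE) - residue_mass(NVA_RESIDUE)

peptide = "LLSHCLLVTLAAHLPAEFTPAVHASLDK"          # sequence from the text
peptide_length = len(peptide)
leucine_count = peptide.count("L")

# Observed mass loss of the tryptic difference peptide, in Da.
observed_loss = 13.7
substitutions_per_peptide = observed_loss / delta_per_substitution
```

A loss of 13.7 Da corresponds to roughly one substitution per peptide, even though the substitution is spread over several leucine positions at partial occupancy.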
Amino Acid Analysis-The isolated and re-purified trypsin difference sub-peptide from pre-M was analyzed for amino acid composition. Amino acid analysis was in good agreement with the expected composition of the 28-mer, with a loss of 0.9 leucine residues and an indication of a new amino acid nearly coeluting with methionine. By comparison with the retention times of standard amino acids, it was established that the new peak in the amino acid analysis elutes at the same position as norvaline and is poorly resolved from methionine (data not shown).
Based on these results we conclude that the pre-M difference peptide (Leu100-Lys127) was from the modified M peptide from the first half of the di-α-globin, in which leucines were substituted with different levels of norvaline, yielding on average one substitution of norvaline for leucine per peptide. Similarly, the existence of a pre-N peak during mapping of the nonalkylated globins suggested that the same phenomenon occurred on the second half of the di-α-globin. This may indicate that the corresponding repeating peptide (Leu242-Lys269) in the second part of the di-α-globin also contained partial substitution of leucine by norvaline, although we did not determine this experimentally. The pre-M (Arg31-Glu172) and pre-N (Arg173-Arg283) peptides in sample B of rHb1.1 each constitute roughly 5% of the total of the respective M and N peptides.

Determination of Total Norvaline for Leucine Substitution
Our standard AQC amino acid analysis system did not serve well for the determination of the extent of norvaline for leucine substitution because of its poor resolution between norvaline and methionine. Therefore, we adopted another amino acid analysis method using OPA precolumn derivatization. In that system, norvaline is essentially baseline resolved from the nearest amino acid, valine. Because the concentration of leucine is several orders of magnitude higher than norvaline, we used the UV signal for the quantitation of leucine and the fluorescent signal for the quantitation of norvaline. Fig. 6 (bottom panel) shows a standard amino acid mixture spiked with norvaline monitored with a fluorescence detector. Fig. 6 (top and middle panel) shows the level of norvaline versus leucine Table IV.

Distribution of Norvaline Substitution Measured by LC-MS
To determine the extent of Nva incorporation and to establish which leucine residues within the amino acid sequence of β- and di-α-globin undergo norvaline misincorporation for leucine, we applied an orthogonal technique. Samples A and B were mapped with trypsin and analyzed by LC-MS. It was relatively straightforward to query complex LC-MS data for molecular weight information on certain tryptic peptides and on certain known (or possible) modifications to tryptic peptides by selective ion monitoring. For example, the expected mass for the amino-terminal tryptic peptide of β-globin (MHLTPEEK) is 983 daltons, whereas the norvaline misincorporated form at position 3 of the same peptide (MHXTPEEK, X = Nva) would be 969 daltons. Selective ion chromatograms can be created for their respective M+ ions, i.e. 984 and 970. Percent relative abundance can be calculated by taking the integrated ion area for a substituted peptide and dividing it by the total area of all known forms of that peptide. Because two peptides with very similar primary amino acid sequences are expected to ionize similarly, the ratio of the two selected ion chromatographic intensities can give a reasonable estimate of the percent relative abundances of these peptides.
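The mass calculation and the relative-abundance estimate described above can be sketched as follows. This is an illustration only: it uses standard monoisotopic residue masses, treats Nva (written X) as an isomer of valine, adds one proton for the singly charged ion, and the function names `peptide_mass` and `percent_substituted` are introduced here rather than taken from the original analysis. The peak areas in the last lines are arbitrary placeholders, not measured values.

```python
# Monoisotopic residue masses (Da); X = norvaline, an isomer of valine.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
    "X": 99.06841,
}
WATER = 18.01056    # added once per peptide for the free termini
PROTON = 1.00728    # added for a singly protonated ion

def peptide_mass(sequence):
    """Neutral monoisotopic mass of a linear peptide."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER

def percent_substituted(substituted_area, all_form_areas):
    """Integrated area of the substituted form as a percentage of all forms."""
    return 100.0 * substituted_area / sum(all_form_areas)

native = peptide_mass("MHLTPEEK")        # about 983 Da
substituted = peptide_mass("MHXTPEEK")   # about 969 Da
native_ion = native + PROTON             # about 984
substituted_ion = substituted + PROTON   # about 970

# Placeholder areas (arbitrary units) to show the calculation only.
example_percent = percent_substituted(1.0, [1.0, 99.0])
```

The estimate relies on the assumption stated in the text that the native and substituted peptides ionize with similar efficiency.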
Using this approach, it was possible to probe a number of sites within the amino acid sequence of the molecule and to determine the relative abundance of norvaline for leucine misincorporation. The leucine-containing tryptic peptides were chosen by the criterion that one or two leucine residues were present in the peptide. Analysis of peptides containing more than two leucine residues would become intractably complex. There are eight tryptic peptides containing a single leucine residue, which should give a representative view of the pattern of norvaline for leucine misincorporation. Table V shows the percent leucine to norvaline misincorporation. These results were in agreement with Edman sequencing of pre-G and pre-M difference peptides.
The data for sample B indicated that norvaline can misincorporate for leucine residues throughout di-α- and β-globins, and there was variation between sites in the level of misincorporation. The control sample A only showed detectable leucine to norvaline substitution at position 141 (0.2%) in the β chain, which was one of the most highly substituted sites in sample B.

Codon Effects
Two of the positions in the β-globin from sample B with the highest levels of norvaline misincorporation were Leu110 and Leu141. These coincided with the only occurrences of CUA, the rarest of the six leucine encoding codons in E. coli (28). This suggested that the level of norvaline misincorporation could be tRNA-specific, possibly more likely at rarely used codons. To test this hypothesis we constructed a new version of the β-globin expression gene in which both of these CUA codons were changed to CUG. This is the most commonly used leucine codon in E. coli and constitutes 86% of the Leu codons used in rHb1.1. This change resulted in no significant changes in the amount of soluble rHb1.1 produced, and virtually identical levels of norvaline misincorporation were measured (0.20 ± 0.11 versus 0.14 ± 0.00%). These results indicated that any potential codon-directed misincorporation did not play a significant role in overall levels of norvaline misincorporation. Probably very low levels at the other 68 leucine sites cumulatively comprised the major portion of the overall misincorporation. It is possible that low levels of norvaline misincorporation found in certain positions of purified hemoglobin may reflect structural requirements for folding of subunits, insertion of heme, and/or assembly of a functional trimeric recombinant hemoglobin. Therefore globins containing significant levels of misincorporation at those sites could fail to accumulate as soluble rHb1.1.

Accumulation of Free Amino Acids in Whole Fermentation Broth
Fig. 7 shows a set of typical time courses of accumulation of selected free amino acids found in lysed whole broth from a fermentation of E. coli engineered for high level expression of rHb1.1. In six fermentations the profiles were very similar; however, the amplitude and phase varied from fermentation to fermentation, resulting in different levels of misincorporation. The level of leucine decreased in the first 5 h and then remained relatively constant through the fermentation. In addition, we monitored the level of pyruvate since it has been indicated by others (1, 5, 29) as the primary substrate in biosynthesis of norvaline and norleucine. As can be seen in Fig. 7, stabilization of leucine level coincided with a rapid increase in pyruvate. Also, the accumulation of pyruvate was paralleled closely by norvaline. It appeared that norvaline and norleucine accumulation followed a bi-phasic profile. In the first 5 h following induction, norvaline and norleucine accumulated at a relatively slow rate, but after that norvaline accumulated significantly faster. Norleucine followed a similar trend, but the increased accumulation of norleucine occurred 1 to 2 h later than seen for norvaline.

Kinetics of Norvaline Production and Misincorporation to rHb
Fig. 7 shows a time course of rHb1.1 production and norvaline misincorporation. Hemoglobin expression began almost immediately following induction. Misincorporation, however, could not be detected before 6-8 h after induction. From that point on, the level of norvaline substituted for leucine continued to increase until the end of fermentation.

Ratio of Nva/Leu and Misincorporation
Hemoglobin from several fermentations was isolated at the end of fermentation and analyzed for norvaline misincorporation. At the same time, homogenized aliquots of whole fermentation broth were tested for free norvaline, norleucine, and leucine concentrations. When the level of norvaline substitution was plotted against the ratio of free norvaline to free leucine available at the end of fermentation, a high correlation was established with an r² = 0.90 (Fig. 8). This indicated that the ratio of norvaline to leucine available in the fermentor had a very significant impact on misincorporation. Significant but weaker correlations of norvaline misincorporation with norvaline concentration or the inverse of leucine concentration were observed. This supports the argument that detectable levels of norvaline misincorporation can result from a significant increase in free norvaline or a decrease of leucine concentration, either of which can force this ratio to increase.
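For readers who wish to reproduce this kind of analysis on their own fermentation data, the coefficient of determination for a simple linear fit can be computed as sketched below. The function name `r_squared` and the example numbers are assumptions for illustration; the numbers are arbitrary and are not the measurements behind Fig. 8.

```python
def r_squared(x_values, y_values):
    """Squared Pearson correlation, equal to r^2 of a simple linear fit."""
    n = len(x_values)
    if n != len(y_values) or n < 2:
        raise ValueError("need two equally sized series with at least 2 points")
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    # Sums of squares and cross-products about the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    s_xx = sum((x - mean_x) ** 2 for x in x_values)
    s_yy = sum((y - mean_y) ** 2 for y in y_values)
    return (s_xy * s_xy) / (s_xx * s_yy)

# Arbitrary example: x would be the free Nva/Leu ratio at the end of each
# fermentation, y the percent norvaline-for-leucine substitution.
perfect = r_squared([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
scattered = r_squared([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
```

The same function can be applied to norvaline concentration alone, or to the inverse of leucine concentration, to compare the strength of the three correlations mentioned in the text.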

Effect of Norvaline Misincorporation on Functionality of Hemoglobin
The effect of norvaline substitution on hemoglobin functionality was assessed by measuring oxygen affinity (p50) and cooperativity (nmax). Samples A and B of recombinant hemoglobin were analyzed as described under "Experimental Procedures." Results are listed in Table VI. It was estimated that the relative standard deviation for both measurements was <4%. The observed 0.73% of norvaline for leucine substitution found in sample B had no effect on oxygen affinity and cooperativity of hemoglobin when compared with control sample A.

DISCUSSION
The results presented here show that E. coli can biosynthesize norvaline and incorporate it into recombinant hemoglobin at positions normally occupied by leucine. It appeared that the misincorporation in the purified sample of rHb1.1 was not equally distributed across the β and the di-α chains, with certain leucine positions appearing more highly misincorporated than others. Either these positions were favored for substitution or the procedure employed for purification of soluble rHb1.1 may work as a selection mechanism against isoforms containing substitutions at certain other positions. At certain positions in hemoglobin norvaline misincorporation may disrupt folding and proper assembly. This hemoglobin may fail to accumulate as a soluble protein and/or may be destabilized such that it can be denatured during the heat deactivation step at the time of purification (20). Therefore the sites where misincorporation was detected in the purified functional recombinant hemoglobin may not necessarily reflect the actual, presumably random distribution of misincorporation occurring during translation of the globins. The observed level of norvaline misincorporation did not affect oxygen affinity and cooperativity.
Our results indicate that the ratio of norvaline to leucine available during expression is the most critical factor in determining norvaline misincorporation (Fig. 8). This suggested that misincorporation of norvaline was the result of the mis-aminoacylation of leucine transfer RNA with norvaline (2). Misincorporation produced by mis-aminoacylation of tRNA would be expected to result in random distribution at leucine positions. However, our results indicate that the misincorporation in the purified protein sample was not equally distributed across the β and the di-α chains. Two sites in the β-globin, Leu110 and Leu141, were correlated with the only occurrences of the rarest Leu codon "CUA" in the rHb1.1 genes, suggesting that the corresponding tRNA may have been more sensitive to misaminoacylation. However, our experiments did not confirm a relationship between the rare codon usage for Leu employed in our expression construct and the level of norvaline misincorporation.
We believe that norvaline misincorporation into recombinant hemoglobin may not necessarily be unique; norvaline misincorporation is likely to be found in other recombinant proteins produced in E. coli. Similar misincorporation of norleucine for methionine is known to vary widely among different recombinant proteins expressed in E. coli (5). Therefore, it is conceivable that other proteins could demonstrate similar variation in norvaline misincorporation.
The leucine biosynthetic pathway in E. coli has been studied by a number of researchers. These groups have proposed norvaline and norleucine as potential by-products in this pathway (1, 29-31). They proposed that imperfect substrate fidelity could result in norvaline and norleucine production, originating from pyruvate through keto acid elongation and transamination processes. Our experiments have shown a reproducible, sequential order of such events (Fig. 7): accumulation of expressed hemoglobin, initial decrease in leucine concentration to a steady level with a simultaneous increase in pyruvate concentration, followed by accumulation of norvaline. Norvaline accumulation results in an increase of Nva/Leu ratio which is followed by norvaline misincorporation into recombinant hemoglobin. Later still a similar accumulation of norleucine is observed. These findings are consistent with Bogosian's hypothesis that high level expression of leucine-rich recombinant proteins can stimulate production of norvaline and norleucine and cause misincorporation of these uncoded amino acids into proteins (1). However, it should be noted that we did not find norleucine for methionine misincorporation in the sample exhibiting norvaline misincorporation. Norleucine misincorporation should be influenced by the ratio of norleucine to methionine. Observed norleucine accumulation without significant methionine depletion was insufficient to produce detectable norleucine misincorporation. The difference in affinities of amino acids competing for the same tRNA will influence the ratio necessary to trigger detectable misincorporation of uncoded amino acids.
Based on our findings, we propose a model for norvaline misincorporation. In this model, endogenous norvaline synthesis in E. coli alone is generally insufficient to produce detectable misincorporation into proteins. A crucial component for detectable misincorporation is the equilibrium between the supply of and demand for leucine. In "normal" cells, leucine is constantly biosynthesized. At the same time, norvaline (and norleucine) is synthesized to some extent as a by-product of the pathway. Because the concentration of leucine is higher than that of norvaline, the Nva/Leu ratio is low. A low Nva/Leu ratio favors correct charging of the leucine tRNAs and precludes significant norvaline misincorporation into proteins. Overexpression of a leucine-rich protein such as recombinant hemoglobin (12.5% leucine) can trigger norleucine and norvaline production in E. coli, as postulated by Bogosian. Under this excessive demand for leucine, the leucine biosynthesis pathway may begin to lose stringency. In such a "stressed" system, normal feedback signals are absent, and production and accumulation of norvaline as a by-product of the leucine pathway may increase. Our data also suggest that high levels of pyruvate may contribute to the increased production of norvaline relative to leucine. Since there is no clear mechanism to utilize norvaline, it will accumulate and the ratio of free Nva/Leu will rise. This allows leucine-specific tRNAs to be increasingly mischarged with norvaline, resulting in the detectable incorporation of norvaline into proteins. Because most norvaline accumulation happens late in the hemoglobin expression period (several hours after induction), when most of the cell proteins have already been synthesized, most of the misincorporation occurs in hemoglobin produced late in the expression period. This would explain why we have been unable to find evidence of norvaline incorporation into native E. coli proteins from fermentations with readily detectable levels of norvaline in the expressed recombinant hemoglobin.
From this proposed model, it would appear possible to design a strategy to prevent norvaline misincorporation by appropriately supplementing the E. coli fermentation with leucine. This strategy would be similar to the prevention of norleucine for methionine misincorporation by feeding E. coli with methionine (1). It should be noted, however, that addition of leucine may significantly influence the physiology of the cells through leucine-responsive regulatory proteins (32). Alternatively, fermentation control strategies that limit the production of pyruvate appear to limit norvaline production by minimizing the concentration of the substrate (pyruvate) necessary for its biosynthesis. Limiting norvaline production in turn minimizes norvaline misincorporation.
The requirement of very high purity and homogeneity for recombinant proteins that are to be used as human therapeutics presents a scientific and analytical challenge in demonstrating the purity and integrity of final products. Several very sensitive analytical methods can be used to detect low levels of contamination with other unknown proteins and/or misincorporation of uncoded amino acids. For several decades norleucine and norvaline were used as convenient internal standards in amino acid analysis. The discovery of norleucine misincorporation precluded its use as a standard in the analysis of recombinant proteins. Our findings raise the question of whether there are other cases in which norvaline should no longer be used as an internal standard in the analysis of heterologous proteins.
Anthony-Cahill, Dr. Jeff Etter, Dr. Bruce Kerwin, Dr. Antony Mathews, and Dr. Justin Neway for their very helpful discussions during the discovery process and later during manuscript preparation.
This work would not have been possible without the support of a large group of Somatogen technical staff in manufacturing, pilot operations, and assay services, especially Daryl Odgen, Daniel Rice, and Tim Fattor.