Selection of Linkers for a Catalytic Single-chain Antibody Using Phage Display Technology*

Phage display has been evaluated as a means of rapidly selecting tailored linkers for single-chain antibodies (scFvs) from protein linker libraries. Preliminary experiments with a conventional linker failed to yield a functional single-chain version of a catalytic antibody with chorismate mutase activity. A random linker library was therefore constructed in which the genes for the heavy and light chain variable domains were linked by a segment encoding an 18-amino acid polypeptide of variable composition. The scFv repertoire (≈5 × 10^6 different members) was displayed on filamentous phage and subjected to affinity selection with hapten. The population of selected variants exhibited significant increases in binding activity but retained considerable sequence diversity. Screening 1054 individual variants subsequently yielded a catalytically active scFv that was produced efficiently in soluble form. Sequence analysis revealed a conserved proline in the linker two residues after the VH C terminus and an abundance of arginines and prolines at other positions as the only common features of the selected tethers. There are apparently many viable solutions to the problem of linking individual VH and VL domains, but subtle differences in sequence dramatically influence the production, stability, and recognition properties of the scFv. The success of these experiments suggests that phage display will be generally useful for identifying peptide sequences for covalently linking any two protein domains.

Functional single-chain antibodies (scFvs)¹ have been engineered in many laboratories by linking together immunoglobulin heavy and light chain variable domains (VH and VL) via polypeptide tethers (1,2). Compared to intact IgGs or Fab fragments, scFvs have the practical advantages of smaller size and structural simplicity with comparable affinity for antigen. Covalent linkage of the VH and VL chains also favors domain assembly and enhances protein stability relative to analogous two-chain Fv fragments. Moreover, organizing Fv genes as a single continuous DNA sequence can facilitate the generation of mutant libraries.
Successful construction of an scFv depends on the choice of a linker that neither interferes with the folding and association of the VH and VL domains nor reduces the stability and recognition properties of the Fv molecule. A certain degree of flexibility in the linker may also be needed for the functional cooperation of the two subunits. To satisfy these requirements, several design strategies have been developed (1,2). In one approach, flexible glycine-rich sequences such as (GGGGS)3 have been used as tethers. Alternatively, useful linkers have been derived from multidomain proteins or designed by molecular modeling. There is no reason to believe, however, that a linker suitable for one antibody will be optimal for others. Indeed, expression levels, solubility, stability, and binding affinity of scFvs can vary significantly depending on linker length and sequence.
Selection of a functional linker from a large population of candidate sequences represents a potentially more general solution to the problem of scFv design. Stemmer and co-workers have developed a procedure to identify scFvs synthesized from libraries of scFv genes with randomized linker DNA sequences (3). This method uses filter lifts to detect binding activity to hapten, but requires hapten labeling and is limited by the number of molecules that can be practically screened. Phage display technologies have the potential to extend this approach greatly. Large repertoires of >10^7 scFv clones can be produced on the surface of filamentous phage particles (4-6), making possible direct selection of linkers on the basis of their ability to yield functional (i.e. hapten binding) receptor molecules. Here we report the successful application of this strategy to the construction of a single-chain version of the catalytic antibody 1F7 (7), which was raised against transition state analog 1 and possesses modest chorismate mutase activity (Fig. 1).

MATERIALS AND METHODS
Strains, Bacteriophage, and Vectors - Escherichia coli XL1-Blue (Stratagene) was used for cloning of libraries, phage panning, screening for soluble scFv protein production, and for routine plasmid preparations. BL21(DE3) (Novagen) was the host strain for large scale production of functional scFv. Helper phage VCS-M13 (Stratagene) was chosen for phage rescue. Phagemid pComb3-M3 (Fig. 2a) was constructed² by deleting a 272-base pair fragment from NheI (859) to XbaI (1131) in pComb3 (8) to avoid problems with in vivo recombination and rearrangements associated with a second homologous cloning site. A sequence encoding a decapeptide tag (YPYDVPDYAS) was placed upstream of the gIII sequence for easy detection of protein products by the monoclonal antibody 12ca5 (provided by Dr. I. A. Wilson). The restriction sites XbaI and NheI flanking the gIII fragment are compatible and facilitate excision of the gIII sequence for production of soluble scFv protein. Phagemid pET-22b(+) (Novagen) was used for the overproduction of selected scFv proteins.
Oligonucleotide Synthesis - The randomized linker oligonucleotide and PCR primers were synthesized on a Pharmacia Biotech Inc. Gene Assembler Plus. The antisense oligonucleotide 1F7VHMlu (5'-TTCAACGCGTT(SNN)18GACAGTGACCAGAGTAC-3', where S is dA, [...] in VL. The VL 3' antisense primer (1F7VLSpe; 5'-GACTAGTCTTCAGCTCCAGCTTGGT-3') adds a SpeI site (ACTAGT) to the end of VL for fusion to the decapeptide tag in pComb3-M3. All synthesized oligonucleotides were desalted by chromatography using a NAP-10 column (Pharmacia) and used directly for PCR without further purification.
Library Construction - The 1F7 scFv library was constructed in two steps. First, the VL gene was amplified by PCR from the corresponding Fab gene (9) using the primers 1F7VLMlu and 1F7VLSpe and cloned into phagemid pComb3-M3 at the HindIII and SpeI sites to give the intermediate plasmid pComb3-mVL3. This process created a unique MluI site at the 5' end of the VL gene and removed the internal NcoI site within the VL sequence. Second, the random linker library was fused to the VH gene obtained by PCR from the corresponding Fab gene (9) using the primers 1F7VHMlu and 1F7VHNco; the PCR products were ligated into NcoI/MluI-digested pComb3-mVL3. The final ligation mixture was transformed into XL1-Blue cells by electroporation. An aliquot of transformed cells was plated on an ampicillin-containing agar plate to allow characterization of the starting library by sequencing of individual clones. The remaining sample was incubated at 37°C for 9 h in 200 ml of 2xYT medium (10) containing 1% glucose and 100 µg ampicillin/ml, and stored in 15% glycerol at -70°C.
* This work was supported in part by grants from the National Institutes of Health and the Office of Naval Research. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 [...]
Phage Rescue and Panning - Phage rescue from the phagemid library by helper phage VCS-M13 was carried out as described by Marks et al. (11). Phage panning was performed in 96-well microtiter plates following standard protocols (11). Phage particles that bound to the immobilized bovine serum albumin conjugate of transition state analog 1 (7) were eluted with 1.5 mM free hapten and introduced back into exponentially growing XL1-Blue cells by infection. Five rounds of panning were performed. Cells containing the final pool of selected phagemids were stored at -70°C in 15% glycerol.
Screening for Soluble Functional scFv - Double-stranded phagemid DNA was isolated from the final pool. In order to produce scFvs in soluble form, the gIII fragment was excised by digestion of the phagemid pool with XbaI and NheI, and the large fragment was recircularized after purification by agarose gel electrophoresis. To screen for soluble hapten binding scFvs, this phagemid pool was transformed into XL1-Blue. Individual colonies were picked with a toothpick and grown in 150 µl of CS medium (4.8% yeast extract, 2.4% tryptone, 0.3% Na2HPO4, 0.3% NaH2PO4, pH 7.1) containing 100 µg of ampicillin/ml in a 96-well microtiter plate at 37°C for 6 h. Expression of the scFv genes was induced by addition of 50 µl of fresh CS medium containing 1 mM isopropyl-1-thio-β-D-galactopyranoside to each well, and the cultures were incubated at room temperature for an additional 24 h. Hapten binding activity was assessed by ELISA with 50 µl of the culture supernatant. Active clones were further characterized by semiquantitative ELISA, Western blotting, and DNA sequencing (10).
Large Scale Production of scFv - Novagen's pET system (12) was used for the production of preparative amounts of the most active scFv. The scFv gene, containing both pelB signal and decapeptide tag sequences, was cloned into the expression vector pET-22b(+) cut with NdeI and XhoI after the restriction sites at both ends of the insert had been altered appropriately by PCR. Strain BL21(DE3) was transformed with the resulting pET-sc1F7/L2 construct (Fig. 2c) and grown in 3 liters of 2xYT medium containing 1% glucose and 100 µg of ampicillin/ml at 37°C with frequent additions of NaOH to maintain pH 7.0 until the cell density reached A600 = 6. The cell culture was then cooled to room temperature and its pH was adjusted to 5.8. Expression of the scFv gene was induced by addition of isopropyl-1-thio-β-D-galactopyranoside to 0.3 mM, and the culture was incubated at room temperature for 8 h at a constant pH of 5.8. Cells were harvested by centrifugation and stored at -70°C prior to protein isolation.
Protein Purification - Soluble scFv protein was purified from the cells in two steps. First, the frozen cells were suspended in 80 ml of buffer A (50 mM sodium phosphate, 300 mM NaCl, pH 8, containing 0.2 mM phenylmethylsulfonyl fluoride and 2 µg/ml each of leupeptin, pepstatin A, and aprotinin) and disrupted by passage through a French press. After centrifugation for 60 min at 25,000 × g, the supernatant containing all soluble proteins was loaded onto a column packed with 4 ml of ProBond nickel-binding resin (Invitrogen) that had been pre-equilibrated with buffer A. The column was thoroughly washed with buffer A containing 10 mM imidazole, and bound protein was eluted with 250 mM imidazole, pH 7.0. Second, proteins eluted from the nickel column were loaded onto a MonoS cation exchange column (Pharmacia) in 50 mM imidazole, pH 7.0. After washing the column with the loading buffer, bound proteins were eluted by a 100-600 mM NaCl gradient. The scFv protein eluted at ≈400 mM NaCl. Protein purity was checked by SDS-polyacrylamide gel electrophoresis using Pharmacia's PhastGel system. Protein concentration was determined by measuring the absorbance at 280 nm (ε280 = 50.5 mM^-1 cm^-1).

FIG. 1. Transition state analog 1 was used to generate the monoclonal antibody 1F7, which catalyzes the rearrangement of chorismate 2 to prephenate 3.

FIG. 2. Vectors used for the construction and phage display of sc1F7 libraries and the expression of individual clones. The abbreviations in panel a are: Plac, lac promoter; pelBs, pelB signal sequence; Tag, decapeptide sequence (see text); gIII, fragment of gIII from filamentous phage; bla, ampicillin resistance gene; pUC ori, DNA replication origin of the ColE1 group; f1 ori, the phage f1 origin of replication. Panel b, sc1F7 gene constructed with random linker sequences in pComb3-M3. Amino acid positions and key restriction sites are marked. Panel c, plasmid for expression of the sc1F7/L2 gene in the T7 system. T7p and T7ter stand for the T7 promoter and transcription terminator, respectively; lacI, lac repressor gene; pBR322 ori, DNA replication origin of the ColE1 group.
Determination of Catalytic Activity - Purified protein was assayed for chorismate mutase activity in 50 mM imidazole, 400 mM NaCl, pH 7.0 at 15°C. Disappearance of chorismate was monitored spectroscopically at 275 nm (7), and initial rates, corrected for the background reaction, were used to determine values of kcat and Km. The apparent dissociation constant (Kd)app for hapten 1 was determined by competition ELISA (13) and defined as the concentration of free hapten required to inhibit maximal binding by 50%. Contamination of the samples by the bifunctional E. coli enzyme chorismate mutase-prephenate dehydratase was investigated by monitoring the production of phenylpyruvate in chorismate mutase assays at 30°C (14); aliquots (800 µl) were mixed with 3.3 N NaOH (200 µl) and the absorbance at 320 nm was measured (ε320 = 17.6 mM^-1 cm^-1). Similarly, contamination by chorismate mutase-prephenate dehydrogenase was assessed by incubating the scFv in the presence of 0.25 mM prephenate and 2 mM NAD+ and monitoring the absorbance change at 340 nm (ε340 = 6.4 mM^-1 cm^-1) (15).
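As an illustration of how the extinction coefficients quoted above convert absorbance readings into concentrations, the following Python sketch applies the Beer-Lambert law. The 1-cm path length, the function names, and the example absorbance reading are assumptions made for illustration and are not taken from the experiments; only the extinction coefficients and the 800 µl + 200 µl mixing volumes come from the text.

```python
# Beer-Lambert sketch: A = epsilon * c * l, so c = A / (epsilon * l).
# Extinction coefficients (mM^-1 cm^-1) are the values quoted in the text.
EPS_SCFV_280 = 50.5            # scFv protein at 280 nm
EPS_PHENYLPYRUVATE_320 = 17.6  # phenylpyruvate in NaOH at 320 nm
EPS_340 = 6.4                  # absorbance change followed at 340 nm


def concentration_mM(absorbance, epsilon_mM_cm, path_cm=1.0):
    """Return the concentration in mM (path length of 1 cm is an assumption)."""
    if epsilon_mM_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return absorbance / (epsilon_mM_cm * path_cm)


def dilution_factor(sample_ul, added_ul):
    """Factor that corrects a measured concentration for added reagent volume."""
    return (sample_ul + added_ul) / sample_ul


# Hypothetical reading for an 800 µl aliquot mixed with 200 µl of NaOH.
a320 = 0.088
in_cuvette = concentration_mM(a320, EPS_PHENYLPYRUVATE_320)
in_assay = in_cuvette * dilution_factor(800, 200)
print(f"phenylpyruvate in cuvette: {in_cuvette:.4f} mM")
print(f"phenylpyruvate in assay:   {in_assay:.5f} mM")
```

The dilution correction matters here because the NaOH quench increases the sample volume by a quarter before the absorbance is read.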

RESULTS
Construction of the scFv Library and Selection for a Functional Linker - A variety of polypeptide linkers differing widely in length and sequence have been used to construct single-chain antibodies (1,2). For the production of a single-chain version of the catalytic antibody 1F7, we initially adopted the Genex 212 linker (GSTSGSGKSSEGKG), which has been successfully incorporated into several scFvs (16,17). After altering the codons for three consecutive arginines in the heavy chain (H94, H95, and H96) to reflect the codon preferences of E. coli for highly expressed genes (from AGA AGA CGG to CGT CGC CGT), high level bacterial expression of the single-chain gene was achieved using the T7 expression system (12).³ Although the protein produced in these experiments bound hapten, it was unstable and lacked catalytic activity, perhaps because the Genex linker prevents optimal association of the VH and VL domains. Rather than optimize the 212 sequence for 1F7 or switch to another unoptimized linker, we opted to select an active scFv directly from a protein linker library by phage display (4-6).
Examination of the crystal structure of the 1F7 Fab-hapten complex (18) suggested that an extended 18-amino-acid polypeptide would easily span the 30 Å distance between the C terminus of the 117-residue VH domain (Val-H117) and the penultimate N-terminal amino acid of the 108-residue VL domain (Asn-L2). An scFv gene library, in which the 18 codons for the polypeptide linker were randomized, was constructed as described under "Materials and Methods" (Fig. 2b) and expressed as scFv-gIIIp fusion proteins on the surface of phage. As described above, the three arginine codons at positions H94-H96 were altered by PCR to reflect optimal codon usage in E. coli. In addition, a C-to-T transition occurred unexpectedly during the construction of the linker library, resulting in the replacement of Leu-H79 by Ser. This mutation was shown by reversion to the wild-type sequence to confer favorable solubility properties on the scFvs and was retained throughout the selection.
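The geometric argument above can be checked with a rough calculation. The sketch below assumes a rise of about 3.5 Å per residue for a fully extended polypeptide chain, a textbook approximation that is not stated in the text; the function name is hypothetical.

```python
# Rough estimate of the maximum end-to-end span of an extended linker.
RISE_PER_RESIDUE_A = 3.5   # assumed rise per residue in Å (extended chain)
REQUIRED_SPAN_A = 30.0     # VH C terminus to VL N-terminal region, from the text


def max_span_angstrom(n_residues, rise=RISE_PER_RESIDUE_A):
    """Upper bound on the distance an extended n-residue peptide can bridge."""
    return n_residues * rise


span = max_span_angstrom(18)
print(f"18-residue linker, fully extended: about {span:.0f} Å")
print(f"ratio to the required distance:    about {span / REQUIRED_SPAN_A:.1f}")
```

Under this assumption the linker is roughly twice as long as the gap it must bridge, which leaves room for turns and partially compact conformations.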
The starting scFv library contained approximately 5.7 × 10^6 individual members with a full-length scFv gene, as judged by colony hybridization using the VH fragment as a probe. An additional 1.3 × 10^5 clones possessed only the VL gene. DNA sequencing of 28 randomly chosen clones confirmed that the VH and VL genes were correctly fused and that the intervening linker sequence was random. A stop codon was found in 25% of the clones (7 of 28), somewhat lower than the calculated value of 31.5%. The base composition in the randomized positions also deviated from the anticipated levels of incorporation, with 39% more C and 27% less A and G than expected. Although the origin of these discrepancies is unknown, differential coupling efficiency of the individual phosphoramidite solutions used for oligonucleotide synthesis may be at fault. Two truncated linkers were also found. One apparently resulted from internal digestion by MluI in the second cloning step; the other may have arisen from a truncated 1F7VHMlu primer.
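The calculated stop-codon frequency can be reproduced with a short calculation. The sketch below assumes that each of the 18 randomized codons is drawn uniformly from 48 sense-strand codons (any base at the first two positions, and C, G, or T at the third), of which exactly one (TAG) is a stop codon. This degeneracy is an assumption chosen for illustration, since the composition of S is not fully specified here, although it does reproduce the quoted 31.5%. All function names are hypothetical.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_fraction(first="ACGT", second="ACGT", third="CGT"):
    """Fraction of stop codons in a degenerate codon (third position is assumed)."""
    codons = ["".join(bases) for bases in product(first, second, third)]
    return sum(codon in STOP_CODONS for codon in codons) / len(codons)


def p_at_least_one_stop(n_codons, per_codon):
    """Probability that a linker of n independent codons contains a stop codon."""
    return 1.0 - (1.0 - per_codon) ** n_codons


per_codon = stop_fraction()                  # 1 stop codon out of 48
expected = p_at_least_one_stop(18, per_codon)
observed = 7 / 28                            # clones with a stop codon, from the text
print(f"expected: {100 * expected:.1f}%")    # expected: 31.5%
print(f"observed: {100 * observed:.1f}%")    # observed: 25.0%
```

With only 28 clones sequenced, a difference between 25% and 31.5% corresponds to roughly two clones, so the observed value is not far from expectation.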
Phage panning was initiated by incubating 2 × 10^11 phage particles in a microtiter plate coated with a bovine serum albumin conjugate of the transition state analog 1. Bound phage were eluted with free hapten (1.5 mM) and used to reinfect XL1-Blue cells. Phage output was determined by titering a sample of the phage on the same cells. Approximately 1 × 10^12 phage particles were obtained following amplification and phage rescue, and 2 × 10^11 phage were used for the next cycle of panning. After five rounds of selection, the phage output/input ratio increased roughly 30-fold, from 4.8 × 10^-7 to 1.5 × 10^-5. Soluble single-chain protein was produced from the total phagemid pool after excision of the gIII fragment from the fusion genes. The selected and pooled scFv proteins exhibited significant hapten binding activity by ELISA, whereas protein isolated from the starting library gave no detectable signal. Nevertheless, no bias for specific codons was observed at any position in the selected linkers when the scFv gene pool was sequenced.
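The enrichment quoted above follows directly from the output/input ratios. The sketch below recomputes it and also estimates how many copies of each library member were present in the first round of panning, using the library size of 5.7 × 10^6 reported earlier. The function names are hypothetical, and the copy number assumes that all clones are equally represented in the phage preparation.

```python
def fold_enrichment(ratio_first_round, ratio_last_round):
    """Increase in the phage output/input ratio between two rounds of panning."""
    return ratio_last_round / ratio_first_round


def copies_per_clone(phage_input, library_size):
    """Average number of phage particles per distinct clone in the input."""
    return phage_input / library_size


enrichment = fold_enrichment(4.8e-7, 1.5e-5)
coverage = copies_per_clone(2e11, 5.7e6)
print(f"fold enrichment over five rounds: {enrichment:.1f}")
print(f"copies of each clone in round 1:  {coverage:.0f}")
```

An input of tens of thousands of copies per clone means that rare functional linkers are unlikely to be lost by sampling before the first selection step.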
Screening for Efficiently Expressed scFvs - A panel of 1054 individual clones was assayed for the production of soluble scFv capable of binding hapten 1. A total of 93 clones gave an ELISA signal at least 2-fold higher than background; the best 22 were confirmed to have elevated levels of hapten binding activity by semiquantitative ELISA (using ≈50 µg of total cellular protein) and were sequenced. The deduced amino acid sequences of the linkers are shown in Table I. The clone which gave the highest hapten binding activity (sc1F7/L2) was found 7 times; the other clones are unique. Alignment of the linker sequences (Table I) reveals a strong bias for a proline residue in the second position after the VH C terminus (present in 12 of the 16 unique clones); Arg and Asn, preceded or followed by a proline, are the only alternatives observed at this position. Comparison of these sequences with those obtained prior to panning also shows that negatively charged residues (Asp and Glu) are disfavored in the selected linkers, whereas Pro and positively charged Arg are more abundant (Fig. 3). Thus, only two linkers have a negatively charged amino acid (one Asp each), but many clones, including sc1F7/L2, have multiple prolines and arginines. Aside from these minimal shared features, no other consensus properties are evident in the sequences. Apparently, a diverse set of polypeptides can link the variable domains of 1F7 to yield hapten binding single-chain molecules.
³ Y. Tang, unpublished data.
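The screening statistics above can be cross-checked in a few lines. The counts are taken from the text; the function and key names are hypothetical.

```python
def summarize_screen(screened, positives, sequenced, copies_of_top_clone,
                     unique_with_proline):
    """Derive hit rate, number of unique clones, and proline conservation."""
    # Repeated isolates of the top clone count once among the unique clones.
    unique = sequenced - copies_of_top_clone + 1
    return {
        "hit_rate": positives / screened,
        "unique_clones": unique,
        "proline_fraction": unique_with_proline / unique,
    }


summary = summarize_screen(screened=1054, positives=93, sequenced=22,
                           copies_of_top_clone=7, unique_with_proline=12)
print(f"ELISA-positive clones: {100 * summary['hit_rate']:.1f}%")
print(f"unique linker sequences: {summary['unique_clones']}")
print(f"proline at second position: {100 * summary['proline_fraction']:.0f}%")
```

The 22 sequenced clones collapse to 16 unique linkers because sc1F7/L2 was isolated seven times, which matches the denominator used for the proline count.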
The effect of individual linker sequences on protein production and hapten affinity was investigated by Western blotting and quantitative ELISA for nine of the selected scFvs (Fig. 4). Similar levels of total scFv gene expression were observed for each of the clones (Fig. 4c), but the production of soluble scFv varied dramatically (Fig. 4b). Furthermore, efficient production of soluble protein did not correlate with antigen binding activity. Thus, clones sc1F7/L2 and sc1F7/L6 have considerably more hapten binding activity than sc1F7/L8 (Fig. 4a, lanes 2, 6, and 8), despite comparable yields of soluble scFv in all three cases (Fig. 4b, lanes 2, 6, and 8). Similarly, sc1F7/L7, which yields more soluble protein than any of the other clones (Fig. 4b, lane 7), gives one of the weakest ELISA signals (Fig. 4a, lane 7). Although many linkers can be used to produce hapten binding scFvs, specific sequences are apparently needed to achieve an optimal balance of efficient expression as a soluble protein and high affinity binding.
Characterization of sc1F7/L2 - Clone sc1F7/L2 was selected for more extensive investigation on the basis of its high hapten binding activity in preliminary assays. Preparative quantities of the single-chain protein were obtained by expression of the corresponding gene, modified to incorporate a C-terminal His tag, with the Novagen T7 system (12) (Fig. 2c). Efficient production of the soluble scFv required optimization of the growth conditions. Very rich media, such as 2xYT or CS, proved superior to LB or YT medium. In addition, induction of expression at high cell density (A600 of 6 or higher), low pH (5.5-5.8), and low temperature (23°C) gave the best results. Under these conditions, ≈1 mg of soluble scFv per liter of culture could be obtained after purification. Higher overall expression levels were achieved at higher pH and temperature (i.e. ≈50 mg/l at 37°C and pH 7), but most of the product was found in inclusion bodies with the signal peptide still attached (data not shown).
The soluble sc1F7/L2 protein was purified to homogeneity in two steps (affinity chromatography on a nickel column followed by FPLC cation exchange chromatography on MonoS) and shown to catalyze the rearrangement of chorismate to prephenate (Fig. 5). Although E. coli possesses two bifunctional chorismate mutases, contamination by host enzymes was excluded by the absence of detectable prephenate dehydrogenase (15) or prephenate dehydratase (14) activity in the purified scFv samples. Moreover, the values of the steady-state kinetic parameters for the scFv compare favorably with those determined for the parent monoclonal antibody under comparable conditions (Table II). The kcat value is reduced only by a factor of 2.7 and corresponds to a rate acceleration of approximately 100-fold over the uncatalyzed reaction. Affinities for chorismate and the free transition state analog 1 are affected to a greater extent, as judged by Km and (Kd)app values that are, respectively, 5.3- and 8.5-fold larger for the scFv than for the corresponding Fab fragment, leading to a 15-fold reduction in kcat/Km. These differences presumably reflect minor adjustments in the relative orientation of the VH and VL domains imposed by the linker peptide.

DISCUSSION
Phage display has emerged in recent years as a powerful tool for protein engineering (19). Nowhere is this more evident than in the successful isolation of immunoglobulin-based receptors from large combinatorial libraries displayed on filamentous phage and in the modification of the affinity and selectivity of these molecules through multiple rounds of mutagenesis and selection (5,6,20). As shown in the current study, phage display can also greatly facilitate the identification of tailored linkers for single-chain antibodies. Even the largest scFv phage libraries can contain but a tiny fraction of the 2.6 × 10^23 possible 18-amino-acid linkers.
Nevertheless, our experiments show that rather small libraries (≈5 × 10^6 clones) are sufficient for the identification of single-chain proteins that are functional with respect to both hapten recognition and catalysis. These results are consistent with those of Stemmer and co-workers (3), who found that approximately 0.2% of the members of an scFv library derived from a metal chelate binding antibody and containing a 15-amino-acid randomized linker segment were active. There are apparently many viable solutions to the problem of linking individual VH and VL domains.
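To put the size of the sequence space in perspective, the sketch below counts the possible 18-residue linkers built from the 20 standard amino acids and the largest fraction of them that a library of about 5 × 10^6 clones could sample. The variable names are hypothetical, and the estimate ignores codon bias, stop codons, and duplicate clones.

```python
AMINO_ACIDS = 20
LINKER_LENGTH = 18
LIBRARY_SIZE = 5_000_000   # approximate number of clones, from the text

# Python integers are exact, so the full count can be computed directly.
n_sequences = AMINO_ACIDS ** LINKER_LENGTH
max_fraction_sampled = LIBRARY_SIZE / n_sequences

print(f"possible linkers:       {n_sequences:.1e}")   # 2.6e+23
print(f"fraction sampled (max): {max_fraction_sampled:.1e}")
```

That functional linkers were nevertheless recovered from such a sparse sample is consistent with the conclusion that many different sequences can serve as tethers.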
The principal advantage of the phage display approach is the rapid identification and amplification of functional gene variants that phenotypic selection makes possible. In our experiments this is illustrated by the increased hapten binding activity associated with the scFv pool obtained after five rounds of selection as compared with the starting library. Nevertheless, the diversity of the final pool is still large. The failure of a single clone to dominate the population of selected scFvs may reflect an incomplete course of selection, but a more likely explanation, as mentioned above, is that successful linkers have relatively few sequence requirements. The selection protocol used in these experiments is probably insufficiently stringent to differentiate between receptors with roughly comparable hapten affinities.
Although diverse linker sequences yield scFvs that can be selected on the basis of hapten affinity, the resulting molecules are not equally suited for large scale applications. Genes expressed at low levels may provide sufficient fusion protein for display on the surface of phage, allowing passage to subsequent rounds of selection, but they may not be the best candidates for high level expression or may not encode proteins possessing optimal combinations of hapten affinity, catalytic activity, and stability. Screening for scFvs that possess high binding activity and can be produced in high yield is therefore an important additional step in the approach we describe. We found that only ≈9% of the clones obtained after the final round of selection secreted sufficient soluble scFv with enough hapten binding activity to afford an ELISA signal 2-fold over background. In detailed studies of representative clones producing the highest ELISA signals, large variations in soluble protein production and hapten binding were observed (Fig. 4). Perhaps fortuitously, the most active variant (sc1F7/L2) was produced in good yield; multiple copies of this clone were also found in the library, indicating that it was a particularly successful competitor in the initial selection step.
Sequence analysis of 22 of the best binders (Table I) reveals a highly conserved proline in the second (or an adjacent) position of the linkers. This proline is likely to have functional significance, making possible a tight turn that orients the linker segment into the groove separating VH and VL. Aside from a modest enrichment in prolines and arginines at other positions, the selected polypeptides share no other common features. The abundance of positively charged arginines suggests that hydrophilicity of the linker may be important, while proline residues will disfavor formation of regular secondary structure and lower the susceptibility of the scFv to proteolysis. Given this, the linker segments are likely to be rather flexible and at least partially disordered, consistent with structural studies of other single-chain antibodies (21-23), although specific interactions between individual linkers and the Fv domain presumably dictate the observed variation in hapten affinity (and catalytic activity).
In summary, phage display, coupled with a high throughput screen for efficient expression, has yielded a single-chain variant of the catalytic antibody 1F7 which is suitable for structural and mechanistic studies. Conceivably, even better linkers can be found by searching larger starting libraries or by improving the first-generation scFv obtained here through additional mutagenesis and selection. This strategy obviates the need for the heuristics and structural information that usually guide site-directed mutagenesis experiments. In principle, therefore, the same approach that yielded sc1F7/L2 should be easily extendable to other proteins, affording a general methodology for identifying suitable peptide sequences for covalently tethering any two protein domains together in the absence of structural data.
a Values of (Kd)app for sc1F7/L2 and 1F7 Fab were determined by competition ELISA (13) and defined as the concentration of free hapten required to inhibit maximal binding by 50%.