Engineering Anti-vascular Endothelial Growth Factor Single Chain Disulfide-stabilized Antibody Variable Fragments (sc-dsFv) with Phage-displayed sc-dsFv Libraries

Phage display of antibody fragments from natural or synthetic antibody libraries, with single chain constructs combining the variable fragments (scFv), has been one of the most prominent technologies in antibody engineering. However, the nature of the artificial single chain constructs results in unstable proteins expressed on the phage surface or as soluble proteins secreted in the bacterial culture medium. The stability of the variable domain structures can be enhanced with an interdomain disulfide bond, but the single chain disulfide-stabilized constructs (sc-dsFv) have yet to be established as a feasible format for bacterial phage display because of diminished expression levels on the phage surface in known phage display systems. In this work, biological combinatorial searches were used to establish that the c-region of the signal sequence is critically responsible for effective expression and functional folding of the sc-dsFv on the phage surface. The optimum signal sequences increase the expression of functional sc-dsFv by 2 orders of magnitude compared with wild-type signal sequences, enabling the construction of phage-displayed synthetic anti-vascular endothelial growth factor sc-dsFv libraries. Comparison of the scFv and sc-dsFv variants selected from the phage-displayed libraries for vascular endothelial growth factor binding revealed the sequence preference differences resulting from the interdomain disulfide bond. These results underlie a new phage display format for antibody fragments with all the benefits of the scFv format but without the downside due to the instability of the dimeric interface in scFv.

Single chain variable fragment (scFv) displayed as a fusion protein amino-terminal to the pIII minor capsid protein on the filamentous phage surface is one of the most prominent methods in antibody engineering. The small size of the scFv construct enables superior tissue-penetrating capabilities over the whole IgG or Fab fragment (1), making scFv an ideal scaffold for designing tumor-homing molecules carrying therapeutic or imaging agents (for reviews see Refs. 2-8). scFv is a single polypeptide chain antibody fragment construct encoding a light chain variable domain and a heavy chain variable domain, with a flexible linkage peptide connecting the two domains (9-11). The recombinant antibody fragment frequently retains antigen-recognizing capability rivaling that of the parent antibody. Moreover, an scFv library, which could contain more than one billion scFv variants, can be propagated with an Escherichia coli vector of bacterial phage origin (12,13); the recombinant phages displaying the scFv variants can be selected or screened for antigen binding and re-amplified with the host E. coli. The sequences of the selected scFv can be further diversified with various mutagenesis technologies for affinity maturation (14,15). As such, the scFv scaffold supports a powerful in vitro technology mimicking the production of high affinity antibodies in mammalian immune responses to foreign antigens, and the recombinant scFv could play key roles as protein therapeutics and diagnostics.
One shortcoming of the scFv scaffold that frequently hampers the applications of this powerful technology is the aggregation tendency of the scFv molecules under physiological and storage conditions (for reviews see Refs. 16-19). The aggregation mechanism has much to do with the stability of the two variable domains and the dimeric interface (20); any nonoptimal interactions within or between the two variable domains could result in an unstable dimeric interface and dissociation of the two variable domains in the scFv. The variable domain dissociation has two molecular consequences. First, the variable domains are less stable in isolation and could unfold. Second, the transiently dissociated variable domains could re-associate with supplementary domains from other scFv polypeptide chains, forming higher order complexes composed of more than one polypeptide chain (16). Both situations lead to insoluble aggregates or meta-stable structures. The instability of the scFv structure also compromises the fidelity in reproducing the antibody gene products on the phage surface. This could bias selection in favor of more stable scFv molecules over less stable ones, toward misfolded structures on the phage surface that nevertheless bind to the antigen (21), and toward false positives arising from multimeric scFv conformations that are unique to phage display systems. This structural instability has thus reduced the utility of scFv, leaving the outcomes of the selected and screened scFv molecules uncertain in terms of their potential applications in biomedicine.
Although the misfolding tendency of scFv has been potentially problematic, the uncertainties arising from the intrinsic stability properties of the scFv in phage display remain uncharacterized. A more robust scFv phage display system without the uncertainties due to scFv misfolding would provide an alternative circumventing the downside of the conventional scFv phage display systems. One way to stabilize the scFv scaffold is to engineer a disulfide bond between the two Fv domains, so that the variable domains are covalently linked. The single chain disulfide-stabilized Fv fragment (sc-dsFv) format has been constructed in a single polypeptide chain, as in scFv, with a disulfide bond linking the two variable domains in the framework region (22-24). The advantages of the sc-dsFv molecules have been demonstrated with the sc-dsFv expressed in E. coli (22-24). However, the single chain disulfide-stabilized Fv fragments have not been expressed on the phage surface or as a soluble form secreted by E. coli into the culture medium, mostly because of the severely decreased yield caused by the introduction of the interface cysteines (19,22,24), and phage-displayed sc-dsFv libraries and their applications have not been established.
In this work, an sc-dsFv phage display platform has been developed. The key discovery is that the signal sequence variants of the sc-dsFv-pIII fusion protein have varying effects in directing the sc-dsFv expression on the recombinant phage surface. The signal sequence has been known to be responsible for the Sec system-dependent translocation of the pIII fusion protein from the translation site in the cytoplasm to the periplasm membrane (25-31), a critical process for the integration of the displayed protein on the recombinant phage surface (32). But the optimal signal sequences for the translocation of the pIII fusion protein were not known. We used biological combinatorial strategies to diversify the signal sequence with synthetic phage display libraries. The variants in the phage libraries were selected and screened for high expression capabilities (31). The results indicated that the sequence of the signal peptidase cleavage site (c-region) in the signal peptide amino-terminal to the pIII fusion protein is more critically responsible for the expression of the phage-displayed fusion protein. This finding enabled success in increasing the expression of functional anti-VEGF sc-dsFv on the M13 phage surface by up to 2 orders of magnitude compared with the expression efficiency due to the wild-type signal sequence. The interface disulfide bond of the phage-displayed sc-dsFv was largely formed, and the binding affinity against VEGF was substantially enhanced. More importantly, synthetic sc-dsFv libraries were expressed successfully on the M13 phage surface with one of the optimum signal sequences, and a large number of the binding sc-dsFv variants against VEGF were compared with the binding scFv variants to assess the effects of the interdomain disulfide bond on the binding sequence preferences of the scFv or the sc-dsFv molecules.
This work demonstrates a new antibody display platform with all the benefits from the scFv format but without the drawbacks due to the instability of the scFv structure.

EXPERIMENTAL PROCEDURES
DNA Construct of the Anti-VEGF scFv-The sequence of the template scFv was derived from G6 anti-VEGF Fab (Protein Data Bank code 2FJG (33)). The construct starts from the amino terminus with a hexa-His tag (His)6GH, followed by the Vκ1 light chain variable domain, and then the VH3 heavy chain variable domain. The two variable domains are connected with a linker polypeptide, (G)4SIEGRS(G)4S (for the anti-VEGF scFv(fXa+) construct) or (G)4S(G)4S(G)4S (for the anti-VEGF scFv(fXa−) construct), where IEGR is the factor Xa (fXa) cutting site. A knowledge-based computer program is available for scanning a protein sequence for tentative fXa cleavage sites (34); any possible cleavage sites other than the engineered site could interfere with the fXa-based assay and should be avoided in designing the phage-displayed protein sequence. The sequence of the template scFv is shown in supplemental Fig. 1. The software DNAWorks (helixweb.nih.gov) (35) was used to optimize the DNA sequence for expression in E. coli and to design a total of 30 overlapping DNA fragments (45 bases each; Integrated DNA Technologies) encoding the sequence of the template scFv. These oligonucleotide fragments were synthesized and assembled with PCR in three stages. Because of the large number of DNA fragments, assembly succeeded only when the fragments were first joined separately in three groups (a01-a10, a11-a20, and a21-a30, see supplemental Fig. 1B) in the first stage. Each 50-µl PCR contained 5 µl of 10× PCR buffer, 2 µl of 25 mM MgSO4, 1 µl of 50× dNTP mix, 1 µl of 50× high fidelity KOD hot start polymerase mix (Novagen), and 0.5 µl of each of the 10 µM DNA fragments in water. After hot start for 5 min at 95°C, 35 PCR cycles (94°C for 30 s, 57°C for 30 s, and 68°C for 25 s) were carried out for each of the three reactions. In the second stage, 5 µl of each of the PCR product mixtures was mixed with the respective end fragments (2 µl, 100 µM) in a 50-µl PCR mixture as described above.
The same PCR protocol was applied to each of the three reaction mixtures. 10 µl from each of the reaction mixtures were pooled, and the DNA products were extracted with the gel extraction kit from Qiagen. In the third stage, the extracted DNA products were mixed together with the end fragments, i.e. the 1st and the 30th DNA fragments (0.5 µl, 10 µM), in a 50-µl PCR mixture, followed by the PCR procedure as described before. The PCR product was purified from excised gel slabs with the gel extraction kit from Qiagen. The agarose gel electrophoresis of the DNA products from the three PCR stages is shown in supplemental Fig. 2. The sequence of the DNA product was confirmed by DNA sequencing.
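As a worked check of the first-stage reaction volumes quoted above, the following minimal Python sketch tallies the components of one 50-µl assembly PCR and derives the final concentration of each oligonucleotide fragment. It assumes that water makes up the remainder of the reaction volume, which the protocol does not state explicitly; all variable and function names are illustrative.

```python
# Tally of one first-stage assembly PCR (one group of 10 fragments).
# Assumption: the volume not accounted for by the listed components is water.

TOTAL_UL = 50.0            # reaction volume (µl)
N_FRAGMENTS = 10           # fragments per group, e.g. a01-a10
FRAGMENT_UL = 0.5          # volume added per fragment (µl)
FRAGMENT_STOCK_UM = 10.0   # stock concentration of each fragment (µM)

components_ul = {
    "10x PCR buffer": 5.0,
    "25 mM MgSO4": 2.0,
    "50x dNTP mix": 1.0,
    "50x KOD hot start polymerase mix": 1.0,
    "DNA fragments": N_FRAGMENTS * FRAGMENT_UL,
}


def water_volume(components, total=TOTAL_UL):
    """Volume of water needed to bring the mix to the total volume."""
    return total - sum(components.values())


def final_concentration(stock, volume, total=TOTAL_UL):
    """Final concentration after diluting `volume` of `stock` into `total`."""
    return stock * volume / total


water_ul = water_volume(components_ul)
fragment_final_um = final_concentration(FRAGMENT_STOCK_UM, FRAGMENT_UL)
mgso4_final_mm = final_concentration(25.0, components_ul["25 mM MgSO4"])

print(f"water: {water_ul} µl")
print(f"each fragment: {fragment_final_um} µM final")
print(f"MgSO4: {mgso4_final_mm} mM final")
```

Under these assumptions each fragment is present at 0.1 µM and the buffer, dNTP, and polymerase stocks are each diluted to 1×.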
Construct of the scFv Phagemid-The purified PCR product was digested with the restriction enzymes NotI and SfiI (New England Biolabs). The digested DNA (788 bases) was purified with agarose gel electrophoresis and the gel extraction kit from Qiagen. The phagemid pCANTAB5E (GE Healthcare), digested with the same restriction enzymes NotI and SfiI, was purified with agarose gel electrophoresis and incubated with calf intestinal alkaline phosphatase (New England Biolabs) (1.5 units/µg of DNA) in buffer at 37°C for 1 h. The 5′-dephosphorylated linear vector was extracted with phenol/chloroform/isoamyl alcohol (25:24:1), precipitated with 100% ethanol, washed with 70% ethanol, and dissolved in pure water. 2 µg of the purified linear vector was mixed with the digested PCR product (vector/insert = 1:2 to 1:5 in molar ratio) in the presence of T4 DNA ligase (New England Biolabs) in buffer overnight at 16°C. The ligation reaction product was examined with agarose gel electrophoresis, as shown in supplemental Fig. 3. The ligation reaction product was purified with phenol/chloroform/isoamyl alcohol extraction and ethanol precipitation. The construct of the pCANTAB5E phagemid encoded the scFv, followed by an E-tag sequence and a TAG amber stop codon amino-terminal to the gene III sequence. The E-tag is a marker for the expression of the scFv. The DNA product, dissolved in pure water, was electroporated into the electrocompetent E. coli strain ER2738 (New England Biolabs). The transformed E. coli was plated on an LB-ampicillin agar plate overnight. Single colonies were picked, and the insert in the phagemid was confirmed by DNA sequencing. Recombinant phage particles were rescued from the E. coli culture by super-infecting the E. coli with the helper phage M13KO7 (GE Healthcare), precipitated with polyethylene glycol/NaCl, and resuspended in PBS.
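The vector/insert molar ratio quoted above translates into an insert mass once the lengths of the two DNA species are known. The sketch below illustrates the arithmetic for 2 µg of vector and the 788-base insert. The length of the digested pCANTAB5E vector is not given in the text, so the value used here (4500 bp) is a hypothetical placeholder that should be replaced with the actual length; the function name is likewise illustrative.

```python
# Insert mass required for a given insert:vector molar ratio.
# For double-stranded DNA, moles are proportional to mass / length.

VECTOR_UG = 2.0            # mass of linear vector in the ligation (µg)
INSERT_BP = 788            # length of the digested insert
ASSUMED_VECTOR_BP = 4500   # HYPOTHETICAL vector length; not stated in the text


def insert_mass_ug(vector_ug, vector_bp, insert_bp, insert_per_vector):
    """Mass of insert (µg) giving `insert_per_vector` mol insert per mol vector."""
    return vector_ug * (insert_bp / vector_bp) * insert_per_vector


for ratio in (2, 5):  # vector/insert = 1:2 and 1:5
    mass = insert_mass_ug(VECTOR_UG, ASSUMED_VECTOR_BP, INSERT_BP, ratio)
    print(f"vector/insert = 1:{ratio} -> {mass:.2f} µg insert")
```

With the placeholder vector length, the stated ratio range corresponds to roughly 0.7 to 1.75 µg of insert; the result scales inversely with the true vector length.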
VEGF Expression and Purification-Human VEGF-121 (VEGF-A residues 34-135 receptor binding domain) (33) was expressed in E. coli as inclusion bodies; refolding and purification followed a recently published procedure (36).
Competitive Phage ELISA for Affinity Measurement of the Phage-displayed Anti-VEGF scFv-Multiple competition curves with various concentrations of immobilized and free VEGF competing for anti-VEGF scFv binding were determined to estimate the EC50 values (the concentration of free VEGF resulting in 50% reduction in binding of the phage-displayed scFv to immobilized VEGF) at the lowest immobilized VEGF concentration (37). Wells in the first column of a Maxisorb (Nunc) 96-well plate were coated with human VEGF-A, 2 µg/100 µl. The wells of the nth column in the plate were coated with (0.25^(n−1)) × 2 µg/100 µl VEGF. After overnight coating at room temperature, the plate was blocked with 5% skim milk in PBST buffer (phosphate-buffered saline with 0.2% Tween 20) for 3 h. The plate was washed three times with PBST and two times with PBS. Recombinant phage displaying the anti-VEGF scFv (10^9 cfu/50 µl) was mixed with free VEGF (10 µg/50 µl in PBS) and 5% skim milk (50 µl in PBST) in the wells of the first row of the plate; the wells of the mth row contained the same recombinant phage and skim milk solution, but the VEGF was diluted 10-fold in series, i.e. (0.1^(m−1)) × 10 µg/50 µl for wells in the mth row. The plate was allowed to equilibrate for 7 h at room temperature, washed with 3× PBST and 2× PBS, incubated with HRP-conjugated anti-M13 antibody (GE Healthcare) for 1 h, washed again with 3× PBST and 2× PBS, and then developed with 2,2′-azino-bis(3-ethylbenzthiazoline-6-sulfonic acid) (Kirkegaard & Perry Laboratories) for 30 min before adding 1% SDS to stop the reaction. Absorbance at 405 nm was measured, and the data were used for EC50 calculations.
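The two dilution series described above define a checkerboard over the plate: the coated VEGF amount varies by column and the free competitor amount varies by row. The following sketch generates that layout from the two formulas in the text. It assumes that all 12 columns and all 8 rows of a standard 96-well plate are used and that row A corresponds to m = 1; the text does not specify how many columns and rows were actually filled.

```python
# Checkerboard layout for the competitive phage ELISA.
# Column n: coated VEGF = 0.25**(n-1) * 2 µg per 100 µl.
# Row m:    free VEGF   = 0.1**(m-1) * 10 µg per 50 µl.
# Assumption: a full 8 x 12 plate is used, with row A as m = 1.

ROWS = "ABCDEFGH"
N_COLS = 12


def coated_vegf_ug(col):
    """Immobilized VEGF (µg per 100 µl) in 1-based column `col`."""
    return 2.0 * 0.25 ** (col - 1)


def free_vegf_ug(row):
    """Free competitor VEGF (µg per 50 µl) in 1-based row `row`."""
    return 10.0 * 0.1 ** (row - 1)


# Map well name -> (coated VEGF, free VEGF)
layout = {
    f"{ROWS[m - 1]}{n}": (coated_vegf_ug(n), free_vegf_ug(m))
    for m in range(1, len(ROWS) + 1)
    for n in range(1, N_COLS + 1)
}

for well in ("A1", "A2", "B1", "H12"):
    coated, free = layout[well]
    print(f"{well}: coated {coated:.3g} µg, free {free:.3g} µg")
```

Each column then yields one competition curve (signal versus free VEGF down the rows) at a fixed coating density, which is the family of curves from which the EC50 at the lowest coating density is read.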
Construction of Phagemids Encoding Interdomain sc-dsFv for Phage Display-Primers encoding the altered residues at light chain positions for L:Q38C, L:G41C, L:A43C, L:F98C, and L:Q100C mutations and heavy chain positions for H:Q39C, H:G42C, H:G44C, H:L45C, and H:Q112C mutations were synthesized (Integrated DNA Technologies). The amino acid numbers followed the numbering published in the Protein Data Bank code 2FJG (33). The primers were designed with the annealing temperature in the range 60-65°C. Pairs of cysteines were introduced in one step to the anti-VEGF scFv sequence in the pCANTAB5E phagemid (see above for the construction of the template phagemid) following the Kunkel protocol (38).
Phage Display Libraries with Randomized Sequences in the GeneIII Signal Peptide Region or in the CDR Region-The DNA construct of the gene III signal peptide in the pCANTAB5E is shown in Fig. 3. Four libraries spanning the signal peptide region were designed as shown in the figure to identify the critical regions of the signal sequence. The primer sequences for the library constructions are also shown in Fig. 3. For each of the sequence regions to be randomized in amino acid sequence, TAA stop codons were first inserted in the designated place as shown in Fig. 3. The TAA-containing phagemids were used as the parents for the following library construction with the primers containing the degenerate codons NNK (Fig. 3). The TAA stop codons were designed to ensure that only the phagemids carrying degenerate codons would produce pIII fusion protein for phage surface display. Both the TAA-containing phagemids and the phagemid libraries for phage display were constructed with the oligonucleotide-directed mutagenesis initially proposed by Kunkel (38). In this work, we followed the Sidhu and Weiss protocol (39). E. coli strain ER2738 was transformed with the phagemid libraries, and the recombinant phage particles were rescued with helper phage M13KO7, precipitated with polyethylene glycol/NaCl, and resuspended in PBS. More details of the phage library preparation can be found in a previous publication (40). The construction of the phage display libraries in the CDR regions followed the same procedure.
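The design above relies on two properties of the standard genetic code: the NNK degenerate codon (N = A/C/G/T, K = G/T) encodes all 20 amino acids, and the parental TAA stop codon cannot be produced by NNK, so unmutated parental phagemids remain non-displaying. The short sketch below enumerates the NNK codons to make both points explicit; it uses only the standard codon table and is an illustration, not part of the published protocol.

```python
from itertools import product

# Standard genetic code, with codons ordered by base in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# NNK: any base at positions 1 and 2, G or T at position 3.
nnk_codons = ["".join(c) for c in product("ACGT", "ACGT", "GT")]

stop_codons = [c for c in nnk_codons if CODON_TABLE[c] == "*"]
amino_acids = sorted({CODON_TABLE[c] for c in nnk_codons} - {"*"})

print(f"NNK codons: {len(nnk_codons)}")
print(f"amino acids encoded: {len(amino_acids)} ({''.join(amino_acids)})")
print(f"stop codons within NNK: {stop_codons}")
print(f"TAA reachable by NNK: {'TAA' in nnk_codons}")
```

The only stop codon within NNK is the amber codon TAG, which is partially read through in an amber-suppressor host; TAA and TGA are excluded because their third base is A.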
Selection for Phage-displayed Anti-VEGF scFv-Wells in a 96-well Maxisorb microtiter plate were coated with VEGF (1 µg/100 µl PBS per well) overnight at room temperature. The wells were then blocked with 5% skim milk in PBST for 1 h and then washed as described before. 100 µl of resuspended phage library (10^13 cfu/ml) was added to each well for 1 h under gentle shaking. The plate was washed with 16 × 300 µl of PBST and 2 × 300 µl of PBS. The vigorous wash steps were intended to remove phages bound nonspecifically to the well surface. The bound phages were eluted with 100 µl of 0.1 M HCl/glycine (pH 2.2) per well, followed by neutralization with 20 µl of 2 M Tris-base buffer (pH 9.1). The eluted phages were mixed with 1 ml of E. coli strain ER2738 (A600 nm = 0.6) for 15 min at 37°C. Infected E. coli was titered, and the recombinant phage particles were rescued and amplified (in 9 ml of 2× YT, ampicillin 100 µg/ml; after 1 h, add helper phage (10^10); wait for another hour before adding 50 ml of medium containing kanamycin 50 µg/ml and ampicillin 100 µg/ml) overnight at 37°C with vigorous shaking. The phage in the supernatant of the culture was titered, precipitated with polyethylene glycol/NaCl, and resuspended in PBS. The phage solution was ready for the next round of selection.

Phage Display of Disulfide-stabilized scFv
Phage-displayed scFv-VEGF Binding Analyses with ELISA-Single E. coli colonies harboring the selected phagemid were picked at random with a GENETIX Qpix II colony picker into a 96-well deep well culture plate. Each well contained 1 ml of 2× YT, 100 µg of ampicillin, 20 µg of tetracycline, and helper phage M13KO7 (1 × 10^9 cfu). The culture plates were incubated at 37°C and shaken vigorously overnight. After centrifuging at 3000 × g for 30 min at 4°C, 50 µl of supernatant was mixed with 50 µl of 5% skim milk in a corresponding well of a 96-well Maxisorb microtiter plate, for which the wells were coated with 1 µg/100 µl VEGF and blocked with 5% skim milk. The ELISA developing procedure was identical to that described above under "Competitive Phage ELISA for Affinity Measurement of the Phage-displayed Anti-VEGF scFv," except that the 3,3′,5,5′-tetramethylbenzidine substrate (Kirkegaard & Perry Laboratories) was used, and absorbance at 450 nm was measured.
Phage-displayed scFv-VEGF Binding Analyses with Phage Array on Nitrocellulose Membrane-A 13 × 10-cm nitrocellulose membrane (PROTRAN) was coated with VEGF (100 µg/30 ml in PBS) overnight at room temperature. The membrane was blocked with 5% skim milk for 20 min and then laid on a PBS-soaked filter paper over the platform in the GENETIX Qpix II. The phage solutions in the 96-well culture plate were arrayed on the membrane with a 96-pin gridding tool (FP3S100 from V & P Scientific, Inc; each pin holds 100 nl of phage solution) installed on the gridding robot in the GENETIX Qpix II. After arraying the phage solutions, the membrane was immersed in 5% skim milk in PBS buffer for 20 min before washing with PBST. Phage remaining on the membrane was quantified by immersing the membrane in anti-M13 antibody conjugated with HRP (1:3000 in 5% skim milk PBST) for 30 min and then washing with PBST and PBS. The membrane was allowed to react with Western Lightning Chemiluminescence Reagent Plus (PerkinElmer Life Sciences) following the protocol suggested by the manufacturer. The chemiluminescence signal was recorded with a LAS-3000 imaging system (FujiFilm).

Construction and Confirmation of Phage-displayed Anti-VEGF scFv-Phage particles displaying anti-VEGF scFv(fXa+) derived from the standard pCANTAB5E phagemid were tested for VEGF binding with competitive phage ELISA. EC50 values (the concentration of free VEGF resulting in 50% reduction in binding of the phage-displayed scFv to immobilized VEGF) derived from the series of competitive ELISA curves in Fig. 1 indicated that the Kd (dissociation constant) value of anti-VEGF scFv against VEGF is on the order of 10-100 nM. This value was derived because EC50 approaches Kd when both the phage-displayed anti-VEGF scFv concentration and the immobilized VEGF concentration approach zero (37). But the signal-to-noise ratio decreased as the immobilized VEGF concentration decreased (Fig. 1), rendering large uncertainty in measuring EC50. Hence, the EC50 values measured in Fig. 1 can only be regarded as a crude estimation of the Kd for the anti-VEGF scFv(fXa+) against VEGF. Nevertheless, the competitive phage ELISA against VEGF, along with the negative control experiment (null control phage without displayed scFv did not show any binding to VEGF, as shown in Fig. 2), confirmed the expression and binding of the phage-displayed anti-VEGF scFv(fXa+) against VEGF with reasonable binding affinity for the following experiments.
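To illustrate how an EC50 can be read from a single competition curve, the sketch below interpolates, on a log-concentration scale, the free VEGF concentration at which the signal falls to half of the uninhibited signal. This is an illustrative simplification and not the procedure of Ref. 37: the interpolation scheme, the function name, and the data points are all assumptions, and the data are synthetic (generated from an ideal curve with EC50 = 100 in arbitrary concentration units), not measurements from this work.

```python
import math


def ec50_by_interpolation(conc, signal, uninhibited):
    """Estimate EC50 by log-linear interpolation.

    conc        -- competitor concentrations in ascending order (all > 0)
    signal      -- ELISA signals at those concentrations
    uninhibited -- signal in the absence of competitor
    Returns the interpolated concentration at half the uninhibited signal,
    or None if the curve never crosses that level.
    """
    half = 0.5 * uninhibited
    points = list(zip(conc, signal))
    for (c_lo, s_lo), (c_hi, s_hi) in zip(points, points[1:]):
        if s_lo >= half >= s_hi and s_lo != s_hi:
            frac = (s_lo - half) / (s_lo - s_hi)
            log_ec50 = math.log10(c_lo) + frac * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10 ** log_ec50
    return None


# Synthetic, idealized competition curve: signal = 1 / (1 + c / 100).
concentrations = [1.0, 10.0, 100.0, 1000.0, 10000.0]
signals = [1.0 / (1.0 + c / 100.0) for c in concentrations]

estimate = ec50_by_interpolation(concentrations, signals, uninhibited=1.0)
print(f"interpolated EC50: {estimate:.1f}")
```

With noisy data at low coating density, the bracketing points become unreliable, which is one way to picture why the estimate degrades as the signal-to-noise ratio falls.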
Construction of Phage-displayed Anti-VEGF sc-dsFv-Seven sc-dsFv variants were constructed on the basis of the phagemid encoding the template anti-VEGF scFv(fXa+) as follows: S1(L: . These cysteine pairs were determined by distance constraints for possible disulfide bonds in the model structure (Protein Data Bank code 2FJG) (41). As shown in Fig. 2, only S1 and S2 were expressed on the recombinant phage particles at a level comparable with that of the anti-VEGF scFv, as judged by the ELISA measurement of the E-tag expression. However, the expressed S1 and S2 sc-dsFv did not bind to VEGF, perhaps because of a poorly folded three-dimensional structure. S3 to S7 sc-dsFv were not expressed on the phage surface (Fig. 2), and thus binding to VEGF was not expected to be detected (Fig. 2). Fig. 2 also indicated that the expression of the soluble anti-VEGF scFv was measurable in the culture medium, as shown by the ELISA signals developed with HRP-conjugated protein L, which binds to the Vκ1 domain of the soluble anti-VEGF scFv. In the suppressor E. coli strain ER2738, translation of a portion of the scFv-pIII fusion proteins terminates at the amber codon between the E-tag and gene III (supplemental Fig. 1), and the products are secreted to the medium along with the reproduced phage. The secretion of the soluble sc-dsFv (S1 to S7) from E. coli ER2738 was either greatly reduced (S5) or completely blocked (Fig. 2). This result is in agreement with previous observations (19,22,24).

Fig. 1 legend (partial): The x axis shows the concentration of free VEGF added to the corresponding well. The free VEGF molecules compete with the immobilized VEGF for binding to phage-displayed anti-VEGF scFv(fXa+). The phage particles displaying anti-VEGF scFv(fXa+) bound to the immobilized VEGF in the well were quantified by the ELISA signal shown on the y axis.
The results shown in Fig. 2 indicated that the introduction of cysteine pairs at various locations in the scFv completely impeded the expression and/or folding of the sc-dsFv in the host E. coli. One working hypothesis was that the signal sequence that worked for the expression and folding of the template anti-VEGF scFv no longer worked for any of the sc-dsFv designs explored so far; one reason was attributed to the misfolding of the sc-dsFv structures during translation and/or translocation because of the interface disulfide introduced in the sc-dsFv sequences. To engineer the sc-dsFv with the phage display platform, it is necessary to rescue both the expression and the proper folding of the sc-dsFv in the host E. coli. The working hypothesis implied that alternative signal sequences might rescue the sc-dsFv expression and folding on phage surface. This implication was tested with high throughput phage display experiments (see below).
Exploration of the Signal Sequence Space with Phage-displayed Protein Libraries-The tentative signal sequence responsible for the high expression level and proper structural folding of the anti-VEGF scFv encoded in the pCANTAB5E phagemid is the signal sequence for M13 gene III in connection with the signal peptidase cleavage sequence of pelB (a natural Pectobacterium wasabiae signal sequence for pectate lyase with a known pelB peptidase cutting site, which is frequently used in the phage display system) (Fig. 3). But because the open reading frame for gene III in pCANTAB5E starts 39 amino acid residues amino-terminal to the tentative signal peptidase cutting site, we aimed to answer the two following questions with biological combinatorial approaches. 1) Which part of the amino-terminal sequence in the open reading frame encodes the key regions (n-, h-, and c-regions) of the signal sequence? 2) Do the amino acid preferences in the signal sequence underlie phage-displayed protein expression and folding? If so, which part(s) of the key regions is(are) critically related to the expression of functional displayed proteins, and which amino acid types are preferred in the key region for effective mature protein production? This sought after knowledge was to be applied to rescue the expression and folding of the sc-dsFv on phage surface.
To address these questions, synthetic phage display libraries, spanning the tentative signal peptide region, were constructed (Fig. 3), and the signal sequence variants that resulted in high expression level and proper folding of the phage-displayed scFv(fXa+) were identified and characterized. The rationale was that the signal sequence preferences resulting in high expression level and functional structure for the displayed proteins would provide assignments for the key regions of the signal sequence that are responsible for the expression level and functional structure of the phage-displayed proteins. More importantly, once the correlation between the expression/folding of the displayed protein and the sequence preferences of the key regions of the signal sequence is established, the optimum signal sequences would be useful guides for the rescue of the expression and folding of the sc-dsFvs on the phage surface. The phagemid encoding the anti-VEGF scFv(fXa+) had been established (Fig. 1) for exploring the signal sequence space; the phage-displayed sc-dsFvs (S1 to S7) were not feasible for this experiment because of undetectable expression on the phage surface (Fig. 2).

Fig. 2 legend (partial): The x axis shows various single chain antibody variable domain fragment constructs in the pCANTAB5E phagemid. The details of the antibody fragment constructs are described in the text. In the pCANTAB5E phagemid, an E-tag sequence followed by a TAG amber stop codon is encoded between the scFv construct and the pIII sequence (supplemental Fig. 1a). The E-tag was used as a marker for the expression of the scFv/sc-dsFv-pIII fusion protein displayed on the phage surface or as a marker for the free secreted scFv/sc-dsFv molecules. The null control phagemid contains TAA stop codons in the signal sequence (Fig. 3), and thus the phage particles rescued from the E. coli hosts harboring the control phagemid did not display any polypeptide connected to the pIII minor capsid protein. Also, the bacteria harboring the control phagemid did not secrete free soluble scFv protein. The black histogram and the light gray histogram show the ELISA signals reflecting the binding of the recombinant phage displaying scFv(fXa+)/sc-dsFv(fXa+) to the immobilized anti-E tag antibody (0.1 µg in 100 µl) and to the immobilized VEGF (1 µg in 100 µl), respectively. The overnight culture medium for phage rescue also contained bacteria-secreted soluble protein of the scFv/sc-dsFv constructs, for which the expression stopped at the amber stop codon. The binding of the soluble scFv/sc-dsFv to the anti-E tag antibody and to the VEGF can be detected by HRP-conjugated protein L. Protein L binds to the Vκ light chain of the scFv/sc-dsFv, which in turn binds to immobilized anti-E tag antibody or immobilized VEGF. The dark gray histogram and the white histogram show the ELISA signals reflecting the binding of free soluble scFv/sc-dsFv to the immobilized anti-E tag antibody and to the immobilized VEGF, respectively. The error bars were derived from three repeats of the experiments.
As shown in Fig. 3, the tentative n- and h-regions of the signal sequence were diversified in the first two phage-displayed protein libraries (L1 and L2 in Fig. 3), and the tentative c-region and the first few amino-terminal residues of the mature displayed protein were diversified in another two phage-displayed protein libraries (L4 and L5 in Fig. 3). In each of the libraries, 10 consecutive amino acid positions were encoded with the degenerate codon NNK (Fig. 3), and the complexity for each of the libraries was established above 10^9 (see figure legend of Fig. 4).
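For scale, the size of the sequence space addressed by ten randomized positions can be compared with the library complexity quoted above. The following sketch computes the number of possible 10-residue peptides, the number of distinct NNK-encoded DNA sequences, and the fraction of the peptide space that a library of 10^9 members could cover at most. It assumes, as a simplification, that every library member encodes a distinct peptide, so the result is an upper bound.

```python
# Sequence space of 10 NNK-randomized positions versus library complexity.
# Simplifying assumption: all library members are distinct peptides,
# so the coverage computed here is an upper bound.

POSITIONS = 10
LIBRARY_SIZE = 10 ** 9           # lower limit of the stated complexity

peptide_space = 20 ** POSITIONS  # 20 amino acids per position
dna_space = 32 ** POSITIONS      # 32 NNK codons per position

coverage = LIBRARY_SIZE / peptide_space

print(f"peptide sequences: {peptide_space:.3e}")
print(f"NNK DNA sequences: {dna_space:.3e}")
print(f"max coverage of peptide space: 1 in {peptide_space / LIBRARY_SIZE:,.0f}")
```

A library at this complexity therefore samples on the order of one in ten thousand of the possible peptides, so selection explores sequence preferences rather than an exhaustive set of variants.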
To enrich functional signal peptide variants, the libraries were selected for binding to immobilized VEGF through repetitive selection/amplification cycles. Fig. 4 shows that the VEGF binding capability (y axis) of the selected phage population increased with successive rounds of selection/amplification (x axis). Signal sequence variants emerged from each of the libraries that enabled more effective expression and/or functional structure compared with the positive control phagemid with the wild-type M13pIII-pelB signal sequence.
Results shown in Fig. 4 indicated that all the amino acids in the tentative signal sequence region could be involved in phage-displayed protein expression and/or folding and that the wild-type M13pIII-pelB signal sequence was not the most effective one for displaying the anti-VEGF scFv. Fig. 4 also indicated that although only ~1:10,000 of the theoretical variations (20¹⁰) were expressed in each of the phage-displayed libraries, the diversity was nevertheless adequate for exploring the relationships between the signal sequences and their functionalities in protein expression and folding. In the following section, these relationships are established by a large scale analysis of the signal sequences that emerged from the selected phage population and their effects on the phage-displayed scFv-VEGF binding.
Single Colony Analysis of Highly Effective Signal Sequences in Expression of Functional Phage-displayed Anti-VEGF scFv(fXa+)-After selection/amplification cycles (Fig. 4), 1536 single phage colonies from each of the four libraries (Fig. 3) were selected at random. Because mutants with deletions in the scFv region were observed in L2 and L5 after the 4th round of the selection/amplification cycle, whereas mutants were rare in L1 and L4 even at the 10th round, single colony phage samples were picked from each of the libraries after 4 (for L2 and L5) or 10 (for L1 and L4) rounds of the selection/amplification cycle. The phage cultures after overnight rescue were spotted on nitrocellulose membranes coated with VEGF (supplemental Figs. 4 and 5). Each membrane contained 96 blocks of 6 × 7 phage spots; the area of each block was 1 cm². Each block contained eight phage samples, each of which was spotted in triplicate. Each block also contained two repeat sets of spots for standard serial dilutions of positive control phage with displayed anti-VEGF scFv(fXa+) and negative control phage without any displayed protein (supplemental Fig. 5).
After washing, the bound phage remaining on the membrane was quantified with HRP-conjugated anti-M13 antibody and chemiluminescence (more details under "Experimental Procedures"). The chemiluminescence signals were calibrated with the standard curves of the control phage within the block. The VEGF binding strength of a phage sample was determined as the calibrated signal normalized by the phage concentration of the sample (supplemental Figs. 4 and 5). The phage concentration of a phage sample was determined by spotting the sample on an uncoated and unblocked nitrocellulose membrane and measuring the chemiluminescence signals after the HRP-conjugated anti-M13 antibody treatment (supplemental Figs. 4 and 5). Phage samples in membrane blocks with a poor signal-to-noise ratio, i.e. R² < 0.8 for the standard serial dilution curves in either the membrane block for measuring phage-VEGF binding or the membrane block for measuring phage concentration, were dropped from further consideration. By this signal-to-noise threshold, 65% of the data were suitable for further analysis.
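The spot-blot quantification described in this section (calibration against the in-block standard dilution series, rejection of blocks with R² < 0.8, and normalization by phage concentration) can be sketched as follows. This is a minimal illustration, not the procedure actually used: the linear form of the standard curve, the function names (`fit_line`, `calibrate`, `binding_strength`), and the averaging of the triplicate spots are all assumptions introduced here.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit y = slope * x + intercept; returns (slope, intercept, R^2)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot

def calibrate(signal, slope, intercept):
    """Convert a raw spot signal into control-phage equivalents (assumed linear curve)."""
    return (signal - intercept) / slope

def binding_strength(binding_std, conc_std, sample_binding, sample_conc, r2_min=0.8):
    """Calibrated VEGF-binding signal divided by calibrated phage concentration.

    binding_std and conc_std are (relative_amounts, signals) pairs for the control
    phage dilution series on the VEGF-coated and uncoated membranes, respectively.
    Returns None when either standard curve falls below the R^2 threshold.
    """
    b_slope, b_int, b_r2 = fit_line(*binding_std)
    c_slope, c_int, c_r2 = fit_line(*conc_std)
    if b_r2 < r2_min or c_r2 < r2_min:
        return None  # block dropped from further consideration
    bound = calibrate(mean(sample_binding), b_slope, b_int)
    amount = calibrate(mean(sample_conc), c_slope, c_int)
    return bound / amount
```

A sample whose triplicate spots give no usable standard curve on either membrane is returned as `None`, mirroring the exclusion rule in the text; all numeric inputs in any call would be hypothetical, since the raw spot intensities are not given here.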
All the phage samples with measurable binding strengths were ranked; the signal sequences of the top-ranked ~40 phage [...] (42) for each of the signal sequence groups in supplemental Table 1 are shown in Fig. 5. Fig. 5A shows that the positively charged residues in the tentative n-region of the M13 gene III signal sequence were not conserved in the effective signal sequences that emerged. In contrast, the hydrophobic preference in the tentative h-region was conserved, as indicated in Fig. 5B, although the position-wise amino acid preferences were less specific (low in information content). More striking is the emergence of a preferred sequence pattern, AM(F)SRPP, in both c-region libraries (Fig. 5, C and D), indicating that a peptidase cleavage site in the tentative c-region, which is different from the pelB peptidase site (Fig. 3), plays a dominant role in effective expression of the phage-displayed anti-VEGF scFv(fXa+). This also indicated that the amino-terminal poly-His of the mature scFv does not affect the emergence of this dominant sequence pattern. This conclusion is supported by the result shown in Fig. 5D, where the poly-His sequence pattern diminished following the emergence of the dominant sequence pattern. The shift of the position of the dominant sequence pattern in Fig. 5D in comparison with Fig. 5C indicates that the sequence preference in the c-region is more important than the length of the surrounding polypeptide.
Fig. 6 compares the anti-VEGF scFv expression resulting from the most effective signal sequences, i.e. the top-ranked sequences from each of the four libraries, with the positive and negative control phages. Both of the measurements shown in Fig. 6 (titer and ELISA signal of the bound phage on the VEGF-coated surface) indicated that the most effective signal sequences clearly emerged from the tentative c-region variants (L4 and L5). The VEGF binding strengths for the best c-region variants were more than 20-fold greater (Fig. 6) than that of the control anti-VEGF scFv with the M13pIII-pelB signal sequence. This result suggested that residues in the tentative c-region critically affect the expression of the phage-displayed scFv, providing a direction for rescuing the expression of the phage-displayed sc-dsFv.
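The notion of a position being "low in information content" can be made concrete with the per-position measure that underlies sequence logos: the maximum possible entropy for a 20-letter alphabet minus the observed Shannon entropy of the column. The sketch below is illustrative only; the function name `information_content` is introduced here, any input sequences would be hypothetical, and the small-sample correction that logo software often applies is omitted.

```python
import math
from collections import Counter

def information_content(aligned_seqs, alphabet_size=20):
    """Per-column information content in bits: log2(alphabet_size) - Shannon entropy."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")
    max_bits = math.log2(alphabet_size)
    bits_per_column = []
    for column in zip(*aligned_seqs):
        counts = Counter(column)
        total = len(column)
        # Shannon entropy of the observed residue frequencies in this column
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        bits_per_column.append(max_bits - entropy)
    return bits_per_column
```

A fully conserved column reaches the maximum of log2(20), about 4.32 bits, whereas a column split evenly between two residues scores exactly one bit less. A conserved hydrophobic preference spread over several residues, as described for the h-region, would score low on this measure.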
Rescuing the Expression of the Phage-displayed sc-dsFv through Optimizing the c-region Residues of the Signal Sequence-For each of the phage-displayed sc-dsFv variants (S1 to S7), a phage-displayed library with 10 diversified residues in the c-region of the signal sequence (L4 in Fig. 3) was constructed, and the effective signal sequences were enriched by the selection/amplification cycle against VEGF binding as described previously (Fig. 4). Among the seven libraries, only the S5 sc-dsFv could clearly be rescued as a phage-displayed fusion protein; all other sc-dsFv variants either were not enriched for VEGF binding or were dominated by mutants after a few rounds of the selection/amplification cycle (data not shown). We thus focused on the relationships between the signal sequences and the expression of the phage-displayed S5 sc-dsFv.
Supplemental Table 2 lists the most effective signal sequences for the expression of S5 sc-dsFv on the phage surface. The amino acid preference pattern for these sequences is shown in Fig. 7. The most prominent sequence pattern reflected, again, the preference of the signal peptidase cleavage site. The amino-terminal sequencing of the mature protein from the representative signal sequence variant, VKKLLFAIPLVVPFYSFAMSAQPSLHHHGH, located the signal peptidase cleavage site at the peptide bond preceding the sequence SAQPSLHHHGH. The pattern shown in Fig. 7 is similar, but not identical, to the pattern shown in Fig. 5C, the optimum signal sequence pattern for scFv display. As shown in supplemental Table 2, many of the phage-displayed S5 sc-dsFv variants bind VEGF severalfold more strongly than the control anti-VEGF scFv(fXa+). We estimated that the expression level for the optimum S5 variant was increased by 2 orders of magnitude compared with the S5 with the original pCANTAB5E construct.
FIGURE 6. [...] (Fig. 4). The top-ranked signal sequence variants (-PWLPRDPYIPVVPFYAAQPAMAHHHHHHGH- from L1, -VKKLLPSSLAFLLVFAAQPAMAHHHHHHGH- from L2, -VKKLLFAIPLVVPFYAMSMSRPVASHHHGH- from L4, and -VKKLLFAIPLVVPFYAAQPAYAMSRTPVRS- from L5) were cultured and normalized to 1.0 × 10¹⁰ cfu/ml. The phage solutions were incubated with immobilized VEGF in microtiter wells. The ELISA signals (shown on the y axis) and the titer of the phage particles eluted from the well (shown on the x axis) are plotted for each of the variants. Eight repeats of the experiment were carried out for each of the variants, and the results are shown for the variant from L1 in gray diamonds, L2 in gray circles, L4 in black squares, and L5 in empty triangles. The scales of the ELISA signal and the titer of the bound phage are normalized against the signal and the titer of the phage-displayed anti-VEGF scFv(fXa+) (shown in empty circles). The scales on both the x axis and the y axis show the fold increase over the data for anti-VEGF scFv(fXa+) encoded in the original pCANTAB5E phagemid with the M13pIII-pelB signal sequence. Both the null control phage and M13KO7 helper phage are negative controls (data shown in black triangles and in black diamonds, respectively). As expected, the data for the negative controls are close to zero on both axes.
MARCH 12, 2010 • VOLUME 285 • NUMBER 11

Phage Display of Disulfide-stabilized scFv
Disulfide Bond Formation in the S5 sc-dsFv-To test the formation of the disulfide bond in the phage-displayed S5 sc-dsFv variants, we constructed two control phage-displayed anti-VEGF scFv variants as follows: one with the factor Xa cutting site -IEGR- encoded in the linker peptide connecting the two variable domains (anti-VEGF scFv(fXa+)); the other without this fXa cutting site (anti-VEGF scFv(fXa−)) (see "Experimental Procedures"). Both anti-VEGF scFv(fXa+) and anti-VEGF scFv(fXa−) bind to VEGF comparably in the absence of bovine factor Xa (fXa) (Fig. 8). But in the presence of fXa, only the anti-VEGF scFv(fXa−) variant binds to VEGF (Fig. 8), indicating that the cleavage of the linker peptide abolishes the scFv-VEGF binding. The S5 construct contains the -IEGR- cutting site in the linker peptide (supplemental Fig. 1), making fXa resistance a marker for the disulfide formation in the sc-dsFv. The two S5 variants shown in Fig. 8 remained largely capable of binding to VEGF in the presence of fXa, in contrast to the fXa effect on anti-VEGF scFv(fXa+). These results indicated that the cleavage of the linker region of the phage-displayed S5 sc-dsFv did not abolish the variable domain structure, suggesting that the disulfide linkage between the two domains maintained the functional scFv structure for VEGF recognition. The fXa effects on the VEGF binding capability of the top-ranked signal peptide variants of the S5 sc-dsFv are listed in supplemental Table 2. The extent of the disulfide formation varies from ~30 to ~100%, suggesting that the signal peptide not only affected the expression level of the phage-displayed sc-dsFv but also affected its folding.
Comparison of VEGF Binding among Variants from Phage-displayed Synthetic sc-dsFv and scFv Libraries with Sequence Variegation in the CDR Regions of the Light Chain Variable Domain-So far, we have demonstrated that optimizing the relevant region in the signal sequence can substantially improve the expression and folding of the model sc-dsFv molecule on the phage surface. The immediate challenge following this finding was to use one of the newly discovered signal sequences to construct sc-dsFv libraries with sequence variegation in the CDR regions so as to identify the CDR sequence preferences for sc-dsFv binding to VEGF. These CDR sequence preferences were to be compared with those derived from the corresponding scFv framework in the absence of the S5 interdomain disulfide bond. With this comparison, we intended to characterize the following: 1) the performance of the sc-dsFv phage display platform, and 2) the effect of the interdomain disulfide on the binding of the sc-dsFv to its antigen.
Based on published structural and experimental data (33), the major interaction site between the anti-VEGF scFv and the VEGF epitope is centered on the CDR3 region of the heavy chain. We thus focused on variegating the sequences in the CDR regions of the light chain, such that the effects of the interdomain disulfide bond can be manifested through differences in the light chain sequence preference, which are expected to reflect the presence or absence of the interdomain disulfide bond connecting the light chain domain to the heavy chain domain. As shown in Fig. 9, we constructed three synthetic CDR libraries (CDRL3, CDRL2-CDRL3, and CDRL1C-CDRL3C) based on each of the two templates, S5 sc-dsFv and anti-VEGF scFv(fXa+). These two templates differ in only one respect: the interdomain disulfide bond (L:Q100C and H:G44C), which evidently forms to stabilize the S5 sc-dsFv structure (Fig. 8). More than 40 top-ranked binders against VEGF were selected from each of the six libraries after a few rounds of panning against VEGF. These variants were sequenced, and the phage cultures were tested for VEGF binding and for resistance against fXa digestion. The statistics of the sequence preferences are compared in Fig. 9, and the statistics of VEGF-binding affinities and the effects of the interdomain disulfide bond on fXa digestion are compared in Fig. 10. The sequence information and numerical results for each of the variants are listed in supplemental Table 3.
FIGURE 8. [...] Table 2). All the phage solutions were normalized to 1.0 × 10¹⁰ cfu/ml. One set of the phage solutions was mixed with bovine factor Xa (1 unit) at 37°C for 1 h (data shown in the gray histogram); the other set of the phage solutions was mixed with only buffer under the same reaction conditions (data shown in the black histogram). The binding capabilities of the phage particles from both sets of phage solutions to the immobilized VEGF were measured with ELISA, for which the signal strengths are shown on the y axis. The white histogram shows the binding signal of the phage to immobilized anti-E tag antibody, reflecting the relative expression level of the scFv or sc-dsFv. The error bars were derived from three repeats of the ELISA measurement.
Comparisons of the sequence preferences for VEGF binding from CDRL3 variants (Fig. 9B) and from CDRL2-CDRL3 variants (Fig. 9C) indicated that the sequence preferences for VEGF binding in these CDR regions in the scFv were highly similar to those in the sc-dsFv, suggesting that the interdomain disulfide bond in S5 sc-dsFv did not significantly affect the interactions between the CDR regions and the VEGF epitope. On the other hand, Fig. 9D contrasts the scFv and the sc-dsFv in the cysteine preference for forming the intra-light chain disulfide bond; diminishing conservation of the cysteines in the sc-dsFv variants suggests that the interdomain disulfide bond in the sc-dsFv can at least partially, if not completely, replace the conserved intra-light chain disulfide bond in maintaining the functional antibody structure. Fig. 10A shows that it was easier to find slightly stronger VEGF binders in the scFv libraries than in the sc-dsFv libraries. This could be due to a more flexible dimeric interface that allows fine adjustments of the antibody-antigen interactions according to the CDR residues. Fig. 10B shows that the sc-dsFv variants formed the interdomain disulfide bond, leading to resistance to the dimeric dissociation caused by fXa cleavage of the linker peptide between the variable domains. The scFv variants remained cleavable by fXa, although the extent of the dimeric dissociation due to the fXa cleavage was reduced, especially for the CDRL3 variants. This is likely due to the stronger interaction between the scFv and VEGF, which in turn holds the dimeric interface together even when the linker peptide is cleaved by fXa.
The results shown in Figs. 9 and 10 demonstrate the performance of both the sc-dsFv phage-displayed libraries and the scFv phage-displayed libraries. The sc-dsFv libraries were expressed as efficiently as the scFv libraries. The interdomain disulfide bonds were largely formed in the sc-dsFv variants examined. The scFv variants examined were likely to fold correctly as in the sc-dsFv. Finally, the VEGF binding affinity can be further enhanced severalfold in both scFv and sc-dsFv variants by just refining the amino acid sequence in the minor VEGF-binding sites in the light chain variable domain.
FIGURE 9. Light chain CDR sequence preferences for VEGF-binding derived from phage-displayed synthetic scFv and sc-dsFv libraries. The light chain sequences of the template scFv and the sc-dsFv are shown in A, along with the randomized CDR regions in the three pairs of synthetic libraries CDRL3, CDRL2-CDRL3, and CDRL1C-CDRL3C. X indicates the locations of the residues encoded with the degenerate codon NNK in the synthetic libraries. The complexities for the scFv CDRL3 and sc-dsFv CDRL3 libraries are 9.6 × 10⁸ and 8 × 10⁸, respectively; the complexities for the scFv CDRL2-CDRL3 and sc-dsFv CDRL2-CDRL3 libraries are 1.9 × 10⁹ and 4.6 × 10⁹, respectively; the complexities for the scFv CDRL1C-CDRL3C and sc-dsFv CDRL1C-CDRL3C libraries are 1.3 × 10⁹ and 4.3 × 10⁹, respectively. The signal sequence for both the scFv- and sc-dsFv-pIII fusion proteins is TRSCFAFMLP (see L4 to S5 number 64 shown in Fig. 8; also see supplemental Table 2 for more details). The sequence logos (42) in B-D compare the sequence preference differences in VEGF binding between the scFv and the sc-dsFv variants from the three pairs of synthetic libraries, respectively. The logos shown in B were derived from 43 and 40 variants for scFv and sc-dsFv, respectively; the logos shown in C were derived from 45 and 42 variants for scFv and sc-dsFv, respectively; the logos shown in D were derived from 38 and 40 variants for scFv and sc-dsFv, respectively. Details of the variants are listed in supplemental Table 3.
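Because the randomized CDR positions are encoded with the degenerate codon NNK (N = any base, K = G or T), it may help to see what that codon set covers. The sketch below enumerates the NNK codons with the standard genetic code and reports, for a given number of randomized positions, the theoretical DNA and protein diversity and the fraction of clones free of stop codons. The function names and the `n_positions` argument are illustrative assumptions; the number of X positions in each library is not taken from the text.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order for each position
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def nnk_codons():
    """All codons matching NNK: any base, any base, then G or T."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GT")]

def nnk_summary(n_positions):
    """Theoretical coverage of n_positions independent NNK codons."""
    codons = nnk_codons()
    translated = [CODON_TABLE[c] for c in codons]
    amino_acids = set(translated) - {"*"}
    stop_fraction = translated.count("*") / len(codons)
    return {
        "codons": len(codons),
        "amino_acids": len(amino_acids),
        "dna_diversity": len(codons) ** n_positions,
        "protein_diversity": len(amino_acids) ** n_positions,
        "stop_free_fraction": (1 - stop_fraction) ** n_positions,
    }
```

NNK comprises 32 codons that encode all 20 amino acids and a single stop codon (TAG). Comparing `dna_diversity` with the reported library complexities (on the order of 10⁹) gives a rough sense of how fully a library of a given design could be sampled.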

DISCUSSION
This work demonstrates a methodology to systematically optimize the signal sequences for phage-displayed protein expression, in particular for disulfide-linked scFv structures based on the S5 interdomain disulfide bond, for which expression with conventional signal sequences was not viable. The optimized signal sequences and the discovery methodology led to the establishment of phage display systems with the sc-dsFv format, enabling the demonstration and comparison of the performance of the sc-dsFv phage display platform with that of the conventional scFv platform. The sc-dsFv libraries were expressed at levels indistinguishable from those of the corresponding scFv libraries, owing to the optimized signal sequence amino-terminal to the sc-dsFv-pIII fusion protein. The interdomain disulfide bond in the sc-dsFv was largely formed and did not affect the antigen-binding preferences unless the conserved intrachain disulfide bond in the light chain variable domain was compromised; sc-dsFv structural integrity was less sensitive to the disruption of the intrachain disulfide bond. The expression of the sc-dsFv libraries underlies a new antibody display platform that retains the best features of the scFv scaffold without the shortcomings due to the unstable dimeric interface.
The expression of all other sc-dsFv constructs (S1-S4 and S6-S7) could not be rescued by the optimization of the signal sequence in the c-region. It is evident that the signal sequence is not the only determinant in sc-dsFv expression; the position of the engineered cysteine pair is another determinant in the expression and folding of the sc-dsFv. The reason could be that the translocation of the nascent unfolded polypeptide chain exposes it sequentially, from the amino toward the carboxyl terminus, to the folding environment in the periplasm. As such, the disulfide configuration, and thus the folding of the variable domain, is dependent on the sequential appearance of the cysteines in the primary structure.
Very little is known as to why some sc-dsFv constructs could not be expressed on the phage surface. Because the disulfide bonds of the newly synthesized preprotein can only be formed in the oxidizing environment of the periplasm, the mechanism for the translocation of the nascent unfolded polypeptide chain from the translation site in the cytoplasm across the periplasmic membrane could be a key determinant for the folding and, consequently, for the expression of the displayed protein on the phage surface (25-31). Alternative sequences in the signal peptide region have been known to modulate the expression level and folding quality of the displayed protein (25,27,31), but difficulty remains in identifying optimum signal sequences in a vast sequence space for some sc-dsFv constructs.
The sequence preferences of the key regions in the signal sequences have been well characterized (43). The n-region contains a few positively charged residues (Lys and Arg). The h-region contains ~10 consecutive hydrophobic residues. The c-region contains polar residues and a signal peptidase cleavage site composed of residues with small side chains. Each of the regions plays important roles at various stages of the translocation mechanism (44). The translocation mechanisms in E. coli for M13 phage-displayed proteins follow two major pathways. The Sec pathway is a post-translational translocation process (44-46). Newly synthesized polypeptide forms a complex with SecA and the dimeric SecYEG protein-conducting channel in the periplasmic membrane, followed by the insertion of the h-region of the signal peptide in the amino terminus of the polypeptide into the lateral opening of the transmembrane helix barrel composed of 10 transmembrane helices from SecY. The inserted h-region forms a helical structure and packs into the transmembrane helix barrel as part of the translocon. Through the motion of the translocon complex driven by the hydrolysis of ATP, the unfolded polypeptide is pushed through the pore of the translocon across the periplasmic membrane from the amino to the carboxyl terminus. The c-region is recognized and cleaved by a membrane-bound signal peptidase, releasing the mature protein to the periplasm.
FIGURE 10. Distributions of the VEGF-binding affinity and resistance to fXa digestion for the selected scFv and sc-dsFv variants of which the sequence logos are shown in Fig. 9. In each of the distributions shown in A and B, the boundary of the box closest to zero indicates the 25th percentile; a line within the box marks the median; the boundary of the box farthest from zero indicates the 75th percentile; whiskers (error bars) above and below the box indicate the 90th and 10th percentiles. The dots show the extreme values of the respective distribution. A, the normalized VEGF binding affinity shown on the y axis is defined as follows:
$$\text{normalized VEGF binding} = \frac{(\mathrm{VEGF}_{\mathrm{sample}} - \mathrm{VEGF}_{\mathrm{null}})/(\mathrm{anti\_E}_{\mathrm{sample}} - \mathrm{anti\_E}_{\mathrm{null}})}{(\mathrm{VEGF}_{\mathrm{WT}} - \mathrm{VEGF}_{\mathrm{null}})/(\mathrm{anti\_E}_{\mathrm{WT}} - \mathrm{anti\_E}_{\mathrm{null}})}$$
where $\mathrm{anti\_E}_{\mathrm{sample}}$ and $\mathrm{VEGF}_{\mathrm{sample}}$ are the mean anti-E tag-binding and VEGF-binding ELISA signals, respectively, for the variant in consideration; $\mathrm{anti\_E}_{\mathrm{WT}}$ and $\mathrm{VEGF}_{\mathrm{WT}}$ (WT is wild type) are the mean anti-E tag-binding and VEGF-binding ELISA signals, respectively, for the template anti-VEGF scFv/sc-dsFv, for which the sequences for the CDR regions are shown in Fig. 9A. The corresponding null values ($\mathrm{anti\_E}_{\mathrm{null}}$ and $\mathrm{VEGF}_{\mathrm{null}}$) were derived with a negative control phage without the displayed fusion protein. B, VEGF binding % after fXa treatment shown on the y axis is defined as follows:
$$\text{VEGF binding \%} = \frac{\mathrm{VEGF}_{\mathrm{sample}}^{\mathrm{fXa}} - \mathrm{VEGF}_{\mathrm{null}}^{\mathrm{fXa}}}{\mathrm{VEGF}_{\mathrm{sample}} - \mathrm{VEGF}_{\mathrm{null}}} \times 100\%$$
where $\mathrm{VEGF}_{\mathrm{sample}}$ and $\mathrm{VEGF}_{\mathrm{sample}}^{\mathrm{fXa}}$ are the mean VEGF-binding ELISA signals before and after fXa treatment for the variant in consideration. The corresponding null values ($\mathrm{VEGF}_{\mathrm{null}}$ and $\mathrm{VEGF}_{\mathrm{null}}^{\mathrm{fXa}}$) were derived with a negative control phage without the displayed fusion protein.
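The two quantities plotted in Fig. 10 are simple ratios of background-subtracted ELISA signals, and a short sketch may make the definitions easier to apply. The functions below follow the Fig. 10 caption, reading the wild-type term as a ratio of VEGF binding to anti-E tag binding in parallel with the sample term; the function and argument names are introduced here for illustration, and any input values would be hypothetical.

```python
def normalized_vegf_binding(vegf_sample, anti_e_sample,
                            vegf_wt, anti_e_wt,
                            vegf_null, anti_e_null):
    """Expression-corrected VEGF binding of a variant relative to the template (WT).

    Each VEGF signal is background-subtracted with the null phage and divided by the
    background-subtracted anti-E tag signal, which reflects the expression level.
    """
    sample_ratio = (vegf_sample - vegf_null) / (anti_e_sample - anti_e_null)
    wt_ratio = (vegf_wt - vegf_null) / (anti_e_wt - anti_e_null)
    return sample_ratio / wt_ratio

def vegf_binding_percent_after_fxa(vegf_sample_fxa, vegf_null_fxa,
                                   vegf_sample, vegf_null):
    """Percentage of VEGF binding retained after fXa cleavage of the linker."""
    return (vegf_sample_fxa - vegf_null_fxa) / (vegf_sample - vegf_null) * 100.0
```

A value of `normalized_vegf_binding` above 1 indicates a variant that binds VEGF more strongly than the template at equal expression, and a value of `vegf_binding_percent_after_fxa` near 100 indicates that the variable domains stay associated after the linker is cut.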
The second pathway is the SRP pathway, which is a co-translational process (46,47). The synthesized signal peptide, in particular the hydrophobic residues in the h-region, is recognized by the Ffh-4.5S RNA complex, which brings the protein synthesis complex to the membrane to dock with the SecYEG translocon before the major part of the polypeptide is translated. The regions of the signal peptide are integrated into the translocon complex as in the Sec pathway described above. The continuing synthesis of the polypeptide chain from the ribosome sends the polypeptide across the periplasmic membrane through the pore of the SecYEG complex. The mature polypeptide is processed by the membrane-bound signal peptidase during or after the completion of the protein synthesis.
Based on the results in this work, we conclude that the amino acid residues in the c-region of the signal sequence dominate the display efficiency of the anti-VEGF sc-dsFv as well as the scFv on the phage surface, suggesting that the signal peptidase cleavage rate affects the expression and folding of the sc-dsFv and scFv. This result is somewhat unexpected because previous findings suggest that the Sec system-dependent signal sequences have little influence on scFv expression (27). Still, the molecular details of how the signal sequence controls the translocation and folding of the sc-dsFv-pIII fusion protein remain largely unclear.