Arginine mutations in antibody complementarity-determining regions display context-dependent affinity/specificity trade-offs

Antibodies commonly accumulate charged mutations in their complementarity-determining regions (CDRs) during affinity maturation to enhance electrostatic interactions. However, charged mutations can mediate non-specific interactions, and it is unclear to what extent CDRs can accumulate charged residues to increase antibody affinity without compromising specificity. This is especially concerning for positively charged CDR mutations that are linked to antibody polyspecificity. To better understand antibody affinity/specificity trade-offs, we have selected single-chain antibody fragments specific for the negatively charged and hydrophobic Alzheimer's amyloid β peptide using weak and stringent selections for antibody specificity. Antibody variants isolated using weak selections for specificity were enriched in arginine CDR mutations and displayed low specificity. Alanine-scanning mutagenesis revealed that the affinities of these antibodies were strongly dependent on their arginine mutations. Antibody variants isolated using stringent selections for specificity were also enriched in arginine CDR mutations, but these antibodies possessed significant improvements in specificity. Importantly, the affinities of the most specific antibodies were much less dependent on their arginine mutations, suggesting that over-reliance on arginine for affinity leads to reduced specificity. Structural modeling and molecular simulations reveal unique hydrophobic environments near the arginine CDR mutations. The more specific antibodies contained arginine mutations in the most hydrophobic portions of the CDRs, whereas the less specific antibodies contained arginine mutations in more hydrophilic regions. These findings demonstrate that arginine mutations in antibody CDRs display context-dependent impacts on specificity and that affinity/specificity trade-offs are governed by the relative contribution of arginine CDR residues to the overall antibody affinity.

In vitro display methods such as phage (1) and yeast surface (2) display are invaluable for efficiently isolating high-affinity antibody variants from large libraries. These display methods have several advantages relative to immunization, including the ability to isolate and/or evolve antibodies with higher affinities than those that are typical for natural antibodies. This advantage stems in part from the exquisite control over antigen presentation afforded by in vitro display methods, including the concentration, conformation, and higher order structure of the target antigen. Another key advantage of such display methods is the potential of using them to perform negative selections during sorting and/or affinity maturation, which enables identification of antibodies with low cross-reactivity against molecules that are similar to the target antigens (3)(4)(5)(6).
Nevertheless, a key disadvantage of in vitro display methods is that they generally yield lower quality antibodies than those isolated via immunization (4). Common deficiencies include reduced antibody specificity, folding stability, and/or solubility relative to natural antibodies. The increased likelihood of in vitro display methods to yield suboptimal antibodies may be due to the reduced quality control mechanisms employed by bacteria and yeast relative to higher order organisms.
Many approaches have been developed to improve in vitro selection of antibodies with enhanced biophysical properties and specificities. The most common method is to use elevated temperature to unfold destabilized antibodies and then select for stable variants that possess high affinity (7,8). This approach has also been coupled with the use of conformational ligands (e.g. Protein A, Protein L, and conformational antibodies) that recognize folded antibodies to enrich libraries for folded variants either at the beginning of or during the sorting process (9 -15). Negative selections have also been reported using polyspecificity reagents (e.g. mammalian cell lysate) to eliminate non-specific antibodies from in vitro libraries to improve the selection of highly specific antibodies (6).
We have developed a directed evolution method for improving the selection of antibody fragments with increased affinity and stability (15) that overcomes affinity/stability trade-offs observed for antibody fragments isolated from in vitro libraries (4,8,16). This approach involves displaying mutant libraries of lead variable domain of heavy chain (VH) antibodies on the surface of yeast and co-selecting for antigen binding (e.g. Alzheimer's Aβ42 peptide) and stability via a conformational ligand (Protein A). Interestingly, we find that co-selection for both affinity and stability mutations is critical for maintaining thermodynamic stability during affinity maturation of antibody variable (VH) domains (15,17). Moreover, we observe that stable VH domains evolved against the Aβ peptide accumulate several arginine mutations in the CDRs, which are important for binding to the negatively charged Aβ peptide (pI ~5).
The accumulation of positively charged mutations, especially arginine mutations, in the CDRs of antibodies during affinity maturation raises concerns about specificity. Arginine is a highly interactive amino acid that can participate in several different types of interactions (cation-π, hydrogen bonding, and van der Waals) in addition to electrostatic interactions. However, therapeutic antibodies with high specificity also commonly contain one or more arginine residues in their CDRs (18-20), and these arginine residues in some cases contribute significantly to binding affinity (21-24). This suggests that the location and context of arginine CDR mutations are critical, yet poorly understood, determinants of antibody specificity.
To better understand how arginine mutations in the CDRs influence antibody affinity/specificity trade-offs, we have performed selections of single-chain antibody fragments (scFvs) against Aβ42 in different solution environments that possess unique abilities to block non-specific interactions. Based on our previous findings for related Aβ VH domains (25,26) and the strong preference for Aβ to interact with arginine-rich peptides and proteins (27-31), we reasoned that the isolated antibodies would be enriched in arginine mutations in the CDRs using both weak and stringent selections for specificity. However, we posited that the unique specificities of the selected antibodies would provide important details about how arginine mutations in the CDRs impact antibody specificity. Moreover, we hypothesized that affinity/specificity trade-offs for antibodies with low specificity would be due to over-reliance on arginine for binding. Here, we evaluate these hypotheses for Aβ scFvs isolated via weak and stringent selections for antibody specificity using both experimental and computational methods.

Isolation of Aβ scFvs enriched in arginine CDR mutations with low specificity
To generate a lead antibody against Aβ42 that could be subjected to affinity maturation using directed evolution methods, we first grafted Aβ residues 33-42 into heavy chain CDR3 (HCDR3) of an scFv scaffold based on the variable domains of trastuzumab (4D5, Fig. 1 and supplemental Fig. S1). The rationale of this approach is that homotypic interactions between the grafted Aβ peptide and the same peptide segment within Aβ42 will mediate self-recognition. The wild-type grafted scFv (referred to herein as WT) displayed high stability (apparent Tm = 69.2 ± 0.1 °C relative to 72.2 ± 1.3 °C for 4D5) and expression in bacteria (purification yield of 3.6 ± 0.6 mg/liter relative to 15.6 ± 1.1 mg/liter for 4D5) even though HCDR3 of the wild-type grafted antibody is longer (14 residues relative to 11 for 4D5).
These results suggest that the HCDR3 sequence modestly impacts scFv stability and that focused mutagenesis in this CDR for affinity maturation may have relatively minor impacts on stability. To further test this hypothesis, we inserted three randomized residues at each edge of the grafted Aβ peptide in HCDR3 using degenerate codons. Sequencing analysis confirmed that ~66% of the mutated scFvs only contained mutations in HCDR3. We expressed 12 of these scFvs in bacteria and found that most of them (67%) displayed purification yields of >1 mg/liter (Fig. 2 and supplemental Fig. S2) and similar secondary structures compared with 4D5 and WT (supplemental Fig. S3). Notably, the scFv purification yields were well correlated with solubility predictions based only on the physicochemical properties of HCDR3 (R² = 0.76) (32). Moreover, the stabilities of the best expressing scFvs (purification yields >1 mg/liter) were relatively high (average apparent Tm of 66.9 ± 2.9 °C) and comparable with wild type (apparent Tm of 69.2 ± 0.1 °C; Fig. 3 and supplemental Fig. S4). These results collectively suggest that variants in the scFv library display high stability and expression levels that are generally predictable based on their physicochemical properties.
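As a rough illustration of the library size implied by this design, the sketch below counts the theoretical diversity of six fully randomized positions (three at each edge of the grafted peptide). The text only says "degenerate codons"; the NNK scheme used here is an assumption chosen for illustration, and the function names are hypothetical.

```python
# Theoretical diversity of a library with six randomized HCDR3 positions.
# Assumption: each randomized position is encoded by an NNK codon
# (N = A/C/G/T, K = G/T). The source does not name the codon scheme.

N_POSITIONS = 6  # three randomized residues at each edge of the graft


def protein_diversity(n_positions, n_amino_acids=20):
    """Number of distinct amino acid sequences at the randomized sites."""
    return n_amino_acids ** n_positions


def nnk_dna_diversity(n_positions):
    """Number of distinct DNA sequences if every site is an NNK codon."""
    codons_per_site = 4 * 4 * 2  # N x N x K = 32 codons
    return codons_per_site ** n_positions


print(protein_diversity(N_POSITIONS))  # 64000000
print(nnk_dna_diversity(N_POSITIONS))  # 1073741824
```

Under these assumptions the protein-level diversity (6.4 × 10^7) is within the range that yeast surface display libraries can typically sample, which is one reason to restrict randomization to a small number of sites.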
We next displayed the scFv library on the surface of Saccharomyces cerevisiae and performed sorting analysis to isolate variants with enhanced affinity. We reasoned that the solution environment suggested for antibody sorting using conventional yeast surface display (PBS and 1 mg/ml BSA) may not be stringent enough to block non-specific interactions and that more stringent conditions (e.g. PBS, 5 mg/ml BSA, and 0.1% v/v Triton X-100; PBS-BX) would be necessary due to the hydrophobic nature of Aβ42. Therefore, we performed two rounds of MACS and two rounds of FACS at 250-1000 nM Aβ42 in PBS-BX, which led to strong enrichment of scFv-displaying cells that bound Aβ (supplemental Fig. S5). The enriched library was sequenced to identify the scFvs with enhanced antigen binding (supplemental Fig. S6). Most of the sequenced scFvs were full-length (83%) and unique (90%). Notably, all of them contained at least one arginine mutation in HCDR3 and some contained up to four arginine mutations. Moreover, they all bound to Aβ better than wild type (supplemental Fig. S6). One of the best binding variants (A10) contained the mutations Ala-Arg-Pro and Arg-Arg-Gly at the N and C terminus of the grafted Aβ peptide in HCDR3, respectively.

Figure 1. A, the third complementarity-determining region of the antibody heavy chain (HCDR3) is grafted with Aβ residues 33-42 (underlined) and defined according to Kabat numbering. The edges of the grafted segment contain three randomized residues (denoted as "X") along with residues from the parent antibody (shown in red for HCDR3 and gray for the framework). B, structure of the variable domains of the light (VL) and heavy (VH) chains of 4D5 (PDB code 1FVC) with HCDR3 highlighted in red. In this work, these variable domains are connected via a peptide linker (residues SPNSASHSGSAPNTSSAPGSQ) to form the scFv 4D5 (VL-linker-VH).

Effect of arginine mutations on antibody specificity
We next evaluated the antigen-binding properties of the A10 variant in more detail on the surface of yeast (Fig. 4). Interestingly, A10 binding to Aβ failed to saturate at high (micromolar) antigen concentrations and was poorly fit using a three-parameter Langmuir isotherm to determine the KD value (Fig. 4 and supplemental Fig. S7). However, a four-parameter model that included a linear term to describe non-specific interactions fit the binding curve well (KD of 132 ± 11 nM). The lack of saturation at high antigen concentrations may reveal that A10 has a poorly formed binding pocket that is unable to prevent bound Aβ from interacting with additional Aβ monomers. In support of this hypothesis, we find that A10 is unable to bind Aβ in a complex environment (PBS, 1 mg/ml BSA and 1% milk, data not shown), revealing that A10 has low specificity.
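The difference between the two binding models can be sketched numerically. The example below is an illustration only: the exact parameterization used by the authors is not written out in the text, so the model forms, the function names, and the synthetic data (generated with made-up parameters) are all assumptions. It uses a simple grid scan over KD with linear least squares for the remaining parameters rather than any particular fitting package.

```python
# Three- vs four-parameter binding fits in pure Python (no SciPy needed).
# Assumed model forms (the text does not write them out):
#   3-parameter: y = b + a * c / (KD + c)
#   4-parameter: y = b + a * c / (KD + c) + m * c   (m * c: non-specific term)


def solve_linear(A, y):
    """Solve A x = y for a small square system by Gaussian elimination."""
    n = len(y)
    M = [A[i][:] + [y[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_binding(conc, signal, linear_term):
    """Scan KD on a log grid; at each KD the other parameters are linear."""
    best = None
    for step in range(401):  # KD from 1 to 10,000 (same units as conc)
        kd = 10 ** (step / 100.0)
        rows = [[1.0, c / (kd + c)] + ([c] if linear_term else [])
                for c in conc]
        p = len(rows[0])
        # Normal equations for the linear parameters at this KD.
        ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
               for i in range(p)]
        aty = [sum(r[i] * s for r, s in zip(rows, signal)) for i in range(p)]
        coef = solve_linear(ata, aty)
        sse = sum((sum(w * v for w, v in zip(coef, r)) - s) ** 2
                  for r, s in zip(rows, signal))
        if best is None or sse < best[0]:
            best = (sse, kd, coef)
    return best  # (sum of squared errors, KD, [b, a] or [b, a, m])


# Synthetic, noise-free curve that does not saturate (made-up parameters).
conc_nm = [10, 30, 100, 300, 1000, 3000, 10000]
signal = [0.6 * c / (132 + c) + 4e-5 * c for c in conc_nm]

sse3, kd3, _ = fit_binding(conc_nm, signal, linear_term=False)
sse4, kd4, _ = fit_binding(conc_nm, signal, linear_term=True)
print(f"3-parameter: KD ~ {kd3:.0f} nM, SSE = {sse3:.2e}")
print(f"4-parameter: KD ~ {kd4:.0f} nM, SSE = {sse4:.2e}")
```

For a curve that keeps rising at high antigen concentration, the three-parameter model is forced to absorb the linear component into its KD and amplitude, whereas the four-parameter model separates the saturable and non-saturable contributions.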
One explanation for the low specificity of A10 is that it is over-reliant on arginine CDR mutations for antigen binding. To test this hypothesis, we performed alanine-scanning mutagenesis for each of the A10 mutations and evaluated their impact on affinity (Fig. 5). Mutating either of the two consecutive arginines (Arg-100i and Arg-100j) to alanine resulted in a significant reduction in affinity (p values <0.007). Moreover, mutating the other arginine (Arg-97) to alanine also decreased affinity (p value of 0.01), which was similar to the impact of the G100kA mutation (p value of 0.03). We also observed that arginine-to-lysine mutations reduced affinity (p values <0.02), revealing that positive charge is not sufficient to achieve full binding activity and that the arginine side chains play a key role in mediating A10 antigen binding. These findings collectively demonstrate that A10 affinity is strongly dependent on its arginine CDR mutations, which appears to explain its low binding specificity.

Figure 4. The flow cytometry binding analysis was performed using yeast displaying the A10 scFv, and the solution conditions were PBS, 5 mg/ml BSA, and 0.1% Triton X-100. Aβ42 binding was normalized by the maximum signal. The binding data were fit using three (black) and four (red) parameter fits, the latter of which contained a term that was linearly dependent on antigen concentration to describe non-specific interactions. The data are shown for linear (top) and logarithmic (bottom) scales, and each point is the average of two independent experiments.

Stringent selection of scFvs with arginine CDR mutations and increased specificity
The fact that the selected scFvs such as A10 possessed low specificity for Aβ in complex environments (e.g. PBS, 1 mg/ml BSA, and 1% milk) suggested that the solution environment used for sorting (PBS, 5 mg/ml BSA, and 0.1% v/v Triton X-100) was not stringent enough to block non-specific interactions. Thus, we repeated the scFv selections in the presence of a PBS solution containing 1% milk and other additives (5 mg/ml BSA and 0.1% v/v Triton X-100, PBS-BXM). We performed three MACS sorts and one FACS sort to isolate scFvs with enhanced affinity and specificity for Aβ (supplemental Fig. S8).
Sequencing analysis of the enriched scFv library revealed six unique, full-length scFvs (supplemental Fig. S9). The six mutated HCDR3 positions possessed a similar range of numbers of arginine mutations (zero to four arginines per HCDR3) as the antibody variants isolated in the absence of milk (supplemental Fig. S6). However, the average arginine content for the six mutated positions was reduced for antibodies isolated using the more stringent sorting method (22% arginine) relative to those isolated using the less stringent method (50% arginine).
One of the antibodies (variant 2, herein referred to as B2) was observed most often (50%). B2 is also notable because it contains four mutations in the grafted Aβ peptide in addition to the six mutations at the edges of the grafted peptide (HCDR3 sequence WKPIGLMGGRRGIALSSMDY, mutations italicized). These mutations include a pair of arginine mutations at positions 100d and 100e.
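The Kabat positions quoted for B2 can be reproduced from its HCDR3 sequence with a short sketch. It assumes that the quoted sequence spans Kabat positions 95-102 with all insertions placed after position 100, and it uses a crude charge count (Lys/Arg as +1, Asp/Glu as -1, His neutral); both the function names and this simplified charge model are illustrative assumptions.

```python
# Label HCDR3 residues with Kabat positions and count charged residues.
# Assumptions: the loop spans Kabat positions 95-102, all insertions fall
# after position 100 (100a, 100b, ...), and His is treated as neutral.
from string import ascii_lowercase

B2_HCDR3 = "WKPIGLMGGRRGIALSSMDY"  # sequence quoted in the text


def kabat_labels(seq):
    """Return labels 95..100, 100a.., 101, 102 for an HCDR3 sequence."""
    n_insert = len(seq) - 8  # 8 = positions 95-100 plus 101 and 102
    if not 0 <= n_insert <= 26:
        raise ValueError("sequence length not supported by this sketch")
    labels = [str(p) for p in range(95, 101)]
    labels += ["100" + ascii_lowercase[i] for i in range(n_insert)]
    labels += ["101", "102"]
    return labels


def net_charge(seq):
    """Crude net charge: +1 per Lys/Arg, -1 per Asp/Glu."""
    return sum(r in "KR" for r in seq) - sum(r in "DE" for r in seq)


positions = dict(zip(kabat_labels(B2_HCDR3), B2_HCDR3))
arginines = [label for label, res in positions.items() if res == "R"]
print(arginines)             # ['100d', '100e']
print(positions["96"])       # K
print(net_charge(B2_HCDR3))  # 2
```

With these assumptions the labels agree with the residues named in the text for B2 (Lys-96, Arg-100d, and Arg-100e).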
The ability of the six unique scFvs to bind Aβ was first tested on the surface of yeast (supplemental Fig. S9). All of the enriched variants bound to Aβ better than wild type. The B2 scFv was one of the best binding variants. Interestingly, this variant displayed Aβ binding that saturated at high antigen concentration (Fig. 6 and supplemental Fig. S10), in contrast to the non-saturating behavior observed for A10 (Fig. 4 and supplemental Fig. S7). Fitting the binding data for B2 to either three- or four-parameter fits yielded similar values of KD (84 ± 32 and 65 ± 9 nM for three- and four-parameter fits, respectively; Fig. 6 and supplemental Fig. S10). We also tested whether the saturable binding characteristics of B2 are indicative of a better formed binding pocket that would correspond to increased binding specificity in complex solution environments (such as those containing milk). Indeed, we find that B2 retains binding activity in a PBS solution with 1% milk, whereas most of the binding activity for A10 is lost in the presence of 1% milk (Fig. 7). This suggests that sorting in the presence of complex solutions such as milk favors selection of antibody variants with increased specificity.

Figure 5. Alanine and lysine substitution mutations were generated for A10 mutations in HCDR3. Values of the equilibrium association constant (KA) were measured using yeast surface display (PBS, 5 mg/ml BSA, and 0.1% Triton X-100) and a high-throughput flow cytometry method. The KA value for the wild-type antibody (A10) measured using the high-throughput method performed in microtiter plates (KA of 0.29 ± 0.10 × 10^7 M^-1) was modestly lower than the value obtained using a lower throughput method in tubes (KA of 0.43 ± 0.01 × 10^7 M^-1). Error bars represent standard deviations for three to eight independent experiments. The HCDR3 sequence is defined using Kabat numbering. The KA values were obtained using a three-parameter binding model. A two-tailed Student's t test was used to determine statistical significance (p values <0.05 (*) or <0.01 (**)).

Nevertheless, the increased specificity of the B2 scFv is unexpected because its HCDR3 is enriched in positive charge (Lys-96, Arg-100d, and Arg-100e) in a similar manner as that observed for A10 (Arg-97, Arg-100i, and Arg-100j), including a pair of arginines in the middle of HCDR3. We first evaluated whether the fact that B2 has one fewer arginine than A10 was an important determinant of its increased specificity given that arginine residues are known to contribute to non-specific antibody interactions (33-35). However, mutating the lysine in B2 (Lys-96) to arginine resulted in similar binding as B2 in a complex environment (1% milk) and much better binding than A10 (supplemental Fig. S11). This reveals that the number of arginines does not explain the difference in specificity between A10 and B2.
To better understand the importance of the arginine mutations in B2 relative to the other HCDR3 mutations, we performed alanine-scanning mutagenesis to identify the most important residues involved in Aβ binding (Fig. 8). Surprisingly, mutating either of the arginines in the middle of HCDR3 (Arg-100d and Arg-100e) had little impact on binding (p values >0.2) relative to the significant impact of mutating the A10 arginines to alanine (p values <0.01; Fig. 5). Moreover, mutation of 10 of the 15 B2 residues in HCDR3 to alanine resulted in a significant reduction in affinity (p values <0.05; Fig. 8), revealing that the B2 binding mechanism involves a distributed network of interactions. We also found that B2 affinity was similar after mutating the pair of arginines (Arg-100d and Arg-100e) to lysines (p value of 0.09).
One possible mechanism for the specificity of B2 is that its binding pocket may involve the light chain even though we only mutated the heavy chain. To test this hypothesis, we generated a VH-only version of B2 and evaluated its affinity (supplemental Fig. S12). Interestingly, the B2 affinity was primarily due to the VH domain, as the version with only the B2 VH domain displayed similar affinity (KD of 44 ± 2 nM) as the entire B2 scFv (KD of 84 ± 32 nM).
We also investigated the biochemical properties of the A10 and B2 scFvs in more detail by producing them as antibody fragments in bacteria and purifying them via metal affinity chromatography. The purification yields of both scFvs (6.2 ± 0.6 mg/liter for A10 and 10.2 ± 0.4 mg/liter for B2) were similar to the average for the initial library variants we analyzed (5.0 ± 4.0 mg/liter), modestly lower than the 4D5 scaffold (15.6 ± 1.1

Figure 8. The analysis was performed as described in Fig. 5. The equilibrium association constant for the wild-type antibody (B2) measured using a high-throughput (flow cytometry) method performed in microtiter plates (KA of 0.45 ± 0.16 × 10^7 M^-1) was modestly lower than the value obtained using a lower throughput (flow cytometry) method in tubes (KA of 1.31 ± 0.42 × 10^7 M^-1). A two-tailed Student's t test was used to determine statistical significance (p values <0.05 (*) or <0.01 (**)).
Despite the similar biochemical properties of the A10 and B2 scFvs, we find that their ability to specifically recognize Aβ monomer as purified antibody fragments is significantly different (Fig. 9C). Both scFvs bind to immobilized Aβ in conventional buffers (PBS, 5 mg/ml BSA and 0.1% Triton X-100), but only B2 displays binding in complex environments (1% milk, PBS, 5 mg/ml BSA, and 0.1% Triton X-100). Moreover, A10 (100 nM) displays non-specific binding to ovalbumin but not to the hydrophobic peptide islet amyloid polypeptide (IAPP), whereas B2 (100 nM) fails to interact with either.
We further characterized the affinity of purified A10 and B2 scFvs by evaluating antibody binding to Aβ42 immobilized on magnetic beads in the presence and absence of milk (Fig. 9D). B2 displays a modest reduction in affinity in the presence of milk (KD of 49 ± 2 nM relative to 12 ± 3 nM in the absence of milk). In contrast, A10 displays a large reduction in affinity in the presence of milk (KD >1 µM relative to 23 ± 0.1 nM in the absence of milk). We also produced scFv-Fc fusions for A10 and B2 to examine whether the specificities of these antibodies are similar in bivalent formats. Indeed, we observed similar trends in specificity for monovalent (scFv) and bivalent (scFv-Fc) antibodies (supplemental Fig. S14).
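To put the milk-induced affinity losses on a common scale, the KD values reported above can be converted to fold changes and to apparent binding free energy penalties. This is a back-of-the-envelope sketch: it reads the A10 bound in milk as 1 µM (1000 nM), so the A10 numbers are lower bounds, and the temperature of 298.15 K is an assumption rather than a reported experimental condition.

```python
# Fold change in KD and the corresponding free energy penalty,
# ddG = R * T * ln(KD_with_milk / KD_without_milk).
# Assumptions: T = 298.15 K; the A10 value in milk is only a lower bound.
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)


def fold_change(kd_with_milk_nm, kd_without_milk_nm):
    """Ratio of KD values; > 1 means weaker binding in milk."""
    return kd_with_milk_nm / kd_without_milk_nm


def ddg_kj_per_mol(fold, temperature_k=298.15):
    """Apparent free energy penalty for a given fold change in KD."""
    return R_KJ_PER_MOL_K * temperature_k * math.log(fold)


b2_fold = fold_change(49.0, 12.0)        # KD values quoted for B2
a10_fold_min = fold_change(1000.0, 23.0)  # lower bound for A10

print(f"B2:  {b2_fold:.1f}-fold, ddG ~ {ddg_kj_per_mol(b2_fold):.1f} kJ/mol")
print(f"A10: >{a10_fold_min:.0f}-fold, "
      f"ddG > {ddg_kj_per_mol(a10_fold_min):.1f} kJ/mol")
```

On this scale B2 loses roughly fourfold in affinity, whereas A10 loses more than 40-fold under the same conditions.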
We also evaluated the Aβ epitope recognized by the B2 scFv using immunoblotting analysis. B2 bound to Aβ(9-42) and Aβ(1-42) peptides but failed to recognize Aβ(1-40) and Aβ(1-28) (supplemental Fig. S15). This demonstrates that B2 recognizes an epitope that includes the C terminus of Aβ42. More generally, these results demonstrate that B2 displays significant binding specificity even though its HCDR3 is enriched in positively charged residues.

Figure 9 (caption partially recovered). C, scFv (100 nM) binding to immobilized antigens in a buffer solution (PBS, 5 mg/ml BSA, 0.1% Triton X-100, 0.02% sodium azide) either without or with 1% milk. The error bars represent standard deviations for two independent experiments. Normalized binding signals were calculated as signal minus background divided by background, and background binding was measured for each scFv without immobilized antigen. D, scFv binding to Aβ42 immobilized on magnetic beads in buffer (PBS, 1 mg/ml BSA; closed symbols) or buffer supplemented with 1% milk (open symbols). Two independent experiments were performed, and one representative experiment is shown.

Computational models of A10 and B2 antibodies highlight unique presentations of arginine HCDR3 mutations
Our analysis suggested that the location and context of the arginine mutations in the A10 and B2 antibodies significantly altered the resulting binding specificity. To better understand the structural origins of this behavior, we used a combination of homology modeling and molecular simulations to generate models of each antibody variant. Starting from the 4D5 crystal structure, we generated initial structures of both Fvs and performed three independent molecular dynamics simulations (each for 60 ns; supplemental Fig. S16). After an initial equilibration of each trajectory (10 ns), we used a greedy-type clustering algorithm (36) with a cutoff of 0.2 nm to identify the most dominant structure for each antibody and each simulation run (supplemental Fig. S17). As expected, we find that HCDR3 is more flexible for both antibodies relative to the more rigid antibody framework (supplemental Fig. S18). A closer examination of the HCDR3 structures reveals that the three B2 simulations yielded similar antibody structures, whereas two of the three A10 simulations yielded similar structures. We used clustering analysis of the different simulations to identify the most dominant structure for each antibody (Fig. 10) (36). Notably, the location of the pair of arginine mutations in A10 is near the base of HCDR3, whereas the arginine mutations in B2 are located in the middle of the loop.
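The clustering step can be illustrated with a small neighbor-count scheme of the kind commonly used for MD trajectories. The text only specifies a "greedy-type clustering algorithm" with a 0.2 nm cutoff; the specific procedure below (pick the frame with the most neighbors within the cutoff, remove it and its neighbors, repeat) is an assumption about what that means, and the toy distances are made up.

```python
# Greedy neighbor-count clustering on a pairwise distance matrix.
# Assumption: the frame with the most neighbors within the cutoff becomes
# a cluster center, its neighbors are removed, and the process repeats.


def greedy_cluster(dist, cutoff):
    """Return [(center_index, member_indices), ...], largest cluster first."""
    remaining = set(range(len(dist)))
    clusters = []
    while remaining:
        best_center, best_members = None, []
        for i in sorted(remaining):
            # Each frame counts as its own neighbor (distance 0).
            members = [j for j in sorted(remaining) if dist[i][j] <= cutoff]
            if len(members) > len(best_members):
                best_center, best_members = i, members
        clusters.append((best_center, best_members))
        remaining -= set(best_members)
    return clusters


# Toy example: 1-D "coordinates" stand in for pairwise RMSD values (nm).
coords = [0.00, 0.05, 0.10, 0.50, 0.55, 1.20]
dist = [[abs(a - b) for b in coords] for a in coords]

clusters = greedy_cluster(dist, cutoff=0.2)
print(clusters)  # [(0, [0, 1, 2]), (3, [3, 4]), (5, [5])]
```

In this scheme the center of the first (largest) cluster plays the role of the "most dominant structure" for a trajectory; ties are broken here by taking the lowest frame index, which is an arbitrary choice.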
We also investigated how the arginine mutations in A10 and B2 influence the hydrophobicity of HCDR3 (Fig. 11). To accomplish this, we used a method (INDUS) that evaluates protein hydrophobicity by calculating the free energy of removing water molecules from regions near the protein surface (37). This is motivated by findings that the free energy of cavity formation serves as a measure of hydrophobicity for nanoscopic rugged protein surfaces (38 -43). Specifically, the free energy of cavity formation is reduced near hydrophobic surfaces and gradually increases as the surface becomes more hydrophilic.
The resulting excess free energy distributions calculated using INDUS for the entire Fv as well as for their HCDR3 loops are summarized in supplemental Fig. S19. The highest probability of each distribution corresponds to positive free energy values, which is consistent with our general observations that these antibodies are largely hydrophilic and display low levels of aggregation (Fig. 9B). The free energy distributions for the HCDR3 loops are more complex and contain regions with both low and high hydrophobicity (supplemental Fig. S19).
We next used these free energy calculations to highlight the most hydrophobic atoms in HCDR3 of A10 and B2 (Fig. 11). The hydrophobic regions (excess free energies less than −5 kJ/mol) in HCDR3 are highlighted in red and the arginine pairs are highlighted in yellow in Fig. 11. It is notable that B2 contains more hydrophobic HCDR3 sites (11 residues) than A10 (6 residues). Moreover, the most hydrophobic HCDR3 sites in B2 overlap with its arginine mutations. In fact, two of the most hydrophobic HCDR3 sites in B2 are the arginine mutations. In contrast, the most hydrophobic regions in the A10 HCDR3 are well separated from the arginine mutations. It is also interesting that two of the five most hydrophobic regions in HCDR3 of A10 (residues Gly-100d and Val-100e) correspond to the hydrophobic arginine residues in B2 (residues Arg-100d and Arg-100e). Moreover, one of the arginine mutations in A10 (Arg-100i) with low hydrophobicity corresponds to a leucine residue in B2 (Leu-100i) with high hydrophobicity. These findings are generally consistent with the fact that B2, which has more hydrophobic HCDR3 sites than A10, is less dependent on any single HCDR3 residue for binding (Fig. 8) than A10 (Fig. 5). Moreover, the fact that the arginine mutations are located in the most hydrophobic regions of HCDR3 in B2 (but not for A10) may be linked to the increased binding specificity of B2.
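The classification described above amounts to thresholding per-site excess free energies and checking which hydrophobic sites coincide with the arginine mutations. The sketch below shows that bookkeeping only; the free energy values are entirely hypothetical placeholders (the computed values are not listed in the text), and the function names are illustrative.

```python
# Flag hydrophobic HCDR3 sites using the cutoff quoted in the text and
# check their overlap with the arginine positions.
# NOTE: the free energy values below are hypothetical placeholders, not
# results from the simulations.

HYDROPHOBIC_CUTOFF_KJ = -5.0  # excess free energy threshold (kJ/mol)


def hydrophobic_sites(excess_free_energy):
    """Return site labels whose excess free energy is below the cutoff."""
    return [label for label, g in excess_free_energy.items()
            if g < HYDROPHOBIC_CUTOFF_KJ]


def overlap(sites, residues_of_interest):
    """Sites that are both hydrophobic and in the residue set of interest."""
    return sorted(set(sites) & set(residues_of_interest))


# Hypothetical per-site values (kJ/mol) for a few Kabat positions.
example_values = {"100c": 1.2, "100d": -7.5, "100e": -6.1,
                  "100i": -5.4, "101": 3.0}
arginine_positions = ["100d", "100e"]  # B2 arginine pair named in the text

sites = hydrophobic_sites(example_values)
print(sites)                               # ['100d', '100e', '100i']
print(overlap(sites, arginine_positions))  # ['100d', '100e']
```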

Discussion
Our studies demonstrate strong enrichment of arginine mutations in the CDRs of antibody fragments during in vitro affinity maturation. It is notable that these findings share similarities with natural antibody evolution. Arginine is present at low levels in the CDRs of germ line antibodies relative to other much more common residues such as tyrosine, serine, and glycine (34,44,45). However, arginine is one of the most enriched amino acids during affinity maturation. Indeed, analysis of CDRs in diverse antibodies reveals that arginine residues can make key contributions to binding (21-24). It is also notable that high-throughput sequencing of naive and antigen-experienced B-cells has revealed that positively charged HCDR3 loops with similar theoretical net charges as observed in this work (net charge of +2 for A10 and B2) occur in natural antibody repertoires, and the fraction of such antibodies is enriched after maturation (46). More generally, arginine residues are common in hot spots that govern protein-protein recognition (47-50).
The important role that arginine plays in molecular recognition is likely due to the fact that it is an unusually interactive amino acid. Arginine has the ability to interact via many different types of interactions, including electrostatic, hydrogen bonding (up to five hydrogen bonds), hydrophobic (three methylene carbons), and pseudo-aromatic (electron delocalization of guanidinium moiety) interactions (47,51,52). Moreover, the large size of the arginine side chain results in a greater contribution to the buried surface area for antibody-antigen interfaces even when it is present at the same levels as other amino acids with smaller side chains (48).
The fact that arginine is a highly interactive amino acid can also lead to non-specific antibody binding. High levels of arginine in synthetic antibody CDRs have been linked to non-specific interactions (33,34). Moreover, HCDR3 loops in polyspecific antibodies during early B-cell development are often rich in positive charge (35). Our findings, especially for A10 and related scFv variants selected using non-stringent selections, are consistent with these findings.
However, our findings also highlight that the context of arginine CDR mutations is a key determinant of the resulting antibody specificity. The increased specificity of B2 relative to A10 appears to be linked to the reduced reliance on arginine mutations for binding as well as the increased role of other non-arginine HCDR3 residues. Our computational results also reveal more hydrophobic sites in HCDR3 of B2 relative to A10, suggesting that the binding interface may be more distributed across this important binding loop for B2. Moreover, these findings suggest that the arginine mutations in B2 (but not A10) are some of the most hydrophobic sites in HCDR3. This may indicate that the specific types of interactions mediated by the pair of arginines in B2 HCDR3 are different from those for A10, and this may be linked to increased antibody specificity.
Our findings also have practical applications in terms of improving the selection of specific antibodies. We performed direct positive selections in a complex environment to screen for antibodies with specific antigen binding. This solution environment was unusually stringent compared with previously reported conditions for conventional yeast surface display (2,53). The most similar types of positive selections reported previously in complex environments were in the presence of various types of surfactant solutions for selection of antibodies against membrane proteins (54-56). Nevertheless, we find that the use of elevated concentrations (0.1%) of surfactants such as Triton X-100 is insufficient to block non-specific interactions and that addition of complex mixtures such as 1% milk is necessary to block diverse types of non-specific interactions.
Our library design strategy also deserves further consideration. We used a hybrid approach of rational design and directed evolution. This involved grafting a self-recognition peptide from A␤42 in HCDR3 flanked by randomized residues for enhanced affinity. This design strategy was inspired by previous work to design integrin-specific antibodies (57). Other previous studies have shown that it is easier to identify mutations at sites that are peripheral to the known binding interface to increase binding rather than altering the known binding interface itself (58 -63). Randomizing only the edges of the grafted segment is a more conservative approach because it does not disrupt the initial binding interface. A potential weakness of this strategy is that the mutations are positioned at the edges of HCDR3 and are less accessible for interaction compared with those in the center of HCDR3. Indeed, our identification of B2 with mutations in the grafted peptide segment suggests that future efforts should focus mutagenesis of grafted antibodies throughout HCDR3.
Our strategy of focusing sequence variation in HCDR3 yielded scFv variants with desirable biophysical properties.

Effect of arginine mutations on antibody specificity
Two-thirds of the initial library variants that we evaluated expressed well, and purification yields were generally predictable based on the HCDR3 sequence. Additionally, the average apparent melting temperatures of the initial library and selected variants (A10 and B2) were greater than 65°C. The lack of trade-offs between affinity and stability is notable because such trade-offs have been reported both for antibodies and other affinity proteins (15, 17, 64-69). Our approach is attractive for initial antibody discovery, and additional sequence variation in HCDR3 as well as other CDRs can be introduced during further affinity maturation.
A long-term goal of in vitro antibody discovery is to obtain antibodies with a collection of properties (e.g. affinity, specificity, and stability) that rival those of natural antibodies. This work has revealed potential trade-offs between antibody affinity and specificity that can occur during in vitro antibody selection. It will be important in the future to incorporate stringent selections combined with improved library design methods to limit arginine content at early stages of antibody discovery to improve selection of highly specific antibody variants, which we are currently pursuing in our laboratory.

Library construction and cloning
The parent scFv gene (4D5) was created via a custom gene block (gBlock, Integrated DNA Technologies). The amino acid sequences of the VL and VH domains (supplemental Fig. S1) were obtained from the PDB (1FVC). The amino acid sequence of the peptide linker (SPNSASHSGSAPNTSSAPGSQ) was chosen for its non-repetitive nature (70). Linear epitope tags for detection (Myc) and purification (heptahistidine) were added to the C terminus of the VH domain. The gene block was flanked with N-terminal NheI and C-terminal XhoI restriction sites as well as 45 bp of homology at each end with complementary sites in the yeast display plasmid (pCTCON2). The gene block was amplified (Phusion High-Fidelity DNA polymerase, M0530L, New England Biolabs) with terminal primers that were complementary to the gene block. Next, the PCR-amplified fragment was ligated into the yeast display plasmid (after the original plasmid was digested with NheI and XhoI) via homologous recombination. Plasmids were isolated from yeast (D2004, Zymoprep II yeast miniprep kit, Zymo Research) and transformed into XL1-Blue (200228, Agilent Technologies) via electroporation. Plasmids from individual bacterial colonies were then isolated (QIAprep Spin miniprep kit, 27106, Qiagen) and sequenced.
A subset of the scFv genes isolated in yeast display plasmids were transferred to a bacterial expression plasmid (pET-17b, Novagen) for soluble scFv expression. The scFv genes were amplified using primers with flanking HindIII (N terminus) and KpnI (C terminus) restriction sites. The PCR products were then digested and ligated into the bacterial expression vector that contained an N-terminal pelB sequence for periplasmic secretion and C-terminal tags (3xFLAG and heptahistidine tags) for detection and purification. Single alanine mutations were generated via site-directed mutagenesis using PfuUltra II (600850, Agilent Technologies).
The A10 and B2 scFv genes were also transferred to an Fc fusion mammalian expression plasmid (pBIOCAM5, Addgene) for soluble scFv-Fc expression. The scFv genes were amplified using primers with flanking NcoI (N terminus) and NotI (C terminus) restriction sites. The PCR products were then digested and ligated into the mammalian expression vector that contained a C-terminal human Fc region (scFv-Fc) and multiple peptide tags (3xFLAG and hexahistidine tags) for detection and purification.
The yeast libraries were first sorted via magnetic-activated cell sorting (MACS). Induced yeast (10⁹ cells) were collected and washed twice with 25 ml of PBS with 5 mg/ml BSA and 0.1% v/v Triton X-100 (PBS-BX). Cells were then resuspended in 10 ml of 1 µM biotinylated Aβ42 peptide (62-0-15B, American Peptide Co.) in either PBS-BX ("A" sorting series) or PBS-BX with 1% w/v milk (nonfat dry milk, PBS-BXM; "B" sorting series) and rocked at room temperature for 2 h. After incubation, cells were washed once with PBS-BX (50 ml) and resuspended in PBS-BX (5 ml), and 200 µl of streptavidin microbeads (130-048-102, Miltenyi Biotec) were added. The mixture was incubated on ice for 10 min. The yeast cells were then pelleted and resuspended in PBS-BX (50 ml). The cell suspension was then passed through a MACS separation column (130-042-401, Miltenyi Biotec) that was connected to a MidiMACS separator magnet (130-042-302, Miltenyi Biotec) and attached to a MACS MultiStand (130-042-303, Miltenyi Biotec). In this step, cells bound to the magnetic particles were retained by the MACS separation column. The bound cells were then eluted by removing the column from the magnetic source and flowing SDCAA (7 ml) through the column. The collected cells were then grown in SDCAA at 30°C overnight with agitation and subjected to additional rounds of sorting.
For FACS and flow cytometry analysis, yeast cells (10⁷ cells) were pelleted and washed twice with 1 ml of either PBS with 1 mg/ml BSA (PBS-B) or PBS-BX. The cells were then resuspended in 200 µl of biotinylated antigen with 1 µl of anti-c-Myc chicken IgY antibody (A-21281, Life Technologies, Inc.) and allowed to incubate for 1.5 h (25°C and 850 rpm). For FACS, the buffer used during antigen binding was PBS-BX for the A series or PBS-BXM for the B series. After incubation, the cells were pelleted and washed once with 1 ml of the corresponding buffer (PBS-B or PBS-BX). Cells were then resuspended in the same buffer (200 µl) with 100-fold diluted secondary reagents (Alexa Fluor 488-conjugated goat anti-chicken IgG, A-11039, and Alexa Fluor 647-conjugated streptavidin, S-32357; Life Technologies, Inc.) and allowed to incubate for 5 min. The cells were then pelleted, washed once with the corresponding buffer (1 ml), and analyzed via flow cytometry (LSR II or FACSAria, BD Biosciences). For all FACS and flow cytometry studies, 10⁵ events were recorded and adjusted for Alexa Fluor 488 and Alexa Fluor 647 cross-signals using appropriate compensation controls. Individual MACS screening rounds were also visualized in this manner. Finally, the enriched yeast libraries were miniprepped (Zymoprep II yeast miniprep kit) and transformed into electrocompetent bacteria (XL1-Blue), and plasmids were isolated (QIAprep Spin miniprep kit) and sequenced.

Bacterial expression and purification
Bacterial expression (pET-17b) plasmids were transformed into BL21(DE3)pLysS bacteria (200132, Agilent Technologies). The cultures were grown for 48 h (30°C) in auto-induction media (200 ml) supplemented with antibiotics (100 µg/ml ampicillin and 35 µg/ml chloramphenicol) (71). The bacterial cells were then pelleted and discarded, and the supernatants were incubated with 2 ml of Ni-NTA beads (30230, Qiagen) overnight at 4°C (80 rpm). The beads were then collected and washed with 150 ml of PBS with 20 mM imidazole (pH 7.4), and the protein was eluted with 3 ml of PBS with 500 mM imidazole (pH 7.4). Eluted proteins were centrifuged (21,000 × g) for 5 min and filtered (0.22-µm filters, SLGV013SL, Millipore). Imidazole was removed via buffer exchange (Zeba spin desalting columns, 89893, Thermo Fisher Scientific) against PBS (pH 7.4). The scFvs were then refolded via buffer exchange into PBS with 6 M guanidinium chloride (pH 7.4), incubated overnight at 4°C, and buffer exchanged into PBS (pH 7.4). The scFv concentrations were measured via UV absorbance at 280 nm (extinction coefficients were calculated based on the amino acid sequences). The purity of the scFvs was evaluated using SDS-PAGE analysis (WG1203BOX, Life Technologies, Inc.), and the gels were stained using Coomassie dye (24615, Thermo Fisher Scientific).
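The absorbance-based concentration measurement can be illustrated with a short sketch. The source does not state how the extinction coefficients were calculated from sequence, so the sketch assumes the commonly used per-residue values at 280 nm (5500 M⁻¹ cm⁻¹ per tryptophan, 1490 per tyrosine, and 125 per disulfide bond) and a 1-cm path length. The sequence and absorbance reading are hypothetical placeholders, not data from this work.

```python
# Sketch only: the per-residue coefficients are the commonly used values for
# 280 nm (an assumption; the source does not specify the calculation method).
EPS_TRP = 5500.0      # M^-1 cm^-1 per tryptophan
EPS_TYR = 1490.0      # M^-1 cm^-1 per tyrosine
EPS_CYSTINE = 125.0   # M^-1 cm^-1 per disulfide bond


def extinction_coefficient(sequence: str, n_disulfides: int = 0) -> float:
    """Estimate the molar extinction coefficient at 280 nm (M^-1 cm^-1)."""
    seq = sequence.upper()
    return (EPS_TRP * seq.count("W")
            + EPS_TYR * seq.count("Y")
            + EPS_CYSTINE * n_disulfides)


def molar_concentration(a280: float, epsilon: float, path_cm: float = 1.0) -> float:
    """Beer-Lambert law: c = A / (epsilon * l), returned in mol/l."""
    if epsilon <= 0:
        raise ValueError("sequence has no residues that absorb at 280 nm")
    return a280 / (epsilon * path_cm)


if __name__ == "__main__":
    toy_sequence = "EVQLWYSGGWYC"  # hypothetical fragment, not a real scFv
    eps = extinction_coefficient(toy_sequence, n_disulfides=0)
    conc = molar_concentration(a280=0.35, epsilon=eps)
    print(f"epsilon = {eps:.0f} M^-1 cm^-1, concentration = {conc * 1e6:.1f} uM")
```

For a real scFv, the full amino acid sequence and the expected number of disulfide bonds would replace the placeholder inputs.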

Mammalian expression and purification
Mammalian expression plasmids (pBIOCAM5) were transfected into adherent HEK293T cells using Lipofectamine 2000 (11668027, Thermo Fisher Scientific), and the cells were cultured in DMEM supplemented with 10% fetal bovine serum and 1% penicillin/streptomycin. After 36 h of transfection, the cells were passaged and cultured for 3 days. Following this initial growth phase, the media were removed and stored at 4°C. Fresh growth medium (DMEM + 10% fetal bovine serum + 1% penicillin/streptomycin) was then added, and the cells were cultured for 2 days at high cell density. This medium was then combined with the original growth media and centrifuged at 2500 × g for 5 min to remove cellular debris.
scFv-Fc fusions were purified in two steps. First, Ni-NTA beads (4 ml) were added to the media containing the secreted scFv-Fc fusions and incubated overnight at 4°C with end-over-end mixing. The beads were then collected and washed (200 ml of PBS). The beads were incubated in 3 ml of PBS with 50 mM imidazole (pH 7.4) for 15 min for an additional wash. The proteins were then eluted with 3 ml of 500 mM imidazole (PBS, pH 7.4). Imidazole was removed via buffer exchange into PBS using desalting columns. Second, the imidazole-purified scFv-Fc fusions were incubated with 2 ml of Protein A-agarose (20334, Thermo Fisher Scientific) at 4°C overnight with end-over-end mixing. Protein A resin was washed (100 ml of PBS) and then incubated in 3 ml of 0.1 M glycine (pH 3) for 15 min to elute the proteins. K2HPO4 (300 µl of 1 M) was then added to neutralize the protein solution (pH 7). Finally, the protein solutions were buffer exchanged into PBS using desalting columns, centrifuged (21,000 × g, 5 min), and filtered (0.22-µm filter).

Circular dichroism
Circular dichroism (CD) was performed with a Jasco J-815 CD spectrometer. Far-UV CD spectra were measured for scFvs diluted to 0.16 mg/ml in water (0.03-0.5× PBS final concentration). Spectra were measured in the range of 200-260 nm at increments of 0.5 nm and a scanning rate of 50 nm/min. The final spectra are averages of 20 accumulations. Residue ellipticity values were calculated after background subtraction. Thermal unfolding of scFvs (0.1 mg/ml) was monitored at 235 nm from 25 to 95°C at a heating rate of 1°C/min. For thermal unfolding curves, the fraction folded was calculated as $(\theta_T - \theta_U)/(\theta_F - \theta_U)$ (72), where $\theta_T$ is the ellipticity at a given temperature, and $\theta_F$ and $\theta_U$ are the ellipticities of the folded and unfolded states, respectively, at that temperature, obtained by fitting the folded and unfolded regions to separate linear models. The reported melting temperatures ($T_m^*$) are apparent values because the scFvs aggregated at high temperature.
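The fraction-folded calculation can be sketched in a few lines. This is a minimal illustration rather than the analysis code used in this work: the baseline temperature windows (`folded_max_t`, `unfolded_min_t`), the helper names, and the synthetic two-state melting curve are all assumptions introduced for the example. Linear baselines are fit to the low- and high-temperature regions, extrapolated to every temperature, and used to convert ellipticity into fraction folded.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fraction_folded(temps, thetas, folded_max_t, unfolded_min_t):
    """Fraction folded = (theta_T - theta_U) / (theta_F - theta_U).

    The folded baseline is fit to points at or below folded_max_t and the
    unfolded baseline to points at or above unfolded_min_t (assumed windows).
    """
    folded = [(t, y) for t, y in zip(temps, thetas) if t <= folded_max_t]
    unfolded = [(t, y) for t, y in zip(temps, thetas) if t >= unfolded_min_t]
    m_f, b_f = fit_line(*zip(*folded))
    m_u, b_u = fit_line(*zip(*unfolded))
    fractions = []
    for t, theta in zip(temps, thetas):
        theta_f = m_f * t + b_f   # folded baseline at this temperature
        theta_u = m_u * t + b_u   # unfolded baseline at this temperature
        fractions.append((theta - theta_u) / (theta_f - theta_u))
    return fractions


def apparent_tm(temps, fractions):
    """Temperature where fraction folded first drops below 0.5 (interpolated)."""
    for i in range(1, len(temps)):
        f0, f1 = fractions[i - 1], fractions[i]
        if f0 >= 0.5 > f1:
            return temps[i - 1] + (f0 - 0.5) * (temps[i] - temps[i - 1]) / (f0 - f1)
    return None


def synthetic_melt(tm=70.0, width=2.0):
    """Hypothetical two-state melt from 25 to 95 C with sloping baselines."""
    temps = list(range(25, 96))
    thetas = []
    for t in temps:
        f = 1.0 / (1.0 + math.exp((t - tm) / width))
        theta_f = -10.0 + 0.01 * t
        theta_u = -2.0 + 0.005 * t
        thetas.append(f * theta_f + (1.0 - f) * theta_u)
    return temps, thetas


if __name__ == "__main__":
    temps, thetas = synthetic_melt()
    ff = fraction_folded(temps, thetas, folded_max_t=45, unfolded_min_t=90)
    print("apparent Tm:", apparent_tm(temps, ff))
```

In practice the baseline windows must be chosen by inspecting each unfolding curve, and aggregation at high temperature can distort the unfolded baseline, which is why the melting temperatures are reported as apparent values.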

Antibody binding analysis
Measurements of equilibrium dissociation constants ($K_D$) were performed for scFvs displayed on the surface of yeast. Briefly, 10⁶ [...] For the alanine and lysine mutational analysis, a closely related methodology was used for measuring antibody affinity that involved preparing the samples in microtiter plates instead of tubes. This led to minor differences in the protocol. These include washing with 300 µl of PBS-BX for all washing steps after the yeast cells were incubated with antigen and using the High Throughput Sampler attachment (BD Biosciences) to analyze the samples via flow cytometry.
The flow cytometry measurements of Aβ42 binding were quantified in terms of the total fluorescence signals of the Alexa Fluor 647 dye in the allophycocyanin (APC) channel. The mean APC values at each Aβ42 concentration were normalized by the highest APC value obtained for a given experiment. The normalized APC binding signals were then fit to a three-parameter binding model, given as $\mathrm{APC} = \mathrm{APC}_{min} + \mathrm{APC}_{sat}[\mathrm{A}\beta 42]/(K_D + [\mathrm{A}\beta 42])$, where $\mathrm{APC}_{min}$ is the minimum APC value, $\mathrm{APC}_{sat}$ is the APC value at saturation, and $[\mathrm{A}\beta 42]$ is the concentration of Aβ42. In some cases, a four-parameter binding model was used that involved an additional linear term, given as $\mathrm{APC} = \mathrm{APC}_{min} + \mathrm{APC}_{sat}[\mathrm{A}\beta 42]/(K_D + [\mathrm{A}\beta 42]) + \mathrm{APC}_{ns}[\mathrm{A}\beta 42]$, where $\mathrm{APC}_{ns}$ is the non-specific binding term. Data were fit to both of these models using the Excel solver tool by minimizing the mean squared error between the data and the models. The reported $K_D$ values are the average of two or more independent experiments.
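The three-parameter fit can be sketched without a spreadsheet solver. The example below is not the Excel procedure used in this work; it swaps in a simple alternative that minimizes the same mean squared error. For a fixed dissociation constant the model is linear in the minimum and saturation signals, so those two parameters are solved exactly while the dissociation constant is scanned on a log-spaced grid. The function names, the grid resolution, and the example titration are assumptions.

```python
def normalize(signal):
    """Normalize signals by the highest value in the experiment."""
    top = max(signal)
    return [s / top for s in signal]


def fit_binding(conc, signal, n_grid=400):
    """Fit signal = s_min + s_sat * c / (K_D + c) by least squares.

    K_D is scanned over a log-spaced grid between the lowest positive and the
    highest concentration; s_min and s_sat come from linear regression.
    """
    lo = min(c for c in conc if c > 0)
    hi = max(conc)
    n = len(conc)
    best = None
    for i in range(n_grid + 1):
        kd = lo * (hi / lo) ** (i / n_grid)
        x = [c / (kd + c) for c in conc]          # fractional occupancy
        mean_x = sum(x) / n
        mean_y = sum(signal) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, signal))
        s_sat = sxy / sxx
        s_min = mean_y - s_sat * mean_x
        mse = sum((s_min + s_sat * xi - yi) ** 2
                  for xi, yi in zip(x, signal)) / n
        if best is None or mse < best["mse"]:
            best = {"mse": mse, "kd": kd, "s_min": s_min, "s_sat": s_sat}
    return best


if __name__ == "__main__":
    # Hypothetical titration (nM) generated from K_D = 50 nM.
    conc = [1, 3, 10, 30, 100, 300, 1000]
    raw = [0.05 + 0.9 * c / (50 + c) for c in conc]
    fit = fit_binding(conc, normalize(raw))
    print(f"K_D ~ {fit['kd']:.1f} nM, MSE = {fit['mse']:.2e}")
```

The fitted dissociation constant is only as fine as the grid spacing, so a denser grid or a local refinement step would be needed for higher precision.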
Equilibrium dissociation constants for purified scFv and scFv-Fc antibodies were measured using Aβ42 immobilized on magnetic beads. To immobilize Aβ42 on magnetic beads, 20 million streptavidin-coated Dynabeads (2.8 µm, Dynabeads Biotin Binder, 11047, Thermo Fisher Scientific) were washed twice with PBS-B (1 ml) using a magnet (DYNAL MPC-S, Thermo Fisher Scientific). Next, the washed beads were resuspended in 800 µl of 2.5 µM biotinylated Aβ42 peptide (PBS-B) and allowed to bind overnight (4°C). The next day, the Aβ-immobilized beads were washed twice with PBS-B and resuspended to a concentration of 100,000 beads/µl. Purified scFv or scFv-Fc (200 µl) was prepared at various concentrations (0.4-1000 nM) in PBS-B or PBS-B with 1% milk and added to microtiter plates. Washed Aβ-immobilized beads (5 µl, 500,000 beads) were added to each well, and the antibody/antigen mixtures were allowed to equilibrate for 3 h at room temperature. The beads were sedimented (2500 × g for 2 min) and washed once (300 µl of PBS-B). Next, 100 µl of 1000× diluted mouse anti-FLAG M2 IgG (F1804, Sigma) in PBS-B was added and allowed to bind for 1 h to the 3xFLAG tag present on both scFvs and scFv-Fc fusions. The plate was then washed again, and 100 µl of 100× diluted Alexa Fluor 488-conjugated goat anti-mouse IgG (A-11001, Life Technologies, Inc.) was added (5 min). Finally, the plate was washed one final time with PBS-B before analysis via flow cytometry using the High Throughput Sampler attachment. Binding was quantified using the fluorescein (FITC) channel. The mean FITC values at each antibody concentration were fit using a three-parameter binding model.
A well plate binding assay was also used to evaluate the isolated scFvs as purified proteins. scFv binding was evaluated using immobilized Aβ42, IAPP, and ovalbumin (A5503, Sigma). Disaggregated Aβ42 was prepared by incubating the lyophilized, non-biotinylated peptide (62-0-80, American Peptide) in hexafluoroisopropanol overnight (1 g/liter). The hexafluoroisopropanol was then evaporated, and the Aβ42 peptide was dissolved in 50 mM NaOH at 1 g/liter Aβ42 and centrifuged at 208,000 × g for 1 h at 4°C. The supernatant (top 80%) was removed and passed through a 0.22-µm filter (SLGV004SL, Millipore). The peptide was neutralized by adding acidified PBS to achieve a final pH of 7.4 and a concentration of 25 µM Aβ42. Disaggregated IAPP was a kind gift from Daniel Raleigh. Lyophilized IAPP was dissolved in 20 mM Tris (pH 7.4) to achieve a final concentration of 32 µM.
Each peptide or protein (100 µl per well) was immobilized at 1 µM (room temperature) in MaxiSorp Nunc-Immuno 96-well plates (442404, Thermo Fisher Scientific). Afterward, the well plates were blocked with 10% w/v milk in PBS for 8 h and subsequently washed with PBS. scFvs (0.1-1000 nM) were then diluted into PBS-BX or PBS-BXM supplemented with 0.02% w/v NaN3, added to the well plates, and allowed to incubate overnight at room temperature. The well plates were then washed once with PBS and incubated with 100 µl of 1000× diluted mouse anti-FLAG M2 IgG (F1804, Sigma) in PBS with 0.1% v/v Tween 20 (PBST). After washing with PBS, the well plates were incubated with 100 µl of 1000× diluted horseradish peroxidase-conjugated goat anti-mouse IgG (32430, Thermo Fisher Scientific) in PBST. Finally, the well plates were washed with PBS, and peroxidase activity was determined by adding substrate (1-Step Ultra TMB-ELISA, 34028, Thermo Fisher Scientific), and the reaction was quenched after 10 min with 100 µl of 2 M H2SO4. The absorbance values at 450 nm were measured to quantify the signals using a Tecan Safire 2 plate reader. Normalized binding signals were calculated as signal minus background divided by background, where the background is the absorbance for a given concentration of scFv and secondary reagents binding to wells without immobilized antigen.
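The normalization described above reduces to one line of arithmetic per scFv concentration. The following sketch shows it for a hypothetical set of absorbance readings; the function name and all numerical values are illustrative assumptions rather than measurements from this work.

```python
def normalized_signal(signal_abs: float, background_abs: float) -> float:
    """Return (signal - background) / background for one scFv concentration.

    background_abs is the absorbance of the same scFv concentration and
    secondary reagents in wells without immobilized antigen.
    """
    if background_abs <= 0:
        raise ValueError("background absorbance must be positive")
    return (signal_abs - background_abs) / background_abs


if __name__ == "__main__":
    # Hypothetical A450 readings keyed by scFv concentration (nM).
    antigen_wells = {1: 0.12, 10: 0.35, 100: 0.90, 1000: 1.40}
    blank_wells = {1: 0.10, 10: 0.10, 100: 0.12, 1000: 0.20}
    for conc in sorted(antigen_wells):
        value = normalized_signal(antigen_wells[conc], blank_wells[conc])
        print(f"{conc:>5} nM: normalized signal = {value:.2f}")
```

Because the background is matched at each scFv concentration, this ratio discounts any increase in non-specific binding to the blocked plate at high antibody concentrations.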

Size-exclusion chromatography
scFvs were evaluated via size-exclusion chromatography. Ten µg of each scFv was injected into an analytical TSKgel G3000SWXL column (0.78 × 30 cm, 08541, Tosoh Bioscience) and evaluated using a Waters 600 high-performance liquid chromatography system. The mobile phase (0.5 ml/min) consisted of PBS with 0.2 M arginine (pH 7.4). The elution of scFvs was detected via absorbance measurements at 280 nm using a Waters 2487 series UV absorbance detector.

Intrinsic solubility score
The intrinsic solubility scores were calculated using CamSol Intrinsic (32). The amino acid sequences for residues 95-102 (Kabat numbering) in HCDR3 of each scFv were evaluated via the on-line program to calculate the solubility scores.
For the loading control, peptides were immobilized on nitrocellulose membranes and washed thoroughly with water. Stock solutions of 40% (w/v) sodium citrate (S467-3, Fisher Chemical), 20% (w/v) ferrous sulfate (215422, Sigma), and 20% (w/v) silver nitrate (209139, Sigma) were prepared. A colloidal stain was prepared by mixing 90 parts water, 5 parts sodium citrate, 4 parts ferrous sulfate, and 1 part silver nitrate (by volume). The blots were incubated in the stain for 5 h and then rinsed with water to remove excess stain.

Modeling of scFv structures
The crystal structure for the 4D5 variable fragment (Fv) was obtained from the Protein Data Bank (1FVC (73)). The homology models for the A10 and B2 Fvs were generated using the crystal structure of 4D5 Fv as the template except for HCDR3. The HCDR3 loops were modeled using the corresponding loops in the PDB entries 4LKC (A10) and 3C2A (B2) as templates. The homology modeling calculations used the Antibody Modeler module in MOE2015.10 (www.chemcomp.com) (88). The AMBER10:EHT force field in Molecular Operating Environment (MOE) was used along with an internal dielectric constant of 4 and an external dielectric constant of 80. For each antibody, 625 intermediate models (25 backbone models and 25 side chain models per backbone model) were generated, optimized via energy minimization to root mean square gradient values <0.01, and scored according to generalized Born/volume integral solvation energy. The final reported model for each antibody corresponds to the one with the best generalized Born/volume integral solvation energy. In addition, the C termini of the light and heavy chains were neutralized via amidation, and the final models were further energy-minimized to root mean square gradient values of <0.0001.
The Fv structures were solvated with ~16,000 water molecules in a periodic cubic box (~8 × 8 × 8 nm³) and then relaxed via molecular dynamics simulations using GROMACS (74). The antibody and water molecules were explicitly represented using the AMBER99SB force field (75) and TIP3P model (76), respectively. Electroneutrality was maintained via the addition of chloride and potassium ions. Cross-interaction parameters were determined using the Lorentz-Berthelot mixing rules (77), and electrostatic interactions were calculated using the particle mesh Ewald algorithm (78). Bonds involving hydrogen atoms were constrained using the LINCS algorithm (79).
Next, three independent simulations with different initial velocities were performed for each of the A10 and B2 Fvs. Each production run (60 ns) was performed using the isothermal-isobaric (NPT) ensemble. The system temperature and pressure were maintained at 300 K and 1 atm using the Nosé-Hoover thermostat (80, 81) and Parrinello-Rahman barostat (82, 83), respectively. Time steps of 2 fs were used for each simulation, and configurations were stored every 10 ps for analysis.
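The run settings above imply fixed step and frame counts, which are useful when preparing simulation inputs or checking trajectory sizes. The sketch below derives them with integer arithmetic in femtoseconds; the function name is hypothetical, and the frame count excludes the initial configuration.

```python
def run_bookkeeping(length_ns: int, timestep_fs: int, save_every_ps: int):
    """Return (total steps, steps between saved frames, saved frames)."""
    length_fs = length_ns * 1_000_000   # 1 ns = 1,000,000 fs
    save_fs = save_every_ps * 1_000     # 1 ps = 1,000 fs
    n_steps = length_fs // timestep_fs
    steps_per_frame = save_fs // timestep_fs
    n_frames = n_steps // steps_per_frame
    return n_steps, steps_per_frame, n_frames


if __name__ == "__main__":
    steps, stride, frames = run_bookkeeping(length_ns=60, timestep_fs=2,
                                            save_every_ps=10)
    n_runs = 3 * 2  # three independent runs for each of two Fvs
    print(f"{steps:,} steps per run, one frame every {stride:,} steps")
    print(f"{frames:,} frames per run, {frames * n_runs:,} frames in total")
    print(f"aggregate sampling: {60 * n_runs} ns")
```

With the values stated in the text, each 60-ns run corresponds to 30 million integration steps and 6,000 stored configurations.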

Computational analysis of antibody surface hydrophobicity
The surface hydrophobicities of A10 and B2 were evaluated using the INDirect Umbrella Sampling (INDUS) method (37) to estimate the free energies of cavity formation near antibody surfaces. First, the instantaneous antibody-water interfaces (84) of the dominant structures of A10 and B2 were identified based on their heavy atoms, and then benzene-shaped probe volumes were used to cover their entire hydration shell. The values used for the effective cavity radii for the carbon and hydrogen atoms were 3.11 and 2.6 Å, respectively, based on the Weeks-Chandler-Andersen prescription (85,86) and the AMBER force field Lennard-Jones parameters (87). Each probe volume was then gradually dehydrated with the application of a series of external biasing potentials (0, 2.5, 5.0, 7.5, 10.0, and 12.5 kJ/mol) that were linearly coupled to the number of water molecules in the probe volumes.
Each simulation was performed for 1.2 ns, and data were stored every 0.1 ps for analysis. The first 200 ps of each simulation allowed for equilibration of the number of water molecules in the probe volume, and this initial period was excluded from further analysis. The next 1 ns of each simulation was used to calculate free energies of cavity formation (37) as shown in Equation 1, where $\Delta\mu_v^{ex}$ is the free energy required to empty a probe of a specific shape and size $v$, $N_v$ is the number of water molecules in the probe volume $v$, and $\phi$ is the external potential energy used to dehydrate the probe volume. The free energy of forming the benzene-shaped cavity in bulk water was also calculated in a similar manner to provide a reference value. We calculated the excess free energy of probe cavity formation, defined as the difference between the cavity formation free energy near the antibody surface and that in bulk water, as an estimate of the surface hydrophobicity, as shown in Equation 2.
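The bookkeeping in this analysis can be sketched as follows. This is an illustration of the equilibration cutoff and of the surface-minus-bulk difference only; it does not implement the INDUS free energy estimator itself. The function names, the placeholder water counts, and the free energy values are hypothetical.

```python
def discard_equilibration(samples, stride_ps: float, equil_ps: float):
    """Drop the samples collected during the initial equilibration period."""
    n_skip = round(equil_ps / stride_ps)
    return samples[n_skip:]


def excess_cavity_free_energy(surface_kj_mol: float, bulk_kj_mol: float) -> float:
    """Cavity formation free energy near the surface minus that in bulk water."""
    return surface_kj_mol - bulk_kj_mol


if __name__ == "__main__":
    # A 1.2-ns window sampled every 0.1 ps gives 12,000 stored values.
    water_counts = [5] * 12_000  # placeholder for N_v in one biased window
    production = discard_equilibration(water_counts, stride_ps=0.1, equil_ps=200)
    print(len(production), "samples kept (1 ns of production data)")

    # Hypothetical free energies (kJ/mol) for one probe site and for bulk water.
    print(excess_cavity_free_energy(surface_kj_mol=20.0, bulk_kj_mol=25.0))
```

Under the usual interpretation of such calculations, a more negative excess value means that a cavity forms more easily near that site than in bulk water, which is consistent with a more hydrophobic local environment.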