The Protein-Protein Interface Evolution Acts in a Similar Way to Antibody Affinity Maturation


Protein-mediated interactions organize the macromolecular complexes and networks responsible for regulation and complexity in biological systems. Understanding the evolutionary mechanism that acts at the interfaces of protein-protein complexes is therefore fundamental to delineating these complexes and networks. Affinity maturation of antibodies is unique in being the only evolutionary mechanism known to operate on a molecule in an organism's own body (1). It is interesting to ask whether distinct protein-protein interfaces, evolving under selection pressure to maintain interactions, use the same basic strategy as an antibody response to a protein antigen during affinity maturation. Unfortunately, archaeological records for tracing the evolutionary pathway of specific protein-protein interfaces are unavailable. Tools to rationally alter and manipulate protein interactions offer great promise for understanding and delineating protein-protein interface evolution (2). Recent advances in computational science have led to sophisticated and refined computational methods that have addressed several problems related to the design of improved protein-protein binding, such as the design of stable protein folds (3), altered enzymatic activity (4), and altered protein-protein association rates (5). However, because of the limits of conformational search and inaccuracies in the treatment of polar interactions in the energy function, the design of improved binding affinity has met with limited success (6,7).
Previous investigations have extensively studied the evolution of the antibody/antigen interface during affinity maturation. Recently, Li et al. (1) provided the first visualization of the maturation of antibodies to a protein antigen. By directly comparing the structures of four antibodies bound to the same site on hen egg white lysozyme (HEL) at different stages of affinity maturation, they revealed that antibody affinity maturation is the result of small structural changes, mostly confined to the periphery of the antibody-combining site. Moreover, comparing germline and mature sequences in a structural region-dependent fashion gives insight into the methods that nature uses to mature antibodies (Abs) during the somatic hypermutation process. Tomlinson et al. (8) previously analyzed the diversity of amino acids at specific positions in germline and mature Ab sequences. They found that the frequency of somatic hypermutation and the diversity of the germline sequences are highest in the CDRs. Rather than focusing on mutation frequencies, Clark et al. (9) examined the type of mutation and its functional implications deduced from its location in the structure. Their results indicated that residue type changes during the somatic hypermutation process were significant and had underlying functional rationales.
In the present study, several strategies that combine the evolutionary information derived from in vivo antibody affinity maturation with classical simulation techniques were used to investigate whether the evolution of protein-protein interfaces acts in a similar way to antibody affinity maturation. If the same evolutionary mechanism is used in all protein-protein interfaces, antibody evolutionary information should improve the prediction success rate of the classical simulation method in affinity enhancement of other protein-protein complexes. Our design strategies were evaluated in four different types of protein-protein complexes. It was interesting to find that, even in protein-protein complexes other than antibody-antigen complexes, one of the strategies yielded exceptionally high success rates (>57%) for single mutations from wild type. We further investigated the position of the affinity-improving mutations in the coding sequences of the antibodies and other proteins. Our data suggest that the evolution of distinct protein-protein interfaces may use the same basic mechanism under selection pressure to maintain interactions. The present study also demonstrates the generality of our design strategy and suggests that it may be used to accurately predict affinity improvement of any protein.

EXPERIMENTAL PROCEDURES
Protein Simulation-Crystal structures of the target proteins complexed with their respective binding partners were from the Protein Data Bank. Most crystallographic water molecules and ions were removed, except for water molecules bridging the binding interface or buried away from bulk solvent. Hydrogen atom positions were assigned using the Biopolymer module of Insight II (Accelrys). The computational mutation was carried out on the target protein. Docking was performed using MCSA for random generation of a maximum of sixty structures through the Affinity module of Insight II (CVFF force field) (33). The resulting set of structures was evaluated for total energy and how close each was to the crystal structure based on a heavy-atom RMSD of the binding partner critical amino acids for interaction. Then the lowest energy complexes presenting lower RMSD were selected for the binding free energy calculations. Briefly, molecular dynamics (MD) simulations were done using the CHARMM program (34) with the PARAM22 all-atom parameter set (35) to obtain a stable MD trajectory for each of the simulated structures. Finally, the binding free energy was calculated using molecular mechanics Poisson-Boltzmann surface area (MM-PBSA) method (36). The detailed procedure can be seen under supplemental methods.
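For orientation, the quantity ranked in the MM-PBSA step can be written in its standard textbook form. This is a sketch, not a statement of the exact protocol used here, which is described only in the supplemental methods. The symbols below (and in particular whether the entropy term $-TS$ was evaluated) are assumptions. Each free energy is taken as an average over snapshots of the MD trajectory.

```latex
\begin{align}
\Delta G_{\mathrm{bind}} &= \langle G_{\mathrm{complex}} \rangle - \langle G_{\mathrm{receptor}} \rangle - \langle G_{\mathrm{ligand}} \rangle, \\
G &= E_{\mathrm{MM}} + G_{\mathrm{PB}} + G_{\mathrm{SA}} - TS, \\
\Delta\Delta G_{\mathrm{bind}} &= \Delta G_{\mathrm{bind}}^{\mathrm{mutant}} - \Delta G_{\mathrm{bind}}^{\mathrm{wild\ type}}.
\end{align}
```

Here $E_{\mathrm{MM}}$ is the molecular mechanics energy, $G_{\mathrm{PB}}$ the Poisson-Boltzmann polar solvation term, and $G_{\mathrm{SA}}$ the surface area (nonpolar) term. Under this convention a negative $\Delta\Delta G_{\mathrm{bind}}$ corresponds to a mutation predicted to bind more tightly than wild type.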
Methods-Mutations were introduced into the wild-type antibody or receptor-Fc fusion protein genes by the overlapping PCR method. The antibody heavy and light chain variable region genes were fused in-frame to the human γ-1 and κ constant region genes, respectively. The extracellular domain of the receptor was genetically fused to the 5′ terminus of the human IgG1 Fc. Antibody and receptor-Fc fusion protein genes were cloned into mammalian expression vectors and transiently expressed in COS-7 cells (ATCC), followed by protein A purification (Amersham Biosciences). The binding activity of antibodies and receptor-Fc fusion proteins was determined by flow cytometry. Their binding affinity constant (K_D) was measured by ELISA or Biacore analysis. Full experimental methods and any associated references are provided under supplemental methods.

RESULTS
Capturing the Key Evolutionary Information of the Antibody Binding Site during Affinity Maturation-The trends in antibody sequence changes during the somatic hypermutation process were systematically studied by Clark et al. (9). They illuminated the strategies that nature uses to bias immature Ab properties and subsequently refine them during the affinity maturation process. The puzzling finding in their studies was that there was a substantial relative increase in the usage of some amino acids at the antibody-antigen interface during affinity maturation, although most of these amino acids still had a significantly low usage in the binding site of the matured antibody (9,10). This leads us to ask why those amino acids are markedly increased during affinity maturation by a net conversion from other types but still have a low usage at the combining site of the matured antibody, and whether these residue types are more important for molecular recognition than those residue types that have a high tendency to be present at the binding site of the affinity-matured antibody.
To investigate these issues and to obtain the critical evolutionary information of the antibody combining site during affinity maturation, we defined three groups of amino acids for our studies. The first group included ten amino acids that contribute dominantly (more than 5% contributions to either the residue composition of the matured antibody interface or the surface residue composition) to the residue composition of the mature antibody combining site (9). The second group comprised the ten amino acids with the highest net increase at the antibody-antigen interface during affinity maturation (9), and the third group contained ten amino acids randomly selected from the twenty common amino acids based on the completely randomized design (Table 1) (11). As shown in Table 1, four amino acids (Glu, Arg, Pro, and Thr) are found in both the first and second groups. In addition, x-ray snapshots of the maturation of an antibody response to a protein antigen provided valuable clues to the evolution of high affinity in other protein-protein interfaces (1). Those results clearly indicated that the binding of protein-specific antibodies was improved through small structural changes at the periphery of the antibody combining site. Indeed, somatic hypermutation has been found to spread structural diversity generated by V-D-J recombination from central to peripheral regions of the antibody binding site (8). Double mutant cycle analysis of hydrogen bonds between residues located at the periphery of protein-protein interfaces has shown that they usually make little or no net contribution to complex stabilization, presumably because the strength of these solvated interactions is comparable to that of the water-protein hydrogen bonds they replace (12).
First, we attempted to redesign the humanized antibody trastuzumab for improved binding to its antigen, human epidermal growth factor receptor 2 (HER2). Trastuzumab, a therapeutic agent for breast cancer (13), has a picomolar affinity, which makes further affinity improvement a great challenge. Single mutations at each of 60 CDR positions to each of the 20 common side chains were designed using a Monte Carlo simulated annealing (MCSA) algorithm and molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) calculations (14). Mutations were ranked by the total calculated binding free energy. The eleven single mutations with the largest magnitude of predicted affinity improvement (supplemental Table S1) were constructed, expressed, and purified as described under "Experimental Procedures." The binding affinity (K_d) of the trastuzumab mutants for the extracellular domain of HER2 (HER2 ECD) was determined by ELISA. The predictions yielded a very low success rate: only two of the eleven mutations (18.2%) improved affinity (supplemental Table S1).
In our second attempt, we combined the evolutionary information derived from antibody affinity maturation with the computational method described above to improve the antibody binding affinity, and evaluated whether the evolutionary information could significantly improve the prediction success rate of the computational method. As shown in Table 2, three different design strategies were employed. For each strategy, the ten single-point mutants with the largest magnitude of predicted affinity improvement (supplemental Table S2) were selected for experimental binding affinity measurement. The binding of these trastuzumab mutants to the HER2-overexpressing human breast cancer cell line SK-BR-3 was determined by flow cytometry. The predictions based on the second strategy yielded a very high success rate (60%) for single mutations from wild type (Table 2). Six point mutations (L28DR, L93TY, H55NK, H102DT, H102DY, and H102DK), which span four positions, improved the binding of trastuzumab to SK-BR-3 cells (supplemental Table S2), and the binding curves of representative mutants are shown in Fig. 1A. Next, binding affinity analysis by ELISA showed that six mutants bound more tightly than wild type (supplemental Table S2), which is consistent with the results obtained by flow cytometry. The second strategy thus yielded a much higher prediction success rate than the other two strategies (Table 2 and supplemental Table S2). The second strategy used two selection criteria to define the mutations to be calculated: 1) the mutated position must be at the periphery of the binding interface; 2) the substitute residue must be one of the following amino acids: Glu, Arg, Asn, Pro, Ser, Thr, Tyr, Lys, Asp, or Ala. 
Each of these amino acids has been demonstrated to have over 5% contributions to either the residue composition of the matured antibody interface or the surface residue composition (9, 10). Our results indicated that the combination of these two selection criteria played an important role in improving computational prediction success rate, suggesting that small structural changes at the periphery of the antibody combining site during affinity maturation and those most frequently used amino acids in the binding site of affinity-matured antibody may be the key evolutionary information of antibody interface during affinity maturation.
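The two selection criteria of the second strategy amount to a filter applied before ranking by predicted binding free energy. The following is a minimal sketch of that logic under stated assumptions: the `Candidate` record, its field names, and the `select_strategy2` helper are hypothetical and are not part of the original pipeline, and the sign convention (negative predicted change means tighter binding) is assumed. Only the list of ten substitute residues is taken from the text.

```python
from dataclasses import dataclass

# Substitute residues allowed by the second strategy (from the text).
STRATEGY2_RESIDUES = frozenset(
    {"Glu", "Arg", "Asn", "Pro", "Ser", "Thr", "Tyr", "Lys", "Asp", "Ala"}
)


@dataclass(frozen=True)
class Candidate:
    """One computed single-point mutation (hypothetical record layout)."""
    position: str       # label of the mutated position
    new_residue: str    # three-letter code of the substitute residue
    at_periphery: bool  # True if the position lies at the interface periphery
    ddg: float          # predicted change in binding free energy; negative = tighter


def select_strategy2(candidates, top_n=10):
    """Apply both criteria, then keep the top_n most favorable predictions."""
    kept = [
        c for c in candidates
        if c.at_periphery and c.new_residue in STRATEGY2_RESIDUES
    ]
    # Most negative predicted ddG first (assumed sign convention).
    return sorted(kept, key=lambda c: c.ddg)[:top_n]
```

In this sketch a mutation to a residue outside the list (for example Trp) is discarded even if its predicted improvement is the largest, which is the point of the strategy: the evolutionary prior narrows the search before the energy function ranks what remains.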

Table 2 footnotes: a, Strategy 1: The mutating position was at the periphery of the binding interface, and the substitute residues were amino acids with the highest net increase at the antibody-antigen interface during affinity maturation (Group 2 in Table 1). b, Strategy 2: The mutating position was at the periphery of the binding interface, and the substitute residues were amino acids with a high usage in the binding site of matured antibody (Group 1 in Table 1). c, Strategy 3: The mutating position was at the periphery of the binding interface, and the substitute residues were amino acids randomly selected from the twenty common amino acids based on the completely randomized design (Group 3 in Table 1).

Protein-Protein Interface Evolution Follows a Mechanism Similar to That of Antibody Affinity Maturation-To further assess whether the evolutionary information used in our second strategy could improve the prediction success rate of the computational method in other protein-protein complexes, three additional systems were employed. First, we attempted to improve the binding affinity of rituximab, a chimeric anti-CD20 antibody used therapeutically for B-cell lymphomas, to its antigen peptide. The crystal structure of the rituximab Fab-CD20 epitope peptide complex has recently been determined in our laboratory (15). Based on the second strategy, the experimental binding affinity for the seven predictions of largest magnitude was measured using Biacore SPR technology. Five of the seven mutations improved affinity (Table 2 and supplemental Table S2), with 6.1- and 4.0-fold improvements for the best two single mutations at positions His 57 and His 102, respectively. The predicted structures for the two single mutations are shown in supplemental Fig. S1. To investigate whether protein-protein interface evolution acts in a similar way to antibody affinity maturation, we used the evolutionary information obtained from in vivo antibody affinity maturation to improve the binding of proteins other than antibodies. Cytotoxic T-lymphocyte-associated protein 4 (CTLA4)-Ig (abatacept) is a receptor-Ig fusion protein that binds to CD80 and CD86 to block T cell co-stimulation and has been approved for the treatment of rheumatoid arthritis (16). We redesigned CTLA4-Ig to improve its binding to CD86. The binding affinity was tested experimentally for the ten mutations of largest magnitude of predicted affinity improvement based on each of the three strategies (Table 2). As shown in Table 2 and supplemental Table S3, the second strategy improved the binding affinity of CTLA4 to CD86 with a much higher accuracy (70%) than the other two strategies.
Interleukin-2 (IL-2), one of the first cytokines identified and a member of the four-helix bundle cytokine superfamily, acts at the heart of the immune response (17,18). To increase the binding of interleukin-2 receptor α (IL2Rα) to IL-2, seven single mutations predicted to have the largest magnitude of improvement based on each strategy were measured experimentally for binding affinity. Out of the three strategies (Table 1), the second strategy exhibited a significantly enhanced prediction success rate (Table 2). As shown in supplemental Table S3, four of the seven single mutations based on the second strategy improved affinity, with a 2.7-fold improvement for the best single mutation, S64K (supplemental Fig. S3). Taken together, our data demonstrate that the second strategy could improve the binding affinity in all four protein-protein complexes with a high prediction success rate, suggesting that protein-protein interface evolution may follow a mechanism similar to that of antibody affinity maturation.
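The success rates quoted above follow directly from the counts given in the text, and the arithmetic can be checked with a short sketch. The labels and the `success_rate` helper are illustrative names only. The CTLA4-Ig result is reported in the text as a percentage (70%) rather than as an explicit count, so it is left out of the list.

```python
# Counts as stated in the text: (label, mutations measured, mutations that improved affinity).
REPORTED = [
    ("trastuzumab, energy ranking alone", 11, 2),
    ("trastuzumab, second strategy", 10, 6),
    ("rituximab, second strategy", 7, 5),
    ("IL2R-alpha, second strategy", 7, 4),
]


def success_rate(tested, improved):
    """Percentage of experimentally measured mutations that improved affinity."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return 100.0 * improved / tested


for label, tested, improved in REPORTED:
    print(f"{label}: {improved}/{tested} = {success_rate(tested, improved):.1f}%")
```

The loop prints 18.2%, 60.0%, 71.4%, and 57.1%, so the lowest second-strategy rate (4/7) is the source of the ">57%" figure quoted in the introduction.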
Broad Usage of DNA Hotspot Mechanisms in the Evolution of Protein-Protein Interfaces-High affinity antibodies are generated in mice and humans by somatic hypermutation. It has been confirmed that somatic hypermutation does not occur randomly within immunoglobulin V genes but is preferentially targeted to certain nucleotide positions (hot spots) and away from others (cold spots) (19). This process mainly introduces mutations that are located at or very near (A/G)G(C/T)(A/T) (RGYW) or (A/T)A (WA) sequences (20). To investigate the position of the affinity-improving mutations in the germline antibody sequence, we first used the BLAST algorithm to determine the best germline V, D, and J segment alignments in the IgBLAST directory of mouse immunoglobulin genes. As shown in Fig. 2 and supplemental Fig. S4, most of the affinity-improving point mutations in trastuzumab and rituximab were located at NRGYWN or NWAN sequences in the germline antibody. Intriguingly, all of the affinity-improving mutations were also found at or very near RGYW or WA sequences in the mature antibody (Fig. 2 and supplemental Fig. S4). More than 87% and 100% of affinity-improving point mutations were found at these sequences (NRGYWN or NWAN) in trastuzumab and rituximab, respectively (Table 3). These results agree with the previous report by Ho et al. (21), who introduced random mutations around a few hotspots in the antibody CDRs by PCR and efficiently improved the binding affinity of an affinity-matured antibody.
Surprisingly, we found that most of the affinity-improving mutations in CTLA-4 and CD25 also occurred within RGYW and WA sequences (Fig. 3 and supplemental Fig. S5). The data in Table 3 show that, in CTLA-4 and CD25, the frequencies of affinity-enhancing point mutations occurring within or adjacent to the RGYW or WA sequences were 3/5 and 3/3, respectively. In the sequence of CTLA-4, Leu 106, although not located at or very near RGYW or WA sequences, was found within the sequence TACCTGGGC, which contains the reverse complement (WRCY) of the RGYW motif (Fig. 3). To evaluate whether this striking observation was due to a bias of our design strategy, we further examined two affinity-enhanced proteins previously reported by other teams (6,22). As shown in supplemental Figs. S6 and S7, almost all of the affinity-improving mutations in their results were located at RGYW or WA sequences. Our results suggest that RGYW and WA motifs are not only somatic hypermutation hotspot sequences during antibody affinity maturation but may also act as mutation hotspot sequences that enhance protein-protein interaction during the evolution of protein-protein interfaces.
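The hotspot assignment described above can be sketched as a motif scan over the coding sequence. This is an illustrative reconstruction, not the procedure used for Table 3: the function names, the 0-based offsets, and the choice of a one-nucleotide flank (meant to mirror the NRGYWN and NWAN notation) are assumptions. The IUPAC codes are standard: R = A/G, Y = C/T, W = A/T.

```python
import re

# Hotspot motifs and their reverse complements, written as regex character classes.
MOTIFS = {
    "RGYW": "[AG]G[CT][AT]",
    "WRCY": "[AT][AG]C[CT]",  # reverse complement of RGYW
    "WA": "[AT]A",
    "TW": "T[AT]",            # reverse complement of WA
}


def find_motifs(seq):
    """Return {motif name: [0-based start offsets]} including overlapping matches."""
    seq = seq.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # A zero-width lookahead lets finditer report overlapping occurrences.
        hits[name] = [m.start() for m in re.finditer(f"(?={pattern})", seq)]
    return hits


def near_hotspot(seq, codon_start, motifs=("RGYW", "WA"), flank=1):
    """True if the codon starting at codon_start overlaps a motif hit or lies
    within `flank` nucleotides of one (assumed reading of "at or very near")."""
    hits = find_motifs(seq)
    codon_end = codon_start + 3
    for name in motifs:
        length = len(name)  # each motif name has the same length as the motif
        for start in hits[name]:
            if start - flank < codon_end and codon_start < start + length + flank:
                return True
    return False
```

Applied to the CTLA-4 fragment quoted in the text, `find_motifs("TACCTGGGC")` reports no RGYW match but a WRCY match at offset 0 (TACC), in line with the observation that this site belongs to the reverse-complement motif.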

DISCUSSION
The immune system contains a highly diverse population of antibodies, and each is distinguished by a unique set of CDRs that confer antigen specificity (23). The data base of natural antibody sequences has revealed that, whereas the compiled CDR sequences are highly diverse, there are clear biases for particular amino acids (24). The structural data base further reveals that these biases are even greater when one considers residues that mediate antigen recognition through direct contacts (25). Recently, Clark et al. (9) extensively investigated the trends in antibody sequence during the somatic hypermutation process. They found that residue type changes during the somatic hypermutation process were significant and had underlying functional rationales. We analyzed their data and found that most of the amino acids with the highest net increase at the antibody-antigen interface during affinity maturation had a low usage in the binding site of the matured antibody (Table 1). Our data reveal that amino acids with a high usage in the combining site of the matured antibody could be more important in enhancing protein binding affinity than amino acids with a substantial relative increase during affinity maturation. These results may imply that the amino acids with a marked relative increase during affinity maturation have only recently been introduced into the amino acid repertoire on an evolutionary time scale, and that the use of amino acids at the protein-protein interface may be restricted by their potential. Furthermore, nearly all of the amino acids used in the second strategy, which gave very high prediction success rates, exhibit a high usage in the interfaces of protease-inhibitor complexes or enzyme complexes, as previously reported by Lo et al. (10). In addition, Fellouse et al. (26) have demonstrated that synthetic antibodies built from a four-amino acid code (Tyr, Ala, Asp, and Ser), obtained using phage-displayed antibody libraries with precisely defined and highly restricted diversities, were sufficient for high-affinity antigen recognition. They revealed that the tyrosine side chain was well suited for mediating molecular recognition at protein-protein interfaces, and, as a consequence, the natural antibody repertoire has likely evolved under selective pressure for the enrichment of tyrosine in antigen-binding sites. Intriguingly, all four amino acids in their study are also included in the selected amino acids of the second strategy in our study.
Our data indicate that those amino acids in our second strategy, which are enriched in the protein-protein interface under selective pressure during evolution, may be preferentially employed for molecular recognition.
In immunology, affinity maturation is the process by which B-cells produce antibodies with increased affinity for antigen during the course of an immune response (27). The process is thought to involve two interrelated processes (somatic hypermutation and clonal selection) occurring in the germinal centers of the secondary lymphoid organs. During affinity maturation, two highly mutable nucleotide motifs (RGYW and WA), which were first deduced from a statistical analysis of silent mutations (28), were shown to be hotspots for somatic hypermutation in various organisms (20). In the present study, we found that nearly all of the affinity-improving mutations at the binding interface of affinity-matured antibodies were located at or very near RGYW or WA motifs. In addition, the binding affinity of trastuzumab was substantially enhanced, breaking the affinity ceiling (about 0.1 nM) for antibodies produced by in vivo maturation (29). All these results suggest that the hotspot motifs also play an important role in further improving the binding affinity of antibodies beyond in vivo maturation. One of the most striking findings in our present study is that, not only in the antibody-combining site but also in other protein-protein interfaces, almost all of the affinity-enhancing mutations lie at or very close to a hotspot (RGYW or WA) (Table 3). Our data further indicate that nearly all of the affinity-enhancing point mutations previously reported by other groups (6,22) are located at hotspots. In particular, in their recent study Song et al. (6) used multiple structure-based approaches to design ICAM-1 variants with enhanced affinity for αLβ2. We found that eight of nine affinity-improving mutations in their data occurred within the RGYW or WA sequences (supplemental Fig. S7). 
All of these observations suggest that the mutation hotspot sequences (RGYW or WA) are not only critical for antibody affinity maturation, but may also play an important role in the evolution of protein-protein interface.
Previous work by Wang et al. (30) has demonstrated that activation-induced cytidine deaminase (AID)-mediated mutation requires no Ig gene sequences and that AID and other trans-acting hypermutation factors may function as general mutators. Endo et al. (31) recently revealed that ectopic AID expression serves as a link between the cellular editing machinery and high mutation frequencies, leading to human cancer development. They found that tumor necrosis factor-α induced aberrant AID expression via IκB kinase-dependent nuclear factor (NF)-κB signaling pathways in human colonic epithelial cells. Aberrant activation of AID in colonic cells preferentially induced genetic mutations in the TP53 gene, whereas there were no nucleotide alterations of the APC gene. In addition, substantial studies have demonstrated that the consensus motifs RGYW and WA are universal descriptors of somatic hypermutation, which could function as the targets of AID and other trans-acting hypermutation factors (20,32). All these observations indicate that the nucleotide sequences RGYW and WA in the genome may serve as mutation targets, further supporting the hypothesis that the evolution of protein-protein interfaces may follow a mechanism similar to that of antibody affinity maturation.
In conclusion, classical simulation techniques that incorporate the evolutionary information derived from in vivo antibody affinity maturation can be used as a tool to manipulate protein binding affinity and provide insight into the evolution of high affinity in protein-protein interfaces. Our present results indicate that all proteins, including antibodies, may employ a similar mechanism to achieve selective molecular recognition during evolution.