Antigen specificity and high affinity binding provided by one single loop of a camel single-domain antibody.

Detailed knowledge of antibody-antigen recognition is scarce, given the practically unlimited number of antibody specificities, of which only a few have been investigated at the atomic level. We report the crystal structures of an antibody fragment derived from a camel heavy chain antibody against carbonic anhydrase, free and in complex with antigen. Surprisingly, this single-domain antibody interacts with nanomolar affinity with the antigen through its third hypervariable loop (19 amino acids long), providing a flat interacting surface of 620 Å². For the first time, a single-domain antibody is observed with its first hypervariable loop adopting a type-1 canonical structure. The second hypervariable loop, of unique size due to a somatic mutation, reveals a regular beta-turn. The third hypervariable loop covers the remaining hypervariable loops and the side of the domain that normally interacts with the variable domain of the light chain. Specific amino acid substitutions and reoriented side chains reshape this side of the domain and increase its hydrophilicity. Of interest is the substitution of the conserved Trp-103 by Arg because it opens new perspectives to 'humanize' a camel variable domain of the heavy chain of a heavy chain antibody (VHH) or to 'camelize' a human or a mouse variable domain of the heavy chain of a conventional antibody (VH).

Conventional antibody IgG molecules consist of two light chains, each folded in two domains, and two heavy chains, each folded in four domains (1). Surprisingly, the serum of Camelidae contains in addition a large proportion (~50%) of functional antibodies that are devoid of light chains and whose heavy chains possess only three domains, since the equivalent of the first constant domain is missing (2). The two C-terminal domains of the heavy chain homodimers within camelids and conventional IgG molecules share large sequence identities and are responsible for the effector functions. Also, the N-terminal variable domain of the heavy chain antibodies (referred to as VHH) (3) has an overall sequence and structure that is homologous to the variable domain (VH) of the heavy chain of a classical human antibody (4-8). Important amino acid differences occur between the VH and VHH in their framework 2 region. This region is hydrophilic in VHHs, rendering the domain soluble in aqueous solution, whereas the region is hydrophobic in the VH, and its amino acids associate with the VL. The VHH domain represents the smallest naturally occurring, intact antigen-binding site (9), comprising only one single immunoglobulin domain with three antigen-binding loops (or complementarity determining regions, CDRs). Heavy chain antibodies with high specificity and affinity can be generated against a wide variety of antigens (10). Their VHHs are readily cloned (11,12) and expressed in bacteria and yeast (13) and are extremely stable (14).
In human and mouse, the first two antigen-binding loops of a VH domain, CDR1 and CDR2, can be assigned to a limited number of possible conformations referred to as canonical structures (15-18). The conformation of these loops depends both on their length and on the presence of specific residues at key positions. In contrast, the x-ray structure analysis of four VHH domains showed that their CDR1 and CDR2 deviate significantly from the canonical loop structures observed in human or mouse VHs (19).
The third antigen-binding loop (CDR3) of the VHH fragments is often constrained by an interloop disulfide bond and is, on average, longer than a human or mouse VH-CDR3 loop (4). This allows for a potentially larger antigen-binding surface (20).
About half of the dromedary single-domain binders to enzymes are potent inhibitors (12). This can be explained by their long CDR3 loop inserting into the active site cleft on the enzyme surface, as illustrated by the lysozyme binder cAb-Lys3. In this case, the N-terminal part of the 24-amino acid-long CDR3 loop protrudes from the remaining antigen-binding surface, penetrates deeply into the active site of the enzyme (6), and mimics the lysozyme natural substrate (21). However, several non-inhibiting antibody fragments with a long CDR3 loop were also isolated (12), and these fragments are not expected to interact with the active site of their enzymes. Therefore, it was hypothesized that these non-inhibiting VHH molecules would interact with other clefts present on the protein surface.
Here, we present the crystal structures of a camel VHH fragment, cAb-CA05, both as free antibody and in complex with its antigen. This specific non-inhibiting enzyme binder recognizes the bovine erythrocyte carbonic anhydrase with an affinity of 72 nM (Kd), which is in the same range as other VHHs or single chain variable fragments (11,12). Surprisingly, the structural data reveal for the first time an antibody using only one single loop, its CDR3, to interact directly with the antigen.
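To put the reported Kd of 72 nM in perspective, the following minimal sketch converts a dissociation constant into a fractional occupancy and a standard binding free energy. It assumes simple 1:1 equilibrium binding and a temperature of 298.15 K; neither assumption is stated in the text, and the function names are illustrative only.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fraction_bound(free_partner_molar, kd_molar):
    """Fraction of binding sites occupied for 1:1 binding at equilibrium."""
    return free_partner_molar / (kd_molar + free_partner_molar)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy dG = RT ln(Kd / 1 M), in kcal/mol.

    The temperature default is an assumption, not a value from the paper.
    """
    return R_KCAL * temperature_k * math.log(kd_molar)


KD = 72e-9  # 72 nM, the affinity reported for cAb-CA05

# At a free partner concentration equal to Kd, half of the sites are occupied.
print(fraction_bound(72e-9, KD))
# Standard free energy of binding under the assumptions above.
print(round(binding_free_energy(KD), 1))
```

Under these assumptions the free energy comes out close to -9.7 kcal/mol, a typical value for a nanomolar binder.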

EXPERIMENTAL PROCEDURES
Crystallization and Data Collection-The cAb-CA05 was extracted from Escherichia coli periplasm and purified by chromatography on Ni-NTA (Qiagen) followed by gel filtration on Superdex 75 (Amersham Pharmacia Biotech) (12). The cAb-CA05-carbonic anhydrase complex was prepared by mixing cAb-CA05 in phosphate buffered saline with bovine erythrocyte carbonic anhydrase (Sigma) in a molar ratio of 1.2:1 and applied on Superdex 75 (Amersham Pharmacia Biotech). Crystals from the antigen-free cAb-CA05 (1.7 mg/ml) and from the complex cAb-CA05·carbonic anhydrase (3.5 mg/ml) were grown in 25% (w/v) PEG8000 (Hampton), 0.1 M sodium citrate, pH 5.6, using the hanging drop vapor diffusion method.
A data set to 2.1 Å for cAb-CA05 was collected using a Rigaku RU-H2R rotating anode generator (Kobe, Japan) and a MarResearch image plate (MarResearch, Hamburg, Germany). Data for the antigen-antibody crystal were collected on beam line BW7A at EMBL-Hamburg using a MarResearch image plate. Primary data processing was done with DENZO (22), scaling was done with SCALA, and further processing was done with the CCP4 program suite (23).
Structure Determination and Refinement-The structure of the antigen-free antibody was solved by molecular replacement as implemented in AMORE (24), using the VHH cAb-RN05 (Protein Data Bank entry code 1BZQ) as a search model. The CDR loops were deleted from the search model and rebuilt from scratch. The structure was refined with X-PLOR (25) and REFMAC (26). Possible water positions were identified with ARP (27) and checked manually. Model and structure factors are deposited in the Protein Data Bank, entry 1F2X.
The antigen-antibody complex structure was solved by molecular replacement with the antigen-free cAb-CA05 structure and human carbonic anhydrase structure (1CA2) as search models. For refinement, we used the CNS program (28), and included a simulated-annealing step to reduce possible model bias. As the high resolution limit of the data set is 3.5 Å, a grouped B factor refinement scheme was used (only one B factor for main chain atoms and one B factor for side chain atoms/residue). Model and structure factors are accessible through the Protein Data Bank, entry 1G6V.
Superposition of structures or structure fragments and calculation of rmsd were done with LSQKAB (23). Inter-residue and inter-atom distances were calculated with CONTACT (23), and accessible surface areas were calculated with NACCESS (29). Figures were produced with MOLSCRIPT 2.1 (30) and rendered with RASTER3D (31).
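As an illustration of the rmsd values quoted in this work, the following minimal sketch computes the root-mean-square deviation between two equally sized sets of atomic coordinates. It assumes that the two sets have already been superposed (the least-squares superposition performed by LSQKAB is not reproduced here) and that atoms are listed in the same order; the coordinates shown are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in the units of the input, e.g. Angstrom)
    between two pre-superposed, equally ordered lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical example: every atom is displaced by 0.5 Angstrom (0.3, 0.4, 0.0).
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
moved = [(x + 0.3, y + 0.4, z) for (x, y, z) in reference]
print(round(rmsd(reference, moved), 3))
```

In practice the atom selection (for example, main chain atoms of the framework residues only) determines which deviations enter the sum.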

RESULTS
cAb-CA05 Sequence and General Structure-cAb-CA05 is a VHH antibody fragment of 135 residues (Mr 15,000) that binds specifically to bovine erythrocyte carbonic anhydrase (Kd = 72 nM) but does not inhibit the enzymatic activity of the antigen. It was selected by panning from a VHH library of an immunized dromedary (12). The cAb-CA05 shares a high sequence identity with other VHHs or human VHs of family III (Fig. 1). Nevertheless, the VHH characteristic amino acids in framework 2 (Phe-37, Glu-44, Arg-45, and Gly-47) (4,20) are all present (Kabat amino acid numbering (32)). In addition, residue Trp-103, constitutively conserved in all VHs and interacting with the VL, is substituted by Arg in cAb-CA05. Like most other VHHs from camelids, the cAb-CA05 contains a long CDR3 (19 amino acids) with a cysteine at position 100c expected to form a disulfide bond with Cys-33 located in the CDR1. The CDR2 of cAb-CA05 is unusually short as it contains 15 amino acids instead of a standard length of 16, 17, or 19 amino acids (32).
The cAb-CA05 was crystallized both as free antibody and in complex with its antigen. Antigen-free cAb-CA05 crystallizes in space group P2₁ with cell dimensions of a = 29.98 Å, b = 43.86 Å, c = 87.95 Å, β = 93.23°, and two VHH molecules in the asymmetric unit. The antigen-antibody complexes crystallize in space group P4₁2₁2 with cell dimensions of a = 83.86 Å and c = 224.05 Å and one antigen-antibody complex in the asymmetric unit.
The structure of the antigen-free cAb-CA05 was refined to 2.1 Å resolution (Table I). It adopts the standard fold of an immunoglobulin variable domain with nine conserved antiparallel β-strands (Fig. 2) and three hypervariable regions clustering at one end of the domain (1,33,34). The Cys-22 and Cys-92 are oxidized into an intradomain disulfide bond, conserved in all immunoglobulin domains. Its general structure superimposes very well with a human VH reference structure (1igm) and with all available VHH structures of camel (1mel, 1bzq) and llama (1hcv, 1qdo). Root-mean-square deviations for the main chain atoms of the framework residues (residues 2-24, 32-52, 55-72, 77-92, and 103-112) ranged between 0.58 and 0.88 Å. (35). The L45R and W47G substitutions and the Trp-103 rotated over its Cβ-Cγ bond in cAb-Lys3 (Fig. 3C) make this VHH region more hydrophilic. The V37F mutation fills a hydrophobic pocket created by the side chains of the Trp-103, Tyr-91, and the CDR3, where the conserved Tyr (three amino acids upstream of Trp-103) plays a central role (6). The W103R substitution found here in cAb-CA05 renders this 'former VL side' of the VHH even more hydrophilic. It also allows a shift of the Phe-37 side chain toward the Tyr-91 and Arg-103 (Fig. 3D). As a result, the backbone of the long CDR3 approaches the former VL-side even more closely (Fig. 4D). All these modifications occur in the absence of distortions of the framework structure. In contrast, the partial camelization of a human VH in this area by L45R and W47I substitutions makes the isolated domain more soluble but induces backbone deformations at positions 37-38 and 45-47 (36). In addition, the side chain of Trp-103 takes a completely new position (Fig. 3B).
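The cell dimensions above can be turned into a unit cell volume and a Matthews coefficient (cell volume per dalton of protein) with a few lines of code. The sketch below is an illustration and not part of the original analysis; it uses the general triclinic volume formula, the number of symmetry operators for each space group (2 for P2₁, 8 for P4₁2₁2), and the Mr of 15,000 given for cAb-CA05. The mass of about 29,000 Da used for carbonic anhydrase is an assumption not taken from the text.

```python
import math


def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume in cubic Angstrom from cell lengths (Angstrom)
    and cell angles (degrees), using the general triclinic formula."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def matthews_coefficient(volume, n_symops, n_per_asu, mass_da):
    """Matthews coefficient V_M = V / (Z * M), in cubic Angstrom per dalton,
    where Z is the number of molecules (or complexes) in the unit cell."""
    return volume / (n_symops * n_per_asu * mass_da)


# Antigen-free cAb-CA05: P2(1), two VHH molecules per asymmetric unit.
v_free = cell_volume(29.98, 43.86, 87.95, beta=93.23)
vm_free = matthews_coefficient(v_free, n_symops=2, n_per_asu=2, mass_da=15000)
print(round(v_free), round(vm_free, 2))  # V_M is about 1.92

# Complex: P4(1)2(1)2, one complex per asymmetric unit.
# ASSUMPTION: carbonic anhydrase mass of ~29,000 Da (not given in the text).
v_complex = cell_volume(83.86, 83.86, 224.05)
vm_complex = matthews_coefficient(v_complex, n_symops=8, n_per_asu=1,
                                  mass_da=15000 + 29000)
print(round(v_complex), round(vm_complex, 2))
```

A larger Matthews coefficient corresponds to a higher solvent content in the crystal.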
cAb-CA05 Hypervariable Regions-The conformation of the H1 loop (residues 26-32, the solvent-exposed loop around the CDR1) of cAb-CA05 fits with the canonical structure type-1, a conformation observed in all VH structures containing a 7-amino acid H1 loop (15,16). This canonical loop structure is shaped by a sharp turn at Gly-26, clustering of the hydrophobic side chains of Ala-24, Phe-27, Phe-29, and Met-34 (Fig. 4A), and the hydrophobic part of the Arg-94 side chain (Cδ-Cε of Lys-94 in Pot VH) (16). The sequence composition of the cAb-CA05 H1 loop harbors the key elements for a type-1 structure except for the R94G and conservative F27Y and F29V substitutions (Fig. 1). These substitutions lead to a slightly different organization of the side chains forming the hydrophobic core of the loop but do not influence the main chain conformation of the loop (Fig. 4A). In contrast, all previously solved camel or llama VHH structures had their H1 loop folded into completely different main chain architectures.
Residues 52-56 form a hairpin loop (denoted H2) that constitutes the antigen-binding region of the second hypervariable region (15,16). Canonical structures of the H2 loop are described for loops with sizes of five, six, or eight amino acids. We previously showed that VHH H2 loops of six amino acids adopt conformations not yet observed in VH structures (19). Here, we are facing an H2 loop with only four amino acids (Fig. 1) that adopts a regular β-hairpin structure (Fig. 4B) known as type II′ (37). A comparison of four- and five-amino acid-long H2 loops indicates that the addition of a fifth amino acid introduces a bulge at position 55 and converts the loop to an H2 canonical structure type-1 (Fig. 4, B and C).
The CDR3 (residues 95-102) of a VHH is on average longer than that of a VH (17 versus 12 residues) (4), although a notable fraction of llama VHHs were found with 'short' CDR3 loops (20,38). Another remarkable feature of the CDR3 of VHHs is the frequent presence of a cysteine forming a disulfide bond with a cysteine in the CDR1 (4,6). In this respect, the cAb-CA05 with a 19-amino acid-long CDR3 loop and cysteines at positions 33 and 100c forming a disulfide bond is comparable with cAb-Lys3 (6) (Figs. 1 and 4D).
The cysteine at position 100c can be considered to divide the CDR3 region into an N-terminal and a C-terminal part. The C-terminal part of the CDR3 loop folds back onto the side of the VHH domain corresponding to the side of the VH interacting with VL (1) (see Table II for a list of contacting residues) (39). Large parts of the former VL-side are apparently shielded from the solvent by the C-end of the CDR3. A similar location and function has been observed for the C-terminal part of the cAb-Lys3 CDR3 loop (Fig. 4D) (6), for the entire, much shorter CDR3 loop of cAb-RN05 (5), and for that of the RR6 llama VHH (8).
The N-terminal parts of the CDR3 of cAb-CA05 and cAb-Lys3 are in a different environment. In cAb-Lys3, it forms a protruding loop (Fig. 4D) inserting in the catalytic site of the lysozyme (21). In cAb-CA05, this part of the loop does not extend into the solvent but associates with the residues of the remaining hypervariable loops (Table II). Thus, the N-terminal and C-terminal halves of the CDR3 of cAb-CA05 contact different parts of the domain. Furthermore, the entire (long) CDR3 of cAb-CA05 appears to be well fixed by these abundant contacts and by the covalent Cys33-Cys100c disulfide bond.
The Antigen-Antibody Complex-In addition to the antigen-free cAb-CA05 crystal, crystals of the complex of cAb-CA05 with its antigen, carbonic anhydrase, were obtained and diffracted to 3.5 Å using synchrotron radiation. The structure was solved by molecular replacement (see "Experimental Procedures") and the data and refinement statistics for the antibody-antigen complex structure are shown in Table I.
Binding of the antibody has little influence on the overall structure of the carbonic anhydrase. The rmsd is 0.7 Å for the Cα atoms between the bovine carbonic anhydrase molecule found here and the human carbonic anhydrase used as search model. Also, the structure of the cAb-CA05 antibody in the complex is the same as the uncomplexed structure (Fig. 2). The rmsd of the antibody in the complex with respect to the free antibody is 0.4 Å for all main chain atoms. Moreover, the hypervariable loops maintain their main chain conformation upon antigen binding; the H1 loop remains in the canonical structure type-1, the four-amino acid-long H2 loop keeps its conformation, and the structures of the CDR3 loops also resemble each other closely.
The epitope consists of two separate continuous segments within the carbonic anhydrase. A first stretch includes residues 46-52 and the second stretch involves residues 180-187 (Fig. 2, Table II). The first segment uses mainly main chain atoms (21 of 29 contacts), whereas the second part of the epitope involves mainly side chain atoms (50 of 58 contacts). The paratope comprises both the N- and the C-terminal parts of the CDR3 loop. In the N-terminal part of the CDR3, many main chain atoms participate in antigen binding; 32 of 60 contacts (i.e. antibody atoms within 4.0 Å of antigen atoms) involve an antibody main chain atom. In the C-terminal part of the CDR3, only antibody side chain atoms contact the antigen.
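The contact definition used above (antibody atoms within 4.0 Å of antigen atoms) and the split into main chain and side chain contacts can be sketched as follows. This is a minimal illustration, not the CONTACT program itself; the atom records, their dictionary layout, and the coordinates are hypothetical, and the backbone atom set (N, CA, C, O) is the usual convention rather than a definition taken from the text.

```python
import math

CUTOFF = 4.0  # Angstrom, the contact distance used in the text
MAIN_CHAIN = {"N", "CA", "C", "O"}  # conventional backbone atom names


def find_contacts(antibody_atoms, antigen_atoms, cutoff=CUTOFF):
    """Return all (antibody_atom, antigen_atom) pairs within the cutoff."""
    contacts = []
    for ab in antibody_atoms:
        for ag in antigen_atoms:
            if math.dist(ab["xyz"], ag["xyz"]) <= cutoff:
                contacts.append((ab, ag))
    return contacts


def count_main_chain(contacts, partner=0):
    """Count contacts whose atom on the chosen side (0 = antibody,
    1 = antigen) is a main chain atom."""
    return sum(1 for pair in contacts if pair[partner]["name"] in MAIN_CHAIN)


# Hypothetical toy atoms; real input would come from a PDB coordinate file.
antibody = [
    {"name": "O", "xyz": (0.0, 0.0, 0.0)},
    {"name": "CB", "xyz": (10.0, 0.0, 0.0)},
]
antigen = [
    {"name": "N", "xyz": (3.0, 0.0, 0.0)},
    {"name": "OG", "xyz": (12.0, 0.0, 0.0)},
]

contacts = find_contacts(antibody, antigen)
print(len(contacts), count_main_chain(contacts, partner=0))
```

The all-against-all loop is adequate for a single interface; a spatial grid would be preferable for whole-crystal calculations.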
The antigen-interacting surface of cAb-CA05 is to a large extent planar (Fig. 5). A solvent-accessible surface area of 622 Å² becomes buried upon antigen complexation. Remarkably, the CDR3 region provides this surface entirely. The antigen makes no contacts with the CDR1 and CDR2 loops apart from Tyr-32, which is at 3.7 Å from the carbonic anhydrase molecule. This is the first antibody fragment that shows high affinity and high binding specificity using one CDR loop for direct interaction.

DISCUSSION
The crystal structures of cAb-CA05, both antigen-free and in complex with carbonic anhydrase, reveal the structural adaptations in the VHH that explain its solubility and antigen-binding capacity in the absence of a VL. The VH residues at positions 11, 37, 44, 45, and 47 are all conserved, hydrophobic, and are involved in interdomain contacts (1,32,35). In the VHHs, these residues are substituted, more hydrophilic, accessible to solvent (20), and increase the solubility of the isolated VHH domains (40,41). The Trp-103 is another amino acid that is crucial for the interaction with a VL domain (35) and almost invariably conserved in VH (98.8% occurrence) (1,42). This Trp-103 is maintained within all the VHHs of reported structure; however, the side chain rotates ~180° about the Cβ-Cγ bond to expose its most polar part, the Nε atom, to solvent (Fig. 3C). As found in ~10% of the VHHs, Arg occupies position 103 in cAb-CA05 (Fig. 1). Obviously, this W103R mutation changes the nature of the 'former VL' surface even more drastically without disturbing the main chain conformation (Fig. 3D). The hydrophobic part of the Arg side chain integrates well with the neighboring hydrophobic side chains, whereas the guanido group extends into the solvent. From the present structure we infer that the W103R mutation on a human VH might be a better choice than the framework 2 mutations (41) to render isolated VHs more soluble. Such a strategy for camelizing a VH would keep the domain more human-like.
We think that a domain with the conserved human sequence at its framework 2 and carrying an Arg-103 might behave as a VHH because such sequences were found as part of heavy chain antibodies, e.g. clones 12 and 14 of llama HCAbs (20). In addition, it might be envisaged to humanize a camel VHH by bringing the framework 2 VHH hallmarks to the conventional human VH sequence and by substituting the Trp-103 into Arg.
From the available structural information of the antigen-binding loops of VHHs it seems that the classification of canonical structures needs to be extended (19). A new canonical structure type-4, adopted by the CDR1 loops of cAb-Lys3 and cAb-RN05, was already introduced (5,19). In addition, the anti-human chorionic gonadotropin (hcg) and anti-hapten llama (RR6) VHHs exhibit a non-conventional H1 loop structure. It is shown here for the first time that the H1 loop of VHHs can be assigned to a type-1 canonical structure. This proves that the determinants of the type-1 H1 loop conformation are not a priori different for a VHH domain. Moreover, the Cys-33 forming an interloop disulfide bond with a Cys in the middle of the CDR3 loop is compatible with a canonical type-1 structure in cAb-CA05 and a type-4 structure in cAb-Lys3. It is therefore clear that Cys-33 is not a determinant for the H1 loop conformation.
The H2 loop of the cAb-CA05 is special because it contains only four amino acids instead of the conventional five, six, or eight residues. Structurally, the H2 loop of four amino acids forms a regular β-turn of type II′ (37). Because H2 loops of this length are absent in germline VHHs (43), they must be generated by somatic mutation.
FIG. 5. The antigen-binding site of cAb-CA05. The cAb-CA05 is shown with the scaffold atoms in gray, CDR1 atoms in blue, CDR2 atoms in green. The CDR3 atoms that interact with the antigen are in red for hydrogen bonding, orange for van der Waals contacts, and black for salt bridge. The Kabat numbering of the amino acids to which the contacting atoms belong is given for reference. The carbonic anhydrase epitope segments 1 (pink dots) and 2 (red dots) are in ball-and-stick representation.
The long CDR3 loop of cAb-CA05 is divided based on distinct functions into an N-terminal and a C-terminal part, with the Cys-100c residue as midpoint making an interloop disulfide bond. A similar loop division was introduced in the CDR3 of cAb-Lys3 (6). The C-terminal part of both CDR3 loops is involved in extensive interactions with residues that constitute the VL-interacting surface of a VH in normal VH-VL pairs. Thus, the CDR3 loop covers the remaining hydrophobic patches in this area of the domain and shields these from the aqueous solvent. With the exception of the llama hcg VHH with its short CDR3 of eight amino acids, all VHHs of known structure share this feature. The VHH-specific amino acids Phe-37, Arg-45, and Gly-47 are all contacted by the CDR3 residues (Table II), supporting the idea that it is the combined effect of these hallmark VHH substitutions and the (C-end of the) long CDR3 that provides the optimal single-domain properties of a VHH. To fulfill this role, the CDR3 of VHHs adopts a conformation that deviates fundamentally from the CDR3 of VHs. Indeed, the rules introduced to predict the CDR3 loop structures in VHs (44-46) do not apply for the CDR3 of VHHs since parts of the CDR3 are used to cover the side of the domain that interacts with the VL in a VH-VL heterodimer.
Furthermore, this novel CDR3 positioning in VHHs strongly suggests that the mutagenesis of the framework 2 amino acids of a VH to mimic the VHH of camelids will be insufficient to convert a VH into a functional, soluble single-domain antibody fragment. Additional CDR3 mutagenesis in a camelized VH will be a prerequisite to restore the antigen binding characteristics of the parental VH domain. This is exactly what the group of Riechmann (47) encountered when they used a camelized human VH to generate single-domain molecular recognition units.
The N-terminal part of the CDR3 of cAb-CA05 folds back over the CDR1 and CDR2. Both the N-terminal and the C-terminal part of the CDR3 participate in carbonic anhydrase binding (Table II). However, the residues of the N-terminal part that interact with the remaining CDR loops of the VHH domain bind with the antigen as well, whereas the C-terminal part of the CDR3 loop shows an alternating pattern of residues contacting either the antigen or the remaining VHH domain.
The N- and C-terminal parts of the CDR3 of cAb-CA05 form one large surface that is essentially flat. Two marginal notes can be made from this observation. First, although this paratope architecture is also provided by the six hypervariable loops in the VH-VL heterodimers to recognize large antigens such as proteins, the difference is that the cAb-CA05 paratope is composed of CDR3 residues only. The concentration of the paratope into one single loop opens opportunities to design smaller peptidomimetics (9) or to randomize these residues and to create a synthetic library from which binders with new specificities could be retrieved by phage display (48) or ribosome display (49). Secondly, it seems that the paratope of VHHs bears a large structural diversity including the formation of a flat surface as in cAb-CA05, a protruding loop as seen for the N-terminal part of the CDR3 of cAb-Lys3, or a cavity between the CDRs as observed for the hapten binder RR6. These VHHs contain a long CDR3 of 19, 24, and 16 residues, respectively. The N-terminal part of the long CDR3 of cAb-Lys3 protrudes from the remaining antigen-binding site and provides ~70% of the antigen binding surface by insertion into a cavity harboring the catalytic site of the lysozyme (6,21). In contrast, in cAb-CA05 we are confronted with a long CDR3 that forms a planar surface that does not insert into a cavity on the antigen surface. This explains the failure of this antibody fragment to inhibit the enzymatic activity of its antigen. Furthermore, it proves that different parts of the carbonic anhydrase with a planar or a concave surface are antigenic for heavy chain antibodies carrying a long CDR3.