Structural studies of human autoantibodies. Crystal structure of a thyroid peroxidase autoantibody Fab.

The three-dimensional structure of the Fab of TR1.9, a high-affinity IgG1,κ human autoantibody to thyroid peroxidase, was determined crystallographically to a resolution of 2.0 Å. The combining site was found to be relatively flat, like those of other antibodies to large proteins. Sequence differences from the most closely related germline genes mainly occur at positions occupied by residues with outward-pointing side chains. An increased deformability of the second and third complementarity-determining regions of the heavy chain may result from the replacement of two germline asparagines and the presence of several glycines, and may allow “induced fit” in the binding to antigen. Four exposed charged residues, resulting from the use of particular D (diversity) and J (joining) segments in the assembly of the heavy chain, may contribute to the high affinity of antigen binding. The crystal structure of TR1.9 Fab is the first for a human IgG high-affinity autoantibody.

The effector mechanisms in human autoimmune diseases may involve either T cells or B cells. Presently accepted examples of T cell-mediated autoimmune disease are diabetes mellitus type I and multiple sclerosis. On the other hand, autoantibodies to the acetylcholine receptor are responsible for myasthenia gravis and autoantibodies to the thyrotrophin receptor cause the hyperthyroidism of Graves' disease.
The most common organ-specific autoimmune disease in humans is Hashimoto's thyroiditis. IgG class autoantibodies to thyroid peroxidase (TPO), a large glycoprotein (107 kDa) expressed on the apical surface of thyroid cells, are an invariable marker of the disease and may contribute to thyroid damage and hypothyroidism (reviewed in Ref. 1). Recently, we have generated a panel of 42 human monoclonal TPO autoantibodies (expressed as Fab) from thyroid-infiltrating plasma cells by screening immunoglobulin gene combinatorial libraries with eukaryotic recombinant TPO (2-7). These recombinant IgG class Fabs have a high affinity (Kd ~ 10^-10 M) for TPO and recognize overlapping conformational epitopes in a restricted region of the molecule (5,8,9). Furthermore, the TPO Fabs compete with >80% of the autoantibodies in serum from most patients for binding to TPO and, consequently, they define a TPO immunodominant region (3,5,8,9).
One of these recombinant TPO Fabs, TR1.9 (5), interacts with the B2 domain in the immunodominant region. Like other IgG class autoantibodies, TR1.9 binds specifically to its antigen and it is encoded by genes which appear to be somatically mutated from the germline (reviewed in Ref. 10). In contrast, IgM class autoantibodies are frequently polyreactive and may be derived from unmutated or only slightly mutated germline genes (see for example, Ref. 11).
Information on the three-dimensional structure of human TPO-specific Fab and, ultimately, the Fab-TPO complex will provide insight into TPO recognition by the immune system. In this report, we present the crystallographic analysis, at 2.0-Å resolution, of the TPO-specific Fab TR1.9. Of the limited number of human antibodies for which crystal structures have been determined (12), none are autoantibodies. The present data, therefore, represent the first structural analysis of a human, IgG class, disease-associated autoantibody.

MATERIALS AND METHODS
Expression of TPO-specific Fab TR1.9-To permit higher levels of expression, the heavy and light chain genes for TR1.9 in the Immunozap vector (5) were subcloned in the XhoI and XbaI sites of pBP101 (13), kindly provided by Dr. B. Posner, Pennsylvania State University. Expression of TR1.9 was performed as described (13) with some modifications. In brief, BL21 cells bearing the pTG119 and the pBP101 plasmids were grown at 37°C in Luria Bertani medium containing 30 μg/ml kanamycin and 10 μg/ml tetracycline (both from Sigma) until the optical density of the cells reached 0.8 (600 nm). Protein expression was induced by addition of 1 mM isopropyl-1-thio-β-D-galactopyranoside (Sigma) for 4 h at 37°C. Cells were pelleted and processed as described previously (3). TR1.9 Fab was affinity-purified using goat anti-human IgG-coupled Sepharose beads (ZYMED Laboratories, South San Francisco, CA). After elution with glycine buffer, pH 2.5, samples were immediately neutralized with 1 M Tris, pH 7.4, and the buffer was then exchanged to 10 mM Tris, pH 7.4.
Crystallographic Data Collection and Processing-X-ray intensity data were collected on an R-AXIS II system using graphite-monochromatized CuKα radiation from a Rigaku RU200 rotating-anode generator. All the data used in the analysis were obtained from one crystal of approximate size 0.2 × 0.2 × 0.3 mm^3, at room temperature. The data were processed using the program XDS (15). Statistics for the intensity data are given in Table I. Data completeness at various resolutions and F/σ(F) levels is presented in Fig. 1. The average redundancy of the data was 3.5 and the Rsym on intensities was 8.4%.
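The two summary statistics quoted above can be illustrated with a short sketch. It shows how a merging residual such as Rsym and the average redundancy are conventionally computed from repeated intensity measurements. This is not the XDS implementation: the function names, the dictionary layout, and the choice to leave singly measured reflections out of the residual are assumptions made for illustration, and the intensities are invented.

```python
def r_sym(observations):
    """Conventional merging residual: sum |I_i - <I>| / sum I_i.

    `observations` maps a unique reflection index (h, k, l) to the list of
    intensities measured for it and its symmetry mates (assumed layout).
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        if len(intensities) < 2:
            # Assumption: a reflection measured once has no spread to report.
            continue
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def average_redundancy(observations):
    """Mean number of measurements per unique reflection."""
    total = sum(len(v) for v in observations.values())
    return total / len(observations)


# Invented toy data, not taken from the TR1.9 data set.
toy = {(1, 0, 0): [100.0, 110.0], (0, 1, 0): [50.0, 50.0, 56.0]}
print(f"Rsym = {100 * r_sym(toy):.1f}%, redundancy = {average_redundancy(toy):.1f}")
```

A lower Rsym indicates better agreement among symmetry-related measurements; the value depends on how the processing program weights and rejects observations, which this sketch ignores.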
Crystal Structure Determination-The crystal structure was determined by Molecular Replacement using the program AMoRe (16). The Fab fragment of antibody 3D6 (17) (available from the Protein Data Bank (18,19) as Entry 1DFB), the only human Fab of known three-dimensional structure with a κ light chain, was used as the probe, with separate searches for the VL:VH and CL:CH1 modules. The 95% most intense reflections in the resolution range 8-4 Å were used in the analysis. The results of the rotation and translation searches are summarized in Table II. Since space group P2₁ has an undefined origin along y, a relative translation search was performed between the two probe modules. After rigid-body refinement, the correlation coefficient between calculated and observed structure factors was 0.465 and the crystallographic residual, the R-value, was 40.6%. The packing of the molecules in the crystal was very reasonable. The α-carbons at the end of VL and at the beginning of CL were 7.2 Å apart, while those at the end of VH and at the beginning of CH1 were 7.3 Å apart, demonstrating that Molecular Replacement had positioned the two modules properly relative to each other. As a test, a composite model including all four domains was used as the probe in another search with AMoRe. The results (see Table II) were again unambiguous, with a correlation coefficient of 0.448 and an R-value of 40.7%. There is one Fab in the asymmetric unit.
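The two agreement measures quoted for the Molecular Replacement solutions can be sketched as follows. The code uses the textbook definitions of the crystallographic R-value and of the linear correlation coefficient between observed and calculated structure-factor amplitudes. The function names and the single overall scale factor are assumptions; AMoRe and X-PLOR apply their own scaling and weighting, so this is an illustration of the quantities and not a reproduction of either program.

```python
import math


def r_value(f_obs, f_calc):
    """R = sum |Fo - k*Fc| / sum Fo, with one overall scale factor k (assumed)."""
    k = sum(f_obs) / sum(f_calc)
    return sum(abs(fo - k * fc) for fo, fc in zip(f_obs, f_calc)) / sum(f_obs)


def correlation(f_obs, f_calc):
    """Pearson correlation coefficient between observed and calculated amplitudes."""
    n = len(f_obs)
    mean_o = sum(f_obs) / n
    mean_c = sum(f_calc) / n
    cov = sum((fo - mean_o) * (fc - mean_c) for fo, fc in zip(f_obs, f_calc))
    var_o = sum((fo - mean_o) ** 2 for fo in f_obs)
    var_c = sum((fc - mean_c) ** 2 for fc in f_calc)
    return cov / math.sqrt(var_o * var_c)


# Invented amplitudes for illustration only.
f_obs = [10.0, 20.0, 30.0, 40.0]
f_calc = [12.0, 18.0, 33.0, 37.0]
print(f"R = {100 * r_value(f_obs, f_calc):.1f}%, CC = {correlation(f_obs, f_calc):.3f}")
```

A correct solution shows a higher correlation and a lower R-value than competing orientations and positions, which is how the searches summarized in Table II are judged.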
Structure Refinement-All subsequent refinement was done using X-PLOR (20). Another rigid-body refinement, using data for which F ≥ 3σ(F) in the 10-4.0 Å resolution range and allowing the four domains of the Fab to move independently of each other, yielded an R-value of 39.0%. At this point, 3D6 residues were replaced with alanines at the 20 positions in VL and the 67 in VH where 3D6 and TR1.9 differ in sequence (or with glycines where TR1.9 has this residue) (Table III). In addition, the 3D6 residues 91-96 (in CDR3-L) in the light chain and residues 96-100 (in CDR3-H) in the heavy chain (following the numbering convention of Kabat et al. (21)) were excised, since 3D6 and TR1.9 differ in the number of amino acids in these two CDRs. The valine at position 225 in CH1 was replaced by the germline alanine (Kabat et al. (21)). The R-value was 41.6% for this mutated molecule with data in the resolution range 10-2.2 Å with F ≥ 3σ(F) (16,475 reflections).
A 2Fo - Fc map was computed and displayed with the mutated molecule using the graphics program FRODO (22). The fit of model to map was very good. There was density for most of the omitted side chains and for the excised CDR3-L and CDR3-H segments (Fig. 2). The omitted side chains, as well as the excised regions, were manually built into the structure on the basis of the map. Adjustments were made in the NH2 and COOH termini, in the switch regions, and in several loops. After the first rebuilding, the R-value was reduced to 37.4%. A second round of model rebuilding based on a 2Fo - Fc map further reduced the R-value to 34.8%. A third round of rebuilding only reduced the R-value to 34.6% and manual rebuilding was discontinued for the time being.
The structure was then refined using X-PLOR with data for which F ≥ 2σ(F) in the resolution range 10-2.2 Å (17,852 reflections). One run of simulated annealing reduced the R-value to 25.3% and four cycles of alternating thermal factor (B-factor) and positional refinement further reduced the R-value to 20.7%. The R-value for the 2,422 reflections between 2.2- and 2.0-Å spacings, which had not been included in the refinement up to this point, was 30.7%. Another round of model rebuilding using FRODO was performed based on 2Fo - Fc and Fo - Fc maps. The maps clearly showed that the residue at heavy chain position 225 is not an alanine, as had been assumed on the basis of the germline sequence. The electron density was consistent with a valine, as in antibody 3D6, and the appropriate change was made. Furthermore, putative solvent (water) molecules were identified.
From this point onward, all the data in the resolution range 10-2.0 Å for which F ≥ 2σ(F) (20,274 reflections) were included in the refinement (R-value = 23.5%). Three more cycles of alternating B-factor and positional refinement reduced the R-value to 18.0%.

RESULTS AND DISCUSSION
Description of the Structure-A ribbon diagram of TR1.9 Fab is presented in Fig. 4. As in other Fabs, the homologous domains of TR1.9 Fab are related by pseudodyads: 174.0 degrees between VL and VH, and 168.5 degrees between CL and CH1. The elbow bend of TR1.9 Fab is 134.1 degrees. These values are within the range observed for other Fabs (see Refs. 12, 25, and 26) and, in fact, are very nearly the same as those observed for Hil, a human IgG1,λ myeloma protein (PDB entry 8FAB, second Fab in the entry).
The molecular surface (27) that covers the CDRs of TR1.9 is included in Fig. 4. The CDR surface of TR1.9 is revealed to be relatively flat. Other antibodies to intact protein antigens also have relatively flat CDR surfaces, in contrast to antibodies to haptens and other smaller ligands, which display pronounced grooves or pockets in their CDR surfaces (28). Results from the crystallographic analysis of many antibody-ligand complexes strongly suggest that the combining site of an antibody is primarily constructed with CDR residues, although on rare occasions neighboring framework residues have been found to be involved also. Thus the CDR surface of TR1.9 most probably portrays the topography of its combining site. The relative flatness of the surface implies that the epitope for TR1.9 on TPO is in the main also flat. The CDRs of TR1.9 are canonical: CDR1-L belongs to the canonical group 2, CDR2-L to group 1 (the only group identified so far), CDR3-L to group 1, CDR1-H to group 1, and CDR2-H to group 3; no canonical groups have been identified for CDR3-H (29).

FIG. 2. Stereodrawing of a portion of the 2Fo - Fc map of TR1.9 Fab after rigid-body refinement, with the CDR3-L loop from the final model overlaid. Although residues 91-96 had not been included in the structure factor calculation, the map has continuous electron density corresponding to those residues. The contour level is 1.0 σ.

RFTISRDNAKNSLYLQMNSLRAEDMALYYCVK GRDYYDSGGYFTVAFDI WGQGTMVTVSS

... domain, that of the antibody 3D6, is available from the Protein Data Bank. Coordinates for five Cγ1 domains are available: those from antibody 3D6, and those from the immunoglobulins Kol, New, Hil, and Mcg. The Vκ and VH comparisons were made on the basis of the 72 residue positions which have been found to be structurally equivalent in VL and VH domains, while the Cκ and Cγ1 comparisons were made on the basis of 63 equivalent positions (12).

Comparison with Other Human Antibody Structures-The TR1.9 Vκ is found to be very similar in three-dimensional structure to the other human Vκ domains (Table IV, Fig. 5). Indeed, all the human Vκ domains are seen to be very similar to each other and, with the exception of the CDRs, are essentially superimposable. The average difference among these VL domains is 0.49 Å (S.D. = 0.01); TR1.9 VL differs from the other human Vκ domains on average by 0.42 Å (S.D. = 0.03). These numbers are essentially the same as those obtained when various structures for hen egg white lysozyme, crystallized in different space groups and independently analyzed, are compared: the average difference for Cα positions is 0.41 Å (S.D. = 0.02) for PDB Entries 1HEL (tetragonal) (36), 132L (orthorhombic) (37), and 1LYS (monoclinic, with two molecules per asymmetric unit) (38).
The CL of TR1.9 differs from that of 3D6 on average by 0.34 Å, again showing a close similarity. This is not unexpected, since the 3D6 CL domain was used both as the search probe in the Molecular Replacement analysis and as the initial model for the refinement of the TR1.9 CL domain.
A greater variation is observed for the human VH domains (Table V) [...] seen to be very similar, the average difference being 0.42 Å (S.D. = 0.01); the TR1.9 CH1 domain differs from those of the other human antibodies on average by 0.39 Å (S.D. = 0.04).
In this collection, the immunoglobulin Mcg is found to be the most different, not only in VH but also in CH1 (Table V). The consistently larger differences found in the comparisons involving the Mcg domains probably reflect the low resolution of the Mcg structure (3.2 Å). Most of the other structures had been determined at relatively high resolution: TR1.9 Fab at 2.0 Å, New Fab also at 2.0, Kol Fab at 1.9, Hil Fab at 1.8, Rei VL at 2.0, Wat VL at 1.9, and Len VL at 1.8, although the Pot Fv structure was determined at 2.3-Å resolution and 3D6 Fab at only 2.7.
The Somatic Mutations in TR1.9-The VH region of the TR1.9 heavy chain appears to be derived by somatic mutation from the germline gene V1-3b (39), also known as DP-25 (40). The Vκ region of its light chain is most closely related to the germline gene A′ (21), also known as L4/L18 (41). An alignment of the TR1.9 sequences with the closest germlines is shown in Table VI.
Ignoring the differences at the NH2 termini, which are primer-derived, there are 15 mutations which appear to have occurred in the light and heavy chains of TR1.9 relative to germline. We are unable to relate the CDR3-H segment to any of the known D (diversity) segments. The joining segment for the light chain variable domain is Jκ4 and that for the heavy chain is JH4 (21).
Relative to the closest germline, five somatic mutations appear to have occurred in TR1.9 VL and 10 in VH; six of these are in CDRs (Table VI). All five changes in VL involve residues that have outward-pointing side chains; four are accessible to solvent (Asn20, Ala22, Arg45, and Asn53 in CDR2-L), while the fifth is partly buried (Ile85). Of the 10 changes in TR1.9 VH, two are buried in the domain interior (Leu34 in CDR1-H and Phe70); six of the eight non-glycine residues have side chains that are outward-pointing: four are exposed to solvent (Ser28, Thr54, and Arg65 in CDR2-H, and Pro85), while two are partly buried (Ser52 in CDR2-H and Thr77). None of the putative somatic changes occurs at a position that is involved in the VL:VH interaction (Table VI). The putative somatic mutations which appear to have occurred in TR1.9 are portrayed in the three-dimensional structure of the molecule in Fig. 7.
The insertion of an extra residue in the NH2-terminal segment of TR1.9 VH is the result of the use of a 1a/3a oligonucleotide primer for amplification (5). The insertion of this extra residue causes a structural rearrangement in this part of TR1.9 VH (relative to the other known VH structures) (Fig. 6). The fact that TR1.9 still displays high affinity for TPO strongly suggests that the NH2 terminus of VH is not involved in the interaction with the antigen.
Conclusions-In the absence of a three-dimensional structure for the complex of TR1.9 with TPO, we can only guess at the structural basis for the high affinity of the binding. The 7-8 kcal/mol required to increase the affinity from weak (say, Kd ~10^-4 to 10^-5 M) to strong (e.g. Kd ~10^-10 M) binding could be derived from the formation of salt bridges, or of hydrogen bonds, especially when involving charged groups (42). Comparing TR1.9 to the most closely related germlines (Table VI) reveals that replacements involving charged residues appear not to have occurred. However, the D segment and the JH used to construct the heavy chain variable domain of TR1.9 produced four charged residues in CDR3-H: two aspartic acids, one glutamic acid, and one lysine. These charged residues, if they form salt bridges with oppositely charged residues in TPO, may be responsible, in part or in whole, for the high affinity of the interaction.
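The 7-8 kcal/mol figure follows from the standard relation between dissociation constants and binding free energy, ΔΔG = RT ln(Kd,weak / Kd,strong). The short sketch below works through the arithmetic. The temperature of 25°C and the function name are assumptions made for illustration; the text does not state the temperature used for the estimate.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def binding_energy_gain(kd_weak, kd_strong, temp_k=298.15):
    """Free energy (kcal/mol) needed to tighten binding from kd_weak to kd_strong.

    temp_k defaults to 25 degrees C, which is an assumption, not a value from the text.
    """
    return R_KCAL * temp_k * math.log(kd_weak / kd_strong)


for kd_weak in (1e-5, 1e-4):
    gain = binding_energy_gain(kd_weak, 1e-10)
    print(f"Kd {kd_weak:.0e} M -> 1e-10 M: {gain:.1f} kcal/mol")
```

Under these assumptions the two starting affinities give roughly 6.8 and 8.2 kcal/mol, which brackets the 7-8 kcal/mol quoted above. A few salt bridges or charged hydrogen bonds would therefore be sufficient in principle to account for the difference.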
Some other replacements may contribute to the high affinity of binding. Of the 15 putative somatic mutations that appear to have occurred in the maturation of TR1.9, four involve asparagines. Three of those are in CDRs (at position 53 in CDR2-L and at positions 52 and 54 in CDR2-H) and the fourth is at the framework position 20 in the light chain. It has been noted that asparagines in CDRs frequently form hydrogen bonds with main chain atoms, apparently stabilizing the conformation of the local structure (43). In TR1.9, the asparagines at positions 20 and 53 in the light chain are exposed to solvent and do not form hydrogen bonds, so that they are probably not critical to conformational stability. The two other somatic changes involving asparagines occur at positions 52 and 54 in CDR2-H, where asparagines in the closest germline VH are mutated to serine and threonine, respectively, in TR1.9. Ser52-H in TR1.9 is at the start of the loop structure in CDR2-H and Thr54-H is in this loop. The murine antibody 36-71 (44) (PDB entry 6FAB) and the humanized murine antibody H52 (45) (PDB entry 1FGV) have asparagines at both positions. In antibody 36-71, the side chain of Asn52-H forms a hydrogen bond with the main chain while Asn54-H does not; in antibody H52, both asparagines form hydrogen bonds with the main chain. The replacement of the germline Asn52-H and Asn54-H should result in a reduced stability and greater flexibility of this part of CDR2-H, especially since two glycines are present in this segment. Another part of TR1.9 that is almost certainly flexible is the CDR3-H loop, which features three glycine residues in a row. The CDR2-H and CDR3-H loops abut each other and together occupy a central position in the combining site (Fig. 7). Many residues in the CDR2-H and CDR3-H loops are often found to be involved in ligand binding in other antibodies (12,28,46).
It is tempting to speculate that increased flexibility and deformability,⁴ made possible by the presence of the glycines and the reduced number of asparagines, improve the binding of TR1.9 to TPO, in the manner of an "induced fit" (47).
The structural basis for the high affinity will be clarified by the crystal structure of the complex of TR1.9 with TPO. Knowledge of the structural details of the binding of TR1.9 to TPO will add to our understanding of TPO recognition by the immune system, including antigen presentation by TPO-specific cells, and will provide new insights into humoral autoimmune diseases in humans.
⁴ It may have been possible to deduce the extent of deformability of these loops from their thermal factors, but nine of the 17 residues in CDR2-H are involved in lattice contacts, as are four of the 12 residues in CDR3-H, so that the thermal factors are low for these segments of TR1.9 in this crystal structure.

FIG. 7. Ribbon drawings of the Fv of TR1.9 viewed from the side (top) and end-on (bottom). VL is on the left (lighter shading) and VH is on the right (darker shading). The residues which differ from germline are indicated by filled circles; those in the CDRs are drawn larger. The residues in CDR3-H are indicated by empty circles. The NH2 and COOH termini of both chains are labeled.

TABLE VI
Primary structure of TR1.9 VL and VH compared to the most closely related germline sequences
Sequence differences are indicated by vertical bars (|). Amino acid differences from the germlines at the beginning of both light and heavy chains are introduced by the primers and restriction sites (7). The extent of exposure of the individual residues is indicated by the italicized letters B, b, p, e, and E. Residues with side chains having fractional accessibility values between 0.00 and 0.20 are designated as being completely buried (B). Values between 0.20 and 0.40 indicate mostly buried (b), and those between 0.40 and 0.60 indicate partly buried/partly exposed (p). Values between 0.60 and 0.80 indicate mostly exposed (e), and a value of at least 0.80 indicates completely exposed (E). In the special case of glycine, the residue is considered completely exposed if its α-carbon atom is accessible to solvent; otherwise it is considered completely buried. Fractional solvent accessibility values were computed as described previously (43); residue exposures were computed in the context of an isolated Fv. The residues which are in contact with the opposite domain are indicated by asterisks.
RVTFTRDTSATTAYMGLSSLRPEDTAVYYCAR DPYGGGKSEFDY WGQGTLVTVSS
bBEBEbpeEepbBpBEBeEBeEEBpBpBBBBB BbEEEBeBBBbe BBpBBeBeBEE
* * ******* **
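The exposure classes defined in the Table VI legend can be expressed as a small lookup, sketched below. The legend does not say to which class a value falling exactly on a boundary (for example 0.20) belongs, so the sketch assumes that each lower bound is inclusive. The function and parameter names are hypothetical, and the fractional accessibility values themselves would have to come from a separate solvent accessibility calculation that is not shown here.

```python
def exposure_class(fraction, is_glycine=False, ca_accessible=False):
    """Map a side-chain fractional accessibility to the Table VI letters.

    Assumptions: lower bounds are inclusive; for glycine only the solvent
    accessibility of the alpha-carbon is considered, as the legend describes.
    """
    if is_glycine:
        return "E" if ca_accessible else "B"
    if fraction < 0.0:
        raise ValueError("fractional accessibility cannot be negative")
    if fraction >= 0.80:
        return "E"  # completely exposed
    if fraction >= 0.60:
        return "e"  # mostly exposed
    if fraction >= 0.40:
        return "p"  # partly buried/partly exposed
    if fraction >= 0.20:
        return "b"  # mostly buried
    return "B"      # completely buried


# Invented accessibility values, one per class.
print("".join(exposure_class(f) for f in (0.05, 0.25, 0.50, 0.70, 0.95)))
```

Applied residue by residue, a function of this kind yields strings of the form shown in the second row of the table fragment above.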