Mapping and Quantitation of the Interaction between the Recombination Activating Gene Proteins RAG1 and RAG2*♦

Background: The RAG1-RAG2 interaction is critical for V(D)J recombination but is poorly understood.
Results: The RAG1-RAG2 interaction has a binding constant of ∼0.4 μM and requires only a small portion of RAG1.
Conclusion: RAG1 and RAG2 interact with modest affinity using regions of RAG1 flanking the RAG1 catalytic region.
Significance: Inefficient association of RAG1 with RAG2 could help limit damage to the genome.

The RAG endonuclease consists of RAG1, which contains the active site for DNA cleavage, and RAG2, an accessory factor whose interaction with RAG1 is critical for catalytic function. How RAG2 activates RAG1 is not understood. Here, we used biolayer interferometry and pulldown assays to identify regions of RAG1 necessary for interaction with RAG2 and to measure the RAG1-RAG2 binding affinity (KD ∼0.4 μM) (where RAG1 and RAG2 are recombination activating genes 1 or 2). Using the Hermes transposase as a guide, we constructed a 36-kDa “mini” RAG1 capable of interacting robustly with RAG2. Mini-RAG1 consists primarily of the catalytic center and the residues N-terminal to it, but it lacks a zinc finger region in RAG1 previously implicated in binding RAG2. The ability of Mini-RAG1 to interact with RAG2 depends on a predicted α-helix (amino acids 997–1008) near the RAG1 C terminus and a region of RAG1 from amino acids 479 to 559. Two adjacent acidic amino acids in this region (Asp-546 and Glu-547) are important for both the RAG1-RAG2 interaction and recombination activity, with Asp-546 of particular importance. Structural modeling of Mini-RAG1 suggests that Asp-546/Glu-547 lie near the predicted 997-1008 α-helix and components of the active site, raising the possibility that RAG2 binding alters the structure of the RAG1 active site.
Quantitative Western blotting allowed us to estimate that mouse thymocytes contain on average ∼1,800 monomers of RAG1 and ∼15,000 molecules of RAG2, implying that nuclear concentrations of RAG1 and RAG2 are below the KD value for their interaction, which could help limit off-target RAG activity.

RAG1 and RAG2 (known collectively as RAG; where RAG1 and RAG2 are the proteins encoded by the recombination activating genes 1 or 2) are the critical lymphocyte-specific proteins required for V(D)J recombination, which assembles the variable regions of immunoglobulin and T cell receptor genes in developing B and T lymphocytes. The RAG1-RAG2 complex, together with an architectural DNA binding/bending factor HMGB1 (or HMGB2; where HMGB1 and -2 are high mobility group proteins B1 and B2), initiates V(D)J recombination by binding to a DNA recognition motif known as the recombination signal sequence (RSS). The RSS is composed of conserved heptamer and nonamer motifs separated by a poorly conserved spacer whose length is 12 or 23 bp, resulting in two forms of the RSS known as the 12RSS and 23RSS. Efficient V(D)J recombination takes place only between a 12RSS and a 23RSS, a restriction known as the 12/23 rule. After binding to one RSS, the RAG-HMGB1 complex captures a 12/23 appropriate partner RSS to form the paired complex, within which RAG catalyzes the formation of DNA double strand breaks immediately adjacent to the heptamers of the RSSs. DNA cleavage occurs in two steps, with one strand nicked to create a 3′-hydroxyl group, which then attacks the opposite strand to create a hairpin-sealed flank and a blunt RSS end. The reaction is completed by the action of ubiquitously expressed DNA repair proteins (1-3).
RAG1 is the major player in DNA binding and cleavage. Its "core" region (the minimal portion required for activity; defined as aa 384-1008 of the 1040-aa protein (4)) contains a tightly dimeric nonamer binding domain (NBD), a central region containing two critical active site residues (Asp-600 and Asp-708), and a large C-terminal region that contributes the third essential active site residue (Glu-962) (Fig. 1A). In addition to making direct contact with the nonamer, the RAG1 core is responsible for heptamer recognition and interaction with RAG2, and it is likely to contain the entire active site (2,5). Nonetheless, it displays no catalytic activity in vitro in the absence of RAG2, and RAG2-deficient mice display a complete absence of V(D)J recombination activity (6). RAG2 is thus a vital accessory factor, with a core region (aa 1-383 of the 527-aa protein; Fig. 1A) whose primary function appears to be to interact with RAG1, thereby activating RAG1 endonuclease function. This interaction has also been demonstrated to enhance RSS recognition, particularly in the vicinity of the heptamer, and to decrease nonspecific DNA binding by RAG1 (7,8). The mechanism by which the RAG2 core alters the DNA binding and catalytic functions of the RAG1 core is not known, although it is widely speculated to be the result of a RAG2-induced conformational change in RAG1. The non-core regions of RAG1 and RAG2, although not required for catalytic activity or V(D)J recombination, play important roles in vivo and contain multiple regulatory domains, some of which mediate chromatin interactions (9).
The only high resolution structural information available for either RAG core region is for the RAG1 NBD in complex with the nonamer (10). Sequence analysis, modeling, and mutagenesis suggest that the RAG2 core adopts a six-bladed β-propeller structure (11,12). The minimal functional RAG complex is likely to be a heterotetramer consisting of a tight RAG1 dimer bound to two monomers of RAG2 (2,5).
RAG exhibits striking functional similarities with cut and paste transposases such as those encoded by Tn5, Tn10, Transib, and Hermes. The similarities include a nick-hairpin mechanism of DNA cleavage, a similar active site architecture (resembling the RNase H fold) containing a glutamate and two aspartate catalytic residues, and the ability of the RAG core proteins to mediate transposition efficiently in vitro (13). The Transib and Hermes transposases are of particular interest because they cleave DNA with a similar polarity to RAG (leaving hairpins on the flanking DNA rather than on the terminal inverted repeat ends of the transposon) (14,15) and, like RAG, have an extended region of amino acids (the insertion domain) separating the active site glutamate from the second active site aspartate (Fig. 1A). The structure of Hermes transposase has been determined alone (16) and in complex with DNA (17), and it provides potential structural parallels with the RAG1 core.
The region of RAG1 responsible for interacting with RAG2 was initially mapped to a large portion of the RAG1 core (aa 504-1008) (18). Subsequent studies implicated the RAG1 central core domain (aa 528-760) (19) or a putative zinc finger in RAG1 (zinc finger B, or ZFB; aa 727-750) (20) as sufficient for the interaction, although in both cases the interaction appeared less efficient than with the entire RAG1 core. The importance of ZFB was subsequently questioned by a large scale mutagenesis analysis of RAG1 (21). Finally, several acidic residues in the region from aa 546 to 560 of RAG1 were shown to be important for binding to RAG2 (22). A limitation of these studies was the use of qualitative co-immunoprecipitation or pulldown methods to assess the RAG1-RAG2 interaction. The use of more quantitative biochemical approaches has not been reported, likely because of the difficulty in obtaining sufficient amounts of purified RAG2 for study. As a result, many basic parameters of the interaction remain uncharacterized, including the binding affinity.
Here, we use biolayer interferometry to identify the regions of RAG1 necessary for interaction with RAG2, and Western blotting to estimate the concentration of RAG1 and RAG2 in mouse thymocytes. Our data yield a KD value of ∼0.4 μM for the RAG1-RAG2 interaction and suggest that the nuclear concentrations of both RAG1 and RAG2 are below this value. Our results also demonstrate that ZFB is not required for the RAG1-RAG2 interaction and lead to the definition of a truncated minimal RAG1 protein, lacking about half of the RAG1 core, that is sufficient for robust binding to RAG2. Residues of RAG1 critical for the interaction are located in a region structurally different from Hermes transposase, and modeling suggests that this region lies near a portion of the RAG1 active site. Interaction with RAG2 might therefore alter the conformation of this critical region of RAG1.

EXPERIMENTAL PROCEDURES
Protein Expression and Purification-RAG1 core (aa 384-1008) and RAG1 mutants (as indicated in Table 1) were cloned into pMH6 (23), which provides an N-terminal MBP tag and C-terminal His6 tag, and expressed in Escherichia coli. Proteins were purified through two affinity columns (nickel-nitrilotriacetic acid (NTA) and amylose) followed by size-exclusion chromatography using a Superdex 200 10/300 GL column (GE Healthcare) in buffer PB500 (25 mM Tris-HCl, pH 7.5, 500 mM NaCl, 0.5% glycerol). MBP-tagged Mini-RAG1 and truncations or mutations (subset shown in Table 2) were also cloned in pMH6 and purified in the same way but in a different buffer (25 mM Tris-HCl, pH 8.5, 200 mM NaCl, and 0.5% glycerol). C-terminal His6-tagged Mini-RAG1 lacking MBP was prepared from a modified pMH6 vector with a tobacco etch virus (TEV) protease cleavage site between MBP and the RAG1 open reading frame. The MBP tag was removed by TEV protease cleavage (4°C overnight) followed by nickel-NTA and gel filtration column purification. MBP- and GST-tagged RAG2 core (aa 1-383) were expressed in HEK293T cells and purified as described previously (24). His-tagged HMGB1 was expressed and purified as described previously (25).
Pulldown Interaction Assays-In vitro GST pulldown assays were performed by incubating 400 ng of GST-RAG2 core and 1 μg of MBP-RAG1 core or variants together with glutathione-Sepharose 4B resin in interaction buffer (25 mM Tris, pH 8.0, 200 mM NaCl, 1 mM DTT), with 1 mg/ml BSA added to decrease nonspecific interactions, for 30 min at 4°C. After three 3-min washes, proteins were eluted with 10 mM glutathione and SDS-PAGE sample buffer and analyzed by SDS-PAGE. Western blots were developed with anti-GST (Santa Cruz Biotechnology), anti-MBP (Cell Signaling), or anti-RAG2 monoclonal antibody (mAb) number 39 (26).
For in vivo pulldown assays, 20 μg of RAG1 and RAG2 pEBB expression vectors (27) were co-transfected into a 10-cm dish of HEK293T cells using calcium phosphate. Forty-eight hours after transfection, cells were incubated with lysis buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 1 mM DTT, 1% Nonidet P-40, protease-inhibitor mixture (Roche Applied Science)) on ice for 30 min, and after centrifugation (15,000 rpm, 30 min), supernatants were incubated with glutathione-Sepharose 4B resin or amylose resin at 4°C for 1 h. After three 5-min washes, proteins were eluted and analyzed as described above.
V(D)J Recombination Assay-The pSF-12con/23con substrate plasmid contains consensus 12 and 23 RSSs separated by 405 bp and oriented so as to retain a signal joint on the plasmid. Three μg of each of three plasmids (pEBB-R1c or mutants, pEBB-R2c, and pSF-12con/23con) were co-transfected into a 6-cm dish of HEK293T cells with Lipofectamine 2000 (Invitrogen). Cells were harvested 48 h post-transfection, and extrachromosomal plasmid DNA was purified with a Qiagen miniprep kit. Recombination products were detected by PCR (34-35 cycles) or nested PCR (35 cycles followed by 35 cycles) followed by agarose gel electrophoresis, with some products confirmed to be proper signal joints by DNA sequencing.
Biolayer Interferometry and KD Calculation-Measurements were performed on a BLItz instrument (ForteBio). Prior to use, biosensors were soaked in BLItz assay buffer (20 mM Tris-HCl, pH 8.0, 150 mM KCl, 0.02% Tween 20, 2 mM DTT, 1 mg/ml BSA) for at least 10 min. Biolayer interferometry assays consisted of five steps, all performed in BLItz assay buffer: initial base line (30 s), loading, base line (30 s), association, and dissociation. GST-R2c and MBP-R1c-His6 were immobilized separately on anti-GST and anti-His sensors, respectively. For the loading step, protein concentrations were adjusted to yield a signal intensity in the range of 1 to 2 nm, thereby ensuring that the sensors were not saturated. Times and protein concentrations for the association and dissociation steps were as indicated in the figures and legends. Control values, measured using empty (no protein loaded) sensors, were subtracted from experimental values before data processing. Initial experiments indicated that empty sensors and sensors loaded with control GST protein yielded similar values in binding experiments with MBP-R1c (data not shown). Sensorgrams were fit globally to a 1:1 binding model by BLItz Pro version 1.1.0.28, from which the equilibrium dissociation constant (KD) and association (ka) and dissociation (kd) rate constants were calculated. RAG1 homodimers were treated as a single molecule in calculations of protein concentration.
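The relationship between the fitted rate constants and the equilibrium constant in a 1:1 binding model can be illustrated with a minimal sketch. This is not the BLItz Pro fitting routine; it only evaluates the standard 1:1 Langmuir expressions, and the rate constants and maximal response used below are hypothetical values chosen so that the ratio of rate constants is 0.4 μM.

```python
import math

def kd_from_rates(k_a, k_d):
    """Equilibrium dissociation constant (M) for 1:1 binding: K_D = k_d / k_a."""
    return k_d / k_a

def association(t, conc, k_a, k_d, r_max):
    """Sensor response at time t (s) during association at analyte conc (M)."""
    k_obs = k_a * conc + k_d                       # observed rate constant
    r_eq = r_max * conc / (conc + k_d / k_a)       # plateau response
    return r_eq * (1.0 - math.exp(-k_obs * t))

def dissociation(t, r_start, k_d):
    """Sensor response at time t (s) after transfer to buffer."""
    return r_start * math.exp(-k_d * t)

# Hypothetical rate constants (assumptions, not values from the figures).
K_ON = 5.0e4    # association rate constant, 1/(M s)
K_OFF = 2.0e-2  # dissociation rate constant, 1/s

equilibrium_kd = kd_from_rates(K_ON, K_OFF)
plateau = association(1000.0, equilibrium_kd, K_ON, K_OFF, r_max=1.0)
print(f"K_D = {equilibrium_kd * 1e6:.2f} uM; plateau at [analyte] = K_D: {plateau:.2f}")
```

In this model a slower dissociation rate lowers the equilibrium constant proportionally when the association rate is unchanged, which is why differences in dissociation kinetics translate directly into differences in apparent affinity.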
Protein Thermal Stability Assay-45 μl of a 5 μM protein solution (25 mM Tris-HCl, pH 8.5, 150 mM NaCl) was mixed with 5 μl of SYPRO Orange (Invitrogen) protein dye (final dilution 1:2,000). The thermal stability curve was measured using a CFX96 real time PCR machine (Bio-Rad). After an initial incubation at 10°C for 10 min, temperature was increased from 10 to 85°C in 0.5°C increments (30 s per increment), with fluorescence measured at each increment. Data were analyzed by Bio-Rad CFX Manager version 3.1, which generated the melting curves and their first derivatives (which are plotted).
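How a melting temperature can be read from a derivative plot is sketched below. This is an illustration only and is not the CFX Manager algorithm: it assumes the plotted quantity is the negative first derivative of fluorescence with respect to temperature, takes the temperature at its minimum as the melting temperature, and uses a synthetic sigmoidal unfolding curve with a hypothetical midpoint of 50°C.

```python
import math

def melting_temperature(temps, fluorescence):
    """Return the temperature at the minimum of -dF/dT (central differences)."""
    best_temp, best_value = None, float("inf")
    for i in range(1, len(temps) - 1):
        neg_slope = -(fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        if neg_slope < best_value:
            best_temp, best_value = temps[i], neg_slope
    return best_temp

# Temperature ramp from 10 to 85 degrees C in 0.5 degree increments.
temps = [10.0 + 0.5 * i for i in range(151)]

# Synthetic unfolding curve (hypothetical midpoint and width, not measured data).
TRUE_TM = 50.0
fluorescence = [1.0 / (1.0 + math.exp((TRUE_TM - t) / 2.0)) for t in temps]

tm = melting_temperature(temps, fluorescence)
print(f"Estimated melting temperature: {tm:.1f} C")
```

Real dye-binding curves usually fall again after unfolding because of aggregation, so a practical analysis would restrict the search to the transition region; the sketch omits that step.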
Size-exclusion Chromatography-A Superdex 200 10/300 GL column was equilibrated in 25 mM Tris-HCl, pH 8.5, 200 mM NaCl, 1 mM DTT; and MBP-Mini-RAG1 (8 μg) or Mini-RAG1 (2 μg) were fractionated separately, with absorbance monitored at 280 nm. The profiles were compared with gel filtration standards from Bio-Rad. For MBP-R2c and GST-R2c, 0.5 ml of cell lysate (from one 10-cm dish of HEK293T cells) was loaded after being passed through a 0.2-μm filter. Fractions were collected and analyzed by Western blots with anti-RAG2 antibody.
Quantitative RAG Western Blot Analysis-Thymuses from 5-week-old male C57BL/6 mice were mashed through a cell strainer in 2-3 ml of cold PBS, pelleted (800 × g, 4°C, 5 min), resuspended in 1 ml of red blood cell lysis buffer (10 mM KHCO3, 150 mM NH4Cl, 0.1 mM EDTA, 5% fetal bovine serum) for 5 min at room temperature, and re-pelleted as above. Cells were resuspended in 10 ml of cold PBS and counted manually with a hemocytometer. After pelleting, cells were lysed with at least 2 pellet volumes of 20 mM Tris, pH 7.4, 20 mM β-glycerol phosphate, 10 mM Na3VO4, 10% glycerol, 0.5 mM EDTA, 0.5 mM MgCl2, 400 mM NaCl, 0.5% Triton X-100, 1 mM DTT, and 100 mM PMSF or protease inhibitor mixture (leupeptin/pepstatin/aprotinin (Sigma)) on ice for 10 min. Samples were aliquoted, mixed with an equal volume of loading buffer containing 4% SDS, DTT, and loading dyes, and then frozen at −80°C. RAG protein standards were either MBP-RAG1 core and GST-RAG2 core, purified individually as described above (concentrations determined by mass spectrometry; Keck Biotechnology Resource Laboratory, Yale University), or MBP-RAG1 core and MBP-RAG2 core that were co-expressed and co-purified from HEK293T cells as described previously (24) (quantitated by SDS-PAGE followed by SYPRO Orange (Invitrogen) staining and comparison to BSA standards).
A known number of cell equivalents of thymocyte whole cell extract was analyzed by Western blotting together with a dilution series of one of the RAG protein standards. Standards were mixed with an appropriate amount of whole cell extract from M12 (a RAG-negative mature B cell line) prior to SDS-PAGE so as to create samples similar in composition to the thymocyte extracts. Samples in loading buffer (2% SDS) were heated to 95°C for 20 min, spun at 14,000 rpm for 5 min, fractionated by SDS-PAGE (8% acrylamide; 100 V, 1.5 h), and transferred to polyvinylidene difluoride membranes (Bio-Rad) by electroblotting (120 V, 0.5 h). Membranes were incubated with either anti-RAG2 mAb number 39 or anti-RAG1 mAb number 23 (26), diluted 1:500 in PBS overnight at 4°C, washed, and incubated with secondary antibody (goat anti-rabbit IgG (H+L) alkaline phosphatase, Jackson ImmunoResearch) diluted 1:1,000 in PBS for 1 h. Signals were detected using ECF chemiluminescent detection substrate (GE Healthcare). Membranes were imaged by Pharos Fx Plus (Bio-Rad); bands were quantitated using Image Lab version 4.1 (Bio-Rad), and local background (determined from a region of membrane just above each band) was subtracted to yield the signal for each fluorescent band. The signals from the protein standards were fit with a linear equation, which was used to determine the amount of RAG protein in the thymocyte sample. Similar experiments were performed with thymocyte whole cell extract and a purified fragment of the Ikaros protein containing the DNA binding domain (quantitated by mass spectrometry at the Yale Keck facility) and anti-Ikaros antibody recognizing that domain, both kindly provided by Sarah Wadsworth and Steve Smale.
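The standard-curve step can be sketched as follows: band signals from the dilution series are fit with a straight line, the thymocyte band signal is interpolated to an amount of protein, and that amount is divided by the number of cell equivalents loaded. This is an illustration under stated assumptions rather than the Image Lab procedure; the standard amounts (in fmol), band signals, and cell number are hypothetical, and the function names are introduced here for illustration only.

```python
AVOGADRO = 6.022e23  # molecules per mole

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def molecules_per_cell(signal, slope, intercept, n_cells):
    """Interpolate a band signal to fmol of protein, then convert to molecules per cell."""
    fmol = (signal - intercept) / slope
    return fmol * 1e-15 * AVOGADRO / n_cells

# Hypothetical dilution series of a protein standard (fmol) and band signals.
standard_fmol = [1.0, 2.0, 4.0, 8.0]
standard_signal = [150.0, 250.0, 450.0, 850.0]

slope, intercept = fit_line(standard_fmol, standard_signal)

# Hypothetical thymocyte lane: background-subtracted signal from 1e6 cell equivalents.
estimate = molecules_per_cell(350.0, slope, intercept, n_cells=1.0e6)
print(f"Estimated molecules per cell: {estimate:.0f}")
```

The interpolation is only meaningful when the sample signal falls within the range spanned by the standards, which is why a dilution series rather than a single standard is loaded alongside the extract.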
Modeling of the RAG1 RNase H Fold-A RAG1 consensus secondary structural prediction was built using the output of HNN (28), SOPMA (29), JPRED3 (30), SSPRO (31), and ITASSER (32). Accessibility, charge, and hydropathy profiles were generated with ACCpro (31) and tools from the analysis and modeling package SLIDE (33). Fold recognition methods such as PHYRE (34), RAPTORX (35), and ITASSER were unable to identify any global template for RAG1. Hence, by using remote homology techniques, it was possible to reconstruct by modeling only the region comprising the RAG1 RNase H domain (RNHd) and two adjacent extensions corresponding in sequence to aa 538-732 and 960-1010. The resulting model, comprising RNHd and extensions, was built starting from proteins, including Hermes transposase and HIV-1, HIV-2, and prototype foamy virus integrases, all known to contain an RNHd, which consists of nine secondary structural elements in the following order: β1-β2-β3-α1-β4-α2-β5-α3-α4. All of these RNHd regions exhibited less than 5 Å root mean square deviation from each other. The RNHd in RAG1, corresponding to the regions aa 594-732 and 960-996, exhibited remote homology with all templates in the RNase H group. However, the architecture of the RAG1 insertion domain was predicted to be significantly different from that of Hermes.
Remote homology modeling of the RNHd of RAG1 involved refining the alignment with information from secondary structural predictions and locking of critical active site amino acids (Asp-600, Asp-708, and Glu-962 in mouse RAG1). The closest template for the RAG1 RNHd is the Hermes transposase (Protein Data Bank code 4D1Q; 17% identity and 26% similarity), but the RAG1 RNHd has longer insertions between the secondary structural elements. These loops were generated ab initio by taking into account the properties of surrounding amino acids. In addition, based on homology and secondary structural predictions, it was possible to extend the RNHd model with 56 aa toward the N terminus (aa 538-593) and 14 aa toward the C terminus. Within the N terminus, aa 552-579 showed high homology with a corresponding helix in Hermes transposase (α0), whereas the rest had to be modeled ab initio. Here, a predicted helix containing aa 538-547 was modeled to satisfy both the hydrophobic pattern and "knobs into holes" packing constraints with respect to helix α0 and helix α4 within RNHd. The C terminus was modeled starting from a homologous stretch from HIV-1 integrase (Protein Data Bank code 4OVL).
The model was generated using INSIGHT II from Accelrys and further refined with a molecular dynamics procedure involving a Generalized Born simulated annealing using NAMD (36) on a 14 × HP BL280c G6 high performance computing cluster. The simulation used the CHARMM36 force field and was performed in implicit solvent for 10 ns with harmonic position restraints on the backbone of the protein in regions of secondary structure, whereas the loops were left to move freely so as to eliminate steric conflicts and bring the model to a lower energy minimum. The model was validated using the QA RecombineIT method for quality assessment (37).
Identifying RAG1 Residues for Mutagenesis-To select candidates for mutagenesis within RAG1 R2BD-B and the 997-1008 helix, we identified possible interacting residues such as aromatic and charged aa and designed mild point mutations intended to alter the interaction potential of the surface with minimal perturbation of the local structure. Mutants were screened for their ability to preserve the local sequence propensities and structure using SLIDE and the prediction programs noted above.
Calculation of Free and Bound RAG in Thymocytes-In the first scenario, a dimer of RAG1 was assumed to bind in two steps to two monomers of RAG2 according to Reaction 1,

$$A_2 + B \rightleftharpoons A_2B, \qquad A_2B + B \rightleftharpoons A_2B_2 \qquad \text{(Reaction 1)}$$

where $A_2$ and $B$ represent the RAG1 dimer and RAG2, respectively, the equilibrium binding constants of the two steps are $K_{D,1}$ and $K_{D,2}$, and $A_0$ and $B_0$ are the total concentrations of RAG1 dimer and RAG2 monomer, respectively, in mouse thymocytes, determined from quantitative Western blotting using a value of 6 μm as the approximate diameter of the thymocyte nucleus (38). Reaction 1 then yields two equilibrium equations in two unknowns that were solved to yield values for x and y using the computational engine Wolfram Alpha.
In the second scenario, a dimer of RAG1 was assumed to bind a RAG2 dimer (Reaction 2),

$$A_2 + B_2 \rightleftharpoons A_2B_2 \qquad \text{(Reaction 2)}$$

where $B_2$ is the RAG2 dimer and the equilibrium constant $K_D' = 0.04$ μM as determined by BLItz using GST-R2c and R1c. The equilibrium equation was then solved for x as above.
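The two scenarios can be explored numerically with a short sketch. It is not the Wolfram Alpha calculation used in the study, and it rests on several assumptions that should be read as illustrative: all RAG protein is taken to be nuclear, the nucleus is a sphere 6 μm in diameter, the molecule counts are the approximate per-cell estimates (∼1,800 RAG1 monomers, ∼15,000 RAG2 molecules), both steps of the two-step scenario use the same constant of 0.4 μM with no statistical factors, and in the dimer scenario all RAG2 is treated as dimeric. The function names are hypothetical.

```python
import math

AVOGADRO = 6.022e23  # molecules per mole

def nuclear_concentration(n_molecules, diameter_um=6.0):
    """Molar concentration of n_molecules confined to a spherical nucleus."""
    radius_dm = (diameter_um / 2.0) * 1e-5  # 1 um = 1e-5 dm; 1 dm^3 = 1 L
    volume_l = (4.0 / 3.0) * math.pi * radius_dm ** 3
    return n_molecules / (AVOGADRO * volume_l)

def bound_one_step(a0, b0, kd):
    """Complex concentration for A + B <-> AB (smaller root of the mass-action quadratic)."""
    total = a0 + b0 + kd
    return (total - math.sqrt(total * total - 4.0 * a0 * b0)) / 2.0

def bound_two_step(a0, b0, kd1, kd2):
    """Return ([A2B], [A2B2]) for sequential binding, by bisection on free [B]."""
    def species(b_free):
        a_free = a0 / (1.0 + b_free / kd1 + b_free * b_free / (kd1 * kd2))
        return a_free * b_free / kd1, a_free * b_free * b_free / (kd1 * kd2)
    low, high = 0.0, b0
    for _ in range(200):
        mid = (low + high) / 2.0
        single, double = species(mid)
        if mid + single + 2.0 * double > b0:  # too much B accounted for
            high = mid
        else:
            low = mid
    return species((low + high) / 2.0)

rag1_dimer = nuclear_concentration(1800 / 2)  # RAG1 counted as dimers
rag2 = nuclear_concentration(15000)           # RAG2 counted as monomers

single_bound, double_bound = bound_two_step(rag1_dimer, rag2, 0.4e-6, 0.4e-6)
dimer_bound = bound_one_step(rag1_dimer, rag2 / 2.0, 0.04e-6)

print(f"RAG1 dimer {rag1_dimer * 1e9:.1f} nM, RAG2 {rag2 * 1e9:.0f} nM")
print(f"Scenario 1: fraction of RAG1 dimer bound = {(single_bound + double_bound) / rag1_dimer:.2f}")
print(f"Scenario 2: fraction of RAG1 dimer bound = {dimer_bound / rag1_dimer:.2f}")
```

Under these assumptions both total concentrations come out below 0.4 μM, so a substantial fraction of RAG1 is predicted to be free of RAG2 in the two-step scenario.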

Zinc Finger B Is Not Required for the RAG1-RAG2 Interaction-We created a panel of MBP-RAG1 core (R1c) deletion mutants (Fig. 1B and Table 1) to test the conclusion of Aidinis et al. (20) that the RAG1 ZFB region is critical for binding RAG2. Strikingly, in vitro GST pulldown experiments revealed that complete deletion of ZFB (R1cΔ721-760, in which ZFB was replaced with a flexible linker of five glycine-serine (5×GS) repeats) had little effect on the amount of RAG1 protein pulled down with GST-RAG2 core (GST-R2c) (Fig. 1C, lanes 6 and 7). In addition, truncation mutants that removed large portions of the N- or N- and C-terminal portions of R1c but left ZFB intact were pulled down less efficiently (Fig. 1C, compare lanes 8 and 9). Pulldown of R1c528-777 was barely above that observed in the GST control (Fig. 1C, compare lanes 4 and 9). When we analyzed the same panel of mutants using an in vivo GST pulldown approach (as in the study of Aidinis et al. (20)), we found that again the deletion of ZFB had no discernible effect on the RAG1-RAG2 interaction (Fig. 1D, lanes 5 and 6), although the other deletion mutants were pulled down less efficiently (lanes 7 and 8). We conclude that ZFB can be deleted from RAG1 without disrupting the RAG1-RAG2 interaction.
Assessment of the RAG1-RAG2 Interaction by Biolayer Interferometry-Although active RAG1 core dimer can be purified in large quantities from bacteria (23), it has proven much more difficult to obtain substantial quantities of active, purified RAG2 core. Like many in the field (39), we routinely prepare RAG2 core (fused at its N terminus to GST or MBP) from transiently transfected 293T cells, which yields microgram quantities of active protein. To analyze the RAG1-RAG2 interaction, we took advantage of biolayer interferometry (40,41) as implemented in the BLItz™ instrument, which allows kinetic analysis of macromolecular interactions using small amounts of material. In a typical experiment, an anti-GST biosensor (the tip of which is coated with anti-GST antibodies) is loaded with GST-R2c, washed in buffer, and immersed in a solution of MBP-R1c, and the association kinetics are monitored. The biosensor is then immersed in buffer, and the dissociation kinetics are monitored, resulting in a kinetic binding sensorgram from which association and dissociation rate constants (ka and kd) and the equilibrium binding constant (KD) can be determined (see under "Experimental Procedures").
We first analyzed binding to GST-R2c by WT and mutant R1c proteins from which ZFB or portions of ZFB had been deleted. The binding sensorgrams of the WT and mutant proteins were similar, again suggesting that ZFB was not required for the interaction with RAG2 (Fig. 1E). We then performed a more detailed analysis using an R1c mutant (Cys-M) in which the two cysteine residues of ZFB predicted to coordinate zinc (Cys-727 and Cys-730) were mutated to alanine, which should prevent zinc binding and disrupt the function of the zinc finger (42). R1c and Cys-M (both fused to MBP at their N termini and His6 at their C termini) were coupled to anti-polyhistidine biosensors, and binding was measured using a range of MBP-R2c concentrations. The sensorgrams of the WT and Cys-M mutant proteins were similar in appearance and yielded comparable KD values of 0.38 and 0.35 μM, respectively (Fig. 1F). The Cys-M mutant was, however, inactive for V(D)J recombination in a transient transfection plasmid-based assay (Fig. 1G), consistent with previous analyses of mutations within ZFB (4,21). We conclude that ZFB is important for the function of RAG1 but is not critical for the interaction of RAG1 with RAG2.
We noted that dissociation of R1c from sensor-bound GST-R2c (Fig. 1E) was considerably slower than dissociation of MBP-R2c from sensor-bound R1c (Fig. 1F). Because GST forms a tight dimer with a KD in the nanomolar range (43), the different dissociation kinetics might be due to different association states of the RAG2 proteins used. As expected, GST-R2c behaved as a dimer, whereas MBP-R2c behaved as a monomer in gel filtration (Fig. 2A). Effects of dimerization on the binding and rate constants are addressed below.
Identification of a Minimal RAG1 Protein Able to Bind RAG2-We reasoned that comparison of RAG1 (which is dependent on RAG2) with Hermes transposase (which functions autonomously) might provide insights into the regions of RAG1 needed for interaction with RAG2. Comparison of the known secondary structural elements of Hermes transposase with the predicted elements of R1c revealed striking overall similarity, as noted previously (Fig. 3A) (16). Each is composed of three major regions as follows: an N-terminal DNA binding domain (Fig. 3A, red), an RNase H fold catalytic domain (purple and green), and a large α-helical insertion domain (yellow) that splits the catalytic domain between its main portion (purple) and C-terminal helices (green) containing the third active site residue. The C-terminal two-helix bundle of the RNHd in Hermes transposase lies across the face of the catalytic domain β-sheet so as to bring the three catalytic residues into close proximity (Fig. 3B).
In contrast, the regions lying N- and C-terminal to the predicted RNHd of RAG1 differed in sequence propensities from those in Hermes transposase (see "Experimental Procedures"), leading us to focus our attention on a region between the NBD and the central catalytic core, and a final C-terminal predicted α-helix (within the orange-shaded rectangle in Fig. 3A). We hypothesized that these regions (aa 479-559 and 997-1008) of R1c were important for binding to RAG2. Analysis of various truncation mutants suggested that these short regions were not stable by themselves but required the presence of the catalytic central core region to produce a stable protein (Table 1 and data not shown).
Based on these considerations, we created a deletion mutant lacking the NBD, most of ZFB, and the insertion domain, and containing only two portions of R1c, aa 479-732 and 960-1008, linked together with a flexible 5×GS linker (Fig. 3C). This protein, hereafter referred to as Mini-RAG1, was soluble, readily purified in large amounts from bacteria, and behaved as a monomer by gel filtration (Fig. 2B). Importantly, Mini-RAG1 interacted robustly with biosensors coated with GST-R2c (Fig. 3D, dark blue trace).
Analysis of deletion mutants of Mini-RAG1 (Fig. 3C) revealed that it was not possible to delete substantial portions of its N or C termini without affecting the interaction with RAG2. Deletion of aa 479-507 yielded a protein with reduced RAG2 binding capacity (KD ∼6 μM, sensorgrams not shown; see below for comparison with KD for Mini-RAG1), and a larger N-terminal deletion to aa 527, or C-terminal deletions of aa 998-1008 or 960-1008, strongly compromised the interaction (Fig. 3D). These mutants appeared to be well folded based on a comparison of their melting curves to that of Mini-RAG1 (Fig. 3E; the curve minima define the melting temperature). We noted, however, that the melting curve for Mini-RAG1 contained a shoulder (Fig. 3E, arrow) not seen in the mutants, which we speculate arises from an interaction of the N- and C-terminal portions of Mini-RAG1 (see below). We conclude that Mini-RAG1 represents a small, likely minimal, portion of RAG1 capable of interacting strongly with RAG2.
Quantitation of the RAG1-RAG2 Interaction-To determine the affinity of the interaction between Mini-RAG1 and RAG2, and to investigate the influence of stoichiometry on the interaction, sensorgrams were collected for several RAG1 proteins (Fig. 4A) using biosensors loaded with GST-R2c (a dimer). Dimeric MBP-R1c (Fig. 4B) dissociated considerably more slowly than monomeric Mini-RAG1, with (Fig. 4C) or without (Fig. 4D) an MBP tag. The effects of stoichiometry were confirmed using biosensors loaded with MBP-R1c (a dimer);
MAY 8, 2015 • VOLUME 290 • NUMBER 19
Analysis of the sensorgrams of Fig. 4, B-D, yielded a KD of 0.04 μM for the interaction of MBP-R1c with GST-R2c, but values 10-fold higher (0.48-0.66 μM) were obtained for the interaction of Mini-RAG1 with GST-R2c (Fig. 4E). As expected, the difference in KD was driven primarily by differences in the dissociation rate constants (Fig. 4E). Importantly, the KD value for the Mini-RAG1 interaction with GST-R2c was close to that obtained for the interaction of R1c with MBP-R2c (0.375 μM; Fig. 1F). Finally, we measured the KD value for the interaction of Mini-RAG1 with MBP-R2c (∼0.7 μM; Fig. 4F), demonstrating that a similar affinity is observed even when both proteins are monomers. The different arrangements of proteins analyzed by biolayer interferometry and the resulting KD values are summarized in Fig. 2D. We conclude that Mini-RAG1 recapitulates most of the RAG2 binding capacity of R1c. Furthermore, the data indicate that if both RAG1 and RAG2 are in dimeric form, they interact with one another with a 10-fold higher avidity than if one or both are monomeric.
Mapping of Residues Important for RAG2 Binding-Based on secondary structural predictions, we divided the N-terminal portion of Mini-RAG1 (hereafter referred to as the RAG2 binding domain, or R2BD) into regions A (aa 479-507) and B (aa 508-559) (Fig. 5A). Mutations were introduced into R2BD or the C-terminal helix of Mini-RAG1 (hereafter referred to as helix 997-1008), and the mutant proteins were purified and tested for their ability to interact with GST-R2c by biolayer interferometry. The melting curve of each protein was also determined (data not shown). The first set of mutations changed two or more adjacent residues and yielded a clear clustering of the mutations deleterious to the Mini-RAG1-RAG2 interaction (Fig. 5E, Mut-series 1; green text indicates poorly expressed and hence uninformative mutants). Most of the informative mutations in region B substantially reduced binding to RAG2 (Fig. 5E, red text), although none of the mutations in region A or the C-terminal helix did so (black text).
Based on this, single or double point mutations were introduced into R2BD-B and helix 997-1008, focusing on residues predicted to be surface-exposed and choosing mutations predicted to leave secondary structures intact (see "Experimental Procedures"). The sensorgrams for the C-terminal helix mutants are shown in Fig. 5B, and those for the single and double point mutations in R2BD-B are shown in Fig. 5, C and D, respectively. The results are summarized in Fig. 5E and Table 2. Charge-reversal mutations of Asp-531, Asp-551, and Lys-1001 had little effect on binding to GST-R2c, whereas K524D, RT529/530NA, and L541N reduced but did not abrogate binding. The K997E mutation substantially reduced binding, and K997Q had an even stronger effect. The most deleterious effects were seen with mutation of Asp-546 and Glu-547, where single charge neutralization (Fig. 5C), or double charge reversal or charge neutralization mutations (Fig. 5D), eliminated detectable binding. These results identify multiple residues in R2BD-B, as well as Lys-997 in the C-terminal helix, that contribute to the ability of Mini-RAG1 to bind to RAG2.
Finally, we determined the consequences of mutating Asp-546, Glu-547, or Lys-997 in the context of R1c. Biolayer interferometry revealed that single mutation of Asp-546, or double mutation of Asp-546 and Glu-547, eliminated detectable RAG2 binding, although some residual binding was seen with the E547Q single mutant (Fig. 6A). Notably, the K997Q mutation in the context of R1c had only a modest effect on RAG2 binding (Fig. 6A), in contrast to its strong effect in the context of Mini-RAG1 (Fig. 5B). Quantitation of RAG2 binding by R1c-K997Q revealed a KD of 0.054 μM (Fig. 6B), just slightly higher than the 0.04 μM value obtained for WT R1c. When analyzed with an in vitro GST pulldown assay, the D546N, E547Q, and D546N/E547Q mutants all eliminated detectable interaction with GST-R2c (Fig. 6C). However, the results of in vivo pulldown (Fig. 6D) and recombination assays (Fig. 6E) were more nuanced, with E547Q showing clear evidence of interaction with R2c and recombination activity, D546N supporting small but detectable levels of activity in both assays, and the D546N/E547Q double mutant having no detectable activity in either assay. The K997Q RAG1 core protein was as active as WT in all three assays, consistent with its strong binding by biolayer interferometry.

FIGURE 5. ... (R2BD). B-D, sensorgrams obtained using GST-R2c-loaded biosensors, incubated with 20 μM Mini-RAG1 or mutants thereof, as indicated. E, diagram of Mini-RAG1 mutants analyzed. Below the RAG1 sequence is indicated the predicted secondary structure (PSIPRED), and above it are indicated four series of mutants. Multiple aa changes contained in a single mutant are boxed. Green text indicates mutants that were poorly expressed and could not be analyzed, and black, blue, and red text indicate mutants with no effect, a moderate effect, or a strong effect, respectively, on binding to GST-R2c.
We conclude that Asp-546 and Glu-547 constitute a small acidic region of RAG1 required for binding to RAG2 and for recombination activity, with Asp-546 being a particularly important residue. This is consistent with a previous study that found that D546A and E547A R1c mutants exhibited defects in in vitro pulldown and cleavage assays, with the Asp-546 mutant having a stronger phenotype (22).
Structural Modeling of Mini-RAG1-Given the importance of Asp-546/Glu-547 and helix 997-1008 for Mini-RAG1 to interact with RAG2, we wondered whether these two regions of the protein might be close to one another in the folded protein.
To investigate this, we used homology modeling, with the Hermes transposase and retroviral integrases as starting templates, to create a structural model for a large portion of Mini-RAG1 (aa 538-732 linked to 960-1010), as described under "Experimental Procedures." Secondary structural analysis demonstrated that the RAG1 RNHd shared a similar cadence of structural elements with the RNase H folds of Hermes transposase and HIV-1 integrase (Fig. 7), consistent with a previous analysis (44). Furthermore, secondary structural similarity could be detected N- and C-terminal to the RNHd by comparison with Hermes and HIV-1 integrase, respectively (N-terminal extension (Nex) and C-terminal extension (Cex) in Fig. 7).
Mini-RAG1 consists of the RAG1 core with the NBD, part of ZFB, and insertion domain excised (Fig. 8A). The modeling revealed that the RAG1 RNHd could indeed be modeled as an RNase H fold, consisting of a five-stranded β-sheet on which lies a two-helix bundle (α3 and α4) from the C-terminal region, the first of which contributes Glu-962 (modeled in yellow) to the active site (Fig. 8B). The Mini-RAG1 model exhibits substantial overlap with Hermes transposase (Fig. 8C), as expected given that the Hermes structure provided a template for model construction. Notably, the three-dimensional disposition of the structural elements places the predicted Asp-546/Glu-547 helix (blue), helix 997-1008 (which makes up most of the C terminus; dark green), and the α3/α4 bundle (light green) on the same face of the structure (Fig. 8B). Neither the helix containing Asp-546/Glu-547 nor helix 997-1008 corresponds to the Hermes transposase template (Fig. 8C). This juxtaposition of structural elements in the model leads us to speculate that interaction of RAG2 with the Asp-546/Glu-547 helix could influence the structure of the C-terminal helices of the RAG1 core and thereby alter the geometry of the RAG1 active site.
Quantitation of RAG1 and RAG2 Protein in Thymocytes-How does the KD value for the RAG1-RAG2 interaction (∼0.4 μM) compare with the concentration of RAG1 and RAG2 in the nucleus of developing lymphocytes? It was previously estimated that the average thymocyte contains 10⁴-10⁵ molecules of RAG1 and RAG2, with RAG2 clearly in excess over RAG1 (45). The methodology used (immunoprecipitation followed by SDS-PAGE/silver staining) was crude and required assumptions that were difficult to validate. In an attempt to improve on this, we performed quantitative Western blotting of whole cell extracts prepared from total mouse thymus and compared the RAG1 and RAG2 signal intensities to those obtained from purified RAG protein preparations whose concentrations were determined either by mass spectrometry or SDS-PAGE followed by staining (see "Experimental Procedures"). The results (Fig. 9, A-D, for sample blots; tabulated in Fig. 9E) revealed an average value for RAG1 of 1,800 molecules (monomer) per cell (range, 900-2,200), and for RAG2, 15,300 molecules per cell (range, 6,800-23,300). To test for methodological error, we measured the number of molecules of the transcription factor Ikaros in total thymocytes using identical methods and a recombinant Ikaros protein standard (see "Experimental Procedures"). Two determinations yielded values of ∼150,000 and ∼250,000 Ikaros molecules per thymocyte (Western blots not shown), similar to the previously determined value of ∼250,000 (46).
Assuming a thymocyte nuclear diameter of 6 μm, these values yield nuclear concentrations of ∼0.013 and ∼0.22 μM for RAG1 dimers and RAG2, respectively. Although there are a number of assumptions and limitations associated with these values (see "Discussion"), they indicate that the concentration of RAG1 is below that of RAG2 and well below the KD value for the RAG1-RAG2 interaction.
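The conversion from molecules per cell to nuclear concentration can be sketched as follows. This is an illustrative calculation only: it assumes a spherical nucleus 6 μm in diameter in which the protein is uniformly distributed, and the function name is hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole

def nuclear_concentration_uM(molecules, diameter_um=6.0):
    """Micromolar concentration of `molecules` in a spherical nucleus.

    Assumption: the nucleus is a sphere of the given diameter and the
    protein is spread evenly through its entire volume.
    """
    radius_dm = (diameter_um / 2.0) * 1e-5          # 1 um = 1e-5 dm
    volume_l = (4.0 / 3.0) * math.pi * radius_dm ** 3  # 1 dm^3 = 1 liter
    return molecules / AVOGADRO / volume_l * 1e6    # mol/liter -> uM

# 1,800 RAG1 monomers correspond to 900 RAG1 dimers.
rag1_dimer_uM = nuclear_concentration_uM(1800 / 2)
rag2_uM = nuclear_concentration_uM(15300)

print(f"RAG1 dimer: {rag1_dimer_uM:.3f} uM")
print(f"RAG2:       {rag2_uM:.2f} uM")
```

With the average values reported above, this gives roughly 0.013 μM for RAG1 dimers and 0.22 μM for RAG2.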

DISCUSSION
This study provides several important insights. First, Mini-RAG1 was found to be a stable, well expressed protein that constitutes the key RAG2-binding components of the RAG1 core. Second, Mini-RAG1 could be modeled as an RNase H fold, allowing identification of a putative RAG2-interaction surface. Third, we determined the KD value of the RAG1-RAG2 interaction, which to our knowledge has not previously been reported. The KD value for Mini-RAG1 interaction with GST-R2c (0.5-0.6 μM) or with MBP-R2c (∼0.7 μM) was somewhat higher than that for the interaction between MBP-R2c and the full RAG1 core (0.4 μM), suggesting that Mini-RAG1 does not recapitulate all of the interactions that occur between the two RAG core proteins. Hence, it is possible that ZFB, the NBD, or the insertion domain contribute to the interaction with RAG2.
Our findings regarding ZFB conflict with those of Aidinis et al. (20). We suspect that the reason for the discrepancy lies in this previous study's reliance on a qualitative approach (GST pulldown) that might have allowed detection of weak interactions. We demonstrate that mutation of two zinc-coordinating residues of ZFB had no quantitative effect on binding of RAG1 core to RAG2 core (Fig. 1F), leading us to conclude that neither the presence nor the structural integrity of ZFB is required for binding of RAG1 to RAG2. However, as noted above, we cannot rule out the possibility that ZFB makes some contribution to the RAG1-RAG2 interaction.
Our data are in good agreement with other analyses of the RAG1-RAG2 interaction. McMahan et al. (18) observed that N- or C-terminal deletions from RAG1 aa 504-1008 substantially interfered with RAG2 binding, consistent with our findings regarding the importance of R2BD-B and helix 997-1008. Our findings also parallel those of Arbuckle et al. (19), who detected a weak interaction between RAG1 aa 528-760 and RAG2, similar to our results with RAG1 aa 528-777 (Fig. 1, C and D). Finally, our findings regarding the importance of Asp-546 and Glu-547 for the RAG1-RAG2 interaction and for V(D)J recombination agree with previous biochemical studies (22, 47).
Given the strong phenotype associated with mutation of Asp-546 and to a lesser extent Glu-547, it is plausible that these residues make direct contact with RAG2. However, we cannot rule out the possibility that they contribute indirectly to the interaction. Similarly, we cannot determine which portion or portions of R2BD or helix 997-1008 might make direct contacts with RAG2. Lys-997, which lies at the beginning of helix 997-1008 (Fig. 8B), remains enigmatic; mutation of Lys-997 to Gln had a much stronger effect on RAG2 binding in the context of Mini-RAG1 than in the context of RAG1 core. Previous analysis of K997A RAG1 core did not reveal a strong defect in V(D)J recombination (21). It is possible that Lys-997 is more important for stabilizing local protein structure in the context of Mini-RAG1 than when the rest of the RAG1 core is present.

FIGURE 6. Functional analysis of RAG1 core mutants. A, sensorgrams obtained using GST-R2c-loaded biosensors, incubated with 20 μM WT or mutant MBP-R1c proteins, as indicated. B, sensorgrams obtained using GST-R2c-loaded biosensors, incubated with the indicated concentrations of MBP-R1c K997Q, with the calculated KD shown. C, in vitro GST pulldown experiments were performed as described in Fig. 1C with WT or mutant MBP-R1c proteins, as indicated. Data are representative of two experiments. D, in vivo MBP-pulldown experiments were performed as described in Fig. 1D with WT or mutant MBP-R1c proteins, as indicated. Data are representative of two experiments. E, V(D)J recombination assay using WT and mutant RAG1 core proteins, as indicated, as described in Fig. 1G, with confirmation by nested PCR to increase the sensitivity (arrowhead indicates expected recombined product). Data are representative of three experiments.

FIGURE 8. ... Fig. 3A. B, remote homology model of the RNase H fold of the RAG1 core created by molecular dynamics modeling. DDE motif residues, yellow; N-terminal extension (538-593, Nex), dark pink or blue for the very N-terminal helix containing Asp-546 and Glu-547 (red); major part of catalytic domain (552-732), pink; C-terminal region (960-1010), green or dark green for the C-terminal extension (Cex). C, superposition of RAG1 model (colored as in B) and Hermes transposase catalytic core (gray). The root mean square deviation between the α-carbons in the two structures is 3.9 Å, with the best superposition located in secondary structural elements, although loops, where RAG1 has more insertions compared with Hermes, are significantly different.
Notably, a number of mutations found in RAG1 in human severe combined immunodeficient and Omenn syndrome patients map within or close to the borders of R2BD-B as follows: G513A, W519C, D536V, R556S, and R558H/R558C (aa numbers from mouse RAG1; human aa numbers are three larger). Such mutations could destabilize the interaction with RAG2. In our analysis, mutation of clusters of aa encompassing Trp-519, Asp-536, or Arg-556/Arg-558 greatly decreased the yield of Mini-RAG1 (Fig. 5E), suggesting that they are important for protein stability.
Our estimates of the average number of RAG1 (∼1,800) and RAG2 (∼15,000) molecules per thymocyte have some limitations and likely underestimate the actual values in at least some cell populations, particularly for RAG1. Thymocytes are heterogeneous, with ∼20% being mature CD4 or CD8 single-positive cells that do not express RAG and ∼5% being CD4/CD8 double-negative cells that express lower levels of RAG mRNA than do CD4/CD8 double-positive thymocytes.⁵ During extract preparation, the RAG proteins might not be quantitatively extracted from the cells; a portion of RAG1 is known to be difficult to extract from the nuclei (45). RAG1 Western blot signals are weaker and more difficult to quantitate accurately than those of RAG2 (Fig. 9). The values obtained depend heavily on how accurately the concentrations of the standards were determined. The fact that two different sets of RAG protein standards, quantitated by different methods, gave comparable results (within a factor of 1.5) provides some confidence in this regard. It is likely that double-positive thymocytes, which constitute about 75% of thymocytes, have higher levels of RAG1 and RAG2 than our estimates indicate. Our finding that RAG2 is in considerable excess over RAG1 is fully consistent with our previous study (45).
Determination of the KD value for RAG1-RAG2 binding (∼0.4 μM) provides a fundamental parameter useful for understanding the properties of RAG and potential mechanisms for regulating RAG function. Even if our value for the number of RAG1 molecules per thymocyte is a 4-fold underestimate, the concentration of RAG1 in the thymocyte nucleus is still well below the KD value. Applying the KD and RAG concentration values determined here to a simple model of a dimer of RAG1 binding sequentially to two RAG2 monomers (see "Experimental Procedures") yields the result that, at equilibrium, only 16% of RAG1 dimers are predicted to be bound to two molecules of RAG2, with 29 and 55% having one or no bound RAG2, respectively. Under the (implausible) assumption that all RAG2 molecules are dimers (with the corresponding 10-fold drop in the KD), a much higher percentage of RAG1 dimers (72%) is predicted to be bound to RAG2. In all scenarios, a high percentage of RAG2 is predicted to remain free of RAG1, consistent with a recent finding that much of RAG2 is readily extracted from thymocyte nuclei under mild detergent conditions (48).
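The occupancy estimates above can be approximated with a short equilibrium calculation. The exact model is given under "Experimental Procedures" and is not reproduced here, so the sketch below rests on assumptions: each of the two sequential binding steps is assigned the same KD with no statistical factors, free RAG2 is obtained from mass balance, and the all-dimer scenario is treated as simple 1:1 binding of a RAG2 dimer to a RAG1 dimer. The function names are hypothetical.

```python
import math

def sequential_fractions(dimer_total, ligand_total, kd, tol=1e-12):
    """Fractions of RAG1 dimers with 0, 1, or 2 bound RAG2 monomers.

    Assumed model: D + L <-> DL <-> DL2, both steps with dissociation
    constant kd. All concentrations share one unit (uM here).
    """
    def bound_ligand(free):
        x = free / kd
        return dimer_total * (x + 2 * x * x) / (1 + x + x * x)

    # Bisection on free ligand: free + bound must equal the total.
    lo, hi = 0.0, ligand_total
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid + bound_ligand(mid) > ligand_total:
            hi = mid
        else:
            lo = mid
    x = ((lo + hi) / 2) / kd
    z = 1 + x + x * x
    return 1 / z, x / z, x * x / z

def one_to_one_fraction(dimer_total, partner_total, kd):
    """Fraction of RAG1 dimers bound in a 1:1 complex (assumed model)."""
    b = dimer_total + partner_total + kd
    complex_conc = (b - math.sqrt(b * b - 4 * dimer_total * partner_total)) / 2
    return complex_conc / dimer_total

# RAG1 dimers 0.013 uM, RAG2 monomers 0.22 uM, KD 0.4 uM.
none, one, two = sequential_fractions(0.013, 0.22, 0.4)
# All RAG2 as dimers: 0.11 uM RAG2 dimers and a 10-fold lower KD.
dimer_bound = one_to_one_fraction(0.013, 0.11, 0.04)

print(f"no RAG2: {none:.0%}, one RAG2: {one:.0%}, two RAG2: {two:.0%}")
print(f"all-dimer scenario, RAG1 dimers bound: {dimer_bound:.0%}")
```

Under these assumptions the calculation gives approximately 55, 29, and 16% for zero, one, and two bound RAG2 molecules, and approximately 72% in the all-dimer scenario, in line with the values quoted above.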
Our study does not address the possibility that the non-core portions of the RAG proteins influence the affinity with which the RAG1 and RAG2 core regions interact, and our calculations of RAG protein concentrations are based on simplistic assumptions that do not take into account molecular crowding or other physical parameters that might alter their effective nuclear concentrations. There are numerous ways in which the RAG1-RAG2 interaction, and hence catalytic activity, might be regulated, including RAG post-translational modifications, binding of RAG to DNA or chromatin, and the influence of other binding partners. The KD and RAG concentrations derived here raise the possibility that the majority of RAG1 in the nucleus is in a configuration incompatible with coupled cleavage. This provides an appealing means of limiting genome damage caused by off-target RAG-mediated cleavage, which might be of particular importance given that RAG1 binding can be detected at many off-target sites in the developing lymphocyte genome. Our findings indicate that areas of high local RAG2 concentration (particularly if RAG2 molecules are held in close proximity as in the GST-RAG2 core dimer) should favor interactions with RAG1. Interestingly, RAG2 displays a punctate staining pattern in thymocytes (48) and in pro-B and pre-B cells,⁶ indicating that the protein is not uniformly distributed in the nucleus. The basis for this is unknown and does not seem to be a direct consequence of the ability of RAG2 to interact with methylated histone 3 (48). It will now be important to determine whether there are regulatory mechanisms that modulate the RAG1-RAG2 interaction in vivo.
While this work was under revision, Gellert and co-workers (49) reported the crystal structure of the RAG1 core-RAG2 core complex. This demonstrated that indeed the RAG1 catalytic center adopts an RNase H fold and identified an extensive RAG1-RAG2 interface, much of which is contained in Mini-RAG1. Residues in R2BD-B constitute a major part of the interaction surface, and interestingly, Asp-546 lies on an extended loop that projects into RAG2 where it forms a salt bridge with Arg-229 of RAG2. Also of note is that the ZFB region does not itself constitute a zinc finger but instead Cys-727 and Cys-730 coordinate a zinc ion together with His-937 and His-942 (49). Residues in the ZFB region are located at the interface with RAG2, but our deletion and mutation data argue that they are not a major contributor to the RAG1-RAG2 interaction.