Assembly Pathway and Characterization of the RAG1/2-DNA Paired and Signal-end Complexes*

Background: V(D)J recombination, initiated by the RAG1/2 protein complex, requires two types of DNA recombination signal sequences (RSS), 12RSS and 23RSS.
Results: Improved RAG1/2 expression and purification allowed us to systematically assemble protein-DNA complexes.
Conclusion: RAG1/2 binds one copy each of 12RSS and 23RSS DNA.
Significance: Strictly enforced pairing of the 12RSS and 23RSS explains the recombination specificity.

Mammalian immune receptor diversity is established via a unique restricted set of site-specific DNA rearrangements in lymphoid cells, known as V(D)J recombination. The lymphoid-specific RAG1-RAG2 protein complex (RAG1/2) initiates this process by binding to two types of recombination signal sequences (RSS), 12RSS and 23RSS, and cleaving at the boundaries of RSS and V, D, or J gene segments, which are to be assembled into immunoglobulins and T-cell receptors. Here we dissect the ordered assembly of the RAG1/2 heterotetramer with 12RSS and 23RSS DNAs. We find that RAG1/2 binds only a single 12RSS or 23RSS and reserves the second DNA-binding site specifically for the complementary RSS, to form a paired complex that reflects the known 12/23 rule of V(D)J recombination. The assembled RAG1/2 paired complex is active in the presence of Mg2+, the physiologically relevant metal ion, in nicking and double-strand cleavage of both RSS DNAs to produce a signal-end complex. We report here the purification and initial crystallization of the RAG1/2 signal-end complex for atomic-resolution structure elucidation. Strict pairing of the 12RSS and 23RSS at the binding step, together with information from the crystal structure of RAG1/2, leads to a molecular explanation of the 12/23 rule.

During lymphocyte development, an enormous variety of immunoglobulin and T cell receptor genes can be assembled through a series of carefully orchestrated DNA breakage and rejoining events by the process called V(D)J recombination (1). V(D)J rearrangements specifically occur between gene segments flanked by recombination signal sequences (RSSs). An RSS typically contains conserved heptamer (5′-CACAGTG) and nonamer (5′-ACAAAAACC) elements, separated by a non-conserved spacer of 12 (12RSS) or 23 (23RSS) nucleotides (2). Recombination follows the so-called 12/23 rule, by which one gene segment has to be flanked by a 12RSS and the other has to be flanked by a 23RSS (3).
V(D)J recombination is initiated when the products of the recombination activating genes 1 and 2, the RAG1 and RAG2 proteins, bind a pair of 12RSS and 23RSS DNAs and cooperate in cleavage of DNA at the boundaries between RSSs and their neighboring gene segments (or coding flanks) (4–6). Both RAG proteins have been trimmed to smaller, more tractable species. "Core" regions of RAG1 (384–1008 out of 1040 aa) and RAG2 (1–387 out of 527 aa) are active in V(D)J recombination on episomal or integrated substrates in cell culture (7–10), and are most frequently used in biochemical characterization. Comprehensive mutational analyses have found that the catalytic center of the RAG1/2 recombinase resides in the RAG1 subunit and contains a typical DDE triad of carboxylates, in this case Asp-600, Asp-708, and Glu-962 (11–13).
Biochemical studies have shown that the DNA cleavage at an RSS-coding gene boundary occurs in two steps. The RAG1/2 complex, with the assistance of an HMGB1 protein (14), specifically interacts with a 12/23 pair of RSSs to form a paired complex (PC) and introduces site-specific nicks precisely at the boundaries between the RSS heptamer and coding flank, leaving 3′-hydroxyl groups on each coding end (4). These 3′-OH groups then attack the phosphodiester bond of the opposite strand and make double-strand breaks, with hairpin loops on the coding ends and blunt 5′-phosphorylated signal ends (4). The hairpin ends are later opened, processed, and joined together by the non-homologous end-joining machinery to form a coding joint (15,16). Cleaved signal ends remain associated with RAG1/2 in a stable signal-end complex (SEC) (17,18) and later are also joined by non-homologous end-joining enzymes, in most cases being lost from the cells in the form of excised circles.
Although V(D)J recombination in vivo exhibits a strong (>30-fold) preference for a 12/23 RSS pair over 12/12 and 23/23 pairs, a low level of recombination between identical sites has been demonstrated (19). The 12/23 rule can also be captured in in vitro reactions. With Mg2+ as cofactor, formation of double-strand breaks is strongly favored if both 12RSS and 23RSS are bound, whereas nicking of individual 12RSS or 23RSS is observed (20). Under less stringent conditions in Mn2+, purified RAG1/2 can generate a double-strand break on either a 12RSS or a 23RSS alone. These observations have left it unclear whether the PC can form with two RSS DNAs of the same kind (21).
The assembly pathway leading to the PC has also been unclear. Some previous work has supported an ordered assembly of functional RAG1/2-RSS complexes. A "capture" model described in several studies suggests that RAG1/2 preferentially assembles first on the 12 RSS in a single complex (12RSS-SC) and then captures the 23RSS to form a PC (22,23). When RAG1/2 and HMGB1 were first incubated with the 12RSS, a very strong preference for acquiring and cleaving a 23RSS, versus a 12RSS partner, was observed. Although the reciprocal also appeared to be true, the adherence to the 12/23 rule was more strict if the 12RSS was the first one bound (23). A similar trend was seen on some chromatinized substrates, which is closer to the situation in cells (24). Further support of this model comes from detection of nicks of 12RSS in pre-B cells and in primary thymocytes, whereas nicks on 23RSS were not detected (22). However, an alternative model suggested that RAG1/2 binds and nicks each RSS type independently and then associates them into the final RAG1/2 complexes, with both 12 and 23 RSSs required for completing the cleavage reaction (25). In sum, it has not been clear whether RAG1/2 can simultaneously bind intact 12 and 23 RSSs to form the PC, or whether staged assembly preceded by DNA nicking is necessary to carry out the DNA cleavage that forms the SEC.
A central problem in the in vitro studies of V(D)J recombination has been to find an optimal source of active RAG proteins. Biochemical characterization has been successfully conducted using core RAG1/2 proteins expressed in insect cells (4,12,26,28,29). However, we found that only a small fraction of RAG1/2 in these preparations was active (30). To obtain homogeneous protein for structural studies, an activity-based purification was developed in our group. By immobilizing RSS DNAs on a chromatographic resin, the active subpopulation of RAG1/2 complexes in the form of SEC was isolated after cleaving the bound RSS. The SEC was produced by this activity-based purification in very low yield (~10 μg from several liters of insect cell culture), but it was sufficient for electron microscopic analysis and was shown to contain two RAG1 and two RAG2 units, one HMGB1, and one each of cleaved 12RSS and 23RSS (30).
In this work, a mammalian expression system has been used to generate RAG1/2 proteins that are superior in both activity and physical properties. We use these improved protein samples to show that the RAG1/2 tetramer specifically binds only a single copy of 12RSS or 23RSS DNA, but not two of the same kind. The assembled complex of RAG1/2 and a pair of intact 12/23RSS DNAs can be purified to homogeneity and can proceed to form the cleavage SEC product in good yield. We have crystallized the resulting SEC and thus paved the way to structural analysis of RAG1/2 by x-ray diffraction.

Experimental Procedures
Proteins and DNA-Several expression constructs encoding mouse core RAG1 (384–1008 aa) and RAG2 (1–387 aa) were made for co-expression in HEK293GNTI cells. A maltose-binding protein (MBP) tag was necessary for RAG2 preparation and was placed at the N terminus followed by a cleavable linker recognized by PreScission protease or a non-cleavable linker of five alanine residues. Protein yields and DNA cleavage activity were similar regardless of the type of MBP tag. For efficient removal of MBP, a 10-asparagine linker was added between the RAG1 and the PreScission cleavage site. For RAG2, seven glycines in addition to the asparagine linker were placed between the PreScission cleavage site and the RAG2. A 5′ Kozak sequence and 3′ stop codon were introduced by PCR. For MBP-tagged constructs, RAG genes were cloned between the AgeI and XhoI sites of pLEXm (31). The tagless construct of RAG1 was produced by inserting the RAG1 gene into pLEXm between its MfeI (compatible with EcoRI in pLEXm) and XhoI sites. All DNA sequences were confirmed.
For RAG1/2 co-expression, HEK293GNTI cells were cultured in suspension in FreeStyle medium supplemented with 1% FBS, 100 units/ml penicillin, and 100 μg/ml streptomycin (Gibco) to a density of 2 × 10^6 cells/ml. Cells were transfected by adding a mixture of 1 mg of RAG1/2 expression vector DNAs (total) and 4 mg of PEI per liter of culture. Cells were further incubated for 72 h at 37°C, 115 rpm, 5% CO2, and 95% humidity. After 72 h, cells were pelleted at 2500 rpm, washed in PBS, and stored at −80°C.
For protein purification, cell paste was thawed in 25 mM Tris, pH 7.5, 500 mM NaCl, 2 mM DTT (buffer A), with the addition of a Complete protease inhibitor cocktail (Roche Life Science) and then lysed by sonication. The lysate was clarified by ultracentrifugation at 35,000 rpm. The supernatant was loaded by gravity flow onto an amylose column (4 ml of resin, New England Biolabs) equilibrated in buffer A. After washing the column with 50 column volumes of buffer A, the RAG proteins were eluted with 40 mM maltose in buffer A. The heterotetrameric nature of RAG1/2 protein (30) was confirmed using a Superdex-200 GL 10/300 gel-filtration column (GE Healthcare) with molecular weight markers and SDS gel. For untagged core RAG1 and MBP-tagged core RAG2 (RAG1/M2), the protein peak fractions were pooled, and the salt concentration was lowered to 100 mM by the addition of a no-salt buffer (50 mM HEPES, pH 7.0, and 2 mM DTT). The protein solution was applied onto a 5-ml HiTrap Q HP column (GE Healthcare) equilibrated in 50 mM HEPES, pH 7.0, 100 mM KCl, 2 mM DTT and eluted with a gradient of 0.1–1 M KCl. RAG1/2 was eluted at ~300 mM KCl. Fractions containing RAG1/2 complex were pooled, and the salt concentration was raised to 500 mM KCl before applying onto a Superdex-200 column equilibrated in 50 mM HEPES, pH 7.0, 500 mM KCl, 5% glycerol, 2 mM DTT, and 2 mM maltose. Fractions from the main eluted peak were pooled and concentrated using an Amicon Ultra-15 centrifugal filter unit (Millipore) with 100-kDa cutoff. After adjusting the glycerol concentration to 30% (w/v), RAG1/2 aliquots were stored at −20°C and thawed only once before experiments. Human HMGB1 (1–163) (30) was expressed in BL21(DE3) cells and purified via heparin, HiTrap SP HP, HiTrap Q HP, and HiLoad 16/600 Superdex-75 columns in the final buffer (25 mM HEPES, pH 7.0, 100 mM KCl, 2 mM DTT, 0.1 mM EDTA, 10% (w/v) glycerol) and stored at −20°C as described (30).
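The salt adjustments in this purification follow from a simple mass balance. The sketch below is only an illustration, assuming ideal mixing and additive volumes; the function name `diluent_volume` and the 10-ml pool volume are hypothetical and do not come from the protocol.

```python
def diluent_volume(v_start_ml: float, c_start_mm: float, c_target_mm: float,
                   c_diluent_mm: float = 0.0) -> float:
    """Return the volume (ml) of diluent needed to reach a target concentration.

    Mass balance with additive volumes (an assumption):
        c_start * v_start + c_diluent * v_add = c_target * (v_start + v_add)
    """
    low, high = sorted((c_start_mm, c_diluent_mm))
    if not low <= c_target_mm <= high:
        raise ValueError("target must lie between start and diluent concentrations")
    if c_target_mm == c_diluent_mm:
        raise ValueError("target equals the diluent concentration")
    return v_start_ml * (c_start_mm - c_target_mm) / (c_target_mm - c_diluent_mm)


# Hypothetical 10-ml pool in 500 mM salt, diluted to 100 mM with no-salt buffer.
volume_to_add = diluent_volume(10.0, 500.0, 100.0)
print(f"add {volume_to_add:.1f} ml of no-salt buffer")
```

Under these assumptions, lowering the salt from 500 mM to 100 mM is a fivefold total dilution, so four volumes of no-salt buffer are added per volume of pooled protein.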
DNA oligonucleotides were synthesized by the NIH Facility for Biotechnology Resources and purified on Glen-Pak DNA purification cartridges according to the manufacturer's instructions, taking advantage of the hydrophobicity of the 4,4′-dimethoxytrityl protection group. Complementary strands were annealed in 10 mM Tris (pH 7.5), 50 mM NaCl, 0.2 mM EDTA and stored at −20°C. Pre-nicked substrates were assembled in the same way from three pieces for each RSS. The 12RSS sequence was GTCTTACACAGTGATACGTACCAGAACAAAAACCCTGCAG, and the 23RSS sequence was GTCTTACACAGTGATAGAACTCCAGCTGTCAGCCAGACAAAAACCCTGCAG. Both contain the conserved heptamer (CACAGTG) and nonamer (ACAAAAACC) elements.
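The heptamer-spacer-nonamer layout of the two substrates can be checked directly from the sequences. The following is a minimal sketch, not part of the published methods; the helper name `parse_rss` and the dictionary keys are hypothetical, and the hyphens that appear in the printed sequences are treated as line-break artifacts and removed.

```python
HEPTAMER = "CACAGTG"
NONAMER = "ACAAAAACC"

RSS12 = "GTCTTACACAGTGATACGTACCAGAACAAAAACCCTGCAG"
RSS23 = "GTCTTACACAGTGATAGAACTCCAGCTGTCAGCCAGACAAAAACCCTGCAG"


def parse_rss(top_strand: str) -> dict:
    """Split an RSS top strand into coding flank, spacer, and 3' extension."""
    seq = top_strand.replace("-", "").upper()
    hept_start = seq.find(HEPTAMER)
    if hept_start < 0:
        raise ValueError("heptamer not found")
    spacer_start = hept_start + len(HEPTAMER)
    # Search for the nonamer only downstream of the heptamer.
    non_start = seq.find(NONAMER, spacer_start)
    if non_start < 0:
        raise ValueError("nonamer not found")
    return {
        "coding_flank": seq[:hept_start],
        "spacer": seq[spacer_start:non_start],
        "spacer_len": non_start - spacer_start,
        "extension": seq[non_start + len(NONAMER):],
    }


for name, seq in (("12RSS", RSS12), ("23RSS", RSS23)):
    parts = parse_rss(seq)
    print(name, len(seq), "nt; spacer", parts["spacer_len"], "nt")
```

Both oligonucleotides share the same 6-nt coding flank and the same 6-nt extension past the nonamer, and differ only in the spacer.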
RAG1/2-DNA Complex Assembly and Analysis-The 12RSS-SC or PC with RAG1/M2 and HMGB1 was formed in 50 mM HEPES (pH 7.0), 60 mM KCl, 5 mM CaCl2, 1 mM maltose, 0.01% n-dodecyl-β-D-maltoside (DDM), and 2 mM DTT. Protein-DNA mixtures were incubated for 20 min at room temperature followed by 5 min at 37°C. In all experiments, the concentration of RAG1/M2 was 1 μM (as a heterotetramer), and HMGB1 was at 3 μM. The amount of 12RSS varied, depending on the experiment, from 1.2- to 2.2-fold molar excess over the RAG1/2 heterotetramer. After incubation, the complex was analyzed on a Superdex-200 GL 10/300 column, equilibrated with 50 mM HEPES (pH 7.0), 60 mM KCl, 1 mM maltose, and 2 mM DTT. Elution points of molecular mass standards (669, 443, 200, and 150 kDa) were established under the same buffer conditions. The content of the peak was analyzed by SDS-PAGE stained with Coomassie Blue and by a TBE-urea gel stained with SYBR Green.
To determine the protein to DNA ratio, A260 and A280 were measured for protein-DNA complexes eluted from the Superdex-200 column. To investigate the ratio of the bound DNA species in the S200-purified PC, proteins were digested by proteinase K, and the DNA in the reaction mixture was 32P-labeled and analyzed on a TBE-urea gel. The 23RSS-SC or PC was formed with M1/M2 (both RAG1 and RAG2 with MBP tags) under the same condition as described above for the 12RSS-SC.
DNA Cleavage Assays-To assay coupled DNA cleavage, the bottom strands of the 12RSS and 23RSS were 32P-labeled. The complex was assembled by mixing a 1:1 molar ratio of RAG1/2 heterotetramer and 12/23RSS in 50 mM HEPES (pH 7.0), 60 mM KCl, 5 mM CaCl2, 1 mM maltose, 0.01% DDM, and 2 mM DTT. The cleavage reaction was initiated by the addition of 5 or 10 mM MgCl2, or 2, 1, or 0.5 mM MnCl2, or a combination of both metal ions. Reactions were allowed to proceed for 1–3 h at 37°C and stopped by adding formamide solution containing 10 mM EDTA and a loading dye, heated, and resolved on TBE-urea denaturing gel. Radioactively labeled products were visualized by a PhosphorImager and quantified using ImageQuant NL (GE Healthcare).
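Cleavage efficiencies of this kind are usually expressed as the share of the lane signal found in the product band. The sketch below illustrates that arithmetic only; it is not the ImageQuant procedure, the function name `band_percentages` and the band labels are hypothetical, and the counts are invented placeholders rather than measured data.

```python
def band_percentages(counts: dict[str, float]) -> dict[str, float]:
    """Express each band as a percentage of the total signal in one lane."""
    if any(value < 0 for value in counts.values()):
        raise ValueError("band counts must be non-negative")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("lane has no signal")
    return {band: 100.0 * value / total for band, value in counts.items()}


# Hypothetical background-subtracted counts for one lane (not measured data).
lane = {"uncleaved": 400.0, "hairpin": 600.0}
percent = band_percentages(lane)
print(f"hairpin: {percent['hairpin']:.0f}% of lane signal")
```

Normalizing within each lane makes the result independent of loading differences between lanes, which is why the product fraction rather than the raw product signal is compared across conditions.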
For cleavage assays used to compare the RAG1/M2 complex with the M1/M2 complex, in which both RAG1 and RAG2 carried MBP tags, the top strand of the 12RSS was 32P-labeled. 100 ng of RAG1/2 (as a heterotetramer), 100 ng of HMGB1, and 2 nM each of labeled 12RSS and cold 23RSS were mixed in 50 mM HEPES (pH 7.0), 60 mM KCl, 0.01% DDM, 2 mM DTT, 5 mM MgCl2, and 2 mM maltose and incubated for 1 h at 37°C. Reaction products were analyzed on a 15% TBE-urea gel and visualized by autoradiography.
SEC Preparation and Crystallization-Co-expression and purification of MBP-tagged RAG1/2 protein were conducted as described above except for omitting the HiTrap Q HP column. The PC was assembled using the following 12RSS (GTCTTACA- […] After incubation for 20 min at room temperature, DNA cleavage was carried out for 1 h at 37°C. PreScission protease was then added with overnight incubation at 4°C. The cleavage mixture was loaded onto a 5-ml HiTrap Q HP column (GE Healthcare) equilibrated with 25 mM HEPES (pH 7.0), 100 mM KCl, 2 mM DTT, and 1% glycerol. Unwanted components (cleaved MBP, unreacted or aggregated RAG1/2) were eluted using a step gradient of 300 mM and 550 mM KCl. The SEC was then eluted with 650 mM KCl. After concentration using an Amicon Ultra-15 centrifugal filter unit (Millipore) with 100-kDa cutoff, the SEC complexes were loaded onto a Superdex-200 10/300 GL column equilibrated with 25 mM HEPES (pH 7.0), 500 mM KCl, 1% (v/v) glycerol, 2 mM DTT. The eluted peak was collected and concentrated to ~10 mg/ml protein (determined by the Bradford method). Crystals were grown within a week at 4°C using a reservoir of 0.2 M potassium citrate tribasic monohydrate and 18% (w/v) PEG 3350, by the sitting-drop diffusion method. Single crystals were cryoprotected in mother liquor supplemented with 30% (v/v) ethylene glycol, transferred, mounted in MicroLoops (MiTeGen), and flash-cooled in liquid nitrogen. Diffraction was tested on the 23-ID-D beamline at the Advanced Photon Source at Argonne National Laboratory.

Results
Expression and Purification of RAG Proteins-Because the RAG1/2 protein complex is found only in jawed vertebrates, we wondered whether expression of RAG1/2 in mammalian instead of insect cell culture could improve the protein solubility and activity. Core regions of mouse RAG1 (384–1008 aa) and RAG2 (1–387 aa), which are abbreviated as RAG1 and RAG2, were placed downstream of an MBP tag with a cleavable linker in the mammalian expression vector pLEXm (31) (see "Experimental Procedures" and Fig. 1A). MBP greatly improves the expression yield and solubility of RAGs (4). RAG1 and RAG2 co-expressed in HEK293GNTI cells were readily purified to near homogeneity by one-step amylose affinity chromatography and eluted as a heterotetramer of two RAG1s and two RAG2s in a single symmetric peak from the ensuing gel-filtration column (Fig. 1, B and C). The yield of the RAG1/2 protein expressed in mammalian suspension culture (4–8 mg/liter of cultured cells) was similar to that of insect culture. Chemically, the RAG1/2 proteins expressed in mammalian and insect cells are identical, with neither containing detectable post-translational modifications, as determined by mass spectrometry analysis (footnote 4). From the mammalian co-expressed RAG1/2 proteins, the MBP was readily cleaved off RAG1, resulting in a RAG1 and MBP-RAG2 complex (abbreviated as RAG1/M2) (Fig. 1C). We found that the MBP tag on RAG2 was essential for obtaining good expression of the RAG1/2 complex and also allowed the use of an amylose pulldown as a first purification step. As an alternative, we created a tagless construct for RAG1 and co-expressed it with a non-cleavable MBP-tagged RAG2 (MRAG2 or M2 for short) (Fig. 1D) to avoid the overnight tag removal step (see "Experimental Procedures"). The resulting protein (Fig. 1E) was as active in coupled DNA cleavage as RAG1/2 complex with MBP tags on both subunits (Fig. 1, F and G). For the remaining experiments, unless specified, the RAG1/M2 complex was used.
Complex of RAG1/2 Tetramer and a Single 12RSS-According to the previously proposed capture model, the RAG1/2 heterotetramer preferentially assembles first on a 12RSS in a single complex (12RSS-SC), and a 23RSS then binds to form the paired complex (22,23). To test this model, we first asked how many copies of 12RSS can be accommodated in a 12RSS-SC. The tetrameric RAG1/M2 was mixed with intact 12RSS (Figs. 1F and 2A) in a molar ratio of 1:1.2 or 1:2.2 in the presence of Ca2+ and HMGB1 protein. The small excess of DNA substrate over RAG1/2 was used to ensure that there was no free protein left in solution. After incubation and separation by size-exclusion chromatography, the major peak eluted from the sizing column, which was shifted when compared with the RAG1/M2 protein alone, contained both RAG1/M2 and 12RSS (Fig. 2, B and C) and was identical for both ratios of DNA to protein (in position and amplitude as well as the ratio of A260/A280). The later eluting peak of excess 12RSS was merely increased at the higher ratio of DNA. These observations indicated that the majority of RAG1/2 tetramer was able to accommodate only one copy of 12RSS DNA in the 12RSS-SC, although we cannot rule out that a minute fraction of RAG1/2 tetramer below our detection capability might form a complex with two copies of the same RSS DNA.
We next asked whether the DNA cleavage intermediate, a pre-nicked 12RSS substrate (Fig. 2A), could be assembled with RAG1/M2. When the experiments were repeated with tetrameric RAG1/M2 and the pre-nicked 12RSS at 1:1.2 and 1:2.2 molar ratios, the elution profiles gave the same result: a single copy of pre-nicked 12RSS appeared to be bound to each RAG1/M2 tetramer (Fig. 2, B and C).
To test whether single complexes (SC) of intact and pre-nicked 12RSS were active in DNA cleavage, the SC fractions eluted from the sizing column were incubated with 2 mM Mn2+. Analyses of the products showed that the SCs supported hairpin formation (Fig. 2C). The reaction was more complete with pre-nicked 12RSS than with intact DNA. We conclude that a RAG1/M2 tetramer binds one copy of 12RSS DNA and can proceed to cleave the DNA in the presence of Mn2+.
The 12RSS-SC Reserves the Second DNA-binding Site for a 23RSS and Vice Versa-When the purified 12RSS-SC was incubated with a 1.2-fold molar excess of 23RSS, size-exclusion chromatography yielded a major peak, corresponding to a bona fide PC with RAG1/M2, HMGB1, and both 12RSS and 23RSS, and a small peak of excess 23RSS (Fig. 2, D and E). 32P labeling of the DNA showed that the two RSSs were present in 1:1 ratio, although by SYBR Green staining, the intensity of the 23RSS DNA in the purified PC was higher than that of the shorter 12RSS (Fig. 2E). Thus when one (and only one) site is occupied by a 12RSS in the initial 12RSS-SC, the other is reserved for a 23RSS, which can be incorporated later to form a PC.
4 G. J. Grundy, M. Gellert, and W. Yang, unpublished data.
However, assembly of an SC did not have to start from a 12RSS. The SC with 23RSS (23RSS-SC) could also be assembled by mixing M1/M2 protein (with MBP tags on both) with 23RSS (intact or pre-nicked) at 1:1.2 or 1:2.2 molar ratios in the presence of a divalent cation and HMGB1 protein. Whether the molar ratio of 23RSS to RAG1/2 protein was 1.2:1 or 2.2:1 in the incubation mixture, the protein-DNA complexes eluted from the gel-filtration column appeared to contain only one 23RSS for each RAG1/2 heterotetramer, as judged by protein and DNA gels (Fig. 2, F and G) and confirmed by their having the same A260/A280 ratios (1.35 and 1.36). The 23RSS-SC was readily converted to the PC after the addition of 12RSS (Fig. 2, F and G), with the A260/A280 ratio increased to 1.54. Therefore, the previously reported SCs (22,23) most probably contained a single copy of either 12RSS or 23RSS DNA. SCs of 12RSS or 23RSS can be converted to a PC with the addition of the second type of RSS DNA.

The Paired Complexes Are Fully Active in DNA Cleavage-Assembly of the paired complex did not require stepwise incubation. PC was also produced by direct mixing of all components (Fig. 3, A and B). Furthermore, the PC could be assembled from pre-nicked DNA substrates in the same way, forming a pre-nicked PC ("PC-nick") (Fig. 3, A and B). The 12/23 RSSs were present in 1:1 ratio in both the PC and PC-nick, again based on the intensity of DNAs labeled with 32P (Fig. 3, C and D).
In previous studies, RSS cleavage by RAG1/2 was often incomplete even in the presence of a large excess of protein (4,14,20). To determine how efficient double-strand cleavage by PC or PC-nick was, both types of complexes were assembled with both 12 and 23 RSS DNAs containing 5′-32P-labeled bottom strands in the presence of Ca2+ and subjected to cleavage by adding Mg2+ or Mn2+ or a combination of both. Double-strand cleavage of a non-nicked PC did not go to completion (about 60% at best) even when all DNA was bound to RAG1/M2 (Fig. 3C). However, a PC assembled from pre-nicked substrates was more efficiently processed (Fig. 3D), presumably because the first step of DNA cleavage was bypassed. The fact that cleavage of the non-nicked PC is only ~60% complete may suggest that the first step of the reaction is a limiting step under these conditions.

FIGURE 2. … 12RSS (gray dashes). The elution profile of the pre-nicked 12RSS at 1.2:1 molar ratio was identical to that of the intact 12RSS at the same molar ratio and thus is omitted for clarity. The 12RSS-SC complex (355 kDa) was readily identifiable by the peak shift when compared with RAG1/M2 protein alone. Arrows indicate the elution points of molecular mass markers (443 and 150 kDa). The excess 12RSS (intact or pre-nicked) was eluted in a later peak. C, analysis of the elution peak of the 12RSS-SC by SDS-PAGE and denaturing TBE-urea gel stained by Coomassie Blue and SYBR Green, respectively. The DNA gel also shows double-strand cleavage by purified 12RSS-SC in Mn2+. D, superimposed gel-filtration profiles of the 12RSS-SC (blue) and its conversion to PC (orange, 386 kDa) after incubation with 23RSS (gray filled triangle). The excess of free DNA was eluted after SC and PC. E, analysis of components of the PC peak by SDS-PAGE and denaturing TBE-urea gel electrophoresis (stained by SYBR Green). 32P labeling of the PC peak showed that the 12RSS and 23RSS are present in 1:1 molar ratio. The 32P labels on DNA strands are marked with asterisks. F, superposition of gel-filtration profiles of 23RSS-SC formation (which was not altered by the addition of more 23RSS (green)), and its conversion to PC after the addition of 12RSS (orange). Because both RAG1 and RAG2 were MBP-tagged, the molecular masses of SC and PC were 446 and 468 kDa. Due to the flexible linkers, their apparent sizes appeared to be larger than the calculated values. The elution points of molecular mass markers (669 and 150 kDa) are indicated by arrows. G, the protein and DNA components of 23RSS-SC and PC were confirmed by SDS and TBE-urea gels. In panels C, E, and G, the open and filled triangles represent 12RSS and 23RSS, respectively, as shown in panel A.
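The molecular masses quoted in the Fig. 2 legend allow a rough consistency check: the mass gained on going from an SC to the PC should be close to the mass of one duplex of the added RSS. The sketch below is illustrative only. It assumes an average of about 650 Da per base pair of duplex DNA, a common rule of thumb that is not taken from this paper, and it uses the 40-bp and 51-bp lengths of the 12RSS and 23RSS oligonucleotides; the function names are hypothetical.

```python
DA_PER_BP = 650.0  # assumed average mass of one base pair of duplex DNA, in Da


def duplex_mass_kda(n_bp: int) -> float:
    """Estimate the mass of a DNA duplex of n_bp base pairs, in kDa."""
    return n_bp * DA_PER_BP / 1000.0


def mass_gain_kda(sc_kda: float, pc_kda: float) -> float:
    """Mass difference between a paired complex and its parent single complex."""
    return pc_kda - sc_kda


checks = [
    ("12RSS-SC to PC (adds 23RSS, 51 bp)", mass_gain_kda(355, 386), duplex_mass_kda(51)),
    ("23RSS-SC to PC (adds 12RSS, 40 bp)", mass_gain_kda(446, 468), duplex_mass_kda(40)),
]
for label, quoted_gain, estimate in checks:
    print(f"{label}: quoted gain {quoted_gain:.0f} kDa, estimated duplex {estimate:.1f} kDa")
```

With this rule of thumb, the quoted gains of 31 and 22 kDa are in the range expected for a single added duplex (about 33 and 26 kDa), and well below what two copies would contribute.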
To increase the cleavage of PC-nick, we varied the metal ion composition and reaction time (Fig. 3, D and E). With Mg2+ alone, cleavage was incomplete (~80%), whereas in 2 mM Mn2+, there was more than 90% conversion to hairpins (Fig. 3D). The best condition used a mixture of 10 mM MgCl2 and 2 mM MnCl2, which yielded nearly 100% hairpin conversion after 2 h at 37°C (Fig. 3, E and F).
Preparation and Crystallization of SEC-The efficient DNA cleavage of our assembled PC produced an ample amount of the SEC, a stable complex that is known to resist challenge by high salt or heparin (18). To obtain homogeneous tagless SEC for x-ray crystallography, we prepared RAG1 and RAG2 both containing a cleavable MBP tag on the N terminus (Fig. 1A). The amylose affinity-purified fully tagged RAG1/2 (Fig. 1A) and RSS DNAs (Fig. 1F) were assembled under the DNA cleavage conditions to allow PC formation and its conversion to the SEC (Fig. 4A). The MBP tags were removed later by overnight incubation with PreScission protease (Fig. 4A). When the entire DNA and protein cleavage reaction mixture was loaded on a HiTrap Q column equilibrated in the low salt reaction buffer, the HMGB1 protein, which was likely displaced by the positively charged column resin, dissociated from the SEC and flowed through the column (Fig. 4, B and C). The SEC was separated from MBP, PreScission protease, free RAG1/2, and DNA with a step gradient of KCl, and then eluted alone at 650 mM KCl as peak 2 (Fig. 4C). The HMGB1-free SEC was purified to homogeneity over a Superdex-200 column (Fig. 4D). SDS-PAGE and TBE-urea gels confirmed that it contained the cleaved 12 and 23 RSS as well as RAG1/2 proteins (Fig. 4E).
The SEC was easily concentrated to 5–10 mg/ml and screened for crystallization. Crystals of the SEC were obtained in 0.2 M tri-potassium citrate and 18% (w/v) PEG 3350 (see "Experimental Procedures" and Fig. 4F) and were verified to contain RAG1/2 proteins as well as 12/23RSS DNAs (Fig. 4G). To optimize the SEC crystals, the extensions downstream of the nonamer on 12RSS and 23RSS were varied from 2 to 15 bp in parallel. An 8-bp extension on both RSS DNAs (Fig. 1F) gave the best crystals. These SEC crystals diffracted x-rays to 4.5 Å, but the diffraction pattern indicated that they were mosaic and consisted of clusters of thin plates (Fig. 4H). Although it did not immediately yield a three-dimensional structure, the readiness of the SEC to crystallize inspired continued efforts and led to the eventual crystal structure of RAG1/2 (34), for which the SEC was prepared from pre-cleaved RSS DNAs.

Discussion
Using HEK293 cells and a mammalian expression vector, we were able to produce ample RAG1/2 protein in the active tetrameric form. By showing that this protein complex can be quantitatively converted to the PC and then SEC after DNA cleavage, we found that RAG1/2 in this preparation is essentially 100% active. This contrasts with RAG1/2 expressed in insect cells, which forms amorphous soluble aggregates and contains only a small fraction of active species (30). Because RAG1/2 produced in neither insect nor human cells is post-translationally modified according to mass spectrometry analysis, it may be that protein folding of RAG1/2 in insect cells is defective. We also found that MBP tags on RAG1, RAG2, or both do not alter the biochemical properties of the RAG1/2 recombinase, which is consistent with previous characterization of the SEC (30). However, the stability of RAG2 depends on the MBP fusion until RAG1/2 is associated with DNA. The MBP tags on RAG2 can be removed in either the PC or the SEC but not from the tetrameric RAG1/2 proteins alone. Apparently, RSS DNA binding stabilizes the RAG1/2 recombinase.
RAG1/2 isolated from HEK293 cells is a preformed tetramer without DNA (Fig. 1B) (34). Our step-by-step assembly and characterization of single and paired complexes unequivocally show that the tetrameric RAG1/2 binds and synapses one 12RSS and one 23RSS into a paired complex without change of oligomeric composition, whether the RSS DNAs are intact or pre-nicked. The tetrameric RAG1/2 within PC is fully active in cleaving 12/23RSS DNAs, as shown previously (30,32). Characterization of the RSS DNA cleavage product SEC by EM and atomic force microscopy analyses (30) also gives no evidence of species other than RAG1/2 tetramer. These observations are in contrast to the report, largely based on atomic force microscopy measurements, that a RAG PC assembled on a long coding DNA contains a pair of RAG1/2 tetramers in the form of a heterooctamer (33).
The detectable single complexes of RAG1/2 tetramer with 12RSS or 23RSS DNA contain only one copy of either RSS, and the second DNA-binding site is reserved for binding the complementary RSS DNA. Although this conclusion was also drawn from earlier work (20), it then relied on gel electrophoresis, with the possibility of confounding gel-specific effects. This specificity in DNA selection could arise in two ways. The first possibility is that each tetrameric RAG1/2 initially contains two equivalent sites that can bind either a 12RSS or a 23RSS DNA, and the protein undergoes a conformational change upon binding of one RSS, becoming asymmetric and selecting specifically for the absent RSS DNA (Fig. 5). The alternative, that two non-equivalent DNA-binding sites are present in RAG1/2 before any single complex is formed, is less likely than the first scenario because the two copies of RAG1/2 in the tetramer are intrinsically identical.
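The binding behavior described here, in which one site accepts either RSS and the second site is then reserved for the complementary one, can be summarized as a small state model. This is a toy sketch for illustration only, not a kinetic or structural model; the class `RagTetramer` and its methods are hypothetical names.

```python
class RagTetramer:
    """Toy model of a RAG1/2 tetramer with two RSS-binding sites."""

    def __init__(self) -> None:
        self.bound: list[int] = []  # spacer lengths of the bound RSSs

    def bind(self, spacer: int) -> bool:
        """Try to bind an RSS; return True if it is accepted."""
        if spacer not in (12, 23):
            raise ValueError("spacer must be 12 or 23")
        # Reject when both sites are full or the same RSS type is already bound.
        if len(self.bound) >= 2 or spacer in self.bound:
            return False
        self.bound.append(spacer)
        return True

    @property
    def state(self) -> str:
        if not self.bound:
            return "free tetramer"
        if len(self.bound) == 1:
            return f"{self.bound[0]}RSS-SC"
        return "PC"


tetramer = RagTetramer()
for spacer in (12, 12, 23):
    accepted = tetramer.bind(spacer)
    print(f"offer {spacer}RSS: accepted={accepted}, state={tetramer.state}")
```

The model reaches the PC from either starting RSS, mirroring the observation that both the 12RSS-SC and the 23RSS-SC convert to the PC once the complementary RSS is added.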
Recently, following the procedure described here, we have determined the crystal structure of RAG1/2 without RSS DNA (PDB: 4WWX) (34). Interestingly, the nonamer-binding domains (NBDs) of RAG1 subunits are dimerized as reported (27) and appear to be connected to the bulk of the protein by a flexible link. In the crystals, they are positioned asymmetrically relative to the two catalytic centers, presumably due to crystal packing forces. We thus hypothesize that the NBDs are mobile relative to the bulk of the RAG1/2 tetramer and to the catalytic domains that bind the heptamers of the RSS DNAs (Fig. 5). Upon binding of either 12RSS or 23RSS, the RAG1/2 tetramer becomes asymmetric, with the NBD tilted such that only the other type of RSS DNA can bind (Fig. 5). The 12/23 rule for V(D)J recombination is thereby established at the paired complex formation and is reinforced by the efficiency of DNA cleavage only in the paired and not single complex (20).

FIGURE 5. … Once bound to either a 12RSS or a 23RSS to form an SC, the NBD becomes tilted asymmetrically relative to the two active sites (marked by red stars), and the second DNA-binding site can only bind the complementary RSS to form a PC. The PC can also assemble from 12RSS and 23RSS DNAs added simultaneously, whether the DNAs are intact or pre-nicked. DNA cleavage in the PC occurs in the presence of Mg2+, leading to SEC formation. HMGB1, which facilitates both SC and PC formation, is omitted in the diagram for clarity.