Transcription-induced Cleavage of Immunoglobulin Switch Regions by Nucleotide Excision Repair Nucleases in Vitro*

Immunoglobulin (Ig) heavy chain class switch recombination (CSR) mediates isotype switching during B cell development. CSR occurs between switch (S) regions that precede each Ig heavy chain constant region gene. Various studies have demonstrated that transcription plays an essential role in CSR in vivo. In this study, we show that in vitro transcription of S regions in their physiological orientation induces the formation of stable R loops. Furthermore, we show that the nucleotide excision repair nucleases XPF-ERCC1 and XPG can cleave the R loops formed in the S regions. Based on these findings, we propose that CSR is initiated via a mechanism that involves transcription-dependent S region cleavage by DNA structure-specific endonucleases that function in general DNA repair processes. Such a mechanism also may underlie transcription-dependent mutagenic processes such as somatic hypermutation, and contribute to genomic instability in general.


Immunoglobulin (Ig) molecules comprise N-terminal variable regions and C-terminal constant regions. The Ig variable regions are encoded by component V, (D), and J gene segments that are assembled into a complete variable region coding exon during precursor B cell differentiation by V(D)J recombination (for review, see Ref. 1). The murine Ig heavy chain locus contains 8 different constant region (CH) genes with the following organization: 5′-V(D)J-Cμ-Cδ-Cγ3-Cγ1-Cγ2b-Cγ2a-Cε-Cα-3′. Differentiating B lymphocytes first produce Ig heavy chains in association with Ig light chains as an IgM surface receptor. Upon activation, mature B cells can join the antigen-specific V(D)J gene to a different downstream effector CH gene by a recombination/deletion process termed class switch recombination (CSR). The CSR process results in switching from the expression of IgM to other classes of Ig molecules such as IgG, IgE, and IgA (for reviews, see Refs. 2 and 3).
CSR is a process that is clearly distinct from the site-specific V(D)J recombination reaction. CSR occurs in regions composed of tandem repetitive sequences termed switch (S) regions that are located upstream of each CH gene that undergoes CSR. S regions range from 1 to 10 kilobases in length. Most CSR sites fall within or around the repetitive S region sequences (4, 5). Despite intensive investigation, the mechanism of CSR has remained elusive, although a wide range of models have been proposed (reviewed in Refs. 1-3). One significant clue, however, is the dependence of CSR on transcription. Germline CH genes are organized into transcription units in which transcription initiates from a promoter 5′ of an exon termed the I exon, runs through the S region, and undergoes polyadenylation downstream of the CH exon. CSR is always preceded by the activation of I promoters (germline transcription) (6-11). Correspondingly, inactivation of I promoters by targeted mutagenesis inhibits switch recombination (12-14).
The precise role of germline CH gene transcription in CSR is still unknown. However, as the transcript does not contain long open reading frames, its function is likely mediated at the RNA level. Some in vitro studies have yielded evidence consistent with this hypothesis. It was found that transcribing S regions in vitro results in the association of S region-containing transcripts with the template DNA as an RNA-DNA hybrid (15-17). The structure of this RNA-DNA complex is not known. Various models have been proposed, including triplex, G quartette, and collapsed R loop structures. In the context of an R loop structure, the switch transcript would hybridize with the template DNA strand, and the non-template strand would exist as single-stranded DNA. Theoretically, the R loop structure could mediate recombination by serving as a substrate for structure-specific endonucleases.
In the context of an R loop, the junction of the single-stranded, non-template strand with the flanking duplex DNA could be cleaved by two nucleases that recognize duplex-single strand junctions: XPF-ERCC1 and XPG. The XPF-ERCC1 and XPG nucleases were identified on the basis of their roles in nucleotide excision repair (for review, see Ref. 18). Both can cleave bubble structures at the duplex-single strand junctions (19-27). However, these two nucleases recognize different sides of the loop-duplex junction. XPF-ERCC1 cleaves the 5′ side of the loop-duplex junction, while XPG cleaves the 3′ side. While it is unknown whether either of these nucleases can cleave an R loop, the potential strand breaks generated by such activity could serve to initiate subsequent recombination in S regions.
To test the potential role of transcription-dependent R loop structures in CSR, we carried out the following experiments. First, we transcribed S regions in vitro and used nuclease P1 to probe single-stranded regions in the resulting RNA-template complex. The P1 digest pattern is consistent with that of an R loop. In addition, we showed that both the XPF-ERCC1 and the XPG proteins can cleave the single-stranded regions in the transcribed S region, clearly demonstrating that these structures can be recognized by generally expressed repair enzymes. Based on these results, we propose a novel model for CSR.

EXPERIMENTAL PROCEDURES
Plasmids and Probes-The in vitro transcription vector (pT7) is derived from pET-15b (Novagen). pT7-Sμ(S) and Sμ(A) were generated by cloning a 3.7-kilobase HindIII fragment containing most of the repetitive DNA of Sμ into pT7 in either orientation. pT7-Sγ2b(S) and Sγ2b(A) were generated by cloning a 3.7-kilobase HincII/HindIII fragment containing most of the 49-base pair repeat of Sγ2b into pT7 in either orientation. pT7-lacZ(S) and -(A) were generated by cloning a 3.3-kilobase lacZ fragment into pT7 in either orientation. The plasmids were linearized with restriction enzymes immediately 3′ of the switch regions or the lacZ sequence for in vitro transcription. The probes used in Southern analysis were 50-base oligonucleotides that hybridize to the 5′ end of the non-template strand or the 3′ end of the template strand, respectively. The probes were labeled with [γ-32P]ATP using T4 polynucleotide kinase.
R Loop Substrate-The R loop substrate was constructed by annealing two 60-base oligonucleotides and an in vitro transcribed RNA. The sequences of the oligonucleotides are: non-template strand, 5′-GCGATCCAGAGGTTCACCTGTTTTTTTTTTTTTTTTTTTTCCGTTGACCACGTGATTGGC-3′; template strand, 5′-GCCAATCACGTGGTCAACGGCTCACATTCCCACCATCCCCCAGGTGAACCTCTGGATCGC-3′. The sequence of the in vitro transcribed RNA is: 5′-GGGGGGGAGCTGGGGATGGTGGGAATGTGAGGGACCAGTCCTAGCAGCTATCCTCGA-3′. The oligonucleotides were labeled at the 5′ end with [γ-32P]ATP using T4 polynucleotide kinase. The oligonucleotides and the RNA were mixed in 40 mM Tris acetate (pH 8.0), 10 mM MgCl2. The mixture was heated to 95°C for 5 min, and cooled to 40°C over a period of 1 h.
Fig. 1 (legend, partial). The strand detected by the strand-specific probe in the autoradiograph is marked by an asterisk (*), and denoted by NT (non-template strand) or T (template strand). Lane M is end-labeled HindIII marker. The amount of P1 nuclease in each reaction is shown at the top of the autoradiograph. Panels A and B show that transcription induces R loop formation in Sμ and Sγ2b, respectively. Panels C and D show that RNase H can eliminate R loops in Sμ and Sγ2b, respectively.
[...] 30 min, and stopped by the addition of EDTA to 25 mM final concentration. The DNA and RNA were purified by phenol:chloroform extraction and ethanol precipitation.
Nuclease P1 Digestion-The reaction was carried out in 50 mM sodium acetate (pH 5.5), 200 mM NaCl, 1 mM ZnSO4, 25 ng of DNA. The amount of P1 used ranged from 10 to 10^5 pg per 20-μl reaction, as indicated in the figures. The reaction was incubated at 37°C for 15 min and stopped by the addition of 2 μl of 0.5 M Tris-Cl (pH 8.5), 50 mM EDTA. 2 μg of RNase A was added to each reaction, and the reaction was incubated for 15 min. The DNA was purified by phenol:chloroform extraction and ethanol precipitation. Denaturation was achieved by incubating the DNA at 50°C for 1 h in 50% dimethyl sulfoxide, 1 M glyoxal, 10 mM sodium phosphate (pH 7.0), 0.5 mM EDTA, 1 μg of sonicated salmon sperm DNA. The denatured DNA was electrophoresed in 1% agarose gels containing 10 mM sodium phosphate (pH 7.0), 0.5 mM EDTA. After electrophoresis, the DNA was transferred to a nylon membrane, and detected by Southern hybridization. The P1 digest pattern of the model R loop substrate was analyzed by electrophoresing the cleavage products on a 10% denaturing polyacrylamide gel in 90 mM Tris borate (pH 8.0), 2 mM EDTA.
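As a rough illustration of the P1 titration described above, the following Python sketch lists a dilution series spanning 10 to 10^5 pg per 20-μl reaction, together with the corresponding enzyme concentrations. The ten-fold step between reactions and the function name `p1_series` are assumptions made for illustration; the text specifies only the range and the reaction volume.

```python
def p1_series(low_pg=10, high_pg=10**5, volume_ul=20, step=10):
    """Return (pg per reaction, pg per microliter) pairs for a P1 titration.

    low_pg, high_pg and volume_ul follow the range stated in the text;
    the ten-fold step is an assumption, not a value given in the source.
    """
    series = []
    amount = low_pg
    while amount <= high_pg:
        series.append((amount, amount / volume_ul))
        amount *= step
    return series


if __name__ == "__main__":
    for pg, conc in p1_series():
        print(f"{pg:>7} pg per reaction = {conc:g} pg/ul")
```

Under these assumptions the series covers a 10^4-fold range of enzyme in five reactions.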
XPF-ERCC1 and XPG Proteins-Human ERCC1 (hERCC1) was cloned by reverse transcriptase-polymerase chain reaction from HeLa cell cDNA. Human XPF (hXPF) was cloned by polymerase chain reaction from a baculovirus expressing hXPF (kindly provided by Drs. Sancar and Bessho). A 6× histidine tag was introduced at the C terminus of hXPF during the polymerase chain reaction. Mouse XPG (mXPG) was cloned by reverse transcriptase-polymerase chain reaction from lipopolysaccharide-stimulated spleen cDNA. A 6× histidine tag was introduced at the C terminus of mXPG. The cDNAs for hXPF and hERCC1 were cloned into pVL941 vector (PharMingen), and recombinant baculoviruses were generated using the BaculoGold Kit. The cDNA for mXPG was cloned into pJVP10Z vector, and recombinant baculoviruses were generated by co-transfection with wild-type baculovirus DNA. The recombinant virus was purified according to procedures provided by the manufacturer (PharMingen).
XPF-ERCC1 was purified from Sf9 cells co-infected with the two recombinant viruses. Cells were harvested 48 h after infection. All the purification steps were carried out at 4°C. The infected cells were washed once with phosphate-buffered saline. The cells were resuspended in lysis buffer (50 mM sodium phosphate (pH 7.0), 500 mM NaCl) plus mixtures of protease inhibitors. The cells were lysed by sonication. The lysate was centrifuged at 20,000 rpm in a SW50 rotor for 30 min. The supernatant was loaded onto a Ni2+ affinity column equilibrated in lysis buffer. The column was washed with lysis buffer. Elution was achieved with a gradient of 0-250 mM imidazole in lysis buffer. Fractions containing XPF-ERCC1 were pooled and dialyzed overnight against 20 mM Tris-Cl (pH 8.0), 100 mM NaCl. The dialyzed sample was centrifuged at 8,000 rpm in a HB-6 rotor for 15 min. The supernatant was loaded onto an S-Sepharose column equilibrated with 20 mM Tris-Cl, 100 mM NaCl. The proteins were eluted with a NaCl gradient from 100 to 500 mM in 20 mM Tris-Cl (pH 8.0). Fractions containing XPF-ERCC1 were pooled and dialyzed against 50 mM Tris-Cl (pH 8.0), 0.5 mM DTT, 20% glycerol. Protein concentration was determined by the Bradford assay using bovine serum albumin as a standard. The purity of the protein was checked by 10% SDS-PAGE.
For purification of XPG, infection and lysis were carried out the same way as with XPF-ERCC1. The cleared lysate was loaded onto a Ni2+ affinity column equilibrated with the lysis buffer. The column was washed sequentially with lysis buffer and lysis buffer plus 25 mM imidazole. Proteins were eluted with an imidazole gradient from 25 to 225 mM in lysis buffer. Fractions containing XPG were pooled and loaded onto a hydroxyapatite column equilibrated in the lysis buffer. The column was washed sequentially with lysis buffer and 20 mM Tris-Cl (pH 8.0), 100 mM NaCl, 0.5 mM EDTA. Proteins were eluted with a gradient of potassium phosphate (pH 7.0) from 0 to 500 mM. The peak fractions containing XPG were pooled and dialyzed against 50 mM Tris-Cl (pH 8.0), 0.5 mM DTT, 20% glycerol. Protein concentration was determined by the Bradford assay using bovine serum albumin as a standard. The purity of the protein was checked by 8% SDS-PAGE.
Cleavage by XPF-ERCC1 and XPG Proteins-The reaction was carried out in 50 mM Tris-Cl (pH 8.0), 5 mM MgCl2, 1 mM DTT, 0.1 mg/ml bovine serum albumin, 13 ng of DNA, 80 ng of XPF-ERCC1 or 45 ng of XPG or both in a volume of 20 μl. For comparison between untranscribed and transcribed switch regions, in vitro transcribed RNA was added to the untranscribed DNA. For this purpose, a separate in vitro transcription reaction of the same template was set up. After 15 min of in vitro transcription, the template was degraded by the addition of DNase I. After 15 min of DNase I digestion, the DNase I was eliminated by the addition of SDS to 0.5% and 20 μg of proteinase K. The reaction was incubated for 30 min, and the RNA was purified by phenol:chloroform extraction and ethanol precipitation. The RNA was mixed with the untranscribed DNA. The amount of RNA was equivalent to that generated in a transcription reaction containing 13 ng of DNA template. For cleavage of the model R loop substrate, 16 ng of XPF-ERCC1 protein or 9 ng of XPG protein or both were added per reaction. The cleavage reaction was incubated at 37°C for 1 h, and was stopped by the addition of EDTA to 25 mM final concentration. 2 μg of RNase A was added, and the incubation was continued at 37°C for 15 min. Then, SDS was added to 0.5% final concentration, and 20 μg of proteinase K was added. The digest was incubated at 37°C for 30 min. The DNA was purified by phenol:chloroform extraction and ethanol precipitation. The DNA was analyzed the same way as in P1 digest reactions.

RESULTS
Transcription Generates R Loops in the S Region-To determine whether transcription induces R loop formation in S regions, we generated constructs that allow the transcription of S regions (Sμ or Sγ2b) in vitro by T7 RNA polymerase. We chose to test Sμ and Sγ2b because they represent the two different types of S region families. Sμ, Sα, and Sε are characterized by pentameric repeats, while Sγ1, Sγ2a, Sγ2b, and Sγ3 are composed of 49-base pair repeats (28-37). If the same result is obtained with both types of S regions, it may reflect a property of S regions in general. The plasmids were linearized immediately after the S regions, and transcribed with T7 RNA polymerase. After transcription, the DNA was treated with nuclease P1, which specifically cleaves single-stranded DNA (Fig. 1). To better compare P1 sensitivity under different conditions, we varied the concentration of P1 over a 10^4-fold range. The digested DNA was denatured by glyoxal, electrophoresed on an agarose gel, and transferred to a nylon membrane. The P1 digest pattern on either strand of the template was then revealed by hybridizing with an oligonucleotide complementary to each DNA strand.
The untranscribed Sμ DNA template was completely resistant to P1, with no cleavage detectable even at the highest P1 concentration tested (Fig. 1A). In contrast, the non-template strand of Sμ was highly sensitive to P1 after transcription (Fig. 1A). Thus, the relative proportion of full-length DNA decreased with increasing concentration of P1, with the majority of full-length DNA being degraded in the presence of 10^3 pg of P1 per reaction. The template strand of Sμ was also cleaved by P1 in a transcription-dependent manner (Fig. 1A). However, cleavage was observed only at much higher concentrations of P1 (from 10^3 to 10^5 pg of P1/reaction), and the extent of degradation was less than that of the non-template strand. This difference indicates that the template strand contains fewer single-stranded regions than the non-template strand. Essentially identical results were obtained with the Sγ2b template (Fig. 1B).
The location of P1 cleavage sites can be determined by the size of the cleavage product. As the probe employed hybridizes to the 5′ end of the non-template strand, the size of the cleavage product corresponds to the distance of the cleavage site from the 5′ end of the strand (Fig. 1, A and B). Based on this, cleavage occurred exclusively within the switch region. At the lowest P1 concentration (10 pg/reaction), the cleavage sites were distributed relatively evenly in the S region, suggesting that single-stranded regions exist throughout the transcribed Sμ. When the concentration of P1 was increased (from 10^2 to 10^5 pg/reaction), the sizes of the cleavage products shifted toward 5.7 kilobases, the location of the transcription initiation site. The digest pattern remained constant from P1 concentrations of 10^3 to 10^5 pg/reaction, suggesting that all the single-stranded DNA had been degraded in this concentration range and that the cleavage sites correspond to the 5′ boundary of the single-stranded regions. Thus, the single-stranded regions started mostly in the 5′ region of Sμ, close to the transcription initiation site. Similar to the non-template strand, cleavage of the template strand occurred exclusively in the switch region (Fig. 1A). Again, the same results were observed with the Sγ2b template (Fig. 1B).
The large extent of single-stranded regions on the non-template strand is consistent with an R loop structure, which should be eliminated by RNase H degradation of the RNA hybridized to the template strand. To test this, we treated the transcribed switch regions with RNase H before P1 digest. Consistent with the expectation, RNase H eliminated most of the P1-sensitive regions in Sμ (Fig. 1C). In these experiments, a small amount of single-stranded DNA did remain. One explanation for the latter finding is that some of the single-stranded DNA on the non-template strand forms stable secondary structures, and cannot reanneal with the template strand after the RNA is eliminated by RNase H. If this explanation is true, then similar patterns of single-stranded DNA should exist on the template strand. This was indeed observed on the template strand (Fig. 1C). For the Sγ2b template, RNase H treatment completely eliminated single-stranded regions from either strand (Fig. 1D).
In the experiments described above, the existence of RNA-DNA hybrids and the large amount of single-stranded DNA on the non-template strand are consistent with an R loop structure. However, the detection of short single-stranded regions on the template strand was unexpected, since the template strand should exist as an RNA-DNA duplex and be resistant to P1. One possibility is that the P1 sensitivity is due to the distortion of duplex structure at the junction of the putative R loop with flanking DNA. We tested this possibility by analyzing the P1 digest pattern of a model R loop (Fig. 2). The model R loop was constructed from oligonucleotides and an in vitro transcribed RNA. It consists of 20-base pair DNA duplexes on both sides and an R loop of 20 bases in the middle. The RNA forms a 20-base pair RNA-DNA hybrid on one oligonucleotide, with single-stranded extensions of 11 and 26 bases at the 5′ and 3′ end, respectively. The single-stranded extensions are designed to mimic the R loop formed during transcription, since the transcript may not hybridize with the template strand throughout its entire length. The oligonucleotide containing the single-stranded loop represents the non-template strand, while the oligonucleotide hybridized to the RNA represents the template strand. The R loop was labeled at the 5′ end on either DNA strand. Therefore, the size of the cleavage product corresponds to the distance of the cleavage site from the 5′ end of the strand.
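The geometry of the model R loop can be checked directly from the oligonucleotide sequences listed under "Experimental Procedures." The Python sketch below is an illustrative consistency check rather than part of the original analysis: the helper names are hypothetical, the poly(T) loop is written as 20 T residues to match the stated loop length, and the RNA is kept in the DNA alphabet (T rather than U) as it is printed in the sequence listing.

```python
# Complement table for the DNA alphabet; the RNA is printed with T in the
# sequence listing, so no U handling is needed for this check.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA-alphabet sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences from the R Loop Substrate section, split into 20-base blocks.
NON_TEMPLATE = "GCGATCCAGAGGTTCACCTG" + "T" * 20 + "CCGTTGACCACGTGATTGGC"
TEMPLATE = "GCCAATCACGTGGTCAACGG" + "CTCACATTCCCACCATCCCC" + "CAGGTGAACCTCTGGATCGC"
RNA = "GGGGGGGAGCT" + "GGGGATGGTGGGAATGTGAG" + "GGACCAGTCCTAGCAGCTATCCTCGA"


def describe_r_loop(non_template, template, rna, arm=20):
    """Summarize duplex arms, loop length, and RNA extensions (hypothetical helper)."""
    # Each flanking arm of the non-template strand must pair with the
    # opposite end of the template strand.
    left_ok = revcomp(non_template[:arm]) == template[-arm:]
    right_ok = revcomp(non_template[-arm:]) == template[:arm]
    # The RNA should pair with the central block of the template strand.
    hybrid_site = revcomp(template[arm:-arm])
    start = rna.find(hybrid_site)
    found = start >= 0
    return {
        "left_duplex": left_ok,
        "right_duplex": right_ok,
        "loop_length": len(non_template) - 2 * arm,
        "hybrid_length": len(hybrid_site) if found else 0,
        "rna_5prime_extension": start if found else None,
        "rna_3prime_extension": len(rna) - start - len(hybrid_site) if found else None,
    }


if __name__ == "__main__":
    for key, value in describe_r_loop(NON_TEMPLATE, TEMPLATE, RNA).items():
        print(f"{key}: {value}")
```

With these sequences, the check reproduces the stated design: two 20-base pair duplex arms, a 20-base loop, a 20-base pair RNA-DNA hybrid, and RNA extensions of 11 and 26 bases.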
We digested the model R loop with P1, and electrophoresed the product on a denaturing polyacrylamide gel. We found that the non-template strand was highly sensitive to P1 (Fig. 2). At low P1 concentration (100 pg/reaction), the cleavage sites are distributed throughout the loop from bases 21 to 38. The bases at the R loop-duplex junction are cleaved inefficiently at this low P1 concentration, probably due to steric hindrance. With increasing P1 concentration (10^4 to 10^5 pg/reaction), the cleavage sites shift toward the 5′ end of the loop, and extend into the flanking duplex regions. This result suggests that the duplex flanking the R loop is distorted, and shows some single-stranded character. Similarly, the R loop-duplex junction on the template strand is also sensitive to P1 digest, but the extent of digestion of this strand was substantially less than that of the non-template strand (Fig. 2). Thus, the P1 digest pattern of the model R loop is similar to that of the transcribed switch regions, and suggests that the RNA-DNA complex in the transcribed switch region is an R loop.
Previous studies showed that RNA-DNA hybrids are formed only when the S regions are transcribed in the physiological orientation (15-17). To compare R loop formation between different templates in our system, we did P1 analysis on either S region or a fragment of the lacZ gene transcribed in either the physiological or reverse orientation (Fig. 3, A, B, and C). S regions transcribed in the reverse orientation (Sμ(A) and Sγ2b(A)) were significantly more resistant to P1 digest when compared with S regions transcribed in the physiological orientation (Sμ(S) and Sγ2b(S)). Even at the highest P1 concentration used, most of the template DNA remained intact (Fig. 3, A and B). Similarly, the transcribed lacZ DNA, in either orientation, was significantly more resistant to P1 digests than the switch regions (Fig. 3C). These data show that stable R loop formation is most efficient in S regions transcribed in the physiological orientation.
XPF-ERCC1 and XPG Cleave Transcribed Switch Regions-To test whether XPF-ERCC1 and XPG can cleave the R loops formed in transcribed S regions, we employed proteins purified from recombinant baculoviruses that overproduce XPF-ERCC1 or XPG (Fig. 4A). We first tested the activity of the recombinant proteins with the model R loop as a substrate (Fig. 4B). XPF-ERCC1 cleaves the 5′ R loop-duplex junctions in both strands at positions a few bases into the duplex region, consistent with the known activities of the enzyme (21, 26, 27). The efficiency of cleavage on the template strand is similar to that on the non-template strand. This result suggests that XPF-ERCC1 activity requires only short single-stranded regions. XPG cleaves the 3′ R loop-duplex junction on the non-template strand, but did not cleave the template strand. Thus, XPG may require longer single-stranded regions for function. Alternatively, the 3′ junction may have less single-stranded character due to sequence effects. In P1 digests of the model R loop, the 3′ junction on the template strand is indeed less sensitive to P1 than the 5′ junction (Fig. 2). Addition of both XPF-ERCC1 and XPG to the reaction resulted in a slight inhibition of cleavage for both proteins on the non-template strand. This inhibition may result from the small size of the R loop, which cannot accommodate both nucleases. On the template strand, XPG induces a new minor cleavage site, probably due to structural changes of the R loop induced by the binding of XPG. Thus, our recombinant XPF-ERCC1 and XPG proteins show the expected activities, and are capable of cleaving an R loop substrate.
We then went on to test whether these two nucleases can cleave transcribed S regions (Fig. 5, A and B). The S regions were transcribed in either orientation and incubated with the two nucleases. We found that both XPF-ERCC1 and XPG can cleave Sμ and Sγ2b sequences transcribed in their physiological (S) orientation. XPG cleaves more efficiently than XPF-ERCC1. The addition of both nucleases to the reaction resulted in slightly more cleavage than either nuclease alone, although it appears that most of the cleavage was mediated by XPG. The cleavage sites were scattered throughout the switch regions, but no cleavage was observed outside of the switch regions. In addition, the cleavage was significantly more efficient on the non-template strand than on the template strand. Either type of S region transcribed in the reverse orientation (A) was also cleaved, but the extent of cleavage was significantly lower compared with the S regions transcribed in their physiological orientation. Similarly, only very low levels of cleavage were observed when the lacZ fragment transcribed in either orientation was incubated with either or both of the nucleases (Fig. 5C).
One unexpected aspect of the cleavage pattern is that both XPF-ERCC1 and XPG cleave throughout the switch region. Since XPF-ERCC1 cleaves the 5′ end of the R loop-duplex junction, it was expected that the cleavage sites of XPF-ERCC1 would be concentrated toward the 5′ end of the non-template strand. Similarly, the cleavage sites of XPG were expected to be biased toward the 3′ end of the non-template strand. The lack of the expected bias could in part be attributed to the heterogeneous positions of the R loops, which can be formed anywhere on the transcribed S regions. Another factor may be the existence of secondary structures in the single-stranded non-template strand. Duplex-single strand DNA junctions in the secondary structure could also serve as substrates for these enzymes, and provide additional cleavage sites inside the single-stranded loop region (22, 25).
Since CSR is dependent on transcription, we compared the cleavage of transcribed and untranscribed switch regions. In initial experiments, we found that XPF-ERCC1 can nonspecifically degrade untranscribed DNA (data not shown). This phenomenon could be attributed to the nonspecific nicking activity of this nuclease observed in other studies (26,38). We further found that addition of RNA can inhibit the nonspecific nuclease activity (data not shown). This inhibitory effect of RNA explains why XPF-ERCC1 does not nonspecifically degrade transcribed DNA (Fig. 5, A, B, and C), since the DNA tested was mixed with large amounts of in vitro transcripts. Therefore, to protect untranscribed DNA against nonspecific degradation, we mixed the DNA with in vitro transcripts from the S region before the addition of XPF-ERCC1; the amount of RNA added was equal to that present in a transcribed DNA preparation. This procedure allowed the comparison of the structure-specific nuclease activity of XPF-ERCC1 and XPG against transcribed and untranscribed DNA without interference from nonspecific degradation. Under these conditions, the untranscribed switch regions were completely resistant to XPF-ERCC1 and XPG, in contrast to their efficient cleavage after transcription (Fig. 5D). These data show that XPF-ERCC1 and XPG can specifically cleave switch regions in a transcription-dependent manner.

DISCUSSION
In this study, we show that transcription induces R loop formation in S regions. Furthermore, we demonstrate that these R loops can be cleaved by the XPF-ERCC1 and XPG nucleotide excision repair proteins. Based on these results and several known factors directly implicated or strongly correlated with CSR in vivo, we propose two related, not mutually exclusive, general models for CSR (Fig. 6, A and B). A key feature of these models is that the initiation of CSR results from the formation of S region R loops generated via germline transcription and that the process is enzymatically initiated by generally expressed cellular enzymes rather than S region-specific endonucleases. A similar conclusion regarding R loop formation was reached very recently in a parallel study (39). The general models we propose could account for the roles of germline transcription (40), DNA replication (41), and non-homologous end joining recombination (42-44), all of which have been implicated in the CSR process.
In the first model, cleavages at the R loop-duplex junctions in both strands of an S region by structure-specific endonucleases, such as XPF-ERCC1 and XPG, would liberate the R loop and generate double strand breaks (DSBs) in the S region (Fig. 6A). Repair of these DSBs by non-homologous end joining would lead to either a productive CSR junction between two different S regions or to internal deletions in the S regions, depending on the choice of joining partners. In this scenario, the generation of a productive CSR product could be accompanied by the formation of extrachromosomal circles of the deleted intervening sequences. The second model invokes a role for DNA replication resulting from B cell activation (Fig. 6B). This hypothesis originates from our model system observation that endonucleases such as XPF-ERCC1 and XPG cleave more frequently on the non-template strand than on the template strand, suggesting that some transcribed S regions may contain only single strand breaks or gaps (Fig. 6B). In vivo, single strand breaks could be converted to DSBs through DNA replication. If a replication fork progressed to the single strand breaks, it would collapse into a DSB on one branch of the replication fork. Repairing the DSB by non-homologous end joining could also generate productive CSR products. The other branch of the replication fork would be linked to the RNA-DNA hybrid, and could fill in the gap. In such a CSR pathway, no extrachromosomal circles would be generated. Although extrachromosomal circles of deleted sequence have been isolated (45-47), they may not accompany all the CSR events (48, 49).
CSR is a spatially and temporally specific event. Spatially, it occurs between S regions. Temporally, it happens only in activated B cells. These specificities cannot be attributed to generally expressed repair proteins such as the XPF-ERCC1 and XPG proteins, as such proteins lack sequence specificity (Refs. 19-27 and this work) and are constitutively expressed. Based on our model, spatial specificity stems from the high propensity of S regions to form R loops. The high purine content of S transcripts is most likely the cause, since purine-rich RNA forms stable RNA-DNA hybrids (50). Temporal specificity may be attributed to several processes associated with B cell activation. The first is germline S region transcription, which is required to generate R loops. The second is a high rate of cell proliferation, which via DNA replication may be necessary to convert single strand breaks into DSBs. In addition, cell proliferation is associated with a substantial increase in polyamine synthesis (51). In our in vitro transcription reactions, spermidine significantly enhanced R loop formation, potentially via charge neutralization. A third contributing factor may be the B cell activation-induced expression of particular DNA repair proteins required either for initiation or completion of the reaction during B cell activation (e.g. Ref. 52). The combination of these factors may effectively limit CSR to activated B cells.
Although R loop formation is most efficient with S region transcripts, we observed some R loop formation with the non-purine-rich sequences (antisense S region transcripts and lacZ sequence transcripts). Correspondingly, we also observed a low level of cleavage of these DNAs by XPF-ERCC1 and XPG. In vivo, RNase H and helicases may suppress these low levels of nonspecific R loop formation and cleavage by endonucleases. Studies in Escherichia coli have implicated RNase HI and the RecG helicase in this function, since mutations in these genes induce origin-independent replication, presumably due to initiation of replication by RNA-DNA hybrids formed during transcription (53). The mechanism that we have proposed may have additional implications. Under special conditions, low levels of R loops may form in non-switch region sequences and persist long enough to have mutagenic effects. A theoretical example of such a phenomenon would be the somatic hypermutation that occurs in Ig variable region genes during B cell activation and which is dependent on transcription (54) [...] breakpoints also frequently display mutations (41). Thus, CSR and somatic hypermutation may share similar mechanisms with respect to the initial lesion induction and the subsequent repair process. Finally, transcription-induced aberrant DNA structure formation and cleavage by structure-specific nucleases may be a general mechanism that underlies chromosomal instability.
CSR often has been postulated to involve a putative switch recombinase that is induced in B cells and which specifically recognizes S regions (see Refs. 1-3). However, it has also been proposed that CSR could result from the action of more general factors targeted via S region transcription (for discussions, see Refs. 1-3). In the current study, we have developed a model system to test the feasibility of the latter hypothesis. In particular, we have shown that in vitro S region transcription generates stable R loops in a strand-specific fashion and that these R loops can be specifically recognized by XPF-ERCC1 and XPG. We emphasize that our current studies were designed to test the general feasibility of our model. We picked XPF-ERCC1 and XPG based on their potential for R loop recognition. However, our findings do not prove that these enzymes actually perform this role in CSR in vivo. In this regard, no major phenotype with respect to CSR in ERCC1- or XPG-deficient mice has been reported (55-57). However, these studies do not rule out a potential role for these proteins, given the potential redundancy indicated by our finding that either nuclease alone can achieve significant cleavage of R loop structures in vitro. On the other hand, it is conceivable that other repair enzymes might also serve the roles of R loop cleavage that we have modeled with XPF-ERCC1 and XPG. In any case, our in vitro model study may provide a new view of an intriguing process that has proven to be relatively refractory to prior in vivo mechanistic analyses.