The Integrase of the Long Terminal Repeat-Retrotransposon Tf1 Has a Chromodomain That Modulates Integrase Activities*

Chromodomains in a variety of proteins mediate the formation of heterochromatin by interacting directly with histone H3, DNA, or RNA. A diverse family of long terminal repeat (LTR)-retrotransposons possesses chromodomains in their integrases (IN), suggesting that the chromodomains may control integration. The LTR-retrotransposon Tf1 of Schizosaccharomyces pombe is highly active and possesses a chromodomain in the COOH terminus of its IN. To test this chromodomain for a role in integration, recombinant INs with and without the chromodomain were assayed for activity in in vitro reactions. The full-length IN had integration activity with oligonucleotide substrates that modeled both the insertion reaction and a reverse reaction known as disintegration. The INs of retroviruses possess an additional activity termed 3′ processing that must remove 2-3 nucleotides from the 3′ ends of the viral cDNA before insertion can occur. These additional nucleotides are added during reverse transcription because of the position of the minus strand primer downstream of the LTR. The position of the primer for Tf1 suggests no nucleotides are added 3′ of the LTR. It was therefore surprising that Tf1 IN was capable of 3′ cleavage. The most unexpected result reported here was that the IN lacking the chromodomain had significantly higher activity and substantially reduced substrate specificity. These results reveal that both the activity and specificity of enzymes can be modulated by their chromodomains.

Post-translational modifications of histones, such as the acetylation or methylation of lysines, establish the accessibility of DNA to the transcription machinery (1-3). Conserved domains that recognize these specific modifications exist in a variety of factors that interact with the histones and propagate the structures of chromatin (4-6). A large family of proteins that regulate chromatin structure bind specifically to histone H3 with methylated lysines using a conserved module called a chromodomain (6,7). Although all the chromodomains studied thus far function to regulate chromatin structure, sequence analyses revealed large families of long terminal repeat (LTR)³-retrotransposons with chromodomains in the COOH termini of their INs (8,9). This is intriguing because it suggests that the transposons acquired chromodomains to regulate integration.
LTR-retrotransposons are closely related to retroviruses. Both types of elements encode Gag, protease, reverse transcriptase, and IN proteins and both depend on IN to insert their cDNA into the genome of a host cell. The LTRs, positioned at the ends of the cDNA, are subdivided into an upstream segment, U3, a central region, R, and a downstream segment, U5. A large number of biochemical studies with purified IN have characterized the chemistry of insertion (10-12). The INs of retroviruses have two activities that are necessary for the integration of the viral cDNA. At first, 2 or 3 nucleotides are removed from the 3′ ends of the LTRs. This site-specific nucleolytic cleavage or "3′ processing reaction" is required to position the "CA" dinucleotides at the 3′ ends of the cDNA. The CA dinucleotides are conserved among all retroviruses and their position at the 3′ ends of the cDNA is required for the second step of retrovirus integration, the "strand transfer reaction." During this second step, the 3′ hydroxyls of the terminal "A" act as nucleophiles in a pair of transesterification reactions that cleave the phosphodiester bonds in the target DNA and simultaneously make a covalent bond between the 3′ ends of the viral DNA and the 5′ ends of the target DNA.
In retroviruses, the minus strand primers initiate reverse transcription 2 or 3 nucleotides from the LTR and this results in additional nucleotides at the cDNA termini that IN removes during the processing reaction. In the case of most LTR-retrotransposons the minus strand primers initiate reverse transcription with no added nucleotides. As a result, these LTR-retrotransposons are thought to proceed to the strand transfer reaction without a processing step (13).
The INs of retroviruses have three protease-resistant domains and the functions of these domains have been studied extensively (12,14,15). The NH2-terminal domain contains a conserved motif similar to zinc fingers. Although this domain does bind zinc, little else is known about its function. The central domain is the catalytic core that contains highly conserved acidic residues, Asp, Asp, and Glu, that are juxtaposed on an RNase H-like fold. These three amino acids are conserved in all the INs of retroviruses, LTR-retrotransposons, and DNA transposons (16). The central domain of HIV-1 IN was shown to be sufficient for catalytic activity in in vitro assays for the reverse reaction of strand transfer known as disintegration (17). The COOH-terminal domain is less well conserved but has nonspecific binding activity to DNA. Although the structure of individual domains has been determined for several INs, no single structure yet contains all three domains (18).

* This work was supported in part by the Intramural Research Program of the National Institutes of Health, NICHD, and the National Institutes of Health Intramural AIDS Targeted Antiviral Program. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. We dedicate this article to the memory of our colleague and good friend, Dr. Kiebang Nam. The on-line version of this article (available at http://www.jbc.org) contains additional supplemental text.

Tf1 of Schizosaccharomyces pombe is one of the LTR-retrotransposons that contains a chromodomain at the COOH terminus of IN (8). Its high level of transposition activity allows Tf1 to be studied as a model for reverse transcription and integration. Expression of Tf1 from a high copy plasmid results in the formation of virus-like particles, the reverse transcription of the Tf1 mRNA, and the integration of the cDNA into intergenic sequences (19-21). A novel mechanism of self-priming initiates minus strand reverse transcription from the upstream LTR (22). After initiation, reverse transcription proceeds through the same intermediates typical of other LTR elements. The IN of Tf1 that inserts this cDNA into the genome of S. pombe contains putative zinc binding, catalytic core, and COOH-terminal domains that are necessary for integration (23). Mutation analysis indicated that its high level of activity and the tools available to study it make Tf1 uniquely suited for examining the function of the chromodomain in integration.
In this report the activities of Tf1 IN and the function of its chromodomain were studied. IN and IN without the chromodomain (CH−) were expressed in bacteria and purified. Biochemical assays for disintegration and strand transfer showed that IN possessed substantial activity. Because Tf1 cDNA was predicted to terminate with CA, it was surprising that IN also had strong 3′ processing activity. This indicates that the position of the minus strand primer relative to the LTR is not sufficient to predict the presence of processing activity. The most unexpected result was that CH− had significantly higher specific activity than IN. In addition, CH− was much less dependent on specific sequences at the termini of the donor DNA than was IN. These results reveal that the chromodomain in the Tf1 IN modulates both the level and specificity of integration.

EXPERIMENTAL PROCEDURES
Plasmid Constructions and Cell Growth-The construction of pHL2468, the plasmid for the expression of the full-length Tf1 IN, has been previously described (19). Briefly, the IN-coding DNA segment of Tf1 was amplified by PCR. The resulting fragment was flanked with SalI sites that were introduced into the BamHI site of pET15b and sequenced. The NH2 terminus of the Tf1 IN was predicted based on the amino terminus of the IN from a closely related element, Ty3 (24). The NH2 terminus of the recombinant Tf1 IN started with the sequence TDDFK- and the COOH terminus ended with the sequence -NNLNI. The resulting protein was 477 residues long and was fused at its amino terminus to a 21-residue peptide containing the His6 tag and the thrombin cleavage site. The plasmid expressing CH−, pHL2469, was generated by a similar method. The CH− coding DNA segment was amplified by PCR with the primers HL1167 and HL1168, using the Pfu Turbo thermostable DNA polymerase (Stratagene). The DNA generated was cleaved with NdeI and BamHI, cloned into the vector pET15b, and sequenced. The CH− protein had the same amino terminus as the full-length IN but ended with the sequence -RHNSE. Altogether CH− was 406 + 21 (tag) residues long. Both plasmids were introduced into the strain BL21. The bacteria were grown in LB broth supplemented with 100 μg/ml of carbenicillin. The culture temperature was 32°C and the cells were grown until they reached an A600 of about 0.6. Induction was performed with 1 mM isopropyl 1-thio-β-D-galactopyranoside for an additional period of 3 h at 32°C. The cells were harvested, and the pellets were stored at −80°C.
Purification of the Recombinant Full-length IN and CH− Proteins-All steps were performed at 4°C, unless otherwise stated.
Lysis of the Bacteria-The bacterial pellets were resuspended in 50 mM sodium phosphate buffer, pH 8.0, 0.1 M NaCl, 1 mM phenylmethylsulfonyl fluoride, and 0.5 mg/ml lysozyme. After thorough mixing, the solutions were kept for 15 min on ice followed by the addition of more NaCl to a final concentration of 0.5 M. Then, the solutions were sonicated on ice, followed by the addition of Triton X-100 to a final concentration of 0.1% (v/v), and gentle mixing. We have found that the presence of this detergent is critical to dissociate the recombinant INs from the bacterial DNA (data not shown). The insoluble material was removed by ultracentrifugation for 1 h in the Beckman SW28 rotor at 26,000 rpm and the clear yellowish supernatants were subjected to affinity purification.
Affinity Purification on Sepharose-Co2+ Columns-Sepharose-Co2+ columns (BD Talon, BD Biosciences) with bed volumes of 5 ml were equilibrated with 50 mM sodium phosphate, pH 8.0, 0.5 M NaCl, 10% (v/v) glycerol, and 0.1% Triton X-100, and then loaded with the supernatant from the ultracentrifugation. The column was then washed twice, once with 50 mM sodium phosphate, pH 7.5, 0.5 M NaCl, 10% glycerol, 0.1% Triton X-100, and the second time with the same buffer, supplemented with 20 mM imidazole. Elution was performed with the same buffer but with a final concentration of 50 mM imidazole. The fractions were all analyzed by SDS-PAGE, and the peak IN protein fractions were pooled and dialyzed against 50 mM HEPES-NaOH, pH 7.5, 10% glycerol, 10 mM MgSO4, 1 mM EDTA, 2 mM dithiothreitol, and 0.05% Triton X-100. The proteins were fully soluble after this dialysis.
Heparin-Sepharose Column Purification-Five-ml heparin-Sepharose (6 Fast Flow, Amersham Biosciences) columns were pre-washed with about 100 ml of the same buffer used for dialysis, and then loaded with the dialyzed protein solution. The same buffer was also used for extensive washing (about 500 ml) of the column after loading was completed. Finally, the proteins were eluted off the affinity resin with a similar buffer, supplemented with NaCl at a final concentration of 0.4 M. The recombinant INs emerged off the columns at sharp peaks with protein concentrations as high as 1.5 mg/ml. The proteins were stored at −80°C in small aliquots. Each aliquot was thawed only once for conducting the enzymatic assays.
Oligonucleotides-All oligonucleotides were gel purified. Their sequences and designations are presented in the supplemental material.
Substrates Used in IN Assays-The appropriate DNA oligonucleotides were 5′ end-labeled. Aliquots containing 0.4 μg of each oligonucleotide were 32P end-labeled using 20 units of T4 polynucleotide kinase (New England Biolabs) and 100 μCi of [γ-32P]ATP in a final volume of 20 μl using the buffer supplied. After an incubation of 1 h at 37°C, 1 μl of 0.25 M EDTA was added to each sample and the mixtures were treated for 5 min at 95°C to inactivate the enzyme. After adding NaCl to a final concentration of 0.1 M and increasing the volume to 40 μl, the labeled DNA was annealed to a 3-fold molar excess of the appropriate non-labeled DNA. After 5 min at 95°C, the mixtures were allowed to cool slowly to room temperature for about 2 h. The duplexes generated were then stored at −20°C and were used after freezing and thawing for up to 1 month.
Assays for IN Activities-Unless otherwise stated, assays of IN and CH− were performed in final volumes of 20 μl containing 25 mM MOPS-NaOH, pH 7.2, 4% (w/v) polyethylene glycol 8000, 10% (v/v) glycerol, 5 mM dithiothreitol, 6.25% (v/v) of freshly prepared dimethyl sulfoxide, and 1 mM MnCl2. Each reaction mixture contained between 0.2 and 1 μg of the enzymes and 1 pmol of the labeled DNA substrates. Incubations were performed for 1 h at 37°C, and reactions were stopped by adding 10 μl of formamide loading buffer (95% formamide, 20 mM EDTA, 0.5 mg/ml bromphenol blue, and 0.5 mg/ml xylene cyanole FF) to each sample. The samples were then treated for 3 min at 95°C, quickly cooled on ice and loaded on an 8 M urea, 13.5% (w/v) polyacrylamide sequencing gel in Tris borate/EDTA buffer. After electrophoresis, the wet gels were subjected to phosphorimaging, using an Amersham Biosciences Storm 840 instrument.
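The enzyme and substrate amounts given above can be put on a common molar scale. The following is a rough sketch only: the residue counts (477 + 21 and 406 + 21) come from the text, but the average residue mass of 110 Da is a generic assumption, not a measured molecular mass of the recombinant proteins, and the function name is illustrative.

```python
# Rough molar bookkeeping for one assay (illustrative sketch only).
AVG_RESIDUE_MASS_DA = 110.0  # ASSUMPTION: generic average mass per residue


def protein_pmol(mass_ug: float, n_residues: int) -> float:
    """Convert a protein mass in micrograms to picomoles."""
    molar_mass_g_per_mol = n_residues * AVG_RESIDUE_MASS_DA
    return mass_ug * 1e-6 / molar_mass_g_per_mol * 1e12


IN_RESIDUES = 477 + 21   # full-length IN plus the 21-residue tag
CH_RESIDUES = 406 + 21   # CH- plus the 21-residue tag
SUBSTRATE_PMOL = 1.0     # labeled DNA substrate per reaction

for name, n_res in (("IN", IN_RESIDUES), ("CH-", CH_RESIDUES)):
    for mass_ug in (0.2, 1.0):
        pmol = protein_pmol(mass_ug, n_res)
        ratio = pmol / SUBSTRATE_PMOL
        print(f"{name}: {mass_ug} ug is about {pmol:.1f} pmol "
              f"({ratio:.1f}-fold over substrate)")
```

Under this assumption the protein is in molar excess over the DNA substrate across the whole stated range, which is worth keeping in mind when comparing activities per microgram of IN and CH−.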
Purified recombinant HIV-1 IN (a generous gift of R. Craigie) was assayed with the specified substrates under the optimal conditions of this enzyme that differ from those suitable to the Tf1 enzymes. Reactions were each performed in 20 μl of 25 mM MOPS-NaOH, pH 7.2, 0.1 mg/ml bovine serum albumin, 10% glycerol, 10 mM β-mercaptoethanol, 0.1 M NaCl, and 7.5 mM MnCl2 using 0.25 μg of the purified IN. All other steps, including preparation of the substrates, incubation times, and analyses of the reaction products were performed as described for the Tf1 INs.

RESULTS
Recombinant IN and CH− of Tf1 Have Significant Levels of Disintegration Activity-To investigate the function of the chromodomain in Tf1 IN we studied the integration activities of the full-length IN and of a truncated IN that lacked the chromodomain (CH−). Both proteins were expressed with His6 tags in bacteria and were purified to a high degree using chromatography with Co2+-agarose (Talon, BD Biosciences) followed by chromatography with heparin-Sepharose (Fig. 1).
Under in vitro conditions INs can catalyze the reverse of DNA strand transfer in a reaction designated disintegration (10). This activity can be assayed using oligonucleotides that mimic a single ended insertion (25-27). The disintegration activity of Tf1 IN was studied first because the substrate sequence and structural requirements for this reaction are less stringent than those for the forward reactions.
The model substrate designed to measure disintegration activity is shown in Fig. 2A. The 5′ end of the target DNA was labeled with 32P. The reaction results in a concerted cleavage-ligation where the donor DNA is excised and the disrupted target is repaired. This results in the production of a 63-nt product labeled with 32P at its 5′ end. Because of its size, the 63-nt product could be readily distinguished from the initial species labeled with 32P, the 20-nt substrate. Separate substrates were designed that mimicked either the U3 or U5 ends of the Tf1 LTR. The IN and CH− proteins produced substantial amounts of the 63-nt product using either the U3 or U5 substrates (Fig. 2B). We tested whether the disintegration substrates as designed for Tf1 IN could be recognized by another IN, that of HIV-1. Interestingly, HIV-1 IN was at least as active with the U3 and U5 substrates as the two Tf1 INs (Fig. 2B, right panel). This is despite the low homology between the Tf1 and HIV-1 sequences at the ends of their donor DNAs.
To optimize the reaction conditions we measured the influence of NaCl, MnCl2, and MgCl2 on the activities of IN and CH− (Fig. 3). NaCl greatly inhibited the activities of both proteins and a strong preference for MnCl2 over MgCl2 was observed. The highest disintegration activity was obtained with 25 mM MOPS-NaOH, pH 7.2, 4% polyethylene glycol 8000, 10% glycerol, 5 mM dithiothreitol, 6.25% Me2SO, and 1 mM MnCl2. Other additives often used to enhance IN activities, such as bovine serum albumin, spermidine, Triton X-100, or Me2SO at final concentrations above 6.25%, were all found to inhibit the Tf1 INs (data not shown).
Having established the assay conditions for the Tf1 IN and the CH− enzymes, dose-dependent reactions were performed with increasing amounts of each protein. The quantitative data presented in Fig. 4 show that for the U3 substrate the activity of IN increased with protein content up to 275 ng, whereas no maximum was observed with the U5 substrate. The data with CH− revealed the surprising result that the CH− protein was substantially more active than the full-length IN. Although this was true for both the U3 and U5 substrates, it was with the U3 substrate that CH− exhibited the greatest increase in activity over that of IN.
Recombinant IN and CH− Possess Strand Transfer Activities-Strand transfer assays are a more relevant measure of IN activities because they represent the forward reaction of integration. The strand transfer activities of IN and CH− were assayed to determine whether these proteins were capable of mediating integration. The substrates for these assays were double-stranded oligonucleotides designed to mimic either the U3 or U5 ends of Tf1. Initially, we used double-stranded oligonucleotides with blunt ends that terminated with the CA dinucleotide (Fig. 5A). The oligonucleotide with the 3′ CA was labeled at its 5′ end with 32P. Strand transfer results in the insertion of the oligonucleotides into other identical molecules and as a result oligonucleotides are created that are both bigger and smaller than their original size (Fig. 5A). We found that the conditions established for the disintegration reaction are also optimal for the strand transfer activity of the Tf1 enzymes (data not shown). Fig. 5B shows that relative to protein dose, both the IN and CH− generated oligonucleotide products longer than the original U3 and U5 substrates. The lengths of these products varied because the integration into the acceptor DNA was relatively random. However, it was apparent for both IN and CH− that with the U5 substrate there were preferred positions. Interestingly, a significant fraction of the extended products were longer than twice the size of the 24-nt substrate and products as long as 70-mers were observed. This indicates that both versions of Tf1 INs were capable of three or more integration events in which products of the first round of integration served as substrates for further integration events. To our knowledge, such multiple round integrations have not been reported for other INs.
We tested whether the chromodomain played a role in strand transfer by comparing the activity of IN to CH−. To evaluate the levels of strand transfer activity we calculated the amount of the oligonucleotide species longer than 24 nt as a percentage of the total amount of the labeled oligonucleotides. Fig. 5C shows that CH− was substantially more active than IN with both the U3 and U5 ends. The elevated activity of CH− correlated well with the results of the disintegration assays. However, the difference between the strand transfer activity of CH− and IN was more pronounced than observed for disintegration. The maximal differences in activity between CH− and IN were approximately 7- and 4-fold when observed with the U3 and U5 end-derived substrates, respectively. These results indicate that the presence of the chromodo-

CH− and IN Possess 3′ End Processing Activity-It is currently thought that the role of the 3′ end processing activity of INs is solely to remove the 2 or 3 nucleotides 3′ of the CA that result from the position of the minus strand primer 2 or 3 nucleotides downstream of the first LTR. The primer for the minus strand of Tf1 is positioned adjacent to the LTR so it is predicted that no processing activity is required (28,29). However, sequences of Tf1 cDNA extracted from particles revealed the surprising result that 85% of the 3′ ends had one or more untemplated nucleotides.⁴ Because these nucleotides are added to the 3′ ends after the CA dinucleotide, they are expected to block integration. We therefore speculated that Tf1 IN would have a processing activity that would remove these nucleotides to expose the CA dinucleotides and allow integration.
We tested IN and CH− for the 3′ processing activity using the assay already described for the strand transfer activity. In this case, the 5′ end-labeled substrates contained nucleotides positioned 3′ to the critical CA (Fig. 6A). Processing activity would cause a reduction in length of the labeled substrate after the extra sequences are removed from the 3′ end.
Processing the U5 End of the LTR-The IN and CH− enzymes were tested for processing activity using a set of substrates that mimicked the U5 end of the LTR with sequences that extended past the conserved CA. Each of the oligonucleotide pairs was designed such that processing activity that removes nucleotides 3′ of the CA would produce the same 24-nt product. The 3′ end extensions were either in the form of a single-stranded overhang (Ov) beyond the CA or they were base paired in the form of a blunt end (Bl). The sequences of the extensions were chosen based on their presence at the 3′ ends of the DNAs that were isolated in vivo from particles. For example, the most prominent sequences observed after the U3 and U5 ends were "TT" and "AT," respectively.⁴ In addition, longer stretches of "T"s were observed.
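The substrate design described above, in which every extended substrate collapses to the same 24-nt product once the nucleotides 3′ of the CA are removed, can be illustrated with a small sketch. The 24-nt sequence below is hypothetical and is not the Tf1 LTR sequence (the real oligonucleotides are listed in the supplemental material); the function name and the uppercase/lowercase convention, borrowed from the figure legends, are likewise illustrative assumptions.

```python
def process_3prime_end(strand: str) -> str:
    """Trim everything 3' of the last uppercase CA, mimicking 3' processing.

    Convention: LTR sequence in uppercase, extension nucleotides in
    lowercase, so lowercase 'ca' in an extension is never matched.
    """
    idx = strand.rfind("CA")
    if idx == -1:
        raise ValueError("no conserved CA found in the labeled strand")
    return strand[: idx + 2]


# HYPOTHETICAL 24-nt LTR end terminating in ACA; not the real Tf1 sequence.
LTR_END = "GTCAGCTTAGGCATTGCAACAACA"

# Extensions named in the text: tt, at, ttt, ttttt.
for extension in ("tt", "at", "ttt", "ttttt"):
    substrate = LTR_END + extension
    product = process_3prime_end(substrate)
    print(f"{len(substrate)}-nt substrate (+{extension}) gives "
          f"{len(product)}-nt product")
```

Because the labeled strand carries its 32P at the 5′ end, only the shortened strand is visible on the gel, which is why a common 24-nt band is diagnostic of correct processing in this design.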
As indicated by the 24-nt product, both IN and CH− exhibited processing activity that specifically removed the nucleotides 3′ of the CA (Fig. 6B). The processing activity of IN produced the 24-nt sequence as the dominant product from the substrates with "tt," "at," "ttt," and "ttttt" as single-stranded extensions. In the one case of a substrate with a Bl end ("at"), IN generated the 24-nt species along with several other smaller products. These additional products are seen to a lesser extent with the other substrates and are likely due to nonspecific nuclease activity of IN. This type of nonspecific activity is observed with other INs (30). It is reported that a variety of INs possess an intrinsic DNA hydrolysis or alcoholysis activity that is responsible for nonspecific nuclease activity similar to what we observed (31).
The reactions were conducted with either the U3 or U5 substrates and increasing amounts of Tf1 IN or CH−. The proteins were diluted in the buffer used to store the proteins (50 mM HEPES-NaOH, 10% glycerol, 0.05% Me2SO, 2 mM dithiothreitol, and 0.4 M NaCl, final pH 7.2) and equal volumes of the protein samples were added to each reaction. After the reaction, products were subjected to electrophoresis and the gels were scanned with a phosphorimager.

... Fig. 2. Also included as markers was the φX174 DNA cleaved with HinfI and 5′ end-labeled. C, quantitative analyses of strand transfer activity. The radioactivity in the gel shown in B was scanned with a phosphorimager. The radioactivity in all the bands larger than 24 nt was calculated as the percentage of the total radioactivity. The activities are expressed as the percent of conversion.

For retrovirus INs, the product of the processing activity is the only substrate suitable for strand transfer. Therefore, reactions with efficient processing activity also exhibit strand transfer. The processing activity of Tf1 IN with either the "at" or "tt" extensions was high and strand transfer was detected (Fig. 6B). The processing activity with the 3- and 5-nucleotide extensions was less efficient and no strand transfer was observed. Thus, the IN of Tf1 was similar to INs of retroviruses in that processing was required for strand transfer.
The amounts of processed products accumulated by CH− were similar to those of IN (Fig. 6B). However, CH− generated substantially more products of strand transfer than IN. The levels of these strand transfer products greater than 24 nt were quantified. The activity of CH− with all substrates was significantly higher than that of IN. For example, the activity with tt-Ov was 57% and with at-Ov was 38% of that produced with the pre-processed substrate. The comparable figures for IN were 11 and 15%, respectively. Because the increase in strand transfer requires enhanced 3′ end processing, these data indicate that the 3′ end processing activity of CH− is higher than that of IN.
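The quantification used here (signal in species longer than 24 nt as a percentage of the total labeled signal, then expressed relative to the pre-processed substrate) can be sketched as follows. The band intensities are invented for illustration and are not data from this study; the function names are likewise hypothetical.

```python
def percent_conversion(bands: dict, substrate_len: int = 24) -> float:
    """Percent of total signal found in species longer than the substrate.

    `bands` maps oligonucleotide length (nt) to phosphorimager signal.
    """
    total = sum(bands.values())
    if total <= 0:
        raise ValueError("lane contains no signal")
    longer = sum(signal for length, signal in bands.items()
                 if length > substrate_len)
    return 100.0 * longer / total


def relative_to_preprocessed(conversion: float, preprocessed: float) -> float:
    """Express a conversion as a percentage of the pre-processed control."""
    if preprocessed <= 0:
        raise ValueError("control lane shows no strand transfer")
    return 100.0 * conversion / preprocessed


# INVENTED lane: lengths in nt mapped to arbitrary signal units.
example_lane = {18: 50.0, 24: 700.0, 31: 120.0, 45: 90.0, 70: 40.0}
conversion = percent_conversion(example_lane)
print(f"conversion: {conversion:.1f}% of total signal")
print(f"relative to a control at 50%: "
      f"{relative_to_preprocessed(conversion, 50.0):.1f}%")
```

Normalizing to the pre-processed control in this way makes substrates with different extensions comparable, since the control sets the ceiling for strand transfer when no processing step is needed.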
The efficient processing of the U5 substrates suggested that the Tf1 proteins specifically recognized and removed the extensions beyond the CA dinucleotide. To test whether other sequences in the substrate were recognized, IN and CH− were assayed for activity using a substrate with the U5 sequence of HIV-1 (Fig. 6B, far right). The HIV-1 substrate was 21 nt long and had the physiological GT extension 3′ of the CA dinucleotide. Processing activity would reduce the 21-nt species to 19 nt. Compared with the processing activity of HIV-1 IN with this substrate, the Tf1 proteins exhibited very little activity. This indicates that in addition to the CA dinucleotide, the Tf1 proteins recognize sequence in the U5 substrate specific for Tf1. It was also seen that HIV-1 IN did not utilize the Tf1 pre-processed substrates (data not shown).
Processing the U3 End of the LTR-Retroviral INs have processing activities that remove extensions 3′ of the CA from both the U5 and U3 ends of the cDNA. Because the Tf1 proteins were found to have processing activity on the U5 end, we tested whether they could also process extensions from the U3 end of Tf1. The results show that both CH− and IN can remove nucleotides 3′ of the CA in the U3 sequence (Fig. 7). As seen with the U5 substrates, the shorter extensions on the U3 substrates were processed more efficiently than those with longer extensions. The cleavages of the Ov versions were somewhat more efficient than the Bl counterparts.
The processing activity of CH− with 3-, 5-, and 6-nucleotide extensions was significantly greater than that of IN. However, comparing the activities of the proteins with the shorter extensions was complex because the strong strand transfer activity of CH− reduced the accumulation of the processed species. To approximate the processing activities of IN and CH−, the levels of strand transfer were quantified. As found above for the U5 substrates, CH− exhibited substantially higher levels of activity. For example, the activity with tt-Ov was 88%, with tt-Bl was 32%, and with ttt-Ov was 19% of that produced with the pre-processed substrate. The comparable figures for IN were 11%, 4%, and not detectable, respectively.
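As a convenience for comparing the two proteins, the percentages reported in the text for the U5 and U3 substrates can be turned into CH−/IN ratios. This is only a back-of-the-envelope sketch: the ratios are derived here rather than reported by the authors, the dictionary keys are informal labels, and the "not detectable" value for IN is represented as None instead of being assigned a number.

```python
from typing import Optional

# Strand transfer as % of the pre-processed control, as quoted in the text:
# (CH-, IN). None marks the value reported as not detectable.
reported = {
    "U5 tt-Ov": (57.0, 11.0),
    "U5 at-Ov": (38.0, 15.0),
    "U3 tt-Ov": (88.0, 11.0),
    "U3 tt-Bl": (32.0, 4.0),
    "U3 ttt-Ov": (19.0, None),
}


def fold_difference(ch_minus: float, full_length: Optional[float]) -> Optional[float]:
    """CH-/IN ratio, or None when the IN value was not detectable."""
    if full_length is None or full_length == 0:
        return None
    return ch_minus / full_length


for substrate, (ch_minus, full_length) in reported.items():
    fold = fold_difference(ch_minus, full_length)
    label = "undefined (IN not detectable)" if fold is None else f"{fold:.1f}-fold"
    print(f"{substrate}: CH-/IN = {label}")
```

Leaving the undetectable case undefined avoids implying a ratio that the measurements cannot support.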
Strand Transfer Activities Require Pre-processing of the Extensions 3′ to CA-The processing activities of IN and CH− were estimated by measuring the products of the subsequent strand transfer. This was done because the products of processing can be readily converted by strand transfer activity into larger oligonucleotides. However, the accumulated products of strand transfer would be an overestimate of processing if the Tf1 proteins could mediate strand transfer with substrates that retained nucleotide extensions 3′ to the CA. To test this possibility, strand transfer reactions were conducted with substrates that possessed a 2′,3′-dideoxynucleotide at their 3′ end. Unless removed by processing, this substrate would lack the necessary 3′ hydroxyl and as a result it would block strand transfer.
The experiment described in Fig. 8 tested the U3 substrate of Tf1 that had 3′ ends with the sequence "-ACA-tt(dd)c." This substrate was compared with the control substrate that ended with a "c" instead of (dd)c. Two variants of each DNA, one with a single-stranded Ov and the other with the Bl configuration, were tested. The CH− protein exhibited high levels of strand transfer with the blocked substrate ACA(t)2ddc-Ov, which were not significantly different from that with the unblocked DNA, ACA(t)2c-Ov. This indicated that the strand transfer occurred only after an efficient removal of the blocking sequence. The strand transfer activity with the Ov substrates was substantially greater than seen with the Bl versions. The full-length IN did not show significant activity with either the modified or unmodified substrates (Fig. 8).
FIGURE 6. The 3′ end processing and strand transfer activities with the U5 substrates of Tf1 and with an HIV-1 substrate. A, a schematic diagram of the 3′ processing assay shows the oligonucleotides annealed to mimic the U5 and U3 ends of Tf1 DNA. The asterisk indicates the 5′ end of the oligonucleotide with the conserved CA labeled with 32P. B, the electrophoretic analyses of the reaction products of the 3′ processing assays. Reactions were conducted with equal amounts of the specified substrates. The sequences from the 3′ end of the substrates are shown above the lanes and extensions 3′ to the conserved ACA are indicated in lowercase. The substrate with the pre-processed end (ACA) was blunt ended and was the same one used in Fig. 5B. Substrates marked with Bl are double-stranded oligonucleotides with blunt ends. Those marked Ov have 3′ overhangs corresponding to the sequence in lowercase. Other reactions were performed with a blunt ended substrate that mimics the U5 end of HIV-1 DNA with the additional two nucleotides, gt, 3′ to the conserved CA. To generate this substrate the oligonucleotide RZ132 was 5′ end-labeled and annealed to oligonucleotide RZ61. The enzymatic reactions performed with Tf1 IN, Tf1 CH−, and HIV-1 IN were marked with a +, and those without enzyme were marked with a −. The positions of the labeled oligonucleotides are indicated as 24 or 21 nt for the Tf1 or HIV-1 substrates, respectively.

The Strand Transfer Activity of Tf1 IN but Not CH− Specifically Requires That "ACA" Be Present at the 3′ End of the Substrate-The strand transfer activities of retrovirus and retrotransposon INs require that the highly conserved CA dinucleotide be present at the 3′ end of the donor substrate (30). We tested whether the IN of Tf1 had this same requirement for CA and whether the chromodomain contributed to this specificity. IN and CH− were tested for whether each of the last three nucleotides of the LTRs, ACA, was necessary for strand transfer (Fig. 9).
The substrates were based on the U3 double-stranded donor used in Figs. 7 and 8 that lacked extensions 3′ of the ACA. Thus, no 3′ processing was required. Single, double, and triple substitutions were made in the ACA to test the dependence on this triplet. All the modifications introduced in the sequence led to a dramatic reduction of the strand transfer activity of IN. In sharp contrast to IN, CH− was capable of strand transfer with most of the substituted substrates. With every substrate except the one in which all three terminal nucleotides were flipped (tgt), CH− had greater than 10% of the activity observed with the unsubstituted substrate. The most striking difference between CH− and IN was with the substrate with a −1 transition (AtA). CH− retained 50% of the activity observed with the ACA terminus, whereas the full-length IN had just 1% activity. This large difference in the specificity of the two related proteins shows that IN was far more stringent than its CH− counterpart in selecting the DNA donor with the correct 3′ end.

Although the disintegration activity may not have direct biological relevance, it is particularly useful for biochemical characterization of INs. It provides a highly sensitive assay for the cleavage-ligation activity of IN. The disintegration assay developed here for Tf1 IN was used to optimize the reaction conditions. The increased activities of Tf1 IN with low concentrations of NaCl (Fig. 3A) were similar to what was observed with retroviral INs (26,33). Because magnesium is thought to be the important divalent metal in vivo, the preference Tf1 IN had for manganese might be the result of structural defects in the recombinant protein. However, the preference for manganese could be a common property of INs, as the INs of HIV-1 and murine leukemia virus also exhibit strong preferences for manganese over magnesium in vitro (26,33). The ability of HIV-1 IN to react efficiently with the Tf1 substrate (Fig. 2) suggested that the disintegration activity of HIV-1 IN had little sequence specificity. This is consistent with data from several studies indicating that beyond the requirement for the CA, the disintegration activity of retroviral INs is not sequence specific (25, 34-36).
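The substitution series described for Fig. 9 can be made concrete with a short sketch. The Python below is only an illustration and is not part of the study's methods: it compares a variant 3′ terminus with the wild-type ACA and reports each change as a transition or a transversion. The function names are hypothetical, and positions are counted 1 to 3 from the 5′ side of the triplet, a convention chosen for this sketch that does not reproduce the paper's position numbering.

```python
# Illustrative sketch only: classify substitutions in the terminal "ACA" triplet.
# Function names and the 1-to-3 position numbering are assumptions of this sketch.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def classify_substitution(ref, alt):
    """Return 'match', 'transition', or 'transversion' for one base change."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        return "match"
    pair = {ref, alt}
    # A transition stays within purines or within pyrimidines.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def describe_terminus(variant, reference="ACA"):
    """List (position, reference base, variant base, kind) for each change."""
    if len(variant) != len(reference):
        raise ValueError("variant and reference must have the same length")
    changes = []
    for index, (ref, alt) in enumerate(zip(reference.upper(), variant.upper())):
        kind = classify_substitution(ref, alt)
        if kind != "match":
            changes.append((index + 1, ref, alt, kind))
    return changes


def is_complement(variant, reference="ACA"):
    """True if every base of the variant is the complement of the reference base."""
    return len(variant) == len(reference) and all(
        COMPLEMENT[ref] == alt
        for ref, alt in zip(reference.upper(), variant.upper())
    )


if __name__ == "__main__":
    for terminus in ("ACA", "AtA", "tgt"):
        print(terminus, describe_terminus(terminus), is_complement(terminus))
```

Under these assumptions, AtA is reported as a single C-to-T transition at the middle position, and tgt as three transversions that together form the base-by-base complement of ACA, matching the description of that substrate as having all three terminal nucleotides flipped.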

DISCUSSION
The full-length version of Tf1 IN was found to have substantial levels of strand transfer activity with both the U3 and U5 substrates. The genomes of eukaryotes accumulate large numbers of inactive retrotransposons. The strand transfer assay is a stringent test of IN function, and the activity of Tf1 IN detected here indicates that this protein was fully active. Strand transfer assays with substrates containing nucleotides 3′ to the CA revealed that the Tf1 IN also had 3′ processing activity (Figs. 6 and 7). This was observed with U3 and U5 substrates and with single- and double-stranded extensions. As is the case with the 3′ end processing activity of HIV-1 IN, extensions longer than 2 nucleotides were more readily removed when they were single-stranded than when double-stranded (37).
In one respect, the processing activity was quite unexpected. It was thought that the 3′ end processing activity of retroviruses and LTR-retrotransposons is required only in the cases where the priming of minus strand reverse transcription adds nucleotides beyond the conserved CA. For Tf1, the 3′ end of the minus strand primer is immediately adjacent to the U5 sequence of the upstream LTR (28,29). As a result, no extensions and no processing were thought to occur. However, recent analyses of Tf1 cDNA extracted from particles revealed that 85% of the U3 and U5 ends terminated with nontemplated nucleotides (footnote 4). The processing activity of IN observed here could in theory remove these 3′ extensions and allow the bulk of the cDNA to participate in integration. This model raises the question of why nontemplated nucleotides would be added only to be removed by IN. One possibility is that the nontemplated nucleotides could prevent aberrant annealing of the 3′ end to sequences in the opposite LTR and its subsequent extension. As a result, the nontemplated nucleotides would protect the completed cDNA until IN initiates 3′ processing. Another possibility is that the nontemplated nucleotides could protect the conserved CA from attack by nonspecific 3′ exonucleases.

The LTR-retrotransposon Ty3 of Saccharomyces cerevisiae does contain an IN with 3′ processing activity, and this was predicted based on the position of its minus strand primer (38). Nevertheless, most other LTR-retrotransposons such as Ty1 have minus strand primers immediately adjacent to the U5 sequence of their LTR and are not thought to have processing (39). The finding that Tf1 IN has processing activity brings into question the predicted lack of processing activity in the INs of the other LTR-retrotransposons. Approximately 25% of Ty1 cDNA terminates with nontemplated nucleotides (40). The cDNA with the nontemplated additions could participate in integration only if the IN has processing activity. However, analyses of Ty1 IN failed to detect processing activity (41,42).
The removal of nontemplated nucleotides by IN may also occur with retroviral cDNA. In vitro experiments show that HIV-1 reverse transcriptase possesses terminal transferase activity that adds nontemplated nucleotides to the 3′ end of the nascent DNA (43-45). A time course of reverse transcription during HIV-1 infection identified a prominent species of cDNA that had one extra nucleotide on the 3′ end of the plus strand (46). This led the authors to propose that the extra nucleotide was the result of nontemplated addition and that the 3′ processing activity of IN removed it.
Perhaps the most striking results presented in this report indicate a role for the chromodomain in modulating integration. Although removing 15% of IN might be expected to reduce activity substantially, the deletion of the chromodomain resulted in an intriguing increase in the disintegration and strand transfer activities. CH− also had greater 3′ processing activity than IN, assuming that the greater amounts of strand transfer products generated were all derived from processed oligonucleotides. Evidence supporting this assumption was that oligonucleotides with the blocking nucleotide (dd)c 3′ of the CA were just as efficiently converted into products of strand transfer by CH− as oligonucleotides without the (dd)c block (Fig. 8). These data indicate that the chromodomain of Tf1 IN negatively modulates the IN activities.
In addition to its role in limiting integration activity, the chromodomain had a significant impact in restricting the sequence at the 3′ end of the donor DNA that could function in strand transfer. Only when the chromodomain was present was there a strict requirement for the ACA at the 3′ end of the donor DNA (Fig. 9). The strong specificity of the intact IN is in contrast to the IN of Ty1, which in similar types of assays can tolerate many alterations in the 3′ end of the donor DNA (42). However, INs of retroviruses, such as HIV-1, lack chromodomains and yet exhibit strict requirements for the sequences at the 3′ end of the donor DNA (35, 37). Thus, the chromodomain is not the only means for INs to achieve this strict requirement for the CA at the 3′ end of donor oligonucleotides. Therefore, it appears that in the case of Tf1 IN, the chromodomain evolved independently as a means of restricting donor sequences.
Both contributions of the chromodomain, limiting IN activity and increasing sequence specificity, may result from a single mechanism that restricts access to the active site of IN. Although there are several examples of chromodomains that interact directly with histone H3, the lack of histone proteins in our assays indicates such an interaction is not required for the restrictive properties of the Tf1 chromodomain. Functional analyses of several chromodomains recently revealed an unexpected diversity of interaction targets such as DNA and RNA (7). It is possible that the chromodomain of Tf1 restricts IN activity through interactions with the donor DNA. Regardless of the mode of inhibition, there is the interesting possibility that in vivo, the inhibition is regulated. One intriguing hypothesis is that the chromodomain does interact with histone H3 and it is this interaction at the target site that relieves the inhibitory function of the chromodomain. The chromodomain of HP1 forms an organized cluster of β strands and α helices in response to its interaction with the NH2 terminus of H3 (32,47). This dramatic reorganization is consistent with a role in regulating integration.
Although chromodomains are known to exist in many INs of LTR-retrotransposons (8,9), the results reported here are the first direct effort to identify the function of the chromodomain in IN. The restrictive role of the chromodomain identified here may very well apply to the other chromodomain-containing INs. In addition to INs, chromodomains have been identified in several other enzymes. The H3 lysine 9-specific histone methyltransferase Su(var)3-9, the ATP-dependent chromatin remodeling factors CHD, the DNA methyltransferase CMT, and the histone acetylase CDY are all enzymes that contain chromodomains (7). In light of the restrictive function of the chromodomain in Tf1 IN, it will be interesting to test how the chromodomains in these other proteins affect their enzymatic activities.
The poor solubility of the recombinant INs studied thus far has been a major obstacle in producing high resolution structures. High concentrations of salt or the introduction of amino acid substitutions have been necessary to solubilize individual domains of INs. A number of structures have been determined for each of the three domains of the INs from the retroviruses HIV-1, simian immunodeficiency virus, avian sarcoma virus, and Rous sarcoma virus (18). However, the lack of a structure for an intact IN makes it difficult to model the spatial relationships between the three domains. During the purification of Tf1 IN it became clear that this IN was substantially more soluble than other integrases previously studied. It is possible that this solubility will lead to the first crystal structure of a full-length IN. This would serve as a significant leap in understanding the structures of INs in general and more specifically, of HIV-1 IN. A structure of an intact IN would substantially facilitate the design and development of new drugs against HIV IN and, thus, contribute to the ongoing fight against AIDS.