Differential assembly of Rous sarcoma virus tetrameric and octameric intasomes is regulated by the C-terminal domain and tail region of integrase

Retrovirus integrase (IN) catalyzes the concerted integration of linear viral DNA ends into chromosomes. The atomic structures of five different retrovirus IN–DNA complexes, termed intasomes, have revealed varying IN subunit compositions ranging from tetramers to octamers, dodecamers, and hexadecamers. Intasomes containing two IN-associated viral DNA ends capable of concerted integration are termed stable synaptic complexes (SSC), and those formed with a viral/target DNA substrate representing the product of strand-transfer reactions are strand-transfer complexes (STC). Here, we investigated the mechanisms associated with the assembly of the Rous sarcoma virus SSC and STC. C-terminal truncations of WT IN (286 residues) indicated a role of the last 18 residues (“tail” region) in assembly of the tetrameric and octameric SSC, physically stabilized by HIV-1 IN strand-transfer inhibitors. Fine mapping through C-terminal truncations and site-directed mutagenesis suggested that at least three residues (Asp-268–Thr-270) past the last β-strand in the C-terminal domain (CTD) are necessary for assembly of the octameric SSC. In contrast, the assembly of the octameric STC was independent of the last 18 residues of IN. Single-site substitutions in the CTD affected the assembly of the SSC, but not necessarily of the STC, suggesting that STC assembly may depend less on specific interactions of the CTD with viral DNA. Additionally, we demonstrate that trans-communication between IN dimer–DNA complexes facilitates the association of native long-terminal repeat (LTR) ends with partially defective LTR ends to produce a hybrid octameric SSC. The differential assembly of the tetrameric and octameric SSC improves our understanding of intasomes.

Reverse transcription of retrovirus RNA in virus-infected cells results in the synthesis of a linear viral DNA genome. Concerted integration of the viral DNA ends by the viral integrase (IN) into the host genome is essential for virus replication. The three-dimensional structures of five different retrovirus IN-DNA complexes assembled in vitro, collectively termed intasomes, have advanced our understanding of the integration process (1-5). Structures containing two viral DNA ends in complex with IN capable of concerted integration activity are termed stable synaptic complexes (SSC), and those containing a branched viral/target DNA substrate in complex with IN are termed strand-transfer complexes (STC). The IN subunit composition of intasomes varies from tetramers to octamers, dodecamers, and hexadecamers (6). All of the intasomes have structural similarities, including a conserved intasome core (CIC) for catalysis (6), but differ in many ways, including a variety of architectures due to their subunit composition.
X-ray crystallography of the Rous sarcoma virus (RSV) STC containing a branched viral/target DNA substrate revealed IN octamers (2). The octameric RSV STC includes two copies each of proximal and distal IN dimers that take on different functions and conformations. The catalytic proximal IN dimers form the tetrameric CIC structure with the two viral DNA ends. The proximal IN dimer consists of an inner subunit that engages the viral/target DNA junction and an outer subunit associated via the conserved catalytic core domain (CCD) and the C-terminal domain (CTD) dimer interfaces. The distal IN dimers are recruited by the tetrameric SSC for assembly of the octameric SSC (7). The proximal and distal IN dimers bind both viral and target DNA resulting in STC assembly (2).
The composition of the RSV STC containing IN octamers (2) is in contrast to the STC of the prototype foamy virus (PFV), which contains only an IN tetramer (1,8,9). The STC assembly process involves the formation of insoluble complexes in low-salt buffers in vitro. These insoluble complexes are dialyzed against higher-salt buffers to produce soluble and stabilized STCs for purification by size-exclusion chromatography (SEC) and structural analyses. This insoluble/soluble assembly method was also used for the mouse mammary tumor virus SSC containing IN octamers (3), the HIV type-1 (HIV-1) STC containing either tetramers or higher-order dodecamers (4), and the maedi-visna virus SSC containing a hexadecamer (5), which were analyzed by cryo-electron microscopy (cryo-EM). The roles of each subunit and their domains for assembly of these diverse retrovirus intasome structures are just beginning to be understood.
We developed a one-step direct assembly method to produce soluble SSC that allowed us to investigate the assembly mechanisms of the SSC containing either IN tetramers or octamers, whose molecular mass was determined by size-exclusion chromatography-multiangle light scattering (SEC-MALS) (7,10). Additional SEC-MALS and protein-protein cross-linking analyses of the soluble RSV STC further established the presence of IN octamers (2,7). Studies were initiated to determine what regions of IN are responsible for the transition of a tetrameric SSC to the octameric form. The concerted integration of viral DNA ends into target DNA catalyzed by RSV IN is highly susceptible to clinically approved HIV-1 strand-transfer inhibitors (STI), producing IC50 values very similar to inhibitory concentrations for HIV-1 IN (10). Assembly of the RSV SSC in the presence of an HIV-1 STI results in the production of tetrameric or octameric SSC, which were purified by SEC in the absence of STI in the elution buffer, demonstrating that the complex is kinetically stabilized by the inhibitor (7,10). The 5′-ends of the two viral DNA molecules within both the tetrameric and octameric SSC are in close proximity as measured by fluorescence resonance energy transfer (FRET) (7). C-terminal truncations of wild-type (WT) RSV IN(1-286) demonstrated that the last 18 residues or "tail region" (Fig. 1) influence the ability of IN to produce SSC that contain either IN tetramers or octamers. The truncated IN(1-269) ending at the last well-ordered residue (Ile-269) near the β-sheet (residues 222-267) in the CTD (Fig. 1) of the RSV STC (2) is only effective in producing an SSC containing IN tetramers (7,10). IN  and IN(1-278) are capable of producing both the tetrameric and octameric forms of the SSC, with the longer IN possessing the highest efficiency for producing the octameric SSC.
HIV-1 SSC stabilized by STIs containing either an IN tetramer or a higher-order IN oligomer have also been identified by native agarose gel electrophoresis and protein-protein cross-linking studies (11,12). The tetrameric HIV-1 SSC is also the major species observed in the absence of STI (13,14).
We have further investigated, by site-directed mutagenesis, the role of the tail region and the CTD (Fig. 1) of RSV IN, as well as the influence of the long-terminal repeat (LTR) DNA sequences, on the assembly of the SSC. The results suggest that the assembly of the octameric RSV SSC is multifactorial, including protein-protein and protein-viral DNA interactions. Selected mutations differentially affect the assembly of the tetrameric and octameric SSC. The results further suggest that two different proximal IN dimer-viral DNA complexes, one with a native LTR end and another with a partially defective end, can communicate in trans to produce the tetrameric SSC that is the precursor to the octameric form.

Temperature, time, and length of the tail region from the end of the last β-strand in the CTD affect assembly of both tetrameric and octameric SSC
The last well-ordered residue past the β-strand in the crystal structure of the RSV STC containing IN octamers is Ile-269 (Fig. 1) (2). The C-terminal IN(1-269) truncation that ends with Ile-269 is only capable of effectively producing the tetrameric SSC at either 4 or 18°C (Fig. S1) (7,10). We wanted to determine which residues immediately past Ile-269 had a major effect on the conversion of the tetrameric SSC to the octameric form (Fig. 2). We used a gain-of-function (G)U3 LTR substrate and other LTR substrates for these assembly studies (Table 1) (15). IN(1-270) was mixed with the 18R GU3 DNA substrate under standard assembly conditions in the presence of MK-2048 for varying times of incubation at 4 or 18°C and subsequently analyzed by SEC (Fig. 2, A and B). At the higher temperature and shorter times of incubation (Fig. 2A), IN(1-270) was significantly better at producing the octameric SSC than it was at the lower temperature of 4°C (Fig. 2B). The total incorporation of IN(1-270) and GU3 into either SSC appears to be nearly equivalent, as observed by the quantities of both SSC in each experiment (compare Fig. 2, A and B). The results suggest that the efficiency of converting the tetrameric SSC to the octameric form is highly dependent on temperature.
The conversion of the tetrameric SSC to the octameric form was significantly increased by one additional residue with the C-terminal IN(1-271) truncation (Fig. 2, C and D). The efficiency of producing the octameric SSC with IN(1-271) was best at 18°C and completed by 24 h (Fig. 2C) in contrast to the results observed at 4°C for the same time periods (Fig. 2D), suggesting that the transition of the tetrameric SSC to the octameric form was rapid and efficient at 18°C. Similar results compared with IN(1-271) at the higher temperature were obtained using IN    (7,10). The results suggest that the addition of only a few residues beyond the well-ordered Ile-269 modulates the ability of the C-terminal IN truncations to assemble the octameric SSC.

Octameric STC is produced by all of the C-terminal IN truncations
We determined whether all of the enzymatically active C-terminal IN truncations were able to assemble the octameric RSV STC with the 42-bp branched viral/target DNA substrate used for crystallization of IN(1-270) (2). The various C-terminal IN truncations were assembled with the branched substrate at 4°C for 24 h in the absence of MK-2048 (Fig. S2). All of the C-terminal IN truncations formed the STC with essentially the same efficiency and yield. The assembly of STC requires higher salt conditions and lower temperature to avoid aggregation problems (7) compared with conditions to assemble the SSC. The assembly buffer contains 0.35 M NaCl and 0.1 M ammonium sulfate, and the SEC elution buffer contains 0.4 M NaCl and 0.1 M ammonium sulfate.

Major alterations of the amino acid sequences in the RSV IN tail region from Gln-271 to Lys-278 do not affect assembly of octameric SSC
Our earlier RSV IN C-terminal truncation mapping studies indicated that IN(1-278) was the most efficient length for producing the octameric SSC (7). Here, we determined whether modifications of the charged residues (Fig. 1) or introduction of random amino acids in the tail region of IN(1-278) would affect the ability of IN to assemble the octameric SSC at 18°C (Fig. 4). The three positively charged Lys residues in the tail region of IN(1-278) were altered to Ala in K272A/K277A/K278A (termed 3K/A) (Fig. 4). In addition, the charged Lys-272/Asp-273/Glu-274 group was changed to uncharged Ala or Gly in K272A/D273A/E274G (termed Alt), and a completely random sequence of eight residues (GAAGGAAA at positions 271-278) (termed Rdm) was substituted in the IN(1-278) backbone. All three alterations in this C-terminal IN backbone behaved very similarly to the original C-terminal IN(1-278) truncation for assembly of the octameric SSC at 18°C for 24 h (Fig. 4). IN (3K/A) had the lowest yield of the octameric SSC among the four IN constructs. As shown previously at 4°C (Fig. 2) (7), with all three new C-terminal IN truncation mutants the formation of the tetrameric SSC occurred first, prior to conversion to the octameric form in a time-dependent manner (data not shown). All of these C-terminal IN truncation mutants had concerted integration activity similar to that of IN(1-278) with 3′-OH recessed 18R GU3 or blunt-ended 20B GU3 (Fig. S3A), and like the other previously defined C-terminal IN truncations (7,10), they were dimeric in solution (data not shown). In summary, there appear to be no major amino acid sequence requirements in the tail region between Gln-271 and Lys-278 for assembly of the octameric SSC in vitro.

Influence of LTR sequences in assembly of the octameric SSC
Site-directed mutagenesis of RSV WT U5 and U3 LTR sequences established that major alterations within seven nucleotides from the viral DNA blunt ends significantly affect concerted integration activity and the 20-bp DNase I footprint protection by IN (15-17). The terminal two nucleotides (TT) are cleaved from the catalytic strand on blunt-ended DNA substrates by IN DNA endonuclease activity to produce the 3′-OH recessed end. The recessed WT U3 DNA sequence starting at the eighth position on the catalytic strand is 5′-ACTACA-OH, and the GU3 sequence is 5′-ACAACA-OH. In each sequence, the third and fourth nucleotides as written (TA in WT U3, AA in GU3) are the sixth and fifth nucleotides from the blunt end (Table 1). The fifth to seventh nucleotides are located in the major groove of the viral DNA substrate in the RSV STC structure (2).
Assembly of the octameric SSC with 18R GU3 by IN(1-278) is better than with 18R WT U3 at 18°C after 24 h (Fig. 5A), consistent with the enhancement of concerted integration activity observed with GU3 (16-18). In contrast, WT U5 (5′-GCTTCA-OH), containing two pyrimidines at positions 5 and 6, was not capable of effectively assembling the octameric SSC at 18°C after 48 h, possibly due to the higher ionic strengths of 0.1 M ammonium sulfate and 0.1 M NaCl required for producing soluble complexes (Fig. 5B) (7,10). We had previously demonstrated that GU3-GU3 and U3-U3 ends are highly preferred over U5-U5 ends in assay mixtures at 0.3 M NaCl for concerted integration (15,19). The concerted integration activity with GU5 (5′-GCAACA-OH), produced by changing these two nucleotides in WT U5 to purines and thus mimicking the GU3 DNA substrate, is significantly higher than with WT U5 (15,16). This change in GU5 resulted in the effective assembly of the octameric SSC (Fig. 5B). The results suggest that the two nucleotides in the fifth and sixth positions play a major role in the assembly of the STI-stabilized tetrameric and octameric SSC.
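The position numbering used for the recessed LTR ends can be made concrete with a short sketch. It assumes only what the text states: each hexamer is written 5′ to 3′, its first base is position 8 from the blunt end, and its last base (the A of the conserved CA-OH) is position 3. The function and dictionary names are hypothetical and chosen for illustration.

```python
# Recessed catalytic-strand hexamers quoted in the text, written 5' -> 3'.
# The first base is position 8 from the blunt end; the last base is position 3.
SUBSTRATES = {
    "WT U3": "ACTACA",
    "GU3": "ACAACA",
    "WT U5": "GCTTCA",
    "GU5": "GCAACA",
}

PURINES = {"A", "G"}


def base_at(hexamer: str, position: int, first_position: int = 8) -> str:
    """Return the base at a given position counted from the blunt end."""
    index = first_position - position
    if not 0 <= index < len(hexamer):
        raise ValueError(f"position {position} is outside the quoted sequence")
    return hexamer[index]


def positions_5_and_6(hexamer: str) -> tuple[str, str]:
    """Return the bases at positions 5 and 6, in that order."""
    return base_at(hexamer, 5), base_at(hexamer, 6)


def purines_at_5_and_6(hexamer: str) -> bool:
    """True when both position 5 and position 6 carry a purine."""
    return all(base in PURINES for base in positions_5_and_6(hexamer))


if __name__ == "__main__":
    for name, seq in SUBSTRATES.items():
        p5, p6 = positions_5_and_6(seq)
        print(f"{name}: position 5 = {p5}, position 6 = {p6}, "
              f"both purines = {purines_at_5_and_6(seq)}")
```

Under this numbering, the two gain-of-function substrates are the only ones with purines at both positions, which matches the substrates reported to assemble the octameric SSC most effectively.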

Trans-communication occurs between IN-LTR ends to promote concerted integration activity and assembly of octameric SSC
The U3 and U5 LTR ends must communicate with each other in the preintegration complex in virus-infected cells for the concerted insertion of the two ends into the host genome. The RSV U3 end can effectively interact with the U5 end for concerted integration by IN at 0.3 M NaCl (15,19). We wanted to determine whether GU3 DNA substrate was able to incorporate WT U5 into the octameric SSC even though U5 by itself is poorly incorporated into this complex under relatively high ionic strengths (

Table 1 Sequences of the LTR substrates used in this study
The WT U3 and U5 DNA substrates are shown. With the gain-of-function GU3 as well as GU5 substrates, the 5th and 6th nucleotides on the 3′-OH recessed ends contain AA. The changes in the nucleotides from the WT sequences are in bold and underlined. In selected experiments, as described in the text, several substrates were labeled with fluorophore Cy5 at the 5′-end of the LTR on the recessed strand.

Substrate name
Substrate sequence

substrate was labeled with Cy5 fluorophore on the 5′-end of the catalytic strand to avoid any interference of Cy5 with MK-2048 in the active site during formation of the SSC (Fig. 6A) and to measure the quantity of Cy5-U5 incorporated into the assembled octameric SSC in the presence of varying molar ratios of GU3 (Fig. 6B). The total concentration of both LTR substrates in the mixture was always held constant at 15 µM, but the molar ratio of GU3 to U5 was varied. The concentration of the Cy5-U5 was always fixed at 3.75 µM, and the ratio was varied with unlabeled U5. Concerted integration of the Cy5-U5 is the same as that of U5 (Fig. S4). Without MK-2048 to stabilize the SSC, no tetrameric or octameric SSC were observed in the presence of the U5 DNA substrate at 15 µM (Fig. 6A). In the presence of MK-2048, minor populations of tetrameric and octameric SSC were observed, as shown previously with WT U5 (Fig. 5B). As the ratio of GU3 to total U5 (labeled and unlabeled) was decreased from 1:1 through 1:2 and 1:3 to 1:7, the quantity of the octameric SSC decreased significantly, as expected, because the U5 substrate does not efficiently produce SSC (Fig. 6, A and B). At the 1:7 molar ratio, the quantity of U5 incorporated decreased, possibly because the amount of GU3 became a limiting factor. However, the quantity of the U5 substrate incorporated into the octameric SSC significantly increased at the molar ratios of 1:1, 1:2, and 1:3, as determined by fluorescence intensity across the peak fractions of the octameric SSC (Fig. 6B). As expected, at the molar ratio of 3:1 (Fig. 6B), the quantity of U5 significantly decreased in the octameric SSC (Fig. 6A). The total quantity of the octameric SSC was the highest with only GU3 as substrate.
In summary, the results suggest that GU3 ends can act cooperatively for assembly of the octameric SSC by incorporating U5 ends, even under high-salt conditions not permissive for U5 ends by themselves (Figs. 5B and 6A). The apparent cooperative binding of proximal IN dimers to the two different LTR ends in the tetrameric SSC promotes binding of the two additional distal IN dimers resulting in a stable octameric SSC to mediate concerted integration.
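The substrate mixtures in the GU3/U5 titration follow from simple arithmetic, sketched below. The only inputs taken from the text are the constant total LTR concentration (15 µM), the fixed Cy5-U5 concentration (3.75 µM), and the GU3 to total U5 molar ratios; the function name and the dictionary keys are hypothetical. The sketch assumes that each ratio refers to GU3 versus total U5 (labeled plus unlabeled), as stated for the titration.

```python
from fractions import Fraction

TOTAL_UM = Fraction(15)       # total LTR substrate concentration, in µM
CY5_U5_UM = Fraction(15, 4)   # fixed Cy5-labeled U5 concentration, 3.75 µM


def mixture(gu3_parts: int, u5_parts: int) -> dict[str, float]:
    """Concentrations (µM) for a GU3 : total-U5 molar ratio at 15 µM total."""
    gu3 = TOTAL_UM * gu3_parts / (gu3_parts + u5_parts)
    u5_total = TOTAL_UM - gu3
    unlabeled = u5_total - CY5_U5_UM
    if unlabeled < 0:
        raise ValueError("ratio leaves less total U5 than the fixed Cy5-U5 amount")
    return {
        "GU3": float(gu3),
        "U5_total": float(u5_total),
        "Cy5_U5": float(CY5_U5_UM),
        "U5_unlabeled": float(unlabeled),
    }


if __name__ == "__main__":
    # Ratios of GU3 to total U5 named in the text.
    for ratio in [(3, 1), (1, 1), (1, 2), (1, 3), (1, 7)]:
        print(f"{ratio[0]}:{ratio[1]}", mixture(*ratio))
```

With these numbers, the 3:1 mixture contains no unlabeled U5 (all 3.75 µM of U5 is Cy5-labeled), whereas the 1:7 mixture contains only 1.875 µM GU3, consistent with the suggestion that GU3 becomes limiting at that ratio.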
RSV IN is capable of communicating in trans between one native LTR end and one partially defective LTR end in IN-DNA complexes, which promotes concerted integration activity (17,19). Studies were conducted to investigate the role of the seventh position of the RSV U3 LTR; the CG pair at this position is critical for concerted integration activity in vitro (17). Arg-244 of the inner catalytic subunit of the proximal IN dimer is positioned in the major groove of the viral DNA, closest to purine G7 of the nontransferred strand (2). The recessed WT U3 DNA sequence starting at the eighth position on the catalytic strand is 5′-ACTACA-OH (Table 1). The octameric SSC is not assembled in the presence of 100% LTR substrate modified at the seventh position (5′-ATTACA-OH) (termed Cy5-U3-T7) (Fig. S5). The purine A7 on the nontransferred strand does not allow the assembly of the SSC. Again, Cy5 was placed at the 5′-end of the catalytic strand to avoid potential interference in the active site. Changing the seventh position to A (5′-AATACA-OH) (termed U3-A7) also produced negative assembly results (data not shown). The two modified U3 substrates were capable of concerted integration only at 0.125 M NaCl, with U3-A7 being the most defective (Fig. S4). Neither modified substrate was capable of concerted integration activity at 0.3 M NaCl.

However, as in the previously described interactions between GU3 and U5 DNA substrates (Fig. 6), the presence of Cy5 on the partially defective Cy5-(5′-ATTACA-OH) substrate allowed the detection of this Cy5-U3-T7 substrate coupled with GU3 in the octameric SSC (Fig. S5). The molar ratio of GU3 to Cy5-U3-T7 was again varied, with the final concentration of both LTR substrates maintained at 15 µM. As expected, the incorporation of Cy5-U3-T7 into the SSC increased as the ratio of GU3/Cy5-U3-T7 changed from 1:1 to 1:3 (Fig. S5B) and decreased at a molar ratio of 1:7. At the opposite high molar ratio of 3:1, a significantly lower quantity of Cy5-U3-T7 was incorporated into the octameric SSC. The GU3 substrate interactions with this defective LTR end (Cy5-U3-T7) mirrored the GU3/U5 cooperative interactions to produce the octameric SSC as described above.
These assembly data are consistent with the results obtained for concerted integration activity of these modified LTR substrates and the ability of the two LTR ends to communicate in
trans (17,19). In summary, the present LTR studies demonstrate that IN binding in the major groove of at least one viral DNA end is critical for assembly of the tetrameric SSC that is eventually converted into the octameric SSC.

Functional analysis of mutations introduced into Arg-263 in the CTD of IN for SSC and STC assembly defects
The CTDs of retrovirus IN (20), including RSV IN (21,22), have been shown to be required for virus replication (Fig. 1). The CTDs of all eight RSV IN subunits (chains A-H) together form a ring-shaped structure to "bundle up" the two viral DNA molecules in the STC (2). We wanted to investigate several residues in the CTD of RSV IN that would potentially play critical roles in the assembly of both the tetrameric and octameric SSC.
In the RSV STC, Arg-263 interacts with both viral DNA strands, and two IN subunits (C and G chains from the distal IN subunits) are nearest the putative dolutegravir (DTG) pocket-binding site (2). In HIV-1 IN, R263K plays a key role in the observed low-level DTG drug resistance found in patients and produces HIV-1 that possesses reduced replication and IN-DNA-binding capacities (23,24 (Fig. S6A) (10, 25). The IN mutants were assembled with GU3 under standard high-ionic strengths in the presence of MK-2048 (Fig. 7, A and B). The R263A mutant was unable to produce SSC even after 48 h at either 18 or 4°C (Fig. 7, A and B), consistent with the ability of IN R263A to catalyze concerted integration at a low level only under 0.125 M NaCl low-salt assay conditions (Fig. 7C) (26). The very minor tetrameric peak observed with R263A is probably the tetrameric form of IN only (Fig. S6A). In contrast, IN R263K was capable of assembling the tetrameric SSC at both temperatures (Fig. 7, A and B) but lacked the ability to produce the octameric SSC observed with IN(1-278) (Fig. 4). Viral DNA is associated with the tetrameric SSC produced by IN R263K (data not shown). IN R263K was able to catalyze concerted integration even under high-salt conditions (300 mM NaCl), but at a significantly lower level than IN(1-278) (Fig. 7C). The results suggest that IN R263K also has a reduced DNA-binding capacity, similar to that observed with HIV-1 R263K. The failure of IN R263K to form an octameric SSC suggests that the formation of the tetrameric SSC occurs prior to assembly of the octameric SSC, similar to the observed temperature effect on assembly of the RSV SSC (Fig. 2) (7,10). The results also suggest that Arg-263 of the distal IN subunits is important for assembly.
As stated previously, the assembly efficiency of the RSV STC for WT IN and all of the C-terminal IN truncations appears equivalent (Fig. S2), suggesting that these proteins have equal DNA-binding capacities at 0.35 M NaCl in the assembly buffer and equal stability upon SEC analysis with 0.4 M NaCl in the elution buffer. However, both the R263A and R263K mutations in the parental IN(1-278) construct have diminished capability to assemble the STC compared with IN(1-278) (Fig. 8). These assembly defects with the R263A and R263K IN mutants and their decreased ability to catalyze concerted integration (Fig. 7) are consistent with their diminished capacity to bind viral DNA. The ability of these two IN mutants to partially assemble the octameric STC suggests that the mutations in Arg-263 affect IN-viral DNA-binding interactions and not necessarily the protein-protein interactions that are necessary for incorporation of IN octamers into the STC. Whatever the defects are, they could be compensated for or rescued by target DNA.

Highly conserved Trp-233 in RSV IN plays a role in the assembly of the SSC by maintaining essential protein-protein interactions necessary for dimerization
A highly conserved residue in the CTD of all retrovirus IN is Trp-233 (Fig. 1) (20). In the STC, Trp-233 is stacked between the Arg-227 and Lys-266 side chains (2). Mutation of Trp-233 in full-length RSV IN to Ala, but not to Phe, abolishes concerted integration activity and DNase I protection of the U3 ends by IN (27). Modification of Trp-233 to Glu decreases concerted integration by over 70% and eliminates DNase I protection by IN. We determined what effect the above mutations introduced into WT RSV IN had on assembly of the SSC. The concerted integration activities using GU3 as DNA substrate with these full-length WT and mutant INs (Fig. S3B) are similar to those previously observed (27).
The assembly efficiency of GU3 with WT RSV IN(1-286) or full-length IN (W233F) under standard assembly conditions was similar after 48 h at 18°C (Fig. 9A) or 4°C (Fig. 9B). Both the tetrameric and octameric SSC were produced with these INs. However, the efficiency with full-length IN (W233E) was ~40% of that observed with WT RSV IN for assembly of the octameric SSC but similar for the tetrameric SSC (Fig. 9). WT, W233E, and W233F IN had similar efficiency for assembly of the STC (Fig. S7). The enzymatically inactive IN W233A did not allow assembly of the tetrameric or octameric SSC (data not shown). The W233A mutation caused the protein to aggregate in the absence of viral DNA, producing higher-order structures (Fig. S6B), possibly octamers, whereas WT IN(1-286), IN W233E, and IN W233F produced mainly dimers and some tetramers (Fig. S6B) (7,10). When W233A was incubated with viral DNA under standard assembly conditions, the normally observed aggregated IN octamers were produced (Fig. S6B), and viral DNA was not present in this protein complex (data not shown). Aggregated IN W233A failed to bind DNA.
These alterations at Trp-233 suggest that the stacking of an aromatic residue between Arg-227 and Lys-266 is important in orienting the basic side chains in the RSV STC (2) and in DNA-free IN (26). The results further suggest that Trp-233 plays a critical role in maintaining an IN dimeric structure essential for the formation of the tetrameric SSC, the precursor to the octameric SSC.

Role of IN Trp-213 in tetrameric and octameric SSC assembly
Trp-213 in RSV IN is the last residue at the end of helix 9 (Fig. 1) and stacks with another Trp-213 in the free IN dimer (26) and in the RSV STC (2). These Trp residues appear to stabilize the asymmetrically associated linker segments between the CCD and the CTD. We wanted to examine the role of Trp-213 in the assembly of the tetrameric and octameric SSC by producing full-length IN W213A, a mutation that affects the concerted integration reaction. The reaction is significantly inhibited in the presence of 0.3 M NaCl but is similar to that of WT IN at 0.125 M NaCl (Fig. S3C), as observed previously (26). The dimeric structure of full-length IN W213A in solution is maintained under a variety of conditions, without the low level of tetramers (Fig. S6C) that is present with WT IN (Fig. S6A).
Full-length WT IN(1-286) is capable of forming both the tetrameric and octameric SSC, with the octameric complex being the major species assembled at 18°C for 24 h (Fig. 10) (7,10). The efficiency of IN W213A in producing the octameric SSC was significantly diminished relative to WT IN, whereas the tetrameric SSC failed to accumulate at 18°C for various times of incubation up to 24 h (Fig. 10). Neither the tetrameric nor the octameric SSC was assembled by IN W213A at 4°C for 24 h (data not shown), suggesting that higher energy at 18°C is necessary for proper assembly of the octameric SSC. IN W213A was slightly less efficient than WT IN(1-286) for assembly of the STC (Fig. S7B). The results further suggest that the stacking of the two Trp-213 residues within proximal dimers is essential for producing the transient tetrameric SSC but not necessarily the octameric SSC.

Discussion
The assembly mechanisms for retrovirus intasomes and the regions of IN that are involved in these processes are unknown. Here, we demonstrate that the assembly mechanisms of the soluble octameric STC and SSC appear to differ in vitro. All of the C-terminal IN truncations readily assembled the STC, including IN(1-270) used for X-ray crystallography (2). In contrast, IN(1-269) was capable of assembling the tetrameric SSC but essentially failed to produce the octameric SSC. All of the other C-terminal IN truncations were capable of assembling both the tetrameric and octameric SSC at different efficiencies at a higher temperature (18°C), which appears to be a critical parameter for octameric SSC formation. The results suggest a critical role of the tail region in the octameric SSC assembly process. Site-directed mutagenesis of residues Trp-213, Trp-233, and Arg-263 in the CTD does not severely affect STC assembly except for Arg-263, which affects both STC and SSC formation. Mutations in these three residues affected the assembly of the octameric and/or tetrameric SSC by apparently different mechanisms. A surprising new role for trans-communication occurring between proximal IN dimer-DNA complexes was unveiled with the use of donor LTR DNA substrates modified in the fifth, sixth, and seventh nucleotide positions, which are located in the major groove of the viral DNA in the STC (2). This trans-communication enables the assembly of nondefective LTR ends with partially defective LTR ends to produce an octameric SSC capable of concerted integration.
The tail region following the CTD is disordered in RSV IN (26,28) and is not resolved in retrovirus intasome structures (1,3-5). The length but not the amino acid content of the tail region affects the assembly of the octameric RSV SSC in vitro. The last well-ordered residue in the crystal structure of the RSV STC is Ile-269 (Fig. 1) (2). IN(1-269) is capable of assembling the STI-stabilized tetrameric SSC but not the octameric form (7,10) under current assembly conditions. The addition of several residues beyond Ile-269 promotes the assembly of both the tetrameric and octameric SSC (Figs. 2 and 3). The modification of the amino acid content of the tail region in IN(1-278) had minimal effect on assembly of the octameric SSC (Fig. 4). The tetrameric SSC is easily observed at 4°C with all of the proximal IN truncations, suggesting a minimal energy requirement for this initial DNA binding and assembly event. In contrast, the energy supplied at 18°C was essential to effectively produce the stable octameric SSC, which suggests that a major structural change in IN is necessary to accomplish this task.
What structural changes in RSV IN are required for binding of the proximal IN dimers to the viral DNA to produce the tetrameric SSC and subsequently the addition of distal IN dimers to produce the octameric SSC? Reflecting their differential roles in the RSV STC, the proximal and distal IN dimers show distinct conformations that differ in the relative configurations between the dimerized CCDs and CTDs (2). The IN dimers in solution in their native conformation are poised for binding the viral DNA (2,26,28), and indeed the proximal IN dimer takes this native conformation to engage the viral DNA in the RSV STC (2). Two juxtaposed proximal IN dimers, each bound to a viral DNA terminus, form a tetramer by swapping NTDs of the inner catalytic subunit. The energy requirement of the tetrameric SSC assembly process is minimal, as demonstrated by its formation at 4°C (Figs. 2 and 3) (10). In contrast, the alternative conformation of the CTD in the distal IN dimers, which is not observed in free IN structures, is required for all IN molecules to fit in the RSV STC without steric clashes (2). The simplest explanation for the efficient assembly of the octameric SSC at 18°C is that adopting the alternative CTD conformation needed for binding of the distal IN dimers requires energy, suggesting an induced-fit mechanism upon association of the free IN dimers through the C-terminal interactions to complete the octameric SSC assembly. The CCDs of the distal IN dimer do not have any catalytic role but serve as a platform for target DNA binding in the STC (Fig. 11) (2).
We investigated the interactions of IN with LTR ends in assembling the RSV octameric SSC by exploiting the important role of the fifth, sixth, and seventh nucleotides from the DNA end in concerted integration activity. Previous studies established that trans-communication occurs when RSV IN bound to an active LTR end rescues IN bound to partially defective LTR ends for concerted integration activity (15-17,19). In this report, we established which IN-LTR structures are necessary for concerted integration under these conditions. Assembly of WT U5 into an octameric SSC is minimal under the higher salt conditions required for maintaining solubility (Fig. 5), as is concerted integration activity with the U5 substrate at 0.3 M NaCl (15,19). However, under the same higher salt assembly conditions in the presence of GU3, WT U5 is readily incorporated along with GU3 into the octameric SSC (Fig. 6). Apparently, the proximal IN dimer-U5 complex is stable enough that the IN dimer-GU3 complex is able to capture and stabilize the CIC of this hybrid tetrameric SSC, which requires domain swapping of the NTD (2). Similar trans-communication for assembly occurs between GU3 and partially defective U3 ends modified at the seventh nucleotide position, which is critical for concerted integration with RSV IN (Figs. S4 and S5) (17). RSV IN Arg-244 of the inner catalytic subunit of the proximal IN dimer is positioned in the major groove of the viral DNA, closest to G7 of the nontransferred strand. The equivalent residue Glu-246 of HIV IN interacts with the A7 nucleotide of its nontransferred LTR strand, as shown by disulfide cross-linking studies (29) and by mutagenesis studies of Glu-246 affecting catalytic activities (30). Mutagenesis of HIV-1 and RSV LTR ends demonstrated that trans-communication can also occur between nondefective and partially defective ends for integration and replication in vivo (31-34). In summary, RSV IN has the capability to rescue a deficient LTR end through its interactions with a nondefective LTR end to assemble the octameric SSC, thus allowing concerted integration.
IN is capable of assembling the STC with the product of the concerted integration reaction (a branched viral/target DNA substrate) and the SSC that utilizes only viral DNA substrates in vitro (1-6). Our results here suggest that the soluble form of the RSV STC may not reveal which regions or residues of IN are necessary for assembly as well as the SSC does. For example, the STC (Fig. 4 and Fig. S2) is produced by all of the C-terminal truncations at similar efficiencies, but the truncations vary in their capacities to produce the tetrameric and octameric SSC (Figs. 2 and 3). Currently, the soluble octameric RSV STC (7) can only be assembled at 4°C because higher temperatures cause aggregation problems, although the insoluble complex can be solubilized in the presence of 1 M NaCl for X-ray structural analysis (2), similar to procedures used to determine structures of other retrovirus intasomes. In contrast, it appears that target DNA can functionally compensate for the inability of IN to assemble the octameric SSC by itself for concerted integration activity (Fig. 11) (7,10), suggesting that a combinational approach may also prove effective. As discussed below, results from our limited single-site mutagenesis analyses of RSV IN suggest that both the tetrameric and octameric SSC will provide differential tools to investigate functions of IN subunits.
Single-site mutations introduced into RSV IN raise problems for interpretation of data because each subunit in the proximal and distal IN dimers has a different function and location in the intasome. Unexpectedly, RSV IN single-site mutants have provided some insights into functional relationships between the tetrameric and octameric SSC, their capacity to catalyze concerted integration, and their subunit structure in solution. Previous biochemical and structural studies of RSV IN suggest that interactions between several different residues are necessary to stabilize the asymmetrically associated linker segments between the CCD and CTD (26). One of these residues is IN Trp-213, the last residue before the linker between the CCD and CTD (Fig. 1); the Trp-213 residues of the two subunits stack with each other in an IN dimer (26), producing a 2-fold symmetry in the proximal dimers in the STC (2). Although full-length IN W213A is capable of producing the octameric STC, as is WT IN(1-286) (Fig. S7B), the assembly of the tetrameric SSC was not observed at 18°C, whereas the octameric SSC was produced at an ~50% level (Fig. 10), corresponding to their concerted integration capabilities (Fig. S3C). Neither assembled structure is observed at 4°C, suggesting that the CTD is effectively nonfunctional for assembly compared with WT IN. The results further suggest that this stacking between Trp-213 residues plays a critical role in tetrameric SSC assembly but not necessarily in assembly of the octameric form. The dimeric state of free IN W213A is maintained under several different salt conditions (Fig. S6C). Concerted integration by IN W213A at 37°C is salt-sensitive (Fig. S3C) (26). Possibly, the ability of the CTD residues to interact with the viral DNA in the higher salt assembly buffer is also compromised because of an improper linker structure. In summary, the results suggest that Trp-213 has a critical role in assembly of the tetrameric SSC but is not absolutely necessary for octameric SSC assembly or concerted integration activity under low-salt conditions.
RSV IN R263A and R263K significantly impact the assembly of the octameric SSC, although IN R263K is capable of assembling the tetrameric SSC (Fig. 7, A and B). The ability of IN R263A and R263K to promote concerted integration under low-salt conditions, in contrast to higher salt concentrations (Fig. 7C) (26), appears to correlate with their capacity to assemble the tetrameric SSC and the octameric STC (Fig. 8). Decreased DNA binding may also be responsible for the lower efficiency of STC production with these mutants (Fig. 8). The results suggest that Arg-263 plays a direct role in binding to viral DNA, as shown in the RSV STC (2). The R263K mutation in HIV-1 IN also causes slightly decreased DNA binding, contributes to weak drug resistance against dolutegravir in virus-infected individuals, and produces HIV-1 with a significantly reduced replication capacity (23,35,36).
The assembly of the RSV tetrameric and octameric SSC produced by IN mutants of the highly conserved Trp-233 (Fig. 1) appears to be more closely aligned with results previously shown for RSV IN (27) and HIV IN (37) concerted integration activities. Both IN mutants (W233F and W233E) can produce the tetrameric and octameric SSC but at different efficiencies (Fig. 9), which correspond to their ability to promote concerted integration under low-salt conditions (Fig. S3B) (27). These IN mutants appear to produce near-normal quantities of dimers and tetramers in solution, whereas W233A produced only an aggregated IN octamer (Fig. S6), which is devoid of catalytic activity under any salt condition (Fig. S3B) (27). Maintenance of the dimeric IN structure appears critical for proper intasome assembly.
Overall, our results suggest that the assembly mechanisms, at least those of the RSV SSC, are experimentally approachable. The independent and dependent formation of the tetrameric and octameric SSC allows separation of two components necessary for understanding concerted integration. Atomic-resolution structures of these complexes should provide the further detail needed to understand these mechanisms.

Concerted integration assay
The concerted integration assay using 3′-OH recessed oligonucleotide (ODN) viral DNA substrates with RSV IN was previously described (7,10). The concentrations of IN and the viral ODNs were generally 2 and 1 µM, respectively. The NaCl concentration in the assay mixture was either 0.125 or 0.3 M as indicated. The strand transfer products were separated on a 1.8% agarose gel, stained with SYBR Gold (Invitrogen), and analyzed by a Typhoon 9500 laser scanner (GE Healthcare).

Viral DNA and viral DNA/target ODNs for assembly of IN-DNA complexes
Double-stranded 3′-OH recessed ODN containing RSV gain-of-function (G) U3 and WT U3 long-terminal repeat (LTR) sequences were 18 nucleotides in length and synthesized by Integrated DNA Technologies (IDT). The DNA substrates were recessed by two nucleotides on the catalytic strand and designated with an R. The identified length of the ODN denotes the noncatalytic strand. The sequences were as follows: WT U3 (5′-ATTGCATAAGACTACA-3′ and 5′-AATGTAGTCTTATGCAAT-3′) and GU3 18R (5′-ATTGCATAAGACAACA-3′ and 5′-AATGTTGTCTTATGCAAT-3′). The catalytic strands of WT U3 and GU3 differ at position 13 from the 5′-end (T in WT U3, A in GU3). RSV WT U5 20R was prepared by annealing 5′-ATG AAG CAG AAG GCT TCA-3′ and 5′-AAT GAA GCC TTC TGC TTC AT-3′, and GU5 20R was prepared by annealing 5′-ATG AAG CAG AAG GCA ACA-3′ and 5′-AAT GTT GCC TTC TGC TTC AT-3′. The catalytic strands of WT U5 and GU5 differ at positions 15 and 16 from the 5′-end (TT in WT U5, AA in GU5). A 20-bp blunt-ended GU3 substrate (20B) was also used to measure the coupling of 3′-OH processing and strand transfer for concerted integration. The sequence of the branched viral GU3/target DNA substrate used for crystallography of the RSV STC has been previously described (2).
Several other modified LTR substrates were used. The WT U3 sequence was changed at the seventh position on the noncatalytic strand, which is also located in the major groove within the catalytic site of IN (2). These substrates are U3-T7 (5′-ATTGCATAAGAATACA-3′ and 5′-AATGTATTCTTATGCAAT-3′), which switches a purine (G) for a pyrimidine (T) at the seventh position on the noncatalytic strand, and U3-A7 (5′-ATTGCATAAGATTACA-3′ and 5′-AATGTAATCTTATGCAAT-3′), which switches the purine (G) to another purine (A) at the same position on the noncatalytic strand.
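The strand pairings listed above can be checked mechanically. The following is a minimal sketch, not part of the published procedure: it takes the sequences as printed in the text, confirms that each catalytic strand anneals to its noncatalytic partner with a two-nucleotide 3′ recess, and reports the seventh nucleotide of the noncatalytic strand. The dictionary keys and function names are illustrative labels only.

```python
# Sequences (5'->3') copied from the text as (catalytic, noncatalytic) pairs.
# The dictionary keys are labels chosen for this sketch.
SUBSTRATES = {
    "WT U3": ("ATTGCATAAGACTACA", "AATGTAGTCTTATGCAAT"),
    "GU3 18R": ("ATTGCATAAGACAACA", "AATGTTGTCTTATGCAAT"),
    "WT U5 20R": ("ATGAAGCAGAAGGCTTCA", "AATGAAGCCTTCTGCTTCAT"),
    "GU5 20R": ("ATGAAGCAGAAGGCAACA", "AATGTTGCCTTCTGCTTCAT"),
    "U3-T7": ("ATTGCATAAGAATACA", "AATGTATTCTTATGCAAT"),
    "U3-A7": ("ATTGCATAAGATTACA", "AATGTAATCTTATGCAAT"),
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def recess_length(catalytic, noncatalytic):
    """Number of nucleotides missing from the 3' end of the catalytic strand.

    Returns None if the catalytic strand is not a 5' prefix of the
    full complement of the noncatalytic strand (strands do not anneal).
    """
    full_length_strand = reverse_complement(noncatalytic)
    if not full_length_strand.startswith(catalytic):
        return None
    return len(full_length_strand) - len(catalytic)


def seventh_nucleotide(noncatalytic):
    """Seventh nucleotide counted from the 5' end of the noncatalytic strand."""
    return noncatalytic[6]


if __name__ == "__main__":
    for name, (cat, non) in SUBSTRATES.items():
        print(name, len(non), recess_length(cat, non), seventh_nucleotide(non))
```

Under this check, the U3-T7 and U3-A7 substrates differ from WT U3 only at the seventh noncatalytic position and its paired base on the catalytic strand, which matches the description in the text.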
Several LTR substrates were also labeled on the 5′-end of the catalytic strand with the Cy5 fluorophore to measure the incorporation of the labeled substrates in a mixture containing unlabeled LTR substrates into octameric SSC. The Cy5 label was incorporated into WT U5 and termed Cy5-U5. Cy5 was also used to label U3-T7. The incorporation of the 18R Cy5-U5 substrate into purified SSC was detected upon gel analysis of the deproteinized samples by excitation (635 nm) using a Typhoon FLA9500 biomolecular imager. The incorporation of Cy5 into purified SSC was also measured in solution using a Fluoromax-3 (Jobin Yvon, Inc, Edison, NJ) (7,38). The fluorophore-labeled substrates were synthesized by IDT.

Assembly protocols for RSV SSC and STC
RSV SSC produced with viral DNA require the presence of an STI to produce kinetically "trapped" IN-DNA complexes (7,10). The standard direct assembly buffer was 20 mM HEPES, pH 7.5, 100 mM ammonium sulfate, 100 mM NaCl, 1 M nondetergent sulfobetaine (NDSB)-201, 10% dimethyl sulfoxide (DMSO), 10% glycerol, and 1 mM tris(2-carboxyethyl)phosphine (TCEP). IN (as monomers), 3′-OH recessed DNA ODN, and STI concentrations were set at 45, 15, and 125 µM, respectively, unless otherwise indicated. After addition of DNA to the assembly mixture and subsequently IN, the samples were incubated at 4 or 18°C for different times as indicated. The STI was MK-2048 unless indicated otherwise. Each specific assembly experiment was performed in duplicate or triplicate.
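As a rough illustration of what these input concentrations imply, the sketch below computes the theoretical upper bound on complex concentration for a tetrameric (4 IN) or octameric (8 IN) SSC, each containing two viral DNA ends. It assumes that each recessed ODN duplex supplies one viral DNA end and that every molecule is incorporated, which real assemblies will not achieve; the function name is hypothetical and the numbers are ceilings, not expected yields.

```python
def ssc_capacity(in_monomer_um, dna_end_um, in_per_complex, dna_ends_per_complex=2):
    """Upper bound (µM) on complex concentration and the limiting component.

    Assumes complete incorporation of IN monomers and DNA ends, so the
    result is a theoretical ceiling only.
    """
    limit_from_in = in_monomer_um / in_per_complex
    limit_from_dna = dna_end_um / dna_ends_per_complex
    limiting = "IN" if limit_from_in < limit_from_dna else "DNA"
    return min(limit_from_in, limit_from_dna), limiting


if __name__ == "__main__":
    # Standard assembly conditions from the text: 45 µM IN monomers, 15 µM ODN.
    print("octameric SSC:", ssc_capacity(45, 15, in_per_complex=8))
    print("tetrameric SSC:", ssc_capacity(45, 15, in_per_complex=4))
```

Under these assumptions, IN would be the limiting component for an all-octameric population, whereas DNA would be limiting for an all-tetrameric population.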
The direct assembly conditions used to produce the RSV STC with a 42-bp branched viral/target DNA substrate were the same as described above for the SSC except for the following. The assembly mixture contained IN (35 µM), the 42-bp GU3/target DNA substrate (10 µM), and NaCl (0.35 M), but no STI was present (7). The assembly was always at 4°C because at 18°C the IN-DNA complexes precipitated out of solution. The SEC buffer for analysis of the STC is described below.

Size-exclusion chromatography
SEC using Superdex 200 Increase (10/300) (GE Healthcare) and molecular mass standards were previously described (10). The SEC buffer for analysis of the RSV SSC was 20 mM HEPES, pH 7.5, 100 mM ammonium sulfate, 200 mM NaCl, 5% glycerol, and 1 mM TCEP. The SEC buffer for analysis of the RSV STC was the same except that 0.4 M NaCl was used. Chromatography was at 4°C, and UV absorption was monitored at 280 nm. Generally, 100 or 250 µl of sample was injected into the column. Purified IN and its mutants were generally injected into the Superdex 200 column at 100-200 µg per 100 µl (32-62 µM, respectively) or 250 µl at 45 µM. As indicated, a buffer containing 20 mM HEPES, pH 7.5, 1 M NaCl, and 1 mM TCEP was also used.
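The mass and molar loadings quoted above can be related with a simple conversion. The sketch below assumes an IN monomer molecular mass of about 32 kDa. This value is not stated in the text and is chosen here only because it approximately reproduces the quoted 32-62 µM range. The function and constant names are illustrative.

```python
# Assumed monomer mass of full-length RSV IN in daltons (not given in the text).
ASSUMED_IN_MW_DA = 32_000


def micromolar(mass_ug, volume_ul, mw_da=ASSUMED_IN_MW_DA):
    """Convert a protein load (µg in µl) to a monomer concentration in µM.

    µg/µl is numerically g/L; dividing by g/mol gives mol/L, and the
    factor 1e6 converts mol/L to µM.
    """
    return mass_ug / volume_ul / mw_da * 1e6


def mass_ug(conc_um, volume_ul, mw_da=ASSUMED_IN_MW_DA):
    """Convert a monomer concentration (µM) and volume (µl) to mass in µg."""
    return conc_um * volume_ul * mw_da * 1e-6


if __name__ == "__main__":
    print(micromolar(100, 100))  # about 31 µM for 100 µg in 100 µl
    print(micromolar(200, 100))  # about 62 µM for 200 µg in 100 µl
    print(mass_ug(45, 250))      # about 360 µg for 250 µl at 45 µM
```

With this assumed mass, a 250-µl injection at 45 µM corresponds to roughly 360 µg of IN, which is a larger load than the 100-200 µg used for the 100-µl injections.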