Apical Loop-Internal Loop RNA Pseudoknots

Nearly all members of a widespread family of bacterial transposable elements related to insertion sequence 3 (IS3), therefore called the IS3 family, very likely use programmed -1 ribosomal frameshifting to produce their transposase, a protein required for mobility. Comparative analysis of the potential frameshift signals in this family suggested that most of the insertion sequences from the IS51 group contain in their mRNA an elaborate pseudoknot that could act as a recoding stimulator. It results from a specific intramolecular interaction between an apical loop and an internal loop from two stem-loop structures. Directed mutagenesis, chemical probing, and gel mobility assays of the frameshift region of one element from the IS51 group, IS3411, provided clear evidence for the existence of the predicted structure. Modeling was used to generate a three-dimensional molecular representation of the apical loop-internal loop complex. We demonstrate that mutations affecting the stability of the structure reduce both frameshifting and transposition, thus establishing the biological importance of this new type of RNA structure for the control of transposition level.

Although PRF-1 examples are found in organisms from all three domains, most of the authenticated cases are in viruses (notably in retroviruses), in bacteriophages, and in bacterial DNA transposable elements called insertion sequences (IS). The mandatory short sequence on which the ribosome shifts, the frameshift motif, can be either a heptamer of the form X-XXY-YYZ (6,7) or a Y-YYZ tetramer (8-11).
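The two motif classes described above can be located in a sequence with a simple pattern search. The following is a minimal sketch and not the procedure used in this work: the function name `find_motifs` is hypothetical, X and Y are allowed to be the same base (as in A-AAA-AAG), Z is taken to be any base, and the reading frame is deliberately ignored, so a hit is only a candidate that must still be checked against the frame 0 codons.

```python
import re

# Heptamer X-XXY-YYZ: three identical bases, three identical bases, then any base.
# The lookahead lets overlapping candidates be reported.
HEPTAMER = re.compile(r"(?=(([ACGU])\2\2([ACGU])\3\3[ACGU]))")
# Tetramer Y-YYZ: three identical bases followed by any base.
TETRAMER = re.compile(r"(?=(([ACGU])\2\2[ACGU]))")


def find_motifs(sequence):
    """Return (heptamers, tetramers) as lists of (0-based start, motif).

    Hypothetical helper: the reading frame is not taken into account.
    """
    rna = sequence.upper().replace("T", "U")  # accept DNA or RNA input
    heptamers = [(m.start(), m.group(1)) for m in HEPTAMER.finditer(rna)]
    tetramers = [(m.start(), m.group(1)) for m in TETRAMER.finditer(rna)]
    return heptamers, tetramers


if __name__ == "__main__":
    # Toy sequences, not taken from any IS element.
    print(find_motifs("GCAAAAAAGCC"))
    print(find_motifs("CCUUUCGG"))
```

Every heptamer necessarily contains tetramer hits as well, so a practical scan would report the longest in-frame candidate first.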
In many cases the frequency of frameshifting is greatly stimulated by a structure formed by the mRNA on the 3′ side of the shift site; this structure can be either a pseudoknot or a stem-loop (12). The combination of the shift site and of a 3′ stimulator leads to frameshifting frequencies well above the spontaneous level of error (13). The design of viral 3′ stimulatory structures elicited much interest as a source of potentially new RNA motifs with the ability to target the ribosome (12,14). Another reason is their presence in pathogenic viruses such as HIV-1 or the severe acute respiratory syndrome (SARS) coronavirus. It has been proposed that a 3′ structural element stimulates shifting at the slippery site by constituting a physical barrier to mRNA translocation, thus causing ribosomal pausing while the frameshift motif is in the P and A sites (15-18). In retroviruses and coronaviruses, the slippery sequence is almost always associated with a pseudoknot (Recode data base) (19,20). A notable exception is HIV-1, where the stimulator is a stem-loop (21-23). In contrast with the viral situation, there are no published statistics on the bacterial stimulatory structures. However, stem-loops are predominant over pseudoknots among the few analyzed cases, i.e. dnaX (24), IS150 (25), IS911 (26), IS1221 (27), and IS1222 (28) versus IS3 (8).
To gain a better knowledge of the repertoire of stimulatory structures in bacterial frameshift regions, we carried out a comparative analysis of a class of transposable elements, the ISs of the IS3 family (29), because its members very likely use PRF-1 to express the transposase, a protein required for their mobility. The ISFinder data base currently contains 354 members of the IS3 family. Nearly all members of this ubiquitous family share a common genetic organization where two open reading frames, orfA in frame 0 followed by orfB in frame −1, partially overlap (see Fig. 1). The overlap region invariably contains a potential heptameric or tetrameric frameshift motif generally accompanied by a frameshift stimulator. Detailed analysis of transposase expression has been performed on a few ISs: IS150 (25), IS911 (26,30), IS3 (8), and IS629 (11). In each case, the OrfAB transposase is indeed synthesized via PRF-1 on the predicted motif. Our comparative analysis revealed one particularly interesting group among the IS3 family, the IS51 group, because many of its members appeared to contain a new type of stimulatory structure. This structure is a pseudoknot more elaborate than those previously analyzed. It is formed by the interaction between an apical loop and an internal loop; we called it an ALIL-PK by analogy with similar structures obtained by in vitro selection of RNA aptamers against stem-loops from the 5′- and 3′-untranslated regions of hepatitis C virus (31). It has also been suggested that a related structure may be involved in −1 frameshifting and replication of the Barley yellow dwarf virus (32).
To carry out a detailed analysis of this potentially new frameshift stimulator, we chose an Escherichia coli transposable element from the IS51 group, IS3411 (33). In the present work, we have tested by an extensive genetic analysis the contribution of each element of the PRF-1 region of IS3411 (see Fig. 1). These studies as well as chemical probing experiments and gel shift mobility assays have confirmed the existence of the ALIL-PK structure, and molecular modeling provided clues on its architecture. Through its role as a stimulator of PRF-1, this intricate structure clearly determines the level of transposition of IS3411. From a structure/function point of view, it widens our knowledge on the strategies used by biological systems to promote RNA-RNA interactions for control of gene expression.

EXPERIMENTAL PROCEDURES
Enzymes and Chemicals-The enzymes were provided by New England Biolabs, by MP Biochemical (Taq DNA polymerase), by Promega (T7 RNA polymerase), and by InvitroGen (SuperScript II reverse transcriptase).
The frameshift region of IS3411 (nt 336-454) in its wild type or mutated forms was reconstituted by ligating 12 DNA oligonucleotides covering both strands into the HindIII and ApaI sites of the pOFX302 vector (26) (see Fig. 1). The end of orfA is in frame with g10 and the start of orfB is in frame with lacZ. In all constructs (save the m4-5 mutant) the two AUG initiation codons of orfB (overlapping the UAA stop codon of orfA; see Fig. 2) have been changed to AUa and AaG to prevent translation initiation. Thus, the β-galactosidase activity results only from −1 frameshifting.
For the in vivo transposition assays, we modified plasmid pOH9, a derivative of pBR322 containing an entire and functional IS3411 (33). The HindIII site of pBR322 was eliminated by an EcoRI-NheI digestion followed by treatment with the Klenow fragment of DNA polymerase I and ligation; this leaves a unique HindIII site in the IS, near its right end. A small portion of IS3411 was then duplicated by insertion of complementary nucleotides between the BseRI and HindIII sites close to the right end of the IS. The new plasmid, pOFX514, contains an XbaI site after the stop codon of orfB. Plasmid pOFX515 was derived by inserting into the XbaI site complementary oligonucleotides corresponding to the terminal inverted repeats of IS3411, irr and irl, facing each other and separated by three nucleotides: 5′-CT AGT CAC AGA TAA AAC ACT CTC CAG GAA ACC CGG GGC GGT TCA GAA TGA ACC GCC CCG GGA ATC CTG GAG ACT AAA CTT CCT GAG ACT CGA G. The part of the frameshift region between the EcoRV and RsrII sites of pOFX515 was replaced by oligonucleotides to generate three mutants: U342C (no slippery sequence), G416A (stem 3 destabilization), and C419A (stem 2 destabilization).
To assess the possibility of formation in trans of the SLI-SLII complex, the SLII region was cloned under control of the PLtetO-1 promoter of pOFX512D. This plasmid uses a p15A replication origin and contains a kanamycin resistance gene. It was derived from pMPM-K6 (36) by insertion of an EcoRI-PstI-BsiWI-NheI linker between its BglI and EcoRI sites followed by insertion in the BstEII and EcoRI sites of an 81-bp XhoI-[PLtetO-1]-EcoRI fragment from plasmid pZA31 (35).
Measurement of Frameshifting Frequency-Frameshifting was assessed using the pOFX302 reporter plasmid either by measuring β-galactosidase activity or by in vivo labeling of proteins with [35S]methionine (37).
Bacterial strains containing each plasmid construct were grown for 18 h at 37°C in LB medium, and their LacZ activity was measured by determining the rate of hydrolysis of ortho-nitrophenyl-galactoside at 28°C (34,37). Four clones per strain were assayed on three consecutive days. The activity of mutant m1 (Fig. 2), in which the g10 and lacZ genes are in the same frame, was used as 100% to calculate absolute frameshift frequencies. All of the results were then normalized by setting to 1 the activity of the mutant used as wild type (denoted mutant m0).
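The two normalization steps described above (absolute frequency relative to the in-frame m1 construct, then relative frequency with m0 set to 1) amount to two divisions. The sketch below illustrates the arithmetic only; the function name and the activity values are hypothetical and were chosen so that m0 gives the 0.66% reported in Fig. 2. They are not measured data.

```python
def frameshift_frequencies(activity, in_frame="m1", reference="m0"):
    """Convert mean LacZ activities into absolute (%) and relative PRF-1 values.

    activity: dict mapping construct name -> mean LacZ activity (arbitrary units).
    The in-frame construct defines 100%; the reference construct defines 1.
    """
    absolute = {name: 100.0 * value / activity[in_frame]
                for name, value in activity.items()}
    relative = {name: value / absolute[reference]
                for name, value in absolute.items()}
    return absolute, relative


if __name__ == "__main__":
    # Hypothetical activities (arbitrary units), for illustration only.
    activity = {"m1": 10000.0, "m0": 66.0, "m2": 11.0}
    absolute, relative = frameshift_frequencies(activity)
    for name in activity:
        print(f"{name}: {absolute[name]:.2f}% absolute, {relative[name]:.2f} relative")
```

With these illustrative numbers, m2 comes out at one-sixth of the m0 value, which corresponds to a 6-fold reduction.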
Transposition Assays-Transposition frequencies of wild type IS3411 and of its three mutated derivatives were determined by a mating out assay as previously described (26).
Electrophoretic Mobility Shift Assay-DNA segments containing the SLI or SLII region were PCR-generated and transcribed with T7 RNA polymerase as recommended by Promega. DNA was eliminated by treatment with RQ1 DNase (Promega). The RNA was purified by gel filtration (MicroSpin G50 from GE Healthcare) and checked for purity and quantity on an 8% polyacrylamide gel in TBE buffer (100 mM Tris base, 83 mM boric acid, 1 mM Na2EDTA, pH 8.0; 28/1 acrylamide/bisacrylamide ratio). SLI or SLII RNA was labeled with 100 µCi of [α-33P]UTP during the transcription reaction. The interaction between SLI and SLII was assayed by mixing a fixed amount of the [33P]RNA with different quantities of the other partner, in a final volume of 15 µl in TC buffer (10 mM Hepes, pH 7.3, 10 mM NaCl, 70 mM CH3COOK, and 1.5 mM (CH3COO)2Mg). Before mixing, each type of RNA was heated at 75°C for 5 min and allowed to cool down to 20°C over a 30-min period. After mixing, each sample was incubated for 30 min at 22°C and then analyzed by electrophoresis for 90 min at 80 V on 8% polyacrylamide gels (TBM buffer served for gel preparation and as running buffer; migration was carried out at 22°C). The amount of radioactivity present in the gels was determined using phosphorimaging. The apparent Kd for the SLI-SLII complex was estimated as described (31).
Chemical Probing of RNA-The IS3411 frameshifting region was amplified by PCR with the sense primer OFD3475 (TAA TAC GAC TCA CTA TAG GGA GCT TCG CTT ATT TTG CGA AGG CGG AG) containing the T7 RNA polymerase promoter region (underlined) and the antisense primer OFD3476 (CTT TAT CTT CAG AAG AAA AAC CGC CAG TGA ATT AGG GCC CGA CG) using as template the pOFX302-m0 plasmid. To produce the DNA fragments containing either stem-loop I or stem-loop II, PCRs were performed on the same plasmid using the sense primer OFD3475 and the antisense primer OFD3536 (CTG CTCA CGC AGC TTA TCC), or with the sense primer OFD2463 (TAA TAC GAC TCA CTA TAG AGC TGC GTG AGC TGT A) containing the T7 RNA polymerase promoter region and the antisense primer OFD3476. Three types of mRNA (mRNA-0, mRNA-I, and mRNA-II) were prepared by in vitro transcription from the PCR products using T7 RNA polymerase. Transcripts were purified by denaturing PAGE and electroelution, and their sequences were verified by sequencing with reverse transcriptase. 50 pmol of mRNA were renatured in 50 µl of buffer A (50 mM potassium cacodylate, pH 7.2, and 300 mM KCl) or buffer B (50 mM borate-KOH, 100 mM NH4Cl) by warming at 90°C for 1 min followed by incubation at 4°C for 10 min. The magnesium concentration was then adjusted to 10 mM. Chemical probing of mRNA was performed on samples (10 pmol in 50 µl) by the addition of 1.5 µl of DMS (2:10 dilution in 95% ethanol), 3 µl of kethoxal (19 mg/ml in H2O), or 20 µl of CMCT (84 mg/ml in H2O). After incubation at 37°C for 10 min, all of the modification reactions were stopped by the addition of 150 µl of 95% (v/v) ethanol and 5 µl of 3 M sodium acetate followed by rapid mixing. The kethoxal-modified samples were adjusted to 25 mM potassium borate, pH 7.0. The pellets were resuspended in 10 µl of H2O (for DMS and CMCT samples) or in 10 µl of 25 mM potassium borate for kethoxal-treated samples.
Modified bases were detected by primer extension reactions with 5′ end-labeled oligonucleotides and reverse transcriptase (38). Fragments were separated on a 7 M urea, 8% (w/v) polyacrylamide gel. For lead mapping, a freshly prepared 40 mM lead acetate solution (pH 5.5) in H2O was used. The reaction mixture (25 µl) containing the 5′ end-labeled mRNA was incubated at 37°C for 5 min in the presence of lead acetate at concentrations of 0.6, 1.25, 2.5, and 5 mM.
Modeling of the IS3411 Stimulatory Structure-The ERNA-3D program (version 2.0 from Pentafolium-Soft) and an Octane workstation (Silicon Graphics) were used to build molecular models of the ALIL-PK from the m23 mutant. Only the SLI (nt 9-25 in Fig. 3) and SLII (nt 71-110 in Fig. 3) regions were considered. Stems 1, 2, 3, and 4 were first generated using the default parameters for A-form RNA (11 bp/helix turn and a distance of 0.281 nm between bp) and then joined manually. Stereochemically correct models could be generated without the need for helix distortion. The description file in the Protein Data Bank format for model Md1 is included in the supplemental material. Molecular graphics images were produced using the UCSF Chimera package (39).
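To give a sense of scale for the A-form parameters quoted above, the axial span and cumulative twist of an ideal stem follow directly from the rise per base pair and the number of base pairs per turn. This is a back-of-the-envelope sketch, unrelated to the internals of ERNA-3D; the function name is hypothetical, and the convention that n base pairs span n - 1 steps is an assumption of the example.

```python
RISE_NM = 0.281     # distance between successive base pairs (value quoted in the text)
BP_PER_TURN = 11    # base pairs per helical turn (value quoted in the text)


def helix_geometry(n_bp):
    """Return (axial span in nm, cumulative twist in degrees) for an ideal stem.

    Assumption: n_bp stacked base pairs are separated by n_bp - 1 steps.
    """
    if n_bp < 1:
        raise ValueError("a stem needs at least one base pair")
    steps = n_bp - 1
    return steps * RISE_NM, steps * 360.0 / BP_PER_TURN


if __name__ == "__main__":
    # Stem sizes as described for IS3411: stem 1 (5 bp), stem 2 (6 bp), stem 3 (5 bp).
    for name, n_bp in (("stem 1", 5), ("stem 2", 6), ("stem 3", 5)):
        span, twist = helix_geometry(n_bp)
        print(f"{name}: {n_bp} bp, {span:.3f} nm, {twist:.1f} degrees")
```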

A Potential Elaborate Pseudoknot (ALIL-PK) Characteristic of the Frameshift Region within the IS51 Group-Transposable elements of the IS3 family correspond to about 20% of all known ISs. Its 354 members, found in all eubacterial phyla, were sorted into five groups by comparison of their OrfB protein (ISFinder data base) (40). Our own analysis revealed interesting features in 28 elements of one group, the IS51 group (supplemental Table S1). The shift site is either U-UUU or U-UUC (U-UUU-UUC in three cases), which is in contrast with the rest of the IS3 family where A-AAA-AA(A/G) motifs are largely predominant. It is separated from a first stem-loop (SLI) by a spacer sequence (sp1) generally 5 nt long. A second spacer (loop 2), whose length varies between 44 and 104 nucleotides, separates SLI from a second, more complex, potential stem-loop (SLII). Remarkably, a kissing-loop type of interaction is possible between the apical loop of SLI and an internal loop of SLII (stem 2, in supplemental Table S1 and Fig. 1). The end result is an ALIL pseudoknot. The group can be separated into two subgroups according to phylogenetic origin; type 1 ISs are found in proteobacteria, whereas type 2 ISs originate from actinobacteria.
The sequence of five base pairs constituting the SLI stem, nearly exclusively composed of C and G, is highly conserved in type 1 ISs (5′-GGCGG-3′) and less conserved among type 2 ISs. In a few ISs the SLI stem contains four base pairs (ISRme15, ISNeu_1, ISAfa_1, IS1137, and IS994) or more than five base pairs (ISMysp3, ISMyma3, ISRru1, ISGwe_1). Except for IS986, IS1137, IS6110, and ISMyma3, in which the SLI loop contains eight nucleotides, most of the ISs possess an apical loop of seven nucleotides whose sequence is well conserved (5′- In all ISs except ISAfa_1 the junction between stem 1 and the paired region of loop 1 is an adenine. The size of SLII varies from 38 to 63 nucleotides. The ISs that possess a longer loop 2 generally have a longer SLII region. This is the case for all type 2 ISs except ISTesp1. In most ISs stem 3 at the base of SLII is composed of five to six paired bases (with extremes of 4 and 14 base pairs), which are mostly 5′-GC-3′ pairs. An argument in favor of the proposed ALIL interaction is the observed covariation for the three nonstrictly conserved bp of stem 2; a substitution in the apical loop of SLI is always accompanied by a complementary variation in the internal loop of SLII; covariation also appears to operate in the case of stem 3. In the three ISs with an 8-nt apical loop, the PK could possibly be extended to 7 base pairs.
Genetic Dissection of the Frameshift Region of IS3411 Provides Evidence for the Proposed ALIL-PK-One insertion sequence, IS3411, was selected to determine experimentally whether the conserved structural elements identified by comparative analysis were indeed related to control of expression by translational frameshifting (Fig. 1). This IS was used because it was originally found in E. coli and shown to transpose in this organism (33). A recent study on a nearly identical element, IS629, showed that frameshifting is required for transposition (11). Intriguingly, expression of IS629 would give two transframe products by frameshifting either on the predicted U-UUU motif (OrfAB′ 629 transposase protein) or on an A-AAA motif in loop 2 (OrfAB 629 modulator protein). The OrfAB 629 protein would interact with and stabilize the OrfAB′ 629 transposase, thus allowing a higher frequency of transposition. In the IS629 study, the authors did not use the full-length frameshift region; the distal part of SLII starting three nucleotides 3′ of stem 2 (nt 428-454) was omitted. This very likely affected quantitatively but probably not qualitatively the results from the expression analysis.
FIGURE 2. Genetic analysis of the IS3411 frameshift region. A, set of 30 mutants generated into the pOFX302 reporter plasmid ("Experimental Procedures" and B). The boxed nucleotides in the sequence were replaced by those indicated on the side. The PRF-1 frequency of the wild type (denoted mutant m0, which contains two mutations abolishing expression from the two consecutive AUG codons in frame −1) is 0.66%. Its activity was set as 1 and used to calculate the relative PRF-1 frequency of the others. B, protein expression profile from a selected set of the mutants from A. Isopropyl β-D-thiogalactopyranoside-induced cultures were pulse-labeled with [35S]methionine, and total proteins were analyzed as described under "Experimental Procedures." The size and position of the three expected proteins are indicated on the left. The vector plasmid (first lane) produces the G10 protein. The m1 mutant (second lane) synthesizes only a G10-LacZ fusion similar to the one generated by PRF-1 (FS species). The m4-5 clone (true wild type, third lane) allows synthesis of three plasmid-encoded proteins, the G10 and FS species plus the one resulting from initiation at the orfB AUG codons (IN). The fourth lane corresponds to the m0 mutant, which synthesizes the G10 and FS proteins. The m2 construct (fifth lane) has a mutation in the slippery sequence and consequently produces only the G10 species. Sixth and seventh lanes, respectively, show the pattern generated by similar plasmid constructions containing the frameshifting signals of dnaX (24) and IS911 (26). The apparent frameshifting frequency for the m0 mutant is about three times higher than that obtained by the LacZ assay because, for physiological reasons, the value used as 100% (amount of radioactivity in the FS band for the m1 in-frame mutant) is an underestimate.
To assess the role of the various elements potentially involved in IS3411 frameshifting (Fig. 1), mutations were introduced into the region from nt 336 to 454 cloned into a lacZ-based reporter plasmid (Fig. 2A). β-Galactosidase expression requires a −1 frameshift event or an initiation event in the −1 frame within the cloned region. The level of expression in the −1 frame was monitored by LacZ assay and by protein pulse-labeling for some mutants. As shown in Fig. 2B (third lane), the cloned wt fragment of IS3411 (m4-5) indeed allows synthesis of the protein species expected from a −1 frameshifting event (FS band); however, the amount is much lower than for two other PRF-1 regions: dnaX (24) and IS911 (26). A second protein is also synthesized (IN band); it results from an initiation event, in the −1 frame, on one of the two consecutive AUG codons overlapping the UGA stop in frame 0. Similar translational coupling was also demonstrated for IS629 (11) and IS3 (8). To make sure the LacZ activity reflects only frameshifting, we used as a basis the m0 mutant in which translational coupling is abolished (Fig. 2B, fourth lane). This mutant contains two other mutations in loop 2. Primarily intended for conservation of a potential stem-loop, these mutations have no significant effect on PRF-1 (see m5 versus m0), and no evidence was obtained for the potential structure within loop 2. According to folding programs, stems 1 and 3 could each be extended by an A-U pair at their base (supplemental Table S1). However, no evidence could be obtained by genetics and probing for their existence. Consequently, they were not taken into account; spacers 1 and 2 were considered to be 5 and 45 nucleotides long, respectively.
The cloned IS3411 segment leads to only 0.66% frameshifting (Fig. 2). Even if low, this level is significant and depends on the presence of the first PRF-1 motif, the U-UUU tetramer. Its replacement by the nonshifty C-UUU sequence leads to a 6-fold reduction in activity (m2 and m2-3 mutants). In contrast, mutation of the second motif, from A-AAA to C-AAA, has very little or no detectable impact on PRF-1 (mutant m3). Thus, in our conditions where a segment of IS3411 longer than the one tested for IS629 (11) was analyzed, the U-UUU motif appeared to be the only efficient frameshift motif. As expected, substitution of this motif by the highly shifty A-AAA-AAG sequence (mutant m6), found in the dnaX and IS911 signals, resulted in a strong stimulation (6-fold) of PRF-1. The evaluation of IS629 frameshifting using another lacZ-based reporter system gave a value of 4.5% despite the fact that most of SLII was missing (11). The discrepancy probably originates from an underestimation of the amount of LacZ from the in-frame plasmid in the IS629 study. We noted that full derepression of Plac-type promoters carried by multicopy plasmids often leads to abortive translation and aggregation of LacZ for highly expressed constructions.
The U-UUU motif is not the sole determinant of frameshifting frequency. First, its position relative to the SLI region was shown to be important; a 5-nt spacing is three to four times more efficient than a spacing of 2, 8, or 11 nt (supplemental Fig.  S1C). Then the implication of each of the four predicted stems was established by mutating one base pair (or two), first on one side or the other to disrupt Watson-Crick pairing and then on both sides to restore pairing (e.g. mutants m7, m8 and m9 in stem 1; Fig. 2). In all cases, frameshifting was reduced 2-3-fold for the disruption mutants and restored to nearly wt level for mutants reinstating pairing (in Fig. 2, mutants m7 to 9 for stem 1, m21 to 25 for stem 2, m27 to 30 for stem 3, and m16 to 18 for stem 4). In the case of stem 2, the m23 mutant harbors the perfectly matched sequence found in other ISs of the group (see supplemental Table S1). This modified version of stem 2 turned out to be a 3.2 times more efficient recoding stimulator. Starting with this mutant, we derived a series of ten stem 2 mutants with perfect Watson-Crick pairing in which only one base pair was changed (supplemental Fig. S1A). Consequently, PRF-1 frequency varies over a 7-fold range. This suggests that perfect pairing of 6 base pairs is not sufficient to ensure efficient PRF-1 stimulation but that the sequence itself plays a role. The size of loop 1 is also important; the addition of three nucleotides on one side or the other of the six paired nucleotides (mutants m14 and m15) or deletion of A356 (m13) reduces frameshifting severely. The effect of the identity of the unpaired nucleotide of loop 1 (nt 356) was determined; replacement by U or C is detrimental, but substitution by G is not (m10, m11, and m12). Increasing the length of stem 1 by 3 bp results in a notable reduction of PRF-1 (supplemental Fig. S1B, mutants m57 and  m59). The architecture of the SLII region is also constrained. 
The insertion of three nucleotides between stems 2 and 3 results in a 5-fold reduction (m26). The series of stem 4 modifications suggests that a stem-loop, even a short one, is absolutely required to obtain PRF-1 stimulation at the wt level (supplemental Fig. S1D). On the other hand, increasing the size of stem 3 by 3 bp augments frameshifting nearly 2-fold, provided stem 1 is not simultaneously altered (supplemental Fig. S1B, mutants m58 and m59).
To conclude, the genetic approach carried out on IS3411 strongly supports the four-stem structure deduced from alignment and secondary structure prediction of frameshift regions from the IS51 group. In addition, several spatial constraints were revealed, which suggest that the six nucleotides from loop 1 and their six pairing partners from loop 3 have to be properly positioned to interact and form a structure capable of interfering with the ribosome.
Formation of an ALIL-PK Revealed by Chemical Probing of the IS3411 Frameshift Region-We investigated the conformation of an RNA molecule, mRNA-0, containing the entire frameshift region of IS3411. It contains an IS segment from nt 336 to 454, plus a few nucleotides from the pOFX302 vector on both sides. The nucleotides were renumbered; the +1 position corresponds to nt 343 of IS3411 (Fig. 3). The analysis was carried out with DMS, which reacts with the N1 position of adenines and the N3 position of cytosines, with kethoxal, which reacts with the N1 and N2 positions of guanines, and with CMCT, which modifies the N1 position of guanines and the N3 positions of uridines, when those positions are not engaged in hydrogen bonds. Probing data suggest that nucleotides 9-25 of SLI fold into a short stem-loop because G9, G10, G12, and G13 and C11, C21, C22, C24, and C25 were not reactive, respectively, to kethoxal and DMS. The stem nucleotides are also insensitive to lead cleavage of the sugar-phosphate backbone, indicating structural stability of this region (Fig. 4). In the apical loop 1, the N1 position of A20, the N1 and N2 positions of G19, and the N3 position of U16 and U17 are weakly accessible respectively to DMS, kethoxal, and CMCT, whereas there is a strong protection of A14, G15, and C18 (Fig. 3). The weak sensitivity to lead cleavage of loop 1 nucleotides confirms the weak flexibility of this region suggested by the probing data (Fig. 4). Chemical reactivities of bases, as well as lead cleavage, demonstrate that the loop 2 region (nucleotides 26-70) is not structured. A majority of its nucleotides are accessible to DMS, kethoxal, and CMCT (Fig. 3) and are sensitive to lead cleavage, indicating a strong structural flexibility (Fig. 4). Our data reveal a second structured region constituted by two stems separated by a six-nucleotide bulge corresponding to the predicted SLII structure.
The lower stem, stem 3 (C71 to G75 paired with C106 to G110), is stable because none of its guanine and cytosine bases are reactive to kethoxal and DMS, respectively. A70, which displays reactivity toward DMS, is probably not paired with U111 in these conditions. The upper stem, stem 4, is also stable because most of its guanine, adenine, and uracil bases are not accessible to kethoxal, DMS, and CMCT, respectively. However, there is a weak reactivity of the C89 base. Nucleotides from both stems 3 and 4 are not sensitive to this reagent, whereas a strong lead-induced cleavage is observed in the RNA backbone of the four nucleotides of loop 4. In clear contrast also, there are only three rather weak lead cleavage sites in loop 3 (Fig. 4). All of the above probing experiments gave identical results when carried out in the absence of magnesium (data not shown). The weak level of accessibility of nucleotides from both loop 1 and loop 3 strongly suggests that there is base pairing between these two loops even in the absence of magnesium.
... and 4). B, determination of the apparent Kd for the wild type (wt) ALIL interaction (m0 mutant). Radiolabeled SLI m0 RNA was incubated with an increasing amount of nonlabeled SLII m0 RNA (2-fold increase per step). C, determination of the apparent Kd for an optimized ALIL interaction (m23 mutant). The experiment was carried out as described in B. D, disruption of the ALIL complementarity prevents complex formation (mutant m25). Radiolabeled SLI m0 RNA was incubated with an increasing amount of nonlabeled SLII m25 RNA.
To verify this, we have investigated the conformation of two transcripts: mRNA-I and mRNA-II corresponding respectively to nucleotides −15 to 68 (SLI region) and nucleotides 55-152 (SLII region). As shown in supplemental Fig. S2, probing of mRNA-I suggests that nucleotides 9-25, as in the longer transcript, fold into a short stem-loop, because only C25 and G23 are weakly accessible respectively to DMS and kethoxal. A20, G19, U16, and U17 are strongly sensitive, respectively, to DMS, kethoxal, and CMCT, whereas C18, G15, and A14 are more protected from the probes. The reactivity of the loop 1 nucleotides is therefore increased in the shorter transcript. The probing of mRNA-II reveals a structure more unstable than in mRNA-0. The reactivity of stem 3 guanines (G72 to G75) to kethoxal is very strong (supplemental Fig. S2), whereas in mRNA-0 these bases are protected (Fig. 4). The global destabilization of the SLII structure is confirmed by the increased reactivity of the residues of stem 4 (G91, G88, U87, A86, U85, A100, G98, and U97). In loop 4, the N1 positions of two adenine bases (A94 and A95) as well as the N1 and N2 positions of G93 are very reactive, but the N3 position of C92 is more protected against DMS. A80, C77, and C81 in the loop 3 bulge are accessible to DMS as G78, G79, G83, U76, U103, and U104 are to kethoxal and CMCT.
Taken together, these results are consistent with the formation in solution of the ALIL-PK by base pairing between the apical loop 1 and the internal loop 3. An interesting aspect is that in the absence of SLI, we observe a global destabilization of the SLII structure, suggesting that formation of stem 2, by the loop 1-loop 3 interaction, has an overall stabilizing effect.
Formation of an SLI-SLII Complex in Vitro and in Vivo-The length of loop 2 and the fact that it contains no conserved structure suggested that it was dispensable for SLI-SLII interaction. We therefore anticipated the formation in vitro of a stable SLI-SLII complex by mixing short RNA molecules containing one or the other region. As shown in Fig. 6, this is indeed the case; when 33P-labeled SLI RNA was incubated with nonlabeled SLII RNA (or the other way round) and separated by electrophoresis on a nondenaturing polyacrylamide gel, a retarded band was observed provided Mg2+ was present (Fig. 6A). A quantitative analysis was carried out using mutants in the four stems of the IS3411 structure. Fig. 5 (B-D) illustrates the case of stem 2. Having the wt sequence on both sides led to formation of a complex at 22°C with an apparent Kd of 3035 nM, whereas the m23 sequence increased the SLI-SLII affinity about 60-fold (apparent Kd of 48 nM). When stem 2 was destabilized (wt sequence in SLI and m25 sequence in SLII), no complex was observed within the range of SLII concentrations used. Similar results were obtained for the three other stems. From this, we conclude that SLI-SLII trans complex formation relies on the same structural elements as frameshift stimulation. In contrast with the probing experiments, detection of complexes by electrophoretic mobility shift assay necessitates the constant presence of magnesium. This suggests that although complexes can be formed in an Mg2+-independent manner in solution, this cation is essential for their stability during electrophoresis; similar observations were obtained in the case of other kissing loop complexes (41).
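An apparent Kd can be extracted from a titration such as the one described above by fitting the fraction of labeled RNA found in the retarded band. The sketch below is an illustration only: the estimation method actually used is the one of Ref. 31 and is not detailed here. It assumes a simple 1:1 binding isotherm with the labeled partner at trace concentration, and the function names, the grid-search fit, and the titration series are all hypothetical.

```python
def fraction_bound(conc_nM, kd_nM):
    """1:1 binding isotherm, assuming the labeled RNA is at trace concentration."""
    return conc_nM / (kd_nM + conc_nM)


def fit_kd(concs_nM, fractions, lo=1.0, hi=1e5, steps=2000):
    """Least-squares estimate of Kd on a logarithmic grid between lo and hi (nM)."""
    best_kd, best_sse = None, float("inf")
    for i in range(steps + 1):
        kd = lo * (hi / lo) ** (i / steps)
        sse = sum((fraction_bound(c, kd) - f) ** 2
                  for c, f in zip(concs_nM, fractions))
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


if __name__ == "__main__":
    # Synthetic 2-fold titration generated from a known Kd, for illustration only.
    concs = [5.0 * 2 ** i for i in range(10)]
    fractions = [fraction_bound(c, 48.0) for c in concs]
    print(f"fitted Kd: {fit_kd(concs, fractions):.1f} nM")
    # Ratio of the two apparent Kd values reported in the text.
    print(f"wt/m23 Kd ratio: {3035 / 48:.1f}")
```

The ratio of the two reported values, 3035 nM and 48 nM, is about 63, in line with the roughly 60-fold gain in affinity stated for the m23 sequence.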
The high affinity of SLI and SLII displayed in vitro by the m23 mutant prompted us to test whether SLII m23 was also capable of acting in trans on SLI m23 when provided in vivo. The SLI m23 region was cloned into the pOFX302 reporter plasmid, and SLII m23 was inserted into a compatible plasmid under control of a PLtetO-1 promoter; the noninteracting SLII m25 mutant was similarly cloned to assess the effect of stem 2 destabilization. Frameshifting frequency was measured in the absence or presence of an inducer of PLtetO-1 (Fig. 6). It appeared that frameshifting was stimulated nearly three times when transcription of SLII m23 was derepressed, whereas it remained unaffected when SLII m25 was expressed. Thus, SLII m23 RNA can act in trans on its normal target, the SLI m23 RNA embedded in a 4-kb-long mRNA. However, to obtain this effect, the SLII RNA had to be inserted into the anticodon loop of a tRNA, presumably to provide a stabilizing environment.
Trans-activity of the SLII RNA in vivo. A, cloning of the SLI and SLII RNAs into two compatible plasmids. The SLI m23 region was cloned into the pOFX302 reporter plasmid (left), and the SLII m23 RNA was inserted into a compatible plasmid under control of an inducible PLtetO-1 promoter (right). The noninteracting SLII m25 mutant was similarly cloned to test the effect of the disruption of the ALIL interaction. The two types of SLII RNA were themselves inserted into the anticodon loop of an RNA identical to the E. coli lysine-tRNA. Note that short sequences not shown on the figure are also present at the 5′ and 3′ ends of the tRNA-like sequence: GCAGGAAUUCUGCAGAUAC at the 5′ end and CUUGCUAUCCUUAGCGAAAGCUAAGGAUUUUUUUU at the 3′ end (the latter corresponds to the rrnB terminator region, where transcription stops somewhere within the run of 8 U). B, frameshifting frequency measured in the absence or presence of an inducer of the PLtetO-1 promoter.

Destabilization of Stem 2 and Stem 3 Decreases the in Vivo Transposition Activity of IS3411-PRF-1 was shown to be required for synthesis of the OrfAB transposase in a few members of the IS3 family (8,11,30). To determine whether this is also the case for IS3411, we introduced mutations that impair frameshifting without changing the sequence of the OrfA and OrfB proteins; mutation U342C renders the U-UUU motif nonshifty (m2 in Fig. 2), mutation G416A destabilizes stem 3 (m27 in Fig. 2), and mutation C419A destabilizes stem 2 (m24 in Fig. 2). These mutations were transferred into the IS3411 copy carried by plasmid pOFX515 (Fig. 7A). In this plasmid, the right end of the IS (irr) is followed, at a distance of 3 base pairs, by a copy of the left end of the IS (irl). This creates an irr-irl junction that is an efficient substrate for the OrfAB transposase (42), thus leading to integration of pOFX515 into the pOX38Kan target plasmid at high frequency (Fig. 7B). When frameshifting is prevented by mutating the shifty motif, there is a dramatic 2500-fold reduction in cointegrate formation; the residual level is the background frequency of recombination observed with an IS-less plasmid. If frameshifting is made less efficient by destabilizing the ALIL-PK with mutations in either stem 2 or stem 3, the reduction is still substantial (250-fold). These results suggest that frameshifting frequency and transposition level are directly correlated.
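In the mating-out assay, the transposition frequency is the number of recipient bacteria that received a transposition-generated pOX38::pOFX515 cointegrate (kanamycin- and ampicillin-resistant clones) divided by the number of recipients that received pOX38 or pOX38::pOFX515 (kanamycin-resistant clones). The sketch below only illustrates this ratio and the fold reduction derived from it. The function names and the colony counts are hypothetical, chosen so that the example reproduces a 2500-fold reduction; they are not data from this study.

```python
def transposition_frequency(kan_amp_resistant, kan_resistant):
    """Cointegrate-carrying recipients divided by all plasmid-carrying recipients.

    kan_amp_resistant: kanamycin- and ampicillin-resistant clones
                       (recipients of pOX38::pOFX515).
    kan_resistant:     kanamycin-resistant clones
                       (recipients of pOX38 or pOX38::pOFX515).
    """
    if kan_resistant <= 0:
        raise ValueError("the number of kanamycin-resistant clones must be positive")
    return kan_amp_resistant / kan_resistant


def fold_reduction(freq_reference, freq_mutant):
    """How many times lower the mutant frequency is than the reference."""
    if freq_mutant <= 0:
        raise ValueError("the mutant frequency must be positive")
    return freq_reference / freq_mutant


# HYPOTHETICAL colony counts, for illustration only (not from the paper).
freq_wt = transposition_frequency(kan_amp_resistant=500, kan_resistant=1_000_000)
freq_m2 = transposition_frequency(kan_amp_resistant=2, kan_resistant=10_000_000)

print(f"wt frequency: {freq_wt:.1e}")
print(f"m2 frequency: {freq_m2:.1e}")
print(f"fold reduction: {fold_reduction(freq_wt, freq_m2):.0f}")
```

Expressing both quantities as ratios normalizes for differences in conjugation efficiency between matings, so that only the fraction of transferred plasmids carrying a cointegrate is compared.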

DISCUSSION
The ALIL-PK, a Moderate but Essential Stimulator of Frameshifting and Transposition-The elaborate pseudoknot of IS3411, which also exists in 27 other ISs, acts on the ribosome to increase its frequency of slippage on the U-UUU sequence located 5 nt upstream (Fig. 1 and supplemental Table S1). The resulting OrfAB protein is absolutely required for transposition (Fig. 7) (11). Surprisingly, a structure as sophisticated as the IS3411 ALIL-PK yields a rather inefficient recoding signal. Only 0.55% of the ribosomes change frame on the U-UUU motif, after deduction of 0.11% corresponding to the background activity of mutants without frameshift motif (Fig. 2). However, such a low level of frameshifting does not prevent IS3411 from being active in transposition (Fig. 7) (11,33). Like other transposable elements, IS3411 evolved to produce the amount of transposase necessary to maintain transposition frequency at a level ensuring propagation while not generating too many mutations in the bacterial host. To achieve this goal an IS could either produce a relatively large amount of a low efficiency transposase or synthesize a small quantity of a more efficient one. In the IS3 family, IS150, with its PRF-1 frequency of 50%, is an example of the former strategy (25), whereas IS3411 constitutes an illustration of the latter. The intracellular concentration of the OrfAB IS3411 transposase is controlled at the translational and post-translational levels. A low amount of OrfAB synthesis results from the use of a moderately efficient slippery motif (U-UUU is 5.4 times less efficient than A-AAA-AAG; see m6 mutant in Fig. 2) combined with an intrinsically inefficient frameshift stimulator. Even if stem 2 is improved, as illustrated by the m23 mutant, the stem 2 of which is identical with that of many other members of the IS51 group (supplemental Table S1), the efficiency of frameshifting remains moderate (Fig. 2). The intricate design of the ALIL-PK is perhaps what limits its capacity to interfere with ribosome progression (because of a loose direct interaction and/or because its formation in vivo is impaired by ribosome traffic). If so, it would appear to be an unnecessarily convoluted solution because a regular short stem-loop of the same efficiency would have been simpler to establish. An interesting alternative is that the ALIL-PK evolved for a different purpose; whether or not it is to adjust transposase expression (hence transposition) to cell physiology remains to be investigated. In addition, once synthesized, the OrfAB protein is probably rather unstable as shown by the analysis of IS629 (11). The same study suggested synthesis, through frameshifting on a second slippery motif (A-AAA in loop 2; Fig. 1), of an alternate OrfAB protein differing by a few amino acids. It has no transposase activity but is capable of interacting with and stabilizing the transposase (11). Although we could not detect frameshifting on the second motif with our reporter system, there is probably a low level of production of the alternate OrfAB because the A-AAA sequence is intrinsically shift-prone at low levels. However, IS3411 and IS629 are unique from that point of view because none of the other members of the IS51 group displays such a second potential shifty motif.

Fig. 7 (legend excerpt). The transposition frequency was calculated by dividing the number of recipient bacteria that received a transposition-generated pOX38::pOFX515 plasmid (kanamycin- and ampicillin-resistant clones) by the number of bacteria that received pOX38 or pOX38::pOFX515 (kanamycin-resistant clones). wt, wild type.
The ALIL-PK, a New Variation on the RNA Loop-Loop Interaction Theme-Many studies have revealed that functionally important interactions between two RNA sequences rely on loop-loop pairing. Kissing-loop interactions are, for example, implicated in the dimerization of retroviral RNA genomes required for encapsidation and in various control processes involving small antisense RNAs in bacteria (43). In most of these examples, the interaction occurs by Watson-Crick pairing of a few bases from apical loops carried by two separate RNA molecules. Its function is to promote the rapid formation of an initial intermolecular complex, which generally evolves toward a more stable form by formation of an extended duplex or through the action of proteins. In the case of the HIV-1 dimerization initiation site, a 6-bp interaction is necessary and sufficient to allow formation of a stable initial complex, but not all 6-nt sequences work equally well (44,45). A few unpaired nucleotides (three or one) must be present, and again not any nucleotide will work. In IS3411 and the related ISs, there is also a clear preference for a 6-bp loop-loop interaction, with very little variation in the sequence of the interacting bases; our data also indicate that modification of a single base pair can significantly affect frameshift stimulation and therefore complex formation and stability (supplemental Fig. S2). This similarity may be related to the necessity of obtaining in both cases a rapid interaction by the proper presentation of two short particular sequences. One unpaired base is also present in loop 1 of IS3411, but none exists in the other loop (loop 3), which is instead closed by a stem on its 3′ side (stem 4). This difference in design of the second partner is probably due to a difference in the fate of the dimerization initiation site and ALIL initial kissing complexes. Whereas the former is stabilized by extending the number of paired nucleotides, such an option does not exist for the latter. IS3411 uses the three helices flanking the kissing loops to generate and stabilize a relatively rigid ALIL-PK, as suggested by the modeling analysis (Fig. 8 and supplemental Fig. S3 and below).
RNA aptamers (30-40 nt in size) binding to one or the other of three RNA hairpins present at the 5′- and 3′-untranslated regions of the hepatitis C virus mRNA were selected (31,46). Interestingly, with the three targets, the aptamers with the strongest binding turned out to be stem-loops quite similar to the SLII region of IS3411. In agreement with the IS3411 structure, the hepatitis C virus ALIL complexes have only one unpaired nucleotide on the 5′ side of the apical loop. At variance with the IS, the internal loop is on the 3′ side and contains one or three unpaired nucleotides. In two cases, unpaired nucleotides (one or three) are also present on the 5′ side of the aptamer stem, across the internal loop. The exact significance of these differences remains to be determined. One possibility is that the aptamers were selected from random sequences, whereas the IS structure had to emerge (by slow reciprocal adjustment of the two partners through natural evolution) in a more constrained background because it is embedded within a coding sequence (part of which even codes in two frames). The end result is the formation of complexes of similar stability with apparent K d in the nanomolar range (at least for an optimized stem 2, as in the m23 mutant; see Fig. 5).
The biological importance of ALIL RNA complexes is substantiated by several examples such as the Drosophila bicoid mRNA dimerization signal (47,48), the bacteriophage phi29 pRNA multimerization (49), and the loop-loop interaction in the 5′-untranslated region of the avian leukosis virus (50). However, the case most akin to the IS3411 and the in vitro selected ALIL complexes is the frameshift stimulator of the Barley yellow dwarf virus (32). It consists of a 6-nt internal loop, present on the 3′ side of a stem-loop situated 6 nt downstream of a G-GGU-UUU shift site, that interacts with an apical loop located 4 kb downstream in the 3′-untranslated region. Thus, there is an inversion in the positions of the SLI and SLII equivalents, and as in the in vitro selected ALIL complexes (31), the internal loop is on the 3′ side rather than on the 5′ side. Even if the size, design, and distance of the two interacting stem-loops differ from their IS3411 counterparts, some features are nevertheless the same. The Barley yellow dwarf virus loop 1 homolog is also 7 nt long, the unpaired base is also an adenine on the 5′ side, and the long range interaction between the two loops likewise involves 6 base pairs. In addition, the internal loop does not contain unpaired bases and is flanked by two stems, which are longer versions of stems 3 and 4 of IS3411. However, the final structure is presented to the ribosome in a different manner because the stem 3 homolog is on the 5′ side of the mRNA, and the stem 1 homolog is on the 3′ side. This leads to a 10-fold stimulation of PRF-1 frequency (compared with a mutant where the loop-loop interaction is disrupted), which attains 1.1%; this moderate level of recoding is similar to the level observed with the IS3411 signal. From these examples it appears that ALIL complexes exist in different forms; however, the structure of most of them remains to be experimentally determined.
Molecular Model of the IS3411 ALIL-PK-Modeling of the IS3411 ALIL complex led to a rather compact structure where stems 1, 2, and 4 are coaxially stacked (forming a nearly continuous three-helix stem of 20 base pairs), with a moderate bend between each helix, whereas stem 3 abuts against stem 2 at an angle of about 62° (Fig. 8). There is little room for variation at the level of stems 1 and 2, because they have to be joined on one side by a single adenine nucleotide (A14), whose base probably sits within the major groove of stem 2. This part of the IS3411 modeled structure is very similar to the experimentally determined structure of a mutant of the SRV-1 pseudoknot (51). Formation of the IS3411 PK clearly requires the presence of stems 3 and 4 (Fig. 2). Interestingly, once formed, it exerts a stabilizing effect on both of them (Figs. 3 and 4 and supplemental Fig. S2), perhaps because it restricts their mobility. In model Md1, base stacking between stem 4 and stem 2 has been maximized. Supplemental Fig. S3 shows its superposition with two other structures in which stacking of stems 2 and 4 has been minimized. The tip of stem 4 can be displaced by 2.6 nm at most, the angle between stem 2 and stem 4 increases from 14 to 50°, and the 5′ end of stem 3 moves to one side or the other by about 1.5 nm. A feature of the structure, and perhaps what makes it able to impede ribosome progression, may therefore be its relatively high rigidity. As shown in Fig. 8B, it is conferred by an interlinked network of 4 base pairs acting as restraining struts, namely the ones that close stem 2 (A-U top and G-C bottom), stem 3 (G-C top), and stem 4 (C-G bottom).
To conclude, it appears that the insertion sequences of the IS51 group evolved a quite remarkable RNA structure ensuring a strong interaction of two separate stem-loops present within the coding part of their mRNA. This lends further support to the notion that apical loop-internal loop complexes constitute a distinct mode of intra- and intermolecular RNA interaction and as such are probably widely used for control and structural purposes.