How bacterial ribosomal protein L20 assembles with 23 S ribosomal RNA and its own messenger RNA.

In bacteria, the expression of ribosomal proteins is often feedback-regulated at the translational level by the binding of the protein to its own mRNA. This is the case for L20, which binds to two distinct sites of its mRNA that both resemble its binding site on 23 S rRNA. In the present work, we report an NMR analysis of the interaction between the C-terminal domain of L20 (L20C) and both its rRNA- and mRNA-binding sites. Changes in the NMR chemical shifts of the L20C backbone nuclei were used to show that the same set of residues are modified upon addition of either the rRNA or the mRNA fragments, suggesting a mimicry at the atomic level. In addition, small angle x-ray scattering experiments, performed with the rRNA fragment, demonstrated the formation of a complex made of two RNAs and two L20C molecules. A low resolution model of this complex was then calculated using (i) the rRNA/L20C structure in the 50 S context and (ii) NMR and small angle x-ray scattering results. The formation of this complex is interesting in the context of gene regulation because it suggests that translational repression could be performed by a complex of two proteins, each interacting with the two distinct L20-binding sites within the operator.



Introduction.
Ribosome assembly is a complex process in which proteins must assemble onto the ribosomal RNA in an ordered fashion. This process has been extensively analysed for bacterial ribosomal subunits, using in vitro reconstitution studies. In the case of the large (50S) subunit of Escherichia coli, it has been shown that nine "core proteins" (L1, L3, L4, L9, L10, L11, L20, L23 and L24) bind directly to the 23S rRNA, independently of other proteins (1,2). The other 50S ribosomal proteins (r-proteins) depend upon those primary binders for attachment to the large ribosomal subunit.
Interestingly, several of these core proteins, namely L1, L4, L10 and L20, are also involved in the feedback regulation that allows the coordinated expression of the various ribosomal components (3-8). A similar situation is observed for the small ribosomal subunit, where regulatory r-proteins are also primary 16S rRNA binders.
Most bacterial ribosomal protein genes are clustered in polycistronic operons, the expression of which is controlled at the translational level (9-11). One cistron encodes a regulatory ribosomal protein that binds to the messenger RNA, thereby shutting off the translation of the downstream r-protein cistrons. In the classical Nomura model, it is assumed that the protein binding site on the mRNA, the translational operator, is structurally similar to its binding site on the rRNA. Molecular mimicry between the two target RNA sites would then be used to adjust the translation of r-proteins to the level of transcription of the rRNA: in the presence of excess unbound rRNA, the repressor r-protein would be displaced from its mRNA, thereby allowing translation to proceed. So far, in E. coli, the molecular mimicry between the translational operator on the mRNA and the rRNA has been demonstrated in the case of L1, through mutagenesis of either target site (12), and of S8, where direct binding to both RNAs has additionally been shown (13).

L20-RNA interactions
More recently, detailed studies were performed on the feedback control mechanism of S15 from various bacterial species. In E. coli, the S15 translational operator is a pseudoknot structure (14), whereas in T. thermophilus and B. stearothermophilus it is a three-helix junction, as in the binding site for S15 in the central domain of the 16S rRNA (15,16). However, in all three cases, there appears to be a common recognition pattern: a bipartite recognition motif with a conserved G-U/G-C in one of the two subsites (15-17). This motif is also involved in S15 binding to the 16S rRNA, as seen in the crystal structures (18,19), thereby confirming the molecular mimicry model.
In E. coli, the case of L20, one of the four 50S repressor proteins, is quite puzzling because of the unique nature of its translational operator organisation. The L20 gene is part of the IF3 operon, which comprises infC, rpmI and rplT, encoding translation initiation factor IF3, and ribosomal proteins L35 and L20, respectively. L20 represses the translation of rpmI (8), which in turn prevents the translation of its own gene through translational coupling (20). Regions of the rpmI leader sequence important for translational control have been identified by a combination of genetic, mutational and footprinting experiments. They are composed of two independent structures: a pseudoknot structure which involves long-range RNA-RNA interactions and overlaps the rpmI ribosome binding site and initiation codon (site 1) (21), and an imperfect stem structure located in the infC-rpmI intergenic region (site 2) (22).
Interestingly, both sites are apparently independently bound by L20, and both are required for repression of expression of L35 and L20 (22). This peculiar "dual-site" organisation of the operator raises a number of questions: (i) How many L20 molecules are required to bind to each site? (ii) Why is simultaneous binding required for repression? (iii) What is the mechanism allowing the crosstalk between the two sites?

The 3D structures of the free and ribosome-bound forms of L20 have recently been reported (23,24). They reveal a segmented organisation, with a globular C-terminal domain (L20C) which interacts with helices 40 and 41 of the 23S rRNA (25), and a highly cationic N-terminal domain, which is disordered in the free state and folds as a long helical shaft within the ribosome, penetrating deeply into the 50S subunit (24). Interestingly, it has been shown that, by itself, L20C from E. coli is able to act as a translational repressor in vivo (23). Therefore, the N-terminal domain of L20 appears to be dispensable for the control of rpmI and rplT expression.
The present work reports the study of the interaction of L20C with either its target site on the 23S rRNA or one of its two binding sites within the translational operator of rpmI, using heteronuclear NMR and small-angle X-ray scattering (SAXS). Both selected RNA fragments interact specifically with L20C and the two complexes involve similar residues within the protein, indicating a common binding site. The observed complexes are, however, significantly larger than expected, and the SAXS data suggest that they are dimers of complexes, i.e. 2 proteins + 2 RNAs. This indicates that L20C has the intrinsic capacity to dimerise RNA and sheds new light on the translational control mechanism. In the case of the rRNA/protein complex, a low-resolution structural model of this dimer could be constructed from both the NMR and SAXS data, yielding some structural indications on how L20 could interact with both sites on its messenger RNA.

Protein expression and purification
The plasmid pET42aL20aaΔN was constructed as follows: pBL20aaΔN (M. Guillier et al., manuscript in preparation) was digested by BsrGI-XbaI and the DNA fragment carrying L20C was transferred into pET42a (Novagen). The resulting plasmid was transformed into BL21 (DE3) CodonPlus (Stratagene).
For expression and labelling of L20C, BL21 (DE3) CodonPlus/pET42aL20aaΔN was grown at 37°C in Martek-9 N medium (Spectra Stable Isotopes) supplemented with Celtone-N (2 g/l) and the appropriate antibiotics. When the A600 turbidity reached 0.5, L20C expression was induced by addition of 0.5 mM IPTG and protein production was allowed to proceed for 4 hours. Typical cultures (0.5 litre) yielded approximately 2 g of cells (wet weight). Cells were harvested by centrifugation, suspended in 10 ml of 200 mM NaCl, 50 mM Tris-HCl pH 7.5, 1 mM EDTA, 1 mM PMSF, and disrupted by sonication. Cell debris was spun down and the soluble extract was concentrated by ammonium sulphate precipitation (70% saturation). The sample was first applied to a gel filtration column (Superdex75 prep-grade 2.6 x 60 cm, Amersham) equilibrated in 20 mM Tris-HCl, 500 mM NaCl, 0.5 mM EDTA, pH 7.5, at room temperature.
The presence of L20C within the collected fractions (4 ml each) was monitored by analysing aliquots on SDS-PAGE. The pooled fractions (24 ml) were then diluted 2.5 times with H2O to lower the ionic strength. An ion-exchange chromatography step was then performed on an SP Sepharose column (Hiload 2.6 x 10 cm, Amersham) equilibrated in 20 mM Tris-HCl, 200 mM NaCl, pH 7.5. L20C was eluted using a 0.2-1.5 M linear NaCl gradient in the same buffer (300 ml). The purified protein was then dialysed against 20 mM potassium phosphate, 100 mM NaCl, pH 6.6, and concentrated with a Centricon YM-3 (Millipore). The overall yield was 8 mg of purified L20C per litre of culture.
RNA fragments corresponding to the rRNA and oRNA target sites (Fig. 1) were purchased from Dharmacon Research, deprotected as indicated by the supplier and dialysed extensively.
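The 2.5-fold dilution quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the helper name `dilute` is hypothetical, and it assumes simple dilution with water, so that every solute of the gel filtration buffer is reduced by the same factor. Under that assumption the NaCl concentration matches the 200 mM used to equilibrate the SP Sepharose column.

```python
# Hypothetical helper (not from the paper): solute concentrations after
# an n-fold dilution with water.
def dilute(concentrations_mM, factor):
    """Return each solute concentration divided by the dilution factor."""
    return {name: conc / factor for name, conc in concentrations_mM.items()}

# Gel filtration buffer quoted in the text (pH is not tracked here).
gel_filtration_buffer_mM = {"Tris-HCl": 20.0, "NaCl": 500.0, "EDTA": 0.5}
pooled_volume_ml = 24.0   # pooled fractions, from the text
dilution_factor = 2.5     # from the text

diluted_mM = dilute(gel_filtration_buffer_mM, dilution_factor)
diluted_volume_ml = pooled_volume_ml * dilution_factor

print(diluted_mM)         # NaCl goes from 500 mM to 200 mM
print(diluted_volume_ml)  # volume after dilution, in ml
```

Note that the other buffer components are diluted as well (Tris-HCl to 8 mM in this simple model), which is why the sketch tracks all solutes rather than NaCl alone.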

NMR titration
The NMR experiments were carried out on a Bruker DRX600 Avance spectrometer. NMR titrations were performed using a 420 µl sample at a final L20C concentration of 0.6 mM in 20 mM potassium phosphate, 100 mM NaCl, 0.1 mM EDTA, pH 6.6. One molar equivalent of RNA was lyophilised and resuspended in 80 µl of the same buffer. Aliquots of this solution were added stepwise to the labelled L20C to perform a multi-point titration.
RNA/L20C molar ratios used were 0.5:1, 0.6:1, 0.8:1 and 1:1. The pH was checked after each step and remained within 0.1-0.2 pH unit of the starting value. NMR spectral changes were monitored by recording TROSY experiments (27) at 303 K. The assignment of the L20C residues in the complex was derived from a 3D-NOESY-TROSY experiment (28) with a mixing time of 80 ms, also at 303 K. NMR data were processed with the Gifa software (29).
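The titration scheme can be laid out as a small table of aliquots. The following is a minimal sketch, assuming that the 80 µl RNA stock contains exactly one molar equivalent, so that the cumulative volume added at each point is simply the target ratio times 80 µl. The function name `titration_plan` and the dictionary keys are illustrative only.

```python
RNA_STOCK_UL = 80.0        # one molar equivalent of RNA in 80 µl (from the text)
PROTEIN_SAMPLE_UL = 420.0  # starting L20C sample volume (from the text)
RATIOS = [0.5, 0.6, 0.8, 1.0]  # RNA/L20C molar ratios (from the text)

def titration_plan(ratios, rna_stock_ul=RNA_STOCK_UL, sample_ul=PROTEIN_SAMPLE_UL):
    """List the aliquot to add at each step to reach the requested molar ratios."""
    plan = []
    already_added = 0.0
    for ratio in ratios:
        cumulative = ratio * rna_stock_ul  # total stock volume needed so far
        plan.append({
            "ratio": ratio,
            "aliquot_ul": cumulative - already_added,
            "cumulative_ul": cumulative,
            "total_volume_ul": sample_ul + cumulative,
        })
        already_added = cumulative
    return plan

for step in titration_plan(RATIOS):
    print(step)
```

With these numbers the sample grows from 420 µl to 500 µl at the 1:1 point, so the protein is diluted by about 16% over the course of the titration.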

Gel filtration and SAXS experiments
The complex was formed as described above, by stepwise addition of the RNA to the protein solution, and monitored by NMR. The complex recovered from the NMR tube (6.5 mg in 0.

The distance distribution function p(r) corresponds to the distribution of distances between any two volume elements within one particle. It was determined using the indirect transform method as implemented in the program GNOM (33). This function provides an alternative estimate of the radius of gyration, derived through the relationship:

$$R_g^2 = \frac{\int r^2\, p(r)\, dr}{2 \int p(r)\, dr}$$

Scattering intensities were computed from the atomic coordinates of L20C, rRNA and their complexes by using the program CRYSOL, which takes into account the hydration water by introducing a 3 Å thick border layer surrounding the molecule (34). The calculated scattering profile is fitted to the experimental pattern using only two adjustable parameters, the average displaced solvent volume per atomic group and the contrast of electron density of the border layer with respect to bulk solvent, Δρ_b = ρ_b - ρ_0, to minimize the discrepancy:

$$\chi^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left[ \frac{I_e(s_i) - I_c(s_i)}{\sigma(s_i)} \right]^2$$

where N is the number of experimental points, I_c(s_i) is the calculated intensity, and I_e(s_i) and σ(s_i) denote the experimental scattering curve and its standard deviation, respectively.
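Both quantities described above are easy to evaluate numerically. The sketch below is not GNOM or CRYSOL: it uses the standard relationship between p(r) and the radius of gyration, and it assumes a reduced chi-square with 1/(N-1) normalisation for the discrepancy, which is an assumption about the exact form used. All function names are illustrative. A homogeneous sphere, for which Rg = sqrt(3/5) R, serves as a sanity check.

```python
import math

def trapezoid(x, y):
    """Trapezoidal integration of samples y over (possibly uneven) abscissae x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))

def rg_from_pr(r, p):
    """Radius of gyration from p(r): Rg^2 = int r^2 p(r) dr / (2 int p(r) dr)."""
    second_moment = trapezoid(r, [ri * ri * pi for ri, pi in zip(r, p)])
    return math.sqrt(second_moment / (2.0 * trapezoid(r, p)))

def chi_squared(i_exp, i_calc, sigma):
    """Discrepancy between experimental and calculated curves.

    Assumption: 1/(N-1) normalisation and no extra scale factor.
    """
    n = len(i_exp)
    total = sum(((e - c) / s) ** 2 for e, c, s in zip(i_exp, i_calc, sigma))
    return total / (n - 1)

def sphere_pr(r, radius):
    """p(r) of a homogeneous sphere, up to a constant factor."""
    d_max = 2.0 * radius
    if r < 0.0 or r > d_max:
        return 0.0
    x = r / d_max
    return r * r * (1.0 - 1.5 * x + 0.5 * x ** 3)

# Sanity check on a sphere of radius 20 Å.
radius = 20.0
r_grid = [i * 0.1 for i in range(401)]           # 0 to 40 Å
p_grid = [sphere_pr(r, radius) for r in r_grid]
rg_sphere = rg_from_pr(r_grid, p_grid)
print(rg_sphere, math.sqrt(0.6) * radius)
```

The two printed values should agree closely, which confirms that the p(r) route gives the expected radius of gyration for a shape with a known answer.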

Molecular modelling
In the following text, "complex" refers to the L20C/rRNA fragment complex (1 protein + 1 RNA duplex), whereas "dimer" designates the dimer of complexes (2 proteins + 2 RNA duplexes). The structure of the complex between L20C and the rRNA fragment was modelled as follows: coordinates of L20C and the rRNA fragment were extracted from the D. radiodurans 50S ribosomal subunit structure (24) (the latest PDB entry, 1NKW, was used, as the original 1KC9 [...] with the first complex. Contact between the two complexes was detected by monitoring the van der Waals term of the XPLOR energy function. After these steps, a random dimer conformation was obtained in which the two contacting complexes were related by a two-fold symmetry axis.
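The geometric core of such a procedure, generating the symmetry mate of a complex about a two-fold axis, can be sketched in a few lines. This is an illustration only and not the XPLOR protocol used here: the way the axis is sampled (`random_axis`, with its `max_offset` parameter) is a hypothetical choice, and no contact detection or energy term is included.

```python
import math
import random

def normalise(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)

def two_fold_image(point, axis_point, axis_dir):
    """Rotate `point` by 180 degrees about the axis through `axis_point` along `axis_dir`."""
    u = normalise(axis_dir)
    d = tuple(p - a for p, a in zip(point, axis_point))
    along = sum(di * ui for di, ui in zip(d, u))
    # A half-turn keeps the component along the axis and flips the perpendicular one.
    return tuple(a + 2.0 * along * ui - di for a, ui, di in zip(axis_point, u, d))

def random_axis(rng, max_offset=30.0):
    """Draw a random two-fold axis (hypothetical sampling scheme)."""
    direction = normalise(tuple(rng.gauss(0.0, 1.0) for _ in range(3)))
    origin = tuple(rng.uniform(-max_offset, max_offset) for _ in range(3))
    return origin, direction

def build_dimer(complex_coords, axis_point, axis_dir):
    """Return the original coordinates and their two-fold symmetry mate."""
    mate = [two_fold_image(p, axis_point, axis_dir) for p in complex_coords]
    return list(complex_coords), mate

# Toy coordinates standing in for one L20C/rRNA complex.
rng = random.Random(0)
toy_complex = [(1.0, 0.0, 0.0), (2.0, 1.0, 0.5), (0.0, 3.0, -1.0)]
origin, direction = random_axis(rng)
first, second = build_dimer(toy_complex, origin, direction)
print(second)
```

Applying the operation twice returns the starting coordinates, which is the defining property of a two-fold axis and a convenient check on any implementation.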

For each random dimer thus generated, the following three geometrical parameters were computed: the radius of gyration, the minimum distance between L20C in the first complex and the RNA in the second complex (the bridging distance), and the minimum distance between the C-terminal helix of L20C (residues 102-118) in the first complex and the closest atom in the second complex, either from L20C or from the RNA (the C-terminus distance). [...] and R90, full-length L20aa numbering), which were broadened to baseline in the full-length protein. Therefore, the NMR structure and the complete sequence-specific NMR assignments of L20C provided a basis for structural studies of the interactions between L20C and its RNA substrates.

Target RNA sequences
Two RNA fragments were designed according to the L20C-binding sites, either on the 23S rRNA [...] The spectral changes observed in the rRNA/L20C and oRNA/L20C complexes appear strikingly similar. This is illustrated in Fig. 2, which shows a side-by-side view of the [15N,1H] TROSY spectra of free L20C (Fig. 2A) and L20C in complex with rRNA (Fig. 2C) or oRNA (Fig. 2D).
Very large changes are observed in the 15 [...] Altogether, these data support the conclusion that oRNA/L20C and rRNA/L20C form similar, stable and specific complexes, as opposed to the non-cognate, unspecific binding of tRNA to L20C.

Analysis of the complex formed with rRNA
Given the large chemical shift perturbation induced in the L20C spectrum upon rRNA binding (Fig. 2C), the direct comparative assignment of the spectrum would have been unreliable. We therefore [...] peaks. Within the assigned region, the structure of L20C appears largely unchanged: as shown in the strip plot running from Q102 to Q106 for either free or complexed L20C (Fig. 3), the patterns of NOE cross-peaks are very similar, both in terms of frequency and intensity. This indicates that not only the backbone but also the side chains of these amino acids are unaffected upon rRNA addition. It also confirms the local structure of this region, which remains folded into an α-helix (strong HN(i)-HN(i+1) and HN(i+3)-Hα(i) NOEs). Interestingly, these 30 amino acids belong to the C-terminal part of L20C, spanning the two C-terminal α-helices (α3 and α4, light blue, Fig. 4). This is in keeping with observations in the ribosome context, where the surface of interaction between L20C and the 23S rRNA corresponds to the first two helices and loops of the domain.
Accordingly, the unaffected C-terminal region of L20C, including α3 and α4, is located on the solvent-exposed side of the 50S subunit and is thus not expected to contact the RNA (Fig. 4).
In parallel, the structure of the rRNA within the complex was probed by monitoring the imino proton resonances engaged in base pairs. Both 1D and 2D NOESY experiments were performed on the complex and compared to those recorded with the free rRNA. These indicate that the structure of the rRNA is not globally modified upon L20C binding: the imino/imino, imino/aromatic and imino/amino cross-peaks are nearly identical in the free and the bound rRNA spectra. [...] The former value is significantly larger than a theoretical value of 17.8 Å calculated from the rRNA atomic coordinates extracted from the ribosome using the CRYSOL program, suggesting that a small but detectable fraction of RNA is associated in solution as small oligomers. This is confirmed by the appearance of the Guinier plot (see inset to Figure 5A), which exhibits a slight

Modelling the geometry of the L20C/rRNA dimer
Based on our NMR data, which indicate both that the rRNA fragment is correctly paired and that the surface of interaction of L20C with this fragment is the same as that in the ribosome context, it was reasonable to assume that L20C recognised the rRNA fragment as is observed in the full ribosome context. A complete atomic model of the L20C/rRNA complex was therefore constructed using the 50S crystal structure, as described under experimental procedures. This model was then used to construct a dimer of L20C/rRNA complexes that fitted all the available experimental data. Indeed, based on the results of both the NMR and the SAXS experiments, a number of simple assumptions could be made about the dimer geometry:

(i) Symmetry constraint: In the NMR spectra of the L20C/RNA complex, only a single set of peaks is observed for each residue in the protein. This strongly suggests that the dimer is symmetrical, as is usually the case for homodimers, and implies that there must be a two-fold axis relating the two complexes within the dimer.
(ii) Bridging constraint: Neither L20C nor the RNA dimerises by itself. This indicates that the dimer assembly must involve RNA/protein contacts. In other words, within the dimer, one L20C molecule will contact both RNAs (and vice versa).
(iii) Exposed C-terminus constraint: In the TROSY spectrum of the complex, residues within the C-terminal helix of L20C (residues 102-117) appear globally unaffected, in contrast with other parts of the protein, for which large chemical shift displacements are observed. This could be confirmed by tracing the C-terminal helix in the 3D NOESY-TROSY experiment (see above).
Thus, the C-terminal helix is unlikely to be in close contact with the other complex in the dimer, but, rather, remains exposed to the solvent.
(iv) Radius of gyration constraint: The radius of gyration of the dimer (25.7 Å) is known from the SAXS experiment.
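Constraints (ii) to (iv) translate into a simple numerical filter on each candidate dimer. The sketch below assumes the thresholds quoted in the selection step of this section (radius of gyration between 24.5 and 27.5 Å, bridging distance of 3 Å or less, C-terminus distance of at least 5 Å). The function names are illustrative, and the radius of gyration is computed here as an unweighted geometric value over atom positions, which is a simplification.

```python
import math

def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) positions."""
    n = len(coords)
    centroid = tuple(sum(c[k] for c in coords) / n for k in range(3))
    return math.sqrt(sum(math.dist(c, centroid) ** 2 for c in coords) / n)

def min_distance(atoms_a, atoms_b):
    """Smallest distance between any atom of set A and any atom of set B."""
    return min(math.dist(p, q) for p in atoms_a for q in atoms_b)

def passes_constraints(rg, bridging, c_terminus,
                       rg_range=(24.5, 27.5), max_bridging=3.0, min_c_terminus=5.0):
    """Apply the radius of gyration, bridging and exposed C-terminus criteria (Å)."""
    return (rg_range[0] <= rg <= rg_range[1]
            and bridging <= max_bridging
            and c_terminus >= min_c_terminus)

# Hypothetical parameter sets for three candidate dimers: (Rg, bridging, C-terminus).
candidates = [(25.7, 2.5, 8.0), (30.1, 2.0, 9.0), (25.0, 6.0, 7.5)]
accepted = [c for c in candidates if passes_constraints(*c)]
print(accepted)
```

In a full pipeline, `radius_of_gyration` and `min_distance` would be evaluated on the coordinates of each random dimer before calling `passes_constraints`; the symmetry constraint (i) is satisfied by construction when the second complex is generated as a two-fold mate of the first.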
Forming a dimer with two identical monomers related by a two-fold symmetry axis is a problem with 3 degrees of freedom. In order to systematically sample the conformational space, a Monte-Carlo approach was used as described under experimental procedures. The resulting dimers were then selected according to the above geometrical constraints. More specifically, we required that the calculated radius of gyration should fall between 24.5 and 27.5 Å, that the bridging distance (see experimental procedures) should be 3 Å or less and that the C-terminus distance (see experimental procedures) should be at least 5 Å. Overall, about 1300 random dimers were generated, 100 of which met the above geometrical criteria. These 100 dimers could be grouped [...]
SAXS data further indicate that the L20C/oRNA complex has a slightly larger radius of gyration and maximum diameter than the L20C/rRNA complex, suggesting that it adopts a more extended conformation, e.g. the V formed by the two RNA molecules could be more open than that of the L20C/rRNA dimer. All these results suggest that L20C possesses the intrinsic ability to dimerise RNAs that share some resemblance with its ribosomal RNA binding site. Furthermore, this dimerisation ability was observed in stringent conditions, i.e. in the absence of divalent cations (Mg2+), which can sometimes mediate unspecific aggregation of RNA at high concentrations. This is of great potential relevance in the context of translational control of the IF3 operon, for which two target sites for L20 have been identified within the operator region upstream of the rpmI translation start, both of which are similar to the rRNA binding site of L20C (22). Interestingly, these two sites are able to bind L20 independently, i.e. mutations that affect L20 binding at one site do not prevent binding to the other site. However, simultaneous binding at both sites appears to be required for the negative feedback control of rpmI (22). This led to the hypothesis that two L20 molecules could be required for the translational repression on the rpmI operator. Our observation that L20C is able to promote dimerisation of RNA fragments strongly supports this possibility and provides structural information on the geometry of the resulting complex. The model that was obtained for the dimer suggests that bridging contacts between the two L20C/RNA complexes involve residues within loop 2. Interestingly, loop 2 contains a stretch of residues that are strictly conserved across all bacterial L20 sequences: (D/N)RK. The two basic residues, R90 and K91, would be good candidates as "bridging" residues, as they could contact phosphate groups of the other RNA within the dimer (Figure 6).

Dimer formation was observed using high concentrations of L20C and RNA, which one might argue to be physiologically unrealistic. However, in the context of rpmI mRNA, the two L20 binding sites are covalently tethered to one another, thereby increasing their apparent local concentration. The two L20C binding sites within the operator are separated by approximately 18 nucleotides. Assuming a phosphate-to-phosphate distance of ~7 Å, this means that one site will be constrained to lie within a sphere of ca. 120 Å radius centred on the other site. This results in an apparent concentration of at least 200 µM, i.e. similar to that at which dimerisation was observed. This is thus fully consistent with the formation of a higher-order assembly of two L20 molecules on the rpmI mRNA, following the model schematically described in Figure 7, which accounts for all the currently available physiological and structural data. The translational repressor would thus be the molecular assembly of two L20C bound to the two sites on the messenger RNA, thereby explaining the simultaneous requirement for two distinct sites. This assembly would be necessary to efficiently hinder the binding of the ribosome to the rpmI initiation codon and/or to enhance the stability of the L20/mRNA complex.
Under physiological conditions where L20 concentration becomes limiting, naked 23S rRNA is predicted to displace L20 from its mRNA, thereby de-repressing translation of rpmI.
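The tethering estimate given above (18 nucleotides, ~7 Å per phosphate step, a sphere of ca. 120 Å radius, at least 200 µM) can be reproduced as follows. This is a back-of-the-envelope sketch that assumes one binding site is confined anywhere within a sphere centred on the other site, so that the apparent concentration is one molecule per sphere volume. The function name is illustrative.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole

def effective_concentration_molar(radius_angstrom):
    """Concentration of one molecule confined to a sphere of the given radius."""
    radius_dm = radius_angstrom * 1e-9  # 1 Å = 1e-10 m = 1e-9 dm
    volume_litre = (4.0 / 3.0) * math.pi * radius_dm ** 3  # 1 dm^3 = 1 litre
    return 1.0 / (AVOGADRO * volume_litre)

spacer_nucleotides = 18      # separation of the two sites (from the text)
phosphate_spacing = 7.0      # Å per nucleotide step (from the text)
max_separation = spacer_nucleotides * phosphate_spacing  # 126 Å, rounded to ca. 120 Å in the text

conc_120_uM = effective_concentration_molar(120.0) * 1e6
conc_126_uM = effective_concentration_molar(max_separation) * 1e6
print(f"{conc_120_uM:.0f} µM for 120 Å, {conc_126_uM:.0f} µM for {max_separation:.0f} Å")
```

Both radii give values close to 200 µM, in line with the figure quoted in the text. The estimate scales with the inverse cube of the radius, so it is fairly sensitive to the assumed linker length.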
Interestingly, under the above dimer repressor model (Fig. 7), one could also imagine a heterodimer species, in which one rRNA and one mRNA could assemble together with two L20 molecules. In this intermediate complex, in which the rRNA "invades" the dimer structure, the mRNA would become more accessible to ribosomes and thus allow translation to proceed. This could provide a unique and sensitive mechanism to precisely adjust L20 and L35 concentrations to that of the ribosomal RNA. This model predicts that L20 forms a heterodimer assembly with its two binding sites on the mRNA. A direct structural analysis of this complex will require the [...]

Figure legends

Fig. 1. rRNA: L20C binding site on E. coli 23S rRNA, as deduced from the structure of the D. radiodurans 50S ribosomal subunit (24); numbering is that of the E. coli 23S rRNA. oRNA: secondary structure of L20 binding site 2 on the rpmI translational operator (22). Outlined residues were added to increase the pairing stability.

Fig. 4. Shown is the schematic structure of the L20C/rRNA complex, extracted from the crystal structure of the 50S subunit (24). The chemical shifts of residues shown in cyan are either mildly or not affected by either rRNA or oRNA binding, whereas peaks corresponding to residues shown in red are either strongly affected or broadened down to baseline. The N- and C-termini of L20C are indicated.

Fig. 7. Left view: schematic drawing of the 2D structure of the 5' leader region of the rpmI-rplT operon, which has been determined previously (21). The stop codon of infC and the ribosome binding site and start codon of rpmI are boxed. The two L20 binding sites (22) are shown in green. In this representation, the two halves of the pseudoknot are not paired.