The ribosomal maturation factor P from Mycobacterium smegmatis facilitates the ribosomal biogenesis by binding to the small ribosomal protein S12

The ribosomal maturation factor P (RimP) is a highly conserved protein in bacteria and has been shown to be important in ribosomal assembly in Escherichia coli. Because of its central importance in bacterial metabolism, RimP represents a good potential target for drug design to combat human pathogens such as Mycobacterium tuberculosis. However, to date, the only RimP structure available is the NMR structure of the ortholog in another bacterial pathogen, Streptococcus pneumoniae. Here, we report a 2.2 Å resolution crystal structure of MSMEG_2624, the RimP ortholog in the close M. tuberculosis relative Mycobacterium smegmatis, and using in vitro binding assays, we show that MSMEG_2624 interacts with the small ribosomal protein S12, also known as RpsL. Further analyses revealed that the conserved residues in the linker region between the N- and C-terminal domains of MSMEG_2624 are essential for binding to RpsL. However, neither of the two domains alone was sufficient to form strong interactions with RpsL. More importantly, the linker region was essential for in vivo ribosomal biogenesis. Our study provides critical mechanistic insights into the role of RimP in ribosome biogenesis. We anticipate that the MSMEG_2624 crystal structure has the potential to be used for drug design to manage M. tuberculosis infections.

Ribosomes account for a large portion of cell mass, and their synthesis can be highly energy-consuming. It is estimated that in rapidly growing Escherichia coli, around 90% of the energy consumption is for protein synthesis, and a significant amount is used for generating ribosomes (1). The complete 70S bacterial ribosome is composed of the 30S small subunit and the 50S large subunit. In general, the 30S small subunit consists of the 16S rRNA and 21 ribosomal proteins (S1-S21); the 50S large subunit consists of the 23S and 5S rRNAs and 36 ribosomal proteins (L1-L36). Although active ribosomes can be reconstituted in vitro using individually purified ribosomal proteins and rRNAs, the process occurs at a much slower rate and requires harsher conditions (2). By contrast, in vivo ribosomal biogenesis is assisted by various ribosomal cofactors, including helicase, chaperones, maturation factors, and GTPase, and hence needs a lower activation energy and produces fewer intermediates (3).
The ribosomal maturation factor P (RimP), also known as yhbC, is a highly conserved ribosomal cofactor in both Gram-negative and Gram-positive bacteria (Fig. S1, A-C). An E. coli RimP null mutant grows more slowly than WT at high temperatures (4). In the food-borne pathogen Salmonella enteritidis, the RimP mutant shows a decreased growth rate and becomes more sensitive to both reactive oxygen and nitrogen intermediates but less virulent in vitro (5). In the Gram-positive pathogen Streptococcus pneumoniae, the null mutation is lethal (6). Overall, these phenotypic studies highlight the physiological importance of RimP in bacteria.
Several mechanistic studies have shown the functional association of RimP with ribosomal biogenesis. Wikström and coworkers (4) found that the RimP null mutant had reduced levels of polysomes and mature 70S ribosomes but increased amounts of free 30S and 50S subunits in E. coli. Moreover, less accumulation of 30S than 50S was observed in polysome profiling (4). In their study, RimP was found only in the 30S subunit fractions after sucrose gradient centrifugation, and primer extension studies demonstrated that RimP knockout bacteria had an increased level of pre-16S rRNA but a decreased level of mature 16S rRNA (4). Recently, quantitative MS studies on E. coli suggested that RimP can increase the binding kinetics of the S5 and S12 ribosomal proteins to the 5′ domain of rRNA in vitro (7). In addition, it was reported that the relative timing of the assembly of the 3′ domain and the formation of the central pseudoknot structure in the 16S rRNA depend on the presence of the assembly factor RimP (8). These in vivo and in vitro data indicate that RimP is essential for the biogenesis of the 30S small subunit in E. coli.
Despite extensive biochemical and biophysical study, the mechanism of RimP in ribosomal biogenesis is still not fully understood. The only available structure to date is the solution structure of SP14.3, an ortholog in S. pneumoniae, solved using NMR. It consists of a highly negatively charged N-terminal domain and a slightly positively charged C-terminal domain resembling the Sm fold (6). Because of the flexible linker between the two domains, their relative orientation is not defined. The structure, however, is limited in clarifying its mechanism in ribosomal biogenesis.
Mycobacterium tuberculosis is the causative agent of tuberculosis, which killed 1.7 million people in 2016 (9). Active M. tuberculosis infection leads to severe pulmonary and occasionally extrapulmonary symptoms, particularly in immunocompromised patients, such as those with HIV. Moreover, the emergence of multidrug-resistant strains confounds effective treatment, which typically involves prolonged use of multiple antibiotics. Therefore, the discovery of a druggable target in M. tuberculosis is of importance in future drug development.
Here, we describe the crystal structure of the RimP homolog MSMEG_2624 in Mycobacterium smegmatis, a model organism for studying M. tuberculosis. We demonstrate that the linker region of MSMEG_2624 plays an important physiological role in interacting with RpsL, the small ribosomal protein S12, thereby affecting ribosomal biogenesis. Because RimP is a highly conserved mycobacterial protein, understanding its mechanism will provide insight into ribosomal biogenesis and aid the development of new drugs that target M. tuberculosis.

The crystal structure of MSMEG_2624
MSMEG_2624 was purified as a monomer in solution (data not shown). The crystal structure of MSMEG_2624 was solved by single isomorphous replacement with anomalous scattering (SIRAS) phasing, and the final model was refined to 2.2 Å (Fig. 1A). Detailed statistics of data processing, phasing, structure refinement, and Ramachandran plot are shown in Table 1 and Fig. S2. Both the N-terminal (aa 1-93) and C-terminal (aa 94-181) domains of MSMEG_2624 are similar to those of SP14.3, with Cα root mean square deviations of 2.1 and 2.8 Å, respectively. The C-terminal domains of MSMEG_2624 and SP14.3 resemble the Sm fold, which is composed of six β sheets. The electron density of the loop between α2 and β3 is missing, which indicates a disordered structure (Fig. S3). The major difference between MSMEG_2624 and SP14.3, however, lies in the additional α-helix (α2′), connected by the long loop to the last β sheet β6′ (aa 158-165) on the C terminus of MSMEG_2624, which forms the handle of the barrel-like Sm fold (Fig. 1, B and C).
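The Cα root mean square deviation used above to compare the domains can be illustrated with a short calculation. This is a minimal sketch, assuming that the two structures have already been superposed and that their Cα atoms are paired one-to-one; the function name `ca_rmsd` and the toy coordinates are hypothetical and are not taken from MSMEG_2624 or SP14.3.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD (in Å) between two equal-length lists of (x, y, z) C-alpha positions.

    Assumes the structures are already superposed and the atoms are paired.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example (hypothetical coordinates): every atom is displaced by 2 Å along x.
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_b = [(2.0, 0.0, 0.0), (5.8, 0.0, 0.0), (9.6, 0.0, 0.0)]
print(round(ca_rmsd(model_a, model_b), 2))
```

In practice the superposition step (for example, a least-squares fit) precedes this calculation; it is omitted here to keep the sketch short.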
Despite the high resemblance between the crystal structure of MSMEG_2624 and the solution structure of SP14.3 at the individual domain level, they differ in the interdomain orientation. Whereas the solution structure of SP14.3 did not reveal a rigid orientation from the NOE data, our crystal structure of MSMEG_2624 shows strong electron density around the interdomain linker, indicating a well-defined clamp-like orientation in the crystal (Fig. 1D). In this orientation, the handle-like structure from the C-terminal domain interacts with the N-terminal domain through (1) a hydrophobic interaction between the side chains of Phe-161 and Val-50 and (2) a hydrogen bond network formed among Ser-162, Asp-35, and Val-36 (Fig. 1, E and F). These interactions are absent in the structure of SP14.3 and might be involved in the stabilization of the interdomain orientation, resulting in a more rigid structure. It is unclear whether the interdomain orientation is also present in solution, as we observed that both domains of MSMEG_2624 participated in crystal contacts with the neighboring molecule (Fig. S4), which is expected for a protein that consists of only two domains.

The interdomain linker of MSMEG_2624 is essential for the coordinated binding with RpsL
To study the function of each domain of MSMEG_2624 in ribosome biogenesis, we first tested their interaction with binding partners. In previous interaction network studies of E. coli and Helicobacter pylori, RimP was reported to interact with S5, S7, and S12 of the small ribosomal subunit (10). In M. smegmatis, we detected the interaction between MSMEG_2624 and RpsL (S12) by co-immunoprecipitation followed by tandem MS (Fig. S5) and further validated it using a tandem affinity purification assay with recombinant proteins (Fig. 2A). We found that neither the C- nor the N-terminal domain of MSMEG_2624 alone is sufficient to bind RpsL (Fig. 2, B and C), which suggested that the two domains of MSMEG_2624 cooperatively interact with RpsL. In addition, the first 25 residues of RpsL were dispensable for the binding to MSMEG_2624 (Fig. S6).
We hypothesized that the linker region was essential for the coordination between the two domains of MSMEG_2624 to bind RpsL (Fig. 3A). To test this hypothesis, we mutated residues in this region and tested their effects on binding with RpsL. The charged residues, Asp-93 and Arg-94, were either deleted or reversed/neutralized; residues with more (Pro-95) or less (Gly-91) restricted torsion angles were swapped (Fig. 3A and Fig. S7). As shown in Fig. 3B, both linker deletion mutants significantly reduced the binding efficiency to RpsL, suggesting the importance of the linker region. Substituting proline for glycine at position 91 (G91P) abolished the binding, whereas substituting glycine for proline at position 95 (P95G) did not. We speculate that this distinct effect might be explained by the G91P mutant disrupting the interdomain orientation by steric hindrance, whereas P95G merely increased the flexibility of the backbone. Alternatively, the residue Gly-91 might directly interact with RpsL. For mutations of the charged residues, R94D/A but not D93R/A greatly reduced the strength of the binding. In addition, we noticed that R94D showed a much stronger effect than R94A, suggesting that the positive charge of Arg-94 might directly participate in binding with RpsL through electrostatic interaction. To test whether the decreased binding efficiency of the mutants was due to protein denaturation or aggregation, we used size-exclusion chromatography to measure their molecular size. We found that all mutants had elution profiles similar to that of the WT, suggesting their proper folding (Fig. S8). The linker region is significantly enriched for highly conserved residues (p = 0.0005, Fisher exact test), suggesting its functional importance. Strikingly, we observed that all mutants that affected conserved residues showed a decrease in binding efficiency with RpsL, which suggested that these sites were under purifying selection. Taken together, we demonstrated that the linker of MSMEG_2624 was essential in coordinating the two domains to bind with RpsL.
Table 1. Crystal data processing, phasing and structure refinement statistics. In the two-column results, the left column gives the raw count and the right column gives the percentage.
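The enrichment test reported for the linker region can be illustrated as follows. This is a minimal sketch of a one-sided Fisher exact test on a 2 × 2 table (linker versus non-linker residues, highly conserved versus not); the residue counts below are hypothetical placeholders rather than the counts used in this study, and whether the reported p value was one- or two-sided is not stated, so the one-sided form is an assumption.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided (enrichment) Fisher exact test for the 2 x 2 table

        [[a, b],   a = linker, conserved        b = linker, not conserved
         [c, d]]   c = non-linker, conserved    d = non-linker, not conserved

    Returns P(X >= a) under the hypergeometric null distribution.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)
    numer = 0
    # Sum the probabilities of all tables at least as extreme as the observed one.
    for k in range(a, min(row1, col1) + 1):
        numer += comb(row1, k) * comb(row2, col1 - k)
    return numer / denom


# Hypothetical counts for illustration only: 6 linker residues, 5 of them conserved.
p_value = fisher_one_sided(5, 1, 40, 135)
print(f"one-sided p = {p_value:.4g}")
```

Using exact integer binomial coefficients avoids floating-point underflow for tables of this size; for routine work a statistics library would normally be used instead.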

The interdomain linker of MSMEG_2624 is essential for ribosomal biogenesis
Although previous literature suggested that RimP was involved in the maturation of the 30S ribosome by stabilizing the pseudoknot structure and facilitating the incorporation of late binder ribosomal proteins, a detailed mechanism is still lacking. Having shown the importance of the interdomain linker of MSMEG_2624 in binding RpsL, we analyzed the functions of RimP in ribosomal biogenesis in terms of its interaction with RpsL. Similarly to RimP in E. coli, MSMEG_2624 knockout in M. smegmatis showed a reduction in polysomes and 70S ribosomes and a concomitant increase in free 30S and 50S ribosomal subunits (Fig. 4A). The phenotype can be rescued by complementation of RimP in the knockout strain (Fig. 4A). In addition, we confirmed using MS that MSMEG_2624 was essential for the efficient recruitment of RpsL to the maturing 30S subunit (Fig. 4B). To specifically characterize the in vivo effects of the linker region of MSMEG_2624 on ribosomal biogenesis, we analyzed the ribosomal profiles of the MSMEG_2624 knockout strain of M. smegmatis complemented by two MSMEG_2624 mutants. We compared one that largely abolished the interaction with RpsL (ΔP90-D93) with another that did not (P95G). The knockout strain complemented by the ΔP90-D93 mutant showed a significantly lower level of 70S ribosome than that complemented by P95G, suggesting that the interdomain linker of MSMEG_2624 is essential for ribosome biogenesis via its interaction with RpsL (Fig. 4C). Taken together, we propose a model of ribosomal biogenesis in which the two domains of RimP cooperatively bind S12 through the linker region, thereby facilitating efficient maturation of the complete 70S ribosome.

Discussion
In this study, we have reported a high-resolution crystal structure of RimP homolog in M. smegmatis that revealed a well-defined interdomain orientation. Moreover, we have provided a mechanistic view of RimP in ribosomal biogenesis. Our binding assays demonstrate that the two domains of RimP cooperatively bind with the small ribosomal protein RpsL through its linker region. More importantly, this linker region is essential for ribosomal biogenesis. Sashital et al. (8) reported that in the absence of RimP, the central pseudoknot structure of rRNA was unstable, resulting in the accumulation of intermediates. These intermediate products were depleted of S5 and S12, which are both pseudoknot-interacting ribosomal proteins. We propose that the linker region of RimP forms a platform for recruiting S12 and facilitating the rRNA binding. Indeed, this hypothesis could explain the functional importance of the evolutionarily conserved residues in this linker region.
In the structures of MSMEG_2624 and SP14.3, the C-terminal domain has an Sm fold that is also observed in Hfq, an RNA chaperone that mediates the interaction between sRNAs and their mRNA targets. Hfq is a hexamer that efficiently binds RNA, whereas RimP remains a monomer in solution and does not appear to show RNA-binding activity. Indeed, the amino acid sequences of Hfq and RimP diverge significantly. This observation indicates that Hfq and the C-terminal domain of RimP are functionally divergent, even though they are structurally similar.
RimP is highly conserved in prokaryotes, and its functional importance in ribosomal biogenesis has been reported in multiple bacterial species. As no homologous protein has been identified in mammals, it has the potential to become a drug target for combating bacterial infections, such as tuberculosis. This makes our structural and biochemical study of particular importance for future drug design.

Bacterial strains and plasmids
The bacterial strains and vectors used in this work are summarized in Table S1.

Protein expression, purification, and crystallization
Recombinant His-tagged MSMEG_2624 was expressed in E. coli BL21 (DE3) cells and purified using nickel-nitrilotriacetic acid affinity chromatography. The His tag was cleaved, and the protein was further purified by size-exclusion chromatography (10 mM Tris-HCl, pH 7.4, 250 mM NaCl). MSMEG_2624 crystals were grown under 20% PEG 3350, 6% acetonitrile, 0.2 M sodium citrate (pH 8.0) by the hanging-drop vapor-diffusion method at 16°C. The native crystals were cryoprotected by quickly passing through the crystallization buffer containing 12% glycerol before flash freezing. Heavy atom soaking was performed by directly adding 40 mM K2PtCl4 water solution to crystals in the mother liquor twice with a 15-min interval between, to a final concentration of ~20 mM. The crystals were soaked in the dark overnight and backsoaked for 5 min in cryoprotectant buffer containing 8.8 mM K2PtCl4 and 12% glycerol before data collection.
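The dilution arithmetic behind the heavy atom soak can be checked with a short calculation. This is a minimal sketch, assuming a hypothetical 2-μl drop and two 1-μl additions of the 40 mM stock (the actual drop and addition volumes are not stated above) and ignoring evaporation; the function name `final_concentration` is illustrative.

```python
def final_concentration(stock_mM, drop_ul, additions_ul):
    """Concentration (mM) of the heavy atom compound after adding stock to a drop.

    Assumes the drop initially contains none of the compound and that
    volumes are additive (no evaporation).
    """
    added = sum(additions_ul)
    return stock_mM * added / (drop_ul + added)


# Hypothetical volumes: a 2-ul drop receiving two 1-ul additions of 40 mM stock.
print(final_concentration(40.0, 2.0, [1.0, 1.0]))  # 20.0
```

Under these assumptions, reaching about half the stock concentration requires the total added volume to equal the original drop volume.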

Structure determination
Data sets were processed using the CCP4 suite (11). Specifically, data were integrated using iMosflm version 1.5 and merged and scaled using SCALA. Scaleit was used to scale across the data sets after they were combined using CAD.

Heavy atom sites were located using Afro/Crunch2, followed by refinement and phasing with Bp3, all within the CRANK suite (12). The initial model was built by the automated model building software Buccaneer (13), followed by iterative local noncrystallographic symmetry (NCS) phased refinement using DM (14) and Refmac (15). The local NCS operator was used only in the early stages of refinement to help identify the initial phase. After the refinement statistics converged in Refmac, we used Phenix refinement (16) and manual building in the molecular graphics program COOT (17), without the use of NCS. Structures were visualized using PyMOL (18). Statistics of the quality of the structure were calculated using MolProbity (19). Statistics of crystal contacts were calculated using Pisa analysis (20). The structure was deposited in the Protein Data Bank (entry 5GL6).

Construction of MSMEG_2624 knockout strain (ΔrimP)
The rimP null mutant strain (ΔrimP) was constructed using homologous recombination to replace the rimP gene with the hygromycin selection marker as described previously (21). Briefly, a DNA substrate for allelic replacement of rimP (MSMEG_2624) was generated by cloning 500-bp upstream and downstream regions of the rimP gene to the 5′ and 3′ ends of the hygromycin (hyg) gene, respectively, and transformed into the M. smegmatis mc2 155 strain harboring a pJV53 vector, which can produce recombinase for efficient recombination. The successful knockout strain ΔrimP::hyg was selected and confirmed by sequencing and Western blot analysis.

In vitro binding of recombinant RimP (MSMEG_2624) and RpsL
Plasmids pGEX-6P-1-RpsL and pRsfDuet-1-RimP were cotransformed into E. coli BL21(DE3). The bacteria were grown in 3 liters of Luria-Bertani medium at 37°C until A600 ~0.6. Protein expression was induced by the addition of 1 mM isopropyl 1-thio-β-D-galactopyranoside at 16°C overnight. Bacteria were harvested and resuspended in 1× PBS and 0.5% Triton X-100. Bacteria were lysed by sonication, and cell debris and precipitants were removed by centrifugation. The supernatant was mixed with TALON Cobalt beads (Clontech) for 1.5 h at 4°C, followed by washing with 10 column bed volumes of 1× PBS. Proteins were then eluted with 20 mM Tris-HCl, 500 mM NaCl, 250 mM imidazole. The eluted protein was incubated with a GSH column (GE Healthcare) for 1.5 h at 4°C and washed with 10 column bed volumes of 1× PBS. Proteins were eluted with 20 mM Tris-HCl, 150 mM NaCl, and 10 mM reduced GSH.

Polysome profiling
The strains WT, ΔrimP, ΔrimP+rimP, and ΔrimP+vector were cultured in 7H9 medium, and the cells were harvested at early log phase (A600 ~1). The cell pellets were lysed as described previously, and the ribosomes were extracted according to the method reported previously (22).

Mass spectrometry study of ribosomal fractions
Sample preparation-The fraction was precipitated in acetone with a 1:4 ratio at −20°C overnight. The protein pellet was dissolved in urea buffer (8 M urea in 50 mM ammonium bicarbonate). Protein concentration was measured by a Bradford assay. 10 μg of protein were diluted to 1 M urea by 50 mM ammonium bicarbonate and digested overnight at 37°C using sequencing grade trypsin (Promega) with an enzyme/substrate ratio of 1:50. 5 μg of peptide were desalted by C18 Ziptip (Millipore). The sample pellet was reconstituted in formic acid and injected into MS.
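The dilution and enzyme amounts in the sample preparation follow from simple arithmetic, sketched below. This is a minimal illustration, assuming a hypothetical 5-μl aliquot of the protein in 8 M urea (the aliquot volume depends on the measured protein concentration and is not stated above); the function names are illustrative.

```python
def diluent_volume_ul(sample_ul, start_molar, target_molar):
    """Volume of urea-free buffer needed to dilute a sample to the target molarity."""
    if target_molar <= 0 or target_molar > start_molar:
        raise ValueError("target must be positive and no higher than the start")
    return sample_ul * (start_molar / target_molar - 1.0)


def trypsin_mass_ug(substrate_ug, substrate_per_enzyme=50.0):
    """Trypsin mass for a 1:substrate_per_enzyme enzyme/substrate ratio (w/w)."""
    return substrate_ug / substrate_per_enzyme


# Hypothetical 5-ul aliquot in 8 M urea, diluted to 1 M with 50 mM ammonium bicarbonate.
print(diluent_volume_ul(5.0, 8.0, 1.0))  # 35.0 ul of buffer (an 8-fold dilution)
# 10 ug of protein digested at an enzyme/substrate ratio of 1:50.
print(trypsin_mass_ug(10.0))  # 0.2 ug of trypsin
```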
LC-MS/MS experiment-The obtained peptides were reconstituted in 12 μl of 0.1% formic acid, and 2 μl was injected into an Easy-nLC 1200 (Thermo). Peptides were separated on a reverse phase C18 column (75-μm inner diameter × 15 cm, 3-μm particle size) and analyzed by an Orbitrap mass spectrometer (Thermo). The mobile phase buffer consisted of 0.1% formic acid in ultrapure water (buffer A) with an eluting buffer of 0.1% formic acid in 80% (v/v) acetonitrile (buffer B) run with a linear 50-min gradient of 7-25% buffer B at a flow rate of 250 nl/min. The mass spectrometer was operated in positive ion mode acquiring a survey mass spectrum with a mass resolution of 120,000, m/z = 350-1800 using an automatic gain control (AGC) target of 3 × 10⁶. The 12 most intense ions were selected for higher-energy collisional dissociation fragmentation (normalized collision energy 27), and MS/MS spectra were generated with an AGC target of 1 × 10⁵ at a resolution of 30,000. The dynamic exclusion time was set to 30 s.
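The linear gradient described above can be expressed as a simple function of time. This is a minimal sketch, assuming an ideal linear ramp with no delay volume; the function names are illustrative, and the acetonitrile content is derived only from the stated composition of buffer B (80% v/v acetonitrile).

```python
def percent_b(t_min, start=7.0, end=25.0, duration_min=50.0):
    """Percentage of buffer B at time t_min for an ideal linear gradient."""
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time is outside the gradient")
    return start + (end - start) * t_min / duration_min


def percent_acetonitrile(t_min, acn_in_b=80.0):
    """Acetonitrile content (% v/v) of the mobile phase, given buffer B composition."""
    return percent_b(t_min) * acn_in_b / 100.0


for t in (0.0, 25.0, 50.0):
    print(t, percent_b(t), percent_acetonitrile(t))
```

Under these assumptions, the gradient spans roughly 5.6-20% acetonitrile over the 50 min.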
Database analysis-The raw data from MS/MS spectra were searched against the M. smegmatis strain mc2 155 database (downloaded on August 20, 2018) using the Sequest HT node integrated within the Proteome Discoverer (PD) software (version 2.2, Thermo). The precursor and fragment mass tolerances were set to 10 ppm and 0.02 Da, respectively. A maximum of two missed cleavages was allowed for trypsin digestion. Cysteine carbamidomethylation was set as a static modification, whereas methionine oxidation and N-terminal acetylation were set as variable modifications. False discovery rates of peptide spectrum matches and identified peptides were determined using the Percolator algorithm at 1% based on q value. For quantification, the precursor ion areas in a node-based processing and consensus workflow in PD 2.2 were used. The areas of the ions in the MS1 scan were calculated using the Minora Feature alignment and feature mapping. The abundance values of proteins were obtained via a label-free quantification method using LC-MS/MS.
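The two mass tolerances above are of different kinds: the precursor tolerance is relative (ppm), whereas the fragment tolerance is absolute (Da). The following minimal sketch illustrates the difference; the function names and the example m/z values are hypothetical, and this is not a description of how Sequest HT implements matching internally.

```python
def precursor_window(mz, tol_ppm=10.0):
    """Lower and upper m/z bounds for a relative (ppm) tolerance."""
    delta = mz * tol_ppm / 1e6
    return mz - delta, mz + delta


def fragment_matches(observed_mz, theoretical_mz, tol_da=0.02):
    """True if a fragment ion falls within an absolute (Da) tolerance."""
    return abs(observed_mz - theoretical_mz) <= tol_da


# Hypothetical precursor at m/z 1000: 10 ppm corresponds to +/- 0.01.
low, high = precursor_window(1000.0)
print(round(low, 4), round(high, 4))

# Hypothetical fragment ions: one inside and one outside the 0.02 Da window.
print(fragment_matches(500.015, 500.000), fragment_matches(500.030, 500.000))
```

Because the ppm window scales with m/z, it is narrower in absolute terms for light precursors and wider for heavy ones, whereas the Da window is constant.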