Mycobacterium tuberculosis RNA Polymerase-binding Protein A (RbpA) and Its Interactions with Sigma Factors*

Background: RNA polymerase-binding protein A (RbpA) plays an unknown essential role in Mycobacterium tuberculosis. Results: The structure of RbpA was solved using NMR. Conclusion: RbpA binds sigma factors A and B via its conserved N and C termini. Significance: The identified interactions shed light on the function of RbpA in regulating transcription.
RNA polymerase-binding protein A (RbpA), encoded by Rv2050, is specific to the actinomycetes, where it is highly conserved. In the pathogen Mycobacterium tuberculosis, RbpA is essential for growth and survival. RbpA binds to the β subunit of the RNA polymerase, where it activates transcription by unknown mechanisms, and it may also influence the response of M. tuberculosis to the current frontline anti-tuberculosis drug rifampicin. Here we report the solution structure of RbpA and identify the principal sigma factor σA and the stress-induced σB as interaction partners. The protein has a central ordered domain with a conserved hydrophobic surface that may be a protein interaction site. The N and C termini are highly dynamic and are involved in the interaction with the sigma factors. RbpA forms a tight complex with the N-terminal domain of σB via its N- and C-terminal regions. The interaction with sigma factors may explain how RbpA stabilizes sigma subunit binding to the core RNA polymerase and thereby promotes initiation complex formation. RbpA could therefore influence the competition between principal and alternative sigma factors and hence the transcription profile of the cell.


Tuberculosis (TB) is an infectious disease that has plagued humanity for thousands of years; indeed, molecular evidence of TB infection has been found in a 5,400-year-old Egyptian mummy (1). Through the long years of coexistence between TB and humankind, Mycobacterium tuberculosis, the main etiological agent of human TB, has evolved to become one of the most successful human pathogens. In 2011, M. tuberculosis claimed 1.4 million lives and caused an estimated 8.7 million incident cases of TB (2).
Transcription by bacterial RNA polymerase (RNAP) is a key point of control of gene expression and an important target for antimicrobial chemotherapy, including the frontline anti-tuberculosis drug rifampicin. RNAP is formed by a core of five subunits (α2ββ′ω), which is competent for transcription elongation, and by a sixth σ-subunit (or σ factor) that is essential for promoter recognition and transcription initiation. All bacterial genomes encode one primary σ-subunit that is responsible for the transcription of housekeeping genes and is essential for the growth of the organism (3). However, the total number of σ-subunits encoded can vary from 1 in Mycoplasma pneumoniae (4) to 65 in Streptomyces coelicolor A3(2) (5). Each σ-subunit has different promoter specificity, so the use of different σ-subunits, which is often linked to a specific stimulus such as stress response, represents an efficient mechanism of gene expression control.
M. tuberculosis encodes 13 distinct σ-subunits (σA-σM) (6): σA is the primary σ-subunit, σB is a primary-like σ-subunit that shares a high degree of sequence homology with the last 300 residues of σA, and σC-σM are alternative σ-subunits (3). Interestingly, M. tuberculosis has the highest ratio of alternative σ-subunits compared with genome size of any obligate pathogen (7), possibly reflecting the complex stages of infection that require tight regulation.
Switching between different σ-subunits to reprogram transcription is a common strategy to control gene expression. However, other mechanisms that control the transcription activity of the RNAP include transcription factors, other RNAP-binding proteins, non-protein ligands, and the folded bacterial chromosome structure (8,9). The expression of a particular gene is thus the result of the interaction of different players in a complex and dynamic network that, for M. tuberculosis, is still not completely understood.
A novel small RNAP-binding protein RbpA was identified in S. coelicolor and is highly conserved in the actinomycetes (10). RbpA of M. tuberculosis contains 111 amino acids, is encoded by Rv2050, and is thought to be part of the gene expression control network (11). Hu et al. (11) have shown that RbpA activates transcription by stabilizing the formation of the RNAP holoenzyme containing σA but does not activate σF-dependent transcription. RbpA and its homologue in Mycobacterium smegmatis have been reported to bind to the β-subunit of RNAP, but the basis of sigma factor specificity is unknown (11,12). One hypothesis is that RbpA can modify the RNAP core structure to specifically increase the affinity for σA (11).
Further evidence for the role of RbpA in transcription comes from its role in reducing inhibition of the RNAP by rifampicin during in vitro transcription and cell-based assays in M. tuberculosis (11), S. coelicolor (13), and M. smegmatis (14). Rifampicin binds to the β-subunit of the RNAP. The binding site of rifampicin does not overlap with the predicted binding site of RbpA, and the role of RbpA in cell sensitivity to rifampicin remains unclear (11,14). RbpA expression is up-regulated in several stress conditions: starvation, hypoxia, in mouse macrophages, and in the presence of rifampicin and vancomycin (13,15,16). Furthermore, RbpA was predicted to be essential for normal growth of M. tuberculosis (17). This early prediction was confirmed using an RbpA conditional knockdown mutant strain in M. tuberculosis (18). However, it is not yet clear what essential function(s) RbpA performs in M. tuberculosis.
The essential role of RbpA in the regulation of gene expression and its role in rifampicin tolerance make this protein an important subject for study. Here we present evidence that RbpA can directly interact with both σA and σB, thereby providing insights into the mechanism(s) used by RbpA to modify the transcription activity of the RNAP. Furthermore, we report the high resolution structure of the N-terminal domain of RbpA (residues 1-79, RbpA 1-79), which reveals the presence of a well structured core with long and extremely flexible N and C termini. We investigated the interaction between RbpA and the σ-subunit and found that the structured central domain of RbpA is not involved in the formation of the complex with the σ-subunit, but both ends of the construct are involved in the interaction, although only the C terminus is essential.
Expression and Purification of RbpA Constructs-Escherichia coli BL21(DE3) cells, transformed with the relevant expression vector, were grown in LB medium containing 100 μg/ml ampicillin at 37°C with shaking to an absorbance at 600 nm of ~0.7. Protein expression was then induced by the addition of 0.5 mM isopropyl 1-thio-β-D-galactopyranoside, and the cell growth was continued at 30°C for 4 h. The cells were lysed in a buffer containing 50 mM Tris-HCl, pH 8.0, and 0.1 mM EDTA plus 1 mM 4-(2-aminoethyl)benzenesulfonyl fluoride hydrochloride (AEBSF) and 0.1 mg/ml lysozyme. The cell lysate was then centrifuged (15,000 × g for 30 min), and the supernatant was loaded on a nickel-Sepharose 5-ml column (GE Healthcare) pre-equilibrated with a buffer containing 25 mM Tris-HCl, pH 8.0, 150 mM NaCl, and 30 mM imidazole. Subsequently the His6-tagged protein was eluted in the same buffer with a linear imidazole gradient (30-500 mM). Fractions containing the His6-tagged protein were subjected to size exclusion chromatography using a Superdex 75 16/60 (GE Healthcare) column and a buffer containing 25 mM Tris-HCl, pH 8.0, 150 mM NaCl, and 2 mM DTT.
To uniformly label RbpA with 15N, 13C, or 15N/13C, E. coli BL21(DE3) cells transformed with the relevant expression vector were grown in modified Spizizen's minimal medium (19) containing 100 μg/ml ampicillin, in which the sole sources of nitrogen and carbon were 1 g/liter of [15N]ammonium sulfate and 2 g/liter of 13C-D-(+)-glucose as appropriate. The same expression protocol was applied to uniformly label RbpA 1-79, except for 15N/13C-labeled RbpA 1-79, which was prepared using nonisotopically labeled aromatic amino acids (His, Tyr, Trp, and Phe at 50 mg/liter) in the minimal medium. The preparation of uniformly labeled 15N/13C/2H RbpA was achieved following a protocol for high density expression of labeled protein (20). Purification of the labeled proteins was performed as for isotopically unlabeled RbpA.
Expression and Purification of σA, σB, and σB 1-228-E. coli BL21(DE3) cells, transformed with the relevant expression vector, were grown in LB medium (containing 100 μg/ml ampicillin) at 37°C with shaking to an absorbance at 600 nm of ~0.7. Protein expression was induced by the addition of isopropyl 1-thio-β-D-galactopyranoside at 0.05 mM, and the cell growth continued at 16°C overnight. The cells were lysed in a buffer containing 50 mM Tris-HCl, pH 7.9, 500 mM NaCl, and 5% glycerol plus 1 mM AEBSF, 0.1 mg/ml lysozyme, 0.1 mg/ml deoxyribonuclease I, and 5 mM MgCl2. The cell lysate was then centrifuged (15,000 × g for 30 min), and the supernatant was loaded on a 5-ml nickel-nitrilotriacetic acid column (Qiagen) pre-equilibrated with the lysis buffer containing 20 mM imidazole. The His6-tagged protein was eluted in the same buffer with a linear imidazole gradient (20-500 mM). Fractions containing the His6-tagged protein were subjected to size exclusion chromatography using a Superdex 75 16/60 (GE Healthcare) column and a buffer containing 50 mM Tris-HCl, pH 7.9, 500 mM NaCl, 5% glycerol, 0.1 mM EDTA, and 0.5 mM DTT.
His6 Tag Cleavage-The His6 tag was removed from the His6-tagged proteins by enzymatic cleavage using His6-TEV protease (provided by PROTEX, University of Leicester). The reaction results in complete removal of the tag, leaving a single serine residue. His6-TEV protease and the His6 tag were removed from the reaction mixture using a 5-ml nickel-nitrilotriacetic acid column (Qiagen).
MAY 17, 2013 • VOLUME 288 • NUMBER 20
The two- and three-dimensional spectra recorded to obtain sequence-specific assignments for RbpA were: 1  Typical acquisition times in the three-dimensional experiments were: 4-9 ms (13C), 9-18 ms (15N) in F1 and F2, and 65-83 ms in F3 (1H).
The two- and three-dimensional spectra recorded to obtain sequence-specific assignments and 1H-1H distance constraints for RbpA 1-79 were: 1H-1H TOCSY (21) with mixing times of 35 and 55 ms, 15  To identify changes in the RbpA signals induced by His6-σB 1-228 binding, NMR spectra were acquired from 0.35-ml samples of 15  Acquisition times in the three-dimensional experiments were: 12 ms in F1 (13C), 10 ms in F2 (15N), and 65 ms in F3 (1H). The two-dimensional NMR spectra were typically acquired for ~0.5 h, and the three-dimensional spectra were typically acquired for ~65 h.
The WATERGATE method (33) was used to suppress the water signal when required. The NMR data collected were processed using the program Topspin (Bruker Biospin Ltd.) with linear prediction used to extend the effective acquisition time up to 2-fold in F1 and F2. The spectra were analyzed with the Sparky package (T. D. Goddard and D. G. Kneller, University of California, San Francisco).
Sequence-specific Assignment-Sequence-specific backbone assignments were obtained for RbpA from the identification of intra- and inter-residue connectivities in a series of double and triple resonance spectra: 15 (25), and NOESY-TROSY (24). Initially, tentative assignments for RbpA 1-79 were obtained by transposing the assignment from RbpA onto the 1H-1H TOCSY (21), 1H-1H NOESY (22), 15N/1H HSQC (23), and TOCSY-HSQC (31) spectra. The inter-residue amide nitrogen/proton to Cα and Cβ connectivities observed in the HNCACB (30) and the NH to NH and CH to CH NOEs identified in the 15N- and 13C-edited NOESY spectra (24,32) were used to confirm the assignment.
Structure Calculation-The program Cyana (34) was used to calculate the family of converged structures of RbpA 1-79 in a two-stage process. Initially, the combined automated NOE assignment and structure determination protocol (CANDID) was used to automatically assign NOE cross-peaks and to generate a preliminary family of converged structures. The inputs used for this first step were: 15N, 13C, and 1H resonance assignments; a set of 72 backbone torsion angle constraints, determined by the protein backbone dihedral angle prediction program TALOS+ (35); and three lists of manually picked NOE cross-peaks that were identified in a two-dimensional NOESY spectrum (399 peaks) and in three-dimensional 15N- and 13C-edited NOESY spectra (881 and 1049 peaks, respectively). The CANDID calculation was carried out using the standard parameters of Cyana but with a chemical shift tolerance set at 0.03 ppm for 1H and at 0.4 ppm for 15N and 13C. In the second stage, the final family of converged RbpA 1-79 structures was produced through several cycles of simulated annealing combined with redundant dihedral angle constraints (36), resulting in 98 structures with no distance violations greater than 0.5 Å and no dihedral angle violations greater than 5°. The structures obtained were then refined in a generalized Born solvent model (37) using the AMBER 10 package (38) as described previously (39), but with one cycle of restrained molecular dynamics simulated annealing. The 35 structures with the lowest AMBER energy were selected and analyzed using MOLMOL (40) and CING (41).
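The role of the chemical shift tolerances quoted above can be illustrated with a minimal sketch. It is not the CANDID algorithm itself, which also ranks candidates using network anchoring and the evolving structure; it only shows how a tolerance window decides which assigned resonances are candidate explanations for one dimension of a picked peak. The `Resonance` class, the atom labels, and the shift values are hypothetical.

```python
from dataclasses import dataclass

# Tolerances quoted in the text (ppm).
TOL_1H = 0.03
TOL_HEAVY = 0.4  # applies to 15N and 13C


@dataclass(frozen=True)
class Resonance:
    atom: str         # hypothetical label, e.g. "V42.HN"
    nucleus: str      # "1H", "15N" or "13C"
    shift_ppm: float  # assigned chemical shift


def tolerance(nucleus: str) -> float:
    """Return the matching window for a nucleus type."""
    return TOL_1H if nucleus == "1H" else TOL_HEAVY


def candidates(peak_ppm: float, nucleus: str, resonances: list) -> list:
    """List atoms whose assigned shift lies within tolerance of the peak position."""
    tol = tolerance(nucleus)
    return [
        r.atom
        for r in resonances
        if r.nucleus == nucleus and abs(r.shift_ppm - peak_ppm) <= tol
    ]


if __name__ == "__main__":
    # Invented shift table, for illustration only.
    table = [
        Resonance("A.HN", "1H", 8.25),
        Resonance("B.HN", "1H", 8.27),
        Resonance("C.HN", "1H", 8.40),
        Resonance("A.N", "15N", 120.1),
    ]
    print(candidates(8.26, "1H", table))
    print(candidates(120.4, "15N", table))
```

A narrow 1H window keeps the number of ambiguous candidates per peak small, which is presumably why the tolerance was tightened relative to the program defaults.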
Sequence Alignment and Conservation Score Assignment-RbpA homologue sequences were identified using the BLAST algorithm (42) with the default settings leading to the identification of 416 sequences. The alignment was performed using the software COBALT (43) with sequences from Actinobacteria. When multiple sequences were identified from the same genus, only the sequence with the highest similarity score was selected. The resulting alignment was used to assign a conservation score to each RbpA residue, calculated using the analysis of multiply aligned sequences method (44) implemented in Jalview (45).
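The redundancy filter described above (one sequence per genus, keeping the highest similarity score) can be sketched as follows. This is a minimal illustration, not the authors' script; the tuple layout, genus names, identifiers, and scores are hypothetical.

```python
def best_per_genus(hits):
    """Keep, for each genus, the identifier of the hit with the highest score.

    `hits` is an iterable of (genus, sequence_id, similarity_score) tuples.
    """
    best = {}
    for genus, seq_id, score in hits:
        # Replace the stored hit only when a strictly better score is seen.
        if genus not in best or score > best[genus][1]:
            best[genus] = (seq_id, score)
    return {genus: seq_id for genus, (seq_id, _) in best.items()}


if __name__ == "__main__":
    # Invented BLAST-like hits, for illustration only.
    example_hits = [
        ("Mycobacterium", "seq_a", 230.0),
        ("Mycobacterium", "seq_b", 198.5),
        ("Streptomyces", "seq_c", 150.2),
        ("Streptomyces", "seq_d", 161.9),
    ]
    print(best_per_genus(example_hits))
```

Filtering before alignment prevents heavily sequenced genera from dominating the per-residue conservation scores.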
Construction of a Genomic M. tuberculosis Yeast Two-hybrid Library-The total DNA of M. tuberculosis H37Rv (kindly provided by Roland Brosch, Pasteur Institute, Paris) was partially digested with AciI and HpaII. The fragments were subsequently cloned into the single ClaI site of plasmid pGADT7 (Clontech) to generate a genomic M. tuberculosis library for yeast two-hybrid analyses. To avoid religation of the digested vector, the ligation mix was treated with ClaI before transformation into E. coli. The obtained library consists of ~3 × 10^5 independent clones with an insertion frequency of nearly 100% and an average insert size of ~500 base pairs.
Two-hybrid Screening for σB-interacting Proteins-The entire coding sequence of σB was amplified by PCR with M. tuberculosis H37Rv DNA as a template and fused in-frame to the DNA of the yeast Gal4 DNA-binding domain using the restriction sites EcoRI and PstI of plasmid pGBT9 (Clontech). Subsequently, this construct was used to screen for σB-interacting proteins. The "bait"- and "prey"-containing plasmids were co-transformed by the method of Klebe et al. (46) into the yeast strain Y190 (MATa, ura3-52, his3-200, ade2-101, lys2-801, trp1-901, leu2-3, -112, gal4Δ, gal80Δ, cyhr2, LYS2::GAL1UAS-HIS3TATA-HIS3, URA3::GAL1UAS-GAL1TATA-lacZ; Clontech). Yeast transformants were selected and cultivated on SD synthetic medium (2% glucose and 0.67% yeast nitrogen base without amino acids) supplemented with essential amino acids and nucleotides. Potential interaction partners of σB were monitored using β-galactosidase as a reporter system in a filter lift assay (47,48).
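The library size quoted above can be put in context with a rough coverage estimate. The sketch below uses the Clarke-Carbon formula, P = 1 - (1 - f)^N, where f is the insert size divided by the genome size and N is the number of clones. Several inputs are assumptions and not taken from the text: a genome size of about 4.4 Mb for M. tuberculosis H37Rv, random fragmentation (partial restriction digestion is only approximately random), and a usable fraction of 1/6 to account for insert orientation and reading frame relative to the Gal4 activation domain.

```python
def fold_coverage(n_clones: float, insert_bp: float, genome_bp: float) -> float:
    """Total cloned DNA expressed as multiples of the genome length."""
    return n_clones * insert_bp / genome_bp


def p_represented(n_clones: float, insert_bp: float, genome_bp: float) -> float:
    """Clarke-Carbon estimate that a given locus occurs in at least one clone."""
    f = insert_bp / genome_bp
    return 1.0 - (1.0 - f) ** n_clones


N_CLONES = 3e5           # from the text: ~3 x 10^5 independent clones
INSERT_BP = 500          # from the text: average insert size
GENOME_BP = 4.4e6        # assumption: approximate H37Rv genome size
USABLE_FRACTION = 1 / 6  # assumption: correct orientation (1/2) and frame (1/3)

if __name__ == "__main__":
    print("fold coverage:", fold_coverage(N_CLONES, INSERT_BP, GENOME_BP))
    print("P(locus present, any clone):",
          p_represented(N_CLONES, INSERT_BP, GENOME_BP))
    print("P(locus present, usable clones only):",
          p_represented(N_CLONES * USABLE_FRACTION, INSERT_BP, GENOME_BP))
```

Under these assumptions the library corresponds to roughly 34 genome equivalents of cloned DNA, so most loci should be represented even after discounting out-of-frame and inverted inserts.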
Interaction Assay by Split-Trp-Protein-protein interactions were tested in E. coli using the Split-Trp method (49,50). The genes encoding σB and RbpA were each cloned into PL184 and PL185 for expression as fusions with Ntrp and Ctrp. Pairs of plasmids were transformed into E. coli ΔtrpF, and strains were grown in LB, then plated on VB minimal agar containing 0 or 60 μg/ml tryptophan and incubated at 25°C. Photographs were taken on the sixth day.
Mapping the σB Binding Site on RbpA-Changes in the positions of RbpA NMR signals resulting from binding to His6-σB 1-228 were analyzed using the minimal shift approach (51-53). The combined HN, N, and CO chemical shift differences ($\Delta\delta$) between the peaks detected in HNCO spectra recorded on free 15N/13C/2H RbpA and on 15N/13C/2H RbpA bound to unlabeled His6-σB 1-228 were calculated using the following equation: $\Delta\delta = \sqrt{(\Delta\delta_{HN})^2 + (0.2\,\Delta\delta_{N})^2 + (0.35\,\Delta\delta_{CO})^2}$, where $\Delta\delta_{HN}$, $\Delta\delta_{N}$, and $\Delta\delta_{CO}$ correspond to the differences in 1H, 15N, and 13C chemical shifts between pairs of HNCO peaks, and 0.2 and 0.35 are scaling factors required to account for differences in the range of amide proton, amide nitrogen, and carbonyl chemical shifts. For each peak detected in the HNCO spectra recorded on free 15N/13C/2H RbpA, the minimal shift induced by the binding with His6-σB 1-228 was taken as the lowest calculated combined shift value ($\Delta\delta$).
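The minimal shift calculation described above can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the analysis script used in the study: peaks are represented as hypothetical (HN, N, CO) tuples in ppm, free peaks are keyed by an arbitrary label, and the bound-state peak list is treated as unassigned.

```python
import math

N_SCALE = 0.2    # scaling factor for 15N shifts, from the text
CO_SCALE = 0.35  # scaling factor for 13C' shifts, from the text


def combined_shift(free, bound):
    """Combined shift difference between two HNCO peaks given as (HN, N, CO) in ppm."""
    d_hn = bound[0] - free[0]
    d_n = (bound[1] - free[1]) * N_SCALE
    d_co = (bound[2] - free[2]) * CO_SCALE
    return math.sqrt(d_hn ** 2 + d_n ** 2 + d_co ** 2)


def minimal_shifts(free_peaks, bound_peaks):
    """For every free-state peak, take the lowest combined shift to any bound-state peak.

    The result is a lower bound on the true perturbation, because the bound-state
    spectrum does not need to be assigned.
    """
    return {
        label: min(combined_shift(peak, b) for b in bound_peaks)
        for label, peak in free_peaks.items()
    }


if __name__ == "__main__":
    # Invented peak positions, for illustration only.
    free = {"res_a": (8.00, 120.0, 175.0), "res_b": (7.50, 115.0, 176.0)}
    bound = [(8.00, 120.0, 175.0), (7.80, 115.0, 176.0)]
    for label, value in minimal_shifts(free, bound).items():
        print(label, round(value, 3))
```

Because each free peak is matched to its nearest bound peak, residues with large minimal shifts are reliably perturbed, whereas small values do not exclude involvement in binding.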
Complementation of Rv2050 Mutant-The construction of a conditional mutant strain of M. tuberculosis H37Rv in which the gene Rv2050 is under the control of a pristinamycin-inducible promoter has been described previously (18). The coding region of the gene Rv2050 with the 304-base pair promoter region was amplified with specific oligonucleotides and cloned into plasmid pMV306 to obtain the plasmid pMV306-RbpA. pMV306 is an integrative plasmid in which the expression cassette of pMV361 is replaced by a multiple cloning site (54). A truncated version, pMV306-RbpA 1-79, missing 32 residues at the C terminus, was constructed in an identical manner. Each complementing plasmid and, as a control, pMV306 was electroporated into the conditional mutant. Colonies were obtained on Middlebrook 7H10 agar supplemented with 10% ADN (5% albumin, 2% dextrose, 145 mM NaCl), kanamycin (40 μg/ml), hygromycin (100 μg/ml), and pristinamycin (0.5 μg/ml). A few clones from each transformation were grown in Middlebrook 7H9 medium supplemented with ADN, Tween 80 (0.05%), kanamycin, hygromycin, and pristinamycin for 10 days and then serially diluted and spotted in parallel onto 7H10 agar plates with and without pristinamycin. Photographs were taken after 24 days of incubation at 37°C.

The Role of the C-terminal Region of RbpA in Oligomerization-
The protocol adopted for the purification of RbpA involves metal affinity chromatography followed by size exclusion chromatography. In the latter step, it was noticed that RbpA eluted from the column earlier than is expected for a protein with a molecular mass of 13 kDa. The oligomerization status of RbpA was investigated by performing a number of size exclusion chromatography experiments on RbpA samples at different concentrations (Fig. 1). The resulting chromatograms reveal asymmetric peaks with the elution volume strongly dependent on the concentration of the sample. These observations suggest that there is a dynamic and concentration-dependent equilibrium between different oligomeric species of RbpA.
Initial NMR experiments aimed at determining the solution structure of RbpA were recorded and analyzed. For residues 1-76, the extent of the assignment is 94.1% of all the aliphatic 13C resonances and 88.1 and 91.5% of all the 15N and 1H resonances, respectively. However, approximately half of the residues 77-111 do not generate observable NMR signals. This prevented unequivocal assignment of the last 35 residues of RbpA. The lack of backbone NMR signals for this region implies that it samples a number of discrete structural states on a second to millisecond time scale.
The secondary structure prediction for RbpA (Fig. 2A) reveals that two distinct α-helices are likely to be formed in the C-terminal region (residues 80-111). Interestingly, the second predicted helix (residues 91-108) has a marked amphipathic character (Fig. 2B). The hydrophobic side of this helix could potentially interact with the equivalent face on a second RbpA molecule, leading to homo-oligomerization. The elongated shape of the peak obtained during the size exclusion chromatography analysis of RbpA (Fig. 2C, blue trace) indicates that there is a constant interchange between different oligomerization species. Conformational exchange in the C-terminal region of RbpA could account for the poor quality of the NMR signals generated by the last 35 residues: some signals are weak, some are very broad, and some are missing.
To investigate the role of the C-terminal region in oligomerization, we expressed and purified a shorter version of RbpA, termed RbpA 1-79, which consists of residues 1-79 and therefore lacks the predicted amphipathic helix. The elution profile of this truncated protein in size exclusion chromatography is consistent with the predicted molecular mass of the monomer (9 kDa; Fig. 2C) and is independent of the protein concentration (not shown). The difference in retention between RbpA and RbpA 1-79 clearly indicates that the C-terminal region is essential for oligomerization.
Solution Structure Calculation of RbpA 1-79-There was a significant improvement in the quality of the NMR data obtained for RbpA 1-79 as compared with full-length RbpA, which is evident from both the increased signal to noise ratio and decreased line widths observed for RbpA 1-79, as shown in the spectra reported in Fig. 3 and supplemental Fig. S1. Comparison of the backbone chemical shifts of RbpA and RbpA 1-79 revealed only minor changes (Fig. 3), indicating that the structure of the first 79 residues of RbpA is maintained in the shorter RbpA 1-79 construct. Therefore, RbpA 1-79 was chosen for further structural characterization.
The quality of the NMR data acquired for RbpA 1-79 allowed us to obtain comprehensive backbone and side chain assignments. For example, nearly complete assignments of backbone resonances (N, NH, Hα, Hβ, Cα, and Cβ) were made for all residues except: Met1 (N, NH, Hα, Hβ2, Hβ3, and Cβ), Gly8 (Hα2 and Hα3), Tyr16 (34) was used to automatically assign the NOE cross-peaks identified in two-dimensional 1H-1H NOESY and three-dimensional 15N- and 13C-edited NOESY spectra. Unique assignments were obtained for 87% of the NOE cross-peaks identified. The structure of RbpA 1-79 was calculated with the program Cyana (34) using the constraints listed in Table 1. In the final round of structure calculation, 98 satisfactorily converged RbpA 1-79 structures were obtained from 100 initial random structures. They had no distance or van der Waals violations greater than 0.5 Å, had no dihedral angle violations greater than 5°, and had an average value for the Cyana target function of 1.17 Å^2. The structures were refined using the AMBER 10 package (38), and the 35 structures with the lowest AMBER energy were selected. The final family of converged structures has no distance constraint violations greater than 0.37 Å and no dihedral angle violations.
Superposition of the final family of converged structures (Fig. 4A) reveals the presence of a well defined central domain (residues 26-66), whereas the N- and C-terminal regions of the protein are both disordered and highly flexible. The structure of the central domain was determined to high precision, as is clearly evident from the overlay shown in Fig. 4B, and is reflected in low root mean square deviation values to the mean structure for both the backbone and all the heavy atoms of 0.32 and 1.08 Å, respectively (a summary of the NMR constraints and structural statistics is reported in Table 1). The central domain is primarily composed of four distinct β-strands: β1 (residues 27-33), β2 (residues 39-45), β3 (residues 53-55), and β4 (residues 61-65), which fold to form two antiparallel β-sheets linked by turns and loops (Fig. 4C). The two β-sheets lie perpendicular to each other, forming a β-sandwich-like structure, which is stabilized by a cluster of aromatic and nonpolar residues (Tyr32, Val42, Phe44, and Trp54). Analysis of the electrostatic surface of the central domain (Fig. 5) revealed the presence of a cluster of conserved residues (Arg27, Val42, Phe44, Ala45, Asp47, Ala48, Glu49, and Trp54) that form a surface patch of 742.2 Å^2 with a hydrophobic center surrounded by charged residues. This patch is the most conserved region on the surface of the central domain (Fig. 5B) and may form a protein interaction site.
Comparison of the structure of the RbpA central domain (residues 26-66) with the known folds in the Protein Data Bank using Dali (55) identified 14 structures (supplemental Table S1) that share structural homology (z score > 2). However, the structural homology of each with the RbpA central domain is relatively low (z score < 3.5), and no specific function has been assigned to this fold. The highest scoring structural homologue (Protein Data Bank code 3UXQ, chain 6) is a ribosomal protein that has a Cα root mean square deviation to the RbpA central domain of 2.8 Å (supplemental Table S1) but no sequence homology (2% amino acid identity).
Identification of RbpA as a Binding Partner of σB-M. tuberculosis is unusual, but not unique, in having a second principal-like sigma factor, σB, that is up-regulated during stress. As part of an investigation into the function of σB, we performed a yeast two-hybrid screen to identify potential binding partners encoded within the genome of M. tuberculosis. Of ~75,000 yeast transformants analyzed, one clone was identified that scored positive after several rounds of verification. Sequencing revealed that the gene Rv2050 was fused in-frame to the activation domain of Gal4.
Because this interaction had not been observed previously, we additionally verified that RbpA binds to σB in bacterial cells using a protein fragment complementation assay in E. coli (49,50). Strains carrying Ntrp-σB and RbpA-Ctrp grew on minimal medium, suggesting that interaction of σB with RbpA reconstituted the Trp1p enzyme (Fig. 6). The strain carrying Ntrp-σB and σB-Ctrp served as a negative control, as σB is not thought to form homo-oligomers.
Characterization of the Interaction of RbpA with Sigma Factors-To further characterize the identified interaction in vitro and to determine the domains responsible for interaction, we expressed and purified σB (full-length, residues 1-323) and a truncated version, σB 1-228 (residues 1-228), which contains conserved motifs 1.2-3.1 (56). The site of σB truncation was chosen using multiple sequence alignment and structural modeling with Phyre (57) based on the structure of the Thermus thermophilus principal σ subunit, Protein Data Bank code 1IW7 (45% sequence identity) (58). Both versions were tested for the ability to interact with RbpA during size exclusion chromatography (below) and affinity chromatography (not shown). RbpA was found to bind σB 1-228 (Fig. 7) or full-length σB (data not shown). Co-purification demonstrates that the interaction is tight (KD < 10^-6 M) and specific and shows that the first 228 residues of σB are sufficient to form the complex with RbpA.
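The link between co-purification and a submicromolar dissociation constant can be made concrete with the standard expression for a 1:1 binding equilibrium, [PL] = ((Pt + Lt + KD) - sqrt((Pt + Lt + KD)^2 - 4 Pt Lt)) / 2. The sketch below evaluates it for a few KD values; the 10 μM total concentrations are hypothetical and chosen only for illustration, because the concentrations on the column are not stated in the text.

```python
import math


def complex_conc(p_total: float, l_total: float, kd: float) -> float:
    """Equilibrium concentration of a 1:1 complex (all values in the same units)."""
    b = p_total + l_total + kd
    return (b - math.sqrt(b * b - 4.0 * p_total * l_total)) / 2.0


def fraction_bound(p_total: float, l_total: float, kd: float) -> float:
    """Fraction of the first partner that is in the complex at equilibrium."""
    return complex_conc(p_total, l_total, kd) / p_total


if __name__ == "__main__":
    # Hypothetical equimolar mixture at 10 uM each; KD values in uM.
    for kd_um in (0.01, 0.1, 1.0, 10.0, 100.0):
        print(kd_um, round(fraction_bound(10.0, 10.0, kd_um), 3))
```

The calculation describes equilibrium only; during chromatography the partners are also diluted and separated, so a complex that survives the column intact is expected to have a KD well below the loading concentration.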
To investigate the stoichiometry of the complex, further size exclusion chromatography experiments were performed using different molar ratios: 2:1 and 1:2 of RbpA:σB 1-228 (supplemental Fig. S2). In each case a 1:1 complex was eluted from the column, with the excess material eluting as expected for the free protein. The elution volume for the His6-RbpA·His6-σB 1-228 complex is lower (earlier) than that predicted by the molecular mass of the complex, 43.9 kDa, but is nevertheless closer to the prediction for a 1:1 complex than a 2:2 complex (Fig. 7). Hence we suggest that the His6-RbpA·σB 1-228 complex has a 1:1 stoichiometry and that the slightly earlier elution may be due to the extended conformation that is known for sigma factors (57).
Sigma factor A (σA), the primary sigma factor of M. tuberculosis, shares high homology with σB, including 64% amino acid identity in residues 1-228 of σB. We hypothesized that σA could also interact with RbpA. To test this hypothesis, we expressed and purified σA and assessed the binding with RbpA by size exclusion chromatography experiments, confirming that σA can also bind and form a tight complex with RbpA (data not shown).
Insights on the Interaction between RbpA and σB-To better understand the binding mode of RbpA with the sigma subunit, we took advantage of the RbpA NMR assignment to investigate which residues of RbpA are involved in the interaction with σB 1-228 by NMR minimal shifts. For this purpose, a sample of triply labeled 15N/13C/2H RbpA was mixed in an equimolar ratio with unlabeled His6-σB 1-228, and the resulting complex was purified by size exclusion chromatography. TROSY and TROSY-HNCO spectra were acquired on the complex sample and on a triply labeled sample of 15N/13C/2H RbpA in the free form. The differences observed in the peak positions between the complex and free RbpA are reported in Fig. 8. Comparison between the TROSY spectra recorded on free RbpA and on the RbpA·His6-σB 1-228 complex (Fig. 8A) shows evident perturbations in both assigned and unassigned resonances. In total, at least 19 RbpA backbone amide groups are clearly affected by the binding with His6-σB 1-228, and 10 of these are unassigned resonances (supplemental Fig. S3). All the unassigned RbpA backbone amide groups, except Met1, Gly8, and Arg10, lie in the last 35 residues of the protein. Therefore, the relatively high number of perturbations observed in the unassigned peaks shows that the C terminus of RbpA is affected by the binding with His6-σB 1-228. Furthermore, the only resonance that we were able to assign in the region 77-111, the indole group of the tryptophan in position 82, is clearly affected by the binding with His6-σB 1-228 (Fig. 8A). Among the assigned resonances that are affected by the binding with His6-σB 1-228, there is a clear cluster between residues 11 and 20 (Fig. 8C).
This NMR analysis of the RbpA·His6-σB 1-228 complex led us to hypothesize that both the N- and C-terminal regions of RbpA are involved in the binding with His6-σB 1-228. To support this hypothesis, we tested three truncated versions of RbpA: RbpA 1-79, RbpA 1-92 (residues 1-92), and RbpA 24-111 (residues 24-111) for interaction with the σ factor by size exclusion chromatography experiments (data not shown). The truncation site at Glu92 was chosen to remove the predicted amphipathic C-terminal helix but retain residues 79-89, which are almost invariant among RbpA homologues and include Trp82, which is thought to be involved in σB binding based on the minimal shift assay. The results obtained are summarized in Fig. 9. From this set of experiments, it is evident that residues 1-24 of RbpA are not essential for the interaction with His6-σB 1-228, but the whole C-terminal region is required for the formation of a stable complex with His6-σB 1-228.
Complementation of an RbpA Conditional Mutant Strain in M. tuberculosis-The gene encoding RbpA was predicted to be essential for the growth of M. tuberculosis (17), and essentiality was confirmed recently by constructing a strain where the expression of RbpA is under the control of the antibiotic pristinamycin I (PI) (18). This strain exhibits a conditional lethal phenotype when PI is withdrawn (18). To confirm that the lethal phenotype is due to the lack of RbpA expression and to investigate the domains of RbpA, we complemented this mutant strain with RbpA or RbpA 1-79 expressed from the pMV306 plasmid vector. Ectopic expression of full-length RbpA led to complementation, as measured by growth on agar lacking PI (Fig. 10). By contrast, expression of RbpA 1-79 did not complement the conditional mutant, suggesting that residues 80-111 are required for the physiological function of RbpA.

DISCUSSION
The essential functions of RbpA are important in understanding the control of bacterial transcription, characterizing the target of a frontline anti-tuberculosis drug, and identifying new vulnerabilities that could be exploited for future drug development. Although the interaction between RbpA and RNAP was reported in 2001 (10) and the influence of RbpA expression levels on rifampicin sensitivity was reported in 2006 (13), the mechanisms by which RbpA influences transcription and rifampicin sensitivity are still unclear. Our results showing that RbpA can bind directly to σA support the proposal by Hu et al. (11) that RbpA may promote open complex formation and influence the competition between sigma subunits for binding to RNAP. The fact that RbpA can bind to both RNAP and σA provides the simplest mechanism by which RbpA could stabilize the holo-RNAP. We have shown that RbpA can form a tight complex with σ subunits alone, whereas previous studies have shown that RbpA can bind apo-RNAP (11, 14), but evidence is lacking over the order of binding. Previous studies, using RbpA labeled with a chemical cross-linker (12, 14) or a chemical protease (11), have shown that RbpA binds to the β subunit of RNAP but disagree about the likely binding site. Given the structural flexibility of the termini of RbpA and the fact that σB interaction occurs through these termini, it is not possible to propose which, if any, of the previously proposed binding sites might be compatible with RbpA binding to the β and σ subunits simultaneously. Although Hu et al. (11) postulate that RbpA could bind RNAP as a dimer, there are conflicting data on the likely oligomerization state of RbpA and its homologues. The retention during size exclusion chromatography was the reason that RbpA (11) and its homologue from S. coelicolor (13) were proposed to be homodimers, whereas the same method led Dey et al. (14) to propose that the homologue from M. smegmatis is monomeric.
These homologues share high amino acid identity with RbpA (55 and 92%), and each is also predicted to have an amphipathic helix at the C terminus (not shown). We suggest that the apparent disagreement in molecular size could be due to relatively low affinity homo-oligomerization, with a KD close to the range of protein concentrations we used (3-200 μM) so that the oligomerization state depends on the concentration (Fig. 1). Given the homo-oligomerization observed for RbpA in vitro, a positive result might have been expected for the protein fragment complementation assay with E. coli expressing Ntrp-RbpA and RbpA-Ctrp (Fig. 6). There are many possible reasons for the observed negative result, such as the low expression levels of fusion proteins in this system or the influence of fusion to Trp1p fragments.

A, contact surface views of the central domain colored according to electrostatic potential. Neutral areas are shown in white, and areas of significant charge are shown in red (negative) and blue (positive). A dashed line is used in the left panel to locate the conserved patch. B, the same views are colored according to amino acid conservation among homologues with the least conserved residues in blue, scaling to completely conserved residues in orange, and invariant residues in red. Conservation scores were taken from Jalview (blue, 0-3; cyan, 4-6; yellow, 7-9; orange, 10; and red, 11).

FIGURE 6. Evidence for RbpA interaction with σB from the Split-Trp protein fragment complementation assay. Tryptophan auxotrophic E. coli were transformed with pairs of plasmids, serially diluted, and plated on minimal medium in the presence or absence of tryptophan. The interaction between RbpA and σB led to reconstitution of the Trp1p enzyme and hence the ability of the strain to grow on medium lacking tryptophan. The strain expressing Ntrp-σB plus σB-Ctrp is a negative control because σB is not thought to form dimers or multimers. Tryptophan-containing medium is a control to demonstrate that all strains are viable.
The results of size exclusion chromatography of the complex between RbpA and σB 1-228 indicate that it is likely to have a 1:1 stoichiometry. Together with the established 1:1 stoichiometry of sigma subunit binding to RNAP, this leads us to suggest that a single RbpA binds per RNAP. Whether oligomerization of RbpA occurs in vivo would depend on its concentration, as well as the local environment, and competition with other (non-self) binding partners. In our hands, there was evidence of oligomerization at all concentrations tested (≥3 μM). The intracellular concentration of RNAP in bacteria depends on growth phase and has been estimated at ~2.5 μM (59). Therefore, we cannot rule out a role for RbpA multimerization in binding RNAP or indeed for other functions, for example protection from non-specific interactions, as reported for the anti-sigma factor AsiA (60).
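The concentration dependence of self-association discussed above can be made concrete with a minimal sketch. It assumes the simplest possible model, a monomer-dimer equilibrium (2M ⇌ D) with KD = [M]²/[D]. The oligomeric form of RbpA is not established here, and the KD of 50 μM used in the example is a hypothetical value chosen only because it falls within the 3-200 μM range of concentrations tested.

```python
import math


def dimer_fraction(total_uM, kd_uM):
    """Fraction of protein chains present in dimers for 2M <-> D.

    Mass balance: total = [M] + 2[D], with [D] = [M]^2 / KD, which gives
    2[M]^2 + KD*[M] - KD*total = 0. The positive root is the free monomer.
    """
    monomer = (-kd_uM + math.sqrt(kd_uM ** 2 + 8.0 * kd_uM * total_uM)) / 4.0
    return 1.0 - monomer / total_uM


# Hypothetical KD (μM); an assumption for illustration, not a measured value.
KD_ASSUMED = 50.0

for total in (3.0, 50.0, 200.0):
    fraction = dimer_fraction(total, KD_ASSUMED)
    print(f"{total:6.1f} uM total: {fraction:.2f} of chains in dimers")
```

Under this model, half of the chains are dimeric when the total concentration equals KD, and the dimeric fraction falls steadily as the protein is diluted. A protein with a KD in this range would therefore appear partly oligomeric across the concentrations used here, while its state at the ~2.5 μM estimated for RNAP would depend strongly on the true KD.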
The direct interaction between RbpA and σA not only provides a mechanism for stabilizing the holo-RNAP, but may also explain the observed sigma factor/promoter selectivity of RbpA transcription activation. The identification of a direct interaction with σB potentially broadens the function of RbpA to the σB regulon, which includes transcription factors and genes related to cell envelope stress. It is attractive to postulate a role for RbpA in activating σB-driven transcription, because both RbpA and σB are up-regulated under multiple stress conditions. Here we established that RbpA binds to both σA and σB, which are highly homologous in the binding region (σB residues 1-228; 64% amino acid identity). Of the remaining sigma factors in M. tuberculosis, the stress sigma factor σF shares the most similarity with the binding region (31% amino acid identity). However, σF-driven transcription is not activated by RbpA (11), and this would suggest that RbpA binding is restricted to σA and σB and that its activity is restricted to σA- and σB-driven transcription.
We have recorded and analyzed NMR experiments for RbpA 1-79 and full-length RbpA, concluding that there are only minor spectral changes between the two forms (Fig. 3) despite the change in oligomerization state. Because chemical shifts are extremely sensitive to the chemical environment, the similarity between spectra indicates that the structure of the first 79 residues of RbpA is independent of the C-terminal region. We suggest that RbpA consists of three regions: residues 1-26, which form a flexible random coil; residues 26-67, which form a stably folded central domain; and residues 80-111, the predicted helical region, which is connected to the central domain by an unstructured linker. Residues 1-79 are not involved in the oligomerization of RbpA, because the chemical shifts for these residues in oligomeric RbpA are essentially unchanged in monomeric RbpA.
To characterize the binding site on RbpA for sigma factors, we compared the spectra of RbpA with those of the RbpA·σB 1-228 complex. This analysis, summarized in Fig. 8, shows that the central domain of RbpA is not directly involved in the formation of the complex with σB 1-228, but the N and C termini are clearly affected by binding. Because it was not possible to assign the NMR signals for the C-terminal region of RbpA, we cannot map the precise residues involved. It is not unusual for conformational averaging caused by protein flexibility to occur at high affinity protein-protein interaction sites; indeed there are many documented examples (52, 61-64). The N and C termini are on the same side of the structured core (Fig. 4A), suggesting that they are in close proximity. Therefore, both ends could take part in the interaction with the σ subunit. A series of deletion experiments (Fig. 9) shows that the whole C-terminal region is essential for a tight interaction with the sigma factor. Somewhat surprisingly, given the shifts in signals from residues in the N terminus, the first 23 amino acids are not required for RbpA to bind σB but may stabilize the complex. It is also possible that the conserved N terminus may play other roles in the function of RbpA.
Although the structural core of RbpA does not appear to form part of the interaction surface with σB, there is a significant hydrophobic patch on the surface, bordered by conserved charged residues (Fig. 5), which could form a putative interaction surface, perhaps for RNAP. Notably, the residues that form this patch are highly conserved in all homologues, including Trp54, which is invariant in RbpA homologues, but not conserved throughout the structural homologues.

Asterisks indicate peaks that are affected by His6-σB 1-228 binding, and an arrow shows the shift for the Ser15 peak. C, combined minimal shift changes observed comparing the TROSY-HNCO spectra described. Assigned resonances for RbpA that cannot be detected in the complex are marked with a red circle (•), whereas RbpA residues for which the assignment in the TROSY-HNCO spectrum is missing are indicated with a blue square (■).

A check indicates that the proteins formed a complex, a cross indicates that no interaction was detected, and a dash indicates that the combination was not tested.

FIGURE 10. RbpA, but not RbpA 1-79, complements the conditional lethal phenotype of the conditional RbpA mutant strain of M. tuberculosis. Liquid cultures of the conditional mutant strain transformed with pMV306, pMV306-RbpA, or pMV306-RbpA 1-79 were diluted (from 10- to 10⁶-fold) and grown on 7H10 agar with or without PI. The plates were placed on a white light box for photography. Plasmid pMV306-RbpA, but not pMV306-RbpA 1-79, complemented the conditional lethal phenotype, allowing growth in the absence of PI. The control strain carrying unmodified plasmid pMV306 did not grow in the absence of PI. All strains grew in the presence of PI.
With the work of Hu et al. (11), RbpA became recognized as part of a repertoire of small RNAP-binding proteins that modulate transcription. Here we establish that RbpA binds not only to RNAP, but also to the sigma factors σA and σB. In E. coli, Crl is also thought to be able to bind to RNAP and to a sigma factor (65-67). Crl is thought to perform a chaperone-like function, influence competition between sigma factors, and promote open complex formation. Potentially there may be similarities with the function of RbpA, but there are significant differences. First, RbpA binds to the principal sigma factor and principal-like sigma factor, whereas Crl binds to the alternative stress-responsive sigma factor σS (equivalent to σF of M. tuberculosis). Second, Crl binds to σS and RNAP with low affinity, and it is not able to compete for interaction of σS with other binding partners, such as RssB, which targets σS to the proteasome (66). Stoichiometric co-purification of RbpA with σB indicates a tight complex (KD < 10⁻⁶ M), and so it is possible that RbpA may affect sigma factor stability in cells or promote/prevent interactions with additional ancillary proteins. Lastly, Crl is nonessential, whereas a loss of RbpA leads rapidly to loss of growth and viability of M. tuberculosis (18), although the essential nature of RbpA function has not yet been identified clearly. σB is dispensable for growth in vitro and in infection models (68), so the likely essential function could be related to the activation of transcription driven by essential σA. Our results with the conditional mutant of M. tuberculosis (Fig. 10) support the importance of direct interaction with sigma subunits as part of the essential function of RbpA because truncated RbpA 1-79 that is unable to bind sigma subunits is also unable to complement the conditional mutant.
An interesting feature of RbpA is that it is conserved only in actinomycetes. This information, together with its essentiality in M. tuberculosis, makes RbpA a potential anti-tuberculosis drug target (11,15). In the last few years, significant progress has been made toward characterization of RbpA, and this work contributes a structure and new interaction partners to shed light on the function and mechanism of action.