Crystal Structure of the Lysine Riboswitch Regulatory mRNA Element

Riboswitches are metabolite-sensitive elements found in mRNAs that control gene expression through a regulatory secondary structural switch. Along with regulation of lysine biosynthetic genes, mutations within the lysine-responsive riboswitch (L-box) play a role in the acquisition of resistance to antimicrobial lysine analogs. To understand the structural basis for lysine binding, we have determined the 2.8 Å resolution crystal structure of lysine bound to the Thermotoga maritima asd lysine riboswitch ligand-binding domain. The structure reveals a complex architecture scaffolding a binding pocket completely enveloping lysine. Mutations conferring antimicrobial resistance cluster around this site as well as highly conserved long range interactions, indicating that they disrupt lysine binding or proper folding of the RNA. Comparison of the free and bound forms by x-ray crystallography, small angle x-ray scattering, and chemical probing reveals almost identical structures, indicating that lysine induces only limited and local conformational changes upon binding.

Small non-protein-coding RNAs and mRNA sequences play a central role in cellular regulatory processes and are involved in virtually every aspect of the maintenance and transmission of genetic information. One prevalent form of riboregulation in bacteria is the riboswitch; at least 4% of all genes in Bacillus subtilis and related species are controlled in this fashion (1). This non-protein-coding element exerts genetic control in a cis-fashion by interacting with a cellular metabolite, thereby directing formation of one of two mutually exclusive mRNA secondary structures (reviewed in Ref. 2). Depending upon their placement within the mRNA, riboswitches control transcription or translation in bacteria (3) and alternative splicing or mRNA stability in eukarya (4,5).
Currently, there are at least 20 distinct families of riboswitches that recognize a diverse set of metabolites including nucleobases, sugars, vitamin cofactors, amino acids, and metal ions (2). The lysine-binding riboswitch is of particular interest for several reasons. First, although in vitro selection methods are capable of generating aptamers to an equally diverse set of compounds (6), one of the few molecules for which an aptamer has failed to be raised is lysine (7). This suggests that RNA may require a complex architecture for recognition of this otherwise simple amino acid. Second, the lysine riboswitch has been the focus of studies involving the potential of riboswitches as targets of antimicrobial agents (8-10) because resistance to lysine analogs such as S-(2-aminoethyl)-L-cysteine (AEC; see Fig. 1A) in Escherichia coli and B. subtilis is the result of mutations within the lysine riboswitch that regulates the lysC gene (see Fig. 1B, highlighted in blue) (11,12).
A recent study of AEC resistance in E. coli uncovered a mechanism that implicates lysyl tRNA-synthetase (LysRS) as the primary target of this compound (10). The toxic effects of AEC are a result of its incorporation into proteins in the place of lysine due to the inability of LysRS to discriminate between the two compounds. Mutations in the lysine riboswitch confer resistance to AEC because they result in a loss of lysine-dependent regulation of key lysine biosynthetic enzymes, increasing the intracellular lysine concentration. This allows lysine to effectively outcompete AEC for binding to LysRS, alleviating its toxic effects (10). Thus, development of new effective lysine analog antimicrobials will require targeting both LysRS and the lysine riboswitch (10). In the current study, we have solved the crystal structure of the lysine riboswitch in complex with lysine, revealing the basis for recognition of both the cognate ligand and the antimicrobial analogs, and provided insights into how mutations in the riboswitch might confer AEC resistance through two different means.

EXPERIMENTAL PROCEDURES
RNA Preparation and Crystallization-A 161-nucleotide double-stranded DNA coding for the Thermotoga maritima lysine riboswitch aptamer domain controlling the asd gene was constructed by PCR using overlapping oligonucleotides. The RNA was transcribed and purified using previously published techniques (13). The refolded RNA was then exchanged into 10 mM Na-HEPES, pH 7.0, 5 mM MgCl2, and 2 mM lysine before storage at 4°C. For the free state, RNA was refolded in the lysine-supplemented buffer, exchanged three times into 10 mM Na-HEPES, pH 7.0, 5 mM MgCl2, followed by overnight dialysis into 1 liter of lysine-free buffer. The final concentration was determined by absorbance at 260 nm (ε = 1,570,000 M⁻¹ cm⁻¹, molecular weight = 52,433 g mol⁻¹). RNA was stored at 4°C until use.
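The absorbance-based concentration estimate follows the Beer-Lambert relation, c = A260 / (ε · l). A minimal sketch of that arithmetic is given below, using the extinction coefficient and molecular weight quoted above. The 1-cm path length, the dilution factor, and the absorbance reading are illustrative assumptions, not values reported here.

```python
# Beer-Lambert estimate of RNA concentration from absorbance at 260 nm.
# EPSILON_260 and MOLECULAR_WEIGHT come from the text; everything else
# (path length, dilution factor, example reading) is a hypothetical input.

EPSILON_260 = 1_570_000.0    # M^-1 cm^-1
MOLECULAR_WEIGHT = 52_433.0  # g mol^-1


def rna_concentration(a260, path_cm=1.0, dilution=1.0):
    """Return (molar concentration in M, mass concentration in mg/ml)."""
    molar = a260 * dilution / (EPSILON_260 * path_cm)
    mg_per_ml = molar * MOLECULAR_WEIGHT  # g/L is numerically equal to mg/ml
    return molar, mg_per_ml


# Hypothetical reading: A260 = 0.785 measured on a 100-fold diluted sample.
molar, mg_ml = rna_concentration(a260=0.785, dilution=100.0)
print(f"{molar * 1e6:.1f} uM, {mg_ml:.2f} mg/ml")
```

Keeping the molar and mass concentrations together is convenient because the crystallization and SAXS sections quote RNA amounts in different units.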
The riboswitch was crystallized by the hanging drop vapor diffusion method in the presence of 1 mM lysine or in the absence of lysine for the free state crystals. Drops were set up by mixing 1 μl of RNA with 1 μl of a mother liquor solution consisting of 2 M Li2SO4, 5 mM MgCl2, 10 mM Na-HEPES, pH 7.0, and 60 mM iridium hexammine to obtain the heavy atom derivative crystals. Identical conditions were used to grow the free state crystals except that no iridium hexammine was used in the mother liquor. Crystals were obtained within 24 h and required no additional cryoprotection agent; they were looped with 0.2-0.3-mm loops and flash-frozen in liquid nitrogen before data collection.
Data Collection-Data for the bound state iridium hexammine derivative crystal were collected on beamline X29A at the Brookhaven National Synchrotron Light Source using x-rays at the iridium absorption peak. These data were integrated and scaled using HKL2000 (14). All data used in phasing and refining came from a single crystal. Data for the unliganded structure were collected using CuKα radiation (1.5418 Å) on an R-AXIS IV++ home source (Rigaku MSC), and the data were indexed and scaled using D*TREK (15).
Phasing and Structure Determination-Phases were determined by single wavelength anomalous diffraction using data extending to 2.8 Å. SHELXD (16) was used to find three iridium heavy atom sites within the asymmetric unit that had reasonably high occupancy. These heavy atom sites were used to calculate phases in SHELXE (17). The resulting experimental density map, following density modification (0.5 solvent fraction), displayed clear features corresponding to RNA backbone and base pairing (supplemental Fig. S1). This map was used for initial building of the model.
The model was built in Coot (18) and refined in PHENIX (19) using iterative rounds of building and refinement. The RNA nucleotides were initially built along with four iridium hexammine molecules. This model was brought through multiple rounds of simulated annealing and atomic displacement factor refinement before building lysine into the model. At this point, the density for the entire ligand was clearly visible and was validated by inspection using a simulated annealing omit map in which the ligand and a few surrounding nucleotides were omitted from the model (supplemental Fig. S2). One round of water picking was carried out by the PHENIX ordered solvent protocol; waters were chosen based on peak size in an Fo − Fc map. Rfree was monitored in each round to ensure that it was dropping. Figures were prepared using PyMOL (20). The unliganded RNA model was built using the bound form as a molecular replacement solution using only the RNA and re-refined using iterative rounds of simulated annealing and atomic displacement factor refinement. The final refinement statistics are shown in supplemental Table S1, and the structure factors and models have been deposited in the Protein Data Bank (accession codes 3D0U and 3D0X).
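For readers less familiar with the statistic being monitored, the conventional crystallographic residual is R = Σ| |Fo| − |Fc| | / Σ|Fo|, evaluated over the working reflections for Rwork and over a held-out test set for Rfree. The sketch below illustrates that bookkeeping only. It is not the PHENIX implementation, it omits the scale factor normally applied between observed and calculated amplitudes, and the 5% test-set fraction is an assumption (the fraction used for this structure is not stated here).

```python
import random


def r_factor(f_obs, f_calc):
    """Conventional residual: sum(| |Fo| - |Fc| |) / sum(|Fo|)."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def split_work_free(n_reflections, free_fraction=0.05, seed=0):
    """Randomly flag a test set; returns (work_indices, free_indices).

    free_fraction=0.05 is an assumed, commonly used value.
    """
    rng = random.Random(seed)
    free = [i for i in range(n_reflections) if rng.random() < free_fraction]
    free_set = set(free)
    work = [i for i in range(n_reflections) if i not in free_set]
    return work, free


def r_work_and_r_free(f_obs, f_calc, work, free):
    """Evaluate the residual separately on working and test reflections."""
    r_work = r_factor([f_obs[i] for i in work], [f_calc[i] for i in work])
    r_free = r_factor([f_obs[i] for i in free], [f_calc[i] for i in free])
    return r_work, r_free
```

Because the test reflections are never used in refinement, a falling Rfree indicates that model changes improve agreement with data the model has not been fitted to.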
Chemical Probing Using Selective 2′-Hydroxyl Acylation Analyzed by Primer Extension Chemistry-RNA sequences were constructed to correspond to the T. maritima RNA that was crystallized as well as the riboswitch controlling the B. subtilis lysC gene (21). The B. subtilis sequence was truncated in the P5 region to match the length of the T. maritima sequence to ensure that the RNAs are comparable, and the 5′- and 3′-structure cassettes were appended to these sequences as described previously (22). RNA was generated by run-off transcription and purified according to the same protocol used to generate RNA for the crystallographic studies.
RNA was prepared for modification by placing 1 μl of 2 μM RNA (2 pmol) into 11 μl of 0.5× Tris EDTA buffer. This sample was heated/cooled to allow the RNA to refold and then supplemented with 6 μl of buffer consisting of 333 mM K-HEPES, pH 8.0, and 333 mM NaCl. This buffer was supplemented with 2 mM lysine for the plus ligand reactions (final concentration of 667 μM), and MgCl2 was included in the folding buffer at concentrations ranging from 6.8 mM to 425 μM in 2-fold dilutions to yield the concentrations shown in the magnesium titration experiments. All reactions were supplemented with 1 μl of 3 mM Mg2+ to ensure proper extension by the polymerase during the reverse transcription step. Modifications were carried out using 30 mM NMIA for the recommended five half-lives at 20°C; reverse transcription and gel running procedures were performed as described previously (22).
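The final concentrations in the probing reaction follow from simple dilution of the 6-μl folding buffer into the 18-μl refolded sample (1 μl RNA + 11 μl Tris EDTA + 6 μl buffer). The sketch below reproduces that arithmetic from the volumes and stock concentrations given above. It assumes that the NMIA addition and the later 1-μl Mg2+ supplement are not counted in the folding volume, which is an interpretation rather than something stated in the protocol.

```python
# Dilution arithmetic for the probing reaction, using volumes from the text.
# Assumption: only the RNA, Tris EDTA, and folding buffer volumes define the
# folding volume; later additions are ignored here.

RNA_UL = 1.0
TE_UL = 11.0
FOLDING_BUFFER_UL = 6.0
TOTAL_UL = RNA_UL + TE_UL + FOLDING_BUFFER_UL  # 18 ul


def final_conc(stock, added_ul=FOLDING_BUFFER_UL, total_ul=TOTAL_UL):
    """Concentration after dilution, in the same units as `stock`."""
    return stock * added_ul / total_ul


def two_fold_series(top, bottom):
    """Two-fold dilution series from `top` down to `bottom`, inclusive."""
    series = []
    conc = top
    while conc >= bottom * 0.999:  # small tolerance for rounding
        series.append(conc)
        conc /= 2.0
    return series


lysine_final_mm = final_conc(2.0)    # 2 mM lysine in the folding buffer
nacl_final_mm = final_conc(333.0)    # 333 mM NaCl in the folding buffer
mg_stocks_mm = two_fold_series(6.8, 0.425)
mg_final_mm = [final_conc(c) for c in mg_stocks_mm]

print(f"lysine: {lysine_final_mm * 1000:.0f} uM, NaCl: {nacl_final_mm:.0f} mM")
print("MgCl2 (mM):", ", ".join(f"{c:.2f}" for c in mg_final_mm))
```

Under these assumptions the lysine value matches the 667 μM final concentration quoted in the protocol, and each magnesium stock is diluted threefold in the folding reaction.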
Small Angle X-ray Scattering (SAXS) Data Collection and Analysis-The B. subtilis lysC riboswitch was prepared for SAXS analysis using an Ettan LC liquid chromatography system configured with a Superose 6 PC 3.2 size-exclusion column (GE Healthcare), an in-line vacuum degasser, and a 0.01-μm filter. Three distinct sample buffer conditions were used and are referred to as native-bound, bound, and unfolded containing either 2 mM lysine and 5 mM MgCl2 or 2 mM EDTA, respectively. All sample buffer conditions contained 20 mM Na-HEPES, pH 6.5, and 50 mM KCl. Prior to gel filtration purification of the sample, the column was equilibrated with the appropriate buffer. Likewise, RNAs were refolded as described above at 5 μM and concentrated to 50 μl to a final concentration of 5 mg/ml. Each purification utilized 50 μl of sample, and a fraction corresponding to the major eluting peak was taken for direct SAXS experiments.
SAXS data were collected at the synchrotron beamline 12.3.1 of the Advanced Light Source (Berkeley, CA). All scattering data were collected at 1.0332 Å using a minimal sample volume of 18 μl. A full scattering curve was measured as two separate exposures at 6 and 60 s for the sample and buffer. The x-ray scattering due to the riboswitch RNA was determined by subtracting the background x-ray scattering of the gel filtration buffer from that containing the RNA and buffer. Data were reduced and combined with Primus to produce the final x-ray scattering curve (23). Data were collected over a range of RNA concentrations (0.7-0.1 mg/ml). No concentration-dependent changes were observed in the lowest scattering angles.
The radius of gyration (Rg), which describes the mass distribution of a particle around its rotational center of mass, was determined either by using the Guinier approximation (Guinier Rg) (24) within the angular range (q) of q·Rg < 1.3 or by calculating the electron pair-distribution function (Real Space Rg) in Gnom (23,25). All final plots were prepared with KaleidaGraph.
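The Guinier approximation states that at low angles ln I(q) ≈ ln I(0) − (Rg² q²)/3, so Rg can be read from the slope of ln I(q) against q². The sketch below shows one way to apply the q·Rg < 1.3 limit, by refitting until the fitted range is self-consistent. It is an illustration in plain Python, not the Primus or Gnom procedure used here, and the function names, starting window, and iteration scheme are assumptions.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def guinier_rg(q, intensity, q_rg_max=1.3, n_start=10, max_iter=50):
    """Estimate (Rg, I0) from ln I = ln I0 - (Rg^2 / 3) q^2.

    q must be sorted in ascending order and intensity must be positive.
    The fit window is updated until it contains exactly the points with
    q * Rg < q_rg_max (or max_iter is reached).
    """
    n_use = min(n_start, len(q))
    rg, i0 = None, None
    for _ in range(max_iter):
        xs = [qi * qi for qi in q[:n_use]]
        ys = [math.log(ii) for ii in intensity[:n_use]]
        slope, intercept = linear_fit(xs, ys)
        if slope >= 0:
            raise ValueError("non-negative Guinier slope; cannot derive Rg")
        rg = math.sqrt(-3.0 * slope)
        i0 = math.exp(intercept)
        n_new = max(3, sum(1 for qi in q if qi * rg < q_rg_max))
        if n_new == n_use:
            break
        n_use = n_new
    return rg, i0


# Synthetic, noise-free example (not experimental data): Rg = 30 A, I0 = 100.
q_values = [0.005 + 0.001 * i for i in range(100)]
i_values = [100.0 * math.exp(-((qi * 30.0) ** 2) / 3.0) for qi in q_values]
rg_est, i0_est = guinier_rg(q_values, i_values)
```

With real data the low-q points are also the ones most affected by aggregation or interparticle effects, which is why concentration dependence at the lowest angles is checked before trusting the fit.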

RESULTS AND DISCUSSION
Crystallization and Structural Determination-To understand the basis for lysine recognition and AEC resistance, we have solved the structure of a riboswitch that controls the expression of the T. maritima β-aspartate semialdehyde dehydrogenase gene (asd). This RNA, which is centered about a conserved five-way junction motif, contains all of the nucleotides whose identity is >90% conserved across phylogeny (Fig. 1B, red) (3). An iridium hexammine derivative yielded sufficiently high quality data from which an experimental electron density map could be calculated (supplemental Fig. S1). Data collection and refinement statistics for both liganded and unliganded structures that include all 161 nucleotides are presented in supplemental Table S1 (for the bound structure, final Rwork = 18.2%, Rfree = 20.9%). The 2.8 Å resolution structure of the RNA-lysine complex agrees well with previous genetic, biochemical, and phylogenetic analysis of the RNA (21,26,27). The global architecture of the RNA comprises three sets of coaxially stacked helices (P1-P2/2a, P2b-P2b/3-P3, and P4-P5) arranged roughly parallel to one another (Fig. 1C), a mode of helical organization common in larger RNAs (28). At the center of this fold is the five-way junction containing the majority of the nucleotides with >90% phylogenetic conservation, in which a single lysine is observed wedged between helix P1 and the J2/3 joining region.
Overall Structure-The tertiary architecture of the RNA is dominated by formation of a three-helix bundle structure composed of the P2, P3, and P4 helices (Fig. 1C) stabilized via interactions mediated by their terminal loops. A kissing loop interaction is observed between the terminal loops of P2 and P3 that was identified as important for the ability of the B. subtilis lysC riboswitch to efficiently terminate transcription (29). Unlike other structurally characterized kissing loop interactions, there is a stacking interaction between G40 and U91, which are oriented perpendicular to the P2b/3 helical axis. These two bases form hydrogen-bonding interactions with the major groove of the central four base pairs of the P2b/3 helix. This additional dinucleotide "staple" may constitute an adaptation for function at elevated physiological temperatures; similar observations were made in the selection of thermophilic ribozymes, where mutations that add new tertiary interactions or further stabilize existing ones are responsible for adaptation to elevated temperatures (30).
The ability of the two terminal loops to interact is achieved by a ~120° bend at J2a/2b using a novel internal loop motif. In the majority of other lysine riboswitches, this turn is achieved by the structurally similar canonical kink-turn motif (31). Thus, although the majority of the aptamer domain is highly conserved, some elements of the peripheral region of the lysine riboswitch have evolved unique solutions to the stabilization of a common global architecture, reflecting the modular nature of RNA structure in general (32).
The second element stabilizing the three-helix bundle is an interaction between the terminal pentaloop of P4 and an internal loop motif adjacent to the sarcin/ricin motif between P2 and P2a (33). The pentaloop of P4 forms a structure similar to a standard GNRA tetraloop motif by flipping out a uridine residue (U125). Rather than docking with another helix using the sugar edge of the three stacked adenosine residues, as commonly observed for most tetraloop-mediated interactions (34), the adenine bases interact with the minor groove of P2 using their Watson-Crick faces. Unusually, A123 forms the central base of a U21·A123·G65 base triple that anchors the interaction.
Lysine Recognition-The ligand-binding pocket is contained within the core of the five-way junction motif, sitting between the P1 helix and J2/3, and is flanked by the first base pairs of the P2 and P4 helices (Fig. 1D). The carboxylate group of lysine forms a set of hydrogen bonds with the N2 amino groups of the G111·U137 wobble pair, the G9-C76 Watson-Crick pair, and the 2′-hydroxyl group of G8. Further contacts to the N3 and O2′ atoms of G111 are made by the α-amino group of lysine. The ε-amino group of lysine is recognized by a combination of electrostatic and hydrogen-bonding interactions within a pocket that places it close to the non-bridging phosphate oxygen of G77 along with the O4′ oxygen atom of the ribose sugar. The relatively small size of the ε-amino pocket near G77 precludes efficient recognition by homoarginine and N6-trimethyl-L-lysine (8,21). This is likely the basis for discrimination between the related metabolites lysine and diaminopimelate.
Discrimination between lysine and other closely related compounds is further achieved through indirect recognition of the methylene linker of the side chain. The lysine side chain is bound in an extended conformation that allows it to span the two sites of interaction of the polar atoms, consistent with the ability of a lysine analog that contains a trans-double bond between the γ- and δ-carbons to productively bind this riboswitch (8). As a result, compounds containing shorter or longer side chains (L-ornithine and L-α-homolysine, respectively) are not efficiently bound because their side chain is of the incorrect length to allow the proper contacts between all of the polar atoms of lysine and the RNA. The hydrophobic methylene groups are primarily contacted through stacking interactions with G77, A78, and the G8·G152 pair.
Despite being a critical component of proper recognition, the methylene groups of lysine are not tightly packed against the RNA. The loose packing around lysine explains the ability of antimicrobial lysine analogs containing modifications at the γ-position (Fig. 1A), such as L-3-[(2-aminoethyl)-sulfonyl]-alanine, L-4-oxalysine, and AEC, to bind reasonably well to the riboswitch (8,21). Similarly, the thiamine pyrophosphate riboswitch only moderately contacts the central thiazole ring of thiamine pyrophosphate (35,36). In each case, moieties recognized through indirect readout, which are generally the hydrophobic groups, are modified to yield riboswitch-binding antimicrobial agents (37).
Ligand-dependent Conformational Changes in the RNA-Lysine is completely buried within the five-way junction (100% solvent-inaccessible), indicating that there is some form of folding event concurrent with binding. To test the hypothesis that lysine-dependent conformational changes occur within the riboswitch, we crystallized the RNA in the absence of lysine and determined its structure. The RNA crystallized under the same conditions and in the same space group. The resulting structure is nearly identical to the complexed form (supplemental Fig. S3A), with only minor differences between the two structures in the positioning of the 5′-side of the P1 helix (supplemental Fig. S3B). This finding reveals that the global architecture can be formed in the absence of ligand. Close examination of the binding pocket shows that positioning of some of the nucleotides is perturbed by 2-3 Å, but the overall pattern of base interactions remains the same (Fig. 2A).
The above result, however, could potentially be an artifact of the crystal lattice where crystallization induces a conformation that does not exist in solution. To further explore the potential similarity between the bound and unbound lysine riboswitch in solution, we probed backbone flexibility using selective 2′-hydroxyl acylation analyzed by primer extension chemistry to monitor local changes in the RNA (22) and SAXS to monitor global changes. To ensure that the observed behavior is a general feature of the L-box, we probed both the T. maritima asd and the B. subtilis lysC RNAs at varying magnesium concentrations in the absence and presence of 667 μM lysine (Fig. 2B and supplemental Figs. S4 and S5). In particular, we wished to probe these RNAs under near physiological salt concentrations (100 mM NaCl, 0.5-2 mM MgCl2), as opposed to the high salt conditions of the crystals (2 M Li2SO4, 5 mM MgCl2, 60 mM iridium hexammine). These experiments reveal two general trends in the response of the L-box to magnesium and ligand. First, both RNAs exhibit clear changes in reactivity upon the addition of magnesium. These changes are centered on J2a/2b and tertiary interactions involving the terminal loops of P2, P3, and P4. For example, the conserved uridine in L4 becomes reactive in both RNAs at 2 mM magnesium (Fig. 2B), reflecting its exposure to the solvent in the folded structure. Second, both RNAs show limited lysine-dependent changes localized principally to the five-way junction, primarily J2/3, the base of P5, and the 3′-side of the P1 helix (Fig. 2B). It is noteworthy that at high magnesium concentrations (>1 mM), lysine does not appear to significantly affect the structure in the B. subtilis variant; lysine-dependent effects are most pronounced near physiological magnesium concentrations (0.2-0.5 mM). These data are in agreement with both in-line probing of the B. subtilis lysC riboswitch (8,21) and an observed magnesium-dependent fluorescence change of a 2-aminopurine label-incorporated riboswitch at the equivalent position to A153 in the five-way junction (38). Together, these data indicate that the structure of the L-box can be induced by magnesium and is nearly native in the absence of lysine.
FIGURE 2 (legend, partial). C, small angle x-ray data corresponding to free (EDTA, gray; magnesium, orange) and lysine-bound (green). The left panel shows the experimental electron pair distribution plot where the x-intercept reflects the most likely maximum intermolecular scattering distance (Dmax). The right panel is a Kratky plot that reflects the extent of unfoldedness of the macromolecule (40). D, map of mutations (cyan) conferring AEC resistance onto the lysine riboswitch structure.
To further validate the chemical probing data, we analyzed the B. subtilis lysC riboswitch by SAXS. This solution method is sensitive to the global conformation of a macromolecule and provides a direct measure of the radius of gyration (Rg), maximum dimension (Dmax), and electron-pair distribution function of the macromolecule (39). The addition of 5 mM magnesium in the absence of lysine induces a significant compaction of the RNA (Fig. 2C, left; supplemental Fig. S6). This compact form is not significantly altered by the inclusion of 2 mM lysine, as anticipated. More importantly, the electron-pair distribution plots describing both the bound and the unbound forms of the riboswitch in 5 mM magnesium are identical. The maximum dimension of the RNA in the presence of magnesium, regardless of lysine (108-110 Å; supplemental Table S2), is consistent with the crystal structure after taking into account the first solvation shell (~108 Å). In the absence of magnesium, the riboswitch is weakly folded, as illustrated by the hyperbolic feature of the Kratky plot (Fig. 2C, right). In the presence of magnesium, the Kratky plot converges rapidly at small scattering angles, reflecting a well folded molecule (40). Together with the chemical probing data, these results indicate that this riboswitch can adopt a near native form without lysine in solution with the aid of magnesium. Thus, lysine likely gains access to the binding pocket through small, localized fluctuations in the RNA structure, mostly centered within J2/3.
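A Kratky plot displays q²·I(q) against q, which makes the contrast between folded and unfolded states easy to see. The sketch below uses two textbook idealizations rather than the riboswitch data: a Guinier-type Gaussian intensity as a stand-in for a compact particle, and the Debye function for a Gaussian chain as a stand-in for an unfolded polymer. Both models and the chosen Rg of 30 Å are assumptions made purely for illustration.

```python
import math


def compact_intensity(q, rg):
    """Guinier-type Gaussian intensity, an idealized compact particle."""
    return math.exp(-((q * rg) ** 2) / 3.0)


def debye_chain_intensity(q, rg):
    """Debye function for a Gaussian chain, an idealized unfolded polymer."""
    x = (q * rg) ** 2
    if x < 1e-8:
        return 1.0
    return 2.0 * (math.exp(-x) + x - 1.0) / (x * x)


def kratky(q_values, intensity_fn, rg):
    """Return the Kratky transform q^2 * I(q) for each q."""
    return [q * q * intensity_fn(q, rg) for q in q_values]


RG = 30.0  # angstroms, hypothetical
q_values = [0.001 * i for i in range(1, 301)]  # 0.001 to 0.300 A^-1

compact_curve = kratky(q_values, compact_intensity, RG)
chain_curve = kratky(q_values, debye_chain_intensity, RG)

# The compact model peaks near q = sqrt(3)/Rg and returns toward zero,
# whereas the chain model rises toward a plateau of 2/Rg^2.
q_peak = q_values[compact_curve.index(max(compact_curve))]
```

In these idealized curves, the return to baseline corresponds to the behavior described for the magnesium-folded RNA, and the persistent high-angle signal to the weakly folded state in EDTA.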
Resistance Mutations in the Lysine Riboswitch-Resistance to the antimicrobial lysine analog AEC is conferred in both E. coli and B. subtilis via mutations within the lysine riboswitch (Fig. 2D, cyan) (8,11,12). Some of these mutations (e.g. G8 and G9) map around the binding pocket, abrogating direct contacts with lysine. More interestingly, others are observed in the distal regions of the P2 and P4 helices (nucleotides A62-A64 and G129), disrupting formation of key tertiary interactions. Notably, some of these mutants bind lysine with nearly the same affinity as the wild type B. subtilis lysC RNA (8), suggesting that AEC resistance is gained by decreasing the rate at which the RNA folds into a binding-competent structure. As transcriptional regulation has a short temporal window in which to direct the downstream secondary structural switch (41), a lowered folding rate would lead to loss of regulatory control. In this fashion, riboswitch-mediated regulation is governed by the rates at which individual elements are able to fold rather than interplay between the thermodynamic stabilities of each structure in the switch. Further studies will be needed to address the role of the ligand in the kinetic folding pathway of the riboswitch and its relation to efficient genetic regulation.