The Structural Basis for Recognition of the PreQ0 Metabolite by an Unusually Small Riboswitch Aptamer Domain

Riboswitches are RNA elements that control gene expression through metabolite binding. The preQ1 riboswitch exhibits the smallest known ligand-binding domain and is of interest for its economical organization and high affinity interactions with guanine-derived metabolites required to confer tRNA wobbling. Here we present the crystal structure of a preQ1 aptamer domain in complex with its precursor metabolite preQ0. The structure is highly compact with a core that features a stem capped by a well organized decaloop. The metabolite is recognized within a deep pocket via Watson-Crick pairing with C15. Additional hydrogen bonds are made to invariant bases U6 and A29. The ligand-bound state confers continuous helical stacking throughout the core fold, thus providing a platform to promote Watson-Crick base pairing between C9 of the decaloop and the first base of the ribosome-binding site, G33. The structure offers insight into the mode of ribosome-binding site sequestration by a minimal RNA fold stabilized by metabolite binding and has implications for understanding the molecular basis by which bacterial genes are regulated.

Riboswitches are naturally occurring, structured motifs in the 5′-untranslated regions of a handful of mRNAs. It has been estimated that these elements regulate the expression of 3-4% of bacterial genes (1). Their mechanism of action entails "sensing" a cellular metabolite via a high affinity aptamer domain, which alters the accessibility of flanking mRNA sequences necessary for control of transcription or translation (2,3). Distinct riboswitches have been discovered that sense more than a dozen small molecules (reviewed in Ref. 4), and these RNA-regulatory elements have been identified in the genomes of several human pathogens (5-7). As such, elucidating the principles by which riboswitches bind their cognate ligands is critical for the identification and validation of new antibiotic targets (8,9).
Queuosine (Q) is a hypermodified variant of guanosine necessary for wobbling of certain tRNAs (10). This modification improves translational accuracy (11-13) and pervades both prokaryotic and eukaryotic phyla. De novo Q synthesis occurs only in bacteria, requiring that humans acquire it from gut flora or dietary sources (14). Production of Q begins with GTP and proceeds via formation of the metabolic intermediate preQ0 (see Fig. 1A, inset), the antecedent to preQ1 (15). Breaker and co-workers (16) discovered recently that some genes encoding proteins whose function is preQ1 uptake or biosynthesis are regulated by riboswitches responsive to this metabolite and its analogs. Equilibrium dialysis revealed that a representative riboswitch aptamer favors preQ1 binding over preQ0 by only 5-fold, with preQ0 displaying a Kd of ~100 nM (16). Phylogenetic comparisons suggested a stem-loop secondary structure (see Fig. 1A) that could be divided further into two aptamer "types" based on sequence differences in the L1 region.
In contrast to other riboswitches, the preQ1 aptamer is unusually small (~34 nucleotides), which is ~2.5-fold shorter than functionally similar purine riboswitches (17). To elucidate the molecular basis by which this minimal riboswitch binds its cognate ligand to modulate gene expression, we determined the structure of a type I preQ1 riboswitch at 2.75 Å resolution. The results reveal the mode of ligand binding and support a mechanism of translational regulation that features metabolite-induced RNA folding to sequester the ribosome-binding site (RBS). This work represents an important step in elucidating a mode of gene regulation that controls cellular levels of a metabolite unique to bacteria.

Choice of Aptamer Construct and RNA-Ligand Preparation-The preQ1 aptamer domain (see Fig. 1A) was derived from sequence and biochemical analyses that described its minimal architecture (16). Conserved elements include the P1 stem and L1 loop, as well as a downstream A-rich sequence. These regions were protected in the presence of metabolite under "in-line" cleavage conditions (16). Although several aptamer sequences were prepared and subjected to crystallization, including that from the well studied Bacillus subtilis, a 33-mer from Thermoanaerobacter tengcongensis (see Fig. 1A) was favored due to its small size and hot spring origin (18), a source with precedents for RNA crystallization (19). All RNA sequences were synthesized by Dharmacon Inc. (Lafayette, CO) followed by in-house deprotection, high pressure liquid chromatography purification, and desalting (20).
Pure preQ0 was produced enzymatically (21). Approximately 2 mg of material was added to 25 ml of "Q0 buffer" comprising 10 mM sodium cacodylate, pH 7.0, 10 mM MgCl2, and 5% (v/v) 1,3-propanediol. This solution was heated to 85°C for 1 h and then centrifuged at 10,000 × g. PreQ0 was soluble to ~0.5 mM as assessed by absorbance at 275 nm using an extinction coefficient of 12.9 mM⁻¹ cm⁻¹. A 5-ml volume of Q0 buffer was then heated to 65°C.
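The solubility estimate above rests on the Beer-Lambert relation, c = A / (ε·l). The following is a minimal sketch of that conversion using the extinction coefficient quoted in the text; the 1-cm path length, the dilution factor, and the absorbance reading are illustrative assumptions and are not values reported here.

```python
def concentration_mM(absorbance, epsilon_mM_cm=12.9, path_cm=1.0, dilution=1.0):
    """Beer-Lambert estimate of concentration in mM.

    epsilon_mM_cm: extinction coefficient at 275 nm (12.9 mM^-1 cm^-1, from the text).
    path_cm:       cuvette path length (assumed 1 cm).
    dilution:      fold dilution applied before the reading (assumed).
    """
    if epsilon_mM_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return dilution * absorbance / (epsilon_mM_cm * path_cm)


# Hypothetical reading: a 10-fold diluted sample with A275 = 0.645.
a275 = 0.645
print(f"{concentration_mM(a275, dilution=10.0):.2f} mM")
```

With these made-up inputs the estimate lands near the ~0.5 mM solubility limit described above; a real measurement would also need the buffer blank subtracted.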
Lyophilized aptamer RNA was suspended in 0.01 M sodium cacodylate buffer, pH 7.0, to a concentration of 0.7 mM. A total of 50 μl was mixed slowly with the 65°C Q0 buffer. Once the entire 5-ml volume was added, the solution was incubated at 65°C for 3 min and then moved to a 0.5-liter water bath at 65°C. The bath was allowed to slow-cool to 28°C over 2 h. Meanwhile, a Vivaspin 2 centrifugal filter with a 3-kDa molecular mass cut-off (GE Healthcare) was pre-equilibrated with 1 ml of Q0 buffer using a FIBERLite F21 rotor (FIBERLite, Santa Clara, CA) and a force of 4500 × g. The dilute aptamer solution was applied to the filter and centrifuged for 30-min intervals at 22°C. Progress was monitored by UV absorption at 260 nm. To correct for the absorbance of preQ0 in the flow-through, a blank was prepared from water plus a volume of Q0 buffer equal to the volume of the measured sample. The sample solution remained clear throughout the concentrating process. The final concentration of the aptamer-preQ0 complex was ~0.55 mM.
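The dilute-then-concentrate scheme above can be checked with simple mass-balance arithmetic. This sketch uses only the quantities stated in the text (0.7 mM stock, 50 μl, 5 ml of buffer, ~0.55 mM final); the assumption of complete RNA recovery from the centrifugal filter is ours and will overestimate the real final volume if material is lost.

```python
def diluted_conc_mM(stock_mM, stock_ul, buffer_ml):
    """Concentration after mixing the stock into the buffer volume."""
    total_ul = stock_ul + buffer_ml * 1000.0
    return stock_mM * stock_ul / total_ul


def final_volume_ul(stock_mM, stock_ul, target_mM, recovery=1.0):
    """Retentate volume needed to reach target_mM.

    mM x ul = nmol, so the RNA amount is stock_mM * stock_ul nmol.
    recovery is the assumed fraction of RNA retained (1.0 = no losses).
    """
    nmol = stock_mM * stock_ul * recovery
    return nmol / target_mM


dilute = diluted_conc_mM(0.7, 50.0, 5.0)
volume = final_volume_ul(0.7, 50.0, 0.55)
print(f"dilute complex: {dilute * 1000:.1f} uM")
print(f"retentate volume at 0.55 mM: {volume:.0f} ul")
```

Under these assumptions the RNA is folded at roughly 7 μM, well below the ~0.5 mM ligand concentration, and is then concentrated by nearly two orders of magnitude.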
Crystallization, Cryoprotection, and Heavy Atom Derivatization-Crystals of the aptamer-ligand complex were screened by hanging drop vapor diffusion experiments against a sparse matrix designed for RNA (20). Multiple crystal habits were observed from precipitating agents such as 2-methyl-2,4-pentanediol, poly(ethylene) glycol, and high salt. Superior crystals were obtained at 20°C from 2-μl:2-μl mixtures of RNA-ligand complex with precipitant comprising: 1.8 M Li2SO4, 0.10 M sodium cacodylate, pH 6.0, 0.01 M MgSO4, 5% (v/v) 1,3-propanediol, and 2 mM spermine. Crystals grew as hexagonal rods to dimensions of 0.2 × 0.2 × 0.4 mm in 3-4 weeks and were harvested into a "stabilizing" mother liquor of 2.0 M Li2SO4, 0.10 M sodium cacodylate, pH 6.0, 0.02 M MgSO4, 5% (v/v) 1,3-propanediol, 2 mM spermine, and saturating amounts of preQ0. Native crystals were cryoprotected by passage through a fresh, 1:1 mixture of silicon and paratone-N oils (Hampton Research). The aqueous hydration layer around the crystal was removed by repeated grazing of the surface with a 20-μm nylon loop (Hampton Research). After ~5 min under oil, the crystal was captured in a nylon loop and flash-cooled by plunging into N2(l).
A heavy atom derivative was prepared by soaking crystals in solutions of pentaammine(trifluoromethanesulfonato) Os(III) triflate (Sigma-Aldrich). Due to the limited solubility and reactivity of heavy atom compounds in the Li2SO4 mother liquor, crystals were adapted from the stabilization solution into a final derivative solution of 4.0 M LiC2H3O2, 20 mM Mg(C2H3O2)2, 0.10 M sodium cacodylate, pH 6.5, 5% (v/v) 1,3-propanediol, 2 mM spermine, saturating preQ0, and 0.10 M Os(NH3)5-triflate. Gradual removal of SO4²⁻ and maintenance of ionic strength entailed 1:3, 1:1, and 3:1 combinations of the stabilizing mother liquor with the derivative mother liquor. Each transfer lasted 15 min. Crystals were allowed to react with osmium for 4 h and showed no signs of osmotic shock or hysteresis. Previously, we reported that high acetate levels serve as a vitrification agent for RNA crystals (20). As such, crystals were flash-cooled by direct plunging into N2(l). Loop-mounted crystals were loaded into a 96-position cassette (Crystal Positioning Systems) at N2(l) temperature and shipped to the Stanford Synchrotron Radiation Laboratory (SSRL, Menlo Park, CA) for x-ray data collection.
X-ray Structure Determination and Refinement-X-ray diffraction data were collected remotely at beamline 7-1 of SSRL using the Blu-Ice and Web-Ice interfaces (22). Respective native and preliminary derivative data sets were recorded at a crystal-to-detector distance of 32.5 cm as 320 × 0.5° oscillations using a Quantum 315 CCD detector (ADSC); exposure times were 20 s per degree. Native crystals diffracted anisotropically between 2.75 and 2.55 Å resolution. Intensity data were reduced using the HKL2000 suite (23) with intensity statistics provided in supplemental Table S1. Space group P6₃22 was assigned based on the Laue symmetry of 6/mmm and the observation of l = 2n + 1 systematic absences for the 00l class of reflections.
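The space group assignment above relies on a reflection condition: with a 6₃ screw axis along c, 00l reflections are systematically absent when l = 2n + 1. The sketch below illustrates how such a condition can be checked against axial reflections. The function names, the I/σ cut-off, and the reflection list are all hypothetical and are not taken from the data reduction described here.

```python
def allowed_in_p6322(h, k, l):
    """Return True if reflection (h, k, l) is allowed by the 6_3 screw axis.

    Only the condition quoted in the text is applied: along 00l,
    reflections are present only when l is even.
    """
    if h == 0 and k == 0:
        return l % 2 == 0
    return True


def screw_axis_violations(reflections, min_i_over_sigma=3.0):
    """List 00l reflections with odd l that are nonetheless clearly observed.

    reflections: iterable of (h, k, l, intensity, sigma) tuples.
    min_i_over_sigma: assumed significance cut-off for "observed".
    """
    return [
        (h, k, l)
        for h, k, l, intensity, sigma in reflections
        if not allowed_in_p6322(h, k, l) and intensity / sigma >= min_i_over_sigma
    ]


# Made-up axial reflections for illustration only.
toy_data = [
    (0, 0, 2, 950.0, 20.0),   # l even: expected to be present
    (0, 0, 3, 4.0, 6.0),      # l odd: weak, consistent with an absence
    (0, 0, 4, 1210.0, 25.0),
    (0, 0, 5, 2.0, 5.0),
    (1, 0, 3, 300.0, 10.0),   # off the 00l axis, so no condition applies
]
print(screw_axis_violations(toy_data))
```

An empty list is consistent with the screw axis; any strong odd-l axial reflection would argue against the assignment.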
Derivatization by Os(NH3)5-triflate was identified by local scaling of derivative and native data sets, which produced a 19% isomorphous difference. A fluorescence scan of a fresh Os(NH3)5-triflate crystal produced f′ and f″ values of: −15.6 e, 13.8 e (peak); −7.2 e, 10.8 e (remote); and −20.4 e, 8.9 e (inflection) as determined by CHOOCH (24). A multiwavelength anomalous diffraction (MAD) experiment was conducted using dose mode with inverse beam geometry collected in 30° wedges to match Friedel pairs. Reflections were reduced as described for native, but keeping I(h,k,l) and I(−h,−k,−l) separate. Intensity statistics for the MAD data sets are provided in supplemental Table S1.
Two osmium sites were located and refined using SOLVE (25). A useful MAD signal was observed to 3.5 Å resolution. The application of density modification and phase extension to 3.1 Å resolution in RESOLVE (26) revealed clear RNA structure (see Fig. 1B) corresponding to one molecule per asymmetric unit. The resulting experimental, density-modified phases were submitted to PHENIX (27) for phase combination using the "build_rna" option. Although the resulting electron density maps were greatly improved by placement of the RNA polymer, PHENIX did not produce a reasonable aptamer model and could not fit the input sequence. Manual building was conducted using experimental and phase-combined maps in the graphics program O (28). The resulting model was subjected to positional and individual B-factor refinement against the native data set in CNS (29). The strong anisotropic component of the intensity data is reflected in the B-factor correction applied to Fobs (B11 = −21, B22 = −21, B33 = 42). Target geometry for preQ0 refinement was prepared using XPLO2D (30) from a small molecule crystal structure BOVYEQ retrieved from the Cambridge Structural Database (31). Two sulfate ions were observed and modeled with partial occupancy, which lowered Rfree. All structure-derived graphics were produced in PyMOL (32). Coordinates and structure factor amplitudes are available from the Protein Data Bank (PDB) as entry 3gca.

Structure Determination and Quality Indicators of the PreQ1 Aptamer Domain-To discover how a type I preQ1 riboswitch employs a highly economical fold to recognize its cognate metabolite, we solved the structure of the T. tengcongensis aptamer domain in complex with preQ0 by MAD phasing. The quality of the experimental phases is indicated by the excellent agreement of initial electron density maps with the final refined coordinates (Fig. 1B). The presence of the A-rich backbone segment (orange) running parallel to the minor groove of the P1 stem (red) is distinctive at 3.1 Å resolution. Electron density was visible for several bases in this region as well, although the L1 region was less clear due to its irregular structural features. Electron density for the refined structure (Fig. 1C) is provided for comparison with the MAD maps. The refined structure exhibits strong, continuous electron density for all backbone atoms. All bases are ordered except U12 (Fig. 2), which is unstacked and protrudes into solvent. The average B-factor for all RNA atoms is 76 Å² (on a scale of 400 Å²). The final model produced an Rwork of 24.5% with Rfree = 27.2%, suggesting that the model was not overfit (33). The root mean square deviations from ideal bonds and angles were 0.008 Å and 1.5°, respectively, which is comparable with other RNA structures of this size and resolution (34,35). Refinement statistics are summarized in supplemental Table S1.
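For readers less familiar with the quality indicators above, the crystallographic R-factor is conventionally defined as R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, evaluated over the working reflections (Rwork) or over a held-out test set (Rfree). The sketch below computes this standard quantity on made-up amplitudes; it assumes Fcalc is already on the same scale as Fobs and does not reproduce the CNS refinement target used here.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R-factor for paired amplitudes.

    Assumes f_calc has already been scaled to f_obs.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("sum of |Fobs| is zero")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    return numerator / denominator


# Toy amplitudes for illustration only (not from the deposited data).
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 50.0, 45.0]
print(f"R = {r_factor(f_obs, f_calc):.3f}")

# Gap between the reported statistics, in percentage points.
r_work, r_free = 24.5, 27.2
print(f"Rfree - Rwork = {r_free - r_work:.1f}")
```

The small gap between the two reported values (under three percentage points) is the basis for the statement that the model was not overfit.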
Global Fold of the PreQ1 Riboswitch-When one considers that an RNA duplex is 20 Å in diameter, the triple-stranded preQ1 aptamer domain appears relatively compact with dimensions of 48 Å × 28 Å × 15 Å (Fig. 2A); indeed, more than half of all bases engage in triplet pairs (Fig. 2A). As one proceeds from the 5′- to 3′-end, nearly one-third of all bases comprise stem P1 (Fig. 2, A and B). The flanking L1 decaloop is the site of metabolite binding and reveals numerous non-canonical nucleotide interactions. A sharp bend in this loop occurs at residue A10. Here a Watson-Crick pair forms between C9 and G33, the 3′-most base of the aptamer construct, that facilitates the sharp change in direction. The nearby base G11 resides in the heart of the L1 loop and forms a Watson-Crick pair with C30 of the A-rich segment; G11 stacks directly upon preQ0 (discussed below). Thus, when the metabolite is bound, an uninterrupted base stack forms throughout the aptamer beginning with G33, passing through G11 and preQ0, and culminating with G20 at the 3′-end of the P1 stem.
FIGURE 1. [...] (16). The conservation key is: boxed residues >95% and shaded >80%. Black residues represent the bacterial sequence with red denoting purine and blue signifying pyrimidine preferences. Inset, chemical drawings of preQ0 and preQ1. QueF is the nitrile reductase gene product in this pathway. B, representative MAD, solvent-flattened electron density at 1.5 σ (teal) and 5.0 σ (yellow) at 3.1 Å resolution with the final refined model (sticks). Residues from the A-rich segment in A (orange, left) begin at A24 (lower) and proceed to A28 (upper). Regions from the Watson-Crick stem (right, red) begin at G20 (lower) and proceed to C16 (upper). C, σA-weighted 2Fo − Fc electron density at 2.75 Å resolution using αcalc from the final refined coordinates. Contour levels were set to match B.
A ribose zipper (supplemental Fig. S1A) involving bases U2, A19, and A23 facilitates another sharp bend that joins the 3′-P1 stem to the A-rich segment (Fig. 2). A comparable interaction exists in the 23 S rRNA (36) between C1476, C1477, and A2681 (supplemental Fig. S1B). Due to the intrastrand nature of these respective interactions, as well as the absence of a second canonical 2′-OH-to-minor-groove interaction, this motif has been classified as a "pseudo cis" ribose zipper (37). Nonetheless, the presence of this structural feature in two highly divergent RNA sequences illustrates the point that RNA adopts recognizable tertiary contacts that are readily classified and potentially predictable.
The ribose zipper represents a transition point for base stacking of residues in the A-rich segment. Beginning at A24 and proceeding through C30, a nearly continuous base stack forms (Fig. 2B). Remarkably, each base of the phylogenetically conserved A-rich segment (Fig. 1A) utilizes either its Watson-Crick or its Hoogsteen face to interact with the sugar edge of the P1 stem (Fig. 2). This trend continues beyond the stem and culminates with the A28-to-U6 Hoogsteen-to-Watson-Crick pairing. Importantly, this architecture primes the structure for formation of the metabolite-binding pocket in which the Watson-Crick face of A29 is poised to interact with ligand (Fig. 2B). The conserved C30 base is the last residue of the A-rich segment to stack upon the preceding 5′-bases and culminates with a canonical pairing to G11 located in the core of the L1 decaloop (Fig. 2).
The Metabolite-binding Pocket-To discover the mode of ligand binding, we co-crystallized the aptamer in the presence of preQ0. A simulated annealing-omit electron density map (Fig. 3A) indicates the quality of the ligand model. The average B-factor for preQ0 was 58 Å², which is among the lowest in the structure but comparable with the surrounding core RNA atoms. PreQ0 itself is sequestered in a deep pocket within the L1 loop of the aptamer. The walls of the pocket comprise three strands contributed by the decaloop and the A-rich segment (Fig. 3, A and B).
Specific structural interactions to the ligand were predicted by phylogenetic and biochemical analyses (16). The structure supports the observation that the conserved base C15 engages in a Watson-Crick pair with the ligand (Fig. 3A), which accounts for the dramatic loss of metabolite affinity when C15 is mutated or the ligand is substituted with adenine (16). Such direct Watson-Crick readout of purine metabolites was observed for the guanine and adenine riboswitch aptamers (17,38). However, this shared mode of ligand recognition is simply an instance of convergent evolution. This is because the spatial organization of the preQ1 aptamer observed here, as well as its diminutive size, refutes the possibility that it originated from an ~80-nucleotide purine-binding ancestor, which appears to be the case for the adenine and guanine riboswitches (17). Additional hydrogen bond contacts to preQ0 include invariant bases U6 and A29. These interactions can be considered the walls of the binding pocket. Whereas the O4 keto group of U6 receives a hydrogen bond from the N-9 imino of preQ0, A29 utilizes its Watson-Crick face to recognize the N2 amino group of the ligand. These observations, as well as the aforementioned Watson-Crick pairing to C15, account for why hypoxanthine, which lacks an exocyclic amine at N2, loses ~10³-fold in binding affinity relative to guanine. Finally, an additional hydrogen bond interaction is observed between the O6 keto moiety of preQ0 and the 2′-hydroxyl of G11. The base identity at this position is not highly conserved, which is consistent with a contact made through the ribose rather than the base.
In addition to hydrogen bonding, the preQ0 ligand engages in several base-stacking interactions that promote core packing of the decaloop and A-rich tail elements. From the P1 stem, the G5:C16 pair forms a "floor" within the ligand-binding pocket (Fig. 3A). Conversely, a ceiling is formed by a base quartet featuring C7, G11, A14, and C30 (Fig. 3B). This arrangement effectively buries 300 of 338 Å² of preQ0 surface area with nitrogen of the nitrile moiety being most solvent-exposed (Fig. 3, A and B). An additional base triple stacks over the ligand ceiling (supplemental Fig. S2), forming a broad "roof" upon which to assemble the final tertiary interaction that unites C9 of the decaloop with G33 at the 3′-terminus.
Implications for PreQ1 Binding and Translational Control-PreQ0 and preQ1 differ by a single amino group attached to atom C-7 of the nitrile moiety (Fig. 1A). This difference accounted for a 5-fold reduction in relative affinity for preQ0 (16). From an energetic standpoint, the N-7 amine of preQ1 contributes ~1 kcal mol⁻¹, or two hydrogen bonds, to the stability of the RNA-ligand complex (39). Thus, although all the interactions described here are expected to dictate both preQ0 and preQ1 specificity and affinity, the mode of N-7 recognition for the latter is unknown at present. However, because the sp³ hybridization of the C-NH2 carbon in the methyl-amine moiety of preQ1 allows it to rotate freely, it is plausible that preQ1 interacts with the neighboring O-6 of G5, an invariant residue. In contrast, the sp-hybridized nitrile group of preQ0 cannot donate a hydrogen bond and must remain linear. Another possibility is that nearby U12, which is unstacked from the core fold (Fig. 2B), changes conformation upon preQ1 binding to effectively close the "lid" on the metabolite-binding pocket. A pyrimidine is favored at this position, which would support base unstacking and the need for flexibility. Ultimately, elucidation of the final specificity determinant for preQ1 must await additional analyses.
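The ~1 kcal mol⁻¹ figure above follows from the standard relation between a ratio of dissociation constants and a free energy difference, ΔΔG = RT ln(Kd,preQ0 / Kd,preQ1). The sketch below evaluates it for the 5-fold preference reported in Ref. 16; the temperature of 298.15 K is our assumption, since the text does not state the temperature of the binding measurements.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def ddg_kcal_per_mol(fold_difference, temp_k=298.15):
    """Free energy difference corresponding to a fold change in Kd.

    fold_difference: ratio of the weaker to the tighter Kd (5 for preQ0 vs preQ1).
    temp_k: absolute temperature (assumed 298.15 K; not stated in the text).
    """
    if fold_difference <= 0:
        raise ValueError("fold difference must be positive")
    return R_KCAL_PER_MOL_K * temp_k * math.log(fold_difference)


print(f"{ddg_kcal_per_mol(5.0):.2f} kcal/mol")
```

At the assumed temperature a 5-fold difference corresponds to just under 1 kcal mol⁻¹. The same ratio applied to the ~100 nM Kd for preQ0 implies a preQ1 Kd near 20 nM.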
The riboswitch of this investigation was derived from the 5′-untranslated region of a gene encoding a putative preQ1 permease from T. tengcongensis (16). Import of preQ1 or its analogs would therefore promote riboswitch aptamer folding as described here. As such, the means by which permease translation is diminished in a preQ1-rich environment could be through the sequestration of the RBS sequence GGGAG, in which the first position is G33 of the aptamer domain. Furthermore, the ability of the metabolite to mediate core folding through RNA base stacking, as well as directed interactions to the A-rich segment at A29, suggests that the aptamer-ligand complex functions as an integrated unit to sequester the RBS and 5′-regions, thus blocking translation. Such a mechanism has been cited for other riboswitches (40). This mode of gene regulation is distinct from "kinetic" mechanisms that produce competing terminator versus anti-terminator structures (2,41).
The riboswitch presented here is one of the smallest known aptamer motifs and provides a plausible mechanism for metabolite recognition and translational arrest. Efforts to grow crystals in the absence of metabolites have not been fruitful, making it unclear to what extent the aptamer domain folds prior to ligand binding. Another open question is whether divergent bacterial species use the same principles of ligand recognition as the T. tengcongensis aptamer domain presented here. A phylogenetic comparison suggests that many organisms preserve the C9:G33 Watson-Crick pair, but this mode of binding to the RBS is not universal. As such, additional investigations will be required to develop an understanding of preQ1 aptamer recognition across a diverse range of organisms. This effort is likely to reveal subgroups within the type I and II domains. Such work will be essential to advance biomedical efforts aimed at targeting riboswitches that control production of bacterial-specific metabolites.