Identification of the RNA Binding Domain of T4 RegA Protein by Structure-based Mutagenesis

The T4 translational repressor RegA protein folds into two structural domains, as revealed by the crystal structure (Kang, C.-H., Chan, R., Berger, I., Lockshin, C., Green, L., Gold, L., and Rich, A. (1995) Science 268, 1170–1173). Domain I of the RegA protein contains a four-stranded β-sheet and two α-helices. Domain II contains a four-stranded β-sheet and an unusual 3/10 helix. Since β-sheet residues play a role in a number of protein-RNA interactions, one or both of the β-sheet regions in RegA protein may be involved in RNA binding. To test this possibility, mutagenesis of residues on both β-sheets was performed, and the effects on the RNA binding affinities of RegA protein were measured. Additional sites for mutagenesis were selected from molecular modeling of RegA protein. The RNA binding affinities of three purified mutant RegA proteins were evaluated by fluorescence quenching equilibrium binding assays. The activities of the remainder of the mutant proteins were evaluated by quantitative RNA gel mobility shift assays using lysed cell supernatants. The results of this mutagenesis study ruled out the participation of β-sheet residues. Instead, the RNA binding site was found to be a surface pocket formed by residues on two loops and an α-helix. Thus, RegA protein appears to use a unique structural motif in binding RNA, which may be related to its unusual RNA recognition properties.

The bacteriophage T4 RegA protein is a unique translational repressor in that it is able to regulate the expression of 15-30 T4 genes, including its own. Previous studies have demonstrated that RegA protein binds to the translation initiation region of target mRNAs (1,2) and inhibits the formation of ribosome-mRNA initiation complexes (1,3). Although the RegA recognition element (RE) has been studied in detail in two genes, gene 44 (2,4) and the rIIB gene (1,5), and has been mapped by RNase protection assays in three other mRNAs (3), it is not completely clear what common RNA features allow RegA to recognize its diverse targets.
The solution of the crystal structure of RegA protein by Kang and co-workers (6) has allowed for inspection of potential RNA binding domains and for comparison of RegA to the structures of other known RNA-binding proteins. Although RegA protein exists as a dimer in crystals (6) and in solution at moderate concentrations (7), it binds to a 16-mer RNA corresponding to the gene 44 RE as a monomer (7). Within the RegA monomer, shown in Fig. 1, there are two structural domains. Domain I consists of a four-stranded β-sheet (β6, β1, β5, and β4) and two α-helices (helix A and helix C). Domain II consists of a second four-stranded β-sheet (β2, β3, β9, and β8) adjacent to an unusual 3/10 helix (helix D). Although photocross-linking (8), partial proteolysis (9), and truncation studies (9) have been performed on RegA protein, a definitive localization of the RNA binding site on RegA has not been achieved.
In considering where RNA might bind on RegA protein, Kang et al. (6) noted that β-sheets often are important contributors to RNA recognition. For example, a β-pleated sheet in glutaminyl-tRNA synthetase plays a role in binding the anti-codon region of tRNAGln (10), and residues in the MS2 coat protein that participate in sequence-specific RNA binding lie on β-strands within an extended 10-stranded β-sheet in the protein dimer (11). A β-sheet also plays a critical role in the binding of U1 RNA to the small nuclear ribonucleoprotein U1A protein (12). U1A binds to stem-loop II of U1 RNA through interactions between loop nucleotides and residues on the central two β-strands of a four-stranded β-sheet. The central β-strands contain the two highly conserved RNP consensus sequences (RNP-1 and RNP-2) found in over 100 RNA-binding proteins (13,14) (see Fig. 1). As noted by Kang and co-workers (6), two strands of the four-stranded β-sheet in domain I of RegA protein exhibit sequence similarities to RNP-1 and RNP-2 (including aromatic residues on β5), suggesting that this region may participate in RNA binding. Further, U1A protein has basic residues in two loops at the base of the β-sheet, referred to as "jaws," which interact with the backbone of the U1 RNA hairpin. RegA protein also has two pairs of basic residues (Lys7 and Lys8; Lys41 and Lys42) in loops at the base of the β-sheet (see Fig. 1), which could be envisioned to function in binding RNA.
Although domain II of RegA protein does not exhibit structural similarity with any known RNA binding protein, there is experimental evidence suggesting that domain II residues contribute significantly to RNA binding. For example, the site of photocross-linking of RegA protein to nucleic acid was found to be Phe106 (8), which lies within domain II. Also, removal of 13 or 17 residues from the C terminus of RegA (located in domain II) resulted in 100-1000-fold reductions in RNA binding affinity (9). Finally, partial proteolysis of RegA protein, which leads to cleavage at three sites in the C-terminal domain, is reduced by RNA binding (9).
In addition to the above experimental observations and the observed similarities of RegA protein with other RNA-binding proteins, the fact that RegA protein is small (14.6 kDa) and binds to a relatively large site on RNA (≥12 nucleotides) (4) suggests that residues involved in RNA binding may span both domains of the protein. In fact, Kang et al. (6) have suggested that the two β-sheet regions, which are 25 Å apart, may both participate in RNA binding. To test this possibility, site-specific mutagenesis of surface residues in both domains I and II of RegA protein was performed. Basic and aromatic residues were particularly targeted, since they have been found to play important roles in a number of protein-RNA complexes (11,15). Ten residues that lie within the central β-strands and β-loop regions of domain I were mutated, and the resulting proteins were assayed for RNA binding in vitro. In addition, residues within β-sheet A of domain II were mutated to test for their contribution to RNA binding. During the course of these studies, molecular models of RegA protein were generated to examine surface regions for potential RNA binding pockets and to determine whether residues to be mutated are located at the surface or in the core of the folded protein. These models revealed the presence of a pocket and a deep cleft at opposite sides of the interface between domains I and II. Subsequent mutagenesis was performed to test the potential role of these two regions in RNA binding. Taken together, the results of these mutagenesis and modeling studies have ruled out the participation of a number of proposed functional residues, revealing instead an unexpected site for RNA binding on RegA protein.

MATERIALS AND METHODS
Reagents and Strains-Oligodeoxyribonucleotides were synthesized on an Expedite (model 8909) nucleic acid synthesizer by the Medical University of South Carolina Oligosynthesis Facility. Oligoribonucleotides were synthesized by the W. M. Keck Foundation Biotechnology Resource Laboratory (Yale University), deprotected as described previously (16), and then purified by perfusion chromatography using Poros™ HQ and R1 columns (PerSeptive Biosciences), essentially as described (16). Poly(U) and poly(U)-agarose were purchased from Amersham Pharmacia Biotech. Escherichia coli AR120 (cI⁺, N⁺) was obtained from A. Shatzman (Smith Kline and French); construction of pAS1regA was described previously (2).
RegA Mutagenesis-Site-directed mutagenesis of residues Lys7, Lys8, Lys16, Lys41, Lys42, Tyr45, and Tyr46 was carried out by annealing mutagenic oligonucleotides (33-42 nucleotides in length) to a double-stranded plasmid carrying the wild type (WT) regA gene (pAS1regA) (2) as described previously (7). The remainder of the mutations were introduced by the Quik-Change™ Site-Directed Mutagenesis protocol (Stratagene). Pfu DNA polymerase was used in a 16-cycle thermal cycling reaction to extend and incorporate the appropriate mutagenic primers, which resulted in nicked circular strands. The methylated, nonmutated parental DNA was then digested with DpnI, and the DNA was transformed into E. coli AR120 cells. Plasmids were purified from overnight LB/amp cultures (25 ml) using the Qiagen miniprep system. Mutations were then confirmed by DNA sequence analysis, using an ABI 377 automated DNA sequencer, by the Medical University of South Carolina Biotechnology Resource Laboratory.
To facilitate purification of selected mutant RegA proteins, the RegA expression vector was modified to code for a fusion protein containing four His residues at the COOH terminus of RegA protein. An oligonucleotide containing four CAT codons was inserted into the pAS1regA vector between the terminal codon (AAT) and the stop codon (TAA) of regA, using the Quik-Change™ protocol. Insertion of the 12-nucleotide sequence was confirmed by DNA sequencing. Mutations K14A, T18A, R21A, and W81A were then introduced into the regA-His4 vector.
RegA Protein Purification-Wild type (WT), K7L, K41L, and K42L RegA proteins were purified from AR120 cells containing WT or mutant pAS1regA plasmids following induction of transcription from the phage PL promoter by nalidixic acid treatment, as described previously (7,17). Protein concentrations for fluorescence analysis were determined by duplicate amino acid analyses, performed by the W. M. Keck Foundation Biotechnology Laboratory (Yale University). The expected error in the resulting concentrations is less than ±10%.
RegA proteins carrying a C-terminal His4 fusion were purified by perfusion chromatography using a Poros™ MC column charged with Ni²⁺ on a Biocad SPRINT™ chromatography system (PerSeptive Biosciences). Induced cell extracts were centrifuged at 100,000 × g for 1 h, and the supernatant was dialyzed into 20 mM phosphate (pH 7.5), 200 mM NaCl, 10 mM imidazole (buffer A) overnight. The supernatant was applied to a 1.7-ml MC column equilibrated in buffer A. The column was washed with 5 column volumes of buffer A and then eluted with a gradient of 10-200 mM imidazole in buffer A. Column fractions were analyzed by SDS-polyacrylamide gel electrophoresis and then pooled and concentrated by centrifugation through Centriprep 10™ (Amicon, Inc.) filtration units.
Fluorescence Quenching Assays-Fluorescence quenching assays were performed at 25°C using an SLM model 8000C spectrofluorometer (4). Reverse titrations (addition of poly- or oligoribonucleotide lattice to protein ligand) were performed in 2-ml stirred cuvettes at protein concentrations of 0.1-1.0 µM in buffer C (10 mM HEPES, pH 7.2, 5 mM MgCl2, 1 mM EDTA, and 1 mM β-mercaptoethanol) (4) plus 150 mM NaCl (for gene 44 RE RNA) or 20 mM NaCl (for poly(U)). Data were acquired at an excitation wavelength of 282 nm and an emission wavelength of 347 nm. Photobleaching during titrations was corrected for by monitoring RegA protein fluorescence in a control cuvette. The average photobleaching control was 7.3%. Correction for absorption of incident light by oligonucleotides was made by performing a parallel titration of N-acetyl-L-tryptophanamide (Sigma) with nucleic acid.
The effect of salt on RNA affinity was determined by "salt-back" titration (18), in which NaCl was added stepwise to RegA protein-poly(U) complexes following reverse titrations. The apparent association constant (Kapp) for poly(U) was calculated from the equivalence point of the titration (i.e. at the addition of an equimolar amount of RNA), assuming a binding site size of n = 9 nucleotides (4). A single binding site was assumed for RegA protein binding to gene 44 RE (16-mer) RNA (4,7).
Preparation of Induced Cell Supernatants-Cell supernatants were prepared by the method of Johnson and Hecht (19), with slight modifications. Briefly, transformed E. coli AR120/pAS1regA cells were grown at 37°C to an A590 of 0.8-0.9, and then WT or mutant RegA expression was induced for ~15 h by the addition of nalidixic acid (80 µg/ml). After centrifugation of a 100-ml culture, cell pellets were frozen in an ethanol/dry ice bath for 2 min, followed by thawing in an ice water bath for 8 min, repeated for a total of four cycles. The pellet was resuspended in 800 µl of 20 mM Tris-HCl, pH 8.0, 100 mM NaCl, and 1 mM EDTA and incubated in an ice water bath for 30 min. The suspension was centrifuged in a microcentrifuge for 3 min; the supernatant was recovered, and the pellet was resuspended in 1 ml of 1× TE (10 mM Tris, 1 mM EDTA, pH 7.4). Cell supernatants were stored in 80-µl aliquots at −20°C. Evaluation of the protein content of both the supernatant and the pellet resuspension was carried out by SDS-polyacrylamide gel electrophoresis. Some mutant RegA proteins were not completely soluble when produced at 37°C, and in those cases, induction was repeated at 25°C. In each case, solubility was increased sufficiently to allow gel shift assays to be performed. RegA protein concentrations in cell lysates were determined by quantitation of protein fluorescence in gels stained with SYPRO Orange (Molecular Probes, Inc., Eugene, OR) compared with a standard curve of known concentrations of purified RegA protein, using a Molecular Dynamics Storm Imager. Protein concentrations had an error range of 1-16% (average = 7%).
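The standard-curve quantitation described above can be illustrated with a minimal sketch. It assumes a linear relation between SYPRO Orange band intensity and the mass of purified RegA loaded; the standards, intensities, lysate band value, and loaded volume are all hypothetical numbers chosen for illustration, and only the 14.6-kDa molecular mass comes from the text.

```python
# Sketch: RegA concentration in a lysate from a stained-gel standard curve.
# All numeric inputs below are hypothetical; only the molecular mass is from the text.

REGA_MW_G_PER_MOL = 14_600  # 14.6 kDa


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mass_from_signal(signal, slope, intercept):
    """Invert the standard curve to get the protein mass (ng) in a band."""
    return (signal - intercept) / slope


def molar_conc_nM(mass_ng, volume_ul, mw=REGA_MW_G_PER_MOL):
    """Convert ng of protein in volume_ul of sample to a nM concentration."""
    # ng/ul equals mg/l; (1e-3 g/l) / (g/mol) gives mol/l; times 1e9 gives nM.
    return mass_ng / volume_ul * 1e6 / mw


standards_ng = [50, 100, 200, 400]     # hypothetical purified RegA standards
intensities = [505, 1005, 2005, 4005]  # hypothetical band intensities

slope, intercept = fit_line(standards_ng, intensities)
lysate_ng = mass_from_signal(1465, slope, intercept)  # hypothetical lysate band
lysate_nM = molar_conc_nM(lysate_ng, volume_ul=5)     # hypothetical 5 ul loaded
```

A real standard curve would be checked for linearity over the range used, since stained-gel signals can saturate at high protein loads.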
RNA Gel Mobility Shift Assays-Purified gene 44 RE RNA (5′-GAAUGAGGAAAUUAUG-3′) was 5′-³²P-end-labeled by treatment with T4 polynucleotide kinase and [γ-³²P]ATP (20). Increasing volumes of induced cell supernatants were incubated with a constant amount of ³²P-labeled gene 44 RE RNA (10 nM) to generate titration curves. Binding was carried out in 10 mM Tris-HCl, pH 7.5, 50 mM NaCl, and 1 mM EDTA at 4°C for 15 min. Binding reactions were performed in a 20-µl final volume with final concentrations of 2.5-40 nM RegA protein.
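As a small worked example of setting up such a titration, the sketch below computes the supernatant volume needed to reach a given final RegA concentration in a 20-µl reaction. The stock concentration and the intermediate titration points are hypothetical; only the 2.5-40 nM range and the 20-µl final volume are taken from the text.

```python
# Sketch: supernatant volumes for a gel shift titration (simple dilution arithmetic).

def supernatant_volume_ul(stock_nM, final_nM, final_volume_ul=20.0):
    """Volume of supernatant that gives final_nM RegA in final_volume_ul."""
    if final_nM > stock_nM:
        raise ValueError("final concentration exceeds the stock concentration")
    return final_nM * final_volume_ul / stock_nM


stock_nM = 400.0                          # hypothetical RegA concentration in the lysate
titration_nM = [2.5, 5.0, 10.0, 20.0, 40.0]  # endpoints from the text; midpoints illustrative
volumes_ul = {c: supernatant_volume_ul(stock_nM, c) for c in titration_nM}
```

With these assumed inputs the lowest point needs a sub-microliter volume, which in practice would call for a pre-diluted stock.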
A freshly thawed aliquot of supernatant was used for each experiment and then discarded. Reaction products were analyzed by electrophoresis on a native 8% polyacrylamide gel in 0.5× TBE (89 mM Tris, 89 mM boric acid, 4 mM EDTA, pH 8.3) at 4°C. To eliminate hydrolysis of RNA during electrophoresis, gel solutions and running buffer were prepared in diethylpyrocarbonate-treated H2O, and the electrophoresis apparatus was washed frequently with detergent. Gels were dried and analyzed by autoradiography, and ³²P-RNA was quantitated by PhosphorImager analysis on a Molecular Dynamics Imager model 425.
Kapp values for mutant proteins were determined from gel shift assays in a manner similar to that of Rebar and Pabo (21). Kapp was calculated at four points on the titration curve (Equation 1). The mean of the four values was calculated, and Kapp values from 2-4 experiments were averaged. WT RegA in cell supernatants was assayed in parallel with 2-4 mutant proteins in each set of experiments. S.D. values (calculated using the nonbiased or "n − 1" method using EXCEL) ranged from Kapp(Av)/3.3 to Kapp(Av)/11. Association constants were determined from phosphor image data as follows,
Circular Dichroism-Circular dichroism spectra were collected on a Jasco spectropolarimeter model J-710 from 250 to 190 nm at 0.5-nm intervals. Spectra were recorded in 25 mM Tris, 25 mM NaCl, 5 mM MgCl2, 1 mM EDTA at 25°C, using a 0.2-mm path length quartz cuvette. The data were averaged from 10 repeat scans and were corrected for background noise by subtraction of signal from buffer alone. Protein concentrations were determined by duplicate measurements of A280, which had an error range of ±2%. The molar extinction coefficients used were as follows: WT, K14A, and T18A, εM = 22,250; W81A, εM = 16,650.
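The averaging scheme described for the gel shift Kapp values (four points per titration, then a mean and an "n − 1" S.D. across experiments) can be sketched as follows. Equation 1 is not reproduced in this text, so the expression used here is the standard single-site mass-action form, K = [PR]/([P]free[R]free); it is an assumption and may differ in detail from the authors' equation. Function names and example numbers are hypothetical.

```python
# Sketch: apparent association constants from gel shift titration points.
# Assumes 1:1 binding (P + R <-> PR); not necessarily the paper's Equation 1.
from statistics import mean, stdev


def k_app_single_site(fraction_rna_bound, protein_total_M, rna_total_M):
    """K = [PR] / ([P]free * [R]free) from the fraction of RNA shifted."""
    theta = fraction_rna_bound
    if not 0.0 < theta < 1.0:
        raise ValueError("fraction bound must lie strictly between 0 and 1")
    complex_M = theta * rna_total_M
    free_protein_M = protein_total_M - complex_M
    free_rna_M = rna_total_M - complex_M
    if free_protein_M <= 0.0:
        raise ValueError("bound RNA exceeds the total protein supplied")
    return complex_M / (free_protein_M * free_rna_M)


def k_app_for_experiment(points, rna_total_M=10e-9):
    """Mean K over the (protein_total_M, fraction_bound) points of one titration."""
    return mean(k_app_single_site(theta, p, rna_total_M) for p, theta in points)


def summarize(experiment_means):
    """Mean and sample ("n - 1") standard deviation across experiments."""
    return mean(experiment_means), stdev(experiment_means)


# Hypothetical single point: 25 nM protein, 10 nM RNA, half of the RNA shifted.
example_k = k_app_single_site(0.5, 25e-9, 10e-9)
```

`statistics.stdev` uses the n − 1 denominator, which matches the nonbiased method named in the text.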

RESULTS
Mutagenesis of Basic Loop Residues of RegA Protein-From inspection of the structure of RegA protein, Kang et al. (6) noted that in domain I of RegA, basic residues in the loops at the base of the central two β-strands, i.e. Lys7, Lys8, Lys41, and Lys42 (see Fig. 1), are located in positions similar to basic residues of U1A protein (residues Lys22, Lys23, Lys50, and Arg52) that function in RNA binding (12). To test if these residues form electrostatic interactions with RNA, as they do in U1A, individual lysine to leucine substitutions were introduced at each of these residues to remove the basic charge and leave a similarly sized side chain. Mutant proteins K7L, K41L, and K42L were purified (see "Materials and Methods"), and their equilibrium RNA binding affinities were measured by fluorescence quenching assays (4), as shown in Fig. 2. Like many other nucleic acid-binding proteins, the intrinsic tryptophan fluorescence of RegA protein is quenched upon binding to nucleic acids (4). This quenching is due to a change in the environment of one or more of the three tryptophan residues in RegA, presumably reflecting either a direct interaction of the tryptophan(s) with RNA or a conformational change in RegA upon RNA binding. Because the fraction of the maximal quenching obtained is directly proportional to the fraction of RegA protein bound to RNA (4), fluorescence quenching assays offer an accurate method of determining equilibrium binding constants.
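The proportionality between quenching and fraction bound can be made concrete with a minimal sketch for a single-site equilibrium, as assumed in the text for the gene 44 RE 16-mer (it does not apply to poly(U), which is a lattice of overlapping sites). The exact quadratic solution accounts for depletion of free RNA. The function names and the maximal-quench value are hypothetical.

```python
# Sketch: expected quenching in a reverse titration for 1:1 binding, P + R <-> PR.
import math


def complex_concentration(k_assoc, protein_total_M, rna_total_M):
    """Equilibrium [PR]: the physical root of the mass-action quadratic."""
    b = protein_total_M + rna_total_M + 1.0 / k_assoc
    return (b - math.sqrt(b * b - 4.0 * protein_total_M * rna_total_M)) / 2.0


def fraction_protein_bound(k_assoc, protein_total_M, rna_total_M):
    """Fraction of RegA in complex, which the quenching signal reports on."""
    return complex_concentration(k_assoc, protein_total_M, rna_total_M) / protein_total_M


def observed_quench(k_assoc, protein_total_M, rna_total_M, max_quench):
    """Quenching = maximal quenching x fraction of protein bound."""
    return max_quench * fraction_protein_bound(k_assoc, protein_total_M, rna_total_M)


# Illustration: K = 4.0e7 M^-1 with 0.5 uM protein and 0.5 uM RNA.
theta = fraction_protein_bound(4.0e7, 0.5e-6, 0.5e-6)
quench = observed_quench(4.0e7, 0.5e-6, 0.5e-6, max_quench=0.30)  # 0.30 is hypothetical
```

Fitting measured quenching against total RNA with this function, with K and the maximal quench as free parameters, is one common way to extract an equilibrium constant from such titrations.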
The affinities of mutant proteins for a specific RNA (T4 gene 44 RE RNA (4)) and a nonspecific RNA (poly(U)) were assessed to evaluate the effect of these mutations on the RNA binding properties of RegA protein (Table I). As shown in Fig. 2A and Table I, all three purified mutant proteins bound gene 44 RE RNA with affinities similar to that of wild type (WT) RegA protein. In addition, K41L and K42L RegA proteins bound poly(U) with WT affinity (Table I); however, K7L bound poly(U) with approximately 10-fold lower affinity. Previous studies (4) have indicated that RegA protein binds polynucleotides with weak cooperativity (i.e. with a cooperativity parameter (ω) of ~10-12). Thus, the lower affinity of K7L for poly(U) could result from a decrease in RNA binding affinity or from alterations in protein-protein interactions leading to reduced cooperativity in RNA binding.
To assess the effects of the mutations on electrostatic interactions between RegA and RNA, salt-back titrations (18) were performed, in which NaCl was added incrementally to RegA protein-poly(U) complexes. As shown in Fig. 2C, RNA complexes formed with mutated proteins K41L and K42L exhibited salt sensitivities similar to that of WT RegA protein, while K7L-poly(U) complexes were considerably more sensitive to ionic strength. These data suggest the observed reduction in Kapp for nonspecific poly(U) (Table II) may be due to decreased protein affinity for RNA, rather than decreased cooperativity, since RegA protein-protein interactions have been found to be unaffected by salt concentrations (150 mM NaCl) that disrupted the K7L RNA complexes (7).
FIG. 2 (legend fragment). ... a binding site size of 9 nucleotides/protein was used to calculate Kapp from the binding curves. Titrations were performed at 25°C. C, salt-back titrations of RegA proteins bound to poly(U). Titrations were made by the addition of NaCl to samples saturated with poly(U). The percentage of initial fluorescence (% fluorescence) was estimated from comparison with a photobleach control of each protein, which was corrected for dilution effects.
TABLE I (footnotes; table body not recovered apart from the entries 2.8 × 10⁷ and 1.9 × 10⁷). a, binding affinities were determined by equilibrium fluorescence quenching assays (4); S.D. values are given in those instances where multiple determinations were carried out. b, measured in buffer C plus 150 mM NaCl. c, measured in buffer C plus 10 mM NaCl. Kapp calculation assumed the binding site size (n) = 9 (4).
TABLE II (footnotes). a, Kapp was determined by quantitative RNA gel shift analysis (see "Materials and Methods"). Binding was performed in 10 mM Tris (pH 7.5), 50 mM NaCl, and 1 mM EDTA. Values are the mean of 2-4 experiments except for WT (nine experiments). b, all mutated residues are located on the surface, except for Tyr46, where the OH group is exposed, but most of the aromatic ring is in the interior. c, -, no binding observed.
Evaluation of RNA Binding Affinities by Gel Shift Assays-To eliminate the requirement for purification of individual mutant proteins, the activities of the remainder of the mutated RegA proteins were assessed by RNA gel mobility shift assays (21,22) rather than fluorescence quenching assays. As shown in Fig. 3, the addition of lysed cell supernatants containing WT RegA protein (induced AR120/pAS1regA cells) to ³²P-labeled gene 44 RE RNA produced a shift in the electrophoretic mobility of the RNA, while cell supernatants that did not contain RegA protein (induced AR120/pAS1 cells) did not alter RNA mobility. To determine the affinity of a mutant protein for gene 44 RE RNA, increasing volumes of cell supernatants were incubated with a constant amount of RNA. In these assays, final RegA protein concentrations (see "Materials and Methods") ranged from 2.5 to 40 nM and [RNA] was 10 nM. The fraction of ³²P-labeled RNA bound by RegA protein in each reaction was determined by PhosphorImager analysis (see "Materials and Methods"), to generate a titration binding curve. Cell supernatants containing K7L, K41L, and K42L proteins each produced shifts in RNA mobility, with Kapp values of 4.8 × 10⁷ M⁻¹, 4.9 × 10⁷ M⁻¹, and 2.6 × 10⁷ M⁻¹, respectively, compared with the affinity of WT RegA protein of 4.7 × 10⁷ M⁻¹ (Table II). The WT RegA protein affinity measured by gel shift assays (4.7 × 10⁷ M⁻¹ in 50 mM NaCl) agrees well with the value determined by fluorescence equilibrium binding assays (4.0 × 10⁷ M⁻¹, in 50 mM NaCl (Ref. 4, Fig. 8)). This supports the validity of the quantitative gel shift method to measure RNA binding affinities for mutant RegA proteins. Gel shift assays demonstrated that K8L RegA protein bound RNA with a Kapp value of 2.1 × 10⁷ M⁻¹, indicating this mutation, like the K42L mutation, has a small effect on RNA binding (2-fold reduction).
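The fold reductions quoted for the basic loop mutants follow directly from the gel shift Kapp values given in the text; the short calculation below simply takes the ratio of the WT value to each mutant value. The dictionary layout and names are illustrative.

```python
# Worked arithmetic: fold reduction in affinity = Kapp(WT) / Kapp(mutant).
# Kapp values (M^-1) are the gel shift values quoted in the text.

K_APP_WT = 4.7e7

k_app_mutants = {
    "K7L": 4.8e7,
    "K8L": 2.1e7,
    "K41L": 4.9e7,
    "K42L": 2.6e7,
}

fold_reduction = {name: K_APP_WT / k for name, k in k_app_mutants.items()}

for name, fold in fold_reduction.items():
    print(f"{name}: {fold:.1f}-fold")
```

Ratios near 1 (K7L, K41L) indicate WT-like binding, while K8L and K42L come out at roughly 2-fold, consistent with the description of a small effect.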

Mutagenesis of Residues on β-Sheet B of Domain I-In the U1A-RNA complex, three residues within the RNP-2 segment (β-strand 1) (Tyr13, Asn15, and Asn16) form bonds with RNA (12). In addition, three residues in the RNP-1 segment (on β-strand 3) (Arg52, Gln54, and Phe56) form critical interactions with RNA, which include formation of base-specific hydrogen bonds (Arg52) and stacking with RNA bases (Phe56). To see if residues on the central β-strands of β-sheet B (in domain I) play a role in RegA-RNA complex formation, mutations were introduced at Ile4, Thr5, and Leu6 on β-strand 1 and at Leu44, Tyr45, and Tyr46 on β-strand 5. Ala substitutions were made at Ile4 and Thr5 to eliminate potential hydrophobic and hydrogen bond interactions, respectively; Leu6 was mutated to Arg, and Leu44 was mutated to Lys to introduce basic charges at those hydrophobic sites, and Tyr45 and Tyr46 were mutated to Ala to eliminate potential base-stacking interactions with RNA. Of these six mutations, only the Tyr46 substitution significantly affected RNA binding, producing a 3-fold reduction. Thus, none of these mutations eliminated or severely (>10-fold) reduced RNA binding. In contrast, mutations in RNP-1 and RNP-2 residues in U1A protein reduced affinity 15-140-fold or eliminated RNA binding (23). Taken together with the effects of the basic loop mutations, these results suggest that β-sheet B and adjacent loop residues of domain I do not contribute significantly to RNA binding.
Mutagenesis of Domain II of RegA-To determine if residues on β-sheet A play a role in RNA binding, mutations were introduced at surface residues Asn26, Lys30, Tyr33, Gln34, and Lys113. Mutations were also introduced at nearby basic residues on the hairpin loop (Lys28), loop 7 (Lys107), helix D (Lys109), and loop 8 (Lys117). As shown in Table III, none of the nine mutations significantly reduced RNA affinity. Thus, β-sheet A and basic residues flanking it do not appear to directly participate in RNA binding. It is interesting to note that for the three mutations in this region and the six mutations in domain I where Leu was substituted for Lys (Tables II and III), the altered proteins remained soluble and retained WT RNA binding affinity, indicating that the introduction of a hydrophobic residue on the surface of RegA does not lead to misfolding of the protein. Overall, the lack of effect on RNA affinity of the 19 mutations described above suggests that when mutations are restricted to surface residues, as deduced from crystal structure data, RegA protein retains its native structure.
Modeling of RegA Protein-To examine the surface topography of RegA protein for potential RNA binding pockets or platforms, space filling and ribbon models were generated from the RegA protein crystal structure coordinates (Protein Data Bank (Rutgers University), accession number 1REG), using the program SYBYL (Tripos, Inc.). These models revealed the presence of a pocket (loosely defined by residues Phe106, Lys14, Arg21, and Trp81) that is formed at the interface of domains I and II and a cleft, formed between the C-terminal 5 residues and residues on α-helix B. As shown in Fig. 4, the cleft is narrow (10 Å) and deep, and there is an aromatic side chain (Tyr118) at the top of the cleft. The pocket region is shallow and contains two unusually exposed aromatic side chains (Fig. 4). The fact that this pocket contains Phe106, which was previously found to be the site of photocross-linking to nucleic acid (8), as well as two basic residues, makes this region a strong candidate for the site of RNA binding.
Mutagenesis of the Aromatic Pocket-To test the role of the aromatic pocket region in RNA binding, individual substitutions were made at Trp81, Lys14, and Arg21. As shown in Fig. 5, the K14A mutation greatly reduced the RNA binding affinity of RegA protein, while the R21A mutation abolished RNA binding. Substitution of Trp81 with Ala also abolished RNA binding (see Table II). Previous studies have shown that mutations at Phe106 do not significantly affect RNA binding affinity (9), which suggested that although this residue lies at the interface of RegA-nucleic acid complexes (8), it does not form a bond to RNA.
Residues Arg21 and Lys14 lie on α-helix A, while Phe106 and Trp81 are on flanking loops (Fig. 4). To explore the potential role of other residues on α-helix A (containing residues 10-21), mutations E10A, D11A, F12A, K16L, E17A, and T18A were constructed. Residues Val15 and Leu19 are in the core of the protein and were not mutated. As shown in Table II, the T18A substitution had an even greater effect than the K14A substitution, reducing affinity by approximately 30-fold. The Phe12, Lys16, and Glu17 mutations produced only moderate or no reductions in RNA affinity, while the Glu10 and Asp11 mutations slightly increased RNA affinity. Mutagenesis of two residues on the flanking loops, Leu83 and Lys107, did not affect RNA binding affinity (Tables II and III). Two mutations were introduced to test whether RNA interactions might extend from the pocket into the cleft region. A truncation was made deleting residues 119-122, which form one side of the cleft, and a Y118A mutation was made. Neither modification significantly altered the RNA binding affinity of RegA protein (Table III). Also, the K117L mutation described above, which lies in this region, did not reduce RNA affinity. Thus, the cleft region does not appear to play a role in RNA binding.
TABLE III (footnotes; table body not recovered apart from the entry 2.5 ± 0.6). a, Kapp was determined by quantitative RNA gel mobility shift assays, as described in Table II. b, location of residues and numbering of helices and strands is as in Table II. All residue side chains are located on the surface except Tyr 112, which is only partially exposed.
As noted above, the quenching of the intrinsic fluorescence of RegA that occurs upon RNA binding is due to a change in the environment of one or more of the three tryptophan residues. The effect of the W81A mutation on RNA affinity suggests that Trp81 contributes to the observed fluorescence change. To determine if either of the other two tryptophan residues is involved in RNA binding, Ala substitutions were introduced at Trp76 and Trp112. As shown in Tables II and III, mutagenesis of Trp76 did not affect RNA affinity, while mutation of Trp112 eliminated RNA binding. Jozwik and Miller (24) saw a similar effect in vivo for a W112C mutation, suggesting that Trp112 plays a role in RNA binding. However, the crystal structure of RegA shows that the side chain of Trp112 is predominantly buried, suggesting that the observed loss of RNA binding could be due to structural alterations rather than functional alterations. Thus, structural studies would need to be performed before a conclusion can be made about the role of Trp112 in RNA binding.
Assessment of the Effect of Mutations on RegA Structure-The observed reduction in the RNA binding affinity of RegA proteins with mutations in the aromatic pocket could be due to loss of a protein-RNA interaction or due to protein misfolding. To determine if mutant proteins are properly folded, three mutant proteins were purified, and their conformations were assessed by CD spectroscopy. To facilitate purification, mutations were introduced into an expression vector coding for RegA protein with four His residues fused to the COOH terminus. The His4 tag was added to the COOH terminus because the NH2 terminus is in the core of the protein, whereas the COOH terminus is on the surface of RegA protein. WT-His4, K14A-His4, T18A-His4, and W81A-His4 proteins were purified, and their ultraviolet CD spectra were examined. A fourth mutant, R21A-His4, was expressed but was not purified because only a fraction of the recombinant protein was present in cell extract supernatants. As shown in Fig. 6A, RegA proteins K14A-His4, T18A-His4, and W81A-His4 exhibit spectra with a single minimum at 208 nm and a shoulder at 220 nm, which is characteristic of the WT RegA protein spectrum (Ref. 9 and Fig. 6). Comparison of the CD spectra of WT and WT-His4 RegA proteins showed no detectable differences (Fig. 6B), confirming that the His4 tag does not alter the secondary structure of RegA protein. Gel shift assays indicate that WT-His4 RegA protein binds gene 44 RE RNA with an affinity similar to that of WT RegA (data not shown). This is consistent with the report of Allen and Miller (25) that the addition of six His to the COOH terminus of RegA protein does not dramatically alter the RNA binding properties of RegA protein. The fact that the overall shape and intensities of the CD spectra for the three mutant proteins are the same as that of WT RegA protein suggests that there is little difference in the proteins' secondary structures. The observation that the spectra crossover points (~198 nm), which are independent of concentration effects, are also similar for the mutant and WT proteins further supports the conclusion that the K14A, T18A, and W81A mutations do not significantly affect the secondary structure of RegA protein.
Figure legend (fragment). RegA protein concentrations were determined as in Fig. 3. Note that this gel image presents a darker "exposure" than the image in Fig. 3 to enable visualization of the bound RNA band in K14A samples.

DISCUSSION
In this report, the functional activity of surface residues within both domains I and II of RegA protein has been explored. Ten residues in or near β-sheet B of domain I were assayed by site-specific mutagenesis. Substitutions of residues in the central two β-strands and the basic loop regions of domain I of RegA had little (2-3-fold) or no effect on RNA binding. In contrast, mutagenesis of residues in the central β-strands (containing the RNP-1 and RNP-2 residues) of U1A protein (e.g. residues Arg52, Gln54, and Phe56) reduced affinity 10-100-fold or completely abolished RNA binding (23,26). The minor effects of mutations in potentially analogous residues in RegA (i.e. residues Lys42, Leu44, and Tyr46) suggest that β-sheet B does not function as an RNA binding platform. Similarly, mutation of basic loop residues of U1A protein (e.g. Lys22 and Lys23) had large effects on RNA binding affinity (26), while mutation of basic loop residues in RegA protein had only minor effects. In addition, five surface residues on β-sheet A of domain II do not appear to contribute to RNA binding. This is an important finding, because β-sheet residues have been found to play critical roles in protein-RNA complexes, even when different overall structural motifs are involved (e.g. the MS2 coat protein versus U1A protein).
These extensive mutagenesis studies of RegA protein have shown that there are four residues (Lys14, Thr18, Arg21, and Trp81) where substitutions significantly impair or abolish RNA binding. Since three of the four mutations do not affect the secondary structure of RegA protein, it is likely that the observed effects on RNA binding affinity are due to loss of RNA-protein interactions. The clustered location of these residues near the site of photocross-linking to nucleic acid (8) suggests that RNA binds to a surface pocket on RegA protein formed by residues on loop, helix, and loop secondary structures. As shown in Fig. 7, this arrangement positions aromatic residues on the loops and basic and polar groups on the α-helix. The helix residues offer a number of side chains that could form hydrogen bonds to RNA bases or the ribose-phosphate backbone. As shown in Fig. 7, the three helix residues that are critical for RNA binding are clustered on one face of helix A, spanning approximately 90° of the helix. Within this helix are two acidic side chains (Asp11 and Glu10) (Fig. 7), where mutations that eliminate the negative charge slightly increase RNA binding affinity. This may indicate that these residues lie near the binding site of the ribose-phosphate backbone and that some charge repulsion may occur in the WT protein. It is interesting to note that two of the three helix residues that potentially participate in RNA binding are charged and that hydrogen bonds involving a charged residue contribute more to the free energy of complex formation than hydrogen bonds between uncharged groups (27). In studies with tyrosyl-tRNA synthetase, Fersht et al. (27) have shown that a hydrogen bond to a charged group can contribute up to 3-4 kcal/mol to complex stability, which provides a factor of about 10^3 in specificity. 
In the case of RegA protein, mutagenesis suggests that Arg21 may make a high energy hydrogen bond to RNA, with smaller energy contributions coming from hydrogen bonds between RNA and Lys14 and Thr18. This is consistent with an earlier estimate that the number of ionic interactions involved in RegA-RNA binding is between two and three, based on the effect of [NaCl] on RegA-RNA affinity (4). Additional stability (and specificity) to the complex apparently comes from interactions between RNA and Trp81, which may include base-stacking interactions. Bonds formed between RNA and main chain atoms are also likely to make contributions to complex stability, although these bonds would not be detected by the mutagenesis approach used here. Small contributions also may come from side chains of adjacent residues in the two loop regions; however, the bulk of the energy of complex formation apparently comes from bonds to Lys14, Thr18, Trp81, and potentially Arg21.
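The energetic estimates above follow from the standard relation between a difference in binding free energy and a ratio of equilibrium constants, ΔΔG = RT ln(K1/K2). The following is a minimal sketch of that arithmetic, not a calculation taken from this study; the temperature (25 °C) and the function names are assumptions introduced for illustration only.

```python
import math

# Molar gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3

# Assumed temperature (25 degrees C); the text does not state the
# temperature behind the quoted energy estimates.
DEFAULT_TEMP_K = 298.15


def fold_change_from_ddg(ddg_kcal_per_mol, temp_k=DEFAULT_TEMP_K):
    """Ratio of binding constants implied by a free-energy difference."""
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temp_k))


def ddg_from_fold_change(fold, temp_k=DEFAULT_TEMP_K):
    """Free-energy difference (kcal/mol) implied by a fold change in affinity."""
    return R_KCAL * temp_k * math.log(fold)


if __name__ == "__main__":
    # Hydrogen bond to a charged group: 3-4 kcal/mol (Ref. 27 in the text).
    for ddg in (3.0, 4.0):
        print(f"{ddg:.1f} kcal/mol -> {fold_change_from_ddg(ddg):.0f}-fold")
    # Fold changes of the sizes discussed for the mutant proteins.
    for fold in (2, 3, 10, 100):
        print(f"{fold}-fold -> {ddg_from_fold_change(fold):.2f} kcal/mol")
```

Under these assumptions, 3-4 kcal/mol corresponds to a roughly 160- to 850-fold change in affinity, in line with the factor of about 10^3 cited from Fersht et al. (27), whereas a 2-3-fold change such as that seen for the β-sheet mutants corresponds to less than 1 kcal/mol.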
Prior to solution of the crystal structure of RegA protein, Jozwik and Miller (24) used a genetic approach to identify functionally important residues in RegA. From examination of the in vivo translational repression properties of 15 regA mutants, they identified five mutants that exhibited altered repression activity, where some target genes were repressed, but not others. Interestingly, two of the mutants with altered repression specificity had substitutions in residues on the RNA binding helix, i.e. mutants T18A and E17K. The reduced RNA affinity of the T18A mutation observed in our in vitro studies is consistent with the reported repression of high affinity target RNAs and the loss of repression of the low affinity targets in vivo (24). The roles of the other residues where mutation affected repression specificity, i.e. Ala25, Asp72, and Ser73, are not as clear, in that Ala25 is located some distance from the binding pocket, and Asp72 and Ser73 are buried in the core of the protein.
Are there other examples of protein helix-RNA interactions? α-Helices frequently play a role in protein binding to double-stranded DNA, where both helix-turn-helix and zinc finger motifs use α-helices as recognition elements in binding to the major groove of DNA (for discussion, see Ref. 28). Although fewer examples of helices being involved in protein-RNA interactions are known, one example is the binding of human immunodeficiency virus REV protein to REV response element RNA. A 19-amino acid peptide corresponding to the RNA binding site of REV forms an α-helix in solution and has been found to bind specifically to the REV response element hairpin IIB RNA (29). Genetic studies have demonstrated the importance of specific residues on the helix in RNA recognition and have led to development of a model of how the REV helix might dock in the major groove of double-stranded RNA (30). Presumably, the helix projects from the surface of REV, where it can position side chains into the groove of the RNA. In RegA protein, the helix lies within a recessed pocket, which potentially offers its target single-stranded RNAs multiple opportunities for interaction with residues in the sides and trough of the pocket. The formation of an RNA binding pocket by residues in helix and loop regions of the protein appears to be unique to RegA protein and may be related to the protein's unusual RNA recognition properties.
Two remaining questions are how RegA protein discriminates between target and non-target RNAs and what features contribute to the observed differences in affinities for target RNAs. A recurrent theme in protein-RNA recognition is that RNA structure is often altered by protein binding (for discussion, see Ref. 31). Examples in which RNA distortion occurs upon protein binding include the binding of tRNAGln to glutaminyl-tRNA synthetase (10), the binding of U1A protein to U1 RNA (32,33), and the binding of the REV peptide to the REV response element (34). In the case of RegA protein, RNA discrimination may well depend on deformability of the RNA as well as nucleotide sequence recognition. In fact, since deformability can depend on nucleotide sequence (10), the ability of an RNA to alter its structure to fit into a pocket on RegA protein may be a form of nucleotide discrimination. The previous finding that two dodecamer RNAs with a single base difference have different affinities for RegA protein but have the same structure in solution (35) is consistent with RNA deformability being a critical element in RegA-RNA recognition. Now that the unique binding site on RegA protein has been deciphered, future experiments can be directed at determining how different RNAs might be accommodated in the binding pocket.
sity of South Carolina Biomedical Computing Resource Facility for assistance in protein modeling.