The Structure of the Periplasmic Ligand-binding Domain of the Sensor Kinase CitA Reveals the First Extracellular PAS Domain*

The integral membrane sensor kinase CitA of Klebsiella pneumoniae is part of a two-component signal transduction system that regulates the transport and metabolism of citrate in response to its environmental concentration. Two-component systems are widely used by bacteria for such adaptive processes, but the stereochemistry of periplasmic ligand binding and the mechanism of signal transduction across the membrane remain poorly understood. The crystal structure of the CitAP periplasmic sensor domain in complex with citrate reveals a PAS fold, a versatile ligand-binding structural motif that has not previously been observed outside the cytoplasm or implicated in the transduction of conformational signals across the membrane. Citrate is bound in a pocket that is shared among many PAS domains but that shows structural variation according to the nature of the bound ligand. In CitAP, some of the citrate contact residues are located in the final strand of the central β-sheet, which is connected to the C-terminal transmembrane helix. These secondary structure elements thus provide a potential conformational link between the periplasmic ligand binding site and the cytoplasmic signaling domains of the receptor.

Specific recognition of environmental conditions is an important prerequisite for the adaptive regulation of gene expression in response to external stimuli. In bacteria, this response is often mediated by two-component regulatory systems consisting of a sensor kinase and a response regulator. Transmembrane sensor kinases are used for the recognition of external stimuli and typically contain an extracytoplasmic sensor domain, flanked by two transmembrane helices, and a cytoplasmic histidine kinase domain. In many cases, they also contain additional cytoplasmic modulatory and one or more signal transfer domains (1,2). Coupling between the sensor and kinase domains enables these proteins to transduce a signal across the membrane to the cytoplasm, initiating a downstream cascade that involves the autophosphorylation of a conserved histidine residue within the kinase domain and the subsequent transfer of the phosphoryl group to an aspartate residue in the corresponding response regulator (1).
In the past two decades, hundreds of two-component systems have been identified. Based on knock-out experiments, few two-component systems appear to be essential. However, some have been implicated in the control of bacterial virulence and drug resistance. Furthermore, their adaptive importance and complete absence in mammals could provide an attractive set of antimicrobial drug targets (2). Nevertheless, many aspects of their molecular function remain obscure. In particular, for the ligand-binding sensor kinases little is known about such fundamental processes as the stereochemistry of ligand-binding, the nature of the associated conformational changes, and the mechanism of signal transduction to the cytoplasmic kinase domain.
To obtain a more detailed understanding of these processes, we have analyzed the crystal structure of the ligand-binding domain of the sensor kinase CitA from Klebsiella pneumoniae. Together with the response regulator CitB, CitA is required for the induction of citrate fermentation genes by the presence of environmental citrate under anoxic conditions. These genes are organized in a cluster that also includes, for example, subunits of citrate lyase, the citrate carrier CitS and oxaloacetate decarboxylase (3,4). Synthesis of these proteins requires careful regulation, because inappropriate expression would interfere with central metabolic pathways. Thus, even though CitA appears to be nonessential (1,2), targeted dysregulation of its pathway could provide therapeutic approaches.
CitA exhibits a domain organization similar to that described above, i.e. a periplasmic citrate-binding domain flanked by two transmembrane helices and followed by linker and kinase domains located in the cytosol (3). The periplasmic domain of CitA can be expressed as a soluble polyhistidine fusion protein (CitAP) that exhibits highly specific citrate binding properties (5).
Biochemical investigations of the interactions between citrate and CitAP wild-type and mutant proteins have been reported previously (5,6). In this study we present the structure of CitAP in complex with citrate and molybdate, revealing the ligand-binding stereochemistry of an integral membrane sensor kinase and providing initial insights into possible mechanisms of signal transduction.

EXPERIMENTAL PROCEDURES
Crystallization-CitAP was expressed in Escherichia coli strain BL21 (DE3) as a 15.3-kDa polyhistidine fusion protein and purified as previously described (5). Selenomethionine (SeMet)-labeled CitAP was produced in BL21 (DE3) cells grown in M9 medium (0.04% glucose, w/v) as described (7). Purified CitAP was dialyzed extensively against 10 mM NaCl, 10 mM Tris, pH adjusted to 8.0 with HCl at 20°C. Na-citrate (1 mM) was added to the protein (16 mg/ml) prior to crystallization. Initial conditions for crystal growth were identified by microbatch trials performed at the High Throughput Crystallization Laboratory at the Hauptman-Woodward Institute (Buffalo, NY) (8). Following optimization of conditions, crystals were grown in hanging drops by vapor diffusion against buffer containing 100 mM Na₂MoO₄, 100 mM MES, pH 6.0, and either 8% (w/v) polyethylene glycol (PEG) 4000 and 2.5% (v/v) glycerol or 14% PEG-4000 and 15% glycerol for the native and SeMet-labeled protein, respectively. Protein and reservoir buffer were mixed 1:1. The best crystals were obtained at 16 and 27°C for the native and labeled proteins, respectively. Prior to freezing in liquid N₂, the crystals were soaked briefly in a solution of 0.5 mM Na-citrate, 25% (w/v) PEG-4000, and 20% (v/v) glycerol.
Structure Determination-Oscillation data were collected at beamline ID14-4 of the European Synchrotron Radiation Facility (Grenoble, France) at 100 K using an ADSC Q4R CCD detector. For phasing, data were collected from crystals of SeMet-labeled protein at the Se K-edge (λ = 0.9793 Å, 1° oscillations, 0.5 s, 180-mm plate distance). Diffraction data for crystals of native protein were collected at λ = 0.9393 Å (1° oscillations, 1 s, 140-mm plate distance). Data were processed in space group P1 using XDS (9). A test set was established containing 3% of the reflections chosen in thin resolution shells. The Matthews volume was estimated at 2.8 Å³/Da, assuming the presence of 10 molecules per asymmetric unit. Self-rotation functions also indicated the presence of up to five independent non-crystallographic 2-fold axes.
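The Matthews volume quoted above relates the unit-cell volume to the protein mass it contains. The following is a minimal sketch, not the authors' procedure: it works backwards from the reported values (2.8 Å³/Da, 10 molecules of 15.3 kDa, space group P1 with one asymmetric unit per cell) to the implied cell volume and estimates the solvent fraction with the commonly used approximation 1 − 1.23/V_M. The function names are illustrative, and the constant 1.23 assumes a typical protein partial specific volume.

```python
def matthews_volume(cell_volume_A3, n_molecules, mass_da):
    """Matthews coefficient V_M (Å^3/Da) for a cell holding n_molecules."""
    return cell_volume_A3 / (n_molecules * mass_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction, assuming protein occupies ~1.23 Å^3/Da."""
    return 1.0 - 1.23 / v_m


MASS_DA = 15300.0   # 15.3-kDa fusion protein (from the text)
N_MOLECULES = 10    # molecules per asymmetric unit; P1, so one unit per cell
V_M = 2.8           # reported Matthews volume, Å^3/Da

# Cell volume implied by the reported V_M (an inference, not a reported value)
implied_cell_volume = V_M * N_MOLECULES * MASS_DA

print(round(implied_cell_volume))
print(round(solvent_fraction(V_M), 2))
```

Under these assumptions the reported Matthews volume corresponds to a solvent content of roughly 56%.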
Forty ordered Se sites were identified in the asymmetric unit using single wavelength anomalous techniques as implemented in SHELXD (10), corresponding to ten monomers. Initial phases were calculated using SHELXE (10) and improved using RESOLVE (11), which automatically identified and applied 2-fold non-crystallographic symmetry (NCS). Ten clusters of very strong positive peaks in the electron density were used to determine initial NCS operators (these were later identified as Mo₇O₂₄ clusters).
Model Building and Refinement-Using Arp/Warp 6.0 (12) and the CCP4 program suite (13), 1068 of the 1390 protein residues were placed in a solvent-flattened, 2-fold NCS-averaged initial electron density map. For the most completely traced monomer (132/139) all side chains were autodocked using the program GUISIDE (12). The other nine copies were generated using the initial NCS operators.
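The counts given for phasing and automated tracing can be cross-checked with simple arithmetic. The short sketch below uses only the numbers stated in the text (40 Se sites, 10 monomers, 139 residues per monomer, 1068 residues traced, 132 in the best monomer); the variable names are illustrative.

```python
MONOMERS = 10
RESIDUES_PER_MONOMER = 139
SE_SITES = 40
TRACED_RESIDUES = 1068
BEST_MONOMER_TRACED = 132

# Total residues in the asymmetric unit: 10 x 139
total_residues = MONOMERS * RESIDUES_PER_MONOMER

# Ordered selenium sites per monomer
se_sites_per_monomer = SE_SITES / MONOMERS

# Fraction of the model placed automatically, overall and for the best monomer
traced_fraction = TRACED_RESIDUES / total_residues
best_monomer_fraction = BEST_MONOMER_TRACED / RESIDUES_PER_MONOMER

print(total_residues, se_sites_per_monomer)
print(f"{traced_fraction:.1%} overall, {best_monomer_fraction:.1%} best monomer")
```

The totals agree with the 1390 residues quoted for ten monomers, and about three quarters of the residues were placed automatically before NCS copies were generated.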
The resulting model was then refined against the native data set. Based on values of mean I/σ(I) and Rmrgd-F as a function of resolution (Table II), data to 1.6 Å resolution were used. In the first cycle the individual monomers were treated as rigid bodies. Subsequent cycles of simulated annealing, conjugate gradient minimization, and temperature-factor refinement against maximum-likelihood targets were followed by manual rebuilding, using the programs CNS (14) and O (15), respectively. In the first cycle, experimental phase restraints were used. Tight NCS definitions were applied during refinement and were partially released in the final cycles as necessary to account for alternate conformations and crystal contacts.
The final model contains 10 monomers composed of amino acid residues 5-135, each with one Mo₇O₂₄ cluster, one MoO₃-citrate, and one Na⁺. A total of 1573 water molecules were added. Refinement statistics are presented in Table I.
Size-exclusion Chromatography-Size-exclusion chromatography was performed using a Superose 12 HR 10/30 column (Amersham Biosciences), pre-equilibrated in 100 mM MES, 150 mM NaCl, 0.02% Na-azide, pH 6.0, in the presence or absence of 1 mM Na-citrate. Experiments were performed at 25°C, and concentrations were held at 34 mg/ml. 100 µl of sample was loaded onto the column and eluted at 0.6 ml/min. Elution profiles were monitored by UV absorbance at 280 and 218 nm.
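For orientation, the mass concentrations used here can be converted to molar terms. The sketch below assumes the 15.3-kDa mass of the fusion protein given under Crystallization; the helper name is illustrative.

```python
def mg_per_ml_to_mM(mg_per_ml, mass_da):
    """Convert a protein concentration from mg/ml to mmol/l (mM)."""
    # mg/ml equals g/l; dividing by g/mol gives mol/l; x1000 gives mM
    return mg_per_ml / mass_da * 1000.0


MASS_DA = 15300.0  # 15.3-kDa CitAP fusion protein (from the text)

sec_conc_mM = mg_per_ml_to_mM(34.0, MASS_DA)       # size-exclusion samples
crystal_conc_mM = mg_per_ml_to_mM(16.0, MASS_DA)   # crystallization stock
loaded_mass_mg = 34.0 * 0.100                      # 100 µl loaded at 34 mg/ml

print(f"{sec_conc_mM:.2f} mM, {crystal_conc_mM:.2f} mM, {loaded_mass_mg:.1f} mg")
```

On this basis the chromatography samples were about 2.2 mM in protein, and the 16 mg/ml crystallization stock was about 1.05 mM, close to equimolar with the 1 mM citrate added.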

RESULTS
Overall Structure-The structure of CitAP in complex with citrate was determined at an initial resolution of 1.9 Å using single wavelength anomalous diffraction data from crystals of SeMet-labeled CitAP (Table I and Fig. 1). The asymmetric unit contains 10 monomers (referred to here as monomers A-J). A model containing residues 5-135 of each CitAP monomer, ligand moieties, and 1573 water molecules was refined against native diffraction data to 1.6 Å resolution, yielding an Rfree of 19.0% with excellent geometry (Table I). Crystals formed only in the presence of molybdate.
In the ligand-bound state, CitAP forms a mixed α/β-structure (Fig. 2, A and B) with a central five-stranded β-sheet containing residues 56-61, 66-68, 94-100, 104-110, and 121-128 (residue 2 of CitAP corresponds to residue 45 in the native CitA sequence). This sheet is flanked on one side by the protein N terminus, which forms two long, approximately parallel α-helices (residues 10-24 and 38-51) connected by a third, shorter α-helix nearly perpendicular to the first two (residues 27-34). The other side of the β-sheet is packed against the long peptide that connects strands S2 and S3 of the sheet and that contains a short 3₁₀-helix and a short α-helix. The sheet and the interstrand peptide together form a deep concave pocket that contains electron density for a 1:1 complex of the citrate ligand with an additional moiety well fit by the structure of MoO₃ (16), consistent with the presence of molybdate in the crystallization buffer (Fig. 1). In this complex, the molybdenum is octahedrally coordinated by the three oxo groups and a tridentate interaction with citrate (Figs. 1 and 2B).
The discovery of a MoO₃-citrate complex in the binding site was unexpected, because CitAP binds citrate in vitro in the absence of metal ions other than sodium (present in the buffer) and because Mg²⁺ appears to inhibit the interaction (5). However, the role of metal ions in regulating the expression of the citrate fermentation genes has not been systematically investigated, except in the case of sodium, which is required (3). Thus, the observation that CitAP can accommodate MoO₃-citrate suggests that other metal-citrate complexes might also be recognized by CitA. Furthermore, in several systems, metal-citrate complexes have been implicated in regulating the transport of metals across the cell membrane. For example, in Bacillus subtilis, the homologous CitS/CitT two-component system regulates the expression of a Mg-citrate transport protein (17). In E. coli, the outer membrane receptor/transporter FecA up-regulates its own transcription in response to ferric citrate binding (18). By analogy, the ability of K. pneumoniae CitA to bind MoO₃-citrate raises the possibility that the associated citrate carrier CitS (3,19) may be able to transport not only free citrate but also metal-citrate complexes.
In addition to the citrate-bound MoO₃ group, two other nonprotein moieties were identified (Fig. 2A). One forms an octahedral complex involving three water molecules, the main-chain carbonyl group of Pro-111, and the hydroxyl moieties of Ser-110 and Ser-24, and thus links the central β-sheet with the C-terminal end of the H1 α-helix. Its position is marked with a green sphere in Fig. 2A. Based on its coordination geometry (20) and the composition of the mother liquor, this moiety has been modeled as a sodium ion. A second site with very strong electron density was found near the N and C termini and could be well fit by the structure of an isopolymolybdate cluster (Mo₇O₂₄; Fig. 2A, bottom) (21). To our knowledge, it represents the first example of a Mo₇O₂₄ cluster bound to a protein structure. Although unlikely to be physiologically relevant, the cluster mediates crystal packing interactions, consistent with the essential role of molybdenum for crystal formation.
CitAP Has a PAS Fold-A structural database search using DALI (22) (27), the oxygen sensor of FixL (28), and profilin (29). Based on sequence homology, a PAS domain has been proposed in the cytoplasmic portion of CitA between the second transmembrane and the kinase domains (30), but not for the periplasmic CitAP domain. Indeed, neither sequence-based fold prediction by BLAST (31), 3D-PSSM (32), or LOOPP (33) nor a manual search for S1/S2 boxes (34) indicated the PAS fold of CitAP.
Based on the structural similarity of CitAP and PYP (Fig. 3A), we aligned their sequences and that of FixL, because PYP and FixL represent the first structures determined for a PAS domain and for a ligand-binding PAS domain, respectively (24, 28) (Fig. 3B). Also included in the alignment are the sequences of the ligand-binding domains of the sensor kinases DcuS and PhoQ, both of which are homologous to CitA and are subjects of parallel structural studies (35,36).
The most strongly conserved feature of the PAS domain proteins is the presence of a central β-sheet (30). Superposition of the CitAP β-scaffold, composed of the strands S1-S5, with the corresponding strands Aβ, Bβ, Gβ, Hβ, and Iβ of PYP reveals a root mean square deviation of 1.2 Å for 39 Cα atoms (nomenclature for PYP PAS elements according to Ref. 28). In contrast, the structural elements on either or both sides of the β-scaffold show a much greater variability (Fig. 3A). The two major helices H1 and H3 of CitAP are more elongated than the Aα- and Bα-helices in PYP and are connected by a short helical linker instead of a loop. N-terminal α-helices corresponding to those of CitAP are found in the YGK9 GAF domain, the 30 S ribosomal subunit, and (cyclically permuted) in TM0065 and profilin. They are completely absent in the FixL and HERG PAS domains and are not, in fact, required for PYP function (37).
FIG. 2. [...] Fig. 3B. B, the CitAP structural elements forming the citrate-binding pocket (residues 51-134) are shown in stereo as a ribbon diagram, together with stick figures representing the bound citrate (Cit) and molybdate (Mo) groups and the protein side chains that contact them. The N-terminal helices are omitted for clarity. Atom coloring as in Fig. 1. C, the hydrogen-bonding network in the citrate-binding pocket is represented schematically using the program LIGPLOT (56). Ligand-ligand interactions and hydrophobic contacts are not shown. Atom colors as in panels A and B, except for carbon (black) and molybdenum (green). The MoO₃ and citrate moieties are shown with purple bonds, protein side chains with yellow bonds.
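The root mean square deviation quoted for the CitAP/PYP β-scaffold superposition is a distance measure over paired Cα atoms. The following is a minimal sketch of that measure only; it assumes the two coordinate sets have already been optimally superposed (the superposition in this work was produced by DALI, which is not reproduced here), and the coordinates are hypothetical toy values rather than data from either structure.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (Å) between two equal-length lists of (x, y, z) positions.

    Assumes the sets are already superposed and paired atom by atom.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty, equal-length coordinate lists")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy data: 39 positions spaced 3.8 Å apart along x,
# and a copy of them displaced uniformly by 1.2 Å along y.
reference = [(3.8 * i, 0.0, 0.0) for i in range(39)]
shifted = [(x, y + 1.2, z) for x, y, z in reference]

print(f"{rmsd(reference, shifted):.2f} Å over {len(reference)} atoms")
```

A uniform 1.2 Å displacement is used only to make the result easy to verify; in a real superposition the per-atom deviations vary along the chain.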
Ligand-binding Site-Additional differences between CitAP and PYP are observed on the opposite side of the β-scaffold (Fig. 3A), where the ligand-binding site is located (Fig. 2). In this region, CitAP lacks the Cα-helix and the short Bβ-strand, and the helical connector H5 is significantly shorter than the corresponding Fα-helix of PYP or FixL (Fig. 3B). Because of the smaller size of the interstrand loops in this part of CitAP compared with PYP, citrate is bound closer to the β-sheet and its pocket is more closed than is the case for the PYP 4-hydroxycinnamoyl moiety. Overall, the PAS binding pockets characterized to date exhibit considerable variability, consistent with the need to accommodate their respective structurally diverse cognate ligands (23, 24, 28, 38-41). Nevertheless, the binding pockets are typically located in the same general area between the β-sheet and the interstrand loops. Putative effector binding pockets similar in size and location to that of CitAP have also been found in the TM0065 signal-binding domain, the YGK9 GAF domain, and the HERG PAS domain, although the cognate ligands have not yet been identified for these domains (23,25,26).
In detail, the CitAP ligand-binding groove is formed by the β-sheet together with helix H4 (residues 72-76) and two flanking random coils (residues 69-71 and 77-84) (Fig. 2). The relevant contacts between the protein and the MoO₃-citrate can be divided into several groups. First, the binding of citrate is supported by hydrophobic contacts between citrate and the residues Tyr-56 and Met-79. Main-chain hydrogen bond interactions to citrate are mediated by Thr-58, Ser-101, and Leu-102. However, the majority of citrate-protein interactions are formed by hydrogen bonds originating from the side-chain atoms of Arg-66, His-69, Arg-107, Lys-109, and Ser-124. Except for Lys-109 and Ser-124, these residues are highly conserved among the CitA subfamily of histidine kinases (5, 6). Ser-124 is partially conserved in the CitA subfamily and is located in the final β-strand S5. Hydrogen bonds to the oxygens of MoO₃ are formed by Gly-81, Arg-107, and especially by Arg-98, which interacts with two MoO₃ oxygens. Among these residues only Gly-81 is partly conserved.
FIG. 3. CitAP has a PAS fold. A, a stereo Cα trace of CitAP (black) was superimposed on that of PYP (Ref. 24; red) by the program DALI (22). Numbers refer to CitAP sequence. B, the sequence of CitAP was aligned with those of other PAS domains using structural (Ectothiorhodospira halophila PYP, residues 5-125, and Bradyrhizobium japonicum FixL, residues 154-270) or sequence homology (E. coli PhoQ, residues 42-183, and E. coli DcuS, residues 42-189). For known structures, residues in α-helices and β-sheets are marked in red and blue, respectively. Schematics above the sequence represent secondary structure elements of CitAP (α-helices, bars; β-sheets, arrows) with corresponding labels above the schematics. Labels below the schematics indicate the published nomenclature for secondary structural elements of FixL (28).
Dimerization-In solution, CitAP exhibits a monomer-dimer equilibrium at millimolar protein concentrations in both the presence and absence of citrate (Fig. 4). Analysis of the lattice packing interactions in the crystal revealed two distinct candidate dimer interfaces (Fig. 5). In the first type (e.g. between monomers "E" and "G"), the interface is mainly built by the strands S3 and S4, the helical connector H5, and additional interactions in the N-terminal region (Fig. 5, A and B). This dimer interface buries a total of 1798 Å² of accessible surface. The dominant feature of the second dimer interface (e.g. between monomers "G" and "J") is the parallel association of the N-terminal helices H1 and H3 involving residues 10-24 and 38-51 (Fig. 5, C and D). This dimer formation decreases the total accessible surface by 1415 Å².
There are two major differences in the molecular architecture of the resulting dimers. In the EG dimer (Fig. 5, A and B), the interface is relatively planar and the two N-terminal helices are oriented roughly perpendicular to the molecular 2-fold, i.e. parallel to the presumed membrane surface (horizontal in Fig. 5, A and C; in the plane of the page in Fig. 5, B and D). In the GJ dimer (Fig. 5, C and D), the interface is characterized by a deep pocket in each surface that binds the Phe-51 side chain of the other molecule (Fig. 5D) and the N-terminal helices are arranged roughly perpendicular to the putative membrane. Although it is not possible to determine which of the two dimer interactions is physiological on the basis of these data alone, the GJ assembly is more suggestive of a mechanism for signal transduction, because it clusters the attachment points for the transmembrane helices and permits them to form as extensions of domain secondary structure elements. Furthermore, a GJ-like dimer has also been reported for a preliminary structure of the PhoQ histidine kinase sensor domain in the apo form (36), which, like CitAP, exhibits only weak dimerization in solution (42).

DISCUSSION
The ability of organisms to adapt to changing environmental conditions is essential for their long-term survival and requires a set of receptors that can recognize external stimuli and activate the appropriate responses. In bacteria, two-component regulatory systems play an important role in these processes. In particular, membrane-bound sensor kinases allow the detection of ligands outside of the cytoplasmic membrane, so that the cellular response does not require transport or diffusion of these ligands into the cell. However, despite their key role in bacterial adaptation, many aspects of the molecular mechanisms of these proteins are unknown. Here, we present the first detailed crystallographic analysis of a sensor kinase periplasmic signal-binding domain and of the interaction of this domain with its cognate ligand.
The structure reveals that CitAP adopts a PAS fold, unexpected on the basis of sequence comparisons (Figs. 2 and 3). The PAS fold has now been identified in domains from a functionally heterogeneous group of prokaryotic and eukaryotic proteins, often exhibiting negligible sequence identity. Named for the founding members Per, ARNT, and Sim, the family includes signal-recognition modules that regulate cellular responses to light, oxygen, voltage (LOV) and other environmental stimuli (28,30,38,41,43,44). In addition, PAS domains have been found in several other proteins, including a eukaryotic ion channel (26), a nucleotide-binding protein (25), and transcriptional regulators (23,39,40).
Based on sequence motifs, PAS domains have also been identified in a number of sensor histidine kinases, including CitA. However, these domains, like all PAS family members identified to date, are proposed to have an intracellular location. To our knowledge, the CitAP domain is the first representative of the PAS family located outside the cytoplasm. Because a large number of sensor kinase signal-binding domains exhibit sequence homology to CitAP (5, 6), we can anticipate that they will also adopt a PAS fold. Given its demonstrated flexibility in accommodating a wide variety of ligands in an evolutionarily protean binding pocket, the PAS fold appears well suited to the functional requirement that the sensor kinases be able to recognize chemically distinct environmental signals.
Citrate is generally bound to proteins via clusters of positively charged side chains that neutralize its multiple negatively charged carboxylate moieties (e.g. aconitase (45), yeast isocitrate dehydrogenase (46), citrate synthase (47), FecA (48)), although this is not strictly required (e.g. human aldose reductase (49), MoFe nitrogenase (50)). In the case of CitAP, two arginines, a lysine, and a histidine are found to interact with citrate in the pocket (Fig. 2), all of which had been previously identified by calorimetric binding studies of site-directed mutant CitAP domains (6). A third arginine that was found to influence citrate binding interacts indirectly with citrate via the MoO₃ group. Conservation of many of these countercharges suggests that the mode of binding is also likely to be found among the CitA subfamily of histidine kinases (5,6). Because the mutational study was performed in the absence of molybdenum, it also suggests that the binding stereochemistry of the citrato-molybdate complex observed in the crystal structure is similar to that of citrate alone. The formation of this MoO₃-citrate complex is consistent with the presence of molybdenum in the crystallization buffer and represents, to our knowledge, the first structure of the MoO₃ species in complex with protein.
In the absence of a citrate-free structure, the molecular rearrangements responsible for kinase activation cannot be directly visualized. Effector-induced conformational changes have been characterized for the oxygen sensor FixL and the light-sensing LOV1 and LOV2 domains. In these cases, only small, local changes are observed (28,38,41,44), and it appears that these changes may represent conserved signaling mechanisms (37,51). Because these sensing domains interact directly with their respective histidine kinase domains, relatively small changes are sufficient for downstream signaling. In contrast, the CitA ligand-binding and histidine kinase domains are located on opposite sides of the cytoplasmic membrane from each other, requiring distinct, long-range conformational changes for signal transduction.
Because CitAP is observed to dimerize in solution (Fig. 4) and because sensor kinase autophosphorylation generally occurs in trans within a homodimer (2), signal transduction is likely to occur within the context of a CitA dimer. However, the extent of CitAP dimerization does not appear to be strongly dependent on ligand binding (Fig. 4), so it is unlikely that signal transduction is mediated by a shift in the CitA monomer-dimer equilibrium, such as that seen in the human growth hormone receptor system (52).
Instead, CitA activation is more likely to be associated with rearrangements among the transmembrane helices that can, in turn, be transmitted to the cytoplasmic kinase domains. The citrate binding site includes the final strand (S5) of the β-sheet, which is connected to the transmembrane helix at the C terminus of the domain. As a result, ligand binding could directly affect the packing of the helices. Furthermore, alanine mutations of residues Arg-7, His-9, and Arg-15 have been shown to affect citrate-binding thermodynamics (6). These residues are located outside the binding pocket but near the N-terminal transmembrane helix. Finally, evidence suggests that sodium is required in addition to citrate for signal transduction to occur (3), and the putative sodium ion identified in our structure of CitAP is located between the N-terminal helix H1 and the β-sheet (Fig. 2A). All of these observations are consistent with the proposal that conformational rearrangements among the transmembrane helices and the central β-sheet are important for sensor kinase activation. Helical shifts have also been proposed as the mechanism for the bacterial chemoreceptors (53), although the nature of the conformational couplings is likely to differ because of differences in the structure and location of the ligand-binding site (54). As a result, a detailed understanding of PAS domain-mediated transmembrane signaling will ultimately require comparison of the ligand-bound structure presented here with a ligand-free state structure that remains to be determined.