Structural basis of a novel repressor, SghR, controlling Agrobacterium infection by cross-talking to plants

Agrobacterium tumefaciens infects various plants and causes crown gall diseases involving temporal expression of virulence factors. SghA is a newly identified virulence factor enzymatically releasing salicylic acid from its glucoside conjugate and controlling plant tumor development. Here, we report the structural basis of SghR, a LacI-type transcription factor highly conserved in the Rhizobiaceae family, which regulates the expression of SghA and is involved in tumorigenesis. We identified and characterized the binding site of SghR on the promoter region of sghA and then determined the crystal structures of apo-SghR, SghR complexed with its operator DNA, and SghR complexed with its ligand sucrose. These results provide detailed insights into how SghR recognizes its cognate DNA and shed mechanistic light on how sucrose attenuates the affinity of SghR for DNA to modulate the expression of SghA. Given the important role of SghR in mediating the signaling cross-talk during

Regulation of gene transcription is essential for bacteria to maintain their cellular function and respond to environmental changes. In bacteria, a large number of transcription factors have been identified and shown to form tremendously complicated regulation networks that control gene expression (1). Transcription factors are generally classified into repressors and activators, which block and promote gene transcription, respectively. Most transcription factors are composed of two domains: a regulatory domain (core domain) for sensing the external or internal signals and triggering the conformational change and a DNA-binding domain (DBD) for associating/dissociating with its cognate DNA in response to the conformational change originated from the regulatory domain. These two domains function allosterically to play a central role in gene regulation.
As one of the most thoroughly studied groups of transcription factors, LacI family regulators have been discovered in numerous bacteria, where they coordinate a wide range of cellular processes (2,3). Since the first lac repressor from Escherichia coli was identified in the 1960s (4), more than 1000 bacterial members of the LacI family have been characterized with diverse functions (2). The LacI family has therefore been of considerable interest in molecular biology, microbiology, and structural biology. Detailed studies of these LacI-type regulators not only have advanced our understanding of the genetic control of a variety of metabolic and signaling pathways but also have revealed considerable potential for practical applications, such as the redesign of existing proteins in synthetic biology (5).
As the causative agent of crown gall disease, A. tumefaciens infects a wide range of plants, which is of great concern to the agricultural industry. This bacterial pathogen invades plants by transferring its tumor-inducing plasmid into the plant genomes, which also renders it a useful tool in plant genetic engineering. Infection by A. tumefaciens is mainly regulated by the Agrobacterium VirA/VirG two-component regulatory system (6). Independent of VirA/VirG, we have recently found that a pair of novel proteins, SghR and SghA, regulate plant tumor growth through controlled release of the host defense signal salicylic acid (SA) by a previously uncharacterized mechanism. SghA is a glucosidase that controls the release of SA from its inactive storage form, SA 2-O-β-D-glucoside, in host plants; the released SA then activates the plant immune response and shuts down bacterial virulence once the infection is successfully established (7).
Sucrose is a major photosynthetic product representing >95% of the transported carbohydrate in plant phloem, and it can reach a concentration of up to 1 M in the conducting vascular cells and 2-7 mM in the extracellular space (8). In addition, sucrose is unloaded from the sieve elements of the phloem upon tissue wounding, leading to its accumulation at even higher concentrations (9,10). It has also been reported that sucrose plays an important role in wound signaling (11) and in endogenous signaling to induce defense responses against pathogens in rice (12). We found that the sucrose accumulated at wound sites during healing appears to serve as an environmental stimulus for the SghR-SghA pair, which mediates the signaling cross-talk at the late stage of Agrobacterium infection (7).
Here, we first investigated the functional regulation of SghA by SghR in vivo by performing a tumorigenesis assay and RT-PCR. The results demonstrated the involvement of SghR/SghA in tumorigenesis and showed that the transcription of SghA is modulated by SghR. Given that the expression of SghA is negatively regulated by SghR and that SghR is a LacI family transcription factor highly conserved in the Rhizobiaceae family, we further identified and characterized the cognate DNA sequence for SghR binding on the sghA promoter region. Subsequently, we determined the crystal structures of isolated SghR, SghR-DNA, and SghR-sucrose complexes. Structural analyses unraveled the detailed interactions of SghR with its cognate DNA and its effector, sucrose, and uncovered the allosteric mechanism by which sucrose relieves the repression of SghA upon binding to SghR. Taken together, SghR controls Agrobacterium infection through its regulation of SghA in response to the host signal sucrose, and our results are of considerable importance for establishing a detailed understanding of gene regulation during the process of host-pathogen interaction.
This article contains supporting information. ‡ These authors contributed equally to this work. * For correspondence: Yong-Gui Gao, ygao@ntu.edu.sg; Lian-Hui Zhang, lhzhang01@scau.edu.cn.

The regulation of SghA by SghR
SghA, an SA-releasing enzyme, has been implicated in immune response and tumorigenesis during Agrobacterium infection, which is likely relevant to SghR (7). To further confirm that SghA is regulated by SghR and that the SghR-SghA pair is involved in regulating plant tumor development, we examined tumor occurrence and growth on carrot disks infected by either WT or deletion mutant Agrobacterium strains (ΔsghA, ΔsghR, and ΔsghRA). First, we recorded the number of tumors incited on carrot disks and found no significant difference among the WT and these mutant strains, suggesting that SghA is not involved in the initiation of infection. However, the tumor weight, which has been used to measure the severity of crown galls (13,14), was ~300 mg for the ΔsghA mutant and ~120 mg for the ΔsghR mutant, in contrast to ~200 mg for the WT A6. Statistical analyses show a significant difference between WT A6 and either the ΔsghA or ΔsghR mutant (Fig. 1A). These results are consistent with our previous data obtained with the Tn5 transposon mutants (7), further demonstrating that the transcription factor SghR negatively regulates the expression of sghA (Fig. 1B) and that the SghR-SghA pair is involved in tumor growth after agrobacterial infection.

Characterization of the cognate DNA sequence of SghR on sghA promoter region
To identify the exact DNA-binding site of SghR on the sghA promoter, a DNase I protection assay was carried out. The sequencing result using a FAM-labeled forward primer indicated that SghR protects a DNA region covering the sequence 5′-GCTGAAACGTTGCAGATTTTGCGT-3′ from digestion (Fig. 1C and D, and Fig. S1A), thereby suggesting that this region is the DNA-binding site for SghR. This binding site was also confirmed by sequencing using a FAM-labeled reverse primer in the DNase I protection assay (Fig. S1B). Sequence analyses showed that this binding region comprises a pseudopalindromic sequence (Fig. 1D). Furthermore, EMSA results showed that the binding ratio of SghR dimer to DNA appears to be 1:1: when SghR dimer and DNA were mixed in equimolar amounts, they bound each other completely and shifted together with no free DNA or free protein observed, whereas free DNA or free protein bands appeared when either component was added in excess (Fig. 1E). An isothermal titration calorimetry (ITC) assay further demonstrated that the stoichiometry of the protein-DNA complex was one SghR dimer to one double-stranded DNA (dsDNA), with a dissociation constant (Kd) of 60 nM (Fig. 2B), whereas no interaction was observed between SghR and the Ppa0305 promoter DNA from Pseudomonas aeruginosa (Fig. 2A). When SghR protein was incubated with sucrose before the titration experiment, the Kd value increased to 1.7 μM (Fig. 2D), implying that sucrose attenuates the repression effect of SghR on the sghA promoter.
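The pseudopalindromic character of the protected region can be checked by comparing a sequence with its own reverse complement. The following is a minimal sketch, not the analysis used in this work: the helper names (`reverse_complement`, `palindrome_matches`, `best_palindromic_window`) and the window widths scanned are illustrative assumptions, and only the 24-nt sequence is taken from the footprinting result above.

```python
# Watson-Crick complements for unambiguous DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Protected region reported from the DNase I footprint (5'->3').
SITE = "GCTGAAACGTTGCAGATTTTGCGT"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def palindrome_matches(seq: str) -> int:
    """Count positions where seq equals its reverse complement.

    A perfect palindrome scores len(seq); a pseudopalindrome scores less.
    """
    rc = reverse_complement(seq)
    return sum(a == b for a, b in zip(seq.upper(), rc))


def best_palindromic_window(seq: str, width: int):
    """Return (start, window, score) for the most palindromic window.

    Ties are resolved in favor of the earliest start position.
    """
    if not 0 < width <= len(seq):
        raise ValueError("width must be between 1 and len(seq)")
    scored = [
        (palindrome_matches(seq[i:i + width]), i)
        for i in range(len(seq) - width + 1)
    ]
    score, start = max(scored, key=lambda pair: (pair[0], -pair[1]))
    return start, seq[start:start + width], score


if __name__ == "__main__":
    # Hypothetical scan widths; even widths suit a dyad-symmetric operator.
    for width in (14, 16, 18):
        print(width, best_palindromic_window(SITE, width))
```

A window whose score is close to, but below, its width corresponds to a partial palindrome, in which a few base pairs break the dyad symmetry.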

Overall structure of apo SghR
The crystal structure of isolated full-length SghR was refined at 2.1-Å resolution to a working R value of 20.6% and a free R value of 25.9% (Table 1). There are two copies of SghR in one asymmetric unit, with a root mean square deviation (r.m.s.d.) of 0.4 Å for 275 aligned Cα atoms, suggesting that the two SghR molecules are almost identical. The final model (chain A, as an example) comprises 275 out of 350 residues, corresponding to the C-terminal core domain, whereas the N-terminal region (residues 1-74), which presumably functions as a DNA-binding domain, was not observed in the resultant electron density, implying that this region is disordered. The stereochemical quality of the final model was validated with PROCHECK (15), and the Ramachandran plot indicated that 96.3% of the residues lie in the most favored regions, with the remaining 3.7% in the allowed region (Table 1).
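For reference, the r.m.s.d. quoted for aligned Cα atoms is the square root of the mean squared distance between equivalent atoms. The sketch below is illustrative only: it assumes the two coordinate sets have already been superposed and paired atom by atom, it performs no superposition itself, and the function name `rmsd` and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two paired coordinate lists.

    Each argument is a sequence of (x, y, z) tuples in angstroms.
    The structures must already be superposed; this function only
    measures the residual deviation between equivalent atoms.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("at least one atom pair is required")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Toy example: every atom displaced by 0.4 angstrom along x.
    reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    shifted = [(x + 0.4, y, z) for x, y, z in reference]
    print(round(rmsd(reference, shifted), 3))
```

In practice the superposition step (for example, a least-squares fit over the aligned Cα atoms) is carried out first by the structure-analysis program, and the value reported is the deviation that remains after that fit.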
The C-terminal domain, the core of the SghR protein, comprises two subdomains: the N- and C-terminal subdomains. The N-terminal subdomain is composed of a core of six parallel β strands (β1-β5 and β11) flanked by two α helices (α6 and α7) on one side and three (α5, α12, and α13) on the other side (Fig. 3A). Meanwhile, the C-terminal subdomain consists of a core of five parallel β strands (β6-β10), which is also flanked by two α helices on each side, namely, α8, α9, α10, and α11 (Fig. 3A). The two subdomains are connected by three noncontiguous peptide crossovers, which not only form a cleft to allow ligand binding but also create a linker region serving as a hinge to permit the movements of the subdomains relative to each other upon ligand association and dissociation.

Dimerization of SghR
Analysis of the apo SghR structure revealed that two SghR molecules in an asymmetric unit form a dimer (Fig. 3B), which was further supported by the molecular weight calibration result indicating that SghR exists as a homodimer in solution (16). Upon dimerization, the buried interface is 2870 Å², accounting for ~11.5% of the total solvent-accessible area. The dimerization interface involves helix α5 and strand β2 in the N-terminal subdomains of both monomers as well as helix α11 and the loop connecting β9 and β10 in the C-terminal subdomains of both monomers, where they pack against each other. The dimer is formed mainly through hydrogen-bonding interactions between residues Ser95-Glu88′, Val114-Leu112′, and their dyad symmetry mates (where prime refers to the second monomer within a dimer, same as below). In addition, hydrophobic interactions involving residues Met91, Thr102, Pro116, Ile270, Phe297, Trp300, and Ile301 also contribute to the dimer assembly (Fig. 3, C and D).

Overall structure of SghR-dsDNA
To gain structural insights into how SghR recognizes the sghA operator DNA, an intense effort was put into improving the diffraction quality of SghR-dsDNA complex crystals. Different dsDNAs encompassing the core region (partial palindrome, as indicated in Fig. 1C) with varying lengths and sequences, as well as different SghR truncations containing both the N-terminal DBD and the C-terminal core, were tested for crystallization. Ultimately, well diffracting crystals of the SghR-dsDNA complex were obtained by cocrystallization of the truncated SghR (without the first N-terminal 17 residues) and a 16-bp dsDNA with a two-nucleotide overhang at the 5′ end (Fig. 4, B and D). Notably, two base pairs on the left half-site of the natural pseudo-palindrome dsDNA were mutated to make a perfect palindrome sequence (Fig. 4, B and D). Indeed, this strategy was frequently used to obtain the crystal structures of lac repressor-DNA complexes, and comparison of the lac repressor-natural operator structure with the lac repressor-symmetric operator structure suggested that it does not change the overall conformation (17,18).
The crystal structure of SghR-dsDNA was solved by the molecular replacement method using, as a search model, the C-terminal domain from our apo SghR structure combined with the cognate DNA-bound N-terminal DBD from the lac repressor-DNA complex structure (PDB entry 1EFA). The structure was refined to 3.2-Å resolution with crystallographic statistics shown in Table 1. The final model contains 6 dimers and 6 dsDNA molecules per asymmetric unit. Each dimer binds one dsDNA, consistent with our ITC and EMSA results that the stoichiometry of SghR dimer to dsDNA is 1 to 1 (Fig. 1E and 2B). Comparison of the six SghR dimers revealed that their overall structures are quite similar, with an average r.m.s.d. of 0.75 Å for the aligned Cα atoms. Therefore, we used the SghR-dsDNA complex ABab (chains A and B for protein, chains a and b for DNA) in the following analysis (Fig. 4A). Structural analysis of the SghR-dsDNA complex revealed that SghR contains an additional headpiece, referred to as the N-terminal DBD, besides the large C-terminal core, which adopts a conformation similar to that observed in the isolated SghR structure (Fig. 4A). The N-terminal DBD (residues 18-73) consists of four helices and can be divided into two functionally important parts. The first part contains a helix-turn-helix (HTH) motif (α1-turn-α2, residues 22-40), followed by a short loop (residues 41-47), the helix α3 (residues 48-61), and its C-terminal loop (residues 62-66) (Fig. 4A). Connecting the globular subdomain involving the first three helices to the C-terminal core are the fourth helix α4 (residues 67-73) and the following loop (residues 73-78) (Fig. 4A). The α4 helix, usually named the hinge helix, constitutes the second functionally important part of the N-terminal DBD. The first part within the N-terminal domain is responsible for the dsDNA major groove interaction, whereas the hinge helix binds with its 2-fold related mate to the minor groove of the dsDNA (Fig. 4, A and C). 
It was proposed that the formation of the hinge helix is driven by both the C-terminal core dimerization and the DNA minor groove recognition (19). Note that in our structure, stabilization of such a helix conformation was mediated by hydrophobic interaction involving the residues Gly69-Leu72′ and van der Waals forces between residues Ala68 and Arg71′ and their dyad symmetry mates (Fig. S2). In addition to the C-terminal core domain and the hinge helix, hydrogen bonds formed between Arg19 in the N-terminal DBD and Thr133′ in the N-subdomain also contribute to the dimerization of SghR in our complex (see Fig. 6C). Interestingly, the dsDNA in this structure is kinked ~45° at the central C10-G11′ base pair toward the major groove and pulled away from the protein, which is largely because of the insertion of the two hinge helices α4 and α4′ into the minor groove (Fig. 4A).

SghR-DNA interaction
In our SghR-dsDNA structure, each SghR monomer binds to a semisymmetrical half of the dsDNA via extensive hydrogen-bonding interactions (Fig. 4C). In particular, the HTH (α1-turn-α2) motifs within the N-terminal DBD of the two monomers make extensive and symmetrically related contacts with the bases and phosphate backbone in the major grooves located at both half-sites of the dsDNA. Taking one monomer (chain A) as an example, Leu22 in α1, Thr35 and Ser37 in α2, Thr50 in α3, as well as Gly32, Lys41, and Gly47, proximal to the HTH motif, interact with the phosphate backbone of DNA via hydrogen bonds (Fig. 4, C and D). Notably, the canonical recognition helix α2 is well positioned, such that its N terminus projects into the major groove of dsDNA for specific interaction, namely, a hydrogen bond between Thr34 and the base of G-6 in DNA chain b. Additionally, the residue Asp45, located in the loop connecting α2 and α3, forms a base-specific hydrogen-bonding interaction with the base of C-4 in DNA chain b (Fig. 4, C and D). As aforementioned, the hinge helices α4 and α4′ insert into the minor groove of the dsDNA, involving both specific and nonspecific interactions. The insertion of these two helices into the narrow minor groove is accomplished by the side chains of residues Leu72 and Leu72′, which intercalate into the central GC pairs (C10-G11′ and G11-C10′) and act as a lever to pry open the minor groove. As a result, the main chain O of both Gly69 and Leu72 forms hydrogen bonds with the base of G-11 in DNA chain a and the base of C-10 in DNA chain b, respectively (Fig. 4, C and D). Remarkably, a single residue in α4, Arg73, makes a complicated hydrogen-bonding network involving several nucleotides, namely, the base of T-13 and the deoxyribose of G-14 in DNA chain a, as well as the bases of A-9 and C-10 in DNA chain b (Fig. 4, C and D). In proximity to the hinge helix α4, the residues Tyr63 and Asn66 contact the phosphate backbone of DNA chain a (Fig. 4, C and D). Therefore, the residues Thr34, Asp45, Gly69, Leu72, and Arg73 from each monomer contribute to sequence-specific recognition of each half-site of the cognate DNA, which explains the requirement for a palindromic DNA sequence. As expected, neither mutated nucleotide pair (T3-A18′ and C7-G14′) is involved in specific protein-DNA interaction, implying that our SghR-dsDNA structure represents an authentic operator DNA-bound complex. Sequence alignment analysis revealed that residues such as Thr35, Ser37, Lys41, Thr50, and Tyr63, which are involved in phosphate backbone interactions through their side chains, are conserved among the LacI family (Fig. S3). Consistently, their counterparts in lac repressor and PurR protein sequences also have been shown to participate in DNA binding (20-23). However, residues that contact DNA bases through their side chains, such as Thr34, Asp45, and Arg73, vary among these sequences (Fig. S3). Taken together, our results suggest that the interactions between SghR and DNA are sequence specific, despite SghR sharing a similar DNA-binding mode with other LacI family proteins.

Figure 1. SghR-SghA involved in plant tumorigenesis and SghR regulating SghA via binding to its promoter region. A, tumorigenicity assay for A. tumefaciens A6 and its derivatives using carrot discs. Representative images (bottom) were taken and fresh weights of tumors (top) were recorded at 6 weeks after inoculation. ΔsghR, ΔsghA, and ΔsghRA indicate the in-frame deletions of sghR, sghA, and both genes, respectively. The error bars represent SD from 6 repeats. A nonparametric 1-way analysis of variance was performed in GraphPad for statistical analyses. **, p < 0.001; ***, p < 0.0001. B, SghR represses the transcription of sghA, as indicated by RT-PCR. The transcription level of 16S rRNA in the two strains is also presented. C, identification of the SghR-binding site in the sghA promoter region using Dye Primer sequencing on an automated capillary DNA analyzer. The double-ended arrow shows the region in the sghA promoter protected by SghR during DNase I digestion. The sghA promoter region and DNase I were incubated either in the presence (+SghR) or absence (−SghR) of SghR protein. Electropherograms of ddNTP (ddATP, ddTTP, ddCTP, and ddGTP) panels displayed in the figure are the sequencing results using the forward primer (6-FAM labeling). D, depiction of the SghR-binding site in the promoter region of sghA. The binding site is shown in red letters. The two opposite red arrows indicate the partial palindrome feature of the binding site. The −35 and −10 regions of sghA were predicted by the online program BPROM, and the sequences are labeled with blue lines. The star indicates the transcription start site, predicted by an online server (http://www.fruitfly.org/seq_tools/promoter.html). The translation start site of SghA is also indicated by a blue arrow. E, in vitro gel shift assay of SghR with the identified binding site. Various ratios of protein to DNA were used. Left, native-PAGE gel stained by DNA-staining dye. Right, the same gel stained by Coomassie brilliant blue (CBB). D and Pro represent DNA and protein, respectively. Pro:D = 1:1, 1:2, and 2:1 indicate that the molar ratios of protein dimer to dsDNA are 1 to 1, 1 to 2, and 2 to 1, respectively. This is native PAGE; no protein marker was included in the gel, as the charge of the protein also contributes to its migration rate in addition to its molecular weight.

Structure of sucrose-bound SghR
Given our functional characterization of sucrose as a ligand for SghR (7), we carried out microscale thermophoresis experiments and obtained the binding affinity of sucrose to SghR protein (Kd = 42.7 ± 3.3 mM) (Fig. 2C). Next, we cocrystallized SghR with sucrose and determined the complex structure by the molecular replacement method using apo SghR as a search model; the crystallographic statistics are listed in Table 1. As expected, sucrose, clearly indicated by the electron density map, binds to the cleft created by the two subdomains of the C-terminal core (Fig. 5, A and B) in a position similar to that observed for other effectors, such as IPTG (Fig. 5D) and guanine (24,25). The orientation of sucrose, surrounded by polar and aromatic residues, was nearly parallel to the strands β5 and β11 (Fig. 5A). The fructose ring is flanked by the phenyl group of Phe210 in the C-terminal subdomain and the imidazole group of His164 in the N-terminal subdomain, thereby establishing a sandwich arrangement to stabilize the orientation of sucrose (Fig. 5C). The sucrose is recognized by several specific hydrogen bonds formed between the hydroxyl groups of its sugar rings and residues in both the N- and the C-terminal subdomains, including Glu89, Ser142, Lys143, Ile144, His164, Asp177, Asn180, Lys292, and Glu310 (Fig. 5C). The binding sites for the glucosyl and fructosyl moieties of sucrose correspond to the −1 and +1 subsites, respectively. Interestingly, the side chains of Asn180 and Ser142, as well as the main-chain N of Lys143, form hydrogen bonds with the two hydroxymethyl groups of the fructosyl moiety, appearing to be crucial for defining the +1 subsite. Binding at the +1 subsite was further stabilized by the hydrogen bonds between its hydroxyl groups and the main-chain N of Lys143 and Ile144, the main-chain O of His164, and the side chain of Asp177. 
Simultaneously, water-mediated interactions with the sugar hydroxymethyl group, as well as the side chains of Glu89 and Lys292, contacting hydroxyl groups in the glucosyl moiety, contribute to the recognition of the −1 subsite. At the kinked link of sucrose (which is located at the half gateway of the pocket), residue Glu310 established bilateral hydrogen-bonding interactions with both moieties, thereby determining the two subsites. Remarkably, complicated hydrogen-bonding networks mediated by several water molecules, both sugar moieties and residues Thr94, Ser142, Arg166, Ser294, and Phe210, cross-link the C-terminal subdomains and define their relative orientation to each other (Fig. 5C). For comparison, in the binding pocket of the LacI repressor with its gratuitous inducer IPTG, depicted in Fig. 5D (25), IPTG is likewise held between the N-terminal and C-terminal subdomains mainly by hydrogen bonds formed between the sugar ring hydroxyl groups and LacI. Namely, the residues Ser69, Asp149, Arg197, Asn246, and Asp274 form direct hydrogen bonds with the IPTG sugar ring hydroxyl groups. Similar to sucrose binding in our structure, water molecule-mediated hydrogen bonds involving residues Asn125 and Ser191 in LacI were observed. Additionally, Trp220 shows a pi-pi interaction, and residues Ile79, Ala75, and Pro76 form hydrophobic interactions or van der Waals contacts with the IPTG sugar ring. In general, the inducers IPTG and sucrose share similar types of interaction with their repressor proteins, and both binding pockets are located between the N-terminal and C-terminal subdomains.

Insight into allosteric mechanism by structural comparison of SghR in apo, sucrose-, and DNA-bound forms
The structural basis of how an effector molecule alters the affinity of the transcription factor for its cognate DNA is of significant importance for a detailed understanding of the molecular mechanism of gene regulation. Here, the structures of SghR in three states (apo, sucrose-bound, and DNA-bound) allow us to explore the structural basis of how sucrose modulates the affinity of SghR for its cognate DNA.
Consistent with the notion that the hinge helix of LacI family proteins is unfolded and the N-terminal DBD moves freely with respect to the C-terminal core in the absence of their cognate DNAs (26-28), the N-terminal DBDs in both apo and sucrose-bound SghR are disordered. Nevertheless, comparison of their C-terminal domains reveals virtually identical structures, with an r.m.s.d. value of 0.15 Å for the aligned Cα atoms. In addition, the conformations of residues Glu89, Ser142, Lys143, Ile144, His164, Asp177, Asn180, Lys292, and Glu310, which are involved in hydrogen-bonding networks for sucrose recognition in SghR-sucrose, are almost identical to those in apo SghR (Fig. S4). This is in line with LacI from E. coli, for which the inducer (IPTG)-bound structure (PDB entry 2P9H) is extremely similar to apo LacI (PDB entry 1LBI). The structure comparison of LacI-IPTG with apo LacI (PDB entry 1LBI) gave an r.m.s.d. value of 0.5 Å (Fig. S4B). In addition, there is little conformational change for the residues that are involved in IPTG binding. A detailed analysis revealed that many water molecules in the apo SghR structure were observed in the sucrose-binding pocket (Fig. S4A). Notably, some of these water molecules occupy the positions of hydroxyl groups of the sugar rings in the SghR-sucrose structure, thereby forming extensive hydrogen-bonding networks with the residues of SghR (Fig. S4A). These interactions are similar to those observed between sucrose and SghR in the SghR-sucrose complex, which perhaps explains the similarity of the sucrose-binding pocket in these two structures. Moreover, weak binding of sucrose to SghR was demonstrated by the microscale thermophoresis assay (Fig. 2C). Consistent with this weak binding, a relatively high sucrose concentration (1000-fold molar excess over SghR) is required to reduce SghR-DNA binding (Fig. 2D). 
Nevertheless, we postulate that the side chains of the residues involved in sucrose binding undergo certain conformational changes while sucrose diffuses into the binding pocket. Such a conformational change of these local residues induced upon sucrose binding would be so transient that it cannot be observed in the present structure.
Similar to LacI and PurR (18,29) (Fig. S7), the C-terminal domain of SghR is organized into two structurally similar subdomains: an N-subdomain containing residues 79-178 and 312-339 and a C-subdomain covering residues 180-309 and 340-345. Superposition of the individual N-subdomains was based on the alignment matrix of individual C-subdomains from both SghR-sucrose and SghR-DNA complexes using the program ALIGN (30). Comparison of both N- and C-subdomains in sucrose-bound dimeric SghR with those in DNA-bound dimeric SghR showed r.m.s.d. values of 0.79 Å and 0.66 Å, respectively, for Cα atoms. In contrast, comparison of the N-subdomains of these two structures based on the alignment of the C-subdomains gave an r.m.s.d. of 1.31 Å for Cα atoms, suggesting that the relative orientation of the N- and C-subdomains in SghR is changed upon sucrose binding. Furthermore, detailed plotting of the mean Cα deviation for each residue within the N-subdomain revealed that those residues with relatively larger variation map to two regions: residues 145-174 and 321-337 (Fig. 6, A and B). Note that these regions are closely linked to the residues involved in sucrose binding, such as residues Ser142, Lys143, Ile144, Asp177, Asn180, and Glu310 (Fig. 5C), suggesting that this discrepancy between SghR-dsDNA and SghR-sucrose was introduced by sucrose association. In addition, the residue Thr133′ in the N-subdomain, which forms a hydrogen bond with Arg19 in the N-terminal DBD, revealed a relatively large conformational change upon sucrose binding (Fig. 6C). Strikingly, the side chain of residue His111 in the N-subdomain in the SghR-dsDNA complex, which was observed to establish a π-cation interaction with Arg67 in the DBD and a hydrogen bond with Tyr129′ in the N-subdomain, also showed a relatively large conformational change compared with that of the SghR-sucrose complex (Fig. 6D). 
In particular, the two residues His111 and Tyr129′ in the SghR-sucrose complex are too far apart to form any interaction. Notably, as the abovementioned residues contribute to defining the relative orientations of the DBD with respect to the N-subdomain as well as the two N-subdomains within the dimeric SghR, their conformational changes are likely to reposition the N-terminal DBD with respect to the dsDNA, thereby modulating the affinity of SghR for DNA. Therefore, structural comparison of SghR-dsDNA with SghR-sucrose suggests an allosteric mechanism: sucrose binding induces transient local conformational changes in the sucrose-binding pocket, which are then transmitted via the regions in proximity to this pocket in the N-subdomains to the N-terminal DBDs of SghR, ultimately resulting in the reorientation of the DBD and leading to a suboptimal positioning for DNA binding. In line with this notion, ITC experiments showed that the Kd of sucrose-bound SghR for its operator DNA was 1.7 μM, corresponding to an affinity 28-fold lower than that of SghR in the absence of sucrose (Fig. 2, B and D).
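The size of the sucrose effect on DNA binding can be illustrated with a short calculation. The sketch below is not part of the reported analysis: it takes the two dissociation constants as 60 nM and 1.7 μM (the pair consistent with the stated 28-fold difference), assumes a simple 1:1 binding model and a temperature of 298.15 K that the text does not specify, and uses hypothetical function names.

```python
import math

GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1

# Dissociation constants in mol/L, taken from the ITC results in the text.
KD_APO = 60e-9        # SghR-DNA without sucrose
KD_SUCROSE = 1.7e-6   # SghR-DNA after preincubation with sucrose


def fold_change(kd_reference: float, kd_perturbed: float) -> float:
    """Ratio of dissociation constants (>1 means weaker binding)."""
    return kd_perturbed / kd_reference


def delta_delta_g(kd_reference: float, kd_perturbed: float,
                  temperature_k: float = 298.15) -> float:
    """Change in binding free energy in J/mol, ddG = RT ln(Kd2/Kd1).

    The default temperature is an assumption, not a value from the text.
    """
    return GAS_CONSTANT * temperature_k * math.log(kd_perturbed / kd_reference)


def fraction_dna_bound(free_protein: float, kd: float) -> float:
    """Fraction of DNA occupied for 1:1 binding at a free dimer concentration."""
    return free_protein / (free_protein + kd)


if __name__ == "__main__":
    print("fold change:", round(fold_change(KD_APO, KD_SUCROSE), 1))
    print("ddG (kJ/mol):", round(delta_delta_g(KD_APO, KD_SUCROSE) / 1000, 2))
    # Hypothetical free SghR dimer concentration of 100 nM.
    for label, kd in (("apo", KD_APO), ("sucrose", KD_SUCROSE)):
        print(label, round(fraction_dna_bound(100e-9, kd), 3))
```

Under these assumptions, a roughly 28-fold increase in Kd corresponds to a loss of binding free energy on the order of RT ln 28, and at a fixed free repressor concentration the operator occupancy drops accordingly, which is the quantitative sense in which sucrose relieves repression.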

Discussion
Transcriptional regulation of specific genes through signaling molecules plays a central role in bacterial adaptation to environmental change. Given that the ubiquitous soil agrobacteria infect a variety of plant species, it is of great interest to investigate their gene regulation during the process of infection. We have identified a novel virulence factor, SghA, from A. tumefaciens that temporally regulates plant tumor development by controlled release of a host-conjugated defense signal, in response to sucrose accumulated at plant wound sites at the late stage of infection (7). In this paper, we report that the expression of SghA is negatively controlled by a putative LacI family transcription factor, SghR, which is thereby involved in tumor growth after infection. How SghR specifically recognizes the promoter of sghA to repress gene transcription and how SghR specifically senses the signaling molecule sucrose to respond are central to elucidating the molecular mechanism of this regulation process. To address these issues, our structural findings unraveled the detailed molecular mechanism of SghR controlling Agrobacterium infection by cross-talking to plants.
Our in vivo tumorigenesis assay together with RT-PCR analyses revealed that both SghA and SghR are involved in plant tumor growth, with the mRNA level of sghA negatively controlled by SghR (Fig. 1, A and B). Given that SghA is involved in the enzymatic release of the plant defense signal SA from its conjugated form, SA 2-O-β-D-glucoside, after successful infection, this pair of novel virulence factors, SghA/SghR, plays a critical role in tumor development in Agrobacterium-infected plants. The DNA-binding site of SghR on the sghA promoter region covers −24 nt to −1 nt upstream of the sghA transcription start site (defined as −1), based on the DNase I footprinting assay (Fig. 1C). Note that the binding site presents the common feature of an inverted repeat sequence (partial palindromic consensus sequences), as observed for other LacI family members (2,3).
Comparison of our apo SghR structure with the apo lac repressor Atu1522 (PDB entry 3GV0) from A. tumefaciens C58 gave an r.m.s.d. value of 0.62 Å for Cα atoms, demonstrating high structural similarity (Fig. S6A). However, for Atu1522, no other information is available. In addition, structural comparison of apo SghR with LacI from E. coli (PDB entry 1LBI, excluding the C-terminal residues 325-357, involved in tetramerization) and PurR from E. coli (purine repressor, PDB entry 1DBQ) gave r.m.s.d. values of 2.39 and 3.29 Å, respectively, for Cα atoms (Fig. S6, B and C). The relatively larger conformational difference might be explained by their roles in sensing different signal molecules during transcription regulation. Nevertheless, all four structures show a similar overall topology for their C-terminal core domains (18,19).
C, Arg19 (chain A) and Thr133′ (chain B) from the SghR-DNA complex form a hydrogen bond (shown as a dashed line), whereas in the SghR-sucrose complex, Thr133′ (colored gray) shows a different conformation. D, His111 (chain A) and Tyr129′ (chain B) form a hydrogen bond (indicated by a dashed line), whereas this hydrogen bond is abolished in the SghR-sucrose complex because of the conformational change of both His111 and Tyr129′ (colored gray). In addition, in the SghR-DNA complex, Arg67 and His111 establish a π-cation interaction (shown as a double-ended arrow), whereas because of the conformational change of His111, this interaction is also likely to be abolished in the SghR-sucrose structure. In panels C and D, the electron densities (2Fo-Fc maps contoured at 1σ) are shown.
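The r.m.s.d. values quoted for the structural comparisons can be made concrete with a minimal sketch. It assumes the two structures have already been superposed and that their Cα atoms are paired one-to-one; the function name and the toy coordinates are hypothetical and are not taken from any of the PDB entries discussed here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points.

    Assumption: the structures are already superposed and atom i in
    coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates in angstroms (hypothetical): every atom is shifted by 1 along z.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
print(rmsd(reference, shifted))
```

In practice the superposition and atom pairing are handled by the structure-comparison software, so this sketch covers only the final averaging step.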
Similar to our SghR structures in the apo (Fig. 3A) and sucrose-bound forms (Fig. 5A), the N-terminal DBDs in the apo Atu1522, LacI, and PurR structures are all disordered (Fig. S6), which reinforces the high flexibility of the N-terminal DBD of the LacI family in the absence of cognate DNA. As expected, DNA binding stabilizes this flexible domain, leading to an ordered model in our SghR-DNA complex structure (Fig. 4A). The SghR-DNA complex structure shows in detail how SghR specifically recognizes the operator DNA in a sequence-dependent manner and rationalizes the requirement of the partial palindrome feature for binding its cognate dsDNA (Fig. 4, C and D). Moreover, similarities in protein topology, DNA deformability, and protein-DNA binding mode can also be observed for other LacI family members in complex with their cognate DNAs, such as LacI-DNA (PDB entry 1EFA) and PurR-DNA-corepressor complexes (PDB entries 1PNR and 1WET) (17,18,29), suggesting that a universal dsDNA-binding mode applies to all LacI family members.
LacI/GalR family proteins play a vital role in regulating metabolic pathways through sensing and responding to a large variety of internal and external signal molecules, such as lactose, fructose, galactose, and many others, to adapt to constant environmental changes (2,31). Detailed studies of how LacI proteins respond to these effectors are of considerable importance to elucidate their subtle specificity as well as to design anti-inducer/inducer analogues, which in turn may provide a rational strategy for metabolic control. Given that sucrose could destabilize DNA binding and induce the expression of SghA (7), it appears to be the effector of SghR. Consistent with this, our binding assays revealed that sucrose binds to SghR (Kd, ~42 mM) and lowers the binding affinity of SghR for its operator DNA (Fig. 2, C and D). Indeed, the structure of the SghR-sucrose complex demonstrated that sucrose specifically binds to a pocket created by the two C-terminal subdomains (Fig. 5A), similar to that observed for the structural homologs LacI and PurR (24,25,28). Sucrose recognition by SghR involves several residues located in the pocket as well as a water-mediated network of interactions, which cross-link the two subdomains (Fig. 5C). The water-mediated network of interactions that bridges the N-subdomain and C-subdomain has been reported to be essential for the allosteric transition, as proposed by Daber and coworkers (25). The conserved and divergent features of the binding pocket likely contribute to a common regulation switch mechanism (association/dissociation with DNA) but with a different specificity for effector recognition (Fig. 5, C and D).
Structural comparison of sucrose- and DNA-bound dimeric SghR showed a relatively large discrepancy for the N-terminal subdomain (Fig. 6A), in contrast to a good superposition of the individual N-subdomains and C-subdomains (Fig. S5). Interestingly, the residues with larger conformational changes are located in close proximity to the sucrose-binding site and are involved in defining the relative orientation of the N-subdomains within a dimer and the positioning of the DBD with respect to the N-subdomain within a monomer (Fig. 6). Such a conformational change upon sucrose binding appears to result in an orientation of the two N-terminal DBDs that no longer favors DNA binding, resulting in dissociation of the SghR dimer from its DNA and relieving the repression of sghA transcription. Therefore, we propose a possible allosteric mechanism in which signal molecule binding triggers several transient local conformational changes, which are then propagated via the regions in proximity to the sucrose-binding pocket on the N-subdomains to the N-terminal DBDs of SghR, ultimately inducing the reorientation of the DBD and leading to a suboptimal positioning for DNA binding. Interestingly, a similar allosteric mechanism was also proposed for the Lac repressor, in which IPTG binding to a pocket at the junction of two subdomains results in N-subdomain reorientation (20,25). This reorientation induces an alteration of the intersubunit interactions, which are the interactions at the dimer interface as well as interactions between core and DBD, and ultimately destabilizes LacI-DNA binding (20,25). However, we only modeled the core domain of SghR in both the apo SghR and SghR-sucrose structures, whereas the DBDs were found to be disordered in both structures.
A better understanding of how exactly the repositioning of the N-terminal DBD modulates the affinity of SghR for its cognate DNA requires a well-modeled DBD within the SghR-sucrose-bound structure and a higher resolution SghR-DNA structure to perform a comprehensive analysis of intersubunit interactions and of the differences between the sucrose- and DNA-bound structures. In particular, as the linker region (α4 and the following loop in Fig. 4A) is essential for minor groove binding and for propagating the conformational change from the regulatory domain to the N-terminal DBD upon signal molecule binding, a model of the linker region in the aforementioned structures is central to this allosteric mechanism. Note that structural information on the linker region in the absence of DNA for LacI family proteins is only available at low resolution from small-angle X-ray scattering (2,32). Other complementary methods need to be employed for a better understanding of this mechanism.
Nevertheless, our findings further demonstrate that the signal molecule sucrose specifically binds to SghR. LacI regulates the lac operon involved in the metabolism of lactose, and a rapid and sensitive cellular response to lactose needs a high affinity of LacI for its inducer allolactose, which is an isomer of lactose (33,34). Indeed, LacI has a strong binding affinity for allolactose, with a binding constant of 0.1 mM (similar to LacI-IPTG) (34,35). In contrast, SghR has a relatively weak affinity for its signal molecule sucrose, with a Kd of ~42 mM (Fig. 2C). The weak affinity of SghR for sucrose may have evolved to suit its physiological role during Agrobacterium infection. At the initial stage of infection, SghR physically binds to the promoter region of sghA and tightly represses its transcription to avoid the release of SA, which would otherwise trigger host plant defense. Once Agrobacterium infection is established, sucrose accumulates at high concentration at the wounding sites during healing and serves as an environmental stimulus to alleviate the repression of SghA by SghR, thereby facilitating the expression of SghA and leading to an increased defense signal in plants. As a pathogen, Agrobacterium causes crown gall disease in a wide variety of crops. Given the role of SghR in regulating the virulence factor SghA in tumorigenesis in response to the signal effector sucrose originating from the host plant, our results offer a basis for structure-based inducer analog design that has potential applications for the agricultural industry.
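The link between a weak affinity and a response that switches on only at high sucrose concentration can be illustrated with a minimal sketch of fractional occupancy. It assumes a simple 1:1 binding model with the ligand in large excess over the protein, which is an illustrative simplification and not a model fitted in this study; the function name is hypothetical, and concentrations are expressed as multiples of Kd so that no particular unit is implied.

```python
def fraction_bound(ligand, kd):
    """Fractional occupancy for simple 1:1 binding.

    Assumptions: ligand is in large excess (free ligand ~ total ligand),
    and `ligand` and `kd` are given in the same concentration unit.
    """
    if ligand < 0 or kd <= 0:
        raise ValueError("ligand must be >= 0 and kd must be > 0")
    return ligand / (kd + ligand)


# Occupancy at ligand concentrations expressed as multiples of Kd.
kd = 1.0
for multiple in (0.1, 1.0, 10.0):
    occupancy = fraction_bound(multiple * kd, kd)
    print(f"[ligand] = {multiple:>4} x Kd -> occupancy {occupancy:.2f}")
```

Under this simplification the repressor is mostly ligand-free until the ligand concentration approaches Kd, which is the qualitative behavior the physiological argument relies on.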

Tumorigenesis assay
Deletion mutants of A. tumefaciens A6 carrying in-frame deletions of sghR and sghA were generated according to the protocol described previously (36). The virulence assays were performed on carrot disks. Briefly, fresh carrot roots obtained from the supermarket were surface sterilized by scrubbing under running water. After immersion in 2% Clorox for 5 min, the carrot roots were cut into ~1-cm-thick slices and immersed in 2% Clorox for 30 min. After being rinsed three times in autoclaved distilled water for 15 min, the disks were placed apical side up on agar plates. A. tumefaciens strains freshly grown in BM medium were resuspended in sterile DPBS buffer to an optical density at 600 nm of 1.0, and 10 μl of the agrobacterial suspension was applied to the top surface of each carrot disk. The plates were then sealed with parafilm, and the Petri plates with the inoculated disks were incubated in a growth chamber for tumor development. After a 4-week incubation, results were recorded by photography, and the fresh weight of the tumors was measured.

RT-PCR
Total RNAs were extracted from bacterial cells cultivated in BM minimal medium and VIB medium using an RNeasy® mini kit (Qiagen). To minimize data variation, total RNAs were prepared from three independent repeats and pooled together for RT-PCR analysis. RT-PCR was conducted by following the one-step strategy (Qiagen). An aliquot of 0.2 μg of total RNA was used as the template for RT-PCR to amplify a portion of the target genes. For sghA, the PCR primers were 5′-GGCGACCGGCTGGATG-3′ and 5′-CGGGGCTTGTTTGGTGG-3′. In addition, a fragment of 16S rRNA was amplified in each RT-PCR reaction as an internal control using the primers 5′-TGACGAGTGGCGGACGGGTG-3′ and 5′-ATGCAGTTCCCAGGTTGAGC-3′.

Gene cloning, protein expression, and purification
The gene encoding the full-length target protein SghR (residues 1-350) from A. tumefaciens was amplified by PCR and cloned into the pET-14b vector. The resultant plasmid was verified by DNA sequencing and then transformed into the E. coli BL21 CodonPlus-(DE3) RIL cell strain. The culture, expression, and purification of full-length SghR followed the methods reported previously (16). In general, two sequential purification steps (nickel affinity and gel filtration chromatography) were applied to produce homogeneous SghR protein.
The purified protein was in a final buffer of 50 mM HEPES, pH 7.0, 50 mM NaCl, and 2 mM tris(2-carboxyethyl)phosphine and was concentrated to 6.4 mg/ml. The protein concentration was determined with a NanoDrop™ 2000/2000c spectrophotometer, applying the molecular weight of the SghR protein (including the extra affinity tag) and its extinction coefficient during the measurement; the same approach was used below to determine the protein concentrations of all other SghR constructs. The protein was then flash-frozen in liquid nitrogen and stored at −80°C for later experiments.
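The conversion from an absorbance reading to a mass concentration using the molecular weight and extinction coefficient can be sketched as below. This is a minimal illustration of the Beer-Lambert relation, assuming a reading normalized to a 1-cm path length; the function name and all numeric inputs are hypothetical and are not the values for SghR.

```python
def concentration_mg_per_ml(a280, ext_coeff_m1_cm1, mol_weight_da, path_cm=1.0):
    """Protein concentration in mg/ml from absorbance at 280 nm.

    Beer-Lambert: A = epsilon * c * l, so c (mol/L) = A / (epsilon * l).
    Multiplying by the molecular weight (g/mol) gives g/L, which equals mg/ml.
    """
    if ext_coeff_m1_cm1 <= 0 or path_cm <= 0 or mol_weight_da <= 0:
        raise ValueError("extinction coefficient, path length, and MW must be > 0")
    molar = a280 / (ext_coeff_m1_cm1 * path_cm)  # mol/L
    return molar * mol_weight_da  # g/L == mg/ml


# Hypothetical inputs chosen for easy arithmetic, not measured values.
example = concentration_mg_per_ml(a280=1.0, ext_coeff_m1_cm1=10000.0, mol_weight_da=40000.0)
print(f"{example:.2f} mg/ml")
```

Because the affinity tag changes both the molecular weight and possibly the extinction coefficient, each construct needs its own pair of values.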
For cocrystallization of the SghR-DNA complex, the gene encoding the SghR fragment spanning residues 18 to 350 was cloned into the NdeI and XhoI restriction sites of a modified pET-26b vector. The recombinant SghR truncation with a C-terminal 6×His tag (LEHHHHHH) was expressed in the E. coli BL21 CodonPlus-(DE3) RIL cell strain. To purify this truncated SghR protein, procedures similar to those reported previously (16) were used, except for the introduction of a high salt concentration (1 M NaCl) into the lysis buffer and the addition of a heparin chromatography step between the nickel affinity chromatography and gel filtration chromatography steps. This additional step removed contaminating nucleic acids that were nonspecifically bound to the target protein. Finally, the purified protein was dialyzed against buffer containing 10 mM Tris, pH 8.5, 100 mM NaCl, 0.1 mM EDTA, and 4 mM β-mercaptoethanol (β-ME) and concentrated to 1.67 mg/ml for preparing the protein-DNA complex. The DNA oligonucleotide with the sequence 5′-TATCTGCAACGTTGCAGA-3′, used for cocrystallization, was purchased from Sigma Aldrich at HPLC purification grade.
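The palindromic character of the cocrystallization oligonucleotide can be checked with a short sketch that searches for the longest stretch equal to its own reverse complement. The helper names are hypothetical, and the brute-force search is chosen for clarity on an 18-nt sequence rather than for efficiency.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_palindromic_core(seq):
    """Longest substring that equals its own reverse complement."""
    best = ""
    n = len(seq)
    for start in range(n):
        # Only consider substrings longer than the best one found so far.
        for end in range(start + len(best) + 1, n + 1):
            sub = seq[start:end]
            if sub == reverse_complement(sub):
                best = sub
    return best


oligo = "TATCTGCAACGTTGCAGA"  # sequence quoted in the text
core = longest_palindromic_core(oligo)
print(core, len(core), oligo.index(core))
```

For this sequence the search returns the 16-nt stretch TCTGCAACGTTGCAGA starting at the third nucleotide, so the oligonucleotide is self-complementary apart from its first two bases.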

DNase I footprinting assay
The sghA promoter region was amplified by PCR from the genomic DNA of A. tumefaciens A6 using the FAM-labeled forward primer 5′-(6-FAM)-CAGATGCAAGACTTTCCACCAC-3′ and the reverse primer 5′-CCTTTCGTCATCGACGT'-3′. The experiment was performed according to previously published methods (37,38) with some modifications. Briefly, 500 ng of FAM-labeled DNA was diluted with 20 μl of gel shift buffer (20 mM NaH2PO4, pH 8.5, 200 mM NaCl, and 5 mM β-ME), and then 3 μl of purified SghR (18.6 mg/ml) and 16 μl of water were added. After incubation at room temperature for 20 min, 10 U of DNase I (Roche) was added, followed by further incubation for 5 min. Subsequently, the reaction was stopped by heating the sample at 75°C for 10 min. The control experiment was performed in parallel by using the labeled probe in the absence of SghR protein. The digested DNA products were purified with the PCR purification kit (Qiagen) and eluted in 20 μl of water for further analysis. Dideoxynucleotide-based sequencing was performed using the Thermo Sequenase Dye Primer manual cycle sequencing kit (USB, Inc., Cleveland) by following the manufacturer's instructions, and the samples were analyzed with the Applied Biosystems 3730 DNA Analyzer. To confirm the result obtained with the FAM-labeled forward primer, the experiment was also carried out in parallel with a FAM-labeled reverse primer. The two primers used were 5′-CAGATGCAAGACTTTCCACCAC-3′ and 5′-(6-FAM)-CCTTTCGTCATCGACGT'-3′. The procedures were the same as described for the forward primer.

EMSA
Based on the results of our DNase I footprinting assay, the DNA oligonucleotides (forward, 5′-GCTGAAACGTTGCAGATTTTGCGT-3′; reverse, 5′-ACGCAAAATCTGCAACGTTTCAGC-3′) were synthesized to test binding with SghR. Equimolar complementary strands were mixed in the annealing buffer (20 mM NaH2PO4, pH 8.5, 200 mM NaCl, 5 mM β-ME), with a final concentration of 0.5 mM. Subsequently, the DNA solution was heated at 95°C for 3 min, followed by gradual cooling to room temperature.
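Before annealing, it is useful to confirm that the two strands are exact reverse complements, so that the duplex is fully paired and blunt-ended. The following is a minimal sketch using the two sequences quoted above; the helper name is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


forward = "GCTGAAACGTTGCAGATTTTGCGT"
reverse = "ACGCAAAATCTGCAACGTTTCAGC"

# True when the strands pair along their full length with no overhang.
fully_complementary = reverse_complement(forward) == reverse
print(len(forward), len(reverse), fully_complementary)
```

The same check catches transcription errors in primer tables, since a single mistyped base breaks the equality.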
For the in vitro DNA-binding assay, full-length SghR was cloned into a modified pET-26b vector between the restriction enzyme sites NdeI and HindIII. The final construct containing a C-terminal 6×His tag was prepared according to the same procedures used for purifying the SghR truncation (residues 18-350). The purified SghR (0.5 μg; for a 2:1 molar ratio of SghR to DNA, 1 μg of SghR protein was added) and dsDNA (amount added based on the ratio with SghR) were mixed at varied molar ratios, as indicated in Fig. 1E, and incubated at room temperature for 30 min. The samples were then resolved by 10% native PAGE, followed by gel staining with ethidium bromide and Coomassie brilliant blue. The staining results were recorded with a Gel Doc™ XR (Bio-Rad).

ITC
ITC measurements were carried out at 15°C using a Micro ITC200 (MicroCal Inc., Northampton, MA, USA). SghR was loaded into the sample cell at a concentration of 15 μM (dimer), and the dsDNA (the same sghA promoter sequence as that used in the EMSA experiment above) was loaded into the syringe at a concentration of 200 μM. To investigate the effect of sucrose on SghR-DNA binding, the SghR protein at a concentration of 18.7 μM was premixed with a 1000-fold molar excess of sucrose for 1 h at room temperature before loading into the sample cell. As a control, binding of SghR to Ppa0305 promoter DNA from P. aeruginosa was tested in parallel. To correct for dilution and mixing effects, blank titrations were conducted by titrating DNA buffer into either the protein or the protein-sucrose solution in the sample cell, and the resulting heats were subtracted from the raw data. All data were analyzed with MicroCal Origin 7.0 (Microcal Software, Inc., MA, USA).

MST
The microscale thermophoresis (MST) experiment was performed using a Monolith NT.115 instrument (NanoTemper Technologies, Germany) at 24°C. PBS buffer (137 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 1.8 mM KH2PO4) supplemented with 0.05% Tween-20 was used as the MST-binding buffer. Premium-coated capillaries were used. SghR protein was purified to homogeneity as described above. The purified SghR was labeled with the Monolith protein labeling kit RED-NHS 2nd generation (Amine Reactive). The labeled SghR was diluted to a concentration of 250 nM, mixed with an equal volume of a serial dilution series of sucrose, and then incubated at room temperature for 20 min before loading into MST capillaries. A single MST experiment was performed using 60% LED power and 20% MST power with a wait time of 5 s, laser-on time of 30 s, and a back-diffusion time of 5 s. MST data were analyzed in GraphPad Prism, and the data were fitted with the Hill equation. The mean 50% effective concentration (or Kd) values were calculated with standard error (S.E.). Each experiment was repeated at least three times.
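The shape of the model used for the MST fit can be illustrated with a minimal sketch of one common four-parameter form of the Hill equation, together with a helper that generates a serial dilution series. The fitting itself was done in GraphPad Prism; the exact parameterization used there is not stated, so the function names, parameter names, and numbers below are assumptions for illustration only.

```python
def hill_response(conc, bottom, top, ec50, hill_n):
    """Four-parameter Hill equation.

    The response rises from `bottom` (no ligand) to `top` (saturation)
    and reaches the midpoint when conc == ec50.
    """
    if conc < 0 or ec50 <= 0:
        raise ValueError("conc must be >= 0 and ec50 must be > 0")
    if conc == 0:
        return bottom
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** hill_n)


def serial_dilution(start, factor, steps):
    """Concentrations of a serial dilution, highest first."""
    return [start / factor ** i for i in range(steps)]


# Hypothetical parameters and dilution series (arbitrary concentration units).
series = serial_dilution(start=100.0, factor=2, steps=4)
for conc in series:
    print(conc, round(hill_response(conc, bottom=0.0, top=1.0, ec50=25.0, hill_n=1.0), 3))
```

Note that mixing the labeled protein with an equal volume of each ligand dilution halves both concentrations, which has to be accounted for when the ligand axis is set up for fitting.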

Crystallization, data collection, and structure determination
The crystallization procedures for apo SghR are the same as those reported in our previous paper (16). In general, for each condition, three ratios of protein and reservoir solution (0.2:0.1 nl, 0.15:0.15 nl, and 0.1:0.2 nl) were screened by the sitting-drop vapor-diffusion method using a Phoenix robot (Art Robbins Instruments) at 20°C. Screening kits from Hampton Research, including Crystal Screen, Crystal Screen 2, Index, PEG/Ion, PEGRx, and SaltRx, were used. The apo SghR crystal hits were observed under conditions consisting of 0.05 M HEPES-Na, pH 7.0, 20% PEG 3350. The crystallization condition was optimized by varying the PEG concentration (17-22%) and increasing the drop size to 2 μl of protein solution and 1 μl of reservoir solution using the hanging-drop method at 20°C. The crystals were cryoprotected with a cryoprotectant consisting of 35% PEG 3350 in the reservoir solution and were then flash-cooled in liquid nitrogen. To cocrystallize the SghR-sucrose complex, sucrose at a final concentration of 60 mM was mixed with SghR (6 mg/ml) and incubated for 30 min at room temperature before setting up the trays. After crystallization screening, crystals were observed under the same condition as that of apo SghR. A similar optimization strategy was also applied to obtain diffraction-quality crystals.
To cocrystallize the SghR-DNA complex, dsDNA was prepared using the above-mentioned annealing method in buffer containing 10 mM Tris-HCl, pH 8.5, 100 mM NaCl, and 1 mM EDTA. The annealed DNA duplexes were directly added to the concentrated truncated SghR (residues 18-350) at a molar ratio of 1.1:1 (DNA:protein). The mixture was incubated on ice for at least 3 h before setting up crystallization trays. The final protein concentration in the mixed solution was ~5.5 mg/ml. The initial screening (sitting-drop method at 20°C) and crystal optimization (hanging-drop method at 20°C) for this complex were performed similarly to those for apo SghR. Well-diffracting SghR-DNA complex crystals were obtained in a condition consisting of 19-22% PEG 3350 and 0.2 mM ammonium citrate tribasic, pH 7.0. Crystals were then flash-frozen in the above reservoir solution supplemented with 30% glycerol.
Diffraction data were collected at 100 K on beamline X06SA of the Swiss Light Source (SLS) or I04 (Diamond). The data were processed using XDS (39). All three structures were determined by molecular replacement using the program Phaser (40). The apo SghR structure was solved using the structure of a homologous protein (PDB entry 3GV0) as a search model. The solution showed two copies of SghR in the asymmetric unit. Subsequently, automatic model building was carried out using ARP/wARP (41). The model was further improved by manual model building with Coot (42) and refined by applying TLS refinement, combining Phenix (43) and REFMAC5 (44). The apo SghR structure was then used as a search model to obtain the initial phases of the SghR-sucrose complex. The procedure for manual model building and refinement was the same as that for apo SghR. As for the SghR-DNA complex, initial molecular replacement trials with either a monomer or a dimer of apo SghR (residues 74-349) did not yield unambiguous solutions. However, there were 1233 partial solutions, and the best one showed four monomers in the space group P2₁2₁2₁. Using these four monomers (forming two physiological dimers) as a fixed solution, a second round of molecular replacement was carried out using the dsDNA-bound N-terminal domain of the lac repressor (PDB entry 1EFA). The N-terminal domain of the lac repressor is equivalent to residues 1-73 of SghR. This allowed the identification of the orientation and positioning of the dsDNA-bound N-terminal region (residues 1-73) of SghR with respect to the C-terminal domain (residues 74-349). Thus, a dimer of SghR was correctly placed. Further molecular replacement with the newly obtained dsDNA-bound SghR dimer allowed the identification of all six copies in one asymmetric unit. The procedure for manual model building and refinement was the same as that for apo SghR. The final models of the three structures determined in this study were checked by PROCHECK (15).
The crystallographic data and refinement statistics are listed in Table 1, and all figures were generated with PyMOL (DeLano Scientific). The protein interface areas were calculated using the PISA program (45).

Data availability
Coordinates and structure factors have been deposited in the Protein Data Bank under accession codes 7CDV, 7CDX, and 7CE1. All remaining data are included within the paper or are in the supporting information.