Structure of the Hepatocyte Nuclear Factor 6α and Its Interaction with DNA

Hepatocyte nuclear factor 6 (HNF-6, also known as OC-1) belongs to the One Cut family of transcription factors and is essential for the development of the mouse pancreas, gall bladder, and intrahepatic bile ducts. HNF-6 binds to DNA as a monomer utilizing a single Cut domain and a divergent homeodomain motif located at its C terminus. Here, we have used NMR methods to determine the solution structure of the 162-amino acid residue DNA-binding domain of the HNF-6α protein. The overall structure of HNF-6α has two distinct domains, the Cut domain and the Homeodomain, connected by a long flexible linker. Our NMR structure shows that the Cut domain folds into a topology homologous to the POU DNA-binding domain, even though the sequences of these two protein families do not show homology. The DNA contact sequence of HNF-6α was mapped with chemical shift perturbation methods. Our data also show that a proposed CREB-binding protein histone acetyltransferase protein-recruiting sequence, LSDLL, forms a helix and is involved in the hydrophobic core of the Cut domain. The structure implies that this sequence has to undergo structural changes when it interacts with CREB-binding protein.

The HNF-6 transcription factor is expressed in epithelial cells of the developing pancreas, gall bladder, and liver and in the neural crest cells during mouse embryonic development (1,2). HNF-6 continues to be expressed in the adult mouse liver and pancreas, where it regulates the transcription of genes required for glucose homeostasis and glucose metabolism (3-6). Contained within the N terminus of the HNF-6 protein is the STP box, a serine/threonine/proline-enriched region believed to act as a transcriptional activation domain (1). DNA recognition involves the single Cut domain and the Homeodomain at the C terminus of HNF-6 (7,8). These two domains show unusual properties. Unlike most homeodomain-containing transcription factors (9,10), the Homeodomain of HNF-6 cannot bind to DNA on its own, whereas the Cut domain of HNF-6 alone is sufficient to bind some sites, although with lower affinity (9,10). The Cut motif is a newly identified DNA binding motif and consists of 70-80 conserved residues. Cut domains function as DNA-binding domains alone or in combination with a homeodomain (11,12). Even though the Cut motif acts primarily as a DNA binding motif, an interaction motif for the coactivator CREB-binding protein (CBP) was identified within the Cut motif (13). A recent study has demonstrated that the Cut and Homeodomain region of HNF-6 interacts with the winged helix DNA-binding domain of FoxA2 (previously called HNF-3β) both in vivo and in vitro (14,15). The interaction blocks HNF-6 binding to DNA but stimulates transcription of FoxA2 target promoters: HNF-6 functions as a transcription coactivator, facilitating the recruitment of CBP to the transcriptional activation complex (15). Detailed studies of the structure of the Cut domain and Homeodomain of HNF-6 and their interactions with DNA and proteins will reveal the versatile mechanisms used by HNF-6 to regulate transcription.
In this study, we present the high resolution NMR structure of the DNA-binding domains of HNF-6α and study their interaction with a strong DNA binding sequence. Surprisingly, our data indicate that the proposed CBP-recruiting motif is folded into a small helix and is involved in the formation of the hydrophobic core of the Cut domain. Our result suggests a possible mechanism by which HNF-6 regulates transcription.

Expression and Purification of HNF-6α and the Cut Domain-The gene encoding the Cut domain, linker, and Homeodomain (referred to simply as HNF-6 hereafter) and the gene encoding the Cut domain and the linker between the two domains (referred to as the Cut domain) were generated from a mouse cDNA and subcloned into the pET21a vector with an N-terminal His6 tag. Both proteins were produced in Escherichia coli BL21(DE3) (Novagen, Madison, WI). Uniformly labeled HNF-6 and the Cut domain were obtained by growing cells in isotopically enriched M9 minimal medium. The expression of each protein was induced by adding isopropyl-β-D-thiogalactopyranoside to 1 mM in the medium, and overnight-induced cells were then harvested by centrifugation. Because the highly expressed HNF-6 and Cut domain were present in inclusion bodies, the six-histidine-tagged proteins were purified by a standard procedure using nickel-nitrilotriacetic acid resin under denaturing conditions per the manufacturer's instructions (Qiagen).
The Mutagenesis of HNF-6-pECE HNF-6 L350A, a generous gift from Frédéric Lemaigre (Brussels, Belgium), was used as a PCR template; the product was digested with EcoRI and NotI and then ligated into the pcDNA V5/His expression vector (Invitrogen) to generate the L45A HNF-6 mutant. The Altered Sites II mammalian in vitro mutagenesis system (Promega) with the 5'-phospho-ACG CGA CTT GAG TCT GCT CCA GGG CTT-3' HNF-6 mutagenesis oligomer was used to generate the K52R HNF-6 mutant as directed by the manufacturer's protocol.
* This work was funded by an ADA grant (to X. L.) and by a Public Health Service Grant R01 GM43241-14 (to R. H. C.) from the National Institute of General Medical Sciences. The Bruker DRX600 was purchased with funds from the University of Illinois at Chicago and a grant from the National Science Foundation Academic Research Infrastructure Program (BIR 9601705). These studies made use of the National Magnetic Resonance Facility (NMRFAM) (Madison, WI), which is supported by grants from the National Institutes of Health and the National Science Foundation. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
NMR Sample Preparation-Purified HNF-6 or the Cut domain was renatured by dialysis against a phosphate buffer (50 mM Na2HPO4, pH 6.5, 100 mM NaCl, 10 mM Na2SO4). The renatured proteins showed the expected molecular weights on an SDS gel. As judged by gel-shift assays, the expressed HNF-6 binds to its strong binding sites with nanomolar Kd values, as expected (13). The protein concentrations used in the NMR study were in the range of 1-2 mM as measured by the Bio-Rad protein assay using bovine serum albumin as the standard (Bio-Rad, Hercules, CA).
HNF-6·DNA Complex Formation-For the study of the HNF-6-DNA interaction, a 17-base pair double-stranded DNA sequence (5'-CGAAAAATCAATATCGG-3') was purchased from Qiagen. This sequence is a strong HNF-6-binding site derived from the FoxA2 promoter (10). The complex was formed by adding an excess of DNA to the protein sample, with an estimated final DNA to protein ratio of 2:1. Because of the strong DNA interaction, the free and complexed forms of HNF-6 are in slow exchange, and the resonances from HNF-6 in the DNA complex do not depend on DNA concentration. In the heteronuclear single quantum coherence (HSQC) spectrum of the complex, no obvious signals from DNA-free HNF-6 were detected in the sample used in this study.
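As a small consistency check on the oligonucleotide described above, the following sketch derives the complementary strand of the 17-base pair site and estimates how much duplex stock gives a 2:1 DNA to protein ratio. The function names and the stock concentration used in the test values are illustrative assumptions and are not taken from the study.

```python
# Illustrative helpers (hypothetical names, not from the original study).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

TOP_STRAND = "CGAAAAATCAATATCGG"  # 5'->3', as given in the text


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def dna_volume_for_ratio(protein_mM: float, protein_uL: float,
                         dna_stock_mM: float, ratio: float = 2.0) -> float:
    """Volume (uL) of duplex stock that gives the requested DNA:protein molar ratio.

    The stock concentration is an assumption supplied by the caller;
    the text only states an estimated final ratio of 2:1.
    """
    if dna_stock_mM <= 0:
        raise ValueError("stock concentration must be positive")
    nmol_protein = protein_mM * protein_uL  # mM * uL = nmol
    return ratio * nmol_protein / dna_stock_mM


BOTTOM_STRAND = reverse_complement(TOP_STRAND)  # also written 5'->3'
```

Applying `reverse_complement` twice returns the original strand, which is a convenient sanity check on the complement table.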
NMR Spectroscopy-NMR experiments were performed either in 10% D2O, 90% H2O or in 100% D2O at 22°C for DNA-free protein and at 30°C for the HNF-6·DNA complex on Bruker DRX600, Varian INOVA 600, and 800 MHz spectrometers. The backbone and side chain assignments of DNA-free HNF-6 were described previously (16). For the assignments of HNF-6, up to 400 mM urea was also added to the different NMR samples. The urea suppresses strong resonance signals from flexible residues in HNF-6 and facilitates the assignment of overlapped resonances by following urea concentration-dependent chemical shift changes of the residues in the unstructured sequences. The urea has only a minimal effect on the resonances of residues in the structured sequences. For 1H-2H exchange experiments, the sample was lyophilized and dissolved in 99.9% D2O immediately before HSQC spectra were acquired. For comparison, an HSQC spectrum was acquired on the 15N-enriched Cut domain under the same conditions as used for DNA-free HNF-6. The backbone assignment of the DNA complex was achieved through a combination of HNCA, HNCO, CBCA(CO)NH, HN(CO)CA, and NOESY-15N-HSQC spectra (17-20). All of the data processing, peak picking, and peak intensity measurements were performed using the commercial software SYBYL (Tripos Inc.). All of the 1H dimensions were referenced to internal 2,2-dimethyl-2-silapentane-5-sulfonate, and 13C and 15N were indirectly referenced to 2,2-dimethyl-2-silapentane-5-sulfonate (21).
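The indirect referencing mentioned above can be sketched as follows. This is a minimal illustration, assuming the commonly tabulated frequency ratios for 2,2-dimethyl-2-silapentane-5-sulfonate (DSS) in water; the text does not state the exact values or software used, and the function names are hypothetical.

```python
# Frequency ratios (heteronucleus / 1H) for DSS. These values are assumed
# here from commonly tabulated recommendations; they are not quoted in the text.
XI = {"13C": 0.251449530, "15N": 0.101329118}


def zero_ppm_frequency(h1_dss_mhz: float, nucleus: str) -> float:
    """Frequency (MHz) corresponding to 0 ppm for the given heteronucleus,
    derived from the measured 1H frequency of internal DSS."""
    return h1_dss_mhz * XI[nucleus]


def to_ppm(observed_mhz: float, zero_mhz: float) -> float:
    """Convert an observed frequency to a chemical shift in ppm."""
    return (observed_mhz - zero_mhz) / zero_mhz * 1e6
```

Only the 1H resonance of internal DSS has to be measured in this scheme; the 13C and 15N zero points follow from the ratios.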
FIG. 1. Solution structure of HNF-6 (residues 6-152). Backbone traces from 20 selected conformers with the lowest target functions from the final DYANA calculations are shown. A, the structures were superimposed using residues of the Cut domain (residues 6-78) and colored by domain (Cut domain, red; Homeodomain, blue). B, the structures were superimposed using residues of the Homeodomain (residues 104-152) and colored as in panel A. C, ribbon representation of the structure of HNF-6 colored as in panels A and B. The linker region is colored green. D, sequence comparison of different HNF-6 family members. Dots represent identities to rat HNF-6. Shaded red boxes indicate highly conserved residues. Shaded green boxes indicate conserved residues. The abbreviations used are: r, rat; m, mouse; h, human; c, Caenorhabditis elegans. The secondary structure of HNF-6 is shown below the sequence.
Table I, footnote d: Statistics were made for residues 6-69 and 104-152 for HNF-6, residues 6-69 for the Cut domain, and residues 104-152 for the Homeodomain.
Structure calculations were performed with the program DYANA 1.5 (29) using a 40,000-step energy minimization procedure. For the initial rounds of structure calculations, only sequential, intraresidual, and medium-range NOEs, unambiguous long-range NOEs, and coupling constants were used. Later, all of the other long-range NOEs and hydrogen bonds were introduced in consecutive steps. In each round, 100 structures were calculated, and of these, the 20 structures with the lowest target functions were used to analyze restraint violations and to assign additional NOE restraints for the following round. This process was repeated until >90% of the NOE peaks in the spectra had been assigned and all of the violations were eliminated. In the final stage, the 20 structures with the lowest target functions were used for the structural analyses. All of the subsequent analyses of the structure and graphic representations of the three-dimensional structures were performed using MOLMOL (30) and PROCHECK NMR (31).
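The selection step in this iterative protocol (keeping the lowest target function conformers from each round and stopping once more than 90% of the NOE peaks are assigned with no violations) can be illustrated with a short sketch. The function names and input layout are hypothetical; this does not reproduce DYANA itself, only the bookkeeping described above.

```python
from typing import List


def select_best_conformers(target_functions: List[float], keep: int = 20) -> List[int]:
    """Return the indices of the `keep` conformers with the lowest target function."""
    ranked = sorted(range(len(target_functions)), key=lambda i: target_functions[i])
    return ranked[:keep]


def refinement_finished(assigned_peaks: int, total_peaks: int, n_violations: int) -> bool:
    """Stopping rule from the text: >90% of NOE peaks assigned and no violations left."""
    if total_peaks <= 0:
        raise ValueError("total_peaks must be positive")
    return assigned_peaks / total_peaks > 0.90 and n_violations == 0
```

In a round of 100 calculated conformers, `select_best_conformers(values)` would return the 20 indices used for violation analysis and further NOE assignment.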

RESULTS
The Structure of HNF-6-The backbone and side-chain resonance assignments of the 162-residue HNF-6 were described previously (16). Based on the resonance assignments, the HNF-6 structure was determined using 1805 experimental distance and dihedral angle restraints (Table I). The overall structure of HNF-6 has two distinct globular domains: the Cut domain (residues 1-78) and the Homeodomain (residues 100-162). The two domains are tethered by a long flexible linker region (residues 79-99). Unambiguous NOE connectivities were observed between the Cut domain and the linker region. However, because no NOE connectivities were observed between the Cut domain and the Homeodomain, the structure of HNF-6 did not converge to one conformation. The calculated average global heavy-atom r.m.s.d. of the 20 structures was 5.66 Å (Fig. 1, A and B). A ribbon diagram of a representative NMR structure of HNF-6 is also shown in Fig. 1C. For the individual domains, however, the structures converged well. The best-fit superposition of backbone atoms for the Cut domain and the Homeodomain is shown in Fig. 2.
Homeodomains, as demonstrated previously, can fold independently (32,33). However, it is not known whether the Cut domain can fold as an independent domain. The full-length HNF-6 protein and the Cut domain (residues 1-78) with the linker (residues 79-99) were studied by heteronuclear NMR. The similarity between the HNF-6 and the Cut domain 15N-HSQC spectra (Fig. 3) demonstrates that the Cut domain constitutes a nearly independent structural domain. Only a few resonances of the Cut domain differ between the two spectra. In addition, the resonances from the residues corresponding to the central linker region (residues 79-99) are intense and poorly resolved in the HSQC spectrum (Fig. 3), suggesting that this region of the HNF-6 DNA-binding domain is rather mobile without the Homeodomain attached.
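The r.m.s.d. values quoted for the ensemble can be illustrated with a minimal sketch. It assumes that the conformers have already been superimposed (the fitting step itself is not shown) and that the average is taken over all conformer pairs; the text does not say whether the reported average is pairwise or relative to a mean structure, so that choice is an assumption, as are the function names.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(a: Sequence[Coord], b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two equally sized, pre-superimposed atom sets."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate sets must be non-empty and of equal size")
    squared = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(squared / len(a))


def mean_pairwise_rmsd(ensemble: List[Sequence[Coord]]) -> float:
    """Average r.m.s.d. over all pairs of conformers in the ensemble."""
    pairs = list(combinations(ensemble, 2))
    if not pairs:
        raise ValueError("need at least two conformers")
    return sum(rmsd(a, b) for a, b in pairs) / len(pairs)
```

Restricting the atom lists to one domain before superposition, as in Fig. 1, A and B, is what separates the well-converged per-domain values from the large global value.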
The Structure Comparison of the Cut Domain and Homeodomain of HNF-6 and the POU Domain and Homeodomain of Oct-1-The Cut domain is a new DNA binding motif. It contains four helices and resides N-terminal to the Homeodomain. Another type of homeodomain-containing DNA-binding protein is the POU homeodomain protein (34). Both the Cut and POU domains contain a similar number of residues and four helices. Even though Cut domains and POU domains do not show sequence homology to each other (Fig. 4B), the topology of the four helices in the POU domain and the Cut domain is similar (Fig. 4A). If the structures of the Cut domain of HNF-6 and the POU domain of Oct-1 (35,36) are compared, the backbone r.m.s.d. value is 6.44 Å. The major difference is observed in the third helix. This helix is shorter and arranged differently in the Cut domain than in the POU domain. The Homeodomain of HNF-6 was proposed as one of the unique members in the Homeodomain family (1). A conserved Trp observed in other homeodomains is replaced by a Phe (Phe147) in the hydrophobic core of HNF-6 (Fig. 4D), and a Met (Met149) is only observed in HNF-6-type homeodomains (1). Substitution of these two residues with Trp and His influences the activity of HNF-6 without interfering with the HNF-6 DNA binding affinity (13). The structural comparison indicates that our structure of the Homeodomain of HNF-6 is highly homologous to the Homeodomain of Oct-1 (Fig. 4C) with the exception that the last helix is shorter in the Homeodomain of HNF-6 (36). The protein-DNA interaction extends the length of the major DNA contact helix (H6) in the Homeodomain of Oct-1 (36). The backbone r.m.s.d. value of the two domains is 4.08 Å (Fig. 4C). However, because of the Phe residue, the arrangement of the three helices is slightly different between the two homeodomains. The structural differences may be critical for the recognition of HNF-6 by other cofactors such as CBP, which would explain why a Phe to Trp substitution influences HNF-6 activity (13).
Amino Acid Substitution in the LSDLL Motif Does Not Disrupt the Structure of the Cut Domain-Previous in vivo data indicated that the LSD 45 LL sequence in the Cut domain is an important CBP recruitment motif. The Leu to Ala amino acid substitution (LSD 45 AL) abolishes CBP recruitment by HNF-6. Because this sequence forms a small helix and is involved in the hydrophobic packing of the Cut domain, it is possible that the substitution disrupts the folding of the Cut domain and the DNA binding of HNF-6. To address this question, the Leu to Ala mutant HNF-6 (L45A) and another HNF-6 mutant (K54R) were expressed and studied by NMR. HSQC spectra indicate that neither mutation disrupts the folding of the Cut domain (Fig. 5, A-C) or the DNA binding property of HNF-6 (data not shown). Leu45 in the Cut domain is only partially embedded in the hydrophobic core according to our structure (Fig. 5D). Hence, it is reasonable that the Leu to Ala substitution does not disrupt the Cut motif and that Leu45 is a critical residue in CBP recognition. Interestingly, because the LSDLL sequence forms a small helix and is involved in the hydrophobic core formation of the Cut domain, the structure of this motif has to be rearranged for CBP recognition to expose the three Leu residues for the intermolecular protein-protein interaction (37,38). Lys54 is a surface residue in the unstructured sequence (Fig. 5D). A previous experiment showed that the K54R mutation reduced the in vivo stability of HNF-6. Based on our NMR data, the reduced in vivo activity of the K54R mutant HNF-6 is not due to incorrect folding of the Cut domain.
DNA Recognition by HNF-6-One of the unique properties of HNF-6 is that the Homeodomain of HNF-6 cannot bind to DNA on its own and that the Cut domain is the major DNA-binding contributor of HNF-6 (8). To investigate how the HNF-6 Cut domain and the Homeodomain recognize DNA, HNF-6 bound to a strong DNA-binding site was studied using the NMR chemical shift perturbation method (39). This method detects protein residues that interact directly with the ligand or that undergo conformational changes upon binding. The site is derived from the HNF-6-binding site of the FoxA2 promoter (8). It was proposed that, on this promoter, HNF-6 recruits CBP for transcriptional stimulation and that the interaction depends both on the LXXLL motif of the Cut domain and on Phe147-Met149 in the Homeodomain (13). The 15N-1H HSQC spectra of HNF-6, free and in complex with DNA, are shown in Fig. 6, A and B. The comparison shows that the two spectra share many similarities, suggesting that HNF-6 does not undergo major structural rearrangements upon binding to DNA. The average 15N, HN, and Cα chemical shift change for backbone resonances is given in Fig. 6C. The residues that did not show large chemical shift changes upon DNA binding are primarily located on two surfaces, one on the Cut domain and the other on the Homeodomain. This finding also suggests that the chemical shift changes may not result from global structural changes but are due to the DNA binding interaction. If the secondary structures of HNF-6 and Oct-1 are aligned, the HNF-6 regions that show large chemical shift perturbations match well to the DNA contact sequences observed in Oct-1, such as the C terminus of the Homeodomain and the N termini of α3 and α4 of the Cut domain (36). Therefore, HNF-6 may use a DNA contact mode similar to that of Oct-1 to recognize DNA (36,40).

DISCUSSION
HNF-6 is a liver-enriched transcription factor that controls the expression of several critical enzymes required for glucose homeostasis.
The importance of this factor during development is displayed in the HNF-6 knock-out mice, which possess severe liver, gall bladder, and pancreatic defects and develop type II diabetes (5). Thus, it is important to determine the mechanism used by HNF-6 to bind to its DNA recognition sequence and how HNF-6 transcriptional activity is regulated. Our NMR structure shows that the Cut domain folds into a topology homologous to the POU DNA-binding domain, even though the two protein families do not have obvious sequence homology. However, when the secondary structure elements are aligned between HNF-6 and Oct-1 (Fig. 4, B and D), the positions of the hydrophobic core residues show similarity.
Both POU and Cut motifs are DNA-binding domains. The LSDLL sequence in Helix 3 (α3) of the HNF-6 Cut domain has been proposed to function in transcriptional activation by recruiting the CBP coactivator protein. The LXXLL motif is involved in specific protein-protein interactions and is well studied. The three leucine residues in the motif interact strongly with its target through specific hydrophobic interactions (37,38). However, based on our study, this motif is not surface-exposed and is involved in hydrophobic packing of the Cut domain. How this motif is involved in CBP recruitment is not clear. However, it has been shown that HNF-6 and FoxA2 interact strongly, and the interaction serves to recruit CBP to FoxA2-dependent target genes (15). It is possible that the protein-protein interaction rearranges the structure of the Cut domain and α3 and exposes this motif for CBP recognition.
The Cut domain is structurally homologous to the POU domain. The N termini of both α3 and α4 of the POU domain make DNA contacts, with α3 residing in the major groove of DNA (36). When HNF-6 binds to DNA, the N termini of α3 and α4 also show obvious chemical shift perturbations. Therefore, when HNF-6 binds to DNA, it is possible that α3 of the Cut domain contacts the major groove of the DNA. In this case, the LSDLL sequence cannot be exposed, and therefore HNF-6 cannot recruit CBP through this motif and act as a transcriptional coactivator protein when it is bound to DNA. This hypothesis is supported by our previous studies, which demonstrated that inhibition of CBP activity by the adenovirus E1A protein abolished FoxA2-HNF-6 transcriptional synergy on a FoxA target gene, whereas CBP inhibition did not influence HNF-6 transcriptional activation of an HNF-6 target gene (15). Furthermore, because the HNF-6/FoxA2 interaction is likely to perturb the folding of α3 to expose the LXXLL motif, the interaction with the FoxA2 winged helix domain should interfere with HNF-6-DNA recognition. This possibility is supported by our previous results, which show that the FoxA2-HNF-6 interaction blocks HNF-6 binding to DNA (15). When HNF-6 binds to DNA, the resonances from the residues in the linker region also show noticeable chemical shift perturbations. They are likely to be caused by structural rearrangement of the linker, by a linker/DNA interaction, or by both. A linker/DNA interaction can explain the observed differences in the DNA binding profiles of HNF-6α and HNF-6β, the latter of which is a splicing isoform of the HNF-6 mRNA that encodes a longer linker region between the Cut and homeodomain motifs (13). Clearly, further studies are necessary to answer these questions unambiguously.
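The chemical shift perturbation analysis referred to in this discussion can be sketched as follows. The text reports an average 15N, HN, and Cα shift change per residue but does not give the formula, so the weighted root-mean-square combination and the scaling factors below are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
from typing import Dict, List

# Assumed scaling factors that put 15N and 13Ca changes on the 1H scale.
# The study does not state which weights (if any) were used.
WEIGHTS = {"HN": 1.0, "N": 0.2, "CA": 0.3}

Shifts = Dict[str, float]  # nucleus label -> chemical shift in ppm


def combined_shift_change(free: Shifts, bound: Shifts,
                          weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted root-mean-square shift change (ppm) for one residue."""
    terms = [(w * (bound[n] - free[n])) ** 2
             for n, w in weights.items() if n in free and n in bound]
    if not terms:
        raise ValueError("no nuclei in common between free and bound states")
    return math.sqrt(sum(terms) / len(terms))


def perturbed_residues(free: Dict[int, Shifts], bound: Dict[int, Shifts],
                       threshold: float) -> List[int]:
    """Residue numbers assigned in both states whose combined change exceeds the threshold."""
    common = sorted(set(free) & set(bound))
    return [r for r in common
            if combined_shift_change(free[r], bound[r]) > threshold]
```

Residues returned by `perturbed_residues` would then be mapped onto the structure to outline candidate DNA contact surfaces. The threshold is left to the caller because the text does not specify one.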
The HNF-6 Homeodomain is proposed to be a divergent Homeodomain because it contains a Phe (Phe147) at the core position instead of the Trp seen in other Homeodomains and a Met at position 149, an amino acid that has not been found at this position in any other Homeodomain protein (Fig. 4D). Phe147 is a hydrophobic core residue, and Met149 is a surface residue in the HNF-6 Homeodomain. Our structure indicates that the fold of the HNF-6 Homeodomain is typical of the Homeodomain motif, with Phe147 in the hydrophobic core. If the structures of different homeodomains are compared, both Phe and Trp contact a Leu residue in helix 1 of their corresponding homeodomains. Previous data indicated that the single amino acid substitutions F147W and M149H and the double amino acid substitution F147W/M149H reduced the CBP-dependent HNF-6 activity without dramatically modifying the HNF-6 DNA binding affinity (8). According to previous structures of Homeodomain-DNA complexes and our NMR study of the HNF-6·DNA complex, this Met residue should reside in the major groove. Therefore, if Met149 is involved in the CBP interaction, it should play a role in a DNA-independent CBP interaction.