Identification of a Novel PDX-1 Binding Site in the Human Insulin Gene Enhancer

Islet β cell type-specific transcription of the insulin gene is regulated by a number of cis-acting elements found within the proximal 5′-flanking region. The control sequences conserved between mammalian insulin genes are acted upon by transcription factors, like PDX-1 and BETA-2, that are also involved in islet β cell function and formation. In the current study, we investigated the contribution to human insulin expression of the GG2 motif found between nucleotides -145 and -140 relative to the transcription start site. Site-specific mutants were generated within GG2 that displayed a parallel increase (i.e. -144 base pair) or decrease (i.e. -141 base pair) in insulin enhancer-driven reporter and gel shift binding activity in β cells consistent with human GG2 being under positive regulatory control. In contrast, the corresponding site in the rodent insulin gene, which only differs from the human at nucleotides -144 and -141, is negatively regulated by the Nkx2.2 transcription factor (Cissell, M. A., Zhao, L., Sussel, L., Henderson, E., and Stein, R. (2003) J. Biol. Chem. 278, 751-756). Human GG2 activator binding activity was present in nuclear extracts prepared from human islets and enriched in those from rodent β cell lines. The human GG2 activator binding factor(s) was shown to be ∼38-40 kDa and distinct from other size-matched islet-enriched transcription factors, including Nkx2.2, Pax-4, Cdx2/3, and Isl-1. Combined DNA chromatographic purification and mass spectrometry analysis revealed that the GG2 activator was PDX-1. These results demonstrate that the GG2 element, despite its divergence from the core homeodomain consensus binding motif, is a site for PDX-1 activation in the human insulin gene.

Insulin is a polypeptide hormone critically involved in the control of glucose homeostasis and is synthesized exclusively by pancreatic islet β cells. Tissue-specific expression of the insulin gene has been shown in transgenic animals in vivo to be mediated by 5′-flanking sequences within 350 base pairs of the transcription start site that are ∼60% conserved between human and rodent (1-3). A focused analysis of this region has shown that restricted expression is at least partially regulated by the respective binding of islet-enriched PDX-1 (4-6), MafA (7-9), and BETA-2 (10) to conserved insulin A3 (-216/-207 bp), C1 (-125/-116 bp), and E1 (-111/-102 bp) element sequences. Each of these factors represents a distinct transcription factor family, with DNA binding mediated by the homeodomain, basic helix-loop-helix, and basic leucine-zipper regions in PDX-1, BETA-2, and MafA, respectively.
The current work was focused on the GG2 element located between nucleotides -145 and -140 in the human insulin gene. Recently the corresponding region in the rodent gene was found to be negatively regulated by islet-enriched Nkx2.2, a homeodomain transcription factor of the NK-2 class (25). However, earlier studies showed reduced human insulin enhancer-driven reporter activity from a GG2 deletion and block mutant, while in vitro DNase I footprinting indicated that a β cell-enriched factor of roughly 30 kDa was involved in control (34-36). To examine the mechanism of human GG2-mediated control more precisely, a comprehensive series of directed mutations was made and analyzed in the context of a human insulin enhancer-driven reporter. These studies strongly suggested that human GG2 was under positive control by a protein(s) of ∼38-40 kDa that was present in nuclear extracts from human islets and enriched in rodent β cell lines. The binding site for the human GG2 activator also appeared to be distinct from those of other pancreas-enriched transcription factors and incapable of interacting with factors of similar size, including Nkx2.2, Pax-4, Cdx2/3, and Isl-1. The GG2 activator was purified from a cultured β cell line by DNA affinity chromatography and identified by mass spectrometry to be PDX-1, a result confirmed by antibody and DNA binding competition analysis. These studies have identified an atypical PDX-1 binding site in the human insulin enhancer and demonstrated a fundamental difference in the regulation of the human and rodent insulin genes.

MATERIALS AND METHODS
Transfection Constructs-Insulin GG2-mediated activation was assayed from human insulin enhancer-driven firefly luciferase expression constructs that contained either wild type -251 to +2 bp sequences (-251 Luc WT) (37) , and -141 mut B (-155 CCAGCACCAGGGAAGTGGTCCGGAAA -130) were constructed using the QuikChange™ site-directed mutagenesis kit (Stratagene, La Jolla, CA). Restriction enzyme digestion and DNA sequencing analyses were used to verify each construct. The -238 Luc WT construct contains rat insulin II sequences from -238 to +2 bp (38), while -238 Luc (-133/-132 mut) contains an A to C change at nucleotides -133 and -132 (25).
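The coordinate convention used for the mutant oligonucleotide can be checked with a short sketch. It assumes that the hyphen inside the printed sequence is a line-break artifact, that the 26 nucleotides run contiguously from -155 to -130 as labeled, and that every position is upstream of the transcription start site (so no jump across +1 is needed). The helper names `base_at` and `window` are hypothetical and introduced only for illustration.

```python
# Sketch: map promoter coordinates onto the -141 mut B oligonucleotide.
# Assumption: the sequence is contiguous from -155 to -130 (26 nt).
OLIGO = "CCAGCACCAGGGAAGTGGTCCGGAAA"
START = -155
END = -130


def base_at(position: int, seq: str = OLIGO, start: int = START) -> str:
    """Return the nucleotide at a given promoter coordinate."""
    index = position - start
    if not 0 <= index < len(seq):
        raise ValueError(f"position {position} lies outside the oligonucleotide")
    return seq[index]


def window(first: int, last: int) -> str:
    """Return the nucleotides from coordinate `first` to `last`, inclusive."""
    return "".join(base_at(p) for p in range(first, last + 1))


if __name__ == "__main__":
    # The labeled span and the sequence length should agree.
    print(len(OLIGO), END - START + 1)
    # GG2 window (-145/-140) as it appears in this mutant oligonucleotide.
    print(window(-145, -140))
    # The two positions that differ between the human and rat elements.
    print(base_at(-144), base_at(-141))
```

The length check is the useful part: a mismatch between the labeled coordinates and the number of bases would indicate a transcription error in the oligonucleotide.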
SDS-PAGE Fractionation-βTC-3 nuclear extract (100 μg) was separated on a 4-12% SDS-polyacrylamide gel. The lane containing the fractionated extract was cut horizontally into 3-mm pieces to represent different molecular weight fractions. The proteins were eluted overnight from the crushed gel slice at 4°C by agitation in 100 μl of renaturation buffer (gel shift binding buffer plus 1% Triton and 1 μg of bovine serum albumin). The gel pieces were pelleted by centrifugation, and the supernatant was analyzed for human GG2 binding in the gel mobility shift assay.
DNA Affinity Chromatography-The oligonucleotide trapping method (44) was modified for use with the Nkx2.2 Consensus M5 sequence containing a single TGTGTGTGTG tail at the 3′-end of the top strand, termed M5(TG)5. All purification steps were carried out at 4°C. βTC-3 nuclear extract (10 mg) was incubated in 25 ml of gel shift binding buffer containing Tween 20 (5% final concentration) and a protease inhibitor mixture (Roche Applied Science, one tablet/50-ml total volume) and then applied to a column containing 5 ml of Sepharose 4B covalently linked to single-stranded ACACACACAC ((AC)5) DNA. The flow-through and early washes containing all of the human GG2 element binding activity were combined and incubated with M5(TG)5 DNA (200 fmol/10 mg of nuclear extract) and poly(dI-dC) (2 μg/pmol of M5(TG)5 DNA). The reaction was then applied to an (AC)5 DNA column (0.5 ml of Sepharose/mol of M5(TG)5 DNA), and binding activity was analyzed over a 100 mM-1 M NaCl elution gradient. GG2 activator binding was detected at ∼200 mM NaCl. These fractions were combined, and the DNA affinity step was repeated. Proteins in the affinity-purified GG2 binding fractions were precipitated with 20% trichloroacetic acid, resuspended in 1× SDS loading buffer, and separated by 10% SDS-PAGE. The 38-40-kDa protein band(s) detected by Colloidal Blue staining (Invitrogen) was excised and analyzed by mass spectroscopy.
Protein Identification by Mass Spectroscopy-The 38-40-kDa protein band(s) was excised from the lane representing the fractions with peak activity along with a similar band from a neighboring fraction exhibiting no activity to serve as a negative control. Each band was separately cut into cubes, equilibrated in 50 mM NH4HCO3, reduced with dithiothreitol (3 mM in 50 mM NH4HCO3 at 37°C for 15 min), and alkylated with iodoacetamide (6 mM in 50 mM NH4HCO3 for 15 min). The separate sets of gel cubes were then dehydrated with acetonitrile and rehydrated with 15 μl of 20 mM NH4HCO3 containing 0.01 μg/μl modified trypsin (Promega), and trypsin digestion was carried out for >2 h at 37°C. Peptides were extracted with 60% acetonitrile, 0.1% trifluoroacetic acid, dried by vacuum centrifugation, and reconstituted in 10 μl of 0.1% trifluoroacetic acid. The peptides were then desalted and concentrated into 2 μl of 60% acetonitrile, 0.1% trifluoroacetic acid using ZipTip C18 pipette tips (Millipore), and then 0.4 μl was applied to a target plate and overlaid with 0.4 μl of α-cyano-4-hydroxycinnamic acid matrix. Matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry and TOF/TOF tandem mass spectrometry were carried out using a Voyager 4700 mass spectrometer (Applied Biosystems, Foster City, CA) operated in reflectron mode.
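The peptide set expected from a trypsin digestion can be approximated in silico. The sketch below applies the commonly used rule that trypsin cleaves on the C-terminal side of lysine (K) or arginine (R) unless the next residue is proline (P). It ignores missed cleavages and modifications, the function name `tryptic_digest` is hypothetical, and the example sequence is an arbitrary toy string, not a sequence from this study.

```python
import re
from typing import List


def tryptic_digest(protein: str) -> List[str]:
    """Split a protein sequence after K or R, except when followed by P.

    Simplified rule only: no missed cleavages, no modifications.
    """
    # Zero-width split point: preceded by K or R, not followed by P.
    pieces = re.split(r"(?<=[KR])(?!P)", protein.upper())
    return [p for p in pieces if p]


if __name__ == "__main__":
    toy_sequence = "MAGKRPLTRSEK"  # hypothetical sequence for illustration
    print(tryptic_digest(toy_sequence))
```

In practice a search engine compares the observed peptide masses against the masses computed from such theoretical digests of every database entry.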
For intact peptide mass analysis, the mass spectra were calibrated to within 20 ppm using trypsin autolytic peptides present in the sample (m/z = 842.51, 1045.56, and 2211.09 daltons). Peptides unique to the sample generated from the peak activity fraction were used for database interrogation against the Swiss-Prot and National Center for Biotechnology non-redundant databases using the MASCOT algorithm. These included m/z = 791.45
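The 20 ppm calibration tolerance can be made concrete with a short calculation. The sketch below uses the three trypsin autolysis reference values quoted in the text; the "observed" values in the usage example are hypothetical and only illustrate the arithmetic, and the function names are introduced here for illustration.

```python
# Sketch: parts-per-million mass error against a reference m/z value.
REFERENCE_PEAKS = (842.51, 1045.56, 2211.09)  # trypsin autolytic peptides
TOLERANCE_PPM = 20.0


def ppm_error(observed: float, reference: float) -> float:
    """Signed mass error in ppm: (observed - reference) / reference * 1e6."""
    return (observed - reference) / reference * 1e6


def within_tolerance(observed: float, reference: float,
                     tolerance_ppm: float = TOLERANCE_PPM) -> bool:
    """True if the absolute ppm error does not exceed the tolerance."""
    return abs(ppm_error(observed, reference)) <= tolerance_ppm


def mass_window(reference: float, tolerance_ppm: float = TOLERANCE_PPM):
    """Lower and upper m/z bounds that fall inside the tolerance."""
    delta = reference * tolerance_ppm / 1e6
    return reference - delta, reference + delta


if __name__ == "__main__":
    for ref in REFERENCE_PEAKS:
        low, high = mass_window(ref)
        print(f"{ref:.2f}: accepted m/z range {low:.4f} to {high:.4f}")
    # Hypothetical observed values for the 842.51 reference peak.
    print(within_tolerance(842.52, 842.51))
    print(within_tolerance(842.54, 842.51))
```

Because the tolerance is relative, the absolute window widens with mass: roughly 0.017 at m/z 842.51 and roughly 0.044 at m/z 2211.09.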

RESULTS
Identification of GG2 Sequences That Regulate Human Insulin Enhancer-driven Expression-To investigate how the GG2 element influences human insulin gene expression, a series of mutations in GG2 (-145/-140 bp) were constructed in a reporter driven by -251 to +2 bp of the human gene (termed -251 Luc). Human GG2 mutant activity was compared in transfected βTC-3 and Min6 β cells with the rat insulin II GG2 mutant, -238 Luc (-133/-132 mut). A block mutation within human GG2 resulted in an ∼80% reduction in reporter activity, whereas the rat element mutant was ∼2-3-fold more active (Fig. 1B).
Because of the different regulatory behavior of the human and rat GG2 mutants, site-directed transversion (type A) and rat insulin II conversion (type B) mutants in human -251 Luc were constructed at the non-identical bp -144 and bp -141 sequences (Fig. 1). Both of the bp -141 mutants decreased activity to a level similar to the block mutation. Conversely the bp -144 mutants resulted in 3-4-fold activation over wild type (Fig. 1B). These results suggested that human insulin nucleotides -145 to -141 are part of the functional core of the GG2 element.
An Islet β Cell-enriched Protein(s) Stimulates Human GG2 Activity-Gel mobility shift assays were performed to determine the binding properties of the factor(s) associated with human GG2. Binding to both human and rat II GG2 element sequences was analyzed in βTC-3 nuclear extracts. The islet-enriched Nkx2.2 transcription factor binds to and negatively regulates rat GG2 (25). However, Nkx2.2 did not appear to bind to the human GG2 probe using βTC-3 extracts; instead a unique slower mobility complex was principally formed (Fig. 2A, compare lanes 1 and 2). Furthermore in vitro synthesized Nkx2.2 did not bind human GG2 (Fig. 2B) nor did rat GG2 compete with human GG2 for the slower mobility complex (Fig. 2C). This human-specific complex was not found with either bp -141 mutant, whereas enhanced binding was observed with both bp -144 mutants (Fig. 2A and data not shown). The Nkx2.2 consensus binding sequence also competed very effectively for the human-specific GG2 complex (Fig. 2C, compare Nkx2.2 Cons. with -144 mut).
Strikingly the roughly 4-fold increase in human-specific complex binding achieved with the bp -144 mutants was also similar to their effect on insulin enhancer-driven activity (Fig. 1B). In contrast, neither bp -141 mutant was capable of competing with wild type human GG2 for binding to this β cell factor(s), while a wild type behavior was observed with divergent bp -138 mutant competitors (data not shown), indicating that nucleotides -145 to -140 contain the principal binding sequences. This competition analysis also demonstrated that there was no regulatory binding specificity to any of the other human GG2 complexes. Importantly, as GG2 complex binding paralleled regulation of -251 Luc in transfection assays, we conclude that this element is a site for positive control in the human insulin gene. The unique human GG2 binding complex was termed GG2 activator, abbreviated GG2 Act.
To obtain insight into the distribution of GG2 Act, nuclear extracts from various islet β and non-β cell lines were analyzed for binding. GG2 Act binding activity was found in all of the tested rodent β cell lines (βTC-3, Min6, Ins-1, and HIT T-15) but not islet α (αTC-6), kidney (BHK), liver (HepG2), cervical cancer (HeLa), or fibroblast (NIH 3T3) cell lines (Fig. 3A). Wild type and mutant human GG2 competition analysis was used to distinguish the specific GG2 Act regulatory complex (i.e. β cells) from nonspecific complexes (e.g. HepG2) (data not shown). Human β cell lines have not been isolated, and as a consequence it is not possible to directly determine the presence of GG2 Act in these cells. However, a slightly faster migrating complex (human GG2 Act) was found in human islet nuclear extracts, which would primarily represent β cells and to a lesser extent α, δ, and pancreatic polypeptide cells (Fig. 3B). Furthermore competition analysis demonstrated that this complex displayed the same binding specificity as GG2 Act of rodent β cell lines (Fig. 3B). Collectively these results suggest that the GG2 Act factor(s) is enriched in mammalian pancreatic islet β cells.
GG2 Act Is a 38-40-kDa Factor Distinct from Pax-4, Cdx2/3, and Isl-1-To determine the approximate size of GG2 Act, βTC-3 nuclear extract was first size-fractionated by SDS-PAGE. The lane containing fractionated extract was then cut into slices, each containing proteins representing a particular size range. The proteins eluted from each gel slice were analyzed for GG2 Act activity. A binding activity of around 38-40 kDa was found that co-migrated with GG2 activity in unfractionated βTC-3 nuclear extract (Fig. 4, compare Fraction 11 with unfractionated (UNF)) and displayed the appropriate specificity (data not shown). These data indicated that GG2 Act is composed of one or more proteins of ∼38-40 kDa, a similar size range to that previously estimated for the human GG2 activator (34).
Of the characterized pancreas-enriched transcription factors, Cdx2/3, Pax-4, and Isl-1 all fall within or near the appropriate size range predicted for the GG2 Act binding factor. To determine whether Pax-4 or Cdx2/3 was capable of binding the human GG2 element, each was in vitro translated and analyzed in gel mobility shift assays. Although neither of these factors bound the human GG2 probe, specific binding to control probes was observed (Fig. 5). In addition, antiserum raised against Isl-1 had no effect on the formation of GG2 Act (data not shown).
Biochemical Purification and Identification of GG2 Act by Mass Spectrometry-Since GG2 Act appeared to be regulated by a novel β cell-enriched factor, a DNA affinity chromatography approach was used to determine its identity. Successful isolation of factors via this method requires the use of a high affinity DNA probe for efficient binding complex retention. The Nkx2.2 consensus element appeared to be of use in this process since it competed for GG2 Act binding even more effectively than the bp -144 mutant of human GG2 (Fig. 2C).
One diffuse binding complex was principally formed with the Nkx2.2 consensus probe using βTC-3 nuclear extract (Fig. 6B). However, Nkx2.2 appeared to represent only a fraction of the binding activity as Nkx2.2-specific antibody supershifted only a small portion of the binding complex (Fig. 6B), a pattern unaffected by additional antiserum (data not shown). Support for this broad gel shift band actually being composed of two closely migrating complexes was obtained upon addition of WT human GG2 competitor (Fig. 6B). Additional experiments performed with the Nkx2.2 antiserum and human GG2 competitors demonstrated that the bottom complex contained Nkx2.2 (data not shown) and the top contained GG2 Act (Fig. 6C). Together these data revealed that the Nkx2.2 consensus sequence was
The isolation of the GG2 Act protein was performed by Nkx2.2 Cons. M5 element-based affinity chromatography using an oligonucleotide trapping strategy (8,44). In this method, a large scale gel shift binding reaction was performed in the presence of an Nkx2.2 Cons. M5 probe containing a single-stranded (TG)5 tail on the 3′-end of one strand. To retain GG2 Act binding, the reaction was applied to a column containing Sepharose 4B cross-linked to a single-stranded (AC)5 sequence complementary to the (TG)5 overhang of the probe. During column chromatography, the complementary tails anneal, thus retaining the Nkx2.2 Cons. M5 probe and the GG2 Act protein(s). Binding complexes were eluted with a linear salt gradient, and GG2 Act binding activity was tracked by gel shift analysis. Following an initial Sepharose 4B column step performed in the absence of DNA probe, two successive affinity steps yielded a highly purified 38-40-kDa GG2 Act binding fraction with an overall purification of ∼5,000-fold (data not shown).
Gel shift analysis performed with human insulin GG2 and Nkx2.2 consensus competitors revealed that the affinity-purified fraction displayed the specificity of GG2 Act (Fig. 7). This fraction was subjected to SDS-PAGE, and the protein identity of the 38-40-kDa band was determined by MALDI-TOF and MALDI-TOF tandem mass spectrometry analysis. Tryptic peptides corresponding to the β cell-enriched PDX-1 protein were uniquely found in the sample.
PDX-1 Is the β Cell-enriched Human GG2 Binding Factor-To directly test whether GG2 Act was composed of PDX-1, gel shift analyses using crude βTC-3 nuclear extract and affinity-purified GG2 Act binding activity were conducted with bona fide PDX-1 binding site competitors (human insulin A1 and A3) as well as PDX-1 antiserum (Fig. 8A). Both A1 and A3 competed more effectively for GG2 Act binding than did human insulin GG2, while rat insulin GG2 did not compete (Fig. 8B). Furthermore the addition of PDX-1 antiserum specifically prevented formation of the GG2 Act complex, while the Nkx2.2 antibody had no effect (Fig. 8B). Western blot analysis also revealed that PDX-1 co-eluted with GG2 Act binding activity throughout the purification (data not shown). Based upon these results, we conclude that PDX-1 is the human insulin GG2 activator.

DISCUSSION
Islet β cell-selective and glucose-mediated expression of the insulin gene is mediated by sequences within 350 bp of the transcription start site (1,2). This region is also the principal area of conservation between mammalian insulin genes with human displaying roughly 60% identity to rodent (3) and greater than 90% identity to other primates (45). The functional characterization of this control region has almost exclusively been performed on the rat insulin genes wherein the islet-enriched PDX-1, BETA-2, and MafA transcription factors have been shown to be essential for control (46,47). Each of these activator binding sites is also found in the other mammalian genes, suggesting their mechanism of action is critical to proper insulin gene expression. In this study, we defined the basis for regulation of the human insulin GG2 element. Our results revealed that PDX-1 stimulates human GG2-driven activity using sequences dissimilar from the TAAT core motif commonly associated with binding of this important islet activator.
FIG. 4. GG2 Act contains a protein(s) of ∼38-40 kDa. βTC-3 nuclear extract was fractionated by SDS-PAGE, and eluted proteins of distinct molecular weights were assayed for human GG2 -144 mut binding activity. Each fraction (1-16) represents a different molecular mass range. Binding specificity was determined by competition with unlabeled WT and mutant human GG2 (not shown). The position of GG2 Act in unfractionated (UNF) βTC-3 nuclear extract is indicated.
The distinct regulatory properties of human GG2 were revealed upon examination of the functional consequence of mutating the 2 base pairs divergent with the rat site. This analysis showed that bp -144 mutations resulted in a gain of human insulin enhancer-driven reporter activity in β cell lines, while bp -141 mutations reduced expression to GG2 block mutant levels (Fig. 1). As these mutants had identical binding properties to a human islet and a β cell-enriched complex in gel mobility shift assays (Fig. 3), we concluded that the activator (termed GG2 Act) regulated human GG2-mediated expression in β cells. These results also revealed a regulatory distinction between the human and rodent insulin genes as Nkx2.2 acts as a repressor of the corresponding rat and mouse GG2 sequences. Interestingly the GG2 sequence found in the human gene is conserved in many other mammalian genes (e.g. gorilla, chimpanzee, dog, and guinea pig (3,45)), while only rat and mouse appear capable of binding Nkx2.2 (25).
PDX-1 was associated with GG2 Act after DNA affinity chromatography and mass spectroscopic analysis and was confirmed as the activator by antibody and DNA binding competition analysis. This finding was surprising considering the apparent differences between PDX-1 and GG2 Act in regard to SDS-PAGE-estimated molecular mass (i.e. 46 kDa (6, 48) versus 38-40 kDa (Fig. 4)) and binding properties (CTAATG (PDX-1) versus GAAATG (Fig. 8A)), although PDX-1 has the same β cell-enriched distribution pattern (4, 5, 33, 49). It is unclear why our molecular weight estimation of PDX-1 is different from previous measurements since only a single PDX-1-immunoreactive band was found in βTC-3 cell extracts by Western analysis (data not shown). PDX-1 also binds to enhancer region sequences of the insulin gene in vivo (25,50), although a limitation of the chromatin immunoprecipitation assay used for these purposes is the inability to distinguish the occupancy of one site over another (A1 or A3) within a relatively small control region (i.e. ∼500 bp).
A modified Nkx2.2 consensus element was used during the purification of GG2 Act as the wild type consensus sequence bound even more effectively than the high affinity GG2 -144 mut. The Nkx2.2 consensus sequence (T(T/C)AAGT(A/G)(C/G)TT) was defined using a PCR-based binding site selection method with a truncated Nkx2.2 protein containing the homeodomain and NK2-specific domain (43). Interestingly the Nkx2.2 central core sequence T(T/C)AAG is quite distinct from the TAAT core of homeodomain proteins, yet PDX-1 binds with comparable efficacy to the A1 and A3 sites of insulin (Fig. 8 and data not shown). In contrast, human GG2 is a noticeably weaker binding site for PDX-1 than either the A1 or A3 sites of the insulin gene (Fig. 8B). Despite this, inactivating GG2 mutants had a similar compromising effect on human insulin enhancer-driven reporter activity to dysfunctional A3 (6), C1 (MafA (37,39)), and E1 (BETA-2 (37)) site mutants.
FIG. 7. Binding specificity of affinity-purified GG2 Act. Binding assays were performed with the WT human GG2 probe using the purified GG2 Act fraction obtained from Nkx2.2 consensus M5 affinity column purification. A 50- and 100-fold excess of the listed competitors was used. The asterisk indicates nonspecific binding.
FIG. 8. Human GG2 represents a novel site for PDX-1 regulation. A, comparison of human GG2 to human Ins A1 (-75/-83 bp) and human Ins A3 (-207/-216 bp) sequences. The proposed PDX-1 binding site consensus (55) is also shown with the boxed region containing the TAAT core sequence important in homeodomain binding. B, binding assays were performed with the WT human GG2 probe and either βTC-3 nuclear extract or affinity-purified GG2 Act. The reactions were performed in the presence of PDX-1 antiserum (αPDX-1), Nkx2.2 antiserum (αNkx2.2), or a 50-fold excess of listed competitors. The asterisk denotes nonspecific binding.
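The degenerate notation used for the Nkx2.2 consensus, and the contrast between the CTAATG and GAAATG hexamers quoted in the Discussion, can be illustrated with a small sketch. It converts the parenthesized alternatives into a regular expression and tests for a TAAT core on either strand. The function names are hypothetical, and the test sequence `TTAAGTACTT` is simply one of the sequences permitted by the consensus, chosen for illustration rather than taken from the study.

```python
import re

NKX22_CONSENSUS = "T(T/C)AAGT(A/G)(C/G)TT"  # notation as printed in the text


def consensus_to_regex(consensus: str) -> str:
    """Turn alternatives such as (T/C) into a character class such as [TC]."""
    return re.sub(
        r"\(([ACGT](?:/[ACGT])+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus,
    )


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def has_taat_core(seq: str) -> bool:
    """True if TAAT occurs on the given strand or on its reverse complement."""
    seq = seq.upper()
    return "TAAT" in seq or "TAAT" in reverse_complement(seq)


if __name__ == "__main__":
    pattern = consensus_to_regex(NKX22_CONSENSUS)
    print(pattern)
    # One sequence allowed by the consensus (illustrative choice).
    print(bool(re.fullmatch(pattern, "TTAAGTACTT")))
    # Hexamers quoted in the Discussion for PDX-1 and for human GG2.
    print(has_taat_core("CTAATG"), has_taat_core("GAAATG"))
```

A motif test of this kind only reports sequence similarity; as the binding data in the text show, the absence of a TAAT core does not by itself rule out PDX-1 binding.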
The ability of GG2 to function as an activator site in human insulin suggests that this gene is transcribed at higher levels, or at least regulated distinctly, from rat or mouse. The observed concomitant increase in both insulin enhancer-driven and gel shift binding activity in the human -144 mutant also supports this proposal (Figs. 1 and 2). However, it is not possible to readily address this question due not only to sequence differences between rodent and humans within the insulin enhancer region but also to differences between the two nonallelic rat and mouse genes (termed I and II (51)). For example, the islet-enriched BETA-2 protein activates the E2 (-232/-220 bp) element of the rodent I gene but not II (41,46,52), while the ubiquitous upstream stimulatory factor protein regulates the human gene (53). Dissimilarities also exist in the organization of the MafA control sites between the human and rat insulin genes (54). Such observations indicate that a concerted effort to identify the regulatory factors of the human gene could provide unique insight into the mechanisms involved in regulating insulin expression under normal and diabetic conditions.