Regulation of the Gene Encoding GPR40, a Fatty Acid Receptor Expressed Selectively in Pancreatic β Cells

GPR40 is a G protein-coupled receptor expressed preferentially in pancreatic β cells. It is activated by long-chain fatty acids and has been implicated in mediating physiological and pathological effects of long-chain fatty acids on β cells. We mapped the GPR40 transcription start site to a location 1044 bp upstream of the translation start site. This permitted definition of the GPR40 core promoter and the organization of the gene, which comprises a 24-bp non-coding exon, a 698-bp intron and a 4402-bp second exon, containing the entire protein coding sequence. Sequence analysis of the GPR40 locus revealed three evolutionarily conserved regions upstream to the translation start site (HR1-HR3). DNase I-hypersensitive sites were present in the HR2 and HR3 regions in β cells but not in non-β cells. The 5′-flanking region of the GPR40 gene was capable of directing transcriptional activity selectively in β cells. An important component of this is attributable to the HR2 region, which showed strong β cell-specific enhancer activity. Systematic mutagenesis of HR2 revealed several important sub-regions. Mutagenesis of sub-regions 4-5, and 9 reduced transcriptional activity by ∼60 and 40%, respectively. These sub-regions can bind the β cell-specific transcription factors PDX1 and BETA2, respectively, both in vitro and in vivo. Thus, cell-specific expression of the GPR40 gene involves a characteristic chromatin organization of the locus and is controlled at the transcriptional level through HR2, a potent β cell-specific enhancer.

Type 2 diabetes mellitus (T2DM) is a serious metabolic disease characterized by impaired insulin action on peripheral tissues, combined with defective insulin secretion by pancreatic β cells (1). The incidence of T2DM worldwide is increasing dramatically, driven by a sedentary lifestyle and the associated obesity in the population (2). Obesity and T2DM are associated with elevated serum levels of lipids such as long-chain fatty acids (LCFAs) (3,4). LCFAs have complex, divergent effects on β cells: whereas acute exposure leads to enhanced glucose-stimulated insulin secretion, prolonged exposure results in impaired β cell function ("lipotoxicity") (5,6). It has been proposed that this impairment may contribute to the defective β cell function associated with T2DM (7). Yet the precise mechanisms linking obesity and T2DM remain incompletely understood.
The mouse GPR40 gene is a member of a small gene family encoding G protein-coupled receptors (8). These receptors are activated by fatty acids: GPR40 responds to medium- and long-chain fatty acids (9-11), whereas the other two members of the family, GPR41 and GPR43, respond to short-chain fatty acids (12,13). The gene family is clustered in a region of ∼40 kb on mouse chromosome 7, downstream to the gene encoding CD22, a receptor involved in modulation of B lymphocyte function (14).
GPR40 is expressed in a highly selective manner. In rodents, GPR40 mRNA is present at high levels in pancreatic islets and β cell lines, but at very low levels in all other tissues examined (9,15). In humans, it has been reported that GPR40 is expressed not only in pancreas, but also in brain and monocytes (11). A recent report suggests that GPR40 protein is expressed in mouse pancreatic α cells (16). Several studies have examined the role of GPR40 in β cells. In cultured β cells, down-regulation of GPR40 inhibits LCFA-dependent increases in intracellular [Ca2+] and insulin secretion (9,17-19). Consistent with this, islets isolated from mice lacking GPR40 show impaired insulin secretion in response to short term exposure to LCFAs (15). Interestingly, these mice display significant protection from obesity-induced hyperglycemia and fatty liver (15); on the other hand, overexpression of GPR40 in β cells leads to impaired β cell function and diabetes (15). Taken together, these results indicate that GPR40 plays a role in mediating both acute and chronic effects of LCFAs on β cells, and may therefore help explain the link between obesity and T2DM.
The important physiological role of GPR40 prompted us to study the regulation of the gene. We identified the GPR40 transcription start site (TSS) and showed that the promoter is selectively active in β cells. The GPR40 5′-flanking region contains two highly conserved regions that display DNase I hypersensitivity in β cells: one of these, HR3, contains the GPR40 core promoter, whereas the other, HR2, is a strong transcriptional enhancer that plays an important role in directing GPR40 expression selectively to β cells.

RNA Preparation
Total RNA was extracted from cell cultures using the Tri-Reagent procedure (Molecular Research Center Inc.) according to the manufacturer's instructions. Where indicated, RNA was treated with DNase I (RQ1, Promega, Madison, WI) to remove traces of genomic DNA.

5′-Rapid Amplification of cDNA Ends (5′-RACE)
First strand cDNA was synthesized from 5 μg of DNase I-treated βTC1 RNA, using reverse transcriptase (SuperScript II, Invitrogen) according to the manufacturer's instructions. Primers used were complementary to a specific sequence located either within the GPR40 coding region or upstream to it. The cDNA was purified and a poly-G tail was added to the cDNA 3′-end using terminal deoxynucleotidyl transferase (Promega) by incubating 80% of the purified cDNA with 1× terminal deoxynucleotidyl transferase buffer (Promega), 0.83 mM dGTP, and 20 units of enzyme in a final volume of 60 μl at 37°C for 1 h. The reaction was stopped by heating to 65°C for 15 min. The cDNA was purified, and PCR was performed using the Expand high fidelity PCR system (Roche Applied Science), with 5 μl of cDNA and 30 pmol of a reverse primer complementary to GPR40 sequences nested to the primers used for reverse transcription and a forward primer: GAATTC(C)24. A fraction of the PCR product was resolved on a 1% agarose gel. When a band appeared, it was excised from the gel, purified, and sequenced. When a smear appeared, the PCR product was purified using the Roche Applied Science purification kit and subcloned into pGEM T-Easy vector (Promega). Plasmids from the resulting bacterial clones were isolated, fractionated on a 1% agarose gel, and analyzed by Southern blot, using as a probe a radioactive end-labeled primer specific for GPR40 and nested to the primer used for PCR. Three of the relevant plasmids were sequenced.

RNase Protection Assay (RPA)
A plasmid (pBS-RPA3-T7antisense) was constructed in which a genomic fragment, spanning the GPR40 transcription start site, was inserted antisense to the T7 promoter and sense to the T3 promoter. The insert contained 133 bp of the GPR40 second exon, the entire first exon (24 bp), and 52 bp upstream to the transcription start site. The in vitro transcription reaction was carried out as described (20). The resulting radioactive probe was 278 bases, of which 209 bases are derived from the GPR40 genomic sequence. RPAs were performed using 50 μg of βTC1 (21) RNA, αTC1 (22) RNA, yeast RNA, or 25 pg of positive control RNA (prepared by incubating pBS-RPA3-T7antisense in the presence of T3 RNA polymerase) (20). Samples were resolved on polyacrylamide urea gels and visualized by autoradiography.

Genomic Library Screening
A DNA fragment spanning the GPR40 gene was isolated from a mouse genomic library using a 32P-labeled probe containing the 900-bp GPR40 open reading frame. The clone obtained is 16,195 bp long, begins in the tenth intron of the CD22 gene, contains the entire GPR40 gene, and ends 3 kb downstream to the GPR41 open reading frame. The DNA fragment was characterized by restriction enzyme analysis and partial sequencing.

Transient Transfections
Transfections were carried out by the calcium phosphate coprecipitation technique (25). CHO and HIT transfections consisted of a mixture of 2 μg of reporter construct and 250 ng of internal control plasmid. When expression plasmids were used, 1-2 μg of each was added. The total amount of expression plasmid DNA was equalized by addition of the empty expression plasmid pcDNA3. Transfection of 293T cells utilized 4 μg of expression plasmid and 0.5 μg of plasmid pCMV-GFP. Total DNA amount was adjusted to 10 μg by addition of pUC18 DNA. Glycerol shock was applied to HIT cells (20% glycerol) and CHO cells (10% glycerol) 4-7 h after transfection. Cells were harvested 40-48 h after transfection, and whole cell extracts were prepared as described below.

Cell Extracts
Whole Cell Extracts-Cells were washed with PBS, collected, and pelleted by centrifugation for 30 s in a microcentrifuge. The pellet was resuspended in KPi buffer (0.1 M potassium phosphate, pH 7.8, 1 mM dithiothreitol), and extracted by 3 cycles of freezing in liquid nitrogen and thawing at room temperature. Cell debris was removed by 5 min centrifugation in a microcentrifuge at 4°C.
Nuclear Extracts-Cells were washed with PBS and collected by centrifugation for 30 s in a microcentrifuge. Nuclear extracts were prepared as described (26). Protein concentration of cell extracts was determined by the Bradford method (27) using bovine serum albumin as a standard.

Mutagenesis of GPR40 Promoter
Site-directed mutagenesis was carried out using the QuikChange system (Stratagene) according to the manufacturer's protocol. Oligonucleotides designed for mutagenesis of the HR2 region of GPR40 are indicated in the supplemental information.

Bioinformatics
Sequence alignment of genomic DNA sequences was performed using NCBI-BLAST (www.ncbi.nlm.nih.gov/BLAST) and software from GCG. Identification of transcription factor binding sites was performed using the rVista program and the Genomatix suite programs MatInspector and FrameWorker.

Electrophoretic Mobility Shift Assay (EMSA)
Probe Labeling-The double-stranded oligonucleotides from the HR2 region of the mouse GPR40 promoter, which were used as probes, are described in the supplemental materials. Radioactive probes were generated as described (28). Specific activity obtained was typically ∼2,000 cpm/fmol.
Binding Reaction-Nuclear extracts of βTC1, NIH-3T3, 293T, and transfected 293T cells were prepared as described above. Binding reactions and electrophoresis conditions used for PDX1 were as follows: protein extracts (0.5-2.5 μg) were incubated for 10 min on ice in binding buffer (20 mM Hepes, pH 7.9, 1 mM EDTA, 70 mM KCl, 10% glycerol, 0.5 mM dithiothreitol, 5 mM MgCl2, and 0.2% Nonidet P-40) with addition of 900 ng of poly(dI·dC) (Sigma) in a final assay volume of 14 μl, in the presence or absence of antibody, as specified in the figure legends. Competition experiments included 10-, 50-, 100-, or 200-fold molar excesses of the following three unlabeled oligonucleotides: HR2-2-6, HR2-M4-5, and the E1 element from the rat insulin gene promoter (29). Following this incubation, 1 μl of 32P-labeled probe was added (∼25 fmol), and incubation was allowed to continue on ice for an additional 20 min. Samples were subsequently resolved on 5.5% polyacrylamide gels (37.5:1 acrylamide:bisacrylamide) in 45 mM Tris borate, 1 mM EDTA at 184 V for 80 min at 4°C.
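The competitor amounts implied by the molar excesses in the binding reaction can be worked out from the approximate probe amount (about 25 fmol per reaction). The following Python sketch is illustrative only; the function and variable names are not part of the original methods, and the probe amount is the approximate figure quoted in the text.

```python
# Unlabeled competitor required per binding reaction for a given molar excess.
PROBE_FMOL = 25  # approximate amount of labeled probe per reaction (from the text)

def competitor_fmol(fold_excess, probe_fmol=PROBE_FMOL):
    """Return fmol of unlabeled oligonucleotide for a given fold molar excess."""
    return fold_excess * probe_fmol

# Molar excesses listed for the competition experiments.
amounts = {fold: competitor_fmol(fold) for fold in (10, 50, 100, 200)}
for fold, fmol in amounts.items():
    print(f"{fold}-fold excess: {fmol} fmol ({fmol / 1000} pmol)")
```

Under these assumptions the competitor range runs from 250 fmol to 5 pmol per reaction.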

DNase I Hypersensitivity Assay
Preparation of Nuclei-Cells were washed with PBS, trypsinized, and resuspended in growth medium. They were pelleted at 1000 rpm for 5 min, washed twice in cold PBS, and incubated on ice for 10 min in Dounce buffer (20 mM Tris-HCl, pH 7.0, 3 mM CaCl2, 2 mM MgCl2, and 0.3% Nonidet P-40). Cells were then homogenized on ice using a Dounce homogenizer (20 strokes, pestle B (loose)), and nuclei were pelleted at 1000 rpm for 7 min at 4°C. Nuclei were then resuspended in cold RSB buffer (10 mM Tris-HCl, pH 7.0, 10 mM NaCl, and 3 mM MgCl2), counted in a hemocytometer, pelleted again, and resuspended in 1× RQ1 DNase buffer at a concentration of 10^8 nuclei/ml.
DNase I Treatment-A series of digestions using DNase I (RQ1) was performed in the presence of 10^7 nuclei and increasing concentrations of DNase (0-20 units/ml) for 15 min at 37°C. Reactions were terminated by addition of 1 volume of 2× TNESK (20 mM Tris-HCl, pH 7.0, 200 mM NaCl, 2 mM EDTA, 2% SDS, and 200 μg/ml proteinase K). Samples were incubated overnight at 37°C and then phenol-chloroform-extracted, ethanol-precipitated, and resuspended in 100 μl of TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 7.8). The concentration of purified genomic DNA was determined using a NanoDrop spectrophotometer.
Southern Blot-Purified DNase I-treated genomic DNA (10 μg) was digested to completion at 37°C overnight with KpnI and resolved on a 1% agarose gel in 1× TAE buffer (40 mM Tris acetate, 1 mM EDTA) at 20 V overnight. Southern blot analysis was performed as described (20).

Chromatin Immunoprecipitation (ChIP)
Soluble chromatin was prepared as described (31) from 2 × 10^7 βTC1 cells. For each immunoprecipitation reaction, 25 μg of pre-cleared chromatin was incubated with anti-PDX1, anti-BETA2, or control antiserum overnight at 4°C on a rotator wheel. Following centrifugation, the extracts were incubated with 20-30 μl of 50% (w/v) protein A-Sepharose suspension at 4°C for 2 h on a rotator wheel, and the beads were then washed. The immune complexes were then eluted by incubating the beads three times with 100 μl of elution buffer (1% SDS, 0.1 M NaHCO3, 20 μg/ml glycogen) for 10 min each. The combined eluates were heated at 65°C overnight to reverse the formaldehyde cross-links. The eluates were extracted once with phenol-chloroform and once with chloroform, precipitated with ethanol, and resuspended in 20 μl of TE buffer. A 2-μl sample was used for amplification by quantitative real-time PCR.
The real-time PCR reactions were performed in an ABI 7300 real-time PCR system using the "absolute quantification method" option with SYBR-Green PCR Master Mix (ABI) and 333 nM primers. The primers used are described in the supplemental materials. In each experiment, a standard curve for each chromatin preparation was plotted using serial dilutions of chromatin input. The PCR program employed 40 cycles of denaturation (95°C for 10 s) followed by combined annealing, elongation, and fluorescence detection (60°C for 1 min). At the end of the final cycle, a dissociation curve of all samples was performed. The DNA recovered from each immunoprecipitate was calculated as a percentage of the total input chromatin.
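The standard-curve calculation described above can be sketched as follows. This is a minimal illustration, not the instrument software's procedure: it assumes a linear relation between threshold cycle (Ct) and log10 of the input fraction, and that standards and immunoprecipitated samples are expressed on the same per-reaction basis. All Ct values and dilutions below are invented for illustration and are not data from the study; the function names are hypothetical.

```python
import math

def fit_standard_curve(input_fractions, ct_values):
    """Least-squares fit of Ct = slope * log10(fraction) + intercept."""
    xs = [math.log10(f) for f in input_fractions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def percent_of_input(ct, slope, intercept):
    """Convert a sample Ct to percent of input chromatin using the fitted curve."""
    fraction = 10 ** ((ct - intercept) / slope)
    return 100.0 * fraction

# Hypothetical illustration values (NOT data from the study).
dilutions = [0.1, 0.01, 0.001]      # fraction of input chromatin in each standard
standard_cts = [24.0, 27.0, 30.0]   # invented Ct readings for the standards
slope, intercept = fit_standard_curve(dilutions, standard_cts)
recovery = percent_of_input(28.5, slope, intercept)  # invented Ct for an IP sample
print(f"slope = {slope:.2f}, recovery = {recovery:.3f}% of input")
```

Fitting a separate curve for each chromatin preparation, as the text describes, keeps differences in amplification efficiency between preparations from biasing the recovery estimate.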

Statistical Analysis
Statistical significance between two groups was assessed using an unpaired two-tailed t test. The data in Fig. 5 were analyzed by two-factor analysis of variance using experiment and mutant as the factors. Means for each mutant were compared with the wild-type mean using the Dunnett test (α = 0.05). The data in

Characterization of the GPR40 Transcription Start Site and Intron-Exon Organization-As a first step toward understanding the molecular mechanisms controlling cell-specific regulation of the mouse GPR40 gene, we identified the 5′-end of the GPR40 mRNA. 5′-RACE experiments using βTC1 RNA and primers complementary to the open reading frame of GPR40 (Fig. 1A) produced a major product of ∼400 bp (Fig. 1B, lane 1). Comparison of its sequence with that of mouse genomic DNA revealed the presence of a 698-bp intron beginning 321 bp upstream of the ATG initiation codon, and a 24-bp first exon. Thus, the major TSS is located 1044 bp upstream of the translation initiation site. Additional weaker bands suggested the existence of minor upstream transcription start sites (Fig. 1B, lane 1). Indeed, a 5′-RACE experiment performed with a primer starting 4 bases downstream to the major transcription start site gave rise to several products: sequencing of three of these showed distinct 5′-ends mapping 100-150 bp upstream of the major transcription start site (data not shown).
To confirm this result, RPAs were performed using a 278-base RNA probe comprising sequences corresponding to the 52 bp immediately upstream of the putative transcription start site, the 24 bp of exon 1, and the 133 bp at the 5′-end of exon 2 (Fig. 1C). A protected band of ∼157 bases was observed using βTC1 RNA, but not αTC1 RNA (Fig. 1D, lanes 1 and 2), confirming the location of the start site indicated by the 5′-RACE result. An additional weak full protection band (209 bases) was also seen, consistent with the existence of the additional minor transcription start sites previously indicated by 5′-RACE.
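The expected sizes of the protected fragments follow directly from the probe composition given above. The short Python sketch below reproduces that arithmetic; the variable names are illustrative, and the interpretation of the 209-base band as protection by transcripts initiating further upstream is the one stated in the text.

```python
# Segment lengths (bases) of the genomic portion of the RPA probe, from the text.
UPSTREAM = 52      # sequence immediately upstream of the putative TSS
EXON1 = 24         # entire first exon
EXON2_PART = 133   # 5' end of the second exon
PROBE_TOTAL = 278  # full-length probe, including non-genomic sequence

# Full protection: transcripts that cover the whole genomic portion of the probe.
genomic = UPSTREAM + EXON1 + EXON2_PART
# Protection by mRNA initiating at the major TSS: exon 1 plus the exon 2 segment.
major_tss_protected = EXON1 + EXON2_PART
# Probe sequence not derived from the GPR40 locus, which is never protected.
non_genomic = PROBE_TOTAL - genomic

print(genomic, major_tss_protected, non_genomic)
```

The computed values (209 and 157 bases) match the full-protection band and the major protected band reported for βTC1 RNA.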
Additional 5Ј-RACE experiments performed using primers distributed across the GPR40 gene (data not shown), and examination of expressed sequence tag (EST) sequences present in public databases, suggest that no additional introns are present in the gene. Taken together, these data establish the organization of the GPR40 gene, which is comprised of two exons of 24 and 4402 bp separated by a 698-bp intron. This arrangement of a short, non-coding first exon, followed by a long exon containing the complete protein coding information (Fig. 1A), is common in G protein-coupled receptors (32).
Characterization of the GPR40 Gene Promoter and Identification of Conserved Regions-Comparison of genomic DNA sequences of different species can often assist in identifying important control elements (33). We therefore compared the intergenic region between the CD22 gene and the GPR40 gene in mouse, rat, and human. The alignments revealed three conserved regions, which we named HR1-HR3 (Fig. 2A). HR1 is a part of an Alu repeat; it is 38 bp long, is 91% identical between mouse and human, and is located 2126 bp upstream from the transcription start site. HR2 is 194 bp long; it is 82% identical between mouse and human and 93% identical between mouse and rat (Fig. 2B). HR3 is 185 bp long; it is 81% identical between mouse and human and 89% identical between mouse and rat (Fig. 2C).
HR2 is located 1060 bp upstream of HR3. Sequence analysis of this region revealed two conserved E-boxes at −1110 and −1032, the latter being duplicated in human (Fig. 2B). These sequences can serve as binding sites for basic helix-loop-helix (bHLH) transcription factors (34). Also present in the region is a sequence that is similar to the binding site for HNF4α (35) at −1017. Analysis of conserved sequences in the HR2 region of human, chimpanzee, rat, and mouse, performed using the rVista program and the Genomatix suite programs MatInspector and FrameWorker, revealed in addition the binding sites for the transcription factors AP4, ETS family, RFX1, PBX1, MEIS1, Evi1, Bel1, and homeodomain proteins of the HOX family.
Transcription start sites and promoter elements are often conserved among species. Indeed, the HR3 homology region (Fig. 2C) spans the major transcription start site; hence it is expected to contain at least part of the promoter. Inspection of HR3 revealed characteristic features of a TATA-less promoter, including a GC-rich region (64% in mouse and 63% in human) and two putative Sp1 binding sites (Fig. 2C). The sequence at the GPR40 TSS matches the consensus sequence for the initiator (Int) element (PyPyA+1NT/APyPy) (36) except for the last two pyrimidines. The rest of the sequence, including the A at the +1 position, matches in all three species (Fig. 2C). In addition, the sequence between +28 and +32 matches the DPE consensus sequence (36) in all three species. Indeed, DPE-containing promoters typically also possess the Int element, and the DPE element is located exactly at positions +28 to +32 relative to the A at +1 of the Int (36). In DPE-Int promoters there is a preference for G at position +24, and in fact G is present at position +24 in the GPR40 promoter of all three species. Thus, the GPR40 core promoter displays the typical features of a TATA-less, Int-DPE core promoter.

[Figure legend, beginning missing] ... HR2 and HR3, and the GPR40 protein coding region are shown. B, the sequence of HR2 in mouse, rat, and human. The E-boxes and the putative HNF4α binding sites are indicated. The sequence duplicated in human is underlined. The recognition site for the restriction enzyme BspHI is shown. Numbers indicate nucleotides relative to the transcription start site. C, the sequence of HR3 in mouse, rat, and human. Transcription start site (+1) is indicated by an arrow, and exon 1 is shown. The following are indicated by gray shading: Int element at the start site, the DPE element at position +28 to +32, the G at position +24, and the A-element at position −89 to −86. Also indicated are two putative Sp1 binding sites. Numbers indicate nucleotides relative to the transcription start site.
We also identified in HR3 a conserved A-element (TAAT) at position −89 to −86, which may serve as a binding site for transcription factors of the homeodomain family (Fig. 2C). Other conserved potential transcription factor binding sites found in HR3 using the rVista program include AP2, AP4, ELK1, GATA family, IK2, MZF1, PEA3, and STAT family.
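The comparison of a start-site sequence with the initiator consensus quoted in the text (PyPyA+1NT/APyPy) can be illustrated with a position-by-position check. The Python sketch below is only an illustration of that comparison: the function name is hypothetical, and the two 7-nucleotide example windows are invented; they are not the GPR40 sequence, which is not reproduced in this section.

```python
# Initiator consensus as quoted in the text: Py Py A(+1) N T/A Py Py,
# covering positions -2 to +5 relative to the A at +1.
CONSENSUS = ["CT", "CT", "A", "ACGT", "AT", "CT", "CT"]

def initiator_matches(window):
    """Return one boolean per position: does the base fit the consensus?"""
    if len(window) != len(CONSENSUS):
        raise ValueError("window must be 7 nucleotides (-2 to +5)")
    return [base in allowed for base, allowed in zip(window.upper(), CONSENSUS)]

# Hypothetical example windows for illustration only (NOT the GPR40 sequence).
perfect = initiator_matches("TCAGTCC")  # fits the consensus at every position
partial = initiator_matches("TCAGTGA")  # purines at the last two positions
print(sum(perfect), sum(partial))
```

A window of the second kind corresponds to the situation described for GPR40, where the match holds at every position except the last two pyrimidines.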
Chromatin Accessibility at the 5′-Flank of the GPR40 Gene-To compare the chromatin structure at the GPR40 locus in β versus non-β cells, we tested the DNase I sensitivity of nuclei from βTC1 and NIH-3T3 cells. We focused on an 8.8-kb KpnI genomic fragment spanning exon 9 of CD22 to the middle of the GPR40 coding sequence, including ∼7.5 kb upstream to the GPR40 TSS and the GPR40 intron (Fig. 3A). A radiolabeled probe of 870 bp corresponding to the 3′-end of the region (Fig. 3A) was used for identification of the genomic band and sub-bands.
The analysis showed that in βTC1 cells, which express GPR40, the accessibility of chromatin to DNase I was higher than in the nonexpressing cell line NIH-3T3, as indicated by the greater sensitivity of the genomic fragment to nuclease digestion (Fig. 3, B and C). Furthermore, although in NIH-3T3 cells no sub-bands appeared following DNase I digestion, clear sub-bands of ∼2.5 kb and 1.5 kb were seen in βTC1 cells (Fig. 3B), indicating sites of DNase I hypersensitivity. Because the radiolabeled probe we used corresponds to the terminus of the examined genomic region, we were able to determine the location of these hypersensitive sites. Interestingly, they correlated extremely well with the HR2 and HR3 homology regions. Because the GPR40 core promoter is located in the HR3 region, it is expected that in β cells this region will be highly accessible. In addition, the hypersensitivity of the HR2 region is consistent with an important function for this region.
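The mapping of hypersensitive sites from sub-band sizes can be sketched with simple arithmetic. Because the probe lies at the 3′ end of the KpnI fragment, each sub-band extends from the DNase I cut to the 3′ KpnI site. The Python sketch below uses the approximate sizes quoted in the text (8.8-kb fragment, about 7.5 kb upstream of the TSS), so the resulting positions are rough estimates limited by gel resolution; the function name is illustrative.

```python
FRAGMENT_BP = 8800         # KpnI fragment length (~8.8 kb, from the text)
UPSTREAM_OF_TSS_BP = 7500  # approximate sequence upstream of the TSS in the fragment

def cut_position_relative_to_tss(sub_band_bp):
    """Approximate DNase I cut site (bp relative to the TSS) for a sub-band
    detected with a probe at the 3' end of the fragment."""
    three_prime_end = FRAGMENT_BP - UPSTREAM_OF_TSS_BP  # 3' KpnI site, about +1300
    return three_prime_end - sub_band_bp

# Sub-band sizes observed in the beta cell line (approximate, from the text).
positions = {size: cut_position_relative_to_tss(size) for size in (2500, 1500)}
print(positions)
```

Under these approximations the 2.5-kb sub-band places a cut near −1200, close to HR2, and the 1.5-kb sub-band places a cut near −200, close to HR3 and the transcription start site.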
Transcription Regulation Conferred by Sequences from the 5′-Flank of the GPR40 Gene-To determine whether the 5′-flanking region of the mouse GPR40 gene is capable of directing transcriptional activity, the sequence −3315 to +844 relative to the TSS was fused to the firefly luciferase reporter gene, and transfected into the β cell line HIT and the non-β cell line CHO. Activity of this construct, 4155-GPR40P-LUC, was 17.7-fold higher than the parental promoterless vector pGL3-basic in HIT cells (Fig. 4B). In contrast, the activity of 4155-GPR40P-LUC was only 1.3-fold higher than pGL3-basic in CHO cells, suggesting that this fragment has β cell-specific promoter activity. A 5′ deletion of 4155-GPR40P-LUC from −3315 to −1257, yielding the construct 2095-GPR40P-LUC, did not affect the activity in HIT cells, suggesting that homology region HR1 is not required for activity of the promoter. However, deletion of an additional 159 bp, including the 5′ portion of the HR2 region (1936-GPR40P-LUC), led to a dramatic 10-fold reduction in activity in HIT cells. No reduction in activity was seen on comparing 2095-GPR40P-LUC and 1936-GPR40P-LUC in CHO cells, indicating that the activity conferred by the deleted 159 bp is β cell-specific. The data therefore suggest that an important transcriptional element lies between the HindIII and BspHI sites (Fig. 4A). In addition, 3′ deletion of 816 bp, including the entire intron, from 2095-GPR40P-LUC (1279-GPR40P-LUC), led to decreased activity both in HIT and CHO cells (Fig. 4B), probably due to the removal of the DPE element (see Fig. 2C).
The ability of conserved region HR2 to activate transcription from a heterologous promoter was also examined. For this purpose, a 258-bp fragment spanning HR2 was fused to a herpes simplex virus thymidine kinase (TK) promoter controlling the firefly luciferase reporter gene. HR2-TK-LUC had 11.4-fold higher activity than TK-LUC in the β cell line HIT, whereas in the non-β cell line CHO, the HR2 region had little or no effect on the TK promoter (Fig. 4C). The HR2 region was able to activate the TK promoter in an orientation-independent fashion (data not shown). Thus, HR2 contains a potent β cell-specific enhancer.
Systematic Mutagenesis of the HR2 Region-Based on the 5′ deletion analysis, and the evolutionary conservation of the HR2 element, it is likely that the 5′ 100 bp of the HR2 element (−1191 to −1091) contains a critical control sequence. This 100-bp region was therefore systematically mutated to define key transcriptional control elements. Accordingly, a series of ten substitution mutations (M1-M10), each spanning ∼10 bp, was generated in 2095-GPR40P-LUC (Fig. 5A), and their transcriptional activity following transfection into HIT cells was compared with the wild-type promoter (Fig. 5B). Striking reductions in activity were observed with the adjacent mutations M4 and M5 (remaining activity, 38 and 40%, respectively). Several mutations showed smaller, yet statistically significant reductions in activity. The observed activities were: M1, 59%; M8, 74%; and M9, 59%. Thus, the analysis defines three sub-regions within HR2 that are important for transcription activation: sub-region 1, sub-region 4-5, and sub-region 8-9.
To examine the transcriptional activity of the cis-elements in the HR2 region independently of the remainder of the promoter, the same series of substitution mutations of HR2 was introduced into HR2-TK-LUC and tested in the β cell line HIT. The results obtained were similar to those of the 2095-GPR40P-LUC mutants (Fig. 5C).
Binding of PDX1 to HR2 in Vitro and in Vivo-Bioinformatic analysis identified within sub-region 4-5 potential binding sites for transcription factors of the homeodomain family (Fig. 5A). We therefore tested whether PDX1, an important β cell-specific homeodomain protein, can bind this sequence in vitro using EMSA. We used a radioactive probe (designated HR2-2-6) that contains the wild-type sequence of region HR2 from the middle of sub-region 2 to the middle of sub-region 6. Incubation with nuclear extracts from 293T cells transfected with a PDX1 expression vector generated a strong band (Fig. 6A, lane 2). Incubation of the probe with βTC1 cell nuclear extracts yielded a band of similar mobility, whereas incubation with NIH-3T3 cell extracts did not show this band (Fig. 6A, lanes 3 and 4). Addition of anti-PDX1 anti-serum abolished the band (Fig. 6A, lane 6), whereas addition of PBS, pre-immune serum (anti-P33), or irrelevant anti-serum (anti-N22) had no effect (Fig. 6A, lanes 5, 7, and 8). To examine the specificity of binding, we performed quantitative competition experiments using increasing concentrations of unlabeled oligonucleotides (Fig. 6B). The oligonucleotides used were HR2-2-6, HR2-M4-5 (which contains HR2-2-6 sequence mutated at sub-regions 4 and 5), and E1 (which contains an E-box element from the rat insulin I promoter, and thus is not expected to bind PDX1). HR2-M4-5 and E1 oligonucleotides showed significantly reduced ability to compete for binding (Fig. 6B), indicating that PDX1 binds in a sequence-specific manner and with highest affinity to the wild-type sequence of sub-region 4 to 5 within HR2.
To determine whether PDX1 binds to the HR2 enhancer in vivo, we performed ChIP experiments using anti-PDX1 or control antibody. PDX1 occupancy of the HR2 region in βTC1 cells was significantly higher than its occupancy of GPR40 or CD22 coding sequences (CDS), which reside within the vicinity of HR2, but are not expected to bind PDX1 (Fig. 6C). In contrast, PDX1 occupancy on HR2 was similar to its occupancy on the promoter of the mouse insulin II gene, a bona fide PDX1 target gene (data not shown). As a control, we used pre-immune serum and observed significantly lower binding to HR2 and the insulin II gene promoter as compared with anti-PDX1 antibody (Fig. 6C and data not shown).
Binding of BETA2 to HR2 in Vitro and in Vivo-Mutation M9 overlaps the 5′ E-box of the HR2 region. Because E-box elements represent potential binding sites for transcription factors of the bHLH family, we examined whether BETA2 (NeuroD1), an important β cell-specific bHLH protein (37), can bind this sequence in vitro using EMSA. BETA2 typically binds E-box sequences as a heterodimeric complex with products of the ubiquitously expressed E2A gene (38). We used a radioactive probe (designated HR2-7-10) that contains the wild-type sequence of HR2 from sub-region 7 to the end of sub-region 10. Incubation of the probe with βTC1 cell nuclear extract yielded a slow migrating complex (Fig. 7A, lane 1). Addition of anti-BETA2 antibody to the incubation mixture resulted in a supershift of the complex, whereas addition of pre-immune serum (anti-P33) did not (Fig. 7A, lanes 2 and 3). Incubation of the probe with nuclear extracts from 293T cells did not generate this complex (Fig. 7A, lane 4), whereas 293T cells transfected with a BETA2 expression vector generated a clear band whose mobility is similar to the complex observed with βTC1 extract (Fig. 7A, lane 5). The complex was supershifted upon addition of anti-BETA2 antibody (Fig. 7A, lane 6).

[Figure legend fragment] ... after normalizing the wild-type mean to 1.0. Asterisks denote means differing from the wild-type by the Dunnett test (α = 0.05).
To examine the specificity of binding, we performed quantitative competition experiments using increasing concentrations of unlabeled oligonucleotides (Fig. 7B). The oligonucleotides used were HR2-7-10, HR2-EM (which contains a mutated sequence of the E-box element), and P1 (which contains an A-element from the rat insulin I gene promoter, and thus is not expected to bind BETA2). P1 and HR2-EM oligonucleotides showed significantly reduced ability to compete for binding (Fig. 7B), indicating that BETA2 is capable of binding to sub-region 8-9 within HR2 with high affinity and in a sequence-specific manner.
To determine whether BETA2 binds to the HR2 enhancer in vivo, we performed ChIP experiments using anti-BETA2 antibody. BETA2 occupancy of the HR2 region in βTC1 cells was significantly higher than its occupancy of the CDS of GPR40, CD22, or the HR3 region, none of which were expected to bind BETA2 (Fig. 7C). The signal observed with control antibody was low for all regions tested (Fig. 7C).

DISCUSSION
The pancreatic β cell plays a crucial role in maintaining metabolic homeostasis through the production and regulated secretion of the key metabolic hormone insulin in response to physiological needs (39). Inadequate production of insulin is a hallmark of both major forms of diabetes mellitus. Insulin secretion is regulated mainly according to fluctuating levels of blood glucose but also in response to nutrients, neurotransmitters, and hormones (40). To fulfill their complex role, β cells express a characteristic profile of genes, including many that are expressed selectively in β cells. Regulation of these genes is mediated in large part through transcriptional control mechanisms (29) involving the combinatorial actions of ubiquitous and cell-specific transcription factors (28,41). It has recently been shown that the GPR40 gene is expressed selectively in β cells, where it appears to play an important role in mediating the effects of fatty acids on insulin secretion (5,42).
8). B, competition experiments were performed by incubation of nuclear extracts from βTC1 cells with 32P-labeled HR2-2-6 probe, and increasing concentrations of unlabeled E1, HR2-2-6, or HR2-M4-5 oligonucleotides. Extent of binding is expressed relative to that observed in the absence of unlabeled oligonucleotides. C, ChIP. Cross-linked chromatin from βTC1 cells was immunoprecipitated using either anti-PDX1 antibodies or pre-immune serum. DNA fragments recovered from immunoprecipitations were amplified using specific sets of primers corresponding to the β cell-specific enhancer HR2, and the coding sequence (CDS) of GPR40 and CD22 genes. Binding of antisera to the tested regions was analyzed using quantitative real-time PCR and presented as percentage recovery of input chromatin. Results are expressed as the mean ± S.E. of three or more independent ChIP experiments. Asterisks indicate statistical significance (p < 0.04 for anti-PDX1 compared with GPR40 CDS, p < 0.02 for anti-PDX1 compared with CD22 CDS, and p = 0.05 for anti-PDX1/HR2 compared with pre-immune/HR2). Statistical significance involving more than two groups in the experiment was determined by comparing the occupancy of PDX1 on HR2 region to each of the other regions using the Wilcoxon 2-sample test corrected for multiple comparisons by the Holm method.
In this study, we have examined the molecular basis for the selectivity of GPR40 expression in the β cell. We have defined the organization of the gene and the location of the core promoter, which lacks a TATA element but contains an Inr-like element and a DPE sequence. Reporter gene experiments demonstrated that β cell-specific expression of the GPR40 gene is mediated through selective transcription via the 5′-flanking region of the gene. Cell-specific transcription of mammalian genes generally involves evolutionarily conserved sequences located upstream of the core promoter. Indeed, we demonstrated that the conserved HR2 element, which coincides with a nuclease hypersensitive site, functions as a strong β cell-specific transcriptional enhancer.
Systematic mutagenesis of the HR2 enhancer revealed three important domains. One of these (sub-region 8-9) contains an E-box element that can bind transcription factors of the bHLH family. Indeed, BETA2, a β cell-specific bHLH protein implicated in regulation of several β cell genes (28,37), was shown to bind in vitro to sub-region 8-9 of HR2 in a sequence-specific manner, and was found to occupy the HR2 region in vivo. A second domain (sub-region 3-5) contains an A-element that can bind transcription factors of the homeodomain family. We were able to demonstrate that PDX1, a crucial β cell homeodomain transcription factor, binds to this region both in vitro and in vivo.
These results represent the first characterization of the GPR40 gene promoter. As with other β cell-specific genes, cell specificity is controlled at the transcriptional level through enhancer elements located in the 5′ flank. This may be mediated by interactions among PDX1, BETA2, and additional transcription factors. Accumulating evidence has indicated a critical role for GPR40 in mediating the responsiveness of β cells to LCFAs (9,17,19); furthermore, persistent activation of GPR40 in obesity has been proposed as a mechanism to explain the well established link between obesity and type 2 diabetes (15). The important physiological role proposed for GPR40 emphasizes the likely need for maintaining appropriate intracellular concentrations of the protein. Thus, regulated transcription of the gene is probably essential for maintaining β cell function, and defects could lead to perturbed insulin secretion. The present study provides a molecular basis for understanding β cell-specific transcription of the GPR40 gene. Future studies will aim to determine the detailed mechanisms modulating GPR40 gene transcription in normal β cells and to investigate possible defects in diabetes.
FIGURE 7. BETA2 binds the HR2 region of the GPR40 promoter in vitro and in vivo. A, EMSA analysis. Nuclear extracts from βTC1 cells (lanes 1-3), untransfected 293T cells (lane 4), and 293T cells transfected with BETA2 expression vector (lanes 5 and 6) were incubated with 32P-labeled HR2-7-10 probe and resolved on a non-denaturing polyacrylamide gel. The incubation reactions were performed in the presence of anti-BETA2 antibody (lanes 2 and 6) or pre-immune serum (lane 3). B, competition experiments were performed by incubation of nuclear extracts from βTC1 cells with 32P-labeled HR2-7-10 probe and increasing concentrations of unlabeled P1, HR2-EM, or HR2-7-10 oligonucleotides. Binding is expressed relative to that observed in the absence of unlabeled oligonucleotides. C, ChIP. Cross-linked chromatin from βTC1 cells was immunoprecipitated using either anti-BETA2 antibodies or pre-immune serum. DNA fragments recovered from immunoprecipitations were amplified using specific sets of primers corresponding to the β cell-specific enhancer HR2, the coding sequence (CDS) of GPR40 and CD22 genes, and the GPR40 core promoter HR3. Binding of antisera to the tested regions was analyzed using quantitative real-time PCR and presented as percentage recovery of input chromatin. Results are expressed as the mean ± S.E. of three or more independent ChIP experiments. Asterisks indicate statistical significance (p < 0.05) compared with control antibody and compared with GPR40 CDS, CD22 CDS, and HR3.