Pancreatic Islet Expression of the Homeobox Factor STF-1 Relies on an E-box Motif That Binds USF*

The commitment of cells to specific lineages during development is determined in large part by the relative expression of various homeodomain (HOX) selector proteins, which mediate the activation of distinct genetic programs. But the mechanisms by which individual HOX genes are themselves targeted for expression in different cell types remain largely uncharacterized. Here, we demonstrate that STF-1, a homeodomain protein that functions in pancreatic morphogenesis and in glucose homeostasis, is encoded by an “orphan” homeobox gene on mouse chromosome 5. When fused to a β-galactosidase reporter gene, a 6.5-kilobase genomic fragment of 5′-flanking sequence from the STF-1 gene shows pancreatic islet-specific activity in transgenic mice. Two distinct elements within the STF-1 promoter are required for islet-restricted expression: a distal enhancer sequence located between −3 and −6.5 kilobases and a proximal E-box sequence located at −104, which is recognized primarily by the helix-loop-helix/leucine zipper nuclear factor USF. As point mutations within the −104 E-box that disrupt USF binding correspondingly impair STF-1 promoter activity, our results demonstrate that USF is an important component of the ...

The vertebrate pancreas consists of endocrine and exocrine components, which arise from a common progenitor cell in the duodenal anlage (1). Within the endocrine component of the pancreas, a pluripotent precursor cell, which initially expresses multiple islet hormones, undergoes progressive restriction to form the four subpopulations of cells comprising the adult islets of Langerhans: insulin-, somatostatin-, glucagon-, and pancreatic polypeptide-producing cells (2,3). The mechanism by which these developmental pathways are activated is unclear, but current evidence implicates the homeobox factor STF-1 (IPF-1/IDX-1) as an important determinant in this process. Indeed, the requirement for STF-1 in development is supported by homologous recombination studies in which targeted disruption of the STF-1/IPF-1 gene leads to congenital absence of the pancreas (4).
Although STF-1 appears to be an important regulator of pancreatic genes, the mechanism by which STF-1 expression is itself targeted to pancreatic cells remains uncharacterized. Here, we show that a 6.5-kb fragment of the STF-1 promoter is sufficient to direct islet-specific expression of a β-galactosidase reporter gene in transgenic mice as well as in cultured cells. Within this 6.5-kb fragment, an E-box element, located at −104 relative to the major transcription initiation site, appears to be particularly critical for STF-1 promoter activity. Our studies suggest that this element is recognized by an upstream activator that is essential for islet expression of STF-1.

MATERIALS AND METHODS
Chromosome Mapping-Chromosome mapping of the STF-1 gene was performed on a (B6 × SPRET)F1 × SPRET backcross panel of DNAs from The Jackson Laboratory Backcross DNA Panel Map Service using a ³²P-labeled STF-1 cDNA fragment as a probe.
Generation of Transgenic Mice and β-Galactosidase Staining-A fusion gene containing 6500 bp of upstream STF-1 sequence in front of the β-galactosidase reporter gene was constructed using standard cloning techniques and injected into male pronuclei of fertilized oocytes. Founder mice were identified using Southern blotting and polymerase chain reaction amplification techniques. Expression of the STF-1/β-galactosidase gene in transgenic tissues was evaluated on 20-μm sections of paraformaldehyde-fixed tissues using X-Gal as chromogenic substrate.
Isolation of STF-1 Genomic Clones-The STF-1 gene was isolated from an EMBL 3 rat genomic library using a ³²P-labeled STF-1 cDNA fragment as hybridization probe. STF-1-positive genomic fragments were subcloned into the EcoRI sites of the SK II plasmid (Stratagene).
RNase Protection and Primer Extension-Poly(A)+ RNA was prepared using an Oligotex(dT)30 system (Qiagen). Oligonucleotide primers for primer extension analysis were 5′-end labeled using [γ-³²P]ATP and T4 polynucleotide kinase. 5 μg of poly(A)+ RNA was incubated with end-labeled antisense primer at 80°C for 5 min followed by 16 h of incubation at 42°C. Primer extension reactions were performed using avian myeloblastosis virus reverse transcriptase at 37°C for 1 h, and products were analyzed on 5% denaturing polyacrylamide gels. RNase protection analysis was performed using 25 μg of total RNA extracted from Tu6 cells (19). Antisense STF-1 RNA probe was generated using an STF-1 genomic fragment extending from −185 to +93 relative to the translation start site. ³²P-Labeled antisense STF-1 RNA was synthesized in vitro using T7 RNA polymerase and [³²P]UTP. STF-1 antisense RNA probe and mRNAs were annealed at 80°C for 5 min followed by 16 h of incubation at 65°C. Annealing reactions were subsequently treated with 40 μg/ml RNase at room temperature for 1 h, and the digestion products were analyzed by 5% denaturing polyacrylamide gel electrophoresis.
* This work was supported by the Foundation for Medical Research. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™ ...
Gel Shift Assays and DNase Protection Assays-For electrophoretic mobility shift assays, oligonucleotide probes were labeled with [α-³²P]dCTP by fill-in reaction using Klenow fragment. 4 μg of nuclear extract was incubated with 0.5 ng of ³²P-labeled, double-stranded oligonucleotide and subjected to nondenaturing polyacrylamide electrophoresis as described previously (9). For supershift analysis, proteins were preincubated with the antibody followed by incubation with radioactive double-stranded oligonucleotide and electrophoresis. DNase protection assays were performed as described previously (22).
Figures-Figures for DNA binding assays were scanned from original photos using an HP ScanJet 3C and assembled with Canvas software on a Macintosh. Scanned images were reproduced on a Tektronix Phaser II SDX.

RESULTS
Chromosomal Location and Genomic Organization of the STF-1 Gene-Using the STF-1 cDNA as hybridization probe on a backcross panel of DNAs from Jackson Laboratories, we mapped the single-copy STF-1 gene to the distal region of mouse chromosome 5 (Fig. 1, top). No recombinants were found with the distal markers Pmv12 or Iapls3-9, while six recombinants were observed with the more distal Actb locus (Fig. 1, bottom). These results predict that the STF-1 gene would correspondingly be found on rat chromosome 14 and human chromosome 7q, loci which do not correspond to any of the four homeotic HOX clusters. These results indicate that STF-1 should be classified as an "orphan" homeobox gene.
FIG. 2. B, correspondence between RNase-protected products and primer-extended products is marked, and the first three start sites are designated S1, S2, and S3 (see also A, bottom).
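The backcross mapping result above reduces to a small calculation: the fraction of panel animals that are recombinant between two loci, with a binomial standard error. The following Python sketch illustrates that arithmetic under stated assumptions. The count of six recombinants with Actb comes from the text, but the panel size of 94 animals is an assumed value used only for illustration, and the function name is hypothetical.

```python
import math


def recombination_estimate(recombinants, animals):
    """Return (recombination fraction, binomial standard error) for a backcross panel."""
    if animals <= 0 or not 0 <= recombinants <= animals:
        raise ValueError("need animals > 0 and 0 <= recombinants <= animals")
    r = recombinants / animals
    se = math.sqrt(r * (1.0 - r) / animals)
    return r, se


# Six recombinants with Actb is reported in the text.
# The panel size (94) is an ASSUMED value, not taken from the paper.
r, se = recombination_estimate(6, 94)

# For short intervals, percent recombination approximates distance in centimorgans.
distance_cm = 100.0 * r
```

With zero recombinants, as reported for Pmv12 and Iapls3-9, the same function returns a fraction of zero, which is why those markers cannot be ordered relative to STF-1 on this panel.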
To isolate the gene encoding STF-1, we screened 10⁶ bacteriophage clones from a rat EMBL 3 genomic library with a ³²P-labeled STF-1 cDNA probe and obtained two positive clones, each containing a genomic insert of 15 kb. In addition to 6.5 kb of 5′-flanking and 3.5 kb of 3′-flanking sequence, the 15-kb STF-1 genomic fragment contained the entire STF-1 coding region, which was interrupted by a single 4-kb intron inserted immediately upstream (Ala-135) of the homeobox coding sequence (amino acids 140-215) (Fig. 2A).
The absence of consensus TATA box or initiator sequences in the 5′-flanking region of the STF-1 genomic clone (Fig. 2A) prompted us to map the transcriptional initiation sites for this gene. Using RNase protection and primer extension analysis on mRNAs from the insulin-producing cell lines RIN and Tu6 (Fig. 2, A and B), we identified three principal initiation sites, termed S1, S2, and S3, which were located 91, 107, and 120/125 nucleotides upstream of the translational start site, respectively. A fourth minor transcriptional initiation site 137 nucleotides upstream of the translational start site was also observed. Like other TATA-less promoters, the STF-1 promoter contains G/A and G/C-rich sequences 30 bp upstream of the S1 and S2 start sites (23-25).
STF-1 Promoter Activity in Pancreatic Islet Cells-To determine whether sequences within the 5′-flanking region of the STF-1 gene were sufficient to target expression of STF-1 to pancreatic islet cells, we fused 6500 bp of 5′-flanking STF-1 sequence to the β-galactosidase gene and examined the expression of this STF-1-lacZ reporter in transgenic mice. Using X-Gal as chromogenic substrate, we detected β-galactosidase activity in pancreatic islets of transgenic but not control littermates from three independent founder lines (Fig. 3). In keeping with the previously described expression pattern for endogenous STF-1 protein, no significant β-galactosidase activity was detected in exocrine acinar cells (Fig. 3) or in non-pancreatic tissues such as liver and spleen of transgenic mice (not shown). Consistent with the reported expression of the endogenous STF-1 gene in the duodenum (8,13), in situ hybridization studies with antisense β-galactosidase RNA probe also revealed transgene expression in epithelial cells of the duodenum from transgenic animals (not shown). These results indicate that 6500 bp of the STF-1 promoter are indeed sufficient to target expression of STF-1 to pancreatic islet and duodenal cells.

FIG. 4. Distal and proximal elements within the STF-1 promoter direct STF-1 expression to pancreatic islet cells.
A, activity of a −6500 STF-1 luciferase reporter plasmid following transfection into pancreatic islet (βTC3, HIT) versus non-islet cell lines (PC12, COS, HeLa). Representative assay showing STF-1 promoter activity in HIT cells (100%) relative to other cell lines after normalization with cotransfected Rous sarcoma virus-chloramphenicol acetyltransferase control plasmid. Assays were repeated at least three times. B, representative assay of STF-1 luciferase (STF Luc) promoter constructs following transfection into HIT cells. Constructs are named according to 5′ promoter boundary relative to the major transcriptional start site (S1, Fig. 2, A and B). Schematic diagrams show position of potential binding sites for nuclear factors; the major transcriptional start site is represented by the filled arrow. Asterisk indicates uncharacterized binding activity in the distal 3 kb. For each construct, activity was calculated relative to −6500 STF Luc (100%) following normalization for transfection efficiency using Rous sarcoma virus-chloramphenicol acetyltransferase as an internal control. Assays were repeated at least four times.
To define functional elements that direct STF-1 expression to pancreatic islet cells, we examined the activity of the −6500 STF Luc reporter in two distinct pancreatic islet cell lines (βTC3, HIT). As predicted from results in transgenic mice, the STF-1 reporter showed 20-100-fold more activity in these islet cells compared to non-islet lines such as HeLa, PC12, and COS (Fig. 4A). By contrast, the 4-kb intron and 3-kb 3′-flanking region of the STF-1 gene showed no such activity when inserted into a minimal SV40 chloramphenicol acetyltransferase promoter plasmid (not shown), suggesting that the 6.5-kb STF-1 promoter fragment is specifically responsible for targeted expression of STF-1 in islet cells.
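The relative activities reported for these transfections rest on a two-step normalization: each luciferase reading is divided by the signal from the cotransfected Rous sarcoma virus-chloramphenicol acetyltransferase control, and the result is expressed as a percentage of a reference sample. A minimal Python sketch of that arithmetic is given here. The class name, function name, and all numeric readings are hypothetical and are not taken from the experiments.

```python
from dataclasses import dataclass


@dataclass
class Transfection:
    name: str
    luciferase: float  # raw reporter signal (arbitrary units)
    cat: float         # signal from the cotransfected RSV-CAT control


def relative_activity(samples, reference):
    """Return {name: percent of reference} after CAT normalization."""
    # Step 1: correct each reporter reading for transfection efficiency.
    normalized = {s.name: s.luciferase / s.cat for s in samples}
    # Step 2: express every sample relative to the reference (100%).
    ref = normalized[reference]
    return {name: 100.0 * value / ref for name, value in normalized.items()}


# Hypothetical readings, for illustration only.
samples = [
    Transfection("HIT", luciferase=50000.0, cat=2.0),
    Transfection("HeLa", luciferase=1000.0, cat=4.0),
]
activity = relative_activity(samples, reference="HIT")
```

Dividing by the internal control before taking the ratio keeps a poorly transfected dish from being mistaken for a weak promoter.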
To delineate sequences within the STF-1 promoter that confer islet cell expression, we generated a series of 5′-deletion constructs and analyzed these reporters by transfection into HIT cells (Fig. 4B). Deletion of sequences from −6500 to −3500 bp from the −6500 STF reporter construct reduced STF-1 reporter activity 4-fold, suggesting the presence of a distal activating sequence within that region. Further truncation of the STF-1 promoter from −3500 to −190 bp did not affect reporter activity in HIT cells significantly (Fig. 4B), but deletion of STF-1 promoter sequences from −190 to −95 bp severely attenuated reporter activity in HIT cells, indicating that a proximal element was also required for STF-1 promoter function. Inspection of the sequence in the −190 to −95 region of the STF-1 promoter revealed three consensus E-box motifs (Fig. 2A). Although removal of two tandem E-boxes at −177 did not reduce promoter activity, deletion of the proximal E-box sequence at −104 (−95 STF Luc) completely abolished STF-1 expression in HIT cells.
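Inspecting a promoter region for consensus E-box motifs, as described above, can be sketched as a pattern search for the general consensus CANNTG. The Python example below is illustrative only: the function names are hypothetical, and the demonstration sequence is made up and is not the STF-1 promoter. A lookahead is used so that overlapping candidate sites are all reported.

```python
import re

# CANNTG is the general E-box consensus; the proximal STF-1 E-box
# discussed in the text has the specific sequence CACGTG.
EBOX_SCAN = re.compile(r"(?=(CA[ACGT]{2}TG))")
EBOX_EXACT = re.compile(r"CA[ACGT]{2}TG")


def find_eboxes(seq):
    """Return (0-based offset, motif) for every CANNTG site, overlaps included."""
    return [(m.start(), m.group(1)) for m in EBOX_SCAN.finditer(seq.upper())]


def is_ebox(motif):
    """True if a six-base motif fits the CANNTG consensus."""
    return EBOX_EXACT.fullmatch(motif.upper()) is not None


# Made-up sequence for illustration; NOT the STF-1 promoter.
demo = "GGCAGCTGCAGCTGTTTTCACGTGAA"
hits = find_eboxes(demo)
```

The same `is_ebox` check can be applied to any six-base candidate. A motif whose first base is changed from C to A, for example, no longer fits the consensus.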
A Proximal E-box in the STF-1 Promoter Recognizes a USF-containing Complex-To characterize upstream factors that bind to functional elements in the STF-1 promoter, we performed DNase I protection assays using nuclear extracts from HIT and HeLa cells (Fig. 5A). In both extracts, we observed a predominant footprinting activity whose boundaries coincided with the functionally important proximal E-box motif (−118/−95). To further characterize the proteins that bind to the critical −104 E-box motif in HIT versus HeLa extracts, we performed gel mobility shift assays. Using a double-stranded STF-1 oligonucleotide extending from −118 to −95, we observed three complexes, termed C1, C2, and C3 (Fig. 5B). Formation of C1, C2, and C3 complexes was inhibited by a 50-fold excess of unlabeled STF-1 E-box competitor oligonucleotide in binding reactions. Mutant E-box oligonucleotide or nonspecific competitor DNAs had no effect on these binding activities, however, indicating that C1, C2, and C3 are indeed specific for the STF-1 E-box sequence (Fig. 5B). No qualitative difference in the pattern of these complexes was detected between HeLa and HIT extracts (not shown), suggesting that the −104 E-box motif may recognize factors that are comparably expressed in both cell types.
The stability of these complexes to heat denaturation (not shown) led us to first examine whether USF, a heat-stable upstream factor, was a component of C1, C2, or C3 (28). Remarkably, addition of anti-USF antiserum to gel mobility shift reactions inhibited formation of all three complexes (Fig. 5C, left panel), but anti-TFE-3 antiserum had no effect on complexes C1, C2, or C3, suggesting that these complexes were most likely formed by USF proteins. In gel shift assays, recombinant USF-1 gave rise to a protein-DNA complex, which migrated at the same relative position as complex C2 (not shown), and in DNase I protection studies, recombinant USF-1 footprinting activity coincided with that observed in HIT extracts (Fig. 5A).
Two forms of USF, termed USF-1 and USF-2, appear to be expressed in most cell types (18). To distinguish which of these USF proteins was contained within the C1, C2, and C3 complexes, we incubated HIT or HeLa extract with either anti-USF-1-specific or anti-USF-2-specific antiserum (Fig. 5C, right panel). Although USF-1 antiserum could "supershift" all three complexes, the USF-2-specific antiserum only inhibited formation of the C1 and C3 complexes. These results suggest that complexes C1 and C3 correspond to USF-1·USF-2 heterodimers, whereas C2 may contain a USF-1 homodimer.
To determine whether the CACGTG E-box sequence was essential for STF-1 promoter activity, we constructed a mutant STF-1 oligonucleotide that contains two base pair substitutions in the E-box (−118/−95). In gel mobility shift assays with HIT nuclear extracts, this mutant E-box motif (AACGCG) could not form C1, C2, and C3 complexes and could not compete for binding of USF-1 to wild-type E-box oligonucleotide (Fig. 6A). Correspondingly, full-length (6.5 kb) STF-1 and truncated (−190 STF) reporter plasmids containing the mutant STF E motif were nearly 10-fold less active than their wild-type counterparts in pancreatic islet cells (Fig. 6B). These results indicate that the proximal E-box, which binds USF, is indeed critical for STF-1 promoter activity.

DISCUSSION
The majority of vertebrate homeobox genes are confined to four chromosomal clusters, termed HOX A-D (29,30). Within each cluster, individual homeobox genes are ordered in a 5′ to 3′ pattern that is colinear with their antero-posterior expression patterns during development. It is not entirely clear whether this colinear organization is critical for proper expression of hox genes, but current evidence suggests that such clusters may contain upstream enhancers that coordinately regulate hox gene expression (29,31). Remarkably, the STF-1 gene does not map to any hox cluster but rather belongs to a group of so-called orphan hox genes. Although the regulatory implications of this distinct chromosomal location for STF-1 remain to be shown, our results suggest that orphan homeobox genes like STF-1 may be regulated by signals that are distinct from those employed for the HOX clusters. In this regard, the STF-1 promoter displays pancreatic islet cell-specific activity both in transgenic animals and in transient transfection assays, and the lineage-specific activity of this transgene contrasts with the segmental expression pattern of most hox genes.
Two elements within the first 6500 bp of STF-1 5′-sequence appear to be important for islet-specific expression: a distal element located between −6500 and −3500 and a proximal element located at −104. Although the identity of the distal element remains to be elucidated, the proximal −104 element consists of a consensus E-box motif that predominantly recognizes the upstream activator USF. Multiple lines of evidence suggest that USF is important for STF-1 promoter activity. First, non-discriminating USF-1 and USF-2 antibodies as well as USF-1- and USF-2-specific antibodies recognize the complexes specific for the STF-1 E-box. Second, the STF-1 E-box binding activity in HIT nuclear extracts has characteristics reminiscent of USF: the complexes are heat stable and demonstrate half-lives similar to those of recombinant USF-1. Finally, point mutations that inhibit formation of USF complexes on the STF E-box correspondingly attenuate STF-1 reporter activity. These results suggest that USF complexes are indeed important for STF-1 promoter activity and consequently for pancreatic organogenesis.
FIG. 6. Binding of USF to the −104 E-box is important for STF-1 promoter activity.
A, effect of E-box mutations on USF binding activity. Gel shift assay of HIT nuclear extract using wild-type (E-WT) or mutant (E-MUT) STF-1 E-box probes. Sequences of wild-type and mutant probes from −106 to −102 are shown below. C1-3, complexes C1, C2, and C3. B, effect of E-box mutation on STF-1 promoter activity in HIT cells. Representative assay of HIT cells transfected with wild-type, mutant, or deleted (−118/−95) STF-1 E-box motifs in the context of 6500 or 190 bp of STF-1 promoter. Reporter activities are shown relative to wild-type −6500 STF-1 Luc (100%) construct after normalizing for transfection efficiency with a cotransfected Rous sarcoma virus-chloramphenicol acetyltransferase control plasmid. Assays were repeated at least three times.
Other nuclear factors in addition to USF, most notably Myc and Max, can also bind with high affinity to the STF-1 E-box (CACGTG) motif. Myc has been shown to stimulate target gene transcription by binding as a heterodimer with Max to E-box motifs (32,33). As myc gene expression is typically undetectable in post-mitotic cells such as those in pancreatic islets, Myc-Max complexes may not be involved in STF-1 promoter regulation there. During development, however, STF-1 expression appears to be concentrated in proliferating ductal cells (6), and myc may consequently stimulate STF-1 expression under those conditions. In this regard, it is tempting to speculate that the profound changes in STF-1 expression, which are observed during pancreatic development, may in part reflect changes in E-box binding activities that ultimately restrict STF-1 production to pancreatic islet cells.