Transcriptional Regulatory Elements of the Human Gene for Cytochrome P450c21 (Steroid 21-Hydroxylase) Lie within Intron 35 of the Linked C4B Gene

The CYP21 gene, which encodes P450c21, the adrenal steroid 21-hydroxylase needed for glucocorticoid synthesis, lies in the major histocompatibility locus only 2.3 kilobase pairs (kb) downstream from the C4 gene. A 300-base pair (bp) proximal promoter and two upstream regions within C4 are needed for expression of mouse CYP21; the human gene also has a proximal promoter, but upstream elements have not been studied. To search for upstream regulatory elements in human CYP21B, we examined up to 9 kb of 5′-flanking DNA by transient transfection into human adrenal NCI-H295A cells. The 300-bp proximal promoter had substantial activity, but constructs retaining the DNA between −4.6 and −5.6 kb had increased activity, indicating the presence of distal elements. This region does not correspond to the mouse upstream regions, lying further upstream within intron 35 of C4B, which encompasses the previously described “Z promoter.” DNase I footprinting located two elements, F1 and F2, lying −186 to −195 bp and −142 to −151 bp upstream from the Z cap site (−4862 to −4871 and −4818 to −4827 bp upstream of the CYP21B cap site). Each element formed a specific DNA-protein complex and conferred orientation-independent expression to a heterologous promoter. Mutations abolished formation of the DNA-protein complexes but only partially decreased expression. We identified a third site, F3, lying at −33 to −42 bp from Z. Competitive gel mobility supershift assays and co-transfection studies with SF-1 produced in vitro indicate F2 and F3 bind SF-1; BLAST searches and Southwestern blotting suggest that NF-W2 may bind F1. These results indicate that the Z promoter is a component of the CYP21 promoter needed to drive its adrenal-specific expression and that CYP21 transcription elements within C4 have kept these two genes linked during evolution.

Congenital adrenal hyperplasia (CAH) is a group of inborn errors of human steroid hormone biosynthesis (1) that occurs in about 1 in 14,000 individuals (2). Although mutation of the genes of any of the steroidogenic enzymes may cause CAH, over 95% of cases are due to mutations in steroid 21-hydroxylase; hence, this genetic locus has been the subject of intensive study (3-5). Adrenal 21-hydroxylation is catalyzed by cytochrome P450c21 (6), although other, unidentified enzymes can catalyze some steroid 21-hydroxylation in extra-adrenal tissues (7). Human P450c21 is encoded by the CYP21B gene, which lies in a complex array of genes on chromosome 6p21.3. The human (8-10), bovine (11, 12), and rodent (13, 14) genomes have duplicated CYP21A and CYP21B genes, but these duplications postdate mammalian speciation and have different duplication boundaries (15). Only the human 21B gene encodes P450c21, as the 21A gene carries three mutations that destroy the reading frame (8-10). By contrast, the mouse 21A gene is active while the corresponding 21B gene carries a single large internal deletion (13, 14), and in cattle both genes are active (11, 12, 16). The human, rodent, and bovine CYP21 genes are located in the major histocompatibility locus and are duplicated in tandem with the closely linked C4 genes for the fourth component of serum complement so that the array is 5′ C4A, 21A, C4B, 21B 3′ from telomere to centromere (8-17) (Fig. 1). The 5′ ends (transcriptional start sites) of the human CYP21A and CYP21B genes lie only 2466 bp downstream from the polyadenylation sites of the corresponding C4A and C4B genes (15, 18).
In addition to the C4 and CYP21 genes, at least nine other transcription units overlap the human C4 and CYP21 genes. XB encodes the extracellular matrix protein tenascin-X (19-22); XB-S is a truncated XB transcript that arises from a promoter within an intron of XB and encodes a protein of unknown function (23); XA is an expressed, truncated XB gene that carries an internal deletion and does not encode protein (15); YA-S, YA-L, YB-S, and YB-L are short (S) and long (L) alternately spliced transcripts that arise at or near the CYP21A and B transcriptional start sites but have a different exonic array and lack open reading frames (24); the ZA and ZB transcripts arise from promoters within intron 35 of the C4A and C4B genes and have the potential to encode a protein identical to the carboxyl-terminal 131 amino acids of C4 (25) (Fig. 1). The three X transcripts are encoded on the DNA strand antisense to all the other transcripts, overlap the CYP21 and Y transcripts by several hundred bases, and are transcribed in the same cells (20), but these sense and antisense strands do not form significant RNA:RNA duplexes in vivo (26). The two C4 genes are expressed almost exclusively in the liver (27), and XB is expressed in a wide variety of tissues (20, 28); all of the remaining transcripts (CYP21B, XA, XB-S, YA-S, YA-L, YB-S, YB-L, ZA, and ZB) are expressed only in the adrenal cortex (15, 23-25). Thus, this locus is especially well suited for studies aimed at identifying the requirements for adrenal-specific transcription.
Although the high frequency of 21-hydroxylase deficiency has stimulated intense study of human CYP21 genetics, less attention has been directed to the regulation of human CYP21 transcription. Studies of the mouse CYP21A gene showed that a small promoter fragment of only 230-330 bases upstream from the transcriptional start site is sufficient to confer both basal and cAMP-induced transcription in mouse adrenal Y-1 cells (29-33). These initial observations contributed to the discovery of the orphan nuclear receptor called SF-1 or Ad4-BP, which is required for the expression of steroidogenic enzymes in the adrenals and gonads (Refs. 34-36; for review, see Ref. 37). However, this proximal promoter region was necessary but not sufficient, as two small regions located 5.3 and 6.0 kb upstream from the mouse CYP21A gene were required for its expression in transgenic mice (38). Preliminary experiments with the promoter of the human CYP21B gene similarly identified basal and cAMP-responsive elements within a 200-bp proximal promoter adjacent to the transcriptional start site (39, 40), but far upstream sequences have not been studied. We previously identified a 1-kb adrenal-specific transcript, operationally termed Z, that arises from a transcriptional start site in intron 35 of the human C4 gene, 55 bases upstream from the 5′ end of exon 36 of C4 and 4676 bases upstream from the cap site of CYP21B (25). The "Z promoter," comprising as little as 235 bp upstream from the Z cap site, drove robust expression of a luciferase reporter when transfected into human adrenal NCI-H295 cells, but not when transfected into human placental JEG-3 cells, human liver HepG2 cells, or monkey kidney COS-1 cells (25). Because no function could be found for the Z transcript and because the Z promoter lies near to but upstream from the regions corresponding to the −5.3 and −6.0 regions of the mouse C21A gene tested by Milstone et al. (38), we suggested that the Z promoter is a component required for efficient adrenal-specific expression of the human CYP21B gene (25). We have now confirmed this hypothesis by characterizing large segments of the human CYP21B promoter, identifying specific sites of DNA/protein interaction.

MATERIALS AND METHODS
Plasmid Constructions-A pWE15 human cosmid library (Stratagene, La Jolla, CA) was screened under stringent conditions for C4/C21 sequences using a 372-bp PCR probe containing sequences from intron 35 of the C4-Z promoter region (−5254 to −4882 of the CYP21B gene) (Fig. 1). To isolate C4B/CYP21B clones, secondary screening was performed by PCR of primary clones using 21B-specific oligonucleotides resulting in a 670-bp product (−64 to −729 of CYP21B), which excluded amplification of CYP21A sequences. Positive cosmids were then digested with EcoRI and BamHI and hybridized with the 670-bp PCR probe. Positive DNA fragments were purified and used to assemble the constructs. The luciferase reporter constructs were built by cloning contiguous and internally deleted 5′-flanking DNA fragments of the CYP21B gene extending from +13 bp to −9 kb (CYP21B transcription initiation site is designated as −1). These fragments were cloned upstream from the firefly luciferase reporter gene in the PGL3-Basic vector (Promega, Madison, WI) at various restriction sites within the polylinker. The constructs were named according to the approximate length of the DNA segment cloned from the CYP21B gene with the bases numbered according to the sequence of the human C4A gene (18) (Fig. 1).
Mutations were introduced into various functional elements by site-directed mutagenesis using a modification of the PCR protocol of Weiner et al. (41). Using 200 ng of wild type plasmid DNA as template, PCR was performed in reactions containing 500 μM dNTPs, 2 units of Pfu polymerase, and 250 ng each of the sense and antisense mutant oligos (Tm = 75°C). The reaction conditions were: 95°C for 30 s, followed by 20 cycles of 95°C for 30 s, 55°C for 1 min, and 65°C for 25 min (2.5 min/kb DNA). PCR products were directly treated with 20 units of DpnI at 37°C for 90 min, and the treated products were used to transform Escherichia coli DH5α. To avoid the need to sequence the resulting mutant clones in their entirety, restriction fragments encompassing the mutant regions were subcloned into the respective wild type constructs.
Constructs used for heterologous promoter experiments consisted of single and multiple tandem copies of the wild type and mutant oligos used in the electrophoretic mobility shift assays (Table I) inserted upstream from an 86-bp fragment of the herpes simplex virus thymidine kinase promoter (42) extending from −32 to +55, lying immediately upstream from the firefly luciferase gene (HSV-TK32/Luc). Double-stranded oligonucleotides were blunt-end cloned into the SmaI site. The fidelity of all constructs was verified by restriction enzyme digestion and sequencing.

Cell Culture, Transfection, and Dual Luciferase Reporter Assays-An adherent subline (NCI-H295A) (43) of human adrenocortical carcinoma NCI-H295 cells (44, 45) was maintained in RPMI 1640 medium supplemented with 2% fetal calf serum and antibiotics (penicillin, 20 units/ml; streptomycin, 20 μg/ml), selenium (5 ng/ml), insulin (5 μl/ml), and transferrin (5 μl/ml). Mouse Y1 adrenal carcinoma cells (46), a generous gift from Dr. B. Schimmer (University of Toronto, Ontario, Canada), were grown in 50% Dulbecco's modified Eagle's medium (DMEM)-H16: 50% Ham's F12 with 15% heat-inactivated horse serum, 2.5% fetal bovine serum, and antibiotics. Monkey kidney COS-1 cells and human HepG2 hepatocarcinoma cells were grown in DMEM-H21 media supplemented with 10% fetal bovine serum and antibiotics. Human JEG-3 choriocarcinoma cells (47) were grown in DMEM-H21 media supplemented with 5% horse serum and 0.2 mM gentamycin. Mouse MA-10 Leydig cells (48) were grown in Weymouth's medium supplemented with 15% horse serum, 2.5% HEPES buffer, and 0.2 mM gentamycin. All cell lines were maintained at 37°C and 5% CO2. For transient transfection with the luciferase reporter constructs, cells were grown to 80% confluence in 10-cm Petri dishes and split into six-well plates 24 h prior to transfection. For the NCI-H295A cells, the DMEM-H16 medium supplemented with 10% fetal calf serum was used for transfection and replaced with growth medium 12 h after transfection.

FIG. 1. … (4.6-5.6)kb/Luc represent internal deletion constructs retaining 300 bp of the CYP21B promoter and the segments from −4.6 to −5.0 kb or to −5.6 kb, which, respectively, correspond to 400 bp and 1 kb of the Z promoter. Z235 and Z542 are the Z promoter constructs retaining 235- and 542-bp fragments of the Z promoter (25) cloned into PGL3-Basic vector. Black boxes A and B correspond to the mouse regions identified at −5.3 and −5.8 kb in the mouse C21A gene (38), and the gray boxes represent the Z promoter (25).
Plasmid constructs used for transfection studies were purified using Qiagen columns (Qiagen, Chatsworth, CA). Equimolar amounts of plasmid DNA containing varying lengths of contiguous and internally deleted CYP21B 5′-flanking DNA were transfected using the calcium phosphate-DNA co-precipitation method. Following incubation for 12 h, the medium was removed, and fresh medium was added and incubated for another 12 h. Cells were harvested and cellular extracts were assessed for luciferase activity by the dual luciferase reporter assay system (Promega). Transfection efficiencies were normalized by cotransfecting with the pRL-CMV plasmid (Promega) containing the Renilla luciferase gene driven by the CMV promoter. Luciferase activity values were normalized by first dividing all pRL-CMV Renilla luciferase values by the lowest value obtained and then dividing the firefly luciferase value from each corresponding construct by the result. All constructs were transfected in triplicate, and the means of two independent experiments are shown.
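The two-step normalization described above can be sketched in a few lines. This is only an illustration of the arithmetic as we read it from the text; the function name and the example readings are hypothetical and are not data from this study.

```python
def normalize_firefly(firefly, renilla):
    """Normalize firefly luciferase readings by relative Renilla signal.

    firefly, renilla: lists of raw readings, one pair per transfected well.
    (Hypothetical helper; names are illustrative assumptions.)
    """
    if len(firefly) != len(renilla):
        raise ValueError("need one Renilla reading per firefly reading")
    lowest = min(renilla)
    if lowest <= 0:
        raise ValueError("Renilla readings must be positive")
    # Step 1: divide every Renilla value by the lowest Renilla value,
    # giving a relative transfection efficiency (>= 1) for each well.
    efficiency = [r / lowest for r in renilla]
    # Step 2: divide each firefly value by the efficiency of its own well.
    return [f / e for f, e in zip(firefly, efficiency)]


# Made-up readings for three wells (not experimental data).
firefly = [1200.0, 3000.0, 900.0]
renilla = [50.0, 100.0, 25.0]
normalized = normalize_firefly(firefly, renilla)
print(normalized)
```

With this scheme the well with the lowest Renilla signal is left unchanged, and every other well is scaled down in proportion to its higher apparent transfection efficiency.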
Electrophoretic Mobility Shift Assays-Nuclear extracts from NCI-H295A, JEG-3, COS-1, HepG2, MA-10, and Y1 cells were extracted using a method adapted from Dignam et al. (49). Protein concentrations were determined by the Bradford method (Bio-Rad) using bovine serum albumin as a standard. Double-stranded probes were prepared by hybridization of [32P]dATP (Amersham Pharmacia Biotech) end-labeled complementary oligonucleotides (Table I). Mobility shift binding reactions typically contained 5-10 μg of nuclear extract and 40,000 cpm (less than 0.5 ng) of end-labeled double-stranded probe in a final buffer composition of 4% glycerol, 1 mM EDTA, 5 mM DTT, 10 mM Tris-HCl, pH 7.5, 0.1 mg/ml bovine serum albumin, and 50 mM or 100 mM KCl. Poly(dI-dC) (1 μg) was included as a nonspecific competitor in all reactions. When competitive binding studies were being performed, 5-50 ng (10-100-fold excess) of unlabeled specific and nonspecific oligonucleotides were pre-mixed with the probe for 1-2 min prior to addition of the nuclear extract. The reactions were incubated for 15 min at room temperature and electrophoresed through an 8% native polyacrylamide gel in 50 mM Tris base, 0.38 M glycine, 2 mM EDTA and analyzed by autoradiography or phosphorimaging.
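The relationship between competitor mass and fold molar excess quoted above can be checked with a short calculation. The sketch below assumes that the molar amount of a double-stranded oligonucleotide is proportional to its mass divided by its length in base pairs (i.e. a constant average mass per base pair); the function name and the 34-bp length used in the example are illustrative assumptions.

```python
def fold_molar_excess(competitor_ng, probe_ng, competitor_bp, probe_bp):
    """Fold molar excess of unlabeled competitor over labeled probe.

    Assumes moles are proportional to mass / length for double-stranded
    DNA, so the per-base-pair mass cancels out of the ratio.
    """
    if min(competitor_ng, probe_ng, competitor_bp, probe_bp) <= 0:
        raise ValueError("all quantities must be positive")
    # (competitor_ng / competitor_bp) / (probe_ng / probe_bp), rearranged
    # to a single division to limit floating-point rounding.
    return (competitor_ng * probe_bp) / (probe_ng * competitor_bp)


# Competitor and probe of equal length (34 bp assumed for illustration):
low = fold_molar_excess(5, 0.5, 34, 34)    # 5 ng competitor vs 0.5 ng probe
high = fold_molar_excess(50, 0.5, 34, 34)  # 50 ng competitor vs 0.5 ng probe
print(low, high)
```

For oligonucleotides of equal length the molar ratio reduces to the mass ratio, which is why 5-50 ng of competitor against roughly 0.5 ng of probe corresponds to about a 10-100-fold excess; a shorter competitor would give a proportionally larger molar excess for the same mass.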
In Vitro Transcription/Translation of SF-1 Protein-SF-1 cDNA was obtained by reverse transcription-PCR of human adrenal RNA. cDNA was synthesized using 1 μg of random-primed RNA. Specific oligonucleotides designed at the 5′ and 3′ end of the gene to amplify the cDNA in the correct reading frame were used in a PCR reaction containing 2 mM MgCl2, 5% Me2SO, 200 μM dNTPs, 0.6 mM oligonucleotides, and 2.5 units of Pfu DNA polymerase. The 1.3-kb cDNA fragment was cloned into the BamHI-EcoRI site of the pCDNA3 vector (Stratagene). SF-1 protein was obtained by in vitro transcription and translation using the TnT T7-coupled reticulocyte lysate system kit (Promega). 5 μl of lysate containing SF-1 protein or protein from lysate containing empty pCDNA3 vector were used in mobility shift assays.
DNase I Footprinting Assay-Probes were generated by PCR of the CYP21B −4.6 to −5.6 kb region using the C21/−5.0kb/Luc and C21/−5.6kb/Luc constructs as templates and [32P]dATP-end-labeled oligonucleotides. Both the −94 to −310 fragment from the C21/−5.0kb/Luc construct and the −717 to −937 fragment from the C21/−5.6kb/Luc construct were amplified using a vector-specific sense oligonucleotide. All other probes were obtained using the C21/−5.6kb/Luc construct. PCR products were purified using the Qiagen PCR purification kit and 70,000 cpm used in each assay. Probe DNA was incubated with varying concentrations of NCI-H295A and Y1 nuclear cell extracts (5-70 μg) in reactions containing 50 mM NaCl, 0.5 mM EDTA, 20 mM HEPES buffer, 10% glycerol, and 0.5 mM DTT. Poly(dI-dC) (1 μg) was included to prevent nonspecific DNA-binding proteins from binding to the labeled DNA. Reactions were incubated at 25°C for 15 min, followed by the addition of 5 mM CaCl2, 10 mM MgCl2 and digestion with 0.1 units of DNase I. The reactions were terminated by adding 20 mM EDTA, 1% SDS, 0.2 M NaCl, and 250 ng/ml yeast tRNA. Phenol/chloroform-purified and ethanol-precipitated products were resuspended in formamide loading buffer and separated by electrophoresis through an 8% denaturing polyacrylamide gel. Dried gels were subjected to autoradiography.
Southwestern Blot Assay-Southwestern blotting was done essentially as described (50); 100 μg of NCI-H295A cell nuclear extract was electrophoresed on 10% SDS-polyacrylamide gels and electrophoretically transferred onto Immobilon-P polyvinylidene difluoride membranes (Millipore, Bedford, MA) treated according to the manufacturer's protocol. The membranes were incubated in 10 mM HEPES, pH 7.9, 60 mM KCl, 1 mM EDTA, 8% glycerol, 1 mM DTT, 5% nonfat dry milk powder, and 10 μg/ml poly(dA-dT) for 1.5 h at 25°C. The membranes were then transferred to hybridization buffer (10 mM HEPES, pH 7.9, 60 mM KCl, 1 mM EDTA, 8% glycerol, 1 mM DTT, and 0.25% nonfat dry milk powder) containing 2 × 10^6 cpm/ml labeled probe for 2 h at 25°C. The membranes were washed three times for 15 min each in hybridization buffer without probe at 25°C and exposed to autoradiography or phosphorimaging.

RESULTS
Transient Transfection of CYP21B Promoter/Reporter Constructions-The 5′ upstream DNA for both the human CYP21A and 21B genes is active, but expression from the 21A promoter occurs at only about 20% of the level of the 21B promoter (24, 51), due at least in part to a 4-bp difference in the proximal promoter segment (51, 52). As the 21B promoter is more active and is required for human adrenal P450c21 production, all studies were done with 21B. An initial series of promoter/reporter constructions was built specifically to test the potential roles of the proximal promoter (−300 bp), of upstream regions lacking the A and B regions of Milstone (33) (−3.4 kb), of region A (−3.8 kb), of regions A and B (−4.6 kb), of regions A+B and the Z promoter (−5.0 and −5.6 kb), and far upstream regions (−9 kb) (Fig. 1). These constructs were transfected into human adrenal NCI-H295A cells, as this cell line expresses its endogenous CYP21B gene in a physiologically appropriate, hormonally responsive fashion (45). We have observed substantial species-specific differences in the behaviors of the promoters of the human CYP11A and CYP17 genes for the steroidogenic enzymes P450scc and P450c17 when expressed in human adrenal NCI-H295A cells as compared with their expression in mouse adrenal Y1 cells (43); hence, we also examined the behavior of the various human CYP21B promoter/reporter constructions in Y1 cells. The 300-bp "basal promoter" exhibited substantial activity in the NCI-H295A cells. Addition of DNA segments to −3.8 kb (including the A region of Milstone) or −4.6 kb (including the A+B region of Milstone) had little effect, but addition of DNA to −5.0 kb (including the "Z" promoter) increased reporter activity about 2.6-fold over the basal C21/0.3kb/Luc construct (Fig. 2A). This increase was maintained in the longer C21/−5.6kb/Luc construct. A generally similar pattern was seen when these same constructs were transfected into mouse adrenal Y1 cells; constructs containing up to −4.6 kb showed little change compared with −0.3 kb, but activity increased 3.5-fold with DNA to −5.0 kb and 5.2-fold with DNA to −5.6 kb (Fig. 2B). No activity was detected for any of these constructs when transfected into human placental JEG-3 cells, human liver HepG2 cells or monkey kidney COS-1 cells (data not shown). Thus, the activity of the human CYP21B promoter is adrenal-specific and shows major activity in the basal 300-bp region and in the −4.6 to −5.0 kb region.

TABLE I (fragment). CAGAATGTAGCATCTGCTGGAGA / GTCTTACATCGTAGACGACCTCT
To examine further the contribution of the DNA sequences between −4.6 and −5.6 kb, we built internal deletion constructs C21/0.3+(4.6-5.0)/Luc and C21/0.3+(4.6-5.6)/Luc containing the 0.3-kb basal promoter fused to 400-bp (4.6-5.0) or 1-kb fragments (4.6-5.6) of the Z promoter, thus deleting the entire DNA between −300 bp and −4.6 kb (Fig. 1). When expressed in NCI-H295A cells or Y1 cells, both of these constructions exhibited levels of activity comparable to the 5.0- or 5.6-kb constructs, indicating that the elements in the Z promoter can exert a positive activity in the absence of the A and B regions or other DNA between −300 and −4.6 kb (Fig. 2). Two additional constructs (Z235 and Z542), which had previously been used in the study of Z gene transcription (25), were also analyzed. For methodological consistency, these Z constructs were recloned into the same PGL3-Basic vector used for the other constructs. The Z constructs, which lack the first 4667 bp of the 21B promoter and hence examine the region between −4667 and −5209, had activities corresponding to the degree of increase observed between −4.6 and −5.0 kb in NCI-H295A cells and activity comparable to the 300-bp basal promoter in Y1 cells. Thus, our promoter/reporter analyses localized positive regulatory elements between −4.6 and −5.0 kb.
Identification of Specific Functional cis Elements-To identify the active DNA sequences between −4.6 and −5.0 kb, we performed DNase I footprinting of this region by analyzing two overlapping fragments that extended from approximately −4.6 to −4.8 and −4.8 to −5.0. When incubated with nuclear extracts from NCI-H295A cells or Y1 cells, no footprinted regions were seen between −4.6 and −4.8 (data not shown); however, two clear and reproducible footprints were seen between −4.8 and −5.0 (Fig. 3A). These footprinted regions lie within the Z promoter close to the Z cap site at −4676 with respect to CYP21B (25); therefore, the base numbers in this region are described with respect to the Z promoter. On the antisense strand, the upstream footprint (designated F1) encompasses the sequence 5′-CGTCCATGATGCAAGACTCTGC-3′, from bases −200 to −179 of the Z promoter (−4876 to −4855 of CYP21B) (25). The downstream footprint (designated F2) encompasses the sequence 5′-CGACTGGGGCAAGGTCACCCTCTGGGAA-3′, corresponding to bases −159 to −132 of the Z promoter (−4835 to −4799 of CYP21B) (25). This region includes a sequence nearly identical to the consensus PyCAAGGTCA sequence (underlined) recognized by SF-1 (53) and includes a DNase I-hypersensitive G at position 156. On the sense strand, the closely corresponding regions −201 to −179 and −156 to −114 were footprinted, showing that these two regions interact with proteins that protect both strands. A third, rather poorly resolved footprint was also seen at −129 to −114 (GAAGTCACCAGAGACC) on the antisense strand, but was not seen on the sense strand. Thus, DNase I footprinting localized DNA-protein interactions to the same −4.8 to −5.0 kb region that constitutes the Z promoter (25) and which confers increased cell-specific transcription to the CYP21B proximal promoter.
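Because positions in this region are reported relative to both the Z cap site and the CYP21B cap site, a small conversion helper may be useful. The sketch below assumes only the offset stated in the text (the Z cap site lies 4676 bases upstream of the CYP21B cap site) and that no zero-position adjustment is needed; the function and constant names are hypothetical.

```python
# Offset stated in the text: the Z cap site lies 4676 bases upstream
# of the CYP21B cap site (25).
Z_CAP_UPSTREAM_OF_CYP21B = 4676


def z_to_cyp21b(z_position):
    """Convert a position relative to the Z cap site (negative = upstream)
    into a position relative to the CYP21B cap site."""
    return z_position - Z_CAP_UPSTREAM_OF_CYP21B


def convert_range(z_start, z_end):
    """Convert both ends of a Z-relative range."""
    return (z_to_cyp21b(z_start), z_to_cyp21b(z_end))


# F1 footprint on the antisense strand, -200 to -179 of the Z promoter.
f1_footprint = convert_range(-200, -179)
# Core F1 and F2 elements as given in the abstract.
f1_core = convert_range(-195, -186)
f2_core = convert_range(-151, -142)
print(f1_footprint, f1_core, f2_core)
```

The converted F1 footprint and the two core elements agree with the CYP21B-relative coordinates quoted for them in the text.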
To characterize the DNA/protein interactions in the regions identified by DNase I footprinting, we performed a series of electrophoretic mobility shift assays. Double-stranded oligonucleotides encompassing bases −162 to −129 and −205 to −177, corresponding to the two principal footprints seen in Fig. 3A, each produced DNA/protein complexes (Fig. 3B), but a double-stranded probe encompassing bases −134 to −106 did not yield a specific complex (data not shown). The −162/−129 probe formed a nonspecific complex and a distinct specific complex, termed complex II, that could be competed by a 20-fold or 100-fold molar excess of unlabeled −162/−129 DNA, but not by a 100-fold molar excess of an unrelated sequence encompassing bases −85 to −66 of the human CYP11A (P450scc) promoter (54). The −205/−177 probe consistently produced a diffuse complex that appeared to consist of 2-3 bands, which were collectively termed complex I. Complex I migrated more slowly than a lower nonspecific complex, which was not competed. Altering the size of the double-stranded probe or electrophoresis of the complex at lower temperatures, higher voltage or varying salt conditions did not increase the resolution of complex I. Complex I was competed by a 20- or 100-fold molar excess of unlabeled −205/−177 DNA, but not by a 100-fold molar excess of oligonucleotides corresponding to the −85/−66 regions of human P450scc promoter. Thus, the two footprinted regions form different specific DNA/protein complexes; footprint F1 corresponds to complex I, and footprint F2 corresponds to complex II.
Characterization of the DNA/Protein Interactions-Wild type and mutant oligonucleotides spanning the sequence covered by footprint F2 were used to perform a series of electrophoretic mobility shift experiments (Fig. 4). In the presence of the wild-type −162/−129 probe, competitor DNA mutated at −162 to −158, −136 to −129, or −159 to −150 bases would still compete for formation of complex II equivalently to the competition observed with the wild-type sequence, but unlabeled oligonucleotides containing internally mutated bases −150 to −142 or −149 to −144 did not compete for formation of complex II (Fig. 4A). When mutant probes were used, formation of complex II was still observed with mutations at bases −162 to −158, −159 to −150, or −136 to −129 of the −162/−129 sequence but mutation of the 6 internal bases (−149/−144) eliminated the formation of complex II (Fig. 4B). Thus, complex II requires the AAGGTC core sequence, suggesting that the protein forming complex II may be SF-1. This was confirmed by three additional experiments. First, an oligonucleotide comprising bases −84 to −55 of the rat P450c17 promoter, which has previously been shown to bind SF-1 at bases −69 to −58 (55), specifically competed complex II, but its mutant counterpart did not (Fig. 4C). By contrast, the rat P450c17 probe −447 to −419, which binds COUP-TF, NGFI-B and two newly described proteins called StF-IT-1 and -2 (56), and oligonucleotides encompassing −155 to −131 and −85 to −66 of human P450scc, which bind a variety of factors (54, 57), did not compete for the formation of complex II. Second, gel mobility shift assays performed using SF-1 protein transcribed and translated in vitro produced a DNA-protein complex of the same mobility as that observed with NCI-H295A nuclear extract whereas the mutant probe did not (Fig. 4D). Third, as the CAAGGTCA sequences can also bind NGFI-B (a cAMP-induced early response gene also involved in transcriptional regulation of steroidogenic genes), mobility shift assays were performed using the −162 to −129 wild type probe with cAMP treated and untreated NCI-H295A nuclear extracts in order to identify a different DNA-protein complex on cAMP stimulation. Both extracts produced complexes of identical mobility (data not shown). Thus, competition with heterologous DNA, mutated homologous DNA, and SF-1 protein produced in vitro all strongly suggest that the GCAAGGTCAC sequence between bases −151 to −142 functions by binding SF-1.
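The positions of the core motifs within the footprinted sequences can be cross-checked by locating each motif in the footprint sequence and offsetting by the coordinate of the footprint's first base. The sketch below uses the F1 and F2 antisense-strand footprint sequences and their Z-relative start coordinates as given in the text; the helper name is hypothetical, and it assumes each motif occurs once in its footprint.

```python
def locate(motif, footprint_seq, first_base):
    """Return (start, end) Z-relative coordinates of the first occurrence
    of motif in footprint_seq, or None if the motif is absent.

    first_base is the coordinate of the first (5') base of footprint_seq;
    coordinates increase toward the cap site.
    """
    index = footprint_seq.find(motif)
    if index < 0:
        return None
    start = first_base + index
    return (start, start + len(motif) - 1)


# Footprint sequences and start coordinates as reported in the text.
F1_SEQ, F1_START = "CGTCCATGATGCAAGACTCTGC", -200        # -200 to -179
F2_SEQ, F2_START = "CGACTGGGGCAAGGTCACCCTCTGGGAA", -159  # -159 to -132

f1_mutated_core = locate("GATGCA", F1_SEQ, F1_START)
f2_mutated_core = locate("AAGGTC", F2_SEQ, F2_START)
f2_sf1_site = locate("GCAAGGTCAC", F2_SEQ, F2_START)
print(f1_mutated_core, f2_mutated_core, f2_sf1_site)
```

Under these assumptions the computed positions match the coordinates used for the mutated cores of F1 and F2 and for the proposed SF-1 site.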
Activities of the Two cis-Acting Elements to Drive a Heterologous Promoter-To assess the functional significance of the DNA elements comprising F1 and F2, we assessed their ability to stimulate transcription from a heterologous promoter. One, two, and three copies of wild type and mutant oligonucleotides encompassing the F1 (−205/−177) and F2 (−162/−129) elements were cloned in both forward and reverse orientations upstream from a heterologous promoter/reporter system (HSV-TK32/Luc) and transient transfection assays were performed in NCI-H295A, JEG-3, COS-1, HepG2, MA-10, HeLa, and Y1 cells. Mutant reporter constructs were also tested that contained the −193 to −188 and the −149 to −144 substitutions that abolished the formation of complexes I and II, respectively. In NCI-H295A cells, each of the wild type constructs drove the HSV-TK32 promoter in an orientation-independent manner, although activities were generally lower in the reverse orientation (Fig. 5A); similar results were seen in Y1 cells (data not shown). A single copy of the F1 element increased TK32 basal activity 2-fold, while three tandem copies increased activity 7-fold. Similarly, a single copy of the F2 element increased basal TK32 activity 3-fold, while two copies increased activity 8-fold. By contrast, the mutant F1 and F2 elements had considerably lower activity. There was no increase in reporter gene activity when the wild-type F1 or F2 elements were transfected into HepG2, COS-1, MA-10, and HeLa cells, although one copy of the F1 element elicited a 4-fold increase in JEG-3 cells, while its mutant counterpart showed a 2-fold increase (data not shown).
Mutants in either the GATGCA sequence of F1 (−193 to −188) or the AAGGTC sequence of F2 (−149 to −144), or both, were also incorporated into the C21/−5.0kb/Luc and C21/−5.6kb/Luc constructs, the C21/0.3+(4.6-5.0kb)/Luc and C21/0.3+(4.6-5.6kb)/Luc deletion constructs and the two Z promoter constructs Z235 and Z542, and their activities were assessed in NCI-H295A cells (Fig. 5B). The activities of either the C21/−5.0kb/Luc or C21/−5.6kb/Luc wild type constructs were 2.5-3.0-fold higher than the C21/−0.3kb/Luc basal promoter, consistent with the data in Fig. 2 … activity, but mutation of both elements consistently decreased activity. In the constructs lacking the sequences from −0.3 to −4.6 kb, the basal activities were much the same as for the −5.0 and −5.6 kb constructs, again consistent with Fig. 2, but mutating either the F1 or F2 sequences had a more dramatic effect in these internally deleted constructs, reducing activity to the level seen with the 0.3-kb basal promoter. Similarly, the mutation of either element reduced the activity of the two Z promoter constructs. The mutations of both elements did not consistently reduce the level of transcription below the level seen with either single mutation. Thus, both elements had to be present together, indicating a cooperative interaction between them to promote CYP21B gene transcription.

FIG. 3. … In lanes 3, 4, 5, and 6, the labeled 200-bp fragment was incubated with 5, 25, 50, and 60 μg, respectively, of NCI-H295A nuclear cell extract. Lane 9 shows a G sequencing reaction of the same DNA fragment. The protected regions representing nuclear protein binding sites are indicated at −179 to −200 (F1), −132 to −159 (F2), and −114 to −129. Right, DNase I footprint of the same 200-bp PCR probe prepared with 32P-end-labeled vector specific sense primer. Lanes G, A, T, and C show the sequence of the DNA fragment initiated by the same primer. Lanes 1, 2, 7, and 8 show the labeled 200-bp fragment digested with DNase I. Lanes 3, 4, 5, and 6 show the labeled 200-bp fragment incubated with 5, 25, 50, and 60 μg, respectively, of NCI-H295A nuclear extract. The protected regions are indicated at −114 to −156 and −179 to −201 and correspond to the −132 to −159 and −179 to −200 sites observed on the antisense strand. B, footprints 1 and 2 form complexes I and II. Double-stranded 32P-end-labeled probes were incubated with NCI-H295A cell nuclear extracts. Left, the −162/−129 wild-type probe forms complex II and a slower nonspecific complex (NS); competition with 20- and 100-fold excess of unlabeled −162/−129 eliminated complex II, but competition with nonspecific DNA from the human P450scc gene did not. Right, the −205/−177 wt probe formed DNA-protein complex I; competition was observed with 20- and 100-fold excess of unlabeled −205/−177 but not with human P450scc −85/−66 DNA.

FIG. 4. … complex II. B, mutant probes localize complex II to bases −149 to −144. Complex II is formed when the −162/−129 probe is mutated at −159 to −150, −136 to −129, and −162 to −158, but not when mutated at −149 to −144. C, a rat SF-1 site competes complex II. The −162/−129 wt probe forms complex II, which is competed by excess unlabeled −162/−129 and the rat −84/−55 sequence which binds SF-1, but not by the −69/−58 mutant of rat −84/−55, by rat −447/−419 which binds NGFI-B, or human P450scc promoter sequences. D, SF-1 protein produced in vitro forms complex II. Left, incubation of the −162/−129 wt probe with NCI-H295A nuclear extract forms both the nonspecific complex (NS) and complex II, which is competed by excess wild-type but not mutant unlabeled DNA. Incubation of the −162/−129 wt probe with SF-1 protein produced in vitro forms only complex II, and protein prepared from the empty pCDNA3 vector alone formed neither complex. Right, the same experiment as in the left panel, performed with a probe mutated at −149 to −144. Complex II was not formed with NCI-H295A extract or SF-1 protein produced in vitro; however, a nonspecific complex was observed with the nuclear extract.

FIG. 5. Activities of the F1 and F2 elements. A, activities of the −205/−177 F1 element and −162/−129 F2 element linked to the TK32 promoter. One, two, or three copies of the F1 element; one or two copies of the F2 element; one or two copies of F1 mutated at −193 to −188; and F2 mutated at −149 to −144 were assayed in both the forward and reverse orientations. Luciferase activity is expressed relative to the TK32/Luc construct. The data are the means of two independent experiments performed in triplicate. B, activities of the F1 and F2 elements in various CYP21B promoter contexts. The mutant F1 and F2 elements are as in panel A. Data are the mean relative activities from two experiments performed in triplicate.
Identification of a Second Upstream SF-1 Site (F3)-As several experiments indicated that the F2 element bound SF-1, we tested the activity of F2 in COS-1 cells, which lack SF-1. The C21/−5.0kb/Luc contiguous construct, the C21/0.3+(4.6-5.0)kb/Luc internally deleted construct, and the Z235 construct were all inactive in COS-1 cells, and all showed substantial activity when co-transfected with a vector expressing SF-1, consistent with the role of F2 as an SF-1 binding site. However, when the F2 sequence was mutated, SF-1-induced activity persisted, even in the small Z235 construct; thus there appeared to be another SF-1 binding site in Z235 that had not been identified by the footprinting experiments. Four sequences related to the GCAAGGTCA SF-1 consensus were identified between −4.6 and −5.0 kb, including three clustered between −43 and −81 with respect to the Z cap site. Double-stranded oligonucleotides corresponding to all four potential SF-1 sites were tested with nuclear extracts from NCI-H295A cells, but only the −73/−51 oligonucleotide containing the sequence GAAGGACA (−58 to −65 with respect to the Z cap site) formed a specific DNA/protein complex, termed complex III (Fig. 6A). Complex III was competed by a 50-fold molar excess of unlabeled −73/−51 and −162/−129 (F2 element), but not by an oligonucleotide mutated at bases −65/−58, by unrelated DNA from the gene for P450scc, or by the −205/−177 (F1) sequence. A similar result was observed when this probe was incubated with SF-1 protein produced in vitro. The mutant probe was unable to form complex III with either NCI-H295A nuclear extract or SF-1 protein produced in vitro. These results confirmed that this sequence, termed F3, binds SF-1.
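How closely a candidate site resembles the GCAAGGTCA consensus can be illustrated with a short sketch. The Python snippet below is only an illustration and is not the procedure used in this study: it slides the shorter sequence along the longer one without gaps and reports the offset with the most identical bases. The function name and the simple identity count used as a score are assumptions made for the example.

```python
from typing import Tuple


def best_ungapped_match(candidate: str, consensus: str) -> Tuple[int, int]:
    """Return (offset, identities) for the best ungapped alignment.

    The shorter sequence is slid along the longer one; the score is the
    number of identical bases (an assumed, deliberately simple measure).
    """
    short, long_ = sorted((candidate.upper(), consensus.upper()), key=len)
    best_offset, best_score = 0, -1
    for offset in range(len(long_) - len(short) + 1):
        window = long_[offset:offset + len(short)]
        score = sum(a == b for a, b in zip(short, window))
        if score > best_score:  # keep the first offset reaching the top score
            best_offset, best_score = offset, score
    return best_offset, best_score


SF1_CONSENSUS = "GCAAGGTCA"  # consensus quoted in the text
F3_SITE = "GAAGGACA"         # F3 sequence, -58 to -65 from the Z cap site
F2_CORE = "AAGGTC"           # F2 core, -149 to -144 from the Z cap site

print(best_ungapped_match(F3_SITE, SF1_CONSENSUS))
print(best_ungapped_match(F2_CORE, SF1_CONSENSUS))
```

Under this scoring, F3 matches the consensus at 6 of its 8 bases, whereas the F2 core matches at all 6 of its bases.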
To determine the functional significance of F3, this sequence was mutated in the C21/−5.0kb/Luc contiguous construct and the Z promoter construct, and transfected into NCI-H295A cells (Fig. 6B). Consistent with the data in Fig. 2A, the −5.0 kb construct had about 2.5-fold more activity than the 0.3-kb basal promoter. When all three elements in the Z promoter (F1, F2, and F3) were mutated, the activity of the 5.0-kb construct was reduced to that of the 0.3-kb construct. In the context of the 235-bp Z promoter construct, mutation of F1 had no effect, mutation of either of the SF-1 sites (F2 or F3) reduced activity, and mutation of both F2 and F3, with or without mutation of F1, decreased activity further to about 35% of the Z235 wild type construct.
To show that F3 functions by binding SF-1, COS-1 cells were co-transfected with the various promoter-reporter constructs and a vector expressing SF-1 (Fig. 6C). In the absence of SF-1, none of the constructs had significant activity. When co-transfected with SF-1, the 0.3-kb basal promoter (which contains an SF-1 site) showed about a 3-fold induction. The −5.0 kb contiguous construct and the internally deleted 0.3+(4.6-5.0)-kb construct showed robust activity with SF-1 (13- and 20-fold above the 0.3-kb basal promoter, respectively), but mutation of the F1, F2, and F3 sites reduced this to the level of the 0.3-kb basal promoter. Similarly, SF-1 induced substantial activity in the Z235 construct (8-fold above the 0.3-kb basal promoter), which was largely eliminated by mutating the F1, F2, and F3 sites.
Characterization of F1/Complex I-Mutation of F1 in the Z235 construct had no effect (Fig. 6B), but the effects of mutating F1 in the TK32/Luc constructs (Fig. 5A) and in the long constructs (particularly the internally deleted 0.3+(4.6-5.0)-kb construct) (Fig. 5B) indicate that F1 plays a functional role. Therefore, we characterized the protein binding to the −205/−177 F1 region further. Mobility shift experiments showed that oligonucleotides mutated at bases −205 to −199 or at −182 to −177 compete with the wild type −205/−177 probe for formation of complex I, but no competition was observed with mutations at bases −195 to −186 or −193 to −188 (Fig. 7A). Similarly, complex I could still be formed by labeled probes carrying mutations at either end (−182/−177 and −205/−199), but probes carrying internal mutations at −195 to −186 or at −193 to −188 did not form complex I (Fig. 7B). Thus, even though complex I appears to be a multi-protein complex, the formation of all components of this complex was prevented by mutating the internal core GATGCA sequence. To estimate the approximate size of the protein binding to the F1 element, we performed a Southwestern blot (Fig. 8). An SDS gel of proteins from NCI-H295A cell nuclear extract was probed with a single copy of radiolabeled wild type F1 element (−205 to −177), identifying a protein of 97 kDa. By contrast, when the probe was mutated at bases −195 to −186, the protein was not detected. A BLAST search of the F1 sequence identified two proteins: the B-lymphocyte factor NF-W1 and the widely expressed factor NF-W2 (58). Both of these factors bind to the sequence GTTGCATC, which matches the F1 sequence at 7 contiguous bases on the antisense strand. Furthermore, NF-W2 has an estimated size of ~93 kDa (58), which is close to the 97 kDa inferred from Fig. 8. Thus, NF-W2 or a related factor may be a good candidate for the protein binding to F1.
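The antisense-strand match between the NF-W1/NF-W2 site and F1 can be checked by hand or with a few lines of code. The sketch below is an illustration only and does not reproduce the BLAST search itself: it reverse-complements the GTTGCATC site and finds the longest run of contiguous bases shared with the ATGATGCAAG footprint sequence given for F1 in the Discussion. The helper names are hypothetical.

```python
# Complement table for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def longest_shared_run(a: str, b: str) -> str:
    """Return the longest contiguous substring of a that also occurs in b."""
    best = ""
    for i in range(len(a)):
        # Only try substrings longer than the best found so far.
        for j in range(i + len(best) + 1, len(a) + 1):
            if a[i:j] in b:
                best = a[i:j]
            else:
                break  # extending further from i cannot match either
    return best


NFW_SITE = "GTTGCATC"        # NF-W1/NF-W2 binding sequence from the text
F1_FOOTPRINT = "ATGATGCAAG"  # F1 footprint sequence from the Discussion

antisense_site = reverse_complement(NFW_SITE)
shared = longest_shared_run(antisense_site, F1_FOOTPRINT)
print(antisense_site, shared, len(shared))
```

The reverse complement of GTTGCATC is GATGCAAC, and its first seven bases (GATGCAA) occur in the F1 footprint. This run contains the whole GATGCA core whose mutation abolishes complex I.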
DISCUSSION

The CYP21 gene(s) encoding the adrenal steroid 21-hydroxylase are intimately linked to the C4 genes in the major histocompatibility locus in the human, rodent, and bovine genomes (17, 59-61). Although the distance between the duplicated C4A/21A and C4B/21B clusters differs by about 30 kb in the mouse and human genomes (15,18), the 3′ end of a C4 gene is always close to the 5′ end of a CYP21 gene, suggesting a functional significance to this genomic array. We found that a segment of only about 300 bp of CYP21B 5′-flanking DNA was sufficient to confer transcriptional activity above basal levels in human adrenal NCI-H295A cells and mouse Y1 cells, confirming previous reports with both the mouse (29-33) and human (39,40) promoters. This region contains several sites that might bind transcription factors, including SF-1, SP1, Nur77 (51), and various cAMP response factors (39), which may include SF-1 itself (55). SP1 is expressed ubiquitously and increases the basal transcription of many promoters and thus may play a role in CYP21B transcription. Nur77 and SF-1 are specifically associated with transcriptional regulation in the adrenals and gonads, and may be involved in the tissue-specific expression of genes for steroidogenic enzymes. SF-1 appears to be a constitutively acting steroidogenic factor, while Nur77 binds to a very similar cis element to confer cAMP responsiveness. Thus, the 300-bp CYP21B basal promoter contains several elements that appear to be crucial for basal, tissue-specific, and hormonally induced transcription. However, the 300-bp basal promoter is clearly insufficient for the expression of 21-hydroxylase in mice in vivo (38); therefore, we studied another 9 kb of upstream DNA.
The search for upstream regulatory elements was guided by the studies of Milstone et al. (38), which identified the A and B elements located 5.3 and 6.0 kb upstream from the mouse CYP21A transcriptional start site, and by our previous identification of the adrenal-specific Z promoter located 4676 bp upstream from the human CYP21B transcriptional start site (25). There are important differences in the mouse and human C4/C21 gene complexes, with a greater distance between C4 and CYP21 in the mouse; thus, Milstone's −5.3 and −6.0 kb elements correspond to ~−3.5 and ~−4.0 kb in the human sequence (15,25), so that the murine region corresponding to the human Z promoter was not examined by Milstone et al. (38). We found that sequences up to 4.6 kb upstream from the human CYP21B transcription initiation site had little effect on the transcriptional activity of this promoter in either NCI-H295A or Y1 cells, even though this DNA contained the regions corresponding to Milstone's A and B elements that appear to be crucial to the mouse gene. However, sequences further upstream between −4.6 and −5.6 kb increased transcription from the C21 promoter up to 3-fold in NCI-H295A cells, and to 3-5-fold in Y1 cells.

FIG. 7. Characterization of complex I by electrophoretic mobility shift assays. A, labeled wild type F1 element incubated with nuclear extract from NCI-H295A cells and with various competitor DNAs. The −205/−177 probe forms a nonspecific complex and complex I, which can be competed by a 5-20-fold molar excess of −205/−177. Oligonucleotides containing mutations at bases −195/−186 or bases −193/−188, or an unrelated sequence from P450scc, did not compete for the formation of complex I, but sequences mutated at either end (−205/−199 and −182/−177) did compete. B, mutant probes incubated with NCI-H295A nuclear extract and various competitors. Retention of the −193/−188 sequence in the probe results in the formation of complex I, which can be competed specifically as in panel A, but mutation of the −193/−188 sequence eliminates formation of complex I.

FIG. 8. Southwestern blot. Nuclear proteins from NCI-H295A cells were displayed by electrophoresis on several lanes of an SDS 10% acrylamide gel, blotted to polyvinylidene difluoride membranes, and probed with 32P-labeled double-stranded oligonucleotides. Wild type probe comprised bases −205 to −177 of the Z promoter, and the mutant was altered at bases −193 to −188. The first (blank) lane contains the non-radioactive molecular size standards shown at the left. The wild type probe consistently detects a single protein of about 97 kDa.
The region between −4.6 and −5.6 kb encompasses most of intron 35 of the C4 gene. Transcription of the human C4 gene is under the regulation of a strong, almost exclusively liver-specific promoter whose regulatory elements appear to be confined to the first 200 bp, although there is a low level of C4 transcription in the human adrenal (23). However, within its intron 35, the C4 gene contains the Z promoter, which drives transcription of the Z transcript in an adrenal-specific manner (25). The Z transcript is initiated at base −4676 with respect to the CYP21B cap site, 55 bp upstream from the 5′ end of exon 36 of human C4; it overlaps the last seven exons of the C4 gene, and maintains the same open reading frame (25). The Z transcript is found only in the adrenal cortex, and the Z promoter can function in human adrenal NCI-H295A cells but not in human placental JEG-3 cells, mouse Leydig MA-10 cells, or monkey kidney COS-1 cells, suggesting that these adrenal-specific elements may participate in the transcriptional regulation of the CYP21B gene (25). Our present results show that the Z promoter enhances transcription of the CYP21 gene, as the DNA between −4.6 and −5.6 kb conferred a substantial increase in transcriptional activity, either with or without retention of the DNA between −300 bp and −4.6 kb. Deletional mutagenesis then identified the DNA between −4.6 and −5.0 kb as the important region. The −4.6 to −5.0 kb region encompasses the Z promoter in intron 35 of human C4. The corresponding region of mouse C4 intron 35 has substantial sequence similarity, but the mouse and human sequences diverge drastically upstream from −4.9 kb (base −222 of the Z promoter) (25), further indicating the importance of this region.
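Because positions are quoted relative to either the Z cap site or the CYP21B cap site, it can help to convert between the two frames. The following minimal sketch assumes a simple additive offset, taking the Z cap site at −4676 relative to the CYP21B cap site as stated above. The constant and function names are hypothetical and serve only to illustrate the arithmetic.

```python
# Position of the Z cap site relative to the CYP21B cap site (from the text).
Z_CAP_RELATIVE_TO_CYP21B = -4676


def z_to_cyp21b(pos_from_z: int) -> int:
    """Convert a position relative to the Z cap site to CYP21B coordinates."""
    return Z_CAP_RELATIVE_TO_CYP21B + pos_from_z


def cyp21b_to_z(pos_from_cyp21b: int) -> int:
    """Convert a position relative to the CYP21B cap site to Z coordinates."""
    return pos_from_cyp21b - Z_CAP_RELATIVE_TO_CYP21B


# Base -222 of the Z promoter, where mouse and human sequences diverge.
print(z_to_cyp21b(-222))
# F3 site, -58 to -65 from the Z cap site.
print(z_to_cyp21b(-58), z_to_cyp21b(-65))
```

With this offset, base −222 of the Z promoter corresponds to −4898 in CYP21B coordinates, that is, about −4.9 kb, and the F3 site at −58 to −65 corresponds to −4734 to −4741.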
DNase I footprinting studies identified three potential protein binding sites in the conserved DNA just upstream from the Z cap site. Electrophoretic mobility shift assays confirmed that two of these regions formed specific DNA-protein interactions. Footprint F2 contained a CAAGGTCA motif, which may be recognized by the transcription factors NGFI-B, SF-1, NP-III, and ApoCIIIp2; competitive gel mobility shift studies and transfection studies indicated that this footprinted DNA binds SF-1. Footprint F2 corresponds exactly to the sequence we previously suggested may be an SF-1 binding site participating in the regulation of the Z promoter (25). Bandshifts and functional studies demonstrated that another site (F3), at −58 to −65 of the Z promoter, binds SF-1. Another putative SF-1 site lies within the −300 bp proximal promoter sequence of CYP21B, but actual binding of SF-1 to this proximal element has not been established. Thus, our deletional mutagenesis data, including excision of the DNA from −300 bp to −4.6 kb, suggest that all three of these SF-1 sites are needed for full transcriptional activity of the human CYP21B gene.
SF-1 regulates the adrenal and gonadal expression of steroidogenic genes and is required for the differentiation of these tissues (37,62). SF-1 is also essential for basal transcription of the gene for the human ACTH receptor (63) and acts synergistically with an early growth response protein (Egr-1) to increase expression of the rat gene for the β subunit of luteinizing hormone (64). Thus, while SF-1 may be necessary for adrenal-specific expression of CYP21B, other factors are clearly required to limit its expression to the adrenal cortex.
The SF-1 site in footprint F2 lies 35 bases downstream from the ATGATGCAAG sequence comprising footprint F1. Mobility shift experiments indicated that at least one, and possibly as many as three, proteins bind to the F1 element, as several shifted bands were seen. Mutation of bases GATGCA in this sequence abolished the band shift pattern and reduced transcriptional activity of the complete CYP21B promoter and of this region fused to the basal TK32 promoter. Southwestern blotting identified a protein of about 97 kDa, and BLAST searches suggest it may be related to NF-W2 (58).
Thus, the Z promoter in the C4 gene is indeed an intrinsic part of the CYP21 promoter that participates in basal adrenal-specific expression of CYP21, providing the evolutionary constraint for keeping these otherwise unrelated genes tightly linked. Both the CYP21A and CYP21B promoters are functional only in the adrenal, albeit at substantially different levels, and the nearby ZA, ZB, and XA promoters also appear to be adrenal-specific. Thus, it is not clear whether a single adrenal-specific element drives the CYP21A and 21B genes, or whether each has its own adrenal-specific element. However, the minimal expression of C4 in the adrenal suggests each gene cluster has its own locally acting element.