Characterization of the upstream sequence of the human CYP11A1 gene for cell type-specific expression.

The CYP11A1 gene encodes the cholesterol side-chain cleavage enzyme P450scc, which catalyzes the synthesis of steroids from cholesterol. This gene is expressed only in steroidogenic organs such as the adrenal, gonad, placenta, and brain. We have characterized an upstream regulatory element of the human CYP11A1 gene, termed AdE, which contributed to its cell type-specific expression. The AdE sequence contains two protein binding regions, AdE1 and AdE2, which bind many proteins including NF1- and Sp1-like proteins as shown by electrophoretic mobility shift assay, footprinting, competition, antibody supershift, and mutagenesis of the binding sites. When cloned in front of the CYP11A1 promoter or the heterologous thymidine kinase promoter, AdE sequences enhanced expression of the reporter gene in steroidogenic cell lines of the adrenal, gonad, and placental origin but not in nonsteroidogenic cell lines such as COS-1 and Rat-1. The function of AdE1 and AdE2 was lower when present individually than together. The combined action of multiple transcription factors binding to the AdE sequence brings about the final activation of the CYP11A1 gene in a tissue-specific manner.

The CYP11A1 (SCC) gene encodes the enzyme cytochrome P450scc (cholesterol side-chain cleavage enzyme) that catalyzes the conversion of cholesterol to pregnenolone, the first and rate-limiting step in the synthesis of all steroids (1). P450scc functions as a monooxygenase in the mitochondrion, using electrons transported from its cofactors ferredoxin reductase and ferredoxin for oxidation/reduction reactions. The human SCC gene is located on chromosome 15 (2). Its expression follows a developmentally programmed, cell type-specific, and hormonally regulated pattern. P450scc first appears in the adrenal primordia and fetal gonads at gestational days 11-12 in rodent embryos (3,4). The expression of P450scc is further stimulated by adrenocorticotropin and gonadotropin in the adrenal and gonads, respectively, using cAMP as an intracellular mediator (5). In addition, there are other sites of P450scc expression. The placenta expresses P450scc to provide the progesterone necessary for pregnancy (2). The brain and the primitive gut of the mouse embryo express a small amount of P450scc (6,7). With a few exceptions (8), regulation of P450scc gene expression in most cell types is at the transcriptional level (9).
The cis-acting elements that control the tissue-specific and hormonal regulation of the SCC gene have been under extensive investigation (10). Sites responsible for cAMP-dependent expression have been identified. One site close to the basal promoter consists of G-rich sequences that bind Sp1-like proteins (11,12). This G-rich sequence is also found in other steroidogenic genes such as ferredoxin (13), CYP21 (14), and CYP19 (15). Another cAMP-responsive sequence is located further upstream and contains sequences similar to the cAMP-responsive element termed CRE (16). The sequence controlling the phorbol ester response was found close to the basal promoter of the SCC gene (17,18).
In different cell types, the expression of the SCC gene follows different regulatory mechanisms. The adrenal and placenta use different control elements for gene expression and cAMP stimulation (19,20). Glioma cell line C6 also contains transcription factors that are different from those in the adrenal cell line Y1 for gene expression (21). The adrenal and gonads are derived from the same progenitor cells (4); therefore it is not surprising that they share the same transcriptional control elements.
One major transcription factor in the adrenal and gonad that controls SCC gene expression is steroidogenic factor 1 (SF1 or Ad4BP). SF1 is a member of the nuclear hormone receptor family that binds to the AAGGTCA sequence (22). Almost all steroidogenic genes, including SCC, contain an AAGGTCA sequence and are transcriptionally stimulated by SF1 (23). SF1, however, is not the only factor that controls tissue-specific expression of steroidogenic genes. P450c21 and P450c11 are expressed only in the adrenal, despite the abundant expression of SF1 in the gonad. Therefore, there must be other transcription factors that control the adrenal-specific expression of P450c21 and P450c11. The equal distribution of SF1 in all three zones of the adrenal cortex cannot explain zone-specific transcription of CYP11B1 (24). In addition, SF1 alone does not achieve the highest level of expression observed in these cell types.
Previously we identified a DNA region 1.9 kilobase pairs upstream from the transcription start site that augments transcription of the human SCC promoter above the basal level (11). In this report we have further characterized this region and found that multiple proteins, including Sp1- and NF1-like proteins, bind to it. The combined action of these proteins upon binding to the upstream sites resulted in cell type-specific enhancement of transcription of the human SCC gene.

MATERIALS AND METHODS
Cell Culture-Y1 and H295 are mouse and human fetal adrenocortical cell lines, respectively (25,26). JEG-3, a human placental cell line (27); MA10, a mouse testis Leydig tumor cell line (28); COS-1, an SV40-transformed simian cell line (29); and Rat-1, a rat embryo fibroblast cell line (30), were grown as described before.
* This work was supported by Grant NSC84-2311-B001-078 from the National Science Council and by Academia Sinica, Republic of China. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 ...
Oligonucleotides-The oligonucleotides used in this study are listed in Table I.
Plasmid Construction-Plasmids ptkCAT, pSCC#12, pSCC145, and pSCC577 have been described before (11,13). Clone 1F was constructed by ligating an oligonucleotide spanning −1903 to −1822 into the XbaI site of ptkCAT. Clones 2F, 3F, and 4F are the same size as 1F, with mutations at −1852/−1853, −1860/−1861, and −1886/−1887, respectively. Clones AdE1 and AdE2 and the mutant clones of AdE1 (M2 and M4) were obtained by inserting the respective oligos into the XbaI site of ptkCAT. Plasmid pSCC1800 was obtained by inserting the ClaI/HindIII segment (−1822 to +55) from the CYP11A1 genomic clone (31) into pUC13CAT. The clones containing AdE1 in front of the native promoter (pSCC145AdE1) were constructed by inserting the corresponding oligonucleotides into the SacI site of pSCC145.
Transfection and CAT Assay-Cells were transfected by the calcium phosphate procedure (32) with 5 μg of test plasmids and 3 μg of RSVβGal or 0.3 μg of CMVβGal plasmids. Cell extract was prepared 2 days after transfection. CAT activity was measured and normalized against β-galactosidase activity as an internal control. At least three independent transfections were performed, and their mean values and standard deviations were calculated.
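The normalization described above can be illustrated with a minimal Python sketch. It divides CAT activity by β-galactosidase activity for each transfection and summarizes the result as a mean and sample standard deviation. The function names and all numeric readings are hypothetical, and expressing activity as fold over the mean of a control plasmid is an assumption made for illustration, not a procedure stated in the text.

```python
from statistics import mean, stdev


def normalized_activity(cat, bgal):
    """Divide CAT activity by beta-galactosidase activity from the same extract."""
    if bgal <= 0:
        raise ValueError("beta-galactosidase activity must be positive")
    return cat / bgal


def fold_activation(test, control):
    """Return (mean, sample SD) of fold activation over the control plasmid.

    test, control: lists of (cat, bgal) pairs, one pair per independent
    transfection. At least three transfections are required, as in the text.
    """
    if len(test) < 3 or len(control) < 3:
        raise ValueError("need at least three independent transfections")
    test_norm = [normalized_activity(c, b) for c, b in test]
    ctrl_mean = mean(normalized_activity(c, b) for c, b in control)
    folds = [t / ctrl_mean for t in test_norm]
    return mean(folds), stdev(folds)


# Hypothetical readings (arbitrary units), not data from this study.
control_reads = [(10.0, 2.0), (12.0, 2.0), (8.0, 2.0)]
test_reads = [(60.0, 2.0), (70.0, 2.0), (80.0, 2.0)]
fold_mean, fold_sd = fold_activation(test_reads, control_reads)
print(f"fold activation: {fold_mean:.1f} +/- {fold_sd:.1f}")
```

Normalizing each extract before averaging keeps differences in transfection efficiency between dishes from inflating the spread of the reported values.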
Protein-DNA Interaction-The preparation of nuclear extract and the method of electrophoretic mobility shift assay, competition, and footprinting were as described before (11).

RESULTS
Function of AdE1 and AdE2-We have previously identified a positive regulatory region located at −1903 to −1845 upstream from the transcription start site of the human SCC gene. This region contains two protein binding sites termed AdE1 and AdE2 (11). The regulatory function of AdE1 and AdE2 was further characterized by transfecting mouse adrenal Y1 cells with plasmids containing these elements in front of a thymidine kinase promoter and the chloramphenicol acetyltransferase reporter gene (Fig. 1). Clone 1F, which contains both AdE1 and AdE2, activated transcription about 7-fold over the control. AdE1 alone showed lower activation, whereas AdE2 by itself was nonfunctional. These results suggest that the combined action of AdE1 and AdE2 is required for maximal stimulation.
Proteins Binding to the Upstream Elements AdE1 and AdE2-As a first step to investigate which kinds of proteins may bind to these two sites, we examined the sequences of AdE1 and AdE2. In AdE1, two sequences, TGG(C/A)(N)5GCCAA (33) and CGGAAGT (34), match the consensus for transcription factor NF1- and Ets-binding sites, respectively (Fig. 2A). AdE1 also contains a sequence that has only one mismatch out of eight nucleotides for the Sp1-binding site. The protein contact points of AdE1 detected previously by footprinting fall on the putative Sp1- and NF1-binding sites, whereas the putative Ets site was found free of protein binding (11).
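The kind of sequence inspection described above can be sketched in Python as follows. The NF1 pattern TGG(C/A)(N)5GCCAA and the Ets site CGGAAGT are taken from the text. The demonstration sequence is invented and is not the AdE1 sequence, the function names are hypothetical, and the Sp1 consensus used here is the seven-nucleotide GGGGCGG quoted in the Discussion, because the eight-nucleotide consensus used for the mismatch count is not spelled out in the text.

```python
import re

# NF1 consensus TGG(C/A)(N)5GCCAA written as a regular expression.
NF1_PATTERN = re.compile(r"TGG[CA][ACGT]{5}GCCAA")
ETS_SITE = "CGGAAGT"


def find_nf1_sites(seq):
    """Return 0-based start positions of exact NF1 consensus matches."""
    return [m.start() for m in NF1_PATTERN.finditer(seq.upper())]


def best_match(seq, consensus):
    """Return (position, mismatches) for the window closest to the consensus.

    Ties are resolved in favor of the leftmost window.
    """
    seq = seq.upper()
    k = len(consensus)
    best = None
    for i in range(len(seq) - k + 1):
        mismatches = sum(a != b for a, b in zip(seq[i:i + k], consensus))
        if best is None or mismatches < best[1]:
            best = (i, mismatches)
    return best


# Invented demonstration sequence (an assumption, not the AdE1 sequence).
demo = "AATGGCTTACGGCCAATTCGGAAGTAAGGGGTGGAA"

nf1_positions = find_nf1_sites(demo)
ets_position = demo.find(ETS_SITE)
sp1_position, sp1_mismatches = best_match(demo, "GGGGCGG")
print(nf1_positions, ets_position, sp1_position, sp1_mismatches)
```

A near-match search of this kind only flags candidate sites. Whether a protein actually binds, as with the unoccupied Ets site here, has to be settled by footprinting or mobility shift assays.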
Radiolabeled oligos corresponding to AdE1 and AdE2 were incubated with protein extracts from Y1 cells in electrophoretic mobility shift assays. AdE1 formed several protein-DNA complexes, which could be competed by a 100-fold excess of unlabeled AdE1 itself (lanes 1 and 2, Fig. 2B). Other minor bands at the lower part of the gel represent nonspecific complexes that cannot be competed by any oligonucleotides. An oligo FIB2.6, which contains the consensus sequence for NF1 (35), was used as a competitor in this assay. It competed for the binding of the major complexes. Two minor complexes were revealed after the major complexes were competed by the FIB2.6 oligo (lane 4). ... complex was further demonstrated by a supershifted band upon interacting with Sp1 antibody (lane 6). Therefore, Sp1- and NF1-like proteins were found to bind to the AdE1 sequence. One protein complex remained unidentified.
FIG. 2. AdE elements bind many proteins including NF1 and Sp1 as shown by electrophoretic mobility shift assays. A, sequence that contains AdE1 and AdE2. The sequence is overlined at the protected regions for the sense strand and underlined for the antisense strand. The G residues that are protected from methylation are marked with filled circles at the top or bottom of the sequence for the sense or antisense strand, respectively. The sequence that matches the Ets protein-binding site is indicated by a hatched box. B, protein binding pattern with AdE1 oligo. Labeled AdE1 oligo was bound to 5 μg of nuclear proteins from Y1 cells (lanes 1 and 3). The complexes were competed with unlabeled competitors as shown at the top of each lane (lanes 2 and 4-6) or treated with anti-Sp1 antibody (lane 5). The gel at the right (lanes 3-6) is overexposed to reveal the faint Sp1-like complex and the super-shifted band (labeled by an arrowhead). The letter F indicates free probes. C, protein binding pattern with AdE2 oligo. AdE2 probe was incubated with Y1 extract (lanes 1, 3, and 6) ...
The putative Ets-binding site did not appear to bind any protein in electrophoretic mobility shift, competition, or footprinting assays (data not shown).
The AdE2 sequence contains one site that deviates from the Sp1 consensus by only one nucleotide (Fig. 2A). This putative Sp1 site contacted Y1 proteins in previous footprinting analysis (11). The AdE2 oligo, when used in electrophoretic mobility shift assay, formed two major protein-DNA complexes with Y1 nuclear extract (Fig. 2C). An AdE2M2p28 mutant oligo, whose GG sequence at the protein contact site (nucleotides −1886 and −1887) was mutated to AC, could not compete for the binding of either complex. Therefore, this GG sequence is important for the formation of both complexes. Sp1 is present in complex B1, as shown by its disappearance upon competition by the Sp1 oligo (lane 5) and the supershifted band with Sp1 antibody (lane 7). The affinity of the AdE2 sequence toward Sp1 was low, because excess unlabeled AdE2 oligo could not completely compete for binding to complex B1, as shown in the overexposure of the gel in lane 4. This low affinity could be due to the one-base pair mismatch of AdE2 with the Sp1 consensus sequence.
Effect of AdE Mutation on Function and Protein Binding-To investigate the importance of various bases in the AdE1 and AdE2 sequences, oligonucleotides with mutations at the protein contact sites were synthesized. The M2 oligo has mutations in the Sp1-binding site of AdE1, whereas the M4 oligo has mutations in the NF1-binding sequence (Table I). When used in the electrophoretic mobility shift and competition assay, the M4 oligo could not compete for the binding of the major NF1 complexes (Fig. 3A, lane 2). The M2 oligo, on the other hand, had the same competition pattern as the FIB2.6 oligo, which binds to the NF1 site (Fig. 3A, lanes 3 and 4). This indicated that M2 could bind to the NF1- but not the Sp1-binding site. These competition data correlate well with the binding specificity of NF1 and Sp1.
The binding specificity was further investigated by footprinting and competition (Fig. 3B). The footprint formed by Y1 nuclear extract could be completely competed by AdE1 as well as FIB2.6 oligos. Mutant M2, which is mutated at the Sp1 site, also competed for the formation of the footprint. Another mutant M2p20, which had two mutations at the NF1 site, could not compete. It therefore appears that the major protein-binding site is the NF1 site. Sp1 binding was too weak to be detected well in this competition assay. These data correlated with the electrophoretic mobility shift data showing NF1-like proteins as the major binding proteins in AdE1.
The function of these AdE mutants was further assessed in front of a heterologous tk promoter (Fig. 4). Mutations of the NF1 site, as in clone 2F or clone M4, did not significantly reduce the enhancer activity. The other mutant clone 4F, with a mutation at the Sp1 site of AdE2, also had only a slight reduction in activity. Only mutations at the Sp1 site of the AdE1 element, either in the context of the longer clone 3F or in the shorter clone M2, significantly reduced activity. These results indicated that the Sp1-binding site of AdE1 was the most important for function in this promoter setting. Destruction of the other binding sites caused only a slight decrease in activation function.
Cell Type Specificity of AdE1 Elements-The function of AdE1 was tested when it was placed in front of its native promoter (Fig. 5). Clones pSCC145, 577, and 1800, which are devoid of the upstream AdE sequence, all showed low activity. Clone pSCC145AdE1, which has the AdE1 sequence inserted in front of 145 base pairs of the SCC promoter, showed elevated expression. This indicated that AdE1 functioned well in front of its native promoter in Y1 cells.
In addition to Y1 cells, a nonsteroidogenic cell line, COS-1, was also used to test for AdE function. In contrast to the results in Y1 cells, reporter gene activities of all the clones with decreasing lengths of the SCC 5′-flanking region were very low in COS-1 cells, including the ones with the AdE sequence (Fig. 5). The control plasmid RSVCAT exhibited very high reporter gene expression, showing that the low level of gene expression was not due to lower transfection efficiency. Therefore the SCC promoter and the AdE1 sequence are not functional in COS-1 cells.
To further test the function of AdE1 in other cells, we transfected plasmids driven by the tk promoter into various cell lines (Table II). In steroidogenic cell lines Y1, H295, MA10, and JEG-3, clones 1F and AdE1 invariably had higher activity than ptkCAT. Therefore AdE elements enhanced gene expression in all these cell lines. Plasmid pSCC#12, which is longer than 1F at both ends, had lower activity than 1F. This indicated that the extra sequences in pSCC#12 may be inhibitory for function in all these cell lines. The strength of this inhibition, however, varied among these cells. It was most severe in JEG-3 cells, lowering the activity of pSCC#12 to below that of ptkCAT.
In nonsteroidogenic cell lines COS-1 and Rat-1, all plasmids directed levels of reporter gene expression similar to that of the vector ptkCAT. This indicated that the AdE sequences did not function in nonsteroidogenic cells.

DISCUSSION
In this report, we characterized the upstream AdE elements of the human CYP11A1 gene. The sequence was composed of two major protein binding regions, AdE1 and AdE2, which functioned in steroidogenic cell-specific gene activation. All the tested steroidogenic cell lines, including two adrenal cell lines Y1 and H295, a mouse testis Leydig cell line MA10, and a human placental cell line JEG-3, support the activation function of AdE elements (Table II). The AdE sequences, on the other hand, are not functional in nonsteroidogenic cell lines COS-1 and Rat-1.
Besides AdE, a key regulator of steroidogenic gene expression, transcription factor SF1, functions in many steroidogenic cells including those of the adrenal and gonad (36). SF1, however, is not expressed in steroidogenic organs such as the placenta. Therefore SF1 is not the sole factor that determines cell type specificity of steroidogenic gene expression. AdE sequences are functional in multiple steroidogenic cells including placenta, showing that they could enhance or modulate SCC transcription in many steroidogenic cells.
Many proteins bound to the AdE sequences. NF1-like proteins were the most prominent ones. The rest included Sp1-like and other as yet unidentified proteins. NF1 belongs to a protein family containing related proteins which recognize similar TGG(C/A)(N)5GCCAA sequences (35,38,39). NF1 family members function widely in replication and transcription of various viral and cellular genes (40-47). Sp1, on the other hand, activates transcription of many genes through its binding to the GGGGCGG or GGGGAGG sequences (48). Sp1 also belongs to a protein family consisting of multiple related members with similar DNA-binding specificity (49-51).
Both NF1 and Sp1 exert their activation function through interaction with other transcription factors. Sp1 has been shown to form multimeric complexes (52) and to interact with coactivators (53) and other transcription factors (54-57). Likewise, NF1 exerts its function through synergistic interaction with a number of different transcription factors in a wide range of situations (43,44,47). Combinatorial interaction of various transcription factors bound to their cognate binding sites of the gene has become a paradigm of eukaryotic gene activation. NF1 and Sp1 are both considered general transcription factors. Indeed, NF1- and Sp1-like proteins were also found bound to the AdE sequences in COS-1 cells (data not shown). It is therefore intriguing to find out how common transcription factors can determine cell type-specific expression. NF1 family members have been shown to be involved in liver-, adipocyte-, and epithelial cell-specific functions (58-60). It is believed that synergism between factors that vary in concentration in different cells results in cell type-specific transcriptional activation (61-64). In one situation, NF1 could interact with tissue-specific transcription factors for tissue-specific activation (58,65).
In the other situation, it is the balance of different NF1 family members that controls cell type specificity. In epithelial cells, the NF1 protein is derived from NF1-C, whereas in fibroblast cells, the major NF1 protein is NF1-X. NF1-X protein fails to activate enhancer function due to a variation in its activation domain. It is the property of NF1-X and the differential concentration of the NF1 family members that achieve the epithelial cell specificity of NF1 for human papillomavirus 16 expression (66).
Although Sp1 is viewed as a ubiquitous transcription factor, substantial variations in its expression were found in different cell types, showing that its expression is developmentally programmed (67). Because Sp1 belongs to a protein family consisting of multiple related members with similar DNA-binding specificity (49-51), it is possible that the variations in Sp1 levels in different cell types reflect detection of different members of the Sp1 family in these tissues. Some members of the Sp1 family could be expressed in a tissue-specific manner. BTEB2, an Sp1 family member that is homologous to Sp1 at the DNA-binding domain and recognizes the same GC box, is expressed specifically in testis and placenta (37). We showed that the Sp1-like protein bound to the AdE sequences recognized the same sequence and shared the same antigenicity with Sp1, yet we do not know which member of the Sp1 family binds to the AdE sequences. It is possible that the tissue-specific expression of the SCC gene is determined by the steroidogenic tissue-specific expression of a BTEB2-like protein.
In conclusion, the AdE sequences of the human SCC gene contain binding sites for Sp1- and NF1-like proteins. It is the combined action of these bound and interacting factors that brings about the final activation of the SCC gene in a steroidogenic cell-specific manner.