Identification and characterization of a novel androgen response element composed of a direct repeat.

Transcriptional regulation by the androgen receptor (AR) requires its binding to hormone response element nucleotide sequences in DNA. A consensus glucocorticoid response element (GRE) can mediate transactivation by AR and other members of the AR/glucocorticoid (GR)/progesterone (PR)/mineralocorticoid (MR) receptor subfamily. We identified putative androgen response element (ARE) sequences by binding of a human AR DNA-binding domain fusion protein to DNA in a random sequence selection assay. A 17-base pair consensus nucleotide sequence, termed IDR17, containing three potential GRE-like core binding sites organized as both inverted and direct repeats, was determined from a pool of degenerate oligonucleotides. IDR17 was active in mediating androgen-dependent induction of reporter gene expression in transient transfection assays. Dissection of the IDR17 sequence revealed an 11-base pair sequence (DR-1), consisting of two potential core binding sites oriented as an overlapping direct repeat, as the most potent ARE. DR-1 demonstrated a strong preference for AR binding and transactivation when compared with GR. To our knowledge, this is the first observation that a direct repeat of GRE-like core motifs functions as a preferred hormone response element within the AR/GR/PR/MR subfamily of nuclear receptors.

Nuclear hormone receptors constitute a family of transcription factors that function by binding to specific DNA sequences, known as hormone response elements (HREs), in regulatory regions of target genes (1-5). The general structure of HREs has been well characterized (6-8). They are generally composed of six-base pair receptor binding sites, oriented as inverted or direct sequence repeats and separated by a variable number of spacing nucleotides. Specificity of an HRE is a property of the primary sequence of the individual binding sites as well as the spacing and orientation of the binding motifs (7-12). Specificity of hormone action may also be conferred by interaction between the receptor and other transcription factors (6, 7). In this case, the HRE is a more complex element that includes multiple simple HREs (partial palindromes or a single binding site) together with adjacent binding sites for other factors (13-16).
HREs comprise several subgroups corresponding to evolutionary conservation among subfamilies of the receptors. Androgen receptors (ARs) share a high degree of homology with glucocorticoid receptors (GRs), progesterone receptors (PRs), and mineralocorticoid receptors (MRs) in their DNA-binding domains (17-22). A well characterized HRE for this subfamily of receptors is the glucocorticoid response element (GRE), which is composed of inverted nonidentical hexamer binding sites separated by a spacing of 3 nucleotides, GGTACAnnnTGTTCT (6-8, 23). This sequence mediates androgen-induced, as well as glucocorticoid- and progesterone-induced, gene expression (23-26). Another subfamily of nuclear receptors consists of the thyroid hormone (TR), retinoic acid (RAR), and vitamin D (VDR) receptors (7-9). A striking feature of this latter subfamily of receptors is their recognition of a similar consensus nucleotide sequence binding site, with specificity of hormone response being determined by different spacing and orientation of the binding site motifs. Spacing of the consensus binding half-sites by 3, 4, or 5 nucleotides creates HREs for VDR, TR, and RAR, respectively (7, 10-12). In addition to direct repeats, both TR and RAR can recognize HREs consisting of inverted receptor binding sites (11, 12).
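The "inverted nonidentical" character of the consensus GRE half-sites can be made concrete with a short sketch. The helper names below are illustrative assumptions and are not part of the original study; the only input taken from the text is the GRE sequence itself.

```python
# Illustrative sketch: compare the two hexamer half-sites of the consensus GRE.
# Function names are hypothetical; only the GRE sequence comes from the text.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def half_site_mismatches(element, spacer=3, half=6):
    """Count positions at which the 5' half-site differs from the reverse
    complement of the 3' half-site (0 would mean a perfect palindrome)."""
    left = element[:half].upper()
    right = element[half + spacer:half + spacer + half].upper()
    return sum(a != b for a, b in zip(left, reverse_complement(right)))

GRE = "GGTACAnnnTGTTCT"  # consensus GRE as given in the text
print(half_site_mismatches(GRE))  # number of asymmetric positions
```

A nonzero count indicates that the two half-sites are inverted but not identical, which is the sense in which the consensus GRE is an imperfect palindrome.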
Naturally occurring androgen response elements (AREs) fall into two categories. Some AREs, such as those identified in the C3 subunit gene of rat prostatic binding protein and mouse sex-limited protein (13, 27, 28), consist of GRE-like sequences with inverted binding sites separated by a 3-bp nucleotide spacer. In other AREs, like those in the probasin (29) and 20-kDa cystatin-like (16) genes, only single binding site sequences are apparent. However, a recent report showed that the 5′ subsequence, 5′-GGTTCT-3′, within the partial palindrome of the probasin ARE-2 binding site, excludes GR binding but permits binding of AR (30). Another possibility for receptor-specific binding is that sequences adjacent to receptor binding sites create composite activation elements that also involve the binding of other factors (13-16). An additional hypothetical means of conferring hormone specificity is the spacing and orientation of binding sites within the DNA sequences that bind members of the AR, GR, PR, and MR subfamily. GR and AR were recently shown to bind to direct repeat DNA elements (30, 31) in addition to the more typical inverted repeat sequences. In this report, we provide evidence that a member of this subfamily, namely AR, can activate transcription of a reporter gene through an HRE consisting of a novel direct repeat nucleotide sequence that preferentially binds AR.

EXPERIMENTAL PROCEDURES
Plasmids-The following plasmid DNAs were generous gifts: OB7 (hGR) from Dr. R. Evans (Salk Institute for Biological Studies, La Jolla, CA); pCMV-hGR from Drs. F. French and J. Tan (University of North Carolina, Chapel Hill, NC); pBLCAT2 from Dr. C. Young (Mayo Clinic, Rochester, MN). MMTV-CAT and pCMV-hAR were used in our previous studies (32). Plasmid pGEX-2T was purchased from Pharmacia Biotech Inc., and pBluescript was obtained from Stratagene (La Jolla, CA).
Expression and Purification of hAR and hGR Fusion Proteins-Freshly diluted cultures of pGEX-hAR or pGEX-hGR were grown for 1 h prior to induction of protein expression by growth in 1 mM isopropyl-β-D-thiogalactopyranoside for an additional 3.5 h. Cells were collected by centrifugation (5000 × g, 10 min, 4°C) and lysed by sonication in 5 ml of ice-cold phosphate-buffered saline containing 1 mM phenylmethylsulfonyl fluoride. After the addition of 1% Triton X-100 and centrifugation (10,000 × g, 5 min, 4°C), the supernatant was incubated (25°C, 4 min) with 1 ml of 50% (w/v) glutathione-agarose beads with gentle mixing. The beads were washed three times with 10 ml of ice-cold phosphate-buffered saline. The fusion proteins were eluted from the beads by incubating (25°C, 4 min) with 0.5 ml of 50 mM Tris-HCl (pH 8.0) containing 15 mM reduced glutathione. The beads were recovered by centrifugation, and the elution step was repeated four or five times.
Oligonucleotide Probes-For random sequence selection assays, a pool of degenerate 55-bp oligonucleotides was synthesized. The 5′ and 3′ termini contained restriction endonuclease sites for BamHI and EcoRI, respectively, for subcloning purposes and were utilized for primer annealing to convert oligonucleotides from single- to double-stranded DNA and for subsequent PCR amplification. The sequence of Oligo 1 was 5′-AGACGGATCCATTGCAATAN18ATCCTGTAGGAATTCGGA-3′. The sequence of Oligo 2 was 5′-AGACGGATCCATTGCAAN13TGTTCTGATCCTGTAGGAATTCGGA-3′. The primer pair for both Oligos 1 and 2 was as follows: primer A (sense), 5′-AGACGGATCCATTGCA; primer B (antisense), 5′-TCCGAATCCCTACAG. Double-stranded Oligo 1 and Oligo 2 were generated by annealing each oligonucleotide to a 10-fold molar excess of primer B and extending the complementary strand with Klenow fragment of E. coli DNA polymerase in the presence of dNTPs. The double-stranded Oligo 1 and Oligo 2 were purified on 10% polyacrylamide gels and eluted from the gel by agitation in 0.5 M ammonium acetate, 1 mM EDTA, 0.1% SDS at 37°C for 12 h.
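The layout of the two degenerate oligonucleotides (fixed 5′ arm, degenerate core, fixed 3′ arm) can be sketched as follows. This is an illustration only: the tuple representation, the function names, and the randomly filled pool member are assumptions made for the example, while the arm sequences, the N counts, and primer A are taken from the text.

```python
# Illustrative sketch of the two degenerate selection oligonucleotides.
# Each design is (5' arm, number of degenerate positions, 3' arm).
import random

OLIGO1 = ("AGACGGATCCATTGCAATA", 18, "ATCCTGTAGGAATTCGGA")
OLIGO2 = ("AGACGGATCCATTGCAA", 13, "TGTTCTGATCCTGTAGGAATTCGGA")
PRIMER_A = "AGACGGATCCATTGCA"  # sense primer from the text

def total_length(design):
    """Length of the full oligonucleotide, including the degenerate core."""
    five_prime, n_random, three_prime = design
    return len(five_prime) + n_random + len(three_prime)

def random_member(design, rng):
    """Draw one hypothetical pool member by filling the N region at random."""
    five_prime, n_random, three_prime = design
    core = "".join(rng.choice("ACGT") for _ in range(n_random))
    return five_prime + core + three_prime

rng = random.Random(0)  # fixed seed so the example is repeatable
for design in (OLIGO1, OLIGO2):
    member = random_member(design, rng)
    # Report the length and whether primer A matches the 5' terminus.
    print(total_length(design), member.startswith(PRIMER_A), member)
```

Both designs come out at 55 nucleotides, and every member of either pool begins with the primer A sequence, which is what allows a single primer pair to amplify both pools.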
A single-stranded 55-bp oligonucleotide containing a consensus GRE, 5′-AGACGGATCCATTGCAAAGTCAGAACACAGTGTTCTGATCCTGTAGGAATTCGGA-3′, was synthesized and made double-stranded by annealing primer B and extending it with Klenow fragment of DNA polymerase as described above.
Electrophoretic Mobility Shift Assays (EMSAs)-DNA was 5′-end labeled with [γ-32P]ATP using T4 polynucleotide kinase under standard conditions, purified on 5-10% polyacrylamide gels, and recovered by elution from the gel as described above. Approximately 0.5 pmol (10^5 cpm) of 32P-labeled double-stranded DNA was incubated with 200 ng of the AR-DBD fusion protein in 20 µl of 20 mM HEPES (pH 7.6), 50 mM KCl, 3 mM MgCl2, 1 mM EDTA, 0.5% Nonidet P-40, and 8% glycerol at 25°C for 15 min. Poly(dI-dC) (500 ng) was added to each reaction to reduce nonspecific binding. The incubation mixtures were loaded on 5% nondenaturing polyacrylamide gels, and EMSAs were run at 10 V/cm for 4 h in 100 mM Tris-HCl (pH 8.0), 100 mM boric acid, 1 mM EDTA buffer at 4°C. For analytical purposes, the gels were dried before autoradiography. For preparative purposes, gels were exposed directly without drying for 1-3 h at 4°C to X-Omat film for localization of the DNA and DNA-protein complex(es). Regions of the gel containing the shifted bands of DNA were excised and recovered by elution. Half of the recovered DNAs were amplified by PCR, radiolabeled, and used for successive rounds of selection. For EMSA in other experiments, oligonucleotides (GRE, IDR17, and DR-1) with defined sequences were used as probes. The DR-1 (2x) probe represented a dimer of annealed and ligated DR-1 oligonucleotides derived by digesting the DR-1-pBLCAT plasmid (described below) with HindIII and BamHI. A 90-bp DNA fragment was isolated and radiolabeled by Klenow fragment in the presence of [α-32P]dCTP. For these experiments, 50-500 ng of either AR-DBD or GR-DBD fusion proteins were incubated with 32P-labeled DNAs in binding reactions. For competition assays, the proteins were first incubated with a 10-1000-fold excess of unlabeled double-stranded oligonucleotide competitor at 25°C for 10 min before the addition of the radiolabeled DNAs.
PCR Amplification of Selected DNA-Oligonucleotide DNAs recovered from EMSA were dissolved in 10 µl of 0.1 × TE buffer. For PCR amplification, 5 µl of the DNAs were used as templates in a standard 50-µl reaction containing 50 pmol each of primers A and B. The amplified DNAs were purified on 10% polyacrylamide gels and recovered by elution as described above.
The DNAs amplified by PCR after the last round of selection were digested with BamHI and EcoRI to create cohesive ends and ligated into the plasmid pBluescript KS+. Individual clones were sequenced by the dideoxynucleotide termination method (Sequenase 2.0 kit, U.S. Biochemical Corp.).
Construction of Chimeric Reporter Genes-Single-stranded oligonucleotides IDR17, IR0, IR5, and DR-1 were phosphorylated, and equal amounts of the complementary strand of each oligonucleotide were annealed to generate cohesive ends with overhangs corresponding to XbaI recognition sites. The double-stranded oligonucleotides were concatamerized with T4 DNA ligase, and the ladder of concatamerized DNAs was separated on 5% polyacrylamide gels. Dimers of IDR17, IR0, IR5, and DR-1 were excised from the gel, eluted, recovered by ethanol precipitation, and ligated into the XbaI site of the pBLCAT2 plasmid. Positive clones were screened by restriction mapping and confirmed by sequencing.
Cell Culture and DNA Transfection-CV-1 cells were transfected with plasmid DNAs by the calcium phosphate/DNA co-precipitation method, as described previously (32). The cells were subsequently cultured in Dulbecco's modified Eagle's medium containing 5% charcoal-stripped fetal bovine serum in the absence or presence of 0.01-10 nM R1881 (methyltrienolone) or 1-1000 nM dexamethasone for 16-24 h. The CAT reporter gene plasmids IDR17-, IR0-, IR5-, and DR-1-pBLCAT2 or MMTV-CAT were cotransfected with the receptor expression vector pCMV-hAR or pCMV-hGR. In all experiments, the pCH110 plasmid expressing β-galactosidase was also cotransfected as a control for the efficiency of transfection. The amount of each plasmid used in the transfection experiments was equivalent to 5 µg of DNA/culture dish. The reporter gene plasmids were used without cotransfection of the receptor expression vector or with the "empty" expression vector as negative controls for the specificity of steroid induction.
CAT and β-galactosidase enzyme activities were assayed in 50-µl aliquots of the cell extracts as described previously (32). CAT activity was quantitated by cutting and counting the radioactive areas from the thin layer chromatography plates by scintillation spectrophotometry. CAT activities were normalized for transfection efficiency based upon their corresponding β-galactosidase activities.
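The normalization step can be illustrated with a minimal sketch. It assumes that normalization is a simple ratio of CAT counts to β-galactosidase activity and that fold induction is the ratio of normalized values with and without hormone; the text does not give the exact formula, and the numbers below are hypothetical, not data from this study.

```python
# Minimal sketch, assuming a simple ratio normalization (not stated in the text).
def normalized_cat(cat_counts, beta_gal_activity):
    """CAT signal corrected for transfection efficiency."""
    return cat_counts / beta_gal_activity

def fold_induction(treated, untreated):
    """Ratio of normalized CAT activity with hormone to that without.

    Each argument is a (CAT counts, beta-galactosidase activity) tuple.
    """
    return normalized_cat(*treated) / normalized_cat(*untreated)

# Hypothetical values chosen only to show the arithmetic.
print(fold_induction(treated=(12000.0, 1.5), untreated=(400.0, 1.0)))
```

Dividing by the β-galactosidase value first means that a dish with higher transfection efficiency does not appear more hormone-responsive than it is.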

RESULTS
Random Sequence Selection for an AR Binding Site-The method of random sequence selection and amplification originally described by Blackwell and Weintraub (34) was adopted to isolate AR binding sites on DNA from a pool of oligonucleotides with a region of degenerate nucleotide sequence. The AR protein was a recombinant protein composed of amino acids 559-644 of the human AR-DBD fused to glutathione S-transferase (GST). This fusion protein was expressed in Escherichia coli and purified by affinity binding to glutathione agarose and elution in the presence of reduced glutathione. In preliminary studies, the AR-DBD fusion protein was incubated with an oligonucleotide containing a consensus GRE/ARE nucleotide sequence and was shown to form a protein-DNA complex by EMSA. For the random sequence selection assay, the AR-DBD fusion protein was incubated with a 55-bp oligonucleotide containing an 18-bp central region of totally degenerate nucleotide sequence (N18). The choice of an 18-bp degenerate region was based upon the assumption that a 15-bp consensus GRE sequence was sufficient for binding of an AR dimer and that a specific binding site for AR would resemble this model. On the initial round of selection, binding of the AR-DBD fusion protein to a small percentage of the degenerate oligonucleotides was detected by EMSA (data not shown). The DNA was isolated from the gel and amplified by PCR, to enrich the population of DNAs bound by the AR-DBD fusion protein, prior to the next round of selection. As expected, the proportion of selected and amplified DNA that bound to the AR-DBD fusion protein, as judged by EMSA, increased during successive rounds of selection. Following six rounds of selection, DNA from the shifted protein-DNA complex was isolated and subcloned, and 39 individual clones were sequenced (Table I). Alignment revealed a single hexamer nucleotide consensus binding site, TGTTC(T/C), identical to a GRE half-site, within the degenerate region as a preferred sequence for AR binding.
Based upon these preliminary studies and the similarity of the consensus hexamer binding site to that within known GREs that bind AR, we synthesized a second pool of 55-bp oligonucleotides containing a 13-bp region of degenerate nucleotide sequence (N13) adjacent to a 6-bp (TGTTCT) consensus AR/GR/PR binding site, i.e. 5′-N13TGTTCT-3′. We predicted that the consensus binding site sequence would accommodate binding of one AR molecule and that the random sequence would allow identification of a second binding site with optimal sequence, spacing, and orientation to confer the cooperative binding of AR monomers residing on the DNA. By restricting the region of degenerate sequence to 13 bp, we theoretically could examine the complete pool of 6.7 × 10^7 possible sequences.
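The size of the sequence pool follows directly from the number of degenerate positions, since each position can hold one of four bases. The sketch below simply evaluates 4^n for the 13-bp and 18-bp regions used here and for the 26-bp region used by Roche et al. (44), which is discussed later in the text; the function name is an illustrative choice.

```python
# Number of distinct sequences in a fully degenerate region of n positions.
def pool_complexity(n_degenerate):
    """Each position can be A, C, G, or T, so there are 4**n sequences."""
    return 4 ** n_degenerate

for n in (13, 18, 26):
    size = pool_complexity(n)
    print(n, size, f"{size:.1e}")
```

The 13-bp region gives about 6.7 × 10^7 sequences, whereas a 26-bp region gives the square of that number, which makes complete representation in a starting pool far less likely.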
Binding of the AR-DBD fusion protein to the 13-bp degenerate oligonucleotide pool is shown in Fig. 1. In the first round of selection, one shifted band was observed by EMSA. This band presumably represents a DNA-AR monomer complex. The binding site for the AR monomer could be the defined hexamer, TGTTCT, or another half-site located within the 13-bp degenerate region. Alternatively, some oligonucleotides containing two binding sites positioned appropriately for dimer binding may be bound by an AR dimer. Additional complexes with slower mobilities were not detected in the first round of selection, presumably due to their very low abundance within the oligonucleotide pool. DNA was recovered from the single shifted band, amplified by PCR, and incubated with the AR-DBD fusion protein for the second round of selection by EMSA. Two shifted bands, one of slower mobility than observed during the first round of selection, appeared on the gel. Based upon the reduced mobility of this DNA-protein complex, we presumed that this additional complex represented the binding of two molecules of AR-DBD fusion protein. The DNA was recovered from this less mobile complex and enriched during subsequent rounds of selection and amplification.
After six rounds, DNA from the less mobile complex was isolated and subcloned, and the nucleotide sequence of individual clones was determined. As shown in Table II, a conserved region of 11 bp with the consensus sequence -GGAACGGAACA- was obtained. This selected sequence includes two potential AR binding sites immediately adjacent to the original site specified by the sequence -TGTTCT (see Fig. 4A). One potential binding site is a 6-bp inverted repeat, -GGAACA-, which allows no intervening spacer between itself and the original binding site, -TGTTCT. In addition, there is another 5-bp sequence, GGAAC-, which forms a direct repeat, overlapping by 1 bp with the adjoining 6-bp sequence, -GGAACA. We named this putative 17-bp AR binding site sequence IDR17 due to its content of both inverted repeats (IRs) and direct repeats (DRs).
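The arrangement of repeats within IDR17 can be checked with a short sketch. The sequence is assembled from the 11-bp selected consensus and the fixed TGTTCT site given in the text; the helper names and the use of 0-based indices are choices made for this illustration.

```python
# Illustrative sketch of the repeat structure within IDR17.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def find_all(seq, motif):
    """Return every 0-based start position of motif, overlaps included."""
    return [i for i in range(len(seq) - len(motif) + 1)
            if seq[i:i + len(motif)] == motif]

IDR17 = "GGAACGGAACA" + "TGTTCT"   # selected 11-bp consensus + fixed site

direct_starts = find_all(IDR17, "GGAAC")      # the two direct-repeat cores
first_hexamer = IDR17[direct_starts[0]:direct_starts[0] + 6]
second_hexamer = IDR17[direct_starts[1]:direct_starts[1] + 6]
overlap = direct_starts[0] + 6 - direct_starts[1]

print(len(IDR17), direct_starts, first_hexamer, second_hexamer, overlap)
# The second hexamer read on the opposite strand resembles the fixed site.
print(reverse_complement(second_hexamer), IDR17.find("TGTTCT"))
```

Read on the opposite strand, GGAACA is TGTTCC, which fits the TGTTC(T/C) half-site consensus and lies directly against the fixed TGTTCT site with no spacer; the two GGAAC cores begin 5 bp apart, so the corresponding hexamers share one base.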
Functional Activity of IDR17-To examine the IDR17 consensus AR binding site sequence as a functional ARE, two copies of IDR17 were cloned into the pBLCAT2 plasmid upstream of the thymidine kinase promoter. The IDR17-pBLCAT reporter gene construct was tested for androgen-dependent expression of CAT activity following cotransfection with the human AR cDNA expression vector, pCMV-hAR, into CV-1 cells. The synthetic androgen R1881 induced a dose-dependent increase of CAT activity, which was not observed in the absence of androgen or in cells transfected with the empty expression vector.
To examine the specificity of hormone response, IDR17-pBLCAT was compared with MMTV-LTR-CAT, which contains four GRE/AREs in its 5′-flanking sequence. Expression plasmids containing either the human AR or GR cDNAs were cotransfected with a reporter gene plasmid, and cells were incubated with R1881 or the synthetic glucocorticoid dexamethasone. In contrast to MMTV-LTR-CAT, where CAT activity was induced to a higher level by glucocorticoid than by androgen in the presence of their respective receptors, IDR17-pBLCAT was much more responsive in the presence of androgen and its receptor than in the presence of glucocorticoid and its receptor (Fig. 2). Nearly maximal induction of CAT activity by AR with IDR17 occurred at 1 nM R1881, whereas much lower stimulation of CAT activity by GR was evident with IDR17 even at 100 nM dexamethasone. These results demonstrate that IDR17 not only functions as an efficient ARE but also is preferentially induced by androgens when compared with the generic GRE/ARE-like hormone response elements of MMTV-LTR, which respond more favorably in the presence of GR than AR.
The binding site specificity of IDR17 for AR was also tested by EMSA. Binding of the AR-DBD fusion protein to IDR17 was compared with the binding of a similar human GR recombinant protein containing an analogous region of its DNA-binding domain amino acids (421-506) fused to GST. An oligonucleotide containing a perfect 15-bp palindromic GRE nucleotide sequence (-AGAACAcagTGTTCT-) with the same flanking sequence as IDR17 was tested in parallel. As shown in Fig. 3, the AR-DBD and GR-DBD fusion proteins bound quantitatively similarly to the consensus GRE at each of several protein concentrations. By contrast, the AR-DBD fusion protein bound with high affinity to IDR17 and displayed a distinctive pattern of protein-DNA complexes, whereas binding of the GR-DBD fusion protein to IDR17 was relatively weak. This pattern of protein-DNA complexes was also characterized by reproducible differences in the relative mobility of complexes formed by AR compared with those formed by GR with the identical DNA fragments, suggesting possible differences in their binding conformations. This experiment further validates the specificity of IDR17 as an androgen receptor binding site.
Dissection of IDR17 Binding Sites-Three potential dimeric binding sites can be proposed among the three putative hexamer sequences within IDR17. To determine which of these sequences acts as an ARE, we created the oligonucleotides shown in Fig. 4A. For oligonucleotide IR0, nucleotides n1-5 of IDR17 were substituted with nonspecific nucleotides, leaving only the hexamer IRs without a spacer (n0) of nonspecific nucleotides. For IR5, nucleotides n7-11 of IDR17 were substituted by five (n5) nonspecific bases, leaving the two 6-bp binding sites at the 5′ and 3′ ends, respectively, as IRs. For DR-1, the 3′ binding site specified by -TGTTCT was substituted by nonspecific nucleotides, leaving only an 11-bp sequence containing a 1-bp overlap (n−1) of two hexameric DRs. Two copies of each IDR17 derivative were cloned into pBLCAT2 upstream of the thymidine kinase promoter and cotransfected with AR to test the activities of IDR17 and its derivatives in androgen-dependent induction of CAT reporter gene expression (Fig. 4B). When transfected CV-1 cells were incubated in the absence and presence of different concentrations of R1881, IR0-pBLCAT showed constitutive CAT activity that was independent of androgen induction. IR5-pBLCAT was not stimulated by androgen and showed only basal expression. DR-1-pBLCAT, however, was highly responsive to androgen induction, reaching a level 180-fold above base line, and CAT activity exceeded that achieved by IDR17-pBLCAT with equivalent concentrations of R1881. The transactivation function of DR-1 was strictly androgen-dependent and required cotransfection of the AR.
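The three IDR17 derivatives can be pictured by masking the substituted positions. In the sketch below, the lowercase letter n stands in for the nonspecific nucleotides, whose actual identities are given only in Fig. 4A and are not reproduced here; the function name and the 1-based position arguments are choices made for this illustration.

```python
# Illustrative sketch: mask the substituted positions of IDR17 with 'n'.
IDR17 = "GGAACGGAACATGTTCT"

def mask(seq, first, last):
    """Replace 1-based positions first..last (inclusive) with 'n'."""
    return seq[:first - 1] + "n" * (last - first + 1) + seq[last:]

DERIVATIVES = {
    "IR0": mask(IDR17, 1, 5),     # inverted hexamers with no spacer remain
    "IR5": mask(IDR17, 7, 11),    # inverted hexamers separated by 5 bp
    "DR-1": mask(IDR17, 12, 17),  # overlapping direct repeat remains
}

for name, seq in DERIVATIVES.items():
    print(f"{name:5s} {seq}")
```

Laid out this way, IR0 retains GGAACA abutting TGTTCT, IR5 retains GGAACG and TGTTCT separated by five masked positions, and DR-1 retains only the 11-bp GGAACGGAACA sequence.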
Binding of AR to DR-1-The finding that AR was able to activate transcription from the direct repeat sequence, DR-1, was unexpected in light of the structure of other ARE/GRE/PRE sequences identified to date. To determine whether DR-1 was a binding site for AR, the AR-DBD fusion protein was incubated with the DR-1 oligonucleotide and analyzed by EMSA. When a single DR-1 nucleotide sequence was present in the DNA, a low level of binding was observed for the shifted DNA-AR complex (Fig. 5A). However, when a tandem repeat of the DR-1 nucleotide sequence (DR-1, 2x) was present in the DNA fragment used as the probe, the binding of AR was greatly enhanced, and multiple shifted protein-DNA complexes were seen (Fig. 5B). We conclude that a single copy of the DR-1 sequence functions as a binding site for AR and that under conditions favoring cooperative binding of AR, as in the head-to-tail tandem repeat (DR-1, 2x), this same nucleotide sequence functions as an even higher affinity AR binding site.
DR-1 Is Specific for AR-The binding of AR-DBD and GR-DBD to the tandem repeat of the DR-1 oligonucleotide sequence was compared by EMSA and contrasted to the binding of each receptor to a consensus 15-bp palindromic GRE nucleotide sequence. As shown in Fig. 6, DR-1 had much higher affinity for AR than for GR, whereas GR and AR bound to the GRE with similar affinity. These findings suggest that the DR-1 sequence is an AR-selective binding site.
DR-1 also mediates androgen-selective activation of gene transcription in transfection assays. Fig. 7 shows a comparison of CAT reporter gene expression mediated through binding of AR or GR to the DR-1 nucleotide sequence in the presence of R1881 or dexamethasone, respectively. R1881 induced a 10-12-fold increase in CAT activity by AR compared with a minimal (0.8-fold) effect of dexamethasone on CAT activity by GR, when a single copy of the DR-1 element was present. Reversing the orientation of the DR-1 element had no effect on the induction of CAT activity by AR or GR. As shown previously in Fig. 4B, tandem copies (2x) of DR-1 produced a synergistic response in CAT activity by AR in the presence of R1881, 87-fold above base line, when compared with a single copy of DR-1 (11-fold above base line). In addition, stimulation of CAT activity by AR was maximal at 1 nM R1881 and 5-fold greater than the maximal CAT activity induced by GR at the highest concentration (100 nM dexamethasone) of glucocorticoid. In the same experiment, GR and AR induced CAT activity maximally by 92- and 35-fold, respectively, from the MMTV-LTR promoter, in the presence of 1 µM dexamethasone and 1 nM R1881 (data not shown). These results allow us to conclude that DR-1 binds AR selectively and with high affinity, leading to the preferential androgen stimulation of gene expression.

DISCUSSION
How specificity of steroid hormone action is achieved remains an important question. In principle, specificity can be generated at any step in a signal transduction pathway. One obvious level of cellular control is determined by the differential expression of specific steroid receptors in target cells. The presence of specific HREs as genetic codes for selective regulation of gene transcription also plays a key role in hormone action (1-8).
Although a consensus HRE has been associated with transcriptional regulation by the AR/GR/PR/MR subfamily of nuclear receptors (23-27), it is not known how the specificity of these individual receptors is determined. For instance, prostate cells express AR, GR, and PR, yet the actions of these hormones and their specificity for activating or repressing gene expression differ considerably (35, 36). Within the subfamily of receptors composed of TR, RAR, and VDR, both spacing and orientation of similar core binding motifs within the HREs determine receptor specificity (7-12). In an effort to understand how cells might distinguish among AR, GR, and PR actions, we hypothesized that a specific binding site sequence might differentiate AR action from other receptors of its subfamily, and we searched for an ARE in the present studies.
To identify a specific ARE with a relatively unbiased, sensitive, and simple approach, we adopted the random sequence selection method (34) and utilized a two-stage approach. In the first stage, binding of the AR-DBD fusion protein to a pool of random oligonucleotides formed a single DNA-protein complex with the consensus receptor binding half-site, TGTTC(T/C), indicative of binding by a receptor monomer. We also observed a single shifted protein-DNA complex when the consensus GRE containing two half-sites as inverted repeats with a 3-bp nucleotide spacer was bound to human AR-DBD and GR-DBD fusion proteins until greater concentrations of either fusion protein led to the formation of additional higher order complexes. Using β-galactosidase-human AR or protein A-rat AR fusion proteins, Young et al. (37) and De Vos et al. (38), respectively, also observed a single DNA-protein complex with the consensus GRE sequence on gel shift assays.

TABLE II
…             TGTTCT
AGGGTACGGAACA TGTTCT
TTAGAACGGAACA TGTTCT
AGGGTACGGAACA TGTTCT
ATGGAACGGAACA TGTTCT
CGGGGACAGTACA TGTTCT
AGGGAACGGTACA TGTTCT
CTGGAACGGAACA TGTTCT
TGGGGACGGAACG TGTTCT
AGGGGACGGTACA TGTTCT
CGGGAACGGTACA TGTTCT
TGGGTACGGCACA TGTTCT
ACGGTACAGTACG TGTTCT
TGGGTACGGAACA TGTTCT
TGGGTACGGAACG TGTTCT
ATGGAACAGGTCA TGTTCT
ATGGGACGGAACA TGTTCT
ATGGAACGGAACA TGTTCT
CGGGAACAGTACA TGTTCT
TGGGAACGGTACA TGTTCT
ATGGAACGGGACA TGTTCT
AGGGAACGGAACA TGTTCT
ATGGAACGGCACA TGTTCT
ATGGAACGGTACA TGTTCT
TGGGTACGGAACG TGTTCT
CTAGAACAGTACA TGTTCT
TGGGTACGGAACG TGTTCT
CGGGAACAGTACA TGTTCT
TAGGAACGGAACA TGTTCT
TGGGGACGGGACG TGTTCT
TGGGAACGGTACA TGTTCT
TGGGAACGGAACA TGTTCT

Position     −11  −10   −9   −8   −7   −6   −5   −4   −3   −2   −1    1    2    3    4    5    6
Nucleotide                                                            T    G    T    T    C    T
G             30   32    5    0    0   26   32    3    0    0    7
A              2    0   18   32    0    6    0   16   31    0   25
T              0
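The way an 11-bp consensus emerges from the selected clones can be sketched by taking the most frequent base at each position. The list below contains the 31 selected 13-bp regions that appear in full in Table II; the function name and the mapping of table positions −11 to −1 onto 0-based indices 2 to 12 are assumptions made for this illustration, and simple plurality voting is used here as a stand-in for whatever alignment procedure was actually applied.

```python
# Illustrative sketch: plurality consensus from the selected 13-bp regions.
from collections import Counter

CLONES = [
    "AGGGTACGGAACA", "TTAGAACGGAACA", "AGGGTACGGAACA", "ATGGAACGGAACA",
    "CGGGGACAGTACA", "AGGGAACGGTACA", "CTGGAACGGAACA", "TGGGGACGGAACG",
    "AGGGGACGGTACA", "CGGGAACGGTACA", "TGGGTACGGCACA", "ACGGTACAGTACG",
    "TGGGTACGGAACA", "TGGGTACGGAACG", "ATGGAACAGGTCA", "ATGGGACGGAACA",
    "ATGGAACGGAACA", "CGGGAACAGTACA", "TGGGAACGGTACA", "ATGGAACGGGACA",
    "AGGGAACGGAACA", "ATGGAACGGCACA", "ATGGAACGGTACA", "TGGGTACGGAACG",
    "CTAGAACAGTACA", "TGGGTACGGAACG", "CGGGAACAGTACA", "TAGGAACGGAACA",
    "TGGGGACGGGACG", "TGGGAACGGTACA", "TGGGAACGGAACA",
]

def column_consensus(seqs, start, end):
    """Most frequent base in each column from index start up to end - 1."""
    consensus = []
    for i in range(start, end):
        counts = Counter(seq[i] for seq in seqs)
        consensus.append(counts.most_common(1)[0][0])
    return "".join(consensus)

# Table positions -11..-1 correspond to 0-based indices 2..12.
print(column_consensus(CLONES, 2, 13))
```

The two 5′-most positions of the 13-bp region are left out because they show no clear preference among the clones, which is consistent with the conserved region being 11 bp rather than 13 bp long.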
The absence or weak ability for dimerization or cooperative binding of the AR-DBD fusion proteins may be due to absence of the receptor N and C termini, which contain strong dimerization functions that are active in solution (39). Although a weak dimerization domain is present in the DNA-binding region of the steroid receptors (7, 8, 40-42), this interaction may occur predominantly following binding to DNA, which places the receptor monomers in the preferred orientation (12). The presence of prokaryotic protein domains in the fusion proteins may also alter the cooperative interaction of the receptor-DBD dimerization domains, since glutathione S-transferase is known to form homodimers. In normal target cells, the native receptor proteins may also interact with other factors involved in transcriptional regulation, and such interaction may increase the specificity, as well as the affinity, of receptors for the cognate response elements (6, 12, 43).
To enhance the probability of finding the optimal half-site spacing for an ARE, we synthesized a second pool of oligonucleotides containing the previously defined half-site sequence, TGTTCT, adjacent to a 13-bp random sequence. The appearance of a second lower mobility DNA-protein complex in the second and subsequent rounds of binding and selection suggests the involvement of more than one AR molecule in the binding to individual oligonucleotides. By contrast, a previous study by Roche et al. (44), using a random sequence selection method and a protein A-rat AR DNA-binding domain fusion protein, selected a 15-bp imperfect palindrome, 5′-GG(A/T)ACAnnnTGTTCT-3′, which is the same as the consensus GRE. Roche et al. (44) used oligonucleotides with a longer 26-bp random sequence, such that all potential sequences may not have been represented in the starting pool. Binding of AR protein to the selected oligonucleotides resulted in two shifted complexes in the second stage of our study, whereas Roche et al. observed a single protein-DNA complex, as in our first stage of selection. The presence of different fusion domains for the recombinant AR proteins, such as GST in the present study and protein A in Roche et al., may also create new specificities or restrict potential binding to DNA sequences. The inclusion of GST did not restrict binding, but perhaps protein A interfered with binding to the DR-1 sequence. Binding of the GST-AR-DBD was not created by the fusion partner, since full-length AR activates transcription in vivo through DR-1.
The IDR17 sequence identified in our study features both fundamental similarities and striking differences to the consensus GRE. The primary half-site sequence, TGTTC(T/C) (or its inverted repeat GGAACA), which is almost identical to the GRE half-site sequence, was identified from both the totally random (N18) and partially random (N13) oligonucleotide pools. This similarity might be explained by the proximal (P) box located at the C-terminal stem of the first zinc finger of AR and GR, which share identical amino acid sequences, and is considered to be critical in specifying the primary nucleotide sequence of the HRE half-site (7, 8, 45-47). Despite agreement of the half-site sequences of IDR17 and GRE, the spacing and orientation of the binding sites are remarkably different. Differences in amino acid composition of the distal (D) box and the second zinc finger of the DNA-binding domain of AR and GR could contribute to the spacing and orientation differences of the half-sites (7, 9, 12), although AR has only one residue difference from GR among the five residues that compose the D box (7, 8). Whereas sequences outside the DBD have been implicated in dimer formation, the ability of homo- or heterodimers to bind properly spaced half-sites can be observed for DBDs alone (48-52). While these domains do not form dimers in solution in the absence of DNA, the target DNA serves as a scaffold that induces specific interactions between adjacent monomers, thus stabilizing the DNA-protein complex (53-55).
IDR17 contains three potential binding sites with both inverted and direct repeat configurations and functions as a preferred sequence for AR binding and androgen-dependent transactivation. However, when present in reporter gene constructs cotransfected with full-length AR into CV-1 cells, the truncated DR-1 sequence, rather than the full-length IDR17, activates gene transcription most efficiently. Recent work has demonstrated that TR complexes can contact at least 20 nucleotides, including upstream and downstream flanking sequences, a spacer, and two half-sites (56,57). In fact, random sequence selection assays determined that the optimum binding site for TR consisted of an octamer nucleotide sequence rather than a hexamer (57). Furthermore, direct repeats of the optimal binding site, separated by 1-5 bp, all functioned as equally strong TREs (12,49). More recently, direct repeats of TGTTCT and RGGTCA motifs have been shown to function as response elements for GR and ER, respectively (31). In addition, binding of these receptors to direct repeats with different spacings between the half-sites suggested that binding to direct repeats was more flexible than binding to palindromic elements (31). However, binding of AR to a direct repeat, DR-1, with a one-base overlap remains unique. The asymmetrical head-to-tail arrangement of receptor dimers implied by the tandem direct repeats suggests that the receptor subunits interact through a DNA-supported interface involving the carboxyl-terminal extension of the DNA-binding domain (12) or the ligand-binding domain (31) rather than the dimerization domain in the D box of the DNA-binding domain, which appears to be nonfunctional under such conditions. Studies specific to homodimerization of the ligand-activated AR have suggested that monomers were oriented antiparallel to each other, allowing interaction between N- and C-terminal domains of the opposing monomers (58).
1-6), prepared by isolating a 90-bp fragment, containing two copies of the DR-1 sequence excised from the DR-1 pBLCAT plasmid, and a 55-bp oligonucleotide, containing the GRE sequence (lanes 7-12), were radiolabeled with [γ-³²P]ATP. Approximately 10⁵ cpm of radiolabeled DR-1 (2x) DNA fragment or GRE oligonucleotide was incubated with 50, 100, or 200 ng of AR-DBD or GR-DBD fusion protein in a binding reaction. Free and protein-bound DNA were separated by EMSA on 5% nondenaturing polyacrylamide gels.
A number of GRE-like sequences have been identified in the 5′-flanking or intron regions of some androgen-regulated genes, such as rat prostatic binding protein C3 subunit (27,28), rat probasin (29), human prostate-specific antigen (38), human glandular kallikrein (59), and mouse sex-limited protein (Slp) (13). However, a more comprehensive understanding of what composes an ARE may require a broader search, without bias for a prerequisite GRE-like sequence. A nonconsensus androgen-responsive region, 5′-CAGGGATCAGGGAGTCTCAC-3′, that binds AR and cooperates in androgen-regulated activity of the prostate-specific antigen promoter has recently been reported (60). In fact, most of the GRE-like sequences located in regulatory regions of androgen-responsive genes contain only single receptor binding sites, and the transcriptional activities of these AREs are relatively weak unless assayed in the context of their surrounding nucleotide sequences. Notably, a sequence located in the first intron (In-1) of the androgen-regulated 20-kDa protein gene (16), 5′-TGTCCTGTTCC-3′, resembles the direct repeat sequence identified in our study. The 227-bp fragment (D1) containing this sequence and several other GRE half-site sequences functions as an androgen-specific enhancer. The element 5′-GGTTCTtggAGTACT-3′ in the promoter region of the probasin gene selectively interacts with the DNA-binding domain of the rat AR and not with that of the GR (60). Furthermore, it was concluded that the left subsequence, 5′-GGTTCT-3′, was responsible for excluding the binding of GR (60). Interestingly, direct repeats of GGTTCT separated by a 3-bp spacer were able to specifically bind AR and not GR (60). The observation that direct repeats of binding sites can function as AREs provides the basis for a possible mechanism by which AR preferentially interacts with specific DNA sequences.
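The distinction between direct and inverted arrangements of half-sites, and the meaning of a one-base overlap, can be illustrated with a small sequence scan. The following is a minimal sketch, not the analysis used in this study: it looks for hexamers resembling the TGTTCT half-site on either strand and reports the orientation and spacing of each pair. The one-mismatch tolerance and all function names are assumptions made for the example. The inputs are the In-1 sequence quoted above, written as a continuous 11-nucleotide string, and the 15-bp palindrome of Roche et al. (44) with A chosen at the (A/T) position.

```python
from itertools import combinations

HALF_SITE = "TGTTCT"  # half-site sequence named in the text
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string (N is left unchanged)."""
    return seq.translate(_COMPLEMENT)[::-1]


def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_half_sites(seq: str, max_mismatch: int = 1):
    """Return (start, strand) for each hexamer within max_mismatch of the half-site.

    '+' means the window resembles TGTTCT; '-' means it resembles the
    reverse complement. The tolerance is an illustrative assumption.
    """
    seq = seq.upper()
    k = len(HALF_SITE)
    rc = reverse_complement(HALF_SITE)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if mismatches(window, HALF_SITE) <= max_mismatch:
            hits.append((i, "+"))
        if mismatches(window, rc) <= max_mismatch:
            hits.append((i, "-"))
    return hits


def classify_pairs(seq: str, max_mismatch: int = 1):
    """Return (arrangement, spacing) for every pair of half-site hits.

    Same-strand pairs are 'direct'; opposite-strand pairs are 'inverted'
    (inverted and everted orientations are not distinguished here).
    Spacing is the number of bases between the two hexamers; a negative
    value means they overlap by that many bases.
    """
    k = len(HALF_SITE)
    pairs = []
    for (i, s1), (j, s2) in combinations(find_half_sites(seq, max_mismatch), 2):
        arrangement = "direct" if s1 == s2 else "inverted"
        pairs.append((arrangement, j - (i + k)))
    return pairs


if __name__ == "__main__":
    print(classify_pairs("TGTCCTGTTCC"))      # In-1 sequence from the text
    print(classify_pairs("GGAACAnnnTGTTCT"))  # palindrome from Roche et al. (44)
```

With these settings the In-1 sequence yields a single direct pair with a spacing of -1, that is, two head-to-tail hexamers sharing one base, while the palindrome yields a single inverted pair separated by 3 bp. A stricter or looser mismatch tolerance would change which hexamers are counted as half-sites.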
In summary, a 17-bp androgen receptor-specific binding site was identified. This sequence shares the same receptor binding site core motif as the consensus GRE but differs in the context in which these core motifs are arranged. The identified sequence showed preferential binding to AR in gel mobility shift assays and specific response to androgen induction in transfection assays, indicating that an ARE sequence distinct from the consensus GRE may confer androgen specificity.