Mapping the protein/DNA contact sites of the Ah receptor and Ah receptor nuclear translocator.

The Ah receptor (AHR) and its DNA binding partner, the Ah receptor nuclear translocator (ARNT), are basic helix-loop-helix proteins distinguished by their PER, AHR, ARNT, and SIM (PAS) homology regions. To identify the amino acids of the AHR·ARNT heterodimer that contact the TNGCGTG recognition sequence, we have performed deletion mapping and amino acid substitutions within the N termini of both the AHR and ARNT. The ability of the variant AHR and ARNT proteins to bind DNA and activate gene transcription was determined by gel shift analysis and transient transfection assays. We have found that the amino acids of ARNT that contact DNA are similar to those of other basic/helix-loop-helix proteins and include glutamic acid residue 83 and arginine residues 86 and 87. Although our initial experiments indicated that DNA binding of the AHR may involve two regions that are bordered by amino acids 9-17 and amino acids 34-42, further analysis demonstrated that only amino acids 34-39 are critical for the AHR·TNGC interaction. These experiments indicate that while the structural features of the ARNT·GTG complex may closely resemble those deduced for proteins such as Max, E47, and USF, the AHR·TNGC complex may represent a unique DNA binding form of basic/helix-loop-helix proteins.

The AHR and ARNT are bHLH/PAS proteins that represent a family of transcription factors containing basic, helix-loop-helix, and PAS regions involved in both dimerization and DNA recognition (1-7). The distinguishing feature of the bHLH/PAS protein family is the PAS domain, which, like the leucine zipper, is thought to act as a secondary dimerization surface (4,5). In addition to the AHR and ARNT, the bHLH/PAS family members include PER, a Drosophila protein that regulates circadian rhythms; SIM, a Drosophila protein involved in central nervous system midline development; and HIFα, a protein involved in mediating the cellular responses to hypoxia (2, 8-11).
Crystallization and biochemical studies suggest that residues within the basic region specify and contact the DNA recognition site, whereas the helix-loop-helix and often a second domain such as the leucine zipper, or possibly the PAS region, participate in protein dimerization (4, 5, 12-14). Basic regions are highly conserved domains that are present in several classes of transcription proteins, including the bHLH family (e.g. MyoD and E12) and the bHLH/LZ family (e.g. Myc and Max). Crystal structure analysis of bHLH and bHLH/LZ proteins has revealed several important aspects of how many proteins that contain bHLH motifs may contact and recognize DNA (12-14). First, DNA contact is contained within the basic region, the loop, and the second helix. Second, only the basic region undergoes a transition from a random coil structure to that of an α helix as it initiates contact with its DNA recognition site. The first helix is merely an extension of the helix of the basic region. Formation of this helical structure is thought to play an important role in determining how specific recognition of DNA sequences may be achieved. However, the basic region of the AHR harbors four proline residues, suggesting that the AHR may not participate in this characteristic change in conformation as it initiates contact with its TNGC recognition half-site.
The AHR and ARNT are involved in mediating many of the cellular responses resulting from exposure to polyhalogenated aromatic hydrocarbons, including that of the prototypical ligand, 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) (1). The AHR exists in the cytosol complexed to a number of proteins including Hsp90. Upon ligand (TCDD) binding, the AHR translocates to the nucleus and dimerizes with its DNA binding partner ARNT to interact with specific sequences termed DREs (dioxin-responsive elements) (3, 15-17). DNA binding of the AHR·ARNT heterodimer results in the transcriptional activation of genes such as cytochrome P4501A1, cytochrome P4501A2, and NADPH-quinone reductase (see Ref. 18 for review). The DNA binding regions (basic regions) of the AHR and ARNT have been localized to their respective N termini, whereas the transcriptional activation domains reside within the C termini (3, 7, 19-21).
The majority of bHLH proteins form homo- or heterodimers that recognize the consensus sequence CANNTG such that each protein partner specifies its DNA recognition half-site (i.e. 5′ CAN or 3′ NTG) (22,23). Crystal structure analyses of the bHLH proteins USF and Max suggest that an average of 10-12 protein-DNA contacts are formed between the bHLH dimers and the 6-base pair recognition site (12,14). In contrast to that of other bHLH proteins, the DNA recognition sites of the bHLH/PAS protein family fail to conform to the CANNTG consensus sequence. For example, the AHR and ARNT heterodimerize to recognize the DRE that is composed of the sequence TNGCGTG (17), while ARNT and HIFα heterodimerize to recognize the sequence TACGTG (24). Previous work has demonstrated that the AHR occupies the 5′ region of the DRE and specifies the DNA half-site TNGC, whereas ARNT occupies the 3′ region of the DRE and specifies the half-site GTG (25,26). In addition, it has been determined that ARNT, but not the AHR, homodimerizes to recognize the CACGTG sequence that conforms to the CANNTG consensus site (25,27,28). The DNA binding form of ARNT may closely resemble the three-dimensional structural models developed for the DNA binding forms of the bHLH proteins Max and USF (12,14), since it binds to the identical GTG half-site recognized by these proteins and shares considerable homology within its basic region. In contrast, analysis of both the amino acid composition of the AHR basic region and its DNA recognition half-site indicates that the AHR may interact with DNA in a distinct manner.
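The relationship between the recognition sequences quoted above can be illustrated with a short script. This is only a minimal sketch and not part of the experimental work: the function names and the example probe sequence are hypothetical, N is assumed to stand for any of the four bases, and only the strand as written is scanned (no reverse complement).

```python
import re

# Consensus strings quoted in the text; N is treated as "any base".
CONSENSUS = {
    "bHLH consensus": "CANNTG",
    "DRE (AHR/ARNT)": "TNGCGTG",
    "HIF-alpha/ARNT site": "TACGTG",
}


def consensus_to_pattern(consensus):
    """Translate a consensus string into a regex source (N -> [ACGT])."""
    return "".join("[ACGT]" if base == "N" else base for base in consensus)


def find_sites(sequence, consensus):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # The lookahead lets the regex engine report overlapping matches.
    pattern = re.compile("(?=" + consensus_to_pattern(consensus) + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    probe = "GGTTGCGTGAGA"  # hypothetical sequence containing one DRE core
    for name, consensus in CONSENSUS.items():
        print(name, consensus, find_sites(probe, consensus))
```

Scanning the ARNT homodimer site CACGTG with the CANNTG pattern gives a match, whereas a DRE core such as TTGCGTG does not match CANNTG, which is the non-conformity described in the text.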
To understand the tertiary structure of the DNA binding form of the AHR·ARNT heterodimer, how each partner may interact with its recognition half-site, and the amino acids that dictate DNA recognition and specificity of bHLH/PAS proteins, we have performed deletion analysis and amino acid substitutions within the basic regions of both the AHR and ARNT. Although previous work has indicated that DNA binding of the AHR may involve two discrete regions within the basic domain, we present evidence that only amino acids 34-39 are involved in nucleotide and/or phosphate contacts. This study provides evidence that while ARNT may interact with the GTG half-site in a manner similar to that of other bHLH proteins, the conformation of the AHR·TNGC interaction does not fit this prototypical structure and may represent an as yet uncharacterized structural form of bHLH proteins.

EXPERIMENTAL PROCEDURES
General Procedures-Standard reaction mixtures for all PCR experiments were: 10 mM Tris-HCl, 50 mM KCl, 1.5 mM MgCl2, 0.001% gelatin, 200 μM of each deoxyribonucleotide triphosphate, and 2.5 units of Pfu polymerase in a total volume of 100 μl. Thermal cycler conditions were: 95°C for 1 min, 55°C for 1 min, 72°C for 1 min for 35 cycles, followed by 72°C for 10 min. The amplified products were purified following agarose gel electrophoresis (0.8%) and electroelution and subcloned using standard molecular biology procedures. The amplified "megaprimers" were purified using NuSieve agarose (NuSieve, FMC Bioproducts, Rockland, ME). Sequencing was performed using the dideoxy chain termination method (30).
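As a quick arithmetic check on the standard reaction, the amounts per tube and the programmed cycling time can be worked out as follows. This is a sketch under stated assumptions: the deoxyribonucleotide concentration is read as micromolar and the reaction volume as microliters, the helper names are hypothetical, and ramp times between temperature steps are ignored.

```python
def amount_nmol(concentration_uM, volume_ul):
    """Amount of solute in nmol (uM x ul gives pmol; divide by 1000 for nmol)."""
    return concentration_uM * volume_ul / 1000.0


def programmed_minutes(cycles, step_minutes, final_extension_min):
    """Total programmed hold time in minutes, excluding ramping between steps."""
    return cycles * sum(step_minutes) + final_extension_min


if __name__ == "__main__":
    volume_ul = 100
    # 200 uM of each dNTP in 100 ul
    print("each dNTP (nmol):", amount_nmol(200, volume_ul))
    # 1.5 mM MgCl2 = 1500 uM in 100 ul
    print("MgCl2 (nmol):", amount_nmol(1500, volume_ul))
    # 35 cycles of 1 min at each of three temperatures, then 10 min at 72 C
    print("programmed time (min):", programmed_minutes(35, (1, 1, 1), 10))
```

The same helper applies to the 50-cycle variant used for the final megaprimer reactions by changing the cycle count.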
N-terminal Deletions of the AHR and ARNT-To generate N-terminal deletions of the AHR, we performed PCR using plasmid pmuAHR as the template, OL 123 as the 3′ oligonucleotide, and the following 5′ oligonucleotides: OL 67 (AHRNΔ9CΔ516), OL 55 (AHRNΔ17CΔ516), OL 254 (AHRNΔ34CΔ516), and OL 127 (AHRNΔ42CΔ516). The resulting PCR products were subcloned into the SpeI and HindIII sites of the pSport vector. All 5′ oligonucleotides contained a synthetic Kozak consensus sequence (31). These AHR constructs bear a deletion of 516 C-terminal amino acids and interact with ARNT and the DRE in a ligand-independent manner (3). The ARNT constructs were derived from pBM5NeoM1-1 and lack the N-terminal 15-amino acid alternatively spliced exon (15). ARNTCΔ418 was constructed using OL 177 as the 5′ oligonucleotide and OL 175 as the 3′ oligonucleotide. To generate N-terminal deletions of ARNT, we performed PCR using phuARNT as the template, OL 175 as the 3′ oligonucleotide, and the following 5′ oligonucleotides: OL 178 (ARNTNΔ74CΔ418), OL 267 (ARNTNΔ79CΔ418), OL 266 (ARNTNΔ87CΔ418), and OL 264 (ARNTNΔ150CΔ418). The resulting PCR products were subcloned into the BamHI site of the pGem7Zf vector. All ARNT constructs bear a deletion of 418 amino acids from the C termini, resulting in ARNT constructs that correspond to AHRCΔ516.
Amino Acid Substitutions-To generate single amino acid substitutions, we used the "megaprimer" method as described previously (32). The megaprimer is a PCR product that is gel-purified and used as a primer in subsequent PCR reactions. To generate amino acid substitutions within the AHR basic region, we performed the standard PCR reactions using plasmid pmuAHR as the template, OL 209 (corresponding to the SP6 promoter site) as the 5′ oligonucleotide, and the following 3′ oligonucleotides containing substituted nucleotides to generate the megaprimers: OL 501 (AHRB1Q5CΔ516), OL 502 (AHRB2Q5CΔ516), and OL 268 (AHRQ14CΔ516).
To generate the final PCR products, the standard reactions were performed except that the cycle number was increased to 50, the megaprimer was used as the 5′ primer, and OL 123 was used as the 3′ oligonucleotide. The PCR products were then subcloned into the SpeI and HindIII sites of the pSport vector. These constructs bear a deletion of 516 amino acids from the C termini of the AHR. To generate AHR constructs capable of transcriptional activation, the NaeI and HindIII fragment of pmuAHR was subcloned into the plasmids containing AHRQ14Q15Δ516, AHRNΔ17CΔ516, AHRQ39CΔ516, AHRB1Q5CΔ516, and AHRB2Q5CΔ516.
To generate amino acid substitutions within the basic region of ARNT, we performed PCR reactions identical to those described for the AHR except using huARNT as the template, OL 177 as the 5′ oligonucleotide, and the following 3′ oligonucleotides containing substituted nucleotides to generate the megaprimers: OL 356 (ARNTQ86P87CΔ418 and ARNTQ86Q87CΔ418), OL 432 (ARNTQ84Q85CΔ418), and OL 500 (ARNTD83CΔ418). The resulting megaprimers were then used as the 5′ primer with OL 175 as the 3′ oligonucleotide. The final PCR products were subcloned into the BamHI site of the pGem7Zf vector (Promega, Madison, WI). These ARNT constructs bear a deletion of 418 amino acids from the C terminus of ARNT. To generate ARNT constructs capable of activating gene transcription, the XbaI/SpeI fragments of ARNTNΔ79CΔ418, ARNTNΔ87CΔ418, and ARNTD83CΔ418 were cloned into phuARNT.
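The effect of the substitutions on the primary sequence can be sketched computationally. The following is an illustration only, not the procedure used to design the oligonucleotides: the function name is hypothetical, the "R86Q" style of notation (wild-type residue, position, new residue) is an assumption introduced here rather than the construct nomenclature of this study, and the input fragments are limited to the residues named in the text (ARNT residues 83-87, ERRRR; AHR residues 34-39, PSKRHR).

```python
import re


def apply_substitutions(fragment, substitutions, first_residue=1):
    """Apply point substitutions such as 'R86Q' to a protein fragment.

    fragment      -- one-letter amino acid sequence
    substitutions -- iterable of strings: wild-type residue, position, new residue
    first_residue -- residue number of fragment[0] in the full-length protein
    """
    residues = list(fragment)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError("cannot parse substitution: " + sub)
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        index = position - first_residue
        if not 0 <= index < len(residues):
            raise ValueError("position outside fragment: " + sub)
        # Guard against numbering errors by checking the expected wild-type residue.
        if fragment[index] != wild_type:
            raise ValueError("wild-type residue mismatch: " + sub)
        residues[index] = new
    return "".join(residues)


if __name__ == "__main__":
    arnt_83_87 = "ERRRR"   # ARNT residues 83-87, as named in the text
    ahr_34_39 = "PSKRHR"   # AHR residues 34-39, as named in the text
    print(apply_substitutions(arnt_83_87, ["E83D"], first_residue=83))
    print(apply_substitutions(arnt_83_87, ["R86Q", "R87P"], first_residue=83))
    print(apply_substitutions(ahr_34_39, ["P34L", "S35E"], first_residue=34))
```

The wild-type check is a deliberate design choice: it makes an off-by-one error in residue numbering fail loudly instead of silently producing the wrong variant.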
Protein Expression-In vitro expression of all AHR and ARNT constructs was performed using rabbit reticulocyte lysates (Promega, Madison, WI) as described previously (3). For verification of protein expression, the translation reactions were performed in the presence of [35S]methionine, and the products were analyzed by SDS-polyacrylamide gel electrophoresis. The expressed proteins were quantitated by excising the radiolabeled proteins from the gel and scintillation counting. Baculovirus expression and purification of histidine-tagged AHR and ARNT were carried out as described previously (33).
Gel Shift Analysis-The DNA probe (annealed OL 318/319) containing the DRE was radiolabeled with [γ-32P]ATP by end labeling with T4 polynucleotide kinase (34). In vitro expressed AHR and ARNT proteins were incubated for 30 min at 30°C to aid dimerization. Nonspecific competitor, poly(dI-dC), was added and the mixture incubated at room temperature for 10 min. The radiolabeled probe (100,000 cpm, 0.5 ng) was added and incubated for 10 min, followed by nondenaturing gel electrophoresis using 0.5× TBE (45 mM Tris base, 45 mM boric acid, 1 mM EDTA, pH 8.0) as the running buffer (35).
Coprecipitation Analysis-Coprecipitation analysis was performed essentially as described previously (25). Briefly, Sf9 soluble extract containing approximately 2 fmol of baculovirus-expressed ARNT or AHR and 35S-labeled reticulocyte lysate-expressed proteins were incubated with Ni-NTA-agarose (Qiagen, Chatsworth, CA) in wash buffer (50 mM Tris, pH 7.4, 100 mM KCl, 10% glycerol, 10 mM β-mercaptoethanol, 0.4% Tween 20, and 5 mM imidazole) for 2 h at 4°C with gentle mixing. As a negative control, baculovirus AHR or ARNT was incubated with either 35S-labeled ARNTNΔ150CΔ418 or AHRGNΔ315, respectively. These constructs lack the HLH/PAS dimerization surfaces. The agarose was washed five times using wash buffer and pelleted following centrifugation at 16,000 × g for 10 s. The pellets were resuspended and analyzed by SDS-polyacrylamide gel electrophoresis and autoradiography.
Cell Culture and Transient Transfections-LA-II and CV-1 cells were maintained in Dulbecco's modified Eagle's medium supplemented with 10% fetal bovine serum, 100 units/ml penicillin, and 100 μg/ml streptomycin at 37°C in a humidified 5% CO2 atmosphere. For transient transfections, CV-1 or LA-II cells (0.6 × 10^6 cells) were subcultured to 90% confluence, trypsinized, and maintained in 60-mm dishes. The following day, fresh medium was added and transient transfections were performed using the calcium phosphate precipitation method essentially as described (36). At 16 h after the transfection, the cells were rinsed with medium, then phosphate-buffered saline, and further incubated in 3 ml of medium. The cells were treated with 10 μM β-naphthoflavone 24 h following transfection and cultured for an additional 16 h. To prepare soluble extracts, the cells were washed twice with phosphate-buffered saline, scraped in 100 μl of buffer (0.25 M Tris, pH 7.8), and lysed following three cycles of freeze/thaw. The soluble extracts were obtained following centrifugation at 16,000 × g at 4°C and stored at −80°C until needed for further analysis.
β-Galactosidase and CAT Assays-10-μl aliquots of the soluble extracts were incubated with 2× assay buffer (120 mM Na2HPO4, 2 mM MgCl2, 100 mM β-D-galactopyranoside) in a total reaction volume of 300 μl at 37°C for 30 min. The reaction was terminated with the addition of 500 μl of 1 M Na2CO3, and the absorbance at 420 nm was determined. For CAT assays, 5-20 μl of extract was incubated with 1.0 mM acetyl-CoA, 0.25 μCi of [14C]chloramphenicol, and 423 mM Tris (pH 7.8) for 3 h at 37°C (37). The reactions were extracted with ethyl acetate, and the organic phase was dried under vacuum and resuspended in 20 μl of ethyl acetate. The products were analyzed following elution by silica thin layer chromatography using a chloroform/methanol (97:3, v/v) solvent and quantitated by scintillation counting. The activity of each AHR or ARNT protein is expressed as percent of wild-type activity and was calculated by dividing the percent acetylation resulting from each variant plasmid by percent acetylation from the wild-type plasmid (i.e. either wild-type AHR or wild-type ARNT). To adjust for differences in transfection efficiencies, the CAT values were normalized using β-galactosidase activity. Each construct was tested in at least two independent experiments.
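The normalization described above can be written out as a short calculation. This is a minimal sketch, not the analysis script used in this study: the function names and the numerical inputs are hypothetical, β-galactosidase activity is assumed to be represented by the absorbance at 420 nm, and normalization is assumed to mean dividing percent acetylation by that absorbance before taking the ratio to wild type.

```python
def normalized_cat(percent_acetylation, bgal_a420):
    """Percent acetylation corrected for transfection efficiency (A420 of the beta-gal assay)."""
    if bgal_a420 <= 0:
        raise ValueError("beta-galactosidase absorbance must be positive")
    return percent_acetylation / bgal_a420


def percent_of_wild_type(variant, wild_type):
    """Activity of a variant as percent of wild type.

    Each argument is a (percent_acetylation, bgal_a420) pair.
    """
    return 100.0 * normalized_cat(*variant) / normalized_cat(*wild_type)


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    wild_type = (40.0, 1.0)   # 40% acetylation, A420 = 1.0
    variant = (10.0, 0.5)     # 10% acetylation, A420 = 0.5
    print(percent_of_wild_type(variant, wild_type))
```

In this hypothetical case the variant shows one-quarter of the wild-type acetylation but was transfected at half the efficiency, so its normalized activity is half that of wild type.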

RESULTS AND DISCUSSION
Overall Strategy-Domain mapping of the AHR and ARNT has previously demonstrated that the regions of the AHR and ARNT involved in DNA binding lie within their N termini (3,6,7). To identify the boundaries of the DNA binding regions of the AHR and ARNT, we performed deletion analysis of the N termini of both proteins. Next, to determine which amino acids within our newly defined DNA binding regions are critical for protein/DNA contacts of the AHR and ARNT, we performed amino acid substitutions within the DNA binding regions. The nucleotide composition of each construct was confirmed by sequence analysis, while correct translation of the predicted molecular weight species was verified for all variant proteins by 35S-labeling and SDS-polyacrylamide gel electrophoresis (data not shown). Protein concentrations were determined by scintillation counting and normalized to that of the wild-type protein when analyzed by the gel shift and coprecipitation assays. To confirm that the AHR and ARNT N-terminal deletions and amino acid substitutions affected only DNA binding, and not the ability of the AHR to heterodimerize with ARNT, we performed affinity coprecipitation analysis. To demonstrate specificity of the AHR or ARNT interaction, we performed affinity coprecipitation analysis using the 35S-labeled AHRGNΔ315 construct (3,25), in which most of the dimerization domain of the AHR has been replaced by the DNA binding and dimerization domain of Gal4, or the ARNT construct lacking the corresponding regions, ARNTNΔ150CΔ418. All 35S-labeled AHR or ARNT proteins were capable of dimerizing with their respective DNA binding partner (see Figs. 1C, 2C, 3C, and 4, D and E).
Determination of the Minimal ARNT DNA Binding Region-We first focused on identifying the minimal DNA binding region of ARNT by generating a series of ARNT constructs that harbored deletions from their N termini and verified and quantitated their expression (Fig. 1A). Equal amounts of each ARNT protein were examined for their ability to bind DNA using the gel shift assay (Fig. 1B) and their ability to interact with the AHR using the coaffinity precipitation assay (Fig. 1C). As shown in Fig. 1B, the minimal region of ARNT that is involved in DNA contacts lies C-terminal to amino acid residue 74. Within the region bordered by amino acid residues 74 and 87 is the amino acid sequence ERRRR, which is highly conserved among other bHLH proteins, such as USF and Max (22). bHLH proteins have been classified into several groups according to the amino acid composition of their basic regions and their DNA recognition half-sites. Class A includes proteins that contain the amino acids ERRR and recognize the CAGCTG sequence. Class B typifies proteins that have an additional conserved R residue that is C-terminal to that of class A (i.e. ERRRR) and recognize the CACGTG site and the GTG recognition half-site. Since ARNT harbors the requisite C-terminal R residue and recognizes the GTG half-site, we have previously designated ARNT as a class B protein (25). This comparison of the amino acid composition and recognition half-site of ARNT predicts that the amino acids of ARNT that contact the GTG recognition half-site may be identical to those previously identified within the basic regions of USF and Max.
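The classification rule stated above can be expressed as a simple lookup. This is a deliberately simplified sketch: the function name is hypothetical, and the rule is reduced to a literal search for the ERRR or ERRRR strings given in the text, which is an assumption made for illustration and not a substitute for alignment of complete basic regions.

```python
def classify_basic_region(basic_region):
    """Return 'B' if the fragment contains ERRRR, 'A' if it contains only ERRR, else None."""
    sequence = basic_region.upper()
    # ERRRR contains ERRR, so the longer class B motif must be tested first.
    if "ERRRR" in sequence:
        return "B"
    if "ERRR" in sequence:
        return "A"
    return None


if __name__ == "__main__":
    print(classify_basic_region("ERRRR"))   # ARNT residues 83-87, as named in the text
    print(classify_basic_region("PSKRHR"))  # AHR residues 34-39, as named in the text
```

Under this rule the ARNT fragment falls into class B, whereas the AHR fragment fits neither class, in keeping with the distinct behavior of the AHR basic region.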
Determination of Essential Amino Acids Involved in the DNA Binding of ARNT-To confirm that the conserved ERRRR amino acid sequence of the ARNT basic region interacts with the GTG half-site in the manner described for other class B proteins, we performed amino acid substitutions within this region (Fig. 2A). Again, the expression of the ARNT constructs was verified and quantitated by SDS-gel electrophoresis and autoradiography (data not shown). To establish the similarity between the DNA binding of ARNT and other class B bHLH proteins, we first substituted the characteristic glutamate residue (residue 83 of ARNT) with aspartic acid. Crystal structure analysis has revealed that the corresponding glutamate residue of both class A and B bHLH proteins contacts the CA of the CANNTG recognition site (12,14). Substitution of glutamate with aspartic acid shortens the side chain by one carbon atom and is sufficient to interrupt the side chain-base contact of USF with its DNA recognition site. Glutamate 83 of ARNT appears to play a similar role, since substitution of this residue with aspartic acid abolishes the ability of ARNT to recognize the GTG half-site (Fig. 2B, lane 2) but does not affect the ability of ARNT to heterodimerize with the AHR (Fig. 2C, lane 2).
Next, we substituted arginine residues 84 and 85 of ARNT with glutamine residues. This substitution reduced DNA binding of the AHR·ARNT complex by approximately 50% (Fig. 2B, lane 3). However, the ability of this construct (ARNTQ84Q85CΔ418) to heterodimerize with the AHR was impaired, thus complicating interpretation of its ability to interact with DNA (Fig. 2C, lane 3). In previous studies, substitution of amino acid residue 84 with alanine diminished the ability of the AHR·ARNT heterodimer to interact with DNA (38,39), indicating that these residues may play a role in the DNA interactions of ARNT but are not involved in critical amino acid/nucleotide interactions. It is likely that these arginine residues are involved in phosphate backbone contacts analogous to those identified for corresponding amino acids within the basic regions of Max, MyoD, and USF (12,14,40). We also introduced a proline residue within the proposed DNA binding region of ARNT (ARNTQ86P87CΔ418, Fig. 2B, lane 5). The corresponding residues of other bHLH proteins have been shown to confer the ability to distinguish between the inner nucleotides of the CANNTG sites (CACGTG versus CAGCTG) and make contact with the G of the CACGTG site (12,14,22). In addition, the arginine of E47 and Max that corresponds to arginine 86 of ARNT has been demonstrated to position the orientation of the E47 heterodimer (13). The fact that both the Q86P87 and Q86Q87 substitutions of ARNT abolished DNA binding (Fig. 2B, lanes 4 and 5), yet did not affect dimerization (Fig. 2C, lanes 4 and 5), is in agreement with previously reported data (38,39) and indicates that these arginine residues may act in a manner similar to that described for the corresponding amino acids of USF, Max, and E47. Thus, the critical DNA binding residues of ARNT are defined by glutamate 83, arginine 86, and arginine 87.
FIG. 2. ... as described in Fig. 1. C, coprecipitation analysis of ARNT proteins containing amino acid substitutions. Equal concentrations of 35S-labeled ARNT proteins were analyzed by the coprecipitation assay as described in Fig. 1C and under "Experimental Procedures."
Determination of the Minimal DNA Binding Region of the AHR-To determine the boundaries that define the DNA binding region of the AHR, we generated a series of constructs bearing N-terminal deletions (Fig. 3A). To simplify the steps involved in DNA binding of the AHR, i.e. ligand binding, Hsp90 dissociation, and heterodimerization, we utilized the AHRCΔ516 construct that interacts with ARNT and the DRE in a ligand-independent manner (3). Gel shift analysis of the AHR constructs that bear N-terminal deletions revealed that the minimal DNA binding region of the AHR is defined by the constructs AHRNΔ34CΔ516 and AHRNΔ42CΔ516 (Fig. 3, B and C). These results confirm the previously predicted DNA binding domain that was based on amino acid alignment of the AHR with other bHLH proteins (2, 3, 7). Interestingly, elimination of the first 17 amino acids from the N terminus of the AHR resulted in a loss of 75% of wild-type DNA binding (Fig. 3B, lane 3) and indicated that although the region essential for DNA binding of the AHR lies within the area bordered by amino acids 34 and 42, a second region bordered by amino acids 9 and 17 may also be involved in DNA binding of the AHR. In addition, previous work has indicated that arginine residue 14 may be involved in DNA binding of the AHR (38,39). Based on these results, we predicted that DNA binding of the AHR may involve two regions, basic region 1 and basic region 2. We have defined the area bordered by the constructs AHRNΔ9CΔ516 and AHRNΔ17CΔ516 as basic region 1, whereas the region that lies between amino acids 34 and 42 is defined as basic region 2. This tentative assignment is supported by similar data obtained from studies that characterized the DNA binding regions of MyoD and USF (14,40,41). Here, amino acids within two regions of MyoD and USF were shown by both site-directed mutagenesis and crystal structure analysis to mediate protein/DNA interactions.
Although basic region 2 of both MyoD and USF is involved in DNA contacts, the nucleotides involved with respect to the CANNTG site differ. The primary contacts within basic region 2 of MyoD consist of two arginine residues that interact with the T and G nucleotides of the CANNTG site. In contrast, the arginine residues that lie within basic region 2 of USF contact nucleotides that flank the CANNTG site. The corresponding region of the AHR, basic region 1, also contains two arginine residues. However, the spacing between the two basic regions of the AHR is significantly larger (18 amino acids) than that between basic region 2 and 3 of USF and MyoD (5-6 amino acids) (14,40,41). We theorized that the presence of prolines 24 and 26 of the AHR may allow adjustment of the distance between basic regions 1 and 2 of the AHR. In this manner, we predicted that the AHR, like USF and MyoD, may enlist two distinct regions to interact with DNA.
Determination of Amino Acids Essential for DNA Binding of the AHR-To test our hypothesis that DNA binding of the AHR involves two regions, basic region 1 and basic region 2, we generated a series of constructs that introduced substitutions with glutamine residues (Fig. 4A). We first replaced arginine and lysine residues 12-16 with glutamine residues (Fig. 4A). Gel shift analysis indicated that DNA binding of the construct containing these glutamine substitutions (AHRB1Q5CΔ516) was similar to that of the wild-type AHR protein (Fig. 4B, lanes 1 and 2). However, similar multiple substitutions within basic region 2 (AHRB2Q5CΔ516) obliterated the ability of the AHR to interact with DNA, indicating that basic region 2 is the region with which the AHR contacts DNA (Fig. 4B, compare lanes 1 and 3). These results are in contrast with previously published studies in which substitutions of arginine residues 14 and 15 with either alanine or lysine residues severely affected DNA binding of the AHR (38,39).
To further delineate the amino acids of the AHR that are involved in contacting the 5′ region of the TNGCGTG recognition site, we substituted individual residues within basic regions 1 and 2 with either glutamine or proline residues (Fig. 4A). Given the fact that arginine residues typically contact the DNA recognition site of bHLH proteins, we first concentrated on arginine residues (12-14, 40). The expression of each AHR protein was quantitated and equimolar amounts analyzed for their ability to bind DNA and dimerize with ARNT (Fig. 4, C-E). To further confirm that basic region 1 of the AHR is not involved in DNA contacts, we substituted arginine residues 14 and 15 with either glutamine or proline residues. As shown in Fig. 4C (lanes 1-3), substitution of arginine 14 and 15 with either glutamine or proline residues did not significantly alter DNA binding of the AHR·ARNT complex, indicating that arginine residues 14 and 15 are not involved in essential contacts with the TNGCGTG recognition site.
Since substitutions within basic region 1, including the presence of a helix-breaking proline, did not affect the ability of the AHR to bind DNA, we next focused our attention on substitutions within basic region 2. The goal of our first substitution within this region was to change the DNA binding specificity of the AHR to resemble that of Myc. Toward this end, we introduced leucine 34 and glutamic acid 35, generating the amino acid sequence LEKRHR within the basic region of the AHR. (The corresponding region of Myc is LERQRR (22).) The AHRL34E35CΔ516 construct did not recognize the CAC half-site (data not shown) and formed a DNA binding complex at approximately 40% of the level of wild-type AHR (Fig. 4C, compare lanes 1 and 7). The fact that the AHRL34E35CΔ516 protein was capable of interacting with ARNT (Fig. 4E, lane 7), yet this AHR·ARNT heterodimer did not recognize the CACGTG recognition binding site, suggests that the proper positioning required for the AHRL34E35CΔ516 protein to interact with the CAC recognition half-site (42) may not be sufficiently dictated by heterodimerization with the ARNT protein.
The second piece of evidence that basic region 2 is a primary site of amino acid/DNA contacts is provided by the fact that insertion of a proline residue at residue 37 virtually abolished AHR·ARNT DNA binding (18% of wild-type, Fig. 4C, lane 6). More importantly, a single substitution, replacement of arginine 39 with a glutamine residue, also significantly reduced DNA binding (54% of wild-type, Fig. 4C, lane 9). These data indicate that arginine 39, which lies within basic region 2, is involved in critical contacts of the AHR basic region with its TNGC half-site, as has been shown in previous studies (38,39). In summary, the amino acids of the AHR that are critical for DNA contact are defined by proline 34, serine 35, lysine 36, arginine 37, histidine 38, and arginine 39.
We next questioned whether the presence of helix-breaking prolines at amino acids 24 and 26, located between basic region 1 and basic region 2, was necessary for DNA binding of the AHR. Substitution of prolines 24 and 26 with alanines did not substantially reduce DNA binding (approximately 70% of wild-type, Fig. 4C, lane 4). Since the absence of prolines 24 and 26 likely elicits a gross change in the conformational structure within basic region 1 of the AHR, these results indicate that this region of the AHR is not involved in critical contacts with the TNGC half-site of the AHR. However, the apparent requirement for four prolines within the basic region of the AHR suggests that in contrast to other bHLH proteins, such as USF, the basic region of the AHR may not form a contiguous α helix as it contacts the DRE, but may involve several structural turns.
FIG. 4. ... ARNT and analyzed by gel shift analysis as described in Fig. 1. C, gel shift analysis of the AHR proteins containing amino acid substitutions within the N terminus of the AHR. D, coprecipitation analysis of AHR proteins containing amino acid substitutions within basic region 1 and basic region 2. Equal concentrations of 35S-labeled AHR proteins were analyzed by the coprecipitation assay as described in Fig. 3C and under "Experimental Procedures." E, coprecipitation analysis of AHR proteins containing amino acid substitutions within the N terminus of the AHR.
Analysis of Gene Activation by the Variant AHR and ARNT Proteins-To confirm that the gel shift analysis accurately predicts the function of the AHR and ARNT constructs bearing amino acid substitutions, we analyzed several AHR and ARNT proteins for their ability to activate gene transcription (Fig. 5). As predicted from our gel shift analysis, lack of the amino acids that lie C-terminal to amino acid 74 as well as substitution of amino acid 83 of ARNT obliterated transcriptional activation by the AHR·ARNT complex (Fig. 5B). In contrast, the gel shift analysis of the AHR proteins did not accurately reflect their ability to activate gene transcription (Fig. 5C). Although introduction of glutamine residues within basic region 1 (AHRB1Q5CΔ516 and AHRQ14Q15Δ516) did not affect the ability of the AHR to interact with the GCGTG sequence in vitro (Fig. 4, B and C, lanes 1 and 2), the ability to activate gene transcription was significantly altered (approximately 73% and 44% of wild type, respectively; Fig. 5C). However, substitutions within the region bordered by amino acids 34-42 (AHRB2Q5 and AHRQ39) did significantly decrease the ability of the AHR·ARNT complex to activate gene transcription (17% and 13% of wild-type, respectively; Fig. 5C), verifying that basic region 2 is the primary site involved in amino acid/DNA contacts.
Our results indicate that the primary DNA contacts of the AHR are specified by amino acids 34-39. These results contradict previous studies in which amino acid residue 14 appeared to be involved in DNA binding of the AHR (38, 39). Although the reason for this discrepancy is unclear, it may be due to the nature of the amino acid substitutions. For example, substitution of arginine residue 14 with glutamine would retain the polarity of residue 14, likely preserving the secondary structure within the substituted region, whereas substitution with alanine may result in a more dramatic change in the structural conformation of the AHR.
FIG. 5. Determination of the ability of the AHR and ARNT constructs to activate gene transcription. CV-1 or LA-II cells were transiently transfected with plasmids encoding the indicated AHR or ARNT constructs, respectively, and the reporter vector pHAV-CAT containing the CAT reporter gene regulated by the upstream region of cytochrome P4501A1 as described under "Experimental Procedures." When AHR constructs were analyzed, the phuARNT plasmid was cotransfected, whereas when ARNT constructs were analyzed, the pmuAHR plasmid was cotransfected. A, representative CAT assays of extracts from LA-II cells transfected with the indicated ARNT constructs, pmuAHR, and pHAV-CAT as described under "Experimental Procedures." B, comparison of CAT activities of extracts from LA-II cells transfected with the indicated ARNT constructs. C, comparison of CAT activities of extracts from CV-1 cells transfected with the indicated AHR constructs. The values are calculated as percent of wild-type following subtraction of the sample generated in the absence of either cotransfected AHR or ARNT. CAT activities were normalized using the values obtained from the β-galactosidase assay and are averages of two independent experiments. The standard error was never greater than 25%.
We propose that amino acid residues 14 and 15 are not involved in either nucleotide or phosphate contacts. This idea is supported by two observations. First, substitution of the arginine residues at amino acid positions 14 and 15 with glutamine residues did not affect the ability of the AHR·ARNT complex to interact with DNA (Fig. 4C) or its equilibrium binding constant (KD), 2 and permitted gene activation by the AHR·ARNT complex (Fig. 5C). Second, although an arginine to alanine substitution at residue 14 abolished in vitro DNA binding of the AHR·ARNT complex, its ability to activate gene transcription was not substantially affected (38, 39).
The dimeric pairs of several bHLH proteins form a parallel left-handed, four-helix bundle (12, 14) in which the two basic regions enter the major groove of the DNA sequence. However, the subunits within a specified homodimer may not interact in a symmetric manner. For example, the subunits of the E47 homodimer, which recognizes the nonpalindromic CACCAG motif, make nonequivalent contacts to each recognition half-site (13). Similarly, we suggest that each subunit within the AHR·ARNT heterodimer may not contribute equal contacts to the TNGCGTG binding site. As demonstrated in this study, the amino acid compositions of the DNA binding regions of the AHR and ARNT are distinct; DNA binding of the AHR appears to involve a composite of amino acid/DNA interactions, whereas DNA binding of ARNT relies strongly on DNA interactions with single amino acids (i.e. glutamic acid 83 and arginines 86 and 87). In addition, previous nucleotide substitution analysis has indicated that the interaction of the AHR·ARNT complex with the TNGCGTG sequence will tolerate substitutions within the TNGC half-site of the AHR, but not within the GTG half-site of ARNT (25, 43, 44). This suggests that the AHR may represent an as yet uncharacterized DNA-binding form of bHLH proteins.
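To make the half-site terminology concrete, the following minimal Python sketch locates TNGCGTG sequences in a DNA string and splits each match into the TNGC half-site assigned to the AHR and the GTG half-site assigned to ARNT. The function name and the example sequence are hypothetical; the sketch scans only the strand supplied and says nothing about binding affinity.

```python
import re

# T, any nucleotide, then GCGTG. The lookahead lets overlapping
# occurrences be reported as well.
TNGCGTG = re.compile(r"(?=(T[ACGT]GCGTG))")


def find_recognition_sites(sequence: str):
    """Return (start, ahr_half_site, arnt_half_site) for each TNGCGTG match.

    start is the 0-based position of the T; the first four nucleotides
    form the TNGC half-site and the last three the GTG half-site.
    """
    seq = sequence.upper()
    return [(m.start(), m.group(1)[:4], m.group(1)[4:])
            for m in TNGCGTG.finditer(seq)]


# Illustrative sequence only; it is not taken from the P4501A1 upstream region.
for start, ahr_half, arnt_half in find_recognition_sites("aaTTGCGTGcc"):
    print(start, ahr_half, arnt_half)
```

Searching the reverse complement as well would be needed to find sites on the opposite strand; that step is omitted here for brevity.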
Conclusions-This study contributes important information on the manner in which the AHR and ARNT interact with the DRE. We have demonstrated that the amino acids involved in DNA binding of the AHR and ARNT, glutamate, arginine, and histidine residues, are similar to those involved in amino acid/nucleotide interactions of other bHLH proteins. However, the fact that DNA binding of the AHR, but not ARNT, relies on the presence of several proline residues within the basic region suggests that formation of an α-helical basic region may not be involved in the interaction of the AHR with its TNGC half-site. In addition, the primary DNA contacts of the AHR are defined by amino acid residues 34-39. Finally, we suggest that, while the tertiary structure of the ARNT·GTG interaction may resemble that of USF and Max, that of the AHR·TNGC represents a distinct DNA binding structure.