Proteolytic Dissection of Zab, the Z-DNA-binding Domain of Human ADAR1*

Zα is a peptide motif that binds to Z-DNA with high affinity. This motif binds to alternating dC-dG sequences stabilized in the Z-conformation by means of bromination or supercoiling, but not to B-DNA. Zα is part of the N-terminal region of double-stranded RNA adenosine deaminase (ADAR1), a candidate enzyme for nuclear pre-mRNA editing in mammals. Zα is conserved in ADAR1 from many species; in each case, there is a second similar motif, Zβ, separated from Zα by a more divergent linker. To investigate the structure-function relationship of Zα, its domain structure was studied by limited proteolysis. Proteolytic profiles indicated that Zα is part of a domain, Zab, of 229 amino acids (residues 133–361 in human ADAR1). This domain contains both Zα and Zβ as well as a tandem repeat of a 49-amino acid linker module. Prolonged proteolysis revealed a minimal core domain of 77 amino acids (positions 133–209), containing only Zα, which is sufficient to bind left-handed Z-DNA; however, the substrate binding is strikingly different from that of Zab. The second motif, Zβ, retains its structural integrity only in the context of Zab and does not bind Z-DNA as a separate entity. These results suggest that Zα and Zβ act as a single bipartite domain. In the presence of substrate DNA, Zab becomes more resistant to proteases, suggesting that it adopts a more rigid structure when bound to its substrate, possibly with conformational changes in parts of the protein.

Many protein domains that recognize DNA in both sequence- and conformation-specific manners have been characterized (for a review, see Ref. 1). These studies have resulted in an understanding of the variety of ways in which protein-DNA interactions can result in function. Identification of a peptide motif, Zα, which binds specifically to Z-DNA, opens up a new vista and invites the investigation of the similarities and differences between domains that bind right- and left-handed DNAs. The conformation specificity of Zα binding has been characterized in many ways. Peptides including this motif bind to alternating dC-dG that has been stabilized in the Z-conformation using bromination or supercoiling, as shown by band shift assays, competition experiments, and BIAcore measurements (2). When the motif is linked to the nuclease domain from FokI, the resulting chimeric nuclease cuts supercoiled plasmid DNA to bracket a d(C-G)13 in the Z-conformation (3). The protein also binds to short oligonucleotides of suitable sequence and converts them from the B- to the Z-conformation, as detected by CD and Raman spectroscopy (4,5). The binding of Z-DNA by Zα occurs even in the presence of a 10⁵-fold excess of B-DNA (6). Zα binds poly(dC-dG), stabilized in the Z-conformation by bromination, with an equilibrium dissociation constant (Kd) in the lower nanomolar range, as shown by BIAcore measurements (2).
Although many properties of Zα have been studied, its biological function in the context of ADAR1 remains unknown. The Z-DNA binding activity of Zα was first identified in proteolytic fragments of double-stranded RNA adenosine deaminase (ADAR1) (6) and then in the full-length enzyme (7). Zα has been shown to be a conserved feature of human, rat, bovine, chicken, and Xenopus ADAR1 (2). A second related motif, Zβ, has been identified in all the ADAR1 enzymes whose sequences are known. These two motifs are separated by a linker region of conserved size; an exception is the human enzyme, in which the linker is twice as long and consists of two nearly identical copies of a module (8). The presence of a conserved N-terminal region containing these motifs distinguishes ADAR1 from other members of the ADAR family (9,10), and the N terminus has been shown to be differentially expressed (8). Therefore, we conclude that this region is of importance for the biological function of ADAR1.
The ADAR family of enzymes converts adenosine to inosine within double-stranded regions of RNA (11). In mRNA, inosine is read as guanosine by the translation apparatus, resulting in codon changes within the synthesized protein. A-to-I editing has been shown to occur in vivo in a number of mRNAs from higher animals (12-18). The best characterized of these, the editing of pre-mRNAs for subunits of the glutamate-gated cation channels in the brain, results in channels with dramatically altered functional properties (19). Double-stranded RNA structures required for ADAR activity are formed by base pairing of an exonic sequence around the editing site with a complementary sequence in the downstream intron; therefore, editing must take place in the nucleus before splicing removes the respective intron(s). It has been proposed that Zα serves to target ADAR1 to its preferred substrates by binding to Z-DNA formed close to actively transcribing genes (20).
To better understand the role of Zα, we have characterized the N-terminal region of ADAR1 functionally and structurally. Using human ADAR1 as a model, the classical approach of limited proteolysis was employed to define the boundaries of this domain. Both motifs, Zα and Zβ, together are shown to form a single functional domain, Zab; Zab is stable and protected from proteolysis. Za, containing Zα, but not Zβ, can be regarded as a stable subdomain; this subdomain contributes the main binding activity. There is no equivalent subdomain containing Zβ: this region is poorly structured and unstable when isolated. The intervening linker region is unexpectedly well structured. In humans, the second copy of the linker module appears to have a structure similar to the first. Removing one copy of the linker modules reduces the DNA binding affinity, indicating the importance of the distance between the Zα and Zβ motifs. Za binds Z-DNA in a conformationally specific, but not sequence-specific manner. The binding is modified by the presence of the entire Zab domain to confer preference for d(C-G)n over d(C-A)n·d(T-G)n.

EXPERIMENTAL PROCEDURES
ADAR1: DNA Constructs and Protein Purification-Different portions of the cloned cDNA coding for human ADAR1 (GenBank™ accession number U10439) were polymerase chain reaction-amplified and inserted into the expression vector pET28a (Novagen), resulting in N-terminal His6-tagged fusion proteins. In detail, Za131 (residues 96-226), Za77 (133-209), and Zab236 (133-368) were amplified using complementary primers flanked with restriction sites at their termini. Polymerase chain reaction products were analyzed on an agarose gel; bands of the correct size were extracted and subcloned into the NdeI-HindIII sites (Za131 and Za77) or the NheI-HindIII sites (Zab236) of the multiple cloning site of pET28a, resulting in the vectors pZa131, pZa77, and pZab236, respectively. Another construct, ZabΔl, missing one of the two 49-amino acid linker modules separating the Zα and Zβ motifs, was created from pZab236 as follows. The 1.1-kilobase SphI-HindIII restriction fragment was digested with the restriction enzyme DrdI, resulting in cleavage at two sites at identical locations, at nucleotides 789 and 936 (numbers according to GenBank™ accession number U10439). The resulting DNA fragments were deproteinized and precipitated (21). After incubation with T4 DNA ligase (25°C, 4 h), the reaction mixture was analyzed on an agarose gel. The 930-base pair ligation product was isolated and subcloned in the 5-kilobase SphI-HindIII restriction fragment of pET28a, resulting in the vector pZabΔl. To ensure that the plasmids were correct, they were analyzed by restriction digestion, and the coding regions were sequenced using Sequenase Version 2.0 (U. S. Biochemical Corp.) according to the manufacturer's instructions. The proteins were overproduced in Escherichia coli strain Novablue(DE3) (Novagen). Bacteria were grown at 37°C in Luria-Bertani medium and induced with 1 mM isopropyl-β-D-thiogalactopyranoside at 0.7-0.9 A600 nm units. Cells were harvested after a further 3 h of growth at 37°C.
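The construct names above appear to encode the number of residues each one spans (for example, Za77 for residues 133-209). A minimal sketch of that inclusive-range arithmetic follows; the function name `span_length` is purely illustrative and not part of the original methods.

```python
# Residue ranges for the constructs named in the text (human ADAR1 numbering).
CONSTRUCTS = {
    "Za131": (96, 226),
    "Za77": (133, 209),
    "Zab236": (133, 368),
}


def span_length(first: int, last: int) -> int:
    """Number of residues in the inclusive range first..last."""
    return last - first + 1


if __name__ == "__main__":
    for name, (first, last) in CONSTRUCTS.items():
        # Each printed length should match the number embedded in the name.
        print(f"{name}: residues {first}-{last} = {span_length(first, last)} aa")
```

The same arithmetic gives 229 residues for the proteolytically defined Zab domain (133-361) quoted in the abstract.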
All subsequent steps were done at 4°C. The proteins were purified essentially to homogeneity under nondenaturing conditions as follows. A cell pellet obtained from a 1-liter culture was resuspended in 15 ml of buffer A (50 mM Tris-HCl (pH 8.0), 300 mM NaCl, 10 mM imidazole, 5 mM β-mercaptoethanol, 20 μg/ml RNase A, and 100 μM phenylmethylsulfonyl fluoride), and the cells were lysed using a French press. The lysate was then centrifuged for 30 min at 25,000 × g, and the clear supernatant was separated and incubated with 2 ml of Ni²⁺-nitrilotriacetic acid metal affinity resin (QIAGEN Inc.) for 1 h. The resin was washed three times with 20 ml of buffer A in a batch and then washed with 40 ml of buffer B (50 mM Tris-HCl (pH 8.0), 1 M NaCl, 10 mM imidazole, and 5 mM β-mercaptoethanol) in a column. Overproduced His6-tagged fusion protein was eluted with an imidazole step gradient in buffer C (50 mM Tris-HCl (pH 8.0), 300 mM NaCl, and 5 mM β-mercaptoethanol). Steps were 30, 50, and 200 mM imidazole, respectively. Fractions were analyzed by denaturing SDS-polyacrylamide gel electrophoresis (PAGE)¹ on 15 or 18% gels. Fractions containing protein were pooled and dialyzed against buffer D (20 mM Tris-HCl (pH 8.0), 150 mM NaCl, and 2 mM dithiothreitol (DTT)). After 1 h of dialysis, 15 units of thrombin (Calbiochem) were added to cleave the N-terminal His6 tag. After 12 h, the cleaved protein was dialyzed against buffer E (20 mM HEPES (pH 7.5), 20 mM NaCl, and 2 mM DTT) and finally purified by cation-exchange chromatography on a Mono S HR5/5 column (Amersham Pharmacia Biotech). Proteins were eluted with a 30-ml linear gradient of NaCl (0.05-0.3 M) in 20 mM HEPES (pH 7.5) and 1 mM DTT at a flow rate of 0.7 ml/min, resulting in sharp peak profiles. Za77 eluted at 220 mM NaCl, ZabΔl at 200 mM, and Zab236 at 180 mM.
The yield of electrophoretically homogeneous protein was determined using extinction coefficients of 14,000 M⁻¹ cm⁻¹ (Za77 and Za131), 22,400 M⁻¹ cm⁻¹ (ZabΔl), and 28,020 M⁻¹ cm⁻¹ (Zab236) at the absorbance maximum at 278 nm (calculated as described in Ref. 22). 8-12 mg of protein were obtained per liter of bacterial culture.
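The extinction coefficients quoted above convert an absorbance reading into a molar concentration through the Beer-Lambert law, A = ε·c·l. The sketch below assumes a 1-cm cuvette by default; the absorbance value in the usage lines is hypothetical, and the helper name `molar_concentration` is illustrative only.

```python
# Extinction coefficients at 278 nm quoted in the text, in M^-1 cm^-1.
EXTINCTION_278NM = {
    "Za77": 14_000,
    "Za131": 14_000,
    "ZabΔl": 22_400,
    "Zab236": 28_020,
}


def molar_concentration(absorbance: float, epsilon: float, path_cm: float = 1.0) -> float:
    """Beer-Lambert law solved for concentration (mol/l): c = A / (epsilon * l)."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance / (epsilon * path_cm)


if __name__ == "__main__":
    # Hypothetical reading: A278 = 0.28 for a Za77 sample in a 1-cm cuvette.
    conc = molar_concentration(0.28, EXTINCTION_278NM["Za77"])
    print(f"Za77: {conc * 1e6:.1f} uM")
```

Converting to mg/ml would additionally require the molecular mass of each construct, which is not given in this section.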
Limited Proteolysis-Protease digestion was performed by treating 50 μg of protein (0.5 μg/μl) with trypsin, chymotrypsin, thermolysin, or Staphylococcus aureus endoproteinase Glu-C in 50 mM Tris-HCl (pH 8.0), 150 mM NaCl, and 2 mM DTT at a protein/protease mass ratio in the range of 50:1 to 1000:1 for various times at 24°C. Reactions were stopped by heat denaturation at 100°C for 5 min. To examine the effect of various DNA conformers on Zab digestion, the reaction was performed in 10 mM HEPES (pH 7.5), 20 mM NaCl, 5 mM DTT, and 10 mM MgCl2. DNA was used in a base pair/protein molar ratio of 5:1. Poly[d(5-MeC-G)] was used as substrate DNA, and poly[d(A-G)]·poly[d(C-T)] as unspecific DNA. The digests were separated by SDS-PAGE on 18% gels, followed by staining with Coomassie Brilliant Blue G-250. In the case of protein digested for the experiment shown in Fig. 6 (lanes 10-13), the reactions were stopped by adding phenylmethylsulfonyl fluoride (1 mM) instead of heat inactivation to ensure nondenatured protein.
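As a worked example of the mass ratios given above, the following sketch computes how much protease corresponds to a 50-μg protein sample at either end of the stated 50:1 to 1000:1 range. The function name is illustrative and not taken from the original protocol.

```python
def protease_mass_ug(protein_ug: float, protein_per_protease: float) -> float:
    """Protease mass (ug) for a given protein/protease mass ratio (e.g. 50 for 50:1)."""
    if protein_per_protease <= 0:
        raise ValueError("ratio must be positive")
    return protein_ug / protein_per_protease


if __name__ == "__main__":
    for ratio in (50, 250, 1000):
        # 50 ug of substrate protein, as in the digestion conditions described.
        print(f"{ratio}:1 -> {protease_mass_ug(50.0, ratio):.2f} ug protease")
```

At 50:1 this is 1 μg of protease per reaction; at 1000:1 it is 0.05 μg.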
Protein Sequence Analysis-The proteolytic fragments were analyzed by mass spectrometry on a Voyager DE Workstation (PerSeptive) using matrix-assisted laser desorption ionization-time of flight technology. As a matrix, sinapinic acid (10 μg/μl) in acetonitrile/H2O/trifluoroacetic acid (70:29.9:0.1) was used. Alternatively, for fragments smaller than 10 kDa, the matrix was prepared with α-cyanocinnamic acid (10 μg/μl) instead of sinapinic acid. Various fragments were further analyzed by amino-terminal sequencing on an Applied Biosystems Model 475/477A protein sequencer.
DNA Binding Assay-DNA binding was assayed by native PAGE (23). The assay was carried out using d(5-BrC-G)20 as the substrate, which is stable in the left-handed Z-DNA conformation under the applied conditions (24). The substrate was end-labeled with ³²P and purified by native PAGE prior to the experiment. A reaction mixture of 10 μl containing the ADAR1 fragment (4-500 nM) with <1 pM substrate in 10 mM Tris-HCl (pH 7.8), 20 mM NaCl, 5 mM DTT, 5% glycerol, 100 μg/ml bovine serum albumin, and 50 μg/ml poly[d(A-G)]·poly[d(C-T)] (Amersham Pharmacia Biotech) as an unspecific competitor was incubated for 30 min at 24°C. The mixture was analyzed on a 6% native polyacrylamide gel using 0.5× Tris borate (22.5 mM) as the running buffer. After electrophoresis (10 V/cm, 90 min), the gel was dried and autoradiographed at -70°C on Kodak X-Omat Blue film with intensifying screens.
CD Measurements-CD spectra were recorded at 24°C on an Aviv Model 62DS spectrometer. Conformational change in DNA oligomers was monitored between 235 and 305 nm. DNA samples used were annealed prior to the experiment. For this purpose, a concentrated solution of the self-complementary sequence d(C-G)6 or an equimolar amount of d(C-A)7 and d(T-G)7 was heated to 85°C for 10 min and then slowly cooled to <20°C over 1 h. Measurements were carried out in 10 mM sodium phosphate (pH 7.0), 10 mM NaF, 1 mM EDTA, and 2 mM DTT using a DNA concentration of 30.0 μM base pairs and an optical path length of 5 mm. Spectra were recorded in 10-nm steps and averaged over 4 s. Protein was added to the sample from a concentrated stock solution, in aliquots never exceeding 5% of the total volume, and the mixture was equilibrated for 5 min before each measurement. The spectra were corrected for buffer base line and smoothed using software provided by Aviv. Protein spectra were recorded between 190 and 250 nm. Za77 was measured at a concentration of 10.0 μM, and ZabΔl and Zab were measured at 5.0 μM and an optical path length of 1 mm. Spectra were measured in 1-nm steps and averaged over 10 s.
For comparisons of the spectra of Zab between 190 and 250 nm in the presence and absence of substrate, poly[d(5-MeC-G)] was used as substrate. A 2:1 base pair/protein molar ratio was used.

RESULTS
Defining the Boundaries of the Minimal Z-DNA-binding Domain of Human ADAR1-Protein domains are usually well structured regions of 50-200 amino acids (25,26). Larger proteins are built from multiple, mostly independently folded domains. The regions connecting those domains are often flexible and solvent-exposed. Limited proteolysis is a classical approach to define domain organization (27-30). It takes advantage of the fact that site-specific proteases will cleave proteins preferentially in solvent-exposed unstructured regions, rather than within a folded domain.
Limited proteolysis was used to define a structured core containing Zα, the Z-DNA-binding motif present in the N-terminal region of ADAR1. Zα has been defined to comprise residues 121-197 of human ADAR1 using functional assays (2). However, a variety of results from nondenaturing electrophoresis, chromatographic elution, and NMR studies have suggested that the recombinantly produced peptide is not stably folded.²,³ An N- and C-terminally extended portion of the ADAR1 N terminus comprising Gly96-Ser226 was overproduced as a His6-tagged fusion protein in E. coli, and its digestion with four different proteases (endoproteinase Glu-C, chymotrypsin, thermolysin, and trypsin) was analyzed. Each of these enzymes has a different sequence specificity; therefore, using them in concert results in complementary information. The use of this combination of proteases results in an even distribution of potential cleavage sites throughout the studied protein, with gaps no longer than 4 residues between adjacent sites. A time course of cleavage with endoproteinase Glu-C is shown in Fig. 1A. An 11-kDa band appeared rapidly and increased in intensity over the observed time; the full-length protein band gradually disappeared over the same period. The intensity of the 11-kDa band was comparable to that of the full-length band, indicating a stoichiometric conversion to a stable product. The cleavage site was mapped to a preferential endoproteinase Glu-C site, C-terminal to Asp132, using N-terminal sequencing. Similar results were obtained using trypsin and chymotrypsin to cleave this protein (data not shown).
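The statement about evenly distributed potential cleavage sites can be checked in silico for any sequence. The sketch below is only an illustration under simplified, textbook-style specificity rules that are assumptions here (trypsin after Lys/Arg, chymotrypsin after Phe/Tyr/Trp/Leu, endoproteinase Glu-C after Glu/Asp, none of these before Pro; thermolysin before Leu/Ile/Val/Phe/Met/Ala). The toy sequence is hypothetical and is not the ADAR1 sequence.

```python
# Simplified specificity rules (assumptions, not taken from the paper).
C_TERMINAL_RULES = {
    "trypsin": set("KR"),
    "chymotrypsin": set("FYWL"),
    "Glu-C": set("ED"),
}
# Thermolysin is modeled as cutting the bond that precedes these residues.
THERMOLYSIN_P1_PRIME = set("LIVFMA")


def potential_sites(seq: str, first_residue: int = 1) -> dict:
    """Map protease name -> residue numbers after which the chain may be cut."""
    sites = {name: [] for name in C_TERMINAL_RULES}
    sites["thermolysin"] = []
    for i in range(len(seq) - 1):
        here, nxt = seq[i], seq[i + 1]
        number = first_residue + i
        for name, residues in C_TERMINAL_RULES.items():
            if here in residues and nxt != "P":
                sites[name].append(number)
        if nxt in THERMOLYSIN_P1_PRIME:
            sites["thermolysin"].append(number)
    return sites


def longest_uncut_stretch(seq: str, first_residue: int = 1) -> int:
    """Length of the longest fragment left if every potential site were cut."""
    cuts = sorted({n for nums in potential_sites(seq, first_residue).values() for n in nums})
    bounds = [first_residue - 1] + cuts + [first_residue + len(seq) - 1]
    return max(b - a for a, b in zip(bounds, bounds[1:]))


if __name__ == "__main__":
    toy = "GAKLDPEW"  # hypothetical peptide for illustration only
    print(potential_sites(toy))
    print(longest_uncut_stretch(toy))
```

Real protease specificities depend on neighboring residues and reaction conditions, so a map like this only bounds where cleavage could occur; the experiment reports where it actually does.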
To ensure that a minimum domain had been identified, the protein was cleaved sequentially with two different proteases. Fig. 1B shows the digestion with endoproteinase Glu-C followed by chymotrypsin. Before addition of the second protease, only the 11-kDa band was detectable. Chymotrypsin further truncated the fragment, producing the stable product V8/Ch-8. The N and C termini of these fragments were identified unambiguously using matrix-assisted laser desorption ionization-time of flight mass spectrometry. The V8-11 fragment was shown to contain residues 133-226. Chymotrypsin cut after Trp204; V8/Ch-8 consists of residues 133-204. A similar digestion, carried out with endoproteinase Glu-C and thermolysin, produced a stable product extending from amino acids 133 to 209 (data not shown). Other combinations of enzymes produced consistent results in all cases. From this, we conclude that there is a core domain containing Zα. Trp204 is a potential target for cleavage by both chymotrypsin and thermolysin. Chymotrypsin cut well, but thermolysin cut only marginally at Trp204. Therefore, we define the core domain as comprising Leu133-Gly209. This core was in no case significantly degraded, whereas the regions on either end were rapidly degraded to pieces too small to detect.
These results were used to design a stable construct, Za, comprising Leu133-Gly209. This protein was purified from E. coli undegraded under nondenaturing conditions. Za showed superior chromatographic behavior over previous Zα constructs, purifying from a Mono S cation-exchange column homogeneously as a sharp peak; this indicated structural uniformity. Samples yielded a single band when analyzed by native PAGE.⁴ When challenged with exogenous proteases, only Za showed striking stability; other Zα constructs were rapidly degraded (data not shown).
The Two Motifs, Zα and Zβ, Form a Single Structural Entity-Both Zα and Zβ are present in every species in which the sequence of ADAR1 is known. The motifs are separated by one or two copies of a module that is 43-49 amino acids in length, weakly conserved in sequence, and consistently lacking in positively charged residues. Twelve residues from this module are an essential part of Za, the stable Zα core domain. It seemed possible that Zα, Zβ, and the linker module(s) together form a single structural and functional unit. To investigate this possibility, we examined the structural organization of a peptide spanning both DNA-binding motifs. This peptide, termed Zab, comprising Leu133 (the previously defined N terminus of Za) to Asn368 (C-terminal to Zβ, from human ADAR1), was soluble when overproduced in E. coli, and full-length protein could be obtained with high yield. These results indicate proper folding with no significant instability. Improper folding often leads to the formation of inclusion bodies inside the overproducing bacterial cell (31). Highly flexible proteins are frequently degraded if expressed in a foreign host (32).
The results of the digestion of Zab with four different proteases are shown in Fig. 2. Each enzyme cleaved in a characteristic pattern and produced a small number of very stable bands. Time points were selected to allow the identification of all stable products, using mass spectrometry and N-terminal sequencing where appropriate; minor products were identified wherever possible. In each case, well resolved spectra were recorded. Table I lists the peptides produced by each enzyme, as determined from the molecular mass. For chymotrypsin, trypsin, and endoproteinase Glu-C, the assignments are unambiguous and in good agreement with SDS-PAGE analysis. Minor exceptions are fragments Tr8 and Ch5, which were detected only by mass spectrometry, as discussed below. In the case of thermolysin, it was not possible to unambiguously assign the multiple transitory fragments; however, the major fragments seen after 60 min of digestion could be identified. A schematic diagram of the major transitory products and the stable proteolytic fragments is shown in Fig. 3.
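Assigning a fragment from its molecular mass, as done for Table I, amounts to enumerating the sub-peptides bounded by the chain termini or candidate cleavage sites and keeping those whose calculated mass matches the observed one. The sketch below illustrates that idea with approximate average residue masses; the toy sequence, cleavage sites, tolerance, and function names are all hypothetical, and the comparison uses neutral masses without modeling the adducts seen in real spectra.

```python
# Approximate average residue masses in Da (peptide-bonded residues).
AVERAGE_RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02  # one water per free peptide (N-terminal H plus C-terminal OH)


def peptide_mass(seq: str) -> float:
    """Average neutral mass of a linear peptide."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in seq) + WATER


def assign_fragments(seq, first_residue, cut_after, observed_mass, tolerance=2.0):
    """Return (start, end) residue ranges whose calculated mass matches.

    cut_after lists residue numbers after which a protease may cut; fragments
    may also start or end at the termini of the parent chain.
    """
    cuts = sorted(cut_after)
    last_residue = first_residue + len(seq) - 1
    starts = [first_residue] + [n + 1 for n in cuts]
    ends = cuts + [last_residue]
    hits = []
    for start in starts:
        for end in ends:
            if end < start:
                continue
            fragment = seq[start - first_residue:end - first_residue + 1]
            if abs(peptide_mass(fragment) - observed_mass) <= tolerance:
                hits.append((start, end))
    return hits


if __name__ == "__main__":
    toy = "GAKLDPEW"  # hypothetical parent peptide, numbered from residue 10
    target = peptide_mass("LDPE")
    print(assign_fragments(toy, 10, [12, 16], target))
```

When several candidate ranges fall within the tolerance, mass alone cannot decide between them, which is the situation described for the transitory thermolysin fragments.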
Endoproteinase Glu-C cleaved Zab rapidly at a single site, Glu361 at the extreme C terminus (Fig. 2A). The resulting peptide was very stable to further proteolysis, despite an abundance of potential cleavage sites, including Asp132, which is exquisitely sensitive in the shorter construct used to define Za, as described above. After a long incubation with large amounts of enzyme, additional cleavage occurred at Glu239, Glu301, and Leu307. Glu239 lies within the first 49-amino acid repeat; remarkably, the equivalent site in the second repeat, Glu288, was uncut.
Chymotrypsin [...] tandem repeat. The three generated fragments were stable (Fig. 2B). The 5-kDa fragment was not visible on SDS-polyacrylamide gel, although it generated a signal in mass spectrometry of comparable intensity to Ch12 and Ch8. Coomassie Blue staining depends largely on positive charges present in the peptide (33). The 49-amino acid repeat contains only 1 positively charged residue. Therefore, we speculated that although Ch5 was resolved on the gel, it was not stained. Two other transitory fragments, Ch18 and Ch*, were separated on the gel. Ch18 could be assigned to be the product of a single cutting site. Ch* could not be unambiguously determined by mass spectrometry.
FIG. 1. Limited proteolysis reveals a stable Zα core domain. A, a Zα-containing peptide, comprising Gly96-Ser226 of human ADAR1, was digested with endoproteinase Glu-C (V8) at a protease/protein mass ratio of 1:250 at room temperature. Reactions were stopped by heat denaturation after the indicated incubation times. Samples were resolved by SDS-PAGE (18% gels) and visualized by staining with Coomassie Brilliant Blue. The V8-11 fragment resulted from a single cleavage and comprises residues 133-226. B, the same construct was digested consecutively with two site-specific proteases. After preincubation with endoproteinase Glu-C (1:100 protease/protein) for 1 h, chymotrypsin (1:250 protease/protein) was added, and the samples were analyzed after the indicated reaction times. V8/Ch-8 is a protease-insensitive core fragment containing Zα, which spans residues 133-204 of human ADAR1. Lane M, molecular mass markers.
Thermolysin produced similar stable products (Fig. 2C). Again, symmetrical sites in the repeated linker, Gly209 and Gly258, were cut, resulting in two stable products on an SDS-polyacrylamide gel. Because thermolysin has low sequence specificity, many transitory products were seen, especially at early time points. Because of these products, it was not possible to unambiguously identify the Tl5 fragment from among several candidates seen by mass spectrometry.
Trypsin attacked the protein at two preferred sites, Arg232 in the first repeat and Lys302 near the N terminus of Zβ (Fig. 2D). (The site equivalent to Arg232 in the second repeat is Ser280, not a substrate for trypsin.) Two sites near the C terminus, Lys366 and Arg367, resulted in heterogeneity of the full-length protein and in the Tr15 fragment. Most of the expected products were stable; however, the C-terminal region peptide, starting at Ile303, was not detected by SDS-PAGE or mass spectrometry. A similar result was seen after extensive endoproteinase Glu-C digestion; again, the C-terminal fragment was not stable. It appears that Zβ, intrinsically more accessible than Zα, is stable only in the context of the larger domain.
One of the Two 49-Amino Acid Repeats Can Be Removed without Destabilizing the Domain-In all other ADAR1 genes sequenced, there is a single copy of a 43-49-amino acid linker module between Zα and Zβ. In human ADAR1, this module is repeated. To determine the effect of this repeat on the structure of Zab, a protein lacking one module was constructed. This protein, ZabΔl, was produced in high yields as a soluble protein in E. coli and could be purified to homogeneity. Protease mapping showed results similar to those for Zab (data not shown). Trypsin and endoproteinase Glu-C cleaved at identical residues. Chymotrypsin and thermolysin had only a single site each. Therefore, the overall structure of the domain was not altered by the presence of the repeated module.
Zab Consists of Two Ends with Regular Secondary Structure, Connected by an Unconventionally Folded Linker-Circular dichroic measurements in the region between 190 and 250 nm are a useful tool to assess the secondary structure of a protein. This method was used to analyze Za, Zab, and ZabΔl. The spectra are shown in Fig. 4, along with a difference spectrum between Zab and ZabΔl, which reflects the contribution of a single copy of the linker module. Results of the analysis of the curves using the program K2d (34) are shown in Table II. The Zα and Zβ motifs contain significant amounts of α-helix and β-sheet structures. In contrast, to a large extent, the linker adopts an alternate structure. This is consistent with the secondary structure analysis of the primary sequence with computer programs, such as PHD (35). Those analyses predict that no significant areas in the linker are structured as α-helices or β-sheets. It must be emphasized that the proteolysis studies clearly indicate that the linker module is structurally well defined, although for the most part neither α-helical nor β-pleated.
Zab Is Protected from Proteolysis When Bound to Its Substrate-The presence of substrate can affect the protease sensitivity of a protein either because of a direct interference by the substrate molecule or by altering the conformation of the protein. To test whether this is the case for Zab, protease digestions were carried out in the presence of either B-DNA or Z-DNA. Although there were no dramatic changes in the digestion profiles, B-DNA stabilized Zab slightly against proteolysis, and Z-DNA had a very marked stabilizing effect. Chymotrypsin, thermolysin, and trypsin all cut at their established sites, but to an ~50-fold lower extent in the presence of Z-DNA (data not shown).
Binding to Z-DNA offered striking protection against digestion with endoproteinase Glu-C (Fig. 5). Although there was no protection of the C-terminal site (Glu361), the internal cleavage sites were strongly protected. Cleavage sites at residues 301 and 307 were completely protected in the presence of Z-DNA, resulting in the absence of the V20 and V19 bands. Cleavage at residue 239 was reduced, with an ~50-fold increase in the stability of the full-length Zab protein relative to the absence of DNA. In contrast, B-DNA protected Zab against cleavage only ~5-fold and did not alter the choice of sites. These results suggest that the entire domain becomes more rigid and less accessible in the presence of substrate. The protection of sites within Zβ from endoproteinase Glu-C cleavage may occur because these sites are involved in DNA interaction. On the other hand, conformational changes occurring in the protein as a consequence of binding to DNA could prevent endoproteinase Glu-C from cutting. It is of note that the nearby trypsin site, Lys302, was protected in the presence of Z-DNA, but not to the same extent (data not shown).
FIG. 4. [...] (----). A difference spectrum, Zab minus ZabΔl, is also shown (····). The corresponding percentages of secondary structure motifs were calculated using the program K2d (34) and are listed in Table II.
When Zab in the presence and absence of Z-DNA was compared, there was no change in the CD spectra between 190 and 250 nm (data not shown). This indicates that there is no major change in the secondary structure of the protein when substrate is bound.
The Intact Zab Domain Forms a Stable Complex with Z-DNA and Binds with Sequence Preference-The binding of Zα to Z-DNA has been previously characterized using electrophoretic mobility shift assays (2,6,7). This assay was used to compare the binding of Za with that of Zab. d(5-BrC-G)20 was used as a substrate; this oligonucleotide is stabilized in the Z-form by the presence of bromine in the 5-position of cytosine (24). Binding was tested in the presence of a 10⁴-fold excess of B-DNA. The results are shown in Fig. 6. Four different proteins were compared at four concentrations (500, 100, 20, and 4 nM protein). Zab bound well at 500 and 100 nM, producing a stable, high molecular mass protein-DNA complex (Fig. 6A, lanes 2-5). At lower concentrations, the complex appeared to break down during electrophoresis, resulting in a smear. This behavior suggests that the most stable complex is formed when the sites on the probe are saturated. ZabΔl showed a similar behavior, although the stable complex was formed only at the highest concentration (Fig. 6A, lanes 14-17). In contrast, Za produced two complex bands, which appeared smeared at all concentrations (Fig. 6A, lanes 6-9). Compared with Zab, Za had a slightly higher affinity for the substrate. The smearing is the result of the instability of the complex under electrophoretic conditions and the longer migration path of Za-DNA complexes as compared with Zab-DNA complexes.
These results may indicate that binding of the Zβ moiety of Zab is responsible for the difference in binding behavior between Za and Zab. To test this hypothesis, Zab was digested with chymotrypsin and then assayed in the band shift (Fig. 6A, lanes 10-13). Complete digestion, yielding the Zα and Zβ motifs as separate peptides, was confirmed by SDS-PAGE (Fig. 6B). This mixture showed a band shift pattern very similar to that for Za. No additional bands were observed; therefore, Zβ alone does not bind to the substrate. Since the molecular masses of the Zα- and Zβ-containing fragments differ substantially, it is extremely unlikely that any complex formed by Zβ and DNA would comigrate with the observed Zα-DNA complexes (23). That the isolated Zβ is not capable of binding Z-DNA under these conditions is remarkable considering the conservative substitution of functionally important residues (2).
FIG. 5. [...] 9-11). Digestion and analysis were performed as described for Fig. 1, except that the protein/protease ratio was 1:30. Lane M, molecular mass markers.
FIG. 6. Binding of Z-DNA by Zab and subdomains. A, electrophoretic mobility shift assays were performed with ³²P-labeled d(5-BrC-G)20 as the substrate, which is stable in the Z-conformation under the applied conditions (24). Zab (a, lanes 2-5), Za (b, lanes 6-9), Zab digested with chymotrypsin to separate the Zα and Zβ motifs (c, lanes 10-13), and ZabΔl (d, lanes 14-17) were each assayed in a 5-fold dilution series (500, 100, 20, and 4 nM).
As a second method of studying the binding of Zα to DNA, circular dichroism was used to monitor the transition of the DNA conformation from the B- to the Z-form (2,4,5). The spectrum of Z-DNA is inverted as compared with that of B-DNA in the near-UV region between 240 and 300 nm (36,37). Fig. 7 shows the spectra of two Z-DNA-forming oligomers of different sequence, d(C-G)6 and d(C-A)7·d(T-G)7, in the presence of either Za or Zab. The DNAs adopted the right-handed B-DNA conformation in the absence of protein. Protein was added in aliquots, resulting in protein/base pair molar ratios of 1:6, 1:4, 1:2, and 1:1.5. When Za was added, the spectra of both d(C-G)6 and d(C-A)7·d(T-G)7 became inverted, indicating the shift from the B- to the Z-DNA conformation, with saturation at a 1:2 ratio (Fig. 7, C and D). Further addition of protein did not change the spectrum significantly above 250 nm, where the contribution of the protein to the spectrum was negligible. Below 250 nm, the spectrum was dominated by the contribution of the protein. This result is in agreement with Herbert et al. (5), who showed that a Zα motif (amino acids 121-201) binds without sequence specificity to a variety of Z-DNA-forming sequences. Zab converted d(C-G)6 to the Z-form with a similar stoichiometry to Za (Fig. 7A). In contrast, there was only a limited effect on d(C-A)7·d(T-G)7 of the addition of Zab, even at a ratio of 1:1.5 (Fig. 7B).
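To make the titration ratios concrete, the sketch below converts each protein/base pair molar ratio into a protein concentration, assuming the DNA concentration of 30.0 μM base pairs stated under "CD Measurements" and ignoring the small dilution (at most 5% per aliquot) caused by adding protein. The function name is illustrative.

```python
DNA_BASE_PAIRS_UM = 30.0  # DNA concentration in uM base pairs, from the methods

# Protein/base pair molar ratios used in the titration (protein, base pairs).
RATIOS = [(1, 6), (1, 4), (1, 2), (1, 1.5)]


def protein_conc_um(dna_bp_um: float, protein: float, base_pairs: float) -> float:
    """Protein concentration (uM) giving the stated protein/base pair ratio."""
    if base_pairs <= 0:
        raise ValueError("base_pairs must be positive")
    return dna_bp_um * protein / base_pairs


if __name__ == "__main__":
    for protein, base_pairs in RATIOS:
        conc = protein_conc_um(DNA_BASE_PAIRS_UM, protein, base_pairs)
        print(f"1:{base_pairs} -> {conc:.1f} uM protein")
```

Under these assumptions the four titration points correspond to 5, 7.5, 15, and 20 μM protein, with saturation by Za reached at the 15 μM point.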
Za, or any peptide containing Zα alone, was able to bind to Z-DNA in a sequence-independent manner. However, when Zα was in the context of the entire domain, Zab, a sequence preference for d(C-G)n was observed. Band shift data suggest that the mode of binding differs between Za and Zab, probably reflecting a difference in the degree of cooperativity.

DISCUSSION

Although the exact biological role of ADAR1 has not been established, analysis of the amino acid sequence has allowed the assignment of functions to parts of the protein (38). These functions contribute to the known in vitro activity of this enzyme, the conversion of A to I in regions of double-stranded RNA. The central series of double-stranded RNA-binding motifs and the C-terminal catalytic domain are well characterized protein domains. The N-terminal region, on the other hand, contains a novel motif, Zα, with a binding specificity for Z-DNA. Analysis of the primary structure of the N-terminal region has led to a number of hypotheses about the structure and function of this region. In addition to Zα, a second similar sequence, Zβ, was identified (2). This similarity and the conservative substitution of functionally important residues raised the possibility that this, too, is a Z-DNA-binding motif. These two motifs are separated by a linker, which is conserved in length but not in sequence, suggesting that correct spacing between Zα and Zβ may be important. Prediction of the secondary structure and low resolution NMR analysis suggest that Zα may bind to Z-DNA using a unique application of the helix-turn-helix motif common to many B-DNA-binding proteins (2,39). Zα and Zβ might form separate domains or make up a single bipartite domain.
The latter hypothesis was supported when we delineated the structural domain Zab, containing both Zα and Zβ as well as two 49-amino acid linker modules. Limited proteolysis demonstrated that the domain extends from Leu133 to Glu361. A smaller core domain, Za, from Leu133 to Gly209, contains Zα and a portion of the linker. Zα peptides shortened further at the C terminus are functional in binding specifically to Z-DNA (2). However, they lack the structural uniformity seen only for the proteolytically defined domain Za. These earlier constructs also contain additional N-terminal residues (positions 121-132). These residues have been reported to modulate the results of band shift assays (5); however, CD experiments are unaffected by their presence. Although Za is stable to proteolysis, we conclude that Zab is the functional entity. Domain boundaries are frequently protease-hypersensitive, with cleavage sites for different proteases clustered in close proximity. This is observed both N-terminal to Leu133 and C-terminal to Glu361. In contrast, the cleavage sites in the linker region are not clustered, but rather specifically selected from a number of alternatives for each enzyme. Unlike Zα, which is contained in the stable domain Za, Zβ is not organized independently into a stable domain. Instead, peptides containing Zβ require almost all of one linker module to be stable. Finally, although secondary structure analysis of the primary sequence predicts that the linker modules are mostly neither α-helix nor β-sheet, a prediction confirmed by CD spectra, we have shown that these modules are well structured. They are not as susceptible to proteolytic cleavage as would be expected for unstructured regions. Most important in this respect is the result of cleavage by endoproteinase Glu-C, for which there are 16 potential cleavage sites in the linker region. In fact, only one of these is attacked, even at high protease concentrations. Other proteases provide similar results.
Both linker modules are cut at a single repeated site by chymotrypsin and thermolysin. Trypsin cuts the first copy of the linker module once; the equivalent site in the second copy is missing due to a nonconserved residue. In each case, there are a large number of unattacked potential cleavage sites. A single linker module is sufficient for proper folding; removal of one linker module to form ZabΔl does not appear to change the overall domain structure.
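The comparison between potential and observed cleavage sites rests on counting, for each protease, the positions in a sequence that match its specificity. A minimal sketch of such a count is given below. It assumes simplified textbook specificities (trypsin after Lys or Arg, chymotrypsin after Phe, Tyr, or Trp, neither before Pro, and endoproteinase Glu-C after Glu); real enzymes are less strict, and thermolysin is omitted. The function `potential_sites`, the `RULES` table, and the toy sequence are hypothetical and are not taken from ADAR1.

```python
import re

# Simplified specificities (an assumption made for illustration).
# Each pattern is a zero-width match at the scissile bond.
RULES = {
    "trypsin": r"(?<=[KR])(?!P)",       # after Lys/Arg, not before Pro
    "chymotrypsin": r"(?<=[FYW])(?!P)",  # after Phe/Tyr/Trp, not before Pro
    "glu-c": r"(?<=E)",                  # after Glu
}


def potential_sites(sequence, protease):
    """Return 1-based numbers of residues after which cleavage could occur.

    The C-terminal residue is skipped because no bond follows it.
    """
    pattern = re.compile(RULES[protease])
    return [m.start() for m in pattern.finditer(sequence)
            if m.start() < len(sequence)]


if __name__ == "__main__":
    toy = "MKEPLERAEFGW"  # hypothetical peptide, not an ADAR1 sequence
    for name in RULES:
        sites = potential_sites(toy, name)
        print(f"{name:>12}: {len(sites)} potential site(s) after residues {sites}")
```

Applied to a real sequence, the length of each returned list gives the number of potential sites against which the observed cuts can be compared.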
The overall structure of the Zab domain remains largely unchanged upon binding to substrate DNA. Within the accuracy of the measurements, CD spectra of Zab between 190 and 250 nm are identical in the presence or absence of Z-DNA. Although there are many potential cleavage sites for each protease throughout the protein, no new cuts are seen when the protein-DNA complex is compared with protein alone. This result makes it unlikely that major spatial reorientations take place within the protein. The most striking effect of binding to Z-DNA is a marked decrease in sensitivity to proteases. In particular, the endoproteinase Glu-C sites within Zβ become completely protected in the presence of substrate DNA. We conclude that the protein becomes less flexible when bound to its substrate. It is also possible that DNA contacts shield the protein surface from proteases.
The hypothesis that Zab is a single domain with a single DNA-binding site involving both Zα and Zβ is supported not only by proteolytic studies, but also by functional assays. In electrophoretic mobility shift experiments, Zab yields a distinct and stable product when bound to a left-handed DNA substrate. ZabΔl forms an equivalently stable complex, although with an approximately 5-fold reduced affinity. There is no evidence that the linker modules are directly involved in DNA binding, and their high negative charge makes this unlikely. From the significant differences observed between Zab and ZabΔl, it is clear that the relative orientation of Zα to Zβ is important for DNA binding. In contrast, Za produces a less stable product, although it appears to bind with a somewhat increased affinity. Chymotrypsin-digested Zab, in which the halves of the domain are separated, binds identically to Za; therefore, it is very unlikely that Zβ binds to DNA independently of Zα. In addition, if Zα and Zβ acted as physically linked independent motifs, Zab would be expected to have a higher affinity for Z-DNA than Za. Therefore, we conclude that Zα and Zβ must interact within the Zab domain to form a single binding site, involving both motifs. The results obtained by CD measurements strongly support this conclusion. Zab binds with sequence preference for alternating d(C-G)n, whereas Za does not discriminate between Z-forming sequences, but rather is conformation-specific. This is in agreement with previous studies on Zα-(121-201) (5). The preference of Zab for d(C-G)n needs further investigation. Identification of the optimal substrate in vitro may elucidate the role of Zab as part of ADAR1 in vivo and lead to the identification of the actual binding sites of ADAR1 on chromatin.