Dissecting APOBEC3G Substrate Specificity by Nucleoside Analog Interference*

The apolipoprotein B mRNA-editing enzyme, catalytic polypeptide-like (APOBEC) cytidine deaminase genes encode a set of enzymes including APOBEC1 (A1), APOBEC2 (A2), APOBEC4 (A4), and APOBEC3A-H (A3A-H). Although each possesses one or more zinc binding motifs conserved among enzymes catalyzing C → U conversion, the functions and substrate specificities of these gene products vary considerably. For example, although two closely related enzymes, A3F and A3G, both restrict HIV-1 infection in strains deficient in virus infectivity factor (vif), A3F selectively deaminates cytosine within 5′-TTCA-3′ motifs in single stranded DNA, whereas A3G targets 5′-CCCA-3′ sequences. In the present study we have used nucleoside analog interference mapping to probe A3G-DNA interactions throughout the enzyme-substrate complex as well as to determine which DNA structural features determine substrate specificity. Our results indicate that multiple components of nucleosides within the consensus sequence are important for substrate recognition by A3G (with base moieties being most critical), whereas deamination interference by analog substitution outside this region is minimal. Furthermore, exocyclic groups in pyrimidines 1–2 nucleotides 5′ of the target cytosine were shown to dictate substrate recognition by A3G, with chemical composition at ring positions 3 and 4 found to be more important than at ring position 5. Taken together, these results provide insights into how the enzyme selects A3G hotspot motifs for deamination as well as which approaches might be best suited for forming a stable, catalytically competent cross-linked A3G-DNA complex for future structural studies.

Sequencing of DNA derived from these viruses reveals numerous G → A mutations throughout the coding strand (4, 6, 11, 16-19). These transitions are not likely to be generated directly but occur as cytosine deamination products are copied during the course of reverse transcription. A3G selectively deaminates the 3′ C within 5′-CCCA-3′ "hot spots" in single-stranded DNA, which in vivo only become available after RNase H-mediated hydrolysis of the viral genome during and after minus-strand DNA synthesis (4, 6, 11, 16-19). Neither RNA nor double-stranded DNA is deaminated (20,21). Over the length of the genome, mutation frequency decreases in the 5′ → 3′ direction from maxima immediately 5′ to the two plus-strand primers (19). This pattern reflects (a) a direct relationship between the extent of deamination within a particular segment of minus-strand DNA and the time it remains single-stranded during the course of reverse transcription and (b) the inherent 3′ → 5′ directional bias of A3G-mediated deamination (11, 18-20, 22).
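The relationship between minus-strand C → U deamination and the G → A transitions read on the coding strand can be illustrated with a short sketch. This is only a toy model and not part of the analysis reported here: the 10-nt sequence and the function names are hypothetical, every 5′-CCCA-3′ occurrence is assumed to be deaminated, and uracil is assumed to template adenine during plus-strand synthesis.

```python
# Toy model: deaminate the 3' C of each 5'-CCCA-3' motif in minus-strand DNA,
# then read the result on the complementary (coding) strand.
# The sequence below is hypothetical and chosen only for illustration.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}  # U templates A


def reverse_complement(seq):
    """Return the complementary strand, written 5' -> 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def deaminate_hotspots(minus_strand, motif="CCCA"):
    """Convert the 3' C of every motif occurrence to U (motifs found on the unedited strand)."""
    bases = list(minus_strand)
    for i in range(len(minus_strand) - len(motif) + 1):
        if minus_strand[i:i + len(motif)] == motif:
            bases[i + 2] = "U"  # third base of CCCA is the 3' C of the CCC run
    return "".join(bases)


minus = "GATCCCAAGT"                       # hypothetical minus-strand segment
edited = deaminate_hotspots(minus)         # "GATCCUAAGT"
plus_before = reverse_complement(minus)    # "ACTTGGGATC"
plus_after = reverse_complement(edited)    # "ACTTAGGATC"
```

Comparing `plus_before` with `plus_after` shows a single G → A change on the coding strand at the position complementary to the deaminated cytosine.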
Understanding the basis for substrate recognition by A3G requires knowledge of enzyme structure as well as of the structural features of DNA important for enzyme recognition. Depending on salt conditions and substrate length, A3G is active in monomeric, dimeric, and larger oligomeric states (31-33). Expressed as an ~46-kDa polypeptide, each A3G monomer/subunit possesses two catalytic domains, CD1 and CD2, located at the N and C termini of the protein, respectively. The two domains are highly homologous, each containing a single (H/C)XEX23-28CXXC motif conserved among zinc-binding proteins (34,35). However, the functions of the two elements are quite different; CD1 contributes to substrate binding, A3G encapsidation, and perhaps dimerization, whereas enzyme activity and substrate specificity are principally attributable to CD2 (21,36,37). Although the structure of full-length A3G is not currently available, its overall folding and subunit organization may resemble that of the recently resolved A2 enzyme (38). In contrast to previously resolved deaminase structures (39-41), A2 crystallizes as a rod-shaped tetramer that can be viewed as two homodimers joined head-to-head. Antiparallel β-strands within the CD1 domains of adjacent subunits interact to form the dimer interface in an arrangement that renders the helices bearing the zinc coordination residues solvent-accessible. Low resolution small-angle scattering experiments indicate that human A3G also forms a rod-like structure (33), suggesting that the enzyme may contact DNA at multiple sites within or even throughout an extended nucleoprotein complex in which enzyme and nucleic acid are in parallel configurations. This notion is supported by biochemical data indicating that A3G does not effectively bind substrates shorter than 10 nt (20).
In addition, multiple binding sites for a single enzyme may contribute to the 3′ → 5′ processive deamination of ssDNA catalyzed by A3G on longer substrates (20). However, despite these data, the sites and nature of contacts between enzyme and DNA along the length of the complex remain unknown.
Recently, the structure of the CD2 domain of A3G has been solved both by NMR and x-ray crystallography (42,43), although the latter structure reveals features of the domain unresolved by the former and convincingly maps the substrate binding groove (43). The overall folding of the A3G domain closely resembles that of A2 (38), with a core β-sheet surrounded by six α-helices and a zinc-coordinating active center nestled near the bottom of the groove. However, active center (AC) loops 1 and 3, conserved loop regions that flank the groove, are longer and more removed from the active site in the A3G domain. The bottom of the groove is lined with hydrophobic and positively charged residues that likely stack or form hydrogen bonds with the ssDNA substrate. Indeed, mutational analysis reveals most of these residues to be critical to A3G function (43). However, because individual nucleoprotein contact sites were not specifically delineated in this enzyme structure, those DNA structural features that determine deamination specificity remain unclear. Furthermore, although mutational analysis of APOBEC enzymes demonstrates that substrate specificities are determined by amino acid residues within CD2, the effects of point mutations within this domain are subtle and difficult to explain (30).
We demonstrate here the application of nucleoside analog interference mapping to probe A3G-DNA interactions throughout the enzyme-substrate complex as well as to determine which DNA structural features direct substrate specificity. Previous exploration in this area has been limited primarily to conventional biochemical and molecular biological techniques in which only the four standard deoxynucleosides (A, C, T, and G) have been used to produce substrate variants. In contrast, the approach presented here permits thorough dissection of the chemical requirements of substrate recognition at the submolecular level. Our results indicate that multiple features of nucleosides within the 5′-CCCA-3′ consensus sequence are important for substrate recognition (with base moieties being most critical), whereas deamination interference by analog substitution outside this region is minimal. Furthermore, exocyclic groups in pyrimidines 1-2 nt 5′ to the target cytosine dictate substrate recognition, with chemical composition at ring positions 3 and 4 more important than the presence or absence of a methyl group at ring position 5. Taken together, these results provide insights into how the enzyme specifically selects A3G hotspot motifs for deamination as well as which approaches might be best suited for forming a stable, catalytically competent, cross-linked A3G-DNA complex for future structural studies.

EXPERIMENTAL PROCEDURES
Expression and Purification of A3G-Recombinant wild type and D316R/D317R A3G were constructed, expressed, and purified as previously described (20,43,44). Sf9 cells were infected with recombinant A3G virus at a multiplicity of infection of 1 and harvested after 72 h. Cells were lysed in the presence of 50 μg/ml RNase A. Cleared lysates were then incubated with glutathione-Sepharose resin (GE Healthcare) and subjected to a series of salt washes (0.25-1 M NaCl) before elution in buffer as previously described (31). A3G was treated with 0.02 units/μl thrombin (GE Healthcare) for 16 h at 21 °C to release the glutathione S-transferase tag, and subsequently the sample was diluted to 50 mM NaCl and loaded onto a DEAE FF column (GE Healthcare) equilibrated with 50 mM Tris-HCl, pH 8.9, 50 mM NaCl, 1 mM dithiothreitol, and 10% glycerol. A3G was eluted at ~200 mM NaCl by using a linear gradient from 75 to 1000 mM NaCl. Fractions were collected and stored at −70 °C. Both A3G and the mutant derivative are ~95% pure.
Cytosine Deamination Assay-50 nM wild type or 10 nM D316R/D317R A3G was incubated with 50 nM 5′ 32P-end-labeled DNA oligonucleotide in reaction buffer (50 mM Tris-Cl, pH 8.0, 5 mM EDTA, 5 mM dithiothreitol, and 200 μg/ml bovine serum albumin) for 30 min at 37 °C. Reactions were terminated by extraction with 25:24:1 phenol/chloroform/isoamyl alcohol, and products were recovered by ethanol precipitation of the aqueous phase. Reaction products resuspended in reaction buffer were then incubated with 2 units of uracil DNA glycosylase (Fermentas, Glen Burnie, MD) for 1 h at 37 °C to remove uracil generated by A3G-mediated cytosine deamination or, in the case of the control substrates, chemically incorporated uracil. Previously, DNAs containing nascent alkali-sensitive abasic sites were hydrolyzed by heating at 95 °C for 20 min in the presence of 0.2 M NaOH (20,21). However, several of the nucleoside analogs used in the present study were incompatible with this treatment and required milder hydrolysis conditions. Therefore, reaction products were incubated in 30% aqueous ammonia for 16 h at 37 °C, lyophilized to dryness, resuspended in a urea-based gel-loading buffer, and fractionated over a 15% denaturing polyacrylamide gel. Deamination/cleavage products were quantified by phosphorimaging using an FX model Molecular Imager and QuantityOne software (Bio-Rad). Deamination activity was measured as the fraction of total radioactivity present in a specific hydrolysis product, whereas relative deamination is defined as the ratio of deamination activity measured on a modified substrate divided by the deamination activity measured on the standard substrate in reactions catalyzed by a common enzyme.
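The two quantities defined at the end of this section can be written out as a minimal sketch. The function names and the band intensities below are hypothetical and serve only to show the arithmetic; they are not measurements from this study.

```python
# Sketch of the two definitions given above, using made-up band intensities.

def deamination_activity(product_signal, total_signal):
    """Fraction of total lane radioactivity found in one specific hydrolysis product."""
    if total_signal <= 0:
        raise ValueError("total lane signal must be positive")
    return product_signal / total_signal


def relative_deamination(modified_activity, standard_activity):
    """Activity on a modified substrate divided by activity on the standard substrate.

    Both activities must come from reactions catalyzed by the same enzyme.
    """
    if standard_activity <= 0:
        raise ValueError("standard substrate must show measurable deamination")
    return modified_activity / standard_activity


# Hypothetical phosphorimager counts (arbitrary units) for two lanes.
standard = deamination_activity(product_signal=4000, total_signal=16000)
modified = deamination_activity(product_signal=1000, total_signal=16000)
ratio = relative_deamination(modified, standard)
```

With these illustrative counts the modified substrate is deaminated at one quarter of the efficiency of the standard substrate.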

RESULTS
Experimental System-Oligonucleotides containing single or multiple nucleoside analog substitutions were used to probe substrate binding features and deamination specificity of A3G. Specifically, a series of synthetic 39-mer oligonucleotides containing one or more modifications to their base, deoxyribose, or phosphate moieties were synthesized and utilized in the biochemical assay of A3G activity described under "Experimental Procedures." Each oligonucleotide is a variant of the standard substrate, which contains a unique extended A3G recognition motif (5′-AAACCCAAA-3′; Fig. 1A) determined by Chelico et al. (20) to be the most receptive target of A3G-mediated deamination among the sequences tested. 5′ 32P-end-labeled standard and modified substrates are incubated with A3G, which site-specifically catalyzes C → U deamination. Reaction products are exposed to uracil DNA glycosylase (UDG), which exclusively hydrolyzes the glycosidic bond of uracil to create an alkali-labile abasic site that can be quantitatively hydrolyzed by overnight incubation in 30% ammonium hydroxide (producing strand scission at that position). Hence, by fractionation of the processed reaction products over a denaturing polyacrylamide gel, both the site and extent of A3G-mediated deamination in a given reaction can be quantitatively determined.
Two basic approaches were taken to explore substrate recognition by A3G. First, oligonucleotides containing single 2′-O-methyl (2′Om), methyl phosphotriester (PTE), or abasic substitutions from position −10 to +10 (defined as 10 nt 5′ or 3′ to the deamination target, respectively; Fig. 1A) were used to explore the effects of each of these modifications on enzyme activity as well as to probe the A3G binding site (Fig. 1B). Once the bases critical to A3G activity were determined (and on the basis of previous findings), pyrimidine analogs (Fig. 1C) were introduced into synthetic substrates at positions −2, −1, +1, +2, +3, and combinations thereof to dissect the base composition requirements of A3G-mediated deamination. Because of assay limitations, the effects of structural variations in the deamination target itself (i.e. the nucleoside at position 0) were not examined directly.
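The position numbering used throughout can be made concrete with a small sketch. It operates only on the 9-nt extended motif quoted in the text (the full 39-mer sequence is not reproduced here), and the helper names and the "ab" label for an abasic site are hypothetical conventions introduced for illustration.

```python
# Sketch of the position convention: the target C is position 0, nucleotides
# 5' of it are negative and nucleotides 3' of it are positive.

def position_labels(length, target_index):
    """Label each nucleotide by its offset from the deamination target."""
    return [i - target_index for i in range(length)]


def substitute(tokens, position, target_index, analog):
    """Return a copy of the substrate with the nucleoside at one position replaced."""
    if position == 0:
        raise ValueError("the target nucleoside itself was not varied")
    index = target_index + position
    if not 0 <= index < len(tokens):
        raise IndexError(f"position {position:+d} lies outside the substrate")
    modified = list(tokens)
    modified[index] = analog
    return modified


core = list("AAACCCAAA")  # extended recognition motif, written 5' -> 3'
TARGET = 5                # index of the 3' C of the CCC run (position 0)

labels = position_labels(len(core), TARGET)

# Single abasic scan across every available position except the target.
abasic_scan = {p: substitute(core, p, TARGET, "ab") for p in labels if p != 0}

# One pyrimidine analog placed immediately 5' of the target.
minus1_mC = substitute(core, -1, TARGET, "mC")
```

Representing the substrate as a list of tokens rather than a string lets multi-character analog labels such as "mC" occupy a single position.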
Note that in the first set of experiments, substrate recognition by A3G containing D316R and D317R substitutions was also examined. Based on homology with other cytosine deaminases, these mutations were introduced to increase positive charge at the A3G-DNA interface, thereby exploring the interplay among A3G surface charge, increased affinity for substrate (perhaps by virtue of additional interactions with the DNA phosphate backbone), and deamination specificity. Indeed, the recent A3G CD2 crystal structure places the two residues at the base of the substrate binding groove where they may play a role in positioning ssDNA for deamination (43).
Scanning 2′-O-Methyl Substitution-A3G deamination profiles derived from substrates containing 2′Om substitutions at positions −10 through +10 and their quantitation are shown in Fig. 2, A and B. 0 marks the migration of the 5′-labeled hydrolysis product resulting from deamination of the cytosine at position 0. To ensure that the 2′Om analog does not affect the assay in a manner unrelated to A3G activity, control substrates containing deoxyuridine (dU) at position 0 as well as 2′Om at position −2, −1, +1, or +2 were also evaluated (Fig. 2A, lanes marked dU). Substrate containing 2′Om at position 0 (lanes 0) was included in this experiment but only as a reference. Although reactions containing this substrate yield no product, this cannot be attributed to failure of A3G to deaminate the target cytosine, as 2′Om-U is not recognized by UDG.
The results indicate that the effects of 2′Om substitution are greatest when the analog is introduced in the immediate vicinity of the target cytosine, i.e. at positions −3 to +1, although some effect is also observed upon substitution at positions −5, −4, and +2 to +4 (decreasing with distance from the target cytosine). This indicates that only binding of the catalytic domain (most likely CD2) is affected by 2′Om substitution, in a manner consistent with the 3-nt (or bp) nucleic acid footprint typically observed for zinc finger binding motifs (for review, see Ref. 45). It is possible that on short oligonucleotides, the interaction of other regions of A3G (e.g. CD1, or the opposite CD2 in a dimeric form) with DNA is not required for activity. Alternatively, contacts between other regions of A3G and substrate do occur, but (1) they are not affected by introduction of a 2′Om group, or (2) binding outside the catalytic domain is not restricted to any particular region within the substrate and is, therefore, not susceptible to interference by analog substitution at any single position. Interestingly, although reduced relative to standard substrate, A3G activity on substrate containing a −1 2′Om substitution is greater than that observed with oligonucleotides containing substitutions at positions −3, −2, or +1 despite equal or greater proximity to the active site. This result is reproducible and suggests that direct contacts between A3G and the 2′ H moiety at position −1 of the substrate may not exist or are not completely disrupted by the presence of an O-methyl moiety. It also indicates that the ribose 2′ position of the adjacent cytosine may be a suitable site for insertion of a cross-linking agent to form a covalently linked, yet functional enzyme-substrate complex for biochemical and/or structural analysis.
Fig. 2C depicts the deamination profiles of the A3G D316R/D317R mutant on standard and 2′Om-substituted substrates. Comparing lanes Sm and S illustrates the unusual activity of the mutant enzyme, i.e. deamination occurs at positions 0 and −1 with nearly equal efficiency. In addition, deamination of ~30% of the substrate required half as much D316R/D317R mutant relative to wild type. Although the precise biochemical mechanism remains unclear, it is likely that the broader specificity and increased activity of this mutant result from increased positive charge at the enzyme-substrate interface and that the function of the wild type residues may involve electrostatic repulsion of the substrate phosphate backbone. Such a notion is further supported by fluorescence anisotropy studies, which demonstrate that the affinity of the mutant enzyme is 1.5-2-fold higher than that of the wild type enzyme (supplemental Fig. 1).
With respect to deamination of the cytosine at position 0 (Fig. 2, A-D), the same substitutions affect wild type and mutant enzyme activity, but the degree of interference varies significantly between the two enzymes. Specifically, 2′Om substitution at positions −3, −2, or +1 has a lesser effect on deamination catalyzed by the mutant enzyme relative to the wild type (compare equivalent bars in Fig. 2, B and D), whereas the mutant appears to be more sensitive to substitution at position −1. This entire pattern appears to be shifted one nucleotide when deamination/hydrolysis at position −1 is quantified (Fig. 2E), except for minor differences at the peripheries of the two windows. These data suggest that the entire catalytic domain of the mutant enzyme shifts to deaminate cytosine at position −1 and that the sensitivities to substitution at adjacent positions shift with it.
Methyl Phosphotriester Substitution-The previous experiments illustrated that A3G-mediated deamination is sensitive to 2′Om substitution at positions −3 to +1, suggesting that critical nucleoprotein contacts in this region are disrupted. However, although these experiments explore the interaction between A3G and the sugar moieties of the DNA substrate, they do not address the dependence of A3G activity on phosphate contacts. Indeed, because the trajectories of 2′Om and non-phosphate oxygens emerging from the DNA strand differ by more than 90° (Fig. 1B), enzyme contacts with sugar and phosphate moieties may be significantly different. For a more thorough analysis of the dependence of A3G activity on phosphate contacts, substrates in which methyl PTE substitutions were introduced at positions −10 to +11 were evaluated (where phosphates immediately 5′ and 3′ to the target cytidine are defined as −1 and +1, respectively).
A3G-mediated deamination profiles derived from PTE-substituted substrates and quantitation thereof are shown in Fig. 3, A and B, respectively. The results are similar to those observed with 2′Om substitution, with a window of sensitivity spanning positions −3 to +1, further indicating that interference only occurs when binding of the catalytic domain is disrupted. However, the degree to which A3G activity is reduced upon PTE substitution is not as pronounced as that observed with the 2′Om substrates (compare Figs. 2B and 3B). This is likely because 1) the PTE moiety is an enantiomeric mixture (i.e. the methyl group may be on either non-bridging phosphate oxygen at a particular position) and/or 2) basic residues may still effectively form hydrogen bonds with the non-bridging oxygen despite the presence of the methyl PTE.
In the D316R/D317R A3G profiles, the window of sensitivity contracts by a single nucleotide (Fig. 3, C and D), with PTE substitution at positions −2, −1, or +1 resulting in ~40% relative deamination of the target cytosine. When deamination at position −1 is quantified (Fig. 3E), a shift in the entire window of sensitivity by a single nucleotide toward the 5′ end of the substrate was observed, consistent with the corresponding 2′Om-substituted substrate profiles of Fig. 2.
Scanning Abasic Substitutions-Single abasic moieties were introduced stepwise to examine how base removal from nucleotides −10 to +10 and potential stacking would impact A3G activity. Note that the abasic analog used here is a reduced derivative of the naturally occurring abasic lesion and is, therefore, insensitive to alkaline hydrolysis. Thus, as with the equivalent 2′Om substitution, the absence of hydrolysis products from substrates containing an abasic moiety at position 0 (Fig. 4A, lane 0) cannot be meaningfully interpreted.
From the profiles depicted in Fig. 4A it is apparent that removing an entire base has a greater effect on A3G activity than introducing an O-methyl or methyl group into the sugar phosphate backbone, as abasic substitutions from position −3 to +1 almost completely prevent cytosine deamination. In this experiment the "edge" of the window of sensitivity 5′ to the target cytosine is clearly defined, with abasic substitution at position −4 having a minimal effect on A3G activity, whereas substitution at position −3 almost completely prevents cytosine deamination (Fig. 4, A and B). In contrast, the window has no clear border 3′ to the deamination target, with a more gradual reduction in sensitivity as the abasic substitution is progressively relocated from position +1 to +5. This indicates that base-dependent A3G contacts between +1 and +5 are important but not absolutely essential for deamination activity. The profile also indicates that removal of the base at position −1 almost completely prevents deamination of the target cytosine. This is in contrast to prior experiments in which 2′Om and PTE-substituted substrates were utilized, indicating that A3G activity is highly dependent on base composition at this position.
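The notion of a "window of sensitivity" can be expressed operationally as the contiguous run of positions on either side of the target at which relative deamination falls below some cutoff. The sketch below is one possible formalization and is not the procedure used to prepare the figures: the 0.5 cutoff is an arbitrary assumption, and the values in the dictionary are invented solely to mimic a sharp 5′ edge and are not measured data.

```python
# Sketch: extract a window of sensitivity from per-position relative deamination.
# Position 0 (the target itself) is excluded because it cannot be scored.

def sensitivity_window(relative, threshold=0.5):
    """Return (most 5' position, most 3' position) of the contiguous sensitive run.

    Walks outward from the target in each direction and stops at the first
    position whose relative deamination is at or above the threshold.
    Positions missing from the mapping are treated as unaffected.
    """
    five_prime = 0
    while relative.get(five_prime - 1, 1.0) < threshold:
        five_prime -= 1
    three_prime = 0
    while relative.get(three_prime + 1, 1.0) < threshold:
        three_prime += 1
    return five_prime, three_prime


# Invented values for illustration only (fraction of standard-substrate activity).
illustrative = {-5: 0.90, -4: 0.85, -3: 0.05, -2: 0.02, -1: 0.03,
                1: 0.04, 2: 0.60, 3: 0.70}

window = sensitivity_window(illustrative)  # (-3, 1)
```

A single cutoff captures a sharply defined edge well but handles a gradual decline poorly, which is why the 3′ boundary described above is better reported as a trend than as a fixed position.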
Although D316R/D317R A3G exhibits similar profiles on these abasic substrates with respect to deamination at position 0, this mutant is less sensitive to substitution at position −3, −2, −1, +1, or +2 (Fig. 4, C and D). This may be the result of an overall higher affinity of the mutant enzyme for its substrate (43). In addition, the window of sensitivity relative to deamination at position −1 is poorly defined (Fig. 4, C and E), as relative deamination increases gradually (in either direction) with the number of nucleotides from the target cytosine. This window can be loosely interpreted as spanning from position −4 to +5, inclusive. It is not clear why the 5′ border should be so ill-defined in this case, although variations in sequence context between the two target cytosines may play a role.
APOBEC3G Substrate Specificity MARCH 13, 2009 • VOLUME 284 • NUMBER 11 JOURNAL OF BIOLOGICAL CHEMISTRY 7051
Pyrimidine Analog Substitution 5′ to the Target Cytosine-In each of the scanning substitution approaches described above, the nucleotide composition 2-3 nt immediately 5′ and 1 nt 3′ to the target site is shown to be important for efficient deamination of the target cytosine. In particular, removing nucleobases adjacent to the target site is especially detrimental. These data support the conventional definitions of the A3G consensus sequence as 5′-CCCA-3′ (11), with 5′-AAACCCAAA-3′ recognized most efficiently (20). To examine the chemical basis for A3G recognition of the consensus sequence, we first introduced a series of pyrimidine analog substitutions at position −1, −2, or both (Fig. 5, A-C, respectively; quantitation is below each panel). The chemical structure of each analog is shown in Fig. 1C.
Because some of the pyrimidine analogs are alkaline labile, a weak base (ammonium hydroxide) was used to hydrolyze reaction products after treatment with uracil DNA glycosylase. 16 h of incubation in this reagent was sufficient to hydrolyze all dU-containing control substrates and did not otherwise interfere with reaction processing. Despite the mild conditions, a small amount of strand scission at the site(s) of analog substitution is evident (i.e. in products that migrate to positions other than that marked 0 in Fig. 5). These products do not affect how deamination is quantified, however, because (a) degradation occurs after the substrate has been incubated with A3G, and (b) the extent of degradation is 8% or less of starting material in all cases. Profiles illustrating products of pyrimidine analog hydrolysis in the absence of A3G are shown in supplemental Fig. 2.
Fig. 5A illustrates clear preferences among pyrimidine analogs introduced at position −1. Deamination of substrates containing T at position −1 is highly disfavored relative to those containing C at the same position. Unlike C, T is protonated at the 3 position of the pyrimidine ring and contains carbonyl oxygen and methyl groups at ring positions 4 and 5, respectively. Interestingly, the 5-methyl moiety appears to contribute minimally to the distinction among pyrimidines. Substrate containing 5-methyl C (mC) at position −1 is deaminated with greater than 60% efficiency relative to the control substrate, whereas relative deamination of DNA containing pseudouridine (Ψ) (which lacks a 5-methyl group but otherwise resembles T) is less than 2%.
Substrates containing C* or 4C at position −1 are deaminated by A3G with 25 and 19% efficiency, respectively. This intermediate phenotype is likely the result of competing effects by different structural features of each analog on enzyme recognition. For example, although the pyrimidine ring compositions of C and C* are identical at positions 2, 3, and 4, saturation of the 4-5 bond in C* likely affects stacking between this analog and adjacent residues, which may disrupt substrate recognition. Similarly, although 4C is structurally identical to C in all other respects, its ethyl moiety at the N4 ring position may sterically interfere with A3G substrate recognition.
Substrates containing 3C or iC at position −1 are not appreciably deaminated by A3G. The former analog contains an N3 methyl group, the steric and/or H-bonding effects of which are apparently sufficient to completely disrupt enzymatic function. In contrast, iC not only lacks an N3 alkyl moiety, but like C, it is most likely not protonated at this position under the conditions of this assay. Therefore, although an unprotonated N3 correlates well with A3G activity, this alone is insufficient to direct deamination of the 3′-adjacent C. Furthermore, because enzyme activity is completely disrupted by exchange of amino and carbonyl groups (relative to C) at ring positions 2 and 4, the chemical determinants of A3G recognition in this context are clearly position-specific. Finally, relative deamination of substrate containing 5-methyl Zebularine (Z) is 8.8%, indicating that the 4-amino group, although not absolutely essential, contributes significantly to substrate recognition by A3G.
Based on both prior results (20) and rotational anisotropy experiments conducted for the present study (supplemental Fig. 1), these results cannot be explained by differences in the affinity of A3G for the various substrates. Indeed, neither varying the base composition at position −1 nor removal of the recognition sequence entirely appears to affect A3G-DNA binding.
Profiles derived from substrates containing pyrimidine analogs at the −2 position are shown in Fig. 5B. In general, these profiles are quite similar to those derived from analog substitution at position −1, although A3G is somewhat less sensitive to introduction of T, Ψ, iC, and 3C at the −2 position. This is consistent with reports that A3G tolerates T at the −2 position, although C is preferred (22). Remarkably, substituting 4C for C at position −2 results in a nearly 50% increase in deamination relative to the standard substrate. It may be that although substitution of this analog at position −1 produces a steric clash that inhibits A3G activity, interference does not occur when 4C is removed one base further from the target cytosine. Instead, the ethyl moiety may make favorable, likely hydrophobic, contacts with A3G that stabilize the complex and augment enzyme activity. As was the case with 2′Om substitution at position −1, it may be that the 4-amino group of the cytosine at position −2 is an ideal site for introduction of a chemical cross-linking agent.
In some cases the effects of analog substitution at positions −1 and −2 appear to be additive, based on dual substitution profiles presented in Fig. 5C. As was the case with −1 substitution, deamination of substrates containing T, Ψ, iC, or 3C at both positions is minimal. The same is true for substrates containing C* or Z at both positions, where the effects of tandem substitution appear to be cumulative. Deamination of oligonucleotides containing mC at both positions is ~57%, a lower value than was measured on either individually substituted substrate. However, introducing 4C at position −2 partially counterbalances the inhibitory effects of 4C at position −1, as the relative deamination value obtained from the dually substituted DNA is more than double that derived from substrate containing 4C at position −1 alone.
A-E, deamination/cleavage product designations are identical to those described in Fig. 2, A-E, respectively.
Nucleoside Analog Substitution 3′ to Target Cytosine-Analysis of virus mutation patterns indicates that A3G-mediated deamination is most efficient on sequences where A is immediately 3′ to the target C (11). In addition, Chelico et al. (20) have shown that oligonucleotides containing a 5′-CCCAAA-3′ motif are deaminated more efficiently than those containing 5′-CCCGGG-3′ and much more efficiently than those containing 5′-CCCTTT-3′, demonstrating that as many as three nt 3′ to the target C can influence activity. To analyze this in more detail, A → X purine and pyrimidine analog substitutions were introduced at positions +1 and +1 through +3, inclusive, with the resulting profiles shown in Fig. 6, A and B. Fig. 6A demonstrates that substituting the +1 A of the standard substrate with another purine (G or I) has a negligible effect. However, introducing a pyrimidine at this position impairs enzyme activity by 43% or more, with the naturally occurring T being the most tolerated. This suggests that the double-ring purine structure and not its exocyclic composition is the principal determinant of deamination efficiency in this context. Interestingly, although A → C substitution at position +1 creates a series of four consecutive C residues (and, thus, two overlapping 5′-CCC-3′ motifs), neither potential cytosine target is deaminated as efficiently as in the 5′-CCCA-3′ motif in the standard substrate.
When nucleosides at positions +1 to +3 are triply substituted, G and I are less well tolerated (Fig. 6B). Inosine nucleoside, like A, contains only hydrogen at the purine 2 position but also contains a 4-carbonyl substitution and possesses an electrostatic potential more closely resembling that of G. Hence, it is likely that the latter properties are somewhat inhibitory to A3G activity in this context and that this effect is cumulative with consecutive purines. Similarly, substrates containing consecutive pyrimidines at positions +1 through +3 are deaminated much less efficiently than those containing a pyrimidine only at position +1, an effect that otherwise appears to be largely independent of base composition. Again, substrate that contains more than three consecutive C residues (six, in this case) is deaminated inefficiently.
FIGURE 5. A3G-mediated deamination sensitivity of substrates containing pyrimidine analogs 5′ to the deamination target. Substitutions were introduced at position −1, −2, or at both sites (marked by X), with profiles depicted in panels A-C, respectively. Lane designations reflect the analog incorporated into the substrate used for each reaction. These include C (present in the standard substrate), T, mC, , C*, iC, 4C, 3C, and Z. Relative deamination values were calculated from the wild type A3G profiles as previously described (Figs. 2-4, "Experimental Procedures") and are shown beneath the respective panels. Note that products of A3G-independent pyrimidine analog hydrolysis (lanes C*, iC, 3C, and Z, especially) were identified and quantified by comparison of these profiles with those obtained in the absence of A3G (supplemental Fig. 2).
Because the substrates in question contain A → C substitutions rather than C insertions, the sequence context 3′ to the string of consecutive C residues varies (i.e. C residues are no longer flanked on both sides by 3 A residues). In addition, the separation between the 3′-terminal C in the motif and the 3′ terminus of the oligonucleotide is reduced by 1 or 3 nucleotides (in substrates containing 4 or 6 consecutive C residues, respectively). This may be important, as Chelico et al. (20) demonstrated that both the sequence context of an A3G recognition sequence and its proximity to the substrate 3′ terminus significantly affect deamination activity. To address the possibility that poor deamination of overlapping 5′-CCC-3′ motifs is due to altered sequence context, a series of substrates was constructed, each containing a unique 5′-AAAC_XAAA-3′ deamination target within the sequence context of the standard substrate (i.e. 5′…AAAC_3AAA…3′), where X, the number of consecutive C residues, = 2-6. The resulting profiles are depicted in Fig. 6C. Note that unlike previous experiments, where only deamination at position 0 (or −1, in the case of the mutant enzyme) is considered for quantitation, total relative deamination is defined in Fig. 6C as the ratio of the sum of all products in a reaction to the sum of products derived from the standard substrate (i.e. where X = 3).
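The definition of total relative deamination used for Fig. 6C can be written out as a short calculation. The following is only a minimal sketch: the function name, the per-position dictionaries, and all numerical values are hypothetical illustrations and are not data from this study. It assumes that each product band has already been quantified as a fraction of the substrate in its lane.

```python
def total_relative_deamination(products, standard_products):
    """Sum of all product fractions in one reaction divided by the
    summed product fractions obtained with the standard substrate (X = 3).

    Both arguments map a substrate position to the fraction of substrate
    deaminated at that position (hypothetical input format).
    """
    standard_total = sum(standard_products.values())
    if standard_total <= 0:
        raise ValueError("standard substrate shows no deamination product")
    return sum(products.values()) / standard_total


# Hypothetical values for illustration only (not measured data).
standard = {0: 0.40}                    # single target C in the standard substrate
five_c = {-2: 0.08, -1: 0.02, 0: 0.10}  # products at several C residues

print(round(total_relative_deamination(five_c, standard), 2))
```

With these made-up numbers, the summed products (0.20) are half of the standard total (0.40), so the reaction would be reported as 0.5. Summing over all positions is what distinguishes this measure from the single-position quantitation used in the earlier figures.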
Consistent with previous definitions of the A3G consensus sequence, the substrate containing only two consecutive C residues (flanked by A residues) is not efficiently deaminated. In addition, it is clear that even when sequence context is considered, motifs containing more than three consecutive cytosines are deaminated only half as efficiently as the optimized 5′-AAACCCAAA-3′ motif present in the standard substrate. Also, in motifs containing 5 or 6 consecutive C residues, the 3′- and 5′-terminal C residues are deaminated more efficiently than internal C residues, which likely reflects the importance of the flanking A residues in substrate recognition.
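The way overlapping 5′-CCC-3′ motifs accumulate as the C run lengthens can be enumerated directly. The sketch below is purely a sequence-counting aid under stated assumptions: the helper names are hypothetical, only the 5′-AAAC_XAAA-3′ core is modeled rather than the full 39-base oligonucleotide, and a "candidate target" is taken to mean the 3′ C of any CCC triplet. It makes no prediction about which cytosines the enzyme actually deaminates.

```python
def c_run_substrate(x, flank="AAA"):
    """Core motif 5'-AAA C(x) AAA-3' with x consecutive C residues."""
    return flank + "C" * x + flank


def ccc_targets(seq):
    """0-based indices of every C that is the 3' residue of a CCC triplet.

    Overlapping triplets are counted separately.
    """
    return [i for i in range(2, len(seq)) if seq[i - 2:i + 1] == "CCC"]


for x in range(2, 7):
    seq = c_run_substrate(x)
    targets = ccc_targets(seq)
    # Candidate targets that are immediately followed by A (a full CCCA context).
    followed_by_a = [i for i in targets if seq[i + 1:i + 2] == "A"]
    print(x, seq, len(targets), len(followed_by_a))
```

For X = 2 no CCC triplet exists, for X = 3 there is exactly one, and each additional C adds one overlapping triplet (two for X = 4, four for X = 6), while only the 3′-terminal C of the run ever sits in a complete 5′-CCCA-3′ context.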

DISCUSSION
In the present study the molecular footprint of A3G on a 39-base oligonucleotide substrate was determined by nucleoside analog interference mapping. Deamination profiles derived from substrates containing stepwise 2′Om, PTE, or abasic substitutions revealed overlapping windows of sensitivity spanning positions −3 to +2, indicating that (i) introduction of a methyl or O-methyl group into the sugar-phosphate backbone or (ii) removal of a base within the 5′-CCCA-3′ consensus sequence (or 1 nt to either side) is sufficient to inhibit or prevent deamination of the target cytosine. In general, the extent to which deamination was disrupted was highly analog-dependent. Abasic substitution virtually abolished A3G activity over a wider range of positions than other analogs (i.e. from −3 to +2, with decreasing sensitivity from +3 to +5), whereas relative deamination of substrate containing a methyl PTE linkage at any position was always at least 28%. Although specific sites of contact have not been described, aromatic residues within the substrate binding cleft of the A3G CD2 (Tyr-315, Trp-285) may participate in stacking interactions with nucleoside bases, whereas positively charged residues Arg-213, -320, -374, and/or -376 may stabilize the enzyme-substrate complex by hydrogen-bonding to the phosphate backbone (43).
FIGURE 6. A3G-mediated deamination of substrates containing pyrimidine analogs 3′ to the target C. Substitutions were introduced at position +1 or at each of positions +1, +2, and +3 (marked by X), with resulting profiles depicted in A and B. Quantification of deamination activity is also shown. Lane designations and analog substitutions were as described in Fig. 5, except that 2′-deoxyadenosine (A) occupies positions +1 to +3 in the standard substrate, and substrates containing 2′-deoxyguanosine (G) or 2′-deoxyinosine (I) were also examined. Panel C depicts the deamination profiles derived from substrates containing a unique 5′-AAAC_XAAA-3′ motif, where the value of X varies from 2 to 6, as indicated in lane designations. In this case Relative Deamination (total) is defined as the sum of deamination products at all positions within a particular substrate divided by the total measured for the standard substrate (i.e. where X = 3).
MARCH 13, 2009 • VOLUME 284 • NUMBER 11

The breadth of the A3G-DNA interface indicates that although analog substitution interferes with substrate recognition by CD2, nucleoprotein contacts outside of this domain either do not exist or are unaffected. Because truncated, bacterially expressed A3G containing only the CD2 domain is enzymatically active (46), contacts outside of this domain may not be essential and/or may not occur. This notion is supported by a recent study predicting that under low salt conditions, short oligonucleotides are most likely deaminated by monomeric A3G (31).
It is also possible, however, that contacts beyond the catalytic CD2 domain are indeed formed but are not restricted to a single site on the DNA substrate and are, therefore, not detectable by our methodology. The latter scenario is consistent with a model proposed by Chelico et al. (20) explaining the 3′ → 5′ processive nature of A3G-mediated deamination. In this model CD1 domains interact to form a "head-to-head" homodimer, whereas CD2 domains either concurrently or alternately bind substrate as the enzyme "jumps" along long ssDNAs. Although at least one of these CD2 domains must be located at an A3G consensus sequence for catalysis to occur, the other may bind ssDNA non-specifically. Although intriguing, this model is principally applicable to A3G-mediated deamination of long ssDNAs and is therefore not subject to validation by the present study.
A3G containing D → R substitution mutations at amino acid residues 316 and 317 is not only more active than the wild type enzyme but also exhibits an altered deamination specificity, catalyzing C → U conversion at position −1 as well as at position 0 (Fig. 2A). The differing sequence contexts of the two deamination targets (i.e. 5′-ACCC-3′ versus 5′-CCCA-3′) indicate that the mutant enzyme is less sensitive to base composition 2 nt 5′ and 1 nt 3′ to the target cytosine. Similarly, D316R/D317R A3G is generally less sensitive to 2′Om, abasic, or PTE substitution than the wild type enzyme, as evidenced by narrower molecular footprints (2′Om, PTE) and/or higher relative deamination values on individual substrates (Figs. 2-4, compare panels B and D). Taken together with the fluorescence anisotropy data presented in supplemental Fig. 1, these results are consistent with the notion that the D316R and D317R substitutions increase the affinity of A3G for its substrate by (a) substituting for or complementing existing hydrogen bonds or (b) forming new bonds with the DNA backbone. Conversely, the role of the wild type, negatively charged residues may be to increase the stringency and/or specificity of binding by electrostatic repulsion of the phosphate oxygens.
The data presented in Figs. 5 and 6 demonstrate that the composition of nucleosides at positions proximal to the target cytosine is crucial for A3G activity, with the chemical makeup of the pyrimidine at position −1 being especially critical (summarized in Table 1). T is a poor substitute for C at this position, as has been previously determined using conventional biochemical and molecular biological techniques (11, 18, 22). Remarkably, however, methyl group occupancy at position 5 of the pyrimidine ring, although constituting the greatest steric difference between C and T bases, does not correlate well with deaminase function. Furthermore, because mC substitution at position −1 minimally affects deaminase activity, it is unlikely that critical contacts occur between A3G and the 5-position of −1C in the consensus recognition sequence. Note that these results do not conflict with the data of Iwatani et al. (21), as the oligonucleotide used in that study contained a C → mC substitution at position 0 (as well as at −1) and was, therefore, not deaminated by A3G.
The features of pyrimidine composition at position −1 (and −2) that best correlate with A3G activity relate to the chemical occupancy at ring positions 3 and 4. Of the five −1-substituted DNAs that were deaminated with greatest efficiency, the three best substrates contained a 4-amino moiety, and none was protonated at the N3 position (Table 1). Conversely, −1-substituted DNAs containing a carbonyl group at ring position 4 were universally poor A3G substrates, and three of four were protonated at N3. One consequence of the chemical differences between these two groups of substrates is that H-bonding roles at positions 3 and 4 would be inverted. Specifically, although N3 in the former group would be expected to serve as a hydrogen-bond acceptor, the protonated nitrogen prevalent in the latter group would act as a donor. The inverse is true of −1-substituted DNAs alternately containing amino or carbonyl moieties at the 4-position of the pyrimidine ring. It is likely, therefore, that hydrogen bonds between A3G amino acid residues and DNA at the 3- and 4-ring positions in −1-substituted DNAs, or disruption thereof, play a critical role in substrate recognition. Potential contributors to this interaction include Tyr-315 and/or one of the charged residues that line the floor of the substrate binding groove, although specific contact sites have not been demonstrated (43). Hydrogen-bond and/or partial charge inversion may also explain how product is released from the active site after A3G-mediated conversion of C → U, although this possibility could not be tested here.
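The inversion of hydrogen-bonding roles described above can be tabulated for the canonical pyrimidines. The following minimal sketch encodes only standard textbook properties of C, 5-methyl-C (mC), T, and U; the class and function names are hypothetical, and the other analogs listed in Table 1 are deliberately omitted because their properties are not specified in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PyrimidineFace:
    n3: str          # "acceptor" if N3 is unprotonated, "donor" if it carries H
    c4: str          # "donor" for a 4-amino group, "acceptor" for a 4-carbonyl
    c5_methyl: bool  # True if a methyl group occupies ring position 5


# Standard properties of the canonical bases (illustrative subset only).
BASES = {
    "C": PyrimidineFace(n3="acceptor", c4="donor", c5_methyl=False),
    "mC": PyrimidineFace(n3="acceptor", c4="donor", c5_methyl=True),
    "T": PyrimidineFace(n3="donor", c4="acceptor", c5_methyl=True),
    "U": PyrimidineFace(n3="donor", c4="acceptor", c5_methyl=False),
}


def roles_inverted(a, b):
    """True if both the N3 and the 4-position H-bond roles differ between a and b."""
    fa, fb = BASES[a], BASES[b]
    return fa.n3 != fb.n3 and fa.c4 != fb.c4


for pair in [("C", "mC"), ("C", "T"), ("C", "U")]:
    print(pair, roles_inverted(*pair))
```

In this representation, mC differs from C only at ring position 5, whereas T and U invert the donor and acceptor roles at both N3 and the 4-position. The same inversion applies to the C → U reaction product.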
Alternatively, or perhaps in addition, changing the exocyclic groups within one or more pyrimidines in the 5′-CCCA-3′ consensus sequence may alter stacking and/or electrostatic interactions among adjacent bases on flexible ssDNA substrates, which in turn could affect the efficiency with which substrate is recognized by A3G. Such interactions (or disruption thereof) may also explain why purine trimers are preferred to pyrimidine trimers 3′ (and 5′) to the A3G consensus sequence (Fig. 6) (20). For example, strong stacking interactions among purine trimers flanking the recognition sequence may, by excluding the three C residues, render the target sequence more accessible to the enzyme.
Although this study advances our understanding of the structural requirements for A3G-mediated deamination, a high resolution structure of the A3G-substrate complex will be required before important individual contacts can be identified. However, variability among A3G-DNA complexes may prove problematic, as the enzyme appears to bind and move along ssDNA randomly (20, 31). Previously, catalytically competent nucleoprotein complexes have been stabilized by chemical cross-linking to facilitate crystallographic analysis (47-49). Our study illustrates that several sites within the DNA substrate might be suitable for attachment of a cross-linking agent without adversely affecting enzyme activity, including (i) the nonbridging phosphate oxygens (at several positions), (ii) the sugar 2′ position (at substrate position −1), (iii) the pyrimidine 4 position (at substrate position −2), and (iv) the pyrimidine 5 position (at substrate positions +1, −1, or −2). Although it remains to be seen whether nucleoprotein cross-linking will occur efficiently at any of these locations, or even which cross-linking approach is best suited for the application, this information may prove useful for obtaining a high resolution, catalytically relevant A3G-DNA complex, which would facilitate future structural analysis of this important enzyme.