Crystal Structure of DNA Cytidine Deaminase APOBEC3G Catalytic Deamination Domain Suggests a Binding Mode of Full-length Enzyme to Single-stranded DNA

Background: The mechanism by which the DNA cytidine deaminase APOBEC3G (A3G) interacts with single-stranded DNA (ssDNA) is not well characterized. Results: The crystal structure of a head-to-tail dimer of the A3G catalytic deamination domain (A3G-CD2) was obtained. Conclusion: The dimer structure of A3G-CD2 suggests a binding mode of full-length A3G to ssDNA. Significance: The dimer structure of A3G-CD2 may represent a structural model of full-length A3G.

APOBEC3G (A3G) is a DNA cytidine deaminase (CD) that demonstrates antiviral activity against human immunodeficiency virus 1 (HIV-1) and other pathogenic viruses. It has an inactive N-terminal CD1 domain (A3G-CD1) that binds the virus infectivity factor (Vif) protein and a catalytically active C-terminal CD2 deamination domain (A3G-CD2). Although many studies on the structure of A3G-CD2 and the enzymatic properties of full-length A3G have been reported, the mechanism by which A3G interacts with HIV-1 single-stranded DNA (ssDNA) is still not well characterized. Here, we report a crystal structure of a novel A3G-CD2 head-to-tail dimer (in which the N terminus of monomer H (head) interacts with the C terminus of monomer T (tail)), in which a continuous DNA binding groove was observed. By constructing an A3G-CD1 structural model, we found that its overall fold was almost identical to that of A3G-CD2. We mutated the residues located in or along the groove in monomer H and the residues in A3G-CD1 that correspond to those located in or along the groove in monomer T. Then, by performing enzymatic assays, we confirmed the reported key elements and residues in A3G necessary for catalytic deamination. Moreover, we identified more than 10 residues in A3G essential to DNA binding and the deamination reaction. Therefore, this dimer structure may represent a structural model of full-length A3G, which indicates a possible binding mode of A3G to HIV-1 ssDNA.

Human A3G exists as a monomer, dimer, or tetramer, depending on the DNA substrate and salt concentration. It possesses two homologous deaminase domains, an inactive N-terminal CD1 domain (i.e. A3G-CD1) required for Vif, DNA, and RNA binding and an active C-terminal CD2 domain (i.e. A3G-CD2) required for catalysis and motif specificity (28-30). The CD1 domain is also suggested to be required for the incorporation of A3G into virions (29). A3G deaminates cytidine processively 3′ → 5′ on ssDNA (31,32). The processive deamination reactions have been shown to occur in a non-random way (31,32). To date, the three-dimensional structure of the free A3G-CD2 domain has been determined by NMR (33-35) and x-ray crystallography (36-38). The three-dimensional structures of free APOBEC2 (39) and other APOBEC3 subfamily members, such as A3A (40,41), A3C (42), and A3F (43,44), have also been reported. The structural basis for Vif hijacking of CBF-β and the CUL5 E3 ligase was also recently revealed (45). However, the mechanism by which A3G-CD2 or full-length A3G and other members of its family interact with ssDNA is still not well understood.
To address how A3G-CD2 interacts with HIV-1 ssDNA, we crystallized A3G-CD2 in the presence of an ssDNA containing one target motif sequence, 5′-CCC-3′. A novel head-to-tail dimer structure of A3G-CD2 was obtained. On its surface, a continuous groove for ssDNA binding was found. Our structural analysis and biochemical assays suggest that this dimer structure may represent a structural model of full-length A3G. From this model, we identified more than 10 new residues in both A3G-CD1 and A3G-CD2 critical to the deamination catalysis reaction.

Expression and Purification of A3G-CD2 and Its Variants-The DNA corresponding to the gene of wild-type (WT) A3G-CD2 (residues 193-384) or its variants was cloned into a pET28a vector containing an N-terminal His tag and thrombin cleavage sites. A3G-CD2 and its variants were expressed in Rosetta (DE3) pLysS Escherichia coli cells. Cell cultures were grown to an A600 of about 0.8 and induced with isopropyl 1-thio-β-D-galactopyranoside at a final concentration of 0.5 mM for 20 h at 18°C. Cells were resuspended in nickel-binding buffer (20 mM Tris-HCl, pH 7.5, 500 mM NaCl, 10 μM ZnCl2, and 0.5 mM dithiothreitol (DTT)) with protease inhibitor (DNase I) and lysed at 15,000 p.s.i. using a hydraulic cell disruption system (constant System JINBO Benchtop) (Guangzhou Juneng Biology and Technology Co., Ltd., Guangzhou, China). The lysate was centrifuged at 12,000 rpm and 4°C for 55 min to remove cellular debris prior to loading onto a nickel-nitrilotriacetic acid resin (GE Healthcare). The protein was washed with nickel-binding buffer containing 20 mM imidazole and then eluted by a five-step gradient of nickel-binding buffer with 40 mM imidazole, 100 mM imidazole, 250 mM imidazole, and 500 mM imidazole, respectively. Fractions containing A3G-CD2 or its variants were concentrated and purified on a Superdex75 16/600 GL column (GE Healthcare) previously equilibrated with buffer B (50 mM HEPES, pH 7.5, 50 mM NaCl, 50 μM ZnCl2, 5 mM DTT).
X-ray Crystallization Screening and Data Collection-Purified A3G-CD2 and its D370A variant were quantified by A280 and then mixed with ssDNA (sequence 5′-TTAACCCTTA-3′, commercially synthesized at HPLC grade by Shanghai Sangon Biotech Co., Ltd. (Shanghai, China)) at a molar ratio of 1:1.2 (protein/ssDNA) and concentrated to final concentrations of 10 and 20 mg/ml for crystallization. At 18°C, the crystals of the WT A3G-CD2 protein were grown by sitting-drop vapor diffusion against a reservoir containing 0.04 M citric acid, 0.06 M Bistris propane, pH 6.4, 20% (w/v) polyethylene glycol 3350, whereas the crystals of D370A were grown by sitting-drop vapor diffusion against a reservoir containing 0.1 M sodium citrate tribasic dihydrate, pH 5.6, 20% (v/v) 2-propanol, 20% (w/v) polyethylene glycol 4000. The crystals were flash-frozen in Paratone-N. X-ray diffraction data were collected at beamline BL17U of the Shanghai Synchrotron Radiation Facility using a MAR CCD MX-225 detector. The wavelength of the radiation was 0.9792 Å, and the distance between the crystal and the detector was 300 mm. The exposure time for each frame was 1 s with a 1° oscillation, and 360 frames were collected. The data were indexed, integrated, and scaled using the HKL-2000 program suite (46).
Structural Determination-The A3G-CD2 and D370A structures were determined by molecular replacement, using the program PHENIX-Auto MR (47) and the structure of the A3G-CD2 2K3A variant (PDB code 3IR2) as the search model (37). Iterative rounds of model rebuilding and simulated annealing torsion angle refinement were performed using the programs Coot (48) and PHENIX Refine (47). Identification of proper sequence registry was confirmed with the location of the catalytic zinc site and the presence of bulky aromatic residues. Ramachandran plot analysis revealed that 93.7 and 6.7% of the residues of WT A3G-CD2 protein and 91.8 and 8.2% of the residues of A3G-CD2 D370A variant were in most favored and allowed regions, respectively. The final model of WT protein contains residues 194-381 in monomer H and residues 196-246, 253-316, and 321-381 in monomer T. Weak electron density was observed for residues 246-253 and 316-321 in monomer T, whereas the final model of the D370A variant contains residues 193-381.
Real-time Studies of A3G-CD2-catalyzed Deamination by NMR-A series of one-dimensional 1H NMR spectra of the reported HIV-1 ssDNA with the sequence 5′-ATTCCCAATT-3′ (34) were acquired as a function of time at 20°C in NMR buffer (50 mM Na2HPO4, 50 mM NaCl, pH 7.5, 50 μM ZnCl2, 2 mM DTT), after adding concentrated A3G-CD2 solution. To accurately assign uridine NMR signals, three ssDNAs with sequences of 5′-ATTCCUAATT-3′, 5′-ATTCUCAATT-3′, and 5′-ATTCUUAATT-3′ were used as controls. Concentrations of A3G-CD2 and its variants were fixed at 9.4 μM. The intensities of the 1H NMR signal belonging to U₆ were used for quantification. Real-time monitoring of the A3G-CD2-catalyzed deamination reaction by NMR was performed to extract initial rates (<5% dC → dU conversion) for a series of substrate concentrations (37.5, 56.3, 93.8, 117.2, and 140.6 μM). Km values were obtained using the Michaelis-Menten module of the software Prism 5 (GraphPad Inc.).
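The extraction of Km from initial rates can be illustrated with a minimal sketch. It is not the Prism workflow used here (Prism fits the Michaelis-Menten equation by nonlinear regression); it swaps in a Hanes-Woolf linearization, [S]/v = [S]/Vmax + Km/Vmax, so that only the Python standard library is needed. The substrate concentrations and the 9.4 μM enzyme concentration are those given in the text; the initial rates are synthetic values generated from hypothetical Vmax and Km, not measured data.

```python
# Hanes-Woolf estimate of Vmax and Km from initial rates.
# Assumption: the rates below are synthetic, generated from hypothetical parameters.

def michaelis_menten(s, vmax, km):
    """Initial rate v at substrate concentration s."""
    return vmax * s / (km + s)


def hanes_woolf_fit(substrate, rates):
    """Least-squares fit of s/v = s/Vmax + Km/Vmax; returns (vmax, km)."""
    y = [s / v for s, v in zip(substrate, rates)]
    n = len(substrate)
    mean_x = sum(substrate) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in substrate)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(substrate, y))
    slope = sxy / sxx                      # equals 1 / Vmax
    intercept = mean_y - slope * mean_x    # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


SUBSTRATE_UM = [37.5, 56.3, 93.8, 117.2, 140.6]  # from the text
ENZYME_UM = 9.4                                  # from the text
TRUE_VMAX = 0.5   # hypothetical, uM/min
TRUE_KM = 60.0    # hypothetical, uM

synthetic_rates = [michaelis_menten(s, TRUE_VMAX, TRUE_KM) for s in SUBSTRATE_UM]
fit_vmax, fit_km = hanes_woolf_fit(SUBSTRATE_UM, synthetic_rates)
fit_kcat = fit_vmax / ENZYME_UM  # turnover number, 1/min

print(f"Vmax = {fit_vmax:.3f} uM/min, Km = {fit_km:.1f} uM, kcat = {fit_kcat:.4f} 1/min")
```

With noisy experimental rates, a direct nonlinear fit is generally preferred, because the linearization weights the data points unevenly.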
E. coli-based Deaminase Activity Assays-The intrinsic DNA cytidine deaminase activity of full-length A3G and its variants was measured by expressing these proteins in ung-deficient E. coli BW310 and by quantifying the frequency of RifR-conferring rpoB mutations (33,49), as described in the previous report (35). Many single-base mutations in rpoB lead to active site amino acid replacements that confer RifR. In each case, five single colonies were grown at 37°C in LB medium containing 100 μg/ml ampicillin and induced overnight with 1 mM isopropyl 1-thio-β-D-galactopyranoside at 18°C. Then appropriate volumes of cells were spread on plates containing 100 μg/ml rifampicin to select for RifR mutants and on plates containing 100 μg/ml ampicillin to determine the number of viable cells. Colonies were allowed to form overnight at 37°C and then counted manually.
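The readout of this assay is a mutation frequency: RifR colonies divided by viable cells, each scaled by plated volume and dilution. A minimal sketch of that arithmetic follows. The colony counts, plated volumes, and dilution factors are hypothetical placeholders, and summarizing the five cultures by their median is an assumption, not a detail stated in the text.

```python
from statistics import median


def cfu_per_ml(colonies, plated_volume_ml, dilution_factor):
    """Colony-forming units per ml of the undiluted culture."""
    return colonies * dilution_factor / plated_volume_ml


def rif_mutation_frequency(rif_colonies, rif_volume_ml, rif_dilution,
                           viable_colonies, viable_volume_ml, viable_dilution):
    """RifR mutants per viable cell for one culture."""
    rif = cfu_per_ml(rif_colonies, rif_volume_ml, rif_dilution)
    viable = cfu_per_ml(viable_colonies, viable_volume_ml, viable_dilution)
    return rif / viable


# Hypothetical counts for five independent cultures:
# (colonies on rifampicin plates, colonies on ampicillin plates).
cultures = [(40, 200), (55, 180), (32, 210), (61, 190), (47, 205)]

# Hypothetical plating: 0.1 ml undiluted on rifampicin,
# 0.1 ml of a 10^6-fold dilution on ampicillin.
frequencies = [
    rif_mutation_frequency(rif, 0.1, 1, viable, 0.1, 1e6)
    for rif, viable in cultures
]
median_frequency = median(frequencies)
print(f"median RifR frequency = {median_frequency:.2e} per viable cell")
```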
E. coli Immunoblots-The full-length A3G and its variant constructs were expressed in E. coli strain BW310. Proteins were generated by overnight expression at 18°C in LB medium containing 100 μg/ml ampicillin. To induce expression, cells were diluted 1:10 in LB medium containing 100 μg/ml ampicillin and 1 mM isopropyl 1-thio-β-D-galactopyranoside and grown for 1 h at 37°C. Cells were pelleted and resuspended in SDS gel loading buffer (50 mM Tris-Cl, pH 6.8, 100 mM β-mercaptoethanol, 2% SDS (v/v) glycerol). Lysates were heated at 95°C for 5 min and fractionated by SDS-PAGE. Proteins were transferred to a polyvinylidene difluoride (PVDF) membrane and probed with a rabbit anti-A3G polyclonal serum. The primary antibody was detected by incubation with donkey peroxidase-conjugated anti-rabbit IgG (Shanghai Sangon Biotech Co. Ltd.), followed by chemiluminescent imaging.

RESULTS
Overall Fold of Novel Head-to-tail A3G-CD2 Dimer Crystal Structure-In the presence of ssDNA during crystallization, two monomers of the human A3G-CD2 domain (residues 193-384) occupy one asymmetric unit. The final crystal structure was determined at 1.8 Å resolution and solved by molecular replacement, using the structure of the A3G-CD2 2K3A variant (residues 191-384; PDB code 3IR2) (37) as a model. The final refinement statistics are summarized in Table 1. Different from the previously reported tail-to-tail or head-to-head conformation (PDB code 3IR2) (37), the crystal structure demonstrates a new head-to-tail dimer conformation (to simplify discussion, the two monomers are referred to here as monomer H (head) and monomer T (tail), respectively), in which the N terminus of monomer H interacts with the C terminus of monomer T, as shown in Fig. 1. The two monomers are nearly identical to each other, with a root mean square deviation value of 0.19 Å for backbone Cα atoms in the secondary structural regions. They contain a core sandwich-like α-β-α fold, consistent with the reported cytidine deaminases (33-36, 39-41, 43), in which the monomer structure has five β strands encircled by six α helices on both sides (Fig. 1). The catalytic zinc ion in each monomer was coordinated directly by the side chains of residues His-257, Cys-288, and Cys-291 and indirectly by the catalytic center Glu-259 via a water molecule. The secondary structural elements are numbered after the x-ray crystal structures of WT A3G-CD2 (residues 197-380, PDB codes 3E1U and 3IQS) (36) and of its 2K3A variant (PDB code 3IR2) (37). Different from structures 3E1U and 3IQS (refined from 3E1U) but similar to the reported structures of A3G-CD2 (PDB codes 3IR2, 2KEM, 2JYW, and 2KBO) (33-35, 37), the second β strand in the current dimer structure is discontinuous. Loop 3 (residues 246-253) and loop 7 (residues 316-322) are missing in monomer T, which is also distinct from the head-to-head or tail-to-tail dimer conformation (PDB code 3IR2). Thus, the overall fold of this structure is similar to that of the previously reported structures, but several key differences were observed.
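The root mean square deviation quoted for the two monomers is the square root of the mean squared distance between matched Cα atoms after superposition. The sketch below shows only that final step, assuming the two coordinate sets are already superimposed and matched atom for atom; the optimal superposition itself (for example by the Kabsch algorithm) is not shown, and the coordinates are hypothetical rather than taken from the deposited structure.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD in the input units between two pre-superimposed, equally ordered
    lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical C-alpha positions (in angstroms) for three matched residues.
monomer_h = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
monomer_t = [(0.1, 0.0, 0.0), (3.8, 0.2, 0.0), (7.6, 0.0, -0.2)]

value = rmsd(monomer_h, monomer_t)
print(f"RMSD = {value:.3f} A")
```

Restricting the atom lists to residues in secondary structure elements, as done in the text, excludes flexible loops and therefore gives a lower value than a comparison over all residues.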
Differences between the Crystal Structures-The reported crystal structures of A3G-CD2 (residues 197-380) with PDB codes 3E1U and 3IQS have a continuous β2 sheet (36), significantly differing from those (discontinuous β2/β2′ sheets) in three NMR structures (PDB codes 2JYW, 2KBO, and 2KEM) (33-35), the x-ray crystallographic tail-to-tail or head-to-head dimer structure (3IR2) (37), and the current dimer structure. The different regions include β1-β2, β2′-α2, and α2-β3, which are due to the ambiguity in electron density. The bulge between the β2-β2′ strands in the NMR structures and 3IR2 is evidently an intrinsic feature of the A3G-CD2 structure rather than an experimental artifact (37). Thus, for a valid comparison, monomer H in the current A3G-CD2 dimer was superimposed only on one monomer of the tail-to-tail A3G-CD2 2K3A variant dimer (Fig. 1), which produces root mean square deviation values of 1.64 Å for all of the backbone Cα atoms in the region of residues 193 to 384 and of 0.30 Å for all of the backbone Cα atoms in the secondary structure regions. This observation indicates that the monomer conformation of the current A3G-CD2 dimer is almost identical to that of 3IR2. The main differences are located in loop 1 (residues 206-215), loop 3 (residues 245-256), and loop 7 (residues 315-320), which are all involved in ssDNA binding and stabilize the active deamination center. The orientations of the side chains of the residues in sequence P²¹⁰WVR²¹³ in loop 1; of the residues His-248, His-249, Phe-252, and Glu-254 in loop 3; and of the residues in sequence Y³¹⁵DDQ³¹⁸ in loop 7 are clearly distinct from those in PDB entry 3IR2 (Fig. 2, A-C). Among these residues, Trp-211 (37), Arg-213 (33,34,36,37), Tyr-315 (36), and Asp-316 and -317 (36) have been confirmed to be crucial for the deaminase activities. Moreover, it has been suggested that the sequence Y³¹⁵DDQ³¹⁸ in loop 7 specifically recognizes the second cytosine in the target motif sequence of ssDNA (5′-CCC-3′), due to its polarity, whereas the sequence Y³⁰⁷YFW³¹⁰ in loop 7 of A3F specifically identifies G or T in the ssDNA sequence 5′-(G/T)C-3′ (36,43). Analysis of the surfaces of 3IR2 and the current structure indicates that the conformational changes in loops 1, 3, and 7 result in a bigger ssDNA-binding groove in the monomer of 3IR2 than in the current dimer structure (Fig. 2, D and E), thus probably enhancing DNA binding. Moreover, the positively charged side chain of Arg-213 points to the ssDNA binding groove in the 3IR2 structure, strengthening DNA binding, but it points away from the ssDNA binding groove in the current structure. These observations may account for the fact that the A3G-CD2 2K3A variant has deaminase activities ~2.7-fold higher than those of its WT protein (33).
Intermolecular Interfaces in Head-to-tail A3G-CD2 Dimer-The most obvious difference between the A3G-CD2 head-to-tail dimer and the tail-to-tail dimer (or head-to-head dimer) (PDB code 3IR2) is the intermolecular interface. In monomer H of the current head-to-tail dimer, helix α2 and loop 3 form an L-shaped hook, which stabilizes the dimer conformation by interacting with helix α6′ and loop 1′ in monomer T. Thus, the current head-to-tail dimer contains three main interfaces, which are between loop 3 and loop 1′, between helix α2 and helix α6′, and between loop 3 and helix α6′, with surface areas of 223, 374, and 317 Å², respectively (Fig. 3A). The total surface area is 1032 Å², much larger than those in the tail-to-tail dimer conformation (901 Å²) and in the head-to-head dimer conformation (604 Å²). This indicates that the head-to-tail dimer conformation might be more stable than those of the tail-to-tail dimer and head-to-head dimer.
In the head-to-tail dimer conformation, the helix α2 in monomer H is almost perpendicular to the helix α6′ in monomer T, making up the largest interface. Residues Gly-373′, Ala-377′, and Gln-380′ in the C terminus of helix α6′ in monomer T form hydrogen bond networks by interacting with residues Leu-263 and Asp-264 in helix α2 in monomer H through several water molecules (Fig. 3, B and C). The side chain of Arg-376′ in monomer T has hydrophobic interactions with the side chain of Phe-268 in monomer H (Fig. 3D). To estimate the functional significance of this interface, four variants (D264A, F268A, R376A, and Q380A) were designed to disrupt the observed interactions. An NMR enzymatic assay was performed to measure the catalytic efficiency of the deamination at base C₆ in the reported sequence 5′-ATTC₄C₅C₆AATT-3′ (34). These variants show decreased DNA deaminase activity in vitro ([...]). In the structure 3E1U, an intramolecular salt bridge was observed between the side chains of residues Asp-264 and Arg-256 in loop 3 (33) (Fig. 3E), which evidently stabilizes the conformation of loop 3. Therefore, the mutation from Asp-264 to Ala-264 not only disrupts the interactions between loop 3 in monomer H and the helix α6′ in monomer T but also makes loop 3 more flexible, destabilizing the ssDNA binding center and thus further resulting in the reduced catalytic efficiency of the D264A variant. The mutation from Arg-376 to Ala-376 keeps its hydrophobic interactions with the side chain of Phe-268 (Fig. 3D) [...] /Km(R376A) = 8.10 × 10⁻⁴ min⁻¹ μM⁻¹, reduced 6.7-fold), consistent with the previous observation that Arg-376 was involved in the ssDNA binding activity and catalytic reaction (36,37).
FIGURE 1. The overall fold of a novel head-to-tail A3G-CD2 dimer. A, ribbon representation of the two monomers (monomer H (gray) and monomer T (green)). B, the secondary structure elements of A3G-CD2 in this dimer, represented in ribbon mode. C and D, ribbon representation of the tail-to-tail and the head-to-head A3G-CD2 dimer conformations, respectively. E, the superimposition of one monomer (gray) in the head-to-tail dimer with one monomer (orange) in the tail-to-tail or head-to-head dimer. The spheres represent zinc ions in the three-dimensional structures.
In the second interface, loop 3 in monomer H acts as a hook to catch the helix α6′ in monomer T. The side chains of the residues Gln-245 and Arg-256 and the backbone oxygen atoms of the residues His-250 and Gly-251 in monomer H form a hydrogen bond network through water molecules with the charged side chains of Asp-370′ and Arg-374′ in monomer T (Fig. 3, F and G). To evaluate the contributions of these residues to the catalytic deamination, we replaced Gln-245, Arg-256, Asp-370′, and Arg-374′ with alanine. Compared with the WT protein, the Q245A and R256A variants show nearly abolished catalytic efficiency ([...], reduced 75-fold). Evidently, these mutations destroy the hydrogen bond interactions observed above and thus account for the changes in the catalytic efficiency. These results are different from those observed in structure 3IR2, where a side chain of Gln-245 was coordinated to the zinc ion in the dimer interface (37).
The D370A variant has smaller Kcat and Km values (Km(D370A) = 11.89 ± 2.49 μM, Kcat(D370A) = 0.010 ± 0.00038 min⁻¹) than the WT protein, with only about 16% of the catalytic efficiency of the WT protein (Kcat(D370A)/Km(D370A) = 8.58 × 10⁻⁴ min⁻¹ μM⁻¹), suggesting that the D370A variant might have a stronger binding affinity to ssDNA than the WT protein (the DNA binding groove in the structure of the D370A variant becomes wider, as shown in Fig. 2F). We tried to crystallize the complex of D370A with ssDNA. Different from the WT protein, in the presence of ssDNA during crystallization, one asymmetric unit contains a single D370A molecule. No electron density was observed for ssDNA either. Its final crystal structure was determined at 1.7 Å resolution. One molecule of D370A forms a dimer with another molecule in an adjacent asymmetric unit in a head-to-tail way. The structure of the D370A variant reveals that the mutation from Asp-370 to Ala-370 not only directly results in the breakage of the hydrogen bond between Asp-370′ (in loop 7) and Gln-245 (in loop 3) (Fig. 3F) but also indirectly impairs the hydrophobic interaction between His-248 (in loop 3) and Trp-211′ (in loop 1) (Fig. 3H), and the intramolecular salt bridge interaction between Arg-374 (in helix α6) and Asp-316 (in loop 7) (the distances between the side chains of Arg-374 and Asp-316 become bigger in D370A than those in the WT protein, as shown in Fig. 3, K and L). Thus, the decrease in the catalytic efficiency of the A3G-CD2 D370A variant further reveals that the stability of the active center is important to the catalytic deamination reaction.
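The catalytic efficiencies compared in this section are ratios Kcat/Km. A short sketch of the arithmetic follows, using only values printed in the text: the D370A Km and Kcat, the reported D370A and R376A efficiencies, and the 6.7-fold reduction for R376A. The WT efficiency is not printed in this section, so the value below is back-calculated from the R376A figures and should be read as an estimate rather than a reported number.

```python
def catalytic_efficiency(kcat_per_min, km_um):
    """Catalytic efficiency Kcat/Km in 1/(min * uM)."""
    return kcat_per_min / km_um


# Values quoted in the text.
D370A_KM_UM = 11.89
D370A_KCAT_PER_MIN = 0.010
D370A_REPORTED_EFFICIENCY = 8.58e-4
R376A_REPORTED_EFFICIENCY = 8.10e-4
R376A_FOLD_REDUCTION = 6.7

# Ratio recomputed from the rounded Kcat and Km.
d370a_from_rounded = catalytic_efficiency(D370A_KCAT_PER_MIN, D370A_KM_UM)

# Estimate (not a reported value): WT efficiency implied by the R376A numbers.
wt_estimate = R376A_REPORTED_EFFICIENCY * R376A_FOLD_REDUCTION

# Fraction of WT efficiency retained by D370A under that estimate.
d370a_fraction_of_wt = D370A_REPORTED_EFFICIENCY / wt_estimate

print(f"D370A Kcat/Km from rounded values: {d370a_from_rounded:.2e}")
print(f"Estimated WT Kcat/Km: {wt_estimate:.2e}")
print(f"D370A retains about {100 * d370a_fraction_of_wt:.0f}% of WT efficiency")
```

The rounded Kcat and Km give a ratio near 8.4 × 10⁻⁴ min⁻¹ μM⁻¹, slightly below the reported 8.58 × 10⁻⁴, presumably because of rounding. The back-calculated WT efficiency puts D370A at roughly 16% of WT, in line with the figure stated in the text.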
To assess the importance of Arg-374 in the cytidine deamination, we replaced Arg-374 with Ala-374. Compared [...] (36,37). On one hand, we think that the mutation from Arg-374 to Ala-374 may directly destroy the intramolecular salt bridge between Arg-374 (in helix α6) and Asp-316 (in loop 7) (Fig. 3K) and the intermolecular salt bridge between Arg-374 and Glu-209′ (in loop 1) (which was observed in the tail-to-tail dimer conformation; Fig. 3M), the loss of both interactions making loops 1 and 7 more flexible. On the other hand, the replacement of Arg-374 by Ala-374 disrupts additional hydrogen bonds between the Arg-374 side chain and backbone oxygen atoms of residues Gly-251 and His-250 in loop 3 (Fig. 3G), which results in loop 3 being more flexible. Thus, like the mutation from Asp-370 to Ala-370, the mutation from Arg-374 to Ala-374 destabilizes the active deamination center of A3G-CD2 by changing the conformations of loops 1, 3, and 7.
FEBRUARY 13, 2015 • VOLUME 290 • NUMBER 7
JOURNAL OF BIOLOGICAL CHEMISTRY 4015
The interaction between loop 3 in monomer H and loop 1′ in monomer T composes the third interface, in which the side chains of the residues His-250 and His-248 in loop 3 have weak hydrophobic interactions with the residues Pro-210′ and Trp-211′ in loop 1′ as well as hydrogen bond interactions between the His-250 side chain nitrogen atom and the Pro-210 backbone oxygen atom through one water molecule (Fig. 3H).
Before measuring the catalytic efficiency of each variant of A3G-CD2, we tested their aggregation states by running an analytical Superdex™ G75 column (10/300) (Fig. 4). The results suggest that none of the mutations of the residues mentioned above affects the aggregation state of A3G-CD2. Therefore, we can exclude the possibility that the differences in the catalytic efficiency of A3G-CD2 variants resulted from changes in the A3G-CD2 aggregation state.
We further investigated the contributions of the residues in the interface in the head-to-tail A3G-CD2 dimer to the deaminase activities of full-length A3G through an E. coli-based deaminase activity assay (Fig. 5). The expression of the full-length A3G and its variants was confirmed by running E. coli immunoblots (Fig. 5). The variants, including P210A, P210G, Q245A, R256A, D264A, D370A, R374A, R376A, and Q380A, demonstrate weaker deaminase activities. In addition, the residue Phe-252 (in loop 3) is not located in the interface, but it has hydrophobic interactions with the side chain of residue Arg-256 (in loop 3) within one monomer (Fig. 3E) of the current head-to-tail dimer conformation (this interaction was also observed in the x-ray structure 3E1U) (33), which may stabilize the active center by fixing the conformation of loop 3. Thus, the mutation from Phe-252 to Ala-252 impairs this hydrophobic interaction and decreases the activity of HIV-1 ssDNA deamination by A3G-CD2. The H248G, H250A, and H250G variants have higher deaminase activities than the WT protein, consistent with the in vitro enzymatic results from the A3G-CD2 H248G, H250A, and H250G variants.
In summary, the interfaces in the current head-to-tail dimer conformation present new insights into the residues involved in HIV-1 ssDNA binding and the catalytic deamination. Mutagenesis studies on those residues further confirm that the stability of the active center is extremely important to catalytic C → U deamination. Nine new residues (Pro-210, Gln-245, His-248, His-250, Phe-252, Asp-264, Phe-268, Asp-370, and Gln-380) in the A3G-CD2 domain necessary for HIV-1 ssDNA binding and the catalytic deamination reaction were identified from this new dimer conformation.

DISCUSSION
The Head-to-tail Dimer Conformer of A3G-CD2 Reveals the Mode of Full-length A3G Binding to ssDNA-To understand fully how the A3G CD1 and CD2 domains work together to facilitate cytidine deamination, a holoenzyme structure is a prerequisite. Although the three-dimensional structure of the A3G-CD1 domain is not available, different models of full-length A3G were previously constructed. Two of them were predicted based on the APOBEC2 structure because APOBEC2 shares amino acid sequence identity with the A3G-CD1 (24%) and A3G-CD2 (31%) domains. One model, where A3G-CD1 and A3G-CD2 are tethered through the interactions between their β-strands, was successfully used to identify three residues (Arg-122, Trp-127, and Asp-128) important for packaging A3G into virions (50). The other model, however, produced from the extended NMR structure of the A3G-CD2 domain (PDB code 2KEM) (35), suggested that the β-strands in A3G-CD1 and A3G-CD2 were distant. Instead, an N-terminal pseudocatalytic domain, including the interdomain linker and some of helix α6 of A3G-CD1, packs A3G-CD1 and A3G-CD2 together. Unfortunately, in this model, the catalytic deaminase domains (i.e. A3G-CD2 and the N-terminal pseudocatalytic domain) point away from each other and from the nucleic acid binding site, creating a topological dilemma. The third model, comprising the tail-to-tail dimerization model of the full-length A3G-DR (A3G was treated with RNase) and the tetramer model of the full-length A3G-D (A3G was not treated with RNase), was generated by small angle x-ray scattering and shape reconstruction methods. This model implied that the full-length A3G in either low molecular mass or high molecular mass form is symmetrically associated (51). This model was further refined into the fourth model after the A3G-CD2 three-dimensional structure was reported, through an in-cell quenched fluorescence resonance energy transfer (FRET) assay, small angle x-ray scattering, and other techniques (52). In this model, A3G was self-associated via its CD2 domain, forming a dimer structure. This model seems more reasonable than any other reported model. Its low resolution, however, limits its use in the analysis of the biological functions of the full-length A3G.
It is well known that A3G deaminates ssDNA processively with a strong 3′ → 5′ bias (31,32,34). When there is more than one target motif in the sequence of ssDNA, A3G deaminates the 5′-CCC target motif 5-fold more rapidly than the 3′-CCC target motif (53). Unlike either WT full-length A3G or its monomeric F126A/W127A mutant, A3G-CD2 alone, which does not oligomerize, catalyzes ssDNA-dependent C → U deamination (33,36,49), displaying no deamination polarity and no dead zone. The non-catalytic A3G-CD1 plays an indispensable role in stabilizing ssDNA binding, enhancing the catalysis, and establishing 3′ → 5′ deamination polarity and processivity. To explain the polarity and processivity of the deamination by A3G, Chelico et al. (53) suggested the fifth structural model of the full-length A3G, in which A3G effectively deaminates the ssDNA 5′-proximal CCC motif only upon binding to ssDNA in an active orientation. However, it is still difficult to obtain information about how A3G-CD1 is involved in ssDNA binding from this model.

To address this, we predicted an A3G-CD1 model through the SWISS model server (54-57) on the basis of its high sequence identity (43.68%) to A3C (42) (Fig. 6A), which contains six α-helices and five β-sheets. The β2 sheet is continuous, different from that in the A3G-CD2 structure. This structural model has a root mean square deviation value of 3.27 Å for all backbone atoms upon superimposing with the monomer T in the current head-to-tail A3G-CD2 dimer, indicating that its overall fold is almost identical with that of the A3G-CD2 domain (Fig. 6B). Thus, we raise a question of whether the current head-to-tail A3G-CD2 dimer conformation is a potential three-dimensional structural model of the full-length A3G.
Third, in what orientation does ssDNA bind to full-length A3G? In our current A3G-CD2 head-to-tail dimer conformation, close to the catalytic center residue Glu-259 (ligated to the Zn²⁺ ion), there is a deep pocket for accommodating the first base Cₘ in the 5′-proximal -Cₘ₊₂Cₘ₊₁Cₘ- target motif in ssDNA (the ssDNA sequence is assumed to be 5′-Cₘ₊₂Cₘ₊₁Cₘ-Cₙ₊₂Cₙ₊₁Cₙ-3′). By overlapping the structure of monomer H with that of the mouse cytidine deaminase in complex with cytidine in the active site (PDB code 2FR6), we found that the hot spot cytidine was indeed docked into this site near the residues -Y³¹⁵DDQ³¹⁸- in the conserved sequence -R³¹³IYDDQ³¹⁸- in loop 7 (Fig. 6, C and D). This sequence was previously predicted to specifically identify the second base, Cₘ₊₁, in the 5′-proximal Cₘ₊₂Cₘ₊₁Cₘ target motif in ssDNA, which is different from the residues -Y³⁰⁷Y³⁰⁸FWD³¹¹- in loop 7 of A3F-CD2 (specifically binding to the second base Tₘ₊₁ in 5′-Tₘ₊₂Tₘ₊₁CₘA-3′) (43,44) and from the residues -Y¹³⁰DYD¹³³- in loop 7 of A3A-CD1 (specifically binding to the second base, Tₘ₊₁ or Cₘ₊₁, in target motif 5′-Tₘ₊₁Cₘ-3′ or 5′-Cₘ₊₁Cₘ-3′) (40,41). Thus, upon binding to full-length A3G, the 5′ terminus of ssDNA is located on the A3G-CD2 side, whereas its 3′ terminus interacts with the A3G-CD1 side. In other words, ssDNA binds to full-length A3G in an active orientation, as shown in Fig. 6, C and D, which accords with the previous prediction (53). The ssDNA interaction cavity displays large hydrophobic regions and negatively charged regions in the electrostatic potential surface of A3G-CD2 plus the A3G-CD1 structural model (Fig. 6E). The main hydrophobic regions are distributed near the 3′-end of the ssDNA, and the negatively charged regions are located mainly close to the catalytic center.
In conclusion, based on the structural analysis, NMR-based enzymatic assay, and E. coli-based deaminase activity assay, we suggest that the head-to-tail dimer conformation may represent a structural model of full-length A3G. This model indicates that A3G may bind to HIV-1 ssDNA in an active orientation. According to this structural model, several new residues in A3G-CD1 (including Pro-25, Glu-61, His-72, Asp-130, and Leu-184) were found to be necessary for ssDNA binding and catalytic deamination.