Nanostructures of APOBEC3G Support a Hierarchical Assembly Model of High Molecular Mass Ribonucleoprotein Particles from Dimeric Subunits

Human APOBEC3G (hA3G) is a cytidine deaminase that restricts human immunodeficiency virus (HIV)-1 infection in a vif (the virion infectivity factor from HIV)-dependent manner. hA3G from HIV-permissive activated CD4+ T-cells exists as an inactive, high molecular mass (HMM) complex that can be transformed in vitro into an active, low molecular mass (LMM) variant comparable with that of HIV-non-permissive CD4+ T-cells. Here we present low resolution structures of hA3G in HMM and LMM forms determined by small angle x-ray scattering and advanced shape reconstruction methods. The results show that LMM particles have an extended shape, dissimilar to known cytidine deaminases, featuring novel tail-to-tail dimerization. Shape analysis of LMM and HMM structures revealed how symmetric association of dimers could lead to minimal HMM variants. These observations imply that the disruption of cellular HMM particles may require regulation of protein-RNA, as well as protein-protein interactions, which has implications for therapeutic development.


hA3G is an anti-retroviral host defense factor that restricts HIV infection by vif (the virion infectivity factor from HIV)-deficient viral strains (1). hA3G is packaged into HIV-1 virions (2,3) and causes extensive 2′-deoxycytidine to 2′-deoxyuridine mutations of minus polarity viral DNA during reverse transcription (4). Such "DNA editing" results in extensive 2′-deoxyguanosine to 2′-deoxyadenosine changes in the viral cDNA that contribute to reduced HIV infectivity (5-8). However, a deaminase-independent anti-viral mechanism exists as well (9) that may entail RNA binding (10). Although hA3G does not edit RNA, it exhibits general RNA binding properties (11-13). The principal form of hA3G in HIV infection-permissive CD4+ cells of lymphoid tissues is an HMM ribonucleoprotein complex with little or no deaminase activity (14,15). In contrast, an enzymatically active, LMM form of hA3G predominates in peripheral blood CD4+ cells and serves as a potent post-entry HIV restriction factor (14,15). Activation of such cells recruits the LMM enzyme into HMM complexes, rendering the cell permissive to infection (15). In vitro treatment of HMM hA3G with RNase or in vivo exposure to interferon produces the enzymatically active LMM form, suggesting that anti-viral activity involves a delicate interplay governed by RNA-protein interactions (14,16,17).
hA3G belongs to the family of APOBEC-1 related proteins characterized by a zinc-dependent deaminase (ZDD) fold featuring the consensus sequence (Cys/His)-X-Glu-X₂₅₋₃₀-Pro-Cys-XX-Cys, where "X" is any amino acid (18). Although homology models have been generated for some APOBEC-1 family members (19,20), and the hA3G secondary structure has been predicted (18,21), no empirical structural information exists for it or any other member of the APOBEC family. Modeling of the hA3G structure based upon known dimeric or tetrameric cytidine deaminases (CDAs) (20,22-24) is complicated by the fact that the protein arose from a novel gene duplication of the fundamental ZDD motif such that tandem active sites are present in each subunit (18). To provide insight into the fundamental physical properties of hA3G in relation to known cytidine deaminase structures, as well as how hA3G oligomerization contributes to retroviral restriction, we undertook a solution small angle x-ray scattering (SAXS) analysis of the recombinant enzyme in its HMM and LMM forms.

EXPERIMENTAL PROCEDURES
Preparation of hA3G-Full-length hA3G cDNA was amplified from oligo(dT)-primed H9 cell RNA and a four-histidine tag (His₄) was added to the C terminus by PCR. This construct was subcloned into pFastbac™ (Invitrogen, CA). Baculovirus production and infection of Sf9 cell cultures for expression were carried out by Immunodiagnostics, Inc.
Frozen cells (4 g) were lysed in 20 ml of 0.5× hA3G buffer (1× = 50 mM HEPPS, pH 8.8, 75 mM NaCl, 10 mM MgCl₂, 5% (v/v) glycerol, 0.2 mM β-mercaptoethanol, and EDTA-free complete protease inhibitor (Roche Applied Science)) by freezing in N₂(l) and thawing, followed by shearing via successive passes through 22- and 26-gauge needles. The lysis solution was brought to 1% (v/v) Triton X-100 and made 0.1 mM in CaCl₂. Nuclease digestion ensued with either 0.125 mg·ml⁻¹ RNase-free DNase I (Sigma) (hereafter this protein is referred to as hA3G-D) or 0.125 mg·ml⁻¹ DNase I and 0.25 mg·ml⁻¹ RNase A (Sigma) (hA3G-DR) at 37°C for 30 min. The sample was brought to 1 M urea final concentration, incubated at 24°C for 20 min, and centrifuged (10,000 × g for 10 min at 24°C). Cleared supernatants were adsorbed onto 2 ml of nickel-nitrilotriacetic acid (Ni-NTA)-agarose (Qiagen) and mixed for 2 h at 24°C. Contaminants were removed by centrifugation of the resin (500 × g for 5 min), which was washed consecutively over a 2-h period with 10 volumes of: (i) 1× hA3G buffer with 1 M urea; (ii) 1× hA3G buffer with 0.5 M urea; (iii) 5× hA3G buffer; (iv) 1× hA3G buffer containing 0.01 M imidazole; and (v) 1× hA3G buffer with 0.07 M imidazole. Remaining Ni-NTA-bound hA3G was placed in a 15-ml Econo column (Bio-Rad) and eluted with 1× hA3G buffer with 0.25 M imidazole. Elution was monitored at 280 nm. Pure fractions were identified and pooled based on SDS-PAGE gels stained with Coomassie Blue dye; estimated purity was >99%. Samples were centrifuged at 50,000 × g for 60 min after purification. DNA deaminase assays (supplemental Methods and supplemental Fig. S1) demonstrated nominal activity for hA3G-D, whereas hA3G-DR produced a specific activity of 30 pmol μg⁻¹ min⁻¹. These activity trends are consistent with those reported (16).
SAXS Experiments-Scattering experiments were performed at beamline G1 of CHESS (Ithaca, NY). Scattered x-rays were recorded on a custom 1024 × 1024 pixel (69.78 μm) CCD detector fabricated by the Gruner group (Cornell University, Ithaca, NY). Scattering was performed at 20°C at a sample-to-detector distance of 138.0 cm. The wavelength, λ, was 1.249 Å, which produced an accessible q-range from 0.012 to 0.215 Å⁻¹, where q = 4π sin θ/λ (2θ is the scattering angle). Samples of hA3G were prepared at various concentrations in 1× hA3G buffer containing 0.25 M imidazole. Protein concentrations were 0.9 mg·ml⁻¹ and 1.8 mg·ml⁻¹ for hA3G-D, and 0.55 mg·ml⁻¹ and 1.1 mg·ml⁻¹ for hA3G-DR; lower concentrations were examined as well to ensure there was no aggregation. Samples were centrifuged at 14,000 × g and immediately transferred to a homemade cuvette composed of a plastic micromachined disk (ALine Inc., Redondo Beach, CA) fitted with 25 μm mica walls. This cell had a capacity of 12 μl and was loaded through an inlet port with a 25 μl blunt-end syringe (Hamilton Corp., Reno, NV). The x-ray beam size was 0.5 × 0.5 mm², which was significantly smaller than the sample cell window. Exposure times ranged from 2 to 80 s to assess radiation damage; each exposure was recorded in triplicate. Two-dimensional scattering data were corrected for buffer scatter, CCD dark current, and detector non-uniformity. Ag-behenate powder (The Gem Dugout, State College, PA) was used to calibrate the beam center and sample-to-detector distances. Two-dimensional scattering data were integrated by Data Squeeze 2.07 (25), yielding a one-dimensional intensity profile as a function of scattering vector q.
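The relation q = 4π sin θ/λ can be illustrated with a short sketch. The Python below is only an illustration under stated assumptions and is not the reduction procedure used here (integration was done with Data Squeeze): the function names q_from_radius and radius_from_q are hypothetical, and a flat detector perpendicular to the beam is assumed. The wavelength and sample-to-detector distance are those given above.

```python
import math

WAVELENGTH_A = 1.249   # x-ray wavelength in Å (from the text)
DISTANCE_MM = 1380.0   # sample-to-detector distance, 138.0 cm (from the text)


def q_from_radius(radius_mm, distance_mm=DISTANCE_MM, wavelength_a=WAVELENGTH_A):
    """Return q = 4*pi*sin(theta)/lambda (Å^-1) for a point radius_mm from the beam center.

    Assumes a flat detector normal to the beam (an assumption of this sketch).
    """
    two_theta = math.atan2(radius_mm, distance_mm)  # full scattering angle, 2*theta
    return 4.0 * math.pi * math.sin(two_theta / 2.0) / wavelength_a


def radius_from_q(q, distance_mm=DISTANCE_MM, wavelength_a=WAVELENGTH_A):
    """Inverse of q_from_radius: detector radius (mm) at which a given q is recorded."""
    theta = math.asin(q * wavelength_a / (4.0 * math.pi))
    return distance_mm * math.tan(2.0 * theta)


if __name__ == "__main__":
    for q in (0.012, 0.215):  # limits of the accessible q-range quoted in the text
        print(f"q = {q:.3f} Å^-1 -> radius = {radius_from_q(q):.2f} mm")
```

Under these assumptions, the two functions are inverses of one another, so the quoted q-limits can be mapped to detector radii and back.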
Analysis of Reduced Scattering Data-The radius of gyration (R_G) was calculated using the indirect Fourier transform package GNOM (26). The result is a pair-distance distribution function, p(r), in real space that represents an alternative means to calculate R_G compared with traditional Guinier approximations that are produced from low angle q values in which q·R_G < 1.3 (27). In contrast, GNOM produces an R_G calculated from the full experimental scattering curve and generates a maximum particle dimension (D_max) as the distance where p(r) reaches zero, which is generally superior to the Guinier approximation (28). The GNOM method relies upon perceptual criteria (26) such that a solution for a compact, globular molecule obeys a smooth, monomodal Gaussian centered at R_G. Goodness-of-fit scores were 0.92 for hA3G-D (an "excellent" score) and 0.894 (a "good" score) for hA3G-DR. The molecular mass for each sample was obtained from the respective pair-distance distribution functions by extrapolating to I(q = 0) using GNOM (26).
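The real-space route to R_G described above can be sketched as follows. This is a minimal illustration, not GNOM itself: it assumes p(r) is already available on a grid and applies the standard relation R_G² = ∫r²p(r)dr / (2∫p(r)dr). The helper names (trapezoid, rg_from_pr, guinier_q_max, sphere_pr) are hypothetical, and the solid sphere is used only as a check case with a known answer, R_G = √(3/5)·R.

```python
import math


def trapezoid(y, x):
    """Trapezoidal integral of samples y over abscissae x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))


def rg_from_pr(r, p):
    """Radius of gyration from p(r): R_G^2 = int r^2 p(r) dr / (2 int p(r) dr)."""
    numerator = trapezoid([rv * rv * pv for rv, pv in zip(r, p)], r)
    denominator = 2.0 * trapezoid(p, r)
    return math.sqrt(numerator / denominator)


def guinier_q_max(rg, limit=1.3):
    """Largest q (Å^-1) that satisfies the Guinier condition q * R_G < limit."""
    return limit / rg


def sphere_pr(radius, n=2001):
    """Analytical p(r) of a solid sphere (unnormalized), sampled from 0 to D_max = 2R."""
    d_max = 2.0 * radius
    r = [d_max * i / (n - 1) for i in range(n)]
    p = [rv ** 2 * (1.0 - 0.75 * rv / radius + rv ** 3 / (16.0 * radius ** 3)) for rv in r]
    return r, p


if __name__ == "__main__":
    r, p = sphere_pr(50.0)
    print(f"sphere R = 50 Å: R_G = {rg_from_pr(r, p):.2f} Å")
    for name, rg in (("hA3G-D", 72.4), ("hA3G-DR", 45.8)):  # R_G values reported in this work
        print(f"{name}: Guinier region ends near q = {guinier_q_max(rg):.4f} Å^-1")
```

For an R_G of 72.4 Å, the condition q·R_G < 1.3 holds only below roughly 0.018 Å⁻¹, close to the lowest accessible q of 0.012 Å⁻¹, which illustrates why a calculation based on the full curve can be preferable for large particles.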
Ab Initio Structural Modeling-The low resolution molecular envelopes of hA3G-D and hA3G-DR were restored from their respective SAXS profiles using DAMMIN (29). In this method, simulated annealing is employed for global minimization, whereby random movements in a multiphase dummy atom model minimize the discrepancy between observed and calculated scattering curves. No symmetry constraints were applied to the hA3G-D restorations. Scattering curves with a q range between 0.021 and 0.17 Å⁻¹ and 0.016 to 0.18 Å⁻¹ were used for hA3G-D and hA3G-DR, respectively, corresponding to a resolution range between 300 and 35 Å (2π/q_max). A sphere was chosen as the initial starting model for each molecule, with D_max derived from the corresponding p(r). For hA3G-D, a dummy atom packing radius of 8.6 Å was assigned by the program; this radius was 3.75 Å for hA3G-DR. All calculations were run in "slow" annealing mode. DAMMIN calculations were performed on a 64 node dual processor cluster at Mac-CHESS (Ithaca, NY). Each restoration required ~20 h of CPU time on a 2.0-GHz 32-bit AMD processor. Ten independent dummy atom (DA) models were calculated for hA3G-D and hA3G-DR. The 10 models of each class were subjected to automated envelope averaging using DAMAVER (30). Here, each model was compared in a pairwise manner to other models of its class, resulting in a series of normalized spatial discrepancy (NSD) values. The model with the lowest NSD was chosen as a reference onto which all other models were fit using SUPCOMB (31). Neither ensemble included outliers based on the NSD criterion. As such, each group of 10 models was included in the calculation of the average envelope. Each of the 10 individual envelopes of a given class (hA3G-D or hA3G-DR) was mapped onto a densely packed grid of atoms with each position marked by its own occupancy value. Positions with significant, non-zero occupancies were chosen to produce a final model whose volume was equivalent to the average excluded volume derived from each independent model.
It has been noted that final averaged structures from small angle scattering should not be considered a single unique macromolecular conformation in solution (32,33).
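The conversion between a q value and nominal real-space resolution used above, d = 2π/q, can be checked with a few lines of Python. This is a worked-arithmetic sketch only; the function name resolution_from_q is hypothetical. The quoted limits of about 300 and 35 Å correspond to q = 0.021 and q = 0.18 Å⁻¹, respectively.

```python
import math


def resolution_from_q(q):
    """Nominal real-space resolution d = 2*pi/q (Å) for a scattering vector q (Å^-1)."""
    if q <= 0:
        raise ValueError("q must be positive")
    return 2.0 * math.pi / q


if __name__ == "__main__":
    for q in (0.021, 0.18):  # q values from the text that bracket the stated resolution range
        print(f"q = {q:.3f} Å^-1 -> d = {resolution_from_q(q):.1f} Å")
```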
Shape Analysis-To determine whether multiple hA3G-DR envelopes could fit inside the hA3G-D particle, the hA3G-D envelope was moved to the origin and its principal axis of inertia oriented along the z-direction using ALPRAXIN. The hA3G-DR dimer was then subjected to a six-dimensional search of the oriented envelope using SUPMON (31). Other volumetric calculations were performed with CRYSOL (34).
Relating Dimeric hA3G-DR to Cytidine Deaminase Crystal Structures-A single CDA domain of yeast CDD1 (Protein Data Bank entry 1R5T) was subjected to a six-dimensional search against the hA3G-DR envelope using COLORES in the SITUS suite (35,36). Several similar solutions were obtained that differed only by the rotational placement of the CDA monomer into the hA3G-DR envelope. With the first CDA subunit fixed, a second search was conducted to fit the remaining hA3G-DR envelope.

Interpretation of the SAXS Data and Distance Distribution Functions-The SAXS data reveal important physical properties of hA3G that define its global morphology in solution on a nanometer scale. The experimental scattering profiles of pure recombinant hA3G-D (no RNase treatment) and hA3G-DR (RNase treated) are depicted in Fig. 1, A and B. Respective distance distribution functions (Fig. 1, C and D) were calculated by GNOM (26). Both are skewed relative to an ideal bell-shaped curve, which is characteristic of elongated particles (37,38). The p(r) for hA3G-D indicates an R_G of 72.4 ± 0.9 Å and a maximum molecular dimension (D_max) of 210 Å. The forward scattering I(0) was also calculated by GNOM and corresponds to a molecular mass of 292 ± 8 kDa. RNase-treated hA3G-DR exhibits a smaller R_G of 45.8 ± 0.2 Å with a D_max of 140 Å; its I(0) corresponds to a molecular mass of 100.6 ± 4.5 kDa, consistent with a dimer of hA3G subunits. These values agree with those obtained by dynamic light scattering and/or gel filtration chromatography (supplemental data and supplemental Fig. S2).
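The inference that hA3G-DR is a dimer amounts to dividing the SAXS-derived mass by the mass of one subunit. The sketch below illustrates that arithmetic. Note the assumption: the monomer mass of about 46.4 kDa (the 384-residue polypeptide, ignoring the His₄ tag) is not stated in the text above and is supplied here only for illustration; the function name subunit_count is hypothetical.

```python
ASSUMED_MONOMER_KDA = 46.4  # assumption: approximate mass of the 384-residue hA3G polypeptide


def subunit_count(particle_kda, monomer_kda=ASSUMED_MONOMER_KDA):
    """Ratio of particle mass to monomer mass (protein-only particles)."""
    if monomer_kda <= 0:
        raise ValueError("monomer mass must be positive")
    return particle_kda / monomer_kda


if __name__ == "__main__":
    mass, error = 100.6, 4.5  # hA3G-DR mass and uncertainty from I(0), as reported in the text
    low, high = subunit_count(mass - error), subunit_count(mass + error)
    print(f"hA3G-DR: {subunit_count(mass):.2f} subunits (range {low:.2f}-{high:.2f})")
```

The same division would not give a protein copy number for hA3G-D, because the mass of that ribonucleoprotein particle includes bound RNA.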
Quality of ab Initio Models-Bead models for hA3G-D and hA3G-DR were reconstructed from the experimental SAXS curves in DAMMIN (29). The agreement between an individual ab initio model and the experimental data is indicated by the fit of the model scattering curve with actual data (Fig. 1, A and B). Ten ab initio models each were calculated for hA3G-D and hA3G-DR. The final models exhibited χ values of ~1.2 for hA3G-D and ~2.8 for hA3G-DR. The observation that the hA3G-DR molecular mass was consistent with a dimer prompted the use of a P2 symmetry constraint in model calculations; no significant difference in χ was observed using P1 symmetry. The average shape of each molecule was calculated by superposition of all 10 independent models. The average NSD value for hA3G-D models was 0.74 and that for P2 symmetric hA3G-DR was 1.14 (a value of 1.05 was obtained when no symmetry restraint was applied). An NSD value close to unity indicates good agreement between models, whereas ideally superimposed objects tend toward zero (30,31).
Descriptions of Average hA3G Models-The hA3G-D shape is an elongated cylinder (Fig. 2A). Three principal domains are apparent along the major axis of inertia with each being separated by a narrow cleft. The central domain possesses a depression in its broad face producing a toroid. The RNase sensitivity of this particle and its prominent CD absorption at 267 nm (supplemental Fig. S3) demonstrate that this structure represents a ribonucleoprotein complex. In contrast, the hA3G-DR structure is significantly smaller (Fig. 2B), consisting of an elongated multi-lobed organization comparable with "beads-on-a-string." The molecule possesses dyad symmetry with only a small buried surface area in the subunit interface, which is different from known CDA structures in which the dimer interface is extensive (39).
FIGURE 1. Small angle x-ray scattering curves and distance distribution functions for hA3G. A, experimental hA3G-D SAXS curve (circles) and a scattering profile calculated from a representative ab initio model (line). The sample was 1.8 mg ml⁻¹ in 1× hA3G buffer plus 0.25 M imidazole. B, experimental hA3G-DR SAXS curve (circles) and a scattering profile calculated from a representative ab initio model (line); the sample was 1.1 mg ml⁻¹. C, distance distribution function for hA3G-D calculated from I(q) data in A. The data were fit to a smooth curve and correspond to a maximum particle dimension (D_max) of 210 Å. The peak maximum of the curve corresponds to an R_G of 72.4 Å. D, distance distribution function for hA3G-DR. D_max = 140 Å with an R_G = 45.8 Å.
hA3G-D Is a Minimal HMM Particle That Accommodates Two LMM hA3G-DR Dimers-It is likely that the highly purified hA3G-D of this study represents a minimal HMM particle since previous reports described HMM ribonucleoprotein complexes >669 kDa; similarly hA3G-DR of ~100 kDa is consistent with LMM variants isolated by gel filtration (14,16). To analyze the size and shape relationship between hA3G-D and hA3G-DR, the dimeric envelope of the latter was fitted inside that of the HMM particle. The results revealed that two independent hA3G-DR dimers (4 subunits) fit about a dyad-axis inside hA3G-D with no spatial overlap (Fig. 3A), giving an NSD of 1.1. A second mode of translational packing was also identified (supplemental Fig. S4). However, the rotational symmetry depicted in Fig. 3A is favored because it accounts for the torus in the central domain of hA3G-D (Figs. 2A and 3A).
The Global Fold of the hA3G-DR Dimer Is a Novel Structure in Comparison with Known Cytidine Deaminases-hA3G is a ZDD enzyme based on its catalytic activity and amino acid sequence alignment with known CDAs (18). However, its secondary structure content and fold classification have not been analyzed experimentally. Using CD spectroscopy, we demonstrated that (i) hA3G-D and hA3G-DR belong to the α/β fold class, consistent with the CDA family (20), and (ii) the secondary structure content of hA3G-D does not change significantly upon RNase treatment (supplemental Table S1 and Fig. S3). These structural and functional similarities prompted a comparison of the LMM hA3G-DR dimer to the fold of a representative CDA crystal structure, i.e. yeast CDD1 (20). The CDD1 tetramer cannot superpose with either monomeric or dimeric hA3G-DR (Fig. 3B). CDD1, like other CDAs (such as the dimeric enzyme from Escherichia coli), is much more compact than the elongated hA3G-DR structure. These observations support a novel tertiary and quaternary organization for hA3G with implications for other APOBEC3 family members such as 3B and 3F (reviewed in Ref. 18).
Docking of a Minimal CDA Domain into the hA3G-DR Envelope Supports Tail-to-Tail Dimerization-The presence of deaminase activity, α/β secondary structure, and two ZDD signature motifs per polypeptide suggested that the hA3G-DR envelope should accommodate at least two minimal CDA structures per subunit. An automated rigid body search of the hA3G-DR envelope was conducted using a single CDD1 subunit (Fig. 3B, oval inset). A CDD1 monomer was chosen because it exhibits the minimal deaminase fold (~132 amino acids) and is structurally homologous to numerous other deaminases with the ZDD signature sequence (20). The results revealed that two CDA monomers could be accommodated per hA3G-DR subunit with an average correlation coefficient of 0.76 per subunit. The top solutions differed only by rotational placement in the hA3G-DR envelope. For practical considerations, solutions were chosen (Fig. 3B) to orient the C terminus of one CDA domain in proximity to the N terminus of another. As such, the spatial relationship of the domains follows a "large-small-large-small" pattern with envelope volumes of ~18,700 Å³, 7480 Å³, 15,180 Å³, and 6600 Å³. This pattern correlates with the domain organization of the hA3G amino acid sequence, i.e. an N-terminal ZDD motif, a smaller non-catalytic domain, a C-terminal ZDD motif, and a short non-catalytic C-terminal domain (18). The volume of a single CDD1 CDA domain is 17,690 Å³, which agrees well with the larger volumes of the LMM hA3G subunit. Although no high resolution structure exists for the smaller ~55-amino acid non-catalytic domains, these segments occupy volumes of ~7200 Å³ based on amino acid van der Waals radii (40), which closely agrees with the envelope volumes observed here. This result supports a tail-to-tail dimerization model for hA3G (Fig. 3C) rather than a head-to-head (Fig. 3D) or head-to-tail configuration (16).
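The volume comparison behind the "large-small-large-small" pattern can be made explicit with a short calculation. The Python below is an illustrative sketch that uses only the volumes quoted above; the function name volume_ratios and the constant names are hypothetical, and the pairing of each lobe with a reference volume follows the domain order given in the text.

```python
# Envelope volumes of the four lobes of one hA3G-DR subunit, in Å^3 (from the text).
ENVELOPE_VOLUMES_A3 = [18700.0, 7480.0, 15180.0, 6600.0]
CDA_DOMAIN_VOLUME_A3 = 17690.0    # single CDD1 CDA domain (from the text)
NONCATALYTIC_VOLUME_A3 = 7200.0   # ~55-residue non-catalytic segment (from the text)


def volume_ratios(envelope, large_ref, small_ref):
    """Ratio of each envelope lobe to its reference volume, assuming large-small-large-small order."""
    references = [large_ref, small_ref, large_ref, small_ref]
    return [volume / ref for volume, ref in zip(envelope, references)]


if __name__ == "__main__":
    ratios = volume_ratios(ENVELOPE_VOLUMES_A3, CDA_DOMAIN_VOLUME_A3, NONCATALYTIC_VOLUME_A3)
    for index, ratio in enumerate(ratios, start=1):
        print(f"lobe {index}: observed/expected = {ratio:.2f}")
```

A ratio near 1 for every lobe is what the assignment of alternating catalytic and non-catalytic domains would predict; the ratio alone does not distinguish among rotational placements of the docked domains.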
We cannot dismiss the possibility that DNA or RNA binding induces a conformational change that juxtaposes the N- and C-terminal CDA domains as in CDD1 (Fig. 3B) or other trans-acting CDAs (20). However, the extended tail-to-tail topology explains why each hA3G active site functioned as a monomer, devoid of dominant negative effects characteristic of trans subunit complementation (41). Tail-to-tail organization would also confer unique bidentate substrate affinity and deamination properties. Each solvent exposed N-terminal domain of a subunit would exhibit its established nucleic acid binding properties, contributing to the affinity of substrates deaminated by a catalytically active C-terminal ZDD (10). In this manner, transient binding and release of substrate by each half of a dimer would confer processivity, as well as the ability to "jump" large distances past double-stranded substrate sequences (16). Finally, the tail-to-tail model posits that hA3G possesses more than one mode of intersubunit interaction: (i) those promoted by protein, leading to self (or hetero) association (11) and (ii) those promoted by RNA. The solvent accessibility and RNA avidity of the N-terminal ZDD combined with the ability of the C terminus to form intermolecular subunit interactions imply that assembly of higher order ribonucleoprotein complexes is hierarchical. The close packing of hA3G-DR dimers within a minimal HMM particle (i.e. hA3G-D) could attenuate substrate affinity and deamination through sequestration of N- and C-terminal ZDDs. Such a situation might arise if multiple hA3G-D particles were to coalesce, possibly through an RNA bridge. By analogy, sequestration of APOBEC-1 within inactive 60 S editosomes was established as a mechanism to regulate mRNA editing, which requires reorganization into active, 27 S complexes (42).
Ultimately, high resolution structural information will be required to discern explicit protein- and RNA-mediated factors leading to HMM assembly, which represents an important drug target.