Differences in the Structure and Dynamics of the Apo- and Palmitate-ligated Forms of Aedes aegypti Sterol Carrier Protein 2 (AeSCP-2)

Sterol carrier protein-2 (SCP-2) is a nonspecific lipid-binding protein expressed ubiquitously in most organisms. Knockdown of SCP-2 expression in mosquitoes has been shown to result in high mortality in developing adults and significantly lowered fertility. Thus, it is of interest to determine the structure of mosquito SCP-2 and to identify its mechanism of lipid binding. We report here high quality three-dimensional solution structures of SCP-2 from Aedes aegypti determined by NMR spectroscopy in its ligand-free state (AeSCP-2) and in complex with palmitate. Both structures have a similar mixed α/β fold consisting of a five-stranded β-sheet and four α-helices arranged on one side of the β-sheet. Ligand-free AeSCP-2 exhibited regions of structural heterogeneity, as evidenced by multiple two-dimensional 15N heteronuclear single-quantum coherence peaks for certain amino acids; this heterogeneity disappeared upon complex formation with palmitate. The binding of palmitate to AeSCP-2 was found to decrease the backbone mobility of the protein but not to alter its secondary structure. Complex formation is accompanied by chemical shift differences and a loss of mobility for residues in the loop between helix αI and strand βA. The structural differences between the αI and βA of the mosquito and the vertebrate SCP-2s may explain the differential specificity (insect versus vertebrate) of chemical inhibitors of the mosquito SCP-2.

either as a single functional domain or in conjunction with other functional domains (1,2). In vertebrates, SCP-2 proteins have been shown to be involved in cellular lipid transport mechanisms that affect lipid uptake and metabolism (2). Vertebrate SCP-2 proteins are encoded by a single copy gene, SCPx/SCP-2, that is transcribed by alternative promoters to SCP-x or SCP-2 mRNAs (2,3). SCP-x mRNA is translated into a peroxisomal thiolase with an SCP-2 domain at the C terminus, whereas SCP-2 is translated into a protein with only a single functional domain. The role of the vertebrate SCP-x/SCP-2 gene in lipid metabolism is supported by gene knock-out studies in mice that showed abnormal metabolism in branched fatty acid and bile salt formation (4-7). However, the role of the vertebrate SCP-2 in lipid uptake is less clear, because knock-out of SCP-2 in the mouse was shown to have little impact on cholesterol uptake in the intestine (6). A human genetic mutation leading to deficiency in SCP-x production was found to be associated with the accumulation in plasma of a branched-chain fatty acid, pristanic acid, and abnormal levels in urine of bile alcohol glucuronides (8). A fragment from the C-terminal SCP-2 domain common to both SCP-x and SCP-2 has been found to bind DNA and to regulate the transcription of CD147, the regulatory subunit of the Alzheimer disease γ-secretase (9).
In insects, SCP-2 was shown to be involved in cholesterol uptake in the midgut in both larval and adult mosquitoes (10,11). Whereas in mice, knock-out of SCP-2 had little effect on development, growth, mortality, or fertility (6), knockdown of SCP-2 expression in mosquitoes resulted in high mortality in developing adults and significantly lowered fertility (10). Moreover, chemical inhibitors of A. aegypti SCP-2 showed remarkable specificity toward insects versus mammals (12,13).
Protein crystal structures of the ligand-bound forms of vertebrate SCP-2 and mosquito SCP-2 reveal major structural differences. Helix α2 in vertebrate SCP-2 is replaced by a loop in mosquito SCP-2, and the ligands in the hydrophobic cavities are bound at a 90° angle to one another (14-16). In mosquito SCP-2, the carboxyl group of palmitate is bound to positively charged amino acids in the loop between helix α1 and strand β1 (14), whereas in the human SCP-2, the carboxylate of the bound fatty acid is exposed at the surface of the protein near the additional short α-helix (17). It has been speculated that the loop between helix α1 and strand β1 might be a short α-helix in mosquito apo-SCP-2 (14). Although the dynamic plasticity of human SCP-2 with different ligands has been studied in solution (18), comparable studies of mosquito SCP-2 have not been carried out. Such studies are of importance, because it has been hypothesized that chemical inhibitors of the mosquito SCP-2 are specific to insects (12) as a result of the structural differences between the vertebrate and the mosquito SCP-2. The only structures of mosquito SCP-2 are from protein crystallography (11,14,19) of proteins produced recombinantly from Escherichia coli cells, and all of these structures show the presence of one or more bound fatty acid ligands. Because our repeated attempts to crystallize ligand-free AeSCP-2 were unsuccessful, we have turned to advanced NMR techniques to determine the three-dimensional structures in solution of ligand-free AeSCP-2 and its complex with a C16 fatty acid (palmitate), AeSCP-2-PA. These results have greatly increased our understanding of the structural and dynamic properties of mosquito SCP-2 in solution.

Protein Expression, Purification, and Sample Preparation-The entire coding region of the AeSCP-2 gene was cloned into a pGEX-4T-2 tag vector (GE Healthcare) as described previously (14). PCR primers were 5′-gtgaattcgaATGTCTCTGAAGTCCG-3′ (where capital letters represent the coding sequence and boldface letters represent the start codon; an EcoRI site was incorporated for cloning) and 5′-tactcgagTTACTTCAGCGAGG-3′ (capital letters represent the antisense of coding sequence; boldface letters represent the antisense of the stop codon; an XhoI site was incorporated for cloning). The expression vector was transformed into E. coli strain BL21 (DE3) under 100 μg/ml ampicillin selection (Amersham Biosciences). Sequence analysis was performed to confirm that the fusion protein was in-frame with glutathione S-transferase. Use of glutathione S-transferase fusion and a thrombin cleavage site introduced six additional residues at the N terminus. Consequently, the N-terminal amino acid sequence is Gly-Ser-Pro-Gly-Ile-Arg-Met…, where the methionine indicates the start of the native sequence. The transformed BL21 (DE3) strain was grown in 50 ml of Luria-Bertani broth with 100 μg/ml ampicillin overnight at 37°C with shaking. The 50-ml cell culture was seeded into 2 liters of M9 minimal media supplemented with 15NH4Cl (for U-15N-labeled protein) or 15NH4Cl and 13C6H12O6 (for U-13C,15N-labeled protein) and allowed to grow at 37°C with shaking until the absorbance at 600 nm reached 0.7 (15NH4Cl and 13C6H12O6 were purchased from Cambridge Isotope Laboratories, Inc. (Andover, MA)). The cells were induced to produce protein by addition of isopropyl β-D-1-thiogalactopyranoside to 0.4 mM for 24 h at room temperature with shaking. The cells were washed in phosphate-buffered saline (PBS: 140 mM NaCl, 10 mM Na2HPO4, 1.8 mM KH2PO4, 2.7 mM KCl, pH 7.4) and either used immediately or stored long term at −80°C.
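The primer design described above can be checked with a short script. This is a minimal sketch, not part of the original work: it assumes the hyphens inside the printed primer sequences are line-break artifacts, and the helper names (`reverse_complement`, `coding_part`, `check_primers`) are hypothetical. It confirms that each primer carries the stated restriction site, that the forward primer's coding part begins with a start codon, and that the reverse primer, read on the sense strand, ends in a stop codon.

```python
# Primer sequences as printed in the text, with line-break hyphens removed
# (assumption). Lowercase = added cloning tail, uppercase = gene-derived part.
FORWARD = "gtgaattcgaATGTCTCTGAAGTCCG"
REVERSE = "tactcgagTTACTTCAGCGAGG"

# Recognition sequences of the two restriction enzymes named in the text.
ECORI_SITE = "GAATTC"
XHOI_SITE = "CTCGAG"

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, preserving case."""
    return seq.translate(_COMPLEMENT)[::-1]


def coding_part(primer: str) -> str:
    """Keep only the uppercase (gene-derived) letters of a primer."""
    return "".join(ch for ch in primer if ch.isupper())


def check_primers() -> dict:
    """Run the sanity checks described in the lead-in paragraph."""
    forward_coding = coding_part(FORWARD)
    reverse_sense = reverse_complement(coding_part(REVERSE))
    return {
        "forward_has_EcoRI": ECORI_SITE in FORWARD.upper(),
        "reverse_has_XhoI": XHOI_SITE in REVERSE.upper(),
        "forward_starts_with_ATG": forward_coding.startswith("ATG"),
        "reverse_ends_with_stop": reverse_sense[-3:] in {"TAA", "TAG", "TGA"},
        "reverse_sense_strand": reverse_sense,
    }


if __name__ == "__main__":
    for name, value in check_primers().items():
        print(f"{name}: {value}")
```

Reading the reverse primer on the sense strand makes the stop codon visible directly, which is easier to verify by eye than its antisense form.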
The bacterial cells were suspended in 30 ml of PBS with 5 mM dithiothreitol and 2 mM EDTA and lysed twice with a French press at 15,000 p.s.i. at 4°C. The cell lysate was centrifuged at 14,000 × g at 4°C for 1 h to remove cellular debris. The lysate was loaded (1 ml/min) onto a GSTrap HP column (5 ml, GE Healthcare) equilibrated with PBS at room temperature. The column was washed with PBS until the A280 was at the background level before 150 units of thrombin (GE Healthcare) in PBS was added, and the column was allowed to sit overnight at room temperature. The cleaved protein was eluted from the column and concentrated in a YM10 concentrator (Amicon) to 2 ml in PBS. The salt was removed by dialysis in 3.5-kDa cutoff tubing (Thermo Scientific) against 3 × 4 liters double-distilled H2O at 4°C. The sample was then lyophilized and used to make NMR samples at 2 mM concentration in 10 mM K2PO4, pH 7.8, 5% D2O. The purity of the protein was confirmed by mass spectrometry and by Western blot against an antibody raised in a rabbit against AeSCP-2. Mass spectrometry was also used to determine that the AeSCP-2 sample was ligand-free (supplemental material). To prepare the AeSCP-2-PA complex, palmitic acid was dissolved in hexane and added in 10 M excess to a 2 mM solution of [U-13C, 15
NMR Spectroscopy-All NMR spectra were recorded at the National Magnetic Resonance Facility at Madison, WI, on Varian Inova (600 MHz and 900 MHz) spectrometers equipped with triple-resonance cryogenic probes. The temperature of each sample was regulated at 25°C. Sequence-specific backbone resonance assignments were conducted for AeSCP-2 and AeSCP-2-PA. A series of two- and three-dimensional heteronuclear NMR spectra were collected for both AeSCP-2 and AeSCP-2-PA containing 2.0 mM U-13C,15N-protein dissolved in NMR buffer with 10 mM phosphate, 95% H2O, and 5% D2O (20). Raw NMR data were processed with NMRPipe (21) and analyzed using the program XEASY (22).
Two-dimensional 1H-15N HSQC and three-dimensional HNCO data sets were used to identify the number of spin systems, and these identifications plus three-dimensional HNCACB and three-dimensional CBCA(CO)NH data sets were used as input to the PINE server (23) to determine sequence-specific backbone and CB resonance assignments. Backbone resonance assignments were confirmed on the basis of 15N-resolved 1H-1H NOESY data. Two-dimensional 1H-13C HSQC, three-dimensional HBHA(CO)NH, three-dimensional HC(CO)NH, and three-dimensional C(CO)NH experiments were used to assign the side chain and HA resonances. Three-dimensional 15N-edited 1H-1H NOESY (100-ms mixing time) and three-dimensional 13C-edited 1H-1H NOESY (120-ms mixing time) experiments were used to derive the distance constraints to determine the three-dimensional structure of the protein (24). 15N NOE values were calculated from the ratio of peak heights in a pair of NMR spectra acquired with and without proton saturation. The signal-to-noise ratio in each spectrum was used to estimate the experimental uncertainty. Modelfree analysis was performed using fast Modelfree and Modelfree version 4.15 (27-29). The one-bond scalar coupling (1J_NH) was derived from the shift in position between corresponding peaks in 15N HSQC and transverse relaxation optimized spectroscopy spectra (30). RDC values for N-H bond vectors (1D_NH) were then calculated from the difference between the one-bond scalar couplings of Pf1-aligned and isotropic samples. The Q factor was calculated from the equation Q = r.m.s.
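The RDC and Q factor steps described above can be sketched as follows. The printed equation is cut off in this copy, so the form used here, Q = rms(D_calc − D_obs) / rms(D_obs), is the commonly used definition and is an assumption rather than a confirmed transcription of the authors' expression. The function names and the example splittings are hypothetical.

```python
import math


def rdc_from_couplings(aligned_splittings, isotropic_couplings):
    """RDC per residue: splitting in the aligned sample minus the
    one-bond scalar coupling in the isotropic sample (both in Hz)."""
    if len(aligned_splittings) != len(isotropic_couplings):
        raise ValueError("inputs must have the same length")
    return [a - i for a, i in zip(aligned_splittings, isotropic_couplings)]


def rms(values):
    """Root mean square of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("need at least one value")
    return math.sqrt(sum(v * v for v in values) / len(values))


def q_factor(observed, calculated):
    """Assumed definition: rms(calc - obs) / rms(obs).
    Lower values mean better agreement between structure and RDCs."""
    if len(observed) != len(calculated):
        raise ValueError("inputs must have the same length")
    residuals = [c - o for c, o in zip(calculated, observed)]
    return rms(residuals) / rms(observed)


if __name__ == "__main__":
    # Hypothetical splittings in Hz, for illustration only.
    aligned = [-88.0, -101.5, -95.0]
    isotropic = [-93.0, -93.5, -93.0]
    d_obs = rdc_from_couplings(aligned, isotropic)
    d_calc = [4.5, -7.0, -2.5]  # hypothetical back-calculated RDCs
    print(d_obs, round(q_factor(d_obs, d_calc), 3))
```

Under this definition a structure that reproduces every measured RDC exactly gives Q = 0, and a structure that predicts zero for every RDC gives Q = 1.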
Structure Calculation and Analysis-For the structure calculation, 15N-resolved 1H-1H three-dimensional NOESY and 13C-resolved 1H-1H three-dimensional NOESY spectra were used to derive the intramolecular distance restraints, and 13C/15N-filtered/13C-edited three-dimensional NOESY and 13C/15N-filtered/15N-edited three-dimensional NOESY spectra were used to derive the intermolecular distance restraints. TALOS+ software (31,32) was used to derive backbone dihedral angle restraints (φ and ψ) from 1HA, 15N, 13CA, 13CB, and 13C′ chemical shift values. CYANA software versions 2.1/3.0 (33) were used for automatic NOESY peak assignment and structure calculations. NOESY peaks assigned automatically by CYANA were used as a guide to further refine the structure. RDC refinement was performed using CYANA 3.0, and PALES software was used for analyzing the refinement results (34). The programs MOLMOL (35) and PyMOL (36) were used for root mean square deviation (r.m.s.d.) calculations and graphical analysis. The PSVS server (37) was used to check the quality of the structure.

RESULTS AND DISCUSSION
Conformational Heterogeneity-The 15N HSQC spectrum of AeSCP-2 was well dispersed, but signals from a few amino acids exhibited multiple peaks (Fig. 1, left). This indicated the presence of local structural heterogeneity, which could explain the failure to crystallize the ligand-free protein. The NMR data support the presence of three populations with percent occupancy 51:33:16 in part of the protein and two populations with percent occupancy 66:34 in another part of the protein. These results suggest that one domain of the molecule exists as one stable conformation, whereas another domain exists as two interconverting conformations with a subdomain of the latter domain affected by a second conformational equilibrium. As described below, we determined the solution structure of the most abundant species present in solution. We were also able to assign the signals from the minor species to individual amino acids in the protein sequence.
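One simple way to turn the intensities of multiple peaks from the same residue into percent occupancies is to normalize them to their sum. The sketch below illustrates that arithmetic only; the text does not state how the occupancies were obtained, so this approach is an assumption. It further assumes slow exchange and similar relaxation behavior for all states, and the function name and input values are hypothetical.

```python
def percent_occupancy(peak_intensities):
    """Normalize the peak intensities of one residue's alternative
    states so that the resulting populations sum to 100%."""
    total = sum(peak_intensities)
    if total <= 0:
        raise ValueError("peak intensities must sum to a positive value")
    return [100.0 * intensity / total for intensity in peak_intensities]


if __name__ == "__main__":
    # Hypothetical peak heights (arbitrary units) for a residue with
    # three resolved HSQC peaks and one with two resolved peaks.
    print(percent_occupancy([102, 66, 32]))  # three-state region
    print(percent_occupancy([132, 68]))      # two-state region
```

In practice the ratios from several residues in the same region would be averaged, because individual peak heights are affected by noise and by differences in line width.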
The multiple peaks observed with AeSCP-2 were converted into a single set of resonances in the AeSCP-2-PA complex (Fig. 1B). Changes in the positions of 15 N HSQC peaks upon formation of the complex identified regions of the protein perturbed by ligand binding. Sequence-specific assignment of these peaks identified the residues involved, and they were found to correspond to residues in the loop joining helix I and strand A.
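Chemical shift changes of this kind are often summarized per residue as a single combined 1H/15N value. The sketch below shows one common convention, in which the 15N change is scaled down before the two are combined. The scaling factor of 0.2, the threshold, the function names, and the example shifts are all assumptions for illustration, not values from this study.

```python
import math


def combined_shift_change(delta_h, delta_n, n_scale=0.2):
    """Combined perturbation in ppm: sqrt(dH^2 + (n_scale * dN)^2).
    n_scale compensates for the wider 15N shift range (assumed 0.2)."""
    return math.sqrt(delta_h ** 2 + (n_scale * delta_n) ** 2)


def perturbed_residues(free, bound, threshold):
    """Return residues whose combined shift change exceeds threshold.

    free and bound map residue label -> (1H ppm, 15N ppm).
    Residues missing from either data set are skipped."""
    hits = {}
    for residue, (h_free, n_free) in free.items():
        if residue not in bound:
            continue
        h_bound, n_bound = bound[residue]
        change = combined_shift_change(h_bound - h_free, n_bound - n_free)
        if change > threshold:
            hits[residue] = change
    return hits


if __name__ == "__main__":
    # Hypothetical shifts, for illustration only.
    free = {"Ile19": (8.10, 120.0), "Arg24": (8.50, 118.0), "Val29": (7.90, 122.0)}
    bound = {"Ile19": (8.12, 120.1), "Arg24": (8.80, 119.5), "Val29": (7.90, 122.0)}
    print(perturbed_residues(free, bound, threshold=0.1))
```

Mapping the returned values onto the structure, for example as backbone thickness, then highlights the region affected by ligand binding.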
Structure of AeSCP-2-The solution structure of ligand-free AeSCP-2 was derived from 1842 distance constraints, 146 angle constraints from chemical shift analysis by TALOS+ software (31), and 87 orientational constraints from RDCs. The total of 2075 constraints was used in generating 100 refined structures, and the best 20 conformers, those with lowest energy that showed the fewest constraint violations, were chosen to represent the solution structure of AeSCP-2. Refinement against residual dipolar couplings improved the structure significantly; the Q factor was 0.67 before refinement and 0.14 after refinement (Fig. 2, A and B). The r.m.s.d. was 0.41 Å for backbone heavy atoms and 0.9 Å for all heavy atoms of regular secondary structures; 89% of the backbone angles were in most favored and 10% were in additionally favored regions of the Ramachandran plot; and the Z-scores for backbone/all dihedral angles were −0.31/−2.9 (Table 1). These parameters are indicative of a high quality solution structure. The thickness of the backbone trace at a given residue represented in Fig. 3F, which is proportional to the magnitude of the r.m.s.d. for the backbone atoms in the 20 conformers, suggests that the loop between αI and βA is more dynamic than the rest of the backbone. The coordinates and restraints were deposited in the Protein Data Bank (38) with accession code 2ksh, and the chemical shifts were deposited in BioMagResBank (39) with accession code 16662.
The structure has an α/β mixed fold consisting of five β-strands (A-E), containing residues A (Val29-Gln36 2), make up a twisted β-sheet (Fig. 3, A and B). α-Helix I is in close proximity to β-strands C and B and α-helix III and also contacts the terminal residues of α-helix IV. The α-helices pack against one side of the β-sheet, whereas the other side of the β-sheet is exposed to solvent. α-Helices II and III form a helix-turn-helix and contact α-helices I and IV. The C-terminal α-helix (IV), which is in close proximity to α-helices I, II, and III, contains a central bend resulting from the presence of a proline (Pro104). The 10-residue loop joining helix αI and strand βA is flexible. The major structural difference between the superimposed backbone traces of ligand-free AeSCP-2 and ligand-free human SCP-2 (Fig. 3C) comes from the short α-helix in human SCP-2 inserted between αI and βA; this region in AeSCP-2 is a flexible loop. A second difference is that the αIII and αIV regions are better defined in the AeSCP-2 structure than in the human SCP-2 structure.
Conformational Heterogeneity Giving Rise to the Multiple Peaks Observed in the 15N HSQC NMR Spectrum-As noted above, the multiple cross-peaks assigned to single residues in the 15N HSQC spectrum of AeSCP-2 (Fig. 1A) indicate the presence of conformational heterogeneity. Because of this complexity, we only determined the structure of the major conformer. The multiple peaks observed in the 15 (Fig. 1A). We mapped these positions onto the three-dimensional structure (Fig. 3D) in an attempt to find an explanation for the conformational heterogeneity. Most of the residues involved in the C-terminal helix (Glu95-Lys110) show multiple resonances, as do residues that are in close spatial proximity to this helix. Thus, we speculate that multiple conformations of the C-terminal helix could explain the origin of the observed spectral heterogeneity. Spin system heterogeneity was observed in the past for fatty acid-binding proteins (40-42). In fatty acid-binding proteins, the multiple conformations are associated with the size of the binding pocket that accommodates different fatty acid ligands. Extensive studies on fatty acid-binding proteins suggest that structural heterogeneity may function as a means for accommodating ligands of different size in the binding pocket; however, the mechanism by which the structural heterogeneity is generated remains unclear.
Structure of AeSCP-2-PA-The structure of the complex was determined from 1663 distance constraints, 192 dihedral angle constraints derived from chemical shifts, and 92 orientational constraints derived from residual dipolar couplings. The r.m.s.d. was 0.4 Å for the backbone heavy atoms and 0.9 Å for all heavy atoms of regular secondary structures; 85% of the backbone angles were in the most favored and 15% were in additionally favored regions of the Ramachandran plot; and the Z-scores for backbone/all dihedral angles were −0.55/−3.2 (Table 1). Refinement against residual dipolar couplings improved the structure significantly from Q = 0.66 (before refinement, Fig. 2C) to Q = 0.19 (after refinement, Fig. 2D). These results indicate that the structure is of high quality. Coordinates and restraints were deposited in Protein Data Bank with accession code 2ksi, and the assignments were deposited in BioMagResBank with accession code 16665.
The topology and secondary structure of the complex are very similar to those of the unligated protein (Fig. 4, A and B). Residues exhibiting 15N HSQC chemical shift perturbations upon complex formation with palmitate are mapped onto the three-dimensional structure in Fig. 3E, with the thickness of the backbone proportional to the magnitude of the chemical shift change. Filtered 1H, 1H-1H TOCSY, and 1H-1H NOESY data were used to assign three distinguishable 1H signals from palmitate as follows: 1.90 ppm (q2), 0.99-1.07 ppm (q3-q15), and 0.68 ppm (q16) for palmitic acid (where q indicates a pseudo atom for the methylene or methyl protons at the numbered carbon). 100 NOE interactions between AeSCP-2 and palmitic acid were assigned unambiguously and used to generate the structure of palmitate in the complex. The palmitate binding pocket is buried by hydrophobic residues and is inaccessible to solvent (Fig. 4C). A series of NOE correlations (Pro21 HA/Arg24 HD2(HD3), Ile19 QG2/Arg24 HD2(HD3)) suggest that the loop joining helix I and strand A is reasonably well structured. Additional NOE interactions (Arg24
FIGURE 3. Solution NMR structure of AeSCP-2. A, ribbon drawing of the conformer with the lowest CYANA target function. Helices are labeled I-IV, and β-strands are labeled A-E. N- and C-termini are labeled as "N" and "C," respectively. B, 20 conformers that represent the solution structure of AeSCP-2. These are the low energy conformers with lowest residual CYANA target function values after superimposition to minimize the r.m.s.d. of the backbone heavy atoms in regular secondary structures. C, ligand-free AeSCP-2 (cyan) was overlaid with ligand-free human SCP2 (gray) (PDB code 1qnd). The major structural difference is the short helix inserted in human SCP2 in place of the loop in AeSCP-2, which is marked with a red box. D, residues of AeSCP-2 exhibiting multiple 15N HSQC peaks mapped onto three-dimensional structure. Residues shown in orange correspond to those exhibiting multiple peaks (multiple conformational states), and residues shown in gray correspond to those with a single peak (single conformational state). E, chemical shift perturbations of 1H, 15
HA/Arg24 HD2(HD3) indicate that the side chain amide of Arg24 lies close to the backbone. In the crystal structure of AeSCP-2-PA, the side chain amide NH and backbone NH of Arg24
Backbone Dynamics of Ligand-Free AeSCP-2 and the AeSCP-2-PA Complex-Solution NMR is a powerful technique for studying protein dynamics. To characterize the dynamic behavior of AeSCP-2 and AeSCP-2-PA, we collected 15N relaxation (longitudinal (T1) and transverse (T2)) and {1H}-15N heteronuclear NOE data. Fig. 5 compares T1, T2, {1H}-15N heteronuclear NOE, and the order parameter (S2) results for AeSCP-2 and AeSCP-2-PA. Most of the T1, T2, {1H}-15N heteronuclear NOE, and S2 values are similar throughout the protein sequence and indicate uniform backbone mobility in these regions. However, the loops between α-helix I/β-strand A (Ile19-His28) and β-strands C/D (Glu55-Ala62) in AeSCP-2 are exposed to solvent and are more dynamic than the rest of the protein (Fig. 5). AeSCP-2-PA, in comparison with AeSCP-2, exhibits decreased T2 and increased {1H}-15N heteronuclear NOE values, which indicate a significant loss of backbone mobility upon complex formation.
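The heteronuclear NOE values compared here are, as described under the NMR methods, ratios of peak heights with and without proton saturation, with uncertainties estimated from the signal-to-noise ratio of each spectrum. A minimal sketch of that calculation follows. The error formula is standard propagation for a ratio of two independent measurements and is an assumption about how the uncertainty was derived; the function names and example numbers are hypothetical.

```python
import math


def heteronuclear_noe(height_saturated, height_reference):
    """{1H}-15N NOE as the ratio of peak heights measured with and
    without proton saturation."""
    if height_reference == 0:
        raise ValueError("reference peak height must be non-zero")
    return height_saturated / height_reference


def noe_uncertainty(noe, snr_saturated, snr_reference):
    """Propagated uncertainty of the ratio, taking the relative error
    of each peak height as 1 / (signal-to-noise ratio)."""
    if snr_saturated <= 0 or snr_reference <= 0:
        raise ValueError("signal-to-noise ratios must be positive")
    relative = math.sqrt((1.0 / snr_saturated) ** 2 + (1.0 / snr_reference) ** 2)
    return abs(noe) * relative


if __name__ == "__main__":
    # Hypothetical peak heights and signal-to-noise ratios.
    noe = heteronuclear_noe(80.0, 100.0)
    sigma = noe_uncertainty(noe, snr_saturated=40.0, snr_reference=50.0)
    print(f"NOE = {noe:.2f} +/- {sigma:.3f}")
```

Values close to the rigid-limit maximum indicate restricted motion on the picosecond-to-nanosecond time scale, which is why higher NOE values in the complex are read as reduced backbone mobility.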
Conclusions-We report here the first high quality solution structures of ligand-free and ligand-complexed SCP-2 from the yellow fever mosquito A. aegypti. The results show that ligand binding does not alter the protein secondary structure. Instead, NMR relaxation data show that the protein backbone of AeSCP-2 loses considerable mobility upon binding palmitate. The loop between αI and βA of AeSCP-2 coordinates the carboxyl group of the fatty acid in a fashion similar to that observed in the previous x-ray structure of AeSCP-2-PA (14).
Vertebrate SCP-2 proteins contain a short α-helix between αI and βA that is not present in invertebrate SCP-2 (Fig. 3C), and it has been speculated that this helix might be present in the unligated invertebrate SCP-2. The present results show that this is not the case. Instead, vertebrate and invertebrate SCP-2 proteins contain preformed binding pockets that accommodate fatty acids in different configurations. This slight, but critical, structural difference between mosquito and vertebrate SCP-2s may provide the explanation for the finding that AeSCP-2 inhibitors are lethal to mosquitoes but not to mammals (12,13).