X-ray Structure and Ligand Binding Study of a Moth Chemosensory Protein*

Chemosensory proteins (CSPs) are believed to be involved in chemical communication and perception. Such proteins, of M_r 13,000, have been isolated from several sensory organs of a wide range of insect species. Several CSPs have been identified in the antennae and proboscis of the moth Mamestra brassicae. One of them, CSPMbraA6, a 112-amino acid antennal protein, has been expressed in large quantities and is soluble in the Escherichia coli periplasm. X-ray structure determination has been performed in parallel with ligand binding assays using tryptophan fluorescence quenching. The protein has overall dimensions of 25 × 30 × 32 Å and exhibits a novel type of α-helical fold with six helices connected by α−α loops. A narrow channel extends within the protein hydrophobic core. Fluorescence quenching with brominated alkyl alcohols or fatty acids and modeling studies indicate that CSPMbraA6 is able to bind such compounds with C12–18 alkyl chains. These ubiquitous proteins might have the role of extracting hydrophobic linear compounds (pheromones, odors, or fatty acids) dispersed in the phospholipid membrane and transporting them to their receptor.

Two classes of highly soluble and very abundant proteins of ~150 amino acids have been detected in sensilla of Lepidoptera, both containing 6 conserved cysteines forming three disulfide bridges. The first class, that of general odorant-binding proteins (GOBPs), is equally distributed in both sexes, whereas the second class, that of pheromone-binding proteins (PBPs), is mainly present in males (1). A third class of small proteins (average M_r 13,000) has been identified in antennae from Drosophila melanogaster and in antennae and several sensory organs (tarsi, labrum) from a wide range of insect species (2-11). These proteins have been proposed to be involved in CO2 detection (3), in chemical signal transmission in regenerating legs (5), or in chemo-perception (either olfaction or taste (9,12)), and they were therefore called chemosensory proteins (CSPs). CSPs are shorter (110-115 amino acids) than PBPs or GOBPs, contain only 4 conserved cysteines forming two disulfide bridges (9), and share no sequence homology with them. They may also play a role in the transport of hydrophobic chemicals (volatile or not) from air or water to olfactory or taste receptors, in a manner similar to other transport proteins such as GOBPs or PBPs. However, the exact physiological role of CSPs remains to be identified. In the moth Mamestra brassicae, several CSPs have been identified in the proboscis (12) and in the antennae (13). M. brassicae CSPs have been shown to bind several components of the pheromonal blend and therefore might also have a function analogous to that of PBPs (12). In the proboscis, however, a putative role of odor or taste carriers has been assigned to CSPs (12).
To improve our knowledge of these important insect proteins, we have expressed in the Escherichia coli periplasm CSPMbraA6 originating from the antennae of M. brassicae (14). The preliminary NMR assignment and the crystallization of CSPMbraA6 have already been published (14,15). The present paper is the first report of a CSP structure and of its preliminary functional characterization. We describe the three-dimensional x-ray structure of CSPMbraA6, which displays a novel fold, and fluorescence binding studies with 12-bromododecanol and brominated fatty acids. CSPMbraA6 binds these compounds with dissociation constants in the micromolar to submicromolar range. Based on these results, we hypothesize that it functions as a lipid carrier.

MATERIALS AND METHODS
Production of CSPMbraA6-A detailed description of the molecular cloning, expression, purification, and characterization of CSPMbraA6 has previously been reported (14). Expression of selenomethionine-substituted CSPMbraA6 was performed using the methionine pathway inhibition method (16). Briefly, bacteria were grown to midlog phase in minimal medium before addition of the following amino acids: lysine, phenylalanine, and threonine at 100 mg/ml, isoleucine, leucine, and valine at 50 mg/ml, and selenomethionine at 50 mg/ml. Induction was thereafter carried on for 30 min under the same conditions as described above for the expression of mutants. Mass spectrometry was performed to confirm the substitution of the unique Met in CSPMbraA6.
X-ray Structure Determination-Crystals of CSP2 were obtained using two different conditions (15). Crystals of form 1 belong to the P4₃2₁2 space group with unit cell dimensions of a = b = 69.4 Å, c = 79.6 Å and contain 1 molecule per asymmetric unit, whereas crystals of form 2 belong to the monoclinic space group P2₁ with unit cell dimensions of a = 47.6 Å, b = 49.7 Å, c = 50.3 Å, β = 110.1° and contain 2 molecules per asymmetric unit. Crystals of selenomethionyl protein were obtained in the same way as the native crystals but required seeding with small native crystals. For data collection, crystals were flash frozen at 100 K in their mother liquor without cryoprotectant.
* This work was supported in part by the PACA region (Grant 9811/2177) and by a European Union BIOTECH structural biology project (Grant BIO4-98-0420). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The x-ray fluorescence from each crystal was measured as a function of the incident x-ray energy in the vicinity of the Se-K edge. In the case of the tetragonal form, a three-wavelength MAD experiment was carried out on the BM-14 beamline at ESRF (Grenoble). The wavelengths chosen for the data collection were 0.978878 Å (12.666 keV) and 0.979095 Å (12.663 keV), corresponding to maximum f″ and minimum f′, respectively. A third, remote energy was selected at 13.5 keV (λ = 0.918397 Å). In the case of the monoclinic crystal, a SAD experiment was performed on the ID-29 beamline at ESRF at a wavelength of 0.9789 Å (12.666 keV), corresponding to the maximum f″. Each time, data were collected from a single crystal in an arbitrary setting using a MAR-CCD detector (BM-14) or an ADSC Quantum 4 detector (ID-29) after optimizing the oscillation range using STRATEGY (17). All data were processed and reduced using DENZO or HKL2000 (18,19) and the CCP4 program suite (20). Data collection statistics are given in Table I.
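The wavelength and energy pairs quoted above are related by λ = hc/E. The following minimal sketch, which is an illustration and not part of the original data processing, checks that correspondence; the constant and the function names are assumptions introduced here.

```python
# Photon energy <-> wavelength conversion, lambda = h*c / E.
# HC_KEV_ANGSTROM is the product h*c expressed in keV·Å (rounded value);
# the function names are illustrative, not taken from any crystallographic package.
HC_KEV_ANGSTROM = 12.39842


def energy_to_wavelength(energy_kev: float) -> float:
    """Return the wavelength in Å for a photon energy given in keV."""
    return HC_KEV_ANGSTROM / energy_kev


def wavelength_to_energy(wavelength_angstrom: float) -> float:
    """Return the photon energy in keV for a wavelength given in Å."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


if __name__ == "__main__":
    # Wavelengths quoted in the text for the MAD experiment (peak, inflection)
    for label, wavelength in [("peak", 0.978878), ("inflection", 0.979095)]:
        print(f"{label}: {wavelength} Å -> {wavelength_to_energy(wavelength):.3f} keV")
    # Remote energy quoted in the text
    print(f"remote: 13.5 keV -> {energy_to_wavelength(13.5):.4f} Å")
```

The three values agree with those quoted in the text to within rounding of the constant.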
In the case of crystal form 1, initial phases were obtained with the SOLVE program (21). The experimental MAD electron density maps were improved by solvent flattening using DM (22). The CSPMbraA6 model was built manually using the MAKE FRAG and TPPR options in Turbo-Frodo (23). In the case of form 2, the native Patterson function calculated at 1.8-Å resolution contained a large peak in the y = 0.5 section that was 8% of the height of the origin peak. This peak persisted in the anomalous difference Patterson calculated with the peak absorption diffraction data. Because of the special nature of the non-crystallographic translational symmetry, extra care was needed to solve the selenium structure. Heavy atom refinement and phasing were carried out in MLPHARE (20), and solvent-flattened maps were generated with DM (22); the main chain was traced with wARP (24). The map was of sufficient clarity not to require non-crystallographic symmetry averaging. Therefore, one of the two molecules was selected for fitting the sequence using O (25), and the other was generated by applying the non-crystallographic symmetry operator. Refinement of both models was performed with CNS (26) using bulk solvent correction and simulated annealing, alternated with manual refitting using Turbo-Frodo (23). Protein geometry was assessed using PROCHECK (27), showing 91 and 94.1% of residues in the most favorable region and 9 and 5.9% in the additional allowed region for forms 1 and 2, respectively.
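As a plausibility check on the number of molecules per asymmetric unit reported for the two crystal forms, the unit cell parameters can be converted into a Matthews coefficient. The sketch below is illustrative only: it assumes a molecular weight of 13,000 (the approximate M_r quoted for CSPs, not a measured mass of CSPMbraA6), uses the standard numbers of symmetry operators for P4₃2₁2 (8) and P2₁ (2), and estimates solvent content with the usual approximation 1 − 1.23/V_M. The function names are hypothetical.

```python
import math


def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume in Å^3 from cell lengths (Å) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def matthews_coefficient(volume, n_symops, n_mol_asu, mol_weight):
    """Matthews coefficient V_M in Å^3/Da."""
    return volume / (n_symops * n_mol_asu * mol_weight)


def solvent_fraction(v_m):
    """Approximate solvent fraction, using 1.23 Å^3/Da for protein."""
    return 1.0 - 1.23 / v_m


ASSUMED_MW = 13000.0  # assumption: approximate M_r of a CSP, from the text

# Form 1: tetragonal P4(3)2(1)2, 8 symmetry operators, 1 molecule per asymmetric unit
vm_form1 = matthews_coefficient(cell_volume(69.4, 69.4, 79.6), 8, 1, ASSUMED_MW)
# Form 2: monoclinic P2(1), 2 symmetry operators, 2 molecules per asymmetric unit
vm_form2 = matthews_coefficient(cell_volume(47.6, 49.7, 50.3, beta=110.1), 2, 2, ASSUMED_MW)

if __name__ == "__main__":
    for name, vm in [("form 1", vm_form1), ("form 2", vm_form2)]:
        print(f"{name}: V_M = {vm:.2f} Å^3/Da, solvent ~ {100 * solvent_fraction(vm):.0f}%")
```

Under these assumptions both forms give V_M values within the range commonly observed for protein crystals, with form 1 the more loosely packed of the two.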
The models of the ligands were built with Turbo-Frodo using standard stereochemical parameters. The CSPMbraA6 Tyr-26 side chain χ1 angle was rotated toward the surface by 100° to form the continuous internal channel, and the ligands were docked manually.
Fluorescence Spectroscopy-Brominated compounds and pheromones were obtained from Chemtech BV (The Netherlands) and from Sigma. 100% methanolic solutions were freshly prepared. Fluorescence quenching was measured using a Cary Eclipse (Varian). Fluorescence measurements were made in a right angle configuration at 20 °C by using 2.5-nm excitation and 10-nm emission bandwidths. The excitation wavelength was 280 nm, and the emission spectra were measured between 290 and 540 nm. In all experiments the final methanol concentration in the cuvette was kept below 1%. Samples contained 1 μM protein in 10 mM Tris buffer, 25 mM NaCl, pH 8.0; ligands were used at concentrations between 0.02 and 22.5 μM.
To estimate the affinity of brominated compounds for CSPMbraA6, the fluorescence intensities at 343 nm were plotted versus the quencher concentration. The K d values were estimated by non-linear regression using Prism 3.02 (GraphPad Software, Inc.).
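The kind of non-linear regression described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure implemented in Prism: it assumes a simple one-site hyperbolic quenching model, F = F0 − ΔF·[L]/(K_d + [L]), ignores ligand depletion (which may matter when the protein concentration is comparable to K_d), and uses a plain grid search over K_d instead of a gradient-based optimizer. All function names are hypothetical, and the data in the usage example are synthetic, not measurements from this study.

```python
import math


def one_site(ligand, f0, df, kd):
    """Fluorescence predicted by a one-site quenching model (assumed model)."""
    return f0 - df * ligand / (kd + ligand)


def _linear_fit(xs, ys):
    """Ordinary least squares y = intercept + slope * x; returns (intercept, slope, sse)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, sse


def fit_kd(ligand, fluor, kd_min=1e-3, kd_max=1e3, steps=400, rounds=4):
    """Estimate (kd, f0, df) by scanning kd on a log grid and refining the grid.

    For each trial kd the model is linear in f0 and df, so those two
    parameters are obtained by linear least squares.
    """
    lo, hi = math.log10(kd_min), math.log10(kd_max)
    best = None  # (sse, kd, f0, df)
    for _ in range(rounds):
        width = (hi - lo) / steps
        for i in range(steps + 1):
            kd = 10 ** (lo + i * width)
            occupancy = [l / (kd + l) for l in ligand]
            f0, slope, sse = _linear_fit(occupancy, fluor)
            if best is None or sse < best[0]:
                best = (sse, kd, f0, -slope)
        centre = math.log10(best[1])
        lo, hi = centre - width, centre + width  # zoom in around the best kd
    _, kd, f0, df = best
    return kd, f0, df


if __name__ == "__main__":
    # Synthetic, noise-free titration (concentrations in μM) for illustration only
    conc = [0.02, 0.05, 0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0, 22.5]
    signal = [one_site(c, f0=100.0, df=30.0, kd=0.5) for c in conc]
    kd, f0, df = fit_kd(conc, signal)
    print(f"Kd = {kd:.3f} μM, F0 = {f0:.1f}, dF = {df:.1f}")
```

Separating the linear parameters (F0, ΔF) from the non-linear one (K_d) keeps the sketch dependency-free; a dedicated fitting library would normally be preferred and would also provide confidence intervals.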

RESULTS AND DISCUSSION
Overall Structure Description-CSPMbraA6 has an overall globular shape with dimensions of 25 × 30 × 32 Å and consists of six helices connected by α-α loops (Fig. 1a). In crystal form 1 and in the monomer B of crystal form 2, the polypeptide chain is visible from residue 11 to residue 112, whereas in monomer A of crystal form 2, the chain starts at residue 4. The remaining N-terminal residues are probably disordered in the crystal. The helices correspond to residues 5 (or 12)-18 (helix A), 20-30 (B), 38-53 (C), 60-76 (D), 78-88 (E), and 93-105 (F) (Fig. 1a). Two disulfide bridges close small loops because they are formed by cysteines 29 and 36 and cysteines 55 and 58 (Fig. 1a), similar to what was proposed for the CSP of Schistocerca gregaria (9).
No comparable fold has been found in the Protein Data Bank using DALI (28), indicating that the six helices are arranged in an original way. Helices A and B (residues 6-30) as well as helices D and E (residues 60-88) form two V-shaped structures, with opening angles of ~60° (Fig. 1, b and c). The planes defined by the two V-shaped structures are parallel and about 12 Å apart. Helix C is perpendicular to these two planes and positioned in between the four ends of the two V-shaped structures (Fig. 1, b and c). The final helix (F) is packed against the external face of the D-E helices and does not take part in the core assembly. Aside from the N termini, the Cα chains of the different protein forms or monomers are very similar. An RMSD of 0.44 Å is observed between the Cα atoms common to monomers A and B of form 2, and an RMSD of 0.64 Å is observed between the Cα atoms of crystal form 1 and monomer A of crystal form 2. A few significant differences are observed in some side chain orientations, however (see below).
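For reference, the RMSD figures quoted above follow the usual definition: the square root of the mean squared distance between matched Cα atoms. The minimal sketch below assumes the two coordinate sets have already been superposed and are listed in the same residue order; it does not perform the superposition itself, and the function name and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (Å) between two matched lists of (x, y, z) tuples.

    Assumes the structures are already superposed in a common frame.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Hypothetical coordinates: every atom in the second set is shifted by 1 Å along z
    model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    model_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
    print(f"RMSD = {rmsd(model_a, model_b):.2f} Å")
```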
The sequence of CSPMbraA6 contains 16 Glu, 7 Asp, 14 Lys, and 6 Arg residues, accounting for 39% of the total amino acid content. All of these charged residues are located at the protein surface. Aromatic and aliphatic residues, accounting for 30% of the total amino acid content, form the core of the protein and the walls of the internal channel (see below); only a few of these hydrophobic residues are located near the protein surface. The high content of charged residues and the absence of a hydrophobic patch at the protein surface are consistent with the monomeric nature of CSPMbraA6 and its outstanding solubility.
Hydrophobic Channel-A narrow channel, starting from the surface region between residues 6-10 (helix A) and residues 62-68 (helix D), extends 14 Å within the core of monomer A of form 2 between the two V-shaped structures (Fig. 2, a and b). Six ordered water molecules are visible in this channel at hydrogen bonding distances from each other (Fig. 2, a and b). They contact, from outside to inside, Arg-68, Asn-10, Asn-61, Leu-13, Asp-9, His-46, Glu-62, Gly-65, Ala-66, Leu-43, Val-69, Leu-47, and Tyr-26. One side of the channel is formed by helix A (residues 5-18) and is stabilized by a crystal contact between Tyr-8 and Glu-39 of a symmetry-related molecule. The side chain of Tyr-26 forms the bottom of the channel, preventing its continuity with a nearby internal cavity (Fig. 2, a and b).
In monomer B, the disorder of the peptide segment 1-10 results in a large opening at the position of the beginning of the channel in monomer A (Fig. 2, c and d). NMR studies indicate that this segment is also disordered in solution. The Tyr-26 side chain is rotated by 100° around the χ1 angle toward the protein surface, and the position of its hydroxyl group in monomer A is occupied by the side chain of Leu-43. As a result of these side chain rotations, the channel is much shorter, and a closed cavity starts in monomer B where the channel stops in monomer A.
We hypothesize that, upon ligand binding, the Tyr-26 side chain might be rotated toward the protein surface as in monomer B. Consequently, applying to monomer A the rotation of Tyr-26 observed in monomer B makes it possible to form a continuous internal channel starting from the surface and about 20 Å deep. This narrow and elongated channel seems suitable for alkyl compound binding, a hypothesis tested below.
Ligand Binding Solution Studies-As 1-aminoanthracene (AMA) has been widely used in binding assays with mammalian lipocalins (30) and insect PBPs (31), we initially attempted to use this fluorescent probe to determine the affinity of various ligands for CSPMbraA6. Unfortunately, no fluorescence increase was observed upon addition of AMA to a solution of CSPMbraA6 (data not shown), indicating that AMA does not bind in the internal hydrophobic channel. This was not surprising considering the bulkiness of the anthracene moiety and the small radius of the CSPMbraA6 channel.
As bromine has been reported to be an excellent long range quencher of tryptophan fluorescence (32), we decided to perform binding assays with linear compounds bearing a bromine atom: 12-bromo-dodecanol, 15-bromo-pentadecanoic acid, and 9-bromo-stearic acid (named C12Br, C15Br, and C18Br, respectively). In the presence of saturating concentrations of these compounds, the intrinsic tryptophan fluorescence of CSPMbraA6 was quenched by 26, 34, and 30%, respectively (Fig. 3, A-C), which suggests that at least one of the two tryptophans interacts with the ligand. This decrease in fluorescence intensity was associated with a blue shift of the emission maximum, from 343 to 339 nm with C12Br and to 332 nm with C15Br and C18Br, indicating that the tryptophan environment probably becomes more hydrophobic upon ligand binding. To determine the affinity of CSPMbraA6 for the three ligands, the fluorescence quenching at 342 nm was measured as a function of ligand concentration, and the dissociation constant was estimated by non-linear regression of the binding curve (Fig. 3, A-C). The K d values determined were 0.90, 1.60, and 0.35 μM, respectively, in the range of values observed for PBP-pheromone complexes (31,33).
Ligand Binding Modeling-The x-ray structure and the positive binding assays prompted us to model the interaction between the ligands and CSPMbraA6. We rotated the side chain of Tyr-26, which extends the length of the hydrophobic channel to ~20 Å inside the protein (see above). The three brominated ligands used in fluorescence experiments were modeled and docked in the cavity with the OH or COOH moieties pointing out of the channel. In this orientation, C12Br is fully embedded in the channel while the carboxylic moieties of C15Br and C18Br reach the protein surface and the bulk solvent (Fig. 3D). The three ligands fit well in the narrow channel, with most of the alkyl chain in an extended conformation and a kink induced at the level of carbons 5-6 by a change in the channel direction. These ligand positions, with the hydrophilic group turned to the outside, are also in agreement with the hydrophobic character of the channel. The bromine atoms of the ligands are closer to the Trp-81 indole ring (8, 8, and 11 Å, respectively) than to Trp-94 (16, 16, and 15 Å, respectively). This arrangement is consistent with the fluorescence of Trp-81 being the more strongly quenched.
Comparison with Other Lipid Transport Proteins-The three-dimensional structures of several classes of small lipid transport proteins have been solved to date and have diverse structural frameworks. Two families (34-36) employ anti-parallel β-barrels to form variably shaped internal cavities that accommodate the often hydrophobic ligands. By changing the residues inside such barrels, a wide range of specificities can be achieved.
In contrast, CSPMbraA6 has an all α-helical structure. Several other proteins capable of accommodating lipids also have an all helical structure: LTPs (37,38), the B1 and B2 proteins (39), and PBPs (40,41). LTPs are small proteins (around 100 amino acids) that facilitate the transfer of lipids through membranes. In their structure, four helices delineate a cavity extending through the entire molecule (37,38). B1 and B2 proteins, secreted in the tubular accessory glands of the adult male mealworm beetle Tenebrio molitor, have been proposed to be lipid carriers able to keep hydrophobic compounds in solution in the aqueous seminal fluid (39). NMR studies of THP12, a protein from T. molitor homologous to B1/B2, revealed a structure of six α-helices and demonstrated that THP12 binds fatty acids (nonanoic acid and octanoic acid) as well as the specific pheromones of T. molitor, 4-methylnonanol, and ergosterol (42). The structure of Bombyx mori PBP has revealed the presence of six helices delineating a cavity that contains the pheromone, bombykol (40). Bombykol, an unsaturated alcohol, presents a bent conformation, reminiscent of fatty acids in fatty acid binding proteins (43). At acidic pH, the cavity of the apo-PBP is filled by the C-terminal segment in helical conformation (41). As far as the binding properties of PBPs are concerned, it is important to remember that some PBPs have also revealed strong affinities for non-pheromonal compounds such as alkyl alcohols and fatty acids (31).
The relative arrangement of the helices differs greatly among all these α-helical proteins, including CSPMbraA6. The ligand-accommodating region is also different: THP12 exhibits an elongated groove located at the protein surface, whereas PBPs and LTPs possess an internal cavity, as does CSPMbraA6. In contrast with PBPs, the narrow and rather elongated channels of CSPs seem suited to selecting only aliphatic ligands.
Conclusion-The structure and binding properties of an insect chemosensory protein are reported here for the first time. Our results point to a lipid transport function for CSPMbraA6. Lipids are poorly soluble in the fluids from antennae or other organs containing CSPMbraA6 but are most probably dispersed in the phospholipid membrane. We therefore hypothesize that CSPMbraA6 interacts with membranes to extract pheromones or other lipidic compounds dispersed in the hydrophobic membrane matrix and bring them to their specific target. Whether the N-terminal segment triggers this interaction in a way comparable to the C-terminal helix of PBP remains to be explored.