Structural Evidence for a Dehydrated Intermediate in Green Fluorescent Protein Chromophore Biosynthesis

The acGFPL is the first-identified member of a novel, colorless and non-fluorescent group of green fluorescent protein (GFP)-like proteins. Its mutant aceGFP, with Gly replacing the invariant catalytic Glu-222, demonstrates a relatively fast maturation rate and bright green fluorescence (λex = 480 nm, λem = 505 nm). The reverse G222E single mutation in aceGFP results in the immature, colorless variant aceGFP-G222E, which undergoes irreversible photoconversion to a green fluorescent state under UV light exposure. Here we present a high resolution crystallographic study of aceGFP and aceGFP-G222E in the immature and UV-photoconverted states. A unique and striking feature of the colorless aceGFP-G222E structure is the chromophore in the trapped intermediate state, where cyclization of the protein backbone has occurred, but Tyr-66 still stays in the native, non-oxidized form, with Cα and Cβ atoms in the sp3 hybridization. This experimentally observed immature aceGFP-G222E structure, characterized by the non-coplanar arrangement of the imidazolone and phenolic rings, has been attributed to one of the intermediate states in the GFP chromophore biosynthesis. The UV irradiation (λ = 250–300 nm) of aceGFP-G222E drives the chromophore maturation further to a green fluorescent state, characterized by the conventional coplanar bicyclic structure with the oxidized double Tyr-66 Cα=Cβ bond and the conjugated system of π-electrons. Structure-based site-directed mutagenesis has revealed a critical role of the proximal Tyr-220 in the observed effects. In particular, an alternative reaction pathway via Tyr-220 rather than conventional wild type Glu-222 has been proposed for aceGFP maturation.

Green fluorescent proteins (GFP) and the GFP-like proteins (FPs) have in recent years become very useful tools in many areas of cell biology, biotechnology, and medicine. These proteins exhibit a wide spectral range of fluorescence, from blue to far-red. FPs can be used effectively as single or coupled biomarkers for multicolor labeling of proteins, subcellular compartments, and specific tissue regions. They have enabled monitoring of a variety of characteristics, such as cellular pH and ion concentration; tracking of the expression, intracellular localization, and trafficking of proteins of interest in the cell or whole organism; and following their interactions with other cellular components (1–5).
Chromoproteins form another large group, the non-fluorescent counterparts of FPs, which share with them the principal fold but not the spectral properties (6,7). However, a number of artificially created, genetically engineered variants of chromoproteins do exhibit fluorescence (6,8). Members of both fluorescent and non-fluorescent families possess visible coloration corresponding to an absorption range of 450–610 nm. The extensive diversity of their photophysical characteristics arises mostly from variations in the chemical structure of the internal chromophore group and in the stereochemistry of its adjacent environment.
The first identified member of a novel colorless non-fluorescent group of GFP-like proteins was acGFPL from jellyfish Aequorea coerulescens (9). Its discovery suggested that many other species might express similar colorless FPs. It was found that a glycine replacement of Glu-222, a conserved residue adjacent to the chromophore, plays a key role in creating a fluorescent form of acGFPL. A variant of acGFPL, named aceGFP (commercial name AcGFP1 from Clontech), obtained by mutation of five residues that included mutating residue Glu-222 to a glycine, exhibited bright green fluorescence. The reverse single mutation G222E in aceGFP resulted in the colorless variant aceGFP-G222E that demonstrated no visible coloration and fluorescence, similarly to the wild type acGFPL. Detailed characterization of the purified aceGFP-G222E showed that only a minor fraction (about 3%) of this protein contains mature chromophore in the protonated state (absorption maximum at 390 nm), whereas the major fraction exhibits no absorption/fluorescence in the visible region, indicating the presence of an immature chromophore. Importantly, this immature protein can be converted into a green fluorescent state (similar to aceGFP, with excitation and emission maxima at 480 and 505 nm, respectively) by UV light illumination (λ = 250–300 nm). We have focused on the stereochemical features responsible for these unique properties. We present here the results of a high resolution x-ray study of the mutant variants of the colorless non-fluorescent progenitor acGFPL. The proteins that were investigated include the green fluorescent aceGFP (resolution 1.5 Å), the colorless aceGFP-G222E (resolution 1.14 Å), and UV-photoconverted aceGFP-G222E-(UV) (resolution 1.75 Å). The structural aspects of this study have been supported by site-directed mutagenesis and by spectroscopic characterization of three new mutant variants, namely aceGFP-Y220L, aceGFP-Y220F, and aceGFP-G222E-Y220L.
Prior to crystallization, aceGFP and aceGFP-G222E were transferred to 10 mM Tris-HCl, pH 7.5, 100 mM NaCl, 2.5 mM EDTA buffer and concentrated to 10–15 mg/ml in 10-kDa molecular mass cutoff concentration units (Vivascience). Crystals were obtained by the hanging drop vapor diffusion method at 20°C. Each crystallization drop consisted of 1 μl of protein solution mixed with an equal volume of reservoir solution and was incubated against 500 μl of reservoir solution. The hexagonal bipyramidal crystals of aceGFP, with visible green coloration, were obtained from 0.2 M potassium/sodium tartrate, 0.1 M sodium citrate, 2 M ammonium sulfate, pH 5.6, and the prismatic crystals of aceGFP-G222E were obtained from 0.2 M ammonium acetate, 0.1 M Bis-Tris, pH 6.5, 25% polyethylene glycol 3350. Crystal growth time varied from 1 week to 2 months. A crystal of the colorless aceGFP-G222E, surrounded by the mother liquor in its crystallization drop, was photoconverted to the green fluorescent state by 2 h of UV irradiation (254 nm) at 20°C.
Crystal Structure Determination and Refinement-X-ray diffraction data for aceGFP, aceGFP-G222E, and the UV-photoconverted aceGFP-G222E-(UV) were collected from single crystals flash-cooled in a 100 K nitrogen stream. Prior to cooling, the crystals were transferred to a cryo-protecting solution containing 20% glycerol and 80% reservoir solution. Data were collected with a MAR300 CCD detector at the SER-CAT beamline 22ID (Advanced Photon Source (APS), Argonne National Laboratory, Argonne, IL) and were processed with HKL2000 (10). The crystal structure of aceGFP was solved by the molecular replacement method with MOLREP (11,12), using the coordinates of a GFP monomer (PDB ID: 1GFL (13)) without its chromophore. The refined coordinates of aceGFP were used to solve the aceGFP-G222E structure. Structure refinement was performed with PHENIX (14), alternating with manual correction of the model using COOT (15). Water molecules were located with ARP/wARP (16), and structure validation was performed with PROCHECK (17). Crystallographic data and refinement statistics are presented in Table 1. The coordinates and structure factors of aceGFP, aceGFP-G222E, and aceGFP-G222E-(UV) (after UV irradiation) were deposited in the Protein Data Bank under accession codes 3LVA, 3LVC, and 3LVD, respectively.
Mutagenesis and Spectroscopic Characterization-Preparation of the structure-based variants aceGFP-Y220F/L and aceGFP-G222E-Y220L by site-directed mutagenesis was performed by PCR using the overlap extension method with primers containing the appropriate target substitutions (18). Absorption and excitation-emission spectra of the purified proteins were recorded with a Varian Cary 100 UV-vis spectrophotometer and a Varian Cary Eclipse fluorescence spectrophotometer, respectively.

RESULTS
Overall Structure-A monomer (molecular mass ~27 kDa) of aceGFP and aceGFP-G222E adopts a fold typical for all GFP-like proteins that consists of an 11-stranded β-barrel, with loop caps from both sides and a chromophore embedded in the middle of an internal α-helix going along the β-barrel axis. In the crystalline state, both proteins form dimers composed of two such monomers arranged in an antiparallel fashion, with an angle of ~40° between the β-barrel axes.
Chromophore Structure of the Green aceGFP-The aceGFP variant exhibits bright emission in the green spectral range (λex = 480 nm, λem = 505 nm; Fig. 1A). In contrast to wild type green fluorescent proteins, the important position 222, proximal to the chromophore, is occupied in aceGFP by an unusual Gly residue instead of the highly conserved catalytic Glu. Nevertheless the chromophore structure, matured from the Ser-65-Tyr-66-Gly-67 triad (Fig. 2), is identical to that present in wild type avGFP from Aequorea victoria (PDB ID: 1W7S) and cgGFP from Clytia gregaria (2HPW). The chromophore adopts a two-ring coplanar structure typical for GFP, consisting of a five-membered imidazolone heterocycle having a p-hydroxybenzylidene substituent with the phenolic ring of Tyr-66 in cis-orientation to the Cα-N(66) bond. Tyr-66 exhibits trigonal planar geometry of bonds attached to the Cα atom and a double character of the Cα=Cβ bond, indicating the sp2 hybridization state of both atoms. Ser-65 is characterized by the sp3 hybridization of Cα, the single character of the Cα-N bond, and the trans-configuration of the preceding peptide bond.
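The distinction between trigonal planar (sp2) and tetrahedral (sp3) geometry can be checked numerically from atomic coordinates. The following is a minimal sketch, assuming idealized, hypothetical coordinates rather than values from the deposited structures: it sums the three bond angles between the heavy-atom substituents of a central atom. The sum is 360° for a planar center and about 328° for an ideal tetrahedral one. The function names and the 350° cutoff are illustrative assumptions, not part of the original analysis.

```python
import math

def angle_deg(center, a, b):
    """Return the angle a-center-b in degrees."""
    va = [a[i] - center[i] for i in range(3)]
    vb = [b[i] - center[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def bond_angle_sum(center, substituents):
    """Sum the three angles between three substituents of one atom."""
    a, b, c = substituents
    return (angle_deg(center, a, b)
            + angle_deg(center, b, c)
            + angle_deg(center, a, c))

def classify_center(center, substituents, cutoff=350.0):
    """Label a center as planar or pyramidal (the cutoff is an arbitrary assumption)."""
    if bond_angle_sum(center, substituents) >= cutoff:
        return "trigonal planar (sp2-like)"
    return "tetrahedral (sp3-like)"

# Hypothetical idealized geometries centered on the origin.
ORIGIN = (0.0, 0.0, 0.0)
H = math.sqrt(3.0) / 2.0
PLANAR = [(1.0, 0.0, 0.0), (-0.5, H, 0.0), (-0.5, -H, 0.0)]
TETRAHEDRAL = [(1.0, 1.0, 1.0), (1.0, -1.0, -1.0), (-1.0, 1.0, -1.0)]

print(classify_center(ORIGIN, PLANAR))
print(classify_center(ORIGIN, TETRAHEDRAL))
```

With real data, the center would be the Tyr-66 Cα atom and the substituents its three bonded heavy atoms, read from the coordinate file.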
Chromophore Structure of the Colorless aceGFP-G222E-The replacement of the unusual Gly-222 with the generally conserved Glu residue in aceGFP traps the protein in an immature colorless state (9). Indeed, the crystal structure of aceGFP-G222E showed that the chromophore moiety is not fully matured. Here the posttranslational modification has terminated at the backbone cyclized state characterized by the non-coplanar, non-conjugated two-ring structure, with Tyr-66 in the native (non-oxidized) form. The single character of the Tyr-66 Cα-Cβ and Cβ-Cγ bonds, the tetrahedral geometry of bonds attached to the Cα atom, and the characteristic value of the torsion angle χ1 = −55° around the Cα-Cβ bond all unequivocally define the sp3 hybridization state of both the Cα and the Cβ atoms (Figs. 3 and 4a).
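The side-chain torsion angle χ1 is conventionally defined by the four atoms N, Cα, Cβ, and Cγ. As a minimal sketch of how such an angle is obtained from coordinates, the function below computes a signed dihedral angle with the standard atan2 formula. The coordinates are hypothetical and constructed so that the expected angle is known; they are not taken from the aceGFP-G222E structure, and the helper names are assumptions made for illustration.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle p0-p1-p2-p3 in degrees, in the range (-180, 180]."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n2 = cross(b2, b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(cross(b1, b2), n2)
    return math.degrees(math.atan2(y, x))

def demo_atoms(chi1_deg):
    """Build hypothetical N, CA, CB, CG positions with a chosen chi1."""
    t = math.radians(chi1_deg)
    n = (1.0, 0.0, 0.0)
    ca = (0.0, 0.0, 0.0)
    cb = (0.0, 0.0, 1.5)
    cg = (math.cos(t), math.sin(t), 1.5)
    return n, ca, cb, cg

chi1 = dihedral_deg(*demo_atoms(-55.0))
print(round(chi1, 1))
```

A value near −60° corresponds to one of the staggered rotamers expected for a single (sp3-sp3) Cα-Cβ bond, which is why the torsion angle is informative here.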
The environment of the chromophore in aceGFP-G222E can be defined as 19 side chains arranged in a manner practically identical to those in aceGFP, except for the extra Glu-222 side chain (Fig. 5). Glu-222 and another highly conserved residue, Arg-96, play an active catalytic role in the GFP-like proteins, facilitating oxidative chemistry in the posttranslational modification process (19–22). Most of the proximal residues are involved in an extensive H-bond network interacting with the chromophore. The proximal waters (presumed reaction products of the maturation process) also participate in the network, mediating interactions between the residues. The network is apparently functionally important as a potential proton/electron wire in the maturation process.
Interestingly, the carboxyl group of Glu-222 in the immature aceGFP-G222E exhibits ~2 Å displacement relative to the corresponding position in the wild type green avGFP and cgGFP. This shift is stabilized by an H-bond with the Tyr-220 side chain and can be reproduced with ~20, ~30, and ~35° torsion rotations around the respective Cα-Cβ, Cβ-Cγ, and Cγ-Cδ bonds of the side chain of Glu-222. In (Fig. 6).
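To illustrate how a torsion rotation about a bond translates into a displacement of a distal atom, the sketch below rotates a point about an axis defined by two bonded atoms using Rodrigues' rotation formula. It shows a single rotation on hypothetical coordinates; it does not reproduce the combined three-torsion model described for Glu-222, and all names and numbers in it are assumptions.

```python
import math

def rotate_about_bond(point, axis_start, axis_end, angle_deg):
    """Rotate point about the axis_start -> axis_end line (Rodrigues' formula)."""
    t = math.radians(angle_deg)
    k = [axis_end[i] - axis_start[i] for i in range(3)]
    norm = math.sqrt(sum(c * c for c in k))
    k = [c / norm for c in k]  # unit vector along the bond
    v = [point[i] - axis_start[i] for i in range(3)]
    k_cross_v = [k[1] * v[2] - k[2] * v[1],
                 k[2] * v[0] - k[0] * v[2],
                 k[0] * v[1] - k[1] * v[0]]
    k_dot_v = sum(k[i] * v[i] for i in range(3))
    rotated = [v[i] * math.cos(t)
               + k_cross_v[i] * math.sin(t)
               + k[i] * k_dot_v * (1.0 - math.cos(t))
               for i in range(3)]
    return tuple(rotated[i] + axis_start[i] for i in range(3))

# Hypothetical bond along z and a distal atom 2.5 Å away from the bond axis.
bond_start = (0.0, 0.0, 0.0)
bond_end = (0.0, 0.0, 1.5)
atom = (2.5, 0.0, 3.0)

moved = rotate_about_bond(atom, bond_start, bond_end, 35.0)
shift = math.dist(atom, moved)
print(f"displacement after a 35 degree rotation: {shift:.2f} Å")
```

For a single rotation the displacement equals the chord length 2r·sin(θ/2), where r is the distance of the atom from the rotation axis, so modest torsion changes can move a carboxylate by an ångström or more.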
Chromophore Structure of the UV-photoconverted aceGFP-G222E-Irradiation of the colorless aceGFP-G222E variant with UV light (λ = 250–300 nm) causes its irreversible photoconversion to a green fluorescent state that is similar to the one found in aceGFP (9). In contrast to a non-irradiated precursor, the chromophore structure in the UV-converted variant is fully matured and has spectral properties practically identical to that in aceGFP (Fig. 1, A and B), exhibiting coplanar arrangement of the imidazolone and phenolic rings with a conjugated system of π-electrons. Except for the chromophore itself, photoconverted and non-photoconverted aceGFP-G222E, as well as aceGFP, have practically identical three-dimensional structural arrangement of the residues nearest to the chromophore. No decarboxylation of Glu-222, characteristic for UV-induced photoconversion of WT avGFP (24) and photoactivatable GFP (25,26), was observed in aceGFP-G222E.

[Figure legend fragment: The upper (bold) and lower numbers correspond to those in the crystal structures of acGFPL/aceGFP/avGFP and cgGFP, respectively. The residues in the immediate chromophore environment are shown in red. The residues forming direct bonds and the residues mediated by water H-bonds (d ≤ 3.3 Å) with chromophore are highlighted in yellow and light blue, respectively, and those making van der Waals contacts (d ≤ 3.9 Å) are not highlighted. Two critical sites responsible for coloration are marked by a dollar sign.]
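The distance cutoffs quoted in the figure legend (d ≤ 3.3 Å for H-bonds, d ≤ 3.9 Å for van der Waals contacts) can be applied to any pair of atoms with a simple classifier. The sketch below is illustrative only: the function name, the `polar_pair` flag (whether both atoms are potential H-bond donors or acceptors), and the coordinates are assumptions, and a real analysis would also consider bond angles and atom types.

```python
import math

HBOND_MAX = 3.3  # Å, H-bond cutoff quoted in the figure legend
VDW_MAX = 3.9    # Å, van der Waals cutoff quoted in the figure legend

def classify_contact(atom_a, atom_b, polar_pair):
    """Classify an atom pair by distance only (a simplification)."""
    d = math.dist(atom_a, atom_b)
    if polar_pair and d <= HBOND_MAX:
        return "hydrogen bond"
    if d <= VDW_MAX:
        return "van der Waals contact"
    return "no contact"

# Hypothetical coordinates in Å.
print(classify_contact((0.0, 0.0, 0.0), (2.8, 0.0, 0.0), polar_pair=True))
print(classify_contact((0.0, 0.0, 0.0), (3.6, 0.0, 0.0), polar_pair=False))
print(classify_contact((0.0, 0.0, 0.0), (4.5, 0.0, 0.0), polar_pair=True))
```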
Mutagenesis of Tyr-220-A comparison of the three-dimensional structures of aceGFP/aceGFP-G222E with WT avGFP/cgGFP revealed an important amino acid difference at position 220. This position, located in the immediate environment of the chromophore, is occupied by Tyr and Leu/Ile, respectively (Fig. 2). In aceGFP-G222E, the Tyr-220 phenolic ring shifts the Glu-222 carboxylate by 2 Å from the standard position observed in avGFP/cgGFP. The replacement Y220L in the colorless aceGFP-G222E resulted in a green fluorescent product aceGFP-G222E-Y220L with excitation-emission spectra similar to those of the wild type avGFP (λex = 396 and 494 nm, λem = 508 nm; Fig. 1C). In the green fluorescent aceGFP, lacking Glu-222, substitutions Y220L and especially Y220F resulted in very slowly maturing mutants that required 1-2 weeks of incubation to develop fluorescence. At the same time, matured aceGFP-Y220L and aceGFP-Y220F proteins have excitation and emission spectra indistinguishable from those of parental aceGFP (Fig. 1A). Moreover, molar extinction coefficients and fluorescence quantum yields for these mutants were found to be very close to those of aceGFP (50,000 M−1 cm−1 and 0.55, respectively) (9). Thus, Tyr-220 appears to be important to ensure fast chromophore maturation of aceGFP, but its presence has a negligible effect on the spectral properties of the matured protein.
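The two photophysical parameters quoted for aceGFP can be combined in a short worked calculation. The sketch below assumes the common convention that the brightness of a fluorescent protein is the product of its molar extinction coefficient and quantum yield, and uses the Beer-Lambert law (A = ε·c·l) to estimate concentration from absorbance. The absorbance reading and the 1 cm path length are hypothetical inputs, not measurements from this study.

```python
EPSILON = 50_000.0     # M^-1 cm^-1, extinction coefficient quoted for aceGFP
QUANTUM_YIELD = 0.55   # fluorescence quantum yield quoted for aceGFP

def brightness(epsilon, quantum_yield):
    """Brightness as the product epsilon * quantum yield (assumed convention)."""
    return epsilon * quantum_yield

def concentration_molar(absorbance, epsilon, path_cm=1.0):
    """Beer-Lambert law solved for concentration: c = A / (epsilon * l)."""
    return absorbance / (epsilon * path_cm)

print(brightness(EPSILON, QUANTUM_YIELD))

# Hypothetical absorbance of 0.5 at the excitation maximum in a 1 cm cuvette.
conc = concentration_molar(0.5, EPSILON)
print(f"{conc * 1e6:.1f} micromolar")
```

Because the Y220L and Y220F mutants share nearly the same ε and quantum yield, this product is essentially unchanged for them; only the maturation rate differs.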

DISCUSSION
The immature chromophore of aceGFP-G222E exhibits non-coplanar arrangement of the imidazolone and phenolic rings, indicating the lack of their conjugation (Figs. 3 and 4a). This state is characterized by the non-oxidized single Tyr-66 Cα-Cβ bond, with both atoms in sp3 hybridization. A distinct feature of the colorless aceGFP-G222E, as well as of aceGFP, is the presence of Tyr-220, as opposed to Leu and Ile in WT avGFP and cgGFP, respectively. In aceGFP-G222E, the side chain of Tyr-220 pushes the negatively charged carboxyl group of Glu-222 by ~2 Å from its conventional position. As a result, the H-bonded wire between Glu-222 and the chromophore Tyr-66 is extended by insertion of the interactions with Tyr-220. Such a shift apparently causes local distortion of the electrostatic field, which terminates the chromophore maturation process at an intermediate stage in which the cyclization of the protein backbone has occurred, but Tyr-66 still remains in the native, non-oxidized form. To check the proposed charge effect, we replaced the "pushing" Tyr-220 residue with a smaller Leu, aiming to restore the location of the Glu-222 side chain to that observed in avGFP and cgGFP. Indeed, this replacement resulted in a fully matured product aceGFP-G222E-Y220L, with spectral properties similar to those of the wild type avGFP (Fig. 1C).
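Coplanarity of the two chromophore rings can be quantified as the angle between their ring planes. The following sketch estimates each plane normal with Newell's method for an ordered ring of atoms and reports the angle between the normals. The rings are idealized regular polygons with hypothetical coordinates (a five-membered ring standing in for the imidazolone and a six-membered ring for the phenol), and the 40° tilt is an arbitrary test value, not a measurement from the structures.

```python
import math

def newell_normal(ring):
    """Plane normal of an ordered ring of 3D points (Newell's method)."""
    nx = ny = nz = 0.0
    for i in range(len(ring)):
        x1, y1, z1 = ring[i]
        x2, y2, z2 = ring[(i + 1) % len(ring)]
        nx += (y1 - y2) * (z1 + z2)
        ny += (z1 - z2) * (x1 + x2)
        nz += (x1 - x2) * (y1 + y2)
    return (nx, ny, nz)

def interplanar_angle_deg(ring_a, ring_b):
    """Angle between two ring planes, in the range 0 to 90 degrees."""
    na, nb = newell_normal(ring_a), newell_normal(ring_b)
    dot = abs(sum(p * q for p, q in zip(na, nb)))
    norm = math.sqrt(sum(p * p for p in na)) * math.sqrt(sum(q * q for q in nb))
    return math.degrees(math.acos(min(1.0, dot / norm)))

def regular_ring(n_atoms, tilt_deg=0.0):
    """Hypothetical regular polygon in the xy-plane, tilted about the x axis."""
    t = math.radians(tilt_deg)
    ring = []
    for i in range(n_atoms):
        a = 2.0 * math.pi * i / n_atoms
        x, y = math.cos(a), math.sin(a)
        ring.append((x, y * math.cos(t), y * math.sin(t)))
    return ring

five_ring = regular_ring(5)
print(round(interplanar_angle_deg(five_ring, regular_ring(6)), 1))        # coplanar case
print(round(interplanar_angle_deg(five_ring, regular_ring(6, 40.0)), 1))  # tilted case
```

An angle near 0° corresponds to the conjugated, mature chromophore, whereas a large angle is what the text describes for the immature state.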
UV irradiation of the immature aceGFP-G222E drives the post-translational modification process further and results in a fully matured chromophore having the GFP conventional planar bicyclic structure and the conjugated system of π-electrons. The conserved Glu-222, Arg-96, and possibly Tyr-220 are expected to participate in catalysis of this process. The UV-converted variant exhibits bright green fluorescence similar to that observed in green aceGFP. Surprisingly, despite the critical Glu-222/Gly-222 difference, both proteins have almost identical spectral characteristics (λex = 480 nm, λem = 505 nm; Fig. 1, A and B). The absence of Glu-222 or its displacement from the position standard in GFPs correlates with the excitation peak at ~480 nm, which is presumably due to the deprotonated state of the chromophore (27). Indeed, replacement of Glu-222 with a Gly in WT avGFP shifts the excitation maximum in spectra from 398 to 481 nm, leaving the emission peak at ~507 nm (28). On the other hand, the proposed restoration of the conventional position of the Glu-222 side chain in aceGFP-G222E-Y220L shifts the excitation peak from 480 to 390 nm, characteristic for WT avGFP (28) (Fig. 1C). The latter property has been proposed to relate to the protonated chromophore state, stabilized by electrostatic repulsion between the Glu-222 carboxylate and the chromophore (27,29,30).
A distinctive feature of aceGFP is the presence of an unusual Gly residue substituting for the catalytic Glu-222, invariant in all wild type GFP-like proteins. Glu-222 was found to be one of the key residues in biosynthesis of the GFP chromophore. It has been suggested that this residue plays the role of a general base, facilitating proton abstraction from the chromophore moiety (19–22). The absence of Glu-222 turned out not to be critical for complete maturation of the aceGFP chromophore to a conventional planar π-conjugated bicyclic system (Fig. 4a), which might indicate the possibility of an alternative path for the chemical reaction. In this particular variant, the hydroxyl of Tyr-220 resides within ~1 Å distance from the position occupied by one of the oxygens of the carboxylate of Glu-222 in WT avGFP and apparently acts in a similar way as a proton acceptor.
To check a possible role of this residue in aceGFP, we substituted Tyr-220 by Leu, characteristic for avGFP, and by Phe. The resultant variants, aceGFP-Y220L and aceGFP-Y220F, have the spectra and the intrinsic brightness similar to aceGFP, but in contrast to the latter protein, show a very slow maturation rate. This experiment supports our suggestion of an alternative catalytic pathway of chromophore maturation via Tyr-220, rather than via conventional Glu-222.
The details of the mechanism of chromophore formation in GFPs have been a subject of considerable controversy. Several crystal structures trapping various intermediate states have been reported, leading to a number of proposed reaction schemes of chromophore biosynthesis (21,31,32). One of the plausible pathways of chromophore maturation (Fig. 7) has been proposed by Barondeau et al. (32). Currently, three intermediates, I, III, IV, and the mature state VI of the chromophore from the proposed pathway, have been experimentally detected by x-ray crystallography. The precyclized state I was trapped in the anaerobically prepared GFP variants with modified chromophore-forming sequences Gly-65-Gly-66-Gly-67 and Ala-65-Ser-66-Ala-67 (19,33). The dehydrated intermediate III has now been observed by us in the non-modified chromophore Ser-65-Tyr-66-Gly-67 of the colorless aceGFP-G222E. The distinguishing feature of this state is the intact geometry of Tyr-66, characterized by sp3 hybridization of both the Cα and the Cβ atoms and by the tetrahedral arrangements of their covalent bonds (Fig. 4a). The structure corresponding to the following partially oxidized enolate intermediate IV with sp2 and sp3 hybridization of the respective Cα and Cβ atoms (Fig. 4b) has been detected for the blue fluorescent BFPsol matured variant after reducing the previously mutated chromophore Thr-65-His-66-Gly-67 with dithionite (32).
Pouwels et al. (21) proposed a different pathway of chromophore formation in which the dehydration reaction of the hydroxylated imidazolone is the last chemical step leading to a mature chromophore. The state preceding the matured form has been attributed to the non-dehydrated chromophore structure (Fig. 4c) trapped in the crystal of the colorless EGFP variant. This variant has the chromophore-forming sequence Thr-65-Leu-66-Gly-67 with an unusual non-aromatic Y66L substitution (31). The observed form reveals the hydroxylated imidazolone and partially oxidized Leu-66 Cα-Cβ bond with sp2 and sp3 hybridization of the respective atoms. The authors' assignment of the Cα(Leu-66) atom hybridization to sp2 is consistent with the observed trigonal planar geometry of three attached bonds but somewhat disagrees with their single character. This non-dehydrated state with the oxidized sp2 Cα(66) atom cannot succeed or precede our dehydrated form III with the non-oxidized sp3 Cα(66) atom (Fig. 7). This might indicate the complexity of chromophore biosynthesis, suggesting more than one possible reaction pathway defined by the concrete chromophore-forming sequence and the immediate stereochemical environment. The unusual aliphatic nature of the chromophore Leu-66 in the EGFP variant could also make a dramatic difference in the maturation process when compared with that in GFP-like proteins that contain an aromatic residue at position 66.