Structure of the Homodimeric Glycine Decarboxylase P-protein from Synechocystis sp. PCC 6803 Suggests a Mechanism for Redox Regulation*

Background: Glycine decarboxylase (P-protein) is essential for many vital processes, including nucleotide biosynthesis and photosynthesis.
Results: Disulfide formation drives conformational changes that inactivate the cyanobacterial P-protein, a model for plant and human glycine decarboxylase.
Conclusion: Glycine decarboxylase activity is regulated by cellular redox homeostasis.
Significance: This is the first molecular model for redox regulation of glycine decarboxylase.
Glycine decarboxylase, or P-protein, is a pyridoxal 5′-phosphate (PLP)-dependent enzyme in one-carbon metabolism of all organisms, in the glycine and serine catabolism of vertebrates, and in the photorespiratory pathway of oxygenic phototrophs. P-protein from the cyanobacterium Synechocystis sp. PCC 6803 is an α2 homodimer with high homology to eukaryotic P-proteins. The crystal structure of the apoenzyme shows the C terminus locked in a closed conformation by a disulfide bond between Cys972 in the C terminus and Cys353 located in the active site. The presence of the disulfide bridge isolates the active site from solvent and hinders the binding of PLP and glycine in the active site. Variants produced by substitution of Cys972 and Cys353 by Ser using site-directed mutagenesis have distinctly lower specific activities, supporting the crucial role of these highly conserved redox-sensitive amino acid residues for P-protein activity. Reduction of the 353-972 disulfide releases the C terminus and allows access to the active site. PLP and the substrate glycine bind in the active site of this reduced enzyme and appear to cause further conformational changes involving a flexible surface loop. The observation of the disulfide bond that acts to stabilize the closed form suggests a molecular mechanism for the redox-dependent activation of glycine decarboxylase observed earlier.

In addition to the pyridoxal 5′-phosphate (PLP)-dependent P-protein, the glycine cleavage system (GCS) requires two more enzymes, namely the tetrahydrofolate (THF)-dependent T-protein (EC 2.1.2.10; aminomethyltransferase) and the NAD+-dependent L-protein (EC 1.8.1.4; dihydrolipoamide dehydrogenase). As the fourth component, a small lipoylated protein, the H-protein (initially named hydrogen carrier protein), interacts successively via its lipoyl arm with the P-, T-, and L-proteins (Fig. 1). The GCS reaction cycle comprises three reactions. First, in a transaldimination reaction (11), glycine reacts with the PLP on the P-protein to form an external aldimine that loses the carboxyl group as CO2 and donates the remaining aminomethylene moiety to the oxidized lipoamide arm of H-protein. Next, T-protein uses the S-aminomethyl-dihydrolipoyl H-protein and THF to produce CH2-THF, ammonia, and dihydrolipoyl H-protein. Finally, L-protein closes the GCS reaction cycle by the NAD+-dependent reoxidation of the H-protein's dihydrolipoyl arm to the dithiolane form (reviewed in Refs. 12 and 13). As a result, one molecule each of glycine, THF, and NAD+ is converted into CO2, NH3, CH2-THF, and NADH. The produced CH2-THF is used as a one-carbon donor in a variety of biosynthetic pathways, including the synthesis of serine and of S-adenosylmethionine as another universal one-carbon donor (reviewed in Ref. 14). The NADH is reoxidized to NAD+ by the mitochondrial electron transport chain with or without subsequent ATP production (15).
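The net stoichiometry stated above can be checked by summing the three partial reactions and cancelling the H-protein intermediates. The following is a minimal sketch of that bookkeeping; the species labels (for example "H-lipoyl(ox)") are illustrative names chosen here, not nomenclature from the text, and the proton released together with NADH is deliberately left out.

```python
from collections import Counter

# Partial reactions as (substrates, products). Labels are illustrative only.
P_PROTEIN = ({"glycine": 1, "H-lipoyl(ox)": 1},
             {"CO2": 1, "H-aminomethyl": 1})
T_PROTEIN = ({"H-aminomethyl": 1, "THF": 1},
             {"CH2-THF": 1, "NH3": 1, "H-lipoyl(red)": 1})
L_PROTEIN = ({"H-lipoyl(red)": 1, "NAD+": 1},
             {"H-lipoyl(ox)": 1, "NADH": 1})


def net_reaction(steps):
    """Sum partial reactions; species that cancel (the H-protein forms) drop out."""
    balance = Counter()
    for substrates, products in steps:
        balance.subtract(substrates)  # consumed species count negative
        balance.update(products)      # produced species count positive
    consumed = {s: -n for s, n in balance.items() if n < 0}
    produced = {s: n for s, n in balance.items() if n > 0}
    return consumed, produced


consumed, produced = net_reaction([P_PROTEIN, T_PROTEIN, L_PROTEIN])
print("consumed:", consumed)
print("produced:", produced)
```

Because the oxidized, aminomethylated, and reduced forms of the H-protein each appear once as substrate and once as product, the H-protein behaves as a recycled carrier and does not appear in the net reaction.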
The ubiquitous occurrence of GCS already indicates its crucial importance for cellular metabolism. Indeed, the absence or malfunction of GCS results in fatal metabolic dysfunctions, at least in eukaryotes. In humans, for example, such defects cause the inborn disease glycine encephalopathy (nonketotic hyperglycinemia), which leads to accumulation of glycine in the central nervous system. This terminal disease very often results from mutations in the P-protein-encoding gene (16) but can also be due to defective T-protein or H-protein genes (17). In plants, the artificial deletion of the two P-protein genes was lethal for seedlings of the model plant Arabidopsis thaliana during or shortly after germination (18). Plants generate very large daily amounts of glycine in the photorespiratory cycle, an essential complement to the photosynthetic Calvin-Benson cycle (reviewed in Ref. 19). This particular metabolic pathway requires a very high glycine-decarboxylating activity. Accordingly, the GCS proteins amount to more than 30% of total matrix protein in green leaf mitochondria (20). Correspondingly, the artificial reduction of leaf GCS activity distinctly impairs the growth of plants in air (21), whereas increased GCS activity resulted in improved growth (22). Deletion of GCS in cyanobacteria, the predecessors of plant chloroplasts, is less catastrophic (23). Such mutants also accumulate glycine, which is somewhat toxic due to chelation of magnesium ions, but a bypass to the GCS reaction, the glycerate pathway, allows nearly normal growth (24,25).
It has been suggested that the animal and plant GCS proteins form a labile enzyme complex loosely associated with the mitochondrial inner membrane (e.g. see Refs. 13 and 26). Although the formation of a moderately stable complex between two H-protein molecules per P-protein dimer was demonstrated (27) and the interaction between H- and T-protein was studied (28), the supramolecular structure of such a hypothetical complex comprising all GCS proteins is not known. On the other hand, three-dimensional structures of several individual GCS proteins were reported (29-31), including that of the P-protein from Thermus thermophilus HB8 (32). In contrast to eukaryotic P-proteins, which are all α2 homodimers of about 200 kDa (e.g. see Refs. 33-35), the P-protein from T. thermophilus consists of two smaller polypeptides, α and β, which assemble into a heterotetrameric (αNβC)2 P-protein of about 200 kDa. Related P-proteins are present in other prokaryotes, such as the anaerobic Eubacterium acidaminophilum (36), but the majority of known P-proteins from aerobic bacteria, including Escherichia coli (37) and all cyanobacteria (38,39), are homodimeric enzymes closely related to their eukaryotic homologs.
Following the biochemical characterization (38) of the homodimeric P-protein from the model cyanobacterium Synechocystis sp. PCC 6803 (hereafter referred to as Synechocystis) and preliminary structural characterization (40), we present here the first three-dimensional structure of a homodimeric P-protein. This knowledge is of particular interest because the Synechocystis P-protein is closely related to eukaryotic P-proteins. We also present structures of ternary complexes with the cofactor PLP and the substrate glycine. Based on the structural information, three cysteine residues that are potentially involved in redox-dependent regulation of P-protein and hence GCS activity were replaced by site-specific mutagenesis, and the P-protein variants were compared with the wild-type protein in biochemical assays.

FIGURE 1. [...] Next, the T-protein catalyzes the transfer of the methylene group to THF, thereby releasing NH3. Finally, the L-protein reoxidizes the reduced lipoamide group of the H-protein. B, outline of the involvement of the PLP cofactor in the decarboxylation reaction catalyzed by the P-protein. PLP first forms an internal aldimine with a lysine residue in the active site (Lys726 in the Synechocystis enzyme). Next the C4′ atom of PLP is attacked by glycine to form the external aldimine, thereby releasing the lysine. Finally CO2 is released from the external aldimine, leaving a quinoid intermediate ready to bind lipoamide.
Protein Expression, Directed Mutagenesis, and Purification-The P-protein gene (slr0293) from Synechocystis was cloned in the bacterial overexpression vector pBAD-slr0293, expressed, and purified as described elsewhere (38,40). For the exchange of specific cysteines for serine by directed mutagenesis, we amplified the whole construct with Phusion polymerase (High-Fidelity DNA Polymerase, New England Biolabs) by PCR using phosphorylated primer pairs (C353S, forward (ATATCAGCACTGCCCAGGTTCTCTT) and reverse (TACTAGTGGCCTTGTCCCGACGGAT); C972S, forward (TAGTTAGCTCCTGTGAAGGTATGGA) and reverse (AATGGCGATCACCGTAGGTATTATT); C972S/C974S, forward (CTCCAGCGAAGGTATGGAGG) and reverse (CTAACTAAATGGCGATCACC)). After circularization and multiplication in E. coli strain LMG194, the nucleotide sequence was confirmed for the regions between the restriction sites named below. From the circularized plasmids carrying the mutation in Cys353, an XhoI-Bsp1407I fragment was excised and used to replace the corresponding sequence in the above mentioned wild-type slr0293 overexpression vector. Correspondingly, Bsp1407I-PvuII fragments carrying the individual or combined mutations in Cys972 and Cys974 were used to introduce these mutations into the wild-type overexpression vector or the C353S variant.
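As a sanity check on the primer design, the sequence across each ligation junction can be translated to see which codons the primers introduce. The sketch below assumes that circularization joins the 5′ ends of the two phosphorylated primers, so that the reverse complement of the reverse primer directly precedes the forward primer on the coding strand, and that the joined sequence is read from its first base. Both are assumptions made here for illustration, chosen because they reproduce the C-terminal peptide quoted in the text. The primer sequences are taken from the text with the line-break hyphens removed.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# (forward, reverse) primer pairs as given in the text.
PRIMERS = {
    "C353S": ("ATATCAGCACTGCCCAGGTTCTCTT", "TACTAGTGGCCTTGTCCCGACGGAT"),
    "C972S": ("TAGTTAGCTCCTGTGAAGGTATGGA", "AATGGCGATCACCGTAGGTATTATT"),
    "C972S/C974S": ("CTCCAGCGAAGGTATGGAGG", "CTAACTAAATGGCGATCACC"),
}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the first base; trailing bases are ignored."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def junction_peptide(forward, reverse):
    """Peptide encoded across the assumed blunt-end ligation junction."""
    return translate(reverse_complement(reverse) + forward)


for name, (fwd, rev) in PRIMERS.items():
    print(name, junction_peptide(fwd, rev))
```

Under these assumptions the C972S junction contains GDRHLVSSCEGM, that is, the peptide 966-978 quoted in the Results with Cys972 replaced by Ser, and the C353S junction places Ser directly before Thr, consistent with Thr354 following position 353.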
Activity Assays-The recombinant P- and H-proteins were affinity-purified as described earlier (38) and equilibrated in 20 mM Tris/HCl (pH 7.8) by using prepacked PD-10 columns (Sephadex G-25 M, GE Healthcare). The P-protein preparations were concentrated in Vivaspin 20 centrifugation units (Sartorius, Göttingen, Germany). After the addition of glycerol to a final concentration of 50% (v/v), protein concentrations were determined (41), and aliquots were stored at −80°C. P-protein activity was measured by following the glycine-14C-bicarbonate exchange reaction (2) after a preincubation with 1 mM ("reduced" state) or no ("oxidized" state) dithiothreitol (DTT) for 5 min at room temperature. Standard assays at 30°C contained 100 mM sodium phosphate (pH 6.0), 0.1 mM PLP, 18 mM glycine, 1 mM or no DTT, 18.5 μM H-protein, 0.6 μM P-protein dimer, and 3 mM NaH14CO3 (1.62 μCi) in a total volume of 300 μl for each time point. All enzyme assays were done in triplicate and repeated at least once with independent protein preparations. Immediately after starting the reactions by the addition of NaH14CO3 and every 10 min thereafter for a total time of 40 min, aliquots of 250 μl were mixed with 80 μl of 50% trichloroacetic acid. These samples were dried overnight to remove remaining NaH14CO3, dissolved in 500 μl of water, and analyzed in a liquid scintillation counter following a standard protocol. Rates were corrected by control reactions without glycine for all time points.
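The assay composition translates into absolute amounts per reaction by simple arithmetic. The sketch below reads the protein concentrations as micromolar, the volumes as microliters, and the label as microcuries (the only physically sensible reading of the values given). The variable names and the derived quantities (molar excess of H-protein, specific radioactivity, final acid concentration) are illustrative calculations made here, not values reported by the authors.

```python
ASSAY_VOLUME_UL = 300.0  # total volume per time point, from the text


def nmol_in_assay(conc_uM, volume_ul=ASSAY_VOLUME_UL):
    """Convert a micromolar concentration into nmol present in the assay.

    umol/l x ul gives pmol, so divide by 1000 to obtain nmol.
    """
    return conc_uM * volume_ul / 1000.0


h_protein_nmol = nmol_in_assay(18.5)      # 18.5 uM H-protein
p_dimer_nmol = nmol_in_assay(0.6)         # 0.6 uM P-protein dimer
bicarbonate_nmol = nmol_in_assay(3000.0)  # 3 mM NaH14CO3

# Molar excess of H-protein over P-protein dimer (independent of volume).
h_per_p_dimer = h_protein_nmol / p_dimer_nmol

# Specific radioactivity of the bicarbonate pool: 1.62 uCi spread over the pool.
specific_radioactivity_uCi_per_umol = 1.62 / (bicarbonate_nmol / 1000.0)

# Final trichloroacetic acid concentration after quenching 250 ul with 80 ul of 50%.
tca_final_percent = 50.0 * 80.0 / (250.0 + 80.0)

print(f"H-protein: {h_protein_nmol:.2f} nmol, P-protein dimer: {p_dimer_nmol:.2f} nmol")
print(f"H-protein per P-protein dimer: {h_per_p_dimer:.1f}")
print(f"Specific radioactivity: {specific_radioactivity_uCi_per_umol:.2f} uCi/umol")
print(f"Final TCA: {tca_final_percent:.1f}%")
```

The roughly 30-fold molar excess of H-protein over P-protein dimer follows directly from the stated concentrations.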
Soaking of Ligands-Crystals of the binary complex with PLP were used to prepare ligand complexes by soaking crystals at 20°C in solutions of the ligand. Glycine (1.5 μl, 3 M) was added to the drop containing the crystal (initial drop size 5 μl). After 60 min, the crystal was transferred for a few seconds to a cryosolution containing 0.3 M glycine, 17.5% ethylene glycol, 1.6 M (NH4)3PO4, 5 mM TCEP, 0.2 mM PLP; transferred to a nylon CryoLoop (Hampton Research); and then flash-cooled in liquid nitrogen and maintained at 100 K for data collection. Alternatively, lipoic acid (50 mg ml−1 in ethanol) was added to the drop prior to the addition of 1.5 μl of 3 M glycine. Soaking and flash-cooling were performed as described previously.
Data Collection and Structure Determination-X-ray diffraction data for the apoenzyme (without PLP) were collected to 2.0 Å resolution on a CCD detector (ADSC Q315r) at beam line ID14-4, European Synchrotron Radiation Facility (ESRF, Grenoble, France) at a wavelength of 0.98 Å. The crystals were soaked for a few seconds in the reservoir solution containing 20% (v/v) ethylene glycol as a cryoprotectant, transferred to a nylon CryoLoop (Hampton Research), and then flash-cooled in liquid nitrogen and maintained at 100 K for data collection. A data set to 2.0 Å resolution was collected from one single (form 1) crystal, and the data were processed with DENZO and SCALEPACK (42). The diffraction data statistics are shown in Table 1. The structure was solved by molecular replacement with PHASER (43) using one heterodimer (αNβC) of the P-protein from T. thermophilus (PDB code 1WYT (32)) as a search model. The first crude model was improved by automatic chain tracing in ARP/wARP (44) using the sequence of the Synechocystis enzyme. The structure was further refined by manual rebuilding using O (45) alternated with refinement in REFMAC5 (46) and PHENIX (47). Solvent molecules were added using ARP/wARP (48) and PHENIX and were manually inspected in O. Assignment of water molecules was made based on peak heights of residual electron density, hydrogen bonding patterns, and B-factors.
Diffraction data for the holoenzyme (binary complex containing PLP and ternary complexes containing PLP and glycine) were collected on a CCD detector (ADSC Q315r) at beam line ID14-1, ESRF, to a maximum resolution of 1.6 Å from one single crystal. The data were processed with DENZO and SCALEPACK (42), and the structures were solved by molecular replacement using PHASER (43) with the ligand-free structure as a search model. The models were improved by iterative manual rebuilding of missing loop regions and the C terminus in O (45) and refinement with REFMAC5 (46) and PHENIX (47). Solvent molecules were added using ARP/wARP (48). Density for PLP was observed close to the side chain of Lys726, allowing the building of the cofactor into the model. Further refinement of the binary complex indicated that a mixture of PLP and phosphate ion from the crystallization buffer was bound in the active site. Several crystals were analyzed and were found to contain phosphate to various extents, ranging from 50 to 100%. For this reason, these data were not further pursued.
Diffraction data for the ternary PLP-glycine complexes were collected at beam line ID14-1, ESRF, to resolutions of 1.8 and 1.9 Å. Processing, phasing, and model refinement were performed as described above. Electron density for PLP indicated that the cofactor bound exclusively at this site in the ternary complex. Toward the end of refinement, glycine was built into positive difference density close to PLP. The data collection and refinement statistics are presented in Table 1.
The structures were evaluated using PROCHECK (49) and MOLPROBITY (50). Analysis of dihedral angles indicated that Ser165 is in a disallowed region of the Ramachandran diagram. However, inspection of electron density showed this residue in well defined density in both subunits. Amino acid sequence alignments were made with ClustalW (51). Secondary structure assignments were made with DSSP (52), and structure-based sequence alignments were carried out using the least-square alignment function in the program O (45). Model figures were prepared with PyMOL (53).
Electrostatic Surface Analysis-Electrostatic potentials were calculated in PyMOL (53) using the vacuum electrostatics method. The analysis was performed on coordinates of one monomer each of (i) the apoenzyme with a closed C terminus and an extended mobile loop and (ii) the PLP complex with the C terminus in an open conformation and a closed mobile loop. Solvent molecules were removed from the coordinates before the analysis. The model of the apoprotein was extended with residues 964-965 in molecule A, which were built into trace electron density to provide a more realistic surface for the analysis. These residues are not present in the final model submitted to the PDB.

RESULTS
Recombinant and enzymatically active homodimeric P-protein from Synechocystis (38,40) crystallized in an orthorhombic space group (P2₁2₁2₁) (40) under mildly reducing conditions (10 mM 2-mercaptoethanol). The structure solved from these (form 1) crystals shows that the C terminus (residues 963-978) is folded over the active site opening and secured by a disulfide bond between Cys972 in the C terminus and Cys353 located in the active site cavity (Fig. 2). This effectively isolates the active site from solvent and excludes the binding of ligands (cofactor and substrates) in the active site. Form 1 crystals thus contain the apoenzyme in an inactive, locked enzyme form. By using a more powerful reducing agent, TCEP, a trigonal crystal form (space group P321) was obtained that diffracts X-rays to 1.6 Å resolution. In these (form 2) crystals, the disulfide is reduced, the C terminus is free and folded away from the active site entrance, and the enzyme is able to bind ligands, such as the cofactor PLP and the substrate glycine. This is the active, unlocked enzyme form.
Aside from the C terminus, the loop comprising residues 337-349 (named the mobile loop according to Nakai et al. (32)) also adopts a number of different conformations: disordered, extended, and closed (see below). Apart from these two striking differences, continuous density was observed for most of the protein in both crystal forms. There was weak or no density for the first 19 N-terminal residues, for a surface loop (residues 756-766), and for the six C-terminal residues. Contents of the refined models are summarized in Table 1.
Quaternary Structure-The Synechocystis P-protein has a homodimeric quaternary structure, (αNC)2 (Fig. 3), and each monomer harbors one active site. The N-terminal and C-terminal halves of the αNC monomer correspond to the αN and βC subunits, respectively, of the heterodimer of the tetrameric enzyme (32) (Figs. 4 and 5). The bulk of the αN and αC parts of the homodimeric enzyme fold into separate units that pack tightly. The N-terminal arms of each unit wrap around the opposite unit to make a compact monomer. Two monomers pack tightly to form the active dimer. The formation of the dimer is driven by the hydrophobic interaction between mainly helical elements across the local two-fold axis (Fig. 3). The dimer interface, calculated following Lee and Richards (54) with a probe radius of 1.4 Å, is 3894 Å² and comprises 6.7% of the dimer total accessible surface area.
The Monomer-The homodimeric P-protein from Synechocystis shows about 35% sequence identity with the heterotetrameric enzyme from T. thermophilus (BLAST 2). Comparison of the homodimeric (αNC)2 and the heterotetrameric (αNβC)2 structures reveals that the fold of the proteins is highly similar (Figs. 4 and 5), but there are some striking differences. Superposition of 373 equivalent Cα atoms of the homodimeric structure with the αN monomer from T. thermophilus gives a root mean square (r.m.s.) deviation of 1.55 Å, whereas superposition of 416 Cα atoms of the homodimeric structure with the βC monomer gives an r.m.s. deviation of 1.23 Å. The N-terminal αNC-N domain is conserved between the heterotetrameric and homodimeric enzymes and consists of a central β-sheet of seven strands flanked by helices on both sides. The N terminus is extended by 34 residues, half of which are visible in the final electron density maps. Residues 17-26 form an α-helix (α1) that packs against the first helix (10, residues 471-476) in the C-terminal part (αNC-C) of the adjacent subunit of the homodimer. Residues 33-37 form an extra helix not present in the T. thermophilus enzyme, but apart from that, the N-terminal residues 38-100 adopt a conformation similar to that of the residues of the N-terminal arm of the αN subunit of the heterotetramer. The domain ends with a twisted β-hairpin (strands β9 and β10). Residues 337-349 of the loop between strand β10 and helix α13 are able to adopt a number of different conformations and appear to contribute to the closing of the active site during catalysis (see below). The remainder of the N-terminal part of the subunit, consisting of three helices (14-16) and an antiparallel β-sheet, is structurally homologous to the αN of the T. thermophilus enzyme. Helix 16, the last helix of αNC-N, connects to the C-terminal part of the αNC subunit and, as a consequence, is significantly shifted compared with the corresponding helix in the T. thermophilus enzyme. Residues 461-478 form a linker connecting the N- and C-terminal parts, and [...]. The structures deviate considerably at the C terminus. In the Synechocystis enzyme, helix α30 is extended by 7 residues at its C terminus, and the loop preceding helix α31 is extended by 5 residues. Residues 925-956 of the C terminus of the Synechocystis enzyme contain three short helices not present in the T. thermophilus enzyme. The following 22 residues, 957-978, constitute the mobile C terminus, which in the ligand-bound open structures is very similar to the C terminus of the T. thermophilus enzyme. The last 6 residues of the C terminus are not visible in the electron density maps. In the apoenzyme, helix α34 unfolds partly, covers the active site, and is secured by a disulfide bond between the C-terminal Cys972 and Cys353. Cys353 is located in the active site within van der Waals distance to the glycine substrate in the ternary complex (see below).
Transition from the Apo to the Holo Structure-The initial crystallization solution (40) contained 10 mM 2-mercaptoethanol, and the cofactor PLP was added to the enzyme solution prior to crystallization. However, there was no trace of bound PLP in the active site in the final electron density maps after refinement. Instead, a network of solvent molecules was observed lining the active site. A hydrogen bond was observed between Lys726, to which PLP normally binds and where transaldimination occurs (see below), and the hydroxyl group of Tyr131. Opposite to Lys726 in the active site, there was strong electron density close to the Sγ atom of Cys353, indicating that this cysteine was engaged in a disulfide bond with another cysteine residue (Fig. 2). This cysteine, identified as Cys972, is part of a short C-terminal peptide, 966-GDRHLVCSCEGME-978 in one subunit of the dimer (subunit A) and 968-RHLVCSCE-975 in the second subunit (subunit B), for which there was well defined electron density. The C-terminal peptide covers the active site, effectively isolates it from solvent, and presumably hindered the access of the cofactor to the active site during the initial crystallization experiment.
In the apoenzyme, electron density for two other cysteines, Cys642 and Cys974, indicated that these residues were modified. The additional density was interpreted as mixed disulfides with 2-mercaptoethanol. Although this suggests that these cysteines are redox-active, we failed to find a suitable disulfide partner in these cases. The Sγ atoms of Cys642 and Cys974 are located more than 15 Å apart, making it unlikely that they would be part of the same disulfide. Cys642 is located close to the surface of the molecule, and there are no other cysteines within reasonable distance. Cys974 has its Sγ atom positioned only 6 Å from the Sγ atom of the active site residue Cys530, but there was no sign of oxidation or modification of Cys530 in any of the electron density maps. We note also that the position of the Sγ atom of Cys974 in the apoenzyme is located less than 2 Å from the position of the carboxyl group of the glycine substrate in the holoenzyme structure.
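The distance argument used above to exclude disulfide partners can be expressed as a simple geometric test. The sketch below is illustrative only: the coordinates are hypothetical placeholders placed along one axis to reproduce the separations quoted in the text (6 Å and 15 Å), and the 2.5 Å cutoff is a permissive assumption chosen here, set slightly above the typical S-S bond length of about 2.05 Å.

```python
import math

# Permissive upper limit for an S-S bond, in Å (assumption; typical bond ~2.05 Å).
S_S_CUTOFF = 2.5


def sg_distance(atom_a, atom_b):
    """Euclidean distance between two Sγ positions given as (x, y, z) in Å."""
    return math.dist(atom_a, atom_b)


def plausible_disulfide(atom_a, atom_b, cutoff=S_S_CUTOFF):
    """True if two Sγ atoms are close enough to be covalently bonded."""
    return sg_distance(atom_a, atom_b) <= cutoff


# Hypothetical coordinates reproducing the separations quoted in the text.
cys974_sg = (0.0, 0.0, 0.0)
cys530_sg = (6.0, 0.0, 0.0)    # "only 6 Å" from Cys974
cys642_sg = (15.0, 0.0, 0.0)   # "more than 15 Å" from Cys974 (lower bound used)

for name, partner in (("Cys530", cys530_sg), ("Cys642", cys642_sg)):
    d = sg_distance(cys974_sg, partner)
    print(f"Cys974-{name}: {d:.1f} Å, disulfide plausible: "
          f"{plausible_disulfide(cys974_sg, partner)}")
```

A test like this only addresses a single static model; a large backbone rearrangement could bring distant cysteines together, which is why the absence of modification in the electron density is reported as additional evidence.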
To reduce the 353-972 disulfide, we switched to the powerful reducing agent TCEP. PLP was included in the purification buffers to ensure that the cofactor was bound. The resulting protein crystallized in a different space group with significantly better diffraction. There was well defined density for PLP in the active site (Fig. 6), indicating that it was covalently linked to Lys726 via an internal aldimine bond (Schiff base linkage). Inspection of the C-terminal arm showed well defined density for residues 963-978, which were folded away and stabilized at the surface of the protein, thereby creating open access to the active site (Fig. 2). The Cys353-Cys972 disulfide was reduced, and there was well defined density for both free Sγ atoms. Breaking of the 353-972 disulfide bond allows the side chain of Cys353 to flip about 120° and the loop region carrying Cys353 to move ~2-3 Å toward the active site. Apart from the 353-972 disulfide, the C terminus is stabilized in its closed conformation by a hydrogen bond between Asn318 and Glu975. Upon reduction of the 353-972 disulfide, the contact between residues 318 and 975 is also broken, allowing the C-terminal residues 963 and onward to fold away from the active site. In the open conformation, residues 963-969 adopt a helical conformation (Fig. 2), and the C terminus is anchored to the surface by several hydrogen bonds. As shown earlier (40) and below, the Synechocystis P-protein requires the presence of a reducing agent (usually DTT) in assays for full activity, and the structure presented here offers a structural explanation. There is no equivalent to Cys972 in the heterotetrameric enzyme, and the C terminus adopts an open conformation in the crystal structures of the apo and holo forms of the T. thermophilus enzyme.
FIGURE (sequence alignment). [...] 1WYV). The sequences were obtained from the NCBI database. The sequence of Synechocystis P-protein was compared with homologous proteins from A. thaliana, Homo sapiens, and T. thermophilus. Residue numbering is according to the Synechocystis sequence. Because the N-terminal target peptides were omitted, the residue numbering of A. thaliana P-protein starts from position 40, and the H. sapiens numbering starts from position 57. To facilitate alignment, peptides αN and βC of T. thermophilus were fused to one polypeptide. Conserved residues are boxed, strictly conserved residues have a red background, residues well conserved within a group are indicated by red letters, and the remainder are in black letters. Symbols above blocks of sequences correspond to the secondary structure of the Synechocystis enzyme: α, α-helix; β, β-strand; η, 3₁₀-helix. Gray asterisks above the sequence indicate the presence of alternate conformations. The secondary structure assignment for the Synechocystis enzyme was made using DSSP (52). The figure was prepared using ESPript (70).
A long loop, residues 337-349, is observed in two different conformations and also appears to participate in the transition from oxidized apoenzyme to reduced holoenzyme. In the apo structure, the loop adopts an extended conformation that stabilizes the C terminus in its closed conformation. This is observed in one of the two subunits (subunit A), whereas in the second subunit, the loop is disordered and not visible in the electron density maps. Upon reduction of the 353-972 disulfide and the folding of the C terminus away from the active site, the extended loop refolds and closes over the opening of the active site, shielding it partly. This conformation is observed in one of the subunits (subunit A) in the crystal structure of the ternary complex with PLP and glycine (Fig. 7). In the second subunit, residues 334-LALQTR-337 at one end and Thr349 at the other end of the loop are visible and assume a similar conformation. An identical conformation of the mobile loop is observed in the ligand-bound forms of the T. thermophilus enzyme (PDB codes 1WYV and 1WYU) and is thus likely to be a general structural feature of the holo forms of both heterotetrameric and homodimeric P-proteins. In its closed conformation in the Synechocystis enzyme, this loop interferes with the closed C terminus of the apo structure, and it is not likely that they could co-exist.
The Active Site and PLP Binding-PLP makes an aldimine bond with Lys726 (Fig. 1B), whose N atom is stabilized by a hydrogen bond to the backbone carbonyl oxygen atom of residue Ser529. PLP fits tightly in the active site and is stabilized by several hydrogen bonds (Fig. 6A). The oxygen atoms of the phosphate group are engaged in hydrogen bonding to the backbone nitrogen atoms of Thr354, Gly590, and Ser591; the side chains of Thr354, Asn723, and His725; and several solvent molecules. The pyridine ring of PLP is sandwiched on one side by His623 and on the other side by Ala701 and Asn723 and makes hydrogen bond interactions with Ser529 and Asp699. All of these residues are strictly conserved in the heterotetrameric enzyme despite their localization on separate polypeptides (αN and βC) and provide a good binding site for anion ligands, such as PLP.
The binding of PLP causes smaller rearrangements of the protein structure in the PLP binding site. The imidazole ring of His623 moves about 2 Å toward the pyridine ring for a better stacking interaction, although the angle of the imidazole ring is unaltered with respect to the PLP pyridine ring. Trp870, located on a loop region, moves ~0.8 Å toward the bound PLP. Ala701 (positioned in another loop region) moves around 1.7 Å closer to the center of the pyridine ring. The loop region harboring Thr674 and His675 also adjusts, moving ~1.2 Å toward PLP.
Binding of Glycine-Soaking of the substrate glycine into crystals of the PLP complex produced a ternary complex with glycine and PLP bound in the active site. Overall, the structure of the ternary complex is very similar to the structure of the co-crystallized PLP complex. Superposition of the two models gives an r.m.s. deviation of 0.139 Å for 1900 Cα atoms. Glycine binds in a pocket close to PLP; both ligands are bound with high occupancy in well defined electron density (Fig. 6B). The glycine-binding site is located on the side of the pyridine ring that faces the outside and is otherwise occupied by solvent (water or ethylene glycol). The amino group of glycine makes a hydrogen bond to the imidazole ring of His623. It also makes solvent-mediated interactions with the phosphate group and the aldimine C4′ carbon of PLP and with the hydroxyl group of Tyr131. There is no indication of the formation of an external aldimine with PLP; the amino nitrogen of glycine is located 3.7 Å away from the C4′ carbon of the pyridine ring, and there is well defined density indicating that the internal aldimine bond to Lys726 is maintained. The carboxyl group of glycine is stabilized by contact with the hydroxyl group of Tyr134 and by a solvent-mediated contact with Thr867. Apart from the lack of a covalent bond to PLP, the orientation of the substrate is appropriate for decarboxylation with the scissile bond of glycine oriented approximately perpendicular to the π-orbital system of the pyridine ring of PLP (corresponding to conformation III in Ref. 55).
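The r.m.s. deviations quoted for superposed models follow from a simple formula: the square root of the mean squared distance between paired Cα positions. The sketch below computes this for coordinates that are assumed to be already superposed and paired one-to-one; the optimal superposition itself (for example by a least-squares fit as performed in O) is not reproduced here, and the coordinates shown are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired (x, y, z) positions in Å.

    Assumes both lists are already superposed and paired atom by atom.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Hypothetical Cα positions for illustration only.
model_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_2 = [(0.1, 0.0, 0.0), (3.8, 0.1, 0.0), (7.6, 0.0, 0.1)]

print(f"r.m.s. deviation: {rmsd(model_1, model_2):.3f} Å")
```

Because every atom pair contributes equally, a small overall value over many atoms, as reported for the binary and ternary complexes, can still hide local movements of individual loops.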
Directed Mutation of Cysteine Residues Involved in Disulfide Formation-To test for the potential roles of the cysteines forming the disulfide bond, we made the C353S, C972S, C353S/C972S, C353S/C972S/C974S, and C972S/C974S variants and assayed them in the presence and absence of DTT using the wild-type enzyme for comparison (Fig. 8). In the presence of DTT, wild-type P-protein catalyzed the glycine-bicarbonate exchange reaction with a specific activity of 1.04 ± 0.12 μmol s−1 g−1. The C353S variant showed an ~20-fold decrease in activity, whereas the C972S variant displayed a distinctly smaller, about 2-fold, decrease in activity. The activity of the double and triple mutants C353S/C972S and C353S/C972S/C974S was 20-fold reduced and did not significantly differ from the C353S P-protein variant. Similarly, the additional C974S exchange in the double-mutated P-protein C972S/C974S did not further decrease the specific activity relative to C972S. In the absence of DTT, the activity of the wild-type Synechocystis P-protein was more than 30-fold diminished, but the introduced single and combined mutations resulted in only 2-3-fold further decreases in specific activity. Under these conditions, the specific activity of the C972S/C974S variant was only slightly lower relative to the wild-type control, which suggests that the additional C974S exchange counteracts the effect of the C972S exchange. A similar although less strong effect was observed with C353S/C972S/C974S relative to C353S/C972S under non-reducing conditions. Taken together, the combined site-directed mutagenesis and structural information strongly support our hypothesis that formation of the Cys353-Cys972 bond could be linked to the redox status of the cell and hence redox-dependent activity regulation of the P-protein.
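The fold changes quoted above can be converted into rough absolute values by dividing the wild-type specific activity measured with DTT by each fold decrease. This is a back-of-the-envelope sketch, not a restatement of measured data: the fold values are the approximate figures given in the text ("~20-fold", "about 2-fold", "more than 30-fold"), results are in the same units as the wild-type value, and the number for the wild type without DTT is only an upper bound.

```python
WT_WITH_DTT = 1.04  # wild-type specific activity with DTT, as stated in the text

# Approximate fold decreases relative to wild type with DTT, from the text.
FOLD_DECREASE = {
    "C353S": 20.0,
    "C972S": 2.0,
    "C353S/C972S": 20.0,
    "C353S/C972S/C974S": 20.0,
}


def estimated_activity(fold, reference=WT_WITH_DTT):
    """Estimate a specific activity from a fold decrease relative to a reference."""
    return reference / fold


estimates = {variant: estimated_activity(fold) for variant, fold in FOLD_DECREASE.items()}

# "More than 30-fold diminished" gives only an upper bound for wild type without DTT.
wt_without_dtt_upper_bound = estimated_activity(30.0)

for variant, value in estimates.items():
    print(f"{variant}: ~{value:.3f} ({100 * value / WT_WITH_DTT:.0f}% of wild type)")
print(f"wild type without DTT: < {wt_without_dtt_upper_bound:.3f}")
```

Read this way, removing the active site cysteine (C353S) costs about as much activity as omitting the reducing agent from the wild-type assay, whereas C972S alone retains roughly half of the reduced wild-type activity.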
Electrostatic Surface Analysis-The conformational transition from the apo- to the holoenzyme leads to a dramatic change in the charge of the molecular surface in the vicinity of the active site. The surface charge of the active site region of the apo-P-protein is mainly negative (Fig. 9A). Upon breaking of the Cys353-Cys972 disulfide, the C terminus moves from its closed state to an extended open state, which permits the mobile loop to move to its closed conformation. This change in conformation also alters the surface charge in this area, which becomes remarkably more positive (Fig. 9B). The closed loop conformation permits the basic residues of the mobile loop, Arg339, Arg344, Arg345, and Lys347, to be fully exposed to the surrounding environment. Together with Arg313 and Lys314, located on an adjacent short helix, they form a positively charged surface close to the active site entrance. In contrast, in its extended form (Fig. 9A), when the mobile loop is folded away from the active site opening, these residues form hydrogen bonds with residues at the surface far from the active site. Arg344 hydrogen-bonds to the carbonyl oxygens of residues Ser186 and Tyr181 (and to a water molecule), Arg345 interacts with the carbonyl oxygen of Glu340 and the carboxylate of Glu341, and Arg313 forms a salt link to Glu340. Lys347 hydrogen-bonds to the side chain of Asn189. When folded away from the active site, as in the PLP- and glycine-bound structures, the C terminus forms a convex protrusion, which effectively elongates the surface that a potential binding partner, such as the H-protein or lipoic acid, could use for interaction. Arg968 and His969, located at the C terminus and fully exposed in the PLP structure, add to the already existing positively charged surface provided by residues in the mobile loop. On the other hand, when the C terminus closes over the active site, Arg968 and His969 are directed inward, toward the active site.
In its closed state, the C terminus bears a clear negative charge, formed by Glu975 and Asp967 and exposed backbone carbonyl oxygen atoms, which are likely to enhance the negatively charged surface in the vicinity.
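The charge argument above can be followed with simple bookkeeping of formal side-chain charges for the residues named in the text. The sketch below is only a rough tally under stated assumptions: Arg and Lys count as +1, Asp and Glu as -1, and every other residue (including His) as neutral. It is not a substitute for the electrostatic potential calculation behind Fig. 9, and the names `FORMAL_CHARGE` and `net_formal_charge` are hypothetical.

```python
# Assumed formal side-chain charges near neutral pH; His is treated as neutral here.
FORMAL_CHARGE = {"Arg": 1, "Lys": 1, "Asp": -1, "Glu": -1}


def net_formal_charge(residues):
    """Sum formal side-chain charges for labels such as 'Arg339'."""
    return sum(FORMAL_CHARGE.get(label[:3], 0) for label in residues)


# Basic residues of the mobile loop and the adjacent short helix, as listed in the text.
basic_patch = ["Arg339", "Arg344", "Arg345", "Lys347", "Arg313", "Lys314"]

# Acidic residues at the C terminus, as listed in the text.
c_terminal_acidic = ["Glu975", "Asp967"]

# C-terminal residues that are exposed when the C terminus is folded away.
c_terminal_exposed = ["Arg968", "His969"]

print("basic patch:", net_formal_charge(basic_patch))
print("C-terminal acidic residues:", net_formal_charge(c_terminal_acidic))
print("exposed C-terminal residues:", net_formal_charge(c_terminal_exposed))
```

The tally ignores backbone carbonyl oxygens, solvent exposure, and local pKa shifts, all of which contribute to the calculated surface potential.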
Several residues in the mobile loop are conserved in the heterotetrameric enzyme (32), but the difference in electrostatic potential between the apo and holo forms is rather small in this case (data not shown). The T. thermophilus enzyme has no equivalent of the acidic surface or of the acidic residues at the C terminus (Glu975 and Asp967). This difference is in line with the observation that this particular conformation of the C terminus is not conserved between the homodimeric and heterotetrameric enzymes.

DISCUSSION
We present here the structure of the glycine decarboxylase or P-protein from the cyanobacterium Synechocystis, which can be obtained as enzymatically active recombinant protein (38). This is the first structural information provided for a homodimeric glycine decarboxylase. The protein displays over 70% similarity and about 60% identity with plant and mammalian P-proteins and therefore can serve as an excellent model for these eukaryotic enzymes. Our models show strict identity of key residues with the counterparts in the eukaryotic enzymes, including all known active site residues and most cysteine residues (Fig. 5). Crystallization of Synechocystis P-protein that had been purified in the presence of 2-mercaptoethanol (40), carried out in the presence of 0.2 mM PLP, resulted in orthorhombic crystals (space group P2₁2₁2₁). The resulting structure was of the unliganded form, devoid of the cofactor. Binding of the cofactor may have been prevented by the C-terminal lid closing over the active site as a part of the regulation of catalytic activity. As a less likely explanation, PLP may also have been hindered from binding by crystal packing restraints under the conditions used during the crystallization experiment. The disulfide between cysteine residues 353 and 972 firmly anchors the C terminus over the active site and was also observed in crystals obtained in the complete absence of reducing agents. However, the quality of the structure was inferior under these oxidizing conditions, and the structure was not analyzed further.
In the presence of the strong reducing agent TCEP, the C terminus is folded away from the active site, allowing the cofactor PLP and the substrate glycine to bind. This switch from a closed to an open conformation of the C terminus corresponds to the change from low to high enzymatic P-protein activity under oxidizing versus reducing assay conditions (see Fig. 8).
Glycine binds close to PLP in a pocket facing the opening toward solvent (Fig. 6B). To react, the amino group of glycine must form a Schiff base with PLP, upon which the carboxyl group of glycine is lost as CO₂, and the remaining methylamine moiety is transferred to the lipoyl group of the H-protein (Fig. 1). There was no trace of a covalent bond between glycine and PLP in the crystal structures presented here, but several bicarbonate molecules were observed to bind in the crystals of the ternary complex with glycine. This suggests there may be a slow turnover of the substrate, resulting in bicarbonate formation at the pH of the crystallization. This is in line with the earlier demonstration of CO₂ evolution in the absence of H-protein or lipoic acid by recombinant Synechocystis P-protein (38). The external aldimine may be too transient to be observed. It is also possible that productive binding may be hindered by crystal lattice forces preventing the mobile loop from closing effectively and that bicarbonate binding is an artifact of crystallization. However, no bicarbonate was found in crystals of the apoenzyme, suggesting that PLP and glycine are required for bicarbonate binding in the crystal.
Earlier reports (56) and our studies (40) (this work) have pointed to a strong dependence on reducing agents, mainly DTT, for full activity of the P-protein during in vitro assays. Our observations of the closing of the active site by the movement of the C terminus and the stabilization of the closed conformation by a disulfide bond suggest that the Synechocystis glycine decarboxylase is a target for regulatory redox-active proteins. The structures of inactive and active forms of the P-protein presented here show how this regulation could happen on the molecular level. It should be noted that all structural requirements for the redox-dependent conformational change are conserved in homodimeric P-proteins from eukaryotes, suggesting that the Synechocystis P-protein represents a model also for the regulation of eukaryotic glycine decarboxylases. Dithiol/disulfide exchange catalyzed by thioredoxins has been extensively studied in the light-dependent regulation of several processes in plants and is important also for other organisms. In plants, the processes of photosynthesis and respiration are separated in chloroplasts and mitochondria, respectively. Thioredoxin-mediated light-dependent regulation of each of the control points of the Calvin-Benson cycle enables the functional separation of diurnal and nocturnal processes in the chloroplast and was among the first discovered examples of this type of regulation (57). Although these enzymes are regulated by a common ferredoxin-thioredoxin system, structural studies have shown that the molecular mechanisms for activation of the target enzymes are different in each case, adapted to the reaction catalyzed by the enzyme (58), and there is no consensus regulatory sequence. In most cases, however, structural information on the different regulatory states of the target enzymes is lacking.
More recently, numerous thioredoxin-linked proteins have been identified in plant mitochondria, among these all four GCS proteins, including the P-protein (59-62), but the molecular details of activation are as yet unknown. There are also indications that communication between chloroplasts and mitochondria may be mediated by thioredoxin-linked processes (59,63). Unlike plants, cyanobacteria cannot physically separate oxygenic photosynthesis and glycine decarboxylation in different organelles. As in land plants, however, the simultaneous activation by light of photosynthetic electron transport, oxygen evolution, and CO₂ fixation in cyanobacteria also employs thioredoxins. Proteomic investigations have been published describing potential thioredoxin targets in cyanobacteria, including Synechocystis (64,65). These studies identified numerous thioredoxin-regulated photosynthetic proteins (66) and listed the GCS L-protein as a glutaredoxin target (65).
Cysteines 353 and 972 are well conserved in the P-protein from plants, animals, and many aerobic bacteria but absent in the enzymes from archaea and certain anaerobic bacteria (such as E. acidaminophilum (36)). As a consequence, a covalent bond between the corresponding residues 353 and 972 cannot form in the P-protein from these microorganisms. This is consistent with information from the crystal structures of T. thermophilus P-protein, which show that the C terminus is in the open (unlocked) conformation in all complexes, including the ligand-free apoenzyme (32). It is striking that of 15 cysteines in Synechocystis P-protein, only one, Cys530, is conserved in the T. thermophilus enzyme (Fig. 5). The apparent absence of a redox switch in this particular case could be related to the extreme environment in which this thermophile lives (67), which is unfavorable to the transient formation of free cysteine residues (68). In the ternary complex of the Synechocystis P-protein, Cys353 is within van der Waals distance of the carboxylate of the glycine substrate. This may explain the low enzymatic activity of the C353S variant and the other examined P-protein variants containing this particular mutation. The influence of the C972S substitution on activity is more difficult to rationalize, but it is likely that the mutation influences the mobility of the C terminus and that this may in turn influence substrate binding and catalysis. Although this needs to be studied in more detail, the structural data and experiments with artificial enzyme variants strongly support the conclusion that the P-protein is redox-regulated.
The overlap of the C-terminal arm and the mobile loop in their closed conformations indicates that they cannot be closed at the same time and that, consequently, closure of the mobile loop can only happen when the C terminus is in its extended conformation (Fig. 7). Loop closure, too, is thus indirectly dependent on the redox conditions in the cell. There may be additional signal(s) that trigger loop closure, but these are currently unknown.
The dislocation of the C terminus from the active site and the closing of the mobile loop over the active site entrance have a dramatic effect on the surface charge in the area around the active site. The apoprotein features a mainly negatively charged surface, which becomes positive upon formation of the holoenzyme (Fig. 9). Compared with the C terminus, which closes completely over the active site, the mobile loop does not occupy as large a volume in its closed state and would allow the H-protein to simultaneously deliver its lipoyl arm to the active site (see Fig. 7 in Ref. 32). Surface potential analysis of the lipoylated H-proteins from Pisum sativum and T. thermophilus (31) shows two main surface areas on each side of the lipoylated residue, Lys63, with clear negative surface charge (see Fig. 7 in Ref. 31). The structure of the Synechocystis H-protein is not known, but the high homology between H-proteins of diverse origins (sequence identity 38-55% (31)) indicates that a similar negatively charged area may be present on the Synechocystis H-protein. In addition to the effect from several conserved residues, a large contribution to the charge seems to originate from exposed backbone carbonyl oxygens, in particular in the area closest to the lipoyl arm. The P-protein from Synechocystis has a lower affinity for the H-protein at pH 8 than at pH 6 (69); this may be caused by pH effects on charged residues on the protein surface. Presumably, the switch from the enzymatically inactive form to the enzymatically active form could also improve the H-protein docking via the observed change of surface charge.
Based on this, we envision the following scenario for regulation and catalysis in the homodimeric enzyme: formation of the 353-972 disulfide is subject to regulation by the redox status of the cell (enabled by a thioredoxin-linked system) and restricts access of the substrate to the active site (and possibly also of PLP). During conditions favoring catalysis, the disulfide bond is broken, and the C terminus is released and moves to its open unlocked position. The combined effect of the changes in conformation and surface charge enables passage of the two substrates, glycine and H-protein, to the active site. The concerted action of the entry of the H-protein's lipoyl arm and the conformational change of the mobile loop provides an efficient closure of the active site necessary for catalysis.
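The logic of this scenario can be summarized as a small two-state model. The sketch below is a deliberately simplified cartoon, assuming that the 353-972 disulfide is either fully formed or fully broken and that the C terminus conformation follows directly from it; the class and attribute names are hypothetical, and real populations of conformers are not modeled.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PProteinState:
    """Cartoon of the proposed redox switch; True means the disulfide is formed."""

    disulfide_353_972: bool

    @property
    def c_terminus(self):
        # The disulfide locks the C terminus over the active site.
        return "closed" if self.disulfide_353_972 else "open"

    @property
    def active_site_accessible(self):
        # PLP and glycine can enter only when the C terminus is folded away.
        return self.c_terminus == "open"

    @property
    def mobile_loop_can_close(self):
        # The loop and the C terminus overlap in their closed states,
        # so the loop can close only when the C terminus is open.
        return self.c_terminus == "open"


def reduce_disulfide(state):
    """Model the effect of a reductant (for example, a thioredoxin-linked system)."""
    return replace(state, disulfide_353_972=False)


oxidized = PProteinState(disulfide_353_972=True)
reduced = reduce_disulfide(oxidized)

for label, state in (("oxidized", oxidized), ("reduced", reduced)):
    print(label, state.c_terminus, state.active_site_accessible, state.mobile_loop_can_close)
```

The model encodes only the dependencies stated in the text: the disulfide controls the C terminus, and the C terminus in turn controls both access to the active site and the possibility of loop closure.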