Structure of Plant Photosystem I Revealed by Theoretical Modeling

Photosystem (PS) I is a large membrane protein complex vital for oxygenic photosynthesis, one of the most important biological processes on the planet. We present an “atomic” model of higher plant PSI, based on theoretical modeling using the recent 4.4 Å x-ray crystal structure of PSI from pea. Because of the lack of information on the amino acid side chains in the x-ray structural model and the high cofactor content in this system, novel modeling techniques were developed. Our model reveals some important structural features of plant PSI that were not visible in the crystal structure, and our model sheds light on the evolutionary relationship between plant and cyanobacterial PSI.

Photosynthesis is one of the most important biological processes on Earth, one in which radiant energy from the sun is stored as chemical energy, which can be used as an energy source by all forms of life (1). In both eukaryotes and prokaryotes, the initial steps of photosynthesis take place across the photosynthetic membrane. This membrane divides an internal space, known as the lumen, from the cytosol outside the membrane, known as the stroma. During photosynthesis, two large membrane protein complexes, photosystems (PS) I and II (PSI and PSII), harness the energy of incident photons and use it to drive a series of electron transfer reactions across the photosynthetic membrane that result in the establishment of a transmembrane electrochemical proton gradient that drives synthesis of ATP. PSII provides electrons for these redox reactions by splitting water into molecular oxygen and protons, whereas PSI provides the reducing power to reduce NADP+ to NADPH. The high energy products ATP and NADPH are then used in the dark reactions of photosynthesis to fix CO2 and synthesize all cell compounds.
Oxygenic photosynthesis is performed both by the chloroplasts of eukaryotic organisms such as plants and green algae, and by prokaryotes such as cyanobacteria. Approximately 1.5 billion years ago, organisms similar to cyanobacteria entered into an endosymbiotic relationship with anaerobic eukaryotic cells, becoming the ancestors of modern chloroplasts. Plant and cyanobacterial photosystems therefore share a common origin and perform very similar functions, but they exist in different biological environments (2). We present a model of plant PSI in which the functional differences between plant and cyanobacterial PSI can be correlated with structural features unique to each system (Fig. 1). This can allow insight into the parallel evolution of these two highly optimized systems, as well as providing a guide for future experimental work that will further explore the structure and function of plant PSI.
Photosystem I catalyzes the light-driven electron transfer from the soluble electron carriers plastocyanin or cytochrome c6, located at the lumenal side of the thylakoid membrane, to ferredoxin, which is located at the stromal side. In cyanobacteria, PSI exists as a trimer; each monomer is composed of 12 individual protein subunits, 96 chlorophyll a molecules, 22 carotenoids, 3 [4Fe4S] clusters, and 2 phylloquinones (3). Plant PSI is a monomer containing 14 subunits. Ten of these are similar to corresponding subunits in the cyanobacterial structure: PsaA, PsaB, PsaC, PsaD, PsaE, PsaF, PsaI, PsaJ, PsaK, and PsaL. Plant PSI lacks two subunits that are unique to cyanobacteria (PsaX and PsaM) and contains four subunits that are absent in cyanobacteria (PsaG, PsaH, PsaN, and PsaO).
Both plant and cyanobacterial PSI can bind external antenna systems when extra light-harvesting capacity is needed. The diversity of the external antenna systems is one of the most interesting differences between different photosynthetic organisms. The major external antenna system in cyanobacteria consists of large, membrane-extrinsic complexes known as the phycobilisomes. The phycobilisomes use bilin-based pigments to absorb light in the green spectral region, precisely where the antenna system of PSI does not absorb strongly (4). Cyanobacteria subjected to low iron concentrations also synthesize an iron stress-induced membrane protein called IsiA, which forms a symmetric 18-meric ring around the PSI trimer (4-6). The external antenna complex of PSI in plants is formed by the light-harvesting complex I and II proteins (LHCI and LHCII), which dock to the periphery of the monomeric PSI complex. The external antenna system in plants is asymmetric and heterogeneous and is strongly modified under changing environmental conditions (e.g. light intensity, wavelength of light, oxygen supply, etc.).
Recently, a breakthrough in the understanding of plant photosynthesis has been achieved with the crystallization of a supercomplex of plant photosystem I from Pisum sativum (garden pea) with its peripheral LHCI antenna. A medium resolution structure of this PSI·LHCI supercomplex has been obtained by x-ray structure analysis at 4.4 Å resolution (7). This was the first structure of a plant membrane protein to be determined by x-ray crystallography. The PSI·LHCI structure contains the subunits conserved between plants and cyanobacteria, along with two of the nonconserved eukaryotic PSI subunits PsaG and PsaH, and four external light-harvesting complexes, tentatively assigned to Lhca1, Lhca2, Lhca3, and Lhca4.
One of the most remarkable features of this structure is the similarity of its conserved subunits to their counterparts in cyanobacterial PSI, whose structure was determined at 2.5 Å resolution (3,8). In most of the conserved subunits, the protein backbone conformation is virtually identical (see also TABLE ONE). Because a 4.4 Å structure is able to reveal the location of the protein backbone but not the identity or conformation of the amino acid side chains, the plant PSI crystal structure answers many questions but allows us to ask many more. If the backbone conformation is so similar, then what accounts for the functional differences between plant and cyanobacterial PSI? Where differences in the backbone conformation exist, what is their significance? What light can this structure shed on the results of biochemical studies of PSI function in plants and cyanobacteria?
To answer some of these questions and to provide detailed information for mutagenesis experiments and functional studies, we have derived an atomic level computational model of plant photosystem I (Protein Data Bank code 1YO9). We have used three different crystal structures in an effort to understand the structure of the plant PSI·LHCI complex at a more detailed level: the 2.5 Å structure of trimeric cyanobacterial PSI (3), the 4.4 Å structure of the plant PSI·LHCI supercomplex (7), and the recent 2.72 Å structure of the LHCII complex (9). LHCII belongs to the same protein family as LHCI and therefore is helpful in the modeling of the LHCI proteins.

MATERIALS AND METHODS
The computational model is based primarily on two structures: the 4.4 Å structural model from pea PSI·LHCI (7) and the 2.5 Å structure of cyanobacterial PSI (3). We have avoided "improving" upon the medium resolution structural model, i.e. modifying the backbone conformation of regions assigned in the x-ray structural model. This is also true for the chlorophyll molecules, where the position and macrocycle orientation from the medium resolution x-ray structure have been preserved in all cases.
FIGURE 1. Plant PSI model. A, stromal view, perpendicular to the membrane plane. B, view from within the membrane plane. Protein is color-coded by subunit; subunits are colored accordingly. Chlorophyll is shown in green; carotenoids are orange; lipids are pink; iron-sulfur clusters are shown as white and yellow spheres. Subunits conserved in cyanobacteria are shown in a schematic representation with α-helices depicted as cylinders; those unique to plants are shown as thick ribbons. All figures were made using VMD (54) and rendered using Raster3D version 2.6c (55).
Sequence data for the plant PSI subunits was obtained from the Swiss-Prot/TrEMBL sequence data base (10). Homology models of each subunit were constructed individually, using a combination of the MOE software package (Chemical Computing Group, Inc.), the Swiss-Model online modeling server (11), and the Swiss-PDB viewer (12).
In addition to the structural information included in the plant x-ray structural model, the modeling of the LHCI proteins was based on the crystal structure of plant LHCII (9). The LHCI and LHCII proteins exhibit 25-35% sequence identity and share a similar fold, but form different oligomeric structures: LHCII forms trimers and LHCI forms dimers.
In most cases, the sequence-structure alignment in our models of Lhca1 and Lhca4 follows the residue numbering in the plant x-ray structural model. This numbering was obtained by aligning LHCI and LHCII sequences from higher plants and placing the residues homologous to those coordinating chlorophyll in LHCII nearest the observed chlorophyll molecules. After adding amino acid side chains into the structure, it became clear that, in a few cases, a shift of one turn in the helix (i.e. three amino acids) was needed to improve chlorophyll coordination geometry. In some cases (e.g. chlorophyll 1012 in Lhca4), the coordination geometry at one chlorophyll was compromised in order to improve coordination geometry for a neighboring chlorophyll.
Several loop regions were not visible in the plant PSI crystal structure. The MOE software package was used to obtain plausible starting conformations for these loops. A lumenal portion of Lhca1 (Val-69 to Leu-77) obtained its starting conformation from a weak, previously unassigned electron density. Lhca4 contains a rather long stromal loop that was not well defined in the crystal structure and was minimized extensively to reach a suitable conformation, which is to be considered highly speculative (see supplemental Fig. S1).
Some side chain conformations in Lhca1 and Lhca4 were constrained by positioning them such that they could form hydrogen bonds with chlorophylls. Although the hydrogen bond donors identified in the plant LHCII crystal structure were poorly conserved in the LHCI proteins, other residues were found that were well positioned to form hydrogen bonds and widely conserved among higher plants (see supplemental Table S2). Not surprisingly, some of the hydrogen bonding and ligation geometries are not ideal. This could be a result of the fact that the stereochemical quality of the model is limited by the resolution of the template structure. It is also possible, of course, that these interactions would not be present in a higher resolution structure.
The modeling of cofactors presented a unique challenge. Nonprotein cofactors make up more than 30% of the total mass of plant PSI, yet traditional homology modeling approaches consider only polypeptides (13). Our approach to modeling cofactors is similar to one that has been described for buried water (14). An approach similar to ours has also been applied successfully in the modeling of protein and cofactors in the Rhodobacter capsulatus reaction center (15).
Initial coordinates for most cofactors in the core of the complex were obtained from the cyanobacterial PSI structure (Protein Data Bank accession code 1JB0). In general, chlorophyll positions assigned in the plant x-ray structural model were remarkably similar to those in the cyanobacterial structure, and it is reasonable to assume that when position is conserved orientation will also be conserved. A more exact fit to the data provided in the plant x-ray structural model was achieved by positioning each chlorophyll molecule such that the four nitrogen atoms in its macrocyclic head group aligned with the positions derived for nitrogen atoms from the plant x-ray structural model. The positions of these nitrogen atoms were held fixed during minimization, forcing the rest of the molecule to fall into line. Because the protein environment of cofactors differs between cyanobacteria and plants, extensive minimization was needed to optimize the cofactor conformations relative to their environment, especially in the flexible phytyl tail.
Because of the limited resolution of the plant x-ray structural model, no β-carotene positions were assigned. However, we found that most of the β-carotenes in cyanobacteria were located in pockets of hydrophobic residues that are highly conserved between cyanobacteria and plants. Other cofactors were also included in the plant PSI model. Three phospholipids and one galactolipid were included in the core complex, along with one phospholipid in each of the light-harvesting complexes. Two phylloquinones and three iron-sulfur clusters were placed at the same positions in which they were assigned in the plant x-ray structural model, as part of the electron transport chain.
Once an initial structure for the model was constructed and minimized, we subjected it to a molecular dynamics simulation, in order to optimize the structure further and to account explicitly for the water/lipid environment of PSI. The model was placed in a solvated bilayer of digalactosyldiacylglyceride, which had been equilibrated at constant pressure in a molecular dynamics simulation. Digalactosyldiacylglyceride was chosen because it is a major lipid component of the thylakoid membranes in which PSI resides (16). The final system contained the modeled proteins and cofactors, 529 membrane lipid molecules, about 86,000 water molecules, and a sodium ion, needed for electrical neutrality. In order to maintain the secondary structure present in the starting structure, the protein α-carbon and chlorophyll nitrogen atoms, which were assigned in the plant PSI x-ray structural model, were bound to their starting positions by a harmonic potential (k = 100 kcal mol⁻¹ Å⁻²). All other atoms remained free to move, including the α-carbons of those regions unassigned in the x-ray model. In order to assess the progress of the MD simulation, the positions of these "flexible" α-carbon atoms were compared with their positions in the starting structure, and a global root mean square deviation was calculated.
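The restraint and the progress measure described above can be illustrated with a minimal sketch. It assumes the restraint energy has the form E = k·d² (the text gives only the force constant, so the absence of a factor of 1/2 is an assumption), and the function names are hypothetical rather than part of any simulation package.

```python
import math

def harmonic_restraint_energy(position, reference, k=100.0):
    """Energy (kcal/mol) of one atom tethered to its reference position.

    Assumed form: E = k * d**2, with d in Å and k in kcal mol^-1 Å^-2.
    """
    d_squared = sum((p - r) ** 2 for p, r in zip(position, reference))
    return k * d_squared

def rmsd(coords, reference):
    """Root mean square deviation (Å) between two equal-length coordinate lists."""
    if not coords or len(coords) != len(reference):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for atom, ref_atom in zip(coords, reference):
        total += sum((x - y) ** 2 for x, y in zip(atom, ref_atom))
    return math.sqrt(total / len(coords))
```

Under this assumed form, a restrained atom displaced by 0.1 Å costs about 1 kcal/mol, which indicates how tightly the assigned atoms are held while the unassigned loops remain free to move.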
Once the system had been assembled, it was simulated at constant pressure at 300 K, with the protein and associated cofactors held fixed, so that the lipid and solvent could equilibrate. Once the volume remained stable, the atoms of the PSI model were released and could move freely, except for the restraints described above. Once a new stable volume was reached, solvent molecules more than 15 Å away from the protein were held fixed, in order to decrease the simulation time. The simulation was allowed to continue until the global root mean square deviation of the loop regions had stopped increasing, indicating that the system had become conformationally stable. At this point, the system was repeatedly minimized (at T = 0) and gradually warmed back up to 300 K, in order to find the best possible global minimum. The final simulation step was a simulated annealing process, in which the temperature of the system was gradually lowered from 300 to 0 K. One consequence of this simulation in explicit solvent was that water molecules from the bulk solvent migrated into the pigment-protein complex in order to ligate chlorophyll molecules for which a ligand could not previously be found.
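The final cooling step can be pictured as a temperature schedule. The sketch below assumes a linear ramp divided into equal stages; the text does not state the cooling rate or the number of stages, so `n_steps` and the function name are purely illustrative.

```python
def linear_cooling_schedule(t_start=300.0, t_end=0.0, n_steps=6):
    """Return n_steps + 1 evenly spaced temperatures (K) from t_start to t_end.

    The linear shape and the stage count are assumptions made for illustration.
    """
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    step = (t_end - t_start) / n_steps
    # Append t_end explicitly so the last value is exact despite rounding.
    return [t_start + i * step for i in range(n_steps)] + [t_end]

schedule = linear_cooling_schedule()
```

With the default arguments the schedule steps down in 50 K increments from 300 K to 0 K.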
All minimization and equilibration steps were performed using NAMD 2.5 (17). Because the force field used (CHARMM27) lacks parameters for many of the nonprotein cofactors present in plant PSI, parameter files needed to be developed, based on a combination of ab initio calculations (18) (19). A more detailed description of the modeling process is available at www.jbc.org.

RESULTS AND DISCUSSION
The modeling of plant photosystem I presented a unique challenge for several reasons. The most obvious is the presence of the numerous nonprotein cofactors in the supercomplex. These are not addressed by traditional homology modeling techniques (13), and their arrangement is constrained by, and provides constraints on, their protein milieu. Correct assignment of cofactor location and conformation was of great importance to the correctness of the model.
Another challenge came from the nature of the template structure used in modeling. In the production of an "atomic" model based on medium resolution experimental data, a balance must be found between stereochemical quality and fidelity to the experimental data. In several places, faithfulness to the medium resolution crystal structure led to non-ideal backbone geometries, especially in the transmembrane helices of PsaG and the light-harvesting complexes (see "Materials and Methods").
A further challenge for the modeling of the structure was the size of the PSI·LHCI supercomplex and its location in the membrane, where not only the aqueous environment but also the lipid bilayer has to be included in the modeling studies, which increases the size of the model system to 407,280 atoms. To limit the computation time, we have used a two-step approach; the structure was modeled in vacuo in the first step, and the membrane/aqueous environment was included in the second modeling step.
The model of plant PSI and its comparison with the structure of cyanobacterial PSI is summarized in TABLE ONE. The coordinates are available at the Protein Data Bank (accession code 1YO9). The model allows, for the first time, an atomic level examination of the evolution of photosystem I from two major kingdoms of life whose evolutionary branches have separated for more than one billion years. TABLE ONE provides an overview of the major functional and structural similarities and differences between plant and cyanobacterial PSI.
The most striking result of the comparison is that the core structure of photosystem I was conserved over more than one billion years of evolution. Not only the backbone conformation, but also the side chains of the core complex are highly similar in the plant and cyanobacterial structures. The subunit C, which carries the terminal FeS clusters FA and FB, shows the highest degree of homology, followed by the major subunits A and B. The homology is not restricted to the protein but extends to the coordination sites for the carriers of the ET chain and the antenna pigments. Even more exciting, the carotenoids and the lipids that have been identified in the high resolution x-ray structure of cyanobacterial PSI, but have not been assigned in the medium resolution plant PSI structure, may also be conserved between plants and cyanobacteria.
Although the core of PSI is well conserved, plant PSI has optimized the docking sites for the terminal electron acceptor ferredoxin and for plastocyanin and has evolved a different oligomeric structure. In contrast to cyanobacteria, where it is essentially unknown where the phycobilisomes dock to PSI, the plant structural model allows investigation of the functional interaction of PSI with its external LHC complexes at a previously impossible level of detail (20). We will now discuss the major features of the plant structure and its implications for the function of plant PSI in more detail.
Plastocyanin-binding Site-The most important difference between plant and cyanobacterial PSI is that plant PSI has evolved a tighter docking site for plastocyanin (PC). Although cyanobacteria use cytochrome c6, which is regarded as the most ancient electron donor for photosynthetic systems (21), or plastocyanin, which contains copper, plant PSI can only use plastocyanin as an electron donor. Electron transfer between plant PSI and plastocyanin shows second-order kinetics and is 2-3 orders of magnitude faster than in cyanobacteria, indicating that a more stable PSI·PC complex is formed in plants. This kinetic improvement is associated with the N-terminal region of PsaF (22). Site-directed mutagenesis studies have identified several lysine residues in this N-terminal region (23) as well as an acidic patch on plastocyanin (24,25) as vital for the formation of the PSI·PC complex. The crystal structure of plant PsaF shows that the N-terminal region of PsaF forms a helix-turn-helix loop on the luminal side of PSI but could not identify the side chains. This loop is much longer in plants and green algae than in cyanobacteria.
TABLE ONE, footnote a: The first number in each pair is the percentage of cyanobacterial residues whose position is identical in the plant PSI model; the second (in parentheses) is the percentage of plant residues whose position is identical in cyanobacteria. Differences in these numbers stem from insertions or deletions in the sequence, e.g. in PsaE or PsaF.
Our model shows that, on this lumenal loop, eight positively charged amino acid residues point directly into the lumen, only two of which are conserved in cyanobacteria (Figs. 2A and 3, C and D). We have thereby identified the amino acids that may be responsible for the electrostatic interaction with plastocyanin in plants.
Although the direct involvement of PsaF in the active docking of PC is well known, the luminal loop of PsaB also plays an important role in the stabilization of the PC-docking site. Site-directed mutagenesis studies (26) have shown that Glu-611 on PsaB is also crucial for the proper interactions between plastocyanin and PSI in C. reinhardtii. This finding can be explained by our model as it shows that this residue forms a strong salt bridge with Arg-17 of PsaF, suggesting that it is responsible for holding the luminal loop of PsaF in the proper orientation for plastocyanin binding.
Ferredoxin-binding Site-The structural model can also provide us with valuable insight about the optimization of the interactions of PSI with its soluble electron acceptor, ferredoxin. In photosystem I from cyanobacteria and plants, the small subunits PsaC, PsaD, and PsaE form a stromal ridge on top of PSI, which interacts with ferredoxin. After reduction by PSI, ferredoxin is oxidized by ferredoxin:NADP+ oxidoreductase, which has been shown to interact with PsaE in plants (27) but not in cyanobacteria. For both plant and cyanobacterial PSI, flash absorption spectroscopy has revealed three different kinetic phases in the reduction of soluble ferredoxin by PSI, with half-times of ~500 ns and 20 and 100 μs (28). The ratio of the three phases shows significant differences between plant and cyanobacterial PSI, indicating that the docking site may have been further optimized during the last 1 billion years of independent evolution. The two faster kinetic components do not depend on the concentrations of ferredoxin and PSI, consistent with an electron transfer process occurring within a PSI-ferredoxin complex that was formed before flash excitation. The slow component depends linearly on ferredoxin concentration, suggesting that this kinetic phase reflects the diffusion of ferredoxin to PSI after the flash excitation, followed by electron transfer (29).
FIGURE 3. ... white for neutral, red for negative, and blue for positive charges. Amino acids whose charge is conserved between plant and cyanobacterial PSI are shown in muted colors; those with nonconserved charges are highlighted. A, plant ferredoxin-docking site. The proposed proximal and distal binding sites are shown by yellow ellipses. B, cyanobacterial ferredoxin-docking site. Note that the charge distribution in the proximal (tight) binding site is largely conserved and that most changes appear on top of the stromal hump, at the distal (loose) binding site. C, plant plastocyanin-docking site. The proposed docking site is shown by a yellow ellipse; note that nearly all of the nonconserved residues are part of the luminal loop of PsaF (see Fig. 2A). The residues highlighted in yellow are PsaA:W658 and PsaB:W625, which are located immediately on top of P700 and interact directly with PC (50). D, cyanobacterial plastocyanin-docking site. The highlighted residues are PsaA:W655 and PsaB:W631. Note the much smaller size and decreased positive charge of the lumenal loop of PsaF.
The presence of two fast kinetic components in the electron transfer from PSI to ferredoxin suggests the existence of two distinct ferredoxin-binding conformations. The fastest component would correspond to a tightly bound ferredoxin, whose 2Fe-2S cluster is close to the distal 4Fe-4S cluster FB of PSI. The intermediate kinetic component is thought to arise from a conformation in which ferredoxin is more loosely bound to the top of the stromal ridge and must settle into the lower conformation before efficient electron transfer can take place.
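The distinction between concentration-independent and concentration-dependent phases can be made concrete with a short sketch. It converts a half-time into a first-order rate constant (k = ln 2 / t½) and models the diffusion-limited phase under an assumed pseudo-first-order approximation, k_obs = k2·[Fd]. The function names and the second-order constant `k2` are hypothetical and are not values taken from the text.

```python
import math

def first_order_rate_constant(half_time_s):
    """Rate constant (s^-1) of a first-order phase from its half-time (s)."""
    if half_time_s <= 0:
        raise ValueError("half-time must be positive")
    return math.log(2) / half_time_s

def diffusion_phase_rate(k2, ferredoxin_molar):
    """Observed rate (s^-1) of a phase limited by ferredoxin binding.

    Assumes pseudo-first-order conditions: k_obs = k2 * [Fd], where k2 is a
    hypothetical second-order rate constant (M^-1 s^-1).
    """
    return k2 * ferredoxin_molar

# Fastest phase reported in the text: half-time of roughly 500 ns.
fast_phase = first_order_rate_constant(500e-9)
```

A phase arising within a preformed complex keeps the same rate constant at any ferredoxin concentration, whereas in this approximation doubling the ferredoxin concentration doubles the observed rate of the diffusion-limited phase.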
It has been suggested that plants and cyanobacteria might bind ferredoxin in completely different locations (30), but our structural model does not support this conclusion. Instead, we suggest that in both systems a distal and a proximal docking site may exist with binding constants that differ in plants and cyanobacteria. First, the 4.4 Å structure of plant PSI shows that the stromal subunits are, in their general structural outline, virtually identical to the stromal subunits of cyanobacterial PSI. This high degree of structural similarity suggests a similar binding site for ferredoxin. It is worth noting, however, that a higher resolution structure may reveal some minor differences between the backbone conformation of the two proteins. Second, ferredoxin reduction in plants and cyanobacteria exhibits principally similar kinetic behavior, suggesting a common mechanism. Furthermore, plant ferredoxin is able to act as an electron acceptor for cyanobacterial PSI and vice versa (29). If the biochemical data on PSI-ferredoxin interactions are taken into account (31), what does our structural model say about the ferredoxin-binding site of plant PSI?
Site-directed mutagenesis studies have shown that Thr-15, Gln-16 (32), Lys-35 (33), and Lys-37 (31) of PsaC and Arg-68 (34) of PsaE (corresponding to Arg-39 in Synechococcus elongatus) are all located within the ferredoxin-binding domain when ferredoxin occupies the position closest to FB. Glu-121 (35) and Lys-122 (36) of PsaD sit on top of the stromal ridge and have been shown to affect the affinity of ferredoxin for PSI, so they probably participate in the looser, more distal binding site. All of these residues are well conserved between cyanobacteria and plants.
One region in which the plant and cyanobacterial ferredoxin-binding sites differ is the loop that lies between Gly-47 and Gly-54 of PsaE. This loop sits below the lower binding site of ferredoxin; it would be reasonable to suspect that it also participates in ferredoxin binding. This loop region is unique to S. elongatus and may be related to the thermostability of the PSI-ferredoxin interaction. Sequence alignments reveal that it is absent not only in plants and green algae but also in the cyanobacteria Anabaena sp. (37) and Gloeobacter violaceus (38).
As most of the amino acids that are essential to the proximal docking site are well conserved, differences are found on the proposed distal binding site of ferredoxin at the top of the stromal hump. Differences in this site may be linked to differences in ferredoxin behavior between plants and cyanobacteria; in plants, ferredoxin moves directly to a ferredoxin:NADP+ oxidoreductase bound to PsaE, whereas in cyanobacteria it diffuses away from PSI. The distal binding site may have a much more specific function in plants than in cyanobacteria. Detailed modeling studies of the interaction of ferredoxin with plant PSI to confirm this suggestion are in progress.
Structure and Function of the Unique Plant Subunits-The plant PSI crystal structure contains two of the four subunits that are unique to plants: PsaG and PsaH. We will now discuss their structure and possible function based on the "atomic" model.
Structure and Function of PsaH-PsaH is unique to plants, and two functions have been suggested for this subunit as follows: interaction with LHCII and the prevention of trimer formation. Because PsaH is unique to plants, the cyanobacterial crystal structure could not provide us with any clues about proper sequence-structure alignment; for details on the strategy used to find an alignment, see the Supplemental Material. The transmembrane helix prediction server TMHMM (39) predicted a transmembrane helix between residues Leu-1 and Ser-72. When the predicted transmembrane region is aligned with the transmembrane helix assigned in the plant x-ray structural model, chlorophyll 1801 is coordinated by Gln-35, which is conserved among all higher plant species. The plant-specific chlorophyll 1801 might be involved in the excitation energy transfer from LHCII to the core of PSI. It is coordinated in the anti-conformation and allows an ester group on the porphyrin ring to form a hydrogen bond with Asn-32, which is also very well conserved.
If plant photosystem I contains subunits that are not present in cyanobacterial PSI, then the conserved subunits should exhibit structural and functional differences from their cyanobacterial counterparts. In most cases, these differences are not visible in the 4.4 Å structure, and become clear only when a more detailed structural model is compared with the high resolution cyanobacterial structure.
The plant model reveals that a subtle modification of PsaI allows the binding of the nonconserved subunit PsaH (Fig. 4B). In cyanobacteria, PsaI serves an important role in trimer formation (40), whereas in plants it interacts strongly with PsaH. Position 12 of cyanobacterial PsaI is at the lumenal end of its transmembrane helix, where it faces either the lipid bilayer (in PSI monomers) or the PSI-PSI interface (in trimers). It is poorly conserved among cyanobacteria and is occupied by the bulky side chain of tryptophan in S. elongatus. In plant PSI, the corresponding residue, which faces PsaH rather than the lipid bilayer, is a very well conserved serine (Ser-6). Unlike tryptophan, serine is small enough not to interfere with the binding between PsaI and PsaH. The remaining contacts between the two subunits are formed by hydrophobic residues that are conserved between cyanobacteria and plants.
FIGURE 4. A, Lhca1 and Lhca4 with their associated chlorophylls. Chlorophylls whose central ligand is identified in the model are shown in green, and others are shown in yellow (see Table S2). Phytyl tails have been omitted for clarity. Amino acids that coordinate chlorophyll are also shown. B, interactions between PsaH (green) and PsaI (white). Residues shown in yellow are from S. elongatus. Note the bulky nonconserved tryptophan at the luminal side of PsaI, which is replaced by a much smaller serine in plants.
PsaH may also interact with the membrane extrinsic subunits of the PSI core. Plant PsaD has an elongated N terminus relative to cyanobacterial PsaD, and portions of this domain were assigned in the plant x-ray structural model. Because the electron density was not continuous with the remainder of PsaD, however, the protein backbone is not continuous, and we were unable to assign a sequence to this portion of PsaD. Consequently, it was modeled as polyalanine. It is known that this domain is responsible for the fact that plant stromal subunits are more strongly bound to the reaction center and more resistant to urea treatment than their cyanobacterial counterparts (41). This stabilizing effect is probably a result of the interactions between the N-terminal domain of PsaD and PsaH.
From the 4.4 Å structure of plant PSI, it is clear that the presence of PsaH precludes any trimerization of plant PSI analogous to that seen in cyanobacteria. Structural differences in PsaL may also play a role in the differing oligomerization behavior of the two complexes. Plant PsaL contains a large stromal loop that is much shorter in cyanobacteria but that was not assigned in the plant x-ray structural model. In assigning a structure for this loop, we chose a conformation that was consistent with the packing of the PSI crystals (i.e. it would not clash with PsaF of the neighboring monomer) and that could interact with PsaH and potentially with the LHCII trimer during state transitions.
A more subtle difference between the two systems is in the residues of PsaL that form the trimerization domain in cyanobacteria. The cyanobacterial crystal structure reveals the hydrophobicity of this region; helix g is lined with nonpolar residues that face into the trimerization domain, and the long C terminus forms extensive hydrophobic interactions with neighboring monomers. In plants, several of these hydrophobic residues are replaced by polar ones, which would fail to promote these hydrophobic interactions. Cross-linking and immunoblotting studies suggest that PsaO interacts with PsaH and PsaL (42); it is possible that some of the nonconserved polar residues on PsaL interact with PsaO or LHCII.
Structure and Function of PsaG-In plants, PsaG and PsaK exhibit considerable sequence similarity and an identical fold, consisting of two transmembrane helices connected by a loop. It has been proposed that PsaG arose via gene duplication of PsaK and subsequent mutation, which allowed it to occupy a different, but symmetry-related, location in the complex (43)(44)(45). It has been suggested that the loop region of PsaG might be located in the lumen, unlike that of PsaK, which is located in the stroma (46). The plant crystal structure, however, shows the loop region of PsaG to be stromally oriented, and our model retains this orientation. PsaG coordinates a single chlorophyll, 1701, via Gln-27, which is highly conserved among plant species.
Modeling the LHCI Proteins-The modeling of the LHC proteins was even more challenging than the modeling of the core subunits of PSI, because LHCI is less homologous to LHCII than the plant PSI core subunits are to their cyanobacterial counterparts. Furthermore, many structural features, such as the solvent-exposed loops, were missing from the x-ray structural model. Another complication was that the assignment of the four LHCs to Lhca1, Lhca2, Lhca3, and Lhca4 was only tentative. The modeling strongly supports the assignment of Lhca1 and Lhca4 in the x-ray structure, whereas the results for the assignment of Lhca2 and Lhca3 are less conclusive.
The interactions between Lhca1 and Lhca4 are mainly hydrophobic and confirm the importance of Trp residues for the interaction between Lhcas, which has been suggested by mutagenesis studies (47). Trp-4 of Lhca1 forms π-stacking interactions with Trp-106 of Lhca4, and Trp-185 of Lhca1 interacts with a hydrophobic pocket on the lumenal side of Lhca4. Alignment between Lhca2 and Lhca1 reveals that the position containing Trp-185 in Lhca1 is occupied by a glycine in Lhca2. This might partially explain the apparently weaker interaction between Lhca2 and Lhca3 compared with the Lhca1-Lhca4 heterodimer.
Interactions of the Core Proteins of PSI with the LHCI Proteins-Just as the conserved subunits in plant PSI showed structural adaptations to their new role in binding PsaH, the conserved subunits that interact with the LHCI proteins differ from the cyanobacterial proteins in ways that highlight their new roles. PsaK is a particularly dramatic example. With only 31% of its residues conserved between plant and cyanobacterial systems, it is somewhat generous to refer to PsaK as a conserved subunit. Although the stromal loop in cyanobacteria is relatively short, the loop region in plant PSI is much longer. Although no location for this loop was assigned in the original plant x-ray structural model, we have assigned it a conformation that passes through a region of weak electron density above Lhca3, which was not assigned in the published crystal structure (48). It is probable that this loop is at least partially responsible for the well documented interaction between PsaK and Lhca3 (45,49).
The presence of Lhca3 points to another structural difference between plant and cyanobacterial PSI: the loop extending from Ser-261 to Thr-267 in PsaA may interact with Lhca3 in plants. This loop of PsaA is not well conserved between plants and cyanobacteria and may be highly flexible, as it was not completely resolved in either the plant or the cyanobacterial PSI structure. The conformation we suggest is partially supported by the electron density map from the plant PSI crystal structure and is the most probable conformation. The functional importance of the loop probably lies in its close proximity to the C terminus of Lhca3.
Another site for interactions between the PSI core and the light-harvesting complexes is the Lhca4-PsaF interface. PsaF has a fold that is unique among transmembrane proteins; in addition to its single transmembrane helix, there is a V-shaped structure formed by a helix that begins at the stromal side, extends halfway through the membrane, and then forms a kink and emerges again on the stromal side. Given its location on the edge of the complex, this V-shaped domain is probably involved in the interactions between PSI and its external light-harvesting complexes (48,50). It is remarkable that this uncommon structural feature is conserved between plant and cyanobacterial PSI, despite the differences in their external light-harvesting complexes. It has been suggested that PsaF is involved in the docking of phycobilisomes in cyanobacteria (50). PsaF and PsaJ may therefore provide an "entrance gate" for excitation energy in both plants and cyanobacteria. Are there more subtle structural differences between plant and cyanobacterial PsaF that arise from the different functional role played by PsaF in these two complexes?
The V-shaped domain of PsaF may interact with the long stromal loop between residues Lys-122 and Leu-147 in Lhca4, which was mentioned above. Because the conformation of the stromal Lhca4 loop in our model is speculative, details of the interaction are difficult to elucidate. The N-terminal helix of Lhca4 (helix B, following the nomenclature of Ref. 51) also comes close to the V-shaped domain of PsaF and is visible in the crystal structure, allowing us to reach more definite conclusions about its interactions with PsaF.
This region contains a salt bridge between Arg-142 of PsaF and Glu-35 of Lhca4. These residues are well conserved among higher plant species, but Arg-142 is replaced by a reasonably well conserved lysine in cyanobacterial PsaF. In our model, a hydrogen bond is also visible between Asn-146 of PsaF and Glu-35 of Lhca4; these residues are conserved in many but not all plant species. A more detailed understanding of the interactions between PsaF and Lhca4 awaits a more definitive conformation for the long stromal loop of Lhca4.
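Contacts such as the Arg-142/Glu-35 salt bridge and the Asn-146/Glu-35 hydrogen bond are typically identified by simple distance criteria between side-chain atoms. The following minimal Python sketch illustrates such a check. The coordinates are hypothetical placeholders, not values from the model, and the cutoffs (4.0 Å for a salt bridge, 3.5 Å for a hydrogen bond) are common rule-of-thumb values assumed here rather than thresholds stated in this work. Given the limited precision of atomic positions in a model built on a 4.4 Å structure, any such assignment should be read as a candidate contact only.

```python
import math

# Assumed distance cutoffs in angstroms; common rule-of-thumb values,
# not thresholds taken from the paper.
SALT_BRIDGE_CUTOFF = 4.0
HBOND_CUTOFF = 3.5


def min_distance(atoms_a, atoms_b):
    """Smallest distance between any atom in atoms_a and any atom in atoms_b."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def within_cutoff(atoms_a, atoms_b, cutoff):
    """True if the closest pair of atoms lies within the given cutoff."""
    return min_distance(atoms_a, atoms_b) <= cutoff


# Hypothetical coordinates (angstroms), for illustration only.
arg_guanidinium_n = [(10.0, 5.0, 3.0), (9.0, 6.5, 3.5)]    # e.g. Arg NH1, NH2
glu_carboxylate_o = [(12.8, 5.2, 3.1), (13.5, 7.0, 4.0)]   # e.g. Glu OE1, OE2

if __name__ == "__main__":
    d = min_distance(arg_guanidinium_n, glu_carboxylate_o)
    is_candidate = within_cutoff(
        arg_guanidinium_n, glu_carboxylate_o, SALT_BRIDGE_CUTOFF
    )
    print(f"closest N-O distance: {d:.2f} A")
    print("salt bridge candidate:", is_candidate)
```

A purely distance-based test ignores geometry such as donor-hydrogen-acceptor angles; a stricter check would add an angular criterion once hydrogen positions are available.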
At the other end of the LHCI belt, the plant x-ray structural model deviates from the cyanobacterial structure between residues His-308 and Gly-318 of PsaB. The alteration of this loop may assist the binding of Lhca1, e.g. through the formation of a hydrogen bond between Arg-314, which forms part of this loop, and a stromal region on Lhca1.
Nonconserved Chlorophylls-The unique functional characteristics of plant PSI arise not only from its protein structure but also from the precise arrangement of the organic cofactors in its light-harvesting apparatus. The key players in the coupling of the core antenna system of PSI with the LHCI proteins are the "gap" chlorophylls. Ten chlorophylls were assigned as gap chlorophylls; they are observed in neither the cyanobacterial crystal structure nor the LHCII trimer. These molecules sit in the gap between PSI and the LHCI proteins, where they serve as a functional link by mediating energy transfer between the light-harvesting complexes and the core (48). Modeling of the excitation energy transfer based on our structural model reveals that there are four major "highways" for the excitation energy transfer that connect each of the LHCI proteins with the antenna system of the core. This tight coupling mediated by the gap chlorophylls leads to a preferential excitation energy transfer from the LHC chlorophylls to the core chlorophylls that is significantly faster than the excitation energy transfer between the four LHCI complexes (20). If these chlorophylls are present in plants but not in cyanobacteria, what structural features of plant PSI enable it to bind them?
The nonconserved stromal loop of PsaB was mentioned earlier in the context of its interactions with Lhca1. This loop has another function: the gap chlorophyll 4001 is coordinated by the carbonyl oxygen of Gly-312. Another change in PsaB with consequences for pigment binding is found at the loop between Ala-491 and Asn-497 of PSI in S. elongatus, which is replaced by a much shorter loop in plant PSI. This is associated with the repositioning of chlorophyll 1233 in the plant structure, so that it is no longer parallel with chlorophylls 1231 and 1232. This pigment trimer is very strongly coupled in S. elongatus, and these chlorophylls are therefore thought to be among the most red-shifted in cyanobacterial PSI. A change in the orientation of chlorophyll 1233 would disrupt the excitonic coupling of this trimer and decrease its red shift. Rather than being a systematic difference between plant and cyanobacterial PSI, the presence of this trimer may be a peculiarity of S. elongatus; substantially fewer red-shifted chlorophyll pigments have been spectroscopically observed in Synechocystis (52,53), and it is probable that this trimer is not present in all cyanobacterial species. Sequence comparisons, however, show that this loop seems to be systematically shorter in higher plants than in cyanobacteria.
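The sensitivity of excitonic coupling to the orientation of chlorophyll 1233 can be illustrated with the point-dipole approximation, in which the coupling between two pigments scales as κ/R³, where κ = μ̂₁·μ̂₂ − 3(μ̂₁·R̂)(μ̂₂·R̂) is the orientation factor of the two transition dipoles and R is their separation. The Python sketch below is an illustration under stated assumptions and is not the method used for the energy transfer modeling in Ref. 20. The geometry (two dipoles 8 Å apart, one rotated by 60°) is hypothetical and is not taken from the model, and the point-dipole approximation is known to be crude for closely packed, strongly coupled pigments.

```python
import math


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def unit(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(dot(v, v))
    return tuple(c / norm for c in v)


def orientation_factor(mu1, mu2, r12):
    """kappa = m1.m2 - 3 (m1.r)(m2.r) for unit vectors m1, m2, r."""
    m1, m2, r = unit(mu1), unit(mu2), unit(r12)
    return dot(m1, m2) - 3.0 * dot(m1, r) * dot(m2, r)


def relative_coupling(mu1, mu2, r12):
    """Point-dipole coupling in arbitrary units: kappa / R^3."""
    distance = math.sqrt(dot(r12, r12))
    return orientation_factor(mu1, mu2, r12) / distance ** 3


# Hypothetical geometry: two pigments stacked 8 angstroms apart along z,
# with transition dipoles lying in the xy plane.
separation = (0.0, 0.0, 8.0)
reference_dipole = (1.0, 0.0, 0.0)

# Parallel arrangement versus a partner rotated by 60 degrees about z.
angle = math.radians(60.0)
rotated_dipole = (math.cos(angle), math.sin(angle), 0.0)

parallel = relative_coupling(reference_dipole, reference_dipole, separation)
rotated = relative_coupling(reference_dipole, rotated_dipole, separation)

if __name__ == "__main__":
    print(f"coupling after rotation relative to parallel: {rotated / parallel:.2f}")
```

In this toy geometry the coupling falls with the cosine of the rotation angle and vanishes for perpendicular dipoles, which is the qualitative behavior invoked above for the loss of red shift when one member of a stacked pigment group is reoriented.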
Another gap chlorophyll, number 4003, is situated close to the lumenal side of PsaF and is probably coordinated by the peptide carbonyl oxygen of Ser-74 in PsaF. The conformation assigned to this region in the plant x-ray structural model deviates slightly from its cyanobacterial counterpart, and sequence alignments reveal that this region is among those that are less well conserved between plant and cyanobacterial PsaF.
Another category of nonconserved chlorophylls in the plant PSI·LHCI supercomplex is the LHCI-linker chlorophylls. These are located at the interface between neighboring LHCI monomers and have no counterparts in LHCII. In one case, we were able to identify a protein ligand for an LHCI-linker chlorophyll; chlorophyll 1032 of Lhca4 is clearly coordinated by His-99. An orientation was chosen that causes chlorophyll 1032 to be anti-coordinated, but no conclusive hydrogen-bonding partners seemed available. Coordination of nonconserved chlorophylls was not a criterion in the initial sequence-structure alignment; the proximity of His-99 to chlorophyll 1032 provides additional evidence that this alignment is correct.
The coordination situation is less clear with the other linker chlorophylls. Although widely conserved, His-99 of Lhca4 is replaced by a well conserved glycine in Lhca3, so chlorophyll 1032 in Lhca3 must be coordinated by a different side chain, a backbone atom, or water.
Despite the varied and often incomplete nature of the information used in the building of this model, a substantial amount of functional information can be gleaned from it. This paper can only provide an overview of some of the most salient features.
Conclusions-The model presented here shows for the first time details of the unique structural and functional features of plant PSI. It provides specific details about protein-protein and protein-pigment interactions, estimated transition dipole moment orientations for antenna system chlorophylls, and some insights into pigment-pigment interactions. It will form the basis for mutagenesis studies that will further explore the interactions between the LHC and the PSI core. Work on the docking of the soluble partners for PSI, plastocyanin and ferredoxin, is in progress. The estimated orientations of the chlorophyll transition dipole moments have been used in the theoretical modeling of energy transfer in the plant PSI·LHCI supercomplex (20).
At the same time, such a model should be used with caution. Because it is based on a 4.4 Å x-ray structural model, positions of atoms are far from precise, and geometries are often non-ideal, so an x-ray structure of the complex with higher resolution may still reveal new, unexpected features of PSI. The computational model presented in this work may provide information on inter-subunit interactions or potential crystal contacts that could prove valuable in obtaining an improved crystal structure of plant PSI and be useful in phase determination of x-ray diffraction data.