Isoforms of photosystem II antenna proteins in different plant species revealed by liquid chromatography-electrospray ionization mass spectrometry.

The high selectivity offered by reversed-phase high-performance liquid chromatography on-line coupled to electrospray ionization mass spectrometry has been utilized to characterize the major and minor light-harvesting proteins of photosystem II (Lhcb). Isomeric forms of the proteins, revealed either on the basis of different hydrophobicity enabling their chromatographic separation or on the basis of different molecular masses identified within one single chromatographic peak, were readily identified in a number of monocot and dicot species. The presence of several Lhcb1 isoforms (preferably in dicots) can explain the tendency of dicot Lhcb1 to form trimeric aggregates. The Lhcb1 molecular masses ranged from 24,680 to 25,014 among different species, whereas within the same species, the isoforms differed by 14-280 mass units. All Lhcb1 proteins appear to be highly conserved among different species such that they belong to a single gene group that has several different gene family members. In all species examined, the number of isoforms corresponded more or less to the genes cloned previously. Two isoforms of Lhcb3 were found in petunia and tomato. For Lhcb6, the most divergent of all light-harvesting proteins, the greatest number of isoforms was found in petunia, tobacco, tomato, and rice. Lhcb2, Lhcb4, and Lhcb5 were present in only one form. The isoforms are assumed to play an important role in the adaptation of plants to environmental changes.

apoproteins are encoded by families of nuclear genes (1-3) and over 40 nucleotide sequences encoding LHCII proteins from more than 15 different plant species have been cloned and sequenced (4). Moreover, a large number of protein bands can be resolved by gel electrophoresis from LHCII preparations (3). The precise relationship between the multiple functional roles played by LHCII, the diversity of polypeptides found in the membrane, and the complexity of the gene family that encodes the LHCII apoproteins has not yet been fully established. The observed protein heterogeneity has, in fact, a number of possible origins. The different protein isoforms could be: (i) products of distinct genes (5), (ii) post-translational modifications of one or a few primary translation products (6-8), (iii) cleavage products of a common precursor at successive maturation stages (9), or (iv) artifacts introduced during sample workup and analysis.
The antenna system is organized into a mobile complex containing homotrimers of Lhcb1 and heterotrimers combining Lhcb1 and Lhcb2 (10) and an immobile complex containing trimers of Lhcb1 and Lhcb3, probably in a 2:1 ratio, associated with the minor antenna proteins, which altogether surround the core complex. In most of the models proposed for the architecture of the light-harvesting complex, the minor antenna proteins form a link between the PSII core proteins and the major antenna proteins that are located in the periphery of PSII (11). Regarding the mobile complex, it has recently been reported that more than one trimeric form of LHCII may exist and that these are organized into different supramolecular complexes with differing aggregation tendency to PSII ("S," "M," and "L" types, referring to strongly, moderately, and loosely bound LHCII, respectively) (12). Specifically, the Lhcb1 proteins occur in several isoforms, but the functional significance of the diversity of the Lhcb1 apoproteins is still unknown. In particular, little attention has been devoted so far to the role of the subtypes of Lhcb1. It is also not clear whether different major antenna proteins assemble into homo- or heterotrimers and whether they serve different functions within the thylakoid membrane. The organizational differences observed in PSII can alter the efficiency of light-energy transfer from the major light-harvesting complexes to the photochemical reaction center and thus provide the system with a means to regulate photosynthetic efficiency under the various light or stress conditions to which green plants have to adapt (13,14). The emerging picture of the molecular organization and regulation of the photosynthetic apparatus presents one of the most interesting and challenging areas of modern plant biochemistry.
However, it is becoming increasingly clear that the protein composition of the PSII antenna is complex and that there are several polypeptides with small differences in molecular mass and sequence that have to be unambiguously characterized in order to understand their function within the PSII.
Various analytical techniques, including SDS-polyacrylamide gel electrophoresis (PAGE), SDS-isoelectric focusing, high-performance liquid chromatography (HPLC), electrospray ionization mass spectrometry (ESI-MS) (15,16), or matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS) (17,18), are applicable to the characterization of minute differences in heterogeneous proteins. Traditionally, the protein components of the PSII major and minor antenna system are resolved by SDS-PAGE into several closely migrating protein bands (19,20). Additionally, reversed-phase HPLC (RP-HPLC) has been successfully applied to the fractionation of the various types of antenna proteins (21). Although single-stage chromatographic separation systems offer high resolving power, the separation selectivity may not suffice to separate all protein isoforms having very subtle structural differences (22). A practical means of increasing the selectivity of an analytical system is the use of two or more separation stages, resulting in a multidimensional separation (23). While MS is generally viewed as a spectroscopic technique yielding the molecular mass as well as structural information, it can also be viewed as a separation technique distinguishing between different mass-to-charge ratios. Consequently, the coupling of HPLC and MS represents a multidimensional separation system offering the high selectivity indispensable to distinguish closely related proteins.
In the past decade, the advent of ESI and MALDI as soft ionization methods for biological macromolecules has greatly enhanced the role of protein MS in structural biochemistry (24-26) and in proteomic studies (27). By virtue of its high resolving power and the utilization of eluents that are compatible with ESI-MS (28), RP-HPLC has become the separation method of choice for direct interfacing with ESI-MS (29,30). In this article, the strategy of RP-HPLC-ESI-MS has been used to separate and identify isomeric forms of major and minor antenna proteins in different plant species. The high reproducibility and short analysis times of the hyphenated system enabled a comparison of the antenna proteome from 14 different dicot and monocot species. This information was utilized to generalize differences in the antenna proteome between monocots and dicots and their influence on the supramolecular organization of the PSII antenna system.
Isolation of the Major and Minor Antenna Systems by Sucrose-gradient Ultracentrifugation-Chloroplast thylakoid membranes (PSII membranes) were isolated from the following dicot and monocot leaves: spinach (Spinacia oleracea), petunia (Petunia hybrida), pea (Pisum sativum), tomato (Lycopersicon esculentum), tobacco (Nicotiana tabacum), cucumber (Cucumis sativus), soybean (Glycine max), Vicia faba, Populus albae, maize (Zea mays), rice (Oryza sativa), barley (Hordeum vulgare), rye (Secale cereale), and wheat (Triticum aestivum) according to the method of Berthold et al. (31) with the modifications reported elsewhere (21). The light-harvesting complex was isolated from the PSII membranes as described previously (11) with the following modifications. PSII membranes were pelleted by centrifugation at 10,000 × g for 5.0 min at 4°C, suspended in Buffer 1 (50 mM MES, pH 6.3, 15 mM sodium chloride, 5 mM magnesium chloride) containing 1.0 mg/mg chlorophyll, and then solubilized by adding 1% (w/v) N-dodecyl β-D-maltoside. Unsolubilized material was removed by centrifugation at 10,000 × g for 10 min. The supernatant was rapidly loaded onto a 0.1-1.0 M sucrose gradient containing Buffer 1 and 5.0 mM N-dodecyl β-D-maltoside. The gradient was then spun on a Kontron Model Centricon T-1080 ultracentrifuge equipped with a Model TST 41.14 rotor at 39,000 rpm for 18 h at 4°C. Green bands were harvested with a syringe. The SDS-PAGE analysis of these green bands revealed that band 2 contained a mixture of the protein components of the major and minor PSII antenna systems, whereas band 3 essentially contained the protein components of the major PSII antenna system as reported previously (11). These bands were used for HPLC analysis without any further treatment.

Aggregation Tendency of Antenna Proteins Revealed by Sucrose Gradient Ultracentrifugation-In each species examined, PSII was isolated from thylakoid membranes of leaves by Triton X-100 extraction. Then the PSII complex was extracted as described in Ref. 31 and subjected to sucrose-gradient ultracentrifugation. Fig. 1 illustrates an example of the band patterns obtained upon sucrose-gradient ultracentrifugation from the monocot species maize (left tube) and the dicot species spinach (right tube). The fractionated protein components were the monomeric antenna system (in band 2) and the trimeric major antenna system (in band 3). It can be seen clearly that after loading the same amount of protein, the intensity of band 2 in the dicot species is significantly lower than in the monocot species, indicating that in dicots most of the light-harvesting proteins are organized in trimers. Additional sucrose gradients from different species revealed that in all nine dicot species studied band 2 was less intense than band 3, whereas in all four monocot species examined the two bands were of similar intensity (Fig. 1). Spectrophotometric determination of chlorophyll in each sucrose band revealed that in dicots the chlorophyll content of band 2 is about 40% lower than that of band 2 collected from monocots. This result suggests that the monomeric and trimeric forms are present to a similar extent in monocots, whereas the trimeric form is more prevalent in dicots.

Determination of the Molecular Masses of the Major and Minor Antenna Proteins by RP-HPLC-ESI-MS-Although in a former investigation the chromatographic and mass spectrometric conditions were optimized to analyze exclusively the antenna proteins of spinach (32), conditions in this study were chosen to be generally applicable to the characterization of the antenna proteins in a large variety of plant species. Thus, a gradient of 39-69% acetonitrile in 0.05% aqueous trifluoroacetic acid in 45 min was utilized to elute the protein components from a 250 × 4.6 mm inner diameter C4-silica column. This facilitated not only the separation of all different types of major and minor antenna proteins in the different species but also the efficient removal of detergents from the sample, which were needed to solubilize the hydrophobic membrane proteins during sample preparation but which are known to efficiently inhibit ion formation in ESI-MS. The components of band 3 from sucrose gradient ultracentrifugation, containing only the trimeric major antenna proteins, as well as those of band 2, containing both monomeric major and minor antenna proteins, were analyzed separately. However, under the chromatographic conditions, the trimers from band 3 dissociated into the monomeric apoproteins, yielding the same molecular masses as those observed for the equivalent apoproteins present in band 2. Fig. 2a illustrates the reconstructed ion chromatogram of antenna proteins upon injection of band 2 from petunia. A total of 15 protein masses were extracted from the chromatogram; these proteins are related to the six types of PSII light-harvesting proteins and were identified by comparison with the molecular masses deduced from the DNA sequences. From the number of proteins found, it is evident that at least some of them must be present in different isoforms. Fig. 2, b-e, reports the deconvoluted mass spectra of Lhcb1, Lhcb6, and Lhcb3 for which six, four, and two isoforms, respectively, were identified. Some of the isoforms were separable by RP-HPLC because of their differing hydrophobicity (e.g. Lhcb1.1 from Lhcb1.2 or Lhcb6.1 from Lhcb6.2/6.3/6.4), whereas some others coeluted in one HPLC peak but were distinguished by ESI-MS analysis (e.g. Lhcb1.3/1.4/1.5 or Lhcb6.2/6.3/6.4). For three of the antenna proteins, namely Lhcb2, Lhcb4, and Lhcb5, only one protein was found. A similar picture was observed for the major and minor antenna proteins of tomato in which Lhcb1 occurred in five isoforms, Lhcb3 occurred in two isoforms, and Lhcb6 occurred in two isoforms (Fig. 3, a-e), whereas Lhcb2, Lhcb4, and Lhcb5 revealed only one molecular mass each. Finally, Fig. 4 reports the reconstructed ion chromatogram of antenna proteins from rice. In this monocot species, the number of isomeric proteins was significantly lower, showing only two isoforms for Lhcb1 and two for Lhcb6 (Fig. 4, b and c). Table I summarizes the number of all isomeric proteins observed for Lhcb1, Lhcb3, and Lhcb6 in nine dicot and five monocot species. In the table, the sum represents the total number of isoforms observed for the antenna proteins, whereas the individual numbers indicate the number of proteins coeluting in one HPLC peak that were, however, distinguishable by ESI-MS. From Table I it becomes evident that the diversity of Lhcb1, Lhcb3, and Lhcb6 is significantly more pronounced in dicots as compared with monocots. Moreover, a higher number of isoforms is observed generally in species in which a higher number of genes have also been cloned. In all species examined, the other antenna proteins (Lhcb2, Lhcb4, and Lhcb5) were present in only one copy. Another aspect of the data presented in Table I is the presence of two separable HPLC peaks for Lhcb1 in nine of the 14 species examined. Only two of the species showed one peak, and three species showed more than two distinct HPLC peaks for Lhcb1.
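The deconvoluted spectra referred to above are obtained by converting the series of multiply charged ions that ESI produces for each protein into a single neutral mass. The following Python sketch illustrates the textbook calculation from adjacent peaks of one charge envelope. It is not the instrument software used in this study; the function names, the simulated charge states (20+ to 25+), and the use of 24,880 as an example mass are assumptions made purely for illustration.

```python
PROTON_MASS = 1.00728  # mass of a proton in Da


def charge_from_adjacent_peaks(mz_a, mz_b):
    """Infer the charge z of the peak at mz_a.

    mz_a and mz_b are adjacent peaks of the same protein, where mz_a
    carries z protons and mz_b carries z + 1 protons (so mz_a > mz_b).
    From m/z = M/z + p it follows that z = (mz_b - p) / (mz_a - mz_b).
    """
    return round((mz_b - PROTON_MASS) / (mz_a - mz_b))


def neutral_mass(mz, z):
    """Neutral molecular mass of an ion observed at m/z with z protons."""
    return z * (mz - PROTON_MASS)


def deconvolute(peaks):
    """Average the neutral mass over adjacent peak pairs.

    `peaks` must be the m/z values of one charge envelope, sorted in
    descending order (i.e. ascending charge).
    """
    masses = []
    for mz_a, mz_b in zip(peaks, peaks[1:]):
        z = charge_from_adjacent_peaks(mz_a, mz_b)
        masses.append(neutral_mass(mz_a, z))
    return sum(masses) / len(masses)


def simulate_envelope(mass, charges):
    """Hypothetical helper: ideal m/z values for the given charge states."""
    return [(mass + z * PROTON_MASS) / z for z in charges]


# Illustrative envelope for a 24,880 Da protein (assumed charge states).
example_peaks = simulate_envelope(24880.0, range(20, 26))
example_mass = deconvolute(example_peaks)
```

Real spectra contain noise and overlapping envelopes from coeluting isoforms, so production software uses more robust algorithms; the sketch only shows why a mass difference of a few tens of mass units between isoforms remains resolvable after deconvolution.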
Comparison of the Hydropathic Protein Profiles and the Theoretical Retention Coefficients-To collect more information on the isomeric proteins observed, Fig. 5 compares the hydropathic profiles for the Lhcb1 antenna proteins determined from the DNA sequence using the Kyte-Doolittle method (33). In Fig. 5a, the hydropathic profiles of Lhcb1 in pea, spinach, and tomato are compared. Fig. 5, b and c, illustrates the profiles of two different gene products within the same species, giving one example for the dicot tomato (Fig. 5b) as well as one for the monocot maize (Fig. 5c). It was generally observed that the Lhcb1 hydropathic profiles show significant differences only in the amino-terminal region, which explains the diverse retention times among different species and within the same species. The dissimilarities in the transmembrane and carboxyl-terminal regions are only very subtle as a consequence of the highly conserved protein sequence both within and among species.
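A hydropathic profile of the kind compared in Fig. 5 can be sketched in a few lines of Python. The scale values below are the published Kyte-Doolittle hydropathy indices; the window length of 9 residues and the short test sequence are assumptions for illustration, since the window used for Fig. 5 is not stated here.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Sliding-window mean hydropathy along a protein sequence.

    Returns one value per window position; an empty list is returned
    when the sequence is shorter than the window.
    """
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    if window > len(scores):
        return []
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Toy sequence (not an Lhcb1 sequence): a polar stretch followed by
# an apolar stretch, so the profile rises from negative to positive.
toy_profile = hydropathy_profile("KKDEKKDEKLLIVALLIVA", window=9)
```

Comparing two such profiles position by position, for example for two gene products of the same species, shows where the sequences diverge in hydrophobicity; for Lhcb1 the text locates these differences in the amino-terminal region.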
Calculation of the retention coefficients for isomeric proteins in various species according to Browne et al. (34) revealed that there are two distinct groups of proteins both in dicots and monocots (Table II): one population with retention coefficients higher than 900 and another with retention coefficients lower than 900. These differences in the hydrophobicity are confirmed by the aliphatic indices (35) (Table II) and most probably explain the fractionation of isoforms into two major HPLC peaks (Figs. 2 and 3 and Table I). The two groups of Lhcb1 proteins display differences in the amino-terminal region, which give two main peaks with different elution times within the same species. Thus, it is likely that at least two subpopulations of Lhcb1 antenna proteins exist with significant differences in amino acid composition resulting in separability by RP-HPLC.
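As an illustration of the two descriptors used above, the following Python sketch computes an aliphatic index and sorts proteins into the two retention groups. It assumes that the aliphatic index of Ref. 35 follows the commonly used definition based on the mole percentages of Ala, Val, Ile, and Leu with weighting factors 2.9 and 3.9. The retention coefficients of Browne et al. (34) are not reproduced here, so the grouping function takes a precomputed coefficient as input; the function names and the handling of a value of exactly 900 are hypothetical choices.

```python
def aliphatic_index(sequence):
    """Aliphatic index from mole percentages of A, V, I and L.

    Assumed definition: AI = X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu)),
    where X is the mole percent of the residue in the sequence.
    """
    sequence = sequence.upper()
    length = len(sequence)

    def mole_percent(residue):
        return 100.0 * sequence.count(residue) / length

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


def retention_group(retention_coefficient, threshold=900.0):
    """Assign a protein to the high or low retention population.

    The threshold of 900 is taken from the text; treating a value of
    exactly 900 as 'low' is an arbitrary choice made for this sketch.
    """
    return "high" if retention_coefficient > threshold else "low"
```

For the toy sequence "AVIL" the function gives 25 + 2.9 × 25 + 3.9 × 50 = 292.5, a value far above that of any real protein, which serves only to check the arithmetic.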

Revelation of Proteomic Heterogeneity by RP-HPLC-ESI-MS Analysis-The presence of multigene families is common in the plant kingdom, in which a crucial task for the future will be to understand the biological significance of the numerous gene families with tandem gene arrangement. In particular, the antennae of photosystem I and II are encoded by several genes that show a high homology and consequently should encode proteins with similar molecular masses and molecular properties. Because of almost identical electrophoretic mobilities, such closely related proteins are difficult to separate and identify by conventional SDS-PAGE. Two-dimensional gel electrophoresis, currently the most powerful method for the separation of a large number of proteins, is not applicable for these hydrophobic membrane proteins because of their very similar isoelectric points and the necessity for them to be kept in solution by suitable detergents. Here we show for the first time that a multidimensional approach using RP-HPLC on-line coupled to ESI-MS is suitable to characterize even the isomeric forms of the antenna proteins of PSII in a single chromatographic run. In fact, several isomers of Lhcb1, Lhcb3, and Lhcb6 have been revealed in a variety of plants without the need for special experimental procedures such as the use of polyclonal and monospecific antibodies (36), dedicated electrolyte solutions, and/or extended length gels for SDS-PAGE (37,38).
The existence of several isoforms of Lhcb1 is in accordance with molecular genetic data. However, isomers for Lhcb3 have only been reported in tomato so far (3), whereas our method readily enabled the discovery of isoforms of Lhcb3 and/or Lhcb6 in tomato, petunia, pea, tobacco, and rice. Interestingly, the other antenna proteins (Lhcb2, Lhcb4, and Lhcb5) were present only in single copies in all investigated species. The isomers of Lhcb1 were recognized previously because, in most species, the major antenna system of PSII showed more than one major band or peak that could be resolved either by SDS-PAGE in a 15-cm gel (38) or by RP-HPLC (21). However, the difference in apparent mass of ~1,000-2,000 mass units revealed by SDS-PAGE in combination with a specific antibody was too large to be correlated with the DNA sequences for genes of the same type (36). On the other hand, the mass differences among the isoforms measured by RP-HPLC-ESI-MS are in the same range as the mass differences deduced from DNA sequence data for genes of the same type. Moreover, the total number of protein isoforms found and the number of genes cloned are very similar in most species (Table I).
The presence of several Lhcb1 isoforms reopens the old debate about whether or not the Lhcb1 polypeptide heterogeneity observed is the result of expression of multiple LHCII cab genes. The general fuzziness of the Western blot bands was interpreted by assuming that an initial precursor of Lhcb1 apoprotein could be processed in a different manner, giving rise to a multitude of Lhcb1 polypeptides (36). Nevertheless, the presence of small posttranslational modifications in a protein cannot significantly modify its electrophoretic mobility. Thus, the various bands observed in SDS-PAGE can be explained only by mutations that introduce changes in the number of positive or negative charges. Although the precision and accuracy of mass determinations by ESI-MS are typically between 0.01 and 0.02%, which translates into a maximum mass deviation of 5 mass units for a protein having a molecular mass of 25,000, the differences observed between molecular masses measured by ESI-MS and molecular masses deduced from the DNA sequences ranged from a few to more than 100 mass units. This discrepancy may be due to additional posttranslational modifications or to errors in published DNA sequence data. Therefore, we conclude that the isoforms of Lhcb proteins represent families of polypeptides having slightly different amino acid sequences, which are the gene products of distinct gene families. This assumption is supported by recent data on Z. mays in which the partial, highly homologous amino acid sequences of the six polypeptides observed in SDS-urea PAGE revealed distinct differences in their primary sequences, suggesting that they are distinct gene products with distinct pigment binding properties (39).
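The accuracy argument above is simple arithmetic: 0.02% of 25,000 is 5 mass units. The Python sketch below makes the comparison explicit for measured and sequence-deduced masses. The function names are hypothetical, and the example values are the tomato masses quoted elsewhere in this article (24,696 measured versus 24,692 deduced for Lhcb1, and 22,830 measured versus 22,800 predicted for Lhcb6).

```python
def max_mass_deviation(mass, relative_accuracy=0.0002):
    """Largest expected mass error for a given relative accuracy.

    The default of 0.0002 corresponds to the 0.02% upper limit quoted
    for ESI-MS mass determinations.
    """
    return mass * relative_accuracy


def within_accuracy(measured, predicted, relative_accuracy=0.0002):
    """True if the measured mass agrees with the predicted mass
    within the stated relative accuracy."""
    tolerance = max_mass_deviation(predicted, relative_accuracy)
    return abs(measured - predicted) <= tolerance


# Tomato Lhcb1: 4 mass units apart, inside the tolerance of about 4.9.
lhcb1_agrees = within_accuracy(24696, 24692)

# Tomato Lhcb6: 30 mass units apart, well outside the tolerance of
# about 4.6, so measurement error alone cannot explain the difference.
lhcb6_agrees = within_accuracy(22830, 22800)
```

A deviation outside the tolerance does not identify its cause; as stated above, it may reflect posttranslational modification or an error in the published DNA sequence.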
Two isoforms of Lhcb3 were observed in petunia and in tomato. In the latter, two separable Lhcb3 polypeptides with a mass difference of 30 were observed in agreement with two bands found by SDS-PAGE (40). Regarding the PSII minor antenna system, only Lhcb6, the most divergent of all LHC sequences (3), had two isoforms in tomato, petunia, tobacco, and rice. Two genes have been cloned in tomato, in which the mature cab10A and cab10B proteins have predicted masses of 22,610 and 22,800, respectively (41). RP-HPLC-ESI-MS analysis revealed two isoforms with molecular masses of 22,610 and 22,830, respectively. This correspondence is encouraging in that it indicates the high selectivity of RP-HPLC, which is capable of separating membrane proteins differing by only two amino acids. Two isoforms were also revealed in rice, but only one was revealed in maize, in contrast to reports that four closely migrating Lhcb6 bands were obtained in non-denaturing isoelectric focusing-PAGE (42).
Types of Protein Isomers-From the chromatograms presented in Figs. 2, 3, and 4 as well as from Table I, we infer that two main categories of isomers can be identified: isoforms showing significantly different hydrophobicities, which render them separable by HPLC, and isoforms of similar hydrophobicity coeluting in one chromatographic peak that can be distinguished only by ESI-MS analysis. The isoforms showing different hydrophobicity most probably have greater differences in amino acid composition than the isoforms revealed within the same HPLC peak. In the particular case of Lhcb1, in which a significant number of genes is known, the splitting up into retention coefficients higher or lower than 900 (Table II) strongly corroborates the existence of two isomer categories. Thus, it is not surprising that RP-HPLC is capable of differentiating two categories of Lhcb1 protein subfamilies in most of the species. In tomato, the Lhcb1 polypeptides are encoded by two gene clusters (cab1 and cab3) that are located on two different chromosomes (43) and encode polypeptides that differ in only eight positions out of 232 in the mature protein. Microsequencing has revealed that the mature cab3 polypeptide is two amino acids longer and has one more positively charged amino acid than cab1. Hence, it is likely that the more hydrophilic protein in the chromatograms, Lhcb1.1, corresponds to the cab3 gene product, whereas the later eluting Lhcb1.2 corresponds to cab1, which is also consistent with the measured molecular masses of 24,696 (24,692 deduced from the cab1 DNA sequence) and 24,880 (24,879 deduced from the cab3 DNA sequence), respectively.
Similarly, in soybean, antenna protein genes have been shown to exist as two subfamilies (44,45), and we also found two proteins belonging to the two different populations (Table II). Additionally, it may be anticipated that within the two different functional populations of Lhcb1, other polypeptides exist with only very slight sequence differences and almost identical hydrophobicity (Table II). Such subpopulations can only be recognized on the basis of their different molecular masses, but their correlation to distinct cab gene sequences is difficult because of their very similar molecular masses. From the hydropathic profiles, it becomes clear that significant differences in the amino acid sequence and thus, in the hydrophobicity, are localized within a small region of the amino terminus (Fig. 5) that can be found both within the isoforms in a single species as well as in isoforms in different species. Sigrist and Staehelin (36) used specific antibodies against synthetic peptides corresponding to the most unique sequence domains of the amino-terminal region to reveal that most species have more than one Lhcb1 and that differences between them are located in the amino-terminal region. The occurrence of such localized heterogeneity only in the hydrophilic part of the proteins points to a functional role in the supramolecular organization of the antenna system.
Physiological Relevance of Isomeric Forms of Antenna Proteins-An answer to the question "why do several genes encode isoforms of the same polypeptide?" will be the key to understanding the biological significance of the numerous multigene families. We assume that the differences in hydrophobicity present in the first part of the amino terminus of the two Lhcb1 subpopulations play an important role in the interaction of different Lhcb1 isoforms to form supramolecular aggregates. Thus, the differences in primary structure can provide a modulation of the physiological effect of LHCII proteins, which have distinct topological locations within the PSII supramolecular complexes. Recent studies by electron microscopy and image analysis have revealed the presence of supercomplexes consisting of trimeric LHCII in three different types of binding positions (S, M, and L, referring to strongly, moderately, and loosely bound LHCII, respectively) (12). The existence of three different types of trimer is difficult to explain on the assumption that they are formed from the same LHCII proteins. It may be that the different recently discovered trimers that form super- and mega-aggregates contain different isomeric Lhcb1 proteins. In agreement with this hypothesis is the evidence that fractionation of maize grana yields five major LHCII isoforms with a trimeric structure (39). Furthermore, analysis of the crystal structure has shown that most of the predicted subunit-subunit interactions of the LHCII monomer are mediated by residues in the amino-terminal domain (46).
Comparison of Monocots and Dicots-The heterogeneity of Lhcb1 detected in monocots was generally less than that in dicots (Table II). The band patterns seen after sucrose-gradient ultracentrifugal separation of PSII preparations from different species indicate that monocots show less oligomerization than dicots, suggesting that in dicots the trimers are more stable. This evidence agrees with a previous report that dicots appear to generate higher ratios of oligomeric-to-monomeric antenna system in vitro (47).
In conclusion, Lhcb1, as well as to a minor extent Lhcb3 and Lhcb6, may exist in multiple forms, each showing different amino acid sequences mainly in the amino-terminal region. All these different forms may be easily revealed and quantified by the strategy of using RP-HPLC coupled on-line with ESI-MS. These isoforms play a substantial role in the subunit-subunit interactions stabilizing homo- and/or heterotrimers. Interestingly, preliminary experiments on the influence of environmental stresses (such as high light intensity, high temperature, or drought) on the photosynthetic apparatus reveal that such factors influence only some of the isomeric forms, indicating a possible general role of isomers in the adaptation of plants to environmental conditions.