Structural and functional analyses of Rubisco from arctic diatom species reveal unusual posttranslational modifications

The catalytic performance of the major CO2-assimilating enzyme, ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco), restricts photosynthetic productivity. Natural diversity in the catalytic properties of Rubisco indicates possibilities for improvement. Oceanic phytoplankton contain some of the most efficient Rubisco enzymes, and diatoms in particular are responsible for a significant proportion of total marine primary production as well as being a major source of CO2 sequestration in polar cold waters. Until now, the biochemical properties and three-dimensional structures of Rubisco from diatoms were unknown. Here, diatoms from arctic waters were collected, cultivated, and analyzed for their CO2-fixing capability. We characterized the kinetic properties of five and determined the crystal structures of four Rubiscos selected for their high CO2-fixing efficiency. The DNA sequences of the rbcL and rbcS genes of the selected diatoms were similar, reflecting their close phylogenetic relationship. The Vmax and Km for the oxygenase and carboxylase activities at 25 °C and the specificity factors (Sc/o) at 15, 25, and 35 °C were determined. The Sc/o values were high, approaching those of mono- and dicot plants, thus exhibiting good selectivity for CO2 relative to O2. Structurally, diatom Rubiscos belong to form I C/D, containing small subunits characterized by a short βA–βB loop and a C-terminal extension that forms a β-hairpin structure (βE–βF loop). Of note, the diatom Rubiscos featured a number of posttranslational modifications of the large subunit, including 4-hydroxyproline, β-hydroxyleucine, hydroxylated and nitrosylated cysteine, mono- and dihydroxylated lysine, and trimethylated lysine. Our studies suggest adaptation toward achieving efficient CO2 fixation in arctic diatom Rubiscos.

Oxygenase activity leads to a significant loss of carbon to the atmosphere and a decrease in the carboxylation efficiency (for reviews, see, for example, Refs. 4-6). Because of the significance of Rubisco to crop production, plant nitrogen and water usage, and the global carbon cycle, there is considerable interest in investigations aimed at reducing the oxygenase activity. Rubisco has been the subject of intense research, including structural, mechanistic, and mutagenesis studies. However, despite the vast amount of data available, the molecular basis for its CO2/O2 discrimination is not fully understood.
The ratio of carboxylation and oxygenation, measured as the CO2/O2 specificity factor (for a definition, see below), is not fixed, and there is substantial variation among phototrophs (7). For instance, the specificity factor is very low (around 20) in anoxygenic nonsulfur purple bacteria but considerably higher (~80-100) in Rubisco from green plants. Form I Rubisco, to which plant Rubiscos belong, can be further divided into two subgroups: green-like, containing higher plants, green algae, and cyanobacteria, and red-like, containing among others eukaryotic nongreen algae (i.e. diatoms and rhodophytes) (8-10). The latter group contains some of the most CO2-efficient forms of Rubisco.
The genetic, phylogenetic, kinetic, and structural characteristics of red-like Rubiscos from marine organisms are to a large extent unknown. For example, little is known about Rubisco from psychrophilic organisms that live in arctic waters. The objective of the present work was therefore to study arctic/cold water microalgae to provide new information on Rubisco function at the molecular level. We have studied the natural variation in Rubisco from northern diatoms, which thrive at the light-limited low temperature environments within and below the ice and make up the main part of primary production in fish-rich areas (11).

Growth experiments
The growth data were standardized to compare measurements obtained by different methods (see "Experimental procedures") and hence only serve the purpose of comparing the species in a relative manner. The mean growth rate obtained from the different methods was 0.47 doublings/day, with a minimum of ~0.05 and maximum of ~1.2 doublings/day. The expected maximum doubling rate at 3-4°C is ~1.0 doublings/day (12); considering that we used a light/dark photoperiod of 14 h/10 h, we conclude that the growth achieved for our seven chosen species was in the maximum range. The results from the growth rate experiments showed that the overall fastest grower at 2-3°C was Thalassiosira antarctica (interpreted from the average growth data). In this temperature range, it was followed by Bacterosira bathyomphala and Thalassiosira nordenskioeldii (Table 1). The slowest growers in this temperature range were Thalassiosira gravida and Thalassiosira hyalina. At 7°C, the fastest growers were T. antarctica, B. bathyomphala, and T. nordenskioeldii. When both temperatures were considered, the fastest growers were T. antarctica, B. bathyomphala, and Chaetoceros socialis; thus, T. antarctica performed best at both temperatures. The overall mean increase in growth rate from the low to the high temperature regime was ~0.03 (in standardized relative units), and only C. socialis, T. nordenskioeldii, T. gravida, and Skeletonema marinoi responded with increased growth rates when the temperature increased (Table 1).

Determination of kinetic constants of Rubisco enzymes from arctic diatoms
The CO2-fixation efficiency of Rubisco shows considerable species-specific variation (13). Our objective was to identify the most efficient Rubiscos among diatoms, a group of microalgae that are prime candidates for finding new highly efficient Rubisco enzymes. The partitioning between the carboxylation and oxygenation reactions (vc/vo) depends on the relative concentrations of the gaseous substrates and the relative catalytic efficiencies (Vmax/Km) of the two activities in accordance with the relationship,

$$\frac{v_c}{v_o} = \frac{V_c K_o}{V_o K_c} \cdot \frac{[\mathrm{CO_2}]}{[\mathrm{O_2}]}$$

where vc and vo are the velocities of carboxylation and oxygenation, respectively, Vc and Vo are the maximal velocities of the two reactions, and Kc and Ko are the Michaelis constants for CO2 and O2, respectively. The composite of constants in the equation is referred to as the specificity factor and is often denoted Sc/o or Ω (14). The specificity factor is usually determined from the product of the measured 3PGA/2PG concentrations and the known [O2]/[CO2] ratio.

Optimization of a Rubisco purification procedure for use with marine diatoms was undertaken, and a suitable protocol was developed that resulted in over 80% pure Rubisco. Diatom Rubisco content was generally much lower than in plants, confirming earlier observations (15).
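As an illustration only, the relationship between the partitioning ratio and the specificity factor can be sketched in a few lines of Python. The function names and the numerical values are hypothetical and are not taken from the measurements reported here; Kc and Ko are assumed to be expressed in the same concentration unit as [CO2] and [O2].

```python
def specificity_from_kinetics(v_c, k_c, v_o, k_o):
    """Specificity factor S_c/o = (V_c * K_o) / (V_o * K_c)."""
    return (v_c * k_o) / (v_o * k_c)


def partitioning_ratio(s_co, co2, o2):
    """Ratio of carboxylation to oxygenation velocity, v_c / v_o."""
    return s_co * co2 / o2


def specificity_from_rates(vc_over_vo, co2, o2):
    """Invert the relationship: S_c/o = (v_c / v_o) * [O2] / [CO2]."""
    return vc_over_vo * o2 / co2


# Hypothetical kinetic constants (concentrations in micromolar).
s_co = specificity_from_kinetics(v_c=3.0, k_c=30.0, v_o=1.0, k_o=900.0)
ratio = partitioning_ratio(s_co, co2=10.0, o2=250.0)
print(s_co, ratio)
```

The sketch makes explicit that the specificity factor is a property of the enzyme, whereas the partitioning ratio also depends on the gas concentrations at the active site.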
HPLC analysis of [14C]RuBP oxygenation and carboxylation was first evaluated to examine diatom Rubisco CO2/O2 specificity. This method is labor-intensive, highly sensitive to relatively small changes in 3PGA and 2PG concentrations, and requires tightly controlled reaction conditions. Therefore, a method based on the oxygen electrode was employed, giving real-time data collection. Wheat Rubisco was used as an internal standard. In addition, Km and Vmax were determined for Rubisco from diatom species with wheat as a control.
Using assays and screening protocols especially developed for diatom Rubisco enzymes, specificity factors were determined for five diatom species at a range of temperatures from 15 to 35°C (Table 2) (16). The specificity factors of diatom Rubiscos were close to that for wheat Rubisco. In all of the arctic species examined, the specificity factor increased at decreasing temperatures. None of the arctic species examined had a higher specificity factor than wheat, even when values were extrapolated to 0°C. However, unlike wheat Rubisco, diatom Rubiscos were not deactivated when exposed for prolonged periods (~24 h) to temperatures of 4°C (data not shown). These observations suggest structural adaptations to the low temperatures in the extreme environment these diatoms occupy.

Table 1. Overall standardized maximum growth rates of seven diatom species chosen for further analysis. The results are means of repeated experiments with species-specific n > 48. See "Experimental procedures" for explanation of the standardization method.

Rubisco from arctic diatoms

Crystal structures of Rubisco enzymes from arctic diatoms
Crystals of diatom Rubisco species were obtained, and the corresponding structures were determined. Details of data collection and refinement are summarized in Table 3. Overall, the quality and resolution of the data were very good, with the best crystals diffracting to better than 2 Å resolution. However, because some crystals were thin in at least one dimension, the corresponding data were anisotropic. The quality of the structures was significantly improved by the use of TLS refinement implemented in the refinement programs REFMAC5 and PHENIX, but the quality of the T. nordenskioeldii and B. bathyomphala structures remained substandard, and they were not included in the final set of structures. The RbcL sequence from T. nordenskioeldii (O98947) was used for an initial fit to the electron density maps and subsequently modified to fit the density as refinement progressed. In parallel to this, genomic DNA was extracted from the cell cultures, and partial sequences of rbcL and rbcS were determined to aid model building (Fig. S1). The sequences of Rubisco from diatoms in this study were highly similar, as would be expected in view of their close relationship. GUG is the translation start codon of all rbcS genes sequenced. This codon normally codes for valine, but the protein structures show that, as expected, methionine was inserted in this position.
Diatom Rubisco (Fig. 1A) is a hexadecamer of eight large (L, 490 residues) and eight small (S, 139 residues) subunits and belongs to form I C/D (reviewed in Ref. 5). This form includes a small subunit that is distinct from the small subunits of form I A/B enzymes (e.g. in cyanobacteria and higher plants) and is characterized by a short βA–βB loop and a C-terminal extension (βE–βF loop) that forms a β-hairpin structure. The β hairpins from four small subunits together form a β barrel that lines the entrance to the central solvent channel at each end of the holoenzyme (Fig. 1B). Form I C/D structures have previously been observed in Rubisco from the betaproteobacterium Cupriavidus necator (formerly Ralstonia eutropha), and the red algae Galdieria partita and Galdieria sulfuraria (17-19). The diatom structures are highly similar; structures can be superimposed with root mean square deviations of 0.15-0.32 Å for all Cα atoms.
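For readers unfamiliar with the measure, the root mean square deviation quoted above can be sketched as follows. This is a minimal illustration that assumes the two Cα coordinate sets have already been optimally superimposed and are listed in matching residue order; it is not the superposition procedure used in this work, and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (same unit as the input, e.g. angstrom)
    between two equally long lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Two hypothetical, pre-superimposed C-alpha traces of two atoms each.
trace_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
trace_b = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]
print(round(rmsd(trace_a, trace_b), 3))
```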

Posttranslational modifications
The structures of diatom Rubisco feature a number of posttranslational modifications in the large subunit (Table 4 and Figs. 2 and 3). Rubisco is activated by carbamoylation of the ε-amino group of an active-site lysine residue and subsequent coordination to Mg2+ (20,21). Thus, as expected for the fully activated enzyme examined in the present study, electron density corresponding to a carbamoyl group is observed at the ε-amino group of Lys-205 (corresponding to Lys-201 of spinach Rubisco). 4-Hydroxy-Pro residues are present at positions 48 and 155. Hydroxy-Pro-155 (Fig. 2A) corresponds to hydroxy-Pro-151 of Rubisco from the green alga Chlamydomonas reinhardtii (22), whereas hydroxy-Pro-48 appears to be unique to diatom Rubiscos. Both residues are relatively buried in the interior of the protein. Electron density corresponding to a modification of the sulfur atom of Cys-109 was detected in some (T. antarctica, T. hyalina, B. bathyomphala) but not all diatoms.
Analysis of this extra density shows that it is most consistent with hydroxylation. A large extra density at Sγ of Cys-457 suggested a different modification; we first considered carbamylation or methylation (methylcysteine was detected in Rubisco from C. reinhardtii (22)), but analysis of side-chain atom temperature factors and difference density maps after refinement indicated such modifications to be unlikely. Instead, nitrosylation of the Cys-sulfur was found to best fit the extra density (Fig. 2B). The S-nitroso group of Cys-457 is accessible to solvent, suggesting that it may be involved in interactions with binding partners. Nitrosylation at Cys-457 was most prominent in Rubisco from C. socialis, but the modification could also be detected at lower occupancy in the enzymes from T. antarctica and T. hyalina (Table 4). It was only faintly detected in the enzyme from S. marinoi; this is probably because of the low resolution of the data. Lys-150 features additional density at Cγ and Cδ most consistent with hydroxylation (Fig. 3). This is a relatively unusual modification that has not been reported previously. Lys-150 is located on the dimer interface of the large subunits and forms several hydrogen bonds with its 3- and 4-hydroxyl groups to Ser-147 of the adjacent subunit (Fig. 3B). Such contacts have been shown to influence stability, catalysis, and specificity in Rubisco (23-25). It is therefore likely that these interactions, which would not be present in the unmodified enzyme, will confer extra stability to the holoenzyme. Additional density at Cδ of Lys-198 was interpreted as monohydroxylated lysine. Lys-346 shows extra density at its Nε corresponding to trimethylation (Fig. 2C). Trimethyl-Lys-346 is located at the exterior of the hexadecamer and is accessible to the solvent. Trimethyl-lysine has been detected at position 14 of Rubisco from some plants (26), although it has not yet been observed in a crystal structure, presumably due to disorder of the N terminus. Trimethylation of residue 346 appears to be unique to the present structures. Leu-174 is hydroxylated at Cβ (Fig. 2D); the modification introduces an additional hydrogen bond contact to the main-chain nitrogen of Asp-202. As mentioned above, rbcS sequences deduced from the crystal structure differ from the DNA sequence at residue 1. All of the modifications are unambiguous for each of the 4-8 copies in the asymmetric unit.

Discussion
Finding a Rubisco enzyme that has its carboxylation reaction enhanced relative to its oxygenase reaction and engineering this trait into the Rubisco enzymes of economically important crop plants has potential implications with regard to both agricultural and environmental considerations. Besides increasing yield, it would potentially allow the growth of crop plants in areas with a short season (i.e. short summers) and, in areas with longer seasons, permit more than one harvest per season. Current concerns regarding global warming and the greenhouse effect point to the need for a better understanding of global carbon fluxes, in particular in the oceans and between the ocean and the atmosphere.
Table footnotes (refinement statistics): Fo and Fc are the observed and calculated structure factor amplitudes, respectively. c, Rfree calculated from a randomly chosen 5% of all unique reflections. d, root mean square deviation.

Little is known about the biochemical properties of Rubisco from marine microorganisms, which are estimated to be responsible for about half of the total net primary production (NPP). Initial findings indicate that Rubisco enzymes from marine microalgae carry a number of unusual features that make them prime candidates for further studies. Young et al. (27) reported the kinetic constants of Rubisco enzymes from a set of diatoms, which were all of southern origin and had a relatively large geographic spread. Much less has been reported about the properties of Rubisco from northern diatoms, and no structures of diatom Rubisco have been described so far. Specificity factors measured from diatom Rubisco are high relative to those of cyanobacteria. Although the specificity factor serves as an important first diagnostic parameter to indicate changes in efficiency of engineered Rubisco enzymes, it is but one parameter that determines the net efficiency of Rubisco enzymes. It is becoming increasingly obvious that environmental factors, such as the temperature and the aridity of the environment from which the organism evolved, are important factors that influence Rubisco's carboxylation capacity (28). In the case of marine phytoplankton, CO2 and light limitations are important factors to consider. Phytoplankton have adopted carbon-concentrating mechanisms (CCMs) to offset the problems of CO2 limitation and use the high levels of bicarbonate in sea water (29,30). Evidence for a CCM in diatoms to date is mainly from model diatoms (31), whereas direct evidence for a CCM in northern diatoms is currently lacking. In common with previously analyzed red-type Rubisco enzymes, the northern diatoms show a reduced affinity for O2 (Table 2) but lack the very high affinity for CO2 observed for nongreen algae, such as Griffithsia monilis (32,33). This, together with the low concentration of free dissolved CO2 in sea water, would point to the need for a CCM. The operation of a CCM may increase photosynthetic light requirements.
Thus, it seems possible that the evolution of high-specificity factors in diatoms (compared with, for example, ocean-living cyanobacteria) may contribute to their ability to grow well in the light-limited environment typical of the early bloom or under the arctic ice or during periods when the maximum solar elevation is low for longer periods (34). As the catalytic efficiency of Rubisco increases, one would expect that less nitrogen (as the constituent amino acids of Rubisco) would be required to maintain a given photosynthetic rate. From our results, the species that had the highest specificity factor relative to the others, T. hyalina/T. antarctica, also had the highest overall growth rate and the highest growth rate at the lowest (2°C) temperature. In addition, the diatom with the lowest specificities, S. marinoi, is considered a more southern species that does not normally enter the true arctic growth regime (11). This, in our opinion, indicates that high-specificity Rubisco may be a cold water/arctic evolutionary adaptation connected to competitive advantages. Hobson et al. (35) have reported high specific activities and low cellular concentrations of Rubisco in diatoms relative to green algae, illustrating the coupling between carbon assimilation and nitrogen metabolism. Although speculative, improvements in Rubisco specificity would be ecologically significant if they affect the competitive ability of a species.
The Rubisco rbcL/S sequences obtained from genomic DNA extracted from the cultured diatom material were generally confirmed by the amino acid sequences deduced from the electron density maps. Most of the differences may not be of significant consequence for the function of the enzyme; for instance, the commonly observed Ile/Val substitution (or Ile/Leu) belongs to the same class of apolar amino acids with similar physico-chemical properties.
T. nordenskioeldii, T. gravida, and T. antarctica are common in the northern cold water to temperate regions (11), whereas T. hyalina is described as an arctic species (36). It is well known that it is difficult to distinguish morphologically between Thalassiosira species (e.g. the morphologically determined identity of T. gravida may be confused by the fact that it may change morphology when the temperature is lowered, whereby it resembles T. rotula) (37). T. gravida may also easily be misidentified as T. antarctica (38). There are also different "types" of T. antarctica; the one cultivated here probably most resembles T. antarctica var. borealis (39). In addition, the genetic information for the group is largely incomplete, and, as a consequence, the available species concepts may be incomplete, and phenotypic (and physiological) adaptation may well occur over short intervals of time. Considering the more southern origin of the diatoms that have been studied to date, the differences that we observe in this study may well be due to true variation occurring in the species collected in arctic/North Atlantic waters. There is also the possibility that certain species may carry several copies of rbcL and/or rbcS genes and that these may be expressed differentially. Plants and green algae are known to have multiple nucleus-encoded rbcS genes; for instance, wheat carries over 20 rbcS genes, whereas C. reinhardtii has two copies (reviewed in Ref. 40). Some prokaryotes even have multiple copies of both rbcS and rbcL genes (41). Multiple copies are assumed to be the result of lateral gene transfer (8), but this has not been addressed specifically for diatoms. In addition, some chloroplasts have been found to exhibit maternal, paternal, and biparental modes of inheritance even within the same species; the latter has been shown in, for example, Pseudo-nitzschia (42). Whereas it is presently not possible to distinguish between these alternatives, it is not unlikely that the conditions in a mass culture may give rise to sequence variations (multiple sequences) in a manner observed here.
Whereas the carbamylation of an active-site lysine residue has been established as essential for activity, the roles of other posttranslational modifications of Rubisco have not been elucidated with regard to functional significance (reviewed in Refs. 43 and 44). Hydroxyproline is a major component of collagen, where the absence of the hydroxyl group on proline (caused by a deficiency in vitamin C) results in the disease scurvy. Hydroxyproline is also found in diverse proteins localized to the plant cell wall (45,46), but this residue has never been observed in Rubisco from vascular plants. Whereas S-hydroxycysteine was detected here for the first time in Rubisco, 4-hydroxyproline and S-methylcysteine have been observed earlier in Rubisco from the unicellular green alga C. reinhardtii (22), but there is as yet no known function for these modifications in algae. Redox regulation of Rubisco activity via cysteine residues has been extensively studied in C. reinhardtii (47). Nitric oxide (NO) signaling regulates various physiological processes in animals, plants, and algae. In the diatom Skeletonema costatum, a link has been found between nitric oxide concentrations and programmed cell death (48), suggesting a role for nitric oxide in the massive cell loss occurring at the end of a diatom bloom. Whether nitrosylation of Rubisco at Cys-457 is part of this mechanism remains to be elucidated, but we note that S-nitrosocysteine has also been detected at the corresponding position (Cys-460) in Rubisco from the red alga G. sulfuraria (19). The presence of mono- or dihydroxylated or trimethylated lysines in the diatom Rubisco enzymes investigated in this study is also enigmatic. Hydroxylysines have been detected in the hydrolysates of peptides and proteins exposed to HO•/O2 and subsequently treated with NaBH4 (49), but such oxidizing conditions are difficult to imagine in the live diatom cell.
Hydroxylysine is a component of collagen and has also been reported to become incorporated instead of lysine in the bacterial cell wall (50). The diatom cell is characterized by its silica-based cell wall. Silica-precipitating peptides from the diatom Cylindrotheca fusiformis have been shown to contain posttranslationally modified lysines (51) that are necessary for their silica-precipitating activity. These lysine residues are ε-dimethylated, ε-trimethylated, or δ-hydroxylated or contain a combination of these modifications. It is not clear why lysine residues of diatom Rubiscos should be modified in the same way. It may be that Rubisco has evolved to utilize the enzymes responsible for these modifications and that these confer some sort of advantage (e.g. insensitivity to tryptic proteolysis (44,52) or stability). Occurrence of these modifications in all of the diatoms used in this study confirms their close relationship. Trimethylation of Lys-14 of the large subunit of Rubisco occurs in some plants (e.g. those belonging to the Solanaceae or Cucurbitaceae families), but not all (26,43). It is possible that the interaction with other proteins (e.g. chaperones or Rubisco activase) may be regulated by trimethylation, but at present, there is no experimental evidence to substantiate this assumption. Similar to the S-nitroso group of Cys-457, the trimethyl group of Lys-346 is located on the surface of the protein, thereby enabling potential contacts with other binding partners.

Conclusion
Oceanic primary production is dominated by phytoplankton, and diatoms account for a significant proportion of the NPP (2,3). Here, we present structural and functional data on a large set of diatoms from arctic cold waters. Our results indicate adaptation of diatom Rubiscos in response to the environment in which they live, including low contents of Rubisco protein, high specificity factors approaching those of the most efficient crop plants coupled with low oxygen sensitivities, and a number of posttranslational modifications.

Collection of algae in the arctic east ice area and selection of species for cultivation
To establish diatom monocultures, samples of algae were collected with 20-μm phytoplankton nets during three spring cruises to the Atlantic and arctic Barents Sea from 2004 to 2006. As an initial guideline, when species were selected, it was assumed that the quantitatively most important species recorded during field investigations were the fastest and most successful growers (for the compilation of abundances, see Ref. 11). The chosen potential candidates were representatives from the genera Chaetoceros, Thalassiosira, Bacterosira, and Skeletonema.

Growth rate measurements
Small-scale cultivation experiments were performed to identify the fastest growers under nutrient-replete conditions (i.e. CO2 and autoclaved natural sea water with added nitrogen, phosphorus, and silicate to f/10 concentrations). These experiments were performed in irradiance- and temperature-controlled/logged rooms at two irradiances and temperatures (fluorescent daylight tubes, light/dark = 14/10 h, scalar irradiance 25 and 125 μmol quanta m−2 s−1, 2-3 and 7°C) using 25- and 1500-ml nontoxic Erlenmeyer plastic flasks. Because monitoring growth from a single measure of biomass (e.g. chlorophyll a (Chla), which may vary with species and light level) may not be sufficient to detect "true" increase in overall biomass, several methods were applied to detect the fastest growers. The methods were increase in (i) cell numbers (inverted microscope counting), (ii) organic bound carbon and nitrogen (Carlo Erba Elemental analyzer), and (iii) in vitro Chla and pheophytin content (53). We computed growth as doublings/day from the formula,

$$\mu = \frac{\log_2 C_2 - \log_2 C_1}{D} \qquad \text{(Eq. 1)}$$

where μ represents doublings/day, C2 and C1 are cell numbers, and D is the number of days.
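The growth-rate formula (Eq. 1) can be sketched as a small Python function. The function name and the example cell counts are hypothetical and serve only to show the arithmetic.

```python
import math


def doublings_per_day(c1, c2, days):
    """Growth rate in doublings/day: (log2(C2) - log2(C1)) / D.

    c1 and c2 are cell numbers at the start and end of the interval,
    and days is the length of the interval in days.
    """
    if c1 <= 0 or c2 <= 0 or days <= 0:
        raise ValueError("cell numbers and interval length must be positive")
    return (math.log2(c2) - math.log2(c1)) / days


# Hypothetical example: a culture growing from 1,000 to 8,000 cells
# in 3 days has doubled three times, i.e. about one doubling per day.
print(doublings_per_day(1000, 8000, 3))
```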
In addition, 14C radioactive tracer photosynthesis (carbon assimilation) measurements were performed applying 5 μCi of aqueous sodium bicarbonate/100 ml of culture (for detailed method, see Ref. 54). The scalar irradiance exposure gradients were 330, 172, 102, 53, 13, and 0 μmol quanta m−2 s−1. Further, we calculated both the α-slope of the photosynthesis curve (mg C (mg Chla)−1 h−1 μmol quanta m−2 s−1) and Pmax, the maximum photosynthesis (mg C (mg Chla)−1 h−1). The above experiments were repeated several times to achieve robust data sets for statistical analysis (n = 2,680). In the end, data for the highest growth rates, maximum photosynthesis (Pmax), and slope photosynthesis (α) for each species, condition, and experiment were standardized using the formula,

$$z = \frac{x - \mu}{\sigma}$$

where x is measured growth rate, μ is population mean doubling, and σ is population S.D. The standardized results were then pooled for each species, the results were ranked, and the following seven diatom species were considered fast growers and were chosen for further investigation: B. bathyomphala, T. antarctica, T. hyalina, T. nordenskioeldii, T. gravida, C. socialis, and S. marinoi.
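The standardization, pooling, and ranking steps can be sketched as follows. This is a minimal illustration under stated assumptions: values are standardized within each measurement method using the population mean and population S.D., and the standardized scores are then averaged per species. The grouping by method, the species labels, and all numbers are hypothetical and do not reproduce the analysis of the actual data set.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores (x - mean) / population S.D. for a list of values."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize values with zero spread")
    return [(x - mu) / sigma for x in values]


def rank_species(measurements):
    """measurements maps method -> {species: value}.

    Values are standardized within each method (an assumption), pooled
    per species by averaging, and returned sorted from fastest to slowest.
    """
    pooled = {}
    for by_species in measurements.values():
        species = list(by_species)
        scores = standardize([by_species[s] for s in species])
        for s, z in zip(species, scores):
            pooled.setdefault(s, []).append(z)
    mean_scores = {s: mean(z) for s, z in pooled.items()}
    return sorted(mean_scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical growth data from two methods for three placeholder species.
example = {
    "cell_counts": {"A": 1.0, "B": 0.5, "C": 0.0},
    "chla": {"A": 0.9, "B": 0.2, "C": 0.7},
}
for species, score in rank_species(example):
    print(species, round(score, 2))
```

Standardizing before pooling puts measurements with different units on a common relative scale, which is why the resulting values are only suitable for comparing species with one another.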

Mass cultivation
The selected species were mass cultured in specially constructed 300-liter plexiglass cylinders in temperature- and irradiance-controlled rooms. Cultivation took place at ~4°C under a 14/10-h light/dark regime and at optimal (Imax) scalar irradiances determined during the small-scale 14C experiments. When the desired culture densities had been reached (150-500 μg of Chla liter−1), the cultures were harvested onto specially designed 20-μm mesh plankton net devices. The samples were stored at −80°C before further analysis.

Purification of Rubisco for determination of specificity factors
Twenty-five ml of extraction buffer (100 mM Bicine, pH 8.0, 6% PEG 4000, 5 mM DTT, 1 mM each of benzamidine, phenylmethylsulfonyl fluoride, ε-amino-n-caproic acid (ε-ACA), and EDTA, 1% (v/v) Tween 80, 0.2 mM EGTA, 0.5% (w/v) polyvinylpolypyrrolidone, 0.5% (v/v) protease inhibitor mixture (Sigma P5955), and 1% (w/v) washed sand) was ground to a frozen powder in liquid nitrogen (N2). To this was added 20-40 ml of a harvested mass culture from above, and then the mixture was ground to a frozen powder. A further 175 ml of extraction buffer was added, 25-50 ml at a time, with frequent grinding until thawing was complete. On thawing, polysaccharide hydrolases were added (200,000 units of lysozyme, 40 units of pectinase, 8 units of cellulase, all supplied by Sigma UK), and the ice-cold homogenate was sonicated (6-8-μm amplitude) for 30 s followed by a 60-s interval. This was repeated until the total sonication time was 2 min. The extract was clarified by centrifugation (22,000 × g, 20 min, 4°C), brought to 20% (w/v) PEG 4000 and 20 mM MgCl2, and then stirred for 30 min at 4°C. The resulting protein precipitate was sedimented by centrifugation (22,000 × g, 20 min, 4°C) and redissolved in 8 ml of ice-cold gradient buffer (10 mM Tris, pH 8.0 (HCl), 10 mM MgCl2, 10 mM NaHCO3, 5 mM DTT, 1 mM EDTA, 1 mM KH2PO4, 1 mM benzamidine, 1 mM ε-ACA), using a precooled homogenizer to achieve a lump-free suspension. The suspension was clarified by centrifugation (235,000 × g, 20 min, 4°C), applied to a previously prepared sucrose gradient (0.3-1.2 M sucrose in gradient buffer) at a rate of 4 ml of suspension per 35 ml of sucrose gradient, centrifuged for 190 min at 370,000 × g at 4°C, fractionated into 1-ml aliquots, and then snap-frozen in liquid N2. A small sample previously taken from each fraction was assayed for protein content and Rubisco activity (55).
Fractions containing the Rubisco activity peak (between fractions 9 and 14 from the bottom) were pooled and passed through PD-10 columns (2 ml of sample/column) pre-equilibrated in column buffer (100 mM Bicine, pH 8.1, 10 mM MgCl2, 10 mM NaHCO3, 5 mM DTT, 0.5 mM EDTA, 1 mM ε-ACA, 1 mM benzamidine, 1 mM KH2PO4). The resulting protein eluates were combined and passed through 0.45-μm regenerated cellulose filters before sample concentration using Centriplus concentrators (Millipore Amicon, 150,000 NMWL). The final volume of the resulting Rubisco was ~0.50 ml, which was snap-frozen in liquid N2 before short-term storage at −80°C.

Rubisco for the determination of kinetic constants was prepared by a simplified procedure that omitted the sonication, sucrose gradient, and ultrafiltration treatments. It consisted of homogenization, sedimentation, PEG precipitation, clarification of the redissolved protein, and passage through PD-10 columns pre-equilibrated with column buffer supplemented with 2% (w/v) PEG 4000, followed by freezing in liquid N2 before short-term storage at −80°C.

Determination of specificity factors
Specificity factors for diatom Rubisco were determined by real-time data collection based on rates of carboxylation and oxygenation measured by 14C incorporation and an oxygen electrode, respectively. Wheat Rubisco was used as an internal standard, and before use, a freeze-dried stock of wheat Rubisco was dissolved in CO2-free 0.1 M Bicine, pH 8.2, containing 20 mM MgCl2. The purified Rubisco samples were then desalted by centrifugation through G25 Sephadex columns previously equilibrated with CO2-free 0.1 M Bicine, pH 8.2, containing 20 mM MgCl2. Potassium phosphate (400 mM, pH 8.2) was then added to give a final concentration of 4 mM. NaH14CO3 (37 GBq mol−1) was then added to a final concentration of 10 mM, and the wheat Rubisco was activated by incubation at 37°C for 40 min. Diatom Rubisco showed no increase in activity in response to warming but maintained activity for 24 h when kept at 4°C (data not shown). Reaction mixtures were prepared in an oxygen electrode (model DW1, Hansatech, Kings Lynn, UK) by first adding 0.95 ml of 100 mM Bicine, pH 8.1, containing 10 mM MgCl2 and 20 μg (50 WA units) of carbonic anhydrase, pre-equilibrated with CO2-free air at 25°C, and 0.02 ml of 0.1 M NaH14CO3, 18.5 GBq/mol. A sufficient amount of activated Rubisco was then added in 25 μl to complete the reaction in 5 min. The reaction was started by the addition of 10 μl of 18.5 mM RuBP. RuBP oxygenation was calculated from the oxygen consumption and carboxylation from the amount of 14C incorporated into 3PGA when all of the RuBP was consumed (56). A number of reaction mixtures containing pure wheat Rubisco were interspersed with those containing Rubisco from diatoms. In addition, measurements of specificity at 15 and 35°C were made. The procedure followed was similar to that at 25°C.
Mean initial concentrations of O 2 in solution in equilibrium with air were 305, 254, and 227 M at 15, 25, and 35°C, respectively, as determined by the integrated Hansatech software. Initial concentrations of CO 2 in solution were calculated from the amounts of NaHCO 3 added, using pK a values for H 2 CO 3 of 6.19, 6.11, and 6.06 at 15, 25, and 35°C, respectively. The specificity values were normalized to the average value for wheat Rubisco, of 94 Ϯ 4 (S.D.) (n ϭ 4) at 25°C. The determinations were repeated 3-5 times at each temperature, using material pooled from two or three biological replicates.
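The calculation of dissolved CO2 and of a specificity factor from total-consumption data can be illustrated with a minimal sketch. It assumes that only the first ionization of carbonic acid matters at the assay pH (Henderson-Hasselbalch), that the initial gas concentrations are used directly, and that the specificity factor follows the standard definition Sc/o = (vc/vo) × ([O2]/[CO2]). The function names and the example quantities (2 mM total bicarbonate, a 10:1 carboxylation-to-oxygenation ratio) are hypothetical and are not taken from the measurements reported here.

```python
def dissolved_co2_um(total_bicarbonate_mm, ph, pka):
    """Dissolved CO2 (uM) in equilibrium with bicarbonate.

    Assumption: total inorganic carbon = [CO2] + [HCO3-], i.e. carbonate
    (second ionization) is neglected.
    """
    bicarbonate_to_co2 = 10 ** (ph - pka)  # [HCO3-]/[CO2]
    return total_bicarbonate_mm * 1000.0 / (1.0 + bicarbonate_to_co2)


def specificity_factor(carboxylation, oxygenation, co2_um, o2_um):
    """Sc/o = (vc / vo) * ([O2] / [CO2]); both rates in the same units."""
    return (carboxylation / oxygenation) * (o2_um / co2_um)


# Hypothetical example at 25 degrees C: pH 8.1, pKa 6.11, 254 uM O2.
co2 = dissolved_co2_um(total_bicarbonate_mm=2.0, ph=8.1, pka=6.11)
s_co = specificity_factor(carboxylation=10.0, oxygenation=1.0,
                          co2_um=co2, o2_um=254.0)
print(f"[CO2] = {co2:.1f} uM, Sc/o = {s_co:.0f}")
```

Because the ratio [HCO3−]/[CO2] depends exponentially on pH − pKa, small errors in the measured pH propagate directly into the calculated CO2 concentration and hence into Sc/o.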

Determination of catalytic parameters
Catalytic parameters were measured essentially as described previously (57). Carboxylation activity was measured at 8, 16, 24, 36, 68, and 100 μM CO2 (aqueous) in equilibrium with a gas phase of N2 containing 2, 21, 56, or 92% (v/v) O2 at 25 °C. Km and Vmax for carboxylation (Kc and Vc, respectively) were calculated at each O2 concentration using a Michaelis-Menten kinetic model. Km and Vmax for oxygenation (Ko and Vo, respectively) were calculated as follows:

$$K_o = \frac{[\mathrm{O_2}]}{(K_{m,\mathrm{app}}/K_c) - 1}$$

$$V_o = \frac{V_c \, K_o}{K_c \, S_{c/o}}$$

where Kc is the Km for CO2 in the absence of O2, Km,app is the apparent Km for CO2, as measured in the reactions equilibrated with 21, 56, or 92% O2, and Sc/o is the specificity factor. Specific mixtures of N2 and O2 were prepared using a gas divider (Signal Group, Camberley, UK), and concentrations of O2 in solution were calculated at 100% relative humidity and standard atmospheric pressure (101.3 kilopascals). The solubility of O2 was taken as 257.5 μM. The concentration of CO2 in solution (in equilibrium with HCO3−) was calculated assuming a pKa of 6.11 for the first ionization of carbonic acid, taking into consideration the pH of each buffer solution (measured on the day of assay). Carbonic anhydrase (≥50 WA units/1-ml reaction; Sigma, Poole, UK) was present in the reaction solution to maintain equilibrium between NaHCO3 and CO2. The Rubisco samples used in these assays had all been equilibrated in NaHCO3- and MgCl2-containing buffers during the purification procedures (above) and were found not to require any further activation before assay. Control reactions were performed by measuring CO2 fixation (acid-stable 14C) in reaction solutions lacking RuBP or NaHCO3, in reactions in which 3PGA was substituted for RuBP, and following total inhibition of Rubisco by prior treatment with an excess of the tight-binding Rubisco inhibitor 2′-carboxyarabinitol-1,5-bisphosphate (CABP). These controls confirmed that the activity measured (i.e. all acid-stable 14C detected) was entirely due to Rubisco.
Radioactive content of 14C-labeled compounds was measured in 0.40-ml aqueous solutions, following the addition of 3.6 ml of Ultima Gold scintillation mixture (PerkinElmer), using a Tri-Carb 2910 TR Liquid Scintillation Analyzer (PerkinElmer Life Sciences, Seer Green, UK).
Values of Michaelis-Menten constants and maximum velocities were estimated using EnzFitter (Biosoft, Cambridge, UK). Turnover number (kcat; mol of product × mol of active site−1 × s−1) was calculated from the corresponding Vmax values (Vc and Vo; μmol of product × mg of Rubisco−1 × min−1) after determination of Rubisco concentration in the samples. This was accomplished using the [14C]CABP-binding assay described previously (58).
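The kinetic calculations can be sketched as follows, under stated assumptions. O2 is treated as a competitive inhibitor of carboxylation, so that Km,app = Kc × (1 + [O2]/Ko), and the specificity factor is taken by its standard definition, Sc/o = (Vc × Ko)/(Vo × Kc). The Michaelis-Menten fit below uses a Hanes-Woolf linearization purely for illustration; it is a substitute for, not a reproduction of, the nonlinear regression performed by dedicated fitting software. All function names and numerical values are hypothetical.

```python
def fit_michaelis_menten(substrate, rate):
    """Estimate (Km, Vmax) by Hanes-Woolf linearization: s/v = s/Vmax + Km/Vmax."""
    n = len(substrate)
    x = list(substrate)
    y = [s / v for s, v in zip(substrate, rate)]
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                      # 1 / Vmax
    intercept = mean_y - slope * mean_x    # Km / Vmax
    vmax = 1.0 / slope
    return intercept * vmax, vmax


def ko_from_apparent_km(km_app, kc, o2_um):
    """Ko = [O2] / (Km,app / Kc - 1), assuming competitive inhibition by O2."""
    if km_app <= kc:
        raise ValueError("Km,app must exceed Kc for a finite, positive Ko")
    return o2_um / (km_app / kc - 1.0)


def vo_from_specificity(vc, ko, kc, s_co):
    """Vo = (Vc * Ko) / (Kc * Sc/o), from the definition of the specificity factor."""
    return vc * ko / (kc * s_co)


# Hypothetical, noise-free data at the six CO2 concentrations used in the assay.
co2_um = [8, 16, 24, 36, 68, 100]
rates = [3.0 * s / (42.5 + s) for s in co2_um]   # simulated with Km,app = 42.5 uM
km_app, vc = fit_michaelis_menten(co2_um, rates)
ko = ko_from_apparent_km(km_app, kc=30.0, o2_um=250.0)
vo = vo_from_specificity(vc, ko, kc=30.0, s_co=100.0)
print(f"Km,app = {km_app:.1f} uM, Ko = {ko:.0f} uM, Vo = {vo:.2f}")
```

With real data, each O2 concentration yields its own Km,app and therefore its own estimate of Ko; comparing these estimates is a simple check on the assumption of competitive inhibition.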
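The conversion from Vmax to turnover number can be sketched as below. The sketch assumes that Vmax is expressed in μmol of product per mg of Rubisco per min and that the inhibitor-binding assay yields the number of active sites in nmol per mg of protein. The function name and the example values (14.5 nmol of sites per mg, roughly what one would expect for a hexadecamer of about 550 kDa with eight sites) are hypothetical.

```python
def kcat_per_second(vmax_umol_per_mg_min, active_sites_nmol_per_mg):
    """Turnover number (s^-1) = product formed per active site per second."""
    sites_umol_per_mg = active_sites_nmol_per_mg / 1000.0
    per_minute = vmax_umol_per_mg_min / sites_umol_per_mg  # mol product / mol site / min
    return per_minute / 60.0


# Hypothetical example: Vmax = 2.0 umol mg^-1 min^-1, 14.5 nmol sites per mg.
print(f"kcat = {kcat_per_second(2.0, 14.5):.2f} s^-1")
```

Normalizing to titrated active sites rather than to total protein makes kcat insensitive to inactive or contaminating protein in the preparation.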

Sequencing of Rubisco genes from marine diatom species
Total genomic DNA was isolated, and the DNAs were used as templates in PCRs to amplify the rbcL/S genes. Internal PCR primers were designed according to marine algal rbcL/S sequences that are already deposited in databases. Sequences of the 5′ and 3′ ends of the genes were amplified using the internal and a set of external primers designed according to genes flanking the rbcL/S gene cluster. These genes were found in a preliminary assembly of the genome of the diatom T. pseudonana on the website of the Joint Genome Institute of the United States Department of Energy (http://www.jgi.doe.gov/).
Genomic DNA from C. socialis, T. antarctica, T. hyalina, T. nordenskioeldii, S. marinoi, and B. bathyomphala was extracted by standard methods. Oligonucleotides were designed to amplify a region of the diatom genome including the Rubisco large and small subunit genes, rbcL and rbcS. In most cases, a high-fidelity DNA polymerase (PicoMaxx from Stratagene) was used to amplify this region, and the sequences of rbcL and rbcS from each species were determined. For each species, each base was covered by at least two sequencing reactions from independently generated PCR products. If there was any difference between the first two sequences, a third independently generated PCR fragment was sequenced. Two species initially gave more than one DNA sequence. In these cases, sequencing was repeated with DNA isolated from a new culture.
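The read-verification rule (two independent reads per base, with a third read to resolve any disagreement) can be expressed as a simple consensus check. This is a minimal sketch that assumes the reads are already aligned and of equal length; the function name and the example sequences are hypothetical.

```python
from collections import Counter


def consensus(reads):
    """Return (consensus, unresolved_positions) for aligned, equal-length reads.

    A base is accepted only if at least two reads agree on it and no other
    base is equally frequent; otherwise the position is reported as 'N'.
    """
    if len(reads) < 2:
        raise ValueError("at least two independent reads are required")
    if len({len(r) for r in reads}) != 1:
        raise ValueError("reads must be aligned to the same length")
    called, unresolved = [], []
    for position, bases in enumerate(zip(*reads)):
        top = Counter(bases).most_common(2)
        best_base, best_count = top[0]
        tie = len(top) > 1 and top[1][1] == best_count
        if best_count >= 2 and not tie:
            called.append(best_base)
        else:
            called.append("N")
            unresolved.append(position)
    return "".join(called), unresolved


first_two = ["ACGTTA", "ACGATA"]            # reads disagree at position 3
print(consensus(first_two))                 # position 3 is flagged as unresolved
print(consensus(first_two + ["ACGTTA"]))    # a third read resolves it
```

Positions that remain unresolved after a third read would correspond to the cases in which more than one sequence was obtained and sequencing had to be repeated from a new culture.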

Isolation and purification of Rubisco for structure determination
To yield crystallization-grade purity, frozen algae in glycerol (~20 g) were thawed and suspended in 10 ml of extraction buffer (50 mM Bicine, pH 8.0, 10 mM MgCl2, 10 mM NaHCO3, 1 mM EDTA, 5 mM 2-mercaptoethanol, 1 Complete protease inhibitor tablet (Roche Molecular Biochemicals), 5 μl of Benzonase nuclease (Novagen)). The algal suspension was disrupted in a One-shot cell disrupter (Constant System Ltd.). The extract was centrifuged (15 min, 20,000 rpm, Sorvall SS34). The supernatant was passed through a 0.45-μm syringe filter and applied to a Superdex 200 column (120 ml) equilibrated with purification buffer. Fractions (2 ml) were collected and analyzed by SDS-PAGE. Fractions containing Rubisco were pooled, diluted with an equal volume of 0.1 M NaCl in purification buffer, and further purified on a MonoQ ion-exchange column (8 ml). The sample was loaded onto the column and equilibrated with low salt (0.1 M NaCl in purification buffer). The protein was eluted with a linear 0.1-0.5 M NaCl gradient in 120 ml of purification buffer. Fractions (2 ml) were collected and analyzed by SDS-PAGE. Pooled fractions containing Rubisco yielded 2-5 mg of pure protein from 20 g of algae.

Crystallization, data collection, structure determination, and refinement
Before crystallization, the activated enzyme was concentrated to 20 mg ml−1 using Vivaspin 6 (Vivascience) and incubated with 0.001 M CABP. Crystals were grown using the hanging-drop vapor diffusion method at 20 °C. The drop contained equal amounts of the protein sample in crystallization buffer (0.05 M HEPES, pH 7.5, 0.05 M NaCl, 0.01 M NaHCO3, and 0.005 M MgCl2) with 0.001 M CABP added, and a well solution consisting of the crystallization buffer with 7-13% PEG 4000 as a precipitating agent. The crystals were flash-cooled in liquid N2 using a mother liquor with 30% ethylene glycol added as a cryoprotectant and maintained at 100 K for data collection. Diffraction data were collected at MAX-lab (Lund, Sweden) and at the European Synchrotron Radiation Facility (Grenoble, France) (Table 3). The data were processed using DENZO/SCALEPACK (59) and XDS (60). The crystal structures were solved by molecular replacement using the program MOLREP (61). The initial search model consisted of a set of one large and one small subunit of G. partita Rubisco (Protein Data Bank code 1BWV). Using the data for Rubisco from T. antarctica, eight solutions corresponding to eight different orientations of the search model in the hexadecamer of the asymmetric unit were found. The RbcL sequence from T. nordenskioeldii (O98947) was used for an initial fit to the electron density maps; this crude fit was subsequently improved using results obtained from sequencing of the gene and by inspection of electron density maps. Subsequently, the refined model of T. antarctica Rubisco was used as a search model to solve the remaining structures (Table 3). Modifications of the sequence were made as above.
Refinement was performed using REFMAC5 (62) and PHENIX (63). For cross-validation, 5% of the data were excluded from the refinement for Rfree calculations. Refinement consisted of one round of rigid-body refinement using data to 3 Å, followed by refinement using a maximum likelihood target function with noncrystallographic symmetry restraints. Noncrystallographic symmetry restraints were released toward the end of refinement of the structures to the highest resolution. TLS refinement (64) was used in the final stages with each subunit as a TLS group. Solvent molecules were added using ARP/wARP (65) and were manually inspected in O (66). Throughout the refinement, the 2mFo − DFc and mFo − DFc σA-weighted maps (67) were inspected, and the models were manually adjusted using O (66).
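The idea behind the cross-validation set can be illustrated with a short sketch: a fixed fraction of reflections is flagged once, kept out of refinement, and used only to compute Rfree with the usual residual R = Σ| |Fo| − k|Fc| | / Σ|Fo|. The random flagging below is an assumption made for illustration and does not describe how REFMAC5 or PHENIX assign their flags; the function names and amplitudes are hypothetical.

```python
import random


def select_free_set(n_reflections, free_fraction=0.05, seed=0):
    """Randomly flag a fraction of reflection indices for cross-validation."""
    rng = random.Random(seed)
    n_free = round(n_reflections * free_fraction)
    return set(rng.sample(range(n_reflections), n_free))


def r_factor(f_obs, f_calc, indices, scale=1.0):
    """R = sum(| |Fo| - k|Fc| |) / sum(|Fo|) over the given reflection indices."""
    idx = list(indices)
    numerator = sum(abs(f_obs[i] - scale * f_calc[i]) for i in idx)
    denominator = sum(f_obs[i] for i in idx)
    return numerator / denominator


# Hypothetical amplitudes for 1000 reflections.
rng = random.Random(1)
f_obs = [rng.uniform(10.0, 100.0) for _ in range(1000)]
f_calc = [f * rng.uniform(0.8, 1.2) for f in f_obs]

free = select_free_set(len(f_obs))
work = set(range(len(f_obs))) - free
print(f"Rwork = {r_factor(f_obs, f_calc, work):.3f}, "
      f"Rfree = {r_factor(f_obs, f_calc, free):.3f}")
```

Because the flagged reflections never contribute to the refinement target, a growing gap between Rwork and Rfree indicates overfitting of the model to the working data.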