Structural insights into marine carbohydrate degradation by family GH16 κ-carrageenases

Carrageenans are sulfated α-1,3-β-1,4-galactans found in the cell wall of some red algae that are practically valuable for their gelation and biomimetic properties but also serve as a potential carbon source for marine bacteria. Carbohydrate degradation has been studied extensively for terrestrial plant/bacterial systems, but sulfation is not present in these cases, meaning the marine enzymes used to degrade carrageenans must possess unique features to recognize these modifications. To gain insights into these features, we have focused on κ-carrageenases from two distant bacterial phyla, which belong to glycoside hydrolase family 16 and cleave the β-1,4 linkage of κ-carrageenan. We have solved the crystal structure of the catalytic module of ZgCgkA from Zobellia galactanivorans at 1.66 Å resolution and compared it with the only other structure available, that of PcCgkA from Pseudoalteromonas carrageenovora 9T (ATCC 43555T). We also describe the first substrate complex in the inactivated mutant form of PcCgkA at 1.7 Å resolution. The structural and biochemical comparison of these enzymes suggests key determinants that underlie the functional properties of this subfamily. In particular, we identified several arginine residues that interact with the polyanionic substrate, and confirmed the functional relevance of these amino acids using a targeted mutagenesis strategy. These results give new insight into the diversity of the κ-carrageenase subfamily. The phylogenetic analyses show the presence of several distinct clades of enzymes that relate to differences in modes of action or subtle differences within the same substrate specificity, matching the hybrid character of the κ-carrageenan polymer.

key role. As primary producers, they initiate and maintain the biogeochemical carbon cycle in the ocean and constitute the first link in the marine food chain. Among them are found macroalgae, which grow in the photic zone. Macroalgae are polyphyletic and belong to three distinct phyla, namely red, green, and brown algae. In addition to their role in climate regulation (1) and as shelter for many animal and microbial marine species (2), they are a major source of carbon for many communities. Particularly, they establish close interactions with marine bacteria living freely in the water column or entrapped in biofilm on the surface of macroalgae (3). There has been increasing interest in the study of the interactions between microbial communities and their associated host, especially in the case of terrestrial association in human gut (4).
Heterotrophic bacteria have developed sophisticated enzymatic tools to extract the carbon from plant cell walls. Notably, the ecological position of a bacterial strain in the microbial communities will influence the degradation strategy deployed. Some possess a wide set of enzymes that make them generalist organisms, like Bacteroidetes, others being more specialized in the degradation of certain families of polysaccharides (5). As an example of an integrative strategy, Bacteroidetes species have been shown to usually possess all of the genes involved in the degradation, recognition, and transport of a specific polysaccharide organized in clusters called polysaccharide utilization loci (6,7). However, the majority of studies are done on terrestrial bacteria, producing enzymes against land plant polysaccharides. In the case of marine heterotrophic bacteria, their arsenal of carbohydrate active enzymes is still largely unexplored (8). Indeed, they are adapted to the diversity of macroalgal polysaccharides, and many of these marine polysaccharides have no equivalent in land plants. One of their peculiarities is the abundance of sulfation motifs, which are absent in the land plant polysaccharides but also present in polysaccharides of the mammalian extracellular matrix (9).
One of the best-known families of algal sulfated polysaccharides is carrageenan. This family is composed of linear chains of α-1,3-β-1,4-galactans with an increasing number of sulfate groups per disaccharide unit, depending on the category of carrageenan. Accordingly, the three main structures are called κ- (one sulfate group), ι- (two sulfate groups), and λ- (three sulfate groups) carrageenans (Fig. 1) (10). The carrageenases, which cleave the β-glycosidic linkages of carrageenans, belong to different families of glycoside hydrolases (GH),² based on their amino acid sequences: GH16 for κ-carrageenases (11,12), GH82 for ι-carrageenases (13,14), and a new family of GH for λ-carrageenases (15,16). In this study, we will focus on the family of κ-carrageenases.
We report here the second crystal structure of a catalytic module of a GH16 κ-carrageenase, ZgCgkA GH16 from Z. galactanivorans DsijT, a marine flavobacterium model for the bioconversion of algal polysaccharides that was isolated from the surface of the red macroalga Delesseria sanguinea (15). In contrast to P. carrageenovora, Z. galactanivorans grows perfectly well with κ-carrageenan as the sole carbon source (25). Neither enzyme requires desulfation of the natural κ-carrageenan substrate to be active. Despite the sequence similarity between ZgCgkA GH16 and PcCgkA GH16 (40% identity, 50% similarity), several areas of divergence were observed, corresponding to the deletion or insertion of sequence stretches (Fig. 2A). The biochemical and structural comparison of both enzymes, along with a phylogenetic analysis of κ-carrageenases, gives a first insight into the diversity existing within this subfamily. This allowed us to propose some structural determinants that underlie the biochemical properties of these enzymes, which are active on a polyanionic substrate. In addition, the mutagenesis and kinetic analysis of PcCgkA GH16 point mutants allowed us to propose potential targets of interest for the enzymatic engineering of more efficient κ-carrageenases.

Phylogenetic analysis of κ-carrageenases
A BLAST search with PcCgkA GH16 as query identified 76 putative κ-carrageenases in GenBank™ (identity ranging from 30 to 97%). All of these sequences have conserved several key residues for the recognition of κ-carrageenan (Arg 196 and Arg 260 in PcCgkA GH16). A phylogenetic tree was built (Fig. 3) based on the multiple alignment of the catalytic module sequences of these 76 κ-carrageenases and of three GH16 β-agarases chosen as an outgroup (supplemental Fig. S1).
This phylogenetic tree can be divided into at least 10 major clades, based on the highest values of bootstrap of deepest nodes. Four major clades can be delineated. Clade A, which contains the κ-carrageenase from P. carrageenovora, and clade D correlate well with bacterial phyla from Proteobacteria, with a bootstrap value of 100% for both. Clade B is composed of two subgroups of sequences belonging to Bacteroidetes and Planctomycetes, with a bootstrap value of 98%. Clade C contains sequences from Bacteroidetes only, including ZgCgkA GH16, with a bootstrap value of 100%.
It is noteworthy that some bacterial strains contain several κ-carrageenase sequences that spread out throughout the whole phylogenetic tree, as exemplified by Algibacter sp. SK-16, which displays six sequences (highlighted by arrows in Fig. 3; specific alignment provided in supplemental Fig. S2) or Flammeovirga sp. OC4 with three sequences.
Interestingly, when looking at the specificities of the two clades A and C, we can notice that the presence or absence of sequence stretches in the alignment correlates with the phylogenetic clustering. Specific alignment of both PcCgkA GH16 and ZgCgkA GH16 allowed the definition of these sequence stretches more precisely as "fingers," numbered from F1 to F6. They are boxed on the sequence alignment of Fig. 2A. Their delimitation is based on analysis of secondary structure elements. These fingers are generally constituted by a succession of β-strand-[tight turn, loop, or helix]-β-strand structural elements and were identified as regions containing a critical divergence in the amino acid sequences of both κ-carrageenases studied here (green boxes in Fig. 2A).
All of the enzymes belonging to clade A possess a "finger F2-type" motif, which is absent in clade C, and all of the enzymes from clade C possess a "finger F4-type" motif, absent in clade A enzymes. Conversely, clade B and D enzymes possess both types of motifs, clade D containing the most conserved sequences for both fingers and clade B showing modifications of the nature and of the length of "typical" sequences.
² The abbreviations used are: GH, glycoside hydrolase(s); CBM, carbohydrate-binding module; PDB, Protein Data Bank; DP, degree of polymerization; DA, 3,6-anhydro-D-galactose; G4S, D-galactose 4-sulfate; HPAEC, high-performance anion-exchange chromatography; RMSD, root mean square deviation.
Figure 2. … The alignment was produced with MultAlin, and the figure was produced using ESPript version 3.x software. Secondary structure elements are symbolized with arrows for β-sheets, curls for α-helix, and T for turns; the "finger" regions surrounding the catalytic channel are boxed and numbered from F1 to F6; the sequence divergences that have structural consequences are boxed in green; the numbers 1 in green refer to the cysteine residues involved in the single disulfide bond of PcCgkA GH16; the black stars above the sequence highlight the amino acids that were mutated into alanine for PcCgkA GH16 in this study. B, schematic representation of the structural fold of ZgCgkA GH16. The color code is graduated from blue (N terminus) to red (C terminus) along the polypeptide chain. Catalytic residues Glu 159-Asp 161-Glu 164 are shown as sticks and colored in red. C, schematic representation of the structural fold of PcCgkA GH16-E168D in complex with a κ-neocarratetraose, shown in stick representation (carbon atoms are white, oxygen atoms are red, and sulfur atoms are orange). The color code of the polypeptide chain is the same as in B. Catalytic residues Glu 163-Asp 165-E168D are shown as sticks and colored in red.

Structure and function of GH16 κ-carrageenase enzymes
The remaining sequences of the phylogenetic tree possess bootstrap values that are too low to constitute a monophyletic clade and to allow the establishment of clear correlations between the sequences and structural features.
Another interesting feature observed in the phylogenetic tree of κ-carrageenases is that some enzymes possess a modular architecture (e.g. Z. galactanivorans), whereas others consist of catalytic modules only (e.g. Z. uliginosa, P. carrageenovora). Notably, the modular nature is not correlated with the clades, except for the clade B Planctomycetes sequences, which are all only composed of a catalytic module.

Biochemical characterization of ZgCgkA GH16 and ZgCgkA GH16-CBM16-PorSS and comparison with PcCgkA GH16
The κ-carrageenase from Z. galactanivorans, ZgCgkA, is a modular enzyme displaying an N-terminal GH16 catalytic module, a carbohydrate-binding module of family 16 (CBM16), and a C-terminal Type-IX secretion module (PorSS) (11,25). Two forms of the enzyme were cloned from genomic DNA, the catalytic module alone (ZgCgkA GH16) and the entire form of the enzyme (ZgCgkA GH16-CBM16-PorSS). PcCgkA is composed of a GH16 catalytic domain and an Ig-like module of unknown function. Only the catalytic module (PcCgkA GH16) was previously cloned (17) and has been used in this study. The three proteins were successfully expressed in soluble form and purified to homogeneity.
The two forms of ZgCgkA, with and without CBM16, were studied in parallel with the recombinant enzyme PcCgkA GH16. First, we determined the optimal pH, buffer, and temperature conditions for both forms of ZgCgkA. Efficiency was highest at pH 6 in MES buffer for both constructs, and analysis by dynamic light scattering revealed that both forms are stable up to 40°C (data not shown). The comparison of the kinetic parameters of the three enzymes on dilute solutions of κ-carrageenan shows that ZgCgkA GH16-CBM16-PorSS and ZgCgkA GH16 are more efficient than PcCgkA GH16 by 3 and 5 times, respectively (Table 1). The absence of the CBM in the ZgCgkA GH16 construct causes a 40% increase of efficiency in solution, compared with the entire enzyme.
Degradation kinetics on a longer time course (5 h) gave a similar result in solution (Fig. 4A). The concentration of product released after 30 min of incubation was increased by 26 and 143 times for ZgCgkA GH16-CBM16-PorSS and ZgCgkA GH16, respectively, as compared with PcCgkA GH16. Thus, ZgCgkA GH16 is 5.5 times more efficient in solution than ZgCgkA GH16-CBM16-PorSS. However, on microgels (Fig. 4B), the differences in released products between the three enzymes are far smaller than in solution. Under these conditions, subtle differences can be observed; ZgCgkA GH16-CBM16-PorSS releases 2 times more product than PcCgkA GH16 and 1.5 times more than ZgCgkA GH16 after 90 min of incubation.
HPAEC analyses of reaction end products produced by the recombinant enzymes confirm the previously reported difference of product patterns observed for the native enzymes (15). Namely, ZgCgkA GH16 mainly produces oligosaccharides with degree of polymerization of 4 and 6 (DP4 and DP6) (supplemental Fig. S3), whereas PcCgkA GH16 produces oligosaccharides with DP2 and DP4 (data not shown). Moreover, for both constructs of Z. galactanivorans κ-carrageenase, the profile of degradation products is different in solution and in microgels, with a DP4/DP6 ratio always higher in microgels compared with that in solution (Table 2). But the presence of the CBM modifies the values of this ratio. In solution (Fig. 4, C-E), the ratio becomes >1 more quickly for ZgCgkA GH16-CBM16-PorSS than for ZgCgkA GH16 (after 45 and 90 min, respectively), and on microgels (Fig. 4, D-F), it reaches the highest levels with ZgCgkA GH16-CBM16-PorSS after 1 week of extensive digestion. This means that the construct with CBM (ZgCgkA GH16-CBM16-PorSS) favors the release of DP4, mainly in microgels but also in solution, to a higher extent than the construct without CBM (ZgCgkA GH16).
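The comparison of DP4/DP6 ratios over time can be sketched as follows. This is a minimal illustration, assuming that DP4 and DP6 amounts have already been quantified from HPAEC peaks; the function names and the example time course are hypothetical and are not taken from Table 2.

```python
from typing import Optional, Sequence, Tuple

# One sample = (incubation time in minutes, DP4 amount, DP6 amount).
# Amounts are in arbitrary but identical units (assumption for this sketch).
Sample = Tuple[float, float, float]


def dp4_dp6_ratio(dp4: float, dp6: float) -> float:
    """Return the DP4/DP6 ratio for one sample."""
    if dp6 <= 0:
        raise ValueError("DP6 amount must be positive to compute a ratio")
    return dp4 / dp6


def first_time_ratio_exceeds(
    samples: Sequence[Sample], threshold: float = 1.0
) -> Optional[float]:
    """Return the earliest time at which DP4/DP6 exceeds the threshold, or None."""
    for minutes, dp4, dp6 in sorted(samples):
        if dp4_dp6_ratio(dp4, dp6) > threshold:
            return minutes
    return None


# Hypothetical time course, for illustration only.
example_course = [(15, 2.0, 4.0), (45, 5.0, 4.5), (90, 9.0, 5.0)]
print(first_time_ratio_exceeds(example_course))
```

Applied to the two ZgCgkA constructs, a function of this kind would return the time at which each construct first shows a ratio above 1.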

Crystal structure of the catalytic module of Z. galactanivorans κ-carrageenase (ZgCgkA GH16)
ZgCgkA GH16 was crystallized, and the three-dimensional structure was determined at 1.66 Å resolution (Fig. 2B) by molecular replacement, using the native PcCgkA GH16 structure as a model (PDB code 1DYP) (12). The asymmetric unit contains four copies of the protein, covering the same sequence from Gln 30 to Ser 307. The RMSD values, calculated for Cα atoms, between the four copies are low, ranging from 0.21 to 0.31 Å, and throughout the following, we will thus refer only to molecule A, except if otherwise specified. B factors of the overall chain (Cα and side chains) have been calculated for the different fingers, defined above as the extensions of β-strands and/or loops that flank the catalytic tunnel (Table 3). All further data statistics are summarized in Table 4.
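As a reminder of what the RMSD values quoted above measure, the following minimal sketch computes the root mean square deviation between two equal-length lists of Cα coordinates. It assumes that the two copies have already been superposed and that atoms are paired in the same order; the superposition step itself is not shown, and the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """RMSD (same unit as the input, e.g. Å) between paired, pre-superposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: every atom is shifted by 0.3 Å along z.
copy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
copy_b = [(0.0, 0.0, 0.3), (1.0, 0.0, 0.3)]
print(round(rmsd(copy_a, copy_b), 2))
```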

Crystal structure of the inactivated catalytic module of the κ-carrageenase from P. carrageenovora (PcCgkA GH16-E168D) in complex with a κ-neocarratetraose
To identify the key residues involved in the interaction with the sulfated substrate, we designed an inactive mutant of the κ-carrageenase from P. carrageenovora (PcCgkA GH16-E168D) to trap a stable enzyme-substrate complex. As a member of the GH16 family, PcCgkA possesses the characteristic catalytic triad EXDXXE responsible for the double displacement mechanism that results in the cleavage of the β-1,4 linkage with retention of the anomeric configuration (26-28). Glu 163 is the nucleophile residue that binds the D-galactose 4-sulfate (G4S) unit at C1 to form the glycosyl-enzyme intermediate, and Glu 168 is the acid/base catalyst. Asp 165 is also involved in the mechanism by accelerating the deglycosylation step (12). Site-directed mutation of the catalytic Glu 168 to aspartate resulted in a significant loss of enzymatic activity, although activity was not totally abolished (data not shown). This allowed the co-crystallization of the recombinant enzyme in the presence of a mixture of κ-carrageenan oligosaccharides, purified from the digestion of κ-carrageenan polymer with PcCgkA GH16. The crystals obtained allowed us to solve the three-dimensional structure at 1.7 Å resolution (Fig. 2C) by molecular replacement, again using the native PcCgkA GH16 as a model. The asymmetric unit contains two copies of the protein, covering residues from Met 28 to Val 297. The Fo − Fc difference density for the complex crystal that we obtained unambiguously showed a tetrasaccharide spanning the active-site cleft in subsites from −4 to −1 (subsite numbering according to Davies et al. (29)), and the oligosaccharide could readily be built into the density (Fig. 5A). Nineteen residues have been identified as involved in substrate binding (Fig. 5B). Bonding distances are listed in Table 5. 3,6-Anhydro-D-galactose (DA) moieties in subsites −4 and −2 interact with 3 and 5 side chains of the catalytic tunnel residues, respectively.
Comparatively, G4S moieties in subsites −3 and −1 show the highest number of interactions with the enzyme, 12 and 13, respectively. Remarkably, Arg 260 is involved in five interactions in the −1 subsite, and of the 19 interacting residues, the three arginines are responsible for 12 of the 32 total interactions between the enzyme and its substrate. It can also be emphasized that the two sulfate functions of the substrate are particularly implicated in the network of hydrogen bonding, as underlined in Fig. 5B.
By comparison with the available structure of the enzyme without substrate (PDB code 1DYP), we can notice that the general fold of the enzyme displays no significant difference (RMSD = 0.48 and 0.46 Å for molecules A and B, respectively). However, some side chains show translational movements, between 1.2 and 1.6 Å, and others show marked rotations, between 90 and 180° (Fig. 5D). The residues affected by these movements are mainly located in finger F2 above subsite −4 of the catalytic channel for the translational movements, namely Trp 95, Gln 100, and Gln 102 (straight arrows in Fig. 5D), and in fingers F5-F6, which constitute the closed part of the tunnel between the −2 and −1 subsites, for the rotational movements (curved arrows). In particular, Arg 196 and Asn 269 adopt a "closed" conformation in the presence of the substrate, with their functional groups pointing into the channel, with an overall displacement of 3.6 Å when compared with the structure without substrate. Interestingly, Arg 151 adopts two alternative conformations when the substrate is present as compared with the apo-structure of the enzyme, one pointing toward the −4 subsite and the second pointing toward the −3 subsite (Fig. 5C). The B factor of this arginine (20.0 Å²) is also higher than the overall mean B factor.
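The EXDXXE catalytic motif described in this section (nucleophile Glu, Asp two residues later, acid/base Glu five residues later, as in Glu 163-Asp 165-Glu 168) can be located in a sequence with a simple pattern search. The sketch below is an illustration only: the function name is hypothetical, the test sequence is invented, and a match to this short pattern is of course not sufficient to assign an enzyme to family GH16.

```python
import re
from typing import List, Tuple

# E-X-D-X-X-E, where X is any residue. The lookahead allows overlapping matches.
EXDXXE = re.compile(r"(?=(E.D..E))")


def find_exdxxe(sequence: str) -> List[Tuple[int, int, int]]:
    """Return 1-based positions (nucleophile Glu, Asp, acid/base Glu) of each match."""
    hits = []
    for match in EXDXXE.finditer(sequence.upper()):
        start = match.start() + 1  # convert 0-based index to 1-based numbering
        hits.append((start, start + 2, start + 5))
    return hits


# Invented peptide used purely as an example (not a real carrageenase sequence).
toy_sequence = "MKTAEIDMVEWLG"
print(find_exdxxe(toy_sequence))
```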

Structural comparison of ZgCgkA GH16 and PcCgkA GH16-E168D
As a member of the GH16 family, ZgCgkA GH16 adopts a β-jelly roll fold. Similar to PcCgkA GH16-E168D, the catalytic channel of ZgCgkA GH16 is partially closed, forming a tunnel, constituted by the junction of fingers F5 and F6 (Fig. 6A) through Arg 199 and Asn 272, the equivalent of Arg 196 and Asn 269 in PcCgkA GH16-E168D (supplemental Fig. S4). Despite the conservation of the overall fold, four main structural differences can be emphasized when superposing their structures (Fig. 6B).
First, substantial differences concern the extremity of the catalytic channel containing the negative subsites (−4 and −3 subsites). In PcCgkA GH16-E168D, the top finger F2 establishes a direct link to the substrate molecule through three amino acids, Trp 95, Gln 100, and Gln 102. Residues of this finger also take part, together with finger F1, in binding a dense network of 34 water molecules that wraps around the oligosaccharide bound in the catalytic channel from subsite −4 to +2 (supplemental Fig. S5). Finger F2 is absent in ZgCgkA GH16, as well as Cys 98 and Cys 268, which form the disulfide bond between F2 and F6 in PcCgkA GH16-E168D. The dense water network described above is also far less extended in ZgCgkA GH16, being confined to only seven water molecules located between subsites −1 and +2. It can also be noticed that the key basic residue Arg 151 in PcCgkA GH16-E168D, located on finger F3, which adopts two alternative conformations, is replaced by an aromatic residue (Tyr 148) in ZgCgkA GH16.
The second main difference between both enzymes is the presence of an additional finger, F4, at the extremity of the tunnel forming positive subsites for ZgCgkA GH16, in the place of a key residue, Lys 172, identified in PcCgkA GH16-E168D (12). This finger F4 is composed of the loop Phe 168-Asp 177 and contains a combination of aromatic and charged residues (Fig. 2A). Notably, Trp 170 forms a hydrophobic platform that extends the end of the binding cleft (supplemental Fig. S6). The distance of 2.8 Å between Asp 169 in finger F4 and Lys 201 in finger F5 is compatible with a hydrogen bond. Altogether, these features partially obstruct the base level of the tunnel exit, where the positive subsites are located in PcCgkA GH16-E168D, forcing the substrate into an orientation at 45° with respect to the substrate chain orientation on the side of negative subsites (Fig. 6C).
A third difference is the presence of a helix (α3 in Fig. 2A) on the very top of ZgCgkA GH16, in the prolongation of finger F6, just above the positive binding subsites of the catalytic cleft (Fig. 6B). Notably, two lysine residues, Lys 284 and Lys 288, contribute to a basic patch, together with Lys 264, located just between this helix and finger F4. Altogether, finger F4, the top helix, and the charged side chains create a unique basic environment located at the positive subsites of ZgCgkA GH16 (supplemental Fig. S6).
A fourth major difference between the two catalytic modules is the increase in B factor values of fingers F1 and F6 in ZgCgkA GH16 compared with those of PcCgkA GH16-E168D (Table 3 and supplemental Fig. S4). For finger F1, the relative value of the B factor with respect to the overall value changes by 49% in PcCgkA GH16-E168D (81% in PcCgkA GH16) and up to 119% in ZgCgkA GH16. This is associated with an extension of finger F1 by two residues in ZgCgkA GH16, resulting in a spatial displacement with respect to the position in PcCgkA GH16-E168D. Notably, Asn 70 is oriented toward the −3 subsite with a rotation of 45° in contrast to the equivalent Asn 63 in PcCgkA GH16-E168D, which points in the opposite direction (supplemental Fig. S4). Concerning finger F6, the relative B factor value changes by 89% in PcCgkA GH16-E168D (88% in PcCgkA GH16) and by 190% in ZgCgkA GH16 with respect to the overall B value. This increased mobility is especially marked for residue Asn 272, which is equivalent to Asn 269 of PcCgkA GH16-E168D that was previously identified for its implication in forming the tunnel above the catalytic site. We note that the side chain of this residue in ZgCgkA GH16 is in the "open" position, as compared with what is observed in the apo-structure of PcCgkA GH16 (PDB code 1DYP), and that the associated Arg 199 (the equivalent of Arg 196 in PcCgkA GH16) adopts two alternative conformations in the crystal structure of ZgCgkA GH16, one in the open and one in the closed position.
Table 3. Delimitation of structural "fingers" of the two catalytic modules of κ-carrageenases from Z. galactanivorans (5OCR) and P. carrageenovora, mutant E168D in complex with κ-neocarratetraose (5OCQ) and also the apoenzyme (1DYP), and associated values of B factor. B factor values that vary significantly from values of the full sequence are underlined.
Finger    ZgCgkA
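One way to express the B factor of a finger relative to the whole chain is sketched below. It assumes that the relative value is the mean B factor of the residues in the finger divided by the mean B factor of all residues, expressed as a percentage; the text does not spell out the exact formula used for Table 3, so this definition, the function name, and the toy values are assumptions made for illustration.

```python
from statistics import mean
from typing import Dict


def relative_b_factor(b_by_residue: Dict[int, float], start: int, end: int) -> float:
    """Mean B factor of residues start..end as a percentage of the chain mean.

    Assumed definition for this sketch: 100 * mean(finger) / mean(chain).
    """
    finger_values = [b for res, b in b_by_residue.items() if start <= res <= end]
    if not finger_values:
        raise ValueError("no residues found in the requested range")
    return 100.0 * mean(finger_values) / mean(b_by_residue.values())


# Toy per-residue B factors in Å² (invented values, not from Table 3).
toy_b = {1: 10.0, 2: 10.0, 3: 30.0, 4: 30.0}
print(relative_b_factor(toy_b, 3, 4))
```

Under this definition, a value near 200% corresponds to a finger whose B factors are about twice the chain average.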

Site-directed mutagenesis of PcCgkA GH16
The structure of PcCgkA GH16-E168D in complex with a κ-neocarratetraose allowed the identification of 19 amino acids directly involved in the interaction with the substrate (listed in Table 5) in addition to some other residues more distant and involved in the network of water molecules. Six of these residues have been successfully mutated individually into alanine (Fig. 5C). The recombinant mutated proteins were purified and biochemically characterized in solution on κ-carrageenan substrate. Details of the determination of kinetic parameters are provided in supplemental Table S1.
Depending on the position of the mutated amino acid, the effect on the catalytic efficiency is very different (Table 1). Indeed, R196A, R260A, and W266A cause a dramatic loss of catalytic efficiency, between 95 and 99%, as compared with the wild-type enzyme.
For Arg 196 and Arg 260, the loss of activity can be explained by their key role in binding to the charged sulfate groups of G4S in the −1 subsite. The importance of both of these residues, and especially Arg 260, in substrate binding was already detected by Michel et al. (12), based on a structural model of a substrate molecule docked in the catalytic active site of the native κ-carrageenase. It is noteworthy that these basic residues are strictly conserved in all of the κ-carrageenase sequences, with a few exceptions where Arg 196 is replaced by a lysine, and thus must play an important role in the specificity of substrate recognition in this GH16 subfamily. In the case of our enzyme-substrate complex PcCgkA GH16-E168D, we show that the importance of Arg 196 is connected to its involvement in shaping the tunnel above the catalytic active site and its concerted interaction with Asn 269 in the binding of the substrate in subsites −1 and −2 (Table 5 and Fig. 5B). We also confirm the crucial importance of Arg 260 in stabilizing the substrate in subsite −1, notably through the interactions with the sulfate group, which is the hallmark of κ-neocarrabiose (Table 5). As a consequence, the remaining activity of the protein with Arg 260 mutated to alanine was <1% that of the wild-type enzyme, and kinetic parameters were not determined. By contrast, Trp 266 is positioned in the catalytic groove, where the positive substrate-binding sites must be, and consequently does not interact with the substrate molecule in the crystal structure of this substrate-enzyme complex. Nevertheless, the drastic loss of activity when mutated to alanine clearly indicates that it plays an important role in the interaction with the saccharide units that bind to the positive sub-binding sites.
Figure 5. … Fig. 2). The catalytic sub-binding sites are numbered from −4 to −1. The polypeptide chain is represented as a purple schematic. B, schematic representation of the enzyme-substrate complex (produced with ChemDraw). All residues that interact with the substrate molecule are displayed. Possible hydrogen bonds are represented in dashes, and those established with sulfate groups are underlined in yellow, as well as the labels of the corresponding residues; water molecules are drawn as spheres; hydrophobic interactions are symbolized with arcs; the mutation E168D is preceded by an asterisk; the three arginine residues that were mutated into alanines are identified by a star; the three catalytic residues are labeled in red, and red waves symbolize the hydrogen bonds that are subsequently formed during the catalytic cleavage of the glycosidic linkage.
Surprisingly, R151A and Q171A have the opposite effect, increasing efficiency by 1.4 and 1.8 times, respectively (Table 1). Arg 151 is located between subsites −4 and −3 and establishes strong interactions with the substrate at the entrance of the catalytic groove, through a salt bridge and direct hydrogen bonds. On the other hand, Gln 171 is located in the putative +1 sub-binding site, where it could establish hydrogen bonds with the leaving group G4S. This residue is highly conserved in κ-carrageenases, so it may be linked with the thermodynamic equilibrium of the enzymatic reaction. When it is mutated into alanine, the interaction with the leaving group is strongly decreased, allowing a more rapid product release into the medium and thus accelerating the enzymatic turnover.
R92A is the only mutated residue not directly in contact with the substrate but involved in binding water molecules at the entrance of the channel (supplemental Fig. S5). The absence of any effect on catalytic efficiency when mutating Arg 92 shows that this arginine does not seem to play a major role in substrate binding or mode of action, at least on soluble substrates.
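The percentage losses and fold increases in catalytic efficiency quoted for the point mutants follow from a simple comparison of kcat/Km values with the wild type. The sketch below shows that arithmetic; the function names are hypothetical and the numerical inputs are invented for illustration, not taken from Table 1.

```python
def fold_change(kcat_km_wt: float, kcat_km_mut: float) -> float:
    """Catalytic efficiency of the mutant relative to the wild type (1.0 = unchanged)."""
    if kcat_km_wt <= 0:
        raise ValueError("wild-type kcat/Km must be positive")
    return kcat_km_mut / kcat_km_wt


def percent_loss(kcat_km_wt: float, kcat_km_mut: float) -> float:
    """Percentage of catalytic efficiency lost by the mutant (negative = gain)."""
    return 100.0 * (1.0 - fold_change(kcat_km_wt, kcat_km_mut))


# Invented values in arbitrary units: a strongly impaired and an improved mutant.
print(percent_loss(100.0, 3.0))
print(fold_change(100.0, 140.0))
```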

Discussion
Although family GH16 κ-carrageenases are frequently identified in macroalgae-associated bacteria (3,30), only seven enzymes have actually been produced (11,17-22), and only one has been structurally characterized to date (12). In polyspecific GH families, such as the GH16 family, the subfamilies based on phylogenetic analyses usually correspond to distinct substrate specificities (11,29,31). Within a specific subfamily, the large diversity of sequences and the phylogenetic distribution in clades is rather related to differences in modes of action or subtle differences within the same substrate specificity (32). It is important to mention that carrageenans occur as hybrid polymers within the cell walls of red algae, containing numerous variant motifs beyond the frequent repeating units κ- and ι-carrabioses (e.g. the precursor units μ- and ν-carrabiose and the desulfated units β- (33) and α-carrabiose (34,35)). These complex structures are further modified by the addition of methyl or pyruvate groups. In this context, one can expect significant differences between κ-carrageenases, and the results presented here are the first example of exploring this biochemical diversity.
The first obvious difference between PcCgkA and ZgCgkA is the modular architecture of these κ-carrageenases. Even if both are secreted enzymes, only ZgCgkA possesses a CBM (11,36). The presence of CBMs appended to catalytic GH modules is in general related to the capacity of the adjacent enzyme to tackle recalcitrant, complex substrates in the solid form, as encountered in the context of the plant cell wall (37-39). Four main roles are attributed to CBMs: 1) to allow proximity between the catalytic domain and its substrate; 2) to target the catalytic module to specific parts of the cell wall; 3) to help disrupt the substrate organization; and 4) to anchor the enzyme to the bacterial cell wall (40,41). The isolated CBM16 module of ZgCgkA was cloned and expressed as a recombinant protein in an independent study and was indeed shown to bind κ-carrageenan in ELISA tests,³ indicating that this secreted enzyme most probably degrades semi-crystalline κ-carrageenans within the algal cell wall. Moreover, contrary to PcCgkA, DP6 is not hydrolyzed by ZgCgkA, which confirms what Potin et al. (15) suggested, that is to say DP8 is the minimal size of oligosaccharide that can be degraded by ZgCgkA. According to the displacement scheme proposed by Lemoine et al. (42) for PcCgkA, DP6 is exclusively produced by random processing, whereas DP4 is produced by random and processive modes of action. Similarly, we propose to use the DP4/DP6 ratio as a marker of processivity for ZgCgkA. In this study, comparison of the behavior of recombinant ZgCgkA with and without CBM, and also on microgels compared with κ-carrageenan solutions, suggests that ZgCgkA is an endo-processive enzyme, mainly on microgel, and that its appended CBM16 contributes to the processivity, even on soluble substrate, probably by favoring and maintaining the proximity between the catalytic module and substrate. The difference in processivity also explains the difference in catalytic efficiency observed with and without CBM16, due to the fact that the selective advantage of processivity on insoluble substrate is in balance with the loss of catalytic efficiency on soluble substrate (43). This phenomenon has already been observed for processive endoglucanases, where deletion of the CBM associated with the catalytic module leads to an increased activity on soluble substrate (44).
Dissection of the molecular details of protein-carbohydrate interactions within the catalytic active site is crucial to understand the enzymatic efficiency, the mode of action, and the substrate specificity of a given enzyme class. From our structural and mutagenesis study, we can establish several key parameters in the κ-carrageenase subfamily.

Importance of arginine residues
The importance of arginines was first emphasized in the 1970s with the study of carboxypeptidase A (45). Their positive charge and basic character allow them to establish salt bridges and/or multiple hydrogen bonds with negatively charged substrates such as phosphate groups (46) or sulfated polysaccharides, as observed in the case of chondroitin lyase (47).

Importance of enzymatic flexibility and substrate binding in catalytic efficiency
Loss of finger F2 and of the stabilizing disulfide bond between F2 and F6, as well as of the associated water network, results in a more flexible F6 finger, which correlates well with the 5-fold higher enzymatic efficiency of ZgCgkA GH16 in solution compared with PcCgkA GH16. Interestingly, the mobility of the fingers also seems to be inverted between the two enzymes: finger F6 at the top of the groove in ZgCgkA GH16 has B factors 2 times higher than the overall average, whereas in PcCgkA GH16 it is fingers F3 and F5 that display values 2 times higher than average (Table 3). Moreover, the units at the reducing end of the oligosaccharide might be more tightly bound in ZgCgkA GH16 (at least four positive sub-binding sites constituted by fingers F4-F5-F6) than in PcCgkA GH16 (two positive sub-binding sites: fingers F5-F6), and less tightly bound at the non-reducing end, with only three negative sub-binding sites (fingers F1-F3) instead of four (fingers F1-F2-F3). These findings, along with the experiments on microgels, lead us to assume that ZgCgkA GH16 is an endo-processive enzyme with at least seven sub-binding sites, moving from the reducing end toward the non-reducing end of a κ-carrageenan chain, a movement opposite to that proposed for PcCgkA GH16 (Fig. 6D) (12).
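The comparison of finger mobility above rests on the ratio between the mean B factor of a region and the mean over the whole chain. A minimal sketch of that calculation is given below; the function name, residue numbering, and B factor values are hypothetical and do not correspond to the deposited structures.

```python
from statistics import mean


def relative_b_factor(b_by_residue: dict[int, float], region: range) -> float:
    """Mean B factor of the residues in `region` divided by the overall mean."""
    overall = mean(b_by_residue.values())
    regional = mean(b_by_residue[r] for r in region if r in b_by_residue)
    return regional / overall


# Hypothetical per-residue mean B factors (in square angstroms);
# residues 5 and 6 stand in for a mobile finger.
b_factors = {1: 10.0, 2: 10.0, 3: 10.0, 4: 10.0, 5: 30.0, 6: 30.0}

ratio = relative_b_factor(b_factors, range(5, 7))
print(round(ratio, 2))
```

A ratio close to 2 for a given finger would correspond to the "2 times higher than average" situation described in the text.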

Requirements for processivity on charged substrates
Processivity necessitates a compromise between affinity and mobility, which can be fulfilled by a combination of basic and mobile residues, such as arginines and lysines, associated with aromatic residues (Arg 151/Trp 95 for PcCgkA GH16; Lys 201/Trp 170 for ZgCgkA GH16). The alternative conformations of the side chain of Arg 151 observed in the presence of the ligand, together with the relatively high B factor of this residue, lead us to assume that it might assist in establishing a first contact with the substrate, "grabbing" a sulfate group, and also in sliding the substrate further after a catalytic cleavage. We can further assume that the mutation of Arg 151 into an alanine weakens substrate binding at the non-reducing end of the polysaccharide chain, possibly hindering processivity and thus increasing the catalytic efficiency on a soluble substrate. The effect of this point mutation can thus be compared with what is observed for the processive cellobiohydrolase of T. reesei, where the deletion of one of the active-site loops leads to an increased efficiency of the enzyme on amorphous cellulose (48).

Importance of subsites associated with the leaving group
A balance is needed between maintaining the stability of the enzyme-substrate complex before cleavage and facilitating the release of the product. In PcCgkA GH16, Gln 171 is possibly involved in a process defined as "product inhibition" by binding the product tightly to the enzyme, a feature that is frequently observed in processive enzymes active on crystalline substrates, such as cellulases or chitinases (43,49). We can assume a similar function for Asn 70 in ZgCgkA GH16. Conversely, Trp 266 might facilitate the sliding of the released product, so that its mutation into alanine leads to a loss of enzymatic efficiency. Indeed, hydrophobic amino acids, and tryptophans in particular, are well known for their role in facilitating the sliding of oligosaccharide substrates in glycoside hydrolases (50). In ZgCgkA GH16, an equivalent residue could be Phe 146.
To summarize, our data indicate that despite sharing a global common ancestor, PcCgkA GH16 and ZgCgkA GH16 have evolved unique structural features, which shape distinct tunnel topologies and probably result in opposite directions of processivity. Interestingly, from the point of view of industrial applications, the enzyme from Z. galactanivorans is 5 times more efficient in solution. The two GH16 enzymes described here are representatives of different clades of κ-carrageenases (clades A and C, respectively; Fig. 3). Whereas clade C enzymes clearly originate from marine Bacteroidetes and clades A and D from Proteobacteria, we cannot currently determine whether the common ancestor of clade B κ-carrageenases belonged to the Bacteroidetes or Planctomycetes phylum. The exact taxonomic nature of the common ancestor of the entire κ-carrageenase subfamily is even more difficult to establish, considering the low bootstrap values of the deep nodes of the remaining sequence clusters. Remarkably, some marine bacteria have multigenic families of κ-carrageenases; for example, Algibacter sp. SK-16 possesses three clade B sequences, one clade C sequence, and two other sequences (including the current deepest sequence of the κ-carrageenase subfamily). This diversity of enzymes active on κ-carrageenan within one organism most likely results from various evolutionary events, such as horizontal gene transfers or sequence duplications followed by diverging evolution. The presence of this multigenic family might also give this species an adaptive advantage in facing the chemical complexity of carrageenans, perhaps also indicating that this organism adopts a strategy different from that of Z. galactanivorans and P. carrageenovora for the degradation of this abundant marine polysaccharide.

Bioinformatics analysis
Homologous sequences were extracted with the BLASTp suite from the NCBI server (51) (https://blast.ncbi.nlm.nih.gov) from the non-redundant protein sequence database, manually curated before being aligned with the MAFFT program available at http://www.ebi.ac.uk/Tools/msa/mafft/, and edited with BioEdit free software (52).

Cloning of the two forms of κ-carrageenases of Z. galactanivorans in E. coli
Two different forms of the gene cgkA (zobellia_236) encoding the κ-carrageenase from Z. galactanivorans have been cloned. Briefly, primers (sequences provided in supplemental Table S2) were designed to amplify by PCR, from Z. galactanivorans genomic DNA, the coding regions corresponding to the complete sequence of the κ-carrageenase without the N-terminal signal peptide (ZgCgkA GH16-CBM16-PorSS) and to the catalytic module alone (ZgCgkA GH16). After digestion with the restriction enzymes BamHI and PstI, the purified PCR products were ligated using T4 DNA ligase into the expression vector pFO4 predigested with a compatible pair of restriction enzymes. The recombinant proteins encoded in the plasmids pZG237 and pZG238 correspond to the peptide sequences Gln 30-Lys 306 (ZgCgkA GH16) and Gln 30-Glu 546 (ZgCgkA GH16-CBM16-PorSS), respectively, with an added N-terminal hexahistidine tag. The plasmids were transformed into the E. coli BL21 (DE3) strain for protein expression.

Mutagenesis of κ-carrageenase from P. carrageenovora
The plasmid pCGK, obtained by cloning the native κ-carrageenase gene into the expression plasmid pET20b (Invitrogen), which encodes a C-terminal His tag, was subsequently used as a template for mutagenesis using the QuikChange mutagenesis kit (Stratagene). The primers used to amplify the different mutants are summarized in supplemental Table S2. The plasmids obtained were used to transform the E. coli C43 (DE3) strain for protein expression.

Expression and purification of the recombinant enzymes
A single colony containing the desired plasmid (pCGK or pZG) was used to inoculate LB medium (supplemented with 100 μg/ml ampicillin). This preculture was incubated at 37°C and was then used to inoculate ZIP5052-autoinducing medium for cell growth at 17°C. The culture was stopped after 2 days, when A 600 had reached 15, and the cells were pelleted by centrifugation (30 min). If needed, the pellet was frozen and stored at −20°C before use. The bacterial pellet was resuspended in buffer A (Tris-HCl (10 mM, pH 7.2, for P. carrageenovora enzymes or 50 mM, pH 7.5, for Z. galactanivorans enzymes), 500 mM NaCl, 40 mM imidazole) supplemented with an anti-protease mixture (Complete EDTA-free, Roche Applied Science), DNase (0.1 mg/ml), and lysozyme (0.2 mg/ml), and then lysed by means of a French press. The supernatant containing the soluble proteins was separated from the pellet by centrifugation (20,000 × g for 30 min at 4°C) and then filtered before a two-step purification by nickel affinity chromatography (equilibrated with buffer A) and size exclusion chromatography on a Sephadex column (Amersham Biosciences). Elution from the immobilized metal ion affinity column was done with a gradient of buffer B, which differed from buffer A only in the concentration of imidazole (250 mM for P. carrageenovora enzymes and 500 mM for Z. galactanivorans enzymes). Size exclusion chromatography was done at a flow rate of 1 ml/min in buffer C (50 mM Tris-HCl, pH 7.2, 150 mM NaCl for P. carrageenovora enzymes and 10 mM Tris-HCl, pH 7.2, 250 mM NaCl for Z. galactanivorans enzymes). The different fractions were analyzed by SDS-PAGE. Those containing the protein purified to homogeneity were pooled and concentrated by ultrafiltration on a styrene acrylonitrile membrane (10- or 30-kDa cutoff) (Millipore) to the desired concentration for biochemical analysis and to 6.9 and 5 mg/ml for crystallogenesis of PcCgkA GH16-E168D and of ZgCgkA GH16, respectively.

Crystallization of PcCgkA GH16-E168D and ZgCgkA GH16, structure determination, and refinement
In the first step, crystallization conditions were screened using the commercial kits PACT and JCSG+ (Qiagen), dispensed by a robot (200 nl of protein solution mixed with 100 nl of reservoir solution). For both proteins, initial crystallization conditions identified from these screens were then manually optimized. Single crystals of suitable size were obtained for PcCgkA GH16-E168D in complex with substrate as follows. 200 μl of enzyme at 6.9 mg/ml were supplemented with 2 mg of a purified mix of κ-carrabiose, tetraose, and hexaose (30:60:10, w/w/w), obtained from the digestion of κ-carrageenan by PcCgkA GH16. Of this solution, 2 μl were mixed with 2 μl of reservoir solution containing 1.0 M sodium citrate and 100 mM cacodylate at pH 6.5, in hanging drops equilibrated against 500 μl of reservoir solution at 19°C. Single crystals of ZgCgkA GH16 were obtained from 2 μl of the enzyme solution (5.0 mg/ml) that were added to 1 μl of reservoir solution containing 28-29% PEG 3350, 100 mM MES buffer, pH 6.5, and 0.3 M NaNO3, in hanging drops equilibrated against 250 μl of reservoir solution at 19°C. Before flash-freezing in a nitrogen stream at 100 K, single crystals were quickly soaked in their respective crystallization solution supplemented with 20% glycerol. Diffraction data for PcCgkA GH16-E168D complex crystals were collected on beamline ID14-1, and data for ZgCgkA GH16 were collected on beamline ID29 (ESRF, Grenoble, France). X-ray diffraction data were integrated using Mosflm and scaled with SCALA (54). Both structures, ZgCgkA GH16 and PcCgkA GH16-E168D in complex with κ-carratetraose, were determined by molecular replacement with MolRep (55) using chain A of the PcCgkA GH16 native structure (PDB code 1DYP) as a starting model. The structure of ZgCgkA GH16 was then manually corrected and built using COOT (56).
For both structures, the initial structural models were refined with the program REFMAC5 (57), alternating with cycles of further manual rebuilding using COOT. A subset of 5% randomly selected reflections was excluded from computational refinement to calculate the R free factors throughout the refinement. The addition of the ligand sugar units for the complex structure was performed manually using COOT. Water molecules were added automatically with REFMAC-ARP/wARP and visually verified. All data collection and refinement statistics are presented in Table 1. The structure obtained for the ligand tetrasaccharide has also been checked with the software Privateer in CCP4 (58).
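The 5% test set used for the R free factor can be illustrated with a short sketch. The code below is not the procedure implemented in the CCP4 programs cited above; it only shows, under simplified assumptions, how a random subset of reflections is set aside and how a crystallographic R factor is computed from observed and calculated amplitudes. The function names, the seed, and the amplitude values are hypothetical.

```python
import random


def split_free_set(n_reflections: int, fraction: float = 0.05, seed: int = 0):
    """Return (work_indices, free_indices) for a random test-set split."""
    rng = random.Random(seed)
    n_free = round(n_reflections * fraction)
    free = set(rng.sample(range(n_reflections), n_free))
    work = [i for i in range(n_reflections) if i not in free]
    return work, sorted(free)


def r_factor(f_obs, f_calc):
    """R = sum(| |Fo| - |Fc| |) / sum(|Fo|) over the given reflections."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    return numerator / sum(abs(o) for o in f_obs)


# Hypothetical data set of 1000 reflections: 5% are excluded from refinement.
work, free = split_free_set(1000)
print(len(work), len(free))

# Hypothetical amplitudes for two reflections, for illustration only.
print(r_factor([10.0, 20.0], [9.0, 22.0]))
```

Applying `r_factor` to the `free` reflections gives the R free value and applying it to the `work` reflections gives the working R factor; the free reflections must stay the same throughout refinement for the comparison to remain meaningful.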

Enzymatic activity assays on κ-carrageenan by reducing sugar analysis
The processive character of an enzyme can only be clearly observed on a non-soluble substrate (42). However, the determination of kinetic parameters implies a total availability of the substrate to fulfill Michaelis-Menten conditions (59). Thus, we decided to do the biochemical experiments in diluted solutions of κ-carrageenan to have an estimate of the apparent efficiency under similar conditions for all studied enzymes and mutants, although not entirely reflecting their behavior in "natural" conditions (i.e. on solid algal cell walls).
A stock solution of 0.5% (w/v) κ-carrageenan (Danisco) was prepared in 50 mM MOPS, pH 7, 150 mM NaCl for PcCgkA GH16 and its mutants. For ZgCgkA, it was prepared in Teorell buffer (60) to determine the optimal pH range for both constructs and then in 50 mM MES, pH 6, 300 mM NaCl for the determination of kinetic constants. Aliquots (14 μl) of enzyme at the appropriate concentration were incubated in triplicate in the presence of κ-carrageenan solution (126 μl) at a final concentration of 0.125% (w/v) at 40°C. The amount of reducing sugars released was assayed using a ferricyanide method adapted from that of Kidby and Davidson (61). The ferricyanide reagent consisted of 300 mg of potassium hexacyanoferrate III and 28 g of hydrated Na2CO3 dissolved in 1 liter of distilled water, to which 1 ml of 1 M aqueous NaOH was added. Aliquots (20 μl) of the incubation medium were mixed with 180 μl of ferricyanide reagent, and absorbance was recorded at 420 nm. Reducing sugar concentrations were calibrated using glucose as a standard.
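The calibration against glucose can be sketched as a linear standard curve that is then inverted to convert an absorbance reading into glucose equivalents. The snippet below is an illustration under the assumption of a linear response over the working range; the function names, the standard concentrations, and the absorbance values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def reducing_sugar(a420, slope, intercept):
    """Convert an A420 reading to glucose equivalents via the standard curve."""
    return (a420 - intercept) / slope


# Hypothetical glucose standards (ug/ml) and A420 readings. The yellow
# ferricyanide is consumed as reducing ends are oxidized, so the absorbance
# decreases with sugar concentration and the slope is negative.
glucose = [0.0, 25.0, 50.0, 100.0]
a420 = [1.00, 0.90, 0.80, 0.60]

slope, intercept = fit_line(glucose, a420)
sample = reducing_sugar(0.70, slope, intercept)
print(round(sample, 1))
```

Readings outside the range covered by the standards should not be extrapolated from the curve; in practice such samples would be diluted and measured again.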
The initial reaction rates of κ-carrageenan hydrolysis were measured at least in triplicate by the ferricyanide reagent method for 10 κ-carrageenan concentrations (1.25, 1.05, 0.85, 0.65, 0.45, 0.30, 0.25, 0.20, 0.15, and 0.13 g/liter). The Michaelis constant (K m) and the reaction rate at infinite substrate concentration (V m) were calculated with the software Hyper, with the hyperbolic regression model applying weighting. Note that for ZgCgkA GH16, parameters were calculated without data at 0.13 g/liter and for PcCgkA GH16-R196A and PcCgkA GH16-W266A without data at 1.25 and 1.05 g/liter. For PcCgkA GH16-R260A, the concentration of enzyme needed to obtain comparable kinetic curves at 1.25 g/liter of substrate was 100 times higher than for PcCgkA GH16, which prevented us from calculating kinetic parameters because the conditions were no longer consistent with the Michaelis-Menten assumptions.
For long time course kinetics, digestions were conducted at 30°C, in a final volume of 250 μl of 0.2% (w/v) κ-carrageenan solutions, in 10 mM Tris buffer, pH 7.2, 150 mM NaCl. A 3.5 nM concentration of each enzyme was dissolved in the κ-carrageenan solution, or 14 nM in microgels, obtained from the previous solution at 0.2% κ-carrageenan supplemented with 60 mM KCl. Aliquots (20 μl) of the incubation medium were pipetted after 15, 30, 45, 60, 90, 180, and 300 min, mixed with 180 μl of 2.5× ferricyanide reagent, and treated in the same way as above. Experiments were done in triplicate, and the S.D. is shown for each point on the graphs.
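The hyperbolic regression used to obtain K m and V m from initial rates can be illustrated with a weighted least-squares fit of the Michaelis-Menten equation. The sketch below is not the algorithm implemented in the software Hyper: it is a generic Gauss-Newton fit started from a Hanes-Woolf linearization, and the 1/v² weighting is an assumption made for the example. The substrate concentrations are those listed in the text, whereas the rates are synthetic values generated from assumed parameters.

```python
def hanes_start(s, v):
    """Initial (Vm, Km) from the Hanes-Woolf line S/v = S/Vm + Km/Vm."""
    y = [si / vi for si, vi in zip(s, v)]
    n = len(s)
    mean_s, mean_y = sum(s) / n, sum(y) / n
    sxx = sum((si - mean_s) ** 2 for si in s)
    sxy = sum((si - mean_s) * (yi - mean_y) for si, yi in zip(s, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_s
    return 1.0 / slope, intercept / slope


def fit_michaelis_menten(s, v, weights=None, n_iter=50):
    """Weighted Gauss-Newton fit of v = Vm * S / (Km + S); returns (Vm, Km)."""
    w = weights or [1.0] * len(s)
    vm, km = hanes_start(s, v)
    for _ in range(n_iter):
        a11 = a12 = a22 = b1 = b2 = 0.0
        for si, vi, wi in zip(s, v, w):
            d_vm = si / (km + si)              # derivative of the model w.r.t. Vm
            d_km = -vm * si / (km + si) ** 2   # derivative of the model w.r.t. Km
            r = vi - vm * d_vm                 # residual: observed minus model
            a11 += wi * d_vm * d_vm
            a12 += wi * d_vm * d_km
            a22 += wi * d_km * d_km
            b1 += wi * d_vm * r
            b2 += wi * d_km * r
        det = a11 * a22 - a12 * a12
        step_vm = (b1 * a22 - b2 * a12) / det
        step_km = (a11 * b2 - a12 * b1) / det
        vm, km = vm + step_vm, km + step_km
        if abs(step_vm) < 1e-12 and abs(step_km) < 1e-12:
            break
    return vm, km


# Substrate concentrations (g/liter) from the text; the rates are synthetic,
# generated from assumed Vm = 2.0 and Km = 0.5 purely to exercise the fit.
s = [1.25, 1.05, 0.85, 0.65, 0.45, 0.30, 0.25, 0.20, 0.15, 0.13]
v = [2.0 * si / (0.5 + si) for si in s]

vm_fit, km_fit = fit_michaelis_menten(s, v, weights=[1.0 / vi ** 2 for vi in v])
print(round(vm_fit, 3), round(km_fit, 3))
```

Fitting the hyperbola directly, rather than relying only on a linearized plot, avoids the distortion of experimental error that linear transformations introduce; the linearization is used here only to provide starting values.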