Plant O-Hydroxyproline Arabinogalactans Are Composed of Repeating Trigalactosyl Subunits with Short Bifurcated Side Chains

Classical arabinogalactan proteins partially defined by type II O-Hyp-linked arabinogalactans (Hyp-AGs) are structural components of the plant extracellular matrix. Recently we described the structure of a small Hyp-AG putatively based on repetitive trigalactosyl subunits and suggested that AGs are less complex and varied than generally supposed. Here we describe three additional AGs with similar subunits. The Hyp-AGs were isolated from two different arabinogalactan protein fusion glycoproteins expressed in tobacco cells; that is, a 22-residue Hyp-AG and a 20-residue Hyp-AG, both isolated from interferon α2b-(Ser-Hyp)20, and a 14-residue Hyp-AG isolated from (Ala-Hyp)51-green fluorescent protein. We used NMR spectroscopy to establish the molecular structure of these Hyp-AGs, which share common features: (i) a galactan main chain composed of two 1→3 β-linked trigalactosyl blocks linked by a β-1→6 bond; (ii) bifurcated side chains with Ara, Rha, GlcUA, and a Gal 6-linked to Gal-1 and Gal-2 of the main-chain trigalactosyl repeats; (iii) a common side chain structure composed of up to six residues, the largest consisting of an α-l-Araf-(1→5)-α-l-Araf-(1→3)-α-l-Araf-(1→3- unit and an α-l-Rhap-(1→4)-β-d-GlcUAp-(1→6)-unit, both linked to Gal. The conformational ensemble obtained by using nuclear Overhauser effect data in structure calculations revealed a galactan main chain with a reverse turn involving the β-1→6 link between the trigalactosyl blocks, yielding a moderately compact structure stabilized by H-bonds.

Hydroxyproline-rich glycoproteins of the cell surface comprise groups of related structural proteins, including the extensins that form cell wall scaffolding networks essential for cytokinesis (1) and the classical arabinogalactan proteins (AGPs) (2) that are largely at the membrane-wall interface (3) and have diverse functions (4). O-Hyp glycosylation characterizes the hydroxyproline-rich glycoproteins and is of much interest as it defines molecular properties and, hence, biological function. Arabinogalactan proteins are highly glycosylated, mainly with O-Hyp-arabinogalactan polysaccharides (5,6). Extensins are less highly glycosylated, mainly with small O-Hyp arabinooligosaccharides (7,8), whereas the related proline-rich proteins are minimally glycosylated, also with arabinooligosaccharides (9). Peptide sequence directs O-Hyp glycosylation by the addition of small oligosaccharides to contiguous Hyp residues and larger acidic arabinogalactan polysaccharides to clustered noncontiguous Hyp (10,11). For example, clustered Ala-Hyp and Ser-Hyp are typical AGP glycosylation motifs (12), whereas the Hyp residues in repetitive blocks of Ser-Hyp2 or Ser-Hyp4 are arabinosylated. The "hyperglycosylation" of closely related AGPs complicates their purification, a problem that can be overcome by expressing single individual AGPs as GFP fusion glycoproteins, the hydrophobic GFP tag enabling chromatographic purification (13,14). This approach also allows purification of neo-AGPs containing single repeating AGP glycosylation motifs, for example (Ala-Hyp)n or (Ser-Hyp)n, for base-catalyzed peptide bond hydrolysis. Base hydrolysis releases alkali-stable Hyp-arabinogalactan glycoamino acids, designated Hyp-AGs (5,6), that can be further purified by size-exclusion chromatography.
Here we report the complete structural elucidation of three such Hyp-AGs ranging in size from 14 to 22 sugar residues. The structures were determined using multidimensional homonuclear and heteronuclear NMR spectroscopy in conjunction with molecular simulations in the presence of water.
As type II arabinogalactans are often considered intractably complex, and because Ala-Hyp-polysaccharide-1 was only a single example of a Hyp-AG subunit, we determined the structure of three additional Hyp-AGs derived from two different AGP motifs, repetitive Ala-Hyp and Ser-Hyp. Two of the Hyp-AGs, designated interferon Hyp-polysaccharide-1 and interferon Hyp-polysaccharide-2, were isolated from a fusion glycoprotein of human interferon α2b fused to a (Ser-Hyp)20 AGP glycomodule (18). The third Hyp-AG, designated Ala-Hyp-polysaccharide-2, was isolated from (Ala-Hyp)51-GFP, as was the Ala-Hyp-polysaccharide-1 described earlier (15). Here we identified the fundamental similarities between these Hyp-arabinogalactans and determined whether the non-glycosylated domains (interferon α2 versus GFP) or AGP motifs (Ser-Hyp versus Ala-Hyp repeats) influenced the glycan structure. Significantly, the six-residue galactan backbone of these new Hyp-AGs consisted of two β-1,3-linked galactosyl trisaccharides connected by a β-1,6 linkage. Such "decorated" ~15-residue trisaccharide subunits likely constitute the fundamental building blocks of type II arabinogalactan polysaccharides; hence, they are far less complex than commonly supposed (4,19). Finally, the NMR analyses and molecular modeling of the glycans revealed major conformers that include a moderately compact folded structure.

EXPERIMENTAL PROCEDURES
Gene Construction and Expression in Tobacco Cells-Genes encoding interferon α2-(Ser-Hyp)20 and (Ala-Hyp)51-GFP were constructed and expressed as described in detail earlier (12,18). Briefly, proteins were targeted for secretion using a tobacco extensin signal sequence, and gene expression was under control of the 35S cauliflower mosaic virus promoter. The genes were subcloned into the plant transformation vector pBI121 and expressed in tobacco Bright Yellow-2 cells selected and maintained as described earlier (12,18).
Isolation of Hyp-arabinogalactans-Two hundred mg of interferon α2-(Ser-Hyp)20 were hydrolyzed in 20 ml of 0.44 N NaOH solution at 108°C for 18 h. The cooled solution was titrated to pH 7.8 with cold 1 N HCl and then freeze-dried.
Fractions (0.6 ml total volume each) were freeze-dried and analyzed for Hyp and monosaccharides colorimetrically or by gas chromatography using methods described earlier (18). The fraction containing the most Hyp (fraction 16 described in Ref. 18, interferon Hyp-polysaccharide-1) and a fraction containing later-eluting Hyp-glycans (fraction 18, Ref. 18, interferon Hyp-polysaccharide-2) were rerun on the Superdex column, freeze-dried, and then used for NMR analyses. The Hyp-glycan Ala-Hyp-polysaccharide-2 from (Ala-Hyp)51-GFP was isolated by a combination of cation exchange and gel filtration chromatography as described earlier (15,20).
NMR Spectroscopy-A 1-mg sample of each Hyp-AG was dissolved in 0.5 ml of 99.996% D2O (Cambridge Isotope Laboratories, Andover, MA). NMR experiments were carried out either at 55°C on a Bruker DMX-800 equipped with a cryoprobe or at 25°C on a Bruker DMX-600 spectrometer equipped with a triple-resonance probe and three-axis gradient coils. The parallel data sets include one-dimensional 1H, two-dimensional 1H-homonuclear correlation spectroscopy (COSY), total correlation spectroscopy (TOCSY) (mixing times 60 and 90 ms), rotating frame NOE spectroscopy (ROESY) (200 ms), and nuclear Overhauser effect spectroscopy (NOESY) (mixing times 150, 300, and 500 ms), and two-dimensional 13C,1H heteronuclear single quantum coherence (HSQC) and heteronuclear multiple bond coherence (HMBC) NMR spectra. In addition, several more interferon Hyp-polysaccharide-1 experiments were recorded in an effort to resolve assignment ambiguities, such as magnitude COSY with one-, two-, and three-step relay transfer and two-dimensional 13C,1H heteronuclear HSQC-TOCSY and HSQC-NOESY, together with diffusion-ordered spectroscopy for measuring the diffusion constant. Water suppression was achieved by either presaturation or WATERGATE techniques. Data were processed with NMRPipe (21) and visualized using NMRView (22). Chemical shifts were referenced to an external standard: 4,4-dimethyl-4-silapentane-1-sulfonic acid.
NMR Structure Calculations-Interferon Hyp-polysaccharide-1 was constructed in an arbitrary extended conformation using the LEaP module of Amber 10 (23). The starting model was subjected to a restrained simulated annealing conformational search protocol to obtain an ensemble of structures consistent with the NMR data. All assigned NOESY cross-peaks were classified as strong (1.8-2.7 Å), medium (1.8-3.7 Å), weak (1.8-5.0 Å), and very weak (1.8-6.0 Å) interproton distance restraints according to their intensities. Beyond these bounds, a quadratic penalty potential was applied with a force constant of 20 kcal mol⁻¹ Å⁻². A total of 49 distance restraints were used for interferon Hyp-polysaccharide-1, of which 34 were assigned non-ambiguously to protons in sequential residues. Cross-peaks that correspond to non-sequential assignments gave rise to ambiguous restraints where either more than one proton pair contributes to the NOESY volume or unambiguous assignment was not possible. Ambiguous peaks were interpreted as an ⟨r⁻⁶⟩^(−1/6) averaged value of the contributing interproton distances.
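The restraint scheme described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the Amber implementation: the function names are hypothetical, the penalty is taken to be purely quadratic outside the bounds as stated in the text, and the ⟨r⁻⁶⟩ average is assumed to be an arithmetic mean over the contributing proton pairs.

```python
# Distance bounds (Å) for each NOE intensity class, taken from the text.
NOE_BOUNDS = {
    "strong": (1.8, 2.7),
    "medium": (1.8, 3.7),
    "weak": (1.8, 5.0),
    "very_weak": (1.8, 6.0),
}

FORCE_CONSTANT = 20.0  # kcal mol^-1 Å^-2, from the text


def r6_average(distances):
    """Return the <r^-6>^(-1/6) average of the contributing distances (Å).

    Assumption: the angle brackets denote an arithmetic mean of r^-6.
    """
    mean_inv6 = sum(d ** -6 for d in distances) / len(distances)
    return mean_inv6 ** (-1.0 / 6.0)


def restraint_penalty(distance, noe_class, k=FORCE_CONSTANT):
    """Flat-bottom penalty: zero inside the bounds, quadratic outside."""
    lower, upper = NOE_BOUNDS[noe_class]
    if distance < lower:
        return k * (lower - distance) ** 2
    if distance > upper:
        return k * (distance - upper) ** 2
    return 0.0


# An ambiguous peak with two candidate proton pairs: the shorter distance
# dominates the r^-6 average, so the effective distance stays near 2 Å.
effective = r6_average([2.0, 10.0])
print(f"effective distance: {effective:.2f} Å")
print(f"penalty as a 'strong' NOE: {restraint_penalty(effective, 'strong'):.2f} kcal/mol")
```

Because r⁻⁶ weighting strongly favors the shortest contributing distance, an ambiguous restraint is satisfied as soon as any one of the candidate proton pairs is close.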
For a restrained molecular dynamics conformational search, initial models were energy-minimized without restraints and subjected to 200 simulated annealing cycles. A cycle started from the structure obtained in the previous cycle and included 10-ps heating from 300 to 1000 K followed by an equilibration of 10 ps at 1000 K without any restraints applied. The force constants for all restraints were then scaled gradually from 0 to the final values during 20 ps at 1000 K followed by cooling the system to 300 K over 30 ps. Atomic interactions within the system were calculated using the Glycam06 parameter set for sugars (24) and Generalized Born solvation (igb = 2) with monovalent salt concentration corresponding to 0.1 M. The last structures in each cycle were energy-minimized using the same parameters and restraints as described above. Ten best models for an average structure were selected based on NMR restraint violations and the potential energy of the molecule to undergo further refinement in explicit water and counterion environment. Each of these model structures was placed in a truncated octahedral box of about 5000 TIP3P water molecules and two K⁺ counterions to neutralize the total charge. In one case we used Ca²⁺ to neutralize the charge of the uronic acids. Parameters related to water and counterions were taken from the standard Amber libraries. The system was energy-minimized and then heated to 300 K at constant volume during 50 ps, whereas the solute was kept under positional restraints with a force constant of 25 kcal mol⁻¹ Å⁻². The positional restraints were gradually removed over 300 ps at constant pressure (1 atm) and temperature (300 K), and a production phase was initiated for 2 ns with the full set of restraints applied. The final structures were energy-minimized and used for subsequent analysis. The hydrodynamic radius was calculated for the model structures using HYDROPRO Version 7c2 (25).
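The annealing cycle described above can be written out as a simple schedule. The sketch below is a hypothetical helper, not part of Amber; it assumes linear temperature and restraint-weight ramps and assumes that restraints are also switched off during the initial heating stage, neither of which is stated explicitly in the text.

```python
CYCLE_LENGTH_PS = 70.0  # 10 (heat) + 10 (equilibrate) + 20 (ramp restraints) + 30 (cool)
N_CYCLES = 200


def annealing_state(t_ps):
    """Return (temperature in K, restraint scale 0..1) at time t_ps within one cycle."""
    if not 0.0 <= t_ps <= CYCLE_LENGTH_PS:
        raise ValueError("t_ps must lie within one 70-ps cycle")
    if t_ps < 10.0:
        # Heating from 300 to 1000 K (assumed linear, restraints off).
        return 300.0 + 700.0 * t_ps / 10.0, 0.0
    if t_ps < 20.0:
        # Equilibration at 1000 K without restraints.
        return 1000.0, 0.0
    if t_ps < 40.0:
        # Restraint force constants scaled from 0 to their final values.
        return 1000.0, (t_ps - 20.0) / 20.0
    # Cooling from 1000 to 300 K with full restraints (assumed linear).
    return 1000.0 - 700.0 * (t_ps - 40.0) / 30.0, 1.0


for t in (0.0, 5.0, 15.0, 30.0, 55.0, 70.0):
    temperature, scale = annealing_state(t)
    print(f"t = {t:5.1f} ps  T = {temperature:6.1f} K  restraint scale = {scale:.2f}")

total_ns = N_CYCLES * CYCLE_LENGTH_PS / 1000.0
print(f"total annealing time over {N_CYCLES} cycles: {total_ns:.0f} ns")
```

Under these assumptions each cycle lasts 70 ps, so the 200 cycles amount to 14 ns of restrained sampling before the explicit-solvent refinement.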

RESULTS
The structures of interferon Hyp-polysaccharide-1, interferon Hyp-polysaccharide-2, and Ala-Hyp-polysaccharide-2 were determined based on earlier composition analyses and on the chemical shifts (15,26) observed in one-dimensional 1H NMR spectra and two-dimensional COSY, TOCSY, HSQC, and HMBC NMR spectra as follows.
Primary Structure of Interferon Hyp-polysaccharide-1-Size-fractionated base hydrolysates of interferon α2-(Ser-Hyp)20 yielded a single peak containing sugar and Hyp residues, described earlier (26). Two subfractions were chosen for structural analyses; one contained the major Hyp-AG species, designated interferon Hyp-polysaccharide-1, with 22 glycosyl residues estimated by the Hyp to monosaccharide molar ratios, and a second fraction contained a smaller, less abundant Hyp-arabinogalactan of 20 glycosyl residues, designated interferon Hyp-polysaccharide-2.
The Interferon Hyp-polysaccharide-1 Hyp-galactose Linkage-Signals arising from the Hyp residue were also identified in the TOCSY (supplemental Fig. 1) and HSQC spectra (Fig. 2). The chemical shifts are shown in supplemental Table I. The H-4 and C-4 resonances characteristic of non-glycosylated Hyp shifted downfield from 4.62 and 70.6 to 4.786 and 78.02 ppm, respectively, judging from the HSQC spectrum. This indicated that the hydroxyl group of Hyp was galactosylated (15,18). The HMBC spectrum (Fig. 3) confirmed this in cross-peak G, which arose from C-4 of Hyp (78.02 ppm) and H-1 of a β-D-Galp residue (4.57 ppm), designated G1 in Fig. 4 and supplemental Table I.
Galactan Backbone and Side Chain Gal Residues-In addition to G1 linked to Hyp, there were another five β-D-Galp residues in the interferon Hyp-polysaccharide-1 galactan backbone (designated G1-G6 in supplemental Table I and Figs. 4 and 5) judging by H-1 resonances at ~4.70 ppm in the one-dimensional 1H spectrum. Four of the five backbone Gal residues were 3-linked to each other and to G1, as deduced from cross-peaks E in the HMBC spectrum (Fig. 3), which correlated backbone Gal H-1 signals with backbone Gal C-3 signals. A fifth backbone Gal participated in a 1→6 linkage to another backbone Gal residue, deduced from cross-peak F in the HMBC spectrum. Thus, based …
The final 4 of the 10 Gal residues of interferon Hyp-polysaccharide-1 occurred in side chains (designated Ga, Gb, Gc, Gd in supplemental Table I and Figs. 4 and 5) linked 1→6 to the backbone Gal residues. This was deduced from one-dimensional 1H and two-dimensional 13C,1H HMBC spectra. Resonances at 4.40-4.54 ppm in the one-dimensional 1H NMR spectrum were consistent with four Gal side chains attached to the galactan backbone (15), and cross-peaks I and J in the HMBC spectrum indicated that the four side-chain Gal residues were 1→6 linked to backbone Gal residues.
Interferon Hyp-polysaccharide-1 Side-chain Composition and Linkages-The two α-L-Rha residues, designated R1 and R2, were terminal, deduced by a comparison of their assigned chemical shifts (supplemental Table I) obtained from TOCSY (supplemental Fig. 1) and HSQC spectra (Fig. 2) with those of earlier characterized Ala-Hyp-polysaccharide-1 (15). Rhamnose residues R1 and R2 were linked to O-4 of β-D-GlcUAp (UA1 and UA2 of supplemental Table I), deduced from cross-peak D in the HMBC spectrum (Fig. 3). Cross-peak H in the HMBC spectrum indicated side-chain Gal residues were substituted at O-6 with glucuronic acid residues UA1 and UA2. The same chemical shifts arising from α-L-Rhap, β-D-GlcUAp, and side-chain β-D-Galp residues were identified earlier on side-chain Gal residues in Ala-Hyp-polysaccharide-1 (15). This indicated the side chains were attached to backbone Gal residues nearest Hyp; therefore, we assigned the two Rha-(1→4)-GlcUA subunits to the side-chain Gal residues closest to Hyp, Ga and Gb (supplemental Table I).
The Ara residues occurred in small side chains that were 3-linked to the side-chain Gal residues. The HMBC spectrum cross-peak A identified Ara 5-linked to another Ara (Fig. 3). It arose from H-1 of α-L-Araf residues (5.087 ppm, A1 and A4 in Fig. 4a) and C-5 of other Ara residues (67.1 ppm, A2 and A5 in Fig. 4a). This was consistent with the one-dimensional 1H NMR spectrum that indicated there were two Ara-(1→5)-Ara linkages in interferon Hyp-polysaccharide-1. HMBC signals arising from the ring carbon atoms of A1 and A4 (C-2, C-3, and C-4) showed that they were terminal residues (15); hence, two diarabinosyl structures occurred having the structure α-L-Araf-(1→5)-α-L-Araf-(1→.
A comparison of the interferon Hyp-polysaccharide-1 HSQC and HMBC spectra with those of Ala-Hyp-polysaccharide-1 (15) showed that the interferon Hyp-polysaccharide-1 anomeric carbon/proton chemical shifts arising from Gc and Gd (~103.6/4.447 and 103.2/4.41 ppm), the side-chain Gal residues furthest from Hyp in Fig. 4, differed from the other side-chain Gal residues, Ga and Gb (both ~103.4/4.50). Furthermore, the signals from Ga and Gb in interferon Hyp-polysaccharide-1 were identical to those of Ala-Hyp-polysaccharide-1 characterized earlier (15). Therefore, we designated the specific side-chain Gal residues in the two α-L-Araf-(1→5)-α-L-Araf-(1→3)-α-L-Araf-(1→3)-Gal units as Ga and Gb. They were part of the two bifurcated six-residue side chains of interferon Hyp-polysaccharide-1. The Gal residues in the two α-L-Araf-(1→3)-side-chain Gal units were designated Gc and Gd (supplemental Table I, Figs. 4 and 5).
FIGURE 3. HMBC spectrum of Hyp-AG interferon Hyp-polysaccharide-1 at 55°C. This helped identify the interferon Hyp-polysaccharide-1 monosaccharide sequence. Cross-peak A correlated Ara H-1 with Ara C-5, cross-peak B correlated Ara H-1 with Ara C-3, cross-peak C correlated Ara H-1 with side-chain Gal C-3, cross-peak D correlated Rha H-1 with GlcUA C-4, cross-peak E correlated backbone Gal H-1 with backbone Gal C-3, cross-peak F correlated backbone Gal H-1 with backbone Gal C-6, cross-peak G correlated G1 H-1 with Hyp C-4, cross-peak H correlated GlcUA H-1 with side-chain Gal C-6, and cross-peaks I and J correlated side-chain Gal H-1 with backbone Gal C-6.
Interferon Hyp-polysaccharide-1 Long-range Interactions-Lowering the temperature of NMR analyses from 55 to 25°C (see the chemical shift assignments in supplemental Table II) provided the following lines of evidence for a folded Interferon Hyp-polysaccharide-1 conformer (Fig. 6).
A diffusion-ordered spectroscopy spectrum (supplemental Fig. 2) gave a diffusion coefficient of (1.58 ± 0.1) × 10⁻¹⁰ m²/s. Using the Stokes-Einstein equation, this value corresponds to a hydrodynamic radius of 15.6 ± 1.0 Å, which is consistent with a globular folded structure and close to the calculated value for model structures of 13.9 ± 0.3 Å (see below).
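As a rough check of this conversion, the Stokes-Einstein relation R_h = k_B T / (6πηD) can be evaluated directly. This is a minimal sketch, not the authors' calculation: the temperature (298.15 K, matching the 25°C measurements) and the solvent viscosity (0.89 mPa·s, roughly that of H2O at 25°C) are assumptions, and the function name is hypothetical. D2O is somewhat more viscous, which would give a smaller radius.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K (exact SI value)


def hydrodynamic_radius_angstrom(diffusion_m2_s,
                                 temperature_k=298.15,
                                 viscosity_pa_s=0.89e-3):
    """Stokes-Einstein radius in Å for a sphere diffusing in a viscous solvent.

    temperature_k and viscosity_pa_s are assumed values, not taken from the text.
    """
    radius_m = BOLTZMANN * temperature_k / (
        6.0 * math.pi * viscosity_pa_s * diffusion_m2_s
    )
    return radius_m * 1e10  # metres to Å


radius = hydrodynamic_radius_angstrom(1.58e-10)
print(f"R_h = {radius:.1f} Å")
```

With these assumed solvent parameters the result falls within the reported 15.6 ± 1.0 Å; the outcome is sensitive to the viscosity chosen, so the value should be read as a consistency check only.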
The intensity of NOEs arising from protons between sequential O-linked sugar residues, for example H-1 of R1 and H-4 of glucuronic acid residue UA1 (supplemental Table III and Fig. 3), suggested a somewhat restricted conformation around glycosidic links rather than free rotation. Furthermore, the single set of chemical shifts rules out the existence of several stable conformations.
Although the major spectral region was not amenable to unequivocal analysis due to considerable resonance overlap, a region possessing unique chemical shifts showed two NOE clusters at ~5.25 … 6 and H-1 of UA1 or UA2. NOEs (i) and (ii) indicate a side chain Ara of the second trigalactosyl unit is close to the side-chain Gal/UA residues of the first trigalactosyl unit; this suggests that the β-1,6 linkage between two trigalactosyl units allows the main chain to fold. The second cluster also included diagnostic NOEs: (i) 5.237/4.694 ppm attributed to H-1 of Ara A7 or A8 and H-1 of Gal G2; (ii) 5.265/4.700 ppm attributed to H-1 of A3 or A6 and H-1 of G4. The intensity of these NOEs indicates a distance of 6 Å between the respective protons that suggests the molecule is folded. The possible effect of spin diffusion can be excluded because these NOEs could be observed with a short mixing time of 150 ms.
Primary Structure of Interferon Hyp-polysaccharide-2-Neutral sugar, uronic acid, and Hyp analyses of interferon Hyp-polysaccharide-2 gave a molar ratio of Hyp Gal10 Ara5 GlcUA4 Rha. The interferon Hyp-polysaccharide-2 1H NMR spectrum (supplemental Fig. 4) gave a very similar signal pattern in the anomeric proton region to that of interferon Hyp-polysaccharide-1, except that the peak area ratios differed, as interferon Hyp-polysaccharide-2 contained two more GlcUA, one fewer Rha, and only five Ara residues. Of the five α-Araf residues, only one (anomeric proton signal at ~5.08 ppm) was 5-linked to another Ara; the other four (anomeric proton signal at ~5.24 ppm) were terminal, 1,3-linked, or 1,5-linked. The α-Rhap residue was terminal, and of the 10 β-Galp residues, 6 were backbone Gal residues. Five of the backbone Gal residues gave H-1 signals at 4.68-4.71 ppm, and the sixth, G1 in supplemental Table IV, was linked to Hyp. The other four Gal residues were part of the side chains (4.39-4.49 ppm). The H-1 signals of β-GlcUAp were not resolved from those of the side-chain Gal; however, the signal peak integral combined with chemical analyses of interferon Hyp-polysaccharide-2 indicated it had four GlcUA residues. Together, the 1H NMR spectrum and the sugar analyses indicated interferon Hyp-polysaccharide-2 was a 20-sugar residue Hyp-arabinogalactan (Fig. 5c). We assigned the chemical shifts of interferon Hyp-polysaccharide-2 (supplemental Table IV) using two-dimensional TOCSY (supplemental Fig. 5), HSQC (supplemental Fig. 6), and HMBC (supplemental Fig. 7) spectra discussed below.
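The residue bookkeeping behind the 20- and 22-residue sizes can be tallied explicitly. In this small sketch the composition of interferon Hyp-polysaccharide-2 is the molar ratio quoted above, whereas the composition of interferon Hyp-polysaccharide-1 is inferred from the residue labels used in the text (10 Gal, A1 to A8, UA1 and UA2, R1 and R2) and should be treated as an assumption; the dictionary and function names are illustrative.

```python
# Sugar residues per Hyp; Hyp itself is not counted as a glycosyl residue.
COMPOSITIONS = {
    # Inferred from residue labels in the text (assumption).
    "interferon Hyp-polysaccharide-1": {"Gal": 10, "Ara": 8, "GlcUA": 2, "Rha": 2},
    # Molar ratio reported from the sugar analyses.
    "interferon Hyp-polysaccharide-2": {"Gal": 10, "Ara": 5, "GlcUA": 4, "Rha": 1},
}


def residue_count(composition):
    """Total number of glycosyl residues in one Hyp-AG."""
    return sum(composition.values())


for name, composition in COMPOSITIONS.items():
    print(f"{name}: {residue_count(composition)} glycosyl residues")

# Differences stated in the text: two more GlcUA, one fewer Rha, three fewer Ara.
first = COMPOSITIONS["interferon Hyp-polysaccharide-1"]
second = COMPOSITIONS["interferon Hyp-polysaccharide-2"]
differences = {sugar: second[sugar] - first[sugar] for sugar in first}
print(differences)
```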
Hyp-Gal Linkage and Galactan Backbone-Interferon Hyp-polysaccharide-2 had the same galactan backbone structure as interferon Hyp-polysaccharide-1. The Hyp-Gal linkage was established by cross-peak G in the HMBC spectrum (supplemental Fig. 7). Like interferon Hyp-polysaccharide-1, the galactan backbone of interferon Hyp-polysaccharide-2 was composed of five β-D-Galp residues (G2-G6 in supplemental Table IV) and a sixth Gal linked to Hyp (G1). Cross-peak E indicated the backbone Gal residues were mainly 1→3-linked, although a 1→6 link occurred between G3 and G4 (cross-peak F in supplemental Fig. 7 and Fig. 5c).
Interferon Hyp-polysaccharide-2 Side Chains-The side-chain structures were similar to those of interferon Hyp-polysaccharide-1, but generally smaller. The four side-chain Gal residues evident in the one-dimensional 1H NMR spectrum were attached to backbone Gal residues through the O-6 position, deduced from cross-peaks I and J in the HMBC spectrum (supplemental Fig. 7) (side-chain Gal H-1 at 4.39-4.49 ppm to backbone Gal C-6 at ~70.0 ppm). Interferon Hyp-polysaccharide-2 had only one terminal α-L-Rhap but four β-D-GlcUAp residues. Cross-peak D in the HMBC spectrum indicated an α-L-Rhap-(1→4)-β-D-GlcUAp unit, and the chemical shifts of the other GlcUA residues (supplemental Table IV) indicated they were unsubstituted and, therefore, terminal. Cross-peak H showed that all four GlcUA residues were β-1→6-linked to side-chain Gal residues. In Fig. 5c, we assigned the Rha-(1→4)-GlcUA unit to one of the two side-chain Gal residues closest to Hyp, Gb, based on earlier work with Ala-Hyp-polysaccharide-1 (15); however, we had no direct evidence for this assignment, and any one of the four side-chain Gal residues was a candidate.
HMBC cross-peaks A, B, and C (supplemental Fig. 7) corresponded to the following side chain arabinosyl units: α-L-Ara- … Table IV, Fig. 5c) in the TOCSY spectrum (supplemental Fig. 5) indicated residues Gc and Gd were unsubstituted at O-3. However, we could not discern the precise distribution of the diarabinosyl and triarabinosyl units between Ga and Gb.
Judging by the C/H chemical shifts at δ 82.3/4.21 ppm in the HSQC spectrum (supplemental Fig. 8 and Table V), the two α-L-Araf residues were terminal residues. Cross-peak C in the HMBC spectrum indicated two side-chain units of α-L-Araf-(1→3)-β-D-Gal.
NMR Structure Calculations of the Hyp-AGs-Model structures of interferon Hyp-polysaccharide-1 were generated by simulated annealing, and the 10 best models were further refined in explicit water and ion environment. These models, consistent with NOE distance information obtained at 25°C (supplemental Table III), depict an overall folded structure with backbone residues Gal-6 and Gal-1 in proximity, forming a sharp bend or "reverse turn" at the β-1,6 link between repetitive β-1,3-linked trigalactosyl subunits. We note that some conformational flexibility was seen in the NMR ensemble without violating the experimental distance information (Fig. 6). Significantly, ~10 intramolecular H-bonds stabilized interactions in which uronic acid carboxyls appeared close enough to chelate Ca²⁺, a possibility supported by NOE-restrained molecular dynamics simulation of interferon Hyp-polysaccharide-1 in explicit water and Ca²⁺ (supplemental Fig. 10). The chelated ion remained strongly bound to the uronic acids without violating the available NOE distance data throughout the simulation. As an additional test of the global properties of the structural models, we calculated the hydrodynamic radius of the structures as 13.9 ± 0.3 Å.

DISCUSSION
Complete elucidation of type II arabinogalactan structure has been a major goal since the isolation of Hyp-AGs more than 30 years ago (5,6). Churms et al. (16,27,28) notably suggested a repetitive structure with "AG substituents showing blocks of 1,3-linked galactan backbone interrupted by periodate susceptible residues (kinked region)." The structure of Ala-Hyp-polysaccharide-1 supported the repetitive subunit hypothesis and also suggested that the arabinogalactan structure is highly conserved in both classical AGPs and the less abundant chimeric glycoproteins of the cell surface (29). As Ala-Hyp-polysaccharide-1 was only the first Hyp-AG characterized (15), we characterized additional Hyp-AGs to test the possibility that AGs are highly varied structures dictated by regional peptide sequence and too complex for structural elucidation. We expressed the two most frequently occurring AGP motifs, Ser-Hyp and Ala-Hyp, in tobacco Bright Yellow-2 cells as chimeric fusion glycoproteins with non-glycosylated partners interferon α2b and green fluorescent protein, respectively. This allowed comparison of Hyp-AGs isolated from Ser-Hyp and Ala-Hyp repeats in the fusion proteins interferon α2-(Ser-Hyp)20 (15,30) and (Ala-Hyp)51-GFP.
The side-chain substituents linked to the galactan main chain were similar in position and composition. The small Hyp-AGs, Ala-Hyp-polysaccharide-1 and Ala-Hyp-polysaccharide-2, contained only two side chains attached to main-chain Gal residues G1 and G2 (15) and differed from each other only in their Rha and Ara content (Fig. 5). In contrast, the larger Hyp-AGs, interferon Hyp-polysaccharide-1 and interferon Hyp-polysaccharide-2, each had four side chains ranging from two to six glycosyl residues linked to G1, G2, G4, and G5 (Figs. 4 and 5). These three new Hyp-AGs and Ala-Hyp-polysaccharide-1 shared an ~15-residue subunit consisting of a repetitive trisaccharide with two bifurcated acidic side chains (Figs. 4-6), each with a maximum of six residues; these side chains are attached to C-6 of G1 and G2, i.e. the first and second Gal residues, numbered from the reducing end of the galactan backbone.
This relatively invariant Hyp-AG structure with bifurcated side chains is apparently widespread; it is consistent with compositional data from diverse species (16,27,28) and with the fact that AGPs from diverse species selectively co-precipitate with the β-Yariv reagent. The six-residue side chain of interferon Hyp-polysaccharide-1 is identical with the gum arabic side chain in the legume Acacia senegal (31) except for the addition of terminal 5-linked Araf in interferon Hyp-polysaccharide-1. Furthermore, neither the type of Hyp-AG peptide motif (Ser-Hyp versus Ala-Hyp) nor the attached non-glycosylated domain (interferon versus GFP) affected the composition or structure of this 15-residue glycan subunit, again consistent with a highly conserved structure. The "type II" (i.e. β-1,3-linked) Hyp-AGs represented here by interferon Hyp-polysaccharide-1, interferon Hyp-polysaccharide-2, Ala-Hyp-polysaccharide-2 (Figs. 4 and 5), and Ala-Hyp-polysaccharide-1 (15) confirm that tobacco Bright Yellow-2 cell Hyp-AGs consist of small ~15-residue subunit repeats with a common composition and linkage pattern, differing mainly in the number of trigalactosyl subunits decorated with two side chains, each composed of up to 6 residues. Thus, larger Hyp-AGs of up to 150 sugar residues (5,6) may consist of ~10 repetitive subunits. There are variations on the major 15-residue theme. For example, some Arabidopsis Hyp-AGs lack rhamnose (32) and likely contain fucose; furthermore, some monosaccharide residues may be modified by acetylation (33) or 4-O-methylation (34).
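The subunit arithmetic used in this argument is simple enough to spell out. The sketch below only restates numbers given in the text (three backbone Gal residues, two side chains of up to six residues each, and Hyp-AGs of up to 150 residues); the constant and function names are illustrative.

```python
BACKBONE_GAL_PER_SUBUNIT = 3      # one trigalactosyl block
SIDE_CHAINS_PER_SUBUNIT = 2       # attached to the first and second Gal
MAX_RESIDUES_PER_SIDE_CHAIN = 6   # largest bifurcated side chain


def max_subunit_size():
    """Largest fully decorated subunit, in sugar residues."""
    return (BACKBONE_GAL_PER_SUBUNIT
            + SIDE_CHAINS_PER_SUBUNIT * MAX_RESIDUES_PER_SIDE_CHAIN)


def estimated_subunits(total_residues):
    """Approximate number of fully decorated subunits in a Hyp-AG."""
    return total_residues / max_subunit_size()


print(f"residues per complete subunit: {max_subunit_size()}")
print(f"subunits in a 150-residue Hyp-AG: {estimated_subunits(150):.0f}")
```

Because side chains are often truncated, the number of subunits in a real glycan of a given size could be somewhat higher than this estimate.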
Although interferon Hyp-polysaccharide-1, interferon Hyp-polysaccharide-2, and Ala-Hyp-polysaccharide-2 each had a complete 15-residue subunit, i.e. a galactose trisaccharide with 6-residue bifurcated side chains, these Hyp-AGs also contained incomplete subunits with truncated side chains, for example (α-L-Ara-(1→3)-Gal) in interferon Hyp-polysaccharide-1 and (β-D-GlcUA-(1→6)-Gal) in interferon Hyp-polysaccharide-2. These variations, particularly the incomplete Ala-Hyp-polysaccharide-1 Gal backbone, argue for biosynthesis of Hyp-AGs via stepwise saccharide addition to an AGP polypeptide or alternatively en bloc transfer of incomplete lipid-linked arabinogalactan intermediates (15) or even sugar trimming after transfer, although compelling evidence for degradative turnover of AGPs has not yet been observed. The existence of AGP microheterogeneity like that shown here may account for notions of Hyp-AGs as intractably complex. However, a small Hyp-arabinogalactan containing only 4 different sugars and 7 different glycosidic linkages is a relatively simple structure compared, for example, with the complex rhamnogalacturonan-II pectic polysaccharide with 12 different sugar residues linked by more than 20 different glycosidic linkages (35).
Signaling functions and roles as determinants of cell fate dominate current AGP discussion. However, the location of classical AGPs at the cell surface and their sheer physical abundance (3) predict primarily structural functions. Structural conservation implies that Hyp-AGs play similar conserved roles in both classical AGPs and AGP chimeras (3,29). Nevertheless, despite considerable speculation (4,36), specific biological roles of classical AGPs and their Hyp-AGs remain to be elucidated. The ubiquity of AGPs at the cell surface and the involvement of classical AGPs in numerous fundamental developmental processes are clear (3,37-41), yet AGP function at the molecular level is relatively unexplored. The current structural elucidation, including computer simulations, supports structural roles for classical AGPs as follows.
Exclusively β-1,3-linked galactans belong to the compact hollow helix polysaccharide family (42). However, interspersed β-1,6 linkages form "kinks" (17) that may be analogous to the classical reverse β-turns of polypeptides. A molecular model (Fig. 6) depicts interferon Hyp-polysaccharide-1 as a folded polysaccharide that forms a moderately compact spheroid consistent with both the NOE data and the hydrodynamic radius determined experimentally.
Modeling also revealed other conserved features of interferon Hyp-polysaccharide-1 that may contribute to its stability and molecular function. They include the general stabilizing role of ~10 intramolecular H-bonds and shrinkage of conformational space (43) by the bulky bifurcated AG side chains; these limit the conformational AG landscape particularly when restrained even further in vivo by their attachment to the polypeptide backbone.
A conserved core structure for type II arabinogalactans suggests a conserved function and invites some speculation about the precise roles played by the Hyp-arabinogalactans that decorate naturally occurring AGPs and the numerous AGP chimeras that populate the plasma membrane-cell wall interface. Indeed, in tobacco cells complete coverage of the plasma membrane by classical AGPs is likely (3).
Although the sheer abundance of classical AGPs at the cell surface suggests they (and their glycans) are structural molecules, roles for AGPs in signal transduction might arise from the composition and presentation of residues at the periphery of the glycans, the regions that also exhibit microheterogeneity. The compact folded structure deduced here for interferon Hyp-polysaccharide-1 indicates that the side chains are readily available for homophilic and heterophilic interactions and the galactan backbone somewhat less so, especially near the reducing end of the polysaccharide where the larger side chains shield the galactan backbone.
For example, although all Hyp-arabinogalactans possess a galactan backbone with arabinosyl side chains, some lack the abundant uronic acid residues prevalent in the Hyp-arabinogalactans of tobacco (44,45), where charge repulsions of the uronic acid residues may form a compression buffer at the cell surface analogous to animal proteoglycan compression buffers (15). On the other hand, the close proximity of glucuronic acid residues may favor Ca²⁺ chelation (supplemental Fig. 10), which may be relevant to processes involving calcium signaling (45). Finally, wall AGPs may play a role in regulating cell extension, perhaps by acting as pectic plasticizers (3). The structures described here may help test these hypotheses.