J. Biol. Chem., Vol. 279, Issue 44, 45503-45511, October 29, 2004


From the School of Agriculture and Wine, Waite Agricultural Research Institute, The University of Adelaide, PMB1, Glen Osmond, South Australia 5064, Australia and the ¶Australian Centre for Plant Functional Genomics and Plant Cell Biology Research Centre, School of Botany, The University of Melbourne, Parkville, Victoria 3010, Australia
Received for publication, July 7, 2004, and in revised form, August 16, 2004.
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
AGPs contain large O-linked arabinogalactan (AG) polysaccharides attached to the numerous hydroxyproline (Hyp) residues found throughout the protein backbone (13). In some instances, short arabino-oligosaccharide chains (10–12) are attached to Hyp residues and single Gal residues are linked to Ser residues (6, 13). Modification of the Pro residues in the protein backbone is thought to occur co-translationally as the proteins are inserted into the endoplasmic reticulum (ER) (1). Recently a putative plant prolyl hydroxylase was shown to have hydroxylation activity in insect cells on both pro-collagen and plant-specific protein backbones (14). Because our understanding of the substrate specificity of individual prolyl hydroxylases is limited, sequencing the protein/peptide backbones of AGPs would make a substantial contribution. Other common post-translational modifications in plants and animals, which are not commonly associated with AGPs, include N-linked glycosylation, myristoylation, phosphorylation, and methylation (15).
It is now clear that the broad class of glycan attached to Hyp residues in the protein backbone depends on whether Hyp residues are contiguous or non-contiguous (12). The Hyp contiguity hypothesis (16) states that contiguous Hyp residues (e.g. Ser-Hyp-Hyp-Hyp), as typically found in extensins, are attachment sites for short arabino-oligosaccharides and that non-contiguous Hyp residues (e.g. Ser-Hyp-Ala-Hyp), as typically found in the AG-peptides, are the attachment sites for AG polysaccharides. Our understanding of the genes encoding the enzymes for the glycosylation of Hyp is limited, although it is likely that many different glycosyltransferases (GTs) will be involved (6, 17). It is not clear whether the addition of AG polysaccharides to AGPs occurs one residue at a time, or whether a glycan core is added en bloc as is the case for N-linked glycans. The AG polysaccharide backbone is repetitive with β-(1→3)Galp residues (degree of polymerization of approximately seven) separated by a periodate-sensitive linkage, reviewed in Ref. 1. The repetitive nature of the AG polysaccharide backbone provides some support for the en bloc addition of the sugars (18).
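The Hyp contiguity rule described above can be written as a short classification routine. The following is a minimal sketch, assuming the one-letter code O for Hyp (as used for peptide sequences later in this paper) and treating any run of two or more adjacent Hyp residues as contiguous; the function name and the two example strings are illustrative only and are not taken from any published tool.

```python
import re


def classify_hyp_sites(seq: str) -> list[tuple[int, str]]:
    """Label each Hyp ('O') in seq by the glycan class the hypothesis predicts.

    Returns (1-based position, predicted glycan class) pairs.
    Assumption: a run of two or more adjacent 'O' counts as contiguous.
    """
    labels = []
    for match in re.finditer(r"O+", seq):
        run_length = match.end() - match.start()
        glycan = (
            "arabino-oligosaccharide" if run_length >= 2 else "AG polysaccharide"
        )
        for index in range(match.start(), match.end()):
            labels.append((index + 1, glycan))
    return labels


# Ser-Hyp-Hyp-Hyp (contiguous) versus Ser-Hyp-Ala-Hyp (non-contiguous)
print(classify_hyp_sites("SOOO"))
print(classify_hyp_sites("SOAO"))
```

The rule is deliberately binary; it does not attempt to model partial hydroxylation or the influence of flanking residues discussed elsewhere in the text.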
The finding that AGPs are GPI-anchored (19–22) provided new insights into how AGPs might function as signaling molecules (8). Most Arabidopsis AGPs are predicted to be, at least transiently, attached to the outer surface of the plasma membrane via a GPI anchor. A plasma membrane localization is consistent with the results of immunolocalization experiments, reviewed in Ref. 23. Experimental evidence for GPI anchoring has been obtained for only a few plant AGPs: PcAGP1 from Pyrus communis (19), NaAGP1 from Nicotiana alata (19), AtAGP10 from Arabidopsis (9), and LeAGP1 from Lycopersicon esculentum (12, 24, 25). To date, only two plant proteins have had their GPI-anchor cleavage site (ω) determined experimentally: NaAGP1 from N. alata and PcAGP1 from P. communis (19). For many other AGPs the predicted in vivo location is based on the presence of a putative signal for GPI-anchor addition based on genomic sequences (4, 26).
With the completion of two plant genome sequencing projects, Arabidopsis (27) and rice, Oryza sativa (28, 29), there is a need to improve the certainty around GPI-anchor prediction. Although the basic structure of the GPI-signal sequence is conserved among a diverse group of organisms, prediction programs developed for animal and protozoan proteins (30) work poorly for the AGPs. Progress in GPI-anchor prediction for plant proteins has recently been made by two groups (26, 31). With a combination of simple in silico sequence analytic procedures, gene sequence refinement, and human expert judgment, Dupree and colleagues (26) derived a list of 248 potentially GPI lipid-anchored proteins. With the improved genome annotation this approach predicts that there are ~248 GPI-anchored proteins in Arabidopsis belonging to at least 15 different gene families (32). The classical AGPs and AG-peptides comprise 12% of the total GPI-anchored proteins, and proteins containing AGP-like domains (AG glycomodules) as part of longer proteins, e.g. certain FLAs and lipid transfer proteins, make up a further 28% of predicted GPI-anchored proteins (32).
The big-Π Plant Predictor was developed by Eisenhaber and colleagues (31) and is available as an online predictor (mendel.imp.univie.ac.at/sat/gpi/plant_server.html). This predictor includes PcAGP1, NaAGP1, and selected AG-peptides among 219 plant proteins with experimentally verified or hypothetically anticipated GPI anchor modification. This has resulted in the prediction that 187 Arabidopsis proteins are GPI-anchored, and 165 of these are the same as predicted by Ref. 32. The discrepancies between the two approaches highlight that the knowledge of the sequence features of the GPI-anchor signal is incomplete and that the big-Π Plant Predictor conservatively extrapolates the current knowledge on uncharacterized proteins.
Another new web-based GPI-anchor predictor, "detection of GPI," was developed by D. Buloz and J. Kronegg of the Swiss Institute of Bioinformatics (129.194.185.165/dgpi/DGPI_demo_en.html); however, it does not work well for plant proteins. Although the "detection of GPI" program suggests that both NaAGP1 and PcAGP1 are GPI-anchored, it does not accurately predict the cleavage site for these two AGPs. For example, the experimentally determined GPI cleavage site (ω) for PcAGP1 (GenBank™ accession number U14009) is between Ser120 and Gly121 (19), whereas the "detection of GPI" program (Oct 24, 2003) predicts cleavage between Gly106 and Ser107.
The major limitation to accurate GPI-anchor prediction in plants is the lack of experimental verification of GPI anchoring, especially with regard to the determination of the cleavage site (ω) residue. Here, we describe the separation of a family of deglycosylated AG-peptides from the bulk of the AGPs. The small size of the deglycosylated AG-peptides, 10 to 17 residues, allowed them to be analyzed without protease digestion. Using matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF/MS) and tandem mass spectrometry (MS/MS) protein sequencing we determined the precise cleavage site for both N- and C-terminal signals. We also established that all Pro residues can be hydroxylated to Hyp for the eight AG-peptides analyzed. No other post-translational modifications were detected. These data contribute toward our goal of determining the complete primary structure of this functionally important group of plant proteoglycans.
| EXPERIMENTAL PROCEDURES |
|---|
120 rpm) in the laboratory under continuous fluorescent lighting at ~22 °C. The whole seedlings were harvested after 14 days and freeze dried. Five flasks gave ~1 g dry weight of tissue after freeze drying.
Purification of AG-peptides: AGPs were extracted from freeze dried, liquid culture-grown seedlings as previously described (9). In brief, following homogenization in liquid nitrogen, detergent-soluble protein was extracted in a buffer (50 mM Tris-Cl, pH 8.0, 10 mM EDTA, 1% Triton X-100, 0.1% β-mercaptoethanol), freeze dried, and resuspended in a small volume of 1% (w/v) NaCl. AGPs were selectively precipitated using β-glucosyl Yariv reagent (β-GlcY, Biosupplies, Parkville, Australia) and collected by centrifugation. After washing, the β-GlcY was dissociated from the AGPs using dimethyl sulfoxide and sodium dithionite and removed by desalting on a PD-10 column (Sephadex G-25, Amersham Biosciences). The β-GlcY-precipitated AGP sample was chemically deglycosylated without further chromatography because previous studies have shown that β-GlcY-precipitated AGPs contain no Coomassie Blue-stained proteins on SDS-PAGE (33).
Deglycosylation of AGPs by Anhydrous Hydrogen Fluoride: Chemical deglycosylation of AGPs was performed using anhydrous hydrogen fluoride (HF) for 3 h at room temperature, according to the method of Mort and Lamport (34) as previously described (9). The HF was removed under vacuum, then the sample was desalted on a PD-10 column (Sephadex G-25, Amersham Biosciences). The desalted deglycosylated AGP fractions were freeze dried prior to resuspending in 500 µl of ultrapure water.
HPLC: Reversed-phase (RP)-high performance liquid chromatography (HPLC) was performed with a C18 column (Vydac, 218TP52, 5 µm, 2.1 × 250 mm) attached to a Hewlett-Packard 1090 Liquid Chromatograph with a diode array detector and analyzed using Chemstation software. AGPs were eluted and collected from the column equilibrated in solvent A (0.05% trifluoroacetic acid) with a linear gradient of solvent B (0.04% trifluoroacetic acid in 70% acetonitrile): 0 to 30% solvent B in 30 min, then 30 to 100% in 30 min at a flow rate of 0.3 ml/min. Chromatography was monitored by absorption at 214 and 280 nm. A 250-µl aqueous aliquot of the deglycosylated AGP was mixed with 0.05% trifluoroacetic acid and loaded onto the C18 column maintained at ambient temperature (21 °C) to give sample A. The remaining deglycosylated AGP sample (approximately 250 µl) was mixed with an equal volume of 0.05% trifluoroacetic acid and loaded onto the same RP-HPLC column, but the column temperature was increased to 28 °C in an attempt to improve the elution profile, to give sample B. Sample B contained ~80% as much material as sample A based on the absorbance of the largest peak (160 mA for B7 compared with 200 mA for A7).
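The two-segment gradient described above (0 to 30% solvent B in 30 min, then 30 to 100% in a further 30 min) can be expressed as a piecewise-linear function of time. This is a minimal sketch for orientation only: the function names are hypothetical, and the acetonitrile estimate assumes ideal mixing of solvent A (no acetonitrile) with solvent B (70% acetonitrile).

```python
def percent_b(t_min: float) -> float:
    """Percent solvent B at t_min minutes into the 60-min gradient."""
    if not 0.0 <= t_min <= 60.0:
        raise ValueError("gradient is defined for 0 to 60 min only")
    if t_min <= 30.0:
        # First segment: 0 -> 30% B over 30 min (1% per minute).
        return t_min
    # Second segment: 30 -> 100% B over the next 30 min.
    return 30.0 + (t_min - 30.0) * (70.0 / 30.0)


def percent_acetonitrile(t_min: float) -> float:
    """Approximate % acetonitrile in the eluent, assuming ideal mixing."""
    return 0.70 * percent_b(t_min)


for t in (0, 15, 30, 45, 60):
    print(t, round(percent_b(t), 1), round(percent_acetonitrile(t), 1))
```

Because solvent B is itself only 70% acetonitrile, the shallow first segment spans roughly 0 to 21% acetonitrile, which is where small hydrophilic peptides would be expected to elute.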
Edman Degradation Protein Sequencing: N-terminal protein sequencing was performed by automated Edman degradation on a sequencer (Model LF 3400; Beckman Instruments) with on-line analysis on a Beckman System Gold HPLC.
Matrix-assisted Laser Desorption Ionization Time-of-Flight Mass Spectrometry: MALDI-TOF analysis was performed on a Voyager-DE STR MALDI mass spectrometer (Applied Biosystems), equipped with a 337-nm N2 laser. Parent ion masses were measured in reflector/delayed extraction mode, with accelerating voltage of 20 kV, grid voltage of 65%, and a 150-ns delay. Positive polarity was used. 50 scans were averaged per sample, and spectra were subject to two-point external calibration using the most appropriate standards. Initially, MALDI-TOF was performed with external standards to avoid interference and suppression that can occur with internal standards (35). For sample A, the external standards used were bradykinin, angiotensin I, and adrenocorticotropic hormone peptide-(18–39) with Mr = 1060.5692, 1296.6853, and 2465.1989, respectively. For sample B, the internal standards used were bradykinin, angiotensin I, and adrenocorticotropic hormone-(1–17) with Mr = 1060.5692, 1296.6853, and 2093.0867, respectively. For both internal and external standards, two-point calibrations were used. The matrix used was a 2,5-dihydroxybenzoic acid-saturated solution in 50% acetonitrile with 0.1% trifluoroacetic acid.
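The principle of a two-point calibration can be sketched in a few lines. The instrument software's actual procedure is not described here, so the following is an assumption-laden illustration: it treats the square root of m/z as linear in flight time and therefore fits a straight line between the square roots of observed and reference m/z for two calibrants. The function name and the "observed" calibrant values are hypothetical; only the reference masses come from the text.

```python
import math


def two_point_calibration(observed, reference):
    """Build an m/z correction function from two calibrant peaks.

    observed, reference: pairs of m/z values for the same two calibrants.
    Assumption: sqrt(m/z) is linear in flight time, so the correction is
    fitted as a straight line in sqrt(m/z) space.
    """
    (obs1, obs2), (ref1, ref2) = observed, reference
    x1, x2 = math.sqrt(obs1), math.sqrt(obs2)
    y1, y2 = math.sqrt(ref1), math.sqrt(ref2)
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1

    def correct(mz: float) -> float:
        return (slope * math.sqrt(mz) + intercept) ** 2

    return correct


# Reference values from the text: bradykinin and the ACTH-(18-39) peptide.
# The observed values are made up purely to exercise the function.
correct = two_point_calibration(
    observed=(1060.60, 2465.30),
    reference=(1060.5692, 2465.1989),
)
print(correct(1218.62))
```

With only two calibrants the fit is exact at both points, so accuracy for analyte peaks depends on how well the linear assumption holds between and beyond them.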
Protein Sequencing by Tandem Mass Spectrometry: Most samples were analyzed on an Applied Biosystems QSTAR® XL system using a Protana/New Objective tip. MS/MS for most ions was performed with an ion source voltage of 1600 V, collision energy of 28 V, with Q1 in low resolution mode. For ions that were difficult to fragment (e.g. sodium and potassium adducts) the collision energy was increased to 40 V. Other samples were analyzed on an LCQ mass spectrometer (Thermo-Finnigan, San Jose, CA). Sample (5 µl) was injected onto a C18 column (Vydac, 300 µm internal diameter × 10 cm). The solvents used were 0.5% acetic acid, 5% acetonitrile (solvent A) and 0.5% acetic acid, 60% acetonitrile (solvent B). Peptides were eluted with a 60-min gradient into the LCQ mass spectrometer with a 4.6 kV source voltage. A cycle of one full-scan mass spectrum (300–2,000 m/z) followed by one data-dependent MS/MS spectrum with a collision energy of 35 V was employed. Sample A10 was run on an Applied Biosystems QSTAR® Pulsar I, using a Protana nanospray tip in positive ion mode. MS/MS of ion m/z 616.8 was performed with an ion source voltage of 950 V, collision energy of 40 V, and with Q1 in low resolution mode.
Calculating the Predicted Masses of AG-peptides: To calculate the expected mass of each AG-peptide the expected post-translational modifications were included. The "standard" modifications were based on the (i) removal of the N-terminal signal sequence as predicted by SignalP (36), (ii) removal of the C-terminal signal for GPI-anchor addition (assuming the most likely cleavage site based on the animal consensus data (37)), (iii) addition of an ethanolamine (43.0422 Da) at the C terminus of the peptide (ethanolamine (et) is the only remnant of the GPI-anchor after deglycosylation), and (iv) hydroxylation of Pro residues to hydroxyproline (Hyp). The monoisotopic mass of unmodified AG-peptides (with N- and C-terminal signals removed) was calculated using MSProduct (prospector.ucsf.edu/ucsfhtml4.0/msprod.htm). Where the mature protein was predicted to have a Gln (Q) residue at the N terminus, the mass for pyroglutamate was used because N-terminal Gln residues on AGPs are frequently cyclized (9). The masses of modifications such as hydroxylation and methylation were obtained from the FindMod page at the Expasy site (Supplemental Materials Table IV; Ref. 15; us.expasy.org/tools/findmod/findmod_masses.html) or from Ref. 38.
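The mass bookkeeping described above can be reproduced with a short script. This sketch is not the MSProduct calculation used in the study; it simply sums standard monoisotopic residue masses and applies the same modifications: Hyp written as O (Pro plus one oxygen), a C-terminal ethanolamine of 43.0422 Da, and loss of NH3 when an N-terminal Gln cyclizes to pyroglutamate. The function name and its parameters are illustrative assumptions.

```python
WATER = 18.010565        # H2O added for the free N and C termini
HYDROXYL_O = 15.994915   # one oxygen: Pro -> Hyp
ETHANOLAMINE = 43.0422   # C-terminal GPI remnant, value given in the text
AMMONIA = 17.026549      # NH3 lost when N-terminal Gln forms pyroglutamate

# Standard monoisotopic residue masses (Da).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
RESIDUE["O"] = RESIDUE["P"] + HYDROXYL_O  # Hyp, one-letter code O


def ag_peptide_mass(seq: str, ethanolamine: bool = True,
                    pyroglu: bool = True) -> float:
    """Neutral monoisotopic mass of a deglycosylated AG-peptide backbone."""
    mass = sum(RESIDUE[aa] for aa in seq) + WATER
    if ethanolamine:
        mass += ETHANOLAMINE
    if pyroglu and seq.startswith("Q"):
        mass -= AMMONIA
    return mass


# Backbone quoted later in the text for AtAGP13: VEAOAOSOTS-et
print(round(ag_peptide_mass("VEAOAOSOTS"), 4))
```

As a sanity check on the residue table, calling `ag_peptide_mass("YAEVOSOAAQAOTAD", ethanolamine=False)` gives about 1534.68 Da, in line with the 1534 Da backbone quoted for the wheat AG-peptide in the Discussion. Adding 1.007276 Da to a neutral mass gives the singly protonated ion.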
| RESULTS |
|---|
β-GlcY precipitation. The AGP sample was deglycosylated using HF and the deglycosylated AGPs were separated on a C18 column using RP-HPLC (Fig. 1A). Half of the deglycosylated β-GlcY-precipitated material (sample A) was used to produce the profiles shown in Fig. 1, A and B. The remaining material (sample B) was chromatographed at a later stage using slightly different conditions, in particular, elevated column temperatures, resulting in minor changes in the elution profiles (Fig. 1C).
Edman Sequencing Confirms the Presence of AG-peptides: To confirm that the AG-peptides were in fractions A1 to A12, each fraction was sequenced by Edman sequencing. Some fractions contained multiple amino acid residues in each cycle of Edman sequencing, suggesting that multiple peptides eluted in each fraction. Because the expected sequence of AG-peptides is known it was generally possible to assign most residues to a single AG-peptide (Table I). AtAGP13, AtAGP14, AtAGP15.2, AtAGP21, AtAGP22, and AtAGP24 were positively identified from the N-terminal Edman sequencing (Table I). Four of the identified AG-peptides, AtAGP13, AtAGP21, AtAGP22, and AtAGP24, have the expected N terminus based on SignalP prediction (36). The presence of AG-peptides in the β-GlcY-precipitated material provides indirect evidence that AG-peptides are glycosylated with AG polysaccharides because β-GlcY precipitation requires the presence of both carbohydrate and protein (39).
Edman sequencing suggested that some of the AG-peptides contained modified amino acid residues. For example, two derivatized amino acid residues were observed in cycle 4 of fractions A9 to A11, whereas all the other cycles showed only one derivatized amino acid. The most abundant residue was the expected E (retention time (Rt) = 6.5 min) of AGP21, but a smaller peak at Rt = 10 min was also observed (data not shown). Because the other cycles contain only a single amino acid residue it is likely that there are two forms of AGP21, one of which contains a modified E in cycle 4. An unknown amino acid residue was also observed in cycle 2 of A3 to A7 as well as cycle 3 of A8.
MALDI-TOF Analysis of AG-peptides: To determine the cleavage sites for the N-terminal ER signal and the C-terminal GPI-anchor signal we used MALDI-TOF MS. External standards were used initially (sample A) to eliminate the possibility of signal suppression that can be observed with internal standards (35). All fractions showed at least one molecular ion, and many fractions showed two or more dominant ions (Fig. 2 and Supplemental Materials Table V). The MALDI-TOF data supported the Edman sequencing data that individual AGPs eluted in multiple fractions, e.g. AGP21 (expected monoisotopic mass of 1218.5853 Da) eluted in fractions A8 (Fig. 2A and Supplemental Materials Table V) and A9 (Fig. 2B and Supplemental Materials Table V). Ions with mass increases representing sodium (21.9819) and potassium (37.9559) adducts, respectively, were occasionally observed (e.g. A12, Fig. 2C), suggesting that salts were not completely removed in all samples (38).
Δmass of 120 ppm). One reason for the low accuracy in some fractions is the use of external standards for the calibrations (35). To improve the accuracy, the MALDI-TOF analysis was repeated using internal calibrations. The remaining aliquot of deglycosylated total AGPs was separated by RP-HPLC and a similar elution profile was obtained (Fig. 1C). The fractions from the second sample were numbered B1 to B12. There were minor differences in the elution profile because the column temperature was ~7 °C higher in the second run in an attempt to improve separation (see "Experimental Procedures"). Each fraction from sample B was analyzed by MALDI-TOF analysis (Supplemental Materials Table VI). Internal standards improved the accuracy with the largest Δmass being 32 ppm for AGP16 in B12 (Fig. 2D).
Tandem Mass Spectrometry Confirms That AG-peptides Elute in Several Fractions: Not all of the ions observed by MALDI could be assigned to a known AG-peptide; therefore we used MS/MS sequencing in an attempt to identify the unknown peptides and confirm the assignment of AG-peptides based on MALDI data. Using MS/MS we identified AGP12 (Fig. 3A), a methylated form of AGP13 (Fig. 3B), AGP14 (Fig. 3C), AGP15.1 (Fig. 3D), AGP16 (Fig. 3E), AGP21 in fractions B8b (Fig. 3F) and B10 (Fig. 3G), and a methylated form of AGP21 (Fig. 3H). Methylation of deglycosylated AGPs has been reported and is believed to be an artifact of HF deglycosylation in the presence of methanol (42). Sodium adducts of AGP15.1, AGP16, and AGP21 were also confirmed by MS/MS (data not shown). The presence of y ions consistent with the presence of ethanolamine at the C terminus of the AG-peptides confirmed that the AG-peptides are processed in vivo for the addition of a GPI-anchor.
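The mass accuracies quoted for the MALDI-TOF calibrations (120 ppm with external and 32 ppm with internal standards) follow the usual parts-per-million definition, which is easy to make explicit. The sketch below uses hypothetical function names; the only value taken from the text is the expected mass of 1218.5853 for AGP21.

```python
def ppm_error(observed_mz: float, expected_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - expected_mz) / expected_mz * 1e6


def tolerance_da(expected_mz: float, ppm: float) -> float:
    """Absolute mass window (Da) that corresponds to a ppm tolerance."""
    return expected_mz * ppm / 1e6


# How wide are 32 ppm and 120 ppm windows around the AGP21 ion?
expected = 1218.5853
for ppm in (32, 120):
    print(ppm, "ppm ->", round(tolerance_da(expected, ppm), 4), "Da")
```

At this mass range the difference between the two calibration modes is roughly 0.04 Da versus 0.15 Da, which matters when candidate assignments differ by small modification masses.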
| DISCUSSION |
|---|
AtAGP15 was identified in two forms, one with the predicted N terminus of Gln23 (AtAGP15.1) and the other with an N terminus of Ser24 (AtAGP15.2) (Table II and Supplemental Materials Fig. 4). It is not known whether there are two different cleavage sites for the ER signal peptide, or whether AtAGP15.1 and AtAGP15.2 are both cleaved between Ala22 and Gln23 and in some cases Gln23 is removed by a different protease to give AtAGP15.2. If proteolytic trimming of the AG-peptides is occurring, then the peptide backbone that we have assumed is encoded by AtAGP13, VEAOAOSOTS-et, could in fact result from proteolytic processing of AtAGP21.
Additional proteolysis of chimeric (non-classical) AGPs has been experimentally observed in PcAGP2 from pear (33) and for the 120-kDa protein from ornamental tobacco (44). For the chimeric AGPs it is not known whether proteolysis occurs in vivo or during extraction of the proteins.
Experimental Verification of GPI Anchoring of AG-peptides: Because it is technically more challenging to identify the C terminus of proteins than the N terminus, only two GPI-anchor cleavage sites have been experimentally determined for plants (19). The data presented here show that the eight AG-peptides identified are GPI-anchored based on MS ions that match the mass of C-terminal processed AG-peptides that include an ethanolamine at the C terminus. The ethanolamine is the only remnant of GPI-anchor that remains after chemical (HF) deglycosylation.
Experimental proof for the GPI anchoring of AGP16 is noteworthy. When AtAGP16 was first discovered, it was unclear if it was GPI-anchored because it contained an unusually long C-terminal hydrophobic domain (9). A long GPI signal sequence has been shown to function in animal cell culture systems (45). Because this was only one example, it was difficult to know how general a phenomenon this was. We show here that AtAGP16 is processed in vivo for the addition of a GPI-anchor based on the observation of ions in MALDI spectra of m/z 1113 in fractions A12 (Fig. 2C) and B12 (Fig. 2D). The MALDI-TOF result was subsequently confirmed by MS/MS in both fractions (Fig. 3E and Supplemental Materials Fig. 4). The MALDI-TOF data show the presence of sodium and potassium adducts for AtAGP16 at 1135.4069 (Fig. 2, C and D) and 1151.3889 (Fig. 2C), respectively, and the sodium adduct was confirmed by MS/MS (data not shown).
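The consistency of the AtAGP16 adduct ions with the m/z 1113 parent ion can be checked by subtracting the adduct mass increases given under "Results" (21.9819 for sodium, 37.9559 for potassium). The following is a minimal sketch with a hypothetical helper name; it uses only the values quoted in the text.

```python
# Mass increases relative to the protonated ion, as quoted in the text.
ADDUCT_SHIFT = {"Na": 21.9819, "K": 37.9559}


def implied_protonated_mz(adduct_mz: float, adduct: str) -> float:
    """m/z of the protonated ion implied by an observed adduct ion."""
    return adduct_mz - ADDUCT_SHIFT[adduct]


from_na = implied_protonated_mz(1135.4069, "Na")
from_k = implied_protonated_mz(1151.3889, "K")

# Agreement between the two independent estimates, in ppm.
spread_ppm = abs(from_na - from_k) / from_na * 1e6

print(round(from_na, 4), round(from_k, 4), round(spread_ppm, 1))
```

Both adducts point back to the same nominal m/z of 1113, and the two estimates agree to within a few ppm, which supports assigning all three ions to one peptide.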
The data confirm that the permissible amino acid residues around the cleavage site residue (ω) are similar between plants and animals (37, 46). The cleavage site region, containing residues ω−1, ω, ω+1, and ω+2, is particularly important for recognition by the transamidase complex. The transamidase complex cleaves the GPI-anchor signal and adds the preformed GPI-anchor at the ω residue. The cleavage site regions of the AG-peptide propeptides are TS↓DA (AtAGP12, AtAGP13, AtAGP14, and AtAGP21), TS↓DG (AtAGP16 and AtAGP22), and AS↓SS (AtAGP24), where the arrow symbol (↓) represents the cleavage site. AtAGP15 is present with two different C-terminal cleavage sites, TS↓GS (AGP15.1) and GS↓SA (AGP15.2). The presence of ethanolamine at the C terminus of both forms of AtAGP15 confirms that they are the result of alternate cleavage sites rather than subsequent processing (Fig. 3D and Supplemental Materials Fig. 4). The cleavage site observed for AGP15.1 is listed as the most likely site for the big-Π Plant Predictor (31), and the site used for AGP15.2 is listed as an alternate site. The two isoforms of AGP15 also differ at their N termini. One possibility is that the sequence near the N terminus of AG-peptides may influence the choice of cleavage site by the transamidase complex. This possibility is consistent with the information suggesting that residues in the mature protein up to ω−11 are important for GPI-anchor addition (31, 47).
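The cleavage site regions listed above can be tabulated by position to show which residues are tolerated at each site. In this sketch the four-residue regions are written with a vertical bar marking the cleavage point (two residues before it, two after), which is a notation chosen here for convenience; the function names are illustrative.

```python
from collections import Counter

# Cleavage site regions as listed in the text; '|' marks the cleavage
# point, so the residue immediately before it is the omega residue.
SITES = {
    "AtAGP12": "TS|DA", "AtAGP13": "TS|DA", "AtAGP14": "TS|DA",
    "AtAGP21": "TS|DA", "AtAGP16": "TS|DG", "AtAGP22": "TS|DG",
    "AtAGP24": "AS|SS", "AtAGP15.1": "TS|GS", "AtAGP15.2": "GS|SA",
}
POSITIONS = ("omega-1", "omega", "omega+1", "omega+2")


def split_site(site: str) -> dict[str, str]:
    """Map a region such as 'TS|DA' onto the four named positions."""
    before, after = site.split("|")
    if len(before) != 2 or len(after) != 2:
        raise ValueError(f"expected two residues on each side: {site!r}")
    return dict(zip(POSITIONS, before + after))


def residues_by_position(sites: dict[str, str]) -> dict[str, Counter]:
    """Count the residues observed at each position across all peptides."""
    table = {position: Counter() for position in POSITIONS}
    for site in sites.values():
        for position, residue in split_site(site).items():
            table[position][residue] += 1
    return table


for position, counts in residues_by_position(SITES).items():
    print(position, dict(counts))
```

The tally makes the pattern explicit: every confirmed site uses Ser at the ω position, whereas the flanking positions accept a small set of residues with short side chains.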
The only other plant proteins that have had their cleavage sites determined experimentally have different cleavage site regions in their propeptides to the six confirmed regions in the Arabidopsis AG-peptides. NaAGP1 from N. alata has (P/O)N↓AA (where O = Hyp) and PcAGP1 from P. communis has (P/O)S↓GT (19). These sites are also consistent with the animal data and big-Π Plant Predictor.
Prediction of GPI-anchored Proteins: The experimental determination of the cleavage site for GPI-anchor addition for the eight AG-peptides from Arabidopsis provides information for further refining GPI-anchor prediction tools. Among the 12 AGPs analyzed in this work, the big-Π Predictor recognizes both the capacity for GPI lipid anchoring and the correct ω-site (as best or second best position) for 10 sequences. Even in the case of the overly long C-terminal signals, AtAGP16 and AtAGP20, the correct ω-site is found by big-Π. It is only the built-in length constraint for permissive C termini that leads to rejection of AtAGP16 and AtAGP20 as possible transamidase targets. For AtAGP12, the sites Ala33↓Pro34-Ser35 and Ser38↓Asp39-Ala40 are predicted by big-Π Plant Predictor as the best and second best sites, respectively. Their relative ranking is determined by a volume constraint for the hydrophobic tail (term 20 big-Π), a term that must be reconsidered during a revision of the big-Π score function. The first site, Ala33↓Pro34-Ser35, is unlikely to be used in vivo because the remaining peptide may be too short for further processing by the transamidase, given that cleavage of the ER signal gives an N terminus starting at Gln28. The current version of big-Π Plant Predictor does not look for and remove the N-terminal ER targeting signal prior to GPI-anchor prediction. This was done intentionally because gene annotation from genomic sequences is not robust and therefore GPI-anchored proteins might be missed if they were incorrectly annotated (31).
As mentioned earlier, AtAGP16 has an unusually long hydrophobic C terminus and is not predicted to be GPI-anchored by big-Π Plant Predictor. Another AG-peptide, AtAGP20, contains a very similar C terminus to AtAGP16 and it is unclear whether AtAGP20 is GPI-anchored. We did not find conclusive proof that AtAGP20 is GPI-anchored; however, it is likely that it will be because it has the same cleavage site region as AGP16. It only differs by one amino acid in the mature protein and two in the putative signal sequence downstream of the cleavage site. The predicted GPI-signal sequence of AtAGP20 contains one additional Ser, eight residues from the C terminus, and has a Ser to Thr substitution three residues from the end of the C terminus. These differences will not significantly affect the hydrophobicity of the putative signal sequence and therefore are not likely to have a major effect. Recent data using an in vitro system showed that changing a charged residue (Glu) to a hydrophobic residue (Leu) was sufficient to change a GPI-anchor signal sequence to a classical transmembrane protein domain (48). This same report also provided strong evidence that the entire GPI-anchor signal is translocated across the ER membrane, prior to processing by the transamidase complex (48).
There are at least four distinct proteins in the transamidase complex for processing GPI-anchors and each of these is apparently encoded by a single copy gene in Arabidopsis (1, 49). Our results showing that AtAGP16 is GPI-anchored suggest that parameters such as the distribution and/or position of the hydrophobic region with respect to the cleavage site are more important than overall length of the signal for accurate prediction of GPI anchoring. The three-dimensional structure of the ω−1 to ω−11 region is also relevant, with extended structures such as a polyproline II conformation, as found in AGPs (18, 50, 51), necessary for transamidase binding (31). Alternatively, the unusually long C-terminal stretch of AGP16 could be preprocessed prior to presentation to the transamidase complex. If big-Π Plant Predictor is given a truncated form of AtAGP16 propeptide (missing six residues at the C terminus), then GPI anchoring is predicted. Site-directed mutagenesis of AtAGP16 and the other AG-peptides could provide a useful system for studying the specificity of ER translocation and transamidase specificity in plants.
Hydroxylation of Pro Residues: In this paper we confirm that all of the Pro residues in the protein backbones can be, and usually are, modified to Hyp. The observation that AtAGP24 exists as a fully hydroxylated form (m/z 1813, Supplemental Materials Table V and Fig. 4) establishes for the first time that the Pro residues in the GP motif can be hydroxylated in vivo. GP motifs are not generally thought to be hydroxylated, as shown by the recent hydroxylation/glycosylation predictions for LeAGP1 from tomato (12). Underhydroxylated forms of AtAGP24 were observed in the MALDI-TOF data; however, the position of the unmodified Pro (rather than Hyp) is not known (Supplemental Materials Tables V and VI). It is possible that the efficiency of hydroxylation of GP is reduced compared with AP. This may be one of the reasons why AtAGP24 was the only AG-peptide that was underhydroxylated.
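Underhydroxylated forms are recognized in a mass spectrum as ions that fall short of the fully hydroxylated mass by multiples of one oxygen (15.9949 Da). The sketch below lists the expected series for a singly charged ion. The function name is hypothetical, the nominal m/z of 1813 is the value quoted above for AtAGP24, and the number of Pro residues passed in is an illustrative parameter rather than a value taken from the sequence.

```python
HYDROXYL_O = 15.994915  # monoisotopic mass of one oxygen (Pro -> Hyp)


def hydroxylation_series(fully_hydroxylated_mz: float, n_pro: int) -> list[float]:
    """Expected m/z values when 0, 1, ..., n_pro Pro residues lack the hydroxyl.

    Assumes singly charged ions, so each missing oxygen lowers m/z by
    the full oxygen mass.
    """
    return [fully_hydroxylated_mz - k * HYDROXYL_O for k in range(n_pro + 1)]


# Nominal fully hydroxylated ion from the text; n_pro=3 is illustrative.
for missing, mz in enumerate(hydroxylation_series(1813.0, n_pro=3)):
    print(missing, round(mz, 4))
```

The series identifies how many Pro residues remain unmodified but, as noted above, not which ones; locating them requires fragment-ion data.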
Variable hydroxylation of Pro residues has been observed for some of the AGPs. In NaAGP1, the Pro residue closest to the GPI-anchor cleavage site is not always hydroxylated (19). The most likely reason for variable hydroxylation is the type of residue adjacent to the Pro residues. Experiments using fusion proteins in tobacco bright yellow (BY2) suspension cultured cells show that TP and VP are hydroxylated less frequently than AP (52).
Complexity of AG-peptides in Vivo: The size and complexity of the AG polysaccharides attached to each Hyp in the Arabidopsis AG-peptides are currently not known. In the current study our focus was on the peptide backbone and all the glycosylation was removed by anhydrous HF to facilitate purification of the peptides. Predictions about the size of AG polysaccharides in Arabidopsis can be made based on analyses of AGP and AG-peptides in other plants. The length of AG polysaccharides reported on other AGPs is diverse, ranging from 30 sugars/Hyp in gum arabic glycoprotein up to 150 sugars/Hyp in radish leaf, reviewed in Ref. 39.
Native AG-peptides were first identified in wheat endosperm (53). Protein backbone sequence of the wheat endosperm AG-peptide identified the corresponding gene as a grain softness protein, GSP-1 (42). The purified backbone of GSP-1 is apparently processed to give YAEVOSOAAQAOTAD, an AG-peptide with a backbone of 1534 Da, containing three Hyp residues. The wheat AG-peptide is expected to have an AG polysaccharide of up to 32 residues in length attached to each Hyp based on the relative contribution of carbohydrate (91%) and protein (9%), and the observed ratio of Gal to Ara residues of 1.5.
More recently Tan et al. (18) showed that a recombinant AGP, (Ala-Hyp)51, expressed in tobacco contains AG polysaccharides ranging from 13 to 36 sugar residues/Hyp, with a median size of 16. The estimated molar ratio of the sugars of (Ala-Hyp)51 was Galp:Araf:GlcUA:Rhap at 10:2.9:2.0:0.5.
There is likely to be considerable diversity in the glycan structures in planta because of different GTs. The specific modifications of glycans can be dependent on either protein backbone sequence or the glycosylation status of neighboring amino acid residues, providing an increasing complexity of glycosylation. For example, in animal mucins the α-N-acetylgalactosaminyl transferases have different specificities based on protein context and prior glycosylation status (54, 55).
Peptide differences have been shown to affect glycosylation in plants. Different AG polysaccharides are attached to AP-containing fusion proteins compared with TP or VP fusion proteins in tobacco suspension-cultured cells (52). AP- and TP-containing synthetic AGPs have ~40 mol % Gal and 30 mol % Ara, whereas VP-containing AGPs have more Ara with ~40 mol % of both Ara and Gal. The amounts of the other sugars, e.g. glucuronic acid (GlcUA), also differ in the three fusion proteins at 15, 23, and 13 mol %, respectively (52). This suggests that the Arabidopsis AG-peptides may have different polysaccharides even if they are expressed in the same cell types.
Additional heterogeneity in the carbohydrate moieties of AGPs is likely to exist because of tissue-specific differences in the expression of GTs. In radish, some AGPs are fucosylated. A tissue-specific fucosyltransferase has been found that adds fucosyl residues to an AGP-like synthetic acceptor (56). GT specificity has also been inferred from studies of carbohydrate-deficient mutants in Arabidopsis, such as mur1 mutants, which lack an enzyme for the synthesis of GDP-L-fucose. The AGPs isolated from roots of mur1 mutants have a different carbohydrate composition and appear to have larger polysaccharide chains (57).
| CONCLUSION |
|---|
Carbohydrate specificity is important for function of animal proteoglycans. For example, only heparan sulfates from fibroblasts bind to basic fibroblast growth factor thereby facilitating their uptake into cells by receptors (62), whereas others, such as the GPI-anchored Dally, bind to and regulate the extracellular movement of signaling peptides (63). The small size of the GPI-anchored AG-peptides makes them excellent targets for studying the importance of post-translational modifications for the function of AGPs in planta.
| FOOTNOTES |
|---|
The on-line version of this article (available at http://www.jbc.org) contains supplemental Tables IV, V, and VI and Fig. 4.
|| Supported by an Australian Research Council and Grains Research and Development Corporation grant to the Australian Centre for Plant Functional Genomics.
To whom correspondence should be addressed. Tel.: 61-8-8303-7342; Fax: 61-8-8303-7109; E-mail: carolyn.schultz{at}adelaide.edu.au.
1 The abbreviations used are: AGP, arabinogalactan-protein; AG, arabinogalactan; ER, endoplasmic reticulum; FLAs, fasciclin-like AGPs; β-GlcY, β-glucosyl Yariv reagent; GPI, glycosylphosphatidylinositol; GT, glycosyltransferase; HF, hydrogen fluoride; MALDI-TOF, matrix-assisted laser desorption ionization time-of-flight; MS, mass spectrometry; RP-HPLC, reversed phase-high performance liquid chromatography; Rt, retention time; Hyp, hydroxyproline.
| ACKNOWLEDGMENTS |
|---|
| REFERENCES |
|---|