J. Biol. Chem., Vol. 283, Issue 25, 17298-17313, June 20, 2008
Regulation of Glycan Structures in Animal Tissues: TRANSCRIPT PROFILING OF GLYCAN-RELATED GENES*
| ABSTRACT |
|---|
|
|
|---|
| INTRODUCTION |
|---|
|
|
|---|
A major goal in the field of "glycobiology" is an understanding of how glycan structures are regulated in abundance and the impact that these changes have on the physiology and pathology of an organism. Several difficulties arise when attempting to examine the regulation of glycan structures in complex biological systems. Because glycan biosynthesis is a post-translational modification, it is not directly template-driven like the synthesis of polypeptide structures from genome-derived transcripts. Thus, numerous factors can impact the efficiency and penetrance of individual glycosylation steps on protein and lipid acceptors, including enzyme accessibility to glycan modification sites, the abundance of the respective protein or lipid acceptors, availability of sugar-nucleotide precursors, and relative enzyme levels or relative localization of biosynthetic enzymes that can compete for the same glycan substrates. Despite these complexities in glycan biosynthesis, several lines of evidence indicate that one of the major modes of regulating cellular glycosylation is transcriptional regulation of the enzymes involved in glycan synthesis and catabolism (14). One method for testing whether the elaboration of glycan structures is controlled at the transcriptional level is by the comparison of glycan structural data with transcript abundance measurements in multiple biological samples, where differences in glycan structures are known to occur. The last decade has seen significant advancements in methods for glycan structural analysis providing increased breadth, depth, and sensitivity to the glycan structures detected and quantitated within a single experiment (15–17). Although these analyses have revealed critical changes in glycan structures during development or between biological samples, rarely have they been paired with broad-based transcript analysis to determine whether transcriptional regulation is the major mechanism driving the structural alterations (14, 18–24).
Transcript profiling of glycan-related genes has its own set of complexities. The enzymes involved in glycan synthesis and modification have been collated into multigene families based on sequence and structural similarities. In mammalian cells, glycosyltransferases number ~200 members and are subdivided into ~40 families (CAZy database (25, 26)), but in many cases the acceptor specificity of individual family members is not known, or potential enzymatic redundancy may exist between multiple members of the same enzyme family. Thus, one-to-one mapping of individual gene products to steps in glycan biosynthetic pathways is difficult to achieve or may have ambiguity among multiple family members. Existing web-based resources (CAZy (25, 26), KEGG (27–29), Consortium for Functional Glycomics (14, 18, 21), and SOURCE (30)) have collated and annotated many of the genes related to glycan biosynthesis, but comprehensive resources for mapping enzymes to complex glycan biosynthetic pathways for glycoprotein, glycolipid, and proteoglycan biosynthesis and catabolism are still in their early stages.
An additional complexity for the study of glycan-related gene expression is the relatively low abundance of transcripts encoding many of the critical enzymes involved in glycan modifications. These low transcript levels make it difficult to employ broad-based survey methods, such as microarray approaches (14, 18–22, 24), for global transcriptome analyses. More focused approaches based on quantitative real time PCR (qRT-PCR) have been employed effectively (23, 31, 32), but this strategy has generally been restricted to the analysis of a relatively small number of target genes.
We have chosen to develop a broad-based analytical platform for transcript analysis of glycan-related genes that has three key components. First, we have drawn on numerous publicly available resources and the primary literature to generate a comprehensive gene list encoding enzymes and proteins involved in glycobiology, including sugar-nucleotide biosynthesis, transporters, glycan extension, modification, recognition, catabolism, and numerous glycosylated core proteins (>700 genes in the mouse). Second, we have developed a robust, sensitive, and flexible qRT-PCR platform for transcript analysis using experimentally validated primer sets for all of the members of the mouse gene list. This strategy has allowed us to examine global changes in glycan-related transcripts for both highly expressed genes as well as low abundance transcripts that may play key roles in generating important glycan epitopes. Third, we have developed a set of detailed pathway diagrams for glycan biosynthesis and modification and initiated the mapping of all members of the gene list to their respective biochemical pathway steps. Initial use of these pathway diagrams has allowed the visual depiction of transcript abundance within a framework of glycan biosynthetic pathways as a means of correlating glycan structural data with transcript abundance.
As an experimental framework for examining the regulation of glycan structures in mammalian systems, we have analyzed RNA samples derived from several adult mouse tissues and compared them with microarray data from parallel tissue samples and glycan structural data previously obtained by MALDI-MS approaches (14, 16, 17). Greater sensitivity and dynamic range were found for our qRT-PCR approach compared with focused microarrays, particularly for the numerous low abundance glycan-related enzymes. Comparison with glycan structural data demonstrated numerous correlations between glycan structures and transcript abundance for their respective biosynthetic enzymes. Several cases were also noted where differences in glycan structures did not correlate with transcript abundance, suggesting that regulation may occur at a post-transcriptional level. The analysis of glycan-related transcripts within the context of biosynthetic pathways also predicted differences in low abundance glycans consistent with previous observations of these structures in the literature.
| EXPERIMENTAL PROCEDURES |
|---|
|
|
|---|
|
Primer Validation—Amplification reactions consisted of 5 µl of diluted mouse genomic DNA (a kind gift from Dr. Nancy Manley, Department of Genetics, University of Georgia) as template, 5 µl of primer pair mix (500 nM each primer, 125 nM final concentration) (MWG Biotec), and 10 µl of iQ™ SYBR® Green Supermix (Bio-Rad). Amplifications were performed in a 96-well iCycler or myIQ real time detection system (Bio-Rad) with the following cycling conditions: 95 °C for 3 min, followed by 40 cycles of 95 °C for 10 s (denaturing), 65 °C for 45 s (annealing), and 78 °C for 20 s (data collection), followed by a melt curve program (95 °C for 1 min and 55 °C for 1 min and then increasing temperature of 0.5 °C per cycle for 80 cycles of 10 s each). We found that collecting fluorescence data at a temperature above the annealing temperature yielded cleaner amplification profiles, because any primer dimers that formed would be dissociated at this temperature, as has been described previously (39). Primer pairs were tested at a single DNA concentration in triplicate, and the average of the cycle threshold (Ct) values was compared with that of a housekeeping gene (Rpl4). Primer pairs that yielded an average Ct within 2 units of the average Ct for the control gene were tested for efficiency, and those outside the 2 Ct window were re-designed (supplemental Fig. 1A). A typical amplification curve from a genomic DNA dilution series is shown in supplemental Fig. 1B. The efficiency of amplification for each primer pair was determined in duplicate using serial dilutions of mouse genomic DNA as the template by the method of Liu and Saint (40). The Standard Curve Method (41) was applied to the analysis of data from each primer set to generate plots of Ct versus log concentration of template, and the slope was used to determine amplification efficiency, where efficiency $E = 10^{-1/\text{slope}} - 1$ (supplemental Fig. 1C).
For validation purposes, we selected an acceptable range of 100 ± 10% efficiency with genomic DNA as template (shown as dotted lines in supplemental Fig. 1C). Following the amplification and melt curve analysis, data were set to a common threshold, and the efficiency of the primer pair was determined from the slope of the standard curve using software supplied with the qRT-PCR instrumentation (Bio-Rad). Melt curves were analyzed for the presence of a single peak of -d(RFU)/dT at 80–86 °C (where RFU is relative fluorescence units), indicating a single amplification product (supplemental Fig. 1D). An example of a melt curve analysis where a primer pair amplified more than one product is shown in supplemental Fig. 1E. Primers that failed any of the validation steps were redesigned and reanalyzed until a suitable primer pair was obtained. Our success rate for primer design in our first attempt was ~90%.
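The efficiency calculation described above can be sketched in a few lines of Python. This is only an illustration of the standard-curve arithmetic, not the Bio-Rad software the authors used; the function names, the dilution amounts, and the Ct values are hypothetical, and a plain least-squares fit is assumed.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def amplification_efficiency(template_amounts, ct_values):
    """E = 10^(-1/slope) - 1, with slope taken from Ct versus log10(template)."""
    log_amounts = [math.log10(a) for a in template_amounts]
    slope, _ = fit_line(log_amounts, ct_values)
    return 10 ** (-1.0 / slope) - 1.0

def passes_validation(efficiency, tolerance=0.10):
    """Accept primer pairs within 100 +/- 10% efficiency (E between 0.9 and 1.1)."""
    return abs(efficiency - 1.0) <= tolerance

# Hypothetical 10-fold gDNA dilution series (arbitrary units) and made-up Ct values.
amounts = [100.0, 10.0, 1.0, 0.1]
good_cts = [20.0, 23.32, 26.64, 29.96]   # slope of about -3.32 per decade
poor_cts = [20.0, 24.0, 28.0, 32.0]      # slope of -4.0 per decade

good_eff = amplification_efficiency(amounts, good_cts)
poor_eff = amplification_efficiency(amounts, poor_cts)
print(f"good primer pair: E = {good_eff:.3f}, pass = {passes_validation(good_eff)}")
print(f"poor primer pair: E = {poor_eff:.3f}, pass = {passes_validation(poor_eff)}")
```

A slope near -3.32 cycles per 10-fold dilution corresponds to doubling each cycle (E close to 1), which is why the acceptance window is centered on 100%.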
RNA and mRNA Isolation—Kidney, liver, testis, and brain tissues were isolated from C57BL/6 mice (a kind gift from Dr. Mary Bedell, University of Georgia), flash-frozen in liquid N2, and stored at -80 °C. Total RNA was isolated using TRIzol reagent (Invitrogen) according to the manufacturer's instructions. Samples were digested with RNase-free DNase I (Ambion) to remove genomic DNA contamination and then re-extracted with TRIzol, precipitated with isopropyl alcohol, resuspended in diethyl pyrocarbonate-treated water, quantitated, and stored at -80 °C. Poly(A+) mRNA was isolated from total RNA using Dynabeads mRNA direct kit (Dynal, Invitrogen), quantitated, and stored at -80 °C.
cDNA Synthesis—cDNA was synthesized from 500 ng of poly(A+) mRNA using the SuperScript III First Strand synthesis kit (Invitrogen) according to the manufacturer's instructions except that both oligo(dT) and random primers (1:1) were included in the cDNA synthesis reactions. A control reaction lacking reverse transcriptase ("No-RT") was prepared and analyzed to detect the presence of contaminating genomic DNA. For qRT-PCR reactions, cDNA reaction products (20 µl) were diluted 1:20 in water and used as template in triplicate reactions for each primer pair assayed.
Normalization Gene Selection—qRT-PCR reactions with cDNA templates from mouse tissues were assayed using several housekeeping genes to determine the variability of expression across all tissues. The gene with the lowest variation across all tissues was selected as the normalization gene for all samples.
RT-PCR Data Analysis—Reactions for qRT-PCR were set up in a 96-well plate format using the same amplification conditions described above for primer validation using genomic DNA as template. The No-RT control cDNA template was tested with several primer pairs to confirm that the sample was free of contaminating genomic DNA prior to analysis of the reverse-transcribed template. Each primer pair was analyzed in triplicate for each cDNA sample. Following each run, the threshold was set to a common value to maintain consistency between runs, and data for each primer pair were averaged and the standard deviation determined. We chose an arbitrary cutoff of 0.5 Ct for the standard deviation (42). Triplicate values with a standard deviation >0.5 Ct were reassayed. The raw fluorescence data from the PCR machines were also analyzed using LinRegPCR (43) to determine the amplification efficiency of the individual reactions, and a cutoff of <5% was set as acceptable variability. Averaged Ct data were transformed to linear amplimer abundance values ($2^{-\mathrm{Ct}}$) and normalized to the housekeeping gene (Rpl4).
Data Analysis Method—We utilized the ΔCt method (41) to determine the relative transcript levels for the glycan-related genes in a given cDNA sample. This analysis method requires the assumption that the amplification efficiencies of all reactions are approximately equal. A test for equal efficiencies is to plot ΔCt (Ct_gene - Ct_control) versus template concentration for a dilution series and ensure that the slope of the generated line is <0.1. Modeling conditions within these restrictions translate into an acceptable difference in amplification efficiency of <5%. Primer efficiency values from all samples tested were below the 5% cutoff (generally <3%). The normalization gene, Rpl4, was included on all 96-well plates to control for inter-plate and machine variations.
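The triplicate quality check and the normalization to Rpl4 can be illustrated as follows. This is a minimal sketch under stated assumptions: the Ct values are invented, the function names are hypothetical, and the sample standard deviation is assumed because the text does not say which estimator was used.

```python
from statistics import mean, stdev

SD_CUTOFF = 0.5  # Ct units; triplicates with a larger SD were reassayed

def summarize_triplicate(ct_values):
    """Return (mean Ct, sample SD, needs_reassay) for one primer pair."""
    average = mean(ct_values)
    spread = stdev(ct_values)
    return average, spread, spread > SD_CUTOFF

def relative_abundance(ct_gene, ct_control):
    """Relative transcript level 2^-(Ct_gene - Ct_control), assuming ~100% efficiency."""
    delta_ct = ct_gene - ct_control
    return 2.0 ** (-delta_ct)

# Made-up triplicate Ct values for a target gene and for Rpl4 on the same plate.
gene_cts = [25.1, 25.0, 24.9]
rpl4_cts = [20.0, 20.1, 19.9]
noisy_cts = [24.0, 25.0, 26.0]   # SD of 1.0 Ct, above the cutoff

gene_mean, gene_sd, gene_reassay = summarize_triplicate(gene_cts)
rpl4_mean, rpl4_sd, rpl4_reassay = summarize_triplicate(rpl4_cts)
_, _, noisy_reassay = summarize_triplicate(noisy_cts)

abundance = relative_abundance(gene_mean, rpl4_mean)
print(f"delta Ct = {gene_mean - rpl4_mean:.2f}, relative abundance = {abundance:.5f}")
```

Because each cycle represents a doubling, a target that crosses threshold five cycles after Rpl4 is reported at roughly 1/32 of the Rpl4 level.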
Microarray Analysis—Matched samples for mouse kidney and liver were analyzed on the GLYCOv2 custom microarray chip and by qRT-PCR analysis. The GLYCOv2 gene chip was produced by Affymetrix (Affymetrix, Santa Clara, CA) for the Consortium for Functional Glycomics. Samples were labeled and hybridized to the GLYCOv2 chips in triplicate as described previously (14). The GeneChip® operating software (GCOS, Affymetrix, Santa Clara, CA) was used to determine whether genes were called "present" or "absent." Genes were classified as present if at least two of the three calls were "present." Robust multichip average (RMA) data, a normalized intensity value based on the amount of probe that hybridizes to the array, was used to produce signals for each gene (44), and these values were averaged and used for comparison with data generated by qRT-PCR.
Comparison of Microarray and qRT-PCR Data—Genes from the microarray analysis were grouped into present and absent categories. The average RMA values for each gene were plotted against the normalized linear amplimer abundance value for each gene from the qRT-PCR analysis with both data sets plotted on a log10 scale. The correlation coefficient was determined for genes that were detected as present on both the microarray and qRT-PCR analysis. The lower limit of detection for qRT-PCR was set at Ct = 35.
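The comparison logic (majority present call, Ct detection limit, and correlation on a log10 scale) can be sketched as below. The record layout, the "P"/"A" call codes, and all numbers are hypothetical; the sketch also assumes the array and qRT-PCR values are positive linear quantities before the log10 transform.

```python
import math

CT_DETECTION_LIMIT = 35.0  # qRT-PCR lower limit of detection stated in the text

def called_present(calls):
    """Majority rule: present if at least two of the three replicate calls are 'P'."""
    return sum(1 for call in calls if call == "P") >= 2

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def r_squared_detected(genes):
    """R^2 on log10 values for genes detected by both methods; also returns the count."""
    detected = [g for g in genes
                if called_present(g["calls"]) and g["ct"] <= CT_DETECTION_LIMIT]
    log_array = [math.log10(g["rma"]) for g in detected]
    log_qpcr = [math.log10(g["abundance"]) for g in detected]
    r = pearson_r(log_array, log_qpcr)
    return r * r, len(detected)

# Invented records: replicate calls, averaged array signal, mean Ct, normalized abundance.
genes = [
    {"calls": ["P", "P", "P"], "rma": 1000.0, "ct": 22.0, "abundance": 0.1},
    {"calls": ["P", "A", "P"], "rma": 100.0, "ct": 25.3, "abundance": 0.01},
    {"calls": ["P", "P", "A"], "rma": 10.0, "ct": 28.6, "abundance": 0.001},
    {"calls": ["A", "A", "P"], "rma": 12.0, "ct": 30.0, "abundance": 0.0005},  # absent on array
    {"calls": ["P", "P", "P"], "rma": 15.0, "ct": 36.5, "abundance": 0.00001},  # beyond Ct limit
]
r2, n_detected = r_squared_detected(genes)
print(f"{n_detected} genes detected by both methods, R^2 = {r2:.2f}")
```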
Processing of Mass Spectrometry Data to Calculate Semiquantitative Relative Glycan Abundance—MALDI-MS data for N-glycan structures identified in samples from four adult mouse tissues were obtained from the Consortium for Functional Glycomics web site based on previously published methods (14) employing release by peptide:N-glycanase F digestion followed by permethylation. Raw spectral data posted for each mouse tissue were initially analyzed using Voyager software (Applied Biosystems, Foster City, CA) to generate lists of mean-centered ion clusters for each glycan mass followed by integration of the corresponding ion currents. In-house software was used to identify glycosyl compositions corresponding to each mean-centered ion cluster in the MS data, and the most likely glycan structure corresponding to each composition was assigned manually. Spectral features having an ion current integral of less than 1.5% of the total integrated ion current for all assigned glycan structures in the given tissue data set were not considered further. Ion current integrals for a given glycan structure were normalized and expressed as a percentage of the total ion current for the given data set.
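The thresholding and normalization of ion current integrals might look like the sketch below. The glycan labels and integral values are invented, and one reading of the text is assumed: features under 1.5% of the total assigned ion current are dropped first, and the remaining integrals are then expressed as percentages of the retained total. The authors' in-house software may have normalized differently.

```python
def relative_glycan_abundance(integrals, min_fraction=0.015):
    """Drop minor features, then express each remaining integral as a percentage."""
    total = sum(integrals.values())
    kept = {name: value for name, value in integrals.items()
            if value / total >= min_fraction}
    kept_total = sum(kept.values())
    return {name: 100.0 * value / kept_total for name, value in kept.items()}

# Invented ion current integrals for one tissue data set (arbitrary units).
integrals = {
    "Man5GlcNAc2": 590.0,
    "Man6GlcNAc2": 400.0,
    "trace feature": 10.0,   # 1% of the total, below the 1.5% cutoff
}
percentages = relative_glycan_abundance(integrals)
for name, pct in percentages.items():
    print(f"{name}: {pct:.1f}%")
```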
| RESULTS |
|---|
|
|
|---|
Development of a qRT-PCR Platform for Glycobiology-related Transcript Analysis—Although numerous strategies are available for transcript quantitation in biological systems (e.g. SAGE (45–47), microarray techniques (48–50), or variations of these approaches (51, 52)), we chose to employ a qRT-PCR-based platform because of its extreme sensitivity and wide dynamic range (42) and its common use in validating transcript changes initially identified by microarray and SAGE techniques (53). We chose the SYBR green intercalating dye methodology to take advantage of reduced cost and ease of use in high throughput qRT-PCR compared with other approaches that employ additional primers and/or probes (54). Several considerations have been taken into account during the development of the methodology to ensure uniformity of analysis across the hundreds of target genes on the list, including standardized primer design, amplification, and validation protocols. The details of the analytical platform are described under "Experimental Procedures," but the rationale and criteria for protocols employed are discussed below.
Validation of Amplification Efficiency and Specificity—A key component in the development of our qRT-PCR platform was the generation of primer sets that were each experimentally validated to provide uniform amplification efficiency across the entire gene list. Because cloned versions of each member of the gene list are not available for validating the amplification efficiency for a given primer set, we used genomic DNA (gDNA) as an amplification template during the extensive primer validation process described under "Experimental Procedures." We confirmed linearity of the qRT-PCR responses across the template dilution series, calculated amplification efficiencies relative to a standard housekeeping gene (Rpl4), and checked for the production of single amplimer products by melt curve analysis and for the lack of amplimer production in no template control reactions (supplemental Fig. 1). Primers that failed any of these validation steps were redesigned and reanalyzed until a suitable primer pair was obtained. Our success rate for primer validation in the first round of design was ~90%.
Primer Design Criteria—The simplicity of the SYBR green qRT-PCR strategy means that effective primer design is critical for selective and efficient template amplification. Because we decided to use gDNA as a validation template during primer design, primers were designed using the largest coding exon as template so that they would effectively amplify both the cDNA target as well as the gDNA validation template. The criteria for primer design were quite restrictive (primer length 20 ± 1 bp, Tm 65 ± 1 °C, and amplimer length 65–75 bp; see "Experimental Procedures") to allow the use of a single set of amplification conditions for the entire list of genes in a 96-well microtiter plate format qRT-PCR machine.
|
|

ΔCt method (41) by comparison to an invariant housekeeping gene that is similarly expressed across all tissues being examined (42). A panel of housekeeping genes was tested for data normalization using cDNAs derived from four adult mouse tissues (supplemental Fig. 3), and ribosomal protein L4 (Rpl4) was selected as the normalization gene because transcript levels were least variant across the four tissues. In addition, triplicate qRT-PCR analyses of Rpl4 were included in all 96-well plate analyses to control for inter-plate and machine variations.
Comparison of qRT-PCR with Microarray Analysis—In an effort to compare data obtained by our qRT-PCR platform with microarray approaches, we analyzed paired RNA samples from wild type C57BL/6 mouse liver and kidney tissues by qRT-PCR and an Affymetrix microarray platform using the GLYCOv2 gene chip from the Consortium for Functional Glycomics (18). As described under "Experimental Procedures," RMA values from the microarray data yielded normalized intensity values based on the amount of probe hybridized to the array, whereas Affymetrix GCOS data provided present and absent calls for transcripts in each sample. RMA values were compared with the relative transcript abundance data from qRT-PCR analysis (Fig. 1, both plotted on a log10 scale) for 149 glycan-related glycosyltransferase and glycosylhydrolase transcripts called as present (Fig. 1, solid circles) or absent (Fig. 1, open circles) in the microarray analyses. Correlation coefficients between the microarray and qRT-PCR data sets were fairly low (R² = 0.43 for liver and 0.26 for kidney). Half (50%) of the liver transcripts and about a third (34%) of the kidney transcripts were classified as absent in the microarray analysis (Fig. 1, open circles), consistent with the limited sensitivity of the previously published GLYCOv1 gene chip (14). In contrast, qRT-PCR analysis detected most of the transcripts from both tissue sources (92.6% for liver and 98.7% for kidney), confirming the higher sensitivity of the latter approach. Differences in the dynamic range of the two techniques were also evident, where the microarray data spanned ~2.5 orders of magnitude as compared with ~7 orders of magnitude for the qRT-PCR approach.
|
|
α-mannosidases (Man1a1, Man1a2, Man1c1), B3galt1–6, B4galt1–5), and transcript levels exhibited significant tissue-specific differences for many of the isoforms as previously indicated for these gene families (57–62). Exceptions to this pattern include the single enzyme isoform, Mgat3, adding a bisecting β1,4-GlcNAc, and Fut8, adding a core α1,6-Fuc residue. Transcript levels for both of these enzymes were >10-fold lower in liver compared with kidney, testis, and brain. Considerably greater differences in tissue-specific expression could be identified in the synthesis of the complex capping reactions that are present on the nonreducing termini of N-glycans, O-glycans, and glycolipids (Fig. 4). These capping reactions are generally catalyzed by multiple enzyme isoforms, and the majority of these enzymes exhibit wide differences in tissue-specific expression.
Correlations between qRT-PCR Analysis and N-Linked Glycan Structures—Previous studies correlated glycan structural data obtained by MALDI-MS approaches with microarray transcript data from a partial list of glycan-related transcripts (14). In an effort to make a similar comparison with glycan structural data, we compared the transcript abundance from our more extensive glycan-related gene list with the same mouse tissue N-glycan structural data (14). To make a more quantitative comparison between the two data sets, the respective N-glycan MALDI profiles were converted into relative glycan abundance (expressed as a percentage of the total peak intensity) in each tissue as described under "Experimental Procedures" (Fig. 5).
|
α1,6-Fuc levels in liver (15% of the N-glycan structures in liver compared with 45–52% of the glycan structures in the other tissues) and the inability to detect transcripts encoding Fut8 (14), the enzyme that synthesizes this linkage. Our qRT-PCR data detected Fut8 transcripts in all four tissues, but the greater sensitivity of the latter approach demonstrated an ~10-fold lower abundance of Fut8 transcripts in liver compared with the other tissues (Fig. 6, A and B). Similarly, prior studies correlated nonreducing terminal α1,3-Fuc residues on N-glycans with the expression of Fut9 (14). Glycan structures containing terminal fucose residues were highly abundant in kidney (37%) and brain (19%) but of extremely low abundance in liver and testis (<4%). By comparison, our qRT-PCR data indicated abundant Fut9 transcripts in kidney and brain, but no detectable transcripts in liver and testis (Fig. 6, C and D), reflecting a >10^4-fold difference in transcript abundance between the two pairs of tissues.
|
Although several correlations could be made between glycan structures and transcript data, there were also cases where the data did not correlate well. For example, oligomannose structures comprised 85% of the glycans in liver in contrast to 47–55% in the other three mouse tissues (Fig. 6G). Because the predominant oligomannose glycan in liver was a Man5GlcNAc2 structure (59% of oligomannose glycans), one explanation could be greatly reduced expression of N-acetylglucosaminyltransferase I (Mgat1) in liver. However, transcript abundance for Mgat1 revealed similar levels in liver, kidney, and brain (Fig. 6H). Transcript data for the Golgi α-mannosidases (GMIA (Man1a1), GMIB (Man1a2), and GMIC (Man1c1)), which produce the Man5GlcNAc2 structure, were also not significantly reduced in liver compared with the other tissues (Fig. 6I). A likely explanation for elevated oligomannose structures in liver is the extensive elaboration of the smooth endoplasmic reticulum in liver hepatocytes to provide the enzymatic machinery for drug detoxification and lipid biosynthesis (63, 64). This highly abundant, specialized endomembrane system is enriched in glycosylated enzymes that will not likely encounter the Golgi N-glycan processing machinery required for conversion of oligomannose structures into highly branched complex N-glycans. Thus, glycan abundance in this tissue likely reflects the unique organelle structure in hepatocytes rather than the glycan processing machinery in the Golgi complex of these cells.
The glycan structural data also indicated a considerably reduced level of sialylation in the kidney compared with the other tissues (14). In contrast, transcripts encoding all of the sialyltransferase genes (GT29 CAZy family) are detected at some level in kidney, with the exception of St6gal2, which was highly expressed in brain and barely detectable in kidney (Fig. 7). However, four members of the GT29 family are proposed to be the major contributors to N-glycan α2,3- or α2,6-sialylation (St3gal3, St3gal4, St3gal6, and St6gal1) (14). Of these four enzymes, only St3gal4, involved in the α2,3-sialylation of type II (Galβ1–4GlcNAc) sequences, is slightly reduced (~5-fold) in kidney (Fig. 7). These data suggest that decreased transcript levels for the sialyltransferases in kidney are not likely to be the major contributor to reduced sialylation in this tissue. Additional potential contributors to reduced sialylation in kidney include post-transcriptional control of sialyltransferase activities, increased removal of sialic acid residues by neuraminidases, or decreased CMP-sialic acid precursor pools. Transcript abundance for the four neuraminidase/sialidase genes (GH33 family, Fig. 7) indicates increased levels for the lysosomal neuraminidase, Neu1 (~6-fold higher in kidney), and the cytosolic neuraminidase, Neu2 (~8-fold higher in kidney). Transcript abundance for the membrane sialidases, Neu3 and Neu4, revealed lower (11-fold) or undetectable levels in kidney, respectively. Transcript abundance for enzymes involved in sugar nucleotide biosynthesis leading to CMP-sialic acid precursor production (Renbp, Nans, and Cmas, supplemental Fig. 4) and CMP-sialic acid transport in the Golgi complex (Slc35a1, supplemental Fig. 19) was either higher in kidney or similar in all four mouse tissues, suggesting that depleted precursor pools in kidney were unlikely to account for reduced sialylation in this tissue.
|
10-fold lower expression was detected. N-Glycan structures containing the β1,4GlcNAc-branched products of Mgat6 have not been detected in mammalian species (66), consistent with exceptionally low levels of Mgat6 and Mgat6-like transcripts in mouse kidney, liver, and brain by qRT-PCR (Fig. 3). However, we detected moderate Mgat6-like transcript levels in mouse testis consistent with the proposal that GlcNAc-terminal branched N-glycan structures play critical roles in sperm-Sertoli cell interactions in this tissue (67).
The polysialyltransferases St8sia2 (STX) and St8sia4 (PST) were both highly expressed in brain (Fig. 7) as anticipated for their roles in the extension of polysialylated N-glycan structures on NCAM (68), where they influence NCAM homotypic adhesion during axonal pathfinding and migration of neural cells (69). Surprisingly, NCAM and St8sia2 (STX) transcripts were also highly abundant in testis (Fig. 7 and supplemental Fig. 20), along with transcripts encoding the ganglioside α2,8-sialyltransferases, St8sia3 and St8sia5. A role for polysialic acid-negative NCAM in sperm-Sertoli cell interactions has been proposed (70, 71), but a role for polysialylated NCAM has not yet been demonstrated in this tissue. The presence of St8sia2 (STX) transcripts in testis suggests that polysialylation may regulate homotypic NCAM adhesion in this tissue similar to its role in modulating cell adhesion and migration of neural cells. Roles for the two ganglioside α2,8-sialyltransferases in neural tissues are also well established (72); however, blockage of complex ganglioside biosynthesis in mice did not lead to neurological abnormalities. Surprisingly, defective ganglioside biosynthesis led to male sterility and defective spermatogenesis (73), suggesting a role for the α2,8-sialyltransferases and their complex ganglioside products in testis.
The murine gene responsible for A/B histo-blood group antigen biosynthesis is a cis-AB glycosyltransferase (Abo) that can synthesize both α1,3GalNAc (blood group A) and α1,3Gal (blood group B) linkages (74). Humans synthesize A/B antigens broadly in the epithelial cells of gastrointestinal, esophageal, bronchopulmonary, oral, and urogenital tissues and bone marrow progenitors in contrast to low expression in liver, spleen, and kidney and no detectable expression in brain and muscle (57). In contrast, mice appear to express Abo exclusively in the urogenital tract and intestine based on our qRT-PCR data (Fig. 4, high in testis, base-line levels in kidney, liver, and brain) and EST abundance data (only detected in bladder, epididymis, prostate, testis, and intestine). These transcript data are consistent with the absence of detectable A/B antigens in murine blood cells and salivary secretions based on agglutination tests and immunologic detection methods (74).
Extended linear and branched polylactosamine structures containing Galβ1,4GlcNAcβ1,3-repeat units can be found appended to N-glycan, O-glycan, and glycolipid structures (Fig. 4). Both branched and linear polylactosamine structures can then act as scaffolds for creation of various blood group antigens, such as sialyl Lewis x and ABH structures. Synthesis of the linear backbone of these structures results from the concerted and sequential action of β1,3GlcNAc transferase family members (B3gnt1–8) and β1,4Gal transferase family members (B4galt1–7) to generate i-blood group antigens. Substrate specificities for recombinant B3gnt1–8 have been characterized (75), but their roles in polylactosamine extension in vivo are still unclear (76). In adult mouse tissues all of the B3gnt family members are expressed widely (Fig. 4). The B3gnt2 isoform is particularly highly expressed in all four mouse tissues examined. This latter enzyme has strong polylactosamine synthesizing activity (75), and recent studies on B3gnt2-deficient mice indicate a marked reduction in polylactosamine on N-glycans in immunological tissues and induction of a hyper-reactive immune response (76).
|
The synthesis of human Sda/CT/Cad antigens is accomplished by the transfer of a β1,4GalNAc residue to a disaccharide acceptor to form the terminal GalNAcβ1,4[Neu5Acα2,3]Galβ trisaccharide on red blood cells, body fluids, kidney, and large intestine (80). The enzyme that generates this linkage, B4galnt2, is onco-developmentally regulated, increasing activity with age, and down-regulated in colon carcinoma (81). In mice, B4galnt2 is poorly expressed in kidney, liver, and brain, whereas moderate expression is found in testis based on our qRT-PCR data (Fig. 4), and EST abundance data suggest additional elevated expression in mouse intestine.
Sulfation of N-glycan and mucin-type O-glycan structures is restricted to GalNAc-4-SO4 addition to terminal structures in glycoprotein hormone N-glycans, HNK-1 epitope antigen biosynthesis, sulfation of polylactosamine structures to form keratan sulfate, and sulfation of terminal GlcNAc residues on sialyl 6-sulfo Lewis x structures (Siaα2,3Galβ1,4(Fucα1,3(sulfo-6))GlcNAcβ1-R) (Fig. 4). Each of the linkages is generated by sulfotransferases with unique substrate specificities and expression patterns.
Glycoprotein hormone GalNAc-4-sulfation is catalyzed by the sulfotransferases Chst8 and Chst9. Although human Chst8 is restricted in expression to neuron-derived tissues (82), expression of Chst8 and Chst9 is more widely distributed in murine tissues based on Northern blots (83), EST abundance data, and our detection of transcripts in kidney and lower expression in liver and testis (Fig. 4). Transcripts encoding Chst9 were low in all four mouse tissues examined, suggesting that this enzyme is unlikely to contribute to significant synthesis of GalNAc-4-SO4 structures in these tissues.
The elaboration of HNK-1 epitopes is largely restricted to glycans associated with cell adhesion molecules such as NCAM, MAG, L1, P0, telencephalin, and others, and some glycolipids in the nervous system (57), where its expression is spatially and temporally regulated (84). Synthesis of the HNK-1 epitope is initiated by one of two glucuronosyltransferases, GlcAT-P (B3gat1) (85) or GlcAT-S (B3gat2) (86, 87), and extended by the addition of a 3-linked sulfate transferred by the HNK-1 sulfotransferase, HNK-1ST (Chst10) (88, 89). Northern blots detected neuron-specific expression of B3gat1 and B3gat2 in rat and mouse tissues (85–87), but our qRT-PCR data and EST abundance data indicate a broader expression profile in murine tissues, especially for B3gat2, which was found at appreciable levels in kidney (Fig. 4). Expression of Chst10 was predominantly in brain, with ~10-fold lower transcript levels in testis and ~1000-fold lower levels in kidney and liver, consistent with a low abundance of Chst10 ESTs in a collection of non-neural mouse tissues. Murine knockouts in Chst10 resulted in normal growth phenotypes and fertility but altered spatial learning and synaptic efficiency (90), suggesting that the role of sulfated HNK-1 carbohydrates may be more subtle in regulating animal behavior or that other sulfated carbohydrate structures may compensate for the absence of the sulfated capping structure of this glycan.
Sulfation reactions leading to the synthesis of sialyl 6-sulfo Lewis x structures are catalyzed by sulfotransferases associated with high endothelial venules, where they create ligands for interaction with L-selectin during lymphocyte homing to lymph nodes (91, 92). The sulfated structures are generated on the termini of core 1 and core 2 sialomucins by the sulfotransferases GlcNAc6ST-1 (Chst2) and GlcNAc6ST-2 (Chst4) (92). Expression of Chst4 was found by Northern blotting to be restricted to high endothelial venules and a limited number of other sites (93), consistent with the low transcript levels detected in the four mouse tissues by qRT-PCR (Fig. 4). In contrast, Chst2 is more broadly expressed in mouse tissues based on Northern blots (94), consistent with high transcript levels in brain and moderate levels in kidney, liver, and testis by qRT-PCR (Fig. 4).
Additional transcript abundance data were collected for other glycan classes (see supplemental figures), and in instances where biochemical pathways could be mapped, the data are presented with the corresponding pathway diagrams. In cases where pathway data are not relevant (lectins, sugar/sugar-nucleotide transporters, and proteoglycan core proteins), the data are presented in histogram form based on the classification of the proteins.
| DISCUSSION |
|---|
7 orders of magnitude and the ability to detect very low abundance transcripts. Direct comparison of our qRT-PCR platform with microarray analyses using the Affymetrix GLYCOv2 microarray approach (18) revealed an improved detection of low abundance transcripts (2 times more transcripts detected in liver and 1.5 times more detected in kidney). Among those transcripts that were detected in the microarray analyses, the general trends for the microarray and the qRT-PCR analyses were in the same direction, but the correlation coefficients were quite poor (R² = 0.39 for liver and 0.24 for kidney), confirming the common use of qRT-PCR to provide quantitative validation of transcript changes initially detected using microarray methods (53). The Consortium for Functional Glycomics has now generated a new version of their GLYCOv3 gene chip, and comparisons of paired analyses by qRT-PCR and the new chip may yield better correlations in the future.
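A cross-platform comparison of this kind can be sketched as follows. This is only an illustration, not the procedure used to obtain the values above: it assumes that abundances from both platforms are compared on a log10 scale and only for genes detected by both, and the function names (`r_squared`, `platform_agreement`) and input layout are hypothetical.

```python
import math


def r_squared(x, y):
    """Squared Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return (sxy * sxy) / (sxx * syy)


def platform_agreement(qpcr, microarray):
    """Compare two platforms on the genes detected by both.

    qpcr, microarray: dicts mapping gene symbol -> abundance.
    Assumption: non-positive values mean "not detected" and are skipped,
    and the comparison is made on log10-transformed abundances.
    Returns (R squared, sorted list of shared genes).
    """
    shared = sorted(
        g for g in qpcr
        if g in microarray and qpcr[g] > 0 and microarray[g] > 0
    )
    x = [math.log10(qpcr[g]) for g in shared]
    y = [math.log10(microarray[g]) for g in shared]
    return r_squared(x, y), shared
```

Restricting the comparison to shared genes matters here because one platform detects more low abundance transcripts than the other; genes seen by only one platform carry no information about agreement.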
Several lines of evidence strongly argue that our qRT-PCR data reflect the true quantitative measure of transcript abundance in the respective tissues. First, our qRT-PCR amplifications have a linear response in a dilution series of both genomic DNA templates and cDNA templates across the entire dynamic range of analysis. Second, our PCR efficiencies have been confirmed both through a template dilution series and by determining the rate of amplimer appearance using the LinRegPCR program (43). In any instance where an amplification efficiency value deviates from our restrictive criteria (100 ± 10%) during either primer validation or sample analysis, the experiments are repeated or the primers are redesigned until our efficiency criteria are met. Our restrictive primer design and validation criteria have allowed the use of uniform amplification conditions for analysis of the entire gene list and yielded validated primer sets for ~90% of the target genes in the first round of design. Most of the primer sets that failed validation in the first round of design could be effectively validated upon redesign. Finally, in each instance where we have compared our qRT-PCR data with literature data on transcript abundance based on Northern blots, the published data agree with our qRT-PCR results, except where the sensitivity of the Northern blots is too low for transcript detection.
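The dilution-series check described above can be illustrated with a minimal sketch. It assumes the standard relationship between the slope of a plot of threshold cycle (Ct) against log10 template amount and amplification efficiency, E = 10^(−1/slope) − 1, and applies the 100 ± 10% acceptance window stated in the text. The function names are hypothetical, and this sketch does not reproduce the LinRegPCR calculation.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("template amounts must not all be identical")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amplification_efficiency(template_amounts, ct_values):
    """Percent efficiency from a dilution series.

    template_amounts: relative template input for each dilution (positive).
    ct_values: threshold cycle measured at each dilution.
    Perfect doubling per cycle gives a slope of about -3.32 and 100%.
    """
    log_amounts = [math.log10(a) for a in template_amounts]
    slope, _ = fit_line(log_amounts, ct_values)
    if slope == 0:
        raise ValueError("Ct does not change with template amount")
    return (10 ** (-1.0 / slope) - 1.0) * 100.0


def passes_efficiency_criteria(efficiency_pct, target=100.0, tolerance=10.0):
    """True when efficiency lies within target +/- tolerance (percent)."""
    return abs(efficiency_pct - target) <= tolerance
```

A primer set whose dilution series yields a slope of −4, for example, corresponds to an efficiency near 78% and would fall outside the acceptance window.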
The key components in our transcript analytical platform were the collation of a comprehensive target list for glycobiology-related genes, the development of protocols and validated primer sets for the qRT-PCR approach, and the assignment of members of the gene list to discrete steps in their respective biosynthetic pathways. Collation of the comprehensive gene list was more difficult than anticipated because of the complexity of glycan biosynthetic and modification pathways. Numerous steps in glycan biosynthetic pathways are catalyzed by multiple enzyme isoforms, many of which are not well characterized in regard to acceptor or donor specificity. As a result it is difficult to achieve one-to-one mapping of enzymes to discrete pathway steps without detailed knowledge of the primary literature and the specificity of a given enzyme system. Although web resources such as the CAZy and KEGG databases provided effective starting points for gene list assembly or integrated pathway mapping, there is not a single, publicly available resource for all of the relevant target genes in a form that contains functional annotations or effective pathway mapping. Thus, extensive manual annotation of the gene list and pathway diagrams was required to create a bridge between our glycan-related transcript abundance data and corresponding glycan structural data. We consider the annotated pathway diagrams and the underlying gene list to be a "work in progress," because new enzymes are constantly being added to the list and ongoing characterization of the existing members of the list will likely lead to revisions of the enzyme assignments in the pathway diagrams. However, the present gene list is still more than double the size of the largest glycan-related gene list used to create focused microarrays (i.e. Consortium for Functional Glycomics GLYCOv1 gene chip) employed in prior transcript analysis studies (14). 
We have created a web site for archiving our glycotranscriptome data (NIH-NCRR Integrated Technology Resource for Biomedical Glycomics). It contains a catalog of the updated gene lists, primer design information, and pathway diagrams, and will eventually archive the transcript data sets generated by our qRT-PCR analysis.
Application of the qRT-PCR methodology to mouse tissue transcript analysis revealed a broad diversity of tissue-specific transcript expression for glycan-related enzymes and proteins. Several pathways (i.e. N-glycan lipid-linked precursor synthesis (Fig. 2) and GPI anchor biosynthesis (supplemental Fig. 7)) are catalyzed predominantly by single enzyme isoforms and generally exhibited very little tissue-specific variation across the linear pathway steps. Other pathways, especially those that elaborate terminal glycan epitopes on N-glycans, O-glycans, glycolipids (Fig. 4), and sulfated proteoglycans on cell surface glycoconjugates (supplemental Figs. 11–13), were catalyzed by multiple enzyme isoforms that exhibited complex tissue-specific expression patterns. This proliferation of isoforms with varying specificities and expression patterns likely reflects the varying diversity in functions of cell surface glycoconjugates in cellular adhesion, recognition, and host-pathogen interactions.
In some cases, the influence of tissue-specific glycosyltransferase expression was reflected in altered glycan profiles across the mouse tissues (e.g. reduced levels of bisecting GlcNAc and core α1,6-Fuc residues in liver N-glycans corresponding with reduced transcripts for Mgat3 and Fut8, respectively (Fig. 6); see Table 2 for summary). In other instances, the relationship between alterations in glycan profiles and changes in transcript levels was more complex. For example, sialylation of N-glycans was reduced in kidney, but this alteration was accompanied by only minor reductions in some sialyltransferase transcripts and elevation in lysosomal and cytosolic neuraminidase transcripts in this tissue (Fig. 7). Regulation could also occur at a post-transcriptional level or by a mechanism unrelated to glycan-associated transcript expression. An example of the latter case was the elevated oligomannose structures found in liver relative to other tissue sources (Fig. 6). A likely explanation for increased liver oligomannose structures is the proliferation of the smooth and rough endoplasmic reticulum in hepatocytes (63, 64), where glycoproteins would not be expected to encounter the complex glycan processing machinery in the Golgi complex. Thus, comparison of glycan structures with corresponding transcript levels can provide insights into transcriptional control of glycosylation as well as provide a framework for hypotheses about where glycan regulation is accomplished at the post-transcriptional level.
An additional benefit of broad-based transcript profiling of glycan-related genes is the use of these data to predict which enzyme isoform among a family of related enzymes is catalyzing a given glycan-processing step. An example of this is shown in the addition of nonreducing terminal α1,3-Fuc residues on mouse tissue N-glycans. Mice contain five GT10 fucosyltransferase isoforms potentially capable of adding α1,3-Fuc linkages to N-glycan structures. Only transcripts encoding Fut9 correlate with glycan structures containing the corresponding sugar linkage (high in kidney and brain, low in liver and testis; Fig. 6), strongly suggesting that Fut9 is responsible for creation of this linkage in mouse tissues, consistent with immunohistochemical studies in Fut9-deficient mice (14). Thus, comparison of transcript profiles with glycan structures may help to provide insights into enzyme specificity and identify contributors to glycan biosynthetic pathways.
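The isoform assignment logic can be sketched as a ranking of candidate isoforms by how well their transcript profiles track the abundance of the glycan linkage across tissues. The sketch below is illustrative only: it assumes a Pearson correlation between log10 transcript abundance and relative glycan abundance, the function names are hypothetical, and with only four tissues any such correlation is a weak guide that needs independent confirmation (as with the Fut9-deficient mice cited above).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


def rank_isoforms(transcripts, glycan_abundance):
    """Rank isoforms by correlation with a glycan structure across tissues.

    transcripts: dict isoform -> dict tissue -> relative transcript level
                 (positive values; log10-transformed before correlation).
    glycan_abundance: dict tissue -> relative abundance of the structure.
    Returns a list of (isoform, correlation), best candidate first.
    """
    tissues = sorted(glycan_abundance)
    y = [glycan_abundance[t] for t in tissues]
    scores = {}
    for isoform, profile in transcripts.items():
        x = [math.log10(profile[t]) for t in tissues]
        scores[isoform] = pearson(x, y)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

The candidate at the top of the returned list is the isoform whose tissue distribution best matches that of the linkage; isoforms with negative correlations are unlikely contributors in the tissues examined.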
Finally, transcript profiling can also predict where glycan structures are not anticipated to be regulated (constitutive expression) or where glycan structures might be expressed in unanticipated locations. Examples of apparent constitutive pathways based on the profiles of the four mouse tissues include N-glycan lipid-linked precursor biosynthesis (Fig. 2), GPI anchor biosynthesis (supplemental Fig. 7), nucleotide-sugar transporter expression (supplemental Fig. 19), as well as proteoglycan core and co-polymerase extension (supplemental Fig. 10). In contrast, proteoglycan sulfation enzymes appear to have tissue-specific expression patterns across the various enzyme isoforms. Some enzymes involved in the sulfation of heparan sulfate exhibit little variation across the four tissues (Ndst1, Ndst2, Hs2st1, Hs6st1, Hs3st3a1, and Hs3st6, supplemental Fig. 11), whereas other isoforms exhibit profound tissue-specific expression. Similar combinations of invariant and tissue-specific isoforms can be seen in the biosynthesis of chondroitin, dermatan, and keratan sulfates (supplemental Figs. 12 and 13). Widely variant tissue-specific expression patterns can also be found for the core proteins that act as acceptors for heparan, chondroitin, dermatan, and keratan sulfates (supplemental Fig. 14). In a more focused study, we previously demonstrated that proteoglycan sulfation and core protein levels vary during mouse embryonic stem cell differentiation, whereas the proteoglycan core and disaccharide co-polymer extension enzymes are not altered in expression during differentiation (95). Thus, glycan-related transcript profiling can provide insights into which enzyme isoforms are regulated in a given biological system.
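The distinction between invariant and tissue-specific isoforms can be made explicit with a simple fold-range screen, sketched below. The 10-fold cutoff, the detection floor, and the function names are assumptions chosen for illustration; the text above does not define a numerical threshold.

```python
def fold_range(profile, detection_limit=1e-6):
    """Ratio of highest to lowest transcript level across tissues.

    profile: dict tissue -> relative transcript level.
    Values below detection_limit are raised to it so that undetected
    transcripts do not cause division by zero (an assumed convention).
    """
    values = [max(v, detection_limit) for v in profile.values()]
    if not values:
        raise ValueError("profile contains no tissues")
    return max(values) / min(values)


def classify_isoforms(profiles, fold_threshold=10.0):
    """Label each gene 'invariant' or 'tissue-specific'.

    profiles: dict gene -> dict tissue -> relative transcript level.
    fold_threshold: assumed cutoff; a gene whose fold range reaches it
    is called tissue-specific.
    """
    labels = {}
    for gene, profile in profiles.items():
        if fold_range(profile) >= fold_threshold:
            labels[gene] = "tissue-specific"
        else:
            labels[gene] = "invariant"
    return labels
```

Because the label depends entirely on the chosen cutoff, a screen of this kind is best used to flag candidates for closer inspection rather than to draw conclusions on its own.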
Transcript profiling data can also highlight points of transcriptional regulation that may be unexpected, including the expression of St8sia2 and NCAM in testis in addition to their expected location in neural tissues (Fig. 7 and supplemental Fig. 20). The appearance of nonpolysialylated NCAM in testis has been reported (70, 71), but the presence of both St8sia2 and NCAM transcripts in testis suggests an additional role for polysialylation in this latter tissue. Similarly, expression of the ganglioside α2,8-sialyltransferases, St8sia2 and St8sia3, in testis is consistent with recent studies demonstrating a critical role for complex gangliosides in mouse spermatogenesis (73).
Although our initial profiles of glycan-related transcripts in four mouse tissues provide a focused view of glycan regulation, one of our eventual goals is to generate a more extensive map of glycan-related transcript expression in a broad collection of mouse tissues to correlate these data with parallel sets of quantitative glycan structural data. This atlas of glycan and transcript expression will act as a framework for understanding regulation of glycan structures in animal systems. Further developments in glycan analytical methodologies will be required to attain these goals, particularly for O-glycans, glycolipids, and proteoglycans, where chemical and enzymatic release methods are less developed for high throughput quantitative glycan profiling. Additional work must also be done to complete the enzyme assignments for glycan-related pathways, particularly in the catabolism of all glycan classes. Steady state levels of some glycoproteins will likely be regulated by a balance of glycan synthesis and catabolism. Thus, modeling of glycan abundance in biological systems will require detailed knowledge of the mechanisms for regulating both biosynthetic and catabolic pathways.
In conclusion, we have initiated studies to test the hypothesis that glycan expression in animal tissues is regulated predominantly at the transcriptional level. To accomplish this we have developed a robust, flexible platform for transcript analysis of mouse glycan-related enzymes and proteins with a wide quantitative dynamic range. Insights that come from the ability to quantitate the full range of glycan-related transcripts revealed numerous correlations with glycan structural data indicating that many, but not all, glycan structural changes can be accounted for by transcriptional regulation of the glycan synthetic machinery. Additional use of this analytical platform in other biological systems should complement glycan structural studies and lead to a greater understanding of the roles that glycans play in animal physiology and pathology.
| FOOTNOTES |
|---|
The on-line version of this article (available at http://www.jbc.org) contains supplemental Figs. 1–20 and Table 1.
1 To whom correspondence should be addressed: Complex Carbohydrate Research Center, 315 Riverbend Rd., University of Georgia, Athens, GA 30602. Tel.: 706-542-1705; Fax: 706-542-1759; E-mail: moremen{at}uga.edu.
2 The abbreviations used are: qRT-PCR, quantitative real time polymerase chain reaction; MALDI-MS, matrix-assisted laser desorption ionization-mass spectrometry; RMA, robust multichip average; gDNA, genomic DNA; Rpl4, ribosomal protein L4; EST, expressed sequence tag; IGnT, I-GlcNAc transferase; GPI, glycosylphosphatidylinositol; NCAM, neural cell adhesion molecule.