Human small intestinal maltase-glucoamylase cDNA cloning. Homology to sucrase-isomaltase.

It has been hypothesized that human mucosal glucoamylase (EC 3.2.1.20 and 3.2.1.3) activity serves as an alternate pathway for starch digestion when luminal alpha-amylase activity is reduced because of immaturity or malnutrition and that maltase-glucoamylase plays a unique role in the digestion of malted dietary oligosaccharides used in food manufacturing. As a first step toward the testing of this hypothesis, we have cloned human small intestinal maltase-glucoamylase cDNA to permit study of the individual catalytic and binding sites for maltose and starch enzyme hydrolase activities in subsequent expression experiments. Human maltase-glucoamylase was purified by immunoisolation and partially sequenced. Maltase-glucoamylase cDNA was amplified from human intestinal RNA using degenerate and gene-specific primers with the reverse transcription-polymerase chain reaction. The 6,513-base pair cDNA contains an open reading frame that encodes a 1,857-amino acid protein (molecular mass 209,702 Da). Maltase-glucoamylase has two catalytic sites identical to those of sucrase-isomaltase, but the proteins are only 59% homologous. Both are members of glycosyl hydrolase family 31, which has a variety of substrate specificities. Our findings suggest that divergences in the carbohydrate binding sequences must determine the substrate specificities for the four different enzyme activities that share a conserved catalytic site.

Starches are a mixture of two structurally different polysaccharides: amylose, a linear [4-O-α-D-glucopyranosyl-D-glucose]n polymer, and amylopectin, with additional 6-O-α-D-glucopyranosyl-D-glucose links (about 4% of total), which result in a branched configuration. Dietary starches are a mixture of approximately 25% amylose in amylopectin, a fact of nutritional significance because of the multienzyme complexity of the mammalian starch digestion pathway (1). α-Amylase (EC 3.2.1.1) is the endoenzyme found in mature human salivary and pancreatic secretions that produces linear maltose oligosaccharides by hydrolysis of α1→4 linkages of amylose (2,3). α-Amylase bypasses the α1→6 linkages of amylopectin and produces branched isomaltose oligosaccharides. The starch-derived oligosaccharides are not fermentable by yeast without further processing by β-amylase (EC 3.2.1.2), which hydrolyzes the nonreducing ends at 1→4 and 1→6 linkages (2). In mammals, hydrolysis of the nonreducing ends is carried out by small intestinal mucosal brush border-anchored sucrase-isomaltase (SIM) (1). Enzyme substrate specificities of SIM overlap with those of maltase-glucoamylase (MGA). In vivo, SIM accounts for 80% of maltase (1,4-O-α-D-glucanohydrolase) activity, all sucrase (D-glucopyranosyl-β-D-fructohydrolase) activity, and almost all isomaltase (1,6-O-α-D-glucanohydrolase) activity (1). MGA accounts for all glucoamylase exoenzyme (1,4-O-α-D-glucanohydrolase) activity for amylose and amylopectin substrates, 1% of isomaltase activity, and 20% of maltase activity (1,4). Some have hypothesized that human mucosal glucoamylase exoenzyme activity is an alternate pathway for starch digestion when luminal α-amylase endoenzyme activity is reduced because of immaturity and malnutrition and that MGA plays a unique role in the digestion of malted dietary oligosaccharides used in food and beverage manufacturing (5-8).
The objective of this study was the cloning and sequencing of the human small intestinal MGA cDNA to allow subsequent testing of this hypothesis by analysis of the individual catalytic and binding sites for maltose and starch by expression and mutation experiments. In this paper, we describe the cloning of cDNA for MGA from human intestinal RNA. The identity of the isolated cDNA clone was confirmed by expression of a recombinant protein in Escherichia coli transformed by the cDNA coding for the N-terminal domain.
MGA activities associated with proteins of the same size as the intestinal enzyme have been reported from human kidney and granulocytes (9,10). A clinical deficiency of intestinal glucoamylase has been reported that consisted of chronic diarrhea responding to a starch elimination diet in children with normal mucosal morphology and low starch hydrolyzing activity (11). A genetic form of glucoamylase deficiency has been reported in mice (12). In rats and rabbits, SIM activity is undetectable until weaning but MGA is present from birth (1). In pigs and humans, both SIM and MGA activities are present from birth. In malnourished human infants, SIM and MGA activities are reduced (13,14), but SIM and MGA activities are increased in malnourished rats (15). There have been characterizations of MGA activity, synthesis, and processing in chickens, mice, rats, rabbits, and pigs (12, 16-24). There were two mature MGA protein subunits in all of these species. Studies of pig MGA peptide sequence demonstrated that it is anchored to the membrane by the N terminus (16-18). Rat studies found that maltase activity was associated with the membrane, and glucoamylase was associated with the luminal ends (1,19).

EXPERIMENTAL PROCEDURES
Materials-Human small intestine from organ donors was used for the preparation of antibodies and for the isolation and characterization of MGA (25-27). The usage of this tissue was approved by the ethical committees of all involved hospitals and institutions (25-27). All chemicals and biologicals used for protein isolation and characterization were purchased from Sigma unless otherwise noted (27). All molecular reagent and kit suppliers are indicated, and the reagents and kits were used according to the manufacturers' instructions.
Electrophoresis and Immunoblotting Procedures-The procedures used for SDS-polyacrylamide gel electrophoresis and immunoblots were previously reported (27,28). Molecular mass standards were run with all gels and ranged from 43 to 202 kDa (27).
Immunoisolation-Human small intestinal epithelial cells were obtained from frozen and thawed organ donor tissue, and brush border membranes were isolated from a 2% mucosal homogenate containing protease and bacterial inhibitors (27,28). After removal of cellular debris (P1) at 1,000 × g, the supernatant was pelleted by centrifugation at 100,000 × g. The pellet was solubilized in Triton X-100 and deoxycholate (P2). HBB 2/143/17 was used to immunoisolate MGA (27). The HBB-MGA complexes were isolated with protein A (28). Replicate MGA protein isolates were pooled, separated on SDS gels, and blotted onto a membrane. The proteins were stained with Coomassie Blue, and their identity was confirmed by immunoblots stained with pooled HMA mAb. The individual 335- and 285-kDa protein bands were subjected to direct gas-phase N-terminal microsequencing. Additional 335- or 285-kDa isolate pools were hydrolyzed with CNBr, extracted, dried, and washed. The pellet was dissolved in electrophoresis buffer with dithiothreitol. The two hydrolyzed proteins were run on a Tricine gel and then were blotted and stained with Coomassie Blue. The hydrolysate bands were sequenced.
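The CNBr step above relies on cleavage at the C-terminal side of methionine residues. As a minimal sketch of how the expected fragments can be predicted from a sequence, the following assumes complete cleavage after every methionine and ignores the chemical conversion of the cleaved methionine; the function name and the toy sequence are hypothetical and are not taken from MGA.

```python
def cnbr_fragments(protein: str) -> list[str]:
    """Split a protein sequence after every methionine (M).

    Simplification: assumes complete cleavage at every Met.
    """
    fragments = []
    start = 0
    for i, residue in enumerate(protein):
        if residue == "M":
            # The fragment ends with the methionine that was cleaved.
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # C-terminal remainder that contains no further methionine.
        fragments.append(protein[start:])
    return fragments


# Hypothetical toy sequence used only for illustration.
toy_protein = "AKMGLWIDMNEVSMTR"
toy_fragments = cnbr_fragments(toy_protein)
print(toy_fragments)  # ['AKM', 'GLWIDM', 'NEVSM', 'TR']
```

Each predicted fragment begins at a new N terminus, which is what makes internal peptide sequencing of the blotted hydrolysate bands possible.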
PCR Primers-Degenerate 19-20 oligonucleotide primers were designed from areas of the sequenced MGA peptides with lowest homology with SIM and from SIM catalytic site sequence WIDMNE (29,30). These primers were synthesized by Microsynth (Windisch, Switzerland). Gene-specific primers were designed from sequenced clones. These primers were synthesized with an Applied Biosystems 394 DNA/RNA Synthesizer (Foster City, CA) or by Genosys (The Woodlands, TX).
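To illustrate why a peptide such as WIDMNE is a convenient target for degenerate primers, the sketch below back-translates it with the standard genetic code into an IUPAC-degenerate sequence and counts the distinct oligonucleotides the mixture would contain. It is an illustration only: the actual primer sequences are not given here, the 18-nucleotide result is shorter than the 19-20-nucleotide primers described, and the function name is hypothetical.

```python
from math import prod

# Degenerate codon (IUPAC notation) and number of codons per amino acid
# in the standard genetic code, restricted to the residues of WIDMNE.
# H = A/C/T, Y = C/T, R = A/G.
DEGENERATE_CODON = {
    "W": ("TGG", 1),
    "I": ("ATH", 3),
    "D": ("GAY", 2),
    "M": ("ATG", 1),
    "N": ("AAY", 2),
    "E": ("GAR", 2),
}


def back_translate(peptide: str) -> tuple[str, int]:
    """Return the degenerate coding sequence and its degeneracy."""
    sequence = "".join(DEGENERATE_CODON[aa][0] for aa in peptide)
    degeneracy = prod(DEGENERATE_CODON[aa][1] for aa in peptide)
    return sequence, degeneracy


primer, degeneracy = back_translate("WIDMNE")
print(primer, degeneracy)  # TGGATHGAYATGAAYGAR 24
```

Tryptophan and methionine each have a single codon, so the motif yields a comparatively low degeneracy for its length.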
RT/PCR Procedures-Reverse transcription (RT) was carried out using 100 ng of total human small intestine RNA (CLONTECH, Palo Alto, CA). The RT reaction used random nonamer primers (RT/PCR kit; Stratagene, La Jolla, CA). The RT product was amplified by polymerase chain reaction (PCR) that used degenerate primers (1 μM), Taq polymerase, and RT reaction product in a thermal cycler (Perkin-Elmer 480). Amplicons were separated on agarose gel, purified using QIAquick gel extraction kit (Qiagen, Chatsworth, CA), ligated into pT7-blue T-vector, and transformed into NovaBlue E. coli (Novagen, Madison, WI). Clones were screened by PCR with vector primers, and QIApreps plasmid isolations (Qiagen) were carried out. DNA was subsequently sequenced (Applied Biosystems 373A automated sequencer, Foster City, CA).
PCR Extension Procedures-Rapid amplification of cDNA ends (Life Technologies, Inc.) was used to extend the 3′ and 5′ ends of the sequences. The 3′ reactants were amplified using rTth polymerase kit (GeneAmp XL PCR kit, Perkin-Elmer). The forward primer was gene-specific, and the reverse primer was the abridged universal amplification primer (AUAP; Life Technologies). The amplicons were ligated with a PCR-script (GeneAmp XL PCR kit, Perkin-Elmer). The 5′ reverse primer was gene-specific, and the first forward primer was the anchor primer (Life Technologies). This was followed by a second round of amplification with primer AUAP. Clones were screened using vector and nested gene-specific primers.
Tissue Specificity-Equal amounts of total RNA from human small intestine, kidney, salivary gland, pancreas (CLONTECH), and isolated granulocytes (a gift of Dr. C. W. Smith, Baylor College of Medicine) were subjected to RT/PCR with primers producing MGA amplimers from small intestinal RNA (intestinal sequence locations 3660-4451, 1669-2031, and 5911-6231). β-Actin was used as an amplification control for RNA quality (see Fig. 4A). The gel was transferred to a membrane (MSI; Westboro, MA) and cross-linked. The probe was prepared from the full-length MGA-P1 cDNA using random labeling. The 32P-labeled probe was purified by column isolation. The hybridization (ExpressHyb; CLONTECH) was at 68°C, the membrane was washed at 50°C, and the image was developed.
Southern Blot Analysis-Species specificity and genomic complexity were evaluated with the Zoo blot and Geno blot (CLONTECH). The Zoo blot contained 4 μg of EcoRI-digested genomic DNA from nine eukaryotic species (human, monkey, rat, mouse, dog, cow, rabbit, chicken, yeast). The Geno blot contained 4 μg of human genomic DNA cut with EcoRI, HindIII, BamHI, PstI, and BglII. The probe was prepared by PCR from the MGA-P1 template with primers encompassing 1669-2049 in the intestinal MGA sequence. The amplimer was random-labeled with [32P]dCTP. The hybridization (CLONTECH) was at 60°C, the membrane was then washed under high stringency conditions at 50°C, and the image was developed (Fig. 4, B and C).
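Interpreting the Geno blot depends on knowing whether the probe itself contains recognition sites for the enzymes used. A minimal sketch of such a restriction map is given below. The recognition sequences are the standard ones for these five enzymes; the function name and the toy probe are hypothetical, and the real probe sequence (positions 1669-2049) is not reproduced here.

```python
# Recognition sequences of the five enzymes used on the Geno blot.
# All five sites are palindromic, so scanning one strand is sufficient.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
    "PstI": "CTGCAG",
    "BglII": "AGATCT",
}


def map_sites(sequence: str) -> dict[str, list[int]]:
    """Return 1-based start positions of each recognition site."""
    sequence = sequence.upper()
    hits = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        pos = sequence.find(site)
        while pos != -1:
            positions.append(pos + 1)  # convert to 1-based numbering
            pos = sequence.find(site, pos + 1)
        hits[enzyme] = positions
    return hits


# Hypothetical toy probe used only for illustration.
toy_probe = "ATGGATCCTTGACTGCAGTTAAGCTTCC"
site_map = map_sites(toy_probe)
print(site_map)
```

An enzyme with an empty list has no site inside the probe, so more than one hybridizing band for that enzyme would point to sites in genomic sequence that is absent from the cDNA.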
Expression in E. coli-A full-length construct was assembled by cutting clones K3, 13-13, and T46 with HaeII, SacI, and SpeI. The DNA fragments were purified and ligated. The assembled construct was then column- and gel-purified. pBlueScript SK (Stratagene) vector was digested with SpeI, and the construct was ligated into the vector. The construct was used to transform XL1 Blue MRF cells. The full-length clone, MGA-P1, was completely resequenced.
For the purpose of expressing independent active sites, two sets of gene-specific primers were designed that contained a SalI (on the 5′ end) or NotI (on the 3′ end) restriction site. By using these primer pairs and MGA-P1 as a template, an N-terminal 2,584-bp fragment encoding residues 295-2838 and a C-terminal 2,687-bp fragment encoding 2893-5550 were amplified by PCR with rTth polymerase. The amplicons were ligated into pET22b (Novagen) and transformed into NovaBlue cells for plasmid production. These clones, MGA-P1a (N-terminal domain) and MGA-P1b (C-terminal domain), were sequenced to confirm the orientations before transforming into BL21(DE3) E. coli (Novagen). Expression was induced by 1 mM IPTG, and proteins were harvested by osmotic shock and sonication. The induced and uninduced proteins were separated on a 6% gel and stained with Coomassie Blue or immunoblotted with HMA mAbs.
Computer Analysis of Sequences-The software from the Genetics Computer Group, Inc. (31) was accessed via the Baylor College of Medicine molecular biology computation resource. Sequence analysis was performed with the GCG programs (32). GenBank™ data were searched using the BLAST Service (33) or FastA and TFastA (34) programs. Primer design used the PRIMER (35) program. The PROSITE analysis accessed the patterns section (36) and Blocks data base (37). Chromosomal assignment was made by searching the Expressed Sequence Tag (EST) data base (38).

RESULTS
The 335- and 285-kDa MGA bands were separately excised from blots, and N-terminal amino acids were sequenced. The human MGA N-terminal sequence had 82% homology with hog MGA (18) and 52% with human SIM (29,30). The initiator methionine was missing from the peptide sequences of the human MGA and human, rabbit, and porcine SIM N termini (16-18, 29, 30). Five identical bands were visualized on electrophoretic separation after independent CNBr hydrolysis of the 335- and 285-kDa forms. The polypeptide fragment bands were individually excised and sequenced. Band 2 could not be successfully sequenced. Band 4 was only sequenced from the 335-kDa digest. The N terminus and internal peptide sequences from the 335-kDa band were identical to those from the 285-kDa band. The locations of these peptide sequences are underlined in the amino acid sequence in Fig. 2.
Clones and Sequences-Four overlapping clones were obtained from the degenerate primer amplifications, and a 2,461-bp consensus was determined (Fig. 1). The N terminus, two internal peptides, and a WIDMNE catalytic site sequence were recognized from a single open reading frame (Fig. 2). Gene-specific primers were used to extend the sequence by 5′ and 3′ rapid amplification of cDNA ends RT/PCR. The open reading frame of the consensus was continued, and two additional internal peptides and a second WIDMNE catalytic site sequence were recognized (Fig. 2). The nucleotide sequence has been listed as GenBank™ accession number AF016833.
Chromosomal Location-A search of the EST chromosomal data bank with the MGA nucleotide sequence revealed that the 3′-untranslated end of the cDNA had 100% identity with a 147-base cDNA EST GS1365 (38). No other ESTs were identified.
Expression of Cloned cDNA in E. coli-Because of the length of the cDNA, 2,584- and 2,687-bp fragments of the cDNA (MGA-P1a, N-terminal domain; MGA-P1b, C-terminal domain), each of which included one catalytic site, were ligated into expression vectors and induced in E. coli. Proteins with the expected molecular mass of about 100 kDa were induced by IPTG in these E. coli cells from both constructs (Fig. 3A). When the bacterial proteins were immunoblotted to confirm expressed MGA protein identity, the induced 100-kDa band from MGA-P1a was stained by three HMA mAbs (Fig. 3B), but neither the uninduced bacterial nor the induced MGA-P1b proteins were stained.
Tissue Distribution-The same amplimer patterns were visualized after RT/PCR from small intestine, granulocyte, or kidney total RNA. No MGA amplimers were detected from salivary or pancreatic RNA, although control actin amplimers were present. The amplimer pattern suggested that MGA mRNA was expressed in small intestine, granulocyte, and kidney but not in salivary gland or pancreas (Fig. 4A). A Southern blot from this gel revealed that all the MGA amplimers from small intestine, granulocyte, and kidney were stained by the MGA-P1 probe (data not shown).
Southern Blots-Zoo blot Southern lanes were positive for human, monkey, rat, mouse, dog, cow, and rabbit genomic DNA cut with EcoRI (Fig. 4B). Chicken and yeast lanes were negative. Except for EcoRI, the Southern Geno blot revealed only one band per restriction enzyme from human genomic DNA (Fig. 4C). A map of the MGA probe revealed no internal restriction sites for EcoRI; thus the restriction sites are contained in two introns within the coding 380-bp length of the probe.

DISCUSSION
Two major studies have addressed isolation and synthesis of human MGA proteins (27,39). The first study isolated human MGA by chromatography after treatment with papain and found a single band of about 312 kDa with 32-38% glycosylation. This isolated protein had substrate specificities for maltose and whole starch (39). The second study utilized in vitro organ culture followed by immunoisolation (27). Electrophoresis of the precipitate revealed a major band of about 335 kDa (27). In metabolically labeled explants, a polypeptide of about 285 kDa was visualized at 30 min, and the 335-kDa form was visualized at 60 min. The 335-kDa isoform was found to be a complex glycosylated form of the high mannose 285-kDa form. Treatment of both 335- and 285-kDa forms with trifluoromethanesulfonic acid resulted in a single unglycosylated protein (described as ~210 kDa in this paper) and revealed that the 335-kDa form was heavily glycosylated. Because of reports of two ~100-kDa MGA proteins in rats and pigs (18,22), the human explants were treated with trypsin, but MGA was not cleaved (27).
Cloned cDNA-We have isolated a 6,513-nucleotide cDNA for human MGA that contains a single open reading frame and includes 58-nucleotide 5′- and 885-nucleotide 3′-untranslated regions. There is a single polyadenylation signal AATAAA at nucleotide 6,404. There is a 174-nucleotide sequence in the 3′-untranslated region that is homologous with a cDNA 3′ EST that has been mapped to chromosome 7 (38). Consistent with the reported distribution of enzyme activity, MGA mRNA was expressed in human small intestine, kidney, and granulocytes (Fig. 4A). Based upon the pattern obtained on Southern analysis with five different restriction digests, we conclude that a single gene produces this cDNA (Fig. 4C). As evidenced on another Southern analysis with nine different eukaryotic genomic digests, an MGA gene was detectable in all other mammalian species tested (Fig. 4B) but not in chicken, which is known to have maltase and glucoamylase activities (20).
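The layout described above (5′-untranslated region, a single open reading frame, and a 3′-untranslated region carrying an AATAAA signal) can be illustrated with a short sketch. It assumes that translation starts at the first ATG and stops at the first in-frame stop codon, which is a simplification; the function name and the toy cDNA are hypothetical and unrelated to the MGA sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_cdna(cdna: str) -> tuple[str, str, str]:
    """Split a cDNA into 5' UTR, open reading frame, and 3' UTR.

    Simplification: the first ATG is taken as the start codon and the
    first in-frame stop codon ends the reading frame.
    """
    start = cdna.find("ATG")
    if start == -1:
        raise ValueError("no start codon found")
    for i in range(start, len(cdna) - 2, 3):
        if cdna[i:i + 3] in STOP_CODONS:
            end = i + 3  # the stop codon is counted as part of the ORF
            return cdna[:start], cdna[start:end], cdna[end:]
    raise ValueError("no in-frame stop codon found")


# Hypothetical toy cDNA used only for illustration.
toy_cdna = "GGCACC" + "ATGGCTTGGATTGATATGAACGAATAA" + "CTTAATAAAGCAAAA"
utr5, orf, utr3 = split_cdna(toy_cdna)
polya_signal = utr3.find("AATAAA")  # 0-based offset within the 3' UTR

print(len(utr5), len(orf), len(utr3), polya_signal)  # 6 27 15 3
```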
Peptide Sequence-There was 92% identity between the five MGA peptides and deduced sequences. The coding region of the cDNA of human MGA reveals a protein of 1,857 amino acids with a calculated molecular mass of 209,702 Da (Fig. 2). This protein is similar in molecular mass to the smallest band visualized on overloaded lanes and previously identified as an unglycosylated MGA (27). There was 59% sequence identity between human MGA and SIM proteins.
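The figures above allow a quick plausibility check. The sketch below uses only the numbers stated in the text (1,857 residues and 209,702 Da) to compute the average mass per residue and the length of coding sequence required; the variable names are illustrative.

```python
RESIDUES = 1_857          # amino acids in the deduced protein
PROTEIN_MASS_DA = 209_702  # calculated molecular mass in Da

# Average mass contributed by each residue of the deduced protein.
mass_per_residue = PROTEIN_MASS_DA / RESIDUES

# Three nucleotides per codon, plus one stop codon.
coding_nucleotides = 3 * (RESIDUES + 1)

print(round(mass_per_residue, 1), coding_nucleotides)  # 112.9 5574
```

An average of roughly 113 Da per residue is in the range expected for an unmodified polypeptide, which is consistent with the calculated mass referring to the unglycosylated protein.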
Recombinant Expression-N-terminal and C-terminal halves (each including one catalytic site) of MGA cDNA were expressed in E. coli. The induced proteins were of the expected size, ~100 kDa (Fig. 3A), and the N-terminal-expressed bacterial protein selectively reacted with mAbs used to characterize the human MGA starting material (Fig. 3B). The expression and peptide sequence data support the conclusion that we have cloned the authentic human intestinal MGA cDNA.
N-terminal Domains-In the cytoplasmic tail domain, MGA has 26 amino acids with 5 lysines. The N-terminal domain has a hydrophobic segment, a putative type II membrane anchor, with 16 branched chains in this 21-amino acid sequence. The anchoring domain is followed by threonine- and serine-rich regions that have been termed the O-glycosylated stalk in SIM (29) and human intestinal aminopeptidase (41). The region is 52 amino acids long, and there are 20 possible O-glycosylation sites in this MGA sequence.
Disulfide Bonds-MGA has 24 cysteines; it is reported that all the cysteines of MGA are joined in disulfide linkages (29,39). Of these cysteines, 11 are in homologous regions in the N-terminal and C-terminal domains (Fig. 2). These have been described as trefoil-type domains (37). There are three additional conserved cysteines that bracket the N-terminal and two that follow the C-terminal signature 1 WIDMNE sites. There are also paired, conserved cysteines within each of the signature 2 sequences (discussed below).
Glycosylation-MGA has 19 potential N-glycosylation sites. There is a total of 253 sites of potential O-glycosylation in MGA. It is reported that mature MGA has a total of 28 carbohydrate residues with a total oligosaccharide molecular mass of 50,400 Da (41). Carbohydrates were found to make up about a third of the molecular mass of mature MGA, and no sialic acid could be detected (27,39).

Fig. 4. A, 100 ng each of small intestinal, granulocyte, kidney, pancreas, and salivary gland total RNA (left to right) were used for RT, and 8% of each product was used for PCR. The sets of primers used for PCR (left to right for each tissue) were MGA C terminus (C), MGA N terminus (N), MGA 3′-untranslated region (3′), and β-actin (β). The MGA amplimers are further described in the text. Three MGA amplimers were visualized from small intestine, kidney, and granulocyte RNA (the kidney C amplimer was faint in the original). No MGA amplimers were detected from pancreas or salivary gland, although β-actin was amplified from all tissue RNA. MGA amplifications from human kidney and granulocyte RNA are consistent with reported enzyme activities of these tissues (see the Introduction). Molecular weight markers are on both sides. B, Zoo blot Southern of EcoRI-cut genomic DNA from nine eukaryotic species. The lanes, left to right, contained human, rhesus monkey, Sprague-Dawley rat, BALB/c mouse, dog, cow, rabbit, chicken, and Saccharomyces cerevisiae. Human, monkey, rat, mouse, dog, cow, and rabbit were positive, but chicken and yeast lanes were negative. Mr markers are indicated on the left. C, the Southern Geno blot, made with five different restriction enzyme digestions. Except for EcoRI, this Southern revealed only one band per restriction enzyme. A restriction map of the MGA probe revealed no internal restriction sites for EcoRI; thus these cuts are in introns. Mr standards are on the left.
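Potential N-glycosylation sites of the kind counted in the Glycosylation paragraph are conventionally identified by the sequon N-X-S/T, where X is any residue except proline. A minimal sketch of such a scan follows. It uses this simplified sequon rather than the full PROSITE pattern, and the function name and toy sequence are hypothetical rather than taken from MGA.

```python
import re

# Asparagine followed by any residue except proline, then serine or
# threonine. The lookahead lets overlapping sequons be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_glycosylation_sites(protein: str) -> list[int]:
    """Return 1-based positions of asparagines in N-X-S/T sequons."""
    return [match.start() + 1 for match in SEQUON.finditer(protein)]


# Hypothetical toy sequence used only for illustration.
toy_sequence = "MNGSAANPTLLNNTSK"
sites = n_glycosylation_sites(toy_sequence)
print(sites)  # [2, 12, 13]
```

In the toy sequence the asparagine at position 7 is skipped because it is followed by proline. Such a scan identifies potential sites only; whether a given site is occupied has to be determined experimentally.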
Dimer Formation-All isolated bands were recognized by eight different mAbs specific for hMGA (25-27). The CNBr fragment patterns and the peptide sequences were identical from the 335- and 285-kDa proteins. The complete removal of all glycosylation from either the 335- or 285-kDa form resulted in a single protein band of ~210 kDa (27), which was consistent with the molecular mass as determined by sedimentation equilibrium (39,40,42) and cloning. These observations suggest the presence of a single protein structure in the 335- and 285-kDa forms of human MGA. Only differences in type of glycosylation and molecular mass distinguished the human 335- from 285-kDa bands (27). Parallel glycosylation differences were found in uncleaved MGA bands immunoisolated from pig or rat intestine (17,19). It has been reported that the largest band in the rat (similar to human 335 kDa) is a dimer formed by noncovalent adhesion between complex glycosylation sites whose selective removal resulted in a high mannose monomer (similar to human 285 kDa) (19). Identical results were reported after selective removal of the complex glycosylated chains of human 335-kDa MGA (27). Is the human 335-kDa band a dimer of the 285-kDa band? High resolution electron microscopic examination of reconstituted vesicles suggested a dimeric structure for pig MGA (43). Attempts to chemically cross-link the human 335-kDa band failed to document a dimeric structure (27) on 4% SDS gels; however, a noncovalently linked dimeric structure of pig MGA was found on 1% SDS gels (44). These observations suggest that the 335-kDa band is a complex glycosylation-linked dimeric form of the 285-kDa band.
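As a rough check on the band sizes discussed above, the apparent carbohydrate fraction of each form can be estimated from the glycosylated and deglycosylated masses. The sketch below takes the apparent masses at face value and treats each band as a single polypeptide of about 210 kDa, which is an assumption; gel mobility of glycoproteins is only approximate, and the function name is illustrative.

```python
def carbohydrate_fraction(glycosylated_kda: float, peptide_kda: float) -> float:
    """Fraction of the apparent mass not accounted for by polypeptide."""
    return (glycosylated_kda - peptide_kda) / glycosylated_kda


PEPTIDE_KDA = 210  # deglycosylated band, approximate

complex_form = carbohydrate_fraction(335, PEPTIDE_KDA)
high_mannose_form = carbohydrate_fraction(285, PEPTIDE_KDA)

print(f"{complex_form:.0%} {high_mannose_form:.0%}")  # 37% 26%
```

Under this assumption the 335-kDa form would be about 37% carbohydrate by apparent mass, in line with the reported value of about a third for mature MGA.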
Enzyme Family-MGA has two glycosyl hydrolase family 31 signature sequences (37)