* This minireview will be reprinted in the 1999 Minireview Compendium, which will be available in December, 1999. This work was supported in part by NIGMS Grant 27566 from the National Institutes of Health (to R. L. H.). The on-line version of this article (available at http://www.jbc.org) contains supplemental material including comparisons of amino acid sequences, accession numbers for proteins, tissue distribution of secreted mucins, and additional references.
Mucins are major glycoprotein components of the mucus that coats the surfaces of cells lining the respiratory, digestive, and urogenital tracts, and in some amphibia, the skin. They function to protect epithelial cells from infection, dehydration, and physical or chemical injury, as well as to aid the passage of materials through a tract. Individual organisms make several structurally different mucins, and a given mucin may be found in more than one organ (see Supplemental Material). Members of the mucin family can differ considerably in size. Some are small, containing a few hundred amino acid residues, whereas others contain several thousand residues and are among the largest known proteins. Irrespective of size, all mucin polypeptide chains have domains rich in threonine and/or serine whose hydroxyl groups are in O-glycosidic linkage with oligosaccharides. Moreover, these domains are composed of tandemly repeated sequences that vary in number, length, and amino acid sequence from one mucin to another (
). The carbohydrate content of a mucin may account for up to 90% of its weight. There are two types of mucins, membrane-bound and secreted. Of the human mucins, two are membrane-bound (MUC1 and MUC4) (
), one of the most thoroughly characterized mucins, has a tissue distribution and structure similar to MUC5B. An increasing number of proteins that are not mucins also contain highly O-glycosylated domains called “mucin-like domains.”
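The compositional signature described above, domains rich in threonine and/or serine, can be illustrated with a short Python sketch that reports the Ser/Thr fraction in a sliding window along a polypeptide sequence. This is only a rough illustration: the function names, the window size, the threshold, and the toy sequence are all assumptions introduced here, not values or methods taken from this review, and a high Ser/Thr fraction by itself does not establish that a region is O-glycosylated.

```python
def ser_thr_fraction(seq: str) -> float:
    """Fraction of residues in seq that are Ser (S) or Thr (T)."""
    if not seq:
        return 0.0
    return sum(1 for aa in seq.upper() if aa in "ST") / len(seq)


def ser_thr_rich_windows(seq: str, window: int = 20, threshold: float = 0.4):
    """Return (start, fraction) for each window with Ser/Thr fraction >= threshold.

    start is a 0-based index. The window size and threshold are
    illustrative defaults (assumptions), not published cutoffs.
    """
    hits = []
    for start in range(max(len(seq) - window + 1, 0)):
        frac = ser_thr_fraction(seq[start:start + window])
        if frac >= threshold:
            hits.append((start, frac))
    return hits


# Hypothetical toy sequence (not a real mucin): a stretch with no Ser/Thr
# followed by a Ser/Thr-rich stretch.
toy = "MKLAVGLECDNPRWYQHFIK" + "TSTPSTTSAPTSSTGTSPTS"
hits = ser_thr_rich_windows(toy)
```

On the toy sequence, only windows that overlap the second stretch sufficiently are reported. For real sequences the window and threshold would need to be chosen with the repeat length of the mucin in mind.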
The functions of mucins are dependent on their ability to form viscous solutions or gels. Although the highly glycosylated domains of mucins are devoid of secondary structures, they are long extended structures that are much less flexible than unglycosylated random coils. The oligosaccharides contribute to this stiffness in two ways, by limiting the rotation around peptide bonds and by charge repulsion among the neighboring, negatively charged oligosaccharide groups (
). Such long, extended molecules have a much greater solution volume than native or denatured proteins with little or no carbohydrate and endow aqueous mucin solutions with a high viscosity. Mucins protect against infection by microorganisms that bind cell surface carbohydrates, and mucin genes appear to be up-regulated by substances derived from bacteria, e.g. lipopolysaccharides (
This review will summarize what is known about the polypeptide structures of the secreted mucins and how some, in particular PSM, are assembled via interchain disulfide bonds into molecules with molecular weights in the millions. We will not consider membrane-bound mucins, which were the subject of earlier reviews (
). The different domains of mucins are shown in Fig. 1. Many of the domains show sequence identities and possibly similar functions in different mucins. These mucins vary greatly in size, from as few as 322 residues to 13,288 residues. The sequences of mucin polypeptides were deduced almost completely by recombinant DNA methods, and the physical-chemical properties of some mucins have not been determined. Nevertheless, it is well established that the oligosaccharides in many secreted mucins, e.g. PSM (
), show structural microheterogeneity, with GalNAcα-O-Ser/Thr as the sugar-protein linkage upon which other sugars are added. Most mucins have negatively charged sugars, either sialic acid or O-sulfosaccharides.
Tandem Repeat Domains
The number, length, and amino acid sequence of the repeats vary among different mucins, as shown in the Supplemental Material. The tandem repeat domains are flanked on either side by other types of domains (Fig. 1). All of the serine and threonine residues in the repeat domain of PSM have O-linked oligosaccharides (
), but this is not known for other mucins. The repeats in some mucins have identical sequences, whereas in others the repeat sequence is degenerate. The lack of secondary structures in the repeat domains and their flanking domains suggests that these domains serve as a scaffold for O-linked oligosaccharides (
), are encoded by a single large exon, although the remainder of the mucin is encoded by short exons separated by long introns. Many mucins show length polymorphism as the result of multiple alleles that encode different numbers of tandem repeats (
). Many secreted mucins contain three NH2-terminal D-domains, designated D1, D2, and D3, and some a fourth domain, D4, at the COOH terminus (Fig. 1). A partial D-domain, D′, is between D2 and D3 in all secreted mucins and VWF. Each domain, which contains up to 30 ½Cys, shows significant sequence identity with the other D-domains, especially the half-cystines. Comparisons of the sequences of the D-domains and other ½Cys-rich domains are given as supplemental information (see Supplemental Material). The D1-, D2-, and D3-domains of PSM are N-glycosylated when expressed in COS-7 cells (
), but this is not known for other mucins. In PSM and VWF all of the ½Cys in the D1-, D2-, D3-, and CK-domains are thought to form disulfide bonds, some of which are intrachain bonds whereas others are interchain bonds that are involved in assembly of PSM and VWF into multimers (see below).
The COOH-terminal Disulfide-rich/CK-domains
A 240–325-residue domain with 29–33 ½Cys is at the COOH terminus of many mucins (
). The first 100–130 residues in this domain have sequence identity with the C-domains of VWF, but the last 90–120 residues from the COOH terminus have sequence identities with the CK-domain at the COOH terminus of VWF (
). The CK-domains are homologous to the “cystine knot” superfamily of proteins that includes transforming growth factor β2, nerve growth factor, platelet-derived growth factor, and chorionic gonadotropin (
). The CK-domains of VWF and mucins show significant sequence identity to norrin, a 133-residue protein that in mutant form gives rise to Norrie disease in humans, a rare, sex-linked disorder characterized by congenital blindness, mental retardation, and deafness (
) have confirmed the role of disulfide bonds in the assembly of mucins into multimers. The recognition that mucins had disulfide-rich domains structurally similar to those in VWF and the fact that VWF formed disulfide-bonded multimers through its disulfide-rich domains (
) indicated a possible role of these domains in mucin multimer formation. However, the large size of mucin polypeptides and their high carbohydrate content prevented use of the conventional methods of protein chemistry for examining the molecular details of mucin multimer formation. Fortunately it has been possible to obtain insights into multimer formation by expression of plasmids encoding mucin domains in mammalian cells followed by characterization of the recombinant proteins by SDS-gel electrophoresis and chromatography under reducing and non-reducing conditions. This approach has been particularly successful for examining multimer formation in PSM (
), with the assumption that the assembly of domains accurately reflects the assembly of native mucins in vivo. Thus, as illustrated in Fig. 2, PSM is thought to form disulfide-linked dimers through its COOH-terminal CK-domains, and the dimers then form disulfide-bonded multimers through their NH2-terminal D-domains. It is likely that all mucins structurally related to VWF (Fig. 1), in addition to rat Muc2, MUC5AC, BSM, CTM, PGM, and MUC6, form multimers similar to those formed by PSM.
Dimerization through the CK-domains
Two polypeptide chains of PSM form disulfide-linked dimers through their CK-domains soon after their biosynthesis in the endoplasmic reticulum (
). Pulse-chase studies show that dimerization is very rapid and occurs concomitant with or soon after N-glycosylation. N-Glycosylation is not required for dimer formation or later during multimer formation because both processes are unaffected by tunicamycin (
). The fact that brefeldin A, a compound that disrupts the Golgi complex, has no effect on dimer formation and that dimers are formed before N-linked oligosaccharides become endoglycosidase H-resistant indicates that dimerization is confined to the endoplasmic reticulum. Subsequent to the studies on the dimerization of PSM, rat Muc2 was also reported to form disulfide-linked dimers through its COOH-terminal disulfide-rich domain, which includes the CK-domain (
) (see Supplemental Material). Dimer formation by other types of mucins has not been examined by expression of plasmids encoding the CK-domains. However, mucins secreted by mucin-producing cells in culture (
), including MUC2, MUC5AC, and likely MUC5B and MUC6, appear to form disulfide-linked dimers shortly after their synthesis in the endoplasmic reticulum. In contrast to PSM, N-glycosylation is reported to be required for dimerization of rat Muc2, MUC2, and MUC5AC.
The interchain disulfide bonds in PSM dimers have been examined by site-directed mutagenesis (
). Of the 11 ½Cys in the CK-domain, mutation of 8 is without effect on dimer formation. Dimerization is partly impaired by mutation of 3 ½Cys at residues 13223, 13244, and 13246. C13244 and C13246 are in the sequence C13244LC13246C, which is conserved in all mucins and other proteins containing the CK-domain (Fig. 3) (see Supplemental Material) and is also critical for interchain disulfide bond formation in VWF (
), suggesting that this sequence motif may be important in folding of the CK-domain in the endoplasmic reticulum. This sequence motif is also conserved in all mucins, VWF, and norrin (Fig. 3) (see Supplemental Material), which attests to its importance in maintaining the structure of the CK-domain.
O-Glycosylation of the Repeat Domain
The incorporation of O-linked oligosaccharides into mucins begins after N-glycosylation and disulfide-linked dimer formation as suggested by biosynthetic studies on MUC2 (
). O-Glycosylation of PSM begins when the dimers reach the cis-Golgi compartments, because the GalNAc transferase that forms the GalNAc-Ser/Thr linkages and the mucin precursors bearing only GalNAc have been located by electron microscopy in the cis-Golgi in mucous cells of submaxillary glands (
). The completion of the biosynthesis of the O-linked oligosaccharides in secreted mucins continues in the medial- and trans-Golgi compartments where the requisite glycosyltransferases for elongation and termination of the oligosaccharides are located (
Expression in COS-7 cells of plasmids encoding the three D-domains of PSM has shown that these domains participate in formation of interchain disulfide bonds between disulfide-linked dimers to give very high molecular weight multimers of mucin (
). Multimer formation differs from dimer formation in several respects. Brefeldin A, which disrupts the Golgi complex, inhibits multimer formation, indicating that multimers form in the Golgi complex. Compounds that increase the pH of the trans-Golgi compartments, such as chloroquine and monensin, also inhibit multimer formation but not dimerization (
). Bafilomycin, a specific inhibitor of the vacuolar H+-ATPase that maintains the trans-Golgi compartments at a slightly acidic pH, also inhibits multimer formation. These observations suggest that the interchain disulfide bonds that give rise to multimers are formed at a slightly acidic pH in the trans-Golgi complex through ½Cys residues in the D-domains. The molecular weights of the multimers cannot be assessed accurately by SDS-gel electrophoresis because they are so large they do not enter the running gel under non-reducing conditions. However, species with a size of trimers were observed when the three D-domains were expressed together (
), suggesting that a step in the process of multimerization is the formation of trimers of disulfide-linked dimers. Such multimers are likely branched structures as indicated in Fig. 2. Recombinant PSM containing no glycosylated domains is secreted from COS-7 cells as dimers and multimers, which indicates that, as with VWF, not all dimers are converted to multimers (
). The released propeptide contains the D1- and D2-domains and is essential for multimer formation although cleavage is not. Cleavage may not be essential for mucin multimerization because the D′-domains of mucins do not contain the sequence motif required for proteolytic cleavage of prepro-VWF. The observation that the D-domains of PSM are not cleaved when expressed in COS-7 or MOP-8 cells (
) is consistent with the lack of the cleavage motif in the D′-domain of PSM (see Supplemental Material). However, some proteolytic processing of mucins is possible as suggested by recent studies showing that cleavage occurs in the COOH-terminal region of MUC2 (
), and further electron microscopic studies should be made on well characterized preparations. Of interest is a recent report describing branched structures for MUC5B in respiratory secretions of asthmatic individuals (
). Nevertheless, additional mechanisms of mucin assembly are supported by studies on MUC2. LS174T cells synthesize soluble MUC2 disulfide-linked dimers, but higher molecular weight species are water-insoluble (
). Apparently, the water-insoluble species are assembled in the Golgi complex following initial O-glycosylation by a pH-independent process. These insoluble complexes are partly maintained by non-reducible chemical bonds of unknown nature (
) involves dimerization in the endoplasmic reticulum and multimerization in the trans-Golgi compartments. The molecular mechanisms that permit this compartmentalization are not known, but the NH2-terminal D-domains and the CGLCG motifs in the D1- and D3-domains seem to play critical roles (
). Plasmids encoding only the D1- and D2-domains, the D1- and the D3-domains, or the D3-domain of PSM expressed mucin oligomers in the presence of monensin, suggesting that the three domains must be contiguous to avoid multimerization at the non-acidic pH of the endoplasmic reticulum and the cis- and medial-Golgi compartments. Replacement of the two ½Cys by alanine in the CGLCG motif in the D3-domain permits formation of multimers in the presence of monensin (
). Thus, the motif in the D3-domain prevents multimerization of mucin in the non-acidic compartments of the endoplasmic reticulum and the cis/medial-Golgi compartments. Replacement of the two ½Cys by alanine in the CGLCG motif in the D1-domain dramatically reduces the rate of formation of disulfide-linked multimers (
). This observation suggests that multimerization at low pH in the acidic trans-Golgi compartments requires the motif in the D1-domain. Multimerization of VWF also requires the CGLCG motif in the D1-domain (
). However, among the mucins structurally related to VWF, only MUC5AC and MUC5B have CGLCG motifs in their D2-domains (see Supplemental Material). The exact roles of the CGLCG motifs remain unknown, but because similar motifs occur in the active sites of proteins that catalyze formation of disulfide bonds during protein folding, such as protein disulfide isomerase (Fig. 3), the question arises whether these motifs have a direct role in formation of disulfide bonds in mucins.
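Locating CGLCG motifs in a deduced polypeptide sequence, as in the comparisons of D-domains discussed above, amounts to a simple substring search. The Python sketch below shows one way this could be done; the function name and the toy fragment are hypothetical and are not taken from this review or its Supplemental Material.

```python
def find_motif(seq: str, motif: str = "CGLCG"):
    """Return 1-based start positions of every occurrence of motif in seq.

    Overlapping occurrences are reported because the search resumes
    one residue after each match.
    """
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to residue number
        start = seq.find(motif, start + 1)
    return positions


# Hypothetical fragment for illustration only; not a real mucin sequence.
toy_fragment = "AAPCGLCGNFDGKTTCGLCGQE"
motif_positions = find_motif(toy_fragment)
```

For the toy fragment the motif is found at residues 4 and 16. Assigning a match to the D1-, D2-, or D3-domain would additionally require the domain boundaries, which this sketch does not attempt to define.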
Much progress has been made recently in our understanding of the structure and assembly of secretory mucins, but much work remains for the future. Other members of the mucin family should be identified and their structures and mechanism of assembly into disulfide-bonded multimers elucidated. The pairing of half-cystines to form the many disulfide bonds in the globular domains must be established, and the role of chaperones in folding of these domains must also be determined. The molecular basis for the regulated/polarized transport of mucins should be explored. These kinds of studies will be needed to obtain further insights into the exact biological roles of mucins.