Genomic Organization of the Human Mucin Gene MUC5B
cDNA AND GENOMIC SEQUENCES UPSTREAM OF THE LARGE CENTRAL EXON*
- Jean-Luc Desseyn‡§,
- Marie-Pierre Buisine‡,¶,
- Nicole Porchet‡,¶,
- Jean-Pierre Aubert‡,¶ and
- Anne Laine‡‖
- From the ‡ Unité 377 INSERM, Laboratoire de Recherche Gérard Biserte, Place de Verdun, 59045 Lille Cedex and ¶Laboratoire de Biochimie et de Biologie Moléculaire de l’Hôpital C. Huriez, CHRU, 59037 Lille Cedex, France
Abstract
The complete structure of the DNA encoding the polypeptide chain of human mucin MUC5B has been determined. In this paper, we report the full-length cDNA (3886 bp) and genomic (15,143 bp) sequences upstream of the unusually large central exon of the human mucin gene MUC5B. This region, composed of 29 exons, encodes 1283 amino acid residues. Exon sizes vary from 44 to 262 bp, and intron sizes range from 87 to 1703 bp. We determined the 5′-end of MUC5B by rapid amplification of cDNA ends-polymerase chain reaction experiments, which consistently yielded an amplified product of the same length, and by primer extension experiments. A putative translation start site was found at nucleotide +37. We compared the amino-terminal region of MUC5B with those of pro-von Willebrand factor, MUC2, MUC5AC, and the animal mucins RMuc2, PSM, and FIM-B.1. The primary amino acid sequence, which has a high content of cysteine residues, shows a high degree of similarity with other members of the 11p15 mucin gene family, particularly MUC5AC. The complete genomic organization and both full-length genomic and cDNA sequences of MUC5B have been elucidated. This gene contains 48 exons and encodes 5662 amino acid residues to give a polypeptide with an Mr of approximately 600,000.
Footnotes
- * This work was supported by “le Comité du Nord de la Ligue Nationale contre le Cancer” and “l’Association de Recherche contre le Cancer.” The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number AJ004862 (5′-genomic sequence).
- § Supported by a fellowship from the “Ministère de l’Education Supérieure et de la Recherche.” Present address: Dept. of Pharmacology, P. O. Box 357750, University of Washington, Seattle, WA 98195.
- ‖ To whom correspondence should be addressed: INSERM U-377, 59045 Lille Cedex, France. Tel.: 33-3-20-29-88-59; Fax: 33-3-20-53-85-62; E-mail: laine@lille.inserm.fr.
- vWF: von Willebrand factor
- aa: amino acid(s)
- bp: base pair(s)
- FIM: frog integumentary mucin
- RT-PCR: reverse transcription-polymerase chain reaction
- PSM: porcine submaxillary mucin
- RACE: rapid amplification of cDNA ends
- SMC: sialo-mucin complex
- UTR: untranslated region
- Received April 30, 1998.
- Revision received September 1, 1998.
- The American Society for Biochemistry and Molecular Biology, Inc.











