Genomic Organization of the Human Mucin Gene MUC5B

cDNA AND GENOMIC SEQUENCES UPSTREAM OF THE LARGE CENTRAL EXON*

Abstract

The complete structure of the DNA encoding the polypeptide chain of human mucin MUC5B has been determined. In this paper, we report the full-length cDNA (3886 bp) and genomic (15,143 bp) sequences upstream of the unusually large central exon of the human mucin gene MUC5B. This region, composed of 29 exons, encodes 1283 amino acid residues. Exon sizes vary from 44 to 262 bp, and intron sizes range from 87 to 1703 bp. We determined the 5′-end of MUC5B by rapid amplification of cDNA ends-polymerase chain reaction experiments, which yielded amplified products of the same length, and by primer extension experiments. A putative translation start site was found at nucleotide +37. We compared the amino-terminal region of MUC5B with those of pro-von Willebrand Factor, MUC2 and MUC5AC, and animal mucins, RMuc2, PSM, and FIM-B.1. The primary amino acid sequence, which has a high content of cysteine residues, shows a high degree of similarity with other members of the 11p15 mucin gene family, particularly MUC5AC. The complete genomic organization and both full-length genomic and cDNA sequences of MUC5B have been elucidated. This gene contains 48 exons and encodes 5662 amino acid residues to give a polypeptide with an M_r of approximately 600,000.

Footnotes

  • * This work was supported by “le Comité du Nord de la Ligue Nationale contre le Cancer” and “l’Association de Recherche contre le Cancer.” The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

    The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) AJ004862 (5′-genomic sequence).

  • § Supported by a fellowship from the “Ministère de l’Education Supérieure et de la Recherche.” Present address: Dept. of Pharmacology, P. O. Box 357750, University of Washington, Seattle, WA 98195.

  • To whom correspondence should be addressed: INSERM U-377, 59045 Lille Cedex, France. Tel.: 33-3-20-29-88-59; Fax: 33-3-20-53-85-62; E-mail: laine{at}lille.inserm.fr.

  • Abbreviations: vWF, von Willebrand factor; aa, amino acid(s); bp, base pair(s); FIM, frog integumentary mucin; RT-PCR, reverse transcription-polymerase chain reaction; PSM, porcine submaxillary mucin; RACE, rapid amplification of cDNA ends; SMC, sialo-mucin complex; UTR, untranslated region.
    • Received April 30, 1998.
    • Revision received September 1, 1998.