The Exon Structure of the Human MAGP-2 Gene

A cDNA for human microfibril-associated glycoprotein-2 (MAGP-2) was used to screen a human leukocyte genomic DNA library in EMBL-3 vector. One clone, clone H (10 kilobase pairs (kbp)), was isolated that contained most of the MAGP-2 gene. The remainder of the 3′ end of the gene was obtained by direct polymerase chain reaction amplification of genomic DNA. The human MAGP-2 gene was found to be about 11 kbp in size and to contain 10 evenly distributed exons. The internal exons range in size from 30 base pairs (bp) to 88 bp, with exons 4 and 6 the only exons of equal size (45 bp). All internal intron:exon junctions are defined by canonical splice donor and acceptor sites. Each junction has a 1/2 codon split with the exception of the exon 8/9 junction, which has a 2/1 split. The translation initiation codon is in exon 2, and the final exon contains 110 bp of coding sequence, including 2 cysteine codons. Primer extension experiments identified only one major transcription initiation site, 213 bases upstream of the ATG site. Rapid amplification of cDNA ends-polymerase chain reaction analysis of the 5′ end of MAGP-2 mRNA from placenta confirmed this result and did not detect any alternative splicing of transcripts. The putative promoter region of the MAGP-2 gene was found to be AT-rich, and it lacked a TATA box and other common regulatory elements. However, the sequence surrounding the transcription start site, CTCA(+1)TTCC, was similar to the consensus CTCA(+1)NTCT (N is any nucleoside) for an initiator element found in terminal deoxynucleotidyltransferase and a number of other highly regulated genes. Comparison with the previously characterized human MAGP-1 gene showed that structural similarity was largely confined to the exact size, sequence, and junction alignment of the two penultimate exons, which encode the first six of the seven cysteine residues that are precisely spaced in both proteins.
The findings are consistent with the growing evidence that, although MAGP-1 and MAGP-2 are both intimately involved in the biology of fibrillin-containing microfibrils, the MAGPs are structurally, functionally, and developmentally diverse proteins which share one characteristic cysteine-rich motif.

MAGP-2¹ is a 25-kDa glycoprotein that recently has been identified in specific association with fibrillin-containing microfibrils in the extracellular matrix of a number of tissues (1,2). Fibrillin-containing microfibrils, 10-12 nm in diameter, are important structural components of most connective tissues. In elastic tissues such as artery, lung, and skin, these microfibrils are components of elastic fibers in association with the elastic protein, elastin. In other tissues such as ocular zonule, kidney glomerulus, and muscle, the microfibrils occur as elastin-free bundles (3). Despite being the subjects of intensive investigation, the precise molecular composition, architecture, and structural and functional heterogeneity of the microfibrils are still being elucidated (4,5). The major structural components of the microfibrils are rod-like, 350-kDa glycoproteins named fibrillin 1 and 2 that appear to be arranged as parallel bundles of 4-8 molecules joined in series in a head-to-tail manner (6-10). Fibrillins 1 and 2 have distinct but overlapping spatiotemporal tissue distributions, and it is unclear if they form separate populations of microfibrils or if they can coexist in the same microfibril (9-11). It has been suggested that fibrillin 1 has a persistent structural role, whereas fibrillin 2 is predominantly involved in the early stages of elastic fiber assembly (10). Mutations in the genes for fibrillin 1 and fibrillin 2 have been linked, respectively, to the heritable connective tissue disorders Marfan syndrome and congenital contractural arachnodactyly (12,13).
An increasing number of other proteins have been identified in association with fibrillin-containing microfibrils. These include MAGP 1 and 2 (1,2,14), microfibril-associated proteins 1, 3, and 4 (15-17), latent transforming growth factor β1 binding proteins (18,19), fibulins (20,21), and emilin (22). In most instances, it is unclear if the protein forms part of the microfibril or is adhered to its surface. However, in the case of MAGP-1 there is strong biochemical and immunoelectron microscopic evidence that, in tissues, this 31-kDa glycoprotein is covalently linked by disulfide bonding in a specific periodic manner to most if not all fibrillin-1-containing microfibrils (23-25). This suggests that MAGP-1 may be an integral component of microfibrils of this type. Its structural relative, MAGP-2, is also disulfide-bonded to fibrillin-containing microfibrils, but it exhibits more restricted tissue and developmental patterns of distribution, suggesting that MAGP-2 has a more specialized role in microfibril biology (1,2). Cloning of MAGPs 1 and 2 revealed that they each contain a central region of close sequence similarity between the molecules, including precise alignment of seven cysteine residues (1,26,27). It is considered likely that this cysteine-rich region is involved in the interactions of MAGPs with other components of the microfibril. In contrast, the other regions of MAGP-1 and MAGP-2 were found to be very divergent in structure, and the N-terminal regions of the molecules have also been shown to have distinct molecular binding properties (28-30). In the current study, we have characterized the exon structure of the human MAGP-2 gene and compared it to that of the human MAGP-1 gene described previously (27). Overall, the genes for MAGPs 1 and 2 were found not to be closely related in structure.

Isolation and Characterization of a Genomic DNA Clone for Human MAGP-2
A 318-bp digoxigenin-labeled fragment of human MAGP-2 cDNA was prepared from cDNA clone H61 (1) by PCR amplification and simultaneous incorporation of digoxigenin-11-dUTP (Boehringer Mannheim). The primers used were F1, 5′-CCTGGGGGTCAATAGTCAAC-3′, and R2, 5′-CACAGGGAGGAAGTCGGAAGTAAT-3′, corresponding to bases 282-301 and 677-700, respectively, of the final cDNA sequence. The reaction mixture (40 µl) contained MgCl2 (2 mM); dATP, dCTP, and dGTP (200 µM each); dTTP (130 µM); digoxigenin-dUTP (70 µM); primers (0.1 µM each); and Taq DNA polymerase (0.5 units). The PCR was conducted on a capillary thermal cycler, model FTS-1 (Corbett Research, Sydney, Australia) for 40 cycles of 95, 55, and 72°C, as per the manufacturer's instructions. The cDNA probe was gel-purified and used to screen a human leukocyte genomic library in EMBL-3 vector (CLONTECH) plated at a density of 2 × 10⁴ plaque-forming units/plate. Hybridization to nylon membrane plaque lifts was conducted at 37°C as described previously (18), except that the final wash was conducted at 42°C. Positive plaques were identified with anti-digoxigenin antibodies conjugated to alkaline phosphatase as described previously (26). The genomic DNA inserts were amplified using EMBL-3-specific primers and the Expand Long Template PCR system (Boehringer Mannheim) as per the manufacturer's instructions. Clones were authenticated by successful PCR reamplification of insert fragments with a range of primer pairs consisting of one MAGP-2 cDNA-specific primer and one EMBL-3 primer. Direct sequencing (fmol kit, Promega) of these products was used to confirm the presence of exons from the MAGP-2 gene. Further sequencing of both DNA strands with appropriate primers was used to elucidate the sequences surrounding intron:exon junctions.

Identification of the Transcription Initiation Site
Primer Extension-Total RNA was prepared from human term placentas using Trizol Reagent (Life Technologies, Inc.) following the manufacturer's instructions. A reverse primer, PERA, 5′-TCCGCTGAAACAATGAGATGAGG-3′, corresponding to bases 55-77 of the MAGP-2 cDNA, was synthesized and end-labeled with [γ-32P]ATP (1 µCi/pmol of primer). Labeled primer (5 pmol) was then annealed in water with 3-µg samples of total placental RNA, or tRNA as a control, by heating to 70°C for 10 min followed by cooling on ice. The samples (20 µl) were then incubated at 50°C for 50 min with 200 units of Superscript II reverse transcriptase (Life Technologies, Inc.) in 20 mM Tris-HCl buffer, pH 8.4, containing 50 mM KCl, 2.5 mM MgCl2, 10 mM dithiothreitol, and 400 µM dNTPs. The reaction was terminated by incubation at 70°C for 15 min, and the RNA was digested by treatment with RNase A (10 µg) for 20 min at 37°C. The primer extension products were ethanol-precipitated and analyzed on a 6% denaturing polyacrylamide sequencing gel. To identify the transcription start site, labeled primer PERA was also used to prime (from MAGP-2 genomic DNA) a standard sequencing ladder that was electrophoresed concurrently on the gel.
RACE-PCR-To confirm the position of the transcription start site and to search for possible alternative splicing of the pre-mRNA, RACE-PCR was used to obtain a range of cDNA clones corresponding to the 5′ region of MAGP-2 mRNA from human placental RNA. The 5′ RACE system for rapid amplification of cDNA ends kit (Life Technologies, Inc.) was used following the manufacturer's instructions. Briefly, first-strand cDNA was synthesized from total RNA (1 µg) using Superscript II reverse transcriptase (200 units) and 100 nmol of reverse primer, R7, 5′-AGTCGCTTGAGTCACATCGTCT-3′, corresponding to bases 306-327 of MAGP-2 cDNA. The first-strand cDNA was purified using a GlassMax spin cartridge and oligo(dC)-tailed using terminal transferase. Nested PCR was used to amplify the cDNA through two rounds of 40 cycles of 95, 50, and 72°C. For the first round of amplification, a MAGP-2 cDNA-specific reverse primer, R17, 5′-CCCAGGGGTATCCAGTCAGAGGT-3′ (bases 265-287, spanning the exon 2/3 junction), and an inosine-containing oligo(dC) binding anchor primer, 5′-GGCCACGCGTCGACTAGTACGGGIIGGGIIGGGIIG-3′, were used. For direct sequencing, the second round of PCR utilized primer R17 and abridged anchor primer 5′-GGCCACGCGTCGACTAGTAC-3′. Exon 2-specific reverse primer R11, 5′-TGCAGCAAGAAACAGCAGCACC-3′ (bases 234-255), was used to prime the sequencing reaction. For cloning, the abridged anchor primer and reverse primer R16 SalI, 5′-GACATATTGTCGACGGTGTTCTGCTCTT-3′ (bases 191-218, modified to include the SalI restriction site, underlined), were used for the second round of PCR. The PCR product was digested with SalI and cloned into pBluescript (KS+) vector. A range of individual clones were purified and sequenced as described previously (26).
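The primer coordinates quoted in the procedures above lend themselves to a quick consistency check. The following is a minimal sketch, not part of the original methods: it confirms that each primer's length equals the span of cDNA bases it is said to cover, that the abridged anchor primer is a prefix of the full anchor primer, and that the SalI recognition sequence (GTCGAC) occurs in both primers used for the cloning round. Sequences are transcribed from the text with line-break hyphens removed; the names `PRIMERS`, `span_length`, and `length_mismatches` are illustrative only.

```python
# Primer name -> (sequence 5'->3', first cDNA base, last cDNA base),
# transcribed from the text. The dictionary layout is an assumption
# made for this illustration.
PRIMERS = {
    "F1": ("CCTGGGGGTCAATAGTCAAC", 282, 301),
    "R2": ("CACAGGGAGGAAGTCGGAAGTAAT", 677, 700),
    "PERA": ("TCCGCTGAAACAATGAGATGAGG", 55, 77),
    "R7": ("AGTCGCTTGAGTCACATCGTCT", 306, 327),
    "R17": ("CCCAGGGGTATCCAGTCAGAGGT", 265, 287),
    "R11": ("TGCAGCAAGAAACAGCAGCACC", 234, 255),
    "R16 SalI": ("GACATATTGTCGACGGTGTTCTGCTCTT", 191, 218),
}

ANCHOR = "GGCCACGCGTCGACTAGTACGGGIIGGGIIGGGIIG"  # I = inosine
ABRIDGED_ANCHOR = "GGCCACGCGTCGACTAGTAC"
SALI_SITE = "GTCGAC"  # SalI recognition sequence


def span_length(first, last):
    """Number of bases in an inclusive, 1-based coordinate range."""
    return last - first + 1


def length_mismatches(primers):
    """Return primers whose length differs from their stated coordinate span."""
    bad = {}
    for name, (seq, first, last) in primers.items():
        expected = span_length(first, last)
        if len(seq) != expected:
            bad[name] = (len(seq), expected)
    return bad


if __name__ == "__main__":
    print("length mismatches:", length_mismatches(PRIMERS))
    print("abridged anchor is a prefix of the anchor:",
          ANCHOR.startswith(ABRIDGED_ANCHOR))
    print("SalI site in R16 SalI:", SALI_SITE in PRIMERS["R16 SalI"][0])
    print("SalI site in abridged anchor:", SALI_SITE in ABRIDGED_ANCHOR)
```

A SalI site at both ends of the second-round product is what allows the digested fragment to be ligated into a SalI-cut vector, which is consistent with the cloning step described.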

RESULTS
Exon Organization-Screening of the EMBL-3 human genomic DNA library with a human MAGP-2 cDNA probe led to the identification and purification of a 10-kbp clone, clone H, which yielded PCR products when amplified using several MAGP-2 cDNA-specific oligonucleotide primers. The authenticity of clone H was confirmed when sequencing of the PCR products revealed the presence of MAGP-2 cDNA sequences. Further sequencing revealed that clone H contained all but the extreme 3′ end of the MAGP-2 gene. This region was directly amplified by PCR from genomic DNA using a reverse primer from the 3′-untranslated region of MAGP-2 cDNA. The human MAGP-2 gene was found to be about 11 kbp in size and to contain 10 exons, which are evenly distributed (Fig. 1). Sequencing showed that all internal intron:exon junctions are defined by canonical splice donor and acceptor sites (Fig. 2). Fig. 3 shows the human MAGP-2 cDNA sequence with the exon boundaries marked. Each junction has a 1/2 codon split with the exception of the exon 8/9 junction, which is split 2/1. The internal exons range in size from 30 to 88 bp, with exons 4 and 6 the only exons of equal size (45 bp each). The translation initiation codon is located in exon 2, and the final exon contains 110 bp of coding sequence including two cysteine codons. The 3′ end of exon 10 remains to be defined.
The number and sizes of the exons of the human MAGP-2 gene are compared with those of the human MAGP-1 gene in Table I. The two penultimate exons, 8 and 9, of the MAGP-2 gene match in size with the corresponding exons, 7 and 8, respectively, of the MAGP-1 gene, which contains one less coding exon (27). However, the splicing patterns for the rest of the two genes have no discernible similarities, indicating that the genes are quite divergent in structure. Further analysis showed precise sequence and junction alignment of exons 8 and 9 of the MAGP-2 gene with exons 7 and 8 of the MAGP-1 gene (Fig. 4). These exons encode most of the region of amino acid sequence homology previously identified between MAGP-1 and MAGP-2, where the spacing of seven cysteine residues is precisely conserved between the two proteins (1). The first group of four aligned cysteines are encoded by MAGP-2 exon 8 and MAGP-1 exon 7, which also encodes an additional cysteine. The next two cysteines are aligned in MAGP-2 exon 9 and MAGP-1 exon 8. The remaining region of homology containing the seventh aligned cysteine is encoded in both genes at the extreme 5′ end of the final exon. In the exon 8/exon 7 alignment, sequence similarity is present throughout the exon, whereas in the exon 9/exon 8 comparison there is strong similarity in the 5′ region but little homology in the 3′ region.
Identification of the Transcription Start Site and Sequencing of the Putative Gene Promoter-To identify the transcription start site and thus to define the 5′ end of exon 1, a primer extension experiment was conducted using Superscript II reverse transcriptase and total RNA from human placenta (Fig. 5). A major band was identified in extensions from two different placental RNA samples (Fig. 5, lanes 2 and 3), which was absent from the tRNA control lane (Fig. 5, lane 1). The size of this extension product was determined to be 77 bp by comparison to the sequence ladder obtained from genomic clone H using the same primer. It should be noted that the band corresponds to the base that is one space higher on the sequencing ladder due to the mechanism of sequencing that terminates each chain by the addition of a dideoxynucleotide. This corresponds to a point 213 bases upstream of the ATG translation initiation site in the MAGP-2 cDNA. Three additional bands migrating slightly more slowly than the major band were also identified. However, these bands were also present in the tRNA control lane (lane 1), indicating that they represent nonspecific artifacts in the analysis. Thus only one major transcription start site was evident for the MAGP-2 gene in placental tissue.
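The coordinate arithmetic behind this assignment can be sketched briefly. The example below is illustrative only and rests on stated assumptions: primer PERA is taken to anneal to cDNA bases 55-77 (as given in the procedures), so its 5′ end pairs with base 77, and "213 bases upstream of the ATG" is read here as the offset from the start site to the A of the ATG. The function name `template_5prime_end` is hypothetical.

```python
def template_5prime_end(primer_last_base, product_length):
    """cDNA coordinate of the 5'-most template base copied in primer extension.

    The extension product includes the primer itself, so a product of
    `product_length` bases whose primer 5' end pairs with `primer_last_base`
    reaches back to primer_last_base - product_length + 1.
    """
    return primer_last_base - product_length + 1


# Values taken from the text: primer covers bases 55-77, product is 77 bases.
start_site = template_5prime_end(primer_last_base=77, product_length=77)

# Assumption for illustration: the ATG lies 213 bases downstream of the
# start site, counted to the A of the ATG.
atg_position = start_site + 213

if __name__ == "__main__":
    print("transcription start site (cDNA base):", start_site)
    print("A of ATG under the stated assumption:", atg_position)
```

Under these assumptions the 77-base product places the start site at base 1 of the cDNA numbering, which agrees with the primer coordinates quoted in the procedures.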
The above findings were consistent with the results of the RACE-PCR of MAGP-2 mRNA from placenta (not shown). Direct sequencing of the RACE-PCR product showed clear sequence from exons 1 and 2, which could be read to within 8 bases of the transcription start site identified above. There was no evidence of shadow sequences, which would indicate alternative exon usage. Analysis of more than 20 individual clones prepared from the RACE-PCR product showed that the 5′ end of each was downstream of the identified start site and that most clones terminated at various points within 25 bases of the site. Sequences that diverged from exon 1 sequences were not detected. The results confirm that exon 1 is the first exon of the MAGP-2 gene.
The finalized sequence of exon 1 and 976 bases of upstream sequence are shown in Fig. 6. This 5′-flanking region containing the putative promoter of the MAGP-2 gene was found to be A+T-rich (59%), and it lacked a typical TATA box. However, the sequence around A(+1) of the transcription initiation site showed strong similarity to pyrimidine-rich initiator elements, which encompass the transcription start site of a number of developmentally regulated genes with TATA-less promoters. In particular, the MAGP-2 gene sequence −3 CTCATTCC +5 is similar to the extensively characterized initiator consensus sequence −3 CTCANTCT +5 found in the highly regulated gene, terminal deoxynucleotidyltransferase, and several other TATA-less genes (31-33). The putative promoter and exon 1 were scanned for the presence of potential regulatory elements. No full consensus sequences for common transcription regulatory elements such as Sp1, OCT, or CAAT sites were identified in the region proximal to the initiator sequence using the program TESS (Transcription Element Search Software)² and the Transfac v3.2 data base (34). However, the complementary sequence immediately upstream of the initiator site, −12 GGGAATTTGC −21, has similarity to the consensus, GGGANTTYCC, for binding of regulatory factor NF-κB (35).

DISCUSSION
This study has characterized the human MAGP-2 gene, which we have previously located on chromosome 12 at 12p12.3-13.1 (1), and compared its structure to that of the MAGP-1 gene (27). The two genes were found to be diverse in structure. The MAGP-2 gene was shown to consist of 10 exons, 9 of which contain coding sequence for a total of 173 amino acids (Figs. 1-3). In contrast, the human MAGP-1 gene utilizes 8 coding exons for 183 amino acids (27), indicating that its average exon size is larger (Table I).
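The exon-size comparison can be made concrete with a rough calculation. This is a back-of-envelope sketch rather than a reproduction of Table I: it divides the coding nucleotides implied by each protein length (three per amino acid, stop codon excluded by default) by the number of coding exons quoted above. It ignores untranslated sequence within the first and last coding exons, so the figures only indicate the direction of the difference. The function name is illustrative.

```python
def mean_coding_bp_per_exon(amino_acids, coding_exons, include_stop=False):
    """Average coding nucleotides per coding exon.

    Each amino acid is encoded by 3 nucleotides; optionally add the
    3 nucleotides of the stop codon.
    """
    coding_bp = 3 * amino_acids + (3 if include_stop else 0)
    return coding_bp / coding_exons


# Figures quoted in the text.
magp2 = mean_coding_bp_per_exon(amino_acids=173, coding_exons=9)
magp1 = mean_coding_bp_per_exon(amino_acids=183, coding_exons=8)

if __name__ == "__main__":
    print(f"MAGP-2: {magp2:.1f} coding bp per coding exon")
    print(f"MAGP-1: {magp1:.1f} coding bp per coding exon")
```

On these numbers MAGP-2 averages roughly 58 coding bp per coding exon and MAGP-1 roughly 69, in line with the statement that the MAGP-1 exons are larger on average.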
Structural conservation between the genes was confined to the precise size and splice junction alignment of two penultimate exons, 8 and 9 in the MAGP-2 gene and 7 and 8 in the MAGP-1 gene (Fig. 4A). These two exons encode the first 6 of the 7 precisely aligned cysteine residues in the center of both proteins (Fig. 4B). This region of each protein is probably responsible for a function(s) common to both proteins, including linkage to other components of the fibrillin-containing microfibrils. Neither MAGP-1 nor MAGP-2 has yet been shown to bind to fibrillins, but differential molecular interactions with other molecules have been found involving the structurally divergent N-terminal domains. This region of MAGP-1 binds tropoelastin and collagen VI in vitro, and it is possible that the glycoprotein functions on the surface of the microfibrils as an elastin-binding protein during elastinogenesis and as an anchoring protein mediating the interaction of fibrillin-containing microfibrils and collagen VI microfibrils (28-30). In contrast, MAGP-2 lacks these binding characteristics, but its N-terminal region contains an RGD motif that interacts in vitro with cell surfaces via the αVβ3 integrin,³ suggesting that MAGP-2 may play a role in microfibril-cell interactions. Thus the divergent nature of their exon structures is consistent with the distinct binding functions of the two proteins.
² http://agave.humgen.upenn.edu/tess/index.html.
Although both MAGP-1 and MAGP-2 are specifically associated with fibrillin-containing microfibrils, MAGP-2 has more restricted patterns of tissue expression, being absent from the microfibrils of the ocular zonule and the elastic fibers of the aortic media and having restricted distribution in tissues such as kidney and skin (2). During development of the highly elastic tissue, bovine nuchal ligament, MAGP-2 expression was found to peak during the early stages of elastinogenesis, whereas MAGP-1 was expressed at consistently high levels throughout middle and late fetal development (2). Thus it is evident that the MAGP-1 and MAGP-2 genes are differentially regulated, and this is supported by comparison of the structures of their putative promoter regions. The RACE-PCR experiments detected no alternative splicing of the human MAGP-2 gene in placental tissue, suggesting that the gene contains only one promoter region (not shown). In addition, primer extension experiments detected the presence of only one major transcription initiation site (Fig. 5). Upstream from this site, the putative promoter region is AT-rich and lacks TATA, CAAT, and other common regulatory elements (Fig. 6). Interestingly, the sequence spanning the transcription start site has close similarity to the consensus for initiator elements found in terminal deoxynucleotidyltransferase and a number of other highly regulated TATA-less genes (32,33). Such sequences are able to initiate accurate basal transcription without the presence of other upstream regulatory elements (31). Like the MAGP-2 gene, the terminal deoxynucleotidyltransferase gene lacks common regulatory elements, is not GC-rich, and has one transcription start site. Non-GC-rich, TATA-less promoters are usually highly regulated during development and differentiation and initiate transcription at only one or a few tightly clustered start sites (31). 
Thus the promoter region of the MAGP-2 gene appears to have characteristics that are consistent with its restricted patterns of expression. It is interesting that several additional examples of the pyrimidine-rich initiator consensus were identified in exon 1, and it is unclear why these are not utilized for transcription. It is possible that the NF-κB-like element immediately upstream of the transcription start site is important for its selection and regulation of MAGP-2 gene transcription (Fig. 6). NF-κB/Rel transcription factors (36) have been shown to interact with the initiator protein, TFII-I, and to bind to sequences close to initiator elements in several genes (37).
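The degree of similarity between the MAGP-2 sequences and the two consensus motifs discussed here can be counted position by position. The sketch below is an illustration, not the TESS/Transfac search used in the study: it treats each consensus as a string of IUPAC nucleotide codes (N = any base, Y = C or T) and reports how many positions of the gene sequence are permitted by the consensus. The function names are hypothetical.

```python
# IUPAC nucleotide codes mapped to the bases each code permits.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def consensus_matches(site, consensus):
    """Per-position flags: True where the site base is allowed by the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return [base in IUPAC[code] for base, code in zip(site, consensus)]


def summarize(site, consensus):
    """Return (matching positions, total positions, 0-based mismatch indices)."""
    flags = consensus_matches(site, consensus)
    mismatches = [i for i, ok in enumerate(flags) if not ok]
    return sum(flags), len(flags), mismatches


if __name__ == "__main__":
    # Initiator: MAGP-2 -3 CTCATTCC +5 versus consensus CTCANTCT.
    print("initiator:", summarize("CTCATTCC", "CTCANTCT"))
    # NF-kB-like element: GGGAATTTGC versus consensus GGGANTTYCC.
    print("NF-kB site:", summarize("GGGAATTTGC", "GGGANTTYCC"))
```

By this count the initiator region matches the consensus at 7 of 8 positions and the NF-κB-like element at 9 of 10, which is why the text describes similarity rather than a full consensus match.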
In contrast to the MAGP-2 gene, the human MAGP-1 gene appears to exhibit alternative exon usage at the extreme 5′ end of the mRNA (27), suggesting that its transcription could be initiated from at least two distinct sites and raising the possibility that the gene may be under the control of two distinct promoter regions. The sequences flanking the alternatively used primary exons also lack TATA and CAAT sequences but are relatively GC-rich rather than AT-rich (27). Thus the MAGP-1 gene appears to be regulated by a promoter that can be categorized with a different group of TATA-less promoters. Genes of this GC-rich group often have several transcription initiation sites and are usually widely expressed (31,32). Thus it is apparent that the two human MAGP genes contain different classes of promoters that reflect their patterns of tissue and developmental expression. The distinct structures of the two
³ M. A. Gibson, manuscript in preparation.
FIG. 5. Determination of transcription start site by primer extension. PERA, a 32P-labeled reverse primer from exon 1, was used to reverse transcribe the 5′ end of MAGP-2 mRNA samples prepared from two human term placentas. The products were analyzed on a 6% sequencing gel. Lane 1, tRNA control; lanes 2 and 3, placental RNA. A prominent band corresponding to a 77-bp product was identified in both RNA samples. Note: additional nonspecific bands were present in all three lanes. Far right, corresponding ladder of MAGP-2 gene sequence (complementary strand) arranged left to right in the order ACGT.