Structural Analysis of the Human BIN1 Gene

BIN1 is a putative tumor suppressor that was identified through its interaction with the MYC oncoprotein. To begin to identify elements of BIN1 whose alteration may contribute to malignancy, we cloned and characterized the human BIN1 gene and promoter. Nineteen exons were identified in a region of >54 kilobases, six of which were alternately spliced in a cell type-specific manner. One alternately spliced exon encodes part of the MYC-binding domain, suggesting that splicing controls the MYC-binding capacity of BIN1 polypeptides. Four other alternately spliced exons encode amphiphysin-related sequences that were included in brain-specific BIN1 species, also termed amphiphysin isoforms or amphiphysin II. The 5′-flanking region of BIN1 is GC-rich and lacks a TATA box but directs transcriptional initiation from a single site. A ∼0.9-kilobase fragment from this region was sufficient for basal transcription and transactivation by MyoD, which may account for the high levels of BIN1 observed in skeletal muscle. This study lays the foundation for genetic and epigenetic investigations into the role of BIN1 in normal and neoplastic cell regulation.

* This work was supported by Grants DAMD17-96-1-6324 from the U. S. Army Breast Cancer Research Program and CN160 from the American Cancer Society (to G.C.P.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™
Primer Extension-3 pmol of the oligonucleotide primer ACAGCGGAGCCAACTGAC (PEprimer#2) end-labeled with [γ-32P]ATP was annealed to 10 µg of WI-38 total cytoplasmic RNA for 12 h at 58°C in hybridization buffer (40 mM PIPES, pH 6.4, 1 mM EDTA, 0.4 M NaCl, 80% formamide). The annealed RNA was ethanol-precipitated and resuspended in 20 µl of RT reaction buffer (see above) plus 50 µg/ml actinomycin D (BMB). The reaction mixture was incubated for 1 h at 42°C and stopped by the addition of sequencing gel loading buffer. A standard [α-35S]dATP DNA sequencing reaction was performed with the same primer using clone p31.2 as the template (see Fig. 1). The primer extension and sequencing products were cofractionated on a 12% DNA sequencing gel and autoradiographed for 15 min (extension reaction) or overnight (sequencing reaction).
Transcription Assays-pGL-Bgl was generated by subcloning an 886-bp BglII fragment including proximal 5′-flanking sequences from p31.2 into the luciferase reporter plasmid pGL2-basic (Promega). The hosts for transfection, C2C12 cells or 10T1/2 cells (a gift of D. Goldhamer), were cultured in Dulbecco's modified Eagle's medium containing 15% fetal calf serum and penicillin/streptomycin. Cells (4 × 10^4/60-mm dish) were transfected (9) with 5 µg of pGL-Bgl or pGL2-basic, 4.5 µg of pSK+ (Invitrogen), and 0.5 µg of a β-galactosidase expression vector. For MyoD activation, a murine MyoD expression construct under control of the β-actin promoter (a gift of D. Goldhamer, University of Pennsylvania) was substituted for pSK+. Cells were harvested and processed for luciferase activity two days posttransfection using a commercial kit (Promega).

BIN1 Gene Structure and Exon-Intron Organization-A physical map of the human BIN1 gene was constructed from a set of phage clones isolated from a WI-38 diploid fibroblast genomic library (see Fig. 1). To identify exons and exon-intron boundaries, the DNA sequence determined from six genomic segments (GenBank™ accession numbers U83999-U84004) was compared with BIN1 cDNA sequences from several sources, including the original BIN1 clone, RT-PCR products from human RNAs, and the DNA data base (1, 8, 10-12). With the exception of exon 1, all exons were located within a ∼38-kb contig. An additional noncontiguous clone contained exon 1 and 5′-flanking sequences, with the latter extending ∼3 kb upstream of the RNA cap site (see below). Given the structure of this clone, the size of intron 1 would be inferred to be at least 17 kb. Thus, we concluded that the human BIN1 gene spanned a minimum of 54 kb.
The DNA sequences of each exon and its proximal introns are shown in Fig. 2. Based on the characteristics they encode, the BIN1 exons can be grouped into four sets, termed the BAR (BIN1/amphiphysin/RVS167-related), unique, brain-specific, and protein-protein interaction sets, respectively. Exons 1-8 encode the BAR domain of BIN1 (1). In this group, exon 1 included a different 5′-UTR and N-terminal coding sequence (MAEMGSKG) compared with the original BIN1 cDNA (MLWNV) (1). The genomic sequence was judged to accurately represent the 5′-end of the BIN1 mRNA, because (i) expressed sequence tag and cDNA sequences identical to the genomic sequence but not to the 5′-end of the original BIN1 cDNA were present in the DNA data base; (ii) cDNAs whose structure matched the original cDNA clone could not be identified by RT-PCR in any tissue; and (iii) the 5′-end of the cDNA was found to contain an inversion of 64 bp derived from the middle of the cDNA (previously missed because the inversion fortuitously contained a translation initiation site). Exons 9-11 encode a unique region of BIN1 that is functionally undefined and unrelated to amphiphysin and RVS167. The unique-1 (U1) and unique-2 (U2) regions are encoded by exons 9 and 11, respectively, and are separated by a nuclear localization-like motif encoded by exon 10. Exons 12A-12D encode amphiphysin-related sequences that were not found in the original BIN1 cDNA (see Fig. 2B). These exons are spliced into larger isoforms of BIN1 message detected in brain and muscle, alternately termed amphiphysin isoforms or amphiphysin II (1, 10-12) (see below). Exons 13-16 encode the C-terminal region of BIN1 implicated in protein-protein interactions. Exons 13-14 and 15-16 encode the Myc-binding domain and the Src homology 3 (SH3) domain, respectively, the latter of which is also a feature of amphiphysin and RVS167 (1).
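The four exon sets described above can be written down as a small lookup table. The following is a minimal sketch: the exon labels and group names come from the text, whereas the dictionary layout and the helper names (`EXON_GROUPS`, `all_exons`, `group_of`) are illustrative assumptions.

```python
# Exon groups as described in the text. The data layout and function
# names are hypothetical and chosen only for illustration.
EXON_GROUPS = {
    "BAR": [str(n) for n in range(1, 9)],                      # exons 1-8
    "unique": ["9", "10", "11"],                               # U1, NLS-like motif, U2
    "brain-specific": ["12A", "12B", "12C", "12D"],            # amphiphysin-related
    "protein-protein interaction": ["13", "14", "15", "16"],   # Myc-binding + SH3
}


def all_exons(groups):
    """Flatten the groups into one ordered list of exon labels."""
    return [exon for exons in groups.values() for exon in exons]


def group_of(exon, groups=EXON_GROUPS):
    """Return the name of the group that contains the given exon label."""
    for name, exons in groups.items():
        if exon in exons:
            return name
    raise KeyError(exon)


if __name__ == "__main__":
    print(len(all_exons(EXON_GROUPS)))  # total number of exons in the table
    print(group_of("12B"))
```

Counting the entries gives 8 + 3 + 4 + 4 exons, which matches the nineteen exons reported for the gene.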
Alternate Splicing of BIN1 RNA-To examine patterns of BIN1 splicing, RT-PCR was performed using RNAs isolated from three human cell lines, WI-38 diploid fibroblasts, HeLa cervical carcinoma, and Rh30 rhabdomyosarcoma (a muscle tumor line). The observations in human cells were extended using RNAs isolated from a set of normal murine tissues. All cells and tissues examined were previously shown to express BIN1 RNA by Northern analysis (1). Oligonucleotide primers derived from BIN1 sequences were used for RT-PCR of fragments spanning exons 3-7 (5′-end), 7-11 (midsection), and 11-16 (3′-end). For analysis of murine RNAs, oligonucleotide primer sequences were derived from the sequence of SH3P9, a murine BIN1 cDNA (GenBank™ accession number U60884; and Ref. 8). Products from these reactions were fractionated on agarose gels, blotted, and hybridized to a BIN1 cDNA probe or subcloned and sequenced.
Amplification of the 5′-end of BIN1 yielded a single product in all cell lines examined, indicating that this region was not subjected to alternate splicing in these cells. In contrast, amplification of the central and 3′ regions revealed several alternate splicing events. In the central region, two products of similar abundance were observed in WI-38 fibroblasts that differed in the presence or absence of exon 10 sequences. Exon 10 sequences were not detected in BIN1 messages from any of the other tissues or cell lines examined, suggesting that this splice form was relatively uncommon and thus likely to be regulated. Amplification of the 3′-end revealed additional splice forms. In all cell types, two products of similar abundance were detected that differed in the presence of exon 13, which encodes part of the Myc-binding domain (1).² The coordinate appearance of each species suggested that exon 13 was alternately spliced but in an unregulated fashion. Additional species that included sequences derived from exons 12A-12D were detected in murine brain (E9.5 RNA also included one of these species). Interestingly, whereas exon 12A-12D sequences were not detected in any other normal murine tissues, exon 12A was included in RNA species in each of the established human cell lines. It was unclear whether the difference in exon 12A splicing reflected tissue-specific regulation, cell line establishment, or neoplastic transformation. Nevertheless, taken together with the brain-specific events, this observation suggested that splicing of exons 12A-12D was uncommon in most tissues and thus may be regulated.
To determine which combinations of exons appeared in various BIN1 RNAs, we performed RT-PCR using exon 9 and 16 primers (which span all the alternately spliced exons) and subcloned and sequenced the products. We confined this analysis to RNAs isolated from WI-38 and HeLa cells, where exons 10, 12A, and 13 are alternately spliced, to focus on the events in proliferating rather than postmitotic cells. The results, which are summarized in Fig. 3B, showed that seven of the eight RNA species theoretically possible in these cell lines were in fact generated (the one that was not detected was the +12A−13 species). We concluded that exons 10, 12A-12D, and 13 of BIN1 were alternately spliced and that splicing of exons 10 and 12A-12D was likely to be regulated.
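The figure of eight theoretically possible species follows from three exons that are each independently included or skipped (2 × 2 × 2 = 8). A minimal sketch of that enumeration is given here; the label format and the function name `splice_combinations` are assumptions made for illustration. Because the text names the undetected species only as +12A−13, without stating the status of exon 10, the sketch lists all combinations and does not try to mark which one was absent.

```python
from itertools import product

# Exons reported as alternately spliced in WI-38 and HeLa cells.
ALTERNATE_EXONS = ("10", "12A", "13")


def splice_combinations(exons=ALTERNATE_EXONS):
    """Return every include/skip pattern as a label such as '+10 -12A +13'."""
    labels = []
    for pattern in product((True, False), repeat=len(exons)):
        parts = [
            ("+" if included else "-") + exon
            for exon, included in zip(exons, pattern)
        ]
        labels.append(" ".join(parts))
    return labels


combos = splice_combinations()
print(len(combos))  # 2 ** 3 combinations
for label in combos:
    print(label)
```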
Definition of 5′-flanking Sequences Sufficient for Basal Transcription and MyoD Activation-Definition of the BIN1 promoter was of interest for two reasons. First, previous work suggested that epigenetic mechanisms might underlie the loss of BIN1 expression in breast tumor cells (1). Therefore, characterization of the BIN1 promoter would permit an examination of tumor DNA for alterations in DNA methylation or transcription factor interactions that might account for loss of expression. Second, we have observed previously that BIN1 is expressed at high levels in skeletal muscle and murine C2C12 myoblasts (1, 14). For this reason, we predicted that the BIN1 promoter might be activated by MyoD, a master regulator of muscle cell differentiation (15).
To identify the BIN1 promoter, it was first necessary to pinpoint the site(s) of transcription initiation. To this end, primer extension analysis was performed on RNA from WI-38 diploid fibroblasts. By comparing the genomic sequence with that of a murine BIN1 cDNA (8), which has a long 5′-UTR, we chose a primer that was likely to hybridize within 100 nucleotides of the RNA cap site. RT-mediated primer extension yielded a 33-nucleotide product (see Fig. 4). Using this product together with the DNA sequence of the 5′-flanking region generated with the same primer, we mapped the 5′-end of BIN1 RNA in WI-38 cells to the guanine residue designated +1 in Fig. 5.
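The mapping step is simple coordinate arithmetic: the extension product runs from the 5′ nucleotide of the primer to the 5′-end of the RNA, so its length fixes the position of the cap site relative to the primer. The sketch below illustrates this under two assumptions that are not taken from the paper: coordinates are 0-based positions on the sense strand, and no intron lies between the primer and the cap site. The function name and the example coordinate (132) are hypothetical; only the 33-nucleotide product length comes from the text.

```python
def cap_site_index(primer_5prime_index, product_length):
    """Return the 0-based sense-strand index of the RNA 5' end.

    primer_5prime_index: sense-strand index of the base paired with the
        primer's 5' nucleotide (the 3'-most template base the primer covers).
    product_length: length of the extension product in nucleotides,
        primer included.
    """
    if product_length < 1:
        raise ValueError("product_length must be positive")
    index = primer_5prime_index - product_length + 1
    if index < 0:
        raise ValueError("product extends past the start of the sequence")
    return index


# Hypothetical coordinate for the primer; 33 nt is the reported product length.
print(cap_site_index(132, 33))  # 100
```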
Determination of the genomic sequence upstream of the RNA cap site indicated that the 5′-flanking region was GC-rich and lacked a TATA box but contained a consensus binding site for TATA-binding protein at −79 (see Fig. 5). Supporting the possibility that MyoD may regulate BIN1, a consensus recognition site for MyoD was located at −238. Consistent with a possible promoter function, computer search algorithms identified consensus sites for several other transcription factors in this region (data not shown). Finally, among the 694 nucleotides upstream of exon 1, ∼18% were composed of CpG dinucleotides, indicating the presence of a CpG island typical of many TATA-less promoters. These observations suggested that the 5′-end of the BIN1 gene that we cloned might contain a functional promoter.
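CpG richness of a flanking sequence can be summarized with a few counts. The sketch below is illustrative only: the text does not say how the ∼18% figure was computed, so the code reports the fraction of bases that lie within CpG dinucleotides alongside the widely used observed/expected CpG ratio, (number of CpG × length) / (number of C × number of G). The function name `cpg_stats` and the toy sequence are assumptions; the actual BIN1 sequence is not reproduced here.

```python
def cpg_stats(seq):
    """Return simple CpG statistics for a DNA sequence (sense strand)."""
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence too short")
    c = seq.count("C")
    g = seq.count("G")
    # "CG" cannot overlap itself, so str.count finds every CpG dinucleotide.
    cpg = seq.count("CG")
    return {
        "length": n,
        "gc_fraction": (c + g) / n,
        "cpg": cpg,
        # Fraction of bases that are part of a CpG dinucleotide.
        "cpg_base_fraction": 2 * cpg / n,
        # Observed/expected ratio; defined as 0.0 when C or G is absent.
        "obs_exp": (cpg * n) / (c * g) if c and g else 0.0,
    }


# Toy sequence for illustration only, not BIN1 sequence.
stats = cpg_stats("ACGTCGCGAT")
print(stats)
```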
The transcriptional potential of the 5′-flanking region was tested in a transient transfection assay. An 886-bp BglII restriction fragment was cloned into the luciferase reporter plasmid pGL2-basic, allowing transcription to be initiated at the BIN1 cap site. The resulting plasmid, pGL2-Bgl, was transfected into C2C12 myoblasts, which express high levels of endogenous BIN1 RNA (14). Within two days after transfection, pGL2-Bgl exhibited ∼100-fold greater activity than the control plasmid pGL2-basic (see Fig. 6A).
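Fold activity in a reporter assay of this kind is usually computed as a ratio of normalized readings. The sketch below assumes that luciferase readings are normalized to the cotransfected β-galactosidase activity before the ratio is taken; the text does not state the normalization explicitly, so this is an assumption. The function names and all numeric readings are hypothetical and are not data from this study.

```python
def normalized_activity(luciferase, beta_gal):
    """Luciferase reading divided by the beta-galactosidase reading."""
    if beta_gal <= 0:
        raise ValueError("beta_gal must be positive")
    return luciferase / beta_gal


def fold_over_control(sample, control):
    """Fold activity of a reporter relative to the empty-vector control.

    Both arguments are (luciferase, beta_gal) tuples from one transfection.
    """
    control_norm = normalized_activity(*control)
    if control_norm == 0:
        raise ValueError("control activity is zero")
    return normalized_activity(*sample) / control_norm


# Hypothetical readings, for illustration only.
reporter = (12000.0, 1.5)   # promoter-containing reporter
empty = (400.0, 2.0)        # promoterless control vector
print(fold_over_control(reporter, empty))  # 40.0
```

Normalizing each dish to an internal control before taking the ratio corrects for differences in transfection efficiency between dishes.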
To determine whether pGL2-Bgl included sequences that were sufficient for regulated expression, the plasmid was introduced with or without a MyoD expression vector into 10T1/2 fibroblasts (which do not express MyoD but differentiate into myoblasts in response to it). As a positive control for MyoD responsiveness, a second set of transfections used a luciferase reporter driven by a mutated ornithine decarboxylase promoter (ODCΔSmut-luc) containing a MyoD E box response element.³ We observed that the activity of both reporters was increased up to ∼7-fold over the basal level by MyoD cotransfection (see Fig. 6B). The effect was dose-dependent: higher ratios of MyoD to reporter plasmid increased reporter activity. We concluded that the 5′-flanking sequences of the BIN1 gene constituted a promoter sufficient for directing transcription in myoblasts.
³ J. Cleveland, unpublished results.
... Fig. 1 is presented. The RNA cap site is indicated by +1; exon 1 is shown in uppercase letters. Single underlining indicates the antisense sequence of the oligonucleotide used for the primer extension experiment. Double underlining indicates a potential binding site for TATA-binding protein (TBP) at −79 identified by MatInspector (13), and a single E box consensus binding site for MyoD at −237 identified by visual inspection.
FIG. 6. Promoter activity of 5′-flanking sequences. A, basal transcription. C2C12 cells were transfected with 5 µg of the luciferase reporters indicated and processed 48 h later as described under "Materials and Methods." pGL2-basic is the no insert control vector. pGL-Bgl contains the ∼0.9-kb BglII-BglII fragment whose sequence is shown in Fig. 5. B, MyoD activation. 10T1/2 cells were transfected and processed as above except that the transfected DNAs included the indicated amounts of a MyoD expression vector (0, 0.25, or 2.5 µg). ODCΔmut-luc is a positive control reporter for MyoD responsiveness (see text). The data represent the average of two trials.
DISCUSSION
We have characterized the structure and some of the regulatory features of the human BIN1 gene. Nineteen exons were identified within a ≥54-kb region of DNA previously mapped to chromosome 2q14 (5). The primary BIN1 transcript was found to be extensively spliced, resulting in at least seven different species in proliferating cells and an even larger number in postmitotic cells of the brain. Characterization of the BIN1 promoter defined a region sufficient to direct inducible transcription in muscle cells where BIN1 is highly expressed. Thus, BIN1 is subjected to tissue-specific regulation at the levels of transcription and splicing.
Exons 10 and 13 were two of the three exons found to be alternately spliced in proliferating cells. Inclusion of exon 10 was relatively uncommon: it appeared only in messages from WI-38 cells and skeletal muscle (the source of the original BIN1 cDNA). Exon 10 encodes a basic amino acid-rich region that closely resembles a nuclear localization signal. However, this region may not act as a nuclear localization signal, because we have observed recently that its presence is neither necessary nor sufficient for nuclear localization (14, 16). Therefore, exon 10 splicing probably has other implications. Exon 13 splicing was ubiquitous but apparently unregulated, because approximately similar quantities of +13 and −13 RNA species were detected in all cells examined. Since this exon encodes a significant part of the Myc-binding domain, it is likely that its alternate splicing affects the MYC-interacting potential of BIN1. Taken together, our results raise the interesting possibility that two classes of BIN1 polypeptides exist in cells, one that can interact with MYC and one that cannot.
Four exons identified in this study, 12A-12D, were not included in the original BIN1 cDNA but were detected by RT-PCR in a subset of messages in the brain. The existence of brain-specific exons was suggested previously by Northern analysis, which revealed a larger message(s) in the brain in addition to the ubiquitously expressed smaller species (1). Our findings confirmed those of others who have recently identified exon 12A-12D sequences in brain and muscle cDNA species, alternately termed amphiphysin isoforms or amphiphysin II (10-12). Another cDNA species identified by these workers implies the presence of an additional 93-bp brain-specific exon in intron 6 (which is unrelated to amphiphysin or RVS167); however, for unknown reasons, we were unable to confirm its presence in the expected location either by RT-PCR or by direct DNA sequencing. Exons 12A-12D encode sequences related to amphiphysin, so their inclusion would be expected to increase the amphiphysin-like character of BIN1. Alternate splicing of exons 12A-12D in a subset of brain messages may therefore provide a mechanism to augment or vary certain amphiphysin functions in neurons while retaining BIN1 functions in the same cell.
Interestingly, we found that exon 12A was spliced into a subset of messages in the human cell lines WI-38, HeLa, and Rh30, but not into messages in normal nonneuronal tissues. The significance of exon 12A splicing in these cells is unclear. However, since we did not detect the +12A−13 isoform in cells, an interesting possibility is that +12A and −13 isoforms are functionally redundant (that is, they each lack the ability to interact with and inhibit the oncogenic properties of MYC). If so, the appearance of +12A isoforms in WI-38 and HeLa cells may reflect an aberrant splicing event that relieves MYC inhibition by BIN1 (1), thereby promoting immortalization or establishment. In general, the extensive splicing we have documented in BIN1 opens the possibility that splice site mutations or altered splicing via epigenetic mechanisms may be germane to tumorigenesis. Two important goals of future work will be to (i) assess the activities of different splice forms of BIN1 for MYC interaction, cell localization, and inhibition of neoplastic cell growth, and (ii) determine whether there are altered splice patterns in tumor cells that could compromise the growth inhibitory activity of BIN1.
The BIN1 promoter is characterized by a high CpG content but otherwise exhibits the features of a housekeeping promoter. We showed that the muscle determination factor MyoD can up-regulate the BIN1 promoter and identified an E box site that might mediate MyoD-induced activation. Transcriptional activation by MyoD and/or other helix-loop-helix proteins may contribute to the strong up-regulation of BIN1 levels in differentiated neurons and muscle cells (1, 10). Because the BIN1 promoter is rich in CpG residues, it may be highly susceptible to alterations in CpG methylation status, which are common in cancer cells and form the basis for loss of some tumor suppressors such as p16INK4 in lung cancers (17). Other mechanisms by which promoter activity might be altered in cancer cells include genetic mutations or epigenetic changes in the activity of transcriptional regulatory factors. The stage is now set to determine the basis for the frequent loss of BIN1 expression in certain solid tumors such as breast carcinoma (1).