SAFB2, a new scaffold attachment factor homolog and estrogen receptor corepressor.

We have characterized previously the nuclear matrix protein/scaffold attachment factor (SAFB) as an estrogen receptor corepressor and as a potential tumor suppressor gene in breast cancer. A search of the human genome for other potential SAFB family members revealed that KIAA0138 (now designated as SAFB2) has high homology to SAFB (now designated as SAFB1). SAFB1 and SAFB2 are mapped adjacent to each other on chromosome 19p13.3 and are arranged in a bidirectional divergent configuration (head to head), being separated by a short (<500 bp) GC-rich intergenic region that can function as a bidirectional promoter. SAFB1 and SAFB2 share common functions but also have unique properties. As shown previously for SAFB1, SAFB2 functions as an estrogen receptor corepressor, and its overexpression results in inhibition of proliferation. SAFB1 and SAFB2 interact directly through a C-terminal domain, resulting in additive repression activity. They are coexpressed in a number of tissues, but unlike SAFB1, which is exclusively nuclear, SAFB2 is found in the cytoplasm as well as the nucleus. Consistent with its cytoplasmic localization, we detected an interaction between SAFB2 and vinexin, a protein involved in linking signaling to the cytoskeleton. Our findings suggest that evolutionary duplication of the SAFB gene has allowed it to retain crucial functions, but also to gain novel functions in the cytoplasm and/or nucleus.

Scaffold attachment factor B (SAFB, also named HET and HAP in the literature) was originally identified based on its ability to bind to scaffold/matrix attachment regions (1) and as a protein binding to the small heat shock protein hsp27 gene promoter (2). SAFB is a large protein (130 kDa) with a number of putative functional domains. The function of many of these domains is still unclear but can be inferred from the roles of similar domains in other proteins. The N terminus contains an SAF box (3) (also called an SAP domain (4)), which is a homeodomain-like DNA binding motif. This motif is believed to play a role in chromatin organization and specifically in organizing the interaction between nuclear matrix proteins and scaffold/matrix attachment regions. SAF boxes are found in proteins involved in very diverse processes such as transcription, RNA processing, apoptotic chromatin degradation, and DNA repair (3). Amino acids 409-484 harbor an RNA recognition motif (RRM), which is often found in mRNA-processing proteins. SAFB can interact with a number of proteins from the RNA-processing machinery, such as AUF1/hnRNP D, hnRNP A1, htra2-β1, ASF/SF2, SRp30c, and CLK2 (5-7). Consistent with this, SAFB is able to alter the splice site selection of an E1A minigene (7). Because SAFB was also shown to interact with the C-terminal domain of RNA polymerase II, it has been suggested to be part of a "transcriptosome" complex, coupling transcription and RNA processing (7). SAFB has a nuclear localization signal (NLS), and using biochemical fractionation experiments, we and others have shown that SAFB is a nuclear protein that copurifies with chromatin (1) and nuclear matrix protein fractions (2).
Previously we have published a number of studies showing that SAFB plays an important role in human breast cancer. First, it functions as an estrogen receptor (ER) corepressor (8). ER and SAFB interact both in vitro and in vivo, and overexpression of SAFB results in inhibition of ER activity. Second, overexpression of SAFB results in growth inhibition and generation of multinucleated cells. Finally, we discovered an exceptionally high rate of loss of heterozygosity (9) at the SAFB chromosomal locus on 19p13 in human breast cancer (8). Importantly, we have also identified SAFB mutations in those tumors but not in the adjacent normal tissue (9).
Given that SAFB is a multifunctional gene that seems to play an important role in breast tumorigenesis, we set out to determine whether SAFB would also be part of a larger family. Searching GenBank we identified one gene that showed high homology to SAFB, termed KIAA0138. Herein we propose the nomenclature SAFB2 for KIAA0138 and SAFB1 for SAFB/HET/HAP. In this study we define similarities and differences in sequences and domain structures, chromosomal localizations, expression and subcellular localizations, and functions of SAFB1 and SAFB2. We believe that SAFB1 and SAFB2, like many other genes in the human genome, probably arose through duplication of a single ancestral gene.

MATERIALS AND METHODS
Plasmid Constructs and Chemicals-The cloning of the HET/SAF-B (SAFB1) expression construct (2), of pEGFP-SAFB1 (11), and of the mammalian expression vector for full-length ER (12) has been described previously. The human KIAA0138 cDNA (clone HA03743) (SAFB2) was kindly provided by the Kazusa DNA Research Institute (Kisarazu, Chiba, Japan). The 3,233-bp fragment with 57-bp poly(A) stretches was inserted into the EcoRV-NotI site of the Bluescript SK+ vector (Stratagene, La Jolla, CA). To generate an expression vector, the cDNA was cut with the restriction enzymes KpnI and NotI and cloned into the KpnI and NotI sites of the pcDNA3 vector (Invitrogen). pEGFP-SAFB2 was constructed by excising full-length SAFB2 from pcDNA3-SAFB2 with NotI, which was made blunt, and digested with KpnI. SAFB2 was then ligated into the KpnI and a blunt BamHI site of pEGFP-C1 (BD Biosciences).
The putative SAFB1/2 promoter fragment was amplified by PCR using platinum Taq PFX (Invitrogen) from genomic DNA (MCF-7) with the following primers containing XhoI sites: forward 5′-GGA ACT GCA GGT CTT CGC CAC CGA CTC AGT CG-3′ and 5′-GGA ACT GCA GCC GCC CAC TTT CCA CAG AAG-3′. The fragment was cloned in both orientations into the XhoI site of the pGL3 Basic reporter plasmid (Promega) upstream of the luciferase gene. The clones were sequenced to verify both the orientation of the insert and sequence identity with the corresponding NCBI human genome sequences.
All other constructs were verified using restriction digests and/or extension capillary sequencing (SeqWright, Houston, TX). All primers were purchased from MWG Biotech Inc. (High Point, NC), and PCRs were carried out using the MJ Research PT-200 Peltier Thermal Cycler (Waltham, MA). Chemicals were purchased from Sigma unless stated otherwise. Restriction and general molecular biology enzymes were purchased from Invitrogen.
Cell Culture, Stable and Transient Transfection-Human breast cancer cell lines (MCF-7, MDA-MB-231, and T47D), a human osteosarcoma cell line (Saos-2 Endo), a human myeloblastoma cell line (KG-1), a transformed embryonic kidney cell line (293), and a monkey kidney cell line (CV-1) were maintained in improved modified Dulbecco's medium supplemented with 10% fetal bovine serum (Hyclone, Logan, UT), 200 units/ml penicillin, 200 μg/ml streptomycin, and 6 ng/ml insulin. For reporter assays, cells were transiently transfected using FuGENE 6 (Roche Applied Science) following the manufacturer's protocol. One day before transfection, cells were plated at 1.5 × 10⁵ cells/well in six-well plates. For estradiol induction experiments, the cells were plated in serum-free medium, which consisted of phenol red-free improved modified Dulbecco's medium, 10 mM HEPES pH 7.4, 1 μg/ml fibronectin (Invitrogen), trace elements (Biofluids), and 1 μg/ml transferrin (Invitrogen). Cotransfections were performed using 1 μg of reporter plasmid (ERE-TK-LUC), 25 ng of β-galactosidase expression vector, and expression plasmids as indicated in the figure legends for each experiment. 24 h after transfection, the medium was replaced with serum-free medium containing the appropriate ligand. 48 h later cells were washed twice with PBS, and luciferase activity was measured using the luciferase kit from Promega (Madison, WI). β-Galactosidase activity was measured as described previously (2), and the luciferase activities were normalized by dividing by the β-galactosidase activity to give relative luciferase units.
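The normalization described above (luciferase activity divided by the β-galactosidase activity from the same well) can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the sample readings are hypothetical and are not taken from the study, and the fold-change helper is an added convenience rather than a step stated in the methods.

```python
def relative_luciferase_units(luciferase, beta_gal):
    """Divide each raw luciferase reading by the beta-galactosidase
    reading from the same well (transfection-efficiency control)."""
    if len(luciferase) != len(beta_gal):
        raise ValueError("each luciferase reading needs a matching beta-gal reading")
    normalized = []
    for luc, gal in zip(luciferase, beta_gal):
        if gal <= 0:
            raise ValueError("beta-galactosidase activity must be positive")
        normalized.append(luc / gal)
    return normalized


def fold_change(treated_rlu, control_rlu):
    """Ratio of a treated well to its control well, both already normalized."""
    if control_rlu <= 0:
        raise ValueError("control value must be positive")
    return treated_rlu / control_rlu


# Hypothetical readings (arbitrary units), for illustration only:
# well 0 = vehicle, well 1 = estradiol.
raw_luciferase = [1000.0, 3000.0]
raw_beta_gal = [2.0, 3.0]
rlu = relative_luciferase_units(raw_luciferase, raw_beta_gal)
induction = fold_change(rlu[1], rlu[0])
```

Dividing by the cotransfected control before comparing wells keeps differences in transfection efficiency from being read as differences in reporter activity.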
For FACS analysis, cells were harvested 48 h after transfection, washed, and fixed in 70% ethanol. Immediately before analysis on a FACS STAR PLUS (BD Biosciences), propidium iodide and RNase were added to final concentrations of 0.1 and 0.5 mg/ml, respectively. Data were analyzed using CellQuest software.
To assay the activity of the putative SAFB1/2 promoter elements, 24 h before transfection, 1.5 × 10⁵ MCF-7 cells were plated/well in a six-well plate in improved modified Dulbecco's medium. Cells were transfected using FuGENE 6 reagent with 1 μg of promoter or control vectors (empty pGL3 vector, TK-LUC) along with 25 ng of β-galactosidase control. Cells were harvested 36 h later, and luciferase and β-galactosidase activity were measured as described above.
To generate doxycycline-inducible SAFB1-overexpressing cells, MCF-7 cells were transfected with pUHD172-1neo plasmid (13) encoding a fusion product of the VP16 activation domain of herpes simplex virus and a mutated Escherichia coli tetracycline repressor protein.
Neomycin-resistant clones were tested for responsiveness to doxycycline, a tetracycline derivative, by transient transfection of a luciferase reporter gene. A parental clone (RTA-16) was subsequently transfected with HA-tagged SAFB1 subcloned into pTRE2-hygro plasmid (Clontech) downstream from the tetracycline resistance operator and a minimal cytomegalovirus promoter. Stable clones were selected in the presence of 200 μg of hygromycin and tested for inducible expression of HA-SAFB1 by immunoblotting using anti-HA and anti-SAFB antibodies.
Reverse Transcription-PCR (RT-PCR) and RNase Protection Assay-Total cellular RNA from cell lines was extracted using the RNeasy kit (Qiagen, Valencia, CA). Human tissue RNA for RNase protection assay analysis was purchased from Clontech. Reverse transcription of the RNA was performed in a final volume of 20 μl using 2 μg of RNA and 25 pmol of primer (5′-GTG GCC ATG GCG CTC ATC TCC-3′). The reaction mixture was incubated at 70°C for 5 min and then put immediately on ice. 4 μl of 5× incubation buffer, 40 units of RNase inhibitor, 625 μM each dNTP, and 200 units of Moloney murine leukemia virus reverse transcriptase (Promega) were added, and the reaction was incubated at 42°C for 1 h. Subsequent PCR was performed using SAFB2-specific primers (5′-CAA AGG GAG AGA GAG CGC CAG-3′ and 5′-GTG GCC ATG GCG GTC ATC TCC-3′) and 200 μM dNTPs, 1.5 mM MgSO4, 0.5× enhancer solution, 10× Pfx DNA polymerase buffer (Invitrogen), and 1.25 units of platinum Pfx DNA polymerase. The RNase protection assay was performed as described previously (11,14). In addition to the SAFB1 (11) and 36B4 (15) probes, we also used an SAFB2-specific probe that was generated by cutting the SAFB2-pBSII SK+ with HindIII restriction enzyme. The released 700-bp fragment was cloned into pGEM-11zf (Promega), and the probe was made using T7 polymerase.
To produce SAFB2-specific antibodies, the SAFB2 peptide GLLDSFCDSKEYVAAQLRQ, corresponding to amino acids 154-172, was used to produce rabbit polyclonal anti-SAFB2 antiserum (Research Genetics, Huntsville, AL), which was subsequently purified using immobilized peptide. This peptide is within a region of low homology between SAFB1 and SAFB2.
For immunofluorescence, 5 × 10⁴ cells (293, HeLa, MCF-7) were seeded onto poly-L-lysine-coated coverslips (Biocoat) (BD Biosciences) in 24-well plates 1 day before transfection. The cells were then transfected with GFP-vinexin β (17) using LipofectAMINE Plus reagent (Invitrogen) overnight. The next day the medium was changed to Iscove's modified Dulbecco's medium + 10% fetal bovine serum, and cells were allowed to recover for 24 h. Cells were washed twice with PBS, fixed for 30 min in PEM (80 mM PIPES, pH 6.8, 5 mM EGTA, 2 mM MgCl2) and 4% paraformaldehyde, and then washed three times in PEM. The cells were permeabilized with PEM and 0.5% Triton X-100 buffer for 30 min, washed three times with PEM, and then blocked with PBST and 5% normal goat serum (Jackson Immunoresearch Laboratories Inc.). The cells were stained using our polyclonal anti-SAFB2 or the monoclonal anti-SAFB (Upstate Biotechnology) antibodies at 1:250 dilution in PBST and 5% normal goat serum overnight at 4°C, and then washed five times for 5 min in PBST buffer. Cells were then stained with goat anti-rabbit or anti-mouse Texas Red antibodies (Jackson) at 1:100 dilution in PBST and 5% normal goat serum followed by five washes for 5 min in PBST buffer. The coverslips were mounted on slides using mounting medium (Vectorshield, Vector Laboratories, Burlingame, CA). Confocal microscopy was performed with a digital scanning confocal microscope (Nikon Eclipse E1000) equipped with a 60×/numerical aperture = 1.40/oil immersion objective. Images were exported to Adobe Photoshop (Adobe Systems, San Jose, CA), and final figures were composed in PowerPoint (Microsoft).
In Vitro (GST Pull-down) and in Vivo (Coimmunoprecipitation) Protein-Protein Interaction-GST pull-down experiments were performed as described previously (8) using bacterially expressed GST fusion proteins and in vitro translated proteins. Coimmunoprecipitation studies were performed using transfected and nontransfected cell lines (i.e. coimmunoprecipitation of endogenous proteins). For the transient transfections, 293 and CV-1 cells were plated at 5 × 10⁶ cells in 10-cm dishes. The next day the cells were transiently transfected with the appropriate plasmids using LipofectAMINE Plus (Invitrogen) for 7 h. The cells were maintained for an additional 20 h and then lysed in low stringency (PBS, 0.1% Nonidet P-40, protease inhibitors) or high stringency buffer (20 mM Tris, pH 7.4, 50 mM NaCl, 1 mM EDTA, 0.5% Nonidet P-40, 0.5% SDS, 0.5% deoxycholate, protease inhibitors). 500 μg of lysate was precleared with 50 μl of protein G-agarose for 30 min at 4°C and incubated overnight with 4 μg of the appropriate antibodies at 4°C. Protein G-agarose was added for another 4 h, and the beads were pelleted and washed three times with low or high stringency buffer, respectively. Bound proteins were eluted in SDS sample buffer, subjected to SDS-PAGE, and analyzed by immunoblotting. To detect interaction of endogenous proteins, MCF-7 cells were allowed to grow for 48 h in 20-cm dishes and washed with PBS, and proteins were cross-linked for 30 min with 1 mM DSP (Pierce), which is membrane-permeable. After subsequent washing, the cells were lysed in high stringency buffer, and immunoprecipitation was performed as described above.
Yeast Two-hybrid Assays-For our yeast two-hybrid assays we used the Matchmaker Two-Hybrid System 3 (BD Biosciences) with the yeast strain AH109, which includes three reporter genes (Ade2, His3, and α-galactosidase) to reduce the incidence of false positive clones. Yeast manipulations were undertaken according to the Clontech Matchmaker System 3 and Yeast Protocols Handbooks (BD Biosciences). The C-terminal SAFB1 domain (amino acids 599-915) was subcloned in-frame into the EcoRI-BamHI sites of pGBKT7. This SAFB1 clone results in expression of an SAFB1-C-Gal4DBD fusion protein of the expected size of 60 kDa (in vitro transcription and translation, TNT Kit, Promega) using the T7 promoter site of pGBKT7, as seen by Western blotting using polyclonal rabbit anti-Gal4 DNA binding region antibodies (Upstate Biotechnology) of the protein expressed in AH109 (data not shown). Similarly, the C terminus of SAFB2 (amino acids 600-953) was subcloned into pGADT7. SAFB1 bait and prey constructs were generated using appropriate PCR primers containing EcoRI or BamHI restriction sites and cloned into pGBKT7 and pGADT7 using the EcoRI and BamHI restriction sites of these vectors. The vinexin β yeast two-hybrid clone 3-1 was obtained from a yeast two-hybrid screen using the SAFB1 C-terminal domain as bait against a normal mammary gland library (Clontech). For yeast two-hybrid interactions, the bait and prey constructs were cotransformed into the yeast strain AH109 and plated onto both SD-Leu/Trp and high stringency SD-Leu/Trp/His/Ade dropout plates. Colonies from the SD-Leu/Trp/His/Ade plates were allowed to grow for 5-10 days and then streaked onto new SD-Leu/Trp/His/Ade dropout plates. Interacting proteins were assessed after the colonies had been streaked a second time onto SD-Leu/Trp/His/Ade plates with or without X-α-galactose as the substrate for α-galactosidase. Appropriate controls as suggested by the manufacturer were used to ensure the validity of the interactions.
Colonies that contained interacting proteins were maintained on SD-Leu/Trp/His/Ade and SD-Leu/Trp plates, whereas colonies having noninteracting proteins were maintained only on SD-Leu/Trp plates.

SAFB1 and SAFB2 Are Highly Homologous Proteins That Map to the Same Chromosomal Locus-We set out to analyze whether SAFB is part of a gene family. Therefore, we performed a homology search using GenBank and identified a second human gene with very high homology to SAFB, termed KIAA0138 (accession number NM_014649). For ease of understanding and to prevent further confusion in the literature where a number of SAFB synonyms exist, we suggest the use of the terminology SAFB1 and SAFB2 for the two highly homologous genes; SAFB1 is the gene previously termed SAFB, HET, HET/SAFB, and HAP; and SAFB2 is the gene originally designated KIAA0138.
SAFB genes are highly conserved in other mammalian organisms. Orthologous SAFB1 and SAFB2 genes are present in mice; human SAFB1 corresponds to mouse gene ENSMUSG000000428128, and human SAFB2 corresponds to mouse gene ENSMUSG00000042625. A data base search of expressed sequence tag sequences also identified SAFB genes in cattle (Bos taurus, e.g. GenBank BE666390), and in pig (Sus scrofa, e.g. GenBank BI181164). Proteins with an SAF box and RRM domain structure similar to SAFBs have been identified in the sequence data bases of other organisms such as Drosophila melanogaster (gene CG6995, Berkeley Drosophila genome data base, (18)) and Caenorhabditis elegans. A search of the worm data base (19) at the Sanger Center revealed a putative gene CE43E11.1 (gene CE43E11.1 from clone C43E11, GenBank accession no. U80437) coding for the protein CE23592. Proteins with a domain structure similar to SAFB1 were not found in any of the public domain data bases for yeast, plants, bacteria, or protozoa.
Further computer analysis of the translated protein sequences derived from the NCBI human genome sequence data base, expressed sequence tag data bases, and overlapping cDNAs indicated the presence of a third gene with high homology in its SAF box (also called SAP domain for SAF-A/B, Acinus, and PIAS) and RRM domains to both SAFB1 and SAFB2. The full-length open reading frame for this hypothetical protein can be deduced readily from the overlap of GenBank sequences AK000867 and NM_024755 (for hypothetical protein FLJ13213) to give a 1,034-amino acid protein. However, because the overall nucleotide homology of this hypothetical protein is only 36% to SAFB1 and 37% to SAFB2, suggesting that it is a more distantly related member of this gene family, we have limited our current studies to the characterization of SAFB1 and SAFB2.
SAFB2 (KIAA0138 cDNA) was originally cloned from the myeloblast cell line KG-1 by Nagase et al. (20), but up until now no further analysis of this gene or its gene product has been reported. At the amino acid level, human SAFB2 shows 74% homology to human SAFB1 (Fig. 1A) and 75% homology at the nucleotide level. The SAF box and RRM are highly conserved between SAFB1 and SAFB2 (Fig. 1B). Furthermore, there are three other regions that share high homology but whose function is unknown. One such region maps to the N terminus (amino acids 210-385), and the other two map to the C terminus (Glu/Arg-rich, 619-788; Gly-rich, 792-899). We have shown previously that SAFB1 is a nuclear protein and has a functional NLS (2). The NLS in SAFB1 at position 599-616 (KRSVVSFDKVKEPRKSRD) has very little homology to the corresponding sequence in SAFB2. However, SAFB2 also has a predicted NLS, which is at position 713-730 (RRQQEQLRYEQERRPGRR).
We have previously mapped SAFB1 to chromosome 19p13.3 (21). Searching for the genomic locus of SAFB2, we found that it is immediately adjacent to SAFB1 on chromosome 19p13.3. As outlined in Fig. 1C, the two genes are oriented in a bidirectional divergent (head-to-head) fashion: SAFB1, 5′ → 3′ toward the centromere, and SAFB2, 5′ → 3′ toward the telomere. The translational start sites of the first exon in SAFB1 and SAFB2 are only 490 bp apart.
In summary, SAFB1 and SAFB2 are highly homologous genes that reside in very close proximity on chromosome 19p13.3.
SAFB1 and SAFB2 Are Coexpressed, and Their Expression Is Regulated through a Bidirectional Promoter-SAFB2 was originally identified as a cDNA (KIAA0138) from the KG-1 myeloblast cell line. We next tested whether SAFB2 expression is restricted or whether it is, like SAFB1, expressed in a wide number of breast cancer cell lines as well as other non-breast tissues.
Therefore, we first generated SAFB2-specific primers for RT-PCRs and (a) confirmed its expression in KG-1 cells, and (b) determined that SAFB2, just like SAFB1, is expressed in a number of breast cancer cell lines including MCF-7BK, MDA-MB-231, T47D, and ZR75. A representative RT-PCR is shown in Fig. 2A, where we detected a band of the expected size (562 bp) using RNA from ZR75 and KG-1 cell lines.
Next we designed SAFB1- and SAFB2-specific RNase protection assay probes that hybridize to regions with low homology (see "Materials and Methods"). As shown before, RNase protection assay can be used to differentiate between the expressions of highly homologous genes because small differences such as 3-bp insertions can be identified (22). Using an SAFB1 probe (protected fragment, 240 bp), an SAFB2 probe (protected fragment, 340 bp), and 36B4, a ribosomal gene frequently used as a control probe in RNase protection assay (protected fragment, 140 bp), we analyzed expression in a series of normal tissues. RNA from MCF-7 served as a positive control, and tRNA as a negative control. An autoradiogram and numerical values for the corrected expression levels are shown in Fig. 2B. Tissues with high expression of SAFB1 (e.g. central nervous system) show high expression of SAFB2, whereas tissues with low SAFB1 expression (e.g. liver) also show low SAFB2 expression. Thus, the two highly homologous proteins SAFB1 and SAFB2 are coexpressed in tissues, implying a possible coregulation of the two genes. We therefore next asked whether the relatively small intergenic region between SAFB1 and SAFB2 could harbor a promoter driving transcription of both genes, i.e. function as a bidirectional promoter.
Fig. 2 legend (beginning missing): … 2 and 4) and presence (lanes 3 and 5) of reverse transcriptase using SAFB2-specific primers. SAFB2-pCDNA3 served as a positive control (+ve), and no DNA or RNA was added to the negative control (-ve). The arrow indicates the SAFB2 band. B, RNase protection assay using human tissue RNA and probes (SAFB1, SAFB2, 36B4) as indicated. tRNA served as a negative control. The bottom panels represent arbitrary values for expression after measuring intensities using software on the Molecular Imager FX (Bio-Rad) and correcting for the 36B4 loading control. C, sequence of the genomic DNA region containing the minimal promoter region of the SAFB1/2 genes. The nucleotides in bold represent first exons of both genes. The primer sequences used to clone the minimal promoter are underlined. The capitalized nucleotides in bold represent putative transcriptional start sites for the two genes, and the bold underlined nucleotides represent putative transcription factor binding sites (labeled on the left). The bottom panel shows the activity of the SAFB1/2 minimal promoter. Therefore, MCF7 cells were transfected with plasmids as indicated (pGL3 Basic and Tk-Luc as negative controls, and 5′-3′ and 3′-5′ SAFB1/2 promoter cloned into pGL3 Basic), and relative luciferase units were determined as described under "Materials and Methods." D, SAFB1 and SAFB2 proteins have very similar molecular weights and run as a close doublet on immunoblots. Left panel, Western blot using anti-HET/SAFB mAb (Upstate Biotechnology) and SDS extracts from MCF-7 cells and Saos-2 Endo cells transfected with vector only, SAFB1-pcDNA1, or SAFB2-pcDNA3. Right panel, IVT products using extracts primed with SAFB1 or SAFB2 cDNA.

The putative transcriptional start sites (as judged by the most distal nucleotide in a large number of 5′-expressed sequence tags) (indicated in bold in Fig. 2C, top panel) are 437 bp apart. This intergenic region is TATA-less; however, a large number of other TATA-less promoters have been found to be able to initiate transcription through so-called initiator elements overlapping the transcriptional start sites. Because the nucleotide sequence of initiators is not highly conserved, it is difficult to predict the exact initiator region. We do, however, know that the intergenic region can function as a minimal promoter, as determined in transient reporter assays in MCF-7 cells (Fig. 2C, bottom panel). The transcriptional activity of this region was ~5,000 times stronger compared with pGL3 Basic and 200 times stronger compared with a minimal TK promoter. Furthermore, the promoter was able to function cloned in either orientation, 5′ → 3′ or 3′ → 5′. The promoter region contains a number of potential sites such as PEA3 (ets family member), CREB, c-myc, and GATA-1 binding sites (underlined in Fig. 2C, top panel). Finally, ProSite detected a total of 59 potential Sp1 binding sites in forward and reverse orientation, which is no surprise because this region is extremely C + G-rich (65%, of which 12% occurs as a CpG motif). Thus, SAFB1 and SAFB2 transcription is directed through a bidirectional promoter that contains a number of potential transcription factor binding sites, Sp1 being the most frequent.
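The base-composition figures quoted for the intergenic region (C + G content and the share present as CpG) are the kind of values that can be computed directly from a sequence. The Python sketch below shows one way to do so. The function name and the toy sequence are hypothetical, the actual promoter sequence is not reproduced here, and reporting CpG as the fraction of bases lying inside a CpG dinucleotide is an assumption about how such a percentage might be defined, not the authors' stated method.

```python
def base_composition(seq):
    """Return C+G fraction, CpG dinucleotide count, and the fraction of
    bases that lie inside a CpG dinucleotide for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    # Count every G or C base.
    gc_bases = sum(1 for base in seq if base in "GC")
    # Count 5'-CG-3' dinucleotides; two CG pairs can never overlap.
    cpg_count = sum(1 for i in range(n - 1) if seq[i:i + 2] == "CG")
    return {
        "gc_fraction": gc_bases / n,
        "cpg_count": cpg_count,
        "cpg_base_fraction": 2 * cpg_count / n,
    }


# Toy sequence for illustration only (not the SAFB1/2 intergenic region).
stats = base_composition("ACGTCGGA")
```

Applied to a real promoter fragment, the same function gives a quick check on whether a region is GC-rich enough to make a large number of predicted Sp1 sites unsurprising.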
Finally, we set out to determine whether we would be able to detect protein expression of both SAFB1 and SAFB2. We had observed previously that Western blot analysis using an SAFB monoclonal antibody (Upstate) frequently showed a double band. This antibody was raised against a peptide (amino acids 345-357) mapping to a region of 100% homology between SAFB1 and SAFB2. Therefore it was likely that this doublet would represent SAFB1 and SAFB2 (the predicted SAFB2 protein sequence is 38 amino acids longer than the SAFB1 protein). To confirm the predicted size difference, we transcribed and translated the two genes in vitro (IVT) (Fig. 2D, right panel) and expressed them in Saos-2 Endo cells (Fig. 2D, left panel). Saos-2 Endo cells represent a good model system because they do not express any endogenous SAFBs because of a homozygous deletion at the D19S216 locus. As shown in Fig. 2D, the antibody detected both proteins, with SAFB2 running at a slightly higher molecular mass than SAFB1. It is therefore possible that the doublet that we frequently detect represents SAFB1 and SAFB2 proteins. We cannot exclude, however, that the doublet additionally represents post-translational modification of SAFB1/2.
Thus, SAFB1 and SAFB2 are coexpressed in a number of tissues, and their transcription is regulated through a complex bidirectional promoter.
SAFB2 Can Function as an ER Corepressor and Inhibit Proliferation-Previously we have characterized SAFB1 as an ER corepressor (8). SAFB1 binds to ER and inhibits its transcriptional activity. We asked whether SAFB2 could function in a similar manner, by first performing GST pull-down assays using a GST-ER fusion protein (amino acids 180-302, containing the ER domains C (DNA binding domain) and D (hinge domain), which we have previously shown to interact with SAFB1 (8)). As shown in Fig. 3A, SAFB2 can bind to ER in GST pull-down assays, suggesting a direct interaction. To determine whether ER and SAFB2 endogenous proteins interact, we needed to generate SAFB2-specific antibodies using an antigenic peptide highly specific for SAFB2 (see "Materials and Methods"). Specificity was confirmed through immunoblotting of protein extracts from Saos-2 Endo cells (SAFB1- and SAFB2-negative) transfected with vector only, SAFB1, or SAFB2 cDNA. As expected, SAFB2 but not SAFB1 was detected using this antibody (Fig. 3B). Subsequently we used this antibody to immunoprecipitate SAFB2 from MCF-7 lysates, and immunoblotting with anti-ER antibodies confirmed the in vivo interaction between SAFB2 and ER (Fig. 3C). The experiment was also performed in a reciprocal manner (i.e. immunoprecipitation with ER antibodies and blotting with SAFB2 antibodies), and again we detected an interaction between SAFB2 and ER (data not shown).
To determine whether the SAFB2-ER interaction results in repression of ER activity we performed transient transfection reporter assays (Fig. 3D). MCF-7 cells were transfected with an ERE-TK-LUC reporter construct, and luciferase activity was determined upon cotransfection with SAFB1 or SAFB2 in the presence or absence of estradiol. As expected, the addition of estradiol led to a 2-3-fold increase in reporter activity, and SAFB1 overexpression resulted in strong repression of ER transcriptional activity. Similarly, SAFB2 overexpression also repressed ER transcriptional activity, demonstrating that SAFB2, like SAFB1, can function as an ER corepressor.
Another characteristic of SAFB1 is its ability to block cell growth, at least in part by decreasing the fraction of cells in S phase (11). To see whether SAFB2 would have a similar effect on S phase, we expressed GFP-SAFB1 or GFP-SAFB2 transiently in 293 cells and measured cell cycle distribution via FACS. As a control, we used cells transfected with the appropriate GFP construct only. As expected, cells transfected with SAFB1 displayed fewer cells in S phase (from 19.4 to 16.8%) (Fig. 3E; see pointed arrow). In 293 cells, this decrease in S phase is associated with a block in G1 (64.8% to 74.6%). Similar results were observed in cells transfected with SAFB2, although the effect was repeatedly more pronounced. The cells in S phase decreased from 19.7 to 14%, and the cells in G1 increased from 58.6 to 73.6%. Thus, overexpression of either SAFB1 or SAFB2 decreased the fraction of cells in S phase.
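One way to make the comparison between the two proteins concrete is to express each S-phase drop relative to its own control. The short Python sketch below does this with the percentages quoted above. The helper name is hypothetical, and summarizing the effect as a relative change is an illustrative choice rather than an analysis reported in the study.

```python
def relative_change_percent(control, treated):
    """Change of `treated` relative to `control`, in percent.
    Negative values indicate a decrease."""
    if control == 0:
        raise ValueError("control value must be non-zero")
    return (treated - control) / control * 100.0


# S-phase fractions (percent of cells) quoted in the text:
# GFP control vs. GFP-SAFB1, and GFP control vs. GFP-SAFB2.
safb1_s_phase = relative_change_percent(19.4, 16.8)
safb2_s_phase = relative_change_percent(19.7, 14.0)

print(f"SAFB1: {safb1_s_phase:.1f}% change in S-phase fraction")
print(f"SAFB2: {safb2_s_phase:.1f}% change in S-phase fraction")
```

On these figures the S-phase fraction falls by roughly 13% of its control value with SAFB1 and roughly 29% with SAFB2, which is consistent with the statement that the SAFB2 effect was more pronounced.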
In summary, we have shown that previously described functions of SAFB1 are also carried out by its homolog SAFB2. These functions include binding to ER and conferring repression of ER transcriptional activity and inhibiting cell proliferation as measured by S phase distribution.

SAFB1 and SAFB2 Interact, but Only SAFB1 Self-associates-It has been reported previously that scaffold attachment factors can form large aggregates through oligomerization (23). We therefore asked whether the scaffold attachment factors SAFB1 and SAFB2 would be able to interact with each other and to self-associate. We hypothesized that an interaction domain would be in the C terminus because this region has been shown to contain multiple protein-protein interaction sites (24). To test our hypothesis we used a directed yeast two-hybrid screen with the SAFB1 C terminus (599-915) as bait and the SAFB2 C terminus (600-953) as prey, and as shown in the yeast two-hybrid assay in Fig. 4A (top panel), these domains strongly interacted. For negative controls, we used the respective empty vectors only (pGBKT7 and pGADT7), and as positive control we used p53 as bait and the SV40 large T-antigen as prey (Fig. 4A, middle panel). Subsequent deletion mapping of the C-terminal domain in SAFB1 (Fig. 4A, bottom panel) resulted in the identification of amino acids 719-808 as the major interaction region and amino acids 519-720 as a region of minor interaction. Interestingly, the same region allows self-association of SAFB1 but not of SAFB2 (data not shown).
To confirm the interaction between SAFB1 and SAFB2 in vivo, we coimmunoprecipitated differentially tagged proteins from CV-1 cells. Cells were transiently transfected with constructs expressing HA-SAFB1 and GFP-SAFB2, followed by lysis and immunoprecipitation with anti-HA antibodies. As shown in Fig. 4B, GFP-SAFB2 coimmunoprecipitated with anti-HA, reflecting the interaction between HA-SAFB1 and GFP-SAFB2 in vivo. Coimmunoprecipitations in transient transfection assays can potentially result in artifacts because of overexpression. Because an SAFB1-specific antibody is not currently available, we generated MCF-7 clones that express low levels of an HA-tagged SAFB1 construct upon treatment with the inducer doxycycline (see "Materials and Methods"). Immunoblotting of cell lysates from MCF-7RTA-HASAFB1 (clone 23) resulted in detection of HA in the presence but not in the absence of doxycycline (Fig. 4C, top left panel). Immunoblotting of the same lysates using anti-SAFB antibodies showed an ~2-fold overexpression (Fig. 4C, top right panel). The same cells were subsequently used to analyze the in vivo interaction of endogenous SAFB2 and HA-SAFB1. To this end, lysates were immunoprecipitated with HA antibodies followed by immunoblotting with anti-SAFB2 antibodies (Fig. 4C, bottom panel). The interaction observed in these studies was confirmed in immunoprecipitation experiments performed in a reciprocal manner (i.e. immunoprecipitation with SAFB2 antibodies and immunoblotting with HA antibodies) (data not shown).
Finally, we tested whether coexpression of SAFB1 and SAFB2 would result in additive, synergistic, or antagonistic repression of ER activity. Various amounts of the two expression plasmids were coexpressed in MCF-7 and Saos-2 Endo cells (no endogenous SAFB1/2), and we repeatedly observed additive repression. A representative experiment is shown in Fig. 4D.
Thus, SAFB1 and SAFB2 are able to interact both in vitro and in vivo. The C-terminal domain that harbors the interaction domain is also involved in SAFB1 self-association. The formation of large aggregates has been shown for other scaffolding factors, and future studies will determine whether a yet to be determined function of SAFB1/SAFB2 depends on the formation of SAFB1/SAFB2 dimers and/or large aggregates.
SAFB1 and SAFB2 Differ in Intracellular Localization, and SAFB2 Interacts with Vinexin. In the experiments described thus far, we did not detect any significant differences in expression patterns or functions between SAFB1 and SAFB2. However, genes duplicated during evolution usually gain at least some evolutionary novelties or become silenced. Therefore, our next goal was to determine whether SAFB1 and SAFB2 would differ in subcellular localization. Three different cell lines (MCF-7, HeLa, and 293) were transfected with GFP-SAFB1 or GFP-SAFB2 and analyzed by confocal laser microscopy. As a control, cells were transfected with GFP only, which resulted in a uniform staining of both nucleus and cytoplasm (data not shown). GFP-SAFB1 transfection resulted in exclusive punctate staining of the nucleoplasm with exclusion of nucleoli (Fig. 5A, left panels). Similar observations were recently described by Weighardt (5) and Chiodi (25). In contrast, SAFB2 clearly showed cytoplasmic as well as nuclear staining (Fig. 5A, right panel). The nuclear staining of SAFB2 mirrored that of SAFB1 (i.e. exclusion of nucleoli and partition into nuclear speckles), but there was clearly also a uniform staining of the cytoplasm in all cell lines tested. Further sequence analysis identified a potential nuclear export signal (LXXLXXI/L/VXL) (26,27) in SAFB2 (amino acids 30-38: LSELRVIDL). However, a similar sequence can be found in SAFB1 (amino acids 31-39: LSDLRVIDL). It is thus unclear whether the surrounding sequence determines the nuclear export, or whether other factors play a role in retaining SAFB2 but not SAFB1 in the cytoplasm.
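That both peptides satisfy the quoted export-signal consensus can be checked with a simple pattern match. The sketch below is an illustration only, not the sequence analysis used in this study. It assumes the consensus reads L-X-X-L-X-X-(I/L/V)-X-L with X standing for any residue, and the names `NES_PATTERN` and `matches_nes` are hypothetical.

```python
import re

# Consensus quoted in the text, written as a regular expression:
# L-X-X-L-X-X-(I/L/V)-X-L, where X is any residue (assumed reading).
NES_PATTERN = re.compile(r"L..L..[ILV].L")

# Peptides and positions as quoted in the text.
PEPTIDES = {
    "SAFB2 (aa 30-38)": "LSELRVIDL",
    "SAFB1 (aa 31-39)": "LSDLRVIDL",
}


def matches_nes(peptide: str) -> bool:
    """Return True if the whole peptide fits the consensus pattern."""
    return NES_PATTERN.fullmatch(peptide) is not None


def differing_positions(a: str, b: str) -> list[int]:
    """Return 1-based positions at which two equal-length peptides differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


for name, peptide in PEPTIDES.items():
    print(name, peptide, "matches" if matches_nes(peptide) else "no match")

print("Differing positions:", differing_positions("LSELRVIDL", "LSDLRVIDL"))
```

Both peptides fit the pattern and differ only at an unconstrained X position, which is why the motif by itself cannot account for the different localization of the two proteins.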
The additional cytoplasmic localization of SAFB2 implies that it could have functions that are separate from those shared with SAFB1 in the nucleus. Interestingly, in a recent yeast two-hybrid assay aimed at identifying SAFB-interacting proteins that might play a role in the function of SAFB as a repressor, we identified several vinexin β clones (17). Vinexin β is a member of a family of Src homology 3-containing adaptor molecules that regulate growth factor signaling, cell adhesion, and cytoskeletal organization in the cytoplasm (28).
After confirming the interaction between SAFB2 and vinexin β in yeast (data not shown), we next validated the interaction in mammalian cells. Again, coimmunoprecipitation experiments were performed using transiently transfected cells as well as nontransfected cells, the latter to analyze the interaction of endogenous proteins. First we transfected MCF-7 cells with GFP-vinexin β, precipitated SAFB2 from the lysates, and subsequently immunoblotted the membrane with anti-vinexin antibodies. As shown in Fig. 5B, we were indeed able to detect an interaction between SAFB2 and GFP-vinexin β. Next, we performed colocalization studies to confirm further an SAFB2-vinexin β interaction. Saos-2 Endo cells were transiently transfected with GFP-vinexin β and SAFB2 cDNA, stained for SAFB2 using our polyclonal antibody, and analyzed by digital confocal microscopy (Fig. 5C). As described previously (17), the majority of vinexin β was found in focal adhesions. Overlay of GFP-vinexin β and SAFB2 staining revealed colocalization in discrete focal points in a subset of cells. Additionally, in some cells, we found that SAFB2 and vinexin β costained the "leading edges" at the periphery of cells (data not shown). Finally, we immunoprecipitated SAFB2 from nontransfected MCF-7 cells and immunoblotted the membrane with anti-vinexin antibodies. As shown in Fig. 5D, we detected an interaction not only with vinexin β but also with vinexin α. Interestingly, the molecular masses of the immunoprecipitated vinexin α and β proteins are slightly smaller than expected (75 versus 82 kDa, and 30 versus 38 kDa, respectively), which could be the result of loss of post-translational modification or degradation of the SAFB2-interacting vinexin proteins. It is, however, also possible that SAFB2 preferentially interacts with minor vinexin α and β isoforms, which we have described previously (17).
In summary, in contrast to SAFB1, which we could only detect in the nucleus, SAFB2 is also present in the cytoplasm. This finding suggests that SAFB2 could play additional roles in the cytoplasmic compartment which are distinct from the nuclear functions shared with SAFB1, and the interaction with vinexin leads us to predict that these functions might include coordination of signaling and cytoskeletal responses.
DISCUSSION
In this study we showed that the ER corepressor SAFB is part of a gene family consisting of two highly related genes termed SAFB1 and SAFB2. The genes map adjacent to each other on chromosome 19p13.3 and are arranged in a bidirectional divergent configuration (head to head). They are separated by a short GC-rich intergenic region that can function as a bidirectional promoter. SAFB1 and SAFB2 proteins share common functions but also have unique properties.
There are a large number of genes with similar features sharing domains with extremely high homology and performing analogous functions. In most cases, these genes arose as a result of gene duplication during evolution, and it is believed that gene duplications have played important roles in the evolution of the extremely complex vertebrate genome (29). A recent study by Dehal et al. (30) extensively characterized a number of tandem familial clusters of genes on chromosome 19 (where SAFB is localized) which originally arose from gene duplication events during evolution. Chromosome 19 is one of the smallest and most gene-dense chromosomes, and up to 31% of its genes might have arisen from gene duplications, among them SAFB1 and SAFB2. Subsequent continuous genesis of tandem repeats by further chromosomal rearrangements such as inversions could have resulted in the particular head-to-head configuration of SAFB1 and SAFB2, as suggested for other loci by Achaz et al. (31).
The distances between the translational and potential transcriptional start sites of SAFB1 and SAFB2 are only 490 and 437 bp, respectively. This proximity and the coexpression of the two transcripts implied that the two genes utilize a common promoter element, and indeed promoters with bidirectional activity coregulating two divergent genes have been described previously (32)(33)(34)(35). Our promoter studies described herein show that the intergenic region can indeed function as a bidirectional promoter.
Previous studies showing that the SAFB locus undergoes extremely frequent loss of heterozygosity in tumors, together with the finding of mutations in the SAFB1 gene, have led to our hypothesis that SAFB1 could be a breast cancer tumor suppressor gene (9). Interestingly, a high percentage of repair genes (42% of 120 analyzed genes), including tumor suppressor genes such as BRCA1, p53, and CHK2, are arranged in a bidirectional divergent manner (36). A very common feature of these bidirectional loci is the presence of CpG islands between the genes which could function as CpG island promoters; in contrast to tissue-specific promoters, very little is known about specific mechanisms of CpG island promoters. They could also be targets for hypermethylation, an increasingly recognized mechanism for inactivation of tumor suppressor genes (37). Indeed, the SAFB1 and SAFB2 intergenic region as well as the first introns are extremely GC-rich and could potentially be hypermethylated in tumors.
Other breast cancer tumor suppressor genes exist in families, such as p53/p63/p73, Rb/p107/p130, and BRCA1/BRCA2. Members of these families have overlapping as well as distinct biological functions (38-42). For BRCA1 and BRCA2, it is clear that inactivation of either gene may lead to familial breast cancer, but the roles of the more recently identified family members of p53 and Rb in tumorigenesis are still under intense investigation. In contrast to all of these genes, the two SAFB family members are located directly adjacent to each other, and deleterious chromosomal events such as deletions result in concurrent inactivation of both genes. Further studies including a comprehensive analysis of SAFB1 and SAFB2 in human breast tumor specimens as well as the generation of appropriate mouse models are necessary to be able to draw solid conclusions on whether SAFB1 and/or SAFB2 are indeed tumor suppressor genes.
The high homology between SAFB1 and SAFB2 suggested that they might function in a similar manner. Here we show that SAFB2 was able to decrease the fraction of cells in S phase, just as we have described previously for SAFB1 (11). We also found that SAFB2 can bind to and repress ER, thus functioning as an ER corepressor. Other ER-interacting proteins exist in large families, such as the SRC coactivator and N-CoR/SMRT corepressor families (43). N-CoR and SMRT share many sequence and structural features and consequently many functional properties (for review, see Ref. 44), although differences have been observed. For example, the thyroid receptor (TRβ1) and the orphan receptor RevErb (45) prefer to recruit N-CoR to DNA response elements, whereas RARα prefers SMRT (46). Similarly, the members of the SRC family of nuclear receptor coactivators (SRC-1, GRIP/TIF2/SRC-2, and ACTR/RAC3/AIB-1/TRAM-1/SRC-3) also possess many similar properties in interaction with nuclear hormone receptors and transcriptional activation (for review, see Ref. 47), yet there is evidence that not all the functions of the SRC proteins are redundant (47)(48)(49).
Unique functions of family members support the idea that most gene duplications result in inactivation of one of the genes unless it possesses an evolutionary novelty, such as an altered function or pattern of expression (50-53). It has also been described that retention of duplicated genes is attained through segregation into different intracellular sites within the same cell (54). For example, this intracellular sorting has been observed for protein kinase C isoforms (55). We have detected a cytoplasmic localization of SAFB2, which is clearly different from the strictly nuclear localization of SAFB1. Interestingly, we detected an interaction of SAFB2 with vinexin β, a protein that has been described to function in anchorage-dependent activation of the extracellular signal-regulated kinase pathway (56). This occurs via its interaction with Sos, the guanine nucleotide exchange factor for Ras and Rac (57,58), through its Src homology 3 domains and its binding to vinculin, a cytoskeletal protein involved in cell spreading, cytoskeletal organization, and focal adhesion (17,56,58). Thus, vinexin β is believed to play a role in the convergence of growth factor signaling and cytoskeletal organization. Future experiments will address a potential role of SAFB2 in the convergence of various signaling pathways and the pathways involved in cytoskeletal organization.
Finally, we observed that SAFB1 and SAFB2 interact, both in yeast two-hybrid assays and in vivo by coimmunoprecipitation experiments. Although it is feasible that SAFB1 and SAFB2 dimerize when bound to ER, it is also possible that this association is necessary for the formation of the SAFB-containing nuclear bodies (5,24). The precise function of these nuclear bodies has yet to be determined, but they have been proposed to represent transcriptosome complexes. In addition, SAFB is recruited to stress-induced small nuclear bodies that are formed after cells are exposed to stress conditions such as heat shock (24,25). Interestingly, recent data have shown that the signature proteins of several different nuclear bodies can self-oligomerize (e.g. SMN (59), Sam68 (60), PML (10)), suggesting that formation of large aggregates is a prerequisite for nuclear body formation.
In summary, we have identified a second member of the SAFB family, whose members are now termed SAFB1 and SAFB2. Their close proximity at a chromosomal locus showing extremely high loss of heterozygosity, their ER repressor activities, and the additional function of SAFB2 in the cytoplasm where it interacts with vinexin, suggest that both proteins play important roles in breast cancer.