Human regulatory factor X 4 (RFX4) is a testis-specific dimeric DNA-binding protein that cooperates with other human RFX members.

Regulatory factor X (RFX) members are evolutionarily conserved transcription factors that share a highly conserved winged helix DNA-binding domain. Human RFX4 has been isolated from breast cancer as a partial cDNA encoding a short RFX-type DNA-binding domain fused to the estrogen receptor, but the entire structure of RFX4 has been unknown. Here, we report the molecular cloning and characterization of human RFX4. RFX4 contains evolutionarily conserved regions, including a RFX-type DNA-binding domain, a dimerization domain, and other conserved regions, and is closely related to RFX1, RFX2, and RFX3 in structure. The expression of RFX4 is restricted to testis. In vitro synthesized RFX4 protein bound to typical RFX binding sites in a sequence-dependent manner. Immunoprecipitation analyses showed that RFX4 interacts physically with RFX2, RFX3, and RFX4 itself but not with RFX1. In contrast to other mammalian RFX members that form dimers, RFX4 is revealed to have no distinct transcriptional activation domains. By using a chimeric protein of RFX1 and RFX4, the C-terminal domain of RFX4 was shown to be a possible transcriptional repression domain. Taken together, these results indicate that RFX4 is the first mammalian member of RFX family without transcriptional activation capacity and might function through selective interactions with other RFX members in transcriptional regulation.

Regulatory factor X (RFX) 1 was first identified as a protein that binds to the X-boxes of major histocompatibility complex class II promoters (1). RFX is defined by a highly conserved DNA-binding domain that shows a unique winged-helix structure consisting of 76 amino acids (2-5). Members of the RFX family are now known to be present in a broad range of eukaryotic organisms, including yeast (4,6), fungi (7), nematode (8), fruit fly (9), and vertebrates (3,10). In Saccharomyces cerevisiae, Crt1, which also contains a RFX-type DNA-binding domain, functions as a transcriptional repressor in the DNA damage and replication block checkpoint pathways. In response to DNA damage or replication block, Crt1 is highly phosphorylated, resulting in transcriptional derepression (11). In Caenorhabditis elegans, the RFX-type transcription factor DAF-19 plays a crucial role in the development of ciliated sensory neurons (8).
In humans, the presence of five RFX genes has been reported, four of which have been characterized (4). RFX1, RFX2, and RFX3 are structurally related proteins that share the DNA-binding domain and four additional conserved regions: the A, B, and C regions and the dimerization domain (4). The dimerization domain is required for homo- and heterodimer formation among RFX members (3,12,13). Another common characteristic of these three genes is that their expression is highest in testis (3). Outside testis, RFX1 mRNA is also detectable in most other tissues. RFX2 mRNA is abundant in lung and stomach, and the expression of RFX3 is ubiquitous but particularly high in brain and intestine (3).
RFX1 is a well characterized member of the human RFX family. It forms a homodimer or heterodimers with the RFX2 or RFX3 proteins through the dimerization domain (3,12,13). RFX1 modulates expression of physiologically important genes, such as the interleukin-5 receptor α chain (IL-5Rα). RFX1 exerts its function on transcription through distinct transcriptional activation and repression domains and is thought to cooperate with a cell lineage-specific coactivator to regulate lineage-specific transcription (12). In other cell lineages, the full-length RFX1 protein exhibits little transcriptional activity; however, when a deletion is made in the C-terminal region, the truncated form of RFX1 becomes a strong transcriptional activator (12,14). This observation indicates that the C-terminal domain represses the N-terminal transcriptional activation domain in an intramolecular manner.
The RFX5 protein is structurally different from the other human RFX members. It contains the conserved DNA-binding domain but lacks the A, B, C, and dimerization domains (4,15). Therefore, RFX5 does not interact with other RFX members but forms a complex with other transcription factors that are indispensable for RFX5-dependent transcriptional activation (16-18). RFX5 plays a critical role in the regulation of major histocompatibility complex class II gene expression, and germ line mutations in the RFX5 gene cause bare lymphocyte syndrome, a hereditary disease with severe immunodeficiency (15,19,20).
In 1992, Murphy and co-workers (21) found an aberrant cDNA from human breast cancer. This cDNA encoded a fusion of the N-terminal half of the estrogen receptor and a novel RFX-type DNA-binding domain. This RFX-type DNA-binding domain was thought to be a part of a novel RFX member, which was later referred to as RFX4. Since then, the entire structure of RFX4 has been undefined, whereas the other RFX members have been shown to play important roles in several cellular processes. In this study, we elucidated the entire structure of RFX4. We also studied the molecular properties of the RFX4 protein. Our results demonstrate that the RFX4 protein is the first mammalian RFX member lacking transcriptional activation domains and may be involved in transcriptional regulation through dimer formation with other RFX members.

EXPERIMENTAL PROCEDURES
Isolation of RFX4 cDNA-The RFX4 cDNA (nucleotides 328-1661) was isolated in the course of a yeast two-hybrid screening of a Matchmaker human testis cDNA library (CLONTECH). To isolate the 3′-end of the RFX4 cDNA, rapid amplification of cDNA ends (RACE) was performed with Marathon-Ready human testis cDNA (CLONTECH) and the primers F1 and F3 according to the protocol supplied by CLONTECH. The 3′-RACE yielded a cDNA fragment containing an in-frame stop codon followed by a poly(A) tract. We then carried out 5′-RACE with primers R2 and R3, which amplified a 1-kb fragment. Because no possible ATG initiation codon was found in this fragment, we performed a second 5′-RACE by using primers R4 and R7. The second RACE yielded a 0.5-kb fragment that contained a potential first ATG preceded by an in-frame stop codon. Sequences of the primers are as follows: F1, 5′-GAGTCCTACATTGAGTGGCTGG; F3, 5′-AATCTCTGCACTGTCAGGAGCGG; R2, 5′-TGCTTGGTGATGCTGTTCAGGTCC; R3, 5′-GCCGTCTCAGAATTTGGGAGAACC; R4, 5′-AAACCTTCTCCTCAGGCAGGCTGG; R7, 5′-CCGAGTCTTCTGGTGGTTAACTGAGG. Amplified cDNA fragments were cloned into pT7Blue plasmid vector (Novagen, WI) and sequenced by the CEQ 2000XL DNA analysis system (Beckman Coulter). The full-length RFX4 cDNA was constructed by ligation of the cDNA fragments.
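As an illustrative aid that is not part of the original procedures, the following Python sketch tabulates simple properties of the six RACE primers listed above: length, GC fraction, and reverse complement. The sequences are copied from the text with line-break hyphens removed; the function names (reverse_complement, gc_fraction, summarize) are hypothetical helpers introduced here, and no claim is made about how the primers were actually designed or checked.

```python
# Primer sequences as listed in the text, with line-break hyphens removed.
PRIMERS = {
    "F1": "GAGTCCTACATTGAGTGGCTGG",
    "F3": "AATCTCTGCACTGTCAGGAGCGG",
    "R2": "TGCTTGGTGATGCTGTTCAGGTCC",
    "R3": "GCCGTCTCAGAATTTGGGAGAACC",
    "R4": "AAACCTTCTCCTCAGGCAGGCTGG",
    "R7": "CCGAGTCTTCTGGTGGTTAACTGAGG",
}

# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def summarize(primers: dict) -> dict:
    """Map each primer name to (length, GC fraction, reverse complement)."""
    return {
        name: (len(seq), gc_fraction(seq), reverse_complement(seq))
        for name, seq in primers.items()
    }


# Print one summary line per primer.
for name, (length, gc, revcomp) in summarize(PRIMERS).items():
    print(f"{name}: {length} nt, GC {gc:.0%}, reverse complement 5'-{revcomp}")
```

The reverse complement is useful mainly for locating where an antisense primer anneals on the assembled sense-strand cDNA.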
Plasmid Constructions-The full-length RFX4 cDNA was cloned into pFLAG-CMV5 (Sigma) and pcDNAHA, respectively. The pcDNAHA was constructed by replacing the Xpress epitope of pcDNA3.1His (Invitrogen) with the hemagglutinin (HA) epitope. The plasmids carrying RFX1, HA-tagged RFX2, and RFX3 were described previously (12). The entire open reading frames of RFX1 and RFX3 were subcloned into pcDNAHA to generate pcDNAHA-RFX1 and pcDNAHA-RFX3, respectively. For luciferase reporter analyses, portions of RFX1 and RFX4 were introduced into pBIND vector (Promega, WI) to fuse the DNA-binding domain of yeast GAL4 with various regions of the RFX proteins. The pBIND plasmids contain the Renilla luciferase gene for normalization of transfection efficiency. Reporter plasmid pG5luc (Promega) carries five GAL4-binding sites upstream of the firefly luciferase gene. All constructs used in this study were verified by sequencing.
Northern Hybridization-The 3′-portion of the RFX4 cDNA (nucleotides 1038-1543) was labeled with [α-32P]dCTP by the random priming method and hybridized with the Human Multiple Tissue Northern blot (CLONTECH) according to the manufacturer's protocols.
Electrophoretic Mobility Shift Assays (EMSA)-Oligonucleotides were synthesized to carry the RFX-binding sites of the IL-5Rα enhancer (12,22), hepatitis B virus enhancer (23), polyomavirus EF-C site (3), ribosomal protein L30 promoter (rpL30), HLA-DRA X-box (X-box) (24), c-Myc intron element 1 (MIE1) (25), and proliferating cell nuclear antigen E1A-responsive element (26). Sequences of these oligonucleotides are shown in Fig. 6A. The oligonucleotides for the IL-5Rα enhancer were made double-stranded and end-labeled with 32P by using T4 polynucleotide kinase (27). RFX proteins were synthesized by a coupled in vitro transcription/translation reaction with a TNT T7 Quick system (Promega) according to the manufacturer's protocol. The protein-DNA complex was formed for 15 min at room temperature in a 20-μl mixture containing the 32P-labeled probe (0.5 ng, 10^8 cpm/μg), 0.5-2 μl of in vitro translation products, 2 μg of poly(dI-dC)-poly(dI-dC) (Amersham Pharmacia Biotech, NJ), 20 mM HEPES-KOH (pH 7.9), 50 mM KCl, 0.5 mM EDTA, 1 mM dithiothreitol, 3 mM MgCl2, and 5% glycerol. For competition analyses, an excess amount of unlabeled oligonucleotides was included in the binding reaction prior to the addition of the probe. For supershift experiments, 2.5 μg of anti-HA monoclonal antibody 12CA5 (Roche Molecular Biochemicals) was added to the binding reaction. The samples were run on a 4% polyacrylamide gel in 0.25× Tris borate EDTA buffer at room temperature. After the gel was dried, the DNA-protein complexes were visualized by autoradiography.
Immunoprecipitation-COS7 cells were maintained in Dulbecco's modified Eagle's medium supplemented with 10% fetal bovine serum, 100 units/ml penicillin, 100 μg/ml streptomycin, and 250 ng/ml amphotericin B. Plasmids carrying FLAG- and HA-tagged RFX members were cotransfected by using FuGENE (Roche Molecular Biochemicals) according to the manufacturer's protocol. Twenty-four hours after transfection, the cells were collected and then lysed in buffer A (20 mM HEPES-KOH, pH 7.9, 80 mM NaCl, Complete protease inhibitor mixture (Roche Molecular Biochemicals), and 1 mM dithiothreitol) supplemented with 0.4% Nonidet P-40. After centrifugation at 20,000 × g for 5 min, the cleared lysates were reacted with anti-FLAG M2 monoclonal antibody (Sigma) conjugated with protein G-Sepharose beads (Amersham Pharmacia Biotech) at 4°C for 1.5 h. The immunocomplex was collected by centrifugation and washed three times with buffer A containing 0.1% Tween 20. The proteins were eluted by boiling in SDS sample buffer, separated by SDS-PAGE (27), and transferred to Immobilon-P polyvinylidene difluoride membrane (Millipore Corp.). The blotted proteins were reacted with anti-HA monoclonal antibody (clone 3F10; Roche Molecular Biochemicals) or anti-FLAG M2 monoclonal antibody and detected by the chemiluminescent method with SuperSignal West Pico reagent (Pierce).
Reporter Assays-HeLa cells were maintained as described above. For reporter assays, cells were grown in 35-mm dishes to 70% confluence and then cotransfected with the pBIND derivatives (1 μg) and the pG5luc reporter plasmid (1 μg) by using TransIT LT-1 lipofection reagent (Panvera) as described previously (28). Twenty-four hours after transfection, the cells were washed twice with phosphate-buffered saline, collected with a scraper, and lysed in the passive lysis buffer (Promega). Luciferase activities in the lysates were measured with the TD20 luminometer (Turner Designs, CA) and the dual luciferase reporter assay system (Promega) according to the manufacturer's protocol.
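The normalization used for these reporter assays (firefly activity corrected by Renilla activity, then expressed relative to full-length GAL4-RFX1 set to 1.0, with averages and S.D. from independent assays, as stated in the legend to Fig. 8) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' actual calculation: the readings are invented placeholder numbers, the function names are hypothetical, and dividing by the mean of the baseline construct (rather than pairing assays one by one) is an assumption made here because the text does not specify it.

```python
from statistics import mean, stdev


def normalized_activity(firefly: float, renilla: float) -> float:
    """Correct a firefly reading for transfection efficiency using Renilla."""
    if renilla <= 0:
        raise ValueError("Renilla reading must be positive")
    return firefly / renilla


def relative_activities(readings: dict, baseline: str) -> dict:
    """Return {construct: (mean, S.D.)} relative to the baseline construct.

    readings maps a construct name to a list of (firefly, renilla) pairs,
    one pair per independent assay (at least two assays per construct).
    Assumption: values are scaled by the MEAN normalized baseline activity.
    """
    per_assay = {
        name: [normalized_activity(f, r) for f, r in pairs]
        for name, pairs in readings.items()
    }
    base = mean(per_assay[baseline])
    return {
        name: (mean(values) / base, stdev(values) / base)
        for name, values in per_assay.items()
    }


# Hypothetical luminometer readings (firefly, Renilla) from four assays.
example = {
    "GAL4-RFX1 Full": [(100, 50), (120, 60), (80, 40), (110, 55)],
    "GAL4-RFX1N": [(400, 50), (480, 60), (320, 40), (440, 55)],
}

for construct, (avg, sd) in relative_activities(example, "GAL4-RFX1 Full").items():
    print(f"{construct}: relative activity {avg:.2f} (S.D. {sd:.2f})")
```

Scaling by the baseline mean keeps a nonzero S.D. for the baseline construct itself; pairing each assay with its own baseline reading would be an equally plausible alternative.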

RESULTS
Cloning of Human RFX4-We performed two-hybrid screening with a human testis cDNA library and carried out sequence analysis of many cDNA clones. In the course of large scale sequence analysis, we found a cDNA that encodes a novel RFX-type dimerization domain. To elucidate the entire structure of this cDNA, we performed multiple RACE analyses of human testis cDNA. We isolated cDNA fragments for the 5′- and 3′-ends and then reconstituted the full-length cDNA from these isolated cDNA fragments. The combined cDNA sequence contained one large open reading frame. Searching the DNA data base, we found that this cDNA encodes the RFX4 protein.
Originally, RFX4 was identified from human breast cancers as a partial cDNA encoding a short RFX-type DNA binding domain fused to the estrogen receptor (21), but the entire structure of RFX4 has been unknown until now. We also noticed that the RFX4 gene is located on chromosome 12q24. Since the estrogen receptor gene is on chromosome 6q25 (29), the reported fusion of the estrogen receptor and the DNA binding domain of RFX4 appears to be generated by abnormal chromosomal translocation in breast cancers.
Structure and Expression of RFX4-The combined RFX4 cDNA consists of 2186 bp with a large open reading frame that is preceded by an in-frame stop codon. The RFX4 protein is deduced to consist of 563 amino acids (Fig. 1). Among RFX family members, the RFX4 protein is closely related to RFX1, RFX2, and RFX3 in structure (Fig. 2). The RFX4 protein carries the DNA-binding domain in the N-terminal region and the dimerization domain in the C-terminal region but lacks some other domains, such as the Q-rich and A regions, which play roles in transcriptional activation (12,14). As demonstrated by Fig. 3, RFX4 exhibits extensive sequence homology with other RFX members at the DNA-binding domain, the dimerization domain, and the B and C regions. Outside of these regions, RFX4 shows little similarity to the other RFX members. The RFX4 protein does not share significant homology with any known protein other than the RFX members.
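The criterion applied above, an initiation ATG preceded by an in-frame stop codon, can be illustrated with a short Python sketch. It is offered only as an illustration under stated assumptions: the test sequence is an invented toy string rather than the RFX4 cDNA, and the function names are hypothetical. For orientation, a 563-residue protein corresponds to 563 × 3 = 1689 coding nucleotides plus a stop codon.

```python
# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_upstream_inframe_stop(cdna: str, atg_index: int) -> bool:
    """Check that the ATG at atg_index is preceded by an in-frame stop codon.

    Walks upstream one codon at a time. Returns True when a stop codon is
    met first, and False when an earlier in-frame ATG is met first or when
    the 5' end of the sequence is reached without finding a stop codon.
    """
    seq = cdna.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    pos = atg_index - 3
    while pos >= 0:
        codon = seq[pos:pos + 3]
        if codon in STOP_CODONS:
            return True
        if codon == "ATG":
            return False
        pos -= 3
    return False


def orf_length_codons(cdna: str, atg_index: int):
    """Count codons from the ATG up to, not including, the first in-frame stop.

    Returns None when no in-frame stop codon is found downstream.
    """
    seq = cdna.upper()
    for pos in range(atg_index, len(seq) - 2, 3):
        if seq[pos:pos + 3] in STOP_CODONS:
            return (pos - atg_index) // 3
    return None


# Toy sequence (hypothetical): upstream TAG in frame with the ATG at index 8.
toy = "CCTAGGCCATGAAACCCTGA"
print(has_upstream_inframe_stop(toy, 8), orf_length_codons(toy, 8))
```

An upstream in-frame stop codon argues that the open reading frame does not extend further 5′, which is why the second 5′-RACE product was taken to contain the first ATG.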
We then carried out Northern hybridization with a human multiple tissue blot to analyze the tissue specificity of RFX4 expression. As shown in Fig. 4, the expression of RFX4 was highly specific to testis. The RFX4 mRNA was approximately 3.7 kb in length. A faint band of larger size was also detectable in the brain.
DNA Binding Specificity of RFX4-Since the RFX4 protein contains the typical RFX-type DNA-binding domain, we examined whether RFX4 has DNA binding ability. Oligonucleotides for the binding site of the RFX members in the IL-5Rα enhancer and its mutated sequence (12,22) were synthesized, made double-stranded, and end-labeled with 32P. The RFX4 protein tagged with the HA epitope was synthesized in vitro and reacted with 32P-labeled oligonucleotide probes. EMSA showed that the RFX4 protein binds to the IL-5Rα enhancer (Fig. 5, lane 3) but not to the mutated sequence (Fig. 5, lane 4). We confirmed the sequence-dependent binding of the RFX4 protein to DNA by competition analysis with an excess amount of unlabeled oligonucleotides (Fig. 5, lanes 5-10). We also verified the presence of the RFX4 protein in the DNA-protein complex, since the band was supershifted by adding the anti-HA monoclonal antibody to the reaction (Fig. 5, lane 11). Taken together, these observations demonstrate that RFX4 binds to DNA in a sequence-dependent manner.
We then tested the DNA-binding preferences of human RFX members. Double-stranded oligonucleotides for reported binding sites of the RFX members were used as unlabeled competitors in EMSA with the RFX proteins and the radiolabeled IL-5Rα probe (Fig. 6A). Loss of the signal for the RFX-probe complex meant effective competition and thus high affinity binding of the RFX proteins to the unlabeled competitors. As shown in Fig. 6B, oligonucleotides for hepatitis B virus enhancer, polyomavirus EF-C site, rpL30, and X-box served as efficient competitors, indicating that the RFX members bound to these sequences. This observation means that RFX4 has DNA binding specificity similar to those of other human RFX members. However, the MIE1 oligonucleotide did not disrupt the complex between the IL-5Rα probe and the RFX4 protein.
Together, these results demonstrate that the DNA recognition profile of RFX4 is similar to those of other RFX members, with a minor difference in affinity for the MIE1 binding site. We also observed that the binding of the RFX members to the periph-
Complex Formation of RFX4 with Other RFX Members-As RFX4 contains a typical RFX-type dimerization domain, we examined whether RFX4 interacts with other RFX members. The RFX4 protein tagged with the FLAG epitope was coexpressed with the HA-tagged RFX1, RFX2, RFX3, and RFX4 proteins, respectively, in COS7 cells. We carried out immunoprecipitation with anti-FLAG monoclonal antibody and examined whether the HA-tagged RFX members coprecipitated with FLAG-RFX4. As shown in Fig. 7, RFX4 formed stable complexes with RFX2 and RFX3, respectively. Under the same experimental conditions, the interaction between RFX1 and RFX4 could not be detected (Fig. 7), while the stable complexes of RFX1·RFX1 and RFX1·RFX2 were observed (data not shown). We also found that the RFX4 protein formed a homodimer.
Domain Analysis of the RFX4 Protein-We then examined whether RFX4 could function as a transcriptional regulator. RFX1 is known to carry a transcriptional activation domain in its N-terminal region, but the full-length RFX1 protein is functional as a transactivator only in a certain cell lineage (12). In other cell lineages, the transcriptional activity of RFX1 is masked intramolecularly by two inhibitory domains in its C-terminal region (12,14). We first studied whether RFX4 carries a transcriptional activation domain. We constructed a series of plasmids expressing various portions of the RFX4 protein fused to the DNA-binding domain of yeast Gal4. These plasmids were cotransfected with a reporter plasmid carrying the luciferase gene under the control of Gal4-binding sites. Transcriptional activities of the Gal4-RFX fusion proteins were measured as luciferase activities. As shown in Fig. 8A, the full-length RFX1 had little transcriptional activation in HeLa cells (Fig. 8A, GAL4-RFX1 Full), but the truncated form of RFX1 showed strong transcriptional activation (Fig. 8A, GAL4-RFX1N). On the other hand, no domains of RFX4 activated luciferase, demonstrating that RFX4 lacks a distinct transcriptional activation domain.
Next, we examined whether the C-terminal domain of RFX4 acts as an inhibitory domain. As mentioned above, the truncated RFX1 showed a strong transcriptional activation, while full-length RFX1 exhibited little activation. Two overlapping regions in RFX1 are known to function as separate inhibitory domains (13,14). We generated a series of plasmids to express chimeric RFX1·RFX4 proteins (Fig. 8B). When the C-terminal domain of the RFX1 protein was replaced with the corresponding region of the RFX4 protein, this chimeric RFX protein exhibited little transcriptional activity (Fig. 8B, GAL4-RFX1N-RFX4), suggesting that the C-terminal portion of RFX4 inhibited the transcriptional activity of the N-terminal region of RFX1 intramolecularly. We also showed that two overlapping regions of the RFX4 protein (amino acids 224-320 and 283-499) could function as distinct inhibitory domains (Fig. 8B, GAL4-RFX1N-RFX4ΔD and GAL4-RFX1N-RFX4ΔB). These observations indicate that the two regions of the RFX4 protein may have inhibitory activities, as reported for the RFX1 protein (13,14).

DISCUSSION
In this study, we found a cDNA clone that encodes a novel RFX-type dimerization domain in the course of sequencing human testis cDNA clones. Because the RFX members play important roles in many organisms (8,11,20), the finding of the novel RFX-type dimerization domain prompted us to elucidate the entire structure of this protein. A series of RACE analyses revealed that this novel RFX-type dimerization domain was a part of RFX4. RFX4 was originally identified in breast cancers as a short DNA-binding domain fused to the estrogen receptor (21), but the entire structure of RFX4 has been unknown until now. Here, we have described the entire primary structure and molecular properties of RFX4.
We mapped the RFX4 gene to chromosome 12q24 by searching the human genome data base. Since the estrogen receptor gene is located on chromosome 6q25 (29), the reported fusion of the estrogen receptor to the DNA binding domain of RFX4 appears to be generated by abnormal chromosomal translocation in breast cancers. Genetic instability is one of the major characteristics of malignant tumors, and chromosomal translocations are aberrations caused by genetic instability (30,31). In some cancers, chromosomal translocations generate gene fusions that express aberrant proteins with oncogenic activity (30). Since the reported fusion protein carries the DNA binding domain of RFX4 and the transcriptional activation domain of the estrogen receptor (21), it is of interest whether the chimeric RFX4-estrogen receptor is involved in abnormal gene expression that contributes to the development and progression of breast cancers.
Although the structure of RFX4 is highly homologous to other human RFX members, several differences in molecular properties were found in RFX4. First, the expression of RFX4 is stringently restricted to testis. This testis-specific mRNA distribution may reflect roles of RFX4 yet to be identified. Second, RFX4 had slightly different DNA binding specificity, in contrast to RFX1, RFX2, and RFX3, which display almost indistinguishable DNA binding properties. We observed that RFX4 is able to bind to several RFX-binding sites, such as the enhancer of IL-5Rα, but not to the MIE1 sequence in EMSA. These observations indicate that RFX4 has similar but slightly different DNA binding specificity compared with other RFX members. Finally, we showed that RFX4 exhibited no transcriptional activation, even when a deletion was made in its C-terminal domain. This means that among RFX members that function as dimers, human RFX4 is the first mammalian member not showing transcriptional activation. It is of interest whether homologs of human RFX4 exist in other mammalian species.

FIG. 6. DNA binding preferences of human RFX members. A, sequences of competitor oligonucleotides used in this study. Oligonucleotides for the indicated regulatory sequences were synthesized and made double-stranded. B, EMSA of human RFX proteins and IL-5Rα probe in the presence of various competitor DNA. EMSA of the RFX proteins with 32P-labeled IL-5Rα probe was performed in the presence of a 50-, 100-, and 250-fold molar excess of unlabeled competitors.

Crt1 is a RFX-type transcriptional repressor of S. cerevisiae. Under normal conditions, the Crt1 protein functions as a transcriptional repressor. Upon DNA damage or replication block, Crt1 is highly phosphorylated, and a set of genes for DNA repair is derepressed (11). At present, we cannot rule out the possibility that RFX4 is posttranslationally modified in response to signals and then contributes to some transcriptional activation processes.
The C-terminal domain of RFX4 shows strikingly conserved structure with human RFX1, RFX2, and RFX3 and exhibits inhibitory activity when fused to the N-terminal activation domain of RFX1. The C-terminal domain of RFX4 also mediates selective dimer formation with other RFX members. Based on these observations, we speculate that the dimer formation with other RFX members constitutes a major part of the mechanism by which RFX4 functions.

FIG. 7. Physical interaction of RFX4 with other RFX members. COS7 cells were cotransfected with plasmids for the FLAG-tagged RFX4 and HA-tagged RFX members as indicated at the tops of panels. V, vacant vectors (pFLAG-CMV5 or pcDNAHA) that were used as negative controls. Immunoprecipitation (IP) was performed with anti-FLAG monoclonal antibody M2 (Sigma) conjugated with protein G-Sepharose. Precipitated proteins were examined by Western blotting (WB) using anti-FLAG monoclonal antibody M2 (Sigma) or anti-HA monoclonal antibody (Roche Molecular Biochemicals). Upper panels, Western blotting of the whole-cell extracts with the anti-HA antibody to show that each HA-tagged RFX protein was expressed at a comparable level. Middle panels, coprecipitated HA-tagged RFX proteins. Lower panels, immunoprecipitated FLAG-RFX4 protein.

FIG. 8. Domain analysis of the RFX4 protein.
A, transcriptional activity of Gal4-RFX fusion proteins. Gal4-RFX plasmids were cotransfected with the pG5luc reporter plasmid into HeLa cells. Efficiency of each transfection was normalized by the activity of Renilla luciferase, which is constitutively expressed from the Gal4-RFX plasmids. Firefly luciferase activity of full-length RFX1 fused to Gal4 was set as base line (relative activity 1.0). Transcriptional activation was shown as the ratio of the firefly luciferase activity to that of the full-length RFX1. Average activities with S.D. values were calculated from four independent assays. B, the C-terminal region of the RFX4 protein has possible inhibitory activity. We constructed plasmids expressing the N-terminal domain of RFX1 fused to the C-terminal domain of RFX4. These RFX1·RFX4 plasmids were cotransfected with the pG5luc reporter plasmid into HeLa cells, and relative luciferase activities were examined as described above.

RFX1, RFX2, and RFX3 have been thought to be redundant genes because of their highly conserved structures and indistinguishable DNA binding properties. In this study, however, we showed that RFX4 binds stably with RFX2, RFX3, and RFX4 itself and does not bind to RFX1. This preference of interactions among human RFX members could lead to the formation of various RFX dimers. We speculate that various RFX dimers could have different properties, and each human RFX member could play a distinct role. Our results will contribute to further investigation in an effort to elucidate the functional roles of human RFX members in transcriptional regulation.