Cloning and characterization of a transcription factor that binds to the proximal promoters of the two mouse type I collagen genes.

We have used the yeast one-hybrid system to clone transcription factors that bind to specific sequences in the proximal promoters of the type I collagen genes. We utilized as bait the sequence between −180 and −136 in the pro-α2(I) collagen promoter because it acts as a functional promoter element and binds several DNA-binding proteins. Three cDNA clones were isolated that encoded portions of the mouse SPR2 transcription factor, whereas a fourth cDNA contained a potential open reading frame for a polypeptide of 775 amino acids and was designated BFCOL1. Recombinant BFCOL1 was shown to bind to the −180 to −152 segment of the mouse pro-α2(I) collagen proximal promoter and to two discrete sites in the proximal promoter of the mouse pro-α1(I) gene. The N-terminal portion of BFCOL1 contains its DNA-binding domain. DNA transfection experiments using fusion polypeptides with the yeast GAL4 DNA-binding segment indicated that the C-terminal part of BFCOL1 contained a potential transcriptional activation domain. We speculate that BFCOL1 participates in the transcriptional control of the two type I collagen genes.

Type I collagen is a protein that is abundantly synthesized by a discrete number of cell types including osteoblasts, odontoblasts, fibroblasts, smooth muscle cells, and mesenchymal cells. It is composed of two α1 chains and one α2 chain forming a characteristic triple helix. Expression of the genes for these polypeptides is coordinately regulated in a variety of physiological and pathological situations (1). Changes in the synthesis of type I collagen occur not only during embryonic development in specific tissues but also in disease states, for example during wound healing and in fibrotic diseases such as lung fibrosis, cirrhosis, and scleroderma. In many of these instances it is likely that the control of expression of the two type I collagen genes is mainly exerted at the level of transcription, but the precise mechanisms that control transcription of these genes are still poorly understood. Our long term goal is to identify the critical cis-acting elements in these two genes and both the cell-specific and ubiquitous transcription factors that presumably control their expression.
Recently, transgenic mouse studies have identified strong tissue-specific enhancer elements in the 5′-flanking regions of both type I collagen genes (2)(3)(4)(5). These elements are located further upstream than the proximal promoter elements. For instance, in the mouse pro-α1(I) gene, a potent enhancer element for osteoblast and odontoblast expression was localized about 1.6 kilobases (kb) upstream of the start of transcription, whereas another strong element for expression in tendon and fascia fibroblasts was found between −2.3 and −3.2 kb (2). Similar experiments from other laboratories have produced analogous results (3,4). These experiments strongly suggested that separate elements control the expression of this gene in different type I collagen-producing cells. In the pro-α2(I) gene, an element that strongly enhanced expression in fibroblasts and mesenchymal cells was located between 13.5 and 17.5 kb upstream of the transcription start (5). One can speculate that proteins binding to the upstream enhancers in both type I collagen genes cooperate with transcription factors binding to the proximal promoters to activate transcription in specific cell types.
Previous studies have identified several functional cis-acting elements in the 350-bp (base pair) proximal promoter of the mouse pro-α2(I) collagen gene (6). These included a binding site for the ubiquitous heterotrimeric CCAAT-binding factor (CBF) between −75 and −98 (7-9) and redundant GC-rich binding sites for several proteins between −65 and −105, between −114 and −131, and between −152 and −176 (10). Several classes of mainly ubiquitous proteins bind to these redundant sites. Transient expression and in vitro transcription experiments with wild-type and mutant templates indicated that the segment between −40 and −170 containing the three redundant elements was essential for promoter activation. Other studies identified three short cis-acting GC-rich elements in the human pro-α2(I) collagen gene between −323 and −264 (11) that were capable of binding SP1. Additional studies presented evidence that a protein complex that includes SP1 binds to this segment of the human promoter and participates in the transforming growth factor-β activation of this promoter (12). In the mouse promoter there is also a binding site for CTF/NF1 between −305 and −290 (13).
In the mouse pro-α1(I) collagen gene, the sequence between −220 and the TATA box presents strong homologies with the sequence of the pro-α2(I) gene in the same region. This DNA segment contains binding sites for DNA-binding factors that also bind to the proximal pro-α2(I) promoter. The DNA elements in the pro-α1(I) promoter include a binding site for CBF between −90 and −115, two apparently redundant sites between −190 and −170 and between −160 and −130 for a DNA-binding protein previously designated inhibitory factor 1 (IF-1), and two sites between −130 and −80 that flank the CBF-binding site and are binding sites for SP1 and probably other GC-rich binding proteins (14). DNA transfection experiments with the pro-α1(I) promoter showed that point mutations in the CBF-binding site decreased promoter activity, whereas small substitution mutations in some of the other sites resulted in an increase in transcription (14,15). It was also shown that the sequence of the pro-α2(I) promoter between −173 and −143 was able to compete for the binding of a protein that was forming a major DNA-protein complex with two redundant elements in the pro-α1(I) promoter between −190 and −170 and between −160 and −130, suggesting that both type I promoters contained binding sites for the same proteins (15). Fig. 1 summarizes binding sites for DNA-binding proteins in the two mouse proximal type I collagen promoters.
The purpose of the present study was to identify one or more trans-acting factors that bind to these proximal cis-acting elements in the type I collagen promoters. We have used the sequence between −180 and −136 of the pro-α2(I) collagen promoter to clone cDNAs for proteins binding to this segment using the yeast one-hybrid system (16,17). This segment was chosen mainly because previous experiments with the pro-α2(I) proximal promoter showed that the DNA segment between −180 and −136 was capable of binding an array of DNA-binding proteins, many of which also bound to the same promoter between −133 and −105 and between −105 and −65; the sequence between −180 and −136 also bound these proteins with greater efficiency than the more proximal sequences (10). One of the cDNAs that was cloned encodes a polypeptide of 775 amino acids, which also bound to two discrete sites in the pro-α1(I) collagen promoter.

MATERIALS AND METHODS
Cloning of DNA-binding Proteins Using the Yeast One-hybrid System-The yeast strain BY 164 (MATa his3Δ200 leu2-3,112 ura3-52 lys2-801a trp1a) was provided by Dr. Stevan Marcus. The yeast reporter plasmid was constructed as follows. Six tandem copies of a double-stranded oligonucleotide corresponding to the sequence from −180 to −136 bp of the mouse pro-α2(I) collagen promoter were inserted into the BamHI site of the vector pRS315HIS containing the LEU2 gene as selectable marker (16,18,19) to generate pRS315HIS-6x160 (160 denotes the sequence between −180 and −136). The XbaI-SalI fragment of pRS315HIS-6x160 was then subcloned into the XbaI-SalI site of the vector pRS305 (16); this plasmid was designated pRS305HIS-6x160. After digestion with ClaI, this vector was used for transformation. Yeast transformation was performed by the polyethylene glycol/lithium acetate method (20). Plasmid integration in the genome of yeast strains was confirmed by Southern blot analysis using a ³²P-labeled oligonucleotide from −180 to −136 bp. Cells were then plated on a minimal synthetic dextrose plate without histidine to verify background HIS3 gene activity. One of the yeast strains that had minimal HIS3 gene activity was selected as the strain for the subsequent transformation. Plasmid pJL638-6x160 contained six tandem copies of the sequence from −180 to −136 of the mouse pro-α2(I) promoter in the pBgl-lacZ plasmid harboring the URA3 gene as a selectable marker (17). The yeast strain in which both pRS305HIS-6x160 and pJL638-6x160 plasmids were integrated was used for cDNA library transformation. However, because lacZ was constitutively expressed at low levels in this strain, 5-bromo-4-chloro-3-indolyl β-D-galactoside staining could not be used for screening HIS3-positive colonies.
cDNAs were generated from the mRNA of primary fibroblasts of 14-day mouse embryos by priming with either oligo(dT) or a random hexamer using the TimeSaver cDNA synthesis kit (Pharmacia Biotech Inc.) and cloned into the EcoRI-NotI site or EcoRI site of plasmid pPC86 containing the TRP1 gene as a selectable marker (19). The ligation products were electroporated into the Escherichia coli strain MC1061, and the resultant transformants (6 × 10⁶ for the directionally cloned library, 3 × 10⁶ for the non-directionally cloned plasmid library) were plated onto ampicillin plates. After scraping the cells from the plates, the plasmid library was purified with Promega's Wizard Maxiprep DNA purification system with additional phenol and chloroform extractions. Ten to 20 µg of cDNA plasmid from the libraries were transformed into the yeast strain harboring the two reporter plasmids integrated into the genome and plated onto plates lacking leucine, uracil, tryptophan, histidine, and 3-amino-1,2,4-triazole. Transformation efficiency was about 2-3 × 10⁵/µg cDNA plasmid. Colonies were picked after 5-7 days. Plasmid cDNAs were extracted and used for retransformation either into the same yeast strain or into the yeast strain in which plasmid pRS305-HIS instead of pRS305-HIS-6x160 had been integrated.
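As a rough check on the scale of the yeast screen, the figures quoted above (10 to 20 µg of library plasmid, about 2-3 × 10⁵ transformants per µg) can be multiplied out. The sketch below is only illustrative arithmetic; the function name is hypothetical and the calculation assumes that yield scales linearly with the amount of DNA.

```python
def expected_transformants(dna_ug, efficiency_per_ug):
    """Expected number of yeast transformants, assuming yield is
    proportional to the amount of plasmid DNA (an idealization)."""
    return dna_ug * efficiency_per_ug

# Lower and upper bounds taken from the values stated in the text.
low = expected_transformants(10, 2e5)
high = expected_transformants(20, 3e5)

print(f"Expected yeast transformants: {low:.0e} to {high:.0e}")
```

Under these assumptions a single transformation would yield on the order of a few million colonies.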
DNA Sequencing-DNA sequencing was carried out using a primer present in the DNA for the GAL4 transactivation domain (5′-GGATGTTTAATACCACT-3′) or T3, T7, and SP6 primers.
Expression of Cloned cDNAs by in Vitro Transcription and Translation-Three different recombinant polypeptides corresponding to the full-length, the N-terminal part, and the C-terminal part of BFCOL1 were generated using the TNT-coupled reticulocyte lysate system (Promega Corp). For the full-length polypeptide, the SalI-NotI fragment of pPC86-BFCOL1 was inserted into the SalI-NotI site of the pBluescript KS vector (pBS-BFCOL1-full). For the N-terminal and C-terminal polypeptides, the SalI-NsiI fragment and the ScaI-NotI fragment of pPC86-BFCOL1 DNA were inserted into the SalI-PstI site of pGEM-5Zf(+) and the EcoRV-NotI site of pBluescript KS to generate p5Zf-BFCOL1-N and pBS-BFCOL1-C, respectively. ³⁵S-Labeled polypeptide products were analyzed using 10% sodium dodecyl sulfate (SDS)-polyacrylamide gel electrophoresis. The gels were exposed to Fuji RX film.
Generation of Fusion Polypeptides with Glutathione S-Transferase (GST)-Three different fusion polypeptides were generated. For the full-length and N-terminal fusion polypeptides, the SalI-NotI fragment of pPC86-BFCOL1 and the SalI-NotI fragment of p5Zf-BFCOL1-N were inserted, respectively, into the XhoI-NotI site of the pGEX-4T-3 vector (Pharmacia Biotech Inc.). For the C-terminal fusion polypeptide, the SalI-NotI fragment of pBS-BFCOL1-C was inserted into the SalI-NotI site of the pGEX-4T-3 vector. Procedures for production and purification of fusion polypeptides were carried out as suggested by the manufacturer.
DNase I Footprinting-DNA-binding reactions were performed according to the method of the Core Footprinting System (Promega Corp). Ten femtomoles (1 µl) of the end-labeled BamHI-NarI fragment containing the −350 to +7-bp sequence of the pro-α2(I) collagen promoter inserted in the HindIII site of pA3LUC (20) was used as a DNA substrate. Binding reactions were started by addition of the glutathione S-transferase (GST) protein or GST-BFCOL1 full-length recombinant polypeptide. At the end of the reaction the samples were heat-denatured and loaded onto a 6% polyacrylamide, 8 M urea gel. Gels were then autoradiographed at −80°C with an intensifying screen.
Gel Retardation Assay-One microliter of recombinant protein (either products of in vitro transcription and translation or GST-fusion polypeptides) was incubated with 5 fmol of end-labeled double-stranded oligonucleotide in a volume of 10 µl. Incubation was carried out at room temperature for 20 min. All binding reactions contained 10 mM Tris-HCl (pH 7.5), 4% glycerol, 50 mM NaCl, 0.5 mM EDTA, 0.5 mM dithiothreitol, 1 mM MgCl₂, and 0.5 µg of poly(dA-dT). Following electrophoresis in a 5% polyacrylamide Tris borate/EDTA gel, the gel was dried and subjected to autoradiography at room temperature. The SP1 consensus oligonucleotide was purchased from Promega. The Krox consensus oligonucleotide (23) and other oligonucleotides containing specific sequences of the pro-α1(I) and pro-α2(I) collagen promoters were produced by an oligonucleotide synthesizer. Sequences of oligonucleotides used for competition experiments are listed in Fig. 2.
Transfection Experiments-The SmaI-SacI fragment of pPC86-BFCOL1, which contains the full-length DNA of BFCOL1, and the ScaI fragment of pPC86-BFCOL1, which contains the N-terminal portion of BFCOL1, were subcloned into the SmaI-SacI- and SmaI-digested pSG424 vector, respectively (24). In the case of the plasmid containing the C-terminal part of BFCOL1, we first made two constructs. First, the XhoI-BamHI fragment of pAS2 (25) containing the DNA-binding domain of the yeast transcription factor GAL4 as well as several cloning sites was subcloned into pSG424 to add supplementary cloning sites (pSG424-m). Second, the ScaI-NotI fragment of pPC86-BFCOL1 that contained the C-terminal part of BFCOL1 was subcloned into the EcoRV-NotI site of plasmid pCITE-2a (Novagen, Inc.) (pCITE-2a-BFCOL1-C). Finally, the NdeI-XhoI fragment of pCITE-2a-BFCOL1-C was subcloned into the NdeI-SalI site of pSG424-m to make pSG424-BFCOL1-C. Transfections were carried out using 10 µg of the reporter plasmid containing the GAL-binding sites upstream of an SV40 promoter linked to the chloramphenicol acetyltransferase (CAT) gene, 5 µg of pSG424-derivative plasmid, and 5 µg of SVβgal plasmid into 714 BALB 3T3 fibroblast cells (26). Cells were harvested after 48 h and assayed for CAT activity (27). β-Galactosidase activity was measured with a resorufin-β-D-galactopyranoside substrate (Boehringer Mannheim).
RNA Isolation and Northern Blot Analysis-Total RNA was extracted from 714 BALB 3T3 fibroblasts, NIH 3T3 fibroblasts, S194 B cells, and EL4 T cells using TRIzol solution (Life Technologies, Inc.). About 20 µg of each total RNA preparation was electrophoresed on 1% agarose gels containing 1.1 M formaldehyde. The RNA was transferred to Hybond-N nylon membranes (Amersham Corp.). mRNA was detected with a ³²P-labeled EcoRI-BamHI fragment of BFCOL1 (Multiprime labeling system, Amersham Corp.) after hybridization for 18 h at 42°C in 5× SSPE (1× SSPE is composed of 0.18 M NaCl, 10 mM sodium phosphate (pH 7.7), and 1 mM EDTA), 5× Denhardt's solution (1× Denhardt's solution is composed of 0.02% bovine serum albumin, 0.02% Ficoll, and 0.02% polyvinylpyrrolidone), 50% formamide, 0.1% SDS, 50 µg/ml heat-denatured salmon testis DNA, and radioactive probe. Membranes were washed twice for 15 min each at 65°C in a solution containing 2× SSC (1× SSC is composed of 0.15 M NaCl and 15 mM sodium citrate) and 0.1% SDS, then once in 1× SSC with 0.1% SDS for 30 min at 65°C, and finally twice for 15 min each in 0.1× SSC with 0.1% SDS at room temperature. The membranes were then autoradiographed at −80°C using Fuji RX film. Human glyceraldehyde-3-phosphate dehydrogenase cDNA (Ambion) was used as an internal control.
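The hybridization and wash buffers above are given as multiples of the 1× SSPE and 1× SSC recipes. The following sketch simply scales those stated 1× compositions to the working strengths mentioned in the text; the dictionary and function names are hypothetical and introduced only for illustration.

```python
# 1x compositions as stated in the text, in millimolar.
SSPE_1X_MM = {"NaCl": 180.0, "sodium phosphate": 10.0, "EDTA": 1.0}
SSC_1X_MM = {"NaCl": 150.0, "sodium citrate": 15.0}

def scale_buffer(composition_mm, fold):
    """Return component concentrations (mM) of a buffer at `fold` x strength."""
    return {name: conc * fold for name, conc in composition_mm.items()}

hybridization_sspe = scale_buffer(SSPE_1X_MM, 5)   # 5x SSPE
first_wash_ssc = scale_buffer(SSC_1X_MM, 2)        # 2x SSC
final_wash_ssc = scale_buffer(SSC_1X_MM, 0.1)      # lowest-salt wash

for label, buf in [("5x SSPE", hybridization_sspe),
                   ("2x SSC", first_wash_ssc),
                   ("0.1x SSC", final_wash_ssc)]:
    parts = ", ".join(f"{name} {conc:g} mM" for name, conc in buf.items())
    print(f"{label}: {parts}")
```

The successive washes lower the salt concentration, which is the usual way of raising stringency.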
Reverse Transcriptase-Polymerase Chain Reaction (RT-PCR)-Total RNAs of S194 B cells, HT29 cells, and HSC34 cells were annealed with the oligonucleotide 5′-CTCTTAATCTCCACATTCAGTGCCTG-3′ (12c) and reverse-transcribed using avian myeloblastosis virus reverse transcriptase. The cDNA products were then subjected to PCR using oligonucleotides 12n 5′-AATAGTAAGAGAAGTCTGAA-3′ and 12c. The sense strands of 12n and 12c are indicated in Fig. 3, A and B.
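Because 12c primes reverse transcription, its sense-strand counterpart is its reverse complement. The short sketch below derives that sequence and reports the length and GC content of both primers as given in the text; the helper names are hypothetical, and no melting temperatures are estimated because the reaction conditions are not stated.

```python
PRIMER_12N = "AATAGTAAGAGAAGTCTGAA"        # sense primer, 5'->3'
PRIMER_12C = "CTCTTAATCTCCACATTCAGTGCCTG"  # antisense primer, 5'->3'

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def gc_count(seq):
    """Number of G or C residues in the sequence."""
    return sum(1 for base in seq if base in "GC")

for name, seq in [("12n", PRIMER_12N), ("12c", PRIMER_12C)]:
    gc = gc_count(seq)
    print(f"{name}: {len(seq)} nt, GC {gc}/{len(seq)} ({100 * gc / len(seq):.0f}%)")

print("sense-strand equivalent of 12c:", reverse_complement(PRIMER_12C))
```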

Cloning of cDNAs for Polypeptides That Bind to the −180 to −136-bp Segment of the Mouse Pro-α2(I) Collagen Promoter-Previous studies indicated that the GC-rich DNA segment between −180 and −136 of the mouse pro-α2(I) collagen gene was able to bind several different DNA-binding proteins in vitro and that this segment was also able to compete for the binding of proteins to two other redundant but discrete sites closer to the transcription start site of this promoter (10). A deletion of this same segment resulted in a substantial decrease in promoter activity. To begin to identify some of the proteins that bound to this segment, we used this DNA as bait in the yeast one-hybrid system and screened two mouse embryo fibroblast cDNA libraries, one primed with oligo(dT) (library 1) and the other primed with a random hexamer oligonucleotide (library 2). In the yeast strain that was used for selection, plasmid pRS305-HIS-6x160 was integrated into the genome. In this plasmid, six tandem copies of the sequence of the pro-α2(I) gene between −180 and −136 were cloned upstream of a minimal yeast GAL1 promoter itself linked to the HIS3 gene. After screening six million independent colonies from library 1 and three million from library 2, an initial 81 histidine-positive colonies from library 1 and 44 histidine-positive colonies from library 2 were picked; 17 cDNA plasmids from library 1 and 12 cDNA plasmids from library 2 gave positive colonies upon retransformation of the parental strain. However, most of these also gave histidine-positive colonies with the yeast strain in which the control plasmid pRS305-HIS was integrated, a plasmid that contains the minimal GAL1 promoter but not six tandem copies of the −180 to −136 sequence. Only four cDNAs, all from library 1, could specifically activate the HIS3 gene of the yeast strain containing the pRS305-HIS-6x160 plasmid without activating the HIS3 gene of the strain with the pRS305-HIS control plasmid.
This suggested the possibility that the recombinant fusion polypeptides encoded by each of these four cDNAs might bind specifically to the −180 to −136-bp segment of the pro-α2(I) collagen gene and not to the DNA sequence of the GAL1 minimal promoter. One cDNA clone was designated as BFCOL1 (see below). Two other cDNAs contained an almost full-length coding sequence for mouse SPR2 (28), and the fourth cDNA was a shorter partial cDNA for SPR2.
... human T cell receptor (29). Subsequent to the codon for amino acid 400 in BFCOL1, the reported nucleotide sequence of htβ DNA displays two translational frame changes compared with that of BFCOL1 immediately followed by a termination codon (29). The sequence of the 340 amino acids at the C terminus of BFCOL1 has no significant amino acid sequence homology with other polypeptides present in GenBank databases. The nucleotide sequence preceding the initial methionine codon of BFCOL1 is also similar to that of htβ except that the first 25 nucleotide residues at the 5′ end of the cDNA of BFCOL1 are different from comparable residues in htβ. As reported for htβ, the deduced amino acid sequence of BFCOL1 contains four potential zinc finger motifs.
Since the reported nucleotide sequence of the cDNA of htβ following the termination codon is also about 90% identical to that of BFCOL1 DNA, we performed RT-PCR experiments in order to verify the nucleotide sequence of the human homologue of BFCOL1 RNA. We used two primers that bracketed the sequence containing the reported frameshifts and stop codon in htβ and RNAs from two different human cell lines, the colon carcinoma cell line HT29 and the stomach cell line HSC34. The locations of the primers that were used (12n, 12c) are indicated in Fig. 3A. The sequence of the PCR product from the RNAs of these human cells is presented in Fig. 3B and shows a continuous open reading frame without the frameshifts that were reported earlier.
We then asked whether the entire open reading frame shown in Fig. 3A was translated into a polypeptide of the expected size, and we constructed three plasmids for in vitro transcription-translation. One plasmid encoded the full-length BFCOL1 (pBS-BFCOL1-full, from 1 to 2426), whereas the others encoded the N-terminal part (p5Zf-BFCOL1-N, from 1 to 1156) and the C-terminal segment (pBS-BFCOL1-C, from 1395 to 2426) of BFCOL1. The major product of pBS-BFCOL1-full (Fig. 4, lane 1) was a single polypeptide, whereas the DNA of p5Zf-BFCOL1-N (lane 2) gave rise to two major polypeptides and several fainter species. The major polypeptide species appeared to run more slowly by SDS-polyacrylamide gel electrophoresis than expected from their estimated molecular sizes (expected sizes are 89 kDa for pBS-BFCOL1-full and 43 kDa for p5Zf-BFCOL1-N), possibly because of increased SDS binding. The major product of pBS-BFCOL1-C (Fig. 4, lane 3) was a single polypeptide that had about the expected size (estimated size is 37 kDa).
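A back-of-the-envelope estimate of polypeptide size can be made from residue counts. The sketch below uses the common rule of thumb of about 110 Da per amino acid residue, which is an assumption and not the method used to obtain the sizes quoted above; it also treats each stated nucleotide span as if it were entirely coding, which gives only an upper bound on the number of residues. All function names are hypothetical.

```python
AVERAGE_RESIDUE_DA = 110  # rule-of-thumb average residue mass (assumption)

def max_residues(first_nt, last_nt):
    """Upper bound on encoded residues if the whole span were coding."""
    return (last_nt - first_nt + 1) // 3

def approx_mass_da(n_residues):
    """Approximate polypeptide mass in daltons."""
    return n_residues * AVERAGE_RESIDUE_DA

# Nucleotide spans as given in the text for the three constructs.
constructs = {
    "pBS-BFCOL1-full": (1, 2426),
    "p5Zf-BFCOL1-N": (1, 1156),
    "pBS-BFCOL1-C": (1395, 2426),
}

for name, (start, end) in constructs.items():
    n = max_residues(start, end)
    print(f"{name}: up to {n} residues, roughly {approx_mass_da(n) / 1000:.0f} kDa")

# The 775-residue open reading frame reported for BFCOL1.
print(f"775-residue ORF: roughly {approx_mass_da(775) / 1000:.0f} kDa")
```

These crude figures fall within a few kilodaltons of the expected sizes quoted in the text; an exact value would require the actual amino acid composition.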
DNA-binding Experiments-Gel shift experiments were performed with the −180 to −136 DNA segment of the mouse pro-α2(I) collagen promoter that was used in the one-hybrid screen to verify whether BFCOL1 was able to form a DNA-protein complex under the conditions of this assay and to determine which segment of the BFCOL1 polypeptide contained a DNA-binding domain. In these experiments, the products of in vitro transcription-translation shown in Fig. 4 were tested. With the full-length BFCOL1, one major protein-DNA complex was detected (Fig. 5, lane 4), whereas two protein-DNA complexes were seen with the N-terminal BFCOL1 (Fig. 5, lane 3), the upper complex being less intense than the lower complex, although both complexes were migrating faster than the complex with full-length BFCOL1. The presence of these two complexes could possibly be a result of the heterogeneity of protein products observed with the cDNA for N-terminal BFCOL1 (see Fig. 4, lane 2). With the polypeptide corresponding to the C-terminal part of BFCOL1, no protein-DNA complexes were detected other than nonspecific bands (Fig. 5, lane 2). These results suggested that the N-terminal part of BFCOL1 contained a DNA-binding domain and that DNA binding might be mediated by the four tandem zinc fingers, consistent with previous results obtained with htβ (29).
The proximal promoter of the mouse pro-α2(I) collagen gene contains several redundant GC-rich elements. To examine whether other segments of the 350-bp proximal promoter of this gene contained binding sites for BFCOL1, in vitro DNase I footprints were performed using a recombinant GST-full-length BFCOL1 fusion polypeptide and the promoter segment from −350 to +7. As shown in Fig. 6, the recombinant GST-full-length BFCOL1 fusion polypeptide protected the region between −180 and −152 (lane 2) and only this segment. When a similar DNase I footprint was performed using the promoter labeled on the other strand (lane 4), again no other protected regions were observed. Hence, recombinant BFCOL1 binds only to one specific sequence in this proximal promoter and not to other GC-rich sequences.
To further confirm the specific binding of BFCOL1 to a discrete site in the proximal promoter of the pro-α2(I) collagen gene, gel shift experiments were performed using a ³²P-labeled −180 to −136 oligonucleotide as probe and several competitor DNA oligonucleotides corresponding to specific sequences present in the mouse pro-α2(I) collagen proximal promoter (Fig. 7). In this experiment, the product of full-length BFCOL1 DNA was used, generated by in vitro transcription-translation. As expected, the −180 to −136 oligonucleotide (lane 2) competed for binding, as did, with somewhat less efficiency, the shorter −176 to −152 oligonucleotide (lane 3), which is included in the former. Other oligonucleotides from −140 to −86 (lane 4), from −135 to −104 (lane 5), from −105 to −65, which includes the CBF-binding site (lane 6), and from −315 to −284, which includes a CTF/NF1-binding site (lane 7), were unable to compete. An SP1 consensus oligonucleotide had practically no effect (lane 8), and an oligonucleotide containing a Krox consensus binding site was also unable to compete (lane 9). When labeled −140 to −86 and −105 to −65 oligonucleotide probes were used in gel shift assays, BFCOL1 was unable to bind to these DNAs (data not shown).
FIG. 4. Synthesis of full-length, the N-terminal, and the C-terminal part of BFCOL1 by in vitro transcription and translation. pBS-BFCOL1-full, p5Zf-BFCOL1-N, and pBS-BFCOL1-C DNA plasmids were incubated with rabbit reticulocyte lysate and [³⁵S]methionine. In addition, T3 RNA polymerase was used for pBS-BFCOL1-full and pBS-BFCOL1-C DNAs and SP6 RNA polymerase for p5Zf-BFCOL1-N DNA. Polypeptide products were electrophoresed on a 10% SDS-polyacrylamide gel. Molecular mass standards are indicated on the right.
These results confirmed that BFCOL1 was specifically binding to the −180 to −152 region of the pro-α2(I) collagen promoter and indicated that this binding site must be different from a binding site for either SP1 or Krox and their family members.
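The competition pattern can be related to promoter coordinates by asking which competitor oligonucleotides overlap the footprinted −180 to −152 region. The sketch below does this with a simple interval-overlap test; it is purely positional bookkeeping under the coordinates quoted in the text, says nothing about binding affinity, and uses hypothetical names throughout.

```python
FOOTPRINT = (-180, -152)  # region protected by GST-BFCOL1 in the pro-alpha2(I) promoter

# Competitor oligonucleotides from the pro-alpha2(I) proximal promoter.
COMPETITORS = {
    "-180 to -136": (-180, -136),
    "-176 to -152": (-176, -152),
    "-140 to -86": (-140, -86),
    "-135 to -104": (-135, -104),
    "-105 to -65": (-105, -65),
    "-315 to -284": (-315, -284),
}

def overlaps(a, b):
    """True if closed intervals a and b (start <= end) share any position."""
    return a[0] <= b[1] and b[0] <= a[1]

overlap_table = {name: overlaps(span, FOOTPRINT) for name, span in COMPETITORS.items()}

for name, hit in overlap_table.items():
    print(f"{name}: {'overlaps' if hit else 'does not overlap'} the footprint")
```

Only the two oligonucleotides that competed for binding overlap the protected region.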
The results of previous gel shift experiments using crude nuclear extracts of NIH/3T3 fibroblasts were consistent with the hypothesis that a DNA-binding protein was binding to discrete sites in both the proximal pro-α1(I) and pro-α2(I) collagen promoters (14,15). This protein was tentatively designated inhibitory factor 1 (IF-1) based on the result that substitution mutations in its binding sites in each promoter, which inhibited its binding, resulted in an increase in promoter activity. The binding site of IF-1 in the pro-α2(I) promoter corresponded to the binding site for BFCOL1, whereas the binding sites in the pro-α1(I) promoter were located between −190 and −170 and between −160 and −130. To test whether these sites in the pro-α1(I) promoter could also bind BFCOL1, gel shift assays were performed using two labeled oligonucleotides from −194 to −168 and from −168 to −129 in the mouse pro-α1(I) collagen promoter in conjunction with the GST-full-length BFCOL1 fusion polypeptide. When this GST-fusion polypeptide was used with an oligonucleotide corresponding to the sequence between −180 and −136 in the pro-α2(I) promoter, two complexes were observed, a slower migrating complex and a more intense faster migrating complex (Fig. 8, lane 1). The difference in the pattern of DNA-protein complexes between those observed with GST-BFCOL1 fusion polypeptides synthesized in E. coli and those seen with BFCOL1 synthesized in the reticulocyte lysate (see Fig. 5) could possibly be due to the heterogeneity of the GST-BFCOL1 fusion polypeptides as examined by SDS-polyacrylamide gel electrophoresis (data not shown). Fig. 8 shows that the recombinant GST-BFCOL1 full-length fusion polypeptide was also able to bind to the −168 to −129 pro-α1(I) oligonucleotide and with much less efficiency to the −194 to −168 pro-α1(I) oligonucleotide (Fig. 8, lanes 9 and 17).
Mutations were then introduced into these DNA segments, i.e. 5′ CGCGCCCCCCC 3′ → 5′ CGCGCTTTCCC 3′ in the −194 to −168 sequence of the pro-α1(I) promoter (sequence represents lower strand) and 5′ CCTCCCCCCTC 3′ → 5′ GGTCCGCCCTC 3′ in both the −168 to −129 sequence of pro-α1(I) and the −180 to −136 sequence of the pro-α2(I) promoter, and the mutant oligonucleotides were tested in DNA-binding assays with recombinant BFCOL1. Lanes 8, 16, and 24 of Fig. 8 show that each of these mutations abolished the binding of the recombinant GST-BFCOL1 full-length polypeptide. We also performed competition experiments using the wild-type oligonucleotides as probes. With each of the three wild-type-labeled oligonucleotide probes, the wild-type −180 to −136 sequence of pro-α2(I) (Fig. 8, lanes 2, 10, and 18) and the wild-type −168 to −129 sequence of pro-α1(I) (Fig. 8, lanes 4, 12, and 20) acted as strong competitors. In contrast, the mutant −180 to −136 sequence of pro-α2(I) (Fig. 8, lanes 3, 11, and 19) and the mutant −168 to −129 sequence of pro-α1(I) (Fig. 8, lanes 5, 13, and 21) were unable to compete. The wild-type −194 to −168 oligonucleotide of the pro-α1(I) collagen promoter had little effect as competitor when the other two oligonucleotides were used as probes (Fig. 8, lanes 6 and 14), confirming the notion that this sequence binds BFCOL1 much less efficiently than the other two sites. A mutant oligonucleotide corresponding to the −194 to −168 sequence of the pro-α1(I) promoter had no effect as competitor with all oligonucleotide probes (Fig. 8, lanes 7, 15, and 23). Hence, BFCOL1 binds to two sites in the pro-α1(I) proximal promoter with different efficiencies and to one site in the pro-α2(I) collagen proximal promoter. The locations of these binding sites are the same as those previously identified as binding to IF-1. The same substitution mutations that inhibited IF-1 binding in crude nuclear extracts (14,15) also inhibited BFCOL1 binding.
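To make the two substitution mutations easier to compare, the sketch below lists the positions at which each mutant 11-mer differs from its wild-type counterpart. The sequences are those quoted in the text; positions are counted from the 5′ end of the quoted 11-mer (not promoter coordinates), and the function name is hypothetical.

```python
MUTATIONS = {
    "pro-alpha1(I) -194 to -168 (lower strand)": ("CGCGCCCCCCC", "CGCGCTTTCCC"),
    "pro-alpha1(I) -168 to -129 and pro-alpha2(I) -180 to -136": ("CCTCCCCCCTC", "GGTCCGCCCTC"),
}

def substitutions(wild_type, mutant):
    """Return (position, wild-type base, mutant base) for each mismatch.

    Positions are 1-based within the quoted sequence; both sequences
    must have the same length.
    """
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must be the same length")
    return [(i, w, m)
            for i, (w, m) in enumerate(zip(wild_type, mutant), start=1)
            if w != m]

for site, (wt, mut) in MUTATIONS.items():
    changes = ", ".join(f"{w}{i}{m}" for i, w, m in substitutions(wt, mut))
    print(f"{site}: {changes}")
```

In both cases three C residues within the C-rich run are replaced.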
Functional Analysis-To test whether BFCOL1 could either activate or inhibit transcription, DNAs for the "full-length" and N-terminal segment of BFCOL1 were cloned in a mammalian expression vector, and these DNAs were cotransfected with a pro-α2(I) collagen promoter (−350 to +54) linked to the luciferase reporter gene in BALB 3T3 fibroblasts. A plasmid containing DNA for the C-terminal part of BFCOL1, lacking the DNA-binding domain and driven by the same mammalian expression promoter, served as control. No activation occurred with any of the three BFCOL1 constructions (data not shown). At higher concentrations the full-length BFCOL1 and the C-terminal BFCOL1 segment caused inhibition of the 350-bp pro-α2(I) collagen promoter, presumably as a result of squelching, i.e. titration of another transcription factor that is important for expression of this promoter (30). Very similar results were observed when the promoter contained six tandem repeats of the −180 to −136 pro-α2(I) sequence cloned upstream of a minimal pro-α2(I) promoter (−40 to +54). Again at higher concentrations of BFCOL1, inhibition occurred, but this took place even when the reporter plasmid contained mutations that abolished binding of BFCOL1, strongly suggesting that the inhibition was not dependent on binding of BFCOL1 to DNA and hence presumably due to squelching. Similar results were also obtained after cotransfection of the BFCOL1 plasmids and the reporter plasmids in S194 B cells (data not shown).
To determine whether segments of BFCOL1 contained a potential transactivation domain, three mammalian expression plasmids were constructed coding for fusion polypeptides with the yeast GAL4 DNA-binding domain. The DNAs for full-length, N-terminal, and C-terminal fusion polypeptides were cotransfected along with a plasmid containing a GAL4-binding site upstream of an SV40 promoter itself linked to the CAT gene. Activation of the reporter gene occurred only with the plasmid coding for the GAL4-BFCOL1 C-terminal fusion polypeptide. No transcriptional activation was detected with either the GAL4-BFCOL1 full-length or the GAL4-BFCOL1 N-terminal fusion polypeptides (Fig. 9). This experiment suggested that the C-terminal segment of BFCOL1 contained a potential transcription activation domain. This does not exclude the possibility that the other segments of BFCOL1 might contain additional activation domains. Indeed, the presence of the BFCOL1 DNA-binding domain in the two other fusion polypeptides might eventually interfere with binding to the GAL4-binding site in the promoter of the reporter plasmid.
Northern Blot Analysis-To determine the size of the BFCOL1 RNA transcripts, a Northern hybridization experiment was performed (Fig. 10). Three RNA transcripts were identified as follows: one transcript had a size of about 9 kb, another of about 5.5 kb, and a third RNA, which migrated more slowly than the 9-kb species, was seen in S194 B cells. These RNAs are all larger than the size of our cDNA. This pattern of RNAs is analogous to that previously shown to hybridize to the htβ DNA probe although the shorter RNA was shown to have a mobility of 4 to 4.2 kb in humans. It is possible that in the human RNA either the 3′-untranslated segment or the 5′-untranslated segment or both are shorter than in the mouse RNA (29). In our experiments, the two major species of 9 and 5.5 kb were seen in two fibroblast cell lines, a B cell line and a T cell line.
... enhancer, implying that at least some of the several proteins that bind to this element should act as transcription activators. We speculate that as with nuclear extracts in vitro (10), several proteins in vivo also compete for binding to this element. In experiments using fusion polypeptides with the yeast GAL4 DNA-binding domain, we showed that the C-terminal part of BFCOL1 had the potential of being a transcriptional activator domain; the degree of activation was, however, weaker than with the activation domain of two other DNA-binding proteins that were similarly tested as GAL4 fusion polypeptides. One possible hypothesis for the increase in promoter activity that took place with mutations that abolished the binding of BFCOL1 (14,15) is that other transcription factors, whose binding sites in the same area of the promoter partly overlap with that of BFCOL1, would now bind more efficiently. If these transcription factors were more potent activators than BFCOL1, then the net result would be an increase in promoter activity.
In brief, BFCOL1 appears to be one of several ubiquitous proteins that bind to discrete sites in the proximal promoters of the two type I collagen genes and probably control the activity of these promoters in conjunction with other ubiquitous DNA-binding proteins, as well as tissue-specific transcription factors.
It was interesting that in addition to BFCOL1 the yeast one-hybrid system identified SPR2, a DNA-binding protein related to SP1. Our earlier experiments had suggested that the −180 to −136-bp segment of the pro-α2(I) gene was capable of binding SP1 and other proteins different from SP1 that bind to a consensus SP1-binding site (10). SPR2 could be one of these proteins.