The yeast STM1 gene encodes a purine motif triple helical DNA-binding protein.

The formation of triple helical DNA has been invoked in several cellular processes including transcription, replication, and recombination. Using conventional and affinity chromatography, we purified from Saccharomyces cerevisiae whole-cell extract a 35-kDa protein that avidly and specifically bound a purine motif triplex (with a K_d of 61 pM) but not a pyrimidine motif triplex or duplex DNA. Peptide microsequencing identified this protein as the product of the STM1 gene. Confirmation that Stm1p is a purine motif triplex-binding protein was obtained by electrophoretic mobility shift assays using either bacterially expressed, recombinant Stm1p or whole-cell extracts from stm1Δ yeast. Stm1p has previously been identified as G4p2, a G-quartet nucleic acid-binding protein. This suggests that some proteins actually recognize features shared by G4 DNA and purine motif triplexes, e.g. Hoogsteen hydrogen-bonded guanines. Genetically, the STM1 gene has been identified as a multicopy suppressor of mutations in several genes involved in mitosis (e.g. TOM1, MPT5, and POP2). A possible role for multiplex DNA and its binding proteins in mitosis is discussed.

It has long been recognized that, under the proper conditions, certain DNA sequences preferentially adopt a structure composed of three nucleic acid strands (1). Triple helical or triplex DNA is a thermodynamically favored structure characterized by a third pyrimidine-rich (Py triplex)¹ or purine-rich (Pu triplex) DNA strand located within the major groove of a homopurine/homopyrimidine stretch of duplex DNA (reviewed in Ref. 2). Both intermolecular triplexes, where the third strand originates from a separate DNA molecule, and intramolecular triplexes (H-DNA), where the third strand originates from a proximal site on the same DNA molecule as its duplex acceptor, have been described. In intermolecular and intramolecular triplexes, stable interaction of the third strand is achieved through either specific Hoogsteen (Py triplex) or reverse Hoogsteen (Pu triplex) hydrogen bonding to the homopurine strand of the duplex, with the third strand adopting either a parallel (Py triplex) or antiparallel (Pu triplex) orientation relative to the homopurine acceptor. Base triplets in the pyrimidine motif include T*AT and C⁺*GC, whereas those in the purine motif include G*GC, A*AT, and T*AT. Because cytosine protonation requires acidic pH (3) and the G*GC base triplet is the most stable in the purine motif (4), T-rich Py motif or G-rich Pu motif triplexes would be expected to predominate under physiological conditions.
Do triplexes occur in vivo? Although direct proof is lacking, long oligopurine tracts with triplex-forming potential are quite common in eukaryotic genomes, ranging from yeast to human (5). These tracts are distributed nonrandomly and are typically located near gene promoters, recombination hot spots, and matrix attachment regions (6, 7). Additionally, multiple lines of evidence have implicated intramolecular triplexes in several cellular processes, including transcription, replication, and recombination (8-13).
Finally, monoclonal antibodies generated against triplex DNA were found to interact nonuniformly with metaphase chromosomes and interphase nuclei, preferentially staining centromeric regions (14,15). Taken together, these data support the existence of triplex DNA at some point during the life cycle of a eukaryotic cell and suggest an important role for these structures in DNA-dependent biological processes.
If triplexes form in vivo, whether as required intermediates or as undesired side products of a necessary process, then cellular proteins might also exist that specifically recognize this particular DNA form. To date, four examples of triplex-binding proteins (3BPs) have been described in the literature. These include two reports of similar 55-kDa human proteins that exhibit binding specificity for Py triplex DNAs (16, 17), our findings of several human proteins that specifically recognize a Pu triplex (18), and evidence that the Drosophila GAGA factor can bind to Py triplexes (but not Pu triplexes) containing a (GA·TC)₂₂ sequence (19). Using electrophoretic mobility shift assays (EMSA), we have also found evidence for Pu motif 3BPs in extracts from organisms ranging from bacteria to human.² These data suggest that 3BPs are present in all eukaryotes and that they play important cellular roles.
To better understand the biological roles of 3BPs, we sought to identify their corresponding genes. We chose the yeast Saccharomyces cerevisiae as our model system, given its completely sequenced genome and the wealth of biochemical and genetic information presently available for this organism (20,21). Here we describe the purification and characterization of the major S. cerevisiae 3BP, y3BP1, and its identification as the product of the STM1 gene.

EXPERIMENTAL PROCEDURES
DNAs-Sequences and structures of duplex, triplex, and quadruplex DNA probes and competitor DNAs used in this study are shown in Fig. 1. Psoralenated oligonucleotides, indicated by a "P" prefix in their name or by a "P~" appended to their sequence, contained a 4′-(hydroxymethyl)-4,5′,8-trimethylpsoralen-hexyl (Glen Research) moiety attached to their 5′-terminus. All oligonucleotides were purified by n-butanol precipitation (22). Those used in constructing duplex and triplex probes were further purified by denaturing PAGE.
Duplex probes and competitor DNAs were made by annealing equimolar (0.1 mM) concentrations of complementary oligonucleotides at room temperature. In the case of labeled probes, annealed duplexes were 3′-end-filled using the Klenow fragment of DNA polymerase and deoxyribonucleotides, including [α-³²P]dATP. Duplex DNAs used in this study included the G/C-rich Pu duplex and the A/T-rich Py duplex (Fig. 1). Pu triplexes used in this study contained either the noncovalently attached triplex-forming oligonucleotide (TFO) ODN 1 or the covalently attached psoralenated TFO PODN 1. To form either Pu motif triplex, Pu duplex and a 10-fold molar excess of TFO were incubated for 60 min at 30°C in a reaction mixture containing 40 mM Tris-HCl (pH 8.0), 100 mM MgCl₂, and 0.01% Nonidet P-40 (23). For pyrimidine motif triplex formation, the Py duplex, TFO PODN 3, and a buffer composed of 25 mM Tris-HCl (pH 6.0), 20 mM MgCl₂, and 70 mM NaCl were used instead. Psoralenated TFOs were covalently attached to both strands of the duplex DNA following triplex formation by irradiation at 365 nm for 10 min at 0°C with a 6-watt hand-held UV lamp. Under these conditions, greater than 90% of the probe is converted to photo-cross-linked triplex (24). G-quartet (G4)-containing competitors used in this study included the tetrameric parallel structure GL-tetramer, the intermolecular quadruplex ODN 1 dimer, and the intramolecular quadruplex ODN 1 monomer. Each G4 DNA was formed following published procedures (25-27). All DNA probes and competitor DNAs were analyzed by PAGE prior to use.
EMSA-Protein mixtures containing 3BPs were incubated for 30 min at 24°C in a 10-µl volume containing buffer A (25 mM HEPES-Na⁺, pH 7.9, 50 mM KCl, 10% glycerol, 0.5 mM dithiothreitol), 1 mM MgCl₂, and 0.1 nM probe DNA, together with 2 µg of poly(dI-dC) carrier DNA or additional competitor nucleic acids as indicated. Resulting protein-probe complexes were resolved by nondenaturing PAGE at 7 V/cm for 90 min through a 5% acrylamide, 0.13% bisacrylamide gel containing 22 mM Tris borate, 0.5 mM EDTA. Probe-containing species were visualized by autoradiography and quantitated with a Storm 840 PhosphorImager (Molecular Dynamics).
Yeast Extract Preparation-Haploid yeast (S. cerevisiae, strain FY86 α) was cultured in eight 2-liter flasks containing 500 ml of YPD (1% yeast extract, 2% peptone, 1% dextrose), harvested at midlog phase (A₆₀₀ = 0.8), and lysed by vortexing with glass beads according to published protocols (28). Proteins were extracted from 18.6 g of lysed cells in 46 ml of buffer B (50 mM NaHPO₄, pH 7.4, 1 mM EDTA, 1 mM phenylmethylsulfonyl fluoride, and 5% glycerol). The extract was cleared by sequential centrifugation at 1500 × g for 5 min at 4°C followed by 12,000 × g for 10 min at 4°C and stored frozen at −80°C. The soluble protein concentration was determined to be 16.3 mg/ml in 46 ml of buffer B, using a Bradford dye binding assay (Bio-Rad).
Purification of y3BP1-All protein manipulations were performed at 4°C. Triton X-100 (0.05% final) was added to thawed yeast extract (46 ml), which was then loaded at 12 ml/h onto a 6-ml (1.4 × 3.7 cm) P-11 phosphocellulose column equilibrated in buffer C (20 mM Tris-Cl, pH 7.3, 0.2 mM EDTA, 0.5 mM phenylmethylsulfonyl fluoride, 10 mM β-mercaptoethanol, and 0.05% Triton X-100) and 100 mM KCl. Fifty milliliters of column break-through was collected, and the column bed was washed with 50 ml of buffer C plus 100 mM KCl before the proteins were eluted on a gradient of increasing KCl concentration (100 mM to 1 M) in buffer C. Thirty-five 1.9-ml fractions were collected, and their conductivities and protein concentrations were determined. Each was analyzed by EMSA (using a covalent Pu triplex probe) to determine the 3BP activities present. Fractions in the range 450-550 mM KCl contained partially purified y3BP1 and were stored frozen at −80°C.
A 0.5-ml photo-cross-linked Pu triplex-DNA affinity column was prepared with CNBr-activated Sepharose 4B (Amersham Pharmacia Biotech) according to the manufacturer's instructions. 500 nmol of Pu duplex DNA (Fig. 1A) possessing a 3′-terminal hexylamine moiety on the G-rich strand was incubated with 1890 nmol of PODN 1 in 10 ml of buffer D (10 mM NaHCO₃, pH 8.3, 12 mM MgCl₂, and 0.05% Triton X-100) for 4.5 h at 30°C in the dark to effect triplex formation. The psoralenated third strand was covalently cross-linked to the Pu duplex following irradiation at 365 nm for 40 min at 0°C using a 6-watt hand-held UV lamp. At least 65% of the triplexes were determined to be photo-cross-linked (data not shown). A 500-nmol sample of covalent Pu triplex in 50 mM NaHCO₃ was mixed with 0.33 g (0.5 ml) of activated Sepharose and incubated for 3 h at 24°C in the dark to allow coupling to occur. The slurry was collected and washed with 10 ml of buffer D, and excess reactive sites were blocked with 1 ml of 1 M ethanolamine and 12 mM MgCl₂. The ligated column material was gravity packed into a 0.5 × 3.0-cm column and washed with 5 ml of buffer C plus 250 mM KCl and 2 mM MgCl₂.
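The stoichiometry of the affinity matrix can be checked with a short calculation. The sketch below uses only the quantities reported in this paragraph; it assumes (this is not stated in the text) that duplex molecules without a cross-linked third strand end up as plain Pu duplex sites and that both species couple to the Sepharose equally well. Under those assumptions the result matches the 35:65 duplex-to-triplex site ratio estimated in the Results.

```python
# Quantities taken from the affinity column preparation described above.
duplex_nmol = 500.0           # Pu duplex carrying the 3'-hexylamine
tfo_nmol = 1890.0             # psoralenated third strand, PODN 1
crosslinked_fraction = 0.65   # "at least 65%", so this is a lower bound

# Molar excess of third strand over duplex during triplex formation.
tfo_excess = tfo_nmol / duplex_nmol

# Assumption: every non-cross-linked duplex behaves as a duplex site on the column.
triplex_sites_nmol = duplex_nmol * crosslinked_fraction
duplex_sites_nmol = duplex_nmol - triplex_sites_nmol

print(f"TFO excess: {tfo_excess:.2f}-fold")
print(f"triplex sites: {triplex_sites_nmol:.0f} nmol")
print(f"duplex sites: {duplex_sites_nmol:.0f} nmol")
```

Because 65% is a lower bound on cross-linking, the duplex share computed this way is an upper bound.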
A quarter of the partially purified y3BP1 from the phosphocellulose column (1.25 ml) was thawed, pooled, diluted with buffer C to 225 mM KCl, and loaded at 5 ml/h onto the X-Pu triplex-Sepharose column. Breakthrough from this initial load was reapplied to the X-Pu triplex-Sepharose column. Afterward the column was washed with 2 ml of buffer C plus 250 mM KCl and 2 mM MgCl₂. Bound proteins were eluted in three sequential steps of buffer C plus 2 mM MgCl₂ and either 500, 1000, or 1940 mM KCl (1 ml each). Twelve 0.36-ml fractions were collected and assayed by EMSA. Most y3BP1 activity was present in the 1000 mM KCl step. This chromatography was repeated an additional three times with the remaining y3BP1-containing phosphocellulose fractions. All fractions were stored frozen at −80°C.
y3BP1-containing fractions from all four runs of the Pu triplex-Sepharose column (4.3 ml total) were thawed, pooled, dialyzed (Spectra/Por 4; molecular weight cutoff, 12 kDa) against 100 ml of buffer C plus 90 mM KCl, and filtered through a 0.22-µm Millex-GS filter (Millipore) before loading at a flow rate of 1 ml/min onto tandem 1-ml Mono Q and Mono S FPLC columns (Amersham Pharmacia Biotech) equilibrated in buffer E (25 mM HEPES-Na⁺, pH 7.9, 0.5 mM EDTA, 0.5 mM dithiothreitol) plus 100 mM KCl. After being loaded and washed with 5 ml of buffer E plus 100 mM KCl, the tandem columns were separated, and the proteins on the downstream Mono S column were eluted with a 10-ml linear gradient of increasing KCl concentration (100-1000 mM) in buffer E. Eleven 0.7-ml fractions were collected and assayed by EMSA and SDS-PAGE (10% polyacrylamide). Nearly pure y3BP1 was obtained in fraction 8 (350 mM KCl). This was stored frozen at −80°C.
Peptide Sequencing and Data Base Searches-Purified y3BP1 (8 µg) obtained from the Mono S column (fraction 8) was fractionated on a 10% SDS-PAGE gel and visualized by brief Coomassie Blue staining, and a 100-mm³ piece of acrylamide containing the major protein band was excised from the gel. This was sent to the Harvard Microchemistry Facility, which determined the sequences of two tryptic peptides, VNQGWGDDK and DVSNLPSLA, by tandem mass spectrometry on a Finnigan LCQ Quadrupole Ion Trap Mass Spectrometer. These peptide sequences were used in a search of the Yeast Proteome Data base (21). Both peptides mapped with complete identity to sequences within the protein encoded by the S. cerevisiae STM1 gene.
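The search step amounts to exact substring matching of the two tryptic peptides against every protein sequence in the database. The following is a minimal sketch of that idea, not the procedure actually used with the Yeast Proteome Data base. The function name and the three toy sequences are hypothetical and invented for illustration; in particular, `toy_protein_A` is not the real Stm1p sequence.

```python
def find_peptide_matches(peptides, proteome):
    """Return proteins containing every peptide, with 1-based start positions."""
    hits = {}
    for name, sequence in proteome.items():
        if all(peptide in sequence for peptide in peptides):
            hits[name] = {p: sequence.index(p) + 1 for p in peptides}
    return hits


# The two peptide sequences reported in the text.
peptides = ["VNQGWGDDK", "DVSNLPSLA"]

# Hypothetical toy database; real searches run over all annotated yeast proteins.
toy_proteome = {
    "toy_protein_A": "MSTK" + "VNQGWGDDK" + "AAEER" + "DVSNLPSLA" + "K",
    "toy_protein_B": "MKVNQGWGDDKLLE",   # contains only the first peptide
    "toy_protein_C": "MADEVSNLPSLQ",     # contains neither peptide exactly
}

matches = find_peptide_matches(peptides, toy_proteome)
print(matches)

# Number of distinct pentapeptides over the 20 standard amino acids.
possible_pentapeptides = 20 ** 5
print(possible_pentapeptides)
```

Requiring both peptides to map to the same protein is what makes the assignment convincing. A single short peptide could match more than one entry by chance.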
Preparation of Recombinant Stm1p-A bacterially expressed Stm1p fusion with an N-terminal (His)₆ sequence was produced using the pV2a expression vector (29). Plasmid pTU151, which contains the full-length STM1 gene, was obtained from Y. Kikuchi (30). This gene was amplified by PCR using Vent polymerase (New England Biolabs) and the amplimers 5′-CGG GAT CCA TTT GAT TTG TTA GGT AAC GAC G-3′ and 5′-CGG AAT TCA GGC TTA AGC CAA AGA TGG CAA G-3′. A 400-ng aliquot of agarose gel-purified PCR product was digested with restriction endonucleases EcoRI and BamHI, ligated into the like-digested plasmid pV2a, and subcloned into Escherichia coli strain XL1-blue (Stratagene). Correct clones were identified by color selection using the β-galactosidase substrate X-gal (Fisher Biotech). Large scale preparation and immobilized metal affinity chromatography of recombinant (His)₆-Stm1p were performed as described previously (29). Recombinant Stm1p was further purified by Mono S FPLC chromatography as described above for the native y3BP1 protein. Nearly pure (His)₆-Stm1p was obtained in fraction 12 (350 mM KCl).

RESULTS
Identification of Yeast Pu Triplex-binding Proteins-To determine whether yeast have proteins that specifically recognize Pu triplex DNA, an EMSA of yeast whole-cell extract using a covalently bound Pu triplex probe (Fig. 1B) was performed. This triplex is based on the well characterized 19-mer triplex-forming oligonucleotide ODN 1 (5′-TGGGTGGGGTGGGGTGGGT-3′), which demonstrates strong, sequence-specific binding with an antiparallel orientation to a G-rich, homopurine duplex DNA target (4). Use of 5′-psoralenated TFO (PODN 1) and a duplex target containing a proximal 5′-TA-3′ sequence allows the formation of a photo-cross-linked triplex (X-Pu triplex) that remains intact even under conditions that normally promote triplex dissociation (23, 31). Incubation of 1 nM radiolabeled X-Pu triplex probe, 2 µg of poly(dI-dC) carrier DNA, and additional competitor nucleic acids as indicated with 4.5 µg of protein from a yeast whole-cell extract for 20 min at room temperature allowed formation of protein-triplex complexes that could be resolved by nondenaturing PAGE and visualized by autoradiography (Fig. 2). Under these conditions, nearly complete shifting of the triplex probe into two slower mobility species (C1 and C2) could be observed (compare lanes 1 and 2). The major species, C1, had a relative mobility (R_F) compared with the free probe of 0.43 and comprised 87% of the total triplex probe, whereas the minor species, C2, had an R_F of 0.53 and comprised 11% of the total triplex probe. The specificity of these protein-triplex interactions was demonstrated by competition binding with other nucleic acids. As shown here, the C1 species was reduced to 64% and less than 3% of its normal amount when 100- and 200-fold molar excess unlabeled, noncovalent Pu triplex DNA was present in the binding reaction, respectively (lanes 4 and 5). Formation of complex C2 was inhibited to a similar but lesser extent under these conditions.
Note that the competitor triplex did not contain a psoralen photo-cross-link, indicating that this competition was most likely not the result of psoralen photoadduct recognition. Competition with equivalent molar excesses of Pu duplex probe had no effect on the quantities of C2 and C1 species observed (lanes 7 and 8). Likewise, competition with 1000-fold molar excesses of the individual oligonucleotides that comprise the Pu triplex, i.e. the G/A- or C/T-rich strands of the Pu duplex or the TFOs ODN 1 or PODN 1, had no appreciable effect on C2 or C1 complex formation. Taken together, these data demonstrated that yeast have proteins that specifically recognize a Pu motif triplex.
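The percentages and relative mobilities quoted for complexes C1 and C2 follow from simple ratios of band intensities and migration distances. The sketch below illustrates that bookkeeping. The function names, the band counts, and the migration distances are hypothetical, chosen only so that the output reproduces the reported 87% (C1 in the control lane), 64% of control (100-fold competitor), and R_F of 0.43; they are not measured values from this study.

```python
def band_fractions(intensities):
    """Convert raw band intensities for one lane into fractions of total probe."""
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("lane has no signal")
    return {band: value / total for band, value in intensities.items()}


def percent_of_control(lane_fraction, control_fraction):
    """Express a complex in a competitor lane as a percentage of the control lane."""
    return 100.0 * lane_fraction / control_fraction


def relative_mobility(complex_distance_mm, free_probe_distance_mm):
    """R_F: migration distance of a complex divided by that of the free probe."""
    return complex_distance_mm / free_probe_distance_mm


# Hypothetical PhosphorImager counts (C1, C2, and unbound triplex T).
control_lane = {"C1": 8700, "C2": 1100, "T": 200}
competitor_lane = {"C1": 5568, "C2": 900, "T": 3532}

control = band_fractions(control_lane)
competed = band_fractions(competitor_lane)
c1_remaining = percent_of_control(competed["C1"], control["C1"])
c1_rf = relative_mobility(43.0, 100.0)  # hypothetical distances in mm

print(f"C1 in control lane: {control['C1']:.0%} of probe")
print(f"C1 with competitor: {c1_remaining:.0f}% of control")
print(f"R_F of C1: {c1_rf:.2f}")
```

Normalizing each lane to its own total signal corrects for lane-to-lane differences in loading before competitor lanes are compared with the control.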
Purification of the Major Yeast 3BP-To further characterize proteins that bind Pu triplexes, we purified the protein(s) responsible for the major protein-triplex EMSA complex, C1. A combination of conventional, affinity, and high performance liquid chromatography was employed. Table I outlines this purification, whereas SDS-PAGE and EMSA analyses of the relevant protein fractions are shown in Fig. 3, A and B, respectively. Initially, whole-cell yeast extract was loaded onto a phosphocellulose cation exchanger, and proteins were eluted through a gradient of increasing KCl concentration. Note that no glycerol was present in the elution buffers, to minimize viscosity of the eluent and ensure maximal resolution on this column. The proteins responsible for C1 complex formation eluted in the range of 450-550 mM KCl, as determined by an EMSA of the fractions, and they were effectively separated from those responsible for C2 (see Fig. 3B; compare lanes 3 and 5). Partially purified y3BPs were subjected to affinity chromatography using an X-Pu triplex covalently attached to a Sepharose 4B matrix. The y3BPs present in C1 eluted in the 1000 mM KCl step fraction. Note that the overall complexity of this fraction was not greatly changed through this chromatographic step, though the major 50-kDa contaminant, believed to be yeast cytoplasmic elongation factor 1α (32), was effectively removed (Fig. 3A, compare lanes 4 and 5). This limited purification achieved with the X-Pu triplex-Sepharose column may be due to the mixture of Pu duplex and X-Pu triplex sites on this column (estimated 35:65, respectively), the local high molar concentration of these sites, or the absence of competing sites normally provided by poly(dI-dC) carrier DNA. Final purification was achieved using high performance liquid chromatography. X-Pu triplex affinity-purified C1 3BPs were loaded onto tandem Mono Q and Mono S FPLC columns. These were separated, and the proteins bound to the downstream Mono S column were eluted with a 100-1000 mM linear gradient of KCl. Triplex binding activity was concentrated in a single 350 mM KCl fraction (Fig. 3B), which contained a prominent (>80% of total protein) 35-kDa polypeptide (Fig. 3A). Using proteins eluted from SDS-PAGE gel slices, we determined that a 35-40-kDa protein was responsible for the C1 shifted species (data not shown). Based on these data, we concluded that the 35-kDa polypeptide present in Mono S fraction 8, referred to as y3BP1, was the sole protein responsible for the major protein-DNA species observed with a cross-linked Pu triplex probe and yeast whole-cell extracts.
Figure legend (partial): ... PODN 1 (lanes 11 and 12). Lane 1 shows a control reaction with the probe alone (P). Locations of the gel well (W), the two yeast protein-triplex complexes (C1 and C2), the unbound triplex (T), and the unbound duplex (D) are indicated on the left.
Characterization of Purified y3BP1-Whereas our previous experiments suggested that y3BP1 recognized an intact Pu triplex DNA species, this need not be the case with our covalent X-Pu triplex probe. It is possible that the psoralen photoadduct itself, or a change it induces in the duplex DNA structure, is actually being recognized, as might be expected for a protein involved in DNA repair (33). Alternatively, because noncovalent Pu triplexes are inherently unstable under our standard gel electrophoresis conditions (absence of Mg²⁺ in the gel buffer) (18), our probe might be expected to adopt a structure composed of a single-stranded DNA attached at its 5′-end to a duplex DNA, which is somewhat reminiscent of structures found in DNA replication (34). To verify that an intact Pu triplex was actually being recognized by y3BP1, EMSAs were performed with different labeled probes, including the covalent X-Pu triplex, the noncovalent Pu triplex, and the Pu duplex. Note that in this experiment, electrophoresis was performed at 4°C and at a lower voltage (4 V/cm) to maintain maximal stability of the y3BP1-DNA complex. As shown in Fig. 4, the X-Pu triplex remained stable throughout electrophoresis, as did the y3BP1-triplex complex C1 (lanes 1 and 2). Also, as expected, the noncovalent Pu triplex probe dissociated under these conditions, quantitatively yielding Pu duplex (lane 3). However, a significant fraction of the labeled probe (15%, as compared with 60% with the X-Pu triplex probe) remained intact as part of complex C1, when purified y3BP1 was present in the binding reaction (lane 4). This was likely not the result of binding to the Pu duplex part of the dissociated Pu triplex probe, because no interactions between y3BP1 and the Pu duplex were observed under these conditions (lane 6). Taken together, these and the competition data from Fig. 2 indicate that y3BP1 recognizes an intact purine motif triplex.
The y3BP1 protein may bind a Pu triplex specifically, but does it do so with high affinity? To measure its dissociation constant, a constant concentration of y3BP1 was incubated in a near physiological buffer (25 mM HEPES-Na⁺, pH 7.9, 50 mM KCl, 10% glycerol, 0.5 mM dithiothreitol, and 1 mM MgCl₂) with a fixed concentration of labeled, covalent X-Pu triplex probe and increasing concentrations of unlabeled, X-Pu triplex DNA. Protein-DNA complexes resulting from these reactions were analyzed by EMSA. Plotting the ratio of bound to free triplex DNA as a function of C1 concentration, we found that y3BP1 exhibited an apparent K_d = 61 pM for Pu triplexes (Fig. 5). We also determined that our Mono S fraction 8 contained 1.1 µM active y3BP1, which is comparable to the concentration of a 35-kDa protein estimated from the Coomassie Blue-stained SDS-PAGE gel (3.2 µM). This binding constant is similar to those of many duplex DNA-binding proteins (e.g. transcription factors) for their corresponding specific sites. Thus it is quite possible that y3BP1 would be capable of binding Pu motif triplexes, should they exist in vivo.
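Plotting bound/free DNA against the concentration of complex is the classic Scatchard construction. For a single class of binding sites, B/F = (B_max − B)/K_d, so the slope of the line is −1/K_d and the x-intercept is the concentration of active protein. The sketch below illustrates this under stated assumptions: the data are simulated from an ideal single-site model (the titration points are invented), with K_d set to the reported 61 pM and the protein concentration set to the 0.27 nM used in the competition assays. It is not a reanalysis of Fig. 5.

```python
def bound_complex(total_dna_pm, total_protein_pm, kd_pm):
    """Exact single-site solution for the complex concentration (pM)."""
    s = total_dna_pm + total_protein_pm + kd_pm
    return (s - (s * s - 4.0 * total_dna_pm * total_protein_pm) ** 0.5) / 2.0


def scatchard_fit(bound, free):
    """Least-squares line through (B, B/F); returns (K_d, B_max)."""
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope          # slope of a Scatchard plot is -1/K_d
    bmax = intercept * kd      # y-intercept is B_max/K_d
    return kd, bmax


# Simulated titration (assumed values): labeled plus unlabeled triplex, in pM.
true_kd_pm = 61.0
protein_pm = 270.0
dna_totals_pm = [100.0, 200.0, 400.0, 800.0, 1600.0, 3200.0]

bound = [bound_complex(d, protein_pm, true_kd_pm) for d in dna_totals_pm]
free = [d - b for d, b in zip(dna_totals_pm, bound)]

kd_fit, bmax_fit = scatchard_fit(bound, free)
print(f"fitted K_d: {kd_fit:.1f} pM, active protein: {bmax_fit:.1f} pM")
```

With real gel data the points scatter about the line, and the intercept gives the active protein concentration independently of the total protein estimated by staining. This is consistent with the text reporting both an active concentration and a Coomassie-based estimate.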
Though yeast 3BP1 may have a high affinity for Pu motif triplexes, it is possible this structure is not the true target of this protein in vivo. To better understand the binding specificity of y3BP1, competition experiments were undertaken with a variety of different DNA structures. These included several quadruplex DNAs (both parallel tetraplexes and antiparallel hairpin dimers and intramolecular quadruplexes), triplex DNAs (both Pu motif and Py motif), duplex DNAs (both A/T- and G/C-rich), and the single-stranded TFO ODN 1. Their sequences and structures are shown in Fig. 1. Binding reactions were modified in some cases to maintain the integrity of these DNA structures. For example, competition experiments with single-stranded ODN 1 were performed in buffer A containing HEPES-Li⁺ and LiCl instead of HEPES-Na⁺ and KCl, respectively, to minimize formation of G4-containing species with this G-rich oligonucleotide (26, 27). Quantitation from a series of competition experiments is shown graphically in Fig. 6, with apparent competitor concentrations necessary to observe a 50% decrease in complex C1 formation (EC₅₀) provided in Table II. As shown here, the covalent Pu triplex had the highest affinity for y3BP1 followed by the parallel G4 tetraplex GL-tetramer, which had a 5.6-fold lower affinity. Surprisingly, y3BP1 exhibited significant, albeit lower (33-fold less than the Pu triplex) binding affinity to the Pu duplex. However, this binding might be explained by the fortuitous formation of a small amount of a purine motif triplex DNA, containing two G/A-rich strands and one C/T-rich strand from the Pu duplex, under our incubation conditions. Even lower binding affinities were observed with the Py triplex and the G4 DNAs ODN 1 dimer and ODN 1 monomer, indicating that not all DNA structures possessing high negative charge density are avidly bound by y3BP1.
Single-stranded ODN 1 bound y3BP1 with very low affinity (10⁴-fold larger K_d than the Pu triplex), whereas the double-stranded Py duplex showed no apparent affinity for y3BP1 for the concentration range examined. Taken together, these data indicate that y3BP1 is a true purine motif triplex-binding protein with an appreciable affinity for some, but not all, G4-containing DNAs.
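An EC₅₀ of the kind tabulated here can be read off a competition titration by finding where the C1 signal crosses 50% of its no-competitor value. One simple way to do this, shown below as a sketch, is linear interpolation on a log₁₀ concentration axis between the two bracketing points. The interpolation scheme, the function name, and the titration values are assumptions made for illustration; the text does not say how the EC₅₀ values in Table II were computed.

```python
import math


def ec50_by_interpolation(concentrations, percent_of_control):
    """Competitor concentration giving 50% of control C1 signal.

    Interpolates linearly in log10(concentration) between the two
    consecutive titration points that bracket 50%.
    """
    points = list(zip(concentrations, percent_of_control))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo >= 50.0 >= y_hi:
            if y_lo == y_hi:
                return c_lo
            frac = (y_lo - 50.0) / (y_lo - y_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_c
    raise ValueError("titration does not cross 50% of control")


# Hypothetical titration: competitor concentration (nM) vs C1 signal (% of control).
competitor_nm = [0.01, 0.1, 1.0, 10.0]
c1_percent = [90.0, 70.0, 30.0, 10.0]

ec50_nm = ec50_by_interpolation(competitor_nm, c1_percent)
print(f"EC50 = {ec50_nm:.3f} nM")
```

Ratios of EC₅₀ values obtained this way give the fold differences in apparent affinity quoted in the text, provided all competitors are titrated against the same probe and protein concentrations.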
Identification of the Gene That Encodes y3BP1-To directly determine the gene responsible for y3BP1, we obtained amino acid sequence information from the purified protein. Given that the complete nucleotide sequence of S. cerevisiae is available (20), a sequence of as few as five amino acid residues can be sufficient to define a single gene. Tryptic fragments from SDS-PAGE-purified y3BP1 were sequenced by tandem mass spectrometry, and two amino acid sequences, VNQGWGDDK and DVSNLPSLA, were identified. A search of the Yeast Proteome Data base (21) mapped both peptides, with complete identity, to sequences within the protein encoded by the STM1 gene.
This protein was originally found to have a specific affinity for some quadruplex nucleic acids and was referred to as G4p2 (36).
Two approaches were used to verify that the STM1 gene product was responsible for the primary yeast protein-Pu triplex complex, C1. In the first approach, the entire STM1 open reading frame was cloned into the bacterial expression vector pV2a (29). This allowed the production of a recombinant Stm1 protein as an N-terminal oligohistidine fusion. Bacterially expressed (His)₆-Stm1p was purified to homogeneity by immobilized metal affinity chromatography and Mono S FPLC. When assayed by EMSA using a covalent X-Pu triplex probe, (His)₆-Stm1p formed a single protein-DNA complex with the same relative mobility as that of native y3BP1 (Fig. 7). Recombinant Stm1p also demonstrated a Pu triplex binding specificity similar to that of the native protein, as shown by the concentrations of Pu triplex and Pu duplex competitor required to produce a 50% decrease in complex C1 formation (220 pM and 19 nM, respectively). That a greater concentration of (His)₆-Stm1p than native Stm1p was required most likely reflects the limited specific activity of our recombinant protein preparations. These data directly demonstrate that Stm1p is a bona fide triplex-binding protein and that bacterially expressed Stm1p retains its DNA binding properties. This also suggests that post-translational modifications absent in bacteria (e.g. proper phosphorylation, acetylation, and methylation) are not essential for the specific recognition of triplex DNA by Stm1p.
As a second approach to verifying the involvement of Stm1p in the major y3BP-Pu triplex complex, we obtained S. cerevisiae strain A1454 (35), which is deficient in Stm1p. Disruption of the STM1 gene is nonlethal in haploid yeast grown in rich medium, with the only observed phenotypic change for stm1Δ yeast being a slightly increased doubling time at elevated temperatures in media containing nonfermentable carbon sources (30).³ As shown in Fig. 8, proteins were present in wild-type but not stm1Δ yeast extracts that generate the C1 protein-triplex complex. Neither extract contained appreciable amounts of proteins that specifically bound the corresponding Pu duplex probe. This demonstrates that the product of the STM1 gene is responsible for activity of the major Pu triplex-binding protein in yeast extracts and is consistent with Stm1p being a triplex-binding protein. Note that other, minor protein-triplex complexes (C0 and C2) appeared in both extracts and that the amounts of these minor complexes approximately doubled in reactions containing the stm1Δ yeast extract. The former observation would indicate that Stm1p is not present in these other complexes, while the latter is suggestive of a possible functional compensation by other yeast proteins that bind triplex DNA.

DISCUSSION
Using a well defined purine motif triplex and EMSA, we identified at least two different proteins in S. cerevisiae whole cell extracts that specifically recognized triple helical DNA. Using a combination of conventional and affinity chromatography, we purified a 35-kDa protein, y3BP1, that was responsible for the major protein-triplex complex C1. Microsequencing of this protein and a comparison with the Yeast Proteome Data base indicated that y3BP1 was encoded by the STM1 gene. That Stm1 protein is a bona fide triplex-binding protein and responsible for the C1 complex was verified by bacterially expressed recombinant Stm1p and whole-cell extracts from a stm1Δ mutant yeast strain, respectively.
Footnote 3: A. Sakai, personal communication.
Table II, footnote a: EC₅₀ refers to the concentration of competitor DNA that reduces C1 complex formation by 50% when 0.1 nM labeled X-Pu triplex probe and 0.27 nM purified y3BP1 were present in an EMSA binding reaction.

Previous studies have described 3BPs in human cell extracts, but their conclusive identification with particular gene products had not been achieved (16-18). Similarly, a portion of the Drosophila GAGA factor has been shown to bind triplex DNA with less affinity than its duplex target (19), though it is debatable whether crude preparations of the native protein would preferentially recognize triplex DNA under physiological conditions. Thus, the work presented in this paper is the first reported purification of a Pu motif triplex-binding activity from crude cell extracts and the conclusive identification of its gene.
STM1, also referred to as MPT4 and STO1, was identified through genetic screens as a multicopy suppressor of temperature-sensitive tom1, htr1, and pop2 mutants (30,35). TOM1 encodes a ubiquitin ligase that regulates activation of the ADA histone acetyltransferase A complex and is required for G2/M progression (37). POP2 encodes a subunit of the CCR4 general transcriptional complex, which regulates the expression of a number of genes during the late mitotic part of the cell cycle (38). HTR1 (also known as MPT5) encodes a protein involved in mating pheromone-induced G1 arrest and for progression through G2/M (39). MPT5 has been found to affect the lifespan of yeast cells, presumably by affecting the strength of transcriptional silencing at telomeres compared with the ribosomal DNA locus (40). Interestingly, MPT5 is also a multicopy suppressor of POP2 (35). Because high copy numbers of STM1 were required to produce even partial phenotype suppression for tom1, htr1, and pop2 mutations, it is likely Stm1p either acts far downstream of these proteins (e.g. for Tom1p) or is not a fully functional substitute for them (e.g. for Mpt5p and Pop2p). This and the viability of stm1⌬ yeast (30) make it difficult to genetically determine an exact function for Stm1p. However, common themes among these proteins, e.g. transcriptional regulation and mitosis, strongly suggest that Stm1p has a role in these processes.
Evidence for a protein's biological role can sometimes be found through an analysis of its pattern of expression under different conditions. For a baseline, STM1 is moderately well expressed in yeast, averaging 51 copies/cell during log phase growth as determined by serial analysis of gene expression (41). However, as determined by microarray analysis, the level of STM1 expression does not change significantly during the cell cycle (42), as a function of different cell mating type (43), or in response to stimuli such as heat shock (43) or treatment with DNA alkylators (44). The largest changes reported were observed upon a shift from fermentation to respiration (4.5-fold lower expression) and during sporulation (4.2-fold lower), though values of this magnitude were relatively common among all yeast genes investigated (45,46). Likewise, an analysis of the STM1 promoter region (47) provided few clues as to its possible regulation, because the sites identified were primarily those for relatively common transcription factors (e.g. GCN4 and GCR1). More striking was the observation of three consensus PHR1 upstream activator sequence (UAS-PHR) sites within 250 base pairs of the STM1 translation start site, given that UAS-PHR sites are a hallmark of genes involved in nucleotide excision repair and recombination (e.g. RAD1, RAD4, RAD23, and RAD50) (48). However, it should be noted that large changes in expression are not necessarily a hallmark of an important DNA-binding protein. For example, the Cbf1 protein, which binds to an element in the centromere and is involved in mitosis (49), does not significantly change its expression in any of the aforementioned circumstances (42-46), nor have noteworthy transcription factor binding sites been found within its proposed promoter region (47).
Stm1 protein was first biochemically identified as G4p2, a yeast protein that exhibits specific affinity for quadruplex nucleic acids (36). Quadruplex nucleic acids are four-stranded, right-handed helical structures composed of stacked pairs (or greater) of G-quartets, square planar arrays of guanines Hoogsteen hydrogen bonded to one another (reviewed in Ref. 50). Quadruplexes can be composed of four parallel-oriented nucleic acid strands (see Fig. 1D for an example), two folded nucleic acid strands (Fig. 1E), or with certain G-rich sequences, a unimolecular species composed of only intramolecular G-quartets (Fig. 1F). Quadruplexes form with low to moderate kinetics under physiological conditions, and sequences exist within the yeast genome that can form quadruplexes in vitro. Examples include the 3′ G-rich single-stranded extensions present in the telomeres on chromosome ends (51) and transcripts of a G-rich 26 S rRNA gene (52). It is not conclusively known whether quadruplexes exist in vivo, nor whether the recognition of such structures is the biological function of proteins like Stm1p. In our studies, we found that Stm1p bound a Pu motif DNA triplex better than a DNA tetraplex and considerably better than a dimeric or a monomeric quadruplex DNA. Thus our quadruplex data are consistent with those described by Frantz and Gilbert (36). However, the importance of higher triplex binding affinity should not be exaggerated, given that the identity of the putative nucleic acid recognized by Stm1p in vivo, let alone its exact structure, is not known. Efforts to determine its identity, through cross-linking and immunoprecipitation experiments, are presently underway. It is interesting to speculate that structural elements common to both quadruplexes and Pu motif triplexes, e.g. a large negative charge density or reverse Hoogsteen hydrogen bonded guanines, might be recognized by 3BPs like Stm1p.
However, none of the human Pu motif 3BPs so far reported bind G4 DNAs with high affinity (18). Likewise, not all quadruplex-binding proteins recognize Pu motif triplexes. For example, extracts made from an arc1Δ yeast strain (53), which does not produce the quadruplex-binding protein and cofactor for methionyl- and glutamyl-tRNA synthetases G4p1 (54), did not demonstrate the loss of any protein-Pu triplex complex observable by EMSA.⁴ Thus the relationship between quadruplex- and triplex-binding proteins, like the existence of their supposed binding sites in vivo, remains an open question.
Following its predicted N-terminal acetylation (55), mature Stm1p should be composed of 272 amino acids, with a calculated molecular mass of 29,903 Da (21). From an analysis of its amino acid composition, Stm1p would be predicted to be a soluble protein with a pI of 9.8. These characteristics stem from the unusual abundance of basic residues (19% of the total amino acids) and the significant paucity of the major hydrophobic residues leucine, valine, isoleucine, phenylalanine, and methionine (15%) in Stm1p compared with other yeast proteins (56). This high percentage of basic residues would not be unexpected for a triplex- and/or quadruplex-binding protein, given the necessity to complement the high negative charge density present on these multistranded nucleic acids. Interestingly, basic residues in Stm1p are distributed throughout the protein in patches of only slightly greater density than would be expected randomly (Fig. 9). This contrasts with studies done with basic oligopeptides, which indicated that high positive charge densities were best for stabilizing intermolecular and intramolecular triplexes (57,58). However, without knowing the three-dimensional structure of Stm1p, it is difficult to predict the exact charge density on any part of its surface. Stm1p should be a nuclear protein, given the presence of two overlapping nuclear localization domains, one a pat7 motif and the other a bipartite nuclear localization domain, located between amino acids 33 and 50 (59). The only other noteworthy feature of Stm1p is a region from amino acids 119-141 that is very rich in alanines (48%) and acidic residues (39%) and is postulated to adopt a coiled-coil α-helical conformation characteristic of many protein-protein interaction domains (60).
Taken together, these data would suggest that multiple regions of Stm1p might be involved in making contacts with triple helical DNA and that Stm1p might form a complex with other proteins at some point in its life cycle.
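The kind of composition analysis described above (overall fraction of basic or hydrophobic residues, and how basic residues are distributed along the chain) can be illustrated with a minimal sketch. It is not the procedure used for Fig. 9. The residue classes are assumptions: lysine, arginine, and histidine are counted as basic, and the hydrophobic set is the five residues named in the text. `TOY_SEQUENCE` is a made-up peptide for illustration only and is not the Stm1p sequence; the function names are likewise hypothetical.

```python
# Assumed residue classes (one-letter codes); the paper does not state
# which residues it counted as basic.
BASIC = frozenset("KRH")
HYDROPHOBIC = frozenset("LVIFM")  # Leu, Val, Ile, Phe, Met, as listed in the text


def fraction(seq, residues):
    """Fraction of positions in seq whose residue belongs to `residues`."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(aa in residues for aa in seq) / len(seq)


def windowed_fraction(seq, residues, window):
    """Fraction of `residues` in every sliding window of length `window`."""
    seq = seq.upper()
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    return [
        fraction(seq[i:i + window], residues)
        for i in range(len(seq) - window + 1)
    ]


# Hypothetical 20-residue peptide, NOT the Stm1p sequence.
TOY_SEQUENCE = "MKRAAEELKKGSHLVDAAKR"

if __name__ == "__main__":
    print("basic fraction:", fraction(TOY_SEQUENCE, BASIC))
    print("hydrophobic fraction:", fraction(TOY_SEQUENCE, HYDROPHOBIC))
    print("basic fraction per 5-residue window:",
          windowed_fraction(TOY_SEQUENCE, BASIC, 5))
```

Comparing the windowed profile against the overall fraction gives a rough sense of whether basic residues cluster in patches or are spread evenly; a proper comparison with random expectation would require a statistical model that is not attempted here.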
Stm1p has been estimated to be present at about 35,000 copies/yeast cell (36). We found that purified Stm1p has a measured affinity for Pu triplex DNA of 61 pM under our standard reaction conditions. This value is quite reasonable for a moderately abundant DNA-binding protein. However, the conditions we employed for our binding reactions were those that demonstrated maximal binding affinity in vitro and are not those expected in vivo. We have investigated purified Stm1p binding under more physiological conditions (140 mM KCl, 12 mM MgCl2, 1 mM spermine, pH 7.5) and found that it retained substantial binding activity (~50%) under these conditions.⁴ This would suggest that Stm1p could bind Pu triplexes in vivo, should they occur.
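To put the copy number and the dissociation constant on a common scale, the following back-of-the-envelope sketch converts 35,000 copies/cell into a molar concentration and applies a simple one-site binding isotherm. Only the copy number and the 61 pM K(d) come from the text. The compartment volume (about 3 fL, a rough figure for a yeast nucleus) is an assumption, as are the simplifications that all Stm1p is free, active, and confined to that volume, and that the in vitro K(d) applies in vivo.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molar_concentration(copies, volume_litres):
    """Convert a number of molecules in a given volume to mol/L."""
    return copies / (AVOGADRO * volume_litres)


def fraction_bound(free_protein_molar, kd_molar):
    """Fractional occupancy of a single site: [P] / ([P] + Kd)."""
    return free_protein_molar / (free_protein_molar + kd_molar)


COPIES_PER_CELL = 35_000      # from the text (36)
KD_MOLAR = 61e-12             # from the text: 61 pM
ASSUMED_VOLUME_L = 3e-15      # ASSUMPTION: ~3 fL compartment volume

if __name__ == "__main__":
    conc = molar_concentration(COPIES_PER_CELL, ASSUMED_VOLUME_L)
    print(f"estimated Stm1p concentration: {conc:.2e} M")
    print(f"ratio to Kd: {conc / KD_MOLAR:.2e}")
    print(f"one-site occupancy: {fraction_bound(conc, KD_MOLAR):.6f}")
```

Under these assumptions the estimated protein concentration exceeds the K(d) by several orders of magnitude, so the qualitative conclusion would survive a substantially weaker affinity in vivo or a larger compartment volume. The numbers should be read as an order-of-magnitude illustration only.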
Though we have observed 3BPs in species ranging from S. cerevisiae to human, are there any direct analogs of Stm1p in other organisms? A BLASTP (61) search of the nonredundant protein database identified two homologous proteins in Schizosaccharomyces pombe (Protein IDs 1749424 and 2842507), though none in any higher organisms. However, a TBLASTN (62) search of translations of the available EST databases found additional homologues, including two in Neurospora crassa (nucleotide IDs 3045341 and 4065003), one in Drosophila melanogaster (4246316), and an additional one in S. pombe (3346736). Pending a structure/function analysis of Stm1p, we cannot be certain whether these homologies are functionally significant with regard to specific recognition of triplex DNAs. However, their existence is consistent with an important, conserved function for these proteins in eukaryotic organisms.
At present, we can only speculate about the possible biological roles of triplexes and 3BPs. Two 3BPs have been identified so far, the Drosophila GAGA factor (19) and S. cerevisiae Stm1p. As mentioned above, genetic studies have shown STM1 to be a multicopy suppressor of several genes that have roles in transcriptional control and progression through mitosis. Interestingly, GAGA, a well known DNA-binding transcription factor involved in the regulation of many Drosophila genes (63), has also been found to be a constituent of centromeric heterochromatin in mitotic chromosomes (64), and its absence has been shown to result in mitotic defects including failures in chromosome condensation and segregation (65). We have recently identified a second S. cerevisiae gene, CDP1, which also encodes a 3BP.⁵ Genetically, CDP1 is a complement of CBF1, a centromere-binding protein involved in mitosis (49,66). A yeast strain harboring a deletion in CDP1 exhibited defects in chromosome segregation and a temperature-sensitive arrest in G2/M (66).⁶ Biochemically, Cdp1p is part of a general transcription factor complex containing Cdc73p and Paf1p that interacts directly with RNA polymerase II (67,68). Together these findings are consistent with 3BPs having a role in cell cycle progression, specifically mitosis. Interestingly, triple helical DNA has also been implicated in mitosis. Agazie et al. (69) found that when anti-triplex monoclonal antibodies were introduced into synchronized myeloma cells, they had their greatest effect on cell growth when introduced at the end of S phase and during G2. From these studies they proposed a model in which transmolecular triplex formation and dissociation were involved in the processes of chromosome condensation and decondensation, respectively.
Transmolecular triplexes are structures that involve triplex-forming single-stranded DNA and duplex acceptor from either different DNA molecules or, unlike intramolecular triplexes, from distal sites on the same DNA. In their model, these antibodies would bind triplexes in condensed DNA, but their binding would not be readily reversible, thereby retarding chromosome decondensation and cell cycle progression. By extension, a possible role for 3BPs in this process might be to provide a reversible means of alternately protecting and deprotecting triplex DNA from the actions of other proteins (e.g. naturases and helicases) responsible for triplex formation and dissociation. Note that this model need not be limited to triplexes, given that association between different DNAs can also be achieved through G4 (e.g. hairpin dimer) formation. Thus transmolecular G4 formation may be the actual initiator of chromosome condensation, and G4 DNA may be the true target of 3BPs in vivo.