Structure of a Conserved Domain Common to the Transcription Factors TFIIS, Elongin A, and CRSP70*

TFIIS is a transcription elongation factor that consists of three domains. We have previously solved the structures of domains II and III, which stimulate arrested polymerase II elongation complexes to resume transcription. Domain I is conserved in evolution from yeast to humans and is homologous to the transcription factors elongin A and CRSP70. Domain I also interacts with the transcriptionally active RNA polymerase II holoenzyme and therefore may have a function unrelated to the previously described transcription elongation activity of TFIIS. We have solved the structure of domain I of yeast TFIIS using NMR spectroscopy. Domain I is a compact four-helix bundle that is structurally independent of domains II and III of TFIIS. Using the yeast structure as a template, we have modeled the homologous domains from elongin A and CRSP70 and identified a conserved positively charged patch on the surface of all three proteins, which may be involved in conserved functional interactions with the transcriptional machinery.

TFIIS is a transcription elongation factor that increases the overall transcription rate of RNA polymerase II by reactivating transcription elongation complexes that have arrested transcription (1). TFIIS is conserved from yeast to man, and homologs are found in Archaea and in some viral genomes. TFIIS comprises three structural domains termed I, II, and III. A fragment consisting of domains II and III is sufficient for elongation activity in vitro and is able to rescue the phenotype of a TFIIS gene disruption in yeast cells. Domain II (residues 131-240) mediates the interaction with the largest subunit of RNA polymerase. Domain III (residues 264-309), a small zinc ribbon motif, is implicated in the stimulation of transcript cleavage and resumption of transcription by RNA polymerase II (2,3).
Several observations implicate the N-terminal domain of TFIIS (domain I) in the transcription process. First, domain I interacts directly with the RNA polymerase II holoenzyme and can be used to purify the holoenzyme using protein affinity chromatography (4). Second, domain I is conserved from yeast to human and is also homologous to regions of elongin A (5) and CRSP70 (6), both of which are involved in transcription. Elongin is a transcription elongation factor that increases the rate of transcription by suppressing transient pausing of the elongation complex (7). CRSP70 is an essential subunit of the CRSP complex, which is required for the activity of the enhancer-binding protein Sp1 (4). Third, in Saccharomyces cerevisiae, TFIIS lacking domain I is synthetically lethal with a disruption of the TFG3 gene. Domains II and III, which are sufficient to reactivate arrested transcription complexes, are not able to complement the synthetic lethality with TFG3. TFG3 is a component of the general transcription factors TFIIF and TFIID and of SWI/SNF, a complex required for full activity of several transcription activators (8).
Thus, biochemical and genetic studies of the N terminus of TFIIS suggest a role in transcription. As yet, the functions of the N-terminal domains of TFIIS, elongin, and CRSP70 are not known. To help elucidate the function of this domain and to provide a structural framework for the design and interpretation of additional studies, we determined the solution structure of domain I of S. cerevisiae TFIIS and used this structure to model the N-terminal domains of elongin A and CRSP70.

MATERIALS AND METHODS
Cloning, Expression, and Purification of TFIIS-Sequences coding for residues 1 to 111 of yeast (S. cerevisiae) TFIIS were cloned into the T7 polymerase expression vector pET15b (Novagen), as described in Morin et al. (9) and expressed as C-terminal fusions to an N-terminal His6 tag and a thrombin protease site. Escherichia coli BL21(DE3) cells expressing the TFIIS fragment were grown in M9 minimal medium (3) containing 15N-labeled NH4Cl and, for 13C-labeled samples, 13C-labeled glucose. Cells were grown at 30°C to an absorbance of 0.4 at 600 nm, and protein expression was induced with 1.0 mM isopropyl-β-D-thiogalactopyranoside. Three hours after induction, the cells were harvested by centrifugation, resuspended in Buffer A (30 mM Hepes, 600 mM NaCl, 10 µM ZnCl2, 1.0 mM benzamidine, pH 7.5) with 5 mM imidazole, and frozen at −70°C. The cells were lysed using a French pressure cell, and the supernatant was clarified by centrifugation at 100,000 × g for 40 min at 4°C. All subsequent steps were performed at 4°C. The supernatant solution was loaded onto a (5 × 5)-cm DE52 (Whatman, Maidstone, UK) column equilibrated in Buffer A. The flow-through
* This research was funded in part by the National Cancer Institute of Canada with funds from the Canadian Cancer Society. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
NMR Spectroscopy and Spectral Assignments-All NMR spectra were collected at 25°C on either a Varian 500-MHz or 600-MHz Inova spectrometer equipped with pulsed field gradient units and actively shielded z-gradient triple-resonance probes. NMR experiments involving correlations of amide protons were acquired with gradient-enhanced versions (10) of the originally published pulse sequences. NMR data were processed using nmrPipe software (11). Spectral analysis was assisted using the programs Pipp and Capp (12) and NMRView (13). Sequence-specific assignment of 1HN, 15N, 13Cα, and 13Cβ resonances for all non-proline residues of TFIIS was achieved through correlations from CBCA(CO)NH (14), CBCANH (15), and 15N NOESY-HSQC (16) spectra. Side chain 1H and 13C resonances of aliphatic residues were assigned from an HCCH-TOCSY (17,18), a CCC-TOCSY (19), and a 15N TOCSY-HSQC (20) (mixing time, 81 ms). Homonuclear two-dimensional NOESY (21) and TOCSY (22) spectra in D2O were used to assign side chain 1H resonances of aromatic residues. H-H nuclear Overhauser effects (NOE) were identified from the following spectra: 13C,15N-edited NOESY in H2O (16) (both mixing times are 150 ms) and homonuclear NOESY in D2O. An HNHA experiment was used to measure 3J(NH-Hα) coupling constants (23).
Structure Determination-Structure calculations were performed using version 3.851 of XPLOR with ambiguous restraints for iterative assignment (ARIA) (24). The initial input for ARIA consisted of the eight lowest energy structures calculated by XPLOR using 276 unambiguous manually assigned NOEs, dihedral angle restraints, and hydrogen bond restraints. The dihedral restraints were based on the predictions of TALOS (25). All TALOS-derived restraints were consistent with the secondary structure as determined from 3J(NH-Hα) measurements, NOE patterns, and Hα and Cα chemical shifts. These indicated an α-helical conformation for residues 3-16, 20-31, 41-53, and 60-75. Hydrogen bond restraints (H···O, 2.5 Å; N···O, 3.5 Å) were added for residues that were clearly α-helical as judged by NOE patterns, chemical shift values, and lower rates of amide hydrogen exchange.
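The hydrogen bond restraints described above can be sketched as a simple enumeration. The following is a minimal illustration, not the authors' XPLOR input: it assumes the canonical α-helical pattern (carbonyl O of residue i to amide HN of residue i + 4), which the text does not state explicitly, and it lists every candidate pair in the reported helices, whereas the study applied restraints only where NOE, chemical shift, and exchange data supported them. The function name and tuple layout are hypothetical.

```python
# Helical segments reported in the text (inclusive residue ranges).
HELICES = [(3, 16), (20, 31), (41, 53), (60, 75)]

# Upper distance bounds from the text, in angstroms.
H_O_MAX = 2.5
N_O_MAX = 3.5


def helix_hbond_restraints(helices):
    """Return (acceptor_res, donor_res, atom_pair, upper_bound) tuples.

    Assumption: carbonyl O of residue i accepts a hydrogen bond from the
    amide group of residue i + 4 (canonical alpha-helix geometry).
    """
    restraints = []
    for start, end in helices:
        # Stop at end - 4 so that the donor residue i + 4 stays in the helix.
        for i in range(start, end - 3):
            j = i + 4
            restraints.append((i, j, "O-HN", H_O_MAX))
            restraints.append((i, j, "O-N", N_O_MAX))
    return restraints


restraints = helix_hbond_restraints(HELICES)
for acceptor, donor, atoms, bound in restraints[:4]:
    print(f"res {acceptor} -> res {donor}  {atoms}  <= {bound} A")
```

Each candidate hydrogen bond yields two distance restraints, one per atom pair, which mirrors the two bounds quoted in the text.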
The CN-NOESY, 15N-edited NOESY-HSQC, and homonuclear two-dimensional NOESY were automatically peak picked using NMRView (13) followed by manual removal of obvious artifacts. The resulting NOE peak lists were used as input for ARIA calculations. The frequency window tolerance for assigning NOEs with ARIA was ±0.03 ppm for all proton dimensions and ±0.5 ppm in the nitrogen and carbon dimensions. The ARIA parameters p, Tv, and Nv were as in Nilges et al. (24). In each iteration, 20 structures were refined, and the 7 lowest energy structures were used for the purposes of NOE assignments. In the final (eighth) ARIA iteration, 20 structures were refined, and the 10 lowest energy structures were retained for analysis. A total of 1199 unambiguous and 542 ambiguous NOE-based distance restraints were used for the final set of structures.
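The frequency window step can be illustrated with a short sketch. This is not ARIA's implementation; it only shows how the ±0.03 ppm proton and ±0.5 ppm heteronuclear tolerances sort a peak into unambiguous (one compatible resonance) or ambiguous (several). The function names and all chemical shifts below are hypothetical and are not data from this study.

```python
H_TOL = 0.03  # ppm, proton dimensions (from the text)
X_TOL = 0.5   # ppm, nitrogen and carbon dimensions (from the text)


def candidates(peak_h, peak_x, assigned):
    """Return proton names whose shifts fall inside the tolerance window.

    `assigned` maps a proton name to (proton shift, attached heavy-atom shift).
    """
    return [
        name
        for name, (h_shift, x_shift) in assigned.items()
        if abs(h_shift - peak_h) <= H_TOL and abs(x_shift - peak_x) <= X_TOL
    ]


def classify(matches):
    """Label a peak dimension by the number of compatible resonances."""
    if not matches:
        return "unassigned"
    return "unambiguous" if len(matches) == 1 else "ambiguous"


# Hypothetical shifts for illustration only.
shifts = {
    "L7.HN": (8.21, 120.4),
    "N12.HN": (8.23, 120.7),
    "K54.HN": (7.65, 118.9),
}

print(classify(candidates(8.22, 120.5, shifts)))  # two protons in the window
print(classify(candidates(7.66, 119.0, shifts)))  # one proton in the window
```

In an iterative scheme such as ARIA, ambiguous restraints are carried forward and narrowed using the structures from the previous round; the sketch covers only the initial window match.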
Homology modeling of human TFIIS, elongin A, and CRSP70 using the yeast structure as a template was performed using the "optimize" option of SWISS-MODEL Version 3.5 (26). The sequence alignment was generated using the multiple sequence alignment program Clustal W 1.8 (27) and edited slightly by hand to optimize the alignment over the helices in preference to the loops. Surface charge plots were produced using GRASP (28).

RESULTS AND DISCUSSION
Partial proteolysis studies revealed that the N-terminal region of yeast TFIIS consists of a stable structural domain. This domain extended from the N terminus of TFIIS to the region between residues 105 and 124 (9). To identify the fragment of domain I most suitable for structural studies, three TFIIS constructs containing residues 1-93, 1-111, and 1-124 were prepared, and the resulting 15N-labeled proteins were tested for feasibility by NMR. The TFIIS-(1-111) construct containing the conserved residues was expressed well, was stable for several weeks, and had all the dispersed HSQC peaks (characteristic of residues in structured parts of the protein) that were observed in the longer TFIIS-(1-124) construct. The TFIIS-(1-111) construct was therefore selected for structure determination.
Domain I of TFIIS forms a four-helix bundle (Fig. 1). The helices are each 12 to 13 residues long and are connected by structured loops of length 4, 9, and 6 residues. Residues 78-111 are unstructured. The structure of domain I is likely to be conserved in elongin and CRSP70. The hydrophobic core residues of helices 2, 3, and 4 are well conserved among TFIIS, elongin, and CRSP70, although helix 1 is less conserved (Fig. 2). A search of the DALI database (29) revealed that the closest structural homolog is the A chain of cytochrome c oxidase (Z-score = 4.1, root mean squared deviation (RMSD) = 2.7 Å), which is also a four-helix bundle although its helices are much longer than those of TFIIS.
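For readers unfamiliar with the RMSD figure quoted above, the quantity can be sketched as follows. This is a minimal illustration that assumes the two coordinate sets are already superimposed and matched atom for atom; DALI performs its own structural alignment, which is not reproduced here. The function name and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points.

    Assumes the points are already paired and optimally superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates in angstroms: every atom displaced by 2 along z.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
print(rmsd(a, b))
```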
The surfaces of the proteins were examined to identify regions that might participate in functional interactions. The structures of the homologous proteins were modeled with SWISS-MODEL Version 3.5 (26) using the yeast TFIIS structure as a template. The reliability of the models depends on the degree of sequence identity between the template and model sequences. The models are expected to be accurate for human TFIIS and for helices 2, 3, and 4 of elongin A and CRSP70 because these helices share high sequence homology with the TFIIS structure. The modeled structures are likely to be poorer in helix 1 because the sequence homology in this region is lower.
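Because model reliability is tied to sequence identity, a per-region identity calculation is a natural check. The sketch below computes percent identity over the aligned, ungapped columns of two sequences. Several conventions exist for the denominator; the choice here (columns without a gap in either sequence) is an assumption, and the sequences are toy strings, not the TFIIS, elongin A, or CRSP70 alignment.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Toy aligned sequences for illustration only.
template = "MKLV-EELAK"
target = "MRLVQEDL-K"
print(percent_identity(template, target))
```

Applied separately to the columns spanning each helix, such a calculation would make the contrast between helix 1 and helices 2 to 4 explicit.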
TFIIS and human elongin A both bind the transcription holoenzyme. To identify which part of the proteins might be responsible for binding, we looked for surface features conserved between these two proteins (Fig. 3). Several surface residues are conserved in TFIIS and elongin: Leu-7, Asn-12, Glu-14, Asn-19, Leu-24, Thr-37, Leu-41, Val-46, Lys-54, Lys-55, Lys-66, Met-68, Ile-76, and Lys-73. All of these residues with the exception of Lys-73 are also conserved in CRSP70. Several of these residues localize to the top of the helix bundle and the face of the protein formed by helices 1 and 3. These form a basic patch, which is most extensive in human TFIIS. The charged nature of this conserved patch makes it a likely candidate for the location of an interaction common to TFIIS, elongin, and CRSP70. Other than this basic patch, the surface charge properties vary considerably between the proteins. This may reflect binding of different targets or may indicate that the remainder of the surface is not functionally important. Helices 2 and 4 of CRSP70 and elongin A have basic patches that are not found in TFIIS.
In vertebrates, TFIIS is expressed in several distinct isoforms, of which one seems to be a general form and one is testis-specific (30). One main difference between the isoforms is the length of the region linking domains I and II. Changing the length of this linker may modify the function of TFIIS by changing the configuration and orientation of proteins within TFIIS-containing complexes.
In conclusion, we have solved the structure of domain I of yeast TFIIS and shown that the overall fold of this domain is likely to be conserved among TFIIS, elongin A, and CRSP70. We modeled the homologous proteins using the yeast structure as a template and identified a conserved patch of positive charge, which may be involved in functional interactions common to all three proteins.