Solution Structure of the N-terminal Domain of the Human TFIIH MAT1 Subunit

The human MAT1 protein belongs to the cyclin-dependent kinase-activating kinase complex, which is functionally associated with the transcription/DNA repair factor TFIIH. The N-terminal region of MAT1 consists of a C3HC4 RING finger, which contributes to optimal TFIIH transcriptional activities. We report here the solution structure of the human MAT1 RING finger domain (Met1–Asp65) as determined by 1H NMR spectroscopy. The MAT1 RING finger domain presents the expected βαββ topology with two interleaved zinc-binding sites conserved among the RING family. However, the presence of an additional helical segment in the N-terminal part of the domain and a conserved hydrophobic central β strand are the defining features of this new structure and more generally of the MAT1 RING finger subfamily. Comparison of electrostatic surfaces of RING finger structures shows that the RING finger domain of MAT1 presents a remarkable positively charged surface. The functional implications of these MAT1 RING finger features are discussed.

MAT1 is one of the nine subunits of the human transcription/DNA repair factor TFIIH, which is known to play a crucial role in the transcription of class II genes as well as in DNA repair through the nucleotide excision repair pathway (1). This factor may be resolved in vivo and in vitro into two structural subcomplexes: the TFIIH core and the cyclin-dependent kinase (cdk)-activating kinase (CAK) complex (2,3). The CAK complex is composed of the catalytic subunit cdk7, the regulatory subunit cyclin H, and a third partner, MAT1, originally defined as a stabilizing and activating factor (4,5). This complex is also found in its free form within the cell and preferentially phosphorylates cdks, which are key components of cell cycle progression (6). As part of the TFIIH factor, the CAK complex phosphorylates different substrates of the transcription apparatus including TATA box-binding protein, TFIIE, TFIIF, the C-terminal domain of the largest subunit of RNA polymerase II, and regulatory factors such as p53 and some nuclear receptors (7). The cdk7 kinase activity of the CAK complex is stimulated by the combined action of cyclin H and MAT1 binding and cdk7 phosphorylation (8). Moreover, the MAT1 subunit is involved in substrate selection, determining whether cdks or components of the transcription apparatus are phosphorylated (9,10). To further investigate the role of the CAK complex in transcription when part of TFIIH and to elucidate the specific role of MAT1, a structural study of all CAK components was undertaken. The crystal structure of cyclin H was solved (11), and a structural model of the cdk7-cyclin H complex was built (12). Recently, a combination of sequence analysis and biochemical data showed that MAT1 can be divided into three functional domains: an N-terminal RING finger domain, a central coiled coil domain, and a C-terminal domain rich in hydrophobic residues.
Functional analysis revealed that the C terminus strongly interacts in vitro, as well as in vivo, with the cdk7-cyclin H complex and stimulates cdk7 kinase activity (13). The same study showed that the median domain of MAT1 is involved in CAK anchoring to the core TFIIH through interactions with both XPD and XPB helicases. It has also been shown that the deletion of the N terminus, which presents the consensus sequence of a C3HC4 RING finger domain, inhibits basal transcription as well as the phosphorylation of the C-terminal domain of RNA polymerase II when engaged in a transcription complex (13). This highlights the potential role of the RING finger domain of MAT1 in the architecture of the transcription preinitiation complex.
To complete the functional data available for the MAT1 N-terminal domain and to provide a structural basis for further structure-function relationships, we determined its solution structure using proton NMR spectroscopy. The comparison with previously reported RING finger structures shows that the MAT1 RING finger domain presents a classical ββαβ topology with a "cross-brace" arrangement of the eight zinc-binding ligands. The MAT1 RING finger domain is characterized by the presence of an additional short α-helix within the N-terminal loop and by an extended basic surface. The functional implications of these features, which are specific to all of the MAT1 RING finger orthologous sequences, are discussed, as are the new insights brought by this fourth high resolution structure of a RING finger.

EXPERIMENTAL PROCEDURES
Expression and Purification of the Recombinant RING Finger Domain-The nucleotide sequence encoding the fragment corresponding to the RING finger domain of MAT1 (Met1–Asp65) was amplified by polymerase chain reaction and inserted into the appropriate Escherichia coli expression vector. The cDNA of the human MAT1 gene was amplified by polymerase chain reaction using a forward primer, which introduces a BamHI site at the 5′ end, and a reverse primer containing a stop codon and an EcoRI site at the 3′ end. After digestion by BamHI and EcoRI (New England Biolabs), the polymerase chain reaction fragment was inserted into the pGEX-4T2 expression vector (Amersham Pharmacia Biotech). A starter culture of 500 ml of LB containing 200 µg/ml ampicillin was inoculated with the E. coli strain BL21(DE3) transformed by the pGEX-4T2 recombinant vector and grown overnight at 37°C. Cells were pelleted, resuspended in fresh medium, and used to inoculate 6 liters of LB medium containing 200 µg/ml ampicillin at an A600 nm of 0.1. Cultures were grown at 37°C to an A600 nm of 0.6–0.8, and the expression of recombinant proteins was induced by addition of 0.6 mM isopropyl-1-thio-β-D-galactopyranoside. After 4 h, cells were harvested, washed in buffer A (20 mM Tris-HCl, pH 7.5, 150 mM NaCl, 20% glycerol), frozen in liquid nitrogen, and stored at −80°C.
Cells were resuspended in 100 ml of buffer B (50 mM Tris-HCl, pH 7.5, 500 mM NaCl) containing 2.5 mM β-mercaptoethanol and disrupted by sonication for 10 min (pulse 2/8, T = 10°C) using a 13-mm probe with a Vibracell 72412 sonicator at 30% intensity. The cell extract was then centrifuged for 2 h at 45,000 rpm at 4°C in a Beckman R60Ti rotor. The soluble extract containing the GST-RING finger recombinant protein was incubated for 1 h at 4°C with 4 ml of GSH-Sepharose resin (Amersham Pharmacia Biotech) preequilibrated in buffer B containing 2.5 mM β-mercaptoethanol. The resin was washed in batch with 40 volumes of buffer B. The adsorbed proteins were eluted with 2 × 4 ml of buffer B containing 30 mM glutathione. The fractions (4 ml) containing the GST-RING finger protein (as judged by SDS-polyacrylamide gel electrophoresis) were pooled, and the GST fusion protein was cleaved with bovine thrombin (Sigma) (3 units per mg of recombinant fusion protein) at 4°C for 12 h. The sample was concentrated using a Centriprep device with a 3,000-Da cut-off (Amicon). Digestion was stopped by addition of 5 mM Pefabloc (Roche Molecular Biochemicals), and the fraction was then subjected to gel filtration chromatography (Amersham Pharmacia Biotech; 2.6 × 60 cm at a flow rate of 2 ml/min) in buffer C (20 mM Tris-HCl, pH 7.5, 50 mM NaCl). Recombinant RING finger protein-containing fractions were pooled and concentrated on a Centriprep device with a 3,000-Da cut-off to a final concentration of approximately 20 mg/ml (2 mM). For NMR studies, the sample was dialyzed against buffer C containing deuterated Tris.
FIG. 1. NMR data of the human MAT1 RING finger domain. A, the number of NOE distances as a function of residue number is shown in violet for intra-residue and in blue, yellow, and red for short, medium, and long range inter-residue, respectively. The presence of intra-residue NOE distances indicates the residues for which stereospecific Hβ assignments were possible. B, stereo view of the Cα trace from the 20 lowest energy superimposed NMR structures (rmsd 0.67 Å). The two zinc atoms and the zinc-binding residues are shown in red. C, schematic view of the αββαβ fold of the MAT1 RING finger domain with the two zinc ligation sites (ZNI and ZNII). α-helices and β-strands are displayed with pink boxes and cyan arrows, respectively. Secondary structure NOEs evidencing the three-stranded β-sheet are shown as blue arrows. Hydrogen bond constraints deduced from solvent exchange experiments are indicated by red dashed lines.
NMR Spectroscopy-40 µl of D2O was added to the 400 µl of protein solution for the lock, and 2,2-dimethyl-2-silapentane-5-sulfonate was used as the internal chemical shift reference. Homonuclear TOCSY (14), NOESY (15), and DQF-COSY (16) spectra were recorded at four temperatures (283, 290, 298, and 303 K) on either Bruker DRX600 or DMX750 spectrometers with spectral widths of 7000 Hz (600 MHz) or 8333 Hz (750 MHz) in both dimensions and a relaxation delay of 2 s. Water signal suppression was achieved by presaturation or by using a WATERGATE sequence (17). Slowly exchanging amide protons were identified by recording 70-ms NOESY spectra at 283 K and at different delays after addition of D2O to the lyophilized sample. Processing was performed on an SGI Octane SE computer using the program FELIX 97 (Biosym Technologies) and on an SGI INDY R5000 computer using XWIN-NMR software (Bruker). Spectra were assigned with the FELIX 97 package (Biosym Technologies) and the XEASY program (18). A single set of resonances was assigned for 63 of the 65 residues of the MAT1 fragment; the two missing residues were the N-terminal Met and Arg54. Stereospecific assignments of Cβ methylene protons were obtained for 30 residues on the basis of the patterns of HN-Hβ and Hα-Hβ NOEs and J(Hα-Hβ) scalar couplings (19).
Structure Calculations-A first set of distance constraints was obtained by classifying peak volumes measured on a 70-ms NOESY spectrum recorded at 298 K as strong, medium, and weak, corresponding to distances of 2.7, 3.7, and 5.0 Å, respectively. 60 structures were generated using the restrained simulated annealing protocol implemented in the program X-PLOR 3.851 (20,21). Eight additional distances of 2.4 Å were added between the two zinc atoms and the Sγ of cysteine residues and between the side chain of His28 and the second zinc atom according to the binding pattern deduced from the primary sequence analysis. Several rounds of structure calculations were then analyzed in an iterative manner with successive incorporation of initially ambiguous distance restraints. The structure was then refined by converting the NOE cross-peak volumes (V) into target distances (d) according to the following relationship (22),
$$d = d_{\mathrm{ref}} \left( \frac{V}{V_{\mathrm{ref}}} \right)^{-1/6}$$
where $d_{\mathrm{ref}}$ is the average distance calculated for all distances ranging from 2.7 to 5 Å between amide, Hα, and Hβ protons obtained from the first round of calculations. $V_{\mathrm{ref}}$ was calculated as the arithmetic average over all corresponding volumes. Upper (d+) and lower (d−) limit distance restraints were derived using the empirical relationship proposed by M. Nilges.
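The volume-to-distance calibration described above can be illustrated with a minimal Python sketch. It assumes the usual inverse sixth-power form, d = d_ref (V / V_ref)^(-1/6), with V_ref taken as the arithmetic mean of the reference volumes. The function name, the example volumes, and the d_ref value are hypothetical and are not taken from the actual data set.

```python
from statistics import mean


def calibrate_distances(volumes, reference_volumes, d_ref):
    """Convert NOE cross-peak volumes into target distances (in Å).

    Applies d = d_ref * (V / V_ref) ** (-1/6), with V_ref taken as the
    arithmetic mean of the reference volumes.
    """
    if d_ref <= 0:
        raise ValueError("d_ref must be positive")
    if any(v <= 0 for v in list(volumes) + list(reference_volumes)):
        raise ValueError("volumes must be positive")
    v_ref = mean(reference_volumes)
    return [d_ref * (v / v_ref) ** (-1.0 / 6.0) for v in volumes]


# Hypothetical volumes in arbitrary units; the d_ref of 3.0 Å is illustrative only.
distances = calibrate_distances([64.0, 1.0], reference_volumes=[0.5, 1.5], d_ref=3.0)
```

Because of the sixth-root dependence, a 64-fold increase in volume only halves the derived distance, which is why NOE-derived distances are fairly tolerant of volume integration errors.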
The final set of experimental constraints included 897 distances derived from NOEs (75 intra-residue, 286 sequential, 210 medium range, and 326 long range) and 22 distances derived from hydrogen bonding patterns. 22 φ and 21 ψ dihedral angles deduced from J(HN-Hα) couplings and from the secondary structure analysis were used as weak constraints (Δφ = ±50° and Δψ = ±40°). The topology of the peptide was modified to include the coordination bonds between histidine and cysteine residues and the two zinc atoms according to tetrahedral coordination values (23). Structure calculations run using either Nε2 or Nδ1 of the His28 ring as the fourth ligand for the second zinc-binding site allowed us to unambiguously assign Nδ1 as the zinc-bound atom. A set of 51 structures was retained based on the following criteria: low total energy, no NOE violation greater than 0.5 Å, and no angle violation greater than 5°. Structures were visualized using MOLMOL (24), and their structural quality was analyzed with PROCHECK (25) and WHATIF (26). The latter program has been used to define the structurally equivalent positions and to superimpose the MAT1 structure onto the three available RING structures (24).
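The acceptance criteria listed above amount to a filter followed by an energy ranking. The following Python sketch illustrates that logic under stated assumptions: the `Model` record and its field names are hypothetical, and because the text gives no numerical energy cut-off, "low total energy" is approximated here by keeping the lowest-energy models among those that pass the violation thresholds.

```python
from dataclasses import dataclass


@dataclass
class Model:
    """One calculated structure (hypothetical record, for illustration)."""
    name: str
    total_energy: float         # force-field energy, arbitrary units
    max_noe_violation: float    # largest NOE distance violation, in Å
    max_angle_violation: float  # largest dihedral angle violation, in degrees


def select_models(models, noe_limit=0.5, angle_limit=5.0, keep=20):
    """Keep models with no violation above the limits, lowest energy first."""
    accepted = [
        m for m in models
        if m.max_noe_violation <= noe_limit
        and m.max_angle_violation <= angle_limit
    ]
    return sorted(accepted, key=lambda m: m.total_energy)[:keep]
```

For example, `select_models(models, keep=20)` would return at most 20 models, mirroring the way statistics are reported over the 20 lowest energy structures.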

RESULTS AND DISCUSSION
The MAT1 RING Domain Solution Structure-To determine the solution structure of the N-terminal RING finger domain of the human MAT1 subunit, a polypeptide corresponding to residues Met1–Asp65 was produced. The definition of the domain boundaries was based on mild proteolysis experiments and on the comparison of orthologous sequences. This domain, when expressed as a GST fusion protein in E. coli, is soluble and can be easily purified. After removal of the GST tag and subsequent gel filtration, the MAT1(1–65) fragment led to a monodisperse solution and could be concentrated up to 2 mM. As expected for a canonical C3HC4 RING finger domain (27), atomic absorption and mass spectrometry experiments showed that the human MAT1 RING finger domain binds two zinc atoms (data not shown).
A first analysis of NOESY spectra recorded on the MAT1 RING finger domain indicated the presence of both α and β secondary elements with a good dispersion of resonances. The sample behaves as a monomer in solution, as the average proton line widths are in good agreement with the values expected for a folded 7-kDa protein. NOE connectivities observed between Hβ protons of Cys6, Cys9 and Cys31, Cys34 clearly indicate that these residues form one of the two zinc-binding sites and therefore that the two zinc atoms are bound in a cross-brace fashion, which is one of the defining features of the RING family.
The distribution of the inter-residue NOE restraints used to calculate the structure together with the Cα rmsd calculated along the peptide chain are shown in Fig. 1A. Except for a few regions where no long range NOE could be observed, the experimental set of NOE-derived distances allows an accurate definition of the three-dimensional structure of the human MAT1 RING finger domain, with a backbone rmsd of 0.67 Å.
Experimental restraints and structural statistics over the 20 lowest energy structures are summarized in Table I. The Cα backbone trace of the 20 lowest energy NMR structures is shown in Fig. 1B. The N-terminal fragment of the human MAT1 subunit adopts the ββαβ fold typical of RING finger domains and presents an unusual one-turn α-helix in its N terminus. The core of the domain consists of a three-stranded antiparallel β-sheet, comprising residues Leu21–Val23 (β1), Thr29–Cys31 (β2), and Arg59–Gln61 (β3) packed along a two-turn α-helix (helix α2, residues Glu32–Val40). The triple-stranded β-sheet is clearly defined by an unambiguous pattern of Hα-HN, Hα-Hα, and HN-HN NOEs (Fig. 1C). Slowly exchanging amide protons are observed for residues Met22, Val23, Leu30, and Gln61 in the β-strands, which indicates that they are hydrogen-bonded. A regular pattern of Hα-HN(i,i+3), Hα-Hβ(i,i+3), and Hα-HN(i,i+4) NOEs (28) together with upfield-shifted Hα resonances (29) and solvent-protected amide protons define two helical regions (α1 and α2). Some regions of the peptide chain are less well defined (local rmsd, ~1 Å) and correspond to loops that link the secondary structure elements, namely loop L1, which encompasses helix α1 (Thr12–Arg15), and loop L2 between residues Val40 and Ser56. The Ramachandran plot (data not shown) shows that 97% of the nonglycine and nonproline residues are located in allowed regions; the few residues presenting unusual φ and ψ angles are systematically located in the loop regions.
The RING domain is stabilized by two mononuclear zinc sites separated by 14 Å. Cys6, Cys9, Cys31, and Cys34 form one zinc-binding site (C4), whereas Cys26, Cys46, and Cys49 with His28 form the second zinc-binding site (C3H). The first cysteine pair (Cys6, Cys9) stabilizes the N-terminal part of the peptide. The loop L1 containing the α1 helix is connected to the central β1-strand, which is linked to β2 by a short loop harboring the two zinc ligands Cys26 and His28. A two-turn α-helix (Glu32–Val40) is positioned between the β2-strand and the β3-strand and contains Cys34, which is paired with Cys31 to form the third zinc-ligand pair. A long loop (L2) comprising the fourth pair of zinc ligands (residues Cys46, Cys49) connects the helix α2 to the β3-strand. The overall shape of the MAT1 RING finger domain is found to be slightly elongated, with principal axis lengths of 13.5 × 10.0 × 19.5 Å.
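The cross-brace arrangement described above can be summarized as a simple mapping: the first and third ligand pairs share one zinc atom, and the second and fourth pairs share the other. The short Python sketch below encodes the MAT1 ligands from the text in that form. The site labels ZNI and ZNII follow the Fig. 1 legend, while the variable and function names are illustrative choices only.

```python
# Zinc ligand pairs of the MAT1 RING finger, numbered in sequence order.
LIGAND_PAIRS = {
    1: ("Cys6", "Cys9"),
    2: ("Cys26", "His28"),
    3: ("Cys31", "Cys34"),
    4: ("Cys46", "Cys49"),
}


def zinc_sites(pairs):
    """Assign ligand pairs to zinc sites in a cross-brace fashion.

    Pairs 1 and 3 coordinate the first zinc; pairs 2 and 4 coordinate the
    second, so the two sites are interleaved along the sequence.
    """
    return {
        "ZNI": pairs[1] + pairs[3],
        "ZNII": pairs[2] + pairs[4],
    }


sites = zinc_sites(LIGAND_PAIRS)
```

Written this way, the C4 and C3H compositions of the two sites follow directly from which pairs are combined, rather than from the order of the ligands in the sequence.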
Finally, the MAT1 RING finger core is stabilized by a network of hydrophobic residues that are highly conserved among MAT1 orthologs (Fig. 2), namely Leu19 in the loop L1; Met22, Leu30, and Val60 in the internal face of the β sheet; Val35, Leu38, and Phe39 in the helix α2; and Leu53 and Phe58 in the loop L2.
Structure Comparison with Known RING Finger Domains-The topology of the three β strands, together with the cross-brace arrangement of the eight zinc-binding residues of the MAT1 RING domain, is similar to that observed in two RING atomic resolution structures that have been reported: the structure from the IEEHV protein solved by NMR (30) and the crystal structure of the human recombination-activating protein RAG1 dimerization domain (31). A similar cross-brace arrangement of the zinc-binding residues was also found in the structure of the human acute promyelocytic leukemia proto-oncoprotein PML (32). The ribbon diagrams of the MAT1 RING structure, together with the three previously reported structures of RING finger domains, are shown in Fig. 3. A database search for superimposable folds in the Protein Data Bank using the Dali program (33) finds structural similarities between MAT1 and the RING motifs in the RAG1 dimerization domain (31) and in the IEEHV (30). The best structural homology score is found for the superimposition of the MAT1 structure onto the crystal structure of RAG1. Indeed, both structures can be superimposed over 43 equivalent Cα atoms with an rmsd value of 1.7 Å, whereas the comparison with the solution structure of the IEEHV RING yields 28 equivalent Cα atom positions that superimpose with an rmsd value of 1.99 Å (the structurally equivalent positions are indicated by plain circles in the alignment of Fig. 2B). However, no significant superimposition could be obtained when comparing the MAT1 RING structure with the solution structure of the RING finger domain from the acute promyelocytic leukemia proto-oncoprotein PML. It is worth mentioning that the weak structural homology observed between the RING finger domains of the human MAT1 protein and PML was also observed when comparing those of RAG1 and PML (31).
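The rmsd values quoted above measure the root mean square deviation between structurally equivalent Cα positions once the two structures have been superimposed. A minimal Python sketch of that final step is given below. It assumes the two coordinate lists are already superimposed and listed in matching order; it does not perform the superposition itself or the search for equivalent positions, which the text attributes to WHATIF and Dali. The function name and example coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (in Å) between paired 3D coordinates.

    Both inputs are sequences of (x, y, z) tuples of equal length, already
    superimposed and listed in the same order of equivalent atoms.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Two hypothetical atom pairs: one coincident, one displaced by 2 Å along z.
example = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
               [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)])
```

Note that the rmsd depends on how many atoms are included, so values computed over 43 and 28 equivalent positions are not directly comparable without also considering the size of the aligned set.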
The comparison of the MAT1 RING structure with other available RING structures confirms that the consensus C3HC4 zinc-binding sequence defines a conserved structural motif, which constitutes a widely used molecular scaffold. Sequence comparisons of various RING sequences show, however, that this consensus sequence incorporates regions of high sequence diversity with variable spacing between the conserved zinc-binding residues. One of these regions is located between the first two pairs of zinc-binding ligands and encompasses the loop L1. In most RING sequences, this loop contains 10–12 residues, whereas the MAT1 sequence incorporates 16–17 residues (Fig. 2B). The observation of a well defined secondary structure element (helix α1) in this region is noteworthy and constitutes a specific feature of the MAT1 structure (Fig. 2A). In contrast, the loop L2 containing the fourth zinc ligand pair presents the same conformation as in other RING structures despite the sequence divergence outside the fourth pair of zinc ligands.
The use of stereospecific constraints on most of the Hβ methylene protons allows a precise determination of the side chain orientations (χ1 angle), in particular for the zinc-binding residues. A detailed analysis of the zinc ligation sites in the various RING structures reveals that the second coordination site, ZNII, is well conserved between the different structures. When comparing the ZNII binding sites of MAT1 and RAG1 (Fig. 4), we found a close superimposition, with an rmsd of 0.29 Å, of the four MAT1 ZNII ligand side chain heavy atoms (Cys26, His28, Cys46, and Cys49) onto the corresponding atoms of RAG1. In the same manner, the ZNII coordination sites of RAG1 and IEEHV can be superimposed with an rmsd of 0.32 Å. The first coordination site is less conserved between the three RING finger structures. Indeed, the superimposition of the Cα of the four ligands (Cys6, Cys9, Cys31, and Cys34) onto the equivalent Cα of RAG1 is poorer, yielding an rmsd of 0.57 Å, a value similar to the one obtained when comparing RAG1 and IEEHV (rmsd of 0.54 Å). In the RAG1 structure, the first zinc-binding site is part of a binuclear cluster, with Cys29 (equivalent to Cys9 in MAT1) bridging two zinc atoms. This feature of RAG1 may explain the observed local structure differences around the first zinc-binding site.
Functional Implications of MAT1 Structure-In a recent paper, Busso et al. (13) have established the role of the N-terminal RING finger domain of MAT1 in the activation of transcription in a TFIIH-dependent manner. They have also observed that the presence of the RING finger domain allows an optimal phosphorylation of the RNA polymerase II C-terminal domain. In agreement with the general role of RING domains in mediating protein-protein interactions (34), it has been suggested that the MAT1 RING domain interacts with other factors within the preinitiation complex of transcription, although no partner has yet been found. When compared with other RING structures, the MAT1 RING domain presents specific features that could be involved in the MAT1 activities.
First, the occurrence of a stretch of conserved hydrophobic residues located in the vicinity of the β1 strand (residues Leu19, Leu21, Met22, and Val23) constitutes an interesting feature. Among these conserved hydrophobic residues, two are involved in the hydrophobic packing of the RING structure, namely Leu19 and Met22, whereas the two others (Leu21 and Val23) are exposed to the solvent. Such a stretch of hydrophobic residues is unusual in RING sequences (Pfam zf-C3HC4 (35)) and could be related to a regulation of the MAT1 activity.
A second specific structural feature of MAT1 concerns the presence of a structured region including one turn of an α-helix (α1), which corresponds to the sequence insertion between the first and second pairs of zinc-binding ligands, only observed in MAT1 sequences. Interestingly, this helix contains a solvent-exposed tyrosine residue (Tyr14), which is strictly conserved among all MAT1 sequences, suggesting that the helix α1 might be involved in the packing of the RING domain with another partner. This hypothesis is supported by the comparison of the MAT1 RING structure with the crystal structure of RAG1, which shows that the equivalent part in RAG1 interacts with its N-terminal C2H2 zinc finger (32). Moreover, the solvent-exposed tyrosine could be a potential site for phosphorylation and therefore be involved in activity regulation. Recently, the crystal structure of a complex consisting of a portion of the c-Cbl proto-oncogene protein bound to the ubiquitin-conjugating enzyme UbcH7 and a kinase peptide was reported (36). The structure reveals that the loop L1 of the C-terminal c-Cbl RING domain interacts closely with both the N-terminal c-Cbl tyrosine kinase binding domain and the UbcH7 partner, thus emphasizing the functional role of this region in RING domains.
Analysis of the surface electrostatic potential shows that the MAT1 RING domain is highly positively charged because of the presence of several basic side chains of arginines and lysines (Fig. 5). Whereas the extended distribution of the nine positive charges observed at the domain surface is a specific feature of higher eukaryote sequences of MAT1, it is worth noting that the four positively charged residues that are conserved from human to yeast (Lys20, Arg54, Lys55, and Arg59) are located in the same area, forming a positive patch. Two highly conserved acidic residues located on the helix (Glu32 and Asp36) form a small negative patch on the exposed side of the helix α2. It is worth noting that the two C-terminal acidic residues (Glu64 and Asp65) are also conserved but are disordered in the structure. The presence of positively charged patches on protein surfaces seems to be a general feature of the RING finger domains, but the basicity of the MAT1 RING surface is remarkable, as shown in Fig. 5. It must be stressed that most of the mutations that affect the function of PML and IEEHV RING finger domains involve charged residues.
The role of the MAT1 RING finger domain within the transcription complex needs to be studied further by site-directed mutagenesis. Two targets need to be identified from the three-dimensional structure and probed: (i) the positively charged residues that may be involved in the modulation of the CAK phosphorylation activity through electrostatic interactions with either the phosphorylated C-terminal domain, the DNA, or another component of the preinitiation complex, and (ii) the solvent-exposed hydrophobic residues in the strand β1 that may directly affect the stability of the preinitiation complex. It would be of particular interest to know whether these residues are independently related to the two distinct functions of the MAT1 RING finger domain: the phosphorylation of the RNA polymerase II C-terminal domain and transcription activation. To address these points, the building of specific mutants is currently under way.
Since the first member of the Really Interesting New Gene protein family was identified in 1991 (37), only a few structural data have become available, partly because of the natural propensity of these domains to aggregate and precipitate when being expressed and concentrated. So far no general structure-activity relationship for the RING finger family has been established. The solution structure of the N-terminal part of MAT1 is the fourth structure of a RING domain that is now available. These new data provide interesting insights into the structure of the loop region of variable length between the first two pairs of zinc-binding residues that could be useful in other biological contexts. Finally, the structural variability of the loop L1 and the charge distribution at the surface of the RING domains could be essential factors that modulate RING finger activities.