Different Strategies for Carboxyl-terminal Domain (CTD) Recognition by Serine 5-specific CTD Phosphatases*

The phosphorylated carboxyl-terminal domain (CTD) of RNA polymerase II, consisting of (Y1SPTSPS7)n heptad repeats, encodes information about the state of the transcriptional apparatus that can be conveyed to factors that regulate mRNA synthesis and processing. Here we describe how the CTD code is read by two classes of protein phosphatases, plant CPLs and yeast Ssu72, that specifically dephosphorylate Ser5 in vitro. The CPLs and Ssu72 recognize entirely different positional cues in the CTD primary structure. Whereas the CPLs rely on Tyr1 and Pro3 located on the upstream side of the Ser5-PO4 target site, Ssu72 recognizes Thr4 and Pro6 flanking the target Ser5-PO4 plus the downstream Tyr1 residue of the adjacent heptad. We surmise that the reading of the CTD code does not obey uniform rules with respect to the location and phasing of specificity determinants. Thus, the CTD code, like the CTD structure, is plastic.

The carboxyl-terminal domain (CTD) of the largest subunit of RNA polymerase II is a landing pad for proteins and multiprotein complexes that regulate transcription and catalyze mRNA processing. The CTD is composed of a tandemly repeated heptapeptide of consensus sequence Y1SPTSPS7. CTD positions Ser5 and Ser2 undergo waves of phosphorylation and dephosphorylation during the transcription cycle. The potential complexity of the CTD serine phosphorylation array comprises 4^n different structures, where n is the number of heptad repeats (n varies from 15 in microsporidia to 52 in mammals). Thus, the CTD is a vast repository of information about the state of the transcriptional apparatus (a so-called "CTD code" (1)) that can be read by trans-acting factors. Studies of the interaction of mRNA capping enzymes with the phosphorylated CTD have illuminated its atomic structure and how CTD length, amino acid sequence, and the phosphorylation array influence CTD-PO4 effector functions (2-4). Three important points about the CTD code have emerged: (i) the phosphorylated CTD is structurally plastic and can assume markedly different conformations depending on its binding partner (4-7); (ii) CTD primary structure is recognized independent of CTD phosphorylation state (2, 3, 8, 9); and (iii) encoded CTD-PO4 information can be assembled from multiple noncontiguous repeats (4).
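The size of the phosphorylation array is easy to underestimate. As a minimal sketch, assuming each heptad independently occupies one of four states (unphosphorylated, Ser2-PO4 only, Ser5-PO4 only, or both), the count can be evaluated for the repeat numbers cited above. The function name is illustrative and not part of the original work.

```python
def ctd_phospho_states(n_repeats: int) -> int:
    """Number of distinct Ser2/Ser5 phosphorylation arrays for n heptad repeats.

    Assumption: each heptad is independently in one of 4 states
    (none, Ser2-PO4 only, Ser5-PO4 only, or both), giving 4**n arrays.
    """
    if n_repeats < 0:
        raise ValueError("n_repeats must be non-negative")
    return 4 ** n_repeats


# Repeat numbers quoted in the text: microsporidia (15) and mammals (52).
for n in (15, 52):
    print(f"n = {n}: {ctd_phospho_states(n):.3e} possible arrays")
```

Python integers have arbitrary precision, so the exact value for n = 52 is available if needed; the formatted print only shortens the display.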
The information content of the CTD reflects the instantaneous balance between the activities of CTD kinases and CTD phosphatases at either some or all of the serine phosphorylation sites. A subject of intense interest (and debate) is whether and how CTD kinases and phosphatases are themselves responsive to specificity determinants within the CTD. Among the CTD phosphatases, some enzymes fail to discriminate between Ser2-PO4 and Ser5-PO4 substrates (9, 10), whereas others display a preference (2- to 10-fold) for either Ser2-PO4 or Ser5-PO4 (11, 12). Recently, we described two types of CTD phosphatases that act exclusively on Ser5-PO4: (i) the paralogous plant enzymes CPL1 and CPL2, which were identified genetically as regulators of osmotic stress and abscisic acid-responsive transcription in Arabidopsis thaliana (13, 14), and (ii) yeast Ssu72, which interacts with TFIIB and with proteins involved in RNA 3′-end formation (15-20).
CPL1/2 and Ssu72 belong to different enzyme families. The plant CPL1/2 proteins, like the prototypal CTD phosphatase Fcp1, are members of the DXDXT superfamily of metal-dependent phosphotransferases that act via an aspartyl-phosphoenzyme intermediate (8, 11, 13, 21-23). Ssu72 belongs to the CXXXXXR superfamily of metal-independent phosphohydrolases that act via a cysteinyl-phosphoenzyme intermediate (15, 24, 25). This situation, whereby nature has selected two entirely different structural solutions to perform the same site-specific chemical transformation of the CTD, provides an opportunity to address key questions about how the CTD code is read. Here we demonstrate that CPL1/2 and Ssu72 rely on different positional cues in the CTD primary structure to recognize and hydrolyze Ser5-PO4.

EXPERIMENTAL PROCEDURES
Recombinant Proteins-The CPL1-(1-646) and CPL2-(1-649) coding sequences were amplified by PCR using primers designed to introduce BglII sites at the start codon and immediately 3′ of the stop codon. The BglII-digested PCR products were inserted into pET28-His10Smt3 that had been linearized with BamHI. The resulting expression plasmids pET-His10Smt3-CPL1-(1-646) and pET-His10Smt3-CPL2-(1-649) encode the respective CPL proteins fused in-frame to an amino-terminal His10Smt3 domain consisting of a His10 leader (MGHHHHHHHHHHSSGHIEGRH) followed by the 98-amino acid Saccharomyces cerevisiae Smt3 protein and a single serine. (Smt3 is the yeast ortholog of the small ubiquitin-like modifier SUMO.) pET-His10Smt3 plasmids encoding mutated versions CPL1-(1-646)-D161A and CPL2-(1-649)-D144A were generated by two-stage overlap extension PCR and cloning of the BglII-digested mutated PCR products into pET28-His10Smt3. The pET-His10Smt3-CPL plasmids were transformed into Escherichia coli BL21-CodonPlus(DE3). Cultures (500 ml) derived from single transformants were grown at 37°C in Luria-Bertani medium containing 50 µg/ml kanamycin and 50 µg/ml chloramphenicol until the A600 reached 0.6. The cultures were adjusted to 0.2 mM isopropyl-1-thio-β-D-galactopyranoside and 2% ethanol, and incubation was continued for 20 h at 17°C. Cells were harvested by centrifugation and stored at -80°C. All subsequent procedures were performed at 4°C. Thawed bacteria were resuspended in 25 ml of buffer A (50 mM Tris-HCl, pH 8.0, 200 mM NaCl, 10% glycerol). Phenylmethylsulfonyl fluoride and lysozyme were added to final concentrations of 500 µM and 100 µg/ml, respectively. After incubation on ice for 30 min, Triton X-100 was added to a final concentration of 0.1%, and the lysate was sonicated to reduce viscosity. Insoluble material was removed by centrifugation.
The soluble extracts were mixed for 30 min with 1 ml of Ni2+-nitrilotriacetic acid-agarose (Qiagen) that had been equilibrated with buffer A containing 0.1% Triton X-100. The resin was recovered by centrifugation, resuspended in buffer A, and poured into columns. The columns were washed with 10 ml of 20 mM imidazole in buffer A and then eluted stepwise with 1.5 ml of buffer A containing 50, 100, 250, and 500 mM imidazole. The polypeptide compositions of the column fractions were monitored by SDS-PAGE. The recombinant His10Smt3-CPL polypeptides were recovered predominantly in the 250 mM imidazole fractions. The 250 mM imidazole eluates were dialyzed against buffer A containing 2 mM dithiothreitol and 0.01% Triton X-100 and then stored at -80°C. The CPL1 and CPL2 concentrations were determined by SDS-PAGE analysis of serial dilutions of the CPL preparations in parallel with serial dilutions of a BSA standard. The gels were stained with Coomassie Blue, and the staining intensities of the His10Smt3-CPL and BSA polypeptides were quantified using a Fujifilm FLA-5000 digital imaging and analysis system. CPL1 and CPL2 concentrations were calculated by interpolation to the BSA standard curve.
Wild-type Ssu72 and the C15S mutant were produced in E. coli as glutathione S-transferase fusions and purified from soluble bacterial extracts by glutathione-Sepharose affinity chromatography as described (15). Protein concentration was measured with the Bio-Rad dye reagent using BSA as the standard.
CTD Phosphopeptides-CTD Ser-PO4 peptides were synthesized and purified by the Sloan-Kettering Microchemistry Core Laboratory as described previously (2, 8). The peptides were dissolved in 10 mM Tris-HCl (pH 7.4), 1 mM EDTA and stored at 4°C. The molar concentrations of the phosphopeptides were initially estimated from the absorbance at 274 nm using an extinction coefficient of 1.4 × 10^3 M^-1 for tyrosine. The content of Ser-PO4 was then determined for each peptide by measuring the release of inorganic phosphate after digestion with calf intestinal phosphatase (CIP) as described (8).
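The absorbance-based estimate can be sketched as a Beer-Lambert calculation. This is an illustration under stated assumptions, not the authors' worksheet: a 1-cm path length is assumed, the extinction coefficient is applied per tyrosine residue, and the absorbance reading in the example is hypothetical.

```python
EPSILON_TYR = 1.4e3  # extinction coefficient per tyrosine at 274 nm, from the text


def peptide_molarity(a274: float, sequence: str, path_cm: float = 1.0) -> float:
    """Estimate peptide molarity from A274 via Beer-Lambert.

    Assumptions: absorbance at 274 nm is due to tyrosine only, each Tyr
    contributes EPSILON_TYR, and the cuvette path length is path_cm (1 cm default).
    """
    n_tyr = sequence.count("Y")
    if n_tyr == 0:
        raise ValueError("sequence has no tyrosine; A274 cannot be used")
    return a274 / (EPSILON_TYR * n_tyr * path_cm)


tetraheptad = "YSPTSPS" * 4  # four Tyr residues per peptide
# Hypothetical reading of 0.56 absorbance units: roughly 1e-4 M (100 µM) peptide.
print(f"{peptide_molarity(0.56, tetraheptad):.2e} M")
```

Because this estimate counts peptide molecules rather than phosphoserines, the phosphate content still has to be measured separately, as the paragraph above describes.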
CTD Phosphatase Assay-Reaction mixtures (25 µl) containing 50 mM Tris acetate (pH 5.5), 10 mM MgCl2, CTD phosphopeptide, and CPL1 or CPL2 were incubated for 60 min at 37°C. Reaction mixtures (25 µl) containing 50 mM Tris acetate (pH 7.0), 5 mM dithiothreitol, CTD phosphopeptide, and Ssu72 as specified were incubated for 120 min at 30°C. The reactions were quenched by adding 0.5 ml of malachite green reagent (BIOMOL Research Laboratories, Plymouth Meeting, PA). Release of phosphate was determined by measuring A620 and interpolating the value to a phosphate standard curve. The amounts of CTD substrate in the CTD phosphatase reactions are expressed as input phosphoserine, as determined by CIP digestion performed in parallel with each assay.
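The interpolation step can be illustrated with a short sketch. It assumes a linear standard curve fitted by ordinary least squares; the standards, the sample reading, and the input phosphoserine value are hypothetical numbers chosen for illustration, and the function names are not from the original work.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("standards must span more than one phosphate amount")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def phosphate_released(a620, slope, intercept):
    """Invert the standard curve: A620 reading -> nmol of phosphate."""
    return (a620 - intercept) / slope


# Hypothetical phosphate standards (nmol) and their A620 readings.
std_nmol = [0.0, 1.0, 2.0, 4.0, 8.0]
std_a620 = [0.05, 0.15, 0.25, 0.45, 0.85]
slope, intercept = fit_line(std_nmol, std_a620)

released = phosphate_released(0.45, slope, intercept)  # hypothetical sample reading
input_phosphoserine = 5.0  # nmol; hypothetical value from a parallel CIP digest
percent = 100 * released / input_phosphoserine
print(f"{released:.2f} nmol Pi released, {percent:.0f}% of input phosphoserine")
```

Expressing product as a percentage of CIP-releasable phosphate, rather than of nominal peptide, is what allows the end points of different substrates to be compared.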

RESULTS AND DISCUSSION
Physical and Biochemical Characterization of CPL1 and CPL2-CPL1 and CPL2 were shown previously to dephosphorylate in vitro Ser5 of the Arabidopsis CTD (consisting of 34 heptad repeats), but not Ser2 (13). Exclusive hydrolysis of Ser5-PO4, but not Ser2-PO4, was also demonstrated using defined synthetic CTD phosphopeptides. Physical and biochemical characterization of the plant phosphatases was hampered initially by the poor solubility of the recombinant proteins in E. coli and their susceptibility to proteolysis in vivo. Deletion analysis showed that the carboxyl-terminal segments of CPL1 and CPL2 could be removed without diminishing phosphatase activity or affecting the Ser5 specificity (13). Here, we produced the catalytically active amino-terminal domains as His10Smt3-CPL fusions; this maneuver improved their yield and solubility compared with previous expression strategies. The His10Smt3-tagged proteins CPL1-(1-646) and CPL2-(1-649) were isolated from soluble bacterial extracts by adsorption to nickel-agarose and elution with imidazole. SDS-PAGE revealed the presence of polypeptides corresponding to the intact fusion proteins (Fig. 1A, arrows). The apparent sizes of the Smt3-CPL fusion proteins by SDS-PAGE (97-98 kDa) were larger than their calculated molecular masses of 87 kDa; this is because the Smt3 domain migrates aberrantly during SDS-PAGE, appearing ~10 kDa larger than its predicted size. The CPL1 and CPL2 preparations contained several polypeptides in the 60-90-kDa range and a cluster of smaller polypeptides migrating at 20-27 kDa (the latter correspond to His10Smt3 and His10Smt3-peptide fusions arising via proteolysis) (Fig. 1A and data not shown).
The quaternary structure of recombinant CPL1 and CPL2 was investigated by zonal velocity sedimentation through a 15-30% glycerol gradient (Fig. 1, B and C). The marker proteins catalase (native size 248 kDa), BSA (66 kDa), and cytochrome c (12 kDa) were included as internal standards in the gradient. The tagged CPL1 and CPL2 proteins sedimented as discrete peaks coincident with BSA. CPL1 and CPL2 were clearly separated from the cluster of low molecular weight polypeptides, which cosedimented with cytochrome c. The CTD Ser5 phosphatase activity profile (measured by the release of Pi from a synthetic phosphopeptide) paralleled the abundance of the intact CPL1 and CPL2 polypeptides. These results are consistent with a monomeric quaternary structure for the catalytic domains of CPL1 and CPL2.
CPL1- and CPL2-catalyzed release of Pi from a 28-amino acid tetraheptad CTD Ser5-PO4 substrate was proportional to enzyme concentration (expressed as the amount of full-length CPL fusion protein); 95% of the input Ser5-PO4 residues were hydrolyzed at saturating CPL levels (Fig. 2A). From the slope of the titration curves we estimated that CPL1 and CPL2 hydrolyzed a 6400- and 9300-fold molar excess of Ser5-PO4/enzyme, respectively, during the 60-min reaction. Neither CPL1 nor CPL2 catalyzed Pi release from a tetraheptad CTD Ser2-PO4 substrate (Fig. 2A). A kinetic analysis is shown in Fig. 2B of the reaction of CPL2 with the tetraheptad CTD Ser5-PO4 substrate at two different levels of input enzyme. Product accumulated steadily with a pseudo-first order profile. The initial rate was proportional to enzyme concentration; the turnover number was 13 s^-1.
Reaction of CPL1 and CPL2 with a 14-amino acid diheptad CTD Ser5-PO4 substrate YSPTSPSYSPTSPS resulted in Pi release proportional to the amount of input enzyme (Fig. 2C). 85% of the input phosphoserine was hydrolyzed at saturating enzyme levels (Fig. 2C and data not shown). CPL1 and CPL2 released an ~11,000-fold molar excess of Pi/enzyme during the 60-min reaction. In contrast, CPL1 and CPL2 were inert in hydrolyzing a diheptad CTD Ser2-PO4 substrate YSPTSPSYSPTSPS (Fig. 2C). This experiment showed that two heptad repeats suffice for Ser5-specific CTD phosphatase activity. CPL1 and CPL2 are putative members of the DXDXT phosphotransferase family. The signature feature of these enzymes is the formation of an intermediate in which phosphate is attached covalently to the first Asp residue of the DXDXT motif (23). We mutated the presumptive Asp nucleophiles of CPL1 (Asp161) and CPL2 (Asp144) to alanine, produced the His10Smt3-CPL1-(1-646)-D161A and His10Smt3-CPL2-(1-649)-D144A proteins in bacteria, and isolated them from soluble extracts by affinity chromatography. The polypeptide compositions of the mutant proteins were virtually identical to those of the wild-type CPL1 and CPL2 (not shown). The CPL1-D161A and CPL2-D144A preparations displayed no detectable CTD Ser5 phosphatase activity at levels of input protein that were saturating for the respective wild-type enzymes (not shown). These results suggest that the plant CPLs are bona fide acyl-phosphatases and, in conjunction with the sedimentation data, prove that the observed Ser5-specific CTD phosphatase activity is intrinsic to the recombinant CPLs.
CTD Ser5 Phosphatase Activity of Ssu72-The specificity of Ssu72 was characterized using a recombinant glutathione S-transferase-Ssu72 fusion protein purified from a soluble bacterial lysate by glutathione-affinity chromatography (Fig. 3D). Ssu72 hydrolyzed Pi from the tetraheptad Ser5-PO4 substrate; activity reached a plateau at 2.5-5 µg of input protein (Fig. 3A), at which point 55% of the input Ser5-PO4 residues had been hydrolyzed. Product accumulated steadily during a 3-h reaction (Fig. 3B), which signified that the low end point was not dictated by inactivation of the enzyme during the incubation. Additional experiments showed that, although adding more Ssu72 to the reaction after the end point had been achieved did not result in further Pi release, adding more peptide without more enzyme did result in hydrolysis of ~50% of the additional substrate (not shown). Ssu72 failed to release Pi from the tetraheptad Ser2-PO4 substrate (Fig. 3, A and B).
NOVEMBER 11, 2005 • VOLUME 280 • NUMBER 45 JOURNAL OF BIOLOGICAL CHEMISTRY 37683
Ssu72 also hydrolyzed the diheptad CTD Ser5-PO4 substrate YSPTSPSYSPTSPS. 50% of the input phosphoserine was hydrolyzed at saturating enzyme levels (Fig. 3C). We estimated that Ssu72 hydrolyzed a 150-fold molar excess of Ser5-PO4/enzyme during the 120-min reaction, which corresponds to a turnover number of 1.2 min^-1. Ssu72 was unreactive with the diheptad CTD Ser2-PO4 substrate YSPTSPSYSPTSPS (Fig. 3C). Ssu72 resembles the low molecular weight cysteinyl phosphatase enzymes. The presumptive catalytic mechanism entails attack of a cysteine thiolate nucleophile on the substrate to form a cysteinyl-phosphoenzyme intermediate. Here we found that the purified recombinant active site mutant Ssu72-C15S (Fig. 3D) was incapable of hydrolyzing the CTD Ser5-PO4 peptide substrate (not shown).
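The conversion from molar excess to turnover number is a simple division, sketched here for the Ssu72 values quoted above. The function name is illustrative; the result is an average over the whole incubation, which equals the initial rate only if product formation stays linear with time.

```python
def turnover_number(fold_molar_excess: float, reaction_time: float) -> float:
    """Average number of substrate molecules hydrolyzed per enzyme per unit time.

    fold_molar_excess: mol product formed per mol enzyme over the reaction.
    reaction_time: duration of the reaction, in the time unit of interest.
    """
    if reaction_time <= 0:
        raise ValueError("reaction_time must be positive")
    return fold_molar_excess / reaction_time


# Ssu72: 150-fold molar excess hydrolyzed in a 120-min reaction.
ssu72_per_min = turnover_number(150, 120)
print(f"Ssu72: {ssu72_per_min:.2f} per min")
```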
CPLs and Ssu72 Recognize Different Specificity Determinants in the CTD-To probe the role of CTD primary structure in Ser5 phosphatase activity, we tested mutated versions of the tetraheptad CTD Ser5-PO4 peptide (YSPTSPS)4, wherein every Tyr1, Ser2, Pro3, Thr4, or Pro6 was replaced by alanine. The titration profiles of the reaction of CPL1 with the S2A, T4A, and P6A substrates were similar to that of the wild-type CTD (Fig. 4A). However, the CPL1 activity profiles with the P3A and Y1A substrates displayed a shift to the right (Fig. 4A). The specific activities of CPL1 with the P3A and Y1A substrates were 16 and 4%, respectively, of the activity with the wild-type (YSPTSPS)4 substrate. CPL2 hydrolyzed the S2A and T4A CTDs as well as the wild-type CTD. Activity was reduced modestly by the P6A change (to 27% of wild-type) and drastically by the P3A (3%) and Y1A (1%) CTD mutations (Fig. 4B). These experiments demonstrate that Tyr1 and Pro3 are the critical determinants of CPL phosphatase activity at position Ser5.
Ssu72 responded quite differently to changes in CTD primary structure (Fig. 4C). The P3A change that was deleterious to the CPLs had virtually no ill effect on Ssu72, whereas the T4A mutation, which was well tolerated by the CPLs, was inimical to Ssu72, reducing its specific activity to 3% of the wild-type CTD level (Fig. 4C). Moreover, the P6A lesion abolished Ser5 phosphatase activity of Ssu72, though it had no effect on CPL1 and only a modest effect on CPL2. Ssu72 activity was reduced severely (to 4% of wild-type) by the Y1A change and was unaffected by S2A. We conclude that: (i) Tyr1, Thr4, and Pro6 are the critical determinants of Ssu72 phosphatase, and (ii) different CTD phosphatases with the same exquisite specificity for Ser5-PO4 achieve their specificities by recognizing different structural cues in the CTD.
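The alanine-scan results can be summarized compactly. The sketch below tabulates only the percentages stated in the two preceding paragraphs and flags substitutions that fall below a cutoff. The 20% cutoff, the zero stand-in for the "abolished" Ssu72 P6A activity, and all names are assumptions made for illustration; substitutions described only qualitatively are left out rather than assigned invented numbers.

```python
# Percent of wild-type specific activity at limiting substrate, from the text.
ACTIVITY_PCT = {
    "CPL1": {"Y1A": 4.0, "P3A": 16.0},
    "CPL2": {"Y1A": 1.0, "P3A": 3.0, "P6A": 27.0},
    # P6A is reported as "abolished"; 0.0 is an assumed stand-in value.
    "Ssu72": {"Y1A": 4.0, "T4A": 3.0, "P6A": 0.0},
}


def critical_substitutions(enzyme: str, cutoff_pct: float = 20.0) -> list[str]:
    """Substitutions leaving less than cutoff_pct of wild-type activity.

    The cutoff is an arbitrary illustrative threshold, not a value from the paper.
    """
    hits = [m for m, pct in ACTIVITY_PCT[enzyme].items() if pct < cutoff_pct]
    # Order by heptad position, e.g. "Y1A" -> 1, "P6A" -> 6.
    return sorted(hits, key=lambda m: int(m[1:-1]))


for enzyme in ACTIVITY_PCT:
    print(enzyme, critical_substitutions(enzyme))
```

With this cutoff the sketch recovers the paper's own grouping: Y1A and P3A for both CPLs, and Y1A, T4A, and P6A for Ssu72, with the CPL2 P6A effect (27%) falling on the modest side.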
By titrating the wild-type (YSPTSPS)4 substrate against fixed amounts of CPL1, CPL2, and Ssu72, we determined apparent Km values of 60, 140, and 280 µM, respectively (not shown). These values are higher than the substrate concentrations (~30 µM) used in the specific activity determinations in Fig. 4. Thus, we retested the activity of CPL1 and CPL2 with the wild-type, Y1A, P3A, and P6A tetraheptad Ser5-PO4 peptides at substrate concentrations (240-260 µM) in excess of the Km for wild-type peptide (Fig. 5). The hierarchy of mutational effects at higher substrate was similar to what was observed at limiting substrate. The extents of phosphate release by CPL1 with the P6A, P3A, and Y1A substrates were 96, 27, and 13%, respectively, of the activity with the wild-type (YSPTSPS)4 substrate (Fig. 5, left panel). Phosphate release by CPL2 with the P6A, P3A, and Y1A substrates was 64, 7, and 4% of activity with the wild-type peptide (middle panel). Thus, raising the CTD concentration above Km only slightly mitigated the effects of the Y1A or P3A changes on CPL1 activity, or the effects of Y1A, P3A, and P6A on CPL2 function, vis-à-vis their activity with the wild-type tetraheptad substrate. A parallel experiment with Ssu72 at the higher substrate concentrations showed that phosphate release with the Y1A, T4A, and P6A substrates was 8, 12, and 3% of its activity with the wild-type peptide (Fig. 5, right panel).
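Why the substrate concentration matters here can be seen from the Michaelis-Menten rate law, v/Vmax = [S]/(Km + [S]). The sketch below assumes that simple Michaelis-Menten kinetics applies to the apparent Km values quoted above and uses 30 and 250 µM as representative low and high substrate concentrations; the function name is illustrative.

```python
# Apparent Km values for the wild-type tetraheptad substrate, in µM (from the text).
KM_UM = {"CPL1": 60.0, "CPL2": 140.0, "Ssu72": 280.0}


def fraction_of_vmax(substrate_um: float, km_um: float) -> float:
    """Michaelis-Menten fractional velocity v/Vmax = S / (Km + S)."""
    if substrate_um < 0 or km_um <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return substrate_um / (km_um + substrate_um)


for enzyme, km in KM_UM.items():
    low = fraction_of_vmax(30.0, km)    # approximate concentration used in Fig. 4
    high = fraction_of_vmax(250.0, km)  # mid-point of the 240-260 µM range
    print(f"{enzyme}: {low:.2f} of Vmax at 30 µM, {high:.2f} of Vmax at 250 µM")
```

At 30 µM all three enzymes operate well below half-saturation, so a substitution that mainly raises Km would depress activity strongly; retesting near or above Km helps distinguish such binding effects from defects that persist at high substrate.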
Effects of a Lysine Substitution at Position 7 of the CTD Heptad-The identity of the amino acid at position 7 of the CTD is either highly conserved (e.g. all serines in the Schizosaccharomyces pombe CTD (GenBank™ accession code CAB57941) and Encephalitozoon cuniculi CTD (GenBank™ CAD26175)) or subject to variation from one heptad to another within the same CTD. In the case of human RNA polymerase II (GenBank™ P24928), one-half (26/52) of the heptads have a consensus Ser7, and the others are substituted, most commonly with Lys7 (7 heptads) or Asn7 (5 heptads). The CTD of the malaria parasite Plasmodium falciparum (GenBank™ Z98551) is dominated by Lys7-containing heptads (11/15 repeats).
To gauge the role, if any, of the position 7 side chain as a CTD phosphatase specificity determinant, we tested a tetraheptad CTD Ser5-PO4 peptide (YSPTSPK)4 in which every Ser7 was replaced by lysine. The titration profiles of the reaction of CPL1 and CPL2 with the S7K substrate were similar to that of the wild-type CTD (Fig. 6, A and B). Ssu72 was ~30% as active with the S7K CTD substrate as it was with the wild-type CTD (Fig. 6C). Thus, neither the CPLs nor Ssu72 depend on Ser7 for their Ser5-specific phosphatase activity.
Different Phasing of Specificity Determinants for CPLs and Ssu72-Further insight into the location of the specificity determinants emerged from an analysis of the reaction of CPLs and Ssu72 with diheptad monophosphorylated CTDs, which contained Ser5-PO4 in either the amino-terminal heptad (YSPTSPSYSPTSPS) or the carboxyl-terminal heptad (YSPTSPSYSPTSPS). Ssu72 displayed a 10-fold preference for hydrolysis of Ser5-PO4 within the amino-terminal heptad (Fig. 7C). This result implies that the recognition of Tyr1 by Ssu72 occurs on the carboxyl side of the Ser5-PO4 moiety that is being hydrolyzed, i.e. Ssu72 has higher activity on the proximal repeat because Ssu72 recognizes the Thr4 and Pro6 flanking the Ser5-PO4 and the Tyr1 in the distal repeat (see Fig. 9). Ssu72 is only weakly active on the distal repeat because there is no Tyr1 available on the carboxyl side of the phosphoserine. From a substrate titration experiment at a fixed level of Ssu72, we determined an apparent Km of 850 µM for the YSPTSPSYSPTSPS peptide (data not shown); thus, the affinity of the enzyme for a tetraheptad CTD with four phosphoserines is ~3-fold higher than for a diheptad with a single phosphoserine.
CPL1 and CPL2 display the opposite preference; they are 2- to 3-fold more active in hydrolyzing Ser5-PO4 when it is located in the carboxyl-terminal heptad (Fig. 7, A and B). These results suggest that: (i) the CPLs recognize the Tyr1 residue upstream of the phosphorylated Ser5 that is being hydrolyzed, and (ii) 2 amino acids downstream of Ser5-PO4 suffice. From substrate titration experiments, we determined apparent Km values of CPL1 and CPL2 for the preferred YSPTSPSYSPTSPS peptide of 100 and 570 µM, respectively (data not shown). The disfavored CPL substrate does contain a Tyr1 upstream of Ser5-PO4 in the first repeat; perhaps the free amine of the terminal Tyr1 (which adds a positive charge not present normally) might account for the lower activity of this substrate. Alternatively, 1 or more amino acids on the upstream side of Tyr1 might contribute to substrate recognition, either directly or by limiting the flexibility of the Tyr1 residue. To address these issues, we tested a series of incrementally truncated 12-mer (PTSPSYSPTSPS), 10-mer (SPSYSPTSPS), and 8-mer (SYSPTSPS) peptides containing a complete Ser5-PO4 heptad at their carboxyl termini and either 5, 3, or 1 amino acids from an upstream heptad (Fig. 8). CPL1 and CPL2 readily hydrolyzed the 12- and 10-mer substrates, whereas their activities with the 8-mer were reduced by factors of 2 and 3, respectively (Fig. 8). We surmise that CPL1/2 activity is optimal with as few as 3 amino acids on the amino-terminal side of Tyr1. Apparent Km values of CPL1 and CPL2 for the 12-mer PTSPSYSPTSPS were 180 and 950 µM, respectively (not shown).
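The truncation series follows a simple pattern: two residues are removed from the amino terminus of the diheptad at each step while the carboxyl-terminal heptad stays intact. A minimal sketch that regenerates the three peptide sequences (without the phosphate, which plain strings do not encode) is given below; the function name and parameters are illustrative.

```python
HEPTAD = "YSPTSPS"


def n_terminal_truncations(peptide: str, step: int = 2, count: int = 3) -> list[str]:
    """Return `count` peptides, each lacking `step` more amino-terminal residues."""
    if step * count >= len(peptide):
        raise ValueError("truncation would remove the entire peptide")
    return [peptide[step * i:] for i in range(1, count + 1)]


diheptad = HEPTAD * 2
series = n_terminal_truncations(diheptad)
for pep in series:
    upstream = len(pep) - len(HEPTAD)  # residues preceding the terminal heptad
    print(f"{len(pep)}-mer {pep}: {upstream} upstream residue(s)")
```

The printed series matches the 12-, 10-, and 8-mer sequences listed above, carrying 5, 3, and 1 residues from the upstream heptad.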
Properties of the CTD Code-Here we have begun to illuminate how the CTD code is read by two classes of protein phosphatases that specifically dephosphorylate Ser5. The instructive findings are that the plant CPLs and yeast Ssu72 recognize entirely different constellations of specificity determinants in the CTD. Whereas the CPLs rely on Tyr1 and Pro3 located on the upstream side of the Ser5-PO4 target site, Ssu72 recognizes Thr4 and Pro6 flanking the target Ser5-PO4 plus the downstream Tyr1 residue of the adjacent heptad (Fig. 9). The two classes of phosphatases not only see different amino acids, they see them in entirely different registers with respect to Ser5-PO4.
The simple interpretation of our results is that: (i) the functionally relevant CTD structural elements comprise the respective binding sites for the CPLs and Ssu72, and (ii) those CTD elements are likely to be disposed on the surface of the CTD Ser5-PO4 substrate to which the phosphatase docks. The most plausible scenario is that the proper CTD conformation for catalysis is templated by interaction with the phosphatase via an induced fit mechanism, as shown for other CTD-binding proteins. Although we do not yet have an atomic structure of the CTD bound to either of the phosphatases studied here, we can make an educated guess as to what conformations the phosphatases might select for, based on principles emerging from the structure of the mRNA capping enzyme bound to the Ser5-PO4 CTD (4). Interactions with the phosphorylated CTD occur at two distinct CTD-docking sites (CDS1 and CDS2) on the guanylyltransferase surface. CDS1 engages a CTD segment TSPSYSP, whereas CDS2 interacts with a downstream segment SYSPTSP located two heptads away. Thus, the capping enzyme samples two distinct CTD structures that are phased in different registers with respect to Ser5-PO4. The CTD bound at sites CDS1 and CDS2 adopts a β-like extended conformation, whereby every other residue is oriented toward or away from the protein surface (Fig. 9).
It is remarkable that the CTD segments and the CTD conformations at CDS1 and CDS2 of the capping enzyme satisfy reasonably well the specificity parameters elucidated here for Ssu72 and the CPLs, respectively. To wit, the Tyr1, Pro3, and Ser-PO4 moieties that comprise the recognition code for CPL1/2 project onto a common surface within CDS2, whereas Ser2 and Thr4, which are not needed for CPL1/2 activity, project away from the docking site (Fig. 9). Pro6 also projects onto the protein side of the interface in CDS2; this residue is not critical for CPL1 phosphatase activity but does play a modest role in CPL2 function. We envision that the CTD adopts a CDS2-like conformation in its interaction with the CPL1 and CPL2 phosphatases.
The Ser5-PO4, Pro6, and Tyr1 moieties that Ssu72 recognizes are located on the protein side of the interface in CDS1, whereas Ser2 and Pro3, to which Ssu72 is indifferent, project away from the protein surface (Fig. 9). We therefore speculate that the CTD adopts a CDS1-like conformation in its interaction with Ssu72. Although Thr4 does not contribute to the CTD-protein interface at CDS1 of the capping enzyme, the Thr Oγ is in position to donate a hydrogen bond to the vicinal phosphate group at Ser5 (4).
The finding of at least two CTD coding elements that signal specific hydrolysis of Ser5-PO4 has implications for understanding CTD information content. First, the functional "footprints" of CPL1/2 and Ssu72 on the CTD primary structure collectively span more than one heptad repeat, a state of affairs that is consistent with recent genetic evidence that the essential functional unit of the CTD is contained within a contiguous diheptad pair (26). Second, there is no simple rule for what is recognized. The CTD phosphorylation sites are located within Ser-Pro dipeptides. With respect to CTD Ser5-specific dephosphorylation, Ssu72 is identified here as a stringently Pro6-directed phosphatase, whereas CPL1 is Pro6-independent and CPL2 is only modestly affected by elimination of Pro6. Third, even though the CTD is structurally plastic, there is likely to exist a set of CTD conformational states that are utilized reiteratively by different CTD-binding proteins, even where the proteins are themselves not structurally related (e.g. the CTD phosphatases and the mRNA capping enzyme). Fourth, the information content of the CTD can be amplified by blending multiple distinct CTD coding elements (such as those seen here for the CPLs and Ssu72) to form a bipartite, or even higher order, recognition site for CTD-associated factors (such as the capping enzyme).
Fig. 9. CPLs and Ssu72 read distinct overlapping CTD codes to hydrolyze Ser5-PO4. The CTD structural determinants of CPL and Ssu72 phosphatase activity are indicated by arrows above and below the CTD primary structure. The phosphorylated Ser5 position is colored red. It is presumed that the specificity determinants are likely to reside on a common face of the CTD with which the phosphatase interacts. The CTD requirements for CPL activity are consistent with the β-like conformation of the corresponding CTD segment bound to CTD-docking site 2 (CDS2) of yeast mRNA capping enzyme, which is illustrated at top right. The requirements for Ssu72 activity suggest a β-like conformation of the same CTD segment bound to CDS1 of the capping enzyme, which is shown at bottom left.