Crystal structure of foot-and-mouth disease virus 3C protease. New insights into catalytic mechanism and cleavage specificity.

Foot-and-mouth disease virus (FMDV) causes a widespread and economically devastating disease of domestic livestock. Although FMDV vaccines are available, political and technical problems associated with their use are driving a renewed search for alternative methods of disease control. The viral RNA genome is translated as a single polypeptide precursor that must be cleaved into functional proteins by virally encoded proteases. 10 of the 13 cleavages are performed by the highly conserved 3C protease (3C(pro)), making the enzyme an attractive target for antiviral drugs. We have developed a soluble, recombinant form of FMDV 3C(pro), determined the crystal structure to 1.9-angstroms resolution, and analyzed the cleavage specificity of the enzyme. The structure indicates that FMDV 3C(pro) adopts a chymotrypsin-like fold and possesses a Cys-His-Asp catalytic triad in a similar conformation to the Ser-His-Asp triad conserved in almost all serine proteases. This observation suggests that the dyad-based mechanisms proposed for this class of cysteine proteases need to be reassessed. Peptide cleavage assays revealed that the recognition sequence spans at least four residues either side of the scissile bond (P4-P4') and that FMDV 3C(pro) discriminates only weakly in favor of P1-Gln over P1-Glu, in contrast to other 3C(pro) enzymes that strongly favor P1-Gln. The relaxed specificity may be due to the unexpected absence in FMDV 3C(pro) of an extended beta-ribbon that folds over the substrate binding cleft in other picornavirus 3C(pro) structures. Collectively, these results establish a valuable framework for the development of FMDV 3C(pro) inhibitors.

Foot-and-mouth disease has recently caused devastating epidemics around the globe, affecting Uruguay, Taiwan, and the United Kingdom (1,2). The UK epidemic of 2001 inflicted an economic toll of £20 billion and was only controlled by destroying over 5 million animals. Although FMDV vaccines are available, their use in disease control is compromised by the difficulty in distinguishing vaccinated from infected animals, which currently acts as an impediment to the resumption of international livestock trade (1). The use of vaccines is also hampered by the existence of seven distinct serotypes and multiple subtypes of the virus; vaccine stocks must therefore be constantly updated to protect against the circulating strains most likely to give rise to new outbreaks (3). Moreover, at least 3-4 days are required following vaccination for a protective response to develop, during which time outbreaks may gain a wider foothold (1,3). In view of these problems, which are exacerbated by the emergent threat that the virus may be used as a bioterrorism agent (1,4), additional measures for disease control are required. One promising strategy is to develop drugs to inhibit viral enzymes, such as proteases, that are necessary for replication and propagation of the disease, an approach that has been adopted with some success for human immunodeficiency virus (5) and is now being applied to a number of other viral targets (6,7). In the case of FMDV, such antiviral drugs could be deployed as soon as an outbreak is detected and thereby provide a potent adjunct to vaccines for faster control of incipient epidemics.
FMDV, a picornavirus related to poliovirus (PV), human rhinovirus (HRV), and hepatitis A virus (HAV), possesses a single-stranded, positive sense RNA genome containing one open reading frame that is translated as a polypeptide precursor. The cleavage of this polypeptide into the functional proteins required for viral replication is performed by virally encoded proteases. The most important of these is the conserved 3C protease (3C(pro)), which in FMDV performs 10 of the 13 cleavages (4); this enzyme is therefore an appealing target for antiviral drugs.
Structural studies on HAV, HRV, and PV 3C(pro) enzymes have established that these belong to an unusual family of cysteine proteases that have an overall fold similar to chymotrypsin (8-10). They invariably possess a conserved Cys-His-Asp/Glu catalytic triad at the active site, which is similar to the Ser-His-Asp triad found in the vast majority of serine proteases. In serine proteases, the His residue serves to prime the Ser side chain for nucleophilic attack on the scissile bond by abstracting a proton; the role of the Asp is to provide electrostatic stabilization of the resulting positive charge on the His (11). To date, however, the structures of picornaviral 3C enzymes indicate that their active sites deviate in significant details from the canonical Ser-His-Asp configuration, leading to suggestions that they also differ mechanistically. In HAV 3C(pro), the catalytic Asp is found to be directed away from the active site and is proposed not to play a significant role in catalysis; its function is possibly replaced by a tyrosine residue that is suggested to be negatively charged (8,12). In HRV and PV 3C proteases, which are very similar to one another, the position normally occupied by the catalytic Asp is taken by Glu (9,10), a substitution that is extremely rare in serine proteases (11). These unexpected features have generated new hypotheses regarding the catalytic mechanism of picornaviral 3C proteases that have yet to be fully resolved (8-10,12,13). At the same time, structural studies have also provided a detailed structural basis for the search for effective inhibitors (14-16). In order to shed further light on the possible catalytic mechanism of chymotrypsin-like cysteine proteases and to establish a molecular framework for anti-FMDV drug design, we have developed a soluble, recombinant form of FMDV 3C(pro), determined the crystal structure, and analyzed its proteolytic specificity.
* This work is supported by the Fleming Fund (Imperial College) and the BBSRC. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
We dedicate this publication to David Blow, our long time colleague at Imperial College London, who solved the first protease structure in 1967.

MATERIALS AND METHODS
Protein Purification and Crystallization-FMDV 3C(pro) strain A10 61 was expressed in BL21(DE3) pLysS Escherichia coli from a pET-28 vector modified to add a thrombin-cleavable N-terminal polyhistidine tag. Selenomethionine-labeled protein was produced in B834(DE3) E. coli grown in minimal medium supplemented with essential nutrients and selenomethionine (Molecular Dimensions). The expressed protein was purified on TALON resin (BD Biosciences), digested overnight with thrombin, reapplied to the TALON resin to remove the cleaved tag, and further purified by gel filtration on a HiLoad 16/60 Superdex 75 column (Amersham Biosciences). Peak fractions were concentrated to 10 mg/ml in 50 mM HEPES, pH 7.0, containing 200 mM NaCl, 1 mM EDTA, 1 mM β-mercaptoethanol. The purified protein contains residues 1-207 of the 213 amino acids of wild-type 3C(pro); it has an additional Gly at the N terminus remaining from the thrombin cleavage and one additional His at the C terminus. In addition it contains mutations C95K and C142S, which were designed via homology modeling to enhance solubility. Incorporation of these mutations into a 3B-3C fusion construct (17) had no effect on 3C(pro)-mediated removal of the 3B peptides upon expression in E. coli (data not shown). The mutation C163A was also made to eliminate the activity of the active site nucleophile. Crystals were obtained by sitting drop vapor diffusion over a 100-µl reservoir upon mixing 2 µl of protein with 0.5 µl of 10% (v/v) Anapoe-X-405 and 2.5 µl of a reservoir solution prepared by mixing 75 µl of 1.5 M sodium citrate, pH 6.5, with 25 µl of 0.5 M ammonium phosphate.
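The composition of the reservoir and of the freshly mixed drop follows from the volumes quoted above by simple dilution arithmetic. The sketch below is illustrative only: it assumes that the protein was used at the stated 10 mg/ml, ignores the buffer components carried over with the protein, and uses a hypothetical helper name (`diluted`) that does not come from the original work.

```python
# Dilution arithmetic for the reservoir and the sitting drop.
# Assumption: protein stock is the 10 mg/ml preparation described in the text.

def diluted(stock_conc, stock_vol_ul, total_vol_ul):
    """Concentration after diluting stock_vol_ul of a stock into total_vol_ul."""
    return stock_conc * stock_vol_ul / total_vol_ul

# Reservoir: 75 ul of 1.5 M sodium citrate + 25 ul of 0.5 M ammonium phosphate.
reservoir_total_ul = 75 + 25
citrate_reservoir_M = diluted(1.5, 75, reservoir_total_ul)
phosphate_reservoir_M = diluted(0.5, 25, reservoir_total_ul)

# Drop: 2 ul protein + 0.5 ul of 10% (v/v) Anapoe-X-405 + 2.5 ul reservoir.
drop_total_ul = 2 + 0.5 + 2.5
protein_drop_mg_ml = diluted(10.0, 2, drop_total_ul)
detergent_drop_pct = diluted(10.0, 0.5, drop_total_ul)
citrate_drop_M = diluted(citrate_reservoir_M, 2.5, drop_total_ul)
phosphate_drop_M = diluted(phosphate_reservoir_M, 2.5, drop_total_ul)

if __name__ == "__main__":
    print(f"reservoir: {citrate_reservoir_M} M citrate, "
          f"{phosphate_reservoir_M} M phosphate")
    print(f"drop: {protein_drop_mg_ml} mg/ml protein, "
          f"{detergent_drop_pct}% detergent, {citrate_drop_M} M citrate, "
          f"{phosphate_drop_M} M phosphate")
```

These are starting concentrations only; during vapor diffusion the drop equilibrates toward the reservoir, so the values in the drop rise over time.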
Data Collection, Phasing, and Refinement-Crystals (typically 250 × 250 × 80 µm) were transferred to a 60/40 (v/v) mix of reservoir solution and saturated sucrose and frozen in a stream of N2 gas at 100 K immediately prior to data collection. Multiwavelength x-ray diffraction data to 1.9 Å collected from crystals of SeMet-labeled protein were processed and scaled using the CCP4 program suite (18). The crystals belong to the R3 space group and have two molecules in the asymmetric unit related by a noncrystallographic 2-fold axis (Table I). Since the crystals exhibited hemihedral twinning (19) with a twinning fraction of around 0.2, the diffraction intensities were detwinned using DETWIN. Initial selenium sites were located from data collected at peak and remote wavelengths using SHELXD and SHELXE (20) and refined with SHARP (21). Solvent flattening, histogram matching, and 2-fold noncrystallographic symmetry averaging were applied in DM (22) to improve the phases. An initial model encompassing about 75% of the polypeptide was built automatically using ARP/wARP (23), completed manually in O, and refined using CNS (Table I). The two independent molecules of 3C(pro) in the asymmetric unit are very similar (root mean square deviation in Cα positions = 0.4 Å); in each chain, electron density is visible starting from residue 7 and ending at residue 205, although loops A142-146, B142-146, and B76-81 are also missing due to disorder.
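For readers unfamiliar with hemihedral twinning, the algebra behind detwinning can be sketched as follows. Each observed intensity is a weighted sum of two twin-related true intensities, I_obs(a) = (1 - α)I(a) + αI(b) and I_obs(b) = αI(a) + (1 - α)I(b), where α is the twinning fraction; inverting this pair of equations recovers I(a) and I(b). The function below is a minimal illustration of that textbook relation, not a description of how DETWIN is implemented; the function name and the example intensities are hypothetical.

```python
# Algebraic detwinning of one pair of twin-related reflections.
# Assumption: a single twin law with known twin fraction alpha < 0.5.

def detwin_pair(i_obs_a, i_obs_b, alpha):
    """Recover the true intensities of two twin-related reflections.

    i_obs_a, i_obs_b: observed (twinned) intensities
    alpha: twinning fraction, 0 <= alpha < 0.5
    """
    if not 0.0 <= alpha < 0.5:
        raise ValueError("alpha must satisfy 0 <= alpha < 0.5")
    denom = 1.0 - 2.0 * alpha
    i_a = ((1.0 - alpha) * i_obs_a - alpha * i_obs_b) / denom
    i_b = ((1.0 - alpha) * i_obs_b - alpha * i_obs_a) / denom
    return i_a, i_b

if __name__ == "__main__":
    # Hypothetical true intensities, twinned with alpha = 0.2 and then recovered.
    alpha = 0.2
    true_a, true_b = 100.0, 40.0
    obs_a = (1 - alpha) * true_a + alpha * true_b
    obs_b = alpha * true_a + (1 - alpha) * true_b
    print(detwin_pair(obs_a, obs_b, alpha))
```

The division by (1 - 2α) amplifies measurement errors as α approaches 0.5, which is why this direct inversion is practical for a twin fraction near 0.2 but not for a nearly perfect twin.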
Peptide Cleavage Assays-Soluble single (C95K) and double (C95K/C142S) cysteine mutant proteases for peptide cleavage assays were produced from a construct encoding a Δ3B1-3B2-3B3-3C fusion protein, which contains 3C(pro) fused to the three upstream 3B peptides found in the FMDV polyprotein (though 3B1 is truncated at its N terminus) (17). Upon expression in E. coli, the 3C(pro) cleaved itself from the 3B peptides and was purified in a single step to >95% homogeneity using TALON resin via its noncleavable C-terminal polyhistidine tag. Fresh preparations of the single mutant 3C(pro) exhibited significantly higher specific activity, but the loss of enzyme activity under assay conditions and a propensity for aggregation made it difficult to obtain reproducible results. The additional C142S substitution in the B2-C2 loop of the double mutant is absolutely required to obtain a soluble and stable enzyme for cleavage assays. Although this mutation may lie close to the substrate binding site (Fig. 1A), there were no significant differences in peptide cleavage specificity compared with the single mutant (data not shown).
Samples of peptide (50 µM) were incubated with the 3C(pro) enzyme at 37°C in 0.1 M phosphate buffer, pH 7.4, containing 0.01 mM dithiothreitol, 1 mM EDTA, and 5% glycerol. The extent of reaction was determined following HPLC analysis of the mixture by integration of the peaks representing the starting and product peptides. The identity of the product peptides was verified by MS analysis and/or comparison with HPLC traces of peptide standards. In all cases where cleavage occurred, this was found to be at the expected P1/P1′ junction. For the alanine-scanning experiments, reactions were quenched after a 4-h incubation by the addition of trifluoroacetic acid to 2.5% (w/v). Relative hydrolysis rates were calculated from the data assuming that the hydrolysis followed apparent first order kinetics.
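Under the stated assumption of apparent first order kinetics, the fraction of peptide cleaved after time t is f = 1 - exp(-kt), so a single end-point measurement gives k = -ln(1 - f)/t, and a relative rate is the ratio of two such constants. The sketch below illustrates this conversion; the function names and the example fractions are hypothetical and are not taken from the measured data.

```python
import math

# End-point conversion of "fraction cleaved" into an apparent first-order
# rate constant, and the ratio of two such constants.

def apparent_rate_constant(fraction_cleaved, hours):
    """k = -ln(1 - f) / t for a single time point (units: per hour)."""
    if not 0.0 <= fraction_cleaved < 1.0:
        raise ValueError("fraction_cleaved must be in [0, 1)")
    if hours <= 0:
        raise ValueError("hours must be positive")
    return -math.log(1.0 - fraction_cleaved) / hours

def relative_rate(fraction_variant, fraction_reference, hours=4.0):
    """Rate of a variant peptide relative to the reference peptide."""
    k_reference = apparent_rate_constant(fraction_reference, hours)
    if k_reference == 0.0:
        raise ValueError("reference peptide shows no cleavage")
    return apparent_rate_constant(fraction_variant, hours) / k_reference

if __name__ == "__main__":
    # Hypothetical example: reference 75% cleaved, variant 50% cleaved at 4 h.
    print(relative_rate(0.50, 0.75, hours=4.0))
```

Note that a peptide that is completely cleaved at the chosen time point (f = 1) cannot be converted in this way; it only sets a lower bound on k, which is one reason to pick a quench time at which the reference peptide is incompletely hydrolyzed.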

RESULTS
Crystallization and Structure Determination-Recombinant FMDV 3C(pro) (strain A10 61) may be highly expressed in E. coli (17,25), but we found that the native protein precipitates relatively rapidly following purification. Storage at 5-10 mg/ml over a period of days led to the formation of a proteinaceous skin over the top of the solution, which makes this form incompatible with crystallization studies. Extensive mutagenesis trials of amino acids predicted to be surface-exposed (on the basis of modeling using HAV 3C(pro), the nearest homologue of known structure (8)) were performed in an effort to enhance protein solubility. Dynamic light scattering analysis of the aggregation properties of the resulting mutant enzymes, coupled with activity assays, eventually yielded a soluble, active variant. In this mutated FMDV 3C(pro), two solvent-exposed cysteine residues have been replaced (C95K and C142S), and the C terminus has been trimmed by six residues (for details, see "Materials and Methods"). For crystallization, the active site nucleophile was also replaced to eliminate proteolytic activity (C163A). The refined structure, solved to 1.9-Å resolution by multiwavelength anomalous dispersion, has an Rfree = 24.6% and good stereochemistry (Table I).
Structure of FMDV 3C(pro) and Comparison with Other Proteases-FMDV 3C(pro) adopts a chymotrypsin-like fold, which typically consists of two β-barrel domains, each composed of a pair of four-stranded anti-parallel β-sheets that pack together to form a shallow peptide binding cleft (Fig. 1A). According to the naming convention adopted for these proteases, strands B and E within each domain usually contribute to both sheets. Although the other picornaviral 3C proteases (8-10) and the 3C-like NIa protease from tobacco etch virus (TEV(pro); 16% identical in amino acid sequence to FMDV 3C(pro)) (26) conform to this template and superpose on the FMDV structure with root mean square differences in Cα positions of 1.8-2.3 Å, FMDV 3C(pro) exhibits some significant and unexpected differences. In particular, the B2 strand does not extend to participate in the E2-F2-C2-B2 β-sheet found in the C-terminal β-barrel of other 3C(pro) structures (Fig. 1A); instead, this segment of polypeptide (residues 137-139) is separated from the C2 strand and pairs up with the residues preceding the A2 strand (residues 107-109) to form a small β-sheet (A2′-B2′) that is not observed in the HAV, HRV, or PV 3C proteases or in other common chymotrypsin-like proteases (Fig. 1, A-C). A major corollary of this rearrangement of the β-sheet is that the extended β-ribbon formed by the B2-C2 loop in other picornaviral 3C proteases (and in some serine proteases, such as α-lytic protease (27)), which has an important role in recognizing the P2-P4 region of peptide substrates (16), is absent in FMDV 3C(pro). Although the central region of this loop (residues 142-146) is missing in the electron density map, presumably due to disorder, the paths adopted by the visible termini and the short length of the disordered portion clearly preclude formation of a β-ribbon feature in the free enzyme structure (Fig. 1D) and suggest that it projects away from the active site.
The 3C-like protease TEV(pro) also lacks the extended B2-C2 β-ribbon, but this feature is effectively substituted by a β-ribbon derived from a C-terminal extension that is not present in bona fide picornavirus 3C proteases (26). The E1-F1 loop in FMDV 3C(pro) assumes a highly extended conformation that is directed away from the active site. Thus, as a result of the conformations adopted by the B2′-C2 and E1-F1 loops, the active site and substrate binding cleft in FMDV 3C(pro) are more solvent-exposed than the corresponding features in most other chymotrypsin-like proteases. The A1-B1, C1-D1, and A2-B2 loops in FMDV 3C(pro) also differ significantly from the conformations observed in other picornaviral 3C proteases (Fig. 1E), and although they project away from the active site, they may contribute to substrate specificity.
The determination of the crystal structure of HAV 3C(pro) unexpectedly revealed that residue Asp84, which was predicted to form the catalytic triad along with His44 and Cys172, is actually redirected away from the active site by a salt bridge interaction with the side chain of Lys202 (8) (Fig. 2). Tyr143, which points toward the catalytic His in this enzyme, was proposed to compensate for this unusual arrangement by providing sufficient electrostatic stabilization to activate the charge relay system (8,14), and, based on homology modeling, it was recently proposed that FMDV 3C(pro) would operate a similar system (28). However, the sequence alignment based on the structure reported here shows that neither Tyr143 nor Lys202 from HAV 3C(pro) is conserved in the FMDV enzyme (Fig. 3). Instead, Asp84 in the crystal structure of FMDV 3C(pro) makes a hydrogen bond interaction with His46 of the catalytic triad, and the enzyme therefore has a Cys-His-Asp triad that is very similar in configuration to the equivalent triad in the 3C-like TEV(pro) (26) and to the Ser-His-Asp triad found in the vast majority of serine proteases (11) (Fig. 2). This similarity extends to encompass the side-chain hydroxyl group of Ser182 in FMDV 3C(pro) (Ser214 in chymotrypsin), which is positioned to stabilize the conformation of the side chain of the triad residue Asp84, an interaction that is conserved in many chymotrypsin-like serine proteases (11). Intriguingly, this Ser residue is replaced by Val192 in HAV 3C(pro); loss of this interaction may therefore contribute to the reorientation of the catalytic Asp84 in that enzyme.
The face of the protease opposite to the active site contains a basic patch that is implicated in RNA binding during picornaviral RNA replication (29-31). As in other picornavirus 3C proteases, this is centered on the conserved motif (K/R)(F/V)RDI (KVRDI, residues 95-99, in the present structure, due to the Cys95 → Lys mutation introduced to aid solubility; exceptionally for aphthoviruses, the motif has the sequence CVRDI in wild-type FMDV A10 61, which could conceivably be due to a single base substitution during passage of virus stocks in tissue culture) (32,33). The basic residues of the motif, Lys95 and Arg97, are exposed on the surface of FMDV 3C(pro) and form the core of a basic patch that also contains Lys10, Arg60, Arg92, His91, Lys101, His102, Arg126, Arg196, and Lys203.
Peptide Cleavage Specificity-Comparison of the cleavage rates of peptides corresponding to the 10 polyprotein sites cleaved by 3C(pro) in FMDV A10 61 (see "Materials and Methods") reveals a surprising degree of variation (Table II). The most susceptible peptides correspond to the VP1/2A and 2B/2C junctions, which were completely cleaved within 6 h, whereas peptides corresponding to the VP2/VP3, 2C/3A, 3A/3B1, 3B1/3B2, and 3B2/3B3 junctions were uncleaved after a 24-h incubation under our experimental conditions (see "Materials and Methods"). The VP1/2A junction in HRV and PV is the primary cleavage site of a different viral protease, 2A(pro) (32). The 2B/2C cleavage site is also found to be one of the most efficiently processed protein junctions in peptide cleavage assays performed with PV and HAV 3C(pro) (34,35), although it is not efficiently cleaved as a peptide by HRV 3C(pro) (35). The primary 3C(pro) cleavage site in vivo is variously reported to occur at the 2A/2B, 2B/2C, 2C/3A, or 3A/3B junctions (32,33,36,37); clearly, factors such as folding of the viral polyprotein precursor may affect site accessibility in infected cells. The two most susceptible peptides in the FMDV polyprotein contain the sequence KQL at positions corresponding to P2-P1′. However, the 2C/3A peptide, which contains KQI at these positions, is resistant to cleavage, suggesting that amino acids outside the core sequence are also important for efficient cleavage.
To characterize the cleavage specificity of the enzyme in greater detail, alanine-scanning mutagenesis was performed on a 12-mer version of the VP1/2A peptide (IAPAKQ↓LLNFDL, where the arrow marks the scissile bond) that spans the P6-P6′ positions. These experiments identified P4-Pro, P2-Lys, P1-Gln, and P4′-Phe of this sequence as the most important residues for specific recognition by 3C(pro) (Fig. 4), since substitution of these residues by Ala abrogated peptide cleavage under our conditions. The substrate recognition site therefore spans at least eight residues (P4-P4′) and is longer than has previously been appreciated for 3C proteases. Substitution of P1′-Leu and P2′-Leu by Ala had smaller but still significant detrimental effects on cleavage. The P1 and P4 positions have previously been shown to be important in peptide substrates for HRV and HAV 3C proteases (38,39) and are the most highly conserved positions in the FMDV polyprotein cleavage sites (40). Whereas the deleterious effect of Ala substitutions for P2-Lys and P4′-Phe indicates that these residues contribute substantially to substrate recognition, it should be noted that they are not conserved in the polyprotein (Table II). The impact of alanine-scanning mutagenesis may well depend on the peptide sequence being probed, and further work is necessary to extract more detailed information on cleavage specificity. Nevertheless, the present results strongly suggest that the P4-P1 sequence (Pro-Ala-Lys-Gln) would serve as a promising template for the design of peptide-mimetic inhibitors.
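The position labels used above follow the usual convention in which residues are numbered outward from the scissile bond, P1, P2, ... toward the N terminus and P1′, P2′, ... toward the C terminus. As an aid to reading the alanine-scanning results, the sketch below assigns these labels to the VP1/2A 12-mer and enumerates the single-alanine variants; the function names are hypothetical, and the code illustrates the bookkeeping only, not the peptide synthesis or assay.

```python
# Label residues around a scissile bond and list single-Ala variants.
# Input: the residues N-terminal and C-terminal to the cleaved bond.

def label_sites(n_side, c_side):
    """Return {label: residue}, e.g. {"P1": "Q", "P1'": "L", ...}."""
    labels = {}
    for i, residue in enumerate(reversed(n_side), start=1):
        labels[f"P{i}"] = residue
    for i, residue in enumerate(c_side, start=1):
        labels[f"P{i}'"] = residue
    return labels

def alanine_scan(n_side, c_side):
    """Return {label: variant sequence} for every position not already Ala."""
    sequence = n_side + c_side
    variants = {}
    for index, residue in enumerate(sequence):
        if residue == "A":
            continue  # already alanine; no informative substitution
        if index < len(n_side):
            label = f"P{len(n_side) - index}"
        else:
            label = f"P{index - len(n_side) + 1}'"
        variants[label] = sequence[:index] + "A" + sequence[index + 1:]
    return variants

if __name__ == "__main__":
    sites = label_sites("IAPAKQ", "LLNFDL")
    for label, residue in sites.items():
        print(label, residue)
    for label, variant in alanine_scan("IAPAKQ", "LLNFDL").items():
        print(label, variant)
```

Run on IAPAKQ and LLNFDL, the labelling places Pro at P4, Lys at P2, Gln at P1, and Phe at P4′, and the scan skips P5 and P3 because both are already alanine in the parent sequence.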
Most picornavirus 3C proteases are generally specific for P1-Gln and discriminate strongly (48-fold in the case of HAV) against P1-Glu (38,39). In marked contrast to these enzymes, FMDV 3C(pro) is only weakly selective for Gln over Glu at this position (about 2-fold in rate) (Fig. 4). This is consistent with the occurrence of P1-Glu at many of the 3C(pro) cleavage sites within the FMDV polyprotein (40) and the cleavage of eukaryotic initiation factor 4AI at a Glu-Val junction (17).
DISCUSSION
We report here the first structural insights into FMDV 3C(pro). Our results confirm that the enzyme adopts the expected chymotrypsin-like fold but reveal several unexpected aspects of the structure that undoubtedly impact on its proteolytic activity. Strikingly, FMDV 3C(pro) lacks the extended β-ribbon that folds over the P-side of the peptide binding cleft and the active site in HAV, HRV, and PV 3C proteases (8-10), thereby rendering these sites more exposed to solvent (Fig. 1). More importantly, perhaps, FMDV 3C(pro) is the first member of this unusual family of proteases to be found with a truly chymotrypsin-like configuration of the catalytic triad (Fig. 2).
FIG. 1. ... (12)) (C), colored according to the same scheme as in A, are shown. Note that the B2-C2 β-ribbon present in these two structures, but absent in FMDV 3C(pro), is colored orange. D, electron density maps for residues 147-152 from FMDV 3C(pro) showing the polypeptide chain immediately following the disordered section of the B2′-C2 loop (residues 142-146). The blue chicken wire represents a 3Fo - 2Fc map phased with refined model phases depicted at a contour level of 1σ. An anomalous difference map calculated with experimental phases and contoured at +3σ, shown in magenta, indicates the position of the selenium atom in (Se)Met148. E, superposition of 3C(pro) structures for FMDV (thick coil, colored as in A), HRV (yellow), and HAV (purple).
Previous studies showed that in HAV 3C(pro), the side chain of the catalytic Asp residue is rotated away from the active site, apparently sequestered in an interaction with the basic side chain of Lys202 (8,12,14), which is conserved in the hepatoviruses but not found in any other picornaviral genus (32). HRV and PV 3C(pro) structures showed that the acidic member of the catalytic triad, conserved as Glu in these virus families, was indeed hydrogen-bonded to the His base, but this was achieved using the anti lone pair of the carboxylate group of the Glu rather than the more favorable syn lone pair (9,10). With very few exceptions, the catalytic acidic residue of serine proteases is conserved as Asp and bonds to the His in the syn configuration (11,41), an orientation that is associated with enhanced catalytic efficiency (42).
On the basis of the observation of an apparently nonoptimal configuration of the triad in HAV 3C(pro) (8,12), perhaps permissible because of the greater nucleophilicity of Cys over Ser (43), a dyad model has been proposed to account for the catalytic activity (8,10,12). In this model, Tyr143 is proposed to provide electrostatic stabilization, taking on at least part of the role played by Asp in the serine proteases or Glu in HRV and PV 3C(pro). However, the finding that FMDV 3C(pro) and the 3C-like TEV(pro) (26) both possess a Cys-His-Asp triad that is configured similarly to the Ser-His-Asp triad of serine proteases, in stark contrast to the active site configuration seen in HAV 3C(pro), prompts a reassessment of whether a dyad model is indeed the mechanism used by these types of protease. The observation that the Glu residue in the triads of HRV and PV 3C(pro) interacts with the catalytic His in the less favorable anti configuration, thereby precluding proton transfer from the His, has been cited in support of the view that 3C proteases do not rely on the third member of the triad (10,12). However, the prevalent model of serine protease mechanisms is that proton transfer does not occur between His and Asp and that the carboxylate group of the Asp serves to provide electrostatic stabilization of the positive charge that accrues on the His (11), a role that could still be played by the Glu found in HRV and PV 3C(pro). The functional significance of this electrostatic stabilization is also suggested by the fact that the acidic member of the catalytic triad is strictly conserved, either as Glu in entero- and rhinoviruses or as Asp in all other picornavirus genera (32); moreover, even conservative mutations of this residue (e.g. from Glu to Asp in PV (44) or from Asp to Glu in FMDV (45)) invariably reduce the 3C(pro) activity and are lethal to the virus.
Thus, we feel that there is no evidence for a significant mechanistic distinction between serine proteases and these viral chymotrypsin-like cysteine proteases. However, the deviations from the canonical structure found when a Glu is the third member of the triad, as observed in HRV and PV 3C(pro), seem likely to reduce the electrostatic stabilization made available to the His during catalysis. Consistent with this notion, Ser-His-Glu catalytic triads are more commonly found in lipases (43); these enzymes catalyze the cleavage of ester bonds that are typically at least 10³ times more reactive than amide bonds (11). Conversely, the close structural correspondence between the Cys-His-Asp triad of FMDV 3C(pro) and the Ser-His-Asp triad of most serine proteases suggests the FMDV enzyme may be one of the more efficient picornaviral 3C proteases. But what of the dyad that is observed in HAV 3C(pro), where the catalytic Asp is displaced? It may be that high efficiency polyprotein processing by 3C(pro) is not required by HAV and that a defective triad will suffice. It has been suggested that the catalytic Asp is functionally substituted by the negatively charged side chain of Tyr143 (12), but this would require a very large change in the pKa of the phenolic side chain that seems unlikely; moreover, superposition of the structure of HAV 3C(pro) with the similar structures of other chymotrypsin-like proteases (including HRV 3C(pro) (16) and TEV(pro) (26)) complexed with peptides or peptide-based inhibitors indicates that the interaction between Tyr143 and the catalytic His could not be maintained upon substrate binding (data not shown). It remains possible that the crystal structure of HAV 3C(pro) represents a nonproductive form of the enzyme. Structural analysis of 3C(pro)-peptide complexes should help to further our understanding of the enzyme mechanisms.

FIG. 2. Comparison of the active sites of 3C proteases shows that the catalytic triad found in FMDV 3C(pro) most closely resembles that found in the serine protease chymotrypsin. To aid the comparison, the Cys163 side chain has been restored in the FMDV 3C(pro) structure by modeling (since this is Ala in the crystal structure). Although the hydrogen-bonding geometry between Cys163 and His46 is nonideal, this may be due to small perturbations in the triad configuration resulting from the mutation to the smaller Ala side chain in the FMDV 3C(pro) structure. The other crystal structures shown are chymotrypsin (Protein Data Bank code 4cha (55)), HAV 3C(pro) (Protein Data Bank code 1hav (12)), and HRV 14 3C(pro) (coordinates provided by D. Matthews).

TABLE II
Comparative cleavage of peptides corresponding to polyprotein junctions in FMDV A10 61
Synthetic peptides spanning eight residues either side of the ten 3C(pro) cleavage positions were incubated with enzyme (10 µM) as described under "Materials and Methods" and then analyzed for extent of cleavage by HPLC. Marked differences in cleavage rates were found, with rapid, complete cleavage occurring for peptides 3 and 4 within 6 h; partial cleavage was found for peptides 2, 9, and 10 after 24-h incubation; little or no measurable cleavage was found for the remaining peptides even after 24 h. At reduced enzyme concentration (1 µM), peptide 3 was found to hydrolyze slightly faster than peptide 4.
Peptide   Polyprotein junction   Sequence (P-P′)   Cleavage

FIG. 3. Structure-based sequence alignment of picornavirus 3C(pro) enzymes. The sequence alignment was performed by the EBI SSM server (www.ebi.ac.uk/msd-srv/ssm/) using the known structures of 3C proteases from FMDV (strain A10 61), HAV (strain HM-175), HRV (type 2), and PV (type 1 Mahoney). Residues that are identical in two or more sequences are shaded black; homologous residues are shaded gray. The secondary structure for FMDV 3C(pro) is shown along the top. The asterisks indicate the positions of the C95K and C142S substitutions introduced to improve solubility (see "Materials and Methods"). The hash marks denote residues in the catalytic triad. Sequences flanking the S1 specificity pocket, which exhibit significant variation between different picornaviruses, are boxed.

FIG. 4. ... The chart compares rates of hydrolysis relative to that of the starting peptide for modified sequences that incorporate an alanine substitution. Where no bar appears, the observed rate was effectively zero. (Note that P5 and P3 are both alanine in the original sequence, so these positions are not represented.) The graph also shows the relative hydrolysis rate when the P1-Gln is replaced by Glu (denoted P1E).

The S1 pocket that accommodates the P1 amino acid side chain of the substrate is composed mainly of residues from the C2-D2 loop and the C-terminal end of strand E2 (Figs. 1A, 1E, and 5E). This pocket contains His181, which is coordinated to Tyr154 by a buried hydrogen bond, a feature that is observed in other 3C(pro) enzymes (Figs. 3 and 5) (although in HAV 3C(pro), the equivalent Tyr residue is missing, and the His is instead hydrogen-bonded to a water molecule that is itself coordinated by a buried Glu side chain (8,12)). It has been argued that this coordination allows the uncharged His to donate a hydrogen bond to the side-chain carbonyl oxygen of the P1-Gln of the peptide substrate (8,10,12). In HRV-2 3C(pro), structural studies with peptidomimetic inhibitors suggested that the selectivity for Gln over Glu at P1 may also require Thr142 (equivalent to Thr158 in FMDV 3C(pro)), which donates a hydrogen bond from its side-chain hydroxyl to the P1-Gln side-chain carbonyl, whereas the main-chain carbonyl of this Thr accepts one from the P1-Gln side-chain amide group (16,46) (Fig. 5B). An almost identical configuration, observed in the complex of TEV(pro) with a peptide substrate, was proposed to explain the P1-Gln selectivity of that enzyme (26) (although it should be noted that whereas P1-Gln is conserved at all TEV(pro) cleavage sites in the viral polyprotein (47) and the enzyme discriminates in favor of P1-Gln over most other amino acids, the particular effect of incorporating P1-Glu in substrates has not yet been tested (48)). However, the results presented here, coupled with more recent findings, indicate that the presence of this His-Tyr-Thr cluster in the S1 pocket does not fully account for P1-Gln selectivity. FMDV 3C(pro), which is not strongly selective for P1-Gln over P1-Glu, contains this cluster (Fig. 5A), whereas HAV 3C(pro), which is selective, lacks the Thr residue that is proposed to differentiate between Gln and Glu at P1.
More tellingly, perhaps, the His-Tyr-Thr arrangement found in the S1 specificity pocket is conserved in a number of Glu-specific serine proteases (49-51), which tend to discriminate against P1-Gln and thus display the reverse selectivity to that seen in picornaviral 3C proteases (Fig. 5C). These serine proteases select for P1-Glu by placing a counter charge to the negative carboxyl of the P1-Glu side chain in the S1 pocket, either from a basic side chain or from the amino terminus of the polypeptide (49-52), or by using specific hydrogen bond contacts (Fig. 5C) (53). Thus, in picornavirus 3C proteases, the selection of P1-Gln over P1-Glu may simply be due to the absence of an appropriate counter charge. A definitive explanation for the more relaxed selectivity of FMDV 3C pro at P1 is not yet possible, largely because the impact of remote interactions between the protease and its substrate is difficult to assess (11), but the more open active site in this enzyme, which results from the absence of a B 2 -C 2 β-ribbon, may reduce discrimination against P1-Glu side chains because they are more solvent-exposed when bound. Given the very significant sequence and structural variation in the vicinity of the S1 pocket (Figs. 1E and 3), it is perhaps surprising that picornavirus 3C proteases maintain a common preference for P1-Gln.
Modeling studies based on the structures of a range of 3C proteases and serine proteases complexed with peptides or peptide-based inhibitors reveal consistent backbone and side-chain configurations for the P1-P4 region (data not shown) and allow us to suggest plausible locations on the surface of FMDV 3C pro for peptide interactions involving P2-P4 residues (Fig. 5A). The P2-Lys of the VP1/2A peptide extends to one side of the catalytic His 46 and is likely to interact with the side-chain hydroxyl of Tyr 190, a residue conserved in FMDV but not in other picornavirus 3C proteases; conceivably, this Tyr may also interact with Gln and His residues that occur at P2 in other FMDV 3C pro cleavage sites (Table II). The P3 side chain is usually solvent-exposed and contributes little to specificity (11). The preference for a hydrophobic Pro side chain at P4 (40) is likely to be due to this side chain being pinned between the apolar moieties of Val 188 and Tyr 190. On the other side of the cleavage site, Ile 30, Leu 47, Ala 122, and the aliphatic portion of Glu 20 may provide hydrophobic contacts for the P1′, P2′, and P4′ positions in the optimal substrate, consistent with the general preference for residues with a significant apolar component in these positions (Table II).
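The subsite labels used in this discussion follow the standard Schechter and Berger convention: residues on the N-terminal side of the scissile bond are numbered P1, P2, ... moving away from the bond, and those on the C-terminal side P1′, P2′, and so on. The short Python sketch below is purely illustrative and not part of the analysis reported here. The function name `label_subsites` is hypothetical, the ASCII apostrophe stands in for the prime symbol, and only the non-prime residues of the example (P5-Ala, P4-Pro, P3-Ala, P2-Lys, P1-Gln) are taken from the text; the prime-side residues are written as the placeholder `X`.

```python
def label_subsites(peptide: str, n_nonprime: int) -> dict[str, str]:
    """Map Schechter-Berger subsite labels to residues.

    peptide     one-letter sequence, written N-terminus to C-terminus
    n_nonprime  number of residues N-terminal to the scissile bond
    """
    if not 0 < n_nonprime < len(peptide):
        raise ValueError("the scissile bond must lie inside the peptide")
    labels = {}
    for i, residue in enumerate(peptide):
        if i < n_nonprime:
            # Non-prime side: numbering counts down toward the bond (..., P2, P1).
            labels[f"P{n_nonprime - i}"] = residue
        else:
            # Prime side: numbering counts up away from the bond (P1', P2', ...).
            labels[f"P{i - n_nonprime + 1}'"] = residue
    return labels


# Non-prime residues as stated in the text; prime side is a placeholder (X).
example = label_subsites("APAKQXXXX", n_nonprime=5)
for label, residue in example.items():
    print(label, residue)
```

Under this convention, a recognition sequence spanning P4-P4′ covers four residues on each side of the cleaved bond.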
FMDV 3C pro is an attractive target for drug design, since there are no known cellular homologues in susceptible hosts. In addition, since FMDV 3C pro sequences are highly conserved, it may be possible to develop protease inhibitors that have broad activity against a range of different serotypes. For example, although FMDV capsid protein sequences from the seven serotypes are only 50-70% conserved, complicating vaccine formulation, their 3C pro sequences are 82-85% identical. These figures rise to 91-97% if one excludes the three SAT serotypes, which are usually confined to sub-Saharan Africa (2) and in any case form a distinct conserved group in terms of 3C pro sequences (88-98% identical) (28). Mapping the variation found within 41 distinct FMDV 3C pro sequences spanning all seven serotypes onto the 3C pro structure (Fig. 6) reveals that the variable positions are almost entirely peripheral to the substrate-binding site. For the two residue positions closest to the substrate-binding site that exhibit some variation (Ile 30 and Ala 160), the vast majority of this variation is due to differences between the SAT and non-SAT groups. Ile 30 is substituted conservatively by Val in just one of 33 known non-SAT sequences and is strictly conserved as Leu in the SAT group; Ala 160 is strictly conserved in the non-SAT group of viruses and substituted by Val in the SAT group. This emphasizes the highly conserved nature of the binding pocket on FMDV 3C pro and highlights the potential for developing a broad-spectrum inhibitor. Progress toward this goal should be greatly facilitated by the structural and specificity characterization reported here.
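The identity ranges and the list of variable positions quoted above are the kind of quantities that can be read off a multiple sequence alignment. The following Python sketch shows one simple way to compute them, assuming pre-aligned sequences of equal length with `-` marking gaps. It is not the procedure used in this study: the function names are hypothetical, the three short sequences are invented toy data rather than FMDV 3C pro sequences, and column numbers refer to the alignment, not to residue numbering in the structure.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def identity_range(aligned: dict[str, str]) -> tuple[float, float]:
    """Lowest and highest pairwise percent identity within a set of sequences."""
    values = [
        percent_identity(aligned[p], aligned[q])
        for p, q in combinations(aligned, 2)
    ]
    return min(values), max(values)


def variable_positions(aligned: dict[str, str]) -> list[int]:
    """1-based alignment columns holding more than one residue type (gaps ignored)."""
    columns = zip(*aligned.values())
    return [i + 1 for i, col in enumerate(columns) if len(set(col) - {"-"}) > 1]


# Invented toy alignment, for illustration only.
toy = {
    "seq_a": "GAPPTDLQKM",
    "seq_b": "GAPPTDLQKM",
    "seq_c": "GAPVTDLQRM",
}
print(identity_range(toy))
print(variable_positions(toy))
```

Positions returned by a function such as `variable_positions` could then be compared with the residues lining the substrate-binding site to ask whether sequence variation falls within or outside the pocket.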