Serpins Flex Their Muscle

Inhibitory serpins are metastable proteins that undergo a substantial conformational rearrangement to covalently trap target peptidases. The serpin reactive center loop contributes a majority of the interactions that serpins make during the initial binding to target peptidases. However, structural studies on serpin-peptidase complexes reveal a broader set of contacts on the scaffold of inhibitory serpins that have substantial influence on guiding peptidase recognition. Structural and biophysical studies also reveal how aberrant serpin folding can lead to the formation of domain-swapped serpin multimers rather than the monomeric metastable state. Serpin domain swapping may therefore underlie the polymerization events characteristic of the serpinopathies. Finally, recent structural studies reveal how the serpin fold has been adapted for non-inhibitory functions such as hormone binding.

Although amino acid sequence similarity varies from 17 to 95%, key conserved residues facilitate the folding of inhibitory serpins into a metastable conformation typically comprising three β-sheets, eight to nine α-helices, and a solvent-exposed reactive center loop (RCL) (Fig. 1A) (1). The conformational "stressed-to-relaxed" transition that classical inhibitory serpins undergo upon interaction with target peptidases (Figs. 1A and 2A) has been commented on in previous reviews and will not be discussed extensively here (2-5). Suffice it to say that structural snapshots are available for native forms with the RCL fully expelled or partially (e.g. P14) inserted into β-sheet A and for inactive forms with the RCL partially (e.g. P12) or fully (e.g. RCL-cleaved serpins, the final peptidase complex, and the intact but latent conformer) inserted into β-sheet A (6). Together, these data provide a comprehensive picture of the range of conformational states that the serpin scaffold adopts, as well as the structural details of the conformational rearrangement that occurs upon RCL cleavage by a target peptidase (Figs. 1A and 2A). Indeed, during peptidase inhibition, it is this latter event that triggers RCL insertion, which stabilizes the acyl-enzyme complex by transposing the enzyme and deforming its active site before deacylation occurs (4).
Serpins are broadly distributed throughout all major branches of life; hence, their irreversible inhibitory mechanism must provide a distinct advantage over standard mechanism inhibitors when regulating proteolytic circuits. These concepts are discussed in the accompanying minireview (7). The main purpose of this minireview is to highlight new structural information regarding serpin-peptidase complex formation, hormone binding, and pathologic polymerization.

Serpin Exosite Interactions Enhance Target Peptidase Recognition
In 2001, the crystal structure of the Michaelis complex formed between the Manduca sexta P1 Lys serpin 1B and rat S195A trypsin was published (8). As expected, the RCL is bound in a substrate-like fashion by the peptidase, poised for attack of the P1-P1′ bond. The core interactions involve residues from P4 to P3′, with no contacts between the body of the serpin and the peptidase (i.e. no exosite contacts). This structure is consistent with the notion that the RCL is flexible and positioned away from the body of the serpin as an isolated peptide loop. Additional evidence suggesting that exosite contacts are not extensively involved in serpin-peptidase recognition came from changes in serpin specificity caused by mutations within the RCL (principally P1) and an NMR study of the Michaelis complex between α1-antitrypsin (α1AT) Pittsburgh and trypsin showing that each molecule rotated as if in isolation. After 2001, however, several new crystal structures of serpin-peptidase Michaelis complexes were solved: two non-physiological pairings with trypsin, four pairings with thrombin, and one pairing each with factors Xa (fXa) and IXa. These structures show that extensive exosite interfaces are a common feature involved in the recognition of serpins by target peptidases and involve residues outside P4-P3′ in addition to the RCL (Fig. 1, B-D) (3).
The RCL of most serpins extends from the P15 residue at the top of strand (s) 5A to the P3′ residue at the beginning of s1C. There is a high degree of flexibility on the non-prime side, consistent with the need for rapid incorporation in β-sheet A, but the region extending from P1 to P3′ is held close to the body of the serpin. Thus, the peptidase must approach close to the body of the serpin to engage the P1 residue in the S1 pocket. This requirement effectively limits the range of possible exosite contacts for typical serpins. To overcome this limitation and to allow interactions with multiple peptidases, certain serpins, in particular antithrombin (SERPINC1) and protein C inhibitor (PCI/SERPINA5), have 3-4-residue extensions on the P′ side (9,10). Antithrombin recognizes thrombin by engaging extensive exosites accessible only by stretching the P′ side toward the front of the serpin (10), whereas the fXa recognition site requires stretching of the non-prime side (Fig. 1B) (11). Similarly, PCI forms different complexes with thrombin and activated protein C by the apparent twisting of the peptidase relative to the serpin (9). By contrast, serpins with only one physiological target such as heparin cofactor II (HCII/SERPIND1) and α1AT have short P′ sides (P2′ and P3′ of α1AT are conformationally restrictive prolines) (12). The hallmark of multispecific (not promiscuous) serpins thus appears to be the extension of the P′ region.
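The P-site labels used throughout this section follow the standard Schechter-Berger convention: residues are counted outward from the scissile P1-P1′ bond, with unprimed labels on the N-terminal (non-prime) side and primed labels on the C-terminal side. The following is a minimal Python sketch of that bookkeeping, not a tool used in the studies reviewed here. The function names, the residue number 100 chosen for P1, and the use of an ASCII apostrophe for the prime are all assumptions made for illustration.

```python
def schechter_berger_label(residue_number: int, p1_number: int) -> str:
    """Return the P-site label of a residue relative to the P1 residue.

    Residues at or N-terminal to P1 are labelled P1, P2, P3, ...
    Residues C-terminal to P1 are labelled P1', P2', P3', ...
    (an ASCII apostrophe stands in for the prime symbol).
    """
    offset = p1_number - residue_number
    if offset >= 0:
        return f"P{offset + 1}"
    return f"P{-offset}'"


def rcl_span(p1_number: int, n_nonprime: int = 15, n_prime: int = 3):
    """List (residue_number, label) pairs for a loop running from
    P<n_nonprime> to P<n_prime>'. The defaults mirror the P15-to-P3'
    span described in the text for most serpins."""
    first = p1_number - (n_nonprime - 1)
    last = p1_number + n_prime
    return [(n, schechter_berger_label(n, p1_number))
            for n in range(first, last + 1)]


if __name__ == "__main__":
    # Hypothetical numbering: P1 is placed at residue 100 for illustration.
    for number, label in rcl_span(100):
        print(number, label)
```

Under these assumptions, a serpin with a 3-4-residue extension on the P′ side would simply be modeled by raising `n_prime` to 6 or 7.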
By calculating total interaction surface areas, with and without the RCL, we can assess the relative importance of RCL and exosite contacts in determining serpin specificity (Table 1). The surface area buried in non-physiological complexes (e.g. the two involving trypsin) is typically <1000 Å², 90% of which involves the RCL. By contrast, the physiologically relevant pairings all bury >1000 Å² and rely to varying degrees on exosite contacts. Indeed, there appears to be a tradeoff between the quality of the RCL sequence and the dependence on exosite contacts. This analysis is most interesting with respect to thrombin recognition by various serpins. For example, the disfavored P1 Leu of HCII and the P2 Gly of antithrombin necessitate large exosite contacts of over 1000 and 500 Å², respectively, whereas the favorable P2 Pro and P1 Arg sequence of PCI requires an exosite contact of only 150 Å² for efficient recognition by thrombin.
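The arithmetic behind this comparison is a simple decomposition: the exosite contribution is the buried surface area of the whole interface minus the part contributed by the RCL. The Python sketch below illustrates that bookkeeping only. The class name, field names, and numeric values are hypothetical and are not taken from Table 1; the example is chosen merely to be consistent with the statement that about 90% of a sub-1000 Å² non-physiological interface involves the RCL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interface:
    """Buried surface area of a serpin-peptidase interface, in Å²."""
    name: str
    total_area: float  # area buried by the whole interface (RCL included)
    rcl_area: float    # area buried by RCL residues alone

    def __post_init__(self):
        # Guard against inconsistent inputs.
        if self.total_area <= 0:
            raise ValueError("total_area must be positive")
        if not 0 <= self.rcl_area <= self.total_area:
            raise ValueError("rcl_area must lie between 0 and total_area")

    @property
    def exosite_area(self) -> float:
        """Area contributed by contacts outside the RCL."""
        return self.total_area - self.rcl_area

    @property
    def exosite_fraction(self) -> float:
        """Share of the interface contributed by exosite contacts."""
        return self.exosite_area / self.total_area


if __name__ == "__main__":
    # Illustrative values only (assumed, not measured).
    example = Interface("hypothetical non-physiological pairing",
                        total_area=900.0, rcl_area=810.0)
    print(example.name, example.exosite_area, example.exosite_fraction)
```

A pairing that compensates for a disfavored RCL sequence would show up in this representation as a large `exosite_area` relative to `rcl_area`.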
A second important finding from these structural data is the demonstration that exosites on the serpin scaffold play a crucial role in facilitating initial serpin-peptidase interactions (Table 1). Interestingly, different peptidases appear to rest in different ways on the top of the serpin scaffold (even where the serpin component is the same). In several complexes, the peptidase lies far over the "front" of the serpin scaffold and predominantly forms contacts with a conserved single turn helix that precedes s4C as well as surrounding residues (Fig. 1B) (10). Conversely, in other complexes, the peptidases dock "on the back foot" (e.g. the PCI-thrombin-heparin complex) by forming interactions with residues on β-sheet B/s1C, the N-terminal end of s2C, and the C-terminal portion of s3C (9).
Finally, it is interesting to note that three human serpins use protein sequences outside the serpin scaffold as key exosites. HCII utilizes an N-terminal extension to bind to exosite I of thrombin (12), and similarly, α2-antiplasmin contains an extensive C-terminal extension that functions to bind the Kringle domains of plasmin (13). The x-ray crystal structure of α2-antiplasmin reveals that the C terminus is positioned appropriately near the RCL to bind to the peptidase (Fig. 1C). In the third case, the crystal structure of the protein Z-dependent inhibitor (SERPINA10) of fXa reveals that the serpin recruits protein Z via a binding site centered on sheet C (14). Whereas the structure of the full ternary complex (SERPINA10-protein Z-fXa) remains to be determined, modeling studies suggest that the epidermal growth factor domain of protein Z is placed to assist in recruiting fXa (Fig. 1D).
Taken together, these results show that the RCL sequence itself provides insufficient information for determining actual peptidase target(s), as some targets poorly recognize certain RCL sequences in the absence of regulatory cofactors. Thus, the identification of specificity-determining exosites is of therapeutic interest because these sites represent potentially important new targets for development of molecules that block or enhance serpin-peptidase interactions.

Domain-swapped Model of Serpin Polymerization
FIGURE 1. Native serpin structures form docking (Michaelis) complexes with peptidases using exosites and other factors. A, structure of native/stressed (left) and cleaved/relaxed (right) α1AT. β-Sheet A is in red, and the RCL is in magenta. In the cleaved or covalently complexed form of the serpin (see Fig. 2A), the RCL forms an extra strand in sheet A. Native, but not cleaved, serpins are available to form docking complexes. B, superposition of the structure of the antithrombin-thrombin-heparin ternary complex (with serpin in cyan, the RCL in orange, and thrombin in yellow) with antithrombin-fXa-pentasaccharide (with serpin in green, the RCL in magenta, and fXa in blue). The structures are superposed on the serpin. C, structure of α2-antiplasmin (green with magenta RCL). The C-terminal region functions as an exosite for plasmin; the portion visible in electron density is in cyan, and the approximate position of a docking peptidase is shown in ghost white. D, structure of protein Z-dependent inhibitor (green with magenta RCL) in complex with protein Z (cyan). The approximate position of a docking peptidase is in ghost white.
The use of serpins to modulate diverse molecular pathways predicts that serpin mutations will yield a wide range of disease phenotypes. Indeed, loss-of-function mutations associated with α1AT, antithrombin, and C1 esterase inhibitor (SERPING1) result in emphysema, thrombosis, and angioedema, respectively. Historically, these loss-of-function mutations underpin the clinical deficiencies that are classified as either type 1 (absent or decreased circulating levels below a critical threshold of a functionally normal protein) or type 2 (normal circulating levels of a dysfunctional protein). A spectrum of genetic mutations (missense, nonsense, indels) leads to these classes of deficiencies (15).
However, several of the missense mutations also induce toxic gain-of-function phenotypes by encoding full-length molecules that are prone to misfolding and/or polymer formation. For example, the Z mutation (E342K) of α1AT leads to the accumulation of misfolded and polymerized protein within the endoplasmic reticulum of hepatocytes (16). The marked decrease in circulating levels of α1AT (the major antipeptidase in extracellular fluids) predisposes to emphysema, a loss-of-function phenotype (17). In contrast, the accumulation of misfolded or aggregated α1AT in hepatocytes probably leads to overloading of protein quality control systems, resulting in cellular injury and the development of cirrhosis, a toxic gain-of-function phenotype. Collectively, toxic gain-of-function phenotypes observed with destabilizing mutations have been termed the serpinopathies (18). Because full-length serpins associated with the serpinopathies retain some functional inhibitory activity, a better understanding of the mechanism of serpin polymer formation could lead to therapeutic strategies designed to block serpin accumulation and to enhance secretion and thereby treat both disease phenotypes.
The early observation that inhibitory serpins accept their RCL as an additional strand in β-sheet A led to the suggestion that RCL insertion in trans may represent the physiological basis for serpin polymerization. Although alternative models have been suggested, the "loop-sheet A" model of serpin polymerization was generally accepted until recently (Fig. 2B). Specifically, it was suggested that mutations might destabilize β-sheet A of the native serpin and enhance the ability of this region to accept another serpin RCL in trans. It was therefore suggested that serpin metastability and the requirement to undergo the stressed-to-relaxed conformational change as part of function represented a key weakness of the serpin scaffold. However, despite the appealing rationale of this proposal, it has remained challenging to reconcile the loop-sheet A polymerization model with biophysical data suggesting that polymerogenic serpin mutations result in the stabilization of a polymerogenic serpin folding intermediate (termed M*) rather than in subsequent polymerization of the native folded state. Indeed, comparisons between a fully folded native polymerogenic serpin variant and the wild-type counterpart reveal only modest differences in thermal stability or inhibitory activity (19). Finally, it has proven difficult to build physicochemically reasonable loop-sheet A polymer models that contain both an RCL linkage and completed β-sheet A hydrogen bonding. (In the illustrative model shown in Fig. 2B, the top of β-sheet A is open.)
Recently, the x-ray crystal structure of a domain-swapped antithrombin dimer suggested a new model for serpin polymerization (20). Strikingly, this structure reveals that both s5A and the RCL are incorporated into β-sheet A of another serpin molecule (Fig. 2C). Although the crystal structure was that of a self-terminating antithrombin dimer, and thus not able to propagate further, the model could readily be opened to form long-chain polymers. Limited proteolysis data, together with disulfide trapping experiments, support the formation of such polymers in vitro in response to chemical denaturants and heat.
The domain-swapped serpin dimer has a number of implications for the mechanism of serpin polymerization. In particular, this model suggests that polymerogenic serpin mutations, which generally cluster on and around s5A and s6A, interfere with the final stages of β-sheet A assembly and permit the domain-swapping event (Fig. 2, C and D). Thus, the polymerogenic M* intermediate may resemble a serpin with a substantially incomplete or disordered β-sheet A (Fig. 2D). A key question is whether similar domain-swapped polymers of polymerogenic serpin variants form in the endoplasmic reticulum in vivo.

Serpins from Thermophilic Organisms Provide New Insights into Function and Dysfunction
Serpins are sporadically distributed in Bacteria and Archaea (21,22), suggesting an ancient origin for the serpin fold. Although the role of most prokaryote serpins is not understood, some of these molecules (serpins from Clostridium thermocellum) localize to the cellulosome, a multiprotein extracellular complex that digests material such as cellulose (23). These serpins may protect the cellulosome against unwanted peptidase activity.
Considering the conformational lability of most serpins in eukaryotes, it was surprising to find a large number of inhibitory serpins encoded in extremophilic organisms. In addition to C. thermocellum serpin, exemplars include thermopin (from Thermobifida fusca; 55°C), tengpin (from Thermoanaerobacter tengcongensis; 75°C), and aeropin (from Pyrobaculum aerophilum; 100°C). Biophysical studies reveal that both the native and cleaved states of these latter molecules have substantially elevated stability and that these serpins still function as metastable inhibitors in essentially the same way as their mesophilic counterparts. Interestingly, however, one of these molecules, thermopin, employs a novel strategy to fold at elevated temperatures. Structural studies reveal that thermopin possesses an extreme C-terminal sequence that folds across the front of the molecule and interacts with the top of sheet A (Fig. 3A) (21,24). Biophysical studies show that the C-terminal sequence plays no detectable role in influencing the stability of the native fold but instead appears to be important for stabilizing a folding intermediate. This concept is intriguing in light of the recent discovery of how native serpins may fold and domain swap (20). As suggested above, the final stage of serpin folding is most likely the assembly of s5A into β-sheet A (Fig. 2D). Thus, the C-terminal sequence of thermopin could provide an extra set of interactions that assist in efficient recruitment of s5A to the folding serpin. An alternative explanation is that the C-terminal sequence contributes to folding by making the intermediate less likely to undergo a domain-swapping event. Bioinformatic studies on aeropin also reveal the presence of a C-terminal extension; however, the role of this region remains to be investigated (25).
Table 1 (fragment): 2GD4 (3.3 Å) and 1SR5 (3.1 Å); antithrombin-S195A factor Xa-pentasaccharide complex; interactions between thrombin "140 loop" and helical turn 231-234 (preceding s4C), s4C, and s3C; also minor interaction between thrombin "30 loop" and C-terminal end of s1B.
Another interesting insight from the study of bacterial serpins is the discovery that tengpin contains an N-terminal sequence that makes extensive contacts with the first β-strand of sheet A (s1A) and helix E (Fig. 3B) (26). Deletion of this region results in a protein that initially folds to the native state but rapidly undergoes conformational change to adopt the latent conformation. The N-terminal region thus functions to stabilize the native state. Interesting parallels can be drawn between this situation and that of mammalian plasminogen activator inhibitor 1 (PAI-1/SERPINE1), which, in the absence of the cofactor vitronectin, rapidly switches from the native to the latent conformation. The x-ray crystal structure of PAI-1 in complex with vitronectin reveals that the cofactor similarly interacts with s1A and helix E (Fig. 3C) (27). Although the folds of the N terminus of tengpin and vitronectin are different, these data nonetheless point toward a common means of stabilizing the metastable serpin fold.

Structural Studies on Non-inhibitory Serpins: Hormone-binding Serpins, Pigment Epithelium-derived Factor, and Maspin
Of the 36 known human serpins, six have evolved to perform homeostatic non-inhibitory functions: maspin (SERPINB5), pigment epithelium-derived factor (PEDF/SERPINF1), HSP47 (SERPINH1), cortisol-binding globulin (CBG/SERPINA6), thyroxine-binding globulin (TBG/SERPINA7), and angiotensinogen. Structural studies on the hormone-binding serpins CBG and TBG have provided seminal insights into how these molecules interact with and transport the hormones cortisol and thyroxine, respectively. For both serpins, the native conformation has a higher affinity (CBG, ~10-fold; and TBG, ~3-fold) for ligand than the cleaved conformation. Until recently, the precise molecular mechanism of ligand binding and release was obscure, although it was postulated that hormone release may be mediated via RCL cleavage by promiscuous proteolytic activity at the site of delivery. Recent structures of both serpins reveal the unexpected finding that cortisol and thyroxine bind into a pocket formed by helix H, s3B, s4B, s5B, and the loop between s4B and s5B (Fig. 3D) (28,29). The structures also reveal how RCL cleavage results in substantial mobility in a loop at the top of helix D, which forms extensive contacts with residues in β-sheet B that form the ligand-binding site. This finding suggests that the enhanced flexibility of this region results in a hormone-binding region less able to adopt the conformation required for high affinity binding.
Unfortunately, determination of the structures of PEDF and maspin has not provided insights into their mechanisms of action, as their partners and molecular functions remain somewhat enigmatic. Compelling evidence from cell and animal models implicates PEDF (an extracellular glycoprotein) in the maintenance of tissue homeostasis via cell-surface receptors and interaction with extracellular matrix components (30). For example, it inhibits angiogenesis by suppressing the migration and proliferation of endothelial cells and by promoting apoptosis. Although maspin is also implicated in tissue development and homeostasis, as evidenced by early developmental failure of null mice and its loss in carcinomas (31,32), it is a nucleocytoplasmic protein (33), which may bind transcription factors (34). The only structure of the human non-inhibitory serpins that remains to be determined is HSP47, and this is perhaps the most tantalizing, as it may shed light on how this serpin performs its critical intracellular chaperone function in collagen biosynthesis (35).

Conclusion
Taken together, these studies underscore the versatility of the large, complex, metastable serpin scaffold in regulating peptidase and non-proteolytic functions. In addition to the specificity afforded by the primary amino acid sequence of the RCL, inhibitory serpins achieve fine-tuning of specificity via the use of specific exosites present both inside and outside the serpin domain. Accordingly, a single serpin is able to effectively modulate the function of more than one target peptidase and, furthermore, ensure tight physiological specificity even when confronted by a group of peptidases whose cleavage specificity is similar. The cost of this complexity, however, is that serpins are unusually vulnerable to mutations that impact their folding pathway. As a consequence, and in common with many other proteins that cause conformational disease (36), serpins are able to domain swap. These events most likely underlie the physiological basis for serpin polymerization.