The X-ray Crystal Structure of Mannose-binding Lectin-associated Serine Proteinase-3 Reveals the Structural Basis for Enzyme Inactivity Associated with the Carnevale, Mingarelli, Malpuech, and Michels (3MC) Syndrome*

Background: Mutants of the serine protease, mannose-binding lectin associated-protease-3 (MASP-3), are associated with Carnevale, Mingarelli, Malpuech and Michels (3MC) syndrome.
Results: The lack of activity and the structure of the G666E mutant of MASP-3 reveal a molecular basis for 3MC syndrome.
Conclusion: Mutants of MASP-3 associated with 3MC syndrome are inactive.
Significance: The catalytic activity of MASP-3 is required for normal developmental processes.

The mannose-binding lectin associated-protease-3 (MASP-3) is a member of the lectin pathway of the complement system, a key component of human innate and active immunity. Mutations in MASP-3 have recently been found to be associated with Carnevale, Mingarelli, Malpuech, and Michels (3MC) syndrome, a severe developmental disorder manifested by cleft palate, intellectual disability, and skeletal abnormalities. However, the molecular basis for MASP-3 function remains to be understood. Here we characterize the substrate specificity of MASP-3 by screening against a combinatorial peptide substrate library. Through this approach, we successfully identified a peptide substrate that was 20-fold more efficiently cleaved than any other identified to date. Furthermore, we demonstrated that mutant forms of the enzyme associated with 3MC syndrome were completely inactive against this substrate. To address the structural basis for this defect, we determined the 2.6-Å structure of the zymogen form of the G666E mutant of MASP-3. These data reveal that the mutation disrupts the active site and perturbs the position of the catalytic serine residue. Together, these insights into the function of MASP-3 reveal how a mutation in this enzyme causes it to be inactive and thus contribute to the 3MC syndrome.

developmental disorder, the 3MC syndrome, which suggests a possible role for MASP-3 outside of the immune system (12,13). The 3MC syndrome is the collective term used to identify four allelic variants of the same disease category, the Carnevale, Mingarelli, Malpuech, and Michels syndromes (14,15). Patients with 3MC display a series of characteristic congenital anomalies, including abnormally increased distance between facial parts, bilateral ptosis with abnormally small and droopy eyelids, cleft lip and/or palate, abnormal growth pattern of skull, fusion of radius and ulna, as well as impairment in hearing and/or learning (16-19). Four mutations (C630{168}R, G666{197}E, G687{216}R, and H497{57}Y (12,13)) (numbers in {brackets} indicate chymotrypsin numbering) in the MASP1 gene were identified in exon 12, which encodes the SP domain of MASP-3. All these mutations map to the active site region of the enzyme, including the catalytic His 497{57} residue. Both studies make elegant predictions of the effects of these mutations on the MASP-3 enzyme; however, the structure of MASP-3 and the structural basis for MASP-3 deficiency remain to be characterized.
In a previous study (20), we produced a form of human MASP-3 with a single Gln residue replacing the Lys residue N-terminal to the R-I activation bond (M3Q). This enzyme could be efficiently cleaved and activated by the C1r protease of the classical pathway of complement (20). Because the amino acid sequence C-terminal to the activation bond represents that of the wild type MASP-3, the C1r-activated recombinant M3Q protein contains a fully functional SP domain of the wild type MASP-3. We were similarly able to produce cleaved, activated (M3Q cleaved) forms of the 3MC syndrome-related mutants G687{216}R and G666{197}E. Both mutants lacked detectable protease activity. Finally, we determined the 2.6-Å structure of the G666{197}E mutant protease in the zymogen form. These data reveal substantial perturbation of the active site, consistent with a correlation between the lack of MASP-3 function and the 3MC syndrome.

Expression and Purification of MASP-3-Recombinant MASP-3 CCP12SP (residues Lys 298-Arg 728) and its mutants (M3Q, M3QG666{197}E, and M3QG687{216}R) were expressed and refolded with some modifications to previously described methods (21). Briefly, genes for all recombinant proteins were synthesized (GenScript, Piscataway, NJ) and the DNA was cloned into pET17b (EMD Biosciences, Rockland, MA). After transformation of the vector into Escherichia coli strain BL21(DE3)pLysS, cells were cultured at 37°C in 2× TY (tryptone/yeast extract) broth with 50 µg/ml of ampicillin and 34 µg/ml of chloramphenicol to an A 595 of 0.6, followed by induction with 1 mM isopropyl β-D-thiogalactoside for 4 h. Following induction, the culture was centrifuged (27,000 × g, 20 min, 4°C), the cells were collected in 30 ml of 50 mM Tris-HCl, 20 mM EDTA, pH 7.4, and frozen at −80°C. The cells were thawed and sonicated on ice 6 times for 30 s. After centrifugation at 27,000 × g for 20 min, inclusion body pellets were sequentially washed and centrifuged with 10 ml of 50 mM Tris-HCl, 20 mM EDTA, pH 7.4. The washed pellet was resuspended in 10 ml of 8 M urea, 0.1 M Tris-HCl, 100 mM DTT, pH 8.3, at room temperature (RT) for 3 h. Refolding was initiated by rapid dilution dropwise into 50 mM Tris-HCl, 3 mM reduced glutathione, 1 mM oxidized glutathione, 5 mM EDTA, and 0.5 M arginine, pH 9.0. The renatured protein solutions were concentrated and dialyzed against 50 mM Tris-HCl, pH 9.0, and renatured proteins were purified on a 5-ml Q-Sepharose-Fast Flow column (GE Healthcare). The bound protein was eluted with a linear NaCl gradient from 0 to 400 mM over 35 ml at 1 ml/min. The recombinant proteins were further purified using a Superdex 75 16/60 column (GE Healthcare) in a buffer of 50 mM Tris, 145 mM NaCl, pH 7.4, aliquoted, snap frozen, and maintained at −80°C. The purity of the protein was confirmed by SDS-PAGE followed by Western blotting and N-terminal sequencing. Typical protein yields were between 1 and 2 mg/liter.
Western Blotting and Antibodies-Proteins were resolved by SDS-PAGE, transferred, and immunoblotted using an anti-MASP-3 antibody directed against the unique peptide sequence, NPNVTDQIISSGTRT, which was raised in chickens as previously described (22).
Activation of Zymogen MASP-3-To activate the recombinant M3Q and 3MC syndrome-related MASP-3 mutants, 0.5 mg of active human C1r enzyme (Complement Technology, TX) was coupled to a 1-ml HiTrap™ NHS column (GE Healthcare), according to the manufacturer's instructions. Pure recombinant protein (0.5 mg) was loaded onto the column for activation at 26°C for 16 h. The C1r-activated MASP-3 was then eluted in a buffer containing 50 mM Tris, 145 mM NaCl, pH 7.4. Wild type MASP-3 and all recombinant MASP-3 mutant proteins were subjected to SDS-PAGE under reducing conditions and then transferred onto a PVDF membrane, followed by N-terminal sequencing to identify the site of cleavage by C1r.
N-terminal Sequencing-Protein samples were reduced and denatured. The protein fragments in the samples were separated by SDS-PAGE and transferred onto a polyvinylidene difluoride membrane in a transfer buffer containing 10 mM 3-(cyclohexylamino)-1-propanesulfonic acid, pH 11, with 15% (v/v) methanol. The protein band on the membrane was visualized by Coomassie R-250 Brilliant Blue staining and subsequently excised. After three alternating washes of water and 50% (v/v) methanol, the band was cut into small pieces and loaded onto a Procise Protein Sequencer 492/492C with a Microgradient Delivery System 140C and a UV Detector 785A (Applied Biosystems) for sequencing. The sequencing process applied automated Edman degradation and phenylthiohydantoin amino acid analysis. The data were analyzed by applying the Sequence Pro software (Applied Biosystems).
Enzyme Kinetic Analysis-Kinetic analysis of MASP-3 was performed in fluorescence assay buffer (FAB) (50 mM Tris-HCl, 150 mM NaCl, 0.2% (w/v) PEG 8000, pH 7.4), using the synthetic peptide substrates Boc-VPR-AMC, Boc-PFR-AMC, Boc-GGR-AMC, Z-FR-AMC, Boc-AFK-AMC, and Boc-LGR-AMC (denoted as VPR, PFR, GGR, ZFR, AFK, and LGR, respectively) at 37°C on a POLARstar fluorescent plate reader (BMG Labtech, Offenburg, Germany). For a typical activity assay, 100 nM M3Q (or up to 1 µM of either M3QG666{197}E or M3QG687{216}R) was incubated with 100 µM substrate. The rate of increase of fluorescence was monitored for 30 min using excitation and emission wavelengths of 360 and 460 nm, respectively. The initial rate of proteolysis of substrate was measured as the slope of the linear portion of the progress curve, and was then converted to the rate of increase of AMC molecules per unit of enzyme (M [AMC] −1) using a standard curve for arbitrary fluorescence units versus AMC concentration and taking into account the enzyme concentration used. To determine the values of the steady-state reaction constants, Km and Vmax, 100 nM M3Q was used to react with substrate at a range of concentrations from 0 to 500 µM and the initial velocity of the reactions was plotted against the substrate concentration. The Km and Vmax values were estimated by using GraphPad Prism 5.0 (GraphPad Software, San Diego, CA) to fit the data to the Michaelis-Menten equation,

$$Y = \frac{V_{\max}\, X}{K_m + X}$$

where Y is the initial rate of the proteolytic reaction and X is the concentration of substrate.
Determination of the Substrate Specificity of M3Q Using the Rapid Endopeptidase Profiling Library (REPLi)-The REPLi library is a combinatorial peptide library that contains 512 pools of peptides, with each pool containing up to eight different variable tripeptides with the template layout of MeOC-GGXXXGG-dipicolinic acid-KK, where each X represents a group of alternative amino acids with similar physicochemical properties, i.e. A/V, F/Y, I/L, D/E, R/K, S/T, Q/N, and P. There are no Gly, His, Trp, Cys, or Met residues in the variable tripeptide region. The resulting combinations of variable tripeptides give rise to 3375 different peptides in the library in total. Methoxycarbonyl (MeOC) is the fluorophore, and dipicolinic acid is the fluorophore quencher. The soluble peptide library pools (i.e. 512 wells) in six 96-well plates were diluted using FAB to a final concentration of 50 µM. A final concentration of 300 nM M3Q was incubated with the substrate pools in FAB at 37°C. Cleavage of the substrates was monitored by measuring the increase in fluorescence intensity from the MeOC fluorophores using 55-s cycles for 30 cycles, with an excitation wavelength of 320 nm and an emission wavelength of 420 nm. The initial velocity of the cleavage was indicated by the slope per unit time of the linear region of the curves. These values were converted into molar MeOC molecules per unit time per molar enzyme (M [MeOC] ) by using an MeOC-fluorescence standard curve, allowing comparison of the cleavage rate between pools and between enzymes.
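The library sizes quoted above follow directly from the combinatorics of the variable tripeptide. The sketch below is an illustration only, not the vendor's plate layout: it assumes the eight distinct residue groups named in the text (seven pairs plus Pro alone) and enumerates every pool and every peptide. The names `GROUPS` and `enumerate_pools` are hypothetical.

```python
from itertools import product

# Assumption: the eight distinct residue groups named in the text
# (seven pairs plus Pro on its own), written as one-letter codes.
GROUPS = ["AV", "FY", "IL", "DE", "RK", "ST", "QN", "P"]


def enumerate_pools(groups=GROUPS, positions=3):
    """Map each pool (one group per variable position) to its tripeptides."""
    pools = {}
    for combo in product(groups, repeat=positions):
        # Each pool holds every tripeptide obtainable from its three groups.
        pools[combo] = ["".join(residues) for residues in product(*combo)]
    return pools


pools = enumerate_pools()
n_pools = len(pools)                                   # 8^3 pools
n_peptides = sum(len(p) for p in pools.values())       # 15^3 peptides
largest_pool = max(len(p) for p in pools.values())     # 2 x 2 x 2 peptides

print(n_pools, n_peptides, largest_pool)
print(pools[("RK", "IL", "FY")])
```

Eight groups at three positions give 512 pools, the 15 residues involved give 3375 tripeptides, and a pool built from three two-residue groups holds eight peptides, which matches the figures stated in the text.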
Based on the REPLi results, 17 individual peptides from the substrate pools containing tripeptidyl sequences of Arg/Lys-Ile/Leu-Phe/Tyr, Arg/Lys-Arg/Lys-Ile/Leu, and Arg/Lys-Ser/Thr-Ile/Leu, which displayed high or modest rates of cleavage by M3Q, were synthesized (GL Biochem, Shanghai, China). Each assay used 100 µM of the fluorescent quenched substrate with 500 nM M3Q (or up to 1 µM of either M3QG666{197}E or M3QG687{216}R) and cleavage was monitored for 30 min as described above. The initial rate of the proteolysis of substrate was measured as the slope of the linear portion of the progress curve, and was then converted to M [Abz] −1 by using the standard curve. To determine the values of the steady-state reaction constants, 500 nM M3Q was mixed with substrate at a range of concentrations from 0 to 500 µM and the initial velocity of reaction was plotted against the substrate concentration, allowing the determination of the Km, Vmax, and kcat values as described above. Selected wells that showed the highest increase in fluores-
Protein Crystallization and X-ray Structure Determination-Purified zMASP-3 CCP12SP_QG666{197}E was buffer exchanged into 100 mM NaCl, 10 mM Tris-HCl, pH 7.4, and concentrated to 5 mg/ml for crystallization trials.
Crystals of zMASP-3 were obtained utilizing the sitting drop vapor diffusion method at 4°C with a reservoir solution comprised of 0.12 M NaCl, 18% (w/v) polyethylene glycol 4000, and 0.1 M imidazole, pH 7.5. With an equal reservoir to protein ratio, crystals were observed after 1 day and grew to maximal size after 3 weeks. Crystals were cryogenically cooled in sucrose-supplemented mother liquor prior to data collection. Data were collected at cryogenic temperatures on the MX2 beamline at the Australian Synchrotron to 2.6-Å resolution. Data were indexed, integrated, scaled, and merged using the Xia2 data reduction pipeline (23), which utilizes XDS (24-26), Pointless, Aimless (27), and the CCP4 suite (28).
A molecular replacement solution was found using the EMBL-HH Automated Crystal Structure Determination Platform "Auto-Rickshaw" (29,30), which contained 2 molecules per asymmetric unit with clear unbiased Fo − Fc electron density observed in omitted loop regions. Manual iterative model building and refinement using Coot (31) and phenix.refine (32) were performed, yielding a high quality final model (Rfactor/Rfree 19.5/26.0, MolProbity score 3.29).
Final coordinates and structure factors have been deposited with the Protein Data Bank with the accession code 4KKD. A table of full data collection and refinement statistics can be found in Table 1.

RESULTS
The Activity and Specificity of Activated MASP-3-We used a mutant of MASP-3 in which a Gln residue had been introduced at position 448 in place of the Lys residue at that position to facilitate its efficient cleavage and activation by C1r (20) (Fig. 1). In general, active MASP-3 had very low activity (Table 2), with VPR being the best substrate (kcat/Km value of 5.8 M−1 s−1; Table 3). The similarly activated (Fig. 1) mutant forms of MASP-3 associated with the 3MC syndrome were completely inactive against any substrate tested (Table 2).
We comprehensively mapped the specificity of activated wild type MASP-3 using the REPLi combinatorial peptide substrate library. The majority of the substrate pools in the library were not cleaved by MASP-3 (Fig. 2A). However, we were able to identify optimal substrates for the enzyme, with the top ranked pool containing substrates of the sequence Lys/Arg-Ile/Leu-Phe/Tyr (Fig. 2A).
The individual substrates contained in the top two pools from the REPLi analysis were individually synthesized and tested for their rate of cleavage by wild type MASP-3 (Fig. 2B). These data collectively reveal that the enzyme had a strong preference for an Arg residue at P1, an Ile residue at P1′, and an amino acid with an aromatic side chain (Phe or Tyr) at P2′. Our studies also revealed that the enzyme was able to cleave substrates containing the dibasic RR or RK motif. Analysis using LC-MS revealed that the enzyme cleaved after the first basic amino acid (Fig. 2C); thus it is likely that the enzyme can additionally accommodate a positively charged residue at the P1′ position of substrates.
Kinetic analysis of the cleavage of the top 5 substrates selected from the REPLi library revealed that the RIF and RIY substrates were optimal for MASP-3, with kcat/Km values 3-fold better than any of the other substrates tested (Table 4). The top substrate, RIF, had a kcat/Km value of 98.5 M−1 s−1, which is ~20-fold better than the coumaryl peptide substrates used thus far to characterize MASP-3 in vitro.
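To make the route from initial rates to kcat/Km concrete, the sketch below estimates Vmax and Km from rate data and converts them to kcat/Km in M−1 s−1. It is an illustration only: it uses the Hanes-Woolf linearization ([S]/v against [S]) with ordinary least squares rather than the nonlinear fit performed in GraphPad Prism, and the data are synthetic values generated from assumed constants, not measurements from this study. All function names are hypothetical.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def fit_hanes_woolf(s_values, v_values):
    """Estimate (Vmax, Km) from the line [S]/v = [S]/Vmax + Km/Vmax."""
    # Points with zero substrate or zero rate cannot be linearized.
    pts = [(s, s / v) for s, v in zip(s_values, v_values) if s > 0 and v > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx                     # equals 1 / Vmax
    intercept = mean_y - slope * mean_x   # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


def specificity_constant(vmax_um_per_s, km_um, enzyme_um):
    """Return kcat/Km in M^-1 s^-1 from Vmax (uM/s), Km (uM), [E] (uM)."""
    kcat = vmax_um_per_s / enzyme_um      # s^-1
    return kcat / (km_um * 1e-6)          # convert Km from uM to M


# Synthetic, noise-free example (assumed constants, not data from the paper).
substrate_um = [25, 50, 100, 200, 300, 400, 500]
rates = [michaelis_menten(s, vmax=0.05, km=200.0) for s in substrate_um]

vmax_fit, km_fit = fit_hanes_woolf(substrate_um, rates)
print(vmax_fit, km_fit, specificity_constant(vmax_fit, km_fit, enzyme_um=0.5))
```

Linearized fits weight experimental error differently from a direct nonlinear fit, so with noisy data the two approaches can give somewhat different estimates; the linear form is used here only because it needs no external libraries.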
We tested the 3MC syndrome-associated mutants G666E and G687R against this substrate. These data revealed no detectable activity, suggesting that the mutant enzymes are indeed inactive.
FIGURE 3. Schematic representations of the MASP-3_G666E structure. A, the zymogen MASP-3 G666E monomer (PDB code 4KKD) with residues 301-362 in magenta representing the CCP1 domain, residues 367-432 in dark blue representing the CCP2 domain, and residues 433-728 in gold representing the SP domain. The linker regions are colored black and the N and C denote the N- and C-terminal ends, respectively. The activation loop substitution K448{14}Q is shown in purple stick format. The active site residues (His 497{57}, Asp 553{102}, and Ser 664{195}) are shown in green stick format. The residues involved in point mutations associated with the 3MC syndrome, Cys 630{168} and Gly 687{216}, in addition to the incorporated G666{197}E substitution are shown in blue stick format (His 497{57} in green stick is also the site of a point mutation associated with 3MC syndrome). The amino acid sequence of the activation loop is shown in gold font below, with the black arrow between Arg 449{15} and Ile 450{16} highlighting the activation point.
Structure Determination of Zymogen MASP-3-To address the structural basis of 3MC syndrome, we determined the 2.6-Å resolution structure of a MASP-3 CCP12SP zymogen (zMASP-3) containing the mutation G666{197}E (Fig. 3). Two molecules are present in the asymmetric unit. Molecule B is more complete than molecule A and includes the intact activation loop (amino acids 440-453{5-19}). The activation loop mutation (K448{14}Q) participates in crystallographic packing interactions, suggesting that this may have aided crystallization. We were unable to crystallize wild type or activated enzyme. The SP domain of MASP-3 is in the zymogen conformer of the chymotrypsin-like serine proteases. Superimposition of the SP domain of chain A onto chain B reveals a significant difference in the positions of the CCP1 and CCP2 domains, with up to 14 Å of variance observed in Cα positions in CCP1.
The interactions between the various domains remain the same, but differences in the relative positions of CCP residues account for inter-domain flexion (Fig. 4). The zMASP-3 SP chains A and B superimpose well (root mean square deviation = 0.588 Å). Minor differences in side chain rotamers in addition to minor loop movements are observed, presumably due to differences in crystal packing interactions. Superimposition of the CCP2 domains reveals a different backbone position of residues 398-409 causing the different domain juxtaposition. The greatest shift is observed in the Pro 400 Cα with a movement of 0.98 Å (Fig. 4, B and E).
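For readers unfamiliar with the metric, the root mean square deviation quoted for the superimposed SP domains is the square root of the mean squared distance between equivalent atoms. The sketch below is a minimal illustration under two assumptions: the two coordinate sets are already superimposed and are listed in the same atom order. It does not perform the superposition itself, and the coordinates shown are invented for the example rather than taken from PDB entry 4KKD. The function name `rmsd` is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD in the input units between two pre-superimposed coordinate lists.

    Each argument is a sequence of (x, y, z) tuples for equivalent atoms
    (for example, matched C-alpha atoms) in the same order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Invented coordinates: every atom in chain_b is displaced by 1 A along z.
chain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
chain_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
print(rmsd(chain_a, chain_b))
```

Because the value is an average over all matched atoms, a low overall RMSD is compatible with larger local shifts in individual loops or residues.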
Differences in the positions of CCP1 relative to CCP2 are observed upon superimposition of CCP1 domains (Fig. 4D). In CCP1, a backbone hydrogen bond between Gly 332 -O and Val 365 -N is observed in chain A (2.78 Å), but is lost in chain B (3.34 Å). Concurrently, the peptide backbone of Thr 331 is flipped ( angle A: 55.25 versus B: 175.89). Movement of Gly 332 and Thr 331 away from CCP2 serves to reduce steric hindrance to domain movement increasing the distance to Tyr 389 in CCP2. The different orientations of the CCP1, CCP2, and SP domains between the two chains in the crystallographic asymmetric unit highlight the flexibility of the CCP domain interactions with minimal hydrogen bonding observed between the domains.
Comparison of zMASP-3 with zMASP-2 further highlights the CCP2-SP inter-domain flexibility. Superimposition of the SP domains of zMASP-3 and zMASP-2 (PDB code 1ZJK) reveals a shift of up to 34 Å in the relative position of the CCP1 domain between the enzymes (Fig. 4C). The difference at the CCP2-SP junction of Val 433 and Glu 435 in MASP-2 and MASP-3, respectively, sees the loss of a polar interaction in MASP-2, which contributes significantly to the different domain positioning in the two enzymes.
In the zMASP-3 unit cell, the 2 molecules are found in a head to head configuration, with several interactions between the long extended A and B loops observed. The unusually long D loop (amino acids 595-612{143-150}) in both chains was not visible in electron density. The buried surface area of the interaction between the SP domains of the asymmetric unit has been calculated to be 276 Å². However, there is no evidence that MASP-3 forms a dimer in vitro.
The catalytic motif (GDSer{195}GG) is highly conserved across the chymotrypsin family. Our data reveal that mutation of the Gly 666{197} in MASP-3 has marked structural consequences. Comparison of zMASP-3_G666{197}E with zymogen forms of MASP-1, MASP-2, C1r, and C1s reveals that the introduction of a relatively large charged side chain in place of the glycine residue has dramatic effects. The 3MC syndrome G666{197}E mutation significantly alters the position of the active site Ser 664{195}, drawing the functional hydroxyl group away from the catalytic His 497{57} to an inactive position and thereby eliminating any potential charge relay and subsequent proteolytic activity (Fig. 5A). Specifically, Glu 666{197} makes H-bonds with Asp 663{194}-N, Ser 664{195}-N, and Ser 664{195}-Oγ, forcing the active site loop into an α-turn conformation. This structure is distinctly different from the β-turn conformation observed in other zymogens (MASP-1, MASP-2, C1s, and C1r), where the β-turn is stabilized by a main chain H-bond between Asp{194}-O and Gly{197}-N (Fig. 5B). Furthermore, rather than interacting with the continuation of the D loop, Asp 663{194} is in a novel solvent-exposed conformation.

DISCUSSION
The lectin pathway of complement is an important component of the innate immune system of the body (1). The MASP-1 and MASP-2 enzymes of the pathway play clear roles in activation of the complement system via this pathway, with MASP-1 activating MASP-2 and playing a role in subsequent C2 cleavage, whereas MASP-2 is crucial for C4 cleavage and plays a major role in C2 cleavage as well (33). The role of MASP-3 in this system is presently contentious.
Iwaki et al. (34) presented findings to suggest that murine MASP-3 was able to cleave Factors B and D of the alternative pathway of complement and thus play a role in activation of this system. Recently, Degn et al. (35) showed that there was no alteration in the activation of the alternative pathway in serum from a patient deficient in both MASP-1 and MASP-3. It is therefore uncertain what relevance the cleavage of Factors B and D will have physiologically in humans, but it is worth noting that activating cleavage of these enzymes would be predicted to occur after the Arg residues in the P2-P2′ sequence KR-KI for Factor B and GR-IL for Factor D. This matches the cleavage between basic residues noted for the enzyme here and also the preference for a hydrophobic amino acid at P2′. In addition to these cleavages, the reported cleavage of insulin-like growth factor-binding protein-5 by MASP-3 (11) has been shown to occur at the P2-P2′ sites PR-II (major cleavage site), PK-IF, and PK-HT. The first cleavage matches the R-I P1-P1′ specificity noted here and the general preference for a hydrophobic amino acid at P2′; the second cleavage matches the Ile specificity at P1′ and the preference for an aromatic amino acid at P2′. In general, the cleavage specificity noted here is consistent with cleavages in proteins by MASP-3 reported in the literature.
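The comparison made in this paragraph can be laid out systematically by scoring each reported P2-P2′ site against the preferences found with the peptide library (Arg at P1, Ile at P1′, Phe or Tyr at P2′). The sketch below is an illustrative tabulation only. The set of residues counted as hydrophobic is an assumption made for the example, and the function name `annotate_site` is hypothetical.

```python
AROMATIC = set("FY")           # Phe or Tyr, the P2' preference named in the text
HYDROPHOBIC = set("AVILMFYW")  # assumption: one common working definition


def annotate_site(site):
    """Score a P2-P2' site written as 'P2P1-P1'P2'' (for example 'PR-II')."""
    left, right = site.split("-")
    p2, p1 = left
    p1_prime, p2_prime = right
    return {
        "P1_is_Arg": p1 == "R",
        "P1_is_basic": p1 in "RK",
        "P1prime_is_Ile": p1_prime == "I",
        "P1prime_is_basic": p1_prime in "RK",
        "P2prime_is_aromatic": p2_prime in AROMATIC,
        "P2prime_is_hydrophobic": p2_prime in HYDROPHOBIC,
    }


# Sites quoted in the text, keyed by the protein they come from.
SITES = {
    "Factor B": ["KR-KI"],
    "Factor D": ["GR-IL"],
    "IGFBP-5": ["PR-II", "PK-IF", "PK-HT"],
}

for protein, sites in SITES.items():
    for site in sites:
        matched = [name for name, hit in annotate_site(site).items() if hit]
        print(protein, site, matched)
```

Under these assumptions, PR-II and GR-IL satisfy both the P1 Arg and P1′ Ile preferences, KR-KI places a basic residue at P1′, and PK-IF matches Ile at P1′ with an aromatic residue at P2′, in line with the reasoning in the text.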
Recently, some light has possibly been shed on the function of MASP-3 due to findings that mutations to the enzyme or one of its lectin ligand recognition partners, CL-11, are associated with the 3MC syndrome (12,13). Work in this regard suggests that the enzyme and lectin molecule are involved in neural crest migration during early developmental processes (13). The molecular targets of the lectin and enzyme are currently unknown. Many of the mutations to the enzyme clearly will yield an inactive enzyme because they result in predicted truncation of the protein (12,13). Two of the other point mutations that are likely to result in full-length protein are also most likely to be inactive because one (H497{57}Y) targets the His residue involved in the catalytic triad of the enzyme and the other (C630{168}R) would most likely disrupt a disulfide bond vital to the structure of the enzyme. The effects of the other two point mutations in MASP-3 (G666{197}E and G687{216}R) associated with the syndrome were less certain, although molecular modeling has suggested that these might be inactivating due to their active site-associated location and the major change in amino acid side chain involved. Here we show that activated forms of these two mutants of MASP-3 associated with the 3MC syndrome were indeed unable to cleave a substrate that was optimal for wild type MASP-3, supporting the idea that the 3MC syndrome arises from a lack of MASP-3 activity. Furthermore, the x-ray crystal structure of the G666{197}E mutant enzyme provides an understanding of the structural basis for the inactivating effect of the 3MC-associated point mutation.
The physiological roles of MASP-3 remain to be found. It is clear that the enzyme plays a role in an important developmental process as evidenced by the severe effect that an inactivating mutation has in determining the phenotype observed. Elucidating the role of the enzyme in this system and in the complement system at maturity remain crucial objectives in understanding the biology of this enzyme.