Solution Structure and Dynamics of a Calcium Binding Epidermal Growth Factor-like Domain Pair from the Neonatal Region of Human Fibrillin-1*

Fibrillin-1 is a mosaic protein mainly composed of 43 calcium binding epidermal growth factor-like (cbEGF) domains arranged as multiple, tandem repeats. Mutations within the fibrillin-1 gene cause Marfan syndrome (MFS), a heritable disease of connective tissue. More than 60% of MFS-causing mutations identified are localized to cbEGFs, emphasizing that the native properties of these domains are critical for fibrillin-1 function. The cbEGF12–13 domain pair is within the longest run of cbEGFs, and many mutations that cluster in this region are associated with severe, neonatal MFS. The NMR solution structure of Ca2+-loaded cbEGF12–13 exhibits a near-linear, rod-like arrangement of domains. This observation supports the hypothesis that all fibrillin-1 (cb)EGF-cbEGF pairs, characterized by a single interdomain linker residue, possess this rod-like structure. The domain arrangement of cbEGF12–13 is stabilized by additional interdomain packing interactions to those observed for cbEGF32–33, which may help to explain the previously reported higher calcium binding affinity of cbEGF13. Based on this structure, a model of cbEGF11–15 that encompasses all known neonatal MFS missense mutations has highlighted a potential binding region. Backbone dynamics data confirm the extended structure of cbEGF12–13 and lend support to the hypothesis that a correlation exists between backbone flexibility and cbEGF domain calcium affinity. These results provide important insight into the potential consequences of MFS-associated mutations for the assembly and biomechanical properties of connective tissue microfibrils.

Epidermal growth factor-like (EGF) domains represent one of the most commonly identified protein modules in mosaic proteins (1,2). A subset of these domains contains a calcium binding (cb) consensus sequence, i.e. (D/N)X(D/N)(E/Q)Xm(D/N)*Xn(Y/F) (where m and n are variable, and * indicates a potential β-hydroxylation site) (3-5) (Fig. 1a). This type of EGF domain has been identified in many proteins including the human fibrillin and Notch family proteins, protein S, factor IX, and the low density lipoprotein receptor. Furthermore, genetic mutations that cause amino acid changes within cbEGFs in these proteins have been linked to a number of human diseases including the Marfan syndrome (MFS) (6), CADASIL (cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy) (7), Alagille syndrome (8), protein S deficiency (9), hemophilia B (10), and familial hypercholesterolaemia (11).
Here we describe solution nuclear magnetic resonance (NMR) structural and dynamics studies of the cbEGF12-13 domain pair from human fibrillin-1. Fibrillin-1, a major component of 10-12-nm connective tissue microfibrils (12), is composed mainly of multiple, tandem repeats of cbEGF domains (Fig. 1b). Over 300 mutations within the fibrillin-1 gene (FBN1) have been reported that are associated with MFS and related disorders (6,13). MFS is an inherited disorder estimated to affect ~1/5,000 in the population (reviewed in Ref. 14); symptoms vary from mild to life-threatening, and although the genotype-phenotype relationship remains elusive, a cluster of mutations in the region corresponding to exons 24-32 (encoding transforming growth factor β-binding protein-like domain-3 and cbEGF domains 11-18) has been found to be associated with the most severe forms of the disease, including neonatal MFS (nMFS). Mutations that produce a more moderate phenotype are, however, also found in this region (6,15).
The structure of cbEGF12-13 has been determined to assess the prediction that all tandem fibrillin-1 cbEGF domain pairs, when saturated with Ca2+, exhibit a rod-like conformation that may be required for microfibril organization (17-19). In addition, the spatial localization of MFS-causing missense mutations within this region has been identified, and structural consequences have been considered. An extended region of fibrillin-1 (cbEGF11-15) has been modeled to gain further insight into the molecular basis of the severe, neonatal MFS phenotype.
NMR backbone relaxation measurements were performed to highlight regions of the cbEGF12-13 pair with increased flexibility, which may indicate their involvement in protein-protein interactions. The data for cbEGF12-13 were compared with previous backbone dynamics measurements for the cbEGF32-33 domain pair (20). This analysis has provided information regarding variations in intrinsic dynamics of cbEGF domains along the length of human fibrillin-1, which are relevant to the biomechanical properties of connective tissue microfibrils and the phenotypic variability of MFS-associated mutations.

* This work was supported in part by the Medical Research Council (to R. S. S., P. A. H., and P. W.) and by the Wellcome Trust (to A. K. D., I. D. C., and J. M. W.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

EXPERIMENTAL PROCEDURES
The cbEGF12-13 domain pair from human fibrillin-1 includes residues Asp 1070-Ile 1154 of the intact molecule (numbering according to Ref. 21).
Sample Preparation-The cbEGF12-13 domain pair from human fibrillin-1 was expressed, refolded, and purified as described previously (22). 15N isotopically enriched samples were produced analogously using Escherichia coli strain BL21[pREP4], which was grown in minimal media with 15NH4Cl as the sole nitrogen source. Low resolution electrospray mass spectrometry was used to confirm the molecular weight of the produced protein (data not shown).
NMR samples contained 20 mM CaCl2 and 4.55 mM Tris at pH 6.5. At 20 mM CaCl2, in the absence of additional salt, saturation of both calcium binding sites of the pair was established based on chemical shift comparison with previous calcium binding studies (23). Furthermore, 2D HSQC spectra (24) recorded on a 15  The sample used to obtain NMR data for 15N backbone dynamics analysis contained ~1 mM 15N-labeled cbEGF12-13 in 4.55 mM Tris, 20 mM CaCl2 at pH 6.5. To ensure data were not affected by sample aggregation, measurements were also performed on a 1.6 mM cbEGF12-13 sample. The concentrations of samples were estimated from A280 measurements using ε280 = 3280 M⁻¹ cm⁻¹.
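The A280-based concentration estimate follows the Beer-Lambert relation, c = A/(εl). The sketch below illustrates the arithmetic with the extinction coefficient quoted above; the path length, dilution factor, and absorbance reading are assumptions chosen for illustration, since the text does not report them.

```python
# Beer-Lambert estimate of protein concentration from absorbance at 280 nm.
EPSILON_280 = 3280.0  # M^-1 cm^-1, extinction coefficient quoted in the text


def concentration_molar(a280, path_length_cm=1.0, dilution_factor=1.0):
    """Return the molar concentration c = A / (epsilon * l), corrected for dilution.

    path_length_cm and dilution_factor are illustrative parameters; the
    text does not state the cuvette or the dilution that was used.
    """
    if path_length_cm <= 0:
        raise ValueError("path length must be positive")
    return dilution_factor * a280 / (EPSILON_280 * path_length_cm)


# Hypothetical reading: A280 = 0.328 on a 10-fold diluted sample, 1 cm cuvette.
conc = concentration_molar(0.328, path_length_cm=1.0, dilution_factor=10.0)
print(f"{conc * 1e3:.2f} mM")
```

With these assumed inputs the estimate comes out near the ~1 mM concentration reported for the dynamics sample.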
NMR Experiments-Initial assignments were made using 3D gradient-enhanced 15N-separated NOESY-HSQC spectra (24,25), recorded at 15 and 33°C on a home-built/GE Omega spectrometer operating at 600 MHz. The spectrometer was fitted with a triple resonance probe with self-shielded pulsed field gradients. 3D NOESY spectra were recorded over ~3 days, with acquisition times of 102 ms in the direct 1H (F3) dimension, 10 ms in the 15N (F2) dimension, and within a range of 21.2 to 25.6 ms in the indirect 1H (F1) dimension. All NOESY spectra were recorded with a mixing time of 150 ms, and linear prediction was used to double the F2 acquisition time to 20 ms (26). Gradient-enhanced 15N-separated total correlation spectroscopy-HSQC spectra (24) were recorded on the same spectrometer at 15 and 33°C to enable identification of intraresidue cross-peaks. These spectra were recorded over approximately 3 days with the same acquisition times as for the NOESY, except for 16 ms in the indirect 1H (F1) dimension. Magnetization transfer was effected using an 11-kHz DIPSI-2 mixing sequence for 46 ms.
A 1H-15N heteronuclear multiple-quantum correlation-J (scalar coupling constant) spectrum (33) was recorded at 33°C and 500 MHz to allow derivation of 3J(HN-Hα) coupling constants. This spectrum was recorded with an acquisition time of 82 ms in the direct 1H (F2) dimension and 261 ms in the 15N (F1) dimension.
NH-exchange data in 2H2O were obtained by recording a series of 1H-15N HSQC spectra (24) at 33°C after a fully protonated cbEGF12-13 sample had been dissolved in a 2H2O solution (34). NH-exchange data in H2O at 33°C were obtained using the method of Böckmann & Guittet (35), using a 100-ms timescale for NH-exchange.
Relaxation data were acquired at 35°C and pH 6.5, allowing direct comparisons with results obtained for the cbEGF32-33 pair from human fibrillin-1 (19,20). Collection of the 15N-T1, 15N-T2, and 1H-15N heteronuclear NOE data at 11.7 and 17.6 T was carried out as described previously (20). In T1 and T2 experiments, acquisition times were 102.4 and 110.5 ms in the 1H (F2) and the 15 (15N), at 11.7 and 17.6 T, respectively. In experiments with NOE, 1H saturation was effected by means of a train of 120° flip-angle pulses at 10-ms intervals for 3 and 4.5 s at 11.7 and 17.6 T, respectively.
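For readers more used to spectrometers being named by proton frequency, the two field strengths can be converted with the Larmor relation ν = |γ|B0/2π. The sketch below is illustrative only; the gyromagnetic ratios are standard constants rather than values taken from this work.

```python
import math

# Gyromagnetic ratios in rad s^-1 T^-1 (standard physical constants).
GAMMA_1H = 267.522e6
GAMMA_15N = -27.116e6


def larmor_mhz(b0_tesla, gamma):
    """Return the magnitude of the Larmor frequency |gamma| * B0 / (2 pi) in MHz."""
    return abs(gamma) * b0_tesla / (2.0 * math.pi) / 1e6


for b0 in (11.7, 17.6):
    print(f"{b0} T: 1H {larmor_mhz(b0, GAMMA_1H):.0f} MHz, "
          f"15N {larmor_mhz(b0, GAMMA_15N):.1f} MHz")
```

The two fields correspond to proton frequencies of roughly 500 and 750 MHz.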
All spectra were processed using Felix 2.3 (MSI, Inc.) with mild resolution enhancement in both dimensions to optimize resolution while maintaining a good signal-to-noise ratio. Where applicable, all spectra recorded in one series were processed identically.
Spectral Assignment-Sequence-specific 1H and 15N chemical shift assignments were made using conventional methods (36,37) with the program NMRView, version 3.1.2 (38). 1H-15N spectral assignments for the cbEGF12-13 domain pair are shown in Fig. 2.

FIG. 1. Schematic illustration of the secondary structure and consensus sequence of the cbEGF12-13 domain pair from human fibrillin-1 (a) and the position of cbEGF12-13 mapped onto the domain organization of human fibrillin-1 (b).
In a, conserved cysteine residues are shown in light gray, and the calcium binding consensus sequence is shaded dark gray. β indicates a potential β-hydroxylation site. Point mutations in cbEGF12-13 associated with MFS are highlighted. Underlined and plain text mutations are known to cause neonatal and classic MFS, respectively. A double mutation is highlighted by asterisks. The G1127S and V1128I missense mutations, shown in italics, are associated with related disorders. Mutation data were obtained from the Marfan syndrome database on the World Wide Web (6,13,65,66). In b, the position of the neonatal region (as defined by mutation studies) is highlighted.
3J(HN-Hα) coupling constants were measured via line shape fitting to one-dimensional traces extracted from the 1H-15N heteronuclear multiple-quantum correlation-J (scalar coupling constant) spectrum. Backbone φ torsion angles were restrained with a minimum range of ±30° for 26 residues having small 3J(HN-Hα) coupling constants (<5.0 Hz) or large 3J(HN-Hα) values (>8.0 Hz). φ angle restraints were only included in the structure calculation process where consistent with initial structures calculated in their absence.
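The coupling-constant cutoffs above amount to a simple classification rule. The following sketch shows one way such a rule could be encoded; the cutoffs and the ±30° minimum range come from the text, whereas the central torsion angle values are conventional choices assumed here for illustration and are not stated in this work.

```python
def phi_restraint_class(j_hn_ha_hz, small_cutoff=5.0, large_cutoff=8.0):
    """Classify a 3J(HN-Halpha) coupling constant (Hz) using the cutoffs in the text.

    Returns "small", "large", or None when the coupling is too ambiguous
    to justify a backbone torsion angle restraint.
    """
    if j_hn_ha_hz < small_cutoff:
        return "small"
    if j_hn_ha_hz > large_cutoff:
        return "large"
    return None


# Assumed central angles (degrees); conventional choices, not taken from the text.
ASSUMED_CENTER = {"small": -60.0, "large": -120.0}
MIN_HALF_WIDTH = 30.0  # minimum +/- range quoted in the text


def phi_restraint(j_hn_ha_hz):
    """Return (lower, upper) torsion angle bounds in degrees, or None if unrestrained."""
    cls = phi_restraint_class(j_hn_ha_hz)
    if cls is None:
        return None
    center = ASSUMED_CENTER[cls]
    return (center - MIN_HALF_WIDTH, center + MIN_HALF_WIDTH)


# Hypothetical coupling constants for three residues.
for j in (4.2, 6.5, 9.1):
    print(j, phi_restraint(j))
```

Intermediate couplings (5.0 to 8.0 Hz) are left unrestrained because they are compatible with more than one backbone conformation.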
NH-exchange data in 2H2O and H2O were used to identify slowly exchanging backbone amide protons, and those that could be assigned to regions of regular secondary structure were restrained to form HN-CO hydrogen bonds, using two distance restraints, d(O-N) = 3.3 Å and d(O-HN) = 2.3 Å. In final structures, 24 experimentally derived distance restraints were used for 12 backbone hydrogen bonds in β-sheet structures. These restraints were incorporated into structure calculations only where consistent with both NOE data and initial structures calculated in their absence. Calcium atoms were constrained as described previously (19).
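Because each hydrogen bond contributes two distance restraints, 12 bonds give the 24 restraints quoted above. A minimal sketch of that bookkeeping follows; the tuple layout and the residue numbers are hypothetical and do not reflect the restraint file format or the actual hydrogen bonds used.

```python
D_O_N = 3.3   # Angstrom, acceptor O to donor N (value from the text)
D_O_HN = 2.3  # Angstrom, acceptor O to amide proton (value from the text)


def hbond_restraints(pairs):
    """Expand (donor_residue, acceptor_residue) pairs into distance restraints.

    Each hydrogen bond yields two restraints, so n bonds give 2n restraints.
    """
    restraints = []
    for donor, acceptor in pairs:
        restraints.append((acceptor, "O", donor, "N", D_O_N))
        restraints.append((acceptor, "O", donor, "HN", D_O_HN))
    return restraints


# Hypothetical donor/acceptor residue indices, for illustration only.
example_pairs = [(10, 21), (21, 10), (12, 19)]
for restraint in hbond_restraints(example_pairs):
    print(restraint)
```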
Structure Calculation and Refinement-Structures were calculated using ab initio simulated annealing from an extended template in XPLOR, version 3.81 (39) with version 4.01 topology and version 4.02 parameter files according to methods described previously (19). A final group of 25 was selected from 100 structures based on agreement with experimental data, with no distance violations greater than 0.3 Å, no angle violations (i.e. final values for restrained angles remained within the ±30° limits), and F_NOE less than 102 kJ mol⁻¹. An average coordinate structure was calculated and energy-minimized based on all residues (39), and the consensus secondary structure was derived using the program PROCHECK_NMR (40). A summary of the NMR structural statistics for the cbEGF12-13 domain pair, in terms of agreement with experimental data, is given in the PDB entry for the pair (1LMJ).
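The selection criteria can be read as a filter applied to each calculated structure. The sketch below is an illustration under stated assumptions: the record fields are hypothetical names (not XPLOR output keys), the example statistics are invented, and ranking the passing structures by F_NOE is an assumption, since the text does not say how the final 25 were ordered.

```python
from dataclasses import dataclass


@dataclass
class StructureStats:
    """Per-structure summary; field names are hypothetical, not XPLOR output keys."""
    label: str
    max_distance_violation: float  # Angstrom
    n_angle_violations: int
    f_noe: float                   # kJ mol^-1


def passes_criteria(s, max_dist=0.3, max_f_noe=102.0):
    """Apply the three acceptance criteria quoted in the text."""
    return (s.max_distance_violation <= max_dist
            and s.n_angle_violations == 0
            and s.f_noe < max_f_noe)


def select_final(structures, n_final=25):
    """Keep passing structures, lowest F_NOE first (the ranking is an assumption)."""
    passing = [s for s in structures if passes_criteria(s)]
    return sorted(passing, key=lambda s: s.f_noe)[:n_final]


# Hypothetical statistics for four calculated structures.
pool = [
    StructureStats("s001", 0.21, 0, 88.0),
    StructureStats("s002", 0.35, 0, 80.0),   # distance violation too large
    StructureStats("s003", 0.18, 1, 75.0),   # one angle violation
    StructureStats("s004", 0.29, 0, 101.5),
]
print([s.label for s in select_final(pool)])
```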
Analysis of Relaxation Data-T1 and T2 relaxation time constants were derived from two-parameter exponential fits to resonance intensities for all non-overlapped peaks. Errors were determined from the standard deviations of differences in the peak intensities in the two spectra that were recorded with the same relaxation delay (41). The heteronuclear NOE effect was calculated as the ratio of resonance intensities in spectra recorded with and without NOE. Errors were estimated from the signal-to-noise ratio of each spectrum.
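A two-parameter exponential decay has the form I(t) = I0 exp(-t/T). As a minimal sketch of how T1 or T2 can be extracted from peak intensities, the code below fits this form by linear least squares on ln(I); this is a simplification substituted here for illustration and is not necessarily the fitting procedure used in this work. The decay data are synthetic.

```python
import math


def fit_exponential(delays, intensities):
    """Fit I(t) = I0 * exp(-t / T) and return (I0, T).

    Uses linear least squares on ln(I); this is a simplification of a
    direct nonlinear fit and requires strictly positive intensities.
    """
    n = len(delays)
    if n < 2 or n != len(intensities):
        raise ValueError("need at least two matched points")
    ys = [math.log(i) for i in intensities]
    mean_t = sum(delays) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in delays)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(delays, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -1.0 / slope


def hetero_noe(intensity_saturated, intensity_reference):
    """Heteronuclear NOE as the intensity ratio with and without 1H saturation."""
    return intensity_saturated / intensity_reference


# Synthetic, noise-free decay with T = 0.5 s (illustrative values only).
delays = [0.02, 0.1, 0.2, 0.4, 0.8]
data = [1000.0 * math.exp(-t / 0.5) for t in delays]
i0, t_relax = fit_exponential(delays, data)
print(round(i0, 3), round(t_relax, 3))
```

With noisy data a weighted or nonlinear fit is generally preferable, because the logarithm distorts the error distribution of weak peaks.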
A robust estimate of the diffusion tensor was obtained by selecting residues with NOE > 0.65 and with T1 and T2 values within a Lipari-Szabo diffusion model of S² < 1.0 using the average minimized structure. These residues did not exhibit significant line broadening at either field strength (41-43). Removal of a number of residues on statistical grounds had only minor effects on the diffusion tensor and led to a self-consistent interpretation of the data. Errors were estimated using 500 Monte Carlo simulations. The spectral densities were derived for a sphere (D_xx = D_yy = D_zz; D = (D_xx + D_yy + D_zz)/3), a symmetric top (D∥ = D_zz and D⊥ = ½(D_xx + D_yy)), and a fully asymmetric tensor with principal values D_xx, D_yy, and D_zz (44) using dipolar and chemical shift anisotropy relaxation with the usual fundamental constants, a chemical shift anisotropy of −170 ppm (45), and an NH bond length of 1.02 Å. Three angles denote the orientation of the tensors with respect to the inertia frame. Inclusion of an angle of 0 to +40° between the dipolar and chemical shift anisotropy tensor (43) had no statistically significant effect on the results.
Internal dynamics were analyzed using the extended model-free approach (46-48) as implemented in Model-free4 (41,42). The procedures used have been described previously (49). In addition, line broadening was confirmed based on the field dependence of the line width at 11.7 versus 17.6 T (50).
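The tensor quantities defined above reduce to a few lines of arithmetic. The sketch below computes the isotropic diffusion constant, the symmetric top components, the axial ratio 2D_zz/(D_xx + D_yy), and the isotropic correlation time τ_c = 1/(6D). The principal values used are hypothetical and were chosen only so that the axial ratio equals 1.9; they are not the fitted values from this work.

```python
def isotropic_d(dxx, dyy, dzz):
    """Isotropic diffusion constant D = (Dxx + Dyy + Dzz) / 3."""
    return (dxx + dyy + dzz) / 3.0


def symmetric_top(dxx, dyy, dzz):
    """Return (D_parallel, D_perpendicular) for an axially symmetric tensor."""
    return dzz, 0.5 * (dxx + dyy)


def axial_ratio(dxx, dyy, dzz):
    """Axial ratio 2 Dzz / (Dxx + Dyy); values above 1 indicate a prolate shape."""
    d_par, d_perp = symmetric_top(dxx, dyy, dzz)
    return d_par / d_perp


def isotropic_tau_c(dxx, dyy, dzz):
    """Isotropic rotational correlation time tau_c = 1 / (6 D), in seconds."""
    return 1.0 / (6.0 * isotropic_d(dxx, dyy, dzz))


# Hypothetical principal values in s^-1, for illustration only.
dxx, dyy, dzz = 2.0e7, 2.0e7, 3.8e7
print(f"axial ratio = {axial_ratio(dxx, dyy, dzz):.2f}")
print(f"tau_c = {isotropic_tau_c(dxx, dyy, dzz) * 1e9:.1f} ns")
```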
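For orientation, the original (non-extended) Lipari-Szabo spectral density for isotropic tumbling is J(ω) = (2/5)[S²τ_m/(1 + ω²τ_m²) + (1 − S²)τ/(1 + ω²τ²)], with 1/τ = 1/τ_m + 1/τ_e. The sketch below evaluates this simpler form only; it is not the extended model or the axially symmetric treatment applied in this work, and the correlation times in the example are assumed values.

```python
import math


def lipari_szabo_j(omega, s2, tau_m, tau_e):
    """Lipari-Szabo spectral density J(omega) for isotropic tumbling.

    omega in rad/s; tau_m (overall) and tau_e (internal) in seconds;
    s2 is the squared generalized order parameter (0 to 1).
    """
    if not 0.0 <= s2 <= 1.0:
        raise ValueError("S2 must lie between 0 and 1")
    tau = 1.0 / (1.0 / tau_m + 1.0 / tau_e)  # effective internal correlation time
    overall = s2 * tau_m / (1.0 + (omega * tau_m) ** 2)
    internal = (1.0 - s2) * tau / (1.0 + (omega * tau) ** 2)
    return 0.4 * (overall + internal)


# Assumed values: tau_m = 6 ns, tau_e = 50 ps; 15N frequency of about 50.7 MHz.
omega_n = 2.0 * math.pi * 50.7e6
for s2 in (1.0, 0.83, 0.5):
    print(s2, lipari_szabo_j(omega_n, s2, 6e-9, 50e-12))
```

Lower order parameters shift spectral density from the overall tumbling term to the fast internal term, which is what makes S² a measure of the amplitude of fast motion.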

RESULTS
cbEGF12-13 Structure-1H-15N HSQC cross-peak assignments for fibrillin-1 cbEGF12-13 are annotated in Fig. 2. The 25 final NMR structures for the cbEGF12-13 pair are shown overlaid on the average structure in Fig. 3a, and the consensus secondary structure is illustrated schematically in Fig. 3b. The structure of the cbEGF12-13 domain pair is a near-linear, rod-like arrangement of two domains, with each domain comprising a major and minor region of double-stranded antiparallel β-sheet. However, the conformation of the minor β-sheet of cbEGF13 is non-ideal. This is most likely because of the absence of the C-terminal cbEGF14 domain, which would be likely to stabilize the fold of this region through interdomain packing interactions (19), and the presence of a proline in this region (Pro 1148). In addition, cbEGF13 contains a short α-helical region, which is also identified in 8/25 of the models for cbEGF12 using PROCHECK_NMR (40).
The near-linear orientation of the two domains is maintained by calcium binding to the C-terminal domain and by interdomain hydrophobic packing interactions. These packing interactions are mainly analogous to those observed previously in the cbEGF32-33 domain pair (19). A conserved aromatic residue, Tyr 1101, at the open end of the minor β-sheet of cbEGF12, packs against the top of the major β-sheet of cbEGF13. The main packing interaction involves the side chains of Tyr 1101 and Gly 1134 and the methylene groups of Glu 1133 (Fig. 3b). In cbEGF12-13, however, the methylene groups of Arg 1083 are also involved. These do not form interdomain contacts but pack against the side chain of Tyr 1101. The participation of Arg 1083 in interdomain packing may relate to increased calcium binding affinity for cbEGF13 (23) relative to cbEGF33 (51) because of a more stable binding site in cbEGF13.
Each domain of the cbEGF12-13 domain pair is well defined. The backbone (Cα, C, N) root mean square deviation values to the average coordinates, based on regions of secondary structure, are 0.37 ± 0.09 and 0.31 ± 0.06 Å for cbEGF12 and cbEGF13, respectively. The loop region between cysteines 5 and 6 is one residue shorter in cbEGF13 and contains three proline residues (Pro 1141, Pro 1142, and Pro 1148). This region contains no prolines in cbEGF12, and the tip of this extended loop, consisting of residues Gly 1104-Asn 1110 (GFMMMKN), is not well defined in the NMR-derived models.
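A root mean square deviation to the average coordinates, reported as mean ± standard deviation over the ensemble, can be computed as sketched below. The sketch assumes the models have already been superimposed on a common frame and uses toy coordinates; it is an illustration of the statistic, not the software used for the values quoted above.

```python
import math


def mean_coordinates(models):
    """Average each atom position over all models (lists of (x, y, z) tuples)."""
    n_models = len(models)
    n_atoms = len(models[0])
    return [tuple(sum(m[i][k] for m in models) / n_models for k in range(3))
            for i in range(n_atoms)]


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized coordinate sets."""
    sq = sum((a[k] - b[k]) ** 2 for a, b in zip(coords_a, coords_b) for k in range(3))
    return math.sqrt(sq / len(coords_a))


def ensemble_rmsd(models):
    """Return (mean, sample standard deviation) of per-model RMSD to the average."""
    avg = mean_coordinates(models)
    values = [rmsd(m, avg) for m in models]
    mean = sum(values) / len(values)
    if len(values) < 2:
        return mean, 0.0
    var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return mean, math.sqrt(var)


# Toy ensemble: two pre-superimposed models of a two-atom fragment.
toy = [[(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
       [(0.2, 0.0, 0.0), (1.5, 0.4, 0.0)]]
print(ensemble_rmsd(toy))
```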
To assess the similarity of the cbEGF12-13 and cbEGF32-33 structures, tilt and twist angles for cbEGF12-13 were calculated using the program mod2 (52) according to methods described previously (19). The tilt and twist angles for cbEGF12-13 are 30 ± 15 and 152 ± 13°, respectively. Corresponding tilt and twist angles of 18 ± 6 and 159 ± 6° were reported previously (19) for the cbEGF32-33 domain pair. Therefore the two domains adopt a very similar extended arrangement in the two constructs. Although in Fig. 3 the tip of the cbEGF12 major β-sheet appears to adopt a more tilted conformation than that seen for cbEGF32, comparison of the orientation of the cbEGF "core" regions (as defined in Ref. 19) suggests that this may not be a significant structural difference and might result from the dissection of the domain pair from the intact protein.
cbEGF12-13 Dynamics-The shape and internal dynamics of calcium-saturated cbEGF12-13 were determined using 15N relaxation data recorded at 11.7 T. The experimental T1 and T2 values for each residue, overlaid on parametric curves of T1 and T2 as a function of correlation time and order parameter, S², are shown in Fig. 4a. The majority of T1 and T2 values cluster in a small region of this plot, indicating that their relaxation properties can be described by overall diffusion of the molecule. There were no systematic differences between the correlation times determined from residues of either cbEGF12 or cbEGF13, suggesting that these domains tumble as a single unit. Residues with T2 values outside the diffusion model appear to be affected by slow internal motion (Fig. 4a).
The results of the determination of the diffusion tensor of cbEGF12-13 are summarized in Table I. It can be seen that the best fit to the data was obtained using a prolate, symmetric top model with an axial ratio 2D_zz/(D_xx + D_yy) of 1.9. The unique axis of the diffusion tensor, D_zz, is aligned with the symmetry axis of the molecule. A fully asymmetric tensor was statistically not justified, reflecting the near-degeneracy of the short axes of the diffusion tensor for cbEGF12-13. The axial ratio and orientation of the diffusion tensor confirm the elongated shape of the module pair.
The average 1H-15N NOE values for residues in regions of secondary structure of 0.69 and 0.77 at 11.7 and 17.6 T, respectively, are consistent with a molecule in the slow tumbling limit (Fig. 4b). Significantly reduced NOEs are observed for residues at the N terminus of cbEGF12, before the start of the major β-sheet of this domain; for the disordered loop in cbEGF12, residues Gly 1104-Lys 1109, located in the turn joining the strands of the minor β-sheet; and for residues in the minor β-sheet (Gln 1145-Ile 1154) of cbEGF13. A reduced NOE value was not seen for the single linker residue (Met 1112) of the cbEGF12-13 domain pair, indicating, in agreement with cbEGF32-33 data (20), that fibrillin-1 cbEGF domain pairs possess a rigid interdomain linker when saturated with Ca2+.
The model-free approach was used to quantitatively describe the internal dynamics of cbEGF12-13 using the T1, T2, and NOE data at 11.7 T (46-48). The average order parameter for all residues in secondary structure is ⟨S²⟩ = 0.83, suggesting that fast motions only have small amplitudes.

FIG. 4. 15N relaxation data of cbEGF12-13. a, experimentally determined T1 and T2 at 11.7 T (filled circles), together with T1 and T2 values (lines) calculated as a function of correlation time and S² using the Lipari-Szabo model (46).
The two residues involved in interdomain packing, Glu 1133 and Gly 1134, have high order parameters of 0.88. These data, combined with the fact that the single interdomain linker residue, Met 1112, and the aromatic residue involved in interdomain packing, Tyr 1101, have above average 15N-NOE values, indicate a well defined domain-domain interface. This is consistent with the similarity of the isotropic correlation times of the individual domains of cbEGF12-13 and the large axial ratio of the diffusion tensor.
Exchange terms of >1 Hz were required for ten and five residues of cbEGF12 and cbEGF13, respectively, and line broadening was confirmed by comparison of the relaxation data at 11.7 and 17.6 T. In the model-free analysis, both domains are affected by exchange, with the largest terms derived for the disulfide bonded cysteines and adjacent residues. This effect was observed previously to a greater extent in cbEGF32-33. These observations suggest that disulfide bond isomerization may play a role in the slow dynamics of this domain pair as reported for bovine pancreatic trypsin inhibitor (53,54).
Residues involved in and around the 1-3 disulfide bond of cbEGF12 (Cys 1074, Cys 1086, Val 1087, Asn 1088) have significant exchange contributions. In addition, Cys 1081 and Cys 1095 of cbEGF12, which form the 2-4 disulfide bond, have R_ex terms of 2.9 and 1.0 Hz, respectively, as well as significant τ_e terms. In agreement with data for cbEGF32 (20), the 2-4 disulfide bond may also be affected by motions on the μs to ms timescale. The combined motions of the 1-3 and 2-4 disulfide bonds appear to influence the dynamic behavior of Arg 1083, in between the second and third cysteines. This residue has the largest exchange term in cbEGF12, and it may act as a structural pivot point. In cbEGF13 the largest exchange contributions are observed for the 2-4 disulfide bond between Cys 1124 and Cys 1138. Although Cys 1140 does not require an exchange term, Cys 1153, the C-terminal cysteine, has an R_ex value (1.7 Hz), suggesting greater flexibility toward the tail of the construct, as was also observed for cbEGF33.
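The reason a two-field comparison can confirm exchange broadening is that, in the fast-exchange limit, the exchange contribution to the transverse relaxation rate scales with the square of the static field, whereas other contributions scale differently. The sketch below computes that expected scaling; the fast-exchange assumption and the example R_ex value are illustrative and are not taken from this work.

```python
def rex_scale_factor(b0_low, b0_high):
    """Expected ratio of exchange terms at two fields, assuming fast exchange.

    In the fast-exchange limit the exchange contribution is proportional
    to the square of the chemical shift difference in frequency units,
    and therefore to B0 squared.
    """
    return (b0_high / b0_low) ** 2


def predicted_rex(rex_low_field_hz, b0_low=11.7, b0_high=17.6):
    """Predict the exchange term (Hz) at the higher field from the lower-field value."""
    return rex_low_field_hz * rex_scale_factor(b0_low, b0_high)


# Hypothetical exchange term of 2.0 Hz at 11.7 T.
print(f"scale factor = {rex_scale_factor(11.7, 17.6):.2f}")
print(f"predicted R_ex at 17.6 T = {predicted_rex(2.0):.1f} Hz")
```

Residues whose line widths increase by roughly this factor between the two fields are good candidates for genuine exchange broadening; slower exchange regimes give a weaker field dependence.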
In cbEGF12, Phe 1105 and Met 1107, within the disordered loop comprising Gly 1104-Asn 1110, require small and large exchange terms, respectively. Many of the residues in this loop also require τ_e terms, confirming the flexibility of this region. In cbEGF13, R_ex information was obtained for residues of the minor β-sheet, i.e. Ser 1147 and Ala 1152-Ile 1154, with these residues also requiring τ_e terms. The minor β-sheet of cbEGF13 is a highly dynamic structure, comprising residues manifesting reduced NOE values, as well as significant R_ex and τ_e terms. These motions may partially explain the non-ideal structure of this β-sheet. Taken together, the relaxation data indicate that the central region of the pair construct is the most ordered on the ps-ms timescale.

DISCUSSION
Comparison of the structures of cbEGF12-13 and cbEGF32-33 from fibrillin-1 validates the proposal that fibrillin-1 (cb)EGF-cbEGF domain pairs, with one residue in their interdomain linker, adopt a rigid, rod-like structure when saturated with Ca2+ (19). It is also interesting to compare the structure of the low density lipoprotein receptor EGF-AB pair, another Class I cbEGF domain pair (55,56). Tilt and twist angles measured for this construct using identical methods are 27 ± 6 and 168 ± 5° (56), compared with 30 ± 15 and 152 ± 13° for fibrillin-1 cbEGF12-13 and 18 ± 6 and 159 ± 6° for cbEGF32-33. All three of these domain pairs adopt a very similar cbEGF orientation, which lends further support to the hypothesis that the Class I consensus sequence defines a conserved domain architecture (19).
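One crude way to see that the three sets of angles agree is to check whether the reported mean ± uncertainty ranges overlap pairwise. The sketch below does this with the values quoted above; it is a simple range comparison offered for illustration, not a statistical test, and it is not the analysis performed in this work.

```python
def reported_range(mean, spread):
    """Return the (low, high) range given by mean +/- reported uncertainty."""
    return (mean - spread, mean + spread)


def ranges_overlap(a, b):
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Tilt and twist angles in degrees, (mean, uncertainty), as quoted in the text.
ANGLES = {
    "fibrillin-1 cbEGF12-13": {"tilt": (30.0, 15.0), "twist": (152.0, 13.0)},
    "fibrillin-1 cbEGF32-33": {"tilt": (18.0, 6.0), "twist": (159.0, 6.0)},
    "LDL receptor EGF-AB": {"tilt": (27.0, 6.0), "twist": (168.0, 5.0)},
}


def all_pairs_overlap(angles, kind):
    """Check every pair of constructs for overlapping ranges of one angle type."""
    names = list(angles)
    return all(
        ranges_overlap(reported_range(*angles[a][kind]),
                       reported_range(*angles[b][kind]))
        for i, a in enumerate(names) for b in names[i + 1:]
    )


for kind in ("tilt", "twist"):
    print(kind, all_pairs_overlap(ANGLES, kind))
```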
Analysis of backbone relaxation data for holo-cbEGF32-33 suggested a correlation between backbone flexibility and calcium binding affinity. This is supported by the relaxation data for cbEGF12-13. For both tandem cbEGF domain pairs, the N-terminal domain binds calcium more weakly than the C-terminal domain, and residues in the N-terminal half of the first domain are affected by μs to ms timescale motions. Comparison of dynamics for the N-terminal halves of cbEGF12 and cbEGF32 shows that residues from this region of cbEGF12 have larger S² values and smaller R_ex terms than the corresponding residues of cbEGF32. The affinities of the N- and C-terminal domains of cbEGF12-13 for Ca2+ are significantly higher than those observed for cbEGF32-33, which correlates with the increased anisotropy of the diffusion tensor for cbEGF12-13 (1.9 versus 1.6). The less than expected anisotropy of the diffusion tensor of cbEGF32-33 was primarily attributed to significant fluctuations in the N-terminal cbEGF32 domain (19,20). The reduced internal dynamics in cbEGF12, together with the increased anisotropy of cbEGF12-13, indicate that the N-terminal half of cbEGF12 in cbEGF12-13 has a better defined structure than cbEGF32 in cbEGF32-33.
Footnotes to Table I: b Values in degrees. c χ² is given per residue. d The probability, Q, for obtaining a better fit by chance when using a more complex model was calculated for the isotropic and axially symmetric (prolate) models (Q1) and for the axially symmetric (prolate) and asymmetric models (Q2).

The fact that several residues of the major β-sheet of cbEGF12 have significant R_ex terms, an effect not seen for cbEGF13, suggests that the lower calcium binding affinity observed for this domain relative to cbEGF13 (~1.6 mM versus <30 μM) may be a result of slow dynamic processes that compromise the formation of a well defined binding site (23). It is noteworthy that cbEGF32 in cbEGF32-33, which has a calcium binding affinity more than five times weaker than cbEGF12 (~9 mM) (51), has R_ex terms with an upper limit of 30 Hz (20) compared with an upper limit of 10 Hz for cbEGF12. The lack of significant exchange contributions, where data were available, for residues in the minor β-sheet of cbEGF12 and the major β-sheet of cbEGF13 suggests that cbEGF13 is able to form a well defined calcium binding site, explaining the relatively high calcium binding affinity observed for this domain. Model-free analysis was performed for cbEGF12-13 Ca2+ ligands Glu 1073, Asn 1088, Thr 1089, Asp 1113, Ile 1114, Glu 1116, and Asn 1131, and none of these residues require an exchange term >1 Hz apart from Asn 1088 (2.5 Hz). These observations are consistent with calcium saturation. Saturation was similarly established for cbEGF32-33 (20).
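The dissociation constants quoted above can be related to site occupancy with a single-site binding isotherm, fraction bound = [Ca2+]/(Kd + [Ca2+]). The sketch below is a rough illustration under two assumptions that are not made in the text: each site binds independently, and the free Ca2+ concentration is approximated by the total 20 mM present in the NMR samples, which is reasonable only because calcium is in large excess over protein.

```python
def fraction_bound(free_ca_molar, kd_molar):
    """Single-site occupancy [Ca] / (Kd + [Ca]) for independent binding."""
    if free_ca_molar < 0 or kd_molar <= 0:
        raise ValueError("concentrations must be physically meaningful")
    return free_ca_molar / (kd_molar + free_ca_molar)


def fold_weaker(kd_weak_molar, kd_strong_molar):
    """How many times weaker one site binds than another, as a ratio of Kd values."""
    return kd_weak_molar / kd_strong_molar


CA_TOTAL = 20e-3  # M, CaCl2 concentration in the NMR samples (from the text)

# Kd values quoted in the text; the cbEGF13 value is an upper limit.
print(f"cbEGF12 occupancy: {fraction_bound(CA_TOTAL, 1.6e-3):.3f}")
print(f"cbEGF13 occupancy (lower bound): {fraction_bound(CA_TOTAL, 30e-6):.4f}")
print(f"cbEGF32 versus cbEGF12: {fold_weaker(9e-3, 1.6e-3):.1f}-fold weaker")
```

Because the cbEGF13 Kd is only an upper limit, its computed occupancy is a lower bound; the model also ignores any coupling between the two sites.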
An extended loop structure in cbEGF12 between cysteines 5 and 6 was observed to be highly solvent accessible and relatively unstructured, with reduced 1 H-15 N heteronuclear NOE values indicative of increased flexibility. Because protein binding interactions usually involve at least one flexible component, this region may be important for intra-or intermolecular contacts. It could be directly involved in microfibril assembly, i.e. directly interacting with itself or other fibrillin-1 monomers, or it may be involved in the interaction of non-homologous proteins/proteoglycans with the 10 -12-nm microfibrils. The cbEGF6 domain from thrombomodulin is the only other cbEGF of known structure with a large 13-residue loop connecting cysteines 5 and 6, and a residue in this region (Asp 461 ) has been shown to be involved in complex formation with thrombin, suggesting a potential role for the fibrillin loop in proteinprotein interactions (57). Similarly, the cbEGF domain from C1r has an unusually large loop region connecting cysteines 1 and 2. This region does not possess a unique structure and has therefore been proposed to play a role in domain-domain interactions in C1r or in protein-protein interactions within the C1 complex (58). It is also interesting to note that CD55 ligand binding activity of two highly homologous cell surface EGFcontaining proteins, CD97 and EMR2, is altered by at least an order of magnitude by just three amino acid changes within the ligand binding EGF-cbEGF-cbEGF region. One of the changes, Thr 3 Met, is located within a 17-residue loop between cysteines 5 and 6, suggesting that this region may play a role in protein-protein interactions (59).
Backbone dynamics studies of cbEGF12-13, combined with earlier results for cbEGF32-33 (20), provide insight into the role of calcium in maintaining the observed rod-shaped architecture of connective tissue microfibrils (17,18). In the absence of a N-terminally linked domain, the N-terminal region of cbEGF12 exhibits conformational exchange on the s to ms time-scale, an effect not observed for the N-terminal region of cbEGF13. These results are in agreement with those for cbEGF32-33. In addition, both the cbEGF13 and cbEGF33 domains present a systematic decrease in heteronuclear NOE values toward their C terminus, indicative of fast motions, which suggests that this region of cbEGF domain pairs may be sensitive to calcium binding and/or pairwise domain interactions. Hence, in the presence of calcium, the most ordered region of cbEGF domain pairs investigated to date is localized to the interdomain interface (this study) (20), which may explain why cbEGF domains are usually present in proteins as multiple tandem repeats (2).
Missense mutations within cbEGF12-13 that have been associated with MFS are highlighted in Fig. 1a, and these mutations can be classified into three groups depending on the residue affected (19). Mutations affecting cysteine residues are likely to alter disulfide bond formation, thereby disrupting the correct fold, and mutations affecting residues in the calcium binding consensus sequence are likely to reduce calcium binding affinity, leading to structural destabilization. Of the remaining missense mutations in this domain pair, G1127S has been shown by NMR studies to impair folding of cbEGF13 (60,61), possibly because of the exchange of Gly for a less flexible residue at the start of the major ␤-hairpin. S1077P may also affect domain folding, because this amino acid change results in a Pro-Pro sequence between the first and second cysteines of cbEGF12, which would limit the conformational flexibility of this region. R1137P may also affect folding by distortion of the major ␤-sheet of cbEGF13. Because Val 1128 is localized to the surface of the domain pair, it is not clear why the relatively conservative substitution, V1128I, also produces a disease phenotype.
Within the neonatal region of fibrillin-1, missense mutations that affect structurally analogous calcium ligands produce varying phenotypes (23). In addition there is wide variation in clinical phenotypes even when different ligands within cbEGF13 coordinating the same Ca 2ϩ are substituted (for example, D1113G-classic, N1131Y-nMFS). To clarify the molecular basis of these differences, the positions of mutations in cbEGF11-15 were assessed using a model that was created using methods described previously (19). The model was created using the coordinates of the cbEGF12-13 pair, rather than the cbEGF32-33 pair, to maximize the accuracy of atomic coordinates in the region of domains 12-13. The global structures of this five domain model and the one reported previously are highly similar and both manifest extended rod-like conformations. Cysteine mutations were not considered in this analysis, because they are likely to affect protein folding. As shown in Fig. 5, a rigid, rod-like structure is predicted for the cbEGF11-15 region of fibrillin-1. The relative spatial disposition of mutations associated with different phenotypes shows that changes to cbEGF12 calcium ligands produce severe effects. These mutations could, as a result of defective calcium binding, decrease the anisotropy and increase the intrinsic flexibility of the neonatal region of fibrillin-1. A more compressed, flexible structure could distort a potential binding interface, which may affect the microfibril assembly process and/or interactions with non-homologous components of the microfibrils by making binding energetically less favorable.
Interestingly, the substitution of Asn 1131 by a bulkier tyrosine, which would be predicted to result in a conformational change of the major β-hairpin of cbEGF13, is associated with a more severe phenotype than the less disruptive D1113G change. Based on the observation that three missense mutations with no clear structural consequences, K1043R, I1048T, and V1128I, cluster on one face of the model, opposite the potential N-glycosylation sites, it is interesting to speculate that these residues may form part of a molecular interface. The extended, flexible loop region between cysteines 5 and 6 of cbEGF12 may also localize to this face of the model and participate in protein-protein interactions. Further studies will be necessary to prove or disprove the theory that this region is involved in microfibril assembly.
[Figure legend fragment: An asterisk indicates a double mutation that was identified. It is currently unknown whether these were on the same or different alleles. Only noncysteine mutations were considered here, because cysteine mutations are likely to affect protein folding rather than protein-protein interactions.]
The dynamic behavior of cbEGF12-13 and cbEGF32-33 (20) highlights the importance of calcium in determining the overall shape of these domains in fibrillin-1 and subsequently the 10-12-nm connective tissue microfibrils. An MFS-causing mutation in a cbEGF domain that affects a calcium ligand, or reduces calcium binding affinity indirectly, could increase flexibility of a localized region of fibrillin-1. Mutation of a residue that participates in interdomain packing could also have this effect, which may alter microfibril assembly or the integral properties of microfibrils. Dynamic changes could also result in the production of an MFS phenotype because of increased susceptibility of fibrillin-1 and/or microfibrils to proteolysis (62). It has been demonstrated in vitro that missense mutations that change calcium binding ligands cause increased proteolytic susceptibility of recombinant fibrillin fragments (62)(63)(64).
In summary, the structure of the cbEGF12-13 pair has validated the hypothesis that fibrillin-1 (cb)EGF-cbEGF domain pairs adopt a rod-like structure and has shed light on the plasticity of pairwise cbEGF domain interactions. These results have been used to examine the spatial distribution of MFS-associated mutations in the region comprising cbEGF11-15, within the neonatal region of fibrillin-1. Insights gained from the structure of the cbEGF12-13 domain pair and the cbEGF11-15 model will provide a basis for future functional studies.
Backbone dynamics studies of the calcium-saturated cbEGF12-13 and cbEGF32-33 pairs have highlighted a dynamic signature for fibrillin-1 cbEGF domain pairs. This signature includes μs to ms fluctuations for residues in the N-terminal half of the N-terminal domain, including at least the first disulfide bond, and fast (ps to ns) motions for residues of the minor β-sheet of the C-terminal domain. It also includes a rigid interdomain interface and linker residue, with the central region of the domain pair forming a common dynamic unit.
Taken together with previous calcium binding studies, these results support a correlation between backbone dynamics and calcium binding affinity. MFS-causing mutations along the length of fibrillin-1 that result in defective calcium binding to tandem cbEGF domains and/or modify interdomain packing interactions are therefore likely to produce a less extended, more flexible structure for a region of fibrillin-1, which could increase proteolytic susceptibility and/or distort potential protein binding sites. The severity of the disease phenotype produced will depend on both the nature of the fibrillin-1 defect and its location within the fibrillin-1 monomer.