Structure of the globular tail of nuclear lamin.

The nuclear lamins form a two-dimensional matrix that provides integrity to the cell nucleus and participates in nuclear activities. Mutations in the region of human LMNA encoding the carboxyl-terminal tail of lamin A/C are associated with forms of muscular dystrophy and familial partial lipodystrophy (FPLD). To help discriminate tissue-specific phenotypes, we have solved at 1.4-Å resolution the three-dimensional crystal structure of the lamin A/C globular tail. The domain adopts a novel, all-β immunoglobulin-like fold. FPLD-associated mutations cluster within a small surface, whereas muscular dystrophy-associated mutations are distributed throughout the protein core and on its surface. These findings distinguish myopathy- and lipodystrophy-associated mutations and provide a structural framework for further testing hypotheses concerning lamin function.

Emery-Dreifuss muscular dystrophy (EDMD), limb-girdle muscular dystrophy (LGMD), and dilated cardiomyopathy with conduction system disease (DCM-CD) were originally described as distinct syndromes, although considerable overlap had been recognized (9,10). Clinical variability consistent with all three diagnoses can even be seen within family members carrying the same mutant Lmna allele (10), suggesting that the myopathies associated with Lmna mutations represent a clinical spectrum rather than distinct disorders.
Dunnigan-type familial partial lipodystrophy (FPLD), on the other hand, represents a distinct clinical disorder that also maps to the Lmna gene (11,12). Like the myopathies, FPLD is tissue-specific, but adipose tissue is affected as opposed to muscle. A loss of subcutaneous fat in the torso and extremities begins around puberty. Fat may accumulate instead in the face and neck. Systemic manifestations of FPLD include hypertriglyceridemia and hyperinsulinemia associated with a syndrome of severe insulin resistance. Patients with FPLD do not have skeletal muscle or cardiac findings associated with the hereditary myopathies. Conversely, patients with EDMD, LGMD, or DCM-CD have normal fat distribution and insulin sensitivity, raising the intriguing question of how specific mutations in Lmna lead to clinically distinct phenotypes that affect either muscle or fat, but not both.
Lamins A and C are encoded by Lmna (13), whereas lamins B1 and B2 are encoded by distinct genes, Lmnb1 and Lmnb2. The lamins form the nuclear lamina, a protein meshwork that lines the inner leaflet of the nuclear membrane, thus maintaining the structural integrity of the nucleus. Lamins also apparently bind chromatin-associated histones and additional nuclear proteins such as emerin (14,15). A central α-helical rod domain in the lamins has high homology with the intermediate filament (IF) proteins, including the keratins, vimentin, and desmin. All other IFs are cytoplasmic, whereas lamins are restricted to the nucleus. The lamin and other IF protomers dimerize into coiled coils (via their central rod domains) that further assemble into larger filaments (16). The intertwining of lamin filaments and their carboxyl-terminal segments distinguish them from other IFs. The carboxyl-terminal regions appear by transmission electron microscopy to form globular domains (16).
Frameshift and nonsense mutations in Lmna cause muscular dystrophy. These are severe mutations in terms of encoded protein function. Similarly, Lmna−/− mice exhibit a muscular dystrophy-like syndrome associated with retarded growth and early death (17), whereas adipose tissue, insulin sensitivity, and lipid profiles in Lmna−/− mice are nearly normal (18). Thus null alleles and severe loss of function appear to affect muscle rather than fat. Missense mutations occurring throughout the coiled coil domain are likewise associated with the inherited myopathies and not FPLD (Fig. 1A). These are likely to prevent dimerization and higher order assembly of lamin filaments, so that in terms of function these too are severe. In contrast, specific missense mutations in the carboxyl-terminal domain of lamin A/C may be associated either with inherited myopathy or FPLD, but not both. Since the biological functions of the carboxyl-terminal domain are unknown, the effects that substitutions have on function have been particularly difficult to divine. We have solved the three-dimensional structure of the globular tail domain of lamin A/C to provide a framework for determining how certain missense mutations cause muscular dystrophy while others cause lipodystrophy.

MATERIALS AND METHODS
Protein Production and Crystallization-DNA encoding human lamin A/C residues 436-552 was subcloned into a pET28a vector (Novagen) using NdeI and EcoRI sites. Protein was expressed in Escherichia coli BL21(DE3) (Novagen) using LB or synthetic medium containing L-selenomethionine (SeMet) and isolated from bacterial lysates using cobalt affinity media. The His6 affinity tag was removed with thrombin, and the protein was further purified by ion exchange chromatography (Mono Q FPLC). Crystals were obtained at room temperature by the vapor diffusion method in drops suspended over 0.5 ml of crystallization buffer (25% polyethylene glycol 4000, 0.2 M ammonium acetate, 10 mM DTT, 0.1 M Tris-HCl (pH 8.5)). The hanging drops contained 1 μl of protein (20 mg/ml in 5 mM Tris-HCl (pH 8.0), 0.1 M NaCl, 10 mM DTT) mixed with 1 μl of crystallization buffer.
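The starting composition of the hanging drop follows from the equal-volume mixing described above. The following is a minimal sketch of that arithmetic, not part of the original protocol; the helper name `mix` and the dictionary keys are illustrative, and Tris at pH 8.0 and pH 8.5 is pooled into a single entry as a simplification.

```python
# Initial composition of a hanging drop made by mixing equal volumes of
# protein solution and crystallization buffer (concentrations from the text).
# Assumption: Tris from both solutions is pooled, ignoring the pH difference.

def mix(solutions):
    """Return volume-weighted concentrations.

    `solutions` is a list of (volume, {component: concentration}) pairs;
    a component absent from a solution contributes zero.
    """
    total_volume = sum(volume for volume, _ in solutions)
    mixed = {}
    for volume, components in solutions:
        for name, conc in components.items():
            mixed[name] = mixed.get(name, 0.0) + conc * volume / total_volume
    return mixed

protein_solution = {
    "protein_mg_per_ml": 20.0,
    "tris_mM": 5.0,
    "nacl_mM": 100.0,
    "dtt_mM": 10.0,
}
crystallization_buffer = {
    "peg4000_percent": 25.0,
    "ammonium_acetate_mM": 200.0,
    "dtt_mM": 10.0,
    "tris_mM": 100.0,
}

# Equal volumes of each solution, as in the drops described above.
drop = mix([(1.0, protein_solution), (1.0, crystallization_buffer)])
for name in sorted(drop):
    print(f"{name}: {drop[name]:g}")
```

Every component unique to one solution is halved in the fresh drop, while DTT, present at 10 mM in both, is unchanged. Vapor diffusion against the reservoir then concentrates the drop over time, which this sketch does not model.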
Data Collection and Structure Refinement-Crystals were transferred to crystallization buffer containing 25% glycerol prior to cryogenic (100 K) data collection at National Synchrotron Light Source beamlines X25 and X12C. The HKL Suite (19) was used to integrate and scale the data. The program SOLVE (20) was used to identify the two selenium atoms, and MLPHARE and DM (21) were used to generate a preliminary electron density map. The overall figure of merit of the MAD phases was 0.63; the initial electron density map calculated using the MAD phases was interpretable. A final model, including lamin residues 436 through 544, 4 amino-terminal vector-derived residues, and 107 water molecules, was built using the program O (22) and the refinement package CNS (23).

RESULTS AND DISCUSSION
Domain Architecture and Structural Homology-Alignments of the available amino acid sequences for vertebrate and invertebrate lamins revealed two regions of high homology connected by a variable length spacer. Algorithms for predicting protein secondary structure indicated that the homology domains are ordered, while the spacer, which contains the nuclear localization signal, is predicted to be disordered. The first homology domain corresponds to the coiled coil rod domain common to all IF proteins. The second domain, located toward the carboxyl termini of the lamin sequences, is not found in cytoplasmic IFs. The second homology domain contains the known FPLD mutations as well as several mutations associated with EDMD and LGMD (Fig. 1), so we targeted this domain for structural analysis.
Recombinant protein corresponding to human lamin A/C residues 436-552 was crystallized (P2₁2₁2₁ crystal form; unit cell: a = 26.4, b = 52.6, c = 67.4 Å), and the structure was solved at 1.4-Å resolution by MAD methods (Table I). The final model is based on 18,892 reflections using 30-1.4-Å resolution data. The model contains 878 non-hydrogen protein atoms and 107 solvent molecules per asymmetric unit and has an R-factor of 21.6% and an R-free of 23.8%. The temperature B-factor is 16.0 Å², and the root mean square deviations for bond length and bond angle are 0.0056 Å and 1.41°, respectively.
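For readers less familiar with the refinement statistics quoted above, the sketch below illustrates the conventional crystallographic R-factor, R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|, and the volume of an orthorhombic unit cell (V = abc). It is an illustration only, not the CNS calculation used in this work: the amplitudes are invented toy values, `r_factor` assumes the calculated amplitudes are already scaled to the observed ones, and all function names are hypothetical. R-free is the same quantity evaluated on a set of reflections held out from refinement.

```python
def r_factor(f_obs, f_calc):
    """Conventional R = sum(| |Fo| - |Fc| |) / sum(|Fo|).

    Assumes f_calc has already been placed on the same scale as f_obs.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("amplitude lists must have the same length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def orthorhombic_cell_volume(a, b, c):
    """Volume of an orthorhombic cell (all angles 90 degrees), in cubic angstroms."""
    return a * b * c


# Toy structure-factor amplitudes, invented for illustration only.
toy_f_obs = [100.0, 50.0, 25.0, 25.0]
toy_f_calc = [90.0, 55.0, 30.0, 25.0]
toy_r = r_factor(toy_f_obs, toy_f_calc)

# Cell edges reported in the text for the orthorhombic crystal form.
cell_volume = orthorhombic_cell_volume(26.4, 52.6, 67.4)

print(f"toy R-factor: {toy_r:.3f}")
print(f"unit-cell volume: {cell_volume:.1f} cubic angstroms")
```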
The structure of the globular tail of lamin A/C reveals a compact, well defined domain composed entirely of β strands (Fig. 2a). Two large β sheets form a β sandwich. One sheet has five β strands and the other has four. A second, smaller β sheet lies perpendicular and adjacent to the plane of the β sandwich. Listed sequentially, β strands a, b, and e form one sheet and strands c, d, f, and g form the other. Short loops connect most of the β strands, lending to the compact appearance of the domain. This type of β sandwich is referred to as an immunoglobulin (Ig) domain, a common protein structural unit (24,25). We have followed the convention of labeling β strands in Ig domains according to linear position (Figs. 1 and 2); intervening loops are named according to flanking strands (e.g. ab, e′f, etc.). Ig-like domains have a common core of six β strands: a, b, and e in one β sheet and c, f, and g in the other (Fig. 2b). Ig-like domains have been further categorized into four groups according to the presence of ancillary β strands and the connections between them (Fig. 2b) (25,26). Because it is different from the previously defined subtypes, the lamin A/C tail represents the prototype for a new class of Ig-related domain, which we refer to as the lamin or L subtype.
The two sheets of the β sandwich in lamin A/C are closely associated. Water is excluded from the densely packed core, which is formed by residues from β strands a ( [...] larger groove lies between β strands d and e′ and has aromatic residues Tyr481, Phe483, Phe487, Trp514, and Trp520 at its base. In our crystal structure, the ee′ loop of an adjacent molecule is inserted into this groove, supporting potential roles for the groove or the ee′ loop in intermolecular interactions. A smaller cleft bordered by the loop preceding strand a and the fg and bb′ loops has residues Gln462, Trp467, and Val538 at its base. Although different classes within the Ig domain family share very low (<10%) sequence identity, certain residues that contribute to the domain core have conserved function. Residue c3 (third position in β strand c) is invariably hydrophobic, while residues a3, b1, e5, and f5 are hydrophobic in most cases (26). All of these residues in lamin A/C are hydrophobic: c3 = Ile469; a3 = Val442; b1 = Phe451; e5 = Ile497; f5 = Leu530.
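The correspondence between conserved Ig core positions and lamin A/C residues can be captured in a small lookup, sketched below. The residue assignments come from the text; the `HYDROPHOBIC` set is one common convention and is an assumption here, since classifications of residues such as Tyr, Pro, and Ala vary between schemes. The last step cross-references the core positions against the muscular dystrophy substitutions named elsewhere in this article.

```python
# Conserved Ig core positions mapped to lamin A/C residues (from the text).
CORE_POSITIONS = {
    "a3": ("Val", 442),
    "b1": ("Phe", 451),
    "c3": ("Ile", 469),
    "e5": ("Ile", 497),
    "f5": ("Leu", 530),
}

# Assumption: one common hydrophobic set; other schemes differ at the margins.
HYDROPHOBIC = {"Ala", "Val", "Leu", "Ile", "Met", "Phe", "Trp"}

all_hydrophobic = all(res in HYDROPHOBIC for res, _ in CORE_POSITIONS.values())

# Residue numbers of the muscular dystrophy substitutions listed in this
# article (N456I, N456K, I469T, Y481H, W520S, T528K, L530P, R453W, R527P).
MD_SUBSTITUTED_RESIDUES = {453, 456, 469, 481, 520, 527, 528, 530}

# Conserved core positions that are themselves substituted in muscular dystrophy.
core_positions_hit = sorted(
    name
    for name, (_, number) in CORE_POSITIONS.items()
    if number in MD_SUBSTITUTED_RESIDUES
)

print("all conserved core positions hydrophobic:", all_hydrophobic)
print("core positions substituted in muscular dystrophy:", core_positions_hit)
```

Under these assumptions, two of the five conserved positions (c3 and f5) coincide with muscular dystrophy substitutions (I469T and L530P).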
Potential Function of the Ig Domain-Although biological roles of the nuclear lamins are understood, including the maintenance of nuclear integrity, specific molecular functions of the carboxyl-terminal domain are unknown. Ig domains in general may either serve as structural scaffolds or mediate specific intermolecular interactions with other proteins, DNA, or phospholipids. As scaffolds, a series of Ig domains often serve as building blocks, with one of the domains having intermolecular binding properties and the others interacting with adjacent domains. For example, immunoglobulins are composed of paired heavy and light chains having four or two Ig domains each (27). The terminal (variable) Ig domains in each chain mediate antigen binding, while the remaining four (constant) domains in each half-antibody are structural. Analogous clusters of Ig domains comprise the extracellular domains of B and T cell receptors and major histocompatibility complex proteins, growth factor and cytokine receptors, and numerous cell adhesion molecules, where they are often called fibronectin repeats.
To predict potential sites of interaction with the lamin Ig-like domain, we have analyzed the surfaces of many Ig domains that mediate protein-protein interactions and found that virtually any surface might be involved.
We have similarly analyzed Ig domains that bind DNA in the selected group of transcription factors that includes NF-κB, NFAT, p53, and the SH2 domain-containing STAT proteins. In each of these cases, residues on elongated ab and ef loops at the base of the Ig domain contact DNA. These loops are longer than their counterparts in lamin and, as is typical of DNA-binding proteins, Arg, Lys, and Gln residues form critical hydrogen bonds with DNA. The short ab loop in lamin forms a type I β turn, with glutamic acid residues at its second and third positions. While the e′f loop is longer, containing 11 residues, it caps the domain base, its length is just sufficient to connect the e′ and f strands, and it lacks residues necessary for binding DNA. The lamin Ig domain cannot bind DNA by the mechanism common to these transcription factors.
Phospholipid-binding C2 domains, which regulate catalytic activity and subcellular localization, for example in the PKC enzymes (28), are structurally related to Ig domains. Acidic residues from the bc and fg loops at the top of the PKC-α C2 domain chelate a pair of calcium ions. Phosphatidylserine binds residues in this vicinity and buries one of the calcium atoms. The acidic residues in the corresponding bc and fg loops of the lamin domain are not properly oriented either for calcium chelation or phosphatidylserine binding. Therefore, while our structure provides a powerful starting point, the identification of the relevant physiological partners will be required before more detailed analyses can be used to map the involved surfaces.
[Figure caption fragment: Residues that are substituted in patients with FPLD, colored yellow, map to a distinct corner on the domain surface. B, expanded view of the b′c and dd′ loops and flanking strands. Oxygen, nitrogen, and sulfur atoms are colored red, blue, and violet, respectively, and carbon atoms of unsubstituted residues are green. Carbon atoms of residues substituted in FPLD are colored yellow, whereas those that are substituted in EDMD or LGMD are colored red.]
Disease Mutations in the Lamin Tail Domain-Over 50 distinct mutations and polymorphisms have been identified in the Lmna gene (6, 8, 9, 11, 12, 29-31). Six leading to frameshifts and early termination are associated with the hereditary myopathies, as are the 22 missense mutations encoding substitutions in the coiled coil domain (Fig. 1). We thus surmise that null mutations and severe loss of function lead to muscular dystrophy. By contrast, of 15 missense mutations encoding substitutions within the globular Ig domain (Figs. 1 and 3A), 9 occur in families with muscular dystrophy (EDMD or LGMD) and 8 are in families with FPLD (two distinct mutations encode the K486N substitution).
Most of the muscular dystrophy mutations in the Ig domain affect core residues that are likely to be necessary for maintenance of proper folding and protein stability. The N456I, N456K, I469T, Y481H, W520S, T528K, and L530P substitutions are predicted to perturb stable folding of the domain and potentially diminish protein stability. The remaining two EDMD mutations substitute arginine residues present at opposite surfaces of the domain. Arg453 is in the b strand, with its side chain ion paired with the carboxylate side chains from Glu443 and Glu444 in the a strand. Along with Asp446, Glu447, and Glu448, these five acidic residues form an electrostatic patch on the surface of the domain. Arg527 is on the opposite surface of the domain in the f strand. The guanidinium group of its side chain forms a bipartite salt bridge with the two carboxylate oxygen atoms from the side chain of Glu537. The R453W and R527P substitutions would disrupt these salt bridges on the domain surface and perturb local structure in addition to having more general effects on global structure.
The three residues that are mutated in patients with FPLD cluster within a discrete corner of the domain (Fig. 3, A and B). Gly465 is in the b′c loop and Lys486 is in the dd′ loop. These two short loops run antiparallel to one another and are fixed at either end by associated antiparallel β strands (the antiparallel sheets are formed by strands b′ and d′ at one end and strands c and d at the other end). The MGNW sequence encompassing Gly465 forms a type I β turn. Substitution of Gly465 would destabilize the turn, and an L-amino acid side chain, as occurs in the G465D substitution, would clash with the dd′ backbone in the vicinity of Lys486, which is 4 Å away. Solvent-exposed Lys486, along with the two preceding prolines, forms a sharp turn. Substitutions with asparagine or threonine, which cause FPLD, are not expected to disrupt either the turn or the domain fold. It is possible that Lys486 participates in an intermolecular interaction. The final residue that is mutated in FPLD, Arg482, is at the carboxyl terminus of the d strand. The α carbon of Arg482 is about 10 Å away from the α carbons of Gly465 and Lys486, such that these three residues circumscribe a small triangle on the domain surface. The Arg482 side chain similarly extends into surrounding solvent without interacting with neighboring residues. FPLD mutations substitute Arg482 with Trp, Gln, Leu, or Gly, suggesting that here too a basic side chain is required. Thus substitution of either Arg482 or Lys486 would likely prevent an intermolecular interaction due to loss of a basic side chain, rather than perturbing structure. Substitution of Gly465 would likely perturb the local environment of these basic residues. In contradistinction, most muscular dystrophy mutations are predicted to perturb core structure.
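The inter-residue distances quoted above are ordinary Euclidean distances between α-carbon positions. The sketch below shows how such a measurement could be made. The coordinates are placeholders chosen for illustration only and are not taken from the refined model; in practice they would be read from the Cα atom records of the coordinate file.

```python
import math
from itertools import combinations

# PLACEHOLDER alpha-carbon coordinates in angstroms (x, y, z).
# These are invented for illustration and are NOT from the refined model.
ca_coordinates = {
    "Gly465": (10.0, 0.0, 0.0),
    "Arg482": (0.0, 0.0, 0.0),
    "Lys486": (8.0, 6.0, 0.0),
}

# Pairwise distances between the three FPLD-associated residues.
distances = {
    (first, second): math.dist(ca_coordinates[first], ca_coordinates[second])
    for first, second in combinations(ca_coordinates, 2)
}

for (first, second), value in distances.items():
    print(f"{first} - {second}: {value:.1f} angstroms")
```

With real coordinates substituted for the placeholders, the same three distances describe the sides of the surface triangle formed by the FPLD residues.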
Conclusion-Specific mutations in the tail domain of human lamin A/C are associated either with muscular dystrophy or lipodystrophy. The crystal structure of this domain reveals that the mutations segregate; substitutions encoded by muscular dystrophy mutations are usually in the protein core and most likely perturb protein stability, whereas substitutions encoded by lipodystrophy mutations are localized to a discrete corner of the domain which may mediate an intermolecular interaction.
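The segregation of tail-domain substitutions described in this article can be summarized as a simple lookup, sketched below. The substitutions and their groupings are those named in the text; the category labels and the `classify` helper are illustrative and are not a predictive tool for unlisted variants.

```python
from collections import Counter

# Labels are illustrative summaries of the structural interpretation in the text.
CORE = ("muscular dystrophy", "core residue; predicted to perturb folding or stability")
SALT_BRIDGE = ("muscular dystrophy", "surface arginine; disrupts a salt bridge")
FPLD_SURFACE = ("FPLD", "surface corner; fold expected to be preserved")

TAIL_SUBSTITUTIONS = {
    "N456I": CORE, "N456K": CORE, "I469T": CORE, "Y481H": CORE,
    "W520S": CORE, "T528K": CORE, "L530P": CORE,
    "R453W": SALT_BRIDGE, "R527P": SALT_BRIDGE,
    "G465D": FPLD_SURFACE,
    "R482W": FPLD_SURFACE, "R482Q": FPLD_SURFACE,
    "R482L": FPLD_SURFACE, "R482G": FPLD_SURFACE,
    "K486N": FPLD_SURFACE, "K486T": FPLD_SURFACE,
}


def classify(substitution):
    """Return (phenotype, structural interpretation), or None if not listed."""
    return TAIL_SUBSTITUTIONS.get(substitution)


# Number of distinct substitutions per phenotype.
phenotype_counts = Counter(phenotype for phenotype, _ in TAIL_SUBSTITUTIONS.values())

print(classify("R482W"))
print(dict(phenotype_counts))
```

The table counts distinct amino acid substitutions rather than nucleotide mutations, so K486N appears once even though two distinct mutations encode it.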