The X-ray Crystal Structure of Human γS-crystallin C-terminal Domain*

γS-crystallin is a major human lens protein found in the outer region of the eye lens, where the refractive index is low. Because crystallins are not renewed, they acquire post-translational modifications that may perturb stability and solubility. In common with other members of the βγ-crystallin superfamily, γS-crystallin comprises two similar β-sheet domains. The crystal structure of the C-terminal domain of human γS-crystallin has been solved at 2.4 Å resolution. The structure shows that in the in vitro expressed protein, the buried cysteines remain reduced. The backbone conformation of the "tyrosine corner" differs from that of other βγ-crystallins because of deviation from the consensus sequence. The two C-terminal domains in the asymmetric unit are organized about a slightly distorted 2-fold axis to form a dimer with similar geometry to full-length two-domain family members. Two glutamines found in lattice contacts may be important for short range interactions in the lens. An asparagine known to be deamidated in human cataract is located in a highly ordered structural region.

The lens crystallins are protein molecules that need to last a lifetime, because they are found in cells that have no protein synthetic or degradation machinery (1). The leading senile cataract hypothesis is that aged non-native crystallin molecules overwhelm the binding capacity of the small heat shock protein α-crystallin (also found in the lens), resulting in aggregation and formation of light scattering centers (2). Detailed molecular information from selected crystallin domains and proteins is needed to model their unfolding and characterize their likely ensemble biophysical properties, particularly the early unfolding intermediates that have been hypothesized to bind to α-crystallin (3). As a first step toward providing detailed molecular information on a major human lens crystallin involved in cataract, we have solved the x-ray crystal structure of the C-terminal domain from human γS-crystallin.
The crystallins are a well studied family of proteins for which there are several three-dimensional structures (4) as well as thermodynamic and kinetic data on folding/unfolding (5-7). The polypeptides of the ~13-member βγ-crystallin superfamily each comprise similar ~10-kDa N- and C-terminal domains that are themselves formed from two symmetrically organized Greek key motifs. In all cases, the N- and C-terminal domains pair about a similar pseudo-2-fold axis, with the domains in monomeric γ-crystallins being covalently connected, whereas domain swapping can lead to dimerization in β-crystallins (8).
There are seven genes coding for γ-crystallins in vertebrate lenses (9), and they consist of the closely related γA-γF family and the more distantly related but more conserved γS-crystallin. The expression patterns of the family of γ-crystallins appear to be correlated with the formation of the decreasing refractive index gradient from the center to the cortex of the adult lens (4,10). The propensity of certain γ-crystallins to easily form a concentrated phase (11), such as the high-Tc γ-crystallins that are enriched in the core region of the lens, probably reflects their "attractive" interactions (12). γS-crystallin, located in the low refractive index outer regions of the lens, is characterized by more repulsive intermolecular interactions (13). The molecular basis for the stability of these long-lived structural proteins and the solution intermolecular interactions that govern their solubility and phase separation behavior are areas of cataract research. Several x-ray structures of γA-F crystallins are now known, and they all show very similar two-domain pairing about a hydrophobic interface that contributes toward stability (14-17). γS-crystallin is a major structural protein in the human eye lens (18). Human and bovine γS-crystallins and their isolated domains are very stable and show two-state unfolding, allowing detailed quantitative thermodynamic properties of the proteins to be evaluated (19). Computer simulations of heat-induced unfolding of bovine γB-crystallin also indicate high stability and furthermore suggest that the first stage of unfolding involves the dissociation of the paired domains (20).
Conformational changes to aging crystallins can derive from a variety of covalent changes. Oxidation of cysteine and methionine residues has been detected in human crystallins (21). Deamidation of human γ-crystallins is correlated with aging (22,23) and with increased insolubilization of crystallins, particularly γS (21). Deamidation alters the charge balance, adding a negative charge to a previously neutral area, but it is also thought to mark the nonenzymic formation of isomers such as β-aspartate that would alter the backbone covalent structure (24).
So far for γS-crystallin, only the C-terminal domain of the bovine protein has been solved by x-ray crystallography (25), showing how two domains self-associate to form a dimer in an analogous way to that of the native two-domain γ-crystallins, although the pairing is less symmetrical. Surprisingly, one of the domains has an altered conformation in its tyrosine corner, a usually highly conserved feature of most β-sandwich proteins (26). In fact, the tyrosine corner has been proposed as a possible folding nucleus in a prokaryote protein with a related βγ-crystallin fold (27), although this has not been universally supported (28). Because it is unclear to what extent the lattice interactions in the crystal structure influenced pairing and conformation, further three-dimensional structures are required. Here we show that the C-terminal domains of human γS-crystallin pair about a slightly distorted 2-fold axis to form a dimer with both tyrosine corners in a nonstandard conformation.

EXPERIMENTAL PROCEDURES
Protein Expression-The human γS-crystallin C-terminal domain (HGSC)¹ was cloned in the pET3a vector essentially as described for the C-terminal domain of calf γS-crystallin (25). The novel initiation codon was introduced in the human γS sequence (19) at a position that replaced the first glycine in the linker sequence by PCR-mediated mutagenesis using the following primers: GTTCATCTGCCTCATATGGGCCAGTATAAG (forward) and GGATCCATGTCATTACTCCACAATG (reverse).
The HGSC plasmid DNA, coding for residues 91-177 (topologically equivalent to residues 86-172 of γB-crystallin), was transformed into Escherichia coli strain BL21(DE3) pLysS competent cells. Colonies were picked and grown overnight at 37°C with shaking in 10 ml of 2YT medium (5 g/liter NaCl, 10 g/liter yeast extract, 16 g/liter peptone 140) with 10 µl of ampicillin (100 mg/ml) and 15 µl of chloramphenicol (34 mg/ml). Large scale growth was performed with an overnight culture of 500 ml of 2YT medium containing 250 µl of ampicillin (50 mg/ml) after inoculation at 100:1 from the 10-ml overnight growths. The flasks were shaken at 37°C and induced by the addition of 250 µl of isopropyl-β-D-thiogalactopyranoside after the culture was grown to an A550 of 0.4-0.6 (3-4 h). Growth was continued overnight, whereupon cells were harvested by centrifugation at 5000 rpm for 15 min at 4°C. The pellets were resuspended in 10 ml of 25 mM Tris-HCl, pH 8.0, 10 mM EDTA, 50 mM glucose with a protease inhibitor (5 µl of Pefabloc (Merck)) and frozen at −20°C.
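The working antibiotic concentrations implied by these additions can be checked with a short calculation. The following is a minimal sketch, assuming the stock additions are microliter volumes (the unit prefix is lost in some copies of the text); the function name is illustrative and is not part of the original protocol.

```python
def final_conc_ug_per_ml(stock_mg_per_ml, added_ul, culture_ml):
    """Final concentration (ug/ml) after adding a small stock volume to a culture.

    Assumes ideal mixing and includes the added volume in the total.
    """
    added_ml = added_ul / 1000.0
    total_ml = culture_ml + added_ml
    return stock_mg_per_ml * 1000.0 * added_ml / total_ml


# Values taken from the expression protocol (volumes assumed to be microliters).
amp_small = final_conc_ug_per_ml(100.0, 10.0, 10.0)    # ampicillin, 10-ml culture
cam_small = final_conc_ug_per_ml(34.0, 15.0, 10.0)     # chloramphenicol, 10-ml culture
amp_large = final_conc_ug_per_ml(50.0, 250.0, 500.0)   # ampicillin, 500-ml culture

for label, value in [("ampicillin (10 ml)", amp_small),
                     ("chloramphenicol (10 ml)", cam_small),
                     ("ampicillin (500 ml)", amp_large)]:
    print(f"{label}: {value:.1f} ug/ml")
```

Under this reading, the starter culture holds roughly 100 µg/ml ampicillin and 50 µg/ml chloramphenicol, and the large culture roughly 25 µg/ml ampicillin.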
Protein Isolation-The highly expressed protein was isolated from the soluble fraction. Whole cell lysate was prepared from the thawed pellet by addition of DNase I and MgCl2 to the suspension, giving final concentrations of 10 µg/ml and 10 mM, respectively, followed by sonication on ice using 10-s pulses with cooling in between. Insoluble material was pelleted at 20,000 rpm for 30 min at 4°C, and the supernatant was dialyzed overnight at 4°C with stirring against buffer A (25 mM Tris-HCl, pH 8.0, 1 mM EDTA, 1 mM dithiothreitol). The solution was then filtered through a 0.4-µm nitrocellulose filter followed by a 0.2-µm nitrocellulose filter before being loaded onto a Hiload 16/10 Q Sepharose High Performance column (Amersham Biosciences, Inc.). The column was run at 4 ml/min on a Gradifrac with the following program: 1) 20 ml of 100% buffer A; 2) gradient from 0-70% buffer B (buffer B was buffer A with 1 M NaCl) over 160 ml; 3) 60 ml of 100% buffer B; and 4) 100 ml of 100% buffer A. The HGSC peak eluted at about 15% buffer B, in line with the predicted pI of 6.0. The identity of the protein was confirmed by electrospray mass spectrometry, with the measured mass of 10,412 being in close agreement with the calculated mass of 10,414 and indicating that the initiating methionine had been cleaved. The HGSC protein fractions were concentrated to ~10 mg/ml and equilibrated against 25 mM Bis-Tris-propane HCl, pH 7.5, using an Amicon (Millipore, Watford, Hertfordshire, UK) cell equipped with a YM3 membrane. The concentrated protein was stored at −20°C.
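The four-step gradient program can be expressed as a piecewise function of programmed volume, which makes it easy to see where 15% buffer B falls. This is an illustrative sketch only: it ignores the column and tubing delay volume, so the programmed volume is not the same as the observed elution volume, and the names `STEPS`, `percent_b`, and `nacl_molar` are hypothetical.

```python
# Each step: (length in ml, %B at start, %B at end), as listed in the program.
STEPS = [
    (20.0, 0.0, 0.0),      # 1) 100% buffer A
    (160.0, 0.0, 70.0),    # 2) linear gradient, 0-70% buffer B
    (60.0, 100.0, 100.0),  # 3) 100% buffer B
    (100.0, 0.0, 0.0),     # 4) 100% buffer A
]


def percent_b(volume_ml):
    """Programmed percentage of buffer B at a given cumulative volume."""
    start = 0.0
    for length, b_start, b_end in STEPS:
        end = start + length
        if volume_ml <= end:
            frac = (volume_ml - start) / length
            return b_start + frac * (b_end - b_start)
        start = end
    raise ValueError("volume beyond end of program")


def nacl_molar(volume_ml, buffer_b_nacl_molar=1.0):
    """Programmed NaCl concentration, given buffer B = buffer A + 1 M NaCl."""
    return percent_b(volume_ml) / 100.0 * buffer_b_nacl_molar


# Programmed volume at which the gradient reaches 15% B.
volume_at_15 = 20.0 + 160.0 * 15.0 / 70.0
print(f"15% B is programmed at {volume_at_15:.1f} ml "
      f"({nacl_molar(volume_at_15):.2f} M NaCl)")
```

Under these assumptions, elution at about 15% buffer B corresponds to roughly 0.15 M NaCl.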
The size of the protein was estimated using chromatography on a Superose 12HR 10/30, using 25 mM Bis-Tris-propane HCl, 0.2 M NaCl, pH 6.5 or 8.0, as running buffer. The HGSC eluted at 15.3 ml over a wide range of applied protein concentrations (0.2-6.5 mg/ml) at both pH 6.5 and 8.0. For comparison, full-length bovine γS-crystallin at pH 8.0 elutes at 14.4 ml, in agreement with the monomeric nature of the C-terminal domain of human γS in solution, as determined using ultracentrifugation (19).
Crystallization-The crystals were grown using the hanging drop vapor diffusion method with conditions for crystal growth optimized from Hampton (Laguna Niguel, CA) Crystal Screen II condition 13, with polyethylene glycol monomethylether 2000 as precipitant. 1 µl of protein at ~10 mg/ml in 25 mM Bis-Tris-propane HCl, pH 7.5, was added to 1 µl of well solution containing 0.2 M ammonium sulfate, 0.1 M sodium acetate, pH 5.0, and 20-28% polyethylene glycol monomethylether 2000. The optimum crystals, formed at 24% polyethylene glycol monomethylether 2000 after 4 days growth at room temperature, were hexagonal bipyramidal with dimensions of ~0.3 × 0.1 × 0.1 mm³.
Data Collection and Processing-Intensity data to 2.4 Å were collected from a cryo-cooled (100 K) crystal at Station 9.6 of the Daresbury SRS source using an ADSC imaging plate. No cryoprotectant was added. The data were processed using the program MOSFLM (29). Scaling was carried out with the program SCALA (30), and the data were truncated with TRUNCATE (31). The crystals were either space group P6₁22 or P6₅22 with two molecules in the asymmetric unit, assuming a solvent content of 61% (Vm = 3.15 Å³/dalton). The crystal data and statistics from data processing are listed in Table I.
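The quoted solvent content follows from the Matthews coefficient through the standard approximation, solvent fraction ≈ 1 − 1.23/Vm, which assumes a typical protein partial specific volume of about 0.74 cm³/g. The sketch below is illustrative; the function names are hypothetical, and the unit cell volume needed by `matthews_coefficient` is not stated in the text (it is in Table I), so only the reported Vm is used in the example.

```python
def solvent_fraction(vm, protein_term=1.23):
    """Approximate solvent fraction from the Matthews coefficient Vm (A^3/Da).

    protein_term = 1.23 assumes a protein partial specific volume near
    0.74 cm^3/g; this is the usual textbook value, not one given in the paper.
    """
    return 1.0 - protein_term / vm


def matthews_coefficient(cell_volume_a3, n_asym_units, molecules_per_asu, mass_da):
    """Vm = unit cell volume / (total protein mass in the cell)."""
    return cell_volume_a3 / (n_asym_units * molecules_per_asu * mass_da)


vm_reported = 3.15  # A^3/Da, from the data processing section
solvent_percent = 100.0 * solvent_fraction(vm_reported)
print(f"Estimated solvent content: {solvent_percent:.0f}%")
```

With Vm = 3.15 Å³/dalton the estimate rounds to the 61% quoted for two molecules in the asymmetric unit.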
Structure Determination-Molecular replacement was undertaken with the program AMoRe (32) with the B chain coordinates from the bovine γS-C domain (25) as a search model. Data from 15-2.4 Å were used in both the rotation and translation function searches with a Patterson cut-off radius of 15 Å and a radius of integration of 0.75% (the maximal distance from the center of mass being 21.5 Å). A successful solution was found indicating two molecules in the asymmetric unit, using the P6₅22 space group, with a correlation coefficient of 63.7 and an R factor of 42.3%.
Structure Refinement-Refinement of the structure was undertaken using CNSsolve version 0.9 (33). The reflections were divided, at random, into working and test (7.5% of the data) sets, to allow both the crystallographic and free R factors to be followed. The test set of reflections was excluded from the map calculations. Early in refinement, the noncrystallographic symmetry at the dimer interface was maintained by use of restraints (initial noncrystallographic symmetry restraints were 20 kcal/mol). Both simulated annealing and minimization methods were tried for refinement, with a maximum likelihood target using amplitudes. The refinement method giving the best reduction in the R factors for a cycle was chosen, and individual isotropic B-factor refinement was then undertaken. In each cycle, both 2Fo − Fc and Fo − Fc electron density maps were calculated, and manual rebuilding was undertaken using the program O (34). Water molecules were added using the CCP4 (35) programs PEAKMAX and WATPEAK to select potential sites. Some later rounds of refinement were undertaken using the CCP4 programs REFMAC (36) and ARP_WARP (37) interspersed with CNSsolve refinement. Noncrystallographic symmetry restraints ...
The final solution contains two molecules of human γS-crystallin C-domain (A and B) in the asymmetric unit, together with 90 water molecules. The statistics after refinement are given in Table II. Electron density for all the residues is visible, with just some side chain density remaining unclear. The Ramachandran plot from PROCHECK (38) shows 86.2% of the residues in the most favored regions and 13.8% in the additional allowed regions. The coordinates have been deposited in the RCSB protein data bank, under code number 1ha4.
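The random division of reflections into working and test sets, and the R factor followed on each, can be illustrated with a short sketch. This is not the CNSsolve implementation: the function names, the fixed seed, and the unscaled R factor formula R = Σ|Fo − Fc| / Σ Fo are simplifying assumptions made for the example.

```python
import random


def split_reflections(indices, test_fraction=0.075, seed=0):
    """Randomly set aside a test set (default 7.5%) for the free R factor."""
    rng = random.Random(seed)          # fixed seed keeps the split reproducible
    indices = list(indices)
    n_test = round(len(indices) * test_fraction)
    test = set(rng.sample(indices, n_test))
    work = [h for h in indices if h not in test]
    return work, sorted(test)


def r_factor(f_obs, f_calc):
    """Conventional R factor on amplitudes (no scale factor applied here)."""
    numerator = sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


# Toy example: 1000 reflection identifiers standing in for Miller indices.
work_set, test_set = split_reflections(range(1000))
print(len(work_set), len(test_set))
```

The same `r_factor` function applied to the working set gives the crystallographic R factor and applied to the test set gives the free R factor, provided the test reflections are never used in refinement or map calculation.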
Solvent Accessibility-The program NACCESS (39) was used to calculate solvent accessible surface areas using the default probe radius of 1.4 Å.
Figures-The figures were produced using the programs MOLMOL (40) and POVRAY version 3.1.

RESULTS
Human γS-crystallin C-terminal Domain Forms a "Dimer" in the Crystal Lattice-The refined electron density shows that the two molecules of the HGSC in the asymmetric unit have a very similar structure (backbone root mean square deviation of 0.29 Å), and they are very similar to bovine γB-crystallin C-terminal domain (BGBC) (backbone root mean square deviation of 0.8 Å). The two HGSC domains form a dimer with a rotation of 176.5° between the two chains (red in Fig. 1A). This 2-fold pairing of two C-terminal domains is similar to the N- and C-terminal domain pairing in other polypeptides of the βγ-crystallin family and is considered to reflect the origin of the family from an ancestral homodimer of single domains (41). Further evidence for the ancestral homodimer model is provided by the crystal structure of the C-terminal domain of bovine γB-crystallin in which the C-terminal tyrosine was deleted. The two domains paired about a slightly distorted noncrystallographic 2-fold axis (BGBC, blue in Fig. 1A) and are readily superposed on the native 2-domain γB-crystallin (42). However, when the C-terminal domain of bovine γS-crystallin was solved, the two molecules in the asymmetric unit were found to pair in a broadly similar way (BGSC, green in Fig. 1A), but the 2-fold is distorted by around 20° (25).
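The rotation relating two chains is commonly read off the superposition matrix through the trace relation, angle = arccos((tr R − 1)/2). The sketch below shows that relation and the resulting deviation from an exact dyad; it is an assumption that the 176.5° value was obtained in this way, and the matrix used here is a synthetic rotation about z, not the actual superposition matrix from the structure.

```python
import math


def rotation_angle_deg(r):
    """Rotation angle (degrees) of a 3x3 rotation matrix given as nested lists."""
    trace = r[0][0] + r[1][1] + r[2][2]
    cos_angle = (trace - 1.0) / 2.0
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    return math.degrees(math.acos(cos_angle))


def rotation_about_z(angle_deg):
    """Synthetic test matrix: rotation by angle_deg about the z axis."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


angle = rotation_angle_deg(rotation_about_z(176.5))
deviation = 180.0 - angle  # departure from a perfect 2-fold
print(f"rotation {angle:.1f} deg, {deviation:.1f} deg from an exact dyad")
```

On this measure the HGSC pairing departs from a perfect 2-fold by 3.5°, compared with the roughly 20° distortion reported for the bovine γS domain pair.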
The residues in the interface between the two domains in HGSC are shown in Fig. 2. It should be noted that the residue numbers used here are based on the alignment of the domain to γB-crystallin, not on the γS-crystallin sequence (Fig. 1B). In common with other βγ-crystallin paired domain interfaces, there is a central hydrophobic patch (shown in green) surrounded by polar residues that make specific interactions. For example, in the HGSC dimer, there are two ion pairs between Asp147 and Arg168 on each side of the noncrystallographic axis, whereas in the BGSC distorted dimer, only one ion pair can form (see Fig. 4 in Ref. 25). A cluster of interactions is close to the 2-fold axis: each Gln143 and Glu172 side chain interacts with backbone polar atoms of its symmetrically related partner, and each Arg142 interacts with its symmetrically related partner side chain. The only sequence difference between the two species at the interface is Val130 in human, which replaces alanine in the bovine and is likely to contribute to the differing symmetries.
Tyrosine Corner Structure-It is apparent from a superposition of three kinds of γ-crystallin C-terminal domain dimers that there are two regions, identified using difference distance plots (43), where the domain conformation differs: the tyrosine corners and a dimer-dimer interface region (Fig. 1A). When tyrosine corners from four C-terminal domains of members of the βγ-crystallin superfamily (HGSC, BGSC, BGBC, and the C-terminal domain of bovine βB2-crystallin) are compared, HGSC has a backbone conformation that follows that of the BGSC A chain rather than the more common conformation seen in BGBC and the C-terminal domain of bovine βB2-crystallin. Fig. 3A shows a comparison of the HGSC tyrosine corner with that from BGBC. The unusual conformation is centered near residue 148, where γB has proline and γS has lysine. In the "standard" conformation, the tyrosine (151) hydroxyl oxygen hydrogen bonds with the main chain N-H of Asp147 (bond length, 2.4 Å). In HGSC, the tyrosine hydrogen bonds with the main chain N-H of Lys148 (bond length, 1.96 Å). The new corner conformation results in different positions for the exposed positive charges of lysines 148 and 149 (Fig. 3A).
Dimer-Dimer Interactions in Human γS-C-There are three different interdimer interfaces present in the crystal lattice (Fig. 4). The solvent accessible surface areas have been calculated for the single HGSC chains, for the AB dimer on its own (Chains A and B together), and for the AB dimer with its symmetrically related partners. These data are shown in Table III. When the amounts of buried surface area in the three lattice interfaces are compared with the area buried within the dimer, it can be seen that the interface between the chains in the dimer buries 10.9% of the monomer surface, whereas lattice interfaces 1, 2, and 3 bury 9.3, 6.6, and 2.8%, respectively. Interface 1 is thus nearly as extensive as the dimer interface and is similar to one of the four lattice interfaces found in the bovine lattice (data not shown). His117, a residue in the long loop region between strands c and d of the first Greek key motif, participates to some extent in all three interfaces (Fig. 4).
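A buried-surface percentage of this kind is typically derived from accessible surface areas of the isolated chains and of the complex. The sketch below assumes the common definition, in which the area buried per chain is half the total area lost on association; the text does not state the exact formula used, and the areas in the example are hypothetical round numbers, not the values in Table III.

```python
def buried_per_chain(asa_a, asa_b, asa_complex):
    """Area buried per chain (A^2): half of the total area lost on association."""
    return (asa_a + asa_b - asa_complex) / 2.0


def percent_buried(asa_a, asa_b, asa_complex):
    """Buried area per chain as a percentage of the mean isolated-chain area."""
    mean_monomer = (asa_a + asa_b) / 2.0
    return 100.0 * buried_per_chain(asa_a, asa_b, asa_complex) / mean_monomer


# Hypothetical areas in A^2, chosen only to illustrate the arithmetic.
example = percent_buried(asa_a=5000.0, asa_b=5000.0, asa_complex=9000.0)
print(f"{example:.1f}% of the monomer surface is buried")
```

Applying the same calculation to the AB dimer and to each dimer plus its symmetry-related partner would allow the intradimer interface and the three lattice interfaces to be compared on one scale.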
It is residues in the smaller HGSC interface 3 that show the major conformational differences when compared with other γ-crystallin C-terminal domains (Fig. 4C). Only one of the residues that differ between the human and the bovine sequences is involved in lattice interactions for the human form. This is tyrosine 103, which interacts with conserved glutamine 101 in interface 3, whereas this residue is a histidine in the bovine protein.
... involved in hydrogen bonding to either backbone or side chain atoms of neighboring residues; therefore deamidation will not destabilize the molecule by disruption of hydrogen bonds. At neutral pH, the C-terminal domain of γS-crystallin has a charge of −1, and the addition of another negative charge by deamidation of Asn138(143) may contribute to a decrease in stability. The charged residues within 10 Å (chosen as a value for the limit of electrostatic effects) that may be perturbed by the addition of a negative charge are Glu114(119), Glu135(140), Arg140(145), and Arg168(173). The most important of these interactions is likely to be that of Asn138(143) with Glu114(119) because the closest atoms of these two residues are the termini of the side chains (7.2 Å apart). The two-domain human γS-crystallin was modeled using the complete bovine γB-crystallin as a template, with the C-terminal domain replaced by the human γS coordinates and the residues in the N-domain mutated to match the human γS sequence. In this model, the only residue on the N-terminal domain that is within 10 Å of Asn138(143) is Met69(73), with the main chain carbonyl of Asn138(143) being 8.8 Å from the side chain of Met69(73). This interaction with methionine is also seen in the x-ray structures of bovine γD-, γE-, and γF-crystallins, but at a distance of 6.3-6.5 Å, and it is the only residue from the N-terminal domain within 10 Å of Asn138(143). Deamidation of Asn138(143) is thus unlikely to perturb electrostatic interactions in the N-terminal domain.
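The 10 Å neighborhood search described above amounts to a closest-atom distance test between the asparagine and every other residue. The following is a minimal sketch of that test; the residue labels and coordinates are toy values invented for illustration, not atoms from the deposited structure, and the function names are hypothetical.

```python
import math


def min_distance(atoms_a, atoms_b):
    """Smallest interatomic distance (A) between two lists of (x, y, z) atoms."""
    return min(math.dist(p, q) for p in atoms_a for q in atoms_b)


def neighbours_within(target_atoms, residues, cutoff=10.0):
    """Names of residues with any atom within `cutoff` of any target atom."""
    return sorted(name for name, atoms in residues.items()
                  if min_distance(target_atoms, atoms) <= cutoff)


# Toy coordinates: one target atom and two single-atom "residues".
target = [(0.0, 0.0, 0.0)]
toy_residues = {
    "near": [(7.2, 0.0, 0.0)],   # inside the 10 A cutoff
    "far": [(12.0, 0.0, 0.0)],   # outside the cutoff
}
print(neighbours_within(target, toy_residues))
```

In practice the residue dictionary would be restricted to charged side chains (Asp, Glu, Lys, Arg) before applying the cutoff, since only these are relevant to the electrostatic argument.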
Two other amide-containing residues are involved in lattice contacts: Gln115(120) and Gln101(106). Gln115(120) hydrogen bonds to Thr105(110) and Thr106(111) and is close to Glu107(112) in interface 1 (Fig. 4A). This residue is also involved in a similar interaction in the lattice of the bovine γS-crystallin C-terminal domain dimer. In HGSC interface 2, Gln101(106) from chain A interacts with the main chain of two residues from chain B, Arg119(124) and Glu120(125) (Fig. 4B), and in interface 3 (Fig. 4C), it interacts with its symmetrically related partner as well as with Met102(107) and Tyr103(108).

DISCUSSION
The human γS-crystallin C-terminal domain forms dimers in the crystal lattice (although not in solution) using a similar interface to those observed between N- and C-terminal domains in other βγ-crystallins and is likely to form a similar intramolecular interface with its own N-terminal domain. The dyad is more exact than in the corresponding bovine γS construct. The recreation of these 2-fold interactions between single domains underscores the idea that domain pairing is an ancestral dimer trait. However, without the covalent linker, the local concentration of a single domain is insufficient to form a dimer in solution (19). The weakness of the interface interaction renders it susceptible to deformation, and it is the first likely hydrophobic surface to become water exposed during denaturation, in line with computer simulated unfolding studies of γB-crystallin (20).
The tyrosine corner is an extremely conserved structural feature of the βγ-crystallin fold. In common with other β-sandwich domains, it occurs only once in the domain, even though the βγ-crystallin domain is made from two similar Greek key motifs (Fig. 1B). Here it is shown that in the human γS-crystallin C-terminal domain, the tyrosine corner conformation in both partners of the dimer is nonstandard. In the corresponding bovine γS-crystallin structure, where the two domains in the asymmetric unit (chains A and B) pair about a distorted 2-fold axis, the major conformational difference between the chains is in the tyrosine corner, with chain A having an unusual conformation. However, it was not possible to ascertain whether this was due to the distorted 2-fold pairing and/or was a consequence of crystal lattice interactions (25). We hypothesize that the tyrosine corner structure seen in both chains of the human γS-crystallin C-terminal domain, as well as the A chain of the bovine γS-crystallin C-terminal domain, is the favored conformation for γS-crystallins. The consensus sequence for the tyrosine corner is LXPGXY, whereas in the γS-crystallin C-terminal domain it is LDKKEY, with the lysine pair that replaces proline-glycine increasing the energy of the standard tyrosine corner polyproline II conformation. The more usual γ-crystallin conformation found in the crystal form of the B chain of the C-terminal domain from bovine γS-crystallin is probably being stabilized by the side chain of Lys148(153). This forms a salt bridge with the C-terminal carboxylate of chain A, giving a compensation for the higher energy conformation of the backbone. Now that the new conformation has been seen in a γS-crystallin in a different lattice, it is likely to be independent of a secondary lattice effect. It will be interesting to ascertain whether this new conformation contributes to the lower stability of γS-crystallin toward denaturants compared with γB-crystallin (19) and/or affects the folding.
The human γS C-terminal domain sequence is very similar to the bovine (93% identical). Although the two species of crystals are grown under very similar conditions, they have different space groups (human, P6₅22; bovine, P6₁22) and form two kinds of dimer, one almost perfect and one distorted. Only one of the residues that differ between the human and the bovine sequences is involved in lattice interactions, this being Tyr103(108) in the human form. This bulky residue occupies a position at an extremity of the molecule, a position that is involved in lattice interactions in other γ-crystallins (14). It is likely to be responsible for the differing space groups and hence different lattice contacts in γS-crystallin and may play a role in the short range interactions in the eye lens. Because deamidation is a commonly observed post-translational modification in the long-lived crystallins, it is possible that the addition of a negative charge could disturb the short range repulsive interactions of human γS-crystallin (13). Two highly exposed glutamines (at positions 101(106) and 115(120)) are involved in lattice interactions, the latter in both human and bovine crystal structures. Deamidation of these glutamines may thus have implications for the interactions of γS-crystallins in the lens.
Human γS-crystallin has to last a lifetime and thus requires both thermodynamic and kinetic stabilization. A recently described mouse γS-crystallin gene that carries a point mutation provides a model for how a properly folded but destabilized protein can cause cataracts (45). Mechanisms for loss of stability leading to aggregation and light scattering in human senile cataracts have invoked post-translational modifications involving cysteine oxidation and deamidation. The two cysteines, Cys109(114) and Cys124(129), that are buried in the C-terminal domain of human γS-crystallin remain reduced during crystal growth without the addition of reducing agents, in keeping with the stability of the native domain fold. An interesting question is whether deamidation contributes to domain destabilization and hence increases the chances of the buried cysteines becoming exposed and available for cross-linking.
It has been shown that the tryptic peptide containing Asn138(143) is deamidated when isolated from human cataractous lens proteins (44), whereas when the corresponding peptide is isolated from the fetal-embryonic region of aged transparent human lenses, it is not deamidated (46). Asn138(143) is in the region of the domain dimer interface but has a moderate solvent exposure (Fig. 5). The addition of a negative charge in place of the neutral asparagine is likely to perturb Glu114(119), but weakly because it is some distance away. Recently, more than 40% of residue Asn138(143) in human cataractous lenses has been identified as being in the β-aspartate form (47). Identification of this modification, along with the occurrence of racemization at this site, further substantiates the hypothesis that deamidation can occur via a preferred succinimidyl intermediate (48). Deamidation is thus a useful marker of a more radical structural change to the protein that involves addition of extra carbons to the polypeptide backbone and tends to be correlated with flexibility of the protein backbone chain (24). It is significant that Asn138(143) is in the highly ordered folded β-hairpin structure that is involved in maintaining the tertiary βγ-crystallin fold (4). If deamidation were to occur to the native protein at this site, leading to an altered covalent backbone structure, it would likely destabilize the γS-crystallin domain. Because this residue is resistant to deamidation in the normal aged human γS-crystallin (46), it is unknown whether the molecule has first to be unfolded prior to deamidation or whether other cataractogenic factors are involved that favor deamidation, which then leads to unfolding.