Structural and Mutational Analysis of the SBDS Protein Family

Shwachman-Diamond Syndrome (SDS) is an autosomal recessive disorder characterized by bone marrow failure with significant predisposition to the development of poor prognosis myelodysplasia and leukemia, exocrine pancreatic failure and metaphyseal chondrodysplasia. Although the SBDS gene mutated in this disorder is highly conserved in Archaea and all eukaryotes, the function is unknown. To interpret the molecular consequences of SDS-associated mutations, we have solved the crystal structure of the Archaeoglobus fulgidus SBDS protein orthologue at a resolution of 1.9 Å, revealing a three domain architecture. The N-terminal (FYSH) domain is the most frequent target for disease mutations and contains a novel mixed α/β-fold identical to the single domain yeast protein Yhr087wp that is implicated in RNA metabolism. The central domain consists of a three-helical bundle, whereas the C-terminal domain has a ferredoxin-like fold. By genetic complementation analysis of the essential Saccharomyces cerevisiae SBDS orthologue YLR022C, we demonstrate an essential role in vivo for the FYSH domain and the central three-helical bundle. We further show that the common SDS-related K62X truncation is non-functional. Most SDS-related missense mutations that alter surface epitopes do not impair YLR022C function, but mutations affecting residues buried in the hydrophobic core of the FYSH domain severely impair or abrogate complementation. These data are consistent with absence of homozygosity for the common K62X truncation mutation in individuals with SDS, indicating that the SDS disease phenotype is a consequence of expression of hypomorphic SBDS alleles and that complete loss of SBDS function is likely to be lethal.

Shwachman-Diamond syndrome (SDS, OMIM 260400) is a rare (frequency of 1:76000) autosomal recessive disorder with clinical features that include hematological dysfunction, pancreatic exocrine insufficiency and skeletal abnormalities (1-3). SDS patients are significantly predisposed to the development of hematological abnormalities, including cytopenias of one or more lineages, myelodysplasia (MDS) and acute leukemia (4). The cumulative risk of MDS/acute leukemia among SDS patients was recently reported as 18.8 and 36.1% at 20 and 30 years, respectively (5). SDS therefore represents an important model for understanding the genetic determinants underlying the multistep progression to leukemia.
SDS-associated mutations were recently described in a gene designated SBDS (Shwachman-Bodian-Diamond syndrome) (6) that encodes a member of a highly conserved protein family of unknown function with orthologues in diverse species including Archaea, plants, and eukaryotes but not Eubacteria (Pfam UPF00023, Ref. 7). Indirect evidence supports the hypothesis that the SBDS protein family may function in RNA metabolism. Thus, the archaeal SBDS orthologues are located in highly conserved operons containing RNA-processing genes that include orthologues of the eukaryotic exosome and RNase P complex subunits (8), and at least three plant SBDS orthologues contain extended C termini with putative RNA-binding U1-type zinc fingers. Haploid spores deleted for the Saccharomyces cerevisiae SBDS orthologue YLR022C were reported to be inviable (9). Although the Ylr022cp protein has been clustered functionally with RNA-processing enzymes and ribosomal RNA-processing factors in microarray expression analyses (10,11), it has also been reported to bind the phospholipids PI(4,5)P2 and PI4P in vitro (12). GFP-tagged Ylr022cp was distributed in both the nucleus and cytoplasm (13).
In SDS, recurring mutations in the SBDS gene arise from gene conversion caused by recombination between SBDS and a pseudogene copy that lies in a 305-kb paralogous duplicon located 5.8 Mb distally (6). Converted gene segments consistently include at least one of two pseudogene-like sequence changes that are predicted to result in SBDS protein truncation. Specifically, the SBDS dinucleotide mutation 183-184TA→CT introduces an in-frame stop codon (K62X), whereas the 258+2T→C mutation is predicted to result in premature truncation of the SBDS protein by frameshift (C84fs3) (6). At least 90% of affected individuals carry one common conversion mutation, whereas 50% are compound heterozygotes with respect to the K62X and C84fs3 mutations. Strikingly, no homozygotes for the K62X mutation were identified, suggesting that this may be a lethal mutation. Alleles from affected individuals who do not have the common conversion mutations carry additional frameshift and missense mutations in the SBDS coding region (6). Two further reported series have confirmed these initial findings and identified novel missense mutations in the SBDS gene (14,15).
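The K62X designation can be checked with simple codon arithmetic. The following is a minimal Python sketch, assuming 1-based coding-sequence numbering that starts at the A of the ATG start codon. The helper names are hypothetical, and because the actual SBDS codon sequence is not given here, both standard lysine codons are tried.

```python
# Illustrative sketch only: helper names are hypothetical, and coding-sequence
# numbering is assumed to be 1-based, starting at the A of the ATG start codon.

STOP_CODONS = {"TAA", "TAG", "TGA"}
LYSINE_CODONS = ("AAA", "AAG")  # standard genetic code


def codon_of(cds_position):
    """Return (codon number, 0-based offset in the codon) for a 1-based CDS position."""
    return (cds_position - 1) // 3 + 1, (cds_position - 1) % 3


def substitute(codon, offset, base):
    """Return the codon with one base replaced at the given 0-based offset."""
    return codon[:offset] + base + codon[offset + 1:]


if __name__ == "__main__":
    for position in (183, 184):
        number, offset = codon_of(position)
        print(f"nucleotide {position}: codon {number}, offset {offset}")
    # An A->T change at the first base of a lysine codon is tested for each codon.
    for codon in LYSINE_CODONS:
        mutant = substitute(codon, 0, "T")
        print(codon, "->", mutant, "stop" if mutant in STOP_CODONS else "sense")
```

Under this numbering, nucleotide 184 is the first base of codon 62, so the A→T half of the dinucleotide change is the one that converts a lysine codon into a stop codon.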
To interpret the molecular consequences of SDS-related mutations and to begin to address the conserved function of the SBDS protein, we have determined the structure of the A. fulgidus SBDS orthologue AF0491 (herein called AfSBDS). The AfSBDS structure represents a paradigm for the SBDS protein family in view of the striking evolutionary amino acid sequence conservation. The AfSBDS structure has a three-domain architecture: an N-terminal domain with a mixed α/β-fold, a central three-helical bundle and a C-terminal ferredoxin-like domain. We show that the N-terminal domain is the most frequent target for disease mutations and is structurally related to the single domain yeast protein Yhr087wp. By genetic complementation analysis in yeast, we demonstrate that the most common SDS-related predicted protein truncations are null mutations. These data are consistent with the absence of homozygosity for the common early (K62X) truncation mutation in a large cohort of families affected with SDS (6), indicating that the SDS disease phenotype is a consequence of expression of hypomorphic SBDS alleles and that complete abrogation of SBDS function is probably lethal.

EXPERIMENTAL PROCEDURES
Crystallization-Crystals of SeMet-substituted His6-AfSBDS protein were grown as hanging drops against 1 ml of reservoir solution (19-20% w/v polyethylene glycol 8000, 100 mM CHES buffer pH 9.5, 2 mM dithiothreitol) at 21°C, using 1 µl of 10 mg/ml protein solution and 1 µl of crystallization solution. Crystals grew to typical dimensions of 300 × 100 × 50 µm over 3 days, and were cryocooled in liquid nitrogen following transfer to a cryobuffer comprising reservoir solution supplemented with 20% (v/v) glycerol.
Data Collection, Phasing, and Model Refinement-Data were collected at 100 K at the ESRF beamline ID-29 in Grenoble using an ADSC Q210 CCD detector. The SeMet-substituted AfSBDS protein crystallized in space group P1 with cell parameters a = 33.668 Å, b = 44.448 Å, c = 54.720 Å, α = 75.87°, β = 85.61°, γ = 69.49° and a solvent content of 55.6%. Data sets were collected at three different wavelengths, λ1 = 0.979200 Å, λ2 = 0.979289 Å, and λ3 = 0.891971 Å, corresponding to peak, inflection, and remote points in the fluorescence spectrum of the SeMet-substituted crystal, respectively. 360° of data were collected at each wavelength in 1° oscillations from a single crystal. Data were processed using MOSFLM (17) and scaled using SCALA (18). SHARP (19) was used to locate five of the six possible selenium atoms in the asymmetric unit to generate phases. The phases were calculated by SHARP and improved by density modification, histogram matching, and solvent flattening using DM (20). The phases from DM were used as the starting point for automated chain tracing and model building using ARP/wARP (21). Manual model building was performed using the program O (22), and the structure was refined using Refmac5 (23). All methionine residues were modeled and refined as SeMet. Maximum likelihood residuals restrained using Hendrickson-Lattman coefficients from the SHARP output were used throughout refinement against the peak wavelength data. Table I lists the statistics for data collection and phase refinement. The structure has been refined to a free R factor (Rfree) of 24.9% and shows good stereochemistry with no residues in disallowed regions of the Ramachandran plot. No density was seen for the N-terminal four amino acid residues of AfSBDS or the His6 tag.
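The relationship between the P1 cell parameters and the quoted solvent content can be illustrated with the standard triclinic volume formula and the Matthews approximation. This is a hedged sketch rather than the calculation used for the paper: the protein mass of 27 kDa is an assumption (it is not stated in the text), one molecule per unit cell is assumed for P1, and the function names are hypothetical.

```python
import math


def triclinic_volume(a, b, c, alpha, beta, gamma):
    """Unit-cell volume in cubic angstroms; lengths in angstroms, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


def matthews(volume, mass_da, copies=1):
    """Return (Matthews coefficient in A^3/Da, approximate solvent fraction)."""
    vm = volume / (copies * mass_da)
    # 1.23 is the usual constant for a typical protein partial specific volume.
    return vm, 1.0 - 1.23 / vm


if __name__ == "__main__":
    ASSUMED_MASS_DA = 27_000.0  # assumption: the protein mass is not given in the text
    volume = triclinic_volume(33.668, 44.448, 54.720, 75.87, 85.61, 69.49)
    vm, solvent = matthews(volume, ASSUMED_MASS_DA)
    print(f"V = {volume:.0f} A^3, Vm = {vm:.2f} A^3/Da, solvent = {solvent:.1%}")
```

With a mass in this range the estimate should fall in the mid-50% range, in line with the solvent content reported above; the exact figure depends on the mass assumed.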
Yeast Plasmids, Strains, and Media-The sequence encoding yeast Ylr022cp-His6 was cloned into pYC2/CT (CEN6/ARSH4/URA3) (Invitrogen) to generate pYC2[Gal10::YLR022CHis6]. The YLR022C open reading frame (ORF) was PCR-amplified with 500 bp of 5′ and 400 bp of 3′ genomic sequence and cloned into plasmids pRS314 (ARS/CEN/ ... deletion of YLR022C in the 20519D strain was confirmed by sequencing. 20519D was transformed with plasmid pYC2[Gal10::YLR022CHis6], sporulated and tetrads dissected to give haploid strain H1 (YLR022C::kanMX4/pYC2[Gal10::YLR022CHis6]). Yeast cells were grown in YPD rich medium (1% w/v yeast extract, 2% w/v Bacto Peptone, and 2% w/v glucose). In some experiments, glucose was replaced with 2% galactose (YPG). For tetrad dissections, a Singer MSM micromanipulator was used. Yeast cells were transformed by the lithium acetate method (24).
FIG. 1. ... is colored red (α-helices) and orange (β-strands); domain II is colored yellow (α-helices); domain III is colored blue (α-helices) and cyan (β-strands). The elements of secondary structure and the N and C termini are also indicated. This and all other molecular illustrations were prepared with PYMOL (www.pymol.org). B, stereoscopic diagram of the Cα trace of AfSBDS, residues 5-234. Every 10th residue is labeled. C, electrostatic surface potential of AfSBDS. The two views are related by a 180° rotation about the vertical axis. Positively charged areas are shaded blue; negatively charged areas are shaded red. The charge distribution was calculated using DelPhi (43). Residues mutated in disease that map to the basic surface of the AfSBDS protein are indicated (Arg19, Lys118, Arg126, Lys148, Arg169).
Flow Cytometry-Cells were stained with propidium iodide as described (25). Flow cytometry was performed with the FACSCalibur (BD Biosciences), and data were analyzed using CellQuest software.
Guanidinium Unfolding Assays-Two protein stock solutions of 1 µM concentration were made in buffer containing 150 mM NaCl, 50 mM Tris, pH 7.4, and 0.5 mM DTE with either 0 or 7.5 M guanidinium hydrochloride (MP Biochemicals Inc.). Circular dichroism measurements were recorded at 222 nm using an AVIV CD spectrophotometer in a quartz cuvette (1-cm path length), final volume 1.8 ml, at 25°C. A minimum of ten traces were averaged per data point and fit to a two-state model to determine the denaturant concentration at which the protein is 50% unfolded (midpoint transition, [D]50%) using KaleidaGraph (Synergy Software, Reading, PA) (26).
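The two-state analysis can be sketched as follows. This is an illustrative Python outline, not the KaleidaGraph procedure used here: it assumes a linear dependence of the unfolding free energy on denaturant concentration, flat native and unfolded baselines taken from the end points of the titration, and a coarse grid search in place of nonlinear regression. All function names are hypothetical and the demonstration data are synthetic.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def fraction_unfolded(d, d50, m, temp_k=298.15):
    """Two-state model with a linear free energy: dG(D) = m * (d50 - D)."""
    k_unfold = math.exp(-m * (d50 - d) / (R_KCAL * temp_k))
    return k_unfold / (1.0 + k_unfold)


def model_signal(d, d50, m, native, unfolded, temp_k=298.15):
    """CD signal as a population-weighted mix of flat native and unfolded baselines."""
    fu = fraction_unfolded(d, d50, m, temp_k)
    return native + (unfolded - native) * fu


def fit_midpoint(conc, signal, temp_k=298.15):
    """Grid-search least-squares estimate of d50 (M) and m (kcal mol^-1 M^-1).

    Simplification: baselines are fixed to the signals at the lowest and
    highest denaturant concentrations instead of being fitted.
    """
    pairs = sorted(zip(conc, signal))
    native, unfolded = pairs[0][1], pairs[-1][1]
    lo, hi = pairs[0][0], pairs[-1][0]
    best = None
    for i in range(int(round((hi - lo) / 0.02)) + 1):  # d50 in steps of 0.02 M
        d50 = lo + 0.02 * i
        for j in range(91):  # m from 0.5 to 5.0 in steps of 0.05
            m = 0.5 + 0.05 * j
            sse = sum((model_signal(d, d50, m, native, unfolded, temp_k) - s) ** 2
                      for d, s in pairs)
            if best is None or sse < best[0]:
                best = (sse, d50, m)
    return best[1], best[2]


if __name__ == "__main__":
    # Synthetic titration from 0 to 7.5 M denaturant (not measured data).
    conc = [0.5 * i for i in range(16)]
    obs = [model_signal(d, 3.2, 2.0, -20.0, -2.0) for d in conc]
    d50, m = fit_midpoint(conc, obs)
    print(f"estimated midpoint = {d50:.2f} M, m = {m:.2f} kcal/mol/M")
```

A full analysis would normally fit sloping baselines as well; they are held flat here only to keep the sketch short.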

RESULTS AND DISCUSSION
Overall Structure of the A. fulgidus SBDS Orthologue-We have determined the structure of the A. fulgidus SBDS orthologue AF0491 (NP_069327). The protein was expressed and purified from E. coli and crystallized in space group P1. The 1.9-Å structure was determined by x-ray diffraction using multiwavelength anomalous dispersion data collected from a single crystal generated from SeMet-substituted AfSBDS protein. The crystallographic data are summarized in Table I. The current model includes residues 5-234 of AfSBDS and has been refined to a crystallographic R factor of 21.1% (Rfree = 24.9%) using data between 52.7 and 1.9 Å.
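For readers less familiar with the refinement statistics quoted above, the conventional R factor is the summed absolute difference between observed and calculated structure-factor amplitudes divided by the summed observed amplitudes; Rfree is the same quantity evaluated on reflections withheld from refinement. The short Python sketch below assumes the calculated amplitudes are already on the same scale as the observed ones, and uses made-up amplitudes purely for illustration.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over the supplied reflections."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


if __name__ == "__main__":
    # Made-up amplitudes; a real calculation would use thousands of reflections.
    working_obs, working_calc = [100.0, 50.0, 80.0], [90.0, 55.0, 84.0]
    free_obs, free_calc = [60.0, 40.0], [50.0, 45.0]
    print(f"R     = {r_factor(working_obs, working_calc):.3f}")
    print(f"Rfree = {r_factor(free_obs, free_calc):.3f}")
```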
FYSH Domain-Domain I (Asp5-Ile87) (Fig. 1, A and B) has a mixed α/β topology, comprising four β-strands and four α-helices arranged as a three stranded anti-parallel β-sheet (β1-β2-β3-β4) broken by the insertion of a hydrogen-bonded water molecule between strands β3 and β4. Helices α3 and α4 are packed against the concave surface of the β-sheet perpendicular to the long axis of the β-strands, whereas helices α1 and α2 together with the intervening loop line the face of the cleft between domains I and III.
We noted close structural homology between the N-terminal domain of AfSBDS and a single domain protein from S. cerevisiae denoted Yhr087wp (AAB68927.1) whose structure was solved using NMR spectroscopy by the Northeast Structural Genomics Consortium (PDB code 1NYN) (27). The YHR087W ORF lies within the NAM8-GAR1 intergenic region of S. cerevisiae and encodes a 111-amino acid protein belonging to a small family whose distribution is restricted to fungi. Least squares alignment of Yhr087wp and domain I of AfSBDS using Superpose (18) gives an r.m.s. deviation of 1.47 Å for 59 common Cα atoms, despite a sequence identity of only 15.3%. The most significant difference between the two structures is a six-residue loop insertion between strands β3 and β4 of Yhr087wp relative to the AfSBDS structure (Fig. 2). In view of the structural homology between domain I of AfSBDS and Yhr087wp, and the restricted distribution of Yhr087wp homologues to fungi, we have denoted domain I of AfSBDS the FYSH (Fungal, Yhr087wp, Shwachman) domain.
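The r.m.s. deviation quoted for a superposition is the root of the mean squared distance between paired Cα atoms after the two structures have been fitted. The sketch below computes only this final quantity in plain Python; it assumes the coordinates have already been superposed and paired (the least-squares fit itself, performed here with Superpose, is not reproduced), and the coordinates shown are toy values.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired (x, y, z) coordinates in angstroms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates standing in for paired, already superposed C-alpha atoms.
    domain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    domain_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
    print(f"r.m.s. deviation = {rmsd(domain_a, domain_b):.2f} A")
```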
Yhr087wp is non-essential (28) and localizes to both the nucleus and cytoplasm of yeast cells (29). It interacts with the serine/arginine-rich protein kinase Sky1p/Ymr216cp (30) and with Rho3p/Yil118wp (31) in high-throughput two-hybrid screens, and co-immunoprecipitates with the FLAG-tagged yeast ribosomal protein Rpp0p (32). Sky1p phosphorylates the shuttling mRNA export carrier protein Npl3 to promote efficient release of mRNA (33,34). Physical interaction with Sky1p therefore supports a potential role for Yhr087wp in RNA metabolism.
Central Domain-The central domain II (Thr88-Phe161) consists of a compact three-helical bundle (α5-α7). Helices α5 and α6 are connected by a conserved proline-rich loop (α5-α6 loop) (Figs. 1A and 3). The protein that is most closely related structurally to domain II is the C-terminal domain of E. coli RuvA that functions in Holliday junction recognition (35). Superposition of AfSBDS domain II with the RuvA C-terminal domain gives an r.m.s. deviation of 2.2 Å for 45 common Cα atoms.
C-terminal Domain-The C-terminal domain III (Glu162-Gly234) is composed of a four-stranded antiparallel β-sheet (β6-β7-β5-β8) with two α-helices (α8, α9) packing against the concave surface of the sheet (Fig. 1A). The βαββαβ folding topology is typical of a ferredoxin-like fold as defined in the SCOP database (36). Of the large number of protein domains with a ferredoxin-like fold, the closest structural homologue to domain III of AfSBDS is domain V of S. cerevisiae elongation factor 2 (Eft2p/Ydr385wp) (PDB codes 1N0U and 1N0V) (37). The r.m.s. deviation is 2.36 Å over 63 Cα atoms despite no obvious homology between the protein sequences. Eft2p and its bacterial counterpart EF-G are members of the GTPase superfamily of proteins (38). Domain III of AfSBDS is also closely related structurally to a conserved E. coli protein of unknown function, YigZ (PDB code 1VI7) (39). One or more copies of the βαββαβ ferredoxin-like fold are seen in a number of RNA-binding proteins, including heterogeneous nuclear ribonucleoproteins, proteins implicated in the regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (40). It is the most common RNA-binding fold occurring in the translation system, where it is referred to as the RNA-binding domain or RNA recognition motif. The motif also appears in a few single-stranded DNA-binding proteins.
Fig. 3 shows a structure-based sequence alignment containing the secondary structure elements from both AfSBDS and Yhr087wp. The amino acid sequences of representative members of the conserved SBDS protein family and orthologues of Yhr087wp are indicated. As there is significant evolutionary sequence conservation (24% identity, 48% homology) between the human and A. fulgidus SBDS proteins, the AfSBDS structure represents a paradigm for the entire SBDS family.
FIG. 3. Structure-based sequence alignment of representative SBDS and Yhr087wp protein family orthologues. The alignment was generated manually then read into Alscript (44). The first ten sequences shown are SBDS orthologues and the following three are Yhr087wp homologues. The numbering corresponds to the amino acid sequence for AfSBDS. Secondary structure elements for AfSBDS are labeled α for helices, β for strands, and are colored as in Fig. 1. The secondary structure elements of AfSBDS were calculated from the crystal structure using DSSP (45).
Electrostatic Surface Potential-The electrostatic charge distribution on the AfSBDS protein surface was calculated using the program DelPhi (41). One face of the protein consists of an extended positively charged surface that includes contributions from all three domains (Fig. 1C). On the opposite face of the protein, the FYSH domain is predominantly negatively charged, whereas domains II and III are mainly basic. Surface residues mutated in disease (Arg19, Lys118, Arg126, Lys148, Arg169) are predicted to map to an analogous extended basic surface of the human SBDS protein (Fig. 1C). Since nucleic acid-binding proteins often use basic residues to contact the sugar-phosphate backbone, the extended basic electrostatic surface potential would be consistent with such a role for AfSBDS.
Impact of SDS-associated Mutations on Protein Stability-By direct sequencing of SBDS alleles from individuals judged clinically to have SDS, fourteen novel disease-related SBDS mutations have been identified, including missense, frameshift, splice site mutations, and deletions (Table II). The availability of the three-dimensional x-ray structure of the AfSBDS protein together with the evolutionary conservation of amino acid sequence allows us to rationalize the consequences of mutations associated with SDS (6,14,15) by mapping the SDS-associated mutations onto the molecular surface of the AfSBDS protein (Fig. 4A). The positions of the mutations within the structure-based sequence alignment are indicated in Fig. 3. Strikingly, this analysis reveals that the FYSH domain is the most frequent target for SDS-associated mutations. On the basis of our structure, SDS-associated mutations can be categorized into three main groups: first, the common protein truncations; second, rare missense mutations that are predicted to destabilize the fold of the protein; and third, rare missense mutations that are predicted to modify surface epitopes.
In the first group, the K62X mutation truncates the SBDS protein within the β4-α3 loop of the FYSH domain, whereas the C84fs3 frameshift mutation is predicted to result in truncation within helix α4. These mutations might be expected to result in loss of protein function. The second group includes buried sites or residues with a distinct structural role. Three such residues are Cys31, Leu71, and Ile87, which map to the hydrophobic core of the FYSH domain. The corresponding disease-associated mutations (C31W, L71P, and I87S) are predicted to result in loss or reduction of protein stability. The third group of mutations is predicted to affect the surface electrostatic potential or to locally alter surface epitopes and is not expected to disrupt the overall fold of the protein (R19Q, K33E, N34I, E44G, K67E, K118N, R126T, S143L, K148R, Q153R, R169C, and R169L). To verify our hypothesis about the two classes of missense mutations experimentally, we examined the stability of representative human and yeast SBDS mutant proteins in a series of guanidinium hydrochloride-induced denaturation experiments (Fig. 4, B and C). Missense mutations of disease-relevant or invariant residues in yeast Ylr022cp (N34I, E44G, R100A) or in human SBDS (E44G, K118N, E28A, R100A) that are predicted to affect surface epitopes indeed do not result in global unfolding of the protein (Fig. 4C). By contrast, we were unable to express sufficient quantities of the SBDS mutants C31W, L71P, or I87S to test in our assay, which strongly suggests that these mutations perturb the overall fold of the protein; they are therefore predicted to significantly perturb protein function in vivo. Although the number of samples tested is limited, these data support the impact of the mutations on protein stability in vitro predicted by our structural analysis.
Ylr022cp Is Required for G1 Cell Cycle Progression-To test the functional consequences of SDS-related mutations in vivo, we exploited the essential role of the S. cerevisiae SBDS orthologue Ylr022cp (9) to examine the ability of variant Ylr022c proteins to restore yeast cell growth in a complementation assay. We created a yeast strain (H1) in which YLR022C was deleted from the genome and cell viability maintained by a centromeric plasmid expressing YLR022C under the control of the GAL10 promoter (42) that can be induced in galactose-containing medium (YPG) and repressed in glucose-containing medium (YPD) (see "Experimental Procedures"). Growth of H1 or isogenic wild-type parental haploid (W303-1) cells was monitored by serial cell density (A600) measurements. Following transfer of H1 cells to YPD medium, Ylr022cp became undetectable by immunoblotting after 2 h, with delayed onset of a progressive growth arrest from around 7 h (Fig. 5, A and B). Correlating with the growth arrest, flow cytometry demonstrated a progressive increase in the proportion of H1 cells with a 1N DNA content, indicative of delayed progression through G1 phase of the cell cycle (Fig. 5C).
TABLE III. ... and orthologues. Yeast strain H1 was transformed with pRS314 carrying the indicated Ylr022cp variants or SBDS orthologues, and the growth of the transformants was measured at 30°C under restrictive conditions as shown in Fig. 6: full (+++), partial (++, +), no (-) complementation of Ylr022cp function. Column heading: complementation of Gal10::YLR022C allele in YPD. a, S96X corresponds to the yeast FYSH domain.
FIG. 5. Ylr022cp is required for G1 cell cycle progression. A, Ylr022cp is required for growth. Growth curves are shown for H1 cells cultured in YPG (open circles) or YPD (closed squares) medium. Isogenic wild-type W303-1 haploid control cells were grown in YPD (filled triangles). B, Ylr022cp depletion following transfer of H1 cells to YPD medium. Ylr022cp (with tubulin as a loading control) was detected by immunoblotting of H1 whole cell extracts at the indicated time points using anti-Ylr022cp polyclonal antiserum. C, Ylr022cp is required for G1 cell cycle progression. DNA content as measured by flow cytometry of propidium iodide-stained H1 cells is shown in black; DNA content for W303-1 haploid and 20519D diploid control cells is shown in brown.
We first tested the ability of a human SBDS allele to complement the growth defect of H1 cells in YPD medium. Centromeric pRS314 plasmids (TRP) carrying either a wild-type yeast YLR022C or human SBDS allele under the control of the endogenous YLR022C promoter were constructed (see "Experimental Procedures") and transformed into H1 cells. Because Ylr022cp is undetectable in H1 cells by immunoblotting 2 h after the shift to YPD medium, the growth phenotype of H1 cells in YPD medium is attributable solely to protein expression from the transformed pRS314 plasmids. Growth of transformed H1 cells was monitored at 30°C in -URA-TRP selective medium under restrictive conditions. As shown in Table III, unlike the wild-type YLR022C allele, the human SBDS allele failed to rescue the growth defect of H1 cells in YPD medium. Therefore, all further genetic complementation assays were performed using variant YLR022C alleles transformed into H1 cells.
In Vivo Functional Consequences of SDS-associated Mutations-A comprehensive series of YLR022C variants was generated and transformed into H1 cells. Fig. 6 shows representative quantitative growth curves of transformed H1 cells under restrictive conditions and the results are also summarized in Table III. Variant YLR022C alleles were classified into four groups based on their ability to complement growth of H1 cells in YPD medium: full (+++), partial (++, +), or no (-) growth complementation. Several disease-related mutations had severe effects on yeast cell growth: complete loss of function was associated with the K62X and S96X truncations and with the C31W, N34I, and L71P missense mutations, whereas severe growth impairment was associated with I87S (+). Partial loss of function was associated with K118N (++), but the majority of SDS-related missense mutations tested showed no apparent growth defects in our assay (R19Q, E44G, K118N, K126A, S143L+K148R, Q153R, R169C). Surprisingly, with the exception of Arg100 (++), alanine substitutions of evolutionarily invariant residues (Glu28, Lys62, Gln94) or conserved charged residues exposed on the surface of the central three-helical bundle (Lys98, Lys148, Glu156, Lys159) showed no apparent growth defects.
Consistent with a critical role for the FYSH domain, the N-terminal deletion variant Δ1-95 was non-functional. Deletion of domain III (M174X) or the missense mutation R169P (but interestingly, not the disease-related mutation R169C) resulted in partial (++) loss of function. Taken together, the Δ1-95 deletion and M174X truncation mutations indicate that the FYSH domain alone is insufficient to complement Ylr022cp function, and reveal that domains I and II together are essential for function.
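The complementation results described in the text can be collected into a small lookup table, which makes the contrast between buried FYSH-core substitutions and the other variants easy to inspect. The Python sketch below is illustrative only: the scores are transcribed from the prose above rather than from Table III itself, K118N is entered as partial (++) as stated explicitly, the alanine-scanning substitutions are omitted for brevity, and the names used are hypothetical.

```python
# Scores transcribed from the prose (not from Table III itself); names are hypothetical.
SCORES = {
    "K62X": "-", "S96X": "-", "C31W": "-", "N34I": "-", "L71P": "-", "Δ1-95": "-",
    "I87S": "+",
    "K118N": "++", "R100A": "++", "M174X": "++", "R169P": "++",
    "R19Q": "+++", "E44G": "+++", "K126A": "+++", "S143L+K148R": "+++",
    "Q153R": "+++", "R169C": "+++",
}
RANK = {"-": 0, "+": 1, "++": 2, "+++": 3}  # higher rank = better complementation
FYSH_CORE = ("C31W", "L71P", "I87S")  # substitutions of buried hydrophobic-core residues


def group_by_score(scores):
    """Return a dict mapping each complementation score to a sorted list of variants."""
    groups = {label: [] for label in RANK}
    for variant, label in scores.items():
        groups[label].append(variant)
    return {label: sorted(names) for label, names in groups.items()}


if __name__ == "__main__":
    for label, variants in group_by_score(SCORES).items():
        print(f"{label:>3}: {', '.join(variants)}")
    worst_core = max(RANK[SCORES[v]] for v in FYSH_CORE)
    print("best score among FYSH core substitutions has rank", worst_core)
```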
Importantly, these data demonstrate that the disease-related Ylr022cp variants K62X and S96X are null mutations. S96X introduces a premature stop codon at the junction between the FYSH domain and domain II, analogous to the effect of the premature protein truncation by frameshift that is predicted for the common 258+2T→C (C84fs3) disease mutation. This indicates, therefore, that the most common predicted disease-associated SBDS truncations are null mutations. The FYSH domain hydrophobic core mutations (C31W, L71P, I87S) are particularly deleterious to Ylr022cp function, presumably as a consequence of impaired protein folding and consistent with our inability to express these variant proteins in vitro. As the N34I mutation does not appear to perturb global protein folding (Fig. 4C), the consequence of the introduction of a bulky hydrophobic Ile residue on the protein surface may be the reduction of protein solubility. The surface-exposed residues Lys118 and Arg100 may be critical for interaction with a potential partner protein or ligand, as the respective K118N and R100A missense mutations significantly affect Ylr022cp function in vivo in the absence of an effect on global folding (Figs. 4C and 6, Table III).
Insight into the Pathogenesis of SDS-Affected individuals from 50% of families are compound heterozygotes with respect to the 258+2T→C (C84fs3) splice site and the 183-184TA→CT (K62X) premature stop codon mutations (6). Of 158 families reported, no homozygotes for the 183-184TA→CT (K62X) premature stop mutation were identified (6). By contrast, seven families were found to be homozygous for the 258+2T→C (C84fs3) splice-site mutation. This strongly suggests that the 183-184TA→CT (K62X) premature stop is a null mutation, and that a homozygous K62X-null mutation is embryonic lethal. By contrast, the 258+2T→C splice-site mutation is an obligate hypomorph with respect to SBDS function. Consistent with these observations, our yeast genetic complementation analysis indeed demonstrates that the K62X truncation is a null mutation. However, the S96X truncation reveals that the highly conserved FYSH domain is insufficient to complement wild-type Ylr022cp function. This indicates that the hypomorphic nature of the C84fs3 allele is not the result of residual function associated with the predicted SBDS truncation fragment. Rather, we speculate that the GC dinucleotide donor splice site that arises as a consequence of the 258+2T→C mutation may result in inefficient low level splicing activity and SBDS protein expression. Alternative splicing may yield a transcript that results in a functional SBDS protein product. Interestingly, the majority of SDS-related missense mutations fail to abrogate Ylr022cp function. Taken together, these results indicate that the clinical phenotype of SDS is associated with expression of functionally hypomorphic SBDS alleles.
The structural features of the AfSBDS protein, together with the homology to Yhr087wp described herein, are consistent with the proposed role for the SBDS protein family in RNA metabolism (6), although we have insufficient data to further define the specific function at present. Our structural and genetic analysis reveals that the novel FYSH domain is a frequent target for disease mutations and is required for protein function in vivo, suggesting that elucidation of the specific role of the FYSH domain may provide critical insights into the function of the SBDS protein family. It will be of interest to determine potential interactions (genetic/physical) common to both Ylr022cp and Yhr087wp to further elucidate the function of these novel proteins.
Coordinates-The coordinates have been deposited in the Protein Data Bank (accession code 1T95).