J. Biol. Chem., Vol. 280, Issue 19, 19213-19220, May 13, 2005
The Shwachman-Bodian-Diamond Syndrome Protein Family Is Involved in RNA Metabolism*
From the (a) Ontario Center for Structural Proteomics, the (b) Banting and Best Department of Medical Research, and the (e) Structural Genomics Consortium, University of Toronto, Toronto, Ontario M5G 1L6, Canada, the (c) Department of Molecular and Medical Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada, the (d) Northeast Structural Genomics Consortium, and Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, the (f) Division of Molecular and Structural Biology, Ontario Cancer Institute, Toronto, Ontario M5G 2M9, Canada, the (g) Department of Medical Biophysics, University of Toronto, Toronto, Ontario M5G 2M9, Canada, the (h) Protein Design Group, Centro Nacional de Biotecnologia (CNB-CSIC), Cantoblanco, E-28049 Madrid, Spain, the (i) Bioinformatics Group, Ontario Genomics Innovation Centre, Ottawa Health Research Institute, Ottawa, Ontario K1H 8L6, Canada, the (j) Department of Biochemistry and Molecular Biology, University of Oklahoma Health Sciences Center, Oklahoma City, Oklahoma 73190, the (k) European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD United Kingdom, and the (l) Program in Genetics and Genomic Biology, The Hospital for Sick Children, Toronto, Ontario M5G 1X8, Canada
Received for publication, December 22, 2004, and in revised form, January 24, 2005.
A combination of structural, biochemical, and genetic studies in model organisms was used to infer a cellular role for the human protein (SBDS) responsible for Shwachman-Bodian-Diamond syndrome. The crystal structure of the SBDS homologue in Archaeoglobus fulgidus, AF0491, revealed a three domain protein. The N-terminal domain, which harbors the majority of disease-linked mutations, has a novel three-dimensional fold. The central domain has the common winged helix-turn-helix motif, and the C-terminal domain shares structural homology with known RNA-binding domains. Proteomic analysis of the SBDS sequence homologue in Saccharomyces cerevisiae, YLR022C, revealed an association with over 20 proteins involved in ribosome biosynthesis. NMR structural genomics revealed another yeast protein, YHR087W, to be a structural homologue of the AF0491 N-terminal domain. Sequence analysis confirmed them as distant sequence homologues, therefore related by divergent evolution. Synthetic genetic array analysis of YHR087W revealed genetic interactions with proteins involved in RNA and rRNA processing including Mdm20/Nat3, Nsr1, and Npl3. Our observations, taken together with previous reports, support the conclusion that SBDS and its homologues play a role in RNA metabolism.
Shwachman-Bodian-Diamond (SBD)1 syndrome (OMIM 260400) is a rare autosomal recessive disorder that is caused by mutations in the SBDS gene on chromosome 7 (1). The mRNA for SBDS is ubiquitously expressed, and its mutation has many consequences, including abnormalities of pancreatic exocrine function, skeletal defects, and hematological dysfunction (1-4).
The predominant genetic event that underlies SBD syndrome appears to be a gene conversion event between the SBDS locus and a pseudogene that is 97% similar in sequence to SBDS, but is predicted to encode a non-functional protein (1, 5). Sequence analysis of disease-associated alleles has identified more than 20 different mutations in affected individuals (1, 5, 6). Two mutations, both predicted to lead to truncated gene products, account for over 95% of these mutant alleles. The first, and most common mutation, 258 + 2T
Although the molecular function of SBDS is unknown, the predicted SBDS protein is evolutionarily conserved and has apparent homologues in Archaea, plants, yeast, and other lower eukaryotes, suggesting that it may have a fundamental, conserved biochemical role. Several lines of circumstantial evidence suggest SBDS has a role in RNA metabolism. First, in Archaea, the gene is located in an operon that encodes, among other proteins, RNA-processing enzymes (7). Second, in yeast, the gene for the SBDS homologue (YLR022C) clustered with RNA-processing enzymes in transcription profiling experiments (8). Third, although the coding sequence is divided into three blocks of sequence conservation in most organisms, including humans, several plant SBDS homologues contain a fourth C-terminal region that is predicted to be an RNA binding domain (1).
In an effort to determine the molecular and cellular functions of SBDS, we initiated a series of studies in model organisms. As a part of our structural genomics project, we used x-ray crystallography to determine a high resolution structure of the Archaeal SBDS homologue, AF0491 in Archaeoglobus fulgidus, and NMR to discover an unanticipated structural homologue of the N-terminal domain of SBDS in Saccharomyces cerevisiae (YHR087W). The biochemical and genetic links to RNA metabolism for both the S. cerevisiae SBDS sequence homologue, YLR022C, and the more distantly related homologue, YHR087W, suggest a common functional role for this new structural family.
Cloning, Purification, and Crystallization of AF0491
The AF0491 gene was subcloned and expressed, and its product was purified and screened for crystallization as described previously (9). Crystals for x-ray diffraction data collection were obtained from hanging drop vapor diffusion conditions containing 2 µl of Se-Met derivative of the protein plus 2 µl of 0.2 M sodium acetate, 0.1 M sodium cacodylate at pH 6.4, and 20% polyethylene glycol 3350 over 2-5 days at 21 °C. The crystals were flash-frozen in crystallization buffer supplemented with 17% glycerol.
X-ray Diffraction and Structure Determination
The protein was crystallized in the P1 space group with the unit cell parameters of a = 33.848, b = 44.189, c = 61.295,
Cloning and Purification of YHR087W
The YHR087W gene was PCR-amplified from genomic DNA and inserted into a pET15b (Novagen) vector. This construct yields the protein with an N-terminal His6 tag and thrombin cut site. The protein expression and purification method are described by Yee et al. (17). The His6 tag was not cleaved, leaving an extra 20 residues at the N terminus (MGSSHHHHHHSSGLVPRGSH). NMR samples of 1.5 mM uniformly 15N/13C-labeled protein were prepared in 10 mM NaOAc, 300 mM NaCl, 10 mM dithiothreitol, 10 µM Zn2+, 1 mM benzamidine, 1x inhibitor mixture (Roche Applied Science), and 0.01% (w/v) NaN3 in 10% (v/v) 2H2O/H2O at pH 5.0. This buffer was not optimized to determine the importance of any components. Samples were placed in 5-mm Shigemi susceptibility matched NMR tubes.
NMR Spectroscopy
NMR data were collected at 25 °C using standard triple resonance pulse sequences (18). Backbone and side chain correlation experiments (HNCACB, CBCACONNH, HNCO, CBCACOCAHA, CCC-TOCSY-NNH, HCC-TOCSY-NNH, and HCCH-TOCSY) were acquired on a Varian Unity 600 spectrometer. 1H-13C HSQC, three-dimensional 15N NOESY (150-ms mixing time), four-dimensional 13C-13C HMQC-NOESY-HMQC (125-ms mixing time), HNHA, and aromatic ring side chain correlation experiments (HBCBCG-CDHD/CEHE-aro) were acquired on a Varian Inova 600 spectrometer. 1H-13C HSQC and three-dimensional 13C NOESY (120-ms mixing time) experiments were acquired on a Varian Inova 750 spectrometer. Amide proton exchange was monitored by acquiring 1H-15N HSQC spectra following dissolution of a lyophilized protein sample in D2O. Stereospecific Leu and Val side chain assignments were obtained from a 1H-13C HSQC experiment recorded on a sample that had been produced in minimal growth medium containing 10% [U-13C]glucose and 90% unenriched glucose (19). The raw data and pulse sequences used have been deposited in the BioMagResBank (Madison, WI). Data were processed with Felix (MSI) and analyzed with Felix and Sparky (www.cgl.ucsf.edu/home/sparky).
Calculation and Analysis of YHR087W Structural Ensemble
NOE distance restraints had uniform lower bounds of 1.8 Å and upper bounds of 2.8, 3.2, 4.0, or 5.0 Å. Hydrogen bond restraints were derived from amide proton D2O exchange data. Amide 1H-15N HSQC crosspeaks still present 30 min after dissolution of a lyophilized sample in D2O were given bounds of 1.8-2.5 Å for the HN-O distance and 2.8-3.5 Å for the N-O distance, provided preliminary structural ensembles clearly indicated the correct acceptor atom. Dihedral restraints for phi were derived from the HNHA experiment (20) and had bounds of -55 ± 30 degrees for helical residues with J < 5 Hz and -120 ± 50 degrees for extended residues with J > 7.5 Hz. Dihedral restraints for psi had values of -47 ± 30 degrees for helical residues and 140 ± 50 degrees for extended residues. Psi restraints were added only for residues in helices and
Structures were calculated with NIH-Xplor (21) using distance geometry and simulated annealing. The routines dg_sub_embed, dg_full_embed, and dgsa were used as provided except that in dgsa, an initial temperature of 2,000 K was used with 30,000 high temperature steps and 200,000 cooling steps. Sum averaging was used for methyl groups and methylene proton pairs. The structural ensemble was analyzed with PROCHECK-NMR (22). Surface electrostatic features of the protein were examined using GRASP (23). Structure similarity searches using Dali (24) and VAST (25) were conducted using the first structure from the ensemble as the representative structure. Structural statistics are presented in Table II.
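The phi restraint rule stated above (helical bounds for J < 5 Hz, extended bounds for J > 7.5 Hz) can be illustrated with a minimal sketch. The function name phi_restraint and its return convention are hypothetical and are not part of NIH-Xplor or of the procedure actually used; the sketch assumes that residues with intermediate couplings simply receive no phi restraint, which the text implies but does not state.

```python
from typing import Optional, Tuple

# Thresholds and bounds taken from the text; everything else is illustrative.
HELIX_J_MAX_HZ = 5.0        # J below this value: helical residue
EXTENDED_J_MIN_HZ = 7.5     # J above this value: extended residue


def phi_restraint(j_hz: float) -> Optional[Tuple[float, float]]:
    """Map an HNHA-derived coupling constant (Hz) to phi bounds in degrees.

    Returns (lower, upper), or None when the coupling falls between the two
    thresholds (assumed here to mean that no restraint is applied).
    """
    if j_hz < HELIX_J_MAX_HZ:
        center, width = -55.0, 30.0      # helical: -55 +/- 30 degrees
    elif j_hz > EXTENDED_J_MIN_HZ:
        center, width = -120.0, 50.0     # extended: -120 +/- 50 degrees
    else:
        return None
    return (center - width, center + width)


if __name__ == "__main__":
    for j in (4.0, 6.0, 8.5):
        print(j, phi_restraint(j))
```

Returning explicit lower and upper bounds, rather than a center and a tolerance, keeps the output directly comparable with the distance restraints, which are also given as bounds.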
Sequence Analysis of the Conserved N-terminal Region of the SBDS Family
The alignment of the orthologues of human SBDS was used to search for distant protein families via intermediate searches (26) using global hidden Markov model profiles (using hmmsearch of HMMer; hmmer.wustl.edu/) (27). To improve the profile quality we followed two approaches: first, BLAST searches against unfinished genomes (28), and second, additional searches against EST databases (29), using NAIL to view and analyze the HMMer results (30). This sequence enrichment improved the quality of the profile that was used to perform the searches against the non-redundant protein databases. The alignment in Fig. 3 was produced with HMMer (27) and Belvu (www.sanger.ac.uk/Software/Pfam/help/belvu_setup.shtml), and the phylogenetic tree was produced using ClustalW (31).
Purification of TAP-tagged Proteins and Mass Spectrometry
TAP-tagged (tandem affinity purification) (32) proteins were purified from extracts of yeast cells as previously described (33) on IgG and calmodulin columns. The purified proteins were separated by SDS-PAGE on gels containing 10% polyacrylamide, and the proteins were visualized by silver staining. Protein bands were digested with trypsin and peptide samples were spotted onto a target plate with a matrix of
Synthetic Genetic Array Analysis (SGA)
SGA analysis was carried out as previously described (34), except with a miniarray of 383 deletion strains.
Crystal Structure of AF0491, the Archaeal Homologue of SBDS
The SBDS protein is conserved in Archaea and Eukarya. A 2.0-Å structure of AF0491, the SBDS homologue from A. fulgidus, was solved from a single selenomethionine-containing crystal. The structure was refined to an Rfactor/Rfree of 21.9/27.0%. The data collection and refinement details are provided under "Experimental Procedures" and in Table I.
AF0491 comprises three domains (Fig. 1). The AF0491 N-terminal domain (residues 1-86) is an
The middle domain of AF0491 (residues 87-160) is a winged helix-turn-helix, a common fold associated with DNA binding (35). In the context of the human SBDS, this domain corresponds to residues 98-169. The C-terminal domain of AF0491 (residues 161-234), which corresponds to residues 170-241 in the human protein, comprises a four-stranded
More than 20 different mutations have been identified in SBD syndrome patients (1, 5, 6) (Fig. 1A). Most of the mutations that alter surface residues are located in the N-terminal half of the protein, where many of the conserved residues are also located. An analysis of disease-linked mutations can be found in the report by Shammas et al. (6).
YHR087W and the AF0491 N-terminal Domain
As part of an ongoing structural genomics project in yeast, we used NMR spectroscopy to determine the structure of YHR087W from S. cerevisiae (Fig. 2A). This protein did not have a previously known structure or function; however, the YHR087W orthologue in S. pombe is expressed during sporulation and environmental stress (36, 37). The structure of YHR087W displayed striking similarity to that of the N-terminal domain of AF0491 (Fig. 2B), a similarity that was first recognized by Alexey Murzin.2 This was confirmed using pairwise comparisons: Dali returned an alignment of 79 residues with an r.m.s.d. of 2.1 Å for the CA atoms, whereas VAST reported 68 aligned residues with an r.m.s.d. for CA atoms of 2.3 Å, a score of 6.5, and a p value of 0.002 (Fig. 1A). Consequently, we searched for sequence similarity between the two proteins using HMMer. Profiles of the N-terminal conserved region of the SBDS family scored against the YHR087W sequence with an E-value of 0.042. Reciprocally, the profile of the YHR087W sequence with its orthologues detected the SBDS family with an E-value of 0.018. Statistically significant E-values connected all the sequences contained in both families. None of these HMMer profile searches retrieved any new unrelated sequences and, as stated above, reciprocal searches produced convergent results. Thus, YHR087W is a distant sequence homologue of the AF0491 N-terminal domain and, by extension, of both the human SBDS sequence and of YLR022C, an SBDS sequence homologue that we identified in S. cerevisiae (Figs. 1A and 3). The runs of hydrophobic residues are in good agreement between both alignments, and there are also several conserved residues. The grouping by taxonomy is in agreement with the structure of the phylogenetic tree derived from the multiple sequence alignment, with all sequences in the branch of YHR087W being found exclusively in fungi (with the exception of the Zea mays sequence, which could be either a true case of horizontal transfer or the result of a fungal contamination of the corresponding EST library).
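The r.m.s.d. values quoted above for the Dali and VAST comparisons summarize how far paired CA atoms lie from one another after superposition. The following minimal sketch shows only that final arithmetic. It assumes the two coordinate lists are already superposed and already paired residue by residue; it does not reproduce the alignment or superposition steps that Dali and VAST perform, and the function name ca_rmsd is hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def ca_rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root mean square deviation (same units as the input, e.g. angstroms)
    between two equal-length lists of paired, pre-superposed CA coordinates."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("at least one atom pair is required")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates, not taken from either structure.
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    b = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
    print(ca_rmsd(a, b))
```

Because the value is an average over the aligned pairs only, an r.m.s.d. should be read together with the number of aligned residues, as both programs report.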
The structural similarity between YHR087W and the AF0491 N-terminal domain (AF0491-N) extends over the entire structure, including conservation of the irregular
The electrostatic properties of YHR087W and AF0491-N are also similar (Fig. 4). One side of both proteins, formed by the side of helix
Functional Analysis of YLR022C and YHR087W, SBDS Homologues in S. cerevisiae
The biochemical and cellular functions of both YLR022C and YHR087W are unknown. Strains lacking the YLR022C gene are not viable, indicating it is an essential gene, whereas strains lacking YHR087W are viable (38). Thus, in an effort to infer the functions of YLR022C and YHR087W, and by inference those of SBDS, we turned to the genomic and proteomic analyses that have been used with such success in yeast. YLR022C was tagged at the C terminus with a TAP tag by recombining a tagging cassette directly into the chromosomal YLR022C locus. The protein was then purified by tandem affinity chromatography (Fig. 5) and associated proteins were identified using LC/MS/MS mass spectrometry. YLR022C copurified with small amounts of a large number of polypeptides, most of which are linked to the processing of rRNA and many of which are known components of the 60 S particle. Because ribosomal proteins are abundant, there is a concern that they might be adventitiously associated with YLR022C. However, we believe that this is unlikely because we have now purified more than 3,000 yeast proteins, and only a limited number have co-purified with the 60 S particle. Therefore, these data support a role for YLR022C in ribosomal processing.
To explore whether the sequence and structural similarity between YHR087W and YLR022C corresponded to a functional similarity, we performed genetic and protein interaction studies with YHR087W. When YHR087W was tagged with a TAP tag and purified using tandem affinity chromatography, we did not detect any co-purifying proteins (data not shown). However, since YHR087W is a non-essential gene, we were able to explore genetic interactions between YHR087W and other yeast genes using SGA analysis; results were confirmed by tetrad analysis. A strain deleted for YHR087W was crossed with a miniarray of 383 other deletion strains, each lacking a protein implicated in RNA metabolism. Deletion of YHR087W caused marked synthetic lethality when combined with deletions of several other genes including those that encode the NatB complex (Mdm20/Nat3), which is required for the acetylation of ribosomal proteins (39), and Nsr1, a nucleolar protein involved in the synthesis of 18 S rRNA and its 20 S precursor (40, 41). Nsr1 has two RNA recognition domains and is a member of the GAR (glycine/arginine-rich repeats) family of proteins (42). YHR087W also interacted genetically with each of Npl3, Air2, and Yra2, which others have shown to interact with each other both by GST pull-down and two hybrid-based methods (43, 44). Npl3 is a protein involved in 18 S and 25 S rRNA processing, export of RNA from the nucleus, and import of proteins into the nucleus (45-48). Npl3 is also associated with the U1 snRNP and is predicted to have two RNA recognition domains (46, 49). Air2 is a RING-type zinc finger domain protein, and Yra2 is a protein that associates with RNP complexes (43, 44). Therefore, the genetic interactions of YHR087W support a role for the protein in RNA processing.
Shwachman-Bodian-Diamond syndrome is the second most common cause of pancreatic insufficiency in children, after cystic fibrosis (50). The syndrome is caused by mutations in the SBDS gene on chromosome 7 (1). Our studies of SBDS homologues in Archaea and yeast provide experimental evidence for a role for SBDS in RNA metabolism.
We have solved the crystal structure of AF0491, the SBDS homologue in A. fulgidus, revealing a three domain protein. The C-terminal domain comprises a four-stranded
The central domain of the protein adopts another common fold, the winged helix-turn-helix (wHTH). Proteins with an HTH domain are abundant in all Archaea, with the wHTH fold being the most common (55), and so it is difficult to infer a molecular function. Although HTH domains are widely used in DNA binding and have also been identified in RNA-binding proteins (56-58), a role in nucleic acid binding is not supported since the surface of AF0491 does not have the general basic character expected for such a function. Part of this domain may be involved in protein-protein interactions, as with Kluyveromyces lactis HSF, another wHTH protein. Unlike other wHTH proteins where the wings contact DNA, the wing in K. lactis HSF is involved in HSF dimerization (59).
The N-terminal domain of AF0491 is a novel
To elucidate the function of SBDS, we made use of the various experimental advantages of the yeast S. cerevisiae, and studied the SBDS structural and sequence homologues, YHR087W and YLR022C respectively. Our results link both YLR022C and YHR087W to RNA metabolism. TAP-tagged YLR022C co-purified with numerous ribosomal proteins and proteins associated with rRNA processing. Protein complexes that are purified using the TAP protocol can also be probed for the presence of specific RNAs through hybridization to DNA microarrays. This process is particularly applicable to RNA processing enzymes, which are often ribonucleoprotein assemblies. Previously, the affinity-purified YLR022C complex was probed for potential co-purifying RNAs by using a dedicated DNA microarray (60). YLR022C was found to co-purify with snoRNAs and exhibited a profile that is similar to that of YHR040W (BCD1), an RNA-associating protein. snoRNPs, which contain these non-coding snoRNAs, have been implicated in the cleavage, modification, and folding of precursor rRNA substrates (61).
Our studies, which provide experimental evidence for a role for SBDS in RNA metabolism, support previous hypotheses which were based on bioinformatics studies. In Archaea, the orthologous gene is located in an operon that includes RNA-processing enzymes (7). Through a computational study, YLR022C was predicted to function in rRNA processing (62). This was accomplished by analyzing protein-protein interaction networks and identifying biologically relevant functional groups. Also, SBDS RNA is expressed ubiquitously (1), which is consistent with a basic cellular function such as RNA processing. Finally, in some plants, the SBDS orthologues have a fourth domain that contains a putative RNA-binding motif (1). Although our data link the SBDS protein to ribosomal biogenesis, the specific role of SBDS in this pathway remains to be determined.
Whereas the function of the ribosome is conserved in all living species, ribosomal biogenesis is fundamentally different in Bacteria compared with Archaea and Eukarya; many families of ribosomal genes are specific to Archaea and Eukarya and are absent from Bacteria (63). The observation that the SBDS gene is restricted to Archaea and Eukarya (1) suggests that SBDS plays a role that is specific for the process in these two kingdoms. Bacterial ribosomes, though sharing many proteins with those in eukaryotes and Archaea, are assembled differently (64-67). Active prokaryotic ribosomes can be reconstituted in vitro using only the individual ribosomal components. Although this does not exclude the involvement of additional factors in vivo, it is evident that all the information needed to assemble an active prokaryotic ribosome is contained within the sequences of the ribosomal proteins and rRNAs. By contrast, eukaryotic, and likely archaeal, ribosome biosynthesis requires the coordinated action of hundreds of accessory proteins, snoRNPs, and ribosomal proteins to produce the final assembly of over 70 ribosomal proteins with four rRNAs. In this pathway, three of these rRNAs (18 S, 25 S, and 5.8 S) are produced from a single 35 S transcript whereas the fourth 5 S rRNA is independently transcribed. Processing of the pre-rRNA occurs following association of ribosomal and non-ribosomal proteins with the pre-rRNAs, forming a pre-ribosomal particle. Pre-rRNA modifications needed to produce the mature rRNA include cleavage of the pre-rRNAs, conversion of uridine residues to pseudouridine, and nucleotide methylation. This pre-ribosomal particle then separates into 40 S and 60 S presubunits. As they exit the nucleolus and then the nucleus, additional factors associate with, and dissociate from, the presubunits until the final maturation of the ribosome occurs in the cytoplasm. 
Numerous trans-acting protein factors with a role in ribosome biogenesis have been identified in yeast, including rRNA-modifying enzymes, nucleases and putative RNA helicases, and it is likely that others have not yet been discovered (65). SBDS may be one such factor, necessary for eukaryotic, but not prokaryotic, ribosomal biosynthesis.
The atomic coordinates and structure factors (code 1P9Q and 1NYN) have been deposited in the Protein Data Bank, Research Collaboratory for Structural Bioinformatics, Rutgers University, New Brunswick, NJ (http://www.rcsb.org/). NMR data (code 5695) on chemical shifts have been deposited in the BioMagResBank, Madison, WI.
* This work was supported by Genome Canada, the Ontario Research and Development Challenge Fund, and the National Institutes of Health Protein Structure Initiative (Grants P50-GM62413-02 to the NE Structural Genomics Consortium and P50-GM62414 to the Midwest Center for Structural Genomics). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
m To whom correspondence should be addressed: Banting and Best Dept. of Medical Research, C. H. Best Institute, University of Toronto, 112 College St., Toronto, ON M5G 1L6, Canada. Tel.: 416-946-3436; Fax: 416-946-0588; E-mail: aled.edwards{at}utoronto.ca.
1 The abbreviations used are: SBD, Shwachman-Bodian-Diamond; TAP, tandem affinity purification; SGA, synthetic genetic array; RRM, RNA-recognition motif; AF0491-N, AF0491 N-terminal domain; GAR, glycine/arginine-rich repeats; wHTH, winged helix-turn-helix; r.m.s.d., root mean square deviation; EST, expressed sequence tag; PDB, Protein Data Bank; MALDI-TOF, matrix-assisted laser desorption/ionization-time of flight.
2 A. Murzin, personal communication.
All NMR spectra were performed at the Environmental Molecular Sciences Laboratory, a national scientific user facility sponsored by the United States Department of Energy Office of Biological and Environmental Research, located at Pacific Northwest National Laboratory and operated by Battelle for the DOE. Diffraction data were collected at the Argonne National Laboratory (Beamline 19ID).