J. Biol. Chem., Vol. 279, Issue 45, 46787-46793, November 5, 2004
Solution Structure of a Ubiquitin-like Domain from Tubulin-binding Cofactor B*
From the
Received for publication, August 17, 2004, and in revised form, September 2, 2004.
Proper folding and assembly of tubulin αβ-heterodimers involves a stepwise progression mediated by a group of protein cofactors, A through E. Upon release from the chaperonin CCT, the tubulin monomers are acted upon by each cofactor in the folding pathway through a unique combination of protein interaction domains. Three-dimensional structures have previously been reported for cofactor A and the C-terminal CAP-Gly domain of cofactor B (CoB). Here we report the NMR structure of the N-terminal domain of Caenorhabditis elegans CoB and show that it closely resembles ubiquitin, as was recently postulated on the basis of bioinformatic analysis (Grynberg, M., Jaroszewski, L., and Godzik, A. (2003) BMC Bioinformatics 4, 46). CoB binds partially folded α-tubulin monomers, and a putative tubulin-binding motif within the N-terminal domain is identified from sequence and structure comparisons. Based on modeling of the homologous cofactor E ubiquitin-like domain, we hypothesize that cofactors B and E may associate via their β-grasp domains in a manner analogous to the PB1 and caspase-activated deoxyribonuclease superfamily of protein interaction domains.
The cytoskeleton of eukaryotic cells is a network of microfilaments, intermediate filaments, and microtubules. The microtubules, hollow tubes made up of αβ-tubulin heterodimers, provide structural support for the cell and play a role in cell division, cell movement, and intracellular transport. The actin and tubulin subunits of which microfilaments and microtubules are composed require the cytosolic chaperonin CCT (also known as c-cpn or TriC) for proper folding (1). However, in contrast to the folding of actin and γ-tubulin, which requires only chaperonin and ATP, the folding and dimerization of α- and β-tubulin is a complex process involving the additional presence of GTP and five tubulin folding cofactors (2–4). After release from CCT, quasi-native folding intermediates are captured and stabilized by cofactors A or D (in the case of β-tubulin) or by cofactors B and E (in the case of α-tubulin) (5, 6). A supercomplex is then formed consisting of cofactors C–E and α- and β-tubulin. Finally, the αβ-tubulin heterodimer is released upon hydrolysis of GTP (7, 8). In addition, the interaction of cofactor D with native tubulin is regulated by the small G protein Arl2 (9).
Of the five cofactors, structures have been determined only for cofactor A (10, 11) and the C-terminal glycine-rich cytoskeleton-associated protein (CAP-Gly) domain of cofactor B (12). However, a recent study combining modeling and fold prediction tools identified previously unknown domains of cofactors B–E (13). These include a spectrin-like domain in cofactor C, HEAT and Armadillo domains in cofactor D, and leucine-rich repeats in cofactor E. In addition, the N and C termini of cofactors B and E, respectively, were predicted to adopt a ubiquitin-like
Cloning. We developed an Escherichia coli-based expression system to produce the N-terminal half of tubulin folding cofactor B from C. elegans, residues 1–120. PCR was used to amplify a gene fragment coding for the N-terminal 120 amino acids with BamHI and HindIII sites at the 5' and 3' ends, respectively, to facilitate insertion into a modified pQE30 vector (Qiagen, Valencia, CA). The coding sequence of the insert was verified by DNA sequencing.
Protein Expression and Purification. The pQE30T-CoB (1–120) expression construct was transformed into E. coli strain SG13009[pREP4] (Qiagen). Cells were grown at 37 °C in LB medium containing 150 µg/ml ampicillin and 50 µg/ml kanamycin until the cell density reached A600 = 0.7. Protein expression was then induced by the addition of isopropyl-
The recombinant N-terminal domain of CoB was found exclusively in the soluble fraction of the harvested cells and was purified by the following procedure. Cell paste from a 1-liter culture was resuspended in 10 ml of lysis buffer (50 mM sodium phosphate, pH 7.4, 300 mM NaCl, 10 mM imidazole, 0.1% (v/v) 2-mercaptoethanol, 1 mM phenylmethylsulfonyl fluoride) and lysed by two passes through a French pressure cell. The cell lysate was clarified by centrifugation at 10,000 × g for 15 min. The resulting protein solution was incubated with Ni2+-NTA resin (Qiagen) in batch mode for 30 min at room temperature, packed into a 5-ml disposable column, and washed with 3 × 10-ml portions of lysis buffer. The bound protein was eluted with 3 × 10-ml portions of elution buffer (50 mM sodium phosphate, pH 7.4, 300 mM NaCl, 250 mM imidazole, 0.1% (v/v) 2-mercaptoethanol). Eluted fractions were pooled and dialyzed against 4 liters of 50 mM sodium phosphate, pH 7.4, 100 mM NaCl, 0.1% (v/v) 2-mercaptoethanol overnight at 4 °C. Separation of the fusion tag was performed concurrently with dialysis by addition of a catalytic amount of TEV protease (
During the course of NMR data collection for structural analysis, signals corresponding to residues at the C terminus gradually became weaker and split into multiple species, suggesting that this portion of the protein may be susceptible to proteolytic degradation. Mass spectrometry revealed a series of species derived from the initial product by removal of varying numbers of amino acids from the C terminus. As described below, residues beyond Val84 are unstructured and highly mobile, making the C terminus more susceptible to degradation. However, chemical shifts of residues 1–115 were unperturbed by heterogeneity at the C terminus, indicating that the protein tertiary structure was unaffected.
NMR Spectroscopy. All NMR data were acquired at 25 °C on a Bruker 600 MHz spectrometer equipped with a triple-resonance Cryo-Probe™ and processed with NMRPipe software (14). Heteronuclear 15N-1H NOE values were determined from an interleaved pair of two-dimensional gradient sensitivity-enhanced correlation spectra of [U-15N]CoB Ubl domain acquired with and without a 3-s proton saturation period (15). Backbone resonance assignments were obtained in a semi-automated manner using the program Garant (16) with peak lists from three-dimensional HNCO, HNCACO, HNCA, HNCOCA, HNCACB, and CCONH spectra generated manually with XEASY software (17). Sidechain assignments were determined manually using three-dimensional 15N-edited TOCSY-HSQC and HCCH-TOCSY data and a 13C-edited NOESY-HSQC spectrum optimized for aromatic groups. Distance constraints were obtained from three-dimensional 15N-edited NOESY-HSQC and 13C-edited NOESY-HSQC spectra (
Initial structures were generated using the CANDID module of the torsion angle dynamics program CYANA (18). NOE peak intensities were converted into upper distance bounds with the CALIBA function of CYANA. Backbone
Structure Determination. The solution conditions chosen for NMR spectroscopy were 20 mM NaPO4, pH 6.5, 50 mM NaCl, 10% 2H2O. Under these conditions, the number, dispersion, and line widths of the observed signals from a 1H-15N HSQC spectrum (Fig. 1) were all consistent with a homogeneous folded protein. The translational self-diffusion coefficient of 1.46 × 10⁻⁶ cm² s⁻¹ measured by NMR pulsed field gradient methods (22), which is consistent with a molecular mass of ~14 kDa, indicated that the protein is monomeric. Using manually obtained peak lists from a complete set of standard triple-resonance 3D NMR spectra, greater than 90% of the backbone 1H, 15N, and 13C resonances were accurately assigned by the program Garant (as described under "Experimental Procedures"). Sidechain assignments were completed by manual analysis of the CCONH and HCCH-TOCSY spectra.
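As a rough plausibility check on the monomer assignment, a measured diffusion coefficient can be converted to a hydrodynamic radius with the Stokes-Einstein relation. The sketch below assumes a spherical particle and the viscosity of pure water at 25 °C (about 0.89 mPa·s). Neither assumption comes from the original analysis, and the 10% 2H2O in the sample would raise the viscosity slightly.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann constant, J/K


def stokes_radius_nm(d_cm2_per_s, temp_k=298.15, viscosity_pa_s=0.89e-3):
    """Hydrodynamic radius (nm) of an assumed sphere via Stokes-Einstein.

    The default viscosity is that of pure water at 25 °C (an assumption).
    """
    d_m2_per_s = d_cm2_per_s * 1e-4  # convert cm^2/s to m^2/s
    radius_m = BOLTZMANN * temp_k / (6 * math.pi * viscosity_pa_s * d_m2_per_s)
    return radius_m * 1e9


# Diffusion coefficient reported in the text, in cm^2/s.
radius = stokes_radius_nm(1.46e-6)
print(f"Apparent Stokes radius: {radius:.2f} nm")
```

The result is only a qualitative check: a dimer would diffuse more slowly and therefore give a larger apparent radius under the same assumptions.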
An automated procedure for iterative NOE assignment (18) was used in generating the structure of the protein. The final structure was generated from 114 dihedral angle constraints and 1732 unique, NOE-derived distance restraints. Molecular dynamics calculations in explicit solvent (21) (Table I) were used in the final refinement. Because no long range NOEs were observed for residues 91–120, and 15N-1H heteronuclear NOE values (Fig. 2A) confirmed that those C-terminal residues are dynamically disordered on the picosecond to nanosecond time-scale (23), the final structure calculations included only CoB residues 1–90. Low 15N-1H NOE values for residues at the N terminus (1–4) and an internal loop (52–56) indicate that these regions of CoB are also highly flexible. These dynamic properties are reflected in variations of the local backbone r.m.s.d. values (Fig. 2B) for the ensemble of structures, which is otherwise highly ordered (Fig. 2C). Statistics on experimental constraints, coordinate precision, and stereochemical quality as determined by PROCHECK (24) and WHATCHECK (25) are summarized in Table I.
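The heteronuclear NOE for each residue is conventionally reported as the ratio of its peak intensity in the proton-saturated spectrum to that in the reference spectrum, with low or negative ratios indicating fast internal motion. A minimal sketch of flagging flexible residues follows. The intensities and the 0.65 cutoff are hypothetical values chosen for illustration, not data or thresholds from this study.

```python
def het_noe(i_sat, i_ref):
    """15N-1H NOE as the saturated / reference peak intensity ratio."""
    if i_ref == 0:
        raise ValueError("reference intensity must be non-zero")
    return i_sat / i_ref


def flexible_residues(peaks, cutoff=0.65):
    """Return sorted residue numbers whose NOE ratio falls below the cutoff.

    peaks maps residue number -> (saturated, reference) intensities.
    The cutoff is an illustrative assumption, not a published threshold.
    """
    return sorted(
        res for res, (sat, ref) in peaks.items() if het_noe(sat, ref) < cutoff
    )


# Hypothetical peak intensities in arbitrary units, for illustration only.
example_peaks = {
    2: (30.0, 100.0),     # ratio 0.30
    30: (82.0, 100.0),    # ratio 0.82
    54: (45.0, 100.0),    # ratio 0.45
    70: (78.0, 100.0),    # ratio 0.78
    100: (-20.0, 100.0),  # ratio -0.20
}
flagged = flexible_residues(example_peaks)
print(flagged)
```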
The CoB Ubl domain (Fig. 2C) consists of a mixed five-stranded β-sheet in the order 2-1-5-3-4, a single major α-helix (residues 28–39) that packs in the concave groove of the β-sheet, and a short 3₁₀ helix (residues 68–71) in the loop between strands 4 and 5. These secondary structure elements are consistent with those of ubiquitin (26) and form a tertiary structure corresponding to the common β-grasp fold, which is widespread among proteins having little sequence or functional similarity. The tertiary structure is stabilized by interactions between the buried sidechains of the hydrophobic residues Tyr5, Leu7, Ile9, Leu28, Leu31, Leu35, Met46, Ile48, Leu50, Leu68, Val73, Ile79, and Ala81. In addition, the sidechain of Lys32 is buried within the core. All of these residues except Val73 are located in secondary structure elements. The aromatic ring of Tyr77 is partially exposed, causing extreme upfield shifting of the nearby Arg74 1HN resonance to 4.9 ppm because of ring current effects. The surface of CoB Ubl lacks the large hydrophobic patches found on typical chaperones and is instead characterized by charged and polar residues (Fig. 2D).
Disordered C-terminal residues of the construct used in this analysis may serve as a flexible linker between the N-terminal Ubl domain and the C-terminal CAP-Gly domain (12) of CoB. However, by dividing the protein in the middle of this intervening sequence (residues 85–135), additional structural elements may have been disrupted. Although chemical shifts and NOE patterns did not indicate any regular secondary structures for residues 85–120, the secondary structure prediction program PSIPRED (27) predicts helices from 104–107 and 112–122. Similar predictions are made for the CoB ortholog sequences. In addition, helical
Cofactor B Orthologs Sequence Comparison. Cofactor B is conserved among eukaryotes as one element of a sophisticated system for the folding and assembly of tubulin heterodimers. Overall sequence conservation among vertebrates is high, with 93% sequence identity between human and murine CoB Ubl orthologs. However, the CoB Ubl domain is more divergent than the C-terminal CAP-Gly domain (Fig. 3). Whereas the C. elegans CoB CAP-Gly domain shares 65% amino acid identity with that of Homo sapiens, the corresponding Ubl domains are only 33% identical, with only four residues strictly conserved across other species: Lys32, Gly40, Leu68, and Asp83. Of these, all but Asp83 are also conserved in ubiquitin and serve a structural role. In addition to the low sequence identity, the
Structural Comparison to Other Ubl Domains. A search for homologous structures using both VAST (30) and FATCAT (31) revealed a high level of structural similarity to ubiquitin and other ubiquitin-like modifier proteins such as Nedd8 and Rub-1. Significant similarity was also found to other members of the ubiquitin superfamily, including elongin B (32), the UBX domains of FAF1 (33) and p47 (34), the Ras-binding domain of RalGDS (Protein Data Bank accession code 1RAX), and the Ubl domain of Hhr23B (35) (Fig. 4). Sequence similarity between the CoB Ubl domain and these proteins is very low (only 4–16% identity), with a majority of the type-conserved residues belonging to the hydrophobic core. Low sequence similarity has hindered the identification of Ubl domains in CoB sequences from other species, leading to the suggestion that the human (tubulin-specific chaperone B) and S. cerevisiae (Alf1) orthologs lack this N-terminal domain (36). However, we verified the presence of the Ubl domain in all of the sequences shown by using the Fold and Function Assignment System server (37), which predicted significant similarity to ubiquitin for all cofactor B sequences in a search of the Pfam database.
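Pairwise identity figures such as those quoted above depend on how the alignment is made and which columns are counted. The sketch below shows one common convention: identical positions divided by the number of alignment columns in which both sequences have a residue. The two aligned strings are made-up placeholders, not CoB or ubiquitin sequences, and the convention itself is an assumption rather than the one used in this study.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Placeholder pre-aligned sequences, for illustration only.
seq_a = "MKV-LTIKAG"
seq_b = "MQVFLT-RSG"
identity = percent_identity(seq_a, seq_b)
print(f"{identity:.1f}% identity")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so identity values are comparable only when the same denominator is used throughout.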
A key feature of the backbone structure of the CoB Ubl domain is the flexible loop (residues 52–56) between strands 3 and 4. This loop contains two solvent-exposed Asp residues, which are conserved among the mammalian CoB Ubl domains. The loop is absent in ubiquitin and ubiquitin-related modifier proteins, but in the UBX, PB1, and caspase-activated deoxyribonuclease (CAD) domains it is important for interactions with binding partners (34, 38, 39). Because of its recurring functional role, a scheme proposed for classification of ubiquitin-like domains would take the length of the β3–β4 loop into account (34). The UBX domain is defined in part by a conserved Phe-Pro sequence motif in the 4-residue loop; thus, along with other differences, CoB Ubl cannot be categorized as a UBX domain. However, as discussed below, it displays striking similarities to the CAD domain of ICAD and the PB1 domain type I, both of which contain conserved acidic residues within a 5-residue loop (39, 40).
The DCX domain was recently reported to be the first known microtubule-binding module with a canonical ubiquitin fold (41). Both the solution structure of the N-terminal DCX domain (one of two homologous tandem domains) from human doublecortin (Protein Data Bank accession code 1MJD) and the crystal structure of the equivalent domain from human doublecortin-like kinase (Protein Data Bank accession codes 1MFW and 1MG4) were determined. Although the DCX and CoB Ubl sequences display only 20% similarity, they are structurally similar with an r.m.s.d. of 3.1 Å between the
Putative Tubulin-binding Residues. It has been shown that CoB sequesters
A comparison of 15 CoB Ubl and 14 CoE Ubl sequences (data not shown) reveals only three solvent-exposed residues that are highly type-conserved in both cofactors: a pair of basic residues in strand 2 (Lys20, Lys21) and a hydrophobic residue (Val38) in the
The doublecortin protein is essential for normal development of the cerebral cortex, and a number of disease-causing missense mutations map to the surface of the tandem ubiquitin-like DCX domains that bind microtubules (41). Although there is little sequence conservation between CoB Ubl and DCX, many of the doublecortin mutations are likely to affect the electrostatic potential and cluster in the same region of the C-terminal DCX domain surface as the putative α-tubulin-binding residues of the CoB and CoE Ubl domains described above. Because C-DCX binds to tubulin heterodimers and microtubules rather than free tubulin monomers, the details of those interactions may differ from those of CoB with α-tubulin. However, one of the disease-causing doublecortin substitutions occurs at an Arg residue (Arg196) that is found on strand 2, in a location analogous to the CoB Ubl "KK" motif. Thus, the cluster of basic residues on the CoB surface may serve a common functional role with similar microtubule binding elements of the doublecortin DCX domains.
The only known structure of a protein that binds monomeric tubulin is that of cofactor A (10, 11). Although its structure, a three-helix bundle, is completely different from that of CoB, like CoB Ubl its surface is highly hydrophilic. Cofactor A binding to
Binding of Cofactor B and Cofactor E. Although the binding of CoB and CoE has been reported in fission yeast (43), the interaction has not been mapped to specific domains within either protein. Analysis of the CoB and CoE Ubl sequences revealed numerous basic residues in CoE Ubl that are not conserved in CoB Ubl (Fig. 5). This observation led us to speculate about the potential role for the Ubl domains to mediate the binding of CoB and CoE via electrostatic interactions, in a manner analogous to that of the recently defined PB1 (Phox and Bem1) domains (40, 44) and CAD domains (39), both of which display the ubiquitin-like
A homology model of CoE Ubl, based on the CoB Ubl structure and a sequence alignment optimized manually on the basis of multiple sequence alignments, was created using the automated model routine of the program Modeller (46, 47). The CoE Ubl model confirmed the presence of a basic region made up of Lys413 and Arg415 in strand 1, Arg425 and Arg426 in strand 2, Lys435 and Arg442 in helix 1, and Lys445 in the 1 2 loop (Fig. 6B). Except for Arg425 and Arg426, these residues are conserved only in CoE Ubl and appear to be analogous to the residues of the PB1 and CAD domain basic clusters. Likewise, CoB Ubl contains conserved acidic residues within the β3–β4 loop (Asp54 and Asp55) as previously discussed. Additional acidic residues between strands 4 and 5 (Asp63, Asp70, Asp75) are found on the same face. Although acidic residues are not strictly conserved at these positions across species, a higher proportion of acidic residues than basic residues is found in the 3 4, 4 3, and 3 5 loops of all sequences. Therefore, we hypothesize that, as observed in the PB1 and CAD heterodimers, acidic CoB residues may participate in electrostatic interactions with the basic residues of CoE Ubl. Additional residues, such as the conserved aromatic residue at position 71 of CoB Ubl, may be involved in the binding interface. Because of sequence variations and minor differences in tertiary structure, it is expected that a CoB/CoE complex would necessarily diverge in some respects from the examples set by the PB1 and CAD complexes. The interacting regions of a possible CoB/CoE complex are depicted in Fig. 6B; however, confirmation of this interaction must await further study.
The
* This work was supported by the National Institutes of Health Protein Structure Initiative through Grant 1 P50 GM64598. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. The atomic coordinates and structure factors (code 1T0Y) have been deposited in the Protein Data Bank, Research Collaboratory for Structural Bioinformatics, Rutgers University, New Brunswick, NJ (http://www.rcsb.org/).
All time-domain NMR data and chemical shift assignments have been deposited in BioMagResBank (www.bmrb.wisc.edu/) under BMRB accession number 6176.
** To whom correspondence should be addressed. Tel.: 414-456-8400; Fax: 414-456-6510; E-mail: bvolkman{at}mcw.edu.
1 The abbreviations used are: CoB, cofactor B; Ubl, ubiquitin-like; NTA, nitrilotriacetic acid; NOE, nuclear Overhauser effect; HSQC, heteronuclear single quantum coherence; r.m.s.d., root mean square deviation; CAP-Gly, glycine-rich cytoskeleton-associated protein; CAD, caspase-activated deoxyribonuclease; CoE, cofactor E; PB1, Phox and Bem1; DCX, doublecortin-like domain.
We thank Adam Godzik for assistance with the cofactor E model and Kelly Kjer for assistance with protein purification.