Biophysical and Structural Characterization of the Centriolar Protein Cep104 Interaction Network

Dysfunction of cilia is associated with common genetic disorders termed ciliopathies. Knowledge on the interaction networks of ciliary proteins is therefore key for understanding the processes that are underlying these severe diseases and the mechanisms of ciliogenesis in general. Cep104 has recently been identified as a key player in the regulation of cilia formation. Using a combination of sequence analysis, biophysics, and x-ray crystallography, we obtained new insights into the domain architecture and interaction network of the Cep104 protein. We solved the crystal structure of the tumor overexpressed gene (TOG) domain, identified Cep104 as a novel tubulin-binding protein, and biophysically characterized the interaction of Cep104 with CP110, Cep97, end-binding (EB) protein, and tubulin. Our results represent a solid platform for the further investigation of the microtubule-EB-Cep104-tubulin-CP110-Cep97 network of proteins. Ultimately, such studies should be of importance for understanding the process of cilia formation and the mechanisms underlying different ciliopathies.

Cilia and flagella are evolutionarily conserved organelles that contain a microtubule (MT)-based cytoskeleton called the axoneme. They are crucial for motility, fluid flow, and coordination of many signaling pathways during cell growth, development, cell mobility, and tissue homeostasis (1,2). The majority of mammalian cells have the capacity to form cilia. Conversely, most cancer cells lack cilia (3). Moreover, defects in ciliogenesis have been implicated in a wide range of other human disorders, including polycystic kidney disease, obesity, mental retardation, blindness, as well as various other developmental malformations (4). As a result, there has been great interest in understanding the regulation of both ciliary assembly/disassembly and the control of cilia length.
Ciliary assembly and disassembly are precisely coordinated during cell cycle progression. Cilia are formed from the distal end of the mother centriole upon exit from the cell cycle and entry into the G0 phase, become shorter as cells progress from the G1 to the S phase, and disassemble in mitosis (5). The process of cilia formation is tightly controlled through a delicate balance of positive and negative regulators. Cep104 has recently been identified as a key player in the regulation of cilia formation (6).
Cep104 localizes to the distal ends of both centrioles throughout the whole cell cycle. At the G0 phase of the cell cycle, the mother centriole converts to a basal body that gives rise to a cilium. During ciliogenesis Cep104 shifts from the mother centriole to the tip of the elongating cilium. It has been shown that the Cep104 protein and its Chlamydomonas orthologue, FAP256, are essential for cilia formation in a high percentage of cells. In cases where cilia could still form after depletion of Cep104/FAP256, the ciliary tips displayed major structural deformation (6). Mutations in the Cep104 gene have been directly linked to Joubert syndrome, a genetically heterogeneous ciliopathy characterized by a distinctive mid-hindbrain and cerebellar malformation (7).
Cep104 has recently been identified as an EB protein-dependent MT plus-end tracking protein (+TIP). Based on immunoprecipitation studies with cell lysates, Cep104 was found to interact with CP110 and Cep97, two proteins that affect the length of the centriole and that interact with each other (8). Depletion of either CP110 or Cep97 causes abnormal elongation of centrioles and enhances primary cilia formation in growing cells (9,10). Based on these observations, it was proposed that Cep104 regulates the function of the CP110-Cep97 complex when cilia outgrowth is initiated by mediating the connection between the CP110-Cep97 complex and dynamic MT ends (6).
Cep104 is crucial for the conversion of the mother centriole into the ciliary basal body (6). The exact role of Cep104 in this process and how it collaborates with its binding partners in the regulation of ciliogenesis is, however, not fully understood. Moreover, a biophysical or structural characterization of the Cep104 protein and its interaction network is currently missing. To better understand the role of the Cep104 protein in ciliogenesis, we analyzed the protein sequence in detail and revealed the presence of four conserved functional domains in the Cep104 molecule (Fig. 1A). Based on these results, we carried out a detailed biophysical characterization of the individual domains and their interactions with the binding partners Cep97, CP110, and EB. Moreover, we solved the crystal struc-

Results
Biophysical Characterization of the Cep104 Domains-To better understand the role of Cep104 in ciliogenesis, we first analyzed its amino acid sequence by secondary structure prediction and protein threading, which revealed the presence of four conserved functional domains (Fig. 1A). The first 163 residues fold into an N-terminal domain that is predicted to adopt a jelly-roll fold similar to IFT25 (11). The jelly-roll domain consists of nine β-strands that are organized into two antiparallel sheets stacked on top of each other. Previously, this domain has been identified as a binding site for Cep97 (8). The N-terminal domain is followed by a predicted coiled-coil region (residues 207-301). The α-helical coiled coil is one of the principal subunit oligomerization motifs in proteins (12), and based on the MultiCoil algorithm (13) the Cep104 coiled coil is predicted to be responsible for dimerization of the protein. Following the coiled-coil region, the Phyre2 fold recognition server (14) identified a putative TOG domain (residues 418-673). TOG domains are present in proteins that regulate MT dynamics (15). These proteins either increase MT polymerization rates by recruiting soluble tubulin via their conserved TOG domains to polymerizing MT plus ends, or decrease MT catastrophe and activate MT rescue by recruiting soluble tubulin to depolymerizing MT plus ends (15). The C-terminal part of Cep104 (residues 730-887) is predicted to fold into a zinc-finger domain. Previously, this domain has been identified as a binding site for the CP110 protein (8). Furthermore, an SXIP motif is present at the very C terminus of the Cep104 protein. The SXIP motif is a short linear motif that is recognized by EB proteins, which then target the proteins to growing MT ends (16) (Fig. 1A).
To characterize the identified domains, we first cloned and expressed them individually in Escherichia coli. The TOG domain and the C-terminal zinc-finger domain expressed well as N-terminally His6-tagged variants. In contrast, expression of the N-terminal jelly-roll domain and the coiled-coil region required the insertion of a SUMO tag and a thioredoxin tag, respectively, between the His6 tag and the target proteins. The His6-tagged fusion partners were cleaved off by proteases before biophysical analysis (for details, see "Experimental Procedures"). All proteins migrated at the expected positions on SDS-PAGE gels and no degradation products were observed.
We characterized the recombinant proteins by CD spectroscopy and sedimentation velocity (SV) analytical ultracentrifugation (AUC) to confirm their correct folding based on their secondary structure content (Fig. 1B) and to assess their oligomerization state (Fig. 1C). The N-terminal jelly-roll domain revealed a CD spectrum rich in β-sheet structure, which correlates well with the secondary structure prediction and the Phyre2 model based on the similarity with the IFT25 structure (11). The c(S) distribution shows a single peak with an s_w value of 2.0 S. This corresponds to an S_max/S value of 1.29 and therefore suggests a monomeric globular shape of the domain (theoretical molecular mass of monomer: 19 kDa) that is consistent with the Phyre2 prediction. The CD spectrum of the coiled-coil domain revealed a significant amount of α-helical structure with distinct minima near 208 and 222 nm. The stability of the protein was assessed by thermal unfolding monitored by CD at 222 nm, which revealed a sigmoidal unfolding profile with a T_m of 55°C (Fig. 4B). Characteristic of a coiled-coil structure, the unfolding was reversible. c(S) distribution analysis revealed that the coiled coil sediments with an s_w value of 1.9 S. This yields an S_max/S of 1 for a monomer and an S_max/S of 1.57 for a dimer. As an S_max/S value of 1 would correspond to a compact, unhydrated sphere and is therefore not possible for proteins, we conclude that the coiled-coil domain forms a dimer in solution (theoretical molecular mass of monomer: 12 kDa). This is in good agreement with the theoretical prediction by MultiCoil (13). Furthermore, an S_max/S value of 1.57 is consistent with an elongated shape of the domain, which is characteristic of coiled coils. CD spectroscopy of the putative TOG domain showed a significant amount of α-helical structure that is typical for this type of fold. The c(S) distribution revealed a single peak with an s_w value of 2.5 S, which amounts to an S_max/S of 1.36 (theoretical molecular mass of monomer: 29 kDa). This result suggests a moderately elongated monomeric structure in solution, which is characteristic of TOG domains.
The CD spectrum of the C-terminal zinc-finger domain confirmed the folded state of the purified protein and revealed a mixture of α-helices, β-sheets, and random coils. The c(S) distribution showed a single peak with an s_w value of 1.9 S. This corresponds to an S_max/S of 1.49, suggesting a slightly elongated shape and monomeric state of the protein in solution (theoretical molecular mass of monomer: 22 kDa). Taken together, our bioinformatics analysis and biophysical characterization revealed the presence of four conserved functional domains in the Cep104 molecule (Fig. 1A).
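The S_max/S ratios quoted for the four domains can be checked against the relation given under "Experimental Procedures", S_max = 0.00361 × M^(2/3) with M in daltons. The following Python sketch is illustrative only: the function names are ours and do not belong to SEDFIT or any other package, and the inputs are the rounded masses and s_w values quoted in the text.

```python
def s_max(mass_da):
    """Sedimentation coefficient (in S) of a compact sphere of the given mass.

    Uses the relation quoted in the text: S_max = 0.00361 * M^(2/3), M in Da.
    """
    return 0.00361 * mass_da ** (2.0 / 3.0)


def smax_ratio(mass_da, s_observed):
    """S_max/S ratio for an assumed molecular mass and a measured s value (S)."""
    return s_max(mass_da) / s_observed


# Rounded values taken from the text: (assumed mass in Da, measured s_w in S).
domains = {
    "jelly-roll (monomer)": (19000, 2.0),
    "coiled coil (monomer)": (12000, 1.9),
    "coiled coil (dimer)": (24000, 1.9),
    "TOG (monomer)": (29000, 2.5),
    "zinc finger (monomer)": (22000, 1.9),
}

for name, (mass, s_obs) in domains.items():
    print(f"{name}: S_max/S = {smax_ratio(mass, s_obs):.2f}")
```

With these rounded inputs the sketch should reproduce the reported ratios to within about 0.02; the coiled-coil monomer assumption gives a ratio of about 1, which is why a dimer is inferred.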
Cep104 Is a Novel Tubulin-binding Protein-We identified a putative TOG domain in the central part of the Cep104 molecule. To assess the functional role of this domain, we performed a detailed biophysical and structural characterization. We used SV AUC to test tubulin binding. Tubulin forms dimers in solution that are prone to polymerization. To avoid formation of higher oligomers, we kept the concentration of tubulin constant at 1 μM. At that concentration, the c(S) distribution showed a single peak corresponding to tubulin dimers (s_w 5.8 S, S_max/S for dimer: 1.34, theoretical molecular mass of monomer: 50 kDa) without signs of aggregation (Fig. 2B). To study its binding to tubulin, different concentrations of the putative TOG domain were added to 1 μM tubulin. s_w isotherm analysis revealed that the TOG domain of Cep104 directly binds to tubulin dimers with a 1:1 stoichiometry and an apparent dissociation constant of 1 μM (Fig. 2, A and B).
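To give a feel for what a 1:1 interaction with a dissociation constant in the low micromolar range implies for such a titration, the following Python sketch solves the simple mass-action equilibrium for A + B ⇌ AB. It is an illustration under stated assumptions, not the isotherm model implemented in SEDPHAT: the function names are hypothetical, all concentrations share one unit (here μM), and hydrodynamic effects on the sedimentation boundary are ignored.

```python
import math


def complex_concentration(a_total, b_total, kd):
    """Equilibrium concentration of the 1:1 complex AB.

    Solves [AB]^2 - (A_t + B_t + K_d)[AB] + A_t * B_t = 0 and returns the
    physically meaningful (smaller) root. All arguments share one unit.
    """
    total = a_total + b_total + kd
    return (total - math.sqrt(total * total - 4.0 * a_total * b_total)) / 2.0


def fraction_bound(a_total, b_total, kd):
    """Fraction of component A found in the complex at equilibrium."""
    return complex_concentration(a_total, b_total, kd) / a_total


# Tubulin dimer held at 1 uM, TOG domain titrated, K_d assumed to be 1 uM.
for tog in (0.5, 1.0, 2.0, 5.0, 10.0):
    print(f"TOG {tog:5.1f} uM -> fraction of tubulin bound "
          f"{fraction_bound(1.0, tog, 1.0):.2f}")
```

Because the tubulin concentration is comparable with the dissociation constant, the simplified hyperbolic form (which assumes free ligand ≈ total ligand) would be inaccurate here, which is why the quadratic solution is used.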
To further study the Cep104 TOG domain and its interaction with tubulin, we decided to characterize it by x-ray crystallography. The human TOG domain did not yield any crystals; we therefore screened a limited number of Cep104 TOG domains from different species, including Mus musculus, Gallus gallus, Xenopus tropicalis, Drosophila melanogaster, Danio rerio, Tetrahymena thermophila, and Chlamydomonas reinhardtii. Crystals were obtained only for the chicken TOG domain, which shares 73% amino acid sequence identity with its human homolog. The structure was solved at 1.4-Å resolution using selenomethionine-labeled protein to obtain experimental phases. The overall structure resembles a canonical TOG-domain fold comprising six HEAT repeats (HR1-6) forming flat paddles with wide thin edges (Fig. 2E).
Although the overall architecture of TOG domain structures is the same, small structural variations that have been linked to diversified interactions with tubulin have been described. PDBeFold identified the crescerin TOG2 domain (PDB code 5DN7) as the most similar structure (Q-score (quality score): 0.31). Although the two domains share a sequence identity of only 14%, their structures can be superimposed with a root mean square deviation of 2.94 Å. Both TOG domains have the tubulin-binding intra-HEAT loop surface in a straight conformation that resembles the Stu2 TOG1 structure and contrasts with the bent architectures observed in ch-TOG TOG4, MAST TOG1, and hCLASP1 TOG2 (17-21). The straight conformation and the similarity to Stu2 TOG1 complement our finding that the Cep104 TOG domain is involved in binding of free tubulin and suggest a possible role of Cep104 in the regulation of MT dynamics.
Although the sequence identity among different TOG domains is generally very low (typically below 20%), they share two highly conserved features: hydrophobic residues buried between neighboring HEAT repeat α-helices that stabilize the overall structure of the domain, and five short intra-HEAT repeat loops (T1-T5) on one narrow edge that create a binding site for tubulin. Single amino acid mutations in this highly conserved loop region often completely abolish binding to the tubulin dimer (15). The TOG domain of Cep104 possesses two highly conserved residues, Trp-448 in the HEAT repeat loop T1 and Arg-626 in the HEAT repeat loop T5, amino acids that have previously been shown to play an important role in tubulin binding. Additionally, the first five intra-HEAT loops display an overall basic electrostatic charge distribution, which in other TOG domains was shown to complement the acidic nature of the TOG domain binding site in tubulin (Fig. 2G). A hypothetical model of a complex between the TOG domain of Cep104 and an unpolymerized tubulin dimer (Fig. 2F), generated based on the x-ray structure of the TOG1 domain of Stu2 bound to the yeast tubulin dimer (PDB code 4FFB), indeed suggests that the Cep104 residues Trp-448 and Arg-626 make very similar contacts with tubulin as their equivalents in Stu2 TOG1 (Trp-23 and Arg-200, respectively). We used site-directed mutagenesis and SV AUC to probe the importance of Trp-448 and Arg-626 for the TOG/tubulin interaction. As expected, both mutants display substantially reduced binding to tubulin dimers (Fig. 2, C and D).
Taken together, we identified Cep104 as a novel tubulin-binding protein. Tubulin therefore adds to the complexity of the Cep104 interaction network, which comprises CP110, Cep97, and EB proteins.
Cep104 Directly Interacts with CP110 and Cep97-Jiang et al. (8) identified CP110 and Cep97 as novel Cep104 binding partners. They mapped the binding sites and found that the C-terminal part of CP110 (residues 791-991) interacts with the zinc-finger domain of Cep104 (residues 730-887) and the central part of Cep97 (residues 310-480) binds to the jelly-roll domain of Cep104 (residues 1-163) (8). Because the authors used an immunoprecipitation assay with cell extracts to map the binding sites, indirect binding could not be excluded. To test whether the interaction with the binding partners involves direct binding, we performed SV AUC experiments. The CP110 791-991 region is predicted to be unstructured, and when expressed in E. coli it forms mostly insoluble inclusion bodies; therefore the boundaries were first narrowed down to residues 902-991 using pull-down experiments with cell lysates. This region of CP110 can be expressed when fused to thioredoxin, and the uncleaved fusion protein sediments with an s_w value of 1.8 S. This corresponds to an S_max/S ratio of 1.67, which suggests an asymmetric shape of the molecule (theoretical molecular mass of monomer: 24 kDa). An asymmetric shape is expected for a fusion of the compact thioredoxin fold with the unstructured CP110 moiety. The c(S) distribution clearly shows that the fusion protein directly interacts with the zinc-finger domain of Cep104 with a 1:1 stoichiometry (Fig. 3B). Using N- and C-terminal truncations, we were able to map the Cep104 binding region even more precisely (Fig. 3, A-G). The smallest region identified contains only 30 amino acids (residues 907-936) (Fig. 3G). For screening, we used recombinantly expressed peptides fused to thioredoxin. However, for the final affinity measurements by isothermal titration calorimetry (ITC) we used a synthetic peptide. The ITC results revealed that the CP110 peptide (residues 907-936) interacts with Cep104 with an apparent dissociation constant of 4 μM and in a 1:1 stoichiometry (Fig. 4A).
The central part of Cep97 (residues 310-480) is also predicted to be unstructured in solution but can be expressed in the cytosol of E. coli as a His6-tagged version. The c(S) distribution showed a single peak with an s_w of 1.5 S and an S_max/S of 1.7, which is consistent with a monomeric state of the protein (theoretical molecular mass of monomer: 19 kDa). The S_max/S of 1.7 indicates an asymmetric shape of the polypeptide chain fragment and corresponds well with the unstructured nature of the protein. By measuring different ratios in SV AUC experiments, we found that the jelly-roll domain of Cep104 directly interacts with the central part of Cep97 with an apparent dissociation constant (K_d) of 3 μM and in a 1:1 stoichiometry (Fig. 4C).
Cep104 Directly Interacts with EB Proteins-Cep104 has been recently identified as an MT plus-end tracking protein (+TIP) (8). Like many other +TIPs, it binds to EB proteins through intrinsically disordered basic and serine-rich polypeptide chain regions containing a C-terminal core SXIP motif (S, serine; X, any amino acid; I, isoleucine; P, proline). The conserved SXIP peptide motif binds to the hydrophobic groove formed by the C-terminal helix bundle of the EB homology domain dimer (16). To confirm a direct interaction between the C-terminal part of Cep104 and the EB homology domain of EB1 and to assess the binding affinity, we performed SV AUC measurements. For these experiments, we used the C-terminal part of Cep104 that consists of the zinc-finger domain followed by a 40-amino acid long unstructured region. The SXIP motif is located within this unstructured part. In agreement with the literature, the EB homology domain forms tight dimers. The c(S) distribution shows a single peak with an s_w value of 1.6 S. A calculated S_max/S value of 1.56 is consistent with a dimeric fold of the EB homology domain (theoretical molecular mass of monomer: 9 kDa). By measuring different protein concentration ratios using SV AUC, we determined that the C-terminal part of Cep104 directly interacts with the EB homology domain dimers with an apparent dissociation constant (K_d) of 2 μM (Fig. 4D). The strength of the interaction is comparable with those of other known EB-binding proteins.
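The SXIP core motif described above (serine, any residue, isoleucine, proline) is simple enough to locate with a pattern search. The Python sketch below is illustrative only: the function name is ours, the example sequence is made up and is not the Cep104 sequence, and a hit for the four-residue core says nothing about the basic, serine-rich, disordered sequence context that EB binding also depends on.

```python
import re

# Core motif: Ser, any residue (one-letter code), Ile, Pro.
SXIP_PATTERN = re.compile(r"S[A-Z]IP")


def find_sxip(sequence):
    """Return (1-based position, matched residues) for each SXIP core motif."""
    return [(m.start() + 1, m.group()) for m in SXIP_PATTERN.finditer(sequence.upper())]


# Hypothetical C-terminal tail, used purely for illustration.
example_tail = "GSRPKSSTPKSKIPRPSLAK"
print(find_sxip(example_tail))
```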

Discussion
Dysfunction of cilia is associated with common genetic disorders called ciliopathies. Understanding the structure, function, and interaction networks of ciliary proteins is therefore key for a better understanding of these severe diseases and ciliogenesis in general. In this study, we characterized in detail the domain organization and interaction network of Cep104, a protein that plays a crucial role in ciliogenesis.
We found that Cep104 is a multidomain protein that interacts with a number of other proteins: CP110, Cep97, tubulin, and EB. This suggests that the function of Cep104 is complex and tightly regulated through various interactions. In this study, we showed that the Cep104 protein directly interacts with tubulin through a conserved TOG domain. The TOG domain array-containing proteins ch-TOG and CLASP are key regulators of cytoplasmic MTs (15). The cytoplasmic TOG array proteins have been extensively studied; however, the first TOG array protein that regulates ciliary MTs, crescerin, has been reported only very recently (22). The number of TOG domains in each array differs. Higher eukaryotic ch-TOG family members use a pentameric TOG domain array to promote MT polymerization, whereas CLASP members use a trimeric TOG domain array to stabilize MTs (15). Members of the newly described crescerin family contain four TOG domains and have been suggested to promote MT polymerization in vitro (22). Cep104 contains only one TOG domain, but the whole protein is dimeric in solution due to the presence of a coiled-coil region. It has been shown previously, e.g. for Stu2 TOG domains, that two TOG domains of the same type are sufficient for altering MT dynamics (23). MT binding and plus-end localization of Cep104 are also provided by the interaction with EB proteins (8). The fact that Cep104 localizes to the tips of growing cilia, possesses two TOG domains per Cep104 dimer, and displays EB-dependent plus-end tracking activity suggests that Cep104 could be a novel regulator of MT dynamics and could therefore complement the function of crescerin in cilia.
It was suggested that the CP110 and Cep97 proteins act as a "cap" at the distal end of the centriole and collaborate to limit centriolar length and block the ciliogenic branch of centriole function (6). An attractive hypothesis is that, in the regulation of ciliogenesis, Cep104 mediates the connection between the CP110-Cep97 complex and dynamic MT ends, because neither CP110 nor Cep97 has been reported to directly interact with MTs. Via the EB interaction, Cep104 could potentially recruit the Cep97-CP110 complex to the growing MT tips. The CP110-Cep97 complex at the distal end of the centriole blocks cilium elongation, whereas Cep104, based on its domain organization, could potentially be involved in the initiation of cilia formation. However, the initiation of ciliogenesis by Cep104 can probably start only once the CP110-Cep97 inhibitory complex is released from the distal end of the centriole. The observation that depletion of either CP110 or Cep97 enhances primary cilia formation in growing cells is consistent with this hypothesis (9,10). The permanent presence of Cep104 at the tip of dynamic MTs and its necessity for ciliogenesis (6) support our suggestion that Cep104 is a novel MT regulator.
Our study provides new insights into Cep104 domain organization and interaction network. However, further work will be necessary to fully understand Cep104 function in cilia and at the distal end of the centriole. Further experiments will examine the effect of ablating the Cep104 protein in human cells on cilia formation. A dissection of the contribution of the different domains to the function of the Cep104 protein (e.g. deleting the TOG domain and/or introducing a point mutation, W448A or R626A) could help to answer whether Cep104 plays a role in MT dynamics. It will also be important to understand how the

Experimental Procedures
DNA Constructs and Protein Production-Human CEP104 cDNA fragments encoding the TOG domain (residues 418-673) and the C-terminal zinc-finger domain (residues 730-925) were ligated into the bacterial expression vector pET-15b. The N-terminal jelly-roll domain (residues 1-163) was cloned into a modified version of the pET-15b vector that includes a SUMO fusion protein after the His6 tag, and the coiled-coil domain (residues 207-301) into a modified version of the pET-15b vector, pHisTrx2, which includes a thioredoxin fusion protein after the His6 tag. A synthetic, codon-optimized gene fragment encoding the chicken TOG domain (residues 428-686) used for crystallization and the human Cep97 fragment (residues 310-480) were cloned into pET-15b. cDNAs encoding all CP110 polypeptide chain fragments (residues 902-991, 926-991, 902-946, 912-936, 902-925, 907-936) were cloned into pHisTrx2. All mutants were generated using the QuikChange approach.
Recombinant proteins were produced in E. coli BL21(DE3) cells. Bacterial cultures were induced at OD 0.7 with 0.5 mM isopropyl β-D-thiogalactopyranoside and grown for an additional 16 h at 20°C. Recombinant proteins were purified using immobilized-metal affinity chromatography according to a standard protocol. Depending on the experiment, the His6-tagged fusion proteins were cleaved by incubation with thrombin or sumoprotease. Size exclusion chromatography on Superdex 75 or 200 columns (GE Healthcare) was used as a final purification step or for tag removal (20 mM HEPES (pH 7.5), 150 mM NaCl, and 2 mM β-mercaptoethanol). The EB1 protein fragment (residues 191-267) was purified as reported previously (16). Bovine brain tubulin was prepared according to well established protocols (24). The CP110 peptide (residues 907-936) was assembled on an automated continuous flow synthesizer employing standard methods.
Sedimentation data were collected with absorbance (230, 250, or 280 nm) or interference optical systems. Protein partial specific volumes, solution density, and viscosity were calculated in SEDNTERP. Data were analyzed in terms of a continuous c(s) distribution of Lamm equation solutions with the software SEDFIT (27). Scan file time stamps were corrected (28), and good fits were obtained with root mean square deviation values corresponding to typical instrument noise values. Sedimentation coefficients were corrected to standard conditions, s_20,w. The shapes and oligomerization states of the proteins were determined on the basis of the S_max/S ratio. S_max was calculated by the following equation: S_max = 0.00361 × M^(2/3), where M is the molecular weight of the protein (29). The Gilbert s_w fast isotherm was determined by integration of the c(S) peaks of the fast boundary component in c(S); integration was done from 4.5 to 9 S. The isotherm was created by plotting the weighted average sedimentation coefficients s_w as a function of TOG domain concentration (27,30). The fitting was done with the software SEDPHAT (31). Isotherm figures were prepared using the GUSSI software package.
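The weighted-average sedimentation coefficient used for the isotherm is, in essence, the signal-weighted mean of s over the integration window (here 4.5 to 9 S). The Python sketch below shows this with trapezoidal integration. It is a minimal illustration rather than the SEDFIT or GUSSI implementation: the function name is hypothetical, the c(s) distribution is assumed to be available as two equal-length lists on an ascending s grid, and the example peak is synthetic.

```python
import math


def weighted_average_s(s_values, c_values, s_low=4.5, s_high=9.0):
    """Signal-weighted average s between s_low and s_high (trapezoidal rule).

    s_w = integral(c(s) * s ds) / integral(c(s) ds) over the window.
    Only grid intervals lying entirely inside the window are used.
    """
    numerator = 0.0
    denominator = 0.0
    for i in range(len(s_values) - 1):
        s0, s1 = s_values[i], s_values[i + 1]
        if s0 < s_low or s1 > s_high:
            continue
        c0, c1 = c_values[i], c_values[i + 1]
        width = s1 - s0
        denominator += 0.5 * (c0 + c1) * width
        numerator += 0.5 * (c0 * s0 + c1 * s1) * width
    if denominator == 0.0:
        raise ValueError("no signal inside the integration window")
    return numerator / denominator


# Synthetic example: one Gaussian peak centred at 6.0 S on a 0.05 S grid.
s_grid = [i * 0.05 for i in range(241)]
c_grid = [math.exp(-0.5 * ((s - 6.0) / 0.3) ** 2) for s in s_grid]
print(f"s_w = {weighted_average_s(s_grid, c_grid):.2f} S")
```

Repeating this calculation for each titration point and plotting the resulting s_w against the TOG domain concentration gives the kind of isotherm that is then fitted to a binding model.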
Circular Dichroism (CD)-CD spectra measurements were carried out on a Chirascan-Plus instrument (Applied Photophysics Ltd.) equipped with a computer-controlled Peltier element. All experiments were performed in PBS. The CD spectra were obtained at 5°C by scanning wavelengths from 200 to 260 nm in 1-nm steps using a protein concentration of 5 μM. A ramping rate of 1°C per min was used to record the thermal unfolding profiles. Midpoints of the transitions, T_m, were taken at the maximum of the derivative d[θ]_222/dT.
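Taking the transition midpoint at the maximum of the derivative of the 222-nm signal can be sketched as follows. This Python example is an illustration under stated assumptions and not the instrument software's procedure: the function names are ours, the unfolding curve is a synthetic two-state sigmoid with arbitrary ellipticity values, and no smoothing is applied, which real (noisy) data would usually require.

```python
import math


def two_state_curve(temps, tm, width, folded, unfolded):
    """Synthetic two-state unfolding signal (for illustration only)."""
    return [
        folded + (unfolded - folded) / (1.0 + math.exp(-(t - tm) / width))
        for t in temps
    ]


def melting_temperature(temps, signal):
    """Temperature at which the central-difference derivative d(signal)/dT is largest.

    Assumes the signal increases on unfolding, as the negative 222-nm
    ellipticity of a helical protein does when it rises toward zero.
    """
    if len(temps) != len(signal) or len(temps) < 3:
        raise ValueError("need at least three matched data points")
    best_t = temps[1]
    best_slope = float("-inf")
    for i in range(1, len(temps) - 1):
        slope = (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_slope = slope
            best_t = temps[i]
    return best_t


# Synthetic melt sampled every 1 degree C from 5 to 95 degrees C.
temps = list(range(5, 96))
signal = two_state_curve(temps, tm=55.0, width=3.0, folded=-30000.0, unfolded=-5000.0)
print(f"T_m estimate: {melting_temperature(temps, signal)} degrees C")
```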
Isothermal Titration Calorimetry-ITC measurements were performed using an ITC200 instrument (MicroCal). The CP110 peptide and the C-terminal Cep104 polypeptide chain fragment (residues 730-925) were dialyzed against 20 mM Tris-HCl (pH 7.5), 150 mM NaCl prior to analysis. Titrations were performed at 0.2-min intervals at a stirring speed of 1000 s⁻¹. The resulting heats were integrated using NITPIC software (32), and isotherms were fitted using a 1:1 bimolecular interaction model in the program SEDPHAT (31). All final figures were prepared using the GUSSI software package.
Crystallization and X-ray Structure Determination-The recombinant TOG domain (from G. gallus) was concentrated to 20 mg/ml. Crystals suitable for structure determination were grown at 20°C in sitting drops composed of a 1:1 mixture of the protein solution and a well solution consisting of 25% PEG 1500 and 0.1 M MMT buffer (pH 5.0). For cryo-protection, the reservoir solution was supplemented with 20% ethylene glycol. Native and selenomethionine SAD data were acquired at 1-Å wavelength (0.96 Å for SAD) and 100 K at the X06SA beamline of the Swiss Light Source (Paul Scherrer Institute) to a resolution of 1.4 and 2.10 Å, respectively. The acquired datasets were reduced, scaled, and merged using XDS, XSCALE, and XDSCONV (33,34). The structure was solved in the space group P22₁2₁ using AUTOSOL (35). The initial model was built with AUTOBUILD and was used as a molecular replacement search model for phasing the native dataset with PHASER (36). Model building and refinement were performed with COOT (37) and phenix.refine from the PHENIX suite (38). Crystallographic data collection and refinement statistics are summarized in Table 1.
Author Contributions-L. R., A. A., M. O. S., and R. A. K. designed the research; L. R. carried out the research; L. R. and S. H. W. K. analyzed the data; L. R. and R. A. K. wrote the manuscript with input from the other authors.