Family-specific Kinesin Structures Reveal Neck-linker Length Based on Initiation of the Coiled-coil*

Kinesin-1, -2, -5, and -7 generate processive hand-over-hand 8-nm steps to transport intracellular cargoes toward the microtubule plus end. This processive motility requires gating mechanisms to coordinate the mechanochemical cycles of the two motor heads to sustain the processive run. A key structural element believed to regulate the degree of processivity is the neck-linker, a short peptide of 12–18 residues, which connects the motor domain to its coiled-coil stalk. Although a shorter neck-linker has been correlated with longer run lengths, the structural data to support this hypothesis have been lacking. To test this hypothesis, seven kinesin structures were determined by x-ray crystallography. Each included the neck-linker motif, followed by helix α7 that constitutes the start of the coiled-coil stalk. In the majority of the structures, the neck-linker length differed from predictions because helix α7, which initiates the coiled-coil, started earlier in the sequence than predicted. A further examination of structures in the Protein Data Bank reveals that there is a great disparity between the predicted and observed starting residues. This suggests that an accurate prediction of the start of a coiled-coil is currently difficult to achieve. These results are significant because they now exclude simple comparisons between members of the kinesin superfamily and add a further layer of complexity when interpreting the results of mutagenesis or protein fusion. They also re-emphasize the need to consider factors beyond the kinesin neck-linker motif when attempting to understand how inter-head communication is tuned to achieve the degree of processivity required for cellular function.

The coiled-coil was the first quaternary structural arrangement described and has been predicted to occur in ~3% of all proteins (1,2). This apparently simple motif shows considerable structural variation, but because of the characteristic distribution of hydrophobic and polar residues, it can be detected readily through sequence analysis (3-5). Coiled-coils play diverse functional roles, but in many cases, they serve as oligomerization domains. Under these circumstances, the exact length or starting point of the coiled-coil is structurally and functionally important. Unfortunately, this parameter is not well defined or uniformly predicted by any computational algorithm.
The precise starting point of the coiled-coil is particularly important in the kinesin and myosin family of molecular motors, where the dimerization module coordinates the activities of individual motor domains on separate polypeptide chains. Typically, in these motor families, there is a flexible section of the polypeptide that connects the motor to the dimerization module. The length of this connecting unit plays a critical role in processive molecular motors and is especially important for many members of the kinesin superfamily in which this domain is known as the neck-linker.
Kinesin motor proteins are classified into 15 different kinesin families, which share a structurally conserved kinesin motor domain (6-10). These families perform a diverse set of cellular functions, all of which involve moving along a microtubule track for cargo transport or modulating microtubule dynamics (11-17). Part of the classification is dictated by the location of their motor domains. N-terminally located motors comprise the majority of kinesin families. The exceptions are the kinesin-14A and -14B families that contain C-terminal motors and the kinesin-13 family in which the motor domain is located in the middle of the polypeptide (6).
This study is focused on N-terminal kinesins, which are composed of an N-terminal motor domain connected to a long α-helical region that dimerizes into a coiled-coil stalk. The stalk ends with a C-terminal cargo domain that may interact with other partner proteins or substrates. The N-terminal motor domain is responsible for ATP turnover coupled to force production. This group of motors is typically dimeric and shows processive movement along microtubule tracks. The ability of the N-terminal kinesin family to remain on the microtubule lattice is critical to their function.
Dimeric N-terminal kinesins employ an asymmetric hand-over-hand stepping motion to move processively along a microtubule as they hydrolyze ATP (Fig. 1) (18-23). A general outline of the hydrolytic cycle begins arbitrarily in an ATP waiting state, where the leading head without nucleotide is strongly bound to a microtubule, whereas the lagging head is bound to ADP, but only weakly associated with a microtubule (Fig. 1, E1) (18,19). To proceed with stepping, the leading head binds ATP, and the dimer undergoes a structural transition transmitted through the neck-linker motif, a 12-18-amino acid flexible peptide that connects the motor domain to the coiled-coil stalk (Fig. 1, E2) (24,25). This structural transition, designated neck-linker docking, shifts the lagging unbound head forward 16 nm to the next microtubule-binding site toward the microtubule plus end (Fig. 1, E3) (24,26). Subsequently, ADP is released from this new leading head resulting in both heads bound to the microtubule (Fig. 1, E4 and E5) (27).
When both heads are bound to the microtubule, the neck-linker domains are oriented in opposite directions (28,29). The neck-linker of the leading nucleotide-free head is oriented backward, inhibiting ATP binding and hydrolysis by the front head until the rear head detaches from the microtubule (Fig. 1, E6) (30). The neck-linker of the lagging ATP-bound head remains docked onto the catalytic core and directed forward. ATP hydrolysis on the microtubule-bound lagging head results in another structural rearrangement that leads to phosphate release. Thereby, the lagging head transitions into a weakly bound ADP state and detaches from the microtubule, thus starting the cycle anew (Fig. 1, E7).
During the stepping cycle, each motor domain must remain out-of-phase with the other. If both motor domains enter an unbound state at the same time, the processive run ends. The run length indicates the distance a motor steps along the microtubule before dissociation and gives a measure of processivity. Each kinesin family member has different average run lengths ranging from <0.2 µm, as in Eg5, to 2.1 µm, as in conventional kinesin-1, although this is construct-dependent for kinesin-1 and can range from 1.3 to 2.1 µm (25,31,32). Despite the structural conservation between different kinesin motors, there are clear kinetic differences between the families.
One domain hypothesized to contribute to processivity is the kinesin neck-linker, a small flexible peptide consisting of 12-18 amino acids (33-35). The neck-linker connects each motor domain to the coiled-coil stalk where the junction between these two entities has been assumed to occur at the same position in the sequence as that observed in kinesin-1 (36). The neck-linker itself undergoes a series of structural transitions as outlined in the kinesin mechanochemical cycle. It is hypothesized that longer neck-linkers increase the diffusional search area and therefore could slow down stepping, allowing time for the forward head to release from the microtubule (37). Recently, alterations in neck-linker length were shown to affect the kinetic cycle (38). Increasing neck-linker length resulted in increased rear head binding, and decreasing neck-linker length resulted in slower release of ADP from the unbound head. Both effects result in slowing the productive kinetic cycle (38). Because the neck-linker is involved in connecting the motor domain and coiled-coil stalk, changes in the length of the neck-linker are expected to alter communication between the two motor heads.
Given that the overall size of the kinesin motor domain is similar across all families and that they bind to microtubules in a similar manner, it is surprising that the predicted length of the neck-linker between families shows considerable variation even though within each family the neck-linker/α7 sequences are almost completely conserved (36). This is especially puzzling because a multitude of studies show increasing the neck-linker for a given kinesin by even one residue results in decreases in run length and processivity (25,31,34,39,40). This conundrum is particularly evident in members of the kinesin-2 family, KIF3AB and KIF3AC, which are unique in forming a heterodimeric motor. KIF3AB and KIF3AC further interact with an adaptor protein to bind a variety of cargoes for intraflagellar and neuronal transport (11,15,41-45). KIF3AC in particular has been implicated in neuronal repair (11).
FIGURE 1. Schematic representation of the kinesin chemomechanical cycle that illustrates key transitions that influence processivity. A processive run is started by binding of either head followed by rapid ADP release to form the E1 intermediate where the leading head that forms the initial contact is microtubule-bound but nucleotide-free (Ø), whereas the trailing head is detached with ADP tightly bound. ATP binding to the leading head induces a series of structural transitions, including neck-linker docking that allows the trailing ADP-head to move 16 nm ahead to its new microtubule-binding site (E2-E4). ADP release from the new leading head (E4 and E5) results in the E5 two-head bound state, thereby generating intermolecular strain, which inhibits ATP binding at the now leading head. ATP hydrolysis at the trailing head followed by phosphate (Pi) release generates a microtubule weakly bound ADP state (E6 and E7). Detachment of the trailing head relieves the intermolecular strain (E7), and initiates the next motor cycle.
Both KIF3AB and KIF3AC are highly processive motors (46). Yet their neck-linker is predicted to be three residues longer than conventional kinesin, which would suggest that these motors should not be so processive (36). KIF3AB and KIF3AC have identical neck-linker lengths and nearly identical neck-linker sequences, differing only at Thr-380 in KIF3C, which is an alanine in KIF3A and KIF3B. Both motors are processive; KIF3AB and KIF3AC have run lengths of 1.6 and 1.2 µm, respectively (37,46). However, the kinetic parameters vary significantly between the two motors (46-48). Furthermore, homodimeric species KIF3AA, KIF3BB, and KIF3CC also exhibit vastly different processivity parameters. KIF3AA and KIF3BB are highly processive, but KIF3CC travels at only 7.5 nm/s with an average run length of just 0.6 µm, which suggests that the neck-linker is not the only determinant of processivity (46). These observations prompted an investigation of the true length of the neck-linker in the best studied classes of N-terminal kinesin families.
The original estimates of the neck-linker length made the assumption that the coiled-coil would begin on a hydrophobic residue that lies in either the a or d position of the coiled-coil heptad repeat (36). However, the predictions of the first residue to adopt a helical conformation in any coiled-coil are ambiguous, even though the body of a coiled-coil is well indicated by current software (3,4). Prediction software recognizes the heptad repeat in a coiled-coil domain. However, because the sequences at the beginning and end of a coiled-coil may not follow that pattern, there is insufficient information to make accurate predictions of the start or the end of a native coiled-coil. Given the considerable variation in the amino acid sequence of neck-linkers and associated coiled-coils, this raises the question of whether the true length of the neck-linker has been accurately identified across the kinesin superfamily with the original assumption described above.
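The heptad-repeat idea behind such predictions can be illustrated with a minimal Python sketch. This is not the scoring scheme of COILS or MARCOIL; it only assigns a through g positions to a sequence and counts hydrophobic residues at the a and d positions. The toy sequence, the hydrophobic residue set, and all function names are assumptions made for illustration.

```python
# Illustrative sketch only: a toy heptad-register check, not the COILS or
# MARCOIL algorithm. The sequence and residue set below are hypothetical.
HEPTAD = "abcdefg"
HYDROPHOBIC = set("AILMFVWY")  # assumed definition of "hydrophobic"


def assign_register(sequence, first_position="a"):
    """Label each residue with a heptad position, starting at first_position."""
    offset = HEPTAD.index(first_position)
    return "".join(HEPTAD[(offset + i) % 7] for i in range(len(sequence)))


def core_hydrophobic_fraction(sequence, first_position="a"):
    """Fraction of a/d positions occupied by hydrophobic residues."""
    register = assign_register(sequence, first_position)
    core = [aa for aa, pos in zip(sequence, register) if pos in "ad"]
    if not core:
        return 0.0
    return sum(aa in HYDROPHOBIC for aa in core) / len(core)


def best_register(sequence):
    """Heptad position of the first residue that maximizes a/d hydrophobicity."""
    return max(HEPTAD, key=lambda p: core_hydrophobic_fraction(sequence, p))


toy = "LKELEDKLKELEDK"  # hypothetical two-heptad sequence
print(assign_register(toy), best_register(toy))
```

A score of this kind depends on a repeating pattern over several heptads, which is consistent with the point above that the first and last residues of a coiled-coil carry too little pattern information to be placed reliably.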
There is only one structure for a dimeric N-terminal kinesin, rat kinesin-1 (3KIN), because dimeric kinesins are difficult to crystallize (14). The 3KIN structure provided the first picture of the true α7 start and neck-linker length in the context of a dimeric motor. Most kinesin motor structures are monomeric, allowing the neck-linker to adopt varying conformations that may not reflect that experienced by the dimeric motor in vivo. For example, in the structure of the kinesin-2 KIF3B (3B6U) the neck-linker includes a cis-proline; therefore, this structure is unlikely to reflect the native neck-linker conformation. This study is directed toward providing an experimental foundation for determining the start site for the coiled-coil and by inference the length of the neck-linker. We determined the structures of the neck-linker and α7 helix from the following four different kinesin families: kinesin-1, -2, -5, and -7. Kinesin-1 is the canonical N-terminal processive kinesin motor. Kinesin-2 family members, KIF3A and KIF3C, are involved in long range transport and neuronal repair (11,15). Kinesin-5 family member, Eg5, is unique within this subset as it is a bipolar tetrameric kinesin whose role is to cross-link microtubules during cell division (12,13,49). Kinesin-7 family member CENP-E is responsible for transporting misaligned chromosomes during congression in mitosis (16). In addition, we compared structures in the PDB that include a native start to their coiled-coil with the predictions generated from either MARCOIL or COILS-28 (3,4). Overall, we find that structures of proteins are necessary to determine the true start site of the coiled-coil rather than relying on prediction software alone.

Results
Crystal Structures of Kinesin-1, -2, -5, and -7 Neck-linker α7 Helix Proteins. All neck-linker structures were homodimeric and solved to a resolution of 2.3 Å or better, allowing for accurate determination of secondary structure transitions. The extent of the ordered structure in each construct is given in Table 1. Several structures had multiple monomers in the asymmetric unit. Individual monomers were similar as shown in Table 1. For each neck-linker structure, the start of the coiled-coil helix was determined using the Dictionary of Secondary Structure of Proteins algorithm (50). In each structure, varying lengths of the neck-linker were ordered; thus we focused on the initiation point of the α7 helix to determine neck-linker length.
All of the structures were determined as fusions with the C-terminal dimerization domain of EB1. Previous studies have shown that inclusion of globular folding domains considerably increases the ability to express and crystallize sections of coiled-coil proteins and that they do not perturb the structure more than one heptad from the point of fusion (51,52).
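Locating a helix start from a per-residue secondary structure assignment can be sketched as follows. This is a minimal Python illustration under stated assumptions: the input is a string of one-letter secondary structure codes in which `H` marks an α-helical residue, the example string is hypothetical rather than actual program output, and the `min_run` filter is an added assumption, not part of the procedure described above.

```python
# Illustrative sketch: find the first residue of the first sustained helical
# run in a per-residue secondary structure string ('H' = alpha-helix).
# The example string is hypothetical, not output from the structures here.


def first_helix_residue(ss, first_resnum, min_run=4):
    """Return the residue number that starts the first run of at least
    min_run consecutive 'H' codes, or None if no such run exists."""
    run_start = None
    for i, code in enumerate(ss):
        if code == "H":
            if run_start is None:
                run_start = i
            if i - run_start + 1 >= min_run:
                return first_resnum + run_start
        else:
            run_start = None
    return None


# Hypothetical assignment: five non-helical residues starting at residue 340,
# followed by helix.
example_ss = "-----HHHHHHHHHH"
print(first_helix_residue(example_ss, first_resnum=340))
```

With this hypothetical input the helix is called at residue 345, mirroring the kinesin-1 case reported below in which Asn-340 to Thr-344 are non-helical and α7 begins at Ala-345.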

TABLE 1 Average pairwise root mean square differences between independent chains within each structural determination
First ordered residue is listed for the longest monomer and may not be ordered in other monomers in the asymmetric unit. r.m.s.d. is root mean square deviation. NA is not applicable.
Column headings: Monomers in asymmetric unit; Average r.m.s.d.; First ordered residue
a In these structures two residues that remain from the recombinant tobacco etch virus cleavage site were also observed.

Coiled-coil Initiation in Kinesin Motors
The crystal structure of the kinesin-1 neck-linker (Fig. 2) shows that the start of α7 and the end of the kinesin-1 neck-linker observed here are identical to those seen in the structure of the kinesin-1 dimer (PDB code 3KIN) (14). An overlay of the kinesin-1 neck-linker structure and dimeric kinesin-1 is shown in Fig. 3. Neck-linker residues Asn-340 to Thr-344 were ordered in the crystal structure and adopt a random coil conformation, and the α7 helix begins at Ala-345 as expected. The consistency of the kinesin-1 neck-linker structure with predictions and previous work supports the assertion that the structures from other families likely represent the solution state of the junction between their neck-linker and α7.
For the remaining neck-linker structures, the length of the neck-linker and the α7 helix deduced from prediction algorithms do not agree with the experimental structural data. Although the neck-linker itself is flexible as evidenced by docking, it is unlikely that the start residue for the α7 helix changes. Studies using high resolution atomic force microscopy and computational modeling have shown that there is no local conformational unwinding or "breathing" at helix α7 (53-55). Thus, the α7 start residues determined here are most likely the same as in the full dimeric motor. The kinesin-2 family members, KIF3A and KIF3C (Fig. 2), have significantly shorter neck-linkers and longer α7 helices than earlier models based on the kinesin-1 family. Although predictions suggested that the α7 helix would begin at Leu-360 in KIF3A and at Leu-382 in KIF3C, in the crystal structures α7 starts five residues earlier at Pro-355 and Pro-377 in KIF3A and KIF3C, respectively (Fig. 4). The discrepancy in the start of α7 leads to a shortening of the neck-linker from 17 residues to 12. This appears to be in contrast with the neck-linker observed in the monomeric KIF3B structure (PDB code 3B6U), which has a length of 16 residues. However, the KIF3B structure (PDB code 3B6U) is monomeric, and it is unlikely that there were enough residues to form a stable α-helix, leading to the discrepancy in the neck-linker lengths. Although the native kinesin-2s are heterodimeric, the crystallized constructs are homodimeric. It is not anticipated that the length of the neck-linker or structures of the coiled-coils will differ between the homodimer and the heterodimer, as both KIF3A and KIF3C resulted in a neck-linker of 12 residues. Additionally, the buried residues in the first heptad of the coiled-coil are similar in both KIF3A and KIF3C, and thus they should not affect neck-linker length.
Furthermore, the native coiled-coil is stabilized by a hydrogen bond between a lysine and aspartate in both KIF3A and KIF3C; thus this interaction should also be present in the heterodimer (Fig. 4). There are sequence differences in α7 between KIF3A and KIF3C, but these occur in positions that are solvent-exposed and do not interact with the adjacent α-helix (Fig. 4) and hence are not expected to greatly influence the structure of the heterodimer. The structures of the kinesin-2 neck-linkers in Fig. 2 show that the neck-linker is shorter than that of kinesin-1, which has a 14-residue neck-linker and the longest run length of all families tested (39).
FIGURE 2. Crystal structures of kinesin-1, -2, -5, and -7 reveal unique class-specific neck-linker α7 neck-coil domains. The neck-linker motif predicted to occur based on the earlier studies of kinesin-1 is colored blue. Helix α7, as observed in kinesin-1, is colored according to the kinesin family to which it belongs. EB1, a coiled-coil fusion protein, is colored gray. There are varied amounts of ordered neck-linker motifs. These structures show that the true start of helix α7 is variable across the kinesin superfamily. Table 4 provides the protein sequence of each fusion protein and their coiled-coil registry. Figs. 2-6 were prepared in part with PyMOL.
The Eg5 neck-linker crystal structure (Fig. 2) shows differences in length as well. The α7 helix begins at Lys-371 and not Ile-375 as predicted, resulting in a neck-linker that was only 14 residues long. As in Eg5, the CENP-E neck-linker was predicted to be 18 residues long. The coiled-coil of CENP-E begins at Asp-341 rather than Leu-345, thus shortening the neck-linker to 14 residues (Fig. 2). These crystal structures show that both the CENP-E and Eg5 neck-linkers are the same length as that of kinesin-1. A summary of the predicted neck-linker lengths and actual neck-linker lengths along with the average run length is listed in Table 2.
Figure caption (beginning not recovered): … Fig. 2, with side chains shown as sticks and colored by element. In both KIF3A (A) and KIF3C (B), there is a hydrogen bonding interaction between a lysine and an aspartate that stabilizes part of the coiled-coil. The sequences of the neck-linker and α7 are also shown (C) where the residues depicted in gray were not included in the constructs but represent the full-length linker. Interestingly, none of the differences in sequence between KIF3A and KIF3C are predicted to influence formation of a heterodimer.
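The relationship between the helix start and the neck-linker length in the preceding paragraphs is simple arithmetic, sketched here in Python. The residue numbers and predicted lengths are those quoted in the text; the function name and data layout are assumptions made for illustration, and the sketch assumes the N-terminal end of the neck-linker is unchanged.

```python
# Illustrative arithmetic: if alpha7 starts earlier than predicted, the
# neck-linker is shorter by the same number of residues (assuming its
# N-terminal boundary is fixed). Values are those quoted in the text.


def observed_linker_length(predicted_length, predicted_start, observed_start):
    """Neck-linker length implied by the observed alpha7 start residue."""
    return predicted_length - (predicted_start - observed_start)


# (name, predicted alpha7 start, observed alpha7 start, predicted length)
constructs = [
    ("KIF3A", 360, 355, 17),
    ("KIF3C", 382, 377, 17),
    ("Eg5", 375, 371, 18),
    ("CENP-E", 345, 341, 18),
]

for name, predicted, observed, length in constructs:
    actual = observed_linker_length(length, predicted, observed)
    print(f"{name}: predicted {length} residues, observed {actual} residues")
```

This reproduces the lengths stated above: 12 residues for KIF3A and KIF3C, and 14 residues for Eg5 and CENP-E.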
Crystal Structures of the Kinesin-1 Extended Neck-linker and KIF3A-Kinesin-1 Hybrid. The coiled-coil stalk of the kinesin-1 motor has often been fused to the motor domain of other kinesin family members for single molecule studies where this served as a template for understanding the effect of length differences in the neck-linker seen across the entire kinesin superfamily (25,39,56-59). This hybrid was used in part to ensure that kinetic differences were derived from the differences in the neck-linker domain and not due to the coiled-coil stalk or other charged regions (39). Additionally, the hybrid constructs were easily expressed in Escherichia coli, rather than baculovirus (39). To determine the molecular consequences of these engineered hybrids and how they might affect the interpretation of changes in kinetic or motile behavior, structural studies were performed on a KIF3A-kinesin-1 hybrid and a kinesin-1 in which three residues were inserted (kinesin-1 + DAL) (Fig. 5). This extension has been previously used to examine processivity changes in the kinesin-1 motor (25,39).
The kinesin-1 + DAL structure shows the effect of adding three additional residues to the kinesin-1 neck-linker. Previous studies have added these three residues, DAL, as an extension to mimic kinesin-2, as the three final residues of its neck-linker are DAL (DTL in KIF3C) (25,39). The addition of these residues to the end of the kinesin-1 neck-linker should result in a 17-residue neck-linker, as in kinesin-2, rather than the native 14-residue neck-linker. Interestingly, in our structure, even though three residues (DAL) were added to the putative end of the neck-linker, two of the three residues become a part of the α7 helix, so the insertion lengthens the neck-linker by only one residue.
The KIF3A-kinesin-1 neck-linker construct fuses the KIF3A neck-linker domain to the α7 helix of kinesin-1 (25,39). In the native KIF3A structure determined here, the neck-linker was 12 residues long; however, in the hybrid, it lengthens to 14-15 amino acids (Figs. 5 and 6). There is variation in the start of the coiled-coil due to slight differences in crystal packing. However, the length is clearly different from that of the native kinesin-2 neck-linker. These results indicate that studies of the kinetic and motile properties of kinesins should be performed in the context of the native neck-linker and coiled-coil.
Temperature Factor Trends at the Neck-linker/α7 Junction. As noted earlier, atomic force microscopy measurements suggest that the start of α7 in kinesin-1 is particularly stable. It is not known whether this phenomenon holds true for all N-terminal kinesins, but it is an important consideration when assessing the length of the neck-linker. The structures give a clear indication of the start of α7 but do not necessarily give an assessment of stability. In principle, examination of the temperature factors across the neck-linker/α7 junction could provide some insight as to whether the longer helices are less stable. As shown in Fig. 7, there is a general trend that the residues at the N terminus of the neck-linker have a higher temperature factor than the coiled-coil, but there is a continuum leading into the α-helix. This is often observed for the N termini of proteins or linkers between domains. This analysis, though fraught with reservations because temperature factors are susceptible to modification by crystal packing, radiation decay, and lattice disorder, suggests that the residues that increase the length of α7 are no less stable than those that make up the canonical helix in kinesin-1. Interestingly, the same trends in B-values are also seen at the N termini of other native coiled-coils as discussed later (Fig. 8).

Coiled-coil Predictions Do Not Accurately Reflect the Start Residues of Coiled-coils. As noted earlier, the current prediction algorithms provide a robust estimate of the existence of a coiled-coil, although the exact start of the structural motif is ambiguous. A robust prediction for a residue in a coiled-coil will often be close to 1.0, and a value greater than 0.5 is commonly considered to indicate a coiled-coil (60). The question is what probability value should be accepted as a reliable indication of the first residue. To gain insight into this area of uncertainty, the structures of the kinesin neck-linkers and α7 were compared with the calculated probabilities for two algorithms. The register of the coiled-coil and prediction for the start site were determined with COILS, a Position-specific Scoring Matrix model, and MARCOIL, a Hidden Markov Model (3,4). Both COILS and MARCOIL gave similar registers for the body of the coiled-coils. However, neither program accurately predicted the observed start sites for the α7 helix (Fig. 7, Table 3). In the calculations of the probabilities, the 28-residue window in COILS was used as it gives the lowest false-positive rate of the three options (14-, 21-, or 28-residue windows).
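The threshold problem raised in this paragraph can be made concrete with a short Python sketch. The probability profile below is hypothetical and is not output from COILS or MARCOIL; the function name and residue numbering are likewise assumptions. The sketch only shows that the residue called as the "start" depends on the cutoff chosen.

```python
# Illustrative sketch: calling a coiled-coil start from per-residue
# probabilities depends on the chosen cutoff. The profile is hypothetical.


def call_start(probabilities, first_resnum, threshold):
    """Return the first residue number whose probability meets the
    threshold, or None if the threshold is never reached."""
    for i, p in enumerate(probabilities):
        if p >= threshold:
            return first_resnum + i
    return None


# Hypothetical rising probability profile for residues 100-107.
profile = [0.05, 0.09, 0.20, 0.40, 0.60, 0.82, 0.95, 1.00]

for cutoff in (0.09, 0.5, 0.9):
    print(cutoff, call_start(profile, first_resnum=100, threshold=cutoff))
```

For this made-up profile the called start moves by five residues between the lowest and highest cutoff, which is of the same order as the four to five residue discrepancies reported for the kinesin structures.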
Several coiled-coil prediction algorithms were recently reviewed to check for both the accuracy in prediction of coiled-coil and also the accuracy of the oligomeric state of the coiled-coil (60). This study found the CCHMM_PROF algorithm to give the best indicator of coiled-coil; however, it does not yield a registry prediction for the coiled-coil. Thus, it was not used in this study (61). Multicoil2 also performed well, but its results were consistent with that of MARCOIL and COILS-28 (62). In general, for the kinesin coiled-coil domains, MARCOIL predicted coiled-coil start sites more conservatively than COILS-28. MARCOIL probabilities were nearly always lower than the COILS-28 prediction, except for the coiled-coil of KIF3C (60). Both programs poorly predicted the KIF3C coiled-coil. MARCOIL yielded a probability of 0.09 for the propensity of Pro-377 to form a coiled-coil, and COILS predicted a probability of 0.06 for the 28-residue window.
For kinesin-1, where the structure was previously known, the algorithms differ in the coiled-coil probabilities. Ala-345 is the α7 helix start. MARCOIL gives a conservative probability of 0.82 at this residue, whereas COILS reaches a probability of 1 five residues earlier in the sequence, where there is no coiled-coil. COILS tended to over-predict the neck-linker coiled-coil, reaching high probabilities earlier in the sequence. MARCOIL is a better estimator, but its predictions for the coiled-coil start sites ranged from 0.09 to 0.83. Neither program yielded reliable predictions for the start site of coiled-coils. The variation between predictive approaches creates a dilemma for deciding the coiled-coil start site based on bioinformatics approaches. Indeed, structural results reveal a fundamental weakness in the prediction algorithms because they are unable to categorically indicate the first residue that will adopt the α-helical conformation.
To see whether this problem of predicting the coiled-coil start sites in kinesins is a general phenomenon, structures of coiled-coil proteins in the PDB that contain a native transition from random-coil to parallel coiled-coil were also examined (Fig. 8 and Table 4). Although there are a large number of structures of dimeric coiled-coils in the Protein Data Bank, there are only about a dozen that contain the native start sequence. Most structures in the PDB represent fragments of a larger protein or are fused to the canonical coiled-coil found in GCN4. Interestingly, the performance of the algorithms on this restricted set was similar to that observed for the kinesin neck-linker proteins.
a Observed start residues are determined using the first residue in a helical conformation in the crystal structures of the neck-linker protein defined as helix as determined by Dictionary of Secondary Structure of Proteins.
b Predicted start residue as determined in Ref. 36. This start was assumed in earlier kinetic studies of kinesins to determine the influence of the neck-linker length.
c The KIF3A neck-linker residue where the coiled coil starts is shown first, followed by the corresponding position in kinesin-1 relative to the kinesin-1 α7 start.

TABLE 3 Predicted coiled-coil start sites, actual start sites, and the corresponding probabilities of the residue being in a coiled-coil according to MARCOIL and COILS using the 28-residue window
In almost every case, the algorithms miss the start site of the coiled-coil. As with the neck-linker proteins, COILS-28 over-predicts the propensity for coiled-coil, and MARCOIL is much more conservative. Neither algorithm accurately predicted the coiled-coil start, often not reaching a reasonable probability of coiled-coil formation until 10 or more residues after the structurally observed start site.
A notable exception is 2FXM, the N-terminal region of the S2 fragment of cardiac β-myosin II (63). COILS-28 reaches a probability of 99% at the site corresponding to the coiled-coil start in the structure. MARCOIL is close, but still under-predicts the probability of a coiled-coil in that region.
In 1GD2, a structure of the bZIP transcription factor, both COILS and MARCOIL under-predict the possibility of a coiled-coil and do not reach an ~90% probability until 17 residues later in the sequence, corresponding to ~2.5 heptads of coiled-coil missed (64). Results are similar for 3HNW, a coiled-coil protein from Eubacterium eligens with unknown function. Neither COILS nor MARCOIL reaches a reasonable coiled-coil prediction until more than three heptads into the coiled-coil domain. Overall, it is clear that algorithms can predict the presence and the register of a coiled-coil but do not provide an accurate guide to the start of the helical structure.

Discussion
In this study, the crystal structures of seven kinesin neck-linker α7 helices were determined from four different kinesin families representing N-terminal kinesin motors. In these structures, there are differences in the length of the neck-linker and the start of the coiled-coil stalk as compared with previous predictions. These differences are most likely the result of inaccurate assumptions in the prediction of coiled-coil start sites. The coiled-coil algorithms, COILS-28 and MARCOIL, were further investigated with respect to start site accuracy for other non-kinesin proteins and were shown to be unable to predict the start sites for the majority of coiled-coil-containing structures.
In the kinesin neck-linker structures, the coiled-coil start site was 4-5 residues earlier in the sequence than was predicted. The original assignment of neck-linker length by Hariharan and Hancock in 2009 (36) was based upon the assumption that the α7 helix would begin on an a or d residue in the coiled-coil registry, and this was true for the kinesin-1 motors. However, the choice of this residue registry was arbitrary and as shown by the results in this paper does not universally apply. Numerous studies have shown differences in the neck-linker length can drastically alter the single molecular parameters (25,31,37,39,65). However, our study suggests that rather than a direct association of shorter neck-linkers leading to greater processivity, it may be that neck-linker length is tuned to its motor and relative changes in length can increase or decrease the processivity.

COILS-28 and MARCOIL prediction for structures of coiled-coil containing proteins that contain a native start sequence
All sequences are in the same register as listed in the top chart. The start residues for each coiled-coil as determined by Dictionary of Secondary Structure of Proteins and the PDB structure are shaded and underlined. Probabilities (×100) as predicted by COILS-28 and MARCOIL are shown successively below each amino acid sequence. References for each PDB if available are included below the PDB code.
Many studies have inserted residues in or deleted residues from the presumptive end point of the neck-linker, where it was assumed that the additional residues would add to the neck-linker and have no effect on the coiled-coil (25,39). As shown in this study, inserted residues can be incorporated directly into the α7 helix, rather than increasing the length of the neck-linker. Even when residues are added at the appropriate end of the neck-linker, as in kinesin-1, they may be incorporated into the coiled-coil domain, rather than extending the neck-linker. It is also possible that the added residues disrupt coiled-coil formation because the coiled-coil motif depends on specific residues at each position to fulfill the canonical knobs-into-holes packing. Thus, the disruption of coiled-coil formation, rather than a direct change in neck-linker length, could be leading to artificial lengthening of the neck-linker. The changes in coiled-coil formation may also underlie the changes in processivity seen in other studies (25,57,58).
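The registry consequence of residues being absorbed into the helix can be sketched with simple modular arithmetic. The snippet below is only an illustration under the assumption of an uninterrupted heptad repeat (positions a through g); the function names are hypothetical and the values are not taken from the structures reported here.

```python
HEPTAD = "abcdefg"

def start_position_after_extension(original_start, n_added):
    """Heptad position of the first helical residue when the helix is
    extended N-terminally by n_added residues in continuous register."""
    idx = HEPTAD.index(original_start)
    return HEPTAD[(idx - n_added) % 7]

def positions_of_added_residues(original_start, n_added):
    """Heptad positions occupied by the added residues, N- to C-terminal."""
    idx = HEPTAD.index(original_start)
    return [HEPTAD[(idx - n_added + i) % 7] for i in range(n_added)]

# Assumed starting positions (a or d) and assumed extension lengths (3 to 5 residues).
for start in ("a", "d"):
    for n in (3, 4, 5):
        new_start = start_position_after_extension(start, n)
        added = "".join(positions_of_added_residues(start, n))
        print(f"predicted start {start}, {n} residues absorbed: "
              f"helix now begins at {new_start} (added positions: {added})")
```

Under this simplification, a helix assumed to begin at position a that absorbs three residues would instead begin at position e, so the first helical residue no longer sits at an a or d position.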
Furthermore, we have shown that fusing the α7 helix from kinesin-1 to kinesin-2 results in changes in the true neck-linker length. This accounts for the observed difference in kinetic and motile activities between synthetic and native fusions. In a study where the KIF3A motor domain and neck-linker were fused to the kinesin-1 α7 helix, the average run length was 0.7 μm, although a separate study showed the run length was nearly 50% longer at 1 μm when using the native coiled-coil (39,46,65). In contrast, the speed of the KIF3A-kinesin-1 construct, 480 nm/s, was nearly twice as fast as the native, 240 nm/s, showing that there are clear alterations in kinetic properties when proteins are fused to different coiled-coil domains (39,46).
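The relative differences quoted above follow directly from the reported values. The short calculation below is a sketch that uses only the numbers given in the text (run lengths in micrometers, speeds in nm/s); the variable and function names are illustrative.

```python
# Values as quoted in the text for the KIF3A constructs.
run_length_fusion_um = 0.7   # KIF3A fused to the kinesin-1 helix
run_length_native_um = 1.0   # native coiled-coil
speed_fusion_nm_s = 480.0
speed_native_nm_s = 240.0

def percent_change(new, reference):
    """Percentage by which `new` exceeds `reference`."""
    return 100.0 * (new - reference) / reference

run_length_increase = percent_change(run_length_native_um, run_length_fusion_um)
speed_ratio = speed_fusion_nm_s / speed_native_nm_s

print(f"Native run length is {run_length_increase:.0f}% longer than the fusion")
print(f"Fusion speed is {speed_ratio:.1f}x the native speed")
```

With the values as rounded in the text, the native run length is about 43% longer and the fusion construct is twice as fast.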
Our study also shows that although the coiled-coil prediction algorithms, COILS and MARCOIL, accurately predict both the occurrence of a coiled-coil and its registry, both algorithms struggle to identify the start site of a coiled-coil. The probabilities given by both MARCOIL and COILS cannot be relied upon to yield the correct initiation point because there is no a priori way to decide which probability should be chosen to indicate the absolute start of the helical conformation. Thus, when evaluating sequences through bioinformatics, it should be understood that there is considerable ambiguity in where the coiled-coil starts. This might not be important for many proteins that include coiled-coils, but it is critical when the junction between globular domains and coiled-coils plays a role in function, as seen in motor proteins.
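The ambiguity in choosing a probability cutoff can be made concrete with a small sketch. The per-residue probabilities below are invented for illustration (they are not output from COILS or MARCOIL), and the helper function is hypothetical; the point is only that a gradually rising probability profile yields a different "start" for every cutoff.

```python
def predicted_start(probabilities, threshold):
    """Return the 0-based index of the first residue whose probability
    (x100) meets the threshold, or None if no residue does."""
    for i, p in enumerate(probabilities):
        if p >= threshold:
            return i
    return None

# Hypothetical per-residue coiled-coil probabilities (x100), rising gradually.
probs = [2, 5, 11, 24, 38, 52, 67, 81, 93, 99, 100, 100]

for threshold in (10, 50, 90):
    print(f"threshold {threshold}: start at index {predicted_start(probs, threshold)}")
```

In this made-up profile the inferred start moves by six residues between cutoffs of 10 and 90, comparable in size to the 4-5 residue discrepancies discussed above.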
Kinesin neck-linker domains are clearly important for processivity, but it is not as simple as a short neck-linker leading to increased processivity. Further studies must consider the length of the neck-linkers carefully and ensure that when mutations or substitutions are made for kinetic studies the constructs are altered in the way that was intended.

Experimental Procedures
Construct Preparation-Mus musculus KIF3A cDNA was a generous gift from William O. Hancock (Pennsylvania State University, University Park, PA). M. musculus KIF3C cDNA was synthesized by Open Biosystems (GE Healthcare, Lafayette, CO). Homo sapiens Eg5 was obtained from an expression plasmid containing Eg5(1-513) (66). H. sapiens CENP-E was obtained from an expression plasmid containing CENP-E(1-407) (67). Drosophila melanogaster kinesin-1 neck-linker was synthesized by Integrated DNA Technologies (Coralville, IA). Kinesin-1 neck-linker mutagenesis was accomplished through a QuikChange-like protocol to introduce gatgcgctg for the +DAL mutation and to mutate cysteine 338 to alanine.
Neck-linker proteins were cloned using a modified pET-31 vector (Novagen) containing an N-terminal 8-histidine tag linked to the protein via a tobacco etch virus protease site and a C-terminal EB1 protein used as a coiled-coil fusion. EB1 is a coiled-coil protein with a C-terminal globular domain used to improve crystallization and expression of coiled-coil-containing proteins (46,51,68,69). Great care was taken to maintain the coiled-coil registration across the fusion boundary and to avoid conflicts between structurally adjacent residues (68). A complete description of all constructs is included in Table 5. Cloning was accomplished using a protocol similar to the QuikChange method (Agilent) as described previously (68). Briefly, the QuikChange method allows genes to be inserted into vectors via linear amplification using PfuUltra II Fusion HF polymerase (Agilent), avoiding the introduction of cloning artifacts and resulting in faster preparation of constructs (70,71). The sequences of the constructs and coiled-coil registry are detailed in Table 5.
Neck-linker proteins were expressed in an E. coli BL21-CodonPlus (DE3)-RIL strain (Agilent). For natively expressed neck-linker proteins, 6 liters of Lysogeny Broth culture were inoculated with an overnight culture formed from a single colony and allowed to grow to an A600 between 0.6 and 1.0 at 37°C. Upon reaching the appropriate A600, the cells were chilled to 16°C, induced with 0.5 mM isopropyl β-D-1-thiogalactopyranoside, and grown at 16°C for 18 h before harvesting via centrifugation. For production of selenomethionine-derived proteins, 6 liters of M9 media were inoculated with 50 ml of overnight culture per 500 ml of media. Cells were grown to an A600 between 0.6 and 1.0 at 37°C. Upon reaching the appropriate A600, the cells were chilled to 16°C, and 5 ml of an amino acid mixture (100 mg of lysine, 100 mg of threonine, 100 mg of phenylalanine, 50 mg of leucine, 50 mg of isoleucine, 50 mg of valine, 50 mg of selenomethionine per 30 ml of mixture) was added. Cells were grown with shaking at 16°C for 30 min before induction with 1 mM isopropyl β-D-1-thiogalactopyranoside. Cells were grown for 24 h before harvesting via centrifugation.
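The amount of each amino acid delivered by the 5-ml aliquot follows from the stock composition given above. The sketch below only restates that arithmetic; the culture volume receiving each aliquot is not stated in the text, so no final concentration is computed, and the helper name is hypothetical.

```python
# Stock composition as stated: milligrams of each amino acid per 30 ml of mixture.
STOCK_VOLUME_ML = 30.0
ALIQUOT_ML = 5.0
stock_mg = {
    "lysine": 100.0,
    "threonine": 100.0,
    "phenylalanine": 100.0,
    "leucine": 50.0,
    "isoleucine": 50.0,
    "valine": 50.0,
    "selenomethionine": 50.0,
}

def mass_delivered_mg(mg_in_stock, stock_ml=STOCK_VOLUME_ML, aliquot_ml=ALIQUOT_ML):
    """Mass of one component contained in the aliquot taken from the stock."""
    return mg_in_stock * aliquot_ml / stock_ml

delivered = {name: mass_delivered_mg(mg) for name, mg in stock_mg.items()}
for name, mg in delivered.items():
    print(f"{name}: {mg:.1f} mg per {ALIQUOT_ML:.0f}-ml aliquot")
```

Each 5-ml aliquot therefore carries roughly 16.7 mg of lysine, threonine, and phenylalanine and roughly 8.3 mg of each of the remaining components, including selenomethionine.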
Protein Purification-All purification steps occurred at 4°C. The octahistidine tag was cleaved using a 1:100 molar ratio of a recombinant tobacco etch virus protease, and the protein was dialyzed against lysis buffer without imidazole (72). After overnight digestion at 4°C, the octahistidine tag was removed via a second 7-ml nickel-nitrilotriacetic acid column. Protein was loaded at 1 ml/min and washed with 3 column volumes of buffer (20 mM Tris-HCl, pH 8.0, 300 mM NaCl, 0.1 mM EGTA, 0.2 mM TCEP, 30 mM imidazole), followed by 3 column volumes of lysis buffer. The protein was concentrated using an Amicon Ultra-15 Centrifugal Filter Unit with Ultracel-10 membrane (Millipore). The concentrated protein was dialyzed against 10 mM Tris-HCl, pH 8.0, 100 mM NaCl, 0.1 mM EGTA, 0.2 mM TCEP, flash-frozen in 30-μl droplets in liquid nitrogen, and stored at −80°C prior to crystallization.
Structure Determination-The x-ray diffraction data for all neck-linker structures were collected at the SBC 19-ID beam line at the Advanced Photon Source (Argonne, IL). The datasets were integrated and scaled with the program HKL2000 (73). The kinesin-1, KIF3A-kinesin-1 hybrid, KIF3A, KIF3C, and Eg5 neck-linker structures were solved via molecular replacement using Phaser with PDB structure 1YIB (74,75). The CENP-E neck-linker structure was solved independently using selenomethionine-containing crystals, where single anomalous diffraction data were processed using Phaser (75). After the initial solutions were obtained, structures were refined by iterative cycles of manual model building in Coot and refinement with phenix.refine (76,77). Data collection and refinement statistics for all structural determinations are given in Table 6. Secondary structure assignment was calculated with the Dictionary of Secondary Structure of Proteins algorithm (50). Structural overlays were done using Superpose (78).
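Once a per-residue secondary structure string is available, locating the first helical residue is a simple scan. The sketch below assumes the one-letter code in which H marks an α-helix; the example string is hypothetical rather than output from any of the structures described here, and the minimum run length of four residues is an arbitrary illustrative choice.

```python
def first_helix_start(dssp, min_length=4):
    """Return the 0-based index of the first run of 'H' that is at least
    min_length residues long, or None if there is no such run."""
    run_start = None
    for i, code in enumerate(dssp + "-"):  # trailing sentinel closes a final run
        if code == "H":
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_length:
                return run_start
            run_start = None
    return None

# Hypothetical assignment: strand, turn, then a long helix.
example = "--EEE-TT-HHHHHHHHHHHHHH"
observed = first_helix_start(example)
predicted = 13  # hypothetical start index from a sequence-based prediction
print(f"observed helix start: {observed}, predicted: {predicted}, "
      f"difference: {predicted - observed} residues")
```

Comparing an observed index of this kind with a sequence-based prediction gives the offset, in residues, between predicted and observed coiled-coil start sites.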
Author Contributions-R. K. P., L. G. P., S. P. G., and I. R. designed the research. R. K. P. and L. G. P. performed the research, and R. K. P., S. P. G., and I. R. analyzed the data and wrote the manuscript.