Divergent Evolution of Nuclear Localization Signal Sequences in Herpesvirus Terminase Subunits

The tripartite terminase complex of herpesviruses assembles in the cytoplasm of infected cells and exploits the host nuclear import machinery to gain access to the nucleus, where capsid assembly and genome-packaging occur. Here we analyzed the structure and conservation of nuclear localization signal (NLS) sequences previously identified in herpes simplex virus 1 (HSV-1) large terminase and human cytomegalovirus (HCMV) small terminase. We found a monopartite NLS at the N terminus of large terminase, flanking the ATPase domain, that is conserved only in α-herpesviruses. In contrast, small terminase exposes a classical NLS at the far C terminus of its helical structure that is conserved only in two genera of the β-subfamily and absent in α- and γ-herpesviruses. In addition, we predicted a classical NLS in the third terminase subunit that is partially conserved among herpesviruses. Bioinformatic analysis revealed that both location and potency of NLSs in terminase subunits evolved more rapidly than the rest of the amino acid sequence despite the selective pressure to keep terminase gene products active and localized in the nucleus. We propose that swapping NLSs among terminase subunits is a regulatory mechanism that allows different herpesviruses to regulate the kinetics of terminase nuclear import, reflecting a mechanism of virus:host adaptation.


________________________________________
Herpesviruses are large double-stranded DNA (dsDNA) viruses, ubiquitously found in humans, that replicate in the cell nucleus. They are broadly subdivided into three taxonomic subfamilies (α, β and γ), thought to have diverged from a common ancestor around 400 million years ago (1). In total, there are 8 herpesvirus types that infect humans and present different cell and tissue tropism as well as genome size (between 120-240 kbp). The assembly of all herpesviruses proceeds via formation of an empty procapsid, which is filled with genetic material by the action of a powerful virus-encoded genome-packaging motor (reviewed in (2-4)). The icosahedral procapsid builds around a scaffolding protein not present in the mature virus and contains a dodecameric portal protein complex at a unique vertex (5,6). The herpesvirus replication strategy is very similar to that of tailed bacteriophages (reviewed in (7-10)) and, to some extent, adenoviruses (11). In both bacterial viruses and herpesviruses, the terminase complex is formed by a small and a large terminase subunit (abbreviated as S-terminase and L-terminase) assembled in various stoichiometries (12), which dock at the portal vertex (13) and convert ATP hydrolysis into rotation of dsDNA, which is gradually encapsidated (14). L-terminase, known as pUL15 in HSV-1 (15) and pUL89 in HCMV (16), is a bifunctional ATPase/nuclease that binds directly to portal protein (7,17,18). In contrast, the S-terminase subunit (e.g. pUL28 in HSV-1 (19,20) and pUL56 in HCMV (21,22)) binds packaging initiation sites on the viral genome and, at least in bacterial viruses, regulates the ATPase activity of L-terminase (23-25). Herpesviruses also contain a third terminase subunit (herein referred to as 'T-terminase'), such as pUL33 in HSV-1 or pUL51 in HCMV, of unknown function and structure, that can be isolated from infected cells in a complex with L- and S-terminase (26).
Herpesvirus L-terminase shares significant amino acid homology with L-terminase from bacteriophages (7), including an N-terminal ATPase domain with Walker A/B motifs and a C-terminal nuclease domain (27,28) superimposable to that of most phages (13,29-31). While L-terminase is a bona fide ortholog of phage L-terminases, there is no sequence homology or predicted structural similarity between bacteriophage and herpesvirus S-terminases, suggesting the latter is an evolutionarily new protein in herpesviruses that plays a functionally similar role of DNA-recognition subunit during genome-packaging. In addition, herpesvirus S-terminase is larger in molecular mass than L-terminase (Fig. 1A), which prompted some authors to refer to this subunit as 'large-terminase' (22). In this paper, we will refer to herpesvirus S-terminase as the functionally equivalent protein of phage S-terminase.
Trafficking of macromolecules between nucleus and cytoplasm is typically an active, signal-mediated and highly regulated process that occurs through the Nuclear Pore Complex (NPC) (reviewed in (32-34)). The majority of cellular and viral cargos moving through the NPC expose a transport signal on their surface, exemplified by the NLS (35) for import cargos and the Nuclear Export Signal (NES) (36) for export cargos. NLS-cargos are shuttled through the NPC by soluble transport factors of the importin β superfamily (also known as β-karyopherins) (37) in a process that requires the small GTPase Ran. Cargos bound to transport factors move bidirectionally through the NPC interior by making interactions with phenylalanine-glycine-rich repeats exposed by several nucleoporins mainly lining the NPC (38). Import complexes assemble in the cytoplasm upon interaction of NLS-cargos with the receptor importin β; this interaction can be direct (39), or mediated by transport adaptors such as importin α (40) and snurportin (41), which use an N-terminal Importin β Binding (IBB) domain (42) to recruit importin β. Importin α is the universal adaptor that binds NLS-bearing import cargos (40). It exists in seven isoforms in humans, all structurally very similar (43), that generate a binding surface for NLSs, which can be monopartite like the SV40 large T antigen NLS (44), or bipartite like the nucleoplasmin NLS (45,46).
In order to obtain a quantitative description of the structure, potency and conservation of NLSs found in herpesvirus terminase subunits, in this paper, we carried out a structural and bioinformatic analysis of HSV-1 pUL15 and HCMV pUL56 NLS sequences.
Crystallographic methods - Crystals of ΔIBB-importin α1 bound to terminase NLSs were obtained using the hanging drop vapor diffusion method. A droplet containing 2.3 µL of gel filtration-purified protein at 18 mg ml⁻¹ and 0.7 µL of a 2-fold molar excess of peptide was mixed with an equal volume of 0.1 M sodium citrate tribasic dihydrate pH 5.6, 0.7 M sodium citrate tribasic dihydrate, 10 mM mercaptoethanol and equilibrated against 600 µL of the precipitant, at 18 °C. Crystals were harvested in nylon cryo-loops, cryo-protected with 27% ethylene glycol and flash-frozen in liquid nitrogen. Diffraction data were collected at LS-CAT Beamline 21-ID-F at the Advanced Photon Source (Argonne National Laboratory) on a MARMOSAIC 225 CCD detector. Data were indexed, integrated and scaled using HKL2000 (48). Initial phases were obtained by molecular replacement using Phaser (49) and PDB 3Q5U as a search model. Herpesvirus NLSs were built in Fo-Fc electron density difference maps using Coot (50) and complete atomic models were refined using phenix.refine (51) with cycles of positional and isotropic B-factor refinement and six distinct Translation/Libration/Screw groups. Stereochemistry was checked using PROCHECK (52): the final models have excellent geometry, with over 96% of residues in the most favored regions of the Ramachandran plot and no outliers in disallowed regions. Data collection and refinement statistics are summarized in Table 1. All ribbon diagrams and surface representations were prepared using the program PyMOL (53). The structures were analyzed using the PISA server (54) and all structural superimpositions were carried out in Coot (50). 3D-structural models of pUL15 and pUL56 were generated using I-TASSER (55).
Isothermal Titration Calorimetry - ITC experiments were carried out at 25 °C using a nano-ITC calorimeter (TA Instruments). pUL15 NLS, pUL56 NLS and SV40 NLS peptides were dissolved in G.F. buffer at 500-600 µM and injected in 1.5 µL increments into a calorimetric cell containing 195 µL of ΔIBB-importin α1 at 50 µM. The spacing between injections was 300 seconds. Titration data were analyzed using NanoAnalyze Data Analysis software version 3.5.0 (TA Instruments). Heats of dilution were determined from control experiments (NLS against buffer) and subtracted from heats of binding (NLS against importin α). Curve fitting using a two independent binding sites model gave a slightly better standard deviation around fit than a single binding site model (0.51 vs 0.54 for pUL15 NLS; 0.32 vs 0.45 for pUL56 NLS and 0.24 vs 0.34 for SV40 NLS).
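The titration geometry described above can be turned into an approximate peptide concentration and peptide:protein molar ratio after each injection. The following is a minimal sketch, not the analysis performed by NanoAnalyze: it assumes a simple dilution model that ignores any volume displaced from the cell, and it takes 550 µM (the midpoint of the stated 500-600 µM range) as a hypothetical syringe concentration. The function names are illustrative only.

```python
def ligand_concentration_uM(n_injections, syringe_uM=550.0,
                            injection_uL=1.5, cell_uL=195.0):
    """Approximate total peptide concentration in the cell after n injections.

    Assumption: plain dilution into a growing volume; displacement of cell
    contents by the injected volume is ignored.
    """
    added_uL = n_injections * injection_uL
    return syringe_uM * added_uL / (cell_uL + added_uL)


def molar_ratio(n_injections, syringe_uM=550.0, injection_uL=1.5,
                cell_uL=195.0, protein_uM=50.0):
    """Peptide:protein molar ratio after n injections (same dilution model)."""
    added_uL = n_injections * injection_uL
    # Both species are diluted into the same volume, so the volume cancels.
    return (syringe_uM * added_uL) / (protein_uM * cell_uL)


if __name__ == "__main__":
    for n in (1, 6, 11, 12, 20):
        print(n, round(ligand_concentration_uM(n), 1), round(molar_ratio(n), 2))
```

Under these assumptions, 11-12 injections bring the peptide to roughly equimolar with the 50 µM protein in the cell.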
Bioinformatic analysis - Herpesvirus terminase subunit sequences were obtained from the NIAID Virus Pathogen Database and Analysis Resource (56) through the web site at http://www.viprbrc.org/. In addition, whole genome sequences accessed through NCBI (http://www.ncbi.nlm.nih.gov/genomes) were scanned for proteins related to HSV-1 large terminase and HCMV small terminase, using either BLASTp (57) or manually. Global alignment of L- and S-terminase sequences was done using ClustalW at EMBL-EBI (58) using HSV-1 pUL15 and HCMV pUL56 as references to locate NLSs in orthologous terminases. Local alignment of NLSs was generated for every genus using the program WebLogo (59). Ab initio prediction of classical NLSs was done using NLS mapper (60).
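For readers unfamiliar with sequence logos, the quantity plotted per alignment column (stack height in bits) can be sketched as follows. This is an illustrative calculation only, assuming a 20-letter protein alphabet and omitting the small-sample correction that WebLogo applies; the aligned sequences are made-up toy data, not terminase sequences, and the function names are hypothetical.

```python
import math
from collections import Counter


def column_information_bits(column, alphabet_size=20):
    """Information content of one alignment column: log2(N) - Shannon entropy.

    Gaps ('-') are ignored. No small-sample correction is applied.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    entropy = -sum((c / total) * math.log2(c / total)
                   for c in Counter(residues).values())
    return math.log2(alphabet_size) - entropy


def logo_letter_heights(aligned_seqs):
    """Per column, map each residue to its letter height (frequency x bits)."""
    heights = []
    for column in zip(*aligned_seqs):
        residues = [r for r in column if r != "-"]
        bits = column_information_bits(column)
        counts = Counter(residues)
        heights.append({r: bits * c / len(residues) for r, c in counts.items()})
    return heights


if __name__ == "__main__":
    toy_alignment = ["AKRS", "GKKS", "AKRT", "SKKT"]  # hypothetical sequences
    for position, stack in enumerate(logo_letter_heights(toy_alignment), start=1):
        print(position, {r: round(h, 2) for r, h in stack.items()})
```

An invariant column reaches the maximum of log2(20), about 4.32 bits, while a column split evenly between two residues scores one bit less.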

Topology of herpesvirus L- and S-terminase subunits.
A functional NLS was previously identified and validated at the N-terminus of HSV-1 L-terminase subunit (pUL15) (61) and at the C-terminus of HCMV S-terminase (pUL56) (62) (Fig. 1A). To map the position of these import sequences in the 3D-structure of terminase subunits, we generated structural models of pUL15 and pUL56 using the prediction software I-TASSER (55) (Fig. 1B). For L-terminase, I-TASSER predicted a three-domain structure, consisting of a highly conserved C-terminal nuclease domain (13,27-31), and an amino-proximal ATPase domain containing Walker A and B motifs, fused to an N-terminal insertion domain similar to that found in phage T4 L-terminase (13,63). In this model, the NLS is located at the surface of the insertion domain, adjacent to the ATPase module and readily accessible to importin α. In contrast, I-TASSER predicted HCMV S-terminase to fold into an α-solenoid structure (Fig. 1B) built by tandemly-repeated HEAT motifs, similar to importin β (37). This U-shaped structure is vastly different from phage S-terminases, which are typically smaller (~20-25 vs ~90 kDa) and assemble into ring-like oligomers (25,64-67). Nonetheless, the I-TASSER model is consistent with the 4-prong structure of dimeric pUL56 obtained using negative stain electron microscopy (68). In this model, pUL56 NLS (62) lies on the solvent-exposed loop connecting helix A with helix B of the last HEAT repeat and, in part, on the surface of helix B. It is not unusual for NLSs to be partially helical in the absence of importin α, as seen for the NLS of NF-κB p65 in the structure of NF-κB p65/p50 heterodimer bound to IκBα (69).
To shed light on the recognition of herpesvirus terminase NLSs by the host nuclear import machinery, we crystallized the adaptor importin α1 lacking the N-terminal IBB-domain (42) (ΔIBB-importin α1) in complex with peptides encompassing HSV-1 pUL15 NLS (residues 180-191) and HCMV pUL56 NLS (residues 814-829) (Fig. 1A). For both complexes, complete diffraction data to 1.95 Å resolution were measured and the respective structures were solved by molecular replacement and refined to an Rwork/Rfree of 17.3/19.4% and 17.3/19.6%, respectively (Table 1).

Atomic structure of HSV-1 L-terminase NLS bound to importin α1.
Importin α is made up of 10 stacked Armadillo (Arm) repeats, each formed by three helices (known as A, B and C) (40,70). Without the IBB-domain, the importin α Arm-core generates a continuous α-helical surface that harbors two NLS-binding pockets, usually referred to as major (Arm repeats 2-4) and minor (Arm repeats 7-8). At each site, five main points of contact between NLS side chains and importin α are referred to as P1-P5 and P1'-P5' at major and minor binding site, respectively (35,44-47,71-76). HSV-1 L-terminase NLS binds at both sites of importin α (Fig. 2A), but is better ordered at the major pocket (Fig. 2B), where 9 NLS residues (181-GPPKKRAKV-190) have excellent electron density, as opposed to only 4 residues at the minor binding site (184-KKRA-187) (Fig. 2C). Accordingly, the pUL15 NLS buries 733.9 and 542.3 Å² of surface area at major and minor binding site, respectively, making a total of 17 hydrogen bonds, 1 salt bridge and 153 non-bonded contacts at the major site and 8 hydrogen bonds, 2 salt bridges and 65 non-bonded contacts at the minor site (Fig. 2B,C). The different avidity for the importin α core at the two sites is reflected by the peptide B-factor (Table 1), which is comparable to the importin α average B-factor at the major NLS-binding pocket (44.4 Å² versus 43.5 Å²), but nearly twice as high (94.5 Å²) at the minor NLS site. Structural alignments with known classical and non-classical NLS peptides identified HSV-1 pUL15 NLS residues occupying conserved positions P1-P5 and P1'-P4' (Table 2 and Fig. 2B,C). A critical Lys (K185) is invariantly conserved at position P2, as found in the majority of classical and non-classical NLSs (Table 2). Surprisingly, position P4 is occupied by an Ala, a residue not commonly found in the context of an NLS (35).
Unlike NLSs that bind exclusively (or preferentially) to the importin α minor NLS-site (47,72,73,77), HSV-1 pUL15 NLS inserts a Lys at P2' (K185), as opposed to an Arg, possibly explaining the weak affinity for this site. The general features of pUL15 NLS recognition by importin α resemble those observed for the classical SV40 NLS (44,45). The NLS binds in an extended conformation, with the main chain running antiparallel to the direction of the importin α Arm-core. The concave surface of importin α provides a continuous binding interface for the NLS, where conserved tryptophan and asparagine residues protruding at the surface of importin α make hydrophobic, cation-π (78) and polar contacts with NLS side chains and backbone, positioning the peptide in the helical groove of importin α (Fig. 2B,C). Notably, importin α makes eight close contacts with pUL15 NLS backbone atoms at the major NLS pocket, versus only three contacts at the minor NLS pocket (dashed lines in Fig. 2B,C), which also helps explain the reduced NLS avidity for this site.

Atomic structure of HCMV S-terminase NLS bound to importin α1.
The HCMV S-terminase (pUL56) NLS used for crystallization with importin α was longer than pUL15 NLS (16-mer vs 12-mer) and contains several basic residues clustered in three boxes (VSRRVRATRKRPRRAS) separated by a few residues (Fig. 1A). At 1.95 Å resolution, the crystal structure of the importin α:pUL56 NLS complex revealed that the peptide binds importin α like a monopartite NLS with different occupancy at major and minor binding pockets (Fig. 3A). Eight residues (821-TRKRPRRA-828) have clear electron density at the major site (Fig. 3B) while only 6 residues are seen clearly (821-TRKRPR-826) at the minor site (Fig. 3C). No density was observed for the N-terminal box 814-VSRRVR-819, which, therefore, is not part of this NLS. Structural alignment suggests pUL56 NLS occupies positions P1-P6 at the major binding pocket and P0'-P4' at the minor binding pocket (Table 2). pUL56 NLS inserts a canonical Lys at position P2 (K823) and an Arg (R824) at position P2'. Position P4 is occupied by a Pro rather than a basic amino acid as seen in most NLSs (the equivalent residue is an alanine in HSV-1 pUL15), followed by two Arg at P5 and P6. The latter is not commonly found in classical NLSs. Interestingly, position P0 is occupied by a Thr (Table 2), a potential candidate for phosphorylation (34), which has been shown to affect the binding affinity for importin α depending on sequence context and importin α isoform specificity (40). At the minor NLS-pocket (Fig. 3C), four basic side chains (RKRPR) contact sites P0'-P4' and importin α makes three close contacts with pUL56 NLS backbone atoms (dashed lines in Fig. 3C), resulting in a stronger interaction than seen for HSV-1 pUL15 (Fig. 2C). Accordingly, the refined B-factor for pUL56 NLS at the major site is ~56.2 Å², comparable to importin α (54.6 Å²), and ~72.9 Å² at the minor site (Fig. 3B), lower than pUL15 NLS (Table 1).
Overall, pUL56 NLS association with importin α is stabilized by 34 hydrogen bonds, 9 salt bridges and a total of 224 non-bonded contacts (Fig. 3B,C). The NLS peptide buries 773.4 Å² and 511.5 Å² of surface area at major and minor binding sites, respectively. All long-chain basic residues of HCMV pUL56 NLS make very similar contacts with the importin α backbone as reported for the SV40 large T antigen NLS (45), except R822 at position P0', which points in the opposite direction.

Binding affinities of terminase NLSs for importin α.
The intimate association of pUL15 and pUL56 NLSs with importin α1 observed crystallographically prompted us to measure their binding affinities in solution. Using nano Isothermal Titration Calorimetry (nano-ITC), we measured the heat released upon titration of increasing concentrations of NLS peptide into a cell containing purified ΔIBB-importin α. For pUL15 NLS (Fig. 4A), we observed a saturable endothermic reaction, which saturated within 11-12 injections, after the NLS concentration in the cuvette reached ~50 µM. Binding data were fit using a two independent binding site model yielding an equilibrium dissociation constant for the first site (Kd1) of 588.2 ± 10 nM (Fig. 4A) and a Kd2 of ~100 µM for the second site. We interpret the two binding events as reflecting the association of pUL15 NLS with the major and minor NLS-binding pocket, respectively. As suggested by the high B-factor at the minor pocket (Table 1), the contribution of this second NLS peptide to the overall heat released during titration is likely very small at 25 °C.
For pUL56 NLS, ITC data could be fit unambiguously to a two independent binding events model that yielded an equilibrium dissociation constant (Kd1) of 9.4 ± 1.4 µM for the first site (the major NLS-binding pocket) and Kd2 = 66.9 ± 8.2 µM for the second binding site (the minor NLS-binding pocket) (Fig. 4B). This model is consistent with the better occupied minor NLS binding site seen in the crystal structure, which suggests the enthalpy released during the experiment is a summation of two binding events. A similar enthalpy release was measured in a control experiment where the classical SV40 NLS was injected against ΔIBB-importin α under identical experimental conditions (Fig. 4C), which gave equilibrium dissociation constants Kd1 = 1.5 ± 0.3 µM and Kd2 = 9.7 ± 1.2 µM. It should be pointed out that the equilibrium binding constants reported here are significantly lower than those measured using fluorescence-depolarization (79) or surface-immobilized NLSs (80), but nonetheless consistent with published ITC studies that employed short NLS-peptides (47,75,76,81). This is explained by the high solubility and charge of NLS peptides and their tendency to remain in solution as solvated ions. In light of this, pUL15 NLS appears to be a stronger binder than pUL56 NLS, which is similar to the classical SV40 NLS.
Using the two thermodynamic equations ΔG = ΔH - TΔS and ΔG = -RT ln(1/Kd), we compared the thermodynamic parameters associated with NLS-binding to importin α, to gain further insight into the mechanisms of binding. Interestingly, HSV-1 NLS association with importin α1 (Fig. 4A) involves negative values of ΔH and -TΔS at the experimental temperature, indicating a balanced binding affinity based on both favorable hydrogen-bond and van der Waals interactions and favorable hydrophobic interactions. The entropic contribution possibly reflects the involvement of six Trp in importin α1 and two Pro in pUL15 NLS (Fig. 2B,C). In contrast, the binding affinity of HCMV pUL56 NLS, as well as the classical SV40 NLS, for importin α (Fig. 4B,C) is based exclusively on hydrogen-bond and van der Waals interactions (ΔH < 0), and is accompanied by unfavorable entropy changes (-TΔS > 0), possibly due to conformational effects.
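As a worked illustration of the two equations above, the sketch below converts the Kd1 values quoted in the text into binding free energies at 25 °C and shows how the entropic term follows once ΔH is known. It assumes the gas constant in kcal mol⁻¹ K⁻¹ and a 1 M standard state; the function names are hypothetical, and no ΔH values are supplied here because none are given numerically in the text.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1 (assumed unit choice)


def delta_g_kcal(kd_molar, temp_k=298.15):
    """Binding free energy from a dissociation constant: dG = -RT ln(1/Kd)."""
    return -R_KCAL * temp_k * math.log(1.0 / kd_molar)


def minus_t_delta_s(delta_g, delta_h):
    """Entropic term -T*dS obtained by rearranging dG = dH - T*dS."""
    return delta_g - delta_h


if __name__ == "__main__":
    # Kd1 values (major NLS-binding pocket) as reported in the text.
    kd1 = {"pUL15 NLS": 588.2e-9, "pUL56 NLS": 9.4e-6, "SV40 NLS": 1.5e-6}
    for name, kd in kd1.items():
        print(f"{name}: dG = {delta_g_kcal(kd):.2f} kcal/mol")
```

A negative value returned by `minus_t_delta_s` corresponds to a favorable entropic contribution, a positive value to an unfavorable one.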

Comparative analysis of herpesvirus NLSs in terminase subunits.
Next, we explored the evolutionary conservation of terminase NLSs in members of the herpesvirus superfamily. Sixty-eight herpesvirus genomes have been sequenced and annotated, and are available for analysis through the NIAID Virus Pathogen Database and Analysis Resource (56). We identified the genes encoding L-, S- and T-terminase in each available genome and generated a global alignment of all sequences grouped in α-, β- and γ-subfamilies, and further divided in genera according to established taxonomic criteria (82). Strikingly, all three terminase subunits are remarkably conserved among subfamilies with sequence identity exceeding 60% for L-terminase and close to 50% for T-terminase, which is the most divergent (Table 3).
To investigate the conservation of NLS sequences in terminase subunits, we focused on the regions of L- and S-terminase that contain a functional NLS and generated a local alignment around pUL15 and pUL56 NLSs, which was displayed with WebLogo (59) for each genus of the three subfamilies. For L-terminase (Fig. 5A), we found HSV-1 pUL15 NLS is conserved not only in viruses of the Simplexvirus genus, which have very high sequence identity (76-95%), but also in genera with relatively lower percentage of sequence identity/similarity such as Varicellovirus (Table 3), supporting the idea of a strong evolutionary conservation among α-herpesviruses. Interestingly, critical Lys and Arg at positions P2/P3 are conserved in all four major genera of α-herpesviruses. The only exception is perhaps the Scutavirus genus that includes only two species, testudinid herpesvirus 3 and chelonid herpesvirus 5. Testudinid herpesvirus 3 has an inverted RK motif, whereas a putative L-terminase was not found in chelonid herpesvirus 5. Likewise, a systematic search for a pUL15-like NLS in the L-terminase subunit of β- and γ-herpesviruses failed to identify a sequence consistent with a classical import signal (Fig. 5A).
For S-terminase (Fig. 5B), global alignment of β-herpesvirus sequences revealed that a pUL56-like NLS is well-conserved in the Cytomegalovirus and Muromegalovirus genera, but surprisingly absent in Roseolovirus and Proboscivirus, despite high sequence identity (>48%) and similarity greater than 60% (Table 3). The latter viruses have truncated S-terminases that lack the C-terminal moiety harboring the NLS. Finally, a pUL56-like NLS was not found in members of the α- and γ-subfamilies (Fig. 5B), consistent with the cytoplasmic localization of S-terminase from HSV-1 (61,83) and Kaposi's sarcoma-associated herpesvirus (KSHV) (84), a prototypical γ-herpesvirus. Thus, the NLS sequence of herpesvirus L- and S-terminase subunits has diverged more rapidly than the rest of the amino acid sequence of these conserved gene products.

Prediction of classical NLSs in terminase subunits.
To determine if loss of pUL15-like and pUL56-like NLSs could be compensated by new NLSs somewhere else in the tripartite terminase complex, we probed the amino acid sequence of L- and S-terminases from herpesviruses using NLS mapper (60). We predicted two putative NLSs spanning residues 278-284 of L-terminase and 429-446 of S-terminase (using KSHV terminases as reference). WebLogo representation revealed conservation of 2 basic residues in L-terminase, possibly indicative of a partial (or weak) monopartite NLS (Fig. 6A). In contrast, the predicted NLS of γ-herpesvirus S-terminase contains two small basic patches separated by 10-12 residues, similar to a bipartite NLS (45,46) (Fig. 6B). Structural modeling suggests these putative NLSs fall in the ATPase domain of L-terminase (named ORF29), close to the pUL15-like NLS, and on the outer surface of the pUL56 paralog (named ORF7).
We also investigated the existence and conservation of a classical NLS in the third terminase (T-terminase) subunit, which forms a cytoplasmic complex with S- and L-terminase (26). Using HSV-1 pUL33 as the prototypical T-terminase sequence, NLS mapper (60) identified a classical monopartite NLS at the C-terminus of the protein spanning residues 107-117. Global alignment revealed this NLS is well conserved in α-herpesviruses (Fig. 5C) with 3-4 conserved Arg/Lys clustered together like in classical NLSs. Although NLS mapper (60) failed to identify this same NLS in T-terminase from β- and γ-herpesviruses, a global sequence alignment against pUL33 revealed partial conservation of pUL33 NLS in other herpesviruses (Fig. 6C). This putative NLS is only 'partial' in β- and γ-herpesviruses, where it contains 2 basic residues scattered over 3-5 amino acids (Fig. 6C), similar to the partial NLSs found in γ-herpesvirus L- and S-terminase subunits (Fig. 6A-B).
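To give a feel for what a sequence-based search for classical monopartite NLSs involves, the sketch below scans a peptide for the commonly cited K-(K/R)-X-(K/R) consensus. This is a deliberately crude heuristic and not the scoring scheme used by NLS mapper (60); the function name and the `first_residue` numbering argument are hypothetical. The two input peptides are the ones quoted in the Results.

```python
import re

# Commonly cited monopartite consensus K-(K/R)-X-(K/R); the lookahead allows
# overlapping matches to be reported.
MONOPARTITE = re.compile(r"(?=(K[KR].[KR]))")


def find_monopartite_motifs(sequence, first_residue=1):
    """Return (residue number of the leading Lys, matched tetrapeptide) pairs."""
    return [(match.start() + first_residue, match.group(1))
            for match in MONOPARTITE.finditer(sequence.upper())]


if __name__ == "__main__":
    # Peptide segments quoted in the text for HSV-1 pUL15 and HCMV pUL56.
    print(find_monopartite_motifs("GPPKKRAKV"))
    print(find_monopartite_motifs("VSRRVRATRKRPRRAS", first_residue=814))
```

With the pUL56 peptide numbered from residue 814, the single hit starts at K823, the residue assigned to position P2 in the crystal structure. A consensus scan of this kind cannot detect 'partial' or non-classical signals.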

DISCUSSION
In this paper, we investigated the structure and conservation of import signals in herpesvirus terminase subunits. Structural analysis of pUL15 and pUL56 NLSs in complex with importin α revealed both terminases use a classical monopartite NLS, bound predominantly to the major NLS-binding site of importin α. The higher B-factor of the NLS peptide bound at the minor NLS-binding pocket, also observed for the SV40 large T antigen NLS (45,46), argues against the physiological significance of this binding site, which is not likely occupied when the NLS is in the context of a full length cargo. In analogy to classical NLS-cargos, terminase NLSs present a conserved Lys at position P2 and their nuclear import is disrupted by a point mutation at P2 (85,86). This was validated by introducing Ala-substitutions at P1 and P2 of pUL56 that abolished nuclear import of a reporter protein consisting of GFP-β-galactosidase fused to pUL56 NLS (62).

Conservation of NLSs in terminase subunits.
All herpesviruses contain a set of 41 core ortholog proteins (87), which include L-, S- and T-terminase. Together these three factors form a complex that was purified from HSV-1 infected cells (26) and identified using immunoblot analysis in cells infected with HCMV (88). Growing evidence supports a model where the terminase complex assembles in the cytoplasm and its translocation to the nucleus is dependent on L-terminase NLS (61,83). Our bioinformatic analysis of pUL15 NLS suggests this scenario is certainly possible for α-herpesviruses (Fig. 7A), where pUL15 harbors a strong NLS, but is unlikely for β- and γ-herpesviruses that lack an NLS in L-terminase. We also predicted a putative classical NLS in T-terminase, which has not been functionally validated. We speculate a second NLS in the tripartite terminase complex could synergize with pUL15 NLS to provide a kinetic advantage to the nuclear import reaction. Unlike α-herpesviruses, two genera of the β-subfamily (Cytomegalovirus and Muromegalovirus) have a strong NLS in the S-terminase subunit that is expected to substitute for the lack of a pUL15-like NLS in L-terminase (Fig. 7B). The conundrum is how the tripartite terminase complex of Proboscivirus and Roseolovirus (both β-herpesviruses) and of γ-herpesviruses, which lack both pUL15-like and pUL56-like NLSs, can be imported into the nucleus without a 'proper' import signal. None of the putative NLSs predicted in L-, S- and T-terminase (Fig. 6) appear strong enough to promote nuclear import of the terminase complex, which exceeds 200 kDa assuming a 1:1:1 stoichiometry (26). We envision three ways by which these herpesviruses could import the terminase complex (Fig. 7C). First, γ-herpesviruses have a 'partial' pUL56-like NLS in S-terminase (Fig. 6B) and a weak NLS-like sequence in T-terminase (Fig. 6C). Two partial NLSs could generate a functional NLS in trans, which becomes exposed and signals only in the quaternary structure of the terminase assembly. trans-NLSs generated by complementation of partially basic sequences have been explored and validated for the dimeric transcription factor STAT1, which is imported only upon Tyr-phosphorylation (81). Second, the terminase complex of γ-herpesviruses may interact in the cytoplasm of infected cells with a yet unknown factor harboring a strong NLS, sufficient to translocate the entire complex into the nucleus. Finally, we cannot rule out that these viruses contain other non-classical NLSs in one or more subunits of the terminase complex, impossible to predict with current algorithms, that are necessary and sufficient for nuclear import of the terminase complex.

Evolution of import signals in Herpesviridae.
Genome-packaging is essential to the herpesvirus replication cycle and all herpesviruses replicate in the cell nucleus. Together, these two functional constraints on herpesvirus biology have likely driven the evolution of these viruses to maintain the tripartite terminase complex both active and localized in the nucleus. This idea is supported by the strong conservation of terminase subunits in diverse members of the herpesvirus superfamily, which exceeds 50% sequence identity (Table 3). Likewise, critical motifs associated with essential enzymatic function such as the Walker A (258-VPRRHGKT-265) and Walker B (352-LLFVDE-357) motifs in L-terminase (essential for ATP binding/hydrolysis) and a putative zinc-finger motif at the N-terminus of S-terminase likely implicated in DNA-binding (89) (Fig. 1A) are 100% conserved in all herpesviruses, underscoring the strong evolutionary pressure to maintain these gene products active in DNA-packaging. Unexpectedly, though the terminase complex functions strictly in the cell nucleus, in this paper we demonstrate that terminase subunit NLSs are not conserved among herpesviruses, but vary significantly even in relatively similar genera of the same subfamily (Fig. 5B). There appears to have been significant loss of NLS-function during evolution, possibly compensated by gain of NLSs in other subunits, a process that we will refer to as 'NLS-swapping'. How does this observation correlate with what is known about conservation of genes in the herpesvirus superfamily? In general terms, there are three ways by which herpesvirus-common proteins evolve and diversify (90). Certain factors preserve common function as well as sequence homology. This is perhaps the case of the Walker A/B motifs in L-terminase and the putative zinc-finger in S-terminase (Fig. 1A), which are absolutely invariant in all sixty-eight herpesvirus genomes analyzed here. In other cases, the function has been retained yet with only limited sequence homology. This is especially true for structural components that have fewer constraints on their structure than enzymes and tend to diverge in sequence while conserving a similar 3D-organization (91). Finally, there are examples of genes that retain high sequence homology while acquiring distinct function. One instance is the large subunit of ribonucleotide reductase (e.g. HSV-1 pUL39), which forms an active enzyme with a small subunit in α- and γ-herpesviruses but, despite being conserved, lacks enzymatic activity in β-herpesviruses. Our analysis of NLS conservation in terminase subunits suggests a fourth way to maintain a critical function in herpesviruses, by swapping NLSs among different subunits of the terminase complex. The divergent evolution of NLSs in terminase subunits likely followed the evolution of the host import machinery. With two structurally distinct and functionally independent NLS-binding sites, importin α is perfectly suited to accommodate partial NLSs exposed at the surface of a terminase oligomer. In this scenario, the cumulative avidity of a terminase tripartite complex NLS for importin α would be the product of the Kds of each NLS for importin α (92), suggesting that even weak NLSs, not functional on their own, could become active if simultaneously bound to one equivalent of importin α. The same would not hold true if importin α could associate with only one NLS at a time. Thus, the divergent evolution of herpesvirus terminase NLSs may reflect a highly specialized mode of virus:host adaptation aimed at regulating the affinity of the terminase complex for importin α and the kinetics of terminase translocation into the cell nucleus. We propose that by adjusting the number, strength and synergy of NLSs in different subunits of the terminase complex, herpesviruses regulate the kinetics with which the terminase complex is translocated into the cell nucleus and becomes available for genome-packaging.

FOOTNOTES
This work was supported by NIH grant R01GM100888 to G.C. Research in this publication includes work carried out at the Sidney Kimmel Cancer Center X-ray Crystallography and Molecular Interaction Facility, which is supported in part by NCI Grant P30 CA56036.
The atomic coordinates and structure factors for ΔIBB-importin α1 bound to HSV-1 pUL15 NLS and HCMV pUL56 NLS were deposited in the Protein Data Bank with accession codes 5HUW and 5HUY.
Figure legend (fragment; sequence logos generated with WebLogo (59)): For each genus, conserved amino acids in the NLS are represented by a stack of letters. The height of the stack (measured in bits) reflects the degree of sequence conservation at the corresponding position and the height of each letter represents the relative frequency of the amino acids at the corresponding location.
Table footnotes: Values in parentheses are for highest-resolution shells (2.02-1.95 Å). * Rfree was calculated using ~2,000 randomly selected reflections.