Molecular Architecture of Full-length TRF1 Favors Its Interaction with DNA*

Telomeres are specific DNA-protein structures found at both ends of eukaryotic chromosomes that protect the genome from degradation and from being recognized as double-stranded breaks. In vertebrates, telomeres are composed of tandem repeats of the TTAGGG sequence that are bound by a six-subunit complex called shelterin. Molecular mechanisms of telomere function remain unknown in large part due to the lack of structural data on the shelterins, the shelterin complex, and its interaction with the telomeric DNA repeats. TRF1 is one of the best studied shelterin components; however, the molecular architecture of the full-length protein remains unknown. We have used single-particle electron microscopy to elucidate the structure of TRF1 and its interaction with telomeric DNA sequence. Our results demonstrate that full-length TRF1 presents a molecular architecture that assists its interaction with telomeric DNA and at the same time makes the TRFH domains accessible to other TRF1 binding partners. Furthermore, our studies suggest hypothetical models of how other proteins, such as TIN2 and tankyrase, contribute to the regulation of TRF1 function.

Telomeres are the nucleoprotein structures that protect the ends of linear eukaryotic chromosomes from DNA damage-sensing mechanisms and the DNA repair machinery, thus preventing chromosomal fusions and other rearrangements that could lead to genome instability. Telomeres are formed by tandem repeats of the (TTAGGG)n sequence synthesized by the enzyme telomerase. These repeats are bound by a six-subunit complex called shelterin, which is essential to form the functional telomere. To date, however, the three-dimensional structure of the telomere is unknown, in part due to a lack of structural information on the shelterins, the shelterin complex, and its interaction with the telomeric DNA repeats. All shelterin components, except for Rap1, show exclusive binding to TTAGGG repeats and are essential for telomere function and for cell viability, as their deletion in mouse models causes early embryonic lethality (1-5). Rap1, instead, can bind throughout the chromosome arms, where it regulates transcription (4,6).
Only two components of the shelterin complex, TRF1 and TRF2 (telomeric repeat factors 1 and 2), interact directly with double-stranded telomeric DNA. TRF1 was the first to be discovered and is one of the best studied. It was identified as a major protein component of human telomeres (7) that behaves as a negative regulator of telomere length (8,9). In addition to its known roles in telomere protection and telomere length regulation, TRF1 has recently been proposed to have a role in pluripotency (10) and to be a potential anti-cancer target (11). In particular, both genetic and chemical inhibition of TRF1 levels at telomeres has an anti-tumoral effect (11). Two other telomeric proteins, TIN2 and tankyrase, bind to TRF1 and contribute to telomere length regulation. TIN2 serves as a link between TRF1 and TRF2 and promotes TRF1 interaction with DNA (12). Tankyrase, on the other hand, promotes the release of TRF1 from the telomeres upon TRF1 PARylation and allows telomere elongation by telomerase (8,13-15).
The TRF1 protein contains a conserved N-terminal homodimerization domain (TRFH) and a C-terminal DNA binding domain (Dbd) connected by a long loop region (Fig. 1A). The N terminus is very acidic and comprises the binding site for tankyrase 1, whereas the TRFH domain is involved in homodimer assembly and in the recruitment of several proteins, including TIN2. The Dbd has homology to the single Dbd of Myb oncoproteins, and TRF1 binds to double-stranded telomeric TTAGGG repeats as a preformed dimer. The crystal structures of the TRFH domain and the Dbd have allowed a detailed description of the individual TRF1 domains (16,17). However, we lack structural information on how the two domains are oriented with respect to each other and how they are organized to form an active TRF1 dimer.
We used single-particle electron microscopy (EM) to obtain the first low resolution structures of the full-length TRF1 dimer, both alone and in complex with telomeric DNA. The protein presents a lock-washer-like configuration in which the dimerization domains form a scaffold of the protein and the Dbds appear to be in close proximity, facing each other and ready to engage telomeric DNA. DNA binding does not introduce large conformational changes in the TRF1 dimer, as the DNA interacts with the Dbds while the internal part of the TRFH domain remains accessible to other TRF1 binding partners. This work opens new avenues for more detailed mechanistic structure-function studies on the protection of chromosomal ends and thus for therapeutic strategies targeting telomeres in both cancer and aging.

TRF1 Was Produced as a Dimer Suitable for Structural Analysis-TRF1 protein was expressed in the baculovirus insect cell system and purified to homogeneity. Expression of the recombinant TRF1 alone resulted in mainly insoluble protein. To increase the solubility of the final protein, the chaperone Hsp70 and its cofactors Hsp40 and Hsdj were co-expressed with His-tagged TRF1 (16-19), which improved the solubility of the recombinant protein several-fold. The final product was >95% pure as shown by SDS-PAGE (Fig. 1B, inset). The molecular weight of the purified protein was assessed by size-exclusion chromatography (SEC), indicating that the TRF1 protein was purified as a dimer (Fig. 1B).
TRF1 Domains Are Oriented within a Dimer Ready to Engage dsDNA-To examine the molecular architecture of the TRF1 dimer, we carried out single-particle EM studies. The size (<100 kDa) and shape of the particles are below the limit that can be regularly visualized and resolved at high resolution by cryo-EM. Therefore, we used negative staining to increase the signal-to-noise ratio of the images and to elucidate the three-dimensional structure of TRF1. We first analyzed the structure of the apoTRF1 protein. To this end, the freshly purified TRF1 sample was applied directly onto glow-discharged carbon-coated copper grids and negatively stained for EM analysis. A detailed examination of the EM field revealed a relatively homogeneous distribution of particles (Fig. 1C). The two-dimensional averages showed particles of similar size but different shape that clearly suggested the presence of a 2-fold symmetry (Fig. 1D). These EM images, which represent different views of the protein, indicated a lack of preferential orientation of the sample on the EM grid, allowing us to pursue a three-dimensional reconstruction of the TRF1 protein. An initial three-dimensional model was generated using the e2initialmodel.py program implemented in EMAN2 and refined applying 2-fold symmetry over several rounds of alignment and projection matching until it stabilized. To support the validity of the EM reconstruction, we compared two-dimensional class averages, generated without any reference bias, with re-projections of the three-dimensional structure of TRF1; the two sets closely matched each other (Fig. 2A).
The EM structure of TRF1 at 23 Å resolution revealed a lock-washer-like configuration with a central globular part and two bent arms (Fig. 2B). To assign the domains within the three-dimensional structure, available crystal structures were fitted into the density map (Fig. 2C). The structural details of TRF1 were sufficient to unambiguously fit the dimerization domain (1H6O.pdb) that forms a scaffold of the protein (Fig. 2, C and E). The Dbds and the loops that link the dimerization domain to the Dbds initially could not be precisely located within the free protein mass. Indeed, the Dbd could be placed either in closer contact with the dimerization domain or at the tip of the lock-washer-shaped molecule (Fig. 2C). Nevertheless, the crystal structure of two Dbds bound to telomeric dsDNA (1W0T.pdb) fit accurately into the TRF1 apo structure (Fig. 2D), allowing localization of the Dbds (Fig. 2E). Thus, in our TRF1 apo structure the Dbds, although connected to the dimerization domain by long and flexible loops, were not randomly oriented but were located facing each other with a deflection between them that would allow the two Dbds to engage a DNA molecule at opposite sites (Fig. 2E).
TRF1 Dimers Bind to Seven TTAGGG Repeats as One Predominant Complex-To examine the interaction of TRF1 with telomeric DNA, we initially performed electrophoretic mobility shift assays (EMSA) with a dsDNA probe containing an array of seven TTAGGG repeats. Gel-shift experiments showed that TRF1 forms two complexes, one of them clearly predominant (Fig. 3A, lane 2, lower TRF1 band). To confirm the specificity of these complexes, TRF1 protein was preincubated with a specific anti-TRF1 antibody, which super-shifted the TRF1 complex (Fig. 3A, lane 3). This effect was not observed with an unspecific antibody (Fig. 3A, lane 4). Moreover, the addition of excess amounts of unlabeled telomeric probe abolished the formation of the labeled TRF1-DNA complex (Fig. 3A, lane 5).
Binding of Telomeric DNA Does Not Introduce Big Conformational Changes within TRF1 Dimer-To explore putative conformational changes of TRF1 upon its binding to double-stranded telomeric DNA, we prepared and examined the low resolution three-dimensional structure of the TRF1-DNA complex and compared it with our apoTRF1 reconstruction. To this end, we generated a biotin-coupled duplex DNA containing seven TTAGGG repeats and an EcoRI restriction site, bound it to streptavidin-covered magnetic beads, and incubated it with purified TRF1 dimer (see "Experimental Procedures"). TRF1-DNA complexes were released from the beads by EcoRI DNA cleavage. The freshly eluted fractions containing the protein-DNA complex were applied directly onto carbon-coated glow-discharged EM grids and negatively stained for further structural analysis.
Reference-free two-dimensional class averages of the DNA-bound complex showed general features similar to those of apoTRF1 (Fig. 3B). Consequently, and to exclude any bias during image processing, we used two initial band-passed models: our final apoTRF1 structure and the same starting model used for apoTRF1. Both converged into the same three-dimensional structure (Fig. 3C). The TRF1-DNA three-dimensional reconstruction clearly indicated the presence of extra density between the two Dbds. This stain-excluding region between the two Dbds can be interpreted as DNA occupying the space between these domains. As was the case with the apoTRF1 volume, the crystal structure of TRF1 Dbds bound to telomeric DNA (1W0T.pdb) fitted well into the bottom part of the TRF1-DNA volume and further supported the location of the bound telomeric DNA between the tips of the dimeric TRF1 molecule (Fig. 3D). Comparison with the apoTRF1 EM volume therefore shows that the presence of the DNA did not induce large conformational changes (Fig. 3E). Importantly, the orientation between the two Dbds remained almost the same as in apoTRF1, indicating that the location of the domains in the apo structure was well suited for binding simultaneously to two adjacent and opposite TTAGGG binding sites.

Discussion
Telomeres are located at the chromosome ends and are essential for chromosomal stability. This protective function is exerted by binding of the so-called shelterin proteins to telomeric DNA repeats. All of these shelterins except for Rap1 are essential for telomere protection (5,20), and mutations in some of the shelterin proteins have been found both in premature aging syndromes associated with extremely short telomeres, the so-called telomere syndromes (21,22), and in various familial and sporadic tumors (23-25). Interestingly, TRF1 has also been proposed to have an important role in pluripotent and adult stem cells and to be an anti-cancer target (10,11). Further advances in understanding shelterins, and in exploiting their therapeutic potential as targets in cancer and aging, however, require structural knowledge of the shelterins, the shelterin complex, and its binding to DNA.
In mammals, the shelterin components TRF1 and TRF2 directly bind to double-stranded telomeric repeats and build a platform for recruitment of the remaining members of the shelterin complex, the Rap1, TIN2, TPP1, and POT1 proteins. Despite their modest sequence identity, TRF1 and TRF2 have a similar architecture. However, the central part of the dimer interface presents a crucial difference between TRF1 and TRF2 that would prevent heterodimerization (17). For both TRF1 and TRF2, the dimerization domain is linked to the DNA binding domain by a long protein loop, suggesting flexible arrangements between protein domains. Indeed, the low resolution small angle x-ray scattering (SAXS) envelope of the TRF2 dimer has recently been reported as a highly extended molecule with an elevated degree of flexibility in solution (26). In this conformation the two DNA binding domains are far apart and not facing each other. Yet how the TRF1 protein is organized and how it recognizes DNA in the context of the full-length protein is not understood.
OCTOBER 7, 2016 • VOLUME 291 • NUMBER 41

Structure of Full-length TRF1 and Binding to Telomeric dsDNA
To elucidate and analyze the molecular architecture of full-length TRF1, we used negative stain single-particle electron microscopy. Our three-dimensional EM model of TRF1 revealed a molecule with a lock-washer-like shape in which the dimerization domain forms a scaffold of the protein and the two Dbds are located relatively close to each other. The difference in protein conformation between the SAXS envelope of TRF2 and the three-dimensional EM structure of TRF1 could be due to the use of two different techniques to address the protein structures; SAXS analysis provides information about the average conformation of the protein in solution, whereas in negative staining EM the protein is attached to the carbon surface of the EM grid. Importantly, the loops that connect the dimerization domain and the Dbds are almost 100 amino acids longer in the TRF2 dimer than in TRF1. This large difference in the length of the unstructured part of the protein could indicate that TRF2 is intrinsically more flexible than the TRF1 dimer, which as a result could preferentially adopt a more bent conformation.
In addition, the shape of the dimerization domain of TRF1 provides surfaces for interaction with other proteins (27), such as TIN2. TIN2 is a central component of the shelterin complex that simultaneously binds TRF1 and TRF2, increasing their specificity for telomeres and stabilizing the formation of the protein complex with the telomeric repeat array (28,29). TRF1 interacts with TIN2 in such a way that each TRF1 dimerization domain contacts one TIN2 molecule (30) (Fig. 4A). TIN2 interacts with TRF1 through a stretch of 12 residues forming numerous intermolecular hydrogen-bonding and hydrophobic interactions. Based on the distribution of thermal factors, some of these interactions, particularly in the region of TIN2 residues 257-262, are quite tight but appear more dynamic near residues Arg-265 to Arg-267 (31). The side chain of Arg-266 is nested within a small depression on the TRF1 dimerization surface, whereas the side chains of residues Arg-265 and Arg-267 are flexible and oriented toward the solvent. We envision that the TIN2 peptide might fill the gap observed between the dimerization domains and the putatively bound DNA and that the interaction of TIN2 with the DNA through the side chains of Arg-265 and Arg-267 might consequently enhance the DNA binding properties of TRF1.
On the other hand, an additional telomeric protein, tankyrase, is essential for TRF1 release from telomeres and facilitates telomere elongation by telomerase. The acidic N-terminal domain of TRF1 is necessary and sufficient for interaction with the tankyrase protein through its ANK repeats (32). Hence, tankyrase does not interact with TRF2, as the acidic N-terminal domain is absent from TRF2. The available crystal structure of the ankyrin repeat clusters (ARC2 and ARC3) of tankyrase 1 exhibits a contoured shape that matches to a certain extent the horseshoe shape of the TRF1 dimerization domain (33). The complementarity between the protein surfaces suggests a possible interaction mode between tankyrase and TRF1. Indeed, tankyrase is reported to recognize the N terminus of TRF1 (residues 11-22), located in the outer part of the dimerization domains of the TRF1 dimer (Fig. 4B). Thus, tankyrase may act as molecular tongs, binding to the TRF1-DNA complex and, in addition to PARylating TRF1, transmitting conformational changes that separate the TRF1 dimer arms, favoring DNA release and access of telomerase to the telomeric ends (Fig. 4C).
In this work we provide a structural basis for understanding how full-length TRF1 interacts with DNA. Our work also suggests potential new insights into the role of TIN2 in stabilizing TRF1 on DNA and proposes a possible mode of TRF1 regulation by tankyrase. Further structural studies will be required to determine the full mechanism of protection of chromosome ends.

Expression and Purification of the Recombinant TRF1 Protein-The mouse full-length TRF1 gene (GenBank™ accession number NM_009352) was amplified by PCR with restriction site-tailed primers incorporating an N-terminal His6 tag. The resulting DNA was cloned into the BamHI/EcoRI sites of the baculovirus transfer vector pBacPak8 (BD Biosciences; Clontech). The derivative plasmid, pBacPak8-mTRF1, was verified by sequencing. Recombinant baculoviruses were obtained by cotransfection of insect Sf9 cells with Bsu36I-linearized BacPak6 viral DNA (BD Biosciences; Clontech) and the pBacPak8-mTRF1 vector. Positive clones were isolated by plaque assay, and a single recombinant virus, AcTRF1, was purified by consecutive plaque picking and used to produce a virus stock with a titer of 10^8 pfu/ml. For protein expression, Sf9 cells in 500-ml spinner flasks were doubly infected with AcTRF1 virus at a multiplicity of infection of 1.0 and with a dual recombinant baculovirus expressing human chaperone Hsp70 and Hsp40 cofactor (18,19), kindly provided by Dr. Tsurumi (Aichi Cancer Center Research Institute, Nagoya, Japan), at a multiplicity of infection of 5. After 72 h at 27°C, infected cells were harvested and resuspended in lysis buffer (20 mM phosphate buffer, pH 7.4, 500 mM NaCl, and 1% Triton X-100) supplemented with a protease inhibitor mixture (Roche Applied Science). After a brief sonication, the lysate was centrifuged (12,000 × g, 20 min, 4°C), and His-mTRF1 in the soluble fraction was purified by affinity chromatography using a HisTrap column (GE Healthcare) on an ÄKTA prime (GE Healthcare). The HisTrap column was washed with buffer A (20 mM phosphate buffer, pH 7.4, 500 mM NaCl, 20 mM imidazole, and 2 mM tris(2-carboxyethyl)phosphine (TCEP)) and eluted with a stepwise gradient (0-100%) of buffer B (20 mM phosphate buffer, pH 7.4, 500 mM NaCl, 500 mM imidazole, and 2 mM TCEP).
TRF1-containing fractions were pooled, concentrated to <5 ml with a 10-kDa Vivaspin concentrator (Sartorius), and further purified by SEC with a Superdex S200 16/600 column (GE Healthcare) in SEC buffer (20 mM Tris, pH 7.5, 30 mM NaCl, and 1 mM TCEP) using an ÄKTA FPLC system (GE Healthcare). Protein standards (GE Healthcare) were loaded onto the column for molecular weight calibration using the same method. Finally, the eluted peak corresponding to TRF1 dimers was collected and concentrated to 0.8 mg/ml with a 10-kDa Vivaspin concentrator. Samples were analyzed by SDS-PAGE, and the identity of TRF1 was confirmed by in-gel tryptic digestion followed by LC-MS/MS analysis.
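The molecular weight calibration mentioned above is commonly done by fitting the logarithm of the molecular weight of the standards against their elution volume and reading the unknown off the fitted line. The following is a minimal sketch of that arithmetic, not the procedure actually used in this study; the function names and the calibration points are hypothetical and serve only to illustrate the calculation.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mw_kda(elution_ml, standards):
    """Estimate a molecular weight (kDa) from an elution volume (ml).

    standards: list of (elution volume in ml, molecular weight in kDa).
    The fit is log10(MW) versus elution volume.
    """
    volumes = [volume for volume, _ in standards]
    log_mw = [math.log10(mw) for _, mw in standards]
    slope, intercept = fit_line(volumes, log_mw)
    return 10 ** (slope * elution_ml + intercept)


# Hypothetical calibration points (ml, kDa); NOT values measured in this study.
HYPOTHETICAL_STANDARDS = [
    (55.0, 440.0),
    (65.0, 158.0),
    (75.0, 75.0),
    (82.0, 44.0),
    (92.0, 13.7),
]

if __name__ == "__main__":
    # Apparent molecular weight of a peak eluting at a hypothetical 70 ml.
    print(round(estimate_mw_kda(70.0, HYPOTHETICAL_STANDARDS), 1))
```

An apparent molecular weight close to twice the monomer mass would be consistent with a dimer, keeping in mind that SEC reports on hydrodynamic size, so elongated or flexible proteins can elute earlier than their mass alone would suggest.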
Assembling the TRF1-DNA Complex-TRF1 bound to a specific biotin-labeled DNA template was purified. To this end, a pair of complementary oligonucleotides was designed containing a biotin (Btn) molecule conjugated at the 5′ end of one oligo, an EcoRI restriction site located 25 base pairs downstream from the biotin, and 7 tandem TTAGGG repeats: 5′-Btn-ACGGTGTATCTACTGTTTGAATTCCCATTATCGAAGGCACGTGTATGTATAGAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGCCCCTC-3′ and 5′-GAGGGGCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTCTATACATACACGTGCCTTCGATAATGGGAATTCAAACAGTAGATACACCGTA. The oligonucleotides were hybridized by mixing them at a final concentration of 50 mM and adding 15 mM sodium chloride to the reaction. The mixture was incubated at 94°C for 4 min and then cooled down slowly to room temperature. The oligo hybridization was checked in a 2% agarose gel. Initially, the DNA template was bound to streptavidin-covered magnetic beads (Dynabeads M-280 Streptavidin, Thermo Fisher Scientific). For that, 20 µl of streptavidin-covered magnetic beads were washed with water and equilibrated with SEC buffer. Beads were then resuspended in 100 µl of SEC buffer and 3 µl of the hybridized DNA at a final concentration of 15 mM. The binding reaction was performed at room temperature for 1 h with constant agitation. Beads were washed with SEC buffer to eliminate the excess DNA, and 0.1 mg of purified TRF1 was incubated with the DNA-bound beads for 4 h at 4°C with gentle agitation. The unbound protein was removed by washing the beads three times with SEC buffer. Finally, beads were resuspended in 30 µl of 1× SuRE/Cut Buffer H (Roche Applied Science) together with 10 units of EcoRI (Roche Applied Science) and incubated overnight at 4°C. Beads were magnetically separated, and the supernatant containing the DNA-TRF1 complex was collected for EM analysis.
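The design of the DNA template described above can be sanity-checked computationally. The sketch below is an illustration only and was not part of the original workflow; it takes the two oligonucleotide sequences as printed (with the line-break hyphens removed), counts the TTAGGG repeats, confirms that a single EcoRI site (GAATTC) lies between the biotinylated 5′ end and the repeat array, and compares the strands for complementarity. All function and variable names are hypothetical.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences as printed in the text, 5' to 3', joined across line breaks.
TOP = (
    "ACGGTGTATCTACTGTTTGAATTCCCATTAT"
    "CGAAGGCACGTGTATGTATAGAGGGTTAGGGTTAG"
    "GGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGCC"
    "CCTC"
)
BOTTOM = (
    "GAGGGGCCCTAACCCTAACCCTA"
    "ACCCTAACCCTAACCCTAACCCTAACCCTCTATACAT"
    "ACACGTGCCTTCGATAATGGGAATTCAAACAGTAGA"
    "TACACCGTA"
)

ECORI_SITE = "GAATTC"
REPEAT = "TTAGGG"


def summarize(top, bottom):
    """Collect simple design checks for the duplex template."""
    paired = reverse_complement(top)
    return {
        "repeats_in_top": top.count(REPEAT),
        "ecori_sites_in_top": top.count(ECORI_SITE),
        # True when the EcoRI site precedes the first repeat on the top strand.
        "site_before_repeats": top.find(ECORI_SITE) < top.find(REPEAT),
        # True when the bottom strand begins with the exact reverse complement.
        "bottom_pairs_with_top": bottom.startswith(paired),
        # Any bottom-strand nucleotides left unpaired at its 3' end.
        "bottom_3prime_extension": bottom[len(paired):],
    }


if __name__ == "__main__":
    for key, value in summarize(TOP, BOTTOM).items():
        print(f"{key}: {value!r}")
```

Because EcoRI cuts between the biotin and the repeat array, cleavage releases the repeat-containing portion of the duplex, together with any bound TRF1, from the streptavidin beads.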
EM and Image Processing-For negative staining, 4 µl (~0.08 mg/ml) of each sample was applied onto freshly glow-discharged carbon-coated 400-mesh copper EM grids (Electron Microscopy Sciences) and incubated for 10 s at room temperature. The grids were sequentially laid on top of two distinct 50-µl drops of MilliQ water and stripped gently for 2 s, then laid on top of two distinct 50-µl drops of a freshly prepared 1% uranyl acetate solution for 1 min, stripped gently for 10 s, and air dried.
Data were collected on a Tecnai 12 transmission EM (FEI, Netherlands) with a lanthanum hexaboride cathode operated at 120 keV. Images were recorded under low-dose conditions with an electron dose of 11-18 e⁻/Å² at 61,320× nominal magnification on a 4k×4k TVIPS TemCam-F416 CMOS camera, resulting in a final pixel size at the specimen level of 2.5 Å.
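The pixel size at the specimen level follows from the physical pixel size of the detector divided by the magnification. The short sketch below reproduces this arithmetic; the 15.6-µm physical pixel size of the camera sensor is an assumption taken from the manufacturer's general specifications and is not stated in this paper, and the function name is hypothetical.

```python
def pixel_size_angstrom(detector_pixel_um, magnification):
    """Specimen-level pixel size in Å from the detector pixel size (µm)."""
    return detector_pixel_um * 1e4 / magnification  # 1 µm = 10^4 Å


DETECTOR_PIXEL_UM = 15.6   # ASSUMED physical pixel size of the sensor
MAGNIFICATION = 61320      # nominal magnification reported in the text
DETECTOR_WIDTH_PX = 4096   # 4k x 4k detector

pixel = pixel_size_angstrom(DETECTOR_PIXEL_UM, MAGNIFICATION)
nyquist = 2.0 * pixel                          # finest resolvable period, Å
field_of_view_nm = DETECTOR_WIDTH_PX * pixel / 10.0

if __name__ == "__main__":
    print(f"pixel size:    {pixel:.2f} Å")
    print(f"Nyquist limit: {nyquist:.2f} Å")
    print(f"field of view: {field_of_view_nm:.0f} nm")
```

Under this assumption the computed value agrees with the reported 2.5 Å, and the corresponding Nyquist limit of about 5 Å lies well below the resolution range relevant for negative-stain reconstructions.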
The contrast transfer function (CTF) was determined on micrographs with ctffind3 (35), and its effects were corrected with bctf from the Bsoft package (lsbr.niams.nih.gov). Particle selection was performed semi-automatically with e2boxer.py in EMAN2 (36). A total of 32,880 particles were boxed for apoTRF1 and 33,106 for the DNA-bound TRF1 complex.
Reference-free two-dimensional classification and averaging of the raw datasets were carried out with refine2d.py implemented in EMAN and with CL2D from the Xmipp package (37). A large number of class averages were produced, with around 20 particles in each class. Two-dimensional averages were visually inspected and used to generate a sub-dataset containing the best particles.
For the apoTRF1 structure, we used two-dimensional averages as input into the program e2initialmodel.py implemented in EMAN2 with C2 symmetry to produce eight different initial models. These models were carefully inspected for consistency between the original reference-free two-dimensional averages and the model re-projections. The best model was low-pass-filtered with a cutoff of 50 Å to minimize model bias and was used as the starting model. For TRF1-DNA, we used the same filtered initial model as for apoTRF1 as the starting model. In parallel, the band-pass-filtered apoTRF1 volume was also used as a starting model.
Three-dimensional volumes were calculated by an iterative projection-matching approach using libraries from EMAN1.9, EMAN2, and XMIPP. The resolution of the structures was estimated at 23 Å for the apoTRF1 structure and 25 Å for the TRF1-DNA complex using Fourier shell correlation (FSC) with a 0.5 cut-off criterion. The resulting volumes were filtered in Spider to the calculated resolution using a Butterworth low-pass filter. The docking of atomic coordinates was performed with UCSF Chimera (38).
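To illustrate how a resolution value is read from an FSC curve with the 0.5 criterion, the sketch below finds the spatial frequency at which the curve first drops below the cut-off, using linear interpolation between neighboring shells, and reports its reciprocal in Å. It is a minimal illustration under stated assumptions, not the implementation used by the packages named above; the function name and the example curve are hypothetical and do not represent data from this study.

```python
def resolution_at_cutoff(spatial_freqs, fsc_values, cutoff=0.5):
    """Resolution (Å) at which the FSC first falls below the cut-off.

    spatial_freqs: shell frequencies in 1/Å, in increasing order.
    fsc_values:    FSC value for each shell.
    Returns None if the curve never crosses the cut-off from above.
    """
    points = list(zip(spatial_freqs, fsc_values))
    for (f0, c0), (f1, c1) in zip(points, points[1:]):
        if c0 >= cutoff > c1:
            # Linear interpolation between the two shells around the crossing.
            f_cross = f0 + (c0 - cutoff) * (f1 - f0) / (c0 - c1)
            return 1.0 / f_cross
    return None


# Hypothetical FSC curve for illustration only (frequencies in 1/Å).
EXAMPLE_FREQS = [0.01, 0.02, 0.03, 0.04, 0.05]
EXAMPLE_FSC = [0.99, 0.90, 0.70, 0.30, 0.10]

if __name__ == "__main__":
    resolution = resolution_at_cutoff(EXAMPLE_FREQS, EXAMPLE_FSC)
    print(f"FSC = 0.5 resolution of the example curve: {resolution:.1f} Å")
```

The 0.5 criterion is a conservative choice; the same function can be evaluated with a different cut-off by changing the `cutoff` argument.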