Point mutations in the N-terminal domain of transactive response DNA-binding protein 43 kDa (TDP-43) compromise its stability, dimerization, and functions

Transactive response DNA-binding protein 43 (TDP-43) performs multiple tasks in mRNA processing, transport, and translational regulation, but it also forms aggregates implicated in amyotrophic lateral sclerosis. TDP-43's N-terminal domain (NTD) is important for these activities and dysfunctions; however, there is an open debate about whether or not it adopts a specifically folded, stable structure. Here, we studied NTD mutations designed to destabilize its structure utilizing NMR and fluorescence spectroscopies, analytical ultracentrifugation, splicing assays, and cell microscopy. The substitutions V31R and T32R abolished TDP-43 activity in splicing and aggregation processes, and even the rather mild L28A mutation severely destabilized the NTD, drastically reducing TDP-43's in vitro splicing activity and inducing aberrant localization and aggregation in cells. These findings strongly support the idea that a stably folded NTD is essential for correct TDP-43 function. The stably folded NTD also promotes dimerization, which is pertinent to the protein's activities and pathological aggregation, and we present an atomic-level structural model for the TDP-43 dimer based on NMR data. Leu-27 is evolutionarily well conserved even though it is exposed in the monomeric NTD. We found here that Leu-27 is buried in the dimer and that the L27A mutation promotes monomerization. In conclusion, our study sheds light on the structural and biological properties of the TDP-43 NTD, indicating that the NTD must be stably folded for TDP-43's physiological functions, and has implications for understanding the mechanisms promoting the pathological aggregation of this protein.

Transactive response DNA-binding protein 43 kDa (TDP-43) is an essential human protein that is vital to pre-mRNA (1) and microRNA processing (2). Key to these activities are TDP-43's two well-folded and stable RNA recognition motif (RRM) domains (spanning residues 106-177 and 192-259, respectively). TDP-43 contains a nuclear localization sequence (NLS, residues 80-102) and is mostly nuclear; however, its nuclear export sequence (residues 238-250) permits it to transport mRNAs to the cytoplasm and even to synapses as part of neuronal granules (3). TDP-43 also regulates translation by participating in stress granules (4).
Aberrant aggregated forms of TDP-43 are tightly linked to amyotrophic lateral sclerosis (ALS): inclusions composed of polyubiquitinated, hyperphosphorylated, and truncated TDP-43 (5,6) have been reported in 95% of ALS patient motor neurons (7). TDP-43 aggregates are also observed in 60% of frontotemporal lobar degeneration (FTLD: a form of dementia whose symptoms overlap with ALS) patient neurons, and the observation of aggregates composed of TDP-43 plus Aβ or polyglutamine suggests that TDP-43 may contribute to other neurodegenerative diseases such as Alzheimer's or Huntington's (8,9). Almost all the pathologically linked mutations are localized in the C-terminal region of TDP-43 (CTR, residues 270-414). In contrast to the RRM domains, the CTR is intrinsically disordered and consists of four segments. Two of these (residues 267-320 and 367-414) are rich in (G/S)-aromatic-(G/S) motifs reminiscent of the (G/S)Y(G/S) motifs in the RNA-binding protein Fused in Sarcoma/Translocated in Sarcoma, which drive the formation of a hydrogel or liquid phase. A third segment (residues 320-340) is hydrophobic and tends to adopt helical conformations (10-13) and also drives the formation of a distinct, non-aqueous liquid phase such as those present in stress granules and neuronal granules (13,14).
The fourth CTR segment of TDP-43, composed of residues 341-367, is "Q/N-rich" as it contains a high proportion of Gln and Asn residues. Although variants lacking the Q/N-rich segment do not aggregate, variants containing 12 copies of it recapitulate in cells most of the pathological characteristics seen in ALS (15). In 2015, we advanced an amyloid-like conformer for this segment based on a variety of biochemical, spectroscopic, and computational data (16). In early 2016, the Q/N-rich segment, or part of it, was found to be intact in TDP-43 aggregates from ex vivo brain tissue, whereas all other segments of the CTR are heavily phosphorylated, deamidated, and oxidized (17). Because the exceptionally strong hydrogen-bonding networks of Q/N-rich amyloids (18) could impede such chemical modifications, we have recently interpreted these data as supporting our amyloid-like model of the Q/N-rich segment (19).
In 2012, the N-terminal domain of TDP-43 (NTD) was predicted to adopt a stable fold and was found to drive the formation of large oligomers (20). The NTD is required for TDP-43's physiological functions and pathological aggregation (21-23), but it has been less studied due to its unique sequence, which thwarts structural prediction based on homology modeling, and its strong tendency to aggregate. In 2014, an important advance in its structural characterization was reported by Song and co-workers (24), who discovered that NTD constructs with a C-terminal His tag are soluble at low pH and very low ionic strength. This construct adopted a minor population of folded conformers; nevertheless, they were able to advance a medium resolution model for the tertiary fold of the NTD based on a small number of NMR NOE constraints and Rosetta ab initio structure prediction.
In 2016, we reported that the NTD is stably folded (Tm = 45-50°C) in the context of short (residues 1-77, TDP-43(1-77)) or long (residues 1-102, TDP-43(1-102)) constructs with N-terminal His tags, which permitted the elucidation of the shorter construct's 3D structure to high resolution using NMR methods (25), PDB code 2N4P. In this study, we observed that the longer construct TDP-43(1-102), consisting of the NTD plus the NLS, which is rich in cationic residues, is more soluble. The NTD contains two highly conserved consecutive Leu residues at positions 27 and 28. We also found that Leu-27's hydrophobic side chain is mostly solvent-exposed, which is unusual for a nonpolar residue, and that Leu-28 is mostly buried and makes important contacts linking different elements of secondary structure. Here, the roles of Leu-27 in promoting dimerization and of Leu-28 in stabilizing the tertiary structure are tested by substituting these residues with Ala. We also study another variant, V31R/T32R, designed to introduce two charges into the nonpolar core of the NTD. Because placing even one charge in the hydrophobic core strongly destabilizes proteins (26), this variant is expected to unfold the NTD.
Despite these advances, there is a continuing debate regarding whether the NTD is stably folded and whether this domain needs to be folded for TDP-43 to be active in cells (27-29). Previously, TDP-43 has been shown to exist as a monomer/dimer in vivo (30) and in vitro (31). The analysis of truncation mutants (30), size-exclusion chromatography, and a low resolution SAXS envelope (31) indicate that the NTD is the domain chiefly responsible for this dimerization. In cells, TDP-43's concentration is exquisitely controlled (32,33), which further suggests that the monomer/dimer equilibrium could have important functional consequences. However, the conformation of the dimer is currently unknown. Further studies in this area are therefore required, considering that TDP-43 affects the maturation and transport of thousands of mRNAs and that changes in TDP-43 concentration, due to aggregation or gene knockdown, strongly alter protein expression (34).
One objective of this study was to test whether stably folded NTD is required for TDP-43's activity by characterizing the biological functions of mutations, namely V31R/T32R and L28A, designed to disrupt the protein's tertiary structure. The latter variant was chosen for high resolution studies of its structure, stability, and dynamics. The second objective was to characterize the conformation of the NTD dimer on the basis of multidimensional heteronuclear NMR spectroscopy and corroborated by studies of a variant, L27A, that is designed to disrupt the dimer interface without perturbing the structure of the monomer.

Characterization of the solution conformation of TDP-43(1-102)
Taking advantage of the superior solubility of the TDP-43(1-102) construct, we recorded a series of NOESY NMR spectra (2D NOESY and 3D HSQC-NOESY). These spectra yielded hundreds of new NOEs (over 1800 compared with the 1058 peaks used to calculate the structure of TDP-43(1-77) (25)) and allowed us to determine the NTD's structure to higher resolution (Fig. 1A and supplemental Table S1). The NMR assignments and final refined structures have been deposited in the Biological Magnetic Resonance Data Bank (BMRB accession code 34081) and the RCSB Protein Data Bank (PDB code 5MRG), respectively. Overall, the conformation is very similar to that of the TDP-43(1-77) construct studied previously, except that some differences are observed for Asn-76 and Tyr-77 at the end of the last β-strand and for Cys-39 and Gly-40, which contact those residues. These minor differences could be due to end effects, particularly the influence of Pro-78 on the conformation of Asn-76 and Tyr-77. 1H-15N HSQC spectra recorded on 15N-labeled TDP-43(1-102), which had been transferred into 100% D2O buffer, revealed signals belonging to amide groups protected from H/D exchange. This experiment corroborated published results (25) and led to the identification of three new protected residues (Ser-20, Gln-34, and Tyr-73) (Fig. 1B). Ser-20 appears to donate an H-bond to Glu-3 in a minority of our family of 20 NMR structures, and Gln-34 and Tyr-73 are H-bonded in the α-helix and last β-strand, respectively. In our previous paper (25), H/D exchange was monitored by 1D 1H NMR spectra, and the protected peaks were identified on the basis of their 1HN chemical shifts and a 2D 1H-1H NOESY spectrum that was recorded part way through the exchange. In this study, H/D exchange has been followed by 2D 1H-15N HSQC spectra.
In the previous experiment, the closeness of the Ser-20 1HN signal to those of Thr-25, Ala-38, Val-72, Val-75, and Asn-76, the proximity of Gln-34's 1HN signal to aromatic side chain 1H resonances, and the closeness of Tyr-73's 1HN signal to those of Ile-5 and Val-7 made it difficult to identify these slowly exchanging peaks. They could be unambiguously identified in the 2D 1H-15N HSQC spectra thanks to the superior separation of resonances in the 15N dimension.

Mutation L28A strongly destabilizes TDP-43
As a further test of the structural integrity of the NTD, the effect of a mutation, L28A, designed to disrupt NTD tertiary contacts and hydrophobic core packing, was studied. The 1H-15N HSQC NMR spectrum of this variant, called L28A, shown in Fig. … assignment of those that had.

Fig. 1. A, … (supplemental Table S1) shown in atomic (upper figure) and ribbon + atomic (lower figure) representations. Residues −11 to 0 (the His tag, sequence MRGSHHHHHHGS) are colored in dark brown, and residues 78-102 (the NLS) are in gray. The zoomed view on the right highlights the minor differences between the new structure (rose gold, PDB code 5MRG) and the previous one (green, PDB code 2N4P); i.e. the C-terminal residues Asn-76 and Tyr-77 now establish contacts with Cys-39 and Gly-40 (red labels), which allows elongation of the β-strands (red circles), and a half-turn of 3/10 helix spanning residues Pro-46 to Ser-48 (purple labels and circle). This last element of structure is partially populated in the previous structure of the TDP-43(1-77) construct (PDB code 2N4P) and now appears in the lowest energy structure in the new calculation obtained with the longer construct, TDP-43(1-102). Gray lines indicate the close proximity in the tertiary fold of residues Pro-19, Leu-27, and Glu-58 (black labels), involved in the dimerization interface (see text). B, 2D 1H-15N HSQC spectrum of TDP-43(1-102) recorded at pH* 3.9, 25°C in 1.0 mM deuterated acetic acid after transfer into 100% D2O. pH* is the pH reading of the pH meter without correction for the deuterium isotope effect. The blue, green, and red spectra were recorded after 1 h and 1.5 and 10 days, respectively, of exchange. For clarity, the green and red spectra are displaced 0.07 and 0.14 ppm, respectively, to the right along the x axis. Ser-20, Gln-34, and Tyr-73, whose slowly exchanging HN groups are identified here, are circled.
The largest chemical shift perturbations are localized at the mutated site, the second loop, the α-helix, the turn connecting it to the third β-strand, and the β-hairpin formed by β-strands 4 and 5 (Fig. 2B). This pattern of perturbations is consistent with the contacts formed by Leu-28's nonpolar side chain in the folded NTD (supplemental Fig. S1).
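Backbone chemical shift perturbations of this kind are commonly condensed into one combined 1H/15N value per residue. The following is a minimal Python sketch of that calculation, assuming a weighting factor of 0.2 for the 15N shift difference; the text does not state which formula or weighting was used, and the function name and the example shift differences are hypothetical.

```python
import math

def combined_csp(delta_h_ppm, delta_n_ppm, n_weight=0.2):
    """Combined 1H/15N chemical shift perturbation (ppm).

    n_weight scales the 15N difference to the 1H scale; 0.2 is an
    assumed, commonly used value, not one taken from the paper.
    """
    return math.sqrt(delta_h_ppm ** 2 + (n_weight * delta_n_ppm) ** 2)

# Hypothetical shift differences (variant minus WT), not measured data
example = combined_csp(delta_h_ppm=0.03, delta_n_ppm=0.4)
print(f"combined perturbation: {example:.3f} ppm")
```

Residues whose combined value stands well above the average would then be the ones mapped onto the structure.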
Two 2D 1H-15N HSQC peaks, for the folded and denatured states, were observed for the side chain HNε of Trp-68 (Fig. 2A). Based on NMR peak integration, 19% of L28A TDP-43(1-102) is folded at pH 3.9, 25°C. Upon lowering the temperature, the folded population, as gauged by this group, increases to 52% at 15°C and 67% at 5°C (supplemental Fig. S1). These values are in line with the results of fluorescence-monitored thermal denaturation described below. The lack of signal broadening and the observation of discrete native and denatured peaks for the side chain HNε group of Trp-68 are evidence that the folding/unfolding equilibrium of the L28A variant can be approximated as a two-state process that is slow on the NMR time scale and that the population of folding intermediates is low.
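Under the two-state approximation, the folded populations reported above can be converted into an apparent free energy of unfolding, ΔG = −RT ln([U]/[F]). The Python sketch below illustrates this conversion; the function name is hypothetical, and the resulting values are an illustration of the arithmetic rather than results reported in the paper.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def unfolding_free_energy(fraction_folded, temp_c):
    """Apparent two-state unfolding free energy in kcal/mol.

    K_unfold = [U]/[F] = (1 - f) / f, and dG = -R * T * ln(K_unfold).
    A positive value means the folded state is favored.
    """
    k_unfold = (1.0 - fraction_folded) / fraction_folded
    temp_k = temp_c + 273.15
    return -R_KCAL * temp_k * math.log(k_unfold)

# Folded fractions quoted in the text for L28A TDP-43(1-102) at pH 3.9
for temp_c, folded in [(25, 0.19), (15, 0.52), (5, 0.67)]:
    dg = unfolding_free_energy(folded, temp_c)
    print(f"{temp_c:>2} C: folded = {folded:.2f}, dG_unfold = {dg:+.2f} kcal/mol")
```

The sign change between 25°C and 15°C reflects the apparent unfolding midpoint lying near 15°C, where the two populations are roughly equal.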
To gain more insight into the L28A variant's conformational stability, its thermal denaturation was followed by fluorescence spectroscopy. Even at low temperature, L28A is not completely folded, as its wavelength of maximum emission is 333 nm compared with 329 nm for WT TDP-43(1-102). The apparent unfolding midpoint temperature is about 15°C (Fig. 2D). Although this value is imprecise considering the inability to fit the pre-transition baseline, this apparent Tm is ~30°C lower than that of the WT TDP-43(1-102) construct, indicating that this variant is strongly destabilized. As will be shown below, this severe destabilization strongly affects TDP-43's subcellular localization, its aggregation in cells, and its ability to regulate mRNA splicing. By analytical ultracentrifugation, this variant sediments as a monomer; no dimer was observed, which is consistent with the lower concentration of folded protein (Fig. 2E).

NMR relaxation analysis shows increased dynamics in the L28A variant
The heteronuclear {1H}-15N NOE of WT TDP-43(1-77) and TDP-43(1-102) constructs and of the TDP-43(1-102) L27A and TDP-43(1-102) L28A variants provides insight into dynamics on the picosecond-nanosecond time scale (Fig. 3A). The low NOE ratios show that there are significant dynamics in both the N-terminal His tag and the C-terminal NLS; this is consistent with the lack of stable conformations in these segments. In contrast, the folded portion (residues 3-77) shows high NOE ratios that approach the values (0.85) expected for static behavior in the elements of secondary structure. Somewhat lower ratio values are observed in the loops, especially those connecting β-strands 3 and 4. Similar picosecond-nanosecond dynamic behavior was seen for the short (1-77) or long (1-102) WT constructs as well as the variant L27A, which was designed to disrupt dimer formation (see below). By contrast, the L28A variant's {1H}-15N NOE ratio values are considerably lower than those of the WT NTD or the L27A variant, indicating higher dynamics on the fast picosecond-nanosecond time scale.

WT TDP-43(1-102) dimerizes in vitro
As a first step, we studied the oligomerization behavior of WT TDP-43(1-102) by analytical ultracentrifugation, and we found evidence for dimer formation at higher concentrations (Fig. 2E): a peak with a sedimentation coefficient of 1.9 S, corresponding to dimer, was observed along with a larger peak at 0.8-1.1 S, which corresponds to monomer. Upon dilution, the peak corresponding to dimer disappears. Similar results were obtained on an independent sample prepared in 100% D2O (see below). Based on these data, an estimated KD of 1 mM for the dimer dissociation could be calculated under these experimental conditions: pH 4.0, 25°C. Additional thermal denaturation experiments on WT TDP-43(1-102) monitored by fluorescence spectroscopy showed a mean increase in Tm of 2.2°C at 500 µM relative to 77 µM (data not shown). This increased Tm is consistent with a dimer dissociation constant of 0.94 mM.
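To give a feel for what a dissociation constant near 1 mM implies, the Python sketch below computes the fraction of protein chains found in dimers for a simple monomer-dimer equilibrium. It assumes the convention KD = [M]²/[D] with total chain concentration Ct = [M] + 2[D]; the text does not state which convention was used, and the function name is hypothetical.

```python
import math

def dimer_chain_fraction(total_um, kd_um):
    """Fraction of protein chains in dimers for 2 M <-> D.

    Assumes KD = [M]^2 / [D] and Ct = [M] + 2[D] (same units for both).
    Solving 2[M]^2/KD + [M] - Ct = 0 for the free monomer gives
    [M] = KD * (sqrt(1 + 8*Ct/KD) - 1) / 4.
    """
    monomer = kd_um * (math.sqrt(1.0 + 8.0 * total_um / kd_um) - 1.0) / 4.0
    return 1.0 - monomer / total_um

# Concentrations quoted in the text, with KD taken as 1 mM (1000 uM)
for total in (77.0, 340.0, 500.0):
    frac = dimer_chain_fraction(total, kd_um=1000.0)
    print(f"{total:>5.0f} uM total: {100 * frac:.0f}% of chains in dimers")
```

Under these assumptions the dimer is a minority species at every concentration studied, in keeping with the larger monomer peak seen by sedimentation.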
Previous studies of TDP-43 in vitro and in cells at near neutral pH and physiological salt concentrations reported that Cys residues can form disulfide bonds under oxidizing conditions that promote TDP-43 oligomerization (35,36). Cys residues in the RRM domains were found to be mainly responsible, but the participation of the two Cys of the NTD was not ruled out. Here, MALDI-TOF mass spectra revealed a major peak, which is consistent with the sequence of 15N-labeled TDP-43(1-102) and a generally high incorporation (>90%) of 15N (supplemental Fig. S2). Only trace peaks whose mass corresponds to a dimer were detected (supplemental Fig. S2). The MALDI-TOF procedure dissociates non-covalently linked dimers but does not separate dimers linked by disulfide bonds. This point is relevant considering that the NTD contains two rather exposed Cys residues (25) (supplemental Fig. S3C). Because a dimer peak was not detected, these results, as well as 13Cβ chemical shift values characteristic of reduced Cys, rule out dimerization through the formation of intermolecular disulfide bonds under the conditions studied, namely pH 4.0, 25°C. MALDI-TOF mass spectra recorded on the L27A and L28A variants also revealed the mass peaks expected for these variants, a high level of 15N incorporation, and no evidence for dimers linked by covalent bonds (data not shown).
Next, NOESY spectra on a freshly prepared sample dissolved in buffer containing 1 mM deuterated acetic acid and 100% D2O were recorded on concentrated (340 µM) and diluted (100 µM) samples. In these spectra, we observed that some signals, for example those of Leu-27, were considerably perturbed (supplemental Fig. S3A). This strongly suggests that this residue is directly involved in the dimerization interface. The relatively small number of changes is likely due to the fact that these spectra were recorded in D2O, and most surface HN have exchanged and become invisible. Analytical ultracentrifugation was performed on this sample and revealed a significant population of dimers (data not shown). The small number of changes also suggests that the interface involves a relatively low number of residues, which is in agreement with the weak association constant and the relatively low population of dimers.
As an additional test for dimer formation, the longitudinal and transverse relaxation rates (R1 and R2) for the following 15N … sample are also slightly lower. These results, which can also be represented as the R2/R1 ratio, are clear evidence for dimer formation in the 523 µM TDP-43(1-102) sample. From the mean R1 and R2 rates, values of the correlation time, Tc, which measures the tumbling time in solution and is related to the size, were calculated (supplemental Table S2). These Tc values are not directly comparable because of the influence of disordered segments on the experimental Tc values but not the calculated Tc values. Nevertheless, all the samples' Tc values are in reasonable agreement with the values calculated based on the monomer structure (supplemental Table S2) and with Tc values for proteins of this size, except the 523 µM WT TDP-43(1-102) sample. The latter's Tc value is consistent with a mixed sample content of monomer and dimer.
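One widely used way to estimate a rotational correlation time from mean 15N relaxation rates is the slow-tumbling approximation Tc ≈ sqrt(6 R2/R1 − 7) / (4π νN), where νN is the 15N resonance frequency in Hz. The Python sketch below illustrates it; whether this is the procedure used for supplemental Table S2 is an assumption, and the function name, the example rates, and the spectrometer frequency are hypothetical.

```python
import math

def tau_c_from_r1_r2(r1, r2, nu_n_hz):
    """Approximate rotational correlation time in seconds.

    Uses tau_c ~ sqrt(6 * R2/R1 - 7) / (4 * pi * nu_N), valid for
    rigid, slowly tumbling molecules without exchange broadening.
    r1 and r2 are in s^-1; nu_n_hz is the 15N frequency in Hz.
    """
    arg = 6.0 * (r2 / r1) - 7.0
    if arg <= 0.0:
        raise ValueError("R2/R1 is too small for this approximation")
    return math.sqrt(arg) / (4.0 * math.pi * nu_n_hz)

# Hypothetical mean rates and 15N frequency, not values from the paper
tau = tau_c_from_r1_r2(r1=1.5, r2=10.0, nu_n_hz=81.1e6)
print(f"tau_c = {tau * 1e9:.1f} ns")
```

Because R2 rises and R1 falls as a protein tumbles more slowly, a higher R2/R1 ratio in a concentrated sample translates into a longer apparent correlation time, as expected when part of the sample is dimeric.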

Atomistic model for the solution structure of the NTD dimer
A sample containing 50% unlabeled TDP-43(1-102) and 50% 13C,15N-labeled TDP-43(1-102) (total concentration = 330 µM) was studied to obtain atomic-level information on the TDP-43 dimer's structure. In this sample, half the dimers will contain one labeled subunit and one unlabeled subunit, which affords the exclusive detection of pure intermolecular proton-proton contacts using a 13C-edited/12C-filtered 2D NOESY experiment, in which only cross-peaks between the unlabeled and labeled monomers within the same dimer are visible. Following this strategy, we could detect 17 intermolecular cross-peaks (supplemental Table S3 and supplemental Fig. S3D).
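The statement that half of the dimers are mixed follows from simple binomial statistics. The Python sketch below works this out for any labeled fraction, assuming that labeled and unlabeled chains pair at random and dimerize equally well; the function name is hypothetical.

```python
def dimer_label_populations(labeled_fraction):
    """Populations of dimer types for randomly pairing chains.

    Assumes isotope labeling does not change dimerization propensity,
    so subunits are drawn independently with probability p of being labeled.
    """
    p = labeled_fraction
    return {
        "labeled/labeled": p * p,
        "mixed": 2.0 * p * (1.0 - p),
        "unlabeled/unlabeled": (1.0 - p) ** 2,
    }

# 50% labeled sample, as described in the text
for kind, population in dimer_label_populations(0.5).items():
    print(f"{kind}: {population:.2f}")
```

A 1:1 mixture maximizes the mixed population, which is the only species that contributes cross-peaks to the isotope-edited/filtered NOESY experiment.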
Whereas this limited number of unambiguous NOEs is insufficient to enable the determination of a high-resolution dimer structure, it does reveal many structural features of the interface. Several of these NOEs arise from inter-monomer contacts between Leu-27 side chains, as well as between Pro-19 (at the end of β-strand 2) and Glu-58 (in the loop that connects β-strands 4 and 5). By utilizing a limited number of conformational restraints based just on these NOEs (supplemental Table S3 and supplemental Fig. S3), the structure of a minimal dimer interface could be calculated (Fig. 4A). In this structure, the burial of the hydrophobic side chain of Leu-27 (Fig. 4A), which is exposed in the monomer (see above and Fig. 1A and supplemental Tables S1 and S3), would provide a favorable free energy change to drive dimerization.
The remaining NOE signals arise from residues 30 to 32 in the α-helix and Pro-36, which lies right at the end of this element of secondary structure. Employing all possible NOE-derived conformational restraints, a larger dimer interface, which features helix-helix "knobs into grooves" packing, could be determined (Fig. 4B). Whereas this structure represents our proposal for the TDP-43 NTD dimer in solution, it is important to point out that on the basis of this structure, one may expect to see additional NOE cross-peaks between 1H in the α-helices; nevertheless, these signals were not observed. Dynamic behavior on the micro- to millisecond time scale, which might arise from interconversion between the conformers comprising the minimal and large dimer interfaces, or low signal intensity due to the low population of the dimer could account for why these signals were not detected.

Leu-27 is a key residue for TDP-43 NTD dimerization
To further validate the structural model for the NTD dimer, a variant containing the Leu-27 to Ala mutation was prepared and studied. The 2D 1H-15N HSQC of the TDP-43(1-102) L27A variant shows features of the natively folded domain (Fig. 2A). A 3D 15N-NOESY-HSQC spectrum was also recorded, and its analysis permitted the verification of assignments in the 2D 1H-15N HSQC spectrum, the confirmation of essential side chain assignments, and the corroboration that crucial NOEs defining the tertiary structure were still present in this variant with respect to the WT (supplemental Fig. S4). A plot of the differences in the 1H-15N backbone chemical shifts in the L27A mutant versus the WT construct reveals minor variations except for the residues neighboring the mutation site and residues 21, 22, and 58 (Fig. 2C). The latter differences may reflect the lack of a populated dimer. In addition, we monitored the thermal denaturation of this variant using fluorescence spectroscopy, and we found that the TDP-43(1-102) L27A variant (Tm = 45°C) is almost as stable as WT TDP-43(1-102) (Tm = 49°C) at pH 3.9, as illustrated in Fig. 2D. Analytical ultracentrifugation revealed that the TDP-43(1-102) L27A variant is completely monomeric at concentrations where WT TDP-43(1-102) exists partly as a dimer (Fig. 2E).
We also analyzed the effect of substituting both Leu-27 and Leu-28 with alanine residues, in the context of the L27A/L28A double mutant. This variant appears to be even less stable than L28A, as its 2D 1H-15N HSQC spectrum is typical of a completely unfolded protein (supplemental Fig. S5). The increase in destabilization from L27A to L28A to L27A/L28A can be appreciated in supplemental Fig. S6, where the upfield region of the 1H NMR spectrum near 0 ppm shows intense peaks for methyls retaining the native fold in the hydrophobic core of L27A, weaker and shifted signals in L28A, and the absence of these peaks in the double mutant.

Conformational model of the TDP-43(1-414) dimer
A model of the complete TDP-43 protein, built on the basis of the NTD monomer structure and dimer interface reported here, and the SAXS envelope of Ref. 31, shows that the NTD and RRM domains are arranged like beads on a string or rosary and that the disordered C-terminal region extends out away from them (Fig. 4C). However, considering that the segment connecting the NTD to RRM1 is flexible and that the RRM1 to RRM2 linker is also flexible in the absence of RNA (41), it is quite possible that the folded domains and C-terminal regions can flex and bend to interact with each other, when they are buffeted by other macromolecules or due to attractive interactions with other molecules such as RNA. This structure, and the contacts defining the dimer interface in the N-terminal domain, may well be affected by changes in pH and ionic strength. In agreement with this consideration, evidence for interdomain contacts in TDP-43 has recently been reported (28).

Functional experiments on NTD structural mutants in the Leu-28 to Phe-35 α-helix
To study the biological implications of NTD variants in a cellular context, we engineered a mutation, V31R/T32R, that was predicted to completely destroy the α-helix structure and unfold the NTD, because of the well-known severely destabilizing effect of introducing charges into the hydrophobic core (26). This mutant was then tested in a variety of functional assays to determine its effect on the splicing and aggregation functionality of TDP-43.
First of all, we tested this mutation in an add-back assay (Fig. 5). This assay, which is described in detail in Ref. 42, is based on a minigene system carrying CFTR exon 9 that includes a mutation in a splicing enhancer element within its sequence (C155T) to obtain a 50:50 ratio of exon inclusion/skipping in normal conditions (Fig. 5, 1st lane). This is the optimal condition to see whether a change in TDP-43 structure can result in either a loss-of-function (less exon 9 skipping) or a gain-of-function effect (more exon 9 skipping). In this system, when endogenous TDP-43 is removed from the cells by siRNA treatment, the level of CFTR exon 9 recognition increases substantially, to more than 80% (Fig. 5, 2nd lane). As expected, splicing inhibition was fully rescued following expression of an siRNA-resistant wild-type TDP-43 protein (Fig. 5, 3rd lane) but not when an siRNA-resistant F4L TDP-43 mutant that cannot bind RNA due to mutations in the RRM1 and RRM2 domains was expressed at similar levels (Fig. 5, 4th lane). Like the F4L mutant, the V31R/T32R mutant failed to rescue splicing inhibition (Fig. 5, 5th lane).
To determine whether this change could also be associated with loss-of-function, we took advantage of an aggregation system that we previously set up by adding 12 repetitions of the Q/N prion-like domain of TDP-43 (15). The induction of this protein, when stably transfected in cells, caused the accumulation of aggregates capable of sequestering the endogenous TDP-43 protein and inducing a very well-defined loss-of-function phenotype (22,34,43). Therefore, we also inserted the V31R/T32R mutation in these stable cell lines (FLAG-TDP-12X-V31R/T32R) to see its effect on aggregation and endogenous TDP-43 sequestration (supplemental Figs. S7A and S8A). As shown in supplemental Figs. S7B and S8B, compared with the expression of a wild-type TDP-12X-Q/N, the expression of this protein was not capable of inducing loss-of-function effects in the pre-mRNA splicing of POLDIP3. This loss arises because, as shown in co-immunoprecipitation experiments, the FLAG-TDP-12X-V31R/T32R protein has a greatly reduced ability to interact with endogenous TDP-43 (supplemental Figs. S7C and S8B). Additional experiments also confirmed that endogenous TDP-43 remains soluble in the nucleus even in the presence of aggregated FLAG-TDP-12X-V31R/T32R (data not shown).
Having established the importance of the folded NTD and its α-helix in TDP-43 splicing and aggregation properties, it was therefore of interest to see the effects of the L27A and L28A mutants for which we had obtained the structural data described above.

Functional experiments on NTD structural mutants designed to disrupt quaternary (L27A) and tertiary (L28A) contacts
First of all, we performed add-back experiments on all three mutants (L27A/L28A, L27A, and L28A) in HeLa cells (Fig. 6, A and B). These results show that the double mutant L27A/L28A cannot recover CFTR exon 9 skipping following its add-back in TDP-43-depleted cells (Fig. 6A, 5th lane), just like the V31R/T32R mutant. Furthermore, it localizes almost entirely in the cytoplasm, as observed in immunofluorescence experiments (Fig. 6C). This is expected, as we have shown that this mutant lacks any native structure in solution, as gauged from the NMR studies described above. More interesting is the L28A variant, which is also unable to recover the exon skipping activity (Fig. 6A, 7th lane) and is also predominantly localized in the cytoplasm. This similar behavior of L28A and the L27A/L28A double mutant is evidence that TDP-43 needs to be completely folded, and not partially folded, to perform its biological functions and maintain its predominantly nuclear localization.
In fact, the L27A variant, which is shown above to retain the native fold, can restore exon skipping just as WT does (Fig. 6A, 6th lane). Interestingly, although this L27A variant is well localized in the nucleus, an appreciable amount is also present in the cytoplasm (Fig. 6C). We attribute this behavior to this variant's ability to retain the native fold while being less able than the WT to dimerize.
These results are also reflected when these mutants are transfected in cell lines stably expressing a GFP-TDP-43-12X-Q/N-F4L protein (Fig. 7). First of all, it can be observed that the add-back of L27A can substantially recover the CFTR exon 9 skipping following the formation of the aggregates (Fig. 7A, 6th lane), although this was not observed for the L28A and L27A/L28A mutants (Fig. 7A, 7th and 9th lanes). Moreover, the immunohistochemical localization of L27A in the presence of the aggregates still remains diffuse in the nucleus (Fig. 7B), as opposed to that of L28A and L27A/L28A, which colocalize extensively with the aggregates.

Discussion
Thanks to the superior solubility of the TDP-43(1-102) construct, we have been able to extend the structure of the NTD to higher resolution, corroborate the identity of well protected amide groups in secondary structure, and identify three new protected residues. These results are additional evidence that the NTD is well folded and stable (25) and not poorly folded and marginally stable (24,28).
Our findings are pertinent to the open debate on whether the NTD is stably folded and functionally relevant. In particular, the strong destabilization and structural disruption of the NTD in the context of the L28A, L27A/L28A, and V31R/T32R mutations, these variants' dramatically reduced activity in modulating RNA splicing, their subcellular mislocalization, and their strong tendency to aggregate in cells all emphasize the importance of a folded NTD for TDP-43's native activity. Based on these results, we conclude that the NTD must be stably folded for TDP-43 to be able to carry out its physiological functions. This conclusion is likely to hold at the level of whole organisms, because the deletion of a short segment of the NTD corresponding to the first β-strand, which we expect would provoke the domain's denaturation, has severe physiological effects in mice (23).
TDP-43 is known to exist as a dimer/monomer equilibrium in vivo (30). Under the conditions studied here (25°C, pH 3.9), the TDP-43 NTD dimerizes through interactions mediated by conserved exposed hydrophobic residues such as Leu-27, Pro-19, and Pro-36 as well as Glu-58. Residues in the α-helix could contribute additional interactions. The dimerization interface is relatively small, and the association is weak under the conditions examined here. However, it could be relevant for TDP-43's physiological activities, particularly in "crowded" conditions, for example when concentrated in subcellular compartments such as stress granules or as part of microdroplets (as discussed below). Considering that the KD value of the TDP-43 NTD dimer is estimated to be in the range of 1 mM, which is well above the in vivo concentration of TDP-43, it is uncertain how much dimer is present in cells. TDP-43 levels in cells are subject to very fine regulation, as the protein binds and regulates the concentration of its own mRNA (32) and also acts to retain its own mRNA in the nucleus to prevent translation (33). Nevertheless, crowding effects from the high in vivo concentrations of macromolecules and metabolites act to promote protein oligomerization (44), including TDP-43 dimerization. In addition, in vivo post-translational modifications, such as the acetylation of Lys-145 (45) or phosphorylation (46), are known to increase or decrease, respectively, TDP-43 self-association. It is also possible that the monomer association is strengthened at neutral pH and higher ionic strength conditions (20). Furthermore, gel filtration and SDS-PAGE experiments have shown that TDP-43 truncation constructs consisting of both RRM domains, either linked together or as separate entities, also dimerize or tetramerize, respectively, in vitro (47).
Therefore, TDP-43 dimerization is likely to be aided by interactions among the RRM domains and could be further strengthened when distinct TDP-43 molecules are bound to the same RNA.
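To make the concentration argument concrete, the following minimal sketch computes the fraction of protein chains present as dimer for a simple 2M ⇌ D equilibrium. It assumes an ideal two-state monomer/dimer model with KD = [M]²/[D] and ignores crowding, RNA binding, and post-translational modifications. The function name and the example total concentrations are illustrative assumptions and are not measured cellular values.

```python
import math


def dimer_fraction(total_conc_molar: float, kd_molar: float) -> float:
    """Fraction of protein chains in dimers for 2M <-> D, K_D = [M]^2 / [D].

    Mass balance: C_total = [M] + 2[D] = [M] + 2[M]^2 / K_D, which gives the
    quadratic 2[M]^2 + K_D[M] - K_D * C_total = 0 for the free monomer.
    """
    if total_conc_molar <= 0:
        return 0.0
    monomer = (-kd_molar + math.sqrt(kd_molar**2 + 8 * kd_molar * total_conc_molar)) / 4
    return 1 - monomer / total_conc_molar


if __name__ == "__main__":
    KD = 1e-3  # ~1 mM, the order of magnitude quoted in the text
    # Hypothetical total concentrations, chosen only for illustration
    for conc in (1e-6, 1e-4, 1e-3):
        print(f"{conc:.0e} M total -> {100 * dimer_fraction(conc, KD):.1f}% of chains dimeric")
```

Under this idealized model, half of the chains are dimeric when the total concentration equals KD, and the dimeric fraction falls well below 1% at low micromolar concentrations, which is why local concentration effects matter for the argument above.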
It is fascinating that the L27A variant, which is unable to dimerize but can stably fold, is active in modulating splicing. This is consistent with the observation that no TDP-43 dimers were previously detected in the nucleus where splicing occurs (30). In addition to regulating splicing, TDP-43 performs several other functions, and dimerization may well be required for some of them. TDP-43 dimers were previously observed in the cytoplasm (30), despite the fact that TDP-43 is more concentrated in the nucleus. This suggests that some mechanism concentrates TDP-43 in the cytoplasm so as to promote its dimerization. Over the last several years, the idea that cells contain several types of discrete liquid phases known as "microdroplets" such as nucleoli (48), stress granules (49), or neuronal granules that serve to concentrate and organize certain proteins and RNAs has gained acceptance (50-52). TDP-43 has been recently shown to bind to stress granules through the hydrophobic subsegment in the CTR (13, 14), and as a working hypothesis, we propose that the concentration of TDP-43 molecules within these microdroplets could well promote its dimerization and eventually lead to pathological aggregation or altered functionality. Future experiments will be necessary to test this proposal as well as to ascertain the physiological and eventually pathological function(s) of the TDP-43 dimer. Nonetheless, our results provide a better understanding of the relationship between the TDP-43 NTD and its functionality. Most importantly, the level of structural detail achieved by our analyses to characterize the dimerization interface at a single amino acid level may eventually open the way for the development of small molecules/peptides capable of interfering with this process. These effectors could then eventually be used to specifically inhibit TDP-43 aggregation tendencies and enhance its solubility within cells.

Protein expression and purification
Protein expression was essentially performed as described previously in Mompean et al. (25). Briefly, the sequences coding for TDP-43 N terminus residues 1-102 and the various mutants were cloned in the BamHI-HindIII sites of the pQE30 6xHis plasmid. The plasmids were then used to transform M15 bacteria, which were grown on kanamycin/ampicillin selection plates. The bacteria were then grown in M9 media supplemented with D-glucose (U-¹³C, 99%, Cambridge Isotope Laboratories, Inc.) at 0.3% v/v and ¹⁵NH₄Cl (Sigma) at 0.1% v/v, and protein expression was induced overnight with 1 mM isopropyl 1-thio-β-D-galactopyranoside at 30°C. All constructs contained an N-terminal His tag (MRGSHHHHHHGS), which facilitated the proteins' purification following the manufacturer's instructions using nickel-nitrilotriacetic acid-agarose Qiagen resin in the presence of Complete protease inhibitor cocktail (Roche Applied Science). Eluted fractions were then analyzed by 15% SDS-PAGE before staining with Colloidal Coomassie Blue to check for purity. For spectroscopic studies, samples were transferred to buffer containing 85% milliQ water, 15% D₂O (Cambridge Isotope Laboratories, Inc.), and 1.0 mM acetic acid-d₃ (Aldrich) using PD-10 desalting columns (GE Healthcare) and concentrated utilizing Vivaspin centrifugal concentrators (Sartorius). We used deuterated acetic acid to avoid the ¹H signal from the methyl group of acetic acid.

Fluorescence spectroscopy
The thermal denaturation of WT TDP-43 and that of the L27A, L28A, and L27A/L28A variants was monitored using a Fluoromax4 spectrofluorimeter (Jobin-Yvon Inc., Edison, NJ). The xenon emission (467 nm) line and the water Raman signal (which is 397 nm when exciting at 350 nm) were used to calibrate the excitation and emission wavelengths, respectively. The experiments were made using fractions from gel filtration or NMR samples (concentrations 50-100 µM, save one experiment at 500 µM) and contained the same buffer (1 mM acetic acid, pH 3.9) as the NMR experiments. Emission spectra were recorded over the emission range of 270-380 nm, utilizing an excitation wavelength of 280 nm, 2-2.5 nm excitation, 2 nm emission slit widths, and a Peltier module to control the temperature. Data were collected using a scan speed of 2 nm·s⁻¹ over intervals of 1-2°C with 1- or 2-min equilibration times, respectively, at the new temperature before recording the spectra.
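The water Raman position used for the emission calibration can be checked with a short calculation. The sketch below assumes a Raman shift of about 3400 cm⁻¹ for the O-H stretching band of water, which is a commonly quoted approximate value and is not stated in the text; the function name is illustrative.

```python
def raman_peak_nm(excitation_nm: float, shift_cm1: float = 3400.0) -> float:
    """Wavelength (nm) of the Raman band for a given excitation wavelength.

    shift_cm1 is the Raman shift in wavenumbers; ~3400 cm^-1 is an assumed,
    approximate value for the O-H stretch of liquid water.
    """
    excitation_wavenumber = 1e7 / excitation_nm  # nm -> cm^-1
    return 1e7 / (excitation_wavenumber - shift_cm1)


if __name__ == "__main__":
    print(f"Raman peak for 350 nm excitation: {raman_peak_nm(350.0):.1f} nm")
```

With these assumptions the band falls close to the 397 nm quoted above for 350 nm excitation.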

Analytical ultracentrifugation
The samples used for the NMR experiments were subjected to sedimentation velocity experiments at 25°C, using an Optima XL-A analytical ultracentrifuge with a UV-visible variable wavelength detection system. The sedimentation assays were performed on NMR samples at their initial concentrations or diluted 3-5-fold by centrifugation at 48,000 rpm and detected using a wavelength of 280 nm. The concentrations of TDP-43(1-102) studied were as follows: WT 0.23 and 0.046 mM; L27A 0.20 and 0.050 mM; L28A 0.13 and 0.032 mM. A second independent WT sample was studied and gave similar results. For the experiments shown in Fig. 2E, thin cells (pathlength = 3 mm) were used for the concentrated L27A and L28A samples, and normal cells (pathlength = 12 mm) were utilized for the dilute samples. This accounts for why the peak absorbance does not decrease despite the substantial dilution. The program SEDFIT (version 15.01b) (53) was used to analyze the sedimentation profiles and obtain the sedimentation coefficients.
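The pathlength remark follows from the Beer-Lambert law, A = ε·c·l: for a fixed extinction coefficient, the absorbance scales with the product of concentration and pathlength. The short sketch below uses the L27A and L28A concentrations and pathlengths quoted above; the dictionary layout and function name are illustrative only.

```python
def conc_path_product(conc_mM: float, path_mm: float) -> float:
    """Quantity proportional to absorbance (A = epsilon * c * l) at fixed epsilon."""
    return conc_mM * path_mm


# (concentration in mM, pathlength in mm) for concentrated and dilute samples
samples = {
    "L27A": ((0.20, 3.0), (0.050, 12.0)),
    "L28A": ((0.13, 3.0), (0.032, 12.0)),
}

# Ratio of expected absorbance, dilute (12 mm cell) over concentrated (3 mm cell)
ratios = {
    name: conc_path_product(*dilute) / conc_path_product(*concentrated)
    for name, (concentrated, dilute) in samples.items()
}

if __name__ == "__main__":
    for name, ratio in ratios.items():
        print(f"{name}: dilute/concentrated absorbance ratio = {ratio:.2f}")
```

Because the fourfold longer pathlength compensates for the roughly fourfold dilution, both ratios are close to 1.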

NMR experiments
All NMR experiments were recorded at 25°C and pH 3.9, using a Bruker Avance 800 MHz spectrometer equipped with triple axis resonance cryo-probe with Z-gradients. Based on our reported chemical shift assignments for WT TDP-43(1-77) (25) (BMRB access code 25675), we performed additional standard 3D experiments, including ¹H-¹H-¹⁵N HSQC-NOESY, CBCA(CO)NH, HNCO, and HNCA spectra, to ensure that no structural changes occurred, to assign the segment 78-102, and to confirm that it is unfolded and does not interact with the folded domain comprising residues 1-77. This assignment process was carried out manually using SPARKY (54), and the ¹H chemical shifts were referenced to sodium 4,4-dimethyl-4-silapentane-1-sulfonate. The ¹³C and ¹⁵N chemical shifts were referenced indirectly to ¹H based on the nuclei's gyromagnetic ratios as recommended (55). Interproton NOE distance restraints were obtained by 2D ¹H-¹H and ¹³C-edited/¹²C-filtered NOESY spectra (120-ms mixing time). The NMR assignments have been deposited in the Biological Magnetic Resonance Data Bank (BMRB access code 34081). Heteronuclear NOE experiments (56) were carried out with an overall recycling delay of 10 s to ensure the maximal development of NOEs before acquisition and to allow solvent relaxation (57). Heteronuclear NOEs were calculated from the ratio of cross-peak intensities in spectra collected with and without amide proton saturation during the recycle delay. Uncertainties in peak heights were determined from the standard deviation (σ) of the distribution of intensities in the region of the spectra where no signal and only noise were observed.
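The heteronuclear NOE calculation described above can be sketched as follows. The intensity ratio is taken from the text; the error estimate uses standard first-order propagation of the noise-derived peak-height uncertainties, which is an assumption here because the text does not state how the uncertainties were combined. The function name and example intensities are hypothetical.

```python
import math


def heteronuclear_noe(i_sat: float, i_ref: float,
                      noise_sat: float, noise_ref: float) -> tuple[float, float]:
    """Return (NOE ratio, estimated uncertainty).

    i_sat / i_ref: peak heights with / without amide proton saturation.
    noise_sat / noise_ref: standard deviation of the baseline noise in each
    spectrum, used as the uncertainty of the corresponding peak height.
    The uncertainty uses first-order error propagation for a ratio (assumed).
    """
    noe = i_sat / i_ref
    rel_err = math.sqrt((noise_sat / i_sat) ** 2 + (noise_ref / i_ref) ** 2)
    return noe, abs(noe) * rel_err


if __name__ == "__main__":
    # Hypothetical peak heights and noise levels in arbitrary units
    noe, err = heteronuclear_noe(80.0, 100.0, 2.0, 2.0)
    print(f"NOE = {noe:.2f} +/- {err:.3f}")
```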

NMR 15 N relaxation measurements
The dynamics and dimer formation were characterized by measuring the longitudinal (R₁), off-resonance rotating frame (R₁ρ), and transverse (R₂) relaxation rates using the parameters and procedures previously described for the TDP-43(1-77) construct (25). For the 523 µM WT TDP-43(1-102) sample, four additional spectra were recorded with longer relaxation delays (2.0, 2.2, 2.5, and 3.0 s) to accurately fit the R₁ rates, which are somewhat slower than those of the other samples. The overall correlation time, τc, was calculated from the ratio of the mean values of R₁ and R₂ excluding disordered residues. Values of NOE ratios < 0.65, R₂ < the mean R₂ − 1σ, and R₁ > the mean R₁ + 1σ were used to define disordered residues (58). In addition to the experimental measurements, the correlation times were also calculated for the well-folded regions of the monomer and the dimer structures using the HYDRONMR program (59).
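As an illustration of how a correlation time can be obtained from the R₂/R₁ ratio, the sketch below uses the widely used approximation τc ≈ sqrt(6·R₂/R₁ − 7) / (4π·ν_N), which holds for slow isotropic tumbling. The text does not state which expression was applied, so this formula is an assumption, as are the function name, the ¹⁵N frequency of about 81.06 MHz (corresponding to an 800 MHz spectrometer), and the example rates. The input means are expected to come from ordered residues only.

```python
import math


def correlation_time(mean_r1: float, mean_r2: float,
                     nu_n_hz: float = 81.06e6) -> float:
    """Approximate overall correlation time (s) from mean R1 and R2 (s^-1).

    Uses tau_c ~ sqrt(6 * R2/R1 - 7) / (4 * pi * nu_N), an approximation for
    slowly tumbling molecules (assumed; not taken from the text).
    nu_n_hz is the 15N resonance frequency in Hz (assumed ~81.06 MHz).
    """
    ratio = mean_r2 / mean_r1
    if 6.0 * ratio <= 7.0:
        raise ValueError("R2/R1 is too small for this approximation")
    return math.sqrt(6.0 * ratio - 7.0) / (4.0 * math.pi * nu_n_hz)


if __name__ == "__main__":
    # Hypothetical mean rates for ordered residues
    tau_c = correlation_time(mean_r1=1.5, mean_r2=10.0)
    print(f"tau_c = {tau_c * 1e9:.2f} ns")
```

A larger R₂/R₁ ratio gives a longer τc, which is the trend expected when a monomer associates into a dimer.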

NMR structure calculations
The dimeric structure of TDP-43(1-102) was determined in a two-step procedure using CYANA-3.97 (60). In the first step, we resolved the monomeric structure using dihedral angle restraints as determined by TALOS+ based on the chemical shift information, in addition to a large number of NOE cross-peaks obtained from a ¹H-¹H 2D NOESY spectrum. These data were used as the input for iterative structure calculation, namely seven cycles of simultaneous automatic assignment of the cross-peaks and torsion angle dynamics. Next, a final simulated annealing calculation was carried out using the distance restraints derived from the consensus assignment achieved in the seventh cycle. The resulting structure is slightly more refined than that of monomeric TDP-43(1-77) that we recently reported (25), essentially because the samples used in this work yielded superior spectra, as explained under "Results." The final refined structures have been deposited in the RCSB Protein Data Bank (code 5MRG).
In the second step, the ¹³C,¹⁵N-edited/¹²C,¹⁴N-filtered 2D NOESY experiment, which is specifically designed to detect intermolecular contacts, was recorded on a hemi-¹³C,¹⁵N-labeled 330 µM TDP-43(1-102) sample. The cross-peaks obtained from this experiment were converted to upper distance limits of 5.5 Å. These data were used as distance restraints to complement the previously obtained dihedral and distance restraints of the monomer. As we noted in our previous study, there is a unique H-bonding pattern that can account for the residues showing protection against H/D exchange; these H-bonds were used as constraints along with the experimental cross-peaks, resulting in a bundle of conformers without violations. Thus, we used all this information together and submitted two copies of TDP-43(1-102) to a second simulated annealing calculation to obtain the final dimeric structure. Finally, the structural model of the NTD dimer and the tandem RRM domain structure solved in 2013 (41) were manually fitted into the previously published SAXS envelope to produce the model for the complete TDP-43 dimer shown in Fig. 4C.
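The meaning of an upper distance limit and of a "violation" can be illustrated with a minimal check on atomic coordinates. This is a generic sketch and not the CYANA implementation; the function name and the example coordinates are hypothetical.

```python
import math

UPPER_LIMIT_ANGSTROM = 5.5  # upper distance limit quoted in the text


def restraint_violation(atom_a: tuple[float, float, float],
                        atom_b: tuple[float, float, float],
                        upper: float = UPPER_LIMIT_ANGSTROM) -> float:
    """Return by how many angstroms an atom pair exceeds its upper limit.

    A value of 0.0 means the intermolecular restraint is satisfied.
    """
    distance = math.dist(atom_a, atom_b)
    return max(0.0, distance - upper)


if __name__ == "__main__":
    # Hypothetical coordinates (in angstroms) of two protons on different chains
    print(restraint_violation((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))  # satisfied
    print(restraint_violation((0.0, 0.0, 0.0), (6.0, 8.0, 0.0)))  # violated
```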

Add-back splicing assay and in cell aggregation and subcellular localization with WT and the mutants
Using specific sets of primers, the V31R/T32R, L27A, L28A, and L27A/L28A mutations were inserted in the previously described si-resistant plasmid expressing wild-type TDP-43 (42). Briefly, to maximize TDP-43 silencing efficiency, HeLa cells were plated at 30% of confluence (day 0), and two rounds of TDP-43 siRNA transfections were carried out on days 1 and 2, according to the procedure already described (61). On the afternoon of day 2, cotransfection was performed with 1 µg of pFLAG-fusion protein expression vector and 0.5 µg of the CFTR C155T reporter minigene. On day 3, cells were harvested, and total RNA was extracted with EuroGold TRifast (Euroclone, Milan, Italy). Reverse transcription was performed using murine leukemia virus reverse transcriptase (Invitrogen), according to the manufacturer's protocol. PCR with DNA polymerase (New England Biolabs, Ipswich, MA) using minigene-specific primers was carried out for 35 amplification cycles (95°C for 45 s, 54°C for 45 s, and 72°C for 45 s). The expression levels of the added-back TDP-43 proteins were monitored through Western blotting, using a commercially available antibody against TDP-43 (Protein Tech, 10782-2-AP). Endogenous tubulin (in-house made mouse monoclonal antibody) and p84 (Abcam, Ab487) were used as loading controls. Inclusion levels of exons were quantified using the Qiaexcel platform used to run the PCR products or through ImageJ quantification.
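Exon inclusion from band intensities is commonly expressed as the inclusion band signal divided by the sum of the inclusion and skipping band signals. The text does not give the exact formula used, so the sketch below is an assumption; the function name and example intensities are hypothetical.

```python
def percent_inclusion(inclusion_signal: float, skipping_signal: float) -> float:
    """Percent exon inclusion from two RT-PCR band intensities.

    inclusion_signal: intensity of the band containing the exon.
    skipping_signal: intensity of the band lacking the exon.
    Assumes both bands were measured in the same lane and the same units.
    """
    total = inclusion_signal + skipping_signal
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return 100.0 * inclusion_signal / total


if __name__ == "__main__":
    # Hypothetical band intensities in arbitrary units
    print(f"{percent_inclusion(30.0, 70.0):.1f}% inclusion")
```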

Stable cell line generation
HEK293 flip-in cell line (Invitrogen) was grown in DMEM/Glutamax-I (Gibco) supplemented with 10% fetal bovine serum (Gibco) and antibiotic/antimycotic-stabilized suspension (Sigma). The plasmid transfections were carried out using Effectene transfection reagent (Qiagen) according to the manufacturer's instructions. To generate the stable clone, 0.5 µg of the FLAG-TDP-12X-V31R/T32R expressing plasmid were cotransfected together with 0.5 µg of pOG44 vector that expresses the Flp-recombinase (Invitrogen).

Coimmunoprecipitation assays
For coimmunoprecipitation assays, HEK293 flip-in stable cell lines expressing FLAG-TDP-43-12X-Q/N and FLAG-TDP-12X-V31R/T32R were induced for 24 h with 1 µg/ml tetracycline. Cells were collected in RIPA lysis buffer (50 mM Tris/HCl, pH 7.4, 150 mM NaCl, 1% Nonidet P-40, 0.1% SDS, 1 mM EDTA, pH 8, 1 mM PMSF, 0.5% SDC, H₂O up to the final volume) supplemented with protease inhibitors (Roche Applied Science, catalog no. 11836145001) and incubated for 30 min at 4°C. After spin down at 500 × g at 4°C, cells were lysed by sonication. The lysates were then incubated with 40 µl of A/G plus agarose beads (Santa Cruz Biotechnology) to perform a pre-clearing for 1 h at 4°C. In the meantime, an incubation of 40 µl of A/G plus agarose beads with 3 µg of anti-FLAG antibody (Sigma, F1804) in RIPA buffer for 2 h at 4°C was performed. After both incubations, the pre-cleared lysate was incubated with A/G plus agarose beads/anti-FLAG overnight at 4°C. The day after, the beads were precipitated and washed with PBS once for 10 min at 4°C. The beads were finally resuspended in 50 µl of Resuspension Buffer (50 mM Tris/HCl, pH 7.4, 5 mM EDTA, 10 mM DTT, 1% SDS, H₂O up to the final volume), and 20 µl of 5× SDS loading buffer were added.

Cell lysate fractionation
To perform cell lysate fractionation in soluble and pellet fractions, 2 × 10⁶ cells (HEK-FLAG-TDP-12X-Q/N, HEK-FLAG-TDP-12X-V31R/T32R) were seeded and induced with 1 µg/ml tetracycline for 24 h. Then, cells were collected and lysed with 1 ml of RIPA lysis buffer + protease inhibitor for 30 min at 4°C. After centrifugation at 4000 rpm for 20 min, the whole supernatant was further sonicated for 5 min to allow a better lysis. Two hundred µg of cell lysate were ultracentrifuged in a clean Beckman polycarbonate thick wall centrifuge tube (rotor type 70.1Ti) for 1 h at 25°C at 33,000 rpm.
The supernatant was collected, and the pellet was washed twice with 100 µl of RIPA buffer. Pellet was finally dissolved in urea buffer (7 M urea, 4% CHAPS, 30 mM Tris, pH 8.5). To analyze each fraction by Western blotting, 10% of input, 10% of soluble fraction, and 30% of pellet volume were loaded in a 10% SDS-polyacrylamide gel.