A structural organization for the Disrupted in Schizophrenia 1 protein, identified by high-throughput screening, reveals distinctly folded regions, which are bisected by mental illness-related mutations

Disrupted in Schizophrenia 1 (DISC1) is a scaffolding protein of significant importance for neurodevelopment and a prominent candidate protein in the pathology of major mental illness. DISC1 modulates a number of critical neuronal signaling pathways through protein-protein interactions; however, the mechanism by which this occurs and how DISC1 causes mental illness is unclear, partly because knowledge of the structure of DISC1 is lacking. A lack of homology with known proteins has hindered attempts to define its domain composition. Here, we employed the high-throughput Expression of Soluble Proteins by Random Incremental Truncation (ESPRIT) technique to identify discretely folded regions of human DISC1 via solubility assessment of tens of thousands of fragments of recombinant DISC1. We identified four novel structured regions, named D, I, S, and C, at amino acids 257–383, 539–655, 635–738, and 691–836, respectively. One region (D) is located in a DISC1 section previously predicted to be unstructured. All regions encompass coiled-coil or α-helical structures, and three are involved in DISC1 oligomerization. Crucially, three of these domains would be lost or disrupted by a chromosomal translocation event after amino acid 597, which has been strongly linked to major mental illness. Furthermore, we observed that a known illness-related frameshift mutation after amino acid 807 causes the C region to form aberrantly multimeric and aggregated complexes with an unstable secondary structure. This newly revealed domain architecture of DISC1, therefore, provides a powerful framework for understanding the critical role of this protein in a variety of devastating mental illnesses.

Disrupted in Schizophrenia 1 (DISC1) was initially identified as being of importance for mental illness due to its disruption by a chromosomal translocation in a Scottish family with schizophrenia, depression, and related psychiatric illnesses (1,2). At the protein level, this translocation would cause loss of all of the amino acids C-terminal of residue 597 in the 854-amino acid long human DISC1 protein. A second family was subsequently reported where a C-terminal frameshift mutation following amino acid 807 was linked to mental illness (3). Since then it has been established that the DISC1 protein acts as a scaffold, interacting with a considerable number of protein-binding partners to modulate a wide variety of signaling pathways, many of which are of significant importance for neurodevelopment (4-6). Progress in understanding the mechanisms by which this modulation occurs has been limited, however, in part because of the dearth of information concerning the basic structure of DISC1.
To date, domain delineation has been limited to bioinformatics predictions that suggest the first 325 amino acids of DISC1 are disordered, whereas the remainder of the protein (from 326 or 350 onwards) is rich in α-helical and/or coiled-coil structure (1,7,8). In addition to a deduced self-association domain around residues 400-500, identified via a mutation analysis conducted in cell culture (6), several biophysical studies of the DISC1 protein have revealed distinct dimerization and oligomerization domains toward the C terminus of DISC1 (9) and provided evidence that the full-length protein forms octamers and dimers (10). The major barriers impeding the characterization of the DISC1 protein are the low solubility and high aggregation propensity of recombinant DISC1, except when fused to maltose-binding protein (10) or when insoluble fragments are solubilized in urea and refolded in vitro (9). Moreover, the DISC1 gene is also evolving very rapidly, with the encoded protein having a low degree of sequence conservation across species and lacking significant homology with any other known proteins (7,8,11,12). This hinders attempts to identify functional domains, which would facilitate the expression of discretely folded DISC1 regions.
To address this latter issue, we utilized the Expression of Soluble Protein by Random Incremental Truncation (ESPRIT) technology (13,14). This technique, developed to identify folded structural domains of uncharacterized proteins, involves the random truncation of a gene of interest at one or both ends. The ensuing library of tens of thousands of truncated variants is then expressed in bacteria and screened for solubility. Clones encoding the most soluble protein fragments are sequenced to identify construct boundaries before being further characterized, based on the hypothesis that such soluble, well expressed protein fragments often correspond to distinctly folded domains within the full-length protein. This approach has previously led to the identification of unpredicted domains within several proteins, and in some instances to solved three-dimensional structures of such domains (13,15-17).
By employing this technique, we identified and characterized four novel folded regions of the human DISC1 protein that we propose as the basic domain architecture of the protein. Significantly, two mutations associated with psychiatric illness in families lie within these regions. These results, therefore, provide a powerful basis for further structural and functional studies of the DISC1 protein and its relevance to mental health.

Identification of four novel structural regions within DISC1 by ESPRIT screening
A version of the wild-type human DISC1 gene, codon-optimized for efficient expression in Escherichia coli, was cloned into a plasmid vector encoding hexa-histidine (His6) and biotin acceptor peptide (BAP) tags at the N and C termini, respectively. The gene was then subjected to random incremental truncation from both the 5′ and 3′ ends, generating a library of truncated DISC1 gene fragments. This library was then constrained to those DNA fragments encoding ~150-400 amino acid (AA) protein fragments to focus on constructs of typical domain size. Twenty-one randomly picked clones were sequenced to ensure an even distribution of construct lengths within this range, as well as coverage of the entire length of the DISC1 protein.
A total of 27,652 clones (72 × 384-well plates) were arrayed robotically onto nitrocellulose membranes to generate colony filters in which encoded DISC1 fragments were expressed. Identification of putatively soluble expression constructs was performed by colony blot hybridization to detect His6 and BAP signals for each clone. The 94 highest double-signaling constructs from the array were isolated, expressed in liquid culture format, and assayed directly for solubility by nickel-NTA purification, SDS-PAGE, and Western blotting. Thirty-one constructs yielded detectable DISC1 protein fragments that could be purified, and these were sequenced to identify construct boundaries. Most of these constructs are located in the C-terminal half of the protein (AA 427-854) (Fig. 1A), with only two constructs being found in the N-terminal half (AA 1-426).
To determine which of these constructs were most likely to express folded, stable, soluble proteins representative of the domains of the DISC1 protein in vivo, the BAP tags were deleted from the constructs, and then each protein was purified and analyzed by size exclusion chromatography (SEC). Most of these recombinant proteins displayed a high aggregation propensity, eluting largely in the void volume and with a highly variable oligomeric state. However, six constructs, representing four different regions of the DISC1 protein, were soluble with consistently stable oligomeric states by SEC, indicative of folded, stable regions, as will be described below. We, therefore, propose these four newly defined regions to represent at least part of the basic domain architecture of the full-length DISC1 protein, and we will refer to them here (from N terminus to C terminus) as the D, I, S, and C regions (Fig. 1, A and B). Notably, however, two of the regions (S and C) lie C-terminal to the Scottish family translocation breakpoint, whereas the I region lies across it, implying that any truncated or fusion protein produced from the translocation chromosome would lack most of its major folded elements.
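The positional reasoning above can be made explicit with a short sketch. It takes the region boundaries of the prototypical constructs (D, 257-383; I, 539-655; S, 635-738; C, 691-836) and classifies each region relative to a truncation after a given residue. The class and function names are illustrative assumptions, and the sketch deliberately ignores the 9 residues of read-through added by the frameshift.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    start: int  # first amino acid of the region (inclusive)
    end: int    # last amino acid of the region (inclusive)


# Boundaries of the prototypical ESPRIT constructs, as reported in the text.
REGIONS = [
    Region("D", 257, 383),
    Region("I", 539, 655),
    Region("S", 635, 738),
    Region("C", 691, 836),
]


def classify(region: Region, last_retained: int) -> str:
    """Classify a region relative to a truncation after `last_retained`."""
    if region.end <= last_retained:
        return "intact"    # the whole region precedes the truncation point
    if region.start > last_retained:
        return "lost"      # the whole region follows the truncation point
    return "bisected"      # the truncation point falls inside the region


def effect_of_truncation(last_retained: int) -> dict[str, str]:
    """Map each region name to its status for a given truncation point."""
    return {r.name: classify(r, last_retained) for r in REGIONS}


translocation = effect_of_truncation(597)  # Scottish family breakpoint
frameshift = effect_of_truncation(807)     # C-terminal frameshift mutation

print("translocation:", translocation)
print("frameshift:   ", frameshift)
```

Under these assumptions the translocation leaves only the D region untouched, whereas the frameshift affects the C region alone.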
All four regions, when expressed as recombinant proteins at 0.5 mg/ml, were stable, showing no signs of visible precipitation after 48 h at room temperature (25°C). Furthermore, the S region was stable for at least 1 week and the C region over even longer time periods, even at concentrations of 10 mg/ml. The biophysical characterization of these four regions is described below and is summarized in Table 1.

The D region is an unpredicted dimeric structure in the N-terminal half of DISC1
We define the D region based on two highly similar truncation constructs, corresponding to AA 249-383 and 257-400 of full-length DISC1. These constructs were the only soluble constructs detected in the N-terminal half of DISC1 during the solubility screen. SEC analysis of the purified D region revealed multiple species (Fig. 2A). As these fractions all contained the D region fragment, as assayed by SDS-PAGE, and the elution points were roughly consistent with the different species having molecular weights that were multiples of each other, these were interpreted as representing different oligomeric states of the D region. When eluted peaks were re-analyzed by SEC after 24 h at 4°C, the formerly most abundant species remained the most prominent, although with some shift in equilibrium toward a lower oligomeric state (Fig. 2A). Analytical ultracentrifugation (AUC) sedimentation equilibrium (SE) experiments on the most prominent oligomeric state revealed it to have a molecular mass of 38.8 ± 0.2 kDa, strongly indicative of a dimer (Fig. 2B, the dimer is predicted to be 35.4 kDa). Despite predictions that this region of DISC1 would incorporate a large swathe of protein disorder (8), circular dichroism (CD) of the D region yields a spectrum consistent with a protein of high α-helical content (Fig. 2C).

The I region forms a distinct helical oligomeric species with a high aggregation propensity
We define the I region based on a single ESPRIT construct encoding DISC1 AA 539-655. Like the other three proposed regions, the I region can be expressed in E. coli as a soluble protein with a stable, low-order oligomeric state (Fig. 2D); however, unlike the other three regions there is an apparent upper limit to this, with the purified proteins exhibiting visible precipitation when concentrated above ~30 μM. Multiple other ESPRIT-derived constructs overlap the AA 539-655 segment and similarly precipitate, although these do so immediately upon purification and are found abundantly in the void volume after SEC (Fig. 2E). Of the truncated DISC1 species detected in this region by the solubility screen, the AA 539-655 fragment is, therefore, unique in being expressed as a soluble, stable protein under these conditions, and is therefore defined here as a structured region. SEC shows the I region is dimeric, although AUC SE analysis was not possible due to the upper concentration limit, whereas CD spectroscopy showed it to be predominantly helical (Fig. 2F).

The S region is a highly elongated tetramer, which overlaps with neighboring regions
We define the S region based on a construct encoding DISC1 AA 635-738. It overlaps with both the I and C regions, and is a stable tetramer. Specifically, SEC demonstrates the region to express as multiple different oligomeric states, of which one was consistently seen to be the most abundant. After a 24-h period, SEC of this principal oligomeric state showed it retained its prominence (Fig. 3A). AUC SE analysis shows the preparations of this species have a mean molecular mass of 68.8 ± 2.3 kDa, which is between a tetramer and a pentamer (predicted mass: 55.6 kDa, Fig. 3B). Of these, the tetramer is the more likely state, with the higher apparent molecular mass being the result of larger oligomeric species forming during the concentration of tetrameric species at the bottom of the cell by AUC.
Notably, whereas recombinant I or S regions are soluble and stable when expressed alone, albeit with differing oligomeric states, constructs that span both of these regions are unstable with regard to their oligomerization and show low solubility (Fig. 2E). We, therefore, consider them to be distinct regions, potentially representing alternative conformations of full-length DISC1.

Figure 1. A, the positions and lengths of soluble truncated DISC1 proteins following high-throughput screening, shown relative to the predicted secondary structure of the full-length protein for comparison. Locations of a translocation breakpoint (at amino acid 597) and a frameshift mutation (at 807) both linked to major mental illness are also displayed. Of the soluble proteins, six were subsequently determined to yield consistent, stable proteins. These were interpreted to be representative of four structural regions of the DISC1 protein, which were named D (represented by two clones), I, S (one clone each), and C (two clones). In the full-length DISC1 schematic, dark gray represents sections with a predicted coiled-coil or helix-forming propensity, light gray represents regions predicted to be disordered. White regions were not predicted to be either helical or disordered. B, SDS-PAGE Coomassie-stained gel (left panel) and Western blot (right panel) of DISC1 protein fragments encoded by four of these clones, representing the soluble D, I, S, and C regions. SDS-resistant oligomerization of some of the species is visible on the blot.

Table 1. The four proposed structural regions of DISC1. Amino acid positions of their prototypical ESPRIT constructs are shown, along with the calculated molecular mass (determined using the ExPASy ProtParam tool (33)) for the region alone, and for the His-tagged constructs used in this study (in parentheses).
AUC sedimentation velocity (SV) analysis resulted in a c(s) distribution with a weighted average frictional ratio of 1.8, suggesting that the tetramer is highly elongated, whereas CD confirms it to be mostly helical (Fig. 3C).

The C region is an elongated α-helical monomer, containing some disordered content
We defined the C region of DISC1 based on two constructs encoding AA 691-836 and 684-836 of full-length DISC1. Upon purification, the recombinant C region exists as both a prominent and a minor species according to SEC; however, only the more prominent, lower molecular weight species was stable (Fig. 3D). AUC SE analysis showed this species to have a molecular mass of 20.8 ± 0.2 kDa, strongly indicative of a monomer (Fig. 3E, predicted molecular mass: 20.1 kDa), whereas AUC SV analysis of the species indicated it to possess an elongated shape (weighted average frictional ratio: 1.45). CD showed this region to be helical, with some disordered content (~25% random coil, based on deconvolution using the CDSSTR method in DichroWeb with reference dataset 4 (18-21), Fig. 3F). An additional construct lies within the N-terminal half of the C region, representing AA 718-771 of DISC1. This fragment is also predicted as monomeric by SEC (Fig. 3G), whereas CD revealed a lack of regular secondary structure (Fig. 3H). This section of the C region thus appears to be responsible for the random coil content of the region, and is consistent with the previous prediction of a short unstructured segment in this area (8).
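The oligomeric-state assignments made for the D, S, and C regions all rest on the same comparison: the mass measured by AUC SE divided by the mass expected for one subunit. The following minimal sketch reproduces that arithmetic from the values quoted in the text. It assumes that the 55.6 kDa quoted for the S region is the predicted tetramer mass, and the function and variable names are illustrative only.

```python
def apparent_stoichiometry(measured_kda: float, monomer_kda: float) -> float:
    """Return the apparent number of subunits implied by a measured mass."""
    if monomer_kda <= 0:
        raise ValueError("monomer mass must be positive")
    return measured_kda / monomer_kda


# region: (measured mass in kDa, predicted mass of the proposed oligomer in kDa,
#          proposed number of subunits). Values are taken from the text; reading
#          55.6 kDa as the S region tetramer mass is an assumption.
MEASUREMENTS = {
    "D": (38.8, 35.4, 2),
    "S": (68.8, 55.6, 4),
    "C": (20.8, 20.1, 1),
}

ratios = {}
for name, (measured, predicted, n_subunits) in MEASUREMENTS.items():
    monomer = predicted / n_subunits  # predicted mass of a single subunit
    ratios[name] = apparent_stoichiometry(measured, monomer)
    print(f"{name}: {ratios[name]:.2f} apparent subunits (proposed: {n_subunits})")
```

The D and C regions fall close to their proposed integer states, whereas the S region value lies between four and five, which is why the text argues separately that the tetramer is the more likely state.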

A frameshift mutation implicated in schizophrenia and schizoaffective disorder leads to an aggregation-prone C region
As a proof-of-principle to show how knowledge of the domain architecture of DISC1 can aid in the understanding of mental illness, we considered a frameshift mutation at AA 807 in DISC1, which was previously observed in the case of an American family with schizophrenia and schizoaffective disorders (3). This mutation leads to the addition of 9 amino acids of nonspecific read-through followed by a premature stop codon (3). Notably, this mutation lies within our newly defined C region, and would, therefore, be predicted to disrupt the structure of this folded, soluble region.
Two recombinant C region proteins were therefore expressed in parallel in E. coli, one representing the wild-type sequence and one mimicking the truncation resulting from the frameshift mutation, with the addition of the resulting 9 amino acids and stop codon deriving from it. Although this mutant protein still expressed at 50% the level of the wild-type protein (Fig. 4A), its solubility decreased by 90% (Fig. 4B), due to insoluble, and potentially unfolded, protein being deposited in inclusion bodies. To investigate this further, the inclusion bodies of bacteria expressing wild-type or the frameshift-carrying DISC1 C region were denatured with urea buffer and the recombinant C regions were then purified and refolded in vitro. The ensuing mutant C region elutes earlier in SEC than the wild-type version, despite its smaller number of amino acids (Fig. 4, C and D). When analyzed further in AUC SE experiments, the mutant form was confirmed to exist as a larger protein complex, consistent with aberrant multimerization (average molecular masses were determined through AUC: wild-type 20.5 kDa, mutant 25 kDa), but also to have a higher aggregation propensity (Fig. 4, E and F). Finally, the mutant C region showed a lower level of α-helix forming propensity than the wild-type (Fig. 4G). The secondary structure signal detected by CD is lost over time when stored at room temperature, potentially due to precipitation, unlike the wild-type protein, which remained a structurally stable entity under the same conditions.
Together, these experiments show that the effect of the frameshift mutation on the C region structure is sufficient to induce aberrant oligomer formation, ultimately leading to the formation of aggregating protein complexes, despite the absence of segments that cause the region to form oligomers in its wild-type state.

Discussion
Understanding the structure of a protein is an important step toward understanding its function; however, progress on DISC1 has been hampered by insolubility of the recombinant protein and a lack of understanding of its domain architecture. Here, we have addressed these technical difficulties by using the ESPRIT library screening technique to scan for readily soluble and stable regions within the DISC1 protein. In previous experiments employing the ESPRIT technique, generally employed to investigate enzymes, distinct globular domains were determined: regions that were identified by multiple different ESPRIT constructs that varied only by a small number of amino acids encoded at the 5′ and/or 3′ ends of the cDNA (13,15-17). These constructs represented compactly folded domains with hydrophobic cores, with similar variants having only small unstructured terminal extensions. In contrast, and reflecting the nature of DISC1 as a primarily helical scaffold protein (5,8), this screen, when coupled to purification and biophysical analysis, instead identified four helical regions within DISC1 as being both soluble and structured. We have defined these as the D, I, S, and C regions and propose them to represent distinct, structured regions of the human DISC1 protein, which coincide with established binding sites for several of the key protein interaction partners of DISC1 (Fig. 5A).
Additionally, in light of this proposed domain architecture, some previous work into DISC1 must be re-evaluated, where fragments of the protein defined arbitrarily or based on bioinformatic predictions were expressed, to determine how they relate to this architecture. Protein fragments containing only part of these structured regions could, for example, appear to show loss or gain of function as a result of aberrant, or a lack of, folding.

Figure 4. A pathological frameshift mutation disrupts the stability, oligomeric state, and secondary structure of the C region. A, quantified, relative level of DISC1 expression in bacterial cell lysates expressing either the wild-type DISC1 C region (AA 691-836) or a mutant version representing a frameshift mutation associated with mental illness (leading to truncation of the protein after AA 807, followed by 9 AA of nonspecific read-through), n = 6; *, p < 0.05. B, following centrifugation of these lysates, the wild-type protein remains almost entirely in the supernatant, indicating solubility, whereas much of the mutant protein is instead lost in the pellet, indicating insolubility. n = 6; *****, p < 10⁻⁶. C, comparison of the two proteins at two different concentrations (indicated in the figure), following refolding from a urea-solubilized pellet, by SEC, showing the mutant form to preferentially form higher molecular weight complexes. D, this effect is enhanced with increasing protein concentration. E, AUC SV results for the wild-type C regions obtained at 50,000 rpm (181,675 × g), 20°C. Sedimentation profiles (dots) recorded over time from purple (initial scan) to red (final, scans taken at 3 min intervals) are shown together with the c(s) fit results (lines). Below the graphs, the corresponding residuals are shown. Both AUC experiments were performed using 33 μM protein. F, equivalent results for the mutant C regions, showing an increased propensity to form insoluble aggregates visible in the upper part of the sedimentation boundary as quickly sedimenting material. G, comparison of the CD spectra of the wild-type and mutant C regions, at a concentration of 5 μM each, showing the mutant to have reduced α-helical content, which decreases further over time stored at room temperature, whereas there is no obvious similar loss of structure for the wild-type.
A caveat to this work, however, is the extent to which protein expressed in large quantities in E. coli would be representative of the low levels of endogenous full-length human DISC1 expressed in vivo. That the four described regions are soluble even at very high concentrations, combined with their stable oligomeric state and secondary structure, argues in favor of them representing distinct stable structural elements in DISC1; however, it is possible that others also exist that are not detectable by this approach. For example, the lack of other structured regions in the N-terminal half supports the idea of this section of the protein being unstructured, although the presence of the D region requires the redefining of the extent of this unstructured region as only stretching from the N terminus to approximately AA 250.
The possibility must also be considered, however, that other structural elements exist that are not easily transcribed, translated, or folded in E. coli, but which may be seen in mammals. Furthermore, it is likely given the role of DISC1 as a scaffold protein with many interaction partners (5) that DISC1 would not naturally exist in vivo outside of complexes with protein binding partners. Those regions of DISC1 that are unstable in vitro most likely gain stability in vivo immediately after translation in the cell through interaction with binding partners. It is possible that such early interactions may direct individual DISC1 molecules toward specific cellular functions, through establishing its oligomeric state and setting a tertiary structure, limiting the number of potential other protein interaction sites that are exposed on the surface of the protein.
It is notable that, when expressed in isolation, the individual structured regions of DISC1 display differing oligomeric states. This is particularly striking in the case of the overlapping dimeric I region and tetrameric S region, whereas constructs overlapping the two show mixed characteristics of both. We therefore put forward the hypothesis based on our data that at least two different structural conformations of full-length DISC1 could exist, based on which of the dimeric I or tetrameric S region is dominant as the intermolecular interaction domain (Fig. 5A, Table 1). The first of these would have at its core a dimeric I region, from which the monomeric C region and N-terminal sections extend, corresponding to a simple dimeric model (Fig. 5B), consistent with the dimer of full-length DISC1 observed by Narayanan et al. (10). The second would instead have a tetrameric core, based on that seen with the S region construct, and therefore represent a higher oligomeric state of DISC1 (Fig. 5C). This would likely represent the basis of the reported octameric species of DISC1 (10), with the additional oligomericity potentially arising through association of the D regions, through interactions between different regions, or possibly with additional structural elements of DISC1 not detectable in this screen.
Of the four regions described here, the I region is the least stable, existing in solution at concentrations up to ~30 μM, but precipitating to insoluble aggregates above that. Longer ESPRIT-derived constructs that include the I region all show a strong tendency to form higher molecular weight aggregate complexes as well. Such insolubility, if also present in vivo, may contribute to the formation of the insoluble DISC1 aggregate species found in brains of a subset of patients with mental illness (22,23) or a transgenic rat overexpressing full-length DISC1 (24).
A consequence of our proposed domain structure is that the sequence encoding the I region would be bisected by the Scottish translocation (1), meaning that any protein translated from this locus would likely have an incorrectly folded I region and lack completely the highly stable S and C regions, the former of which would likely be critical for its higher-order oligomerization. A translocation-derived protein would thus be highly unstable, in addition to effects due to loss of protein-protein interaction sites.

Figure 5. Summaries of the domain structure and oligomerization of DISC1. A, schematic of the domain structure for DISC1 proposed in this paper, with the I and S regions representing alternative configurations. Approximate locations at which known protein interaction partners bind to DISC1 are indicated in red, where these have been mapped at high precision to within one of the four regions (34-40), although this does not necessarily imply that they would interact with only one structural configuration of DISC1. B, schematic of a simple domain configuration for a DISC1 dimer, with the I and D regions both forming dimers. C, equivalent schematic for a DISC1 tetramer based on the S region, in which the D regions could form dimers or a tetramer. Note: these figures are one-dimensional and do not take into account the folding of the structural regions or the potential for interaction between these regions.
Finally, we investigated the structural consequences of a frameshift mutation in the DISC1 gene, which has been linked to major mental illness in a family (3) and causes synaptic vesicle deficits via depletion or dysregulation of the expression of several synapse-related genes in human forebrain neurons (25). In neurons generated using induced pluripotent stem cells from this family, the total level of DISC1 protein was reduced, suggesting ubiquitin-mediated degradation of the mutant protein, despite the frameshift protein containing almost 95% of the DISC1 reading frame. The domain structure described here provides a partial explanation for this effect, as the frameshift directly disrupts the C region of DISC1. Furthermore, when the frameshift mutation was simulated in a recombinant C region protein, it led to the aberrant and higher oligomerization of the region (Fig. 4E) relative to a wild-type version, and to a tendency to unfold and aggregate which, during early protein genesis, could lead to ubiquitin-mediated protein degradation as described previously (25).
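The statement that the frameshift protein retains almost 95% of the reading frame follows directly from the residue numbers given in the text. A minimal sketch of that arithmetic is shown here, with the translocation breakpoint included for comparison; the function and variable names are illustrative assumptions.

```python
FULL_LENGTH = 854  # amino acids in full-length human DISC1, as stated in the text


def retained_fraction(last_native_residue: int, full_length: int = FULL_LENGTH) -> float:
    """Fraction of the native reading frame preceding a truncating lesion."""
    if not 0 < last_native_residue <= full_length:
        raise ValueError("residue number must lie within the protein")
    return last_native_residue / full_length


frameshift_fraction = retained_fraction(807)     # frameshift after AA 807
translocation_fraction = retained_fraction(597)  # translocation after AA 597

# The frameshift adds 9 residues of read-through before the premature stop codon.
mutant_length = 807 + 9

print(f"frameshift retains {frameshift_fraction:.1%} of the native sequence")
print(f"translocation retains {translocation_fraction:.1%} of the native sequence")
print(f"frameshift protein length: {mutant_length} amino acids")
```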
In this work, we have defined for the first time structural domains of human DISC1 based on experimental evidence, thus laying the foundations for understanding the functional architecture of the protein. Specifically, we revealed that the disordered N-terminal region of DISC1 is followed by a dimeric D region. Following a central stretch that is unstable at least when expressed in E. coli, presumably requiring either protein-binding partners or mammalian-specific factors to facilitate its folding, lie the overlapping I and S regions, which drive the oligomeric state of DISC1. Finally, the extreme C terminus harbors the monomeric, helical C domain, which is seemingly involved in protein-protein interactions. Although other structural domains, not amenable to expression in E. coli, may be added with time, this framework nevertheless has already provided insight into the mechanisms by which DISC1 is disrupted in major mental illness and will be a powerful resource for future studies into its structure and function.

Construct generation
A cDNA encoding full-length human DISC1 (RefSeq accession number NP_061132.2) was codon optimized for expression in E. coli (GeneArt, Thermo Fisher Scientific, Regensburg, Germany) and then cloned between the AscI and NotI sites of the pET-derived pESPRIT002 vector (26) for use in the ESPRIT screen. Ensuing constructs had their BAP tags removed by BspEI enzyme digestion prior to biophysical characterization. Additional subregions of DISC1 were subcloned from this vector and inserted into the pESPRIT vector at its AatII and BspEI sites. The identity of all constructs was confirmed by sequencing.

ESPRIT
High-throughput screening of DISC1 for soluble domains by incremental truncation was performed as described in detail previously (26). Briefly, a construct encoding DISC1 with N-terminal His6 tags and C-terminal BAP tags was linearized at the 3′ end of the gene by restriction digest and truncated in a 3′ to 5′ direction with Exonuclease III and mung bean nuclease (both from New England Biolabs, Évry, France) and the resulting blunt ends polished with Pfu polymerase (Agilent Technologies, Les Ulis, France). The linear plasmid library of truncated constructs was religated back to a circular plasmid, then recovered by transformation of E. coli MACH1 (Thermo Fisher Scientific, Villebon-sur-Yvette, France), plating, and subsequent plasmid preparation from pooled colonies. This plasmid mixture was then similarly truncated from the 3′ end. The linear plasmid library was electrophoresed on an agarose gel and plasmids containing constructs encoding 150-400 amino acid DISC1 fragments were excised, ligated, and recovered into E. coli MACH1. DNA sequencing of randomly selected plasmid inserts revealed inserts distributed along the entire length of DISC1. E. coli BL21 AI (Thermo Fisher Scientific) were transfected with the library and 27,642 clones picked robotically into 72 × 384-well plates were grown, then arrayed in duplicate onto nitrocellulose membranes (GE Healthcare, Vélizy-Villacoublay, France) over LB agar with antibiotics. Clones were grown, then induced by transfer to fresh agar trays containing 50 μM biotin and 0.2% (w/v) arabinose for 4 h. Colonies on membranes were then lysed on NaOH-soaked filter paper, neutralized in buffer (27), blocked with Superblock (Thermo Fisher Scientific), and probed using anti-His (GE Healthcare) and rabbit anti-mouse Alexa Fluor 532 conjugate (Thermo Fisher Scientific) to detect the N terminus, and streptavidin Alexa Fluor 488 (Thermo Fisher Scientific) to detect the C terminus.
The 3545 clones with the highest anti-His signal were then ranked for in vivo biotinylation efficiency based upon the streptavidin signal, and the 94 clones with the highest signal were selected for expression and nickel-NTA purification analyses in larger scale cultures.
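The two-stage clone selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Clone` record and the `select_clones` function are hypothetical names, each clone is assumed to carry one anti-His and one streptavidin intensity (for example, the mean of the duplicate spots), and ranking is assumed to use the raw streptavidin signal. The actual image analysis and scoring used in the screen are described in reference 26 and are not reproduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Clone:
    """Hypothetical record for one arrayed colony."""
    clone_id: str
    his_signal: float    # anti-His fluorescence (N-terminal tag)
    strep_signal: float  # streptavidin fluorescence (biotinylated C-terminal BAP tag)


def select_clones(clones: List[Clone], n_his: int = 3545, n_final: int = 94) -> List[Clone]:
    """Keep the n_his clones with the highest anti-His signal, then
    return the n_final of those with the highest streptavidin signal."""
    # Stage 1: filter on the N-terminal tag signal.
    his_positive = sorted(clones, key=lambda c: c.his_signal, reverse=True)[:n_his]
    # Stage 2: rank the survivors by the C-terminal biotinylation signal.
    ranked = sorted(his_positive, key=lambda c: c.strep_signal, reverse=True)
    return ranked[:n_final]
```

The defaults mirror the numbers given in the text (3545 and 94); both can be overridden for smaller test libraries.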

Recombinant soluble protein expression and purification
Plasmid vectors were transformed into BL21 AI cells, which were grown in Terrific Broth. Protein expression was induced by the addition of 0.2% L-arabinose and 1 mM isopropyl 1-thio-β-D-galactopyranoside for 16 h at 25°C. Bacterial pellets were stored at −80°C and lysed by incubation in 25 mM Tris, pH 7.4, 150 mM NaCl, 5 mM imidazole, 1 mM DTT, 0.5% Triton X-100, 1 mM MgCl2 containing lysozyme, DNase I, and protease inhibitors at room temperature. Insoluble material was pelleted by centrifugation at 12,000 × g for 45 min. The soluble fraction was then incubated with nickel-NTA (Qiagen, Hilden, Germany) for 45 min at room temperature and washed with 25 mM Tris, pH 7.4, 150 mM NaCl, 20 mM imidazole, 1 mM DTT. Protein was eluted with the same buffer containing 500 mM imidazole and then further purified by SEC. Where necessary, proteins were concentrated using Amicon Ultra centrifugation devices (Merck Millipore, Darmstadt, Germany). All proteins used for experiments were at least 95% pure.

Denaturing protein purification from the inclusion bodies and protein refolding
Plasmid vectors were transformed into E. coli BL21 AI cells, which were grown in Terrific Broth. Protein expression was induced by the addition of 0.2% L-arabinose and 1 mM isopropyl 1-thio-β-D-galactopyranoside for 6 h at 37°C. Bacterial pellets were stored at −80°C and lysed by incubation in 25 mM Tris, pH 7.4, 150 mM NaCl, 5 mM imidazole, 1 mM DTT, 0.5% Triton X-100, 1 mM MgCl2 containing lysozyme, DNase I, and protease inhibitors at room temperature. Insoluble material was pelleted by centrifugation at 12,000 × g for 45 min. The pellet was dissolved in 25 mM Tris, pH 7.4, 150 mM NaCl, 5 mM imidazole, 1 mM DTT, 8 M urea, incubated with nickel-NTA (Qiagen) for 45 min at room temperature, and washed with 25 mM Tris, pH 7.4, 150 mM NaCl, 20 mM imidazole, 1 mM DTT, 8 M urea. Protein was eluted with the same buffer containing 500 mM imidazole. The eluted protein was refolded by dialysis in three steps over a period of 16–18 h. The refolded protein was then further purified by SEC.

Gels and Western blots
Protein samples were denatured in Laemmli buffer and run on SDS-polyacrylamide gels, which were then either stained directly with SYPRO Ruby Protein Gel Stain (Thermo Fisher Scientific) or InstantBlue Protein Stain (Expedeon, Swavesey, UK), or transferred to nitrocellulose membranes. Membranes were probed with primary antibodies against the His6 tag and DISC1 (14F2, which was raised against a peptide found in the C region and has been described previously (28)), as well as with fluorescently labeled streptavidin to detect the BAP tag. Protein signals were then detected using IRDye secondary antibodies (LI-COR Biosciences, Bad Homburg, Germany) on an Odyssey CLx infrared imaging system (LI-COR Biosciences).

Size exclusion chromatography
SEC was performed on an ÄKTA Pure system (GE Healthcare, Freiburg, Germany) cooled to 4–8°C, using a HiLoad 16/600 Superdex 200pg column (GE Healthcare) in 25 mM Tris, pH 7.4, 150 mM NaCl, 1 mM tris(2-carboxyethyl)phosphine. In some instances, to check the stability of the oligomeric state, specific eluted protein fractions were stored at 4°C for 24 h and then subjected to SEC again under the same conditions.

Circular dichroism
For circular dichroism experiments, protein was desalted into 25 mM sodium phosphate, pH 7.0, 150 mM NaF, 1 mM tris(2-carboxyethyl)phosphine. Circular dichroism spectroscopic measurements were carried out on a JASCO J-815 spectrometer (JASCO, Gross-Umstadt, Germany). A 1-mm optical path length cuvette was used. The temperature was controlled at 20°C. Spectra were recorded from λ = 260 to 185 nm at 1 nm resolution, 50 nm/min scan speed, and an integration time of 0.5 s. For signal improvement, 10 accumulations were averaged. The obtained spectra were transformed to mean residue ellipticity after subtraction of the buffer spectra. Deconvolution of the data was performed on DichroWeb (18,19) using the CDSSTR method and reference dataset 4 (20,21).
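As an illustration of the buffer subtraction and conversion to mean residue ellipticity, the following Python sketch applies the standard relation [θ] = θ / (10 · l · c · n), with θ in millidegrees, path length l in cm, molar protein concentration c, and n the number of peptide bonds, giving units of deg cm² dmol⁻¹. The function name and arguments are hypothetical; the 0.1 cm default reflects the 1-mm cuvette stated above, whereas protein concentrations are not given in the text and must be supplied by the user. Taking n as the number of residues minus one is an assumption, since some workflows use the residue count itself.

```python
from typing import List, Sequence


def mean_residue_ellipticity(
    sample_mdeg: Sequence[float],
    buffer_mdeg: Sequence[float],
    conc_molar: float,
    n_residues: int,
    path_cm: float = 0.1,  # 1-mm cuvette, as stated in the text
) -> List[float]:
    """Convert raw CD signal (mdeg) to mean residue ellipticity
    (deg cm^2 dmol^-1) after point-by-point buffer subtraction."""
    if len(sample_mdeg) != len(buffer_mdeg):
        raise ValueError("sample and buffer spectra must share the same wavelength grid")
    if n_residues < 2 or conc_molar <= 0 or path_cm <= 0:
        raise ValueError("need at least 2 residues and positive concentration and path length")
    n_bonds = n_residues - 1  # assumption: normalize per peptide bond
    scale = 10.0 * path_cm * conc_molar * n_bonds
    return [(s - b) / scale for s, b in zip(sample_mdeg, buffer_mdeg)]
```

The resulting values are in the units that deconvolution servers such as DichroWeb accept as mean residue ellipticity input; the wavelength grid itself is carried separately.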

Analytical ultracentrifugation
Sedimentation velocity centrifugation experiments at 50,000 rpm and 20°C were carried out in a Beckman Optima XL-A (Beckman-Coulter, Brea, CA), equipped with absorption optics and a four-hole rotor. Samples (volume 400 µl) were pipetted into standard aluminum double sector cells with quartz glass windows. Measurements were performed in absorbance mode at a detection wavelength of 230 nm. Radial scans were recorded with 30-µm radial resolution at ~1.5 min intervals. The software package SEDFIT version 14.1 was used for data evaluation. After editing, time-invariant noise was calculated and subtracted. Continuous sedimentation coefficient distributions c(s) were determined with 0.05 S resolution and F ratio = 0.95. Suitable ranges were chosen for the s-values (between 0 and 20 S) and for f/f0 (between 1 and 4). Buffer density and viscosity were calculated with SEDNTERP version 20111201 beta (29). The partial specific volumes of the DISC1 protein fragments were calculated according to the method of Cohn and Edsall (30,31) as implemented in SEDNTERP.
Sedimentation equilibrium experiments were performed in standard aluminum double sector cells with quartz glass windows. Equilibria were established at multiple speeds. After equilibrium was reached, concentration profiles were recorded with 10-µm radial resolution and averaging of seven single registrations per radial value. Data evaluation was performed using SEDPHAT version 10.55b (32).
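For orientation, the relation underlying sedimentation equilibrium analysis can be sketched for the simplest case of a single ideal species, where c(r) = c(r0) · exp[M(1 − v̄ρ)ω²(r² − r0²) / (2RT)]. The Python sketch below simulates such a profile and recovers the molar mass from the slope of ln c against r². It is not the SEDPHAT analysis used here, which fits more general models across multiple speeds; all function names are hypothetical, and any numerical inputs (rotor speed, partial specific volume, buffer density) are illustrative values to be supplied by the user rather than values taken from this study.

```python
import math
from typing import List, Sequence

R_GAS = 8.314462618  # gas constant, J mol^-1 K^-1


def _omega(rpm: float) -> float:
    """Angular velocity in rad/s."""
    return 2.0 * math.pi * rpm / 60.0


def equilibrium_profile(radii_cm: Sequence[float], c0: float, r0_cm: float,
                        molar_mass_g_mol: float, vbar_ml_g: float,
                        rho_g_ml: float, rpm: float, temp_k: float) -> List[float]:
    """Concentration profile of one ideal species at sedimentation equilibrium."""
    buoyant_kg_mol = molar_mass_g_mol * 1e-3 * (1.0 - vbar_ml_g * rho_g_ml)
    k = buoyant_kg_mol * _omega(rpm) ** 2 / (2.0 * R_GAS * temp_k)  # m^-2
    r0_sq = (r0_cm * 1e-2) ** 2
    return [c0 * math.exp(k * ((r * 1e-2) ** 2 - r0_sq)) for r in radii_cm]


def molar_mass_from_profile(radii_cm: Sequence[float], conc: Sequence[float],
                            vbar_ml_g: float, rho_g_ml: float,
                            rpm: float, temp_k: float) -> float:
    """Estimate molar mass (g/mol) from the least-squares slope of ln c vs r^2."""
    xs = [(r * 1e-2) ** 2 for r in radii_cm]  # r^2 in m^2
    ys = [math.log(c) for c in conc]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    buoyancy = 1.0 - vbar_ml_g * rho_g_ml
    return 2.0 * R_GAS * temp_k * slope / (buoyancy * _omega(rpm) ** 2) * 1e3
```

A curved ln c versus r² plot, or a fitted mass that changes with rotor speed, is the usual sign that this single-species assumption does not hold, which is one reason multi-speed data are analyzed with global models.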