Repairing the Sickle Cell Mutation

A DNA third strand with a 3′-psoralen substituent was designed to form a triplex with the sequence downstream of the T·A mutant base pair of the human sickle cell β-globin gene. Triplex-mediated psoralen modification of the mutant T residue was sought as an approach to gene repair. The 24-nucleotide purine-rich target sequence switches from one strand to the other and has four pyrimidine interruptions. Therefore, a third strand sequence favorable to two triplex motifs was used, one parallel and the other antiparallel to it. To cope with the pyrimidine interruptions, which weaken third strand binding, 5-methylcytosine and 5-propynyluracil were used in the third strand. Further, a six residue “hook” complementary to an overhang of a linear duplex target was added to the 5′-end of the third strand via a T4 linker. In binding to the overhang by Watson-Crick pairing, the hook facilitates triplex formation. This third strand also binds specifically to the target within a supercoiled plasmid. The psoralen moiety at the 3′-end of the third strand forms photoadducts to the targeted T with high efficiency. Such monoadducts are known to preferentially trigger reversion of the mutation by DNA repair enzymes.

Nucleic acid third strand binding provides an effective and specific means of directing reagents to unique sites in a complex duplex genome (1-4). When the reagent linked to the third strand allows covalent attachment to a mutated base pair, the resultant adduct may be recognized as a defect to be corrected by some DNA repair mechanism. Error-prone repair in such a situation can lead to correction of the mutation. This approach has been utilized to effect specific base pair changes following psoralen photoadduct formation site-directed by specific sequence third strands (5-8). In particular, the use of irradiation times and wavelength favorable for monoadduct formation has been shown to induce the transversion T·A → A·T (5).
To make such a third strand-mediated strategy for site-specific base pair change viable, it is necessary to overcome two primary barriers. One stems from the fact that ideal sites for spontaneous third strand binding, i.e. perfect homopurine·homopyrimidine target sequences 15-25 base pairs long, occur with limited frequency. The second barrier is that in the face of imperfect targets, third strand binding energy is often insufficient to permit stable triplex formation.
We have exploited the triplex-based approach to develop a strategy for correcting the mutation that underlies human sickle cell anemia, which is due to an A·T → T·A transversion in the gene for the β-globin chain located on chromosome 11 (9). This mutation occurs immediately upstream of a 24-nt purine-rich sequence that consists of two adjacent regions located on opposite strands, the longer downstream one containing four base pair inversions. These encumbrances make it a difficult target for third strand binding, requiring a strand-switching "cross-over" strategy (10), the use of modified third strand residues and, when this target is contained in a linear duplex fragment, a duplex-forming hook (Fig. 1b) to provide sufficient third strand binding energy to enable triplex-mediated psoralen photoaddition in high yield precisely at the mutant base pair.

EXPERIMENTAL PROCEDURES
Deoxyoligonucleotides-These were synthesized by automated phosphoramidite chemistry, purified to homogeneity by denaturing PAGE, and recovered from gel slices via the "modified crush-and-soak" method (11). Final purification was by acetonitrile/water (50/50) elution from C18 Sep-Pak reverse phase columns (Millipore), followed by spin evaporation to dryness. Oligomer concentrations were adjusted spectrophotometrically in Milli-Q-purified water. Psoralen was attached to the 3′-end of designated oligomers by incorporating the phosphoramidite of psoralen CPG (containing a C-16 linker between psoralen C-3 and DMT) (Chemgenes Corp., Waltham, Massachusetts) during oligomer synthesis. Oligomer homogeneity was ascertained by 32P-5′-end-labeling and denaturing PAGE. Oligomers were 32P-5′-end-labeled preparatively by incubating approximately 4 pmol of oligomer with 2 mCi [γ-32P]ATP (Amersham Pharmacia Biotech) and 1 unit of T4 polynucleotide kinase (U. S. Biochemical Corp.) at 37°C for 1 h, purified by denaturing PAGE, and eluted from C18 Sep-Pak columns.
Triplexes-Duplexes were formed by mixing equimolar amounts of each strand, heating to 80°C and slowly annealing to room temperature in the standard buffer. Various ratios of third strand were added to pre-formed duplex, and resulting mixtures were incubated for 1 h at room temperature, and then at 4°C overnight.
Electrophoresis-Denaturing PAGE was performed on 25 × 45-cm slabs using 8 M urea, TBE, 16% polyacrylamide gels (1:37 bisacrylamide/acrylamide). Samples were dissolved in denaturing loading buffer (12) and heated to 80°C before loading. Denaturing gels were run for 2-6 h at 1.5 kV at room temperature. Nondenaturing PAGE was conducted on a 20 × 20-cm slab of 15% polyacrylamide in the standard buffer (see above) and run at 4°C for 12 h at 150 V. Denaturing gels were soaked in standard fixing solution (12), then in 100% methanol (13), and dried onto Whatman filter paper on a Bio-Rad gel dryer. Gels were visualized using NEN Life Science Products x-ray film, an ImageQuest PhosphorImager, or both; the PhosphorImager was also used to quantitate gel bands.
* This work was supported in part by fellowships from Codon Pharmaceutical, Inc. and Oncor, Inc. (to O. A. and N. G. D., respectively), and by Grant DE-FG02-96-ER62202.A001 from the Department of Energy.
DNase I Footprinting-Triplexes with 32P on the 5′-end of the D-2 strand (0.5 µM duplex) were mixed with 0.6 units of DNase I (U. S. Biochemical Corp.) in a final volume of 8 µl and incubated at 8°C in the standard buffer. At designated times, 2-µl aliquots were removed to 2 µl of denaturing gel-loading buffer, frozen in a dry ice/ethanol bath, and analyzed by denaturing PAGE. After fixing and drying on filter paper, gels were imaged and quantitated. Band assignments of DNase I digests were determined from sequencing ladders (see below) run in lanes adjacent to lanes with DNase I-treated samples.
UV Irradiation-Triplexes in 10-µl droplets in appropriate buffer were placed on a parafilm-covered glass plate lying on ice and positioned 10 cm under a UV light source ("Blac Ray", 15 mW/cm² at 365 nm). Thermocouple readings showed that sample temperature was maintained at ~8°C. During irradiation, the ice container was placed on a rotating turntable to ensure even exposure of all samples to UV light. At indicated times, 2-µl aliquots were added to 2 µl of denaturing gel-loading buffer, and after mixing and heating to 80°C were analyzed by denaturing PAGE.
Sequenase Primer Extension-UV-irradiated complexes were fractionated by denaturing PAGE. After extraction from gel slices, photoproducts were purified on C-18 columns and utilized as templates for primer extension by Sequenase 2.0 (U. S. Biochemical Corp.). Gel-purified 32P-5′-end-labeled 8-nt primers complementary to the 3′-end of D-1 were annealed to the template by heating to 80°C and slowly cooling to 8°C. Primer extension reactions were carried out according to Sequenase 2.0 kit instructions (14), and reaction products were analyzed by denaturing PAGE. Gels were fixed, dried, and analyzed by autoradiography and phosphorimaging.
Plasmid Binding-A plasmid pSCe was constructed by inserting a 610-base pair sickle cell β-globin gene fragment surrounding the mutation site into the pBluescript vector (Stratagene). 32P-end-labeled PsT-2 and PsT-3 were incubated at 20°C for 1 h with 10⁻⁸ M pSCe and/or pBluescript (vector without the insert) at a 10:1 molar ratio in the standard buffer and then at 4°C overnight prior to irradiation. Irradiated plasmids were electrophoresed on agarose gel in TBE under conditions where nonphotoattached third strand dissociates from the target, and the gels were evaluated by ethidium bromide staining, autoradiography, and phosphorimaging.

RESULTS
Experimental Plan-Fig. 1, a and b, show the 38- and 35-base pair linear target fragments, DL-1·DL-2 and D-1·D-2, respectively, of the β-globin gene sequence of human chromosome 11 used in this investigation. D-1·D-2 is a version of DL-1·DL-2 shortened at the downstream end. Residue T11 on strand DL-1/D-1 (shown in bold) is the consequence of the A → T transversion responsible for sickle cell anemia. Residues 1-17 on strand D-2 (4-21 on DL-2) and 12-19 on D-1/DL-1 comprise the 24-nt purine-rich third strand binding region that was targeted for triplex formation. The one T and three C residues within the purine-rich sequence of the D-2 binding domain significantly decrease the effectiveness of third strand binding to the target. Our original approach was a strand-switching third strand, PsT-1, with an acridine intercalator added at the 5′-end (Fig. 1a). The triplex-forming domain of PsT-1 was designed to bind to the DL-2 target segment in the pyrimidine/parallel motif, and then cross over to bind to the DL-1 segment in the GT/antiparallel motif. This design allows the psoralen moiety at the 3′-end of PsT-1 to be positioned directly opposite the mutant T11 residue of DL-1. To enhance the association to this target, pr5U² was positioned opposite the three C·G inversions, G opposite the T·A inversion (15,16), and me5C was substituted for C in the third strand opposite G·C pairs (17). Despite these various modifications, binding was not sufficiently enhanced (see below).
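The residue choices described for the pyrimidine/parallel segment of the third strand can be summarized as a simple lookup, sketched here in Python. This is an illustration only, not the procedure used to design PsT-1: the function names and the toy target sequence are hypothetical, the actual β-globin target sequence is not reproduced, and the T-opposite-A·T entry is the standard T·A·T triplet assumed here rather than stated in the text.

```python
# Illustrative sketch only: residue choices for the pyrimidine/parallel motif.
# Keys are bases read along the purine-rich target strand; values are the
# third-strand residues placed opposite them. Three entries restate the
# substitutions described in the text; "A" -> "T" is an assumption (the
# standard T.A.T triplet).
PARALLEL_MOTIF_RULES = {
    "A": "T",     # A.T pair (assumed standard triplet)
    "G": "me5C",  # G.C pair: 5-methylcytosine replaces C
    "C": "pr5U",  # C.G inversion: 5-propynyluracil
    "T": "G",     # T.A inversion: G
}


def design_parallel_segment(purine_rich_strand):
    """Return third-strand residues, in the same 5'->3' order as the target strand."""
    residues = []
    for base in purine_rich_strand.upper():
        if base not in PARALLEL_MOTIF_RULES:
            raise ValueError(f"unexpected base {base!r} in target strand")
        residues.append(PARALLEL_MOTIF_RULES[base])
    return residues


def count_pyrimidine_interruptions(purine_rich_strand):
    """Count C and T residues interrupting the purine-rich target strand."""
    return sum(base in "CT" for base in purine_rich_strand.upper())


if __name__ == "__main__":
    toy_target = "GAAGCGTAAG"  # hypothetical sequence, not the beta-globin target
    print("-".join(design_parallel_segment(toy_target)))
    print(count_pyrimidine_interruptions(toy_target))
```

The GT/antiparallel segment that follows the cross-over would need its own rule table and a reversed reading direction, which this sketch deliberately leaves out.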
The strategy ultimately employed to achieve more effective third strand binding to the target in a linear duplex fragment is depicted in Fig. 1b. Additional binding energy was designed into the system by truncating the 3′-end of the DL-1 target strand (resulting in the D-1 strand) to produce a 6-nt single-stranded "sticky" 5′-end of D-2. The 5′-end of PsT-1 (third strand) was then elongated in complementary fashion (via a linker of four T residues) to produce a duplex-forming hook to bind to the D-2 sticky end (Fig. 1b), resulting in third strand PsT-2. The combined strategy of using stronger binding modified residues, crossing-over of the third strand, and a duplex-forming hook proved successful in forming the structure in Fig. 1b. PsT-3, a scrambled sequence containing the 5′-duplex binding hook and 3′-psoralen, was used as a negative control (Fig. 1c).

FIG. 1. a, PsT-1 contains modified residues (mC = me5C, pU = pr5U) in the triplex-forming region that binds to DL-2 in the pyrimidine/parallel motif and then crosses over to bind to DL-1 in the GT/antiparallel motif, terminating with a 3′-psoralen moiety positioned opposite the mutant residue T11 on DL-1. An acridine intercalator moiety is added to the 5′-end of PsT-1 to strengthen third strand binding. b, triplex structure formed by association of strand PsT-2 and duplex D-1·D-2, which are shortened versions of DL-1 and DL-2, respectively. The 3′-end of DL-1 is truncated by 6 nt to create a target on D-2 for a duplex-forming hook linked to PsT-1 to create third strand PsT-2. The hook, complementary to the 5′-end of D-2, is followed by a (T)4 linker and then the same triplex-forming sequence containing modified residues (mC = me5C, pU = pr5U) that binds to duplex segment C. PsT-2 then crosses over to bind along strand D-1 of duplex segment B, terminating with a 3′-psoralen moiety positioned opposite the mutant residue T11 on D-1. Duplex segment A should not be protected from nuclease digestion by third strand binding, whereas segments B and C are potential third strand binding targets that should be protected. c, negative control third strand PsT-3, containing the same 6-nt duplex-forming hook and (T)4 linker as PsT-2, followed however by a scrambled sequence that should not form triplex.
² O. Amosova and J. R. Fresco, unpublished.
Third Strand Binding-Band-shift assays at 4°C evaluated by native PAGE were used to assess third strand binding. Fig. 2 shows that both PsT-2 and PsT-3 induce a band-shift of 32P-labeled target duplex D-1·D-2, whereas PsT-1, the hookless third strand, does not with DL-1·DL-2. This indicates that the binding energy between the sticky end of D-2 and the 6-nt hook of either PsT-2 or PsT-3 is sufficient to form a stable complex with apparent triplex stoichiometry under the electrophoretic conditions, whereas that between the triplex-forming domain alone and the duplex target is not. However, the results with PsT-2 and PsT-3 do not discriminate between true triplexes and complexes in which the third strand is merely bound by the hook. Note also that the mobility of the PsT-3-containing complex is somewhat lower, which may be a consequence of the "dangling" third strand making the complex much less compact than a (presumably) true triplex formed by PsT-2. Furthermore, electrophoretic analysis at 25°C shows no band-shift for PsT-3, whereas PsT-2 does form a complex (data not shown). This is consistent with the hook melting by that temperature from a complex that is not a triplex.
DNase I protection experiments (18) were performed to discriminate between triplexes, which should be relatively protected, and complexes formed merely by the third strand hook, which should be sensitive to the enzyme. Fig. 3a shows the results of denaturing PAGE analysis of such digests. Complexes formed with 10:1 (0.5 µM duplex, 5 µM PsT-2) and 100:1 (0.5 µM duplex, 50 µM PsT-2, data not shown) ratios of PsT-2:duplex display reduced sensitivity to DNase I along D-2 target segments B and C, but not along the unprotected segment A (cf. Fig. 1a). In contrast, complexes formed with similar ratios of the negative control strand PsT-3 do not display reduced DNase I sensitivity along any segment of D-2.
The gel from Fig. 3a was quantitated, and the photodensity of each D-2 segment was determined. These results, expressed in Table I as the fraction of the total strand cuts within each D-2 segment, confirm that complexes formed with PsT-2 display much reduced DNase I sensitivity along the D-2 target segments B and C, but not at unprotected segment A. This protection is very strong for segment C, but less evident for segment B (Table I), which contains the strand-switching triplex domain. Further, in Fig. 3b, where DNase I sensitivity is shown relative to that of naked duplex, the data clearly indicate protection along the D-2 target segment in the presence of PsT-2, but a lack of protection in the presence of PsT-3. Thus, triplex formation occurs only with PsT-2, and PsT-3 must be bound to the duplex only by the hook.

FIG. 3. a, duplex strand D-2 of D-1·D-2 is 32P-labeled on the 5′-end; thus, observed bands represent specific cuts along D-2. D-2 residue assignments were made by running ddNTP sequencing reactions using D-2 templates on denaturing PAGE alongside DNase I-treated duplex (data not shown). Note that D-2 strand segments B and C (but not A) display reduced sensitivity to DNase I cleavage only in the presence of PsT-2 (cf. lanes 3 and 4 versus 1 and 2), but not in the presence of PsT-3 (cf. lanes 5 and 6 versus 1 and 2). Reduced sensitivity to DNase I at the 5′-end of the duplex control is due to the single-stranded protruding end, which is not a substrate of DNase I; this 5′-end becomes a substrate when bound to PsT-3, forming a duplex. b, relative DNase I sensitivity of different D-2 segments. Quantitated PAGE results (Table I) are represented as changes in nuclease sensitivity relative to unprotected naked duplex along different D-2 segments. Note that in the presence of a binding third strand (PsT-2), relative sensitivity significantly decreases along the protected segments, and the observed cuts are "shifted" to the unprotected segment. In contrast, no significant change in the relative sensitivity profile is observed in the presence of a nontriplex-forming third strand (PsT-3). Values along the ordinate represent the difference in the fraction of total cuts (along the designated segments) between the three-stranded complex and the same region of unprotected duplex.

UV Irradiation of Complexes-The complexes formed with PsT-1, PsT-2, and PsT-3 were UV-irradiated and the products analyzed by denaturing PAGE. Based upon the expected triplex structure, a psoralen on the 3′-end of PsT-1 or PsT-2 should principally form monoadducts (19) to residue T11 on the D-1/DL-1 strand, resulting in covalent attachment of the third strand. The long length of the linker (C16) between the psoralen moiety and the third strand makes possible additional monoadducts to T9 of D-1/DL-1 and to various pyrimidine residues along D-2/DL-2, the complementary duplex strand. Interstrand psoralen crosslinks (19) are also possible between pyrimidine residues of D-1/DL-1 and D-2/DL-2. Fig. 4 shows denaturing PAGE analysis of UV-irradiated complexes containing 32P-end-label on the D-1 or DL-1 strand that contained either 100 nM or 40 µM of the appropriate third strand. Photoproducts are observed in the presence of PsT-2, and also in the presence of the hookless PsT-1 at 40 µM; neither duplex alone nor any complex of PsT-3 and duplex produces higher molecular weight bands, notwithstanding the psoralen moiety tethered to the 3′-end of PsT-3 (Fig. 4).
Third strand binding affinity of PsT-1 and PsT-2 was evaluated by the dependence of photoproduct formation on third strand concentration. Varying concentrations of third strand were annealed to target duplex, irradiated, and analyzed by denaturing PAGE (Fig. 5a). Bands were quantitated (Fig. 5b), giving apparent Kd values of <3 nM for PsT-2 and ~3,000 nM for PsT-1. This difference of at least 1,000-fold in Kd between PsT-1 and PsT-2 clearly demonstrates the advantage of the third strand with the duplex-forming hook, and all further work was performed using PsT-2.
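An apparent Kd of this kind can be estimated from the dependence of photoproduct yield on third strand concentration. The Python sketch below illustrates one way to do so and is not the analysis used in this work. It assumes a simple 1:1 binding isotherm, yield = Ymax × [T]/(Kd + [T]), with third strand in large excess over duplex; that assumption fails when Kd is comparable to or below the duplex concentration, in which case only an upper bound can be expected. The function name, the grid-search fitting method, and the data points are hypothetical; the data are generated from the model itself purely to show the calculation.

```python
# Illustrative sketch only: apparent Kd from photoproduct yield versus third
# strand concentration, assuming yield = Ymax * [T] / (Kd + [T]) with the
# third strand in large excess over duplex. Not the authors' analysis.

def fit_apparent_kd(concs_nM, yields, kd_grid_nM=None):
    """Grid search over Kd; for each Kd, Ymax has a closed-form least-squares value."""
    if kd_grid_nM is None:
        # Log-spaced candidate Kd values from 0.1 nM to 100 uM (assumed range).
        kd_grid_nM = [10 ** (e / 50.0) for e in range(-50, 251)]
    best = None
    for kd in kd_grid_nM:
        occupancy = [c / (kd + c) for c in concs_nM]
        denom = sum(o * o for o in occupancy)
        if denom == 0:
            continue
        ymax = sum(o * y for o, y in zip(occupancy, yields)) / denom
        sse = sum((y - ymax * o) ** 2 for o, y in zip(occupancy, yields))
        if best is None or sse < best[0]:
            best = (sse, kd, ymax)
    if best is None:
        raise ValueError("no usable data points")
    return best[1], best[2]


if __name__ == "__main__":
    # Hypothetical titration generated from the model (Kd = 3000 nM, Ymax = 0.4).
    concs = [10, 100, 1000, 3000, 10000, 40000]
    simulated = [0.4 * c / (3000 + c) for c in concs]
    kd, ymax = fit_apparent_kd(concs, simulated)
    print(f"apparent Kd ~ {kd:.0f} nM, Ymax ~ {ymax:.2f}")
```

The fitted Kd is only as fine as the grid spacing (about 5% per step here), which is ample for comparing affinities that differ by orders of magnitude.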
Photoproduct Identification-To investigate the interaction of psoralen-linked third strand with the coding and noncoding target strands, irradiation experiments were performed using duplex with either 32P-D-1 or 32P-D-2. Triplexes with D-1 labeled show a major photoproduct and two minor ones, one slightly faster moving than the major one, and one much slower moving. The D-2-labeled triplexes (Fig. 6a) also show one major photoproduct, but three faster moving minor products, and one much slower moving one. The slower moving band has the same mobility in both cases and it is also visible when PsT-2 is labeled (not shown). Hence, this slower moving photoproduct contains D-1, D-2, and PsT-2; it is therefore identified as a crosslink between PsT-2 and D-1·D-2. Based upon the sequence of the triplex, previous observations (17), the yield, and primer extension results described below, the major D-1-labeled photoproduct is identified as a monoadduct between PsT-2 and D-1 at residue T11, whereas the faster running minor product is probably the monoadduct formed at residue T9 of D-1. Similar considerations, including primer extension results (not shown), identify the major D-2-labeled photoproduct as a monoadduct between PsT-2 and residue T22 of D-2 and the three faster moving minor products as monoadducts formed with residues C21, C23, and C24, respectively.
Kinetics of Photoproduct Formation and Yields-The major D-1 monoadduct, which amounts to more than 50% of the total photoproduct, represents psoralen interaction with the mutated T nucleotide of the human sickle cell hemoglobin gene. As indicated in Fig. 6b, ~40% of the D-1 strand is converted to the major photoproduct by 5 min of irradiation. In contrast, the photocrosslinked product forms at much lower yield (<3%) and does not level off over the time course studied, which is consistent with previous observations (5,19). The kinetics of major photoproduct formation are similar for the D-1- and D-2-labeled complexes (Fig. 6b): the yield reaches its maximum by 5 min of irradiation and then levels off. The yield is significantly lower for the major D-2 monoadduct (15 versus 40%). This is consistent with the more distant location of the target residue in that strand and the less favorable orientation of the intercalated psoralen moiety (18). Fig. 7 shows the dependence of photoproduct yield on pH. The yield decreases significantly as pH is increased from 5 to 7. Additional modifications of the third strand sequence that use C analogs that need not be protonated should improve binding at physiological pH (22,23).
Sequenase Primer Extension-To confirm the photoaddition site of the major D-1 photoproduct, it was eluted from PAGE gel slices and used as a DNA template in primer extension reactions (20,21). Fig. 8 shows these results along with a mixed ddNTP sequencing ladder obtained from nonirradiated D-1 control strands. It can be seen that synthesis continues only up to residue T11. Apparently, the polymerase is unable to synthesize past the PsT-2 photoattachment site. This observation identifies residue T11, the site of the A·T → T·A human sickle cell transversion, as the major site of psoralen photoaddition. Similar observations on photoproducts formed with the D-2 target strand confirm the sites of photoattachment indicated above (data not shown).
Specific Third Strand Binding and Photoaddition to Plasmids Containing the Target-Band-shift assays (Fig. 9) indicate that whereas PsT-3, the scrambled third strand, does not bind to the plasmid with the β-globin sickle cell target, PsT-2, the third strand with the duplex-forming hook, does bind to it, though not to the vector plasmid without the target. Moreover, the binding is very much greater to the supercoiled than to open circular or linearized plasmid. These results suggest high promise for our overall approach.

FIG. 4. Photoproduct formation along D-1-labeled complexes. Denaturing PAGE analysis of photoproduct formation. Complexes formed at 10 nM duplex and indicated third strand concentrations were UV-irradiated for 10 min at 4°C. Note that no photoproduct is observed at either third strand concentration in the presence of the nontriplex-forming strand PsT-3, despite the psoralen tethered to its 3′-end. In contrast, strand PsT-2 forms photoproducts in similar yield at both concentrations. However, PsT-1 forms photoproducts only at very high (40 µM) concentration.

TABLE I. PAGE results were used to obtain the indicated percentages, which represent the fraction of total strand cuts localized along each designated D-2 segment after 3 min of DNase I treatment. Segment A, unprotected nontarget segment; segments B and C, protected triplex target segments (see Fig. 1). Note that the nuclease sensitivity profile in the presence of PsT-3 is similar to that for unprotected naked duplex, whereas in the presence of PsT-2, nuclease sensitivity is shifted from the 5′-end of D-2 to its unprotected 3′-end.

FIG. 9. Third strand binding to plasmid. Ethidium bromide-stained agarose gel of pSCe (containing the target sequence) and pBluescript II (vector plasmid without the target sequence) incubated with PsT-2 or PsT-3 and then UV-irradiated (top). Open circular (OC) and supercoiled (SC) plasmid bands are indicated by arrows. * indicates radioactively labeled. Bottom, autoradiograph of the same gel. Only the supercoiled form of the plasmid with the sickle cell target and third strand PsT-2 is strongly labeled. A very low level of nonspecific binding of PsT-2* to the control plasmid without the target is evident in the left lane.

DISCUSSION
With respect to our long-term goal, this study indicates that the efficiency of photochemical modification at the desired target site is relatively high; 40% of the duplex target is converted to monoadduct at the pyrimidine residue of the mutated base pair. Another 2% at this site is crosslinked, whereas a minor photoproduct on this strand accounts for ~8% of the duplex. The C16 psoralen linker additionally permits substantial photoproduct formation with the opposite target strand. Model building suggests that shortening the linker should substantially reduce or even eliminate those reactions, in which case the yield and specificity of the desired psoralen photoaddition can be raised. In fact, preliminary experiments demonstrate that use of a C6 psoralen linker markedly reduces photoproduct formation with the noncoding strand.
It would appear that early steps of a strategy for triplex-mediated repair of the sickle cell mutation are now in place. A strand has been designed with suitable affinity for third strand binding, which forms a psoralen monoadduct to the mutated base pair in high yield. DNA under superhelical stress unwinds to relieve this stress, thereby creating dynamic unwound single-stranded patches no less accessible than the sticky end in the linear target. In fact, the preliminary results suggest that this third strand with the hook is capable of binding to a supercoiled plasmid containing the target sequence, indicative of a strand-invading mechanism (28). This opens the possibility of exploiting various DNA repair mechanisms that have been shown to effect site-specific base pair changes. For example, T·A → A·T substitutions in the supF gene have been triggered in vivo by third strand-directed psoralen photomodification of plasmids transfected into monkey COS-7 cells (6-8). Using the methodology developed, it is anticipated that the sickle cell mutation may be similarly repaired. In that event, acceptable levels of the correct phenotype might be achieved if psoralen monoadduct-triggered in vivo mutation efficiencies are sufficiently high.
Chimeric structures with linked duplex and triplex elements have been exploited for other purposes (24-28). In this study, the duplex-forming hook represents a novel approach for achieving effective third strand binding to a relatively poor DNA duplex target, coupled to a potential for strand invasion. Direct comparison of binding affinity to the same target (by photoproduct formation) of a third strand with a hook versus a hookless third strand demonstrates at least a 1,000-fold difference in Kd values. Moreover, the hookless third strand does not induce a band-shift on native PAGE even at high third strand concentrations (Fig. 2). This indicates that the complex is transient and is trapped only by formation of covalent photoproducts under UV irradiation. In contrast, the addition of the duplex-forming hook to the third strand makes the complex stable even without irradiation. The approach developed in the present work should find wider applicability to other unfavorable target sequences.