Protein orientation in the Tat-TAR complex determined by psoralen photocross-linking.

Replication of human immunodeficiency virus type 1 (HIV-1) requires specific interactions of Tat protein with the trans-activation responsive region (TAR) RNA, a 59-base stem-loop structure located at the 5′-end of all HIV mRNAs. We have used a new method based on psoralen photochemistry to identify a specific contact between a fragment of Tat protein (residues 38-72) and TAR RNA. We synthesized a 35-amino acid fragment containing the arginine-rich RNA-binding domain of Tat (38-72) and replaced Arg57 with Cys to introduce a unique thiol group (-SH) in our model peptide. A thiol-reactive psoralen derivative was synthesized and used for specific chemical modification of Cys57-Tat-(38-72). We used this psoralen-Tat conjugate (psoralen-Cys57-Tat-(38-72)) to form a specific complex with TAR RNA. Upon near-ultraviolet irradiation (360 nm), this synthetic psoralen-peptide cross-linked to a single site in the TAR RNA sequence. The RNA-protein complex was purified and the cross-link site on TAR RNA was determined by RNA sequencing, which revealed that Cys57 of Tat is close to U31 of TAR RNA. Our results provide high-resolution proximity and orientation information about the Tat-TAR complex. Such psoralen-peptide conjugates provide a new class of probes for sequence-specific protein-nucleic acid interactions and could be used to selectively control gene expression or to induce site-directed mutations.

Human immunodeficiency virus type 1 (HIV-1) encodes a trans-activating regulatory protein, Tat, that is essential for trans-activation of viral gene expression (1-3). HIV-1 Tat protein acts by binding to the trans-activation-responsive region (TAR) RNA, a 59-base stem-loop structure located at the 5′ ends of all nascent HIV-1 transcripts (4-7). Upon binding to the TAR RNA sequence, Tat causes a substantial increase in transcript levels (8-10). This increase may result from preventing premature termination of the transcriptional elongation complex (3) or from an effect directly at the level of transcription initiation (11). TAR was originally localized to nucleotides +1 to +80 within the viral long terminal repeat (4). Subsequent studies mapped its 3′ boundary to +44 (6). Nucleotides spanning positions +19 to +42 are sufficient for Tat responsiveness in vivo (6). The TAR RNA contains a six-nucleotide loop and a three-nucleotide pyrimidine bulge, which separates two helical stem regions. The trinucleotide bulge is essential for high-affinity, specific binding of the Tat protein (12).
Several groups have shown that Tat-derived peptides containing the basic arginine-rich region of Tat form complexes with TAR RNA in vitro (13-17). To achieve specific RNA binding by a Tat fragment, we synthesized a Tat peptide (amino acids 38-72), which contained an RNA-binding domain and 11 amino acids from the core domain of the Tat protein (Fig. 1). Since Tat protein mutants in which Arg57 was substituted with Ser were functional for trans-activation (18), we replaced Arg57 with Cys to introduce a unique thiol group (-SH) in our model peptide, which was then labeled with a psoralen derivative. Psoralens are bifunctional photoreagents that have been used as photoactive probes of nucleic acid structure and function (19). In this report, we have used a psoralen-Tat conjugate to determine the protein orientation in the Tat-TAR complex.

EXPERIMENTAL PROCEDURES
Synthesis of 8-((3-Iodopropyl-1)oxy)psoralen, Compound 1
A mixture of 8-hydroxypsoralen (0.176 g, 1 mmol), 1,3-diiodopropane (1.15 ml, 2.96 g, 10 mmol), and potassium carbonate (1.38 g, 10 mmol) was stirred in acetone (15 ml) for 10 h at room temperature. The reaction was monitored by TLC (petroleum ether/ethyl acetate, 4:1), which showed the absence of starting material. The reaction mixture was concentrated to dryness under reduced pressure. The residue was dissolved in water (30 ml) and extracted with ethyl acetate (3 × 20 ml). The extract was washed with water (10 ml) and brine (10 ml) and dried over sodium sulfate. Ethyl acetate was evaporated under reduced pressure, and the residue was chromatographed on silica gel (elution with petroleum ether/ethyl acetate, 4:1). Pure compound 1 (0.31 g, 85%) was obtained as a pale yellow solid.
Synthesis of Psoralen-Cys57-Tat-(38-72) Conjugate
A Tat-derived peptide (amino acids 38-72 with Cys at position 57) was synthesized on an Applied Biosystems 431A peptide synthesizer using standard FastMoc protocols (20).
The mass of the fully deprotected and purified peptide was confirmed by fast atom bombardment mass spectrometry; calculated mass for Cys57-Tat-(38-72) = 4029.6, found 4030.6 (M + H). Tat peptide (10 nmol) and 8-((3-iodopropyl-1)oxy)psoralen 1 (100 nmol) were dissolved in 100 µl of N,N-dimethylformamide. The final pH of the reaction mixture was adjusted to 7.0 by adding 50 µl of 0.1 M sodium phosphate buffer (pH 7.4). After 16 h of incubation at room temperature, 10% trifluoroacetic acid solution was used to bring the pH of the reaction mixture to between 4 and 5, and unreacted psoralen was extracted with chloroform (3 × 0.5 ml). The organic phase was washed with 1% trifluoroacetic acid (3 × 0.2 ml) to recover any dissolved peptide. The aqueous layers were combined and concentrated to ≈200 µl in a Speed-Vac. The psoralen-Tat conjugate was purified by HPLC on a Zorbax 300 SB-C8 column. The final yield of psoralen-Tat conjugate was ≈80%. The mass of the purified conjugate was confirmed by fast atom bombardment mass spectrometry; calculated mass for psoralen-Cys57-Tat-(38-72) = 4271.9, found 4272.9 (M + H).
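The two calculated masses quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only and rests on two assumptions that are not stated in the text: that compound 1 has the composition C14H11IO4, and that alkylation of the cysteine thiol results in a net loss of HI. The atomic masses are rounded standard values.

```python
# Average atomic masses (g/mol), rounded standard values.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999, "I": 126.904}


def formula_mass(formula):
    """Sum average atomic masses for a composition given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


# Assumed composition of 8-((3-iodopropyl-1)oxy)psoralen (compound 1).
reagent = {"C": 14, "H": 11, "I": 1, "O": 4}
# Assumed net loss on S-alkylation: iodide plus the thiol proton (HI).
leaving_group = {"H": 1, "I": 1}

expected_shift = formula_mass(reagent) - formula_mass(leaving_group)
reported_shift = 4271.9 - 4029.6  # calculated masses quoted in the text

print(f"expected mass shift on conjugation: {expected_shift:.2f}")
print(f"shift between reported masses:      {reported_shift:.2f}")
```

Under these assumptions the two values agree to within about 0.1 mass units, which is consistent with a single psoralen group attached per peptide.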
Photocross-linking and RNA sequencing reactions were carried out as described earlier (20).

Site-specific Incorporation of a Psoralen into the Tat-(38-72) Sequence
The experimental strategy for site-specific psoralen conjugation of Tat-(38-72) is outlined in Fig. 2. We introduced a unique cysteine residue in the RNA-binding region of Tat at position 57 during peptide synthesis. A derivative of psoralen (8-((3-iodopropyl-1)oxy)psoralen) was synthesized and used to label the cysteine residue in the Tat fragment. The psoralen-Tat conjugate was purified by HPLC and characterized by mass spectrometry. To evaluate the binding capability of the psoralen-peptide conjugate, we determined its dissociation constants and compared them with those of the wild-type peptide (Tat-(38-72)). Equilibrium dissociation constants of the Tat-(38-72)·TAR complexes were measured using direct and competition electrophoretic mobility assays (17). We determined the relative dissociation constant (K_rel) as the ratio of the wild-type Tat-(38-72) to the psoralen-Tat-(38-72) dissociation constant (K_d) for TAR RNA binding. The calculated value for K_rel was 1.12, indicating that a psoralen-derivatized cysteine at position 57 did not significantly alter the structure of Tat-(38-72) and preserved the TAR-binding affinity of the peptide (data not shown).
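The definition of the relative dissociation constant can be written out as a short sketch. The individual dissociation constants are not reported in the text, so the two input values below are placeholders in arbitrary units, chosen only so that their ratio reproduces the reported value of 1.12; the function name is likewise hypothetical.

```python
def relative_kd(kd_wild_type, kd_conjugate):
    """Ratio defined in the text: K_d(wild-type peptide) / K_d(psoralen conjugate)."""
    if kd_wild_type <= 0 or kd_conjugate <= 0:
        raise ValueError("dissociation constants must be positive")
    return kd_wild_type / kd_conjugate


# Placeholder K_d values in arbitrary units (NOT measured values from the paper).
kd_wt = 1.12
kd_psoralen = 1.00

k_rel = relative_kd(kd_wt, kd_psoralen)
print(f"K_rel = {k_rel:.2f}")
```

With this definition, a ratio close to 1 means the two peptides bind TAR RNA with similar affinity.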
Site-specific Photocross-linking of Psoralen-Cys57-Tat-(38-72) to TAR RNA
Psoralen-modified Tat-(38-72) peptide was used to form a complex with 5′-32P-end-labeled TAR RNA at room temperature in TK buffer, and the complex was irradiated with near-ultraviolet light (360 nm) for 20 min. Cross-linked products were separated by denaturing polyacrylamide gel electrophoresis. Results of this experiment are shown in Fig. 3. Irradiation of the RNA and psoralen-peptide complex yields a new band with electrophoretic mobility less than that of TAR RNA (lane 6). These results indicate that upon irradiation this psoralen-peptide yields a single RNA-protein cross-link with high efficiency, ≈10%. Both the psoralen-peptide and UV (360 nm) irradiation are required for the formation of a cross-linked RNA-protein complex (see lanes 2 and 5). Further control experiments showed that no cross-linking was observed when RNA and unmodified peptide were irradiated (lane 4). Digestion of the RNA-peptide cross-link with Proteinase K (5 units for 30 min at 37°C) resulted in an RNA species with mobility similar to that of TAR RNA (lane 7). Since the cross-linked RNA-peptide complex is stable to alkaline pH (9.5), high temperature (85°C), and denaturing conditions (8 M urea), we conclude that a covalent bond is formed between TAR RNA and the peptide during the cross-linking reaction.
Specificity of the Cross-link Formation
Specificity of the cross-linking reaction was established by competition experiments. Cross-linking reactions were performed in a 15-µl volume containing 0.25 µM 5′-32P-labeled TAR RNA, 1.0 µM psoralen-Tat peptide, 25 mM Tris-HCl (pH 7.4), 100 mM NaCl, and up to 1.25 µM unlabeled competitor RNA. Cross-linked products were separated on 8 M urea-20% polyacrylamide gels and visualized by phosphorimage analysis. Fig. 4 shows that cross-linking was inhibited by the addition of unlabeled wild-type TAR RNA but not by a mutant TAR RNA lacking the trinucleotide bulge. Therefore, we conclude that formation of a specific RNA-protein complex between TAR RNA and psoralen-Tat is necessary for photocross-linking.
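For orientation, the reaction composition above can be converted into absolute amounts. This is a rough sketch that assumes the reaction volume is 15 µl and that the RNA and peptide concentrations are micromolar (the micro sign appears to have been lost in some renderings of this passage); the function and variable names are illustrative.

```python
def amount_pmol(concentration_uM, volume_ul):
    """Amount of solute in pmol; 1 uM x 1 ul corresponds to 1 pmol."""
    return concentration_uM * volume_ul


VOLUME_UL = 15.0  # assumed reaction volume in microliters

# Concentrations read as micromolar (an assumption, see the note above).
components_uM = {
    "labeled TAR RNA": 0.25,
    "psoralen-Tat peptide": 1.0,
    "competitor RNA (maximum)": 1.25,
}

amounts = {name: amount_pmol(conc, VOLUME_UL) for name, conc in components_uM.items()}
competitor_excess = (
    components_uM["competitor RNA (maximum)"] / components_uM["labeled TAR RNA"]
)

for name, pmol in amounts.items():
    print(f"{name}: {pmol:.2f} pmol")
print(f"maximum competitor excess over labeled RNA: {competitor_excess:.0f}-fold")
```

On this reading, the highest competitor concentration corresponds to a 5-fold molar excess over the labeled TAR RNA.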
Uridine 31 in TAR RNA Cross-links to Psoralen-Cys57-Tat-(38-72)
Mapping of the cross-link site on TAR RNA to single-nucleotide resolution was carried out by partial RNase digestion and alkaline hydrolysis of the gel-purified RNA-protein cross-link. Fragment sizes were determined by comparison with RNA oligonucleotides of defined sequence and length generated by digesting RNA with RNases T1 and B. cereus. Alkaline hydrolysis of RNA and of the cross-linked RNA-peptide complex generates a ladder of RNA degradation products. Bands of cross-linked RNA-peptide complexes migrate more slowly than the corresponding free RNA (lane 3 in Fig. 5, A and B). Base hydrolysis of the 5′-end-labeled cross-linked complex (lane 3 in Fig. 5A) results in an RNA ladder in which all fragments up to C30 are resolved. There is an obvious gap in the hydrolysis ladder after C30 (Fig. 5A, lane 3) that is not seen with the uncross-linked RNA (lane 2), indicating that the fragments above C30 from the 5′-end are linked to the psoralen-Tat peptide. Thus, U31 is the 5′-end cross-link site. To define the 3′-end boundary of the cross-link site, we purified the 3′-end-labeled RNA-protein cross-link and subjected it to partial alkaline hydrolysis. Base hydrolysis of the 3′-end-labeled cross-link (Fig. 5B, lane 3) produces a ladder in which the fragments from the cross-linked RNA-peptide complex match those from free RNA until G32 from the 3′-end. After G32, a clear gap was observed during hydrolysis of the 3′-end-labeled cross-link (Fig. 5B, lane 3), while alkaline digestion of 3′-end-labeled TAR RNA resulted in a standard ladder (lane 2). This result indicates that the fragments above G32 from the 3′-end contain Tat peptide. Based on these results, we conclude that U31 of TAR RNA is the only site at which cross-linking occurs.
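The reasoning that combines the two hydrolysis ladders can be expressed as a small sketch. It assumes that each ladder is summarized by the last nucleotide at which fragments still comigrate with those from free RNA (C30 for the 5′-end-labeled complex, G32 for the 3′-end-labeled complex); the function name and its interface are hypothetical and are not part of the published procedure.

```python
def crosslink_positions(last_normal_from_5p, last_normal_from_3p):
    """Return nucleotide positions consistent with both hydrolysis ladders.

    last_normal_from_5p: highest position at which 5'-end-labeled fragments
        still comigrate with free RNA (30 in the text).
    last_normal_from_3p: lowest position at which 3'-end-labeled fragments
        still comigrate with free RNA (32 in the text).
    Any fragment that includes the modified nucleotide carries the peptide and
    is shifted, so the site must lie strictly between the two boundaries.
    """
    if last_normal_from_3p <= last_normal_from_5p:
        raise ValueError("ladders overlap; no gap to assign")
    return list(range(last_normal_from_5p + 1, last_normal_from_3p))


site = crosslink_positions(30, 32)
print(site)
```

With the boundaries reported above, the only position left between the two ladders is nucleotide 31.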
To further confirm the cross-link site, we transcribed a mutant TAR RNA containing G31 instead of U31 in its sequence (Fig. 1). Since psoralen reacts primarily with uridine in RNA (19), replacement of U31 with G31 in the TAR RNA loop should abolish or significantly decrease cross-link formation between psoralen-Tat and the mutant TAR RNA. The results were consistent with this expectation: only a minor cross-link product (≈1.5%) was observed when the complex of psoralen-Tat and mutant TAR RNA was UV irradiated (Fig. 6, lane 4). In contrast, an RNA-protein cross-link product was obtained in high yield when the RNA-protein complex containing psoralen-Tat and wild-type TAR RNA was UV irradiated (Fig. 6, lane 2). These results establish that psoralen-Cys57-Tat-(38-72) forms a single cross-link product with TAR RNA and that the cross-linking reaction occurs at U31 in the loop region of the RNA.

DISCUSSION
We have used a site-specific cross-linking strategy to determine protein orientation in the Tat-TAR complex. Our results establish that Arg57 of Tat-(38-72) is close to uridine 31 in the loop region of TAR RNA.
How does Tat recognize TAR RNA? Several lines of evidence suggest that Tat protein contacts TAR RNA in a widened major groove. In a recent study from our laboratory, we used a rhodium complex, Rh(phen)2phi3+, to probe the effect of bulge bases on the major groove width in TAR RNA (21). Our studies establish two important factors involved in Tat-TAR recognition. (i) There is a correlation between major groove opening and Tat binding: at least a 2-base bulge is required for major groove widening and other conformational changes that facilitate Tat binding, and this cannot be accomplished by a single-base bulge. (ii) A Tat fragment (42-72) occupies the major groove of TAR RNA and abolishes access of the rhodium complex.
To determine the relative orientation of the nucleic acid and protein in the Tat-TAR complex, we have devised a new method based on psoralen photochemistry (20). We synthesized a 30-amino acid fragment containing the arginine-rich RNA-binding domain of Tat-(42-72) and chemically attached a psoralen at the amino terminus. Upon near-ultraviolet irradiation (360 nm), this synthetic psoralen-peptide cross-linked to a single site in the TAR RNA sequence. The RNA-protein complex was purified, and the cross-link site on TAR RNA was determined by chemical and primer extension analyses. Our results show that the amino terminus of Tat-(42-72) contacts, or is close to, uridine 42 in the lower stem of TAR RNA (20).
On the basis of our psoralen cross-linking results, we suggest a model for Tat-TAR recognition in which Tat binds to TAR RNA by inserting the basic recognition sequence into the enlarged major groove with an orientation where lysine 41 in the core domain of Tat contacts the lower stem and Arg57 is close to U31 in the loop region of TAR RNA (Fig. 7). As shown in Fig. 2B, there is a linker arm of 9.7 Å between the α-carbon of amino acid 57 in the Tat sequence and the psoralen moiety. Taking into account the rotational flexibility of the psoralen, the maximum distance between the protein backbone and U31 in TAR RNA would be ≈15 Å. Circular dichroism studies by Tan and Frankel (22) showed that Arg52 in basic RNA-binding peptides is an essential residue for inducing a conformational change in TAR RNA. Recent NMR data suggest that after binding of an Arg (free or in Tat peptides), the trinucleotide bulge region of TAR RNA undergoes a conformational rearrangement and forms a more stable structure (23, 24). At present, there are no structural data available to determine which Arg interacts with the bulge region of TAR RNA in the context of native HIV-1 Tat. According to our model, the proximity of Arg52 to U23 suggests that Arg52 is a likely candidate for such an interaction. However, we cannot rule out the possibility that other arginines are involved in specific interactions with bulge nucleotides. Another interesting feature of our model is a straight helix of TAR RNA in the Tat-TAR complex, which is required to fit our cross-linking data. This is consistent with a recent study by Zacharias and Hagerman (25), who performed transient electric birefringence measurements and showed that the TAR RNA bulge introduces a bend of 50° in the absence of Mg2+, which is straightened by the addition of Arg and Tat-derived peptides.
Mutational analyses have shown that sequences in the loop of TAR RNA are required for trans-activation (5, 26) but not for Tat binding (13, 17). The loop may provide the binding site for cellular factor(s) involved in trans-activation (27-30). Tat could also be involved in rearranging the loop into a structure that can be recognized by cellular factors. Our results show that the COOH-terminal region of the RNA-binding domain of Tat is in close proximity to U31 in the TAR RNA sequence; whether Tat interacts directly with the loop remains to be determined.