Face-time with TAR: Portraits of an HIV-1 RNA with diverse modes of effector recognition relevant for drug discovery

Small molecules and short peptides that potently and selectively bind RNA are rare, making the molecular structures of these complexes highly exceptional. Accordingly, several recent investigations have provided unprecedented structural insights into how peptides and proteins recognize the HIV-1 transactivation response (TAR) element, a 59-nucleotide-long, noncoding RNA segment in the 5′ long terminal repeat region of viral transcripts. Here, we offer an integrated perspective on these advances by describing earlier progress on TAR binding to small molecules, and by drawing parallels to recent successes in the identification of compounds that target the hepatitis C virus internal ribosome entry site (IRES) and the flavin-mononucleotide riboswitch. We relate this work to recent progress that pinpoints specific determinants of TAR recognition by: (i) viral Tat proteins, (ii) an innovative lab-evolved TAR-binding protein, and (iii) an ultrahigh-affinity cyclic peptide. New structural details are used to model the TAR–Tat–super-elongation complex (SEC) that is essential for efficient viral transcription and represents a focal point for antiviral drug design. A key prediction is that the Tat transactivation domain makes modest contacts with the TAR apical loop, whereas its arginine-rich motif spans the entire length of the TAR major groove. This expansive interface has significant implications for drug discovery and design, and it further suggests that future lab-evolved proteins could be deployed to discover steric restriction points that block Tat-mediated recruitment of the host SEC to HIV-1 TAR.

Noncoding (nc)RNAs exhibit remarkable architectural diversity that contributes to function in multiple gene-regulatory settings (1). Although the human proteome is derived from only a small fraction of the genome (0.05%), the preponderance of the DNA blueprint is transcribed into ncRNA (2,3). As a result, ncRNA transcripts provide important opportunities to intervene in a range of biological processes and diseases (4-6). Such pursuits are especially meaningful in light of the fact that only 3.5-10% of the proteome is likely to be druggable (7,8). Substantial evidence demonstrates that a handful of ncRNAs adopt elegant three-dimensional folds with distinct topologies and recurrent architectural motifs (9-14), including cavities and deep grooves predisposed to ligand binding (7,15,16). These properties are suited for shape-specific recognition of small molecules or peptides and provide a basis to manipulate conformation or dynamics to alter downstream function. Several notable achievements accentuate such efforts, including the identification of inhibitors that target the following: cancer-associated miR-21; CUG repeats of myotonic dystrophy; riboswitches in pathogenic bacteria; and exon splicing in spinal muscular atrophy (17-28). These successes underscore the feasibility of sequence-specific targeting of RNAs to create research tools or as a means to treat human disease. Accordingly, delineating principles of molecular recognition represents a cornerstone for therapeutic design, especially as part of a combination-drug strategy to circumvent drug resistance by pathogens that undergo multiple genomic mutations per generation (26,29).
New mandates in HIV eradication and cure research (https://grants.nih.gov/grants/guide/notice-files/NOT-OD-15-137.html) (30, 31) have led to a resurgence in efforts to target the transactivation response (TAR) element. This 59-nucleotide RNA is located in the 5′-LTR of all viral transcripts and features a conserved hairpin that harbors an apical loop and pyrimidine-rich bulge that are each indispensable for transactivation (33-40). TAR interacts with the viral Tat protein, which recruits the host pTEFb complex away from inactivating HEXIM-7SK RNA complexes (Fig. 1A) (41-47). When localized to TAR, host kinase CDK9 within a super-elongation complex (SEC) phosphorylates RNA polymerase II, releasing it from a paused state to produce full-length viral mRNA (48-50). Structural and biochemical analysis of the Tat-pTEFb complex revealed Tat-specific conformational changes (51, 52). Because of the essential role of Tat in securing pTEFb for processive viral transcription, efforts are focused on the development of inhibitors that block this key host-virus protein interaction (53-55). Sustained inhibition of Tat could lock HIV into a state of deep latency and represents one strategy to produce a functional cure (56,57).
This work was supported in part by National Institutes of Health Grants GM123864 and GM063162 (to J. E. W.) and a CFAR pilot award (to J. E. W.) from P30 AI078498 (to S. D.). The authors declare that they have no conflicts of interest with the contents of this article. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Targeting TAR RNA represents a fundamentally different antiviral approach (58). TAR is one of the most conserved RNA sequences in the viral genome (Fig. 1B). In addition to SEC binding, TAR functions as a pre-miRNA whose Dicer cleavage products block host-cell apoptosis, prolonging the viral life span in infected cells (59-61). For these reasons, TAR is a high-value drug target whose inhibition could potentially disrupt viral transcription in chronic as well as latent infections. However, no such inhibitors are clinically available, and TAR has resisted the development of therapeutics, despite success in the identification of compounds that target the RNA with specificity and affinity (62-64).
As with other RNA molecules, TAR is dynamic and adopts multiple conformations (65-67), undermining efforts to obtain high-resolution crystal structures (68). NMR has bridged many gaps in our understanding of TAR with nearly 20 distinct structures determined of the isolated (apo) RNA and in complex with peptides or small molecules (61,62,67-81). Even so, TAR has been historically challenging for NMR due to significant line broadening of resonances, which has hindered the acquisition of restraints needed to generate high-quality models (69-71). For this reason, many efforts have focused on closely related HIV-2 or BIV TAR (Fig. 1C), which are structurally better defined (69,72,73). Additional structural improvements have been attained through engineered RNA constructs to promote crystal contacts or by exploiting structurally well-characterized proteins, such as U1A, as a starting platform for lab-based evolution and structural studies. These developments have led to a series of new structures including: exciting TAR-Tat and TAR-Tat-SEC complexes (74-76), a co-crystal structure of TAR bound to a lab-evolved protein from the Wedekind lab (77), and an ultrahigh-affinity cyclic peptide bound to TAR (63). Here, we put these novel discoveries into perspective by considering prior characterization of TAR apo- and bound-state conformations. We then consider molecular recognition by representative small molecules, which are then contrasted with recent high-quality ncRNA-inhibitor complexes. A major take-home message is that peptide-mediated TAR recognition utilizes some common molecular-recognition principles, such as the arginine-sandwich motif (ASM), a primary determinant of affinity and specificity observed in both natural TAR-Tat complexes and TAR binding by a lab-evolved protein.
In contrast, no consistent rules of recognition could be discerned for existing TAR-small molecule complexes, despite the use of common guanidinium groups. As the reader will see, new TAR-peptide and TAR-protein complexes offer the most cogent details to address challenges and opportunities associated with effective TAR targeting. In this respect, the best days of RNA drug discovery appear to lie ahead.

TAR adopts two major conformations that depend on ligand binding
Figure 1. HIV-1 TAR role in transcription, sequence conservation, and secondary structure. A, cartoon diagram of an inactive pTEFb complex comprising CDK9 and CycT1 in the context of HEXIM protein bound to 7SK ncRNA in the host. The arrow indicates that addition of the HIV-1 regulatory protein Tat competes with HEXIM, removing pTEFb from 7SK, which is then escorted by Tat to the TAR RNA element of HIV-1 (141). TAR is essential for transcription and is depicted as a stem loop interrupted by a central bulge that comprises nucleotides 18-44 of the viral transcript (78,142). Tat interacts directly with TAR and promotes formation of a host SEC comprising pTEFb, scaffold proteins such as AFF4, and other factors (43,47,52,68,143,144). CDK9 phosphorylates host RNA polymerase II in its CTD, which releases pausing and stimulates synthesis of full-length viral transcripts (33,145-147). B, Web-logo showing the sequence conservation of HIV-1 TAR based on circulating forms of the virus compiled as described (77); blue represents the greatest conservation, and red indicates poor conservation. Elements of the secondary structure including helical stems s1a and s1b are labeled. C, secondary structures of various TAR RNAs. The canonical Cyt30-Gua34 pair of HIV-1 TAR is supported by chemical modification, NMR, sequence conservation, and CycT1-binding requirements (65,66,77,148,149). A key difference between HIV-1 and HIV-2 TAR is deletion of Cyt24 in the central bulge (150). Details of the BIV TAR secondary structure were derived from Refs. 73,75,151. CTD, C-terminal domain.
JBC REVIEWS: Molecular recognition of HIV TAR RNA
The discovery of TAR-Tat-mediated gene regulation in HIV-1 (78) started a race to elucidate the underlying molecular determinants that give rise to this unique viral RNA-protein interaction. Major steps were made by NMR analyses of TAR in complex with the arginine analogue argininamide and in a ligand-free (apo) state. This work revealed TAR's overall hairpin architecture as well as substantial backbone rearrangements at the central bulge resulting from ligand binding (79,80). Indeed, when specific effectors interact with the major groove, the RNA adopts a slightly bent (~165°) helical axis formed by coaxial stacking of stem 1a and stem 1b (s1a and s1b) (Fig. 2A), wherein the bases of the central bulge jut outward. This conformation exhibits a high degree of concave surface suited to ligand binding (Fig. 2, A and B). The TAR major groove is characterized by a narrow width (3.9 ± 0.5 Å) and substantial depth (10.3 ± 0.3 Å) reminiscent of an ideal A-form duplex (i.e. 2.7 Å wide by 13.5 Å deep (81)). In contrast, the minor groove (9.9 ± 0.6 Å wide and 1.0 ± 0.6 Å deep) is substantially wider and shallower than that of a typical A-form helix (5.7 Å wide by 7.5 Å deep (81)). A hallmark of the ligand-bound conformation is that Uri23 interacts with the Hoogsteen edge of a nearby adenine to form a Uri23·Ade27-Uri38 base triple (Fig. 2C), a feature observed in most peptide- and protein-bound TAR structures (70,75,77,79,82). Cyt24 and Uri25 extrude from the helical core with bases pointing into solvent. Molecular dynamics simulations of TAR in complex with a lab-evolved protein revealed that this long-range triple is preserved over 16 μs but disintegrates rapidly when the protein is omitted from the simulation (77).
In the absence of interacting ligands, the TAR helical axis is bent more acutely to 121° (Fig. 2D). The major groove is extraordinarily wide (13.1 ± 4.2 Å) and shallow (4.4 ± 3.2 Å) compared with an A-form helix. These features are accompanied by a relative reduction of concave surface in the major groove (Fig. 2, D and E versus A and B), a property that is less conducive to binding by small molecules or peptides (83). Because Uri23 and Cyt24 reside inside the duplex without base pairing to the adjacent strand, the s1a-s1b coaxial stack is wedged apart yielding an underpacked core (Fig. 2, D and F) (80). Whereas Uri23 and Cyt24 adopt stacked and inclined base orientations relative to underlying base pairs, Uri25 loops out of the duplex. These features prohibit formation of the hallmark base triple, leaving only the canonical Ade27-Uri38 pair (Fig. 2F). As a result, the central bulge exhibits significantly more conformational flexibility in the apo-state compared with the ligand-bound state, as observed by NMR analysis and molecular dynamics simulations (77,80).
Overall, the propensity of TAR to adopt two major bulge conformations is well-suited to ligand binding. Solution studies showed that the major groove is narrow and deep in the presence of ligand (73,79,84). As such, the RNA is capable of folding around a ligand, such as the unstructured peptide of Tat, giving rise to a complementary interface with a substantial buried surface area (76). Understanding the details of such RNA-peptide interactions provides insight into the basis for affinity and specificity, while revealing stereochemical features that are unique to the respective apo- and ligand-bound states. Such information is of high value for the design of novel antivirals that target the HIV TAR element.

Targeting TAR with small molecules
Figure 2. […] (77). D, surface map of concave and convex features for apo-state HIV-1 TAR (PDB entry 1anr) (80). The helical axis bends substantially with an overall angle of 121°. The structure is characterized by more convex surfaces compared with the bound state. E, view of D rotated +45° to emphasize the helical bend. F, ribbon model revealing the Ade27-Uri38 duplex but not the major groove triple. Bases of the flanking UCU bulge penetrate the core contributing to the bend. The helical axis, angle, major-groove width, and depth were calculated by Curves+ (152); when applicable, parameters were computed as the average of the NMR ensemble. Concave and convex properties for each nucleotide of the lowest-energy NMR structures were calculated by Cx (83) and displayed on a Curves+ output file as a heat-map surface using PyMOL (Schrödinger, LLC). Here and elsewhere, perceived hydrogen bonds and related interactions are depicted as broken lines.
During the past 2 decades, multiple labs have worked to identify small molecules that bind HIV-1 TAR (64,85-91). To gain perspective about the successes and ongoing challenges, it is instructive to examine the handful of structurally characterized TAR-small molecule complexes to assess compound localization and commonalities in their modes of molecular recognition. A survey of such complexes (Table 1) reveals common chemical features, including positively charged alkylamine or guanidinium groups and planar heteroaromatic groups, such as naphthyl, indole, phenyl, or phenothiazine moieties. Although neomycin and derivatives thereof are known to bind TAR (92), we will not consider aminoglycosides here, due to their promiscuous RNA binding resulting from many positively charged amines (Fig. 3A), their toxicity (86,93-95), and the recent focus on compounds with "drug-like" properties in terms of potency, solubility, selectivity, and distribution, as well as RNA targeting by use of specific modes of molecular recognition (96).
Historically, arginine was one of the first small molecules shown to bind TAR and to induce a conformational change (97,98). This amino acid, and derivatives thereof, has served as a proxy for selective binding of the Tat protein to TAR. This key interaction arises from specific contacts at Uri23, Gua26, and Ade27 (97, 99-101). Because more structural restraints were discernible for the HIV-2 TAR-argininamide complex compared with HIV-1 TAR-argininamide, the former analysis is considered to provide a definitive basis to evaluate this RNA-ligand interaction (69). Indeed, HIV-2 TAR differs from HIV-1 by deletion of Cyt24 in the UCU bulge (Fig. 1C). Both HIV TAR variants have similar KD values of ~2 mM for argininamide (Fig. 3A). The HIV-2 TAR-argininamide NMR ensemble reveals that the ligand localizes to the major groove near the central bulge, where the guanidinium moiety engages in cation-π stacking between Ade22 and Uri23 (Fig. 3, B and C). Although NMR spectra did not provide direct evidence for hydrogen bonding between argininamide and Gua26 (77), the guanidinium position and orientation are consistent with co-planar hydrogen bonding predicted by theoretical calculations (69). One metric of surface complementarity at the receptor-ligand interface is the shape correlation statistic (Sc) (102). The calculated Sc of 0.67 for the HIV-2 TAR-argininamide interaction (69) suggests substantial surface complementarity. The complex buries 229 Å² of the argininamide solvent-accessible surface, which is 65% of the total ligand surface area. The observation that this RNA-ligand complex shares NOEs with the HIV-1 TAR-argininamide complex suggested similar modes of effector binding (79,103). Although the hallmark Uri23·Ade27-Uri38 triple of the bound state was absent in NOE assignments in one study (103), another study revealed that an isomorphic C+·G-C mutant interacts with argininamide in a pH-dependent manner, suggesting base-triple formation is needed for amino acid binding (84).
Argininamide binding to TAR provided several insights in terms of ligand localization and the determinants of binding (Fig. 3, B and C). As we will see, this mode of binding, known as an arginine sandwich motif (ASM), was observed next in the context of TAR-Tat interactions (described below). Of course, high-affinity Tat-mediated recognition requires multiple arginines (101,104,105). This knowledge and the application of electrostatic analysis to the TAR-argininamide complex prompted high-throughput screening of bis-guanidine compounds designed to mimic argininamide binding. Based on Tat-peptide-displacement assays, a top hit, RBT-203 (Fig. 3A), showed a Ki of 1.5 μM by FRET displacement, evidence of binding by surface plasmon resonance (SPR), as well as inhibition of Tat-mediated transcription in cell-free extracts at levels of 5-15 μM (106, 107). NMR analysis revealed the compound induces a conformation similar to that of the TAR-argininamide complex, although neither guanidinium group of RBT-203 interacts with a guanine base (107). Addition of an indole ring into the RBT-203 benzyl scaffold and replacement of the guanidinium groups by piperazine and a primary amine improved the Ki to 39 nM, although antiviral activity was not assessed (106). This new compound, RBT-550 (Fig. 3A), was shown by NMR to bind TAR in a fundamentally different manner compared with argininamide. The indole ring appears to intercalate adjacent to the UCU bulge between the Gua26-Cyt39 and Ade22-Uri40 base pairs (Fig. 3D). Uri23 does not form the hallmark base triple, and the primary amine of RBT-550 interacts with the Gua26 backbone; the piperazine moiety protrudes into solvent but appears to restrict the propylamine conformation in some members of the structural ensemble. Intercalation produces a high degree of shape complementarity (average Sc of 0.67), and 290 Å² of the ligand is buried in the interface, representing 44% of its solvent-accessible surface. The observation that RBT-550 promotes a TAR conformation that differs from the argininamide-bound complex lends support to the idea that some small molecules can shift the RNA conformational equilibrium to an "inactive" state, which is a generally accepted drug strategy (66,92,106,108).
Computational screening of a small molecule library identified acetylpromazine (Fig. 3A) as a TAR binder, providing an early example of how this approach could be used to target RNA. Electrophoretic mobility shift assays suggested that the compound blocks formation of a TAR-Tat-CycT1 complex at ~100 nM (64). NMR analysis indicated that acetylpromazine localizes within the bulged loop of TAR in a manner analogous to RBT-550 (109). As with RBT-550, binding appears to be conferred primarily by stacking between the Gua26-Cyt39 and Ade22-Uri40 base pairs and is accompanied by dissolution of the Uri23·Ade27-Uri38 base triple. As with RBT-203 and RBT-550, there are no base-specific interactions comparable with Hoogsteen-edge readout by argininamide (Fig. 3C). The RNA-ligand

Model ncRNA-inhibitor interactions: base pairing and shape complementarity
Figure 3. […] A, chemical diagrams for various small molecules that bind TAR and have been characterized structurally by experimental approaches. Positively charged groups are light blue, and aromatic rings are pale yellow. Equilibrium KD values for TAR binding to neomycin and argininamide were derived from NMR (69,92). Ki values for RBT-203 and RBT-550 were measured for the ability to displace a Tat-derived peptide from TAR, as monitored by FRET (106,107). The EC50 value of acetylpromazine was estimated based on an EMSA analysis of concentration-dependent disruption of a TAR-Tat-CycT1 complex (64). Here and elsewhere, shape correlation coefficients for RNA-ligand interfaces were calculated by the program Sc on a scale of 0 to 1.0 (102). Calculations in A were applied to the following: TAR-neomycin (PDB entry 1qd3) (92); TAR-argininamide (PDB entry 1akx) (69); TAR-RBT-203 (PDB entry 1uub) (107); TAR-RBT-550 (PDB entry 1uts) (106); and TAR-acetylpromazine (PDB entry 1lvj) (64). Sc values are the average derived from the reported NMR ensembles. Solvent-accessible surface areas of RNA-ligand interfaces were calculated by PISA (153). B, ribbon diagram of HIV-1 TAR (PDB entry 6cmn) (77) depicting the locations of nucleotides (green surface) that interact with various small molecules in Table 1. Most ligands bind in the major groove at the interface between s1a and s1b; neomycin binds in the minor groove (92). C, ribbon and ball-and-stick diagram of HIV-2 TAR in complex with argininamide. D, ribbon and ball-and-stick diagram of HIV-1 TAR in complex with RBT-550.
Small molecules that strongly target a specific RNA are uncommon, and these are likely to engage multiple unintended partners (91). At the outset of screening, effective approaches strive to limit off-target recognition by conducting binding assays in the presence of a molar excess of tRNA (85), or by gauging nonspecific binding by use of decoy RNAs (85, 86), RNase footprinting (110), or whole transcriptome analysis (111). Even after a tight-binding RNA inhibitor has been identified, determining the structure of its complex is rarer still. As we noted, many technical obstacles were overcome to obtain reliable experimental structures of TAR (69,112). To improve such outcomes, the analysis of TAR binding to various small-molecule ligands would have benefitted from complementary biophysical approaches to rigorously and reproducibly assess the binding determinants of hit compounds (113-115). Methods that provide thermodynamic parameters (ΔG, ΔH, and −TΔS) and KD values have proven especially useful to relate structural observations to specific modes of binding (116). SPR is also considered a rigorous secondary screen to validate high-throughput approaches, while providing kinetic constants kon and koff for lead optimization (117). A future challenge for inhibitor studies of TAR will be to relate quantitatively vetted molecular-recognition attributes of ligand binding to the drug-discovery process. Accordingly, we now consider examples of well-defined ncRNA-effector complexes with distinct RNA recognition features, supporting equilibrium binding constants, and analyses of downstream inhibitor effects on antiviral or antibacterial function.
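To make the link between equilibrium constants, kinetic constants, and free energies concrete, the following minimal Python sketch converts a KD into a standard binding free energy via ΔG° = RT ln(KD) and derives a KD from SPR-style rate constants via KD = koff/kon. It is an illustration only, not a method taken from the cited studies: the temperature of 298.15 K and the 1 M standard state are assumptions (the source does not state measurement temperatures), the function names are hypothetical, and the example KD values are simply those quoted elsewhere in this review.

```python
import math

# Molar gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def binding_free_energy(kd_molar: float, temp_k: float = 298.15) -> float:
    """Return the standard binding free energy in kcal/mol.

    Uses dG = R * T * ln(KD / 1 M). The default temperature (298.15 K)
    is an assumption; the review does not report assay temperatures.
    """
    if kd_molar <= 0:
        raise ValueError("KD must be a positive concentration in mol/L")
    return R_KCAL_PER_MOL_K * temp_k * math.log(kd_molar)


def kd_from_rates(k_on: float, k_off: float) -> float:
    """Return KD (mol/L) from k_on (1/(M*s)) and k_off (1/s)."""
    if k_on <= 0 or k_off < 0:
        raise ValueError("k_on must be positive and k_off non-negative")
    return k_off / k_on


if __name__ == "__main__":
    # KD values quoted in this review, converted to mol/L.
    examples = {
        "TAR-argininamide (~2 mM)": 2e-3,
        "HCV IRES IIa-compound 12 (860 nM)": 860e-9,
        "FMN riboswitch-BRX1555 (39.0 nM)": 39.0e-9,
        "BIV TAR-Tat peptide (1.3 nM)": 1.3e-9,
    }
    for label, kd in examples.items():
        print(f"{label}: dG = {binding_free_energy(kd):.1f} kcal/mol")
```

Under these assumptions, each 10-fold improvement in KD corresponds to roughly 1.4 kcal/mol of additional binding free energy, which gives a sense of scale for the millimolar-to-nanomolar range spanned by the ligands discussed here.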
Benzimidazole derivatives have been identified by MS-based screening that target the internal ribosome entry site (IRES) of the hepatitis C virus (HCV) genomic RNA (118). The IRES features a series of folded domains, including conserved domain II. This region comprises a bent, bulged loop that is key for positioning the viral mRNA initiation codon and activation of the host ribosome (119). Biophysical analysis demonstrated benzimidazole compounds straighten domain IIa and reorder the bulge (110,120), which has parallels to the ligand-bound and apo-states of TAR (Fig. 2, A and D). Significantly, the restructured S-shape of domain IIa produces a cavity suited to small-molecule binding.
Lead optimization led to compound 12 that binds HCV IRES domain IIa with a KD of 860 nM (Fig. 4A) (118). A 2.2 Å resolution co-crystal structure reveals the mode of RNA recognition by 12 (Fig. 4B). Specifically, the 2-aminoimidazole moiety donates hydrogen bonds to the Hoogsteen edge of Gua110, like argininamide (Fig. 3C). The dimethylamino-propyl group makes an electrostatic interaction with a nonbridging oxygen of Ade109, whereas the dimethylamino-methyl group forms a water-mediated contact to a nonbridging oxygen of Ade53. The benzimidazole moiety engages in π-stacking between purines Gua52 and Ade53. These features are corroborated strongly by structure-activity relationships (118). Compound 12 sequesters 384 Å² of its surface in the RNA pocket or 71% of the ligand's solvent-accessible area. As expected from the structure, the shape correlation between the RNA and ligand surfaces is high with an Sc value of 0.82. Importantly, compound 12 was also active in HCV-replicon assays. The inhibitor reduced HCV RNA levels in cells with an EC50 of 3.9 μM (118), similar to levels needed to inhibit translation from subgenomic replicons (2.8 ± 0.4 μM) and full-length virus (3.4 ± 0.5 μM) (28). Although a related compound 11 showed slightly poorer binding affinity (KD of 1.7 μM), it performed better in the replicon assay (EC50 of 1.5 μM). Compound 11 replaces the tetrahydropyran ring with a smaller tetrahydrofuran. This subtle difference has been attributed to differences in cellular penetration (118), which is a major consideration beyond a tight-binding KD.
Figure 4. […] (118). The chiral center is labeled with an asterisk. B, ball-and-stick diagram of HCV domain IIa (purple) bound to compound 12 (yellow) (PDB entry 3tzr) (28). The stereochemistry was not resolved in electron density maps. C, chemical structure of the FMN analogue BRX1555. The KD value of drug binding, the EC50 value from single-round transcription assays, and the IC50 value for bacterial growth inhibition are provided (25). D, ball-and-stick diagram of the Fusobacterium nucleatum FMN riboswitch in complex with BRX1555 (PDB entry 6dn3) (25). The respective ncRNA-inhibitor structures were chosen based on visual inspection of ligand fit to electron-density maps and associated quality-control indicators.
Riboswitches represent another class of structured ncRNAs that change their conformations in response to the binding of cognate ions or small molecules (121-123). Such sensing results in mRNA regulatory feedback that controls downstream genes (124-126). The flavin mononucleotide (FMN) riboswitch is notable because it senses the cofactor FMN as well as the natural product roseoflavin, which acts as an antibacterial (23,127). This vulnerability has fostered efforts to target the FMN riboswitch with novel antibiotics (19,21,22,25,26). Recent structure-guided design led to the discovery of compound BRX1555, an FMN analogue that binds with a KD of 39.0 ± 0.7 nM based on in-line probing (Fig. 4C) (25). The ligand has an in vitro EC50 of 1.70 ± 0.18 μM in single-turnover transcription termination assays and an IC50 of 0.49 ± 0.09 μM in bacterial growth inhibition assays (25).
The 2.80 Å resolution co-crystal structure of the FMN riboswitch in complex with BRX1555 reveals key details about its mode of molecular recognition. As expected, the inhibitor overlaps with the binding site of the natural ligand, which resides at the center of a six-way helical junction comprising two pairs of stabilizing loops (128). The isoalloxazine ring of the inhibitor stacks centrally between Ade48 (junction J3-4) and Ade85 (pairing region P6) (Fig. 4D). The face of Ade99 (J6-1) hydrogen bonds to the uracil-like edge of BRX1555 in a manner similar to FMN. Gua62 (J4-5) stacks against the phenyl group of the inhibitor, reminiscent of the 2-methylaminopyrimidine moiety of ribocil, the synthetic FMN analogue discovered by Merck (21). In terms of binding and localization, the similarities of BRX1555 and ribocil are remarkable, especially because the former molecule was developed by structure-based design and the latter was identified by phenotypic screens that yielded a novel chemical scaffold distinct from FMN (21,25). Like FMN and ribocil, the riboswitch-BRX1555 complex buries a large amount of the inhibitor's solvent-accessible surface in the interface (468 Å² or 88%). The riboswitch-BRX1555 complex also shows significant shape complementarity, as indicated by an Sc value of 0.72. Interestingly, significant commonalities exist in the interactions used by HCV IRES domain IIa and the FMN riboswitch in terms of ligand recognition; these likenesses include hydrogen bonding that imparts base-specific readout, co-axial base stacking, solvent exclusion, and high shape complementarity (Fig. 4, B and D). These features also represent key molecular recognition determinants in peptide binding to TAR, which we will now explore.
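The interface metrics quoted for the small-molecule complexes can be placed side by side with a short Python sketch. It tabulates the buried ligand surface, the buried percentage, and the Sc value reported in the text for four RNA-ligand interfaces, and back-calculates the total ligand solvent-accessible area implied by each pair of numbers. The class and function names are hypothetical, and the implied totals are only approximate because the published percentages are rounded; they are not values reported by the original studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interface:
    """Interface metrics for one RNA-ligand complex, as quoted in the text."""

    name: str
    buried_area: float     # ligand surface buried in the interface (square angstroms)
    buried_percent: float  # buried share of the ligand solvent-accessible surface (%)
    sc: float              # shape correlation statistic, scale 0 to 1.0

    @property
    def implied_total_area(self) -> float:
        """Back-calculated total ligand surface area (approximate)."""
        return self.buried_area / (self.buried_percent / 100.0)


# Values transcribed from the review text.
INTERFACES = [
    Interface("HIV-2 TAR-argininamide", 229.0, 65.0, 0.67),
    Interface("HIV-1 TAR-RBT-550", 290.0, 44.0, 0.67),
    Interface("HCV IRES IIa-compound 12", 384.0, 71.0, 0.82),
    Interface("FMN riboswitch-BRX1555", 468.0, 88.0, 0.72),
]


def rank_by_burial(interfaces: list[Interface]) -> list[Interface]:
    """Sort complexes from most to least buried ligand surface (by percent)."""
    return sorted(interfaces, key=lambda item: item.buried_percent, reverse=True)


if __name__ == "__main__":
    for item in rank_by_burial(INTERFACES):
        print(
            f"{item.name:<26} buried {item.buried_area:5.0f} A^2 "
            f"({item.buried_percent:.0f}%), Sc {item.sc:.2f}, "
            f"implied total {item.implied_total_area:.0f} A^2"
        )
```

Ranked this way, the two crystallographically defined ncRNA-inhibitor complexes bury a larger share of the ligand than either TAR complex, and RBT-550 buries the smallest share even though its absolute buried area exceeds that of argininamide.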

Molecular recognition of TAR by Tat peptides
The HIV-1 Tat protein comprises multiple functional domains that are needed to complete the viral life cycle (Fig. 5A). TAR binding requires a basic ARM (37, 99) harboring nine arginines. Of these, Arg-52 is most essential because its mutation to lysine yields a drastic loss of transactivation (99). Mutagenesis of flanking residues supports the presence of a supplemental electrostatic interaction network that modulates RNA binding as well as transactivation (100,129). Thus far, the structure of the intact TAR-Tat-SEC complex (Fig. 1A) has remained elusive, although divide-and-conquer efforts have led to core SEC complexes in the presence of Tat's transactivation domain. Nevertheless, these co-crystal structures currently lack the Tat ARM domain (51, 74), providing an incomplete picture of RNA recognition. Accordingly, we will now focus on recent structures of Tat-derived ARM peptides in complex with TAR that have led to a new understanding of this key RNA-protein interaction and how it provides a foundation for HIV inhibitor design. A structural survey of known peptides and proteins bound to TAR is presented in Table 2.
To provide perspective on the recent structure of the HIV-2 TAR-Tat complex, it is important to recognize that initial high-resolution insights came from NMR studies of the BIV TAR-Tat complex (73,75). Like HIV-1 Tat, the ARM domain of BIV Tat is also arginine-rich (Fig. 5B). The peptide binds BIV TAR in the major groove near the central UU bulge, where it forms a short antiparallel strand capped by a distorted type V′ β-turn (Fig. 5C) (130). Like many β-turns, the ith to i + 3rd hydrogen bond is absent, but the carbonyl oxygen of the ith residue (Arg-73) receives a hydrogen bond from the i + 4th side chain (Arg-77) (Fig. 5D). The net result is a β-hairpin spanning the width of the major groove. Base-specific readout is mediated by guanidinium groups from Arg-70, Arg-73, and Arg-77, which hydrogen bond to the Hoogsteen edges of Gua14, Gua11, and Gua9, respectively. Cation-π stacking is observed between Arg-70 and Ade13 of the central base triple and between Arg-73 and Gua9. A handful of salt-bridge and hydrogen-bond interactions occur, including Lys-75 Nε to the pro-Rp oxygen of Uri24 and the backbone amide of Gly-71 to N7 of Gua22. The complex buries 62% of the total Tat peptide (Ser-65 to Arg-81) solvent-accessible surface, or 1187 Å². The interface exhibits a substantial amount of shape complementarity, as indicated by an Sc value of 0.70. These molecular recognition properties are consistent with the KD of 1.3 ± 0.1 nM measured for this strong peptide-RNA binding interaction (131).
More recently, the solution-NMR structure of the HIV-1 Tat peptide (amino acids 44-60) was determined in complex with HIV-2 TAR (76). This exciting new complex reveals unprecedented chemical details about the mode of TAR-Tat molecular recognition (Fig. 5B). Remarkably, the Tat ARM spans the length of the TAR major groove, starting with the N terminus abutting the well-ordered apical loop (Fig. 5E). This tight RNA turn is fortified by a canonical Cyt30-Gua34 base pair first observed in the HIV-1 TAR complex with the lab-evolved protein TBP6.7 (77) (discussed below). The C terminus of the peptide extends through stem s1a (Fig. 5E) and protrudes into solvent past Arg-57. Consistent with CD spectra of the isolated peptide in solution (129), the ensemble of Tat conformers in the bound state lacks regular secondary-structure features, in contrast to the β-turn in BIV Tat (Fig. 5, E versus C).
As anticipated, the determinants of TAR-Tat binding specificity include key arginines that read the Hoogsteen edges of conserved guanine bases in the TAR sequence (Fig. 5, B and F). The indispensable nature of Arg-52 (97,100) is consistent with its recognition of Gua26 (Fig. 5F), the site of argininamide binding (Fig. 3C). Arg-52 is sequestered by cation-π stacking of its guanidinium group between bases from Ade22 and Uri23. The latter base engages in the hallmark bound-state base triple. The constellation of bases and mode of amino acid recognition compose the specialized ASM protein-RNA interaction module (Fig. 5F) (132), first observed for argininamide (above).
Although the ASM appears only once in the TAR-Tat complex, it is utilized four times in the 7SK-Tat complex (data not shown) (76). Arg-73 of BIV Tat uses comparable ASM-like readout, although the Arg-73 guanidinium group does not stack beneath the Uri10 base (Fig. 5D).
A different mode of TAR recognition is used by the Arg-49 group of Tat, which also hydrogen bonds to the Hoogsteen edge of a conserved guanine (i.e. Gua28), while making contacts to the 2′-OH of Uri23 (Fig. 5, B and F). Although the latter nucleobase stacks upon the Arg-49 side chain, this binding mode does not constitute an ASM because the guanidinium is not flanked by bases on both sides (i.e. it is an "open-faced" arginine sandwich). Beyond arginine, Tat uses additional stabilizing hydrogen bonds to recognize TAR in the upper and lower stems. These include the following: the ε-amino groups of Lys-50 and Lys-51, which interact with backbone oxygens from Gua36 and Cyt37; the Gly-48 carbonyl oxygen, which interacts with the exocyclic amine of Cyt29 (Fig. 5, B and F); and Arg-53 and Arg-55 from the flexible C-terminal tail of Tat, which interact with the backbone at Cyt39 and Uri40, whereas Gln-54 recognizes atom N7 of Gua43 (data not shown). The cumulative interactions are summarized in Fig. 5B.
Figure 5. [...] (154). The transactivation domain comprises an N-terminal acidic and proline-rich domain, a cysteine-rich domain that binds Zn(II) (i.e. zinc finger or ZnF), a core region, and a basic ARM. Additional domains include a glutamine-rich region and the E2 CTD. B, summary of peptide sequence interactions with TAR RNAs from C-J (blue box). Amino acids of naturally occurring BIV and HIV-1 Tat ARMs are shown; the sequence alignment is based on common recognition modes of RNA targets as follows: Gua11(26) and Gua14(28) of BIV(HIV) TAR are recognized by Arg-73(R52) and Arg-70(R49) of BIV(HIV) Tat (yellow box). Amino acids of lab-evolved proteins and cyclic peptides from structure-based design. The sequence alignment is based on common spatial recognition at Gua26 and Gua28 of HIV-1 TAR by Arg-47 and Arg-49 of TBP6.7 and Arg-3 and Arg-5 of JB181. Specific RNA-peptide interaction types are listed above each amino acid; the symbols are as follows: π indicates cation-π or aromatic stacking; H equals hydrogen-bond recognition of a guanine Hoogsteen edge; Pi indicates salt-bridge formation to the phosphate backbone; b indicates hydrogen-bond recognition to a nucleobase; a pentagon indicates a hydrogen-bond contact to ribose. Symbols of nonstandard amino acids are as follows: B equals L-2,4-diaminobutyric acid; O equals L-ornithine; backward P is D-proline. C, global view depicting BIV TAR (purple ribbon) recognition by the BIV-Tat ARM (yellow worm) (PDB entry 1biv) (75). The Sc value was calculated from the lowest energy NMR core structure (amino acids 67-79). D, close-up view of BIV TAR recognition by BIV Tat at the UU bulge. Despite differences in the central bulge compared with HIV TAR (Fig. 1C), BIV TAR exhibits a major-groove base triple at Uri10·Ade13-Uri24. The Tat peptide undergoes a sharp bend with dihedral angles of φ(i+1) 75°, ψ(i+1) −10°, φ(i+2) −133°, and ψ(i+2) 63°, characteristic of a type V′ turn (130); the peptide has no other β-hairpin characteristics. For clarity, only amino acids engaged in peptide-RNA interactions were included in diagrams. E, global view depicting HIV-2 TAR recognition by the HIV-1 Tat peptide (PDB entry 6mce) (76). The Sc value was calculated from the lowest energy NMR core structure (amino acids 48-54). F, close-up view of HIV-2 TAR recognition at its UU bulge by HIV-1 Tat. Three nucleobases compose an ASM (cyan highlight) that engages Arg-52 of Tat. G, global view depicting HIV-1 TAR recognition by TBP6.7 (PDB entry 6cmn) (77). The lab-evolved β2-β3 loop (yellow) recognizes the TAR major groove. The Sc value was calculated from the co-crystal structure. H, close-up view of HIV-1 TAR recognition in the central UCU bulge by TBP6.7. Arg-47 engages TAR at the ASM, similar to F. I, global view of HIV-1 TAR recognition by the cyclic peptide JB181 (PDB entry 6d2u) (63). The Sc value was calculated from the lowest energy NMR structure. J, close-up view of HIV-1 TAR recognition at the UCU bulge by JB181. The DP13-LP14 turn is shown to emphasize the restrained peptide conformation. CTD, C-terminal domain.
The HIV-1 Tat peptide recognizes TAR with a KD of 22.5 ± 15.2 nM based on ITC (76). This slightly reduced affinity compared to BIV TAR-Tat represents a change in free energy (ΔΔG) of only +1.7 kcal mol⁻¹, comparable to the difference of 2-3 hydrogen bonds. Like the BIV TAR-Tat complex, the lowest energy peptide of the HIV TAR-Tat ensemble is significantly sequestered in the major groove, with 41% of the peptide (1185 Å²) buried from solvent. This degree of similarity is striking, considering that the HIV-1 Tat peptide adopts an extended conformation compared with the BIV U-shaped polypeptide path (Fig. 5, E versus C). As expected, the HIV TAR-Tat interface exhibits substantial shape complementarity in its core, as indicated by an Sc value of 0.66, comparable with antibody-antigen interfaces and peptides designed to inhibit β-amyloid aggregation (102,133).
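The free-energy difference quoted for the HIV-1 and BIV Tat peptides follows from the standard relation ΔΔG = RT ln(KD,weak/KD,tight). The short sketch below reproduces that arithmetic from the two KD values given in the text. It assumes a temperature of 298.15 K, which is not stated in this section, and the function name is illustrative rather than taken from the cited studies.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def ddg_from_kd(kd_weak_nm, kd_tight_nm, temp_k=298.15):
    """Return ddG (kcal/mol) = RT * ln(KD_weak / KD_tight).

    temp_k is an assumed temperature; the ITC temperature is not
    given in this section of the review.
    """
    return R_KCAL * temp_k * math.log(kd_weak_nm / kd_tight_nm)


# KD values quoted in the text: HIV-1 Tat peptide 22.5 nM, BIV Tat 1.3 nM
ddg = ddg_from_kd(22.5, 1.3)
print(f"ddG = {ddg:+.1f} kcal/mol")  # prints "ddG = +1.7 kcal/mol"
```

Under this temperature assumption the result matches the +1.7 kcal mol⁻¹ value reported above; the large uncertainty on the HIV-1 KD (± 15.2 nM) is not propagated here.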

Lab-evolved proteins for HIV-1 TAR recognition
Advances in protein engineering have facilitated the design of novel RNA-binding proteins with distinct functions (134). In this regard, the TAR-binding protein (TBP) is a model system that was "evolved" from RRM1 of the U1A spliceosomal protein by combining saturation mutagenesis, yeast display, and cell sorting (77,85). The unique mode of HIV-1 TAR recognition by variant TBP6.7 was visualized recently by a co-crystal structure determined to 1.80 Å resolution (77). Unexpectedly, TAR recognition by TBP6.7 entails double-stranded RNA recognition of s1b and the UCU bulge (Fig. 5G). This mode of binding differs entirely from the parental U1A protein, which binds to a single-stranded loop within the U1 small nuclear RNA (156). The major determinants of TAR RNA recognition by TBP6.7 are attributable to residues in the evolved β2-β3 loop. For rigor, every amino acid in the loop was mutated and analyzed for TAR binding by ITC, thereby relating structure and recognition in terms of free-energy changes. Arg-47, Arg-49, and Arg-52 are the most energetically significant residues, as reflected by their ΔΔG values of +3.8, +3.2, and +2.8 kcal mol⁻¹, respectively, for Arg-to-Ala mutations. These observations agree well with the structure, wherein each residue penetrates deeply into the major groove to recognize a conserved guanine. Like the HIV-1 TAR-Tat interaction, Arg-47 utilizes the ASM in which its guanidinium group stacks between Ade22 and Uri23, while forming hydrogen bonds to the Hoogsteen edge of Gua26 (Fig. 5, B and H). Unlike the modes of TAR RNA recognition by BIV and HIV Tat peptides (Fig. 5, D and F), Arg-47 simultaneously makes two electrostatic contacts to the Uri23 phosphate. The collective interactions appear to be a variation of a hypothetical "arginine fork" interaction, wherein both edges of the Tat-derived guanidinium group were hypothesized to bind TAR's phosphate backbone (99). Otherwise, the TAR-TBP6.7 complex typifies the TAR-bound conformation featuring the hallmark Uri23·Ade27-Uri38 base triple and canonical Cyt30-Gua34 pair in the apical loop (77).
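To give a feel for what the Arg-to-Ala ΔΔG values mean in terms of affinity, the sketch below converts each value into an approximate fold-weakening of binding using fold = exp(ΔΔG/RT). The temperature of 298.15 K is an assumption (the ITC conditions are not stated here), and the helper name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def fold_change(ddg_kcal, temp_k=298.15):
    """Convert ddG (kcal/mol) into a KD ratio, exp(ddG / RT).

    temp_k is an assumed value, not one reported in the text.
    """
    return math.exp(ddg_kcal / (R_KCAL * temp_k))


# ddG values for Arg-to-Ala mutations in the TBP6.7 loop, as quoted in the text
ARG_TO_ALA_DDG = {"Arg-47": 3.8, "Arg-49": 3.2, "Arg-52": 2.8}

for residue, ddg in ARG_TO_ALA_DDG.items():
    print(f"{residue}: roughly {fold_change(ddg):.0f}-fold weaker binding")
```

At the assumed temperature, each substitution corresponds to a loss of roughly two to three orders of magnitude in affinity, which is consistent with the description of these arginines as the most energetically significant residues.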
Other similarities exist between the modes of HIV-1 Tat and TBP6.7 recognition of TAR. Specifically, Arg-49 of TBP6.7 stacks upon Ade27 while hydrogen bonding and engaging in electrostatic interactions with the Gua28 Hoogsteen edge and phosphate group (Fig. 5H). Arg-49 of HIV-1 Tat forms similar stacking and base-pairing interactions but hydrogen bonds to the 2′-OH of Uri23 (Fig. 5F). Unlike the HIV-1 TAR-Tat complex, TBP6.7 uses a third arginine for guanine recognition. Arg-52 of TBP6.7 reads the Hoogsteen edge of Gua36 while stacking beneath Gua34. Beyond TBP6.7, BIV Tat is the only other example of major-groove guanine recognition by three peptide arginines (Fig. 5, B and D). Despite similarities in TAR recognition among TBP6.7, HIV-1 Tat, and BIV Tat, the commonalities are entirely local and do not reflect common polypeptide folds (Fig. 5, C, E, and G). In terms of the buried surface area and shape complementarity of the TAR-TBP6.7 interface, a total of 718 Å² of TAR is sequestered, of which 384 Å² is attributable to the β2-β3 loop. Recognition of TAR by TBP6.7 gives an Sc value of 0.79 (77), which is comparable with Sc values of peptides selected by phage display to bind the insulin receptor ectodomain (136). Overall, these properties closely resemble comparable metrics for the BIV and HIV TAR-Tat complexes (Fig. 5, C, E, and G).
The observation that the major determinants of TAR recognition by TBP6.7 are localized mostly to the lab-evolved β2-β3 loop has ramifications for inhibitor design using a short peptide that comprises the isolated β2-β3 loop. Indeed, a series of complementary experiments demonstrated that the β2-β3-loop sequence could be removed from the context of TBP6.7 and was still capable of TAR binding. When synthesized as a stapled peptide, the restrained β2-β3 loop still exhibited affinity for TAR (KD of 1.8 ± 0.5 µM) and was capable of inhibiting TAR-Tat-dependent transcription in HeLa nuclear lysate (77). At present, it is unknown whether stapled β2-β3-loop peptides enter cells or whether they possess antiviral activity. Nevertheless, this work provides proof-of-principle that small peptides can be derived from proteins evolved in the lab to recognize TAR.

TAR recognition by structure-based design of cyclic peptides
A more traditional approach to disrupt the SEC-TAR interaction (Fig. 1A) is to exploit existing knowledge of TAR-Tat molecular recognition to guide design of restrained, inhibitory peptides (62,70). Past studies leveraged structural information from the BIV TAR-Tat interaction (Fig. 5, C and D) to produce a number of cyclic peptides (82,137,138) that culminated recently in an "ultrahigh affinity" cyclic peptidomimetic. This inhibitor, JB181, binds HIV-1 TAR with an unprecedented, picomolar KD (63). NMR solution analysis revealed that JB181 recognizes TAR in the s1b major groove and bulge (Fig. 5I). However, rather than adopting an elongated conformation as observed for the HIV-2 TAR-Tat complex, the designed peptide forms a β-hairpin comprising 14 residues (Fig. 5B). To reduce conformational flexibility, the peptide termini are linked by an innovative L- and D-proline turn that covalently cyclizes the inhibitor (Fig. 5J). The RNA recognition-end of the peptide adopts a distorted type II β-turn wherein the carbonyl oxygen of Arg-5 (ith amino acid) accepts a hydrogen bond from the backbone amide of Arg-8 (i + 3rd) (data not shown). Overall, cyclization stabilizes the antiparallel β-strand structure and positions the ith and i + 1st amino acids to interact with the major groove and UCU bulge. Combining natural and unnatural amino acids in the cyclic peptide offers advantages to elicit desired RNA-peptide interactions. Placement of L-2,4-diaminobutyric acid (B) at position 1, as opposed to Arg-1 used in precursor peptide L-22 (70), induces favorable salt bridges between the B1 amino group and phosphates at Gua21 and Ade22 (Fig. 5J). This pairing serves to anchor the peptide in the major groove and promotes electrostatic binding by other basic groups introduced to recognize both bulge and major-groove features. For example, the guanidinium groups of Arg-3 and Arg-5 interact with Gua26 and Gua28, and the Lys-6 side-chain amino group hydrogen bonds to the carbonyl oxygen of Uri25.
In some respects, the determinants of TAR molecular recognition by JB181 are comparable with naturally occurring modes of TAR recognition by the Tat ARM domains from BIV and HIV. JB181 buries 920 Å² or 55% of its solvent-accessible surface area in the RNA-inhibitor interface. This level of sequestration is comparable with Tat binding to BIV or HIV-2 TAR (~1200 Å²). Recognition of BIV TAR Gua9 and Gua11 by Arg-77 and Arg-73 of BIV Tat is analogous to JB181's use of Arg-5 to recognize Gua28 because both sets of interactions involve favorable co-planar positioning of a guanidinium group to donate two hydrogen bonds to O6 and N7 of the base Hoogsteen edge (Fig. 5, D and J). HIV-1 Tat similarly employs a single imino group of Arg-52 and Arg-49 to recognize the Hoogsteen edges of Gua26 and Gua28 within HIV-2 TAR, akin to JB181's use of Arg-3 and Lys-6 to recognize O6 and O4 of Gua26 and Uri25, although JB181 does not utilize the ASM. Although JB181 binds TAR with 100-1000-fold greater affinity than HIV and BIV Tat, it uses fewer specific interactions to recognize TAR (Fig. 5B). Whereas BIV and HIV-1 Tat peptides use every arginine of the ARM sequence for RNA binding, JB181 utilizes half of its complement. This attribute may be indicative of a greater role for JB181's charged residues in general electrostatic recognition of the RNA. Notably, the Sc value of 0.59 for the TAR-JB181 complex agrees well with that of a similar antiviral cyclic peptide, L-22, whose shape complementarity score is 0.60 in the context of the TAR complex (70,77,88). L-22 likewise relies on electrostatic features to bind TAR RNA (62).
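Because the text reports both a buried area and a buried percentage for each peptide, the total solvent-accessible surface of each free peptide can be back-calculated as buried area divided by buried fraction. The sketch below does this for the three peptide values quoted in this review. The function and dictionary names are illustrative, and because the published percentages are rounded, the derived totals are only approximate.

```python
def total_sasa(buried_a2, percent_buried):
    """Estimate total solvent-accessible surface (A^2) from the buried
    area and the percentage of the surface that is buried."""
    if not 0 < percent_buried <= 100:
        raise ValueError("percent_buried must be in (0, 100]")
    return buried_a2 / (percent_buried / 100.0)


# (buried area in A^2, percent buried) as quoted in the text
INTERFACES = {
    "BIV Tat peptide": (1187.0, 62.0),
    "HIV-1 Tat peptide": (1185.0, 41.0),
    "JB181": (920.0, 55.0),
}

for name, (buried, percent) in INTERFACES.items():
    print(f"{name}: ~{total_sasa(buried, percent):.0f} A^2 total surface")
```

On these numbers the extended HIV-1 Tat peptide presents a larger total surface than the compact BIV hairpin, so a similar absolute buried area corresponds to a smaller buried fraction.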

Model of the SEC-core complex bound to HIV-1 Tat(1-60)
In addition to the new structures showing TAR recognition, a co-crystal structure of the TAR-Tat-SEC-core complex was determined recently to 3.5 Å resolution. This exhilarating complex comprises Tat(1-48)-CycT1-AFF4-CDK9 and the apical loop attached to stem s1b of HIV-1 TAR (74). Although the Tat ARM domain is absent in electron density maps, the complex shows how the transactivation domain of Tat is interwoven into CycT1 and that both viral and host proteins contact the TAR apical loop (Fig. 6A), burying 350 Å² of the RNA's solvent-accessible surface. To visualize a more complete model of HIV-1 Tat binding to TAR, including the Tat ARM domain and TAR bulged loop, we superimposed the recent TAR-SEC-core complex (74) upon the HIV-2 TAR-Tat(44-60) structure (i.e. Fig. 5E) (76) based on the position of the common RNA elements. This model provides an integrated view of HIV-1 TAR-Tat-SEC-core binding (Fig. 6A). With this new perspective, the mode of TAR recognition by Tat encompasses the following: (i) the Tat ARM domain and (ii) the Tat transactivation domain, including contributions from CycT1. This bipartite mode of Tat recognition buries an estimated 1550 Å² of TAR's solvent-accessible surface. In this manner, Tat binding facilitates SEC-core recruitment to TAR by shifting the conformational equilibrium of the unbound RNA to the bound-state conformation (Fig. 2, A and D) (74).

Implications for drug discovery and design
Based on the model of the TAR-Tat(1-60)-SEC-core complex, we can evaluate prior studies of antiviral molecules to assess progress and future challenges. As shown above, a comparison of the newly determined HIV-1 TAR-JB181 complex to the recent HIV-2 TAR-Tat complex reveals that the former cyclic peptide inhibitor induces conformational changes in the RNA apical loop and UCU bulge (i.e. a canonical Cyt30-Gua34 pair and the hallmark base triple) that are comparable with folding features elicited by HIV-1 Tat (Fig. 5, I and J versus E and F) (63). Despite this similarity and the remarkably high affinity of JB181 for TAR, the antiviral properties of this inhibitor are limited. Indeed, both JB181 (KD 28.4 ± 4 pM) and its precursor L-22 (KD ~30 nM) (63,70,88) can reduce HIV-1 replication and viral spreading in cell culture, but only to a similar extent (Ki of ~40 µM) (63,88). Hence, the 1000-fold tighter binding to TAR by JB181 compared with L-22 does not appear sufficient to overcome SEC-Tat binding to TAR. Poor intracellular delivery or low stability could account for the unexpectedly weak Ki of JB181. However, a biochemical explanation from the authors of the JB181 study offers another possibility. Specifically, they noted that JB181 efficiently displaces peptide mimics of the Tat ARM domain from TAR, but the inhibitor fails to block recruitment of the SEC-core complex onto TAR (63).
A key implication from this work is that cyclic peptides, or lab-evolved proteins, designed to target the HIV-1 UCU bulge and s1b may not be sufficient to function as potent antivirals (63). Instead, effective Tat inhibition likely requires overcoming its extensive ARM contacts to the entire TAR major groove, as well as apical loop recognition by both the Tat transactivation domain and host CycT1 (Fig. 6B). This observation also has implications for small-molecule inhibitors (63). As we have seen, the preponderance of structurally characterized drug-like molecules bind TAR at the s1a-s1b interface (Fig. 3B). Future drug-design efforts should consider approaches that target TAR's apical loop interaction with Tat and CycT1 (Fig. 6, A and B). Alternatively, multivalent molecules can be envisioned that target TAR; such molecules would simultaneously displace the Tat ARM domain from TAR's major groove, while blocking the Tat transactivation domain and CycT1 from binding the TAR apical loop. Another possibility is to create steric blocks at distal sites of the SEC that do not directly interact with TAR. For example, we superposed TAR from the TAR-TBP6.7 complex (Fig. 5, G and H) upon the TAR-Tat(1-60)-SEC-core model (Fig. 6C). The results not only reveal competition between the lab-evolved β2-β3 loop and Tat ARM but also a steric block arising from the lab-evolved protein where its β1-α1 and α2-β4 loops clash with loop 112-124 of CycT1. This observation provides a possible explanation for why TBP6.7 hinders TAR-Tat-dependent transcription at a concentration of 0.2 µM TBP6.7 (85), whereas the minimal stapled β2-β3-loop peptide from TBP6.7, which is missing the β1-α1 and α2-β4 loops, requires 20 µM concentrations (77).
Hence, targeting TAR at the UCU bulge could cause steric inhibition of a distal CycT1 region to effectively block recruitment of the Tat-SEC complex.
Figure 6. [...] (1-60). A, hypothetical model of HIV-1 TAR in complex with CycT1-CDK9-AFF4-Tat(1-60). The putative Tat ARM trajectory is based on superposition of HIV-1 TAR in the context of the recent co-crystal structure of the SEC core complex (PDB entry 6cyt) (74) upon HIV-2 TAR in the context of the recent TAR-Tat(44-60) structure (PDB entry 6mce) (76), as depicted in Fig. 5E. Rather than localizing the Tat ARM domain to the s1b stem (74), the model predicts that the ARM runs through TAR's major groove. B, surface model of the hypothetical SEC-TAR-Tat model from A. The surface emphasizes three virus-host protein contact points to TAR. First, the RNA apical loop makes a modest number of interactions with Tat (yellow) and CycT1 (blue). Second, the proximal segment of the Tat ARM (amino acids 47-52) contacts TAR within s1b and the bulged loop. Third, the distal Tat ARM (amino acids 53-57) contacts TAR at s1a. For emphasis, TAR is depicted as a semi-transparent surface to allow Tat visualization in the major groove. Notably, there are no observed contacts between the TAR bulge and CycT1, which is behind the RNA in this orientation. C, two sites of steric blocking are predicted when TBP6.7 is docked onto the SEC-TAR-Tat model of A. Specifically, interference occurs by the β2-β3 loop of TBP6.7 (labeled "binding") where the Tat ARM interacts with s1b and the UCU bulge of TAR (Fig. 5, G and H). TBP6.7 loops also interfere with the positioning of CycT1 in the context of the SEC due to steric clashes (labeled "steric surface"). The net result is displacement of the SEC, resulting in the TAR-TBP6.7 complex (right panel). The hypothetical model (left panel) was prepared by superposition of the HIV-1 TAR-TBP6.7 co-crystal structure (PDB entry 6cmn) (77) upon the SEC-TAR-Tat model of A.
In closing, the field is only beginning to understand the basis of molecular recognition of HIV-1 TAR by cognate host and virus proteins. New investigations and innovative approaches are needed to make progress on this complex and multifaceted problem. Our experience is that lab-evolved proteins offer a flexible strategy to cultivate the development of peptide inhibitors directed at specific regions of TAR (85). Proof-of-concept has been demonstrated by the recent TAR-TBP6.7 co-crystal structure (Fig. 5, G and H), which has been reduced to a small restrained peptide (77). By producing a series of peptide inhibitors that target multiple discrete TAR sites and reducing these to small molecules (e.g. employing HIV protease methods (29)), it may be feasible to create a multivalent drug by covalently tethering the disparate compounds together. A related approach was used to target nucleotide repeat transcripts that give rise to myotonic dystrophies (140). The use of lab-evolved proteins to target RNA is broadly applicable to the development of new reagents and drugs that target a variety of functional ncRNAs.