Identification of sequences which regulate the expression of Drosophila melanogaster Doc elements.

Long interspersed nuclear elements (LINEs) are mobile DNA elements which propagate by reverse transcription of RNA intermediates. LINEs lack long terminal repeats, and their expression is controlled by promoters located within the transcribed region of unit-length DNA copies. Doc elements constitute one of the seven families of LINEs found in Drosophila melanogaster. Plasmids in which the chloramphenicol acetyltransferase (CAT) gene is preceded by DNA segments from different Doc family members were used as templates for transient expression assays in Drosophila S2 cells. Transcription is initiated at the 5′ end of Doc elements within hexamers fitting the consensus (C/G)AYTCG and is regulated by a DNA region which is located ~20 base pairs (bp) downstream from the RNA start site(s). The region includes a sequence (RGACGTGY motif, or DE2) which stimulates transcription in other Drosophila LINEs, and two adjacent elements, DE1 and DE3. Moving the downstream region either 4 bp away from, or 5 bp closer to, the RNA start site region inhibited transcription. Sequences located ~200 bp downstream from the Doc 5′ end repressed CAT expression in an orientation- and position-dependent manner. The inhibition reflects impaired translation of the CAT gene, possibly consequent to the interaction of specific Doc RNA sequences with a cellular component.

Doc is one of 50 or more mobile DNA elements that have been identified in the fruit fly Drosophila melanogaster. Doc elements lack terminal repeats, instead terminating at the 3′ end in runs of adenine residues flanked by polyadenylation signals. They differ in size, being variously truncated at the 5′ end, and are flanked by target site duplications which vary in length from 10 to 14 bp¹ (Schneuwly et al., 1987; Driver et al., 1989). Complete family members are ~4.7 kb in length and potentially encode a putative nucleic acid binding protein and a reverse transcriptase (O'Hare et al., 1991). The structure and coding capacity of Doc is typical of LINEs, nomadic DNA sequences conserved in evolution from protozoa to man (Doolittle et al., 1989; Xiong and Eickbush, 1990). LINEs, also known as type II retrotransposons, use self-encoded proteins to reverse transcribe their own mRNA and integrate cDNA copies at new locations in the genome. This hypothesis has been experimentally supported by the analysis of transgenic flies carrying intron-marked Drosophila I factors (Pelisson et al., 1991; Jensen and Heidmann, 1991) and baby hamster kidney cells transfected with mouse LINE-1 elements (Evans and Palmiter, 1991).
Mammalian genomes harbor >10⁵ LINEs that belong to a single superfamily (Singer and Skowronski, 1985; Hutchison et al., 1989). By contrast, distinct LINE families, each including 50-80 members, coexist in D. melanogaster. In addition to Doc, six other families of LINE elements have been described so far in this organism, including the I factor, F (Di Nocera and Casari, 1987), G (Di , and jockey (Priimagi et al., 1988) elements, and type I and type II ribosomal DNA insertions (Jacubczak et al., 1990).
LINEs differ markedly from other mobile DNA sequences that also propagate by the retrotranscription of RNA intermediates, such as copia-like elements in D. melanogaster and the Ty element (Boeke et al., 1985) in Saccharomyces cerevisiae. These elements, also known as viral retrotransposons, resemble the integrated genomes of retroviruses as they carry LTRs. LINEs lack LTRs, and their expression is controlled by promoters which are located within the transcribed region (Swergold, 1990; Minchiotti and Di Nocera, 1991; Minakami et al., 1992; Contursi et al., 1993; McLean et al., 1993).
By means of transient transfection assays we monitored the expression of constructs in which the reporter CAT gene was under the control of various Doc DNA segments in Drosophila Schneider II (S2) cells. We show that distinct cis-acting DNA elements, clustered in a ~50-bp long DNA region located at the 5′ end of unit-length Doc copies, cooperate to control RNA initiation. In addition, we found that sequences located ~200 bp downstream from the 5′ end inhibit the expression of the reporter CAT gene in a position- and orientation-dependent manner. The inhibition appears to be due to reduced translation rather than to impaired synthesis of CAT mRNA.

MATERIALS AND METHODS
Plasmids-The clones suN, 6N, and 11N, in which the 5′ end regions of the elements su(f) S2, Doc6, and Doc11, respectively, precede the CAT gene, were obtained by cloning SalI-NaeI (suN), XhoI-NaeI (6N), and EcoRI-NaeI (11N) fragments, isolated, respectively, from pDoc, Doc6, and Doc11, into the AvaI site of pEMBL8CAT. pDoc, Doc6, and Doc11, described in Driver et al. (1989), were kindly provided by Dr. Kevin O'Hare. The clones suA, 11A, and 6A were obtained by cloning an EcoRI-AluI fragment from suN (suA), or AluI fragments from either
* This work was supported by grants from Ministero Università e Ricerca Scientifica, P. F. Ingegneria Genetica of the C.N.R. and Commission of the European Communities (contract ERBSC1* CT920811). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
DNA Transfections and CAT Assays-Three ml of D. melanogaster S2 cells, seeded at a density of 1 × 10⁶ to 2 × 10⁶ per ml, were transfected as described previously (Di Nocera and Dawid, 1983). Five μg of the plasmid of interest and 5 μg of γF-gal were cotransfected per 65-mm diameter culture dish. γF-gal is a reference construct in which the Escherichia coli β-galactosidase gene is under the control of the D. melanogaster hsp70 core promoter, flanked 5′ by a DNA segment from the D. melanogaster F element. Forty-eight hours after transfection, cells were harvested, resuspended in 0.2 ml of 0.25 M Tris (pH 7.8), and lysed by three cycles of freeze-thawing. The amount of β-galactosidase activity in each lysate was used to normalize the amount of extract with which CAT assays were performed. CAT and β-galactosidase activities were measured according to standard procedures (Sambrook et al., 1989).
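The normalization step described above can be illustrated with a short calculation. This is a minimal sketch, not the authors' protocol: the function names, the activity units, and every number in the usage note are hypothetical, and it assumes only that each lysate is dosed so that the same amount of β-galactosidase activity enters every CAT assay.

```python
def normalized_extract_volumes(bgal_activity_per_ul, target_bgal_units):
    """Return, per sample, the lysate volume (in µl) that contains
    `target_bgal_units` of beta-galactosidase activity.

    bgal_activity_per_ul: dict mapping sample name -> activity per µl
    (arbitrary units; hypothetical input format).
    """
    volumes = {}
    for sample, activity in bgal_activity_per_ul.items():
        if activity <= 0:
            raise ValueError(f"no beta-galactosidase activity in {sample!r}")
        # A more active lysate needs proportionally less extract.
        volumes[sample] = target_bgal_units / activity
    return volumes


def relative_cat_activity(cat_signal, reference):
    """Express each CAT signal relative to a chosen reference construct."""
    ref_value = cat_signal[reference]
    if ref_value <= 0:
        raise ValueError("reference CAT signal must be positive")
    return {name: value / ref_value for name, value in cat_signal.items()}
```

For example, with hypothetical activities of 2.0 and 4.0 units/µl for two lysates and a target of 10 units, the function returns 5.0 µl and 2.5 µl, so the more efficiently transfected dish contributes less extract to the CAT assay.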
RNA Analyses-Total RNA was analyzed by primer extension according to Grimaldi and Di Nocera (1988). Reaction products were resolved on 6% polyacrylamide, 8 M urea gels. Sequencing ladders were generated by the dideoxy chain termination method utilizing double-stranded DNA templates. The CAT primer was described previously (Minchiotti and Di Nocera, 1991). Northern analyses were carried out according to standard procedures (Sambrook et al., 1989). Thirty μg of total RNA from Schneider II cells co-transfected with 5 μg of γF-gal and 5 μg of suAδ1 or suAδ2 were electrophoresed on a 1% formaldehyde-agarose gel. The gel, blotted onto a HyBond+ membrane, was hybridized to a HindIII-NcoI fragment from pEMBL8CAT spanning the CAT gene. The filter was subsequently hybridized to an EcoRI-XhoI fragment from pC4βgal (Thummel et al., 1988) spanning the β-galactosidase gene. Hybridization was carried out in 4 × SSC, 50% formamide for 16 h at 45°C. The filter was washed two times in 2 × SSC for 30 min at 65°C, and three times in 0.1 × SSC for 30 min at 65°C before autoradiography.

The 5′ End Regions of Distinct Doc Elements Promote CAT Expression in S2 Cells-The promoters of different LINE elements are located in the 5′ end region of unit-size family copies (Minchiotti and Di Nocera, 1991; Minakami et al., 1992; McLean et al., 1993). Since complete Doc elements are heterogeneous at the 5′ end both in length and sequence content (Driver et al., 1989), we checked the template competence of three different unit-length Doc copies: su(f) S2, Doc6, and Doc11 (Driver et al., 1989). The element su(f) S2, associated with a spontaneous lethal mutation of the suppressor of forked (su(f)), transposed in the recent past (Schalet, 1986), and should have a functional promoter region. This is the case, since a restriction fragment spanning the interval 1-267 of su(f) S2 stimulated the expression of the CAT gene in Drosophila S2 cells (Fig. 1, suN). The elements Doc6 and Doc11 were both isolated from a genomic DNA library. Doc11 is homologous to su(f) S2, but is 7 bp shorter at the 5′ end. Doc6 is 5 bp shorter than su(f) S2 and also differs at the 5′ end by an additional 29 residues (Driver et al., 1989; see Fig. 6). Aside from these differences, the three elements are identical in sequence down to the NaeI sites used for cloning (data not shown). Despite the extensive changes, plasmid 6N, which carries residues 1-262 of Doc6, directed CAT expression at nearly the same level as suN. In contrast, the construct 11N, which carries residues 1-260 of Doc11, was ~10-fold less efficient than suN in directing the expression of the CAT gene (Fig. 1).
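The residue numbers quoted for the three inserts (1-267, 1-262, and 1-260) agree with one another if the elements differ, over this interval, only by the length of their 5′ truncation. The sketch below makes that bookkeeping explicit. The names are illustrative and not from the original work, and it assumes, as a simplification, that the 29 divergent residues of Doc6 are substitutions that do not shift downstream numbering.

```python
# Number of residues missing at the 5' end relative to su(f)S2,
# taken from the text (Doc6: 5 bp shorter, Doc11: 7 bp shorter).
FIVE_PRIME_TRUNCATION = {"su(f)S2": 0, "Doc6": 5, "Doc11": 7}


def to_element_coordinate(suf_position, element):
    """Convert a su(f)S2 residue number into the numbering of `element`.

    Assumes the elements are collinear downstream of the truncation
    (no insertions or deletions), which is a simplifying assumption.
    """
    position = suf_position - FIVE_PRIME_TRUNCATION[element]
    if position < 1:
        raise ValueError(
            f"residue {suf_position} of su(f)S2 is absent from {element}"
        )
    return position


# The NaeI boundary at residue 267 of su(f)S2 maps to the insert ends
# reported for 6N and 11N.
print(to_element_coordinate(267, "Doc6"))   # 262
print(to_element_coordinate(267, "Doc11"))  # 260
```

Under the same assumption, residues 1-7 of su(f) S2 have no counterpart in Doc11, which is why the function raises an error for them.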
An AluI site conserved in the Doc 5′ UTR was used to construct three clones (Fig. 1, suA, 11A, and 6A) in which ~50 bp from the 5′ end of each element preceded the CAT gene. These constructs directed CAT expression at higher (5-8-fold) levels than the parental ones, suN, 11N, and 6N (Fig. 1).
Information sufficient to direct transcription is thus restricted in Doc to a relatively small DNA interval. This contrasts with what has been reported for the F element, in which basal transcription is stimulated by sequences located far downstream from the 5′ end region (Contursi et al., 1993).

Doc Sequences Inhibit CAT Expression in an Orientation- and Position-dependent Manner-Data shown in Fig. 1 suggest that the AluI-NaeI interval, which was removed in plasmids of the A series, contains sequences which inhibit CAT expression. Via polymerase chain reaction cloning (see "Materials and Methods"), we obtained two 3′ deletion derivatives of suN in which Doc DNA extended 3′ to residue 218 (construct su218) or 132 (construct su132). The removal of the region 219-267 was sufficient to delete inhibitory sequences from suN (Fig. 2A). Construct su132 directed CAT expression ~2-fold less efficiently than su218. This may denote either that positive DNA sequences are located between residues 132 and 218, or that transcripts directed by the two constructs have different stability.
One interpretation of the data is that the region 219-267, which we will call herein δ, hosts a silencer-like element. Silencers generally inhibit transcription in a position- and orientation-independent manner (Laurenson and Rine, 1992). δ, cloned in either orientation upstream of a copy of the D. melanogaster hsp70 promoter transcribing the CAT gene, did not reduce CAT expression (data not shown). Subsequently, δ was cloned into the plasmid suA, either upstream (δ1suA and δ2suA) or downstream (suAδ1 and suAδ2) of the su(f) S2 promoter. We found that CAT expression was inhibited only when δ flanked 3′, in direct orientation, the su(f) S2 promoter (Fig. 2B). CAT levels were reduced, again in an orientation-dependent fashion (control data not shown), in cells transfected with RSVδ1, a construct in which δ had been cloned between the CAT gene and the Rous sarcoma virus (RSV) promoter (Fig. 2B).
The position- and orientation-dependent mode of action suggests that inhibitory sequences within δ act at the RNA, rather than at the DNA level. This hypothesis is supported by Northern blot analyses showing that suAδ1 transcripts accumulated at levels 2-3-fold higher than suAδ2 transcripts (Fig. 2C). Northern data also ruled out that the inhibition is associated with enhanced degradation of CAT transcripts.
Upstream AUG triplets may inhibit translation initiation at natural AUGs (Liu et al., 1984), and an ATG is present within δ (see Fig. 2A). However, by assaying derivatives of both suA and RSVCAT in which the CAT gene is flanked 5′ by either the left-hand (ε) or the right-hand () portion of δ (Fig. 2A), we observed CAT inhibition with the ε, but not with the subregion, which includes the δ ATG (Fig. 2B). The ε subregion extends 5′ to residue 208 and carries an ATG, not present in δ (Fig. 2A). We cannot therefore rule out that data obtained with suAε1 and RSVε1 reflect an inhibitory action of the ε ATG, which is the codon for the initiating methionine of the hypothetical Doc open reading frame 1, on the translation of the CAT mRNA.
Functional Organization of the Doc Promoter Region-Sites of transcription initiation in su(f) S2, Doc6, and Doc11 were determined by RNA primer extension analyses by using total RNA from cells transfected with suA, 6A, 11A, and an oligomer complementary to the CAT coding region as a primer. To set unambiguously the RNA 5′ ends, reaction products were electrophoresed along with sequencing ladders of the three plasmids (Fig. 3). Transcription initiates predominantly at residues 6 and 7 in su(f) S2 and at residues 1 and 2 in Doc6 (in both elements 1 refers to the first residue flanking the target site). Faint bands of elongation may be artifactual or mark minor sites of RNA initiation at −20, 9, and 19 in su(f) S2 and at −19, −3, 5, and 16 in Doc6. A faint band, corresponding to transcripts initiating at 17 in Doc11, was the only discrete product of extension detected with RNA from cells transfected with 11A.
FIG. 2.
Boxed residues correspond to δ sequences. The DdeI site at the boundary between the ε and subregions is underlined; ATG triplets are in uppercase letters. B, effect of δ, ε, and sequences on CAT expression directed by su(f) S2 and RSV promoters. The orientation of δ, ε, and in each construct is denoted by an arrow. In this panel, as in Panel A, S2 cells, transfected with 5 μg of test plasmid and 5 μg of the internal control γF-gal, were assayed for CAT activity. Relative enzymatic activities are expressed as described in the legend to Fig. 1. C, Northern analysis of suAδ1 and suAδ2 transcripts. Total RNA (30 μg) from S2 cells co-transfected with 5 μg of γF-gal and 5 μg of either suAδ1 (lane 1) or suAδ2 (lane 2) was analyzed by Northern blot using as probes ³²P-labeled DNA fragments spanning CAT and β-galactosidase sequences (see "Materials and Methods"). Bands corresponding to CAT transcripts are indicated by an arrow. Upper hybridization bands correspond to β-galactosidase transcripts directed by γF-gal.
The organization of the cis-acting elements involved in the control of Doc transcription was investigated by assaying plasmids in which the CAT gene is under the control of different versions of the su(f) S2 promoter region. A construct carrying the interval 1-26 directed CAT expression, but could not drive faithful RNA initiation (Fig. 4, construct s26). A 10-fold increase in CAT levels, and a correct transcription pattern, was observed by using as template a construct carrying the interval 1-47 (Fig. 4, s47). In s26 and s47, the dinucleotide TT, found in su(f) S2 at residues 25-26, is replaced by GA. The change, which created in s47 a SalI site between residues 22 and 27 (Fig. 4A), did not perturb promoter function, as judged by comparing the expression of suA and s47 (data not shown). By repairing the termini of s47 DNA after SalI cleavage with either the Klenow enzyme or the mung bean nuclease, we obtained constructs in which the interval 27-47 of su(f) S2 was moved either 4 bp away from (s47 + 4) or 5 bp closer to (s47 − 5) the RNA start site(s). Both space changes severely reduced CAT expression (Fig. 4A). Primer extension data showed that faithful RNA initiation was abolished in the construct s47 + 4 (Fig. 4B).
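One way to picture why shifts of only 4 or 5 bp could be disruptive is to convert them into rotation around the DNA helix. The sketch below is illustrative only: it assumes a B-DNA helical repeat of about 10.5 bp per turn, a textbook value that is not stated in this work, and the function name is hypothetical.

```python
BP_PER_TURN = 10.5  # assumption: approximate helical repeat of B-DNA


def phase_offset_degrees(shift_bp, bp_per_turn=BP_PER_TURN):
    """Angular displacement (0-180 degrees) between the original and the
    shifted position of a site moved by `shift_bp` base pairs.

    Positive values mean insertion (site moved away), negative values
    mean deletion (site moved closer); only the magnitude of the
    rotation is reported.
    """
    rotation = (shift_bp / bp_per_turn * 360.0) % 360.0
    # Report the shorter way around the helix.
    return min(rotation, 360.0 - rotation)


for label, shift in (("s47 + 4", 4), ("s47 - 5", -5)):
    print(f"{label}: about {phase_offset_degrees(shift):.0f} degrees out of phase")
```

Under this assumption both changes place the downstream region on a markedly different face of the helix (roughly 137 and 171 degrees, respectively), whereas a shift of a full turn would leave the phase nearly unchanged. The calculation says nothing about which proteins are involved; it only quantifies the geometric change.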
The interval 25-47 spans a DNA sequence (AGACGTGT, residues 33-42; see Fig. 4A) which is conserved in all Drosophila LINEs (consensus RGACGTGY; see Fig. 6). The transcriptional pattern of the construct s47b indicates that this sequence has a key role in Doc transcription (Fig. 4, A and B). The reduced template activity of both s47 + 4 and s47 − 5 may be due to the novel position of the RGACGTGY motif, which is located in all Drosophila LINEs at the same distance from the RNA start site(s) (see Fig. 6). However, the situation is more complex, since we found that base changes introduced either to the left (residues 28-32, construct s47a) or to the right of the AGACGTGT sequence (residues 43-47, construct s47c) also impaired promoter function (Fig. 4, A and B). Results thus indicate that three adjacent downstream sequences, herein called DE1, DE2, and DE3 (Fig. 4A), stimulate transcription of the su(f) S2 element.
Sequences spanning the RNA start sites also have a critical role in transcriptional promotion. The RNA start site regions of su(f) S2 (GATTCG, residues 6-11) and Doc6 (CACTCG, residues 1-6) fit the consensus (C/G)AYTCG (RNA start sites are underlined; see Fig. 3). The notion that the (C/G)AYTCG motif spans an initiator (Inr) module is supported by the analysis of the construct 47in1. This clone, in which residues 6-10 of su(f) S2 have been changed from GATTC to cgTga, directed CAT expression ~70-fold less efficiently than the parental construct s47 (Fig. 4A). We have also constructed a derivative of s47 (s47in2) in which residues 1-7 of su(f) S2 have been replaced by the heptamer TGCCTCT, which is the sequence found at the same relative position in Doc11 (see Fig. 6). Base changes introduced in s47in2 did not impair CAT expression (Fig. 4A). This result suggests that the template activity of Doc11 does not correlate with base changes in the Inr region, but possibly reflects the negative interference of flanking genomic DNA.
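The consensus sequences used in this section are written with standard IUPAC degenerate base codes (R = A or G, Y = C or T; the alternative (C/G) corresponds to the code S). The following sketch shows how such a consensus can be checked against the hexamers and octamer quoted in the text. The helper names are illustrative and not part of the original work.

```python
import re

# Standard IUPAC nucleotide codes, expressed as regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

INR_CONSENSUS = "SAYTCG"    # (C/G)AYTCG, with S standing for C or G
DE2_CONSENSUS = "RGACGTGY"  # RGACGTGY motif


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))


def fits_consensus(consensus, site):
    """Return True if `site` matches `consensus` over its whole length."""
    return consensus_to_regex(consensus).fullmatch(site.upper()) is not None


# Start-site hexamers quoted in the text.
print(fits_consensus(INR_CONSENSUS, "GATTCG"))  # su(f)S2, residues 6-11
print(fits_consensus(INR_CONSENSUS, "CACTCG"))  # Doc6, residues 1-6
# Residues 6-11 of 47in1 after the GATTC -> cgTga change (residue 11 unchanged).
print(fits_consensus(INR_CONSENSUS, "cgTgaG"))
# The su(f)S2 DE2 sequence.
print(fits_consensus(DE2_CONSENSUS, "AGACGTGT"))
```

The two wild-type hexamers and the DE2 octamer fit their consensus, whereas the mutated 47in1 hexamer does not, in line with the loss of activity reported for that construct. Matching a consensus is of course only a sequence-level statement and does not by itself show that a site is used as an Inr.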
The organization of promoter sequences is similar, on the whole, in su(f) S2 and Doc6 (Fig. 5). The construct Doc6-21, which carries the interval 1-21 of Doc6, directed CAT expression ~50-fold less efficiently than Doc6-42, a construct in which Doc6 DNA extended 3′ to residue 42 (Fig. 5). Faithful RNA initiation was observed with Doc6-42, but not with Doc6-21 (data not shown). By mutating residues 17 and 22 of Doc6, a SalI site was also introduced into Doc6-42. SalI sites in s47 and Doc6-42 are at the same distance from the RNA start site region (Fig. 5). Similarly to what was observed with the s47 + 4 construct (Fig. 4), the insertion of 4 bp in the Doc6 promoter region significantly reduced CAT expression (Fig. 5, construct Doc6-42 + 4) and abolished faithful RNA initiation (data not shown). The DNA region dislodged in Doc6-42 + 4 corresponds to the DE1-DE3 array. Interestingly, while DE2 (with a single base pair change) and DE3 are conserved in Doc6 and su(f) S2, the DE1 region varies (Figs. 5 and 6). Comparison of the template activities of Doc6-42, s47, and two chimeric clones (Doc6/s and s/Doc6) in which regulatory sequences of Doc6 and su(f) S2 have been exchanged, suggests that the two DE1-DE3 arrays are functionally equivalent (Fig. 5).
FIG. 3. Primer extension analysis of Doc-CAT RNAs. Total RNA (40 μg) from S2 cells transfected with 11A, suA, or 6A was hybridized to a ³²P-5′-end-labeled 30-mer (CAT primer) complementary to the CAT gene sense strand. Annealed primer moieties were extended, in the presence of deoxynucleoside triphosphates, by avian reverse transcriptase. Reaction products were run on a 6% acrylamide, 8 M urea gel, along with sequencing ladders of the 11A, suA, and 6A templates obtained by the dideoxy chain termination method using the CAT primer. Predominant RNA start sites, marked by arrows, are shown along with the DNA sequence of the 5′ end of each element at the bottom. Lowercase letters denote flanking DNA; target sites setting the 5′ boundaries of su(f) S2 and Doc6 are underlined. Residue 2 in the su(f) S2 element is A and not G as previously reported (Driver et al., 1989).

FIG. 4.
A, effect of base and space changes within the su(f) S2 promoter. The sequence of the interval 1-47 from su(f) S2 flanking the CAT gene in s47 is shown at the top. Lowercase letters denote residues mutated to create a SalI site. Sequence identities are denoted by dashes in the clones listed below. Residues deleted in s47 − 5 are denoted by asterisks. The site of insertion of extra nucleotides in s47 + 4 is marked by an arrow. The RNA start site region and the RGACGTGY motif are boxed. Sites of RNA initiation at residues 6 and 7 and the SalI site between residues 22 and 27 are underlined. The regions denoted in the text as DE1, DE2, and DE3 are indicated. Relative CAT activities and S.D. were calculated as described in the legend to Fig. 1. B, transcription initiation in different su(f) S2 derivatives. Total RNA (30 μg) from S2 cells transfected with the constructs listed below was analyzed by primer extension as described in the legend to Fig. 3. Lanes: 1, s26; 2, s47; 3, s47 + 4; 4, s47; 5, s47a; 6, s47b; 7, s47c. Sequencing ladders of either s26 (lane R, G + A reactions; lane Y, C + T reactions) or s47 (lanes G, A, T, and C) were obtained by the dideoxy chain termination method with the same 30-mer (CAT primer) used for RNA extension. Taking into account the length of the vector DNA segment separating Doc from CAT sequences in the construct s26, bands in lane 1 corresponding to faithfully initiated transcripts should be 7-8 nucleotides shorter than the doublet detected in lane 2.
Doc Elements Lack an Antisense Promoter-In addition to a sense promoter (Fin) transcribing toward the 3′ end, the 5′ end region of the D. melanogaster F element also hosts an antisense promoter (Fout) located ~100 bp downstream from the Fin RNA start site (Minchiotti and Di Nocera, 1991; Contursi et al., 1993). The relatedness of Doc and F elements (see O'Hare et al. (1991)) prompted the search for an antisense promoter in the 5′ end region of Doc. suN-inv, a construct in which the DNA region cloned in suN (Fig. 1) is reversed with respect to the CAT gene, directed CAT expression ~60 times less efficiently than F-cat2, a construct in which the CAT gene is flanked 5′ by the 267/1 region from the F element (see Minchiotti and Di Nocera, 1991). We were unable to map, by primer extension and RNase protection experiments, specific sites of initiation of CAT transcripts within Doc DNA (data not shown). Deleting DNA from either end of the suN-inv insert did not stimulate CAT expression (data not shown).
From these results we conclude that the 5′ end region of Doc elements lacks an antisense promoter. CAT activity directed by the construct suN-inv likely results from the translation of heterogeneous transcripts initiated within vector sequences.

DISCUSSION
In many pol II genes, transcriptional signals are located downstream from the CAP site (Ayer and Dynan, 1988; Soeller et al., 1988; Perkins et al., 1988; Thummel, 1989; Nakatani et al., 1990; Hariharan et al., 1991; Fridell and Searles, 1992). LINEs represent an interesting system for studying the nature and organization of intragenic promoters, because their expression is regulated by DNA elements that are largely internal to the transcriptional unit. Data presented in this work extend, in several respects, the picture emerging from analyses carried out with other Drosophila LINEs.
In spite of significant sequence divergence, the 5′ end regions of su(f) S2 and Doc6, two "full-length" members of the Drosophila Doc LINE family, directed CAT expression at comparable levels in S2 cells (Fig. 1). Transcription is initiated in the two elements within similar DNA tracts fitting the consensus (C/G)AYTCG (Fig. 3). A related motif, CATTCG, is found at the 5′ end of Drosophila jockey elements (see Fig. 6), and its deletion abolished jockey-dependent expression in S2 cells. All of these hexamers share an A at the RNA start site flanked 3′ by pyrimidines, a sequence which is the core motif in many Inrs (Weis and Reinberg, 1992; Javahery et al., 1994). The notion that the (C/G)AYTCG motif overlaps an Inr module is supported by the analysis of the site-directed mutant 47in1 (Fig. 4A).
Doc11 is similar to su(f) S2 at the 5′ end but lacks residues 1-7 (Fig. 6). The inability of Doc11 sequences to drive faithful and efficient transcription (Figs. 1 and 3) is not due to changes in the RNA start site region (see construct 47in2 in Fig. 4A), and possibly reflects the negative action of flanking genomic DNA.
Three copies of the sequence (C/G)ATTCG, and one copy of the sequence CATTCC, flank 3′ the RNA start site regions of Doc6 and su(f) S2, respectively (Fig. 6). None of these repeats primes transcription (Fig. 3). Selectivity is not dictated by base content, since the sequence GATTCG is found both in su(f) S2 and Doc6, but transcription is initiated within the su(f) S2 hexamer only. We favor the hypothesis that a functional hierarchy is imposed by the nature of the trans-acting factors interacting with Doc promoters. According to this view, sequence redundancy may facilitate the binding of a protein to the Doc 5′ end. In transcriptionally competent complexes, however, such a protein would be able to recognize only one target as Inr because of stereospecific interactions with one or more factors bound to downstream promoter elements. The fidelity and the efficiency of transcription, both in su(f) S2 and Doc6, are controlled by a DNA region located ~20 bp downstream from the RNA start sites (Figs. 4 and 5). Sequences found at nearly the same place within the 5′ UTR of HIV-1 enhance the activity of the HIV-1 Inr in a distance-independent fashion (Zenzie-Gregory et al., 1993). In contrast, the distance between the Inr and 3′ flanking sequences is critical in the Drosophila mdg1 (Arkhipova and Ilyin, 1991) and the adenovirus IVa2 (Chen et al., 1994) promoters. Space changes are similarly not tolerated in the Doc promoter (Figs. 4 and 5). Although it cannot be formally excluded that negatively acting sequences had been inserted (or created) both in s47 + 4 and Doc6-42 + 4, and crucial ones had been deleted in s47 − 5, we believe that transcription in all of these constructs is inhibited by loss of protein-protein interactions due to the misalignment of promoter modules. Site-directed mutations introduced in the su(f) S2 promoter revealed that multiple DNA elements located downstream from the Inr stimulate transcription.
The key element is a DNA sequence (RGACGTGY motif or DE2) which is conserved in all Drosophila LINEs at a fixed distance from the RNA start site(s) (Fig. 6). Similarly to what has been reported for jockey (Mizrokhi and Mazo, 1990) and F (Contursi et al., 1993) elements, DE2 is absolutely required for Doc transcription (Fig. 4). To a lesser but significant extent, base changes introduced within the adjacent DE1 and DE3 regions also reduced CAT expression (Fig. 4). This finding is novel and suggests that the functional organization of Drosophila LINE promoters may be more complex than predicted from previous transient transfection assays. Sequences similar to DE1 or DE3 are found neither in other LINEs (Fig. 6) nor in Antennapedia and engrailed, two Drosophila genes in which transcription is also regulated by RGACGTGY motifs located in the 5′ UTR (Soeller et al., 1988; Perkins et al., 1988). Since the analysis of Doc6 and su(f) S2 chimeric constructs established that DE1-DE3 arrays are functionally equivalent (Fig. 5), the peculiar substitution of DE1 sequences in Doc6 with a copy of the RNA start site region (Fig. 6) leads us to speculate that the same protein may bind both to the Inr and the DE1 region. This hypothesis is supported by knowledge that some Inrs are recognized by transcription factors with multiple binding specificities (Seto et al., 1991; Du et al., 1993). On the other hand, it cannot be ruled out that two distinct proteins interact with the Inr and DE1 in the su(f) S2 promoter.
S2 and Doc6 in which SalI sites have been created are highlighted. References (in parentheses) are as follows: Doc6, su(f) S2 and Doc11 (this work; Driver et al., 1989), F (Minchiotti and Di Nocera, 1991), I factor, and jockey. Residues flanking the element Doc11 are in lowercase letters.
Sequences analogous to DE1 and DE3 are plausibly present in other LINEs. In the I factor, an Inr-like element spanning residues 1-4 is flanked 3′ by a RGACGTGY motif spanning residues 29-36 (Fig. 6). In contrast with data reported in this work, deletion of the regions corresponding to DE2 and DE3 in Doc reduced transcription of the I factor only 2-fold (McLean et al., 1993). This finding, and the inability of the I factor region 1-20 to direct faithful transcription in S2 cells,² favor the notion that in this LINE sequences crucial for transcription are located in the region corresponding to DE1 in Doc. The relative contribution of downstream cis-acting elements to transcriptional promotion may thus vary among Drosophila LINEs, and plausibly rely, because of the heterogeneity of DE1 and DE3 regions, on the recruitment of distinct trans-acting factors.
Significant similarities between the 5′ UTRs of Doc and other Drosophila LINEs are restricted to matches in the promoter region. Therefore it is not surprising that Doc, although closely related to the F element at the gene products level (O'Hare et al., 1991), lacks an antisense promoter.
Selective accumulation of LINE transcripts in specific developmental stages or cell types has been described (Chaboissier et al., 1990; Lachaume et al., 1992; Martin and Branciforte, 1993; Minchiotti et al., 1994). Less is known, however, about the synthesis of LINE proteins (McMillan and Singer, 1993; Branciforte and Martin, 1994; Trelogan and Martin, 1995). Data reported in this study suggest that the expression of Doc elements may be regulated at the gene products level. Sequences found at the boundary between the 5′ UTR and the open reading frame 1 region inhibited CAT expression in an orientation- and position-dependent manner (Fig. 2, A and B). Northern RNA blotting data (Fig. 2C) substantiated the hypothesis that inhibitory sequences act at a post-transcriptional level, and ruled out that reduced CAT production is due to enhanced degradation of CAT mRNA. We cannot exclude that ATG triplets present within δ and ε regions inhibit CAT expression by impairing translation of the CAT mRNA. This hypothesis is, however, weakened by results obtained with the Doc subregion (Fig. 2B). We favor the hypothesis that, by interacting with a cellular component, transcripts carrying δ (or ε) sequences are somehow compartmentalized, thereby reducing their translation. The effect of inhibitory sequences was more pronounced in transcripts driven by the RSV promoter (Fig. 2B). This may reflect differences in the folding of inhibitory sequences, and consequently in the ability to interact with a cellular component, within distinct RNAs.
A DNA region exhibiting ~70% homology to the ε-δ interval is found in the F element between residues 184 and 243. A 3′ deletion derivative lacking this region (Fin3′ + 175) is transcribed ~10-fold less efficiently than F-cat1, a construct in which F DNA extended 3′ to residue 267 (Minchiotti and Di Nocera, 1991). This result is correlated to the removal of sequences located between 193 and 207 which stimulate F basal transcription (Contursi et al., 1993). However, Fin3′ + 175 directs CAT expression only 2-fold less efficiently than F-cat1.³ This observation indirectly suggests that inhibitory sequences may be conserved in F. Germ line transformation experiments will eventually clarify whether this type of post-transcriptional regulation operates in vivo and if the inhibition is enhanced (or relieved) in specific cell types.
At the moment, genetic conditions which trigger transposition are known only for the Drosophila LINE I factor (Bucheton, 1990). In this respect, the isolation of a fly stock in which members of the Doc family transpose at a high rate (Pasyukova and Nazhdin, 1993) provides the basis for analyses aimed at the characterization of the mechanisms that control the expression and mobilization of the Doc retrotransposon in the organism.