Characterization of an unusual Rho factor from the high G + C gram-positive bacterium Micrococcus luteus.

A transcription termination factor (Rho) was purified from the Gram-positive bacterium Micrococcus luteus, and the complete gene sequence was determined. The M. luteus Rho polypeptide has 690 residues, which is 271 residues more than its homolog from Escherichia coli. Most of the additional residues compose a highly charged, hydrophilic segment that is inserted in a nonconserved region between two conserved regions of the RNA-binding domain of the known Rho homolog proteins. This segment extends from residues 49 to 311 and includes a stretch of 238 residues that contain no hydrophobic side chains. Biochemical studies indicate that the M. luteus protein is very similar to E. coli Rho in terms of its RNA-dependent NTPase activity and its sensitivity to the Rho-specific inhibitor bicyclomycin. However, the M. luteus protein has a less stringent RNA cofactor specificity. It also acts to terminate RNA transcription with E. coli RNA polymerase on the λ cro DNA template, but at much earlier termination stop points than those recognized by E. coli Rho. Thus, the M. luteus protein functions as a true Rho factor, but with a different specificity than that of E. coli Rho. We propose that this altered specificity is consistent with its need to function on transcripts that have a high content of G + C residues.

The orderly expression of the genetic information in DNA segments into RNA molecules depends on the function of transcription terminators. In Escherichia coli, one mechanism of transcription termination is mediated in part by an essential protein factor called Rho (1). Rho factor from E. coli has been studied since its discovery nearly 25 years ago (2). The Rho monomer is a 47-kDa protein. However, Rho factor functions as a homohexamer (3) that can bind to a nascent transcript and mediate its release by actions on the transcription complex that are coupled to the hydrolysis of NTPs (1).
A recent phylogenetic study by Opperman and Richardson (4) comparing rho genes isolated from organisms from several of the major branches of bacteria suggests that Rho is ubiquitous throughout the bacterial domain. An unexpected result was discovered during the analysis of the rho gene from Micrococcus luteus, a Gram-positive soil bacterium that has an unusually high G + C DNA content (74%) (5). The M. luteus rho homolog was found to have an open reading frame encoding a protein that was homologous to E. coli Rho through a very long portion. However, the homology did not extend all the way through the RNA-binding domain toward the amino terminus of the protein. Because the region of homology starts in a segment that has an in-frame GTG codon preceded by a sequence that is a good match to a Shine-Dalgarno sequence, Opperman and Richardson proposed that translation began at that GTG codon to yield a 41,733-Da protein of 382 amino acids that is 52% identical (71% similar) to E. coli Rho. If this proposal were correct, the M. luteus Rho protein would be unusual in comparison with the other Rho homologs as it would lack a conserved part of its RNA-binding domain. Additionally, the protein would be ~30 amino acids smaller than any of the other predicted Rho factors that have been sequenced.
The data of Opperman and Richardson (4) were also consistent with an alternative hypothesis, namely that the M. luteus Rho polypeptide is much larger than the homologs from other organisms and includes a large region with a very unusual amino acid sequence. The DNA sequence determined in that work indicated that the open reading frame extended upstream for at least 160 amino acid residues. However, because the G + C content of that upstream region was ~78%, which is a value that is typical of intergenic spacer regions in M. luteus DNA (5), and because these upstream codons had a very unusual bias favoring Arg, Asp, Gln, and Gly residues and lacking hydrophobic residues, Opperman and Richardson argued that it was unlikely to be part of the coding region for the M. luteus Rho protein.
To resolve this issue, we purified Rho protein from M. luteus. Our studies show conclusively that the latter hypothesis is correct and demonstrate directly that an organism that is phylogenetically distinct from E. coli also has a factor that can cause the termination of RNA transcription.

* This work was supported by Grant AI 10142 from NIAID, Department of Health and Human Services and by Grant AI10142 from the National Institutes of Health (to J. P. R.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) L27277.
‡ To whom correspondence and reprint requests should be addressed. Tel.: 812-855-1520; Fax: 812-855-8300; E-mail: jrichard@bio.indiana.edu.

Materials-Restriction

ATPase Assay-ATPase activity was assayed colorimetrically as described by Lanzetta et al. (6). Typically, 50 ng of protein were mixed with 100 μl of assay solution (40 mM Tris-HCl, pH 7.7, 50 mM KCl, 10 mM MgCl2, 1 mM ATP, and 10 μg/ml poly(C)). After 10 min at 37°C, the release of Pi from ATP was detected by the addition of 800 μl of a mixture of 4.2% ammonium molybdate, 0.045% malachite green, and concentrated flame photometer diluent (Bacharach, Inc.) (12.5:37.5:1) that had been premixed and filtered through Whatman No. 1 filter paper. Color development was quenched after 1 min by the addition of 100 μl of 34% citric acid. After 30 min at 20°C, the absorbance at 660 nm was measured. One unit is defined as the amount that hydrolyzes 1 μmol of ATP/min. This assay was found to be quantitative from 1 to 15 nmol of Pi.
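The unit definition in the ATPase assay can be expressed as a small conversion, shown here only as a hedged sketch. It assumes that one unit corresponds to 1 μmol of ATP hydrolyzed per minute, and it takes the 10-min incubation and the 1 to 15 nmol quantitative range from the assay description. The function names `atpase_units` and `in_quantitative_range` are hypothetical and are not part of the original work.

```python
def atpase_units(pi_nmol: float, minutes: float = 10.0) -> float:
    """Convert released phosphate (nmol) into activity units.

    Assumption: 1 unit = 1 umol of ATP hydrolyzed per minute, and each
    ATP hydrolyzed releases one phosphate.
    """
    if minutes <= 0:
        raise ValueError("incubation time must be positive")
    pi_umol = pi_nmol / 1000.0  # nmol -> umol
    return pi_umol / minutes


def in_quantitative_range(pi_nmol: float, low: float = 1.0, high: float = 15.0) -> bool:
    """Return True if the phosphate amount lies in the stated 1-15 nmol range."""
    return low <= pi_nmol <= high


if __name__ == "__main__":
    released = 10.0  # hypothetical reading, in nmol of phosphate
    print(atpase_units(released), in_quantitative_range(released))
```

Dividing the result by the mass of protein in the assay would give a specific activity. Readings outside the quantitative range would have to be repeated with less or more protein before conversion.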
Protein Sequencing-10 μg of the intact protein were bound to a strip of polyvinylidene difluoride membrane by direct adsorption (7). The N terminus was sequenced by William Lane at the Harvard University MicroChem Laboratory. To obtain N-terminal sequence information of a stable tryptic fragment, 16.8 μg of Rho protein were digested in the presence of 105 nM trypsin for 3 min at 37°C in a total volume of 30 μl. Proteolysis was stopped by the addition of 10 μl of 4× sample loading buffer (252 mM Tris-HCl, pH 6.8, 8% SDS, 40% glycerol, 20% 2-mercaptoethanol, and 0.004% bromphenol blue) followed by immersion of the mixture in a boiling water bath for 2 min. The products were separated by electrophoresis on an 8% SDS-polyacrylamide gel (8)

Isolation of the Upstream Portion of the M. luteus rho Gene-Genomic DNA was prepared from M. luteus as described by Wilson (10). The method employs the detergent hexadecyltrimethylammonium bromide to remove proteins and polysaccharides. A typical yield from a saturated 100-ml culture was 0.7 g of DNA.
To identify a 2-3-kbp¹ fragment containing the sequence encoding the N-terminal region of the M. luteus rho gene, 10 μg of genomic DNA were digested with various combinations of restriction endonucleases, and the products were separated by agarose gel electrophoresis. Fragments containing the desired rho gene segment were identified by Southern hybridization (11) with a 0.28-kbp HindIII/SmaI fragment from the plasmid pMLRHOSK+ (12), which had been radiolabeled to a high specific activity with 32P (13).
It was determined that a BamHI/PstI double digest of M. luteus genomic DNA produced a ~2-kbp fragment that contained the desired portion of the M. luteus rho gene. To clone this DNA fragment, 50 μg of M. luteus DNA were digested with BamHI and PstI, and the fragments were size-selected on a 1% agarose gel and ligated into pBluescript II SK+ (Stratagene). Colonies of E. coli DH5αF′ transformed with these ligated plasmids were screened by nucleic acid hybridization (14) with the same HindIII/SmaI fragment used for Southern analysis.
DNA Sequencing-Both double-stranded and single-stranded DNA templates were utilized for sequencing reactions (15). Double-stranded templates were prepared as described in the Sequenase protocol manual (U. S. Biochemical Corp.). Single-stranded DNA was generated from pBluescript derivatives by infection with M13K07 helper phage (16). Reactions were carried out according to the manufacturer's instructions with the following modifications. 7-Deaza-dGTP was used in place of dGTP to reduce band compressions (17), and a dideoxy-dGTP/7-deaza-dGTP ratio of 1:15 and a dideoxy-CTP/dCTP ratio of 1:100 were used in the termination mixtures to enhance dG and dC signals.
RNA Transcription-A DNA fragment encoding the cro gene was prepared by amplification of the segment of the gene from residues −188 to +372 from pCBC1 DNA² with Vent DNA polymerase. This plasmid contains a mutation of C to G at position 6 of the cro gene. Transcription complex formation was carried out according to Burns and Richardson (18). After the addition of 5 μl of rifampicin (1 mg/ml), the unincorporated NTPs were removed by ultrafiltration (20 min, 500 × g) with a Microcon-100 (Amicon, Inc.). The retentate (10 μl) was diluted to 200 μl with transcription buffer containing 36 units of rRNasin (Promega). 10 μl of this diluted A24 complex solution were used per reaction. After the addition of Rho factor, all four NTPs (including CTP) were added to 200 μM, and the 20-μl reaction mixture was incubated at 37°C for 3 min. After the addition of 20 μl of 2× stop mixture (20 mM EDTA, 0.1% SDS, 0.5 mg/ml tRNA, 0.3 mg/ml proteinase K) and incubation at 37°C for 20 min, the RNA was collected by ethanol precipitation and resuspended in 6 μl of loading dye (10 mM EDTA, 0.001% bromphenol blue, 0.001% xylene cyanol in 98% formamide). The entire sample was loaded on a 6% polyacrylamide gel (20 × 40 cm) containing 50% urea (14), and the RNA transcripts were separated by electrophoresis for 2 h at 30 watts.

Purification of an RNA-dependent ATPase from M. luteus-From preliminary studies, we found that the ATPase activity in partially fractionated extracts of M. luteus cells was stimulated by the addition of RNA homopolymers and that poly(C) was especially effective. Since E. coli Rho is an RNA-dependent ATPase that is strongly activated by poly(C) (19), we made use of an assay for ATP hydrolysis with poly(C) present for purification of a putative Rho factor from M. luteus. The purification procedure, which is described in detail elsewhere (20), involved chromatography of the crude extract on a Bio-Rex 70 cation-exchange column (Bio-Rad), concentration of the pooled fractions containing poly(C)-dependent ATPase with a Centriprep-100 ultrafiltrator (Amicon, Inc.), and chromatography on heparin-Sepharose CL-6B resin (Pharmacia Biotech Inc.).
Analysis of the final fraction by electrophoresis on a 10% SDS-polyacrylamide gel (8) revealed that it consisted of a single polypeptide in >95% purity with an apparent Mr of 95,000 (Fig. 1), which is approximately twice as large as the E. coli Rho polypeptide and also significantly larger than the M. luteus rho gene product (Mr = 41,733) proposed by Opperman and Richardson (4).
A sample of this highly purified M. luteus ATPase, analyzed by the Microsequencing Facility at Harvard University, yielded an N-terminal amino acid sequence of TESTE, which is different from the sequence of MAGIL at the N terminus of the proposed M. luteus rho gene product (4). This sequence also did not match any other pentapeptide sequence in the segment of the M. luteus rho gene that had been sequenced prior to this work. However, a 42-kDa fragment generated from the 95-kDa protein by partial digestion with trypsin had the N-terminal sequence GRPGPEVDE, which did match a sequence located upstream of the previously proposed rho translational start site (12). Together, these results concerning the apparent size and

¹ The abbreviation used is: kbp, kilobase pair.

Location of the TESTE Sequence-The M. luteus rho gene identified in the previous study (12) was cloned as a 2.1-kbp SphI/SacI DNA fragment in the plasmid pMLRHOSK+. The remainder of the M. luteus rho gene was cloned as a 2-kbp BamHI/PstI DNA fragment into pBluescript II SK+ to create the plasmid pBN10. We intentionally chose a fragment that would overlap partially with the rho insert in pMLRHOSK+. The DNA sequence of this fragment confirmed that the target DNA had been successfully isolated and contained ~1200 base pairs of additional upstream M. luteus sequence. 111 base pairs (37 residues) upstream from the SphI site, we found a segment encoding the sought-after N-terminal pentapeptide sequence TESTE, which had been identified from microsequencing of the purified RNA-dependent ATPase. This result demonstrated that the RNA-dependent ATPase is indeed encoded by the gene identified by Opperman and Richardson (4). Although the open reading frame continues upstream for 108 base pairs from the codon of the first Thr residue, the next upstream codon is GTG, which is used as the start codon in ~50% of the M. luteus genes (5), and it is preceded by an excellent Shine-Dalgarno sequence (Fig. 2).
We thus conclude that translation of the M. luteus rho gene starts at the GTG codon at position 1 and that the initiating Met residue is removed post-translationally. The amino acid composition of the predicted protein from the completed gene sequence is in excellent agreement with the amino acid composition of the purified protein (Table I). The molecular mass of the M. luteus Rho protein is 74,957 Da; therefore, the protein runs anomalously (~95 kDa) in the Laemmli gel system (8) (Fig. 1).
Substrate Specificity-It has previously been demonstrated that the E. coli Rho protein is capable of catalyzing the hydrolytic conversion of any one of the four ribonucleoside triphosphates to the corresponding nucleoside diphosphate and Pi in the presence of an RNA cofactor such as poly(C) (21). This aspect of M. luteus Rho was investigated by assaying for Pi release with each of the four different NTPs (Table II) (Table III). Both proteins were dependent on the presence of an RNA for ATP hydrolysis, and neither hydrolyzed ATP in the presence of poly(dC). Of the RNA polymers tested, poly(C) was the most effective activator for both proteins. M. luteus Rho, however, had appreciable activity with poly(A) as well as relatively high activity with poly(U) and measurable activity with poly(I). This was significantly different from E. coli Rho. Thus, the spectrum of RNA molecules that can activate ATP hydrolysis is greater for M. luteus Rho than for E. coli Rho.
To test whether the lower activity that M. luteus Rho had with CTP was related to the RNA cofactor used, the rate of CTP hydrolysis was measured with poly(U) and was found to be ~30% of that with ATP (data not shown). Thus, the lower activity of M. luteus Rho with CTP was not a consequence of using poly(C) as an activator.
Bicyclomycin Inhibits the ATPase Activity of M. luteus Rho-Bicyclomycin is an antibiotic that has recently been shown to specifically inhibit E. coli Rho function (22). Mutants that exhibit bicyclomycin resistance contain mutations in the ATPase domain of the Rho protein (22). Because M. luteus Rho contains strong sequence homology in that conserved region, it was hypothesized that bicyclomycin would be an inhibitor of the M. luteus Rho protein. To investigate this, the standard ATPase assay was performed in the presence of increasing concentrations of bicyclomycin (Fig. 3). The results reveal that bicyclomycin is a potent inhibitor of M. luteus Rho. In the presence of 25 μM bicyclomycin, the lowest concentration tested, both the E. coli and M. luteus Rho proteins retain <30% of their poly(C)-dependent ATPase activity. Activity continues to decrease with increasing bicyclomycin concentrations and is nearly abolished at 200 μM.
M. luteus Rho Terminates Transcription-To determine whether the RNA-dependent ATPase from M. luteus is a transcription termination factor, the purified protein was assayed for its effect on transcription of a cro template with E. coli RNA polymerase. This template has the well characterized Rho-dependent terminator tR1 (23). The control experiments show that starting with isolated complexes, incubation for 3 or 6 min in the absence of factors yielded a 372-nucleotide RNA (Fig. 4, lanes 4 and 5), the readthrough transcript from the promoter (PR) to the end of the template. When E. coli Rho factor was present at 28 nM, a 3-min incubation yielded RNA molecules with ~290, ~312, and ~345 nucleotides arising from termination at subsites I, II, and III (23), respectively, as well as the 372-nucleotide readthrough transcript (Fig. 4, lane 6). The overall termination efficiency was ~50% for this condition. The addition of the E. coli termination cofactor NusG, which has been shown to cause Rho-dependent termination at sites upstream of tR1 in vitro (24), yielded the expected pattern (Fig. 4, lane 7). When M. luteus Rho was used at the same concentration (28 nM), a new set of discrete RNA molecules was formed with sizes in the range of 90-280 nucleotides (Fig. 4, lane 9), indicating that it caused termination at points well upstream from those used by E. coli Rho. The addition of E. coli NusG had only a small effect in enhancing the yield of smaller transcripts with M. luteus Rho (Fig. 4, lane 10). This effect of M. luteus Rho was not specific to tR1. Similar results were obtained with another E. coli Rho-dependent terminator, tiZ1, an intragenic terminator in the lacZ gene. M. luteus Rho terminated transcription at points earlier than E. coli Rho or E. coli Rho assayed in combination with NusG.² This ability of M. luteus Rho to give rise to smaller RNA molecules during transcription of the cro template was completely blocked when 200 μM bicyclomycin was present in the reaction mixture (Fig. 4, lane 12), showing that this inhibitor of the ATPase activity of M. luteus Rho also inhibits its termination function.
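An overall termination efficiency, such as the value of about 50% quoted for E. coli Rho, is the fraction of transcripts that ended at a terminator rather than reaching the end of the template. The following is a minimal sketch of that calculation. It assumes that band amounts have already been converted to molar quantities (for uniformly labeled RNA, this means correcting each band for its label content), and the function name and example numbers are hypothetical rather than taken from the gel in Fig. 4.

```python
from typing import Iterable


def termination_efficiency(terminated: Iterable[float], readthrough: float) -> float:
    """Fraction of transcripts released at terminators.

    terminated  -- molar amounts of each terminated transcript band
    readthrough -- molar amount of the full-length (readthrough) transcript
    """
    terminated_total = sum(terminated)
    total = terminated_total + readthrough
    if total <= 0:
        raise ValueError("no transcripts to evaluate")
    return terminated_total / total


if __name__ == "__main__":
    # Hypothetical molar amounts for three terminated bands and the readthrough band.
    subsite_bands = [10.0, 20.0, 20.0]
    print(termination_efficiency(subsite_bands, readthrough=50.0))
```

With these made-up amounts, half of the transcripts are terminated. A lane in which almost no readthrough band remains would give a value close to 1.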
To show that the smaller transcripts were the result of M. luteus Rho action as a transcription termination factor rather than as a ribonuclease, transcripts synthesized in the absence of Rho factor were subsequently incubated with M. luteus Rho (Fig. 4, lane 11). Although a small amount of a 145-nucleotide RNA appeared, the fact that no other RNA molecules appeared that had the same sizes as the products made when M. luteus Rho was present cotranscriptionally rules out the possibility that they were generated by a ribonuclease activity. Because very few of the transcripts were extended to the size of the readthrough RNA when M. luteus Rho was present during transcription, the overall efficiency of termination within the transcribed fragment was nearly 100%. Two lines of evidence suggest that the 145-nucleotide RNA arose as a result of a contamination of M. luteus Rho with a ribonuclease. First, the extent of appearance of the 145-nucleotide RNA was higher with other, less pure preparations of the M. luteus factor (data not shown). Second, it also appeared when the function of M. luteus Rho was inhibited by bicyclomycin (Fig. 4, lanes 12 and 13).

Fig. 3. Bicyclomycin inhibits M. luteus Rho ATPase activity. ATP hydrolysis at 37°C was measured in standard Rho ATPase reaction mixtures containing 58 nM Rho (M. luteus (●) or E. coli (□)), 10 μg/ml poly(C), and bicyclomycin as indicated. Reactions were initiated by the addition of ATP (final concentration of 1 mM) to prewarmed solutions, and Pi release was detected colorimetrically. 100% activity is 11.5 units (M. luteus) and 11.5 units (E. coli).
A comparison of the distribution of transcripts in reaction mixtures lacking Rho that had been quenched at 2, 5, and 8 s after initiation (Fig. 4, lanes 1-3) with the distribution of the terminated transcripts (lanes 9 and 10) indicated that, as with E. coli Rho, the preferred positions for termination stop points were at the positions where RNA polymerase naturally pauses. However, with M. luteus Rho, the termination occurred at pause sites that were farther upstream than the pause sites that were used as the termination points by E. coli Rho.

DISCUSSION

We have isolated a transcription termination factor from M. luteus that is phylogenetically related to transcription termination factor Rho from E. coli. Although rho homologs have been identified from several different phylogenetic branches of bacteria (4, 25-28), this is the first demonstration that an organism that is distantly related to E. coli actually expresses its rho homolog gene. Although M. luteus Rho is similar to E. coli Rho in having a broad NTP substrate specificity, in its turnover number with poly(C) as a cofactor, and in its sensitivity to inhibition with bicyclomycin, it differs in having a less stringent RNA cofactor specificity and in its specificity of termination during transcription of a coliphage gene with E. coli RNA polymerase. We have also found that M. luteus Rho differs from E. coli Rho in containing an extended insertion of very unusual sequence and likely structure within its RNA-binding domain.
M. luteus belongs to the phylogenetic branch called the high G + C Gram-positive group. The G + C content of its DNA is ~74% (5). In contrast, the G + C content of E. coli DNA is only 50%. In its function, the Rho factor of E. coli acts by binding to the nascent transcript at regions of the RNA called rut (rho utilization site). Although rut sequences lack a consensus (1), they do have certain specific, defining characteristics; they have little base-paired secondary structure (1) and usually have a compositional bias that is high in C residues and low in G residues (29). Because of their high G + C content, the RNA molecules in M. luteus are likely to have more extensive base pairing than the RNA molecules in E. coli. Thus, M. luteus Rho has likely been adapted to use a rut site that has more extensive base pairing than is typical for a rut site in E. coli. Evidence in support of this hypothesis is our finding that M. luteus Rho caused termination of transcription at a site located well before the rut site used by E. coli Rho on the cro gene template. The RNA encoded by the upstream region of cro forms extended base-paired secondary structures (30), thus making it unavailable as a rut site for E. coli Rho. This interpretation is supported by the finding that E. coli Rho will cause termination at upstream sites when transcription is performed with ITP in place of GTP because the resulting inosine-substituted RNA has less stable base-paired secondary structure than the normal cro transcript (31). M. luteus Rho, in contrast, was able to use these segments in the first 100 nucleotides of a normal, guanosine-containing cro transcript as its rut site to cause termination.
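The G + C comparison that underlies this argument is a simple base count. A minimal sketch is given below, assuming a plain nucleotide string as input. The function name `gc_fraction` and the example sequences are illustrative only and are not taken from the M. luteus or E. coli genomes.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C among the A, C, G, T, and U bases in seq.

    Characters other than A, C, G, T, and U (gaps, N, whitespace) are ignored.
    """
    bases = [b for b in seq.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, T, or U")
    gc_count = sum(1 for b in bases if b in "GC")
    return gc_count / len(bases)


if __name__ == "__main__":
    # Made-up fragments chosen to contrast a G + C-rich and a balanced composition.
    print(gc_fraction("GCGCGCAU"))
    print(gc_fraction("ATGCATGC"))
```

The same function applied to a sliding window along a transcript would show where G + C-rich, and therefore potentially more base-paired, segments lie. That extension is not shown here.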
An exceptionally unusual feature of M. luteus Rho is the amino acid composition of the insert in the RNA-binding domain between the two phylogenetically conserved sequence segments that are found in the RNA-binding domain of all the known Rho sequences. In M. luteus Rho, this insert is between Ile48 and Gly312 (Fig. 5). In Rho factors from most organisms, these two phylogenetically conserved landmark residues are usually separated by 14 amino acids with very little phylogenetic conservation. With its insert, M. luteus Rho has 263 residues instead of 14 in this putative loop region. The first part of the insertion sequence is rich in Ala residues, while the C-terminal part is rich in Arg, Asp, Gly, and Asn residues. Also, in a stretch of 238 residues, there are no amino acids with a hydrophobic side chain (excluding Pro and Ala residues). Since patterns of polar and nonpolar residues are important in the formation of ordered β-stranded and α-helical secondary structures (32) and since hydrophobic residues have a major role in the formation of ordered tertiary structures for globular domains (33), we predict that this very hydrophilic segment of the protein will be randomly coiled, lacking a defined secondary structure. Indeed, when the insert sequence was analyzed for secondary structure (PHDsec Secondary Structure Prediction Program, EMBL, Heidelberg, Germany) (34-36), ~80% was predicted to exist as a loop. However, this segment has approximately an equal number of positively and negatively charged residues and might form an unprecedented, ordered structure consisting of many salt bridges.
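A stretch free of hydrophobic side chains, like the 238-residue stretch described above, can be located by scanning a one-letter protein sequence for the longest run that avoids a chosen set of residues. The sketch below is illustrative only. The hydrophobic set (Val, Ile, Leu, Met, Phe, Trp, Tyr) is an assumption that follows the text in not counting Pro and Ala as hydrophobic, and the test sequence is a made-up toy string, not the M. luteus Rho sequence.

```python
from typing import FrozenSet, Tuple

# Assumed hydrophobic set; Pro and Ala are deliberately left out, as in the text.
HYDROPHOBIC: FrozenSet[str] = frozenset("VILMFWY")


def longest_nonhydrophobic_run(
    seq: str, hydrophobic: FrozenSet[str] = HYDROPHOBIC
) -> Tuple[int, int]:
    """Return (start, length) of the longest run with no hydrophobic residue.

    start is 1-based. For an empty sequence, or one made only of hydrophobic
    residues, the result is (1, 0).
    """
    best_start, best_len = 0, 0
    run_start = None
    for i, residue in enumerate(seq.upper()):
        if residue in hydrophobic:
            run_start = None  # a hydrophobic residue ends the current run
            continue
        if run_start is None:
            run_start = i
        run_len = i - run_start + 1
        if run_len > best_len:
            best_start, best_len = run_start, run_len
    return best_start + 1, best_len


if __name__ == "__main__":
    toy = "MKRDEGAPLRD"  # hypothetical sequence for illustration
    print(longest_nonhydrophobic_run(toy))
```

For the toy string, the longest qualifying run is KRDEGAP, which starts at residue 2 and spans 7 residues. The first of several equally long runs is the one reported.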
The sequences of two other rho genes from this same group of organisms have recently become available: the genes from Streptomyces lividans³ and Mycobacterium leprae (GenBank™ accession number U15186). The open reading frames of these genes predict Rho proteins with 706 and 610 residues, respectively. With both, the major part of the additional residues over the ~420 that are typical of Rho homologs from other phylogenetic groups start after Ile72 in S. lividans Rho and after Ile73 in M. leprae Rho and end before Gly336 and Gly236, respectively. Thus, S. lividans Rho has 263 and M. leprae Rho has 162 residues between these landmark residues. Like M. luteus Rho, S. lividans Rho has a major part that is very rich in Arg, Asp, Gly, and Glu residues, but is different in having many Gln residues instead of many Asn residues. The M. leprae sequence is also rich in polar residues. Like the M. luteus Rho insert, these sequences are very deficient in hydrophobic residues. These observations suggest that the presence of a polar, random-coiled structure insert is a conserved feature of the Rho proteins in these organisms that have a very high G + C content. However, in spite of the similar features, the three known Rho RNA-binding domain insertion sequences did not reveal any obvious phylogenetic relatedness. It will be of great interest to learn how the presence of a structurally unordered subdomain can help these Rho factors contend with their nascent transcripts to cause termination.
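The residue counts between the landmark Ile and Gly positions follow from simple arithmetic: the number of residues strictly between positions i and j is j - i - 1. The short sketch below reproduces the three counts from the landmark positions given in the text. The dictionary and function names are hypothetical conveniences, and the typical spacing of 14 residues is the value quoted for Rho factors from most organisms.

```python
# Landmark positions (Ile, Gly) as quoted in the text for each Rho homolog.
LANDMARKS = {
    "M. luteus": (48, 312),
    "S. lividans": (72, 336),
    "M. leprae": (73, 236),
}
TYPICAL_SPACING = 14  # residues between the landmarks in most Rho factors


def residues_between(ile_pos: int, gly_pos: int) -> int:
    """Count the residues lying strictly between the two landmark positions."""
    if gly_pos <= ile_pos:
        raise ValueError("Gly landmark must follow the Ile landmark")
    return gly_pos - ile_pos - 1


def extra_residues(ile_pos: int, gly_pos: int, typical: int = TYPICAL_SPACING) -> int:
    """Residues in excess of the typical landmark spacing."""
    return residues_between(ile_pos, gly_pos) - typical


spacings = {org: residues_between(*pos) for org, pos in LANDMARKS.items()}

if __name__ == "__main__":
    for organism, count in spacings.items():
        print(organism, count)
```

The computed spacings are 263, 263, and 162 residues for M. luteus, S. lividans, and M. leprae, respectively, in agreement with the counts stated in the text.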
M. luteus Rho also contains another smaller insertion sequence that runs from Lys364 to Gln375 (Fig. 5; see Ref. 5). It is between two phylogenetically conserved residues in the RNA-binding domain corresponding to Glu108 and Arg109 in E. coli Rho. The S. lividans and M. leprae Rho homologs have insertions of three and six amino acids in that position, respectively. Like the large upstream insertion sequence, these lack amino acids with hydrophobic side chains.