Identification of a short highly conserved amino acid sequence as the functional region required for posttranscriptional autoregulation of the cystathionine gamma-synthase gene in Arabidopsis.

Cystathionine gamma-synthase (CGS) catalyzes the first committed step of Met biosynthesis in plants. We have previously shown that expression of the gene for CGS is feedback-regulated at the level of mRNA stability, and that the amino acid sequence encoded by the first exon of the CGS gene itself is responsible for the regulation (Chiba, Y., Ishikawa, M., Kijima, F., Tyson, R. H., Kim, J., Yamamoto, A., Nambara, E., Leustek, T., Wallsgrove, R. M., and Naito, S. (1999) Science 286, 1371-1374). To identify the functional region within CGS exon 1, deletion analysis was performed. The results showed that the 41-amino acid region of exon 1 highly conserved among plants is necessary and sufficient for the regulation. Analyses of in vivo and in vitro generated mutations that abolish the regulation identified the functionally important amino acids as 11-13 residues within this conserved region. The importance of these residues was confirmed by deletion analysis within the conserved region. These studies identified the functional region of CGS exon 1 required for the posttranscriptional autoregulation of the CGS gene as (A)RRNCSNIGVAQ(I), with uncertainty of the first and last residues. This sequence is almost perfectly conserved among CGS sequences of higher plants but cannot be found elsewhere in the public databases.

In higher plants, the first committed step of Met biosynthesis is represented by the condensation of O-phosphohomoserine and cysteine to form cystathionine (1). This reaction is catalyzed by cystathionine γ-synthase (CGS; EC 4.2.99.9) and has been proposed to be the key regulatory step in the biosynthesis pathway (2-4).
Studies with Arabidopsis mto1 mutants, which overaccumulate soluble Met (5), have shown that expression of the CGS gene is feedback-regulated at the level of mRNA stability in response to Met application and that the mto1 mutations impair this regulation (6). Five independently isolated mto1 mutations were found to carry single-base changes within the coding region of the first exon of CGS, each altering the amino acid sequence. The mutated amino acids were clustered in a small region that is highly conserved among four plant species (6). Analysis of chimeric constructs carrying the coding region of CGS exon 1 fused in frame to a reporter gene under control of the cauliflower mosaic virus (CaMV) 35 S RNA promoter in both transient expression experiments and transgenic plants suggested that the exon 1 region is responsible for the posttranscriptional regulation of CGS gene expression (6,7). Synonymous changes introduced into the amino acid codons altered by the mto1 mutations did not impair the regulation, indicating that it is the amino acid sequence and not the nucleotide sequence that has a role in the regulation (6). Cotransfection experiments and genetic crosses of transgenic plants suggested that CGS exon 1 acts in cis. Based on these observations, we have proposed a model in which the regulation occurs during translation when the nascent polypeptide of CGS exon 1 is in close proximity to its own mRNA (6,7).
Several cis-acting nascent polypeptides involved in control of gene expression have been reported (for review, see Refs. 8 and 9). In most cases, a cis-acting polypeptide is encoded by an upstream open reading frame and controls translation of the downstream gene. The regulation observed in the CGS gene system, in which the cis-acting polypeptide is encoded within the main coding sequence and affects the stability of its own mRNA, is unusual, and only a few similar cases have been reported (10,11).
We report here the identification of the functional region within CGS exon 1 required for posttranscriptional regulation of CGS gene expression, providing further insight into this mechanism.

EXPERIMENTAL PROCEDURES
Plant Materials and mto1 Mutant Isolation-Arabidopsis thaliana (L.) Heynh. ecotype Columbia (Col-0) was used as the wild-type strain. The mto1-1 to mto1-5 mutants have been described previously (5,6,12) (Table I). New alleles of mto1 mutants were isolated as described (5) based on a phenotype of resistance to ethionine, a toxic analog of Met (13). Genetic mapping and sequence analysis were carried out as described (6). The mto1-6 and mto1-7 mutations thus identified were used for further study after being backcrossed three times to wild-type plants. Plant growth conditions have been described (14,15).
* This work was supported by Grants-in-Aid for Scientific Research from the Ministry of Education, Culture, Sports, Science and Technology of Japan 12138201 and 13440233 and by the "Research for the Future" program from the Japanese Society for the Promotion of Science (JSPS) Grant JSPS-RFTF97L00601. The costs of publication of this article were defrayed in part by the payment of page charges.
Characterization of mto1 Mutants-Analysis of ethionine resistance and measurement of soluble Met content were carried out as described (5,12). Extraction of total RNA from rosette leaves and Northern blot analysis were performed as described (7).
The dL1 and dR1 constructs carry small deletions within the conserved region of CGS exon 1. For the construction of dL1, the region covering amino acids 1-60 was amplified using the primers Ex1P1 (forward) and L1n (reverse; 5'-AATCTGCCGGCCCTGGAGGATAA-3'), and the region covering amino acids 69-183 was amplified using the primers L1c (forward; 5'-CTCCTGCCGGCGTCCGTCAGCTG-3') and Ex1P2 (reverse). Both the L1n and L1c primers carry a NaeI restriction site (GCCGGC), and the dL1 deletion was generated by ligating the two PCR products at the NaeI site prior to insertion into pTF33 as described above. Due to the NaeI site insertion, an Ala-Gly sequence (GCCGGC) was inserted at the deletion point. The construction of dR1 was similar to that of dL1, except that the regions covering amino acids 1-76 and 88-183 were amplified using the primers Ex1P1 (forward) and R1n (reverse; 5'-TTTCTGCCGGCTTTAATGCTCAG-3'), and R1c (forward; 5'-GTGTTGCCGGCATCGTGGCGGCT-3') and Ex1P2 (reverse), respectively. In the case of dR1, the first 3 nucleotides of the NaeI site replaced the Ala-76 codon, and thus only a glycine codon (GGC) was inserted at the deletion point.
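The primer design described above can be checked with a short script. The following is a minimal sketch in Python that locates the NaeI recognition sequence (GCCGGC, as quoted in the text) in the four primers. The primer sequences are copied from the text; the function name `find_sites` and the dictionary layout are illustrative assumptions and are not part of the original procedure.

```python
# Primer sequences as given in the text (written 5' to 3').
PRIMERS = {
    "L1n": "AATCTGCCGGCCCTGGAGGATAA",
    "L1c": "CTCCTGCCGGCGTCCGTCAGCTG",
    "R1n": "TTTCTGCCGGCTTTAATGCTCAG",
    "R1c": "GTGTTGCCGGCATCGTGGCGGCT",
}

# NaeI recognition sequence as quoted in the text.
NAEI_SITE = "GCCGGC"


def find_sites(seq: str, site: str = NAEI_SITE) -> list[int]:
    """Return the 0-based start position of every occurrence of `site` in `seq`."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        # Continue one base further so that overlapping sites are also found.
        start = seq.find(site, start + 1)
    return positions


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

Each of the four primers contains a single GCCGGC site immediately after its first five bases. Read in frame, GCC-GGC encodes Ala-Gly, which matches the residues reported at the dL1 deletion point.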

FIG. 1. Ethionine resistance, soluble Met concentration, and CGS mRNA accumulation in mto1 mutant alleles.
A, wild-type (WT) and alleles of mto1 mutant seeds were sown on agar-solidified plates containing various concentrations of L-ethionine as indicated, and photographs were taken after 14 days of incubation. B, amino acids were extracted from rosette leaves 21 days after imbibition, and soluble Met concentration was determined. Soluble Met concentrations relative to that in wild type are also indicated. C, total RNA was extracted from rosette leaves at 21 days after imbibition, and CGS mRNA accumulation was determined by Northern hybridization using CGS cDNA (29) as a probe (upper panel). The membrane was rehybridized with a ubiquitin cDNA (30) probe (UBQ) as a loading control (lower panel).
Alanine substitution mutagenesis (17) and amino acid changes at the Ser-81 position (Table I) were carried out by the overlap extension PCR method (18,19). The flanking primers were Ex1P1 and Ex1P2. Sequences of the 62 mutagenic internal primers will be provided upon request. The mutant CGS exon 1 sequences were cloned into pHA0 between the XbaI and BamHI sites. pHA0 is a pUC19 vector that carries the CaMV 35 S RNA promoter:CGS exon 1:GUS:nopaline synthase terminator fragment from PMI4(WT) between the HindIII and EcoRI sites. In all cases, the integrity of PCR-amplified regions was confirmed by sequence analysis.
The control construct Δ5-183, which carries only the first four amino acids of CGS exon 1 fused in-frame to the GUS reporter gene and placed under the control of the CaMV 35 S RNA promoter, has been described (6). The 221-LUC+ plasmid carries a modified firefly luciferase gene directly under the CaMV 35 S RNA promoter (20) and was used as an internal standard for transient expression experiments.
Transient Expression Studies-Liquid callus cultures of Arabidopsis were prepared as described (21,22), and transfection of tester (GUS reporter) and control (luciferase reporter) plasmids by electroporation was carried out as described (6).

RESULTS
Identification and Characterization of mto1 Alleles-We have previously shown that the coding region of the first exon of CGS (183 amino acids) is sufficient for the regulation of its own mRNA stability (6,7). Although the amino acid sequences and lengths of the CGS exon 1 region are not well conserved, there is a region within exon 1 that is highly conserved among four plant species (6). In the present study, the region between amino acids 58 and 98 of CGS exon 1 is referred to as the conserved region. The observation that the three amino acids altered by previously identified mto1 mutations (mto1-1 to mto1-5) (Table I) were clustered in the conserved region (6) implies an important role for this conserved region in the posttranscriptional regulation of the CGS gene. To identify the functionally important region in CGS exon 1, we isolated additional mto1 alleles. Two new mutations were identified and designated as mto1-6 and mto1-7, respectively (Table I). These mutations are also located within the conserved region in close proximity to the previously identified mto1 mutations (6).
The seven mto1 alleles we have so far isolated were characterized on the basis of ethionine resistance, soluble Met accumulation, and CGS mRNA accumulation. As shown in Fig. 1A, growth of wild-type plants was severely inhibited by 30 μM L-ethionine (5), whereas growth of all seven mto1 mutants was not affected at this concentration of L-ethionine. The mto1-4 and mto1-6 mutants showed a weaker resistance to ethionine than the other alleles, and their growth was inhibited in the presence of 100-200 μM L-ethionine.
... Fig. 2B. Relative activities are indicated above the amino acid sequence of the conserved region, where those amino acids that are conserved among angiosperms are reversed and those with similar amino acids are shaded (see Fig. 5). The amino acid changes in mto1 mutant alleles, the region deleted in dL1 and dR1 constructs, and the MTO1 region as determined by the present study (see "Discussion") are also indicated. dL1 and dR1 carry Ala-Gly and Gly, respectively, at the deletion point. Averages ± S.D. of at least triplicate experiments are shown. Asterisks indicate significant difference (p < 0.05 by t test) from the full-length wild-type exon 1 construct (WT).
CGS mRNA accumulated to higher levels in all mto1 mutant alleles than in wild-type plants (Fig. 1C). The weak mto1-4 and mto1-6 alleles showed lower levels of CGS mRNA overaccumulation than the other alleles. These results demonstrate a positive correlation between CGS mRNA accumulation level, soluble Met accumulation level, and ethionine resistance. The pattern of CGS protein accumulation was also similar to that of CGS mRNA accumulation.
Although the mto1-3 and mto1-5 mutants were isolated independently, they carry the same mutation (6). Consistent with this, these two mutants exhibited similar phenotypes (Fig. 1; data not shown).
Deletion Analysis of the CGS Exon 1-To test whether the conserved region (amino acids 58-98) is necessary for the ability to down-regulate its own expression in response to Met application, N- and C-terminal deletions of CGS exon 1 were constructed and transient expression studies were carried out (Fig. 2). Expression of the deletion constructs that lacked the conserved region (CD1, CD2, ND7, and ND8; Fig. 2A) was not down-regulated in response to applied Met (Fig. 2B), indicating that the conserved region is necessary for the regulation. In contrast, deletion constructs that retained the conserved region (CD3, CD4, ND5, and ND6; Fig. 2A) did respond to applied Met. Although the level of response was weaker than that for the full-length CGS exon 1 construct (Fig. 2A, Exon 1), the positive response to applied Met was substantiated by the observation that introduction of the mto1-1 mutation into these constructs totally abolished the response (Fig. 2C).
To test whether the conserved region is sufficient for the regulation, constructs bearing both N- and C-terminal deletions (IR1, IR2, IR3, and IR4; Fig. 2A) were also generated and tested (Fig. 2B). The results showed that all of these constructs responded to applied Met even when the conserved region alone was used (IR3). Although again the response was weaker than that for the full-length exon 1 construct, introduction of the mto1-1 mutation totally abolished the response (Fig. 2C). The results indicate that the conserved region is essentially sufficient for the regulation.
Mutational Analyses of the Conserved Region in the CGS Exon 1-The full-length exon 1 constructs carrying mto1-6 and mto1-7 mutations were generated, and the response to applied Met was tested (Fig. 3). Constructs carrying mto1-1 to mto1-4 mutations that we have previously reported (6) were reanalyzed for comparison. As expected, both mto1-6 and mto1-7 mutations affected the response, as did the other mto1 alleles. In order to determine which amino acids of the conserved region in CGS exon 1 are important for the regulation, effects of alanine substitution mutations (17) were studied in the context of full-length exon 1. Of the 41 amino acids in the conserved region, 14 amino acids were individually changed to alanine, and their response to applied Met was tested (Fig. 3). As a result, four additional mutations were identified, namely R77A-1, N79A-1, I83A-1, and Q87A-1, whose changes impaired the response. The change of Ser-81 to alanine in S81A-1, however, did not impair the regulation. This was despite the fact that the regulation was abolished by alteration of the same amino acid residue to asparagine in the mto1-2 mutant. Due to this unexpected result, Ser-81 was changed to all other amino acids and tested further (Fig. 4). The results showed that amino acids with a small side chain, namely serine (wild type), glycine, alanine, and cysteine, were tolerated, whereas changes to those with a larger side chain affected the regulation.
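The substitution scheme used in these experiments can be written out programmatically. The following is a minimal sketch in Python, assuming the 11-residue sequence RRNCSNIGVAQ at positions 77-87 as given in the abstract. The function names `alanine_scan` and `saturate` are hypothetical, and the generated names omit the "-1" suffix used for the constructs in the text. The sketch only enumerates candidate substitutions; it does not reproduce which 14 residues of the conserved region were actually tested.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 protein amino acids, one-letter code

MTO1_SEQ = "RRNCSNIGVAQ"  # residues 77-87, taken from the abstract
MTO1_START = 77           # position of the first residue in CGS exon 1


def alanine_scan(seq: str, start: int) -> list[str]:
    """Name every single-residue change to alanine, e.g. 'R77A'.

    Residues that are already alanine are skipped, because replacing
    alanine with alanine is not a mutation.
    """
    names = []
    for offset, residue in enumerate(seq):
        if residue == "A":
            continue
        names.append(f"{residue}{start + offset}A")
    return names


def saturate(seq: str, start: int, position: int) -> list[str]:
    """Name the 19 possible substitutions at one position, e.g. 'S81N'."""
    wild_type = seq[position - start]
    return [f"{wild_type}{position}{aa}" for aa in AMINO_ACIDS if aa != wild_type]


if __name__ == "__main__":
    print(alanine_scan(MTO1_SEQ, MTO1_START))
    print(saturate(MTO1_SEQ, MTO1_START, 81))
```

Under these assumptions the scan yields 10 names (Ala-86 is skipped), and the saturation list for position 81 contains both S81A and the mto1-2 change S81N.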
Critical amino acids for the regulation identified by alanine substitution mutagenesis were clustered in a small region where all of the mto1 mutations are also located (amino acids 77-87) (Fig. 3). This implied that this 11-amino acid sequence within the conserved region is essential for the regulation of the CGS gene. To confirm this, additional deletion constructs within the conserved region were generated and tested. As expected, deletion of amino acids 77-87 in the dR1 construct (Fig. 2A) abolished the response to applied Met. In contrast, the dL1 construct (Fig. 2A), which lacks amino acids 61-68 in the N-terminal part of the conserved region, still showed a response (Fig. 2B). The positive response of the dL1 construct was evidenced by introduction of the mto1-1 mutation, which totally abolished the response (Fig. 2C).

DISCUSSION
The results presented in this study identified the region of CGS exon 1 that is critical for the posttranscriptional regulation of its own expression in response to applied Met. We have previously denoted the amino acid sequence defined by mto1 mutations as the MTO1 region (6). The results obtained in this study demonstrated that the MTO1 region spans from amino acid 77 to 87 (Arg-Arg-Asn-Cys-Ser-Asn-Ile-Gly-Val-Ala-Gln). Untested Ala-76 and Ile-88 might also be included in the MTO1 region. Changes of amino acids near the border of the MTO1 region (Arg-77, Ala-86, and Gln-87) had a weaker effect on the regulation compared with that in the central part of the MTO1 region (Fig. 3). Interestingly, the weak alleles of mto1-4 and mto1-6 are among those near the border.

FIG. 4. Effect of amino acid changes at the Ser-81 position in transient expression system.
Full-length exon 1 constructs carrying wild type (Ser), mto1-2 (Asn), and all other 18 protein amino acids as indicated were transfected to wild-type Arabidopsis protoplasts, and response of the GUS activity to applied Met was analyzed as in Fig. 2B. Averages ± S.D. of at least triplicate experiments are shown. Asterisks indicate significant difference (p < 0.05 by t test) from the full-length wild-type exon 1 construct (WT).

FIG. 5. Alignment of nucleotide and amino acid sequences of a CGS region from various plant species.
A, alignment of amino acid sequence corresponding to the Arabidopsis CGS exon 1 region. BLASTp searches (26) of the protein data base were carried out using the exon 1 region of Arabidopsis Col-0 ecotype (accession number AF039206) as the query. tBLASTn searches of the expressed sequence tag data base were also carried out using the exon 1 region of the Arabidopsis Col-0 ecotype and the corresponding region of Zea mays (accession number AF007785) as the query. Protein data base entries that were identified only by genome sequencing were not chosen. For those sequences with multiple expressed sequence tag entries, a consensus sequence was deduced by comparing the nucleotide alignment. The deduced amino acid sequences were aligned with the ClustalW program (31) (available on the World Wide Web at clustalw.genome.ad.jp/). Dashes indicate gaps in the alignment, and filled circles indicate sequences that are missing in the data base. Identical (asterisks) and similar (dots and colons) amino acids were marked both for the angiosperm sequences (ang) and for all of the sequences (all). Shaded and nonshaded sequences indicate species of the same family.
Detailed analysis of the Ser-81 position showed that those amino acids with a small side chain are tolerated. Other characteristics such as polar/nonpolar did not seem to affect the regulation (Fig. 4). The most likely interpretation of this result is that amino acids with a larger side chain inhibit the function of the MTO1 region by a structural hindrance at this position. The same rule, however, does not seem to apply to Gly-84, because the mto1-1 mutation, which changes this residue to serine, abolished the regulation.
Due to the fact that not all amino acids in the conserved region were tested by alanine substitution mutagenesis and that deletion analysis within the conserved region did not cover all residues, we cannot rule out the possibility that amino acids outside the MTO1 region are also involved in the regulation. No mto1 mutation, however, has been identified outside the MTO1 region from our screens. This is despite the fact that mutations were independently found three times at the same amino acid residue in the MTO1 region (Gly-84 for mto1-1, -3, and -5) (6). Ethyl methanesulfonate, the mutagen used for mto1 mutant isolation, is known to preferentially produce a guanine-to-adenine change (23,24). Of the 12 possible amino acid substitutions in the MTO1 region, we have identified six mto1 mutations. On the other hand, of the 20 possible amino acid substitutions outside the MTO1 region (but within the conserved region) none was identified, suggesting that the possibility of identifying additional mto1 mutations outside the MTO1 region is low.
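The counting argument above depends on which amino acid substitutions a guanine-to-adenine change can produce in a given coding sequence. The following is a minimal sketch in Python of such an enumeration using the standard genetic code. Several points are assumptions rather than details from the text: the function name `ems_missense` is hypothetical, C-to-T changes on the coding strand are included to represent G-to-A changes on the opposite strand, changes to stop codons are excluded, and the codons in the usage example are hypothetical because the CGS nucleotide sequence is not reproduced here.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumption: a G-to-A change on either DNA strand appears on the coding
# strand as G>A or C>T.
EMS_CHANGES = {"G": "A", "C": "T"}


def ems_missense(cds: str, first_residue: int = 1) -> list[str]:
    """List the amino acid substitutions reachable by one EMS-type base change.

    `cds` is an in-frame coding sequence; `first_residue` is the residue
    number of its first codon. Synonymous changes and changes that create
    a stop codon are not reported.
    """
    substitutions = []
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        wild_type = CODON_TABLE[codon]
        for pos, base in enumerate(codon):
            if base not in EMS_CHANGES:
                continue
            mutant_codon = codon[:pos] + EMS_CHANGES[base] + codon[pos + 1:]
            mutant = CODON_TABLE[mutant_codon]
            if mutant != wild_type and mutant != "*":
                substitutions.append(f"{wild_type}{first_residue + i // 3}{mutant}")
    return substitutions


if __name__ == "__main__":
    # Hypothetical codons chosen only for illustration.
    print(ems_missense("GGT", first_residue=84))
    print(ems_missense("AGT", first_residue=81))
```

Applied to the real coding sequence of the conserved region, an enumeration of this kind would give counts of possible substitutions inside and outside the MTO1 region that could be compared with the figures quoted above.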
Although the conserved region (amino acids 58-98) of CGS exon 1 was capable of responding to Met application on its own, deletions of other regions of exon 1 more or less weakened the response (Fig. 2). Changes in the amino acid sequence flanking the MTO1 region may affect activity of the MTO1 region, such as by affecting proper folding of the MTO1 polypeptide or by preventing access of the MTO1 region to the site of action.
The amino acid sequence in the MTO1 region is almost perfectly conserved among 19 plant species whose CGS sequences, either as cDNAs or expressed sequence tags, are currently available in the GenBank™ data base (25) (Fig. 5A). These species cover multicellular plants and include not only dicots and monocots but also a gymnosperm (loblolly pine) and a moss species (Physcomitrella patens). The only exception to the conservation of the MTO1 region is a valine-to-leucine substitution in one of the CGS isoforms in potato (Fig. 5A). On the other hand, tBLASTn searches (26,27) of the GenBank™ data bases did not identify any sequence that has the RRNCSNIG(V/L)AQ sequence other than those for plant CGS sequences.
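The motif RRNCSNIG(V/L)AQ can also be expressed as a simple pattern for scanning protein sequences. The following is a minimal sketch in Python using a regular expression. It is not a substitute for the tBLASTn searches described above, since it finds only exact matches in sequences that are already translated. The function name `find_mto1` and the test sequences are hypothetical.

```python
import re

# RRNCSNIG(V/L)AQ written as a regular expression; [VL] allows the
# valine-to-leucine variant reported for one potato isoform.
MTO1_PATTERN = re.compile(r"RRNCSNIG[VL]AQ")


def find_mto1(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched motif) for each exact match."""
    return [(m.start() + 1, m.group()) for m in MTO1_PATTERN.finditer(protein)]


if __name__ == "__main__":
    # Hypothetical toy sequences, not real database entries.
    print(find_mto1("MAARRNCSNIGVAQIV"))
    print(find_mto1("GGRRNCSNIGLAQ"))
    print(find_mto1("RRNCSNIGAAQ"))
```

Positions are reported 1-based to match the residue numbering convention used in the text.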
Expressed sequence tags of Chlamydomonas reinhardtii (accession numbers BF862446, BI874449, and BF866840) that were identified in the tBLASTn searches showed significant (expect value <10^-30) homology to the Arabidopsis CGS exon 2 plus 3 region. They also encode a 75-amino acid N-terminal extension sequence but did not contain the MTO1 sequence (data not shown), suggesting that the MTO1 sequence is unique to CGS of multicellular plants.
In contrast to the highly conserved nature of the amino acid sequences in the MTO1 region, nucleotide sequences encoding this region carry many synonymous changes (Fig. 5B). The present observation supports our previous conclusion that it is the amino acid sequence and not the nucleotide sequence that has a role in the regulation (6).
A Prosite data base search (28) (available on the World Wide Web at www.expasy.ch/tools/scanprosite/) of the MTO1 region identified only an asparagine N-glycosylation site (Asn-Cys-Ser-Asn). However, N-glycosylation does not seem to be relevant to the function of a nascent polypeptide. Although there is still much to be understood regarding regulation of the CGS gene in response to applied Met, the functional amino acid sequence identified in this study provides a critical key to the elucidation of this unique regulatory mechanism.