A genomic cluster containing four differentially regulated subtilisin-like processing protease genes is in tomato plants.

Screening of a genomic library from tomato plants (Lycopersicon esculentum) with a cDNA probe encoding a subtilisin-like protease (PR-P69) that is induced at the transcriptional level following pathogen attack (Tornero, P., Conejero, V., and Vera, P. (1996) Proc. Natl. Acad. Sci. U. S. A. 93, 6332-6337) resulted in the isolation of a cluster of genomic clones that comprise a tandem of four different subtilisin-like protease genes (P69A, P69B, P69C, and P69D). Sequence analyses and comparison of the encoded proteins revealed that all are closely related (79 to 88% identity), suggesting that all are derived from a common ancestral gene. mRNA expression analysis as well as studies of transgenic plants transformed with promoter-beta-glucuronidase fusions for each of these genes revealed that the four genes exhibit differential transcriptional regulation and expression patterns. P69A and P69D are expressed constitutively, but with different expression profiles during development, whereas the P69B and P69C genes show expression following infection with Pseudomonas syringae and are also up-regulated by salicylic acid. We propose that these four P69-like proteases, as members of a complex gene family of plant subtilisin-like proteases, may be involved in a number of specific proteolytic events that occur in the plant during development and/or pathogenesis.

Proteolysis is fundamental for the normal functioning of multicellular organisms and plays key roles in a variety of processes such as development, physiology, defense and stress responses, and adaptation to the changing environment. In plants, despite the importance of all these processes and involvement of different classes of proteinases (Refs. 1-5, and references therein), it still remains to be defined more precisely what components and molecular mechanisms are responsible for regulating specific aspects of protein degradation/processing. A major task for research will be to determine which pathway of proteolysis is responsible for the degradation of particular proteins.
The serine proteases are one of the best characterized groups of proteolytic enzymes in higher organisms. They can be grouped in six clans, of which one of the largest is the subtilisin-like clan (EC 3.4.21.14) that includes over 200 different members. Despite this wealth of knowledge, very little is known about subtilisin-like proteases in plants. Recently, we and others have shown the existence of members of this clan in plants, including Arabidopsis (6), tomato (7,8), melon (9), and Lilium plants (10). According to a recent classification (11), the subtilisin-like proteases from plants can be grouped within the Pyrolysin subfamily, which is highly related to the Kexin subfamily of proteases involved in the posttranslational processing of peptide hormones (12,13). Comparative molecular, biochemical, and cellular studies indicate that the subgroup of plant subtilisin-like enzymes is characterized by the presence of a large polypeptide sequence insertion preceding the reactive Ser residue and/or long C-terminal extensions relative to all other subtilisin-like proteases. Furthermore, they were found to be glycosylated and to be secreted to the plant extracellular matrix (ECM), where they accumulate and presumably exert their biochemical function(s) by recognizing and processing pericellular substrates (7-11).
We here describe the isolation and characterization of a genomic cluster comprised of four genes encoding different, but highly related, members of subtilisin-like proteases (named as P69A, P69B, P69C, and P69D). While the four clustered proteases exhibit a high degree of amino acid sequence identity, we show that they are differentially regulated at the transcriptional level, each showing a different expression pattern, either during normal plant development or following pathogenic attack.

EXPERIMENTAL PROCEDURES
Plant Material, Growth Conditions, Treatments-Lycopersicon esculentum cv. Rutgers and Arabidopsis thaliana (Col-0) plants were grown at 22°C in growth chambers programmed for a 14-h light and 10-h dark cycle. Fully expanded leaves or rosette leaves were sprayed with SA (0.5 mM) or buffer alone (50 mM phosphate buffer, pH 7.2), and samples were taken for analysis after 48 h. Suspensions of Pseudomonas syringae strains (0.09 O.D.) were infiltrated locally in one part of the leaf. Control (mock) plants were injected similarly with the solution containing no bacteria. Samples were analyzed 24-48 h post-inoculation.
Library Screening and DNA Sequence Analysis-A tomato genomic DNA library constructed in λ-EMBL3 was screened at 65°C as described (14) with the radiolabeled p26 cDNA encoding the prepro sequence of the PR-P69 protein described previously (7). The positive clones were isolated and characterized by the routine procedures described previously (14).
RT-PCR-cDNA synthesis, quantification of the products, and reverse transcriptase-mediated PCR were conducted as described (8). The oligonucleotide primer pairs (50 pmol each), a1 + a2 (ATGGGATTCTTGAAAATCCTT + TCAACAAAAGTGCAATTGGACTTC), b1 + b2 (ATGGGATTCTTGAAAATCCTT + CCTAGGCAGACACAACTGCAAT), c1 + c2 (ATGGGATTCTTGAAAATCCTT + TCATATCAATGTCCTCTCAAAGAG), and d1 + d2 (ATGGGATTCTTGAAAATT + TTATTCAGCAGACACTCTAACTGC), specific for the amplification of P69A, P69B, P69C, and P69D sequences, respectively, were used to amplify by PCR the in vitro synthesized single-stranded cDNA from the different mRNA sources in a Perkin-Elmer/Cetus DNA Cycler. PCR amplification was programmed as described before (8). The amplified DNA fragments were visualized in agarose gels, or alternatively, they were hybridized with a radiolabeled p26 cDNA probe. The inability of each combination of primers to amplify the closely related P69 sequences was confirmed in control PCR reactions that included 10 ng of plasmid DNA containing each of the four P69 ORFs as template.
* This work was supported in part by the Spanish Ministry of Science and Education. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™
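As a quick sanity check on the primer pairs listed above, the following minimal Python sketch tabulates primer length and GC content and compares the forward primers. The sequences are transcribed from the text with line-break hyphens and stray spaces removed; the helper names (`gc_fraction`, `reverse_complement`, `summarize`) are illustrative and not part of the original methods.

```python
# Primer pairs (forward, reverse) as given in the text, 5' to 3'.
PRIMERS = {
    "P69A": ("ATGGGATTCTTGAAAATCCTT", "TCAACAAAAGTGCAATTGGACTTC"),
    "P69B": ("ATGGGATTCTTGAAAATCCTT", "CCTAGGCAGACACAACTGCAAT"),
    "P69C": ("ATGGGATTCTTGAAAATCCTT", "TCATATCAATGTCCTCTCAAAGAG"),
    "P69D": ("ATGGGATTCTTGAAAATT", "TTATTCAGCAGACACTCTAACTGC"),
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def summarize(primers: dict) -> dict:
    """Per-gene primer lengths and GC fractions."""
    return {
        gene: {
            "fwd_len": len(fwd),
            "rev_len": len(rev),
            "fwd_gc": round(gc_fraction(fwd), 2),
            "rev_gc": round(gc_fraction(rev), 2),
        }
        for gene, (fwd, rev) in primers.items()
    }


if __name__ == "__main__":
    for gene, row in summarize(PRIMERS).items():
        print(gene, row)
    # Count how many distinct forward and reverse primers are in use.
    print("distinct forward primers:", len({fwd for fwd, _ in PRIMERS.values()}))
    print("distinct reverse primers:", len({rev for _, rev in PRIMERS.values()}))
```

As transcribed, the forward primers for P69A, P69B, and P69C are identical, so gene specificity in those pairs would rest on the reverse primers; this is an observation about the printed sequences, not a statement from the authors.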
Promoter-GUS Fusion, Plant Transformation, and Analysis of Transgenic Plants-Oligonucleotides GEN69a (5′-GCCCGGGGGCTTGCAAATGGTATAG-3′), GEN69b (5′-GCCCGGGGGCTAGCTAATACAACAAGTG-3′), GEN69c (5′-GCCCGGGGGCTGCAAATACAAGAAG-3′), and GEN69d (5′-GCCCCGGGTTGCTGGTATAGAGTAATTGG-3′) in combination with the T7 oligonucleotide served as primers for the incorporation of a synthetic SmaI restriction site in each promoter by site-directed mutagenesis (15). These primers introduced the SmaI site at position −1 relative to the translation initiation site in each gene. SmaI-BamHI fragments encompassing each of the P69 promoter regions were cloned upstream of the β-glucuronidase gene in pBI101.1 (16) to generate plasmids pP69A::GUS, pP69B::GUS, pP69C::GUS, and pP69D::GUS. The resulting transcriptional fusions were verified by nucleotide sequence analysis using specific primers. The constructs were introduced into Arabidopsis plants by Agrobacterium tumefaciens-mediated transformation. Transformants were selected on MS agar medium containing kanamycin, transferred to soil, and allowed to self. The transgenic lines were assayed for GUS activity by a fluorimetric assay or by an in situ assay using the chromogenic substrate X-Gluc (16).
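The mutagenic oligonucleotides above are each meant to carry a SmaI site. A minimal Python sketch, assuming the standard SmaI recognition sequence CCCGGG and using the primer sequences as transcribed from the text (hyphens and spaces removed), can confirm that the site is present in every primer. The names `SMAI_SITE`, `GEN69_PRIMERS`, and `find_site` are illustrative only.

```python
# SmaI recognition sequence (blunt cutter, CCC^GGG).
SMAI_SITE = "CCCGGG"

# Mutagenic primers as given in the text, 5' to 3'.
GEN69_PRIMERS = {
    "GEN69a": "GCCCGGGGGCTTGCAAATGGTATAG",
    "GEN69b": "GCCCGGGGGCTAGCTAATACAACAAGTG",
    "GEN69c": "GCCCGGGGGCTGCAAATACAAGAAG",
    "GEN69d": "GCCCCGGGTTGCTGGTATAGAGTAATTGG",
}


def find_site(seq: str, site: str = SMAI_SITE) -> int:
    """Return the 0-based index of the first occurrence of site, or -1."""
    return seq.upper().find(site)


# Position of the SmaI site within each primer.
SITE_POSITIONS = {name: find_site(seq) for name, seq in GEN69_PRIMERS.items()}

if __name__ == "__main__":
    for name, pos in SITE_POSITIONS.items():
        status = f"SmaI site at index {pos}" if pos >= 0 else "no SmaI site found"
        print(f"{name}: {status}")
```

This only checks the primers themselves; it says nothing about additional SmaI sites that might occur elsewhere in the promoter fragments.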

RESULTS
Characterization of a Genomic Cluster Containing Four Genes Encoding Highly Related P69 Proteases-A DNA fragment encoding the signal sequence and propeptide for the previously identified P69 protease was obtained from plasmid p26 (7) and used as a radiolabeled probe to screen a tomato genomic library constructed in λ-EMBL3, and different clones were isolated. After a third round of screening and purification, three clones (λ-5, λ-2, and λ-3′) were finally selected for restriction analysis and sequencing. These analyses revealed that the genomic DNA inserts of the three clones overlapped and together encompassed ~41 kb of genomic DNA. Alignment of the genomic sequences revealed the presence of a tandem of four transcription units that were highly similar (Fig. 1). The first one, from here on designated as P69A, was identical to the previously identified P69 cDNA contained in plasmid p26. The last one in the row, designated as P69B, was identical to the previously reported cDNA clone p9 (8). Between the P69A and P69B transcription units, two additional ones, designated P69C and P69D, were identified. The four genes were intronless. While the nucleotide sequence homology for the four open reading frames was quite high (in the range of 75 to 85% identical), the comparison of the 5′ promoter regions (preceding the ATG initiation codon) or the 3′ region after the polyadenylation signal of each gene revealed no homology between them. However, in all cases, putative TATA boxes and CAAT boxes shortly upstream of the ATG initiation codon were observed (data not shown). Computer-aided comparison of the amino acid sequences at the NH2 termini (Fig. 2), along with the hydropathy profiles (data not shown), indicated the existence of a preprosequence in all four P69 proteases identified within the cluster. The preprosequence comprises a hydrophobic signal peptide at the extreme N terminus which, according to von Heijne (17), is cleaved C-terminal of the conserved Ser-22 residue.
In all cases, the signal peptide is followed by a 92-amino acid prosequence, which is a typical feature of proteases of the subtilisin family, and its cleavage is mandatory for the generation of the active protease from the inactive zymogen (18). The putative N-terminal amino acid of the mature proteins is the conserved Thr-115, identified also by comparison with other plant subtilisin-like proteases (6-11). The predicted mature enzymes thus contain 631, 631, 552, and 632 amino acids for the P69A, P69B, P69C, and P69D isoforms, respectively. Within the four mature proteins, the amino acid residues Asp-146, His-203, and Ser-532 (or Ser-531 for P69B and P69C), corresponding to the catalytic site (catalytic triad) essential for the enzymatic activity of subtilisin-like proteases, were identified (Fig. 2). The four P69 proteases in the cluster also have an Asn residue (Asn-306, or Asn-305 for P69B and P69C) that has been found to be highly conserved in this position and that is catalytically important in the subtilisins (12,13). However, sequences close to this Asn-306 are highly variable within the four P69s (Fig. 2). In all cases there is also an insertion of a long sequence (226 amino acids) between the stabilizing Asn-306 and the reactive Ser-532, relative to all other subtilisin-like proteases (11), in which these two residues are separated by much shorter distances. This displacement has also been observed in the three other subtilisin-like proteinases recently identified from plants (6,9,10) and could represent a characteristic signature of the subtilisin enzymes from plants (Fig. 3).
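The residue numbering in this section follows from simple arithmetic on the domain lengths, which the short Python sketch below makes explicit. The signal peptide length, prosequence length, mature lengths, and residue positions are taken from the text; the precursor lengths it prints are derived here by addition and are not values reported by the authors, and all function and variable names are illustrative.

```python
SIGNAL_PEPTIDE_LEN = 22   # cleaved after Ser-22
PROSEQUENCE_LEN = 92      # prosequence following the signal peptide

# Predicted mature enzyme lengths (amino acids) as stated in the text.
MATURE_LEN = {"P69A": 631, "P69B": 631, "P69C": 552, "P69D": 632}

# Conserved Asn and reactive Ser positions as stated in the text.
ASN_SER = {
    "P69A": (306, 532),
    "P69B": (305, 531),
    "P69C": (305, 531),
    "P69D": (306, 532),
}


def first_mature_residue() -> int:
    """Position of the first residue of the mature protein."""
    return SIGNAL_PEPTIDE_LEN + PROSEQUENCE_LEN + 1


def precursor_length(gene: str) -> int:
    """Derived preproenzyme length: signal peptide + prosequence + mature."""
    return SIGNAL_PEPTIDE_LEN + PROSEQUENCE_LEN + MATURE_LEN[gene]


def asn_to_ser_distance(gene: str) -> int:
    """Difference in residue number between the conserved Asn and reactive Ser."""
    asn, ser = ASN_SER[gene]
    return ser - asn


if __name__ == "__main__":
    print("first mature residue:", first_mature_residue())
    for gene in MATURE_LEN:
        print(gene, "precursor:", precursor_length(gene),
              "Asn-to-Ser:", asn_to_ser_distance(gene))
```

The first mature residue comes out at position 115, matching Thr-115, and the Asn-to-Ser difference is 226 for all four isoforms, matching the length quoted for the insertion.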
Expression Analysis of P69 Genes-The differential expression pattern of the four P69 genes was initially determined by gene-specific RT-PCR reactions combined with Southern blot analyses. In vitro synthesized single-stranded cDNAs from mRNA samples of fully grown leaves from healthy and Pseudomonas syringae pv. syringae-infected tomato plants were assayed by PCR using sets of primers that were specific for each P69 gene member. Primer specificity was demonstrated in pilot experiments using each of the individually cloned genes as template in the PCR reaction (Fig. 4). Whereas P69A mRNAs accumulate to detectable levels in fully expanded leaves of healthy tomato plants, the P69D mRNAs accumulate in marginal amounts in the same leaf samples. Neither P69A nor P69D was found to be induced over basal levels in leaves that were infected with P. syringae. Conversely, similar RT-PCR analysis with primers specific for the P69B gene and the P69C gene indicated that while the corresponding mRNAs were nearly undetectable in leaves from healthy control plants, a dramatic accumulation of the transcripts occurred in P. syringae-infected leaves (Fig. 4).
Developmental and Tissue-specific Regulation of the Different P69 Promoters in Transgenic Plants-To investigate in detail the spatial pattern of expression of the four P69 genes, each of the 5′ flanking promoter regions was fused to the β-glucuronidase (GUS) reporter gene in plasmid pBI101.1 to generate constructs pP69A::GUS, pP69B::GUS, pP69C::GUS, and pP69D::GUS (Fig. 5). These constructs were introduced separately into Arabidopsis plants by transformation with A. tumefaciens, and a minimum of ten independent kanamycin-resistant transformants were generated for each construct. GUS activity was initially analyzed by a fluorimetric assay (16) in plants grown under normal growth conditions and also after inoculation with the avirulent bacterium P. syringae DC3000 (avrRpm1) (data not shown). These assays revealed constitutive expression of GUS activity in the transgenic plants generated with constructs P69A::GUS and P69D::GUS, whereas those harboring the P69B::GUS or the P69C::GUS cassette expressed GUS activity only upon bacterial infection (see below). These results were, to some extent, consistent with the RT-PCR studies shown above (Fig. 4).
To compare the distribution of GUS activity in planta, the initial transgenic plants generated for each construct were selfed, and the kanamycin-resistant progeny was analyzed in situ using the chromogenic substrate X-Gluc (Fig. 6). Expression of GUS activity driven by the P69A promoter was detected in the seedlings as well as in fully grown plants. As deduced from the tissue staining pattern, the P69A gene appears to be expressed in a general fashion in all organs of the plant, except in roots and in flower organs, where no GUS activity could be detected in any plant.
Likewise, transgenic plants in which GUS expression was driven by the P69D promoter revealed that this is active predominantly in expanding cotyledons and leaves (Fig. 6). This expression was transient because it disappeared once the leaves or cotyledons had enlarged and matured. Interestingly, the P69D gene is also observed to be expressed in inflorescences, and more particularly, in stigmas. We still do not know whether or not this expression pattern in flowers is also transient.
Conversely, no constitutive expression of the GUS gene driven by the P69B or P69C promoters could be detected in any of the transgenic plants generated (Fig. 6). However, in some of the P69C::GUS transgenic lines, we could detect GUS activity in discrete groups of cells (islands) that appeared sporadically, and with an unpredictable location, either in stems, roots, or leaves (not shown). Induced expression was also analyzed in rosette leaves from transgenic plants before and after inoculation with P. syringae pv. tomato (Pst) strain DC3000 carrying the avrRpm1 avirulence gene. Pst DC3000(avrRpm1) is recognized by the corresponding resistance gene in A. thaliana ecotype Col-0 and causes a macroscopic hypersensitive response (HR) at the inoculation site. These studies revealed that GUS expression driven by either the P69B or the P69C promoter was induced in the infected leaves. Induction occurred throughout the inoculated leaves and was not restricted to the site of inoculation where the HR became apparent (Fig. 7). Conversely, in transgenic plants carrying P69A::GUS or P69D::GUS constructs, GUS expression was not induced during the course of infection.
Because salicylic acid (SA) has been demonstrated to be a master regulatory molecule mediating most of the plant defense responses to challenging pathogens (19), we tested whether or not SA could act as inducer of any of these genes. Spraying of healthy transgenic Arabidopsis plants with a 0.5 mM solution of SA resulted in the induction of GUS activity only in plants harboring the P69B::GUS and P69C::GUS transgenes, and in a manner similar to that observed following bacterial inoculation (Fig. 7). These results further reinforce our consideration that the P69B and P69C gene pair is involved in the response of the plant to challenging pathogens, whereas the P69A and P69D pair is more related to development or to a basic biochemical function.

DISCUSSION
In this work, we provide structural and functional information on a genomic cluster comprising four different members of a family of plant genes, which on the basis of amino acid sequence conservation and structural organization are related to the subtilisin-like protease clan (EC 3.4.21.14) (11). The genomic cluster was identified by screening of a genomic DNA library from tomato plants with a partial cDNA probe for the previously identified pathogenesis-related PR-P69 protease (7) (here renamed as P69A). The predicted primary structure of the four P69 proteases, designated P69A, P69B, P69C, and P69D, indicates that all of them are synthesized as precursor proteins (preproenzymes) composed of three distinct domains: a 22-amino acid signal peptide, a 92-amino acid propolypeptide, and a mature polypeptide of variable length for each protease in the cluster. Within the mature polypeptides, the amino acid sequences surrounding Asp-146, His-203, and Ser-531 are the most salient features of all these proteases, which are identical to those of the catalytic sites (catalytic triad) of subtilisins (11).
Interesting also is the conservation of an insertion of a long sequence (226 amino acids) between the conserved Asn residue and the reactive Ser residues of the catalytic triad, relative to all other subtilisin-like proteases. The meaning of such a conserved displacement remains unknown, but its conservation suggests it may subserve important functions in regulating the properties of this subgroup of subtilases in plants.
Comparative studies of the mode of expression by RT-PCR and by analysis of transgenic plants harboring independent promoter-GUS fusions for the four P69 genes indicate that they are regulated differently. P69A is transcribed at all stages of plant growth and in all organs except roots and flowers where the expression is absent. Likewise, P69D is also transcribed under resting conditions in emerging leaves, but its expression is transient. This transient expression pattern, which has been observed also in transgenic tobacco plants (data not shown), is presumably associated with the elongation processes in these organs because transcription is repressed once the leaves cease elongation. Although P69D is not expressed in roots, transcription is specifically recovered in flowers, and in the stigma in particular. Neither P69A nor P69D gene expression is induced over basal levels during pathogenesis.
Conversely, although neither P69B nor P69C shows constitutive expression in any organ of plants grown under resting conditions, both are transcriptionally activated upon infection with avirulent bacteria, either in transgenic Arabidopsis plants or in tomato plants. This latter result is consistent with our previous observations that one member of this family, now identified as P69B on the basis of sequence identity, is coordinately induced with a set of pathogenesis-related (PR) proteins associated with the defense response in tomato plants (7,20,21).
Interestingly, the observation that the induced transcriptional activation of P69B and P69C is not restricted to the point of inoculation with the avirulent bacteria (Fig. 7), where the HR cell death and the induction of protective genes occur (22), but rather extends throughout the afflicted leaf blade suggests that their induction is mediated by a long distance signaling process. Because the benzoic acid derivative SA is one of the candidate signal molecules mediating long distance activation of plant defense reactions (19), we tested the ability of SA to induce expression of these two pathogen-inducible genes. The result presented in Fig. 7 demonstrates that local application of SA to leaves promotes the transcription of both genes de novo. Thus, these results are consistent with the idea that proteases, and in particular members of the subtilisin-like family, are components of the general plant response to attacking pathogens.
The differential expression profiles found for these four P69 genes indicate that different regulatory mechanisms have evolved to control the expression of these genes either during normal growth or under pathological situations. It remains to be demonstrated whether or not these differential expression patterns also imply different roles for the protein products. In this regard, it is tempting to speculate that the permanent expression of the P69A gene at the different stages of growth of the plant may suggest a housekeeping function for the P69A protein and that this function may be advantageously enhanced during processes of plant cell elongation and/or pathogenesis by supplementing with the other isoforms, which may serve backup functions to the principal protease. It has been demonstrated in animal systems, where natural substrates for subtilisin-like proteases have been identified (12,13), that when the different protease members of a family are removed from their biological context and assayed in vitro, many of these proteases are able to process the same substrates. This observation again raises the question of whether such a functional redundancy exists among the family members in vivo and how this might be regulated. One insight into defining functionality has been provided by examining the expression of the individual protease members (23). From such analysis, it has been shown that protease substrate specificity in vivo is influenced by restricting expression to particular tissues and also by compartmentalization of the individual enzymes to specific intracellular locations (24). Thus, a similar regulation may also apply to the differentially regulated P69 protease isoforms under consideration, as a likely explanation for delimiting redundant functions in vivo.
Alternatively, we cannot disregard the possibility that each member functions separately by recognizing different substrate(s) and thus implies that both function and gene expression patterns evolved coordinately, but separately, for each P69 protease.
Whatever the meaning of such complexity is, the diversification of either the regulation of gene expression or function for each member of the P69 clan would be in agreement with the polymorphism found for other unrelated gene families in higher plants, which arise from gene duplication events of a common ancestral gene (25,26).
So far, only two protein substrates have been identified for these plant subtilisin-like proteases. One is systemin, the traveling peptide hormone mediating signaling processes during wound response in plants (27); the other is LRP (28), an extracellular matrix-associated leucine-rich repeat (LRR) protein that is part of a family of proteins that mediate molecular recognition and/or protein interaction processes (29). Subtilisin-like enzymes have been shown to be secreted to the plant extracellular matrix (ECM) (30). Thus, the consideration that the P69s may mediate pericellular processing/degradation events may indicate a role, similar to those in some animal systems (31,32), in modulating the interaction of the plant cell surface with the extracellular environment.