The C-terminal sequence of LMADS1 is essential for the formation of homodimers for B function proteins.

LMADS1, a lily (Lilium longiflorum) AP3 orthologue, contains the complete consensus sequence of the paleoAP3 (YGSHDLRLA) and PI-derived (YEFRVQPSQPNLH) motifs in the C-terminal region of the protein. Interestingly, through yeast two-hybrid analysis, LMADS1 was found to be capable of forming homodimers. These results indicated that LMADS1 represents an ancestral form of the B function protein, which retains the ability to form homodimers in regulating petal and stamen development in lily. To explore the involvement of the conserved motifs in the C-terminal region of LMADS1 in forming homodimers, truncated forms of LMADS1 were generated, and their ability to form homodimers was analyzed using yeast two-hybrid and electrophoretic mobility shift assay. The ability of LMADS1 to form homodimers decreased once the C-terminal paleoAP3 motif was deleted. When both paleoAP3 and PI-derived motifs were deleted, the ability of LMADS1 to form homodimers was completely abolished. This result indicated that although the paleoAP3 motif promotes the formation of LMADS1 homodimers, the PI-derived motif is essential. Deletion analysis indicated that two amino acids, RV, of the 5 final amino acids, YEFRV, in the PI-derived motif are essential for the formation of homodimers. Further, point mutation analysis indicated that amino acid Val was absolutely necessary, whereas residue Arg played a less important role in the formation of homodimers. Furthermore, Arabidopsis AP3 was able to form homodimers once its C-terminal region was replaced by that of LMADS1. This result indicated that the C-terminal region of LMADS1 is responsible and essential for homodimer formation of the ancestral form of the B function protein.

MADS box genes have been thought to play central roles in flower development (1-4). The most representative MADS box genes in the B function group are AP3 and PI, which play major roles in specifying petal and stamen development in Arabidopsis (5-8). In Arabidopsis, ap3 and pi mutants have identical phenotypes, producing sepals in the second whorl and carpels in the third whorl of the flower (5,6). AP3 and PI orthologues from other plant species have been isolated and shown to have similar functions (8-24).
In addition to AP3 and PI, many MADS box genes showing sequences similar to B function genes have also been identified (8). Based on sequence diversity, B group genes were divided into one PI and three major AP3-like gene lineages (8). B group genes have been proposed to have arisen by two major duplication events from an ancestral gene possessing both paleoAP3 and PI motifs in the C-terminal end of the protein (4,8,25-27). The first duplication generated the paleoAP3 and PI lineages. The paleoAP3 lineage, composed of AP3 orthologues, has been identified in gymnosperms, lower eudicots, magnoliid dicots, and monocots (8,17). Genes in this lineage contain both conserved paleoAP3 and PI-derived motifs. Members of the PI lineage lost the paleoAP3 motif and are composed of PI orthologues in most plant species. The PI motif was, however, highly conserved throughout the PI lineage after duplication (8).
The second duplication, within the paleoAP3 lineage, generated two genes. These two genes underwent significant sequence changes in dicots and gave rise to the euAP3 and TM6 gene lineages during evolution (8,28,29). The paleoAP3 motif was retained in the TM6 lineage and replaced by a conserved euAP3 motif in the euAP3 lineage of most eudicot species (8,29). In contrast to the B function genes in the euAP3, paleoAP3, and PI lineages, which are known to be responsible for petal and stamen development, genes in the TM6 lineage have distinct expression patterns, and their actual function remains unclear (8,15,28).
Different from other MADS box proteins, which form homodimers in regulating flower development, in various plant species, proteins in the euAP3 and paleoAP3 lineages were stable and functional in the cell, regulating petal and stamen development only in heterodimer form with a protein in the PI lineage (9,11,20,30). For example, heterodimers were formed between AP3 and PI in Arabidopsis (29), DEFICIENS (DEF) and GLOBOSA (GLO) in Antirrhinum (9,11), and OsMADS16 and OsMADS4 in rice (20), respectively. This partnership was supported by the expression pattern for AP3 and PI orthologues. They are expressed at the same time and in the same areas, in petals and stamens, during flower development (11,31,32). Furthermore, expression of AP3 and PI orthologues is maintained by an autoregulatory circuit (6,7,33,34). AP3/PI or DEF/GLO heterodimers autoregulated their own expression by binding to the specific CArG sequence elements in their promoter regions. Therefore, it is interesting to explore the origin of this obligate nature of heterodimerization for AP3/PI during evolution. Because there was only one B gene before duplication, it is postulated that the ancestral B protein should function as a homodimer in regulating gene expression (17,24). The ability of B proteins to form homodimers was gradually lost and replaced by a preference to form heterodimers with other B proteins. This assumption was supported by a recent finding that the LMADS1 of the monocot lily (Lilium longiflorum) in paleoAP3 lineage was able to form homodimers in regulating flower development (24). An orchid (Oncidium Gower Ramsey) AP3-like protein OMADS3 is also able to strongly form homodimers (28). A similar result was observed for the lily (Lilium regale) PI orthologue and an ancestral B protein from the gymnosperm Gnetum gnemon, which have the ability to bind DNA as a homodimer (17).
In addition to forming homodimers, LMADS1 was also able to form heterodimers with Arabidopsis PI and produced dominant negative ap3-like mutations in transgenic Arabidopsis plants (24). The complete match of LMADS1 to the consensus sequences of the paleoAP3 and PI-derived motifs in the C-terminal region of the protein is an indication of its ancestry to the B function gene (24). Thus, LMADS1 has become an excellent candidate for studying the transition from homodimerization to heterodimerization of B function proteins during evolution. To explore the possible involvement of the conserved paleoAP3 and PI-derived motifs in the C-terminal region of LMADS1 in forming homodimers, various deletions and point mutations in the two motifs were generated in truncated forms of LMADS1, and the ability of each truncated LMADS1 to form homodimers was analyzed. The result indicated that these two motifs were absolutely essential for LMADS1 to form homodimers. This conclusion was further supported by the result that Arabidopsis AP3 acquired the ability to form homodimers once its C-terminal region was replaced by that of LMADS1.
Generation of Chimeric AP3 and PI cDNAs Containing C-terminal Portion of LMADS1. PCR was used to generate the chimeric cDNAs AP3-L1C and PI-L1C, in which the C-terminal region of AP3 or PI was replaced by that of LMADS1 (see Fig. 1B). For generation of AP3-L1C, two cDNAs (AP3-C and P3-L1C) were first produced by PCR. Primers P3-EI (5′-GAATTCATGCCTAACACCACAACGA-3′) and P3-C (5′-CGAGTTTTTGTTCTTTTTCTTGGTG-3′) were used in PCR with pGBKT7-AP3-M (24) as template to produce AP3-C, which encoded 97 amino acids of the intervening (I) and keratin-like (K) domains of the AP3 protein (see Fig. 1B). Primers P3-L1C (5′-GAAAAAGAACAAAAACTCGGAAGAAG-3′) and L1BI (5′-CAGGATCCGGGTTTCAAGCC-3′) were used in PCR with pGBKT7-LMADS1-M (24) as template to produce P3-L1C, which encoded 73 amino acids of the C-terminal (C) domain of the LMADS1 protein (see Fig. 1B). A second round of PCR using P3-EI and L1BI as primers and cDNAs AP3-C and P3-L1C as templates was performed to generate the chimeric cDNA AP3-L1C, which produced a protein fusing the N-terminal I and K domains of AP3 to the C-terminal domain of LMADS1 (see Fig. 1B).
A similar strategy was used to generate PI-L1C. For generation of PI-L1C, two cDNAs (PI-C and I-L1C) were first produced by PCR. Primers I-EI (5′-GAATTCCATATGGGTAGAGGAAAGATCGA-3′) and I-C (5′-CTTGTGTGCTTCTTCCGCCATCATCTT-3′) were used in PCR with pGBKT7-PI (24) as template to produce cDNA PI-C, which encoded 154 amino acids of the N-terminal portion (M, I, and K domains) of the PI protein. Primers I-L1C (5′-ATGATGGCGGAAGAAGCACACAAGAACT-3′) and L1BI (5′-CAGGATCCGGGTTTCAAGCC-3′) were used in PCR with pGBKT7-LMADS1-M (24) as template to produce cDNA I-L1C, which encoded 71 amino acids of the C-terminal domain of the LMADS1 protein. A second round of PCR using I-EI and L1BI as primers and cDNAs PI-C and I-L1C as templates was performed to generate the chimeric cDNA PI-L1C, which produced a protein fusing the N terminus of PI to the C-terminal domain of LMADS1.
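The two-round PCR strategy above relies on the inner primers sharing sequence across the fusion junction, so that the first-round products can anneal to each other in the second round. The following is a minimal sketch, not part of the original procedure, that checks this for the primer sequences given in the text (written as contiguous strings). The function names `reverse_complement` and `junction_overlap` are illustrative assumptions.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def junction_overlap(reverse_primer: str, forward_primer: str) -> str:
    """Longest suffix of the reverse primer's top-strand sequence that is
    also a prefix of the forward primer (empty string if none)."""
    top_strand = reverse_complement(reverse_primer)
    for size in range(min(len(top_strand), len(forward_primer)), 0, -1):
        if top_strand[-size:] == forward_primer[:size]:
            return top_strand[-size:]
    return ""


# Inner primers as listed in the text (5' to 3').
P3_C = "CGAGTTTTTGTTCTTTTTCTTGGTG"       # reverse primer for AP3-C
P3_L1C = "GAAAAAGAACAAAAACTCGGAAGAAG"    # forward primer for P3-L1C
I_C = "CTTGTGTGCTTCTTCCGCCATCATCTT"      # reverse primer for PI-C
I_L1C = "ATGATGGCGGAAGAAGCACACAAGAACT"   # forward primer for I-L1C

ap3_overlap = junction_overlap(P3_C, P3_L1C)
pi_overlap = junction_overlap(I_C, I_L1C)
print(len(ap3_overlap), ap3_overlap)
print(len(pi_overlap), pi_overlap)
```

For these primers the shared junction sequences are 19 and 24 nucleotides long, respectively, which is consistent with the first-round products serving jointly as templates in the second round.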
For generation of AP3-L1C cDNAs lacking the paleoAP3 or PI-derived motifs at the 3′ end, combinations of specific primers were used in PCR. The 5′ specific primer was P3-EI. The 3′ specific primer for deletion of the paleoAP3 motif (AP3-L1C-219) was L1-C1. The 3′ primer for deletion of both the paleoAP3 and PI-derived motifs (AP3-L1C-203) was L1-C2. Specific 5′ and 3′ primers contained an introduced EcoRI or BamHI recognition site to facilitate the cloning of cDNAs. PCR fragments were digested with the appropriate enzymes and ligated into either plasmid pGBKT7 (binding domain vector) or pGADT7 (activation domain vector) for yeast two-hybrid analysis.
Yeast Two-hybrid Analysis. Yeast two-hybrid analysis was performed using the MATCHMAKER yeast two-hybrid system 3 (Clontech). In this system, yeast strain Y187 was used for transformation, and lacZ was used as the reporter gene. Yeast transformation was performed by using the lithium acetate method (24,35). The transformants co-transformed with binding domain plasmid and activation domain plasmid were selected on selection medium lacking tryptophan and leucine (-Trp -Leu) according to the manufacturer's instructions. For the analysis of β-galactosidase activity, positive transformants grown on selection medium were further grown and suspended in Z buffer (100 mM NaPO4, 10 mM KCl, 1 mM MgSO4, 50 mM β-mercaptoethanol, pH 7.0) containing o-nitrophenyl-β-D-galactopyranoside (4 mg/ml in Z buffer) as a substrate. β-Galactosidase activity was calculated according to Miller (36).
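The text does not spell out the calculation, so the sketch below uses the commonly cited form of the Miller unit formula as an assumption: units = 1000 × (A420 - 1.75 × A550) / (t × v × A600), with t in minutes and v in milliliters of culture assayed. The helper `relative_activity`, which expresses a construct as a percentage of a reference such as full-length LMADS1, and all numeric inputs are hypothetical illustrations, not data from this study.

```python
def miller_units(a420: float, a550: float, a600: float,
                 minutes: float, volume_ml: float) -> float:
    """Beta-galactosidase activity in Miller units (commonly cited form).

    a420: absorbance of the o-nitrophenol product
    a550: light-scattering correction from cell debris
    a600: cell density of the culture that was assayed
    """
    if minutes <= 0 or volume_ml <= 0 or a600 <= 0:
        raise ValueError("time, volume and A600 must be positive")
    return 1000.0 * (a420 - 1.75 * a550) / (minutes * volume_ml * a600)


def relative_activity(sample_units: float, reference_units: float) -> float:
    """Express a sample's activity as a percentage of a reference construct."""
    if reference_units <= 0:
        raise ValueError("reference activity must be positive")
    return 100.0 * sample_units / reference_units


# Hypothetical readings for illustration only.
full_length = miller_units(a420=0.9, a550=0.0, a600=0.5, minutes=30, volume_ml=0.1)
truncated = miller_units(a420=0.675, a550=0.0, a600=0.5, minutes=30, volume_ml=0.1)
print(round(full_length, 1), round(relative_activity(truncated, full_length), 1))
```

Reporting each construct relative to the full-length protein is what allows statements such as "decreased to approximately 75%" to be compared across constructs.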
The proteins were purified with glutathione-Sephadex beads (Amersham Biosciences). For EMSA, different amounts of the fusion proteins were preincubated with 100 ng of pBSKS and 8 μg of bovine serum albumin at room temperature for 5 min in 20 mM HEPES, pH 7.9, 3 mM MgCl2, 1 mM EDTA, 10 mM β-mercaptoethanol, 0.1% Triton X-100, 10% glycerol in a total volume of 20 μl. The labeled oligonucleotides were then added, and the samples were further incubated for 20 min at room temperature. To separate the protein-DNA complexes, the reaction mixtures were loaded onto a running nondenaturing 4% polyacrylamide gel that had been prerun for 30 min at 4°C and 200 V in 0.5× Tris borate-EDTA buffer. Electrophoresis was further carried out at 4°C and 200 V for 2.5 h. The gels were dried and exposed on x-ray film at -80°C. For competition experiments, a 120-fold molar excess of the cold competitor DNAs was included in the binding reactions. For supershift analysis, 1 g of antisera was individually incubated with the GST fusion proteins at room temperature for 15 min prior to their use in the binding reactions.

RESULTS
paleoAP3 and PI-derived Motifs Are Required for the Formation of LMADS1 Homodimers. To explore the possible involvement of the paleoAP3 and PI-derived motifs in the C terminus of LMADS1 in the formation of homodimers, constructs containing cDNAs with deletions of the paleoAP3 motif (LMADS1-219) or both the paleoAP3 and PI-derived motifs (LMADS1-203) (Fig. 1A) were transformed into yeast followed by two-hybrid analysis. As shown in Figs. 1A and 2, the ability of LMADS1-219 to form homodimers decreased to approximately 75% of that for full-length LMADS1. This suggested a positive role for the paleoAP3 motif in the formation of homodimers in LMADS1. Interestingly, the ability of LMADS1 to form homodimers was completely abolished once both the paleoAP3 and PI-derived motifs were deleted, as seen in LMADS1-203 (Figs. 1A and 2). This result indicated that the PI-derived motif was essential for the formation of homodimers in LMADS1 once the paleoAP3 motif was absent. As controls, neither PI nor AP3 formed homodimers (Fig. 2).
To further confirm the result obtained from yeast two-hybrid analysis, EMSA, a technique established for the investigation of dimerization and DNA binding of MIKC-type MADS domain proteins (9,17,30), was employed. A stretch of DNA sequence (probe CArG1) from the Antirrhinum DEF promoter, including a CArG box (CC(A/T)6GG), which has been thought to be the region bound by the MADS proteins in regulating gene expression (34,37-39), was used to investigate the sequence-specific DNA binding of various forms of LMADS1 protein homodimers to this sequence.
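The CArG box consensus CC(A/T)6GG translates directly into a pattern search. The sketch below is an illustration only: the sequence of probe CArG1 is not given in the text, so the `example` string is a hypothetical stand-in and `find_carg_boxes` is an assumed helper name.

```python
import re

# CC, then exactly six A or T residues, then GG. The lookahead lets the
# search report overlapping occurrences as well.
CARG_BOX = re.compile(r"(?=(CC[AT]{6}GG))")


def find_carg_boxes(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched box) for every CArG box in a sequence."""
    sequence = sequence.upper()
    return [(m.start(), m.group(1)) for m in CARG_BOX.finditer(sequence)]


# Hypothetical sequence: one true CArG box and one near miss (CCAAAGG).
example = "GATTACCATATATGGTTAGCCAAAGGCT"
print(find_carg_boxes(example))
```

The near miss is rejected because only three A/T residues separate its CC and GG, which is the kind of specificity a cold-competitor control is meant to probe experimentally.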
As shown in Fig. 3, LMADS1 proteins bound efficiently to probe CArG1. The binding of LMADS1 proteins to CArG1 was confirmed by the supershift assay, in which the signal of protein-DNA complexes was shifted (from the position of arrow 1 to that of arrow 2 in Fig. 3) once the GST antisera were added. When the binding of CArG1 to LMADS1-219E (with paleoAP3 motif deletion) was analyzed, the signal of protein-DNA complexes was also observed and seen to shift in the supershift assay (Fig. 3). The ability of LMADS1-219E homodimers to bind CArG1 was clearly weaker than that for full-length LMADS1 (Fig. 3). This indicated a positive role for the paleoAP3 motif in the formation of homodimers in LMADS1, as was seen in yeast two-hybrid analysis (Figs. 1A and 2). Interestingly, the ability of LMADS1-203E (with both paleoAP3 and PI-derived motif deletions) proteins to bind to CArG1 was completely abolished, as seen for LMADS1-203 in yeast two-hybrid analysis (Figs. 1A and 2). This result confirmed that the PI-derived motif was essential for the formation of LMADS1 homodimers once the paleoAP3 motif was absent.
Amino Acids in the PI-derived Motif Play Different Roles in the Formation of LMADS1 Homodimers. To further explore the role of the PI-derived motif in the formation of LMADS1 homodimers, constructs containing cDNAs with a series of deletions of amino acids in the PI-derived motif (LMADS1-214 to LMADS1-206) (Fig. 1A) were transformed into yeast followed by two-hybrid analysis. As shown in Fig. 4, the ability of LMADS1-214 (deletion of 2 amino acids, LH, in the PI-derived motif) to form homodimers decreased to approximately 40% of that for full-length LMADS1 (Fig. 1A). This suggested a positive role for amino acids LH in the PI-derived motif in the formation of LMADS1 homodimers (Fig. 1A). Interestingly, when 2 more amino acids (PN) were deleted in the C terminus of the PI-derived motif (LMADS1-212), the ability to form homodimers increased to approximately 2-fold that for LMADS1-214 and was similar to that for LMADS1-219, which contained the entire PI-derived motif (Figs. 1A and 4). The ability to form homodimers for LMADS1-210 (6 amino acids, SQPNLH, in the PI-derived motif deleted) was even slightly higher than that observed in LMADS1-212 (Figs. 1A and 4). This suggested a possibly negative role for amino acids SQPN in the PI-derived motif during formation of LMADS1 homodimers (Fig. 1A). When 8 amino acids (QPSQPNLH) were deleted in the C terminus of the PI-derived motif (LMADS1-208), the ability to form homodimers decreased to approximately 40% of that for full-length LMADS1 and was similar to that for LMADS1-214 (Figs. 1A and 4). This suggested a positive role for amino acids QP in the PI-derived motif during formation of LMADS1 homodimers (Fig. 1A). The ability of LMADS1 to form homodimers was completely abolished once 10 amino acids (RVQPSQPNLH) in the PI-derived motif were deleted in LMADS1-206 (leaving only the three amino acids YEF) and was similar to that for LMADS1-203 (Figs. 1A and 4).
This result indicated that amino acids RV in the PI-derived motif are particularly essential for the formation of homodimers for LMADS1 (Fig. 1A).
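The deletion series is easier to follow when each construct name is mapped onto the residues it retains. The sketch below assumes that the number in a construct name is the last residue kept and that the PI-derived motif starts at position 204. Both assumptions are inferred from the text's placement of Arg at 207 and Val at 208 and are not stated explicitly by the authors. The function name `truncation` is illustrative.

```python
PI_DERIVED_MOTIF = "YEFRVQPSQPNLH"
MOTIF_START = 204  # assumed position of the first residue (Y); inferred from R207/V208


def truncation(last_residue: int) -> tuple[str, str]:
    """Return (retained, deleted) motif residues for a construct that
    ends at the given residue number."""
    kept = last_residue - MOTIF_START + 1
    kept = max(0, min(len(PI_DERIVED_MOTIF), kept))
    return PI_DERIVED_MOTIF[:kept], PI_DERIVED_MOTIF[kept:]


for end in (214, 212, 210, 208, 206, 203):
    retained, deleted = truncation(end)
    print(f"LMADS1-{end}: retains {retained or '-'}, deletes {deleted}")
```

Under these assumptions the deleted segments reproduce the ones listed in the text (LH, PNLH, SQPNLH, QPSQPNLH, RVQPSQPNLH), and LMADS1-208 is the shortest construct that still ends in RV.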
To further examine the role of amino acids RV in the formation of LMADS1-208 homodimers, these two residues were substituted through site-specific mutagenesis. When the arginine residue at position 207 was substituted by alanine in LMADS1-R207A, the ability to form homodimers decreased to approximately 50% of that for LMADS1-208 (Figs. 1A and 4). The ability of LMADS1-V208A or LMADS1-V208R to form homodimers was completely abolished once the valine at position 208 was substituted by either alanine or arginine (Figs. 1A and 4). Similarly, the ability to form homodimers was completely abolished once both amino acids, arginine at position 207 and valine at position 208, were substituted by alanine in LMADS1-RV208AA (Figs. 1A and 4). This result clearly indicated that the arginine residue promotes homodimer formation, whereas the valine residue is indispensable for the PI-derived motif to participate in the formation of LMADS1 homodimers.
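The substitution names used above follow the usual wild-type residue, position, new residue pattern. As a minimal sketch, the helper below (an assumed name, `apply_substitution`) applies such a code to the five PI-derived residues left in LMADS1-208 and checks that the stated wild-type residue is really at that position. Numbering the first residue as 204 is an inference from the R207/V208 positions in the text.

```python
import re

MOTIF_IN_208 = "YEFRV"  # PI-derived residues remaining in LMADS1-208
MOTIF_START = 204       # assumed position of Y; inferred from R207/V208


def apply_substitution(motif: str, start: int, code: str) -> str:
    """Apply a substitution such as 'R207A' to a motif numbered from start."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"unrecognized substitution code: {code}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    index = position - start
    if not 0 <= index < len(motif):
        raise ValueError(f"position {position} lies outside the motif")
    if motif[index] != wild_type:
        raise ValueError(f"expected {wild_type} at {position}, found {motif[index]}")
    return motif[:index] + replacement + motif[index + 1:]


for code in ("R207A", "V208A", "V208R"):
    print(code, apply_substitution(MOTIF_IN_208, MOTIF_START, code))

# Double mutant: both substitutions applied in turn.
double = apply_substitution(
    apply_substitution(MOTIF_IN_208, MOTIF_START, "R207A"), MOTIF_START, "V208A")
print("R207A+V208A", double)
```

The wild-type check is the useful part of the design: a code such as V207A would be rejected, which guards against off-by-one numbering errors when naming constructs.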
AP3 Is Able to Form Homodimers Once the C-terminal Region Is Replaced by That of LMADS1. To provide further evidence to support the involvement of the paleoAP3 and PI-derived motifs in the C terminus of LMADS1 in the formation of homodimers, constructs containing chimeric cDNAs (AP3-L1C or PI-L1C) generated by fusion of the N-terminal portion (I and K) of AP3 or PI and the C-terminal portion of LMADS1 (Fig. 1B) were transformed into yeast followed by two-hybrid analysis. As shown in Figs. 1A and 5, the ability of AP3-L1C to form homodimers was as strong as that observed for LMADS1. By contrast, PI-L1C was not able to form homodimers, as seen for PI (Figs. 1A and 5).
FIG. 1. Construct PI-L1C encodes a chimeric PI protein with the C terminus of LMADS1. The first large boxed region represents the conserved PI-derived motif sequence (YEFRVQPSQPNLH). The second large boxed region (YGSHDLRLA) contains sequences for the paleoAP3 motif. The horizontal lines above the PI-derived and paleoAP3 motifs indicate the amino acids that possibly played either positive (+) or negative (-) roles in the homodimerization of LMADS1. The β-galactosidase activity shown at the right indicates the relative ability of homodimerization for each truncated or chimeric protein. B, strategy used to generate AP3-L1C cDNA encoding a chimeric AP3 protein with the C terminus of LMADS1 (see "Experimental Procedures" for detail). The two filled boxes in the C terminus of LMADS1 represent the PI-derived and paleoAP3 motifs. I, K, and C indicate the I, K, and C domains, respectively.
Furthermore, constructs containing AP3-L1C with deletion of the paleoAP3 motif (AP3-L1C-219) or of both the paleoAP3 and PI-derived motifs (AP3-L1C-203) (Fig. 1A) were transformed into yeast, and two-hybrid analysis was performed. As shown in Figs. 1A and 5, the ability of AP3-L1C-219 to form homodimers decreased to approximately 80% of that for full-length AP3-L1C. The ability of AP3-L1C to form homodimers was completely abolished once both the paleoAP3 and PI-derived motifs were deleted, as seen in AP3-L1C-203 (Figs. 1A and 5).
This result provides evidence to support the idea that the paleoAP3 and PI-derived motifs in LMADS1 are essential for the formation of homodimers.

DISCUSSION
Obligate heterodimerization is a unique characteristic of B function MADS proteins in regulating petal and stamen development (9,11,20,30). The discovery that B function proteins in monocot lily (17,24), orchid (28), and gymnosperm (17) are able to form homodimers is therefore of considerable interest. These results support the idea that an ancestral B protein functioned as a homodimer in regulating gene expression (17,24). The ability of B function proteins to form homodimers was, however, replaced by the formation of heterodimers in eudicots after gene duplications.
LMADS1 of lily (L. longiflorum), characterized previously in our laboratory, contains both paleoAP3 and PI-derived motifs in the C-terminal region of the protein (Fig. 6A) (24). LMADS1 can not only form homodimers but can also form heterodimers with Arabidopsis PI efficiently (24). This suggested that LMADS1 is close to the ancestral B function gene and may represent a transitional state between homodimerization and heterodimerization. One interesting question raised in this study was, what sequence or structure specificity allowed LMADS1 to retain the ability to form homodimers? As shown in Fig. 6A, LMADS1 contains a PI-derived motif (YEFRVQPSQPNLH) which showed 85% (11/13) identity to the consensus PI-derived motif (FXFRLQPSQPNLH) found in AP3 family genes (8,20). The 11-amino acid core (EFRVQPSQPNL) in this sequence is completely identical to the core consensus sequence of the PI motif of the PI lineage and is only 1 amino acid different from the core consensus sequence of the PI-derived motif of the paleoAP3 lineage (Fig. 6A). In addition, 100% (9/9) identity was found between the paleoAP3 motif of LMADS1 (YGSHDLRLA) (Fig. 6A) and the consensus of the paleoAP3 motif (YGXHDLRLA) found in AP3 family genes of lower eudicot, magnoliid dicot, and monocot species (8,20). This high sequence conservation made it very likely that these two motifs play an important role in homodimerization.
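The identity figures quoted above can be reproduced by a position-by-position comparison in which the consensus symbol X is allowed to match any residue. This is a minimal sketch under that assumption about how X was scored. The function name `consensus_identity` is illustrative, and the motif and consensus strings are taken from the text.

```python
def consensus_identity(motif: str, consensus: str, wildcard: str = "X") -> tuple[int, int]:
    """Count positions where the motif agrees with the consensus.

    A wildcard position in the consensus counts as a match for any residue.
    Returns (matching positions, consensus length).
    """
    if len(motif) != len(consensus):
        raise ValueError("motif and consensus must be the same length")
    matches = sum(1 for residue, expected in zip(motif, consensus)
                  if expected == wildcard or residue == expected)
    return matches, len(consensus)


comparisons = {
    "PI-derived": ("YEFRVQPSQPNLH", "FXFRLQPSQPNLH"),
    "paleoAP3": ("YGSHDLRLA", "YGXHDLRLA"),
}
for name, (motif, consensus) in comparisons.items():
    matched, total = consensus_identity(motif, consensus)
    print(f"{name}: {matched}/{total} = {100 * matched / total:.0f}%")
```

The two mismatches in the PI-derived motif fall at the first residue (Y in place of F) and the fifth (V in place of L).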
To seek evidence for the involvement of the paleoAP3 and PI-derived motifs of LMADS1 in forming homodimers, homodimerization of deletion mutants was analyzed. Our result clearly indicated that the paleoAP3 motif in the C terminus of LMADS1 contributes to homodimerization because the ability of LMADS1 to form homodimers decreased once this motif was deleted, as seen in LMADS1-219. Because approximately 75% of the β-galactosidase activity still remained in LMADS1-219 in yeast two-hybrid analysis, the paleoAP3 motif was useful but not obligate for homodimerization of LMADS1. Interestingly, when the PI-derived motif was further deleted, as seen in LMADS1-203, the ability of LMADS1 to form homodimers was completely eliminated. This result indicated that the PI-derived motif was not merely helpful but required for the formation of homodimers by LMADS1. This assumption was supported by the fact that LRGLOA was able to form homodimers although the paleoAP3 motif was absent and only the PI motif was present in its C terminus (17). The result obtained from yeast two-hybrid analysis was further confirmed by an independent method, EMSA, which tests for DNA binding that depends on protein dimerization. In EMSA, LMADS1 proteins bound efficiently to probe CArG1, whereas binding ability slightly decreased for LMADS1-219E (with paleoAP3 motif deletion) and was abolished for LMADS1-203E (with both paleoAP3 and PI-derived motif deletions). The result also indirectly supported the idea that the different proteins tested in yeast two-hybrid analysis were likely expressed in yeast at the same level or were equally stable. The decrease in β-galactosidase activity was therefore due to the effect on homodimerization for truncated proteins.
FIG. 3. Proteins fused with GST were incubated with the 32P-labeled DNA probe CArG1, and the protein-DNA complexes were subjected to polyacrylamide gel electrophoresis. For competition assay, a 120-fold molar excess of the cold competitor DNAs was included in the binding reactions. For supershift analysis, 1 g of antisera was individually incubated with the GST fusion proteins at room temperature prior to their use in the binding reactions. The results indicate that LMADS1-GST and L1-219E-GST proteins were able to bind to CArG1 and formed protein-DNA complexes (arrow 1), whereas L1-203-GST proteins were not able to form protein-DNA complexes with CArG1. As control, GST proteins alone were not able to form protein-DNA complexes with CArG1. The LMADS1-GST-CArG1 and L1-219E-GST-CArG1 protein-DNA complexes shifted in the supershift assay (arrow 2) once the preincubation with GST antisera was performed. Ab, antibody; oligos, oligonucleotides.
There are 13 amino acids (YEFRVQPSQPNLH) in the PI-derived motif of LMADS1. It is interesting to explore whether any specific amino acids in this motif are particularly important for homodimerization. After a series of amino acid deletions in this PI-derived motif, different β-galactosidase activities were observed in yeast two-hybrid analysis. Our results indicated that amino acids L215H216 and Q209P210 are required for the PI-derived motif to maintain its ability for homodimerization, since deletion in these two regions caused a decrease of β-galactosidase activity. By contrast, amino acid residues 211 to 214 (SQPN) seem not to be required for homodimerization, since β-galactosidase activity was not reduced once these four amino acids were deleted. The minimum number of amino acids required for the PI-derived motif to retain its ability for homodimerization is five (YEFRV). The ability for homodimerization was completely lost by deletion of amino acids R207V208 from the PI-derived motif in LMADS1-206. This suggested a mandatory role for these two amino acids, RV, in maintaining the function of the PI-derived motif to form homodimers for LMADS1.
When the role for these two amino acid residues arginine (R) and valine (V) was further examined separately by site-specific mutagenesis, different effects caused by amino acid substitutions for these two residues were observed. The ability to form homodimers decreased to approximately half of that for LMADS-208, once arginine (R) was substituted by alanine (A). By contrast, the ability to form homodimers was completely abolished once valine (V) was substituted by either alanine (A) or arginine (R). The ability to form homodimers was also completely abolished once both arginine (R) and valine (V) were substituted by alanine (A). This result indicated two things. First, it indicated that the ability of LMADS1-208 to form homodimers is not due to the number of amino acids remaining in the PI-derived motif, but rather, due to the presence of the residues arginine (R) and valine (V). Second, it clearly indicated that valine (V) at position 208 played a more important role than arginine (R) at position 207 in the PI-derived motif in regulating the formation of homodimers for LMADS1. Interestingly, arginine (R) at position 207 is the consensus residue whereas valine (V) at position 208 is not the consensus residue (L) in the PI-derived motif found in AP3 family proteins (8,20). However, when the sequence was further analyzed, valine (V) at position 208 remains highly conserved in the PI-derived motif found in AP3 family proteins of monocots such as LMADS1, OsMADS16, SILKY, LRDEF as well as in the PI motif found in PI family proteins (Fig. 6A) (8,24). This conservation may also reveal the possibly important function for the residue valine (V).
Another interesting question raised in this study is whether the homodimerization of LMADS1 solely depends on the paleoAP3 and PI-derived motifs. LMADS1 showed high identity (70%) to other monocot AP3 orthologues such as OsMADS16 of rice (24). As in LMADS1, conserved sequences (FAFRVVPSQPNLH) and (GGNHDLRLG), showing 85% (11/13) identity to the consensus sequence of the PI-derived motif and 78% (7/9) identity to the paleoAP3 motif, respectively, were also identified in the C-terminal region of OsMADS16 (Fig. 6A). Thus, OsMADS16 should be able to form homodimers if these two motifs were the only requirement for homodimerization. However, OsMADS16 has not been reported to be able to form homodimers (20). This indicates that sequences other than the paleoAP3 and PI-derived motifs should also be considered for homodimerization of LMADS1. LMADS1 and OsMADS16 showed high identity in the M (81%), I (69%), and K (78%) domains (Table I). By contrast, only 33% identity (16/48) in the C-terminal region (without the paleoAP3 and PI-derived motifs) was observed for LMADS1 and OsMADS16 (Table I). This result suggests that these 48 amino acids in the C-terminal region of LMADS1 are also involved in the homodimerization of LMADS1 and may be the cause of the difference between LMADS1 and OsMADS16.
FIG. 6. A, alignment of the consensus sequences (underlined) for PI-derived, paleoAP3, and euAP3 motifs in the C terminus of LMADS1 (L. longiflorum), OsMADS16 (rice), and AP3 (Arabidopsis). Alignment of the consensus sequences (underlined) for the PI motif in the C terminus of LRGLOA (L. regale), OsMADS4 (rice), and PI (Arabidopsis). The residues not conserved for the consensus sequences in the corresponding motif are boxed. The number under each motif indicates the number of conserved residues in this motif to the consensus sequences of the corresponding motif. B, the importance of the C and K domains in forming homodimers for B function proteins. Homodimerization of a B function protein required the interaction between a conserved K domain and a corresponding ancestral C-terminal domain. LMADS1 is able to form homodimers because it contains both conserved K and C-terminal domains. The chimeric protein containing the C-terminal domain of LMADS1 was able to form homodimers when its K domain, from AP3, showed a high identity (53%) to that of LMADS1. By contrast, a chimeric protein was unable to form homodimers when its K domain, from PI, showed a low identity (25%) to that of LMADS1. Even when high K domain identity is present, the ability of a B protein to form homodimers is lost once the identity of its C-terminal domain to that of LMADS1 is below a threshold, as seen for OsMADS16 and AP3. The number under each domain indicates the percentage of sequence identity for this domain to that of LMADS1. The plus and minus signs shown at the right indicate the ability for homodimerization by each wild-type or chimeric protein.
To further examine this assumption, the chimeric protein AP3-L1C, containing the entire C terminus of LMADS1 plus the I and K domains of Arabidopsis AP3 protein, was analyzed for homodimerization. Interestingly, despite only approximately 50% identity in the I and K domains between AP3 and LMADS1 (Table I), this chimeric protein AP3-L1C formed homodimers as strongly as LMADS1 (Figs. 5 and 6B). Thus, this result provided direct evidence that the rest of the C-terminal region, in addition to the paleoAP3 and PI-derived motifs, is also responsible for the homodimerization of LMADS1. The presence of the conserved paleoAP3 and PI-derived motifs alone is clearly not sufficient to confer the ability to homodimerize, as seen for OsMADS16. This assumption is further supported by the fact that OMADS3, an orchid (O. Gower Ramsey) AP3-like protein, loses the ability to form homodimers once its C-terminal region is deleted (Hsu and Yang, unpublished result).
The K domain of MADS box proteins contains an amphipathic helix that has been thought to be involved in protein dimerization (40). Since the chimeric protein AP3-L1C has almost 100% of the ability to form homodimers seen for LMADS1, it seems that substituting the K domain of LMADS1 with that of AP3 has no effect on homodimerization. However, in contrast to AP3-L1C, the chimeric protein PI-L1C, a chimeric PI protein with the C terminus of LMADS1, is unable to form homodimers (Figs. 5 and 6B). The difference between AP3-L1C and PI-L1C lies in the I and K domains in the N terminus of the chimeric proteins. This revealed that the I and K domains of AP3 and LMADS1 are also important for homodimerization. The I and K domains of AP3 showed 48 and 53% identity to those of LMADS1 (Table I). These two domains of PI showed 45 and 25% identity to those of LMADS1 (Table I). It is clear that the K domain is the major difference (53% versus 25% identity) between AP3-L1C and PI-L1C and likely participates in homodimerization. Therefore, we believe that a conserved K domain interacting with a corresponding ancestral C-terminal domain is required for the homodimerization of a B function protein (Fig. 6B). In our study, the 53% identity of AP3 is above, whereas the 25% identity of PI is below, the threshold of identity needed for a K domain to interact with the C-terminal domain of LMADS1 in forming homodimers (Fig. 6B). Interestingly, a similar pattern of sequence identity was also observed for LRGLOA, PI, and AP3. The I and K domains of PI showed 52 and 54% identity to those of LRGLOA (Table I). These two domains of AP3 showed only 25 and 28% identity to those of LRGLOA (Table I). Therefore, the chimeric protein containing the C terminus of LRGLOA plus the I and K domains of Arabidopsis PI is expected to have great potential to form homodimers, as seen for AP3-L1C. This prediction remains under investigation.
Although percentage sequence identity gave some clues to protein interaction specificity, further investigation to determine which residues in the K domain might be responsible for a functional difference between AP3 and PI would be helpful. However, the amino acids are highly variable in the K domain of AP3 and PI (more than 50 among 67 amino acids are different). Thus, the identification of residues specifically responsible for this protein interaction becomes extremely difficult and will remain under investigation.
Based on our results, a model has been proposed to illustrate the possible evolution of dimerization for B function proteins. Genes such as LMADS1 were generated by duplication from an ancestral B gene containing an ancestral form of the PI and paleoAP3 motifs approximately 300 million years ago. This duplication also produced an ancestral PI gene that subsequently evolved into genes such as LRGLOA of lily (17). The PI motif was conserved in LRGLOA and was slightly changed to the PI-derived motif in LMADS1 (Fig. 6A). To form homodimers, an ancestral form of the PI or the PI-derived motif is absolutely required. A conserved paleoAP3 motif will enhance the ability to homodimerize. Thus, LMADS1 and LRGLOA retain the ability to form homodimers (17,24). In addition to the PI-derived and paleoAP3 motifs, a specific stretch of sequence with ancestral characteristics in the C domain and a certain level of conservation in the K domain are also required for homodimerization (Fig. 6B). Therefore, despite the high sequence conservation in other parts of the proteins, the severe sequence change in this C domain from LMADS1 to OsMADS16 (Table I) caused a loss of the ability to form homodimers for OsMADS16 (Fig. 6B). Further severe alteration of the PI-derived motif and replacement of the paleoAP3 motif by the euAP3 motif in proteins such as AP3 and DEF of higher eudicots (Fig. 6A) completely transformed them into obligate heterodimers. This assumption may also be true for genes in the PI lineage. Because the PI motif was highly conserved in the PI lineage from LRGLOA and OsMADS4 to PI (Fig. 6A), variable sequences in the C and K domains should be observed for these three proteins and account for the difference in dimerization. As seen in Table I, in contrast to 72% identity in the MADS box domain, only approximately 50% identity in the I and K domains and 25% identity in the C domain was observed between LRGLOA and PI. There is 83 and 73% identity in the M and K domains between LRGLOA and OsMADS4. However, only 34% identity in the C domain was observed between these two proteins (Table I). Interestingly, the percentage of sequence identity in the M, I, K, and C domains is almost identical for LMADS1/OsMADS16/AP3 and LRGLOA/OsMADS4/PI. This indicated that AP3 and PI have evolved from their corresponding ancestral B genes at a similar rate.