Cloning and functional analysis of the beta-carotene hydroxylase of Arabidopsis thaliana.

An Arabidopsis thaliana cDNA encoding the enzyme β-carotene hydroxylase was identified by functional complementation in Escherichia coli. The product of this cDNA adds hydroxyl groups to both β rings of the symmetrical β-carotene (β,β-carotene) to form zeaxanthin (β,β-carotene-3,3′-diol) and converts the monocyclic β-zeacarotene (7′,8′-dihydro-β,ψ-carotene) to hydroxy-β-zeacarotene (7′,8′-dihydro-β,ψ-carotene-3-ol). The ε rings of δ-carotene (ε,ψ-carotene) and α-zeacarotene (7′,8′-dihydro-ε,ψ-carotene) are poor substrates for the enzyme. The predicted amino acid sequence of the A. thaliana enzyme resembles the four known bacterial β-carotene hydroxylase enzymes (31-37% identity) but is much longer, with an N-terminal extension of more than 130 amino acids. Truncation of the cDNA to produce a polypeptide lacking the first 69 amino acids does not impair enzyme activity in E. coli. Truncation to yield a polypeptide of a length comparable with the bacterial enzymes (lacking 129 N-terminal amino acids) resulted in the accumulation of the monohydroxy intermediate β-cryptoxanthin (β,β-carotene-3-ol), predominantly, when β-carotene was provided as the substrate. It is suggested that amino acid residues 70-129 of the A. thaliana enzyme may play a role in formation of a functional homodimer.

Carotenoids are essential components of the photosynthetic apparatus in plants, algae, and cyanobacteria. These yellow, orange, and red pigments protect against photooxidation, harvest light for photosynthesis, and serve a number of other important functions. Most of the carotenoids important in photosynthetic organisms are xanthophylls or oxygenated carotenoids (1). The dihydroxy carotenoid zeaxanthin (β,β-carotene-3,3′-diol)¹ is thought to play a central role in the nonradiative dissipation of light energy under conditions of excessive photon capture by the photosynthetic light-harvesting apparatus (2). Zeaxanthin is formed from β-carotene (β,β-carotene) by hydroxylation at position 3 in both rings of this symmetrical precursor (Fig. 1). Zeaxanthin, in turn, serves as the substrate for biosynthesis of many other important xanthophylls.
Bacterial genes encoding enzymes that convert β-carotene to zeaxanthin have been described (3)(4)(5)(6), but a gene or cDNA encoding an enzyme with this activity has not yet been identified in any photosynthetic organism. We employed a functional complementation approach to identify a cDNA that encodes β-carotene hydroxylase in the higher plant Arabidopsis thaliana. The polypeptide predicted by this cDNA resembles the bacterial enzymes (31-37% amino acid identities), lending strong support to the concept of a common origin for the pathways of carotenoid biosynthesis in photosynthetic and nonphotosynthetic organisms. The A. thaliana enzyme, however, is much longer than the bacterial enzymes and has an N-terminal region that is essential for proper enzyme function and yet has no counterpart in the bacterial enzymes.

EXPERIMENTAL PROCEDURES
Plasmid Construction-A cDNA encoding an isopentenyl pyrophosphate (IPP)² isomerase in the green alga Haematococcus pluvialis³ was excised from the library cloning vector (pBluescript SK+) with BamHI and KpnI, and cloned, in frame, in the BamHI and KpnI sites of pTrcHisA (Invitrogen). Digestion of the resulting plasmid with EcoRV and KpnI produced a fragment of ~2.0 kb containing the H. pluvialis IPP isomerase downstream of the strong bacterial trc promoter. This fragment was cloned in the blunted HindIII site of pAC-BETA (7) to produce pAC-BETA-O4. A 1.9-kb SmaI-EcoRV fragment containing an A. thaliana lycopene ε cyclase cDNA was excised from plasmid y2 (7) and cloned in the blunted BamHI site of pTrcHisA to give pTrc-Atε. A 2.5-kb EcoRV-KpnI fragment containing the trc promoter and the A. thaliana lycopene ε cyclase was excised from pTrc-Atε and cloned in the blunted HindIII site of pAC-LYC (8) to produce pAC-DELTA. The plasmid pAC-ZEAX was constructed from pAC-EHER (8) by deletion of a 1.1-kb SalI-SalI fragment containing the Erwinia herbicola gene encoding zeaxanthin glucosyltransferase.
Mass Excision and Screening of an A. thaliana cDNA Library-A size-fractionated 0.5-1 kb cDNA library of A. thaliana in λZAPII (9) was treated to cause a mass excision and thereby produce a phagemid library (7). The phagemid library was introduced into E. coli strain XL1-Blue, amplified on agar plates, and recovered as a plasmid library.
E. coli strain DH10BZIP containing the plasmid pAC-BETA-O4 was the host for library screening. Cultures were grown in Luria-Bertani (LB) medium supplemented with chloramphenicol (CAP; 50 μg/ml). Cells were made competent for transformation using a chemical method (10). The A. thaliana plasmid library was transfected into DH10BZIP/pAC-BETA-O4 cells, and cells were spread on LB agar plates containing CAP and ampicillin (AMP; 150 μg/ml) to produce a density of about 3,000 colonies/large Petri plate (150 × 15 mm). Colonies were examined after 5-day incubation at room temperature.
Subcloning, Truncations, DNA Sequencing, and Sequence Analysis-Plasmid DNA was purified as described by Del Sal et al. (11). The A. thaliana β-hydroxylase cDNA was sequenced completely on both strands using an automated sequencer. The sequence of this cDNA has the GenBank accession number U58919.
For N-terminal truncation, the library plasmid containing the A. thaliana β-carotene hydroxylase cDNA was digested with XhoI plus either BglII or SacI (partial digest). The resulting truncated cDNA fragments were directionally cloned, in frame, in BamHI-XhoI- or SacI-XhoI-digested pTrcHis plasmid vectors (Invitrogen). Methods for DNA and amino acid sequence analysis and prediction of transmembrane helices are described elsewhere (7).
* This work was supported primarily by a grant from the United States Department of Agriculture (Project MD-9302172) (to E. Gantt and F. X. Cunningham, Jr.). Support was also provided by the Binational Agricultural Development Fund (Project number US-2006-91R) and by the Maryland Agricultural Experiment Station. This manuscript is scientific article number 7905 and contribution number 9240 of the Maryland Agricultural Experiment Station. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
¶ To whom correspondence should be addressed. Tel.: 301-405-1649; Fax: 301-314-9082; E-mail: fc18@umail.umd.edu.
¹ The common names of carotenoids are used in the text of this manuscript. The more descriptive systematic or semisystematic names (15) are given in parentheses when first used.
Carotenoid Pigment Analysis and Mass Spectrometry-Cultures of E. coli DH10BZIP containing the various plasmids were grown in 50 ml of LB medium containing AMP and CAP in 250-ml flasks and were incubated at 28°C for 36 h with gentle shaking. Cell harvest, pigment extraction, and HPLC analysis of the carotenoid pigments are described in Cunningham et al. (8).
For mass spectrometry, the carotenoids were separated on silica gel TLC plates developed with hexane:diethyl ether (1:1, by volume). The pigments were eluted from the adsorbent with acetone, the acetone was evaporated under a stream of nitrogen gas, and the pigments were stored under nitrogen at −20°C until analysis. Electron impact mass spectral data were obtained with a mass spectrometer model VG7070E.

RESULTS AND DISCUSSION
A Visual Screen for Identification of Genes or cDNAs Encoding β-Carotene Hydroxylase-Using carotenoid biosynthetic genes from the bacterium E. herbicola, we constructed the plasmids pAC-BETA and pAC-ZEAX such that cells of E. coli that contain them accumulate β-carotene and zeaxanthin, respectively, and thereby form colonies that are yellow in color. E. coli colonies containing pAC-ZEAX could be distinguished, with some difficulty, from those containing pAC-BETA by a subtle difference in the color. The difference in colony color (and the amount of carotenoid pigment) could be enhanced by the appropriate choice of the host E. coli strain, by incubation of Petri plates at room temperature for 5 days rather than overnight at 37°C and by introduction of a cDNA encoding the enzyme IPP isomerase. Under optimal conditions the color difference was readily apparent: E. coli colonies containing β-carotene were a deep orange-yellow in color and those containing zeaxanthin were a bright yellow color. The ability to visually distinguish colonies containing zeaxanthin from those that contain β-carotene provided a method for screening for genes or cDNAs that produce a functional β-carotene hydroxylase enzyme when introduced into E. coli.
Identification of an A. thaliana cDNA Encoding β-Carotene Hydroxylase by Functional Complementation in E. coli-An A. thaliana cDNA library was transfected into E. coli DH10BZIP containing the plasmid pAC-BETA-O4 (pAC-BETA with the addition of a cDNA encoding IPP isomerase), and the cells were spread on agar plates containing the requisite antibiotics to maintain pAC-BETA-O4 and select for library plasmids. A single bright yellow colony was observed in a background of the ~250,000 deep orange-yellow colonies that were examined. HPLC analysis of the carotenoid pigments in cultures inoculated with this bright yellow transformant (Fig. 2, right panel) indicated that zeaxanthin was the predominant carotenoid (95-99% of the total for each of several experiments). A small amount (1-5% of the total carotenoid) of the monohydroxy carotenoid β-cryptoxanthin (β,β-carotene-3-ol) was also detected. A control culture, with the empty library cloning vector, contained only β-carotene (Fig. 2, left panel). The identities of zeaxanthin and β-cryptoxanthin were confirmed by mass spectrometry (data not shown). We concluded that the A. thaliana cDNA contained within the bright yellow transformant encodes the enzyme β-carotene hydroxylase.
In addition to the ubiquitous β ring, some of the important carotenoids in photosynthetic organisms contain a second type of ring, the ε ring. The ε ring differs from the β ring only in the position of the double bond (compare δ-carotene and β-carotene in Fig. 1). When the A. thaliana β-carotene hydroxylase cDNA was introduced into cells that accumulate δ-carotene (ε,ψ-carotene), a monocyclic carotenoid with a single ε ring, only a small amount of the substrate was hydroxylated (Fig. 3, right panel). A control culture containing the empty cloning vector accumulated mostly δ-carotene (Fig. 3, left panel) with a smaller amount of the linear precursor lycopene (ψ,ψ-carotene). The ε ring of the monocyclic α-zeacarotene (7′,8′-dihydro-ε,ψ-carotene) was also a poor substrate for the hydroxylase (data not shown).
The inefficient hydroxylation of ε rings in the heterologous E. coli assay system is consistent with genetic evidence (12) that implies the existence of a separate ε hydroxylase. A report that the chirality of the hydroxyl group in the ε ring of lutein (β,ε-carotene-3,3′-diol) extracted from Calendula officinalis is opposite to that of the hydroxyl group in the β ring of this same compound (13) also implies a separate ε hydroxylase. It would be of interest to determine the chirality of the hydroxylated δ-carotene that is produced in the presence of the β-carotene hydroxylase in E. coli (Fig. 3, right panel).
DNA and Amino Acid Sequence Analysis of the A. thaliana β-Carotene Hydroxylase cDNA-The A. thaliana cDNA encoding the β-carotene hydroxylase enzyme is 956 base pairs (bp) in length with an open reading frame that extends from the beginning of the cDNA (starting with base 3) to a termination codon (TGA) at bases 885-887. A search of the data base of expressed sequence tags (dbEST) identified an A. thaliana cDNA (clone VBVPH03; GenBank accession number F13822 for the N-terminal sequence and F13851 for the C-terminal sequence) that appears to be identical to the β-hydroxylase cDNA described in this work. The dbEST sequence provides an additional 100 bp of N-terminal sequence (it is shorter at the C terminus) and suggests that the full-length polypeptide commences with a methionine residue that begins 46 bp upstream of the first nucleotide in the sequence of the cDNA described in this work. The polypeptide predicted by the combination of the dbEST sequence and the cDNA sequence determined in this work contains 310 amino acids and has a molecular weight of 34,400 and a pI of 9.4.
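The residue count quoted above can be checked from the stated coordinates. The following is a minimal sketch, assuming that the positions are 1-based and inclusive and that the 46 upstream bases plus the first two bases of the cDNA make up the 16 codons contributed by the dbEST clone; the variable and function names are illustrative only.

```python
# Coordinates as quoted in the text (assumed 1-based, inclusive).
ORF_START = 3              # first base of the reading frame within the cDNA
STOP_CODON_START = 885     # first base of the TGA termination codon
UPSTREAM_EST_BASES = 46    # bases from the initiating ATG to the start of the cDNA
MOLECULAR_WEIGHT = 34_400  # predicted molecular weight of the full-length polypeptide


def whole_codons(n_bases: int) -> int:
    """Convert a base count to codons, refusing counts that break the frame."""
    if n_bases % 3 != 0:
        raise ValueError(f"{n_bases} bases is not a whole number of codons")
    return n_bases // 3


# Codons encoded by the cDNA itself, from base 3 up to (not including) the stop codon.
cdna_residues = whole_codons(STOP_CODON_START - ORF_START)

# Codons supplied by the EST: 46 upstream bases plus cDNA bases 1 and 2.
est_residues = whole_codons(UPSTREAM_EST_BASES + (ORF_START - 1))

full_length_residues = cdna_residues + est_residues
average_residue_mass = MOLECULAR_WEIGHT / full_length_residues

print(cdna_residues, est_residues, full_length_residues)
print(round(average_residue_mass, 1))
```

Under these assumptions the cDNA encodes 294 residues and the EST adds 16, which reproduces the 310 amino acids stated for the full-length polypeptide.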
An alignment of the predicted amino acid sequence of the A. thaliana β-hydroxylase with those of the four known bacterial β-carotene hydroxylases is displayed in Fig. 4. There are few gaps in the alignment, and a number of highly conserved regions are evident, the most notable being that labeled "Motif 1." A search of the nucleotide and protein sequence data bases for polypeptides containing the sequence HDGLVH(Q,K)R(W,F)P identified only the known bacterial β-carotene hydroxylase enzymes even when mismatches were allowed. The predicted A. thaliana polypeptide does not contain the sequence FLYAXX-PXXXXXXXLXE (beginning at position 274 in the alignment of Fig. 4), which was suggested as a signature sequence for enzymes that act upon carotenoid β rings (14).
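As an illustration of what an exact search for the pattern HDGLVH(Q,K)R(W,F)P involves, the motif can be written as a regular expression in which each parenthesized pair becomes a character class. This is only a sketch: the data base search tools actually used are not named in the text, the mismatch-tolerant search is not reproduced here, and the two test sequences are hypothetical toy strings rather than real hydroxylase sequences.

```python
import re
from typing import List

# HDGLVH(Q,K)R(W,F)P: Q or K at the seventh position, W or F at the ninth.
MOTIF = re.compile(r"HDGLVH[QK]R[WF]P")


def find_motif(sequence: str) -> List[int]:
    """Return the 1-based start position of every exact motif match."""
    return [match.start() + 1 for match in MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequences, used only to exercise the function.
toy_with_motif = "MAAAHDGLVHQRWPGGG"
toy_without_motif = "MAAAHDGLVHARWPGGG"  # A instead of Q/K at the seventh position

print(find_motif(toy_with_motif))
print(find_motif(toy_without_motif))
```

A search that tolerates mismatches, as described in the text, would need a different approach (for example a position-by-position comparison with a mismatch budget), which this exact-match sketch does not attempt.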
The identity between the predicted sequences of the A. thaliana and the bacterial hydroxylases ranges from 31 to 37% with more than one-fourth of the identically conserved residues being histidines. Plots of amino acid hydropathy (not shown) and other analyses (see "Experimental Procedures") suggest three probable transmembrane helical regions in corresponding locations in the bacterial and plant enzymes. Interestingly, a fourth transmembrane region was predicted for the higher plant sequence in a region (residues 82-102, Fig. 4) that aligns upstream of where the bacterial sequences begin.
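For readers who wish to see how a pairwise identity figure of this kind can be derived from an alignment, the sketch below counts identical residues over the aligned columns in which neither sequence has a gap. It assumes the dot-for-gap convention of the Fig. 4 legend; the text does not state which denominator was used for the 31-37% values, so the choice made here (gap-free columns) is an assumption, and the two aligned strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = ".") -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if res_a.upper() == res_b.upper():
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


# Hypothetical aligned fragments, not taken from Fig. 4.
toy_a = "MHDG.LVHQ"
toy_b = "MHEGALVHK"
print(percent_identity(toy_a, toy_b))
```

Other conventions (for example dividing by the length of the shorter sequence) would give somewhat different percentages for the same alignment.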
[Fig. 4 legend, in part] A consensus sequence is given below the alignment. A capital letter in the consensus sequence indicates that the amino acid residue is identical for all five polypeptides. A lowercase letter is used where the amino acid residue for three of the four bacterial enzymes is identical to that predicted for the A. thaliana enzyme. Arrows above the alignment mark the sites of truncation resulting after digestion with the indicated restriction enzymes. A dot in a sequence denotes a gap in the alignment. The first 16 amino acid residues (in lowercase type) were predicted by the N-terminal sequence of an A. thaliana expressed sequence tag (GenBank accession number F13822). GenBank accession numbers: A. thaliana, U58919; Alcaligenes sp., D58422; Agrobacterium aurantiacum, D58420; E. herbicola strain Eho10, M87280; Erwinia uredovora, D90087. TM, transmembrane.
A Portion of the A. thaliana N Terminus with No Bacterial Counterpart Is Important to Enzyme Function-The β-carotene hydroxylase of A. thaliana is presumed to be located in the thylakoid membranes of the chloroplasts in this plant. Therefore, it is not unexpected that the sequence of the predicted enzyme includes an N-terminal region that is not found in the bacterial enzymes (Fig. 4). The N-terminal extension of more than 130 amino acids, however, is much longer than typically found for chloroplast targeting sequences. We truncated the A. thaliana cDNA to examine whether and how much of this N-terminal extension was essential to enzyme function.
The enzyme SacI was used to remove that portion of the cDNA encoding the first 69 amino acids of the A. thaliana β-hydroxylase (Fig. 4). The product of this truncated cDNA (fused at the N terminus to a peptide encoded by the plasmid vector) efficiently converted β-carotene to zeaxanthin (92-93% of the total carotenoid) in cells of E. coli (Fig. 5, left panel). As in the "full-length" cDNA (which lacks the first 16 amino acids predicted by the dbEST sequence), a small amount of β-cryptoxanthin (7-8% of the total) was also detected.
A second truncation, using an available BglII site, served to remove that portion of the cDNA encoding the first 129 amino acids of the A. thaliana β-carotene hydroxylase (Fig. 4). Since the initial methionine residues of the bacterial hydroxylases align at positions 133-136 (Fig. 4), the polypeptide produced by this truncated cDNA might be expected to contain all that is required for enzyme function. The product of this construct does, indeed, hydroxylate most of the β-carotene produced in cells of E. coli. However, the major product (75-77% of the total) was β-cryptoxanthin rather than zeaxanthin (Fig. 5, right panel). Small amounts of zeaxanthin (16-18%) and β-carotene (7%) were also present.
The high ratio of β-cryptoxanthin to β-carotene in cells containing the BglII-truncated hydroxylase (~11 to 1) indicates that the β rings of β-carotene are hydroxylated with greater efficiency than the not yet hydroxylated β ring of β-cryptoxanthin. We speculate that the A. thaliana enzyme normally associates with a second β hydroxylase (or with an ε hydroxylase) to form a dimer and that a portion of the N terminus (e.g. between residues 69 and 130) mediates these subunit interactions. Experiments to clarify the role of the N-terminal domain and determine whether the hydroxylase does, indeed, function as a dimer are now in progress.
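The ratio of about 11 to 1 follows directly from the pigment percentages reported for the BglII truncation. The sketch below reproduces that arithmetic and, as an additional illustrative quantity not reported in the text, estimates the fraction of β rings that carry a hydroxyl group (one per β-cryptoxanthin, two per zeaxanthin). The midpoints of the reported ranges are used as representative values, which is an assumption of this example.

```python
# Reported composition for the BglII-truncated enzyme (% of total carotenoid).
# Midpoints of the quoted ranges are assumed as representative values.
BETA_CAROTENE = 7.0
BETA_CRYPTOXANTHIN = (75.0 + 77.0) / 2  # reported as 75-77%
ZEAXANTHIN = (16.0 + 18.0) / 2          # reported as 16-18%


def cryptoxanthin_to_carotene_ratio(cryptoxanthin: float, carotene: float) -> float:
    """Ratio of monohydroxy product to unreacted substrate."""
    return cryptoxanthin / carotene


def hydroxylated_ring_fraction(carotene: float, cryptoxanthin: float,
                               zeaxanthin: float) -> float:
    """Fraction of all beta rings that are hydroxylated (two rings per molecule)."""
    total_rings = 2 * (carotene + cryptoxanthin + zeaxanthin)
    hydroxylated_rings = cryptoxanthin + 2 * zeaxanthin
    return hydroxylated_rings / total_rings


ratio = cryptoxanthin_to_carotene_ratio(BETA_CRYPTOXANTHIN, BETA_CAROTENE)
ring_fraction = hydroxylated_ring_fraction(BETA_CAROTENE, BETA_CRYPTOXANTHIN, ZEAXANTHIN)

print(round(ratio, 1))
print(round(ring_fraction, 2))
```

With the range endpoints instead of the midpoints, the ratio spans 75/7 to 77/7, that is, roughly 10.7 to 11.0, consistent with the value given in the text.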