African Trypanosomes Have Differentially Expressed Genes Encoding Homologues of the Leishmania GP63 Surface Protease*

The genomes of various Leishmania parasites contain tandemly arrayed genes encoding an abundant 63-kDa surface glycoprotein called GP63. Leishmania GP63s are metalloproteases that play an important role in the invasion and survival of the parasites within the macrophage, and their presence on the Leishmania surface has been correlated with resistance to complement-mediated lysis. Here we report the identification of GP63-like genes in African trypanosomes. The predicted trypanosome and Leishmania GP63s share a metalloprotease catalytic site motif of HEXXH as well as 19 cysteines and 10 prolines, implying a conservation of enzymatic activity and secondary/tertiary structure. The trypanosome GP63 genes are transcribed equally in procyclic and bloodstream trypanosomes, but their mRNAs accumulate to a 50-fold higher steady state level in bloodstream trypanosomes, where the ratio of mRNAs for GP63 and variant surface glycoprotein is about 1:150. Transcription of the GP63 genes is sensitive to α-amanitin, indicating that they are transcribed by a different polymerase than the variant surface glycoprotein genes. These results lead to a reconsideration of the potential functions of GP63, inasmuch as African trypanosomes are not known to interact with macrophages and do not have an intracellular stage during their life cycle.

During its life cycle, the trypanosomatid protozoan parasite Leishmania exists as an extracellular form, the promastigote, in its sandfly vector and as an intracellular form, the amastigote, in macrophages of its mammalian host (1). Promastigotes of all Leishmania species examined to date possess on their surface an abundant 63-kDa glycoprotein, called GP63 or leishmanolysin, that is a zinc-metalloproteinase with a broad substrate specificity and pH optimum (2)(3)(4). Leishmania GP63s and their genes have been the subject of extensive investigation and many reviews (5)(6)(7)(8). These studies have included determinations of the structure and post-translational modifications of the protein, characterizations of its protease activity, and elucidations of the chromosomal organization and differential expression of the multicopy GP63 genes. Nascent GP63s have an N-terminal prepropeptide that must be removed for activation of the protease activity, and most nascent GP63s have a C-terminal hydrophobic tail that is replaced by a glycosylphosphatidylinositol (GPI) anchor (9). Depending on the Leishmania species, infective (metacyclic) promastigotes can have as many as 500,000 GP63 molecules on their surface (7,8).
A number of different functions have been ascribed to Leishmania GP63s, several of which implicate GP63 in the interaction between the infective promastigote and the macrophage. GP63 has been reported to participate in the entry of promastigotes into macrophages (10-14) and to contribute to the survival of amastigotes in the phagolysosome of the macrophage (15,16). The presence of proteolytically active GP63 on the surface of promastigotes has also been correlated with resistance of the parasite to complement-mediated lysis (17,18). In addition, recombinant versions of GP63 have been found to be partially protective against challenge by infective Leishmania (19-23).
A homologue of Leishmania GP63 has been reported in another trypanosomatid, Crithidia fasciculata (24). C. fasciculata is not known to infect a mammalian host. Its life cycle appears to be confined to mosquitoes, and it is thought to have two predominant developmental forms, one adhering to the lining of the insect's midgut and the other existing within the lumen of the gut (7,25). A biological role for the C. fasciculata GP63 remains obscure. It might protect the organism from the immune response of the insect, it might be an evolutionary vestige of an era when C. fasciculata had another host, or it may have a function yet to be discovered.
African trypanosomes are a third group of protozoan parasites that, similar to Leishmania, exist as one major developmental form in an insect vector and another in the mammalian host (26). In contrast to Leishmania, however, both of these forms are extracellular. In the tsetse fly midgut, African trypanosomes are present as an extracellular procyclic form, whereas in the bloodstream of their mammalian hosts, they exist as a free-swimming bloodstream form. Bloodstream trypanosomes evade their hosts' immune response by periodically switching their variant surface glycoprotein (VSG), a phenomenon called antigenic variation that has also been the subject of extensive investigation and many reviews (27)(28)(29)(30).
While comparing the efficiency of sequencing random genomic DNA fragments versus random cDNAs for discovering new genes in African trypanosomes, we identified a random genomic fragment, i.e. a genomic survey sequence (GSS), encoding a protein with similarity to GP63s of Leishmania and Crithidia (31). Two random trypanosome cDNAs, or expressed sequence tags (ESTs), have also now been found to encode GP63-like proteins. Thus, we undertook a characterization of the genes encoding these African trypanosome homologues of GP63. We found that the African trypanosome GP63 genes are predominantly expressed in the bloodstream form, leading us to hypothesize that the presence of GP63 protects bloodstream trypanosomes against complement-mediated lysis.

EXPERIMENTAL PROCEDURES
Trypanosomes-All experiments were conducted with material isolated from the MVAT4 bloodstream clone of the Trypanosoma brucei rhodesiense WRATat serodeme, or from procyclic organisms derived from an MVAT5 bloodstream clone of the same serodeme (32,33). These bloodstream and procyclic organisms were maintained in rats or grown in culture, respectively, as described previously (32)(33)(34).
Analysis of Nascent (Run-on) RNA in Isolated Nuclei-The nuclei of bloodstream or procyclic trypanosomes were isolated using a protocol kindly provided by Dr. Etienne Pays (Université Libre de Bruxelles) and described previously (35)(36)(37). They were incubated with [α-32P]UTP, and their RNAs were isolated for use as probes in Southern blots as described (35). In some experiments, α-amanitin (500 μg/ml) was added to the nuclei before incubation.
Other Procedures-Total RNA and DNA were isolated from bloodstream and/or procyclic trypanosomes as described (38,39). Southern and Northern blots were conducted as described (35). Radioactive signals were quantitated by electronic autoradiography using a Packard instant imager (Packard, Meriden, CT). A bacteriophage ZAP library of cDNAs, and a bacteriophage FIX library of genomic DNA fragments, both derived from T. brucei rhodesiense bloodstream clone MVAT4 (32), were screened with an African trypanosome GSS (31) using conventional procedures (40). The cDNA inserts were rescued in the plasmid pBluescript. BamHI fragments from a genomic DNA insert of a FIX clone were subcloned into the same plasmid. DNA sequencing was conducted on an Applied Biosystems model 373A stretch fluorescent automated DNA sequencer (Perkin-Elmer). Known sequences were used to synthesize new oligonucleotides for further sequencing. The sequences were analyzed using the HIBIO MacIntosh DNASIS program (Hitachi) and the CLUSTAL algorithm (41).

RESULTS
Identification of T. brucei cDNAs Encoding Homologues of Leishmania and Crithidia GP63s-The African trypanosome GSS possessing similarity to GP63 genes (31) was used as a probe to screen cDNA and genomic DNA libraries derived from T. brucei rhodesiense bloodstream clone MVAT4 (32). Positive clones were identified in both libraries, and several clones from each were selected for further characterization. Preliminary analyses of the cDNA clones revealed two that contain a complete coding region. Panel A of Fig. 1 depicts one of these cDNAs and the relative locations of the 5′ ends of the GSS and ESTs containing the sequence similarities. The complete sequences of the two cDNAs were determined and found to be identical except for a small difference in the locations of their 5′ ends within the 5′-untranslated region (UTR). Neither cDNA contains a 5′ spliced leader, but both have an upstream termination codon in frame with the first methionine codon.
The protein encoded by this cDNA sequence is compared with the C. fasciculata GP63 sequence (24) and a GP63 sequence from Leishmania guyanensis (42) in Fig. 2. All of the published Leishmania GP63 sequences from several different Leishmania species display significant homology to the African trypanosome sequence in a BLASTX search (data not shown) (43), but the L. guyanensis GP63 shown in Fig. 2 has slightly more homology than the others. The shaded boxes show positions in the trypanosome sequence at which an identical amino acid occurs in the Crithidia and/or Leishmania sequence. Across the full 622-amino acid trypanosome sequence, 38% of the residues (235/622) are identical to the Crithidia and/or Leishmania GP63s. Between positions 114 and 425, i.e. within the middle one-half of the trypanosome protein, 50% of the positions (156/313) are identical to the Crithidia and/or Leishmania GP63s. Furthermore, the positions of 19 cysteine residues (black ovals) and 10 proline residues (black triangles) are conserved in all three sequences. Because cysteines are involved in potential disulfide linkages and prolines disrupt secondary structure, the conservation of these amino acid positions suggests that the three proteins have a similar overall tertiary structure.
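The identity percentages quoted above follow from simple counts over the alignment. The following is a minimal sketch, not the procedure used to prepare Fig. 2: the function names and the six-residue toy alignment are hypothetical, and only the counts 235/622 and 156/313 are taken from the text. It counts positions in a query sequence that are matched by at least one of the other aligned sequences, which is the "Crithidia and/or Leishmania" criterion.

```python
def identical_positions(query, *others, gap="-"):
    """Count query residues matched by at least one other aligned sequence.

    All sequences must already be aligned to the same length.
    Returns (number of matched residues, number of non-gap query residues).
    """
    if any(len(seq) != len(query) for seq in others):
        raise ValueError("aligned sequences must have equal length")
    total = sum(1 for residue in query if residue != gap)
    hits = sum(
        1
        for i, residue in enumerate(query)
        if residue != gap and any(seq[i] == residue for seq in others)
    )
    return hits, total


def percent_identity(identical, total):
    """Express an identity count as a percentage."""
    return 100.0 * identical / total


# Hypothetical toy alignment (not real GP63 sequence).
toy_hits, toy_total = identical_positions("MKTLHE", "MRTAHE", "MKSAHD")

# Counts reported in the text for the trypanosome GP63.
full_length = percent_identity(235, 622)    # about 38%
middle_region = percent_identity(156, 313)  # about 50%
```

In the toy alignment, five of the six query residues are matched by one or both of the other sequences.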
In the trypanosome protein, the longest stretch of amino acid identity with the Crithidia and/or Leishmania GP63s occurs at the 9 residues between positions 225-233, i.e. HEIAHALGF, a region known to be at the catalytic site of the zinc-protease activity of Leishmania GP63 (large open box in Fig. 2). This region contains 2 histidines and 1 glutamic acid that are conserved among all three sequences and have been shown to be essential for the protease activity (44). Downstream of this protease site are several other 4- and 5-amino acid segments in the trypanosome protein that occur in the Crithidia and/or Leishmania GP63s. Less well conserved in the trypanosome protein is a sequence thought to be involved in macrophage binding, i.e. SRYD at positions 250-253 in the Leishmania sequence shown in Fig. 2 (11). Because African trypanosomes are not known to interact directly with macrophages, it is not surprising that this sequence is not conserved.
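The conserved nonapeptide HEIAHALGF contains the HEXXH metalloprotease motif, in which X is any residue. As an illustration only, a motif of this form can be located in a protein sequence with a regular expression. In the sketch below, the function name `find_hexxh` and the `offset` parameter are hypothetical conveniences; the only sequence data used are the nine residues and the starting position 225 reported above.

```python
import re

# HEXXH: His, Glu, any two residues, His. The lookahead lets
# overlapping occurrences be reported as separate hits.
HEXXH = re.compile(r"(?=(HE[A-Z]{2}H))")


def find_hexxh(sequence, offset=1):
    """Return (position, matched residues) for each HEXXH occurrence.

    `offset` is the residue number of the first character of
    `sequence`, so fragments can be reported in full-length numbering.
    """
    return [
        (match.start() + offset, match.group(1))
        for match in HEXXH.finditer(sequence.upper())
    ]


# The nine conserved residues reported at positions 225-233.
catalytic_fragment = "HEIAHALGF"
motif_hits = find_hexxh(catalytic_fragment, offset=225)
```

Applied to the fragment, the search reports a single hit, HEIAH, beginning at residue 225.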
In general, the Crithidia and Leishmania GP63s are more similar to each other than either is to the trypanosome sequence, an observation expected from the fact that in phylogenetic trees deduced from sequences of cytochrome c or mitochondrial DNA, African trypanosomes are the most divergent of the three organisms (45,46). The least similarity of the trypanosome protein with the Crithidia and/or Leishmania GP63s occurs in the first 50 and last 50 residues of the trypanosome sequence. However, near their N and C termini, all three proteins are notably hydrophobic. The first 50 residues of the trypanosome protein possess the hallmarks of a prepropeptide, as do the Crithidia and Leishmania GP63s. The potential N-terminal signal (pre)peptide in the trypanosome protein has a positively charged arginine at position 8, followed by a number of hydrophobic residues before the next charged residue, an aspartic acid at position 22. The potential propeptide segment of the trypanosome protein contains an Arg-Cys dipeptide thought to complex with zinc in the catalytic site of the latent form of Leishmania major GP63 (7,47).
At its C terminus, the trypanosome protein terminates with a hydrophobic tail whose sequence is consistent with the possibility that this tail is replaced with a GPI anchor similar to most GP63s of Crithidia and Leishmania (48). Immediately upstream of this hydrophobic tail is a very acidic, serine-rich region (overline in Fig. 2) that is less prominent in the corresponding Crithidia and Leishmania proteins shown in Fig. 2. However, other Leishmania GP63s not shown have similar acidic, serine-rich regions in the same location, suggesting that this motif may play a role in the functions of some, but not all, GP63 variants.
Thus, the variety of amino acid similarities of this African trypanosome protein with Leishmania and Crithidia GP63s indicates that it shares structural properties, and likely protease activity, with these other GP63s. On the basis of these amino acid similarities, and for the purpose of brevity, we will henceforth refer to this African trypanosome protein as GP63.
Characterization of Genes in T. brucei Encoding GP63-Preliminary characterizations of several T. brucei genomic DNA clones that hybridized to the GSS indicated that all but one contained either one or two GP63 genes. However, one clone with a 14-kb insert was found to contain three adjacent GP63 genes and was chosen for further characterization (Fig. 1, panel B). BamHI fragments of 4.5 and 1.2 kb were subcloned from this insert and their complete sequences determined. The phage DNA itself was used as a template for PCR amplifications and further sequence determinations to verify that the 4.5-kb fragment is followed by the 1.2-kb fragment and to determine another 0.3 kb of downstream sequence. This 6.0 kb of sequence is denoted by the black region in panel B. The 6.0-kb sequence, along with additional Southern blots not shown, confirms the presence of three adjacent GP63 genes in this region and demonstrates that they do not share an identical sequence with each other or with the cDNA. Thus, the genome of this T. brucei clone contains a minimum of four GP63 genes, i.e. genes 1-4 indicated in Fig. 1, of which gene 1 gave rise to the cDNA sequence. The three genes in the cloned genomic DNA segment shown in Fig. 1B are separated by intergenic regions of 124 and 1335 bp if the 5′- and 3′-UTRs of genes 2 and 3 are assumed to be about the same length as those of gene 1 (whose UTR lengths are known from the cDNA sequence). A sequence very similar to the 124-bp intergenic region occurs at the 3′ end of the 1335-bp intergenic region (results not shown), suggesting that this region might be the minimum segment necessary for precursor RNA processing events if the three genes are transcribed into a polycistronic precursor RNA, as are most tandemly repeated genes in Trypanosomatids (49).
Panel C of Fig. 1 shows a Southern blot of trypanosome genomic DNA digested with several restriction enzymes and probed with the GP63 cDNA shown in panel A. This probe hybridizes to seven BamHI fragments, four of which are derived from the genomic region shown in panel B, i.e., the 4.5- and 1.2-kb fragments (arrows in panel C) and two flanking BamHI fragments of unknown length. One of the additional BamHI fragments must contain GP63 gene 1 (whose cDNA does not have an internal BamHI site). The other two BamHI fragments likely contain at least one more GP63 gene, but no attempt was made to determine unambiguously the exact number of GP63 genes per haploid genome. All of these genes are likely clustered on one or two chromosomes inasmuch as they appear to be confined to two large EcoRI fragments (lane E of panel C). The multiple number of BglII, EcoRV, and PstI fragments hybridizing to the cDNA probe is consistent with the many polymorphisms observed in the four GP63 genes for which complete or partial sequence was obtained. Fig. 3 shows a comparison of the GP63s encoded by gene 1 (as deduced from the cDNA) and gene 3, the two trypanosome GP63 genes for which the complete coding sequence was determined. Amino acid substitutions, and occasional small deletions/additions, are scattered throughout most of the alignment. The largest difference occurs just before the C-terminal hydrophobic tails, where GP63-1, but not GP63-3, contains the highly acidic, serine-rich region (overline in Fig. 3) mentioned above. This region in GP63-3 contains some serines, but it is neither as long nor as acidic as in GP63-1, and it actually resembles the corresponding region in Crithidia GP63 (Fig. 2) more than GP63-1. The putative protease zinc-binding/catalytic site (open box) is the same in GP63-1 and -3. In contrast to the scattered amino acid replacements between these two T. brucei GP63s, the amino acid differences among the multiple GP63s of Leishmania chagasi are confined to four or five specific regions (50). The significance of this different distribution in amino acid changes among the GP63s of these two organisms is not known, but it could reflect differing functions of GP63s in the different organisms.
Steady State GP63 mRNA Levels in T. brucei-Fig. 4 shows Northern blots of RNAs from bloodstream and procyclic trypanosomes probed with the trypanosome GP63 cDNA and, as control hybridizations, with cDNAs encoding the expressed VSG, procyclin, and tubulin. The GP63 cDNA hybridizes strongly to a 3-kb RNA in bloodstream trypanosomes and only weakly to the same sized RNA in procyclic trypanosomes. The 3-kb size is consistent with the 2786 nucleotides in the GP63-1 cDNA and with the sizes of GP63 mRNAs in Leishmania and Crithidia (24,51). As expected, the VSG and procyclin cDNAs hybridize predominantly to the bloodstream and procyclic RNAs, respectively, and the tubulin cDNA hybridizes about equally to both RNAs, indicating that similar amounts of RNA were added to each lane. The ratio of the GP63 signals in the bloodstream versus procyclic RNA lanes was determined by electronic autoradiography to be about 50:1. The ratio of the GP63 and VSG signals in bloodstream RNA was not determined because the two radioactive 32P-probes may have had differing specific activities. However, in a duplicate screening of the T. brucei bloodstream MVAT4 cDNA library, about 1 GP63 cDNA clone was detected for every 150 VSG cDNA clones. Thus, if the ratio of the two cDNA species in this amplified cDNA library reflects the levels of their corresponding mRNAs, the ratio of GP63 mRNA to VSG mRNA in bloodstream trypanosomes is about 1:150. Although this ratio of mRNA levels cannot be extrapolated to an estimate of the number of GP63 proteins in bloodstream trypanosomes, it may still be worth noting that 1/150th of the 10^7 VSGs on the surface of a bloodstream trypanosome (52) is 67,000, or about 13% of the maximum of 500,000 GP63 molecules on the surface of infective promastigotes of some Leishmania species (7).
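The back-of-envelope comparison above can be written out explicitly. This is a sketch of the arithmetic only, using the three figures quoted in the text; the variable names are ours, and the calculation rests on the loudly stated assumption that the 1:150 cDNA clone ratio carries over to surface protein numbers, which the data do not establish.

```python
# Figures quoted in the text.
VSG_MOLECULES_PER_CELL = 10**7   # VSGs on one bloodstream trypanosome (52)
GP63_TO_VSG_RATIO = 1 / 150      # GP63:VSG cDNA clone ratio in the library
LEISHMANIA_GP63_MAX = 500_000    # maximum GP63s per infective promastigote (7)

# ASSUMPTION: mRNA (cDNA clone) ratio equals the protein ratio.
implied_gp63_per_cell = VSG_MOLECULES_PER_CELL * GP63_TO_VSG_RATIO

# Implied trypanosome GP63 number as a fraction of the Leishmania maximum.
fraction_of_leishmania_max = implied_gp63_per_cell / LEISHMANIA_GP63_MAX

print(f"implied GP63 per cell: {implied_gp63_per_cell:,.0f}")
print(f"fraction of Leishmania maximum: {fraction_of_leishmania_max:.2f}")
```

Under that assumption the implied value is close to 67,000 molecules per cell, a little over one-eighth of the Leishmania maximum.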
Transcription of the GP63 Genes in T. brucei-Nuclear run-on transcription experiments have been used to demonstrate that the active telomere-linked VSG gene is transcribed in bloodstream trypanosomes, but not in procyclic trypanosomes (29,32). In contrast, many tandemly repeated genes in Trypanosomatids, including GP63 genes in Leishmania, are transcribed at all stages of the organisms' life cycles, and their steady state mRNA levels are determined by post-transcriptional regulatory events (reviewed in Refs. 29 and 53). To determine whether the bloodstream expression of the GP63 genes in African trypanosomes is regulated at the level of transcription (similar to VSG genes) or at a post-transcriptional step (similar to most other Trypanosomatid genes), we conducted similar nuclear run-on experiments. Radiolabeled run-on RNAs from nuclei of procyclic and bloodstream trypanosomes were used to probe the GP63 cDNA. Panel B of Fig. 5 shows the results when procyclic run-on RNA was used as the probe. As expected, procyclic RNA hybridizes to the procyclin and tubulin genes but not to the VSG gene. It also hybridizes with a strong signal to GP63 gene 1. Thus, GP63 RNA is synthesized in procyclic organisms, even though they contain very little mature GP63 mRNA detectable on Northern blots. Panels C and D of Fig. 5 show a similar experiment using run-on RNA from bloodstream MVAT4 nuclei in the absence or presence of α-amanitin. Transcription from the VSG and procyclin gene expression sites is known to be resistant to α-amanitin, suggesting that these expression sites are transcribed by RNA polymerase I or a modified RNA polymerase II (54,55). In the absence of α-amanitin (panel C), nascent RNA from bloodstream nuclei hybridizes strongly to the MVAT4 VSG gene and more weakly to the tubulin and GP63 genes. Thus, the single MVAT4 VSG gene is transcribed at a much higher rate than are the multiple GP63s, consistent with the 1:150 ratio of their respective cDNAs in the MVAT4 cDNA library. The trace-level hybridization to the procyclin gene observed here has been reported previously (56, 57). When α-amanitin was present during the bloodstream nuclei incubation (panel D), transcription of the VSG and procyclin genes was unaffected, but transcription of the genes for tubulin and GP63 was nearly abolished, indicating that the GP63 genes are transcribed by a conventional RNA polymerase II, similar to the tubulin genes, but distinct from VSG and procyclin genes. Although a weaker signal is obtained to the GP63 gene in bloodstream versus procyclic forms, electronic autoradiography revealed that the ratio of the GP63 and tubulin transcripts remains constant (results not shown), indicating that the levels of transcription of GP63 are the same in both developmental stages and that the differential steady state mRNA levels observed are due to post-transcriptional regulatory events.

DISCUSSION
Despite the numerous studies devoted to Leishmania GP63s since their discovery, the potential functions of this family of surface proteases and the identities of their substrates in vivo remain subjects of speculation (7,8). The identification of an expressed GP63-like gene in C. fasciculata (24), a trypanosomatid whose life cycle is confined to insects, suggests that one of the main roles of GP63 for Leishmania occurs during the parasite's insect stages. Supporting this possibility is the finding that another protozoan parasite of insects, Herpetomonas samuelpessoai, also has a GP63-like metalloprotease activity on its surface (58,59). However, several other studies suggest that GP63 is crucial for the intracellular existence of Leishmania in its mammalian host. These investigations indicate that GP63 participates in the attachment and entry of Leishmania into host macrophages and contributes to its survival within the macrophage (reviewed in Refs. 7, 8, and 44). All Leishmania species examined have at least six tandem copies of closely related genes encoding slightly different GP63 proteins, and some species have more than 20 tandem GP63 genes (7,42,60). The simplest interpretation of all these biochemical and molecular biology studies may be that the multiple GP63s serve a variety of functions throughout the Leishmania life cycle.
The identification of GP63-like genes in African trypanosomes leads to a further consideration of the potential functions of GP63, inasmuch as African trypanosomes are not known to interact with macrophages or to have a natural intracellular stage during the life cycle involving humans and the tsetse fly. Likewise, the expression of GP63 predominantly in bloodstream trypanosomes is counter to what one might expect, based on the detection of GP63s on the surface of the insect-specific protozoa, C. fasciculata and H. samuelpessoai. It is worth mentioning, however, that in experimental infections of rats and mice with Trypanosoma brucei brucei, intracellular parasites have been reported to occur in the epithelium (ependymal cells) that covers the choroid plexus and constitutes the local blood-brain barrier (61)(62)(63). Although, to our knowledge, the existence of an intracellular form of African trypanosomes has not been reconfirmed since the original reports in 1982-1986, this observation might be of significance in considering a potential role for GP63 when the parasite crosses the blood-brain barrier.
The nuclear run-on assays (Fig. 5) indicated that the GP63 genes are transcribed in both procyclic and bloodstream trypanosomes by an α-amanitin-sensitive RNA polymerase, in contrast to the expressed VSG gene, which is transcribed exclusively in bloodstream forms by an α-amanitin-resistant polymerase. However, Northern blots showed that the GP63 mRNAs accumulate to a detectable steady-state level only in bloodstream trypanosomes, as do VSG mRNAs. In the case of VSG mRNAs, their semiconserved 3′-UTRs are important in conferring the bloodstream stage specificity to the VSG mRNAs (64). A multiple alignment of the three 3′-UTRs of the sequenced trypanosome GP63 genes revealed a high level of homology in those regions. Our attempts to identify substantive sequence similarities in the 3′-UTR sequences of several expressed VSG genes and these three GP63 3′-UTRs were not particularly revealing; however, a sequence resemblance to a 14-mer element conserved in all VSG mRNAs was observed immediately upstream of the polyadenylation site of the GP63 genes. The presence of this 14-mer element has been shown to be necessary, but not sufficient, for VSG gene regulation (64).
Many questions remain about the location and differential expression of GP63s in African trypanosomes, and the results described here provide the basis from which to deduce their possible functions. The sequence conservation of the region known to be at the catalytic site of the zinc-protease activity and the complete conservation of 19 cysteines and 10 prolines (see Fig. 2) suggest that the GP63 protease activity does indeed contribute to the survival of the bloodstream stage of African trypanosomes. Although we have not yet shown that this trypanosome protein is a membrane-bound, proteolytically active homologue of Leishmania GP63, we anticipate that a demonstration of zinc-protease activity in the native or recombinant protein will be forthcoming. An attractive possible function for GP63 in African trypanosomes is to mediate their known resistance to complement-mediated lysis. Devine et al. (65) showed that, whereas bloodstream trypanosomes activate the alternative pathway of complement in human serum, they are not lysed because the complement cascade does not go beyond establishment of C3 convertase on the parasite surface. In their study, the trypanosomes displayed high levels of C3 deposition as well as factor B on their surface. Similar results demonstrating interrupted complement fixation on Leishmania promastigotes expressing high levels of GP63 were reported recently (18). A test of the complement resistance model in African trypanosomes will be to overexpress GP63 in bloodstream trypanosomes to see whether their resistance to complement-mediated lysis is increased, and to see whether procyclic trypanosomes engineered to express GP63 acquire resistance (procyclic T. brucei is susceptible to complement-mediated cytolysis; Ref. 66).
Finally, the amount of GP63 on the surface of several Leishmania species has been shown to increase as much as 11-fold when promastigotes develop in culture from logarithmic forms to a stationary infective "metacyclic" form (51,67). Thus, it is tempting to speculate that GP63 is also highly expressed in metacyclic African trypanosomes, the infective form of the parasite that first comes in contact with the immune system. Consistent with this possibility is a preliminary report of a 65-kDa membrane-bound metalloprotease in infective metacyclic trypomastigotes of Trypanosoma cruzi (68), an intracellular parasite that causes Chagas disease in Latin America and must also evade complement-mediated lysis before entering host cells.