Discovering the Bacterial Circular Proteins: Bacteriocins, Cyanobactins, and Pilins

Over recent years, several examples of natural ribosomally synthesized circular proteins and peptides from diverse organisms have been described. They are a group of proteins for which the precursors must be post-translationally modified to join the N and C termini with a peptide bond. This feature appears to confer a range of potential advantages because these proteins show increased resistance to proteases and higher thermodynamic stability, both of which improve their biological activity. They are produced by prokaryotic and eukaryotic organisms and show diverse biological activities, related mostly to a self-defense or competition mechanism of the producer organisms, with the only exception being the circular pilins. This minireview highlights ribosomally synthesized circular proteins produced by members of the domain Bacteria: circular bacteriocins, cyanobactins, and circular pilins. We pay special attention to the genetic organization of the biosynthetic machinery of these molecules, the role of circularization, and the differences in the possible circularization mechanisms.

Most proteins are synthesized as linear polymers, whose free ends are often flexible or poorly defined and are routinely targeted by exopeptidases that weaken the stability of the molecule. Over the past decades, a unique family of circular proteins/peptides with a well defined three-dimensional structure has been described. Their N- and C-terminal ends are joined by a conventional amide bond to form a circular backbone (1-4). Despite the efforts expended on this subject, the role of circularization in proteins remains unclear, although evidence indicates that it contributes to their stability and potency and provides a range of potential advantages (5-7). Some circular proteins/peptides have emerged as good therapeutic candidates due to their broad range of biological activities and potency (3,8,9). It bears noting that circular proteins have been discovered in bacteria, plants, fungi, and animals. Circular proteins from higher organisms are typically shorter and contain at least one disulfide bond, further bracing the structure and bolstering stability (10). The CyBase database describes and updates sequence information on these proteins (11). This minireview focuses on the circular proteins and peptides produced by bacteria.
During evolution, prokaryotic organisms have developed diverse strategies to adjust to their particular ecological niche, such as the production of pili (12,13), or to give them a competitive advantage over other inhabitants of the same environments, i.e. by producing antagonistic substances like bacteriocins or cytotoxic peptides. In contrast to antibiotics, bacteriocins constitute a family of ribosomally synthesized proteins with variable molecular weight, genetic origin, biochemical properties, and mode of action, capable of targeting bacteria either within the same species or from different genera. In general, these proteins have low eukaryotic toxicity (14). Bacteriocins produced by lactic acid bacteria are highly useful to the food industry for food safety (15). In addition, they can complement or, in certain cases, replace antibiotics and chemical preservatives, the use of which is increasingly being called into question (14,16). Moreover, some bacteriocins have demonstrated a remarkable therapeutic potential for the treatment of local and systemic bacterial infections (17,18).
A large family of bacterial cytotoxic modified cyclic peptides, ranging in size from 6 to 20 amino acids and collectively called cyanobactins, has been identified in species belonging to the phylum Cyanobacteria (19,20). They exhibit diverse biological effects and are a promising source of post-translational modifying enzymes to synthesize new products with potential applications (20).

Prokaryotic Circular Proteins
Competition-related Molecules: Circular Bacteriocins and Cyanobactins. The most common antagonistic mechanism among bacteria is the production of specific molecules, such as bacteriocins. The circular bacteriocins described so far are produced exclusively by Gram-positive bacteria from the phylum Firmicutes. To date, 10 different globular circular bacteriocins, ranging from 58 to 70 amino acids, have been reported (Table 1) (21-29). They are often cationic and amphiphilic molecules that kill bacterial cells by accumulation or insertion into the membrane, thereby causing increased permeability and loss of barrier functions. Circular bacteriocins differ from other smaller nonribosomal cyclic peptides, such as cyclosporin A and polymyxin B1, which bacteria and fungi produce using peptide synthetases (30).
Subtilosin A from Bacillus subtilis is a smaller anionic peptide (35 amino acids) that is extensively post-translationally modified by three covalent thioether bonds, besides the linkage between the N and C termini (Table 1) (31). For this reason, it is now considered the prototype of a new (sub)class of bacteriocins known as sactibiotics (32).
Bacteriocin production requires the coordinated expression of several genetic determinants involved in maturation (cleavage/circularization) and secretion via different transporter systems, as well as the immunity mechanisms to ensure self-protection (Fig. 1, panel 1). These gene clusters are chromosomally encoded, except those of gassericin A/reutericin 6 and acidocin B (21,27,34), garvicin ML (28), and AS-48 (1), which are plasmid-located. As an exception, the AS-48RJ variant is encoded on the chromosome (35). The clusters are organized either into polygenic operons transcribed from a promoter located upstream of the structural gene or into two quite compressed gene modules (Fig. 1, panel 1). For a detailed discussion of the genetic characteristics of these bacteriocins, see the recent reviews in Refs. 1, 2, and 4.
The major feature of these molecules is their circular structure. Despite their unquestionable interest, only the three-dimensional structures of AS-48 (9), carnocyclin A (33), and subtilosin A (36) are available (Fig. 2A). Additional secondary structure prediction and homology modeling analysis of circular bacteriocins indicate that even though they have low sequence identities, they are organized into a well defined three-dimensional structure (33,37). The longest of the circular bacteriocins (AS-48, circularin A, and uberolysin A) are organized into five α-helices, whereas the shortest ones (carnocyclin A, lactocyclicin Q, and leucocyclicin Q) contain just four amphipathic tightly packed α-helices connected by well defined loops, which encompass a compact hydrophobic core with the common architecture of the saposin fold (1,2,4,26). The head-to-tail union occurs within one α-helix, as has actually been confirmed for AS-48 and carnocyclin A. This may be a key factor in the maturation process and may be a determinant for protein folding (37). In contrast, subtilosin A contains several small amino acids that are distributed throughout the molecule, although as a consequence of its circular structure and sulfide bridges, the conformational space of the backbone is severely constrained. Thus, subtilosin A assumes a twisted bowl-like structure, with most side chains pointing toward the solvent (Fig. 2A) (36).
Most circular bacteriocins have an asymmetrical distribution of the positive charges, where basic residues are clustered in some helices, and a hydrophobic surface configures the rest of the molecule (Fig. 2, A and B). The positively charged residues are probably responsible for attracting polypeptides to the surface of the negatively charged bacterial membranes (Fig. 2C). Analysis of the AS-48 oligomeric structure in the crystals, however, suggests that its mechanism of action involves a transition from the water-soluble dimer form I (DF-I) to the membrane-bound dimer form II (DF-II) (Fig. 2B). It has been shown that AS-48 forms nonselective pores in the lipid bilayers, thereby allowing the free diffusion of ions and low molecular weight solutes across the membrane (9). By contrast, carnocyclin A does not form dimers in solution and forms anion-selective channels in the lipid bilayer in a voltage-dependent manner (33).
The group of cyanobactins, recently demonstrated to be ribosomally synthesized, encompasses >100 different backbone-cyclized peptides produced by a wide variety of cyanobacteria (19,20). They have a long, highly conserved leader peptide that is predicted to be helical and is assumed to participate in the targeting of the post-translational machinery (38). In addition, they undergo other post-translational modifications, such as heterocyclization or prenylation of amino acids and epimerization. Genome mining and heterologous expression studies have revealed that they are encoded in a cluster ~10 kb in size that contains between 7 and 12 genes. The cyanobactin gene clusters are exemplified here by that of patellamide A produced by Prochloron spp. (Fig. 1, panel 2). This cluster includes the structural gene patE, which is transcribed with patA and patG, involved in patellamide circularization. The cluster also encodes PatD, an enzyme thought to be involved in heterocycle formation, and hypothetical genes with an unassigned function (patB, patC, and patF). Similar organization is observed in anacyclamide from Anabaena sp. or trichamide from Trichodesmium erythraeum (Fig. 1, panel 2).
[Fig. 2 legend (partial); DF, dimer form: In DF-I, charged helices are exposed, whereas in DF-II, hydrophilic helices interact, exposing the hydrophobic core. C, proposed mechanism of action of AS-48. The electrostatic attraction guides DF-I to the membrane, in which the conditions promote the transition to the conformation DF-II, which is more stable in a hydrophobic environment.]
Structural Proteins: TrbC and T-pilins. Pili are proteinaceous appendages present in Gram-negative bacteria (phylum Proteobacteria). They perform several functions, including facilitating pathogenesis, contributing to adhesion to specific receptors, providing receptors for bacteriophage, and providing limited cellular locomotion. In addition, they are essential to establish contact between cells during conjugation and as components for the type IV secretion channel (39-41). Circular pilins are protein subunits that are assembled into a pilus protruding from the surface of Escherichia coli or Agrobacterium tumefaciens containing the RP4 or Ti plasmid (TrbC RP4 and VirB2 Ti) (Table 1). Despite being very similar in function and size (78 versus 74 amino acids), TrbC and T-pilin do not have a high degree of sequence similarity, although conserved Gly residues are dispersed along both polypeptides.
VirB2 (121 residues) is the major pilin subunit of the A. tumefaciens VirB/VirD4 T-pilus and an essential component of the secretion channel (40,41). The VirB2 pilin is processed by cleavage of an unusually long leader peptide (47 amino acids) to produce the T-pilin, whereas the TrbC pilin (145 residues) undergoes multiple processing for maturation (see below). The T-pilus is composed of T-pilin and VirB5 subunits (39,40). The resulting pilins are assembled into flexuous filaments projecting from the bacterial cell wall via expression of a large set of genes (41). T-pili are generated when A. tumefaciens cells are naturally induced by plant phenolic compounds or induced in vitro, which leads to the expression of the virB operon (11 genes) located on the resident Ti plasmid (39).
The RP4 pili are fragile filamentous structures rarely observed as discernible pili on the surface of RP4-bearing bacteria (~1 in 50 cells). These cells contain the 11 genes of the mating pair formation system necessary for pilus biogenesis: one gene for DNA metabolism (tra1) and 10 tra2 genes of the RP4 plasmid (12,41).

Maturation of Circular Prokaryotic Proteins: A Matter for Review
The biosynthesis of circular proteins involves the combined action of diverse enzymes, many of which are assumed to be encoded in the same gene cluster. However, in some cases, the expression of mature proteins seems to require the existence of host chromosomal determinants (40,43). A common feature among bacterial circular proteins is that they are derived from precursors with an N-terminal leader peptide of variable length. These leaders are unusually short for subtilosin A, uberolysin A, carnocyclin A, garvicin ML, circularin A, lactocyclicin Q, and leucocyclicin Q (eight, six, four, three, three, two, and two amino acids, respectively), compared with the extended leader peptides for AS-48, acidocin B, gassericin A/reutericin 6, butyrivibriocin AR10, and T-pilins (Table 1) and cyanobactins (20). Leader sequences containing information specifying the choice of the targeting pathway, translocation efficiency, cleavage timing, and even post-cleavage functions have been proposed (44). Recently, Oman and van der Donk (45) have reviewed the different roles of the leader peptides. The most common function is that of a secretion signal, but these leaders have also been postulated to act as a recognition signal for the post-translational modifying enzymes or as a cis-acting chaperone that actively assists during the post-translational modification process. However, the comparative analysis of the leaders of the circular bacteriocins reveals remarkable differences in length and sequence, as well as the absence of conserved motifs among them (Table 1), which makes it difficult to define a consensus sequence for their cleavage sites. On the whole, the removal of the leader peptide and the covalent union of the N/C-terminal residues are required to form the active/functional mature circular protein (Fig. 1) (1).
Circularization with a C-terminal Signal. Maturation of the TrbC RP4 precursor involves a multistep process with at least three components (Fig. 3A). TrbC prepilin is post-translationally truncated at the C terminus (27 residues) by an unidentified chromosomal protease, followed by the removal of the signal peptide (36 residues) by the host-encoded signal peptidase LepB. Finally, the inner membrane-associated IncP TraF replaces a four-amino acid C-terminal peptide (AEIA) with the truncated N terminus (46).
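The residue counts quoted for the two pilins can be cross-checked with simple arithmetic. The following Python sketch is only an illustration: the function name `mature_length` is hypothetical, and the numbers are those given in the text (145-residue TrbC prepilin; 121-residue VirB2 with a 47-residue leader). It assumes that the listed segments do not overlap.

```python
def mature_length(precursor_length, *removed_segments):
    """Residues left after removing non-overlapping segments (illustrative helper)."""
    remaining = precursor_length - sum(removed_segments)
    if remaining <= 0:
        raise ValueError("removed segments exceed the precursor length")
    return remaining

# TrbC: 27 C-terminal residues, a 36-residue signal peptide, and the
# 4-residue AEIA peptide are removed from the 145-residue prepilin.
trbc_mature = mature_length(145, 27, 36, 4)

# VirB2: only the 47-residue leader is removed from the 121-residue precursor.
t_pilin = mature_length(121, 47)

print(trbc_mature, t_pilin)  # 78 74
```

Both results agree with the sizes of the mature pilins stated earlier (78 versus 74 amino acids).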
[Fig. 3 legend (partial): A, TrbC pilin. An unknown C-terminal host protease (CTP) truncates part of the C terminus (orange). The propeptide is then directed to the membrane, where LepB releases the leader peptide (red) and inserts the propilin (green) with the C-terminal signal (blue) still attached. The serine protease TraF cleaves the C-terminal signal and catalyzes the cyclization. B, cyanobactins. The prepeptide, containing diverse cyanobactin sequences, is enzymatically modified by PatD or other protein(s) with unassigned function encoded in the gene cluster (Pat?). PatA then releases the leader peptide (red) and the propeptides (green or pink) with the C-terminal signal (blue) attached. PatG cuts off the C-terminal signal and catalyzes the cyclization. C, circular bacteriocins and T-pilin. Cleavage of the leader peptide (red) by an unknown leader peptidase (LPase) releases the proprotein (green), which could require a C-terminal activation step (star) before a cyclase produces the circular protein.]
In cyanobactins, the structural gene patE encodes a prepeptide with a 37-mer N-terminal leader sequence and two core peptides called cassettes I (VTACITFC) and II (ITVCISVC). Both cassettes are flanked by N- and C-terminal recognition sequences consisting of G(L/V)E(A/P)S and AYDG(E), respectively. The leader peptide shows a hydrophobic surface that has been proposed to be the site of the initial binding with modifying enzymes (38). The serine protease PatA plays a role in the cleavage of the prepeptide at the N termini of cassettes I and II (Fig. 3B). The other subtilisin-like serine protease, PatG, in addition to removing the C-terminal recognition sequences of both cassettes, also catalyzes macrocyclization. In the proposed chemical mechanism, a serine residue at the active site of PatG and the carboxyl group of the C-terminal amino acid in the core peptide form a covalent acyl-enzyme intermediate, which then undergoes nucleophilic attack by the N-terminal amino group on the activated acyl moiety. Subsequently, the cyclic peptide is released from the enzyme, and cassettes I and II are converted into mature patellamides C and A, respectively (20). The formation of the new peptide bonds between the N and C termini may use energy released upon removal of the C-terminal signal catalyzed by the protease itself. Similarly, cyclotides contain an N-terminal leader peptide and diverse domains, namely an N-terminal repeat, the cyclotide sequence, and a C-terminal signal (GLP) (3). The main difference is that the enzyme involved in cyclotide maturation is a cysteine protease (asparaginyl endopeptidase) not encoded in the cyclotide cluster, whereas in pilins and cyanobactins, the analogous enzyme is a serine protease located adjacent to the structural gene (TraF and PatG, respectively) (Fig. 3). In addition, the C-terminal signal and the first three residues of the cyclotides are the same, thus fitting into the enzyme active site, whereas in cyanobactins, they are unrelated, allowing for a putatively greater tolerance of the cyclase for different substrate sequences (Fig. 3).
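The flanking recognition sequences described above lend themselves to a simple pattern search. The Python sketch below locates the core peptides in a toy prepeptide; it is a minimal illustration, not the real PatE sequence. The leader is a placeholder of the stated length, the order and spacing of the elements are assumptions, and the names `TOY_PREPEPTIDE`, `CASSETTE`, and `extract_cores` are hypothetical. Only the motifs G(L/V)E(A/P)S and AYDG(E), the two core sequences, and the cassette-to-product assignment come from the text. Heterocyclization and other modifications are ignored.

```python
import re

# Placeholder 37-residue leader (NOT the real PatE leader sequence).
PLACEHOLDER_LEADER = "M" + "X" * 36

# Toy prepeptide: leader, then each core flanked by the recognition motifs.
TOY_PREPEPTIDE = (
    PLACEHOLDER_LEADER
    + "GLEAS" + "VTACITFC" + "AYDGE"   # cassette I
    + "GVEPS" + "ITVCISVC" + "AYDGE"   # cassette II
)

# N-terminal signal G(L/V)E(A/P)S, lazily matched core, C-terminal signal AYDG(E).
CASSETTE = re.compile(r"G[LV]E[AP]S(?P<core>[A-Z]+?)AYDGE?")

def extract_cores(prepeptide):
    """Return the core peptides found between the two recognition motifs."""
    return [match.group("core") for match in CASSETTE.finditer(prepeptide)]

cores = extract_cores(TOY_PREPEPTIDE)

# Product assignment as stated in the text (cassette I -> C, cassette II -> A).
products = dict(zip(cores, ["patellamide C", "patellamide A"]))
print(products)
```

In this picture, the cut made by PatA corresponds to the boundary after the N-terminal motif, and the cut made by PatG corresponds to the boundary before the C-terminal motif.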
Circularization without a C-terminal Signal. In the case of circular bacteriocins and the VirB2 protein, the C terminus is not truncated during maturation (Fig. 3C). Both preproteins are processed by leader peptide cleavage, presumably during the translocation step, followed by the covalent linkage of the N and C termini (47).
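One consequence of head-to-tail linkage is that the mature sequence no longer has a defined start, so any rotation of the sequence describes the same molecule. The Python sketch below illustrates this with a made-up precursor (a 3-residue leader plus an 8-residue propeptide, not a real bacteriocin); all names are hypothetical. A circular sequence is represented here by its lexicographically smallest rotation, which is one convenient convention among several.

```python
def remove_leader(precursor, leader_length):
    """Drop the N-terminal leader and return the linear propeptide."""
    return precursor[leader_length:]

def circular_key(sequence):
    """Canonical form of a circular sequence: its smallest rotation."""
    return min(sequence[i:] + sequence[:i] for i in range(len(sequence)))

def same_circle(a, b):
    """True if a and b are rotations of each other (same circular backbone)."""
    return len(a) == len(b) and circular_key(a) == circular_key(b)

# Hypothetical toy precursor: leader "MKE" followed by the propeptide.
propeptide = remove_leader("MKE" + "LAGWTVIA", 3)

print(same_circle(propeptide, "TVIALAGW"))  # True: a rotation of the same circle
print(same_circle(propeptide, "AIVTWGAL"))  # False: the reversed sequence differs
```

The reversed sequence is not equivalent because the peptide backbone is directional.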
The lack of homology in the residues involved in the processing of the leader peptide, together with the high sequence variability in the surrounding residues, points to the existence of more than one circularization mechanism. However, the comparative analysis of the sequences shows homology between the last residues located in the C-terminal region. Thus, an aromatic residue (Trp or Tyr) is found in ~50% of the linear propeptides, either as the last amino acid (AS-48, lactocyclicin Q, uberolysin A, and circularin A) or at a subterminal position (subtilosin A) (Table 1). In butyrivibriocin AR10, garvicin ML, gassericin A, and acidocin B, circularization occurs via a conserved Ala (C terminus), thus suggesting the importance of these C-terminal residues in circularization (37).
The residues directly involved in maturation provide clues concerning the processing steps leading to circular proteins. Site-directed mutational experiments were performed on AS-48 (48) in which single amino acid replacements were specifically introduced into the recognition site for the leader peptidase (His-1 and Met1) and into those involved in circularization (Met1 and Trp70). In the W70A mutant, linear derivatives coexist with circular forms, a fact that has never been described in the wild-type strain. This demonstrates that the introduced change has no effect on the leader peptide cleavage and points to the requirement of a separate biosynthetic enzyme/domain once the non-mature pro-AS-48 is excised from its precursor. The enzyme/domain involved in the circularization reaction may require the proximity of an -NH2 group at the N terminus of the proprotein. This group would act as a nucleophile in the circularization reaction with Trp70, a residue that is specific but not essential, because circular forms are still produced in the W70A mutant (48). In fact, it has been proposed that the hydrophobic environment that surrounds the curl connecting the four or five α-helices into which these molecules are organized may be crucial. In the mutational study on AS-48, it was also confirmed that His-1 has a key part in the cleavage reaction, as no AS-48 molecules were detected in the H-1I mutant. On the other hand, Met1 is critical to the correct processing, as shown by the lowered circularization efficiency of M1A mutants (37,48). Similarly, it has been demonstrated that the specificity of the TrbC cyclization reaction resides mainly in Gly112-Ile117 (46).
A recent study on subtilosin A has revealed that the radical S-adenosylmethionine enzyme AlbA, encoded by the sbo-alb operon, is responsible for the thioether bond formation in a reaction that is leader peptide-dependent. This leader is later cleaved off by a putative protease (either AlbE or AlbF). In the last step, the peptide backbone is circularized by one of the two proteases, and the resulting subtilosin A is subsequently exported by the putative ATP-binding cassette transporter AlbC (49).

Conclusions
The ribosomal origin of the bacterial circular proteins enables strategies to modify the peptide sequence and to create variants with altered biological and physicochemical properties. It is noteworthy that, compared with their linear counterparts, they are not only more stable but also more active (10,37). This fact has also been shown with other cyclized peptides using lanthionine rings (45), but no clear relationship has been established between circularization and increased potency. Thus, the interest in bacterial circular proteins is dual: on one hand, the study of their bioactivity (especially for medical application) and, on the other hand, the elucidation of the circularization mechanism for biotechnological application of the enzyme(s) responsible for these modifications. However, there are still several unanswered questions concerning the mechanism of action of some of these molecules, and studies involving their medical applicability are not yet available.
Analysis of the maturation of circular bacteriocins and T-pilins supports the existence of different circularization mechanisms, in which the enzymes joining the N and C termini would require some energy or activation of the propeptides due to the absence of a C-terminal signal sequence. In addition, the data suggest that the leader peptide plays a major role in the correct processing of the propeptides, probably assisting in folding of the precursors (50). All in all, the great potential of the leader peptide directing the post-translational modifications offers exciting insights into basic biological processes and provides opportunities to exploit these striking expression systems to produce new natural products for therapeutic use.