Attenuated CagA Oncoprotein in Helicobacter pylori from Amerindians in Peruvian Amazon*

Population genetic analyses of bacterial genes whose products interact with host tissues can give new understanding of infection and disease processes. Here we show that strains of the genetically diverse gastric pathogen Helicobacter pylori from Amerindians from the remote Peruvian Amazon contain novel alleles of cagA, a major virulence gene, and reveal distinctive properties of their encoded CagA proteins. CagA is injected into the gastric epithelium where it hijacks pleiotropic signaling pathways, helps Hp exploit its special gastric mucosal niche, and affects the risk that infection will result in overt gastroduodenal diseases including gastric cancer. The Amerindian CagA proteins contain unusual but functional tyrosine phosphorylation motifs and attenuated CRPIA motifs, which affect gastric epithelial proliferation, inflammation, and bacterial pathogenesis. Amerindian CagA proteins induced less production of IL-8 and cancer-associated Mucin 2 than did those of prototype Western or East Asian strains and behaved as dominant negative inhibitors of action of prototype CagA during mixed infection of Mongolian gerbils. We suggest that Amerindian cagA is of relatively low virulence, that this may have been selected in ancestral strains during infection of the people who migrated from Asia into the Americas many thousands of years ago, and that such attenuated CagA proteins could be useful therapeutically.

Population and molecular genetic studies of genes of bacterial pathogens whose products interact with host tissues can give important insights into evolutionary forces, enhance understanding of infection and disease processes, and potentially lead to improvements in human health. Helicobacter pylori (Hp) chronically infects the gastric epithelial surface and a narrow band of overlying mucus in billions of people worldwide (1-3). Most Hp infections begin in infancy and can last for decades despite features such as inflammatory responses, mucosal shedding, and peristalsis that make the gastric mucosal niche hostile to nearly all other microbes. Hp constitutes a major risk factor for several gastroduodenal diseases including peptic ulcer and gastric cancer even though most Hp infections are asymptomatic.
DNA sequence analysis of representative housekeeping genes (multilocus sequence typing) has shown that Hp is extremely diverse genetically and that different genotypes predominate in different geographic regions or human populations (4-7). Hp prevalence and multilocus sequence typing studies have indicated that Hp transmission occurs preferentially within families and local communities, not in sweeping epidemics. Such localized transmission fosters accumulation of genetic diversity by genetic drift and selection for adaptation to local conditions including differences in human physiology. Multilocus sequence typing analyses have indicated that Hp has probably infected humans for eons and was carried by the people who migrated into Eurasia from Africa some 60,000 years ago and by the ancestors of modern Amerindians who migrated from Asia into Beringia and then into the Americas some 20,000 years ago (4-7). Geographic differences among alleles of several genes whose products interact with host tissues (such as cagA, vacA, and babA2) are even greater than those of housekeeping genes (4-6, 8). Much of this extra diversity may stem from differences in selective forces operating in different human populations and geographic regions (9,10).
Recent epidemiological studies revealed that some Hp strains from Amerindians in the Amazon tend to be of a novel genotype, distinct from those of Western and East Asian strains (9, 24-27). Here we describe novel alleles of the cagA gene in strains from the remote Peruvian Amazon and the distinctive properties of the CagA proteins they encode. These Amerindian strains are unusual in causing only attenuated cell proliferation and inflammation responses, resulting in lower virulence and decreased risk of severe pathologies including gastric cancer in both in vitro and in vivo infection models. Much of this attenuation was traced to the low potency of their CRPIA motifs. Furthermore, the attenuated Amerindian cagA alleles diminished the strongly proinflammatory response provoked by more pathogenic strains during mixed infection of gerbils. We suggest that this CagA attenuation may have been selected during ancestral human migrations and that it might be useful therapeutically.

EXPERIMENTAL PROCEDURES
Clinical Specimens-Hp strains from the Machiguenga-speaking residents of Shimaa village in the remote Peruvian Amazon were cultured under standard microaerobic conditions with informed consent in protocols approved by local and international human studies committees (25). Hp strains from separate Ashaninka-speaking Amazonian villages in the province of Satipo were cultured under the same protocols. Gastric biopsies were fixed in pH 7.2 buffered formalin, paraffin-embedded, sectioned, hematoxylin/eosin-stained, and graded histologically using the Sydney System (25). Strains from Amerindian Lima shantytown residents were cultured similarly. All Peruvian strains used in these analyses were obtained with informed consent. Other strains used here were from the collections of the Sasakawa, Berg, Kamiya, and Zou laboratories. The cagA sequences were determined by PCR amplification of the PY region from Hp genomic DNAs with the CagTF and CagTR primers (28) and then submitted to the DDBJ/EMBL/GenBank databases (accession nos. AB587140-AB587258). Although some Amerindian isolates such as Shi470 have a second, apparently defective copy of cagA inserted between cag14 and cag15 in the middle of the cag pathogenicity island (PAI) in addition to cagA in the usual location at the right end of the cag PAI (27), our PCR primers were designed only to detect, clone, and analyze the conventionally placed cagA gene.
Sequence and Bioinformatic Analysis-Sequence analysis was performed with BLAST homology search programs (www.ncbi.nlm.nih.gov) and genome sequence databases. Multiple sequence alignments and unrooted neighbor-joining trees were constructed with ClustalW. Sequence logo analysis was performed with WebLogo 3.
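As a rough illustration of what a sequence logo summarizes, the following sketch computes the per-column information content (in bits) of a small alignment. The three five-residue sequences are the canonical EPIYA motif and the two Amerindian "EPIYA-B" variants reported in RESULTS. Treating them as an equally weighted alignment, assuming a 20-letter amino acid alphabet, and omitting the small-sample correction that logo software typically applies are all simplifying assumptions, so the values are not expected to match WebLogo 3 output.

```python
import math
from collections import Counter


def column_information(seqs, alphabet_size=20):
    """Return the information content (bits) of each alignment column.

    Information = log2(alphabet_size) - Shannon entropy of the column.
    No small-sample correction is applied (an assumption of this sketch).
    """
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    max_bits = math.log2(alphabet_size)
    info = []
    for column in zip(*seqs):
        counts = Counter(column)
        total = len(column)
        entropy = -sum((n / total) * math.log2(n / total)
                       for n in counts.values())
        info.append(max_bits - entropy)
    return info


# Canonical EPIYA plus the AM-I (ESIYT) and AM-II (GSIYD) variants.
motifs = ["EPIYA", "ESIYT", "GSIYD"]
bits = column_information(motifs)
for position, value in enumerate(bits, start=1):
    print(f"position {position}: {value:.2f} bits")
```

In this toy alignment the invariant I and Y columns carry the maximum information, which mirrors the observation that the phosphorylatable tyrosine is retained in the altered motifs.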
Hp Infection in Cell Culture-Reference Hp strains such as 26695 and ATCC43504 (also often referred to as NCTC11637) and their isogenic ΔcagA and ΔvirD4 mutants were previously described (15,16,29). ATCC43504 ΔcagA derivatives containing chromosomal FLAG-tagged cagA (W, W/EA, W/AM-I, or W/AM-II) genes were constructed for this study. Briefly, FLAG-tagged cagA genes were ligated with kanamycin resistance genes as part of the same transcription unit, and then these fragments were ligated between 1000-bp segments that flank the 5′ and 3′ ends of the ATCC43504 cagA gene. These DNA constructs were then used for transformation of ΔcagA strains with selection for kanamycin resistance and thereby chromosomal placement of our engineered cagA genes. Hp strains were cultured according to standard procedures (15,16). AGS, HEK293, and 293T cells were maintained in Dulbecco's modified Eagle's medium (Sigma) containing 10% fetal bovine serum. Cultured cells were infected with Hp at a multiplicity of infection of 100.
Hp Infection in Cell Culture and Animals-Gerbil infection experiments were approved by the University of Tokyo ethics committee and carried out as described (15,29). Briefly, 6-week-old male MGS/Sea Mongolian gerbils (CLEA Japan Inc.) were intragastrically inoculated with 10^9 colony-forming units (CFU) of Hp strain ATCC43504 or its derivatives. After 8 weeks, the stomachs were examined to determine bacterial load and analyzed by immunohistology. A competition assay was performed by co-inoculating wild-type ATCC43504 and derivatives with engineered cagA alleles (10^9 CFU of each strain). The competitive index was calculated as the ratio of wild-type strain CFU to mutant strain CFU. The two genotypes were distinguished by colony PCR of PY regions in cagA. If no CFU were recovered from the stomach, a CFU count of one at the lowest dilution was assigned. Data were analyzed statistically using a Mann-Whitney U test for unpaired groups.
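The competitive index calculation described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function names, the `detection_limit` parameter (the CFU value that corresponds to one colony at the lowest dilution plated), and the example counts are all hypothetical.

```python
def effective_cfu(cfu, detection_limit):
    """Return the CFU value used in the ratio.

    If no colonies were recovered, substitute the value equivalent to one
    colony at the lowest dilution (detection_limit), as the text describes,
    so that the ratio is always defined.
    """
    return cfu if cfu > 0 else detection_limit


def competitive_index(wild_type_cfu, mutant_cfu, detection_limit=1.0):
    """Ratio of wild-type CFU to mutant CFU from one co-infected animal."""
    wt = effective_cfu(wild_type_cfu, detection_limit)
    mut = effective_cfu(mutant_cfu, detection_limit)
    return wt / mut


# Hypothetical counts from three co-infected gerbils: (wild type, mutant).
animals = [(1000.0, 10.0), (500.0, 0.0), (2000.0, 2000.0)]
indices = [competitive_index(wt, mut, detection_limit=10.0)
           for wt, mut in animals]
print(indices)
```

An index above 1 indicates that the wild-type strain outnumbered the engineered strain in that animal; an index near 1 indicates no competitive difference.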
Immunoprecipitation, Immunoblotting, and Immunostaining-Immunoprecipitation, immunoblotting, and immunostaining were performed using appropriate antibodies as previously described (15,16). After immunostaining, specimens were examined using a confocal laser-scanning microscope (Carl Zeiss LSM510), and fluorescent images were analyzed using LSM510 Version 3.2 software (Carl Zeiss). Akt kinase assays using purified GSK3β protein were performed using an Akt kinase assay kit (Cell Signaling Technology) according to the manufacturer's instructions. IKK kinase assays using purified GST-IκB protein were performed as described (31).
Luciferase Assay-Luciferase assays were performed using the Dual-luciferase Reporter Assay System (Promega) according to the manufacturer's instructions. The firefly luciferase levels were measured and normalized to the activity of phRL-TK-derived Renilla luciferase. Data were analyzed statistically using Student's t test for unpaired groups and expressed as the means ± S.E. of triplicate experiments.
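The normalization and summary statistics described above amount to a short calculation, sketched here under stated assumptions. The readings are invented for illustration, and the helper names are hypothetical; the S.E. is taken as the sample standard deviation divided by the square root of the number of replicates.

```python
import math
import statistics


def relative_luciferase(firefly, renilla):
    """Normalize each firefly reading to its paired Renilla reading."""
    if len(firefly) != len(renilla):
        raise ValueError("firefly and Renilla readings must be paired")
    return [f / r for f, r in zip(firefly, renilla)]


def mean_and_se(values):
    """Return (mean, standard error of the mean) for replicate values."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se


# Hypothetical triplicate readings (arbitrary light units).
firefly = [1200.0, 1500.0, 1800.0]
renilla = [100.0, 100.0, 100.0]

ratios = relative_luciferase(firefly, renilla)
mean, se = mean_and_se(ratios)
print(f"relative activity = {mean:.2f} ± {se:.2f}")
```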
Homodimerization Assay-Homodimerization assays were performed using the Regulated Homodimerization kit (ARIAD Pharmaceuticals, Inc.) according to the manufacturer's instructions. Briefly, 293T cells were co-transfected for 24 h with the Fv-W-CagA-FLAG, Fv-W-CagA-T7, and W/AM-II-CagA-Myc plasmids in the presence or absence of AP20187 (100 nM), a small molecule that induces homodimerization of Fv domain-containing fusion proteins.

RESULTS
Histopathologic examination of gastric biopsy specimens from 39 Hp-infected Amerindian residents of Shimaa, a small village in the remote Peruvian Amazon (25,27), indicated that most villagers had active gastritis and mild atrophy, and several had intestinal metaplasia. None had peptic ulcer disease or gastric cancer (Fig. 1C and supplemental Table S1). We next examined the cagA genes and encoded CagA proteins of Hp cultured from these Shimaa villagers. Prior studies had shown that CagA of European and North American (Western (W)) and East Asian (EA) strains differed markedly in their C-terminal region (PY region) and that these differences are functionally important (11-14). PY regions contain several EPIYA tyrosine phosphorylation motifs: EPIYA-A and EPIYA-B motifs characteristic of both W and EA strains and EPIYA-C and EPIYA-D motifs characteristic of W and of EA strains, respectively (Fig. 1A) (11,14). PY regions also contain CRPIA motifs, which are responsible for separate phosphorylation-independent activities of CagA (Fig. 1A) (15).
By sequencing, we found that CagA proteins from most strains from urban Peruvians are of the W type (Fig. 1A and supplemental Fig. S1). In contrast, those of Shimaa strains are distinct from both W and EA types and could be placed in two main groups designated AM-I and AM-II (AM, Amerindian) (Fig. 1B). Although the overall sequences of AM and W CagAs are similar, AM CagAs have altered or degenerate "EPIYA-B" motifs: ESIYT and GSIYD in AM-I and AM-II CagAs, respectively (Fig. 1A). In addition, CagA sequences in some nominally AM-II strains such as Shi30, Shi35, and Shi156 have AM-I type CRPIA motifs (Fig. 1A), suggesting that they are actually hybrids and thus might differ functionally from purely AM-I or AM-II type CagA proteins. Most CagA sequences found in strains from the distinct Peruvian Amazonian region of Satipo, which is far from Shimaa and whose residents are of another language group, also fit into this AM-I and AM-II classification, as did CagA of several isolates from the Colombian and Venezuelan Amazon (supplemental Fig. S1) (9, 24, 26).
The CagA N terminus, which is far from its PY region, is needed for efficient CagA translocation into target epithelial cells, subsequent plasma membrane anchoring, and modulation of C-terminal region activities (32-34). The N-terminal regions of each of four AM-I CagAs that we characterized were similar to those of prototype CagA, whereas each of three AM-II CagAs lacked two large internal segments totaling 180 amino acids (supplemental Fig. S2) and thus may well differ functionally from other CagA proteins.
To begin functional testing of AM CagA proteins, we examined CagA tyrosine phosphorylation during Hp infection of AGS cells. CagA from representative W, EA, AM-I, and AM-II strains each underwent such phosphorylation (Fig. 2A and supplemental Fig. S3A). However, AM-II strains were less effective than W and EA strains in inducing subsequent phosphorylation of the host Erk/MAPK, Akt, and Met and production of IL-8 and cell scattering. AM-I strains, although more potent than AM-II strains, were also less active than W and EA strains in these activities (Fig. 2B, supplemental Fig. S3, B and C), each of which is involved in Hp pathophysiology and associated diseases (15).
CagA-host cell interactions were further examined in several ways. First, GST pulldown assays indicated that PY regions of W, EA, and AM-I CagAs interacted with known CagA-target proteins more strongly than did AM-II CagA (Fig. 2C). Second, CagA-p85 and CagA-Met interactions are known to mediate phosphorylation of Akt and IKK kinases (15), and CagA-Csk and CagA-Par1 interactions are known to mediate dephosphorylation of Src and Tau kinases, respectively (17,23). Ectopic GFP-CagA expression in 293T cells showed that AM CagA was less potent than W or EA CagA and that AM-II CagA was less potent than AM-I CagA in Akt and IKK kinase activation and Src and Tau down-regulation (Fig. 2D).
AUGUST 26, 2011 • VOLUME 286 • NUMBER 34

Attenuated CagA in Amerindian H. pylori
Subcellular localization studies showed that AM-II CagA was distributed diffusely and not strictly localized to the plasma membrane, as were other CagA proteins during Hp infection (Fig. 2A) or plasmid transfection (supplemental Fig. S4A). A W CagA derivative containing internal deletions matching those of AM-II CagA N-terminal regions (called ΔN) was also distributed diffusely (supplemental Fig. S5A). These deletions resulted in attenuation of CagA-dependent NF-κB and SRE activation and cell scattering activities (supplemental Fig. S5B), which reflects the importance of the CagA N terminus for plasma membrane localization, PY region-mediated intracellular target activation, and general modulation of CagA activities (32-34).
Several CagA chimaeras were constructed and studied to further test these ideas. The PY region of W CagA was replaced by that of EA, AM-I, or AM-II CagA (chimaeras called W/EA, W/AM-I, and W/AM-II) (supplemental Fig. S5D). We found that, although these chimeric CagAs localized to plasma membrane microdomains (supplemental Fig. S5C), W/AM-I and W/AM-II CagAs activated NF-κB less effectively than did their W parent or W/EA CagA (supplemental Fig. S5D). The CRPIA domain promotes cell proliferation and inflammation independent of CagA phosphorylation. The CRPIA domain fifth arginine residue is well conserved in W, EA, and AM-I CagA but not AM-II CagA (Fig. 1A and supplemental Fig. S1) (15). We generated W, EA, and AM-I CagAs in which this arginine was replaced by alanine (W, R952A/R986A; EA, R977A; AM-I, R964A/R1000A). We found these replacement derivatives to be much less able than the corresponding wild-type CagA proteins to activate NF-κB, SRE, T-cell factor/β-catenin, and activator protein-1 (Fig. 2E and supplemental Fig. S4B). Together these five sets of results indicate that the AM-II CagA protein N-terminal deletion and variant CRPIA sequences each contribute importantly to low potency during Hp infection.
The CRPIA motifs of W and EA CagA are known to interact with Par1 and Met (15,23). Consistent with the data in Fig. 2C, AM-II CagA interacted less with these proteins (Fig. 3A) and stimulated less Met-dependent expression of IL-8 and MUC2 than did AM-I, W, or EA CagA proteins (Fig. 3B). Mucin 2 (MUC2) is a major marker of gastric intestinal metaplasia, which is itself a preneoplastic condition closely linked to chronic Hp infection (35,36). In support of these results, knockdown of endogenous Met or p85 expression (Fig. 3C) diminished the ability of Hp to activate Erk and Akt and to stimulate MUC2 transcription (Fig. 3D). We thus infer that AM CRPIA motifs cause decreased CagA-Met interactions and epithelial cell responses to Hp infection.
Comparison of several W CagA derivatives showed that the strength of NF-κB and SRE activation varied among CagA proteins and that, although W CagA could contain multiple CRPIA motifs, the strength of activation was not strictly proportional to CRPIA copy number (supplemental Fig. S6, A and B). Given sequence differences among CRPIA motifs of a given type superimposed on consensus motif differences among W, EA, and AM-I and AM-II CagAs (Fig. 4A), we hypothesized that these differences in potency stem from differences in CRPIA sequence per se, overall PY context, or both.
To test this we constructed derivatives of ΔPY2 CagA (Δ871-976) that contain just one W CRPIA motif (Fig. 4B and supplemental Fig. S6C) (15) and in which individual residues were changed, often to those of EA CRPIA (supplemental Figs. S6D and S7A). In parallel, we also made EA CagA derivatives in which individual CRPIA residues were replaced by residues that are well conserved in W motifs (supplemental Fig. S7, B and C). Most single substitutions had only subtle effects on the strength of NF-κB activation (supplemental Fig. S7, A and C). Stronger effects were obtained by multiple replacements that changed a W CRPIA to an EA CRPIA motif and vice versa (supplemental Fig. S7, A and C). NF-κB and SRE activation decreased ≤2-fold when individual ΔPY2 CRPIA residues were replaced with the corresponding AM-I residues or when the W motif was replaced by an AM-I motif (supplemental Fig. S8A) but decreased ~5-fold when the W motif was replaced by an AM-II CRPIA motif (supplemental Fig. S8B). These chimaeras also revealed differences between W, EA, and AM CRPIA motifs in strengths of induction of IL-8 and MUC2 expression, with AM-II being the weakest of all CRPIA motifs tested (Fig. 4C).
Experimental animal infections were used to test if particular CagA types could affect Hp fitness. First, Hp infection studies confirmed that chimeric CagA proteins could be delivered from our engineered Hp strains (containing plasmid-borne TEM-1 β-lactamase-tagged chimeric cagA genes (W, W/EA, W/AM-I, and W/AM-II)) into AGS cells (supplemental Fig. S9). In addition, in gerbil infections, strain ATCC43504 derivatives with W, W/EA, and W/AM-I cagA genes achieved gastric bacterial titers that were similar to one another and, significantly, ~10-fold higher than those of isogenic strains with ΔcagA and W/AM-II alleles (Fig. 4D). This implies that W, EA, and AM-I alleles contribute to fitness, whereas an AM-II allele may not. In addition, W and W/EA alleles caused strong gastric epithelial induction of MUC2 expression, whereas W/AM-I caused intermediate, and W/AM-II and ΔcagA alleles caused negligible levels of induction (Fig. 4E). Thus these experiments indicate that CRPIA motifs of AM, especially AM-II, strains are relatively impotent and further implicate CRPIA as a key determinant of gastric mucosal changes after Hp infection, which affects Hp fitness and disease risk.
Mixed infection experiments provided important additional insight into these issues. W/AM-II Hp infection of AGS cells strongly suppressed the ability of superinfecting W or EA strains to cause Erk, p38, and Akt phosphorylation or IL-8 production, whereas initial infection with an isogenic ΔcagA strain did not (Fig. 5A). Furthermore, coexpression of W/AM-II and W CagA caused less induction of IL-8 transcription than did ΔPY and W coexpression (Fig. 5B). Because intracellular CagA forms multimers (22,23), these results suggested interference of W CagA function in mixed complexes that also contained W/AM-II protein.
Indeed, cotransfection studies showed that W/AM-II CagA could form heteromultimers with W CagA and inhibit its activity in 293T cells (Fig. 5B). In support, the inhibitory effect of W/AM-II was avoided when W CagA was forcibly premultimerized using AP20187, a small-molecule protein dimerizer (Fig. 5B). In a further illustration of AM-II CagA poisoning in mixed CagA multimers with implications for fitness, we found that W/AM-II CagA-expressing strains interfered with the ability of coinfecting W strains to achieve high titers in gerbil stomachs and to fully induce IKK-mediated IκB phosphorylation (Fig. 5, C and D).

DISCUSSION
Our study of geographically and genetically distinct types of cagA genes established that the CagA protein CRPIA motif is a key determinant of Hp virulence and fitness. We found that AM type CagAs, which are characteristic of Hp strains from Amerindians in the remote Peruvian Amazon village of Shimaa, were significantly attenuated in abilities to stimulate gastric epithelial proliferation and inflammation during infection.
A recent study indicated that such AM strains descend from Asian strains that arrived in the Americas with the ancestral Asian people some 15,000-20,000 years ago and that AM strains are less fit than and substantially displaced by hybrid or W strains in less isolated communities (27). Our in vitro and in vivo experiments indicated AM CagA proteins are less potent and contribute less to virulence and fitness than do W and EA CagA proteins. We, therefore, propose that AM versus W differences in CagA proteins contributed importantly to the apparent displacement of AM by W Hp strains in urban communities.
Intriguingly, residents of Satipo-region villages from elsewhere in the Peruvian Amazon watershed carried both AM and W strains, in contrast to only AM strains in Shimaa residents (Fig. 1A and supplemental Fig. S1). This might reflect more contact with people from outside, e.g. during guerrilla wars of the late 20th century (37,38). Continued studies of the genetics, virulence, fitness, and associated diseases of Hp from Amazonian and other Latin American populations promise to provide further insights into mechanisms of infection, persistence, and disease, especially in developing country at-risk human populations.
FIGURE 5 (legend fragment). ..., Fv-W-CagA-FLAG, and Fv-W-CagA-T7 plasmids with or without the dimerizer AP20187 (**, p < 0.01). IP, immunoprecipitate. C, shown is a competitive index of viable counts of bacteria from the stomachs of gerbils co-infected for 8 weeks with the indicated ATCC43504 strains (wild type and ΔcagA or wild type and ΔcagA complemented with W/AM-II cagA) (**, p < 0.01). D, immunostaining for IκB phosphorylation in gastric antral sections from infected gerbils is shown. Bar = 50 μm.
A mutational analysis of CRPIA motifs indicated that they could be key determinants of geographical differences in CagA-mediated host carcinogenic responses (Fig. 4C and supplemental Figs. S6-S8). Indeed, AM strains had less CagA activity than W and EA strains, with AM-II strains having markedly less than AM-I strains. Several nominally AM-II strains (Shi30, Shi35, Shi156) contained AM-I CRPIA motifs (Fig. 1A). These natural hybrids had activities similar to AM-I CagAs and higher than those of AM-II CagAs, as scored by induced Erk, Akt, and Met phosphorylation and IL-8 production (Fig. 2B). These results emphasize the importance of the CagA CRPIA motif in the development of Hp-related gastric illnesses.
Finally, we discovered that AM-II CagA can interfere with other CagA proteins both in vitro and in vivo (Fig. 5), an outcome suggesting new therapeutic approaches. Because current therapies for Hp infection are antibiotic-based and often compromised by antimicrobial resistance and also high rates of reinfection in developing country settings (39), new therapeutic strategies are needed to control Hp infections and diminish risks of associated gastric diseases (40). We propose considering AM-II strains such as Shi193 as novel live vaccines because these strains are highly human-adapted, well attenuated, and yet protective against infection by other more highly pathogenic strains.