Intracellular Localization of Homopolymeric Amino Acid-containing Proteins Expressed in Mammalian Cells

Many human proteins have homopolymeric amino acid (HPAA) tracts, which are involved in protein-protein interactions and also have intrinsic polymerization properties. Polyglutamine or polyalanine expansions cause several neurodegenerative diseases. To examine the properties of HPAAs, we expressed 20 kinds of 30-residue HPAA fused to the C terminus of yellow fluorescent protein in mammalian cells. Specific localization was observed depending on the HPAA. Polyarginine and polylysine aggregated in the nucleus. Polyalanine, polyhistidine, polyisoleucine, polyleucine, polymethionine, polyphenylalanine, polythreonine, polytryptophan, and polyvaline localized in the cytoplasm, and some of these HPAAs formed aggregate(s). Hydrophobic HPAAs such as polyisoleucine, polyleucine, polyphenylalanine, and polyvaline were found as one major aggregate or cumulus in the perinuclear region. Western blot analysis indicated that hydrophobic HPAA tracts appear to oligomerize and form high molecular weight complexes. These results indicate that hydrophobicity itself may trigger the oligomerization and aggregation of proteins when overexpressed in cells. Our experiments provide novel insights into the nature of the HPAAs that are often seen in human and other organisms.

In the human genome, there are many repetitive sequences, including trinucleotide repeats (triplet repeats) (1,2). When a triplet repeat is located in the open reading frame of a gene, it is translated into an amino acid homopolymer and becomes a homopolymeric amino acid (HPAA) tract in the protein. HPAA tracts exist in many proteins, including a variety of transcription factors.
Some HPAA-containing proteins are associated with neurodegenerative diseases. At least nine inherited neurological disorders, including Huntington's disease, spinobulbar muscular atrophy, dentatorubral-pallidoluysian atrophy, and six forms of spinocerebellar ataxia (3,4) are caused by the expansion of trinucleotide (CAG) repeats encoding polyglutamine. These are adult onset diseases involving the progressive degeneration of the nervous system. The presence of a polyglutamine tract is the only common feature of those proteins in these diseases.
These diseases likely share a common molecular pathogenesis resulting from toxicity associated with the expanded polyglutamine tract. Expanded polyglutamine is thought to endow the disease proteins with a dominant gain of function that causes apoptotic cell death. Several years ago, it was recognized that expanded polyglutamine forms neuronal intranuclear inclusions in animal models of polyglutamine diseases and in the central nervous system of patients with these diseases (3,5,6). These inclusions consist of accumulations of insoluble aggregated polyglutamine-containing fragments in association with other proteins. It has been proposed that proteins with long polyglutamine tracts misfold and aggregate as antiparallel β-strands termed "polar zippers" (7). The correlation between the threshold polyglutamine length for aggregation in experimental systems and the CAG repeat length that leads to human disease supports the argument that aggregation of expanded polyglutamine underlies the toxic gain of function. Although in some experimental systems the toxicity of expanded polyglutamine has been dissociated from the formation of visible inclusions, the formation of insoluble molecular aggregates appears to be a consistent feature of toxicity (8-11).
There are indications that other expanded HPAA sequences, such as polyalanine, also confer toxic functions via similar mechanisms. For example, oculopharyngeal muscular dystrophy (OPMD) has been found to be associated with the expansion of an alanine repeat in poly(A)-binding protein 2 (12). Intranuclear filament inclusions in skeletal muscle fibers are the morphological hallmark of OPMD (13,14). The overexpression of polyalanine tracts in COS-7 cells results in the formation of aggregates and toxicity toward the cells (15). Polyalanine peptides of 14 residues have been shown to form β-sheets in vitro (16), and extended polyglutamine repeats have also been shown to form such structures in vitro and in vivo (17,18).
In addition to the case of OPMD, at least six other genes have been identified in which polyalanine expansions cause human diseases, including the Aristaless-related homeobox protein in X-linked mental retardation and epilepsy (19), the SRY-box 3 in X-linked mental retardation with growth hormone deficiency (20), and the homeobox protein HoxD13 in synpolydactyly (21). Some of these disorders might not be true "polyalanine diseases," because they can be caused by point mutations in the same genes as well as by expansion of the polyalanine tract, suggesting that loss of function of the protein, rather than the expansion itself, is essential for pathogenesis. Huntington's disease-like 2 has been described as being caused by CTG repeat expansion, which would be translated into either a polyalanine or a polyleucine stretch (22).
Some transcription factors have HPAA tracts such as polyproline, polyglutamine, polyglycine, or polyalanine. For example, the Forkhead box protein P2 (FOXP2) contains a 40-residue polyglutamine tract. Myelin transcription factor 1 contains a 32-residue polyglutamic acid tract. Brain-2 (POU domain, class 3, transcription factor 2) contains a 5-residue polyalanine, a 21-residue polyglycine, a 7-residue polyproline, and a 21-residue polyglutamine tract. Polyproline and polyglutamine have been shown to activate transcription when fused to the DNA binding domain of the GAL4 factor, and the activity increases with HPAA length (23). However, the role of HPAA tracts in many transcription factors is not yet clear.
HPAA tracts also serve other functions. Two polyglutamic acid regions in bone sialoprotein are conserved and play a role in the hydroxyapatite-nucleating activity (24). A six-residue polyhistidine is a widely used "polyhistidine tag" with metal binding activity. Polylysine coatings on the surface of culture dishes make them very hydrophilic and help in the adhesion and proliferation of cultured cells.
In this study, we expressed 20 kinds of HPAA (of about 30 residues) fused to the C terminus of YFP to clarify the properties of each HPAA itself under the same experimental conditions. This is the first report comparing the aggregation properties of each type of HPAA in cells.

EXPERIMENTAL PROCEDURES
Plasmid Construction-Expression constructs encoding EYFP with C-terminal polyamino acid tracts were synthesized by ligating double-stranded 90-mer oligonucleotides (Proligo, Kyoto, Japan) into the pEYFP-C1 mammalian expression vector (Clontech). (CAG/CTG)30 was used for Gln, Ser, Ala, Leu, and Cys; (ATG/CAT)30 was used for Met, Asp, and Ile; (TGG/CCA)30 was used for Trp, Val, Pro, and His; (GGC/CCC)30 was used for Arg; (GAA/TTC)30 was used for Glu, Lys, and Phe; (TAC/GTA)30 was used for Tyr; (AAC/TTG)30 was used for Asn and Thr; and (AGG/CCT)30 was used for Gly. Restriction enzymes and T4 polymerase were used to adjust the frame for each polyamino acid. The integrity of each repeat was confirmed by sequencing.
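The way one double-stranded repeat can encode several different homopolymers can be illustrated with a short sketch. It assumes the standard genetic code and simply reads a repeat unit in its three frames on each strand; the function names are illustrative and are not part of the original methods.

```python
# Standard genetic code, laid out in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def repeat_frames(unit):
    """Map each codon read from a triplet repeat to its amino acid.

    Both strands are read in all three frames; '*' marks a stop codon.
    """
    frames = {}
    for strand in (unit, reverse_complement(unit)):
        doubled = strand * 2  # lets the frame wrap around the repeat unit
        for offset in range(3):
            codon = doubled[offset:offset + 3]
            frames[codon] = CODON_TABLE[codon]
    return frames


if __name__ == "__main__":
    for codon, residue in repeat_frames("CAG").items():
        print(codon, residue)
```

For the CAG/CTG repeat, the six frames give Gln, Ser, Ala, Leu, and Cys (Ala appears on both strands), which matches the assignment listed above.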
Fluorescence Microscopy Analysis-COS-7 cells were grown in Dulbecco's modified Eagle's medium with 10% fetal bovine serum (Sigma-Aldrich). Transient transfection was performed using the FuGENE 6 transfection reagent (Roche Diagnostics) following the manufacturer's instructions. At 48 h after transfection, the cells were treated with Hoechst 33342 (Sigma-Aldrich) at 37°C for 30 min, and the medium was removed and replaced with phosphate-buffered saline. The fluorescence of YFP was visualized with an IX70 fluorescence microscope (Olympus, Tokyo, Japan).
Transfection Efficiency-COS-7 cells were transiently transfected with the YFP-HPAA plasmid. After incubation for 48 h, the cells were harvested and resuspended in phosphate-buffered saline. The percentage of transfected cells was measured as the percentage of fluorescence-positive cells by flow cytometry (EPICS® XL™, Beckman Coulter).
Western Blot Analysis-COS-7 cells were transiently transfected with YFP-HPAA plasmids. After incubation for 48 h, the cells were harvested and sonicated in PBS with 1% Triton X-100. The protein concentration was measured with a DC Protein Assay Kit (Bio-Rad). Equal amounts of protein, 14.1 μg for each sample, were subjected to SDS-polyacrylamide gel electrophoresis on 12.5% gels and transferred onto polyvinylidene difluoride membranes (Finetrap NT-32; Nihon Eido, Tokyo, Japan). The membranes were incubated with peroxidase-conjugated anti-GFP/YFP polyclonal antibody (1:1000; Santa Cruz Biotechnology, Santa Cruz, California) at 37°C for 1 h, and then with an anti-rabbit IgG antibody at 37°C for 40 min. The resulting membranes were visualized with a POD immunostain kit (Wako, Tokyo, Japan) or an enhanced chemiluminescence kit (Amersham Bioscience).

RESULTS
Intracellular Localization of Homopolymeric Amino Acid-fused YFP-To compare the molecular properties of different HPAA stretches in mammalian cells, 20 kinds of triplet repeats, each encoding a different HPAA, were cloned into the mammalian expression vector pEYFP-C1. HPAA tracts of ~30 (26-32) residues were fused to the C terminus of YFP. The fusion proteins were expressed in COS-7 cells under the cytomegalovirus (CMV) promoter. The nucleus was visualized by Hoechst staining, and the cells were observed under a fluorescence microscope (Figs. 1 and 2). We scored the localization pattern in ~200 transfected cells 48 h after transfection (Fig. 2A). Transfection efficiency ranged from 46 to 63% of total cells; there was no significant difference among the constructs by analysis of variance (ANOVA) (data not shown).
YFP fluorescence was distributed diffusely in cells expressing only YFP (Fig. 1A). YFP with Asn-35 (a 35-residue homopolymer of asparagine fused to the C terminus of YFP), Asp-30, Gln-30, Glu-30, Gly-28, Pro-27, and Ser-29 gave fluorescence patterns similar to that of YFP alone (Figs. 1B and 2, A and B). YFP with Arg-30 or Lys-30 formed aggregates in the nucleus, in addition to showing diffuse cytoplasmic expression. The patterns looked alike, with 2-3 aggregates in the nucleus of each cell, although the aggregates of Lys-30 were larger and brighter than those of Arg-30. With Cys-29 or Tyr-28 the fluorescence was diffusely present throughout the cell but intense in the nucleus. Tyr-28 formed several aggregates both in the nucleus and cytoplasm, whereas Cys-29 formed aggregates or cumuli only in the cytoplasm.
With Ala-29, His-26, Ile-32, Leu-30, Met-30, Phe-30, Thr-35, Trp-30, or Val-29, the fluorescence was present exclusively in the cytoplasm. Among these residues, His-26 and Trp-30 formed several cytoplasmic aggregates. Most of the rest, except Ala-29 and Thr-35, showed small and dispersed aggregates in the cytoplasm. Ala-29 and Thr-35 did not produce visible aggregation under light microscopy. Ile-32, Leu-30, Phe-30, and Val-29 formed one large cumulus in the perinuclear region of each cell in addition to small and dispersed aggregates in the cytoplasm. Interestingly, these large cumulus-containing cells showed a distorted nuclear morphology.
To examine the effect of repeat expansion, constructs containing longer HPAA tracts were made for four kinds of HPAA (Ala, Cys, Leu, and Gln) with lengths of 70, 70, 130, and 150 residues, respectively. As shown previously (3,25), about half of Gln-150-transfected cells formed one large and bright aggregate in the perinuclear region of each cell (Fig. 2C), a drastically different pattern from that in the shorter Gln-30-expressing cells, which did not differ from the pattern seen with control YFP. Twenty percent of Ala-70-transfected cells formed small and dispersed aggregates in the cytoplasm that were not seen with Ala-29. Cys-70 and Leu-130 showed the same localization as their shorter Cys-29 and Leu-30 counterparts. Similar results were observed in other cell types, such as HEK293 (data not shown).
Western Blot Analysis-SDS-PAGE and Western blot analysis were performed using an antibody against YFP (Fig. 3, A and B). Expression of all constructs was confirmed. Enhanced chemiluminescence staining was performed for Arg and Lys (Fig. 3B) because the normal immunoblot staining of these two was too faint. Because the fluorescence of these two constructs was not faint under fluorescence microscopy (Figs. 1 and 2), their conformation might change when they aggregate in the nucleus such that the antibody no longer recognizes them. It is also possible that the long stretch of positively charged amino acids affects the migration of these proteins in the electric field, or that transfer onto the membrane was insufficient because of the high molecular weight of these proteins. Except for Lys, Cys, Thr, and the longer Cys-70, every YFP-HPAA showed a band around its calculated molecular weight.
Arg, His, Ile, Leu, Met, and Phe showed one or several bands on the SDS gel above the expected molecular weight bands, indicating that those proteins might exist in oligomeric forms. Conversely, several constructs showed additional bands below the expected molecular weight, which might represent degradation products. Cys, His, Ile, Leu, Lys, Met, Phe, Thr, Tyr, Val, and all longer HPAA constructs (Ala-70, Cys-70, Leu-130, and Gln-150) showed smear staining in the stacking gel or a band at the boundary between the stacking gel and the running gel. This indicates that these proteins oligomerized and assumed an aggregated conformation.
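As a rough guide to where each monomeric fusion protein would be expected to run, the mass contributed by a homopolymeric tract can be estimated from average residue masses. The sketch below is illustrative only: the residue masses are approximate textbook values, the 27-kDa figure for the YFP moiety is an assumed round number, and linker residues are ignored. It is not the calculation used in this study.

```python
# Approximate average residue masses in Da (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}

# Assumed approximate mass of the YFP moiety in kDa (hypothetical round value).
ASSUMED_YFP_KDA = 27.0


def tract_mass_kda(residue, length):
    """Mass in kDa added by a homopolymeric tract of the given length."""
    return RESIDUE_MASS[residue] * length / 1000.0


def fusion_mass_kda(residue, length, carrier_kda=ASSUMED_YFP_KDA):
    """Approximate mass in kDa of a carrier protein fused to the tract."""
    return carrier_kda + tract_mass_kda(residue, length)


if __name__ == "__main__":
    for residue, length in [("G", 28), ("Q", 30), ("W", 30), ("Q", 150)]:
        print(residue, length, round(fusion_mass_kda(residue, length), 1))
```

Under these assumptions a 30-residue tract adds only about 2 to 6 kDa, so bands far above the monomer position, or material retained in the stacking gel, cannot be explained by the tract mass alone.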

DISCUSSION
Hydrophobic HPAAs Oligomerize and Aggregate-Studies of HPAA have focused on polyglutamine or polyalanine because these two play an important role in several human inherited diseases. It is widely known that many causative proteins with elongated polyglutamine or polyalanine form aggregates in cells (3).
To compare the properties of different HPAA stretches in mammalian cells, we expressed ~30-residue HPAAs of all 20 amino acids fused to the C terminus of YFP. As shown in Fig. 1B, the intracellular localization differed dramatically depending on the HPAA. With Arg or Lys, aggregates were observed in the nucleus. It is possible that these basic amino acids form aggregates in the nucleus due to electrostatic interaction or work as nuclear localization signals. The properties of each amino acid might be amplified when they form homopolymers.
In the case of Ala, His, Ile, Leu, Met, Phe, Thr, Trp, or Val, the fluorescence was present exclusively in the cytoplasm. Among these HPAAs, His, Ile, Met, Phe, Trp, and Val formed diffuse aggregates within the cytoplasm, and Ile, Leu, Phe, and Val formed a single cumulus in the perinuclear region of each cell. Of the aggregate or cumulus-forming HPAAs, His, Ile, Leu, Met, Phe, and Val were shown to have higher molecular mass in Western blot analysis (Fig. 3A). Moreover, we found that the cumulus-forming HPAAs, Ile, Leu, Phe, and Val, are all highly hydrophobic and show strong aggregation as judged by Western blot staining.
It is quite possible that the protein context plays a role in modulating the properties of single amino acid repeats. Hydrophobic amino acid repeats near a hydrophobic domain could promote aggregation of the protein. Therefore, the overall hydrophobicity as well as the localized hydrophobic nature of each protein should be taken into consideration.
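The link between cumulus formation and hydrophobicity can be checked against a published hydropathy scale. The sketch below ranks the 20 amino acids by the Kyte-Doolittle hydropathy index; the choice of this particular scale is an assumption made for illustration, since no specific scale is named in this study.

```python
# Kyte-Doolittle hydropathy index (higher values are more hydrophobic).
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def rank_by_hydropathy(scale=KYTE_DOOLITTLE):
    """Return residues sorted from most to least hydrophobic."""
    return sorted(scale, key=scale.get, reverse=True)


def most_hydrophobic(n, scale=KYTE_DOOLITTLE):
    """Return the n most hydrophobic residues on the given scale."""
    return rank_by_hydropathy(scale)[:n]


if __name__ == "__main__":
    print(most_hydrophobic(4))
```

On this scale the four most hydrophobic residues are Ile, Val, Leu, and Phe, the same four that formed a single perinuclear cumulus. Alanine scores well above glutamine on the same scale.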
The Occurrence of HPAAs in Nature-We searched the human protein database for HPAA-containing proteins to determine the occurrence of HPAA stretches in nature. There are many HPAA-containing proteins in the human genome. However, the number of HPAA-containing proteins varies among the different HPAA species. In Table I, we show the number of human proteins containing each species of HPAA. The table includes proteins with an HPAA tract longer than 11 consecutive residues. As seen in Table I, few human proteins contain polyisoleucine, polylysine, polymethionine, polyphenylalanine, polytryptophan, polytyrosine, or polyvaline stretches. This table suggests that hydrophobic HPAAs and polylysine in proteins, both of which are shown in this study to be highly aggregated, could have played a negative role in organism survival during evolution. Studies of many human diseases caused by the aggregation and deposition of abnormal proteins (26), including polyglutamine diseases, Alzheimer's disease, and Parkinson's disease, have shown that aggregation-prone proteins are cytotoxic.
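A search of this kind can be sketched with a regular expression that reports runs of a single residue longer than 11, that is, 12 or more consecutive identical residues. The function name and the example sequence are hypothetical, and this is not the search procedure used to build Table I.

```python
import re


def find_hpaa_tracts(sequence, min_length=12):
    """Return (residue, start index, length) for each homopolymeric tract.

    A tract is a run of one residue repeated at least min_length times;
    the default of 12 corresponds to "longer than 11 consecutive residues".
    """
    # One captured residue followed by at least (min_length - 1) copies of it.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_length - 1))
    return [
        (match.group(1), match.start(), len(match.group(0)))
        for match in pattern.finditer(sequence.upper())
    ]


if __name__ == "__main__":
    # Made-up sequence: a 15-residue Gln tract and an 11-residue Ala run.
    example = "MK" + "Q" * 15 + "PLS" + "A" * 11 + "G"
    print(find_hpaa_tracts(example))
```

In the made-up example only the 15-residue glutamine tract is reported; the 11-residue alanine run falls just under the cutoff. Counting proteins per residue type would then amount to applying the function to every sequence in a database and tallying the residues returned.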
With regard to the occurrence of each HPAA, two other possibilities should also be considered apart from cytotoxicity, i.e. the instability of some specific codon repeats and the potential functions of HPAAs. CAG/CTG, CGG/CCG, GTC/GAC, and GTG/CAC are reported to be unstable during replication and, moreover, CAG/CTG has been shown to be eight times more unstable than other repeats (27). This might be the reason for the abundance of polyglutamine, polyalanine, and polyserine, because they are all encoded by CAG/CTG. Interestingly, polyleucine and polycysteine, which are also encoded by CAG/CTG, are less abundant than polyglutamine, polyalanine, and polyserine. This suggests again that the more hydrophobic HPAAs are less abundant in natural proteins.
New Insights into Polyglutamine and Polyalanine-Until now, no study has been undertaken on HPAAs other than polyglutamine and polyalanine, except for one study in which a long polyleucine stretch of 291 residues was compared with a polyglutamine of the same length (28). Because our study is the first to consider all kinds of HPAA, it not only sheds light on the properties of all HPAAs, but also offers new insight into polyglutamine and polyalanine diseases. Polyglutamine and polyalanine diseases arise when the HPAA tract in the pathogenic protein elongates over a threshold length. Whereas the threshold is 38 residues in the case of polyglutamine, the threshold of polyalanine is estimated to be smaller. In this study, we observed aggregates formed by long Gln-150 and Ala-70 whereas their shorter counterparts, Gln-30 and Ala-29, did not form visible aggregates. Our Western blot analysis also showed that these longer HPAAs are highly aggregated and revealed staining patterns similar to those of 30-residue HPAAs of hydrophobic amino acids, including Cys, Ile, Leu, and Val. From these results, we suggest that the hydrophobicity of polyglutamine and polyalanine might become stronger as the tract becomes longer. Moreover, the difference in the threshold between polyglutamine and polyalanine diseases may be explained by the fact that alanine is more hydrophobic than glutamine and thus needs fewer residues to become hydrophobic enough to cause toxicity. It has been suggested that expanded polyglutamine or polyalanine forms anti-parallel β-sheets (16-18) resulting in a polar zipper, and this is not inconsistent with our results.
In summary, it is likely that the expanded polyglutamine or polyalanine repeats associated with many diseases remain relatively soluble compared with other HPAAs of the same length. The existence of HPAAs and their lengths might be determined by several factors, including hydrophobicity and the stability of the triplet repeats coding for the individual amino acids.

TABLE I
The occurrence of HPAA-containing proteins in nature
The number of HPAA-containing proteins varies among HPAA species. A search of the human protein database reveals many proteins containing relatively hydrophilic HPAAs, including polyalanine, polyglutamine, polyglutamic acid, polyglycine, polyhistidine, polyproline, or polyserine. On the other hand, there are few hydrophobic HPAAs composed of polyisoleucine, polylysine, polymethionine, polyphenylalanine, polytryptophan, polytyrosine, or polyvaline stretches that occur naturally.