J. Biol. Chem., Vol. 281, Issue 26, 18184-18192, June 30, 2006
Evolutionarily Conserved Allosteric Network in the Cys Loop Family of Ligand-gated Ion Channels Revealed by Statistical Covariance Analyses*
From the
Received for publication, January 12, 2006, and in revised form, March 20, 2006.
The Cys loop family of ligand-gated ion channels mediates fast synaptic transmission for communication between neurons. These channels are allosteric proteins, in which binding of a neurotransmitter to its binding site in the extracellular amino-terminal domain triggers structural changes in distant transmembrane domains to open a channel for ion flow. Although the locations of the binding site and the channel gating machinery are well defined, the structural basis of the activation pathway coupling binding and channel opening remains to be determined. In this paper, by analyzing amino acid covariance in a multiple sequence alignment, we have identified an energetically interconnected network in the Cys loop family of ligand-gated ion channels. Statistical coupling and correlated mutational analyses along with clustering revealed a highly coupled cluster. Mapping the positions in the cluster onto a three-dimensional structural model demonstrated that these highly coupled positions form an interconnected network linking experimentally identified binding domains through the coupling region to the gating machinery. In addition, these highly coupled positions are also concentrated in the transmembrane domains, which are a recent focus for the sites of action of many allosteric modulators. Thus, our results revealed a genetically interconnected network that potentially plays an important role in the allosteric activation and modulation of the Cys loop family of ligand-gated ion channels.
Ligand-gated ion channels (LGICs)2 mediate fast synaptic transmission for communication between neurons. The Cys loop family of LGICs, with the signature cysteine loop in the amino-terminal domain, includes nicotinic acetylcholine receptors, serotonin receptor type 3, γ-aminobutyric acid receptor types A and C, glycine receptors, zinc-activated cation channels, and invertebrate glutamate/serotonin-activated anionic channels or GABA-gated cation channels (1-4). Studies using site-directed mutagenesis, affinity labeling, cysteine accessibility tests, and electron microscopy over the last two decades have demonstrated that all of the members of this receptor family have a similar structural architecture (5). Each receptor is composed of five subunits. Each subunit has a large amino-terminal extracellular domain that forms agonist-binding sites at subunit interfaces, four transmembrane domains (M1-M4) that form the ion conduction pore, and a large intracellular loop that can interact with intracellular proteins for receptor targeting and regulation. The structural model of the amino-terminal extracellular domain is further extended by the crystal structure of a homologous protein, acetylcholine-binding protein (6-9), and by the electron microscopic structure of the Torpedo nicotinic receptor (10). A structural model of the nicotinic receptor transmembrane domains is also available from electron microscopy at 4 Å resolution (10, 11).
The Cys loop family of LGICs are allosteric proteins (3), in which binding of a neurotransmitter to its binding site in the extracellular amino-terminal domain controls distant gating machinery in the transmembrane domain to open the ion conduction pore. The kinetic mechanism of channel activation can be best described by an allosteric model in which agonist binding and channel gating are highly coupled (3, 12, 13). This long range coupling of the agonist-binding domain to the gating machinery requires an interconnected allosteric network, through which binding energy can be reliably transmitted, in the form of a "conformational wave" (14), from the agonist-binding site to the gating machinery to open the channel. Information about this interconnected allosteric network, however, is not readily available by directly examining the structural models. Although recent experimental studies have made significant contributions toward understanding the mechanism of ligand-gated ion channel activation (4, 15), exhaustive experimental studies are time consuming, and the activation pathway still needs to be defined. Thus, to facilitate future experimental studies, it is necessary to use computer-aided analysis to define the entire allosteric network for experimental validation. Statistical coupling analysis (SCA) is a sequence-based statistical method designed to estimate the thermodynamic coupling of two residues in a protein. The basis of this method is that the coupling of two sites in a protein, either directly or allosterically, should cause these two positions to coevolve. Such coevolved residues can be identified by analyzing a large and diverse multiple sequence alignment (MSA) of a protein family for the distribution probability of 20 amino acid residues at each position (16). 
With this method, the degree of residue covariance at two sites, in the form of a "coupling energy," can be determined by observing the effect of a perturbation at one site (extracting a subset of the sequence alignment containing a relatively conserved residue at the site) on the amino acid distribution of another site. Prediction of potentially interacting residues could dramatically reduce the work of exhaustive mutagenesis scanning and facilitate identification of functionally important residues in the interconnected allosteric network that underlies binding-gating coupling across the entire family. This method has been successfully used to define interconnected allosteric networks of several protein families, such as PDZ domains (16), G-proteins (17), G-protein-coupled receptors, serine proteases, globins (18), and retinoid X receptors (19).
McLachlan-based substitution correlation (McBASC) is another approach to find covariant positions in a protein family (20), although it is more frequently used to find directly contacting residues (21). By comparing pairs of sequences in an MSA, this method assigns a score for each comparison at each position based on the change of amino acid residue properties using the McLachlan substitution matrix (22). The correlation coefficient of these mutational scores between two sites in the MSA of a protein family can then be used to identify coevolved sites. In this paper, using these two approaches with different scoring systems, we have identified a cluster of genetically covariant sites in the Cys loop receptors. Mapping these positions onto the three-dimensional structural model of a nicotinic receptor subunit reveals that these positions are mainly clustered in functionally important domains, forming an interconnected allosteric network linking the agonist-binding pocket to the gating machinery via coupling domains. In addition, these highly coupled positions are also clustered in transmembrane domains, the recent focus for the sites of action of many allosteric modulators. Thus, our results revealed a genetically interconnected network that potentially serves as the activation pathway and plays an important role in allosteric modulation of the Cys loop family of LGICs.
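The pairwise-score correlation behind McBASC, as described above, can be illustrated with a minimal sketch. This is not the Java program used in the paper: the property classes and the `toy_score` function are hypothetical stand-ins for the McLachlan substitution matrix, the function names are invented for illustration, gaps are not handled, and each unordered pair of sequences is counted once.

```python
import math
from itertools import combinations

# Hypothetical property classes; a real run would use the McLachlan matrix.
CLASSES = {
    "hydrophobic": "AVLIMFWC",
    "polar": "STNQYGP",
    "positive": "KRH",
    "negative": "DE",
}
CLASS_OF = {aa: name for name, members in CLASSES.items() for aa in members}


def toy_score(a, b):
    """Toy similarity: 2 for identity, 1 for same class, 0 otherwise."""
    if a == b:
        return 2.0
    class_a = CLASS_OF.get(a)
    if class_a is not None and class_a == CLASS_OF.get(b):
        return 1.0
    return 0.0


def pair_scores(msa, site):
    """Score the residues at `site` for every unordered pair of sequences."""
    return [toy_score(s[site], t[site]) for s, t in combinations(msa, 2)]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either list has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    if sx == 0 or sy == 0:
        return 0.0
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n * sx * sy)


def site_correlation(msa, i, j):
    """Correlate the pairwise substitution scores of sites i and j."""
    return pearson(pair_scores(msa, i), pair_scores(msa, j))


# Toy alignment: sites 0 and 1 change together, site 2 varies independently.
msa = ["AKG", "AKS", "DEG", "DES"]
r01 = site_correlation(msa, 0, 1)
r02 = site_correlation(msa, 0, 2)
print(round(r01, 3), round(r02, 3))
```

In this toy alignment, sites 0 and 1 always substitute together and therefore correlate strongly, whereas site 2 does not follow them. A fully conserved column has no score variance, which is why the sketch returns 0.0 in that case rather than dividing by zero.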
Data Source and Multiple Sequence Alignment
The amino acid sequences of subunits in the Cys loop receptor family of ligand-gated ion channels were downloaded from the Ligand-Gated Ion Channel Database at the European Bioinformatics Institute website (www.ebi.ac.uk/compneur-srv/LGICdb/LGIC.html), where a redundancy check has been performed. Based on the length distribution histogram of all of the sequences (data not shown), we excluded those sequences that clearly do not belong to the same population. Extra long sequences (>700 residues) could have a different structure, and extra short sequences (<250 residues) are likely incomplete sequences that would introduce unnatural gaps and influence the coupling analysis (see Fig. 2B). These extra long and short sequences were therefore excluded, and the remaining 389 sequences were used for analysis. All of the sequences were aligned using the ClustalW 1.83 package with default parameters: a gap opening penalty of 10.00, a gap extension penalty of 0.20, and the Gonnet series of protein weight matrices. Because the structural model of the Torpedo nicotinic receptor is the best model available, for all calculations, the numbering in the subunit of Torpedo californica nicotinic receptor was used, ignoring the signal peptide and gaps inserted into the subunit during the sequence alignment.
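The length-based exclusion described above amounts to a simple filter. The sketch below assumes sequences are held in a dictionary keyed by name; the function name and the toy entries are hypothetical, and only the 250 and 700 residue cutoffs come from the text.

```python
def filter_by_length(sequences, min_len=250, max_len=700):
    """Keep sequences whose length lies within [min_len, max_len]."""
    return {
        name: seq
        for name, seq in sequences.items()
        if min_len <= len(seq) <= max_len
    }


# Toy input: one fragment, one typical subunit, one over-long entry.
toy = {"fragment": "M" * 120, "typical": "M" * 480, "oversized": "M" * 900}
kept = filter_by_length(toy)
print(sorted(kept))
```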
Statistical Coupling Analysis (SCA)
The static energy (
Correlated Mutational Analysis
Correlated mutational analysis was carried out using McBASC (23). The program for this calculation, written in Java, was downloaded from Anthony A. Fodor's website (www.afodor.net), modified to produce formatted output, and executed in a Java environment.
Clustering Analysis
To extract information from the large data sets produced by the coupling and correlated mutation analyses, a clustering analysis was performed using Hierarchical Clustering Explorer 3.0 by Jinwook Seo at the University of Maryland (www.cs.umd.edu/hcil/multi-cluster/hce3.html). The coupling energy and correlation coefficient data matrices were clustered with complete linkage, without normalization.
Visual Presentation
For visual presentation of the highly coupled residues in the structural model of a subunit, the structure of the
Static Energy
To calculate the coupling, we started by counting occurrences of amino acids at each position in the MSA. Fig. 1A shows the relative frequencies of the amino acid residues in the Cys loop family of LGICs (open bars) and in all proteins from the Swiss-Prot database (filled bars) used for calculation. Note that the frequencies of amino acid residues in the MSA deviate slightly from those in all proteins. Hydrophobic residues, such as Leu, Ile, Phe, Val, and Trp, had slightly higher frequencies in the MSA than average. Small and some hydrophilic residues such as Gly, Ala, and Lys had lower frequencies than average. This is expected for membrane proteins with multiple hydrophobic transmembrane stretches. The amino acid frequencies at each site were then determined and converted to probabilities for all 20 amino acids (16). The probabilities were then used to calculate the static energy. If the amino acid distribution at a site is similar to the distribution for all positions, then the site is not conserved, and the static energy approaches zero. In contrast, if a site is conserved, its amino acid distribution will deviate from the mean, and the static energy at that site will be higher. Thus, the magnitude of the static energy represents the extent of deviation of the amino acid distribution at each site from the mean in the MSA and therefore represents the extent of residue conservation at that site.
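The behavior of the static energy described above can be illustrated with a short sketch. The expression used here is an assumption: it follows the binomial-probability form commonly given for SCA (reference 16), not a formula reproduced in this text. The uniform `background` dictionary is a placeholder for the Swiss-Prot frequencies, and the function names, the scaling to 100 counts, and the toy alignment are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def log_binomial(n, total, p):
    """Log binomial probability of n occurrences in `total` trials."""
    return (
        math.lgamma(total + 1) - math.lgamma(n + 1) - math.lgamma(total - n + 1)
        + n * math.log(p) + (total - n) * math.log(1 - p)
    )


def frequencies(residues):
    """Relative frequency of each amino acid, ignoring gap characters."""
    counts = Counter(r for r in residues if r in AMINO_ACIDS)
    total = sum(counts.values())
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def static_energy(site_freq, msa_freq, background, total=100, kt=1.0):
    """Deviation of a site's distribution from the MSA-wide distribution."""
    squares = 0.0
    for aa in AMINO_ACIDS:
        ln_site = log_binomial(total * site_freq[aa], total, background[aa])
        ln_msa = log_binomial(total * msa_freq[aa], total, background[aa])
        squares += (ln_site - ln_msa) ** 2
    return kt * math.sqrt(squares)


# Placeholder background: uniform, NOT the Swiss-Prot frequencies.
background = {aa: 0.05 for aa in AMINO_ACIDS}

# Toy alignment: column 0 is fully conserved, column 1 is variable.
msa = ["WAK", "WLE", "WIK", "WVD"]
msa_freq = frequencies("".join(msa))
conserved = static_energy(frequencies([s[0] for s in msa]), msa_freq, background)
variable = static_energy(frequencies([s[1] for s in msa]), msa_freq, background)
print(conserved > variable)
```

A site whose distribution equals the MSA-wide distribution scores exactly zero in this sketch, and the fully conserved column scores higher than the variable one, matching the qualitative description in the text.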
Fig. 1B shows the static energy for all 437 positions using the numbering of T. californica nicotinic receptor
Coupling Energy
To calculate the statistical coupling energy, we performed perturbation analysis as described by Süel et al. (18). Briefly, sequences containing a conserved residue (>30%) at a particular site were extracted from the MSA to form a subset. There are 253 sites with at least one relatively conserved residue (>30%). Thus, 253 perturbations were performed (one perturbation at each site), and 253 subsets were generated. Because each subset contains only sequences with the conserved residue at the perturbation site, the amino acids at this and all the other sites are redistributed. The amino acid probabilities at each site in a subset were then determined and used for the coupling energy calculation. If the perturbation at one site significantly changes the amino acid distribution at another site, then these two sites have high coupling energy. Otherwise, they have low coupling energy. The calculation resulted in a 437 x 253 matrix of the coupling energy (Fig. 2A). In some regions of the receptor, such as the large intracellular loop (sites 301-402) between the third and fourth transmembrane domains, which has the most diversified sequences and low static energy (Fig. 1B), the alignment generated large gaps at many positions. To determine whether gaps can influence the coupling result, we examined the relationship between the number of gaps at each position and the mean coupling energy of all positions in response to the same perturbation. Fig. 2B plots the number of gaps against the mean coupling energy for each perturbation. Note that all of the positions with more than 60 gaps had high mean coupling, suggesting that the number of gaps does have some influence on the coupling energy calculation. 
To avoid this potential influence, we discarded positions with more than 60 gaps and most of the M3-M4 intracellular loop (sites 296-392) from further analysis in both rows (coupling) and columns (perturbation). This resulted in a 311 x 219 matrix. To identify highly coupled sites from this large data set, we performed a clustering analysis. Fig. 3A shows the clustering result of coupling energy for this matrix with 219 rows (perturbation) and 311 columns. Note that the sites with high coupling energy are mainly clustered in the bottom right as indicated by the three yellow boxes. Fig. 3B is a closer view of these clusters. The positions of all of the columns in this highly coupled cluster showed a similar coupling pattern to many perturbations, suggesting that they are covariant in response to the same set of perturbations and thus are mutually coupled. The detailed positions in Fig. 3B are listed in Table 1.
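The perturbation step described above can be sketched as follows. This is an illustrative simplification, not the authors' implementation: it uses a binomial log-probability per amino acid, assumes the reference term is the same before and after the perturbation so that it cancels, and uses a uniform placeholder `background` instead of Swiss-Prot frequencies. The function names and the toy alignment are hypothetical; only the 30% conservation threshold comes from the text.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def log_binomial(n, total, p):
    """Log binomial probability of n occurrences in `total` trials."""
    return (
        math.lgamma(total + 1) - math.lgamma(n + 1) - math.lgamma(total - n + 1)
        + n * math.log(p) + (total - n) * math.log(1 - p)
    )


def log_probs(column, background, total=100):
    """Log probability of each amino acid's observed frequency in a column."""
    counts = Counter(r for r in column if r in AMINO_ACIDS)
    n_res = sum(counts.values()) or 1  # avoid division by zero on all-gap columns
    return [
        log_binomial(total * counts[aa] / n_res, total, background[aa])
        for aa in AMINO_ACIDS
    ]


def perturb(msa, site, threshold=0.30):
    """Sub-alignment with the most common residue at `site`, or None."""
    counts = Counter(s[site] for s in msa if s[site] in AMINO_ACIDS)
    if not counts:
        return None
    residue, n = counts.most_common(1)[0]
    if n / len(msa) <= threshold:
        return None  # no residue is conserved enough to perturb this site
    return [s for s in msa if s[site] == residue]


def coupling_energy(msa, i, j, background, kt=1.0):
    """Change in the distribution at site i when site j is perturbed."""
    subset = perturb(msa, j)
    if subset is None:
        return None
    full = log_probs([s[i] for s in msa], background)
    sub = log_probs([s[i] for s in subset], background)
    return kt * math.sqrt(sum((a - b) ** 2 for a, b in zip(sub, full)))


background = {aa: 0.05 for aa in AMINO_ACIDS}  # placeholder, not Swiss-Prot
# Toy alignment: site 1 follows site 0, site 2 is independent of site 0.
msa = ["AKG", "AKS", "AKG", "AKS", "DEG", "DES", "DEG", "DES"]
coupled = coupling_energy(msa, 1, 0, background)
uncoupled = coupling_energy(msa, 2, 0, background)
print(coupled > uncoupled)
```

Selecting the sequences that carry Ala at site 0 collapses site 1 onto Lys, so that pair has a nonzero coupling energy, while the distribution at site 2 is unchanged and its coupling energy is zero.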
Correlated Mutation Analysis
With the same MSA, we performed correlated mutation analysis using the McBASC method. This resulted in a 437 x 437 matrix (data not shown). Similarly, to extract information from the large data set, we first removed positions with large gaps (>60) and the intracellular loop (sites 296-392) to avoid potential influence of gaps and improper alignment. The remaining data were clustered using the Hierarchical Clustering Explorer 3.0 software. The results are shown in Fig. 4A. Note that a cluster with high correlation coefficients (the yellow box in the bottom right corner) stands out from the large background. The details of this cluster are shown in Fig. 4B, and the positions in this cluster are listed in Table 1.
In search of the activation pathway, we used two statistical analyses along with clustering to systematically identify the genetically interconnected positions in the Cys loop family of ligand-gated ion channels. Highly coupled positions predicted by both methods overlapped by nearly 70% (see below). Mapping these positions onto the three-dimensional structural model demonstrated that these highly coupled positions were mainly clustered in important functional domains, linking the binding pocket through coupling domains to the gating machinery. Thus, our results suggest an interconnected network that may serve as the allosteric activation pathway, coupling agonist binding to channel function. These findings can be used as a guide for experimental design and to facilitate elucidation of the activation mechanism for the Cys loop family of LGICs.
Comparison of Coupling and Correlation Results
To compare the covariant positions identified by the two methods, we list these positions in Table 1. Note that positions predicted by the two methods substantially overlap. In fact, the overlapping sites represent 62% of the total number of positions predicted by SCA and 65% of the total number of positions predicted by McBASC. If we take an additional stringent step by removing the sites with 20 or more gaps (sites 7, 11, 22, 81, 95, 166, 230, 240, 398, 399, 425, and 429), then the results are more consistent, and the overlapping sites represent 69 and 68% of the predictions by SCA and McBASC, respectively. The prediction differences could be due to the different scoring methods: SCA uses amino acid probability and observes changes in the probability distribution in response to a perturbation at a site, made by extracting the fraction of sequences that contain a relatively conserved residue, whereas McBASC uses a score matrix that takes amino acid properties into account and compares all possible pairs. Thus, theoretically McBASC uses sequence data more effectively and therefore could be a better predictor of genetically covariant positions in a protein family. Nevertheless, the positions predicted by both methods are the most reliable ones with high coupling. Positions predicted by only one method should still be coupled but with slightly lower coupling strength.
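The overlap percentages quoted above are the shared positions expressed as a fraction of each method's own prediction set. A minimal sketch of that arithmetic follows; the position numbers are made up for illustration and are not the sites in Table 1, and the function name is hypothetical.

```python
def overlap_percent(a, b):
    """Percentage of positions in set `a` that are also in set `b`."""
    return 100.0 * len(a & b) / len(a)


# Hypothetical position sets, not taken from Table 1.
sca = {1, 2, 3, 4, 5, 6}
mcbasc = {3, 4, 5, 6, 7}
gappy = {1, 7}  # stands in for sites with 20 or more gaps

before = (overlap_percent(sca, mcbasc), overlap_percent(mcbasc, sca))

# More stringent step: drop gap-rich sites from both predictions.
sca_strict, mcbasc_strict = sca - gappy, mcbasc - gappy
after = (
    overlap_percent(sca_strict, mcbasc_strict),
    overlap_percent(mcbasc_strict, sca_strict),
)
print(before, after)
```

Because the denominators differ, the same set of shared positions yields a different percentage for each method, which is why the text reports two numbers for each comparison.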
To visualize this genetically interconnected network, we mapped the positions predicted by both methods (after removing the sites with 20 or more gaps) in the three-dimensional structure of the Torpedo nicotinic receptor
Two salient features are apparent in this interconnected network. First, the positions are highly clustered in functionally important domains connecting the agonist-binding site through the coupling region to the gating machinery, forming a putative activation pathway. Second, many positions are concentrated in the transmembrane domains, a region that has recently become the focus for the sites of action of many allosteric modulators (24). These two aspects are discussed in detail below.
Allosteric Activation Pathway
The strategic location of the highly coupled residues strongly suggests their importance in channel function. The agonist-binding pocket of a receptor is located in the amino-terminal domain at an interface between two subunits, each contributing three binding loops (7). Fig. 5 (B and C) plots highly coupled residues in the context of other residues in two different views (principal face and complementary face). The residues in the high coupling cluster for all three colors in Fig. 5A are now in yellow. Important binding sites are highlighted with red in the principal face and cyan in the complementary face. For the convenience of numbering, both faces are shown in the same subunit. In reality, this is only true for homomeric channels, in which the same subunit contributes both the principal face and the complementary face of the binding sites. In heteromeric channels, the principal and complementary faces of the binding pocket are in different subunits. The overlapping residues are in orange for the overlap between red and yellow or in green for the overlap between cyan and yellow. The gate-forming residue, the conserved M2 Leu, is highlighted in purple. Note that only two highly coupled positions (sites 55 and 149) overlap with binding site residues. This is because many functionally important residues are highly conserved and nonvariant and thus escape detection by covariance analysis. 
However, these binding site residues are flanked by highly coupled residues in both the principal and complementary faces (Fig. 5, B and C). With the exception that predicted high coupling positions flank binding residues in loop E from the top of the molecule (Fig. 5C, sites 67 and 112), all of the other positions form an interconnected network connecting the binding pocket through the coupling region (see below) to the gating machinery. Interestingly, in the amino-terminal domain, the highly coupled residues are distributed only in the inner sheet. This is consistent with current understanding of the activation mechanism as suggested by 4 Å electron microscopic study: activation involves a clockwise rotation of the inner sheet of the amino-terminal domain around its own axis in each subunit (10, 11).
The highly coupled positions also clustered in the contact region between the amino-terminal domain and transmembrane domain. This is a region that is believed to be crucial in coupling binding to channel gating. In fact, the coupling between the amino-terminal domain and the channel domain has been postulated to be mediated by the M2-M3 linker (25-28). More recent studies of the electron microscopy structure of the nicotinic receptor (10, 11) and mutagenesis studies (29-32) further suggest that it is mediated by interactions between amino-terminal domain loop 2/loop 7 and the transmembrane domain M2-M3 linker, although the crucial residues involved in coupling vary with different receptors. The rate-equilibrium free energy relationship analysis suggests that both loop 2 and loop 7 (cysteine loop) are involved in channel activation (33). Loop 9 (Loop F) is also required for the function of a chimera channel (34). More recently, mutant cycle analysis in nicotinic acetylcholine receptor (35) and unnatural amino acid substitution in serotonin receptor type 3 (36) have identified a key residue in the M2-M3 linker for channel activation. This functionally important residue does overlap with the high coupling site 272. Moreover, the residues required for benzodiazepine allosteric coupling in M2 and M2-M3 linker (GABAR 2T281,I282,S291 (37) and GABAR 1V279 (38)) overlap with the highly coupled positions. In addition, our results also provided a potential link between the binding pocket and the coupling region (loops 2 and 7) in this putative allosteric network (Fig. 5A), which may represent the physical basis for inner sheet movement as a "rigid body" during channel activation (11). Finally, highly coupled residues are also clustered in the middle and intracellular end of the M2 domain, a region with the putative ion channel gate (Fig. 
5B, the purple residue in the transmembrane domain) as suggested by many studies (1, 5, 13, 39-41) and ultimately confirmed by electron microscopy studies of the nicotinic receptor at 4 Å resolution (10, 11). Again, the conserved M2 leucine is not predicted by covariance analysis but is surrounded by highly coupled residues. The highly coupled positions, however, do cover the region at the beginning of M2 at the intracellular end, the location of the selectivity filter, which differentiates cationic nicotinic and serotonin receptor channels from anionic GABA and glycine receptor channels (42-46). In addition, other residues in the M2 domain (47, 48) and other transmembrane domains such as pre-M1 (49), M1 (50, 51), M3 (52), and M4 (53, 54) are also important in channel gating, and the M1-M2 and M2-M3 linkers have been suggested to act as hinges governing allosteric control of the M2 domain (11, 27). Given the significance of all four transmembrane domains in channel gating, it is understandable that the highly coupled cluster covers these transmembrane domains. In summary, we have identified an interconnected network that physically links the agonist-binding domains to the channel gating domain. This would represent the entire allosteric network, through which binding signals in the amino-terminal domain can be transduced to gating function at a distant location.
Sites of Action for Allosteric Modulators
In addition to the gate-containing M2, our results showed that the highly covariant cluster also includes positions in the M1, M3, and M4 domains. All four transmembrane domains, especially the extracellular half of M2 and M3, have recently been recognized as important sites of action for many allosteric modulators such as alcohol, general anesthetics, neurosteroids, and barbiturates (24, 55). Allosteric modulators of ligand-gated ion channels are compounds that bind to a site distinct from the agonist-binding site. With the exception of benzodiazepines, which, like agonists, bind to the amino-terminal domain but in a different subunit interface, most allosteric modulators exert their action by binding to the transmembrane domains of the receptor. In fact, the sites of action for many allosteric modulators in all four transmembrane domains overlap or flank the highly coupled positions. These include a site for barbiturate/neurosteroid/etomidate/propofol modulation (GABAR
Concluding Remarks
Our finding that the highly coupled cluster spans the regions from the binding pocket to the gating machinery re-emphasizes an important concept: binding and channel function are mutually coupled (12, 63). This long range coupling requires an interconnected allosteric network. Perturbation of this allosteric network, either by agonist binding or by mutations in binding domains (e.g. loops A (64, 65), B (66), D (67), or E (64)) or gating machinery (13, 40, 41, 47, 48, 68), can alter channel gating behavior and even make a channel open spontaneously in the absence of agonist. Thus, it is the fine balance of all residues in this allosteric network that determines the function of the channel, from agonist binding to channel gating. 
Fine tuning of this allosteric network through coordinated changes of amino acid side chains over the course of evolution preserves channel function and generates the functional diversity of the channels in this family to meet the growing needs of ever evolving brain function. Although our results can provide a useful general reference for structural dynamics studies of ligand-gated ion channels, caution should be exercised when applying the results to a particular member of the Cys loop family at precise positions. First, because of the nature of the interconnection with coordinated mutations, the effect of a single point mutation at a particular site on channel function may vary with different receptors. Second, our analysis could be limited to the coupling between residues within one subunit. It may not account for the interaction between subunits. Because receptors in the Cys loop family have a pentameric structure with five subunits per receptor, interactions between subunits are also important for receptor structure and function. Although detailed interactions between subunits can be determined by another type of analysis such as the subtractive correlated mutation method analyzing linked subunits (69), positions for this intersubunit interaction may already be embedded in the covariant sites of our results, because our analysis includes all subunits. Furthermore, the conformational change in the amino-terminal domain is proposed to be coupled to the channel gating machinery within each subunit (11). Thus, our results can still provide valuable information on the mechanisms of activation and modulation for the Cys loop receptors.
* This work was supported by funds from the Barrow Neurological Foundation and Women's Board (to Y. Chang). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. 1 To whom correspondence should be addressed: Division of Neurobiology, Barrow Neurological Institute, 350 West Thomas Rd., Phoenix, AZ 85013. Tel.: 602-406-6192; Fax: 602-406-4172; E-mail: yongchang.chang{at}chw.edu.
2 The abbreviations used are: LGIC, ligand-gated ion channel; GABA,
We thank Dr. Rama Ranganathan (University of Texas Southwestern Medical Center at Dallas) for providing the method for statistical coupling analysis and Dr. Anthony A. Fodor (Stanford University) for providing the free download program for McBASC analysis.