Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, California, USA
Department of Neurology, University of California San Francisco, San Francisco, California, USA
Biological membranes define the boundaries of cells and compartmentalize the chemical and physical processes required for life. Many biological processes are carried out by proteins embedded in or associated with such membranes. Determination of membrane protein (MP) structures at atomic or near-atomic resolution plays a vital role in elucidating their structural and functional impact in biology. This endeavor had yielded 1198 unique MP structures as of early 2021. The value of these structures has been expanded greatly by deposition of their three-dimensional (3D) coordinates into the Protein Data Bank (PDB) since the first atomic MP structure was elucidated in 1985. Free access to MP structures has facilitated a broader and deeper understanding of MPs, providing crucial new insights into their biological functions. Here we highlight the structural and functional biology of representative MPs and landmarks in the evolution of new technologies, with insights into key developments influenced by the PDB in magnifying their impact.
As membranes were critical to the separation of chemistries essential to the origin of life, membrane proteins (MPs) are key players in some of the most important physiological processes in living organisms. Characterizing MPs structurally and functionally is still extremely challenging because of their frequently low abundance and the difficulty of purifying functional MPs intact from their native membranes, although the field is now going through an exciting revolution due to several key factors. It took several decades to obtain the structural information that now allows us to pursue an understanding of MP function in health and disease, to manipulate MPs as drug targets, and to engineer them into powerful new tools to fuel discovery. We highlight some of the landmarks in this endeavor that drove or depended on the discovery of new technologies required specifically for structural studies of membrane, as opposed to soluble, proteins.
Previously, hand-written letters delivered by post requested coordinate sets that were not always readily given. The vision of Hamilton, Meyer, Koetzle, and the joint venture between the Cambridge Crystallographic Data Center in the United Kingdom and the Brookhaven National Laboratories at Stony Brook University led to the Protein Data Bank (PDB). Then, for the first time, one could begin to ask questions with all the relevant structures at hand. In celebrating 50 years of the PDB, we focus on some of the ingeniously crafted inventions and discoveries that led the way for entire classes of MPs, and on those new approaches that now promise structures of large and complex machines from their native cellular environments, in action.
In the following historical perspective, we provide a brief account of some MP insights (other than the crucial G-protein-coupled receptors (GPCRs) that are the subject of a dedicated review in this volume) and consider the value of the PDB in disseminating this information. Seemingly insurmountable difficulties were often overcome by the invention of new technologies that revealed important structural features of classes of MPs and that now make up the fabric of today's approaches.
In How do membrane proteins accomplish key physiological functions?, we describe how the structures of several of the major MP classes were uncovered, which often required technological developments that are now woven into the fabric of structural biology. First, how are the α-helical, tail-anchored, and all β-sheet MP broad categories correctly targeted to and inserted into membranes and allowed to fold correctly? We progress to landmark discoveries involving the roles of MPs in physiology and some of the critical barriers that had to be overcome to realize these achievements. How do water channels conduct water at diffusion limited rates without allowing leakage of protons (H+) or hydronium ions (H3O+) or any other ions? How do potassium channels conduct K+ ions, but not the Na+ ion of similar charge and smaller ionic radius? How are epithelial cells held together side by side to make a selectively permeable sealed layer of cells? The question of how substrates are transported across membranes using energy then follows. How does one superfamily of “primary” transporters, the “ABC transporters” that directly harness ATP hydrolysis on their cytoplasmic side, move materials across membranes? How does another superfamily of “secondary” transporters, the Major Facilitator Superfamily (MFS), use ion or proton electrochemical gradients to drive nutrient import, export, or efflux of xenobiotics? We progress to consider a specialized class of essential viral proteins that form channels that are essential for viral virulence.
In Membrane protein structures instruct drug design and protein engineering, we focus on a few examples of therapeutics and opportunities for engineering of MPs. Critical to all therapeutic drugs are the membrane-attached cytochrome P450s, which are discussed in the light of their key roles in sterol metabolism, but also as metabolizers of therapeutic drugs, and therefore as drug targets themselves. Structures help elucidate the mechanisms of action of therapeutics that modulate MP activities. The Cystic Fibrosis Transmembrane-Conductance Regulator (CFTR) is used as an example of the impact structural studies can have on the understanding and treatment of rare diseases. Another example addresses glucose import as an anticancer target opportunity. A key example in neurology is the class of ligand-gated ion channels, represented by GABAA receptors (GABAAR) and their ligands. Then the Leucine Transporter (LeuT), representing several classes of transporters with a common core of ten trans-membrane α-helices (TMs), is used to illustrate how major antidepressants work. Finally, some exciting developments in MP engineering focus on channelrhodopsins, in which light can trigger cellular responses, and on engineered dopamine sensors.
We hope that this necessarily limited perspective on the impact of selected MP structure classes may encourage opportunities in a broader context. We hope to communicate the value of MPs as guardians of the health of cells and how their structures, through the PDB, contribute important insights into many crucial aspects of physiology.
A brief history of membrane protein structures and the PDB
The first soluble protein structures, beginning with myoglobin in 1961 and followed by lysozyme, hemoglobin, and digestive proteases, came from proteins that were available only from large animals or from bacteria. Long before overexpression systems became available, the number of soluble macromolecular structures followed an exponential growth. This pattern was noted by Dickerson in a letter written to the PDB in 1978. The first integral MP structure was not determined until roughly 25 years later, in 1985. The number of MP structures determined and deposited in the PDB since that time also increased exponentially, but with a smaller exponent reflecting the considerable challenges that pertain to MP structural biology. The time for the number of unique MP structures to double was about 3 years, compared with 2.4 years for soluble macromolecules. This doubling time for MPs slowed to ∼5 years recently (Fig. 1). The most obvious challenges are functional expression in a limited lipid membrane environment versus expression in a soluble volume, followed by the requirement for detergents, amphiphiles, and lipids during extraction and purification of functional proteins from the membrane. Critical breakthroughs were necessary for many new classes of MPs over the last three decades. As a reflection of these challenges, on January 15th, 2021, there were a total of 4569 coordinate files for MPs in the PDB, barely 2.6% of the total for all proteins, and of these 1203 represent unique structures. This is brilliantly tracked and annotated by Stephen H. White’s invaluable mpstruct database of MPs of known structure (Fig. 1). One of our aims here is to celebrate some of these developments in association with the breakthroughs by those that made them possible.
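The doubling times quoted above follow from a simple exponential-growth model, N(t) = N0 · 2^(t/T), where T is the doubling time. As a rough, illustrative sketch (the function names are hypothetical and are not part of any PDB or mpstruct tooling), the Python snippet below estimates an average doubling time from two counts. Taking the single MP structure of 1985 and the 1203 unique structures of January 2021 as endpoints is an assumption made purely for illustration.

```python
import math


def doubling_time(count_start, count_end, years_elapsed):
    """Average doubling time T (years) assuming N(t) = N0 * 2**(t / T)."""
    if count_start <= 0 or count_end <= count_start or years_elapsed <= 0:
        raise ValueError("need 0 < count_start < count_end and years_elapsed > 0")
    return years_elapsed * math.log(2) / math.log(count_end / count_start)


def projected_count(count_start, years_elapsed, doubling_years):
    """Number of structures expected after years_elapsed for a given doubling time."""
    return count_start * 2 ** (years_elapsed / doubling_years)


if __name__ == "__main__":
    # Illustrative endpoints (assumption): 1 structure in 1985, 1203 unique in 2021.
    average = doubling_time(1, 1203, 2021 - 1985)
    print(f"average MP doubling time: {average:.2f} years")
```

Under these assumed endpoints the average comes out at roughly 3.5 years, which sits between the earlier ∼3-year and the more recent ∼5-year doubling times mentioned in the text.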
Like those in the 1960s who worked on the soluble proteins available in quantity from the tissues of large animals or bacteria, work on the first MP structures also focused on rich natural sources. The purple membrane of archaebacteria was first described in 1971 by Oesterhelt and Stoeckenius, who showed that it contained a light-driven proton pump. This validated the Mitchell hypothesis, the revolutionary concept that biological energy could be stored as a proton gradient across a membrane (
). By freeze-fracture electron microscopy, they showed that “bacteriorhodopsin” formed an in-plane trigonal two-dimensional (2D) lattice. This was pursued structurally by Henderson and Unwin using electron diffraction of the arrays formed in the membrane (
). Henderson and Unwin produced the first electron diffraction patterns from the trigonal latticed membranes, purified from the natural source, by sustaining its single bilayer in a glucose solution to prevent dehydration in the electron microscope (
). In a towering landmark, in 1975, they phased the patterns based on the images and reconstructed a 7 Å resolution 3D structure that beautifully demonstrated that the protein is comprised of seven TMs (
). This first discovery using electron microscopy/diffraction was at the heart of the concepts later recognized by the Nobel prize in chemistry to Henderson, Frank, and Dubochet in 2017. Soon after, in 1977 to 1979, the amino acid sequence of bacteriorhodopsin was determined by the groups of Khorana (
) using chemical sequencing methods. Agard and Stroud developed computational ways to extend resolution perpendicular to the membrane plane in attempts to reveal the interhelix linker regions to map sequence into the structure from electron diffraction data (
). As a reflection of the extreme challenges for membrane versus soluble proteins, the bacteriorhodopsin structure reached atomic resolution by electron diffraction 20 years after the 1975 breakthrough in 1996 (
), respectively, key to oxidative phosphorylation in mitochondria (see Early breakthroughs in High Resolution Structures of Bioenergetic Membrane Complexes and Lipid Interactions) and function of the neuromuscular junction, also provided highly important sources of MP complexes. Beginning in 1971, the first 3D surface shapes of the acetylcholine receptor ∼300 kDa complex of five homologous subunits in a quasi-fivefold pentameric complex αβαγδ were revealed using low-dose negative stain EM and small-angle X-ray scattering from stacked membranes by Stroud and Unwin (
) provided the necessary shielding of all polar groups, namely the carbonyls and amides of the polypeptide backbone. Hence, a sequence of ∼19 hydrophobic amino acids became a recognizable signal for TMs, while strongly amphipathic helices signal membrane surface-seeking helices (
). Remarkable insights followed from the few MP structures already determined, including the finding that their hydrophobic helices formed well-ordered 3D structures in lipid bilayer membranes and their retention even after gentle extraction in amphiphiles and detergents. Yet it took 45 years to obtain the first atomic structures for the acetylcholine receptor. This required screening of many specific genetic constructs of several members of this receptor family (
). After many years of trying to crystallize bacteriorhodopsin, Michel and Oesterhelt had come to appreciate the value of colored proteins. Visible bands on columns aided purification and, more importantly, the color and unique spectral features reflected the integrity of the MP (
) and signaled for monitoring structural preservation in harsh solubilizing detergents. Using photobacteria as a rich source of photosynthetic proteins, Michel and colleagues determined the first MP structure from crystals that diffracted to atomic resolution (3 Å and then at 2.3 Å). The structure of the Rhodopseudomonas viridis photosynthetic reaction center of 145 kDa published in 1984 (
) remains an amazing achievement. It showed that the predominantly hydrophobic α-helices are often longer than enough to span the 40 Å bilayer and can be correspondingly tilted at different angles to harbor the many chromophores that harness the light that supports life. This breakthrough discovery was recognized by the Nobel prize in chemistry in 1988 to Michel, Deisenhofer, and Huber (Fig. 1).
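The idea mentioned earlier in this section, that a run of ∼19 hydrophobic residues is a recognizable signal for a TM helix, can be illustrated with a sliding-window hydropathy average. The sketch below is a minimal illustration and not the method used by any of the cited studies: it uses the Kyte-Doolittle hydropathy scale, and the window length of 19 and cutoff of 1.6 are conventional choices assumed here. The function names and the test sequence are hypothetical. With a rise of about 1.5 Å per residue in an α-helix, 19 residues span roughly 28.5 Å, close to the thickness of the hydrophobic core of a bilayer.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

HELIX_RISE_PER_RESIDUE = 1.5  # angstroms, ideal alpha-helix


def window_hydropathy(sequence, window=19):
    """Mean hydropathy of every window of the given length along the sequence."""
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


def candidate_tm_starts(sequence, window=19, threshold=1.6):
    """Start indices (0-based) of windows hydrophobic enough to suggest a TM helix."""
    averages = window_hydropathy(sequence, window)
    return [i for i, avg in enumerate(averages) if avg >= threshold]


def helix_length(residues):
    """Approximate length in angstroms of an ideal alpha-helix."""
    return residues * HELIX_RISE_PER_RESIDUE


if __name__ == "__main__":
    # Hypothetical sequence: a 19-leucine stretch flanked by charged residues.
    toy = "DKDKDKDKDK" + "L" * 19 + "DKDKDKDKDK"
    print(candidate_tm_starts(toy))
    print(helix_length(19))
```

Real prediction tools add many refinements (charge bias, loop length, homology), so this should be read only as a demonstration of why a hydrophobic stretch of this length stands out in a sequence.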
Ion channels in membranes were in the limelight because they accounted for the currents across membranes that are key to the nervous system, as described by Hodgkin, Huxley, and Katz in 1952, and beautifully elaborated since then. The key was that Na+ ions are consistently pumped out of the mammalian cell using energy from ATP hydrolysis, while potassium ions remain inside to balance the electrochemical potential across the plasma membranes according to the Nernst equation. In neurons, however, the need for fast communication relies on rapid conduction of the Na+ ions inward that is signaled by the release of neurotransmitters at neuronal synapses that depolarize the plasma membranes of the target cell. This is then rapidly followed by release of K+ ions through highly selective channels that do not leak the smaller Na+ ions, but restore the transmembrane electrical potential. How is such selectivity accomplished? The alkali metal ions Na+ and K+ differ in ionic radius, so why don’t K+ channels leak the smaller Na+ ion? The answer came in MacKinnon’s finding that bacterial K+ channels existed and could be extracted for structure determination. In 1998, he reported the first K+ channel structure (KcsA) from the bacterium Streptomyces lividans. The structure showed precisely how its selectivity was achieved. The key was the precise fourfold arrangement of lines of carbonyl oxygens that surround a “selectivity filter” (SF) in the pore entry. They provide the precise counterpart for the normal water hydration shell around the K+ ion, but are too far apart to provide a similar energy balance for the Na+ ion. MacKinnon was awarded the Nobel Prize in chemistry for this landmark discovery in 2003 (Fig. 1).
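The Nernst equation invoked above relates an ion's concentration ratio across the membrane to its equilibrium potential, E = (RT / zF) · ln([ion]out / [ion]in). The short Python sketch below evaluates it for K+ and Na+. The concentrations are typical textbook-style values for a mammalian cell assumed here for illustration only; they do not come from this article, and the function name is hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618      # J / (mol K)
FARADAY_CONSTANT = 96485.33212  # C / mol


def nernst_potential_mv(conc_out_mm, conc_in_mm, charge=1, temperature_c=37.0):
    """Equilibrium (Nernst) potential in millivolts, inside relative to outside."""
    if conc_out_mm <= 0 or conc_in_mm <= 0:
        raise ValueError("concentrations must be positive")
    temperature_k = temperature_c + 273.15
    volts = (GAS_CONSTANT * temperature_k) / (charge * FARADAY_CONSTANT) * math.log(
        conc_out_mm / conc_in_mm
    )
    return volts * 1000.0


if __name__ == "__main__":
    # Assumed illustrative concentrations in mM (not taken from the article).
    e_k = nernst_potential_mv(conc_out_mm=5.0, conc_in_mm=140.0)
    e_na = nernst_potential_mv(conc_out_mm=145.0, conc_in_mm=12.0)
    print(f"E_K  = {e_k:.0f} mV")
    print(f"E_Na = {e_na:.0f} mV")
```

With these assumed values the K+ equilibrium potential is strongly negative (near -90 mV) and the Na+ potential is positive (above +60 mV), which is why opening Na+ channels depolarizes the cell and opening selective K+ channels restores the resting potential.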
The 2003 Nobel prize in chemistry was shared with Agre, recognizing another remarkable discovery, that of water channels (
) demonstrated that red blood cells had water channels that were inhibitable by mercurials, implying that a sulfhydryl containing protein was responsible. They showed that this was true across many (now all) species, and Nielsen and Agre (
). These were more difficult to recognize than ion channels because there were no electrical properties to measure their activity. Their discovery was made while searching for the well-known rhesus (Rh) factors in red blood cells. They discovered a second major MP in these cells, and then showed that it conducted water, and at a speed up to the diffusion-limited values for a pore of the requisite size (see below). Structures of the aquaporins were determined in 2000 by electron imaging (
), showing precisely how these channels excluded passage of protons (H+) or hydronium ions (H3O+), while allowing a single file of water molecules to progress, hydrogen bonded to themselves, and to eight key carbonyl oxygen atoms that act like pitons along the walls of the channel (
) and its dynamo-like rotation that harnessed proton gradients to drive the synthesis of ATP in mitochondria. These critical structures of a ∼500 kDa complex provided the amazing link to elucidate the subunit stoichiometry of eight different protein chain types needed to harness proton gradients to produce ATP, the major source of energy in all cells. The discoveries led to the Nobel Prizes in chemistry in 1997 to Boyer and to Walker, who reported the atomic structure of the complex in 1994 (
The LeuT structure determined by Gouaux's group signaled a new approach to several related transporters that are critical to mental health, to responses to neurotransmitters, and to therapeutic and street drugs. This first structure provided a landmark insight into the transporters that are required to repackage and reabsorb neurotransmitters back into the neuronal cell after release into neurological synapses. Dopamine and serotonin transporters, and others from the same family, are all targets of therapies for the critically important spectrum of neurological disorders. These transporters are also key to the control of major mental diseases, including such poorly understood disorders as schizophrenia and bipolar disease that together affect ∼1/50 persons globally. The PDB now provides a critical catalyst for disciplines that need to “know what we are dealing with” in human health versus disease. It creates a scenario in which we can think about and design strategies for modulating therapeutic responses, as well as providing an ongoing reminder of how evolution has separated chemistries in order to provide the essence of life.
With increasing impact on prospects for drug discovery, structures of membrane-embedded proteases that cleave transmembrane regions to release components are critical. One of the first key examples was discovered by Brown, Goldstein, and colleagues in the critical regulation of cholesterol metabolism. In recognition of their many contributions to the broader field of cholesterol metabolism, they were awarded the Nobel prize in Physiology or Medicine in 1985. They subsequently showed that a membrane-embedded S1P serine protease cleaves the sterol regulatory element-binding protein SREBP. A second intramembrane protease S2P, a zinc metalloprotease, liberates a transcription factor from the N terminus of SREBP that then upregulates cholesterol synthesis. These proteases also coordinate fatty acid synthesis, which is likewise required for membrane biogenesis (
). Rhomboid proteases constitute another superfamily of ubiquitous intramembrane proteases involved in a wide range of biological processes and were the first structures of intramembrane proteases to be determined (
). Another critical example of intramembrane proteolysis is the cleavage of the β-amyloid precursor protein (APP) by the β-site cleaving enzyme (BACE1) that converts APP to the pathogenic forms Aβ42/40. Validated by mouse gene knockouts, BACE1 and related γ-secretase (
). Many of the intramembrane proteases destabilize the α-helical regions that characterize transmembrane proteins and often don’t rely on specific amino acid side chains for specificity as do the soluble proteases.
As a portent of the impact of electron microscopy, the discovery of the TRPV1 channel structure came after several groups had purified and tried extensively to crystallize Transient Receptor Potential (TRP) channels for many years in seeking the structural basis for their roles in pain sensing (
As a critical library of the molecular bases for MP function, the PDB provides a historical timestamp of the progression in our understanding of these important proteins. It uniquely affords a quantitative visual imprint of the mechanisms of these proteins as the gatekeepers, importers, and exporters of the cell and of organelles within the cell, to the benefit of current and future generations.
Early breakthroughs in high-resolution structures of bioenergetic membrane complexes and lipid interactions
The story of MP structure determination and its profound impact on understanding biological systems is a fascinating illustration of the power of technology development and the value of rapid communication of detailed structural information through the PDB. The problem of transferring MPs from lipid environments to an aqueous phase is substantial and was perhaps initially underestimated due to the prevalent concept that the detergents needed to accomplish this feat were intrinsically disordered and simply required to create a not-necessarily-homogeneous hydrophobic phase to cover newly exposed membrane-embedded regions. Indeed, the importance of pure, well-defined detergents was not immediately recognized, and when it was, the chemistry for creating and purifying these amphipathic compounds was challenging. But gradually it became clear that achieving high-resolution MP crystal structures required a protein sample that had not only native activity and a high degree of purity but also was dependent on the specific structure and purity of the detergent itself for achieving these ends. In many cases, success also depended on retaining a subset of structurally important lipids (
), the importance of specific lipid interactions became increasingly apparent: the lipidic detergent molecules as well as native lipids were not just satisfying the need for a disordered hydrophobic phase, but many were actually fitting into defined, conserved slots in the protein structure. The most dramatic illustrations of this fact came initially from the complexes of the electron transport chain. One of the first of these to be obtained at high resolution was the structure of mammalian cytochrome c oxidase, also known as Complex IV (
), in which many lipids were resolved, with up to 13 bound to each monomer of ∼200 kDa molecular weight. These lipids, characterized by mass spectrometry, were found to be specific not only in their placement, occupying the same positions in each preparation, but also in head group and tail length and degree of saturation (
), demonstrating that the lipid/detergent-binding sites play important and specific roles in the native protein structure. Similar results were found in crystal structures of mammalian and yeast cytochrome bc1 (Complex III of the electron transport chain) (
). Although achieving very high-resolution structures (<2 Å) remains difficult, high resolution is essential where function involves conformational change or depends on the positioning of water. Again, the large multisubunit complexes of the mitochondrial electron transport chain have provided insightful success stories, in part due to their distinct spectral characteristics that aided purification of native forms. These membrane-embedded, spectrally unique proteins were among the first to be structurally defined at high resolution, after the groundbreaking achievement of the bacterial reaction center by Deisenhofer et al. (
), and helped move the field of bioenergetics into a new era of mechanistic understanding. The structures of these large complex proteins revealed the exact positioning of their many prosthetic groups, enabled the subunit composition to be confirmed, and allowed meaningful interpretation of spectral changes and fast kinetic analyses of electron transfer events. Their unique UV–visible characteristics were powerful tools for assessing the native state of the purified proteins and for analyzing the rates and ordering of electron and proton transfer events. Without structures that showed water positions and sometimes distinct hydrogen-bonded water chains (
), their kinetics could not be definitively interpreted. High-resolution structures led to the clarification of the proton transfer process, by the Grotthuss mechanism, first suggested in 1806 by Theodor von Grotthuss (
) and brought us to our current understanding of how coupling of electron transfer, oxygen reduction, and proton transfer leads to energy conservation in most living organisms. But the supreme success story from the bioenergetics field was capturing the machinery that completes the energy transducing process, the ATP synthase (
) whose structure enabled the deciphering of how rotational motion could be driven by the proton motive force to generate ATP. Beautiful structures of this amazing machine from the Walker group and others, initially without its membrane-embedded region but still the largest asymmetric protein structure known at the time, continue to reveal new and surprising aspects of how energy is transduced (
) that involved mixing a specific ratio of lipid (such as monoolein) and aqueous phase to form a membranous lipid organization into which the protein could insert, maintaining a membrane-like environment. Otherwise difficult-to-crystallize proteins are able to diffuse and form well-diffracting crystals in this phase, with many success stories (
). Bicelles are similar but are self-assembling small model membrane systems edged by bile salt lipids. Bicelles were initially designed to aid in aligning MPs in a magnetic field suitable for NMR analysis (
). Nanodiscs, with added stability and control of size, have been useful for many types of analysis including single-molecule studies, activity studies, and resolving aggregation state issues, as well as facilitating the use of cryo-EM. This latter major technical development has further enabled MP structure determination, with the use of single-particle cryo-EM and the nanodisc method showing great success (
A revolution in cryo-EM detector methodology and computational analysis has facilitated new levels of resolution of very large complexes in different conformations even in a single sample (Fig. 1). Again, the electron transfer chain has been the subject of dramatic early success stories, with the most impressive being complex I (
), the largest respiratory chain component with 45 subunits. These structures also show specifically bound lipids and reveal the likely proton routes through the membrane segment. Having the Complex I structure was essential, in turn, to solve the atomic resolution structures of supercomplexes up to 1.7 MDa.
The story of supercomplexes of the electron transport chain is a long and controversial one (
). One key to successfully defining the atomic resolution structures of these unusually large assemblies has been the availability of X-ray structures of all the components, along with sophisticated computational fitting methods. A number of different assemblies have been discovered (
), which shows an assembled dimer of two supercomplexes and suggests a place into which Complex II, the only missing member of the chain, would fit. The continuing issue of the functional significance of these assemblies remains. The question of what is the true native assembled state and how dynamic it is contributes to this issue (
). Since Complex IV appears as a monomer in all the supercomplexes so far established, this may be revealing an important new regulatory mechanism.
These amazing structures of the respiratory machinery have led the way in illuminating not only unique protein structures and functions, but also the fundamental role of lipid in MP integrity. Their accessibility through the PDB in the most useful formats continues to allow us to test hypotheses and to excite our imagination, stimulating new modeling efforts and new concepts in the field of bioenergetics and pushing the boundaries of our understanding of energy metabolism in health and disease. These early results provided the initial demonstration that it was possible to obtain crystal structures of MPs of even massive dimensions and expanded our understanding of lipid environments that were needed to permit this achievement.
How do membrane proteins accomplish key physiological functions?
Residing in the boundaries of cells and their organelles, MPs and their complexes underpin many important aspects of cell physiology that can be addressed through structures held in the PDB. A representative repertoire of these physiological functions raises several questions. How are MPs correctly chaperoned and inserted into their correct host membrane, and how are some proteins secreted across membranes? How is osmotic balance maintained by water channels that do not leak any protons or ions, and how are ions specifically passaged by gated channels? How are cells held together in surfaces of the skin and organs with appropriate paracellular transport? The repertoire also includes the bioenergetic transformations required for ATP synthesis (respiratory electron transfer complexes, ATP synthases), maintenance of membrane potential (P-type ATPases), production and control of ion gradients (ion channels), sequestration and export of key regulators such as calcium ions (sarcoplasmic reticulum and plasma membrane calcium ATPases), and transport of nutrients, toxins, and xenobiotics through primary active transporters driven by ATP hydrolysis (ABC transporters) and secondary active transporters driven by ion or proton electrochemical gradients. We summarize the MFS transporters and also describe viroporins (VPs), small viral MPs that form channels necessary for virulence. These structures offer routes to drug discovery that are often broader than simply inhibition, such as modulation or even enhancement of activities. While the examples we have chosen illuminate the value of the PDB in providing structure–function insight into basic biological phenomena and their possible application, our goal is to provide an instructive vision rather than a comprehensive overview.
How are membrane proteins targeted, inserted, and folded correctly?
How do MPs come to be? How do MPs find their final destination and fold properly in a lipid bilayer? This “chicken and the egg” question has been partially answered over the last 50 years since the PDB was established in 1971. Like all proteins, MPs begin their journey as nascent polypeptides that emerge from the ribosome with their final destination encoded in their primary structures. Following the discovery by Sabatini and Blobel in 1971 that proteins have intrinsic signals, encoded in several amino acids near the emergent N terminus, that govern their transport and localization in the cell (
), a large body of biochemical and functional studies have identified the machinery and dissected the steps involved in the sorting, targeting, and chaperoning of MPs through the cell toward their final destination. Experimental validation of the “signal hypothesis” led to powerful and widely used computational methods aimed at predicting the diverse protein-sorting signals (
) for subcellular compartments or organelles such as the endoplasmic reticulum (ER), mitochondria, lysosomes or chloroplasts, and other plastids in all living organisms.
The structural biology of MPs has not only provided remarkable insights into the architecture and mechanisms of action of channels, transporters, pumps, and receptors but also elucidated key aspects of the molecular machinery involved in their biogenesis: from the targeting and sorting as they emerge from the ribosome until their translocation across or insertion into biological membranes. The translocation/insertion step is catalyzed and assisted by translocons, a general term referring to the MP machinery facilitating this process. The Sec61/SecY universal translocon (Sec standing for Secretory) is central to the translocation of most membrane and secreted proteins while other “specialized” translocons can be highly substrate-specific.
Most proteins are thermodynamically stable in their native state, yet the process of translocating a polypeptide across or into a membrane is thermodynamically unfavorable. This is due to the large entropic costs associated with (i) moving polar or charged amino acids across the inner hydrophobic part of the lipid bilayer and, conversely, (ii) moving hydrophobic amino acids across the outer membrane surface layers consisting of the phospholipid polar head groups. These costs are particularly significant for integral MPs with large soluble loops or domains and complex topological arrangements of hydrophobic TMs. In the last two decades, the PDB has been the repository of increasing numbers of structures of soluble and membrane proteins that help MPs reach their final destination and fold properly, thus contributing to deciphering the general principles and mechanisms underlying “protein-assisted” MP biogenesis.
The signal sequence-dependent universal targeting pathway of membrane proteins
Most MPs follow the Signal Recognition Particle (SRP)-dependent cotranslational targeting pathway. At the core of this evolutionarily conserved machinery is the catalytic and structural SRP RNA associated with the SRP54 protein. This core is involved in complex assembly, signal sequence recognition, and nascent chain transfer from the ribosome to the Sec61/SecY translocon (Fig. 3A). The presence of an RNA component hints at its ancient origin, when RNAs played more prominent roles as catalysts and structural scaffolds in primordial cells. Targeting relies on three crucial steps: (i) the interaction between the methionine-rich (M) domain of the soluble SRP54 protein that recognizes and binds N-terminal “greasy” signal sequences as they first emerge from the ribosomal tunnel (
) to form a targeting complex (TC) at the ER or plasma membranes, resulting in (iii) the delivery of the ribosome-nascent chain (RNC) from the SRP54/SR complex docked to the translocon upon structural rearrangement of the SRP54/SR core on its SRP RNA scaffold and the channel-stimulated reciprocal GTP hydrolysis that dissociates SRP from its receptor.
The conserved heterotrimeric channels, Sec61αβγ in eukaryotes or SecYEG in prokaryotes, act as portals and gatekeepers for the insertion of most of the MPs and secreted proteins. The large subunit forms the protein-conducting pore itself and is composed of ten TMs, where the first five TMs are related to the second five TMs by a twofold quasi-symmetry in the plane of the membrane (
). This internal pseudo-symmetry divides the channel into two halves articulated around a hinge between TMs 5 and 6 (Fig. 3B). This channel has a unique architecture with two conducting paths oriented perpendicular to each other: an hourglass-shaped pore for the secretion of proteins across the membrane, and a lateral gate, delineated by TMs 2, 3, 7, and 8, that opens to chaperone TMs of integral MPs destined for insertion into the bilayer (Fig. 3C). The protein-conducting pore is occluded by a short plug helix and sealed at its constriction point by a ring of hydrophobic residues (Leu, Ile, and Val) that prevents leakage of water and ions (
). The funnel-shaped cytoplasmic vestibule of the translocon is large enough to accommodate incoming folded helices while soluble loops can extend at the interface between ribosome and channel. The α-helical signal sequence of the nascent chain emerges, opening the lateral gate as it displaces and supplants TM2. As lateral gate TMs shift, the ring and plug helix are widened and displaced/unfolded, respectively, to make room for this nascent chain that contains a folded secondary structure (Fig. 3B).
While the Sec61/SecY protein-conducting channel provides a clear path across or into the membrane, it is not the sole arbiter of MP insertion topology. Proteins start to fold as soon as space allows, and the ribosome also plays an important role, as the nascent chain may or may not fold while it elongates from within the ribosomal exit tunnel into the Sec61/SecY channel (
). Thus, in vivo, rather than start folding inside a lipid bilayer environment, MPs begin this process within the ribosome tunnel and the Sec61 translocon, where secondary structure elements can form and associate into smaller kernels of tertiary structure (
) and where, perhaps more importantly, the insertion topology is assigned as the nascent polypeptide enters into the membrane plane by lateral diffusion. Prokaryotic and eukaryotic integral MP topologies generally follow the so-called “positive inside” rule for TM orientation as established by Sipos and von Heijne (
). In this regard, the Sec complex associated with a ribosome forms an Anfinsen cage around the nascent chain, surrounding the nascent polypeptides with a protective “enclosure” and environment where folding occurs unhindered, thus preventing aggregation or misfolding (
) in prokaryotes have been solved. These structures reveal that the Sec61/SecY translocon is part of a highly dynamic “MP assembly line,” sometimes referred to as the holo translocon, with “quality control” mechanisms (
). The human genome encodes about 2500 multipass MPs (such as GPCRs, ion channels, and ABC transporters) of considerable topological complexity. Lateral insertion and folding of such polytopic MPs require the intervention of Sec61-accessory complexes such as the Endoplasmic Reticulum membrane protein complex (EMC). Although our understanding of the mechanisms governing protein secretion and insertion has progressed, the specifics of MP complex assembly into homo- or hetero-oligomers still elude us.
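The “positive inside” rule mentioned above lends itself to a small illustration. The following Python sketch is a toy heuristic, not a published topology predictor: it assumes a single TM helix whose two flanking loop sequences are already known, counts Lys and Arg residues on each side, and guesses that the more positively charged flank stays in the cytoplasm. The function names and the example sequences are hypothetical and were invented for this illustration.

```python
# Toy illustration of the "positive inside" rule (an assumption-laden sketch):
# the flank of a TM helix that carries more Lys/Arg is guessed to be cytoplasmic.

POSITIVE = set("KR")  # lysine and arginine, one-letter codes


def count_positive(segment: str) -> int:
    """Count Lys and Arg residues in a loop segment (case-insensitive)."""
    return sum(1 for aa in segment.upper() if aa in POSITIVE)


def predict_orientation(n_flank: str, c_flank: str) -> str:
    """Guess which flank of a single TM helix faces the cytoplasm.

    Returns "N-in" if the N-terminal flank has more K/R residues,
    "C-in" if the C-terminal flank has more, and "ambiguous" on a tie.
    """
    n_pos = count_positive(n_flank)
    c_pos = count_positive(c_flank)
    if n_pos > c_pos:
        return "N-in"
    if c_pos > n_pos:
        return "C-in"
    return "ambiguous"


if __name__ == "__main__":
    # Hypothetical flanking sequences, invented for illustration only.
    print(predict_orientation("MKRSKK", "SDETG"))
```

Real topology prediction also weighs TM hydrophobicity, loop length, and the behavior of neighboring helices, so this count should be read only as a way to make the rule tangible.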
The targeting of tail-anchored proteins and new paradigms in membrane protein biogenesis
Most membrane-inserted proteins harbor an N-terminal signal sequence, which defines the main route for protein targeting and secretion/insertion. A significant subset of MPs uses different routes and distinct protein machineries. Tail-anchored (TA) proteins (proteins with a single C-terminal TM anchor) constitute about 5% of any given eukaryotic proteome and are targeted with exquisite specificity to the diverse organelles (e.g., ER, mitochondria, lysosomes) within the eukaryotic cell. The yeast Guided-entry of tail-anchored proteins (Get) pathway, known as TRC in mammals, was discovered in 2008 (
). It represents a perfect example of a posttranslational MP targeting system in the ER. TA proteins do not follow the SRP targeting pathway because they lack an N-terminal signal sequence. Their targeting signal consists of their single C-terminal hydrophobic TM, which only emerges from the ribosomal channel once the rest of the polypeptide has been synthesized. In yeast, the Get pathway involves at least six proteins. Among these, the soluble ATPase Get3 and the two MPs Get1/Get2 are involved in the final steps of TA targeting to, and insertion into, the membrane, respectively (
), much akin to the M domain of SRP54. The long N-terminal cytoplasmic end of Get1 captures the Get3/TA complex and targets it to the membrane, where the cytoplasmic coiled-coil extension of Get2 interacts with the docked Get3/TA complex to stimulate the release of its TA cargo before insertion into the membrane.
The exact mechanism of TM insertion by Get1 (with Get2) has yet to be elucidated. However, Get1 is a member of the YidC/Oxa1/Alb3 superfamily of proteins that insert proteins into membranes in bacteria, mitochondria, and chloroplasts, respectively (
), which is involved in the cotranslational insertion of polytopic MPs, such as GPCRs, in association with the Sec61 translocon, and also the posttranslational insertion of some more hydrophilic TA proteins. In particular, EMC assists Sec61 with the insertion of MPs with complex topologies and destabilizing features in their TMs such as charged residues (
). Remarkably, the structure of another Sec61-associated complex containing five accessory factors (TMCO1-CCDC47 and Nicalin-TMEM147-NOMO), distinct from EMC but also involved in the biogenesis of hundreds of multipass MPs, was recently solved (
), a model for the insertion of TMs and TAs has emerged. A “hydrophobic slide” is created between TMs 1, 2, and 5, while the hydrophilic environment generated by the groove can recruit the extracellular regions of substrates into the low-dielectric environment of the membrane, thus facilitating insertion (Fig. 4B). Furthermore, the protein architecture and its interactions with membrane phospholipids result in an asymmetric thinning of the membrane on the cytoplasmic side near the aqueous membrane cavity.
The nine-subunit structure of EMC reveals that TMs from subunits EMC3, EMC5, and EMC6 form a lumen-sealed, lipid-exposed intramembranous groove large enough to accommodate a single TM, similar to YidC. Furthermore, protein translocation involves a cytosolic hydrophilic vestibule (EMC2/EMC8-9) at the interface between EMC2 and EMC3, EMC5 and EMC6, and methionine-rich loops on EMC3, which probably accommodate the bulkier and more hydrophilic ends of TA proteins and capture their hydrophobic TA, respectively. EMC uses spatially distinct yet coupled regions, including lipid-accessible membrane cavities and cytosolic surfaces, to function as an insertase for TA proteins and a protein holdase-chaperone for complex polytopic MPs (
). Membrane thinning around the EMC further lowers the energy barrier for translocation (Fig. 4C). While the Get machinery targets client proteins with highly hydrophobic TAs, the EMC seems to prefer TA proteins with lower hydrophobicity. Thus, triage of TMs and TAs into the different insertion machineries (i.e., EMC versus Get) relies, at least in part, on their relative hydrophobicity (
). These recent structures illustrate the roles of electrostatics, hydrophobicity, and protein architecture in establishing the topology of TM translocation through shaping and thinning of the surrounding membrane to facilitate insertion.
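The hydrophobicity-based triage described above can be made concrete with a short sketch. The Python code below scores a TM segment by its mean Kyte-Doolittle hydropathy and sorts it into a “Get” or “EMC” bin. The choice of the Kyte-Doolittle scale, the cutoff value of 2.5, the function names, and the example segments are all assumptions made for illustration; the text only states that triage depends, at least in part, on relative hydrophobicity, and it gives neither a scale nor a threshold.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment: str) -> float:
    """Average Kyte-Doolittle hydropathy over a TM segment."""
    if not segment:
        raise ValueError("segment must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in segment.upper()) / len(segment)


def triage(segment: str, cutoff: float = 2.5) -> str:
    """Assign a TM/TA segment to a pathway bin by mean hydropathy.

    The cutoff is an arbitrary, hypothetical value chosen for this sketch;
    it is not taken from the literature.
    """
    return "Get" if mean_hydropathy(segment) >= cutoff else "EMC"


if __name__ == "__main__":
    # Hypothetical TM segments, invented for illustration only.
    for tm in ("LLIVLLIVALLIVLLIVAL", "LAGSLTAGVSLAGTLSAGV"):
        print(tm, round(mean_hydropathy(tm), 2), triage(tm))
```

A single cutoff is a deliberate simplification. The text notes that triage relies on hydrophobicity only in part, so a realistic classifier would need additional features such as charged residues within the TM.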
Folding it backward: The case of β-barrel transmembrane proteins
Helical MPs are ubiquitous in prokaryotes and eukaryotes. Transmembrane β-barrel proteins constitute another functionally important class of integral MPs composed mostly of membrane-spanning β-strands; they are exclusively found in the outer membranes of Gram-negative bacteria and the membranes of mitochondria, chloroplasts, and other plastids in eukaryotes. The transmembrane β-barrel scaffold is based on an antiparallel sheet of β-strands (usually an even number) arranged into a cylindrical structure delineating a central and functional pore or cavity. The first transmembrane β-barrel protein structures, PhoE and OmpF, were solved in 1991 to 1992 by the Jansonius/Rosenbusch and the Schulz groups (
) reveals a “templating” mechanism based on sequential β-augmentation for the folding and insertion of β-strands into membranes and the growth of the cylindrical β-barrel structure. The BAM complex is composed of the membranous subunit BamA and four lipid-binding soluble proteins (BamBCDE) (
). The central component BamA is itself a β-barrel protein that forms a closed 16 β-stranded antiparallel barrel when not engaged with a substrate (Fig. 5A).
The structure of BamA trapped with a model β-barrel substrate (a modified version of its barrel) shows that the active BamA barrel is splayed open along its lateral gate, exposing the two N-terminal strands β1 and β2. The N-terminal β1 strand no longer hydrogen bonds with C-terminal strand β16, as seen in the resting structure of folded BamA. Instead, it forms six hydrogen bonds with the C-terminal strand β16 of its barrel substrate protein (Fig. 5B). However, the resulting hybrid barrel is asymmetric since the C-terminal strand β16 of BamA does not pair with the N-terminal β1 strand of the barrel substrate. By binding the N-terminal edge of BamA as an extended β-strand, the C terminus of the substrate forms a new edge that serves as a template to guide binding and folding of the subsequent β-strand by β-augmentation. As a consequence, folding is directional and proceeds from C- to N terminus (Fig. 5B). Interestingly, the six hydrogen bonds between strands in the membrane environment form a very stable interface between the two proteins, BAM and the nascent β-barrel substrate. Such stability enables folding but results in a high kinetic barrier for substrate release once folding is complete. Sequence features at each end of the substrates overcome this barrier and favor substrate release by stepwise exchange of hydrogen bonds (
). More recently, two cryo-EM structures of the Sorting and Assembly Machinery, structurally and functionally related to BAM, also revealed an opening of the lateral gate as substrates are inserted and suggested that entire precursor proteins might fold using a β-barrel switching mechanism.
The general dogma is that translocons must harness some form of energy to lower the energetic barrier associated with the translocation of a polypeptide and to function as translocases and/or insertases. The most common energy sources for such processes are the binding and hydrolysis energy of nucleotides (i.e., ATP and GTP) and also proton gradients or membrane potential that can impart directionality. In particular, ATPases such as the SecA ATPase, which energizes the SecY-dependent posttranslational protein secretion in bacteria (
), or AAA+ ATPases specific to secretion systems found in pathogenic bacteria or protozoa, couple the energy of ATP binding and hydrolysis to the generation of mechanical forces used to unfold and translocate their substrates. These translocons use so-called polypeptide “clamps”; they engage the polypeptide main chain nonspecifically and keep it unfolded as it is threaded through a proteinaceous pore traversing the membrane. In contrast, the recent structures of the EMC and BAM machineries reveal in exquisite detail how protein-assisted translocation and insertion combine mechanisms such as membrane thinning, hydrophobic sliding, and protein-templating to decrease the energetic barrier associated with the translocation of a polypeptide across, or its insertion into, the bilayer without apparent expenditure of energy.
Most extreme translocation: Effector protein and virulence factor secretion in pathogens
The journey of some secreted proteins across the cell can be more arduous in the extreme cases of intracellular pathogens such as the deadly malaria parasite Plasmodium falciparum (Pf). Malaria is a disease mentioned in ancient Sumerian and Egyptian texts (1550 BC). In 2018, Plasmodium infected some 230 million humans and claimed about 435,000 lives. Upon infection of its human host, this obligate intracellular parasite dwells within a parasitophorous vacuole (PV) derived by invagination of the membrane of infected hepatocytes or erythrocytes. Plasmodium is a master cell renovator; it exports hundreds of effector proteins and virulence factors across the PV membrane into the host cell in order to gather nutrients, eliminate waste, persist and thrive in its host, and evade the immune response (
)—a compendium of transporters, pumps, and channels involved in ion flux, nutrient/metabolite import, and waste/drug efflux within the parasite and the infected host cell such as the hexose transporter PfHT1 and the chloroquine-resistance transporter PfCRT. In 2009, de Koning-Ward et al. (
) discovered the complex responsible for vacuolar secretion and named it Plasmodium translocon of exported proteins (PTEX) (Fig. 6A).
PTEX is composed of three essential core subunits: The pore-forming protein EXP2, the AAA+ protein unfoldase HSP101, and the adaptor PTEX150. Two accessory subunits, PTEX88 and TRX2, complete the assembly (
), including the novel MP structure of EXP2 that forms the protein-conducting pore (Fig. 6B). Most of the effector proteins are essential to parasite survival. Thus, the vacuolar secretion pathway, with its Plasmodium-specific and unique vacuolar translocon, provides prime targets for the design of novel antimalarial drugs.
The 1.6 MDa PTEX complex solved by Ho and Beck was obtained from endogenous sources, expressed at its natural (low) abundance, and extracted directly from human red blood cells infected with Plasmodium parasites CRISPR/Cas9-edited to express HSP101 bearing an affinity purification tag at its C terminus. While purification from endogenous sources is not a novel strategy per se, the clever use of CRISPR/Cas9 gene editing by Beck and Goldberg (
) on an “unruly” organism such as Plasmodium, traditionally recalcitrant to facile genetics, opens new frontiers to tackle the structures of scarcely available complex molecular machines that could not be reconstituted using more traditional approaches based on recombinant expression. Furthermore, extraction of the endogenous PTEX complex in the presence of the slowly hydrolysable nucleotide analog ATPγS allowed the complex to be caught “in the act” in two distinct states engaged with endogenous cargo trapped in the translocation channel of the HSP101 ATPase (Fig. 6B), providing valuable insights into the mechanism of effector secretion. PTEX is an example of a purely posttranslational translocon and, in contrast with Sec61/SecY, it seems devoid of insertase function and exclusively provides a path for posttranslational secretion across the vacuolar membrane (
). This example illustrates the power of a “shotgun” approach combining cryo-EM (or cryo-ET) and mass spectrometry-based proteomics to isolate, identify, and resolve ensembles of large macromolecular assemblies in complex mixtures such as crude cellular extracts.
Each of these structures brings us back to our “chicken and the egg” problem. During the emergence of life from primordial macromolecular systems, it is likely that primitive MPs spontaneously inserted into simple lipid bilayers. Spontaneous insertion of very hydrophobic TMs has been observed for MPs with relatively simple topologies. The emergence of a translocon with its accessory factors and protein quality control machinery probably paralleled the increasing complexity of MP folds, cellular compartmentalization, and the associated evolutionary pressure to evolve more sophisticated and dynamic proteinaceous systems to catalyze and chaperone their folding and proper biogenesis. In-depth analyses of the YidC and SecY structures and of the requirements for the biogenesis of different classes of integral MPs and secreted proteins shed light on the origins and evolution of the most ancient “translocators.” Lewis and Hegde speculate that SecY originated as a YidC homolog formed of two hydrophilic grooves juxtaposed within an antiparallel homodimer (
). The proposed “molecular filiation” between the two major MP biogenesis factors offers a new perspective on a fundamental step in the evolution of macromolecular biological systems and cells.
Water transport by aquaporins and ammonia gas transport by the Rh family
The aquaporins (AQPs) provide a striking example of how, over two decades, the PDB coordinated structures into functional knowledge. Because life depends on water, regulation of water movement across cells without leakage of any ions or protons is an essential function. This function is encoded in a family of highly regulated water channels that are expressed in organ-, tissue-, and cell-specific locations (
). By the 1960s, there was evidence that water permeated membranes faster than it could diffuse through lipid bilayers. Red blood cells conducted water with a low activation energy barrier, while oocytes had very high resistance to water conductance. Benga and colleagues showed in 1986 that this water conductance could be inhibited by p-chloromercuribenzoate. The reversal of this inhibition by reducing agents implied a protein channel that contained a sulfhydryl accessible to mercury ions (
), in a search for Rh blood group antigens in highly porous red blood cells, found two protein bands—one of 32 kDa and the other 28 kDa. Expression cloning in frog oocytes of the 28 kDa “Channel-like Integral membrane Protein of 28 kDa” (CHIP28) (
), which equated to the diffusion limit for a pore the size of a single file of water molecules. More AQP homologs were discovered and the family provided essential functions represented in all living species from bacteria to humans (
). AQP3 is permeable to glycerol and water while AQP9 has even broader specificity. There is also an ion-conducting AQP. AQP6 conducts water poorly and is unique in conducting ions through the same water-conducting channel, supported by altered sequence in the pore, or possibly through the fourfold axis of the tetramer (