Selenoprotein Gene Nomenclature

The human genome contains 25 genes coding for selenocysteine-containing proteins (selenoproteins). These proteins are involved in a variety of functions, most notably redox homeostasis. Selenoprotein enzymes with known functions are designated according to these functions: TXNRD1, TXNRD2, and TXNRD3 (thioredoxin reductases), GPX1, GPX2, GPX3, GPX4, and GPX6 (glutathione peroxidases), DIO1, DIO2, and DIO3 (iodothyronine deiodinases), MSRB1 (methionine sulfoxide reductase B1), and SEPHS2 (selenophosphate synthetase 2). Selenoproteins without known functions have traditionally been denoted by SEL or SEP symbols. However, these symbols are sometimes ambiguous and conflict with the approved nomenclature for several other genes. Therefore, there is a need to implement a rational and coherent nomenclature system for selenoprotein-encoding genes. Our solution is to use the root symbol SELENO followed by a letter. This nomenclature applies to SELENOF (selenoprotein F, the 15-kDa selenoprotein, SEP15), SELENOH (selenoprotein H, SELH, C11orf31), SELENOI (selenoprotein I, SELI, EPT1), SELENOK (selenoprotein K, SELK), SELENOM (selenoprotein M, SELM), SELENON (selenoprotein N, SEPN1, SELN), SELENOO (selenoprotein O, SELO), SELENOP (selenoprotein P, SeP, SEPP1, SELP), SELENOS (selenoprotein S, SELS, SEPS1, VIMP), SELENOT (selenoprotein T, SELT), SELENOV (selenoprotein V, SELV), and SELENOW (selenoprotein W, SELW, SEPW1). This system, approved by the HUGO Gene Nomenclature Committee, also resolves conflicting, missing, and ambiguous designations for selenoprotein genes and is applicable to selenoproteins across vertebrates.

Selenium is an essential trace element in humans and is present in proteins in the form of the 21st proteinogenic amino acid, selenocysteine (Sec). Sec is co-translationally inserted into a polypeptide chain in response to in-frame UGA codons, as directed by the Sec insertion sequence element, a stem-loop structure in the 3′-UTRs of selenoprotein mRNAs. The human genome contains 25 selenoprotein genes (1), and selenoproteins are essential for embryo development and human health (2,3). Among the selenoproteins, 13 have known functions; at least 12 of them serve as oxidoreductases, wherein Sec is the catalytic redox-active residue. The redox theme is also common for selenoproteins in other organisms (4).
The remaining 12 selenoproteins either have no known function, or their functions are only partially established. One of the selenoproteins, selenoprotein P (5), requires special mention as it has more than one Sec. It is a major plasma selenoprotein that delivers selenium primarily from the liver to other organs (6,7), and is involved in selenium transport and metabolism within organs. However, this protein also has an N-terminal Sec-containing thioredoxin domain similar to that found in most selenoproteins with known functions, which points to a potential redox function. Several other selenoproteins, including selenoproteins H, M, T, V, W, and Sep15, also possess thioredoxin-like domains, suggesting redox-related functions (8).
Selenoproteins are not all homologous, but are characterized by their incorporation of Sec. Historically they have been given designations by the groups that discovered them, e.g. because of its presence in plasma the respective selenoprotein was named selenoprotein P (9, 10), or because of its size another protein was called the 15-kDa selenoprotein or Sep15 (11). However, some selenoproteins were identified independently by two or more groups, which created confusion and discrepancies in the field. For example, the same protein was named selenoprotein R by one group (12), but discovered concurrently and designated by another group as selenoprotein X (13). This protein was then functionally characterized (14) and renamed MsrB1 (for methionine-R-sulfoxide reductase 1) (15), but all three designations persist in the literature and/or databases. Another problematic example is the nomenclature used for thioredoxin reductases. The names for the first thioredoxin reductase, which had been known decades before its selenoprotein nature was discovered (16), are generally internally consistent, although they differ in the abbreviations used, e.g. TR1 and TrxR1 (17). The second and third thioredoxin reductases discovered, however, were named inconsistently by the authors, wherein the mitochondrial thioredoxin reductase was designated as TrxR2 (18) and TR3 (19), and the testis-specific thioredoxin-glutathione reductase has been alternatively labeled as TR2 (19), TrxR3, or TGR.
Designations are also confusing for several other selenoproteins. For example, selenoprotein S was named SelS (1), but a later paper introduced the designation VIMP (20). Similarly, selenoprotein H was named SelH (1), but also C11orf31, and selenoprotein I was named SelI (1), but also called EPT1 (21). To avoid confusion, and at the instigation of the HUGO Gene Nomenclature Committee (HGNC), we describe a new standardized designation system for human (and other vertebrate) selenoproteins.

Results and Discussion
Resolving the Nomenclature of Selenoprotein Genes-Human gene designations are approved by the HUGO Gene Nomenclature Committee (HGNC), and genes in other mammals follow the same designations. Selenoproteins have traditionally been published using SEL or SEP symbols followed by a letter or a number. Unfortunately, for naming the genes encoding these proteins, the SEL root was not an option as it was already approved for the selectin gene family; for example, SELP is the approved gene symbol for selectin P (P-selectin) and not selenoprotein P. Some selenoprotein genes had been approved using the root SEP (i.e. SEPN1, SEPP1, and SEPW1), but this root could not be used for all selenoproteins, as the selenoprotein T gene would then be SEPT or SEPT1, and SEPT# is already used for the septin genes. HGNC does not use the same root for unrelated groups of genes (e.g. SEL for selectins and selenoproteins) and does not endorse the use of multiple root symbols for genes sharing a common name (e.g. SEP and SEL for selenoprotein). With a view to solving these issues, HGNC approached selenoprotein researchers to propose a new unifying root symbol for all selenoprotein genes.
Proposal for a New Nomenclature-We propose that all selenoproteins (except those that have been functionally characterized, e.g. with enzymatic activity) use the root symbol SELENO followed by a letter. This gene nomenclature is designed to highlight selenium, the key functional site in these proteins, and to provide a new and unambiguous root for these genes. The new nomenclature applies to 12 human selenoprotein genes as detailed in Table 1. Selenoproteins with known functions will continue to use the same designations (Table 2). Once functions are established for other selenoproteins, they may be renamed, as required. The proposed designations apply to the selenoprotein genes; although the same designations may be used for many of the encoded proteins, traditional names of selenoproteins, e.g. selenoprotein P, may also be used.
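For readers who need to reconcile older literature or database entries with the new symbols, the 12 SELENO designations and the previous symbols listed in the abstract can be sketched as a simple lookup. The Python below is only an illustration: the dictionary layout, the function name to_seleno, and the case-insensitive matching are assumptions of this sketch and not part of any HGNC tool or interface.

```python
# New SELENO symbols and their previous designations, as listed in the
# abstract of this paper. The structure and names here are illustrative
# assumptions, not an HGNC resource.
SELENO_ALIASES = {
    "SELENOF": ["SEP15"],
    "SELENOH": ["SELH", "C11orf31"],
    "SELENOI": ["SELI", "EPT1"],
    "SELENOK": ["SELK"],
    "SELENOM": ["SELM"],
    "SELENON": ["SEPN1", "SELN"],
    "SELENOO": ["SELO"],
    "SELENOP": ["SeP", "SEPP1", "SELP"],
    "SELENOS": ["SELS", "SEPS1", "VIMP"],
    "SELENOT": ["SELT"],
    "SELENOV": ["SELV"],
    "SELENOW": ["SELW", "SEPW1"],
}

# Old aliases that are also approved symbols for unrelated genes.
# SELP is the approved symbol for selectin P, so a bare "SELP" in the
# literature needs context before it is mapped to SELENOP.
AMBIGUOUS_ALIASES = {"SELP"}


def to_seleno(symbol):
    """Return the SELENO symbol for a new or legacy designation, else None.

    Matching ignores letter case (an assumption of this sketch), so
    "SelS" and "SELS" both resolve to "SELENOS".
    """
    key = symbol.upper()
    for new_symbol, old_symbols in SELENO_ALIASES.items():
        if key == new_symbol or key in (old.upper() for old in old_symbols):
            return new_symbol
    return None


if __name__ == "__main__":
    for name in ["SEPP1", "VIMP", "SelH", "GPX1"]:
        flag = " (ambiguous alias)" if name.upper() in AMBIGUOUS_ALIASES else ""
        print(name, "->", to_seleno(name), flag)
```

Selenoproteins with known functions (e.g. GPX1 or TXNRD1) are deliberately absent from the lookup, because they keep their existing designations; the sketch returns None for them.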
Selenoprotein Gene Designation in Other Species-The new HGNC nomenclature will automatically be used to designate orthologous selenoprotein genes in other vertebrates and extended to accommodate selenoprotein genes with no orthologs in human (22) (Table 3). Where vertebrate gene duplications have occurred, the additional paralogs will be named in line with the human genes, but with suffixes on the symbols, e.g. zebrafish selenot1a, selenot1b, and selenot2. Selenoproteins are widespread in all three domains of life. Despite the fact that land plants, yeast, and some other species have lost selenoprotein biosynthesis pathways, a unifying nomenclature beyond vertebrates might be desirable. We suggest using the human nomenclature described in this paper for orthologs of vertebrate selenoprotein genes. This nomenclature may also be extended to accommodate additional selenoprotein genes as they are discovered. Although we use human designations in this paper, we note that most vertebrates use all uppercase letters for genes and proteins (italics for genes), rodents use title case for genes (uppercase for proteins), Xenopus and zebrafish use lowercase for genes and title case for proteins, and Anolis use lowercase for genes and uppercase for proteins.
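The letter-case conventions described above can be sketched as a small formatting helper. This is a minimal illustration in Python, assuming hypothetical species keys ("mouse", "rat", "zebrafish", "xenopus", "anolis") and a hypothetical function name format_symbol; any species not listed falls back to the all-uppercase convention that the text attributes to most vertebrates. Italics for gene symbols are a typesetting matter and are not represented.

```python
# Case rules taken from the paragraph above; group names are
# illustrative assumptions of this sketch.
CASE_RULES = {
    "default": {"gene": str.upper, "protein": str.upper},
    "rodent": {"gene": str.capitalize, "protein": str.upper},
    "fish_frog": {"gene": str.lower, "protein": str.capitalize},
    "anolis": {"gene": str.lower, "protein": str.upper},
}

# Hypothetical species keys mapped to the rule groups above.
SPECIES_GROUP = {
    "mouse": "rodent",
    "rat": "rodent",
    "zebrafish": "fish_frog",
    "xenopus": "fish_frog",
    "anolis": "anolis",
}


def format_symbol(symbol, species="human", kind="gene"):
    """Format a symbol for a species; kind is "gene" or "protein".

    Species not listed in SPECIES_GROUP use the all-uppercase default.
    """
    group = SPECIES_GROUP.get(species.lower(), "default")
    return CASE_RULES[group][kind](symbol)


if __name__ == "__main__":
    print(format_symbol("SELENOP", "human"))                 # human gene
    print(format_symbol("SELENOP", "mouse"))                 # rodent gene
    print(format_symbol("SELENOT1A", "zebrafish"))           # zebrafish gene
    print(format_symbol("SELENOT1A", "zebrafish", "protein"))
```

Paralog suffixes such as those in zebrafish selenot1a pass through unchanged apart from letter case, since the helper only adjusts capitalization.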
Designations of Proteins That Do Not Contain Selenocysteine-There exists another class of selenium-containing proteins, those which contain a bound atom of selenium but do not contain a UGA-encoded Sec, for which there is also ambiguous nomenclature. For example, selenium-binding protein 1 (SBP1), also referred to as SELENBP1 or hSP56, is one such protein (23). The naming of such proteins will not be included in the new nomenclature as they lack Sec. Similarly, the machinery for Sec biosynthesis and insertion will not be renamed.
Implementation-The new selenoprotein gene nomenclature has been approved by the HGNC, can be found on their website (http://www.genenames.org/cgi-bin/genefamilies/set/890), and will be found in all major genomic resources in due course. We recommend that future publications primarily use the new SELENO designations, but supplement them (as secondary designations/synonyms) with the names previously used by the community. Once the new nomenclature is consistently used, the old designations will no longer be needed. We hope that other researchers in the field will join us in implementing this new nomenclature.
Author Contributions-The article was drafted by V. N. G. in consultation with other authors. All authors contributed to revisions and discussion.