Sequence, Structure, and Evolution of Cellulases in Glycoside Hydrolase Family 48*

  • Leonid O. Sukharnikov
    Footnotes
    Affiliations
    BioEnergy Science Center, University of Tennessee, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831

    Joint Institute for Computational Sciences, University of Tennessee, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831
  • Markus Alahuhta
    Footnotes
    Affiliations
    BioEnergy Science Center, University of Tennessee, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831

    Biosciences Center, National Renewable Energy Laboratory, Golden, Colorado 80401
  • Roman Brunecky
    Affiliations
    BioEnergy Science Center, University of Tennessee, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831

    Biosciences Center, National Renewable Energy Laboratory, Golden, Colorado 80401
  • Amit Upadhyay
    Affiliations
    BioEnergy Science Center, University of Tennessee, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831

    Joint Institute for Computational Sciences, University of Tennessee, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831
  • Michael E. Himmel
    Affiliations
    BioEnergy Science Center, University of Tennessee, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831

    Biosciences Center, National Renewable Energy Laboratory, Golden, Colorado 80401
  • Vladimir V. Lunin
    Correspondence
    To whom correspondence may be addressed.
    Affiliations
    BioEnergy Science Center, University of Tennessee, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831

    Biosciences Center, National Renewable Energy Laboratory, Golden, Colorado 80401
  • Igor B. Zhulin
    Correspondence
    To whom correspondence may be addressed.
    Affiliations
    BioEnergy Science Center, University of Tennessee, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831

    Joint Institute for Computational Sciences, University of Tennessee, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831
  • Author Footnotes
    * This work was supported by the Department of Energy Office of Science, Office of Biological and Environmental Research, through the BioEnergy Science Center, a Department of Energy Bioenergy Research Center.
    This article contains supplemental Fig. S1 and Tables S1–S5.
    1 Both authors contributed equally to this work.
Open Access. Published: October 10, 2012. DOI: https://doi.org/10.1074/jbc.M112.405720
      Currently, the cost of cellulase enzymes remains a key economic impediment to commercialization of biofuels (
      • Aden A.
      • Foust T.
      Technoeconomic analysis of the dilute sulfuric acid and enzymatic hydrolysis process for the conversion of corn stover to ethanol.
      ). Enzymes from glycoside hydrolase family 48 (GH48) are a critical component of numerous natural lignocellulose-degrading systems. Although computational mining of large genomic data sets is a promising new approach for identifying novel cellulolytic activities, current computational methods are unable to distinguish between cellulases and enzymes with different substrate specificities that belong to the same protein family. We show that by using a robust computational approach supported by experimental studies, cellulases and non-cellulases can be effectively distinguished within a given protein family. Phylogenetic analysis of GH48 showed a non-monophyletic distribution, an indication of horizontal gene transfer. Enzymatic function of GH48 proteins encoded by horizontally transferred genes was verified experimentally, which confirmed that these proteins are cellulases. Computational and structural studies of GH48 enzymes identified structural elements that define cellulases and can be used to computationally distinguish them from non-cellulases. We propose that the structural element that can be used for in silico discrimination between cellulases and non-cellulases belonging to GH48 is an ω-loop located on the surface of the molecule and characterized by highly conserved rare amino acids. These markers were used to screen metagenomic data for “true” cellulases.

      INTRODUCTION

      The recent exponential growth of genomic data presents a unique opportunity to search for novel cellulolytic activities. However, the absence of a clear understanding of structural and functional features that are critical for decisive computational identification of cellulases prevents their identification in these data sets. True cellulases are defined as enzymes that show biochemical activity on cellulose substrates (i.e. crystalline or amorphous cellulose). Strikingly, all known cellulases have homologs that have similar protein folds and even amino acid sequences but do not show biochemical activity on cellulosic substrates (
      • Sukharnikov L.O.
      • Cantwell B.J.
      • Podar M.
      • Zhulin I.B.
      Cellulases. Ambiguous nonhomologous enzymes in a genomic perspective.
      ), which makes computational-only identification of true cellulases error-prone. Glycoside hydrolase family 48 (GH48; the abbreviations used are: GH48, glycoside hydrolase family 48; MSA, multiple sequence alignment) is one of the many families defined in the CAZy (Carbohydrate-Active EnZymes) database (
      • Cantarel B.L.
      • Coutinho P.M.
      • Rancurel C.
      • Bernard T.
      • Lombard V.
      • Henrissat B.
      The Carbohydrate-Active EnZymes database (CAZy). An expert resource for Glycogenomics.
      ) that contains biochemically confirmed cellulases. Furthermore, GH48 cellulases are considered the key component of various cellulolytic systems (
      • Olson D.G.
      • Tripathi S.A.
      • Giannone R.J.
      • Lo J.
      • Caiazza N.C.
      • Hogsett D.A.
      • Hettich R.L.
      • Guss A.M.
      • Dubrovsky G.
      • Lynd L.R.
      Deletion of the Cel48S cellulase from Clostridium thermocellum.
      ,
      • Devillard E.
      • Goodheart D.B.
      • Karnati S.K.
      • Bayer E.A.
      • Lamed R.
      • Miron J.
      • Nelson K.E.
      • Morrison M.
      Ruminococcus albus 8 mutants defective in cellulose degradation are deficient in two processive endocellulases, Cel48A and Cel9B, both of which possess a novel modular architecture.
      ,
      • Izquierdo J.A.
      • Sizova M.V.
      • Lynd L.R.
      Diversity of bacteria and glycosyl hydrolase family 48 genes in cellulolytic consortia enriched from thermophilic biocompost.
      ). They are highly expressed in cellulolytic bacteria, such as Clostridium cellulolyticum, Clostridium cellulovorans, Clostridium josui, Clostridium thermocellum, and many others (
      • Olson D.G.
      • Tripathi S.A.
      • Giannone R.J.
      • Lo J.
      • Caiazza N.C.
      • Hogsett D.A.
      • Hettich R.L.
      • Guss A.M.
      • Dubrovsky G.
      • Lynd L.R.
      Deletion of the Cel48S cellulase from Clostridium thermocellum.
      ). In C. thermocellum, a bacterium that exhibits one of the highest rates of cellulose degradation among all known cellulolytic bacteria, GH48 cellulases are up-regulated during growth on crystalline cellulose (
      • Olson D.G.
      • Tripathi S.A.
      • Giannone R.J.
      • Lo J.
      • Caiazza N.C.
      • Hogsett D.A.
      • Hettich R.L.
      • Guss A.M.
      • Dubrovsky G.
      • Lynd L.R.
      Deletion of the Cel48S cellulase from Clostridium thermocellum.
      ). Hence, these enzymes become the most abundant subunits in the C. thermocellum cellulosome, a complex of enzymes highly efficient in cellulose degradation (
      • Olson D.G.
      • Tripathi S.A.
      • Giannone R.J.
      • Lo J.
      • Caiazza N.C.
      • Hogsett D.A.
      • Hettich R.L.
      • Guss A.M.
      • Dubrovsky G.
      • Lynd L.R.
      Deletion of the Cel48S cellulase from Clostridium thermocellum.
      ,
      • Gold N.D.
      • Martin V.J.
      Global view of the Clostridium thermocellum cellulosome revealed by quantitative proteomic analysis.
      ). Notably, complete knockout of both GH48 enzymes in C. thermocellum leads to a significant decrease in performance but does not completely abolish cellulolytic activity (
      • Olson D.G.
      • Tripathi S.A.
      • Giannone R.J.
      • Lo J.
      • Caiazza N.C.
      • Hogsett D.A.
      • Hettich R.L.
      • Guss A.M.
      • Dubrovsky G.
      • Lynd L.R.
      Deletion of the Cel48S cellulase from Clostridium thermocellum.
      ), whereas knockout of the GH48 gene in Ruminococcus albus (
      • Devillard E.
      • Goodheart D.B.
      • Karnati S.K.
      • Bayer E.A.
      • Lamed R.
      • Miron J.
      • Nelson K.E.
      • Morrison M.
      Ruminococcus albus 8 mutants defective in cellulose degradation are deficient in two processive endocellulases, Cel48A and Cel9B, both of which possess a novel modular architecture.
      ) leads to nearly complete loss of cellulase activity.
      Usually, only one (or rarely two or three) gene(s) encoding GH48 enzymes can be found in the genomes of cellulose-degrading bacteria (
      • Izquierdo J.A.
      • Sizova M.V.
      • Lynd L.R.
      Diversity of bacteria and glycosyl hydrolase family 48 genes in cellulolytic consortia enriched from thermophilic biocompost.
      ), whereas genes for GH5 and GH9 cellulases are present in much higher numbers (
      • Wisniewski-Dyé F.
      • Borziak K.
      • Khalsa-Moyers G.
      • Alexandre G.
      • Sukharnikov L.O.
      • Wuichet K.
      • Hurst G.B.
      • McDonald W.H.
      • Robertson J.S.
      • Barbe V.
      • Calteau A.
      • Rouy Z.
      • Mangenot S.
      • Prigent-Combaret C.
      • Normand P.
      • Boyer M.
      • Siguier P.
      • Dessaux Y.
      • Elmerich C.
      • Condemine G.
      • Krishnen G.
      • Kennedy I.
      • Paterson A.H.
      • González V.
      • Mavingui P.
      • Zhulin I.B.
      Azospirillum genomes reveal transition of bacteria from aquatic to terrestrial environments.
      ,
      • Dam P.
      • Kataeva I.
      • Yang S.J.
      • Zhou F.
      • Yin Y.
      • Chou W.
      • Poole 2nd, F.L.
      • Westpheling J.
      • Hettich R.
      • Giannone R.
      • Lewis D.L.
      • Kelly R.
      • Gilbert H.J.
      • Henrissat B.
      • Xu Y.
      • Adams M.W.
      Insights into plant biomass conversion from the genome of the anaerobic thermophilic bacterium Caldicellulosiruptor bescii DSM 6725.
      ). Interestingly, GH48 cellulases often act in synergy with GH9 cellulases, which increases their catalytic activity dramatically (
      • Irwin D.C.
      • Zhang S.
      • Wilson D.B.
      Cloning, expression and characterization of a family 48 exocellulase, Cel48A, from Thermobifida fusca.
      ), a feature that may be utilized for industrial application of these enzymes (e.g. “designer cellulosomes”) (
      • Vazana Y.
      • Moraïs S.
      • Barak Y.
      • Lamed R.
      • Bayer E.A.
      Interplay between Clostridium thermocellum family 48 and family 9 cellulases in cellulosomal versus noncellulosomal states.
      ).
      Experimental studies revealed that some GH48 cellulases have only cellulolytic activity and do not hydrolyze other substrates (e.g. xylan and mannan) (
      • Shen H.
      • Gilkes N.R.
      • Kilburn D.G.
      • Miller Jr., R.C.
      • Warren R.A.
      Cellobiohydrolase B, a second exo-cellobiohydrolase from the cellulolytic bacterium Cellulomonas fimi.
      ). A few GH48 cellulases have mixed substrate specificity (e.g. they are capable of degradation of xylan (
      • Liu C.C.
      • Doi R.H.
      Properties of exgS, a gene for a major subunit of the Clostridium cellulovorans cellulosome.
      ) or β-glucan (
      • Berger E.
      • Zhang D.
      • Zverlov V.V.
      • Schwarz W.H.
      Two noncellulosomal cellulases of Clostridium thermocellum, Cel9I and Cel48Y, hydrolyse crystalline cellulose synergistically.
      ) in addition to cellulose). Two GH48 enzymes from the beetle Gastrophysa atrocyanea are unable to hydrolyze cellulose-containing substrates (e.g. Avicel, carboxymethylcellulose, and acid-swollen cellulose) but show distinct enzymatic activity toward chitin (
      • Fujita K.
      • Shimomura K.
      • Yamamoto K.
      • Yamashita T.
      • Suzuki K.
      A chitinase structurally related to the glycoside hydrolase family 48 is indispensable for the hormonally induced diapause termination in a beetle.
      ) (supplemental Table S1).
      Previous genomic studies have shown that GH48 enzymes are found in fungi as well as in bacteria, including Clostridia, Bacilli (both Firmicutes), and Actinobacteria. However, the presence of the GH48 cellulase (
      • Ramírez-Ramírez N.
      • Romero-García E.R.
      • Calderón V.C.
      • Avitia C.I.
      • Téllez-Valencia A.
      • Pedraza-Reyes M.
      Expression, characterization and synergistic interactions of Myxobacter sp. AL-1 Cel9 and Cel48 glycosyl hydrolases.
      ) in the evolutionarily distant deltaproteobacterium Myxobacter sp. AL-1 has never been explained.
      Here we report evolutionary studies of GH48 enzymes, present a crystal structure of the GH48 enzyme encoded by a horizontally transferred gene, and characterize structural and functional differences between cellulases and chitinases in this group of enzymes. We also show that our computational approach can be used to search for true GH48 cellulases in metagenomic databases.

      DISCUSSION

      Using a phylogenomic approach, we have determined that the GH48-type enzymes might have originated in a common ancestor of three closely related phyla: Firmicutes, Actinobacteria, and Chloroflexi (
      • Gutiérrez-Preciado A.
      • Henkin T.M.
      • Grundy F.J.
      • Yanofsky C.
      • Merino E.
      Biochemical features and functional implications of the RNA-based T-box regulatory mechanism.
      ). We have identified a number of gene duplication events in representatives of these phyla and several cases of horizontal gene transfer. For example, fungi received these genes horizontally from a representative of Firmicutes, whereas insects received these genes from a representative of Actinobacteria. Similarly, representatives of Proteobacteria also received their GH48 genes horizontally. By comparing orthologous sequences from Firmicutes, Actinobacteria, and Chloroflexi, we identified a number of amino acid positions that are uniquely conserved in this group of organisms. Reassuringly, the only activity previously found in this group is that of a cellulase. Thus, we suggest that conserved positions in the catalytic domains from Firmicutes, Actinobacteria, and Chloroflexi can be used as a genomic signature for a GH48 cellulase.
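The idea of a genomic signature built from conserved alignment positions can be illustrated with a minimal Python sketch. It assumes the sequences have already been aligned (equal length, with "-" marking gaps). The function names `conserved_columns` and `matches_signature`, the `min_fraction` threshold, and the toy sequences are hypothetical and are not taken from the study; the sketch only shows the general technique, not the procedure actually used.

```python
from collections import Counter


def conserved_columns(alignment, min_fraction=1.0):
    """Return {column_index: residue} for alignment columns in which one
    non-gap residue is present in at least min_fraction of all sequences."""
    if not alignment:
        return {}
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    signature = {}
    for col in range(length):
        # Count residues only; gaps never count as a conserved state.
        counts = Counter(seq[col] for seq in alignment if seq[col] != "-")
        if not counts:
            continue
        residue, count = counts.most_common(1)[0]
        if count / len(alignment) >= min_fraction:
            signature[col] = residue
    return signature


def matches_signature(aligned_seq, signature):
    """True if the aligned sequence carries every signature residue."""
    return all(aligned_seq[col] == res for col, res in signature.items())


# Toy aligned sequences (hypothetical, for illustration only).
toy_cellulases = ["MKWAN-G", "MRWANTG", "MKWSN-G"]
signature = conserved_columns(toy_cellulases)
```

In this toy case the signature keeps only the columns that are identical in all three sequences, and a candidate sequence aligned to the same columns can then be tested with `matches_signature`.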
      We then wondered if this genomic signature for a cellulase remains intact in paralogs and horizontally transferred genes, because these types of genes often assume a slightly different function. For example, just one or a few mutations in a catalytic domain may lead to different substrate specificity. Notably, screening and study of paralogous sequences of GH48 proteins showed no significant differences in their catalytic domains but rather noticeable differences in their auxiliary domains (e.g. cellulose-binding domain and fibronectin type III-like domain). In contrast, the products of genes that were horizontally transferred from Actinobacteria to insects (Metazoa) acquired a new activity, hydrolysis of chitin, but lost the ability to degrade cellulose.
      Following this initial evolutionary analysis, we extended our findings to structural analysis of GH48 enzymes. We found that all orthologs and paralogs have a 10–14-residue ω-loop (Pro-469 to Ala-482 as in Cel48F) that has no counterpart in enzymes from insects. Moreover, this ω-loop contains highly conserved amino acids (Trp-472 and Asn-481 as in Cel48F) and is located on the surface of the molecule. Thus, in accord with the classical definition of ω-loops (
      • Fetrow J.S.
      ω-Loops. Nonregular secondary structures significant in protein function and stability.
      ), it may contribute to folding, to stability, or to the dynamics of the enzyme during catalysis.
      High conservation of the ω-loop residues in cellulases suggests its importance for the computational identification of cellulases, and the complete absence of the loop in all non-cellulases indicates that GH48 chitinases lost this structural element. We hypothesize that the absence of the loop in chitinases allows more conformational degrees of freedom in the active site tunnel upon binding of the substrate, which permits a bulkier chitin to “slide” freely. In contrast, cellulases may have more rigid structures “reinforced” by the ω-loop. Regardless of the exact role of the ω-loop, which can be determined only experimentally, we have suggested that it is important for cellulolytic activity, which has allowed us to design a strategy to identify new cellulases in metagenomic data.
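The ω-loop criterion described above can be sketched in Python as a check on a query sequence aligned to a reference. The residue numbers (loop Pro-469 to Ala-482, markers Trp-472 and Asn-481, numbered as in Cel48F) and the 10-residue minimum come from the text. Everything else is an assumption made for illustration: the function names, the toy "reference" fragment (which is not the real Cel48F sequence), and the toy queries.

```python
def column_of_residue(aligned_ref, residue_number, first_residue_number=1):
    """Map a reference residue number to its column in the alignment."""
    number = first_residue_number - 1
    for col, char in enumerate(aligned_ref):
        if char != "-":
            number += 1
            if number == residue_number:
                return col
    raise ValueError(f"residue {residue_number} is not in the reference")


def has_omega_loop(aligned_ref, aligned_query, loop_start, loop_end,
                   markers, first_residue_number=1, min_loop_len=10):
    """True if the query has a loop of at least min_loop_len residues in the
    reference loop region and carries every marker residue."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("reference and query must be aligned to equal length")
    start = column_of_residue(aligned_ref, loop_start, first_residue_number)
    end = column_of_residue(aligned_ref, loop_end, first_residue_number)
    loop = aligned_query[start:end + 1].replace("-", "")
    if len(loop) < min_loop_len:
        return False
    return all(
        aligned_query[column_of_residue(aligned_ref, num, first_residue_number)] == res
        for num, res in markers.items()
    )


# Hypothetical toy fragment whose first residue is numbered 465, so that
# residues 469-482 fall on the loop; NOT the real Cel48F sequence.
toy_ref = "AGSTPDGWSGTDGSYGNALVKE"
toy_with_loop = "AGSTPEGWTGSDGTFGNALIKE"
toy_without_loop = "AGST--------------LVKE"
toy_mutated_marker = "AGSTPEGFTGSDGTFGNALIKE"
markers = {472: "W", 481: "N"}

results = {
    name: has_omega_loop(toy_ref, seq, 469, 482, markers, first_residue_number=465)
    for name, seq in [("with_loop", toy_with_loop),
                      ("without_loop", toy_without_loop),
                      ("mutated_marker", toy_mutated_marker)]
}
```

A real screen would first align each candidate to a characterized GH48 cellulase with an alignment tool; this sketch covers only the final yes/no decision.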
      Thus, phylogenomic and structural analyses of GH48 suggest that proteins from Actinobacteria, Firmicutes, Chloroflexi, and Proteobacteria are indeed cellulases. Biochemical activities of GH48 proteins from two Piromyces species have never been studied; thus, it is unknown whether they are cellulases. However, because these proteins are not only homologous to known cellulases but also contain all conserved amino acids identified in our analysis, it is very likely that they also possess cellulolytic activities. On the other hand, GH48s from insects, where only chitinolytic activities were detected experimentally, are non-cellulases. Consequently, the existing Pfam model for GH48 can be used to retrieve true cellulases, with one exception: GH48 proteins from insects should be annotated as non-cellulases. This approach allowed us to identify 166 true cellulases in the combined metagenomic data set of hundreds of environmental samples. The largest number of cellulases came from the metagenomes of “engineered” microbial communities, such as enriched samples or bioreactors (e.g. the “mixed alcohol bioreactor” and the “cellulolytic enrichment from sediment of Great Boiling Springs”). Most of the environmental cellulases come from communities that typically include saprophytes (
      • Mba Medie F.
      • Davies G.J.
      • Drancourt M.
      • Henrissat B.
      Genome analyses highlight the different biological roles of cellulases.
      ), such as soil, wastewater, ant fungal gardens, and the rhizosphere (Fig. 5), which is in agreement with previously published research (
      • Suen G.
      • Scott J.J.
      • Aylward F.O.
      • Adams S.M.
      • Tringe S.G.
      • Pinto-Tomás A.A.
      • Foster C.E.
      • Pauly M.
      • Weimer P.J.
      • Barry K.W.
      • Goodwin L.A.
      • Bouffard P.
      • Li L.
      • Osterberger J.
      • Harkins T.T.
      • Slater S.C.
      • Donohue T.J.
      • Currie C.R.
      An insect herbivore microbiome with high plant biomass-degrading capacity.
      ,
      • Sessitsch A.
      • Hardoim P.
      • Döring J.
      • Weilharter A.
      • Krause A.
      • Woyke T.
      • Mitter B.
      • Hauberg-Lotte L.
      • Friedrich F.
      • Rahalkar M.
      • Hurek T.
      • Sarkar A.
      • Bodrossy L.
      • van Overbeek L.
      • Brar D.
      • van Elsas J.D.
      • Reinhold-Hurek B.
      Functional characteristics of an endophyte community colonizing rice roots as revealed by metagenomic analysis.
      ). Interestingly, very few GH48 cellulases were identified in cow rumen microbial communities, which also correlates with previous extensive biochemical analysis of this classical cellulolytic community (
      • Hess M.
      • Sczyrba A.
      • Egan R.
      • Kim T.W.
      • Chokhawala H.
      • Schroth G.
      • Luo S.
      • Clark D.S.
      • Chen F.
      • Zhang T.
      • Mackie R.I.
      • Pennacchio L.A.
      • Tringe S.G.
      • Visel A.
      • Woyke T.
      • Wang Z.
      • Rubin E.M.
      Metagenomic discovery of biomass-degrading genes and genomes from cow rumen.
      ). Moreover, all of the GH48s from cow rumen found in this study belong to Ruminococcus flavefaciens, a highly specialized cellulose degrader. We hypothesize that because major ruminal cellulolytic specialists collectively represent as little as 0.3% of the total bacterial population (
      • Brulc J.M.
      • Yeoman C.J.
      • Wilson M.K.
      • Berg Miller M.E.
      • Jeraldo P.
      • Jindou S.
      • Goldenfeld N.
      • Flint H.J.
      • Lamed R.
      • Borovok I.
      • Vodovnik M.
      • Nelson K.E.
      • Bayer E.A.
      • White B.A.
      Cellulosomics, a gene-centric approach to investigating the intraspecific diversity and adaptation of Ruminococcus flavefaciens within the rumen.
      ), and R. flavefaciens is typically one of the three most abundant cellulolytic bacteria in cow rumen (
      • Huws S.A.
      • Lee M.R.
      • Muetzel S.M.
      • Scott M.B.
      • Wallace R.J.
      • Scollan N.D.
      Forage type and fish oil cause shifts in rumen bacterial diversity.
      ), its GH48 gene was more likely to be captured by sequencing (
      • Cowan D.
      • Meyer Q.
      • Stafford W.
      • Muyanga S.
      • Cameron R.
      • Wittwer P.
      Metagenomic gene discovery. Past, present and future.
      ) when compared with the genes of other “rare” members of the community.
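The per-habitat comparison discussed above amounts to filtering family hits by the cellulase signature and counting what remains in each environment. A minimal Python sketch of that bookkeeping follows. The record layout (`habitat`, `has_omega_loop`), the function name `tally_cellulases`, and the toy hits are hypothetical and do not reproduce any data from the study.

```python
from collections import Counter


def tally_cellulases(hits):
    """Count putative cellulases per habitat.

    Each hit is a dict with a 'habitat' string and a boolean
    'has_omega_loop' flag; hits lacking the loop are skipped."""
    return Counter(hit["habitat"] for hit in hits if hit["has_omega_loop"])


# Toy records (hypothetical, for illustration only).
toy_hits = [
    {"id": "hit1", "habitat": "soil", "has_omega_loop": True},
    {"id": "hit2", "habitat": "soil", "has_omega_loop": True},
    {"id": "hit3", "habitat": "bioreactor", "has_omega_loop": True},
    {"id": "hit4", "habitat": "insect", "has_omega_loop": False},
]
counts = tally_cellulases(toy_hits)
```

Habitats in which every hit lacks the loop simply do not appear in the resulting counter.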

      CONCLUSIONS

      High-throughput computational screening for cellulases from genomic and metagenomic data sets is a challenge due to the absence of a clear understanding of structural and functional features that distinguish them from closely related enzymes with other substrate specificities (
      • Sukharnikov L.O.
      • Cantwell B.J.
      • Podar M.
      • Zhulin I.B.
      Cellulases. Ambiguous nonhomologous enzymes in a genomic perspective.
      ). Here, we present a combined sequence-structure approach leading to the identification of clear markers that can be used to distinguish between cellulases and non-cellulases within the GH48 family. This approach was applied to identify “true” GH48 cellulases in large metagenomic data sets, illustrating its feasibility in the search for novel cellulolytic capabilities.
      Finally, we propose that this approach can be generalized to define genomic signatures for identifying cellulases in other CAZy families (
      • Sukharnikov L.O.
      • Cantwell B.J.
      • Podar M.
      • Zhulin I.B.
      Cellulases. Ambiguous nonhomologous enzymes in a genomic perspective.
      ), such as GH5, GH9, GH12, GH45, and GH61, that are known to contain biochemically confirmed cellulases.

      REFERENCES

        • Aden A.
        • Foust T.
        Technoeconomic analysis of the dilute sulfuric acid and enzymatic hydrolysis process for the conversion of corn stover to ethanol.
        Cellulose. 2009; 16: 535-545
        • Sukharnikov L.O.
        • Cantwell B.J.
        • Podar M.
        • Zhulin I.B.
        Cellulases. Ambiguous nonhomologous enzymes in a genomic perspective.
        Trends Biotechnol. 2011; 29: 473-479
        • Cantarel B.L.
        • Coutinho P.M.
        • Rancurel C.
        • Bernard T.
        • Lombard V.
        • Henrissat B.
        The Carbohydrate-Active EnZymes database (CAZy). An expert resource for Glycogenomics.
        Nucleic Acids Res. 2009; 37: D233-D238
        • Olson D.G.
        • Tripathi S.A.
        • Giannone R.J.
        • Lo J.
        • Caiazza N.C.
        • Hogsett D.A.
        • Hettich R.L.
        • Guss A.M.
        • Dubrovsky G.
        • Lynd L.R.
        Deletion of the Cel48S cellulase from Clostridium thermocellum.
        Proc. Natl. Acad. Sci. U.S.A. 2010; 107: 17727-17732
        • Devillard E.
        • Goodheart D.B.
        • Karnati S.K.
        • Bayer E.A.
        • Lamed R.
        • Miron J.
        • Nelson K.E.
        • Morrison M.
        Ruminococcus albus 8 mutants defective in cellulose degradation are deficient in two processive endocellulases, Cel48A and Cel9B, both of which possess a novel modular architecture.
        J. Bacteriol. 2004; 186: 136-145
        • Izquierdo J.A.
        • Sizova M.V.
        • Lynd L.R.
        Diversity of bacteria and glycosyl hydrolase family 48 genes in cellulolytic consortia enriched from thermophilic biocompost.
        Appl. Environ. Microbiol. 2010; 76: 3545-3553
        • Gold N.D.
        • Martin V.J.
        Global view of the Clostridium thermocellum cellulosome revealed by quantitative proteomic analysis.
        J. Bacteriol. 2007; 189: 6787-6795
        • Wisniewski-Dyé F.
        • Borziak K.
        • Khalsa-Moyers G.
        • Alexandre G.
        • Sukharnikov L.O.
        • Wuichet K.
        • Hurst G.B.
        • McDonald W.H.
        • Robertson J.S.
        • Barbe V.
        • Calteau A.
        • Rouy Z.
        • Mangenot S.
        • Prigent-Combaret C.
        • Normand P.
        • Boyer M.
        • Siguier P.
        • Dessaux Y.
        • Elmerich C.
        • Condemine G.
        • Krishnen G.
        • Kennedy I.
        • Paterson A.H.
        • González V.
        • Mavingui P.
        • Zhulin I.B.
        Azospirillum genomes reveal transition of bacteria from aquatic to terrestrial environments.
        PLoS Genet. 2011; 7: e1002430
        • Dam P.
        • Kataeva I.
        • Yang S.J.
        • Zhou F.
        • Yin Y.
        • Chou W.
        • Poole 2nd, F.L.
        • Westpheling J.
        • Hettich R.
        • Giannone R.
        • Lewis D.L.
        • Kelly R.
        • Gilbert H.J.
        • Henrissat B.
        • Xu Y.
        • Adams M.W.
        Insights into plant biomass conversion from the genome of the anaerobic thermophilic bacterium Caldicellulosiruptor bescii DSM 6725.
        Nucleic Acids Res. 2011; 39: 3240-3254
        • Irwin D.C.
        • Zhang S.
        • Wilson D.B.
        Cloning, expression and characterization of a family 48 exocellulase, Cel48A, from Thermobifida fusca.
        Eur. J. Biochem. 2000; 267: 4988-4997
        • Vazana Y.
        • Moraïs S.
        • Barak Y.
        • Lamed R.
        • Bayer E.A.
        Interplay between Clostridium thermocellum family 48 and family 9 cellulases in cellulosomal versus noncellulosomal states.
        Appl. Environ. Microbiol. 2010; 76: 3236-3243
        • Shen H.
        • Gilkes N.R.
        • Kilburn D.G.
        • Miller Jr., R.C.
        • Warren R.A.
        Cellobiohydrolase B, a second exo-cellobiohydrolase from the cellulolytic bacterium Cellulomonas fimi.
        Biochem. J. 1995; 311: 67-74
        • Liu C.C.
        • Doi R.H.
        Properties of exgS, a gene for a major subunit of the Clostridium cellulovorans cellulosome.
        Gene. 1998; 211: 39-47
        • Berger E.
        • Zhang D.
        • Zverlov V.V.
        • Schwarz W.H.
        Two noncellulosomal cellulases of Clostridium thermocellum, Cel9I and Cel48Y, hydrolyse crystalline cellulose synergistically.
        FEMS Microbiol. Lett. 2007; 268: 194-201
        • Fujita K.
        • Shimomura K.
        • Yamamoto K.
        • Yamashita T.
        • Suzuki K.
        A chitinase structurally related to the glycoside hydrolase family 48 is indispensable for the hormonally induced diapause termination in a beetle.
        Biochem. Biophys. Res. Commun. 2006; 345: 502-507
        • Ramírez-Ramírez N.
        • Romero-García E.R.
        • Calderón V.C.
        • Avitia C.I.
        • Téllez-Valencia A.
        • Pedraza-Reyes M.
        Expression, characterization and synergistic interactions of Myxobacter sp. AL-1 Cel9 and Cel48 glycosyl hydrolases.
        Int. J. Mol. Sci. 2008; 9: 247-257
        • Finn R.D.
        • Clements J.
        • Eddy S.R.
        HMMER web server. Interactive sequence similarity searching.
        Nucleic Acids Res. 2011; 39: W29-W37
        • Katoh K.
        • Toh H.
        Parallelization of the MAFFT multiple sequence alignment program.
        Bioinformatics. 2010; 26: 1899-1900
        • Tamura K.
        • Peterson D.
        • Peterson N.
        • Stecher G.
        • Nei M.
        • Kumar S.
        MEGA5. Molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods.
        Mol. Biol. Evol. 2011; 28: 2731-2739
        • Waterhouse A.M.
        • Procter J.B.
        • Martin D.M.
        • Clamp M.
        • Barton G.J.
        Jalview version 2. A multiple sequence alignment editor and analysis workbench.
        Bioinformatics. 2009; 25: 1189-1191
        • Altschul S.F.
        • Madden T.L.
        • Schäffer A.A.
        • Zhang J.
        • Zhang Z.
        • Miller W.
        • Lipman D.J.
        Gapped BLAST and PSI-BLAST. A new generation of protein database search programs.
        Nucleic Acids Res. 1997; 25: 3389-3402
        • Guindon S.
        • Gascuel O.
        PhyML. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood.
        Syst. Biol. 2003; 52: 696-704
        • Finn R.D.
        • Mistry J.
        • Tate J.
        • Coggill P.
        • Heger A.
        • Pollington J.E.
        • Gavin O.L.
        • Gunasekaran P.
        • Ceric G.
        • Forslund K.
        • Holm L.
        • Sonnhammer E.L.
        • Eddy S.R.
        • Bateman A.
        The Pfam protein families database.
        Nucleic Acids Res. 2010; 38: D211-D222
        • Guimarães B.G.
        • Souchon H.
        • Lytle B.L.
        • David Wu J.H.
        • Alzari P.M.
        The crystal structure and catalytic mechanism of cellobiohydrolase CelS, the major enzymatic component of the Clostridium thermocellum cellulosome.
        J. Mol. Biol. 2002; 320: 587-596
        • Parsiegla G.
        • Reverbel-Leroy C.
        • Tardif C.
        • Belaich J.P.
        • Driguez H.
        • Haser R.
        Crystal structures of the cellulase Cel48F in complex with inhibitors and substrates give insights into its processive action.
        Biochemistry. 2000; 39: 11238-11246
        • Berman H.M.
        • Westbrook J.
        • Feng Z.
        • Gilliland G.
        • Bhat T.N.
        • Weissig H.
        • Shindyalov I.N.
        • Bourne P.E.
        The Protein Data Bank.
        Nucleic Acids Res. 2000; 28: 235-242
        • Koonin E.V.
        • Makarova K.S.
        • Aravind L.
        Horizontal gene transfer in prokaryotes. Quantification and classification.
        Annu. Rev. Microbiol. 2001; 55: 709-742
        • Markowitz V.M.
        • Chen I.M.
        • Palaniappan K.
        • Chu K.
        • Szeto E.
        • Grechkin Y.
        • Ratner A.
        • Jacob B.
        • Huang J.
        • Williams P.
        • Huntemann M.
        • Anderson I.
        • Mavromatis K.
        • Ivanova N.N.
        • Kyrpides N.C.
        IMG. The integrated microbial genomes database and comparative analysis system.
        Nucleic Acids Res. 2012; 40: D115-D122
        • Hess M.
        • Sczyrba A.
        • Egan R.
        • Kim T.W.
        • Chokhawala H.
        • Schroth G.
        • Luo S.
        • Clark D.S.
        • Chen F.
        • Zhang T.
        • Mackie R.I.
        • Pennacchio L.A.
        • Tringe S.G.
        • Visel A.
        • Woyke T.
        • Wang Z.
        • Rubin E.M.
        Metagenomic discovery of biomass-degrading genes and genomes from cow rumen.
        Science. 2011; 331: 463-467
        • Sluiter A.
        • Hames B.
        • Ruiz R.
        • Scarlata C.
        • Sluiter J.
        • Templeton D.
        • Crocker D.
        Determination of structural carbohydrates and lignin in biomass. Laboratory Analytical Procedure (LAP).
        Technical Report NREL/TP-510-42618, National Renewable Energy Laboratory, Golden, CO. 2006
        • Winn M.D.
        • Ballard C.C.
        • Cowtan K.D.
        • Dodson E.J.
        • Emsley P.
        • Evans P.R.
        • Keegan R.M.
        • Krissinel E.B.
        • Leslie A.G.
        • McCoy A.
        • McNicholas S.J.
        • Murshudov G.N.
        • Pannu N.S.
        • Potterton E.A.
        • Powell H.R.
        • Read R.J.
        • Vagin A.
        • Wilson K.S.
        Overview of the CCP4 suite and current developments.
        Acta Crystallogr. D Biol. Crystallogr. 2011; 67: 235-242
        • Vagin A.
        • Teplyakov A.
        Molecular replacement with MOLREP.
        Acta Crystallogr. D Biol. Crystallogr. 2010; 66: 22-25
        • Langer G.
        • Cohen S.X.
        • Lamzin V.S.
        • Perrakis A.
        Automated macromolecular model building for x-ray crystallography using ARP/wARP version 7.
        Nat. Protoc. 2008; 3: 1171-1179
        • Emsley P.
        • Lohkamp B.
        • Scott W.G.
        • Cowtan K.
        Features and development of Coot.
        Acta Crystallogr. D Biol. Crystallogr. 2010; 66: 486-501
        • Murshudov G.N.
        • Skubák P.
        • Lebedev A.A.
        • Pannu N.S.
        • Steiner R.A.
        • Nicholls R.A.
        • Winn M.D.
        • Long F.
        • Vagin A.A.
        REFMAC5 for the refinement of macromolecular crystal structures.
        Acta Crystallogr. D Biol. Crystallogr. 2011; 67: 355-367
        • Chen V.B.
        • Arendall 3rd, W.B.
        • Headd J.J.
        • Keedy D.A.
        • Immormino R.M.
        • Kapral G.J.
        • Murray L.W.
        • Richardson J.S.
        • Richardson D.C.
        MolProbity. All-atom structure validation for macromolecular crystallography.
        Acta Crystallogr. D Biol. Crystallogr. 2010; 66: 12-21
        • Engh R.A.
        • Huber R.
        Accurate bond and angle parameters for x-ray protein-structure refinement.
        Acta Crystallogr. A. 1991; 47: 392-400
        • Gutiérrez-Preciado A.
        • Henkin T.M.
        • Grundy F.J.
        • Yanofsky C.
        • Merino E.
        Biochemical features and functional implications of the RNA-based T-box regulatory mechanism.
        Microbiol. Mol. Biol. Rev. 2009; 73: 36-61
        • Koonin E.V.
        Orthologs, paralogs and evolutionary genomics.
        Annu. Rev. Genet. 2005; 39: 309-338
        • Krissinel E.
        • Henrick K.
        Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions.
        Acta Crystallogr. D Biol. Crystallogr. 2004; 60: 2256-2268
        • Parsiegla G.
        • Juy M.
        • Reverbel-Leroy C.
        • Tardif C.
        • Belaïch J.P.
        • Driguez H.
        • Haser R.
        The crystal structure of the processive endocellulase CelF of Clostridium cellulolyticum in complex with a thiooligosaccharide inhibitor at 2.0 Å resolution.
        EMBO J. 1998; 17: 5551-5562
        • Parsiegla G.
        • Reverbel C.
        • Tardif C.
        • Driguez H.
        • Haser R.
        Structures of mutants of cellulase Cel48F of Clostridium cellulolyticum in complex with long hemithiocellooligosaccharides give rise to a new view of the substrate pathway during processive action.
        J. Mol. Biol. 2008; 375: 499-510
        • Pignatelli M.
        • Moya A.
        Evaluating the fidelity of de novo short read metagenomic assembly using simulated data.
        PLoS One. 2011; 6: e19984
        • Rho M.
        • Tang H.
        • Ye Y.
        FragGeneScan. Predicting genes in short and error-prone reads.
        Nucleic Acids Res. 2010; 38: e191
        • Fetrow J.S.
        ω-Loops. Nonregular secondary structures significant in protein function and stability.
        FASEB J. 1995; 9: 708-717
        • Mba Medie F.
        • Davies G.J.
        • Drancourt M.
        • Henrissat B.
        Genome analyses highlight the different biological roles of cellulases.
        Nat. Rev. Microbiol. 2012; 10: 227-234
        • Suen G.
        • Scott J.J.
        • Aylward F.O.
        • Adams S.M.
        • Tringe S.G.
        • Pinto-Tomás A.A.
        • Foster C.E.
        • Pauly M.
        • Weimer P.J.
        • Barry K.W.
        • Goodwin L.A.
        • Bouffard P.
        • Li L.
        • Osterberger J.
        • Harkins T.T.
        • Slater S.C.
        • Donohue T.J.
        • Currie C.R.
        An insect herbivore microbiome with high plant biomass-degrading capacity.
        PLoS Genet. 2010; 6: e1001129
        • Sessitsch A.
        • Hardoim P.
        • Döring J.
        • Weilharter A.
        • Krause A.
        • Woyke T.
        • Mitter B.
        • Hauberg-Lotte L.
        • Friedrich F.
        • Rahalkar M.
        • Hurek T.
        • Sarkar A.
        • Bodrossy L.
        • van Overbeek L.
        • Brar D.
        • van Elsas J.D.
        • Reinhold-Hurek B.
        Functional characteristics of an endophyte community colonizing rice roots as revealed by metagenomic analysis.
        Mol. Plant Microbe Interact. 2012; 25: 28-36
        • Brulc J.M.
        • Yeoman C.J.
        • Wilson M.K.
        • Berg Miller M.E.
        • Jeraldo P.
        • Jindou S.
        • Goldenfeld N.
        • Flint H.J.
        • Lamed R.
        • Borovok I.
        • Vodovnik M.
        • Nelson K.E.
        • Bayer E.A.
        • White B.A.
        Cellulosomics, a gene-centric approach to investigating the intraspecific diversity and adaptation of Ruminococcus flavefaciens within the rumen.
        PLoS One. 2011; 6: e25329
        • Huws S.A.
        • Lee M.R.
        • Muetzel S.M.
        • Scott M.B.
        • Wallace R.J.
        • Scollan N.D.
        Forage type and fish oil cause shifts in rumen bacterial diversity.
        FEMS Microbiol. Ecol. 2010; 73: 396-407
        • Cowan D.
        • Meyer Q.
        • Stafford W.
        • Muyanga S.
        • Cameron R.
        • Wittwer P.
        Metagenomic gene discovery. Past, present and future.
        Trends Biotechnol. 2005; 23: 321-329