Unraveling the functional dark matter through global metagenomics.
Pavlopoulos, Georgios A; Baltoumas, Fotis A; Liu, Sirui; Selvitopi, Oguz; Camargo, Antonio Pedro; Nayfach, Stephen; Azad, Ariful; Roux, Simon; Call, Lee; Ivanova, Natalia N; Chen, I Min; Paez-Espino, David; Karatzas, Evangelos; Iliopoulos, Ioannis; Konstantinidis, Konstantinos; Tiedje, James M; Pett-Ridge, Jennifer; Baker, David; Visel, Axel; Ouzounis, Christos A; Ovchinnikov, Sergey; Buluç, Aydin; Kyrpides, Nikos C.
Affiliation
  • Pavlopoulos GA; Institute for Fundamental Biomedical Research, Biomedical Science Research Center Alexander Fleming, Vari, Greece. pavlopoulos@fleming.gr.
  • Baltoumas FA; DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA. pavlopoulos@fleming.gr.
  • Liu S; Center for New Biotechnologies and Precision Medicine, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece. pavlopoulos@fleming.gr.
  • Selvitopi O; Institute for Fundamental Biomedical Research, Biomedical Science Research Center Alexander Fleming, Vari, Greece.
  • Camargo AP; John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA, USA.
  • Nayfach S; Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
  • Azad A; DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
  • Roux S; DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
  • Call L; Luddy School of Informatics, Computing and Engineering, Indiana University Bloomington, Bloomington, IN, USA.
  • Ivanova NN; DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
  • Chen IM; DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
  • Paez-Espino D; DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
  • Karatzas E; DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
  • Iliopoulos I; Institute for Fundamental Biomedical Research, Biomedical Science Research Center Alexander Fleming, Vari, Greece.
  • Tiedje JM; Department of Basic Sciences, School of Medicine, University of Crete, Heraklion, Greece.
  • Pett-Ridge J; School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA, USA.
  • Baker D; Center for Microbial Ecology, Michigan State University, East Lansing, MI, USA.
  • Visel A; Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA, USA.
  • Ouzounis CA; Department of Biochemistry, University of Washington, Seattle, WA, USA.
  • Ovchinnikov S; Institute for Protein Design, University of Washington, Seattle, WA, USA.
  • Buluç A; Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.
  • Kyrpides NC; DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
Nature; 622(7983): 594-602, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37821698
ABSTRACT
Metagenomes encode an enormous diversity of proteins, reflecting a multiplicity of functions and activities [1,2]. Exploration of this vast sequence space has been limited to a comparative analysis against reference microbial genomes and protein families derived from those genomes. Here, to examine the scale of yet untapped functional diversity beyond what is currently possible through the lens of reference genomes, we develop a computational approach to generate reference-free protein families from the sequence space in metagenomes. We analyse 26,931 metagenomes and identify 1.17 billion protein sequences longer than 35 amino acids with no similarity to any sequences from 102,491 reference genomes or the Pfam database [3]. Using massively parallel graph-based clustering, we group these proteins into 106,198 novel sequence clusters with more than 100 members, doubling the number of protein families obtained from the reference genomes clustered using the same approach. We annotate these families on the basis of their taxonomic, habitat, geographical and gene neighbourhood distributions and, where sufficient sequence diversity is available, predict protein three-dimensional models, revealing novel structures. Overall, our results uncover an enormously diverse functional space, highlighting the importance of further exploring the microbial functional dark matter.
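The filter-then-cluster workflow summarised in the abstract can be illustrated with a toy Python sketch. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not name the clustering algorithm, so plain connected components (union-find) stand in for the "massively parallel graph-based clustering", and the function names, the `reference_hits` set and the `similar_pairs` list are hypothetical inputs that a real study would derive from sequence searches.

```python
from collections import defaultdict

MIN_LENGTH = 35  # the abstract keeps sequences "longer than 35 amino acids"


def filter_novel(sequences, reference_hits, min_length=MIN_LENGTH):
    """Keep sequences longer than min_length that have no reference/Pfam hit.

    sequences: dict of sequence ID -> amino acid string (hypothetical input).
    reference_hits: set of IDs with similarity to reference genomes or Pfam
    (assumed to be precomputed by an external search tool).
    """
    return {
        seq_id: seq
        for seq_id, seq in sequences.items()
        if len(seq) > min_length and seq_id not in reference_hits
    }


def cluster_by_similarity(seq_ids, similar_pairs):
    """Group IDs into connected components of the similarity graph.

    Assumption: connected components are a simple stand-in for the
    graph-based clustering mentioned in the abstract.
    """
    parent = {s: s for s in seq_ids}

    def find(x):
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in similar_pairs:
        # Ignore edges that touch sequences removed by the filter.
        if a in parent and b in parent:
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for s in seq_ids:
        clusters[find(s)].add(s)
    return list(clusters.values())


def large_clusters(clusters, min_members=100):
    """Keep clusters with more than min_members members."""
    return [c for c in clusters if len(c) > min_members]
```

In use, `filter_novel` would be applied first, its keys passed to `cluster_by_similarity` together with the pairwise similarity edges, and the result trimmed with `large_clusters`; the default of 100 mirrors the "more than 100 members" cut-off stated in the abstract.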
Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Proteins / Metagenome / Metagenomics / Microbiology Type of study: Prognostic_studies Language: En Journal: Nature Year: 2023 Document type: Article Affiliation country: Greece