Scalable, accessible, and reproducible reference genome assembly and evaluation in Galaxy.
Larivière, Delphine; Abueg, Linelle; Brajuka, Nadolina; Gallardo-Alba, Cristóbal; Grüning, Bjorn; Ko, Byung June; Ostrovsky, Alex; Palmada-Flores, Marc; Pickett, Brandon D; Rabbani, Keon; Balacco, Jennifer R; Chaisson, Mark; Cheng, Haoyu; Collins, Joanna; Denisova, Alexandra; Fedrigo, Olivier; Gallo, Guido Roberto; Giani, Alice Maria; Gooder, Grenville MacDonald; Jain, Nivesh; Johnson, Cassidy; Kim, Heebal; Lee, Chul; Marques-Bonet, Tomas; O'Toole, Brian; Rhie, Arang; Secomandi, Simona; Sozzoni, Marcella; Tilley, Tatiana; Uliano-Silva, Marcela; van den Beek, Marius; Waterhouse, Robert M; Phillippy, Adam M; Jarvis, Erich D; Schatz, Michael C; Nekrutenko, Anton; Formenti, Giulio.
Affiliation
  • Larivière D; Dept. of Biochemistry and Molecular Biology, Pennsylvania State University, USA.
  • Abueg L; Vertebrate Genome Laboratory, The Rockefeller University, USA.
  • Brajuka N; Vertebrate Genome Laboratory, The Rockefeller University, USA.
  • Gallardo-Alba C; Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University Freiburg, Freiburg, Germany.
  • Grüning B; Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University Freiburg, Freiburg, Germany.
  • Ko BJ; Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea.
  • Ostrovsky A; Departments of Biology and Computer Science, Johns Hopkins University, USA.
  • Palmada-Flores M; Department of Medicine and Life Sciences (MELIS), Institut de Biologia Evolutiva, Universitat Pompeu Fabra-CSIC, Barcelona 08003, Spain.
  • Pickett BD; Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.
  • Rabbani K; Department of Quantitative and Computational Biology, University of Southern California.
  • Balacco JR; Vertebrate Genome Laboratory, The Rockefeller University, USA.
  • Chaisson M; Department of Quantitative and Computational Biology, University of Southern California.
  • Cheng H; Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA.
  • Collins J; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
  • Denisova A; Wellcome Sanger Institute, Cambridge CB10 1SA, United Kingdom.
  • Fedrigo O; Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russia.
  • Gallo GR; Vertebrate Genome Laboratory, The Rockefeller University, USA.
  • Giani AM; Department of Biosciences, University of Milan, Milan, Italy.
  • Gooder GM; BMRI, Weill Cornell Medical College, New York, 10021, USA.
  • Jain N; Vertebrate Genome Laboratory, The Rockefeller University, USA.
  • Johnson C; Vertebrate Genome Laboratory, The Rockefeller University, USA.
  • Kim H; Vertebrate Genome Laboratory, The Rockefeller University, USA.
  • Lee C; Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea.
  • Marques-Bonet T; eGnome, Inc, Seoul, Republic of Korea.
  • O'Toole B; Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea.
  • Rhie A; Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea.
  • Secomandi S; Laboratory of Neurogenetics of Language, The Rockefeller University, New York City, NY, 10065, USA.
  • Sozzoni M; Department of Medicine and Life Sciences (MELIS), Institut de Biologia Evolutiva, Universitat Pompeu Fabra-CSIC, Barcelona 08003, Spain.
  • Tilley T; Catalan Institution of Research and Advanced Studies (ICREA), Barcelona 08010, Spain.
  • Uliano-Silva M; CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona 08028, Spain.
  • van den Beek M; Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Cerdanyola del Vallès 08193, Spain.
  • Waterhouse RM; Vertebrate Genome Laboratory, The Rockefeller University, USA.
  • Phillippy AM; Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.
  • Jarvis ED; Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus.
  • Schatz MC; University of Florence, Department of Biology, Via Madonna del Piano 6, Sesto Fiorentino (FI).
  • Nekrutenko A; Vertebrate Genome Laboratory, The Rockefeller University, USA.
  • Formenti G; Tree of Life, Wellcome Sanger Institute, Cambridge CB10 1SA, United Kingdom.
bioRxiv; 2023 Jun 30.
Article in English | MEDLINE | ID: mdl-37425881
Improvements in genome sequencing and assembly are enabling high-quality reference genomes for all species. However, the assembly process is still laborious, computationally and technically demanding, lacks standards for reproducibility, and is not readily scalable. Here we present the latest Vertebrate Genomes Project assembly pipeline and demonstrate that it delivers high-quality reference genomes at scale across a set of vertebrate species arising over the last ~500 million years. The pipeline is versatile and combines PacBio HiFi long reads and Hi-C-based haplotype phasing in a new graph-based paradigm. Standardized quality control is performed automatically to troubleshoot assembly issues and assess biological complexities. We make the pipeline freely accessible through Galaxy, accommodating researchers even without local computational resources and enhancing reproducibility by democratizing the training and assembly process. We demonstrate the flexibility and reliability of the pipeline by assembling reference genomes for 51 vertebrate species from major taxonomic groups (fish, amphibians, reptiles, birds, and mammals).

Full text: 1 Database: MEDLINE Language: En Journal: BioRxiv Publication year: 2023 Document type: Article Affiliation country: United States
