Inverting the model of genomics data sharing with the NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space.
Schatz, Michael C; Philippakis, Anthony A; Afgan, Enis; Banks, Eric; Carey, Vincent J; Carroll, Robert J; Culotti, Alessandro; Ellrott, Kyle; Goecks, Jeremy; Grossman, Robert L; Hall, Ira M; Hansen, Kasper D; Lawson, Jonathan; Leek, Jeffrey T; Luria, Anne O'Donnell; Mosher, Stephen; Morgan, Martin; Nekrutenko, Anton; O'Connor, Brian D; Osborn, Kevin; Paten, Benedict; Patterson, Candace; Tan, Frederick J; Taylor, Casey Overby; Vessio, Jennifer; Waldron, Levi; Wang, Ting; Wuichet, Kristin.
Affiliation
  • Schatz MC; Department of Biology, Johns Hopkins University, Baltimore, MD, USA.
  • Philippakis AA; Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.
  • Afgan E; Broad Institute of MIT and Harvard, Cambridge, MA, USA.
  • Banks E; Department of Biology, Johns Hopkins University, Baltimore, MD, USA.
  • Carey VJ; Broad Institute of MIT and Harvard, Cambridge, MA, USA.
  • Carroll RJ; Harvard Medical School, Harvard University, Cambridge, MA, USA.
  • Culotti A; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA.
  • Ellrott K; Broad Institute of MIT and Harvard, Cambridge, MA, USA.
  • Goecks J; Center for Translational Data Science, University of Chicago, Chicago, IL, USA.
  • Grossman RL; Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA.
  • Hall IM; Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA.
  • Hansen KD; Center for Translational Data Science, University of Chicago, Chicago, IL, USA.
  • Lawson J; Yale School of Medicine, Yale University, New Haven, CT, USA.
  • Leek JT; Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA.
  • Luria AO; Broad Institute of MIT and Harvard, Cambridge, MA, USA.
  • Mosher S; Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA.
  • Morgan M; Broad Institute of MIT and Harvard, Cambridge, MA, USA.
  • Nekrutenko A; Department of Biology, Johns Hopkins University, Baltimore, MD, USA.
  • O'Connor BD; Department of Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY, USA.
  • Osborn K; Department of Biochemistry and Molecular Biology, The Pennsylvania State University, State College, PA, USA.
  • Paten B; Broad Institute of MIT and Harvard, Cambridge, MA, USA.
  • Patterson C; UC Santa Cruz Genomics Institute, UC Santa Cruz, Santa Cruz, CA, USA.
  • Tan FJ; UC Santa Cruz Genomics Institute, UC Santa Cruz, Santa Cruz, CA, USA.
  • Taylor CO; Broad Institute of MIT and Harvard, Cambridge, MA, USA.
  • Vessio J; Department of Embryology, Carnegie Institution, Baltimore, MD, USA.
  • Waldron L; Departments of Medicine and Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA.
  • Wang T; Department of Biology, Johns Hopkins University, Baltimore, MD, USA.
  • Wuichet K; Department of Epidemiology and Biostatistics, City University of New York Graduate School of Public Health and Health Policy, New York, NY, USA.
Cell Genom; 2(1), 2022 Jan 12.
Article in En | MEDLINE | ID: mdl-35199087
ABSTRACT
The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL; https://anvilproject.org) was developed to address a widespread community need for a unified computing environment for genomics data storage, management, and analysis. In this perspective, we present AnVIL, describe its ecosystem and interoperability with other platforms, and highlight how this platform and associated initiatives contribute to improved genomic data sharing efforts. The AnVIL is a federated cloud platform designed to manage and store genomics and related data, enable population-scale analysis, and facilitate collaboration through the sharing of data, code, and analysis results. By inverting the traditional model of data sharing, the AnVIL eliminates the need for data movement, adds security measures for active threat detection and monitoring, and provides scalable, shared computing resources for any researcher. We describe the core data management and analysis components of the AnVIL, which currently consist of Terra, Gen3, Galaxy, RStudio/Bioconductor, Dockstore, and Jupyter, and describe several flagship genomics datasets available within the AnVIL. We continue to extend and innovate the AnVIL ecosystem by implementing new capabilities, including mechanisms for interoperability and responsible data sharing, while streamlining access management. The AnVIL opens many new opportunities for analysis, collaboration, and data sharing that are needed to drive research and to make discoveries through the joint analysis of hundreds of thousands to millions of genomes along with associated clinical and molecular data types.

Full text: 1 Collection: 01-internacional Database: MEDLINE Language: En Journal: Cell Genom Year: 2022 Document type: Article Affiliation country: United States