Matrix and analysis metadata standards (MAMS) to facilitate harmonization and reproducibility of single-cell data.
Wang, Yichen; Sarfraz, Irzam; Teh, Wei Kheng; Sokolov, Artem; Herb, Brian R; Creasy, Heather H; Virshup, Isaac; Dries, Ruben; Degatano, Kylee; Mahurkar, Anup; Schnell, Daniel J; Madrigal, Pedro; Hilton, Jason; Gehlenborg, Nils; Tickle, Timothy; Campbell, Joshua D.
Affiliation
  • Wang Y; Department of Medicine, Boston University School of Medicine, Boston, MA, USA.
  • Sarfraz I; Department of Medicine, Boston University School of Medicine, Boston, MA, USA.
  • Teh WK; European Bioinformatics Institute, European Molecular Biology Laboratory, Hinxton, Cambridgeshire, UK.
  • Sokolov A; Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA.
  • Herb BR; Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA.
  • Creasy HH; Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA.
  • Virshup I; Department of Computational Health, Helmholtz Munich, Oberschleißheim, Germany.
  • Dries R; Department of Medicine, Boston University School of Medicine, Boston, MA, USA.
  • Degatano K; Data Sciences Platform, Broad Institute, Cambridge, MA, USA.
  • Mahurkar A; Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA.
  • Schnell DJ; Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA.
  • Madrigal P; European Bioinformatics Institute, European Molecular Biology Laboratory, Hinxton, Cambridgeshire, UK.
  • Hilton J; Department of Genetics, Stanford University, Stanford, CA, USA.
  • Gehlenborg N; Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
  • Tickle T; Data Sciences Platform, Broad Institute, Cambridge, MA, USA.
  • Campbell JD; Department of Medicine, Boston University School of Medicine, Boston, MA, USA.
bioRxiv; 2023 Mar 07.
Article in English | MEDLINE | ID: mdl-36945543
ABSTRACT
Consortia that seek to characterize healthy and diseased tissues at single-cell resolution are producing a large number of genomic and imaging datasets. While much effort has been devoted to capturing information about biospecimens and experimental procedures, metadata standards that describe data matrices and the analysis workflows that produced them are relatively lacking. Detailed metadata schemas for data analysis are needed to facilitate sharing and interoperability across groups and to promote data provenance for reproducibility. To address this need, we developed the Matrix and Analysis Metadata Standards (MAMS) to serve as a resource for data coordinating centers and tool developers. We first curated several simple and complex "use cases" to characterize the types of feature-observation matrices (FOMs), annotations, and analysis metadata produced in different workflows. Based on these use cases, we defined metadata fields to describe the data contained within each matrix, including fields related to processing, modality, and subsets. Suggested terms were created for the majority of fields to aid in harmonization of metadata terms across groups. Additional provenance metadata fields were also defined to describe the software and workflows that produced each FOM. Finally, we developed a simple list-like schema that can be used to store MAMS information and can be implemented in multiple formats. Overall, MAMS can be used as a guide to harmonize analysis-related metadata, which will ultimately facilitate integration of datasets across tools and consortia. MAMS specifications, use cases, and examples can be found at https://github.com/single-cell-mams/mams/.
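
As a rough illustration of what a "list-like schema" for FOM metadata could look like, the Python sketch below stores one record per matrix and round-trips the list through JSON, one of many possible storage formats. Every field name, value, and helper here (`fom_records`, `find_foms`, `example_tool`, and so on) is a hypothetical placeholder chosen to mirror the categories named in the abstract (processing, modality, subsets, provenance). None of it is taken from the MAMS specification, which should be consulted for the actual fields and suggested terms.

```python
import json

# Hypothetical records: one dictionary per feature-observation matrix (FOM).
# Field names and values are illustrative assumptions, not MAMS-defined terms.
fom_records = [
    {
        "id": "fom1",
        "modality": "rna",
        "processing": "counts",
        "feature_subset": "full",
        "observation_subset": "full",
        "provenance": {"tool": "example_tool", "version": "0.0.0"},
    },
    {
        "id": "fom2",
        "modality": "rna",
        "processing": "lognormalized",
        "feature_subset": "variable",
        "observation_subset": "filtered",
        # "parent" records which FOM this matrix was derived from.
        "provenance": {"tool": "example_tool", "version": "0.0.0", "parent": "fom1"},
    },
]


def find_foms(records, **criteria):
    """Return the records whose top-level fields match every given criterion."""
    return [
        record
        for record in records
        if all(record.get(key) == value for key, value in criteria.items())
    ]


# A plain list of dictionaries serializes directly to JSON and back.
serialized = json.dumps(fom_records, indent=2)
restored = json.loads(serialized)

# Look up matrices by their metadata rather than by position or file name.
raw_rna = find_foms(restored, modality="rna", processing="counts")
```

Because each record is a flat list entry, the same information could equally be written to YAML or to an attribute slot in an existing single-cell object. That portability is the property the abstract attributes to the schema, though the format shown here is only an assumption.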

Full text: 1 Database: MEDLINE Language: En Publication year: 2023 Document type: Article