RaggedExperiment: the missing link between genomic ranges and matrices in Bioconductor.
Ramos, Marcel; Morgan, Martin; Geistlinger, Ludwig; Carey, Vincent J; Waldron, Levi.
Affiliation
  • Ramos M; Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, NY 10027, United States.
  • Morgan M; Institute for Implementation Science and Population Health, City University of New York, New York, NY 10027, United States.
  • Geistlinger L; Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY 14203, United States.
  • Carey VJ; Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY 14203, United States.
  • Waldron L; Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, NY 10027, United States.
Bioinformatics; 39(6), 2023-06-01.
Article in English | MEDLINE | ID: mdl-37208161
ABSTRACT

SUMMARY:

The RaggedExperiment R/Bioconductor package provides lossless representation of disparate genomic ranges across multiple specimens or cells, in conjunction with efficient and flexible calculations of rectangular-shaped summaries for downstream analysis. Applications include statistical analysis of somatic mutations, copy number, methylation, and open chromatin data. RaggedExperiment is compatible with multimodal data analysis as a component of MultiAssayExperiment data objects, and simplifies data representation and transformation for software developers and analysts.

MOTIVATION AND RESULTS:

Measurement of copy number, mutation, single nucleotide polymorphism, and other genomic attributes that may be stored as VCF files produces "ragged" genomic ranges data, i.e. data that span different genomic coordinates in each sample. Ragged data are not rectangular or matrix-like, presenting informatics challenges for downstream statistical analyses. We present the RaggedExperiment R/Bioconductor data structure for lossless representation of ragged genomic data, with associated reshaping tools for flexible and efficient calculation of tabular representations to support a wide range of downstream statistical analyses. We demonstrate its applicability to copy number and somatic mutation data across 33 TCGA cancer datasets.
Subjects

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Genomics / Neoplasms Limit: Humans Language: En Year of publication: 2023 Document type: Article
