Results 1 - 7 of 7
1.
Bioinformatics ; 37(18): 3058-3060, 2021 09 29.
Article in English | MEDLINE | ID: mdl-33715007

ABSTRACT

MOTIVATION: R Experiment objects such as the SummarizedExperiment or SingleCellExperiment are data containers for storing one or more matrix-like assays along with associated row and column data. These objects have been used to facilitate the storage and analysis of high-throughput genomic data generated from technologies such as single-cell RNA sequencing. One common computational task in many genomics analysis workflows is to subset the data matrix before applying downstream analytical methods. For example, one may need to subset the columns of the assay matrix to exclude poor-quality samples or subset the rows of the matrix to select the most variable features. Traditionally, a second object is created that contains the desired subset of the assay from the original object. However, this approach is inefficient because it requires the creation of an additional object containing a copy of the original assay, and it leads to challenges with data provenance.

RESULTS: To overcome these challenges, we developed an R package called ExperimentSubset, which is a data container that implements classes for efficient storage and streamlined retrieval of assays that have been subsetted by rows and/or columns. These classes inherently provide data provenance by maintaining the relationship between the subsetted and parent assays. We demonstrate the utility of this package on a single-cell RNA-seq dataset by storing and retrieving subsets at different stages of the analysis while maintaining a lower memory footprint. Overall, ExperimentSubset is a flexible container for the efficient management of subsets.

AVAILABILITY AND IMPLEMENTATION: The ExperimentSubset package is available at Bioconductor: https://bioconductor.org/packages/ExperimentSubset/ and GitHub: https://github.com/campbio/ExperimentSubset.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
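The idea of storing a subset as indices into a parent assay, rather than as a copy, can be illustrated with a small sketch. This is not the ExperimentSubset API (which is an R package); the class and method names below (`Experiment`, `Subset`, `create_subset`, `get`, `lineage`) are hypothetical and written in Python purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Subset:
    """A named subset stored as row/column indices into a parent (hypothetical)."""
    name: str
    parent: str      # name of the assay or subset this one was derived from
    rows: list
    cols: list


@dataclass
class Experiment:
    """Toy container: full assays are stored once; subsets hold indices only."""
    assays: dict
    subsets: dict = field(default_factory=dict)

    def create_subset(self, name, parent, rows, cols):
        # Only the indices and the parent's name are stored, not the data.
        self.subsets[name] = Subset(name, parent, list(rows), list(cols))

    def get(self, name):
        """Return a stored assay, or materialise a subset on demand."""
        if name in self.assays:
            return self.assays[name]
        sub = self.subsets[name]
        parent = self.get(sub.parent)  # indices are relative to the parent
        return [[parent[r][c] for c in sub.cols] for r in sub.rows]

    def lineage(self, name):
        """Walk from a subset back to its root assay (data provenance)."""
        chain = [name]
        while name in self.subsets:
            name = self.subsets[name].parent
            chain.append(name)
        return chain


# Rows are features (genes), columns are samples (cells).
exp = Experiment(assays={"counts": [[1, 0, 3], [0, 0, 0], [4, 5, 6]]})
exp.create_subset("good_cells", "counts", rows=[0, 1, 2], cols=[0, 2])
exp.create_subset("variable_genes", "good_cells", rows=[0, 2], cols=[0, 1])

variable = exp.get("variable_genes")
history = exp.lineage("variable_genes")
```

Because each subset records only its parent's name and a pair of index lists, the relationship between subsetted and parent assays is retained without duplicating the matrix.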


Subjects
Genomics , Software , Genome , Workflow
2.
Genome Biol ; 25(1): 205, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39090672

ABSTRACT

Many datasets are being produced by consortia that seek to characterize healthy and diseased tissues at single-cell resolution. While biospecimen and experimental information is often captured, detailed metadata standards related to data matrices and analysis workflows are currently lacking. To address this, we developed the matrix and analysis metadata standards (MAMS) to serve as a resource for data centers, repositories, and tool developers. We defined metadata fields for matrices and parameters commonly utilized in analytical workflows and developed the rmams package to extract MAMS from single-cell objects. Overall, MAMS promotes the harmonization, integration, and reproducibility of single-cell data across platforms.


Subjects
Metadata , Single-Cell Analysis , Single-Cell Analysis/methods , Single-Cell Analysis/standards , Reproducibility of Results , Humans , Software
3.
Patterns (N Y) ; 4(8): 100814, 2023 Aug 11.
Article in English | MEDLINE | ID: mdl-37602214

ABSTRACT

Analysis of single-cell RNA sequencing (scRNA-seq) data can reveal novel insights into the heterogeneity of complex biological systems. Many tools and workflows have been developed to perform different types of analyses. However, these tools are spread across different packages or programming environments, rely on different underlying data structures, and can only be utilized by people with knowledge of programming languages. In the Single-Cell Toolkit 2 (SCTK2), we have integrated a variety of popular tools and workflows to perform various aspects of scRNA-seq analysis. All tools and workflows can be run in the R console or using an intuitive graphical user interface built with R/Shiny. HTML reports generated with Rmarkdown can be used to document and recapitulate individual steps or entire analysis workflows. We show that the toolkit offers more features when compared with existing tools and allows for a seamless analysis of scRNA-seq data for non-computational users.

4.
bioRxiv ; 2023 Nov 27.
Article in English | MEDLINE | ID: mdl-38077085

ABSTRACT

Emerging spatial omics technologies continue to advance the molecular mapping of tissue architecture and the investigation of gene regulation and cellular crosstalk, which in turn provide new mechanistic insights into a wide range of biological processes and diseases. Such technologies provide an increasingly large amount of information content at multiple spatial scales. However, representing and harmonizing diverse spatial datasets efficiently, including combining multiple modalities or spatial scales in a scalable and flexible manner, remains a substantial challenge. Here, we present Giotto Suite, a suite of open-source software packages that underlies a fully modular and integrated spatial data analysis toolbox. At its core, Giotto Suite is centered around an innovative and technology-agnostic data framework embedded in the R software environment, which allows the representation and integration of virtually any type of spatial omics data at any spatial resolution. In addition, Giotto Suite provides both scalable and extensible end-to-end solutions for data analysis, integration, and visualization. Giotto Suite integrates molecular, morphology, spatial, and annotated feature information to create a responsive and flexible workflow for multi-scale, multi-omic data analyses, as demonstrated here by applications to several state-of-the-art spatial technologies. Furthermore, Giotto Suite builds upon interoperable interfaces and data structures that bridge the established fields of genomics and spatial data science, thereby enabling independent developers to create custom-engineered pipelines. As such, Giotto Suite creates an immersive ecosystem for spatial multi-omic data analysis.

5.
bioRxiv ; 2023 Mar 07.
Article in English | MEDLINE | ID: mdl-36945543

ABSTRACT

A large number of genomic and imaging datasets are being produced by consortia that seek to characterize healthy and diseased tissues at single-cell resolution. While much effort has been devoted to capturing biospecimen information and experimental procedures, the metadata standards that describe data matrices and the analysis workflows that produced them are relatively lacking. Detailed metadata schemas related to data analysis are needed to facilitate sharing and interoperability across groups and to promote data provenance for reproducibility. To address this need, we developed the Matrix and Analysis Metadata Standards (MAMS) to serve as a resource for data coordinating centers and tool developers. We first curated several simple and complex "use cases" to characterize the types of feature-observation matrices (FOMs), annotations, and analysis metadata produced in different workflows. Based on these use cases, metadata fields were defined to describe the data contained within each matrix, including those related to processing, modality, and subsets. Suggested terms were created for the majority of fields to aid in harmonization of metadata terms across groups. Additional provenance metadata fields were also defined to describe the software and workflows that produced each FOM. Finally, we developed a simple list-like schema that can be used to store MAMS information and can be implemented in multiple formats. Overall, MAMS can be used as a guide to harmonize analysis-related metadata, which will ultimately facilitate integration of datasets across tools and consortia. MAMS specifications, use cases, and examples can be found at https://github.com/single-cell-mams/mams/.
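A list-like metadata record of the kind described above might look roughly like the following Python sketch. Every field name and value here (`modality`, `processing`, `obs_subset`, `parent_id`, `tool`, `version`) is a hypothetical stand-in chosen for illustration; the actual field names and suggested terms are defined in the MAMS specification at the URL above, not here.

```python
import json

# Hypothetical record: one entry per feature-observation matrix (FOM),
# plus a provenance entry describing what produced a derived FOM.
record = {
    "FOM": {
        "fom1": {
            "modality": "rna",
            "processing": "counts",
            "obs_subset": "full",
        },
        "fom2": {
            "modality": "rna",
            "processing": "lognormalized",
            "obs_subset": "filtered",
            "parent_id": "fom1",  # links the derived matrix to its source
        },
    },
    "PROVENANCE": {
        "fom2": {"tool": "example_tool", "version": "0.0.0"},
    },
}

# Hypothetical minimal set of fields every FOM entry should carry.
REQUIRED_FOM_FIELDS = ("modality", "processing", "obs_subset")


def missing_fields(rec):
    """Return (fom_id, field) pairs for required fields that are absent."""
    return [
        (fom_id, name)
        for fom_id, fields in rec["FOM"].items()
        for name in REQUIRED_FOM_FIELDS
        if name not in fields
    ]


# A nested list/dict structure serialises directly to JSON, which is one
# way such a schema could be stored in more than one format.
text = json.dumps(record, sort_keys=True, indent=2)
restored = json.loads(text)
problems = missing_fields(restored)
```

Keeping the record as plain nested mappings means the same content can be written to JSON, YAML, or an object slot without a custom serializer.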

6.
Nat Commun ; 13(1): 1688, 2022 03 30.
Article in English | MEDLINE | ID: mdl-35354805

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) can be used to gain insights into cellular heterogeneity within complex tissues. However, various technical artifacts can be present in scRNA-seq data and should be assessed before performing downstream analyses. While several tools have been developed to perform individual quality control (QC) tasks, they are scattered in different packages across several programming environments. Here, to streamline the process of generating and visualizing QC metrics for scRNA-seq data, we built the SCTK-QC pipeline within the singleCellTK R package. The SCTK-QC workflow can import data from several single-cell platforms and preprocessing tools and includes steps for empty droplet detection, generation of standard QC metrics, prediction of doublets, and estimation of ambient RNA. It can run on the command line, within the R console, on a cloud platform, or with an interactive graphical user interface. Overall, the SCTK-QC pipeline streamlines and standardizes the process of performing QC for scRNA-seq data.
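As a rough illustration of what "standard QC metrics" can mean for a count matrix, the sketch below computes two widely used per-cell quantities: total counts and number of detected features. It is not the SCTK-QC implementation; the function names (`qc_metrics`, `flag_low_quality`) and the thresholds are assumptions made for this example.

```python
def qc_metrics(counts, cell_ids):
    """Compute per-cell totals and detected-feature counts.

    counts: genes x cells matrix as a list of rows (hypothetical layout).
    """
    metrics = {}
    for j, cell in enumerate(cell_ids):
        column = [row[j] for row in counts]
        metrics[cell] = {
            "total": sum(column),                        # library size
            "detected": sum(1 for v in column if v > 0), # non-zero features
        }
    return metrics


def flag_low_quality(metrics, min_total, min_detected):
    """Return the cells that fall below either (assumed) threshold."""
    return [
        cell
        for cell, m in metrics.items()
        if m["total"] < min_total or m["detected"] < min_detected
    ]


# Three genes (rows) measured in three cells (columns).
counts = [
    [5, 0, 1],
    [3, 0, 0],
    [2, 1, 0],
]
metrics = qc_metrics(counts, ["A", "B", "C"])
flagged = flag_low_quality(metrics, min_total=5, min_detected=2)
```

Steps such as empty droplet detection, doublet prediction, and ambient RNA estimation rely on dedicated statistical methods and are not covered by this sketch.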


Subjects
Benchmarking , Software , Quality Control , Sequence Analysis, RNA , Exome Sequencing
7.
Comput Biol Med ; 116: 103561, 2020 01.
Article in English | MEDLINE | ID: mdl-31785415

ABSTRACT

Gene expression microarrays capture a complete image of all the transcriptional activity in a biological sample. Microarrays produce a large amount of data, which is challenging to explore and interpret with modern computational and statistical tools. We propose the Microarray Analysis (MiCA) tool, which outperforms other similar tools in both ease of use and statistical features while requiring minimal input to conduct an analysis. MiCA is an integrated, interactive, and streamlined desktop software for the analysis of microarray gene expression data. MiCA consists of a complete microarray analysis pipeline including, but not limited to, fetching data directly from GEO, normalization, interactive quality control, batch-effect correction, regression analysis, surrogate variable analysis, and functional annotation methods such as GSVA, using existing R packages. We compare the features offered by MiCA and other similar tools while performing differential expression analysis using previously published datasets. MiCA offers additional statistical and visualization methods to conduct a microarray data analysis compared to other available microarray analysis tools. MiCA minimizes the need for technical knowledge by providing a very intuitive and versatile interface that integrates all necessary tasks and features required for basic microarray data analysis. We analyzed multiple published datasets and showed that the features offered by MiCA not only simplify the analysis pipeline but also provide additional interpretation of the data.


Subjects
Computational Biology , Software , Databases, Genetic , Gene Expression , Gene Expression Profiling , Microarray Analysis