Results 1 - 3 of 3
1.
Nat Commun; 13(1): 780, 2022 Feb 09.
Article in English | MEDLINE | ID: mdl-35140223

ABSTRACT

Single-cell genomic technologies provide an unprecedented opportunity to define molecular cell types in a data-driven fashion, but present unique data integration challenges. Many analyses require "mosaic integration", including both features shared across datasets and features exclusive to a single experiment. Previous computational integration approaches require that the input matrices share the same number of either genes or cells, and thus can use only shared features. To address this limitation, we derive a nonnegative matrix factorization algorithm for integrating single-cell datasets containing both shared and unshared features. The key advance is incorporating an additional metagene matrix that allows unshared features to inform the factorization. We demonstrate that incorporating unshared features significantly improves integration of single-cell RNA-seq, spatial transcriptomic, SNARE-seq, and cross-species datasets. We have incorporated the UINMF algorithm into the open-source LIGER R package ( https://github.com/welch-lab/liger ).
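
The role of the additional metagene matrix can be illustrated with a small Python sketch. It only evaluates a reconstruction error for one dataset whose feature matrix is split into shared and unshared rows; it does not perform the UINMF optimization. The function name, the variable names (E, P, W, V, U, H), the features-by-cells orientation, the zero-padding of the shared metagenes, and the random toy data are all assumptions made for this illustration. The actual objective, penalty terms, and solver are defined in the paper and the LIGER package.

```python
import numpy as np

def stacked_reconstruction_error(E, P, W, V, U, H):
    """Squared Frobenius error for one dataset with shared and unshared features.

    Hypothetical helper for illustration only.
    E: shared features x cells      P: unshared features x cells
    W: shared metagenes             (shared features x k)
    V: dataset-specific metagenes   (shared features x k)
    U: metagenes on unshared rows   (unshared features x k)
    H: cell factor loadings         (k x cells)
    """
    X = np.vstack([E, P])                            # all features of this dataset
    shared_part = np.vstack([W, np.zeros_like(U)])   # W has no rows for unshared features
    specific_part = np.vstack([V, U])                # U lets unshared features contribute
    residual = X - (shared_part + specific_part) @ H
    return float(np.linalg.norm(residual, "fro") ** 2)

# Toy data generated from known nonnegative factors (assumed sizes).
rng = np.random.default_rng(0)
n_shared, n_unshared, n_cells, k = 30, 10, 50, 4
W = rng.random((n_shared, k))
V = rng.random((n_shared, k))
U = rng.random((n_unshared, k))
H = rng.random((k, n_cells))
E = (W + V) @ H
P = U @ H

err_with_u = stacked_reconstruction_error(E, P, W, V, U, H)
err_without_u = stacked_reconstruction_error(E, P, W, V, np.zeros_like(U), H)
print(err_with_u, err_without_u)
```

In this toy setting, dropping U leaves the unshared rows P entirely unexplained, so the second error equals the squared Frobenius norm of P, while the first is zero up to rounding.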


Subject(s)
Algorithms, Computational Biology, Single-Cell Analysis, Factual Databases, Genomics, RNA-Seq, Software, Transcriptome, Exome Sequencing
2.
Nat Biotechnol; 39(8): 1000-1007, 2021 Aug.
Article in English | MEDLINE | ID: mdl-33875866

ABSTRACT

Integrating large single-cell gene expression, chromatin accessibility and DNA methylation datasets requires general and scalable computational approaches. Here we describe online integrative non-negative matrix factorization (iNMF), an algorithm for integrating large, diverse and continually arriving single-cell datasets. Our approach scales to arbitrarily large numbers of cells using fixed memory, iteratively incorporates new datasets as they are generated and allows many users to simultaneously analyze a single copy of a large dataset by streaming it over the internet. Iterative data addition can also be used to map new data to a reference dataset. Comparisons with previous methods indicate that the improvements in efficiency do not sacrifice dataset alignment and cluster preservation performance. We demonstrate the effectiveness of online iNMF by integrating more than 1 million cells on a standard laptop, integrating large single-cell RNA sequencing and spatial transcriptomic datasets, and iteratively constructing a single-cell multi-omic atlas of the mouse motor cortex.
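
The fixed-memory idea in the abstract can be sketched with a generic mini-batch nonnegative matrix factorization in Python. This is not the authors' online iNMF, which additionally models several datasets with shared and dataset-specific factors. Here a single factor matrix W is learned from a stream of batches, and only W plus two small running summaries are kept in memory, regardless of how many cells have been seen. The function names, parameters, batch sizes, and synthetic data are hypothetical and chosen for illustration.

```python
import numpy as np

def online_nmf(batches, n_features, k, n_inner=20, seed=0, eps=1e-10):
    """Mini-batch NMF sketch (hypothetical helper, not the LIGER implementation).

    batches: iterable of nonnegative arrays, each features x cells_in_batch.
    Returns W (features x k). Memory use depends on n_features, k, and the
    batch size, not on the total number of cells streamed.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((n_features, k))
    A = np.zeros((k, k))            # running sum of H @ H.T
    B = np.zeros((n_features, k))   # running sum of X @ H.T
    for X in batches:
        # Fit loadings for this batch with W held fixed (multiplicative updates).
        H = rng.random((k, X.shape[1]))
        for _ in range(n_inner):
            H *= (W.T @ X) / (W.T @ W @ H + eps)
        # Fold the batch into the summaries; the batch itself can then be discarded.
        A += H @ H.T
        B += X @ H.T
        # Refresh W using only the accumulated summaries.
        for _ in range(n_inner):
            W *= B / (W @ A + eps)
    return W

def stream_batches(n_batches, n_features, batch_size, k, seed=1):
    """Yield synthetic nonnegative batches one at a time (toy data source)."""
    rng = np.random.default_rng(seed)
    W_true = rng.random((n_features, k))
    for _ in range(n_batches):
        yield W_true @ rng.random((k, batch_size))

W = online_nmf(stream_batches(20, 40, 25, 3), n_features=40, k=3)
print(W.shape)
```

Because the data arrive through a generator, new batches can be appended to the stream later without revisiting earlier cells, which mirrors the iterative data addition the abstract describes.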


Subject(s)
Algorithms, Computational Biology/methods, Machine Learning, Single-Cell Analysis/methods, Transcriptome/genetics, Animals, Mice, Multivariate Analysis
3.
Article in English | MEDLINE | ID: mdl-35187422

ABSTRACT

We are bioinformatics trainees at the University of Michigan who started a local chapter of Girls Who Code to provide a fun and supportive environment for high school women to learn the power of coding. Our goal was to cover basic coding topics and data science concepts through live coding and hands-on practice. However, we could not find a resource that exactly met our needs. Therefore, over the past three years, we have developed a curriculum and instructional format using Jupyter notebooks to effectively teach introductory Python for data science. This method, inspired by The Carpentries organization, uses bite-sized lessons followed by independent practice time to reinforce coding concepts, and culminates in a data science capstone project using real-world data. We believe our open curriculum is a valuable resource to the wider education community and hope that educators will use and improve our lessons, practice problems, and teaching best practices. Anyone can contribute to our Open Educational Resources on GitHub.
