Search | BVS Search Portal

UINMF performs mosaic integration of single-cell multi-omic datasets using nonnegative matrix factorization.

Kriebel, April R; Welch, Joshua D.

Nat Commun ; 13(1): 780, 2022 02 09.

Article in English | MEDLINE | ID: mdl-35140223

ABSTRACT

Single-cell genomic technologies provide an unprecedented opportunity to define molecular cell types in a data-driven fashion, but present unique data integration challenges. Many analyses require "mosaic integration", including both features shared across datasets and features exclusive to a single experiment. Previous computational integration approaches require that the input matrices share the same number of either genes or cells, and thus can use only shared features. To address this limitation, we derive a nonnegative matrix factorization algorithm for integrating single-cell datasets containing both shared and unshared features. The key advance is incorporating an additional metagene matrix that allows unshared features to inform the factorization. We demonstrate that incorporating unshared features significantly improves integration of single-cell RNA-seq, spatial transcriptomic, SNARE-seq, and cross-species datasets. We have incorporated the UINMF algorithm into the open-source LIGER R package (https://github.com/welch-lab/liger).
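The abstract does not spell out the objective, so the following is only a rough sketch of how a loss with shared and unshared features might be evaluated. It assumes features-by-cells matrices, a shared metagene matrix W, dataset-specific metagenes V_i on the shared features, and an additional metagene matrix U_i on the unshared features. The function name, argument layout, and the penalty weight lam are illustrative assumptions, not the LIGER API.

```python
import numpy as np

def uinmf_style_loss(shared, unshared, W, V, U, H, lam=0.0):
    """Evaluate a UINMF-style objective (hypothetical helper, assumed form).

    shared[i]:   (g, n_i)   features present in every dataset
    unshared[i]: (z_i, n_i) features exclusive to dataset i (z_i may be 0)
    W: (g, k) shared metagenes; V[i]: (g, k); U[i]: (z_i, k); H[i]: (k, n_i)
    """
    total = 0.0
    for E, P, Vi, Ui, Hi in zip(shared, unshared, V, U, H):
        X = np.vstack([E, P])                         # stack shared over unshared rows
        shared_part = np.vstack([W, np.zeros_like(Ui)])  # W has no unshared rows
        specific = np.vstack([Vi, Ui])                # dataset-specific metagenes
        resid = X - (shared_part + specific) @ Hi
        # reconstruction error plus a penalty on the dataset-specific part
        total += np.sum(resid ** 2) + lam * np.sum((specific @ Hi) ** 2)
    return total

# Toy data built exactly from nonnegative factors (illustrative sizes only).
rng = np.random.default_rng(0)
g, k = 6, 3
sizes = [(4, 5), (0, 7)]  # (unshared features, cells); dataset 2 has none
W = rng.random((g, k))
V = [rng.random((g, k)) for _ in sizes]
U = [rng.random((z, k)) for z, _ in sizes]
H = [rng.random((k, n)) for _, n in sizes]
full = [np.vstack([W + Vi, Ui]) @ Hi for Vi, Ui, Hi in zip(V, U, H)]
shared = [X[:g] for X in full]
unshared = [X[g:] for X in full]

loss_exact = uinmf_style_loss(shared, unshared, W, V, U, H, lam=0.0)
loss_penalized = uinmf_style_loss(shared, unshared, W, V, U, H, lam=1.0)
```

In this layout the two datasets have different numbers of rows, which is the situation the abstract calls mosaic integration: the unshared rows still contribute to the residual through U_i rather than being discarded.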

Subject(s)

Algorithms; Computational Biology; Single-Cell Analysis; Databases, Factual; Genomics; RNA-Seq; Software; Transcriptome; Exome Sequencing

Iterative single-cell multi-omic integration using online learning.

Gao, Chao; Liu, Jialin; Kriebel, April R; Preissl, Sebastian; Luo, Chongyuan; Castanon, Rosa; Sandoval, Justin; Rivkin, Angeline; Nery, Joseph R; Behrens, Margarita M; Ecker, Joseph R; Ren, Bing; Welch, Joshua D.

Nat Biotechnol ; 39(8): 1000-1007, 2021 08.

Article in English | MEDLINE | ID: mdl-33875866

ABSTRACT

Integrating large single-cell gene expression, chromatin accessibility and DNA methylation datasets requires general and scalable computational approaches. Here we describe online integrative non-negative matrix factorization (iNMF), an algorithm for integrating large, diverse and continually arriving single-cell datasets. Our approach scales to arbitrarily large numbers of cells using fixed memory, iteratively incorporates new datasets as they are generated and allows many users to simultaneously analyze a single copy of a large dataset by streaming it over the internet. Iterative data addition can also be used to map new data to a reference dataset. Comparisons with previous methods indicate that the improvements in efficiency do not sacrifice dataset alignment and cluster preservation performance. We demonstrate the effectiveness of online iNMF by integrating more than 1 million cells on a standard laptop, integrating large single-cell RNA sequencing and spatial transcriptomic datasets, and iteratively constructing a single-cell multi-omic atlas of the mouse motor cortex.
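The fixed-memory idea can be illustrated with a simplified, single-dataset online NMF. This is a minimal sketch and not the authors' online iNMF implementation: it assumes features-by-cells mini-batches, solves the cell factors with multiplicative updates, and refreshes the metagenes with one HALS-style sweep over running sufficient statistics. All function names and parameters here are hypothetical.

```python
import numpy as np

def solve_h(X, W, n_iter=50, eps=1e-10):
    """Nonnegative H for a fixed W via multiplicative updates."""
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], X.shape[1]))
    WtX = W.T @ X
    WtW = W.T @ W
    for _ in range(n_iter):
        H *= WtX / (WtW @ H + eps)
    return H

def online_nmf(batches, n_features, k, seed=0, eps=1e-10):
    """Stream mini-batches once; memory depends on n_features and k only."""
    rng = np.random.default_rng(seed)
    W = rng.random((n_features, k))
    A = np.zeros((k, k))           # running sum of H @ H.T
    B = np.zeros((n_features, k))  # running sum of X @ H.T
    for X in batches:
        H = solve_h(X, W)          # factors for the current batch only
        A += H @ H.T
        B += X @ H.T
        for j in range(k):         # one HALS sweep over the columns of W
            step = (B[:, j] - W @ A[:, j]) / (A[j, j] + eps)
            W[:, j] = np.maximum(0.0, W[:, j] + step)
    return W, A, B

def stream_batches(X, batch_size):
    """Yield column blocks, standing in for data read from disk or a network."""
    for start in range(0, X.shape[1], batch_size):
        yield X[:, start:start + batch_size]

# Synthetic rank-4 data: 30 features, 200 cells, processed 50 cells at a time.
rng = np.random.default_rng(1)
X_demo = rng.random((30, 4)) @ rng.random((4, 200))
W_demo, A_demo, B_demo = online_nmf(stream_batches(X_demo, 50), n_features=30, k=4)
```

Because only W, A, and B persist between batches, the memory footprint does not grow with the number of cells, and a newly arriving dataset can be handled by feeding further batches into the same loop.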

Subject(s)

Algorithms; Computational Biology/methods; Machine Learning; Single-Cell Analysis/methods; Transcriptome/genetics; Animals; Mice; Multivariate Analysis

Teaching Python for Data Science: Collaborative development of a modular & interactive curriculum.

Duda, Marlena; Sovacool, Kelly L; Farzaneh, Negar; Nguyen, Vy Kim; Haynes, Sarah E; Falk, Hayley; Furman, Katherine L; Walker, Logan A; Diao, Rucheng; Oneka, Morgan; Drotos, Audrey C; Woloshin, Alana; Dotson, Gabrielle A; Kriebel, April; Meng, Lucy; Thiede, Stephanie N; Lapp, Zena; Wolford, Brooke N.

J Open Source Educ ; 4(46)2021.

Article in English | MEDLINE | ID: mdl-35187422

ABSTRACT

We are bioinformatics trainees at the University of Michigan who started a local chapter of Girls Who Code to provide a fun and supportive environment for high school women to learn the power of coding. Our goal was to cover basic coding topics and data science concepts through live coding and hands-on practice. However, we could not find a resource that exactly met our needs. Therefore, over the past three years, we have developed a curriculum and instructional format using Jupyter notebooks to effectively teach introductory Python for data science. This method, inspired by The Carpentries organization, uses bite-sized lessons followed by independent practice time to reinforce coding concepts, and culminates in a data science capstone project using real-world data. We believe our open curriculum is a valuable resource to the wider education community and hope that educators will use and improve our lessons, practice problems, and teaching best practices. Anyone can contribute to our Open Educational Resources on GitHub.
