Results 1 - 20 of 37
1.
bioRxiv ; 2024 Feb 19.
Article in English | MEDLINE | ID: mdl-38464242

ABSTRACT

Recent experimental developments enable single-cell multimodal epigenomic profiling, which measures multiple histone modifications and chromatin accessibility within the same cell. Such parallel measurements provide exciting new opportunities to investigate how epigenomic modalities vary together across cell types and states. A pivotal step in using this type of data is integrating the epigenomic modalities to learn a unified representation of each cell, but existing approaches are not designed to model the unique nature of this data type. Our key insight is to model single-cell multimodal epigenome data as a multi-channel sequential signal. Based on this insight, we developed ConvNet-VAEs, a novel framework that uses 1D-convolutional variational autoencoders (VAEs) for single-cell multimodal epigenomic data integration. We evaluated ConvNet-VAEs on nano-CT and scNTT-seq data generated from juvenile mouse brain and human bone marrow. We found that ConvNet-VAEs can perform dimension reduction and batch correction better than previous architectures while using significantly fewer parameters. Furthermore, the performance gap between convolutional and fully-connected architectures increases with the number of modalities, and deeper convolutional architectures can increase performance while performance degrades for deeper fully-connected architectures. Our results indicate that convolutional autoencoders are a promising method for integrating current and future single-cell multimodal epigenomic datasets.
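
The "multi-channel sequential signal" view can be illustrated with a minimal sketch, assuming each cell is stored as one row of per-bin counts per modality. The function name, the toy counts, and the kernel weights below are hypothetical and are not taken from ConvNet-VAEs; the sketch shows only the 1D convolution step, not the variational autoencoder around it.

```python
def conv1d_multichannel(signal, kernel, stride=1):
    """Valid-mode 1D convolution (cross-correlation) over genomic bins.

    signal: list of channels, each a list of per-bin counts (one channel per modality)
    kernel: list of channels, each a list of weights of equal width
    Returns one output value per window, summed over all channels.
    """
    n_channels = len(signal)
    n_bins = len(signal[0])
    width = len(kernel[0])
    out = []
    for start in range(0, n_bins - width + 1, stride):
        total = 0.0
        for c in range(n_channels):
            for k in range(width):
                total += signal[c][start + k] * kernel[c][k]
        out.append(total)
    return out


# Toy cell with 3 modalities measured over the same 6 genomic bins (made-up counts).
cell = [
    [0, 1, 2, 0, 0, 1],  # modality A
    [1, 0, 0, 3, 0, 0],  # modality B
    [0, 0, 1, 1, 2, 0],  # modality C
]
# One filter spanning 2 adjacent bins in every modality (made-up weights).
kernel = [[1, 1], [0.5, 0.5], [1, 0]]

features = conv1d_multichannel(cell, kernel, stride=2)
print(features)  # [1.5, 4.5, 3.0]
```

The filter has only channels x width weights (6 here) regardless of how many bins the genome is split into, whereas a fully-connected layer needs one weight per modality per bin for every hidden unit. That is one plausible reading of why the convolutional models need fewer parameters.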

2.
bioRxiv ; 2024 Feb 27.
Article in English | MEDLINE | ID: mdl-36993393

ABSTRACT

HIV-1 Vpr promotes efficient spread of HIV-1 from macrophages to T cells by transcriptionally downmodulating restriction factors that target HIV-1 Envelope protein (Env). Here we find that Vpr induces broad transcriptomic changes by targeting PU.1, a transcription factor necessary for expression of host innate immune response genes, including those that target Env. Consistent with this, we find silencing PU.1 in infected macrophages lacking Vpr rescues Env. Vpr downmodulates PU.1 through a proteasomal degradation pathway that depends on physical interactions with PU.1 and DCAF1, a component of the Cul4A E3 ubiquitin ligase. The capacity for Vpr to target PU.1 is highly conserved across primate lentiviruses. In addition to impacting infected cells, we find that Vpr suppresses expression of innate immune response genes in uninfected bystander cells, and that virion-associated Vpr can degrade PU.1. Together, we demonstrate Vpr counteracts PU.1 in macrophages to blunt antiviral immune responses and promote viral spread.

3.
Bioessays ; 46(3): e2300173, 2024 03.
Article in English | MEDLINE | ID: mdl-38161246

ABSTRACT

Endosteal stem cells are a subclass of bone marrow skeletal stem cell populations that are particularly important for rapid bone formation occurring in growth and regeneration. These stem cells are strategically located near the bone surface in a specialized microenvironment of the endosteal niche. These stem cells are abundant in young stages but are eventually depleted and replaced by other stem cell types residing in a non-endosteal perisinusoidal niche. Single-cell molecular profiling and in vivo cell lineage analyses play key roles in discovering endosteal stem cells. Importantly, endosteal stem cells can transform into bone tumor-making cells when deleterious mutations occur in tumor suppressor genes. The emerging hypothesis is that osteoblast-chondrocyte transitional identities confer a special subset of endosteal stromal cells with stem cell-like properties, which may make them susceptible to tumorigenic transformation. Endosteal stem cells are likely to represent an important therapeutic target for bone diseases caused by aberrant bone formation.


Subjects
Bone Diseases , Bone Marrow , Humans , Bone Marrow/metabolism , Osteogenesis , Osteoblasts/metabolism , Bone Diseases/metabolism , Bone Diseases/pathology , Stem Cells , Bone Marrow Cells/metabolism
4.
Nat Commun ; 14(1): 2383, 2023 04 25.
Article in English | MEDLINE | ID: mdl-37185464

ABSTRACT

The bone marrow contains various populations of skeletal stem cells (SSCs) in the stromal compartment, which are important regulators of bone formation. It is well described that leptin receptor (LepR)+ perivascular stromal cells provide a major source of bone-forming osteoblasts in adult and aged bone marrow. However, the identity of SSCs in young bone marrow and how they coordinate active bone formation remain unclear. Here we show that bone marrow endosteal SSCs are defined by fibroblast growth factor receptor 3 (Fgfr3) and osteoblast-chondrocyte transitional (OCT) identities with some characteristics of bone osteoblasts and chondrocytes. These Fgfr3-creER-marked endosteal stromal cells contribute to a stem cell fraction in young stages, which is later replaced by Lepr-cre-marked stromal cells in adult stages. Further, Fgfr3+ endosteal stromal cells give rise to aggressive osteosarcoma-like lesions upon loss of p53 tumor suppressor through unregulated self-renewal and aberrant osteogenic fates. Therefore, Fgfr3+ endosteal SSCs are abundant in young bone marrow and provide a robust source of osteoblasts, contributing to both normal and aberrant osteogenesis.


Subjects
Bone Marrow , Osteogenesis , Adult , Humans , Aged , Osteogenesis/genetics , Bone Marrow/metabolism , Bone and Bones , Osteoblasts/metabolism , Stem Cells , Carcinogenesis/genetics , Carcinogenesis/metabolism , Bone Marrow Cells/metabolism , Cell Differentiation
5.
Nat Biotechnol ; 41(3): 387-398, 2023 03.
Article in English | MEDLINE | ID: mdl-36229609

ABSTRACT

Multi-omic single-cell datasets, in which multiple molecular modalities are profiled within the same cell, offer an opportunity to understand the temporal relationship between epigenome and transcriptome. To realize this potential, we developed MultiVelo, a differential equation model of gene expression that extends the RNA velocity framework to incorporate epigenomic data. MultiVelo uses a probabilistic latent variable model to estimate the switch time and rate parameters of chromatin accessibility and gene expression and improves the accuracy of cell fate prediction compared to velocity estimates from RNA only. Application to multi-omic single-cell datasets from brain, skin and blood cells reveals two distinct classes of genes distinguished by whether chromatin closes before or after transcription ceases. We also find four types of cell states: two states in which epigenome and transcriptome are coupled and two distinct decoupled states. Finally, we identify time lags between transcription factor expression and binding site accessibility and between disease-associated SNP accessibility and expression of the linked genes. MultiVelo is available on PyPI, Bioconda and GitHub ( https://github.com/welch-lab/MultiVelo ).
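
To make the idea of a differential equation model that couples chromatin accessibility to RNA velocity concrete, here is a minimal sketch under stated assumptions. The equation forms, the parameter names (alpha_c, alpha, beta, gamma), and the rate values are illustrative assumptions and are not claimed to be MultiVelo's published equations or its API. The sketch covers only the "switch on" phase: chromatin opens, transcription scales with accessibility, and spliced RNA follows unspliced RNA.

```python
def simulate(alpha_c, alpha, beta, gamma, t_end, dt=0.01):
    """Forward-Euler integration of an assumed three-variable gene model.

    c: chromatin accessibility (0 = closed, 1 = fully open)
    u: unspliced RNA, transcribed at rate alpha * c, spliced at rate beta
    s: spliced RNA, produced by splicing, degraded at rate gamma
    Returns a list of (time, c, u, s) tuples.
    """
    c = u = s = 0.0
    steps = int(round(t_end / dt))
    traj = [(0.0, c, u, s)]
    for i in range(steps):
        dc = alpha_c * (1.0 - c)      # chromatin opens toward 1
        du = alpha * c - beta * u     # transcription gated by accessibility
        ds = beta * u - gamma * s     # splicing minus degradation
        c += dt * dc
        u += dt * du
        s += dt * ds
        traj.append(((i + 1) * dt, c, u, s))
    return traj


traj = simulate(alpha_c=1.0, alpha=2.0, beta=1.0, gamma=0.5, t_end=40.0)
t_early, c_early, u_early, s_early = traj[50]   # state at t = 0.5
t_late, c_late, u_late, s_late = traj[-1]       # state near steady state
```

In this toy model the three variables rise in order (c, then u, then s), which is the kind of time lag between epigenome and transcriptome that the abstract describes. At steady state u approaches alpha/beta and s approaches alpha/gamma.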


Subjects
Epigenome , Transcriptome , Transcriptome/genetics , Multiomics , Chromatin/genetics , RNA , Single-Cell Analysis
6.
bioRxiv ; 2023 Dec 15.
Article in English | MEDLINE | ID: mdl-38168419

ABSTRACT

Skeletal muscle, the largest human organ by weight, is relevant to several polygenic metabolic traits and diseases including type 2 diabetes (T2D). Identifying genetic mechanisms underlying these traits requires pinpointing the relevant cell types, regulatory elements, target genes, and causal variants. Here, we used genetic multiplexing to generate population-scale single nucleus (sn) chromatin accessibility (snATAC-seq) and transcriptome (snRNA-seq) maps across 287 frozen human skeletal muscle biopsies representing 456,880 nuclei. We identified 13 cell types that collectively represented 983,155 ATAC summits. We integrated genetic variation to discover 6,866 expression quantitative trait loci (eQTL) and 100,928 chromatin accessibility QTL (caQTL) (5% FDR) across the five most abundant cell types, cataloging caQTL peaks that atlas-level snATAC maps often miss. We identified 1,973 eGenes colocalized with caQTL and used mediation analyses to construct causal directional maps for chromatin accessibility and gene expression. 3,378 genome-wide association study (GWAS) signals across 43 relevant traits colocalized with sn-e/caQTL, 52% in a cell-specific manner. 77% of GWAS signals colocalized with caQTL and not eQTL, highlighting the critical importance of population-scale chromatin profiling for GWAS functional studies. GWAS-caQTL colocalization showed distinct cell-specific regulatory paradigms. For example, a C2CD4A/B T2D GWAS signal colocalized with caQTL in muscle fibers and multiple chromatin loop models nominated VPS13C, a glucose uptake gene. Sequence of the caQTL peak overlapping caSNP rs7163757 showed allelic regulatory activity differences in a human myocyte cell line massively parallel reporter assay. These results illuminate the genetic regulatory architecture of human skeletal muscle at high-resolution epigenomic, transcriptomic, and cell state scales and serve as a template for population-scale multi-omic mapping in complex tissues and traits.

7.
bioRxiv ; 2023 Dec 24.
Article in English | MEDLINE | ID: mdl-38187531

ABSTRACT

Protein structure prediction with neural networks is a powerful new method for linking protein sequence, structure, and function, but structures have generally been predicted for only a single isoform of each gene, neglecting splice variants. To investigate the structural implications of alternative splicing, we used AlphaFold2 to predict the structures of more than 11,000 human isoforms. We employed multiple metrics to identify splicing-induced structural alterations, including template matching score, secondary structure composition, surface charge distribution, radius of gyration, accessibility of post-translational modification sites, and structure-based function prediction. We identified examples of how alternative splicing induced clear changes in each of these properties. Structural similarity between isoforms largely correlated with degree of sequence identity, but we identified a subset of isoforms with low structural similarity despite high sequence similarity. Exon skipping and alternative last exons tended to increase the surface charge and radius of gyration. Splicing also buried or exposed numerous post-translational modification sites, most notably among the isoforms of BAX. Functional prediction nominated numerous functional differences among isoforms of the same gene, with loss of function compared to the reference predominating. Finally, we used single-cell RNA-seq data from the Tabula Sapiens to determine the cell types in which each structure is expressed. Our work represents an important resource for studying the structure and function of splice isoforms across the cell types of the human body.

8.
Nat Commun ; 13(1): 7319, 2022 11 28.
Article in English | MEDLINE | ID: mdl-36443296

ABSTRACT

In endochondral bone development, bone-forming osteoblasts and bone marrow stromal cells have dual origins in the fetal cartilage and its surrounding perichondrium. However, how early perichondrial cells distinctively contribute to developing bones remains unidentified. Here we show using in vivo cell-lineage analyses that Dlx5+ fetal perichondrial cells marked by Dlx5-creER do not generate cartilage but sustainably contribute to cortical bone and marrow stromal compartments in a manner complementary to fetal chondrocyte derivatives under the regulation of Hedgehog signaling. Postnatally, Dlx5+ fetal perichondrial cell derivatives preferentially populate the diaphyseal marrow stroma with a dormant adipocyte-biased state and are refractory to parathyroid hormone-induced bone anabolism. Therefore, early perichondrial cells of the fetal cartilage are destined to become an adipogenic subset of stromal cells in postnatal diaphyseal bone marrow, supporting the theory that the adult bone marrow stromal compartments are developmentally prescribed within the two distinct cells of origin of the fetal bone anlage.


Subjects
Cartilage , Hedgehog Proteins , Adult , Humans , Bone and Bones , Bone Development , Chondrocytes
9.
Bioinformatics ; 38(10): 2946-2948, 2022 05 13.
Article in English | MEDLINE | ID: mdl-35561174

ABSTRACT

MOTIVATION: LIGER (Linked Inference of Genomic Experimental Relationships) is a widely used R package for single-cell multi-omic data integration. However, many users prefer to analyze their single-cell datasets in Python, which offers an attractive syntax and highly optimized scientific computing libraries for increased efficiency. RESULTS: We developed PyLiger, a Python package for integrating single-cell multi-omic datasets. PyLiger offers faster performance than the previous R implementation (2-5× speedup), interoperability with AnnData format, flexible on-disk or in-memory analysis capability and new functionality for gene ontology enrichment analysis. The on-disk capability enables analysis of arbitrarily large single-cell datasets using fixed memory. AVAILABILITY AND IMPLEMENTATION: PyLiger is available on GitHub at https://github.com/welch-lab/pyliger and on the Python Package Index. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genomics , Software , Gene Ontology , Genome
10.
Nat Commun ; 13(1): 780, 2022 02 09.
Article in English | MEDLINE | ID: mdl-35140223

ABSTRACT

Single-cell genomic technologies provide an unprecedented opportunity to define molecular cell types in a data-driven fashion, but present unique data integration challenges. Many analyses require "mosaic integration", including both features shared across datasets and features exclusive to a single experiment. Previous computational integration approaches require that the input matrices share the same number of either genes or cells, and thus can use only shared features. To address this limitation, we derive a nonnegative matrix factorization algorithm for integrating single-cell datasets containing both shared and unshared features. The key advance is incorporating an additional metagene matrix that allows unshared features to inform the factorization. We demonstrate that incorporating unshared features significantly improves integration of single-cell RNA-seq, spatial transcriptomic, SNARE-seq, and cross-species datasets. We have incorporated the UINMF algorithm into the open-source LIGER R package ( https://github.com/welch-lab/liger ).


Subjects
Algorithms , Computational Biology , Single-Cell Analysis , Databases, Factual , Genomics , RNA-Seq , Software , Transcriptome , Exome Sequencing
11.
Front Dent Med ; 2, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34966906

ABSTRACT

The periodontium is essential for supporting the functionality of the tooth and is composed of a diversity of mineralized and non-mineralized tissues such as the cementum, the periodontal ligament (PDL) and the alveolar bone. The periodontium is developmentally derived from the dental follicle (DF), a fibrous tissue surrounding the developing tooth bud. We previously showed through in vivo lineage-tracing experiments that DF contains mesenchymal progenitor cells expressing parathyroid hormone-related protein (PTHrP), which give rise to cells forming the periodontal attachment apparatus in a manner regulated by autocrine signaling through the PTH/PTHrP receptor. However, the developmental relationships between PTHrP+ DF cells and diverse cell populations constituting the periodontium remain undefined. Here, we performed single-cell RNA-sequencing (scRNA-seq) analyses of cells in the periodontium by integrating the two datasets, i.e., PTHrP-mCherry+ DF cells at P6 and 2.3kb Col1a1 promoter-driven GFP+ periodontal cells at P25 that include descendants of PTHrP+ DF cells, cementoblasts, osteoblasts and periodontal ligament cells. This integrative scRNA-seq analysis revealed heterogeneity of cells of the periodontium and their cell type-specific markers, as well as their relationships with DF cells. Most importantly, our analysis identified a cementoblast-specific metagene that discriminates cementoblasts from alveolar bone osteoblasts, including Pthlh (encoding PTHrP) and Tubb3. RNA velocity analysis indicated that cementoblasts were directly derived from PTHrP+ DF cells in the early developmental stage and did not interconvert with other cell types. Further, CellPhoneDB cell-cell communication analysis indicated that PTHrP derived from cementoblasts acts on a diversity of cells in the periodontium in an autocrine and paracrine manner. Collectively, our findings provide insights into the lineage hierarchy and intercellular interactions of cells in the periodontium at a single-cell level, helping to elucidate the cellular and molecular basis of periodontal tissue formation.

12.
Genome Biol ; 22(1): 298, 2021 10 27.
Article in English | MEDLINE | ID: mdl-34706748

ABSTRACT

We present SquiggleNet, the first deep-learning model that can classify nanopore reads directly from their electrical signals. SquiggleNet operates faster than DNA passes through the pore, allowing real-time classification and read ejection. Using 1 s of sequencing data, the classifier achieves significantly higher accuracy than base calling followed by sequence alignment. Our approach is also faster and requires an order of magnitude less memory than alignment-based approaches. SquiggleNet distinguished human from bacterial DNA with over 90% accuracy, generalized to unseen bacterial species in a human respiratory metagenome sample, and accurately classified sequences containing human long interspersed repeat elements.


Subjects
Deep Learning , Nanopore Sequencing/methods , DNA, Bacterial/analysis , Humans , Long Interspersed Nucleotide Elements , Metagenome , Respiratory System/microbiology
13.
EMBO Rep ; 22(11): e52901, 2021 11 04.
Article in English | MEDLINE | ID: mdl-34523214

ABSTRACT

Cardiac regeneration occurs primarily through proliferation of existing cardiomyocytes, but also involves complex interactions between distinct cardiac cell types including non-cardiomyocytes (non-CMs). However, the subpopulations, distinguishing molecular features, cellular functions, and intercellular interactions of non-CMs in heart regeneration remain largely unexplored. Using the LIGER algorithm, we assemble an atlas of cell states from 61,977 individual non-CM scRNA-seq profiles isolated at multiple time points during regeneration. This analysis reveals extensive non-CM cell diversity, including multiple macrophage (MC), fibroblast (FB), and endothelial cell (EC) subpopulations with unique spatiotemporal distributions, and suggests an important role for MC in inducing the activated FB and EC subpopulations. Indeed, pharmacological perturbation of MC function compromises the induction of the unique FB and EC subpopulations. Furthermore, we developed a computational algorithm, Topologizer, to map the topological relationships and dynamic transitions between functional states. We uncover dynamic transitions between MC functional states and identify factors involved in mRNA processing and transcriptional regulation associated with the transition. Together, our single-cell transcriptomic analysis of non-CMs during cardiac regeneration provides a blueprint for interrogating the molecular and cellular basis of this process.


Subjects
Myocytes, Cardiac , Zebrafish , Animals , Cell Proliferation/genetics , Endothelial Cells/metabolism , Fibroblasts/metabolism , Heart/physiology , Myocytes, Cardiac/metabolism , Zebrafish/metabolism , Zebrafish Proteins/metabolism
14.
J Clin Invest ; 131(21), 2021 11 01.
Article in English | MEDLINE | ID: mdl-34546975

ABSTRACT

In this study, we demonstrate that forkhead box F1 (FOXF1), a mesenchymal transcription factor essential for lung development, was retained in a topographically distinct mesenchymal stromal cell population along the bronchovascular space in an adult lung and identify this distinct subset of collagen-expressing cells as key players in lung allograft remodeling and fibrosis. Using Foxf1-tdTomato BAC (Foxf1-tdTomato) and Foxf1-tdTomato Col1a1-GFP mice, we show that Lin-Foxf1+ cells encompassed the stem cell antigen 1+CD34+ (Sca1+CD34+) subset of collagen 1-expressing mesenchymal cells (MCs) with a capacity to generate CFU and lung epithelial organoids. Histologically, FOXF1-expressing MCs formed a 3D network along the conducting airways; FOXF1 was noted to be conspicuously absent in MCs in the alveolar compartment. Bulk and single-cell RNA-Seq confirmed distinct transcriptional signatures of Foxf1+ and Foxf1- MCs, with Foxf1-expressing cells delineated by their high expression of the transcription factor glioma-associated oncogene 1 (Gli1) and low expression of integrin α8 (Itga8), versus other collagen-expressing MCs. FOXF1+Gli1+ MCs showed proximity to Sonic hedgehog-expressing (Shh-expressing) bronchial epithelium, and mesenchymal expression of Foxf1 and Gli1 was found to be dependent on paracrine Shh signaling in epithelial organoids. Using a murine lung transplant model, we show dysregulation of epithelial-mesenchymal SHH/GLI1/FOXF1 crosstalk and expansion of this specific peribronchial MC population in chronically rejecting fibrotic lung allografts.


Subjects
Forkhead Transcription Factors/metabolism , Graft Rejection/metabolism , Lung Transplantation , Mesenchymal Stem Cells/metabolism , Pulmonary Alveoli/metabolism , Pulmonary Fibrosis/metabolism , Allografts , Animals , Chronic Disease , Forkhead Transcription Factors/genetics , Graft Rejection/genetics , Graft Rejection/pathology , Mesenchymal Stem Cells/pathology , Mice , Mice, Transgenic , Pulmonary Alveoli/pathology , Pulmonary Fibrosis/etiology , Pulmonary Fibrosis/genetics , Pulmonary Fibrosis/pathology
17.
Genome Biol ; 22(1): 158, 2021 05 20.
Article in English | MEDLINE | ID: mdl-34016135

ABSTRACT

Deep generative models such as variational autoencoders (VAEs) and generative adversarial networks (GANs) generate and manipulate high-dimensional images. We systematically assess the complementary strengths and weaknesses of these models on single-cell gene expression data. We also develop MichiGAN, a novel neural network that combines the strengths of VAEs and GANs to sample from disentangled representations without sacrificing data generation quality. We learn disentangled representations of three large single-cell RNA-seq datasets and use MichiGAN to sample from these representations. MichiGAN allows us to manipulate semantically distinct aspects of cellular identity and predict single-cell gene expression response to drug treatment.


Subjects
Algorithms , Neural Networks, Computer , Single-Cell Analysis , Computer Simulation , Gene Expression Profiling , Gene Expression Regulation , Humans , RNA-Seq , Statistics, Nonparametric
18.
Nat Biotechnol ; 39(8): 1000-1007, 2021 08.
Article in English | MEDLINE | ID: mdl-33875866

ABSTRACT

Integrating large single-cell gene expression, chromatin accessibility and DNA methylation datasets requires general and scalable computational approaches. Here we describe online integrative non-negative matrix factorization (iNMF), an algorithm for integrating large, diverse and continually arriving single-cell datasets. Our approach scales to arbitrarily large numbers of cells using fixed memory, iteratively incorporates new datasets as they are generated and allows many users to simultaneously analyze a single copy of a large dataset by streaming it over the internet. Iterative data addition can also be used to map new data to a reference dataset. Comparisons with previous methods indicate that the improvements in efficiency do not sacrifice dataset alignment and cluster preservation performance. We demonstrate the effectiveness of online iNMF by integrating more than 1 million cells on a standard laptop, integrating large single-cell RNA sequencing and spatial transcriptomic datasets, and iteratively constructing a single-cell multi-omic atlas of the mouse motor cortex.
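
The claim that an online factorization can scale to arbitrarily many cells in fixed memory can be illustrated with a deliberately simplified sketch. This is not the online iNMF algorithm: it is a rank-1 nonnegative factorization of a single dataset, and the function name and toy data are hypothetical. What it shows is the general online-learning pattern in which each mini-batch updates small running summaries (here `a` and `b`) and is then discarded.

```python
def online_rank1_nmf(batches, n_features):
    """Rank-1 nonnegative factorization learned from a stream of mini-batches.

    batches: any iterable (for example a generator reading from disk) that
             yields lists of cells, each cell a list of n_features
             nonnegative values
    Memory use depends on n_features only, not on the number of cells seen.
    """
    w = [1.0] * n_features   # nonnegative factor ("metagene")
    a = 0.0                  # running sum of h * h
    b = [0.0] * n_features   # running sum of h * x
    for batch in batches:
        for x in batch:
            ww = sum(wi * wi for wi in w)
            if ww == 0.0:
                continue     # degenerate factor, skip this cell
            # Nonnegative least-squares loading of this cell on w.
            h = max(0.0, sum(wi * xi for wi, xi in zip(w, x)) / ww)
            a += h * h
            for j in range(n_features):
                b[j] += h * x[j]
        if a > 0.0:
            # Closed-form nonnegative update of w from the running summaries.
            w = [max(0.0, bj / a) for bj in b]
    return w


def stream():
    """Toy stream: every cell is a multiple of the profile [1, 2, 3]."""
    yield [[1, 2, 3], [2, 4, 6]]
    yield [[3, 6, 9]]


w = online_rank1_nmf(stream(), n_features=3)
print(w)  # [0.5, 1.0, 1.5], proportional to [1, 2, 3]
```

Because only `w`, `a`, and `b` persist between batches, new cells can keep arriving without the earlier data being reloaded. This is the property the abstract relies on for continually arriving datasets.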


Subjects
Algorithms , Computational Biology/methods , Machine Learning , Single-Cell Analysis/methods , Transcriptome/genetics , Animals , Mice , Multivariate Analysis
19.
J Bone Miner Res ; 36(6): 1145-1158, 2021 06.
Article in English | MEDLINE | ID: mdl-33651379

ABSTRACT

Bone marrow houses a multifunctional stromal cell population expressing C-X-C motif chemokine ligand 12 (CXCL12), termed CXCL12-abundant reticular (CAR) cells, that regulates osteogenesis and adipogenesis. The quiescent pre-adipocyte-like subset of CXCL12+ stromal cells ("Adipo-CAR" cells) is localized to sinusoidal surfaces and particularly enriched for hematopoiesis-supporting cytokines. However, detailed characteristics of these CXCL12+ pre-adipocyte-like stromal cells and how they contribute to marrow adipogenesis remain largely unknown. Here we highlight CXCL12-dependent physical coupling with hematopoietic cells as a potential mechanism regulating the adipogenic potential of CXCL12+ stromal cells. Single-cell computational analyses of RNA velocity and cell signaling reveal that Adipo-CAR cells exuberantly communicate with hematopoietic cells through CXCL12-CXCR4 ligand-receptor interactions but do not interconvert with Osteo-CAR cells. Consistent with this computational prediction, a substantial fraction of Cxcl12-creER+ pre-adipocyte-like cells intertwines with hematopoietic cells in vivo and in single-cell preparation in a protease-sensitive manner. Deletion of CXCL12 in these cells using Col2a1-cre leads to a reduction of stromal-hematopoietic coupling and extensive marrow adipogenesis in adult bone marrow, which appears to involve direct conversion of CXCL12+ cells to lipid-laden marrow adipocytes without altering mesenchymal progenitor cell fates. Therefore, these findings suggest that CXCL12+ pre-adipocyte-like marrow stromal cells prevent their premature differentiation by maintaining physical coupling with hematopoietic cells in a CXCL12-dependent manner, highlighting a possible cell-non-autonomous mechanism that regulates marrow adipogenesis. © 2021 American Society for Bone and Mineral Research (ASBMR).


Subjects
Adipogenesis , Mesenchymal Stem Cells , Animals , Bone Marrow , Bone Marrow Cells , Cell Differentiation , Chemokine CXCL12 , Hematopoietic Stem Cells , Mice , Stromal Cells
20.
Nat Protoc ; 15(11): 3632-3662, 2020 11.
Article in English | MEDLINE | ID: mdl-33046898

ABSTRACT

High-throughput single-cell sequencing technologies hold tremendous potential for defining cell types in an unbiased fashion using gene expression and epigenomic state. A key challenge in realizing this potential is integrating single-cell datasets from multiple protocols, biological contexts, and data modalities into a joint definition of cellular identity. We previously developed an approach, called linked inference of genomic experimental relationships (LIGER), that uses integrative nonnegative matrix factorization to address this challenge. Here, we provide a step-by-step protocol for using LIGER to jointly define cell types from multiple single-cell datasets. The main stages of the protocol are data preprocessing and normalization, joint factorization, quantile normalization and joint clustering, and visualization. We describe how to jointly define cell types from single-cell RNA-seq (scRNA-seq) and single-nucleus ATAC-seq (snATAC-seq) data, but similar steps apply across a wide range of other settings and data types, including cross-species analysis, single-nucleus DNA methylation, and spatial transcriptomics. Our protocol contains examples of expected results, describes common pitfalls, and relies only on our freely available, open-source R implementation of LIGER. We also provide R Markdown tutorials showing the outputs from each individual code segment. The analysis process can be performed in 1-4 h, depending on dataset size, and assumes no specialized bioinformatics training.
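
Of the protocol stages listed above, quantile normalization is the easiest to show in isolation. The sketch below is a generic one-dimensional version written in Python for illustration; it is not the LIGER R implementation, the function name is hypothetical, and how the protocol groups cells and factors before normalizing is not shown here. It maps each value in one dataset onto the value at the same quantile in a reference dataset.

```python
def quantile_map(values, reference):
    """Map `values` onto the empirical quantiles of `reference`.

    The smallest input value receives the smallest reference value, the
    largest receives the largest, and intermediate ranks are linearly
    interpolated, so the two datasets may differ in size.
    """
    ref_sorted = sorted(reference)
    n, m = len(values), len(ref_sorted)
    order = sorted(range(n), key=lambda i: values[i])   # indices by rank
    out = [0.0] * n
    for rank, idx in enumerate(order):
        q = rank / (n - 1) if n > 1 else 0.0            # quantile in [0, 1]
        pos = q * (m - 1)                               # position in reference
        lo = int(pos)
        hi = min(lo + 1, m - 1)
        frac = pos - lo
        out[idx] = ref_sorted[lo] * (1 - frac) + ref_sorted[hi] * frac
    return out


# Loadings from one dataset rescaled to match a reference dataset's distribution.
aligned = quantile_map([10, 30, 20], reference=[1, 2, 3])
print(aligned)  # [1.0, 3.0, 2.0]
```

After this step the rank order of cells within the dataset is unchanged, but their values follow the reference distribution, which is what makes loadings from different datasets directly comparable for joint clustering.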


Subjects
Genomics/methods , Single-Cell Analysis/methods , Animals , Cell Nucleus/genetics , Cluster Analysis , DNA Methylation , Gene Expression Profiling/methods , Humans , Sequence Analysis, RNA/methods , Software , Workflow