Results 1 - 7 of 7
1.
BMC Bioinformatics ; 25(1): 198, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38789920

ABSTRACT

BACKGROUND: Single-cell transcriptome sequencing (scRNA-Seq) has allowed new types of investigations at unprecedented levels of resolution. Among the primary goals of scRNA-Seq is the classification of cells into distinct types. Many approaches build on existing clustering literature to develop tools specific to single-cell data. However, almost all of these methods rely on heuristics or user-supplied parameters to control the number of clusters. This affects both the resolution of the clusters within the original dataset and their replicability across datasets. While many recommendations exist, in general, there is little assurance that any given set of parameters will represent an optimal choice in the trade-off between cluster resolution and replicability. For instance, another set of parameters may result in more clusters that are also more replicable. RESULTS: Here, we propose Dune, a new method for optimizing the trade-off between the resolution of the clusters and their replicability. Our method takes as input a set of clustering results, or partitions, on a single dataset and iteratively merges clusters within each partition in order to maximize their concordance between partitions. As demonstrated on multiple datasets from different platforms, Dune outperforms existing techniques that rely on hierarchical merging to reduce the number of clusters, in terms of replicability of the resultant merged clusters as well as concordance with ground truth. Dune is available as an R package on Bioconductor: https://www.bioconductor.org/packages/release/bioc/html/Dune.html. CONCLUSIONS: Cluster refinement by Dune helps improve the robustness of any clustering analysis and reduces the reliance on tuning parameters. This method provides an objective approach for borrowing information across multiple clusterings to generate replicable clusters most likely to represent common biological features across multiple datasets.
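
The merging idea described in this abstract can be illustrated with a small sketch. It is not the Dune package's implementation: it assumes that concordance between partitions is measured with the adjusted Rand index (ARI), which the abstract does not state, and the function names `adjusted_rand_index`, `best_merge`, and `refine_partitions` are hypothetical. At each step it greedily applies the single within-partition merge that most increases the mean pairwise ARI, and stops when no merge helps.

```python
from collections import Counter
from itertools import combinations
from math import comb


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two label lists of equal length."""
    total = comb(len(a), 2)
    if total == 0:
        return 1.0
    sum_ij = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / total
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # e.g. both partitions are a single cluster
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


def mean_ari(partitions):
    """Mean ARI over all pairs of partitions (the assumed concordance score)."""
    scores = [adjusted_rand_index(p, q) for p, q in combinations(partitions, 2)]
    return sum(scores) / len(scores)


def best_merge(partitions, tol=1e-12):
    """Return (gain, partition index, merged labels) for the best single merge."""
    base = mean_ari(partitions)
    best = (0.0, None, None)
    for i, part in enumerate(partitions):
        for c1, c2 in combinations(sorted(set(part)), 2):
            merged = [c1 if x == c2 else x for x in part]  # fold c2 into c1
            trial = partitions[:i] + [merged] + partitions[i + 1:]
            gain = mean_ari(trial) - base
            if gain > best[0] + tol:
                best = (gain, i, merged)
    return best


def refine_partitions(partitions):
    """Greedily merge clusters until no merge raises the mean ARI."""
    parts = [list(p) for p in partitions]
    while True:
        gain, i, merged = best_merge(parts)
        if i is None:
            return parts
        parts[i] = merged


if __name__ == "__main__":
    coarse = [0, 0, 0, 1, 1, 1]
    fine = [0, 0, 0, 1, 1, 2]  # over-splits the second group
    print(refine_partitions([coarse, fine]))
```

In this toy input the second partition splits one group into two clusters. Merging those two clusters is the only move that raises the ARI, so the procedure stops once both partitions agree.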


Subjects
RNA-Seq , Single-Cell Analysis , Software , Single-Cell Analysis/methods , RNA-Seq/methods , Cluster Analysis , Algorithms , Sequence Analysis, RNA/methods , Humans , Transcriptome/genetics , Reproducibility of Results , Gene Expression Profiling/methods , Single-Cell Gene Expression Analysis
2.
Bioinformatics ; 38(Suppl 1): i36-i44, 2022 06 24.
Article in English | MEDLINE | ID: mdl-35758804

ABSTRACT

MOTIVATION: Genome-wide association studies (GWAS), aiming to find genetic variants associated with a trait, have been widely used on bacteria to identify genetic determinants of drug resistance or hypervirulence. Recent bacterial GWAS methods usually rely on k-mers, whose presence in a genome can denote variants ranging from single-nucleotide polymorphisms to mobile genetic elements. This approach does not require a reference genome, making it easier to account for accessory genes. However, the same gene can exist in slightly different versions across different strains, leading to diluted effects. RESULTS: Here, we overcome this issue by testing covariates built from closed connected subgraphs (CCSs) of the de Bruijn graph defined over genomic k-mers. These covariates capture polymorphic genes as a single entity, improving k-mer-based GWAS both in terms of power and interpretability. However, a method naively testing all possible subgraphs would be powerless due to multiple testing corrections, and the mere exploration of these subgraphs would quickly become computationally intractable. The concept of testable hypotheses has successfully been used to address both problems in similar contexts. We leverage this concept to test all CCSs by proposing a novel enumeration scheme for these objects which fully exploits the pruning opportunity offered by testability, resulting in drastic improvements in computational efficiency. Our method integrates with existing visual tools to facilitate interpretation. AVAILABILITY AND IMPLEMENTATION: We provide an implementation of our method, as well as code to reproduce all results at https://github.com/HectorRDB/Caldera_ISMB. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
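
To make the objects in this abstract concrete, the sketch below builds a small de Bruijn graph over the k-mers of a few genomes and records which genomes contain each k-mer. It does not enumerate closed connected subgraphs or perform any testing. The function names are hypothetical, and the subgraph covariate shown (a genome scores 1 if it contains any k-mer of the subgraph) is an assumed encoding used only to illustrate how two versions of a gene can be pooled into one variable.

```python
from collections import defaultdict


def kmers(seq, k):
    """All overlapping substrings of length k, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_de_bruijn(genomes, k):
    """Build k-mer presence sets and edges between consecutive k-mers.

    genomes: dict mapping genome name -> sequence string.
    Returns (presence, edges):
      presence[kmer] = set of genome names containing that k-mer
      edges[kmer]    = set of k-mers that directly follow it in some genome
    """
    presence = defaultdict(set)
    edges = defaultdict(set)
    for name, seq in genomes.items():
        ks = kmers(seq, k)
        for km in ks:
            presence[km].add(name)
        # Consecutive k-mers overlap by k-1 characters by construction.
        for a, b in zip(ks, ks[1:]):
            edges[a].add(b)
    return presence, edges


def subgraph_covariate(presence, nodes, genome_names):
    """Assumed encoding: 1 if the genome holds any k-mer in `nodes`, else 0."""
    return [int(any(g in presence[n] for n in nodes)) for g in genome_names]


if __name__ == "__main__":
    genomes = {"g1": "ACGTA", "g2": "ACGGA"}  # two variants of one locus
    presence, edges = build_de_bruijn(genomes, k=3)
    print(sorted(edges["ACG"]))
    print(subgraph_covariate(presence, {"CGT"}, ["g1", "g2"]))
    print(subgraph_covariate(presence, {"CGT", "CGG"}, ["g1", "g2"]))
```

Testing the single k-mer `CGT` sees only one of the two genomes. Pooling both variant k-mers gives a covariate present in both, which is the dilution problem the abstract describes.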


Subjects
Genome-Wide Association Study , Software , Algorithms , Bacteria/genetics , Sequence Analysis, DNA/methods
3.
Nat Commun ; 15(1): 833, 2024 Jan 27.
Article in English | MEDLINE | ID: mdl-38280860

ABSTRACT

In single-cell RNA sequencing (scRNA-Seq), gene expression is assessed individually for each cell, allowing the investigation of developmental processes, such as embryogenesis and cellular differentiation and regeneration, at unprecedented resolution. In such dynamic biological systems, cellular states form a continuum, e.g., for the differentiation of stem cells into mature cell types. This process is often represented via a trajectory in a reduced-dimensional representation of the scRNA-Seq dataset. While many methods have been suggested for trajectory inference, it is often unclear how to handle multiple biological groups or conditions, e.g., inferring and comparing the differentiation trajectories of wild-type and knock-out stem cell populations. In this manuscript, we present condiments, a method for the inference and downstream interpretation of cell trajectories across multiple conditions. Our framework allows the interpretation of differences between conditions at the trajectory, cell population, and gene expression levels. We start by integrating datasets from multiple conditions into a single trajectory. By comparing the cells' conditions along the trajectory's path, we can detect large-scale changes, indicative of differential progression or fate selection. We also demonstrate how to detect subtler changes by finding genes that exhibit different behaviors between these conditions along a differentiation path.


Subjects
Single-Cell Analysis , Stem Cells , Single-Cell Analysis/methods , Cell Differentiation/genetics , Embryonic Development , Sequence Analysis, RNA/methods , Condiments , Gene Expression Profiling/methods
4.
Cell Rep Methods ; 2(11): 100321, 2022 11 21.
Article in English | MEDLINE | ID: mdl-36452861

ABSTRACT

The assay for transposase-accessible chromatin using sequencing (ATAC-seq) allows the study of epigenetic regulation of gene expression by assessing chromatin configuration for an entire genome. Despite its popularity, there have been limited studies investigating the analytical challenges related to ATAC-seq data, with most studies leveraging tools developed for bulk transcriptome sequencing. Here, we show that GC-content effects are omnipresent in ATAC-seq datasets. Since the GC-content effects are sample specific, they can bias downstream analyses such as clustering and differential accessibility analysis. We introduce a normalization method based on smooth-quantile normalization within GC-content bins and evaluate it together with 11 different normalization procedures on 8 public ATAC-seq datasets. Accounting for GC-content effects in the normalization is crucial for common downstream ATAC-seq data analyses, improving accuracy and interpretability. Through case studies, we show that exploratory data analysis is essential to guide the choice of an appropriate normalization method for a given dataset.
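
The idea of normalizing within GC-content bins can be sketched as follows. This example substitutes ordinary full-quantile normalization for the smooth-quantile variant named in the abstract, so it is a simplified stand-in rather than the authors' method. The function names, the equal-size binning, and the toy input are all assumptions made for illustration.

```python
def quantile_normalize(columns):
    """Full-quantile normalize equal-length columns (one list per sample).

    Each value is replaced by the across-sample mean of the values holding
    the same rank. Ties are broken by position, which is fine for a sketch.
    """
    n = len(columns[0])
    sorted_cols = [sorted(c) for c in columns]
    reference = [sum(sc[r] for sc in sorted_cols) / len(columns) for r in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized


def gc_bin_quantile_normalize(counts, gc, n_bins):
    """Quantile-normalize samples separately within bins of similar GC content.

    counts: dict mapping sample name -> list of counts per region.
    gc:     list of GC fractions, one per region.
    n_bins: number of (roughly) equal-size GC bins; an assumed binning scheme.
    """
    n = len(gc)
    by_gc = sorted(range(n), key=lambda i: gc[i])
    bin_size = -(-n // n_bins)  # ceiling division
    samples = list(counts)
    result = {s: [0.0] * n for s in samples}
    for start in range(0, n, bin_size):
        idx = by_gc[start:start + bin_size]
        cols = [[counts[s][i] for i in idx] for s in samples]
        for s, col in zip(samples, quantile_normalize(cols)):
            for i, value in zip(idx, col):
                result[s][i] = value
    return result


if __name__ == "__main__":
    gc = [0.30, 0.35, 0.60, 0.65]
    counts = {"a": [1, 2, 10, 20], "b": [3, 4, 30, 40]}
    print(gc_bin_quantile_normalize(counts, gc, n_bins=2))
```

Because each bin is normalized on its own, a sample whose counts are inflated only in GC-rich regions is corrected there without distorting its low-GC regions.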


Subjects
Benchmarking , Chromatin Immunoprecipitation Sequencing , Epigenesis, Genetic , Sequence Analysis, DNA/methods , High-Throughput Nucleotide Sequencing
5.
JMIR Form Res ; 5(3): e20175, 2021 Mar 04.
Article in English | MEDLINE | ID: mdl-33661120

ABSTRACT

BACKGROUND: Novel wearable biosensors, ubiquitous smartphone ownership, and telemedicine are converging to enable new paradigms of clinical research. A new generation of continuous glucose monitoring (CGM) devices provides access to clinical-grade measurement of interstitial glucose levels. Adoption of these sensors has become widespread for the management of type 1 diabetes and is accelerating in type 2 diabetes. In parallel, individuals are adopting health-related smartphone-based apps to monitor and manage care. OBJECTIVE: We conducted a proof-of-concept study to investigate the potential of collecting robust, annotated, real-time clinical study measures of glucose levels without clinic visits. METHODS: Self-administered meal-tolerance tests were conducted to assess the impact of a proprietary synbiotic medical food on glucose control in a 6-week, double-blind, placebo-controlled, 2×2 cross-over pilot study (n=6). The primary endpoint was incremental glucose measured using Abbott Freestyle Libre CGM devices associated with a smartphone app that provided a visual diet log. RESULTS: All subjects completed the study and mastered CGM device usage. Over 40 days, 3000 data points on average per subject were collected across three sensors. No adverse events were recorded, and subjects reported general satisfaction with sensor management, the study product, and the smartphone app, with an average self-reported satisfaction score of 8.25/10. Despite a lack of sufficient power to achieve statistical significance, we demonstrated that we can detect meaningful changes in the postprandial glucose response in real-world settings, pointing to the merits of larger studies in the future. CONCLUSIONS: We have shown that CGM devices can provide a comprehensive picture of glucose control without clinic visits. CGM device usage in conjunction with our custom smartphone app can lower the participation burden for subjects while reducing study costs, and allows for robust integration of multiple valuable data types with glucose levels remotely. TRIAL REGISTRATION: ClinicalTrials.gov NCT04424888; http://clinicaltrials.gov/ct2/show/NCT04424888.
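
The abstract does not define how incremental glucose was computed. One common summary of a postprandial response is the incremental area under the curve (iAUC), sketched here under that assumption. The function name, the use of the first reading as baseline, and the clipping of values below baseline to zero are all simplifying choices of this example, not details from the study.

```python
def incremental_auc(times_min, glucose, baseline=None):
    """Trapezoidal area of glucose above baseline (units: mg/dL x min).

    times_min: measurement times in minutes, increasing.
    glucose:   glucose readings at those times.
    baseline:  reference level; defaults to the first reading (assumption).
    Readings below baseline are clipped to zero, a simplification that
    ignores the exact point where the curve crosses the baseline.
    """
    if baseline is None:
        baseline = glucose[0]
    above = [max(g - baseline, 0.0) for g in glucose]
    return sum(
        (t1 - t0) * (y0 + y1) / 2
        for t0, t1, y0, y1 in zip(times_min, times_min[1:], above, above[1:])
    )


if __name__ == "__main__":
    # Hypothetical post-meal readings taken every 15 minutes.
    print(incremental_auc([0, 15, 30], [100, 130, 100]))
```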

6.
Cell Rep ; 37(6): 109982, 2021 11 09.
Article in English | MEDLINE | ID: mdl-34758315

ABSTRACT

Early blastomeres of mouse preimplantation embryos exhibit bi-potential cell fate, capable of generating both embryonic and extra-embryonic lineages in blastocysts. Here we identify three major two-cell-stage (2C)-specific endogenous retroviruses (ERVs) as the molecular hallmark of this bi-potential plasticity. Using the long terminal repeats (LTRs) of all three 2C-specific ERVs, we identify Krüppel-like factor 5 (Klf5) as their major upstream regulator. Klf5 is essential for bi-potential cell fate; a single Klf5-overexpressing embryonic stem cell (ESC) generates terminally differentiated embryonic and extra-embryonic lineages in chimeric embryos, and Klf5 directly induces inner cell mass (ICM) and trophectoderm (TE) specification genes. Intriguingly, Klf5 and Klf4 act redundantly during ICM specification, whereas Klf5 deficiency alone impairs TE specification. Klf5 is regulated by multiple 2C-specific transcription factors, particularly Dux, and the Dux/Klf5 axis is evolutionarily conserved. The 2C-specific transcription program converges on Klf5 to establish bi-potential cell fate, enabling a cell state with dual activation of ICM and TE genes.


Subjects
Blastocyst Inner Cell Mass/cytology , Blastocyst , Cell Lineage , Embryonic Stem Cells/cytology , Gene Expression Regulation, Developmental , Kruppel-Like Transcription Factors/metabolism , Trophoblasts/cytology , Animals , Blastocyst Inner Cell Mass/metabolism , Cell Differentiation , Embryonic Stem Cells/metabolism , Female , Kruppel-Like Transcription Factors/genetics , Male , Mice , Mice, Inbred C3H , Mice, Inbred C57BL , RNA-Seq , Transcription Factors/genetics , Transcription Factors/metabolism , Trophoblasts/metabolism
7.
Nat Commun ; 11(1): 1201, 2020 03 05.
Article in English | MEDLINE | ID: mdl-32139671

ABSTRACT

Trajectory inference has radically enhanced single-cell RNA-seq research by enabling the study of dynamic changes in gene expression. Downstream of trajectory inference, it is vital to discover genes that are (i) associated with the lineages in the trajectory, or (ii) differentially expressed between lineages, to illuminate the underlying biological processes. Current data analysis procedures, however, either fail to exploit the continuous resolution provided by trajectory inference, or fail to pinpoint the exact types of differential expression. We introduce tradeSeq, a powerful generalized additive model framework based on the negative binomial distribution that allows flexible inference of both within-lineage and between-lineage differential expression. By incorporating observation-level weights, the model can additionally account for zero inflation. We evaluate the method on simulated datasets and on real datasets from droplet-based and full-length protocols, and show that it yields biological insights through a clear interpretation of the data.
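
A negative binomial generalized additive model over lineages, as described in this abstract, could be written along the following lines. The notation is assumed here for illustration and is not taken from the abstract: $Y_{gi}$ is the count of gene $g$ in cell $i$, $T_{li}$ the pseudotime of cell $i$ along lineage $l$, $Z_{li}$ an indicator that cell $i$ belongs to lineage $l$, $s_{gl}$ a smooth function per gene and lineage, $U_i$ optional cell-level covariates, and $N_i$ a sequencing-depth offset.

```latex
\begin{aligned}
Y_{gi} &\sim \mathrm{NB}(\mu_{gi}, \phi_g), \\
\log(\mu_{gi}) &= \eta_{gi}, \\
\eta_{gi} &= \sum_{l=1}^{L} s_{gl}(T_{li})\, Z_{li} + U_i \alpha_g + \log(N_i).
\end{aligned}
```

Under this form, within-lineage differential expression asks whether a single smoother $s_{gl}$ varies along pseudotime, and between-lineage differential expression asks whether the smoothers of two lineages differ.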


Subjects
Gene Expression Profiling , Sequence Analysis, RNA , Single-Cell Analysis , Animals , Bone Marrow/metabolism , Computer Simulation , Databases, Genetic , Gene Expression Regulation , Mice , Models, Statistical , Olfactory Mucosa/metabolism , Principal Component Analysis