1.
Res Sq ; 2024 Apr 04.
Article in English | MEDLINE | ID: mdl-38645152

ABSTRACT

With the growing number of single-cell analysis tools, benchmarks are increasingly important to guide analysis and method development. However, a lack of standardisation and extensibility in current benchmarks limits their usability, longevity, and relevance to the community. We present Open Problems, a living, extensible, community-guided benchmarking platform including 10 current single-cell tasks that we envision will raise standards for the selection, evaluation, and development of methods in single-cell analysis.

2.
iScience ; 25(9): 104927, 2022 Sep 16.
Article in English | MEDLINE | ID: mdl-36065187

ABSTRACT

In this work, we studied the generation of memory precursor cells following an acute infection by analyzing single-cell RNA-seq data that contained CD8 T cells collected during the postinfection expansion phase. We used different tools to reconstruct the developmental trajectory that CD8 T cells followed after activation. Cells that exhibited a memory precursor signature were identified and positioned on this trajectory. We found that these memory precursors are generated continuously with increasing numbers arising over time. Similarly, expression of genes associated with effector functions was also found to be raised in memory precursors at later time points. The ability of cells to enter quiescence and differentiate into memory cells was confirmed by BrdU pulse-chase experiment in vivo. Analysis of cell counts indicates that the vast majority of memory cells are generated at later time points from cells that have extensively divided.

4.
Nat Biotechnol ; 39(11): 1453-1465, 2021 11.
Article in English | MEDLINE | ID: mdl-34140680

ABSTRACT

Existing compendia of non-coding RNA (ncRNA) are incomplete, in part because they are derived almost exclusively from small and polyadenylated RNAs. Here we present a more comprehensive atlas of the human transcriptome, which includes small and polyA RNA as well as total RNA from 300 human tissues and cell lines. We report thousands of previously uncharacterized RNAs, increasing the number of documented ncRNAs by approximately 8%. To infer functional regulation by known and newly characterized ncRNAs, we exploited pre-mRNA abundance estimates from total RNA sequencing, revealing 316 microRNAs and 3,310 long non-coding RNAs with multiple lines of evidence for roles in regulating protein-coding genes and pathways. Our study both refines and expands the current catalog of human ncRNAs and their regulatory interactions. All data, analyses and results are available for download and interrogation in the R2 web portal, serving as a basis for future exploration of RNA biology and function.


Subject(s)
MicroRNAs , RNA, Long Noncoding , Humans , MicroRNAs/genetics , RNA, Long Noncoding/genetics , RNA, Messenger , RNA, Untranslated/genetics , Transcriptome/genetics
5.
Nat Commun ; 12(1): 3942, 2021 06 24.
Article in English | MEDLINE | ID: mdl-34168133

ABSTRACT

We present dyngen, a multi-modal simulation engine for studying dynamic cellular processes at single-cell resolution. dyngen is more flexible than current single-cell simulation engines, and allows better method development and benchmarking, thereby stimulating development and testing of computational methods. We demonstrate its potential for spearheading computational methods on three applications: aligning cell developmental trajectories, cell-specific regulatory network inference and estimation of RNA velocity.


Subject(s)
Computer Simulation , Gene Regulatory Networks , Single-Cell Analysis/methods , Algorithms , Benchmarking , Computational Biology/methods , Gene Expression Profiling
6.
Bioinformatics ; 36(Suppl_1): i66-i74, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32657409

ABSTRACT

MOTIVATION: During the last decade, trajectory inference (TI) methods have emerged as a novel framework to model cell developmental dynamics, most notably in the area of single-cell transcriptomics. At present, more than 70 TI methods have been published, and recent benchmarks showed that even state-of-the-art methods only perform well for certain trajectory types but not others. RESULTS: In this work, we present TinGa, a new TI model that is fast and flexible, and that is based on Growing Neural Graphs. We performed an extensive comparison of TinGa to five state-of-the-art methods for TI on a set of 250 datasets, including both synthetic and real datasets. Overall, TinGa improves the state-of-the-art by producing accurate models (comparable to or an improvement on the state-of-the-art) on the whole spectrum of data complexity, from the simplest linear datasets to the most complex disconnected graphs. In addition, TinGa obtained the fastest execution times, showing that our method is one of the most versatile methods to date. AVAILABILITY AND IMPLEMENTATION: R scripts for running TinGa, comparing it to top existing methods and generating the figures of this article are available at https://github.com/Helena-todd/TinGa.


Subject(s)
Nerve Agents , Computational Biology
7.
Nat Protoc ; 15(7): 2247-2276, 2020 07.
Article in English | MEDLINE | ID: mdl-32561888

ABSTRACT

This protocol explains how to perform a fast SCENIC analysis alongside standard best-practice steps on single-cell RNA-sequencing data using software containers and Nextflow pipelines. SCENIC reconstructs regulons (i.e., transcription factors and their target genes), assesses the activity of these discovered regulons in individual cells and uses these cellular activity patterns to find meaningful clusters of cells. Here we present an improved version of SCENIC with several advances. SCENIC has been refactored and reimplemented in Python (pySCENIC), resulting in a tenfold increase in speed, and has been packaged into containers for ease of use. It is now also possible to use epigenomic track databases, as well as motifs, to refine regulons. In this protocol, we explain the different steps of SCENIC: the workflow starts from the count matrix depicting the gene abundances for all cells and consists of three stages. First, coexpression modules are inferred using a regression per-target approach (GRNBoost2). Next, the indirect targets are pruned from these modules using cis-regulatory motif discovery (cisTarget). Lastly, the activity of these regulons is quantified via an enrichment score for the regulon's target genes (AUCell). Nonlinear projection methods can be used to display visual groupings of cells based on the cellular activity patterns of these regulons. The results can be exported as a loom file and visualized in the SCope web application. This protocol is illustrated on two use cases: a peripheral blood mononuclear cell data set and a panel of single-cell RNA-sequencing cancer experiments. For a data set of 10,000 genes and 50,000 cells, the pipeline runs in <2 h.


Subject(s)
Gene Regulatory Networks , Single-Cell Analysis/methods , Workflow , Animals , Cell Line, Tumor , Humans , Mice
8.
Nat Commun ; 11(1): 1201, 2020 03 05.
Article in English | MEDLINE | ID: mdl-32139671

ABSTRACT

Trajectory inference has radically enhanced single-cell RNA-seq research by enabling the study of dynamic changes in gene expression. Downstream of trajectory inference, it is vital to discover genes that are (i) associated with the lineages in the trajectory, or (ii) differentially expressed between lineages, to illuminate the underlying biological processes. Current data analysis procedures, however, either fail to exploit the continuous resolution provided by trajectory inference, or fail to pinpoint the exact types of differential expression. We introduce tradeSeq, a powerful generalized additive model framework based on the negative binomial distribution that allows flexible inference of both within-lineage and between-lineage differential expression. By incorporating observation-level weights, the model can additionally account for zero inflation. We evaluate the method on simulated datasets and on real datasets from droplet-based and full-length protocols, and show that it yields biological insights through a clear interpretation of the data.


Subject(s)
Gene Expression Profiling , Sequence Analysis, RNA , Single-Cell Analysis , Animals , Bone Marrow/metabolism , Computer Simulation , Databases, Genetic , Gene Expression Regulation , Mice , Models, Statistical , Olfactory Mucosa/metabolism , Principal Component Analysis
9.
Genome Biol ; 20(1): 125, 2019 06 20.
Article in English | MEDLINE | ID: mdl-31221194

ABSTRACT

In computational biology and other sciences, researchers are frequently faced with a choice between several computational methods for performing data analyses. Benchmarking studies aim to rigorously compare the performance of different methods using well-characterized benchmark datasets, to determine the strengths of each method or to provide recommendations regarding suitable choices of methods for an analysis. However, benchmarking studies must be carefully designed and implemented to provide accurate, unbiased, and informative results. Here, we summarize key practical guidelines and recommendations for performing high-quality benchmarking analyses, based on our experiences in computational biology.


Subject(s)
Computational Biology/standards , Guidelines as Topic , Benchmarking , Datasets as Topic , Publishing , Research Design , Software
10.
Nat Biotechnol ; 37(5): 547-554, 2019 05.
Article in English | MEDLINE | ID: mdl-30936559

ABSTRACT

Trajectory inference approaches analyze genome-wide omics data from thousands of single cells and computationally infer the order of these cells along developmental trajectories. Although more than 70 trajectory inference tools have already been developed, it is challenging to compare their performance because the input they require and output models they produce vary substantially. Here, we benchmark 45 of these methods on 110 real and 229 synthetic datasets for cellular ordering, topology, scalability and usability. Our results highlight the complementarity of existing tools, and that the choice of method should depend mostly on the dataset dimensions and trajectory topology. Based on these results, we develop a set of guidelines to help users select the best method for their dataset. Our freely available data and evaluation pipeline ( https://benchmark.dynverse.org ) will aid in the development of improved tools designed to analyze increasingly large and complex single-cell datasets.


Subject(s)
Computational Biology/methods , Genome/genetics , High-Throughput Nucleotide Sequencing/methods , Single-Cell Analysis/methods , Benchmarking , High-Throughput Nucleotide Sequencing/trends , Single-Cell Analysis/trends
11.
Methods Mol Biol ; 1883: 235-249, 2019.
Article in English | MEDLINE | ID: mdl-30547403

ABSTRACT

Recent technological breakthroughs in single-cell RNA sequencing are revolutionizing modern experimental design in biology. The increasing size of the single-cell expression data from which networks can be inferred allows identifying more complex, non-linear dependencies between genes. Moreover, the inter-cellular variability that is observed in single-cell expression data can be used to infer not only one global network representing all the cells, but also numerous regulatory networks that are more specific to certain conditions. By experimentally perturbing certain genes, the deconvolution of the true contribution of these genes can also be greatly facilitated. In this chapter, we will therefore tackle the advantages of single-cell transcriptomic data and show how new methods exploit this novel data type to enhance the inference of gene regulatory networks.


Subject(s)
Gene Expression Profiling/methods , Gene Regulatory Networks , Models, Genetic , Single-Cell Analysis/methods , Systems Biology/methods , Algorithms , Gene Expression Profiling/instrumentation , High-Throughput Nucleotide Sequencing/instrumentation , High-Throughput Nucleotide Sequencing/methods , Humans , Sequence Analysis, RNA , Single-Cell Analysis/instrumentation , Systems Biology/instrumentation
12.
Immunity ; 49(2): 312-325.e5, 2018 08 21.
Article in English | MEDLINE | ID: mdl-30076102

ABSTRACT

Heterogeneity between different macrophage populations has become a defining feature of this lineage. However, the conserved factors defining macrophages remain largely unknown. The transcription factor ZEB2 is best described for its role in epithelial to mesenchymal transition; however, its role within the immune system is only now being elucidated. We show here that Zeb2 expression is a conserved feature of macrophages. Using Clec4f-cre, Itgax-cre, and Fcgr1-cre mice to target five different macrophage populations, we found that loss of ZEB2 resulted in macrophage disappearance from the tissues, coupled with their subsequent replenishment from bone-marrow precursors in open niches. Mechanistically, we found that ZEB2 functioned to maintain the tissue-specific identities of macrophages. In Kupffer cells, ZEB2 achieved this by regulating expression of the transcription factor LXRα, removal of which recapitulated the loss of Kupffer cell identity and disappearance. Thus, ZEB2 expression is required in macrophages to preserve their tissue-specific identities.


Subject(s)
Kupffer Cells/cytology , Liver X Receptors/genetics , Zinc Finger E-box Binding Homeobox 2/genetics , Animals , Cell Lineage/immunology , Epithelial-Mesenchymal Transition , Female , Gene Expression Regulation, Neoplastic , Kupffer Cells/immunology , Liver/cytology , Liver X Receptors/metabolism , Lung/cytology , Male , Mice , Mice, Inbred C57BL , Mice, Transgenic
13.
PLoS One ; 13(4): e0195997, 2018.
Article in English | MEDLINE | ID: mdl-29698494

ABSTRACT

MOTIVATION: Graphlets are small network patterns that can be counted in order to characterise the structure of a network (topology). As part of a topology optimisation process, one could use graphlet counts to iteratively modify a network and keep track of the graphlet counts, in order to achieve certain topological properties. Up until now, however, graphlets were not suited as a metric for performing topology optimisation; when millions of minor changes are made to the network structure it becomes computationally intractable to recalculate all the graphlet counts for each of the edge modifications. RESULTS: IncGraph is a method for calculating the differences in graphlet counts with respect to the network in its previous state, which is much more efficient than calculating the graphlet occurrences from scratch at every edge modification made. In comparison to static counting approaches, our findings show IncGraph reduces the execution time by several orders of magnitude. The usefulness of this approach was demonstrated by developing a graphlet-based metric to optimise gene regulatory networks. IncGraph is able to quickly quantify the topological impact of small changes to a network, which opens novel research opportunities to study changes in topologies in evolving or online networks, or develop graphlet-based criteria for topology optimisation. AVAILABILITY: IncGraph is freely available as an open-source R package on CRAN (incgraph). The development version is also available on GitHub (rcannood/incgraph).
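
The incremental idea described in this abstract can be illustrated in its simplest case. The following is a minimal Python sketch, not IncGraph's algorithm or its R API: it assumes an undirected graph stored as a dictionary of neighbour sets and tracks only triangles, one small network pattern, whereas IncGraph handles a larger family of graphlets. The function names `count_triangles`, `triangle_delta` and `add_edge` are hypothetical.

```python
from itertools import combinations


def count_triangles(adj):
    """Count triangles from scratch by checking every node triple (static approach)."""
    return sum(
        1
        for a, b, c in combinations(sorted(adj), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )


def triangle_delta(adj, u, v):
    """Triangles gained by adding edge u-v (or lost by removing it).

    Every common neighbour of u and v closes exactly one triangle with the edge,
    so only the neighbourhoods of u and v need to be inspected.
    """
    return len((adj[u] & adj[v]) - {u, v})


def add_edge(adj, u, v):
    """Insert an undirected edge into the adjacency sets."""
    adj[u].add(v)
    adj[v].add(u)


# Toy network: a 4-cycle a-b-c-d-a, which contains no triangles.
adj = {node: set() for node in "abcd"}
for u, v in [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]:
    add_edge(adj, u, v)

before = count_triangles(adj)
delta = triangle_delta(adj, "a", "c")  # computed before the edge is inserted
add_edge(adj, "a", "c")
after = count_triangles(adj)
```

In this toy case the locally computed difference matches the change in the full recount, which is the property that makes an incremental update cheaper than recounting after every edge modification.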


Subject(s)
Software , Algorithms , Gene Regulatory Networks , Models, Biological
14.
J Natl Cancer Inst ; 110(10): 1084-1093, 2018 10 01.
Article in English | MEDLINE | ID: mdl-29514301

ABSTRACT

Background: Neuroblastoma is characterized by substantial clinical heterogeneity. Despite intensive treatment, the survival rates of high-risk neuroblastoma patients are still disappointingly low. Somatic chromosomal copy number aberrations have been shown to be associated with patient outcome, particularly in low- and intermediate-risk neuroblastoma patients. To improve outcome prediction in high-risk neuroblastoma, we aimed to design a prognostic classification method based on copy number aberrations. Methods: In an international collaboration, normalized high-resolution DNA copy number data (arrayCGH and SNP arrays) from 556 high-risk neuroblastomas obtained at diagnosis were collected from nine collaborative groups and segmented using the same method. We applied logistic and Cox proportional hazard regression to identify genomic aberrations associated with poor outcome. Results: In this study, we identified two types of copy number aberrations that are associated with extremely poor outcome. Distal 6q losses were detected in 5.9% of patients and were associated with a 10-year survival probability of only 3.4% (95% confidence interval [CI] = 0.5% to 23.3%, two-sided P = .002). Amplifications of regions not encompassing the MYCN locus were detected in 18.1% of patients and were associated with a 10-year survival probability of only 5.8% (95% CI = 1.5% to 22.2%, two-sided P < .001). Conclusions: Using a unique large copy number data set of high-risk neuroblastoma cases, we identified a small subset of high-risk neuroblastoma patients with extremely low survival probability that might be eligible for inclusion in clinical trials of new therapeutics. The amplicons may also nominate alternative treatments that target the amplified genes.


Subject(s)
Chromosome Deletion , Chromosomes, Human, Pair 6 , Gene Amplification , Genomics , Neuroblastoma/genetics , Neuroblastoma/mortality , Biomarkers, Tumor , Child, Preschool , DNA Copy Number Variations , Genetic Association Studies , Genetic Predisposition to Disease , Genomics/methods , Humans , Infant , N-Myc Proto-Oncogene Protein/genetics , Neoplasm Staging , Neuroblastoma/pathology , Neuroblastoma/therapy , Prognosis
15.
Nat Commun ; 9(1): 1090, 2018 03 15.
Article in English | MEDLINE | ID: mdl-29545622

ABSTRACT

A critical step in the analysis of large genome-wide gene expression datasets is the use of module detection methods to group genes into co-expression modules. Because of limitations of classical clustering methods, numerous alternative module detection methods have been proposed, which improve upon clustering by handling co-expression in only a subset of samples, modelling the regulatory network, and/or allowing overlap between modules. In this study we use known regulatory networks to do a comprehensive and robust evaluation of these different methods. Overall, decomposition methods outperform all other strategies, while we do not find a clear advantage of biclustering and network inference-based approaches on large gene expression datasets. Using our evaluation workflow, we also investigate several practical aspects of module detection, such as parameter estimation and the use of alternative similarity measures, and conclude with recommendations for the further development of these methods.


Subject(s)
Cluster Analysis , Algorithms , Computational Biology/methods , Gene Expression Profiling/methods , Gene Regulatory Networks/genetics , Gene Regulatory Networks/physiology
16.
Genet Med ; 19(4): 457-466, 2017 04.
Article in English | MEDLINE | ID: mdl-27608171

ABSTRACT

PURPOSE: Our goal was to design a customized microarray, arrEYE, for high-resolution copy number variant (CNV) analysis of known and candidate genes for inherited retinal dystrophy (iRD) and retina-expressed noncoding RNAs (ncRNAs). METHODS: arrEYE contains probes for the full genomic region of 106 known iRD genes, including those implicated in retinitis pigmentosa (RP) (the most frequent iRD), cone-rod dystrophies, macular dystrophies, and an additional 60 candidate iRD genes and 196 ncRNAs. Eight CNVs in iRD genes identified by other techniques were used as positive controls. The test cohort consisted of 57 patients with autosomal dominant, X-linked, or simplex RP. RESULTS: In an RP patient, a novel heterozygous deletion of exons 7 and 8 of the HGSNAT gene was identified: c.634-408_820+338delinsAGAATATG, p.(Glu212Glyfs*2). A known variant was found on the second allele: c.1843G>A, p.(Ala615Thr). Furthermore, we expanded the allelic spectrum of USH2A and RCBTB1 with novel CNVs. CONCLUSION: The arrEYE platform revealed subtle single-exon to larger CNVs in iRD genes that could be characterized at the nucleotide level, facilitated by the high resolution of the platform. We report the first CNV in HGSNAT that, combined with another mutation, leads to RP, further supporting its recently identified role in nonsyndromic iRD.


Subject(s)
Comparative Genomic Hybridization/methods , DNA Copy Number Variations , Oligonucleotide Array Sequence Analysis/methods , Retinal Dystrophies/genetics , Acetyltransferases/genetics , Extracellular Matrix Proteins/genetics , Female , Guanine Nucleotide Exchange Factors/genetics , Humans , Male , RNA, Untranslated/genetics , Sequence Deletion
17.
Oncotarget ; 8(63): 106820-106832, 2017 Dec 05.
Article in English | MEDLINE | ID: mdl-29290991

ABSTRACT

BACKGROUND: Neuroblastoma is an aggressive childhood malignancy of the sympathetic nervous system. Despite multi-modal therapy, survival of high-risk patients remains disappointingly low, underscoring the need for novel treatment strategies. The discovery of ALK activating mutations opened the way to precision treatment in a subset of these patients. Previously, to gain deeper insight into the molecular effects of oncogenic ALK signaling, we investigated the transcriptional effects of pharmacological ALK inhibition on neuroblastoma cell lines six hours after TAE684 administration. This resulted in the 77-gene ALK signature, which was shown to gradually decrease from 120 minutes after TAE684 treatment. AIM: Here, we further dissected the transcriptional dynamic profiles of neuroblastoma cells upon TAE684 treatment in a detailed timeframe of ten minutes up to six hours after inhibition, in order to identify additional early targets for combination treatment. RESULTS: We observed an unexpected initial upregulation of positively regulated MYCN target genes, followed by subsequent downregulation of overall MYCN activity. In addition, we identified adrenomedullin (ADM), previously shown to be implicated in sunitinib resistance, as the earliest response gene upon ALK inhibition. CONCLUSIONS: We describe the early and late effects of ALK inhibitor TAE684 treatment on the neuroblastoma transcriptome. The observed unexpected upregulation of ADM warrants further investigation in relation to putative ALK resistance in neuroblastoma patients currently undergoing ALK inhibitor treatment.

18.
Eur J Immunol ; 46(11): 2496-2506, 2016 11.
Article in English | MEDLINE | ID: mdl-27682842

ABSTRACT

Recent developments in single-cell transcriptomics have opened new opportunities for studying dynamic processes in immunology in a high throughput and unbiased manner. Starting from a mixture of cells in different stages of a developmental process, unsupervised trajectory inference algorithms aim to automatically reconstruct the underlying developmental path that cells are following. In this review, we break down the strategies used by this novel class of methods, and organize their components into a common framework, highlighting several practical advantages and disadvantages of the individual methods. We also give an overview of new insights these methods have already provided regarding the wiring and gene regulation of cell differentiation. As the trajectory inference field is still in its infancy, we propose several future developments that will ultimately lead to a global and data-driven way of studying immune cell differentiation.


Subject(s)
Cell Differentiation , Gene Expression Profiling/methods , Single-Cell Analysis/methods , Algorithms , Cell Differentiation/genetics , Computational Biology , Humans
19.
Oncotarget ; 7(2): 1960-72, 2016 Jan 12.
Article in English | MEDLINE | ID: mdl-26646589

ABSTRACT

Accurate assessment of neuroblastoma outcome prediction remains challenging. Therefore, this study aims at establishing novel prognostic tumor DNA methylation biomarkers. In total, 396 low- and high-risk primary tumors were analyzed, of which 87 were profiled using methyl-CpG-binding domain (MBD) sequencing for differential methylation analysis between prognostic patient groups. Subsequently, methylation-specific PCR (MSP) assays were developed for 78 top-ranking differentially methylated regions and tested on two independent cohorts of 132 and 177 samples, respectively. Further, a new statistical framework was used to identify a robust set of MSP assays of which the methylation score (i.e. the percentage of methylated assays) allows accurate outcome prediction. Survival analyses were performed on the individual target level, as well as on the combined multimarker signature. As a result of the differential DNA methylation assessment by MBD sequencing, 58 of the 78 MSP assays were designed in regions previously unexplored in neuroblastoma, and 36 are located in non-promoter or non-coding regions. In total, 5 individual MSP assays (located in CCDC177, NXPH1, lnc-MRPL3-2, lnc-TREX1-1 and one on a region from chromosome 8 with no further annotation) predict event-free survival and 4 additional assays (located in SPRED3, TNFAIP2, NPM2 and CYYR1) also predict overall survival. Furthermore, a robust 58-marker methylation signature predicting overall and event-free survival was established. In conclusion, this study encompasses the largest DNA methylation biomarker study in neuroblastoma so far. We identified and independently validated several novel prognostic biomarkers, as well as a prognostic 58-marker methylation signature.
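
The methylation score defined in this abstract (the percentage of methylated assays) reduces to a simple ratio. The following is a minimal Python sketch under stated assumptions: each MSP assay is taken to yield a boolean methylated/unmethylated call, and the function name `methylation_score` and the example calls are hypothetical. The study's own calling thresholds and statistical framework are not reproduced here.

```python
def methylation_score(assay_calls):
    """Return the percentage of assays called methylated.

    `assay_calls` is a sequence of booleans, one per MSP assay
    (True = methylated). This input format is an assumption for illustration.
    """
    if not assay_calls:
        raise ValueError("at least one assay call is required")
    return 100.0 * sum(assay_calls) / len(assay_calls)


# Hypothetical sample with four assays, three of them called methylated.
calls = [True, False, True, True]
score = methylation_score(calls)
```

Expressing the score as a percentage keeps it comparable between samples even when the number of successfully measured assays differs.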


Subject(s)
Biomarkers/analysis , CpG Islands/genetics , DNA Methylation , DNA, Neoplasm/genetics , Neuroblastoma/diagnosis , Neuroblastoma/genetics , Binding Sites , Cohort Studies , Computational Biology , Female , Humans , Infant , Male , Neoplasm Staging , Prognosis , Real-Time Polymerase Chain Reaction , Tumor Cells, Cultured